Hierarchical Cluster Analysis Histogram Thresholding with Local Minima

Sengee, Nyamlkhagva; Radnaabazar, Chinzorig; Batsuuri, Suvdaa; Tsedendamba, Khurel-Ochir; Telue, Berekjan

doi:10.9717/JMIS.2017.4.4.189

J Multimed Inf Syst 4(4):189-194

eISSN: 2383-7632

DOI: https://doi.org/10.9717/JMIS.2017.4.4.189

Section A

Hierarchical Cluster Analysis Histogram Thresholding with Local Minima

Nyamlkhagva Sengee¹^,^*, Chinzorig Radnaabazar², Suvdaa Batsuuri², Khurel-Ochir Tsedendamba², Berekjan Telue²

Author Information & Copyright ▼

¹National University of Mongolia, ULAANBAATAR, MONGOLIA, nyamlkhagva@seas.num.edu.mn.

²National University of Mongolia, ULAANBAATAR, MONGOLIA, nyamlkhagva@seas.num.edu.mn.

^*Corresponding Author: Nyamlkhagva Sengee, KH SURGUULIIN GUDAMJ-1 P.O.BOX -46A/523, 210646, nyamlkhagva@seas.num.edu.mn.

© Copyright 2017 Korea Multimedia Society. This is an Open-Access article distributed under the terms of the Creative Commons Attribution Non-CommercialLicense (http://creativecommons.org/licenses/by-nc/4.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Received: Oct 22, 2017 ; Revised: Dec 06, 2017 ; Accepted: Dec 10, 2017

Published Online: Dec 31, 2017

Abstract

In this study, we propose a method which is based on “Image segmentation by histogram thresholding using hierarchical cluster analysis”/HCA/ and “A nonparametric approach for histogram segmentation”/NHS/. HCA method uses that all histogram bins are one cluster then it reduces cluster numbers by using distance metric. Because this method has too many clusters, it is more computation. In order to eliminate disadvantages of “HCA” method, we used “NHS” method. NHS method finds all local minima of histogram. To reduce cluster number, we use NHS method which is fast. In our approach, we combine those two methods to eliminate disadvantages of Arifin method. The proposed method is not only less computational than “HCA” method because combined method has few clusters but also it uses local minima of histogram which is computed by “NHS”.

Keywords: Image thresholding; clustering; histogram segmentation

I. INTRODUCTION

In image analysis, thresholding is a simple and effective technique. Thresholding is very popular method which separates the foreground and background of an image for extracting useful information from the image. Thresholding assigns pixels to one class if its gray level is greater than a specified threshold and otherwise assigns it to the other class. There are many other segmentation methods but thresholding is less computational than those methods. While thresholding concept is simple, threshold selection is main problem of this method. In the threshold selection, a histogram of images is used and there are many threshold selection methods developed [1-7].

In 2006, Arifin and Asona suggested “Histogram thresholding using hierarchical cluster analysis” method [8]. This method’s main idea is that a one bin of histogram which is not empty becomes a one cluster, and then a distance which is proposed by them computes between every adjacent bin. After distance computing, two clusters with the smallest distance will be merged. After merging procedure, a number of clusters will decrease by one. If a number of clusters equal two, threshold will be chosen.

Although this method is effective than others, it has some disadvantages. This method does not notice local minima of a histogram and has many clusters because all bins of histogram become one cluster. Because this method has too many clusters, it is more computation.

Recently, Julie and her coworkers suggested “A Nonparametric Approach for Histogram Segmentation” method [9]. They defined that the algorithm of method is fast. This method determines all of possible local minima. Perhaps this method helps not only threshold selection but also other segmentation methods. It determines firstly monotonous increasing bins of histogram and monotonous decreasing bins of histogram, and then finds all local minima of histogram. Finally, it will merge some local minima which are not useful for image analysis.

In our work, we combine those two methods to eliminate disadvantages of Arifin method. Therefore, proposed method can reduce not only a number of clusters and computation but also observes local minima of histogram. In what follows, related works are discussed in Section 2 and Section 3 presents the proposed method. Section 4 shows a few simulation results to demonstrate the effectiveness of the proposed method. Section 5 serves as the conclusion of this paper.

II. RELATIVE WORK

2.1 Arifin and Asano’s histogram thresholding using hierarchical cluster analysis method

Arifin and Asano [8] were suggested cluster merging strategy: Let C_k be the kth cluster of gray levels in the ascending order, and T_k be the highest gray level in the cluster C_k. Consequently, the cluster C_k contains gray levels in the range [T_k-1 + 1, T_k], if we define T₀= -1. The proposed algorithm of cluster merging is summarized as follows:

We assume that the target histogram contains K different non-empty gray levels. At the beginning of the merging process, each cluster is assigned to each gray level, i.e. the number of clusters is K and each cluster contains only one gray level.
The following two steps are repeated (K - t) times for t-level thresholding.
- 2.1. The distance between every pair of adjacent clusters is computed. The distance indicates the dissimilarity of the adjacent clusters, and will be defined in the next subsection.
- 2.2. The pair of the smallest distance is found, and these clusters are unified into one cluster. The index of clusters C_k and T_k are reassigned since the number of clusters is decreased one by the merging.
Finally t clusters, C₁, C₂ …, C_t, are obtained. The gray levels T₁, T₂ …, T_t-1, which are the highest gray levels of the clusters, are the estimated thresholds. For the usual two-level thresholding, t = 2 and the estimated threshold is T₁, i.e., the highest gray level of the cluster of lower brightness.

2.1.1 Distance measurement

Let h(z), z = 0, 1, …, L-1, be the histogram of the target image, where z indicates the gray level and L is the number of available gray levels including empty ones. The histogram h(z) indicates the occurrence frequency of the pixel with gray level z. If we define p(z) as p(z) = h(z)/N, where N is the total number of pixels in the image, p(z) is regarded as the probability of the occurrence of the pixel with gray level z. And also define a function P(C_k) of a cluster C_k as follows

P (C k) = ∑ z = T k − 1 + 1 T k p (z), ∑ K = 1 K P (C k) = 1.

This function indicates the occurrence probability of pixels belonging to the cluster C_k. The distance between the clusters C_k1 and C_k2 is defined as

D i s t (C k 1, C k 2) = δ I 2 (C k 1 ∪ C k 2) δ A 2 (C k 1 ∪ C k 2)

The two parameters in the definition correspond to the inter-class variance and the intra-class variance, respectively. The inter-class variance, δ_I²(C_k1,C_k2 ), is the sum of the square distances between the means of the two clusters and the total mean of both clusters, and defined as follows:

δ I 2 (C k 1 ∪ C k 2) = P (C k 1) P (C k 1) + P (C k 2) [m (C k 1) − M (C k 1 ∪ C k 2)] 2 + P (C k 2) P (C k 1) + P (C k 2) [m (C k 2) − M (C k 1 ∪ C k 2)] 2 = P (C k 1) P (C k 2) (P (C k 1) + P (C k 2)) 2 [m (C k 1) − m (C k 2)] 2

where m(C_k) denotes the mean of cluster C_k, defined as follows:

m (C k) = 1 P (C k) ∑ z = T k − 1 + 1 T k z p (z)

and M(C_k1UC_k2) denotes the global mean of the clusters C_k1 and C_k2, defined as follows:

M (C k 1 ∪ C k 2) = P (C k 1) m (C k 1) + P (C k 2) m (C k 2) P (C k 1) + P (C k 2) .

The intra-class variance, δ_A²(C_k1U C_k2 ), is the variance of all pixel values in the merged cluster, and defined as follows:

δ A 2 (C k 1 ∪ C k 2) = 1 P (C k 1) + P (C k 2) X ∑ z = T k 1 − 1 + 1 T k 2 [(z − M (C k 1 ∪ C k 2)) 2 p (z)] .

2.2 “Nonparametric Approach for Histogram Segmentation” Method

This method was proposed by [6]. They treated the problem in three stages in A–C:

step A: testing a histogram against a fixed hypothesized density;
step B: testing a histogram against a qualitative assumption (decreasing, increasing);
step C: segmenting a histogram and generating an estimate of its underlying density.

This algorithm is very clearly introduced by reference 3.

The result of this algorithm on a histogram is shown Fig.1.

Fig. 1. (a) Initialization of the algorithm (all the local minima of the histogram). The histogram presents small oscillations, which create several local minima. (b) Final segmentation after FTC algorithm. Three modes are detected in this histogram, one is very small.

Download Original Figure

2.3 “A Threshold Selection Method from Gray-Level Histograms” /Otsu’s method/

Otsu’s thresholding method utilizes discriminant analysis to find the maximum separability of classes [5]. For every possible threshold value, the method evaluates the goodness of this value if used as the threshold. This evaluation uses either the heterogeneity of both classes or the homogeneity of every class. By maximizing the criterion function, the means of two classes can be separated as far as possible and the variances in both classes will be as minimal as possible. This method still remains one of the most referenced thresholding methods.

III. PROPOSED METHOD

In the Arifin and Asano’s work [8], a number of clustering algorithm was equal to all intensity of image. So, this is more computational. Also, this algorithm does not notice local minima of histogram of image.

In our work, we first will determine local minima of histogram of image and second create cluster of histogram using local minima which is determined by “nonparametric approach for histogram segmentation method. Therefore, we can greatly decrease the number of clusters of histogram.

Proposed histogram thresholding algorithm,

Determine t={t₁,t₂,…,t_n}all the local minima of the histogram using “Nonparametric Approach for Histogram Segmentation” method
Create cluster C_k={C_t1, C_t2, C_t3,….., C_tn} of histogram.
The following two steps are repeated (K - t) times for t-level thresholding.
- 3.1. The distance between every pair of adjacent clusters is computed. The distance indicates the dissimilarity of the adjacent clusters, and will be defined in the next subsection.
- 3.2. The pair of the smallest distance is found, and these clusters are unified into one cluster. The index of clusters C_k and T_k are reassigned since the number of clusters is decreased one by the merging.

IV. RESULT

The proposed method is applied to the three images and we have compared proposed method with Otsu’s and Arifin’s methods. Threshold values determined for each method are shown in Table 1.

Table. 1. Threshold values

Sample	Threshold selection methods
	Otsu’s method	Arifin’s method	Proposed method
Image1	128	18	23
Image2	112	117	124
Image3	122	161	140

Download Excel Table

From the histogram (Figure 2), it is clear that the proposed method is more accurate in local minima detection.

Fig. 2. Histogram of Image1.

Download Original Figure

Following example figures illustrate comparison of proposed method and other two methods.

Figure 3 shows that segmented black and white dots are well balanced on white and black background, when sample image is segmented by proposed method. Moreover, Fig.4 and Fig.5 illustrate that proposed method works better than compared two methods.

Fig. 3. (a) Image1 /original/, (b) result thresholded by Otsu method, (c) result thresholded by Arifin method, (d) result thresholded by Proposed method

Download Original Figure

Fig. 4. (a) Image2 /original/, (b) result thresholded by Otsu method, (c) result thresholded by Arifin method, (d) result thresholded by Proposed method

Download Original Figure

Fig. 5. (a) Image1 /original/, (b) result thresholded by Otsu method, (c) result thresholded by Arifin method, (d) result thresholded by Proposed method

Download Original Figure

V. CONCLUSION

By combining “Histogram thresholding using hierarchical cluster analysis” and “A Nonparametric Approach for Histogram Segmentation”, we can eliminate disadvantages of Arifin method and keep merging and computing distance procedure of Arifin method. Therefore, combined method is not only less computational than Arifin method because combined method has few clusters but also it uses local minima of histogram. In near future, we will need to compare other existing methods because Otsu and Arifin methods use in this work.

Acknowledgement

This study was supported by the Higher Education Reform Project, Mongolia.

REFERENCES

[1].

M. Sezgin and B. Sankir, “Survey over image thresholding techniques and quantitative performance evaluation,” Journal of Electronic Imaging, vol. 13, no. 1, pp.146-165, 2004.

[2].

Y. J. Zhang, “A survey on evaluation methods for image segmentation,” Pattern Recognition, vol. 29, pp.1335-1346, 1996.

[3].

R. Guo and S. M. Pandit, “Automatic threshold selection based on histogram modes and a discriminant criterion,” Mach. Vision Appl., vol. 10, pp.331–338, 1998.

[4].

S. H. Kwon. “Threshold selection based on cluster analysis,” Pattern Recognition Letters, vol. 25, pp.1045–1050, 2005.

[5].

Nobuyuki Otsu, “A Threshold Selection Method from Gray-Level Histograms,” IEEE Transaction on Systems, Man, and Cybernetics, Vol. 9, No. 1, 1979.

[6].

Wenbing Tao, Hai Jin and Liman Liu, “A New Image Thresholding Method Based on Graph Cuts,” Acoustics, Speech and Signal Processing, ICASSP. IEEE International Conference, Vol 1, pp. 605-608, April 2007.

[7].

J. Kittler and J. Illingworth, “Minimum error thresholding,” Pattern Recognition, vol. 19, pp. 41–47, 1986.

[8].

Agus Zainal Arifin, Akiro Asona, “Image segmentation by histogram thresholding using hierarchical cluster analysis,” Pattern Recognition Letter, vol. 27, pp. 1515-1521, 2006.

[9].

Julie Delon, Agnes Desolneux, Jose-Luis Lisani, and Ana Belen Petro, “A Nonparametrc Approach for Histogram Segmentation,” IEEE Transaction on Image Processing, Vol. 16, No. 1, January 2007.