Journal of Multimedia Information System
Korea Multimedia Society
Section A

Hierarchical Cluster Analysis Histogram Thresholding with Local Minima

Nyamlkhagva Sengee1,*, Chinzorig Radnaabazar2, Suvdaa Batsuuri2, Khurel-Ochir Tsedendamba2, Berekjan Telue2
1National University of Mongolia, ULAANBAATAR, MONGOLIA, nyamlkhagva@seas.num.edu.mn.
2National University of Mongolia, ULAANBAATAR, MONGOLIA, nyamlkhagva@seas.num.edu.mn.
*Corresponding Author: Nyamlkhagva Sengee, KH SURGUULIIN GUDAMJ-1 P.O.BOX -46A/523, 210646, nyamlkhagva@seas.num.edu.mn.

© Copyright 2017 Korea Multimedia Society. This is an Open-Access article distributed under the terms of the Creative Commons Attribution Non-CommercialLicense (http://creativecommons.org/licenses/by-nc/4.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Received: Oct 22, 2017 ; Revised: Dec 06, 2017 ; Accepted: Dec 10, 2017

Published Online: Dec 31, 2017

Abstract

In this study, we propose a method which is based on “Image segmentation by histogram thresholding using hierarchical cluster analysis”/HCA/ and “A nonparametric approach for histogram segmentation”/NHS/. HCA method uses that all histogram bins are one cluster then it reduces cluster numbers by using distance metric. Because this method has too many clusters, it is more computation. In order to eliminate disadvantages of “HCA” method, we used “NHS” method. NHS method finds all local minima of histogram. To reduce cluster number, we use NHS method which is fast. In our approach, we combine those two methods to eliminate disadvantages of Arifin method. The proposed method is not only less computational than “HCA” method because combined method has few clusters but also it uses local minima of histogram which is computed by “NHS”.

Keywords: Image thresholding; clustering; histogram segmentation

I. INTRODUCTION

In image analysis, thresholding is a simple and effective technique. Thresholding is very popular method which separates the foreground and background of an image for extracting useful information from the image. Thresholding assigns pixels to one class if its gray level is greater than a specified threshold and otherwise assigns it to the other class. There are many other segmentation methods but thresholding is less computational than those methods. While thresholding concept is simple, threshold selection is main problem of this method. In the threshold selection, a histogram of images is used and there are many threshold selection methods developed [1-7].

In 2006, Arifin and Asona suggested “Histogram thresholding using hierarchical cluster analysis” method [8]. This method’s main idea is that a one bin of histogram which is not empty becomes a one cluster, and then a distance which is proposed by them computes between every adjacent bin. After distance computing, two clusters with the smallest distance will be merged. After merging procedure, a number of clusters will decrease by one. If a number of clusters equal two, threshold will be chosen.

Although this method is effective than others, it has some disadvantages. This method does not notice local minima of a histogram and has many clusters because all bins of histogram become one cluster. Because this method has too many clusters, it is more computation.

Recently, Julie and her coworkers suggested “A Nonparametric Approach for Histogram Segmentation” method [9]. They defined that the algorithm of method is fast. This method determines all of possible local minima. Perhaps this method helps not only threshold selection but also other segmentation methods. It determines firstly monotonous increasing bins of histogram and monotonous decreasing bins of histogram, and then finds all local minima of histogram. Finally, it will merge some local minima which are not useful for image analysis.

In our work, we combine those two methods to eliminate disadvantages of Arifin method. Therefore, proposed method can reduce not only a number of clusters and computation but also observes local minima of histogram. In what follows, related works are discussed in Section 2 and Section 3 presents the proposed method. Section 4 shows a few simulation results to demonstrate the effectiveness of the proposed method. Section 5 serves as the conclusion of this paper.

II. RELATIVE WORK

2.1 Arifin and Asano’s histogram thresholding using hierarchical cluster analysis method

Arifin and Asano [8] were suggested cluster merging strategy: Let Ck be the kth cluster of gray levels in the ascending order, and Tk be the highest gray level in the cluster Ck. Consequently, the cluster Ck contains gray levels in the range [Tk-1 + 1, Tk], if we define T0= -1. The proposed algorithm of cluster merging is summarized as follows:

  1. We assume that the target histogram contains K different non-empty gray levels. At the beginning of the merging process, each cluster is assigned to each gray level, i.e. the number of clusters is K and each cluster contains only one gray level.

  2. The following two steps are repeated (K - t) times for t-level thresholding.

    • 2.1. The distance between every pair of adjacent clusters is computed. The distance indicates the dissimilarity of the adjacent clusters, and will be defined in the next subsection.

    • 2.2. The pair of the smallest distance is found, and these clusters are unified into one cluster. The index of clusters Ck and Tk are reassigned since the number of clusters is decreased one by the merging.

  3. Finally t clusters, C1, C2 …, Ct, are obtained. The gray levels T1, T2 …, Tt-1, which are the highest gray levels of the clusters, are the estimated thresholds. For the usual two-level thresholding, t = 2 and the estimated threshold is T1, i.e., the highest gray level of the cluster of lower brightness.

2.1.1 Distance measurement

Let h(z), z = 0, 1, …, L-1, be the histogram of the target image, where z indicates the gray level and L is the number of available gray levels including empty ones. The histogram h(z) indicates the occurrence frequency of the pixel with gray level z. If we define p(z) as p(z) = h(z)/N, where N is the total number of pixels in the image, p(z) is regarded as the probability of the occurrence of the pixel with gray level z. And also define a function P(Ck) of a cluster Ck as follows

P ( C k ) = z = T k 1 + 1 T k p ( z ) , K = 1 K P ( C k ) = 1.

This function indicates the occurrence probability of pixels belonging to the cluster Ck. The distance between the clusters Ck1 and Ck2 is defined as

D i s t ( C k 1 , C k 2 ) = δ I 2 ( C k 1 C k 2 ) δ A 2 ( C k 1 C k 2 )

The two parameters in the definition correspond to the inter-class variance and the intra-class variance, respectively. The inter-class variance, δI2(Ck1,Ck2 ), is the sum of the square distances between the means of the two clusters and the total mean of both clusters, and defined as follows:

δ I 2 ( C k 1 C k 2 ) = P ( C k 1 ) P ( C k 1 ) + P ( C k 2 ) [ m ( C k 1 ) M ( C k 1 C k 2 ) ] 2 + P ( C k 2 ) P ( C k 1 ) + P ( C k 2 ) [ m ( C k 2 ) M ( C k 1 C k 2 ) ] 2 = P ( C k 1 ) P ( C k 2 ) ( P ( C k 1 ) + P ( C k 2 ) ) 2 [ m ( C k 1 ) m ( C k 2 ) ] 2

where m(Ck) denotes the mean of cluster Ck, defined as follows:

m ( C k ) = 1 P ( C k ) z = T k 1 + 1 T k z p ( z )

and M(Ck1UCk2) denotes the global mean of the clusters Ck1 and Ck2, defined as follows:

M ( C k 1 C k 2 ) = P ( C k 1 ) m ( C k 1 ) + P ( C k 2 ) m ( C k 2 ) P ( C k 1 ) + P ( C k 2 ) .

The intra-class variance, δA2(Ck1U Ck2 ), is the variance of all pixel values in the merged cluster, and defined as follows:

δ A 2 ( C k 1 C k 2 ) = 1 P ( C k 1 ) + P ( C k 2 ) X z = T k 1 1 + 1 T k 2 [ ( z M ( C k 1 C k 2 ) ) 2 p ( z ) ] .
2.2 “Nonparametric Approach for Histogram Segmentation” Method

This method was proposed by [6]. They treated the problem in three stages in A–C:

  • step A: testing a histogram against a fixed hypothesized density;

  • step B: testing a histogram against a qualitative assumption (decreasing, increasing);

  • step C: segmenting a histogram and generating an estimate of its underlying density.

This algorithm is very clearly introduced by reference 3.

The result of this algorithm on a histogram is shown Fig.1.

jmis-4-4-189-g1
Fig. 1. (a) Initialization of the algorithm (all the local minima of the histogram). The histogram presents small oscillations, which create several local minima. (b) Final segmentation after FTC algorithm. Three modes are detected in this histogram, one is very small.
Download Original Figure
2.3 “A Threshold Selection Method from Gray-Level Histograms” /Otsu’s method/

Otsu’s thresholding method utilizes discriminant analysis to find the maximum separability of classes [5]. For every possible threshold value, the method evaluates the goodness of this value if used as the threshold. This evaluation uses either the heterogeneity of both classes or the homogeneity of every class. By maximizing the criterion function, the means of two classes can be separated as far as possible and the variances in both classes will be as minimal as possible. This method still remains one of the most referenced thresholding methods.

III. PROPOSED METHOD

In the Arifin and Asano’s work [8], a number of clustering algorithm was equal to all intensity of image. So, this is more computational. Also, this algorithm does not notice local minima of histogram of image.

In our work, we first will determine local minima of histogram of image and second create cluster of histogram using local minima which is determined by “nonparametric approach for histogram segmentation method. Therefore, we can greatly decrease the number of clusters of histogram.

Proposed histogram thresholding algorithm,

  1. Determine t={t1,t2,…,tn}all the local minima of the histogram using “Nonparametric Approach for Histogram Segmentation” method

  2. Create cluster Ck={Ct1, Ct2, Ct3,….., Ctn} of histogram.

  3. The following two steps are repeated (K - t) times for t-level thresholding.

    • 3.1. The distance between every pair of adjacent clusters is computed. The distance indicates the dissimilarity of the adjacent clusters, and will be defined in the next subsection.

    • 3.2. The pair of the smallest distance is found, and these clusters are unified into one cluster. The index of clusters Ck and Tk are reassigned since the number of clusters is decreased one by the merging.

IV. RESULT

The proposed method is applied to the three images and we have compared proposed method with Otsu’s and Arifin’s methods. Threshold values determined for each method are shown in Table 1.

Table. 1. Threshold values
Sample Threshold selection methods
Otsu’s method Arifin’s method Proposed method
Image1 128 18 23
Image2 112 117 124
Image3 122 161 140
Download Excel Table

From the histogram (Figure 2), it is clear that the proposed method is more accurate in local minima detection.

jmis-4-4-189-g2
Fig. 2. Histogram of Image1.
Download Original Figure

Following example figures illustrate comparison of proposed method and other two methods.

Figure 3 shows that segmented black and white dots are well balanced on white and black background, when sample image is segmented by proposed method. Moreover, Fig.4 and Fig.5 illustrate that proposed method works better than compared two methods.

jmis-4-4-189-g3
Fig. 3. (a) Image1 /original/, (b) result thresholded by Otsu method, (c) result thresholded by Arifin method, (d) result thresholded by Proposed method
Download Original Figure
jmis-4-4-189-g4
Fig. 4. (a) Image2 /original/, (b) result thresholded by Otsu method, (c) result thresholded by Arifin method, (d) result thresholded by Proposed method
Download Original Figure
jmis-4-4-189-g5
Fig. 5. (a) Image1 /original/, (b) result thresholded by Otsu method, (c) result thresholded by Arifin method, (d) result thresholded by Proposed method
Download Original Figure

V. CONCLUSION

By combining “Histogram thresholding using hierarchical cluster analysis” and “A Nonparametric Approach for Histogram Segmentation”, we can eliminate disadvantages of Arifin method and keep merging and computing distance procedure of Arifin method. Therefore, combined method is not only less computational than Arifin method because combined method has few clusters but also it uses local minima of histogram. In near future, we will need to compare other existing methods because Otsu and Arifin methods use in this work.

Acknowledgement

This study was supported by the Higher Education Reform Project, Mongolia.

REFERENCES

[1].

M. Sezgin and B. Sankir, “Survey over image thresholding techniques and quantitative performance evaluation,” Journal of Electronic Imaging, vol. 13, no. 1, pp.146-165, 2004.

[2].

Y. J. Zhang, “A survey on evaluation methods for image segmentation,” Pattern Recognition, vol. 29, pp.1335-1346, 1996.

[3].

R. Guo and S. M. Pandit, “Automatic threshold selection based on histogram modes and a discriminant criterion,” Mach. Vision Appl., vol. 10, pp.331–338, 1998.

[4].

S. H. Kwon. “Threshold selection based on cluster analysis,” Pattern Recognition Letters, vol. 25, pp.1045–1050, 2005.

[5].

Nobuyuki Otsu, “A Threshold Selection Method from Gray-Level Histograms,” IEEE Transaction on Systems, Man, and Cybernetics, Vol. 9, No. 1, 1979.

[6].

Wenbing Tao, Hai Jin and Liman Liu, “A New Image Thresholding Method Based on Graph Cuts,” Acoustics, Speech and Signal Processing, ICASSP. IEEE International Conference, Vol 1, pp. 605-608, April 2007.

[7].

J. Kittler and J. Illingworth, “Minimum error thresholding,” Pattern Recognition, vol. 19, pp. 41–47, 1986.

[8].

Agus Zainal Arifin, Akiro Asona, “Image segmentation by histogram thresholding using hierarchical cluster analysis,” Pattern Recognition Letter, vol. 27, pp. 1515-1521, 2006.

[9].

Julie Delon, Agnes Desolneux, Jose-Luis Lisani, and Ana Belen Petro, “A Nonparametrc Approach for Histogram Segmentation,” IEEE Transaction on Image Processing, Vol. 16, No. 1, January 2007.