I. INTRODUCTION
Image segmentation is one of the traditional research areas of image processing. The purpose of image segmentation is usually to recognize the objects in an image [1]. Image segmentation is also used as a method to reduce the computational amount when processing a lot of images. Since an image consists of many pixels, it is necessary to reduce the amount of computation when processing many images. For this purpose, an image is divided into groups of similar pixels, and each pixel group is processed like one pixel. A group of similar pixels is called a superpixel [2], [3].
A good superpixel method reduces the resolution of an image but maintains the boundaries of the objects in the image. Therefore, we can reduce the overfitting of images and unnecessary noises in image processing applications using superpixels. Superpixel technique is commonly used as a preprocessing stage in computer vision applications such as lane detection algorithms.
Various methods for generating superpixels have been proposed, such as the mean shift methods [4] and the graph-based methods [5]. There are three criteria for evaluating good superpixels [6]-[8]. First, one superpixel must have similar color characteristics, and the next requirement is to preserve the boundaries of objects in the image well. The last requirement is similar sizes and shapes of superpixels.
The criteria required in superpixels have trade-off relationships with each other, so it is difficult to improve all characteristics at the same time. Therefore, it is proper to generate superpixels focusing on specific purposes according to the requirements of the application. For example, for the applications that track objects, superpixels of similar shapes are more advantageous in finding objects than superpixels that preserve the exact boundary of objects [9].
Since superpixel methods are used in the preprocessing stage of the computer vision applications, many existing algorithms keep similar shapes of superpixels. When a target number of superpixels is given, each superpixel is generated to be similar in size to the average size of superpixels, which is the value of dividing the total number of pixels by the target number of superpixels. However, the complexities of the areas in an image are different. In order to preserve the boundaries of the objects, a complex area requires many superpixels, while a simple area requires a small number of superpixels. In this paper, we propose a superpixel generation method that preserves the boundaries of objects well by reflecting the complexity of the areas in an image.
II. RELATED WORK
There are a lot of studies on applying artificial intelligence technology to image processing. Since an image has a large amount of data, a lot of research is also being conducted to reduce the amount of computation by using superpixels. Image processing algorithms using superpixels can be divided into the methods where the boundary information of the objects is important, and the methods where the boundary information is not critical. When detecting whether a fight is taking place in an image, it is important to detect movement changes of objects [10]. When recognizing signals expressed by hand, it is important to recognize the geometric shape of the fingers [11]. In these cases, it is advantageous to generate superpixels of similar shapes.
On the other hand, when superpixels are used in applications that search for a certain content in images, boundary information becomes important [12], [13]. In applications that recognize human facial expressions, it is important to accurately detect boundaries in order to capture subtle changes in facial expressions [14]. In these cases, superpixels that are irregular but preserve the boundaries is advantageous. Especially in the case of medical images, it is important to preserve the boundaries well. In applications that automatically detect tumors in MRI images or detect liver regions in CT images, preserving boundaries well affects the performance [15], [16].
Currently, many applications use the SLIC (Simple Linear Iterative Clustering) method to create superpixels [17]. SLIC algorithm performs better in adhering to image boundaries, speed, and memory efficiency compared with existing superpixel algorithms. However, in the case of images containing different elements with complex patterns such as medical images, the SLIC method cannot accurately detect the boundaries. It remains challenging to have the boundary properly covered by superpixels especially for the superpixels at the border of the target area such as liver region in CT images [16].
The proposed method consists of two steps. In the first step, a sufficient number of superpixels are generated to preserve the boundaries of objects well using the existing method. In the second step, adjacent similar superpixels are merged to create the final result. In this paper, we use the SLIC method for the first step. Thus, before describing the proposed method, we review the SLIC method [17].
The SLIC algorithm first changes the color model from RGB to CIELAB for the input image. The next step is to establish k initial clustering centers, where k is the target number of superpixels. The initial centers are set at a regular interval. When the number of pixels of the image is A, the interval between the initial centers is determined by Eq. (1).
If the centers of the clusters are determined, similarity metrics between a pixel and the cluster centers in the region of 2S × 2S are calculated. That is, a pixel compares similarity with the initial centers, which are in the region of 2S × 2S, and then it is clustered into the most similar cluster center. The similarity is measured by the distance, and the closer the distance, the higher the similarity. The distance between a pixel and a cluster center is determined by considering not only the color component but also the location. The feature vector of pixel i is composed of the color information (li, ai, and bi) and position information (xi and yi). The feature vector of the kth cluster Ck is also composed of the color information (lCk, aCk, and bCk) and the position information (xCk and yCk) as
A distance metric between pixel i and cluster Ck is defined as
where λ = r/S, and r is the weight used to combine Dc, which is the distance measured using color information, and Ds, which is the distance measured using location information. In the SLIC algorithm, each pixel compares the distances for cluster centers in the 2S × 2S area around the pixel to determine its cluster. Therefore, the SLIC algorithm has a constant number of distance calculations regardless of the number of clusters. Also, since each pixel calculates its distance only to its adjacent cluster centers, the SLIC algorithm can quickly generate superpixels.
Fig. 1 shows the results of generating superpixels by applying the SLIC algorithm to Bee image. Bee image consists of simple backgrounds and a bee and flowers that show intricate patterns. As shown in the figure, the SLIC algorithm segments the simple background into superpixels of relatively constant size. It can also be seen that the boundaries of the bee area and flower area having a complex pattern are not well preserved. Depending on the application, it can be more efficient to adjust the size of the superpixel by reflecting the complexity of the area in the image rather than to divide the image into superpixels of similar sizes.
III. PROPOSED SUPERPIXEL GENERATION METHOD
The proposed method generates the target number of superpixels by merging the over-segmented superpixels, which are generated using the SLIC algorithm. In the proposed scheme, the feature vector of each superpixel defined by Eq. (2) is modified, as shown in Eq. (6).
where nCk denotes the number of pixels constituting the superpixel, and pCk represents the index of the parent superpixel. pCk is initialized to Ck value, which means pCk points to iteself. In agglomerative clustering, if pCk refers to itself, Ck is the top superpixel of the cluster. When superpixel Ci and superpixel Cj merge, if the index i is smaller than j, the superpixel Ci becomes the parent, and superpixel Cj is merged into superpixel Ci. That is, superpixel Cj becomes a child of superpixel Ci, so pCj is updated to Ci.
The proposed method merges superpixels using the graph-based agglomerative clustering method [5]. Therefore, the first step to merging is to create a graph of over-segmented superpixels. A graph of superpixels G consists of the node set V and the edge set E. Each node corresponds to each over-segmented superpixel. Thus, if the number of over-segmented superpixels is N, the number of nodes is N. An edge is defined between adjacent two superpixels, so the number of edges M is determined by the connection state of the entire superpixels.
where ei is the ith edge. Each edge is defined by three elements, as shown in Eq. (10). The first two elements of an edge are superpixel Ci and superpixel Cj that constitute the edge. The third element of Eq. (10) is the weight w indicating the distance between the two superpixels. The edge distance De between two superpixels is defined as the norm of the color component difference vector of two superpixels as Eq. (12).
Once the graph is completed, the edges are sorted based on w. In the sorted edge set E, edges are selected in order, and the superpixels comprising the selected edge are merged. If superpixel Ci and superpixel Cj are merged, and i is less than j, superpixel Cj becomes a child of superpixel Ci. Thus, the feature vector fCi is updated with the information of the merged superpixel. In the case of Cj, only pCk is updated to Ci.
The proposed method limits the size of the merged superpixel. That is, two superpixels are merged only when the number of pixels constituting the two superpixels is less than the maximum size. The maximum size MS is controlled by the parameter α indicating a multiple for the average size of superpixels as Eq. (13).
where N* is the target number of superpixels. Thus, A/N* is the average number of pixels of final superpixels.
In addition, the proposed method controls the shapes of superpixels. When we limit the maximum size of each superpixel, the shape of each superpixel can be very irregular. We assume that the ideal shape of a superpixel is a circle. When the number of pixels constituting a superpixel is AC, and the superpixel has the ideal circle form, the radius r of the superpixel is approximated as Eq. (14).
Fig. 2 shows an example of a superpixel created by merging several superpixels. In the figure, a square means a superpixel created by the SLIC method. The actual superpixels vary in form, but they are assumed in the same form to explain the concept of roundness metric easily. Let the number of pixels of the merged superpixel be AC. The circle in Fig. 2 is a circle with the radius r centered on the average coordinate of all pixels. Let Ao denote the number of pixels that make up the superpixels whose centers are located outside this circle. The roundness of the merged superpixel can then be defined as Eq. (15).
As the shape of a superpixel is closer to the circle, the roundness metric R is closer to 1. When two superpixels are candidates to merge, the roundness value R is calculated for two superpixels, and they are merged only when the roundness value is greater than the threshold ThR.
Algorithm 1 describes the proposed superpixel generation method. The algorithm basically repeats the process of merging superpixels until the number of superpixels reaches N*. From the sorted E, edge et is extracted in order. Merging is made through the parent superpixel, so we look for the parent superpixels of the two superpixels that make up the edge. If parent superpixels are the same, it means that two superpixels have already been merged. If parent superpixels are different, the two superpixels can be merged. Before two superpixels merging, it should be confirmed that two superpixels satisfy the size constraint and the roundness constraint.
Input: G(V, E), N, N*, MS, and ThR
While N > N*
Take out et = (C1t, C2t, wt) from the sorted E
C1 = find_parent(C1t)
C2 = find_parent(C2t)
if(C1 = C2) continue
if(nC1 + nC2 > MS) continue
if(Roundness(C1, C2) < ThR) continue
Merge(C1, C2)
N = N − 1
end
Output: Updated G(V, E)
IV. EXPERIMENTAL RESULTS
In order to verify the performance of the proposed method, we compare the results of the proposed method with the results of the SLIC algorithm. First, we generate superpixels by applying the SLIC algorithm to Bee image, which is used in the algorithm description. Fig. 3 shows the result of superpixel segmentation with the target number of 300. The other parameters of the SLIC algorithm are set to the default values.
The proposed method can control the maximum size of a superpixel with parameter α. When the target number of superpixels is given, the average size of superpixels is determined. Then the maximum size of a superpixel is calculated as α times the average size. Fig. 4 shows the results of generating 300 superpixels with various α values. As the α value increases, the sizes of superpixels become more diverse. We can also see that the sizes of superpixels are determined according to the complexity of the area in the image.
The proposed method can also control the shapes of superpixels. We define the roundness as Eq. (15) and control the shapes of superpixels using threshold ThR. Fig. 5 shows superpixels created when ThR is set to 0.3 and 0.8, respectively. The small value of ThR means that there is no constraint on the shapes so that the superpixels divide the boundaries more accurately. However, the shapes of superpixels are irregular, and it can be a disadvantage depending on the purposes of applications. It can be seen that the shapes of superpixels become relatively constant with the large ThR value.
Fig. 6 shows the results of magnifying the head area for precise analysis. Fig. 6(a) and Fig. 6(b) are the head parts of Fig. 3 and Fig. 5(b), respectively. It can be seen that the proposed method allocates more superpixels to a complex area than the SLIC method, and preserves the boundaries of the objects more accurately.
We analyzed the performance of the proposed method for various images. Figure 7 shows the results of applying the superpixel algorithms to four images from the Berkeley database [18]. The target number of superpixels is set to 300, and parameter α is set to 5. The first row shows the original images, and the second row shows the results of applying the SLIC method. The third and fourth rows show the results of applying the proposed method with parameter ThR values of 0.3 and 0.8, respectively. The images in the fifth and sixth rows are the results of magnifying the areas containing complex elements in the images of the second and fourth rows, respectively. The third and fourth rows show that the proposed method can adequately control the shapes of superpixels. It can be seen from the images in the fifth row and sixth row that the proposed method segments the region more accurately than the SLIC method by allocating many superpixels to the intricate areas of the images.
It is examined how accurately the superpixels are generated for the target number of superpixels. Table 1 shows the target number of superpixels and the number of actually generated superpixels for five images. Since the SLIC method does not guarantee perfect connectivity for all pixels, it can be seen that the number of generated superpixels is less than the target number [17]. Since the proposed method continues merging process until the number of superpixels reaches the target number, the number of generated superpixels matches the target number.
Image | N* | Number of superpixels | |
---|---|---|---|
SLIC | Proposed | ||
Bee | 300 | 279 | 300 |
Airplane | 200 | 267 | 300 |
Eagle | 300 | 265 | 300 |
Harbor | 300 | 275 | 300 |
Rock | 300 | 256 | 300 |
The execution speed of the proposed method was compared with that of the SLIC method. The implementations were run on a 3.50GHz processor with 16GB of RAM. Table 2 shows the execution time of the proposed method and the SLIC method for five images. It can be seen that the proposed method takes about two to three times the execution speed of the SLIC method. To analyze the processing time of the proposed method, the proposed method was divided into the over-segmentation step, the graph generation step, and the superpixel merging step. Table 2 shows the step-by-step processing time of the proposed method. The proposed method takes more time than the SLIC method because the SLIC method is used in the over-segmentation step. It can be seen that it takes a lot of time to generate a graph for the over-segmented superpixels compared with the time to merge using the graph. It is because it takes a long time to create the feature vectors of all the superpixels and create an ordered graph of them.
The most important property of a superpixel method is its ability to adhere to object boundaries, and boundary recall is standard measure for boundary adherence [17]. Boundary recall measures what fraction of the ground truth edges fall within at least two pixels of a superpixel boundary. Fig. 8 shows the result of measuring boundary recall for 300 superpixels generated by applying the proposed method to the eagle image. The recall value of the superpixels created using the SLIC method is 0.86. It can be seen that the boundary recall of the proposed method is superior to the SLIC method. Fig. 8 shows that increasing the maximum size of superpixels improves the boundary recall performance. In addition, it can be seen that reducing the roundness threshold value improves the boundary recall performance. However, as the maximum size of the superpixels increases and the roundness threshold value decreases, shapes of the superpixels become irregular. Therefore, when applying the proposed method to an application, it is necessary to set the parameters appropriately for the purpose.
V. CONCLUSION
In this paper, we proposed a superpixel generation method that can adequately adjust the shapes of superpixels. Since superpixel segmentation is used in the preprocessing stage in computer vision applications, it is common to create superpixels in similar shapes and sizes. However, in order to accurately segment the boundaries of the objects in an image, it is advantageous to create superpixels with various shapes by reflecting the contents of the image. In this paper, we propose a method of generating superpixels with various shapes. The proposed method consists of two steps. The first step focuses on preserving the boundaries of the objects in an image and creates a sufficient number of superpixels. In the second step, superpixels are merged by reflecting the complexity of the area in the image. The shapes of superpixels are controlled by limiting the size and roundness. The experimental results confirmed that the proposed method maintains the boundaries of objects well by generating superpixels of various shapes in consideration of the complexity of the area. In this paper, the focus was not on processing speed. Thus, the proposed method takes about two to three times the execution speed of the SLIC method. In the future study, we will cover how to save time when generating graphs.