I. INTRODUCTION
The development of IT based GIS service and cartography technique enables the GIS vector map to be detailed and enriched by more detailed geographic information. However, since the detailed vector map needs huge amount of data, the problem is rearing how to process, store, and transmit them. Especially, web GIS or mobile GIS have the susceptibility for huge data because of limits of transmission bandwidth and mobile device for output, processing, and analysis of geographic information in realtime. Universal lossless compression of zlib and 7-zip has been widely used for the vector map compression based on entropy coding. However, they cannot do the random access for specific region or object in compressed vector map. In this case, they should recover all data of vector map for the random access. Furthermore, they have the limit of the compression efficiency because they do not consider the correlation of objects. Many researchers have worked for the effective vector map compression [1-6].
This paper addresses a combined method of simplification and lossless compression to reduce the data quantity while preserving the map quality and to cope with different GIS applications. Main features of our method are as follows. First, our method improves the compression efficiency with the shape preservation by poyline featurebased simplification and vector derivative-based compression. The former is a loss compression and the latter is a lossless compression. Second, polyline featurebased simplification enables precision and compression ratio to control or vary by a weighting factor. Third, vector derivative-based compression converts vector data to integer part and fractional part of second derivative vector in order to compact the spatial energy of vector data. The integer part that indicates the broad position of object is compressed by the correlation of neighbor vertices. The fractional part that indicates the close position of object is compressed progressively by varying the fraction precision.
Our experiment evaluated the performance of simplification and compression of our method by comparing with Park’s simplification method [4] and Jang’s compression method [5],[6], which are latest methods. From experimental results, we verified that the simplification rate of our method is about 4% higher than that of Park’s method while the distance error and area error of our method is about 0.4[m] and 0.6[m2] less than those of Park’s method. Furthermore, we verified that the compression ratio of our method is about 10% higher than that of Jang’s method.
II. PROPOSED VECTOR MAP SIMPLIFICATION
We address a hybrid compression for vector map of polyline feature-based simplification and vector derivativebased data compression. Our method has two aims as follows; 1) reduce vertices while preserving the global shape of object and 2) enable to vary different precisions and control compression ratio. The process of our hybrid compression for vector map is shown in Figure 1. Polyline feature-based simplification detects feature points of polyline using DP, SF, and TF algorithms and applies an algorithm with minimum area error to sections. Vector derivative-based data compression converts all data of vertices in simplified polyline to integer part and fractional part of second vector derivative and compresses two parts respectively.
Our method performs on all polylines in a vector map and segment polylines by feature points that reflects the object shape. Then it simplifies the segments so that the area difference is minimized. The detail process of our simplification is as follows.
We extract feature points using DP, SF, and TF algorithms. Figure 2 shows parameters for three algorithms; the maximum perpendicular distance dmax for DP algorithm, the angle tolerance ϴ for sector bound for SF algorithm, and the length threshold lth and orientation angle threshold ∆ϴth of a segment line for TF algorithm. Given a polyine L in any layer, let us consider that L consists of N vertices, L = {υ0, υ1,…, υN−1}, and a vertex has two coordinate values υi = {xi, yi}. We generate three simplified polylines of a polyline L using DP, SF, TF algorithms.
We define feature points FL for a polyline L as vertices that exist in three simplified polylines LDP, LSF, LTF ; for and and and . We use feature points as break points for segmenting polyline. Thus, a polyline L is segmented to NF sections S = {Sk|k ∈[0, NF − 1]} using NF feature points. A section Sk consists of k − 1th feature point and kth feature point and middle vertices between two feature points;
Our method simplifies a polyline L as a unit of segmented section Sk using DP, SF, and TF algorithms. Thus, we assign one among three algorithms to a section Sk using the minimum area difference (MAD) and then simplify a section Sk by an assigned algorithm. The minimum area difference (MAD) measures how different the simplified polyline is with the original polyline. We explain how to calculate MAD for a section in next subsection.
We use the weighting factor α for controlling the simplification intensity and for generating vector maps with different precisions. Thus, we set parameters for three algorithms by the weighting factor and initial parameters.
where α ≥ −1
Let us look at how to compute MAD for a polyline, which is used for simplifying vertices while preserving the shape of original polyline. A section Sk = {υl, υl+1,⋯, υl+Nk} likes a sub-polyline with the start vertex and end vertex of feature points. The process for computing MAD, MADk, of a section is as follows.
-
Consider Sk as sub-polygon type by connecting the start vertex to the end vertex.
-
Compute area Ak of sections Sk with Nk vertices.
And compute areas of sections that correspond to simplified polylines LDP, LSF, LTF using the above equation. Since Sk uses two feature points as endpoints, simplified polylines LDP, LSF, LTF are segmented to the same sections to original polyline. However, these sections include different vertices and different number of vertices.
-
Compute the differences between an area Ak and three areas Denote these differences as
for all k ∈ [1, Nk]
Since Ak for an original polyline is always higher than for simplified polylines, the differences are over zero. Define MADk for Sk as the minimum among
δ0 and ε0 are initial distance threshold for DP and SF and lth,0 and ∆ϴ0 are initial length threshold and initial angle threshold for TF. In this paper, we set δ0, ε0, lth,0 to 1[m] and set ∆ϴ0 to 5°.
The weighting factor α can control the compression ratio and the number of simplified vertices. If α = –1, a polyline do not be simplified. If α increases from -1, the simplified vertices increase while the data capacity decrease. However, the increased α produces the degradation of shape.
We separate the data of vertices in a simplified polyline L′ to two parts; integer parts VI and fractional parts VR. Fractional parts are converted to integer types by the reciprocal number of the precision c that indicates the precision level of map data. Thus, two data of integer part and fractional part of a vertex υ′j can be defined by
for all j ∈ [0, M − 1].
The integer part of vertex data represents the primary position of vector map. Therefore, our method compresses VI with lossless using the bounding box of object MBR. We firstly compute the first-order derivative vector of integer part on the left-top point MBRmin of MBR
and the average derivative vector . Then we compute the second-order derivative vector as follows.
All integer parts VI are converted to the magnitudes and signs of the second-order derivative vectors
The fractional part represents the detail precision of vector map. This part is sensitive to small deviation unlike integer part and is little correlated to each other. We separates fractional parts to precision data as a unit of 8-bit and rearranges precision data by precision levels. Example, the data of nth. precision level for can be defined as
This compression enables the vertex data to be transmitted or stored to different precision levels or preferred precision levels.
From the above processes, the integer part VI and fractional part VR for a polyline L can be converted to data for compressing. The coded data for a polyline has the structure of a polyline label (object index), a size of coded data, a number of vertices, a MBR coordinate value, and a payload, as shown in Figure 3. A MBR is for easy access to a polyline. A payload is the coded data of integer part and fractional part. Since the compression and recovery are performed as the unit of a polyline object, our method makes the random access of polyline objects easy.
III. Experimental Results
Our experiment used shapefiles (SHP format) of two cities with different layers and applied our method to layers with polyline object in shapefiles. A shapefile format that is a popular geospatial vector data format.
Our experiment evaluated the subject quality between the original map and compressed map. Figures 4(a),4(b) show full layers of original maps and compressed maps. From these figure, we know that though about 29% and 20% vertices are removed in compressed vector maps, there are no visually difference between original vector map and compressed vector map. From these figures, we know that it is very difficult to discriminate visually between original layer and compressed layer. We investigated the degradation of quality in enlarged maps. The quantitative criteria for evaluating quality in the enlargement has not been provided yet. Therefore, we enlarged two of original map and compressed map to arbitrary scales and looked the discrimination of two enlarged maps. From these results, we know that the degradation of compressed map comes into sight from about 2,356x. Thus, it should be enlarged our compressed map to large scale to see the degradation of quality.
Our method in α = 0 has the same parameters of three simplification algorithms as Park’s method. We compared our method in α = 0 to Park’s method. The simplification rates for layers of our method are lower 0.1%-10.08% than those of Park’s method. Thus, our method simplifies vertices of layers more higher 0.1%-10.08% than Park’s method. Moreover, our method simplifies vertices in a map higher 4% than Park’s method. Layers of facility and administrative boundary are simplified to about 94%-99% because there are a number of simple polylines. However, layers of river and road are simplified to about 63%-74% because there are a number of complex polylines.
We measured simplification rate r and mean area difference Ᾱ of our method on different weighting factors from -1.0 to 1.0. The measurement results are shown in Figures 5(a) and 5(b). As α increases, r decreases while Ᾱ increases. When α = 1, r decreases about 57%-65% while Ᾱ increases about 9.88[m2]-12.14[m2]. However, when α = −0.4, r increases about 92%-94% while Ᾱ increases about 0.25[m2]-0.40[m2]. From these results, we verified that our method can control the simplification rate and mean area difference varying the weighting factor. This enables users to receive simplified vector maps on different GIS applications.
We compared the compression efficiency of our method to that of Jang’s SEC method. Hereby, we experimented two ways of our method for comparing with Jang’s method. The first way is to do only the process of lossless compression that preserves data of all precision levels in fractional part. This way passes over the simplification. The second way is to do the simplification with α = 1 and lossless compression. The compression efficiency of our lossless compression has slightly higher 0.43%-3.60% than that of Jang’s method. Furthermore, the compression efficiency of our simplification and lossless compression has higher about 3.89%-16.58% than that of Jang’s method and higher about 3.47%-13.84% than that of only lossless compression. When looking at the results on layers, layers of river and tributary that has a number of polylines are compressed to high rate. However, layer of facility that has a few of polylines is compressed to low rate. It is natural that the polyline based compression method depends on the number of polylines on layers.
III. CONCLUSION
We have addressed a polyline simplification and compression for the effective transmission or storage of vector maps for various GIS applications. The proposed method consists of two processes of polyline feature pointbased simplification and the second derivative-based data compression. The first process for simplification detects the feature points of polyline using three algorithms of DP, SF, and TF and divides a polyline to sectors by feature points. The next process for data compression segments the second derivative of simplified polylines to integer parts and fractional parts. Experimental results showed that our method has good subjective quality and also has more simplification efficiency than Park’s simplification method and has more compression efficiency than Jang’s compression method. Furthermore, our method can provide the closest quality to original map with about 23% vertices of original map and generate vector maps with different precision levels or compression ratios.