CN111446968A

CN111446968A - Multilevel Compression Method for Vector Spatial Data

Info

Publication number: CN111446968A
Application number: CN202010314228.3A
Authority: CN
Inventors: 王涛; 刘东阁; 李小娟; 倪叶青
Original assignee: Capital Normal University
Current assignee: Capital Normal University
Priority date: 2020-04-20
Filing date: 2020-04-20
Publication date: 2020-07-24
Anticipated expiration: 2040-04-20
Also published as: CN111446968B

Abstract

The disclosure belongs to the technical field of spatial information, and particularly relates to a method for multilevel compression of vector spatial data. The method takes the visualization of basic geospatial data as its application scenario, sets the minimum discernible distance that satisfies multiple display scales as the precision requirement, realizes multilevel compression of vector spatial data, and further improves data storage efficiency by combining grid filtering with binary offset storage.

Description

Multilevel Compression Method for Vector Spatial Data

Technical Field

The present disclosure belongs to the technical field of spatial information, and in particular relates to a method for multilevel compression of vector spatial data.

Background

With the development of Earth observation and navigation-positioning equipment and of Internet technology, the capacity to produce all kinds of vector geospatial data has grown steadily, and the volume of stored data has grown rapidly as a result. This poses new challenges to the efficiency of spatial data storage, query, network transmission, visualization and other functions in geographic information systems and other application information systems.

It is well known that data compression reduces data volume and improves the efficiency of every stage of data processing, thereby better supporting spatial data application scenarios, including mobile Internet geographic information systems.

The vector data model is one of the most fundamental models for representing geographic feature entities in geographic information systems. Under a given geospatial coordinate system, it abstracts geographic entities into geometric entities such as points, lines and areas, and represents them by recording the spatial coordinates of characteristic points and defining rules for how point sets are expressed. It describes entity objects completely and makes spatial relationships easy to obtain.

Summary of the Invention

The purpose of the present disclosure is to provide a multilevel compression method for vector spatial data. The method proceeds as follows:

dividing the space expressed in geographic coordinates level by level, by bisection in the horizontal and vertical directions, to obtain a multi-level grid that partitions the geographic space;

determining the position, within the grid, of each coordinate point of the data to be compressed, and continuing to subdivide the grid so that the coordinates of the center point of the subdivided grid cell replace the coordinate values of the coordinate point, wherein the coordinate deviation between the center point of the subdivided grid cell and the coordinate point conforms to a preset value for multi-scale precision;

taking a single data file to be compressed as the unit, setting a local coordinate reference system, and calculating the binary coordinate deviation between the coordinate points of the data to be compressed and the center points of the subdivided grid cells to obtain binary offsets;

storing the geographic coordinates of the vector spatial data as the binary offsets.

Further, when dividing in the horizontal and vertical directions, a binary digit identifies the left or right, or the upper or lower, subspace produced by each division.

Further, the preset value is the minimum discernible distance for representation under multi-scale visually lossless conditions.

Further, the step of determining the position of the coordinate point of the data to be compressed in the grid, and continuing to subdivide the grid so that the coordinates of the center point of the subdivided grid cell replace the coordinate values of the coordinate point, comprises:

obtaining the coordinate data of the coordinate point of the data to be compressed;

obtaining, from the geospatial data, the minimum and maximum values of the latitude range covered by the grid cell that contains the coordinate point;

determining whether the latitude of the coordinate point is greater than the midpoint of the latitude range;

if the latitude is greater than the midpoint of the latitude range, subdividing the grid so that the latitude range of the cell shrinks to the interval from the midpoint to the maximum;

if the latitude is less than or equal to the midpoint of the latitude range, subdividing the grid so that the latitude range of the cell shrinks to the interval from the minimum to the midpoint;

repeating the latitude division recursively, and dividing the grid by longitude in the same way, until the deviation between the center point of the grid cell and the coordinate point of the data to be compressed conforms to the preset value.

Further, the maximum division error between the center point of the subdivided grid cell and the coordinate point of the data to be compressed is calculated by the following formula:

where Width is the grid cell width, Height is the grid cell height, and Scale is the map scale.

Further, the grid cell width Width and the grid cell height Height are calculated by the following formulas:

Width = width/2ⁿ;

Height = height/2ⁿ.

Further, when the data to be compressed are point vector features, the binary offsets of the point vector features are calculated as follows:

sorting the point vector features by their binary codes;

recording the first point as its original binary code, and storing each subsequent point as the number of grid cells by which it is offset from the previous point, converted to binary;

the length of an offset binary code is greater than 1 and less than the length of the original binary code, and the offset code length is unified to the maximum length among all offset binary codes;

records with fewer offset bits are padded with leading "0" bits at the front of the binary code sequence.

Further, when the data to be compressed are line vector features or area vector features, the binary offsets of the line or area vector features are calculated as follows:

recording, for each vector feature, the number of nodes of the feature object and the coordinates of its starting point, and calculating the offsets per feature object;

setting direction bits: calculating the direction in which each coordinate point is offset from the previous coordinate point, and storing the four possible offset directions of each point in a two-bit binary code;

calculating the offsets: recording the column and row offsets along the horizontal and vertical directions respectively, converting them to binary codes, and storing them interleaved as the final offset;

the length of an offset binary code is greater than 1 and less than the length of the original binary code, and the offset code length is unified to the maximum length among all offset binary codes;

records with fewer offset bits are padded with leading "0" bits.

The vector spatial data compression method provided by the present disclosure takes the visualization of basic geospatial data as its application scenario, sets the minimum discernible distance that keeps screen display visually lossless as the precision requirement, and realizes compression of vector spatial data; combining grid filtering with binary offset storage further improves data storage efficiency. The compression ratio is also higher than that of the prior art.

Brief Description of the Drawings

Fig. 1 is a flowchart of the vector spatial data compression algorithm according to the present invention;

Fig. 2 is a schematic diagram of the subdivision of geographic space;

Fig. 3 is a schematic diagram of the grid filtering process;

Fig. 4 is an evaluation chart of the maximum division error for different numbers of divisions;

Fig. 5 is a schematic diagram of the binary offset storage format for point features;

Fig. 6 is a schematic diagram of the binary offset storage format for line and area features;

Fig. 7a is the original geographic information image;

Figs. 7b, 7c and 7d are the compression results for bit lengths of 16, 18 and 20 respectively, obtained with different numbers of divisions;

Fig. 8 is a schematic diagram of the compression results at different bit lengths.

Detailed Description

As described above, the vector data model is one of the most fundamental models for representing geographic feature entities in geographic information systems. It represents geographic feature entities by recording the spatial coordinates of characteristic points and defining rules for how point sets are expressed, describes objects completely, and makes spatial relationships easy to obtain.

Because vector geospatial data express geometric information with high-precision coordinate values, they are more precise than raster data structures and are suited to expressing dynamic, multi-type, complex geographic phenomena; crowdsourced geographic information and Internet-of-Things positioning information are therefore mostly expressed with the vector data model. Many methods have been proposed for compressing vector data structures. For example, the patent document with application number 2010101806110, titled "Efficient Vector Data Transmission Method Based on Pixel-Lossless Compression of Ordered Point Sets", proposes a compression method based on ordered point sets on the premise of pixel losslessness; the patent document with application number 2014100243164, titled "A Method for Compressing Vector Data", discloses an offset-based vector data compression method; and the document with application number 2015107723716, titled "Compression and Drawing Method and Device for Scalable Vector Graphics", proposes an optimized storage method for the SVG (Scalable Vector Graphics) format. These patent documents propose solutions for compressing vector spatial data, but problems remain, such as limited compression ratios, complex storage formats, and the inability to achieve multi-resolution compression.

In view of the above analysis, the inventors, through long-term creative work, propose a vector spatial data compression method that effectively solves the technical problems identified above. The method first subdivides the geographic space repeatedly, alternating between the horizontal and vertical directions, according to the coordinate positions of the vector characteristic points, and builds a binary integer code bit by bit from the position of the point after each subdivision. The number of subdivisions is determined by the specified compression precision: the more levels of subdivision, the higher the precision and the longer the code, which makes multi-resolution vector data compression possible. The low-precision code of a given region is a prefix of the high-precision code of that region, so the longer the common prefix of two codes, the closer the two grid cells are in space. The vector spatial data compression method provided by the present disclosure takes the visualization of basic geospatial data as its application scenario, sets the minimum discernible distance that keeps screen display visually lossless as the precision requirement, and realizes compression of vector spatial data; combining grid filtering with binary offset storage further improves data storage efficiency. The compression ratio is also higher than that of the prior art.

The vector spatial data compression method proposed by the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments. The advantages and features of the present invention will become clearer from the following description and the claims. It should be noted that the drawings are in a highly simplified form and are not drawn to precise scale; they serve only to help explain the embodiments of the present invention conveniently and clearly.

Spatial terms such as "under", "below", "lower" and "upper" are used in the detailed description of the present disclosure to make it easy to describe the positional relationship between one component and another as shown in the drawings, but these are only examples and are not intended to limit the present invention. In addition to the orientations shown in the drawings, the spatial terms cover the various other orientations of the device in use or operation. The device may be positioned otherwise, for example rotated 90 degrees or at other orientations, and the spatial descriptors used herein are to be interpreted accordingly.

The steps of the multilevel compression method for vector spatial data provided by the present disclosure are as follows:

Step 1: Based on the display precision of the device, the map scale and the display scale, determine the minimum discernible distances (Minimum Discernible Distance, MDD) at which the vector spatial data can be represented visually losslessly at multiple scales. This comprises the following sub-steps:

S11: For a display terminal with a display precision of DPI (Dots Per Inch), one inch of length contains DPI pixels. In the International System of Units, 1 meter equals 39.3701 inches; this value is the number of inches per meter (Inches Per Meter, IPM). The length in meters that one pixel on the display terminal represents (Meters Per Dot, MPD) is then:

MPD = 1/(DPI × IPM)    (1)

This distance is the minimum discernible distance that keeps computer display visually lossless.

S12: Every map has a map scale (Scale). An electronic map displayed on a computer can be zoomed in and out as required, which changes the display scale (Display Scale, Dscale) of the map. With Scale and Dscale expressed as scale denominators (for example 1,000,000 for 1:1,000,000), the magnification (Ratio) during map display is calculated as:

Ratio = Scale/Dscale    (2)

S13: S11 gives the minimum distance that a screen pixel can resolve. For a particular map sheet displayed on the screen, the minimum discernible distance, expressed as actual ground distance, is:

MDD = MPD × Dscale    (3)

    = MPD × Scale/Ratio    (4)
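As a non-authoritative illustration, formulas (1), (3) and (4) can be evaluated as in the following Python sketch. The function names and the 96 DPI screen in the example are assumptions made for illustration and do not come from the disclosure; Scale is taken to be the scale denominator (1,000,000 for a 1:1,000,000 map).

```python
INCHES_PER_METER = 39.3701  # IPM, the value given in S11


def meters_per_dot(dpi: float) -> float:
    """Formula (1): length in meters covered by one screen pixel."""
    return 1.0 / (dpi * INCHES_PER_METER)


def minimum_discernible_distance(dpi: float, scale: float, ratio: float = 1.0) -> float:
    """Formula (4): MDD = MPD * Scale / Ratio, in ground meters.

    `scale` is the map scale denominator; `ratio` is the display magnification.
    """
    return meters_per_dot(dpi) * scale / ratio


# 96 DPI is an assumed screen resolution, not a value from the source.
mdd = minimum_discernible_distance(dpi=96, scale=1_000_000)
print(f"MDD at 1:1,000,000 on a 96 DPI screen: {mdd:.1f} m")
```

Under these assumptions the MDD is roughly 265 m, and doubling Ratio halves it.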

Step 2: Divide the space expressed in geographic coordinates level by level, by bisection along the horizontal and vertical directions, so that the geographic space is partitioned into a multi-level regular grid, as shown in Fig. 2.

S21: Subdivide the two-dimensional space alternately along the horizontal and vertical directions, using the binary digit "0" or "1" to identify the left or right, or the lower or upper, subspace after each division. After a division along the horizontal direction, the left subspace is labeled "0" and the right subspace "1"; after a division along the vertical direction, the lower subspace is labeled "0" and the upper subspace "1". Based on Morton coding, the labels "0" and "1" are stored alternately, left-right first and then up-down, which produces the unique identifiers "00", "01", "10" and "11" of the cells and divides the geographic space into a regular grid of four cells.

S22: Each cell of the four-cell grid produced by S21 is then subdivided in turn by the procedure of S21. Each round of division produces one level of the grid. The new labels produced during subdivision are stored alternately and appended to the labels of the previous level, forming a binary string in one-to-one correspondence with the position of the grid cell. The deeper the subdivision, the longer the binary string, the smaller the area each cell represents, and the more precise the position information.

S23: After the geographic space has been subdivided recursively, every region has a unique code, and the codes are clearly hierarchical in space: codes of the same region at different levels share a prefix, and the closer two regions are, the longer their matching prefix.
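The subdivision of S21 to S23 can be sketched in Python as follows. This is a minimal illustration rather than the patented implementation: the function name `cell_code`, the `bounds` tuple and the unit-square extent are hypothetical, and assigning a point that lies exactly on a dividing line to the "0" half is an assumption.

```python
def cell_code(x, y, bounds, levels):
    """Binary string of the grid cell containing (x, y) after `levels` rounds of division."""
    x_min, y_min, x_max, y_max = bounds
    bits = []
    for _ in range(levels):
        # Horizontal division: left half -> "0", right half -> "1".
        x_mid = (x_min + x_max) / 2
        if x > x_mid:
            bits.append("1")
            x_min = x_mid
        else:
            bits.append("0")
            x_max = x_mid
        # Vertical division: lower half -> "0", upper half -> "1".
        y_mid = (y_min + y_max) / 2
        if y > y_mid:
            bits.append("1")
            y_min = y_mid
        else:
            bits.append("0")
            y_max = y_mid
    return "".join(bits)


unit_square = (0.0, 0.0, 1.0, 1.0)  # illustrative extent, not from the source
coarse = cell_code(0.7, 0.2, unit_square, levels=1)
fine = cell_code(0.7, 0.2, unit_square, levels=3)
print(coarse, fine, fine.startswith(coarse))
```

For the point (0.7, 0.2) the level-1 code is "10" (right, lower) and the level-3 code is "100011", so the coarse code is a prefix of the fine one, as S23 describes.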

Step 3: Determine the position of each coordinate point of the data to be compressed in the multi-level regular grid, and replace the original double-precision coordinate values with the binary grid code. Each vector coordinate point is replaced by the center point of the regular grid cell in which it falls.

S31: For the coordinate point (117.67198438°E, 42.1855639°N) of the data to be compressed, its position in the regular grid is determined as follows:

S311: Obtain the latitude range covered by the geospatial data, from the minimum 40°N to the maximum 44°N.

S312: Find the midpoint of the latitude range and determine whether the latitude coordinate is greater than the midpoint.

S313: If it is greater, output the character "1" and shrink the range to the interval from the midpoint to the maximum; otherwise output the character "0" and shrink the range to the interval from the minimum to the midpoint.

Steps S312 and S313 together divide the grid by latitude.

S314: Repeat S312 and S313 recursively so that the latitude division converges on the exact coordinate value, and concatenate the output characters into a string. With n = 9 divisions, 42.1855639°N is represented by the binary code 100010111.

S315: Longitude is divided on the same principle as latitude; with n = 9 divisions, 117.67198438°E is converted to the binary code 100111101.

S316: Interleave the binary codes of the latitude and longitude to obtain the binary code 110000011101111011 of the coordinate point (117.67198438°E, 42.1855639°N), which also fixes the position of the coordinate point in the regular grid.
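The worked example of S311 to S316 can be reproduced with the short Python sketch below. The helper names `bisect_bits` and `interleave` are hypothetical. The longitude extent is not stated in the text, so the longitude bit string is taken as given; the latitude-bit-first interleaving order is inferred from the published 18-bit result rather than stated by the source.

```python
def bisect_bits(value, low, high, n):
    """Bisect [low, high] n times around `value` (S312 to S314).

    Returns the bit string and the center of the final interval.
    """
    bits = []
    for _ in range(n):
        mid = (low + high) / 2
        if value > mid:
            bits.append("1")  # keep the upper half: midpoint to maximum
            low = mid
        else:
            bits.append("0")  # keep the lower half: minimum to midpoint
            high = mid
    return "".join(bits), (low + high) / 2


def interleave(first, second):
    """Alternate the bits of two equal-length strings, starting with `first`."""
    return "".join(a + b for a, b in zip(first, second))


lat_bits, lat_center = bisect_bits(42.1855639, 40.0, 44.0, n=9)
lon_bits = "100111101"  # longitude code quoted in S315
code = interleave(lat_bits, lon_bits)
print(lat_bits, lat_center, code)
```

With the 40°N to 44°N range this yields the latitude code 100010111 and the interleaved code 110000011101111011, matching the figures in the text; the latitude of the cell center that replaces the original value is 42.18359375°N.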

S32: Once the geographic space has been divided into a regular grid, every coordinate point in the study area falls into exactly one cell at each level, and the vector spatial data are simplified by grid filtering: all coordinate points of the same feature object that fall in the same cell are replaced by the center point of that cell; coordinate points of different feature objects that fall in the same cell are each replaced by the cell center and stored separately, to preserve the integrity of each feature object. As shown in Fig. 3, this grid filtering compresses the vector spatial data by reducing the number of coordinate points.
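A minimal Python sketch of grid filtering is given below. It assumes each vertex has already been converted to its cell code and, as a conservative reading of S32, collapses only runs of consecutive vertices of one feature that share a cell, so that the vertex order of lines and boundaries is preserved. The function name and the sample features are hypothetical.

```python
def grid_filter(cell_codes):
    """Collapse consecutive vertices of one feature that fall in the same grid cell."""
    kept = []
    for code in cell_codes:
        if not kept or kept[-1] != code:
            kept.append(code)  # first vertex seen in this run of identical cells
    return kept


# Hypothetical features: each list holds the cell codes of a feature's vertices.
features = {
    "road_a": ["1100", "1100", "1101", "1101", "0111"],
    "road_b": ["1100", "1100"],
}
# Each feature is filtered on its own, so road_b keeps its point in cell 1100
# even though road_a also has a point there.
filtered = {name: grid_filter(codes) for name, codes in features.items()}
print(filtered)
```

Here road_a shrinks from five vertices to three and road_b from two to one, while both features remain present in the output.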

S33: Vector coordinate points are usually stored as double-precision values. A coordinate point (two double-precision values) occupies 16 bytes, which at 8 bits per byte is 128 bits of memory. After S32 replaces the vector coordinates with grid cell centers, the position of each cell center is stored as a binary grid code whose number of bits can be chosen as required, which greatly reduces storage space.

Step 4: Calculate the error introduced by the regular subdivision, and determine the subdivision level and the data storage precision on the premise that the maximum error introduced by the subdivision is less than the minimum discernible distance, that is, that display remains visually lossless.

S41: In a rectangular grid cell, the greatest distance from any point in the cell to the center point is half the diagonal. The maximum division error (Maximal Error, ME) that can result from replacing the other points in a cell with the cell center is therefore calculated as:

S42: The grid cell width Width and the grid cell height Height depend on the number of divisions n:

Width = width/2ⁿ    (6)

Height = height/2ⁿ    (7)

where width is the width of the map sheet and height is the height of the map sheet.

Substituting formulas (6) and (7) into formula (5) gives the final formula for the maximum division error:

Formula (8) shows that the larger the value of n, the smaller the maximum division error.

S43: Calculate the maximum division error for different values of n; ME < MDD is a necessary precondition for visually lossless display of the vector spatial data. Displaying vector spatial data in a geographic information system must also allow for zooming in; the magnification (Magnify) achievable while remaining visually lossless is calculated as:

S44: Subject to meeting the data display requirements, determine the smallest number of divisions n; this in turn fixes the degree of simplification of the vector spatial data and the number of bits used to store coordinates. The maximum division error and the achievable map magnification for different values of n are shown in Fig. 4.
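The selection of n in S41 to S44 can be sketched as follows. The sketch takes the maximum division error to be half the cell diagonal, as S41 describes, with the cell size from formulas (6) and (7). It assumes the extent and the MDD are already expressed in the same ground units, so no separate Scale factor appears; the function names and the 400 km by 300 km extent are hypothetical.

```python
import math


def max_division_error(width, height, n):
    """Half the diagonal of a grid cell after n divisions per axis."""
    cell_width = width / 2**n    # formula (6)
    cell_height = height / 2**n  # formula (7)
    return math.hypot(cell_width, cell_height) / 2


def min_divisions(width, height, mdd, max_n=64):
    """Smallest n for which the maximum division error is below the MDD."""
    for n in range(1, max_n + 1):
        if max_division_error(width, height, n) < mdd:
            return n
    raise ValueError("no n up to max_n satisfies ME < MDD")


# Hypothetical extent of 400 km x 300 km and an assumed MDD of 264.58 m.
n = min_divisions(400_000.0, 300_000.0, mdd=264.58)
print(n, max_division_error(400_000.0, 300_000.0, n))
```

For these assumed inputs the error halves with every extra division, from about 488 m at n = 9 to about 244 m at n = 10, so n = 10 is the first value that satisfies ME < MDD.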

Step 5: Taking a single data file to be compressed as the unit, set a local coordinate reference system, calculate the binary offsets, and finally store the geographic coordinates of the vector spatial data as binary offsets. The data to be compressed are divided into point vector features, line vector features and area vector features.

S51: Point vector features are randomly distributed in space, but each point is an independent object, so the drawing order has no effect on the drawn result. The steps are as follows:

S511: Sort the point vector features by their binary codes.

S512: Record the first point as its original binary code; store each subsequent point as the number of grid cells by which it is offset from the previous point, converted to binary.

S513: The length of an offset binary code is greater than 1 and less than the length of the original binary code; the offset code length is unified to the maximum length among all offset binary codes.

S514: Records with fewer offset bits are padded with leading "0" bits so that all offsets have the same number of bits.

The storage format is shown in Fig. 5.
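Steps S511 to S514 can be illustrated with the Python sketch below. It interprets "the number of grid cells offset from the previous point" as the difference between the integer values of consecutive sorted codes, which is an assumption; the function names and the four sample codes are hypothetical, and the exact layout of Fig. 5 is not reproduced.

```python
def encode_points(codes):
    """Sort equal-length binary codes and store all but the first as fixed-width deltas."""
    values = sorted(int(code, 2) for code in codes)              # S511
    deltas = [b - a for a, b in zip(values, values[1:])]         # S512
    width = max([1] + [delta.bit_length() for delta in deltas])  # S513
    first = format(values[0], f"0{len(codes[0])}b")
    padded = [format(delta, f"0{width}b") for delta in deltas]   # S514: leading zeros
    return first, width, padded


def decode_points(first, deltas):
    """Rebuild the sorted codes by accumulating the deltas onto the first code."""
    value = int(first, 2)
    codes = [first]
    for delta in deltas:
        value += int(delta, 2)
        codes.append(format(value, f"0{len(first)}b"))
    return codes


first, width, deltas = encode_points(["1011", "0010", "0110", "0011"])
print(first, width, deltas)
print(decode_points(first, deltas))
```

The sample codes sort to 0010, 0011, 0110, 1011, giving deltas 1, 3 and 5, which are stored as the 3-bit strings 001, 011 and 101 after the 4-bit first code. The saving grows with longer codes and denser points.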

S52: The coordinates of a line feature or an area feature (boundary) are related to one another within each feature object, so they cannot be reordered before drawing as point features can; they must instead be organized per object. The steps are as follows:

S521: Record, for each vector feature, the number of nodes of the feature object and the coordinates of its starting point, and calculate the offsets per vector object.

S522: Set direction bits: calculate the direction in which each coordinate point is offset from the previous coordinate point, and store the four possible offset directions of each point in a two-bit binary code.

S523: Calculate the offsets: record the column and row offsets along the horizontal and vertical directions respectively, convert them to binary codes, and store them interleaved as the final offset.

S524: The length of an offset binary code is greater than 1 and less than the length of the original binary code; the offset code length is unified to the maximum length among all offset binary codes.

S525: Records with fewer offset bits are padded with leading "0" bits so that all offsets have the same number of bits.

The storage format is shown in Fig. 6.
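One possible reading of S521 to S525 is sketched below in Python. The disclosure does not spell out the bit layout in the text, so the following choices are all assumptions: vertices are given as integer (column, row) grid indices, the first direction bit is 1 for a negative column offset, the second is 1 for a negative row offset, and the absolute offsets are interleaved column bit first. The function names and sample line are hypothetical.

```python
def encode_feature(cells):
    """Encode one line or boundary as a start cell plus fixed-width step records."""
    steps = [(c1 - c0, r1 - r0) for (c0, r0), (c1, r1) in zip(cells, cells[1:])]
    # S524: one common width, the longest offset needed by any step.
    width = max([1] + [max(abs(dc), abs(dr)).bit_length() for dc, dr in steps])
    records = []
    for dc, dr in steps:
        # S522: two direction bits cover the four sign combinations.
        direction = ("1" if dc < 0 else "0") + ("1" if dr < 0 else "0")
        col_bits = format(abs(dc), f"0{width}b")  # S525: leading zeros
        row_bits = format(abs(dr), f"0{width}b")
        # S523: interleave column and row offset bits.
        offset = "".join(a + b for a, b in zip(col_bits, row_bits))
        records.append(direction + offset)
    return {"nodes": len(cells), "start": cells[0], "width": width, "records": records}


def decode_feature(encoded):
    """Rebuild the (column, row) sequence from the start cell and step records."""
    col, row = encoded["start"]
    cells = [(col, row)]
    for record in encoded["records"]:
        sign_col = -1 if record[0] == "1" else 1
        sign_row = -1 if record[1] == "1" else 1
        offset = record[2:]
        col += sign_col * int(offset[0::2], 2)
        row += sign_row * int(offset[1::2], 2)
        cells.append((col, row))
    return cells


line = [(5, 5), (7, 4), (7, 7), (6, 7)]  # hypothetical grid indices
encoded = encode_feature(line)
print(encoded["records"], decode_feature(encoded) == line)
```

For the sample line the three steps are stored as the 6-bit records 011001, 000101 and 100010, and decoding restores the original sequence of cells.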

In another aspect, the present disclosure also provides a device on which the steps of the multilevel compression method for vector spatial data disclosed herein can be executed.

The multilevel compression method for vector spatial data provided by the present disclosure was implemented as an example, using the Java programming language, on a Dell XPS 8930 desktop computer with an Intel i7-9700K @ 3.6 GHz processor. In the embodiment, road vector data from a 1:1,000,000 thematic map of the administrative divisions of Beijing and its surrounding areas were used; vector data compression was carried out following the process shown in Fig. 1, and compression efficiency was then evaluated for different binary storage lengths.

Fig. 7 shows the different bit lengths produced by different numbers of divisions and the vector spatial data compression ratios achieved. Fig. 8 compares the compression results at different bit lengths. It shows that at 16 and 18 bits the redrawn compressed result has obvious jagged edges and does not meet the visually lossless requirement, whereas at 20 bits the display requirement is already met at a display scale of 1:1,000,000. The compression ratio of the data then reaches 97.18%. This verification shows that the vector spatial data compression algorithm proposed in this patent achieves considerable compression efficiency.

In summary, the vector spatial data compression method provided by the present disclosure takes the visualization of basic geospatial data as its application scenario and sets the minimum resolvable distance that keeps the screen display visually lossless as the precision requirement, thereby achieving compression of vector spatial data. Combining grid filtering with binary offset storage further improves data storage efficiency. The compression ratio is also higher than that of the prior art.

The above description covers only preferred embodiments of the present invention and does not limit the scope of the present invention in any way. Any changes or modifications made by those of ordinary skill in the art on the basis of the above disclosure fall within the scope of protection of the claims.

Claims

1. A multilevel compression method for vector spatial data, the method comprising the following steps:

Divide the space expressed by geographic coordinates step by step in the horizontal and vertical directions to obtain a multi-level grid that divides the geographic space;

Determine the position in the grid of the coordinate point of the data to be compressed, and continue to divide the grid so as to replace the coordinate value of the coordinate point of the data to be compressed with the coordinates of the center point of the divided grid, wherein the coordinate deviation between the center point of the divided grid and the coordinate point of the data to be compressed conforms to a preset value;

Taking a single data file to be compressed as a unit, set a local coordinate reference system, and calculate the binary coordinate deviation between the coordinate point of the data to be compressed and the center point of the divided grid to obtain a binary offset;

Store the geographic coordinates of the vector spatial data as the binary offset.

2. The multilevel compression method for vector spatial data according to claim 1, wherein, when dividing in the horizontal and vertical directions, the left and right or upper and lower subspaces produced by the division are identified in binary.

3. The multilevel compression method for vector spatial data according to claim 1, wherein the preset values are the minimum distinguishable distances that can be expressed under visually lossless conditions.

4. The multilevel compression method for vector spatial data according to claim 1, characterized in that the step of determining the position in the grid of the coordinate point of the data to be compressed, and continuing to divide the grid so as to replace the coordinate value of the coordinate point of the data to be compressed with the coordinates of the center point of the divided grid, comprises:

Obtain the coordinate data of the coordinate points of the data to be compressed;

Obtain the minimum and maximum values of the latitude coverage of the grid containing the coordinate points of the data to be compressed in the geospatial data;

Determine whether the latitude coordinates of the coordinate points of the data to be compressed are greater than the median value of the latitude coverage;

If the latitude coordinate is greater than the median value of the latitude coverage, divide the grid so that the latitude coverage of the grid is reduced to the range from the median value to the maximum value;

If the latitude coordinate is less than or equal to the median value of the latitude coverage, divide the grid so that the latitude coverage of the grid is reduced to the range from the minimum value to the median value;

Recursively repeat the latitude division step, and divide the longitude of the grid in the same way, until the deviation between the center point of the grid and the coordinate point of the data to be compressed conforms to the preset value.
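The bisection steps of claim 4 can be illustrated for a single axis with the following minimal Java sketch. It is not the patented implementation. The class and method names (`GridBisection`, `AxisCode`, `encodeAxis`) are hypothetical, the choice of bit `1` for the upper half and bit `0` for the lower half is an assumption, and the stopping rule (stop once half the cell size, which bounds the deviation from the cell center, is within the tolerance) is one possible reading of "conforms to the preset value". Longitude would be handled by calling the same method with the longitude range.

```java
/** Hypothetical sketch of the axis bisection in claim 4; all names are illustrative. */
final class GridBisection {

    /** Bit string produced by the bisection plus the center of the final cell. */
    static final class AxisCode {
        final String bits;
        final double center;

        AxisCode(String bits, double center) {
            this.bits = bits;
            this.center = center;
        }
    }

    /**
     * Repeatedly halves [min, max] around value until half the cell size,
     * the worst-case deviation from the cell center, is within tolerance.
     * Bit '1' keeps the upper half (value > midpoint); bit '0' keeps the lower half.
     */
    static AxisCode encodeAxis(double value, double min, double max, double tolerance) {
        if (value < min || value > max || tolerance <= 0) {
            throw new IllegalArgumentException("value outside range or non-positive tolerance");
        }
        StringBuilder bits = new StringBuilder();
        // Assumed safety cap of 52 levels so the loop always terminates for tiny tolerances.
        while ((max - min) / 2.0 > tolerance && bits.length() < 52) {
            double mid = (min + max) / 2.0;
            if (value > mid) {
                min = mid;          // keep the range from the midpoint to the maximum
                bits.append('1');
            } else {
                max = mid;          // keep the range from the minimum to the midpoint
                bits.append('0');
            }
        }
        return new AxisCode(bits.toString(), (min + max) / 2.0);
    }
}
```

For example, `GridBisection.encodeAxis(39.9, -90.0, 90.0, 12.0)` yields the bits `101` and the cell center 33.75, which lies within 12 degrees of the input latitude.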

5. The multilevel compression method for vector spatial data according to claim 1, wherein the maximum division error between the center point of the divided grid and the coordinate point of the data to be compressed is calculated by the following formula:

Where Width is the width of the grid, Height is the height of the grid, and Scale is the scale.

6. The multilevel compression method for vector spatial data according to claim 5, wherein the grid width Width and the grid height Height are calculated by the following formulas respectively:

Width = width / 2ⁿ;

Height = height / 2ⁿ.
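The two formulas of claim 6 can be checked numerically with a small Java sketch. It assumes that n is the number of divisions applied along an axis and that width and height are the extents of the undivided space; this section of the claims does not define those symbols, so both readings are assumptions. The names `GridCellSize`, `cellSize`, and `levelsFor` are hypothetical, and `levelsFor` is an added convenience that is not part of the claim.

```java
/** Hypothetical helper for the cell-size formulas of claim 6; names are illustrative. */
final class GridCellSize {

    /** Cell extent after n halvings of an axis with the given total extent: extent / 2^n. */
    static double cellSize(double extent, int n) {
        if (n < 0 || n > 62) {
            throw new IllegalArgumentException("n must be between 0 and 62");
        }
        return extent / (double) (1L << n);
    }

    /** Smallest n (capped at 62) whose cell size does not exceed maxCellSize. */
    static int levelsFor(double extent, double maxCellSize) {
        int n = 0;
        while (n < 62 && cellSize(extent, n) > maxCellSize) {
            n++;
        }
        return n;
    }
}
```

With a 360-degree longitude extent, `GridCellSize.cellSize(360.0, 3)` gives 45.0, and `GridCellSize.levelsFor(360.0, 45.0)` gives 3.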

7. The multilevel compression method for vector spatial data according to claim 1, wherein, when the data to be compressed is a point vector element, the binary offset of the point vector element is calculated as follows:

Sort the point vector elements by their binary codes;

Record the first point with its original binary code, store each subsequent point as the number of grid cells by which it is offset from the previous point, and convert that number to binary for storage;

The length of each offset binary code is greater than 1 and less than the length of the original binary code, and all offset binary codes are uniformly stored at the maximum length among the offset binary codes;

For records whose offset has fewer bits than that length, pad the front of the binary code sequence with 0.
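The point-element steps of claim 7 can be sketched in Java as follows. This is an illustration under stated assumptions, not the patented implementation: each point's grid code is assumed to fit in a non-negative `long`, the "number of grid cells" offset is taken to be the numeric difference between consecutive sorted codes, and the names `PointOffsetCoder`, `encode`, and `pad` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the point offset coding in claim 7; names are illustrative. */
final class PointOffsetCoder {

    /**
     * Returns bit strings: the first sorted code at codeBits width, followed by
     * the offsets between consecutive sorted codes at one uniform width.
     */
    static List<String> encode(long[] codes, int codeBits) {
        long[] sorted = codes.clone();
        Arrays.sort(sorted);                       // order the points by binary code
        List<String> out = new ArrayList<>();
        if (sorted.length == 0) {
            return out;
        }
        // Uniform width: bits needed by the largest offset (at least 1 bit).
        int width = 1;
        for (int i = 1; i < sorted.length; i++) {
            long delta = sorted[i] - sorted[i - 1];
            width = Math.max(width, 64 - Long.numberOfLeadingZeros(delta));
        }
        out.add(pad(Long.toBinaryString(sorted[0]), codeBits));  // first point: original code
        for (int i = 1; i < sorted.length; i++) {
            long delta = sorted[i] - sorted[i - 1];
            out.add(pad(Long.toBinaryString(delta), width));     // later points: offset only
        }
        return out;
    }

    /** Left-pads a bit string with '0' up to the requested width. */
    static String pad(String bits, int width) {
        StringBuilder sb = new StringBuilder();
        for (int i = bits.length(); i < width; i++) {
            sb.append('0');
        }
        return sb.append(bits).toString();
    }
}
```

For the codes 5, 1, and 2 with a 4-bit original code, `PointOffsetCoder.encode(new long[] {5L, 1L, 2L}, 4)` returns `0001`, `01`, `11`: the sorted codes are 1, 2, 5, the offsets are 1 and 3, and the shorter offset is padded at the front to the uniform 2-bit width.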

8. The multilevel compression method for vector spatial data according to claim 1, wherein, when the data to be compressed is a line vector element or an area vector element, the binary offset of the line vector element or the area vector element is calculated as follows:

Record the number of nodes and the starting-point coordinates of each feature object in each vector feature, and perform the offset calculation feature object by feature object;

Set a direction bit, calculate the offset direction of each coordinate point relative to the previous coordinate point, and store the eight possible offset directions of each point with a two-bit binary code;

Calculate the offset: record the row and column offsets along the horizontal and vertical directions respectively, convert them to binary code, and then store them alternately as the final offset;

If the length of an offset binary code is greater than 1 and less than the length of the original binary code, all offset binary codes are uniformly stored at the maximum length among the offset binary codes;

For records whose offset has fewer bits than that length, pad the front with 0.
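The line and area steps of claim 8 can be sketched in Java for a single feature object as follows. Several points are assumptions rather than statements of the patented method: the two direction bits are read as the signs of the column and row offsets (an axis-aligned move then appears as a zero magnitude on one axis), "store them alternately" is read as bit-by-bit interleaving of the column and row magnitudes, and the node count and starting point are assumed to be stored separately by the caller. The names `LineOffsetCoder`, `encode`, and `pad` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the line/area offset coding in claim 8; names are illustrative. */
final class LineOffsetCoder {

    /**
     * Encodes every vertex after the first as: 2 direction bits (column sign,
     * row sign; '1' means negative) followed by the interleaved bits of the
     * absolute column and row offsets, both padded to one uniform width.
     */
    static List<String> encode(int[] cols, int[] rows) {
        if (cols.length != rows.length) {
            throw new IllegalArgumentException("cols and rows must have equal length");
        }
        // Uniform width: bits needed by the largest absolute offset (at least 1 bit).
        int width = 1;
        for (int i = 1; i < cols.length; i++) {
            int dc = Math.abs(cols[i] - cols[i - 1]);
            int dr = Math.abs(rows[i] - rows[i - 1]);
            width = Math.max(width, 32 - Integer.numberOfLeadingZeros(Math.max(dc, dr)));
        }
        List<String> out = new ArrayList<>();
        for (int i = 1; i < cols.length; i++) {
            int dc = cols[i] - cols[i - 1];
            int dr = rows[i] - rows[i - 1];
            StringBuilder sb = new StringBuilder();
            sb.append(dc < 0 ? '1' : '0');   // direction bit for the column offset
            sb.append(dr < 0 ? '1' : '0');   // direction bit for the row offset
            String c = pad(Integer.toBinaryString(Math.abs(dc)), width);
            String r = pad(Integer.toBinaryString(Math.abs(dr)), width);
            for (int b = 0; b < width; b++) {
                sb.append(c.charAt(b)).append(r.charAt(b));  // alternate column and row bits
            }
            out.add(sb.toString());
        }
        return out;
    }

    /** Left-pads a bit string with '0' up to the requested width. */
    static String pad(String bits, int width) {
        StringBuilder sb = new StringBuilder();
        for (int i = bits.length(); i < width; i++) {
            sb.append('0');
        }
        return sb.append(bits).toString();
    }
}
```

For a three-node line with columns 10, 12, 11 and rows 5, 5, 8, the offsets are (+2, 0) and (-1, +3), the uniform width is 2 bits, and `LineOffsetCoder.encode(new int[] {10, 12, 11}, new int[] {5, 5, 8})` returns `001000` and `100111`.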