CN1777038A - Two-dimensional vector data compression method - Google Patents

Two-dimensional vector data compression method Download PDF

Info

Publication number
CN1777038A
CN1777038A CN 200510019931 CN200510019931A CN1777038A CN 1777038 A CN1777038 A CN 1777038A CN 200510019931 CN200510019931 CN 200510019931 CN 200510019931 A CN200510019931 A CN 200510019931A CN 1777038 A CN1777038 A CN 1777038A
Authority
CN
China
Prior art keywords
vector data
compression
data
coding
piece
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN 200510019931
Other languages
Chinese (zh)
Other versions
CN100546196C (en
Inventor
吴华意
朱海军
刘异
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University WHU filed Critical Wuhan University WHU
Priority to CNB2005100199317A priority Critical patent/CN100546196C/en
Publication of CN1777038A publication Critical patent/CN1777038A/en
Application granted granted Critical
Publication of CN100546196C publication Critical patent/CN100546196C/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Abstract

The method divides original vector data into blocks of vector data. Next, loss compression based on DCT transform is carried out for the blocks of vector data according to one of two modes. Then, entropy coding is carried out for the obtained DCT transform so as to obtain compressed blocks of vector data. Finally, all compressed blocks of vector data are organized as binary bit steam in specific format. Main effects of the invention are: compression ratio near to 20 is reached for vector data formed natural in wide range such as contour line, road network and water system etc; SNR reaches to 60dB; and maximal absolute deformation is about 10-5 order of magnitude of dynamic range of original vector data. Features are: controllable distortion generated in reconstructing error and compression process; providing technical base for asymptotic transmission of vector data flexibly and in high efficiency by combining asymptotic compression technique in space domain and in frequency domain.

Description

A kind of compression method of two-dimensional vector data
Technical field
The present invention relates to field of information processing, particularly a kind of compression method of two-dimensional vector data.
Background technology
1.GIS the characteristics of middle two-dimensional vector data reach its necessity of compressing:
The two-dimensional space vector data is one of current GIS-Geographic Information System key data type of managing and using, and has characteristics such as data volume is big, skewness, topological relation complexity.Vector data is generally represented by the form of coordinate string among the GIS, coordinate data wherein is generally floating-point format, along with the ability of obtaining data improve and the user to the requirement of data precision, in order to express the real geographical world better, GIS vector data amount is very huge, needs a large amount of storage and transfer resource.Network service emerging GIS applications such as (referring to document [1]) in geography information, vector data need be connected with unsettled Internet via limited bandwidth and sends the user to, need the data volume size of the vector data of transmission to become decision application key of success, directly adopt original coordinate data can't meet the demands as the carrier of two-dimensional space vector data.
H.263/4 or the like in multi-medium data fields such as image, video and audio frequency, ripe international standard is all arranged at present, as JPEG, JPEG2000, MPEG1/2/4.These standards provide a kind of to huge multi-medium data effectively possess high compression ratio and high performance compression method in, huge commercial opportunity also is provided, making based on the online browse of these multi-medium datas and using in real time becomes possibility.To the GIS vector data, also there is not at present a kind of so special compression method can under the prerequisite of the geography information of not losing vector data and being contained, reduce the data volume of vector data effectively.The storage and the efficiency of management that this has not only influenced spatial data also become the bottleneck of geography information network service, hindered combining of GIS-Geographic Information System and Internet Distributed Application greatly.
Under this background, by decomposition that the complexity analyzing of two-dimensional space vector data inside and information are constituted and to the utilization of existing Techniques for Multimedia Compression with in the innovation aspect the vector data compression, the present invention proposes two-dimensional vector data is carried out compression method based on block transform coding.
2. the characteristic analysis of two-dimensional vector data and compression scheme are selected:
Compare with view data, two-dimensional vector data has complex inner structure.Vector data file constitutes by being positioned at inner all the interested object vectors of mapping scope, and the arrangement of the point of these object vector inside is orderly, yet the arrangement of object vector in vector data file inside is unordered.
As everyone knows, in view data, a main forms of its redundancy is exactly the correlation on the gray value between the adjacent image point.Existing image compression international standard has adopted the block transform coding method, by being the approximate incoherent conversion coefficient of frequency domain with view data from space field transformation, then conversion coefficient is carried out independently scalar quantization and utilizes this correlation.This method has been avoided the optimum in theory still too big vector quantization method of amount of calculation of direct employing, has also effectively utilized the correlation of initial data inside simultaneously, is " optimum " scheme of a kind of reality.
We find, also exist very strong long serial correlation in the sample point coordinate sequence data that constitutes vector data, thereby also exist a large amount of redundancies.As the digitized representations in the geographical world of reality, these coordinate sequences have shown the variation tendency of geographic object (as river, road etc.).Because mild property variation and a large amount of existence of cyclic variation in geographic object, make the inner sample point coordinate data sequence of vector data also have corresponding characteristics and show the said correlation in our front, show that also it is feasible that vector data is compressed.
It not is directly to be transplanted on the vector data Image Data Compression algorithm of maturation so simple that vector data is compressed.No matter how big view data can be regarded a two-dimensional matrix with several components (as RGB) as to data volume, and that the internal structure of this rule is a vector data is not available.Therefore, seek effective compression expression of vector data, must fully take into account of the influence of the structure of vector data compression algorithm and compression performance.
During the linear transformation adopted in selecting the vector data compression scheme, we have also adopted DCT (discrete cosine transform, english abbreviation) conversion as numerous image compression algorithms.Although from the meaning KL conversion (Kahunen-Lo é ve) of decorrelation is optimum linear transformation [2], but the transformation matrix of KL conversion is relevant with the statistical nature of the required data of handling, can not determine in advance, and it calculates and does not still have fast algorithm so far.Theoretical proof is to single order Markov signal, and dct transform is good being similar to the KL conversion when its coefficient correlation ρ approaches 1.Experiment shows that its coefficient correlation approaches 1 really very much when adopting the single order Markov process to carry out modeling to vector data, and because the dct transform matrix can be determined in advance, its calculating has multiple fast algorithm, and it all is feasible therefore upward adopting dct transform that vector data is compressed to practice theoretically.
3. quantize distortion with the two-dimensional vector data compression:
Quantizing process can produce the loss of precision, and reduce distortion is one of target of two-dimensional vector data compression pursuit as far as possible.The MSE (Mean Square Error, mean square error) that adopt also have other criterion as distortion metrics [3] more in the lossy compression method of multi-medium data at present, and these criterions respectively have its pluses and minuses.As the MSE criterion, it has calculates simply, explicit physical meaning, advantage such as is convenient to be optimized, and its outstanding shortcoming then is not meet with human apperceive characteristic in some aspects, bigger MSE value not necessarily correspondence relatively poor perceived quality.In addition, MSE has embodied the effect of a kind of " on average ", i.e. Zheng Ti distortion reached given standard, therefore, MSE can't carry out local control to the error of single pixel or audio sample value, may or to rebuild at reconstructed image have other data error more remarkable in the audio frequency, but but be the requirement that meets distortion control on the whole.
To the dct transform coefficient of vector data, adopt different quantization tables to carry out the uniform quantization meeting and obtain different compression result, obtain different distortions.Present patent application does not specify quantization table, does not specify the module of distortion yet.
4. the progressive compression and the transmission of vector data:
Progressive compression and transmission are emerging technology that is applied to multi-medium data in recent years, are characterized in according to user's the requirement and the factors such as ability of the network bandwidth, demonstration and computing equipment, and the data of transmitting suitable size are given the user.This transmission course can be ended by the user at any time, and the data that obtained have then constituted the expression of a compression of former multi-medium data.The advantage of this technology is that the user can obtain the compression expression of initial data in real time, and needn't wait for that whole transfer of data finishes, and this user for low bandwidth is even more important.The utilization of this technology need be carried out inherent combination to multi-medium data, and the JPEG2000 Standard of image compression provides good support (referring to document [4]) to this technology.
For vector data, the asymptotic transmission plan that Many researchers proposes only limits to choice and the transmission to spatial data points, that is to say the progressive expression that only limits to spatial domain, and it and the existing original storage mode of vector data are closely-related.Owing to there is not deep vector data frequency domain to represent mode as support, so these progressive transmission schemes can't relate in the vector data frequency domain and represent the frequency domain progressive transmission mode of carrying out on the basis.
Comprehensively with on, transform coding method is applied to the compression of two-dimensional space vector data, and its key problem in technology is frequency domain analysis, redundancy analysis, the evaluation of vector data compression artefacts, the Data Rate Distribution algorithm that is suitable for the vector data coding, the asymptotic transmission of vector data frequency domain of vector data or the like.With regard to above these problems, the researcher does not also propose total solution at present, just in research, have spatial database, the service of spatial data network, WebGIS or the like related, but do not carry out deep research, more do not form the method for system.
Summary of the invention
Technical problem to be solved by this invention is: by the decomposition that the frequency domain analysis of two-dimensional space vector data and information are constituted, use for reference Techniques for Multimedia Compression, the vector data compression scheme is proposed, a kind of novel two-dimensional space vector data compression method based on transition coding is provided, and the quick and effective of realization two-dimensional vector data stored, manages and transmit.
The technical scheme that the present invention solves the problems of the technologies described above is: behind the two-dimensional vector data piecemeal, use and based on the coding method of dct transform two-dimensional vector data is carried out lossy compression method, and be output as the bit stream of specified format.
What the present invention proposed carries out compression method based on transition coding to two-dimensional vector data, has following significant effect:
One. through test, in that the far-ranging vector data that forms naturally has reached under the situation of nearly 20 compression ratio to contour, road network, water system etc., its signal to noise ratio is up to 60dB, and maximum definitely distortion only is 10 of a former vector data dynamic range -5About the order of magnitude;
They are two years old. and the self adaptation piecemeal to the two-dimensional space vector data has generated the vector data piece that is suitable for block transform coding in the labyrinth that does not destroy spatial data self;
They are three years old. and adopt suitable quantization method, can control reconstruction error, thus the distortion that is produced in the control compression process;
They are four years old. in conjunction with the progressive compress technique of spatial domain and frequency domain, for providing technical foundation with the asymptotic transmission of vector data efficiently flexibly.By can satisfy user's demand better in conjunction with the new asymptotic more efficiently transmission plan that expression proposed of spatial domain and frequency domain vector data.
Description of drawings
Fig. 1: the basic flow sheet of compression method of the present invention
Fig. 2: the block algorithm block diagram, wherein N is 2 n, n specifies for compression is preceding.
Fig. 3: vector data compression main program block diagram.
Fig. 4: the compressing vector data piece is carried out the merogenesis schematic diagram according to given accuracy.
Fig. 5: jvg compressed file structure.
Fig. 6: the data before the compression.
Fig. 7: the data that compression back decompress(ion) obtains.
Fig. 8: the overlapping demonstration of before the compression and compression back data.Wherein: before the compression be black, and compression is a grey afterwards, black color dots by the grey lines and short-term be distortion cause not quite identical.
Fig. 9: an overlapping demonstration of local data after amplifying 3 times.
Figure 10: an overlapping demonstration of local data after amplifying 8 times.
Embodiment
The invention will be further described below in conjunction with example and accompanying drawing.
The present invention is a kind of compression method of two-dimensional space vector data, specifically: at first original vector data is divided into the vector data piece, according to one of two kinds of patterns the vector data piece is carried out lossy compression method based on dct transform again, then the dct transform coefficient that obtains is carried out entropy coding and obtain the compressing vector data piece, at last all compressing vector data pieces are organized as the bit stream of specified format, Fig. 1 is the basic flow sheet of compression method among the present invention.
The present invention adopts following steps that two-dimensional vector data is compressed.
One. the two-dimensional vector data collection is divided into vector data piece (abbreviation piecemeal)
1. the arrangement of two-dimensional vector data:
The handled original vector data of the present invention is a data set that comprises one or more geometric objects (generally being a plurality of), can be a file, also can be binary system or text data stream.A geometric object is formed (for example, having the polygon on island, the bifurcated in river etc.) by one or more independent parts, and each part is formed according to being linked in sequence of storage by the point that quantity does not wait.
If the data of compression do not meet above-mentioned model, before employing the technology of the present invention, must put in order in advance, perhaps arrangement in real time is the form of appointment in the process of implementing compression.
2. the two-dimensional vector data collection is divided into the vector data piece
The process that two-dimensional vector data is divided into the vector data piece is exactly a plurality of vector data pieces that each geometric object are divided into identical size respectively according to the needs of compression algorithm.For the ease of carrying out the quick calculating of dct transform, all contain 2 in each vector data piece nIndividual two-dimemsional number strong point, n is the integer greater than 1, specifies before compression.
For consistent with the original structure of vector data, piecemeal carries out in object vector inside.The number of the point of storage can not all be 2 in the geometric object (and various piece) nMultiple, extract the vector data piece from the various piece order after, to the remaining point of various piece, artificially the point that replenishes sufficient amount is to constitute a complete vector data piece, block algorithm is seen Fig. 2.N among Fig. 2 is 2 nBlock algorithm travels through each part of a geometric object successively, by the original order extraction 2 of point nThe vector data piece of size is wherein had a few complete vector data piece from some parts, is called GDB (GeometryData Block).To remaining less than 2 behind each extracting section GDB nPoint, reconstruct a complete vector data piece after artificially replenishing the point of sufficient amount, be called GSDB (Geometry Supplement Data Block).The coordinate of the point that replenishes equals the mean value of original coordinate of having a few among the GSDB.Compression to two kinds of vector data pieces of GDB and GSDB is the same, but in the compressing vector data file that forms, and the head of compressing vector data piece has comprised the information of the quantity of the actual data point that comprises in the type of indicating this piece and this piece.
Two. based on the lossy compression method (being called for short coding) of dct transform
Referring to Fig. 3, adopt following step that vector data is compressed.
1.DCT conversion:
To all vector data pieces, select one of two kinds of patterns that x and y coordinate data are carried out the one dimension dct transform respectively.First kind of pattern is at first the data of vector data piece to be quantized, then the integer dct transform that quantized data be can't harm; Second kind of pattern is directly to use the floating-point dct transform that diminishes, and conversion quantizes afterwards, if do not consider the rounding error in the floating-point operation, it is harmless that the floating-point dct transform also can be regarded as, but the follow-up quantification to conversion coefficient then diminishes.
One-dimensional sequence x (i), i=0,1, the dct transform of Λ N-1 and being inversely transformed into
X ( k ) = c k Σ i = 0 N - 1 x ( i ) cos ( 2 i + 1 ) kπ 2 N
x ( i ) = Σ k = 0 N - 1 c k X ( k ) cos ( 2 i + 1 ) kπ 2 N
Wherein c 0 = 1 / N , c k = 2 / N , 1 ≤ k ≤ N - 1 .
The integer dct transform is derived by the floating-point dct transform and is got, specifically can be referring to document [6].This patent does not limit the type of integer dct transform of concrete use and parameter wherein, as long as forward integer dct transform (Forward IntegerDCT) and reverse integer dct transform (Inverse Integer DCT) correspondence mutually.
2. quantize and reconstruction:
Quantification in first kind of pattern is an example with the uniform quantization of x component, and its method is: suppose that current vector data is [XMIN, XMAX] in the scope of x direction, specifying the quantization step of x data is Δ x, so the x data in all vector data pieces are carried out following quantification:
Figure A20051001993100074
Wherein
Figure A20051001993100075
Be downward rounding operation, follow-up conversion and coding are to q x(i) carry out, and can't harm.
To having obtained q behind the compressing vector data file decompress(ion) x(i), it is redeveloped into the interval mid point of former quantification, just: x ^ ( i ) = q x ( i ) Δ x + XMIN
Quantification in second kind of pattern is an example with the uniform quantization of x component, and its method is: suppose that the DCT coefficient after current vector data is through the floating-point dct transform is X (k), and k=0,1, Λ, N-1, the appointment quantization step is a Δ Ix, so all X (k) are carried out following quantification:
Figure A20051001993100077
Wherein
Be downward rounding operation, follow-up coding is to Q x(k), k=0,1, Λ, N-1 carries out, and can't harm.Notice that generally be different for its quantization step of different conversion coefficients of x coordinate data this moment, do not have only a quantization step directly coordinate data being quantized and do not resemble.
Similarly, to obtaining Q behind the compressing vector data file decompress(ion) X(k), it is redeveloped into the interval mid point of former quantification, just
X ^ ( k ) = Q X ( k ) Δ kX
Attention: no matter be first kind of pattern or second kind of pattern, the lossy compression method of vector data piece has all been obtained one group of integer dct transform coefficient.In follow-up harmless entropy coding (ground floor coding), will no longer distinguish this two kinds of patterns, and uniformly these coefficients will be called the integer dct transform coefficient.For diminishing quantification, this patent does not specify the step-length of quantification and the tolerance of quantizing distortion, and different quantization steps can produce different quantizing distortion, is not quite similar in different distortion metrics lower compression effects yet.How passing through to set distortion metrics and optimize quantization method, is the content of opening under the compress technique framework of this patent.
The parameter that quantizes generally is different with the difference of vector data, different quantization parameters will produce different quantization transform coefficients and last compression effectiveness, therefore this patent does not limit special quantization parameter especially yet, and the user of patent can select quantization parameter as required or select the method for parameter.
3.DC the processing of coefficient:
Dct transform coefficient can be divided into two classes, and first coefficient is called the DC coefficient, is the average of participating in all data of conversion; Other coefficient is called the AC coefficient.Because the DC coefficient is directly relevant with the position of vector data piece, therefore often have very strong correlation between the DC coefficient of all vector data pieces of same geometric object, be suitable for adopting predictive coding to compress.In the present invention, in a geometric object (comprising its a plurality of parts), the DC coefficient in first vector data piece, the DC coefficient of all subsequent vector data blocks is all calculated the difference of the DC coefficient of itself and previous compression data block, then this difference is carried out huffman coding.
The example of difference huffman coding table sees attached list 1.First row of this table are capable number, and the 3rd row are Huffman code words of this row, and every row comprises integer range ((2 i-1) ,+(2 i-1)), but do not comprise between the mesozone ((2 I-1-1) ,+(2 I-1-1)).The code word that difference is encoded comprises two parts, the monobasic Huffman code word that first is expert at for this difference; Second portion is the complement of one's (not needing sign bit) of this difference.For the DC coefficient in first vector data piece in the geometric object, directly use this table to encode.This coding method can be referring to document [6].The difference huffman coding of DC coefficient can illustrate as follows.The DC coefficient of supposing first vector data piece in the current geometric object is 1787, the DC coefficient of second vector data piece is 1783, the DC coefficient of first vector data piece is carried out direct coding obtain 111111110|11011111011, actual coding is difference-4 between the two in second vector data piece, is encoded to 100|011.
4. ground floor is encoded:
To each vector data piece, through all obtaining two integer dct transform coefficient pieces (corresponding respectively to x and y coordinate data) after the lossy compression method of first kind of pattern or second kind of pattern.First component of this piece is the DC coefficient, and it is handled and discusses in front, the distance of swimming huffman coding that all remaining AC coefficients can't harm in the following manner.
At first the AC coefficient block is divided into (nonzero coefficient, zero run-length length) right.The zero run-length lengths table is shown in the number that this nonzero coefficient front is zero AC coefficient.Right to each, at first find the line number R of nonzero coefficient in the difference huffman coding table, in distance of swimming huffman coding table, find the code word of position then corresponding to Z/R (Z is row number, and R is row number), wherein Z is the length of zero run-length.In this table, R is since 1 counting, and Z is since 0 counting.This code word back adds that the radix-minus-one complement (not needing sign bit) of nonzero coefficient is exactly this right coding.Last a string zero (number is indefinite, also may be 0) of AC coefficient block represented with EOB (End Of Block, block end).Distance of swimming huffman coding table comprised (zero run-length length, nonzero coefficient is expert at) to the Huffman code word of EOB.The example of a distance of swimming huffman coding table sees attached list 2.The distance of swimming huffman coding of AC coefficient block can illustrate as follows.Suppose the size N=16 of discrete cosine transform block, the AC coefficient block of desire coding is 10,0,0,2,0,0,0,0,0,0,0,0,0,0,0, so this piece be divided into 2 (nonzero coefficient, zero run-length length) to an EOB.The front of first nonzero coefficient 10 does not have zero coefficient, and it is arranged in difference huffman coding table the 4th row, and therefore its corresponding code word is that it is encoded to 1011|110 at 1011 of 0/4 place in distance of swimming huffman coding table.Obtain equally the back nonzero coefficient 2 be encoded to 11111001|10.Last this string AC coefficient is encoded as:
101111011111001101010
Wherein last 1010 is codings of EOB.
In addition, the length of zero run-length must not exceed 15, if exceed then represent it with a code word ZRL, and restarts zero run-length length is counted.For instance, if between two non zero AC coefficients 17 zero AC coefficients are arranged, be split into a ZRL and zero run-length length so and be 2 (nonzero coefficient, zero run-length length) right.
5. the second layer is encoded:
If directly the bit stream that obtains is output as binary file after the ground floor end-of-encode, so resulting compression vector file will have unique precision.Have only after the whole file decompression, the decompress(ion) vector file that just can obtain having given accuracy, only decompress(ion) part compression position flow then can not get the significant expression to former vector file.By reorganization to compression position flow, can make compression position flow carry out the part decompress(ion) after, obtain lower accuracy or decompress(ion) vector file more among a small circle.And if the user needs more high accuracy or wider vector data, then can further carry out decompress(ion) until ending to remaining compression position flow.This organizational form makes compression position flow have progressive character, and the user can need select corresponding bit-stream to carry out decompress(ion) according to it, just can use and need not carry out decompress(ion) to whole file.Above-mentioned precision or scope according to vector data organized compression position flow so that export the process that bit stream has approximate progressive character and is called second layer coding.
Second layer coding is by the realization of cutting apart to the compressing vector data piece.Suppose that the precision that former compressing vector data reaches is ε Total, this ε TotalRefer between all data points and the former data point behind the decompress(ion) a certain prior given threshold value (for example, 10 that do not exceed in x direction and y deflection error -5).Given one group of S accuracy class { ε from low to high 1, ε 2, Λ, ε S, ε wherein STotalAll compressing vector data pieces all will be split into the S joint so, make that decompressed data has reached ε when only decompress(ion) the 1st saves 1Accuracy class, and the like.Similarly, to whole compressing vector data file, (during the individual accuracy class of 1≤k≤S), whole decompress(ion) vector data file has also reached k accuracy class when will wherein all compressing vector data pieces all unziping to k.Therefore will compress the k section that the set that belongs to the compression position flow that k saves in all compressing vector data pieces of vector file (perhaps its certain scope, explanation hereinafter) is called this compression vector file (perhaps its certain scope).Having comprised several (nonzero coefficient, zero run-length length) right codings in the joint that forms after cutting apart single compressing vector data piece, also may be zero, can not interrupt (nonzero coefficient, zero run-length length) right coding but cut apart.Form joint afterwards in the code word that must add EOB at last (in the above example, being 1010) of these joints,, so directly represent this joint with EOB if this joint is empty.Schematic diagram of cutting apart with five accuracy classes is seen Fig. 4.The mode of determining and the compressing vector data piece being cut apart according to the accuracy class of determining of accuracy class can be specified by using, and is not the content of this patent.
Among the present invention vector data being carried out second layer coding has dual mode, is respectively precision mode of priority and scope mode of priority.In the precision mode of priority, the low precision version of whole vector data at first is extracted, and the data of decompress(ion) will progressively improve the precision of whole vector data then.In the scope mode of priority, at first whole vector data is divided into several scopes, all scope non-overlapping copies all are made up of several geometric objects.Then each scope is carried out the preferential tissue of precision.The division of scope is determined by using, is not the content of this patent.Because the precision mode of priority also can be regarded as whole file is considered as a scope, can think that therefore the compressing vector data bit stream is made up of one or more scope, a scope then is made up of one or more section.
Illustrate this two kinds of second layer coded systems below.Suppose to have and comprised three geometric object A among the vector data A 1, A 2, A 3All geometric objects all are divided into three joints, i.e. A according to three precision levels IjI, j=1,2,3, represent that the j of i geometric object saves.A is split into two scopes under the scope mode of priority, comprises geometric object A respectively 1, A 2With geometric object A 3If according to the precision mode of priority, the arrangement mode of all joints is in compressed file so:
A 11,A 21,A 31,A 12,A 22,A 32,A 13,A 23,A 33
If according to the scope mode of priority, the arrangement mode of all joints is in compressed file so:
A 11,A 21,A 12,A 22,A 13,A 23,A 31,A 32,A 33
Three. the output of compressing vector data
To all compressing vector data pieces, just formed last vector data compressed file in interpolation various marks and control information under the given coded system and after organizing, be called the jvg file, form (see figure 5) by file header and file body.Below the form of jvg file is done detailed explanation.All bytes all have the littleendian byte order (the most remarkable position in byte backmost) that uses on the Intel platform.
Referring to subordinate list 3, the file header of compressed file has comprised following two parts:
First is whole about the compressing vector data file and the information of the compression algorithm taked.
This part is essential.Here and the mark of in the formation of follow-up code stream, using form by two bytes, wherein first byte is 0XFF, second byte be value in the scope except 0X00 and 0XFF (be complete 0 and complete 1 byte), specifically sees attached list 3.
Second portion is the needed table of compression algorithm, comprises the dct transform coefficient quantization table, DC coefficient difference and AC coefficient be can't harm the coding schedule of entropy coding.
This part is optional.If do not provide these tables, may obtain from other acquiescence approach by compression and decompress(ion) program so.Specify as follows:
It at first is the definition of dct transform coefficient quantization table.The definition of quantization table is begun by mark DQT (referring to subordinate list 4), first kind of undefined quantization table of mode is made up of a double-precision floating point constant C and the individual integer of N (N is the discrete cosine transform block size), and the corresponding quantization step of each dct transform coefficient is the integer that C multiply by correspondence position; The undefined quantization table of the second way is made up of N double-precision floating points, and these numbers are the quantization step of corresponding dct transform coefficient.
Be the definition of DC coefficient difference huffman coding table then.In order to preserve this table, at first provide the line number R that one one byte signless integer is represented this table, the difference that every row comprises can be directly number be calculated by the row of this row, does not therefore need to preserve.Next be a R byte signless integer, represent the length of the pairing Huffman code word of every row.The length of Huffman code word does not allow to surpass 16.Each is preserved with two bytes in order with all code words at last, and real code word is positioned at the low side of these two bytes, because code word size does not allow to surpass 16, therefore two bytes are enough.With subordinate list 1 is example, and at first the byte of Bao Cuning is 0X11 (17 hexadecimal representations), is following 17 bytes then, represents the length of each code word:
0X02,0X03,0X03,…
Ensuing 34 bytes are:
0X00,0X00,0X00,0X02,0X00,0X03…
Be AC coefficient distance of swimming huffman coding table at last.In order to preserve this table, at first provide line number R and columns Z that two one byte signless integers are represented this table.R has represented the maximum number of lines of non zero AC coefficient in DC coefficient difference huffman coding table, and Z has represented the maximum length of the zero run-length of non zero AC coefficient front.Then, next preserve RZ code word in this table according to the order of row major again, and (Biao beginning just) inserts the code word of preserving EOB in (0,0) position, locate to insert the code word of preserving ZRL in (0,15).Amount to RZ+2 code word.When preserving these code words, equally at first preserve a byte signless integer of RZ+2 expression correspondence code word length, each is preserved with two bytes in order with all code words then.
If above-mentioned quantization table and coding schedule all provide, six tables (corresponding x and y coordinate data respectively) will be arranged so in compressed file.
The file body of jvg file has comprised actual compressing vector data.If be the ground floor coding, this part only has a compressing vector data section so, and this section is made up of one or more compressing vector data pieces.If be second layer coding, this part has one or more scopes so, and each scope is made up of one or more sections, and each section is made up of the joint of one or more compressing vector data pieces in order.In file body, need some marks to represent the beginning and the end of different scopes and section, and each joint ends up by EOB.
At first do some explanations below with regard to the compressing vector data piece in the ground floor coding.By the block algorithm of front as can be known the compressing vector data piece may handle by GDB or GSDB, and this GDB or GSDB may be arranged in a part that geometric object is different, these informational needs are preserved in the compressing vector data piece.For this reason at additional two bits of the beginning of each compressing vector data piece, its meaning is shown in subordinate list 5, if this compressing vector data piece is come by the GSDB compression then next byte is preserved actual the counting of comprising in this compressing vector data piece.Because GDB or GSDB will produce two compressing vector data pieces (corresponding x and y coordinate data respectively), and these two pieces always in succession, so only preserve these information previous beginning.Can rebuild the geometric object of forming by a plurality of parts by the points information that comprises among build portion information and the GSDB.If in addition a compressing vector data piece is divided into a plurality of joints according to required precision, so only in first segment, preserve former build portion information.
Determined to adopt ground floor still be second layer coding and specifically whole vector data is divided into scope and the compressing vector data piece in each scope is divided into the mode of the joint and the section of being combined as after get final product all bit streams are organized as file body.Accompanying drawing 5 is examples of file body.In compressed file, be difficult to find the border between each geometric object, but when decompress(ion), can be combined as geometric object according to the header information of each compressing vector data piece coordinate point range after with decompress(ion).
Also have 2 needs explanations at last.The one, each coding unit in the jvg file (as the compressing vector data piece or cut apart the formed joint of this piece) all must be a byte-aligned, therefore, must replenish enough 0 at last so that its figure place is 8 integral multiple at this coding unit so if the figure place of coding unit is not 8 integral multiple.The 2nd, for fear of bit stream and the mark in the compressed file (definition of the face that sees before) obscured in the coding unit, if in the regulation coding unit if 0XFF byte (just one is 1 byte entirely), inserting one so in this byte back is 0 byte entirely.If therefore decoder runs into a 0XFF byte, if the byte of its back is 0X00, should directly abandon this byte so, if not, should explain so these two bytes are interpreted as a mark and take corresponding action according to the value of this mark.Fig. 6 to Figure 10 is an example that compresses actual river data with the art of this patent, employing ground floor coding, size before the compression is 5,831,572 bytes (5.56M), the size after the compression are 269,716 bytes (264kB), comprised file header and quantization table, coding schedule etc., compression ratio has reached 21.6.
Subordinate list
Subordinate list 1 difference huffman coding table example
Capable number The coefficient difference The Huffman code word
0 0 00
1 -1.1 010
2 -3,-2,2,3 011
3 -7,…,-4,4,…,7 100
4 -15,…,-8,8,…,15 101
5 -31,…,-16,16,…,31 110
6 -63,…,-32,32,…,63 1110
7 -127,…,-64,64,…,127 11110
8 -255,…,-128,128,…,255 111110
9 -511,…,-256,256,…,511 1111110
10 -1023,…,-512,512,…,1023 11111110
11 -2047,…,-1024,1024,…,2047 111111110
12 -4095,…,-2048,2048,…,4095 11111111110
13 -8191,…,-4096,4096,…,8191 111111111110
14 -16383,…,-8192,8192,…,16383 1111111111110
15 -32767,…,-16384,16384,…,32767 11111111111110
16 32768 11111111111111
Subordinate list 2AC coefficient distance of swimming huffman coding table example
Z/R Code word Z/R Code word Z/R Code word
0/0(EOB) 1010
0/1 00 1/1 1100 2/1 11100
0/2 01 1/2 11011 2/2 11111001
0/3 100 1/3 1111001 2/3 1111110111
0/4 1011 1/4 111110110 2/4 111111110100
0/5 11010 1/5 11111110110 2/5 1111111110001001
Subordinate list 3 jvg file headers
The starting position Explanation Value Type Explanation
Byte 0 Compressed file is described “jveg” Character Essential
Byte 4 Compressed file begins SOF Assigned tags Essential
Byte 6 Compressed file length File size (byte) Integer Essential
Byte 10 The discrete cosine transform block size 2 n, n is specified by the user Integer Essential
Byte 14 The compressed file version 1000 Integer Essential
Byte 18 The vector data scope Xmin Double-precision floating points Essential
Byte 26 The vector data scope Ymin Double-precision floating points Essential
Byte 34 The vector data scope Xmax Double-precision floating points Essential
Byte 42 The vector data scope Ymax Double-precision floating points Essential
Byte 50 Direct quantization step Xstep Double-precision floating points Optional
Byte 58 Direct quantization step Ystep Double-precision floating points Optional
Employed mark in the subordinate list 4 jvg files
Value Title Explanation
0XFF60 SOF 0 Floating-point DCT, the ground floor coding
0XFF61 SOF 1 Floating-point DCT, second layer coding, precision is preferential
0XFF62 SOF 2 Floating-point DCT, second layer coding, scope is preferential
0XFF63 SOF 3 Integer DCT
0XFF70 SOV Compressing vector data begins
0XFF71 EOV Compressing vector data finishes
0XFF72 SOR Scope begins
0XFF73 EOR End of extent (EOE)
0XFF74 SOS The section beginning
0XFF75 EOS Section finishes
0XFF80 DQT 0 First kind of dct transform coefficient quantization table that mode defines
0XFF81 DQT 1 The dct transform coefficient quantization table of second way definition
0XFF82 DHT 0 Definition DC coefficient difference huffman coding table
0XFF83 DHT 1 Definition AC coefficient huffman coding table
Subordinate list 5 compressing vector data build portion information
The value of two bits of compressing vector data build portion Meaning
11 This piece is a GDB
10 This piece is a GSDB, but is not the ending of a geometric object
00 This piece is a GSDB, and the corresponding ending of a geometric object
List of references:
[1]H.Wu,Hanwu Zhang,Xiaojing Liu,Xia Sun,Adaptive Architecture of GeospatialInformation Service over the Internet with QoGIS Embedded,in:ISPRS Workshopon Service and Application of Spatial Data Infrastructure,XXXVI(4/W6),Oct14-16,2005,Hangzhou,China
[2] Hu Guangshu. Digital Signal Processing---theory, algorithm and realization. Beijing: publishing house of Tsing-Hua University, 1997
[3]A.Ortega and K.Ramchandran,Rate Distortion Methods for Image and VideoCompression,IEEE Signal Proc.Magazine,pp.23-49,Nov.1998.
[4]M.Rabbani and R.Joshi,An Overview of the JPEG2000 Still Image CompressionStandard,Signal Processing:Image Communication,vol.17,issue.1,pp.3-48,Jan.2002.
[5]V.K.Goyal,Theoreticai Foundations of Transform Coding,IEEE SignalProcessing Magazine,pp.9-21,Sept.2001.
[6]Y.Zeng,L.Cheng,G.Bi,and A.C.Kot,Integer DCTs and Fast Algorithms.IEEETransactions on Signal Processine,vol.49,no.11,pp.2774-2782,Nov.2001.
[7] David Salomon[U.S.] work, Wu Lenan etc. translate, data compression principle and application. Beijing: Electronic Industry Press, 2003

Claims (4)

1. the compression method of a two-dimensional space vector data, it is characterized in that behind the two-dimensional vector data piecemeal, use is carried out lossy compression method based on the coding method of dct transform to two-dimensional vector data, and is output as the bit stream of specified format, and DCT is the english abbreviation of discrete cosine; Concrete steps are as follows:
(1) two-dimensional vector data is divided into the vector data piece, follow-up cataloged procedure carries out individually at each vector data piece, and this step is called piecemeal,
(2) x in each vector data piece and y coordinate data are carried out lossy compression method based on dct transform, this step is called coding,
(3) compression data block that all codings are obtained adds various marks and control information formation compression position flow, outputs to binary compressing vector data file, and this step is called output.
2. the compression method of two-dimensional vector data according to claim 1 is characterized in that adopting the following steps piecemeal:
To handled two-dimensional vector data, the object vector that is comprised with vector data inside is a unit, is divided into the vector data piece of identical size according to the needs of compression algorithm; All contain 2 in each vector data piece nIndividual two-dimemsional number strong point, for consistent with the original structure of vector data, piecemeal carries out in object vector inside, less than 2 nThe vector data piece of individual data point constitutes a complete vector data block after artificially replenishing some data points; N is the integer greater than 1, specifies before compression.
3. the compression method of two-dimensional vector data according to claim 1 is characterized in that adopting the following steps coding:
A. to all vector data pieces, select one of two kinds of patterns that x and y coordinate data are carried out the one dimension dct transform respectively; First kind of pattern is earlier coordinate data to be quantized, and then the rounded coordinate data after quantizing be can't harm the integer dct transform, obtains the integer transform coefficient; Second kind of pattern is directly coordinate data to be carried out the floating-point dct transform, obtains the floating-point transform coefficient,
B. all conversion coefficients that second kind of mode treatment obtained carry out uniform quantization, obtain quantization transform coefficient,
C. to the integer transform coefficient in the quantization transform coefficient of each vector data piece or the first kind of pattern in order block-by-block can't harm entropy coding, generate the compressing vector data piece, the cataloged procedure of being made up of step a, b and c is called ground floor encodes, perhaps,
D. the further tissue compression vector data of the requirement piece that increases progressively according to the reconstruction precision of vector data makes compression position flow have progressive character, and like this, the cataloged procedure of being made up of step a, b, c and d is called second layer coding; Second layer coding is only at the quantization transform coefficient that obtains under second kind of pattern.
4. the compression method of two-dimensional vector data according to claim 1 is characterized in that output: the lossy compression method data block is added various marks and control information, and the bit stream that obtains is outputed to the compressed file of specified format, expansion jvg by name; This compressed file is made up of file header and two parts of file body.
CNB2005100199317A 2005-12-01 2005-12-01 A kind of compression method of two-dimensional vector data Expired - Fee Related CN100546196C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2005100199317A CN100546196C (en) 2005-12-01 2005-12-01 A kind of compression method of two-dimensional vector data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNB2005100199317A CN100546196C (en) 2005-12-01 2005-12-01 A kind of compression method of two-dimensional vector data

Publications (2)

Publication Number Publication Date
CN1777038A true CN1777038A (en) 2006-05-24
CN100546196C CN100546196C (en) 2009-09-30

Family

ID=36766383

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2005100199317A Expired - Fee Related CN100546196C (en) 2005-12-01 2005-12-01 A kind of compression method of two-dimensional vector data

Country Status (1)

Country Link
CN (1) CN100546196C (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101529917B (en) * 2006-10-23 2011-08-31 高通股份有限公司 Signalling of maximum dynamic range of inverse discrete cosine transform
WO2011137841A1 (en) * 2010-09-29 2011-11-10 华为技术有限公司 Method and device for compression encoding, method and device for decompression decoding, and communication system
CN103427845A (en) * 2013-08-05 2013-12-04 广西电网公司电力科学研究院 Method for compressing and reconstructing harmonic data of power system on basis of two-dimensional block DCT (discrete cosine transformation)
CN103929185A (en) * 2013-01-10 2014-07-16 国际商业机器公司 Method And System For Real-time Reduction Of Cpu Overhead For Data Compression
CN106445445A (en) * 2011-11-08 2017-02-22 苏州超擎图形软件科技发展有限公司 Method and device for processing vector data
US9588980B2 (en) 2013-01-10 2017-03-07 International Business Machines Corporation Real-time identification of data candidates for classification based compression
CN106685426A (en) * 2016-11-28 2017-05-17 北京航天自动控制研究所 Coding method of target information
US9792350B2 (en) 2013-01-10 2017-10-17 International Business Machines Corporation Real-time classification of data into data compression domains
CN108668134A (en) * 2018-04-08 2018-10-16 北京奇艺世纪科技有限公司 A kind of decoding method, device and electronic equipment
CN109039342A (en) * 2018-08-24 2018-12-18 国网河北省电力有限公司电力科学研究院 A kind of compression method, system and the decompression method of force data, system out
CN111371461A (en) * 2020-04-26 2020-07-03 宁夏隆基宁光仪表股份有限公司 Original code and inverse code mixed data compression method suitable for intelligent electric meter
CN111930743A (en) * 2020-07-29 2020-11-13 武汉中地先进技术研究院有限公司 SQLite-based spatial data local storage method, medium and electronic device
CN114040028A (en) * 2021-10-29 2022-02-11 深圳智慧林网络科技有限公司 Data compression method and data decompression method based on three modes

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101529917B (en) * 2006-10-23 2011-08-31 高通股份有限公司 Signalling of maximum dynamic range of inverse discrete cosine transform
WO2011137841A1 (en) * 2010-09-29 2011-11-10 华为技术有限公司 Method and device for compression encoding, method and device for decompression decoding, and communication system
CN106445445A (en) * 2011-11-08 2017-02-22 苏州超擎图形软件科技发展有限公司 Method and device for processing vector data
CN106445445B (en) * 2011-11-08 2022-06-03 苏州超擎图形软件科技发展有限公司 Vector data processing method and device
US10387376B2 (en) 2013-01-10 2019-08-20 International Business Machines Corporation Real-time identification of data candidates for classification based compression
CN103929185A (en) * 2013-01-10 2014-07-16 国际商业机器公司 Method And System For Real-time Reduction Of Cpu Overhead For Data Compression
US9564918B2 (en) 2013-01-10 2017-02-07 International Business Machines Corporation Real-time reduction of CPU overhead for data compression
US9588980B2 (en) 2013-01-10 2017-03-07 International Business Machines Corporation Real-time identification of data candidates for classification based compression
CN103929185B (en) * 2013-01-10 2017-05-24 国际商业机器公司 Method and system for real-time reduction of CPU overhead for data compression
US9792350B2 (en) 2013-01-10 2017-10-17 International Business Machines Corporation Real-time classification of data into data compression domains
CN103427845A (en) * 2013-08-05 2013-12-04 广西电网公司电力科学研究院 Method for compressing and reconstructing harmonic data of power system on basis of two-dimensional block DCT (discrete cosine transformation)
CN106685426B (en) * 2016-11-28 2021-02-09 北京航天自动控制研究所 Target information coding method
CN106685426A (en) * 2016-11-28 2017-05-17 北京航天自动控制研究所 Coding method of target information
CN108668134A (en) * 2018-04-08 2018-10-16 北京奇艺世纪科技有限公司 A kind of decoding method, device and electronic equipment
CN108668134B (en) * 2018-04-08 2021-03-26 北京奇艺世纪科技有限公司 Encoding and decoding method and device and electronic equipment
CN109039342A (en) * 2018-08-24 2018-12-18 国网河北省电力有限公司电力科学研究院 A kind of compression method, system and the decompression method of force data, system out
CN109039342B (en) * 2018-08-24 2022-12-06 国网河北省电力有限公司电力科学研究院 Compression method and system and decompression method and system of output data
CN111371461A (en) * 2020-04-26 2020-07-03 宁夏隆基宁光仪表股份有限公司 Original code and inverse code mixed data compression method suitable for intelligent electric meter
CN111371461B (en) * 2020-04-26 2023-04-07 宁夏隆基宁光仪表股份有限公司 Original code and inverse code mixed data compression method suitable for intelligent electric meter
CN111930743A (en) * 2020-07-29 2020-11-13 武汉中地先进技术研究院有限公司 SQLite-based spatial data local storage method, medium and electronic device
CN114040028A (en) * 2021-10-29 2022-02-11 深圳智慧林网络科技有限公司 Data compression method and data decompression method based on three modes
CN114040028B (en) * 2021-10-29 2023-11-24 深圳智慧林网络科技有限公司 Data compression method and data decompression method based on three modes

Also Published As

Publication number Publication date
CN100546196C (en) 2009-09-30

Similar Documents

Publication Publication Date Title
CN1777038A (en) Two-dimensional vector data compression method
CN1303820C (en) Quality based image compression
CN1689026A (en) Context-based adaptive variable length coding for adaptive block transforms
RU2567988C2 (en) Encoder, method of encoding data, decoder, method of decoding data, system for transmitting data, method of transmitting data and programme product
CN101061515A (en) Coding scheme for a data stream representing a temporally varying graphics model
KR20110045949A (en) Method and apparatus for encoding and decoding image by using rotational transform
CN1329439A (en) Method for producing qunantization matrix
CN102611888A (en) Encoding method for screen content
CN1148067C (en) Data compressing method for complex image of synthetic apertre radar
CN1215440C (en) Method and device using linear approximation to compress and reconfigure animation path
Barman et al. A quantization based codebook formation method of vector quantization algorithm to improve the compression ratio while preserving the visual quality of the decompressed image
JP2013502142A (en) Video encoding / decoding method and apparatus using rotational transformation
Rao et al. Evaluation of lossless compression techniques
Shingate et al. Still image compression using embedded zerotree wavelet encoding
CN1195449A (en) Image encoder, image decoder and image transmitting system
JPH08331563A (en) Image compressing method using ripple conversion
CN1664860A (en) Synthetic aperture radar complex numeric image data real time automatic compression method
Demaret et al. Scattered data coding in digital image compression
Ouafi et al. A modified embedded zerotree wavelet (MEZW) algorithm for image compression
CN104094607B (en) Modeling method and system based on context in transform domain of image/video
Zheng et al. A new image pre-processing for improved performance of entropy coding
CN100466749C (en) Anti-error code image coding and decoding method based on distributive source encoding
Senapati et al. Low bit rate image compression using hierarchical listless block-tree DTT algorithm
Sivia et al. Image Compression Using Row & Column Based Cosine Error Vector Rotation (TCEVR) and Error Vector Rotation Using Walsh Codebook Generation In Different Quantization Matrix
Thao et al. Set theoretic compression with an application to image coding

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20090930

Termination date: 20111201