CN106407378A - Method for expressing road network trajectory data again - Google Patents
- Publication number
- CN106407378A (application number CN201610817878.3A)
- Authority
- CN
- China
- Prior art keywords
- data
- road
- track
- time
- sequence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/29—Geographical information databases
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Remote Sensing (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Navigation (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
The invention belongs to the technical field of trajectory data computation, and in particular relates to a method for re-representing road network trajectory data. Road network trajectory data obtained from raw GPS (Global Positioning System) sampling is difficult to encode and compress, so the original sequence of three-dimensional real-valued points needs to be transformed before compression. With this method, a map-matched road network trajectory is decomposed into spatial data and time data: the spatial data is a sequence of roads in the road network, and the time data is a sequence of distance-time two-tuples. The data before and after decomposition can be converted into each other losslessly in linear time. In trajectory computation, the method reduces the cost of storing and querying trajectories in a database.
Description
Technical field
The invention belongs to the technical field of trajectory data computation, and in particular relates to a method for re-representing road network trajectory data.
Background
Trajectory data is a basic kind of spatio-temporal data, usually defined as a function from time to position. A trajectory point sampled by a vehicle positioning device is represented by a triple (x, y, t), where x and y are the longitude and latitude and t is the timestamp of the sample. An original road network trajectory can therefore be represented by a sequence of triples, <(x_1, y_1, t_1), (x_2, y_2, t_2), ..., (x_n, y_n, t_n)>, where n is the length of the trajectory and (x_i, y_i) is the position of the vehicle at time t_i. As vehicle positioning devices have become widespread, vehicles in cities generate massive numbers of road network trajectories. These trajectories carry a great deal of information and often serve as an important basis for decisions and source of information when analyzing urban traffic, mining people's behavior patterns, and predicting the direction of vehicle movement. An urban road network is represented by a directed graph G = (V, E), where V is the set of road intersections and E is the set of road segments between intersections.
Data compression algorithms fall into two classes: lossless and lossy. Lossless compression loses no information, so the compressed data can be restored exactly to the original data. Lossy compression instead discards the parts of the data that do not affect the required precision in order to reach a higher compression ratio, at the price of some information loss after compression. Lossless compression algorithms are further divided into entropy coding and dictionary coding. Common entropy coders are Huffman coding and arithmetic coding; dictionary coding typically uses Lempel-Ziv coding and the algorithms derived from it. Lossy compression algorithms are special-purpose algorithms designed for particular kinds of data, such as JPEG compression for images and MPEG compression for audio; unlike lossless compression, these methods apply only to specific data (such as images and audio). General trajectory data and road network trajectory data also have their own lossy compression algorithms, which usually delete from the original trajectory the sample points that do not affect the precision of the data.
Because it is sampled from mobile positioning devices, original road network trajectory data is usually represented as a sequence of triples T = <(x_1, y_1, t_1), (x_2, y_2, t_2), ..., (x_n, y_n, t_n)>. This representation contains unnecessary redundancy, which hinders compression. In view of this limitation of the original representation, the present invention proposes a new trajectory format and a corresponding data decomposition method that directly reduce the redundancy, so that the data is easier for compression algorithms to process.
Summary of the invention
The compression ratio is one of the key measures of the performance of a data compression algorithm and is usually defined as the ratio of the size of the original data to the size of the compressed data. Given an original trajectory T of size |T| and a compressed trajectory T_c of size |T_c|, the compression ratio is

$$r = \frac{|T|}{|T_c|}$$

For example, if the original data is 2 KB and the compressed data is 1 KB, the compression ratio is 2.
First consider applying a general-purpose lossless compression algorithm (source coding) directly to the original trajectory. Since the theoretical foundation of data compression is information theory, the trajectory compression problem can be analyzed in terms of information entropy. It can be proved that, for both entropy coding and dictionary coding, the compression ratio on trajectory data becomes very low as the precision of the real-valued data increases.
Theorem: As the precision of real-valued data increases, the compression ratios of entropy coding and dictionary coding on trajectory data both tend to 1.
Proof: We first show that entropy coding is inefficient for high-precision real-valued data. Let X be a continuous random variable with probability density function p(x). To compute the entropy of X, first divide the sample space Ω = [a, b) of X into n parts, each subinterval having length Δ = (b - a)/n. If [a, b) is divided into {[a = x_0, x_1), [x_1, x_2), ..., [x_{n-1}, x_n = b)}, the probabilities that X falls in each subinterval form a discrete distribution, which can be computed by integration:

$$p_i = \int_{x_{i-1}}^{x_i} p(x)\,dx, \qquad i = 1, \dots, n$$

By the mean value theorem for integrals, there exists $\xi_i \in [x_{i-1}, x_i]$ such that

$$p_i = p(\xi_i)\,\Delta$$

In other words, the entropy of this discrete distribution is

$$H_\Delta(X) = -\sum_{i=1}^{n} p(\xi_i)\Delta \log\bigl(p(\xi_i)\Delta\bigr) = -\sum_{i=1}^{n} p(\xi_i)\Delta \log p(\xi_i) - \log \Delta$$

If the function p(x) log p(x) is Riemann integrable, then

$$\lim_{n \to \infty} \bigl(H_\Delta(X) + \log \Delta\bigr) = -\int_a^b p(x)\log p(x)\,dx = h(X)$$

where h(X) is the differential entropy of the continuous distribution X.
The n in the equations above is exactly the precision of the data: the finer the partition, the more distinct symbols are needed to represent the data, that is, the higher the precision. Without compression, each symbol can be stored directly using $\log n$ bits. By Shannon's source coding theorem and the optimality of entropy coding, as n tends to infinity the compression ratio r satisfies

$$r = \lim_{n\to\infty} \frac{\log n}{H_\Delta(X)} = \lim_{n\to\infty} \frac{\log n}{h(X) - \log \Delta} = \lim_{n\to\infty} \frac{\log n}{h(X) + \log n - \log(b-a)} = 1$$

In summary, as data precision increases, the compression ratio of entropy coding approaches 1; that is, the data cannot be compressed.
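The limit above can be checked numerically. The sketch below is only an illustration under assumptions that are not part of the original text: it uses the density p(x) = 2x on [0, 1) (chosen because its bin probabilities have a closed form), base-2 logarithms, and the hypothetical helper names `quantized_entropy` and `ideal_ratio`.

```python
import math


def quantized_entropy(n: int) -> float:
    """Entropy in bits of X ~ p(x) = 2x on [0, 1), quantized into n equal bins.

    The probability of bin i is the integral of 2x over [(i-1)/n, i/n),
    which equals (2i - 1) / n**2.
    """
    entropy = 0.0
    for i in range(1, n + 1):
        p_i = (2 * i - 1) / (n * n)
        entropy -= p_i * math.log2(p_i)
    return entropy


def ideal_ratio(n: int) -> float:
    """Best achievable ratio: log2(n) raw bits per symbol over the entropy."""
    return math.log2(n) / quantized_entropy(n)


if __name__ == "__main__":
    for bits in (4, 8, 12, 16):
        n = 2 ** bits
        print(f"n = 2^{bits}: H = {quantized_entropy(n):.4f} bits, r = {ideal_ratio(n):.4f}")
```

As the number of bins grows, the printed ratio shrinks toward 1, in line with the theorem: an ideal entropy coder saves only a roughly constant number of bits per symbol, which becomes negligible next to log n.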
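Unlike entropy coding, analyzing the compression performance of dictionary coding requires the joint entropy of the source distribution, but the proof proceeds similarly. Given any k characters, let p(x_1, x_2, ..., x_k) be the joint probability density function of X_1, X_2, ..., X_k. The optimal average code length L_k necessarily satisfies

$$H(X_1, \dots, X_k) \le L_k < H(X_1, \dots, X_k) + 1$$

That is, at least H(X_1, X_2, ..., X_k) bits are needed to represent the k characters.
Similarly, divide the sample space Ω = [a_1, b_1) × [a_2, b_2) × ... × [a_k, b_k) into n^k parts, each block having size $\Delta = \prod_{i=1}^{k}(b_i - a_i) / n^k$. The entropy of the resulting discrete distribution is

$$H_\Delta(X_1, \dots, X_k) = -\sum p(\xi)\,\Delta \log p(\xi) - \log \Delta$$

where the sum runs over the n^k blocks and ξ is the point in each block given by the mean value theorem.
So, if there are k data items and their precision is n, at least H_Δ(X_1, X_2, ..., X_k) bits are needed to encode them. If p(x_1, x_2, ..., x_k) log p(x_1, x_2, ..., x_k) is Riemann integrable, then

$$\lim_{n \to \infty} \bigl(H_\Delta(X_1, \dots, X_k) + \log \Delta\bigr) = h(X_1, \dots, X_k)$$

where h(X_1, X_2, ..., X_k) is the joint differential entropy. Finally, the compression ratio r can be computed:

$$r = \lim_{n\to\infty} \frac{k \log n}{H_\Delta(X_1, \dots, X_k)} = \lim_{n\to\infty} \frac{k \log n}{h(X_1, \dots, X_k) + k \log n - \sum_{i=1}^{k} \log(b_i - a_i)} = 1$$

This completes the proof.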
Lossy compression algorithms can reach very high compression ratios, but at the cost of data precision. Most existing algorithms delete sample points directly from the original trajectory, which can produce large deviations between the original trajectory and the compressed one. If the precision requirement is high and the information loss caused by compression is strictly bounded, the compression ratio of these lossy algorithms is also very low. In the extreme case where zero information loss is required, lossy compression is equivalent to lossless compression, and these lossy algorithms are as inefficient as general-purpose lossless algorithms.
From the analysis above, the key factor limiting the compression ratio is the representation of the trajectory data rather than the compression algorithm used. In fact, entropy coding and dictionary coding have been proved to be optimal compression algorithms, in that both attain the entropy (or entropy rate) of the source. In information theory, entropy measures the uncertainty of a source: the higher the uncertainty, the more information each source output carries, and the more data is needed to encode it. Consider the existing triple representation: it can represent an arbitrary trajectory in two-dimensional space. The shape of a road network trajectory, however, is strictly constrained by the roads, so its uncertainty is far lower than that of an arbitrary two-dimensional trajectory. In other words, the original representation introduces unnecessary uncertainty (superfluous information), which makes trajectory data represented as triples hard to compress.
The present invention removes the unnecessary uncertainty in the data by dimensionality reduction. If a three-dimensional trajectory point (x_i, y_i, t_i) is instead represented in the two-dimensional form (d_i, t_i), the baseline compression ratio already reaches 3/2 = 1.5, since each sample stores two values instead of three. Note that this transformation must be lossless: there must be a one-to-one correspondence between the data before and after the transformation; otherwise it amounts to lossy compression of the data. Changing the representation of the data before compression not only directly improves the compression ratio but also makes the data easier for subsequent compression algorithms to process.
In the original trajectory, a sample point (x_i, y_i, t_i) means that at time t_i the target is at position (x_i, y_i). If (x_1, y_1) is the first sample point of the trajectory, the distance d_i travelled from the start position (x_1, y_1) to the current position (x_i, y_i) is uniquely determined. Conversely, if only the distance travelled along the roads from the start point is known, the corresponding position (x_i, y_i) is hard to determine. Therefore, to establish a one-to-one correspondence between the original trajectory and the decomposed trajectory, the road sequence <e_1, e_2, ..., e_m> must also be saved, where each e_i is an edge in E and m is the number of roads the trajectory passes through.
At this point the trajectory has been decomposed into two parts: the spatial data, which is a road sequence, and the time data, which is a distance-time sequence. The format of the trajectory data is defined as follows. The road sequence of a trajectory T is the series of consecutive roads that T passes through in the road network G = (V, E), that is, SP_T = <e_1, e_2, ..., e_m>. In the road network G = (V, E), V is the set of graph vertices (road intersections) and E is the set of edges connecting the vertices (road segments between intersections). The vertices and edges along the trajectory are <v_0, v_1, v_2, ..., v_m> and <e_1, e_2, ..., e_m>, where v_{i-1} is the start point of directed edge e_i and v_i is its end point. The distance-time sequence of T is a series of two-tuples (d_i, t_i), where d_i is the total distance the target has travelled from the start point up to time t_i, that is, TS_T = <(d_1, t_1), (d_2, t_2), ..., (d_n, t_n)>.
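The format just defined can be sketched as a small container type. This is an illustrative assumption rather than a structure given in the original text: the class name `DecomposedTrajectory`, its field names, and the `validate` helper are hypothetical, road numbers are assumed to be integers, and timestamps are assumed to be strictly increasing.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DecomposedTrajectory:
    """A map-matched trajectory split into spatial data and time data."""

    road_sequence: List[int]                  # SP_T: road numbers e_1..e_m
    distance_time: List[Tuple[float, float]]  # TS_T: (d_i, t_i) pairs

    def validate(self) -> None:
        """Check the invariants implied by the definitions of SP_T and TS_T."""
        # Consecutive duplicate roads are merged in the road sequence.
        for prev, cur in zip(self.road_sequence, self.road_sequence[1:]):
            if prev == cur:
                raise ValueError("consecutive duplicate road number")
        # d_i is a cumulative distance, so it never decreases; timestamps
        # are assumed to be strictly increasing.
        pairs = self.distance_time
        for (d0, t0), (d1, t1) in zip(pairs, pairs[1:]):
            if d1 < d0:
                raise ValueError("cumulative distance decreased")
            if t1 <= t0:
                raise ValueError("timestamps are not increasing")
```

Keeping the two sequences in separate fields mirrors the split into spatial and time data, so each part can later be handed to a different compressor.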
For any road network trajectory T, decomposing the original trajectory, or restoring it from the decomposed trajectory, can be completed in O(|T|) time. After decomposition, the trajectory data has been converted into a road sequence and a distance-time sequence. Next, COMPRESS compresses the road sequence losslessly and the distance-time sequence lossily. The road sequence is compressed losslessly because it is an integer sequence whose information entropy is relatively low; the distance-time sequence still consists of real-valued data whose information entropy remains high, so lossy compression is needed.
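The restoring direction can be sketched as a single forward pass. The following is a minimal sketch under stated assumptions, not the patent's own procedure: roads are stored as polylines in planar coordinates, distances are measured from the start point of the first road, and the function name `restore_positions` and the `edge_polylines` mapping are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def restore_positions(
    road_sequence: Sequence[int],
    distance_time: Sequence[Tuple[float, float]],
    edge_polylines: Dict[int, List[Point]],
) -> List[Tuple[float, float, float]]:
    """Rebuild (x, y, t) samples from a road sequence and (d, t) pairs."""
    # Flatten the roads into one list of straight segments along the path.
    segments: List[Tuple[Point, Point, float]] = []
    for edge_id in road_sequence:
        pts = edge_polylines[edge_id]
        for a, b in zip(pts, pts[1:]):
            segments.append((a, b, math.dist(a, b)))

    restored = []
    seg_index = 0
    seg_start = 0.0  # path distance at the start of segments[seg_index]
    # d is non-decreasing, so the segment pointer only ever moves forward.
    for d, t in distance_time:
        while seg_index < len(segments) - 1 and d > seg_start + segments[seg_index][2]:
            seg_start += segments[seg_index][2]
            seg_index += 1
        (ax, ay), (bx, by), length = segments[seg_index]
        frac = 0.0 if length == 0 else min((d - seg_start) / length, 1.0)
        restored.append((ax + frac * (bx - ax), ay + frac * (by - ay), t))
    return restored
```

Because both the sample index and the segment pointer advance monotonically, the running time is linear in the number of samples plus the number of polyline segments. The restored coordinates match the map-matched positions up to floating-point rounding.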
Based on the above analysis, the method for re-representing road network trajectory data proposed by the present invention decomposes road network trajectory data into two parts, spatial data and time data, wherein:
(1) the original GPS-sampled trajectory has the form T = <(x_1, y_1, t_1), (x_2, y_2, t_2), ..., (x_n, y_n, t_n)>, where a sample point (x_i, y_i, t_i) means that at time t_i the moving target is at the two-dimensional coordinate position (x_i, y_i); the coordinate values x_i, y_i and the timestamp t_i are real-valued data;
(2) the trajectory is decomposed into two parts: spatial data and time data;
the spatial data is a road number sequence that characterizes the spatial shape of the trajectory;
the time data is a sequence of distance-time two-tuples that characterizes how the speed changes along the path.
The road number sequence is represented as follows:
(1) after map matching, the trajectory data no longer contains GPS sampling error: every trajectory point has been corrected, and there is no deviation between a sample point's position and the corresponding map road;
(2) after map matching, every sample point lies on a map road, so the road number corresponding to each sample point can be obtained. The road number sequence SP_T = <e_1, e_2, ..., e_m> corresponding to the original sample point sequence is the spatial data after decomposition, where each e_i is an edge in E and m is the number of roads the trajectory passes through;
(3) the spatial data can also be represented by a vertex sequence of the map, that is, SP_T = <v_0, v_1, v_2, ..., v_m>, where v_{i-1} is the start point of directed edge e_i and v_i is its end point; representing the road sequence by edges is equivalent to representing it by vertices.
The distance-time two-tuple sequence has the form (d_i, t_i), where d_i is the total distance the target has travelled from the start point up to time t_i; the two-tuple sequence TS_T = <(d_1, t_1), (d_2, t_2), ..., (d_n, t_n)> is the time data after decomposition.
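The equivalence between the edge form and the vertex form of the spatial data can be illustrated with a pair of conversions. This is a sketch under assumptions that go beyond the text: the `endpoints` mapping from a road number to its (start vertex, end vertex) pair and both function names are hypothetical, and the reverse conversion additionally assumes that at most one directed edge connects any ordered pair of vertices.

```python
from typing import Dict, List, Tuple


def edges_to_vertices(edge_seq: List[int], endpoints: Dict[int, Tuple[int, int]]) -> List[int]:
    """Convert <e_1..e_m> to <v_0..v_m> using each edge's (start, end) vertices."""
    if not edge_seq:
        return []
    vertices = [endpoints[edge_seq[0]][0]]
    for e in edge_seq:
        start, end = endpoints[e]
        if start != vertices[-1]:
            raise ValueError(f"edge {e} does not continue the path")
        vertices.append(end)
    return vertices


def vertices_to_edges(vertex_seq: List[int], endpoints: Dict[int, Tuple[int, int]]) -> List[int]:
    """Convert <v_0..v_m> back to <e_1..e_m>; assumes no parallel edges."""
    by_pair = {pair: e for e, pair in endpoints.items()}
    return [by_pair[(u, v)] for u, v in zip(vertex_seq, vertex_seq[1:])]
```

A round trip through both functions returns the original road sequence, which is what makes the two representations interchangeable under the stated assumption.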
The trajectory decomposition method that decomposes an original trajectory into the above format comprises the following steps:
(1) map-match the input trajectory so that every sample point corresponds to a road;
(2) output the road number e_i corresponding to each sample point (x_i, y_i, t_i); for consecutive duplicates, keep only one;
(3) compute the distance travelled in the road network between every two adjacent sample points (x_{i-1}, y_{i-1}) and (x_i, y_i), denoted l_i, with l_1 = 0;
(4) for each sample point (x_i, y_i, t_i), output $d_i = \sum_{j=1}^{i} l_j$ as the d_i of the distance-time two-tuple (d_i, t_i), leaving the timestamp unchanged.
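Steps (2) to (4) can be sketched as one linear pass over samples that have already been map-matched. The sketch rests on assumptions not stated in the original: each matched sample is given as a hypothetical tuple (road number, offset from the road's start point, timestamp), road lengths come from an assumed `edge_length` mapping, and consecutive samples lie on the same road or on directly connected roads.

```python
from typing import Dict, List, Sequence, Tuple

# A map-matched sample: (road number, offset from the road's start, timestamp).
MatchedSample = Tuple[int, float, float]


def decompose(
    samples: Sequence[MatchedSample],
    edge_length: Dict[int, float],
) -> Tuple[List[int], List[Tuple[float, float]]]:
    """Return (road sequence SP_T, distance-time sequence TS_T)."""
    road_sequence: List[int] = []
    distance_time: List[Tuple[float, float]] = []
    total = 0.0  # running sum l_1 + ... + l_i, i.e. d_i
    prev_edge, prev_offset = None, 0.0
    for edge, offset, t in samples:
        if prev_edge is None:
            step = 0.0  # l_1 = 0
        elif edge == prev_edge:
            step = offset - prev_offset
        else:
            # Finish the previous road, then advance along the new one.
            step = (edge_length[prev_edge] - prev_offset) + offset
        total += step
        # Step (2): keep only one entry for consecutive duplicate roads.
        if not road_sequence or road_sequence[-1] != edge:
            road_sequence.append(edge)
        distance_time.append((total, t))
        prev_edge, prev_offset = edge, offset
    return road_sequence, distance_time
```

Each sample is visited once and does a constant amount of work, which is consistent with the linear-time bound claimed for decomposition. Roads that the vehicle crosses without being sampled on them are outside the stated assumption and would need the route from the map-matching step.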
In trajectory computation, the method of the invention reduces the cost of storing and querying trajectories in a database.
Brief description of the drawings
Fig. 1 is a sample road network comprising 12 intersections and 17 roads.
Fig. 2 shows two sample trajectories on the road network.
Detailed description
The data format and the trajectory decomposition method are described below with reference to the example road network and trajectories.
As shown in Fig. 1, the given road network comprises 12 vertices (intersections) and 17 edges (roads). Consider trajectory T_1 (the blue trajectory). Because the trajectory has been map-matched, every sample point corresponds to a road. In Fig. 2, sample point t_11 corresponds to edge e_15; sample point t_12 to edge e_16; sample point t_13 to edge e_13; sample point t_14 to edge e_6; and sample point t_15 to edge e_3. Note that if a sample point falls exactly on an intersection, the later edge rather than the earlier edge should consistently be taken as its entry in the road sequence; for example, sample point t_13 corresponds to edge e_13 rather than e_16. The road sequence of T_1 is therefore SP_1 = <e_15, e_16, e_13, e_6, e_3>.
To compute the corresponding distance-time sequence, apply the trajectory decomposition method. From the computed road sequence and the road shapes, the road network distance between each pair of consecutive sample points can be computed in turn. For example, in Fig. 1 the distance between t_11 and t_12 is w(e_15) + Δ_11, where w(e_15) is the total length of road e_15 and Δ_11 is the distance of t_12 from the start point of e_16. Computing the distance between two adjacent points requires the geographic shape of the roads. In a typical road network each road is stored as a polyline made up of two-dimensional coordinate points; connecting these points in order approximates the shape of the real road. From the road shape, the distance between sample points on the roads can be computed using only the Euclidean distance between two-dimensional coordinates, or the spherical distance when longitude/latitude coordinates are used. In Fig. 1, the distance between t_11 and t_12 is l_1 = w(e_15) + Δ_11; between t_12 and t_13 it is l_2 = w(e_16) - Δ_11; between t_13 and t_14 it is l_3 = w(e_13) + Δ_12; and between t_14 and t_15 it is l_4 = w(e_6) - Δ_12 + Δ_13.
Next, accumulating these distances as in step (4) of the decomposition method gives the cumulative distances 0, l_1, l_1 + l_2, l_1 + l_2 + l_3 and l_1 + l_2 + l_3 + l_4 for sample points t_11 through t_15, each paired with the timestamp of its sample point.
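The road lengths w(e) used above can be computed from the stored polylines. The sketch below is illustrative only: the function names are hypothetical, points are assumed to be (x, y) pairs in planar coordinates or (longitude, latitude) pairs in degrees, and the spherical distance uses the haversine formula with an assumed mean Earth radius of 6,371,000 m.

```python
import math
from typing import Callable, Sequence, Tuple

Point = Tuple[float, float]
EARTH_RADIUS_M = 6_371_000.0  # assumed mean Earth radius in metres


def euclidean(a: Point, b: Point) -> float:
    """Straight-line distance between two planar points."""
    return math.hypot(b[0] - a[0], b[1] - a[1])


def haversine(a: Point, b: Point) -> float:
    """Great-circle distance in metres between (lon, lat) points in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def polyline_length(points: Sequence[Point],
                    dist: Callable[[Point, Point], float] = euclidean) -> float:
    """Length of a road polyline: the sum of its segment lengths."""
    return sum(dist(a, b) for a, b in zip(points, points[1:]))
```

Passing `haversine` as `dist` gives lengths in metres for longitude/latitude polylines, while the default suits projected coordinates. An offset such as Δ_11 can be obtained the same way by measuring the polyline from the road's start point up to the matched position.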
In practice, to simplify processing of the time data, the start point of the road on which the first sample point lies can be taken as the start point of the whole trajectory. For T_2 in Fig. 2 (the red trajectory), for example, the distance of each sample point from v_5 is computed in place of its distance from t_21. This yields the road sequence and the time data of T_2.
Claims (2)
1. A method for re-representing road network trajectory data, characterized in that road network trajectory data is decomposed into two parts, spatial data and time data, wherein:
(1) the original GPS-sampled trajectory has the form T = <(x_1, y_1, t_1), (x_2, y_2, t_2), ..., (x_n, y_n, t_n)>, where a sample point (x_i, y_i, t_i) indicates that at time t_i the moving target is located at the two-dimensional coordinate position (x_i, y_i), and the coordinate values x_i, y_i and the timestamp t_i are real-valued data;
(2) the trajectory is decomposed into two parts: spatial data and time data;
the spatial data is a road number sequence that characterizes the spatial shape of the trajectory;
the time data is a sequence of distance-time two-tuples that characterizes how the speed changes along the path;
the road number sequence is represented as follows:
(1) after map matching, the trajectory data no longer contains GPS sampling error: every trajectory point has been corrected, and there is no deviation between a sample point's position and the corresponding map road;
(2) after map matching, every sample point lies on a map road, so the road number corresponding to each sample point can be obtained; the road number sequence SP_T = <e_1, e_2, ..., e_m> corresponding to the original sample point sequence is the spatial data after decomposition, where each e_i is an edge in E and m is the number of roads the trajectory passes through;
(3) the spatial data is represented by a vertex sequence of the map, that is, SP_T = <v_0, v_1, v_2, ..., v_m>, where v_{i-1} is the start point of directed edge e_i and v_i is its end point;
the distance-time two-tuple sequence has the form (d_i, t_i), where d_i is the total distance the target has travelled from the start point up to time t_i, that is, the two-tuple sequence TS_T = <(d_1, t_1), (d_2, t_2), ..., (d_n, t_n)> is the time data after decomposition.
2. The method for re-representing road network trajectory data according to claim 1, characterized in that decomposing the road network trajectory data into spatial data and time data comprises the following steps:
(1) map-match the input trajectory so that every sample point corresponds to a road;
(2) output the road number e_i corresponding to each sample point (x_i, y_i, t_i); for consecutive duplicates, keep only one;
(3) compute the distance travelled in the road network between every two adjacent sample points (x_{i-1}, y_{i-1}) and (x_i, y_i), denoted l_i, with l_1 = 0;
(4) for each sample point (x_i, y_i, t_i), output $d_i = \sum_{j=1}^{i} l_j$ as the d_i of the distance-time two-tuple (d_i, t_i), leaving the timestamp unchanged.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610817878.3A CN106407378B (en) | 2016-09-11 | 2016-09-11 | Method for re-representing road network track data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610817878.3A CN106407378B (en) | 2016-09-11 | 2016-09-11 | Method for re-representing road network track data |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106407378A true CN106407378A (en) | 2017-02-15 |
CN106407378B CN106407378B (en) | 2020-05-26 |
Family
ID=57999852
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610817878.3A Active CN106407378B (en) | 2016-09-11 | 2016-09-11 | Method for re-representing road network track data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106407378B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107463335A (en) * | 2017-08-02 | 2017-12-12 | 上海数烨数据科技有限公司 | A kind of location track big data high-efficiency storage method |
CN108022006A (en) * | 2017-11-24 | 2018-05-11 | 浙江大学 | The accessibility probability and Area generation method of a kind of data-driven |
CN108259463A (en) * | 2017-12-05 | 2018-07-06 | 北京掌行通信息技术有限公司 | A kind of positioning track merges compression method and system with driving path |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103162702A (en) * | 2013-03-05 | 2013-06-19 | 中山大学 | Vehicle running track reconstruction method based on multiple probability matching under sparse sampling |
US8744840B1 (en) * | 2013-10-11 | 2014-06-03 | Realfusion LLC | Method and system for n-dimentional, language agnostic, entity, meaning, place, time, and words mapping |
CN104318766A (en) * | 2014-10-22 | 2015-01-28 | 北京建筑大学 | Bus GPS track data road network matching method |
CN104330089A (en) * | 2014-11-17 | 2015-02-04 | 东北大学 | Map matching method by use of historical GPS data |
-
2016
- 2016-09-11 CN CN201610817878.3A patent/CN106407378B/en active Active
Non-Patent Citations (1)
Title |
---|
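Sun Jingyi et al.: "Research on a map matching algorithm based on floating car GPS trajectory point data", Technology Innovation and Application (《科技创新与应用》) * |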
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107463335A (en) * | 2017-08-02 | 2017-12-12 | 上海数烨数据科技有限公司 | A kind of location track big data high-efficiency storage method |
CN108022006A (en) * | 2017-11-24 | 2018-05-11 | 浙江大学 | The accessibility probability and Area generation method of a kind of data-driven |
CN108022006B (en) * | 2017-11-24 | 2020-07-24 | 浙江大学 | Data-driven accessibility probability and region generation method |
CN108259463A (en) * | 2017-12-05 | 2018-07-06 | 北京掌行通信息技术有限公司 | A kind of positioning track merges compression method and system with driving path |
CN108259463B (en) * | 2017-12-05 | 2020-08-14 | 北京掌行通信息技术有限公司 | Fusion compression method and system for positioning track and driving path |
Also Published As
Publication number | Publication date |
---|---|
CN106407378B (en) | 2020-05-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112015835B (en) | Geohash compressed map matching method | |
Han et al. | COMPRESS: A comprehensive framework of trajectory compression in road networks | |
Nibali et al. | Trajic: An effective compression system for trajectory data | |
KR100943676B1 (en) | Digital map shape vector encoding method and position information transfer method | |
CN101277117B (en) | Increment and continuous data compression method and equipment | |
CN100517979C (en) | Data compression and decompression method | |
Chen et al. | Compression of GPS trajectories | |
CN106407378A (en) | Method for expressing road network trajectory data again | |
CN113094346A (en) | Big data coding and decoding method and device based on time sequence | |
CN110473251B (en) | Self-defined range spatial data area statistical method based on grid spatial index | |
CN109033141B (en) | Space-time trajectory compression method based on trajectory dictionary | |
CN107247761A (en) | Track coding method based on bitmap | |
CN109286399A (en) | The compression method of GPS track data based on lzw algorithm | |
Rakhmanov et al. | Compression of GNSS data with the aim of speeding up communication to autonomous vehicles | |
CN101469989B (en) | Compression method for navigation data in mobile phone network navigation | |
CN104125475A (en) | Multi-dimensional quantum data compressing and uncompressing method and apparatus | |
Chen et al. | DAVT: An error-bounded vehicle trajectory data representation and compression framework | |
Ji et al. | A comparison of road-network-constrained trajectory compression methods | |
JP2007104543A (en) | Apparatus and method for compressing latitude/longitude data stream | |
CN106253909B (en) | Lossless compression method for road network track | |
Cánovas et al. | Practical compression for multi-alignment genomic files | |
CN101877005B (en) | Document mode-based GML compression method | |
CN116673947A (en) | Mobile robot travel path point prediction method | |
Abdelwahab et al. | LiDAR data compression challenges and difficulties | |
CN110411450A (en) | It is a kind of for compressing the map-matching method of track |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |