CN107247761B - Track coding method based on bitmap - Google Patents

Publication number
CN107247761B
CN107247761B CN201710402219.8A CN201710402219A CN107247761B CN 107247761 B CN107247761 B CN 107247761B CN 201710402219 A CN201710402219 A CN 201710402219A CN 107247761 B CN107247761 B CN 107247761B
Authority
CN
China
Prior art keywords
track
space
bitmap
grid
codes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710402219.8A
Other languages
Chinese (zh)
Other versions
CN107247761A (en)
Inventor
张蕊
周悦淇
刘克中
徐宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University of Technology WUT
Original Assignee
Wuhan University of Technology WUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University of Technology WUT
Priority to CN201710402219.8A
Publication of CN107247761A
Application granted
Publication of CN107247761B
Active legal status
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2237 Vectors, bitmaps or matrices

Abstract

The invention discloses a bitmap-based track coding method. Under a preset division precision, the space is divided into a number of approximately grid-shaped subspaces, each of which receives a unique identifier; the track is then mapped onto these grid identifiers and converted into bitmap format data. Databases use bitmap indexes for fields whose value domain has a small, fixed cardinality; such indexes are fast to compute and need little storage, and bitmap operations can be accelerated further through CPU instruction-set optimizations. The spatial positions contained in historical track data are fixed, but the value type of spatial coordinates is a floating-point number, which is why the coordinates are first mapped onto discrete grid identifiers.

Description

Track coding method based on bitmap
Technical Field
The invention relates to the field of computer big data processing, in particular to a track coding method based on bitmaps.
Background
Applications such as traffic management, weather monitoring, and mobile computing need to manage large amounts of spatio-temporal data. With the spread of mobile devices and the maturing of public supervision, mobile computing and location-based services have become a hot trend. The growth of positioning data and improvements in positioning accuracy provide a data foundation for research, and massive amounts of track data have accumulated: for example, the average daily volume of vehicle GPS data (sampling points) is on the order of tens of millions to billions, and the storage volume of track data reaches the PB level. Analysis and query of track data are therefore under heavy pressure, and a feasible and efficient query scheme is needed.
The management and querying of location data can be divided into two categories: real-time queries and historical data queries. Among historical track query schemes, one category is based on spatial databases. Existing relational databases can manage spatial data through spatial query plug-ins, but existing extension schemes such as PostGIS mainly target the computation and querying of spatial features and are best at converting problems into queries on planar geometric relationships. Data generated by mobile devices, however, carries time information as well as spatial information, and tracks cannot simply be converted into data described by geometric data types. The spatial indexes implemented in these extension schemes also lack optimization for spatio-temporal queries, such as range queries that combine spatial and temporal conditions, so support for spatio-temporal data is still incomplete.
Position sampling points generated by the GPS sensor of a mobile device or by a dedicated positioning scheme (such as the Automatic Identification System, AIS, of ships) have a temporal order and spatial characteristics that distinguish them from independent position points. They are suited to being managed with a track model for data mining, for analyzing regularities in transportation, and for applications such as route recommendation. Tracks have no corresponding data type in conventional database systems, although specially designed indexing techniques and systems for storing and querying track data exist, such as SETI and TrajStore. In addition, track data extraction and track analysis algorithms lack a unified supporting environment, which limits analysis applications on massive track data.
Hadoop and Spark are emerging distributed computing schemes for computing on and analyzing large-scale data, and analyzing large-scale spatio-temporal data on top of distributed computing is a current hot topic. For example, SpatialHadoop and GeoSpark analyze spatial data on a distributed computing platform, and their distributed indexing schemes support parallel computation and querying of spatial data types such as points and polygons. However, the query and computation requirements of track data differ from those of conventional spatial data, and the data structures these schemes support lack direct support for track queries. Query schemes for tracks implemented on distributed platforms, such as kNN queries, have also been studied, but they are often limited to solving individual track problems.
Disclosure of Invention
The present invention aims to overcome the above disadvantages and to provide a bitmap-based track coding method.
The invention relates to a track coding method based on bitmap, which comprises the following steps:
Step 1: under a preset division precision, divide the space into a number of approximately grid-shaped subspaces, each of which receives a unique identifier;
Step 2: split a track into consecutive track segments, traverse the track segments one by one, and for each segment determine the grid spaces obtained in step 1 with which it shares positions, thereby obtaining a group of grid identifier sequences corresponding to the track;
Step 3: remove duplicate items from the group of grid identifier sequences obtained in step 2;
Step 4: convert the group of grid coding sequences de-duplicated in step 3 into bitmap format data.
Step 2 specifically comprises the following steps:
Step 21: for one track segment, find all track points belonging to the segment; if the interval between track points exceeds the maximum distance set when the grid space was divided, insert supplementary points so that each new track segment is enclosed by the grid area;
Step 22: for each track point obtained in step 21, obtain the hash code of the point in space through the GeoHash algorithm;
Step 23: collect all hash codes calculated for the track segment in step 22 and convert them into globally unique, non-repeating integer identifiers.
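Steps 22 and 23 can be illustrated with a minimal sketch. It assumes the usual GeoHash bit interleaving (longitude refined on even bits, latitude on odd bits) and uses the interleaved integer directly as the grid identifier. The function name `geohash_cell_id` and the `bits` precision parameter are illustrative assumptions; the source does not specify how hash codes are converted to integers.

```python
def geohash_cell_id(lon: float, lat: float, bits: int = 20) -> int:
    """Return an integer grid identifier built by GeoHash-style bit interleaving.

    Assumption: the interleaved bit string itself serves as the globally
    unique integer identifier of the grid cell at the given precision.
    """
    lon_lo, lon_hi = -180.0, 180.0
    lat_lo, lat_hi = -90.0, 90.0
    cell = 0
    for i in range(bits):
        if i % 2 == 0:
            # Even bit positions halve the current longitude interval.
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                cell = (cell << 1) | 1
                lon_lo = mid
            else:
                cell = cell << 1
                lon_hi = mid
        else:
            # Odd bit positions halve the current latitude interval.
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                cell = (cell << 1) | 1
                lat_lo = mid
            else:
                cell = cell << 1
                lat_hi = mid
    return cell
```

Two sampling points that fall into the same cell at the chosen precision receive the same identifier, which is what allows duplicates to be removed in step 3.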
The track is a continuous sequence of samples $(x, y, t)$, where $(x, y)$ is a point in spatial coordinates and $t$ is the sampling time; $(x_i, y_i, t_i)$ means that at time $t_i$ the moving object is at position $(x_i, y_i)$. The track may be expressed as $Trajectory = [(x_1, y_1, t_1), \ldots, (x_i, y_i, t_i), \ldots, (x_n, y_n, t_n)]$ with $t_1 < t_i < t_n$. The relation between the part of the track's motion within a time range $[t_i, t_j]$ and the whole motion can be represented by a sub-track.
The track segment is formed by any two adjacent sampling points in the track; if the number of sampling points of the track is n, the track segment is $TS = Trajectory(i, i+1)$, $1 \le i < n$.
The sub-track is the part of a track's motion, composed of sampling points, within a defined time range $[t_i, t_j]$; if the number of sampling points of the track is n, the sub-track can be expressed as
$Trajectory(i, j) = [(x_i, y_i, t_i), (x_{i+1}, y_{i+1}, t_{i+1}), \ldots, (x_j, y_j, t_j)]$, $1 \le i < j \le n$.
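The definitions of track, track segment, and sub-track map onto simple list operations. The sketch below is only an illustration; the names `Sample`, `track_segments`, and `sub_track` are assumptions, and indices are 1-based to match the notation of the text.

```python
from typing import List, Tuple

# One sample (x, y, t): a spatial point and its sampling time.
Sample = Tuple[float, float, float]


def track_segments(track: List[Sample]) -> List[Tuple[Sample, Sample]]:
    """Return the n - 1 segments Trajectory(i, i+1) for 1 <= i < n."""
    return [(track[i], track[i + 1]) for i in range(len(track) - 1)]


def sub_track(track: List[Sample], i: int, j: int) -> List[Sample]:
    """Return Trajectory(i, j) with 1-based, inclusive indices, 1 <= i < j <= n."""
    if not (1 <= i < j <= len(track)):
        raise ValueError("indices must satisfy 1 <= i < j <= n")
    return track[i - 1:j]
```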
The relationship between tracks is judged from their bitmap format data as follows:
Step 61: for two spatial objects, here two tracks A and B, take the corresponding bitmap format data $GE_A$ and $GE_B$;
Step 62: perform a bitwise AND operation on $GE_A$ and $GE_B$;
Step 63: count the number of non-zero bits in the bitmap structure after the bit operation. Code overlap detection is an operation on track codes: the count obtained after the bitwise operation on $GE_B$ and $GE_A$ is the number of overlapping areas of the two tracks. When it is 0, the two codes do not coincide at all; when it is not 0, the two codes intersect;
Step 64: perform the overlap detection calculation on the codes of A and B to find the number of overlapping areas. If the result is smaller than the number of areas in the spatial code of B, the two codes cross; if it equals the number of areas in the spatial code of B, the two codes are in a containment relationship.
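Steps 61 to 64 can be summarized in formula form. The symbol $\operatorname{popcount}$ (the number of non-zero bits) and the shorthand $n_{AB}$, $n_B$ are notation introduced here for illustration, and $n_B > 0$ is assumed.

```latex
\[
n_{AB} = \operatorname{popcount}\bigl(GE_A \wedge GE_B\bigr), \qquad
n_B = \operatorname{popcount}\bigl(GE_B\bigr)
\]
\[
\text{relation}(A, B) =
\begin{cases}
\text{no overlap} & \text{if } n_{AB} = 0, \\
\text{cross}      & \text{if } 0 < n_{AB} < n_B, \\
\text{contain}    & \text{if } n_{AB} = n_B .
\end{cases}
\]
```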
The approximately grid-shaped subspaces are obtained, in a spatial coordinate system, by using the GeoHash algorithm to divide longitude and latitude into value intervals of uniform width.
The preset model used to insert supplementary points is either linear interpolation or a non-linear path model, for example one based on motion rules and a knowledge base of moving objects; in a road network, a map-matching algorithm can be used to obtain a result closer to the actual motion path.
Databases use bitmap indexes for fields whose value domain has a small, fixed cardinality; such indexes are fast to compute and need little storage, and bitmap operations can be accelerated further through CPU instruction-set optimizations. The spatial positions contained in historical track data are fixed, but the value type of spatial coordinates is a floating-point number.
Drawings
FIG. 1 is a schematic flow chart of the present invention.
Detailed Description
The invention is further described with reference to the following figures and examples.
Example (b): a track describes historical motion information of a moving object, the track is a continuous motion process but is described and stored in a space-time point through sampling points, and the sampling points of the track at least comprise time and space information.
A continuous sequence of (x, y, t), (x, y) being points in space coordinates and t being the sampling time, representing (x, y, t)i,yi,ti) At tiThe position of the moving object is (x) at the momenti,yi) Can be expressed as
Trajectory=[(x1,y1,t1),....,(xi,yi,ti),....(xn,yn,tn)](t1<ti<tn)。
A longer movement of the trajectory in a certain time range ti,tj]The relation between a part of the motion process of the track and the whole motion process can be represented by a sub-track.
Track is in a defined time range ti,tj]And the part of the motion process which is composed of sampling points and belongs to the track. The number of sampling points of a trace is n, and the sub-traces can be represented as
Trajctory(i,j)=[(xi,yi,ti),(xi+1,yi+1,ti+1),......,(xj,yj,tj)],1≤i<j≤n。
Existing systems typically use R-tree type indexes, which enclose any spatial data in a Minimum Bounding Rectangle (MBR) so that objects with irregular spatial shapes can be stored. A track, however, is approximately a polyline, and the area of its long, narrow shape is almost negligible relative to the area of its MBR. Tracks found by MBR overlap detection therefore have a high probability of not satisfying the query condition, which reduces query efficiency. For this reason the space is first divided into regions of smaller area. There are many ways to divide space, but in most of them the division depends strongly on the data distribution and the result lacks consistency. Examples are grid indexing based on fixed parameters and Quad-tree partitioning driven by dynamic data characteristics; with these methods the divided regions and their granularity may change as the spatial points change, so the representation of the same track cannot be unified. The invention divides space with the GeoHash algorithm. GeoHash handles spatial coordinates uniformly over latitude [-90, 90] and longitude [-180, 180], has simple parameters, and makes it easy to keep track representations consistent.
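The MBR argument can be made concrete with a small sketch. It assumes a uniform square grid with a hypothetical `cell` size instead of GeoHash cells, treats coordinates as planar, and approximates the cells on the path by dense sampling along each segment; all function names are illustrative.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def cells_in_mbr(points: List[Point], cell: float) -> int:
    """Count the grid cells covered by the minimum bounding rectangle."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    nx = math.floor(max(xs) / cell) - math.floor(min(xs) / cell) + 1
    ny = math.floor(max(ys) / cell) - math.floor(min(ys) / cell) + 1
    return nx * ny


def cells_on_path(points: List[Point], cell: float, samples: int = 1000) -> int:
    """Approximate the number of grid cells the polyline passes through."""
    touched = set()
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        for k in range(samples + 1):
            s = k / samples
            x = x0 + (x1 - x0) * s
            y = y0 + (y1 - y0) * s
            touched.add((math.floor(x / cell), math.floor(y / cell)))
    return len(touched)
```

For a diagonal segment from (0.5, 0.5) to (9.5, 9.5) with `cell = 1.0`, the MBR covers 100 cells while the path itself touches only 10, which illustrates why MBR overlap alone produces many false candidates.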
A Track Segment (TS) is formed by any two adjacent sampling points in the track. If the number of sampling points of the track is n, then $TS = Trajectory(i, i+1)$, $1 \le i < n$.
When the distance and the sampling time between two adjacent sampling points of a track exceed a critical condition, the track identification algorithm splits the sampling point sequence at that point and establishes independent tracks. When the sampling interval and the other conditions are within the critical condition, the invention handles the relationship between the distance between sampling points and the extent of the divided regions so that the sequence of region numbers the track passes through is adjacent and continuous.
The process of calculating the sequence of all regions that a track passes through is called the track coding algorithm.
Given a track, all of its sampling points are converted into a set of bits stored in a bitmap data structure, whose elements are the numbers of the spatial regions the track passes through:
$GE(trajectory) = [GID_i, GID_{i+1}, \ldots, GID_j]$.
The corresponding track coding algorithm is as follows:
1. Under a preset division precision, divide the space into a number of approximately grid-shaped subspaces, each of which receives a unique identifier.
2. For a track, traverse the track segments it contains and, for each, determine the grid spaces obtained in step 1 with which it shares positions, thereby obtaining a group of grid identifier sequences corresponding to the track.
3. Remove duplicate items from the group of grid identifier sequences obtained in step 2.
4. Convert the group of grid coding sequences processed in step 3 into bitmap format data.
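Steps 3 and 4 can be sketched as follows. The sketch assumes the grid identifiers are small non-negative integers and uses an arbitrary-precision Python integer as the bitmap, with bit `gid` set when the track passes through cell `gid`. The helper names are illustrative, and a production system would more likely use a compressed bitmap structure for large, sparse identifiers.

```python
from typing import Iterable, List


def ids_to_bitmap(grid_ids: Iterable[int]) -> int:
    """Deduplicate grid identifiers (step 3) and set one bit per cell (step 4)."""
    bitmap = 0
    for gid in set(grid_ids):
        bitmap |= 1 << gid
    return bitmap


def bitmap_to_ids(bitmap: int) -> List[int]:
    """Recover the sorted list of grid identifiers stored in the bitmap."""
    ids = []
    position = 0
    while bitmap:
        if bitmap & 1:
            ids.append(position)
        bitmap >>= 1
        position += 1
    return ids
```

Note that the bitmap keeps only the set of cells visited; the order in which the track passed through them is not preserved.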
Through the above steps the track is finally represented as bitmap format data. The second step is the key computation: it solves the problem of finding the divided region that corresponds to each sampling point. It proceeds as follows and returns the region to which each track point belongs as an integer code.
1. For one track segment, find all track points belonging to the segment, and insert supplementary points if the interval between track points exceeds the maximum distance set when the grid space was divided.
2. For each track point obtained in step 1, obtain the hash code of the point in space through the GeoHash algorithm.
3. Collect all hash codes calculated for the track segment in step 2 and convert them into globally unique, non-repeating integer identifiers.
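The insertion of supplementary points in step 1 can be sketched with linear interpolation, one of the preset models the text names. The sketch assumes planar Euclidean distance in the same units as the hypothetical `max_gap` threshold and interpolates the timestamp along with the position; the function name `densify` is an assumption.

```python
import math
from typing import List, Tuple

Sample = Tuple[float, float, float]  # (x, y, t)


def densify(p: Sample, q: Sample, max_gap: float) -> List[Sample]:
    """Return p, q, and evenly spaced supplementary points between them.

    Consecutive returned points are at most max_gap apart (up to rounding).
    """
    (x0, y0, t0), (x1, y1, t1) = p, q
    dist = math.hypot(x1 - x0, y1 - y0)
    # Number of sub-intervals needed so that each one is no longer than max_gap.
    n = max(1, math.ceil(dist / max_gap))
    return [
        (x0 + (x1 - x0) * k / n, y0 + (y1 - y0) * k / n, t0 + (t1 - t0) * k / n)
        for k in range(n + 1)
    ]
```

When the two sampling points are already closer than `max_gap`, the function returns just the two endpoints and no supplementary points are added.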
Generally, the spatial extent of each grid region is greater than the distance between track sampling points. Because the spacing between track points may nevertheless exceed the extent of a grid region, a preset model is used to fill the gap in a track segment by inserting supplementary points, so that the converted sequence is spatially continuous. Once a track has been converted, basic operations on the encoded data can be designed and gradually extended to advanced applications. The basic operations on the bitmap format data of track codes are: overlap, cross, and contain.
The relationship is judged from the bitmap format data of the tracks in the following steps:
1. For two spatial objects, here two tracks A and B, take the corresponding bitmap format data $GE_A$ and $GE_B$.
2. Perform a bitwise AND operation on $GE_A$ and $GE_B$.
3. Count the number of non-zero bits in the bitmap structure after the bit operation.
Code overlap detection is an operation on track codes: the count obtained after the bitwise operation on $GE_B$ and $GE_A$ is the number of overlapping regions of the two tracks. When it is 0, the codes do not overlap at all; when it is not 0, the two codes can be judged to intersect. The same test extends to the cross and contain relationships. By the definition of the spatial relationships, cross means that two spatial objects A and B intersect but part of B's region does not belong to the region A passes through; for example, the overlap detection on the codes of A and B yields a number of overlapping regions smaller than the number of regions in the spatial code of B. Contain means that two spatial objects intersect and one is completely contained in the other; for example, the overlap detection value of A and B equals the number of regions in the spatial code of B. Cross and contain can thus be decided by adding one more comparison on top of code overlap detection.
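The relation judgment can be sketched in a few lines, assuming each track code is held as a Python integer bitmap in which bit k is set when the track passes through grid cell k. The names `popcount` and `relation` and the returned labels are illustrative choices, not terms fixed by the source.

```python
def popcount(bitmap: int) -> int:
    """Count the non-zero bits of a bitmap held in a non-negative integer."""
    return bin(bitmap).count("1")


def relation(ge_a: int, ge_b: int) -> str:
    """Classify track B against track A from their bitmap codes."""
    overlap = popcount(ge_a & ge_b)  # number of grid cells shared by A and B
    if overlap == 0:
        return "no overlap"
    if overlap == popcount(ge_b):
        return "contain"  # every cell of B is also a cell of A
    return "cross"        # some, but not all, cells of B are shared with A
```

For example, with A covering cells 1 to 4 (`0b011110`), a track B covering cells 1 and 2 is contained, a track covering cells 1, 2, and 5 crosses A, and a track covering only cell 6 does not overlap it. The comparison is made on sets of cells, so it says nothing about the order or time at which the cells were visited.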
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.

Claims (3)

1. A bitmap-based track coding method, characterized by comprising the following steps:
Step 1: under a preset division precision, divide the space into a number of approximately grid-shaped subspaces, each of which receives a unique identifier;
Step 2: split a track into consecutive track segments, traverse the track segments one by one, and for each segment determine the grid spaces obtained in step 1 with which it shares positions, thereby obtaining a group of grid identifier sequences corresponding to the track;
Step 3: remove duplicate items from the group of grid identifier sequences obtained in step 2;
Step 4: convert the group of grid coding sequences de-duplicated in step 3 into bitmap format data; step 2 specifically comprises the following steps:
Step 21: for one track segment, find all track points belonging to the segment; if the interval between track points exceeds the maximum distance set when the grid space was divided, insert supplementary points so that each new track segment is enclosed by the grid area;
Step 22: for each track point obtained in step 21, obtain the hash code of the point in space through the GeoHash algorithm;
Step 23: collect all hash codes calculated for the track segment in step 22 and convert them into globally unique, non-repeating integer identifiers; the track is a continuous sequence of samples $(x, y, t)$, where $(x, y)$ is a point in spatial coordinates and $t$ is the sampling time, $(x_i, y_i, t_i)$ meaning that at time $t_i$ the moving object is at position $(x_i, y_i)$; the track may be expressed as $Trajectory = [(x_1, y_1, t_1), \ldots, (x_i, y_i, t_i), \ldots, (x_n, y_n, t_n)]$ with $t_1 < t_i < t_n$; the relation between the part of the track's motion within a time range $[t_i, t_j]$ and the whole motion can be represented by a sub-track; the relationship between tracks is judged from their bitmap format data as follows:
Step 61: for two spatial objects, here two tracks A and B, take the corresponding bitmap format data $GE_A$ and $GE_B$;
Step 62: perform a bitwise AND operation on $GE_A$ and $GE_B$;
Step 63: count the number of non-zero bits in the bitmap structure after the bit operation; code overlap detection is an operation on track codes, and the count obtained after the bitwise operation on $GE_B$ and $GE_A$ is the number of overlapping areas of the two tracks; when it is 0, the two codes do not overlap at all, and when it is not 0, the two codes intersect;
Step 64: perform the overlap detection calculation on the codes of A and B to find the number of overlapping areas; if the result is smaller than the number of areas in the spatial code of B, the two codes cross; if it equals the number of areas in the spatial code of B, the two codes are in a containment relationship.
2. The bitmap-based track coding method according to claim 1, wherein the track segment is formed by any two adjacent sampling points in the track, and if the number of sampling points of the track is n, the track segment is $TS = Trajectory(i, i+1)$, $1 \le i < n$.
3. The bitmap-based track coding method according to claim 2, wherein the sub-track is the part of a track's motion, composed of sampling points, within a defined time range $[t_i, t_j]$; the number of sampling points of the track is n, and the sub-track can be expressed as $Trajectory(i, j) = [(x_i, y_i, t_i), (x_{i+1}, y_{i+1}, t_{i+1}), \ldots, (x_j, y_j, t_j)]$, $1 \le i < j \le n$.
CN201710402219.8A 2017-06-01 2017-06-01 Track coding method based on bitmap Active CN107247761B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710402219.8A CN107247761B (en) 2017-06-01 2017-06-01 Track coding method based on bitmap

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710402219.8A CN107247761B (en) 2017-06-01 2017-06-01 Track coding method based on bitmap

Publications (2)

Publication Number Publication Date
CN107247761A (en) 2017-10-13
CN107247761B (en) 2021-10-15

Family

ID=60017796

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710402219.8A Active CN107247761B (en) 2017-06-01 2017-06-01 Track coding method based on bitmap

Country Status (1)

Country Link
CN (1) CN107247761B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110704388A (en) * 2018-07-10 2020-01-17 南京航空航天大学 SETI index implementation technology based on Berkeley DB storage management
CN109815993B (en) * 2019-01-03 2023-05-23 西北大学 GPS track-based regional feature extraction, database establishment and intersection identification method
WO2021077313A1 (en) * 2019-10-23 2021-04-29 Beijing Voyager Technology Co., Ltd. Systems and methods for autonomous driving
CN113688193A (en) * 2020-05-19 2021-11-23 北京京东振世信息技术有限公司 Track data storage and indexing method and device, electronic equipment and readable medium
CN113298954B (en) * 2021-04-13 2022-11-22 中国人民解放军战略支援部队信息工程大学 Method and device for determining and navigating movement track of object in multi-dimensional variable-granularity grid
CN114238384B (en) * 2022-02-24 2022-08-30 阿里云计算有限公司 Area positioning method, device, equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102646070A (en) * 2012-02-29 2012-08-22 武汉大学 Space-time trajectory data storage method based on area
RU2013139845A (en) * 2013-08-28 2015-03-10 Андрей Витальевич Фрейдман METHOD FOR TRANSFORMING AN IMAGE IN A SOUND IMAGE
CN104520732A (en) * 2012-02-10 2015-04-15 Isis创新有限公司 Method of locating sensor and related apparatus
US9128973B1 (en) * 2011-09-29 2015-09-08 Emc Corporation Method and system for tracking re-sizing and re-creation of volumes using modification time
CN105258704A (en) * 2014-06-16 2016-01-20 中国科学院沈阳自动化研究所 Multi-scale space-time hot point path detection method based on rapid road network modeling

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10387457B2 (en) * 2014-06-17 2019-08-20 Sap Se Grid-based analysis of geospatial trajectories

Also Published As

Publication number Publication date
CN107247761A (en) 2017-10-13

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant