CN113656670A - Flight data-oriented space-time trajectory data management analysis method and device - Google Patents
Flight data-oriented space-time trajectory data management analysis method and device
- Publication number
- CN113656670A, CN202110965172.2A
- Authority
- CN
- China
- Prior art keywords
- target
- data
- track
- point
- polygon
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/907—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/909—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using geographical or spatial information, e.g. location
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2462—Approximate or statistical queries
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Probability & Statistics with Applications (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Fuzzy Systems (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Mathematical Physics (AREA)
- Library & Information Science (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a flight data-oriented spatio-temporal trajectory data management and analysis method and device, which address the problem that existing systems do not adequately support spatio-temporal range queries. The method comprises: acquiring spatio-temporal trajectory data and establishing a spatial index comprising a plurality of data blocks; and using a spatio-temporal polygon range query algorithm to query the target data blocks whose track points all belong to a target polygon region within a target time period. The spatio-temporal polygon range query algorithm comprises: calculating the minimum bounding rectangle of the target polygon region; screening, from the plurality of data blocks, the data blocks that intersect both the minimum bounding rectangle and the target polygon region, and adding them to a candidate result set; and determining, as target data blocks, the data blocks in the candidate set whose track points are located in the target polygon region and whose track times fall within the target time period.
Description
Technical Field
The invention belongs to the technical field of data management and analysis, and particularly relates to a flight data-oriented space-time trajectory data management and analysis method and device.
Background
In recent years, with the wide adoption of positioning technology, sources such as Internet-of-Things sensors, GPS wearable devices, smartphones and satellites continuously generate massive amounts of spatio-temporal data. This volume exceeds the storage, processing and analysis capabilities of conventional systems. Spatio-temporal data is highly correlated, large in volume, structurally complex, diverse and low in value density, which makes it difficult to store, manage and analyze efficiently. Traditional relational databases are limited by their single-machine architecture and cannot cope with massive data, while distributed query processing frameworks such as Spark and Hadoop lack effective spatio-temporal indexes and spatio-temporal analysis algorithms and therefore struggle to process spatio-temporal data efficiently.
To address these limitations and meet the performance requirements of data management and analysis, researchers have proposed extensions to platforms such as Spark, Hadoop and NoSQL stores that embed spatial indexes and spatial query algorithms and parallelize the spatial query algorithms with MapReduce tasks to increase processing speed. However, these approaches do not consider the time dimension and do not support spatio-temporal indexes, so a spatio-temporal range query scans a large amount of irrelevant data and query efficiency cannot be guaranteed.
As for range query algorithms, existing research covers only coarse-grained rectangular range queries and does not consider polygon range queries under spatio-temporal constraints. In practice, the query range is a set of non-overlapping complex polygons, typically numbering in the thousands, while the input is a large data set containing hundreds of millions or even billions of spatial points. Performing data analysis or aggregation tasks on these points incurs additional CPU cost, and the system may not scale well as the spatio-temporal data grows.
The information disclosed in this background section is only for enhancement of understanding of the general background of the invention and should not be taken as an acknowledgement or any form of suggestion that this information forms the prior art already known to a person skilled in the art.
Disclosure of Invention
The invention aims to provide a flight data-oriented spatio-temporal trajectory data management analysis method, which is used for solving the problem that the existing system cannot well support spatio-temporal data range query.
In order to achieve the above object, the present invention provides a flight data-oriented spatiotemporal trajectory data management analysis method, which comprises:
acquiring spatio-temporal trajectory data and establishing a spatial index, wherein the spatial index comprises a plurality of data blocks; and
querying, by using a spatio-temporal polygon range query algorithm, the target data blocks whose track points all belong to a target polygon region within a target time period;
wherein the spatio-temporal polygon range query algorithm comprises:
calculating a minimum bounding rectangle of the target polygon region;
screening, from the plurality of data blocks, the data blocks that intersect both the minimum bounding rectangle and the target polygon region, and adding them to a candidate result set;
and determining, as target data blocks, the data blocks in the candidate set whose track points are located in the target polygon region and whose track times fall within the target time period.
In one embodiment, the method further comprises counting target airspace flow based on the spatio-temporal polygon range query algorithm; the method specifically comprises the following steps:
dividing a target airspace sector into a plurality of regular cylinders;
based on the space-time polygon range query algorithm and by using the upper and lower height boundaries of each cylinder as constraint conditions, filtering track points in each cylinder to add to a counting queue;
and carrying out data deduplication on the track points in the counting queue to obtain target airspace flow.
In an embodiment, data deduplication is performed on trace points in the count queue to obtain a target airspace flow, which specifically includes:
defining track id attributes and the affiliated cylinder Vid for track points in the counting queue;
when the spatio-temporal polygon range query algorithm is executed for each cylinder, if a track point lies in that cylinder, setting the Vid of the track point to the Vid of the current cylinder to form a composite key <Trajid, Vid>;
and carrying out data deduplication on the track points of the counting queue by using the composite key recorded in each row.
In one embodiment, the method further comprises calculating k trajectory points adjacent to the target query point based on a KNN query algorithm, thereby capturing spatiotemporal trajectory data correlations; the method specifically comprises the following steps:
searching a data block which is intersected with the target query point in the plurality of data blocks, and adding the data block to a first priority queue;
calculating the distance $dist_p$ between each data block in the first priority queue and the target query point;
calculating the k-th shortest distance $dist_k$ between the track points of each data block in the first priority queue and the target query point;
determining the data blocks that satisfy $dist_p < dist_k$ as candidate data blocks;
adding to a second priority queue the track points in the candidate data blocks whose track time falls within the target time period and whose distance $dist(i, q)$ to the target query point is smaller than $dist_k$;
generating a test range which takes the target query point as a center and takes the distance from the kth neighbor track point as a radius;
and determining k track points close to the target query point based on the test result of the test range on the track points in the second priority queue.
In an embodiment, determining the k track points nearest to the target query point based on the result of testing the track points in the second priority queue against the test range specifically includes:
judging whether the test range intersects other data blocks;
if not, determining the k track points within the test range as the k track points nearest to the target query point;
if so, calculating the distance dist between the target query point and each track point in the data blocks that intersect the test range, and adding each track point whose dist is smaller than the $dist_k$ of the second priority queue to the second priority queue as a replacement.
In an embodiment, querying, by using the spatio-temporal polygon range query algorithm, the target data blocks whose track points all belong to the target polygon region within the target time period specifically includes:
storing the abscissa of each vertex of the target polygon region in an array X[] and the ordinate in an array Y[], and obtaining the maximum value $x_{max}$ and minimum value $x_{min}$ of X[] and the maximum value $y_{max}$ and minimum value $y_{min}$ of Y[];
judging whether the track point lies within the minimum bounding rectangle $(x_{min}, y_{min})$-$(x_{max}, y_{max})$ of the target polygon region;
if not, determining that the track point does not belong to the target polygon region;
if so, casting a ray from the track point to intersect the boundary of the target polygon region, and determining whether the track point belongs to the target polygon region based on the number of intersection points.
In an embodiment, acquiring spatio-temporal trajectory data and establishing a spatial index comprising a plurality of data blocks specifically includes:
defining a spatio-temporal trajectory data set $p = \{p_1, p_2, \ldots, p_n\}$, wherein each track point $p_1, p_2, \ldots, p_n$ is expressed as (lng, lat, t), lng denotes longitude, lat denotes latitude, and t denotes the time attribute;
abstracting the longitude lng and latitude lat of each track point into the coordinates <x, y> of a point on a two-dimensional plane, and defining the Euclidean distance between two points $(x_1, y_1)$ and $(x_2, y_2)$ on the plane as $\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$.
The invention also provides a flight data-oriented space-time trajectory data management and analysis device, which comprises:
an index establishing module, used for acquiring spatio-temporal trajectory data and establishing a spatial index, wherein the spatial index comprises a plurality of data blocks; and
a query module, used for querying, by using a spatio-temporal polygon range query algorithm, the target data blocks whose track points all belong to a target polygon region within a target time period;
wherein the spatio-temporal polygon range query algorithm comprises:
calculating a minimum bounding rectangle of the target polygon region;
screening, from the plurality of data blocks, the data blocks that intersect both the minimum bounding rectangle and the target polygon region, and adding them to a candidate result set;
and determining, as target data blocks, the data blocks in the candidate set whose track points are located in the target polygon region and whose track times fall within the target time period.
The present invention also provides a computing device comprising:
at least one processor; and
a memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform the method as described above.
The invention also provides a machine-readable storage medium having stored thereon executable instructions that, when executed, cause the machine to perform the method as described above.
Compared with the prior art, the flight data-oriented spatio-temporal trajectory data management and analysis method of the invention uses the proposed spatio-temporal polygon range query algorithm to quickly query the target data blocks belonging to the target polygon region in the constructed spatial index. This solves the problem that existing spatio-temporal data platforms do not support spatio-temporal data structures and improves the efficiency and precision of spatio-temporal range queries. In addition, a spatio-temporal KNN query algorithm is provided, which enriches the types of spatio-temporal analysis operations and improves spatio-temporal query efficiency. A strategy for counting airspace flow is also provided, which improves query efficiency and functional extensibility.
Drawings
FIG. 1 is a flow diagram of one embodiment of a flight data oriented spatiotemporal trajectory data management analysis method according to the present invention;
FIG. 2 is a system architecture diagram of an application scenario of a flight data oriented spatiotemporal trajectory data management analysis method according to the present invention;
FIG. 3 is a flow diagram of a spatiotemporal polygon range query algorithm in an embodiment of a flight data-oriented spatiotemporal trajectory data management analysis method according to the present invention;
FIG. 4 is a flow diagram of a spatiotemporal KNN query algorithm in an embodiment of a flight data-oriented spatiotemporal trajectory data management analysis method according to the present invention;
FIG. 5 is a flow chart of statistical airspace flow in one embodiment of a flight data-oriented spatiotemporal trajectory data management analysis method according to the present invention;
FIG. 6 is a block diagram of an embodiment of a flight data oriented spatiotemporal trajectory data management analysis device according to the present invention;
FIG. 7 is a hardware block diagram of an embodiment of a computing device for managing and analyzing flight data-oriented spatiotemporal trajectory data according to the present invention.
Detailed Description
The following detailed description of the present invention is provided in conjunction with the accompanying drawings, but it should be understood that the scope of the present invention is not limited to the specific embodiments.
Throughout the specification and claims, unless explicitly stated otherwise, the word "comprise", or variations such as "comprises" or "comprising", will be understood to imply the inclusion of a stated element or component but not the exclusion of any other element or component.
Referring to fig. 1 and fig. 2, a specific embodiment of the method for managing and analyzing flight data-oriented space-time trajectory data according to the present application is described. In this embodiment, the method comprises:
S11, acquiring spatio-temporal trajectory data and establishing a spatial index.
First, a spatio-temporal trajectory data set $p = \{p_1, p_2, \ldots, p_n\}$ is defined, where each track point $p_1, p_2, \ldots, p_n$ is expressed as (lng, lat, t): lng denotes longitude, lat denotes latitude, and t denotes the time attribute. In this embodiment, distances between objects in space are computed in Euclidean space: the longitude lng and latitude lat of each track point are abstracted into the coordinates <x, y> of a point on a two-dimensional plane, and the Euclidean distance between two points $(x_1, y_1)$ and $(x_2, y_2)$ on the plane is defined as $\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$. This completes the reading and storing of the spatio-temporal trajectory data.
Then, a spatial index is built over the spatio-temporal trajectory data, and each indexed data block is denoted partition. The index type is denoted index, the input data input, and the output data output. Different index types are built in different ways.
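The track point representation and planar distance described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the class name `TrackPoint`, the function name `euclidean`, and the choice of a numeric time value are assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrackPoint:
    """One track point (lng, lat, t); the class name is hypothetical."""
    lng: float  # longitude, treated as the planar x coordinate
    lat: float  # latitude, treated as the planar y coordinate
    t: float    # time attribute (numeric timestamp assumed for illustration)


def euclidean(a: TrackPoint, b: TrackPoint) -> float:
    """Planar Euclidean distance between two track points; time is ignored."""
    return math.hypot(a.lng - b.lng, a.lat - b.lat)


p1 = TrackPoint(0.0, 0.0, 0.0)
p2 = TrackPoint(3.0, 4.0, 10.0)
print(euclidean(p1, p2))
```

Treating longitude and latitude as planar coordinates follows the text; it ignores the curvature of the Earth, which may matter over long distances.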
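As one possible illustration of how track points might be grouped into partitions, the sketch below builds a uniform grid index. The text does not specify how any index type is constructed, so the function names `build_grid_index` and `cell_bounds` and the `cell_size` parameter are assumptions for this example only.

```python
import math
from collections import defaultdict


def build_grid_index(points, cell_size):
    """Group (lng, lat, t) tuples into square grid cells of side cell_size.

    Returns a dict mapping a cell key (ix, iy) to the list of points in
    that cell; each cell plays the role of one partition.
    """
    partitions = defaultdict(list)
    for lng, lat, t in points:
        key = (math.floor(lng / cell_size), math.floor(lat / cell_size))
        partitions[key].append((lng, lat, t))
    return dict(partitions)


def cell_bounds(key, cell_size):
    """Rectangle (xmin, ymin, xmax, ymax) covered by a grid cell."""
    ix, iy = key
    return (ix * cell_size, iy * cell_size,
            (ix + 1) * cell_size, (iy + 1) * cell_size)


points = [(0.5, 0.5, 1), (1.5, 0.2, 2), (0.9, 0.1, 3)]
index = build_grid_index(points, 1.0)
```

A grid is used here only because it is the simplest partitioning to show; tree-based indexes would partition the data adaptively.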
S12, querying, by using a spatio-temporal polygon range query algorithm, the target data blocks whose track points all belong to the target polygon region within the target time period.
Referring to fig. 3, the spatio-temporal polygon range query algorithm specifically includes:
S121, calculating the minimum bounding rectangle of the target polygon region.
Here, the spatio-temporal trajectory data set $p = \{p_1, p_2, \ldots, p_n\}$ is first defined as a set of spatio-temporal trajectory objects in $E^d$ (d-dimensional Euclidean space). Each track point i in the data set is expressed as i = (lng, lat, t), where lng denotes longitude, lat denotes latitude, and t denotes the time attribute.
Next, the query range parameters are set, including a time-dimension query range and a space-dimension query range. The time query range is expressed as $T = \tau \in (\tau_{begin}, \tau_{end})$, where $\tau_{begin}$ denotes the start time and $\tau_{end}$ denotes the end time. The polygon query range q is bounded by a set of spatial points, represented by a positive integer n, a set of X-axis coordinates $X = \{x_1, x_2, \ldots, x_n\}$ and a set of Y-axis coordinates $Y = \{y_1, y_2, \ldots, y_n\}$. The minimum bounding rectangle mbr of the polygon is then computed.
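The query parameters of step S121 can be sketched as below: the minimum bounding rectangle is taken from the extremes of the vertex coordinate sets X and Y. The function names are hypothetical, and the time check uses a closed interval, which is an assumption (the text writes the range with parentheses but later uses "≤" in the filter condition).

```python
def polygon_mbr(xs, ys):
    """Minimum bounding rectangle (xmin, ymin, xmax, ymax) of a polygon
    given by its vertex coordinate lists X and Y."""
    if len(xs) != len(ys) or len(xs) < 3:
        raise ValueError("a polygon needs at least 3 vertices with matching X and Y")
    return (min(xs), min(ys), max(xs), max(ys))


def in_time_range(t, t_begin, t_end):
    """Closed-interval time check (closedness is an assumption)."""
    return t_begin <= t <= t_end


mbr = polygon_mbr([0, 4, 2], [0, 1, 5])
print(mbr)
```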
S122, screening, from the plurality of data blocks, the data blocks that intersect both the minimum bounding rectangle and the target polygon region, and adding them to a candidate result set.
For each index data block partition obtained in the above step, it is first judged whether the partition intersects the minimum bounding rectangle mbr. If it does, it is further judged whether the partition intersects the polygon query range q; if so, the current partition is added to the candidate result set. Otherwise, the next partition is examined, until all partitions have been screened.
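The two-stage screening of step S122 can be sketched as follows, assuming each partition is described by an axis-aligned rectangle (xmin, ymin, xmax, ymax) and the polygon by a list of (x, y) vertices. All function names are hypothetical. The rectangle test against mbr is a cheap prefilter; the rectangle-polygon test then checks for a contained vertex, a contained corner, or a pair of intersecting edges.

```python
def rects_intersect(a, b):
    """Axis-aligned rectangles as (xmin, ymin, xmax, ymax); touching counts."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def point_in_polygon(x, y, poly):
    """Ray-casting test; poly is a list of (x, y) vertices."""
    inside = False
    j = len(poly) - 1
    for i in range(len(poly)):
        (xi, yi), (xj, yj) = poly[i], poly[j]
        if (yi > y) != (yj > y) and x < (xj - xi) * (y - yi) / (yj - yi) + xi:
            inside = not inside
        j = i
    return inside


def _orient(a, b, c):
    """Signed area of triangle a, b, c (zero when collinear)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def _on_segment(a, b, c):
    """For c collinear with a-b: does c lie within the extent of a-b?"""
    return (min(a[0], b[0]) <= c[0] <= max(a[0], b[0])
            and min(a[1], b[1]) <= c[1] <= max(a[1], b[1]))


def segments_intersect(p1, p2, p3, p4):
    d1, d2 = _orient(p3, p4, p1), _orient(p3, p4, p2)
    d3, d4 = _orient(p1, p2, p3), _orient(p1, p2, p4)
    if d1 * d2 < 0 and d3 * d4 < 0:
        return True  # the segments properly cross
    return ((d1 == 0 and _on_segment(p3, p4, p1))
            or (d2 == 0 and _on_segment(p3, p4, p2))
            or (d3 == 0 and _on_segment(p1, p2, p3))
            or (d4 == 0 and _on_segment(p1, p2, p4)))


def rect_intersects_polygon(rect, poly):
    xmin, ymin, xmax, ymax = rect
    corners = [(xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax)]
    if any(xmin <= x <= xmax and ymin <= y <= ymax for x, y in poly):
        return True  # a polygon vertex lies in the rectangle
    if any(point_in_polygon(cx, cy, poly) for cx, cy in corners):
        return True  # a rectangle corner lies in the polygon
    for i in range(4):
        for j in range(len(poly)):
            if segments_intersect(corners[i], corners[(i + 1) % 4],
                                  poly[j], poly[(j + 1) % len(poly)]):
                return True
    return False


def screen_partitions(partitions, poly):
    """partitions: dict of name -> rectangle. Returns candidate names."""
    xs, ys = [p[0] for p in poly], [p[1] for p in poly]
    mbr = (min(xs), min(ys), max(xs), max(ys))
    return [name for name, rect in partitions.items()
            if rects_intersect(rect, mbr) and rect_intersects_polygon(rect, poly)]


triangle = [(0, 0), (4, 0), (0, 4)]
parts = {"a": (0, 0, 1, 1), "b": (3, 3, 4, 4),
         "c": (5, 5, 6, 6), "d": (1.5, 1.5, 2.5, 2.5)}
candidates = screen_partitions(parts, triangle)
```

In the usage example, partition "b" lies inside the bounding rectangle of the triangle but outside the triangle itself, which is the case the second test is meant to discard.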
S123, determining, as target data blocks, the data blocks in the candidate set whose track points are located in the target polygon region and whose track times fall within the target time period.
For each partition in the candidate set, the PNPoly algorithm may be called to judge whether each track point i lies in the target polygon region, and the time query range is checked at the same time; if both conditions hold, the track point is returned as output.
Specifically, the abscissa of each vertex of the target polygon region is stored in an array X[] and the ordinate in an array Y[], and the maximum value $x_{max}$ and minimum value $x_{min}$ of X[] and the maximum value $y_{max}$ and minimum value $y_{min}$ of Y[] are obtained.
Let $(test_x, test_y)$ denote the track point being queried. It is judged whether $(test_x, test_y)$ lies within the minimum bounding rectangle $(x_{min}, y_{min})$-$(x_{max}, y_{max})$ of the target polygon region. If not, the track point does not belong to the target polygon region. If so, a ray is cast from the track point to intersect the boundary of the target polygon region, and whether the track point belongs to the region is determined from the number of intersection points.
If the ray cast from the track point has an odd number of intersection points with the boundary of the target polygon region, the track point lies inside the region; if the number is even, the track point lies outside.
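The refinement of step S123 can be sketched with the well-known PNPoly ray-casting test, preceded by the bounding-rectangle rejection described above. The function `refine` and its tuple layout (lng, lat, t) are assumptions for illustration; points exactly on the polygon boundary may be classified either way by this test.

```python
def pnpoly(xs, ys, test_x, test_y):
    """Return True if (test_x, test_y) lies inside the polygon whose
    vertices are given by the coordinate arrays xs and ys."""
    # Quick rejection against the minimum bounding rectangle.
    if not (min(xs) <= test_x <= max(xs) and min(ys) <= test_y <= max(ys)):
        return False
    inside = False
    j = len(xs) - 1
    for i in range(len(xs)):
        # Toggle on every edge crossed by a horizontal ray from the point:
        # an odd number of crossings means the point is inside.
        if (ys[i] > test_y) != (ys[j] > test_y) and \
                test_x < (xs[j] - xs[i]) * (test_y - ys[i]) / (ys[j] - ys[i]) + xs[i]:
            inside = not inside
        j = i
    return inside


def refine(points, xs, ys, t_begin, t_end):
    """Keep the (lng, lat, t) points inside the polygon and the time range."""
    return [p for p in points
            if t_begin <= p[2] <= t_end and pnpoly(xs, ys, p[0], p[1])]


square_x, square_y = [0, 4, 4, 0], [0, 0, 4, 4]
hits = refine([(2, 2, 10), (2, 2, 99), (5, 5, 10)], square_x, square_y, 0, 50)
```

Only the first point survives in the usage example: the second falls outside the time range and the third outside the polygon.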
In some embodiments, the method for managing and analyzing flight data-oriented space-time trajectory data further includes:
S13, calculating the k track points nearest to the target query point based on the KNN query algorithm, thereby capturing correlations in the spatio-temporal trajectory data.
Referring to fig. 4, the process first performs initialization. Let q denote the target query point and k the number of neighboring points around q to be queried. The time query range is expressed as $T = \tau \in (\tau_{begin}, \tau_{end})$, where $\tau_{begin}$ denotes the start time and $\tau_{end}$ denotes the end time. Two priority queues are defined: a first priority queue PQ and a second priority queue RQ. PQ holds the set of all partitions that intersect the query point q; because there may be more than one such partition, the partitions in the queue may be sorted by distance. Each partition in PQ is denoted p, and RQ holds the candidate track points that satisfy the KNN condition.
Second, an initial result is generated. The data blocks that intersect the target query point are found among the plurality of data blocks and added to the first priority queue PQ. The distance $dist_p$ between each data block in PQ and the target query point is calculated, together with the k-th shortest distance $dist_k$ (initialized to 0) between the track points of each data block in PQ and the target query point. The data blocks that satisfy $dist_p < dist_k$ are then determined as candidate data blocks; the candidate data blocks satisfying this condition contain the initial result.
The first priority queue PQ may order the data block partitions according to a set rule. In one embodiment, for example, the data blocks are sorted in descending order of $dist_p$.
Then, time range matching is performed on each track point i in the candidate data blocks: the track points whose track time falls within the target time period and whose distance $dist(i, q)$ to the target query point is smaller than $dist_k$ are added to the second priority queue RQ, which thus holds the initial result.
The distance formula for calculating the data block partition and the target query point q is as follows:
where p denotes each partition in the PQ queue, i denotes each track point in p, q denotes the target query point, and $dist(i, q)$ denotes the distance from track point i to the query point q.
Similarly, the second priority queue RQ may order its entries according to a set rule. In one embodiment, for example, the entries are sorted in descending order of $dist_k$.
Finally, the initial result is tested to determine the final result. Specifically, a test range C (for example, a circle) centered on the target query point, with the distance to the k-th neighboring track point as its radius, may be generated, and the k track points nearest to the target query point are determined based on the result of testing the track points in the second priority queue against this range.
Specifically, it is judged whether the test range C intersects other data blocks. If not, the k track points within the test range are determined as the k track points nearest to the target query point. If so, the distance dist between the target query point and each track point in the data blocks intersecting the test range is calculated, and each track point whose dist is smaller than the $dist_k$ of the second priority queue RQ is added to RQ as a replacement.
In implementation, the final result can be generated through m iterations under the time constraint. The size of m depends on the track point distribution of the original data set and the location of the target query point q. In each iteration, a data block partition is selected, all track points in the partition are traversed, and the distance dist of each to the target query point q is calculated. For a selected track point i, its dist is compared with the $dist_k$ of the tail element of the second priority queue RQ (whose elements are sorted in descending order of $dist_k$). If $dist \le dist_k$, i is a better match for the target result, and i and its dist are written to RQ; otherwise i is discarded and the search continues with the next track point. The elements in RQ are updated according to the queuing rule of the priority queue. Through these iterations, the k nearest results (track points) are finally generated.
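The sketch below illustrates a time-constrained KNN search over partitions. It is a simplified best-first variant and not the exact procedure of the embodiment: partitions are visited in ascending order of their distance to q, a bounded heap stands in for the second priority queue RQ, and the search stops once no remaining partition can intersect the circle of radius $dist_k$ around q. The partition layout `(rect, points)` and all function names are assumptions made for this example.

```python
import heapq
import math


def rect_dist(rect, q):
    """Minimum distance from point q to rectangle (xmin, ymin, xmax, ymax)."""
    dx = max(rect[0] - q[0], 0.0, q[0] - rect[2])
    dy = max(rect[1] - q[1], 0.0, q[1] - rect[3])
    return math.hypot(dx, dy)


def st_knn(partitions, q, k, t_begin, t_end):
    """partitions: list of (rect, points), points being (lng, lat, t) tuples.
    Returns up to k points within the time range, nearest first."""
    # PQ: partitions ordered by their distance to q (nearest first).
    pq = [(rect_dist(rect, q), idx) for idx, (rect, _) in enumerate(partitions)]
    heapq.heapify(pq)
    rq = []  # max-heap via negated distance: the k best points found so far
    while pq:
        dist_p, idx = heapq.heappop(pq)
        # Test range: circle of radius dist_k around q. If the nearest
        # remaining partition lies outside it, the result is final.
        if len(rq) == k and dist_p > -rq[0][0]:
            break
        for p in partitions[idx][1]:
            if not (t_begin <= p[2] <= t_end):
                continue  # time constraint
            d = math.hypot(p[0] - q[0], p[1] - q[1])
            if len(rq) < k:
                heapq.heappush(rq, (-d, p))
            elif d < -rq[0][0]:
                heapq.heapreplace(rq, (-d, p))  # replace the current k-th point
    return [p for _, p in sorted(rq, reverse=True)]


parts = [
    ((0, 0, 1, 1), [(0.1, 0.1, 5), (0.9, 0.9, 5), (0.2, 0.2, 500)]),
    ((1, 0, 2, 1), [(1.05, 0.1, 5), (1.9, 0.9, 5)]),
]
nearest = st_knn(parts, (0.95, 0.1), 2, 0, 10)
```

In the usage example the query point sits near a partition border, so the nearest point is found in the neighboring partition; this is the situation the test range is meant to catch.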
S14, counting the target airspace flow based on the spatio-temporal polygon range query algorithm.
Referring to fig. 5, the airspace sector parameters are obtained first. The airspace flow to be counted is the number of all aircraft passing through a sector. The target airspace sector is divided into a plurality of regular cylinders, the cross-section of each cylinder being a polygon formed by a plurality of points. The spatial query range of the three-dimensional sector is denoted Sector, which is structured as follows:
Sector:<Sid,List<Volume>>
Volume:<Vid,lower,upper,List<Point>>
Point:<Pid,lng,lat>
where Sid denotes the unique id of the Sector, and List<Volume> indicates that the Sector is composed of a series of cylinders. A cylinder is defined by Volume: Vid denotes its unique id, lower its lower height bound, upper its upper height bound, and List<Point> its polygonal cross-section, each Point being a vertex of the polygon. In addition, because a cylinder is three-dimensional, a height attribute alt is introduced for each track point to represent the altitude of the track in the airspace.
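The Sector, Volume and Point structures above map naturally onto small record types. The sketch below uses Python dataclasses; the lowercase field names and the integer id types are assumptions, since the text only names the attributes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Point:
    pid: int    # Pid: id of the vertex
    lng: float  # longitude
    lat: float  # latitude


@dataclass
class Volume:
    vid: int      # Vid: unique id of the cylinder
    lower: float  # lower height bound
    upper: float  # upper height bound
    points: List[Point] = field(default_factory=list)  # polygonal cross-section


@dataclass
class Sector:
    sid: int  # Sid: unique id of the sector
    volumes: List[Volume] = field(default_factory=list)


section = [Point(1, 0.0, 0.0), Point(2, 1.0, 0.0), Point(3, 0.0, 1.0)]
sector = Sector(sid=7, volumes=[Volume(1, 0.0, 3000.0, section),
                                Volume(2, 3000.0, 6000.0, section)])
```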
Second, a counting queue Count is defined to record all track points of aircraft passing through the sector. Based on the spatio-temporal polygon range query algorithm, with the upper and lower height bounds of each cylinder as constraints, the flow passing through a single cylinder is filtered out, namely:
i::PNPoly and $\tau_{start} \le i.t \le \tau_{end}$ and $lower \le i.alt \le upper$
where i::PNPoly indicates that track point i lies in the polygon, $\tau_{start} \le i.t \le \tau_{end}$ indicates that the time of the track point satisfies the time constraint, and $lower \le i.alt \le upper$ indicates that the altitude of the track point lies between the lower and upper bounds of the cylinder. Whether a track point passes through the airspace sector is judged on these conditions: if so, it is added to the counting queue; otherwise it is discarded. This step is executed in a loop until all cylinders of the sector have been traversed.
Finally, data deduplication is performed on the track points in the counting queue to obtain the target airspace flow. The sector flow is given by the number of records res remaining after the counting queue is deduplicated. Taking one aircraft as an example, its track points may appear in more than one cylinder, so duplicate track points must be removed to ensure that an aircraft passing through a given sector is counted only once; res is then returned.
In the deduplication process, a track id attribute (Trajid) and an owning-cylinder attribute Vid (initialized to 0) are defined for each track point in the counting queue. When the spatio-temporal polygon range query algorithm is executed for each cylinder, if a track point i lies in the cylinder Volume, its Vid is set to the Vid of the current cylinder, forming the composite key <Trajid, Vid>. The track points in the counting queue are then deduplicated using the composite key of each record.
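The flow-counting steps can be sketched as below, assuming track points and cylinders are plain dictionaries with the keys shown. The filter combines the polygon test, the time range and the height bounds, and each match is recorded under the composite key (Trajid, Vid). How the deduplicated keys are reduced to a single flow figure is not fully specified in the text; counting the distinct Trajid values among them is one reading of "counted only once" and is an assumption of this sketch.

```python
def pnpoly(poly, x, y):
    """Ray-casting point-in-polygon test; poly is a list of (x, y) vertices."""
    inside = False
    j = len(poly) - 1
    for i in range(len(poly)):
        (xi, yi), (xj, yj) = poly[i], poly[j]
        if (yi > y) != (yj > y) and x < (xj - xi) * (y - yi) / (yj - yi) + xi:
            inside = not inside
        j = i
    return inside


def sector_flow(track_points, volumes, t_start, t_end):
    """Return (deduplicated composite keys, number of distinct trajectories).

    track_points: dicts with keys traj_id, lng, lat, alt, t (hypothetical layout)
    volumes: dicts with keys vid, lower, upper, polygon (hypothetical layout)
    """
    count_queue = []
    for vol in volumes:  # loop until every cylinder of the sector is traversed
        for p in track_points:
            if (t_start <= p["t"] <= t_end
                    and vol["lower"] <= p["alt"] <= vol["upper"]
                    and pnpoly(vol["polygon"], p["lng"], p["lat"])):
                count_queue.append((p["traj_id"], vol["vid"]))
    keys = set(count_queue)  # deduplicate on the composite key
    flights = {traj_id for traj_id, _ in keys}  # assumption: one count per Trajid
    return keys, len(flights)


square = [(0, 0), (4, 0), (4, 4), (0, 4)]
volumes = [{"vid": 1, "lower": 0, "upper": 100, "polygon": square},
           {"vid": 2, "lower": 101, "upper": 200, "polygon": square}]
track = [
    {"traj_id": "A", "lng": 1.0, "lat": 1.0, "alt": 50, "t": 1},
    {"traj_id": "A", "lng": 1.5, "lat": 1.5, "alt": 60, "t": 2},
    {"traj_id": "A", "lng": 2.0, "lat": 2.0, "alt": 150, "t": 3},
    {"traj_id": "B", "lng": 9.0, "lat": 9.0, "alt": 50, "t": 1},
    {"traj_id": "C", "lng": 1.0, "lat": 1.0, "alt": 50, "t": 999},
]
keys, flow = sector_flow(track, volumes, 0, 10)
```

In the usage example, trajectory A crosses both cylinders and contributes two composite keys but a single flight; B is outside the polygon and C is outside the time range.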
Referring to fig. 2, a spatio-temporal trajectory data system applying the flight data-oriented spatio-temporal trajectory data management and analysis method of the above embodiment is shown. The system comprises storage and processing nodes, an index layer, spatio-temporal operation algorithms, airspace flow statistics, and user requests. The spatio-temporal trajectory data set can be stored in the HDFS file system of a Hadoop platform. The index layer is responsible for building indexes over the data source, and the method supports different spatio-temporal indexes (Grid index, Rtree index, KDTree index and the like). The spatio-temporal operation algorithms are the spatio-temporal polygon range query algorithm and the spatio-temporal KNN algorithm described in the above embodiments. Based on these operations, the airspace flow statistics application service can be implemented, with the final result stored on HDFS. When an external spatio-temporal query request reaches the system, the system processes the request and returns the final result, which the user obtains directly from HDFS.
Referring to fig. 6, an embodiment of the present application further provides a device for managing and analyzing flight data-oriented space-time trajectory data. The device comprises an index establishing module, a query module, a flow statistic module and a capture module.
The index establishing module is used for acquiring space-time trajectory data and establishing a spatial index, and the spatial index comprises a plurality of data blocks;
the query module is used for querying a target data block of which all track points belong to a target polygon region in a target time period by utilizing a space-time polygon range query algorithm;
wherein the spatio-temporal polygon range query algorithm comprises:
calculating a minimum bounding rectangle of the target polygon area;
screening, from the plurality of data blocks, the data blocks which intersect the minimum bounding rectangle and the target polygon area, and adding them to a candidate result set;
and determining the data block of which the track point in the candidate set is positioned in the target polygon area and the track time is positioned in the target time period as a target data block.
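The three steps above can be sketched in Python as follows. This is a simplified illustration rather than the patented implementation: the `Block` layout is hypothetical, block screening checks only intersection with the minimum bounding rectangle, and the point-in-polygon test is passed in as a caller-supplied `inside` predicate (here a hand-written predicate for a right triangle).

```python
from dataclasses import dataclass

Point = tuple[float, float, float]        # (x, y, t)
Rect = tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


@dataclass
class Block:
    """One data block of the spatial index (hypothetical layout)."""
    bounds: Rect
    points: list[Point]


def mbr(polygon: list[tuple[float, float]]) -> Rect:
    """Step 1: minimum bounding rectangle of the polygon's vertices."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def rects_intersect(a: Rect, b: Rect) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def range_query(blocks, polygon, inside, t_start, t_end):
    """Return (block, hits) pairs for blocks holding at least one matching point."""
    box = mbr(polygon)
    # Step 2: screen blocks against the bounding rectangle.
    candidates = [b for b in blocks if rects_intersect(b.bounds, box)]
    result = []
    for block in candidates:
        # Step 3: keep points inside the polygon and inside the time period.
        hits = [p for p in block.points
                if t_start <= p[2] <= t_end and inside(p[0], p[1])]
        if hits:
            result.append((block, hits))
    return result


def in_triangle(x: float, y: float) -> bool:
    """Exact test for the triangle (0,0), (4,0), (0,4)."""
    return x >= 0 and y >= 0 and x + y <= 4


triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
blocks = [
    Block((0, 0, 2, 2), [(1.0, 1.0, 10.0), (1.5, 1.5, 99.0)]),
    Block((2, 2, 4, 4), [(3.0, 3.0, 10.0)]),   # in the rectangle, outside the triangle
    Block((5, 5, 6, 6), [(5.5, 5.5, 10.0)]),   # pruned by the rectangle test
]
found = range_query(blocks, triangle, in_triangle, 0.0, 50.0)
```

Screening by rectangle first means the more expensive polygon test only runs on points from blocks that can possibly contain a hit.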
The flow statistic module is used for counting the target airspace flow based on the space-time polygon range query algorithm. It is specifically configured to: divide a target airspace sector into a plurality of regular cylinders; based on the space-time polygon range query algorithm, and using the upper and lower height boundaries of each cylinder as constraint conditions, filter the track points in each cylinder and add them to a counting queue; and perform data deduplication on the track points in the counting queue to obtain the target airspace flow.
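As a rough illustration of the filtering step, the Python sketch below fills a counting queue from a set of cylinders. Several details are assumptions not given in the source: track points carry an altitude field `alt`, each cylinder's horizontal footprint is represented by a caller-supplied predicate standing in for the polygon range query, and both height boundaries are treated as inclusive.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Cylinder:
    vid: int                                     # cylinder id (Vid)
    contains_xy: Callable[[float, float], bool]  # stand-in for the polygon range query
    lower: float                                 # lower height boundary
    upper: float                                 # upper height boundary


@dataclass
class TrackPoint:
    trajid: str
    x: float
    y: float
    alt: float  # altitude; an assumed field used for the height constraint
    t: float


def fill_count_queue(points, cylinders, t_start, t_end):
    """Collect (Trajid, Vid) pairs for points that fall inside each cylinder."""
    queue = []
    for cyl in cylinders:
        for p in points:
            if (t_start <= p.t <= t_end
                    and cyl.lower <= p.alt <= cyl.upper
                    and cyl.contains_xy(p.x, p.y)):
                queue.append((p.trajid, cyl.vid))
    return queue


def in_box(xmin, ymin, xmax, ymax):
    """Build a rectangular footprint predicate (illustrative only)."""
    return lambda x, y: xmin <= x <= xmax and ymin <= y <= ymax


cylinders = [
    Cylinder(1, in_box(0, 0, 10, 10), 0.0, 6000.0),
    Cylinder(2, in_box(0, 0, 10, 10), 6000.0, 12000.0),
]
points = [
    TrackPoint("CA1234", 5, 5, 3000.0, 10.0),
    TrackPoint("CA1234", 6, 5, 6000.0, 20.0),    # lies on the shared height boundary
    TrackPoint("MU5678", 50, 50, 3000.0, 10.0),  # outside the footprint
]
queue = fill_count_queue(points, cylinders, 0.0, 100.0)
```

The point on the shared boundary lands in both cylinders, which is the kind of duplicate the subsequent deduplication step is meant to handle.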
In one embodiment, the flow statistic module is specifically configured to: define a track id attribute (Trajid) and an owning-cylinder attribute Vid for the track points in the counting queue; when the space-time polygon range query algorithm is executed for each cylinder, if a track point is in the cylinder, set the Vid of that track point to the Vid of the current cylinder to form a composite key <Trajid, Vid>; and perform data deduplication on the track points in the counting queue using the composite key recorded in each row.
The capture module is used for calculating k track points adjacent to the target query point based on a KNN query algorithm, so as to capture correlations in the spatiotemporal trajectory data. It is specifically configured to: search the plurality of data blocks for a data block that intersects the target query point, and add it to a first priority queue; calculate the distance $dist_p$ between each data block in the first priority queue and the target query point; calculate the k-th shortest distance $dist_k$ between the track points of each data block in the first priority queue and the target query point; determine the data blocks satisfying $dist_p < dist_k$ as candidate data blocks; add to a second priority queue the track points in the candidate data blocks whose track time lies in the target time period and whose distance $dist(i, q)$ to the target query point is smaller than $dist_k$; generate a test range centered on the target query point with a radius equal to the distance to the k-th neighbor track point; and determine the k track points adjacent to the target query point based on the result of testing the track points in the second priority queue against the test range.
In one embodiment, the capture module is specifically configured to: judge whether the test range intersects other data blocks; if not, determine the k track points in the test range as the k track points adjacent to the target query point; if so, calculate the distance dist between each track point in the data blocks intersected by the test range and the target query point, and add the track points whose dist is smaller than $dist_k$ to the second priority queue, replacing track points already in the second priority queue.
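The block-pruned KNN search can be approximated by the Python sketch below. It is a simplification under stated assumptions: plain lists and `heapq.nsmallest` replace the two priority queues, the seed distance $dist_k$ is computed only from in-window points of the block containing the query point, comparisons are non-strict so the k-th point itself is retained, and the `Block` layout is hypothetical.

```python
import heapq
import math
from dataclasses import dataclass

Point = tuple[float, float, float]        # (x, y, t)
Rect = tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


@dataclass
class Block:
    bounds: Rect
    points: list[Point]


def dist(p: Point, q: tuple[float, float]) -> float:
    """Euclidean distance between a track point and the query point."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def dist_to_block(q: tuple[float, float], r: Rect) -> float:
    """dist_p: shortest distance from q to the block rectangle (0 if q is inside)."""
    dx = max(r[0] - q[0], 0.0, q[0] - r[2])
    dy = max(r[1] - q[1], 0.0, q[1] - r[3])
    return math.hypot(dx, dy)


def knn(blocks, q, k, t_start, t_end):
    def in_window(p):
        return t_start <= p[2] <= t_end

    # Seed dist_k from the block(s) that contain the query point.
    seed = sorted(dist(p, q) for b in blocks if dist_to_block(q, b.bounds) == 0.0
                  for p in b.points if in_window(p))
    dist_k = seed[k - 1] if len(seed) >= k else math.inf
    # Candidate blocks: those the test circle of radius dist_k can reach.
    candidates = [b for b in blocks if dist_to_block(q, b.bounds) <= dist_k]
    pool = [p for b in candidates for p in b.points
            if in_window(p) and dist(p, q) <= dist_k]
    return heapq.nsmallest(k, pool, key=lambda p: dist(p, q))


blocks = [
    Block((0, 0, 2, 2), [(0.2, 0.2, 1.0), (1.9, 1.9, 1.0)]),
    Block((2, 0, 4, 2), [(2.1, 1.0, 1.0), (3.9, 0.1, 1.0)]),
    Block((10, 10, 12, 12), [(11.0, 11.0, 1.0)]),
]
nearest = knn(blocks, (1.8, 1.0), 2, 0.0, 5.0)
```

In this sample the query point sits near a block border, so one of its two nearest neighbors comes from the adjacent block; this is the case the test-range check against other data blocks is meant to cover.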
In one embodiment, the query module is specifically configured to: store the abscissa of each point of the target polygon area (that is, each polygon vertex) in an array X[] and the ordinate in an array Y[], and obtain the maximum value $x_{max}$ and the minimum value $x_{min}$ of the array X[] and the maximum value $y_{max}$ and the minimum value $y_{min}$ of the array Y[]; judge whether the track point lies in the minimum bounding rectangle $(x_{min}, y_{min})$-$(x_{max}, y_{max})$ of the target polygon area; if not, determine that the track point does not belong to the target polygon area; if so, cast a ray from the track point to intersect the target polygon area, and determine whether the track point belongs to the target polygon area based on the number of intersection points.
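A minimal Python sketch of this test follows, using the standard ray-casting rule that a point is inside when a ray from it crosses the polygon boundary an odd number of times. The function name and the choice of a horizontal ray toward increasing x are illustrative assumptions; the source does not fix the ray direction.

```python
def point_in_polygon(px, py, X, Y):
    """Ray-casting test; X[i], Y[i] are the polygon's vertex coordinates."""
    x_min, x_max = min(X), max(X)
    y_min, y_max = min(Y), max(Y)
    # Cheap rejection against the minimum bounding rectangle.
    if not (x_min <= px <= x_max and y_min <= py <= y_max):
        return False
    inside = False
    n = len(X)
    j = n - 1
    for i in range(n):
        # Does the edge from vertex j to vertex i straddle the line y = py?
        if (Y[i] > py) != (Y[j] > py):
            # x-coordinate where the edge crosses that horizontal line.
            x_cross = X[i] + (py - Y[i]) * (X[j] - X[i]) / (Y[j] - Y[i])
            if px < x_cross:
                inside = not inside  # one more crossing to the right of the point
        j = i
    return inside


# L-shaped polygon: its bounding rectangle is (0, 0)-(4, 4).
X = [0, 4, 4, 2, 2, 0]
Y = [0, 0, 2, 2, 4, 4]
in_l = point_in_polygon(1, 1, X, Y)        # inside the L
in_notch = point_in_polygon(3, 3, X, Y)    # inside the rectangle, outside the L
far_away = point_in_polygon(5, 1, X, Y)    # rejected by the rectangle check
```

The L-shaped example shows why the rectangle check alone is not sufficient: the second point passes it but is correctly rejected by the crossing count.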
In an embodiment, the index establishing module is specifically configured to: define a set of spatio-temporal trajectory data $p = \{p_1, p_2, \ldots, p_n\}$, wherein each track point $p_1, p_2, \ldots, p_n$ is expressed as (lng, lat, t), where lng denotes longitude, lat denotes latitude and t denotes the time attribute; abstract the longitude lng and latitude lat of each track point into the coordinates <x, y> of a point on a two-dimensional plane, and define the Euclidean distance between two points $\langle x_1, y_1 \rangle$ and $\langle x_2, y_2 \rangle$ on the two-dimensional plane as $\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$.
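The track point model and planar distance can be sketched in Python as below, together with a simple grid partition as one possible way to obtain data blocks. The grid layout and the `cell_size` parameter are assumptions for illustration; the source names Grid, Rtree and KDTree indexes without detailing their construction.

```python
import math
from collections import defaultdict
from typing import NamedTuple


class TrackPoint(NamedTuple):
    lng: float  # longitude, used as x
    lat: float  # latitude, used as y
    t: float    # time attribute


def euclidean(a: TrackPoint, b: TrackPoint) -> float:
    """Planar distance, treating (lng, lat) as <x, y>."""
    return math.hypot(a.lng - b.lng, a.lat - b.lat)


def grid_index(points, cell_size):
    """Group points into square cells; each cell plays the role of a data block."""
    blocks = defaultdict(list)
    for p in points:
        key = (math.floor(p.lng / cell_size), math.floor(p.lat / cell_size))
        blocks[key].append(p)
    return dict(blocks)


P = [TrackPoint(0.0, 0.0, 1.0), TrackPoint(3.0, 4.0, 2.0), TrackPoint(0.5, 0.5, 3.0)]
d = euclidean(P[0], P[1])
index = grid_index(P, 1.0)
```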
FIG. 7 illustrates a hardware block diagram of a computing device 30 for spatiotemporal trajectory data management analysis oriented to flight data in accordance with an embodiment of the present description. As shown in fig. 7, the computing device 30 may include at least one processor 301, storage 302 (e.g., non-volatile storage), memory 303, and a communication interface 304, and the at least one processor 301, storage 302, memory 303, and communication interface 304 are connected together via a bus 305. The at least one processor 301 executes at least one computer readable instruction stored or encoded in the storage 302.
It should be appreciated that the computer-executable instructions stored in the storage 302, when executed, cause the at least one processor 301 to perform the various operations and functions described above in connection with fig. 1-5 in the various embodiments of the present specification.
In embodiments of the present description, computing device 30 may include, but is not limited to: personal computers, server computers, workstations, desktop computers, laptop computers, notebook computers, mobile computing devices, smart phones, tablet computers, cellular phones, Personal Digital Assistants (PDAs), handheld devices, messaging devices, wearable computing devices, consumer electronics, and so forth.
According to one embodiment, a program product, such as a machine-readable medium, is provided. A machine-readable medium may have instructions (i.e., elements described above as being implemented in software) that, when executed by a machine, cause the machine to perform various operations and functions described above in connection with fig. 1-5 in the various embodiments of the present specification. Specifically, a system or apparatus may be provided that is equipped with a readable storage medium on which software program code implementing the functions of any of the above embodiments is stored, and a computer or processor of the system or apparatus reads out and executes the instructions stored in the readable storage medium.
In this case, the program code itself read from the readable medium can realize the functions of any of the above-described embodiments, and thus the machine-readable code and the readable storage medium storing the machine-readable code form part of this specification.
Examples of the readable storage medium include floppy disks, hard disks, magneto-optical disks, optical disks (e.g., CD-ROMs, CD-R, CD-RWs, DVD-ROMs, DVD-RAMs, DVD-RWs), magnetic tapes, nonvolatile memory cards, and ROMs. Alternatively, the program code may be downloaded from a server computer or from the cloud via a communications network.
It will be understood by those skilled in the art that various changes and modifications may be made in the above-disclosed embodiments without departing from the spirit of the invention. Accordingly, the scope of the present description should be limited only by the attached claims.
It should be noted that not all steps and units in the above flows and system structure diagrams are necessary, and some steps or units may be omitted according to actual needs. The execution order of the steps is not fixed, and can be determined as required. The apparatus structures described in the above embodiments may be physical structures or logical structures, that is, some units may be implemented by the same physical client, or some units may be implemented by multiple physical clients, or some units may be implemented by some components in multiple independent devices.
In the above embodiments, the hardware units or modules may be implemented mechanically or electrically. For example, a hardware unit, module or processor may comprise permanently dedicated circuitry or logic (such as a dedicated processor, FPGA or ASIC) to perform the corresponding operations. The hardware units or processors may also include programmable logic or circuitry (e.g., a general purpose processor or other programmable processor) that may be temporarily configured by software to perform the corresponding operations. The specific implementation (mechanical, or dedicated permanent, or temporarily set) may be determined based on cost and time considerations.
The detailed description set forth above in connection with the appended drawings describes exemplary embodiments but does not represent all embodiments that may be practiced or fall within the scope of the claims. The term "exemplary" used throughout this specification means "serving as an example, instance, or illustration," and does not mean "preferred" or "advantageous" over other embodiments. The detailed description includes specific details for the purpose of providing an understanding of the described technology. However, the techniques may be practiced without these specific details. In some instances, well-known structures and devices are shown in block diagram form in order to avoid obscuring the concepts of the described embodiments.
The previous description of the disclosure is provided to enable any person skilled in the art to make or use the disclosure. Various modifications to the disclosure will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other variations without departing from the scope of the disclosure. Thus, the disclosure is not intended to be limited to the examples and designs described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (10)
1. A flight data-oriented spatio-temporal trajectory data management analysis method is characterized by comprising the following steps:
acquiring space-time trajectory data and establishing a spatial index, wherein the spatial index comprises a plurality of data blocks; and
querying a target data block of which all track points belong to a target polygon region in a target time period by utilizing a spatio-temporal polygon range query algorithm;
wherein the spatio-temporal polygon range query algorithm comprises:
calculating a minimum bounding rectangle of the target polygon area;
screening, from the plurality of data blocks, the data blocks which intersect the minimum bounding rectangle and the target polygon area, and adding them to a candidate result set;
and determining the data block of which the track point in the candidate set is positioned in the target polygon area and the track time is positioned in the target time period as a target data block.
2. The flight data-oriented spatio-temporal trajectory data management analysis method of claim 1, further comprising counting target airspace flows based on the spatio-temporal polygon range query algorithm; the method specifically comprises the following steps:
dividing a target airspace sector into a plurality of regular cylinders;
based on the space-time polygon range query algorithm and by using the upper and lower height boundaries of each cylinder as constraint conditions, filtering track points in each cylinder to add to a counting queue;
and carrying out data deduplication on the track points in the counting queue to obtain target airspace flow.
3. The flight data-oriented spatio-temporal trajectory data management analysis method according to claim 2, wherein the data deduplication is performed on the trajectory points in the counting queue to obtain a target airspace flow, and specifically comprises:
defining track id attributes and the affiliated cylinder Vid for track points in the counting queue;
when the space-time polygon range query algorithm is executed for each cylinder, if a track point is in the cylinder, setting the Vid of that track point to the Vid of the current cylinder to form a composite key <Trajid, Vid>;
and carrying out data deduplication on the track points of the counting queue by using the composite key recorded in each row.
4. The flight data-oriented spatio-temporal trajectory data management analysis method according to claim 1, characterized in that the method further comprises calculating k trajectory points adjacent to the target query point based on a KNN query algorithm, thereby capturing spatio-temporal trajectory data correlations; the method specifically comprises the following steps:
searching a data block which is intersected with the target query point in the plurality of data blocks, and adding the data block to a first priority queue;
calculating the distance $dist_p$ between each data block in the first priority queue and the target query point;
calculating the k-th shortest distance $dist_k$ between the track points of each data block in the first priority queue and the target query point;
determining the data blocks satisfying $dist_p < dist_k$ as candidate data blocks;
adding to a second priority queue the track points in the candidate data blocks whose track time lies in the target time period and whose distance $dist(i, q)$ to the target query point is smaller than $dist_k$;
generating a test range which takes the target query point as a center and takes the distance from the kth neighbor track point as a radius;
and determining k track points close to the target query point based on the test result of the test range on the track points in the second priority queue.
5. The flight data-oriented spatiotemporal trajectory data management analysis method according to claim 4, wherein the determining k trajectory points adjacent to the target query point based on the test result of the test range on the trajectory points in the second priority queue specifically comprises:
judging whether the test range intersects other data blocks;
if not, determining the k track points in the test range as the k track points adjacent to the target query point;
if so, calculating the distance dist between each track point in the data blocks intersected by the test range and the target query point, and adding the track points whose dist is smaller than $dist_k$ to the second priority queue, replacing track points already in the second priority queue.
6. The method for managing and analyzing flight data-oriented spatio-temporal trajectory data according to any one of claims 1 to 5, characterized in that querying, by utilizing the spatio-temporal polygon range query algorithm, a target data block of which all track points belong to the target polygon region in the target time period specifically comprises:
storing the abscissa of each point of the target polygon area (that is, each polygon vertex) in an array X[] and the ordinate in an array Y[], and obtaining the maximum value $x_{max}$ and the minimum value $x_{min}$ of the array X[] and the maximum value $y_{max}$ and the minimum value $y_{min}$ of the array Y[];
judging whether the track point lies in the minimum bounding rectangle $(x_{min}, y_{min})$-$(x_{max}, y_{max})$ of the target polygon area;
if not, determining that the track point does not belong to the target polygon area;
if so, casting a ray from the track point to intersect the target polygon area, and determining whether the track point belongs to the target polygon area based on the number of intersection points.
7. The method for managing and analyzing flight data-oriented spatio-temporal trajectory data according to any one of claims 1 to 5, characterized in that acquiring spatio-temporal trajectory data and establishing a spatial index, the spatial index comprising a plurality of data blocks, specifically comprises:
defining a set of spatio-temporal trajectory data $p = \{p_1, p_2, \ldots, p_n\}$, wherein each track point $p_1, p_2, \ldots, p_n$ is expressed as (lng, lat, t), where lng denotes longitude, lat denotes latitude and t denotes the time attribute;
8. A flight data-oriented spatio-temporal trajectory data management analysis device, characterized by comprising:
the index establishing module is used for acquiring space-time trajectory data and establishing a spatial index, and the spatial index comprises a plurality of data blocks; and
the query module is used for querying a target data block of which all track points belong to a target polygon region in a target time period by utilizing a space-time polygon range query algorithm;
wherein the spatio-temporal polygon range query algorithm comprises:
calculating a minimum bounding rectangle of the target polygon area;
screening, from the plurality of data blocks, the data blocks which intersect the minimum bounding rectangle and the target polygon area, and adding them to a candidate result set;
and determining the data block of which the track point in the candidate set is positioned in the target polygon area and the track time is positioned in the target time period as a target data block.
9. A computing device, comprising:
at least one processor; and
a memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform the method of any one of claims 1 to 7.
10. A machine-readable storage medium storing executable instructions that, when executed, cause the machine to perform the method of any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110965172.2A CN113656670A (en) | 2021-08-23 | 2021-08-23 | Flight data-oriented space-time trajectory data management analysis method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113656670A (en) | 2021-11-16 |
Family
ID=78491919
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110965172.2A Pending CN113656670A (en) | 2021-08-23 | 2021-08-23 | Flight data-oriented space-time trajectory data management analysis method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113656670A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170314947A1 (en) * | 2016-04-28 | 2017-11-02 | National Tsing Hua University | Computing method for ridesharing paths, computing apparatus and recording medium using the same |
CN107423368A (en) * | 2017-06-29 | 2017-12-01 | 中国测绘科学研究院 | A kind of space-time data indexing means in non-relational database |
CN112465199A (en) * | 2020-11-18 | 2021-03-09 | 南京航空航天大学 | Airspace situation evaluation system |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114117260A (en) * | 2021-12-02 | 2022-03-01 | 中国人民解放军国防科技大学 | Spatiotemporal trajectory indexing and query processing method, device, equipment and medium |
CN114610774A (en) * | 2022-02-10 | 2022-06-10 | 中远海运散货运输有限公司 | Method, device, electronic equipment and medium for analyzing ship passing through selected area |
CN114610774B (en) * | 2022-02-10 | 2023-01-20 | 天津中远海运散运数字科技有限公司 | Method, device, electronic equipment and medium for analyzing ship passing through selected area |
CN117648338A (en) * | 2024-01-29 | 2024-03-05 | 卡奥斯化智物联科技(青岛)有限公司 | Method, device, equipment and medium for optimizing space-time data range index filtration |
CN117648338B (en) * | 2024-01-29 | 2024-06-07 | 卡奥斯化智物联科技(青岛)有限公司 | Method, device, equipment and medium for optimizing space-time data range index filtration |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||