CN112380302B

CN112380302B - Thermodynamic diagram generation method and device based on track data, electronic equipment and storage medium

Info

Publication number: CN112380302B
Application number: CN202011148718.7A
Authority: CN
Inventors: 张健钦; 张昊; 郭小刚; 卢剑; 陆浩
Original assignee: Beijing University of Civil Engineering and Architecture
Current assignee: Beijing University of Civil Engineering and Architecture
Priority date: 2020-10-23
Filing date: 2020-10-23
Publication date: 2023-07-21
Anticipated expiration: 2040-10-23
Also published as: CN112380302A

Abstract

The embodiment of the invention discloses a thermodynamic diagram generation method, a thermodynamic diagram generation device, electronic equipment and a storage medium based on track data. The method comprises the following steps: acquiring track data and map data; storing the track data in a Hadoop platform distributed file system in an original format; clustering the track data to obtain cluster data; storing the map data and the cluster data in an HBase distributed database; acquiring map data and cluster data corresponding to a thermodynamic diagram to be generated from the HBase distributed database; and generating a thermodynamic diagram according to the acquired map data and cluster data. Based on the method and the device, the thermodynamic diagram visualization efficiency can be improved while the position characteristics of the track data are maintained, the diagram rendering time is shortened, the stuttering that occurs during user interaction is alleviated, and the user experience is improved.

Description

Thermodynamic diagram generation method and device based on track data, electronic equipment and storage medium

Technical Field

The embodiment of the invention relates to the technical field of computers, in particular to a thermodynamic diagram generation method, a thermodynamic diagram generation device, electronic equipment and a storage medium based on track data.

Background

In recent years, with the continuous development of satellite positioning technology, location-based service (LBS) technology, and the internet, position data has come to be collected in many ways, and the volume of track data is growing explosively. Conventional databases are difficult both to manage and to expand in storage capacity. The arrival of the big data era brings problems such as changing data structures, complex storage structures, and information fragmentation, and research on technology for storing and managing big track data is one of the key research directions in the GIS field. Massive track data has great research value and contains a large amount of geospatial information. Rendering track data as a thermodynamic diagram displays spatial position features comprehensively, so that researchers can mine the spatial information of the current region and analyze vehicle movement features.

Currently, the shortcomings of thermodynamic diagram visualization of track data are mainly as follows: (1) the data scale is large, so rendering takes a long time and interactivity is low; (2) the thermodynamic diagram adapts poorly, so that when the zoom level is switched, the position characteristics of the track data displayed by the thermodynamic diagram are heavily distorted; (3) the color gradients at different zoom levels are the same, so that data-dense regions exhibit a thermonuclear phenomenon. The technical requirements of large-scale data visualization cannot be met merely by optimizing storage and query performance; the track data itself must also be processed. Current optimizations of big data visualization mainly improve rendering efficiency by reducing the overall data volume, but this approach still cannot adequately overcome the above shortcomings of track data thermodynamic diagram visualization.

Disclosure of Invention

It is an aim of embodiments of the present invention to address at least the above problems and/or disadvantages and to provide at least the advantages described below.

The embodiment of the invention provides a thermodynamic diagram generating method, a thermodynamic diagram generating device, an electronic device and a storage medium based on track data, which can improve thermodynamic diagram visualization efficiency.

In a first aspect, a thermodynamic diagram generating method based on trajectory data is provided, including:

acquiring track data and map data;

storing the track data in a Hadoop platform distributed file system in an original format;

clustering the track data to obtain clustered data;

storing the map data and the cluster data in an HBase distributed database;

acquiring map data and cluster data corresponding to a thermodynamic diagram to be generated from the HBase distributed database;

and generating a thermodynamic diagram according to the acquired map data and the cluster data.

Optionally, the storing the track data in the original format in the Hadoop platform distributed file system includes:

dividing the track data into a plurality of time slices, wherein each time slice comprises all track data in a preset time range;

in the Hadoop platform distributed file system, storing the track data contained in the same time slice together in the original format, and storing the plurality of time slices adjacently in time order.

Optionally, the map data has a plurality of zoom levels;

the clustering of the track data to obtain clustered data includes:

determining a plurality of groups of clustering parameters according to the zoom levels;

clustering is carried out on track data contained in each time slice according to the plurality of groups of clustering parameters, so as to obtain a plurality of groups of clustering data corresponding to the plurality of zoom levels for each time slice;

the acquiring map data and cluster data corresponding to the thermodynamic diagram to be generated from the HBase distributed database comprises:

determining a zoom level of map data corresponding to the thermodynamic diagram to be generated according to the zoom level of the thermodynamic diagram to be generated;

determining a time slice to which clustering data corresponding to the thermodynamic diagram to be generated belongs according to the time range of the thermodynamic diagram to be generated;

and acquiring map data at corresponding zoom levels and cluster data at corresponding zoom levels of corresponding time slices from the HBase distributed database.
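
The lookup described above can be sketched in Python as follows. This is a minimal illustration only: the hourly slice length, the table naming scheme (map_z<level> and cluster_<YYYYMMDDHH>_z<level>), and the function names are assumptions made for this sketch and are not taken from the embodiment.

```python
from datetime import datetime, timedelta

SLICE = timedelta(hours=1)  # assumed length of one time slice


def slice_starts(range_start: datetime, range_end: datetime):
    """List the start times of all hourly slices overlapping [range_start, range_end)."""
    current = range_start.replace(minute=0, second=0, microsecond=0)
    starts = []
    while current < range_end:
        starts.append(current)
        current += SLICE
    return starts


def tables_for_request(zoom: int, range_start: datetime, range_end: datetime):
    """Return the hypothetical map table name and cluster table names to read."""
    map_table = f"map_z{zoom}"
    cluster_tables = [
        f"cluster_{start:%Y%m%d%H}_z{zoom}"
        for start in slice_starts(range_start, range_end)
    ]
    return map_table, cluster_tables


# A diagram at zoom level 12 covering 08:30 to 10:00 needs two hourly slices.
map_table, cluster_tables = tables_for_request(
    12, datetime(2020, 10, 23, 8, 30), datetime(2020, 10, 23, 10, 0)
)
print(map_table, cluster_tables)
```

Because one table name is derived per zoom level and per time slice, a request only touches the tables that match its zoom level and time range.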

Optionally, the map data has a plurality of zoom levels;

the clustering of the track data to obtain clustered data includes:

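determining a plurality of groups of clustering parameters according to the plurality of zoom levels;

clustering the track data according to the plurality of groups of clustering parameters to obtain a plurality of groups of cluster data corresponding to the plurality of zoom levels;

the acquiring map data and cluster data corresponding to the thermodynamic diagram to be generated from the HBase distributed database comprises:

acquiring map data and cluster data at the corresponding zoom level from the HBase distributed database.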

Optionally, each set of cluster parameters includes a scan radius;

the determining a plurality of groups of clustering parameters according to the zoom levels comprises:

determining a scanning radius corresponding to each zoom level according to the zoom levels; wherein the corresponding scan radius of each zoom level decreases as the corresponding zoom level decreases.

Optionally, each set of cluster parameters includes a minimum inclusion point number;

the determining a plurality of groups of clustering parameters according to the zoom levels further includes:

and determining the minimum inclusion points corresponding to each zoom level according to the zoom levels, wherein the minimum inclusion points corresponding to each zoom level decrease along with the decrease of the corresponding zoom level.

Optionally, each set of cluster data includes center coordinates and influence values of a plurality of clusters and coordinates and influence values of a plurality of noise points.

Optionally, the clustering is implemented based on the DBSCAN algorithm.

Optionally, the storing the cluster data in the HBase distributed database includes:

constructing a cluster data table for each group of cluster data corresponding to each zoom level of each time slice.

Optionally, the map data has a plurality of zoom levels;

the storing the map data in the HBase distributed database includes:

constructing a map data table for the map data at each zoom level, and storing 4 tiles that are adjacent to each other in the display state, among the tiles included in the map data at each zoom level, in the same row of the corresponding map data table.

Optionally, the constructing a map data table for the map data at each zoom level and storing 4 tiles adjacent to each other in the display state in the same row of the corresponding map data table comprises:

Calculating the total order m of the map data at each zoom level according to the number n of tiles contained in each row of the map data at each zoom level, wherein,

dividing map data at each zoom level into m x m square sub-grids and n edge sub-grids when n-2m=1, wherein the square sub-grids are composed of 4 tiles, 2m edge sub-grids adjacent to the square sub-grids are composed of 2 tiles, and 1 edge sub-grid not adjacent to the square sub-grids is composed of 1 tile;

filling the m x m square sub-grids based on a Z-shaped filling curve, filling the 2m edge sub-grids based on a linear filling curve, connecting the filling curves of the m x m square sub-grids and the 2m edge sub-grids into a whole, and extending the connected filling curve to the 1 edge sub-grid that is not adjacent to the square sub-grids;

encoding the n tiles according to the filling sequence of the n tiles;

and constructing each map data table aiming at the map data under each zoom level, and sequentially storing the n tiles in the corresponding map data table based on the codes of the n tiles, wherein 4 tiles belonging to the same square sub-grid are stored in the same row in the corresponding map data table, and the tiles belonging to the same edge sub-grid are stored in the same row in the corresponding map data table.

dividing map data of each zoom level into m x m square sub-grids when n=2m, wherein the square sub-grids are composed of 4 tiles;

filling the m x m square sub-grids based on a Z-shaped filling curve;

encoding the n tiles according to the filling sequence of the n tiles;

and constructing each map data table aiming at the map data under each zoom level, and sequentially storing the n tiles in the corresponding map data table based on the codes of the n tiles, wherein 4 tiles belonging to the same square sub-grid are stored in the same row in the corresponding map data table.
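
As an illustration of the n = 2m case, the following Python sketch orders the tiles of a grid with n tiles per row along a Z-shaped (Morton) curve and derives a table row from each code, so that the 4 tiles of one square sub-grid share a row. The bit-interleaving order, the origin of the tile coordinates, the rule table_row = code // 4, and the function names are assumptions made for this sketch; the encoding shown in the drawings may differ, and the n = 2m + 1 case with edge sub-grids is not covered.

```python
def morton_code(col: int, row: int) -> int:
    """Interleave the bits of col and row (col bits in the even positions)."""
    code = 0
    for bit in range(max(col.bit_length(), row.bit_length())):
        code |= ((col >> bit) & 1) << (2 * bit)
        code |= ((row >> bit) & 1) << (2 * bit + 1)
    return code


def encode_tiles(n: int):
    """Assign sequential codes to an n x n tile grid (n even) along a Z curve.

    Returns {(col, row): (code, table_row)} with table_row = code // 4, so the
    4 tiles of each 2 x 2 square sub-grid fall into the same table row.
    """
    if n <= 0 or n % 2:
        raise ValueError("this sketch only covers the n = 2m case")
    tiles = sorted(
        ((col, row) for row in range(n) for col in range(n)),
        key=lambda tile: morton_code(*tile),
    )
    return {tile: (code, code // 4) for code, tile in enumerate(tiles)}


encoding = encode_tiles(4)
for tile in [(0, 0), (1, 0), (0, 1), (1, 1), (2, 0)]:
    print(tile, encoding[tile])
```

Sorting by the interleaved code keeps the four tiles of every 2 x 2 sub-grid consecutive, which is what allows them to be written to, and read from, a single row.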

In a second aspect, there is provided a thermodynamic diagram generating device based on trajectory data, including:

the first acquisition module is used for acquiring track data and map data;

the first storage module is used for storing the track data in a Hadoop platform distributed file system in an original format;

the clustering module is used for clustering the track data to obtain clustered data;

the second storage module is used for storing the map data and the cluster data in an HBase distributed database;

the second acquisition module is used for acquiring map data and cluster data corresponding to the thermodynamic diagram to be generated from the HBase distributed database;

and the generation module is used for generating a thermodynamic diagram according to the acquired map data and the cluster data.

Optionally, the first storage module is specifically configured to:

Optionally, the map data has a plurality of zoom levels;

the clustering module comprises:

the first determining unit is used for determining a plurality of groups of clustering parameters according to the plurality of zoom levels;

the clustering unit is used for clustering the track data contained in each time slice according to the multiple groups of clustering parameters to obtain multiple groups of clustering data corresponding to the multiple zoom levels for each time slice;

the second acquisition module includes:

a second determining unit, configured to determine a zoom level of map data corresponding to the thermodynamic diagram to be generated according to the zoom level of the thermodynamic diagram to be generated;

a third determining unit, configured to determine, according to the time range of the thermodynamic diagram to be generated, a time slice to which cluster data corresponding to the thermodynamic diagram to be generated belongs;

and the acquisition unit is used for acquiring map data under the corresponding zoom level and cluster data of the corresponding zoom level under the corresponding time slice from the HBase distributed database.

Optionally, the map data has a plurality of zoom levels;

the clustering module comprises:

the clustering unit is used for clustering the track data according to the multiple groups of clustering parameters to obtain multiple groups of clustering data corresponding to the multiple zoom levels;

the second acquisition module includes:

and the acquisition unit is used for acquiring map data and cluster data under the corresponding zoom level from the HBase distributed database.

Optionally, the sets of cluster parameters include a scan radius;

the first determining unit is specifically configured to:

Optionally, the clustering is implemented based on the DBSCAN algorithm.

Optionally, the second storage module includes:

a first construction unit for constructing each of the cluster data tables for each of the sets of cluster data corresponding to each of the zoom levels for each of the time slices, respectively.

Optionally, the map data has a plurality of zoom levels;

the second storage module includes:

and a second construction unit for constructing each map data table for the map data at each zoom level, and storing 4 tiles adjacent to each other in the display state, which are included in the map data at each zoom level, in the same row in the corresponding map data table.

Optionally, the second construction unit is specifically configured to:

filling the m x m square sub-grids based on a Z-shaped filling curve, filling the 2m edge sub-grids based on a linear filling curve, and connecting the m x m square sub-grids and the n edge sub-grids by using connecting lines;

encoding the n tiles according to the filling sequence of the n tiles;

Optionally, the second construction unit is specifically configured to:

filling the m x m square sub-grids based on a Z-shaped filling curve;

encoding the n tiles according to the filling sequence of the n tiles;

In a third aspect, an electronic device is provided, comprising: the system comprises at least one processor and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to cause the at least one processor to perform the method described above.

In a fourth aspect, a storage medium is provided, on which a computer program is stored which, when executed by a processor, implements the method described above.

The embodiment of the invention at least comprises the following beneficial effects:

The thermodynamic diagram generating method and device based on track data provided by the embodiment of the invention first acquire track data and map data; store the track data in a Hadoop platform distributed file system in an original format; cluster the track data to obtain cluster data; store the map data and the cluster data in an HBase distributed database; acquire map data and cluster data corresponding to a thermodynamic diagram to be generated from the HBase distributed database; and generate a thermodynamic diagram according to the acquired map data and cluster data. Based on the method and the device, the thermodynamic diagram visualization efficiency can be improved while the position characteristics of the track data are maintained, the diagram rendering time is shortened, the stuttering that occurs during user interaction is alleviated, and the user experience is improved.

Additional advantages, objects, and features of embodiments of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of embodiments of the invention.

Drawings

FIG. 1 is a flow chart of a thermodynamic diagram generation method based on trajectory data according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of a track data storage mode according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of map data in a display state according to an embodiment of the present invention;

FIG. 4 (a) is a schematic diagram of a coding manner of map data when n=2 according to an embodiment of the present invention;

FIG. 4 (b) is a schematic diagram of a coding manner of map data when n=4 according to an embodiment of the present invention;

FIG. 4 (c) is a schematic diagram of a coding flow of map data when n=3 according to an embodiment of the present invention;

FIG. 4 (d) is a schematic diagram of a coding manner of map data when n=3 according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of a storage framework for track data and map data according to another embodiment of the present invention;

FIG. 6 is a flow chart of loading map data according to another embodiment of the present invention;

FIG. 7 is a graph comparing map data loading durations according to another embodiment of the present invention;

FIG. 8 is a flowchart of a thermodynamic diagram generation method based on trajectory data according to another embodiment of the present invention;

FIG. 9 (a) is a thermodynamic diagram of map data at a zoom level of 11 and raw trajectory data that has not been clustered, according to yet another embodiment of the present invention;

FIG. 9 (b) is a thermodynamic diagram of map data and cluster data at a zoom level of 11 according to another embodiment of the present invention;

FIG. 9 (c) is a thermodynamic diagram of map data at a zoom level of 12 and raw trajectory data that has not been clustered, according to yet another embodiment of the present invention;

FIG. 9 (d) is a thermodynamic diagram of map data and cluster data at a zoom level of 12 according to another embodiment of the present invention;

FIG. 9 (e) is a thermodynamic diagram of map data at a zoom level of 13 and raw trajectory data that has not been clustered, according to yet another embodiment of the present invention;

FIG. 9 (f) is a thermodynamic diagram of map data and cluster data at a zoom level of 13 according to another embodiment of the present invention;

FIG. 9 (g) is a thermodynamic diagram of map data at a zoom level of 14 and raw trajectory data that has not been clustered, according to yet another embodiment of the present invention;

FIG. 9 (h) is a thermodynamic diagram of map data and cluster data at a zoom level of 14 according to another embodiment of the present invention;

FIG. 10 is a graph comparing time durations of thermodynamic diagrams according to yet another embodiment of the present invention;

FIG. 11 is a schematic structural diagram of a thermodynamic diagram generating device based on trajectory data according to an embodiment of the present invention;

FIG. 12 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.

Detailed Description

Embodiments of the invention will be described in further detail below with reference to the drawings to enable those skilled in the art to practice the invention by reference to the description.

At present, relational databases such as Oracle and PostgreSQL are mainly used as data warehouses for storing track data. They are mainly used to statically store and express the state of track data in a specific period, so information cannot be stored and managed in real time. Specifically, a conventional database storage scheme reaches its processing load limit when the volume of input and output data is large, and is insufficient to support rapid storage and query of massive data. Conventional databases also support only a limited range of data types and perform poorly in capacity expansion, data backup, and similar functions when facing massive data. The Hadoop open source cloud storage framework offers high scalability, high fault tolerance, and low cost, has strong computing capacity, and can provide technical support for storing real-time massive track data. HBase is a NoSQL database built on Hadoop and inherits core functions of HDFS such as the heartbeat mechanism and data backup. In terms of storage, HBase supports various data structures, can handle PB-level data, and scales well, so it is suitable for storing massive data.

Fig. 1 is a flowchart of a thermodynamic diagram generating method based on trajectory data, which is performed by a system, a server, or a thermodynamic diagram generating device based on trajectory data with processing capabilities, according to an embodiment of the present invention. As shown in fig. 1, the method includes:

Step 110, acquiring track data and map data.

The track data is a sampling sequence with position and time information, and contains the space-time dynamic property of the researched object. Based on the analysis of the trajectory data, spatiotemporal distribution characteristics of the object under study may be obtained.

Step 120, storing the track data in the original format in the Hadoop platform distributed file system.

The original format of the track data is a txt file. When the track data is stored in HDFS, it is stored directly in txt format without any processing of its format. This improves the storage and management efficiency of massive track data and, in turn, the thermodynamic diagram generation efficiency.

In some embodiments, storing the track data in the original format in the Hadoop platform distributed file system includes: dividing the track data into a plurality of time slices, wherein each time slice comprises all track data in a preset time range; and, in the Hadoop platform distributed file system, storing the track data contained in the same time slice together in the original format, with the plurality of time slices stored adjacently in time order.

The HBase distributed database stores data in the form of tables. A table consists of rows and columns, and the columns are grouped into several column families. HBase is a NoSQL database and uses the Row Key as the primary key for retrieving records. When data is stored, rows are sorted and stored in the lexicographic order of their Row Keys. Each column in the table belongs to a column family, and the smallest storage unit is the cell; the data in a cell is untyped and is stored as a byte array. Thus, the original data needs to be preprocessed before it is stored.
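
One practical consequence of byte-wise Row Key ordering can be shown with a short Python sketch. The key layout used here (a zero-padded slice index followed by a zero-padded point index) and the function name are hypothetical and serve only to illustrate why numeric parts of a Row Key are usually padded to a fixed width.

```python
def make_row_key(slice_index: int, point_index: int) -> bytes:
    """Build a hypothetical Row Key: zero-padded slice index + point index."""
    return f"{slice_index:04d}_{point_index:08d}".encode("utf-8")


# Without padding, byte order and numeric order disagree: b"10" sorts before b"2".
unpadded = sorted(str(i).encode("utf-8") for i in (2, 10, 9))

# With fixed-width padding, lexicographic byte order matches numeric order.
padded = sorted(make_row_key(0, i) for i in (2, 10, 9))

print(unpadded)
print(padded)
```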

The original track data is preprocessed as follows: the track data is first stored in HDFS in its original format, and HBase is then used to store the cluster data by rows. The original track data is stored in a time-dimension-based storage mode, that is, a storage mode in which the time attribute takes priority. This mode makes it convenient to cluster spatial points, that is, to perform cluster analysis on the track data, and facilitates space-time-based mining and analysis of the track data. Other storage modes, such as storage based on vehicle tracks or storage based on spatial distribution, cannot effectively support the queries that such analysis requires, because track data within the same time period is not stored contiguously on the storage device, which generates a large number of I/O operations and reduces data access efficiency.

The embodiment of the invention uses a time-dimension-based storage mode for the original track data. Specifically, all track data is sorted by time and then divided into a plurality of time slices, each of which contains all track data within a preset time range. Track data belonging to the same time slice is stored together, and the time slices are arranged in time order, which ensures that the track data is stored adjacently in the storage space. For example, all track data in one day can be divided into one time slice per hour, giving 24 time slices: track data between 0:00 and 1:00, track data between 1:00 and 2:00, and so on up to track data between 23:00 and 24:00. The track data contained in each time slice is stored together; track data between 0:00 and 1:00 is stored adjacent to track data between 1:00 and 2:00, track data between 1:00 and 2:00 is stored adjacent to track data between 2:00 and 3:00, and so on, so that all 24 time slices are stored adjacently in time order.
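
The time slicing described above can be sketched in Python as follows. This is a minimal in-memory illustration rather than the embodiment's HDFS implementation; the function name, the (timestamp, payload) record shape, and the hourly default are assumptions made for the sketch.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def slice_by_time(records, slice_length=timedelta(hours=1)):
    """Group (timestamp, payload) records into consecutive time slices.

    Returns a list of (slice_start, records_in_slice) ordered by time; the
    records inside each slice are also sorted by timestamp.
    """
    buckets = defaultdict(list)
    seconds = slice_length.total_seconds()
    for ts, payload in records:
        midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
        index = int((ts - midnight).total_seconds() // seconds)
        buckets[midnight + index * slice_length].append((ts, payload))
    return [
        (start, sorted(buckets[start], key=lambda rec: rec[0]))
        for start in sorted(buckets)
    ]


records = [
    (datetime(2020, 10, 23, 1, 30), "b"),
    (datetime(2020, 10, 23, 0, 15), "a"),
    (datetime(2020, 10, 23, 1, 5), "c"),
]
for start, items in slice_by_time(records):
    print(start, [payload for _, payload in items])
```

Writing the slices out in the returned order keeps records from the same time range next to each other, which is the property the time-priority storage mode relies on.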

Fig. 2 is a schematic diagram of a track data storage mode according to an embodiment of the present invention. In HDFS, a data table (hereinafter referred to as a track data table) is constructed for each time slice. In the track data table, the column families may include the following: track data identity number ID, track data latitude LAT, track data longitude LON, date DATE, and time TIME. The record format of a piece of track data may be: ID1, LAT1, LON1, DATE1, TIME1. Each row in the table stores one piece of track data, and all track data contained in the time slice is arranged in the track data table in time order. Here, the track data ID indicates the subject to which the track data belongs; for example, when the track data is taxi track data, the ID indicates which taxi the track data comes from. That is, track data in the same track data table, and thus in the same time slice, may come from different subjects, i.e., different taxis. More precisely, track data is stored in HDFS based only on its time attribute, irrespective of which individual subject generated it.
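
A record in the ID, LAT, LON, DATE, TIME layout could be read from a txt file along the following lines. The comma delimiter, the date and time formats, the class and function names, and the sample values are all assumptions made for this sketch; the embodiment does not specify them.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class TrackPoint:
    subject_id: str      # ID column: which subject (e.g. which taxi) produced the point
    lat: float           # LAT column
    lon: float           # LON column
    timestamp: datetime  # DATE and TIME columns combined


def parse_track_line(line: str) -> TrackPoint:
    """Parse one 'ID, LAT, LON, DATE, TIME' line (assumed comma-separated)."""
    subject_id, lat, lon, date_str, time_str = [f.strip() for f in line.split(",")]
    # Assumed formats: DATE as YYYY-MM-DD and TIME as HH:MM:SS.
    timestamp = datetime.strptime(f"{date_str} {time_str}", "%Y-%m-%d %H:%M:%S")
    return TrackPoint(subject_id, float(lat), float(lon), timestamp)


point = parse_track_line("taxi_001, 39.9042, 116.4074, 2020-10-23, 08:15:30")
print(point)
```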

Step 130, clustering the track data to obtain cluster data.

Cluster processing analysis is generally a method for extracting information selectively from original data according to set cluster parameters and conditions, and is commonly used for classifying and simplifying the data.

In the step, the track data can be clustered, and the data quantity is reduced while the position characteristics of the track data are maintained, so that the thermodynamic diagram generation efficiency is improved. In addition, clustering analysis is carried out on the track data, so that the thermonuclear phenomenon of the data-intensive area can be optimized, and further the visualization effect of the thermodynamic diagram is improved. The embodiment of the invention does not directly utilize the original track data stored in the HDFS to generate the thermodynamic diagram, but clusters the track data, stores the clustered data in the HBase distributed database, and directly acquires the required clustered data from the HBase distributed database when generating the thermodynamic diagram. Based on this process, the generation efficiency of thermodynamic diagrams can be further improved.

In some embodiments, the map data has a plurality of zoom levels; clustering the track data to obtain clustered data, including: determining a plurality of groups of clustering parameters according to a plurality of zoom levels; and clustering the track data according to the multiple groups of clustering parameters to obtain multiple groups of clustering data corresponding to multiple zoom levels.

The existing thermodynamic diagram has low self-adaptive effect, when the zoom level is switched, the characteristic deformation of the position of the track data displayed by the thermodynamic diagram is larger, and the thermodynamic diagram with different zoom levels has the same color gradient, so that the data-dense area presents a thermonuclear phenomenon. Based on the above, the embodiment of the invention sets different clustering parameters for different zoom levels of map data, thereby obtaining a clustering result matched with the zoom levels, further obtaining corresponding clustering data according to the zoom level of the generated thermodynamic diagram for generating an actual thermodynamic diagram, optimizing the thermonuclear phenomenon of the generated thermodynamic diagram in a data-intensive area, ensuring finer position feature display and improving the visual effect.

Further, clustering the track data to obtain clustered data, including: determining a plurality of groups of clustering parameters according to a plurality of zoom levels; and clustering the track data contained in each time slice according to a plurality of groups of clustering parameters to obtain a plurality of groups of clustering data corresponding to a plurality of zoom levels for each time slice.

In some examples, each set of cluster parameters includes a scan radius; determining a plurality of sets of cluster parameters from the plurality of zoom levels comprises: determining a scan radius corresponding to each zoom level according to the zoom levels, wherein the scan radius corresponding to each zoom level decreases as the corresponding zoom level decreases.

Each set of cluster parameters includes a scan radius. That is, when performing cluster analysis on the trajectory data included in a certain time slice, the trajectory data included in each cluster formed must be distributed within the range of the scanning radius.

As the zoom level of the map data decreases, the number of tiles included in the map data decreases, and the larger the spatial range of the real geographic space corresponding to the unit area in the map data, the denser the distribution of the track data corresponding to the unit area in the map data. Therefore, when the zoom level is reduced, the scanning radius is reduced, so that the number of points contained in each cluster is reduced, the density degree of track data in each cluster is reduced, and the thermonuclear phenomenon of a local area is further improved, so that each cluster can reflect the position characteristics of the track data more accurately.

Specifically, the scan radius corresponding to each zoom level may be determined according to the spatial range that a single pixel of the map data at that zoom level actually covers in real geographic space. In map data of different zoom levels, the size of a single pixel is different. The lower the zoom level, the smaller the size of a single pixel and the smaller the spatial range that it actually covers in real geographic space; conversely, the larger the size of a single pixel, the larger the spatial range that it actually covers. For example, in some map data at a lower zoom level, a single pixel actually covers only 300 m in real geographic space, while in map data at a higher zoom level, a single pixel actually covers 1000 m. The spatial range actually covered by a single pixel at each zoom level can be used directly as the scan radius corresponding to that zoom level. Alternatively, that spatial range can be adjusted as needed and the adjusted value set as the scan radius corresponding to each zoom level. The embodiment of the present invention is not particularly limited in this respect.
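
Deriving the scan radius from per-pixel coverage, with an optional adjustment, might look like the following Python sketch. The function name, the adjustment factor, and the coverage table are assumptions: only the 300 m and 1000 m figures come from the example above, while the zoom level numbers and the intermediate values are placeholders.

```python
def scan_radii(pixel_coverage_m, adjustment=1.0):
    """Map each zoom level to a scan radius in meters.

    pixel_coverage_m: dict of zoom level -> ground distance covered by one
    pixel (meters). adjustment: scale factor applied to every level; 1.0 uses
    the pixel coverage directly as the scan radius.
    """
    if adjustment <= 0:
        raise ValueError("adjustment must be positive")
    return {
        level: coverage * adjustment
        for level, coverage in sorted(pixel_coverage_m.items())
    }


# Placeholder table: lower level -> 300 m, higher level -> 1000 m.
coverage = {11: 300.0, 12: 500.0, 13: 750.0, 14: 1000.0}
print(scan_radii(coverage))
print(scan_radii(coverage, adjustment=0.5))
```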

In some examples, each set of cluster parameters includes a minimum inclusion point number; determining a plurality of sets of cluster parameters according to the plurality of zoom levels further comprises: determining the minimum inclusion point number corresponding to each zoom level according to the plurality of zoom levels, wherein the minimum inclusion point number corresponding to each zoom level decreases as the corresponding zoom level decreases.

Each set of cluster parameters includes a minimum inclusion point number. That is, when cluster analysis is performed on the trajectory data included in a certain time slice, the amount of trajectory data included in each cluster formed must not be smaller than the minimum inclusion point number. It should be understood that when the scan radius and the minimum inclusion point number are both used as clustering parameters, the clustering process must satisfy both constraints at the same time.

As the zoom level of the map data decreases, the number of tiles included in the map data decreases, the spatial range of the real geographic space corresponding to a unit area in the map data becomes larger, and the distribution of the track data corresponding to a unit area in the map data becomes denser. Therefore, when the zoom level is reduced, the minimum inclusion point number is reduced, so that the number of points included in each cluster is reduced and the density of the track data in each cluster is reduced. This alleviates the thermonuclear phenomenon in local areas, so that each cluster reflects the position characteristics of the track data more accurately. The minimum inclusion point number corresponding to each zoom level may be set as required, which is not particularly limited in the embodiment of the present invention.
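As a hedged illustration, the two clustering parameters can be kept together per zoom level and checked against the rule that neither increases when the zoom level is reduced. The class name, function name and all numeric values below are assumptions; the embodiment leaves the actual values open.

```python
from typing import Dict, NamedTuple


class ClusterParams(NamedTuple):
    scan_radius_m: float  # scan radius in metres
    min_points: int       # minimum inclusion point number


def check_monotonic(params: Dict[int, ClusterParams]) -> bool:
    """True if neither parameter increases when the zoom level decreases."""
    levels = sorted(params)
    return all(
        params[lo].scan_radius_m <= params[hi].scan_radius_m
        and params[lo].min_points <= params[hi].min_points
        for lo, hi in zip(levels, levels[1:])
    )


# Illustrative values only.
PARAMS = {
    11: ClusterParams(300.0, 3),
    12: ClusterParams(600.0, 5),
    13: ClusterParams(1000.0, 8),
}
```

A check of this kind is only a convenience for catching a parameter table that contradicts the stated rule; it is not part of the described method.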

For any time slice, the track data contained in the time slice are clustered based on the plurality of sets of clustering parameters corresponding to the plurality of zoom levels, yielding a plurality of sets of cluster data. In some embodiments, each set of cluster data includes the center coordinates and influence values of a plurality of clusters and the coordinates and influence values of a plurality of noise points. A noise point may be understood as a discrete point, i.e. a single piece of trajectory data that is not included in any cluster. Noise points also reflect the position distribution of the trajectory data, so they are taken into account when the thermodynamic diagram is drawn.

In some examples, clustering of the trajectory data is implemented based on a DBScan algorithm. Common clustering algorithms include DBScan algorithm, K-means algorithm, and the like. Through comparison of different clustering algorithms, the DBScan algorithm has the following advantages: (1) the requirement on the shape of the data set is low; (2) abnormal points in the data can be found; (3) the number of clusters after clustering does not need to be set. Thus, based on the property that the DBScan algorithm is applicable to dense data sets of arbitrary shape, embodiments of the present invention select the DBScan algorithm to cluster the trajectory data.

Specifically, for all the trajectory data, the following procedure is adopted to realize cluster analysis:

(1) Firstly, all track data are preprocessed, and abnormal points are removed.

(2) A plurality of sets of clustering parameters are determined according to the plurality of zoom levels, wherein the clustering parameters comprise a scanning radius and a minimum inclusion point number.

(3) The trajectory data contained in each time slice is clustered based on the DBScan algorithm. For any time slice, multiple sets of cluster data may be obtained for multiple zoom levels. It is assumed that the cluster data corresponding to any zoom level includes n clusters and m noise points.

(4) For any one cluster, the cluster center point coordinates (x, y) and the influence value count (see formula (1)) are calculated using the trajectory data contained in the cluster. Because the noise points are all single coordinate points, the influence of the noise points can be directly assigned as 1.

Wherein n is the number of track points in a cluster, and x_i and y_i are the longitude and latitude of the i-th track point in the cluster.
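The procedure above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the clustering is a plain quadratic-time DBSCAN written out by hand, distances are taken as planar distances on the raw coordinates, the cluster center is assumed to be the arithmetic mean of the member coordinates, and the influence value of a cluster is assumed to be its point count n (consistent with a noise point being assigned 1). The names `dbscan`, `summarise` and `NOISE` are hypothetical.

```python
from math import hypot
from typing import List, Optional, Tuple

Point = Tuple[float, float]
NOISE = -1


def dbscan(points: List[Point], eps: float, min_pts: int) -> List[int]:
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels: List[Optional[int]] = [None] * len(points)

    def neighbours(i: int) -> List[int]:
        xi, yi = points[i]
        return [j for j, (xj, yj) in enumerate(points)
                if hypot(xi - xj, yi - yj) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE          # may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point joins the cluster
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nj = neighbours(j)
            if len(nj) >= min_pts:     # j is a core point: keep expanding
                queue.extend(nj)
    return labels


def summarise(points: List[Point], labels: List[int]):
    """Return (x, y, count) records: cluster centers first, then noise points."""
    groups = {}
    for p, lab in zip(points, labels):
        if lab != NOISE:
            groups.setdefault(lab, []).append(p)
    rows = []
    for members in groups.values():
        n = len(members)
        rows.append((sum(x for x, _ in members) / n,
                     sum(y for _, y in members) / n, n))
    rows += [(x, y, 1) for (x, y), lab in zip(points, labels) if lab == NOISE]
    return rows
```

Running `summarise(points, dbscan(points, eps, min_pts))` once per set of clustering parameters would give one set of cluster data per zoom level. For real longitude and latitude values a projected or great-circle distance would be more appropriate than the planar distance assumed here.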

Step 140, storing the map data and the cluster data in an HBase distributed database.

In some embodiments, storing the cluster data in the HBase distributed database comprises: constructing a separate cluster data table for each set of cluster data, i.e. one table for each zoom level of each time slice. On this basis, when a thermodynamic diagram needs to be generated, the time range and the zoom level of the thermodynamic diagram are determined first, and the cluster data table holding the cluster data for the corresponding zoom level of the corresponding time slice is then queried directly from the HBase distributed database to obtain the corresponding cluster data. That is, the embodiment of the invention improves the efficiency of acquiring the relevant cluster data from the HBase distributed database, thereby improving the efficiency of generating thermodynamic diagrams.

The storage mode of the cluster data is shown in table 1. The table mainly stores the center coordinates and influence values of the clusters obtained by clustering, as well as the coordinates and influence values of the noise points. The Row Key is a sequentially assigned integer, and the column family comprises 4 columns, namely LAT, LNG, COUNT and PROPERTIES. The first three columns store the latitude, longitude and influence value, respectively, and the PROPERTIES column serves as a supplementary column for other explanatory or auxiliary information.

Table 1 cluster data storage schema
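The row layout described for table 1 can be sketched as plain Python dictionaries. This is an illustration only: it does not call any HBase client, the table naming scheme is hypothetical, and starting the Row Key at 1 is an assumption.

```python
from typing import Dict, Iterable, List, Tuple


def cluster_table_name(time_slice: str, zoom_level: int) -> str:
    """Hypothetical naming scheme: one table per time slice and zoom level."""
    return f"cluster_{time_slice}_z{zoom_level}"


def to_rows(records: Iterable[Tuple[float, float, int]]) -> List[Dict[str, str]]:
    """Turn (lng, lat, count) records into rows keyed by a running integer."""
    rows = []
    for key, (lng, lat, count) in enumerate(records, start=1):
        rows.append({
            "ROW_KEY": str(key),   # sequential integer (assumed to start at 1)
            "LAT": repr(lat),
            "LNG": repr(lng),
            "COUNT": str(count),   # influence value
            "PROPERTIES": "",      # free-form supplementary information
        })
    return rows
```

Cluster centers and noise points share the same four columns, so a noise point is simply a row whose COUNT is 1.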

Map data is typically raster data having a plurality of zoom levels. The map data at each zoom level is composed of a plurality of tiles, and the number of tiles gradually increases as the zoom level increases. In order to store and query the map data of each zoom level, the tiles in the map data of each zoom level need to be encoded according to a certain rule. In some embodiments, tiles may be encoded in the order in which they are naturally arranged in the display state of the map data. Fig. 3 is a schematic diagram of map data in a display state according to an embodiment of the present invention. As shown in fig. 3, the map data is composed of 12 tiles encoded as 1-12, the tiles being encoded sequentially in a top-to-bottom, left-to-right manner. With this encoding, the 12 tiles included in the map data are stored sequentially in the storage space, i.e. tiles with adjacent codes are stored at adjacent physical locations, and tiles whose codes are not adjacent are stored at non-adjacent physical locations. However, this encoding scheme impairs the reading efficiency of the map data. As shown in fig. 3, the tiles in the screen display area (the area defined by the dashed box) have the codes 7, 8, 11 and 12. These 4 tiles are adjacent in the screen display area but separated in physical storage, so both the query time and the read time increase, which reduces the efficiency of thermodynamic diagram generation.
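The effect described for fig. 3 can be reproduced with a small Python sketch. It assumes the 12 tiles are laid out as 3 rows of 4 columns and numbered row by row; the figure itself is not available here, so this layout is an assumption that happens to yield the codes 7, 8, 11 and 12 for the dashed window. All function names are hypothetical.

```python
def row_major_code(row: int, col: int, cols: int) -> int:
    """1-based code of the tile at (row, col) in a grid with `cols` columns."""
    return row * cols + col + 1


def window_codes(top: int, left: int, height: int, width: int, cols: int):
    """Codes of the tiles inside a rectangular screen window."""
    return [row_major_code(r, c, cols)
            for r in range(top, top + height)
            for c in range(left, left + width)]


def contiguous_runs(codes) -> int:
    """Number of runs of consecutive codes, i.e. separate sequential reads."""
    codes = sorted(codes)
    return 1 + sum(1 for a, b in zip(codes, codes[1:]) if b != a + 1)


# 2 x 2 window in the lower-right corner of the assumed 3 x 4 grid.
visible = window_codes(top=1, left=2, height=2, width=2, cols=4)
```

Under this layout the four visible tiles fall into two separate runs of codes, which is the storage gap the paragraph above refers to.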

In order to reduce the query time, the physical storage locations of tiles that are adjacent to each other in the display state should be as close together as possible, which reduces the data reading time and improves efficiency. The map data adopted by the embodiment of the invention is map tile data based on a quadtree model. Since the HBase distributed database adopts a column-oriented storage manner, 4 tiles adjacent to each other in the display state are stored in the same row. Specifically, storing map data in an HBase distributed database includes: constructing a map data table for the map data at each zoom level, and storing 4 tiles that are adjacent to each other in the display state in the same row of the corresponding map data table. Here, "adjacent to each other in the display state" means that each of the 4 tiles adjoins the others, i.e. the 4 tiles form a square region. 4 tiles arranged in the same row in the lateral direction, or 4 tiles arranged in the same column in the longitudinal direction, do not count as "adjacent to each other in the display state", because in these two cases only neighboring tiles adjoin each other, while some pairs of tiles are separated by other tiles.

In order to achieve orderly storage and quick query of map data, and to cause 4 tiles adjacent to each other in a display state to be stored in the same row in a corresponding map data table, it is necessary to encode tiles included in the map data. In some examples, the process of encoding tiles contained in map data for each zoom level is as follows:

(1) Calculate the total order m of the map data at each zoom level according to the number n of tiles contained in each row of the map data at that zoom level, wherein the angle brackets in the formula indicate rounding up.

(2) Determine the relation between the number n of tiles and the total order m. When n=2m, the map data at each zoom level is divided into m×m square sub-grids, each of which is composed of 4 tiles.

(3) Fill the m×m square sub-grids based on the Z-shaped filling curve. Here, each square sub-grid may be filled based on the Z-shaped filling curve, and the filling curves of the individual square sub-grids are then joined by connecting lines, so that all sub-grids are filled.

(4) Encode the tiles according to their filling order.

(5) Construct a map data table for the map data at each zoom level, and store the tiles sequentially in the corresponding map data table based on their codes, wherein the 4 tiles belonging to the same square sub-grid are stored in the same row of the corresponding map data table.
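The steps above can be sketched in Python for the case n=2m. The sketch makes several assumptions that the text does not fix: the Z-shaped filling order is taken to be the usual bit-interleaved (Morton) order with the column index in the low bit, tile codes and Row Keys start at 1, and the order in which the sub-grids are connected for values of m that are not a power of two simply follows the sorted Morton codes. The function names are hypothetical.

```python
def morton(col: int, row: int) -> int:
    """Interleave the bits of col (even positions) and row (odd positions)."""
    code, bit = 0, 0
    while col >> bit or row >> bit:
        code |= ((col >> bit) & 1) << (2 * bit)
        code |= ((row >> bit) & 1) << (2 * bit + 1)
        bit += 1
    return code


def encode_even_grid(n: int):
    """Map (col, row) -> 1-based tile code for an n x n grid with n = 2m."""
    if n < 2 or n % 2:
        raise ValueError("this sketch only covers n = 2m")
    tiles = sorted(((c, r) for r in range(n) for c in range(n)),
                   key=lambda t: morton(*t))
    return {tile: code for code, tile in enumerate(tiles, start=1)}


def table_rows(codes):
    """Group tiles into rows of four consecutive codes (one sub-grid each)."""
    rows = {}
    for tile, code in codes.items():
        rows.setdefault((code - 1) // 4 + 1, []).append(tile)
    return rows
```

Because the four tiles of an aligned 2×2 block share all but the lowest bit of each index, their Morton codes are consecutive, so every group of four consecutive tile codes is exactly one square sub-grid and can be written to one table row.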

Fig. 4 (a) is a schematic diagram of a coding manner of map data when n=2 according to an embodiment of the present invention; fig. 4 (b) is a schematic diagram of a coding manner of map data when n=4 according to an embodiment of the present invention. As shown in fig. 4 (a), when n=2, m takes a value of 1, the map data is filled by a 1-order Z-type filling curve, and the tiles are encoded according to the filling order. As shown in fig. 4 (b), when n=4, the value of m is 2, the map data is filled by a 2-order Z-type filling curve, and the tiles are encoded according to the filling order.

Because the screen display area is limited, the number of tiles in the screen display area may not match the number required by the Z-shaped filling curve. The present embodiment therefore also provides an encoding method for the case where the number of tiles does not support Z-shaped filling curve encoding. In some examples, the process of encoding the tiles contained in the map data at each zoom level is as follows:

(2) Determine the relation between the number n of tiles and the total order m. When n-2m=1, the map data at each zoom level is divided into m×m square sub-grids and n edge sub-grids, wherein each square sub-grid is composed of 4 tiles, each of the 2m edge sub-grids adjacent to the square sub-grids is composed of 2 tiles, and the 1 edge sub-grid not adjacent to the square sub-grids is composed of 1 tile.

(3) Fill the m×m square sub-grids based on a Z-shaped filling curve, fill the 2m edge sub-grids based on a linear filling curve, connect the filling curves of the square sub-grids and of the 2m edge sub-grids into a whole, and extend the resulting curve to the 1 edge sub-grid that is not adjacent to the square sub-grids.

(4) Encode the tiles according to their filling order.

(5) Construct a map data table for the map data at each zoom level, and store the tiles sequentially in the corresponding map data table based on their codes, wherein the 4 tiles belonging to the same square sub-grid are stored in the same row of the corresponding map data table, and the tiles belonging to the same edge sub-grid are stored in the same row of the corresponding map data table.

Fig. 4 (c) is a schematic diagram of a map data encoding process when n=3 according to an embodiment of the present invention. When n=3, the value of m is 1, i.e. 1 sub-grid of the map data can be filled by a 1st-order Z-shaped filling curve, and the other sub-grids need to be filled in other ways. As shown in fig. 4 (c), the map data is divided into 1 square sub-grid and 3 edge sub-grids. The 2 edge sub-grids adjacent to the square sub-grid (i.e., the sub-grids numbered 2 and 3) are each composed of 2 tiles, and the edge sub-grid not adjacent to the square sub-grid (i.e., the sub-grid numbered 4) is composed of 1 tile. The square sub-grid is filled based on a Z-shaped filling curve, and the sub-grids numbered 2 and 3 are filled based on linear curves; the filling curves of the square sub-grid and of the 2 edge sub-grids are connected into a whole and then extended to the sub-grid numbered 4, so that all sub-grids are filled. The tiles are encoded according to the filling order. Fig. 4 (d) is a schematic diagram of a coding manner of map data when n=3 according to an embodiment of the present invention. The encoding of the 9 tiles contained in the map data is shown in fig. 4 (d).
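The partition for n=2m+1 can be checked with a short Python sketch. It only counts tiles per sub-grid and hands out consecutive codes; it assumes that the square sub-grids are filled first, then the 2-tile edge sub-grids, then the single-tile sub-grid, which matches the n=3 example but is an assumption for larger n. The function names are hypothetical.

```python
def subgrid_sizes(n: int):
    """Tile counts per sub-grid for n = 2m + 1 tiles per row."""
    if n < 3 or n % 2 == 0:
        raise ValueError("this sketch only covers n = 2m + 1")
    m = n // 2
    # m*m square sub-grids, 2m two-tile edge sub-grids, 1 single-tile sub-grid
    return [4] * (m * m) + [2] * (2 * m) + [1]


def rows_by_subgrid(n: int):
    """Row Key -> consecutive tile codes, numbering sub-grids from 1."""
    rows, next_code = {}, 1
    for key, size in enumerate(subgrid_sizes(n), start=1):
        rows[key] = list(range(next_code, next_code + size))
        next_code += size
    return rows
```

The sizes always add up to the full grid, since 4m² + 2·2m + 1 = (2m+1)² = n², so no tile is left without a sub-grid.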

The storage mode of the map data provided by the embodiment of the invention is shown in table 2. In table 2, the primary key Row Key corresponds to the number of the sub-grid, and the number of the sub-grid is determined by the filling order of the sub-grids. The column family contains at least four columns for storing the tiles that belong to the same sub-grid, and the storage order of the tiles is consistent with the coding order of the tiles. The columns are named with tile numbers, and a query for a particular tile may be carried out according to the XY number of the tile. Remarks or other information, if any, are stored in the remark column of each table. Taking the map data corresponding to fig. 4 (c) and 4 (d) as an example, the map data includes 4 sub-grids numbered 1, 2, 3 and 4, and the Row Key is determined by the numbers of the 4 sub-grids. The tiles included in each sub-grid are stored in the corresponding row: the sub-grid numbered 1 includes the 4 tiles numbered 1, 2, 3 and 4, which are stored sequentially in the same row; the sub-grid numbered 2 includes only the 2 tiles numbered 5 and 6, which are stored sequentially in the next row; and the sub-grid numbered 4 includes only the 1 tile numbered 9, which is stored in a row of its own.

Table 2 map data storage mode

It should be understood that the tiles contained in the map data, and therefore the number of tiles, change from one zoom level to another. The map data at each zoom level therefore needs to be encoded separately and is finally stored, according to its encoding, in its own map data table. Thus, when generating the thermodynamic diagram, the map data of the corresponding zoom level can be obtained by querying the map data table for that zoom level from the HBase distributed database.

Step 150, acquiring map data and cluster data corresponding to the thermodynamic diagram to be generated from the HBase distributed database.

In some embodiments, obtaining map data corresponding to a thermodynamic diagram to be generated and cluster data from an HBase distributed database comprises: determining a zoom level of map data corresponding to the thermodynamic diagram to be generated according to the zoom level of the thermodynamic diagram to be generated; map data and cluster data at the corresponding zoom level are obtained from the HBase distributed database. Based on the method, the obtained cluster data are matched with the zoom level of the map data, so that the problem of large deformation of the track data position characteristics of the thermodynamic diagram under different zoom levels can be solved, and the thermonuclear phenomenon of the data-intensive area is optimized.

In some embodiments, obtaining map data corresponding to a thermodynamic diagram to be generated and cluster data from an HBase distributed database comprises: determining a zoom level of map data corresponding to the thermodynamic diagram to be generated according to the zoom level of the thermodynamic diagram to be generated; determining a time slice to which cluster data corresponding to the thermodynamic diagram to be generated belongs according to the time range of the thermodynamic diagram to be generated; map data at a corresponding zoom level and cluster data at a corresponding zoom level at a corresponding time slice are obtained from the HBase distributed database.

When clustering is performed on the trajectory data included in each time slice, cluster data is obtained for the trajectory data of each time slice. When the thermodynamic diagram is drawn, the time range of the thermodynamic diagram must be determined in addition to its zoom level, and the time slice to which the required cluster data belongs is determined from this time range. For example, if a thermodynamic diagram reflecting taxi trip heat on a certain natural day is to be drawn, the cluster data of that natural day is acquired from the HBase distributed database.
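A hedged Python sketch of this lookup is given below. It assumes that each time slice is one natural day and that tables are named after the time slice and zoom level; neither the naming scheme nor the function names come from the embodiment, and no HBase call is made.

```python
from datetime import date, timedelta
from typing import List, Tuple


def day_slices(start: date, end: date) -> List[str]:
    """Natural-day time slices covering [start, end], formatted YYYYMMDD."""
    if end < start:
        raise ValueError("end precedes start")
    days = (end - start).days
    return [(start + timedelta(days=i)).strftime("%Y%m%d")
            for i in range(days + 1)]


def tables_for_request(zoom_level: int, start: date,
                       end: date) -> Tuple[str, List[str]]:
    """Return the map table and the cluster tables a request would read."""
    map_table = f"map_z{zoom_level}"  # hypothetical table name
    cluster_tables = [f"cluster_{s}_z{zoom_level}"
                      for s in day_slices(start, end)]
    return map_table, cluster_tables
```

Since both table names are derived from the request alone, the tables to read are known before any data is scanned.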

Step 160, generating a thermodynamic diagram according to the acquired map data and cluster data.

In some embodiments, the cluster data includes center coordinates and influence values of the clusters and coordinates and influence values of the noise points. According to the influence values of the clusters and the noise points, gray values of areas covered by the clusters and the noise points can be calculated, and then the central coordinates of the clusters and the noise points are combined, so that a thermodynamic diagram can be generated on map data. Here, the gray value is determined based on the center coordinates and the influence values of the clusters and the coordinates and the influence values of the noise points, and the process of generating the thermodynamic diagram in combination with the map data is implemented by a conventional method in the field of thermodynamic diagram generation, which is not described herein.
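Since the gray value computation is left to conventional methods, the following Python sketch shows only one common possibility and should not be read as the method of the embodiment. It assumes the coordinates have already been converted to pixel positions, uses a linear fall-off kernel of a chosen pixel radius, weights each record by its influence value, and scales the result to 0-255. The function name and parameters are hypothetical.

```python
from math import hypot


def gray_grid(records, width, height, radius):
    """Accumulate influence values on a pixel grid and scale to 0..255.

    records: iterable of (px, py, count) in pixel coordinates.
    Each record adds count * (1 - d / radius) to pixels closer than radius.
    """
    heat = [[0.0] * width for _ in range(height)]
    for px, py, count in records:
        y_lo, y_hi = max(0, int(py - radius)), min(height, int(py + radius) + 1)
        x_lo, x_hi = max(0, int(px - radius)), min(width, int(px + radius) + 1)
        for y in range(y_lo, y_hi):
            for x in range(x_lo, x_hi):
                d = hypot(x - px, y - py)
                if d < radius:
                    heat[y][x] += count * (1.0 - d / radius)
    peak = max(max(row) for row in heat)
    if peak == 0:
        return [[0] * width for _ in range(height)]
    return [[round(255 * v / peak) for v in row] for row in heat]
```

The resulting gray values would then be mapped to a color ramp and overlaid on the map tiles; a cluster with a large influence value dominates its neighborhood, while a noise point contributes a faint spot.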

In summary, the embodiment of the invention provides a thermodynamic diagram generating method based on track data, which includes the steps of firstly acquiring track data and map data; storing the track data in a Hadoop platform distributed file system in an original format; clustering the track data to obtain clustered data; storing the map data and the cluster data in an HBase distributed database; acquiring map data and cluster data corresponding to a thermodynamic diagram to be generated from the HBase distributed database; and generating a thermodynamic diagram according to the acquired map data and the cluster data. Based on the method and the device, the thermodynamic diagram visualization efficiency can be improved while the position characteristics of the track data are maintained, the diagram forming time is shortened, the problem of blocking caused by user interaction is solved, and the user experience is improved.

The following provides a specific implementation scenario to further illustrate the thermodynamic diagram generating method based on the trajectory data provided in the embodiment of the present invention.

Fig. 5 is a schematic diagram of a storage framework for track data and map data according to an embodiment of the present invention. As shown in fig. 5, the track data and map data storage framework constructed on the HBase distributed database is composed of 5 parts. From bottom to top, these are: 1) a Hadoop storage framework built on a PC cluster; 2) an HBase cloud data storage layer that depends on HDFS; 3) a data operation layer for querying data; 4) a Web service layer for receiving requests and retrieving data; 5) a Web browser based presentation layer. The HBase cloud data storage layer is the HBase distributed database, which is used for storing the map data and the cluster data, and the HDFS is built into the Hadoop storage framework and is used for storing the track data. The data operation layer implements operations on the track data, map data and cluster data and may comprise a map data processing module and a track data processing module, wherein the map data processing module comprises a map code conversion module and a map data interface, and the track data processing module comprises an original track data interface and a cluster data interface.

Specifically, the embodiment of the invention builds a Hadoop cluster consisting of 5 computers, wherein each node has 8 GB of memory, a 1 TB hard disk and an i7 CPU. The software configuration is Hadoop version 2.7.6, the distributed database HBase version 2.1.9, ZooKeeper version 3.4.14, Tomcat 7.0.90 as the web server, and Java version 1.8.0.

For thermodynamic diagram visualization, the embodiment of the invention adopts a Web GIS (Geographic Information System) visualization technology based on a B/S architecture. Web GIS can be understood as a geographic information system that runs in a Web environment.

Fig. 6 is a flowchart of loading map data according to an embodiment of the present invention. The loading process of map data is described with reference to the storage framework of track data and map data shown in fig. 5 and to fig. 6. First, the web browser determines the required tile numbers according to the screen display area, i.e. it determines the query condition, and sends a request to the web map server. The web map server converts the tile numbers, through the map code conversion module, into the codes required for querying the HBase cloud data storage layer, i.e. the Row Keys. It then interacts with the HBase cloud data storage layer through the map data interface, queries the corresponding map data table according to the Row Keys, and locates the cells in which the tiles are stored by means of the tile numbers. Finally, the queried map data is returned to the web browser. This completes the acquisition of the map data required for the thermodynamic diagram to be generated.

The map data used in the embodiment of the present invention is 18-level map data (i.e., 18 zoom levels) for the Beijing area. The map data of one zoom level is selected for a stress query test in order to investigate the loading efficiency of the map data. Fig. 7 is a comparison chart of loading times of map data provided by an embodiment of the present invention. The two curves in fig. 7 are the loading time curves of map data for two coding modes. The first coding mode encodes the map data so that 4 tiles adjacent to each other in the display state are stored in the same row of the corresponding map data table (for example, the coding modes illustrated in fig. 4 (a) to 4 (d)). The second coding mode encodes the tiles according to their natural arrangement order in the display state of the map data (for example, the coding mode illustrated in fig. 3). Since the coding mode of the map data determines its storage mode, the loading time comparison shown in fig. 7 is in effect a comparison of the loading efficiency of map data under the two storage modes.

As can be seen from fig. 7, the loading times of the map data for the two coding modes differ little when the number of requests is small, but the difference becomes more and more apparent as the number of requests increases. With the first coding mode, in which 4 tiles adjacent to each other in the display state are stored in the same row of the corresponding map data table, 100 loads complete in less than 2000 ms, i.e. on average about 50 complete loading processes per second (under 20 ms per load), which makes it possible to cope with the high concurrency that occurs during visual interaction. It can be seen that, with this coding mode, the overall average loading time, measured from sending the request to the HBase distributed database until the data is returned to the Web browser, tends to be shorter, and the real-time loading rate is relatively stable.

The following provides another specific implementation scenario to further illustrate the thermodynamic diagram generating method based on the trajectory data provided in the embodiment of the present invention.

The embodiment of the invention builds a Hadoop cluster consisting of 5 computers, wherein each node has 8 GB of memory, a 1 TB hard disk and an i7 CPU. The software configuration is Hadoop version 2.7.6, the distributed database HBase version 2.1.9, ZooKeeper version 3.4.14, Tomcat 7.0.90 as the web server, and Java version 1.8.0.

The map data used in the embodiment of the invention is 18-level map data (i.e., 18 zoom levels) for the Beijing area. The track data is 24 hours of taxi track data for Beijing, comprising more than 14.4 million records. The track data of one time slice is selected at random for visualization.

Fig. 8 is a flowchart of a thermodynamic diagram generating method based on trajectory data according to an embodiment of the present invention. The thermodynamic diagram generation process is described with reference to the storage framework of track data and map data shown in fig. 5 and to fig. 6 and 8. The map data is first loaded according to the loading process shown in fig. 6. This process is the same as the map data loading process in the previous implementation scenario and is not described again here. The loaded map data is the map data stored in the HBase distributed database under the coding mode in which 4 tiles adjacent to each other in the display state are stored in the same row of the corresponding map data table. Then, based on the visualization process shown in fig. 8, a thermodynamic diagram is drawn from the acquired cluster data and map data. Specifically, the taxi track data is stored in the HDFS and clustered based on the DBScan algorithm. For the clusters and noise points obtained by clustering, the center coordinates and influence values of the clusters and the influence values of the noise points are calculated by traversing all clusters and noise points one by one, and the cluster data is stored in the database once the calculation is complete. Each time the clustering parameters are changed, a new round of clustering is performed on the track data and the newly obtained clusters and noise points are traversed again to obtain new cluster data, so that cluster data suited to each of the plurality of zoom levels of the map data is obtained.
Once the cluster data has been stored, the loading process of the cluster data can be executed as required for thermodynamic diagram generation: the web browser sends a data request, and the web map server queries the HBase distributed database for the corresponding data according to the query condition and returns it to the web browser. The web browser then calculates gray values according to the influence values in the cluster data and draws the thermodynamic diagram. For comparison with the thermodynamic diagram generation process that uses cluster data, the embodiment of the invention also provides a thermodynamic diagram generation process based on the original track data. In this case, the original track data is obtained directly from the HDFS, and the thermodynamic diagram is drawn from the original track data and the loaded map data.

Fig. 9 (a) is a thermodynamic diagram generated using map data with a zoom level of 11 and original track data that has not been clustered, according to an embodiment of the present invention; fig. 9 (b) is a thermodynamic diagram generated using map data with a zoom level of 11 and cluster data; fig. 9 (c) is a thermodynamic diagram generated using map data with a zoom level of 12 and original track data that has not been clustered; fig. 9 (d) is a thermodynamic diagram generated using map data with a zoom level of 12 and cluster data; fig. 9 (e) is a thermodynamic diagram generated using map data with a zoom level of 13 and original track data that has not been clustered; fig. 9 (f) is a thermodynamic diagram generated using map data with a zoom level of 13 and cluster data; fig. 9 (g) is a thermodynamic diagram generated using map data with a zoom level of 14 and original track data that has not been clustered; fig. 9 (h) is a thermodynamic diagram generated using map data with a zoom level of 14 and cluster data. The "zoom" label marked in fig. 9 (a) to 9 (h) represents the zoom level of the thermodynamic diagram, and the "time" label represents the time taken to generate the thermodynamic diagram.
Here, for the thermodynamic diagram generation process using cluster data, the time taken to generate the thermodynamic diagram is the time taken to complete the following process: the Web browser sends a data request, the Web map server queries the HBase distributed database for the corresponding data according to the query condition and returns the data to the Web browser, and the Web browser calculates gray values according to the influence values in the cluster data and draws the thermodynamic diagram. For the thermodynamic diagram generation process that directly uses original track data that has not been clustered, the time taken to generate the thermodynamic diagram is the time taken to complete the following process: the Web browser sends a data request, the Web map server queries the HDFS for the corresponding data according to the query condition and returns the data to the Web browser, and the Web browser draws the thermodynamic diagram from the acquired original track data.

The two thermodynamic diagrams with the same zoom level are compared as a group, i.e. fig. 9 (a) and 9 (b), fig. 9 (c) and 9 (d), fig. 9 (e) and 9 (f), and fig. 9 (g) and 9 (h) form 4 groups. The comparison within each group shows that the thermodynamic diagram generated from original track data that has not been clustered suffers from a more serious thermonuclear phenomenon, a larger deformation of the position characteristics, a longer generation time and a poorer visualization effect. Correspondingly, the thermodynamic diagram generated from cluster data alleviates the thermonuclear phenomenon in data-dense regions and improves the overall visualization effect. In particular, when the thermodynamic diagram is generated from cluster data, different clustering parameters are designed for different zoom levels, so that the resulting cluster data suits the zoom level of the map data, which further helps to alleviate the thermonuclear phenomenon in data-dense areas. Fig. 10 is a comparison chart of the time taken to generate thermodynamic diagrams according to an embodiment of the present invention. As can be seen from fig. 10, when the zoom level is low, for example level 11 and level 12, there is a gap between the time taken to generate the thermodynamic diagram from original track data that has not been clustered and the time taken to generate it from cluster data, and when the zoom level is high, the visualization loading time with cluster data is markedly reduced.

In summary, the thermodynamic diagram generation method based on track data provided by the embodiment of the invention retains the position features of the data while improving the visualization efficiency, shortening the diagram forming time, reducing the stuttering caused by user interaction and improving the user interaction experience. The method can also realize efficient management and storage of massive track data and obtain a better drawing effect.

In addition, the thermodynamic diagram generation method based on track data provided by the embodiment of the invention designs a track big data storage scheme based on the HBase platform, building on the high reliability, high scalability, high efficiency and high fault tolerance of the Hadoop framework, and has good universality in the fields of spatial data storage, visualization and extension. By processing the data used for drawing the thermodynamic diagram, the thermodynamic diagram generation efficiency is improved, and key technical support is provided for track data mining and analysis based on time attributes.

Fig. 11 shows a schematic structural diagram of a thermodynamic diagram generating device based on trajectory data according to an embodiment of the present invention. As shown in fig. 11, the thermodynamic diagram generation device 1100 based on trajectory data includes: a first acquisition module 1110, configured to acquire track data and map data; a first storage module 1120, configured to store the track data in an original format in a Hadoop platform distributed file system; a clustering module 1130, configured to cluster the track data to obtain cluster data; a second storage module 1140, configured to store the map data and the cluster data in an HBase distributed database; a second acquisition module 1150, configured to acquire map data and cluster data corresponding to a thermodynamic diagram to be generated from the HBase distributed database; and a generating module 1160, configured to generate a thermodynamic diagram according to the acquired map data and cluster data.

In some embodiments, the first storage module is specifically configured to: dividing the track data into a plurality of time slices, wherein each time slice comprises all track data in a preset time range; in the Hadoop platform distributed file system, the track data contained in the same time slice is stored together in the original format, and the time slices are stored adjacently in time order.
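The time slicing described above can be sketched as follows. This is only a minimal illustration in Python: the record layout `(timestamp, lon, lat)`, the function name `slice_tracks` and the one-hour slice length are assumptions made for the example and are not taken from the embodiment, and the actual write to the Hadoop platform distributed file system is omitted.

```python
from collections import defaultdict
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def slice_tracks(points, slice_length=timedelta(hours=1)):
    """Group (timestamp, lon, lat) records into fixed-length time slices.

    Returns a list of (slice_start, records) pairs ordered by slice_start,
    so that neighbouring slices are also neighbours in the output.
    The one-hour default is a hypothetical choice.
    """
    buckets = defaultdict(list)
    for ts, lon, lat in points:
        index = (ts - EPOCH) // slice_length  # integer index of the slice
        buckets[index].append((ts, lon, lat))
    return [
        (EPOCH + index * slice_length, sorted(buckets[index]))
        for index in sorted(buckets)
    ]
```

Each returned pair could then be written as one contiguous block (for example one file per slice), which keeps all records of a preset time range together.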

In some embodiments, the map data has a plurality of zoom levels; the clustering module comprises: the first determining unit is used for determining a plurality of groups of clustering parameters according to the plurality of zoom levels; the clustering unit is used for clustering the track data contained in each time slice according to the multiple groups of clustering parameters to obtain multiple groups of clustering data corresponding to the multiple zoom levels for each time slice; the second acquisition module includes: a second determining unit, configured to determine a zoom level of map data corresponding to the thermodynamic diagram to be generated according to the zoom level of the thermodynamic diagram to be generated; a third determining unit, configured to determine, according to the time range of the thermodynamic diagram to be generated, a time slice to which cluster data corresponding to the thermodynamic diagram to be generated belongs; and the acquisition unit is used for acquiring map data under the corresponding zoom level and cluster data of the corresponding zoom level under the corresponding time slice from the HBase distributed database.

In some embodiments, the map data has a plurality of zoom levels; the clustering module comprises: the first determining unit is used for determining a plurality of groups of clustering parameters according to the plurality of zoom levels; the clustering unit is used for clustering the track data according to the multiple groups of clustering parameters to obtain multiple groups of clustering data corresponding to the multiple zoom levels; the second acquisition module includes: a second determining unit, configured to determine a zoom level of map data corresponding to the thermodynamic diagram to be generated according to the zoom level of the thermodynamic diagram to be generated; and the acquisition unit is used for acquiring map data and cluster data under the corresponding zoom level from the HBase distributed database.

In some embodiments, each set of clustering parameters includes a scan radius; the first determining unit is specifically configured to: determining the scan radius corresponding to each zoom level according to the plurality of zoom levels, wherein the scan radius corresponding to each zoom level decreases as the corresponding zoom level decreases.

In some embodiments, each set of clustering parameters includes a minimum number of inclusion points; the first determining unit is specifically configured to: determining the minimum number of inclusion points corresponding to each zoom level according to the plurality of zoom levels, wherein the minimum number of inclusion points corresponding to each zoom level decreases as the corresponding zoom level decreases.
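The relationship between zoom level and clustering parameters stated in the two preceding paragraphs can be illustrated with a small lookup table. The sketch below is hedged: the zoom levels 11 to 14, the numeric values, and the names `CLUSTER_PARAMS` and `follows_rule` are hypothetical. Only the monotonic rule (both parameters decrease as the zoom level decreases) comes from the text.

```python
# Hypothetical table: zoom level -> (scan radius, minimum inclusion points).
# The numbers are illustrative only; the embodiment does not give them here.
CLUSTER_PARAMS = {
    11: (0.0005, 3),
    12: (0.0010, 5),
    13: (0.0020, 8),
    14: (0.0040, 12),
}


def follows_rule(params):
    """Check that both parameters decrease as the zoom level decreases."""
    levels = sorted(params)
    return all(
        params[low][0] < params[high][0] and params[low][1] < params[high][1]
        for low, high in zip(levels, levels[1:])
    )
```

A check of this kind can guard a configuration file against parameter sets that violate the stated rule.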

In some embodiments, each set of cluster data includes the center coordinates and influence values of a plurality of clusters and the coordinates and influence values of a plurality of noise points.

In some embodiments, the clustering is implemented based on the DBSCAN algorithm.
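As a rough illustration of how DBSCAN output could be turned into the cluster data described above, the following Python sketch runs a plain DBSCAN on 2-D points and then summarises each cluster by its center and an influence value. Several points are assumptions rather than details of the embodiment: the planar Euclidean distance, taking the mean of the members as the center, taking the member count as the influence value of a cluster, and giving each noise point an influence value of 1.

```python
from math import hypot


def dbscan(points, eps, min_pts):
    """Plain DBSCAN over 2-D points; returns one label per point (-1 = noise)."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbours(i):
        xi, yi = points[i]
        return [j for j, (xj, yj) in enumerate(points)
                if hypot(xi - xj, yi - yj) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reachable)
        cluster += 1
    return labels


def summarise(points, labels):
    """Return (clusters, noise); every entry is (x, y, influence).

    Assumption: cluster influence = member count, noise influence = 1.
    """
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)
    noise = [(x, y, 1) for x, y in groups.pop(-1, [])]
    clusters = []
    for members in groups.values():
        cx = sum(x for x, _ in members) / len(members)
        cy = sum(y for _, y in members) / len(members)
        clusters.append((cx, cy, len(members)))
    return clusters, noise
```

The neighbour search here is quadratic in the number of points, which is acceptable for a sketch but not for massive track data; a spatial index would normally replace it.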

In some embodiments, the second storage module comprises: a first construction unit for constructing a separate cluster data table for each set of cluster data, that is, one cluster data table for each zoom level within each time slice.
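One way to picture the "one table per zoom level per time slice" layout, together with the lookup by time range and zoom level, is the naming sketch below. The table-name pattern `cluster_<slice start>_z<zoom>`, the one-hour slice length and both function names are hypothetical. No HBase client call is made, because the embodiment does not specify one in this passage.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
SLICE = timedelta(hours=1)  # hypothetical slice length


def cluster_table(slice_start, zoom):
    """Hypothetical table name for one time slice at one zoom level."""
    return f"cluster_{slice_start:%Y%m%d%H}_z{zoom}"


def tables_for_request(start, end, zoom):
    """List the cluster tables that cover [start, end) at the given zoom level."""
    current = EPOCH + ((start - EPOCH) // SLICE) * SLICE  # align to slice start
    names = []
    while current < end:
        names.append(cluster_table(current, zoom))
        current += SLICE
    return names
```

With such a scheme the server side only needs the requested zoom level and time range to decide which tables to read.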

In some embodiments, the map data has a plurality of zoom levels; the second storage module includes: a second construction unit for constructing a separate map data table for the map data at each zoom level, and storing 4 tiles that are adjacent to each other in the display state, among the tiles contained in the map data at each zoom level, in the same row of the corresponding map data table.

In some embodiments, the second construction unit is specifically configured to: calculating the total order m of the map data at each zoom level according to the number n of tiles contained in each row of the map data at each zoom level; when n-2m=1, dividing the map data at each zoom level into m x m square sub-grids and n edge sub-grids, wherein each square sub-grid is composed of 4 tiles, each of the 2m edge sub-grids adjacent to the square sub-grids is composed of 2 tiles, and the 1 edge sub-grid not adjacent to the square sub-grids is composed of 1 tile; filling the m x m square sub-grids based on a Z-shaped filling curve, filling the 2m edge sub-grids based on a linear filling curve, and connecting the m x m square sub-grids and the n edge sub-grids by using connecting lines; encoding the n x n tiles according to the filling sequence of the n x n tiles; and constructing each map data table for the map data at each zoom level, and sequentially storing the n x n tiles in the corresponding map data table based on the codes of the n x n tiles, wherein 4 tiles belonging to the same square sub-grid are stored in the same row in the corresponding map data table, and the tiles belonging to the same edge sub-grid are stored in the same row in the corresponding map data table.

In some embodiments, the second construction unit is specifically configured to: calculating the total order m of the map data at each zoom level according to the number n of tiles contained in each row of the map data at each zoom level; when n=2m, dividing the map data at each zoom level into m x m square sub-grids, wherein each square sub-grid is composed of 4 tiles; filling the m x m square sub-grids based on a Z-shaped filling curve; encoding the n x n tiles according to the filling sequence of the n x n tiles; and constructing each map data table for the map data at each zoom level, and sequentially storing the n x n tiles in the corresponding map data table based on the codes of the n x n tiles, wherein 4 tiles belonging to the same square sub-grid are stored in the same row in the corresponding map data table.
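The two tile layouts above (n=2m and n-2m=1) can be sketched in Python as follows. The sketch rests on several assumptions that are not stated in this passage: the total order is taken as m = n // 2, which is consistent with both cases; the square sub-grids are visited in Z-order (Morton order) and the 4 tiles inside each are listed in a Z pattern; the edge sub-grids are assumed to lie on the last column and last row and are visited along one continuous line; and the tile code is simply the position in the resulting sequence. The names `morton`, `tile_rows` and `encode` are hypothetical.

```python
def morton(r, c):
    """Interleave the bits of r and c to obtain a Z-order (Morton) index."""
    code = 0
    for bit in range(max(r, c).bit_length()):
        code |= ((c >> bit) & 1) << (2 * bit)
        code |= ((r >> bit) & 1) << (2 * bit + 1)
    return code


def tile_rows(n):
    """Return the table rows; each row lists the (row, col) tiles stored together."""
    m = n // 2  # assumed formula for the total order m
    cells = sorted(
        ((r, c) for r in range(m) for c in range(m)),
        key=lambda rc: morton(*rc),
    )
    rows = [
        [(2 * r, 2 * c), (2 * r, 2 * c + 1), (2 * r + 1, 2 * c), (2 * r + 1, 2 * c + 1)]
        for r, c in cells
    ]
    if n - 2 * m == 1:  # odd n: add 2m two-tile edge sub-grids and 1 corner tile
        last = n - 1
        rows += [[(2 * k, last), (2 * k + 1, last)] for k in range(m)]  # last column
        rows.append([(last, last)])  # corner, not adjacent to any square sub-grid
        rows += [[(last, 2 * k), (last, 2 * k + 1)] for k in reversed(range(m))]  # last row
    return rows


def encode(n):
    """Map every tile to its code, i.e. its position in the filling sequence."""
    sequence = (tile for row in tile_rows(n) for tile in row)
    return {tile: code for code, tile in enumerate(sequence)}
```

For n = 5 this yields 4 square sub-grids, 4 two-tile edge sub-grids and 1 single-tile edge sub-grid, that is m x m + n = 9 rows holding all 25 tiles.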

Fig. 12 shows an electronic device of an embodiment of the invention. As shown in fig. 12, the electronic apparatus 1200 includes: at least one processor 1210, and a memory 1220 communicatively coupled to the at least one processor 1210, wherein the memory stores instructions executable by the at least one processor for causing the at least one processor to perform the method.

Specifically, the memory 1220 and the processor 1210 are connected via the bus 1230 and may be a general-purpose memory and a general-purpose processor, which are not specifically limited herein. When the processor 1210 runs a computer program stored in the memory 1220, it can perform the operations and functions described in connection with fig. 1 to 10 in the embodiments of the present invention.

The embodiment of the invention also provides a storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method. For the specific implementation, reference may be made to the method embodiments, which are not repeated here.

Although the embodiments of the present invention have been disclosed above, they are not limited to the uses listed in the specification and the embodiments, and can be applied to any field for which they are suitable. Additional modifications will readily occur to those skilled in the art. Therefore, without departing from the general concepts defined in the claims and their equivalents, the embodiments of the invention are not limited to the specific details and illustrations shown and described herein.

Claims

1. A thermodynamic diagram generation method based on trajectory data, comprising:

acquiring track data and map data;

clustering the track data to obtain clustered data;

storing the map data and the cluster data in an HBase distributed database;

generating a thermodynamic diagram according to the acquired map data and cluster data;

the map data having a plurality of zoom levels;

the storing the map data in the HBase distributed database includes:

constructing each map data table for the map data at each zoom level, and storing 4 tiles adjacent to each other in the display state contained in the map data at each zoom level in the same row in the corresponding map data table;

the construction of each map data table for map data at each zoom level, storing 4 tiles adjacent to each other in a display state contained in the map data at each zoom level in a same row in a corresponding map data table, includes:

encoding the n x n tiles according to the filling sequence of the n x n tiles;

2. The thermodynamic diagram generation method based on trajectory data as claimed in claim 1, wherein the storing the track data in a native format in a Hadoop platform distributed file system comprises:

dividing the track data into a plurality of time slices, wherein each time slice comprises all track data in a preset time period;

3. The method of generating a thermodynamic diagram based on trajectory data of claim 2,

the clustering of the track data to obtain clustered data includes:

4. The method of generating a thermodynamic diagram based on trajectory data of claim 1,

the clustering of the track data to obtain clustered data includes:

5. A thermodynamic diagram generation method based on trajectory data as claimed in claim 3 or 4, characterized in that,

each group of clustering parameters comprises a scanning radius;

6. A thermodynamic diagram generation method based on trajectory data as claimed in claim 3 or 4, characterized in that,

each group of clustering parameters comprises a minimum inclusion point number;

7. The thermodynamic diagram generation method of claim 3 or 4, wherein each set of cluster data includes center coordinates and influence values of a plurality of clusters and coordinates and influence values of a plurality of noise points.

8. The thermodynamic diagram generation method based on trajectory data of claim 3 or 4, wherein the clustering is implemented based on the DBSCAN algorithm.

9. The thermodynamic diagram generation method based on trajectory data of claim 3, wherein the storing the cluster data in an HBase distributed database comprises:

10. The thermodynamic diagram generation method based on trajectory data according to claim 1, wherein the constructing each map data table for map data at each zoom level, storing 4 tiles adjacent to each other in a display state included in the map data at each zoom level in the same row in the corresponding map data table, includes:

calculating the total order m of the map data at each zoom level according to the number n of tiles contained in each row of the map data at each zoom level, wherein,

filling the m x m square sub-grids based on a Z-shaped filling curve;

encoding the n x n tiles according to the filling sequence of the n x n tiles;

11. A thermodynamic diagram generation device based on trajectory data, comprising:

the first acquisition module is used for acquiring track data and map data;

the generation module is used for generating a thermodynamic diagram according to the acquired map data and the cluster data;

the map data having a plurality of zoom levels;

the second storage module includes:

a second construction unit configured to construct each map data table for map data at each zoom level, and store 4 tiles adjacent to each other in a display state included in the map data at each zoom level in the same row in the corresponding map data table;

the second construction unit is specifically configured to:

encoding the n x n tiles according to the filling sequence of the n x n tiles;

12. An electronic device, comprising: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to cause the at least one processor to perform the method of any of claims 1-10.

13. A storage medium having stored thereon a computer program, which when executed by a processor, implements the method of any of claims 1-10.