CN116975182A - Data processing method, device, computer equipment and storage medium - Google Patents

Data processing method, device, computer equipment and storage medium

Info

Publication number
CN116975182A
Authority
CN
China
Prior art keywords
target
data
track
information
vehicle track
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310833667.9A
Other languages
Chinese (zh)
Inventor
王茂林
杨帆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202310833667.9A
Publication of CN116975182A

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29 - Geographical information databases
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 - Querying
    • G06F16/245 - Query processing
    • G06F16/2458 - Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462 - Approximate or statistical queries
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 - Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 - Complex mathematical operations
    • G06F17/18 - Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 - Commerce
    • G06Q30/02 - Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 - Market modelling; Market analysis; Collecting market data

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Probability & Statistics with Applications (AREA)
  • Strategic Management (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Mathematical Analysis (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computational Linguistics (AREA)
  • Operations Research (AREA)
  • Remote Sensing (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Algebra (AREA)
  • Fuzzy Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Game Theory and Decision Science (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The embodiments of the application relate to a data processing method, a data processing device, computer equipment, a storage medium and a computer program product, and can be applied to scenarios such as maps, the Internet of Vehicles and traffic. The method comprises: determining a target area where a target place is located; acquiring a plurality of historical vehicle tracks within a historical time period; determining, from the plurality of historical vehicle tracks, an associated vehicle track having a track dependency relationship with the target area, wherein at least one historical vehicle track point of the associated vehicle track is located in the target area; and performing statistical calculation on the data in a target statistical dimension in the vehicle track data of the associated vehicle track to obtain target data. With this method, statistics are computed from the vehicle track data of the associated vehicle track, which improves the integrity and accuracy of the obtained data.

Description

Data processing method, device, computer equipment and storage medium
Technical Field
The present application relates to the field of data processing, and in particular, to a data processing method, apparatus, computer device, and storage medium.
Background
A place is an environment that can meet part of a user's daily-life or service needs, and the arrival statistics of each place are important data for place management and service planning. At present, arrival statistics can be obtained directly through hardware equipment; for example, arrivals at a place can be counted using electronic devices such as infrared sensors and radio-frequency sensors. However, because some places have a high population density and people may occlude one another, arrivals can be missed or miscounted, so the integrity and accuracy of arrival statistics obtained through hardware equipment are low. How to improve the integrity and accuracy of the obtained data is therefore a problem to be solved.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a data processing method, apparatus, computer device, and storage medium capable of improving the integrity and accuracy of the obtained data.
In a first aspect, the present application provides a data processing method. The method comprises the following steps:
determining a target area where a target place is located;
acquiring a plurality of historical vehicle tracks within a historical time period, wherein each historical vehicle track is a sequence formed by a plurality of continuous and adjacent historical vehicle track points;
determining, from the plurality of historical vehicle tracks, an associated vehicle track having a track dependency relationship with the target area, wherein at least one historical vehicle track point of the associated vehicle track is located in the target area;
and performing statistical calculation on the data in a target statistical dimension in the vehicle track data of the associated vehicle track to obtain target data, wherein the target data is used for describing statistics of reaching the target place within the historical time period.
In a second aspect, the application further provides a data processing device. The device comprises:
the area determining module is used for determining a target area where the target place is located;
the track acquisition module is used for acquiring a plurality of historical vehicle tracks in a historical time period, wherein the historical vehicle tracks are sequences formed by a plurality of continuous and adjacent historical vehicle track points;
the area and track association module is used for determining an associated vehicle track with track dependency relationship with the target area from a plurality of historical vehicle tracks, wherein at least one historical vehicle track point exists in the associated vehicle track and is located in the target area;
the data statistics module is used for performing statistical calculation on the data in a target statistical dimension in the vehicle track data of the associated vehicle track to obtain target data, wherein the target data is used for describing statistics of reaching the target place within the historical time period.
In a third aspect, the present application also provides a computer device. The computer device comprises a memory storing a computer program and a processor which when executing the computer program performs the steps of:
determining a target area where a target place is located;
acquiring a plurality of historical vehicle tracks in a historical time period, wherein the historical vehicle tracks are sequences formed by a plurality of continuous and adjacent historical vehicle track points;
determining an associated vehicle track with track dependency relation with a target area from a plurality of historical vehicle tracks, wherein at least one historical vehicle track point exists in the associated vehicle track and is positioned in the target area;
and performing statistical calculation on the data in a target statistical dimension in the vehicle track data of the associated vehicle track to obtain target data, wherein the target data is used for describing statistics of reaching the target place within the historical time period.
In a fourth aspect, the present application also provides a computer-readable storage medium. The computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of:
determining a target area where a target place is located;
acquiring a plurality of historical vehicle tracks in a historical time period, wherein the historical vehicle tracks are sequences formed by a plurality of continuous and adjacent historical vehicle track points;
determining an associated vehicle track with track dependency relation with a target area from a plurality of historical vehicle tracks, wherein at least one historical vehicle track point exists in the associated vehicle track and is positioned in the target area;
and performing statistical calculation on the data in a target statistical dimension in the vehicle track data of the associated vehicle track to obtain target data, wherein the target data is used for describing statistics of reaching the target place within the historical time period.
In a fifth aspect, the present application also provides a computer program product. The computer program product comprises a computer program which, when executed by a processor, implements the steps of:
determining a target area where a target place is located;
acquiring a plurality of historical vehicle tracks in a historical time period, wherein the historical vehicle tracks are sequences formed by a plurality of continuous and adjacent historical vehicle track points;
determining, from the plurality of historical vehicle tracks, an associated vehicle track having a track dependency relationship with the target area, wherein at least one historical vehicle track point of the associated vehicle track is located in the target area;
and performing statistical calculation on the data in a target statistical dimension in the vehicle track data of the associated vehicle track to obtain target data, wherein the target data is used for describing statistics of reaching the target place within the historical time period.
The data processing method, device, computer equipment, storage medium and computer program product described above determine a target area where a target place is located; acquire a plurality of historical vehicle tracks within a historical time period, wherein each historical vehicle track is a sequence formed by a plurality of continuous and adjacent historical vehicle track points; determine, from the plurality of historical vehicle tracks, an associated vehicle track having a track dependency relationship with the target area, wherein at least one historical vehicle track point of the associated vehicle track is located in the target area; and perform statistical calculation on the data in a target statistical dimension in the vehicle track data of the associated vehicle track to obtain target data, wherein the target data is used for describing statistics of reaching the target place within the historical time period. A vehicle track completely and accurately describes the travel of a vehicle within the historical time period, so matching the plurality of historical vehicle tracks against the target area yields the associated vehicle tracks that passed through the target area within that period. Performing statistical calculation on the associated vehicle tracks then describes, from the perspective of vehicles, the statistics of reaching the target place within the historical time period, based on the vehicle-related data of the tracks that passed through the target area. This avoids the missed counts and miscounts found in statistics obtained through hardware equipment, so the vehicle track data of the associated vehicle tracks improves the integrity and accuracy of the obtained data.
Drawings
FIG. 1 is a diagram of an application environment for a data processing method in one embodiment;
FIG. 2 is a flow diagram of a data processing method in one embodiment;
FIG. 3 is a schematic diagram of the composition of historical vehicle trajectories in one embodiment;
FIG. 4 is a schematic illustration of an associated vehicle track in one embodiment;
FIG. 5 is a partial flow chart of a statistical calculation to obtain target data according to one embodiment;
FIG. 6 is a schematic diagram of a partial flow of performing data sample expansion processing to obtain target data in one embodiment;
FIG. 7 is a partial flow diagram of converting a target area to a corresponding target Mercator grid in one embodiment;
FIG. 8 is a schematic diagram of an irregular closed region as the target region in one embodiment;
FIG. 9 is a partial flow diagram of determining a target Mercator grid in one embodiment;
FIG. 10 is a schematic diagram of a Mercator grid with candidate coordinate points as vertices in one embodiment;
FIG. 11 is a partial flow diagram of a data processing method in one embodiment;
FIG. 12 is a partial flow diagram of determining an associated vehicle trajectory in one embodiment;
FIG. 13 is a schematic flow chart of a portion of a process for performing track point screening on each historical vehicle track point to obtain a plurality of screened track points in one embodiment;
FIG. 14 is a schematic flow chart of a portion of a process for performing a track point screening on each of the historical vehicle track points to obtain a plurality of screened track points in another embodiment;
FIG. 15 is a flow chart of determining a target vehicle trajectory point that falls within a target region from a plurality of screening trajectory points, in one embodiment;
FIG. 16 is a flow chart of determining a target area where a target location is located in one embodiment;
FIG. 17 is a schematic diagram of connecting venue points of interest to obtain a target area in one embodiment;
FIG. 18 is a complete flow diagram of a data processing method in one embodiment;
FIG. 19 is a block diagram of a data processing apparatus in one embodiment;
FIG. 20 is a block diagram showing a structure of a data processing apparatus in another embodiment;
FIG. 21 is an internal structural diagram of a computer device in one embodiment.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
A place is an environment that can meet part of a user's daily-life or service needs, and the arrival statistics of each place are important data for place management and service planning. At present, arrival statistics are obtained in two ways. First, they can be obtained directly through hardware equipment; for example, arrivals at a place can be counted using electronic devices such as infrared sensors and radio-frequency sensors. However, because some places have a high population density and people may occlude one another, arrivals can be missed or miscounted. Second, future arrival statistics can be predicted from historical arrival statistics. However, historical arrival statistics essentially reflect only past conditions and cannot account for special events that may affect future arrivals (such as policy influences or place service activities). Arrival statistics predicted in this way may therefore differ from the actual arrival statistics, so they suffer from low integrity and are not suitable for counting arrivals on an actual day. Thus, the integrity of arrival statistics obtained in the ways described above is low, and how to improve the integrity and accuracy of the obtained data is a problem to be solved.
The embodiments of the application provide a data processing method capable of improving the integrity and accuracy of the obtained data. The data processing method provided by the embodiments of the application can be applied to the application environment shown in FIG. 1, in which the terminal 102 communicates with the server 104 via a network. A data storage system may store the data that the server 104 needs to process. The data storage system may be integrated on the server 104, or may be located in the cloud or on other servers.
Specifically, taking application to the server 104 as an example, when statistics of reaching a target place within a historical time period need to be acquired, the server 104 may determine a target area where the target place is located. The target area is a connected closed area, and the target place may be a place such as a shopping mall, a school or a park, which is not limited herein. The server 104 then acquires a plurality of historical vehicle tracks within the historical time period, each historical vehicle track being a sequence formed by a plurality of continuous and adjacent historical vehicle track points, and determines, from the plurality of historical vehicle tracks, an associated vehicle track having a track dependency relationship with the target area, wherein at least one historical vehicle track point of the associated vehicle track is located in the target area. Finally, the server 104 performs statistical calculation on the data in a target statistical dimension in the vehicle track data of the associated vehicle track to obtain target data, wherein the target data is used for describing statistics of reaching the target place within the historical time period. Because a vehicle track completely and accurately describes the travel of a vehicle within the historical time period, the statistics of reaching the target place are described from the perspective of vehicles, based on the vehicle-related data of the tracks that passed through the target area within that period. This avoids the missed counts and miscounts found in statistics obtained through hardware equipment, so the vehicle track data of the associated vehicle tracks improves the integrity and accuracy of the obtained data.
The terminal 102 may be, but not limited to, various desktop computers, notebook computers, smart phones, tablet computers, internet of things devices, and portable wearable devices, where the internet of things devices may be smart speakers, smart televisions, smart air conditioners, smart vehicle devices, and the like. The portable wearable device may be a smart watch, smart bracelet, headset, or the like. The server 104 may be implemented as a stand-alone server or as a server cluster of multiple servers.
In one embodiment, as shown in fig. 2, a data processing method is provided, where the method is applied to the server 104 in fig. 1, and it is understood that the method may also be applied to the terminal 102, and may also be applied to a system including the terminal 102 and the server 104, and implemented through interaction between the terminal 102 and the server 104. In this embodiment, the method includes the steps of:
Step 202, determining a target area where the target place is located.
The target place is an environment that can meet part of a user's daily-life or service needs; for example, the target place may be a shopping mall, a school, a park, a barbershop, a restaurant, or a key place of interest in a particular scenario. The target area is a connected closed area, that is, a complete closed area without any interruption. The target area may be a regular quadrilateral area, such as a square or a rectangle, or an irregularly shaped area, such as an irregular polygon, a circle or an oval.
Based on this, the target area is specifically a closed area formed by a plurality of area coordinate points, and it is determined according to the points of interest (Point of Interest, POI) included in the target place, where the plurality of area coordinate points belong to the points of interest included in the target place. In this embodiment, the target area is necessarily a connected closed area. Because the target place is determined flexibly according to actual application requirements, the target area can likewise be determined flexibly based on the target place; that is, this embodiment does not specifically limit the target place or the target area.
Specifically, when statistics of reaching a target place within a historical time period need to be acquired, the server first determines the target area where the target place is located. The server may obtain the target area directly, either through a communication connection with the terminal or by reading a stored target area from the data storage system, where the target area may be an Area of Interest (AOI) of the target place.
Alternatively, the server may determine the target area from the points of interest included in the target place. That is, the server may obtain the points of interest included in the target place through a communication connection with the terminal, or read the stored points of interest of the target place from the data storage system, and then construct the target area based on those points of interest. This embodiment does not specifically limit the manner of determining the target area.
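As an illustration only, constructing a closed area from points of interest could be sketched as follows. The ordering rule used here (sorting points by angle around their centroid) is an assumption made for this sketch and is not stated in the application; the names `Point` and `build_target_area` are hypothetical. The approach only yields a simple polygon when the points are arranged around their centroid.

```python
import math
from typing import List, Tuple

# Hypothetical representation: a point is a (longitude, latitude) pair.
Point = Tuple[float, float]


def build_target_area(pois: List[Point]) -> List[Point]:
    """Connect POI coordinates into a closed ring.

    Assumption for this sketch: points are ordered counter-clockwise by
    their angle around the centroid, then the first vertex is repeated
    at the end so the ring is explicitly closed.
    """
    if len(pois) < 3:
        raise ValueError("at least three points are needed to enclose an area")
    cx = sum(x for x, _ in pois) / len(pois)
    cy = sum(y for _, y in pois) / len(pois)
    ring = sorted(pois, key=lambda p: math.atan2(p[1] - cy, p[0] - cx))
    return ring + [ring[0]]


area = build_target_area([(0.0, 0.0), (1.0, 1.0), (1.0, 0.0), (0.0, 1.0)])
```

The returned list starts and ends with the same vertex, which matches the requirement that the target area be a complete closed area without interruption.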
Step 204, obtaining a plurality of historical vehicle tracks in a historical time period, wherein the historical vehicle tracks are a sequence formed by a plurality of continuous and adjacent historical vehicle track points.
The historical time period is a time interval before the current time, and the specific historical time period is determined according to actual requirements. A historical vehicle track is a sequence of a plurality of continuous and adjacent historical vehicle track points. For example, if historical vehicle track points B1, B2, B3 and B4 belong to a historical vehicle track A1, then the historical vehicle track points B1 to B4 are continuous and adjacent.
Specifically, the server obtains a plurality of historical vehicle tracks within the historical time period. A very large number of historical vehicle tracks may exist within the historical time period, while this embodiment acquires visit data for the target place only. To limit the amount of data to be processed, the server may therefore acquire only the historical vehicle tracks in the same geographic area as the target place, where the geographic area may be a street area, an urban area, a city-level area, or the like. For example, if the target place is in urban area C1, the server acquires a plurality of historical vehicle tracks that are within the historical time period and in urban area C1. Or, if the target place is in street area C2, the server acquires a plurality of historical vehicle tracks that are within the historical time period and in street area C2.
Based on the above, while a vehicle is running, its track recording device can upload the vehicle track it generates to the server in real time or at a preset period, and the server stores the uploaded vehicle track in the data storage system. When statistics need to be acquired, the server obtains a plurality of historical vehicle tracks within the historical time period from the data storage system, or selects from the data storage system a plurality of historical vehicle tracks in the same geographic area as the target place.
To facilitate understanding of the historical vehicle trajectories, as shown in fig. 3, there are a historical vehicle trajectory 301 and a historical vehicle trajectory 302, and the historical vehicle trajectory 301 is specifically composed of a plurality of consecutive adjacent historical vehicle trajectory points, such as the historical vehicle trajectory point 3011 to the historical vehicle trajectory point 3017, and similarly, the historical vehicle trajectory 302 is specifically composed of a plurality of consecutive adjacent historical vehicle trajectory points, such as the historical vehicle trajectory point 3021 to the historical vehicle trajectory point 3027.
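A minimal sketch of how a historical vehicle track and the selection by historical time period might be represented is given below. The field names (`lon`, `lat`, `timestamp`), the class names, and the rule that a track is kept when at least one of its points falls in the half-open interval `[start, end)` are all assumptions for illustration; the application does not specify a storage format or a selection rule.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TrackPoint:
    lon: float
    lat: float
    timestamp: int  # assumed: seconds since the Unix epoch


@dataclass
class VehicleTrack:
    track_id: str
    points: List[TrackPoint]  # assumed: ordered by timestamp, adjacent points consecutive


def tracks_in_period(tracks: List[VehicleTrack], start: int, end: int) -> List[VehicleTrack]:
    """Keep tracks with at least one point whose timestamp lies in [start, end)."""
    return [
        t for t in tracks
        if any(start <= p.timestamp < end for p in t.points)
    ]


track_301 = VehicleTrack("301", [TrackPoint(113.90, 22.50, 100), TrackPoint(113.91, 22.51, 160)])
track_302 = VehicleTrack("302", [TrackPoint(113.80, 22.40, 900), TrackPoint(113.81, 22.41, 960)])
selected = tracks_in_period([track_301, track_302], start=0, end=600)
```

A further filter on geographic area (street, urban area, or city) could be applied in the same way before the tracks are matched against the target area.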
Step 206, determining an associated vehicle track with track dependency relationship with the target area from a plurality of historical vehicle tracks, wherein at least one historical vehicle track point exists in the associated vehicle track and is located in the target area.
At least one historical vehicle track point of the associated vehicle track is located in the target area. Accordingly, the track dependency relationship indicates that a vehicle track overlaps an area, that is, the vehicle track at least passes through the area with which it has the track dependency relationship.
Specifically, the server determines, from the plurality of historical vehicle tracks, an associated vehicle track having a track dependency relationship with the target area. As described in the foregoing embodiments, a historical vehicle track is a sequence of a plurality of continuous and adjacent historical vehicle track points. Therefore, if any one of its historical vehicle track points is located in the target area, the historical vehicle track is determined to have a track dependency relationship with the target area and is determined to be an associated vehicle track.
For ease of understanding, as shown in FIG. 4, there are a target area 401 where the target place is located, an acquired historical vehicle track 402 and an acquired historical vehicle track 403. None of the historical vehicle track points of the historical vehicle track 402 is located in the target area 401, whereas the historical vehicle track points 4031 and 4032 of the historical vehicle track 403 are located in the target area 401. A track dependency relationship can therefore be established between the historical vehicle track 403 and the target area 401, and the historical vehicle track 403 is determined to be an associated vehicle track of the target area 401.
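The association rule described above (a track is associated if at least one of its points lies in the target area) can be sketched as follows. The point-in-polygon test uses the standard ray-casting technique, which is a choice made for this sketch; the application does not name a specific test here. The identifiers and coordinates are hypothetical and merely mirror the situation in FIG. 4.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]  # hypothetical (x, y) coordinate pair


def point_in_polygon(pt: Point, polygon: Sequence[Point]) -> bool:
    """Ray casting: count crossings of a ray going in the +x direction from pt."""
    x, y = pt
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through pt can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def associated_tracks(tracks: Dict[str, List[Point]], target_area: Sequence[Point]) -> List[str]:
    """Return IDs of tracks with at least one track point inside the target area."""
    return [
        track_id for track_id, points in tracks.items()
        if any(point_in_polygon(p, target_area) for p in points)
    ]


target_area_401 = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
tracks = {
    "402": [(5.0, 5.0), (6.0, 5.0), (7.0, 6.0)],               # never enters the area
    "403": [(5.0, 2.0), (3.0, 2.0), (2.0, 2.0), (-1.0, 2.0)],  # two points inside
}
result = associated_tracks(tracks, target_area_401)
```

Because `any` stops at the first point found inside the area, a track is accepted as soon as a single qualifying track point is seen.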
Step 208, performing statistical calculation on the data in the target statistical dimension in the vehicle track data of the associated vehicle track to obtain target data, where the target data is used for describing: statistics of reaching the target site over a historical period of time.
Since one vehicle track corresponds to one vehicle, the vehicle track data of an associated vehicle track includes at least the portrait data of the vehicle to which the associated vehicle track belongs. The portrait data includes at least the vehicle type, the vehicle price, and the object information of vehicle-related objects. The object information of a vehicle-related object may be the object information of an object driving the vehicle within the historical time period, the object information of an object riding in the vehicle within the historical time period, and the like, which is not specifically limited herein.
Second, the target data is used for describing statistics of reaching the target place within the historical time period. Taking user data statistics as an example scenario, the statistics are visit data, which can be either a number of visits (person-times) or a number of people. In the former case, the target data describes the number of visits to the target place within the historical time period, and repeated visits by the same visiting object are all counted. In the latter case, the target data describes the number of people who visited the target place within the historical time period, and only distinct visiting objects are counted. For example, take a shopping mall as the target place and 14:00 to 15:00 on June 28, 2024 as the historical time period, and suppose that within this period visiting object D1 visited the mall 3 times, visiting object D2 visited once, visiting object D3 visited twice, visiting object D4 visited once, and visiting object D5 visited once.
If the user data is a number of visits, the target data describes the number of visits to the mall from 14:00 to 15:00 on June 28, 2024 as 8 (3+1+2+1+1). If the user data is a number of people, the target data describes the number of people who reached the mall from 14:00 to 15:00 on June 28, 2024 as 5 (visiting objects D1 to D5). Thus, the user data specifically described by the target data needs to be determined based on the actual statistical requirement.
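The two counting modes in the mall example can be reproduced with a short sketch. The visit log below is a hypothetical flat list with one entry per visit, built from the figures in the example; the function names are illustrative.

```python
from collections import Counter
from typing import Iterable


def count_visits(visit_log: Iterable[str]) -> int:
    """Number of visits: repeated visits by the same object are all counted."""
    return sum(Counter(visit_log).values())


def count_visitors(visit_log: Iterable[str]) -> int:
    """Number of people: each distinct visiting object is counted once."""
    return len(set(visit_log))


# One entry per visit to the mall between 14:00 and 15:00.
visit_log = ["D1", "D1", "D1", "D2", "D3", "D3", "D4", "D5"]

visits = count_visits(visit_log)      # 3 + 1 + 2 + 1 + 1 = 8
visitors = count_visitors(visit_log)  # D1 to D5 = 5
```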
The target statistical dimension includes at least a time dimension and a space dimension. The time dimension includes, but is not limited to: calendar days, calendar weeks, calendar months, calendar quarters, calendar years, and holidays. In this embodiment, the time dimension is specifically the historical time period, and the space dimension is specifically the target area. The target statistical dimension may also be determined based on the data dimensions included in the vehicle track data. For example, the vehicle track data at least includes the vehicle type, the vehicle price, and the object information of the vehicle-related objects of the vehicle associated with the vehicle track, and this embodiment specifically needs to count the user data of visits, so the target statistical dimension may further include the object information of the vehicle-related objects.
Specifically, the server first determines the target statistical dimension, in which the time dimension is the historical time period and the space dimension is the target area. Because each associated vehicle track has at least one historical vehicle track point within the target area, and the associated vehicle tracks are all historical vehicle tracks within the historical time period, the acquired vehicle track data of the associated vehicle tracks already falls within both the time dimension and the space dimension. The server can therefore perform statistical calculation directly on the associated vehicle tracks to obtain statistical data in a plurality of data dimensions, and the server then determines the statistical data in the data dimension describing the object information as the statistical data of visits to the target place within the historical time period.
It will be appreciated that all examples in this embodiment are for the understanding of the present solution and should not be construed as a specific limitation on the present solution. And object information (including but not limited to object equipment information, object personal information, etc.) and data (including but not limited to data for analysis, stored data, presented data, etc.) related to the present application are both information and data authorized by the object or sufficiently authorized by parties, and the collection, use and processing of related data is required to comply with the relevant laws and regulations and standards of the relevant countries and regions.
In the above data processing method, a vehicle track can completely and accurately describe the travel track of a vehicle within the historical time period. By matching a plurality of historical vehicle tracks against the target area, the associated vehicle tracks that pass through the target area within the historical time period are obtained. By performing statistical calculation on the associated vehicle tracks, the user data reaching the target place corresponding to the target area within the historical time period can be described from the vehicle perspective, based on the vehicle-related data of the vehicle tracks passing through the target area. This avoids problems such as missing visit data and false counts in user data obtained through devices, and improves the integrity and accuracy of the obtained data.
In one embodiment, as shown in fig. 5, performing statistical calculation on data in a target statistical dimension in vehicle track data of an associated vehicle track to obtain target data, including:
Step 502, acquiring vehicle track data, wherein the vehicle track data at least comprises: portrait data of the vehicle associated with a vehicle track.
The vehicle track data at least comprises the portrait data of the vehicle associated with a vehicle track. As described in the foregoing embodiment, the portrait data at least includes: the vehicle type, the vehicle price, and the object information of vehicle-related objects. The object information of a vehicle-related object may be the object information of an object driving the vehicle within the historical time period, the object information of an object riding in the vehicle within the historical time period, and so on, which is not described again here.
Specifically, after the server determines the associated vehicle tracks, it can obtain the vehicle track data of the associated vehicle tracks directly from a data storage system connected to the server, or it can mine the vehicle track data of the associated vehicle tracks in real time. The specific manner of acquiring the vehicle track data is not limited herein.
Step 504, constructing initial area data in the target area based on the vehicle track data, and performing index aggregation processing on the data in the target statistical dimension in the initial area data to obtain the to-be-processed area data.
The initial area data specifically includes the vehicle track data of the associated vehicle tracks having a track dependency relationship with the target area. For example, if the associated vehicle tracks having a track dependency relationship with the target area are specifically historical vehicle track A1, historical vehicle track A2, and historical vehicle track A3, the initial area data in the target area specifically includes: the vehicle track data of historical vehicle track A1, the vehicle track data of historical vehicle track A2, and the vehicle track data of historical vehicle track A3.
Secondly, the region data to be processed specifically characterizes the data of the initial region data under the target statistical dimension. For example, the target statistical dimension includes object information of the vehicle-related object, and the to-be-processed region data specifically characterizes: object information of a vehicle-related object of the history vehicle track A1, object information of a vehicle-related object of the history vehicle track A2, and object information of a vehicle-related object of the history vehicle track A3.
Specifically, the server first constructs the initial area data within the target area based on the vehicle track data, that is, it gathers the vehicle track data of the associated vehicle tracks having a track dependency relationship with the target area to obtain the initial area data. The server then selects, from the initial area data, the data in the target statistical dimension, and performs index aggregation processing on that data to obtain the to-be-processed area data. In other words, the index aggregation processing is applied only to the data in the target statistical dimension.
Illustratively, the vehicle track data of historical vehicle track A1 includes the object information of the objects riding in the vehicle within the historical time period, together with other data, and the object information indicates that 3 people ride in the vehicle. The vehicle track data of historical vehicle track A2 includes the object information of the objects riding in the vehicle within the historical time period, together with other data, and the object information indicates that 1 person rides in the vehicle. The vehicle track data of historical vehicle track A3 includes the object information of the objects riding in the vehicle within the historical time period, together with other data, and the object information indicates that 2 people ride in the vehicle. The initial area data therefore includes: the rider object information and other related data of historical vehicle track A1, the rider object information and other related data of historical vehicle track A2, and the rider object information and other related data of historical vehicle track A3, all within the historical time period.
Based on this, take as an example the case where the object information of vehicle-related objects in the target statistical dimension is the object information of objects riding in the vehicle within the historical time period. The server selects from the initial area data the rider object information of historical vehicle track A1, of historical vehicle track A2, and of historical vehicle track A3 within the historical time period, and performs index aggregation processing on it, that is, it aggregates the data of the object information of the vehicle-related objects. The to-be-processed area data obtained is: the total number of people riding in the vehicles of the associated vehicle tracks within the historical time period is 6 (3+1+2).
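The aggregation in this example can be sketched as follows. This is a minimal illustration, assuming each associated vehicle track is represented as a record; the field names `track`, `riders`, and `other` are hypothetical, and summation is used as the aggregation because the example reports a total.

```python
# Hypothetical initial area data: one record per associated vehicle track.
# "other" stands in for the remaining vehicle track data of each track.
initial_area_data = [
    {"track": "A1", "riders": 3, "other": None},
    {"track": "A2", "riders": 1, "other": None},
    {"track": "A3", "riders": 2, "other": None},
]


def aggregate_dimension(records, dimension):
    """Keep only the requested dimension of each record and sum it."""
    return sum(record[dimension] for record in records)


# To-be-processed area data: total riders across the associated tracks.
pending_area_value = aggregate_dimension(initial_area_data, "riders")
print(pending_area_value)
```

Only the selected dimension is read from each record, which mirrors the point that data outside the target statistical dimension is not carried forward.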
Step 506, performing data sample expansion processing on the data of the area to be processed based on the target sample expansion data related to the target area, so as to obtain target data.
The target sample expansion data is specifically a statistic of the standard sample expansion coefficients of the target Mercator grid corresponding to the target area, and the statistic may be the mean, the median, the maximum, the minimum, or the like. Specifically, the vehicle track data can be regarded as a sample drawn, without replacement and at a certain proportion, from the target data of the target place, so in an actual application scenario the obtained to-be-processed area data needs to undergo data sample expansion processing. The server therefore acquires the target sample expansion data related to the target area, and then performs data sample expansion processing on the to-be-processed area data through the target sample expansion data to obtain the target data.
Based on this, the server takes the target sample expansion data as the sample expansion coefficient of the target area, and performs data sample expansion processing on the to-be-processed area data based on this coefficient to obtain the target data. It can be understood that, in the present solution, once the statistical visit data of the target place within the historical time period is obtained, data sample expansion may also be performed on it along different statistical dimensions within the target area where the target place is located, for example separately for workdays and holidays, or along the object type dimension, which is not limited herein.
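A minimal sketch of coefficient-based sample expansion is given below. It assumes that expansion means multiplying the to-be-processed value by a single coefficient, and that the coefficient is a statistic (mean by default) of per-grid coefficients; the coefficient values and the function name `expand_sample` are made up for illustration.

```python
from statistics import mean, median


def expand_sample(pending_value, grid_coefficients, statistic=mean):
    """Scale a sampled count by a statistic of per-grid expansion coefficients.

    Assumption: the expansion is a plain multiplication by one coefficient.
    """
    coefficient = statistic(grid_coefficients)
    return pending_value * coefficient


# Made-up standard expansion coefficients for the grids covering the area.
grid_coefficients = [2.0, 2.5, 3.0]

target_value = expand_sample(6, grid_coefficients)        # mean coefficient
target_value_max = expand_sample(6, grid_coefficients, max)
print(target_value, target_value_max)
```

Passing `median`, `max`, or `min` as `statistic` corresponds to the alternative statistics listed above.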
It will be appreciated that all examples in this embodiment are for the understanding of the present solution and should not be construed as a specific limitation on the present solution.
In this embodiment, performing index aggregation processing on the data in the target statistical dimension ensures that the obtained to-be-processed area data describes the required data more accurately and removes unnecessary redundant data, which improves the efficiency of subsequent data processing. Data sample expansion processing is then performed on this accurate to-be-processed area data, which avoids missed samples and further improves the integrity of the target data while preserving its accuracy.
The method of how the data sample expansion processing is performed is described in detail below: in one embodiment, as shown in fig. 6, performing data sample expansion processing on data of a region to be processed based on target sample expansion data related to a target region to obtain target data, including:
Step 602, converting the target area into a corresponding target Mercator grid, and obtaining target sample expansion data in the target Mercator grid, where the target sample expansion data at least includes: population flow data, and vehicle data within the same time window as the population flow data.
The target sample expansion data is specifically a statistic of the standard sample expansion coefficients of the target Mercator grid, where the statistic may be the mean, the median, the maximum, the minimum, or the like. That is, the target sample expansion data in this embodiment is specifically derived from: population flow data acquired from public channels organized by Mercator grid division, and vehicle data within the same time window as the population flow data.
Specifically, the server first converts the target area into a corresponding target Mercator grid, and then obtains, from public channels organized by Mercator grid division, population flow data and vehicle data within the same time window as the population flow data, so as to construct the target sample expansion data in the target Mercator grid. The target area may be a regular quadrilateral area or an irregular polygonal area. When the target area is a regular quadrilateral area, a preset grid step length can be determined based on the actual application requirement, the target area is divided into regular grids by the preset step length, and the regular grids are then converted into the target Mercator grid. The case where the target area is an irregular polygonal area is described in the following embodiments.
Step 604, performing data sample expansion processing on the to-be-processed area data through the target sample expansion data to obtain the target data.
Specifically, the server performs data sample expansion processing on the to-be-processed area data through the target sample expansion data to obtain the target data. Various models and sample expansion techniques can be used in this processing, that is, the data sample expansion processing in this embodiment includes, but is not limited to: coefficient-based sample expansion, sample expansion according to grid features using a machine learning model, and sample expansion that fuses grid features with surrounding multi-modal information using a deep learning model. These are not described in detail herein.
It can be understood that, in addition to the sample expansion data acquired through the Mercator grid, statistical visit data within the target area can also be acquired in practical applications, and this embodiment can further improve the reliability of the data by calibrating it at different levels. For example, taking an urban area as the target place, that is, the target area is the urban area, the sample expansion coefficient of the urban area can be calculated based on the vehicle data of the urban area after grid sample expansion and the population figure obtained from related literature or demographic statistics. This urban-area sample expansion coefficient is then used as the target sample expansion data to perform data sample expansion processing on the to-be-processed area data. The method of data sample expansion processing and the manner of acquiring the target sample expansion data are not specifically limited here.
In this embodiment, converting the area into a Mercator grid allows the statistic of the standard sample expansion coefficients of the Mercator grid to be obtained accurately, so that sample expansion processing is performed on the data in the target area with an accurate statistic. This further ensures the accuracy and integrity of the target data while avoiding missed samples.
Since the target Mercator grid cannot be obtained by direct conversion when the target area is an irregular closed area, the method of constructing the target Mercator grid in this case is described in detail below. In one embodiment, as shown in fig. 7, the target area is an irregular closed area composed of a plurality of region coordinate points.
In the case where the target area is an irregular closed area, that is, the closed area composed of a plurality of region coordinate points is irregular, as shown in fig. 8, the target area 801 is specifically the irregular closed area formed by region coordinate points 802 to 817.
Based on this, converting the target area into a corresponding target Mercator grid comprises:
step 702, constructing a coordinate point array to be selected, and adding each region coordinate point in the target region to the coordinate point array to be selected.
Specifically, the server first constructs a coordinate point array to be selected, which is initially empty, that is, it does not include any coordinate point. The server then adds each region coordinate point composing the target area into the coordinate point array to be selected, so that after the addition the array specifically includes every region coordinate point composing the target area.
It can be understood that, in practical application, the coordinate system corresponding to the regional coordinate points may be a WGS84 coordinate system or a GCJ02 coordinate system, and in consideration of consistency of data processing, the coordinate points in this embodiment all need to be processed based on the GCJ02 coordinate system, so if the regional coordinate points are coordinate points described based on the WGS84 coordinate system, the server also needs to perform geographic coordinate conversion on each regional coordinate point, that is, needs to convert the regional coordinate points into coordinate points described based on the GCJ02 coordinate system, and then adds the regional coordinate points after the geographic coordinate conversion to the array of coordinate points to be selected.
In step 704, a set of coordinate information is determined based on the coordinate information corresponding to each of the regional coordinate points.
Specifically, the server determines a set of coordinate information based on the coordinate information corresponding to each region coordinate point, respectively. The foregoing set of coordinate information may include: longitude information and latitude information in the coordinate information corresponding to each regional coordinate point respectively, or maximum longitude information and maximum latitude information in the coordinate information corresponding to each regional coordinate point respectively, or minimum longitude information and minimum latitude information in the coordinate information corresponding to each regional coordinate point respectively. And are not limited herein.
Step 706, performing space division on the target area based on the coordinate point array to be selected and the coordinate information set, and determining the target Mercator grid corresponding to the target area after space division.
The space division is used to divide the target area into a plurality of sub-areas, and the sub-areas may have the same or different shapes. Specifically, the server performs space division on the target area based on the coordinate point array to be selected and the coordinate information set to obtain a plurality of sub-areas, and then establishes a mapping relationship between each sub-area and a Mercator grid, so that the target Mercator grid is constructed from the plurality of Mercator grids that have a mapping relationship with the respective sub-areas.
In this embodiment, the space division uses the region coordinate points composing the target area together with the longitude information and latitude information of each region coordinate point, so the actual position of each coordinate point and the actual longitude and latitude extent of the target area are taken into account, which ensures the reliability of the space division. Establishing the mapping relationship in turn ensures the reliability and accuracy of the constructed target Mercator grid, and thus the integrity and accuracy of the obtained statistic of the standard sample expansion coefficients of the Mercator grid.
The following describes in detail how the space division is performed and how the target Mercator grid is determined after the space division. In one embodiment, as shown in fig. 9, the coordinate information set includes the minimum longitude information, the maximum longitude information, the minimum latitude information, and the maximum latitude information among the coordinate information corresponding to each region coordinate point.
The minimum longitude information (Lngmin) is specifically the minimum value among the longitude information of the coordinate information corresponding to each region coordinate point. Similarly, the maximum longitude information (Lngmax) is specifically the maximum value among the longitude information of the coordinate information corresponding to each region coordinate point. The minimum latitude information (Latmin) is specifically the minimum value among the latitude information of the coordinate information corresponding to each region coordinate point, and the maximum latitude information (Latmax) is specifically the maximum value among the latitude information of the coordinate information corresponding to each region coordinate point.
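The four extreme values amount to the bounding box of the region coordinate points. A minimal sketch, assuming each region coordinate point is a `(longitude, latitude)` tuple and using made-up coordinates:

```python
def coordinate_information_set(region_points):
    """Return (Lngmin, Lngmax, Latmin, Latmax) for (longitude, latitude) points."""
    longitudes = [lng for lng, _ in region_points]
    latitudes = [lat for _, lat in region_points]
    return min(longitudes), max(longitudes), min(latitudes), max(latitudes)


# Made-up region coordinate points of an irregular closed area.
region_points = [(113.90, 22.50), (113.95, 22.48), (113.97, 22.55), (113.92, 22.57)]

lng_min, lng_max, lat_min, lat_max = coordinate_information_set(region_points)
print(lng_min, lng_max, lat_min, lat_max)
```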
Based on the above, performing space division on the target area based on the coordinate point array to be selected and the coordinate information set, and determining the target Mercator grid corresponding to the target area after space division, comprises the following steps:
In step 902, a longitude set and a latitude set are constructed based on the coordinate information set, the longitude set includes longitude information between the minimum longitude information and the maximum longitude information, the difference between each longitude information is a preset step, the latitude set includes latitude information between the minimum latitude information and the maximum latitude information, and the difference between each latitude information is a preset step.
Specifically, the longitude set includes the longitude information between the minimum longitude information and the maximum longitude information, the difference between adjacent longitude information is the preset step, and the longitude set includes the minimum longitude information and the maximum longitude information themselves. For example, if the minimum longitude information is 100, the maximum longitude information is 100.10, and the preset step size is 0.01, the longitude set specifically includes: 100, 100.01, 100.02, 100.03, 100.04, 100.05, 100.06, 100.07, 100.08, 100.09, 100.10.
Secondly, the latitude set includes the latitude information between the minimum latitude information and the maximum latitude information, the difference between adjacent latitude information is the preset step, and the latitude set includes the minimum latitude information and the maximum latitude information themselves. For example, if the minimum latitude information is 100, the maximum latitude information is 100.20, and the preset step size is 0.02, the latitude set specifically includes: 100, 100.02, 100.04, 100.06, 100.08, 100.10, 100.12, 100.14, 100.16, 100.18, 100.20.
Specifically, the server takes the size of the Mercator grid as the preset step length and divides the coordinate information by it. That is, the range from the minimum latitude information to the maximum latitude information is divided by the preset step length into a plurality of latitude information to obtain the latitude set, and the range from the minimum longitude information to the maximum longitude information is divided by the preset step length into a plurality of longitude information to obtain the longitude set.
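The construction of a longitude or latitude set can be sketched as below, using a step of 0.01, which matches the eleven values listed in the longitude example. The sketch assumes the span between minimum and maximum is a whole multiple of the step, so that both endpoints are included; the function name and the rounding precision are illustrative choices. Values are generated from an integer step count rather than by repeated addition to limit floating-point drift.

```python
def stepped_values(minimum, maximum, step, ndigits=6):
    """List values from minimum to maximum inclusive, spaced by step.

    Assumes (maximum - minimum) is a whole multiple of step.
    """
    count = int(round((maximum - minimum) / step))
    return [round(minimum + i * step, ndigits) for i in range(count + 1)]


longitude_set = stepped_values(100, 100.10, 0.01)
print(longitude_set)
```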
In step 904, the longitude set and the latitude set are subjected to Cartesian product to obtain a plurality of longitude and latitude coordinate points, and each longitude and latitude coordinate point is added into the coordinate point array to be selected.
Specifically, the server takes the Cartesian product of the longitude set and the latitude set to obtain a plurality of longitude and latitude coordinate points, and adds each longitude and latitude coordinate point to the coordinate point array to be selected. That is, the server pairs each longitude information in the longitude set with each latitude information in the latitude set, so that each resulting longitude and latitude coordinate point consists of one longitude information from the longitude set and one latitude information from the latitude set.
For example, if the longitude set includes longitude information E1, E2, E3, and E4, and the latitude set includes latitude information F1, F2, and F3, then the longitude and latitude coordinate points obtained by taking the Cartesian product of the longitude set and the latitude set are specifically: G1 (E1, F1), G2 (E1, F2), G3 (E1, F3), G4 (E2, F1), G5 (E2, F2), G6 (E2, F3), G7 (E3, F1), G8 (E3, F2), G9 (E3, F3), G10 (E4, F1), G11 (E4, F2), and G12 (E4, F3).
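The pairing in this example can be reproduced with a short sketch. The labels E1 to E4 and F1 to F3 are placeholders for actual longitude and latitude values.

```python
from itertools import product

longitude_set = ["E1", "E2", "E3", "E4"]
latitude_set = ["F1", "F2", "F3"]

# Cartesian product: every longitude paired with every latitude.
# The order matches G1..G12 in the example (latitude varies fastest).
lat_lng_points = list(product(longitude_set, latitude_set))

# Append the generated points to the (hypothetical) array of points to be selected.
points_to_select = []
points_to_select.extend(lat_lng_points)
print(len(points_to_select))
```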
It can be understood that, in practical application, the coordinate system corresponding to the longitude and latitude coordinate points may be a WGS84 coordinate system or a GCJ02 coordinate system, and considering consistency of data processing, in this embodiment, the coordinate points all need to be processed based on the GCJ02 coordinate system, so if the longitude and latitude coordinate points are coordinate points described based on the WGS84 coordinate system, the server also needs to perform geographic coordinate conversion on each longitude and latitude coordinate point, that is, needs to convert the longitude and latitude coordinate points into coordinate points described based on the GCJ02 coordinate system, and then adds the longitude and latitude coordinate points after the geographic coordinate conversion to the array of coordinate points to be selected.
Step 906, performing a topological relation test on each coordinate point to be selected in the coordinate point array to be selected so as to retain the candidate coordinate points located in the target area, and constructing a regional coordinate point array based on the candidate coordinate points located in the target area.
Each coordinate point to be selected in the coordinate point array to be selected is specifically either a region coordinate point or a longitude and latitude coordinate point, or, where geographic coordinate conversion has been applied, a converted region coordinate point or a converted longitude and latitude coordinate point. Second, methods for the topological relation test include, but are not limited to, the DE-9IM matrix.
Specifically, the server determines the regional coordinate point and the longitude and latitude coordinate point added to the coordinate point array to be selected as the coordinate point to be selected belonging to the coordinate point array to be selected, or determines the regional coordinate point after the geographic coordinate conversion and the longitude and latitude coordinate point after the geographic coordinate conversion added to the coordinate point array to be selected as the coordinate point to be selected belonging to the coordinate point array to be selected.
Based on the above, the server traverses the coordinate point array to be selected and performs a topological relation test on each coordinate point to be selected to determine whether it is located in the target area. According to the results, the server distinguishes the candidate coordinate points located in the target area from those not located in the target area, retains the former, and constructs the regional coordinate point array based on the candidate coordinate points located in the target area.
For example, if the longitude and latitude coordinate points are specifically G1 to G12 and the region coordinate points are specifically H1 to H10, the server needs to perform the topological relation test on the longitude and latitude coordinate points G1 to G12 and the region coordinate points H1 to H10 in the coordinate point array to be selected to determine whether each of them is located in the target area. If the longitude and latitude coordinate points G1 to G9 and the region coordinate points H1 to H9 are located in the target area, the regional coordinate point array is constructed based on G1 to G9 and H1 to H9.
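The filtering step can be illustrated with a simple point-in-polygon test. Note that this sketch substitutes the ray casting (even-odd) rule for the DE-9IM matrix named above, purely to keep the example dependency-free; it treats the target area as a list of `(x, y)` vertices and does not define how points lying exactly on the boundary are classified, which a DE-9IM based test would handle explicitly.

```python
def point_in_polygon(point, polygon):
    """Even-odd ray casting test for a point against a closed polygon."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Made-up irregular (L-shaped) target area and candidate points.
target_area = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
points_to_select = [(1, 1), (3, 3), (1, 3), (5, 1)]

# Keep only the candidates that fall inside the target area.
region_point_array = [p for p in points_to_select if point_in_polygon(p, target_area)]
print(region_point_array)
```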
Step 908, determining a grid array corresponding to the Mercator grids to be constructed, and adding the Mercator grids that have each candidate coordinate point in the regional coordinate point array as a vertex into the grid array.
Specifically, the server determines a grid array corresponding to the Mercator grids to be constructed, and then adds the Mercator grids that have each candidate coordinate point in the regional coordinate point array as a vertex into the grid array. That is, the server constructs a grid array that holds the unique identifiers of the Mercator grids to be constructed, and then traverses the candidate coordinate points located in the target area in the regional coordinate point array to determine the Mercator grids that have each candidate coordinate point as a vertex. There are usually four such Mercator grids for a candidate coordinate point, so these four Mercator grids are added into the grid array. Since a Mercator grid can be uniquely identified by its grid identifier, the server adds the grid identifiers respectively corresponding to the four Mercator grids into the grid array.
Further, further description will be made based on the foregoing example, the regional coordinate point array is constructed with the longitude and latitude coordinate points G1 to G9 and the regional coordinate points H1 to H9, that is, four ink card support grids with the longitude and latitude coordinate point G1 as the vertex, four ink card support grids with the longitude and latitude coordinate point G9 as the vertex, four ink card support grids with the regional coordinate point H1 as the vertex, and four ink card support grids with the regional coordinate point H9 as the vertex are required to be determined, and the grid identifications corresponding to the respective obtained ink card support grids are added to the grid array.
To facilitate understanding of the ink-held grid in which candidate coordinate points are vertices, as shown in fig. 10, as candidate coordinate point 1001, and candidate coordinate point 1001 is located within the target area, when the lower right corner vertex of ink-held grid 1002 is candidate coordinate point 1001, the lower left corner vertex of ink-held grid 1003 is candidate coordinate point 1001, the upper right corner vertex of ink-held grid 1004 is candidate coordinate point 1001, and the upper left corner vertex of ink-held grid 1005 is candidate coordinate point 1001, it is determined that ink-held grid 1002, ink-held grid 1003, ink-held grid 1004, and ink-held grid 1005 need to be added to the grid array. The above steps are performed for all candidate coordinate points located in the target region in the region coordinate point array, and are not repeated here.
In step 910, an ink-card-holder projection is performed on the ink-card-holder grid included in the grid array to construct a target ink-card-holder grid.
Specifically, the server performs ink-card-holder projection on the ink-card-holder grid included in the grid array to construct a target ink-card-holder grid. Specifically, the server projects longitude and latitude coordinates of southwest angles of the ink card support grids through the ink card support to obtain an abscissa and an ordinate of each ink card support grid, and then splices the abscissa and the ordinate of each ink card support grid by using a connector to construct a final target ink card support grid, wherein the target ink card support grid has a corresponding unique grid identifier.
Optionally, after the grid array is obtained, the grid identifiers included in the grid array may be subjected to identification deduplication, that is, the grid identifiers that are repeatedly consistent and included in the grid array are deduplicated, the grid array that is subjected to identification deduplication is used as a space segmentation result of the target area, and then the grid array that is subjected to identification deduplication is subjected to ink card support projection based on the foregoing description manner, so as to construct a target ink card support grid, where the target ink card support grid has a corresponding unique grid identifier.
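The grid collection, deduplication, and projection steps above can be sketched as follows. This is a minimal illustration, not the method of the application: the lattice indexing scheme, the parameters `lon0`, `lat0`, and `step_deg`, the spherical Mercator formula with the WGS84 radius, and the `"_"` connector are all assumptions introduced for the example.

```python
import math

EARTH_RADIUS_M = 6378137.0  # assumed sphere radius (WGS84 semi-major axis)


def mercator_project(lon_deg, lat_deg):
    """Spherical Mercator projection of a lon/lat point, in meters."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


def grids_around(col, row):
    """Southwest-corner lattice indices of the four grids that share vertex (col, row)."""
    return [(col - 1, row - 1), (col, row - 1), (col - 1, row), (col, row)]


def build_target_grids(candidate_vertices, lon0, lat0, step_deg, connector="_"):
    """Collect, deduplicate, and project the grids around each candidate vertex.

    candidate_vertices: (col, row) lattice indices of vertices inside the target area.
    Returns a dict mapping grid identifier -> southwest-corner lattice index.
    """
    grid_array = []
    for col, row in candidate_vertices:
        grid_array.extend(grids_around(col, row))

    unique_grids = sorted(set(grid_array))  # identifier deduplication

    target_grids = {}
    for col, row in unique_grids:
        # Project the southwest corner and join abscissa and ordinate with the connector.
        x, y = mercator_project(lon0 + col * step_deg, lat0 + row * step_deg)
        target_grids[f"{round(x)}{connector}{round(y)}"] = (col, row)
    return target_grids


# Two neighbouring vertices share two of their four grids, so 8 entries reduce to 6.
grids = build_target_grids([(1, 1), (2, 1)], lon0=114.0, lat0=22.5, step_deg=0.01)
```

Working with integer lattice indices until the final projection keeps the deduplication exact, which would be harder to guarantee with floating-point longitude and latitude values.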
It will be appreciated that all examples in this embodiment are for the understanding of the present solution and should not be construed as a specific limitation on the present solution.
In this embodiment, the longitude set and the latitude set are determined from the longitude and latitude information, so that together they describe the longitude and latitude maximum values and minimum values of the target area. The longitude and latitude coordinate points obtained by the Cartesian product of the two sets can therefore describe the geographic area where the target area is located over a larger range and more completely, and the data in the resulting to-be-selected coordinate point array is more complete. Topology processing and Mercator projection are then performed on this more complete to-be-selected coordinate point array, so that the obtained target Mercator grids fit the actual spatial extent of the target area more closely, which ensures the reliability and accuracy of the constructed target Mercator grids.
Performing statistical calculation only after the data has been acquired can take considerable time. Therefore, before the statistical calculation, an aggregation index formula library may be constructed so that the corresponding aggregation index formula can be obtained directly according to the statistical requirement. This is described below. In one embodiment, as shown in fig. 11, the data processing method further includes:
Step 1102, respectively performing statistical aggregation processing on each statistical dimension together with the time dimension and the space dimension to obtain the respective aggregation index formula of each statistical dimension; wherein the statistical aggregation includes any one of: sum, sum after de-duplication.
In practical applications, based on different scene requirements, the statistical aggregation may further include average, maximum, minimum, median, and the like, which is not limited herein. The time dimension includes, but is not limited to: natural days, natural weeks, natural months, natural quarters, natural years, and holidays. The space dimension is specifically a spatial region.
Specifically, on the basis of the time dimension and the space dimension, the server additionally takes each statistical dimension into account and performs statistical aggregation processing over all of them to obtain the respective aggregation index formula of each statistical dimension. The statistical dimension considered may be single or multiple, and the two cases are described separately below:
For example, if each vehicle may have trajectory data with n statistical dimensions, the statistical dimensions include: F_1, F_2, F_3, ..., F_(n-1), F_n.
For the case where the statistical dimension is single:
In this case, attention is paid to the data corresponding to each of the n statistical dimensions individually, that is, to the distribution of a single statistical dimension in the statistical process. Thus, a statistical dimension F_i is first selected as the statistical dimension to be analyzed, and statistical aggregation processing is then performed based on the statistical dimension to be analyzed (that is, F_i), the time dimension, and the space dimension to obtain the aggregation index formula corresponding to F_i. These steps are repeated until all n statistical dimensions have been processed, at which point the aggregation index formula corresponding to each of the n statistical dimensions is obtained.
For the case where the statistical dimension is multiple:
In this case, attention is paid to the data obtained by cross statistics over several of the n statistical dimensions. Cross-statistical data can carry more information, that is, it can provide more comprehensive portrait data from multiple statistical dimensions, which helps to profile visiting users accurately. Therefore, several statistical dimensions are selected from the n statistical dimensions to construct a statistical dimension subset; for example, a subset may consist of the statistical dimensions F_i and F_j, or of the statistical dimensions F_i, F_j, and F_h. One statistical dimension subset is then selected as the statistical dimension to be analyzed (which here comprises several statistical dimensions), and statistical aggregation processing is performed based on this subset, the time dimension, and the space dimension to obtain the aggregation index formula corresponding to the subset. These steps are repeated until every statistical dimension subset has been processed, at which point the aggregation index formula corresponding to each statistical dimension subset is obtained.
Therefore, the server can obtain the aggregation index formulas corresponding to the statistical dimensions respectively when the statistical dimensions are single, and the aggregation index formulas corresponding to the statistical dimension subsets when the statistical dimensions are multiple, so that an aggregation index formula library is constructed, and the server can directly store the aggregation index formula library in the data storage system, so that the acquisition of the aggregation index formulas is conveniently carried out through the aggregation index formula library in practical application.
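As a rough illustration of how such an aggregation index formula library might be organized, the sketch below enumerates single statistical dimensions and small subsets of them, and stores one entry per subset. The function name `build_formula_library`, the `max_subset_size` cap, and the representation of a "formula" as a group-by key plus a list of aggregation names are assumptions made for the example; the source does not specify a concrete representation.

```python
from itertools import combinations


def build_formula_library(stat_dims, max_subset_size=2):
    """Precompute one aggregation entry per single dimension and per dimension subset.

    Each entry records the grouping key (time, space, plus the chosen statistical
    dimensions) and the aggregations available for it.
    """
    library = {}
    for size in range(1, max_subset_size + 1):
        for subset in combinations(stat_dims, size):
            library[frozenset(subset)] = {
                "group_by": ("time", "space") + subset,
                "aggregations": ("sum", "sum_after_dedup"),
            }
    return library


library = build_formula_library(["F1", "F2", "F3"])

# Looking up the entry for a target statistical dimension is then a dictionary access.
target_formula = library[frozenset({"F2", "F3"})]
```

Keying the library by the set of dimensions, rather than by an ordered tuple, means a lookup does not depend on the order in which the target statistical dimensions are listed.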
Based on this, the statistical calculation is performed on the data in the target statistical dimension among the vehicle track data of the associated vehicle track to obtain the target data, including:
Step 1104, determining a target aggregation indicator formula of the target statistical dimension based on the respective aggregation indicator formulas of the statistical dimensions.
Specifically, after determining the target statistical dimension, the server determines the target aggregation index formula of the target statistical dimension from the aggregation index formulas included in the obtained aggregation index formula library. For example, if the target statistical dimension is the statistical dimension F_3, the server can directly take the aggregation index formula corresponding to F_3 as the target aggregation index formula. Alternatively, if the target statistical dimension comprises the statistical dimensions F_2, F_3, and F_4, the server can directly take the aggregation index formula corresponding to the statistical dimension subset containing F_2, F_3, and F_4 as the target aggregation index formula.
Step 1106, based on the target aggregation index formula, performing statistical calculation on the data in the target statistical dimension in the vehicle track data of the associated vehicle track to obtain the target data.
Specifically, the server performs statistical calculation on the data in the target statistical dimension in the vehicle track data of the associated vehicle track based on the target aggregation index formula to obtain the target data. The statistical calculation of the target data has been described in the foregoing embodiments and is not repeated here.
Optionally, since the statistical aggregation includes any one of sum and sum after de-duplication, when the statistical aggregation is a sum, the target data describes the number of visits (person-times) to the target site during the historical time period, and when the statistical aggregation is a sum after de-duplication, the target data describes the number of distinct people who visited the target site during the historical time period.
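The difference between the two aggregations can be made concrete with a small sketch. It assumes one reading of the passage, namely that a plain sum counts every visit while a sum after de-duplication counts each visitor once; the record layout `(day, area_id, user_id)` and the function name are hypothetical.

```python
from collections import defaultdict


def aggregate_visits(records):
    """Return {(day, area): (visit_count, distinct_visitor_count)}.

    records: iterable of (day, area_id, user_id) tuples (an assumed layout).
    """
    visit_count = defaultdict(int)    # plain sum: every record counts
    visitor_ids = defaultdict(set)    # sum after de-duplication: each user once
    for day, area, user in records:
        key = (day, area)
        visit_count[key] += 1
        visitor_ids[key].add(user)
    return {key: (visit_count[key], len(visitor_ids[key])) for key in visit_count}


records = [
    ("2023-07-01", "site_A", "u1"),
    ("2023-07-01", "site_A", "u1"),  # the same user visits twice on the same day
    ("2023-07-01", "site_A", "u2"),
]
totals = aggregate_visits(records)
```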
It will be appreciated that all examples in this embodiment are for the understanding of the present solution and should not be construed as a specific limitation on the present solution.
In this embodiment, for the case that the statistical dimension is single and for the case that the statistical dimension is multiple, the aggregation index formulas of different statistical dimensions are calculated in advance, so that the target aggregation index formula of the target statistical dimension can be directly determined in the actual statistical calculation process, the data processing amount is reduced, and the efficiency of acquiring the target data is improved.
In one embodiment, as shown in FIG. 12, determining an associated vehicle track having track dependency relationship with a target area from a plurality of historical vehicle tracks includes:
Step 1202, performing track point screening on each historical vehicle track point to obtain a plurality of screening track points.
Specifically, the server performs track point screening on each historical vehicle track point included in each historical vehicle track. Historical vehicle track points found to be abnormal are determined to be screened-out track points, and the remaining normal historical vehicle track points are kept as screening track points. Screened-out track points take no part in the subsequent processing steps.
In this embodiment, the track point abnormality determination may be performed on the track points of the historical vehicle in the same historical vehicle track, or the track point abnormality determination may be performed based on the area where each of the track points of the historical vehicle is located, and the method for determining the screening track points is described in detail based on the two modes respectively:
optionally, as shown in fig. 13, track point screening is performed on each historical vehicle track point to obtain a plurality of screened track points, including:
Step 1302, determining a track area where each historical vehicle track point is located based on the coordinate information of each historical vehicle track point.
Specifically, the server determines the track area in which each historical vehicle track point is located based on the coordinate information of that historical vehicle track point. For example: the historical vehicle track point B1 is on a sidewalk, the historical vehicle track point B2 is on a water area, the historical vehicle track point B3 is on a zebra crossing, and the historical vehicle track point B4 is on a normal road area.
Step 1304, determining the historical vehicle track points whose track areas are abnormal areas as screened-out track points, and determining the historical vehicle track points whose track areas are not abnormal areas as screening track points.
An abnormal area is an area that a vehicle cannot reach in normal driving, such as a sidewalk, a water area, or a non-service area. Specifically, based on the track area in which each historical vehicle track point is located, the server determines the historical vehicle track points located in abnormal areas as screened-out track points, which are not considered in subsequent processing, and determines the historical vehicle track points whose track areas are not abnormal areas as screening track points.
Based on the foregoing example, the historical vehicle track point B1 is on a sidewalk and the historical vehicle track point B2 is on a water area, so both are in abnormal areas and are screened-out track points. The historical vehicle track point B3 is on a zebra crossing and the historical vehicle track point B4 is on a normal road area, so neither is in an abnormal area and both are screening track points.
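The area-based screening can be sketched as a simple partition. The area type labels and the lookup table `area_of` below are hypothetical stand-ins for whatever map-matching result the server actually uses to decide which area a track point lies in.

```python
# Assumed labels for areas a vehicle cannot reach in normal driving.
ABNORMAL_AREA_TYPES = {"sidewalk", "water", "non_service_area"}


def screen_by_area(points, area_of):
    """Split track points into (kept, removed) according to their area type."""
    kept, removed = [], []
    for point in points:
        if area_of[point] in ABNORMAL_AREA_TYPES:
            removed.append(point)
        else:
            kept.append(point)
    return kept, removed


area_of = {
    "B1": "sidewalk",
    "B2": "water",
    "B3": "zebra_crossing",
    "B4": "road",
}
kept, removed = screen_by_area(["B1", "B2", "B3", "B4"], area_of)
```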
Optionally, as shown in fig. 14, performing track point screening on each historical vehicle track point to obtain a plurality of screened track points, including:
step 1402, calculating the track point information difference between track point information of adjacent historical vehicle track points in the same historical vehicle track; the track point information is time information or coordinate information.
The track point information is time information or coordinate information. Therefore, the track point information difference may be a time information difference between time information of adjacent historical vehicle track points, or may also be a coordinate information difference between coordinate information of adjacent historical vehicle track points.
Specifically, the server performs the same calculation for each historical vehicle track: it calculates the track point information difference between the track point information of adjacent historical vehicle track points in the same historical vehicle track. Taking time information as the track point information, and assuming that the historical vehicle track points B1 to B4 are consecutive, the server calculates the time information difference between B1 and B2, the time information difference between B2 and B3, and the time information difference between B3 and B4. The calculation is similar when the track point information is coordinate information and is not repeated here.
Step 1404, determining adjacent historical vehicle track points whose track point information difference is not in the track point information range as screened-out track points, and determining adjacent historical vehicle track points whose track point information difference is in the track point information range as screening track points.
The track point information range may be a time information range or a coordinate information range, depending on the track point information used. Specifically, the server determines adjacent historical vehicle track points whose track point information difference is not in the track point information range as screened-out track points. For ease of understanding, take the time information difference as an example: if the time information difference between two adjacent historical vehicle track points of the same historical vehicle track is not in the time information range, both adjacent historical vehicle track points can be determined to be screened-out track points. Alternatively, the server may further consider the time information difference between the later of the two adjacent historical vehicle track points and the historical vehicle track point that follows it: if that difference is also not in the time information range, both adjacent historical vehicle track points are determined to be screened-out track points, and if that difference is in the time information range, only the earlier of the two adjacent historical vehicle track points is determined to be a screened-out track point. The historical vehicle track points that are not screened out in this way are determined to be screening track points, that is, the server determines adjacent historical vehicle track points whose track point information difference is in the track point information range as screening track points.
Illustratively, suppose the time information differences between B1 and B2, between B2 and B3, and between B3 and B4 have been calculated. If the time information difference between B1 and B2 is not in the time information range, while the time information difference between B2 and B3 and the time information difference between B3 and B4 are both in the time information range, then only the historical vehicle track point B1 is determined to be a screened-out track point, and the historical vehicle track points B2, B3, and B4 are all screening track points.
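The second screening variant can be sketched as below, under one reading of the rule: when the gap between two adjacent points is out of range, the earlier point is always removed, and the later point is removed only if its gap to the following point is also out of range. Removing the later point when it has no following point is a further assumption of this sketch, as are the function name and the use of plain numeric timestamps.

```python
def screen_by_time_gap(timestamps, min_gap, max_gap):
    """Return (kept_indices, removed_indices) for one track's ordered timestamps."""
    n = len(timestamps)

    def gap_in_range(i):
        # Gap between point i and point i + 1.
        return min_gap <= timestamps[i + 1] - timestamps[i] <= max_gap

    removed = set()
    for i in range(n - 1):
        if gap_in_range(i):
            continue
        removed.add(i)  # the earlier point of an out-of-range pair is removed
        has_following_point = i + 2 < n
        if not has_following_point or not gap_in_range(i + 1):
            removed.add(i + 1)  # the later point is removed too

    kept = [i for i in range(n) if i not in removed]
    return kept, sorted(removed)


# B1..B4 at indices 0..3: only the B1-B2 gap (600 s) falls outside the range [1, 60].
kept_idx, removed_idx = screen_by_time_gap([0, 600, 610, 620], min_gap=1, max_gap=60)
```

With these inputs the sketch removes only index 0 (B1) and keeps indices 1 to 3 (B2 to B4), which agrees with the example in the text.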
Step 1204, determining a target vehicle track point falling into the target area from the plurality of screening track points.
Specifically, the server determines a target vehicle track point that falls within the target area from among the plurality of screening track points. That is, the server determines the target vehicle track point whose coordinate information falls into the target area based on the respective coordinate information of each screening track point.
Step 1206, determining the historical vehicle track to which the target vehicle track point belongs as the associated vehicle track.
Specifically, the server determines the historical vehicle track to which each target vehicle track point belongs as an associated vehicle track. For example, if the determined target vehicle track points are the historical vehicle track points B2, B3, and B20, where B2 and B3 belong to the historical vehicle track A1 and B20 belongs to the historical vehicle track A3, then the associated vehicle tracks are the historical vehicle track A1 and the historical vehicle track A3.
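Mapping target vehicle track points back to their tracks amounts to collecting the distinct track identifiers. A minimal sketch, assuming a hypothetical lookup table `track_of` from track point to the track it belongs to:

```python
def associated_tracks(target_points, track_of):
    """Return the distinct tracks that own at least one target point, in first-seen order."""
    tracks = []
    for point in target_points:
        track = track_of[point]
        if track not in tracks:
            tracks.append(track)
    return tracks


track_of = {"B2": "A1", "B3": "A1", "B20": "A3"}
tracks = associated_tracks(["B2", "B3", "B20"], track_of)
```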
It will be appreciated that all examples in this embodiment are for the understanding of the present solution and should not be construed as a specific limitation on the present solution.
In this embodiment, abnormal track points are identified either by judging whether each historical vehicle track point is located in an abnormal area, or by judging the track point information of adjacent historical vehicle track points in the same historical vehicle track. The screening track points obtained in this way describe the vehicle track normally and accurately, so the result of judging whether the screening track points fall into the target area fits the actual situation more closely, which improves the accuracy of the associated vehicle tracks and, in turn, the accuracy of the target data obtained by the subsequent statistical calculation.
In one embodiment, as shown in fig. 15, determining a target vehicle track point that falls within a target area from a plurality of screening track points includes:
in step 1502, a target spatial index of the target region is constructed, the target spatial index being used to index into a target contour space of the target region.
The target spatial index is used for indexing to the target contour space of the target area. Specifically, the target contour space in which the target area is located is generally large, so constructing the target spatial index allows the target contour space of the target area to be found more quickly when determining whether coordinate information falls into the target area. Further, the server constructs the spatial index of the target area based on the input target area using, for example but not limited to, a quadtree or an R-tree.
In step 1504, the screening track points falling into the target contour space are determined as target vehicle track points based on the target space index and the coordinate information of each screening track point.
Specifically, the server determines, based on the target spatial index and the coordinate information of each screening track point, that the screening track point falling into the target contour space is the target vehicle track point, that is, the server determines, based on the target contour space indexed to the target area by the target spatial index, whether the coordinate information of each screening track point falls into the target contour space, so as to determine, based on the determination result, that the screening track point falling into the target contour space is the target vehicle track point.
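The two-stage check (a cheap index lookup followed by an exact containment test) can be illustrated with the sketch below. Note the substitution: instead of the quadtree or R-tree mentioned in the text, it uses a single bounding box as a deliberately simplified stand-in for the spatial index, followed by a ray-casting point-in-polygon test. Planar coordinates and all function names are assumptions of the example.

```python
def bounding_box(polygon):
    """Axis-aligned bounding box of a polygon given as a list of (x, y) vertices."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def point_in_polygon(x, y, polygon):
    """Ray-casting test: count crossings of a ray from (x, y) toward +x."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # the edge straddles the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def find_target_points(points, polygon):
    """Return the names of the points that fall inside the polygon."""
    min_x, min_y, max_x, max_y = bounding_box(polygon)
    hits = []
    for name, (x, y) in points.items():
        if not (min_x <= x <= max_x and min_y <= y <= max_y):
            continue  # rejected by the coarse index, no exact test needed
        if point_in_polygon(x, y, polygon):
            hits.append(name)
    return hits


# An L-shaped (irregular) target area and four screening track points.
target_area = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
points = {"B2": (1, 1), "B3": (3, 3), "B20": (5, 5), "B7": (1, 3)}
hits = find_target_points(points, target_area)
```

Here B20 is discarded by the bounding box alone, while B3 passes the bounding box but fails the exact test because it lies in the notch of the L shape. A real quadtree or R-tree plays the same filtering role but scales to many regions and much larger contours.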
In this embodiment, by constructing the target spatial index, the target contour space of the target region can be found more quickly, so as to improve the efficiency of determining the screening track points, and further improve the efficiency of acquiring the target data.
In one embodiment, as shown in fig. 16, determining the target area where the target location is located includes:
step 1602, obtains venue interest points that are on the outline of the target venue.
Specifically, the target site contains a plurality of site interest points, and the target area is a connected closed area. To prevent site interest points inside the target site from affecting the connectivity of the closed area, the server selects the site interest points that are on the outline of the target site. Optionally, the server may also select site interest points whose distance to the outline edge of the target site is less than a preset distance threshold.
Step 1604, selecting an initial site interest point from the site interest points, and, starting from the initial site interest point, sequentially and non-repeatedly connecting the site interest points until returning to the initial site interest point to obtain the target area.
Specifically, the server selects an initial site interest point from the plurality of site interest points, takes it as the start position, and sequentially and non-repeatedly connects the site interest points along the direction of the outline of the target site until it returns to the initial site interest point, that is, the start position and the end position of the target area are the same site interest point (the initial site interest point). For ease of understanding, as shown in diagram (a) of fig. 17, there are site interest points 1701 to 1710. If the site interest point 1701 is the initial site interest point, then 1701 is connected to 1706, 1706 to 1707, 1707 to 1708, 1708 to 1709, 1709 to 1710, 1710 to 1705, 1705 to 1704, 1704 to 1703, 1703 to 1702, and finally 1702 back to 1701. The closed region 1711 obtained by these connections is the target area.
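One simple way to obtain a "sequential, non-repeating" order along an outline is to sort the interest points by their angle around the centroid. This is an assumption of the sketch, not the ordering method of the application, and it is only reliable when the outline is roughly convex or star-shaped as seen from the centroid; the function name and the planar coordinates are likewise illustrative.

```python
import math


def build_closed_region(points, start):
    """Order outline points counter-clockwise and close the ring at `start`.

    points: dict mapping interest point name -> (x, y).
    Returns the visiting order, beginning and ending with `start`.
    """
    cx = sum(x for x, _ in points.values()) / len(points)
    cy = sum(y for _, y in points.values()) / len(points)

    # Sort by polar angle around the centroid so each point is visited exactly once.
    ordered = sorted(
        points,
        key=lambda name: math.atan2(points[name][1] - cy, points[name][0] - cx),
    )

    i = ordered.index(start)
    ring = ordered[i:] + ordered[:i]
    return ring + [start]  # return to the initial interest point to close the area


corners = {"a": (0, 0), "b": (1, 0), "c": (1, 1), "d": (0, 1)}
ring = build_closed_region(corners, start="c")
```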
It will be appreciated that all examples in this embodiment are for the understanding of the present solution and should not be construed as a specific limitation on the present solution.
In this embodiment, the site interest points are connected sequentially and non-repeatedly along the direction of the outline of the target site, which ensures that the obtained target area completely covers the real area where the target site is located. This improves the authenticity of the target area, that is, it provides a real and reliable data basis for subsequent data processing, and further improves the authenticity and reliability of the acquired target data.
Based on the foregoing detailed description of the embodiments, a complete flow of the data processing method in the embodiments of the present application is described below. In one embodiment, as shown in fig. 18, a data processing method is provided. The method is described by taking its application to the server 104 in fig. 1 as an example; it will be understood that the method may also be applied to the terminal 102, or to a system including the terminal 102 and the server 104, and implemented through interaction between the terminal 102 and the server 104. In this embodiment, the method includes the following steps:
step 1801, obtaining location interest points on the outline of the target location.
Specifically, the target site contains a plurality of site interest points, and the target area is a connected closed area. To prevent site interest points inside the target site from affecting the connectivity of the closed area, the server selects the site interest points that are on the outline of the target site. Optionally, the server may also select site interest points whose distance to the outline edge of the target site is less than a preset distance threshold.
Step 1802, selecting an initial site interest point from the site interest points, and, starting from the initial site interest point, sequentially and non-repeatedly connecting the site interest points until returning to the initial site interest point to obtain the target area.
The target area is an irregular closed area formed by a plurality of area coordinate points. Specifically, the server selects an initial site interest point from the plurality of site interest points, takes it as the start position, and sequentially and non-repeatedly connects the site interest points along the direction of the outline of the target site until it returns to the initial site interest point, that is, the start position and the end position of the target area are the same site interest point (the initial site interest point).
Step 1803, a plurality of historical vehicle trajectories over a historical time period are acquired.
The historical time period is a historical time interval before the current time, and the specific historical time period is determined according to actual requirements. A historical vehicle track is a sequence of a plurality of consecutive, adjacent historical vehicle track points.
Specifically, the server obtains a plurality of historical vehicle tracks within the historical time period. A very large number of historical vehicle tracks may be available within the historical time period. Since this embodiment obtains visit data for the target site, and in order to limit the amount of data to be processed, the server may obtain only the historical vehicle tracks in the same geographic area as the target site, where the geographic area may be a street area, an urban district, a city-level area, or the like.
Based on the above, while a vehicle is running, its track recording device can upload the vehicle track generated by the vehicle to the server in real time or at a preset period, and the server stores the vehicle track in the data storage system. When visit data needs to be acquired, the server obtains a plurality of historical vehicle tracks within the historical time period from the data storage system, or selects from the data storage system a plurality of historical vehicle tracks in the same geographic area as the target site.
Step 1804, determining a track area where each historical vehicle track point is located based on the coordinate information of each historical vehicle track point.
Specifically, the server determines the track area in which each historical vehicle track point is located based on the coordinate information of each historical vehicle track point. For example: the historical vehicle track point B1 is on a sidewalk, the historical vehicle track point B2 is in a water area, the historical vehicle track point B3 is on a zebra crossing, and the historical vehicle track point B4 is in a normal road area.
Step 1805, determining the historical vehicle track points whose track areas are abnormal areas as screened-out track points, and determining the historical vehicle track points whose track areas are not abnormal areas as screening track points.
An abnormal area is an area that a vehicle cannot reach in normal travel, for example: sidewalks, water areas, non-service areas, and the like. Specifically, based on the track area where each historical vehicle track point is located, the server determines the historical vehicle track points located in an abnormal area as screened-out track points, that is, track points that are not considered in subsequent processing, and determines the historical vehicle track points whose track areas are not abnormal areas as screening track points, which are retained for subsequent processing.
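The area-based screening of steps 1804 and 1805 can be illustrated with a minimal sketch. The `TrackPoint` type, the `ABNORMAL_AREAS` labels and the `area_of` lookup are hypothetical names introduced only for illustration; how a track area is actually resolved from coordinate information is not specified here, and the zebra crossing is assumed not to be an abnormal area because it is not listed among the examples above.

```python
from typing import Callable, Iterable, List, NamedTuple, Tuple


class TrackPoint(NamedTuple):
    track_id: str
    lon: float
    lat: float


# Hypothetical area labels that a vehicle cannot reach in normal travel.
ABNORMAL_AREAS = {"sidewalk", "water", "non_service_area"}


def screen_by_area(
    points: Iterable[TrackPoint],
    area_of: Callable[[TrackPoint], str],
) -> Tuple[List[TrackPoint], List[TrackPoint]]:
    """Split points into (screening points kept, screened-out points)."""
    kept, screened_out = [], []
    for point in points:
        if area_of(point) in ABNORMAL_AREAS:
            screened_out.append(point)
        else:
            kept.append(point)
    return kept, screened_out


# Usage with the four example points B1..B4 and a hard-coded lookup table.
b1, b2, b3, b4 = (TrackPoint("t1", 113.0 + i * 0.001, 22.5) for i in range(4))
labels = {b1: "sidewalk", b2: "water", b3: "zebra_crossing", b4: "road"}
kept, screened_out = screen_by_area([b1, b2, b3, b4], labels.__getitem__)
```

In this sketch B1 and B2 are screened out, while B3 and B4 remain as screening track points.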
Step 1806, calculating a track point information difference between the track point information of adjacent historical vehicle track points in the same historical vehicle track.
The track point information is time information or coordinate information. Therefore, the track point information difference may be a time information difference between time information of adjacent historical vehicle track points, or may also be a coordinate information difference between coordinate information of adjacent historical vehicle track points. Specifically, the server performs similar calculations for each historical vehicle track: and calculating the track point information difference between track point information of adjacent historical vehicle track points in the same historical vehicle track.
Step 1807, determining adjacent historical vehicle track points whose track point information difference is not within the track point information range as screened-out track points, and determining adjacent historical vehicle track points whose track point information difference is within the track point information range as screening track points.
The track point information range may be a time information range or a coordinate information range, depending on the kind of track point information used. Specifically, the server determines adjacent historical vehicle track points whose track point information difference is not within the track point information range as screened-out track points. Taking a time information difference as an example: when the time information difference between two adjacent historical vehicle track points in the same historical vehicle track is not within the time information range, the server determines both adjacent historical vehicle track points as screened-out track points. Alternatively, the server further considers the time information difference between the earlier of the two adjacent historical vehicle track points and the historical vehicle track point that follows the pair; if this time information difference is still not within the time information range, both adjacent historical vehicle track points are determined as screened-out track points, and if it is within the time information range, only the later of the two adjacent historical vehicle track points is determined as a screened-out track point. The remaining historical vehicle track points, that is, the adjacent historical vehicle track points whose track point information difference is within the track point information range, are determined as screening track points.
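The first variant described above, in which both points of an out-of-range adjacent pair are screened out, can be sketched as follows. The function name and the `min_gap` and `max_gap` bounds are assumptions made for illustration; the actual time information range is left open in this embodiment, and the alternative variant is not shown.

```python
from typing import List, Sequence


def screen_by_time_gap(
    timestamps: Sequence[float], min_gap: float, max_gap: float
) -> List[int]:
    """Return the indices of the points kept as screening track points.

    Both points of any adjacent pair whose time difference falls outside
    [min_gap, max_gap] are screened out.
    """
    screened_out = set()
    for i in range(len(timestamps) - 1):
        gap = timestamps[i + 1] - timestamps[i]
        if not (min_gap <= gap <= max_gap):
            screened_out.update((i, i + 1))
    return [i for i in range(len(timestamps)) if i not in screened_out]


# One track sampled every 5 s with a single 490 s jump between points 2 and 3.
kept_indices = screen_by_time_gap([0, 5, 10, 500, 505], min_gap=1, max_gap=60)
```

The same loop applies to coordinate information differences if the gap is replaced by a distance between adjacent points.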
In step 1808, a target spatial index of the target region is constructed.
The target spatial index is used for indexing to the target contour space of the target area. Specifically, because the target contour space of the target area is generally large, constructing the target spatial index allows the target contour space of the target area to be found more quickly when determining whether coordinate information falls into the target area. Further, the server constructs the spatial index of the target area based on the input target area by using a structure such as, but not limited to, a quadtree or an R-tree.
Step 1809, determining, based on the target spatial index and the coordinate information of each screening track point, the screening track point falling into the target contour space as the target vehicle track point.
Specifically, the server uses the target spatial index to index to the target contour space of the target area, judges whether the coordinate information of each screening track point falls into the target contour space, and, based on the judgment result, determines the screening track points falling into the target contour space as target vehicle track points.
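As a rough illustration of how a spatial index narrows down the lookup, the sketch below uses a uniform grid of bounding boxes as a simple stand-in for the quadtree or R-tree mentioned above; it is not the indexing structure of this embodiment. The class name `GridIndex`, the `cell_size` parameter and the region identifiers are hypothetical, and an exact containment test against the contour would still follow for each candidate.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Set, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]


class GridIndex:
    """Uniform-grid index over region bounding boxes (coarse filter only)."""

    def __init__(self, cell_size: float) -> None:
        self.cell_size = cell_size
        self.cells: Dict[Tuple[int, int], Set[str]] = defaultdict(set)
        self.boxes: Dict[str, Box] = {}

    def _cell(self, x: float, y: float) -> Tuple[int, int]:
        return (math.floor(x / self.cell_size), math.floor(y / self.cell_size))

    def insert(self, region_id: str, contour: Sequence[Point]) -> None:
        xs = [x for x, _ in contour]
        ys = [y for _, y in contour]
        box = (min(xs), min(ys), max(xs), max(ys))
        self.boxes[region_id] = box
        cx0, cy0 = self._cell(box[0], box[1])
        cx1, cy1 = self._cell(box[2], box[3])
        # Register the region in every grid cell its bounding box overlaps.
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                self.cells[(cx, cy)].add(region_id)

    def candidates(self, x: float, y: float) -> List[str]:
        """Regions whose bounding box contains the point."""
        hits = []
        for region_id in self.cells.get(self._cell(x, y), ()):
            x0, y0, x1, y1 = self.boxes[region_id]
            if x0 <= x <= x1 and y0 <= y <= y1:
                hits.append(region_id)
        return sorted(hits)


index = GridIndex(cell_size=1.0)
index.insert("mall", [(0, 0), (2.5, 0), (2.5, 1.5), (0, 1.5)])
index.insert("park", [(10, 10), (11, 10), (11, 11), (10, 11)])
```

Only the points returned as candidates for the target area need the more expensive point-in-contour check.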
Step 1810, determining the historical vehicle track to which the target vehicle track point belongs as the associated vehicle track.
At least one historical vehicle track point in the associated vehicle track is located in the target area. Based on this, the track dependency relationship characterizes that a vehicle track has a portion coinciding with an area, that is, the vehicle track at least passes through the area with which it has the track dependency relationship. Specifically, the server determines the historical vehicle track to which the target vehicle track point belongs as the associated vehicle track.
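The "at least one track point in the target area" condition can be sketched as below. The rectangular target area and the `in_target_area` predicate are simplifying assumptions for illustration only; in this embodiment the target area is an irregular closed region.

```python
from typing import Callable, Dict, List, Tuple

Point = Tuple[float, float]


def associated_tracks(
    tracks: Dict[str, List[Point]],
    in_target_area: Callable[[Point], bool],
) -> List[str]:
    """Return the ids of tracks with at least one point in the target area."""
    return [
        track_id
        for track_id, points in tracks.items()
        if any(in_target_area(point) for point in points)
    ]


def in_square(point: Point) -> bool:
    # Hypothetical target area: the square [4, 6] x [4, 6].
    x, y = point
    return 4 <= x <= 6 and 4 <= y <= 6


tracks = {
    "a": [(0, 0), (5, 5)],
    "b": [(9, 9)],
    "c": [(4, 6), (20, 20)],
}
associated = associated_tracks(tracks, in_square)
```

Tracks "a" and "c" each pass through the area once and are therefore associated; track "b" never enters it.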
In step 1811, vehicle track data of the associated vehicle track is acquired.
Specifically, after the server determines the associated vehicle track, the vehicle track data of the associated vehicle track can be obtained directly from a data storage system connected with the server, or can be mined in real time. The specific manner of acquiring the vehicle track data is not limited herein.
Step 1812, constructing initial area data in the target area based on the vehicle track data, and performing index aggregation processing on the data in the target statistical dimension in the initial area data to obtain the area data to be processed.
The initial area data specifically includes: vehicle track data of an associated vehicle track having a track dependency relationship with the target area. Secondly, the region data to be processed specifically characterizes the data of the initial region data under the target statistical dimension.
Specifically, the server first constructs the initial area data in the target area based on the vehicle track data, that is, it aggregates the vehicle track data of the associated vehicle tracks having the track dependency relationship with the target area, thereby obtaining the initial area data. Then, the server selects the data in the target statistical dimension from the initial area data, and performs index aggregation processing on that data to obtain the area data to be processed.
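A minimal sketch of this two-stage construction follows. The record layout (dictionaries with a `vehicle_type` field) and both function names are assumptions made for illustration; the actual fields of the vehicle track data and the aggregation applied to each statistical dimension are not fixed by this embodiment, and a plain count per value is used here.

```python
from collections import Counter
from typing import Dict, Iterable, List


def build_initial_area_data(
    track_data_by_id: Dict[str, List[dict]], associated_ids: Iterable[str]
) -> List[dict]:
    """Gather the track data records of all associated vehicle tracks."""
    return [
        record
        for track_id in associated_ids
        for record in track_data_by_id.get(track_id, [])
    ]


def aggregate_by_dimension(records: Iterable[dict], dimension: str) -> Counter:
    """Count records per value of the target statistical dimension."""
    return Counter(r[dimension] for r in records if dimension in r)


track_data = {
    "t1": [{"vehicle_type": "car", "hour": 8}],
    "t2": [{"vehicle_type": "truck", "hour": 8}],
    "t3": [{"vehicle_type": "car", "hour": 9}],
}
initial = build_initial_area_data(track_data, ["t1", "t3"])
to_process = aggregate_by_dimension(initial, "vehicle_type")
```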
In step 1813, an array of coordinate points to be selected is constructed, and each region coordinate point in the target region is added to the array of coordinate points to be selected.
Specifically, the server first constructs an array of coordinate points to be selected, which initially does not include any coordinate point. The server then adds each region coordinate point composing the target region into the array of coordinate points to be selected, so that after the addition the array specifically includes each region coordinate point composing the target region.
In step 1814, a set of coordinate information is determined based on the coordinate information corresponding to each of the regional coordinate points.
Specifically, the server determines a coordinate information set based on the coordinate information corresponding to each region coordinate point. The coordinate information set may include: the longitude information and latitude information in the coordinate information corresponding to each region coordinate point, or the maximum longitude information and maximum latitude information among the coordinate information corresponding to the region coordinate points, or the minimum longitude information and minimum latitude information among the coordinate information corresponding to the region coordinate points. This is not limited herein.
In step 1815, a longitude set and a latitude set are constructed based on the coordinate information set.
Specifically, the longitude set includes longitude information between minimum longitude information and maximum longitude information, and the difference between each longitude information is a preset step, and the longitude set includes the minimum longitude information and the maximum longitude information. Secondly, the latitude set comprises latitude information between the minimum latitude information and the maximum latitude information, the difference value between each latitude information is a preset step length, and the latitude set comprises the minimum latitude information and the maximum latitude information.
Specifically, the server takes the size of the Mercator grid as the preset step length and divides the coordinate information by this step length: the interval from the minimum latitude information to the maximum latitude information is divided into a plurality of latitude information values to obtain the latitude set, and the interval from the minimum longitude information to the maximum longitude information is divided into a plurality of longitude information values to obtain the longitude set.
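The division by a preset step length can be sketched as below. The function name `stepped_values` and the tolerance `1e-9` are illustrative assumptions; in particular, appending the maximum when the interval is not an exact multiple of the step length is one possible way to keep both end values in the set, and is not stated in this embodiment.

```python
import math
from typing import List


def stepped_values(lo: float, hi: float, step: float) -> List[float]:
    """Values from lo to hi spaced by step, always including lo and hi."""
    count = math.floor((hi - lo) / step + 1e-9)
    # Multiply instead of accumulating to limit floating-point drift.
    values = [lo + i * step for i in range(count + 1)]
    if hi - values[-1] > 1e-9:
        values.append(hi)  # assumption: keep the maximum as the last value
    return values


latitude_set = stepped_values(0.0, 1.0, 0.25)
longitude_set = stepped_values(0.0, 1.0, 0.4)
```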
In step 1816, a Cartesian product of the longitude set and the latitude set is taken to obtain a plurality of longitude and latitude coordinate points, and each longitude and latitude coordinate point is added to the array of coordinate points to be selected.
Specifically, the server takes the Cartesian product of the longitude set and the latitude set to obtain a plurality of longitude and latitude coordinate points, and adds each longitude and latitude coordinate point to the array of coordinate points to be selected. That is, the server pairs each longitude information value in the longitude set with each latitude information value in the latitude set, so that each resulting longitude and latitude coordinate point consists of one longitude information value from the longitude set and one latitude information value from the latitude set.
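The pairing performed by the Cartesian product can be shown with the standard library; the function name and the sample coordinates are illustrative only.

```python
from itertools import product
from typing import List, Sequence, Tuple


def lon_lat_points(
    longitudes: Sequence[float], latitudes: Sequence[float]
) -> List[Tuple[float, float]]:
    """Pair every longitude with every latitude (Cartesian product)."""
    return list(product(longitudes, latitudes))


candidate_points = lon_lat_points([113.0, 113.1], [22.5, 22.6])
```

With two longitudes and two latitudes, four longitude and latitude coordinate points are produced.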
Step 1817, performing a topological relation test on each coordinate point to be selected in the array of coordinate points to be selected so as to retain the candidate coordinate points located in the target area, and constructing the regional coordinate point array based on the candidate coordinate points located in the target area.
Each coordinate point to be selected in the array of coordinate points to be selected specifically includes: the regional coordinate points and the longitude and latitude coordinate points, or the regional coordinate points after geographic coordinate conversion and the longitude and latitude coordinate points after geographic coordinate conversion. In addition, methods for the topological relation test include, but are not limited to, the DE-9IM matrix.
Specifically, the server determines the regional coordinate point and the longitude and latitude coordinate point added to the coordinate point array to be selected as the coordinate point to be selected belonging to the coordinate point array to be selected, or determines the regional coordinate point after the geographic coordinate conversion and the longitude and latitude coordinate point after the geographic coordinate conversion added to the coordinate point array to be selected as the coordinate point to be selected belonging to the coordinate point array to be selected.
Based on the above, the server traverses the array of coordinate points to be selected and performs the topological relation test on each coordinate point to be selected to judge whether it is located in the target area. According to the judgment result, the server distinguishes the candidate coordinate points located in the target area from those not located in the target area, retains the former, and constructs the regional coordinate point array based on the candidate coordinate points located in the target area.
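As a simplified stand-in for a DE-9IM based test, the following sketch uses the classic ray-casting (even-odd) rule to decide whether a point lies inside a closed region. It is not the DE-9IM matrix itself; points lying exactly on the boundary, such as the region coordinate points themselves, are not handled consistently by this rule and would need an explicit boundary check or a geometry library. All names are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def point_in_region(point: Point, contour: Sequence[Point]) -> bool:
    """Even-odd ray casting: count contour edges crossed by a ray to the right."""
    x, y = point
    inside = False
    n = len(contour)
    for i in range(n):
        x1, y1 = contour[i]
        x2, y2 = contour[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def keep_inside(points: Sequence[Point], contour: Sequence[Point]) -> List[Point]:
    """Build the regional coordinate point array from the candidate points."""
    return [p for p in points if point_in_region(p, contour)]


# An L-shaped (irregular) closed region and four candidate points.
l_shape = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
region_points = keep_inside([(1, 1), (3, 3), (1, 3), (9, 9)], l_shape)
```

The point (3, 3) lies inside the bounding box of the L-shaped region but outside the region itself, so it is discarded.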
Step 1818, determining a grid array corresponding to the Mercator grids to be constructed, and adding the Mercator grids having each candidate coordinate point in the regional coordinate point array as a vertex into the grid array.
Specifically, the server determines a grid array corresponding to the Mercator grids to be constructed, and then adds the Mercator grids having each candidate coordinate point in the regional coordinate point array as a vertex into the grid array. That is, the server constructs a grid array that holds the unique identifiers of the Mercator grids to be constructed, and then traverses the candidate coordinate points in the regional coordinate point array to determine the Mercator grids that have each candidate coordinate point as a vertex. A candidate coordinate point is usually a vertex of four Mercator grids, so these four Mercator grids are added to the grid array. Because each Mercator grid can be uniquely identified by its grid identifier, the server adds the grid identifiers corresponding to the four Mercator grids to the grid array.
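The "four grids per vertex" relationship can be sketched with integer grid indices. Identifying a grid by the integer index of its southwest corner, and using a set so that each grid is stored once, are assumptions made for illustration; the embodiment only requires that grid identifiers be unique.

```python
from typing import Iterable, List, Set, Tuple

Cell = Tuple[int, int]  # integer index of a grid's southwest corner


def cells_sharing_vertex(i: int, j: int) -> List[Cell]:
    """The four grids that have vertex (i, j) as one of their corners."""
    return [(i - 1, j - 1), (i, j - 1), (i - 1, j), (i, j)]


def build_grid_array(vertices: Iterable[Cell]) -> Set[Cell]:
    """Collect the unique grids touched by all candidate vertices."""
    grid_array: Set[Cell] = set()
    for i, j in vertices:
        grid_array.update(cells_sharing_vertex(i, j))
    return grid_array


# Two neighbouring vertices share two grids, so six unique grids result.
grid_array = build_grid_array([(0, 0), (1, 0)])
```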
In step 1819, Mercator projection is performed on the Mercator grids included in the grid array to construct the target Mercator grid.
Specifically, the server performs Mercator projection on the Mercator grids included in the grid array to construct the target Mercator grid. In particular, the server applies the Mercator projection to the longitude and latitude coordinates of the southwest corner of each Mercator grid to obtain the abscissa and the ordinate of that grid, and then joins the abscissa and the ordinate of each Mercator grid with a connector to construct the final target Mercator grid.
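One way to picture this step is the spherical Mercator projection commonly used for web maps. The choice of the spherical formula, the earth radius constant, the rounding to whole metres and the underscore connector are all assumptions for illustration; the embodiment does not state which Mercator variant or connector is used.

```python
import math
from typing import Tuple

EARTH_RADIUS_M = 6378137.0  # assumed sphere radius (WGS 84 semi-major axis)


def mercator_xy(lon: float, lat: float) -> Tuple[float, float]:
    """Project longitude/latitude in degrees to spherical Mercator metres."""
    x = EARTH_RADIUS_M * math.radians(lon)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y


def grid_id(sw_lon: float, sw_lat: float, connector: str = "_") -> str:
    """Join the projected abscissa and ordinate of a grid's southwest corner."""
    x, y = mercator_xy(sw_lon, sw_lat)
    return f"{round(x)}{connector}{round(y)}"


origin_id = grid_id(0.0, 0.0)
east_edge_x, _ = mercator_xy(180.0, 0.0)
```

Because the identifier is derived from the projected southwest corner, two grids with different southwest corners receive different identifiers as long as the rounding is finer than the grid size.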
Step 1820, obtaining target sample expansion data within the target Mercator grid.
The target sample expansion data is specifically a statistical value of the standard sample expansion coefficient of the target Mercator grid, where the statistical value may be a mean, a median, a maximum, a minimum, or the like. In this embodiment, the target sample expansion data is specifically: population flow data obtained from public channels according to the Mercator grid division, and vehicle data within the same time window as the population flow data. Specifically, the server first converts the target area into the corresponding target Mercator grid, and then obtains, from public channels and according to the target Mercator grid division, the population flow data and the vehicle data within the same time window as the population flow data, so as to construct the target sample expansion data within the target Mercator grid.
Step 1821, performing data sample expansion processing on the region data to be processed based on the target sample expansion data related to the target region, so as to obtain target data.
The target data is used for describing: statistics of reaching the target place within the historical time period. Specifically, the server performs data sample expansion processing on the area data to be processed by using the target sample expansion data to obtain the target data. Various models and sample expansion techniques can be used in this data sample expansion processing, including but not limited to: coefficient-based sample expansion, sample expansion according to grid features using a machine learning model, and sample expansion that fuses grid features with surrounding multi-modal information using a deep learning model. These are not described in detail herein.
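Coefficient-based sample expansion, the simplest of the techniques listed, might look like the sketch below. Defining the coefficient of a grid as population flow divided by vehicle count in the same time window, and taking the mean over the grids of the target area, are assumptions made for illustration; the embodiment only states that a statistical value of the sample expansion coefficient is used.

```python
from statistics import mean
from typing import Sequence, Tuple


def expansion_coefficient(population_flow: float, vehicle_count: float) -> float:
    """Assumed coefficient: people represented by one observed vehicle."""
    return population_flow / vehicle_count


def expand(observed: float, grid_samples: Sequence[Tuple[float, float]]) -> float:
    """Scale an observed statistic by the mean coefficient of the grids.

    grid_samples holds (population_flow, vehicle_count) per Mercator grid,
    both measured in the same time window; grids without vehicles are skipped.
    """
    coefficient = mean(
        expansion_coefficient(flow, vehicles)
        for flow, vehicles in grid_samples
        if vehicles > 0
    )
    return observed * coefficient


# Two grids with coefficients 10 and 12; 20 observed arrivals are scaled by 11.
expanded_arrivals = expand(20, [(1000, 100), (600, 50)])
```

Replacing `mean` with a median, maximum or minimum gives the other statistical values mentioned for the coefficient.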
It should be understood that the specific implementation of steps 1801 to 1821 is similar to the previous embodiments, and will not be repeated here.
It should be understood that, although the steps in the flowcharts related to the embodiments described above are shown sequentially as indicated by the arrows, these steps are not necessarily performed in the order indicated by the arrows. Unless explicitly stated herein, the execution of these steps is not strictly limited in order, and the steps may be executed in other orders. Moreover, at least some of the steps in the flowcharts described in the above embodiments may include a plurality of sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
Based on the same inventive concept, the embodiment of the application also provides a data processing device for realizing the above related data processing method. The implementation of the solution provided by the device is similar to the implementation described in the above method, so the specific limitation of one or more embodiments of the data processing device provided below may refer to the limitation of the data processing method hereinabove, and will not be repeated herein.
In one embodiment, as shown in fig. 19, there is provided a data processing apparatus including: a region determination module 1902, a trajectory acquisition module 1904, a region and trajectory association module 1906, and a data statistics module 1908, wherein:
the region determination module 1902 is configured to determine a target area where a target place is located;
a track acquisition module 1904, configured to acquire a plurality of historical vehicle tracks in a historical time period, where the historical vehicle tracks are a sequence of a plurality of continuous and adjacent historical vehicle track points;
an area and track association module 1906 for determining an associated vehicle track having track dependency relationship with the target area from a plurality of historical vehicle tracks, at least one historical vehicle track point in the associated vehicle track being in the target area;
the data statistics module 1908 is configured to perform statistical calculation on the data in the target statistical dimension in the vehicle track data of the associated vehicle track, so as to obtain target data, where the target data is used for describing: statistics of reaching the target place within the historical time period.
In one embodiment, the data statistics module 1908 is specifically configured to obtain vehicle track data, where the vehicle track data includes at least: representation data of the vehicle of the associated vehicle track; construct initial area data in the target area based on the vehicle track data, and perform index aggregation processing on the data in the target statistical dimension in the initial area data to obtain the area data to be processed; and perform data sample expansion processing on the area data to be processed based on the target sample expansion data related to the target area, so as to obtain the target data.
In one embodiment, the data statistics module 1908 is specifically configured to convert the target area into a corresponding target Mercator grid, and obtain target sample expansion data within the target Mercator grid, where the target sample expansion data includes at least: population flow data, and vehicle data within the same time window as the population flow data; and perform data sample expansion processing on the area data to be processed by using the target sample expansion data, so as to obtain the target data.
In one embodiment, the target region is an irregularly closed region composed of a plurality of region coordinate points;
the data statistics module 1908 is specifically configured to construct an array of coordinate points to be selected, and add each region coordinate point to the array of coordinate points to be selected; determine a coordinate information set based on the coordinate information corresponding to each region coordinate point; and perform space division on the target area based on the array of coordinate points to be selected and the coordinate information set, and determine the target Mercator grid corresponding to the target area after the space division.
In one embodiment, the data statistics module 1908 is specifically configured to construct a longitude set and a latitude set based on the coordinate information set, where the longitude set includes longitude information between the minimum longitude information and the maximum longitude information, the difference between adjacent longitude information values being a preset step length, and the latitude set includes latitude information between the minimum latitude information and the maximum latitude information, the difference between adjacent latitude information values being the preset step length; take the Cartesian product of the longitude set and the latitude set to obtain a plurality of longitude and latitude coordinate points, and add each longitude and latitude coordinate point into the array of coordinate points to be selected; perform a topological relation test on each coordinate point to be selected in the array of coordinate points to be selected so as to retain the candidate coordinate points located in the target area, and construct a regional coordinate point array based on the candidate coordinate points located in the target area; determine a grid array corresponding to the Mercator grids to be constructed, and add the Mercator grids having each candidate coordinate point in the regional coordinate point array as a vertex into the grid array; and perform Mercator projection on the Mercator grids included in the grid array to construct the target Mercator grid.
In one embodiment, as shown in fig. 20, the data processing apparatus further includes a statistics aggregation processing module 2002;
the statistics aggregation processing module 2002 is configured to perform statistics aggregation processing on each statistics dimension, each time dimension, and each space dimension, so as to obtain respective aggregation index formulas of each statistics dimension; wherein the statistical aggregation includes any one of: sum, sum after de-duplication;
the data statistics module 1908 is specifically configured to determine a target aggregation index formula of the target statistical dimension based on the respective aggregation index formulas of the statistical dimensions; and perform statistical calculation on the data in the target statistical dimension in the vehicle track data of the associated vehicle track based on the target aggregation index formula, so as to obtain the target data.
In one embodiment, the region and track association module 1906 is specifically configured to perform track point screening on each historical vehicle track point to obtain a plurality of screening track points; determine a target vehicle track point falling into the target area from the plurality of screening track points; and determine the historical vehicle track to which the target vehicle track point belongs as the associated vehicle track.
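The two statistical aggregations named above, a sum and a sum after de-duplication, can be sketched as follows together with a lookup of the formula for a target statistical dimension. The record fields, the dimension names in `AGGREGATION_FORMULAS` and the choice to de-duplicate by vehicle identifier are hypothetical and serve only to illustrate the idea.

```python
from typing import Dict, Iterable, List


def sum_values(records: Iterable[dict], value_key: str) -> float:
    """Plain sum of a numeric field."""
    return sum(r[value_key] for r in records)


def sum_after_dedup(records: Iterable[dict], value_key: str, dedup_key: str) -> float:
    """Sum a numeric field, counting each dedup_key value only once."""
    seen = set()
    total = 0
    for r in records:
        if r[dedup_key] in seen:
            continue
        seen.add(r[dedup_key])
        total += r[value_key]
    return total


# Hypothetical mapping from statistical dimension to aggregation formula.
AGGREGATION_FORMULAS: Dict[str, str] = {
    "arrival_count": "sum",
    "arriving_vehicle_count": "sum_after_dedup",
}


def aggregate(records: List[dict], dimension: str) -> float:
    formula = AGGREGATION_FORMULAS[dimension]
    if formula == "sum":
        return sum_values(records, "visits")
    return sum_after_dedup(records, "visits", "vehicle_id")


records = [
    {"vehicle_id": "a", "visits": 1},
    {"vehicle_id": "a", "visits": 1},
    {"vehicle_id": "b", "visits": 1},
]
```

Here the same three records yield three arrivals but only two distinct arriving vehicles.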
In one embodiment, the region and track association module 1906 is specifically configured to construct a target spatial index of the target region, where the target spatial index is used to index into a target contour space of the target region; and determining the screening track points falling into the target contour space as target vehicle track points based on the target space index and the coordinate information of each screening track point.
In one embodiment, the region and track association module 1906 is specifically configured to determine, based on the coordinate information of each historical vehicle track point, the track area in which each historical vehicle track point is located; and determine the historical vehicle track points whose track areas are abnormal areas as screened-out track points, and determine the historical vehicle track points whose track areas are not abnormal areas as screening track points.
In one embodiment, the region and track association module 1906 is specifically configured to calculate a track point information difference between the track point information of adjacent historical vehicle track points in the same historical vehicle track, where the track point information is time information or coordinate information; and determine adjacent historical vehicle track points whose track point information difference is not within the track point information range as screened-out track points, and determine adjacent historical vehicle track points whose track point information difference is within the track point information range as screening track points.
In one embodiment, the region determination module 1902 is specifically configured to obtain the place interest points located on the outline of the target place; and select an initial place interest point from the place interest points, and, starting from the initial place interest point, connect the place interest points one after another in sequence until returning to the initial place interest point, so as to obtain the target area.
The modules in the data processing apparatus described above may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in, or independent of, a processor in the computer device in the form of hardware, or may be stored in a memory in the computer device in the form of software, so that the processor can call and execute the operations corresponding to the above modules.
In one embodiment, a computer device is provided, which may be a server or a terminal, and in this embodiment, the computer device is taken as a server for illustration, and the internal structure thereof may be as shown in fig. 21. The computer device includes a processor, a memory, an Input/Output interface (I/O) and a communication interface. The processor, the memory and the input/output interface are connected through a system bus, and the communication interface is connected to the system bus through the input/output interface. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The database of the computer device is used for storing data related to the embodiment of the application, such as vehicle tracks, coordinate points and the like. The input/output interface of the computer device is used to exchange information between the processor and the external device. The communication interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a data processing method.
It will be appreciated by persons skilled in the art that the architecture shown in fig. 21 is merely a block diagram of some of the architecture relevant to the present inventive arrangements and is not limiting as to the computer device to which the present inventive arrangements are applicable, and that a particular computer device may include more or fewer components than shown, or may combine some of the components, or have a different arrangement of components.
In an embodiment, there is also provided a computer device comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the steps of the method embodiments described above when the computer program is executed.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored which, when executed by a processor, carries out the steps of the method embodiments described above.
In an embodiment, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the steps of the method embodiments described above.
It should be noted that, the object information (including, but not limited to, object device information, object personal information, etc.) and the data (including, but not limited to, data for analysis, stored data, presented data, etc.) related to the present application are both information and data authorized by the object or sufficiently authorized by each party, and the collection, use and processing of the related data need to comply with the related laws and regulations and standards of the related countries and regions.
Those skilled in the art will appreciate that implementing all or part of the above described methods may be accomplished by way of a computer program stored on a non-transitory computer readable storage medium, which when executed, may comprise the steps of the embodiments of the methods described above. Any reference to memory, database, or other medium used in embodiments provided herein may include at least one of non-volatile and volatile memory. The nonvolatile Memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash Memory, optical Memory, high density embedded nonvolatile Memory, resistive random access Memory (ReRAM), magnetic random access Memory (Magnetoresistive Random Access Memory, MRAM), ferroelectric Memory (Ferroelectric Random Access Memory, FRAM), phase change Memory (Phase Change Memory, PCM), graphene Memory, and the like. Volatile memory can include random access memory (Random Access Memory, RAM) or external cache memory, and the like. By way of illustration, and not limitation, RAM can be in the form of a variety of forms, such as static random access memory (Static Random Access Memory, SRAM) or dynamic random access memory (Dynamic Random Access Memory, DRAM), and the like. The databases referred to in the embodiments provided herein may include at least one of a relational database and a non-relational database. The non-relational database may include, but is not limited to, a blockchain-based distributed database, and the like. The processor referred to in the embodiments provided in the present application may be a general-purpose processor, a central processing unit, a graphics processor, a digital signal processor, a programmable logic unit, a data processing logic unit based on quantum computing, or the like, but is not limited thereto.
The technical features of the above embodiments may be combined arbitrarily. For brevity of description, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of technical features contains no contradiction, it should be considered within the scope of this description.
The foregoing embodiments represent only a few implementations of the application. They are described specifically and in detail, but this should not be construed as limiting the scope of the application. It should be noted that those skilled in the art can make several variations and modifications without departing from the spirit of the application, all of which fall within the scope of the application. Accordingly, the scope of the application shall be determined by the appended claims.

Claims (15)

1. A method of data processing, comprising:
determining a target area where a target place is located;
acquiring a plurality of historical vehicle tracks in a historical time period, wherein the historical vehicle tracks are sequences formed by a plurality of continuous and adjacent historical vehicle track points;
determining an associated vehicle track with track dependency relationship with the target area from the plurality of historical vehicle tracks, wherein at least one historical vehicle track point exists in the associated vehicle track and is positioned in the target area;
and carrying out statistical calculation on the data in the target statistical dimension in the vehicle track data of the associated vehicle track to obtain target data, wherein the target data is used for describing: statistics of the target place being reached within the historical time period.
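By way of illustration only, and not as part of the claims, the steps of claim 1 can be sketched as follows. The sketch assumes that the target area is a simple polygon of (longitude, latitude) vertices, that a ray-casting test decides whether a track point lies in the area, and that the target statistical dimension is the number of arriving tracks. The names `point_in_polygon`, `associated_tracks`, and the sample data are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (longitude, latitude)


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: True if the point lies inside the closed polygon."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def associated_tracks(
    tracks: Dict[str, List[Point]], area: List[Point]
) -> Dict[str, List[Point]]:
    """Keep the tracks that have at least one track point inside the area."""
    return {
        track_id: points
        for track_id, points in tracks.items()
        if any(point_in_polygon(p, area) for p in points)
    }


# Hypothetical sample data: a square target area and three historical tracks.
target_area = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
historical_tracks = {
    "t1": [(-1.0, 2.0), (1.0, 2.0), (5.0, 2.0)],  # passes through the area
    "t2": [(5.0, 5.0), (6.0, 6.0)],               # never enters the area
    "t3": [(2.0, 2.0)],                           # single point inside the area
}

selected = associated_tracks(historical_tracks, target_area)
# Assumed statistical dimension: number of tracks that reached the target place.
target_data = {"arrival_count": len(selected)}
```

Points lying exactly on the boundary are not treated specially in this sketch; a production system would need an explicit boundary rule.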
2. The method according to claim 1, wherein the performing a statistical calculation on the data in the target statistical dimension in the vehicle track data of the associated vehicle track to obtain the target data includes:
acquiring the vehicle track data, wherein the vehicle track data at least comprises: the representation data of the vehicle associated with the vehicle track;
constructing initial area data in the target area based on the vehicle track data, and performing index aggregation processing on the data in the initial area data under the target statistical dimension to obtain area data to be processed;
and carrying out data sample expansion processing on the region data to be processed based on target sample expansion data related to the target region so as to obtain the target data.
3. The method according to claim 2, wherein the performing data sample expansion processing on the region data to be processed based on the target sample expansion data related to the target region to obtain the target data includes:
converting the target area into a corresponding target Mercator grid, and acquiring target sample expansion data in the target Mercator grid, wherein the target sample expansion data at least comprises: population flow data, and vehicle data within the same time window as the population flow data;
and carrying out data sample expansion processing on the region data to be processed through the target sample expansion data so as to obtain the target data.
4. The method of claim 3, wherein the target area is an irregular closed area formed by a plurality of regional coordinate points;
the converting the target area into a corresponding target Mercator grid includes:
constructing a coordinate point array to be selected, and adding each regional coordinate point in the target region into the coordinate point array to be selected;
determining a coordinate information set based on the coordinate information corresponding to each regional coordinate point;
and carrying out space division on the target area based on the coordinate point array to be selected and the coordinate information set, and determining the target Mercator grid corresponding to the target area after space division.
5. The method according to claim 4, wherein the coordinate information set includes minimum longitude information, maximum longitude information, minimum latitude information, and maximum latitude information among the coordinate information corresponding to each of the regional coordinate points, respectively;
the carrying out space division on the target area based on the coordinate point array to be selected and the coordinate information set, and determining the target Mercator grid corresponding to the target area after space division, includes:
constructing a longitude set and a latitude set based on the coordinate information set, wherein the longitude set comprises longitude information between the minimum longitude information and the maximum longitude information, the difference value between each longitude information is a preset step length, the latitude set comprises latitude information between the minimum latitude information and the maximum latitude information, and the difference value between each latitude information is the preset step length;
computing the Cartesian product of the longitude set and the latitude set to obtain a plurality of longitude and latitude coordinate points, and adding each longitude and latitude coordinate point into the coordinate point array to be selected;
performing topological analysis on all coordinate points to be selected in the coordinate point array to be selected so as to retain the candidate coordinate points located in the target area, and building an area coordinate point array based on the candidate coordinate points located in the target area;
determining a grid array corresponding to the Mercator grid to be constructed, and adding the Mercator grid taking each candidate coordinate point in the area coordinate point array as a vertex into the grid array;
and carrying out Mercator projection on the Mercator grids included in the grid array so as to construct the target Mercator grid.
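By way of illustration only, and not as part of the claims, the grid construction of claims 4 and 5 can be sketched as follows. Several details are assumptions not stated in the claims: the containment check is done by ray casting, each retained candidate point is taken as the lower-left vertex of its cell, duplicate candidates are dropped, and the projection is the spherical Mercator formula with an assumed radius `EARTH_RADIUS_M`. All function names are hypothetical.

```python
import itertools
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (longitude, latitude) in degrees
EARTH_RADIUS_M = 6378137.0   # assumed sphere radius for the projection


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting containment test used as the 'topological' check."""
    x, y = point
    inside = False
    for i in range(len(polygon)):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % len(polygon)]
        if (y1 > y) != (y2 > y):
            if x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
                inside = not inside
    return inside


def axis_values(lo: float, hi: float, step: float) -> List[float]:
    """Values from lo up to hi spaced by the preset step length."""
    count = int(math.floor((hi - lo) / step + 1e-9))
    return [lo + i * step for i in range(count + 1)]


def mercator(lon_deg: float, lat_deg: float) -> Point:
    """Spherical Mercator projection of one longitude/latitude pair."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return (x, y)


def build_grid(area: List[Point], step: float) -> List[List[Point]]:
    # Candidate array starts with the area's own coordinate points.
    candidates = list(area)
    lons = [p[0] for p in area]
    lats = [p[1] for p in area]
    # Coordinate information set: min/max longitude and latitude.
    lon_set = axis_values(min(lons), max(lons), step)
    lat_set = axis_values(min(lats), max(lats), step)
    # Cartesian product of the longitude set and the latitude set.
    candidates.extend(itertools.product(lon_set, lat_set))
    candidates = list(dict.fromkeys(candidates))  # drop duplicates (assumption)
    # Retain only the candidates located in the target area.
    region_points = [p for p in candidates if point_in_polygon(p, area)]
    grid = []
    for lon, lat in region_points:
        corners = [(lon, lat), (lon + step, lat),
                   (lon + step, lat + step), (lon, lat + step)]
        grid.append([mercator(*c) for c in corners])  # project each cell
    return grid


area = [(0.0, 0.0), (3.0, 0.0), (3.0, 3.0), (0.0, 3.0)]
grid = build_grid(area, step=1.0)
```

Each entry of `grid` holds the four projected corners of one cell. How boundary points are classified depends on the containment test chosen, so cell counts near the edge of the area may differ between implementations.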
6. The method according to claim 1, wherein the method further comprises:
performing statistical aggregation processing on each statistical dimension, each time dimension, and each space dimension to obtain an aggregation index formula for each statistical dimension; wherein the statistical aggregation comprises any one of the following: summation, or summation after de-duplication;
the performing statistical calculation on the data in the target statistical dimension in the vehicle track data of the associated vehicle track to obtain target data comprises:
determining a target aggregation index formula of the target statistical dimension based on the aggregation index formula of each statistical dimension;
and carrying out statistical calculation on the data in the target statistical dimension in the vehicle track data of the associated vehicle track based on the target aggregation index formula so as to obtain the target data.
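By way of illustration only, and not as part of the claims, the two aggregation types named in claim 6 can be sketched as a lookup from statistical dimension to formula. The dimension names, the record fields `vehicle_id` and `arrivals`, and the reading of "summation after de-duplication" as "keep one record per key, then sum" are assumptions.

```python
from typing import Callable, Dict, Hashable, List

Record = Dict[str, object]


def sum_values(records: List[Record], value_of: Callable[[Record], float]) -> float:
    """Plain summation over all records."""
    return sum(value_of(r) for r in records)


def sum_after_dedup(
    records: List[Record],
    key_of: Callable[[Record], Hashable],
    value_of: Callable[[Record], float],
) -> float:
    """Keep the first record seen for each key, then sum the kept values."""
    kept: Dict[Hashable, float] = {}
    for r in records:
        kept.setdefault(key_of(r), value_of(r))
    return sum(kept.values())


# Hypothetical aggregation index formulas, one per statistical dimension.
AGGREGATION_FORMULAS: Dict[str, Callable[[List[Record]], float]] = {
    "arrival_times": lambda rs: sum_values(rs, lambda r: r["arrivals"]),
    "arriving_vehicles": lambda rs: sum_after_dedup(
        rs, lambda r: r["vehicle_id"], lambda r: 1
    ),
}

records = [
    {"vehicle_id": "A", "arrivals": 2},
    {"vehicle_id": "B", "arrivals": 1},
    {"vehicle_id": "A", "arrivals": 3},
]

# Select the target formula by target statistical dimension and apply it.
arrival_times = AGGREGATION_FORMULAS["arrival_times"](records)
arriving_vehicles = AGGREGATION_FORMULAS["arriving_vehicles"](records)
```

Keeping the formulas in a dictionary keyed by dimension mirrors the claim's step of determining the target formula from the per-dimension formulas.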
7. The method of claim 1, wherein the determining an associated vehicle track having a track dependency relationship with the target area from the plurality of historical vehicle tracks comprises:
performing track point screening on each historical vehicle track point to obtain a plurality of screening track points;
determining target vehicle track points falling into the target area from the plurality of screening track points;
and determining the historical vehicle track to which the target vehicle track point belongs as the associated vehicle track.
8. The method of claim 7, wherein the determining a target vehicle trajectory point from the plurality of screening trajectory points that falls within the target region comprises:
constructing a target spatial index of the target region, wherein the target spatial index is used for indexing to a target contour space of the target region;
and determining the screening track points falling into the target contour space as the target vehicle track points based on the target space index and the coordinate information of each screening track point.
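By way of illustration only, and not as part of the claims, the lookup of claim 8 can be sketched with a bounding box standing in for the target spatial index. This is a simplification: the claims do not say which index structure is used, and a real system might use an R-tree or a grid index. `AreaIndex`, `build_index`, and `points_in_area` are hypothetical names.

```python
from typing import List, NamedTuple, Tuple

Point = Tuple[float, float]  # (longitude, latitude)


class AreaIndex(NamedTuple):
    """Hypothetical minimal index: the area contour plus its bounding box."""
    polygon: List[Point]
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float


def build_index(polygon: List[Point]) -> AreaIndex:
    lons = [p[0] for p in polygon]
    lats = [p[1] for p in polygon]
    return AreaIndex(polygon, min(lons), min(lats), max(lons), max(lats))


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test against the exact contour."""
    x, y = point
    inside = False
    for i in range(len(polygon)):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % len(polygon)]
        if (y1 > y) != (y2 > y):
            if x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
                inside = not inside
    return inside


def points_in_area(index: AreaIndex, points: List[Point]) -> List[Point]:
    """Cheap bounding-box rejection first, exact contour test second."""
    hits = []
    for lon, lat in points:
        in_box = (index.min_lon <= lon <= index.max_lon
                  and index.min_lat <= lat <= index.max_lat)
        if in_box and point_in_polygon((lon, lat), index.polygon):
            hits.append((lon, lat))
    return hits


triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
index = build_index(triangle)
screening_points = [(1.0, 1.0), (3.0, 3.0), (5.0, 1.0), (-1.0, -1.0)]
target_points = points_in_area(index, screening_points)
```

The point (3.0, 3.0) passes the bounding-box check but fails the contour test, which is why the exact test is still needed after the index lookup.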
9. The method of claim 7, wherein the performing track point screening on each of the historical vehicle track points to obtain a plurality of screening track points comprises:
determining a track area where each historical vehicle track point is located based on the coordinate information of each historical vehicle track point;
and determining the historical vehicle track points whose track areas are in an abnormal area as track points to be screened out, and determining the historical vehicle track points whose track areas are not in an abnormal area as the screening track points.
10. The method of claim 7, wherein the performing track point screening on each of the historical vehicle track points to obtain a plurality of screening track points comprises:
calculating the track point information difference between the track point information of adjacent historical vehicle track points in the same historical vehicle track; wherein the track point information is time information or coordinate information;
and determining adjacent historical vehicle track points whose track point information difference is not within the track point information range as track points to be screened out, and determining adjacent historical vehicle track points whose track point information difference is within the track point information range as the screening track points.
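By way of illustration only, and not as part of the claims, the screening of claim 10 can be sketched using time information as the track point information. The claim does not say which point of an out-of-range pair is discarded; this sketch assumes the later point is screened out and that the first point of a track is always retained. The type `TrackPoint` and the range bounds are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class TrackPoint(NamedTuple):
    timestamp: float  # seconds
    lon: float
    lat: float


def screen_by_time_gap(
    track: List[TrackPoint], min_gap_s: float, max_gap_s: float
) -> Tuple[List[TrackPoint], List[TrackPoint]]:
    """Split a track into retained points and screened-out points.

    Each point is compared with its immediate predecessor in the track;
    it is retained only if the time difference lies within the range.
    """
    kept: List[TrackPoint] = []
    removed: List[TrackPoint] = []
    if track:
        kept.append(track[0])  # assumption: the first point is always kept
    for prev, curr in zip(track, track[1:]):
        gap = curr.timestamp - prev.timestamp
        if min_gap_s <= gap <= max_gap_s:
            kept.append(curr)
        else:
            removed.append(curr)
    return kept, removed


track = [
    TrackPoint(0.0, 113.90, 22.50),
    TrackPoint(10.0, 113.91, 22.50),
    TrackPoint(10.0, 113.91, 22.50),   # duplicate timestamp, gap of 0 s
    TrackPoint(500.0, 113.99, 22.58),  # gap of 490 s, outside the range
    TrackPoint(520.0, 113.99, 22.59),
]
kept, removed = screen_by_time_gap(track, min_gap_s=1.0, max_gap_s=120.0)
```

The same structure applies when coordinate information is used instead: replace the time difference with a distance between adjacent points and the range with a distance range.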
11. The method of claim 1, wherein determining the target area in which the target location is located comprises:
acquiring place interest points on the outline of the target place;
and selecting an initial place interest point from the place interest points, and, starting from the initial place interest point, connecting the place interest points in sequence until returning to the initial place interest point, so as to obtain the target area.
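By way of illustration only, and not as part of the claims, the contour closing of claim 11 can be sketched as follows. The claim does not state how the connection order is obtained; this sketch assumes the interest points can be ordered by polar angle around their centroid, which only works when the contour is star-shaped with respect to that centroid. `close_contour` is a hypothetical name.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (longitude, latitude)


def close_contour(pois: List[Point], start: Point) -> List[Point]:
    """Connect the interest points in sequence from `start` back to `start`."""
    cx = sum(p[0] for p in pois) / len(pois)
    cy = sum(p[1] for p in pois) / len(pois)
    # Assumed ordering rule: counter-clockwise by angle around the centroid.
    ordered = sorted(pois, key=lambda p: math.atan2(p[1] - cy, p[0] - cx))
    k = ordered.index(start)
    ring = ordered[k:] + ordered[:k]
    ring.append(start)  # return to the initial point to close the area
    return ring


pois = [(2.0, 2.0), (0.0, 0.0), (0.0, 2.0), (2.0, 0.0)]
ring = close_contour(pois, start=(2.0, 2.0))
```

The returned ring starts and ends with the initial place interest point, so it describes a closed area that can be passed to a containment test.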
12. A data processing apparatus, the apparatus comprising:
the area determining module is used for determining a target area where a target place is located, wherein the target area is a connected closed area;
the track acquisition module is used for acquiring a plurality of historical vehicle tracks in a historical time period, wherein the historical vehicle tracks are sequences formed by a plurality of continuous and adjacent historical vehicle track points;
the area and track association module is used for determining an associated vehicle track with track dependency relationship with the target area from the plurality of historical vehicle tracks, wherein at least one historical vehicle track point exists in the associated vehicle track and is located in the target area;
the data statistics module is used for carrying out statistical calculation on data in a target statistical dimension in the vehicle track data of the associated vehicle track so as to obtain target data, wherein the target data is used for describing: statistics of the target place being reached within the historical time period.
13. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any one of claims 1 to 11 when the computer program is executed.
14. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 11.
15. A computer program product comprising a computer program, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any one of claims 1 to 11.
CN202310833667.9A 2023-07-07 2023-07-07 Data processing method, device, computer equipment and storage medium Pending CN116975182A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310833667.9A CN116975182A (en) 2023-07-07 2023-07-07 Data processing method, device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116975182A true CN116975182A (en) 2023-10-31

Family

ID=88482398

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310833667.9A Pending CN116975182A (en) 2023-07-07 2023-07-07 Data processing method, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116975182A (en)

Legal Events

Date Code Title Description
PB01 Publication