CN117194596A - Data processing method, device, equipment and storage medium - Google Patents

Data processing method, device, equipment and storage medium

Info

Publication number
CN117194596A
Authority
CN
China
Prior art keywords
data
landmark
drive test
test data
point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311035409.2A
Other languages
Chinese (zh)
Inventor
欧阳晖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Telecom Corp Ltd
Original Assignee
China Telecom Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Telecom Corp Ltd
Priority to CN202311035409.2A
Publication of CN117194596A

Abstract

The embodiment of the invention provides a data processing method, device, equipment and storage medium. The method comprises the following steps: obtaining a drive test data file determined based on landmark facilities; grouping the drive test data in the drive test data file to obtain multiple groups of drive test data; performing aggregation calculation on each group of drive test data to obtain multiple aggregation results; adding a tile number to each aggregation result to obtain multiple target data points; and matching the target data points against the outline of the landmark using an improved cross-ray method to obtain an updated landmark and data file association table. The invention subdivides the data preprocessing work into two stages, flexibly adapts to the actual configuration requirements of a data platform, matches landmark facilities through the improved cross-ray method, realizes linearization and standardization of analysis results, and improves the accuracy of the data results.

Description

Data processing method, device, equipment and storage medium
Technical Field
The present invention relates to the field of communications technologies, and in particular, to a data processing method, an apparatus, a device, and a storage medium.
Background
In wireless network optimization, the drive test (DT) is a basic task: drive test equipment is connected to a terminal with service capability, and coverage and signaling interaction data are collected from the live network, so that the running condition and service capability of the 4G/5G network can be obtained objectively and comprehensively. In the prior art, when drive test data are preprocessed, they are not well combined with the service and the analysis algorithms, which can reduce the efficiency of the algorithms and cause deviations in the analysis results.
Disclosure of Invention
In view of the foregoing, embodiments of the present invention are proposed to provide a data processing method, apparatus, device, and storage medium that overcome, or at least partially solve, the foregoing problems.
In order to solve the above problems, an embodiment of the present invention discloses a data processing method, including:
obtaining a drive test data file determined based on landmark facilities;
grouping the drive test data in the drive test data file to obtain a plurality of groups of drive test data;
respectively carrying out aggregation calculation on the multiple groups of drive test data to obtain multiple aggregation results;
respectively adding tile numbers to the aggregation results to obtain a plurality of target data points;
and matching the target data points with the outline of the landmark according to the improved cross-ray method to obtain an updated landmark and data file association table.
Optionally, the grouping the drive test data in the drive test data file to obtain multiple groups of drive test data includes:
acquiring the instantaneous speed and the average speed of data points in the drive test data;
determining the state of test equipment according to the instantaneous speed and the average speed of the drive test data;
and storing the drive test data to a corresponding array according to the state of the test equipment.
Optionally, the determining the state of the test device according to the instantaneous speed and the average speed of the data points in the drive test data includes:
if the instantaneous speed is less than a first instantaneous speed threshold and the average speed is less than an average speed threshold, the test equipment is in a stationary state;
the storing the drive test data to a corresponding array according to the state of the test equipment includes:
and if the test equipment is in a static state, storing the drive test data into a first array.
Optionally, the method further comprises:
if the instantaneous speed is greater than or equal to a second instantaneous speed threshold, the test device is in a fast fading state, wherein the first instantaneous speed threshold is less than the second instantaneous speed threshold;
The storing the drive test data to a corresponding array according to the state of the test equipment includes:
and if the test equipment is in a fast fading state, storing the drive test data into a second array.
Optionally, the method further comprises:
if the instantaneous speed is greater than or equal to a first instantaneous speed threshold and less than a second instantaneous speed threshold, or the average speed is greater than the average speed threshold, the test device is in a slow fading state;
the storing the drive test data to a corresponding array according to the state of the test equipment includes:
and if the test equipment is in a slow fading state, storing the drive test data into a third array.
Optionally, the aggregating computing is performed on the multiple sets of drive test data to obtain multiple aggregate results, including:
and calculating, for the multiple groups of drive test data, an aggregation result corresponding to each group by using an aggregation function, wherein the aggregation function is one of: summation, maximum, average, and first valid value.
Optionally, the adding tile numbers to the multiple aggregation results to obtain multiple target data points includes:
for the aggregation results, longitude and latitude of each aggregation result are respectively obtained;
According to the longitude and latitude of each aggregation result, calculating the tile number corresponding to each aggregation result;
and adding corresponding tile numbers to each aggregation result respectively to obtain a plurality of target data points.
Optionally, the matching the plurality of target data points with the contours of the landmarks according to the improved cross-ray method to obtain an updated landmark-data file association table includes:
acquiring longitude and latitude ranges of landmark facilities;
screening the target data points according to the latitude and longitude range of the landmark facility;
judging whether the screened data points are positioned in a polygon defined by a landmark or not by an improved cross-ray method, wherein the polygon defined by the landmark is determined according to contour information in a landmark information table, and the contour information comprises longitudes and latitudes of a plurality of points on the polygon defined by the contour;
and if the data point is located in the polygon defined by the landmark, matching it against the outline in each type of landmark facility information to generate a new landmark and data file association table.
Optionally, the determining whether the screened data point is located in the polygon defined by the landmark by using the improved cross-ray method includes:
drawing an eastward ray from the screened data point, and counting the intersection points of the eastward ray with the polygon;
if the number of intersections is an odd number, the screened data points are determined to be positioned in the polygon defined by the landmark.
Optionally, the method further comprises:
if the number of the intersection points is not an odd number, leading out the south, west and north rays from the screened data points, and respectively calculating the corresponding number of the intersection points;
if the north ray and the east ray led out by the data point have no intersection point with the polygon, and the south ray and the west ray led out have intersection points with the polygon, determining the data point as a first quadrant point;
screening out a first target quadrant point positioned in the landmark facility auxiliary information table according to the first quadrant point;
deleting the first target quadrant point from the landmark facility auxiliary information table, and adding the first quadrant point to the landmark facility auxiliary information table.
Optionally, the method further comprises:
if neither the east ray nor the south ray has an intersection point with the polygon, the data point is a second quadrant point;
screening out a second target quadrant point positioned in the landmark facility auxiliary information table according to the second quadrant point;
Deleting the second target quadrant point from the landmark facility auxiliary information table, and adding the second quadrant point to the landmark facility auxiliary information table.
Optionally, the method further comprises:
if neither the south ray nor the west ray has an intersection point with the polygon, the data point is a third quadrant point;
screening out a third target quadrant point positioned in the landmark facility auxiliary information table according to the third quadrant point;
deleting the third target quadrant point from the landmark facility auxiliary information table, and adding the third quadrant point to the landmark facility auxiliary information table.
Optionally, the method further comprises:
if neither the west ray nor the north ray has an intersection point with the polygon, the data point is a fourth quadrant point;
screening out a fourth target quadrant point positioned in the landmark facility auxiliary information table according to the fourth quadrant point;
deleting the fourth target quadrant point from the landmark facility auxiliary information table, and adding the fourth quadrant point to the landmark facility auxiliary information table.
The invention also discloses a data processing device, which comprises:
the acquisition module is used for acquiring the drive test data file determined based on the landmark facilities;
the grouping module is used for grouping the drive test data in the drive test data file to obtain a plurality of groups of drive test data;
the computing module is used for respectively carrying out aggregation computation on the plurality of groups of drive test data to obtain a plurality of aggregation results;
the numbering module is used for respectively adding tile numbers to the aggregation results to obtain a plurality of target data points;
and the matching module is used for matching the target data points with the outline of the landmark according to the improved cross-ray method to obtain an updated landmark and data file association table.
The invention also discloses an electronic device, comprising: a processor, a memory and a computer program stored on the memory and capable of running on the processor, which when executed by the processor performs the steps of the data processing method according to any of the preceding claims.
The invention also discloses a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the data processing method as described in any of the above.
The embodiment of the invention has the following advantages:
The invention discloses a data processing method comprising: obtaining a drive test data file determined based on landmark facilities; grouping the drive test data in the drive test data file to obtain multiple groups of drive test data; performing aggregation calculation on each group of drive test data to obtain multiple aggregation results; adding a tile number to each aggregation result to obtain multiple target data points; and matching the target data points against the outline of the landmark using the improved cross-ray method to obtain an updated landmark and data file association table. The invention subdivides the data preprocessing work into two stages, flexibly adapts to the actual configuration requirements of a data platform, matches landmark facilities through the improved cross-ray method, realizes linearization and standardization of analysis results, and improves the accuracy of the data results.
Drawings
FIG. 1 is a flow chart of steps of a data processing method according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating steps of another data processing method according to an embodiment of the present invention;
FIG. 3 is a schematic illustration of a landmark profile provided by an embodiment of the present invention;
FIG. 4 is a schematic illustration of a cross-ray provided by an embodiment of the present invention;
FIG. 5 is a schematic illustration of another landmark profile provided by an embodiment of the present invention;
FIG. 6 is a flow chart of determining the location of a data point according to an embodiment of the present invention;
FIG. 7 is a block diagram of a data processing apparatus according to an embodiment of the present invention.
Detailed Description
In order that the above-recited objects, features and advantages of the present invention will become more readily apparent, a more particular description of the invention will be rendered by reference to the appended drawings and appended detailed description.
In wireless network optimization, the drive test (DT) is a basic task: drive test equipment is connected to a terminal with service capability, and coverage and signaling interaction data are collected from the live network, so that the running condition and service capability of the 4G/5G network can be obtained objectively and comprehensively. With 4G/5G drive test data, network optimization can be guided effectively, and multi-level data support can be provided for services such as the Internet of Vehicles and the Internet of Things.
The background of the drive test software provides a drive test data analysis function, but is limited to analysis of single or a plurality of data, if joint analysis is required to be carried out on a large amount of measured data in a period of time, a relational database or a NoSQL-based big data platform (data lake) can be established for storing the drive test data; and then writing a data analysis script to analyze regularly, and then using the program of the B/S architecture to present the analysis result to the user.
Aiming at the requirements of a large data platform (or a data lake), the data preprocessing flow is the process of ETL (extraction, conversion and loading), the drive test data preprocessing is also carried out according to the flow, and the drive test data preprocessing is mainly carried out according to the service characteristics of 4G/5G drive test data, so that valuable data lines (or data documents) for subsequent analysis are extracted, the data without information or with extremely low information is removed, and then certain rules are combined, and finally the data are loaded into (written into) a database or the large data platform.
In the prior art, data preprocessing emphasizes the E (extract) and L (load) stages of ETL and pays little attention to T (transform). This causes two problems. First, the data volume is large, which increases the storage burden on the database or data platform and hinders the efficiency of subsequent analysis algorithms; some data platforms (data lakes) also limit the amount of data that can be imported at one time, so the import may fail. Second, the imported data are not well combined with the service and the analysis algorithms, which not only reduces the efficiency of the algorithms but, more importantly, means that an incorrect preprocessing algorithm causes deviations in the analysis results.
One of the core concepts of the embodiment of the invention is as follows: a drive test data file determined based on landmark facilities is acquired; the drive test data in the drive test data file are grouped to obtain multiple groups of drive test data; aggregation calculation is performed on each group of drive test data to obtain multiple aggregation results; a tile number is added to each aggregation result to obtain multiple target data points; and the target data points are matched against the outline of the landmark using the improved cross-ray method to obtain an updated landmark and data file association table. The invention subdivides the data preprocessing work into two stages, flexibly adapts to the actual configuration requirements of a data platform, matches landmark facilities through the improved cross-ray method, realizes linearization and standardization of analysis results, and improves the accuracy of the data results.
Referring to fig. 1, a step flowchart of a data processing method provided by an embodiment of the present invention is shown, where the method specifically may include the following steps:
Step 101, a drive test data file determined based on landmark facilities is acquired.
In the embodiment of the invention, the drive test data file determined based on the landmark facility can be exported from the drive test software. The data in the file are stored as a table whose fields fall into two parts, necessary fields and optional fields. The necessary fields include longitude, latitude, physical cell identity, base station number, cell number and fields indicating coverage (e.g., RSRP, SINR), as shown in Table 1. The optional fields depend on the service tested (including 4G, NSA, SA, NB-IoT, VoLTE, VoNR, etc.), for example CQI, MCS, and the uplink and downlink layer rates.
Field name                    | Field type    | Field category    | Nullable
PC time                       | Date and time | Timestamp         | No
Longitude                     | Floating point| Position          | No
Latitude                      | Floating point| Position          | No
RSRP                          | Floating point| Coverage strength | Yes
SINR                          | Floating point| Coverage strength | Yes
PCI (physical cell identity)  | Short integer | Cell information  | Yes
Base station number           | Integer       | Cell information  | Yes
Cell number                   | Short integer | Cell information  | Yes
……                            | ……            | ……                | ……
TABLE 1
Step 102, grouping the drive test data in the drive test data file to obtain multiple groups of drive test data.
In the embodiment of the invention, a plurality of drive test data in the drive test data file can be grouped based on the characteristics of the drive test data, so that a plurality of groups of drive test data are obtained.
Step 103, aggregation calculation is performed on each of the groups of drive test data to obtain a plurality of aggregation results.
In the embodiment of the invention, after a plurality of groups of drive test data are obtained, aggregation calculation can be performed on each group of drive test data through a preset algorithm, so as to obtain a plurality of groups of aggregation results, and in one example, the average value of each group of drive test data can be calculated, so as to obtain the average value of the plurality of groups of drive test data, namely a plurality of aggregation results.
Step 104, tile numbers are added to the aggregation results respectively, and a plurality of target data points are obtained.
In the embodiment of the invention, after a plurality of groups of drive test data are subjected to aggregation calculation to obtain a plurality of groups of aggregation results, tile numbers can be added to each group of aggregation results respectively to obtain a plurality of target data points, and in one example, the plurality of aggregation result numbers can be sequentially given according to the sizes of different aggregation results to obtain a plurality of target data points.
Step 105, the target data points are matched with the outline of the landmark according to the improved cross-ray method to obtain an updated landmark and data file association table.
In the embodiment of the invention, the determined multiple target data points can be matched with the outline of the landmark by an improved cross-ray method, so that the stability in the matching process is improved.
The method comprises: obtaining a drive test data file determined based on landmark facilities; grouping the drive test data in the drive test data file to obtain multiple groups of drive test data; performing aggregation calculation on each group of drive test data to obtain multiple aggregation results; adding a tile number to each aggregation result to obtain multiple target data points; and matching the target data points against the outline of the landmark using the improved cross-ray method to obtain an updated landmark and data file association table. The invention subdivides the data preprocessing work into two stages, flexibly adapts to the actual configuration requirements of a data platform, matches landmark facilities through the improved cross-ray method, realizes linearization and standardization of analysis results, and improves the accuracy of the data results.
Referring to fig. 2, a flowchart of steps of another data processing method provided by an embodiment of the present invention is shown, where the method specifically may include the following steps:
Step 201, a drive test data file determined based on landmark facilities is acquired.
Step 202, obtaining an instantaneous speed and an average speed of data points in the drive test data.
In the embodiment of the invention, the instantaneous speed $v_s$ and the average speed $v_t$ of the data points in the drive test data can be obtained.
Step 203, the state of the test equipment is determined according to the instantaneous speed and the average speed of the drive test data.
In the embodiment of the invention, the state of the test equipment can be determined from the instantaneous speed $v_s$ and the average speed $v_t$ of the data points.
In one embodiment of the invention, the test device is in a stationary state if the instantaneous speed is less than a first instantaneous speed threshold and the average speed is less than an average speed threshold.
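In the embodiment of the invention, a first instantaneous speed threshold and an average speed threshold are set. If the instantaneous speed $v_s$ is less than the first instantaneous speed threshold and the average speed $v_t$ is less than the average speed threshold, the test device is determined to be in a stationary state. (The threshold values and the equivalent expanded forms of the two inequalities are formula images in the original publication and did not survive extraction.)
The judging condition of the stationary state is thus the conjunction of these two inequalities, where the subscripts $j$, $k$ must satisfy $j < k$. When the user is in a quasi-stationary state, the timestamp is used as the basis for grouping: when the time condition (also a formula image in the original publication) is met, the array $[R_j, \dots, R_{k-1}]$ is output.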
In one embodiment of the present invention, the test device is in a fast fading condition if the instantaneous speed is greater than or equal to a second instantaneous speed threshold, wherein the first instantaneous speed threshold is less than the second instantaneous speed threshold.
In the embodiment of the invention, the fast fading instantaneous speed threshold is obtained from a formula and listed in Table 2. (The formula and the contents of Table 2 are images in the original publication and did not survive extraction.)
TABLE 2
In the embodiment of the invention, the RSRP change threshold is set as $T_{RSRP} = 3$ dB/100 ms, so the coverage condition is $\Delta_{RSRP}(k) > T_{RSRP}$. Combining the two conditions, the test device is judged to be in the fast fading state if the instantaneous speed is greater than or equal to the second instantaneous speed threshold, or if $\Delta_{RSRP}(k) > T_{RSRP}$. In the fast fading state the data are grouped immediately: as soon as the above condition is satisfied, the array $[R_j, \dots, R_{k-1}]$ is output.
In one embodiment of the present invention, the test device is in a slow fading state if the instantaneous speed is greater than or equal to the first instantaneous speed threshold and less than the second instantaneous speed threshold, or the average speed is greater than the average speed threshold.
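In the embodiment of the invention, if the instantaneous speed $v_s$ is greater than or equal to the first instantaneous speed threshold and less than the second instantaneous speed threshold, or the average speed $v_t$ is greater than the average speed threshold, the test device may be determined to be in a slow fading state.
Step 204, according to the state of the test equipment, the drive test data is stored into the corresponding array.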
In the embodiment of the invention, if the test equipment is in a static state, the drive test data is stored in the first array, if the test equipment is in a fast fading state, the drive test data is stored in the second array, and if the test equipment is in a slow fading state, the drive test data is stored in the third array, so that different drive test data in the drive test data can be detected and grouped.
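The state rules of steps 203 and 204 can be sketched as follows. This is only an illustration: the numeric thresholds are hypothetical (the original gives its values as formula images), the names `classify_state` and `group_by_state` are invented here, and the RSRP-change condition for fast fading is left out for brevity.

```python
from collections import defaultdict

# Hypothetical thresholds in m/s; the source does not give usable values.
V_S1 = 0.5   # first instantaneous speed threshold (assumed)
V_S2 = 8.0   # second instantaneous speed threshold (assumed), must exceed V_S1
V_T = 0.5    # average speed threshold (assumed)


def classify_state(v_s, v_t, v_s1=V_S1, v_s2=V_S2, v_t_thr=V_T):
    """Return the test-device state for one data point."""
    if v_s >= v_s2:
        return "fast_fading"
    if v_s < v_s1 and v_t < v_t_thr:
        return "stationary"
    # Remaining cases: v_s1 <= v_s < v_s2, or v_t at or above the average
    # threshold. The source leaves v_t == threshold unspecified; this sketch
    # treats it as slow fading.
    return "slow_fading"


def group_by_state(points):
    """Store each point in the array that matches its state."""
    arrays = defaultdict(list)
    for p in points:
        arrays[classify_state(p["v_s"], p["v_t"])].append(p)
    return arrays


groups = group_by_state([
    {"v_s": 0.1, "v_t": 0.1},
    {"v_s": 10.0, "v_t": 0.1},
    {"v_s": 2.0, "v_t": 0.1},
])
```

Checking the fast fading condition first keeps the three states mutually exclusive, which the original text implies but does not state.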
Step 205, for a plurality of sets of drive test data, calculating an aggregate result corresponding to each set of drive test data by adopting an aggregate function, where the aggregate function includes: summation, maximum, average, one of the first significant values.
In the embodiment of the invention, let the output array be $R_{j,k} = [R_j, \dots, R_{k-1}]$ and define the relational expression of the processed data table as $R = R(A, B, C, \dots)$, where $A, B, C, \dots$ are the fields of $R$. Before an aggregation operation is performed on field $A$, the null values must be filtered out; the relational expression is $\pi_{A \neq \mathrm{null}}(\sigma_A(R))$, that is, field $A$ is taken from $R$ and only its non-null values are kept. A mathematical operation such as summation, maximum or average is then applied to the resulting scalar array. One such operation is performed on each field, and the final output is a single record with the same fields as the array. Table 3 lists the aggregation functions used for typical fields. (The contents of Table 3 are an image in the original publication and did not survive extraction.)
TABLE 3
In one embodiment of the present invention, after the aggregation calculation is performed on each group of drive test data, the total number of test points and the number of covered test points of the data file may be calculated. Specifically, the records with valid coverage fields are selected as $S = \pi_{\mathrm{RSRP} \neq \mathrm{null} \wedge \mathrm{SINR} \neq \mathrm{null}}(R)$; the total number of test points is then $n = \mathrm{count}(S)$ and the number of covered points is $n_c = \mathrm{count}(\pi_{\mathrm{RSRP} > -105\,\mathrm{dBm} \wedge \mathrm{SINR} > -3\,\mathrm{dB}}(S))$, where -105 dBm and -3 dB are the configured RSRP and SINR thresholds.
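A minimal sketch of the per-field aggregation and the coverage counts described above, assuming each record is a dictionary keyed by field name. Which aggregation function applies to which field is defined by Table 3 in the original; the mapping used in the example below is an assumption, as are the function names.

```python
from statistics import mean


def non_null(records, field):
    """Keep only the non-null values of one field."""
    return [r[field] for r in records if r.get(field) is not None]


AGGREGATORS = {
    "sum": sum,
    "max": max,
    "mean": mean,
    "first": lambda values: values[0],  # first valid value
}


def aggregate(records, spec):
    """Collapse one group into a single record; spec maps field -> function name."""
    out = {}
    for field, how in spec.items():
        values = non_null(records, field)
        out[field] = AGGREGATORS[how](values) if values else None
    return out


def coverage_counts(records, rsrp_min=-105.0, sinr_min=-3.0):
    """Return (total test points n, covered points n_c)."""
    valid = [r for r in records
             if r.get("RSRP") is not None and r.get("SINR") is not None]
    covered = [r for r in valid
               if r["RSRP"] > rsrp_min and r["SINR"] > sinr_min]
    return len(valid), len(covered)


group = [
    {"RSRP": -100.0, "SINR": 5.0, "PCI": 12},
    {"RSRP": None, "SINR": 3.0, "PCI": 12},
    {"RSRP": -110.0, "SINR": -5.0, "PCI": None},
]
# Assumed mapping: average RSRP, maximum SINR, first valid PCI.
result = aggregate(group, {"RSRP": "mean", "SINR": "max", "PCI": "first"})
n, n_c = coverage_counts(group)
```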
Step 206, obtaining longitude and latitude of each aggregation result for a plurality of aggregation results.
In the embodiment of the present invention, each aggregation result carries longitude and latitude information, and in one example, longitude and latitude coordinates of the obtained aggregation result are (x, y).
Step 207, calculating the tile number corresponding to each aggregation result according to the longitude and latitude of each aggregation result.
In the embodiment of the invention, the longitude and latitude ranges of the study area are assumed to be, respectively,
$x_0 \le x < x_0 + K \cdot \Delta_x$
$y \ge y_0$
The tile number corresponding to each aggregation result can be calculated by formula (1), which is an image in the original publication and did not survive extraction. In that formula, the rectangle defining a tile has length (longitude span) $\Delta_x$ and width (latitude span) $\Delta_y$, $K$ is a positive integer, $x_0$ is a longitude value, and $y_0$ is a latitude value.
Step 208, adding corresponding tile numbers to each aggregation result, so as to obtain a plurality of target data points.
In the embodiment of the invention, after determining the tile number of each aggregation result, the corresponding tile number can be added for each aggregation result to obtain a plurality of target data points.
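Because formula (1) is not available here, the following sketch only illustrates one plausible numbering scheme: tiles are numbered row by row from the lower-left corner of the study area, with $K$ tiles per row. The row-major layout, the zero-based numbering and the function name `tile_number` are assumptions, not the patent's formula.

```python
import math


def tile_number(x, y, x0, y0, dx, dy, k):
    """Map (longitude x, latitude y) to an assumed row-major tile index."""
    if not (x0 <= x < x0 + k * dx) or y < y0:
        raise ValueError("point lies outside the study area")
    col = min(math.floor((x - x0) / dx), k - 1)  # guard against float rounding
    row = math.floor((y - y0) / dy)
    return row * k + col


def add_tile_numbers(results, x0, y0, dx, dy, k):
    """Attach a tile number to each aggregation result to form target data points."""
    return [dict(r, tile=tile_number(r["x"], r["y"], x0, y0, dx, dy, k))
            for r in results]


points = add_tile_numbers([{"x": 2.5, "y": 1.5}], 0.0, 0.0, 1.0, 1.0, 10)
```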
Step 209, matching the target data point with the outline of the landmark according to the improved cross-ray method, and obtaining an updated landmark and data file association table.
In one embodiment of the present invention, the step 209 may include the following sub-steps:
Sub-step S21, the longitude and latitude range of the landmark facility is acquired.
In the embodiment of the invention, the longitude and latitude of the lower-left corner of the rectangular range of the landmark facility can be set as $(x_1, y_1)$, and the longitude and latitude of the upper-right corner as $(x_2, y_2)$.
In the substep S22, a plurality of target data points are screened according to the latitude and longitude ranges of the landmark facility.
In the embodiment of the invention, the target data points can first be screened according to the longitude and latitude range of the landmark facility. Fig. 3 shows a schematic diagram of a landmark outline provided by the embodiment of the invention: the polygon is the outline of the landmark, the rectangle (in fact a curved rectangle on the earth's surface) is the longitude and latitude range of the landmark, and the points are drive test target data points. Clearly, if a test point lies inside the polygon, it also lies inside the rectangle. Based on this principle, let the relation representing the drive test data points be $R$, with fields $x$ and $y$ representing longitude and latitude; the preliminary screening then keeps only the records of $R$ whose $x$ lies between $x_1$ and $x_2$ and whose $y$ lies between $y_1$ and $y_2$ (the relational algebra expression itself is a formula image in the original publication). In this way the preliminary screening removes most of the data points that lie outside the landmark polygon.
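The preliminary screening amounts to a bounding-box filter. A minimal sketch, assuming points are dictionaries with `x` (longitude) and `y` (latitude) keys and that the box bounds are inclusive (the original does not say whether they are strict); both function names are illustrative.

```python
def bounding_box(polygon):
    """Return (x1, y1, x2, y2) for a polygon given as a list of (x, y) vertices."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def bbox_prefilter(points, x1, y1, x2, y2):
    """Keep only the points inside the landmark's longitude/latitude rectangle."""
    return [p for p in points if x1 <= p["x"] <= x2 and y1 <= p["y"] <= y2]


outline = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
candidates = bbox_prefilter(
    [{"x": 1, "y": 1}, {"x": 5, "y": 1}, {"x": 3, "y": 3}],
    *bounding_box(outline),
)
```

Note that the point (3, 3) passes the filter even though it lies in the notch of this L-shaped outline, which is why the polygon test of sub-step S23 is still needed.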
In the substep S23, it is determined whether the screened data points are located in a landmark defined polygon by using the modified cross-ray method, where the landmark defined polygon is determined according to the contour information in the landmark information table, and the contour information includes the longitudes and latitudes of a plurality of points on the contour defined polygon.
In the embodiment of the invention, after screening a plurality of target data points, whether the screened data points are positioned in a polygon defined by a landmark can be judged by an improved cross-ray method.
In an embodiment of the present invention, the determining, by using the modified cross-ray method, whether the screened data point is located in the landmark defined polygon includes: an east ray is led out from the screened data points, and the number of intersection points of the east ray and the polygon is calculated; if the number of intersections is an odd number, the screened data points are determined to be positioned in the polygon defined by the landmark.
In the embodiment of the present invention, Fig. 4 shows a schematic diagram of a cross ray provided by the embodiment of the invention. A ray is drawn from the screened data point A; if the number of intersection points with the polygon is odd, point A is determined to be inside the polygon. A ray drawn from point B has an even number of intersection points with the polygon, which indicates that point B is outside the polygon.
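The eastward-ray test can be sketched with the standard crossing-number technique. The half-open comparison used for edges is a common convention for avoiding double counts at vertices; it is an implementation choice made here, not something the original specifies, and the function names are illustrative.

```python
def east_ray_crossings(point, polygon):
    """Count how many polygon edges an eastward ray from point crosses."""
    px, py = point
    count = 0
    n = len(polygon)
    for i in range(n):
        (xa, ya), (xb, yb) = polygon[i], polygon[(i + 1) % n]
        # The edge must straddle the horizontal line y = py.
        if (ya > py) != (yb > py):
            x_cross = xa + (py - ya) * (xb - xa) / (yb - ya)
            if x_cross > px:  # crossing lies east of the point
                count += 1
    return count


def is_inside(point, polygon):
    """An odd crossing count means the point is inside the polygon."""
    return east_ray_crossings(point, polygon) % 2 == 1


square = [(0, 0), (4, 0), (4, 4), (0, 4)]
```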
In one embodiment of the invention, if the number of intersection points is not odd, south, west and north rays are led out from the screened data point, and the corresponding numbers of intersection points are calculated respectively; if the north ray and the east ray led out from the data point have no intersection point with the polygon, and the south ray and the west ray both have intersection points with the polygon, the data point is determined to be a first quadrant point; a first target quadrant point located in the landmark facility auxiliary information table is screened out according to the first quadrant point; the first target quadrant point is deleted from the landmark facility auxiliary information table, and the first quadrant point is added to the landmark facility auxiliary information table.
Specifically, Fig. 5 shows a schematic diagram of another landmark outline provided by the embodiment of the invention, in which the regions marked I, II, III and IV correspond to the first, second, third and fourth quadrant regions. If neither the north ray nor the east ray led out from a data point intersects the polygon, while both the south ray and the west ray do, the data point is determined to be a first quadrant point; if neither the east ray nor the south ray intersects the polygon, the data point is a second quadrant point; if neither the south ray nor the west ray intersects the polygon, the data point is a third quadrant point; and if neither the west ray nor the north ray intersects the polygon, the data point is a fourth quadrant point.
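The four-ray quadrant test can be sketched as below. The text states the full condition only for the first quadrant; this sketch assumes, by symmetry, that for the other quadrants the two remaining rays must also intersect the polygon. The names `crossings` and `classify_quadrant` are hypothetical, and coordinates are treated as planar.

```python
def crossings(point, polygon, direction):
    """Count polygon edges crossed by a ray from `point` toward E, W, N or S."""
    px, py = point
    if direction in ("N", "S"):
        # Swap the axes so a vertical ray can reuse the horizontal-ray logic.
        px, py = py, px
        polygon = [(y, x) for x, y in polygon]
    positive = direction in ("E", "N")
    count = 0
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if (x_cross > px) if positive else (x_cross < px):
                count += 1
    return count


def classify_quadrant(point, polygon):
    """Return 1, 2, 3 or 4 for a quadrant point, or None otherwise."""
    hit = {d: crossings(point, polygon, d) > 0 for d in "ENWS"}
    # Rays that must NOT intersect the polygon, per quadrant (from the text).
    clear_rays = {1: ("N", "E"), 2: ("E", "S"), 3: ("S", "W"), 4: ("W", "N")}
    for quadrant, clear in clear_rays.items():
        blocked = [d for d in "ENWS" if d not in clear]
        # Assumption: the two remaining rays must both intersect the polygon.
        if not any(hit[d] for d in clear) and all(hit[d] for d in blocked):
            return quadrant
    return None
```

For a diamond-shaped outline, a point in the upper-right corner of the bounding rectangle is classified as quadrant 1, while a point inside the diamond returns `None` because all four rays intersect the outline.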
In the embodiment of the invention, after determining that the target data point is located outside the polygon and determining the quadrant in which the data point is located, the first target quadrant point in the landmark facility auxiliary information table can be screened out according to the quadrant in which the data point is located, and then the data point is added into the landmark facility auxiliary information table, as shown in table 4, which shows the landmark facility auxiliary information provided by the embodiment of the invention.
Table 4
Specifically, if the data point is a first quadrant point, a first target quadrant point in the landmark facility auxiliary information table can be screened out according to the first quadrant point; the first target quadrant point is deleted from the landmark facility assistance information table, and the first quadrant point is added to the landmark facility assistance information table.
Specifically, if the longitude and latitude coordinates of the first quadrant point are P(x0, y0), a selection relation over S, where S is the record set of the landmark facility auxiliary information table, is used to screen out the first target quadrant point in the landmark facility auxiliary information table; the screened first target quadrant point is then deleted from the landmark facility auxiliary information table, and the point P is added to the landmark facility auxiliary information table.
The invention fully mines the landmark facility information contained in the drive test data and marks each test data result table accordingly; the improved cross-ray algorithm introduces the concept of four quadrant points together with an analysis algorithm for them, which improves matching efficiency.
If the data point is the second quadrant point, screening out a second target quadrant point positioned in the landmark facility auxiliary information table according to the second quadrant point; and deleting the second target quadrant point from the landmark facility auxiliary information table, and adding the second quadrant point to the landmark facility auxiliary information table.
If the data point is the third quadrant point, a third target quadrant point positioned in the landmark facility auxiliary information table can be screened out according to the third quadrant point; and deleting the third target quadrant point from the landmark facility auxiliary information table, and adding the third quadrant point to the landmark facility auxiliary information table.
If the data point is a fourth quadrant point, a fourth target quadrant point positioned in the landmark facility auxiliary information table can be screened out according to the fourth quadrant point; and deleting the fourth target quadrant point from the landmark facility auxiliary information table, and adding the fourth quadrant point to the landmark facility auxiliary information table.
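The delete-then-add maintenance of the auxiliary table can be sketched as follows. Because the selection relation that identifies a "target quadrant point" is not reproduced in the text, the sketch takes it as a caller-supplied predicate; `update_auxiliary_table` and `first_quadrant_target` are hypothetical names, and the example predicate is purely an assumption made for illustration.

```python
from typing import Callable, List, Tuple

Point = Tuple[float, float]  # (longitude, latitude)


def update_auxiliary_table(
    records: List[Point],
    new_point: Point,
    is_target: Callable[[Point, Point], bool],
) -> List[Point]:
    """Delete the records selected by `is_target`, then add `new_point`.

    `is_target(record, new_point)` stands in for the selection relation
    over S, which is not given in the text.
    """
    kept = [r for r in records if not is_target(r, new_point)]
    kept.append(new_point)
    return kept


def first_quadrant_target(record: Point, p: Point) -> bool:
    """ASSUMED relation, for illustration only: a stored point is a target
    when it lies at or beyond the new point in both longitude and latitude."""
    return record[0] >= p[0] and record[1] >= p[1]
```

With this assumed predicate, `update_auxiliary_table([(3.8, 3.9), (3.2, 3.9)], (3.5, 3.5), first_quadrant_target)` drops `(3.8, 3.9)` and keeps `(3.2, 3.9)` alongside the new point; a different relation can be substituted without changing the update routine.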
Fig. 6 is a flowchart of determining the position of a data point according to an embodiment of the present invention. An east ray is first led out from the data point and its number of intersection points with the polygon is calculated. If the number is odd, the data point is determined to be inside the polygon and is added to the landmark and data file association table. If the number is even, the numbers of intersection points in the other three directions are calculated to judge whether the point is a quadrant point: if it is not, the point is simply determined to be outside the polygon; if it is, the point is matched against the data of the landmark facility auxiliary information table and added to that table.
In the embodiment of the invention, in the process of matching drive test data with landmark facilities, the existing cross-ray method is improved and divided into two steps, preliminary screening and fine matching, and a self-learning mechanism is introduced. The share of work handled by the fast preliminary screening stage is greatly increased and the share handled by the slow fine matching stage is correspondingly reduced, which lowers the overall running time of matching drive test point data to landmark facilities.
In the substep S24, if the data point is located in the polygon defined by the landmark, the outlines of each type of landmark facility information are matched, and a new landmark and data file association table is generated.
In the embodiment of the invention, if a data point is determined to be located in the polygon, the outlines of each type of landmark facility information can be matched and records of the landmark and data file association table in the database can be generated. The table can include a landmark number, a data file number, the number of test points, and the number of test points meeting the coverage condition. Table 5 shows a landmark and data file association table provided by the embodiment of the invention.
TABLE 5
According to the invention, the landmark and data file association table is established and maintained by using the improved cross-ray method. The share of work handled by the fast preliminary screening stage is greatly increased and the share handled by the slow fine matching stage is correspondingly reduced, which lowers the overall running time of matching drive test point data to landmark facilities. Establishing and maintaining the landmark and data file association table also allows the coverage condition of each landmark facility to be queried quickly.
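One possible in-memory shape for the association table is sketched below. The four fields follow the description of Table 5; the class name `AssociationRecord`, the function `record_match`, and the use of a dictionary keyed by (landmark number, data file number) are assumptions for illustration, and the coverage condition itself is taken as a boolean decided elsewhere.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class AssociationRecord:
    landmark_id: str          # landmark number
    file_id: str              # data file number
    test_points: int = 0      # test points matched to the landmark
    covered_points: int = 0   # test points meeting the coverage condition


def record_match(
    table: Dict[Tuple[str, str], AssociationRecord],
    landmark_id: str,
    file_id: str,
    meets_coverage: bool,
) -> AssociationRecord:
    """Register one data point that was found inside a landmark polygon."""
    key = (landmark_id, file_id)
    rec = table.setdefault(key, AssociationRecord(landmark_id, file_id))
    rec.test_points += 1
    if meets_coverage:
        rec.covered_points += 1
    return rec
```

Keeping one record per landmark and data file pair means the coverage ratio of a landmark can be read directly as `covered_points / test_points` without rescanning the drive test data.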
The method comprises the steps of obtaining a road test data file determined based on landmark facilities; grouping the drive test data in the drive test data file to obtain a plurality of groups of drive test data; respectively carrying out aggregation calculation on multiple groups of drive test data to obtain multiple aggregation results; respectively adding tile numbers to the multiple aggregation results to obtain multiple target data points; and matching the outline of the target data point and the landmark according to the improved cross ray method to obtain an updated landmark and data file association table. The invention subdivides the data preprocessing work into two stages, flexibly adapts to the actual configuration requirement of a data platform, matches landmark facilities through an improved cross-ray method, realizes linearization and standardization of analysis results, and improves the accuracy of the data results.
It should be noted that, for simplicity of description, the method embodiments are shown as a series of acts, but it should be understood by those skilled in the art that the embodiments are not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred embodiments, and that the acts are not necessarily required by the embodiments of the invention.
Referring to fig. 7, a block diagram of a data processing apparatus according to an embodiment of the present invention is shown, which may specifically include the following modules:
an acquisition module 301, configured to acquire a drive test data file determined based on landmark facilities;
a grouping module 302, configured to group the drive test data in the drive test data file to obtain multiple groups of drive test data;
the calculation module 303 is configured to perform aggregate calculation on the multiple sets of drive test data respectively, so as to obtain multiple aggregate results;
a numbering module 304, configured to add tile numbers to the multiple aggregation results, respectively, to obtain multiple target data points;
and the matching module 305 is configured to match the target data point with the outline of the landmark according to the improved cross-ray method, so as to obtain an updated association table of the landmark and the data file.
The invention discloses a data processing device, which is characterized in that a road test data file determined based on landmark facilities is obtained; grouping the drive test data in the drive test data file to obtain a plurality of groups of drive test data; respectively carrying out aggregation calculation on multiple groups of drive test data to obtain multiple aggregation results; respectively adding tile numbers to the multiple aggregation results to obtain multiple target data points; and matching the outline of the target data point and the landmark according to the improved cross ray method to obtain an updated landmark and data file association table. The invention subdivides the data preprocessing work into two stages, flexibly adapts to the actual configuration requirement of a data platform, matches landmark facilities through an improved cross-ray method, realizes linearization and standardization of analysis results, and improves the accuracy of the data results.
In one embodiment of the present invention, the grouping module 302 may include:
the first acquisition submodule is used for acquiring the instantaneous speed and the average speed of data points in the drive test data;
the state determining submodule is used for determining the state of the test equipment according to the instantaneous speed and the average speed of the drive test data;
and the storage sub-module is used for storing the drive test data to the corresponding array according to the state of the test equipment.
In one embodiment of the present invention, the state determining sub-module may include:
and the first state determining unit is used for determining that the test equipment is in a static state if the instantaneous speed is smaller than a first instantaneous speed threshold value and the average speed is smaller than an average speed threshold value.
The storage sub-module may include: and the first storage unit is used for storing the drive test data into a first array if the test equipment is in a static state.
In one embodiment of the present invention, the method may further include:
and the second state determining unit is used for determining that the test equipment is in a fast fading state if the instantaneous speed is greater than or equal to a second instantaneous speed threshold, wherein the first instantaneous speed threshold is smaller than the second instantaneous speed threshold.
The storage sub-module may include: and the second storage unit is used for storing the drive test data into a second array if the test equipment is in a fast fading state.
In one embodiment of the present invention, the method may further include:
and the third state determining unit is used for determining that the test equipment is in a slow fading state if the instantaneous speed is greater than or equal to the first instantaneous speed threshold and smaller than the second instantaneous speed threshold or the average speed is greater than the average speed threshold.
The storage sub-module may include: and the third storage unit is used for storing the drive test data into a third array if the test equipment is in a slow fading state.
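The three-state grouping by speed can be sketched as follows. The threshold values are not given in the text and must be supplied by the caller; the names `device_state` and `group_by_state` and the sample field names `instant` and `average` are hypothetical. One case is not covered by the stated conditions (instantaneous speed below the first threshold with average speed exactly equal to the average threshold); the sketch assigns it to the slow fading state as an assumption.

```python
from collections import defaultdict

STATIC, FAST_FADING, SLOW_FADING = "static", "fast_fading", "slow_fading"


def device_state(instant, average, v1, v2, avg_limit):
    """Classify the test equipment state; requires v1 < v2.

    v1, v2: first and second instantaneous speed thresholds.
    avg_limit: average speed threshold.
    """
    if instant < v1 and average < avg_limit:
        return STATIC
    if instant >= v2:
        return FAST_FADING
    # v1 <= instant < v2, or average above the limit; the uncovered
    # boundary case also falls through to here (assumption).
    return SLOW_FADING


def group_by_state(samples, v1, v2, avg_limit):
    """Store each drive test sample in the array for its state."""
    groups = defaultdict(list)
    for s in samples:
        state = device_state(s["instant"], s["average"], v1, v2, avg_limit)
        groups[state].append(s)
    return groups
```

The three lists in the returned mapping correspond to the first, second and third arrays named in the text.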
In one embodiment of the present invention, the calculation module 303 may include:
the computing sub-module is used for calculating, for the plurality of groups of drive test data, an aggregation result corresponding to each group of drive test data by adopting an aggregation function, wherein the aggregation function comprises one of: summation, maximum, average, and first valid value.
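The per-group aggregation can be sketched as below. The sketch reads the last aggregation function as "first valid value", meaning the first non-missing value in the group, which is an interpretation of the translated text. The names `AGGREGATORS` and `aggregate` and the example field `rsrp` are hypothetical.

```python
import statistics

# The four aggregation functions named in the text.
AGGREGATORS = {
    "sum": sum,
    "max": max,
    "mean": statistics.fmean,
    "first_valid": lambda values: values[0],
}


def aggregate(groups, field, how):
    """Aggregate `field` within each group using the function named `how`.

    `groups` maps a group name to a list of sample dicts. Missing values
    (None) are skipped; a group with no valid value aggregates to None.
    """
    func = AGGREGATORS[how]
    result = {}
    for name, rows in groups.items():
        values = [row[field] for row in rows if row.get(field) is not None]
        result[name] = func(values) if values else None
    return result
```

Each group then yields exactly one aggregation result, which is the unit that later receives a tile number.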
In one embodiment of the present invention, the numbering module 304 may include:
the second acquisition submodule is used for respectively acquiring longitude and latitude of each aggregation result for the plurality of aggregation results;
The number calculation sub-module is used for calculating the tile number corresponding to each aggregation result according to the longitude and the latitude of each aggregation result;
and the coding sub-module is used for adding corresponding tile numbers for each aggregation result respectively to obtain a plurality of target data points.
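This part of the text does not spell out how a tile number is derived from longitude and latitude. As an illustration only, the sketch below substitutes the widely used Web Mercator XYZ ("slippy map") tiling, which may differ from the scheme the invention actually uses; the function names and the default zoom level are hypothetical.

```python
import math


def tile_number(lon, lat, zoom):
    """Return the (x, y) tile index of a point in the Web Mercator XYZ scheme."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that lon = 180 or extreme latitudes stay within range.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def add_tile_numbers(results, zoom=16):
    """Attach a tile number to each aggregation result (dicts with lon/lat)."""
    return [dict(r, tile=tile_number(r["lon"], r["lat"], zoom)) for r in results]
```

Whatever scheme is used, the point of the tile number is that nearby target data points share a key, so they can be indexed and retrieved together.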
In one embodiment of the present invention, the matching module 305 may include:
a third acquisition sub-module for acquiring longitude and latitude ranges of landmark facilities;
the screening sub-module is used for screening the plurality of target data points according to the latitude and longitude range of the landmark facility;
the judging sub-module is used for judging whether the screened data points are positioned in a polygon defined by a landmark or not through an improved cross ray method, wherein the polygon defined by the landmark is determined according to contour information in a landmark information table, and the contour information comprises longitudes and latitudes of a plurality of points on the polygon defined by the contour;
and the matching sub-module is used for matching the outline of each type of landmark facility information if the landmark facility information is positioned in the polygon defined by the landmark, and generating a new landmark and a data file association table.
In an embodiment of the present invention, the judging sub-module may include:
The first calculation unit is used for leading out east rays from the screened data points and calculating the number of intersection points of the east rays and the polygons;
and the first determining unit is used for determining that the screened data points are positioned in the polygon defined by the landmark if the number of the intersection points is an odd number.
In one embodiment of the present invention, further comprising:
the second calculation unit is used for leading out southbound, westbound and northbound rays from the screened data points if the number of the intersections is not an odd number, and respectively calculating the corresponding number of the intersections;
the second determining unit is used for determining the data point as a first quadrant point if the north ray and the east ray led out by the data point have no intersection point with the polygon and the south ray and the west ray led out by the data point have intersection points with the polygon;
a matching sub-module, comprising:
the first screening unit is used for screening out a first target quadrant point positioned in the landmark facility auxiliary information table according to the first quadrant point;
and the first deleting unit is used for deleting the first target quadrant point from the landmark facility auxiliary information table and adding the first quadrant point to the landmark facility auxiliary information table.
In one embodiment of the present invention, further comprising:
and the third determining unit is used for determining that the data point is a second quadrant point if the east ray, the south ray and the polygon have no intersection points.
The second screening unit is used for screening out a second target quadrant point positioned in the landmark facility auxiliary information table according to the second quadrant point;
and a second deleting unit configured to delete the second target quadrant point from the landmark facility auxiliary information table, and add the second quadrant point to the landmark facility auxiliary information table.
In one embodiment of the present invention, further comprising:
and the fourth determining unit is used for determining that the data point is a third quadrant point if the south ray, the west ray and the polygon have no intersection points.
The third screening unit is used for screening out a third target quadrant point positioned in the landmark facility auxiliary information table according to the third quadrant point;
and a third deleting unit configured to delete the third target quadrant point from the landmark facility auxiliary information table, and add the third quadrant point to the landmark facility auxiliary information table.
In one embodiment of the present invention, further comprising:
And a fifth determining unit, configured to determine that the data point is a fourth quadrant point if the western ray and the north ray have no intersection points with the polygon.
The fourth screening unit is used for screening out a fourth target quadrant point positioned in the landmark facility auxiliary information table according to the fourth quadrant point;
a fourth deleting unit configured to delete the fourth target quadrant point from the landmark facility auxiliary information table, and add the fourth quadrant point to the landmark facility auxiliary information table.
For the device embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and reference is made to the description of the method embodiments for relevant points.
The embodiment of the invention also provides electronic equipment, which comprises:
a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements each process of the above data processing method embodiment and can achieve the same technical effects; to avoid repetition, details are not repeated here.
The embodiment of the invention also provides a computer readable storage medium, on which a computer program is stored, which when executed by a processor, realizes the processes of the above data processing method embodiment, and can achieve the same technical effects, and in order to avoid repetition, the description is omitted here.
In this specification, the embodiments are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for the parts that are identical or similar, the embodiments may be referred to one another.
It will be apparent to those skilled in the art that embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the invention may take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal device to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal device, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiment and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Finally, it is further noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal device that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or terminal device comprising the element.
The data processing method, apparatus, device and storage medium provided by the present invention have been described in detail above. Specific examples are used herein to illustrate the principles and embodiments of the invention, and the description of the above embodiments is only intended to help in understanding the method and core ideas of the invention. Meanwhile, those skilled in the art may, in accordance with the ideas of the invention, make changes to the specific embodiments and the scope of application. In summary, the contents of this description should not be construed as limiting the present invention.

Claims (16)

1. A method of data processing, the method comprising:
obtaining a road test data file determined based on landmark facilities;
grouping the drive test data in the drive test data file to obtain a plurality of groups of drive test data;
respectively carrying out aggregation calculation on the multiple groups of drive test data to obtain multiple aggregation results;
respectively adding tile numbers to the aggregation results to obtain a plurality of target data points;
and matching the target data point with the outline of the landmark according to the improved cross-ray method to obtain an updated landmark and data file association table.
2. The method of claim 1, wherein grouping the drive test data in the drive test data file to obtain multiple sets of drive test data comprises:
Acquiring the instantaneous speed and the average speed of data points in the drive test data;
determining the state of test equipment according to the instantaneous speed and the average speed of the drive test data;
and storing the drive test data to a corresponding array according to the state of the test equipment.
3. The method of claim 2, wherein determining the state of the test device based on the instantaneous and average speeds of the data points in the drive test data comprises:
if the instantaneous speed is less than a first instantaneous speed threshold and the average speed is less than an average speed threshold, the test equipment is in a stationary state;
the storing the drive test data to a corresponding array according to the state of the test equipment includes:
and if the test equipment is in a static state, storing the drive test data into a first array.
4. A method according to claim 3, further comprising:
if the instantaneous speed is greater than or equal to a second instantaneous speed threshold, the test device is in a fast fading state, wherein the first instantaneous speed threshold is less than the second instantaneous speed threshold;
the storing the drive test data to a corresponding array according to the state of the test equipment includes:
and if the test equipment is in a fast fading state, storing the drive test data into a second array.
5. The method as recited in claim 4, further comprising:
if the instantaneous speed is greater than or equal to a first instantaneous speed threshold and less than a second instantaneous speed threshold, or the average speed is greater than the average speed threshold, the test device is in a slow fading state;
the storing the drive test data to a corresponding array according to the state of the test equipment includes:
and if the test equipment is in a slow fading state, storing the drive test data into a third array.
6. The method of claim 1, wherein the aggregating the plurality of sets of drive test data to obtain a plurality of aggregate results comprises:
and calculating, for the plurality of groups of drive test data, an aggregation result corresponding to each group of drive test data by adopting an aggregation function, wherein the aggregation function comprises one of: summation, maximum, average, and first valid value.
7. The method of claim 1, wherein adding tile numbers to the plurality of aggregate results, respectively, results in a plurality of target data points, comprising:
For the aggregation results, longitude and latitude of each aggregation result are respectively obtained;
according to the longitude and latitude of each aggregation result, calculating the tile number corresponding to each aggregation result;
and adding corresponding tile numbers to each aggregation result respectively to obtain a plurality of target data points.
8. The method of claim 1, wherein matching the plurality of target data points to contours of landmarks according to the modified cross-ray method to obtain an updated landmark-to-data file association table comprises:
acquiring longitude and latitude ranges of landmark facilities;
screening the target data points according to the latitude and longitude range of the landmark facility;
judging whether the screened data points are positioned in a polygon defined by a landmark or not by an improved cross-ray method, wherein the polygon defined by the landmark is determined according to contour information in a landmark information table, and the contour information comprises longitudes and latitudes of a plurality of points on the polygon defined by the contour;
and if the data point is positioned in the polygon defined by the landmark, matching the outline of each type of landmark facility information to generate a new landmark and data file association table.
9. The method of claim 8, wherein determining whether the screened data point is within the landmark-defined polygon by the modified cross-ray method comprises:
an east ray is led out from the screened data points, and the number of intersection points of the east ray and the polygon is calculated;
if the number of intersections is an odd number, the screened data points are determined to be positioned in the polygon defined by the landmark.
10. The method as recited in claim 9, further comprising:
if the number of the intersection points is not an odd number, leading out the south, west and north rays from the screened data points, and respectively calculating the corresponding number of the intersection points;
if the north ray and the east ray led out by the data point have no intersection point with the polygon, and the south ray and the west ray led out have intersection points with the polygon, determining the data point as a first quadrant point;
screening out a first target quadrant point positioned in the landmark facility auxiliary information table according to the first quadrant point;
deleting the first target quadrant point from the landmark facility auxiliary information table, and adding the first quadrant point to the landmark facility auxiliary information table.
11. The method as recited in claim 10, further comprising:
if the east ray, the south ray and the polygon have no intersection points, the data point is a second quadrant point;
screening out a second target quadrant point positioned in the landmark facility auxiliary information table according to the second quadrant point;
deleting the second target quadrant point from the landmark facility auxiliary information table, and adding the second quadrant point to the landmark facility auxiliary information table.
12. The method as recited in claim 10, further comprising:
if the south ray, the west ray and the polygon have no intersection points, the data point is a third quadrant point;
screening out a third target quadrant point positioned in the landmark facility auxiliary information table according to the third quadrant point;
deleting the third target quadrant point from the landmark facility auxiliary information table, and adding the third quadrant point to the landmark facility auxiliary information table.
13. The method as recited in claim 10, further comprising:
if neither the west ray nor the north ray has an intersection point with the polygon, determining the data point as a fourth quadrant point;
screening out a fourth target quadrant point positioned in the landmark facility auxiliary information table according to the fourth quadrant point;
deleting the fourth target quadrant point from the landmark facility auxiliary information table, and adding the fourth quadrant point to the landmark facility auxiliary information table.
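The ray tests recited in claims 9 to 13 can be illustrated with a minimal Python sketch. It is not the claimed implementation: the function names `count_crossings` and `classify`, the (x, y) = (longitude, latitude) point layout, the half-open edge rule, and the order in which the quadrant conditions are evaluated are all assumptions made for illustration. The update of the landmark facility auxiliary information table is omitted because the screening criterion is not specified in the claims above.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # assumed layout: (x, y) = (longitude, latitude)


def count_crossings(p: Point, polygon: List[Point], direction: str) -> int:
    """Count polygon edges crossed by an axis-aligned ray cast from p.

    direction is one of "east", "west", "north", "south".
    """
    px, py = p
    count = 0
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if direction in ("east", "west"):
            # A horizontal ray can only cross an edge that straddles y = py.
            if (y1 > py) != (y2 > py):
                x_at = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
                if (direction == "east" and x_at > px) or (
                    direction == "west" and x_at < px
                ):
                    count += 1
        else:
            # A vertical ray can only cross an edge that straddles x = px.
            if (x1 > px) != (x2 > px):
                y_at = y1 + (px - x1) * (y2 - y1) / (x2 - x1)
                if (direction == "north" and y_at > py) or (
                    direction == "south" and y_at < py
                ):
                    count += 1
    return count


def classify(p: Point, polygon: List[Point]) -> str:
    """Classify p as inside the polygon or as a quadrant point outside it."""
    east = count_crossings(p, polygon, "east")
    if east % 2 == 1:
        return "inside"  # odd number of crossings on the east ray
    south = count_crossings(p, polygon, "south")
    west = count_crossings(p, polygon, "west")
    north = count_crossings(p, polygon, "north")
    # Conditions are checked in claim order (an assumption of this sketch).
    if north == 0 and east == 0 and south > 0 and west > 0:
        return "first"
    if east == 0 and south == 0:
        return "second"
    if south == 0 and west == 0:
        return "third"
    if west == 0 and north == 0:
        return "fourth"
    return "unclassified"


# Hypothetical landmark outline: a right triangle.
triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
print(classify((1.0, 1.0), triangle))
print(classify((3.0, 3.0), triangle))
```

Because the two-ray conditions overlap for points that no ray connects to the polygon, the evaluation order decides the label in those cases; a production version would need the tie-breaking rule from the full description.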
14. A data processing apparatus, the apparatus comprising:
the acquisition module is used for acquiring the drive test data file determined based on the landmark facilities;
the grouping module is used for grouping the drive test data in the drive test data file to obtain a plurality of groups of drive test data;
the computing module is used for respectively carrying out aggregation computation on the plurality of groups of drive test data to obtain a plurality of aggregation results;
the numbering module is used for respectively adding tile numbers to the aggregation results to obtain a plurality of target data points;
and the matching module is used for matching the target data point with the outline of the landmark according to the improved cross ray method to obtain an updated landmark and data file association table.
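The grouping, computing and numbering modules of claim 14 can be sketched as a small Python pipeline. Everything specific here is an assumption for illustration: the sample fields `lat`, `lon` and `rsrp`, grouping by rounded coordinates, the mean as the aggregation function, and the use of the common Web Mercator XYZ scheme for tile numbers. The claims do not state which grouping key, aggregation or tiling scheme is actually used.

```python
import math
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple


def tile_number(lat: float, lon: float, zoom: int) -> Tuple[int, int]:
    """Return the (x, y) tile index in the Web Mercator XYZ scheme (assumed)."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    return x, y


def preprocess(samples: List[Dict], zoom: int = 16, precision: int = 4) -> List[Dict]:
    """Group drive test samples, aggregate each group, and attach a tile number."""
    # Grouping: samples whose rounded coordinates match share a group (assumed key).
    groups = defaultdict(list)
    for s in samples:
        key = (round(s["lat"], precision), round(s["lon"], precision))
        groups[key].append(s["rsrp"])

    # Aggregation and numbering: one target data point per group.
    points = []
    for (lat, lon), values in groups.items():
        points.append(
            {
                "lat": lat,
                "lon": lon,
                "rsrp_mean": mean(values),
                "count": len(values),
                "tile": tile_number(lat, lon, zoom),
            }
        )
    return points


# Hypothetical samples: the first two fall into the same group.
samples = [
    {"lat": 31.23041, "lon": 121.47371, "rsrp": -90},
    {"lat": 31.23043, "lon": 121.47372, "rsrp": -100},
    {"lat": 31.24000, "lon": 121.48000, "rsrp": -85},
]
target_points = preprocess(samples)
print(len(target_points))
```

The resulting target data points would then be passed to the matching step, which compares each point against the landmark outline.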
15. An electronic device, comprising: a processor, a memory, and a computer program stored on the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the data processing method according to any one of claims 1 to 13.
16. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the data processing method according to any one of claims 1 to 13.
CN202311035409.2A 2023-08-16 2023-08-16 Data processing method, device, equipment and storage medium Pending CN117194596A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311035409.2A CN117194596A (en) 2023-08-16 2023-08-16 Data processing method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311035409.2A CN117194596A (en) 2023-08-16 2023-08-16 Data processing method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117194596A true CN117194596A (en) 2023-12-08

Family

ID=89002606

Family Applications (1)

Application Number Status Publication Number Priority Date Filing Date Title
CN202311035409.2A Pending CN117194596A (en) 2023-08-16 2023-08-16 Data processing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117194596A (en)

Similar Documents

Publication Publication Date Title
CN103035123B (en) Abnormal data acquisition methods and system in a kind of traffic track data
CN107657637A (en) A kind of agricultural machinery working area acquisition methods
CN108375363B (en) Antenna azimuth deflection checking method, device, equipment and medium
WO2017211150A1 (en) Processing method and device for storing fingerprint data in library
CN112506972B (en) User resident area positioning method and device, electronic equipment and storage medium
CN103929719A (en) Information locating optimization method and device
CN112381906B (en) Automatic drawing method for bus model basic line network
CN111024098A (en) Motor vehicle path fitting algorithm based on low-sampling data
CN111309945A (en) Method and system for accurately classifying inspection pictures of unmanned aerial vehicle
CN115439753A (en) Steep river bank identification method and system based on DEM
CN117194596A (en) Data processing method, device, equipment and storage medium
CN108243426B (en) Method and server for determining wireless base station demand point
CN115391069B (en) Parallel communication method and system based on ocean mode ROMS
CN114885369A (en) Network coverage quality detection processing method and device, electronic equipment and storage medium
CN110688436A (en) Improved GeoHash road clustering method based on driving track
CN113207170B (en) Position fusion correction method based on multi-source signaling
CN110455295B (en) Automatic planning method for river channel shipping route
CN109947877B (en) Method and system for improving map positioning precision of GIS mobile terminal
CN114396892A (en) Method for measuring curvature of curve track of track traffic
CN110475198B (en) Urban road user track deviation correction processing method and device
CN115033652A (en) Interest point aggregation method, device and system and related products
CN111291019B (en) Similarity discrimination method and device for data model
CN106022374A (en) Method and device for classifying historical process data
CN112464970A (en) Regional value evaluation model processing method and device and computing equipment
CN111737381B (en) Regional land parcel overlapping identification and overlapping area calculation method based on space-time big data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination