CN114912689A - Map grid index and XGBOOST-based over-limit vehicle destination prediction method and system


Publication number
CN114912689A
CN114912689A CN202210552848.XA CN202210552848A CN114912689A CN 114912689 A CN114912689 A CN 114912689A CN 202210552848 A CN202210552848 A CN 202210552848A CN 114912689 A CN114912689 A CN 114912689A
Authority
CN
China
Prior art keywords
vehicle
travel
data
destination
mileage
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210552848.XA
Other languages
Chinese (zh)
Inventor
陈爱伟
万剑
陈玉飞
党倩
马宇飞
杨羚
谢斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Design Group Co Ltd
Original Assignee
China Design Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Design Group Co Ltd
Priority to CN202210552848.XA
Publication of CN114912689A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2474 Sequence data queries, e.g. querying versioned data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/248 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29 Geographical information databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2148 Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the process organisation or structure, e.g. boosting cascade
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services
    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/01 Detecting movement of traffic to be counted or controlled
    • G08G1/0104 Measuring and analyzing of parameters relative to traffic conditions
    • G08G1/0125 Traffic data processing
    • G08G1/0129 Traffic data processing for creating historical data or processing based on historical data
    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/01 Detecting movement of traffic to be counted or controlled
    • G08G1/0104 Measuring and analyzing of parameters relative to traffic conditions
    • G08G1/0137 Measuring and analyzing of parameters relative to traffic conditions for specific applications
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Abstract

The invention discloses an overrun vehicle destination prediction method and system based on map grid index and XGBOOST. The prediction method comprises the following steps: preprocessing the satellite navigation positioning data of a vehicle and eliminating abnormal records that are incomplete, erroneous, or discrete; identifying vehicle stopping points from stopping characteristics, extracting the longitude and latitude of each trip's start and end points and the continuous path information according to time-sequence characteristics, and extracting features such as the average speed and mileage of the latest 5 trips; gridding the map data, establishing a spatial index, and quickly matching the start and end points of each trip to their grid index numbers; and, taking the sample trip features as input, constructing an XGBOOST-based vehicle travel destination prediction model that predicts the grid index number of the vehicle's travel destination. By gridding the geospatial information, the invention solves the problem of expressing the destination quantitatively and can accurately predict the travel destination of an overrun vehicle.

Description

Map grid index and XGBOOST-based overrun vehicle destination prediction method and system
Technical Field
The invention belongs to the field of intelligent traffic technology application, and particularly relates to an overrun vehicle destination prediction method and system based on map grid index and XGBOOST.
Background
Overrun and overloaded vehicles greatly degrade the service performance of highway infrastructure and put heavy pressure on maintenance work. They also readily cause road traffic accidents with a high fatality rate, and so pose a serious traffic safety hazard. With the rapid advance of traffic infrastructure construction and the continuous expansion of its overall scale, the task of controlling overloading becomes increasingly heavy. Traditional enforcement modes, such as 'human-wave tactics' and passive 'wait by the stump for the rabbit' inspection, are labor-intensive and inefficient, and overload enforcement faces great pressure.
At present, transportation law enforcement departments hold lists of vehicles engaged in illegal overloaded transport, in particular lists of 'hundred-ton king' vehicles with serious overloading behavior. Although existing overload detection stations can obtain a vehicle's position at a given point in time, the vehicle's travel destination is uncertain and its speed is high, so the vehicle is difficult to track after it has been detected. How to deploy enforcement resources and intercept overloaded vehicles is therefore a difficult problem for law enforcement departments.
Scholars and institutions in China and abroad have studied the prediction of vehicle travel destinations. For bus commuters, researchers have proposed a destination prediction method based on XGBOOST and map correction. Because bus routes and stops are fixed, the destination is relatively clear, and that method cannot be used to predict the travel destination of illegally overrun and overloaded vehicles. For car trips, some researchers have studied XGBOOST-based destination prediction, but the destination is expressed directly as longitude and latitude, the prediction accuracy of the algorithm is only 0.61, and the method cannot be used in real scenarios.
Disclosure of Invention
The invention aims to provide an overrun vehicle destination prediction method and system based on map grid index and XGBOOST, in order to solve problems such as the difficulty of expressing a vehicle's travel destination quantitatively and the low accuracy of destination prediction.
The technical solution for realizing the purpose of the invention is as follows: in a first aspect, the invention provides an overrun vehicle destination prediction method based on map grid index and XGBOOST, comprising:
acquiring satellite navigation positioning data of an overrun overloaded vehicle, preprocessing the data, and removing abnormal data with incomplete, wrong and discrete information in a data set;
extracting a vehicle stopping point and residence time in the satellite navigation positioning data according to the vehicle stopping characteristics; extracting the longitude and latitude and continuous path information of the starting and ending points of the journey according to the time sequence characteristics, and extracting the running characteristics of the latest N journeys according to the speed and mileage parameters of the positioning data, wherein the running characteristics comprise average running speed and running mileage;
gridding the map data of the administrative area of application, establishing a spatial index, mapping the vehicle positioning data onto the map grid, and quickly matching and extracting the grid index number of the trip end point based on the longitude and latitude of the start and end points, so as to extract and supplement the end-point grid number feature of the vehicle trip;
using the extracted vehicle travel characteristics as input to carry out model training by using XGBOOST to obtain a classifier, wherein the classifier is a prediction model of the illegal overrun overload vehicle traveling destination;
and extracting current travel features based on the real-time travel data of the vehicle, predicting the travel destination of the vehicle by taking the real-time travel features as input, and predicting to obtain the grid number of the travel destination of the vehicle.
In one embodiment, the satellite navigation positioning data of the overrun overloaded vehicle mainly comprises the following fields: operator number, longitude, latitude, speed, altitude, azimuth, terminal time, loading status, positioning status, alarm, receiving time, driving recorder speed, and total mileage. A record that lacks any of these fields is regarded as data with incomplete information. A positioning point whose longitude and latitude together place it outside the territory of China is regarded as erroneous data. A point is regarded as an abnormal track jump point when the distance between any 5 consecutive positioning points exceeds the 2-kilometer threshold. All such abnormal data need to be eliminated.
In one embodiment, the vehicle stopping points and dwell times used for vehicle trip feature extraction are extracted as follows. Stopping points are identified by direct mileage difference calculation. In the preliminary judgment, only the mileage difference between two adjacent points is compared; points whose difference is less than 2 kilometers are preliminarily judged to be stopping points, and the results are merged after this first pass to obtain all candidate stopping points. A detailed judgment follows: the average speed between two adjacent points is calculated, and a stopping point is judged to be in the parked state if that speed is lower than 1 km/h. All stopping points within a range of 2 kilometers are then aggregated, the dwell time of the vehicle in each area is calculated from the aggregated points in time order, and only areas with a dwell of more than 1 hour are retained, which filters out stops caused by congestion.
In one embodiment, vehicle trip feature extraction comprises extracting travel features such as average travel speed and travel mileage. Once the stopping points have been identified, the vehicle's track is divided into stopping points and trips. The longitude and latitude of each trip's start and end points and the continuous path information are extracted according to time-sequence characteristics, and travel features such as average travel speed and travel mileage are extracted from the speed and mileage parameters of the positioning data. The travel features extracted by the method are those of the latest 5 trips.
Average trip speed: the average speed of each trip is the ratio of the trip's mileage to its travel time, and the average speed over several trips is obtained by arithmetic averaging. Depending on the length of the statistical period, the average trip speed yields three features: the average speed over the past month, the average speed over the past week, and the average speed of the last trip.
$$v_{k(i,j)} = \frac{d_{i,j}}{t_{i,j}}$$
$$\bar{v} = \frac{1}{n}\sum_{k=1}^{n} v_{k(i,j)}$$
where $k(i,j)$ is the $k$-th trip, which starts at stopping point $j$ and ends at the next stopping point $i$; $d_{i,j}$ and $t_{i,j}$ are the mileage and travel time of that trip; $v_{k(i,j)}$ is the average speed of trip $k$; $n$ is the number of trips of a given vehicle; and $\bar{v}$ is the average speed over the $n$ trips of that vehicle.
Average trip mileage: the mileage $d_{i,j}$ of each trip can be determined, and the average mileage over several trips of a given vehicle is obtained by arithmetic averaging.
In one embodiment, the satellite positioning data is defined as points in a two-dimensional plane:
$$p_i = \{lon_i, lat_i\},$$
where $lon_i$ denotes longitude and $lat_i$ denotes latitude.
The area covered by the map can be mapped simply onto a two-dimensional plane with longitude and latitude as coordinate axes:
$$R^2 = \{(lon, lat) \mid lon \in R^+, lat \in R^+\}$$
The map is then divided into grids of equal size by straight lines parallel to the coordinate axes:
$$R_i = \{lon_{max}, lon_{min}, lat_{max}, lat_{min}\}$$
where $lon_{max}$, $lon_{min}$, $lat_{max}$, $lat_{min}$ are the right, left, upper, and lower boundaries of the cell, respectively. The division yields the set of cells $S = \{R_1, R_2, R_3, \ldots, R_{i-1}, R_i\}$, which defines the mapping $F: R^2 \to S$.
In one embodiment, the grid index number of the trip end point is quickly matched and extracted based on the longitude and latitude of the start and end points, thereby supplementing the end-point grid number feature of the vehicle trip. The longitude and latitude of the trip destination are matched against the longitude and latitude range of each grid, and the index number of the matching grid is assigned to the trip destination.
In one embodiment, the XGBOOST-based prediction model is constructed.
The XGBOOST objective function at step $t$ is defined as
$$Obj^{(t)} \approx \sum_{i=1}^{n}\left[g_i f_t(x_i) + \frac{1}{2} h_i f_t^2(x_i)\right] + \Omega(f_t)$$
where $g_i$ is the first derivative of the loss function, $h_i$ is the second derivative of the loss function, $f_t$ is the $t$-th base model, and $\Omega$ is the regularization term of the model. The method therefore computes the first and second derivatives of the loss function at each step, optimizes the objective function to obtain $f_t(x)$ at each step, and finally obtains the prediction model as an additive model.
In one embodiment, the vehicle travel destination is predicted with the prediction model. The current trip features are extracted from the vehicle's real-time travel data and used as input to the model, which outputs the grid number of the vehicle's travel destination. This completes the destination prediction: the destination is represented by a grid number, which can be mapped back onto the map for visualization.
In a second aspect, the present invention provides an overrun vehicle destination prediction system based on map grid index and XGBOOST, comprising:
the data preprocessing module is used for acquiring the satellite navigation positioning data of the overrun overloaded vehicle, carrying out preprocessing work on the data and eliminating abnormal data with incomplete, wrong and discrete information in the data set;
the vehicle travel characteristic extraction module is used for extracting vehicle stopping points and residence time in the satellite navigation positioning data according to vehicle stopping characteristics; extracting longitude and latitude and continuous path information of the starting and ending points of the travel according to the time sequence characteristics, and extracting the running characteristics of the latest N travels according to the speed and mileage parameters of the positioning data; the running characteristics comprise average running speed and mileage;
the map data gridding and index building module is used for gridding the map data of the administrative area of application, establishing a spatial index, mapping the vehicle positioning data onto the map grid, and quickly matching and extracting the grid index number of the trip end point based on the longitude and latitude of the start and end points, so as to extract and supplement the end-point grid number feature of the vehicle trip;
the prediction model construction module is used for performing model training by using XGBOOST with the extracted vehicle travel characteristics as input to obtain a classifier, and the classifier is a prediction model of the illegal overrun overload vehicle traveling destination;
and the vehicle running destination prediction module extracts current travel characteristics based on the real-time vehicle running data, predicts the vehicle running destination by taking the real-time travel characteristics as input, and predicts the grid number of the vehicle running destination.
Compared with the prior art, the invention has the following remarkable advantages: (1) the method carries out quantitative expression on the destination, can realize quantitative expression of irregular spatial data, and greatly improves the accuracy of prediction; (2) by combining with the extraction of vehicle running characteristics, the interpretability of a prediction result can be remarkably improved, visualization can be quickly realized through mapping after quantitative expression of a destination, and the application practicability of the method is improved.
Drawings
FIG. 1 is a flow chart of a prediction method according to the present invention.
FIG. 2 is a schematic diagram of map meshing and indexing in an embodiment.
Fig. 3 is a diagram showing a destination prediction result in the embodiment.
Detailed Description
The invention provides an overrun vehicle destination prediction method based on map grid index and XGBOOST, which comprises the following specific processing steps:
1) data pre-processing
Acquire the satellite navigation positioning data of the overrun overloaded vehicle, preprocess the data, and eliminate abnormal data in the data set, such as records that are incomplete, erroneous, or discrete.
The satellite navigation positioning data of the overrun overloaded vehicle mainly comprises the following fields: operator number, longitude, latitude, speed, altitude, azimuth, terminal time, loading status, positioning status, alarm, receiving time, driving recorder speed, and total mileage. A record that lacks any field is regarded as data with incomplete information; a positioning point whose longitude and latitude place it outside the territory of China is regarded as erroneous data; and a point is regarded as an abnormal jump point when the distance between any 5 consecutive positioning points is greater than 2 kilometers. These abnormal data are eliminated first.
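The three cleaning rules can be sketched in Python as below. This is an illustration under stated assumptions, not the patent's implementation: the field keys, the rectangular bounding box used for "inside China", and the simplified jump rule (each point is compared only with the last accepted point, rather than over a window of 5 consecutive points) are all hypothetical choices.

```python
import math

# Hypothetical field keys; the patent names the fields but not their identifiers.
REQUIRED_FIELDS = (
    "operator_id", "lon", "lat", "speed", "altitude", "azimuth",
    "terminal_time", "load_status", "fix_status", "alarm",
    "receive_time", "recorder_speed", "total_mileage",
)

# Rough bounding box for China (assumed values, not taken from the patent):
# lon_min, lon_max, lat_min, lat_max in decimal degrees.
CHINA_BBOX = (73.0, 136.0, 3.0, 54.0)


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance between two points in kilometers."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def is_complete(rec):
    """A record is complete when every required field is present and not None."""
    return all(rec.get(field) is not None for field in REQUIRED_FIELDS)


def in_china(rec):
    """Bounding-box test standing in for a real boundary check."""
    lon_min, lon_max, lat_min, lat_max = CHINA_BBOX
    return lon_min <= rec["lon"] <= lon_max and lat_min <= rec["lat"] <= lat_max


def clean(records, jump_km=2.0):
    """Drop incomplete records, out-of-range points, and track jumps.

    Simplified jump rule (an assumption): a point is dropped when it lies
    more than jump_km from the last accepted point.
    """
    kept = []
    for rec in records:
        if not is_complete(rec) or not in_china(rec):
            continue
        if kept and haversine_km(kept[-1]["lon"], kept[-1]["lat"],
                                 rec["lon"], rec["lat"]) > jump_km:
            continue
        kept.append(rec)
    return kept
```

Comparing against the last accepted point keeps the sketch short, but it assumes the first point of the track is valid; a production version would need a windowed rule closer to the one described above.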
2) Vehicle travel feature extraction
Extract the vehicle stopping points and dwell times from the satellite navigation positioning data according to the vehicle stopping characteristics; extract the longitude and latitude of each trip's start and end points and the continuous path information according to time-sequence characteristics; and extract travel features such as the average travel speed and travel mileage of the latest 5 trips from the speed and mileage parameters of the positioning data.
Stopping points are identified by direct mileage difference calculation. In the preliminary judgment, only the mileage difference between two adjacent points is compared; points whose difference is less than 2 kilometers are preliminarily judged to be stopping points, and the results are merged after this first pass to obtain all candidate stopping points. A detailed judgment follows: the average speed between two adjacent points is calculated, and a stopping point is judged to be in the parked state if that speed is lower than 1 km/h. All stopping points within a range of 2 kilometers are then aggregated, the dwell time of the vehicle in each area is calculated from the aggregated points in time order, and only areas with a dwell of more than 1 hour are retained, which filters out stops caused by congestion.
If the spatial distance between two track points is smaller than the set spatial displacement threshold and the time difference is larger than the set time threshold, the vehicle is judged to be parked:
$$d_{i,j} = Loc_i - Loc_j < d_{threshold}$$
$$t_{i,j} = T_i - T_j > t_{threshold}$$
where $i$ and $j$ are two consecutively recorded track points; $Loc_i$ and $Loc_j$ are the recorded mileages at those track points; $T_i$ and $T_j$ are the corresponding timestamps; $d_{threshold}$ is the spatial displacement threshold for parking, set to 2 kilometers in the method; and $t_{threshold}$ is the time threshold for parking to load or unload, set to 1 hour in the method.
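The two threshold conditions can be sketched as a single pass over a time-ordered track. The sketch assumes each track point is a `(timestamp, mileage_km)` pair and groups consecutive points whose recorded mileage stays within 2 km of the first point of the group; the function name, the tuple layout, and the grouping strategy are illustrative assumptions rather than the patent's procedure.

```python
from datetime import datetime, timedelta

D_THRESHOLD_KM = 2.0                # spatial displacement threshold for parking
T_THRESHOLD = timedelta(hours=1)    # time threshold for loading/unloading stops


def find_stops(track):
    """Return (start_time, end_time) pairs for dwell periods.

    track: time-ordered list of (timestamp: datetime, mileage_km: float).
    A group of consecutive points counts as a stop when the mileage stays
    within D_THRESHOLD_KM of the group's first point and the elapsed time
    exceeds T_THRESHOLD.
    """
    stops = []
    start = 0
    for i in range(1, len(track) + 1):
        # Keep extending the group while the vehicle has barely moved.
        if i < len(track) and track[i][1] - track[start][1] < D_THRESHOLD_KM:
            continue
        # The group track[start : i] has ended; test its dwell time.
        dwell = track[i - 1][0] - track[start][0]
        if dwell > T_THRESHOLD:
            stops.append((track[start][0], track[i - 1][0]))
        start = i
    return stops
```

Short groups, such as a vehicle held up for twenty minutes in congestion, fail the time test and are discarded, which is the filtering effect the 1-hour threshold is meant to achieve.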
Once the stopping points have been identified, the vehicle's track is divided into stopping points and trips. The longitude and latitude of each trip's start and end points and the continuous path information are extracted according to time-sequence characteristics, and travel features such as average travel speed and travel mileage are extracted from the speed and mileage parameters of the positioning data. The travel features extracted by the method are those of the latest 5 trips.
Average trip speed: the average speed of each trip is the ratio of the trip's mileage to its travel time, and the average speed over several trips is obtained by arithmetic averaging. Depending on the length of the statistical period, the average trip speed yields three features: the average speed over the past month, the average speed over the past week, and the average speed of the last trip.
$$v_{k(i,j)} = \frac{d_{i,j}}{t_{i,j}}$$
$$\bar{v} = \frac{1}{n}\sum_{k=1}^{n} v_{k(i,j)}$$
where $k(i,j)$ is the $k$-th trip, whose start point is parking point $j$ and whose end point, the next parking point, is $i$; $d_{i,j}$ and $t_{i,j}$ are the mileage and travel time of that trip; $v_{k(i,j)}$ is the average speed of trip $k$; $n$ is the number of trips of a given vehicle; and $\bar{v}$ is the average speed over the $n$ trips of that vehicle.
Average trip mileage: the mileage $d_{i,j}$ of each trip can be determined, and the average mileage over several trips of a given vehicle is obtained by arithmetic averaging.
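As a small worked example of these two features, the sketch below computes the per-trip average speed as mileage over travel time and then takes arithmetic means over the latest N trips. The function name, the `(distance_km, duration_h)` tuple layout, and the sample values are hypothetical.

```python
def recent_trip_features(trips, n=5):
    """Average speed and average mileage over the latest n trips.

    trips: list of (distance_km, duration_h) tuples, oldest first.
    """
    recent = trips[-n:]
    # Per-trip average speed: trip mileage divided by trip travel time.
    speeds = [distance / duration for distance, duration in recent]
    return {
        "avg_speed_kmh": sum(speeds) / len(speeds),
        "avg_mileage_km": sum(distance for distance, _ in recent) / len(recent),
    }


# Six made-up trips; only the latest five are used.
trips = [(100, 2), (60, 1), (90, 1.5), (30, 0.5), (120, 2), (80, 2)]
features = recent_trip_features(trips)
print(features)  # {'avg_speed_kmh': 56.0, 'avg_mileage_km': 76.0}
```

Note that the mean of per-trip speeds is not the same as total mileage divided by total time; the sketch follows the arithmetic mean of per-trip speeds described in the text.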
3) Map data gridding and index construction
Grid the map data of the administrative area of application, establish a spatial index, and quickly match and extract the grid index number of the trip end point based on the longitude and latitude of the start and end points, thereby supplementing the end-point grid number feature of the vehicle trip.
The map data is gridded and indexed as follows. The satellite positioning data is defined as points in a two-dimensional plane:
$$p_i = \{lon_i, lat_i\},$$
where $lon_i$ denotes longitude and $lat_i$ denotes latitude.
The area covered by the map can be mapped simply onto a two-dimensional plane with longitude and latitude as coordinate axes:
$$R^2 = \{(lon, lat) \mid lon \in R^+, lat \in R^+\}$$
The map is then divided into grids of equal size by straight lines parallel to the coordinate axes:
$$R_i = \{lon_{max}, lon_{min}, lat_{max}, lat_{min}\}$$
where $lon_{max}$, $lon_{min}$, $lat_{max}$, $lat_{min}$ are the right, left, upper, and lower boundaries of the cell, respectively. The division yields the set of cells $S = \{R_1, R_2, R_3, \ldots, R_{i-1}, R_i\}$, which defines the mapping $F: R^2 \to S$.
Taking Nanjing as an example, the city extends from 118°22' to 119°14' east longitude and from 31°14' to 32°37' north latitude, which can be converted to 118.3700-119.2300 and 31.2300-32.6200 in decimal degrees. The area is divided into grids of 0.01 degrees in both longitude and latitude, each covering about 1 square kilometer. This gives 86 grids per row and 139 grids per column; each grid is numbered, for a total of 86 × 139 = 11954 grids.
The grid index number of the trip end point is quickly matched and extracted based on the longitude and latitude of the start and end points, thereby supplementing the end-point grid number feature of the vehicle trip. The longitude and latitude of the trip destination are matched against the longitude and latitude range of each grid, and the index number of the matching grid is assigned to the trip destination.
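For a regular grid, the matching step reduces to integer arithmetic on the coordinates. The sketch below uses the Nanjing extent and the 0.01-degree step from the example above; the numbering scheme (1-based, row by row starting from the south-west corner) is an assumption, since the text does not specify how the grids are numbered.

```python
# Nanjing extent and grid step taken from the example in the text.
LON_MIN, LON_MAX = 118.37, 119.23
LAT_MIN, LAT_MAX = 31.23, 32.62
STEP = 0.01

N_COLS = round((LON_MAX - LON_MIN) / STEP)  # grids per row
N_ROWS = round((LAT_MAX - LAT_MIN) / STEP)  # grids per column


def grid_index(lon, lat):
    """Map a (lon, lat) point to a 1-based grid number, or None if outside.

    Assumed numbering: row-major, starting at the south-west corner.
    """
    if not (LON_MIN <= lon < LON_MAX and LAT_MIN <= lat < LAT_MAX):
        return None
    # min() guards against floating-point overshoot at the upper edges.
    col = min(int((lon - LON_MIN) / STEP), N_COLS - 1)
    row = min(int((lat - LAT_MIN) / STEP), N_ROWS - 1)
    return row * N_COLS + col + 1


print(N_COLS, N_ROWS, N_COLS * N_ROWS)  # 86 139 11954
print(grid_index(118.6395, 32.0871))    # 7337
```

Computing the index directly avoids scanning all 11954 cell ranges for every point, which is one way to realize the "quick matching" the method calls for; an explicit lookup table or a spatial index structure would serve equally well for irregular grids.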
4) Training sample generation
Through the feature extraction above, the feature data for model training is extracted from the training sample data. The features are: trip destination number, trip destination dwell time, average trip speed, average trip mileage, and so on, as shown in Table 1:
TABLE 1 characteristic items of prediction method
(Table 1 is provided as images in the original publication: Figures BDA0003655559620000071 and BDA0003655559620000081.)
5) Predictive model training based on XGBOOST
The extracted vehicle trip features are used as input to train a model with XGBOOST, yielding a classifier that serves as the prediction model for the travel destination of illegally overrun and overloaded vehicles. The XGBOOST objective function at step $t$ is defined as
$$Obj^{(t)} \approx \sum_{i=1}^{n}\left[g_i f_t(x_i) + \frac{1}{2} h_i f_t^2(x_i)\right] + \Omega(f_t)$$
where $g_i$ is the first derivative of the loss function, $h_i$ is the second derivative of the loss function, $f_t$ is the $t$-th base model, and $\Omega$ is the regularization term of the model. The method therefore computes the first and second derivatives of the loss function at each step, optimizes the objective function to obtain $f_t(x)$ at each step, and finally obtains the prediction model as an additive model.
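To make the roles of the first and second derivatives concrete, the sketch below computes them for a binary logistic loss and then evaluates the optimal weight of a single leaf. Both choices are assumptions for illustration: the patent does not state the loss function, and the leaf-weight formula $w^* = -G/(H+\lambda)$ holds for the usual XGBoost regularizer $\Omega(f) = \gamma T + \frac{1}{2}\lambda\sum_j w_j^2$, which the text does not spell out.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def grad_hess_logloss(y_true, raw_pred):
    """First and second derivatives of the logistic loss w.r.t. the raw score."""
    p = sigmoid(raw_pred)
    return p - y_true, p * (1.0 - p)


def optimal_leaf_weight(grads, hessians, reg_lambda=1.0):
    """Leaf weight minimizing the second-order objective: -G / (H + lambda)."""
    return -sum(grads) / (sum(hessians) + reg_lambda)


# Two positive samples whose current raw score is 0 (probability 0.5).
pairs = [grad_hess_logloss(1, 0.0) for _ in range(2)]
grads = [g for g, _ in pairs]       # [-0.5, -0.5]
hessians = [h for _, h in pairs]    # [0.25, 0.25]
print(optimal_leaf_weight(grads, hessians))  # 0.666... = 1.0 / 1.5
```

The positive leaf weight pushes the raw score of both samples upward, toward the correct class; each boosting step adds one such correction, which is the additive model described above.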
6) Vehicle travel destination prediction based on prediction model
The current trip features are extracted from the vehicle's real-time travel data and used as input to the model, which outputs the grid number of the vehicle's travel destination. This completes the destination prediction: the destination is represented by a grid number, which can be mapped back onto the map for visualization.
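Mapping a predicted grid number back onto the map is the inverse of the grid indexing step. The sketch below recovers a cell's boundaries and center for the Nanjing example (origin 118.37, 31.23; step 0.01 degrees; 86 grids per row), assuming 1-based, row-major numbering from the south-west corner; that numbering scheme and the function names are hypothetical.

```python
# Grid origin, step, and row length from the Nanjing example in the text.
LON_MIN, LAT_MIN = 118.37, 31.23
STEP = 0.01
N_COLS = 86


def grid_bounds(grid_no):
    """Return (lon_min, lon_max, lat_min, lat_max) of a 1-based grid number."""
    row, col = divmod(grid_no - 1, N_COLS)
    lon_left = LON_MIN + col * STEP
    lat_bottom = LAT_MIN + row * STEP
    return lon_left, lon_left + STEP, lat_bottom, lat_bottom + STEP


def grid_center(grid_no):
    """Center point of a grid cell, e.g. for placing a map marker."""
    lon_left, lon_right, lat_bottom, lat_top = grid_bounds(grid_no)
    return (lon_left + lon_right) / 2, (lat_bottom + lat_top) / 2


lon, lat = grid_center(7337)
print(round(lon, 4), round(lat, 4))  # 118.635 32.085
```

The cell boundaries can be drawn as a rectangle on the map, so a predicted grid number translates directly into an area of roughly one square kilometer.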
Based on the same inventive concept, the invention also provides an overrun vehicle destination prediction system based on map grid index and XGBOOST, which comprises:
the data preprocessing module is used for acquiring the satellite navigation positioning data of the overrun overloaded vehicle, carrying out preprocessing work on the data and eliminating abnormal data with incomplete, wrong and discrete information in the data set;
the vehicle travel characteristic extraction module is used for extracting vehicle stopping points and residence time in the satellite navigation positioning data according to vehicle stopping characteristics; extracting the longitude and latitude and continuous path information of the starting and ending points of the journey according to the time sequence characteristics, and extracting the running characteristics of the latest N journeys according to the speed and mileage parameters of the positioning data; the running characteristics comprise average running speed and mileage;
the map data gridding and index building module is used for gridding the map data of the administrative area of application, building a spatial index, mapping vehicle positioning data to the map grid, and quickly matching and extracting the grid index number of the trip end point based on the longitude and latitude of the start and end points, so as to extract and supplement the end-point grid number feature of the vehicle trip;
the prediction model construction module is used for performing model training by using XGBOOST with the extracted vehicle travel characteristics as input to obtain a classifier, and the classifier is a prediction model of the illegal overrun overload vehicle traveling destination;
and the vehicle running destination prediction module extracts current travel characteristics based on the real-time vehicle running data, predicts the vehicle running destination by taking the real-time travel characteristics as input, and predicts the grid number of the vehicle running destination.
The specific implementation manner of each module in the prediction system is the same as that of each step of the prediction method, and is not described herein again.
The present invention will be described in detail with reference to the following examples and drawings.
Examples
(1) Training and test data
The samples consist of vehicle-mounted satellite navigation positioning data from the last week of September 2020. The total number of samples is 17555; splitting them 80% for training and 20% for testing gives 14044 training samples and 3511 test samples.
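The split sizes can be checked with a few lines of code. This is a minimal sketch that assumes a random shuffle before splitting; the patent states only the 80%/20% proportions, so the shuffling, the seed and the name `split_samples` are illustrative.

```python
import random


def split_samples(samples, train_percent=80, seed=0):
    """Shuffle the samples and split them into (train, test) by percentage."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    # Integer arithmetic avoids floating-point rounding in the split size.
    n_train = len(shuffled) * train_percent // 100
    return shuffled[:n_train], shuffled[n_train:]


train, test = split_samples(range(17555))
print(len(train), len(test))  # 14044 3511
```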
(2) Vehicle travel feature extraction example
TABLE 2 Example of vehicle trip feature extraction
Feature No. | License plate number | 苏A989 | 苏A989 | 苏A989
1 | Trip start longitude | 118.6395 | 118.4610 | 118.6090
2 | Trip start latitude | 32.0871 | 32.1825 | 31.8946
3 | Trip start time | 20200901084921 | 20200902084351 | 20200903093149
4 | Trip start mileage (km) | 28655 | 28872 | 29010
5 | Trip start grid number | 26 | 165 | 276
6 | Trip end longitude | 118.4610 | 118.7952 | 118.8886
7 | Trip end latitude | 32.1825 | 31.8677 | 31.8104
8 | Trip end time | 20200902084351 | 20200902124312 | 20200903153710
9 | Trip end mileage (km) | 28872 | 28956 | 29076
10 | Trip end grid number | 165 | 226 | 334
11 | Travel time (h) | 23.9083 | 3.9892 | 6.0892
12 | Mileage driven (km) | 217 | 84 | 66
13 | Average travel speed (km/h) | 9.08 | 21.06 | 10.84
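Features 11 to 13 follow directly from features 3, 4, 8 and 9. The sketch below recomputes them for the first trip in the table, assuming timestamps in `YYYYMMDDhhmmss` form and mileage in kilometers; the function name `trip_features` is illustrative.

```python
from datetime import datetime

TIME_FORMAT = "%Y%m%d%H%M%S"  # timestamp layout used in the table


def trip_features(start_time, end_time, start_mileage, end_mileage):
    """Return (travel time in hours, mileage driven, average speed) for one trip."""
    start = datetime.strptime(start_time, TIME_FORMAT)
    end = datetime.strptime(end_time, TIME_FORMAT)
    hours = (end - start).total_seconds() / 3600
    mileage = end_mileage - start_mileage
    return round(hours, 4), mileage, round(mileage / hours, 2)


# First trip of the table: start and end time, start and end mileage.
print(trip_features("20200901084921", "20200902084351", 28655, 28872))
# (23.9083, 217, 9.08)
```

The same call reproduces the other two columns of the table (3.9892 h, 84, 21.06 and 6.0892 h, 66, 10.84).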
(3) Map gridding and index construction
The prediction takes Nanjing as an example; the map gridding and index construction are shown in Fig. 2.
(4) Training sample and data set generation
TABLE 3 Training samples and data set generation
Figure BDA0003655559620000091
Figure BDA0003655559620000101
(5) Model prediction result comparison analysis
A comparison of the training results shows that XGBOOST has a clear advantage over the random forest (RF) and logistic regression models in accuracy, precision, and recall. The XGBOOST model is therefore better suited to the destination prediction problem. The prediction results are visualized in Fig. 3, and sample prediction outputs of the XGBOOST model on test samples are shown in Table 4.
TABLE 4 Sample data of model output results
Figure BDA0003655559620000102
Figure BDA0003655559620000111
TABLE 5 Comparison of model test results
Model | Accuracy | Precision | Recall
XGBOOST | 98.55% | 98.12% | 98.55%
RF | 65.86% | 64.81% | 65.86%
Logistic regression | 43.25% | 27.61% | 43.25%
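In every row of Table 5 the accuracy and recall values coincide. That is what happens when multi-class recall is averaged with weights proportional to class support, because support-weighted recall is algebraically equal to accuracy. The patent does not say how its metrics were averaged, so the sketch below is an assumption, and the labels in it are made-up grid numbers rather than data from the patent.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return (accuracy, support-weighted precision, support-weighted recall)."""
    n = len(y_true)
    support = Counter(y_true)      # true samples per class
    predicted = Counter(y_pred)    # predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(correct.values()) / n
    precision = sum(
        support[c] * (correct[c] / predicted[c] if predicted[c] else 0.0)
        for c in support
    ) / n
    recall = sum(support[c] * (correct[c] / support[c]) for c in support) / n
    return accuracy, precision, recall


# Made-up destination grid numbers, for illustration only.
y_true = [165, 165, 226, 226, 334, 334]
y_pred = [165, 226, 226, 226, 334, 165]
acc, prec, rec = weighted_metrics(y_true, y_pred)
print(f"accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f}")
```

Here accuracy and recall come out equal while precision differs, which is the same pattern as in the table.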
The above embodiments express only several implementations of the present application, and although their description is relatively specific and detailed, they should not be construed as limiting the scope of the present application. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, and these all fall within the scope of protection of the present application. Therefore, the scope of protection of the present application shall be subject to the appended claims.

Claims (10)

1. An overrun vehicle destination prediction method based on map grid index and XGBOOST, characterized by comprising the following steps:
acquiring satellite navigation positioning data of an overrun overloaded vehicle, preprocessing the data, and removing abnormal data with incomplete, wrong and discrete information in a data set;
extracting a vehicle stopping point and residence time in the satellite navigation positioning data according to the vehicle stopping characteristics; extracting the longitude and latitude and continuous path information of the starting and ending points of the journey according to the time sequence characteristics, and extracting the running characteristics of the latest N journeys according to the speed and mileage parameters of the positioning data, wherein the running characteristics comprise average running speed and running mileage;
gridding the map data of the administrative area of application, establishing a spatial index, mapping vehicle positioning data to the map grid, and quickly matching and extracting the grid index number of the trip end point based on the longitude and latitude of the start and end points, so as to extract and supplement the end-point grid number feature of the vehicle trip;
using the extracted vehicle travel characteristics as input to carry out model training by using XGBOOST to obtain a classifier, wherein the classifier is a prediction model of the illegal overrun overload vehicle traveling destination;
and extracting current travel characteristics based on the real-time travel data of the vehicle, predicting the travel destination of the vehicle by taking the real-time travel characteristics as input, and predicting to obtain the grid number of the travel destination of the vehicle.
2. The map grid index and XGBOOST based over-limit vehicle destination prediction method of claim 1 wherein the over-limit overloaded vehicle satellite navigation positioning data mainly comprises the following fields: operator number, longitude, latitude, speed, altitude, azimuth, terminal time, loading state, positioning state, alarm, reception time, vehicle speed, total mileage.
3. The map grid index and XGBOOST based over-limit vehicle destination prediction method of claim 1, wherein the decision rule for rejecting anomalous data is: a record lacking any field is regarded as data with incomplete information; a positioning point whose longitude and latitude together place it outside the territory of China is regarded as erroneous data; a distance greater than the threshold of 2 kilometers between any 5 consecutive positioning points is regarded as an abnormal jump point; and all such abnormal data are eliminated.
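The three rejection rules can be sketched as a single filtering pass. Several things here are assumptions rather than the claimed method: the field names, the approximate bounding box standing in for "the territory of China", and above all the jump test, which is simplified to comparing each point with the previously kept point because the claim does not spell out how the 5-point window is evaluated.

```python
import math

# Hypothetical field names; the claim lists the fields but not their keys.
REQUIRED_FIELDS = ("longitude", "latitude", "speed", "terminal_time", "total_mileage")
# Rough bounding box used as a stand-in for China: lon_min, lon_max, lat_min, lat_max.
CHINA_BOX = (73.0, 135.5, 3.0, 54.0)
JUMP_THRESHOLD_KM = 2.0


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance between two points in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def clean_records(records):
    """Drop incomplete, out-of-range and jump records from a time-ordered list."""
    kept = []
    for rec in records:
        # Rule 1: any missing field makes the record incomplete.
        if any(rec.get(field) is None for field in REQUIRED_FIELDS):
            continue
        lon, lat = rec["longitude"], rec["latitude"]
        # Rule 2: points outside the bounding box are treated as erroneous.
        if not (CHINA_BOX[0] <= lon <= CHINA_BOX[1] and CHINA_BOX[2] <= lat <= CHINA_BOX[3]):
            continue
        # Rule 3 (simplified): a jump of more than 2 km from the last kept point.
        if kept:
            last = kept[-1]
            if haversine_km(last["longitude"], last["latitude"], lon, lat) > JUMP_THRESHOLD_KM:
                continue
        kept.append(rec)
    return kept
```

A production filter would need the real sampling interval to choose the window and threshold; with sparse sampling a 2 km gap between neighboring points can be legitimate movement.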
4. The map grid index and XGBOOST based overrun vehicle destination prediction method of claim 1, wherein the vehicle stopping point and dwell time extraction in the vehicle trip feature extraction is specifically: vehicle stopping points are identified by direct mileage-difference calculation; in the preliminary judgment, a point is preliminarily judged to be a stopping point when the mileage difference between two adjacent points is less than 2 kilometers, and the results of the preliminary calculation are combined to obtain all candidate stopping points; a detailed judgment follows the preliminary judgment, in which a stopping point is judged to be in a parked state when the average speed between two adjacent points is lower than 1 km/h; all stopping points within a range of 2 kilometers are aggregated, the dwell time of the vehicle is computed from the aggregated points of each area in time order, and the areas where the vehicle stops for more than 1 hour are extracted; this extraction filters out stops caused by congestion.
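The stop-detection steps above can be sketched as one pass over time-ordered points. To stay short, this sketch measures "within 2 kilometers" by the cumulative mileage reading rather than by map distance, which is an assumption; the name `find_stops` and the sample track are illustrative.

```python
def find_stops(points, near_km=2.0, park_kmh=1.0, min_dwell_h=1.0):
    """Return (start_time, end_time) of stops longer than min_dwell_h.

    points: time-ordered list of (time in hours, cumulative mileage in km).
    """
    stops = []
    run = None  # [start_time, end_time, start_mileage] of the current parked run

    def close(current):
        # Keep only runs that last long enough; short ones are treated as congestion.
        if current is not None and current[1] - current[0] > min_dwell_h:
            stops.append((current[0], current[1]))

    for (t0, m0), (t1, m1) in zip(points, points[1:]):
        dt, dm = t1 - t0, m1 - m0
        # Preliminary test (mileage difference) and detailed test (average speed).
        parked = dt > 0 and dm < near_km and dm / dt < park_kmh
        if parked and run is not None and m1 - run[2] < near_km:
            run[1] = t1  # aggregate into the current run
            continue
        close(run)
        run = [t0, t1, m0] if parked else None
    close(run)
    return stops


track = [(0.0, 100.0), (0.5, 120.0), (1.0, 120.0), (2.0, 120.2),
         (3.0, 120.3), (3.5, 140.0), (3.7, 140.1), (4.0, 160.0)]
print(find_stops(track))  # [(0.5, 3.0)]
```

In the sample track the 2.5-hour halt is kept as a stop, while the 0.2-hour slow segment near mileage 140 is dropped, which mirrors the congestion filtering described in the claim.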
5. The map grid index and XGBOOST based over-limit vehicle destination prediction method of claim 1, wherein the extraction of average travel speed and mileage in the vehicle trip feature extraction is specifically: by identifying the vehicle's stopping points, the vehicle's trajectory is divided into stopping points and trips; the longitude and latitude of each trip's start and end points and the continuous path information are extracted according to time-series characteristics, and the driving feature data of the latest 5 trips are extracted according to the speed and mileage of the positioning data;
1) Average travel speed
The average speed of each trip is determined as the ratio of the trip's mileage to its travel time; the average speed over several trips of a given vehicle is obtained as the arithmetic mean; the average travel speed further forms three features according to the length of the statistical period: average speed over the past month, average speed over the past week, and average speed of the last trip;
Figure FDA0003655559610000021
Figure FDA0003655559610000022
wherein k(i, j) is the k-th trip, whose starting point is stopping point j and whose end point, i.e. the next stopping point, is i; $v_{k(i,j)}$ is the average speed of trip k; n is the number of trips of a given vehicle;
Figure FDA0003655559610000023
is the average speed over the n trips of that vehicle;
2) Average trip mileage
The mileage $d_{i,j}$ of each trip can be determined; the average mileage over several trips of a given vehicle is obtained as the arithmetic mean of the mileage of those trips.
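The equation images in this claim are not recoverable here. From the wording (ratio of mileage to travel time, then arithmetic mean) they presumably take the form below. The symbol $t_{i,j}$ for the travel time of a trip and the names $\bar{v}$ and $\bar{d}$ for the two means are introduced here for illustration and are not taken from the patent. The second line plugs in the three trips of Table 2 as a worked check, with the units inferred from that table.

```latex
v_{k(i,j)} = \frac{d_{i,j}}{t_{i,j}}, \qquad
\bar{v} = \frac{1}{n}\sum_{k=1}^{n} v_{k(i,j)}, \qquad
\bar{d} = \frac{1}{n}\sum_{k=1}^{n} d_{i,j}

\bar{v} = \frac{9.08 + 21.06 + 10.84}{3} = 13.66 \ \text{km/h}, \qquad
\bar{d} = \frac{217 + 84 + 66}{3} \approx 122.33 \ \text{km}
```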
6. The map grid index and XGBOOST based over-limit vehicle destination prediction method of claim 1, wherein said mapping vehicle positioning data to a map grid is specifically: defining satellite positioning data as points in a two-dimensional plane:
$p_i = \{lon_i, lat_i\}$,
where $lon_i$ denotes longitude and $lat_i$ denotes latitude;
the area covered by the map is mapped into a two-dimensional plane with the earth's longitude and latitude as coordinate axes:
$R^2 = \{(lon, lat) \mid lon \in R^+, lat \in R^+\}$
the map is divided into grid cells of equal size using straight lines parallel to the coordinate axes:
$R_i = \{lon_{max}, lon_{min}, lat_{max}, lat_{min}\}$
where $lon_{max}$, $lon_{min}$, $lat_{max}$, $lat_{min}$ are respectively the right, left, upper, and lower boundaries of the cell; the division yields the set of cells $S = \{R_1, R_2, R_3, \ldots, R_{i-1}, R_i\}$, thereby defining a mapping $F: R^2 \to S$.
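For a regular grid, the mapping F can be computed arithmetically instead of searching through the cell boundaries. The sketch below is one possible realization, assuming square cells measured in degrees and row-major numbering from 0; the class `MapGrid`, its fields and the example extent are hypothetical and are not the grid of the patent.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MapGrid:
    """Regular grid over a longitude/latitude rectangle (illustrative layout)."""
    lon_min: float
    lon_max: float
    lat_min: float
    lat_max: float
    cell_deg: float  # side length of one cell, in degrees

    @property
    def n_cols(self) -> int:
        return math.ceil((self.lon_max - self.lon_min) / self.cell_deg)

    def locate(self, lon: float, lat: float) -> Optional[int]:
        """Return the grid number containing (lon, lat), or None if outside the map."""
        inside = self.lon_min <= lon < self.lon_max and self.lat_min <= lat < self.lat_max
        if not inside:
            return None
        col = int((lon - self.lon_min) // self.cell_deg)
        row = int((lat - self.lat_min) // self.cell_deg)
        return row * self.n_cols + col


grid = MapGrid(lon_min=118.0, lon_max=120.0, lat_min=31.0, lat_max=33.0, cell_deg=0.5)
print(grid.locate(118.75, 31.75))  # 5
print(grid.locate(121.0, 31.75))   # None
```

Calling `locate` with the longitude and latitude of a trip's end point gives the end-point grid number feature in constant time, without scanning every cell's range.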
7. The map grid index and XGBOOST based over-limit vehicle destination prediction method of claim 1, wherein the grid index number of the trip end point is quickly matched and extracted based on the longitude and latitude of the start and end points, so as to supplement the end-point grid number feature of the vehicle trip: the longitude and latitude range of each grid cell is matched against the longitude and latitude of the trip end point, and after matching the grid index number is assigned to the trip end point.
8. The map grid index and XGBOOST based vehicle destination prediction method of claim 1 wherein an objective function of XGBOOST is defined as
Figure FDA0003655559610000031
wherein $g_i$ is the first derivative of the loss function, $h_i$ is its second derivative, $f_t$ is the t-th base model, and $\Omega$ is the regularization term of the model; the first and second derivatives of the loss function are computed at each step, the objective function is optimized to obtain f(x) for each step, and the prediction model is finally obtained according to the additive model.
9. The map grid index and XGBOOST based vehicle destination prediction method of claim 1, wherein the current journey feature is extracted based on the vehicle real-time travel data, the vehicle travel destination is predicted using the real-time journey feature as an input, the vehicle travel destination grid number is obtained by prediction, i.e. the travel destination prediction is completed, the destination is represented by the grid number, and is visualized after mapping on the map.
10. An overrun vehicle destination prediction system based on map grid indexing and XGBOOST, comprising:
the data preprocessing module is used for acquiring the satellite navigation positioning data of the overload vehicle, performing preprocessing work on the data and eliminating abnormal data with incomplete, wrong and discrete information in the data set;
the vehicle travel characteristic extraction module is used for extracting vehicle stopping points and residence time in the satellite navigation positioning data according to vehicle stopping characteristics; extracting the longitude and latitude and continuous path information of the starting and ending points of the journey according to the time sequence characteristics, and extracting the running characteristics of the latest N journeys according to the speed and mileage parameters of the positioning data; the running characteristics comprise average running speed and mileage;
the map data gridding and index building module is used for gridding the map data of the administrative area of application, establishing a spatial index, mapping the vehicle positioning data to the map grid, and quickly matching and extracting the grid index number of the trip end point based on the longitude and latitude of the start and end points, so as to extract and supplement the end-point grid number feature of the vehicle trip;
the prediction model construction module is used for performing model training by using XGBOOST with the extracted vehicle travel characteristics as input to obtain a classifier, and the classifier is a prediction model of the illegal overrun overload vehicle traveling destination;
and the vehicle running destination prediction module extracts the current travel characteristics based on the real-time vehicle running data, predicts the vehicle running destination by taking the real-time travel characteristics as input, and predicts the grid number of the vehicle running destination.
CN202210552848.XA 2022-05-21 2022-05-21 Map grid index and XGBOST-based over-limit vehicle destination prediction method and system Pending CN114912689A (en)

Publications (1)

Publication Number Publication Date
CN114912689A true CN114912689A (en) 2022-08-16

Family

ID=82769440

Country Status (1)

Country Link
CN (1) CN114912689A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116486639A (en) * 2023-06-14 2023-07-25 眉山环天智慧科技有限公司 Vehicle supervision method based on remote sensing and Beidou satellite data analysis
CN116486639B (en) * 2023-06-14 2023-09-29 眉山环天智慧科技有限公司 Vehicle supervision method based on remote sensing and Beidou satellite data analysis
CN116578664A (en) * 2023-07-13 2023-08-11 鱼快创领智能科技(南京)有限公司 Construction method, travel prediction method and system of vehicle travel directional loop diagram
CN116578664B (en) * 2023-07-13 2023-09-26 鱼快创领智能科技(南京)有限公司 Construction method, travel prediction method and system of vehicle travel directional loop diagram

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination