
CN110458214B - Driver replacement recognition method and device

Info

Publication number: CN110458214B
Application number: CN201910698186.5A
Authority: CN
Inventors: 肖延国; 戴杰; 周忠球
Original assignee: Shanghai Yuanmou Software Co ltd
Current assignee: Shanghai Yuanmou Software Co ltd
Priority date: 2019-07-31
Filing date: 2019-07-31
Publication date: 2023-07-07
Anticipated expiration: 2039-07-31
Also published as: CN110458214A

Abstract

The invention relates to a method and a device for identifying driver replacement, wherein the method comprises the following steps: a driving data acquisition step of acquiring historical driving data of the vehicle through the OBD equipment; a feature construction step of constructing a data set and driving habit features; a model learning and training step of constructing and training a machine learning model according to the data set and the driving habit characteristics; updating and deploying the model. The invention can accurately analyze the driving behavior habit of the current driver by combining the historical driving record of the vehicle and the current driving external environment to see whether the driving habit is matched with the historical driving habit, thereby identifying whether to replace the driver.

Description

Driver replacement recognition method and device

Technical Field

The invention relates to the technical fields of Internet of vehicles and artificial intelligence, in particular to a method and a device for identifying driver replacement.

Background

With social and economic development, the number of motor vehicles in China continues to grow. More vehicles make travel easier but also bring traffic safety problems, so higher demands are placed on monitoring the driving risk of motor vehicles. Among these risks, driver replacement is both common and difficult to monitor.

At present, driver replacement recognition technologies fall mainly into two types. One installs a camera directly on the vehicle and identifies the driver using face recognition. The other installs an OBD (On-Board Diagnostics) device on the vehicle, collects the driver's driving data, and transmits it to a server, where it is analyzed using a rule engine and statistical models built by experts.

The first approach involves user privacy and is not accepted by ordinary customers; only fleet-scale customers, such as logistics fleets, can grant the required permission. The second approach constructs driving habit features manually: it relies on expert experience, involves a heavy workload, makes the feature library complex to update, and cannot guarantee recognition accuracy.

Disclosure of Invention

Based on this, it is necessary to provide a driver replacement recognition method and apparatus capable of accurately analyzing the driving behavior habit of the current driver in combination with the history of the running record of the vehicle and the external environment of the current running to see whether it matches the history of the driving habit, thereby recognizing whether to replace the driver.

In order to achieve the above object, the present invention adopts the following technical scheme.

The invention provides a driver replacement identification method, which comprises the following steps:

a driving data acquisition step of acquiring historical driving data of the vehicle through the OBD equipment;

a feature construction step of constructing a data set and driving habit features;

a model learning and training step of constructing and training a machine learning model according to the data set and the driving habit characteristics;

updating and deploying the model.

In the above-described driver replacement recognition method, the feature construction step further includes:

data preprocessing, namely decoding, cleaning, classifying and storing the historical driving data acquired in the driving data acquisition step to remove abnormal value and null value data;

constructing a data set, namely constructing a sample and the data set by adopting a pseudo tag method; and

constructing driving habit features, namely constructing the driving habit features of a driver based on trip features, three-urgency features, and GPS data, wherein the trip features comprise the trip start time, trip end time, driving duration, mileage, average speed, and number of overspeed events; the three-urgency features comprise the number of rapid accelerations, rapid decelerations, and sharp turns in each trip and the time points at which they occur.

In the above method for identifying driver replacement, the step of constructing the sample and the data set by using the pseudo tag method specifically includes:

assuming that the driver of the current vehicle has not been replaced, a sample constructed from the historical driving data of the current vehicle is taken as a negative sample, and its label is set to 0; a sample constructed from historical driving data randomly drawn from other vehicles is taken as a positive sample, and its label is set to 1;

after all negative and positive samples are collected, the dataset is formed.

In the above method for identifying driver replacement, the step of constructing the driving habit features specifically includes:

converting the variable-length sequence data in the historical driving data into fixed-length sequence data by cutting and padding, namely, using a segmented sampling method to divide the GPS data of each trip evenly into three parts and cutting K GPS points from each part, forming GPS sequence data of length 3K;

if the trip is short or the value of K is large, so that the segments overlap, filling the null values of the overlapping portion with the mean of the existing values.

In the above method for identifying driver replacement, the learning and training steps of the model specifically include:

constructing a machine learning model based on the data set and the driving habit characteristics, and learning and training the model in the following manner:

first, inputting the data set and the driving habit features into a GBDT model, and extracting the tree indices generated by the GBDT model;

then one-hot encoding the result and inputting it into an LR model, which outputs the identification result.

In the above method for identifying driver replacement, the updating and deploying steps of the model specifically include:

constructing a Docker image online and copying the model and the prediction code into the Docker container; when historical driving data of the vehicle is received, inputting it into the Docker container, which processes the data and outputs and stores the judgment result in real time.

The invention also provides a driver replacement identification device, comprising:

the driving data acquisition module is used for acquiring historical driving data of the vehicle through the OBD equipment;

the characteristic construction module is used for constructing a data set and driving habit characteristics;

the model learning and training module is used for constructing and training a machine learning model according to the data set and the driving habit characteristics;

and the model updating and deploying module is used for updating and deploying the machine learning model.

In the above-described driver replacement identifying apparatus, the feature constructing module further includes:

the data preprocessing unit is used for decoding, cleaning, classifying and storing the historical driving data acquired in the driving data acquisition step so as to remove abnormal value and null value data;

the data set construction unit is used for constructing a sample and a data set by adopting a pseudo tag method; and

the driving habit feature construction unit is used for constructing the driving habit features of a driver based on trip features, three-urgency features, and GPS data, wherein the trip features comprise the trip start time, trip end time, driving duration, mileage, average speed, and number of overspeed events; the three-urgency features comprise the number of rapid accelerations, rapid decelerations, and sharp turns in each trip and the time points at which they occur.

In the above-mentioned driver replacement recognition device, the data set construction unit is specifically configured to:

after all negative and positive samples are collected, the dataset is formed.

In the above-described driver replacement recognition apparatus, the driving habit feature construction unit is specifically configured to,

Compared with traditional driver replacement recognition methods, the invention acquires driving data, constructs a data set and driving habit features, and builds a machine learning model. No camera needs to be installed, so user privacy is protected. By combining the vehicle's historical driving records with the current external driving environment, the driving behavior habits of the current driver can be accurately analyzed and compared against the historical driving habits, thereby identifying whether the driver has been replaced. In addition, the GBDT+LR model fusion effectively improves the stability and accuracy of model recognition.

Drawings

Fig. 1 is a flow chart of a driver replacement recognition method in the present embodiment;

Fig. 2 is a schematic diagram of the structure of a driver replacement recognition device in the present embodiment.

Detailed Description

Further description will be made with reference to the accompanying drawings and specific embodiments.

As shown in fig. 1, the present embodiment provides a driver replacement recognition method, including the steps of:

S1: a driving data acquisition step of acquiring historical driving data of the vehicle through the OBD equipment;

S2: a feature construction step of constructing a data set and driving habit features;

S3: a model learning and training step of constructing and training a machine learning model according to the data set and the driving habit characteristics;

S4: updating and deploying the model.

The simplest and most effective way to acquire vehicle data is to connect to the vehicle through its OBD interface; an OBD device is currently the most common device for OBD diagnosis and for reading and writing vehicle data.

The OBD device can collect a large amount of data related to the running of the vehicle, such as running speed, running time, running distance, three-urgency data (sharp turning, rapid deceleration, rapid acceleration), and the longitude and latitude of points along the route. These are collectively referred to herein as historical driving data.

The present embodiment performs feature construction based on the above historical driving data, and the feature construction step S2 further includes:

S21: data preprocessing, namely decoding, cleaning, classifying and storing the historical driving data acquired in the driving data acquisition step to remove abnormal value and null value data;

S22: constructing a data set, namely constructing a sample and the data set by adopting a pseudo tag method; and

S23: constructing driving habit features, namely constructing the driving habit features of a driver based on trip features, three-urgency features, and GPS data, wherein the trip features include the trip start time, trip end time, start point coordinates, end point coordinates, driving duration, mileage, average speed, maximum trip speed, minimum trip speed, dwell interval between trips, maximum overspeed speed, number of overspeed events, and the like; the three-urgency features include the number of rapid accelerations, rapid decelerations, and sharp turns in each trip and the time points at which they occur.
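As a rough illustration of the trip features listed in S23, the following Python sketch summarizes one trip as a flat feature dictionary. The `Point` record, its field names, the units, and the 120 km/h default limit are assumptions made for this example; the source does not specify a data layout or an overspeed threshold.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Point:
    """One decoded sample of a trip (hypothetical layout, not from the source)."""
    t: float          # timestamp in seconds
    speed_kmh: float  # instantaneous speed
    odo_km: float     # cumulative distance travelled


def trip_features(points: List[Point], speed_limit_kmh: float = 120.0) -> Dict[str, float]:
    """Summarize one trip (one engine start to shutdown) as trip features."""
    if len(points) < 2:
        raise ValueError("a trip needs at least two points")
    start, end = points[0], points[-1]
    duration_s = end.t - start.t
    mileage_km = end.odo_km - start.odo_km
    speeds = [p.speed_kmh for p in points]

    # Count overspeed events: each transition from within the limit to above it.
    overspeed_count = 0
    previously_over = False
    for s in speeds:
        over = s > speed_limit_kmh
        if over and not previously_over:
            overspeed_count += 1
        previously_over = over

    avg_speed = mileage_km / (duration_s / 3600.0) if duration_s > 0 else 0.0
    return {
        "start_time": start.t,
        "end_time": end.t,
        "duration_s": duration_s,
        "mileage_km": mileage_km,
        "avg_speed_kmh": avg_speed,
        "max_speed_kmh": max(speeds),
        "min_speed_kmh": min(speeds),
        "overspeed_count": overspeed_count,
    }
```

Start and end coordinates, dwell time between trips, and the three-urgency counts could be added to the same dictionary in the same way once the corresponding raw fields are available.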

In step S21, because driving is affected by many external factors, the quality of the historical driving data collected by the OBD device is uneven. For example, when the vehicle passes through a tunnel, the GPS signal is weak, so the collected GPS data (such as coordinates and speed) is inaccurate and sometimes cannot be obtained at all, which produces abnormal values and null values in the historical driving data. Moreover, the data acquired through the OBD interface cannot be used directly. The data therefore needs to be preprocessed (decoded, cleaned, classified, and stored) so that abnormal and null values are removed and the data is converted into the required format and stored for later use.
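The cleaning part of this preprocessing can be sketched as a simple filter. The record keys (`t`, `lat`, `lon`, `speed_kmh`) and the 250 km/h plausibility bound are assumptions for illustration only; the source does not state which fields are required or what counts as an abnormal value.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[float]]

# Fields every usable record must carry (hypothetical names).
REQUIRED_FIELDS = ("t", "lat", "lon", "speed_kmh")


def clean_records(records: List[Record], max_speed_kmh: float = 250.0) -> List[Record]:
    """Drop records with null values or physically implausible values."""
    kept: List[Record] = []
    for r in records:
        # Null values: e.g. GPS fields missing while driving through a tunnel.
        if any(r.get(k) is None for k in REQUIRED_FIELDS):
            continue
        # Abnormal values: coordinates outside the valid range.
        if not (-90.0 <= r["lat"] <= 90.0 and -180.0 <= r["lon"] <= 180.0):
            continue
        # Abnormal values: negative or implausibly high speed.
        if not (0.0 <= r["speed_kmh"] <= max_speed_kmh):
            continue
        kept.append(r)
    return kept
```

In practice the bounds would be tuned to the fleet and the device, and rejected records might be logged rather than silently dropped.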

After the above steps are completed, the data set can be constructed. One advantage of this embodiment is that driver replacement can be identified from a single trip. Therefore, in step S22, a sample is defined first: a trip is the vehicle's journey from one engine start to the following engine shutdown, and each sample is based on the data of one complete trip.

In constructing samples, the present embodiment adopts a pseudo tag method: assuming that the driver of the current vehicle has not changed, a sample constructed from the current vehicle's historical driving data is taken as a negative sample, with its tag set to 0; a sample constructed from historical driving data randomly drawn from other vehicles is taken as a positive sample, with its tag set to 1.

All negative and positive samples are then collected to form the data set. It follows that when the model is later used to verify the data set, a sample tag of 0 for a given trip indicates that the driver of the vehicle has not changed, and a sample tag of 1 indicates that the driver has changed.
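The pseudo tag construction described above can be sketched as follows. The function name, the dictionary-of-trips input, the `n_positive` parameter, and the fixed random seed are assumptions for this example; the source does not say how many positive samples are drawn or how they are sampled.

```python
import random
from typing import Dict, List, Sequence, Tuple

Trip = Sequence[float]  # one trip's feature vector (placeholder type)


def build_pseudo_label_dataset(
    trips_by_vehicle: Dict[str, List[Trip]],
    target_vehicle: str,
    n_positive: int,
    seed: int = 0,
) -> List[Tuple[Trip, int]]:
    """Label the target vehicle's own trips 0 and trips drawn from other vehicles 1."""
    rng = random.Random(seed)

    # Negative samples: assume the target vehicle's driver never changed.
    negatives = [(trip, 0) for trip in trips_by_vehicle[target_vehicle]]

    # Positive samples: trips randomly drawn from all other vehicles.
    pool = [
        trip
        for vehicle_id, trips in trips_by_vehicle.items()
        if vehicle_id != target_vehicle
        for trip in trips
    ]
    drawn = rng.sample(pool, min(n_positive, len(pool)))
    positives = [(trip, 1) for trip in drawn]

    dataset = negatives + positives
    rng.shuffle(dataset)
    return dataset
```

Drawing roughly as many positives as there are negatives keeps the two classes balanced, which is one reasonable choice but not something the source prescribes.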

In step S23, the driving habit features mainly consider dimensions such as the trip start time, trip end time, start point coordinates, end point coordinates, driving duration, mileage, average speed, maximum trip speed, minimum trip speed, dwell time between this trip and the next, maximum overspeed speed, number of overspeed events, the three-urgency features, and GPS data. Analysis of these dimensions consists mainly of processing and analyzing the GPS data, as follows:

Specifically, the GPS sequence data is cut into three parts: trip data of k time units is taken from the early, middle, and late phases of the trip, finally forming GPS sequence data of length 3k. The early phase is defined as the first k time units of the trip, the middle phase as the k time units in the middle, and the late phase as the last k time units. When the trip is short, the three segments may overlap. When k is large and the trip is short, null values may be taken; in that case, the null values in the overlapping portion are filled with the mean of the existing values.
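One possible reading of this cut-and-fill procedure is sketched below for a single numeric channel of the GPS sequence (for example speed). The centering rule for the middle window and the choice to pad each segment with the mean of its own values are assumptions; the source only states that null values are filled with the mean of the existing values.

```python
from typing import List


def _pad_with_mean(values: List[float], k: int) -> List[float]:
    """Pad a segment shorter than k with the mean of the values it does have."""
    if not values:
        raise ValueError("trip has no GPS points")
    mean = sum(values) / len(values)
    return values + [mean] * (k - len(values))


def fixed_length_sequence(seq: List[float], k: int) -> List[float]:
    """Turn a variable-length sequence into one of length 3k (early, middle, late)."""
    if k <= 0:
        raise ValueError("k must be positive")
    n = len(seq)
    early = seq[:k]                       # first k points
    mid_start = max(0, (n - k) // 2)      # window centered on the trip
    middle = seq[mid_start:mid_start + k]
    late = seq[max(0, n - k):]            # last k points

    out: List[float] = []
    for segment in (early, middle, late):
        # For trips with k <= n < 3k the segments simply overlap;
        # for trips with n < k each segment is padded up to k.
        out.extend(_pad_with_mean(list(segment), k))
    return out
```

For example, `fixed_length_sequence([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 2)` takes the first two, the two central, and the last two values, while a two-point trip with `k = 3` is padded in every segment.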

For the speed of the vehicle, the speed over the first k time units or over the last k time units may be taken as a reference value; for the acceleration of the vehicle, the acceleration over the first k time units or over the last k time units may be taken as a reference value; for the average speed of the vehicle, the average speed over the first k time units or over the last k time units may be taken as a reference value. Other known data interception methods may also be used.

After the feature construction step S2 is completed, a model learning and training step S3 is performed, which constructs and trains a machine learning model based on the data set and driving habit features.

Specifically, a machine learning model is constructed based on the data set and driving habit characteristics, and the model is learned and trained in the following manner:

inputting the data set and the driving habit characteristics into a GBDT model, and taking out a tree index generated by the GBDT model.

A brief description of the GBDT model follows. GBDT stands for Gradient Boosting Decision Tree, also known as MART, GBRT, or TreeNet, and is among the conventional machine learning algorithms that best fit the true distribution. GBDT builds its classifier over multiple iterations; each iteration produces a weak classifier that is trained on the residual of the classifiers from the previous iterations. The weak classifiers are generally required to be simple, with low variance and high bias, because the training process continually improves the accuracy of the final classifier by reducing the bias.

The weak classifier is typically a CART (Classification and Regression Tree). Because of the high-bias and simplicity requirements described above, each classification and regression tree is kept shallow. The final overall classifier is obtained as the weighted sum of the weak classifiers from each round of training (that is, an additive model).
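The additive model and residual fitting described above can be written out as follows. This is the standard textbook form of gradient boosting rather than a formula taken from the source; the loss function $L$, the shrinkage factor $\nu$, and the number of rounds $M$ are assumptions, since the source does not specify them.

```latex
% Initial constant model
F_0(x) = \arg\min_{c} \sum_{i=1}^{N} L(y_i, c)

% Round m: pseudo-residuals of the current model (negative gradient of the loss)
r_{im} = -\left[ \frac{\partial L\bigl(y_i, F(x_i)\bigr)}{\partial F(x_i)} \right]_{F = F_{m-1}},
\qquad i = 1, \dots, N

% Fit a shallow CART h_m to the pairs (x_i, r_{im}), then update
F_m(x) = F_{m-1}(x) + \nu \, h_m(x)

% Final additive model after M rounds
F_M(x) = F_0(x) + \nu \sum_{m=1}^{M} h_m(x)
```

For the squared loss $L(y, F) = \tfrac{1}{2}(y - F)^2$, the pseudo-residual reduces to $r_{im} = y_i - F_{m-1}(x_i)$, which is the ordinary residual of the previous rounds.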

Then, the tree indices are one-hot encoded and input into an LR (Logistic Regression) model, and the LR model outputs the recognition result.
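The hand-off between the two models can be sketched in plain Python. The sketch assumes the index taken from each GBDT tree is the index of the leaf that a sample falls into, which is the usual reading of GBDT+LR but is not spelled out in the source. The leaf indices, leaf counts, weights, and bias used here are hypothetical; in a real system they would come from trained models.

```python
import math
from typing import List, Sequence


def one_hot_leaves(leaf_indices: Sequence[int], leaves_per_tree: Sequence[int]) -> List[int]:
    """Concatenate one one-hot block per tree, marking the leaf the sample reached."""
    if len(leaf_indices) != len(leaves_per_tree):
        raise ValueError("need exactly one leaf index per tree")
    vector: List[int] = []
    for leaf, n_leaves in zip(leaf_indices, leaves_per_tree):
        if not 0 <= leaf < n_leaves:
            raise ValueError("leaf index out of range")
        block = [0] * n_leaves
        block[leaf] = 1
        vector.extend(block)
    return vector


def lr_probability(x: Sequence[int], weights: Sequence[float], bias: float) -> float:
    """Logistic regression score: probability that the driver was replaced (label 1)."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))
```

With three trees of 2, 2, and 3 leaves, a sample landing in leaves 1, 0, and 2 becomes the 7-dimensional vector `[0, 1, 1, 0, 0, 0, 1]`, which the LR model then scores. The design intent of this arrangement is that the trees learn feature combinations while the linear model weights them.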

According to the embodiment, the GBDT+LR fusion model is used, so that the accuracy and the reliability of system identification can be effectively improved.

The updating and deployment of the model of the present embodiment adopts the following steps:

Docker lends itself naturally to microservices and can solve, in an agile and efficient way, many pain points of deep learning, including:

some neural network frameworks, such as Caffe, have heavy dependencies and are difficult to install; many network models have not been optimized for engineering use, which makes deployment difficult; the hardware usage (for example of GPUs) of frameworks such as TensorFlow is difficult to control flexibly; and so on. Therefore, this embodiment updates and deploys the machine learning model by building a Docker image online, which avoids the above problems, simplifies the system, and improves recognition speed and recognition accuracy.
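The prediction code that is packaged together with the model might look roughly like the following entry point. The JSON message layout, the `trip_id` and `features` keys, the 0.5 threshold, and the in-memory `store` list are all assumptions for illustration; the source only says that incoming data is processed and the judgment result is output and stored in real time.

```python
import json
from typing import Callable, Dict, List


def handle_message(
    raw: str,
    score: Callable[[List[float]], float],
    store: List[Dict],
    threshold: float = 0.5,
) -> Dict:
    """Score one incoming trip message and persist the judgment result."""
    message = json.loads(raw)
    # `score` stands in for the deployed model's prediction function.
    probability = score(message["features"])
    result = {
        "trip_id": message["trip_id"],
        "probability": probability,
        "driver_replaced": probability >= threshold,
    }
    store.append(result)  # a real service would write to a database instead
    return result
```

Keeping the scoring function as a parameter means a retrained model can be swapped in by rebuilding the image without changing this handler.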

Referring to fig. 2, the present embodiment accordingly provides a driver replacement recognition apparatus 100 including:

a driving data acquisition module 10, configured to acquire historical driving data of the vehicle through the OBD device;

a feature construction module 20 for constructing a data set and driving habit features;

a model learning and training module 30 for constructing and training a machine learning model based on the data set and driving habit characteristics;

and a model updating and deploying module 40, configured to update and deploy the machine learning model.

Wherein the feature construction module further comprises:

a data preprocessing unit 21 for decoding, cleaning, classifying and storing the history running data acquired in the driving data acquisition step to reject abnormal value and null value data;

a data set construction unit 22 for constructing a sample and a data set using a pseudo tag method; and

a driving habit feature construction unit 23 for constructing the driving habit features of the driver based on trip features, three-urgency features, and GPS data, the trip features including the trip start time, trip end time, driving duration, mileage, average speed, and number of overspeed events; the three-urgency features include the number of rapid accelerations, rapid decelerations, and sharp turns in each trip and the time points at which they occur.

Further, the data set construction unit 22 is specifically configured to:

In this embodiment, the driving habit features mainly consider dimensions such as the trip start time, trip end time, start point coordinates, end point coordinates, driving duration, mileage, average speed, maximum trip speed, minimum trip speed, dwell time between this trip and the next, maximum overspeed speed, number of overspeed events, the three-urgency features, and GPS data, where analysis of these dimensions consists mainly of processing and analyzing the GPS data, and includes:

In summary, compared with traditional driver replacement recognition methods, the method and the device acquire driving data, construct a data set and driving habit features, and build a machine learning model. No camera needs to be installed, so user privacy is protected. By combining the vehicle's historical driving records with the current external driving environment, the driving behavior habits of the current driver can be accurately analyzed and compared against the historical driving habits, thereby identifying whether the driver has been replaced. In addition, the GBDT+LR model fusion effectively improves the stability and accuracy of model recognition.

The technical features of the above-described embodiments may be arbitrarily combined, and all possible combinations of the technical features in the above-described embodiments are not described for brevity of description, however, as long as there is no contradiction between the combinations of the technical features, they should be considered as the scope of the description.

The foregoing examples illustrate only a few embodiments of the invention and are described in detail herein without thereby limiting the scope of the invention. It should be noted that it will be apparent to those skilled in the art that several variations and modifications can be made without departing from the spirit of the invention, which are all within the scope of the invention.

Claims

1. A driver replacement recognition method, characterized by comprising the steps of:

updating and deploying the model;

the feature construction step further includes:

constructing driving habit features, namely constructing the driving habit features of a driver based on trip features, three-urgency features, and GPS data, wherein the trip features comprise the trip start time, trip end time, driving duration, mileage, average speed, and number of overspeed events; the three-urgency features comprise the number of rapid accelerations, rapid decelerations, and sharp turns in each trip and the time points at which they occur;

the step of constructing the sample and the data set by adopting the pseudo tag method specifically comprises the following steps:

after all negative and positive samples are collected, the dataset is formed.

2. The driver replacement recognition method according to claim 1, wherein the step of constructing the driving habit feature specifically includes:

3. The driver change identification method according to claim 1, wherein the learning and training steps of the model specifically include:

4. The driver replacement identification method of claim 1, wherein the updating and deploying steps of the model specifically include:

5. A driver replacement recognition apparatus, characterized by comprising:

the model updating and deploying module is used for updating and deploying the model;

the feature construction module further comprises:

the driving habit feature construction unit is used for constructing the driving habit features of a driver based on trip features, three-urgency features, and GPS data, wherein the trip features comprise the trip start time, trip end time, driving duration, mileage, average speed, and number of overspeed events; the three-urgency features comprise the number of rapid accelerations, rapid decelerations, and sharp turns in each trip and the time points at which they occur;

the data set construction unit is specifically configured to:

after all negative and positive samples are collected, the dataset is formed.

6. The driver replacement recognition apparatus according to claim 5, wherein the driving habit feature construction unit is specifically configured to,