CN112651443A

CN112651443A - Driving style identification model evaluation method, device, medium and equipment based on machine learning

Info

Publication number: CN112651443A
Application number: CN202011583394.XA
Authority: CN
Inventors: 孙健东; 冯读康; 王群; 陶亚彬; 张曌; 吕帅康; 何谦; 孔令振
Original assignee: North China Institute of Science and Technology
Current assignee: North China Institute of Science and Technology
Priority date: 2020-12-28
Filing date: 2020-12-28
Publication date: 2021-04-13

Abstract

The application provides a driving style identification model evaluation method and device based on machine learning, a computer readable medium and electronic equipment. The method comprises the following steps: dividing driving behavior data samples obtained in advance into training data samples and testing data samples; training the driving style identification model according to the training data sample, and obtaining a prediction result according to the test data sample; and evaluating the model generalization capability and the model precision of the driving style identification model according to the prediction result and the test data sample. Therefore, the driving style of the driver is accurately and effectively identified, the driving habit of the driver is pertinently guided, and the purpose of improving the fuel economy of the mining truck is achieved.

Description

Driving style identification model evaluation method, device, medium and equipment based on machine learning

Technical Field

The present disclosure relates to the field of driver assistance technologies, and in particular, to a method and an apparatus for evaluating a driving style identification model based on machine learning, a computer readable medium, and an electronic device.

Background

While driving a mining truck, an aggressive driver frequently and heavily depresses the accelerator pedal or brake pedal, so the truck consumes more fuel and its fuel economy is poor; a mild driver depresses the accelerator pedal or brake pedal gently, so the truck consumes less fuel and its fuel economy is better. The driver's behavior is thus fully reflected in the driver's inputs to the mining truck and in the truck's response during driving. In other words, the driving style of the driver has a great influence on the fuel economy of the mining truck, and identifying that style makes it possible to guide the driver's habits and improve the economy of the truck. Whether the driving style identification model is accurate and effective when identifying the driving style of the driver therefore has an important influence on improving the economy of the mining truck.

Disclosure of Invention

An object of the present application is to provide a driving style identification model evaluation method, device, computer readable medium and electronic device based on machine learning, so as to solve or alleviate the above problems in the prior art.

In order to achieve the above purpose, the present application provides the following technical solutions:

the application provides a driving style identification model evaluation method based on machine learning, which comprises the following steps: dividing driving behavior data samples obtained in advance into training data samples and testing data samples; training the driving style identification model according to the training data sample, and obtaining a prediction result according to the test data sample; and evaluating the model generalization capability and the model precision of the driving style identification model according to the prediction result and the test data sample.

Optionally, in any embodiment of the present application, the dividing the driving behavior data sample obtained in advance into a training data sample and a testing data sample specifically includes: dividing the driving behavior data samples obtained in advance into the training data samples and the testing data samples according to a preset proportion.

Optionally, in any embodiment of the present application, training the driving style identification model according to the training data sample and obtaining a prediction result according to the test data sample includes: based on the Scikit-learn machine learning platform, using grid search on the training data sample to find the optimal hyper-parameter combination as the identification model parameters, and constructing the driving style identification model; and obtaining the prediction result according to the test data sample based on the driving style identification model.

Optionally, in any embodiment of the application, using grid search on the training data sample, based on the Scikit-learn machine learning platform, to find the optimal hyper-parameter combination as the identification model parameters and constructing the driving style identification model is specifically: based on the Scikit-learn machine learning platform, searching for the optimal hyper-parameter combination through a grid search with ten-fold cross-validation according to the training data sample, and constructing the driving style identification model.

Optionally, in any embodiment of the present application, the evaluating the model generalization capability and the model precision of the driving style identification model according to the prediction result and the test data sample includes: calculating the average classification accuracy of the driving style identification model according to the prediction result and the test data sample, wherein the average classification accuracy represents the model generalization capability of the driving style identification model; and evaluating the model precision of the driving style identification model according to the prediction result and the actual result, wherein the actual result is the clustering result of the driving behavior data sample.

Optionally, in any embodiment of the present application, the evaluating, according to the prediction result and the actual result, the model precision of the driving style identification model specifically includes: comparing the prediction result of the driving style identification model with the actual result based on a confusion matrix to obtain a difference result, wherein the difference result represents the model precision of the driving style identification model.
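The confusion-matrix comparison described above can be sketched in plain Python as follows. This is only an illustration: the style labels and the sample predictions are hypothetical placeholders rather than results from the application, and the overall accuracy is shown as one possible summary of the difference result.

```python
def confusion_matrix(actual, predicted, labels):
    """Rows are actual (clustering) labels, columns are predicted labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Share of samples on the diagonal, i.e. predicted == actual."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Hypothetical style names and labels, for illustration only.
labels = ["mild", "normal", "aggressive"]
actual = ["mild", "mild", "normal", "aggressive", "aggressive", "normal"]
predicted = ["mild", "normal", "normal", "aggressive", "aggressive", "normal"]

matrix = confusion_matrix(actual, predicted, labels)
for label, row in zip(labels, matrix):
    print(label, row)
print("accuracy:", accuracy(matrix))
```

Off-diagonal cells show which styles the model confuses with each other, which a single accuracy figure does not reveal.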

Optionally, in any embodiment of the present application, after training the driving style identification model according to the training data sample and obtaining a prediction result according to the test data sample, before evaluating the model generalization ability and the model accuracy of the driving style identification model according to the prediction result and the test data sample, the method further includes: determining a total evaluation index of the driving style identification model, wherein the total evaluation index comprises: model generalization capability and model accuracy.

The embodiment of the present application further provides a driving style identification model evaluation device based on machine learning, including: the sample dividing unit is configured to divide the driving behavior data samples obtained in advance into training data samples and testing data samples; the model prediction unit is configured to train the driving style identification model according to the training data sample and obtain a prediction result according to the test data sample; and the model evaluation unit is configured to evaluate the model generalization capability and the model precision of the driving style identification model according to the prediction result and the test data sample.

An embodiment of the present application further provides a computer-readable medium on which a computer program is stored, where the program, when executed, implements the driving style identification model evaluation method based on machine learning according to any of the above embodiments.

An embodiment of the present application further provides an electronic device, including: a memory, a processor and a program stored in the memory and executable on the processor, the processor implementing the driving style identification model evaluation method based on machine learning according to any of the above embodiments when executing the program.

Compared with the closest prior art, the technical scheme of the embodiment of the application has the following beneficial effects:

according to the technical scheme provided by the embodiment of the application, a driving behavior data sample obtained in advance is divided into a training data sample and a testing data sample, a driving style identification model is trained by using the training data sample, a prediction result is obtained according to the testing data sample, and the model generalization capability and the model precision of the driving style identification model are evaluated through the prediction result and the testing data sample; the driving style of the driver is accurately and effectively identified, the driving habit of the driver is pertinently guided, and the purpose of improving the fuel economy of the mining truck is achieved.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the application and, together with the description, serve to explain the application and are not intended to limit the application. Wherein:

fig. 1 is a schematic flow chart of a driving style identification model evaluation method based on machine learning according to some embodiments of the present application;

fig. 2 is a schematic flowchart of step S102 in a driving style identification model evaluation method based on machine learning according to some embodiments of the present application;

FIG. 3 is a schematic flow diagram of a method for constructing a machine learning-based driving style identification model according to some embodiments of the present application;

FIG. 4 is a diagram of accelerator pedal travel for a mining truck in a heavy duty operating condition and an unloaded operating condition;

FIG. 5 is a graph of the travel speed of a mining truck in a heavy load operating condition and an empty load operating condition;

fig. 6 is a schematic flowchart of step S301 in a driving style identification model construction method according to some embodiments of the present application;

fig. 7 is a schematic flowchart of step S311 in a driving style identification model construction method according to some embodiments of the present application;

fig. 8 is a correlation coefficient heat map of the driving behavior characteristic parameters of a mining truck in a heavy-load operation state, provided in accordance with some embodiments of the present application;

fig. 9 is a correlation coefficient heat map of the driving behavior characteristic parameters of a mining truck in a no-load operation state, provided in accordance with some embodiments of the present application;

FIG. 10 is a schematic illustration of determining the number of driving style categories of a mining truck in a heavy-load operation state using the elbow method, provided in accordance with some embodiments of the present application;

fig. 11 is a schematic illustration of determining the number of driving style categories of a mining truck in a no-load operation state using the elbow method, provided in accordance with some embodiments of the present application;

fig. 12 is a schematic flowchart of step S302 in a method for constructing a driving style identification model based on machine learning according to some embodiments of the present application;

fig. 13 is a flowchart illustrating step S103 of the driving style identification model evaluation method based on machine learning according to some embodiments of the present application;

FIG. 14 is a schematic illustration of a confusion matrix for a mining truck under heavy-duty operation conditions, provided in accordance with some embodiments of the present application;

fig. 15 is a schematic illustration of a confusion matrix for a mining truck in an unloaded operating state, provided in accordance with some embodiments of the present application;

FIG. 16 is a schematic structural diagram of a driving style identification model evaluation device based on machine learning according to some embodiments of the present application;

FIG. 17 is a schematic diagram of a model prediction unit provided in accordance with some embodiments of the present application;

FIG. 18 is a schematic structural diagram of a model building subunit provided in accordance with some embodiments of the present application;

FIG. 19 is a block diagram of a driving style classification module provided in accordance with some embodiments of the present application;

FIG. 20 is a schematic diagram of a redundancy parameter removal sub-module provided in accordance with some embodiments of the present application;

FIG. 21 is a block diagram of a recognition model building module provided in accordance with some embodiments of the present application;

FIG. 22 is a schematic structural diagram of a model evaluation unit provided in accordance with some embodiments of the present application;

FIG. 23 is a schematic structural diagram of an electronic device provided in accordance with some embodiments of the present application;

fig. 24 is a hardware block diagram of an electronic device provided in accordance with some embodiments of the present application.

Detailed Description

The present application will be described in detail below with reference to the embodiments with reference to the attached drawings. The various examples are provided by way of explanation of the application and are not limiting of the application. In fact, it will be apparent to those skilled in the art that modifications and variations can be made in the present application without departing from the scope or spirit of the application. For instance, features illustrated or described as part of one embodiment, can be used with another embodiment to yield a still further embodiment. It is therefore intended that the present application cover the modifications and variations of this invention provided they come within the scope of the appended claims and their equivalents.

Exemplary method

Fig. 1 is a schematic flow chart of a driving style identification model evaluation method based on machine learning according to some embodiments of the present application; as shown in fig. 1, the driving style identification model evaluation method based on machine learning includes:

step S101, dividing driving behavior data samples obtained in advance into training data samples and testing data samples;

in the embodiment of the application, the driving behavior data sample is divided into the training data sample and the testing data sample, the driving style identification model is constructed by using the training data sample, and the constructed driving style identification model is tested by using the testing data sample. Therefore, the driving style identification model is established through the same driving behavior data sample, and meanwhile, the driving style identification model is evaluated, so that the identification effectiveness of the driving style identification model is effectively improved, and the driving style identification model is more stable and comprehensive. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

In some optional embodiments, the dividing the driving behavior data sample obtained in advance into a training data sample and a testing data sample specifically includes: dividing the driving behavior data samples obtained in advance into the training data samples and the testing data samples according to a preset proportion. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

In the embodiment of the application, ten-fold cross-validation is used to evaluate the generalization capability of the model, so the driving behavior data is divided into training data samples and test data samples at a ratio of 9:1, that is, 9 parts of training data samples and 1 part of test data samples. The 9 parts of training data samples are used to train the driving style identification model, and the 1 part of test data samples is used to test the results of the driving style identification model. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.
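A minimal sketch of the 9:1 division, assuming Scikit-learn's `train_test_split` is used for it (the application does not name the function). The feature matrix and labels are random placeholders, and stratifying by label is an added assumption.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Placeholder data: 110 trips x 4 characteristic parameters, 3 style labels.
X = rng.normal(size=(110, 4))
y = rng.integers(0, 3, size=110)

# test_size=0.1 gives the 9:1 ratio; stratify keeps label proportions similar
# in both parts (an assumption, not stated in the application).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1, random_state=0, stratify=y
)

print(X_train.shape, X_test.shape)
```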

Step S102, training a driving style identification model according to the training data sample, and obtaining a prediction result according to the test data sample;

in the embodiment of the application, the 9 parts of training data samples are used to fit the driving style identification model, thereby training it; and the 1 part of test data samples is used to test the results of the driving style identification model so as to measure its accuracy. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

Fig. 2 is a schematic flowchart of step S102 in a driving style identification model evaluation method based on machine learning according to some embodiments of the present application; as shown in fig. 2, the training the driving style identification model according to the training data sample, and obtaining a prediction result according to the test data sample includes:

step S112, based on the Scikit-learn machine learning platform, using grid search on the training data sample to find the optimal hyper-parameter combination as the identification model parameters, and constructing the driving style identification model;

in the embodiment of the application, the driving style of the driver is subjected to cluster analysis by obtaining a driving behavior data sample of the driver in advance, and a driving style identification model is constructed based on a random forest algorithm according to the obtained driving style category number and the driving behavior data sample so as to accurately and effectively identify the driving style of the driver. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

In some optional embodiments, using grid search on the training data sample, based on the Scikit-learn machine learning platform, to find the optimal hyper-parameter combination as the identification model parameters and constructing the driving style identification model is specifically: based on the Scikit-learn machine learning platform, searching for the optimal hyper-parameter combination through a grid search with ten-fold cross-validation according to the training data sample, and constructing the driving style identification model. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

In the embodiment of the application, according to the division result of the driving behavior data sample, a brute-force exhaustive search is performed over parameters of the driving style identification model such as n_estimators and max_depth, using a grid search with ten-fold cross-validation (10-fold cross-validation). This yields the optimal combination of parameters such as n_estimators and max_depth, optimizes the parameters of the driving style identification model, and improves its generalization ability. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.
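The grid search with ten-fold cross-validation could be sketched with Scikit-learn's `GridSearchCV` and `RandomForestClassifier` as below. The synthetic data and the candidate values in `param_grid` are assumptions for illustration; the application does not list the values that were searched.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the driving behavior samples (not the patent's data).
X, y = make_classification(
    n_samples=200, n_features=6, n_informative=4, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1, random_state=0, stratify=y
)

# Hypothetical candidate values for the two hyper-parameters named in the text.
param_grid = {"n_estimators": [10, 30], "max_depth": [3, None]}

# cv=10 evaluates every combination with ten-fold cross-validation.
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=10,
    scoring="accuracy",
)
search.fit(X_train, y_train)

# The refitted best model produces the prediction result on the test part.
y_pred = search.predict(X_test)
print(search.best_params_, round(search.best_score_, 3))
```

`best_score_` is the mean cross-validated accuracy of the best combination, which corresponds to the average classification accuracy used here as the measure of generalization capability.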

FIG. 3 is a schematic flow diagram of a method for constructing a machine learning-based driving style identification model according to some embodiments of the present application; as shown in fig. 3, the method is used for identifying the driving style of the driver, and comprises the following steps:

step S301, according to a driving behavior data sample of the driver obtained in advance, carrying out cluster analysis on the driving style of the driver to obtain the number of driving style categories of the driver;

in the embodiment of the application, the driving style identification model is constructed mainly for drivers of open-pit mining trucks. Driving behavior data samples are obtained by separately collecting data in the heavy-load operation state and the no-load operation state while a plurality of drivers drive the mining truck. The data collected on each trip in which the mining truck carries stripped rock from the loading point to the dump unloading point is a driving behavior data sample of the heavy-load operation state, and the data collected on each trip in which the empty truck returns along the same route from the dump unloading point to the loading point is a driving behavior data sample of the no-load operation state. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

In the embodiment of the application, the driving style of the driver is classified based on the machine learning model according to the driving behavior data sample obtained in advance, and the number of the driving style categories of the driver is obtained. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

In the embodiment of the application, in the actual transportation operation of the open-pit mining truck, the driving style of a driver is not known in advance. The data in the driving behavior data samples is therefore divided into clusters through unsupervised clustering analysis, so that samples within a cluster are more similar to each other than to samples in other clusters. The clustering result is then passed to a supervised machine learning model, such as a regression or classification model, to classify the driving style of the driver and determine the number of driving style categories. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.
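A sketch of the unsupervised step, assuming K-means clustering (the text here does not name the clustering algorithm) and synthetic two-dimensional data. The within-cluster sum of squares is computed for several cluster counts, which is the curve an elbow plot inspects, and the labels for the chosen count then serve as targets for the supervised model.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Synthetic stand-in: three loose groups of trips in a 2-feature space.
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(30, 2)) for c in centers])

# Within-cluster sum of squares for k = 1..6 (input to an elbow plot).
inertias = []
for k in range(1, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias.append(model.inertia_)

# k = 3 is chosen by hand here; these cluster labels would be the "actual"
# style labels handed to the supervised classifier.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
print([round(v, 1) for v in inertias])
```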

In the embodiment of the application, a hardware part for acquiring data of the mining truck mainly comprises 1 ARM microcontroller (model STM32F103), 2 inertial navigation sensors (model WTGARRS 2), 1 SD memory card, a vehicle-mounted direct-current power supply, a protective shell and the like. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

In the embodiment of the application, in order to acquire in real time the driving behavior data of a mining truck driver in the heavy-load and no-load operation states, an inertial navigation sensor and an ARM (Advanced RISC Machines) microcontroller are mounted on the mining truck. While the driver drives the truck, the accelerator pedal travel, the accelerator pedal angular velocity, the speed and longitudinal acceleration of the truck, the gradient of the driving surface, the position, and similar data are collected and stored at a sampling frequency of 2 Hz. Table 1 lists the parameters collected by the sensors while 11 drivers drove the same mining truck on the same experimental road during actual transportation operations, over a total distance of about 650 km, as follows:

TABLE 1

In the embodiment of the application, factors such as GPS signal shielding or other electromagnetic interference may cause the sensor to output erroneous or invalid data. To keep such data from distorting the learning result of the machine learning algorithm, the data collected by the sensor needs to be processed (for example, by data extraction and data deletion) before cluster analysis is performed, which improves the accuracy of machine learning. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

In the embodiment of the application, two inertial navigation sensors are used, referred to as sensor No. 1 and sensor No. 2. Sensor No. 1 mainly collects the accelerator pedal travel and accelerator pedal angular velocity of the mining truck, and sensor No. 2 mainly collects the speed, longitudinal acceleration, position and driving surface gradient of the mining truck. Each inertial navigation sensor is a ten-axis inertial navigation sensor that integrates a high-precision gyroscope, an accelerometer, a GPS module and other modules into a combined GPS-IMU navigation unit. The unit offers high precision, low cost, low power consumption and small size, and can accurately measure parameters such as the longitudinal acceleration and speed of the mining truck, the GPS precision (that is, the position precision when sensor No. 2 collects the position of the mining truck), and the accelerator pedal angular velocity. The performance parameters of the ten-axis inertial navigation sensor are shown in Table 2 below:

TABLE 2

In the embodiment of the application, the No. 1 sensor is firmly installed on the back surface of an accelerator pedal of the mining truck along the X-axis direction, and the No. 2 sensor is firmly installed in the horizontal position (or approximate horizontal position) in a cab along the Y-axis direction. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

In the embodiment of the application, when the sensor data is extracted, the data collected by sensor No. 1 and sensor No. 2 is stored separately on the SD card, tagged with the sensor serial number and the time. A sensor data fusion program developed in Python splices the data collected by sensor No. 1 and sensor No. 2 at the same moment to produce a complete driving behavior data sample. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.
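The application does not show its fusion program; the following is a minimal sketch of splicing the two sensor streams on a shared timestamp. The field names, the timestamp format and the rule of dropping rows without a partner are all assumptions made for illustration.

```python
def fuse_by_timestamp(sensor1_rows, sensor2_rows):
    """Join rows of the two sensors that carry the same timestamp."""
    by_time = {row["time"]: row for row in sensor2_rows}
    fused = []
    for row in sensor1_rows:
        match = by_time.get(row["time"])
        if match is not None:  # keep only moments recorded by both sensors
            fused.append({**row, **match})
    return fused


# Hypothetical records at the 2 Hz sampling rate (0.5 s apart).
sensor1 = [
    {"time": "08:00:00.0", "pedal_travel": 12.0, "pedal_rate": 0.4},
    {"time": "08:00:00.5", "pedal_travel": 15.0, "pedal_rate": 0.6},
    {"time": "08:00:01.0", "pedal_travel": 14.0, "pedal_rate": -0.2},
]
sensor2 = [
    {"time": "08:00:00.0", "speed": 18.2, "accel": 0.10},
    {"time": "08:00:00.5", "speed": 18.6, "accel": 0.12},
]

fused = fuse_by_timestamp(sensor1, sensor2)
for row in fused:
    print(row)
```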

In the embodiment of the application, driving style classification and identification are established for the dynamic transportation process of the mining truck, so when the sensor data is pruned, samples with a speed of 0 are removed (a speed of 0 indicates that the mining truck is stationary). To account for errors caused by road bumps while the mining truck is running, a threshold is set on its operating speed: samples with a speed above 45 km/h are treated as abnormal data and removed. Since the longitudinal acceleration of the mining truck is limited by its deadweight and load, which together total about 230 tons, its acceleration generally does not exceed 0.55 m/s²; acceleration outliers in the sensor data are therefore also removed. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.
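The three deletion rules above can be written as a simple filter. The thresholds come from the text; the field names are hypothetical, and applying the acceleration limit to the absolute value (so that strong decelerations are also dropped) is an assumption.

```python
MAX_SPEED_KMH = 45.0   # speed threshold from the text
MAX_ACCEL_MS2 = 0.55   # longitudinal acceleration limit from the text


def clean_samples(rows):
    """Drop stationary, over-speed and acceleration-outlier samples."""
    kept = []
    for row in rows:
        if row["speed"] == 0:
            continue  # truck is stationary
        if row["speed"] > MAX_SPEED_KMH:
            continue  # treated as abnormal data
        if abs(row["accel"]) > MAX_ACCEL_MS2:
            continue  # assumption: limit applies in both directions
        kept.append(row)
    return kept


raw = [
    {"speed": 0.0, "accel": 0.00},
    {"speed": 22.5, "accel": 0.20},
    {"speed": 47.0, "accel": 0.10},
    {"speed": 30.0, "accel": 0.80},
    {"speed": 35.0, "accel": -0.30},
]
cleaned = clean_samples(raw)
print(cleaned)
```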

FIG. 4 is a diagram of accelerator pedal travel for a mining truck in a heavy-load operation state and a no-load operation state; FIG. 5 is a graph of the travel speed of a mining truck in a heavy-load operation state and a no-load operation state; wherein load represents the heavy-load operation state and unload represents the no-load operation state. As shown in FIG. 4 and FIG. 5, the accelerator pedal travel and the speed of the mining truck differ greatly between the heavy-load and no-load operation states. Therefore, the data collected on each trip transporting stripped rock from the loading point to the dump unloading point is taken as a driving behavior data sample of the heavy-load operation state, and the data collected on each empty return trip along the same route from the dump unloading point to the loading point is taken as a driving behavior data sample of the no-load operation state. The data of the 11 drivers collected by the sensors is divided into 111 driving behavior data samples in the heavy-load operation state and 108 driving behavior data samples in the no-load operation state. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

Fig. 6 is a schematic flowchart of step S301 in a driving style identification model construction method according to some embodiments of the present application; as shown in fig. 6, the obtaining the number of driving style categories of the driver by performing cluster analysis on the driving style of the driver according to the driving behavior data sample of the driver obtained in advance includes:

step S311, based on a preset correlation analysis model, performing correlation analysis on the selected driving style characteristic parameters of the driver, and removing redundant driving style characteristic parameters in a driving behavior data sample of the driver obtained in advance according to the correlation analysis result;

in the embodiment of the application, in order to classify and identify the driving style of the driver of the open-pit mining truck, characteristic parameters that can represent the driving style of the driver are determined first. Typically, statistical values (maximum value, average value, standard deviation) of the accelerator pedal travel, the accelerator pedal angular velocity, the speed of the mining truck, the longitudinal acceleration, and the like are selected as the driving style characteristic parameters, as shown in Table 3 below.

TABLE 3
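As an illustration of how the statistical values named above (maximum, average, standard deviation) could be derived for one signal of one trip, here is a short sketch using the Python standard library. The signal name and values are hypothetical, and the use of the sample standard deviation rather than the population one is an assumption.

```python
import statistics


def trip_features(signal_name, values):
    """Maximum, mean and sample standard deviation of one trip's signal."""
    return {
        f"{signal_name}_max": max(values),
        f"{signal_name}_mean": statistics.fmean(values),
        f"{signal_name}_std": statistics.stdev(values),
    }


# Hypothetical speed readings (km/h) from a single trip.
speeds = [10.0, 20.0, 30.0]
features = trip_features("speed", speeds)
print(features)
```

Repeating this for each recorded signal and concatenating the results would give one feature vector per trip.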

In the embodiment of the present application, when multicollinearity exists among the driving style characteristic parameters, the correlated parameters carry more weight in the Euclidean distance calculation and therefore have a larger influence on the accuracy of the driving style classification. Correlation analysis therefore needs to be performed on the driving style characteristic parameters of the driver. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

Fig. 7 is a schematic flowchart of step S311 in a driving style identification model construction method according to some embodiments of the present application; as shown in fig. 7, the performing a correlation analysis on the selected driving style characteristic parameters of the driver based on a preset correlation analysis model, and removing redundant driving style characteristic parameters in the driving behavior data sample of the driver according to the result of the correlation analysis includes:

Step S311A, performing correlation analysis on the selected driving style characteristic parameters based on a preset correlation analysis model to obtain a correlation coefficient between the driving style characteristic parameters;

In the embodiment of the application, the correlation coefficient represents the degree of correlation among the selected driving style characteristic parameters, and whether redundancy exists among different driving style characteristic parameters is determined by calculating the correlation coefficient. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

In the embodiment of the present application, the correlation analysis model is a calculation model of the Pearson correlation coefficient, defined as shown in formula (1) below:

$$r = \frac{\sum_{i}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i}(x_i - \bar{x})^2}\,\sqrt{\sum_{i}(y_i - \bar{y})^2}} \tag{1}$$

wherein r represents the correlation coefficient, x and y represent two different driving style characteristic parameters, x_i and y_i represent the values of the driving style characteristic parameters, and $\bar{x}$ and $\bar{y}$ represent the averages of the driving style characteristic parameters. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

Fig. 8 is a correlation coefficient heat map of the driving behavior characteristic parameters of a mining truck in the heavy-load operation state, provided in accordance with some embodiments of the present application; fig. 9 is a correlation coefficient heat map of the driving behavior characteristic parameters of a mining truck in the no-load operation state, provided in accordance with some embodiments of the present application; as shown in fig. 8 and 9, the Pearson correlation coefficients calculated by the correlation analysis model make clear the degree of linear correlation between different driving style characteristic parameters. The Pearson correlation coefficient lies in the range [-1, 1]; the larger its absolute value, the stronger the correlation between two different driving style characteristic parameters, and the closer its absolute value is to 0, the weaker the correlation between them. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.
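The Pearson coefficient described above can be computed directly from two columns of characteristic parameter values. The following is a minimal sketch in plain Python; the function name `pearson_r` and the sample values are illustrative assumptions and are not taken from the application.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length value lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of products of deviations (numerator) and sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant parameter")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values of two characteristic parameters over five samples.
pedal_mean = [0.21, 0.35, 0.40, 0.52, 0.66]
speed_mean = [10.5, 14.0, 15.5, 19.0, 24.0]
r = pearson_r(pedal_mean, speed_mean)
```

Computing `pearson_r` for every pair of parameters yields the matrix that a correlation heat map visualizes.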

Step S311B, removing redundant driving style characteristic parameters in the driving behavior data sample of the driver obtained in advance according to the correlation coefficient and a preset correlation coefficient threshold value.

In the embodiment of the present application, when the Pearson correlation coefficient is in the range (0.6, 0.8), a strong correlation is considered to exist between two different driving style characteristic parameters. For example, the Pearson correlation coefficients between the driving style characteristic parameters of the mining truck in the heavy-load operation state are all less than 0.8, which indicates that the driving style characteristic parameters are strongly independent of one another; the Pearson correlation coefficient between the angular velocity average value (wx3_mean) and the angular velocity standard deviation (wx3_std) of the mining truck in the no-load operation state is 0.94, which indicates that the angular velocity average value (wx3_mean) and the angular velocity standard deviation (wx3_std) have an extremely strong positive correlation in the no-load operation state.

In some optional embodiments, the removing, according to the correlation coefficient and a preset correlation coefficient threshold, redundant driving style characteristic parameters in a driving behavior data sample obtained in advance specifically includes: and comparing the correlation coefficient with a preset correlation coefficient threshold, and removing redundant driving style characteristic parameters in a driving behavior data sample obtained in advance according to a comparison result. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.
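The threshold comparison can be sketched as a greedy filter over pairwise correlation coefficients. This is an illustration under stated assumptions: the function name `drop_redundant`, the threshold value 0.8, the parameter name `speed_mean`, and every coefficient other than the 0.94 quoted in the text are hypothetical.

```python
def drop_redundant(features, corr, threshold=0.8):
    """Keep a feature only if it is not strongly correlated with one already kept.

    features: ordered list of parameter names (earlier names take priority).
    corr: dict mapping frozenset({a, b}) to the Pearson coefficient of a and b.
    threshold: assumed preset correlation coefficient threshold.
    """
    kept = []
    for name in features:
        # Missing pairs are treated as uncorrelated (0.0) in this sketch.
        if all(abs(corr.get(frozenset((name, other)), 0.0)) < threshold
               for other in kept):
            kept.append(name)
    return kept


# wx3_std is listed first so that it is the one retained from the correlated pair.
features = ["wx3_std", "wx3_mean", "speed_mean"]
corr = {
    frozenset(("wx3_std", "wx3_mean")): 0.94,    # value quoted in the text
    frozenset(("wx3_std", "speed_mean")): 0.31,  # hypothetical
    frozenset(("wx3_mean", "speed_mean")): 0.28, # hypothetical
}
selected = drop_redundant(features, corr)
```

Which member of a redundant pair is dropped depends on the order of `features`; the text itself only states that one of the two parameters is removed.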

In the embodiment of the application, the Pearson correlation coefficients between different driving style characteristic parameters calculated by the correlation analysis model are compared with a preset correlation coefficient threshold value to determine the degree of correlation between the parameters. Two driving style characteristic parameters with an extremely strong correlation are redundant, and one of them needs to be removed. For example, the Pearson correlation coefficient between the angular velocity average value (wx3_mean) and the angular velocity standard deviation (wx3_std) of the mining truck in the no-load operation state is 0.94, which indicates an extremely strong positive correlation between them in the no-load operation state; the angular velocity average value (wx3_mean) can therefore be removed from the driving behavior data sample and the angular velocity standard deviation (wx3_std) retained. Table 4 shows the driving style characteristic parameters of the mining truck in the heavy-load operation state and the no-load operation state after redundancy is removed according to the correlation coefficient heat maps of fig. 8 and 9. Table 4 is as follows:

TABLE 4

Table 5 shows the driving behavior data samples of the mining truck in the heavy-load operation state after the redundant driving style characteristic parameters are removed based on the correlation coefficient heat maps of fig. 8 and 9; table 5 is as follows:

TABLE 5

Table 6 shows the driving behavior data samples of the mining truck in the no-load operation state after the redundant driving style characteristic parameters are removed based on the correlation coefficient heat maps of fig. 8 and 9; table 6 is as follows:

TABLE 6

Step S321, based on a preset clustering algorithm model, obtaining a clustering result of the driving behavior data samples according to the driving behavior data samples with redundant driving style characteristic parameters removed;

in the embodiment of the application, based on a preset clustering algorithm model, a driving style clustering center is determined according to the driving behavior data sample without redundant driving style characteristic parameters, a clustering result of the driving behavior data sample is obtained, and the driving style of a driver is classified. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

In the embodiment of the application, a driving behavior data sample is denoted by X_i, and the driving behavior data sample T contains data collected by n sensors, namely X_i = {x_i1, x_i2, ……, x_in}. The data collected by the n sensors are clustered into k classes (k is a natural number), and the clustering centers are denoted by c₁, c₂, ……, c_k respectively. The calculation model of the clustering center is shown in formula (2) below:

wherein j is (1, k), and j is a natural number;

n represents the number of sensors for acquiring data of the mining truck;

u represents the number of centers in each class.

The calculation model of the error criterion function is shown in equation (3) below:

where J denotes an error criterion function, which is represented by … … in the driving style classification. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.
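For reference, the textbook K-means forms of a clustering center and of the within-cluster error criterion are given below. This is a sketch that assumes the calculation models follow the standard definitions; the set S_j of samples assigned to class j and its size u_j are notation introduced here for illustration.

```latex
c_j = \frac{1}{u_j} \sum_{X_i \in S_j} X_i, \qquad j = 1, 2, \dots, k

J = \sum_{j=1}^{k} \sum_{X_i \in S_j} \lVert X_i - c_j \rVert^{2}
```

Under these definitions each center is the mean of the samples in its class, and J is the total squared Euclidean distance from every sample to its own center.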

In some optional embodiments, the obtaining, based on the preset clustering algorithm model, a clustering result of the driving behavior data sample according to the driving behavior data sample from which the redundant driving style characteristic parameter is removed specifically includes: and based on a preset clustering algorithm model, carrying out clustering analysis on the driving behavior data samples with redundant driving style characteristic parameters removed to obtain a clustering result of the driving behavior data samples. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

In the embodiment of the present application, the clustering algorithm models that can be adopted include: a distance-based K-means clustering algorithm model, a hierarchical clustering algorithm model, a fuzzy clustering algorithm model, and a density-based spatial clustering algorithm model (e.g., Density-Based Spatial Clustering of Applications with Noise (DBSCAN)). It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

In an embodiment of the present application, the driving behavior data samples include: driving behavior data samples under heavy load operation state, and driving behavior data samples under no load operation state. And when the clustering analysis is carried out on the driving behavior data samples without the redundant driving style characteristic parameters based on a preset clustering algorithm model to obtain the clustering result of the driving behavior data samples, respectively fitting the driving behavior data samples without the redundant driving style characteristic parameters under the heavy-load operation state and the driving behavior data samples without the redundant driving style characteristic parameters under the no-load operation state based on the K-means clustering algorithm model to obtain the clustering result of the driving behavior data samples. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

In the embodiment of the application, the driving behavior data samples are fitted based on a K-means (K-means) clustering algorithm model, so that the operating efficiency and the accuracy of the classified number of the driving styles can be effectively improved. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

In the embodiment of the application, the K-means clustering algorithm uses the within-cluster sum of squared errors as the objective function for clustering: sample data of the same driving style have a small within-cluster sum of squared errors and a high degree of similarity and are assigned to the same cluster, whereas sample data of different driving styles have a large sum of squared errors and a low degree of similarity and are assigned to different clusters. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.
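The K-means procedure with a within-cluster sum of squared errors objective can be sketched in plain Python as follows. This is an illustration rather than the implementation used in the application: the evenly spaced initial centers, the function name `kmeans`, and the two-dimensional sample points are assumptions made for the example.

```python
import math


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; returns (centers, labels, sse)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # Assumed deterministic initialisation: k points evenly spaced in the input.
    step = len(points) // k
    centers = [list(points[i * step]) for i in range(k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        # Update step: each center becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    # Within-cluster sum of squared errors (the clustering objective).
    sse = sum(math.dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))
    return centers, labels, sse


# Hypothetical (pedal stroke mean, speed mean) pairs forming three groups.
points = [
    (0.20, 10.0), (0.22, 11.0), (0.21, 10.5),
    (0.50, 20.0), (0.52, 21.0), (0.51, 20.5),
    (0.80, 30.0), (0.82, 31.0), (0.81, 30.5),
]
centers, labels, sse = kmeans(points, k=3)
```

The default `max_iter=100` mirrors the maximum iteration count mentioned in the text for the cluster analysis.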

Step S331, based on elbow rules, determining the number of driving style categories of the driver according to the clustering result;

in some optional embodiments, when the number of driving style categories of the driver is determined according to the clustering result based on the elbow rule, the clustering result is fitted based on a K-means clustering algorithm model, and the number of driving style categories of the driver is determined based on the elbow rule according to the fitting result. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

In the embodiment of the application, based on a K-means clustering algorithm model, the driving behavior data samples in the heavy-load operation state and the no-load operation state are fitted separately, and the number of driving style categories is then determined by the elbow rule. FIG. 10 is a schematic illustration of determining the number of driving style categories of a mining truck in the heavy-load operation state using the elbow rule, provided in accordance with some embodiments of the present application; fig. 11 is a schematic illustration of determining the number of driving style categories of a mining truck in the no-load operation state using the elbow rule, provided in accordance with some embodiments of the present application; as shown in fig. 10 and 11, in both the heavy-load operation state and the no-load operation state, when the number of clustering centers is 3 the rate of decrease of the within-cluster sum of squared errors changes markedly and the curve declines slowly thereafter, so the number of driving style clustering centers of the mining truck in the heavy-load operation state and the no-load operation state is 3. That is, the number of driving style categories of the mining truck is 3 in the heavy-load operation state and 3 in the no-load operation state. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.
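The elbow rule is usually read off a plot, but the same idea can be sketched numerically by looking for the cluster count at which the decrease in the within-cluster sum of squared errors slows down the most. The second-difference heuristic, the function name `elbow_k`, and the error values below are assumptions for illustration and are not data from the application.

```python
def elbow_k(sse_by_k):
    """Return the k with the largest drop-off in SSE improvement.

    sse_by_k: dict mapping consecutive cluster counts k to the within-cluster
    sum of squared errors obtained with that k.
    """
    ks = sorted(sse_by_k)
    if len(ks) < 3:
        raise ValueError("at least three cluster counts are needed")
    best_k, best_bend = None, float("-inf")
    for prev_k, k, next_k in zip(ks, ks[1:], ks[2:]):
        gain_before = sse_by_k[prev_k] - sse_by_k[k]   # improvement reaching k
        gain_after = sse_by_k[k] - sse_by_k[next_k]    # improvement beyond k
        bend = gain_before - gain_after
        if bend > best_bend:
            best_k, best_bend = k, bend
    return best_k


# Hypothetical SSE curve: steep decline up to k = 3, nearly flat afterwards.
example_sse = {1: 900.0, 2: 400.0, 3: 120.0, 4: 100.0, 5: 90.0}
chosen_k = elbow_k(example_sse)
```

In practice the SSE values would come from fitting the clustering model once per candidate k on the heavy-load and no-load samples separately.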

In the embodiment of the application, cluster analysis is performed on the driving style of the driver of the mining truck in the heavy-load operation state and the no-load operation state based on a K-means clustering algorithm model. With the number of clustering centers set to 3, the maximum number of iterations set to 100, and so on, unsupervised cluster analysis is performed separately on the driving behavior data in the heavy-load operation state and the no-load operation state, so that the driving style of the driver in each state is classified. Table 7 shows the unsupervised cluster analysis results of the mining truck in the no-load operation state; table 8 shows the unsupervised cluster analysis results of the mining truck in the heavy-load operation state. As can be seen from table 7, in the no-load operation state of the mining truck, the cluster centers of the driving style characteristic parameters related to the accelerator pedal stroke, the accelerator pedal angular velocity, the mining truck speed and the like are the largest in Cluster2 and the smallest in Cluster0, and the distribution of the different driving style characteristic parameters follows the expected pattern: the median and the upper quartile of the accelerator pedal stroke for the aggressive driving style are larger than those for the normal and mild driving styles, and the accelerator pedal strokes of the mild driving style are mostly distributed at low levels. Therefore, the driving style of the mining truck driver in the no-load operation state can be divided into three categories: normal (Cluster0), mild (Cluster1), aggressive (Cluster2).
In the same manner, as can be seen from table 8, in the heavy-load operation state of the mining truck, the distribution pattern of the characteristic parameters related to the accelerator pedal angular velocity and the mining truck speed is relatively obvious, and the driving style of the driver of the mining truck in the heavy-load operation state is classified into three categories according to the driving style characteristic parameters related to the accelerator pedal angular velocity and the mining truck speed (i.e., statistical values (maximum value, average value, standard deviation) of the accelerator pedal stroke, the accelerator pedal angular velocity, the speed, the longitudinal acceleration, and the like): normal (Cluster2), mild (Cluster0), aggressive (Cluster1). It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

TABLE 7

TABLE 8

Step S302, based on a random forest algorithm, constructing a driving style identification model of the driver according to the driving style category number and the driving behavior data sample;

In the embodiment of the application, the samples of the three driving styles of the mining truck are unevenly distributed in both the heavy-load state and the no-load state, so the data form an imbalanced data set (imbalanced dataset); a driving style identification model constructed by a random forest algorithm is not prone to overfitting and has stronger generalization capability. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

Fig. 12 is a schematic flowchart of step S302 in a method for constructing a driving style identification model based on machine learning according to some embodiments of the present application; as shown in fig. 12, the constructing a driving style identification model of the driver according to the driving style category number and the driving behavior data sample based on the random forest algorithm includes:

Step S312, generating Z decision trees according to the driving style category number and the driving behavior data sample based on a random forest algorithm, wherein Z is a positive integer and is larger than 2;

In the embodiment of the application, a resampling technique (for example, the bootstrap sampling method) is adopted: r samples are randomly drawn with replacement from the training set to train one decision tree. (The driving behavior sample data are divided into a test set and a training set at a ratio of 3:7; the training set is used for training the decision trees, and the test set is used for testing the trained decision trees.) When a node of the decision tree is split, p features are randomly selected from the driving style characteristic parameters, the Gini index of every possible split of each selected driving style characteristic parameter is calculated, and the split with the minimum Gini index is chosen to divide the node; that is, the decision tree splits each node on the candidate with the minimum Gini index. The Gini index is calculated according to formula (4). Equation (4) is as follows:

$$\mathrm{Gini}(T) = 1 - \sum_{n=1}^{N} \left( \frac{|c_n|}{|T|} \right)^{2} \tag{4}$$

wherein N is the number of driving style categories, T is the driving behavior data sample set, and c_n is the set of driving behavior sample data belonging to the n-th driving style category.
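The Gini index of a set of labelled samples, and the weighted Gini index used to compare candidate splits, can be sketched as follows. This is a minimal illustration assuming the standard definition of Gini impurity; the function names and the label lists are hypothetical.

```python
from collections import Counter


def gini_index(labels):
    """Gini impurity of a list of class labels: 1 minus the sum of squared class shares."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((count / total) ** 2 for count in counts.values())


def split_gini(left_labels, right_labels):
    """Size-weighted Gini impurity of the two child nodes of a candidate split."""
    total = len(left_labels) + len(right_labels)
    return (len(left_labels) / total) * gini_index(left_labels) + (
        len(right_labels) / total
    ) * gini_index(right_labels)


# Hypothetical node with three driving style categories.
node = ["normal", "normal", "mild", "mild", "aggressive", "aggressive"]
impurity = gini_index(node)
# A candidate split that isolates the aggressive samples lowers the impurity.
after_split = split_gini(node[:4], node[4:])
```

A tree builder would evaluate `split_gini` for each candidate threshold of each randomly selected parameter and keep the split with the smallest value.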

In the embodiment of the present application, the driving style categories of the mining truck are 3 categories, that is, N is 3, which are: normal, mild, aggressive. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

In the embodiment of the application, the Z decision trees are trained based on the bootstrap sampling method, and the Z decision trees of the random forest can be guaranteed to be different. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

Step S322, constructing a driving style identification model of the driver according to the Z decision trees based on majority voting.

In the embodiment of the application, after the Z decision trees are obtained, a random forest model for identifying the driving style of the driver of the mining truck, namely the driving style identification model, is formed based on majority voting. Because a single decision tree is prone to overfitting, the random forest uses a voting mechanism over multiple decision trees to improve on the performance of the individual trees and thereby the accuracy of the driving style identification model. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.
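The two ingredients that turn individual decision trees into a random forest, bootstrap sampling with replacement and majority voting, can be sketched as follows. The function names are hypothetical, and the per-tree predictions are made-up values standing in for the outputs of Z trained trees.

```python
import random
from collections import Counter


def bootstrap_sample(samples, r, rng):
    """Draw r samples with replacement, so the same sample may appear twice."""
    return [rng.choice(samples) for _ in range(r)]


def majority_vote(tree_predictions):
    """Return the class predicted by the most trees.

    Ties are resolved in favour of the class that appears first in the list.
    """
    return Counter(tree_predictions).most_common(1)[0][0]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
training_ids = list(range(10))
one_tree_sample = bootstrap_sample(training_ids, r=10, rng=rng)

# Hypothetical predictions of Z = 5 trees for a single driving behavior sample.
votes = ["mild", "aggressive", "mild", "normal", "mild"]
forest_prediction = majority_vote(votes)
```

Because every tree sees a different bootstrap sample and a different random subset of parameters at each split, the trees disagree in different places, which is what makes the vote useful.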

In the embodiment of the application, the driving style of the driver is subjected to cluster analysis by obtaining a driving behavior data sample of the driver in advance, and a driving style identification model is constructed based on a random forest algorithm according to the obtained driving style category number and the driving behavior data sample, so that the driving style of the driver is accurately and effectively identified, the driving habit of the driver is specifically guided, and the purpose of improving the fuel economy of the mining truck is achieved. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

In some optional embodiments, after the constructing the driving style identification model of the driver according to the driving style category number and the driving behavior data sample based on the random forest algorithm, the method further includes: and dividing the driving behavior data samples according to a preset proportion, and optimizing parameters of the driving style identification model based on ten-fold cross validation grid search according to the dividing results of the driving behavior data samples to obtain an optimized driving style identification model. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

In the embodiment of the application, driving behavior data samples in a heavy-load operation state are divided into a heavy-load test set and a heavy-load training set according to a preset ratio (for example, 3:7), wherein the ratio of the heavy-load test set to the heavy-load training set is 3: 7; dividing driving behavior data samples in an idle operation state into an idle test set and an idle training set according to a preset proportion (for example, 3:7), wherein the proportion of the idle test set to the idle training set is 3: 7. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.
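A 3:7 division into test and training sets can be sketched as below. Shuffling before the split, the fixed seed, rounding to the nearest whole sample, and the function name `split_train_test` are assumptions of this sketch; the sample counts 111 and 108 are the ones given earlier in the description.

```python
import random


def split_train_test(samples, test_ratio=0.3, seed=0):
    """Shuffle a copy of the samples and split it into (train, test)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


# Sample indices stand in for the heavy-load and no-load data samples.
heavy_train, heavy_test = split_train_test(range(111))
empty_train, empty_test = split_train_test(range(108))
```

With these counts the split yields 78 training and 33 test samples for the heavy-load state, and 76 training and 32 test samples for the no-load state.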

In the embodiment of the present application, the grid search parameter optimization table of the driving style identification model is shown in table 9 below:

TABLE 9

In the embodiment of the application, the generalization capability and the identification precision of the driving style identification model are improved by optimizing the parameters of the driving style identification model, the driving style of a driver of the mining truck can be effectively identified, the driving habit of the driver is guided in a targeted manner, and the fuel economy of the mining truck is improved. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

Step S122, obtaining the prediction result according to the test data sample based on the driving style identification model.

In the embodiment of the application, the training data samples are used for fitting the driving style identification model to complete the training of the driving style identification model, and the test data samples are predicted based on the trained driving style identification model to obtain the prediction result of the driving style of the driver. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

Step S103, evaluating the model generalization capability and the model precision of the driving style identification model according to the prediction result and the test data sample.

In the embodiment of the application, based on the driving style identification model, the prediction result of the driving style of the driver is obtained according to the test data sample, the prediction result is compared with the test data sample to find out the difference, and the evaluation of the model generalization capability and the model precision of the driving style identification model is realized. Therefore, the driving style identification model is established through the same driving behavior data sample, and meanwhile, the driving style identification model is evaluated, so that the identification effectiveness of the driving style identification model is effectively improved, and the driving style identification model is more stable and comprehensive. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

Fig. 13 is a flowchart illustrating step S103 of the driving style identification model evaluation method based on machine learning according to some embodiments of the present application; as shown in fig. 13, the evaluating the model generalization ability and the model accuracy of the driving style identification model according to the prediction result and the test data sample includes:

step S113, calculating the accuracy of the average classification of the driving style identification model according to the prediction result and the test data sample, wherein the accuracy of the average classification represents the model generalization capability of the driving style identification model;

In the embodiment of the application, the driving behavior data samples are divided into training data samples and test data samples by ten-fold cross-validation, the test data samples are predicted based on the trained driving style identification model, and the average classification accuracy is calculated from the result of each iteration of the driving style identification model. For example, based on the Scikit-learn machine learning platform, the optimal parameter combination found by grid search is used as the identification model parameters, and the generalization capability of the driving style identification models constructed for the heavy-load and no-load states of the mining truck is evaluated: in the heavy-load operation state of the mining truck, the average cross-validation score of the driving style identification model is 97%; in the no-load operation state of the mining truck, the average cross-validation score of the driving style identification model is 89%. The generalization capability of the random-forest-based driving style identification model is therefore excellent, and the driving style of a driver can be effectively identified. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.
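The averaging of classification accuracy over ten folds can be sketched in plain Python as follows. This is an illustration, not the Scikit-learn routine used in the example above: the folds are contiguous and unshuffled, and `fit_predict` is a hypothetical callable that trains a model on the training fold and returns predicted labels for the test fold.

```python
def k_fold_indices(n_samples, n_folds=10):
    """Return a list of (train_indices, test_indices) pairs, one per fold."""
    if n_samples < n_folds:
        raise ValueError("need at least one sample per fold")
    base, extra = divmod(n_samples, n_folds)
    folds, start = [], 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)  # spread the remainder
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train_idx, test_idx))
        start += size
    return folds


def cross_val_accuracy(features, labels, fit_predict, n_folds=10):
    """Mean accuracy over the folds; fit_predict(train_x, train_y, test_x) -> labels."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(labels), n_folds):
        predicted = fit_predict(
            [features[i] for i in train_idx],
            [labels[i] for i in train_idx],
            [features[i] for i in test_idx],
        )
        correct = sum(p == labels[i] for p, i in zip(predicted, test_idx))
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)


def always_mild(train_x, train_y, test_x):
    """Placeholder model for the sketch: predicts 'mild' for every test sample."""
    return ["mild"] * len(test_x)


demo_labels = ["mild"] * 10 + ["aggressive"] * 10
demo_score = cross_val_accuracy(list(range(20)), demo_labels, always_mild)
```

Replacing `always_mild` with a function that fits a random forest on the training fold gives the average cross-validation score discussed in the text.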

Step S123, evaluating the model precision of the driving style identification model according to the prediction result and the actual result; and the actual result is a clustering result of the driving behavior data sample.

In the embodiment of the application, whether the prediction result of the driving style identification model is accurate or not can be determined by comparing the prediction result with the actual result. The prediction result is consistent with the actual result, which indicates that the prediction of the driving style identification model is correct, and the prediction result is inconsistent with the actual result, which indicates that the prediction of the driving style identification model is incorrect. For example, based on a Scikit-learn machine learning platform, the optimal parameter combination of grid search is used as an identification model parameter, and when the accuracy of the constructed identification model of the driving style of the driver in the heavy load and no-load states of the mining truck is evaluated, the overall accuracy of the prediction result of the identification model of the driving style is 95.49% under the heavy load operation condition of the mining truck; under the condition of no-load operation of the mining truck, the overall accuracy of the prediction result of the driving style identification model is 90.74%. Therefore, the overall accuracy performance of the driving style identification model based on the random forest is excellent, and the driving style of a driver can be effectively identified. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

In some optional embodiments, the estimating, according to the predicted result and the actual result, the model accuracy of the driving style identification model specifically includes: and performing difference comparison on the prediction result and the actual result of the driving style identification model based on a confusion matrix to obtain a difference result, wherein the difference result represents the model precision of the driving style identification model. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

In the embodiment of the application, the overall accuracy of the driving style identification model is calculated through a confusion matrix, the prediction result and the actual result of the driving style identification model are displayed through the confusion matrix, and the model accuracy of the driving style identification model is evaluated by using the false negative examples (FN) and the true positive examples (TP) in the confusion matrix, so that the driving style identification model is comprehensively evaluated. For example, the model accuracy of the driving style identification model is evaluated by a confusion matrix of the mining truck in the heavy-load operation state (as shown in fig. 14) and a confusion matrix of the mining truck in the no-load operation state (as shown in fig. 15). It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

In the embodiment of the application, other binary classification evaluation methods can also be adopted to evaluate the model accuracy of the driving style identification model. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

In the embodiment of the present application, the definition of the confusion matrix is shown in table 10. Table 10 is as follows:

TABLE 10

Wherein TN represents that the prediction result of the driving style identification model is a true negative example, and FP represents that the prediction result of the driving style identification model is a false positive example. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.
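For three driving style categories, a confusion matrix and the TP, FP, FN and TN counts of each category (treated one-vs-rest) can be sketched as follows, together with metrics commonly derived from them. The function names and the label lists are hypothetical, and the F1 form of the f-score is an assumption of this sketch.

```python
def confusion_matrix(actual, predicted, classes):
    """Rows are actual classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def overall_accuracy(matrix):
    """Share of samples on the diagonal (correct predictions)."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total


def per_class_metrics(matrix, classes):
    """One-vs-rest TP/FP/FN/TN, precision, recall and F1 for each class."""
    total = sum(sum(row) for row in matrix)
    metrics = {}
    for i, c in enumerate(classes):
        tp = matrix[i][i]
        fn = sum(matrix[i]) - tp                  # actual c, predicted other
        fp = sum(row[i] for row in matrix) - tp   # actual other, predicted c
        tn = total - tp - fn - fp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[c] = {"TP": tp, "FP": fp, "FN": fn, "TN": tn,
                      "precision": precision, "recall": recall, "f1": f1}
    return metrics


classes = ["normal", "mild", "aggressive"]
actual = ["mild", "mild", "normal", "aggressive", "aggressive", "normal"]
predicted = ["mild", "normal", "normal", "aggressive", "aggressive", "normal"]
matrix = confusion_matrix(actual, predicted, classes)
metrics = per_class_metrics(matrix, classes)
```

Here the actual results would be the cluster labels of the test samples and the predicted results the outputs of the identification model.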

In some optional embodiments, after training the driving style identification model according to the training data sample and obtaining a prediction result according to the test data sample, before evaluating the model generalization ability and the model accuracy of the driving style identification model according to the prediction result and the test data sample, the method further includes: determining a total evaluation index of the driving style identification model, wherein the total evaluation index comprises: model generalization capability and model accuracy. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

In the embodiment of the application, when the driving behavior data sample is an unbalanced data set and a single driving style identification model is evaluated, indexes such as Accuracy, Precision, Recall and f-score can be selected. Because the index f-score takes both the Precision and the Recall into consideration, it measures the driving style identification model more reliably on unbalanced data than the accuracy alone. Based on the confusion matrix, the index Accuracy is calculated according to formula (5), and formula (5) is as follows:

based on the confusion matrix, the index Precision is calculated according to equation (6), equation (6) is as follows:

based on the confusion matrix, the index Recall is calculated according to equation (7), equation (7) being as follows:

based on the confusion matrix, the index f-score is calculated according to equation (8), equation (8) being as follows:
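For reference, the conventional confusion-matrix forms of the four indexes are sketched below. This assumes that formulas (5) to (8) follow the standard binary definitions and that the f-score is the balanced (F1) variant; the text does not state the weighting between Precision and Recall.

```latex
\begin{align*}
\mathrm{Accuracy}  &= \frac{TP + TN}{TP + TN + FP + FN} \\
\mathrm{Precision} &= \frac{TP}{TP + FP} \\
\mathrm{Recall}    &= \frac{TP}{TP + FN} \\
\text{f-score}     &= \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}
\end{align*}
```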

The evaluation indexes of the single driving style identification model are shown in table 11 and table 12, where table 11 gives the ten-fold cross-validation classification results for the driving styles in the heavy-load operation state, and table 12 gives the ten-fold cross-validation classification results for the driving styles in the no-load operation state. Tables 11 and 12 are as follows:

TABLE 11

TABLE 12

As shown in tables 11 and 12, in the heavy-load operation state of the mining truck, the f-score of every driving style is greater than 0.9, so the single driving style identification model performs excellently; similarly, in the no-load operation state of the mining truck, the accuracy for every single driving style is greater than 88%, so the single driving style identification model performs well. Finally, evaluating the identification capability for each driving style based on the f-score in both the heavy-load and no-load operation states gives the following ranking: the mild driving style is identified best, the normal driving style second, and the aggressive driving style worst. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.
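A per-style comparison of this kind can be sketched as follows. The counts are hypothetical and are not the values of tables 11 and 12; the helper `prf` and the balanced F1 form of the f-score are assumptions made for illustration.

```python
def prf(tp, fp, fn):
    """Precision, recall and balanced f-score from one-vs-rest counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_score


# Hypothetical one-vs-rest counts per driving style (illustrative only).
counts = {
    "mild": {"tp": 48, "fp": 2, "fn": 2},
    "normal": {"tp": 45, "fp": 5, "fn": 5},
    "aggressive": {"tp": 40, "fp": 10, "fn": 10},
}

scores = {style: prf(**c) for style, c in counts.items()}

# Rank the styles from best to worst identified, using the f-score.
ranking = sorted(scores, key=lambda s: scores[s][2], reverse=True)
print(ranking)
```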

Exemplary devices

FIG. 16 is a schematic structural diagram of a driving style identification model evaluation device based on machine learning according to some embodiments of the present application; as shown in fig. 16, the evaluation device includes:

a sample dividing unit 1601 configured to divide the driving behavior data samples obtained in advance into training data samples and test data samples; a model prediction unit 1602, configured to train the driving style identification model according to the training data sample, and obtain a prediction result according to the test data sample; a model evaluation unit 1603 configured to evaluate the model generalization ability and the model accuracy of the driving style identification model according to the prediction result and the test data sample. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

In some optional embodiments, the sample dividing unit 1601 is further configured to divide the driving behavior data samples obtained in advance into the training data samples and the test data samples according to a preset ratio. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.
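Division by a preset ratio can be sketched as below. The 80/20 ratio, the fixed random seed and the function name `split_samples` are assumptions for illustration; the passage above only states that the ratio is preset.

```python
import random


def split_samples(samples, train_ratio=0.8, seed=0):
    """Shuffle a copy of `samples` and split it by `train_ratio`."""
    if not 0.0 < train_ratio < 1.0:
        raise ValueError("train_ratio must be strictly between 0 and 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)   # reproducible shuffle
    cut = int(round(len(shuffled) * train_ratio))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical sample identifiers standing in for driving behavior data samples.
samples = list(range(100))
train, test = split_samples(samples, train_ratio=0.8)
print(len(train), len(test))
```

Shuffling before the cut avoids a split that simply follows the recording order of the samples.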

FIG. 17 is a schematic diagram of a model prediction unit provided in accordance with some embodiments of the present application; as shown in fig. 17, the model prediction unit 1602 includes: a model construction subunit 1612 configured to construct the driving style identification model on the Scikit-learn machine learning platform according to the training data sample, using the optimal hyper-parameter combination found by grid search as the identification model parameters; a prediction subunit 1622 configured to obtain the prediction result according to the test data sample based on the driving style identification model. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

In some optional embodiments, the model construction subunit 1612 is further configured to construct the driving style identification model on the Scikit-learn machine learning platform according to the training data sample by searching for the optimal hyper-parameter combination through a ten-fold cross-validation grid search. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.
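A ten-fold cross-validation grid search of this kind might look as follows in Scikit-learn. This is a sketch under stated assumptions: the data are synthetic, the searched hyper-parameters (`n_estimators`, `max_depth`) and their candidate values are hypothetical because the passage does not list them, and the random forest classifier is taken from the description of the identification model building module.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for driving behavior samples with three style classes.
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=4, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Hypothetical search grid; the source does not name the hyper-parameters.
param_grid = {"n_estimators": [10, 30], "max_depth": [3, 6]}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=10,               # ten-fold cross validation on the training samples
    scoring="accuracy",
)
search.fit(X_train, y_train)

best_params = search.best_params_   # optimal hyper-parameter combination
y_pred = search.predict(X_test)     # prediction result on the test samples
print(best_params, search.best_score_)
```

With the default `refit=True`, `GridSearchCV` retrains on all training samples with the best combination, so `search.predict` uses the tuned model.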

FIG. 18 is a schematic structural diagram of a model construction subunit provided in accordance with some embodiments of the present application; as shown in fig. 18, the model construction subunit 1612 includes: a driving style classification module 1801, configured to perform cluster analysis on the driving style of the driver according to a driving behavior data sample of the driver obtained in advance, so as to obtain the number of driving style categories of the driver; and an identification model building module 1802, configured to build a driving style identification model of the driver according to the driving style category number and the driving behavior data sample based on a random forest algorithm. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

FIG. 19 is a block diagram of a driving style classification module provided in accordance with some embodiments of the present application; as shown in fig. 19, the driving style classification module 1801 includes: a redundant parameter removing submodule 1811 configured to perform correlation analysis on the selected driving style characteristic parameters of the driver based on a preset correlation analysis model, and remove redundant driving style characteristic parameters in a driving behavior data sample of the driver obtained in advance according to the result of the correlation analysis; the cluster analysis submodule 1821 is configured to obtain a cluster result of the driving behavior data sample according to the driving behavior data sample without the redundant driving style characteristic parameters based on a preset cluster algorithm model; the driving style classification submodule 1831 is configured to determine the number of driving style categories of the driver according to the clustering result based on the elbow rule. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

FIG. 20 is a schematic diagram of a redundant parameter removing submodule provided in accordance with some embodiments of the present application; as shown in fig. 20, the redundant parameter removing submodule 1811 includes: a correlation coefficient unit 1811A configured to perform correlation analysis on the selected driving style characteristic parameters based on a preset correlation analysis model to obtain correlation coefficients between the driving style characteristic parameters; and a redundant parameter unit 1811B configured to remove redundant driving style characteristic parameters from the driving behavior data sample of the driver obtained in advance according to the correlation coefficients and a preset correlation coefficient threshold. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.
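Threshold-based removal of redundant characteristic parameters can be sketched as below. The Pearson coefficient is assumed as the correlation analysis model, and the 0.9 threshold, the parameter names and the values are hypothetical; the passage above only refers to a preset model and a preset threshold.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumes neither sequence is constant (non-zero variance).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def drop_redundant(features, threshold=0.9):
    """Keep a parameter only if |r| with every kept parameter is below threshold."""
    kept = []
    for name in features:
        if all(abs(pearson(features[name], features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical characteristic parameters (names and values are illustrative).
features = {
    "accel_pedal_mean": [0.2, 0.4, 0.6, 0.8, 1.0],
    "accel_pedal_max": [0.4, 0.8, 1.2, 1.6, 2.0],   # exactly twice the mean column
    "brake_pedal_mean": [0.5, 0.1, 0.4, 0.2, 0.3],
}
kept = drop_redundant(features, threshold=0.9)
print(kept)
```

Of two strongly correlated parameters, this greedy pass keeps whichever appears first, so the order of the parameters affects which one survives.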

In some optional embodiments, the cluster analysis sub-module 1821 is further configured to perform cluster analysis on the driving behavior data samples without the redundant driving style characteristic parameters based on a preset clustering algorithm model, so as to obtain a clustering result of the driving behavior data samples. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

In some optional embodiments, the driving style classification sub-module 1831 is further configured to fit the clustering result based on a K-means clustering algorithm model, and determine the driving style category number of the driver based on the elbow rule according to the fitting result. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.
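The elbow rule can be illustrated with a small self-contained sketch. It runs a plain K-means (Lloyd's algorithm) for several values of k and picks the k at which the within-cluster sum of squared errors (SSE) bends most sharply. The second-difference criterion, the farthest-point initialisation and the toy points are assumptions for illustration; the elbow is often read visually from a plot, and the application does not specify a numeric criterion.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(cluster)
    return tuple(sum(coord) / n for coord in zip(*cluster))


def kmeans_sse(points, k, iters=100):
    """Run Lloyd's algorithm and return the within-cluster SSE."""
    # Deterministic farthest-point initialisation (an assumption of this sketch).
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous center.
        updated = [mean_point(c) if c else centers[i] for i, c in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def elbow_k(points, k_max=6):
    """Return the k with the largest second difference of the SSE curve."""
    sse = {k: kmeans_sse(points, k) for k in range(1, k_max + 1)}
    bend = {k: sse[k - 1] - 2 * sse[k] + sse[k + 1] for k in range(2, k_max)}
    return max(bend, key=bend.get), sse


# Toy 2-D samples forming three well-separated groups (illustrative only).
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (20, 0), (20, 1), (21, 0), (21, 1),
]
best_k, sse = elbow_k(points)
print(best_k)
```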

FIG. 21 is a block diagram of an identification model building module provided in accordance with some embodiments of the present application; as shown in fig. 21, the identification model building module 1802 includes: a decision tree submodule 1812 configured to generate Z decision trees according to the driving style category number and the driving behavior data sample based on a random forest algorithm, where Z is a positive integer greater than 2; and an identification model sub-module 1822 configured to construct a driving style identification model of the driver according to the Z decision trees based on majority voting. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.
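Majority voting over Z decision trees can be sketched as follows. The three "trees" are hypothetical hand-written rules standing in for trained decision trees, and the parameter names and cut-off values are illustrative only; in a random forest each tree would be trained on a bootstrap sample of the driving behavior data.

```python
from collections import Counter


def majority_vote(trees, sample):
    """Return the class predicted by most of the trees for one sample."""
    votes = Counter(tree(sample) for tree in trees)
    return votes.most_common(1)[0][0]


# Three hypothetical stand-in trees (Z = 3).
trees = [
    lambda s: "aggressive" if s["accel_pedal_mean"] > 0.6 else "mild",
    lambda s: "aggressive" if s["brake_pedal_mean"] > 0.5 else "normal",
    lambda s: "aggressive" if s["accel_pedal_mean"] > 0.7 else "normal",
]

sample = {"accel_pedal_mean": 0.8, "brake_pedal_mean": 0.2}
style = majority_vote(trees, sample)
print(style)
```

When two classes receive the same number of votes, `Counter.most_common` returns the one that was counted first, so a production implementation would need an explicit tie-breaking rule.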

In some optional embodiments, the model construction subunit 1612 further includes: a model optimization module configured to divide the driving behavior data samples according to a preset proportion, optimize the parameters of the driving style identification model by a ten-fold cross-validation grid search according to the division result of the driving behavior data samples, and obtain the optimized driving style identification model. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

FIG. 22 is a schematic structural diagram of a model evaluation unit provided in accordance with some embodiments of the present application; as shown in fig. 22, the model evaluation unit 1603 includes: a generalization ability evaluation subunit 1613, configured to calculate an accuracy of an average classification of the driving style identification model according to the prediction result and the test data sample, wherein the accuracy of the average classification represents a model generalization ability of the driving style identification model; a model accuracy evaluation subunit 1623 configured to evaluate a model accuracy of the driving style identification model according to the prediction result and the actual result; and the actual result is a clustering result of the driving behavior data sample. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.
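The average classification accuracy used as the generalization index can be sketched as below. It assumes that the average is taken over cross-validation folds, which this passage does not spell out; the fold contents and the helper names are hypothetical.

```python
def accuracy(actual, predicted):
    """Share of samples whose predicted style equals the actual style."""
    hits = sum(a == p for a, p in zip(actual, predicted))
    return hits / len(actual)


def mean_fold_accuracy(folds):
    """Average the per-fold accuracy; `folds` holds (actual, predicted) pairs."""
    scores = [accuracy(a, p) for a, p in folds]
    return sum(scores) / len(scores), scores


# Two hypothetical folds (a ten-fold run would supply ten such pairs).
folds = [
    (["mild", "normal", "mild", "aggressive"],
     ["mild", "normal", "mild", "aggressive"]),
    (["normal", "mild", "aggressive", "normal"],
     ["normal", "normal", "mild", "normal"]),
]
mean_acc, fold_scores = mean_fold_accuracy(folds)
print(mean_acc, fold_scores)
```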

In some optional embodiments, the model accuracy evaluation subunit 1623 is further configured to compare the prediction result of the driving style identification model with the actual result based on a confusion matrix to obtain a difference result, where the difference result represents the model accuracy of the driving style identification model. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

The driving style identification model evaluation device based on machine learning provided by the embodiment of the application can implement each process of the driving style identification model evaluation method based on machine learning and achieves the same functions and effects, which are not repeated here.

Exemplary device

FIG. 23 is a schematic structural diagram of an electronic device provided in accordance with some embodiments of the present application; as shown in fig. 23, the electronic apparatus includes:

one or more processors 2301;

a computer readable medium, which may be configured to store one or more programs 2302, wherein the one or more processors 2301, when executing the one or more programs 2302, perform the following steps: dividing driving behavior data samples obtained in advance into training data samples and testing data samples; training the driving style identification model according to the training data sample, and obtaining a prediction result according to the test data sample; and evaluating the model generalization capability and the model precision of the driving style identification model according to the prediction result and the test data sample. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

FIG. 24 is a hardware block diagram of an electronic device provided in accordance with some embodiments of the present application; as shown in fig. 24, the hardware structure of the electronic device may include: a processor 2401, a communications interface 2402, a computer-readable medium 2403, and a communications bus 2404;

the processor 2401, the communication interface 2402 and the computer-readable medium 2403 complete communication with each other through the communication bus 2404;

optionally, the communication interface 2402 may be an interface of a communication module, such as an interface of a GSM module;

the processor 2401 may be specifically configured to: dividing driving behavior data samples obtained in advance into training data samples and testing data samples; training the driving style identification model according to the training data sample, and obtaining a prediction result according to the test data sample; and evaluating the model generalization capability and the model precision of the driving style identification model according to the prediction result and the test data sample. It should be understood that the above description is only exemplary, and the embodiments of the present application do not limit the present invention.

The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), etc., and may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The processor may implement or perform the various methods, steps, and logic blocks disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.

The electronic device of the embodiments of the present application exists in various forms, including but not limited to:

(1) A mobile communication device: such devices have mobile communication capabilities and are primarily aimed at providing voice and data communication. Such terminals include: smart phones (e.g., iPhone), multimedia phones, feature phones, and low-end phones, etc.

(2) An ultra-mobile personal computer device: such devices belong to the category of personal computers, have computing and processing functions, and generally also support mobile internet access. Such terminals include: PDA, MID, and UMPC devices, etc., such as the iPad.

(3) A portable entertainment device: such devices can display and play multimedia content. This type of device includes: audio and video players (e.g., iPod), handheld game consoles, e-book readers, smart toys, and portable car navigation devices.

(4) A server: a device that provides computing services and comprises a processor, a hard disk, a memory, a system bus and the like. A server is similar in architecture to a general-purpose computer, but because it must provide highly reliable services, it has higher requirements on processing capacity, stability, reliability, security, scalability, manageability and the like.

(5) Other electronic devices with data interaction functions.

It should be noted that, according to the implementation requirement, each component/step described in the embodiment of the present application may be divided into more components/steps, or two or more components/steps or partial operations of the components/steps may be combined into a new component/step to achieve the purpose of the embodiment of the present application.

The above-described methods according to embodiments of the present application may be implemented in hardware or firmware, or as software or computer code storable in a recording medium such as a CD-ROM, a RAM, a floppy disk, a hard disk, or a magneto-optical disk, or as computer code originally stored in a remote recording medium or a non-transitory machine-readable medium, downloaded through a network, and stored in a local recording medium, so that the methods described herein can be processed by such software stored on a recording medium using a general-purpose computer, a dedicated processor, or programmable or dedicated hardware such as an ASIC or FPGA. It will be appreciated that the computer, processor, microprocessor controller or programmable hardware includes memory components (e.g., RAM, ROM, flash memory, etc.) that can store or receive software or computer code that, when accessed and executed by the computer, processor or hardware, implements the machine learning-based driving style identification model evaluation method described herein. Further, when a general-purpose computer accesses code for implementing the methods illustrated herein, execution of the code transforms the general-purpose computer into a special-purpose computer for performing the methods illustrated herein.

Those of ordinary skill in the art will appreciate that the various illustrative elements and method steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the particular application of the solution and the constraints involved. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the embodiments of the present application.

It should be noted that the embodiments in the present specification are described in a progressive manner; the same or similar parts among the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, because the apparatus and system embodiments are substantially similar to the method embodiments, they are described relatively briefly, and reference may be made to the corresponding descriptions of the method embodiments for related points. The above-described embodiments of the apparatus and system are merely illustrative: units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.

The above embodiments are only used for illustrating the embodiments of the present application, and not for limiting the embodiments of the present application, and those skilled in the relevant art can make various changes and modifications without departing from the spirit and scope of the embodiments of the present application, so that all equivalent technical solutions also belong to the scope of the embodiments of the present application, and the scope of the embodiments of the present application should be defined by the claims.

Claims

1. A driving style identification model evaluation method based on machine learning is characterized by comprising the following steps:

dividing driving behavior data samples obtained in advance into training data samples and testing data samples;

training the driving style identification model according to the training data sample, and obtaining a prediction result according to the test data sample;

and evaluating the model generalization capability and the model precision of the driving style identification model according to the prediction result and the test data sample.

2. The machine learning-based driving style identification model evaluation method according to claim 1, wherein the driving behavior data samples obtained in advance are divided into training data samples and test data samples, specifically: dividing the driving behavior data samples obtained in advance into the training data samples and the testing data samples according to a preset proportion.

3. The machine learning-based driving style identification model evaluation method according to claim 1, wherein training the driving style identification model according to the training data samples and obtaining a prediction result according to the test data samples comprises:

based on a Scikit-learn machine learning platform, constructing the driving style identification model according to the training data sample, using the optimal hyper-parameter combination found by grid search as the identification model parameters;

and obtaining the prediction result according to the test data sample based on the driving style identification model.

4. The driving style identification model evaluation method based on machine learning of claim 3, wherein constructing the driving style identification model based on the Scikit-learn machine learning platform according to the training data sample, using the optimal hyper-parameter combination found by grid search as the identification model parameters, specifically comprises: based on the Scikit-learn machine learning platform, searching for the optimal hyper-parameter combination through a ten-fold cross-validation grid search according to the training data sample, and constructing the driving style identification model.

5. The method according to claim 1, wherein the evaluating the model generalization ability and the model accuracy of the driving style identification model according to the prediction result and the test data sample comprises:

calculating the accuracy of the average classification of the driving style identification model according to the prediction result and the test data sample, wherein the accuracy of the average classification represents the model generalization capability of the driving style identification model;

according to the prediction result and the actual result, evaluating the model precision of the driving style identification model; and the actual result is a clustering result of the driving behavior data sample.

6. The method for evaluating a driving style identification model based on machine learning according to claim 5, wherein the evaluating the model accuracy of the driving style identification model according to the predicted result and the actual result specifically comprises:

and performing difference comparison on the prediction result and the actual result of the driving style identification model based on a confusion matrix to obtain a difference result, wherein the difference result represents the model precision of the driving style identification model.

7. The method for evaluating a driving style identification model based on machine learning according to any one of claims 1 to 6, wherein after the driving style identification model is trained according to the training data samples and a prediction result is obtained according to the test data samples, before the evaluating the model generalization ability and the model accuracy of the driving style identification model according to the prediction result and the test data samples, the method further comprises: determining a total evaluation index of the driving style identification model, wherein the total evaluation index comprises: model generalization capability and model accuracy.

8. A driving style identification model evaluation device based on machine learning is characterized by comprising:

the sample dividing unit is configured to divide the driving behavior data samples obtained in advance into training data samples and testing data samples;

the model prediction unit is configured to train the driving style identification model according to the training data sample and obtain a prediction result according to the test data sample;

and the model evaluation unit is configured to evaluate the model generalization capability and the model precision of the driving style identification model according to the prediction result and the test data sample.

9. A computer-readable medium, on which a computer program is stored, characterized in that the program implements the machine learning-based driving style identification model evaluation method according to any one of claims 1 to 7.

10. An electronic device, comprising: a memory, a processor, and a program stored in the memory and executable on the processor, the processor implementing the machine learning-based driving style identification model evaluation method according to any one of claims 1-7 when executing the program.