CN114140286A - Driving training guidance method and terminal - Google Patents

Driving training guidance method and terminal

Info

Publication number
CN114140286A
CN114140286A (application CN202111494767.0A)
Authority
CN
China
Prior art keywords
data
training
exercise
student
historical
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111494767.0A
Other languages
Chinese (zh)
Inventor
李忠凯
张铁监
吴松
叶剑
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Duolun Internet Technology Co ltd
Original Assignee
Duolun Internet Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Duolun Internet Technology Co ltd filed Critical Duolun Internet Technology Co ltd
Priority to CN202111494767.0A priority Critical patent/CN114140286A/en
Publication of CN114140286A publication Critical patent/CN114140286A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services
    • G06Q50/20Education
    • G06Q50/205Education administration or guidance
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Educational Administration (AREA)
  • Tourism & Hospitality (AREA)
  • Strategic Management (AREA)
  • Educational Technology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a driving training guidance method and a terminal. The method comprises the following steps: collecting and preprocessing historical exercise data from trainees' driving training together with the coaches' guidance schemes; dividing the trainees into groups of similar learning ability using a K-Means clustering algorithm to obtain clustering results; training an RNN recurrent neural network model with the preprocessed historical exercise data as input, predicting the probability that a trainee successfully completes the next exercise; and, taking the trainee's current training data, the model-predicted probability of successfully completing the next project exercise, the guidance scheme adopted by the coach for the current training, and the actual pass/fail outcome of the current project as input, training a reinforcement learning model that selects the optimal coach guidance scheme and recommends it to the trainee. The invention can provide customized guidance for each learning stage of each trainee, greatly reducing the coaches' workload and improving intelligence.

Description

Driving training guidance method and terminal
Technical Field
The invention relates to a driving training guidance method and a terminal, and belongs to the technical field of driving training systems.
Background
Motor vehicles have become the public's main means of transportation. As living standards improve, the number of motor vehicles keeps rising, more and more people choose this convenient mode of transport, and motor vehicle driving has become a skill people compete to learn. As a basic skill of modern society, a large number of people take motor vehicle driving tests every year, and a driving licence can be obtained only after passing the final examination following driving-school training. The driving training stage is therefore particularly important, so that trainees can pass the final assessment, genuinely master driving technique, and avoid traffic accidents while driving.
Current driving training still basically remains at a manual stage: in most driving schools one coach takes on several trainees and must carefully observe each trainee's practice to find their problems, giving a corresponding guidance scheme based on years of teaching experience. This requires a great deal of manual intervention, as well as a coach with sufficient energy and patience, resulting in low training efficiency. Some driving schools instead use a pattern-matching method that counts a trainee's errors and gives different guidance schemes for different error counts; such a scheme gives guidance based only on a mechanical count and cannot give the most appropriate scheme according to each trainee's learning ability and degree of mastery.
In view of this, it is necessary to provide an intelligent guidance scheme applied to motor vehicle driving training, which can provide personalized guidance for the current learning state of each student according to a large amount of collected data, and really achieve the purpose of teaching according to the situation.
Disclosure of Invention
In order to solve the problems that current driving training requires heavy manual intervention and cannot give personalized guidance according to each trainee's current learning state, the invention provides a driving training guidance method and a driving training guidance terminal, which perform customized guidance for each learning stage of each trainee.
The invention specifically adopts the following technical scheme to solve the technical problems:
a driving training guidance method comprises the following steps:
step 1, collecting historical exercise data and a coach guide scheme in trainee driving training;
step 2, preprocessing collected historical exercise data in the driving training of the trainees to obtain preprocessed historical exercise data;
step 3, dividing students in the preprocessed historical exercise data into different student groups with similar learning ability by using a K-Means clustering algorithm to obtain clustering results;
step 4, according to the different student groups divided in the step 3, preprocessing the historical training data of the students in each group, inputting and training the RNN recurrent neural network model, and predicting and outputting the probability of the students successfully completing the next project exercise by the RNN recurrent neural network model;
step 5, acquiring training data of the student at this time, inputting the RNN recurrent neural network model obtained by training in the step 4, predicting and outputting the probability of successfully completing next project exercise of the student by the RNN recurrent neural network model, and taking the acquired guidance scheme adopted by the training coach at this time and the passing condition of the real project at this time as input to train the reinforcement learning model; and for each training data of a subsequent student, predicting and outputting the probability of successfully completing the next project exercise of the student by using the RNN recurrent neural network model, inputting the result into the trained reinforcement learning model, selecting the guidance scheme of the optimal coach by the reinforcement learning model, and recommending the selected optimal coach guidance scheme to the student.
Further, as a preferred technical solution of the present invention, the historical exercise data collected in the driving training of the trainee in step 1 includes basic information of the trainee, simulator exercise information, and exercise process data.
Further, as a preferred technical solution of the present invention, the preprocessing the collected historical exercise data in the trainee driving training in step 2 includes:
analyzing the collected historical exercise data in the driving training of the trainees and transmitting the historical exercise data into a MySQL database;
checking whether historical exercise data in the uploaded trainee driving training is missing or not, and processing the missing data;
standardizing the continuous data features in the historical exercise data of the trainee driving training to obtain standardized data with a mean of 0 and a variance of 1;
performing one-hot encoding on the categorical data features in the historical exercise data of the trainee driving training to obtain binary vector representations of the data;
binarizing the continuous data features that need to be converted into categorical data features in the historical exercise data of the trainee driving training, so that each obtained binarized data element is either 0 or 1;
and, on the basis of the historical exercise data of the trainee driving training, regularizing the data and searching for nonlinear relationships as selected, so as to obtain the processed historical exercise data of the trainee driving training.
Further, as a preferred technical solution of the present invention, in the step 3, the K-Means clustering algorithm calculates the distance between the data objects by using the euclidean distance, where the euclidean distance calculation formula is as follows:
d(x_i, x_j) = sqrt( Σ_n (x_{in} − x_{jn})^2 )
wherein x_i and x_j represent two samples and n indexes the feature dimensions; x_{in} and x_{jn} respectively represent the values of the two samples on feature dimension n; d represents the calculated Euclidean distance;
and the K-Means clustering algorithm divides the students in the preprocessed historical exercise data into different categories through iteration, so that an average error criterion function E for evaluating the clustering performance is optimal, and the calculation formula of the average error criterion function is as follows:
E = Σ_{i=1..k} Σ_{p∈X_i} ||p − m_i||^2
wherein the data set contains k cluster subsets X_1, X_2, ···, X_k with cluster centres m_1, m_2, ···, m_k; for each cluster subset, the sum of squared distances from every sample p to its cluster centre is computed, and iteration continues until the average error criterion function E converges, giving the final clustering result.
Further, as a preferable technical solution of the present invention, the modeling of the RNN recurrent neural network model in step 4 includes:
given a learner's historical learning sequence Xi=(x1,x2,x3,...xt) Wherein x istIs a one _ hot vector and is a time series representing the trainee's exercise item q at time ttThe exercise result of (a)t
Modeling the historical learning sequence of the student by using an RNN (neural network) recurrent neural network model in deep learning, wherein the formula is as follows:
ht=tanh(Whxxt+Whhht-1+bh),
yt=σ(Wyhht+by),
wherein, X isi=(x1,x2,x3,...xt) As input to the RNN recurrent neural network model, Yi=(y1,y2,y3,...yt) As an output of the RNN recurrent neural network model, where ytRepresenting the probability of successful completion of the project exercise at time t by the trainee; hi=(h1,h2,h3,...ht) Is a hidden layer of the RNN recurrent neural network model, where htDenotes the t-th hidden layer unit, ht-1Represents the t-1 hidden layer unit; sigma is a sigmode function; bhAnd byRespectively being a bias term of the hidden layer unit and a bias term of the output unit; whxIs the state-input weight; whhIs state-state rightWeighing; wyhIs the output weight.
Further, as a preferred technical solution of the present invention, the training of the reinforcement learning model in step 5 includes:
the reinforcement learning model is taken as the agent; the trainee's current training data and the current real-project pass/fail situation are taken as the state; the guidance scheme adopted by the coach for the current training is taken as the action; and the probability, predicted by the RNN recurrent neural network model, that the trainee successfully completes the next project exercise is taken as the reward to train the reinforcement learning model, which selects the optimal coach guidance scheme with maximizing the reward as the goal.
The invention also provides a driving training guidance terminal, which comprises: the device comprises a memory and a processor, wherein the memory stores program instructions, and the processor calls the program instructions from the memory to execute the driving training guidance method.
By adopting the technical scheme, the invention can produce the following technical effects:
according to the driving training guidance method and the terminal, the trainees are divided into different trainee groups with similar learning capacity through a clustering algorithm, the probability of successfully completing the project exercise of the trainees is output by combining with the RNN recurrent neural network, the probability result of successfully completing the next project exercise of the trainees is predicted, an optimal coach guidance scheme is selected and recommended to the trainees by using the reinforcement learning model, so that customized guidance is performed for each learning stage of each trainee on the basis of three models, intelligent guidance scheme selection is realized, the workload of coaches can be greatly reduced, one coach can simultaneously train several times of the original trainees, the time for formulating the guidance scheme of the trainees can be shortened through the customized guidance, the near-optimal exercise scheme can be quickly found, and the final examination pass rate of the trainees is improved. Therefore, the invention can effectively solve the problem of multi-worker intervention in the current driving training and improve the intelligent performance.
Drawings
Fig. 1 is a schematic flow chart of the driving training guidance method of the present invention.
FIG. 2 is a schematic diagram of the modeling of the RNN recurrent neural network of the present invention.
Detailed Description
In order to facilitate understanding of those skilled in the art, the present invention will be further described with reference to the following examples and drawings, which are not intended to limit the present invention.
As shown in fig. 1, the invention relates to a driving training guidance method, which specifically comprises the following steps:
step 1, collecting historical exercise data and a coach guide scheme in the driving training of a student.
The collected historical exercise data in the trainee driving training comprises the trainee's basic information, simulator exercise information and exercise process data. The basic information of the trainees comprises sex, age, educational background and the like; the simulator exercise information comprises simulated exercise duration, exercise scores, error types and the like; the exercise process data comprises the error points of each item, together with the steering wheel angle, vehicle speed, line-pressing situation and the like at those error points; the coach's guidance scheme refers to the guidance scheme the coach adopts when the trainee is practising and an error occurs.
Step 2, preprocessing the collected historical exercise data in the trainee driving training to obtain preprocessed historical exercise data, which is specifically as follows:
and 2-1, analyzing the collected historical exercise data in the driving training of the trainees and transmitting the analyzed historical exercise data into a MySQL database.
And 2-2, checking whether historical exercise data in the uploaded trainees' driving training are missing, namely checking whether data are randomly missing or not randomly missing due to equipment reasons or program reasons in the hardware and software data acquisition process, and processing the missing data, wherein the processing comprises selecting a deleting, filling or non-processing mode according to the missing reasons and the missing data types to process, so that the data quality is improved.
And 2-3, standardizing continuous data characteristics in historical practice data of the trainees driving training to obtain annotated data with a mean value of 0 and a variance of 1.
And 2-4, performing one-hot coding on the class type data characteristics in the historical practice data of the trainee driving training to obtain binary vector representation of the data.
And 2-5, performing binarization on continuous data characteristics needing to be converted into class type data characteristics in historical practice data of the trainee driving training, so that the obtained binarization data elements are not 0, namely 1, and the purpose of simplifying a mathematical model is achieved.
And 2-6, based on historical exercise data of the trainee driving training, in order to prevent overfitting or other reasons, regularizing the data according to selection.
And 2-7, based on historical exercise data of the trainees' driving training, finding that the effect is poor after the data are initially explored, trying to use a polynomial method to carry out polynomial expansion on the original characteristics, and searching for a nonlinear relation.
Therefore, after one or more steps, the processed historical exercise data of the trainee driving training is obtained.
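As an illustration of the standardization, one-hot encoding and binarization steps above, the following minimal sketch (not the patent's implementation; the feature names, values and threshold are hypothetical) applies each transform to toy trainee features:

```python
import numpy as np

def standardize(col):
    """Scale a continuous feature to zero mean and unit variance (step 2-3)."""
    return (col - col.mean()) / col.std()

def one_hot(labels, num_classes):
    """Encode a categorical feature as binary vectors (step 2-4)."""
    out = np.zeros((len(labels), num_classes))
    out[np.arange(len(labels)), labels] = 1
    return out

def binarize(col, threshold):
    """Map a continuous feature to {0, 1} by thresholding (step 2-5)."""
    return (col > threshold).astype(int)

# Hypothetical features: simulated practice duration (minutes) and error type.
durations = np.array([30.0, 45.0, 60.0, 75.0])
error_types = np.array([0, 2, 1, 2])

z = standardize(durations)                  # mean 0, variance 1
v = one_hot(error_types, num_classes=3)     # binary vector per trainee
b = binarize(durations, threshold=50.0)     # 0/1 per trainee
```

Each helper corresponds to one preprocessing step; in practice a library such as scikit-learn provides equivalent transformers, but the arithmetic is as simple as shown.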
Step 3, based on the preprocessed historical exercise data, dividing students in the preprocessed historical exercise data into different student groups with similar learning ability by using a K-Means clustering algorithm to obtain clustering results, wherein the clustering results are as follows:
firstly, a K-Means clustering algorithm is selected, the K-Means clustering algorithm is a clustering algorithm based on division, and the similarity between data objects is measured through distance, namely the smaller the distance between the data objects is, the higher the similarity is, the more likely the data objects are in the same cluster. Algorithms typically compute the distance between data objects in terms of euclidean distance, which is calculated as follows:
d(x_i, x_j) = sqrt( Σ_n (x_{in} − x_{jn})^2 )
wherein x_i and x_j represent two samples, i.e. two trainees in the invention; n indexes the feature dimensions, namely the basic-information and simulator-exercise-information dimensions; x_{in} and x_{jn} respectively represent the values of the two samples on feature dimension n; d represents the calculated Euclidean distance.
Then, the K-Means clustering algorithm divides the trainees in the preprocessed historical exercise data into different categories through an iterative process, so that the average error criterion function E used to evaluate clustering performance is optimal, making each generated cluster compact and the categories well separated from one another. The average error criterion function E is calculated as follows:
E = Σ_{i=1..k} Σ_{p∈X_i} ||p − m_i||^2
wherein the data set contains k cluster subsets X_1, X_2, ···, X_k with cluster centres m_1, m_2, ···, m_k; for each subset, the sum of squared distances from every sample p to its cluster centre is computed, and iteration continues until the average error criterion function E converges, giving the final clustering result.
In this way, the clustering algorithm divides all trainees into groups of similar learning ability according to basic information such as sex, age and educational background and their past performance during simulator training; the division yields several categories, making it convenient to provide customized guidance according to each trainee's learning ability.
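The iterative assign-then-recompute procedure described above can be sketched as follows. This is a minimal illustration, not the patent's implementation; the trainee feature vectors (age, simulator score) are hypothetical:

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """Minimal K-Means: assign each trainee to the nearest centre by
    Euclidean distance, then recompute centres, until assignments stabilise."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]   # pick k trainees as seeds
    for _ in range(iters):
        # Pairwise Euclidean distances: shape (n_samples, k).
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        new = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new, centers):                   # E has converged
            break
        centers = new
    return labels, centers

# Hypothetical trainee features: [age, simulator score].
X = np.array([[20.0, 90.0], [21.0, 88.0], [45.0, 60.0], [47.0, 58.0]])
labels, centers = kmeans(X, k=2)
```

With these well-separated toy points the two young/high-scoring trainees end up in one cluster and the two older/lower-scoring trainees in the other, which is exactly the "similar learning ability" grouping the method relies on.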
Step 4, according to the different student groups divided in the step 3, preprocessing the historical training data of the students in each group, inputting and training the RNN recurrent neural network model, and predicting and outputting the probability of the students successfully completing the next project exercise by the RNN recurrent neural network model, wherein the probability is as follows:
given a learner's historical learning sequence Xi=(x1,x2,x3,...xt) Wherein x istIs a one _ hot vector and is a time series representing the trainee's exercise item q at time ttThe exercise result of (a)tTo predict the next exercise x of the studentt+1Result of (a)t+1
The trainee's historical learning sequence is modelled with an RNN recurrent neural network model from deep learning, as shown in FIG. 2, using the following formulas:
h_t = tanh(W_hx·x_t + W_hh·h_{t−1} + b_h),
y_t = σ(W_yh·h_t + b_y),
wherein X_i = (x_1, x_2, x_3, ... x_t), the input of the RNN recurrent neural network model, is a time series in which each x_t is a one-hot vector representing the item the trainee practises at time t and its exercise result; Y_i = (y_1, y_2, y_3, ... y_t), the output of the model, is likewise a time series, y_t representing the probability that the trainee successfully completes the project exercise at time t; H_i = (h_1, h_2, h_3, ... h_t) is the hidden layer of the model, where h_t denotes the t-th hidden layer unit and h_{t−1} the (t−1)-th, whose state update is influenced both by the input x_t and by the previous hidden state h_{t−1}; σ is the sigmoid function; b_h and b_y are respectively the bias term of the hidden layer unit and the bias term of the output unit; W_hx is the state-input weight; W_hh is the state-state weight; W_yh is the output weight.
The RNN recurrent neural network model can predict the probability that a trainee successfully completes the next project exercise from all of the trainee's past historical exercise data, automatically tracking how the trainee's skill level changes over time from the historical learning trajectory; it can therefore accurately predict the trainee's performance in future exercises rather than predicting simply from the last exercise. A limitation of the RNN recurrent neural network model is that it assumes all trainees have the same learning ability, whereas in fact every trainee's learning ability differs. The K-Means clustering in step 3 solves this problem well by grouping trainees of similar learning ability into the same class, laying a foundation for the RNN model's prediction so that it can accurately predict a trainee's next exercise result.
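A minimal forward pass of the recurrent model defined by the two formulas above might look like the following. This is a sketch, not the patent's implementation: the weights are randomly initialized for illustration (an untrained model), and the input and hidden sizes are assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def rnn_forward(xs, W_hx, W_hh, W_yh, b_h, b_y):
    """Unrolled forward pass:
    h_t = tanh(W_hx·x_t + W_hh·h_{t-1} + b_h),  y_t = sigmoid(W_yh·h_t + b_y)."""
    h = np.zeros(W_hh.shape[0])          # h_0: initial hidden state
    ys = []
    for x in xs:
        h = np.tanh(W_hx @ x + W_hh @ h + b_h)
        ys.append(sigmoid(W_yh @ h + b_y))
    return np.array(ys)

rng = np.random.default_rng(0)
n_in, n_hid = 6, 4                        # assumed one-hot width and hidden size
W_hx = rng.normal(0, 0.1, (n_hid, n_in))  # state-input weight
W_hh = rng.normal(0, 0.1, (n_hid, n_hid)) # state-state weight
W_yh = rng.normal(0, 0.1, (1, n_hid))     # output weight
b_h, b_y = np.zeros(n_hid), np.zeros(1)

# Hypothetical 3-step practice history: one-hot (item, result) codes.
seq = np.eye(n_in)[[0, 3, 1]]
probs = rnn_forward(seq, W_hx, W_hh, W_yh, b_h, b_y)
```

Each y_t lies in (0, 1) by construction of the sigmoid, so it can be read directly as a success probability; training the weights (e.g. by backpropagation through time) is what makes that probability meaningful.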
Step 5, acquiring training data of the student at this time, inputting the RNN recurrent neural network model obtained by training in the step 4, predicting and outputting the probability of successfully completing next project exercise of the student by the RNN recurrent neural network model, and taking the acquired guidance scheme adopted by the training coach at this time and the passing condition of the real project at this time as input to train the reinforcement learning model; for each training data of a subsequent student, predicting and outputting the probability of successfully completing the next project exercise of the student by using an RNN (neural network) recurrent neural network model, inputting the result into a trained reinforcement learning model, selecting a guidance scheme of an optimal coach by the reinforcement learning model, and recommending the selected optimal coach guidance scheme to the student, wherein the following concrete steps are as follows:
first, the principle of the reinforcement learning model is obtained: the system consists of basic elements such as an Agent, an Environment, a State, an Action and a Reward. Wherein:
(1) the intelligent agent is used as a body for reinforcement learning and is used as a learner or a decision maker.
(2) The environment is the external environment where the agent is located and is described by various states. The various state data constitute a state set.
(3) The action is the action the agent takes according to the current environment state and observation and its own policy; the action set is the set of all actions the agent may take.
(4) The reward is a feedback signal obtained after the intelligent body makes a certain action in a certain state, and comprises positive feedback and negative feedback, and the feedback can be represented by designing a reward function.
The reinforcement learning model optimizes the strategy by letting the agent interact with the environment continuously to obtain as many reward values as possible.
Based on the reinforcement learning model principle, the invention selects the coach guidance scheme to be recommended according to the trainee's current project practice situation. Specifically, the reinforcement learning model serves as the agent; the trainee's current training data and the current real-project pass/fail situation are taken as the state; the guidance scheme the coach adopts for the current training is taken as the action; and the probability, predicted by the RNN recurrent neural network model, that the trainee successfully completes the next project exercise is taken as the reward. With these parameters as input, the reinforcement learning model undergoes extensive optimization training and, with maximizing the reward as the goal, selects the optimal coach guidance scheme. To make each action more nearly optimal, the agent must consider the long-term benefit of the sequences of actions that follow it, i.e. maximize the future cumulative reward. The agent therefore keeps trying and interacting with the environment, gradually improving its policy.
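The agent/state/action/reward setup described above can be sketched with a toy tabular value update. Everything here is hypothetical: the single practice item, the three coaching schemes, and the simulated reward function standing in for the RNN-predicted success probability are illustrative stand-ins, not the patent's actual scheme catalogue:

```python
import random

# States: (practice item, whether the last real attempt passed).
STATES = [("reverse_parking", False), ("reverse_parking", True)]
# Actions: hypothetical coach guidance schemes.
ACTIONS = ["extra_demo", "slow_repetition", "video_review"]

def simulated_reward(state, action):
    """Stand-in for the RNN output: the assumed probability that the
    trainee passes the next exercise after this guidance scheme."""
    if not state[1]:                          # trainee failed last attempt
        return 0.9 if action == "extra_demo" else 0.3
    return 0.8 if action == "video_review" else 0.5

def train(episodes=2000, alpha=0.1, epsilon=0.2, seed=0):
    """Epsilon-greedy tabular learning of Q(state, action)."""
    random.seed(seed)
    Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(episodes):
        s = random.choice(STATES)
        if random.random() < epsilon:         # explore
            a = random.choice(ACTIONS)
        else:                                 # exploit current estimate
            a = max(ACTIONS, key=lambda a: Q[(s, a)])
        Q[(s, a)] += alpha * (simulated_reward(s, a) - Q[(s, a)])
    return Q

Q = train()
# The recommended scheme per state is the reward-maximizing action.
best = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in STATES}
```

With enough episodes the greedy policy settles on the scheme with the highest simulated reward in each state, which mirrors the patent's idea of selecting the guidance scheme that maximizes the predicted probability of passing the next exercise.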
The invention also relates to a driving training guidance terminal, which comprises: the driving training guidance system comprises a memory and a processor, wherein the memory stores program instructions, and the processor calls the program instructions from the memory to execute the driving training guidance method, so that the driving training guidance terminal can realize intelligent driving training guidance.
Therefore, according to the method and the terminal, the trainees are divided into different trainee groups of similar learning ability by a clustering algorithm, and an optimal coach guidance scheme is selected and recommended to the trainees in combination with the RNN recurrent neural network and the reinforcement learning model. Customized guidance is performed for each learning stage of each trainee, intelligent guidance-scheme selection is achieved, the workload of the coaches can be greatly reduced, a near-optimal exercise scheme is found, and the pass rate of the trainees' final examination is improved; the problem of heavy manual intervention in current driving training can thus be effectively solved and intelligence improved.
While embodiments of the present invention have been described above, the present invention is not limited to the specific embodiments and applications described above, which are intended to be illustrative, instructive, and not limiting. Those skilled in the art, having the benefit of this disclosure, may effect numerous modifications thereto without departing from the scope of the invention as defined by the appended claims.

Claims (7)

1. A driving training guidance method is characterized by comprising the following steps:
step 1, collecting historical exercise data and a coach guide scheme in trainee driving training;
step 2, preprocessing collected historical exercise data in the driving training of the trainees to obtain preprocessed historical exercise data;
step 3, dividing students in the preprocessed historical exercise data into different student groups with similar learning ability by using a K-Means clustering algorithm to obtain clustering results;
step 4, according to the different student groups divided in the step 3, preprocessing the historical training data of the students in each group, inputting and training the RNN recurrent neural network model, and predicting and outputting the probability of the students successfully completing the next project exercise by the RNN recurrent neural network model;
step 5, acquiring training data of the student at this time, inputting the RNN recurrent neural network model obtained by training in the step 4, predicting and outputting the probability of successfully completing next project exercise of the student by the RNN recurrent neural network model, and taking the acquired guidance scheme adopted by the training coach at this time and the passing condition of the real project at this time as input to train the reinforcement learning model; and for each training data of a subsequent student, predicting and outputting the probability of successfully completing the next project exercise of the student by using the RNN recurrent neural network model, inputting the result into the trained reinforcement learning model, selecting the guidance scheme of the optimal coach by the reinforcement learning model, and recommending the selected optimal coach guidance scheme to the student.
2. The driving training guidance method as claimed in claim 1, wherein the step 1 of collecting historical exercise data in the trainee driving training comprises trainee basic information, simulator exercise information and exercise process data.
3. The driving training guidance method as claimed in claim 1, wherein the step 2 of preprocessing the collected historical exercise data in the trainee driving training comprises:
analyzing the collected historical exercise data in the driving training of the trainees and transmitting the historical exercise data into a MySQL database;
checking whether historical exercise data in the uploaded trainee driving training is missing or not, and processing the missing data;
standardizing continuous data features in the historical exercise data of trainee driving training to obtain standardized data with a mean of 0 and a variance of 1;
performing one-hot encoding on class-type data features in the historical exercise data of trainee driving training to obtain a binary vector representation of the data;
binarizing continuous data features that need to be converted into class-type data features in the historical exercise data of trainee driving training, so that each element of the resulting binarized data is either 0 or 1;
and, on the basis of the historical exercise data of trainee driving training, regularizing the data and searching for nonlinear relationships as selected, thereby obtaining the processed historical exercise data of trainee driving training.
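The preprocessing steps in claim 3 (standardization to zero mean and unit variance, one-hot encoding, and binarization) can be sketched as follows. This is a minimal illustration with invented column meanings; the actual feature set of the patent's trainee data is not specified here.

```python
import numpy as np

# Hypothetical trainee exercise data (column meanings are illustrative, not from the patent)
hours = np.array([12.0, 30.0, 8.0, 25.0])   # continuous feature: practice hours
subject = np.array([0, 1, 0, 1])             # class-type feature: exercise subject id

# 1) Standardize a continuous feature to mean 0, variance 1
hours_std = (hours - hours.mean()) / hours.std()

# 2) One-hot encode a class-type feature into binary vectors
subject_oh = np.eye(subject.max() + 1)[subject]

# 3) Binarize a continuous feature into a class-type feature
#    (the threshold of 20 hours is an assumption for illustration)
passed_many = (hours > 20.0).astype(int)
```

Each element of `passed_many` is either 0 or 1, and `hours_std` has mean 0 and variance 1, matching the claim's description.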
4. The driving training guidance method according to claim 1, wherein the K-Means clustering algorithm in step 3 calculates the distance between data objects by using the euclidean distance, and the euclidean distance is calculated by the following formula:
$$d(x_i, x_j) = \sqrt{\sum_{n=1}^{N} (x_{in} - x_{jn})^2}$$
where $x_i$ and $x_j$ represent two samples and $n$ indexes the feature dimensions ($N$ in total); $x_{in}$ and $x_{jn}$ respectively represent the values of the two samples on feature dimension $n$; $d$ represents the calculated Euclidean distance;
and the K-Means clustering algorithm divides the students in the preprocessed historical exercise data into different categories through iteration, so that an average error criterion function E for evaluating the clustering performance is optimal, and the calculation formula of the average error criterion function is as follows:
$$E = \sum_{i=1}^{k} \sum_{p \in X_i} \| p - m_i \|^2$$
where the data set contains $k$ cluster subsets $X_1, X_2, \cdots, X_k$ with corresponding cluster centers $m_1, m_2, \cdots, m_k$; for each cluster subset $X_i$, the squared distances from each sample $p$ in $X_i$ to its cluster center $m_i$ are summed; iteration continues until the average error criterion function $E$ converges, yielding the final clustering result.
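The K-Means procedure in claim 4 — assign each sample to its nearest center by Euclidean distance, recompute the centers, and iterate until the average error criterion E stabilizes — can be sketched in plain numpy. The toy learner-feature data below is illustrative, not from the patent.

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """Plain K-Means: assign samples to the nearest center by Euclidean
    distance, recompute centers, and repeat until the centers stop moving."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        # Euclidean distance d between every sample and every center
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        new_centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Average error criterion E: sum of squared distances to cluster centers
    E = sum(((X[labels == j] - centers[j]) ** 2).sum() for j in range(k))
    return labels, centers, E

# Toy learner-ability features with two obvious groups (illustrative)
X = np.array([[1.0, 1.0], [1.2, 0.9], [8.0, 8.1], [7.9, 8.3]])
labels, centers, E = kmeans(X, k=2)
```

After convergence the two similar-ability groups end up in separate clusters and E is small, as the criterion function intends.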
5. The driving training guidance method according to claim 1, wherein the step 4 of modeling the RNN recurrent neural network model comprises:
given a student's historical learning sequence $X_i = (x_1, x_2, x_3, \ldots, x_t)$, where $x_t$ is a one-hot vector in a time series representing the student's exercise item $q_t$ at time $t$ together with its exercise result $a_t$;
modeling the student's historical learning sequence with an RNN recurrent neural network model from deep learning, according to the formulas:
$$h_t = \tanh(W_{hx} x_t + W_{hh} h_{t-1} + b_h),$$
$$y_t = \sigma(W_{yh} h_t + b_y),$$
taking $X_i = (x_1, x_2, x_3, \ldots, x_t)$ as the input of the RNN recurrent neural network model and $Y_i = (y_1, y_2, y_3, \ldots, y_t)$ as its output, where $y_t$ represents the probability that the student successfully completes the project exercise at time $t$; $H_i = (h_1, h_2, h_3, \ldots, h_t)$ is the hidden layer of the RNN recurrent neural network model, where $h_t$ denotes the $t$-th hidden layer unit and $h_{t-1}$ denotes the $(t-1)$-th hidden layer unit; $\sigma$ is the sigmoid function; $b_h$ and $b_y$ are respectively the bias term of the hidden layer unit and the bias term of the output unit; $W_{hx}$ is the state-input weight; $W_{hh}$ is the state-state weight; $W_{yh}$ is the output weight.
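The two recurrence equations of claim 5 can be written out directly as a forward pass. The sketch below uses randomly initialized weights and invented toy dimensions (4 one-hot input codes, 3 hidden units); a real system would learn these weights from the grouped historical exercise data.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def rnn_forward(X, Whx, Whh, Wyh, bh, by):
    """Forward pass of the claimed RNN:
    h_t = tanh(Whx x_t + Whh h_{t-1} + bh),  y_t = sigmoid(Wyh h_t + by)."""
    h = np.zeros(Whh.shape[0])        # h_0: initial hidden state
    ys = []
    for x_t in X:
        h = np.tanh(Whx @ x_t + Whh @ h + bh)
        ys.append(sigmoid(Wyh @ h + by))
    return np.array(ys)               # y_t: success probability at each step

# Toy dimensions (illustrative): 4 one-hot codes (2 items x pass/fail), 3 hidden units
rng = np.random.default_rng(0)
d_in, d_h = 4, 3
Whx = rng.normal(size=(d_h, d_in))
Whh = rng.normal(size=(d_h, d_h))
Wyh = rng.normal(size=(1, d_h))
bh, by = np.zeros(d_h), np.zeros(1)

X = np.eye(d_in)[[0, 2, 1]]           # a short one-hot exercise sequence
probs = rnn_forward(X, Whx, Whh, Wyh, bh, by)   # values lie strictly in (0, 1)
```

Because the output unit is passed through the sigmoid, each $y_t$ is a valid probability in $(0, 1)$.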
6. The driving training guidance method according to claim 1, wherein the step 5 of training a reinforcement learning model comprises:
taking the reinforcement learning model as an agent, taking the student's current training data and the current actual project passing condition as the state, taking the guidance scheme adopted by the coach in the current training session as the action, and taking the probability, predicted by the RNN recurrent neural network model, that the student successfully completes the next project exercise as the reward, to train the reinforcement learning model; with reward maximization as its objective, the reinforcement learning model selects the optimal coach guidance scheme.
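The agent/state/action/reward mapping of claim 6 can be sketched with tabular Q-learning, one common reinforcement learning method (the patent does not name a specific algorithm, so this choice, the state discretization, and the stand-in reward function are all assumptions for illustration).

```python
import numpy as np

# state  = discretized summary of the student's current exercise data (5 bins, assumed)
# action = one of several coach guidance schemes (3 schemes, assumed)
# reward = the RNN's predicted probability of completing the next exercise
n_states, n_actions = 5, 3
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.9, 0.1     # learning rate, discount, exploration
rng = np.random.default_rng(0)

def predicted_success(state, action):
    # Stand-in for the trained RNN's prediction; a real system would query the model.
    return 1.0 / (1.0 + abs(state - 2 * action))

state = 0
for _ in range(500):
    # epsilon-greedy action selection over guidance schemes
    action = rng.integers(n_actions) if rng.random() < eps else int(Q[state].argmax())
    reward = predicted_success(state, action)
    next_state = (state + 1) % n_states          # toy deterministic progression
    # Q-learning update toward reward + discounted best future value
    Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
    state = next_state

best_scheme = int(Q[0].argmax())   # recommended guidance scheme for state 0
```

Maximizing the accumulated reward drives the agent toward the guidance scheme with the highest predicted success probability in each state, which is the selection behavior the claim describes.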
7. A driving training guidance terminal, comprising: a memory and a processor, wherein the memory stores program instructions that the processor retrieves from the memory to perform the driving training guidance method of any of claims 1-6.
CN202111494767.0A 2021-12-08 2021-12-08 Driving training guidance method and terminal Pending CN114140286A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111494767.0A CN114140286A (en) 2021-12-08 2021-12-08 Driving training guidance method and terminal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111494767.0A CN114140286A (en) 2021-12-08 2021-12-08 Driving training guidance method and terminal

Publications (1)

Publication Number Publication Date
CN114140286A true CN114140286A (en) 2022-03-04

Family

ID=80385344

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111494767.0A Pending CN114140286A (en) 2021-12-08 2021-12-08 Driving training guidance method and terminal

Country Status (1)

Country Link
CN (1) CN114140286A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115098743A (en) * 2022-08-25 2022-09-23 江西省天轴通讯有限公司 Generation method, system and equipment of aid and education scheme for pre-adults involved in crime

Similar Documents

Publication Publication Date Title
CN111460249B (en) Personalized learning resource recommendation method based on learner preference modeling
CN111695779B (en) Knowledge tracking method, knowledge tracking device and storage medium
CN112446591B (en) Zero sample evaluation method for student comprehensive ability evaluation
CN108095716B (en) Electrocardiosignal detection method based on confidence rule base and deep neural network
CN107274020B (en) Learner subject total measured result prediction system and method based on collaborative filtering thought
CN110991027A (en) Robot simulation learning method based on virtual scene training
CN105243398A (en) Method of improving performance of convolutional neural network based on linear discriminant analysis criterion
Gadhavi et al. Student final grade prediction based on linear regression
CN116563738A (en) Uncertainty-based multi-stage guided small target semi-supervised learning detection method
CN115545155A (en) Multi-level intelligent cognitive tracking method and system, storage medium and terminal
CN114861754A (en) Knowledge tracking method and system based on external attention mechanism
CN112560948A (en) Eye fundus map classification method and imaging method under data deviation
CN114140286A (en) Driving training guidance method and terminal
CN111144462A (en) Unknown individual identification method and device for radar signals
CN106156857A (en) The method and apparatus selected for mixed model
CN113378581A (en) Knowledge tracking method and system based on multivariate concept attention model
Xu et al. Meta-learning via weighted gradient update
CN117765432A (en) Motion boundary prediction-based middle school physical and chemical life experiment motion detection method
CN115795015A (en) Comprehensive knowledge tracking method for enhancing test question difficulty
CN116361744A (en) Learner cognition tracking method and system for learning procedural evaluation
CN113780394B (en) Training method, device and equipment for strong classifier model
CN113283584B (en) Knowledge tracking method and system based on twin network
CN114840679A (en) Robot intelligent learning guiding method based on music theory knowledge graph reasoning and application
CN114511928A (en) Action prediction method based on continuous monitoring
Garg et al. Employing Deep Neural Network for Early Prediction of Students’ Performance

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination