CN113753049A - Social preference-based automatic driving overtaking decision determination method and system - Google Patents

Social preference-based automatic driving overtaking decision determination method and system

Info

Publication number
CN113753049A
Authority
CN
China
Prior art keywords
vehicle
overtaking
overridden
model
current stage
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111322969.7A
Other languages
Chinese (zh)
Other versions
CN113753049B (en)
Inventor
吕超
王昊阳
鲁洪良
于洋
龚建伟
臧政
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beili Huidong Beijing Technology Co ltd
Beijing Institute of Technology BIT
Original Assignee
Beili Huidong Beijing Technology Co ltd
Beijing Institute of Technology BIT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beili Huidong Beijing Technology Co ltd, Beijing Institute of Technology BIT filed Critical Beili Huidong Beijing Technology Co ltd
Priority to CN202111322969.7A priority Critical patent/CN113753049B/en
Publication of CN113753049A publication Critical patent/CN113753049A/en
Application granted granted Critical
Publication of CN113753049B publication Critical patent/CN113753049B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W30/00 Purposes of road vehicle drive control systems not related to the control of a particular sub-unit, e.g. of systems using conjoint control of vehicle sub-units, or advanced driver assistance systems for ensuring comfort, stability and safety or drive control systems for propelling or retarding the vehicle
    • B60W30/18 Propelling the vehicle
    • B60W30/18009 Propelling the vehicle related to particular drive situations
    • B60W30/18163 Lane change; Overtaking manoeuvres
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W50/00 Details of control systems for road vehicle drive control not related to the control of a particular sub-unit, e.g. process diagnostic or vehicle driver interfaces
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W60/00 Drive control systems specially adapted for autonomous road vehicles
    • B60W60/001 Planning or execution of driving tasks
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W50/00 Details of control systems for road vehicle drive control not related to the control of a particular sub-unit, e.g. process diagnostic or vehicle driver interfaces
    • B60W2050/0001 Details of the control system
    • B60W2050/0019 Control system elements or transfer functions
    • B60W2050/0028 Mathematical models, e.g. for simulation
    • B60W2050/0031 Mathematical model of the vehicle

Abstract

The invention discloses a social preference-based method and system for determining automatic driving overtaking decisions, applied to the parallel driving stage of an overtaking process. The method comprises the following steps: inputting the acquired current-stage target road information into a social preference prediction model to determine the social preference of the overtaken vehicle at the current stage; determining a state transition model of the overtaken vehicle at the current stage based on that social preference; and inputting the current-stage target road information, the social preference of the overtaken vehicle at the current stage and the state transition model of the overtaken vehicle at the current stage into an overtaking decision model to determine the overtaking decision of the host vehicle at the current stage. The overtaking decision comprises lane keeping, lane change execution and overtaking abandonment. The algorithm applied by the overtaking decision model is a semi-model-based improved Q-learning algorithm. The invention can output accurate overtaking decisions and improve overtaking efficiency and safety.

Description

Social preference-based automatic driving overtaking decision determination method and system
Technical Field
The invention relates to the technical field of automatic driving, in particular to a social preference-based automatic driving overtaking decision determining method and system.
Background
With the continuous growth of vehicle ownership and the continuous progress of automatic driving technology, intelligent driving systems are gradually entering public view, and autonomous overtaking systems are attracting increasing attention from researchers at home and abroad. Current research on autonomous overtaking systems, particularly on the stage in which the two vehicles travel side by side during overtaking, still has certain shortcomings.
Among a driver's typical driving scenarios, overtaking is one of the riskiest and most challenging maneuvers. Existing autonomous overtaking systems for intelligent vehicles rarely consider the longitudinal driving behavior of the overtaken vehicle and how it changes during overtaking, so the host vehicle cannot respond in real time to the longitudinal driving behavior of the overtaken vehicle while overtaking it.
Disclosure of Invention
The invention aims to provide a method and a system for determining an automatic driving overtaking decision based on social preference so as to achieve the purpose of outputting an accurate overtaking decision and further improve the overtaking efficiency and the overtaking safety.
In order to achieve the purpose, the invention provides the following scheme:
A social preference-based method for determining automatic driving overtaking decisions, applied to the parallel driving stage of an overtaking process, comprises the following steps:
acquiring target road information at the current stage; the target road information comprises host vehicle position information, host vehicle speed information, overtaken vehicle position information and overtaken vehicle speed information; the host vehicle and the overtaken vehicle both travel on the target road;
inputting the current-stage target road information into a social preference prediction model to determine the social preference of the overtaken vehicle at the current stage;
determining a state transition model of the overtaken vehicle at the current stage based on the social preference of the overtaken vehicle at the current stage;
inputting the current-stage target road information, the social preference of the overtaken vehicle at the current stage and the state transition model of the overtaken vehicle at the current stage into an overtaking decision model to determine the overtaking decision of the host vehicle at the current stage; the overtaking decision comprises lane keeping, lane change execution and overtaking abandonment;
the algorithm applied by the overtaking decision model is a semi-model-based improved Q-learning algorithm.
Optionally, the determining process of the social preference prediction model is as follows:
constructing a sample database; the sample database comprises three classes of data, wherein the first class of overtaken vehicle driving data comprises first overtaken vehicle driving data and a first label corresponding to the first overtaken vehicle driving data, the second class of overtaken vehicle driving data comprises second overtaken vehicle driving data and a second label corresponding to the second overtaken vehicle driving data, and the third class of overtaken vehicle driving data comprises third overtaken vehicle driving data and a third label corresponding to the third overtaken vehicle driving data; the first label is the egoistic type, the second label is the reciprocal type, and the third label is the altruistic type;
and determining the social preference prediction model based on the sample database, a support vector machine model with a linear kernel and a maximum entropy model based on logistic regression.
Optionally, the determining a state transition model of the overridden vehicle at the current stage based on the social preference of the overridden vehicle at the current stage specifically includes:
performing statistical operations on the data in the sample database to obtain the state transition probability of the overtaken vehicle at each position under each label;
aggregating the state transition probabilities of the overtaken vehicle at all positions under the same label to construct a state transition model of the overtaken vehicle under each label;
and selecting, from the state transition models of the overtaken vehicle under each label, the state transition model that matches the social preference of the overtaken vehicle at the current stage.
Optionally, the constructing a sample database specifically includes:
clustering the sample traffic flow data, taking the average running speed of the overtaken vehicle after it enters the parallel driving stage of the overtaking process as the characteristic quantity and the overtaking-process social preferences as the cluster categories, to obtain the first overtaken vehicle driving data, the second overtaken vehicle driving data and the third overtaken vehicle driving data;
constructing the sample database based on the first class of overtaken vehicle driving data, the second class of overtaken vehicle driving data and the third class of overtaken vehicle driving data;
the sample traffic flow data includes host vehicle information and overtaken vehicle information; the host vehicle information comprises position information and speed information of the host vehicle in the parallel driving stage of the overtaking process; the overtaken vehicle information comprises position information and speed information of the overtaken vehicle in the parallel driving stage of the overtaking process;
the overtaking-process social preferences include the egoistic type, the reciprocal type and the altruistic type.
Optionally, the semi-model-based improved Q-learning algorithm is designed based on a reinforcement learning method.
A social preference-based system for determining automatic driving overtaking decisions, applied to the parallel driving stage of an overtaking process, comprises:
a data acquisition module, used for acquiring target road information at the current stage; the target road information comprises host vehicle position information, host vehicle speed information, overtaken vehicle position information and overtaken vehicle speed information; the host vehicle and the overtaken vehicle both travel on the target road;
a social preference determination module, used for inputting the current-stage target road information into a social preference prediction model to determine the social preference of the overtaken vehicle at the current stage;
a state transition model determining module, used for determining a state transition model of the overtaken vehicle at the current stage based on the social preference of the overtaken vehicle at the current stage;
an overtaking decision output module, used for inputting the current-stage target road information, the social preference of the overtaken vehicle at the current stage and the state transition model of the overtaken vehicle at the current stage into an overtaking decision model to determine the overtaking decision of the host vehicle at the current stage; the overtaking decision comprises lane keeping, lane change execution and overtaking abandonment;
the algorithm applied by the overtaking decision model is a semi-model-based improved Q-learning algorithm.
Optionally, the system further comprises a social preference prediction model determining module; the social preference prediction model determining module specifically comprises:
a sample database construction unit, used for constructing a sample database; the sample database comprises three classes of data, wherein the first class of overtaken vehicle driving data comprises first overtaken vehicle driving data and a first label corresponding to the first overtaken vehicle driving data, the second class of overtaken vehicle driving data comprises second overtaken vehicle driving data and a second label corresponding to the second overtaken vehicle driving data, and the third class of overtaken vehicle driving data comprises third overtaken vehicle driving data and a third label corresponding to the third overtaken vehicle driving data; the first label is the egoistic type, the second label is the reciprocal type, and the third label is the altruistic type;
and a social preference prediction model determining unit, used for determining the social preference prediction model based on the sample database, a support vector machine model with a linear kernel and a maximum entropy model based on logistic regression.
Optionally, the state transition model determining module specifically includes:
a state transition probability calculation unit, used for performing statistical operations on the data in the sample database to obtain the state transition probability of the overtaken vehicle at each position under each label;
a state transition model building unit, used for aggregating the state transition probabilities of the overtaken vehicle at all positions under the same label to build a state transition model of the overtaken vehicle under each label;
and a state transition model determining unit, used for selecting, from the state transition models of the overtaken vehicle under each label, the state transition model that matches the social preference of the overtaken vehicle at the current stage.
Optionally, the sample database construction unit is specifically used for:
clustering the sample traffic flow data, taking the average running speed of the overtaken vehicle after it enters the parallel driving stage of the overtaking process as the characteristic quantity and the overtaking-process social preferences as the cluster categories, to obtain the first overtaken vehicle driving data, the second overtaken vehicle driving data and the third overtaken vehicle driving data;
constructing the sample database based on the first class of overtaken vehicle driving data, the second class of overtaken vehicle driving data and the third class of overtaken vehicle driving data;
the sample traffic flow data includes host vehicle information and overtaken vehicle information; the host vehicle information comprises position information and speed information of the host vehicle in the parallel driving stage of the overtaking process; the overtaken vehicle information comprises position information and speed information of the overtaken vehicle in the parallel driving stage of the overtaking process;
the overtaking-process social preferences include the egoistic type, the reciprocal type and the altruistic type.
Optionally, the semi-model-based improved Q-learning algorithm is designed based on a reinforcement learning method.
According to the specific embodiment provided by the invention, the invention discloses the following technical effects:
According to the social preference-based method and system for determining automatic driving overtaking decisions provided by the invention, the longitudinal driving behavior and social preference of the overtaken vehicle can be monitored in real time while the host vehicle overtakes it. An accurate overtaking decision is then output by jointly considering the current-stage target road information, the social preference of the overtaken vehicle at the current stage and the state transition probability of the overtaken vehicle at the current stage. This solves the problem of interaction between the host vehicle and the overtaken vehicle during overtaking and improves overtaking efficiency and safety.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the embodiments are briefly described below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a schematic flow chart of a method for determining an autonomous driving overtaking decision based on social preferences in accordance with the present invention;
FIG. 2 is a schematic flow chart of a design method of an autonomous overtaking decision system based on social preferences according to the present invention;
FIG. 3 is a schematic diagram of the lattice space in the autonomous overtaking state according to the present invention;
FIG. 4 is a diagram of a staged overtaking process of the present invention;
FIG. 5 is a schematic structural diagram of an automatic driving overtaking decision-making system based on social preference according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In order to solve the problems described in the background and to follow the trend toward more intelligent automatic driving technology, the invention models social preference based on a Markov decision process and statistical machine learning methods, and develops a semi-model-based improved Q-learning algorithm based on the social preference attribute of the overtaken vehicle. This enables the intelligent vehicle to make effective decisions in the parallel driving stage of overtaking, solves the problem of interaction between the host vehicle and the overtaken vehicle during overtaking, and improves overtaking efficiency and safety.
The present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
Example one
As shown in fig. 1, the present embodiment provides a social preference-based method for determining automatic driving overtaking decisions, which is applied to the parallel driving stage of an overtaking process and includes:
Step 101: acquiring target road information at the current stage; the target road information comprises host vehicle position information, host vehicle speed information, overtaken vehicle position information and overtaken vehicle speed information; both the host vehicle and the overtaken vehicle travel on the target road.
Step 102: inputting the current-stage target road information into a social preference prediction model to determine the social preference of the overtaken vehicle at the current stage.
Step 103: determining a state transition model of the overtaken vehicle at the current stage based on the social preference of the overtaken vehicle at the current stage.
Step 104: inputting the current-stage target road information, the social preference of the overtaken vehicle at the current stage and the state transition model of the overtaken vehicle at the current stage into an overtaking decision model to determine the overtaking decision of the host vehicle at the current stage; the overtaking decision includes lane keeping, lane change execution and overtaking abandonment.
The algorithm applied by the overtaking decision model is a semi-model-based improved Q-learning algorithm. The semi-model-based improved Q-learning algorithm is designed based on a reinforcement learning method.
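For orientation, the following minimal sketch shows the standard tabular Q-learning update over the three overtaking decisions. It is ordinary Q-learning, not the semi-model-based improved variant, whose details are not given at this point in the text. The state encoding, reward value, learning rate `alpha`, discount factor `gamma` and all function names are illustrative assumptions.

```python
from collections import defaultdict
from typing import DefaultDict, Hashable, Tuple

# The three decisions named in step 104 (identifier spellings are assumptions).
ACTIONS = ("lane_keeping", "lane_change", "abandon_overtaking")

QTable = DefaultDict[Tuple[Hashable, str], float]


def q_update(q: QTable, state: Hashable, action: str, reward: float,
             next_state: Hashable, alpha: float = 0.1, gamma: float = 0.9) -> float:
    """Apply one standard Q-learning update and return the new Q(state, action)."""
    # Greedy estimate of the value of the next state.
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    # Temporal-difference update toward reward + gamma * max_a Q(next_state, a).
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


# Hypothetical states: (host grid index, overtaken-vehicle grid index).
q: QTable = defaultdict(float)
value = q_update(q, state=(0, 1), action="lane_change", reward=1.0, next_state=(1, 1))
```

Starting from an all-zero table, this single update moves Q((0, 1), lane_change) one tenth of the way toward the observed reward.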
The determination process of the social preference prediction model comprises the following steps:
constructing a sample database; the sample database comprises three classes of data, wherein the first class of overtaken vehicle driving data comprises first overtaken vehicle driving data and a first label corresponding to the first overtaken vehicle driving data, the second class of overtaken vehicle driving data comprises second overtaken vehicle driving data and a second label corresponding to the second overtaken vehicle driving data, and the third class of overtaken vehicle driving data comprises third overtaken vehicle driving data and a third label corresponding to the third overtaken vehicle driving data; the first label is the egoistic type, the second label is the reciprocal type, and the third label is the altruistic type.
The social preference prediction model is then determined based on the sample database, a support vector machine model with a linear kernel and a maximum entropy model based on logistic regression.
The constructing of the sample database specifically includes:
clustering the sample traffic flow data, taking the average running speed of the overtaken vehicle after it enters the parallel driving stage of the overtaking process as the characteristic quantity and the overtaking-process social preferences as the cluster categories, to obtain the first overtaken vehicle driving data, the second overtaken vehicle driving data and the third overtaken vehicle driving data.
The sample database is constructed based on the first class of overtaken vehicle driving data, the second class of overtaken vehicle driving data and the third class of overtaken vehicle driving data.
The sample traffic flow data includes host vehicle information and overtaken vehicle information; the host vehicle information comprises position information and speed information of the host vehicle in the parallel driving stage of the overtaking process; the overtaken vehicle information comprises position information and speed information of the overtaken vehicle in the parallel driving stage of the overtaking process.
The overtaking-process social preferences include the egoistic type, the reciprocal type and the altruistic type.
Step 103 specifically comprises:
performing statistical operations on the data in the sample database to obtain the state transition probability of the overtaken vehicle at each position under each label.
Aggregating the state transition probabilities of the overtaken vehicle at all positions under the same label to construct a state transition model of the overtaken vehicle under each label.
Selecting, from the state transition models of the overtaken vehicle under each label, the state transition model that matches the social preference of the overtaken vehicle at the current stage.
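The data flow of steps 101 to 104 can be sketched as below. This is only a structural sketch under stated assumptions: the `RoadInfo` fields, the label strings, the function names and the stub models in the usage example are hypothetical placeholders, and the real prediction and decision models are passed in as callables rather than implemented here.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Decision names follow step 104; the identifier spellings are assumptions.
DECISIONS = ("lane_keeping", "lane_change", "abandon_overtaking")

TransitionModel = Dict[int, Dict[int, float]]  # grid index -> {next grid index: probability}


@dataclass
class RoadInfo:
    """Current-stage target road information (step 101)."""
    host_position: float
    host_speed: float
    overtaken_position: float
    overtaken_speed: float


def determine_overtaking_decision(
    info: RoadInfo,
    predict_preference: Callable[[RoadInfo], str],
    transition_models: Dict[str, TransitionModel],
    decision_model: Callable[[RoadInfo, str, TransitionModel], str],
) -> str:
    # Step 102: predict the social preference of the overtaken vehicle.
    preference = predict_preference(info)
    # Step 103: select the state transition model stored under that label.
    transition_model = transition_models[preference]
    # Step 104: let the decision model choose among the three decisions.
    decision = decision_model(info, preference, transition_model)
    if decision not in DECISIONS:
        raise ValueError(f"unexpected decision: {decision!r}")
    return decision


# Usage with stub models standing in for the trained ones.
info = RoadInfo(host_position=10.0, host_speed=15.0,
                overtaken_position=8.0, overtaken_speed=12.0)
decision = determine_overtaking_decision(
    info,
    predict_preference=lambda i: "reciprocal",
    transition_models={"egoistic": {}, "reciprocal": {0: {0: 1.0}}, "altruistic": {}},
    decision_model=lambda i, pref, model: "lane_keeping",
)
```

Passing the models in as callables keeps the pipeline independent of how the classifier and the decision model are actually trained.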
Example two
The invention aims to provide a method for designing an autonomous overtaking decision system based on social preference, in which the driving experience and driving behavior of human drivers are integrated into the overtaking decision model (also called the overtaking decision module) of the autonomous overtaking system, and a semi-model-based improved Q-learning algorithm is developed on the basis of a Markov decision process, so as to solve the problem of vehicle interaction in the overtaking process.
Referring to fig. 2, an embodiment of the present invention provides a method for designing an autonomous overtaking decision system based on social preferences, including:
Step 1: defining a state lattice space. In the embodiment of the invention, which addresses autonomous overtaking on a straight road, a state lattice space as shown in fig. 3 is established on the straight road under study to facilitate calculation and modeling of the overtaking process. The position of a vehicle can thus be represented by the coordinates of lattice vertices.
The overall state lattice space can be expressed as:
$$S_{sl} = \{(x_m, y_n) \mid x_m \in \mathbb{R},\ y_n \in \mathbb{R}\} \tag{1}$$
where $S_{sl}$ is the position matrix, $x_m$ and $y_n$ represent the abscissa and ordinate values respectively, the subscripts $m$ and $n$ are position indices, and $\mathbb{R}$ is the set of real numbers.
The position coordinates in the state lattice space can be used for determining and representing the position of the vehicle, so that subsequent modeling and calculation are facilitated.
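As an illustration of how such a lattice might be set up, the sketch below enumerates lattice vertices on a straight road segment. The road dimensions, cell sizes and the function name `build_state_lattice` are assumptions made for the example; the text does not specify them.

```python
from typing import List, Tuple


def build_state_lattice(road_length: float, road_width: float,
                        cell_length: float, cell_width: float) -> List[List[Tuple[float, float]]]:
    """Return lattice vertices (x_m, y_n) covering a straight road segment.

    The result is indexed as lattice[m][n], with m along the road and n across it.
    """
    if cell_length <= 0 or cell_width <= 0:
        raise ValueError("cell sizes must be positive")
    n_x = int(road_length // cell_length) + 1  # number of vertices along the road
    n_y = int(road_width // cell_width) + 1    # number of vertices across the road
    xs = [m * cell_length for m in range(n_x)]
    ys = [n * cell_width for n in range(n_y)]
    return [[(x, y) for y in ys] for x in xs]


# Hypothetical two-lane segment: 20 m long, 7 m wide, 5 m x 3.5 m cells.
lattice = build_state_lattice(road_length=20.0, road_width=7.0,
                              cell_length=5.0, cell_width=3.5)
```

With these example values the lattice has 5 vertices along the road and 3 across it, so a vehicle position can be referred to by an index pair (m, n).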
Step 2: defining the overtaking process. The autonomous overtaking decision system provided by the embodiment of the invention mainly addresses overtaking in an urban road environment and targets the parallel driving stage of overtaking, so the overtaking process needs to be specifically defined. As shown in fig. 4, the overtaking scene of the embodiment of the invention mainly involves two objects: a host vehicle and an overtaken vehicle. A typical overtaking process includes three stages: the overtaking initiation stage, in which the host vehicle changes lane into the overtaking lane; the parallel driving stage, in which the host vehicle and the overtaken vehicle travel side by side and the host vehicle performs the overtaking; and the overtaking termination stage, in which the host vehicle returns to its original lane.
The overtaking mode adopted by the embodiment of the invention is accelerated overtaking, and the host vehicle keeps a constant speed during the parallel driving stage of the overtaking process.
Steps 1 and 2 define the state lattice space, the overtaking process and the parallel driving stage, laying the foundation for the subsequent design.
Step 3: defining the overtaking-process social preferences. Intelligent vehicles deployed on highways need to understand the intentions of human drivers and adapt to their driving styles. The embodiment of the invention integrates social preference, a concept from social psychology, into the autonomous overtaking decision, so as to quantify and predict the social behavior of other drivers and enable the intelligent vehicle to drive autonomously in a way that conforms to social norms.
In social psychology, social preference refers to the expression of people's inherent sociality at the level of preferences. Drawing on this definition and its classification scheme, the embodiment of the invention defines three types of social preference for the overtaking scene, namely the egoistic type, the reciprocal type and the altruistic type, so as to distinguish the longitudinal driving modes of drivers with different styles.
The three social preferences defined in this step will be the basis for the cluster analysis in the next step.
Step 4: clustering the overtaking data. After the overtaking-process social preferences are defined, the overtaking data are analyzed with a clustering method. For autonomous overtaking, the average running speed of the overtaken vehicle within five seconds after entering the parallel driving stage of the overtaking process can be used as the characteristic quantity and the overtaking-process social preferences as the cluster categories; clustering the overtaking data in this way yields three groups of overtaken vehicle driving data corresponding to the three social preferences defined in step 3. The clustered driving data mainly comprise the position and speed information of the host vehicle and the overtaken vehicle in the parallel driving stage, and a public data set such as the US highway traffic flow data set NGSIM can be used as the data source.
Common clustering algorithms currently fall into five main categories. Among them, the K-means algorithm, a partition-based method, is a classic clustering algorithm; compared with other clustering algorithms its principle is simpler and its clustering results are good. Although K-means has drawbacks, for example the number of clusters must be specified in advance, the overtaking data to be clustered are low-dimensional and small in volume, so the simple K-means algorithm is sufficient.
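A minimal sketch of this clustering step is given below, using plain one-dimensional K-means (Lloyd's algorithm) on the average speed of the overtaken vehicle. The sample speeds, the deterministic initialisation and the function name `kmeans_1d` are assumptions for illustration. The sketch orders clusters by mean speed but does not assign them to the three social preferences, since that mapping is not stated here.

```python
from typing import List, Tuple


def kmeans_1d(values: List[float], k: int = 3, max_iter: int = 100) -> Tuple[List[float], List[int]]:
    """Cluster scalar values with Lloyd's K-means; clusters are ordered by mean."""
    if len(values) < k:
        raise ValueError("need at least k values")
    ordered = sorted(values)
    n = len(ordered)
    # Deterministic initialisation (an assumption): spread centres over the sorted data.
    centres = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    labels = [-1] * n
    for _ in range(max_iter):
        # Assignment step: each value goes to its nearest centre.
        new_labels = [min(range(k), key=lambda c: abs(v - centres[c])) for v in values]
        if new_labels == labels:
            break  # assignments are stable, so the centres are too
        labels = new_labels
        # Update step: each centre moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centres[c] = sum(members) / len(members)
    # Relabel so that cluster 0 has the lowest mean speed.
    order = sorted(range(k), key=lambda c: centres[c])
    rank = {c: i for i, c in enumerate(order)}
    return [centres[c] for c in order], [rank[lab] for lab in labels]


# Hypothetical average speeds (m/s) over the first five seconds of parallel driving.
speeds = [8.1, 8.4, 7.9, 12.0, 12.3, 11.8, 16.2, 15.9, 16.5]
centres, labels = kmeans_1d(speeds)
```

Each overtaking record then receives the cluster index of its average speed, which serves as the label when the sample database is built.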
The clustered overtaken vehicle driving data are then used to compute average state transition probabilities for each class of overtaking data, yielding the state transition model of the overtaken vehicle.
Step 5: computing the state transition probabilities of the overtaken vehicle. Clustering yields the overtaken vehicle driving data for the parallel driving stage of overtaking under the three social preferences, including the initial position of the overtaken vehicle relative to the host vehicle in the parallel driving stage. Using this information as input, the state transition probabilities of the overtaken vehicle are computed.
As defined in step 1, the overtaking process takes place within the state lattice space. For the overtaken vehicle data under a given social preference, the following statistical operations are performed to obtain the state transition probability of the overtaken vehicle under that social preference.
First, the following formula converts the relative position data of the overtaken vehicle under that social preference into grid state information:
$$S = \mathrm{floor}\left(\frac{P_k}{g}\right) \tag{2}$$
where $S$ is the state matrix, each row of which stores the grid position information of the overtaken vehicle in the overtaking data; $P_k$ is the relative position data matrix, each row of which stores the initial position data of the overtaken vehicle relative to the host vehicle in the parallel driving stage of one overtaking process; $g$ is the grid width; and $\mathrm{floor}(\cdot)$ is the floor function.
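The conversion in formula (2) can be sketched as an element-wise floor of relative position divided by grid width. The sketch assumes that each row of the position matrix holds the relative longitudinal positions of the overtaken vehicle for one overtaking process; the sample values, the 2 m grid width and the function name are hypothetical.

```python
import math
from typing import List


def to_grid_states(relative_positions: List[List[float]], grid_width: float) -> List[List[int]]:
    """Map each relative position to a grid index via floor(position / grid_width)."""
    if grid_width <= 0:
        raise ValueError("grid_width must be positive")
    return [[math.floor(p / grid_width) for p in row] for row in relative_positions]


# Two hypothetical overtaking processes, positions in metres relative to the host vehicle.
P_k = [[-1.0, 0.5, 2.4],
       [0.0, 3.9, 4.0]]
S = to_grid_states(P_k, grid_width=2.0)
# S == [[-1, 0, 1], [0, 1, 2]]
```

Because `math.floor` rounds toward negative infinity, positions behind the host vehicle map to negative grid indices instead of collapsing into cell 0.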
After the state matrix is obtained, the state transitions can be counted, and the state transition probability of the overtaken vehicle under each social preference is then calculated according to the following formula:
$$P\left(s_{t+1}=k+n \mid s_t=k\right) = \frac{\operatorname{count}\left(s_t=k,\ s_{t+1}=k+n\right)}{\operatorname{count}\left(s_t=k\right)}, \quad k \in G \qquad (3)$$
where $s_t = k$ denotes that the overtaken vehicle is at grid position k at time t; $s_{t+1} = k+n$ denotes that the overtaken vehicle is at grid position k+n at time t+1; $P(s_{t+1}=k+n \mid s_t=k)$ is the probability that the overtaken vehicle, being in grid k at time t, is in grid k+n at time t+1, i.e. the state transition probability; the function count(·) counts the number of occurrences of the given state transition event; and G is the grid position matrix, which contains all grid positions.
By counting the state transition probabilities of the overtaken vehicle at every grid position under each social preference, the state transition model of the overtaken vehicle is obtained. Given the social preference of the overtaken vehicle and its grid position at a certain time step, the model predicts its position at the next time step. The overtaking decision module uses the model to estimate the position of the overtaken vehicle, which assists model training and helps reach the optimal decision.
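The counting procedure behind the state transition model can be sketched as below. This is an illustrative example, assuming each row is the sequence of grid positions of the overtaken vehicle in one overtaking process under a single social preference; `transition_model` and the sample rows are hypothetical names and data.

```python
from collections import Counter, defaultdict


def transition_model(state_rows):
    """Estimate P(next cell | current cell) from sequences of grid positions."""
    pair_counts = Counter()   # occurrences of (current, next)
    from_counts = Counter()   # occurrences of current as a transition origin
    for row in state_rows:
        for cur, nxt in zip(row, row[1:]):
            pair_counts[(cur, nxt)] += 1
            from_counts[cur] += 1
    model = defaultdict(dict)
    for (cur, nxt), c in pair_counts.items():
        model[cur][nxt] = c / from_counts[cur]
    return dict(model)


# Three hypothetical overtaking processes under the same social preference.
rows = [[0, 1, 2], [0, 0, 1], [0, 1, 1]]
model = transition_model(rows)
```

Building one such dictionary per social preference label gives a lookup from (label, current cell) to a distribution over next cells, which is the form of prediction the decision module needs.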
Step 6: establish the social preference prediction model. The social preference prediction model is also built on the overtaking data obtained by clustering in step 4. Its purpose is to judge the social preference of the overtaken vehicle online and in real time by learning from the three classes of labelled overtaking data. In essence, the social preference of the overtaken vehicle is classified or predicted by a classifier or a probabilistic model.
There are many data-driven classification methods. Because the amount of overtaking data that can be extracted in the embodiment of the invention is moderate, a support vector machine is suitable: it is interpretable and sparse, that is, it can achieve a good classification result from a small number of samples. In addition, the maximum entropy statistical model, a classical classification model, has high accuracy, allows constraints to be set flexibly, and lets the number of constraints adjust the balance between the model's adaptability to unknown data and its fit to known data. The embodiment of the invention therefore builds a support vector machine model with a linear kernel and a maximum entropy model based on logistic regression to predict the social preference of the overtaken vehicle in real time.
The input of the social preference prediction model is real-time road information, mainly the speed of the overtaken vehicle. Its output is the social preference of the overtaken vehicle, which is passed to the overtaking decision module to assist it in making the optimal decision.
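To make the maximum entropy part of the predictor concrete, the following is a minimal sketch of multinomial logistic regression on one scalar feature, trained by batch gradient descent. It assumes the feature is the overtaken vehicle's speed after centring and scaling; the function names, learning rate, epoch count, and sample data are assumptions for illustration, and the linear-kernel support vector machine is not shown.

```python
import math


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def train_maxent(xs, ys, n_classes=3, lr=0.1, epochs=2000):
    """Fit multinomial logistic regression on a scalar feature."""
    w = [0.0] * n_classes
    b = [0.0] * n_classes
    n = len(xs)
    for _ in range(epochs):
        gw = [0.0] * n_classes
        gb = [0.0] * n_classes
        for x, y in zip(xs, ys):
            p = softmax([w[c] * x + b[c] for c in range(n_classes)])
            for c in range(n_classes):
                err = p[c] - (1.0 if c == y else 0.0)  # gradient of cross-entropy
                gw[c] += err * x
                gb[c] += err
        for c in range(n_classes):
            w[c] -= lr * gw[c] / n
            b[c] -= lr * gb[c] / n
    return w, b


def predict(w, b, x):
    scores = [w[c] * x + b[c] for c in range(len(w))]
    return max(range(len(w)), key=lambda c: scores[c])


# Hypothetical centred speed features with cluster labels 0, 1, 2.
xs = [-2.0, -1.8, -2.2, 0.0, 0.1, -0.1, 2.0, 1.9, 2.1]
ys = [0, 0, 0, 1, 1, 1, 2, 2, 2]
w, b = train_maxent(xs, ys)
```

At run time, `predict` would be called on the current speed feature to obtain the label that selects the state transition model for the decision module.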
Step 7: design the overtaking decision module. Based on a reinforcement learning method, the module uses a semi-model-based improved Q-learning algorithm to decide, during the parallel driving stage of overtaking, whether to continue overtaking and change lanes, and when to change lanes, so that overtaking efficiency and safety are optimized.
Specifically, the overtaking decision model takes as input the overtaken vehicle state transition model obtained in step 5, the real-time social preference prediction for the overtaken vehicle obtained in step 6, and the real-time data of the host vehicle and the overtaken vehicle in the environment. At each time step it analyzes the speed and position information of the host vehicle and the overtaken vehicle, together with the social preference and predicted future position of the overtaken vehicle, and outputs one of three decisions: lane keeping, lane change, or abandoning overtaking.
The model starts when the host vehicle has finished its lane change and entered the parallel driving stage. It ends when the host vehicle either decides to perform the lane change and enters the overtaking completion stage, or decides to abandon overtaking. The evaluation indexes are lane-change efficiency, the position difference from the optimal lane-change point, and whether overtaking is abandoned in time when the social preference of the overtaken vehicle is egoistic. The reward function of the reinforcement learning algorithm is designed on the same basis.
Step 8: design the semi-model-based improved Q-learning algorithm. This step introduces the algorithm used by the overtaking decision module. Because the overtaking problem is defined within the state grid space and the decision actions are discrete, discretizing the state and action spaces is relatively simple. Therefore, to better adapt to the uncertainty of the social preference of the overtaken vehicle, the embodiment of the invention trains the overtaking decision module, which is based on a Markov decision process and social preference, with an improved Q-learning algorithm over discretized state and action spaces. Because the autonomous overtaking problem in the parallel driving stage relies on the overtaken vehicle state transition model described above, the iterative formula of the model-free Q-learning algorithm needs to be modified. The improved Q-learning algorithm is called semi-model-based because the state of the overtaken vehicle comes from a model, whereas the state transitions of the host vehicle come from the simulation platform. The semi-model-based improved Q-learning algorithm adapts well to the uncertainty in the social preference of the overtaken vehicle.
When the overtaken vehicle state transition model is taken into account, the Q function must also consider the state of the overtaken vehicle, so the theoretical formula of the Q-learning algorithm is rewritten as:
$$Q\left(s_t^{HV}, s_t^{OV}, a_t\right) \leftarrow Q\left(s_t^{HV}, s_t^{OV}, a_t\right) + \alpha\left[r_t + \gamma \max_{a} Q\left(s_{t+1}^{HV}, s_{t+1}^{OV}, a\right) - Q\left(s_t^{HV}, s_t^{OV}, a_t\right)\right] \qquad (4)$$
where the superscripts HV and OV denote the host vehicle and the overtaken vehicle, respectively; $s_t$ and $a_t$ denote the vehicle state and the action at time step t, respectively; $r_t$ is the reward; $\alpha$ is the learning rate; and $\gamma$ is the discount factor.
The expected Q value needs to be rewritten as:
$$\mathbb{E}\left[\max_{a} Q\left(s_{t+1}^{HV}, s_{t+1}^{OV}, a\right)\right] = \sum_{s_{t+1}^{OV}} P\left(s_{t+1}^{OV} \mid s_t^{OV}\right) \max_{a} Q\left(s_{t+1}^{HV}, s_{t+1}^{OV}, a\right) \qquad (5)$$
where $P(s_{t+1}^{OV} \mid s_t^{OV})$ is the state transition probability of the overtaken vehicle from its state $s_t^{OV}$ at time t to its state $s_{t+1}^{OV}$ at time t+1.
Equation (5) means that, when the state transition probabilities of the overtaken vehicle are known, the expected Q value at each time step is computed over all possible states of the overtaken vehicle at the next time step.
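The idea of weighting the next-step Q value by the overtaken vehicle's transition probabilities can be sketched as follows. This is a minimal example under stated assumptions: a tabular Q function keyed by (host state, overtaken state, action), a standard temporal-difference update, and a transition model stored as nested dictionaries. The names `expected_next_value`, `q_update`, `alpha`, and `gamma` are hypothetical, and the host vehicle's next state is assumed to come from a simulator.

```python
from collections import defaultdict

ACTIONS = ("lane_keep", "lane_change", "abandon")


def expected_next_value(Q, next_hv, ov_state, ov_model):
    """Average max_a Q over all possible next states of the overtaken vehicle."""
    return sum(
        prob * max(Q[(next_hv, next_ov, a)] for a in ACTIONS)
        for next_ov, prob in ov_model.get(ov_state, {}).items()
    )


def q_update(Q, hv, ov, action, reward, next_hv, ov_model, alpha=0.1, gamma=0.9):
    """One semi-model-based update: model for the overtaken vehicle, sample for the host."""
    target = reward + gamma * expected_next_value(Q, next_hv, ov, ov_model)
    key = (hv, ov, action)
    Q[key] += alpha * (target - Q[key])
    return Q[key]


Q = defaultdict(float)
Q[(1, 2, "lane_change")] = 10.0
Q[(1, 3, "lane_keep")] = 4.0
ov_model = {2: {2: 0.5, 3: 0.5}}  # overtaken vehicle stays in cell 2 or moves to 3
expected = expected_next_value(Q, next_hv=1, ov_state=2, ov_model=ov_model)
updated = q_update(Q, 0, 2, "lane_keep", -1.0, 1, ov_model, alpha=0.5, gamma=1.0)
```

Compared with model-free Q-learning, the only change in this sketch is that the bootstrap term is an expectation over the overtaken vehicle's next cell rather than a single sampled next state.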
Step 9: define the elements of the reinforcement learning model. The basic elements of the reinforcement learning model are the state space, the action space and the reward function. For the overtaking scenario of the invention, the following definitions are made:
1) State space
For the lane-change point decision problem, the relative positions and speeds of the host vehicle and the overtaken vehicle determine the optimal location for the lane change. Furthermore, for overtaken vehicles with different social preferences, the host vehicle should adjust the lane-change position according to how aggressive the overtaken vehicle is. Thus, the factors considered by the state space should include:
Figure 60750DEST_PATH_IMAGE014
(6);
where G is a matrix of grid positions where the vehicle is located, V represents a matrix of vehicle speeds,
Figure 350918DEST_PATH_IMAGE015
is a social preference matrix for the transcended vehicle, an
Figure 699990DEST_PATH_IMAGE016
-1, 0, 1 represent the social preferences of the type of risa, reciprocal, and the type of liber, respectively.
2) Action space
The decision on the lane-change point for autonomous overtaking by an intelligent vehicle is a behavior decision. When the intelligent vehicle is not in a position suitable for changing lanes, it should keep its lane and continue driving; when it is at the optimal lane-change point, it should decide to change lanes immediately; when the overtaken vehicle is too egoistic, it should consider abandoning overtaking. Thus, three optional actions are defined in the overtaking decision module, namely lane keeping, performing the lane change, and abandoning overtaking:
A = {"lane keeping", "perform lane change", "abandon overtaking"} (7).
3) Reward function
For the decision problem of the lane-change point in autonomous overtaking, different rewards or penalties are given for different decision actions. Specifically, lane keeping receives a small penalty, so that the vehicle does not keep its lane indefinitely without changing lanes; a lane change receives a penalty according to the difference between the decided lane-change position and the optimal lane-change position; abandoning overtaking receives a large penalty when this action is chosen at an inappropriate time. Thus, the reward function is defined as follows:
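The state and action definitions can be represented with small immutable types, as in the sketch below. It is illustrative only: the field names, the assumption that speeds are discretized to integers, and the use of string-valued preference labels instead of the numeric encoding are choices made for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    LANE_KEEP = "lane keeping"
    LANE_CHANGE = "perform lane change"
    ABANDON = "abandon overtaking"


class Preference(Enum):
    # String labels are an assumption of this sketch; the text encodes
    # the three types numerically.
    EGOISTIC = "egoistic"
    RECIPROCAL = "reciprocal"
    ALTRUISTIC = "altruistic"


@dataclass(frozen=True)
class State:
    hv_cell: int      # grid position of the host vehicle
    ov_cell: int      # grid position of the overtaken vehicle
    hv_speed: int     # discretized host vehicle speed (assumed)
    ov_speed: int     # discretized overtaken vehicle speed (assumed)
    preference: Preference


# A frozen dataclass is hashable, so (state, action) pairs can index a Q table.
s = State(hv_cell=3, ov_cell=1, hv_speed=5, ov_speed=4, preference=Preference.RECIPROCAL)
q_table = {(s, Action.LANE_KEEP): 0.0}
```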
Figure 75608DEST_PATH_IMAGE017
Figure 902750DEST_PATH_IMAGE018
(8);
where K_l is a constant parameter; g_HV and g_OV are the grid positions of the host vehicle and the overtaken vehicle at the current moment, respectively; and the expected position error, which is related to the three-second rule and to the social preference of the overtaken vehicle, is calculated by:
[Equation (9): expected position error; formula image not available]
where g is the unit length of the grid; T_r is the travel time of 3 seconds in the three-second rule; and the remaining coefficient in equation (9) is a constant parameter.
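Because the formulas for equations (8) and (9) are not reproduced in this text, the following is only a sketch of the qualitative description: a small penalty for lane keeping, a lane-change penalty that grows with the deviation from an expected gap, and a large penalty for abandoning at an inappropriate time. Every functional form and constant here is an assumption, including the per-preference scaling of the three-second gap and the choice to treat abandoning as appropriate only when the overtaken vehicle is egoistic.

```python
T_RULE = 3.0  # seconds, the travel time named by the three-second rule

# HYPOTHETICAL scaling of the three-second gap for each preference label.
PREFERENCE_SCALE = {"altruistic": 0.8, "reciprocal": 1.0, "egoistic": 1.2}


def expected_gap_cells(ov_speed, grid_len, preference):
    """ASSUMED form: three-second travel distance, scaled, in grid cells."""
    return PREFERENCE_SCALE[preference] * ov_speed * T_RULE / grid_len


def reward(action, hv_cell, ov_cell, ov_speed, preference, grid_len,
           k_keep=1.0, k_abandon=100.0):
    """ASSUMED reward shape following the qualitative description only."""
    if action == "lane_keep":
        return -k_keep  # small penalty so the vehicle does not keep its lane forever
    if action == "lane_change":
        gap = hv_cell - ov_cell
        return -abs(gap - expected_gap_cells(ov_speed, grid_len, preference))
    if action == "abandon":
        # Assumed rule: abandoning is only appropriate against an egoistic vehicle.
        return 0.0 if preference == "egoistic" else -k_abandon
    raise ValueError(f"unknown action: {action}")


gap = expected_gap_cells(10.0, 5.0, "reciprocal")
r_change = reward("lane_change", 14, 10, 10.0, "reciprocal", 5.0)
```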
EXAMPLE III
Referring to fig. 5, the system for determining an automatic driving overtaking decision based on social preference according to the present embodiment is applied to a parallel driving stage in an overtaking process, and includes:
a data acquisition module 501, configured to acquire target road information at the current stage; the target road information comprises host vehicle position information, host vehicle speed information, overtaken vehicle position information and overtaken vehicle speed information; both the host vehicle and the overtaken vehicle travel on the target road.
A social preference determination module 502, configured to input the current stage target road information into a social preference prediction model to determine the social preference of the overtaken vehicle at the current stage.
A state transition model determination module 503, configured to determine a state transition model of the overtaken vehicle at the current stage based on the social preference of the overtaken vehicle at the current stage.
An overtaking decision output module 504, configured to input the current stage target road information, the social preference of the overtaken vehicle at the current stage, and the state transition model of the overtaken vehicle at the current stage into an overtaking decision model, so as to determine an overtaking decision of the host vehicle at the current stage; the overtaking decision comprises lane keeping, performing the lane change, and abandoning overtaking; the algorithm applied by the overtaking decision model is a semi-model-based improved Q-learning algorithm. The semi-model-based improved Q-learning algorithm is designed based on a reinforcement learning method.
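The data flow through modules 501 to 504 can be sketched as a single function that chains three callables. This is a structural illustration only: `RoadInfo`, `decide_overtaking`, and the stub predictor and decision model in the usage lines are hypothetical stand-ins, not the trained models.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class RoadInfo:
    """Target road information gathered by the data acquisition step (module 501)."""
    hv_position: float
    hv_speed: float
    ov_position: float
    ov_speed: float


def decide_overtaking(
    info: RoadInfo,
    predict_preference: Callable[[RoadInfo], str],
    models_by_label: Dict[str, Any],
    decision_model: Callable[[RoadInfo, str, Any], Any],
) -> Any:
    preference = predict_preference(info)              # role of module 502
    ov_model = models_by_label[preference]             # role of module 503
    return decision_model(info, preference, ov_model)  # role of module 504


# Usage with stub components (all hypothetical).
info = RoadInfo(hv_position=50.0, hv_speed=22.0, ov_position=45.0, ov_speed=20.0)
models = {"egoistic": {"tag": "e"}, "reciprocal": {"tag": "r"}}
out = decide_overtaking(
    info,
    lambda i: "egoistic" if i.ov_speed > 21 else "reciprocal",
    models,
    lambda i, p, m: (p, m["tag"]),
)
```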
Further, the system of this embodiment further includes a social preference prediction model determining module.
The social preference prediction model determining module specifically comprises:
the sample database construction unit is used for constructing a sample database; the sample database comprises three classes of data, wherein the first class of overtaken vehicle driving data comprises first overtaken vehicle driving data and a first label corresponding to the first overtaken vehicle driving data, the second class of overtaken vehicle driving data comprises second overtaken vehicle driving data and a second label corresponding to the second overtaken vehicle driving data, and the third class of overtaken vehicle driving data comprises third overtaken vehicle driving data and a third label corresponding to the third overtaken vehicle driving data; the first label is the egoistic type, the second label is the reciprocal type, and the third label is the altruistic type.
And the social preference prediction model determining unit is used for determining a social preference prediction model based on the sample database, the support vector machine model with the linear kernel and the maximum entropy model based on logistic regression.
The sample database construction unit specifically includes:
clustering the sample traffic flow data, taking the average running speed of the overtaken vehicle after entering the parallel driving stage of the overtaking process as the feature quantity and the social preferences in the overtaking process as the clustering categories, to obtain first overtaken vehicle driving data, second overtaken vehicle driving data and third overtaken vehicle driving data.
Constructing a sample database based on the first class of overtaken vehicle driving data, the second class of overtaken vehicle driving data, and the third class of overtaken vehicle driving data.
The sample traffic flow data includes host vehicle information and overtaken vehicle information; the host vehicle information comprises position information and speed information of the host vehicle in the parallel driving stage of the overtaking process; the overtaken vehicle information comprises position information and speed information of the overtaken vehicle in the parallel driving stage of the overtaking process.
The social preferences in the overtaking process include an egoistic type, a reciprocal type, and an altruistic type.
The state transition model determining module 503 specifically includes:
the state transition probability calculation unit is used for carrying out statistical operations on the data in the sample database to obtain the state transition probability of the overtaken vehicle at each position under each label;
the state transition model building unit is used for summarizing the state transition probabilities of the overtaken vehicle at all positions under the same label, so as to build the state transition model of the overtaken vehicle under each label;
and the state transition model determining unit is used for selecting, from the state transition models of the overtaken vehicle under each label, the state transition model that matches the social preference of the overtaken vehicle at the current stage.
Compared with the prior art, the overtaking decision system based on the Markov decision process and social preference in the parallel driving stage of overtaking fully considers the longitudinal driving behavior pattern of the overtaken vehicle, and its changes, during the overtaking process, so that the host vehicle is controlled to adjust its overtaking behavior in real time in response to the longitudinal driving behavior of the overtaken vehicle. Specifically, when the overtaken vehicle is of the reciprocal type or the altruistic type, the host vehicle controlled by the overtaking decision system can make an accurate lane-change point decision; when the social preference of the overtaken vehicle is egoistic, the host vehicle can also smoothly decide to abandon overtaking the vehicle ahead. The hierarchical reinforcement learning autonomous overtaking system is able to handle the interaction between the host vehicle and the overtaken vehicle during autonomous overtaking, can achieve safe and efficient autonomous overtaking under this framework, and is feasible for application in actual overtaking scenarios.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the system disclosed by the embodiment, the description is relatively simple because the system corresponds to the method disclosed by the embodiment, and the relevant points can be referred to the method part for description.
The principles and embodiments of the present invention have been described herein using specific examples, which are provided only to help understand the method and the core concept of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, the specific embodiments and the application range may be changed. In view of the above, the present disclosure should not be construed as limiting the invention.

Claims (10)

1. An automatic driving overtaking decision determining method based on social preference is characterized in that the automatic driving overtaking decision determining method is applied to a parallel driving stage in an overtaking process, and the automatic driving overtaking decision determining method comprises the following steps:
acquiring target road information at the current stage; the target road information comprises host vehicle position information, host vehicle speed information, overtaken vehicle position information and overtaken vehicle speed information; the host vehicle and the overtaken vehicle both travel on the target road;
inputting the current stage target road information into a social preference prediction model to determine the social preference of the overtaken vehicle at the current stage;
determining a state transition model of the overtaken vehicle at the current stage based on the social preference of the overtaken vehicle at the current stage;
inputting the current stage target road information, the social preference of the overtaken vehicle at the current stage and the state transition model of the overtaken vehicle at the current stage into an overtaking decision model to determine an overtaking decision of the host vehicle at the current stage; the overtaking decision comprises lane keeping, performing the lane change, and abandoning overtaking;
the algorithm applied by the overtaking decision model is a semi-model-based improved Q-learning algorithm.
2. The method of claim 1, wherein the social preference prediction model is determined by:
constructing a sample database; the sample database comprises three classes of data, wherein the first class of overtaken vehicle driving data comprises first overtaken vehicle driving data and a first label corresponding to the first overtaken vehicle driving data, the second class of overtaken vehicle driving data comprises second overtaken vehicle driving data and a second label corresponding to the second overtaken vehicle driving data, and the third class of overtaken vehicle driving data comprises third overtaken vehicle driving data and a third label corresponding to the third overtaken vehicle driving data; the first label is the egoistic type, the second label is the reciprocal type, and the third label is the altruistic type;
and determining a social preference prediction model based on the sample database, a support vector machine model with a linear kernel and a maximum entropy model based on logistic regression.
3. The method for determining a social preference-based automatic driving overtaking decision as claimed in claim 2, wherein the determining a state transition model of the overtaken vehicle at the current stage based on the social preference of the overtaken vehicle at the current stage comprises:
performing statistical operations on the data in the sample database to obtain the state transition probability of the overtaken vehicle at each position under each label;
summarizing the state transition probabilities of the overtaken vehicle at all positions under the same label to construct the state transition model of the overtaken vehicle under each label;
and selecting, from the state transition models of the overtaken vehicle under each label, the state transition model that matches the social preference of the overtaken vehicle at the current stage.
4. The method according to claim 2, wherein the constructing a sample database comprises:
clustering the sample traffic flow data, taking the average running speed of the overtaken vehicle after entering the parallel driving stage of the overtaking process as the feature quantity and the social preferences in the overtaking process as the clustering categories, to obtain first overtaken vehicle driving data, second overtaken vehicle driving data and third overtaken vehicle driving data;
constructing a sample database based on the first class of overtaken vehicle driving data, the second class of overtaken vehicle driving data, and the third class of overtaken vehicle driving data;
the sample traffic flow data includes host vehicle information and overtaken vehicle information; the host vehicle information comprises position information and speed information of the host vehicle in the parallel driving stage of the overtaking process; the overtaken vehicle information comprises position information and speed information of the overtaken vehicle in the parallel driving stage of the overtaking process;
the social preferences in the overtaking process include an egoistic type, a reciprocal type, and an altruistic type.
5. The method for determining a social preference-based automatic driving overtaking decision as claimed in claim 1, wherein the semi-model-based improved Q-learning algorithm is designed based on a reinforcement learning method.
6. An autonomous driving overtaking decision making system based on social preferences, the autonomous driving overtaking decision making system being applied in a parallel driving phase in an overtaking process, the autonomous driving overtaking decision making system comprising:
the data acquisition module is used for acquiring the target road information at the current stage; the target road information comprises host vehicle position information, host vehicle speed information, overtaken vehicle position information and overtaken vehicle speed information; the host vehicle and the overtaken vehicle both travel on the target road;
the social preference determination module is used for inputting the current stage target road information into a social preference prediction model so as to determine the social preference of the overtaken vehicle at the current stage;
the state transition model determining module is used for determining a state transition model of the overtaken vehicle at the current stage based on the social preference of the overtaken vehicle at the current stage;
the overtaking decision output module is used for inputting the current stage target road information, the social preference of the overtaken vehicle at the current stage and the state transition model of the overtaken vehicle at the current stage into an overtaking decision model so as to determine an overtaking decision of the host vehicle at the current stage; the overtaking decision comprises lane keeping, performing the lane change, and abandoning overtaking;
the algorithm applied by the overtaking decision model is a semi-model-based improved Q-learning algorithm.
7. A social preference based autonomous driving overtaking decision making system as claimed in claim 6 further comprising a social preference prediction model determination module; the social preference prediction model determining module specifically comprises:
the sample database construction unit is used for constructing a sample database; the sample database comprises three classes of data, wherein the first class of overtaken vehicle driving data comprises first overtaken vehicle driving data and a first label corresponding to the first overtaken vehicle driving data, the second class of overtaken vehicle driving data comprises second overtaken vehicle driving data and a second label corresponding to the second overtaken vehicle driving data, and the third class of overtaken vehicle driving data comprises third overtaken vehicle driving data and a third label corresponding to the third overtaken vehicle driving data; the first label is the egoistic type, the second label is the reciprocal type, and the third label is the altruistic type;
and the social preference prediction model determining unit is used for determining a social preference prediction model based on the sample database, the support vector machine model with the linear kernel and the maximum entropy model based on logistic regression.
8. The system of claim 7, wherein the state transition model determination module specifically comprises:
the state transition probability calculation unit is used for carrying out statistical operations on the data in the sample database to obtain the state transition probability of the overtaken vehicle at each position under each label;
the state transition model building unit is used for summarizing the state transition probabilities of the overtaken vehicle at all positions under the same label, so as to build the state transition model of the overtaken vehicle under each label;
and the state transition model determining unit is used for selecting, from the state transition models of the overtaken vehicle under each label, the state transition model that matches the social preference of the overtaken vehicle at the current stage.
9. The system according to claim 7, wherein the sample database construction unit specifically comprises:
clustering the sample traffic flow data, taking the average running speed of the overtaken vehicle after entering the parallel driving stage of the overtaking process as the feature quantity and the social preferences in the overtaking process as the clustering categories, to obtain first overtaken vehicle driving data, second overtaken vehicle driving data and third overtaken vehicle driving data;
constructing a sample database based on the first class of overtaken vehicle driving data, the second class of overtaken vehicle driving data, and the third class of overtaken vehicle driving data;
the sample traffic flow data includes host vehicle information and overtaken vehicle information; the host vehicle information comprises position information and speed information of the host vehicle in the parallel driving stage of the overtaking process; the overtaken vehicle information comprises position information and speed information of the overtaken vehicle in the parallel driving stage of the overtaking process;
the social preferences in the overtaking process include an egoistic type, a reciprocal type, and an altruistic type.
10. The system according to claim 6, wherein the semi-model-based improved Q-learning algorithm is designed based on a reinforcement learning method.
CN202111322969.7A 2021-11-10 2021-11-10 Social preference-based automatic driving overtaking decision determination method and system Active CN113753049B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111322969.7A CN113753049B (en) 2021-11-10 2021-11-10 Social preference-based automatic driving overtaking decision determination method and system

Publications (2)

Publication Number Publication Date
CN113753049A true CN113753049A (en) 2021-12-07
CN113753049B CN113753049B (en) 2022-02-08

Family

ID=78784864

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111322969.7A Active CN113753049B (en) 2021-11-10 2021-11-10 Social preference-based automatic driving overtaking decision determination method and system

Country Status (1)

Country Link
CN (1) CN113753049B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100315217A1 (en) * 2009-06-15 2010-12-16 Aisin Aw Co., Ltd. Driving support device and program
CN106874597A (en) * 2017-02-16 2017-06-20 北理慧动(常熟)车辆科技有限公司 A kind of highway passing behavior decision-making technique for being applied to automatic driving vehicle
EP3594921A1 (en) * 2018-07-09 2020-01-15 Continental Automotive GmbH Overtaking assistance system for a vehicle
CN110969848A (en) * 2019-11-26 2020-04-07 武汉理工大学 Automatic driving overtaking decision method based on reinforcement learning under opposite double lanes
CN111452789A (en) * 2020-04-07 2020-07-28 北京汽车集团越野车有限公司 Automatic driving overtaking control method and system
CN111645692A (en) * 2020-06-02 2020-09-11 中国科学技术大学先进技术研究院 Hybrid strategy game-based driver overtaking intention identification method and system
CN112046484A (en) * 2020-09-21 2020-12-08 吉林大学 Q learning-based vehicle lane-changing overtaking path planning method
CN113306558A (en) * 2021-07-30 2021-08-27 北京理工大学 Lane changing decision method and system based on lane changing interaction intention

Also Published As

Publication number Publication date
CN113753049B (en) 2022-02-08

Similar Documents

Publication Publication Date Title
CN109670277B (en) Travel time prediction method based on multi-mode data fusion and multi-model integration
CN110164128B (en) City-level intelligent traffic simulation system
CN111523643A (en) Trajectory prediction method, apparatus, device and storage medium
Yan et al. Spatial-temporal chebyshev graph neural network for traffic flow prediction in iot-based its
CN112215487A (en) Vehicle driving risk prediction method based on neural network model
Li et al. Short-term vehicle speed prediction based on BiLSTM-GRU model considering driver heterogeneity
WO2024027027A1 (en) Method and system for recognizing lane changing intention of manually-driven vehicle
CN112907970B (en) Variable lane steering control method based on vehicle queuing length change rate
Wheeler et al. Analysis of microscopic behavior models for probabilistic modeling of driver behavior
CN111695737A (en) Group target advancing trend prediction method based on LSTM neural network
He et al. Probabilistic intention prediction and trajectory generation based on dynamic Bayesian networks
CN116050672B (en) Urban management method and system based on artificial intelligence
Qi et al. What is the appropriate temporal distance range for driving style analysis?
Wang et al. A state dependent mandatory lane-changing model for urban arterials with hidden Markov model method
Trauth et al. Learning and adapting behavior of autonomous vehicles through inverse reinforcement learning
Papathanasopoulou et al. Data-driven traffic simulation models: Mobility patterns using machine learning techniques
Axenie et al. Fuzzy modelling and inference for physics-aware road vehicle driver behaviour model calibration
CN113753049B (en) Social preference-based automatic driving overtaking decision determination method and system
Wang et al. Transformation mechanism of vehicle cluster situations under dynamic evolution of driver’s propensity
Lu et al. Learning Car-Following Behaviors for a Connected Automated Vehicle System: An Improved Sequence-to-Sequence Deep Learning Model
Arbabi et al. Planning for autonomous driving via interaction-aware probabilistic action policies
Chen et al. Platoon separation strategy optimization method based on deep cognition of a driver’s behavior at signalized intersections
CN115062202A (en) Method, device, equipment and storage medium for predicting driving behavior intention and track
Ma et al. Lane change analysis and prediction using mean impact value method and logistic regression model
Bhattacharyya Modeling Human Driving from Demonstrations

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant