CN115071758B - Man-machine common driving control right switching method based on reinforcement learning - Google Patents

Man-machine common driving control right switching method based on reinforcement learning Download PDF

Info

Publication number
CN115071758B
CN115071758B (application CN202210758672.3A)
Authority
CN
China
Prior art keywords
driving
driver
vehicle
road
current
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210758672.3A
Other languages
Chinese (zh)
Other versions
CN115071758A (en)
Inventor
陈慧勤
朱嘉祺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dianzi University
Priority to CN202210758672.3A
Publication of CN115071758A
Application granted
Publication of CN115071758B

Links

Images

Classifications

    • B — PERFORMING OPERATIONS; TRANSPORTING
    • B60 — VEHICLES IN GENERAL
    • B60W — CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W60/00 — Drive control systems specially adapted for autonomous road vehicles
    • B60W60/005 — Handover processes
    • B60W60/0059 — Estimation of the risk associated with autonomous or manual driving, e.g. situation too complex, sensor failure or driver incapacity

Abstract

The application discloses a reinforcement-learning-based man-machine co-driving control right switching method, which is suitable for a reinforcement-learning-based man-machine co-driving control right switching system that distributes the driving weight between a driver and a driving system. The method comprises the following steps: calculating a driving operation action prediction index according to the driver information and the vehicle road prediction information; and inputting the driving operation action prediction index and the comprehensive driving operation action index into the control right switching system, and calculating the driving weight between the driver and the driving system. Through the technical scheme in the application, the combined longitudinal and lateral risk of the vehicle is effectively handled, the influence of uncertainty introduced by the driver is weakened, and the driver is considered comprehensively from different angles, so that the judgment error regarding the driver is reduced.

Description

Man-machine common driving control right switching method based on reinforcement learning
Technical Field
The application relates to the technical field of intelligent driving, in particular to a man-machine driving sharing control right switching method based on reinforcement learning.
Background
In the conventional automatic driving technology, a control right switching mode is generally adopted to correct the driving behavior of a driver so as to improve the driving safety of a vehicle.
For example, in patent CN 109795486A, the co-driving coefficient (ranging from 0 to 1) is dynamically adjusted according to the driver input torque Td and the time to lane crossing TLC of the left and right wheels, so as to realize a gradual transition from the driver to the auxiliary control system; the co-driving coefficient is determined by fuzzy control. However, while this approach addresses the risk of lateral deviation during driving, it does not take into account the longitudinal risks involved in driving.
For another example, patent CN 108469806A constructs key factors from the current driving environment, vehicle and driver states, performs situational assessment on these key factors, synchronously assesses the driving abilities of the automatic driving system and the driver, and determines whether the driving right can be transferred. Although this scheme considers many factors that may affect driving safety, the evaluation of driving ability during the driving right switch is overly complex, involves large subjective and random factors, relies on too much data, and has poor real-time performance and stability.
Similarly, the paper "Human-machine co-driving model based on a driver risk response mechanism" quantifies the environmental risk, obtains a safety risk response strategy by fitting the environmental risk and the driver's driving acceleration, and flexibly switches the human-machine co-driving control right through strategy deviation. This method addresses the coupling between the driver state and environmental safety, but the safety strategy is built on a large number of driving segments that cannot fully cover all safe operations, and it only addresses switching when the driver overtakes or follows vehicles on a highway. Meanwhile, this control right switching approach only considers safety at the current moment and does not consider traffic hazards that may arise in a future time period.
Therefore, the safety and stability of the control right switching scheme in the existing automatic driving need to be improved.
Disclosure of Invention
The purpose of this application is to effectively handle the combined longitudinal and lateral risk of the vehicle and to reduce the judgment error regarding the driver, so as to improve the accuracy and safety of driving right switching.
The technical scheme of the application is as follows: a reinforcement-learning-based man-machine co-driving control right switching method, suitable for a reinforcement-learning-based man-machine co-driving control right switching system that distributes the driving weight between a driver and a driving system, comprises the following steps: calculating a driving operation action prediction index according to the driver information and the vehicle road prediction information; and inputting the driving operation action prediction index and the comprehensive driving operation action index into the control right switching system, and calculating the driving weight between the driver and the driving system.
In any of the above technical solutions, further, the driver information at least includes a driver state, a driver intention, a driver style, and a driver subconscious driving influence deviation, and the vehicle road prediction information at least includes a predicted vehicle road risk degree and a predicted vehicle road danger threshold.
The calculation formula of the driving operation action prediction index is given only as an equation image in the original publication, where Z_t is the driver state operation response delay, σ is the driver subconscious driving influence deviation, δ is the driver intention, S is the driver style, v_risk is the predicted vehicle road risk degree, and A_arisk is the predicted vehicle road danger threshold.
In any of the above technical solutions, further, the calculation formula of the driver subconscious driving influence deviation σ is given only as an equation image in the original publication, together with

R_d = |d − q_ki|

where σ is the driver subconscious driving influence deviation, sum is the number of collected traffic scenes, D_i is a series of subconscious driving strengths within one traffic scene time period, ρ', τ and ω are undetermined parameters, α is the subconscious side weight, β is the driver's personal safety tendency weight, d is the current lateral position of the vehicle, q_ki is the fitted lateral position of the vehicle under this scene (label), a is the vehicle acceleration, and R_d is a position parameter.
In any one of the above technical solutions, further, the driver information at least includes a driver state, a driver intention, and a driver style, and the calculation process of the comprehensive driving operation action index specifically includes:
determining current vehicle road information according to the position of the current vehicle in the road, the current vehicle road information at least including a current vehicle road risk degree and a current vehicle road danger threshold;
determining the comprehensive driving operation action index from the driver information and the current vehicle road information by combining an environmental response factor and a piecewise function. The calculation formula of the comprehensive driving operation action index is given only as an equation image in the original publication, where z_1 is the driver state, γ is the environmental response factor, H_x,y is the current vehicle road risk degree, σ is a road correction parameter, a_pre is the real-time operation quantization parameter, and risk is the current vehicle road danger threshold.
In any one of the above technical solutions, further, determining the current vehicle road information according to the position of the current vehicle on the road specifically includes:
determining the position of the current vehicle in the road, which at least includes the distance between the current vehicle and the preceding vehicle and the lateral position of the current vehicle;
determining a longitudinal vehicle road danger value according to the distance between the current vehicle and the preceding vehicle;
determining a lateral vehicle road danger value according to the lateral position of the current vehicle;
calculating the current vehicle road risk degree from the longitudinal vehicle road danger value and the lateral vehicle road danger value; the corresponding calculation formula is given only as an equation image in the original publication, where H_x,y is the current vehicle road risk degree, the risk distance influence factor of different road sections takes values in the range [1, 10], y_1 is the longitudinal vehicle road danger value and y_2 is the lateral vehicle road danger value;
and calculating the current vehicle road danger thresholds of different scenes according to the current vehicle road risk degree, and recording the current vehicle road danger thresholds and the current vehicle road risk degree as the current vehicle road information.
In any of the above technical solutions, further, the calculation formula of the environmental response factor γ is given only as an equation image in the original publication, where M is the vehicle mass, m is the vehicle type and purpose correction parameter, k_1 is the dynamics correction parameter, the desired speed and speed direction of the vehicle enter through a desired-speed term, v_limleast(t) is the minimum speed value, k_2 is the traffic scene correction parameter, followed in the formula by the vehicle interaction force parameter, k_3 is the correction parameter for the degree to which pedestrians comply with traffic regulations, followed by the pedestrian interaction force parameter, k_4 is the correction parameter for the complexity of the surrounding physical environment, followed by the environmental interaction force parameter, and k_5 is the correction parameter for the degree of influence of traffic regulations, followed by the rule parameter.
In any one of the above technical solutions, further, calculating the driving weight between the driver and the driving system specifically includes:
Step 9.1, using the Z-score standardization formula, standardize the driving operation action prediction index and the comprehensive driving operation action index at the current moment, and calculate the mean and standard deviation of the driving operation action prediction index and the comprehensive driving operation action index from the start of driving up to the current driving operation;
Step 9.2, input the Z-score-standardized driving operation action prediction index and comprehensive driving operation action index, together with the currently corresponding mean and standard deviation, as input parameters into the reinforcement-learning-based man-machine co-driving control right switching system, and judge whether the weight distribution condition is met; if so, execute step 9.3, and if not, re-acquire the driver information and the vehicle road prediction information;
Step 9.3, based on the Q-learning algorithm, adjust the learning state in the Q-learning algorithm using the input parameters, and assign the driver's driving weight according to the action that maximizes the value of the next state in the Q-learning algorithm, the driving weight of the driving system being the difference between 1 and the driver's driving weight.
In any one of the above technical solutions, further, the weight distribution condition specifically includes: the first parameter and the second parameter are both less than or equal to a first trigger threshold for 5 consecutive times; or the second parameter is less than or equal to a second trigger threshold for 3 consecutive times; or the first parameter is less than or equal to the second trigger threshold for 3 consecutive times, where the first parameter is the number of standard deviations by which the currently input driving operation action prediction index differs from all driving operation action prediction indices input from the start of the driving behavior to the current time, and the second parameter is the number of standard deviations by which the currently input comprehensive driving operation action index differs from all comprehensive driving operation action indices input from the start of the driving behavior to the current time.
The beneficial effects of this application are:
The technical scheme of the application effectively handles the combined longitudinal and lateral risk of the vehicle and weakens the influence of uncertainty introduced by the driver; by considering the driver comprehensively from different angles, the judgment error regarding the driver is reduced. The scheme is applicable to a variety of traffic scenes, also takes into account the traffic hazards that may arise in a future time period, and further improves the accuracy and safety of driving right switching. Finally, all factors are synthesized into two indices that are input into the switching system, so the amount of data is small and accurate and the real-time performance is high.
In the preferred implementation of the application, the influence of the driver's experience and subconscious on driving is considered, which reduces the judgment burden of the switching system and gives better real-time performance. Moreover, the risk that other vehicles may pose to the own vehicle can be predicted in advance, avoiding rear-end and other collisions during driving.
Drawings
The advantages of the above and/or additional aspects of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
fig. 1 is a schematic flow chart of a reinforcement learning-based human-machine co-driving control authority switching method according to an embodiment of the present application;
FIG. 2 is a diagram of relative positions of roads and relative safe positions according to one embodiment of the present application;
FIG. 3 is a schematic diagram of a model-free reinforcement learning process according to an embodiment of the present application;
fig. 4 is a schematic diagram illustrating an overall structure of a reinforcement learning-based human-machine co-driving control right switching mechanism according to an embodiment of the present application;
FIG. 5 is a diagram illustrating Q-tables in a Q-learning algorithm in reinforcement learning according to an embodiment of the present application.
Detailed Description
In order that the above objects, features and advantages of the present application can be more clearly understood, the present application will be described in further detail with reference to the accompanying drawings and detailed description. It should be noted that the embodiments and features of the embodiments of the present application may be combined with each other without conflict.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application, however, the present application may be practiced in other ways than those described herein, and therefore the scope of the present application is not limited by the specific embodiments disclosed below.
As shown in fig. 1, the present embodiment provides a method for switching a driving control right of a human-computer based on reinforcement learning, including:
step 1, constructing a simulator based on a real vehicle road environment, and constructing vehicle road scenes of various situations in the simulator;
further, step 1 is realized by:
step 1.1, simulator hardware needs to have a camera for acquiring a human image of a driver and a driving operation environment for simulating a real vehicle;
Step 1.2, constructing a large number of typical traffic environments that may be encountered in the real world, including car-following scenes on all types of road sections, overtaking scenes on all types of road sections, road intersection scenes, congested road section scenes, and the like;
Step 1.3, inserting a certain number of dangerous traffic scenes and accident simulation scenes into the construction of the different typical traffic environment scenes.
Step 2, continuously collecting vehicle information on the surrounding roads, the current driver state, driver intention, driver style and action information, control weight distribution information, and relevant information of the automatic driving system, and calculating the driver subconscious driving influence deviation;
further, step 2 may include the following processes:
step 2.1, the driver needs to complete a complete driving process in different scenes;
Step 2.2, without intervention of the control right switching system, the driver drives normally in a certain number of different driving scenes; the driver's driving operations and the road conditions during driving are collected and recorded, the driver style is obtained through statistical analysis, and the driver subconscious driving influence deviation is calculated (the subconscious influence on the driver's driving operation is described by the acceleration and deceleration, and the change of lateral road position, caused by the experience the driver has accumulated in different driving scenes):
Wherein, the calculation formula of the driver subconscious driving influence deviation is given only as an equation image in the original publication, together with

R_d = |d − q_ki|

where σ is the driver subconscious driving influence deviation, sum is the number of collected traffic scenes, D_i is a series of subconscious driving strengths within one traffic scene time period, ρ', τ and ω are undetermined parameters, α is the subconscious side weight, β is the driver's personal safety tendency weight, d is the current lateral position of the vehicle, q_ki is the fitted lateral position of the vehicle under this scene (label), a is the vehicle acceleration, and R_d is a position parameter.
Specifically, the driver subconscious driving influence deviation does not consider the influence of other traffic participants and is considered only from the perspective of personal safety. Based on the maximum entropy principle, a maximum entropy method related to the driver's subconscious is established.
Firstly, an entropy function is constructed:

H(x) = −C·Σ_k p_k·log₂(p_k)

where H(x) is the entropy, a measure of the uncertainty of a thing; p_k is a probability distribution; and C is a constant that depends on the unit of measure of the entropy, here taken as 1.
What is needed from this entropy function is how the driver subconscious driving influence deviation, i.e. the subconscious reaction to the environment the driver is currently in, affects behavior. However, because the probability distribution p_k is a decimal between 0 and 1, log₂(p_k) is negative, so in this embodiment a non-negative integer q_i is introduced to replace the probability distribution p_k in the entropy function.
The parameter q_i is defined as the relative safe position in different road scenes. The relative position is shown in fig. 2: a lateral coordinate axis is established with the left side of the road as the origin, half the width of a single lane is taken as one driving position, and the road is divided into eight regions; the relative safe position is the position where more than half of the vehicle is located during normal driving.
Roads in different scenes differ greatly, and a specific position that completely represents the road path cannot be obtained accurately. Therefore, ln is used in place of the base-2 logarithm, and since q_i mainly takes integer values larger than one, the negative sign of the original entropy function must be removed. The difference can be represented by a corrected entropy E, whose formula is given only as an equation image in the original publication.
secondly, establishing a constraint condition of the correction entropy: first, the road conditions are constrained, and each driver will choose the side with good road conditions. Second, under the constraints of traffic regulations, drivers tend to drive more as specified by traffic regulations. Thirdly, the constraint of traffic demand, namely whether the driver needs to overtake, follow or go straight in the road scene, is as follows:
constraint 1:
Figure BDA0003720378570000082
constraint 2:
Figure BDA0003720378570000083
constraint 3: b (q) i )∈S
In the formula, A min 、A max The lower limit and the upper limit of the road traffic capacity score are set;
Figure BDA0003720378570000084
interference coefficients for unfamiliar degrees of different road sections; b is a traffic demand impact weight; b is the maximum boundary of the traffic rule; b (q) i ) For the determination of traffic demands, i.e. knowing whether a traffic demand is overtaking, following or going straight, q is estimated by the demand i (ii) a And S is a traffic demand set, and all normal driving behavior position results are in the set.
The three constraint conditions and the different road scenes are set, and the corrected entropy is used for calculation: the relative safe position q_i of each road scene is the value at which the corrected entropy E is maximal. The relative safe positions q_i are clustered, each class is given a label (such as overtaking, following or going straight), and the relative safe positions q_i are then fitted to obtain the fitted lateral position q_ki, which is the safe position the driver most tends to take under the different labels, as illustrated by the sketch below.
In summary, the fitted lateral position q_ki is obtained by fitting the relative safe positions q_i at which the corrected entropy E is maximal under the constraint conditions.
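As an illustration of this clustering-and-fitting step, the following minimal sketch groups per-scene relative safe positions by their maneuver label and takes a simple per-label average as the fitted lateral position q_ki; the actual fitting procedure is not specified in the text, so the averaging here is only an assumption, and the sample values are illustrative.

```python
from collections import defaultdict
from statistics import mean

def fit_lateral_positions(samples):
    """samples: list of (label, q_i) pairs, where q_i is the relative safe
    position that maximizes the corrected entropy E for one road scene.
    Returns a hypothetical fitted lateral position q_ki per label."""
    by_label = defaultdict(list)
    for label, q_i in samples:
        by_label[label].append(q_i)
    # A simple per-label average stands in for the unspecified fitting step.
    return {label: mean(values) for label, values in by_label.items()}

q_ki = fit_lateral_positions([("following", 3), ("following", 4),
                              ("overtaking", 6), ("straight", 4)])
```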
The calculation formula of D_i, the series of subconscious driving strengths within one traffic scene time period, is given only as an equation image in the original publication, together with

R_d = |d − q_ki|

where d is the current lateral position of the vehicle, q_ki is the fitted lateral position of the vehicle under this scene (label), a is the vehicle acceleration, α is the subconscious side weight, β is the driver's personal safety tendency weight, and ρ', τ and ω are undetermined parameters whose values follow the trend of the subconscious driving strength in different traffic scenes, as follows:
When R_d ≥ Z (Z is a safety value, a set value that differs for different roads) and the subconscious driving strength D_i is large, the values of the undetermined parameters ρ', τ and ω increase with R_d and |a|, i.e. the situation becomes more and more unsafe, and the strength of the subconscious driving operation action is larger;
When R_d < Z and D_i is small, the values of the undetermined parameters ρ', τ and ω decrease with R_d and |a|, i.e. the situation becomes safer and safer, and the strength of the subconscious driving operation action is smaller.
The driver subconscious driving influence deviation σ is the result of averaging these strengths over the collected traffic scenes (the averaging formula is given only as an equation image in the original publication), where sum is the number of collected traffic scenes.
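The following minimal sketch shows how σ would be obtained from the per-scene strengths, assuming the D_i values have already been computed (their formula appears only as an equation image in the original); the R_d helper follows the definition given above, and the sample values are illustrative only.

```python
import numpy as np

def position_deviation(d, q_ki):
    # R_d = |d - q_ki|: deviation of the current lateral position d from the
    # fitted lateral position q_ki for the scene's label.
    return abs(d - q_ki)

def subconscious_driving_influence_deviation(D):
    # sigma: average of the subconscious driving strengths D_i over all
    # 'sum' collected traffic scenes.
    D = np.asarray(D, dtype=float)
    return float(D.mean())

# Hypothetical per-scene strengths D_i (values for illustration only).
sigma = subconscious_driving_influence_deviation([0.8, 1.1, 0.6, 0.9])
```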
Step 2.3, simulating conditions possibly encountered in the real driving process by a driver, such as dangerous states of fatigue, emotional excitement, distraction and the like, and normal driving;
Step 2.4, collecting data to obtain the speed, distance and road surface information of surrounding vehicles, the brake, accelerator and steering wheel data of the own vehicle, the driving weight distribution, and the driver intention and operation data in the driving system, and obtaining the driver state and intention information through statistical data processing.
Step 3, obtaining interaction force and environmental response factor gamma with each peripheral unit according to the collected peripheral road information and vehicle state;
further, step 3 is realized by:
and 3.1, the environmental response factor gamma is an interaction force under the combined effect of the vehicle and the vehicle road environment interaction, and particularly responds to different units. The environmental response factor γ was calculated using the following formula:
(the full formula is given only as an equation image in the original publication; its terms are as follows)
v_limleast(t) is the minimum of the speed limit and the vehicle speed in the current time period's scene;
M is the mass of the vehicle;
m is the vehicle type and purpose correction parameter;
the desired speed and speed direction of the vehicle enter through a desired-speed term, which is derived from Newton's second law and the kinematic formulas.
k_1 is the dynamics correction parameter;
k_2 is the traffic scene correction parameter, e.g. for highway sections, congested sections, etc.; the corresponding term is the interaction force with other vehicles, whose vehicle interaction force parameter is given only as an equation image in the original publication. In it, θ_1l is the angle between the direction of travel of the own vehicle and the direction of travel of the other vehicle, Δv_1l/Δμ_1l is the ratio of the speed difference to the distance difference, u is the safe distance, and ρ is the distance to the other vehicle. The expression means that a distance greater than the safe distance represents an attractive force, which becomes smaller as the distance approaches the safe distance; when the distance is smaller than the safe distance it turns into a repulsive force, which becomes larger as the other vehicle gets closer. There is no interaction force between vehicles that are laterally side by side and travelling in parallel, and the absolute value of the interaction force is largest for vehicles in the same longitudinal lane.
k_3 is the correction parameter for the degree to which pedestrians comply with traffic regulations; the corresponding term is the interaction force with pedestrians, whose pedestrian interaction force parameter is given only as an equation image in the original publication. In it, v is the current vehicle speed, θ_1j is the angle between the center of the vehicle front and the pedestrian, r_1j is the distance difference, and t_1j is the estimated meeting time. The formula means that when the vehicle speed is 0 there is no interaction force between the vehicle and the pedestrian; the shorter the distance between vehicle and pedestrian, the smaller the angle difference, the shorter the estimated meeting time and the higher the vehicle speed, the larger the repulsive force becomes.
k_4 is the correction parameter for the complexity of the surrounding physical environment; the corresponding term is the interaction force with the surrounding physical environment, i.e. non-moving objects such as buildings, whose environmental interaction force parameter is given only as an equation image in the original publication. In it, T is the volume of the immovable object. The larger the volume, the larger the repulsive force; when the volume is smaller than or equal to the size the vehicle can pass, the interaction force is an attractive force, and when the volume is larger than the passable size and the collision time T_1R is smaller, the repulsive force is larger; the greater the vehicle mass and the higher the vehicle speed, the greater the repulsive force, and at speed 0 there is no interaction force.
k_5 is the correction parameter for the degree of influence of traffic regulations, reflecting how much importance the vehicle attaches to the traffic rules; the corresponding term acts as a resistance imposed by the traffic regulations, whose rule parameter is given only as an equation image in the original publication. In it, v_lim is the maximum speed permitted by the traffic regulations and traffic signs. The lower the permitted speed, the larger the resistance; when the traffic regulations and traffic signs require stopping, for example at a red light, the resistance is infinite.
Step 4, according to the collected current action information, namely brake, accelerator and steering wheel, perform normalization preprocessing to obtain the real-time operation quantization parameter a_pre;
Further, step 4 is implemented by:
step 4.1, extracting brake force, accelerator force and steering wheel angle data through a sensor;
Step 4.2, normalize the three data streams (brake, accelerator, steering wheel angle) using min-max standardization, each according to

x_norm = (x* − x_min) / (x_max − x_min)

where x* is the current value, x_min the minimum value, and x_max the maximum value.
Since, by the operating specification, the accelerator and the brake are mutually exclusive operations, the normalization results are combined as follows:
Longitudinal operation interval: [−1, 1];
Lateral operation interval: [−1, 1].
Step 4.3, for the longitudinal operation interval and the lateral operation interval, both ranging over −1 ≤ x ≤ 1, a one-dimensional parameter is constructed:
with the longitudinal value written as 0.a1a2a3a4… and the lateral value as 0.b1b2b3b4…, a cross method is constructed: the two decimals are segmented after all non-zero digits, and the segments are cross-recombined to obtain the one-dimensional real-time operation quantization parameter a_pre.
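The following minimal sketch illustrates steps 4.2 and 4.3 under stated assumptions: the min-max normalization is as written above, and the "cross method" is taken to mean alternating the decimal digits of the (absolute) longitudinal and lateral values; the exact segmentation rule in the patent is only loosely described, so this interleaving is an assumption, and the input values are illustrative.

```python
def min_max(value, lo, hi):
    # Min-max standardization: maps value from [lo, hi] to [0, 1].
    return (value - lo) / (hi - lo)

def interleave_decimals(longitudinal, lateral, digits=4):
    """Cross-recombine the decimal digits of two values in [-1, 1] into one
    real-time operation quantization parameter a_pre (assumed interleaving)."""
    lon = f"{abs(longitudinal):.{digits}f}".split(".")[1]
    lat = f"{abs(lateral):.{digits}f}".split(".")[1]
    mixed = "".join(a + b for a, b in zip(lon, lat))
    return float("0." + mixed)

longitudinal = 0.35   # e.g. accelerator at 35 % of its range
lateral = -0.12       # e.g. small steering input to the left
a_pre = interleave_decimals(longitudinal, lateral)
print(a_pre)          # 0.3152
```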
Step 5, determine the current vehicle road information according to the position of the current vehicle on the road, the current vehicle road information at least including the current vehicle road risk degree and the current vehicle road danger threshold.
Further, step 5 is implemented by:
Step 5.1, determine the position of the current vehicle in the road, which at least includes the distance to the preceding vehicle and the lateral position of the current vehicle, and determine the longitudinal vehicle road danger value from the distance to the preceding vehicle.
The risk of the longitudinal position is inversely proportional to the distance from the tail of the preceding vehicle, i.e. the closer to the tail of the preceding vehicle, the greater the risk. A longitudinal vehicle road danger function is established with the tail of the preceding vehicle as the origin of the coordinate axis; the specified normal safe distance is ζ_1, and a minimum safe distance η_1 is set, where η_1 is the distance at which braking at maximum deceleration just avoids colliding with the preceding vehicle.
The longitudinal danger function itself is given only as an equation image in the original publication, where y_1 is the longitudinal vehicle road danger value and x_1 is the distance to the preceding vehicle;
Step 5.2, determine the lateral vehicle road danger value from the lateral position of the current vehicle:
a lateral vehicle road danger function is established with the center point of the vehicle front as the origin:

y_2 = 0.5·cos[(π/T)·x_2] − 0.5,  −T ≤ x_2 ≤ T

where y_2 is the lateral vehicle road danger value, x_2 is the current lateral position, and T is the distance from the lane center line to the lane edge line;
step 5.3, calculating to obtain the current vehicle road danger degree H x,y
Figure BDA0003720378570000132
Figure BDA0003720378570000133
The risk distance influence factors of different road sections have the value range of [1,10 ]]When the value is 1, the current road section and the driving state are both a standard passing road section and a driving environment under the rule of intersection. When the value is 10, the conditions that the current driving environment is severe, the road traffic capacity is extremely poor and rear-end accidents happen frequently around the road, such as a heavy fog and frozen road section, are indicated.
Step 5.4, calculate the current vehicle road danger thresholds for different scenes:

risk = ω·γ·H_x,y

where ω is a scene influence parameter and γ is the environmental response factor.
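A minimal sketch of the parts of step 5 that are given explicitly in the text: the lateral danger function y_2 and the scene danger threshold risk = ω·γ·H_x,y. The longitudinal danger function and the exact combination into H_x,y appear only as equation images, so H_x,y is passed in here as a precomputed value, and all numbers are illustrative.

```python
import math

def lateral_danger(x2, T):
    # y_2 = 0.5*cos((pi/T)*x_2) - 0.5 for -T <= x_2 <= T:
    # 0 at the lane center line, -1 at the lane edge lines.
    assert -T <= x2 <= T, "x_2 must lie between the lane center and edge"
    return 0.5 * math.cos((math.pi / T) * x2) - 0.5

def danger_threshold(omega, gamma, H_xy):
    # risk = omega * gamma * H_x,y for the current scene.
    return omega * gamma * H_xy

y2 = lateral_danger(x2=0.4, T=1.75)                       # 0.4 m off the lane center
risk = danger_threshold(omega=1.2, gamma=0.8, H_xy=3.5)   # illustrative values
```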
Step 6, according to the collected driver state, driver intention and real-time operation quantization parameter a_pre, combined with the environmental response factor γ, the current vehicle road risk degree and the current vehicle road danger threshold, determine the own comprehensive driving operation action index in the form of a piecewise function. The corresponding calculation formula is given only as an equation image in the original publication, where:
z_1 is the driver state (different driver states give different degrees of environmental response),
γ is the environmental response factor,
δ is the driver intention, representing the degree to which the current operation coincides with the recognized driver intention,
H_x,y is the current vehicle road risk degree,
σ is a road correction parameter,
a_pre is the real-time operation quantization parameter,
risk is the current vehicle road danger threshold.
Step 7, obtain the predicted vehicle road risk degree and the predicted vehicle road danger threshold from the rate of increase of the interaction forces.
Specifically, the interaction force in step 3 is inversely proportional to distance: the faster the interaction force increases, the more likely danger is to occur. The derivation formulas are given only as equation images in the original publication, together with

A_arisk = ρ·a_risk

where v_f is the rate of increase of a single unit's interaction force, v_risk is the predicted vehicle road risk degree, a_f is the acceleration of the increase of a single unit's interaction force, a_risk is the sum of the acceleration increases of the interaction forces of all peripheral units, A_arisk is the predicted vehicle road danger threshold, and ρ is the vehicle road danger influence factor, determined by the complexity of the current road, with a value range of [0, 1].
Step 8, calculate the driving operation action prediction index from the driver information and the vehicle road prediction information.
The driver information at least includes the driver state, driver intention, driver style and driver subconscious driving influence deviation, and the vehicle road prediction information at least includes the predicted vehicle road risk degree and the predicted vehicle road danger threshold. The calculation formula of the driving operation action prediction index is given only as an equation image in the original publication, where:
σ is the driver subconscious driving influence deviation; the historical driver subconscious driving influence deviation of the most similar scene is obtained by comparing traffic scenes;
S is the driver style, i.e. a quantitative evaluation of the driver style on [0, 10] obtained through a driver style test; a driver style below 1 is an extremely unsuitable driver style, which affects the delay of the driver intention and driving-state operation reaction;
Z_t is the driver state operation reaction delay, a set value; the larger the delay, the smaller the driving operation action prediction index;
δ is the driver intention; different driver intention paths have a large influence on the driver subconscious driving influence deviation.
Step 9, input the driving operation action prediction index and the comprehensive driving operation action index into the reinforcement-learning-based man-machine co-driving control right switching system, and calculate and adjust the driving weights respectively required by the driver and the driving system.
Specifically, as shown in fig. 3 and fig. 4, the driving operation action prediction index represents how the various risk factors of the future time period affect the operation safety degree, emphasizing the influence that other units will have on the own vehicle after its own operation. The comprehensive driving operation action index represents how each risk factor at the current time affects the operation safety degree, emphasizing whether the current position is safe and whether effective driving is possible in the current state;
Step 9.1, using the Z-score standardization formula, standardize the driving operation action prediction index and the comprehensive driving operation action index at the current moment, and calculate the mean and standard deviation of the driving operation action prediction index and the comprehensive driving operation action index from the start of driving up to the current driving operation.
Step 9.2, input the Z-score-standardized driving operation action prediction index and comprehensive driving operation action index, together with the currently corresponding mean and standard deviation, as input parameters into the reinforcement-learning-based man-machine co-driving control right switching system and judge whether a weight distribution condition is met; if so, execute step 9.3, and if not, re-acquire the driver information and the vehicle road prediction information.
Specifically, the Z-score parameter represents by how many standard deviations a sampled value differs from the data mean. Taking the driving operation action prediction index as an example, the first parameter is the number of standard deviations by which the currently input driving operation action prediction index (the sampled value) differs from the mean of all driving operation action prediction indices input from the start of the driving behavior to the current time, the standard deviation being that of all driving operation action prediction indices input over the same period. The second parameter is defined analogously for the comprehensive driving operation action index and is not described further.
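A minimal sketch of the standardization in steps 9.1 and 9.2, assuming the indices are simply accumulated since the start of the drive; the class and variable names are illustrative, not from the patent.

```python
import numpy as np

class RunningZScore:
    """Keeps all index values since the start of driving and returns the
    Z-score of the newest value against their mean and standard deviation."""
    def __init__(self):
        self.history = []

    def update(self, value):
        self.history.append(value)
        h = np.asarray(self.history, dtype=float)
        mean, std = h.mean(), h.std()
        if std == 0.0:          # first samples: no spread yet
            return 0.0, mean, std
        return (value - mean) / std, mean, std

prediction_index = RunningZScore()     # drives the "first parameter"
comprehensive_index = RunningZScore()  # drives the "second parameter"
z_pred, _, _ = prediction_index.update(0.42)      # illustrative values
z_comp, _, _ = comprehensive_index.update(0.37)
```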
In this embodiment, the weight distribution conditions in the control right switching system include three types:
(1) The first parameter and the second parameter are both less than or equal to the first trigger threshold for 5 consecutive times.
Specifically, it is judged whether the first parameter and the second parameter are both less than or equal to the first trigger threshold, whose value may be −3, i.e. whether the input driving operation action prediction index and comprehensive driving operation action index are both at least 3 standard deviations below their currently corresponding means. This situation indicates that the current state does not satisfy the safe state and that the safe state will not be satisfied in the future either, so the control right switching system is triggered to start working. Under this condition, when the two indices of five consecutive inputs all satisfy first parameter ≤ −3 and second parameter ≤ −3, the driving weights respectively required by the driver and the driving system need to be adjusted.
(2) The second parameter is less than or equal to the second trigger threshold for 3 consecutive times.
Specifically, the value of the second trigger threshold may be −4. When the input comprehensive driving operation action index is at least 4 standard deviations below the current mean, i.e. the second parameter ≤ −4, the current state does not satisfy the safe state, the driving system needs to intervene urgently, and the control right switching system is triggered to start working. Under this condition, when the input index satisfies second parameter ≤ −4 three times in a row, the driving weights respectively required by the driver and the driving system are adjusted.
(3) The first parameter is less than or equal to the second trigger threshold for 3 consecutive times.
Specifically, when the input driving operation action prediction index is at least 4 standard deviations below the current mean, i.e. the first parameter ≤ −4, the future state does not satisfy the safe state and cannot become safe through self-correction without intervention, so the control right switching system is triggered to start working. Under this condition, when the input index satisfies first parameter ≤ −4 three times in a row, the driving weights respectively required by the driver and the driving system are adjusted.
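A minimal sketch of these three trigger conditions, assuming the thresholds −3 and −4 and the consecutive-count requirements stated above; the streak bookkeeping is an implementation detail not specified in the patent.

```python
from collections import deque

FIRST_TRIGGER = -3.0   # both parameters, 5 consecutive times
SECOND_TRIGGER = -4.0  # either parameter alone, 3 consecutive times

class TriggerChecker:
    def __init__(self):
        self.both = deque(maxlen=5)  # history of (z_pred <= -3 and z_comp <= -3)
        self.pred = deque(maxlen=3)  # history of z_pred <= -4
        self.comp = deque(maxlen=3)  # history of z_comp <= -4

    def should_switch(self, z_pred, z_comp):
        self.both.append(z_pred <= FIRST_TRIGGER and z_comp <= FIRST_TRIGGER)
        self.pred.append(z_pred <= SECOND_TRIGGER)
        self.comp.append(z_comp <= SECOND_TRIGGER)
        cond1 = len(self.both) == 5 and all(self.both)   # condition (1)
        cond2 = len(self.comp) == 3 and all(self.comp)   # condition (2)
        cond3 = len(self.pred) == 3 and all(self.pred)   # condition (3)
        return cond1 or cond2 or cond3

checker = TriggerChecker()
trigger = checker.should_switch(z_pred=-4.2, z_comp=-3.1)  # illustrative values
```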
Step 9.3, based on the Q-learning algorithm, adjust the learning state in the Q-learning algorithm using the input parameters, and assign the driver's driving weight according to the action that maximizes the value of the next state in the Q-learning algorithm; the driving weight of the driving system is the difference between 1 and the driver's driving weight.
In this embodiment, the algorithm that computes the driving weights respectively required by the driver and the driving system in the control right switching system is the Q-learning algorithm, and the specific training process is as follows:
(1) The transition rule of Q-learning is:

Q(state, action) = R(state, action) + Gamma * MaxQ(next state, all actions)

Gamma is the discount factor; the larger the discount factor, the greater the role played by MaxQ. This can be understood as weighing the immediate value R ("the value in front of the eyes") against the remembered value MaxQ, which is the maximum value over the actions of the next state in memory.
(2) A "matrix Q" is added as a learning-intensive agent, i.e. the brain of the driving right switching system, i.e. something learned empirically. The rows of the "matrix Q" represent the current state of the driving right switching system and the columns represent the possible actions of the next state (link between nodes). The driving right switching system is initialized to 0, i.e., the "matrix Q" is initialized to zero. The "matrix Q" can only start with one element. If a new state is found, the "matrix Q" is updated, which is referred to as unsupervised learning.
(3) The driving weight of the driver is power(driver), and the driving weight of the driving system is power(system) = 1 − power(driver). The control right switching system adjusts the driver's driving weight using the reinforcement-learning Q-learning algorithm; a Q-learning action directly assigns a value to the driver's driving weight, with the weight value in the range [0, 1] and a step size of 0.05.
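A minimal sketch of the action space and zero-initialized Q-table implied by (2) and (3), with the six states 0 to 5 introduced in (4) below; the array layout is an assumption for illustration.

```python
import numpy as np

# Candidate actions: driver driving weights 0.00, 0.05, ..., 1.00 (step 0.05).
ACTIONS = np.round(np.arange(0.0, 1.0001, 0.05), 2)

N_STATES = 6                              # Q-learning states 0..5 defined in (4)
Q = np.zeros((N_STATES, len(ACTIONS)))    # "matrix Q" initialized to zero

def system_weight(driver_weight):
    # power(system) = 1 - power(driver)
    return 1.0 - driver_weight
```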
(4) The Q-learning states are set to 0, 1, 2, 3, 4 and 5. Each state is defined by threshold conditions on the first parameter and the second parameter (the specific conditions are given only as equation images in the original publication): state 0 is defined by a joint condition on both parameters, states 1 through 4 by single conditions, and state 5, the target state, by a joint condition on both parameters.
(5) After the control right switching system is triggered to start working, one of the initial states 0, 1 or 2 is obtained. When, after the action in (3) (adjusting the weight), the state is still one of 0, 1 or 2, the reward is −1; the matrix Q is updated and the element corresponding to that state and the used action is assigned −1.
When the state reaches state 3 or 4 through the action in (3), the reward is 1; the matrix Q is updated and the element corresponding to that state and the used action is assigned 1.
When the state reaches state 5 through the action in (3), the reward is 100; the matrix Q is updated and the element corresponding to that state and the used action is assigned 100. State 5 is the target state. The finally obtained Q-table is shown in fig. 5 (elements not filled in).
(6) A road environment is selected and (1) to (5) are applied to obtain the Q-table of the initial matrix Q, which serves as the source of MaxQ(next state, all actions) in the Q-learning rule Q(state, action) = R(state, action) + Gamma * MaxQ(next state, all actions). Gamma is selected in [0, 1] according to the road similarity, and R(state, action) is the value of the state obtained in the current road environment: the reward is −1 when the state is 0, 1 or 2, 1 when the state is 3 or 4, and 100 when the state is 5.
When the control right switching system calculates the driving weight, the weights that can be adjusted and the rewards obtained by the reachable states are calculated in advance from the Q-table of the similar road section; this reward is MaxQ(next state, all actions) in the formula. Therefore, Q(state, action) is the sum of the R(state, action) value in the current road environment and MaxQ(next state, all actions).
When Q(state, action) is maximal, the action leading to the next state in MaxQ(next state, all actions) is the weight that needs to be set, recorded as the driver driving weight power(driver).
(7) The Q-table is updated according to the calculated Q(state, action) values.
(8) The switching of weights stops when state 5 is reached.
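A minimal sketch of the weight selection in steps (5) to (8). The Q-table and action set are redefined here for self-containment; the state-transition model (which next state a given weight leads to) is supplied by the environment or simulator and is stubbed out, and all numeric choices are illustrative.

```python
import numpy as np

GAMMA = 0.8                       # discount factor chosen in [0, 1] by road similarity
REWARD = {0: -1, 1: -1, 2: -1, 3: 1, 4: 1, 5: 100}

def choose_driver_weight(Q, actions, state, next_state_of):
    """Pick the action (driver weight) maximizing
    Q(state, action) = R(state, action) + GAMMA * MaxQ(next state, all actions)."""
    q_values = []
    for a_idx, weight in enumerate(actions):
        next_state = next_state_of(state, weight)    # environment / simulator stub
        q = REWARD[next_state] + GAMMA * Q[next_state].max()
        q_values.append(q)
        Q[state, a_idx] = q                          # step (7): update the Q-table
    best = int(np.argmax(q_values))
    return actions[best], 1.0 - actions[best]        # power(driver), power(system)

# Illustrative stub: pretend a larger machine share moves the state toward 5.
def fake_transition(state, driver_weight):
    return min(5, state + (2 if driver_weight < 0.5 else 0))

ACTIONS = np.round(np.arange(0.0, 1.0001, 0.05), 2)
Q = np.zeros((6, len(ACTIONS)))
driver_w, system_w = choose_driver_weight(Q, ACTIONS, state=1, next_state_of=fake_transition)
```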
The technical scheme of the application has been described in detail above with reference to the accompanying drawings. The application provides a reinforcement-learning-based man-machine co-driving control right switching method, suitable for a reinforcement-learning-based man-machine co-driving control right switching system that distributes the driving weight between a driver and a driving system, comprising: calculating a driving operation action prediction index according to the driver information and the vehicle road prediction information; and inputting the driving operation action prediction index and the comprehensive driving operation action index into the control right switching system, and calculating the driving weight between the driver and the driving system. Through the technical scheme in the application, the combined longitudinal and lateral risk of the vehicle is effectively handled, the influence of uncertainty introduced by the driver is weakened, and the driver is considered comprehensively from different angles, so that the judgment error regarding the driver is reduced.
The steps in the present application may be sequentially adjusted, combined, and subtracted according to actual requirements.
The units in the device can be merged, divided and deleted according to actual requirements.
Although the present application has been disclosed in detail with reference to the accompanying drawings, it is to be understood that such description is merely illustrative and not restrictive of the application of the present application. The scope of the present application is defined by the appended claims and may include various modifications, adaptations, and equivalents of the invention without departing from the scope and spirit of the application.

Claims (7)

1. A reinforcement-learning-based man-machine co-driving control right switching method, characterized in that it is suitable for a reinforcement-learning-based man-machine co-driving control right switching system that distributes the driving weight between a driver and a driving system, the method comprising the following steps:
calculating a driving operation action prediction index according to the driver information and the vehicle road prediction information;
inputting the driving operation action prediction index and the comprehensive driving operation action index into the control right switching system, and calculating the driving weight between the driver and the driving system;
the driver information at least comprises a driver state, a driver intention, a driver style and a driver subconscious driving influence deviation, and the vehicle road prediction information at least comprises a predicted vehicle road risk degree and a predicted vehicle road danger threshold;
the driving operation action prediction index represents how each risk factor of a future time period affects the operation safety degree, and emphasizes the influence of other units on the vehicle's safety after its own operation;
the comprehensive driving operation action index represents how each risk factor at the current time affects the operation safety degree, and emphasizes whether the current position is safe and whether effective driving is possible in the current state;
the calculation formula of the driving operation action prediction index is given only as an equation image in the original publication, wherein Z_t is the driver state operation response delay, σ is the driver subconscious driving influence deviation, δ is the driver intention, S is the driver style, v_risk is the predicted vehicle road risk degree, and A_arisk is the predicted vehicle road danger threshold.
2. The reinforcement-learning-based man-machine co-driving control right switching method according to claim 1, wherein the calculation formula of the driver subconscious driving influence deviation σ is given only as an equation image in the original publication, together with

R_d = |d − q_ki|

wherein σ is the driver subconscious driving influence deviation, sum is the number of collected traffic scenes, D_i is a series of subconscious driving strengths within one traffic scene time period, ρ' is an undetermined parameter, α is the subconscious side weight, β is the driver's personal safety tendency weight, d is the current lateral position of the vehicle, q_ki is the fitted lateral position of the vehicle under this scene, a is the vehicle acceleration, and R_d is a position parameter.
3. The reinforcement-learning-based man-machine co-driving control right switching method according to claim 1, wherein the driver information at least includes a driver state, a driver intention and a driver style, and the calculation process of the comprehensive driving operation action index specifically includes:
determining current vehicle road information according to the position of the current vehicle in the road, the current vehicle road information at least comprising a current vehicle road risk degree and a current vehicle road danger threshold;
determining the comprehensive driving operation action index from the driver information and the current vehicle road information by combining an environmental response factor and adopting a piecewise function, wherein the calculation formula of the comprehensive driving operation action index is given only as an equation image in the original publication, in which z_1 is the driver state, γ is the environmental response factor, H_x,y is the current vehicle road risk degree, σ is a road correction parameter, a_pre is the real-time operation quantization parameter, and risk is the current vehicle road danger threshold.
4. The reinforcement learning-based man-machine co-driving control right switching method as claimed in claim 3, wherein the determining current vehicle path information according to the current vehicle position in the road specifically comprises:
determining the position of the current vehicle in the road, including at least the distance from the current vehicle to the preceding vehicle and the transverse position of the current vehicle;
determining a longitudinal vehicle road danger value according to the distance from the current vehicle to the preceding vehicle;
determining a transverse vehicle road danger value according to the transverse position of the current vehicle;
calculating the current vehicle road danger degree according to the longitudinal vehicle road danger value and the transverse vehicle road danger value, wherein the corresponding calculation formula is as follows:
[formula FDA0004066029330000031: given as an image in the original]
wherein H_x,y is the current vehicle road danger degree, the danger distance influence factor of different road sections (denoted by a symbol given as an image in the original) has a value range of [1, 10], y_1 is the longitudinal vehicle road danger value, and y_2 is the transverse vehicle road danger value;
and calculating the current vehicle road danger thresholds of different scenes according to the current vehicle road danger degree, and recording the current vehicle road danger threshold together with the current vehicle road danger degree as the current vehicle road information.
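For illustration only, the following minimal sketch mirrors the procedure of claim 4 under assumed danger functions: the exact longitudinal and transverse danger values, their fusion and the danger distance influence factor are given in the original only as an image, so the forms below (and names such as current_road_risk, safe_gap_m and half_lane_m) are placeholders, not the claimed formula.

```python
import math

def current_road_risk(gap_to_lead_m, lane_offset_m,
                      section_factor=1.0, safe_gap_m=30.0, half_lane_m=1.75):
    """Assumed stand-in for the current vehicle road danger degree H_{x,y}."""
    # Longitudinal danger value y1: grows as the gap to the preceding vehicle shrinks.
    y1 = max(0.0, 1.0 - gap_to_lead_m / safe_gap_m)
    # Transverse danger value y2: grows as the vehicle drifts from the lane centre.
    y2 = min(1.0, abs(lane_offset_m) / half_lane_m)
    # Danger distance influence factor of the road section, value range [1, 10].
    if not 1.0 <= section_factor <= 10.0:
        raise ValueError("section_factor must lie in [1, 10]")
    # Fuse the longitudinal and transverse components (assumed Euclidean fusion).
    return section_factor * math.hypot(y1, y2)

# Example: 12 m headway, 0.5 m lateral offset, road-section factor 2.
print(current_road_risk(12.0, 0.5, section_factor=2.0))
```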
5. The reinforcement learning-based man-machine co-driving control right switching method according to claim 3, wherein the calculation formula of the environmental response factor γ is:
[formula FDA0004066029330000033: given as an image in the original]
wherein M is the vehicle mass, m is the vehicle type and purpose correction parameter, k_1 is the dynamics correction parameter, a symbol given as an image in the original denotes the expected speed and speed direction of the vehicle, v_limleast(t) is the minimum speed value, k_2 is the traffic scene correction parameter, a second image symbol denotes the vehicle interaction force parameter, k_3 is the correction parameter for the degree to which pedestrians comply with traffic regulations, a third image symbol denotes the pedestrian interaction force parameter, k_4 is the correction parameter for the complexity level of the surrounding physical environment, a fourth image symbol denotes the environmental interaction force parameter, k_5 is the correction parameter for the degree of influence of traffic regulations, and a fifth image symbol denotes the rule parameter.
6. The reinforcement learning-based man-machine co-driving control right switching method according to any one of claims 1 to 5, wherein the calculating of the driving weight between the driver and the driving system specifically comprises:
step 9.1, using the Z-score standardization formula, normalizing the driving operation action prediction index and the comprehensive driving operation action index at the current moment, and calculating the mean and standard deviation of the driving operation action prediction indexes and of the comprehensive driving operation action indexes recorded from the start of driving to the current driving operation;
step 9.2, taking the Z-score standardized driving operation action prediction index and comprehensive driving operation action index, together with the corresponding current means and standard deviations, as input parameters, inputting the input parameters into the reinforcement learning-based man-machine co-driving control right switching system, and judging whether the weight distribution condition is met; if yes, executing step 9.3, and if not, re-acquiring the driver information and the vehicle road prediction information;
step 9.3, based on the Q-learning algorithm, adjusting the learning state in the Q-learning algorithm by using the input parameters, and assigning the driving weight of the driver according to the action corresponding to the maximum value of the next state in the Q-learning algorithm, wherein the driving weight of the driving system is the difference between 1 and the driving weight of the driver.
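As a non-authoritative illustration of steps 9.1 and 9.3, the sketch below standardizes the two indexes with Z-scores and reads the driver weight out of a Q-table; the state encoding (discretize_state), the candidate weight grid and the table shape are assumptions, since the claim does not specify how states and actions are represented.

```python
import numpy as np

def z_score(current, history):
    """Step 9.1 (sketch): standardize the current index against all indexes
    recorded since the start of the driving behaviour."""
    mu, sigma = float(np.mean(history)), float(np.std(history))
    return 0.0 if sigma == 0.0 else (current - mu) / sigma

def discretize_state(z_pre, z_cur, n_bins=10):
    """Assumed encoding: clip each standardized index to [-3, 3] and bin it."""
    def to_bin(z):
        return int(np.clip((z + 3.0) / 6.0 * (n_bins - 1), 0, n_bins - 1))
    return to_bin(z_pre) * n_bins + to_bin(z_cur)

def assign_driving_weights(q_table, z_pre, z_cur):
    """Step 9.3 (sketch): the driver weight is the action with the largest
    Q value for the next state; the driving system receives 1 minus it."""
    state = discretize_state(z_pre, z_cur)
    candidate_weights = np.linspace(0.0, 1.0, q_table.shape[1])
    w_driver = float(candidate_weights[int(np.argmax(q_table[state]))])
    return w_driver, 1.0 - w_driver

# Example with a 100-state x 11-action table of zeros (untrained).
q_table = np.zeros((100, 11))
print(assign_driving_weights(q_table, z_pre=-0.8, z_cur=0.3))
```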
7. The reinforcement learning-based man-machine co-driving control right switching method according to claim 6, wherein the weight distribution condition specifically comprises:
the first parameter and the second parameter are both less than or equal to a first trigger threshold for 5 consecutive times; or,
the second parameter is less than or equal to a second trigger threshold for 3 consecutive times; or,
the first parameter is less than or equal to the second trigger threshold for 3 consecutive times;
wherein the first parameter (denoted in the original by a symbol given as an image) is the number of standard deviations by which the currently input driving operation action prediction index differs from the mean of all the driving operation action prediction indexes input from the start of the driving behavior to the current moment, and the second parameter (likewise denoted by an image symbol) is the number of standard deviations by which the currently input comprehensive driving operation action index differs from the mean of all the comprehensive driving operation action indexes input from the start of the driving behavior to the current moment.
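A small sketch of the claim-7 weight distribution condition follows; the window lengths (5 and 3 consecutive samples) and the comparison logic follow the claim, while the concrete threshold values and the class and parameter names are assumptions made for illustration.

```python
from collections import deque

class WeightDistributionTrigger:
    """Checks the three alternative trigger conditions of claim 7."""

    def __init__(self, first_threshold=1.0, second_threshold=0.5):
        self.first_threshold = first_threshold    # assumed value
        self.second_threshold = second_threshold  # assumed value
        self.history = deque(maxlen=5)            # (first_param, second_param)

    def update(self, first_param, second_param):
        """Record the latest standardized parameters and report whether the
        weight distribution condition is met."""
        self.history.append((first_param, second_param))
        last5 = list(self.history)
        last3 = last5[-3:]
        # 5 consecutive times: both parameters <= first trigger threshold.
        cond_a = len(last5) == 5 and all(
            p1 <= self.first_threshold and p2 <= self.first_threshold
            for p1, p2 in last5)
        # 3 consecutive times: second parameter <= second trigger threshold.
        cond_b = len(last3) == 3 and all(
            p2 <= self.second_threshold for _, p2 in last3)
        # 3 consecutive times: first parameter <= second trigger threshold.
        cond_c = len(last3) == 3 and all(
            p1 <= self.second_threshold for p1, _ in last3)
        return cond_a or cond_b or cond_c

# Example: feed a few samples; the trigger fires once a condition holds.
trigger = WeightDistributionTrigger()
for p1, p2 in [(0.9, 0.4), (0.8, 0.3), (0.7, 0.2)]:
    print(trigger.update(p1, p2))
```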
CN202210758672.3A 2022-06-29 2022-06-29 Man-machine common driving control right switching method based on reinforcement learning Active CN115071758B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210758672.3A CN115071758B (en) 2022-06-29 2022-06-29 Man-machine common driving control right switching method based on reinforcement learning


Publications (2)

Publication Number Publication Date
CN115071758A (en) 2022-09-20
CN115071758B (en) 2023-03-21

Family

ID=83254772

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210758672.3A Active CN115071758B (en) 2022-06-29 2022-06-29 Man-machine common driving control right switching method based on reinforcement learning

Country Status (1)

Country Link
CN (1) CN115071758B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108549367A (en) * 2018-04-09 2018-09-18 吉林大学 A kind of man-machine method for handover control based on prediction safety

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017183077A1 (en) * 2016-04-18 2017-10-26 本田技研工業株式会社 Vehicle control system, vehicle control method, and vehicle control program
US11940790B2 (en) * 2018-12-12 2024-03-26 Allstate Insurance Company Safe hand-off between human driver and autonomous driving system
JP7158352B2 (en) * 2019-08-08 2022-10-21 本田技研工業株式会社 DRIVING ASSIST DEVICE, VEHICLE CONTROL METHOD, AND PROGRAM
CN113341730B (en) * 2021-06-28 2022-08-30 上海交通大学 Vehicle steering control method under remote man-machine cooperation
CN113335291B (en) * 2021-07-27 2022-07-08 燕山大学 Man-machine driving-sharing control right decision method based on man-vehicle risk state


Also Published As

Publication number Publication date
CN115071758A (en) 2022-09-20

Similar Documents

Publication Publication Date Title
WO2021077725A1 (en) System and method for predicting motion state of surrounding vehicle based on driving intention
CN111104969B (en) Collision possibility pre-judging method for unmanned vehicle and surrounding vehicles
Lawitzky et al. Interactive scene prediction for automotive applications
CN109727469B (en) Comprehensive risk degree evaluation method for automatically driven vehicles under multiple lanes
CN110843789B (en) Vehicle lane change intention prediction method based on time sequence convolution network
CN110077398B (en) Risk handling method for intelligent driving
CN111775949A (en) Personalized driver steering behavior assisting method of man-machine driving-sharing control system
CN112249008B (en) Unmanned automobile early warning method aiming at complex dynamic environment
Li et al. Modeling vehicle merging position selection behaviors based on a finite mixture of linear regression models
Olstam et al. A framework for simulation of surrounding vehicles in driving simulators
Toledo et al. State dependence in lane-changing models
CN115056798A (en) Automatic driving vehicle lane change behavior vehicle-road cooperative decision algorithm based on Bayesian game
Chen et al. Towards human-like speed control in autonomous vehicles: A mountainous freeway case
US20190146493A1 (en) Method And Apparatus For Autonomous System Performance And Benchmarking
Julian et al. Complex lane change behavior in the foresighted driver model
CN115071758B (en) Man-machine common driving control right switching method based on reinforcement learning
Shao et al. A discretionary lane-changing decision-making mechanism incorporating drivers’ heterogeneity: A signalling game-based approach
CN116653957A (en) Speed changing and lane changing method, device, equipment and storage medium
Zhang et al. A lane-changing prediction method based on temporal convolution network
Griesbach et al. Prediction of lane change by echo state networks
DE102018008599A1 (en) Control system and control method for determining a trajectory for a motor vehicle
CN113033902B (en) Automatic driving lane change track planning method based on improved deep learning
CN114148349A (en) Vehicle personalized following control method based on generation countermeasure simulation learning
CN113761715A (en) Method for establishing personalized vehicle following model based on Gaussian mixture and hidden Markov
CN112308171A (en) Vehicle position prediction modeling method based on simulated driver

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant