CN115071758A - Man-machine common driving control right switching method based on reinforcement learning - Google Patents

Man-machine common driving control right switching method based on reinforcement learning

Info

Publication number
CN115071758A
CN115071758A
Authority
CN
China
Prior art keywords
driving
driver
vehicle
current
road
Prior art date
Legal status
Granted
Application number
CN202210758672.3A
Other languages
Chinese (zh)
Other versions
CN115071758B (en)
Inventor
陈慧勤
朱嘉祺
Current Assignee
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date
Filing date
Publication date
Application filed by Hangzhou Dianzi University filed Critical Hangzhou Dianzi University
Priority to CN202210758672.3A
Publication of CN115071758A
Application granted
Publication of CN115071758B
Status: Active


Classifications

    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W60/00 Drive control systems specially adapted for autonomous road vehicles
    • B60W60/005 Handover processes
    • B60W60/0059 Estimation of the risk associated with autonomous or manual driving, e.g. situation too complex, sensor failure or driver incapacity

Landscapes

  • Engineering & Computer Science (AREA)
  • Automation & Control Theory (AREA)
  • Human Computer Interaction (AREA)
  • Transportation (AREA)
  • Mechanical Engineering (AREA)
  • Traffic Control Systems (AREA)

Abstract

The application discloses a reinforcement-learning-based human-machine co-driving control right switching method, applicable to a reinforcement-learning-based human-machine co-driving control right switching system that allocates driving weights between a driver and a driving system. The method comprises the following steps: calculating a driving operation action prediction index according to the driver information and the vehicle road prediction information; and inputting the driving operation action prediction index and the comprehensive driving operation action index into the control right switching system, and calculating the driving weights between the driver and the driving system. The technical scheme of the application effectively addresses the combined longitudinal and lateral risk of the vehicle, weakens the influence of uncertainty introduced by the driver, and considers the driver comprehensively from different angles, thereby reducing errors in judging the driver.

Description

Man-machine common driving control right switching method based on reinforcement learning
Technical Field
The application relates to the technical field of intelligent driving, in particular to a man-machine driving sharing control right switching method based on reinforcement learning.
Background
In the conventional automatic driving technology, a control right switching mode is generally adopted to correct the driving behavior of a driver so as to improve the driving safety of a vehicle.
For example, in patent CN 109795486 A, a shared-driving coefficient (ranging from 0 to 1) is dynamically adjusted according to the driver's input torque Td and the time TLC for the left and right wheels to reach the lane boundary, realizing a gradual transition from the driver to the assistance control system; the shared-driving coefficient is determined by fuzzy control. However, while this approach addresses the risk of lateral lane departure, it does not take the longitudinal risks during driving into account.
For another example, patent CN 108469806 A constructs key factors for the current driving environment and the states of the vehicle and the driver, performs a situational assessment of these key factors, and simultaneously assesses the driving abilities of the automatic driving system and the driver to determine whether the driving right can be transferred. Although this scheme considers many factors that may affect driving safety, the evaluation of driving ability during the driving right switching process is overly complex, involves large subjective and random factors, and takes too much data into account, so its real-time performance and stability are poor.
Similarly, the thesis "Human-machine co-driving model based on a driver risk response mechanism" quantifies the environmental risk, obtains a safety risk response strategy by fitting the environmental risk effect to the driver's driving acceleration, and switches the human-machine co-driving control right flexibly through deviations from the strategy. This safety control method solves the coupling between the driver state and environmental safety, but the safety strategy is built on a large number of driving segments that cannot completely cover all safe operations, and it only addresses switching when the driver follows or overtakes vehicles on the highway. Moreover, this control right switching approach only considers safety at the current moment and does not consider traffic hazards that may arise in a future time period.
Therefore, the safety and stability of the control right switching scheme in the existing automatic driving need to be improved.
Disclosure of Invention
The purpose of this application is to effectively address the combined longitudinal and lateral risk of the vehicle and to reduce errors in judging the driver, so as to improve the accuracy and safety of driving right switching.
The technical scheme of the application is as follows: a reinforcement-learning-based human-machine co-driving control right switching method is provided, applicable to a reinforcement-learning-based human-machine co-driving control right switching system for allocating driving weights between a driver and a driving system, and comprising the following steps: calculating a driving operation action prediction index according to the driver information and the vehicle road prediction information; and inputting the driving operation action prediction index and the comprehensive driving operation action index into the control right switching system, and calculating the driving weights between the driver and the driving system.
In any of the above technical solutions, further, the driver information at least includes a driver state, a driver intention, a driver style, and a driver subconscious driving influence deviation, and the vehicle road prediction information at least includes a predicted vehicle road risk degree and a predicted vehicle road risk threshold.
The driving operation action prediction index is calculated from these quantities (the formula is given as an image in the original publication), where Z_t is the delay of the driver's state operation response, σ is the driver subconscious driving influence deviation, δ is the driver intention, S is the driver style, v_risk is the predicted vehicle road risk degree, and A_arisk is the predicted vehicle road risk threshold.
In any of the above technical solutions, further, the driver subconscious driving influence deviation σ is obtained by averaging the subconscious driving strengths D_i over the collected traffic scenes (the formulas for σ and D_i are given as images in the original publication), with
R_d = |d - q_ki|
where σ is the driver subconscious driving influence deviation, sum is the number of collected traffic scenes, D_i is the subconscious driving strength over one traffic scene time period, ρ', τ and ω are parameters to be determined, α is the subconscious side weight, β is the driver's personal safety tendency weight, d is the current lateral position of the vehicle, q_ki is the fitted lateral position of the vehicle under this scene (label), a is the vehicle acceleration, and R_d is a position parameter.
In any one of the above technical solutions, further, the driver information at least includes a driver state, a driver intention, and a driver style, and the calculation process of the comprehensive driving operation action index specifically includes:
determining current vehicle path information according to the position of a current vehicle in a road, wherein the current vehicle path information at least comprises a current vehicle path danger degree and a current vehicle path danger threshold;
determining a comprehensive driving operation action index according to the driver information and the current vehicle path information, combined with an environmental response factor and expressed as a piecewise function (the formula is given as an image in the original publication), where z_1 is the driver state, γ is the environmental response factor, H_x,y is the current vehicle road danger degree, σ is a road correction parameter, a_pre is the real-time operation quantization parameter, and risk is the current vehicle road danger threshold.
In any one of the above technical solutions, further, determining the current vehicle path information according to the position of the current vehicle on the road specifically includes:
determining the position of the current vehicle in the road, wherein the position at least comprises the distance between the current vehicle and the front vehicle and the transverse position of the current vehicle;
determining a longitudinal vehicle road danger value according to the distance between the current vehicle and the front vehicle;
determining a transverse vehicle road danger value according to the transverse position of the current vehicle;
calculating the current vehicle road danger degree according to the longitudinal vehicle road danger value and the transverse vehicle road danger value (the formula is given as an image in the original publication), where H_x,y is the current vehicle road danger degree, the risk distance influence factor of the road section takes values in the range [1, 10], y_1 is the longitudinal vehicle road danger value, and y_2 is the transverse vehicle road danger value;
and calculating current vehicle road danger thresholds of different scenes according to the current vehicle road danger degree, and recording the current vehicle road danger threshold and the current vehicle road danger degree as current vehicle road information.
In any of the above technical solutions, further, the environmental response factor γ is calculated by a formula that is given as an image in the original publication; in that formula, M is the vehicle mass, m is a vehicle type and purpose correction parameter, v_limleast(t) is the minimum speed value, the term whose symbol is given as an image represents the desired speed and speed direction of the vehicle, k_1 is a dynamics correction parameter, k_2 is a traffic scene correction parameter paired with the vehicle interaction force parameter, k_3 is a correction parameter for the degree to which pedestrians comply with the traffic regulations, paired with the pedestrian interaction force parameter, k_4 is a correction parameter for the complexity of the surrounding physical environment, paired with the environmental interaction force parameter, and k_5 is a correction parameter for the degree of influence of the traffic regulations, paired with the rule parameter.
In any one of the above technical solutions, further, calculating the driving weight between the driver and the driving system specifically includes: step 9.1, using the Z-score standardization formula, standardize the driving operation action prediction index and the comprehensive driving operation action index at the current moment, and calculate the mean and standard deviation of the driving operation action prediction indexes and of the comprehensive driving operation action indexes recorded from the start of the current drive to the current moment; step 9.2, input the Z-score-standardized driving operation action prediction index and comprehensive driving operation action index, together with the current corresponding means and standard deviations, as input parameters into the reinforcement-learning-based human-machine co-driving control right switching system and judge whether the weight distribution conditions are met; if so, execute step 9.3, otherwise re-acquire the driver information and the vehicle road prediction information; step 9.3, based on the Q-learning algorithm, adjust the learning state in the Q-learning algorithm using the input parameters and assign the driver's driving weight according to the action with the maximum value for the next state in the Q-learning algorithm, wherein the driving weight of the driving system is the difference between 1 and the driver's driving weight.
In any one of the above technical solutions, further, the weight assignment conditions specifically include: the first parameter and the second parameter are both less than or equal to a first trigger threshold for 5 consecutive times; or the second parameter is less than or equal to a second trigger threshold for 3 consecutive times; or the first parameter is less than or equal to a second trigger threshold for 3 consecutive times, wherein the first parameter is the number of standard deviations by which the currently input driving operation action prediction index differs from the mean of all driving operation action prediction indexes input from the start of the driving behavior to the current time, and the second parameter is the number of standard deviations by which the currently input comprehensive driving operation action index differs from the mean of all comprehensive driving operation action indexes input from the start of the driving behavior to the current time.
The beneficial effect of this application is:
according to the technical scheme, the risk of longitudinal and transverse integration of the vehicle is effectively solved, the influence of uncertainty caused by a driver is weakened, the driver is comprehensively considered from different angles, so that the judgment error of the driver is reduced, the method is suitable for multiple traffic scenes, traffic dangers possibly caused in future time periods are comprehensively considered, the accuracy and the safety of driving right switching are further improved, finally, all factors are integrated into two index input switching systems, the data volume is small and accurate, and the real-time performance is higher.
In the preferred implementation mode of the application, the influence of experience and subconscious of a driver on driving is considered, the judgment burden of a switching system is reduced, and the real-time performance is better. And the risk that other vehicles may cause the vehicle can be predicted in advance, and the rear-end collision and collision in the driving process are avoided.
Drawings
The advantages of the above and/or additional aspects of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
fig. 1 is a schematic flow chart of a reinforcement learning-based human-machine co-driving control right switching method according to an embodiment of the present application;
FIG. 2 is a diagram of relative positions of roads and relative safe positions according to one embodiment of the present application;
FIG. 3 is a schematic diagram of a model-free reinforcement learning process according to an embodiment of the present application;
fig. 4 is a schematic diagram illustrating an overall structure of a reinforcement learning-based human-machine co-driving control right switching mechanism according to an embodiment of the present application;
FIG. 5 is a diagram illustrating Q-tables in a Q-learning algorithm in reinforcement learning according to an embodiment of the present application.
Detailed Description
In order that the above objects, features and advantages of the present application can be more clearly understood, the present application will be described in further detail with reference to the accompanying drawings and detailed description. It should be noted that the embodiments and features of the embodiments of the present application may be combined with each other without conflict.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application, however, the present application may be practiced in other ways than those described herein, and therefore the scope of the present application is not limited by the specific embodiments disclosed below.
As shown in fig. 1, the present embodiment provides a method for switching a driving control right of a human-computer based on reinforcement learning, including:
step 1, constructing a simulator based on a real vehicle road environment, and constructing vehicle road scenes of various situations in the simulator;
further, step 1 is realized by:
step 1.1, simulator hardware needs to have a camera for acquiring a human image of a driver and a driving operation environment for simulating a real vehicle;
step 1.2, constructing a large number of typical traffic environments which can be met in the real world, wherein the typical traffic environments comprise a full-type road section car following scene, a full-type road section overtaking scene, a road intersection scene, a congestion road section scene and the like;
step 1.3, inserting a certain number of dangerous traffic scenes and accident simulation scenes in the construction of different typical traffic environment scenes.
Step 2, continuously collecting vehicle information of surrounding roads, current driver state, driver intention, driver style and action information, control weight distribution information and relevant information of an automatic driving system, and calculating to obtain subconscious driving influence deviation of the driver;
further, step 2 may include the following processes:
step 2.1, the driver needs to complete a complete driving process in different scenes;
step 2.2, under the condition of no intervention of a control right switching system, a driver needs to drive normally in a certain amount of different driving scenes, collect and record the driving operation and road conditions of the driver in the driving process, obtain the style of the driver through statistical analysis, and calculate the subconscious driving influence deviation of the driver (the subconscious driving operation influence of the driver is described by the acceleration and deceleration caused by the experience accumulated by the driver in different driving scenes and the change of the transverse position of the road):
wherein the driver subconscious driving influence deviation is obtained by averaging the subconscious driving strengths over the collected traffic scenes (the formulas for σ and D_i are given as images in the original publication; the averaging is detailed below), with
R_d = |d - q_ki|
where σ is the driver subconscious driving influence deviation, sum is the number of collected traffic scenes, D_i is the subconscious driving strength over one traffic scene time period, ρ', τ and ω are parameters to be determined, α is the subconscious side weight, β is the driver's personal safety tendency weight, d is the current lateral position of the vehicle, q_ki is the fitted lateral position of the vehicle under this scene (label), a is the vehicle acceleration, and R_d is a position parameter.
Specifically, the driver subconscious driving influence deviation does not consider the influence of other traffic participants, and only considers from the perspective of personal safety. Based on the maximum entropy principle, a maximum entropy method related to the subconscious of the driver is established.
Firstly, an entropy function is constructed:
H(x) = -C · Σ p_k · log2(p_k)
where H(x) is the entropy, a measure of the uncertainty of a thing; p_k is a probability distribution; and C is a constant that depends on the measure of entropy, here taken as 1.
In this entropy function, the quantity of interest is the driver subconscious driving influence deviation, i.e. how strongly the subconscious influences behavior in the current environment. However, because the probability distribution p_k is a decimal between 0 and 1, log2(p_k) is negative, so in this embodiment a non-negative integer q_i is introduced to replace the probability distribution p_k in the entropy function.
The parameter q_i is defined as the relative safe position in different road scenes; the relative positions are shown in Fig. 2. A lateral coordinate axis is established with the left side of the road as the origin, half the width of a single lane is taken as one driving position, and the road is divided into eight areas; the relative safe position is the position where more than half of the vehicles are located during normal driving.
Roads differ greatly between scenes, and specific positions that completely represent the road path cannot be obtained accurately. Therefore, ln is used in place of the base-2 logarithm, and since q_i mainly takes integer values greater than one, the negative sign of the original entropy function has to be removed. The difference can then be expressed by the following correction entropy (the formula for E is given as an image in the original publication):
secondly, establishing a constraint condition of the correction entropy: first, the road conditions are constrained, and each driver will choose the side with good road conditions. Second, under the constraints of traffic regulations, drivers tend to drive more as specified by traffic regulations. Thirdly, the constraint of traffic demand, namely whether the driver needs to overtake, follow or go straight in the road scene, is as follows:
constraint 1:
Figure BDA0003720378570000082
constraint 2:
Figure BDA0003720378570000083
constraint 3: b (q) i )∈S
In the formula, A min 、A max The lower limit and the upper limit of the road traffic capacity score are set;
Figure BDA0003720378570000084
interference coefficients for unfamiliar degrees of different road sections; b is a traffic demand impact weight; b is the maximum boundary of the traffic rule; b (q) i ) For the determination of traffic demands, i.e. knowing whether a traffic demand is overtaking, following or going straight, q is estimated by the demand i (ii) a And S is a traffic demand set, and all normal driving behavior position results are in the set.
Three constraint conditions and different road scenes are set, the correction entropy is used for calculation, and the relative safe position q of the different road scenes is obtained when the value of the correction entropy E is maximum i For a relative safety position q i Clustering is performed, labels are marked for each type (such as overtaking, following and going straight), and then the relative safety position q is determined i Fitting to obtain a fitted transverse position q ki This position is the safe position that this driver is most inclined to walk under different labels.
In summary, fitting the lateral position q ki For correcting the relative safe position q when the entropy E value is maximum under the constraint condition i
The subconscious driving strength D_i over one time period of a traffic scene is calculated by a formula that is given as an image in the original publication, with
R_d = |d - q_ki|
where d is the current lateral position of the vehicle, q_ki is the fitted lateral position of the vehicle under this scene (label), a is the vehicle acceleration, α is the subconscious side weight, β is the driver's personal safety tendency weight, and ρ', τ and ω are parameters to be determined whose values follow the trend of the subconscious driving strength in different traffic scenes:
when R_d ≥ Z (Z is a safety value, a set value that differs for different roads), the value of the subconscious driving strength D_i is larger and the values of the parameters ρ', τ and ω increase with R_d and |a|; that is, the situation becomes less and less safe, and the strength of the subconscious driving operation action is greater;
when R_d < Z, the value of D_i is smaller and the values of ρ', τ and ω decrease with R_d and |a|; that is, the situation becomes more and more safe, and the strength of the subconscious driving operation action is smaller.
The driver subconscious driving influence deviation σ is the average of these strengths over the collected traffic scenes:
σ = (1/sum) · Σ_{i=1..sum} D_i
where sum is the number of collected traffic scenes.
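A minimal Python sketch of this averaging is given below. It assumes the per-scene subconscious driving strengths D_i are produced by an intensity function implementing the patent's (unreproduced) formula; the function and all variable names are illustrative, not taken from the patent.

```python
from typing import Callable, Sequence

def subconscious_deviation(scenes: Sequence[dict],
                           intensity: Callable[[float, float], float]) -> float:
    """Average the subconscious driving strengths D_i over all collected scenes.

    Each scene dict is assumed to hold the current lateral position d, the fitted
    lateral position q_ki for the scene label, and the vehicle acceleration a.
    `intensity` stands in for the patent's D_i formula (given only as an image)
    and receives the position parameter R_d and |a|.
    """
    strengths = []
    for s in scenes:
        r_d = abs(s["d"] - s["q_ki"])           # R_d = |d - q_ki|
        strengths.append(intensity(r_d, abs(s["a"])))
    return sum(strengths) / len(strengths)       # sigma: mean of D_i over `sum` scenes
```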
Step 2.3, simulating conditions possibly encountered in the real driving process by a driver, such as dangerous states of fatigue, emotional excitement, distraction and the like, and normal driving;
and 2.4, collecting data to obtain the speed, distance and road surface information of surrounding vehicles, the brake, accelerator and steering wheel data of the own vehicle, the driving weight distribution and the intention and operation data of a driver in a driving system, and obtaining the state and intention information of the driver in a data statistical processing mode.
Step 3, obtaining the interaction force with each peripheral unit and the environmental response factor γ according to the collected peripheral road information and vehicle state;
further, step 3 is realized by:
and 3.1, the environmental response factor gamma is an interaction force under the influence of the interaction of the vehicle and the vehicle road environment, and particularly responds to different units. The environmental response factor γ was calculated using the following formula:
the formula:
Figure BDA0003720378570000101
v limleast (t) is the minimum speed value of the speed limit and the vehicle speed in the current time period scene;
m is the mass of the vehicle;
m is a vehicle type and a target correction parameter;
Figure BDA0003720378570000102
representing the desired speed and direction of speed of the vehicle,
Figure BDA0003720378570000103
Figure BDA0003720378570000104
it is derived from Newton's second law and kinematic formula.
k_1 is a dynamics correction parameter;
k_2 is a traffic scene correction parameter (for example, highway sections or congested sections), and the interaction force with other vehicles is described by the vehicle interaction force parameter, whose formula is given as an image in the original publication. In it, θ_1l is the angle between the travel direction of the vehicle and that of the other vehicle, Δv_1l/Δμ_1l is the ratio of the speed difference to the distance difference, u is the safe distance, and ρ is the distance to the other vehicle. The expression indicates that a distance greater than the safe distance represents an attractive force, which becomes smaller as the distance approaches the safe distance; when the distance is smaller than the safe distance the force turns into a repulsive force, which becomes larger the closer the other vehicle is. Vehicles in laterally parallel positions travelling in parallel exert no interaction force on each other, and the absolute value of the interaction force is largest for the same longitudinal lane.
k_3 is a correction parameter for the degree to which pedestrians comply with the traffic regulations, and the interaction force with pedestrians is described by the pedestrian interaction force parameter, whose formula is given as an image in the original publication. In it, v is the current speed of the vehicle, θ_1j is the angle between the center of the vehicle front and the pedestrian, r_1j is the distance difference, and t_1j is the estimated meeting time. The expression indicates that when the vehicle speed is 0 there is no interaction force between the vehicle and the pedestrian; the closer the vehicle and the pedestrian, the smaller the angle difference and the shorter the estimated meeting time, and the higher the vehicle speed, the larger the resulting repulsive force.
k_4 is a correction parameter for the complexity of the surrounding physical environment, and the interaction force with the surrounding physical environment (non-moving objects such as buildings) is described by the environmental interaction force parameter, whose formula is given as an image in the original publication. In it, T is the volume of the non-moving object: the larger the volume, the larger the repulsive force. When the volume is smaller than or equal to the size the vehicle can pass, the interaction force is an attractive force; when the volume is larger than the passable size, the smaller the collision time T_1R, the larger the repulsive force, and the repulsive force also grows with the vehicle mass and the vehicle speed; at a speed of 0 there is no interaction force.
k_5 is a correction parameter for the degree of influence of the traffic regulations, reflecting how much attention the vehicle pays to them; the traffic regulations act as a resistance described by the rule parameter, whose formula is given as an image in the original publication. In it, v_lim is the maximum speed permitted by the traffic regulations and traffic signs: the lower the permitted speed, the larger the resistance, and when the traffic regulations or traffic signs require stopping, as at a red light, the resistance is infinite.
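The combining formula for γ appears only as an image in the original publication; the Python skeleton below therefore only illustrates the structure described above, assuming, purely for illustration, a linear combination of the correction-weighted terms and a simple mass-based scaling. None of the names or the aggregation rule are taken from the patent.

```python
def environmental_response_factor(M, m,
                                  f_desired, f_vehicles, f_pedestrians,
                                  f_environment, f_rule,
                                  k1, k2, k3, k4, k5):
    """Illustrative skeleton only: combine the desired-velocity term and the
    vehicle, pedestrian, environment and rule interaction force parameters
    (each already computed from its own formula) with the correction
    parameters k1..k5.  A plain weighted sum scaled by a mass/type correction
    is assumed here; the patent's actual formula is given only as an image."""
    total = (k1 * f_desired
             + k2 * sum(f_vehicles)       # one term per surrounding vehicle
             + k3 * sum(f_pedestrians)    # one term per pedestrian
             + k4 * sum(f_environment)    # one term per non-moving object
             + k5 * f_rule)
    return (m / M) * total                # assumed role of the mass M and correction m
```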
Step 4, carrying out normalization preprocessing according to the collected current action information, namely the brake, the accelerator and the steering wheel, so as to obtain a real-time operation quantization parameter a_pre.
Further, step 4 is implemented by:
step 4.1, extracting brake force, accelerator force and steering wheel angle data through a sensor;
and 4.2, normalizing the three data by using min-max standardization:
braking:
Figure BDA0003720378570000121
accelerator:
Figure BDA0003720378570000122
steering wheel corner:
Figure BDA0003720378570000123
the value is the current value, min is the minimum value, and max is the maximum value;
the operation specification can know that the accelerator and the brake are mutually exclusive operations, so the normalization results are combined as follows:
longitudinal operation interval: [ -1:1 ];
the transverse operation interval: [ -1:1 ];
step 4.3, for the longitudinal operation interval and the transverse operation interval: -1:1, constructing bijections from-1: 1, -1:1 to-1:
longitudinal value of (0. a) 1 a 2 a 3 a 4 …) and has a transverse value of (0. b) 1 b 2 b 3 b 4 …), constructing a crossover method, segmenting the two types of decimal, segmenting after all non-0 digits, and performing crossover recombination on the segmented segments to obtain a one-dimensional real-time operation quantization parameter a pre
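A sketch of steps 4.2 and 4.3, assuming plain digit-by-digit interleaving of the two decimal expansions; the exact segmentation rule ("segment after each non-zero digit") and the sign handling of the bijection are simplified here, and all names are illustrative.

```python
def min_max(value: float, lo: float, hi: float) -> float:
    """Standard min-max normalization used in step 4.2."""
    return (value - lo) / (hi - lo)

def interleave(lon: float, lat: float, precision: int = 6) -> float:
    """Crossover sketch for step 4.3: interleave the decimal digits of the
    longitudinal and lateral operation values (both assumed in [-1, 1]) into a
    single one-dimensional quantity a_pre."""
    lon_digits = f"{abs(lon):.{precision}f}".split(".")[1]
    lat_digits = f"{abs(lat):.{precision}f}".split(".")[1]
    mixed = "".join(a + b for a, b in zip(lon_digits, lat_digits))
    a_pre = float("0." + mixed)
    # Simple illustrative sign convention: the longitudinal sign dominates.
    return -a_pre if lon < 0 else a_pre
```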
Step 5, determining current vehicle path information according to the position of the current vehicle on the road, wherein the current vehicle path information at least comprises the current vehicle road danger degree and the current vehicle road danger threshold.
Further, step 5 is implemented by:
and 5.1, determining the position of the current vehicle in the road, wherein the position at least comprises the distance between the current vehicle and the front vehicle and the transverse position of the current vehicle, and determining a longitudinal vehicle road danger value according to the distance between the current vehicle and the front vehicle.
The risk of the longitudinal position is inversely proportional to the distance from the tail of the front vehicle, namely the closer the distance from the tail of the front vehicle, the greater the risk, the longitudinal vehicle road danger function is established, the tail of the front vehicle is used as the original point to set a coordinate axis, and the specified normal safe distance is zeta 1 . Setting a minimum safety distance eta 1 ,η 1 Is set to the maximum deceleration braking to the just-no-collision distance of the preceding vehicle.
Figure BDA0003720378570000131
y 1 Is a longitudinal vehicle road risk value,
x 1 is the distance from the front vehicle;
and 5.2, determining a transverse vehicle road danger value according to the transverse position of the current vehicle:
and (3) establishing a transverse vehicle road danger function by taking the vehicle head central point as an original point:
y 2 =0.5cos[(π/T)x 2 ]-0.5,-T≤x 2 ≤T
y 2 is a value of the risk of the lateral vehicle,
x 2 is the current lateral position;
t is the distance from the center line of the lane to the sideline;
step 5.3, calculating to obtain the current vehicle road danger degree H x,y
Figure BDA0003720378570000132
Figure BDA0003720378570000133
The risk distance influence factors of different road sections have the value range of [1,10 ]]When the value is 1, the current road section and the driving state are standard traffic road sections and driving environments under the regulation of the intersection standard. When the value is 10, the conditions that the current driving environment is severe, the road traffic capacity is extremely poor and rear-end accidents happen frequently around the road, such as a heavy fog and frozen road section, are indicated.
Step 5.4, calculating the current vehicle road danger threshold values of different scenes:
risk=ωγH x,y
omega is a scene impact parameter. Environmental response factor gamma
Step 6, determining the comprehensive driving operation action index from the collected driver state, driver intention and real-time operation quantization parameter a_pre, combined with the environmental response factor γ, the current vehicle road danger degree and the current vehicle road danger threshold in a piecewise-function manner. The corresponding formula is given as an image in the original publication, where:
z_1 is the driver state; different driver states give different degrees of environmental response;
γ is the environmental response factor;
δ is the driver intention, representing the degree to which the current operation matches the recognized driver intention;
H_x,y is the current vehicle road danger degree;
σ is a road correction parameter;
a_pre is the real-time operation quantization parameter;
risk is the current vehicle road danger threshold.
Step 7, obtaining the predicted vehicle road risk degree and the predicted vehicle road risk threshold from the growth rate of the interaction forces.
Specifically, the interaction forces in step 3 are inversely proportional to distance, and the faster an interaction force grows, the more likely danger is to occur. The derivation (its intermediate expressions are given as images in the original publication) leads to
A_arisk = ρ·a_risk
where v_f is the growth rate of a single unit's interaction force, v_risk is the predicted vehicle road risk degree, a_f is the acceleration of the growth of a single unit's interaction force, a_risk is the sum of the interaction force growth accelerations over all peripheral units, A_arisk is the predicted vehicle road risk threshold, and ρ is a vehicle road risk influence factor determined by the complexity of the current road, with value range [0, 1].
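A hedged numerical sketch of step 7: it takes v_f and a_f as the first and second time derivatives of each unit's interaction force magnitude, assumes that v_risk aggregates the per-unit growth rates in the same way that a_risk sums the per-unit growth accelerations, and applies A_arisk = ρ·a_risk as stated above. The derivative scheme and the aggregation of v_risk are assumptions; only the last relation is given in the text.

```python
import numpy as np

def predicted_road_risk(force_histories, dt, rho):
    """force_histories: one 1-D array per peripheral unit with the recent
    interaction-force magnitudes sampled every dt seconds; rho in [0, 1] is the
    vehicle road risk influence factor."""
    v_f = [np.gradient(f, dt) for f in force_histories]   # growth rate per unit
    a_f = [np.gradient(v, dt) for v in v_f]                # growth acceleration per unit
    v_risk = sum(v[-1] for v in v_f)                       # assumed aggregation
    a_risk = sum(a[-1] for a in a_f)                       # sum over all peripheral units
    A_arisk = rho * a_risk                                 # predicted risk threshold
    return v_risk, A_arisk
```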
Step 8, calculating the driving operation action prediction index from the driver information and the vehicle road prediction information. The driver information at least includes the driver state, the driver intention, the driver style and the driver subconscious driving influence deviation, and the vehicle road prediction information at least includes the predicted vehicle road risk degree and the predicted vehicle road risk threshold. The calculation formula of the driving operation action prediction index is given as an image in the original publication, where:
σ is the driver subconscious driving influence deviation; the historical deviation of the driver in the most similar scene is obtained by comparing traffic scenes;
S is the driver style, i.e. a quantitative evaluation in [0, 10] obtained from the driver style test; a value below 1 means the driver style is extremely unsuitable and affects the delay of the driver's intention and driving-state operation response;
Z_t is the delay of the driver's state operation response, a set value; the larger the delay, the smaller the driving operation action prediction index;
δ is the driver intention; different driver intention paths have a larger influence on the driver subconscious driving influence deviation.
Step 9, inputting the driving operation action prediction index and the comprehensive driving operation action index into the reinforcement-learning-based human-machine co-driving control right switching system, and calculating and adjusting the driving weights required by the driver and the driving system respectively.
Specifically, as shown in Fig. 3 and Fig. 4, the driving operation action prediction index represents how the various risk factors of the future time period affect the safety of the operation, emphasizing the influence of other units after the vehicle's operation. The comprehensive driving operation action index represents how each risk factor affects the safety of the operation at the current time, emphasizing whether the current position is safe and whether the current state allows effective driving;
Step 9.1, using the Z-score standardization formula, the driving operation action prediction index and the comprehensive driving operation action index at the current moment are standardized, and the mean and standard deviation of the driving operation action prediction indexes and of the comprehensive driving operation action indexes recorded from the start of the current drive to the current moment are calculated.
Step 9.2, the Z-score-standardized driving operation action prediction index and comprehensive driving operation action index, together with the current corresponding means and standard deviations, are input as input parameters into the reinforcement-learning-based human-machine co-driving control right switching system to judge whether the weight distribution conditions are met; if so, step 9.3 is executed, otherwise the driver information and the vehicle road prediction information are re-acquired.
Specifically, a Z-score states by how many standard deviations a sampled value differs from the mean of the data. Taking the driving operation action prediction index as an example, the first parameter is the number of standard deviations by which the currently input driving operation action prediction index (the sampled value) differs from the mean of all driving operation action prediction indexes input from the start of the driving behavior to the current time, the standard deviation being that of the same set of indexes. The second parameter is defined in the same way for the comprehensive driving operation action index.
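A minimal sketch of this standardization; the histories, the guard against a zero standard deviation and the names z1/z2 are illustrative assumptions.

```python
from statistics import mean, pstdev

def z_scores(pred_history, comp_history):
    """pred_history / comp_history hold every driving operation action prediction
    index and every comprehensive driving operation action index recorded since
    the start of the current drive; the last element is the current sample."""
    z1 = (pred_history[-1] - mean(pred_history)) / (pstdev(pred_history) or 1.0)
    z2 = (comp_history[-1] - mean(comp_history)) / (pstdev(comp_history) or 1.0)
    return z1, z2
```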
In this embodiment, the weight distribution conditions in the control right switching system include three cases:
(1) The first parameter and the second parameter are both less than or equal to the first trigger threshold for 5 consecutive times.
Specifically, it is judged whether the first parameter and the second parameter are both less than or equal to the first trigger threshold, whose value may be -3, i.e. whether the input driving operation action prediction index and comprehensive driving operation action index both lie at least 3 standard deviations below their current corresponding means. Such a situation indicates that the current state does not satisfy the safe state and that the safe state will not be satisfied in the future either, so the control right switching system is triggered to start working. Under this condition, when the indexes of five consecutive inputs all satisfy first parameter ≤ -3 and second parameter ≤ -3, the driving weights respectively required by the driver and the driving system need to be adjusted.
(2) The second parameter is less than or equal to the second trigger threshold for 3 consecutive times.
Specifically, the value of the second trigger threshold may be -4. When the input comprehensive driving operation action index lies at least 4 standard deviations below the current mean, i.e. the second parameter ≤ -4, the current state does not satisfy the safe state, the driving system needs to intervene urgently, and the control right switching system is triggered to start working. Under this condition, when the index of three consecutive inputs satisfies second parameter ≤ -4, the driving weights respectively required by the driver and the driving system are adjusted.
(3) The first parameter is less than or equal to the second trigger threshold for 3 consecutive times.
Specifically, when the input driving operation action prediction index lies at least 4 standard deviations below the current mean, i.e. the first parameter ≤ -4, the future state does not satisfy the safe state and cannot be corrected to a safe state by the driver's intervention alone, so the control right switching system is triggered to start working. Under this condition, when the index of three consecutive inputs satisfies first parameter ≤ -4, the driving weights respectively required by the driver and the driving system are adjusted.
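A sketch of the three trigger checks, using rolling windows of the two standardized parameters; the window handling and the names are illustrative assumptions, while the thresholds and counts follow the example values above.

```python
from collections import deque

FIRST_TRIGGER = -3.0    # example first trigger threshold from the text
SECOND_TRIGGER = -4.0   # example second trigger threshold from the text

z1_window: deque = deque(maxlen=5)   # recent first parameters (prediction index Z-scores)
z2_window: deque = deque(maxlen=5)   # recent second parameters (comprehensive index Z-scores)

def weight_distribution_triggered(z1: float, z2: float) -> bool:
    """Condition (1): both parameters <= -3 for 5 consecutive inputs.
    Condition (2): the second parameter <= -4 for 3 consecutive inputs.
    Condition (3): the first parameter <= -4 for 3 consecutive inputs."""
    z1_window.append(z1)
    z2_window.append(z2)
    cond1 = len(z1_window) == 5 and all(a <= FIRST_TRIGGER and b <= FIRST_TRIGGER
                                        for a, b in zip(z1_window, z2_window))
    cond2 = len(z2_window) >= 3 and all(b <= SECOND_TRIGGER for b in list(z2_window)[-3:])
    cond3 = len(z1_window) >= 3 and all(a <= SECOND_TRIGGER for a in list(z1_window)[-3:])
    return cond1 or cond2 or cond3
```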
Step 9.3, based on the Q-learning algorithm, the learning state in the Q-learning algorithm is adjusted using the input parameters, and the driver's driving weight is assigned according to the action with the maximum value for the next state in the Q-learning algorithm; the driving weight of the driving system is the difference between 1 and the driver's driving weight.
In this embodiment, the algorithm used in the control right switching system to compute the driving weights respectively required by the driver and the driving system is the Q-learning algorithm, and the specific training process is as follows:
(1) The transition rule of Q-learning is:
Q(state, action) = R(state, action) + Gamma * MaxQ(next state, all actions)
where Gamma is the discount factor; the larger the discount factor, the greater the role MaxQ plays. R(state, action) can be understood as the immediate value, and MaxQ as the value in memory, i.e. the maximum of the values over the actions of the next state held in memory.
(2) A "matrix Q" is added as the reinforcement-learning agent, i.e. the brain of the driving right switching system: what has been learned from experience. The rows of the "matrix Q" represent the current state of the driving right switching system, and the columns represent the possible actions of the next state (the links between nodes). The driving right switching system is initialized to 0, i.e. the "matrix Q" is initialized to zero; the "matrix Q" may start with only one element, and whenever a new state is found the "matrix Q" is updated. This is called unsupervised learning.
(3) The driving weight of the driver is power(driver) and the driving weight of the driving system is power(system) = 1 - power(driver). The control right switching system adjusts the driver's driving weight using the reinforcement-learning Q-learning algorithm: the Q-learning action is assigned directly to the driver's driving weight, with weight values in the range [0, 1] and a step size of 0.05.
(4) The Q-learning states are set to 0, 1, 2, 3, 4 and 5. Each state corresponds to a condition on the standardized driving operation action prediction index and comprehensive driving operation action index (the specific conditions are given as formula images in the original publication): states 0, 1 and 2 are the unsafe initial states reached when the switching system is triggered, states 3 and 4 are intermediate states, and state 5 is the target state.
(5) After the control right switching system is triggered to start working, one of the initial states 0, 1, 2 is obtained. When, after the action in (3) (adjusting the weight), the state is still one of 0, 1, 2, the reward is -1, the matrix Q is updated, and the element of the matrix corresponding to the state and the used action is assigned -1;
when the state reaches state 3 or 4 through the action in (3), the reward is 1, the matrix Q is updated, and the element of the matrix corresponding to the state and the used action is assigned 1;
when the state reaches state 5 through the action in (3), the reward is 100, the matrix Q is updated, and the element of the matrix corresponding to the state and the used action is assigned 100; state 5 is the target state, and the resulting Q-table is shown in Fig. 5 (elements not assigned).
(6) A road environment is selected and (1) to (5) are applied to obtain the Q-table of the initial "matrix Q", which serves as the source of MaxQ(next state, all actions) in the Q-learning update Q(state, action) = R(state, action) + Gamma * MaxQ(next state, all actions). Gamma is selected in [0, 1] according to the degree of road similarity, and R(state, action) is the value of the state reached in the current road environment: the reward is -1 for states 0, 1 and 2, 1 for states 3 and 4, and 100 for state 5.
When the control right switching system calculates the driving weight, the weights that can be adjusted and the rewards of the reachable states are computed in advance from the Q-table of the similar road section; this reward is MaxQ(next state, all actions) in the formula. Therefore Q(state, action) is the sum of the R(state, action) value in the current road environment and MaxQ(next state, all actions).
When Q(state, action) is maximal, the corresponding action leading to the next state in MaxQ(next state, all actions) is the weight that needs to be set, recorded as the driver driving weight power(driver).
(7) The Q-table is updated according to the calculated Q(state, action) value.
(8) The switching of weights is stopped when state 5 is reached.
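A hedged Python sketch of the training and weight-assignment procedure in (1) to (8). The 6 states, the action set (driver weights in [0, 1] with step 0.05), the rewards -1 / 1 / 100 and the target state 5 follow the text; the state-transition function step_env, the episode handling and the exploration strategy are illustrative assumptions.

```python
import numpy as np

ACTIONS = np.round(np.arange(0.0, 1.0001, 0.05), 2)   # candidate driver weights
N_STATES = 6
GAMMA = 0.8                                            # chosen in [0, 1] by road similarity

def reward(state: int) -> float:
    return -1.0 if state in (0, 1, 2) else (1.0 if state in (3, 4) else 100.0)

def train_q_table(step_env, episodes: int = 500) -> np.ndarray:
    """step_env(state, driver_weight) -> next_state is assumed to be supplied by
    the simulator of step 1; it returns the safety state reached after applying
    the candidate driver weight."""
    Q = np.zeros((N_STATES, len(ACTIONS)))             # the "matrix Q", initialized to zero
    for _ in range(episodes):
        state = int(np.random.choice([0, 1, 2]))       # triggered in an unsafe state
        for _ in range(100):                           # cap the episode length
            a = np.random.randint(len(ACTIONS))        # explore
            nxt = step_env(state, float(ACTIONS[a]))
            Q[state, a] = reward(nxt) + GAMMA * Q[nxt].max()
            state = nxt
            if state == 5:                             # state 5 is the target state
                break
    return Q

def choose_driver_weight(Q: np.ndarray, state: int) -> float:
    """power(driver) is the action with the maximum value for the current state;
    power(system) = 1 - power(driver)."""
    return float(ACTIONS[int(np.argmax(Q[state]))])
```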
The technical scheme of the application has been explained in detail above with reference to the accompanying drawings. The application provides a reinforcement-learning-based human-machine co-driving control right switching method, applicable to a reinforcement-learning-based human-machine co-driving control right switching system for allocating driving weights between a driver and a driving system, and comprising the following steps: calculating a driving operation action prediction index according to the driver information and the vehicle road prediction information; and inputting the driving operation action prediction index and the comprehensive driving operation action index into the control right switching system, and calculating the driving weights between the driver and the driving system. Through this technical scheme, the combined longitudinal and lateral risk of the vehicle is effectively addressed, the influence of uncertainty introduced by the driver is weakened, and the driver is considered comprehensively from different angles, thereby reducing errors in judging the driver.
The steps in the present application may be sequentially adjusted, combined, and subtracted according to actual requirements.
The units in the device can be merged, divided and deleted according to actual requirements.
Although the present application has been disclosed in detail with reference to the accompanying drawings, it is to be understood that such description is merely illustrative and is not intended to limit the application of the present application. The scope of the present application is defined by the appended claims and may include various modifications, adaptations, and equivalents of the invention without departing from the scope and spirit of the application.

Claims (8)

1. A reinforcement learning-based man-machine driving control right switching method is applicable to distribution of driving weights between a driver and a driving system by a reinforcement learning-based man-machine driving control right switching system, and comprises the following steps:
calculating a driving operation action prediction index according to the driver information and the vehicle road prediction information;
and inputting the driving operation action prediction index and the comprehensive driving operation action index into the control weight switching system, and calculating the driving weight between the driver and the driving system.
2. The reinforcement learning-based human-machine co-driving control right switching method according to claim 1, wherein the driver information at least includes a driver state, a driver intention, a driver style, and a driver subconscious driving influence deviation, the vehicle path prediction information at least includes a predicted vehicle path risk and a predicted vehicle path risk threshold,
the calculation formula of the driving operation action prediction index is given as an image in the original publication, in which Z_t is the driver state operation response delay, σ is the driver subconscious driving influence deviation, δ is the driver intention, S is the driver style, v_risk is the predicted vehicle path risk degree, and A_arisk is the predicted vehicle path risk threshold.
3. The reinforcement learning-based man-machine driving sharing control right switching method according to claim 2, wherein the driver subconscious driving influence deviation σ is calculated by formulas that are given as images in the original publication, with
R_d = |d - q_ki|
wherein σ is the driver subconscious driving influence deviation, sum is the number of collected traffic scenes, D_i is the subconscious driving strength over one traffic scene time period, ρ', τ and ω are parameters to be determined, α is the subconscious side weight, β is the driver's personal safety tendency weight, d is the current lateral position of the vehicle, q_ki is the fitted lateral position of the vehicle under this scene (label), a is the vehicle acceleration, and R_d is a position parameter.
4. The reinforcement learning-based human-computer co-driving control right switching method as claimed in claim 1, wherein the driver information at least includes a driver state, a driver intention, and a driver style, and the calculation process of the comprehensive driving operation action index specifically includes:
determining current vehicle path information according to the position of a current vehicle in a road, wherein the current vehicle path information at least comprises a current vehicle path danger degree and a current vehicle path danger threshold;
determining the comprehensive driving operation action index by combining an environmental response factor and adopting a piecewise function according to the driver information and the current vehicle path information, wherein the calculation formula of the comprehensive driving operation action index is given as an image in the original publication, in which z_1 is the driver state, γ is the environmental response factor, H_x,y is the current vehicle path danger degree, σ is a road correction parameter, a_pre is the real-time operation quantization parameter, and risk is the current vehicle path danger threshold.
5. The reinforcement learning-based human-machine co-driving control right switching method according to claim 4, wherein determining the current vehicle road information according to the position of the current vehicle in the road specifically includes:
determining the position of the current vehicle in the road, including at least the distance from the current vehicle to the preceding vehicle and the lateral position of the current vehicle;
determining a longitudinal vehicle road risk value according to the distance from the current vehicle to the preceding vehicle;
determining a lateral vehicle road risk value according to the lateral position of the current vehicle;
calculating the current vehicle road risk degree from the longitudinal vehicle road risk value and the lateral vehicle road risk value, using the following formula:
[equation image FDA0003720378560000023]
where H_{x,y} is the current vehicle road risk degree, [symbol FDA0003720378560000024] is the risk distance influence factor of the road section, taking values in [1, 10], y_1 is the longitudinal vehicle road risk value, and y_2 is the lateral vehicle road risk value;
and calculating the current vehicle road risk thresholds for different scenes according to the current vehicle road risk degree, and recording the current vehicle road risk thresholds and the current vehicle road risk degree as the current vehicle road information.
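A minimal Python sketch of the procedure in claim 5 follows. The patent's combination formula for H_{x,y} is an equation image and is not reproduced, so the longitudinal/lateral danger functions and the Euclidean-style combination below are purely illustrative assumptions, as are the numeric defaults.

```python
from dataclasses import dataclass

@dataclass
class VehiclePosition:
    gap_to_lead_vehicle_m: float   # distance to the preceding vehicle
    lateral_offset_m: float        # lateral position relative to the lane centre

def longitudinal_danger(gap_m: float, safe_gap_m: float = 30.0) -> float:
    """Illustrative longitudinal danger value y1: grows as the gap shrinks below a safe gap."""
    return max(0.0, (safe_gap_m - gap_m) / safe_gap_m)

def lateral_danger(offset_m: float, half_lane_m: float = 1.75) -> float:
    """Illustrative lateral danger value y2: grows as the vehicle drifts from the lane centre."""
    return min(1.0, abs(offset_m) / half_lane_m)

def road_risk(pos: VehiclePosition, distance_factor: float = 5.0) -> float:
    """Combine y1 and y2 into an overall road risk degree.

    The patent's formula is not shown in the text; an assumed combination scaled by the
    risk distance influence factor (valued in [1, 10]) is used here for illustration only.
    """
    y1 = longitudinal_danger(pos.gap_to_lead_vehicle_m)
    y2 = lateral_danger(pos.lateral_offset_m)
    return distance_factor * (y1 ** 2 + y2 ** 2) ** 0.5
```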
6. The reinforcement learning-based human-machine co-driving control right switching method according to claim 4, wherein the environmental response factor γ is calculated as follows:
[equation image FDA0003720378560000031]
where m is the vehicle mass, M is the vehicle type and target correction parameter, k_1 is the dynamics correction parameter, [symbol FDA0003720378560000032] denotes the desired speed and speed direction of the vehicle, v_limleast(t) is the minimum speed value, k_2 is the traffic scene correction parameter, [symbol FDA0003720378560000033] is the vehicle interaction force parameter, k_3 is the correction parameter for the degree to which pedestrians comply with traffic regulations, [symbol FDA0003720378560000034] is the pedestrian interaction force parameter, k_4 is the correction parameter for the complexity of the surrounding physical environment, [symbol FDA0003720378560000035] is the environmental interaction force parameter, k_5 is the correction parameter for the degree of influence of traffic regulations, and [symbol FDA0003720378560000036] is the rule parameter.
7. The reinforcement learning-based human-machine co-driving control right switching method according to any one of claims 1 to 6, wherein calculating the driving weight between the driver and the driving system specifically includes:
step 9.1, normalizing the driving operation action prediction index and the comprehensive driving operation action index at the current moment using the Z-score normalization formula, and calculating the mean and standard deviation of the driving operation action prediction index and of the comprehensive driving operation action index over all operations from the start of the current drive to the current operation;
step 9.2, inputting the Z-score-normalized driving operation action prediction index and comprehensive driving operation action index, together with their current means and standard deviations, as input parameters into the reinforcement learning-based human-machine co-driving control right switching system to judge whether the weight distribution condition is met; if so, executing step 9.3, and if not, re-acquiring the driver information and the vehicle road prediction information;
step 9.3, based on the Q-learning algorithm, adjusting the learning state in the Q-learning algorithm using the input parameters, and assigning the driver's driving weight according to the action corresponding to the maximum value of the next state in the Q-learning algorithm, wherein the driving weight of the driving system is 1 minus the driver's driving weight.
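The following is a minimal Python sketch of steps 9.1 and 9.3: Z-score normalization of the two indices against their history, a discretized learning state, and a greedy read-out of the driver's weight from a tabular Q function. The discretization grid, the candidate weight set, and all identifiers are assumptions for illustration; the patent does not specify them in the text shown.

```python
import numpy as np

def z_score(current: float, history: list) -> float:
    """Z-score of the current index against all values recorded since the start of the drive."""
    mean, std = float(np.mean(history)), float(np.std(history))
    return 0.0 if std == 0.0 else (current - mean) / std

# Candidate driver weights form the (assumed) discrete action set;
# the driving system's weight is 1 minus the chosen driver weight.
DRIVER_WEIGHTS = np.linspace(0.0, 1.0, 11)
BINS = np.linspace(-3.0, 3.0, 7)  # assumed discretization of the normalized indices

def discretize(z_pred: float, z_comp: float) -> tuple:
    """Map the two normalized indices to a learning state on a small grid."""
    return int(np.digitize(z_pred, BINS)), int(np.digitize(z_comp, BINS))

def assign_weights(q_table: np.ndarray, z_pred: float, z_comp: float) -> tuple:
    """Pick the action with the maximum Q value in the state reached by the current inputs
    and read the driver's driving weight off that action."""
    state = discretize(z_pred, z_comp)
    best_action = int(np.argmax(q_table[state]))
    driver_weight = float(DRIVER_WEIGHTS[best_action])
    return driver_weight, 1.0 - driver_weight

# q_table would have shape (len(BINS) + 1, len(BINS) + 1, len(DRIVER_WEIGHTS)) and be
# trained with a standard Q-learning update; the update rule itself is omitted here.
```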
8. The reinforcement learning-based human-machine co-driving control right switching method according to claim 7, wherein the weight distribution condition specifically includes:
the first parameter and the second parameter are both less than or equal to a first trigger threshold for 5 consecutive times; or,
the second parameter is less than or equal to a second trigger threshold for 3 consecutive times; or,
the first parameter is less than or equal to the second trigger threshold for 3 consecutive times, wherein,
the first parameter is the number of standard deviations by which the currently input driving operation action prediction index differs from the mean of the driving operation action prediction indexes of all inputs from the start of the driving behavior to the current time, and
the second parameter is the number of standard deviations by which the currently input comprehensive driving operation action index differs from the mean of the comprehensive driving operation action indexes of all inputs from the start of the driving behavior to the current time.
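The trigger logic of claim 8 can be stated directly in code. The sketch below checks the three conditions over a rolling history of the two parameters; the numeric threshold values are placeholders, since the patent text shown does not state them, and the class and argument names are illustrative.

```python
from collections import deque

class WeightDistributionTrigger:
    """Checks the three conditions of claim 8 over a rolling history of the two
    normalized parameters (numbers of standard deviations from their running means).
    Threshold values are assumed placeholders, not values from the patent."""

    def __init__(self, first_threshold: float = 1.0, second_threshold: float = 0.5):
        self.t1 = first_threshold
        self.t2 = second_threshold
        self.history = deque(maxlen=5)  # most recent (first_param, second_param) pairs

    def update(self, first_param: float, second_param: float) -> bool:
        self.history.append((first_param, second_param))
        last5 = list(self.history)
        last3 = last5[-3:]
        # Condition 1: both parameters <= first trigger threshold for 5 consecutive inputs.
        cond1 = len(last5) == 5 and all(p1 <= self.t1 and p2 <= self.t1 for p1, p2 in last5)
        # Condition 2: the second parameter <= second trigger threshold for 3 consecutive inputs.
        cond2 = len(last3) == 3 and all(p2 <= self.t2 for _, p2 in last3)
        # Condition 3: the first parameter <= second trigger threshold for 3 consecutive inputs.
        cond3 = len(last3) == 3 and all(p1 <= self.t2 for p1, _ in last3)
        return cond1 or cond2 or cond3
```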
CN202210758672.3A 2022-06-29 2022-06-29 Man-machine common driving control right switching method based on reinforcement learning Active CN115071758B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210758672.3A CN115071758B (en) 2022-06-29 2022-06-29 Man-machine common driving control right switching method based on reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210758672.3A CN115071758B (en) 2022-06-29 2022-06-29 Man-machine common driving control right switching method based on reinforcement learning

Publications (2)

Publication Number Publication Date
CN115071758A true CN115071758A (en) 2022-09-20
CN115071758B CN115071758B (en) 2023-03-21

Family

ID=83254772

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210758672.3A Active CN115071758B (en) 2022-06-29 2022-06-29 Man-machine common driving control right switching method based on reinforcement learning

Country Status (1)

Country Link
CN (1) CN115071758B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108549367A (en) * 2018-04-09 2018-09-18 吉林大学 A kind of man-machine method for handover control based on prediction safety
US20190118832A1 (en) * 2016-04-18 2019-04-25 Honda Motor Co., Ltd. Vehicle control system, vehicle control method, and vehicle control program
US20200192359A1 (en) * 2018-12-12 2020-06-18 Allstate Insurance Company Safe Hand-Off Between Human Driver and Autonomous Driving System
US20210039638A1 (en) * 2019-08-08 2021-02-11 Honda Motor Co., Ltd. Driving support apparatus, control method of vehicle, and non-transitory computer-readable storage medium
CN113335291A (en) * 2021-07-27 2021-09-03 燕山大学 Man-machine driving sharing control right decision method based on man-vehicle risk state
CN113341730A (en) * 2021-06-28 2021-09-03 上海交通大学 Vehicle steering control method under remote man-machine cooperation

Also Published As

Publication number Publication date
CN115071758B (en) 2023-03-21

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant