CN109598934B - Rule and learning model-based method for enabling unmanned vehicle to drive away from high speed - Google Patents

Rule and learning model-based method for enabling unmanned vehicle to drive away from high speed

Info

Publication number
CN109598934B
CN109598934B (application CN201811524283.4A)
Authority
CN
China
Prior art keywords
vehicle
model
action
unmanned
speed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811524283.4A
Other languages
Chinese (zh)
Other versions
CN109598934A (en)
Inventor
杨殿阁 (Yang Diange)
曹重 (Cao Zhong)
江昆 (Jiang Kun)
封硕 (Feng Shuo)
王思佳 (Wang Sijia)
肖中阳 (Xiao Zhongyang)
谢诗超 (Xie Shichao)
焦新宇 (Jiao Xinyu)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Chaoxing Future Technology Co., Ltd
Original Assignee
Beijing Chaoxing Future Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Chaoxing Future Technology Co ltd filed Critical Beijing Chaoxing Future Technology Co ltd
Priority to CN201811524283.4A priority Critical patent/CN109598934B/en
Publication of CN109598934A publication Critical patent/CN109598934A/en
Application granted granted Critical
Publication of CN109598934B publication Critical patent/CN109598934B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G08 - SIGNALLING
    • G08G - TRAFFIC CONTROL SYSTEMS
    • G08G1/00 - Traffic control systems for road vehicles
    • G08G1/01 - Detecting movement of traffic to be counted or controlled
    • G08G1/0104 - Measuring and analyzing of parameters relative to traffic conditions
    • G08G1/0125 - Traffic data processing
    • G - PHYSICS
    • G05 - CONTROLLING; REGULATING
    • G05D - SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00 - Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/02 - Control of position or course in two dimensions
    • G05D1/021 - Control of position or course in two dimensions specially adapted to land vehicles
    • G - PHYSICS
    • G05 - CONTROLLING; REGULATING
    • G05D - SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00 - Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/02 - Control of position or course in two dimensions
    • G05D1/021 - Control of position or course in two dimensions specially adapted to land vehicles
    • G05D1/0212 - Control of position or course in two dimensions specially adapted to land vehicles with means for defining a desired trajectory
    • G05D1/0221 - Control of position or course in two dimensions specially adapted to land vehicles with means for defining a desired trajectory involving a learning process
    • G - PHYSICS
    • G08 - SIGNALLING
    • G08G - TRAFFIC CONTROL SYSTEMS
    • G08G1/00 - Traffic control systems for road vehicles
    • G08G1/09 - Arrangements for giving variable traffic instructions
    • G08G1/0962 - Arrangements for giving variable traffic instructions having an indicator mounted inside the vehicle, e.g. giving voice messages
    • G08G1/0968 - Systems involving transmission of navigation instructions to the vehicle

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • Automation & Control Theory (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Control Of Driving Devices And Active Controlling Of Vehicle (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention relates to a method, based on fused rule and learning models, for enabling an unmanned vehicle to drive away from a highway, comprising the following steps. While the unmanned vehicle travels on the highway, an off-ramp motivation is generated according to the navigation system's distance to the ramp; the exit is first attempted with a rule model, and whether the exit success rate of the rule-based decision model is decreasing is judged; if not, the rule model continues to decide the action, otherwise the next step is entered: a hybrid decision model drives with the rule model while far from the ramp and, while driving toward the ramp, adjusts the vehicle's actions with a reinforcement learning decision model according to the urgency of the exit. The method improves the driving efficiency and stability of the unmanned vehicle during the exit process and achieves efficient, highly stable exit decisions under a limited sensing range and hard-to-predict surrounding-vehicle behavior.

Description

Rule and learning model-based method for enabling unmanned vehicle to drive away from high speed
Technical Field
The invention relates to the technical field of unmanned vehicle decision making, and in particular to a rule and learning model-based method for enabling an unmanned vehicle to drive away from a highway.
Background
Autonomous decision making is an important component of an unmanned vehicle system, and the expressway is an important application scenario. The process of driving away from the expressway (taking the off-ramp) strongly affects driving efficiency: efficiency drops markedly when the vehicle switches to the rightmost lane too early to wait for the ramp, or when it misses the ramp. Mainstream off-ramp methods generate a lane-change motivation at a suitable location and complete the exit through several lane changes. However, such lane-change operations cannot adapt to the urgency of the upcoming exit, so the success rate of leaving the highway is low, the required preparation distance is long, and the efficiency of the unmanned vehicle suffers. On the other hand, because the perception range of an unmanned vehicle is limited and driver behavior on the highway is highly uncertain, it is difficult to estimate how simple enumerated lane-change rules affect the exit success rate, and such rules cannot cover all environment states; meanwhile, the results produced by a pure learning method are difficult to control, which affects the safety and stability of driving.
Disclosure of Invention
In view of the above problems, an object of the present invention is to provide a rule and learning model-based method for enabling an unmanned vehicle to drive away from a high speed. The method fully exploits the decision-making capability of reinforcement learning toward a definite target in a highly uncertain environment while retaining the safety and stability of a rule-based decision model, thereby improving the driving efficiency and stability of the unmanned vehicle during the exit process and achieving efficient, highly stable exit decisions under a limited sensing range and hard-to-predict surrounding-vehicle behavior.
In order to achieve the above object, the invention adopts the following technical scheme: a method for enabling an unmanned vehicle to drive away from a high speed based on fused rule and learning models, comprising the following steps: 1) while the unmanned vehicle travels on the expressway, an off-ramp motivation is generated according to the navigation system's distance to the ramp ahead; the exit is first attempted with the rule model, and whether the exit success rate of the rule-based decision model is decreasing is judged; if not, the rule model continues to decide the action, and if so, step 2) is entered. A rectangular coordinate system is established with the starting point of the ramp mouth as the origin, the vehicle driving direction as x and the direction perpendicular to the driving direction as y, in meters (m). The position, velocity and acceleration of the unmanned vehicle are (x_e, y_e), (ẋ_e, ẏ_e) and (ẍ_e, ÿ_e); the position, velocity and acceleration of the surrounding vehicles are (x_i, y_i), (ẋ_i, ẏ_i) and (ẍ_i, ÿ_i), i = 1, 2, …, n. The time interval of the rule model is Δt, and the output of the rule model is the longitudinal and lateral accelerations a_t = (ẍ_e^t, ÿ_e^t) that the unmanned vehicle is expected to maintain over the next Δt, wherein the dotted quantities denote the longitudinal and lateral velocity and acceleration of the vehicle and t denotes the current moment; 2) the hybrid decision model drives with the rule model while far from the ramp and, while driving toward the ramp, adjusts the vehicle's actions with the reinforcement learning decision model according to the urgency of the exit, thereby improving the exit success rate.
Further, in step 1), the method for establishing the rule model comprises the following steps: 1.1) the decision in the x direction comprehensively analyzes the unmanned vehicle's expected driving speed, the distance it is expected to keep from the leading vehicle, and its dynamic characteristics; 1.2) the vehicle's decision in the y direction determines whether to change lanes; the y-direction decision during the lane change is preset, and after a lane-change motivation is generated, the lane change is started as soon as a safe position is found, otherwise the vehicle keeps driving in its lane; 1.3) with the unmanned vehicle's current position and speed and the next-moment target position and target speed as boundary conditions, a fifth-order polynomial is used to generate a smooth curve, which is discretized into guide points at 20 Hz, and the guide signal is sent to the unmanned vehicle to generate its local trajectory.
Further, in step 1.1), the decision in the x direction comprises the following steps: 1.1.1) the desired driving speed v̂_e^x of the unmanned vehicle is given by a safe-following rule [formula given as an image in the original]: when a leading vehicle exists, v̂_e^x is the largest speed from which the unmanned vehicle, after driving for one interval Δt and then braking at its maximum deceleration, can still stop behind the leading vehicle braking at its own maximum deceleration; otherwise v̂_e^x equals the normal desired driving speed; wherein b_e is the maximum deceleration of the unmanned vehicle; Δt is the time interval; d_f is the current distance between the unmanned vehicle and the vehicle ahead in its lane; ẋ_e is the current speed of the unmanned vehicle; ẋ_f is the current speed of the leading vehicle; b_f is the maximum deceleration of the leading vehicle; and v_normal is the desired driving speed of the unmanned vehicle in normal driving; 1.1.2) the desired acceleration â_e^x with which the unmanned vehicle reaches its desired speed is â_e^x = (v̂_e^x − ẋ_e^t)/Δt; 1.1.3) according to the desired acceleration, the final decision in the x direction is adjusted to the clamped value ẍ_e^t = min(max(â_e^x, a_min), a_max), wherein a_min is the maximum deceleration and a_max the maximum acceleration of the unmanned vehicle in normal driving.
Further, in step 1.2), the y-direction decision comprises the following steps: 1.2.1) whether the current lane change is safe is determined by judging the motion states of the vehicles ahead of and behind on the target lane, and the lane change is started when any one of the following conditions is met: (1) no vehicle exists within the observation range ahead of or behind on the target lane; (2) the target lane has a leading vehicle, and the current vehicle speed satisfies a safe-following condition with respect to it [formula given as an image in the original], wherein d_f,j is the following distance between the unmanned vehicle and the leading vehicle on the target lane, ẋ_f,j is the speed of that leading vehicle, and b_f,j is its maximum deceleration; (3) the target lane has a rear vehicle, and the rear vehicle's speed satisfies the corresponding safe-following condition [formula given as an image in the original], wherein b_r,j is the maximum deceleration of the rear vehicle on the target lane, d_r,j is the following distance between the unmanned vehicle and that rear vehicle, and ẋ_r,j is its speed; (4) the target lane has both a leading and a rear vehicle, and their speeds satisfy conditions (2) and (3) respectively; 1.2.2) once the unmanned vehicle decides to change lanes, the y-direction decision during the lane change is fixed, as follows: the whole lane change is set to span two time intervals 2Δt, so the lateral motion consists of an acceleration phase followed by a deceleration phase; when a feasible lane-change moment is obtained, the y-direction decision is set to ÿ_e^t = w/Δt², wherein w is the lane width; when the lane change was started at the previous moment, the next y-direction decision is set to ÿ_e^{t+Δt} = −w/Δt², at which point the unmanned vehicle completes one lane change; 1.2.3) the speed and position of the unmanned vehicle at the next moment are computed from the decided action: ẋ_e^{t+Δt} = ẋ_e^t + ẍ_e^t·Δt and x_e^{t+Δt} = x_e^t + ẋ_e^t·Δt + ẍ_e^t·Δt²/2, and likewise for the y direction.
further, in the step 2), the method for establishing and training the hybrid decision model includes the following steps: 2.1) defining an environment state space, an action space and a reward mechanism; 2.2) the action output by the reinforcement learning model must meet the limits of safety and traffic regulation speed, so that the action of reinforcement learning is limited; 2.3) the hybrid decision model is trained in a highly uncertain simulation environment through a constantly repeated off-ramp process.
Further, in step 2.1), the environment state space, action space and reward mechanism are defined as follows: 2.1.1) the environment state is constructed from the positions, driving states and driving strategies of the vehicles in the environment together with the distance to the ramp, and is defined as s = (l, q_e, q_1, θ_1, …, q_n, θ_n) ∈ S, wherein the coordinate system is the same as that of the rule model; l = |x_e| is the distance between the current vehicle and the ramp; q_e is the driving state of the unmanned vehicle; q_i is the driving state and θ_i the driving strategy of environment vehicle i; s denotes an environment state and S denotes the environment state space formed by all environment states; for every vehicle in the state, the difference between its x coordinate and that of the unmanned vehicle is less than 50 m, i.e. |x_e − x_i| ≤ 50 m; 2.1.2) an action is defined by the vehicle accelerations in the x and y directions, and the selectable action space is enumerated from the following components [formula given as an image in the original]: the x component includes the action a_rule generated by the rule model and the maximum deceleration a_brake of the unmanned vehicle; the y component follows the rule model's lane-change profile, i.e. ÿ_e^t = w/Δt² is adopted when the vehicle starts a lane change and ÿ_e^{t+Δt} = −w/Δt² at the next moment, realizing the lane change; from each action the position and speed of the unmanned vehicle at the next moment can be computed, a fifth-order polynomial is constructed with these as boundary conditions, and the polynomial is discretized into a 20 Hz local guide trajectory that guides the unmanned vehicle to complete the action; 2.1.3) the hybrid decision model's reward mechanism comprises two parts, an off-ramp completion reward and a rule-model heuristic reward, set as follows: the off-ramp completion reward r_1 is awarded at the end of an episode, positive when the unmanned vehicle enters the ramp and not when it misses it [piecewise definition given as an image in the original]; the rule-model heuristic reward r_2 is a bonus granted when the executed action coincides with the rule model's action [piecewise definition given as an image in the original]; the reward finally obtained by an action is r = r_1 + r_2.
further, in the step 2.2), the limiting method comprises the following steps: 2.2.1) for satisfying the security demand that current lane traveled, the distance that needs to guarantee unmanned vehicle and its front truck can satisfy: when the front vehicle decelerates at the maximum deceleration until the vehicle stops, the unmanned vehicle can stop without a collision by decelerating the vehicle with the maximum deceleration, and therefore the vehicle speed v of the unmanned vehicle is limited to:
Figure GDA0002566426770000061
when a certain item in the action space can cause the speed of the next moment not to meet the constraint, deleting the action from the action space; when there is no front vehicle, there is no safety speed limit; when the lane is changed, when the states of the front vehicle, the rear vehicle and the unmanned vehicle on the target lane do not meet the lane changing condition, the lane changing action is deleted from the action space, and the generated action can ensure the driving safety of the vehicle; 2.2.2) to meet the speed requirement of the traffic rules, when a certain action in the action space causes the vehicle speed to not meet the speed limit of the traffic rules at the next moment, the action is deleted from the action space.
Further, in step 2.3), the training method is implemented with particle filtering and Monte Carlo tree search, specifically: 2.3.1) the driving strategies of environment vehicles are modeled with IDM and MOBIL and fitted with a particle filtering method, wherein IDM is the Intelligent Driver Model and MOBIL is the lane-change model Minimizing Overall Braking Induced by Lane Changes; 2.3.2) the hybrid decision model uses a reinforcement learning model; because the state space is continuous and high-dimensional, the reinforcement learning model is trained with a Monte Carlo tree search method; 2.3.3) the above process is repeated many times to complete the training.

Further, in step 2.3.1), fitting with the particle filtering method comprises: (1) establishing a particle library for each newly observed environment vehicle; (2) randomly selecting 50 groups of driving-strategy model parameters as initial particles; (3) propagating all environment vehicles to their next-moment states according to the driving models formed by the 50 particle groups; (4) comparing, against the actually observed next state of each environment vehicle, the difference between the 50 particle groups and the vehicle's real driving model, and resampling the new 50 groups so they concentrate near the particles closest to the real driving model; (5) repeating this process and, at each moment, selecting the particle closest to the real driving model as the driving model entered into the state space.

Further, in step 2.3.2), the reinforcement learning model is trained by Monte Carlo tree search as follows: (1) each state has several alternative actions, all of which satisfy the safety and traffic-rule requirements, and every action value of the Monte Carlo tree is initialized to the same value; (2) in each simulation, while all action values are equal, the action generated by the rule model is preferentially adopted; (3) if the action values differ, the action selected is a* = argmax_a [ Q(s, a) + c·sqrt( ln N(s) / N(s, a) ) ] [formula given as an image in the original; the standard UCT form is assumed], wherein Q(s, a) is the value function of action a in environment state s; N(s, a) is the number of times action a was taken in state s during past simulations; N(s) = Σ_a N(s, a); and c is a constant expressing the intent to explore new actions; (4) after each simulation ends, the mapping between states, actions and values along the episode is adjusted according to the reward finally obtained, updating the value function Q(s, a).
Owing to the above technical scheme, the invention has the following advantages: 1. The unmanned vehicle's driving strategy can be adjusted according to the urgency of the exit, improving the exit success rate. 2. The rule-based unmanned driving decision model is adopted preferentially, and the reinforcement-learning-based decision model adjusts the behavior when the rule model may fail, improving driving stability. 3. The method can complete the exit process under a limited sensing range and highly uncertain environment-vehicle behavior, conditions similar to real traffic scenes, which ensures its practicality. 4. Both the rule model and the reinforcement learning model satisfy the safety requirements, guaranteeing the safety of the unmanned vehicle. 5. The unmanned driving decision model generated by the invention outputs a smooth curve at 20 Hz, meeting the requirements of the vehicle dynamics model and of vehicle trajectory tracking.

In summary, on the basis of the rule-based unmanned vehicle decision model, the invention trains the unmanned vehicle for the off-ramp problem by means of reinforcement learning so that it can adjust its driving strategy according to the urgency of the exit; this is an effective way to improve the driving efficiency and stability of unmanned vehicles and further promotes their development.
Drawings
Fig. 1 is a schematic view of a framework of a driverless automobile down-ramp decision model (hybrid down-ramp decision model) based on rules and reinforcement learning;
FIG. 2 is a schematic diagram of the decision-to-action linking method;
FIG. 3 is an algorithmic schematic of a hybrid down-ramp decision model;
FIG. 4 is a schematic diagram of a reinforcement learning environment state space;
FIG. 5 is a schematic illustration of the impact of the reward mechanism on the model;
FIG. 6 is a schematic diagram of a Monte Carlo tree search method.
Detailed Description
The invention is described in detail below with reference to the figures and examples.
As shown in FIG. 1, the invention provides a method for enabling an unmanned vehicle to drive away from a high speed based on a fusion rule and a learning model, which comprises the following steps:
1) While the unmanned vehicle travels on the highway, an off-ramp motivation is generated according to the navigation system's distance to the ramp; the exit is first attempted with the rule-based decision model (the rule model), and it is judged whether the rule-based decision model's exit success rate is decreasing; if not, the rule model continues to decide the action, and if so, step 2) is entered;

For convenience of description, a rectangular coordinate system is first established with the starting point of the ramp mouth as the origin, the vehicle driving direction as x and the direction perpendicular to the driving direction as y, in meters (m). The driving position, velocity and acceleration of the unmanned vehicle can be expressed as (x_e, y_e), (ẋ_e, ẏ_e) and (ẍ_e, ÿ_e); the position, velocity and acceleration of the surrounding vehicles can be expressed as (x_i, y_i), (ẋ_i, ẏ_i) and (ẍ_i, ÿ_i), i = 1, 2, …, n. The time interval of the rule model is Δt (= 0.75 s), and the output of the rule model is a_t = (ẍ_e^t, ÿ_e^t), the longitudinal and lateral accelerations that the unmanned vehicle is expected to maintain during the next Δt, wherein the subscripts e and i denote the unmanned vehicle and the environment vehicles respectively, x and y denote the position of a vehicle in the coordinate system, the dotted quantities denote its longitudinal and lateral velocity and acceleration, and t denotes the current moment.
The establishment method of the rule model comprises the following steps:
1.1) The decision in the x (longitudinal) direction comprehensively analyzes the unmanned vehicle's expected driving speed, the distance it is expected to keep from the leading vehicle, and its dynamic characteristics; the specific steps are as follows:
1.1.1) The desired driving speed v̂_e^x of the unmanned vehicle is given by a safe-following rule [formula given as an image in the original]: when a leading vehicle exists, v̂_e^x is the largest speed from which the unmanned vehicle, after driving for one interval Δt and then braking at its maximum deceleration, can still stop behind the leading vehicle braking at its own maximum deceleration; when there is no leading vehicle, v̂_e^x equals the normal desired driving speed;

wherein b_e is the maximum deceleration of the unmanned vehicle; Δt is the time interval; d_f is the current distance between the unmanned vehicle and the vehicle ahead in its lane; ẋ_e is the current speed of the unmanned vehicle; ẋ_f is the current speed of the leading vehicle; b_f is the maximum deceleration of the leading vehicle; and v_normal is the desired driving speed of the unmanned vehicle in normal driving;

1.1.2) The desired acceleration â_e^x with which the unmanned vehicle reaches its desired speed is â_e^x = (v̂_e^x − ẋ_e^t)/Δt;

1.1.3) Because of dynamics limits and driving-comfort requirements, the final desired speed in the x direction may not be reachable in one decision step, so the final x-direction decision is adjusted to the clamped value ẍ_e^t = min(max(â_e^x, a_min), a_max), wherein a_min is the maximum deceleration and a_max the maximum acceleration of the unmanned vehicle in normal driving; both are set to 0.1 times the vehicle-dynamics maximum deceleration and acceleration.
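The following Python sketch illustrates steps 1.1.1) to 1.1.3). The desired-speed formula in the original is given only as an image; the Gipps-style safe-speed bound used here is an assumption reconstructed from the stopping-distance argument of step 2.2.1), and all numeric parameter values are illustrative.

```python
import math

def longitudinal_decision(v_e, v_f, d_f, dt=0.75, b_e=6.0, b_f=6.0,
                          v_normal=30.0, a_min=-4.0, a_max=3.0):
    """Rule-model x-direction decision (steps 1.1.1-1.1.3), a minimal sketch.

    v_e, v_f : current speeds of the unmanned and leading vehicle [m/s]
    d_f      : gap to the leading vehicle [m], or None if there is no leader
    b_e, b_f : maximum decelerations (magnitudes) of ego / leader [m/s^2]
    """
    if d_f is None:
        v_hat = v_normal                 # no leader: drive at the normal desired speed
    else:
        # Assumed Gipps-style bound: after driving at v_hat for dt and then
        # braking at b_e, the ego must stop behind the leader braking at b_f.
        v_safe = -b_e * dt + math.sqrt((b_e * dt) ** 2
                                       + (b_e / b_f) * v_f ** 2
                                       + 2.0 * b_e * d_f)
        v_hat = min(v_normal, v_safe)
    a_hat = (v_hat - v_e) / dt           # desired acceleration (step 1.1.2)
    return max(a_min, min(a_max, a_hat)) # clamp to comfort limits (step 1.1.3)
```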
1.2) The vehicle's decision in the y (lateral) direction determines whether to change lanes; the lateral decision during the lane change itself is preset. After the rule model generates a lane-change motivation for the upcoming exit, the lane change is started as soon as a safe position is found; otherwise the vehicle keeps driving in its lane. The specific settings are as follows (see the sketch following these conditions):

1.2.1) Whether the current lane change is safe is determined by judging the motion states of the vehicles ahead of and behind on the target lane, and the lane change is started when any one of the following conditions is met:

(1) no vehicle exists within the observation range ahead of or behind on the target lane;

(2) the target lane has a leading vehicle, and the current speed of the ego vehicle (i.e. the unmanned vehicle) satisfies a safe-following condition with respect to it [formula given as an image in the original], wherein d_f,j is the following distance between the unmanned vehicle and the leading vehicle on the target lane, ẋ_f,j is the speed of that leading vehicle, and b_f,j is its maximum deceleration;

(3) the target lane has a rear vehicle, and the rear vehicle's speed satisfies the corresponding safe-following condition [formula given as an image in the original], wherein b_r,j is the maximum deceleration of the rear vehicle on the target lane, d_r,j is the following distance between the unmanned vehicle and that rear vehicle, and ẋ_r,j is its speed;

(4) the target lane has both a leading and a rear vehicle, and their speeds satisfy conditions (2) and (3) respectively;
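A sketch of this safety check follows. The exact inequalities for conditions (2) and (3) appear only as images in the original; the stopping-distance comparisons below are assumptions chosen to be consistent with the safe-speed constraint of step 2.2.1).

```python
def lane_change_safe(v_e, front, rear, b_e=6.0):
    """Lane-change safety check, conditions (1)-(4) of step 1.2.1.

    front / rear: None when no vehicle is observed in that direction on the
    target lane, otherwise a (speed, gap, max_deceleration) tuple.
    """
    if front is not None:
        v_fj, d_fj, b_fj = front
        # Assumed condition (2): ego stopping distance must not exceed the
        # gap plus the target-lane leader's stopping distance.
        if v_e ** 2 / (2 * b_e) > d_fj + v_fj ** 2 / (2 * b_fj):
            return False
    if rear is not None:
        v_rj, d_rj, b_rj = rear
        # Assumed condition (3): the rear vehicle must be able to stop
        # behind the ego vehicle without a collision.
        if v_rj ** 2 / (2 * b_rj) > d_rj + v_e ** 2 / (2 * b_e):
            return False
    return True  # condition (1) when both are None, condition (4) when both pass
```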
1.2.2) Once the unmanned vehicle decides to change lanes, the y-direction decision during the lane change is fixed;

the lane-change decision is as follows:

The whole lane change is set to span two time intervals, i.e. the lane-change time is 2Δt (= 1.5 s), so the lateral motion consists of an acceleration phase followed by a deceleration phase. After a feasible lane-change moment is obtained from step 1.2.1), the y-direction decision is set to ÿ_e^t = w/Δt², wherein w is the lane width;

when the lane change was started at the previous moment, the next y-direction decision is set to ÿ_e^{t+Δt} = −w/Δt²;

at this point the unmanned vehicle has completed one lane change. (This profile moves the vehicle laterally by exactly w with zero final lateral speed: each phase contributes w/2.)

1.2.3) The speed and position of the unmanned vehicle at the next moment are computed from the decided action: ẋ_e^{t+Δt} = ẋ_e^t + ẍ_e^t·Δt and x_e^{t+Δt} = x_e^t + ẋ_e^t·Δt + ẍ_e^t·Δt²/2, and likewise for the y direction.
1.3) With the unmanned vehicle's current position and speed and the next-moment target position and target speed as boundary conditions, a fifth-order polynomial is used to generate a smooth curve, which is discretized into guide points at 20 Hz; the guide signal is sent to the unmanned vehicle to generate its local trajectory, as shown in FIG. 2.
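A minimal sketch of this step is given below. A quintic has six coefficients while the text names four boundary conditions, so the boundary accelerations a0 and a1 are an added assumption (zero by default); the 0.75 s horizon matches the interval Δt above.

```python
import numpy as np

def quintic_guide_points(p0, v0, p1, v1, a0=0.0, a1=0.0, T=0.75, hz=20):
    """Fit x(t) = c5*t^5 + ... + c0 through the boundary conditions and
    discretize it into 20 Hz guide points (step 1.3); run once per axis."""
    A = np.array([[0,       0,       0,      0,    0, 1],   # x(0)   = p0
                  [T**5,    T**4,    T**3,   T**2, T, 1],   # x(T)   = p1
                  [0,       0,       0,      0,    1, 0],   # x'(0)  = v0
                  [5*T**4,  4*T**3,  3*T**2, 2*T,  1, 0],   # x'(T)  = v1
                  [0,       0,       0,      2,    0, 0],   # x''(0) = a0
                  [20*T**3, 12*T**2, 6*T,    2,    0, 0]])  # x''(T) = a1
    c = np.linalg.solve(A, np.array([p0, p1, v0, v1, a0, a1]))
    ts = np.arange(0.0, T, 1.0 / hz)       # 20 Hz sampling over one interval
    return np.polyval(c, ts)               # guide points along the smooth curve
```

For example, quintic_guide_points(0.0, 20.0, 15.2, 20.5) yields the 15 longitudinal guide points for one 0.75 s interval.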
The above is the rule-based decision model for the unmanned vehicle's exit. It accounts for safety, driving comfort and similar concerns and can generate a smooth vehicle guide trajectory that accomplishes the exit, but its local lane-change decisions cannot respond to the urgency of the exit, which limits the unmanned vehicle's traffic efficiency.
2) As shown in FIG. 3, a reinforcement-learning-based decision model (the hybrid decision model) and its training method are established on a reinforcement learning framework together with the decision rule model. The hybrid decision model drives with the rule model while far from the ramp and, while driving toward the ramp, adjusts the vehicle's actions with the reinforcement learning decision model according to the urgency of the exit, improving the exit success rate;
the establishment and training method of the hybrid decision model comprises the following steps:
2.1) The purpose of reinforcement learning is to establish a mapping model from environment states to actions; the model is trained continuously with the rewards obtained by different actions until the actions it generates obtain the greatest possible reward. Therefore the environment state space, the action space and the reward mechanism must first be defined.
2.1.1) The environment state in FIG. 3 is constructed from the positions, driving states and driving strategies of the vehicles in the environment together with the distance to the ramp, as shown in FIG. 4, and is defined as s = (l, q_e, q_1, θ_1, …, q_n, θ_n) ∈ S,

wherein the coordinate system is the same as that of the rule model; l = |x_e| is the distance between the current vehicle and the ramp; q_e is the driving state of the unmanned vehicle; q_i is the driving state and θ_i the driving strategy of environment vehicle i; s denotes an environment state and S denotes the environment state space formed by all environment states;

the driving strategies of the environment vehicles cannot be observed directly and must be estimated continuously while driving. In addition, because of the limited observation range, the unmanned vehicle can only observe environment vehicles within 50 m ahead and behind, so the difference between the x coordinate of any environment vehicle and that of the unmanned vehicle is less than 50 m, i.e. |x_e − x_i| ≤ 50 m;
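For concreteness, the state can be carried in a small container like the following; the field layout (each driving state as position plus velocity, each strategy as a fitted parameter vector) is an illustrative assumption, since the patent defines the tuple only symbolically.

```python
from dataclasses import dataclass
from typing import List, Tuple

State = Tuple[float, float, float, float]   # assumed q = (x, y, vx, vy)

@dataclass
class EnvState:
    """Environment state s = (l, q_e, q_1, θ_1, ..., q_n, θ_n) of step 2.1.1."""
    l: float                  # |x_e|, distance from the current vehicle to the ramp [m]
    q_e: State                # driving state of the unmanned vehicle
    q: List[State]            # states of observed vehicles (|x_e - x_i| <= 50 m)
    theta: List[List[float]]  # fitted IDM/MOBIL strategy parameters per vehicle
```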
2.1.2) An action is defined by the vehicle accelerations in the x and y directions; the selectable action space is enumerated from the following components [formula given as an image in the original]:

wherein a_brake is the maximum deceleration of the unmanned vehicle and a_rule is the action generated by the rule model; the lateral motion follows the rule model, i.e. when the vehicle starts a lane change it adopts ÿ_e^t = w/Δt² and at the next moment ÿ_e^{t+Δt} = −w/Δt², realizing the lane change.

From each action the position and speed of the unmanned vehicle at the next moment can be computed; a fifth-order polynomial is constructed with these as boundary conditions and discretized into a 20 Hz local guide trajectory that guides the unmanned vehicle to complete the action, as shown in FIG. 2.
2.1.3) The hybrid decision model's reward mechanism comprises two parts, an off-ramp completion reward and a rule-model heuristic reward, set as follows:

the off-ramp completion reward r_1 is awarded when an episode ends: positive when the unmanned vehicle enters the ramp, and not when it misses it [piecewise definition given as an image in the original];

the rule-model heuristic reward r_2 is a bonus granted when the executed action coincides with the action of the rule model [piecewise definition given as an image in the original];

the reward finally obtained by an action is r = r_1 + r_2.
when the unmanned automobile is far away from the ramp port, the influence of vehicle decision on the lower ramp is small, so a rule model needs to be adopted, and the rule model elicitation reward mechanism can help to maintain the unmanned automobile to adopt the rule model. As shown in FIG. 5, the value f of the action generated by the rule model when the unmanned vehicle is away from the rampdIs promoted to f due to the inspiring rewardd' significantly greater than the value of the other actions, so the vehicle will always take the action of the rule model;
when the unmanned automobile approaches the ramp junction, the influence of the action on the success rate of the lower ramp is enhanced, namely the probability of obtaining the completion of the reward of the lower ramp is increased. When there is an action with a value higher than the value f of the action of the rule model after being promoteddWhen the action is more beneficial to getting off the ramp than the action of the regular model, the unmanned automobile adopts the reinforcement learning decision model to get off the ramp;
by the mode, the mixed decision model can adopt a regular model to drive when the mixed decision model is far away from the ramp, and the action of the vehicle is adjusted by utilizing the reinforcement learning decision model according to the urgency of the lower ramp in the process of driving to the ramp, so that the success rate of the lower ramp is improved.
2.2) Actions output by the reinforcement learning model must respect safety and legal speed limits, so the reinforcement learning actions need to be restricted;
the limiting method comprises the following steps:
2.2.1) To satisfy the safety requirement for driving in the current lane, the distance between the unmanned vehicle and its leading vehicle must guarantee that, if the leading vehicle brakes at its maximum deceleration until it stops, the unmanned vehicle can also brake at its maximum deceleration and stop without collision. The vehicle speed v of the unmanned vehicle is therefore limited to v ≤ sqrt((b_e/b_f)·ẋ_f² + 2·b_e·d_f), i.e. the unmanned vehicle's stopping distance v²/(2b_e) may not exceed the gap d_f plus the leading vehicle's stopping distance ẋ_f²/(2b_f).

When an item in the action space of step 2.1.2) would cause the speed at the next moment to violate this constraint, the action is deleted from the action space. When there is no leading vehicle, there is no safety speed limit. For lane changes, when the states of the leading vehicle, the rear vehicle and the unmanned vehicle on the target lane do not satisfy the lane-change conditions of step 1.2.1), the lane-change action is deleted from the action space. In this way, the generated actions guarantee driving safety.

2.2.2) To meet the speed requirement of the traffic rules, when an action in the action space would cause the vehicle speed at the next moment to exceed the legal speed limit, that action is deleted from the action space. The actions generated by reinforcement learning thus keep the unmanned vehicle's speed within the traffic-rule limits at all times.
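A sketch of the action restriction of steps 2.2.1) and 2.2.2) follows, combining the safe-speed bound, the traffic-rule limit and the lane-change conditions into a single mask over the action space; parameter values are illustrative.

```python
import math

def mask_actions(actions, v_e, v_f, d_f, dt=0.75, b_e=6.0, b_f=6.0,
                 v_limit=33.3, lane_change_ok=False):
    """Delete unsafe or illegal actions (steps 2.2.1-2.2.2), a minimal sketch.

    actions: iterable of (ax, ay) candidates from step 2.1.2.
    """
    # Safe speed from the stopping-distance argument of step 2.2.1
    # (no safety limit when there is no leading vehicle).
    v_safe = math.inf if d_f is None else math.sqrt(
        (b_e / b_f) * v_f ** 2 + 2.0 * b_e * d_f)
    kept = []
    for ax, ay in actions:
        v_next = v_e + ax * dt
        if v_next > min(v_safe, v_limit):     # violates safety or traffic rules
            continue
        if ay != 0.0 and not lane_change_ok:  # fails the step-1.2.1 conditions
            continue
        kept.append((ax, ay))
    return kept
```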
2.3) The hybrid decision model is trained through a continuously repeated off-ramp process in a highly uncertain simulation environment (the environment vehicles have different driving strategies, and the next action of a vehicle under the same driving strategy is random);
the training method is realized by utilizing a particle filtering and Monte Carlo tree searching method, and comprises the following specific steps:
2.3.1) Since the driving strategies in the state space cannot be observed directly, they must be supplemented by online fitting. In this embodiment, IDM (the Intelligent Driver Model) and MOBIL (Minimizing Overall Braking Induced by Lane Changes) are adopted to model the vehicle driving strategy; the two models have 8 parameters in total, which must be fitted from the observed driving behavior. The invention fits them with a particle filtering method, as follows:
(1) establishing a particle library for each new environmental vehicle;
(2) randomly selecting 50 groups of driving strategy model parameters as initial particles;
(3) transferring all the environmental vehicles to the next moment state according to a driving model formed by 50 groups of particles;
(4) comparing, against the actually observed next state of the environment vehicle, the difference between the 50 particle groups and the vehicle's real driving model, and resampling the new 50 groups so they concentrate near the particles closest to the real driving model;

(5) repeating this process and, at each moment, selecting the particle closest to the real driving model as the driving model entered into the state space.
The particle filtering yields the maximum-likelihood driving model (i.e. the environment vehicle's driving strategy) θ_i, which is fed into the reinforcement learning model as part of the environment state for training. At this point, all environment states required for reinforcement learning have been fully acquired.
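One filtering step can be sketched as follows; the 50-particle size comes from the text, while the Gaussian weighting and jitter magnitudes are illustrative assumptions.

```python
import numpy as np

def particle_filter_step(particles, observed_next, predict_next, sigma=1.0):
    """One update of the driving-strategy particle filter (step 2.3.1).

    particles     : (50, 8) array of IDM/MOBIL parameter sets
    predict_next  : function mapping a parameter set to the predicted next state
    observed_next : the actually observed next state of the environment vehicle
    Returns the resampled particles and the current best-fitting strategy.
    """
    preds = np.array([predict_next(p) for p in particles])
    err = np.linalg.norm(preds - observed_next, axis=1)  # prediction error per particle
    w = np.exp(-0.5 * (err / sigma) ** 2)                # assumed Gaussian likelihood
    w /= w.sum()
    idx = np.random.choice(len(particles), size=len(particles), p=w)
    resampled = particles[idx] + np.random.normal(0.0, 0.05, particles.shape)
    theta_i = particles[np.argmin(err)]  # most likely driving model, fed to the RL state
    return resampled, theta_i
```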
2.3.2) The hybrid decision model in the invention uses a reinforcement learning model; because the state space is continuous and high-dimensional, the reinforcement learning model is trained with a Monte Carlo tree search method, as follows:

(1) as shown in FIG. 6, each state has several alternative actions, all of which meet the safety and traffic-rule requirements of step 2.2); every action value of the Monte Carlo tree is initialized to the same value;

(2) in each simulation, while all action values are equal, the action generated by the rule model is preferentially adopted;

(3) if the action values differ, the action selected is a* = argmax_a [ Q(s, a) + c·sqrt( ln N(s) / N(s, a) ) ] [formula given as an image in the original; the standard UCT form is assumed],

wherein Q(s, a) is the value function of action a in environment state s; N(s, a) is the number of times action a was taken in state s during past simulations; N(s) = Σ_a N(s, a); c is a constant expressing the intent to explore new actions, preferably 5 in this embodiment;

(4) after each simulation ends (the unmanned vehicle enters the ramp or misses it), the mapping between states, actions and values along the episode is adjusted according to the reward finally obtained, updating the value function Q(s, a).
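The selection and update rules can be sketched as below; the standard UCT form is assumed for the image-only selection formula, with the exploration constant c = 5 from the embodiment.

```python
import math
from collections import defaultdict

Q = defaultdict(float)   # Q[(s, a)]: value of action a in state s
N = defaultdict(int)     # N[(s, a)]: visit count of (s, a)

def select_action(s, actions, rule_action, c=5.0):
    """UCT selection (steps (2)-(3)): prefer the rule action while all
    values are equal, otherwise maximize the upper confidence bound."""
    if all(N[(s, a)] == 0 for a in actions):
        return rule_action
    n_s = sum(N[(s, a)] for a in actions)
    return max(actions, key=lambda a: Q[(s, a)]
               + c * math.sqrt(math.log(n_s + 1) / (N[(s, a)] + 1)))

def backup(episode, r):
    """Step (4): after a simulation ends, update Q along the episode
    toward the finally obtained reward (incremental mean)."""
    for s, a in episode:
        N[(s, a)] += 1
        Q[(s, a)] += (r - Q[(s, a)]) / N[(s, a)]
```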
2.3.3) The above process is repeated many times to complete the training.
In conclusion, the invention was tested on the off-ramp task in a highly random simulation environment, with the unmanned vehicle starting in the leftmost lane of a four-lane highway and preparing to exit. For comparison with the rule model, lane changing was forbidden until 1000 m, 1500 m and 2000 m before the ramp respectively; the rule model then received the off-ramp motivation and attempted the exit 500 times using the method of step 1), and the hybrid decision model attempted the exit 500 times under the same conditions. The results are shown in Table 1: the hybrid decision model effectively improves the exit success rate by 5-50% while guaranteeing vehicle safety and compliance with traffic rules throughout.
Table 1: Comparison of results for the rule-based model and the hybrid off-ramp model [table given as an image in the original]
The above embodiments are only for illustrating the present invention; the steps may be varied, and, on the basis of the technical solution of the invention, modifications and equivalent changes to individual steps according to the principle of the invention should not be excluded from its protection scope.

Claims (10)

1. A method for enabling an unmanned vehicle to drive away from a high speed based on fused rule and learning models, characterized by comprising the following steps:
1) while the unmanned vehicle travels on the expressway, an off-ramp motivation is generated according to the navigation system's distance to the ramp ahead; the exit is first attempted with the rule model, and whether the exit success rate of the rule-based decision model is decreasing is judged; if not, the rule model decides the action, and if so, step 2) is entered;

a rectangular coordinate system is established with the starting point of the ramp mouth as the origin, the vehicle driving direction as x, the direction perpendicular to the driving direction as y, and meters as the unit; the position, velocity and acceleration of the unmanned vehicle are (x_e, y_e), (ẋ_e, ẏ_e) and (ẍ_e, ÿ_e); the position, velocity and acceleration of the surrounding vehicles are (x_i, y_i), (ẋ_i, ẏ_i) and (ẍ_i, ÿ_i), i = 1, 2, …, n; in addition, the time interval of the rule model is Δt, and the output of the rule model is the longitudinal and lateral accelerations a_t = (ẍ_e^t, ÿ_e^t) that the unmanned vehicle is expected to maintain during the next Δt, wherein the dotted quantities are the longitudinal and lateral velocity and acceleration of the vehicle and t denotes the current moment;

2) the hybrid decision model drives with the rule model while far from the ramp and, while driving toward the ramp, adjusts the vehicle's actions with the reinforcement learning decision model according to the urgency of the exit, thereby improving the exit success rate.
2. The method for enabling an unmanned vehicle to drive away from a high speed according to claim 1, wherein in step 1) the method for establishing the rule model comprises the following steps:

1.1) the decision in the x direction comprehensively analyzes the unmanned vehicle's expected driving speed, the distance it is expected to keep from the leading vehicle, and its dynamic characteristics;

1.2) the vehicle's decision in the y direction determines whether to change lanes; the y-direction decision during the lane change is preset; after a lane-change motivation is generated, the lane change is started as soon as a safe position is found, otherwise the vehicle keeps driving in its lane;

1.3) with the unmanned vehicle's current position and speed and the next-moment target position and target speed as boundary conditions, a fifth-order polynomial is used to generate a smooth curve, which is discretized into guide points at 20 Hz, and the guide signal is sent to the unmanned vehicle to generate its local trajectory.
3. The method for enabling an unmanned vehicle to drive away from a high speed according to claim 2, wherein in step 1.1) the decision in the x direction comprises the following steps:

1.1.1) the desired driving speed v̂_e^x of the unmanned vehicle is given by a safe-following rule [formula given as an image in the original] if a leading vehicle exists, and equals the normal desired driving speed otherwise;

wherein b_e is the maximum deceleration of the unmanned vehicle; Δt is the time interval; d_f is the current distance between the unmanned vehicle and the vehicle ahead in its lane; ẋ_e is the current speed of the unmanned vehicle; ẋ_f is the current speed of the leading vehicle; b_f is the maximum deceleration of the leading vehicle; v_normal is the desired driving speed of the unmanned vehicle in normal driving;

1.1.2) the desired acceleration â_e^x with which the unmanned vehicle reaches its desired speed is â_e^x = (v̂_e^x − ẋ_e^t)/Δt;

1.1.3) according to the desired acceleration, the final decision in the x direction is adjusted to ẍ_e^t = min(max(â_e^x, a_min), a_max), wherein a_min is the maximum deceleration and a_max the maximum acceleration of the unmanned vehicle in normal driving.
4. The method for enabling an unmanned vehicle to drive away from a high speed according to claim 3, wherein in step 1.2) the y-direction decision comprises the following steps:

1.2.1) whether the current lane change is safe is determined by judging the motion states of the vehicles ahead of and behind on the target lane, and the lane change is started when any one of the following conditions is met:

(1) no vehicle exists within the observation range ahead of or behind on the target lane;

(2) the target lane has a leading vehicle, and the current vehicle speed satisfies a safe-following condition with respect to it [formula given as an image in the original], wherein d_f,j is the following distance between the unmanned vehicle and the leading vehicle on the target lane, ẋ_f,j is the speed of the leading vehicle on the target lane, and b_f,j is the maximum deceleration of the vehicle ahead on the target lane;

(3) the target lane has a rear vehicle, and the rear vehicle's speed satisfies the corresponding safe-following condition [formula given as an image in the original], wherein b_r,j is the maximum deceleration of the vehicle behind on the target lane, d_r,j is the following distance between the unmanned vehicle and the rear vehicle on the target lane, and ẋ_r,j is the speed of the rear vehicle on the target lane;

(4) the target lane has both a leading and a rear vehicle, and their speeds satisfy conditions (2) and (3) respectively;

1.2.2) once the unmanned vehicle decides to change lanes, the y-direction decision during the lane change is fixed, and the lane-change decision is as follows:

the whole lane change is set to span two time intervals 2Δt, so the lateral motion consists of an acceleration phase followed by a deceleration phase; when a feasible lane-change moment is obtained, the y-direction decision is set to ÿ_e^t = w/Δt²,

wherein w is the lane width;

when the lane change was started at the previous moment, the next y-direction decision is set to ÿ_e^{t+Δt} = −w/Δt²,

at which point the unmanned vehicle completes one lane change;

1.2.3) the speed and position of the unmanned vehicle at the next moment are computed from the decided action: ẋ_e^{t+Δt} = ẋ_e^t + ẍ_e^t·Δt, x_e^{t+Δt} = x_e^t + ẋ_e^t·Δt + ẍ_e^t·Δt²/2, and likewise for the y direction.
5. the method for enabling an unmanned vehicle to drive away from a high speed according to any one of claims 3 or 4, wherein: in the step 2), the method for establishing and training the hybrid decision model comprises the following steps:
2.1) defining an environment state space, an action space and a reward mechanism;
2.2) the action output by the reinforcement learning model must meet the limits of safety and traffic regulation speed, so that the action of reinforcement learning is limited;
2.3) the hybrid decision model is trained in a highly uncertain simulation environment through a constantly repeated off-ramp process.
6. The method for enabling an unmanned vehicle to drive away from a high speed according to claim 5, wherein in step 2.1) the environment state space, the action space and the reward mechanism are defined as follows:

2.1.1) the environment state is constructed from the positions, driving states and driving strategies of the vehicles in the environment together with the distance to the ramp, and is defined as s = (l, q_e, q_1, θ_1, …, q_n, θ_n) ∈ S,

wherein the coordinate system is the same as that of the rule model; l = |x_e| is the distance between the current vehicle and the ramp; q_e is the driving state of the unmanned vehicle; q_i is the driving state and θ_i the driving strategy of environment vehicle i; s denotes an environment state and S denotes the environment state space formed by all environment states;

for every vehicle in the state, the difference between its x coordinate and that of the unmanned vehicle is less than 50 m, i.e. |x_e − x_i| ≤ 50 m;

2.1.2) an action is defined by the vehicle accelerations in the x and y directions, and the selectable action space is enumerated from the following components [formula given as an image in the original]:

wherein a_brake is the maximum deceleration of the unmanned vehicle; a_rule is the action generated by the rule model; the y-direction action taken when the vehicle starts changing lanes is ÿ_e^t = w/Δt², and ÿ_e^{t+Δt} = −w/Δt² is adopted at the next moment to realize the lane change;

from each action the position and speed of the unmanned vehicle at the next moment can be computed; a fifth-order polynomial is constructed with these as boundary conditions and discretized into a 20 Hz local guide trajectory that guides the unmanned vehicle to complete the action;

2.1.3) the hybrid decision model's reward mechanism comprises two parts, an off-ramp completion reward and a rule-model heuristic reward, set as follows:

the off-ramp completion reward r_1 is positive when the unmanned vehicle enters the ramp and not when it misses it [piecewise definition given as an image in the original];

the rule-model heuristic reward r_2 is a bonus granted when the executed action coincides with the rule model's action [piecewise definition given as an image in the original];

the reward finally obtained by an action is r = r_1 + r_2.
7. the method for enabling an unmanned vehicle to drive away from a high speed as set forth in claim 5, wherein: in the step 2.2), the limiting method comprises the following steps:
2.2.1) for satisfying the security demand that current lane traveled, the distance that needs to guarantee unmanned vehicle and its front truck can satisfy: when the front vehicle decelerates at the maximum deceleration until the vehicle stops, the unmanned vehicle can stop without a collision by decelerating the vehicle with the maximum deceleration, and therefore the vehicle speed v of the unmanned vehicle is limited to:
Figure FDA0002566426760000053
when a certain item in the action space can cause the speed of the next moment not to meet the constraint, deleting the action from the action space; when there is no front vehicle, there is no safety speed limit; when the lane is changed, when the states of the front vehicle, the rear vehicle and the unmanned vehicle on the target lane do not meet the lane changing condition, the lane changing action is deleted from the action space, and the generated action can ensure the driving safety of the vehicle;
2.2.2) to meet the speed requirement of the traffic rules, when a certain action in the action space causes the vehicle speed to not meet the speed limit of the traffic rules at the next moment, the action is deleted from the action space.
8. The method for enabling an unmanned vehicle to drive away from a high speed according to claim 5, wherein in step 2.3) the training method is implemented with particle filtering and Monte Carlo tree search, specifically:

2.3.1) modeling the vehicle driving strategies with IDM and MOBIL and fitting them with a particle filtering method, wherein IDM is the Intelligent Driver Model and MOBIL is the lane-change model Minimizing Overall Braking Induced by Lane Changes;

2.3.2) training the reinforcement learning model used by the hybrid decision model with a Monte Carlo tree search method, since the state space is continuous and high-dimensional;

2.3.3) repeating the above process many times to complete the training.
9. The method for enabling an unmanned vehicle to drive away from a high speed according to claim 8, wherein in step 2.3.1) fitting with the particle filtering method comprises:
(1) establishing a particle library for each new environmental vehicle;
(2) randomly selecting 50 groups of driving strategy model parameters as initial particles;
(3) transferring all the environmental vehicles to the next moment state according to a driving model formed by 50 groups of particles;
(4) comparing, against the actually observed next state of the environment vehicle, the difference between the 50 particle groups and the vehicle's real driving model, and resampling the new 50 groups so they concentrate near the particles closest to the real driving model;

(5) repeating this process and, at each moment, selecting the particle closest to the real driving model as the driving model entered into the state space.
10. The method for enabling an unmanned vehicle to drive away from a high speed as recited in claim 8, wherein: in the step 2.3.2), a Monte Carlo tree searching method is adopted to train the reinforcement learning model, and the specific steps are as follows:
(1) each state has several candidate actions, all of which satisfy the safety and traffic-rule requirements; when the Monte Carlo tree is initialized, every action has the same value;
(2) during each simulation, while all action values are still equal, the action generated by the rule model is preferentially used for simulation;
(3) once the action values differ, the action is selected as follows:
$a^{*} = \arg\max_{a}\left[\, Q(s,a) + c\sqrt{\frac{\ln N(s)}{N(s,a)}} \,\right]$
wherein Q(s, a) is the value function of action a in environmental state s, and N(s, a) is the number of times action a was taken in state s during past simulations;
$N(s) = \sum_{a} N(s,a)$ is the total number of times state s has been visited in past simulations; and
c is a constant controlling the tendency to explore new actions;
(4) after each simulation finishes, the mapping between states and action values along the visited path is adjusted according to the final reward, and the value function Q(s, a) is updated.
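A minimal sketch of the selection and backup rules above, assuming a dict-based tree node; the exploration-constant value and the running-mean update are illustrative assumptions:

```python
import math

C = 1.4  # exploration constant c; this value is an assumption

def select_action(node):
    """Steps (2)-(3): while all action values are tied, fall back to the
    rule-model action; otherwise maximize Q(s,a) + c*sqrt(ln N(s)/N(s,a))."""
    Q, N = node["Q"], node["N"]  # dicts mapping action -> value / visit count
    if len(set(Q.values())) <= 1:
        return node["rule_action"]
    n_s = sum(N.values())        # N(s): total visits of this state
    return max(Q, key=lambda a: Q[a] + C * math.sqrt(math.log(n_s) / max(N[a], 1)))

def backup(path, reward):
    """Step (4): propagate the final reward back along the visited
    (node, action) pairs, updating Q(s, a) as a running mean."""
    for node, action in path:
        node["N"][action] = node["N"].get(action, 0) + 1
        q = node["Q"].get(action, 0.0)
        node["Q"][action] = q + (reward - q) / node["N"][action]
```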
CN201811524283.4A 2018-12-13 2018-12-13 Rule and learning model-based method for enabling unmanned vehicle to drive away from high speed Active CN109598934B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811524283.4A CN109598934B (en) 2018-12-13 2018-12-13 Rule and learning model-based method for enabling unmanned vehicle to drive away from high speed

Publications (2)

Publication Number Publication Date
CN109598934A CN109598934A (en) 2019-04-09
CN109598934B true CN109598934B (en) 2020-11-06

Family

ID=65961837

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811524283.4A Active CN109598934B (en) 2018-12-13 2018-12-13 Rule and learning model-based method for enabling unmanned vehicle to drive away from high speed

Country Status (1)

Country Link
CN (1) CN109598934B (en)

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109991987B (en) * 2019-04-29 2023-08-04 北京智行者科技股份有限公司 Automatic driving decision-making method and device
CN110427682B (en) * 2019-07-26 2020-05-19 清华大学 Traffic scene simulation experiment platform and method based on virtual reality
CN111413974B (en) * 2020-03-30 2021-03-30 清华大学 Automobile automatic driving motion planning method and system based on learning sampling type
CN111483468B (en) * 2020-04-24 2021-09-07 广州大学 Unmanned vehicle lane change decision-making method and system based on confrontation and imitation learning
CN111605565A (en) * 2020-05-08 2020-09-01 昆山小眼探索信息科技有限公司 Automatic driving behavior decision method based on deep reinforcement learning
CN111645687A (en) * 2020-06-11 2020-09-11 知行汽车科技(苏州)有限公司 Lane changing strategy determining method, device and storage medium
TWI750762B (en) * 2020-08-06 2021-12-21 財團法人車輛研究測試中心 Hybrid planniing method in autonomous vehicles and system thereof
CN112198794A (en) * 2020-09-18 2021-01-08 哈尔滨理工大学 Unmanned driving method based on human-like driving rule and improved depth certainty strategy gradient
CN112099515A (en) * 2020-11-16 2020-12-18 北京鼎翰科技有限公司 Automatic driving method for lane change avoidance
CN112896166A (en) * 2021-03-01 2021-06-04 苏州挚途科技有限公司 Vehicle lane changing method and device and electronic equipment
CN113120003B (en) * 2021-05-18 2022-06-03 同济大学 Unmanned vehicle motion behavior decision method
CN113511215B (en) * 2021-05-31 2022-10-04 西安电子科技大学 Hybrid automatic driving decision method, device and computer storage medium
CN113324556B (en) * 2021-06-04 2024-03-26 苏州智加科技有限公司 Path planning method and device based on vehicle-road collaborative reinforcement learning and application system
CN113345268B (en) * 2021-07-16 2022-03-18 长沙理工大学 CAV lane change decision-making method for expressway down-ramp shunting area
CN113593228B (en) * 2021-07-26 2022-06-03 广东工业大学 Automatic driving cooperative control method for bottleneck area of expressway
CN113682312B (en) * 2021-09-23 2023-07-25 中汽创智科技有限公司 Autonomous channel switching method and system integrating deep reinforcement learning
EP4209963A1 (en) 2022-01-11 2023-07-12 Ford Global Technologies, LLC Method for autonomous driving of a vehicle, a data processing circuit, a computer program, and a computer-readable medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106874597A (en) * 2017-02-16 2017-06-20 北理慧动(常熟)车辆科技有限公司 A kind of highway passing behavior decision-making technique for being applied to automatic driving vehicle
CN107145936A (en) * 2017-04-22 2017-09-08 大连理工大学 A kind of vehicle following-model method for building up based on intensified learning
CN107161155A (en) * 2017-04-27 2017-09-15 大连理工大学 A kind of vehicle collaboration lane-change method and its system based on artificial neural network
CN107315411A (en) * 2017-07-04 2017-11-03 合肥工业大学 A kind of lane-change method for planning track based on automatic driving vehicle under collaborative truck
CN108897313A (en) * 2018-05-23 2018-11-27 清华大学 A kind of end-to-end Vehicular automatic driving system construction method of layer-stepping

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20180052812A (en) * 2016-11-10 2018-05-21 한국전자통신연구원 Method for building a database for driving experience


Also Published As

Publication number Publication date
CN109598934A (en) 2019-04-09

Similar Documents

Publication Publication Date Title
CN109598934B (en) Rule and learning model-based method for enabling unmanned vehicle to drive away from high speed
US11970168B2 (en) Vehicle trajectory modification for following
US11754408B2 (en) Methods and systems for topological planning in autonomous driving
CN110244713B (en) Intelligent vehicle lane change track planning system and method based on artificial potential field method
JP6791905B2 (en) Systems and methods for dynamic vehicle control according to traffic
US11137766B2 (en) State machine for traversing junctions
US20190317499A1 (en) Automatic Driving Device
US6873911B2 (en) Method and system for vehicle operator assistance improvement
EP3794572A1 (en) Drive envelope determination
US20210300348A1 (en) Vehicle control device, vehicle control method, and storage medium
CN108919795A (en) A kind of autonomous driving vehicle lane-change decision-making technique and device
CN112046484B (en) Q learning-based vehicle lane-changing overtaking path planning method
CN111301419A (en) Reinforcement learning based method for SAE4 level automated lane change
EP3425341B1 (en) Information processing apparatus, vehicle information processing method, and computer-readable medium
CN113071487B (en) Automatic driving vehicle control method and device and cloud equipment
JP7216766B2 (en) vehicle controller
US11433924B2 (en) System and method for controlling one or more vehicles with one or more controlled vehicles
CN114830055A (en) Occlusion zone guidance
Kim et al. Multiple vehicle driving control for traffic flow efficiency
JP7379033B2 (en) Driving support method and driving support device
EP4237300A1 (en) Collision avoidance planning system
JP2024520301A (en) Vehicle trajectory determination
Guo et al. Toward human-like behavior generation in urban environment based on Markov decision process with hybrid potential maps
Althoff et al. Stochastic reachable sets of interacting traffic participants
CN112542061B (en) Lane borrowing and overtaking control method, device and system based on Internet of vehicles and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20191231

Address after: 100083 401a, floor 4, building 6, yard 1, Zhongguancun East Road, Haidian District, Beijing

Applicant after: Beijing Chaoxing Future Technology Co., Ltd

Address before: 100084 Tsinghua University, Haidian District, Beijing

Applicant before: Tsinghua University

GR01 Patent grant