WO2024051310A1 - A control method, device, and vehicle - Google Patents

A control method, device, and vehicle

Info

Publication number
WO2024051310A1
Authority
WO
WIPO (PCT)
Prior art keywords
strategy
cost
driving
party
target
Prior art date
Application number
PCT/CN2023/103926
Other languages
English (en)
French (fr)
Inventor
杨绍宇
陈巍
郝东浩
安全
程思源
王新宇
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2024051310A1

Classifications

    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W60/00 Drive control systems specially adapted for autonomous road vehicles
    • B60W60/001 Planning or execution of driving tasks

Definitions

  • This application relates to the technical field of artificial intelligence (AI), and in particular to a control method, device and vehicle.
  • AI artificial intelligence
  • Autonomous driving is a mainstream application in the field of artificial intelligence.
  • Autonomous driving technology relies on the collaborative cooperation of computer vision, radar, monitoring devices and global positioning systems to allow vehicles to achieve autonomous driving without the need for active human operation.
  • vehicles can implement corresponding driving strategies based on actual driving scenarios to ensure safe driving of the vehicle.
  • driving strategies such as mistaken deceleration and mistaken acceleration, which affects safe acceleration and driving experience.
  • This application provides a control method, device, vehicle, computer storage medium and computer program product that enable the vehicle to keep probing the driving intentions of other objects until those intentions are clear, and only then make the final decision to rush or yield. This achieves the effect of slowly accelerating to take the right of way or slowly decelerating to give way, improving the driving experience.
  • This application provides a control method, which includes: dividing the target vehicle and the target object into an active party and a passive party, where the active party has priority over the passive party and there is a possibility of collision between the target vehicle and the target object; obtaining a first strategy set of feasible strategies for the active party, which includes at least one first driving strategy; obtaining, according to each first driving strategy in the first strategy set, the driving parameters of the active party at the current moment, and the driving parameters of the passive party at the current moment, the second driving strategy of the passive party under each first driving strategy in the first strategy set, so as to obtain a second strategy set, where each second driving strategy is any one of the following: the passive party can only rush the active party, the passive party can only yield to the active party, or the passive party can both rush and yield to the active party; determining a target strategy pair set according to the first strategy set and the second strategy set, where the target strategy pair set includes at least one feasible strategy pair, and each feasible strategy pair consists of a first driving strategy and a second driving strategy; determining the execution cost of each feasible strategy pair in the target strategy pair set to obtain a first cost set, which includes the execution cost of each feasible strategy pair; determining the target driving strategy according to the first cost set, where the target driving strategy is the driving strategy of the target vehicle associated with the lowest execution cost in the first cost set; and controlling the target vehicle according to the target driving strategy.
  • the driving parameters at the current moment may refer to, but are not limited to: the driving parameters observed when obtaining the first strategy set, the driving parameters observed when solving the second strategy set, or the driving parameters most recently observed before executing the method.
  • the two can be divided into the active party and the passive party; the feasible strategies of the passive party are solved from the feasible strategies of the active party; the execution cost of each pair of feasible strategies is then calculated; and finally the feasible strategy with the lowest cost is selected to control the target vehicle.
  • This allows the target vehicle to keep probing the driving intentions of other objects until those intentions are clear, and then make the final decision to rush or yield, thereby achieving the effect of slowly accelerating to rush or slowly decelerating to yield and improving the driving experience.
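The flow summarized above (solve the passive party's response for each active-party strategy, cost each pair, pick the cheapest) can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `StrategyPair`, `select_lowest_cost_pair`, `toy_response` and `toy_cost` are hypothetical, a first driving strategy is assumed to be a candidate acceleration, and the toy response and cost functions are stand-ins, not the functions used in the application.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

RUSH, YIELD, EITHER = "rush", "yield", "either"


@dataclass(frozen=True)
class StrategyPair:
    first: float  # active party's first driving strategy (assumed: an acceleration, m/s^2)
    second: str   # passive party's second driving strategy: RUSH, YIELD or EITHER


def select_lowest_cost_pair(
    first_strategies: Iterable[float],
    solve_passive: Callable[[float], str],
    execution_cost: Callable[[StrategyPair], float],
) -> StrategyPair:
    """Build the feasible strategy pairs and return the one with the lowest cost."""
    pairs = [StrategyPair(a, solve_passive(a)) for a in first_strategies]
    if not pairs:
        raise ValueError("the first strategy set must contain at least one strategy")
    return min(pairs, key=execution_cost)


# Toy stand-ins, for illustration only.
def toy_response(accel: float) -> str:
    if accel >= 1.0:
        return YIELD   # active party accelerates hard: passive party can only yield
    if accel <= -1.0:
        return RUSH    # active party brakes hard: passive party can only rush
    return EITHER      # otherwise the passive party could do either


def toy_cost(pair: StrategyPair) -> float:
    # Penalize large accelerations, plus a small penalty when intent is unclear.
    return abs(pair.first) + (0.5 if pair.second == EITHER else 0.0)


best = select_lowest_cost_pair([-2.0, 0.0, 2.0], toy_response, toy_cost)
```

With these toy functions, the pair that keeps the current speed is cheapest even though the passive party's intent is still unclear, which mirrors the "keep probing" behaviour described in the text.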
  • the target vehicle is the active party and the target object is the passive party; determining the target driving strategy based on the first cost set specifically includes: if the second driving strategy in the feasible strategy pair corresponding to the target cost is that the passive party can only rush the active party, the target driving strategy is determined as: the target vehicle yields to the target object, where the target cost is the lowest execution cost in the first cost set; if the second driving strategy in the feasible strategy pair corresponding to the target cost is that the passive party can only yield to the active party, the target driving strategy is determined as: the target vehicle rushes the target object; if the second driving strategy in the feasible strategy pair corresponding to the target cost is that the passive party can both rush and yield to the active party, the target driving strategy is determined as the first driving strategy in the feasible strategy pair corresponding to the target cost.
  • When the target vehicle is the active party and the target object is the passive party, and the target object's driving strategy is either to rush or to yield to the target vehicle, the target vehicle's driving strategy can be determined as the opposite of the target object's: when the target object's driving strategy is to rush the target vehicle, the target vehicle's driving strategy is determined as yielding to the target object, and when the target object's driving strategy is to yield to the target vehicle, the target vehicle's driving strategy is determined as rushing the target object.
  • When the target object's driving strategy is to both rush and yield to the target vehicle, the target object's driving intention is not yet clear.
  • the target vehicle can continue to be controlled to probe the target object's driving intention.
  • Because the target vehicle is the active party at this time and the active party's right of way is higher than the passive party's, and because the party with the lower right of way usually needs to yield to the party with the higher right of way while driving, the target vehicle can be controlled to execute the first driving strategy in the feasible strategy pair corresponding to the target cost. That is, the driving strategy of the target vehicle is determined to be the first driving strategy in the feasible strategy pair corresponding to the target cost.
  • the target vehicle is the passive party and the target object is the active party; determining the target driving strategy based on the first cost set specifically includes: if the second driving strategy in the feasible strategy pair corresponding to the target cost is that the passive party can only rush the active party, the target driving strategy is determined as: the target vehicle rushes the target object, where the target cost is the lowest execution cost in the first cost set; if the second driving strategy in the feasible strategy pair corresponding to the target cost is that the passive party can only yield to the active party, the target driving strategy is determined as: the target vehicle yields to the target object; if the second driving strategy in the feasible strategy pair corresponding to the target cost is that the passive party can both rush and yield to the active party, the target driving strategy is determined as: the target vehicle yields to the target object.
  • When the target vehicle is the passive party and the target object is the active party, and the target vehicle's driving strategy is to rush the target object, the target vehicle has only one driving strategy to choose from, so this driving strategy can be determined as the one that the target vehicle needs to execute.
  • When the target vehicle's driving strategy is to yield to the target object, the target vehicle likewise has only one driving strategy to choose from, so this driving strategy can be determined as the one that the target vehicle needs to execute.
  • When the target vehicle's driving strategy is to both rush and yield to the target object, the target object's driving intention is not yet clear.
  • If the target vehicle rashly commits to rushing or yielding in this case, situations such as hard braking, sudden braking, or even driver takeover may occur, and the driving experience is poor.
  • Because the target vehicle is the passive party at this time, the right of way of the target vehicle is lower than that of the target object.
  • The target vehicle can therefore be controlled to slow down to probe the target object's driving intention; that is, the target vehicle's driving strategy is determined as yielding to the target object.
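The two rule sets above (target vehicle as active party, and target vehicle as passive party) can be read as a small lookup from the lowest-cost pair to the target vehicle's action. The sketch below is one hedged reading of those rules; the function name `target_driving_strategy` and the string labels are hypothetical.

```python
from typing import Optional, Union

RUSH, YIELD, EITHER = "rush", "yield", "either"


def target_driving_strategy(
    ego_is_active: bool,
    second_strategy: str,
    first_strategy: Optional[float] = None,
) -> Union[str, float, None]:
    """Map the lowest-cost feasible strategy pair to the target vehicle's action.

    second_strategy is the passive party's strategy in that pair;
    first_strategy is the active party's strategy in that pair.
    """
    if second_strategy not in (RUSH, YIELD, EITHER):
        raise ValueError(f"unknown second driving strategy: {second_strategy}")

    if ego_is_active:
        # The passive party is the target object: do the opposite of what it does.
        if second_strategy == RUSH:
            return YIELD
        if second_strategy == YIELD:
            return RUSH
        # Intent unclear: keep probing by executing the active party's own strategy.
        return first_strategy

    # The target vehicle is the passive party: the second strategy is its own option.
    if second_strategy == RUSH:
        return RUSH
    # Only yielding is possible, or intent is unclear and the lower-priority
    # vehicle slows down to probe: in both cases yield.
    return YIELD
```

For example, `target_driving_strategy(True, EITHER, first_strategy=0.3)` returns `0.3`, meaning the target vehicle keeps executing the chosen first driving strategy while the other party's intent remains unclear.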
  • the execution cost associated with a feasible strategy pair includes one or more of the following: a comfort cost, which characterizes the comfort level when executing the feasible strategy pair; a passability cost, which characterizes the efficiency with which the active party and/or the passive party passes the conflict point between the two; an offset cost, which characterizes the deviation of the active party and/or the passive party when executing the corresponding driving strategy; an inconsistency cost, which characterizes the deviation between the driving behavior of the target object in the feasible strategy pair and the actual driving behavior of the target object; or a decision penalty cost, which characterizes whether the passive party's intention is clear.
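The text lists the cost terms but does not say here how they are combined. As a minimal sketch, assuming a weighted sum (an assumption, not something the text states), the execution cost of one feasible strategy pair might be assembled like this; `COST_TERMS` and `execution_cost` are hypothetical names.

```python
from typing import Dict, Optional

# The five cost terms named in the text.
COST_TERMS = ("comfort", "passability", "offset", "inconsistency", "decision_penalty")


def execution_cost(
    components: Dict[str, float],
    weights: Optional[Dict[str, float]] = None,
) -> float:
    """Combine whichever cost terms are present into one execution cost.

    Assumption: a weighted sum with a default weight of 1.0 per term.
    """
    weights = weights or {}
    total = 0.0
    for name, value in components.items():
        if name not in COST_TERMS:
            raise KeyError(f"unknown cost term: {name}")
        total += weights.get(name, 1.0) * value
    return total
```

Because the text says "one or more of the following", the sketch accepts any subset of the terms rather than requiring all five.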
  • determining the comfort cost associated with the first strategy pair specifically includes: determining, according to the acceleration corresponding to the first driving strategy in the first strategy pair and the current acceleration of the active party, the first comfort cost of the active party when executing the first driving strategy in the first strategy pair; determining, according to the acceleration corresponding to the second driving strategy in the first strategy pair and the current acceleration of the passive party, the second comfort cost of the passive party when executing the second driving strategy in the first strategy pair; and determining the comfort cost associated with the first strategy pair based on the first comfort cost and the second comfort cost.
  • the current acceleration of the active party may refer to, but is not limited to: the acceleration of the active party observed when obtaining the first strategy set, the acceleration of the active party observed when solving the second strategy set, or the acceleration of the active party most recently observed before executing the method.
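The comfort cost compares the acceleration a strategy asks for with the party's current acceleration. The actual cost curve is shown in a figure that is not reproduced here, so the sketch below simply assumes the cost grows linearly with the absolute change in acceleration and that the two parties' costs are added; `comfort_cost`, `pair_comfort_cost` and `scale` are hypothetical.

```python
def comfort_cost(strategy_accel: float, current_accel: float, scale: float = 1.0) -> float:
    """Comfort cost of one party: larger acceleration changes are less comfortable.

    Assumption: linear in the absolute acceleration change (m/s^2).
    """
    return scale * abs(strategy_accel - current_accel)


def pair_comfort_cost(
    active_strategy_accel: float,
    active_current_accel: float,
    passive_strategy_accel: float,
    passive_current_accel: float,
) -> float:
    """Comfort cost of a strategy pair, assumed to be the sum over both parties."""
    first = comfort_cost(active_strategy_accel, active_current_accel)
    second = comfort_cost(passive_strategy_accel, passive_current_accel)
    return first + second
```

For instance, an active party going from 0.0 to 1.0 m/s^2 and a passive party going from 0.5 to -2.0 m/s^2 would give 1.0 + 2.5 = 3.5 under these assumptions.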
  • determining the passability cost associated with the first strategy pair specifically includes: determining, according to a first time and a second time, the first passability cost of the active party when executing the first driving strategy in the first strategy pair, where the first time is the time at which the active party passes the target point when executing the first driving strategy in the first strategy pair, the second time is the time at which the active party passes the target point with its current speed and acceleration, and the target point is the conflict point between the active party's driving path and the passive party's driving path; determining, according to a third time and a fourth time, the second passability cost of the passive party when executing the second driving strategy in the first strategy pair, where the third time is the time at which the passive party passes the target point when executing the second driving strategy in the first strategy pair, and the fourth time is the time at which the passive party passes the target point with its current speed and acceleration; and determining the passability cost associated with the first strategy pair according to the first passability cost and the second passability cost.
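The passability cost compares two arrival times at the conflict point: one under the candidate strategy and one under the party's current motion. A rough sketch, assuming constant acceleration along the path and assuming the cost is the extra delay the strategy introduces (both are assumptions; the real cost curve is in a figure not reproduced here), could look like this. `time_to_point` and `passability_cost` are hypothetical names.

```python
import math


def time_to_point(distance: float, speed: float, accel: float) -> float:
    """Time to cover `distance` (m) from `speed` (m/s, >= 0) at constant `accel` (m/s^2).

    Returns math.inf if the party stops before reaching the point.
    """
    if distance <= 0.0:
        return 0.0
    if accel == 0.0:
        return distance / speed if speed > 0.0 else math.inf
    # Solve distance = speed * t + 0.5 * accel * t^2 for the earliest t >= 0.
    disc = speed * speed + 2.0 * accel * distance
    if disc < 0.0:
        return math.inf
    return (-speed + math.sqrt(disc)) / accel


def passability_cost(
    distance: float, speed: float, current_accel: float, strategy_accel: float
) -> float:
    """Delay at the conflict point caused by the strategy, clipped at zero."""
    t_strategy = time_to_point(distance, speed, strategy_accel)
    t_current = time_to_point(distance, speed, current_accel)
    if math.isinf(t_strategy):
        return math.inf
    return max(0.0, t_strategy - t_current)
```

Applied once to the active party (first and second times) and once to the passive party (third and fourth times), the two results could then be combined, for example by summing, to give the pair's passability cost.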
  • determining the offset cost associated with the first strategy pair specifically includes: determining, based on a preset mapping relationship between offset and offset cost and on the offset of the active party when executing the first driving strategy in the first strategy pair, the first offset cost of the active party when executing the first driving strategy in the first strategy pair; determining, based on the preset mapping relationship between offset and offset cost and on the offset of the passive party when executing the second driving strategy in the first strategy pair, the second offset cost of the passive party when executing the second driving strategy in the first strategy pair; and determining the offset cost associated with the first strategy pair based on the first offset cost and the second offset cost.
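The offset cost is obtained from a preset mapping between offset and cost. The mapping itself is not given in this text, so the sketch below uses a made-up piecewise-linear table purely for illustration; `OFFSET_TO_COST` and `offset_cost` are hypothetical, and the table values are not from the application.

```python
from bisect import bisect_right
from typing import List, Tuple

# Hypothetical preset mapping: (offset in metres, cost), sorted by offset.
OFFSET_TO_COST: List[Tuple[float, float]] = [(0.0, 0.0), (0.5, 1.0), (1.0, 4.0)]


def offset_cost(offset: float, table: List[Tuple[float, float]] = OFFSET_TO_COST) -> float:
    """Look up the cost for an offset by linear interpolation in the preset table."""
    x = abs(offset)
    xs = [point[0] for point in table]
    if x >= xs[-1]:
        return table[-1][1]          # beyond the table: saturate at the last cost
    i = bisect_right(xs, x)          # table[i - 1][0] <= x < table[i][0]
    (x0, c0), (x1, c1) = table[i - 1], table[i]
    return c0 + (c1 - c0) * (x - x0) / (x1 - x0)
```

The same lookup would be applied to the active party's offset and to the passive party's offset, and the two costs then combined into the pair's offset cost.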
  • determining the inconsistency cost associated with the first strategy pair specifically includes: when the driving strategy of the target object in the first strategy pair is to only rush the target vehicle or to only yield to the target vehicle, determining the target probability of the target object rushing the target vehicle, and determining the inconsistency cost based on the target probability; when the driving strategy of the target object in the first strategy pair is to both rush and yield to the target vehicle, determining the inconsistency cost to be a preset cost value.
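The text says only that the inconsistency cost is "determined based on" the probability that the target object rushes the target vehicle. One plausible reading, used in the sketch below as an explicit assumption, is that the cost is the probability mass that disagrees with the hypothesized behaviour. `inconsistency_cost` and `preset_cost` are hypothetical names, and the default preset value is made up.

```python
RUSH, YIELD, EITHER = "rush", "yield", "either"


def inconsistency_cost(
    object_strategy: str, rush_probability: float, preset_cost: float = 0.5
) -> float:
    """Cost of a hypothesized target-object strategy disagreeing with observed behaviour.

    rush_probability: estimated probability that the target object rushes the target vehicle.
    """
    if not 0.0 <= rush_probability <= 1.0:
        raise ValueError("rush_probability must lie in [0, 1]")
    if object_strategy == RUSH:
        return 1.0 - rush_probability   # assumed: cost is the chance it does not rush
    if object_strategy == YIELD:
        return rush_probability         # assumed: cost is the chance it rushes anyway
    if object_strategy == EITHER:
        return preset_cost              # intent unclear: fixed preset cost value
    raise ValueError(f"unknown strategy: {object_strategy}")
```

Under this reading, a pair that assumes the object yields becomes expensive exactly when the observed behaviour suggests the object is likely to rush.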
  • determining the decision penalty cost associated with the first strategy pair specifically includes: determining the decision penalty cost associated with the first strategy pair based on a preset decision penalty rule and the second driving strategy in the first strategy pair.
  • the second driving strategy in the first strategy pair is that the passive party can both rush and yield to the active party; determining the decision penalty cost associated with the first strategy pair based on the preset decision penalty rule and the second driving strategy in the first strategy pair specifically includes: determining a first decision penalty cost based on the decision penalty rule and the second driving strategy in the first strategy pair; determining, according to the acceleration corresponding to the first driving strategy in the first strategy pair and the current acceleration of the active party, a second decision penalty cost of the active party when executing the first driving strategy in the first strategy pair; and determining the decision penalty cost associated with the first strategy pair according to the first decision penalty cost and the second decision penalty cost.
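The decision penalty has two parts when the passive party's intent is unclear: a rule-based penalty and an acceleration-based penalty for the active party. The preset rule is not spelled out in this text, so the sketch below assumes a fixed penalty for the unclear case, no penalty otherwise, and a linear acceleration term; `decision_penalty_cost`, `unclear_penalty` and `accel_weight` are hypothetical.

```python
RUSH, YIELD, EITHER = "rush", "yield", "either"


def decision_penalty_cost(
    second_strategy: str,
    strategy_accel: float,
    current_accel: float,
    unclear_penalty: float = 1.0,
    accel_weight: float = 0.5,
) -> float:
    """Penalty for committing to a strategy pair while the passive party's intent is unclear.

    Assumptions: zero penalty when the passive party can only rush or only yield;
    otherwise a fixed rule-based penalty plus a term that grows with the
    active party's change in acceleration.
    """
    if second_strategy in (RUSH, YIELD):
        return 0.0
    if second_strategy != EITHER:
        raise ValueError(f"unknown second driving strategy: {second_strategy}")
    first_penalty = unclear_penalty
    second_penalty = accel_weight * abs(strategy_accel - current_accel)
    return first_penalty + second_penalty
```

With such a shape, aggressive acceleration changes are discouraged while intent is unclear, which is consistent with the gradual probing behaviour the text describes.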
  • this application provides a control device, which includes: a dividing module and a processing module.
  • the division module is used to divide the target vehicle and the target object into active parties and passive parties, where the active party has priority over the passive party, and there is a possibility of collision between the target vehicle and the target object.
  • the processing module is configured to obtain a first feasible strategy set for the active party, where the first strategy set includes at least one first driving strategy.
  • the processing module is also configured to obtain, based on each first driving strategy in the first strategy set, the driving parameters of the active party at the current moment, and the driving parameters of the passive party at the current moment, the second driving strategy of the passive party under each first driving strategy in the first strategy set, so as to obtain the second strategy set, where the second driving strategy is any one of the following: the passive party can only rush the active party, the passive party can only yield to the active party, or the passive party can both rush and yield to the active party.
  • the processing module is also configured to determine a target strategy pair set according to the first strategy set and the second strategy set.
  • the target strategy pair set includes at least one feasible strategy pair, and each feasible strategy pair consists of a first driving strategy and a second driving strategy.
  • the processing module is also used to determine the execution cost of each feasible strategy pair in the target strategy pair set, and obtain a first cost set, which includes the execution cost of each feasible strategy pair.
  • the processing module is also configured to determine a target driving strategy according to the first cost set, where the target driving strategy is the driving strategy of the target vehicle associated with the lowest execution cost in the first cost set.
  • the processing module is also used to control the target vehicle according to the target driving strategy.
  • where the target vehicle is the active party and the target object is the passive party, when determining the target driving strategy based on the first cost set, the processing module is specifically configured to: if the second driving strategy in the feasible strategy pair corresponding to the target cost is that the passive party can only rush the active party, determine the target driving strategy as: the target vehicle yields to the target object, where the target cost is the lowest execution cost in the first cost set; if the second driving strategy in the feasible strategy pair corresponding to the target cost is that the passive party can only yield to the active party, determine the target driving strategy as: the target vehicle rushes the target object; if the second driving strategy in the feasible strategy pair corresponding to the target cost is that the passive party can both rush and yield to the active party, determine the target driving strategy as the first driving strategy in the feasible strategy pair corresponding to the target cost.
  • where the target vehicle is the passive party and the target object is the active party, when determining the target driving strategy based on the first cost set, the processing module is specifically configured to: if the second driving strategy in the feasible strategy pair corresponding to the target cost is that the passive party can only rush the active party, determine the target driving strategy as: the target vehicle rushes the target object, where the target cost is the lowest execution cost in the first cost set; if the second driving strategy in the feasible strategy pair corresponding to the target cost is that the passive party can only yield to the active party, determine the target driving strategy as: the target vehicle yields to the target object; if the second driving strategy in the feasible strategy pair corresponding to the target cost is that the passive party can both rush and yield to the active party, determine the target driving strategy as: the target vehicle yields to the target object.
  • the execution cost associated with a feasible strategy pair includes one or more of the following: a comfort cost, which characterizes the comfort level when executing the feasible strategy pair; a passability cost, which characterizes the efficiency with which the active party and/or the passive party passes the conflict point between the two; an offset cost, which characterizes the deviation of the active party and/or the passive party when executing the corresponding driving strategy; an inconsistency cost, which characterizes the deviation between the driving behavior of the target object in the feasible strategy pair and the actual driving behavior of the target object; or a decision penalty cost, which characterizes whether the passive party's intention is clear.
  • when determining the comfort cost associated with the first strategy pair, the processing module is specifically configured to: determine, according to the acceleration corresponding to the first driving strategy in the first strategy pair and the current acceleration of the active party, the first comfort cost of the active party when executing the first driving strategy in the first strategy pair; determine, according to the acceleration corresponding to the second driving strategy in the first strategy pair and the current acceleration of the passive party, the second comfort cost of the passive party when executing the second driving strategy in the first strategy pair; and determine the comfort cost associated with the first strategy pair based on the first comfort cost and the second comfort cost.
  • when determining the passability cost associated with the first strategy pair, the processing module is specifically configured to: determine, based on a first time and a second time, the first passability cost of the active party when executing the first driving strategy in the first strategy pair, where the first time is the time at which the active party passes the target point when executing the first driving strategy in the first strategy pair, the second time is the time at which the active party passes the target point with its current speed and acceleration, and the target point is the conflict point between the active party's driving path and the passive party's driving path; determine, based on a third time and a fourth time, the second passability cost of the passive party when executing the second driving strategy in the first strategy pair, where the third time is the time at which the passive party passes the target point when executing the second driving strategy in the first strategy pair, and the fourth time is the time at which the passive party passes the target point with its current speed and acceleration; and determine the passability cost associated with the first strategy pair according to the first passability cost and the second passability cost.
  • when determining the offset cost associated with the first strategy pair, the processing module is specifically configured to: determine, based on a preset mapping relationship between offset and offset cost and on the offset of the active party when executing the first driving strategy in the first strategy pair, the first offset cost of the active party when executing the first driving strategy in the first strategy pair; determine, based on the preset mapping relationship between offset and offset cost and on the offset of the passive party when executing the second driving strategy in the first strategy pair, the second offset cost of the passive party when executing the second driving strategy in the first strategy pair; and determine the offset cost associated with the first strategy pair based on the first offset cost and the second offset cost.
  • when determining the inconsistency cost associated with the first strategy pair, the processing module is specifically configured to: when the driving strategy of the target object in the first strategy pair is to only rush the target vehicle or to only yield to the target vehicle, determine the target probability of the target object rushing the target vehicle, and determine the inconsistency cost based on the target probability; when the driving strategy of the target object in the first strategy pair is to both rush and yield to the target vehicle, determine the inconsistency cost to be a preset cost value.
  • when determining the decision penalty cost associated with the first strategy pair, the processing module is specifically configured to: determine the decision penalty cost associated with the first strategy pair based on a preset decision penalty rule and the second driving strategy in the first strategy pair.
  • the second driving strategy in the first strategy pair is that the passive party can both rush and yield to the active party; when determining the decision penalty cost associated with the first strategy pair based on the preset decision penalty rule and the second driving strategy in the first strategy pair, the processing module is specifically configured to: determine a first decision penalty cost based on the decision penalty rule and the second driving strategy in the first strategy pair; determine, according to the acceleration corresponding to the first driving strategy in the first strategy pair and the current acceleration of the active party, a second decision penalty cost of the active party when executing the first driving strategy in the first strategy pair; and determine the decision penalty cost associated with the first strategy pair according to the first decision penalty cost and the second decision penalty cost.
  • this application provides a vehicle, including the control device described in the second aspect or any possible implementation of the second aspect.
  • the present application provides a computer-readable storage medium.
  • the computer-readable storage medium stores a computer program.
  • When the computer program is run on a processor, it causes the processor to execute the method described in the first aspect or any possible implementation of the first aspect.
  • the present application provides a computer program product.
  • When the computer program product is run on a processor, it causes the processor to execute the method described in the first aspect or any possible implementation of the first aspect.
  • Figure 1 is a schematic diagram of an application scenario provided by an embodiment of the present application.
  • Figure 2 is a hardware structure of a vehicle provided by an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of a control method provided by an embodiment of the present application.
  • Figure 4 is a schematic diagram of a vehicle-to-vehicle intersection provided by an embodiment of the present application.
  • Figure 5 is a schematic diagram of a comfort cost function provided by an embodiment of the present application.
  • Figure 6 is a schematic diagram of a passability cost function provided by an embodiment of the present application.
  • Figure 7 is a schematic diagram of a decision-making penalty cost function provided by an embodiment of the present application.
  • Figure 8 is a schematic structural diagram of a control device provided by an embodiment of the present application.
  • "A and/or B" can denote three situations: A exists alone, A and B exist simultaneously, and B exists alone.
  • the symbol "/" in this document indicates an "or" relationship between the associated objects; for example, A/B means A or B.
  • first, second, etc. in the description and claims herein are used to distinguish different objects, rather than to describe a specific order of objects.
  • first response message and the second response message are used to distinguish different response messages, but are not used to describe a specific sequence of response messages.
  • "multiple" refers to two or more; for example, multiple processing units refers to two or more processing units, and multiple components refers to two or more components.
  • Figure 1 shows an application scenario.
  • vehicles 100 and 200 travel to a T-shaped road condition, and vehicle 100 travels along the direction marked by line segment x, and vehicle 200 moves along the direction marked by line segment y.
  • vehicle 200 merges into the road where vehicle 100 is located, and vehicle 100 travels along the direction marked by line segment x, and vehicle 200 moves along the direction marked by line segment y.
  • Figure 1 only shows situations where there is a possibility of collision between two vehicles. Other situations where there is a possibility of collision because the vehicles' driving trajectories conflict are still within the scope of protection of this application.
  • the vehicle 100 can be retained, and the vehicle 200 can be replaced with other objects, such as a moving object, a stationary object, etc.
  • the replaced solution is still within the scope of protection of this application.
  • the vehicle 200 can be retained, and the vehicle 100 can be replaced with other objects, such as a moving object, a stationary object, etc.
  • the replaced solution is still within the protection scope of the present application.
  • the following description takes the case where the vehicle 100 is retained as an example.
  • the driving trajectories of other objects can be, but are not limited to, the driving trajectories predicted for the vehicle.
  • the vehicle can predict a movement trajectory for the objects; when the other objects are moving objects, the vehicle can also predict a movement trajectory for them.
  • embodiments of the present application provide a control method that can divide the vehicle and other objects into an active party and a passive party, where the active party has priority over the passive party. The vehicle can then be controlled to probe the intentions of the other objects, and the final decision to rush or yield is made only once those intentions are clear. This achieves the effect of slowly accelerating to take the right of way or slowly decelerating to give way, improving the driving experience.
  • FIG. 2 shows the hardware structure of a vehicle.
  • the vehicle 100 may include: a sensor component 110, a fusion unit 120 and an intelligent driving function component 130.
  • the sensor component 110 and the fusion unit 120 are connected through the interface 140, and the fusion unit 120 and the intelligent driving function component 130 are connected through the interface 150.
  • the sensor component 110 may include a vehicle attitude sensor and/or a perception sensor, etc. Data collected by the sensor component 110 may be transmitted to the fusion unit 120 through the interface 140.
  • the vehicle attitude sensor can obtain the driving status information of the vehicle 100, such as speed, acceleration, heading angle, road topology, etc., as well as obtain the external environment information of the vehicle 100, such as road condition information, etc.
  • Perception sensors can acquire information about other objects outside the vehicle 100, for example their driving status information such as speed, heading angle, position, orientation, acceleration, road topology, and so on.
  • the vehicle attitude sensor may include one or more of a gyroscope sensor, a radar sensor, an ultrasonic sensor, a camera, a computer vision system, etc.
  • the radar sensor may include a laser radar sensor and/or a millimeter wave radar sensor, etc.
  • Perception sensors may include one or more of cameras, radar sensors, ultrasonic sensors, etc.
  • the radar sensor can be used to sense objects in the environment surrounding the vehicle 100 using radio signals, and can also sense the speed and/or direction of travel of the objects, and so on.
  • Cameras can be used to capture multiple images of the vehicle's surroundings.
  • the camera can be a still camera or a video camera.
  • the computer vision system may be operable to process and analyze images captured by the cameras in order to identify objects and/or features in the environment surrounding the vehicle.
  • Objects and/or features may include traffic signals, road boundaries, obstacles, other objects, etc.
  • Computer vision systems can use object recognition algorithms, Structure from Motion (SFM) algorithms, video tracking and other computer vision technologies.
  • the fusion unit 120 can transmit the data collected by the sensor component 110 to the intelligent driving function component 130 through the interface 150 , so that the intelligent driving function component 130 can implement the intelligent driving function based on the data collected by the sensor component 110 .
  • The intelligent driving function component 130 can receive, through the interface 150, the data collected by the sensor component 110 and transmitted by the fusion unit 120, and then, based on the received data, predict the driving trajectories of other objects and/or make driving decisions to implement intelligent driving functions, such as adaptive cruise control (ACC), lane keeping assist (LKA), highway assist (HWA), traffic jam assistant (TJA), and other intelligent driving/autonomous driving functions.
  • the interface 140 can realize data transmission between the sensor component 110 and the fusion unit 120, which can be the interface between the sensor and the fusion unit specified in ISO 130150.
  • the interface 150 can realize data transmission between the fusion unit 120 and the intelligent driving function component 130 .
  • the message content transmitted by the interface 150 may include one or more of the following: the speed, heading angle, historical trajectory information, acceleration, road topology of the vehicle 100; the type, speed, heading angle, position, orientation, acceleration, road topology of other objects etc.; the collision time between the vehicle 100 and other objects, etc.
  • the intelligent driving function component 130 may include an object decision module and a motion planning module.
  • the object decision module can determine whether the vehicle 100 will collide with other traffic participants when traveling along the reference path based on the data collected by the sensor assembly 110 .
  • the reference path may refer to the reference datum used by the vehicle to make object decisions, such as the center line of the current road, etc. This path mainly reflects the real map information and can guide the driving direction of the vehicle.
  • the object decision module can also classify situations in which the vehicle 100 conflicts with other traffic participants, and divide the two into game objects or non-game objects.
  • Game objects are objects whose decisions influence each other.
  • For example, the driving decision of vehicle 100 (such as rushing or yielding) has an impact on the driving decision of vehicle 200, and the driving decision of vehicle 200 (such as rushing or yielding) has an impact on the driving decision of vehicle 100, so vehicles 100 and 200 are game objects.
  • Non-game objects mean that there is no mutual influence between the two.
  • For example, the driving decision of vehicle 100 cannot affect vehicles located in front of it and traveling in the same direction.
  • The main problem addressed here is how to make a driving decision when the two are game objects, so as to avoid a collision between them.
  • The motion planning module can perform lateral planning and longitudinal planning based on the driving decisions made by the object decision module.
  • Lateral planning refers to the planning made in the direction perpendicular to the reference path (such as the center line of the lane, etc.), corresponding to the vehicle's obstacle avoidance, detour and other behaviors.
  • The lateral planning can have coordinate information, but not speed information.
  • Longitudinal planning refers to the planning made in the direction along the reference path, corresponding to the acceleration and deceleration of the vehicle.
  • The longitudinal planning can have speed information but not coordinate information.
  • Lateral planning and longitudinal planning can jointly form the vehicle's driving trajectory.
  • the structure illustrated in the embodiment of the present application does not constitute a specific limitation on the vehicle 100 .
  • the vehicle 100 may include more or fewer components than shown in the figures, or some components may be combined, some components may be separated, or some components may be arranged differently.
  • the components illustrated may be implemented in hardware, software, or a combination of software and hardware.
  • Figure 3 shows a control method.
  • the vehicle involved in this method may be, but is not limited to, the aforementioned vehicle 100 .
  • the scenario involved in this method can be, but is not limited to, the scenario where there is a possibility of collision in the driving route between the vehicles 100 and 200 described in Figure 1, that is, there is a certain probability of a collision between the two vehicles.
  • This method can be executed by any apparatus, device, platform, or device cluster with computing and processing capabilities.
  • it can be executed by a processor in the vehicle 100 or a vehicle-mounted terminal, which is not limited here.
  • the control method may include S301 to S307, specifically:
  • After the target vehicle collects information about each traffic participant in the environment, when it senses that there is a possibility of collision with another traffic participant (hereinafter referred to as the "target object"), the target vehicle can, according to traffic rules, divide itself and the target object into an active party and a passive party, where the active party has priority over the passive party. For example, continuing to refer to (A) of Figure 1, according to traffic rules a left-turning vehicle must yield to a vehicle going straight. Therefore, vehicle 100 is the active party and vehicle 200 is the passive party. For another example, when the target vehicle goes straight and the target object turns, the target vehicle can be determined to be the active party and the target object the passive party. When the target vehicle turns and the target object goes straight, the target vehicle can be determined to be the passive party and the target object the active party.
  • the traffic participants in the embodiments of this application may be vehicles or other objects, which are not limited here.
  • the first strategy set includes at least one first driving strategy.
  • the first policy set can be stored in the target vehicle in advance, or can be obtained from the network in real time, or can be calculated in real time. The details can be determined according to the actual situation, and are not limited here.
  • Each first driving strategy can be used to represent at least one piece of feasible driving information (such as acceleration, etc.) of the active party along the direction of the lane where the target vehicle is located and one piece of feasible driving information (such as lateral offset, etc.) in the direction perpendicular to the lane.
  • the first strategy set may be determined based on at least the preset acceleration sampling space, the allowed range of the active party's acceleration, and the preset lateral offset range.
  • The intersection of the preset acceleration sampling space and the allowed range of the active party's acceleration can be used as the variation range of the acceleration in the first strategy set, and the preset lateral offset range can be used as the variation range of the lateral offset in the first strategy set. For example, when the target vehicle is the active party, if the preset acceleration sampling space is [-4.0, 3.0] m/s² and the allowed acceleration range of the target vehicle is [-3.0, 2.0] m/s², then the acceleration range in the first strategy set is [-3.0, 2.0] m/s². If the preset lateral offset range is [-1, 1] m, then the variation range of the lateral offset in the first strategy set is [-1, 1] m.
  • The change interval of the acceleration and the change interval of the lateral offset in the first strategy set can also be set.
  • For example, the acceleration can change in steps of 1 m/s² and the lateral offset can change in steps of 1 m.
  • With a lateral offset step of 1 m, the target vehicle can keep to the center of the current lane (recorded as 0 m), avoid 1 m to the left (that is, offset to the left of the lane center line, recorded as +1 m), or avoid 1 m to the right (that is, offset to the right of the lane center line, recorded as -1 m).
  • the first policy set shown in Table 1 can be generated.
  • an acceleration and a lateral offset can form a driving strategy.
  • For example, the driving strategy composed of an acceleration of 1 m/s² and a lateral offset of -1 m is: drive with an acceleration of 1 m/s² while offset 1 m to the right of the lane center line.
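  • The construction of the first strategy set described above can be sketched as follows. This is a minimal illustration, not the implementation of the application: the function name `build_first_strategy_set` and its parameters are hypothetical, and a uniform sampling grid is assumed. The numeric ranges are the ones given in the example above.

```python
from itertools import product


def build_first_strategy_set(sample_range, allowed_range, offsets, step=1.0):
    """Return (acceleration, lateral_offset) pairs on a uniform grid.

    The acceleration range is the intersection of the preset sampling
    space and the range allowed for the active party (assumed helper).
    """
    lo = max(sample_range[0], allowed_range[0])  # lower bound of the intersection
    hi = min(sample_range[1], allowed_range[1])  # upper bound of the intersection
    n = int(round((hi - lo) / step))
    accels = [lo + i * step for i in range(n + 1)]
    # One acceleration combined with one lateral offset forms one driving strategy.
    return list(product(accels, offsets))


# Sampling space [-4, 3] m/s^2, allowed range [-3, 2] m/s^2, offsets -1/0/+1 m.
strategies = build_first_strategy_set((-4.0, 3.0), (-3.0, 2.0), (-1.0, 0.0, 1.0))
```

  • With these inputs the grid contains six accelerations and three lateral offsets, and the example strategy (1 m/s², -1 m) is one of its elements.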
  • Time domain deduction can be performed on each driving strategy in the first strategy set to form a first trajectory set corresponding to the active party's feasible driving strategies.
  • During the deduction, the system delay of the active party (such as the transition time from one acceleration to another, etc.) can be taken into account, and constraints such as the speed limit can be observed.
  • For example, once the active party has accelerated to the road speed limit, the deduction continues at a constant speed equal to the road speed limit.
  • In this way, the first trajectory set can be obtained.
  • the first trajectory set may include at least one driving trajectory, and each driving strategy in the first strategy set may correspond to one driving trajectory.
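  • The time domain deduction can be illustrated with a simple forward rollout. The sketch below is only one possible reading: the function `deduce_trajectory`, the fixed time step, the way the system delay is modelled (the previous acceleration `a0` is held until `delay` has elapsed), and the instantaneous lateral offset are all assumptions introduced for illustration.

```python
def deduce_trajectory(v0, accel, offset, v_limit, horizon=5.0, dt=0.5,
                      delay=0.0, a0=0.0):
    """Roll one driving strategy forward in time.

    Returns a list of (t, s, v, lateral_offset) samples, where s is the
    distance travelled along the reference path. Speed is clamped to
    [0, v_limit], so the rollout continues at constant speed once the
    road speed limit is reached.
    """
    traj, s, v, t = [], 0.0, v0, 0.0
    steps = int(round(horizon / dt))
    for _ in range(steps + 1):
        traj.append((t, s, v, offset))
        a = a0 if t < delay else accel          # assumed system-delay model
        v_next = min(max(v + a * dt, 0.0), v_limit)
        s += 0.5 * (v + v_next) * dt            # trapezoidal distance update
        v, t = v_next, t + dt
    return traj


# Hypothetical numbers: start at 5 m/s, accelerate at 2 m/s^2, limit 8 m/s.
traj = deduce_trajectory(v0=5.0, accel=2.0, offset=0.0, v_limit=8.0, horizon=3.0)
```

  • Applying such a rollout to every strategy in the first strategy set would yield one trajectory per strategy, which matches the one-to-one correspondence described above.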
  • After obtaining the first strategy set, S303 can be executed.
  • S303: According to each first driving strategy in the first strategy set, the driving parameters of the active party at the current moment, and the driving parameters of the passive party at the current moment, obtain the second driving strategy of the passive party under each first driving strategy in the first strategy set, so as to obtain the second strategy set, where the second driving strategy is any one of the following: the passive party can only rush ahead of the active party, the passive party can only yield to the active party, or the passive party can both rush ahead of and yield to the active party.
  • The time for the active party to reach the position where it may collide with the passive party can be calculated based on the active party's current driving parameters (such as speed, acceleration, position, etc.) and each first driving strategy. Then, under each first driving strategy, it can be determined whether the passive party, driving with its current driving parameters (such as speed, acceleration, position, etc.) as the initial parameters, can stagger from the active party by a certain time interval and distance interval.
  • On this basis, the actions available to the passive party can be determined, such as only being able to rush ahead of the active party, only being able to yield to the active party, or being able to both rush ahead of and yield to the active party. This gives the second driving strategy of the passive party under each first driving strategy in the first strategy set, and thereby the second strategy set.
  • the driving parameters at the current moment may, but are not limited to, refer to: the driving parameters observed when obtaining the first strategy set, or the driving parameters observed when solving the second strategy set, or when executing the method. The latest observed driving parameters.
  • For example, algorithms such as quadratic programming (QP) can be applied to the driving position, driving speed, acceleration, dynamic and kinematic constraints, collision time, and road rules of the active and passive parties to solve for the acceleration/deceleration/avoidance driving trajectory of the passive party and determine the driving strategy of the passive party. As another example, the target object's driving speed, position, orientation, acceleration, road topology, etc. can be input into a pre-trained model to predict the driving trajectory and/or driving strategy of the passive party.
  • For example, the feasible trajectory 46 of vehicle 200 (i.e., the passive party) can be obtained by QP solving.
  • The distance from the vehicle 100 to the trajectory conflict point 43 is 20.11 m.
  • The width of the vehicle 100 is approximately 1.9 m, and a margin of 0.5 m on each side is added to form the safe passage of the vehicle 100 (i.e., the area bounded by lines 41 and 42).
  • When the vehicle 200 drives along its predicted path (i.e., the feasible trajectory 46), the point at which it begins to intrude into the safe passage of the vehicle 100 is regarded as the intrusion point (that is, when the vehicle 200 is located at the position 44), and the point at which it completely leaves the safe passage of the vehicle 100 is regarded as the departure point (that is, when the vehicle 200 is located at the position 45).
  • The conditions for vehicle 200 to safely rush ahead of or safely yield to vehicle 100 are as follows: when yielding to vehicle 100, vehicle 200 needs to reach the intrusion point a certain time after vehicle 100 passes the trajectory conflict point 43; when rushing ahead of vehicle 100, vehicle 200 needs to reach the departure point a certain time before vehicle 100 reaches the trajectory conflict point 43.
  • The safety time interval for the vehicle 200 to safely rush or yield can be preset, and different safety time intervals can be used for different scenarios.
  • For example, the safety time interval can be 1 s. That is, if vehicle 200 rushes ahead of vehicle 100, it needs to leave the safe passage of vehicle 100 within 2.41 s; if vehicle 200 yields to vehicle 100, it needs to enter the safe passage of vehicle 100 after 4.41 s. These bounds correspond to vehicle 100 reaching the trajectory conflict point 43 at about 3.41 s.
  • The distance from vehicle 200 to the trajectory conflict point 43 is 11.92 m, the observed acceleration of vehicle 200 is 0.0 m/s², and its current speed is 17 km/h.
  • The intrusion point (position 44) is 7.63 m from the current position of vehicle 200, and the departure point (position 45) is 13.4 m from the current position of vehicle 200.
  • By QP solving, a solution in which vehicle 200 yields to vehicle 100 and a solution in which vehicle 200 rushes ahead of vehicle 100, both meeting the above conditions, can be obtained at the same time. That is, vehicle 200 can either rush ahead of vehicle 100 or yield to vehicle 100, so the intention of vehicle 200 is unclear at this time.
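  • The safe staggering conditions in this example can be checked with a much simpler stand-in for the QP solver. The sketch below assumes constant acceleration for vehicle 200 and tests only the two extreme accelerations; the function names, the acceleration bounds of [-3, 2] m/s² for vehicle 200, and the arrival time of 3.41 s for vehicle 100 (inferred from the 2.41 s and 4.41 s bounds with a 1 s interval) are assumptions, not values stated as such in the application.

```python
import math


def time_to_cover(distance, v0, accel):
    """Time to travel `distance` from speed v0 at constant acceleration.

    Returns math.inf if the vehicle brakes to a stop before the point.
    """
    if abs(accel) < 1e-9:
        return distance / v0 if v0 > 0 else math.inf
    disc = v0 * v0 + 2.0 * accel * distance
    if disc < 0:  # stops short of the point
        return math.inf
    return (-v0 + math.sqrt(disc)) / accel


def passive_options(t_conflict, v0, d_in, d_out, a_min, a_max, margin=1.0):
    """Classify what the passive party can do against one active strategy."""
    # Rush: leave the safe passage at least `margin` before the active party arrives.
    can_rush = time_to_cover(d_out, v0, a_max) <= t_conflict - margin
    # Yield: enter the safe passage at least `margin` after the active party arrives.
    can_yield = time_to_cover(d_in, v0, a_min) >= t_conflict + margin
    if can_rush and can_yield:
        return "rush_or_yield"
    if can_rush:
        return "rush"
    if can_yield:
        return "yield"
    return "no_solution"


# Vehicle 200: 17 km/h, intrusion point at 7.63 m, departure point at 13.4 m.
result = passive_options(3.41, 17 / 3.6, 7.63, 13.4, a_min=-3.0, a_max=2.0)
```

  • Under these assumptions the check returns "rush_or_yield", which agrees with the unclear intention described above. If the active party arrives earlier (for example `t_conflict=2.0`), only yielding remains feasible.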
  • For each driving strategy of vehicle 100 in the first strategy set, the corresponding driving strategy of the vehicle 200 is solved.
  • For example, when the driving strategy of vehicle 100 is a lateral offset of 0 m and an acceleration of 2 m/s², a strategy in which vehicle 200 yields to vehicle 100 can be found. For vehicle 200 rushing ahead of vehicle 100, however, the safe staggering constraint and the other constraints cannot be satisfied regardless of whether vehicle 200 offsets, so no feasible solution is obtained. Vehicle 200 can therefore only yield to vehicle 100, and its intention is clear.
  • When the driving strategy of vehicle 100 is a lateral offset of 0 m and an acceleration of -3 m/s², only a strategy in which vehicle 200 rushes ahead of vehicle 100 can be found. Therefore, vehicle 200 can only rush ahead of vehicle 100 at this time.
  • For example, when the active party's driving strategy is "driving with a lateral offset of 1 m and an acceleration of -3 m/s²", the passive party's driving strategy is "rush ahead of the active party", that is, the passive party can only rush ahead of the active party.
  • When the active party's driving strategy is "driving with a lateral offset of 0 m and an acceleration of 1 m/s²", the passive party's driving strategy is "rush ahead of and yield to the active party", that is, the passive party can both rush ahead of the active party and yield to it.
  • When the active party's driving strategy is "driving with a lateral offset of 1 m and an acceleration of 1 m/s²", the passive party cannot avoid a collision with the active party. In this case no driving strategy can be obtained for the passive party, that is, there is "no solution".
  • Based on the driving trajectory corresponding to each driving strategy in the first strategy set and the conditions that the passive party needs to meet to safely rush ahead of and/or yield to the active party, a time domain deduction can be performed on the passive party's driving trajectory to form a second trajectory set corresponding to the passive party's feasible driving strategies.
  • During the deduction, the system delay of the passive party (such as the transition time from one acceleration to another, etc.) can be taken into account, and constraints such as the speed limit can be observed.
  • For example, once the passive party has accelerated to the road speed limit, the deduction continues at a constant speed equal to the road speed limit.
  • In this way, the second trajectory set can be obtained.
  • the second trajectory set may include at least one driving trajectory, and each driving strategy in the second strategy set may correspond to one driving trajectory.
  • After obtaining the second strategy set, S304 can be executed.
  • S304: Determine a target strategy pair set according to the first strategy set and the second strategy set. The target strategy pair set includes at least one feasible strategy pair, and each feasible strategy pair consists of a first driving strategy and a second driving strategy.
  • After the first strategy set and the second strategy set are obtained, a target strategy pair set can be determined from these two strategy sets.
  • For example, the driving strategy "the active party drives with a lateral offset of 1 m and an acceleration of -3 m/s²" and the driving strategy "the passive party rushes ahead of the active party" can form a feasible strategy pair.
  • Likewise, the driving strategy "the active party drives with a lateral offset of 0 m and an acceleration of 1 m/s²" and the driving strategy "the passive party rushes ahead of and yields to the active party" can form a feasible strategy pair.
  • When the passive party can find both a safe strategy for rushing ahead of the active party and a safe strategy for yielding to it, the feasible strategy pair composed of these driving strategies can be called a bilateral solution strategy pair.
  • For example, the feasible strategy pair consisting of the driving strategy "the active party drives with a lateral offset of 0 m and an acceleration of 1 m/s²" and the driving strategy "the passive party rushes ahead of and yields to the active party" can be called a bilateral solution strategy pair.
  • When the passive party can find only a safe strategy for rushing ahead of the active party, or only a safe strategy for yielding to it, the feasible strategy pair composed of these driving strategies can be called a unilateral solution strategy pair. For example, continue to refer to Table 2.
  • The driving strategy "the active party drives with a lateral offset of 1 m and an acceleration of -3 m/s²" and the driving strategy "the passive party rushes ahead of the active party" form a feasible strategy pair, which can be called a unilateral solution strategy pair.
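  • Forming the target strategy pair set and labelling each pair as bilateral or unilateral can be sketched as follows. The dictionary layout and the names `second_strategy_set` and `build_pairs` are illustrative assumptions; the five entries are the active-party strategies whose outcomes are stated in the examples above, not the complete Table 2.

```python
# Keys: active party's (lateral offset in m, acceleration in m/s^2).
# Values: options available to the passive party; an empty set means "no solution".
second_strategy_set = {
    (1.0, -3.0): {"rush"},
    (0.0, 1.0): {"rush", "yield"},
    (1.0, 1.0): set(),
    (0.0, 2.0): {"yield"},
    (0.0, -3.0): {"rush"},
}


def build_pairs(second_set):
    """Return (first_strategy, passive_options, kind) for every feasible pair."""
    pairs = []
    for first, options in second_set.items():
        if not options:
            continue  # no solution: this first strategy forms no feasible pair
        kind = "bilateral" if options == {"rush", "yield"} else "unilateral"
        pairs.append((first, frozenset(options), kind))
    return pairs


pairs = build_pairs(second_strategy_set)
```

  • Here the strategy (0 m, 1 m/s²) yields the only bilateral pair, and (1 m, 1 m/s²) is dropped because the passive party has no solution.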
  • S305 Determine the execution cost of each feasible strategy pair in the target strategy pair set, and obtain a first cost set.
  • the first cost set includes the execution cost of each feasible strategy pair.
  • The execution cost of each feasible strategy pair in the target strategy pair set can be analyzed to obtain the first cost set.
  • the first cost set includes the execution cost associated with each feasible strategy pair.
  • the cost associated with each feasible policy pair is used to indicate the cost of executing that feasible policy pair.
  • the execution cost associated with each feasible strategy pair may include one or more of the following: comfort cost, passability cost, offset cost, inconsistency cost, or decision penalty cost.
  • Comfort cost is used to characterize the comfort level when executing a feasible strategy pair. Among them, the higher the comfort cost, the lower the comfort level. In some embodiments, for the active side or the passive side, the smaller the acceleration change rate, the better the corresponding comfort, and the smaller the comfort cost.
  • For the active party, the change in its acceleration when executing the corresponding driving strategy, relative to its current acceleration, can be processed to obtain the comfort cost of the active party when executing the corresponding driving strategy.
  • For the passive party, the change in its acceleration when executing the corresponding driving strategy, relative to its current acceleration, can be processed to obtain the comfort cost of the passive party when executing the corresponding driving strategy.
  • The comfort cost of the active party and the comfort cost of the passive party can then be combined by a weighted sum (other calculation methods can also be used, such as choosing one of the two or averaging, which is not limited here) to obtain the comfort cost associated with the corresponding feasible strategy pair.
  • The current acceleration of the active party may refer to, but is not limited to: the acceleration of the active party observed when acquiring the first strategy set, the acceleration of the active party observed when solving the second strategy set, or the last observed acceleration of the active party before executing this method.
  • For example, suppose the vehicle 100 described in Figure 1 is the active party, the vehicle 200 is the passive party, the currently observed acceleration of the vehicle 100 is -0.67 m/s², and the currently observed acceleration of vehicle 200 is 0 m/s².
  • The first strategy set may be as shown in the aforementioned Table 1, and the second strategy set may be as shown in the aforementioned Table 2.
  • Consider a driving strategy of vehicle 100 under which the corresponding driving strategy of vehicle 200 is to both rush and yield.
  • When vehicle 200 rushes, its acceleration is 1.45 m/s²; when it yields, its acceleration is 0.69 m/s².
  • In this case, the comfort cost of vehicle 200 is the average of the comfort costs corresponding to the rush solution and the yield solution (other calculation methods can also be used, such as summation or choosing one of the two, which is not limited here).
  • The weights of the active party and the passive party are both 1. They can also take other values, which is not limited here.
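  • The comfort cost calculation can be sketched as follows. The preset comfort cost function is not reproduced in this text, so the sketch assumes a hypothetical linear function of the acceleration change, normalised by an assumed reference value `a_ref` and capped at 1. It also assumes that the active party's strategy is the 1 m/s², 0 m strategy used in the passability example. Only the accelerations -0.67, 0, 1.45 and 0.69 m/s² and the weights of 1 come from the example above.

```python
def comfort_cost(current_accel, strategy_accel, a_ref=4.0):
    """Hypothetical comfort cost: normalised acceleration change, capped at 1."""
    return min(abs(strategy_accel - current_accel) / a_ref, 1.0)


def pair_comfort_cost(active_current, active_accel, passive_current,
                      passive_accels, w_active=1.0, w_passive=1.0):
    """Weighted sum of the active and passive comfort costs.

    For a bilateral solution, `passive_accels` holds the rush and yield
    accelerations and the passive cost is their average.
    """
    active = comfort_cost(active_current, active_accel)
    passive = sum(comfort_cost(passive_current, a) for a in passive_accels)
    passive /= len(passive_accels)
    return w_active * active + w_passive * passive


cost = pair_comfort_cost(-0.67, 1.0, 0.0, [1.45, 0.69])
```

  • A smaller acceleration change gives a smaller cost, which is the property the description requires; the absolute values depend entirely on the assumed function.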
  • the passability cost is used to characterize the efficiency of the active party and/or the passive party passing through the conflict point between the two (ie, the conflict point between the two driving paths). Among them, the faster the time to pass the conflict point, the smaller the corresponding passability cost.
  • For the active party, based on the preset passability cost function, the time at which the active party passes the corresponding conflict point when executing the driving strategy in the feasible strategy pair (which can also be called the "first time") and the time at which the active party passes the corresponding conflict point under its current acceleration and speed (which can also be called the "second time") can be processed to obtain the passability cost of the active party when executing the corresponding driving strategy.
  • For the passive party, the time at which the passive party passes the corresponding conflict point when executing the driving strategy in the feasible strategy pair (which can also be called the "third time") and the time at which the passive party passes the corresponding conflict point under its current acceleration and speed (which can also be called the "fourth time") can be processed to obtain the passability cost of the passive party when executing the corresponding driving strategy.
  • The passability cost of the active party and the passability cost of the passive party can then be combined by a weighted sum (other calculation methods can also be used, such as choosing one of the two or averaging, which is not limited here) to obtain the passability cost associated with the corresponding feasible strategy pair.
  • In Figure 6, x is the horizontal axis and y is the vertical axis; the horizontal axis is the time difference between the time at which the conflict point is passed when executing the driving strategy and the time at which the conflict point is passed normally (that is, without executing the driving strategy).
  • For example, suppose the vehicle 100 depicted in Figure 1 is the active party and the vehicle 200 is the passive party.
  • The first strategy set may be as shown in the aforementioned Table 1, and the second strategy set may be as shown in the aforementioned Table 2. Under the driving strategy in which vehicle 100 has an acceleration of 1 m/s² and a lateral offset of 0 m, the corresponding driving strategy of vehicle 200 is to both rush and yield.
  • The weights of the active party and the passive party are both 1. They can also take other values, which is not limited here.
  • For the passability cost associated with other feasible strategy pairs, refer to the above calculation method, which will not be described again here.
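  • The passability cost can be sketched in the same spirit. The curve of Figure 6 is not available in this text, so the function below is a hypothetical stand-in: it maps the time difference between passing the conflict point under the strategy and passing it normally onto [0, 1] with a linear ramp whose slope is set by the assumed parameter `t_scale`. The example times are illustrative only.

```python
def passability_cost(t_strategy, t_nominal, t_scale=4.0):
    """Hypothetical passability cost as a function of the time difference.

    Passing earlier than normal lowers the cost; passing later raises it.
    """
    dt = t_strategy - t_nominal
    return min(max(0.5 + dt / t_scale, 0.0), 1.0)


def pair_passability_cost(active_times, passive_times,
                          w_active=1.0, w_passive=1.0):
    """Weighted sum over (strategy time, nominal time) of both parties."""
    return (w_active * passability_cost(*active_times)
            + w_passive * passability_cost(*passive_times))


# Illustrative: active party passes 1 s earlier, passive party 1 s later.
cost = pair_passability_cost((2.41, 3.41), (4.41, 3.41))
```

  • Any monotonically increasing function of the time difference would satisfy the stated property that passing the conflict point sooner gives a smaller cost.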
  • the offset cost is used to characterize the evaluation of the offset of the active party and/or the passive party when executing the corresponding driving strategy. Among them, when offset is not required, the offset cost is 0. Of course, it can also take other values, which are not limited here; when offset is required, it can be based on the preset offset and offset cost. mapping relationship, and the offset when the active party executes the corresponding driving strategy, determine the offset cost of the active party when executing the corresponding driving strategy, and/or, based on the preset offset and offset The mapping relationship between costs and the offset when the passive party executes the corresponding driving strategy determines the offset cost of the passive party when executing the corresponding driving strategy. Finally, the offset cost associated with a certain feasible strategy pair can be obtained based on the determined offset cost of the active party when executing the corresponding driving strategy and the determined offset cost of the passive party when executing the corresponding traveling strategy.
  • For example, when the offset is 0 m, the offset cost is 0; when the offset is 1 m or -1 m, the offset cost is 0.3.
  • For example, suppose vehicle 100 is the active party and vehicle 200 is the passive party, the driving strategy corresponding to vehicle 100 is a driving strategy in the aforementioned Table 1, and the driving strategy corresponding to vehicle 200 is a driving strategy in the aforementioned Table 2.
  • The weights of the active party and the passive party are both 1. They can also take other values, which is not limited here.
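  • The offset cost lookup can be sketched as a small mapping table. The mapping values (0 for no offset, 0.3 for an offset of 1 m or -1 m) and the weights of 1 are taken from the example above; the names `OFFSET_COST` and `pair_offset_cost`, and the use of a weighted sum to combine the two parties, are assumptions.

```python
# Preset mapping between lateral offset (m) and offset cost, per the example.
OFFSET_COST = {0.0: 0.0, 1.0: 0.3, -1.0: 0.3}


def pair_offset_cost(active_offset, passive_offset, w_active=1.0, w_passive=1.0):
    """Combine both parties' offset costs (weighted sum is an assumption)."""
    return (w_active * OFFSET_COST[active_offset]
            + w_passive * OFFSET_COST[passive_offset])


cost = pair_offset_cost(1.0, 0.0)
```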
  • the inconsistency cost is used to characterize the evaluation of the deviation between the driving behavior of the target object in the feasible strategy pair and the actual driving behavior of the target object. It is mainly determined based on the probability of the passive party overtaking the active party as the input. Among them, for the probability that the passive party overtakes the active party, the relative position and relative speed of the passive party and the active party, the speed, position, acceleration of the passive party, historical observation data, etc. can be input into the preset model to obtain The probability that the passive party will overtake the active party.
  • When solving the driving strategy of the passive party, if a bilateral solution is obtained, the inconsistency cost can be 0.5; it can also take other values, which is not limited here.
  • If only a solution in which the passive party rushes ahead of the active party is obtained, the inconsistency cost can be: 1 minus the probability that the passive party rushes ahead of the active party.
  • If only a solution in which the passive party yields to the active party is obtained, the inconsistency cost can be: the probability that the passive party rushes ahead of the active party.
  • That is, when the driving strategy of the target object in a strategy pair is only to rush ahead of the target vehicle or only to yield to it, the target probability that the target object rushes ahead of the target vehicle can be determined first, and the inconsistency cost associated with the strategy pair is then determined based on the target probability. When the driving strategy of the target object in a strategy pair is both to rush ahead of the target vehicle and to yield to it, the inconsistency cost associated with the strategy pair can be determined to be a preset value.
  • For example, suppose the probability that the passive party rushes ahead of the active party is 0.7, the vehicle 100 described in Figure 1 is the active party, the vehicle 200 is the passive party, the first strategy set is as shown in the aforementioned Table 1, and the second strategy set is as shown in the aforementioned Table 2.
  • Here, 0.3 is the inconsistency cost weight, which can be set based on the actual situation.
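  • The inconsistency cost rules can be sketched as follows. The rush probability of 0.7, the preset value of 0.5 for a bilateral solution and the weight of 0.3 come from the text above; applying the weight as a simple multiplication, and the function name `inconsistency_cost`, are assumptions.

```python
def inconsistency_cost(options, p_rush, bilateral_value=0.5, weight=0.3):
    """Weighted inconsistency cost for one feasible strategy pair.

    `options` is the set of actions available to the passive party and
    `p_rush` the predicted probability that it rushes ahead.
    """
    if options == {"rush", "yield"}:
        raw = bilateral_value      # intention unclear: preset value
    elif options == {"rush"}:
        raw = 1.0 - p_rush         # low when rushing is likely
    elif options == {"yield"}:
        raw = p_rush               # low when rushing is unlikely
    else:
        raise ValueError("no feasible passive strategy")
    return weight * raw


costs = {name: inconsistency_cost(opts, 0.7)
         for name, opts in [("rush", {"rush"}), ("yield", {"yield"}),
                            ("both", {"rush", "yield"})]}
```

  • With a rush probability of 0.7, a pair in which the passive party rushes is the most consistent with the prediction and therefore receives the lowest cost of the three.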
  • The decision penalty cost is used to characterize the evaluation of whether the intention of the passive party is clear. It can be the penalty cost corresponding to whether the driving strategy of the passive party, solved based on the driving strategy of the active party, is a unilateral solution or a bilateral solution. When the solution obtained is a bilateral solution, the intention of the passive party is not yet clear. Therefore, the penalty can be set to be high when the solution obtained is a unilateral solution and low when the solution obtained is a bilateral solution.
  • the decision penalty cost corresponding to the unilateral solution and the bilateral solution can be preset, that is, the decision penalty rule is preset, and then the corresponding decision penalty cost is determined based on the driving strategy of the passive party.
  • the decision penalty cost corresponding to the unilateral solution can be set to 1, and the decision penalty cost corresponding to the bilateral solution can be set to 0.
  • the solution obtained for the passive party is a bilateral solution
  • if the difference between the active party's acceleration under the driving strategy and its current acceleration is large, the active party's acceleration will change sharply when it executes the driving strategy, which degrades the driving experience. Therefore, the penalty for this bilateral solution can be increased. For example, the difference between the active party's acceleration under the driving strategy and its current acceleration can be processed with a preset penalty cost function to obtain a decision penalty cost for the corresponding feasible strategy pair. The final decision penalty cost is then determined from this penalty cost and the decision penalty cost corresponding to the passive party's solution.
  • the vehicle 100 described in Figure 1 is the active party, and the vehicle 200 is the passive party, and the first strategy set may be as shown in the aforementioned Table 1, and the second strategy set may be as shown in the aforementioned Table 2.
  • certain weight values can also be set.
  • the weight of the basic penalty cost is 0.4
  • the weight of the penalty cost based on the acceleration change is 0.6, etc.
  • the specifics can be determined according to the actual situation and are not limited here.
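The combination of a base penalty and an acceleration-change penalty can be sketched as below. The 1/0 base penalties and the 0.4/0.6 weights are the example values given above; the clipped linear penalty on the acceleration difference, the `max_accel_delta` normalizer and the function name are assumptions, because the preset penalty cost function itself is not specified here.

```python
def decision_penalty_cost(is_bilateral, strategy_accel, current_accel,
                          base_weight=0.4, accel_weight=0.6,
                          max_accel_delta=4.0):
    """Weighted decision penalty for one feasible strategy pair (a sketch)."""
    # Base penalty from the example: 1 for a unilateral solution, 0 for a bilateral one.
    base_penalty = 0.0 if is_bilateral else 1.0
    # Assumed: only bilateral solutions receive the acceleration-change penalty,
    # modelled as a linear function of |delta a| clipped to [0, 1].
    accel_penalty = 0.0
    if is_bilateral:
        delta = abs(strategy_accel - current_accel)
        accel_penalty = min(delta / max_accel_delta, 1.0)
    return base_weight * base_penalty + accel_weight * accel_penalty


unilateral = decision_penalty_cost(False, strategy_accel=1.0, current_accel=0.0)
gentle_probe = decision_penalty_cost(True, strategy_accel=0.5, current_accel=0.0)
harsh_probe = decision_penalty_cost(True, strategy_accel=3.0, current_accel=0.0)
```

Under these assumptions a gentle probing strategy is cheaper than a harsh one, which matches the stated goal of avoiding large acceleration changes while the passive party's intention is unclear.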
  • one or more of the costs described above can be selected; the specific choice can be determined according to the actual situation and is not limited here. When several costs are selected, they can be combined as a weighted sum or an average, and the result is used as the cost associated with the corresponding feasible strategy pair. In addition, when a driving strategy in the second strategy set has no solution, the cost associated with the feasible strategy pair corresponding to that driving strategy can be set to infinity, that is, executing that feasible strategy pair is treated as prohibitively expensive.
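A minimal sketch of the aggregation step, assuming the individual costs have already been computed. The cost names, weight values and the `pair_execution_cost` function are illustrative and not taken from the source.

```python
import math


def pair_execution_cost(costs, weights, has_solution=True):
    """Weighted sum of the selected costs, or infinity when there is no solution."""
    if not has_solution:
        return math.inf
    missing = set(costs) - set(weights)
    if missing:
        raise KeyError(f"no weight configured for: {sorted(missing)}")
    return sum(weights[name] * value for name, value in costs.items())


weights = {"comfort": 0.5, "passability": 0.3, "decision_penalty": 0.2}
costs = {"comfort": 0.2, "passability": 0.5, "decision_penalty": 1.0}
total = pair_execution_cost(costs, weights)
unsolvable = pair_execution_cost(costs, weights, has_solution=False)
```

Using `math.inf` for unsolvable pairs means they can stay in the cost set without ever being selected by a minimum search.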
  • S306 After obtaining the first cost set, S306 can be executed.
  • the target driving strategy is the driving strategy of the target vehicle associated with the lowest execution cost in the first cost set.
  • the target driving strategy can be determined based on the first cost set. Since the feasible strategy pair with the lowest execution cost in the first cost set is the cheapest one to execute, the driving strategy of the target vehicle associated with that lowest execution cost can be used as the target driving strategy.
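Selecting the lowest-cost pair can be sketched as a minimum search over the first cost set. The dictionary layout (strategy pair as key, execution cost as value) and the strategy labels are assumptions made for illustration.

```python
import math


def select_lowest_cost_pair(first_cost_set):
    """Return (pair, cost) for the feasible strategy pair with the lowest cost."""
    if not first_cost_set:
        raise ValueError("the first cost set is empty")
    return min(first_cost_set.items(), key=lambda item: item[1])


first_cost_set = {
    ("accelerate", "yield_only"): 0.31,
    ("keep_speed", "rush_or_yield"): 0.42,
    ("decelerate", "no_solution"): math.inf,
}
best_pair, best_cost = select_lowest_cost_pair(first_cost_set)
```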
  • the target vehicle can be controlled to execute a driving strategy opposite to that of the target object. For example, when the target object's driving strategy is to rush the target vehicle, the target vehicle can be controlled to decelerate quickly and yield to the target object,
  • that is, the target vehicle's driving strategy at this time is to yield to the target object; when the target object's driving strategy is to yield to the target vehicle, the target vehicle can be controlled to accelerate quickly and pass ahead of the target object, that is, the target vehicle's driving strategy at this time is to rush the target object. This allows the target vehicle to respond quickly to the target object's driving intention.
  • the passive party can both rush and yield to the active party: if the target vehicle is the active party and the target object is the passive party, the target object's driving intention is not clear, that is, the target object may either rush or yield to the target vehicle. If the target vehicle rashly commits to a rush decision or a yield decision at this point, hard braking/brake tapping or even a takeover is likely, resulting in a poor driving experience.
  • the target vehicle can be controlled according to the first driving strategy of the feasible strategy pair corresponding to the lowest cost in the first cost set in order to probe the driving intention of the target object; that is, the first driving strategy in the feasible strategy pair corresponding to the lowest cost in the first cost set is used as the driving strategy of the target vehicle.
  • since the target vehicle is the active party at this time, the target vehicle has a higher right of way than the target object.
  • since the party with the lower right of way usually needs to yield to the party with the higher right of way during driving, the target vehicle can be controlled to accelerate, decelerate or drive at a constant speed to probe the target object's driving intention.
  • the change in the speed of the target vehicle can be controlled to be less than the first threshold, so that the target vehicle can slowly accelerate or decelerate, etc.
  • the passive party can both rush and yield to the active party: if the target vehicle is the passive party and the target object is the active party, the driving intention is not clear, that is, either party may end up rushing or yielding. If the target vehicle rashly commits to a rush decision or a yield decision at this point, hard braking/brake tapping or even a takeover is likely, resulting in a poor driving experience.
  • since the target vehicle is the passive party at this time, the target vehicle has a lower right of way than the target object.
  • the target vehicle can be controlled to slow down to probe the target object's driving intention. That is to say, the driving strategy of the target vehicle at this time is: yield to the target object.
  • when controlling the target vehicle to decelerate, in order to reduce the discomfort of the occupants, the change in the speed of the target vehicle can be kept below the second threshold, so that the target vehicle decelerates slowly.
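The two cases above (target vehicle as active party, target vehicle as passive party) can be summarized in a small decision function. This is a sketch: the enum, the string labels `"rush"` and `"yield"`, and the function name are illustrative, and `first_strategy` stands for whatever representation the active party's lowest-cost first driving strategy has.

```python
from enum import Enum


class PassiveSolution(Enum):
    RUSH_ONLY = "rush_only"          # unilateral: passive party only rushes
    YIELD_ONLY = "yield_only"        # unilateral: passive party only yields
    RUSH_OR_YIELD = "rush_or_yield"  # bilateral: intention not yet clear


def target_driving_strategy(ego_is_active, passive_solution, first_strategy):
    """Map the lowest-cost pair to the target vehicle's driving strategy."""
    if ego_is_active:
        # The target object is the passive party: do the opposite of a clear
        # intention, otherwise keep probing with the first driving strategy.
        if passive_solution is PassiveSolution.RUSH_ONLY:
            return "yield"
        if passive_solution is PassiveSolution.YIELD_ONLY:
            return "rush"
        return first_strategy
    # The target vehicle is the passive party: follow its own unilateral
    # solution, and yield (decelerate gently) when the solution is bilateral.
    if passive_solution is PassiveSolution.RUSH_ONLY:
        return "rush"
    return "yield"


probe = {"accel_mps2": 1.0, "lateral_offset_m": 0.0}
as_active = target_driving_strategy(True, PassiveSolution.RUSH_OR_YIELD, probe)
as_passive = target_driving_strategy(False, PassiveSolution.RUSH_OR_YIELD, probe)
```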
  • the active party's longitudinal decision on the passive party is to yield.
  • when, in the feasible strategy pair, the passive party's driving strategy is a unilateral solution in which the passive party yields to the active party, the active party's longitudinal decision on the passive party is to rush.
  • when, in the feasible strategy pair, the passive party's driving strategy is a bilateral solution, the active party's longitudinal decision on the passive party is a critical rush/yield, that is, the two continue the game.
  • both the maximum and the minimum extent of the target vehicle's avoidance can be constrained laterally, and both the maximum and the minimum of the target vehicle's speed range can be constrained longitudinally, that is, bilateral constraints are applied.
  • when the driving strategy of the passive party in the feasible strategy pair is a unilateral solution,
  • either the maximum or the minimum extent of the target vehicle's avoidance is constrained laterally, and
  • either the maximum or the minimum of the target vehicle's speed range is constrained longitudinally, that is, a unilateral constraint is applied.
  • the target vehicle can be controlled according to the target driving strategy.
  • when the target driving strategy is that the target vehicle rushes the target object, the target vehicle is controlled to rush the target object; when the target driving strategy is that the target vehicle yields to the target object, the target vehicle is controlled to yield to the target object;
  • when the target driving strategy is the driving strategy of the target vehicle associated with the lowest execution cost in the first cost set, the target vehicle is controlled to execute that driving strategy, and after a preset time interval, one or more of the aforementioned steps S301 to S307 are executed again.
  • this loop continues until the driving strategy of the target vehicle is determined to be only rushing the target object or only yielding to the target object.
  • the target driving strategy is the driving strategy of the target vehicle associated with the lowest execution cost in the first cost set, and that driving strategy is an acceleration of 1 m/s² and a lateral offset of 0 m,
  • the target vehicle is controlled to drive with an acceleration of 1 m/s² and a lateral offset of 0 m.
  • the target vehicle can make a critical rush/yield decision, that is, the target vehicle can continue to move forward and probe the target object's driving intention.
  • the target vehicle can make corresponding adjustments based on the target object's driving intention, such as rushing to pass the target object, or yielding to the target object, etc.
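The probe-and-replan loop described above can be sketched as follows. The callables `plan_once` (standing in for one pass through steps S301 to S306) and `apply_control` (standing in for S307), the interval and the cycle limit are all hypothetical placeholders; a real system would supply its own planner and controller.

```python
import time


def probe_until_decided(plan_once, apply_control, interval_s=0.1,
                        max_cycles=100, sleep=time.sleep):
    """Re-plan at a fixed interval until the decision is a clear rush or yield."""
    for cycle in range(1, max_cycles + 1):
        strategy = plan_once()       # lowest-cost target driving strategy
        apply_control(strategy)      # control the target vehicle accordingly
        if strategy in ("rush", "yield"):
            return strategy, cycle   # intention is clear: stop probing
        sleep(interval_s)            # wait the preset interval, then re-plan
    return None, max_cycles          # still undecided after max_cycles


# Simulated planner: two probing strategies, then a clear decision.
plans = iter([
    {"accel_mps2": 1.0, "lateral_offset_m": 0.0},
    {"accel_mps2": 0.5, "lateral_offset_m": 0.0},
    "yield",
])
applied = []
decision, cycles = probe_until_decided(lambda: next(plans), applied.append,
                                       sleep=lambda _seconds: None)
```

Injecting `sleep` keeps the sketch testable without real waiting; the cycle limit is a safeguard added for the example and is not part of the described method.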
  • the planning results of lateral planning/longitudinal planning can be accurately controlled through bilateral constraints, and as the decision constraints change continuously from uncertain bilateral constraints to deterministic unilateral constraints,
  • the autonomous driving system can make human-like, flexible and continuous decisions in game scenarios. The target vehicle is controlled to accelerate or decelerate slowly while the target object's driving intention is unclear, and to respond quickly once that intention is clear. This avoids the planning jumps that a one-shot rush/yield decision can cause through decision jumps during the interaction, which may lead to false braking/hard braking or a sudden emergency takeover of the whole autonomous driving system, and thereby improves the driving experience.
  • the sequence numbers of the steps in the above embodiment do not imply an order of execution.
  • the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
  • each step in the above embodiments may be selectively executed according to actual conditions, may be partially executed, or may be executed in full, which is not limited here.
  • the embodiment of the present application provides a control device.
  • Figure 8 shows a control device.
  • the control device 800 may include: a dividing module 810 and a processing module 820 .
  • the dividing module 810 is used to divide the target vehicle and the target object into active parties and passive parties, where the active party has priority over the passive party, and there is a possibility of collision between the target vehicle and the target object.
  • the processing module 820 is configured to obtain a first strategy set feasible for the active party, where the first strategy set includes at least one first driving strategy.
  • the processing module 820 is also configured to obtain, based on each first driving strategy in the first strategy set, the driving parameters of the active party at the current moment and the driving parameters of the passive party at the current moment, the second driving strategy of the passive party under each first driving strategy in the first strategy set, so as to obtain the second strategy set, in which the second driving strategy is any of the following: the passive party only rushes the active party, the passive party only yields to the active party, or the passive party can both rush and yield to the active party.
  • the processing module 820 is also configured to determine a target strategy pair set according to the first strategy set and the second strategy set.
  • the target strategy pair set includes at least one feasible strategy pair, and each feasible strategy pair consists of a first driving strategy and a second driving strategy.
  • the processing module 820 is also configured to determine the execution cost of each feasible strategy pair in the target strategy pair set to obtain a first cost set, which includes the execution cost of each feasible strategy pair.
  • the processing module 820 is also configured to determine a target driving strategy according to the first cost set, where the target driving strategy is the driving strategy of the target vehicle associated with the lowest execution cost in the first cost set.
  • the processing module 820 is also used to control the target vehicle according to the target driving strategy.
  • if the target vehicle is the active party and the target object is the passive party, the processing module 820, when determining the target driving strategy according to the first cost set, is specifically used to: if the second driving strategy in the feasible strategy pair corresponding to the target cost is that the passive party only rushes the active party, determine that the target driving strategy is that the target vehicle yields to the target object, where the target cost is the lowest execution cost in the first cost set; if the second driving strategy in the feasible strategy pair corresponding to the target cost is that the passive party only yields to the active party, determine that the target driving strategy is that the target vehicle rushes the target object; if the second driving strategy in the feasible strategy pair corresponding to the target cost is that the passive party can both rush and yield to the active party, determine that the target driving strategy is the first driving strategy in the feasible strategy pair corresponding to the target cost.
  • if the target vehicle is the passive party and the target object is the active party, the processing module 820, when determining the target driving strategy according to the first cost set, is specifically used to: if the second driving strategy in the feasible strategy pair corresponding to the target cost is that the passive party only rushes the active party, determine that the target driving strategy is that the target vehicle rushes the target object, where the target cost is the lowest execution cost in the first cost set; if the second driving strategy in the feasible strategy pair corresponding to the target cost is that the passive party only yields to the active party, determine that the target driving strategy is that the target vehicle yields to the target object; if the second driving strategy in the feasible strategy pair corresponding to the target cost is that the passive party can both rush and yield to the active party, determine that the target driving strategy is that the target vehicle yields to the target object.
  • the execution cost associated with the feasible strategy pair includes one or more of the following: a comfort cost, used to characterize the comfort level when executing the feasible strategy pair; a passability cost, used to characterize the efficiency with which the active party and/or the passive party passes the conflict point between the two; an offset cost, used to characterize the evaluation of the offset of the active party and/or the passive party when executing the corresponding driving strategy; an inconsistency cost, used to characterize the evaluation of the deviation between the target object's driving behavior in the feasible strategy pair and the target object's actual driving behavior; or a decision penalty cost, used to characterize the evaluation of whether the intention of the passive party is clear.
  • the processing module 820, when determining the comfort cost associated with the first strategy pair, is specifically configured to: determine, according to the acceleration corresponding to the first driving strategy in the first strategy pair and the current acceleration of the active party, the first comfort cost of the active party when executing the first driving strategy in the first strategy pair; determine, according to the acceleration corresponding to the second driving strategy in the first strategy pair and the current acceleration of the passive party, the second comfort cost of the passive party when executing the second driving strategy in the first strategy pair; and determine the comfort cost associated with the first strategy pair based on the first comfort cost and the second comfort cost.
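A minimal sketch of a pairwise comfort cost, assuming the cost grows linearly with the change in acceleration and is clipped to [0, 1]. The actual comfort cost function is defined elsewhere in the document (Figure 5) and is not reproduced here, so the linear form, `max_delta` and the equal weighting are assumptions.

```python
def single_comfort_cost(strategy_accel, current_accel, max_delta=4.0):
    """Cost in [0, 1] that grows with the change in acceleration (assumed linear)."""
    return min(abs(strategy_accel - current_accel) / max_delta, 1.0)


def pair_comfort_cost(active, passive, active_weight=0.5):
    """Combine both costs; each argument is (strategy_accel, current_accel) in m/s^2."""
    first = single_comfort_cost(*active)     # first comfort cost (active party)
    second = single_comfort_cost(*passive)   # second comfort cost (passive party)
    return active_weight * first + (1.0 - active_weight) * second


cost = pair_comfort_cost(active=(1.0, 0.0), passive=(-2.0, 0.0))
```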
  • the processing module 820, when determining the passability cost associated with the first strategy pair, is specifically configured to: determine, based on a first time and a second time, the first passability cost of the active party when executing the first driving strategy in the first strategy pair, where the first time is the time at which the active party passes the target point when executing the first driving strategy in the first strategy pair, the second time is the time at which the active party passes the target point with its current speed and acceleration, and the target point is the conflict point between the active party's driving path and the passive party's driving path; determine, based on a third time and a fourth time, the second passability cost of the passive party when executing the second driving strategy in the first strategy pair, where the third time is the time at which the passive party passes the target point when executing the second driving strategy in the first strategy pair, and the fourth time is the time at which the passive party passes the target point with its current speed and acceleration; and determine the passability cost associated with the first strategy pair based on the first passability cost and the second passability cost.
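The times in this step can be estimated with constant-acceleration kinematics, and the passability cost derived from the resulting delay. This is a sketch under assumptions: the constant-acceleration model, the clipped linear delay cost, `max_delay_s` and the averaging of the two parties' costs are illustrative choices, not the document's cost function (Figure 6).

```python
import math


def time_to_point(distance_m, speed_mps, accel_mps2):
    """Time to cover distance_m at constant acceleration; inf if never reached."""
    if distance_m <= 0:
        return 0.0
    if abs(accel_mps2) < 1e-9:
        return distance_m / speed_mps if speed_mps > 0 else math.inf
    disc = speed_mps ** 2 + 2.0 * accel_mps2 * distance_m
    if disc < 0:                       # the party stops before the conflict point
        return math.inf
    t = (-speed_mps + math.sqrt(disc)) / accel_mps2
    return t if t >= 0 else math.inf


def passability_cost(t_strategy, t_baseline, max_delay_s=5.0):
    """Cost in [0, 1] for the delay of the strategy relative to the baseline."""
    if math.isinf(t_strategy):
        return 1.0
    delay = t_strategy - t_baseline
    return min(max(delay / max_delay_s, 0.0), 1.0)


# Active party: 12 m from the conflict point at 4 m/s, currently accelerating at 2 m/s^2.
first_time = time_to_point(12.0, 4.0, 0.0)    # under a "keep speed" strategy
second_time = time_to_point(12.0, 4.0, 2.0)   # with current speed and acceleration
first_cost = passability_cost(first_time, second_time)
```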
  • the processing module 820, when determining the offset cost associated with the first strategy pair, is specifically configured to: determine, based on a preset mapping relationship between offset and offset cost and on the offset of the active party when executing the first driving strategy in the first strategy pair, the first offset cost of the active party when executing the first driving strategy in the first strategy pair; determine, based on the preset mapping relationship between offset and offset cost and on the offset of the passive party when executing the second driving strategy in the first strategy pair, the second offset cost of the passive party when executing the second driving strategy in the first strategy pair; and determine the offset cost associated with the first strategy pair based on the first offset cost and the second offset cost.
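A preset mapping between offset and offset cost could be stored as a lookup table and interpolated. The table values, the linear interpolation and the summation of the two parties' costs below are assumptions for illustration; the source only states that a preset mapping exists.

```python
from bisect import bisect_right

# Hypothetical preset mapping: (lateral offset in metres, offset cost).
OFFSET_TO_COST = [(0.0, 0.0), (0.5, 0.1), (1.0, 0.4), (2.0, 1.0)]


def offset_cost(offset_m, table=OFFSET_TO_COST):
    """Piecewise-linear lookup of the offset cost for a lateral offset."""
    x = abs(offset_m)
    xs = [point[0] for point in table]
    if x <= xs[0]:
        return table[0][1]
    if x >= xs[-1]:
        return table[-1][1]
    i = bisect_right(xs, x)
    (x0, y0), (x1, y1) = table[i - 1], table[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def pair_offset_cost(active_offset_m, passive_offset_m):
    """Sum of the first (active) and second (passive) offset costs."""
    return offset_cost(active_offset_m) + offset_cost(passive_offset_m)


cost = pair_offset_cost(0.75, 0.0)
```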
  • the processing module 820, when determining the inconsistency cost associated with the first strategy pair, is specifically used to: when the driving strategy of the target object in the first strategy pair is to only rush the target vehicle or to only yield to the target vehicle, determine the target probability that the target object rushes the target vehicle, and determine the inconsistency cost based on the target probability; when the driving strategy of the target object in the first strategy pair is that it can both rush and yield to the target vehicle, determine that the inconsistency cost is a preset cost value.
  • the processing module 820, when determining the decision penalty cost associated with the first strategy pair, is specifically configured to: determine the decision penalty cost associated with the first strategy pair based on a preset decision penalty rule and the second driving strategy in the first strategy pair.
  • the second driving strategy in the first strategy pair is that the passive party can both rush and yield to the active party; the processing module 820, when determining the decision penalty cost associated with the first strategy pair based on the preset decision penalty rule and the second driving strategy in the first strategy pair, is specifically used to: determine a first decision penalty cost based on the decision penalty rule and the second driving strategy in the first strategy pair; determine, according to the acceleration corresponding to the first driving strategy in the first strategy pair and the current acceleration of the active party, a second decision penalty cost of the active party when executing the first driving strategy in the first strategy pair; and determine the decision penalty cost associated with the first strategy pair according to the first decision penalty cost and the second decision penalty cost.
  • embodiments of the present application provide a vehicle, which may include the control device 800 shown in FIG. 8 .
  • embodiments of the present application provide a computer-readable storage medium.
  • the computer-readable storage medium stores a computer program.
  • when the computer program is run on a processor, it causes the processor to execute the methods in the above embodiments.
  • embodiments of the present application provide a computer program product, which when the computer program product is run on a processor, causes the processor to execute the methods in the above embodiments.
  • processors in the embodiments of the present application can be a central processing unit (CPU), or other general-purpose processor, digital signal processor (DSP), or application-specific integrated circuit (application specific integrated circuit, ASIC), field programmable gate array (field programmable gate array, FPGA) or other programmable logic devices, transistor logic devices, hardware components or any combination thereof.
  • a general-purpose processor can be a microprocessor or any conventional processor.
  • the method steps in the embodiments of the present application can be implemented by hardware or by a processor executing software instructions.
  • Software instructions can be composed of corresponding software modules.
  • the software modules can be stored in random access memory (RAM), flash memory, read-only memory (ROM), programmable read-only memory (programmable ROM, PROM), erasable programmable read-only memory (erasable PROM, EPROM), electrically erasable programmable read-only memory (electrically EPROM, EEPROM), registers, a hard disk, a removable hard disk, a CD-ROM, or any other form of storage medium well known in the art.
  • an exemplary storage medium is coupled to the processor such that the processor can read information from the storage medium and write information to the storage medium.
  • the storage medium can also be an integral part of the processor.
  • the processor and storage media may be located in an ASIC.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device.
  • the computer instructions may be stored in or transmitted over a computer-readable storage medium.
  • the computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center through wired (such as coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (such as infrared, radio, microwave, etc.) means.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server, data center, etc. that contains one or more available media integrated.
  • the available media may be magnetic media (eg, floppy disk, hard disk, magnetic tape), optical media (eg, DVD), or semiconductor media (eg, solid state disk (SSD)), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Automation & Control Theory (AREA)
  • Human Computer Interaction (AREA)
  • Transportation (AREA)
  • Mechanical Engineering (AREA)
  • Traffic Control Systems (AREA)

Abstract

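A control method: when there is a possibility of collision between a target vehicle and a target object, the target vehicle and the target object are divided into an active party and a passive party, where the active party has the right of way over the passive party; a first strategy set feasible for the active party is obtained, the first strategy set including at least one first driving strategy; based on the first strategy set, a second driving strategy of the passive party under each first driving strategy in the first strategy set is determined to obtain a second strategy set; the execution costs of the feasible strategy pairs obtained from the first strategy set and the second strategy set are analyzed to obtain a first cost set, where each feasible strategy pair consists of one first driving strategy and one second driving strategy, and the first cost set includes, for each feasible strategy pair, an associated cost indicating the cost of executing that pair; and the target vehicle is controlled according to the driving strategy corresponding to the lowest execution cost in the first cost set. This improves the driving experience.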

Description

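Control method, device and vehicle
This application claims priority to Chinese patent application No. 202211077102.4, entitled "Control method, device and vehicle", filed with the China National Intellectual Property Administration on September 5, 2022, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the technical field of artificial intelligence (AI), and in particular to a control method, device and vehicle.
Background
Autonomous driving is a mainstream application in the field of artificial intelligence. Autonomous driving technology relies on the cooperation of computer vision, radar, monitoring devices, global positioning systems and the like to let a vehicle drive itself without active human operation. In the field of autonomous driving, a vehicle can execute a driving strategy suited to the actual driving scenario to ensure safe driving. At present, however, when the trajectories of multiple vehicles conflict, a vehicle often executes driving strategies such as erroneous deceleration or erroneous acceleration, which compromises safety and the driving experience.
Summary
This application provides a control method, device, vehicle, computer storage medium and computer program product that allow a vehicle to keep probing the driving intention of other objects and to make the final rush or yield decision only once that intention is clear, thereby achieving the effect of rushing with gentle acceleration or yielding with gentle deceleration and improving the driving experience.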
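In a first aspect, this application provides a control method. The method includes: dividing a target vehicle and a target object into an active party and a passive party, where the active party has the right of way over the passive party and there is a possibility of collision between the target vehicle and the target object; obtaining a first strategy set feasible for the active party, the first strategy set including at least one first driving strategy; obtaining, based on each first driving strategy in the first strategy set, the driving parameters of the active party at the current moment and the driving parameters of the passive party at the current moment, a second driving strategy of the passive party under each first driving strategy in the first strategy set, so as to obtain a second strategy set, where the second driving strategy is any one of the following: the passive party only rushes the active party, the passive party only yields to the active party, or the passive party can both rush and yield to the active party; determining a target strategy pair set based on the first strategy set and the second strategy set, the target strategy pair set including at least one feasible strategy pair, each feasible strategy pair consisting of one first driving strategy and one second driving strategy; determining the execution cost of each feasible strategy pair in the target strategy pair set to obtain a first cost set, the first cost set including the execution cost of each feasible strategy pair; determining a target driving strategy based on the first cost set, the target driving strategy being the driving strategy of the target vehicle associated with the lowest execution cost in the first cost set; and controlling the target vehicle according to the target driving strategy. For example, the driving parameters at the current moment may be, but are not limited to, the driving parameters observed when the first strategy set is obtained, the driving parameters observed when the second strategy set is solved, or the driving parameters most recently observed before the method is executed.
In this way, when the target vehicle and the target object may conflict, the two can be divided into an active party and a passive party, the feasible strategies of the passive party are solved from the feasible strategies of the active party, the execution costs of the feasible strategies of both are then calculated, and finally the feasible strategy with the lowest cost is selected to control the target vehicle. The target vehicle can thus keep probing the driving intention of other objects and make the final rush or yield decision only once that intention is clear, achieving the effect of rushing with gentle acceleration or yielding with gentle deceleration and improving the driving experience.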
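In a possible implementation, if the target vehicle is the active party and the target object is the passive party, determining the target driving strategy based on the first cost set specifically includes: if the second driving strategy in the feasible strategy pair corresponding to a target cost is that the passive party only rushes the active party, determining that the target driving strategy is that the target vehicle yields to the target object, where the target cost is the lowest execution cost in the first cost set; if the second driving strategy in the feasible strategy pair corresponding to the target cost is that the passive party only yields to the active party, determining that the target driving strategy is that the target vehicle rushes the target object; and if the second driving strategy in the feasible strategy pair corresponding to the target cost is that the passive party can both rush and yield to the active party, determining that the target driving strategy is the first driving strategy in the feasible strategy pair corresponding to the target cost.
In this way, if the target vehicle is the active party and the target object is the passive party, then when the driving strategy of the target object is exactly one of rushing or yielding to the target vehicle, the driving intention of the target object is clear, and the driving strategy of the target vehicle can be determined to be the opposite of that of the target object: when the target object's driving strategy is to rush the target vehicle, the target vehicle's driving strategy is determined to be yielding to the target object, and when the target object's driving strategy is to yield to the target vehicle, the target vehicle's driving strategy is determined to be rushing the target object. When the target object's driving strategy is that it can both rush and yield to the target vehicle, the driving intention of the target object is not yet clear, and the target vehicle can continue to be controlled to probe that intention. Moreover, the target vehicle is the active party here, the active party's right of way is higher than the passive party's, and during driving it is usually the party with the lower right of way that needs to yield to the party with the higher right of way. Therefore, the target vehicle can be controlled to execute the first driving strategy in the feasible strategy pair corresponding to the target cost; that is, the driving strategy of the target vehicle is determined to be the first driving strategy in the feasible strategy pair corresponding to the target cost.
In a possible implementation, if the target vehicle is the passive party and the target object is the active party, determining the target driving strategy based on the first cost set specifically includes: if the second driving strategy in the feasible strategy pair corresponding to a target cost is that the passive party only rushes the active party, determining that the target driving strategy is that the target vehicle rushes the target object, where the target cost is the lowest execution cost in the first cost set; if the second driving strategy in the feasible strategy pair corresponding to the target cost is that the passive party only yields to the active party, determining that the target driving strategy is that the target vehicle yields to the target object; and if the second driving strategy in the feasible strategy pair corresponding to the target cost is that the passive party can both rush and yield to the active party, determining that the target driving strategy is that the target vehicle yields to the target object.
In this way, if the target vehicle is the passive party and the target object is the active party, then when the driving strategy of the target vehicle is to rush the target object, the target vehicle has only one driving strategy available, so that strategy can be determined to be the one the target vehicle needs to execute. Likewise, when the driving strategy of the target vehicle is to yield to the target object, the target vehicle has only one driving strategy available, so that strategy can be determined to be the one the target vehicle needs to execute. When the driving strategy of the target vehicle is that it can both rush and yield to the target object, the driving intention of the target object is not yet clear; if the target vehicle rashly executes a rush decision or a yield decision at this point, hard braking/brake tapping or even a takeover is likely, and the driving experience is poor. In addition, because the target vehicle is the passive party here, its right of way is lower than that of the target object. Since during driving it is usually the party with the lower right of way that needs to yield to the party with the higher right of way, the target vehicle can be controlled to decelerate in order to probe the driving intention of the target object; that is, the driving strategy of the target vehicle is determined to be yielding to the target object.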
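In a possible implementation, the execution cost associated with a feasible strategy pair includes one or more of the following: a comfort cost, used to characterize the comfort level when executing the feasible strategy pair; a passability cost, used to characterize the efficiency with which the active party and/or the passive party passes the conflict point between the two; an offset cost, used to characterize the evaluation of the offset of the active party and/or the passive party when executing the corresponding driving strategy; an inconsistency cost, used to characterize the evaluation of the deviation between the target object's driving behavior in the feasible strategy pair and the target object's actual driving behavior; or a decision penalty cost, used to characterize the evaluation of whether the intention of the passive party is clear.
In a possible implementation, for any first strategy pair among all the feasible strategy pairs, determining the comfort cost associated with the first strategy pair specifically includes: determining, according to the acceleration corresponding to the first driving strategy in the first strategy pair and the current acceleration of the active party, a first comfort cost of the active party when executing the first driving strategy in the first strategy pair; determining, according to the acceleration corresponding to the second driving strategy in the first strategy pair and the current acceleration of the passive party, a second comfort cost of the passive party when executing the second driving strategy in the first strategy pair; and determining the comfort cost associated with the first strategy pair according to the first comfort cost and the second comfort cost. For example, the current acceleration of the active party may be, but is not limited to, the acceleration of the active party observed when the first strategy set is obtained, the acceleration of the active party observed when the second strategy set is solved, or the acceleration of the active party most recently observed before the method is executed.
In a possible implementation, for any first strategy pair among all the feasible strategy pairs, determining the passability cost associated with the first strategy pair specifically includes: determining, according to a first time and a second time, a first passability cost of the active party when executing the first driving strategy in the first strategy pair, where the first time is the time at which the active party passes a target point when executing the first driving strategy in the first strategy pair, the second time is the time at which the active party passes the target point with its current speed and acceleration, and the target point is the conflict point between the driving path of the active party and the driving path of the passive party; determining, according to a third time and a fourth time, a second passability cost of the passive party when executing the second driving strategy in the first strategy pair, where the third time is the time at which the passive party passes the target point when executing the second driving strategy in the first strategy pair, and the fourth time is the time at which the passive party passes the target point with its current speed and acceleration; and determining the passability cost associated with the first strategy pair according to the first passability cost and the second passability cost.
In a possible implementation, for any first strategy pair among all the feasible strategy pairs, determining the offset cost associated with the first strategy pair specifically includes: determining, based on a preset mapping relationship between offset and offset cost and on the offset of the active party when executing the first driving strategy in the first strategy pair, a first offset cost of the active party when executing the first driving strategy in the first strategy pair; determining, based on the preset mapping relationship between offset and offset cost and on the offset of the passive party when executing the second driving strategy in the first strategy pair, a second offset cost of the passive party when executing the second driving strategy in the first strategy pair; and determining the offset cost associated with the first strategy pair according to the first offset cost and the second offset cost.
In a possible implementation, for any first strategy pair among all the feasible strategy pairs, determining the inconsistency cost associated with the first strategy pair specifically includes: when the driving strategy of the target object in the first strategy pair is to only rush the target vehicle or to only yield to the target vehicle, determining a target probability that the target object rushes the target vehicle, and determining the inconsistency cost according to the target probability; and when the driving strategy of the target object in the first strategy pair is that it can both rush and yield to the target vehicle, determining that the inconsistency cost is a preset cost value.
In a possible implementation, for any first strategy pair among all the feasible strategy pairs, determining the decision penalty cost associated with the first strategy pair specifically includes: determining the decision penalty cost associated with the first strategy pair based on a preset decision penalty rule and the second driving strategy in the first strategy pair.
In a possible implementation, the second driving strategy in the first strategy pair is that the passive party can both rush and yield to the active party; determining the decision penalty cost associated with the first strategy pair based on the preset decision penalty rule and the second driving strategy in the first strategy pair specifically includes: determining a first decision penalty cost based on the decision penalty rule and the second driving strategy in the first strategy pair; determining, according to the acceleration corresponding to the first driving strategy in the first strategy pair and the current acceleration of the active party, a second decision penalty cost of the active party when executing the first driving strategy in the first strategy pair; and determining the decision penalty cost associated with the first strategy pair according to the first decision penalty cost and the second decision penalty cost.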
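In a second aspect, this application provides a control device. The device includes a dividing module and a processing module. The dividing module is configured to divide a target vehicle and a target object into an active party and a passive party, where the active party has the right of way over the passive party and there is a possibility of collision between the target vehicle and the target object. The processing module is configured to obtain a first strategy set feasible for the active party, the first strategy set including at least one first driving strategy. The processing module is further configured to obtain, based on each first driving strategy in the first strategy set, the driving parameters of the active party at the current moment and the driving parameters of the passive party at the current moment, a second driving strategy of the passive party under each first driving strategy in the first strategy set, so as to obtain a second strategy set, where the second driving strategy is any one of the following: the passive party only rushes the active party, the passive party only yields to the active party, or the passive party can both rush and yield to the active party. The processing module is further configured to determine a target strategy pair set based on the first strategy set and the second strategy set, the target strategy pair set including at least one feasible strategy pair, each feasible strategy pair consisting of one first driving strategy and one second driving strategy. The processing module is further configured to determine the execution cost of each feasible strategy pair in the target strategy pair set to obtain a first cost set, the first cost set including the execution cost of each feasible strategy pair. The processing module is further configured to determine a target driving strategy based on the first cost set, the target driving strategy being the driving strategy of the target vehicle associated with the lowest execution cost in the first cost set. The processing module is further configured to control the target vehicle according to the target driving strategy.
In a possible implementation, if the target vehicle is the active party and the target object is the passive party, the processing module, when determining the target driving strategy based on the first cost set, is specifically configured to: if the second driving strategy in the feasible strategy pair corresponding to a target cost is that the passive party only rushes the active party, determine that the target driving strategy is that the target vehicle yields to the target object, where the target cost is the lowest execution cost in the first cost set; if the second driving strategy in the feasible strategy pair corresponding to the target cost is that the passive party only yields to the active party, determine that the target driving strategy is that the target vehicle rushes the target object; and if the second driving strategy in the feasible strategy pair corresponding to the target cost is that the passive party can both rush and yield to the active party, determine that the target driving strategy is the first driving strategy in the feasible strategy pair corresponding to the target cost.
In a possible implementation, if the target vehicle is the passive party and the target object is the active party, the processing module, when determining the target driving strategy based on the first cost set, is specifically configured to: if the second driving strategy in the feasible strategy pair corresponding to a target cost is that the passive party only rushes the active party, determine that the target driving strategy is that the target vehicle rushes the target object, where the target cost is the lowest execution cost in the first cost set; if the second driving strategy in the feasible strategy pair corresponding to the target cost is that the passive party only yields to the active party, determine that the target driving strategy is that the target vehicle yields to the target object; and if the second driving strategy in the feasible strategy pair corresponding to the target cost is that the passive party can both rush and yield to the active party, determine that the target driving strategy is that the target vehicle yields to the target object.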
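In a possible implementation, the execution cost associated with a feasible strategy pair includes one or more of the following: a comfort cost, used to characterize the comfort level when executing the feasible strategy pair; a passability cost, used to characterize the efficiency with which the active party and/or the passive party passes the conflict point between the two; an offset cost, used to characterize the evaluation of the offset of the active party and/or the passive party when executing the corresponding driving strategy; an inconsistency cost, used to characterize the evaluation of the deviation between the target object's driving behavior in the feasible strategy pair and the target object's actual driving behavior; or a decision penalty cost, used to characterize the evaluation of whether the intention of the passive party is clear.
In a possible implementation, for any first strategy pair among all the feasible strategy pairs, the processing module, when determining the comfort cost associated with the first strategy pair, is specifically configured to: determine, according to the acceleration corresponding to the first driving strategy in the first strategy pair and the current acceleration of the active party, a first comfort cost of the active party when executing the first driving strategy in the first strategy pair; determine, according to the acceleration corresponding to the second driving strategy in the first strategy pair and the current acceleration of the passive party, a second comfort cost of the passive party when executing the second driving strategy in the first strategy pair; and determine the comfort cost associated with the first strategy pair according to the first comfort cost and the second comfort cost.
In a possible implementation, for any first strategy pair among all the feasible strategy pairs, the processing module, when determining the passability cost associated with the first strategy pair, is specifically configured to: determine, according to a first time and a second time, a first passability cost of the active party when executing the first driving strategy in the first strategy pair, where the first time is the time at which the active party passes a target point when executing the first driving strategy in the first strategy pair, the second time is the time at which the active party passes the target point with its current speed and acceleration, and the target point is the conflict point between the driving path of the active party and the driving path of the passive party; determine, according to a third time and a fourth time, a second passability cost of the passive party when executing the second driving strategy in the first strategy pair, where the third time is the time at which the passive party passes the target point when executing the second driving strategy in the first strategy pair, and the fourth time is the time at which the passive party passes the target point with its current speed and acceleration; and determine the passability cost associated with the first strategy pair according to the first passability cost and the second passability cost.
In a possible implementation, for any first strategy pair among all the feasible strategy pairs, the processing module, when determining the offset cost associated with the first strategy pair, is specifically configured to: determine, based on a preset mapping relationship between offset and offset cost and on the offset of the active party when executing the first driving strategy in the first strategy pair, a first offset cost of the active party when executing the first driving strategy in the first strategy pair; determine, based on the preset mapping relationship between offset and offset cost and on the offset of the passive party when executing the second driving strategy in the first strategy pair, a second offset cost of the passive party when executing the second driving strategy in the first strategy pair; and determine the offset cost associated with the first strategy pair according to the first offset cost and the second offset cost.
In a possible implementation, for any first strategy pair among all the feasible strategy pairs, the processing module, when determining the inconsistency cost associated with the first strategy pair, is specifically configured to: when the driving strategy of the target object in the first strategy pair is to only rush the target vehicle or to only yield to the target vehicle, determine a target probability that the target object rushes the target vehicle, and determine the inconsistency cost according to the target probability; and when the driving strategy of the target object in the first strategy pair is that it can both rush and yield to the target vehicle, determine that the inconsistency cost is a preset cost value.
In a possible implementation, for any first strategy pair among all the feasible strategy pairs, the processing module, when determining the decision penalty cost associated with the first strategy pair, is specifically configured to: determine the decision penalty cost associated with the first strategy pair based on a preset decision penalty rule and the second driving strategy in the first strategy pair.
In a possible implementation, the second driving strategy in the first strategy pair is that the passive party can both rush and yield to the active party; the processing module, when determining the decision penalty cost associated with the first strategy pair based on the preset decision penalty rule and the second driving strategy in the first strategy pair, is specifically configured to: determine a first decision penalty cost based on the decision penalty rule and the second driving strategy in the first strategy pair; determine, according to the acceleration corresponding to the first driving strategy in the first strategy pair and the current acceleration of the active party, a second decision penalty cost of the active party when executing the first driving strategy in the first strategy pair; and determine the decision penalty cost associated with the first strategy pair according to the first decision penalty cost and the second decision penalty cost.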
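In a third aspect, this application provides a vehicle that includes the control device described in the second aspect or in any possible implementation of the second aspect.
In a fourth aspect, this application provides a computer-readable storage medium that stores a computer program. When the computer program runs on a processor, it causes the processor to execute the method described in the first aspect or in any possible implementation of the first aspect.
In a fifth aspect, this application provides a computer program product. When the computer program product runs on a processor, it causes the processor to execute the method described in the first aspect or in any possible implementation of the first aspect.
It can be understood that, for the beneficial effects of the second to fifth aspects, reference may be made to the relevant description of the first aspect; they are not repeated here.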
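Brief Description of the Drawings
The drawings used in the description of the embodiments or the prior art are briefly introduced below.
Figure 1 is a schematic diagram of an application scenario provided by an embodiment of this application;
Figure 2 shows the hardware structure of a vehicle provided by an embodiment of this application;
Figure 3 is a schematic flowchart of a control method provided by an embodiment of this application;
Figure 4 is a schematic diagram of two vehicles converging, provided by an embodiment of this application;
Figure 5 is a schematic diagram of a comfort cost function provided by an embodiment of this application;
Figure 6 is a schematic diagram of a passability cost function provided by an embodiment of this application;
Figure 7 is a schematic diagram of a decision penalty cost function provided by an embodiment of this application;
Figure 8 is a schematic structural diagram of a control device provided by an embodiment of this application.
Detailed Description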
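The term "and/or" herein describes an association between associated objects and indicates that three relationships may exist. For example, A and/or B can mean: A alone, both A and B, or B alone. The symbol "/" herein indicates an "or" relationship between the associated objects; for example, A/B means A or B.
The terms "first", "second" and the like in the description and claims herein are used to distinguish different objects, not to describe a particular order of the objects. For example, a first response message and a second response message are used to distinguish different response messages, not to describe a particular order of the response messages.
In the embodiments of this application, words such as "exemplary" or "for example" are used to indicate an example, illustration or explanation. Any embodiment or design described as "exemplary" or "for example" in the embodiments of this application should not be construed as preferred or advantageous over other embodiments or designs. Rather, such words are intended to present the relevant concept in a concrete manner.
In the description of the embodiments of this application, unless otherwise stated, "multiple" means two or more. For example, multiple processing units means two or more processing units, and multiple elements means two or more elements.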
示例性的,图1示出了一种应用场景。如图1的(A)所示,车辆100和200行驶至丁字路况,且车辆100沿线段x标识的方向行驶,车辆200沿线段y标识的方向移动。在图1的(B)中,车辆200汇入车辆100所在的道路中,且车辆100沿线段x标识的方向行驶,车辆200沿线段y标识的方向移动。在图1的(A)或(B)中,由于线段x和y之间存在交叉,即车辆100和200的行驶轨迹存在冲突,因此,车辆100和200间存在碰撞可能性的情况。应理解的是,图1中仅示出了两种车辆间存在碰撞可能性的情况,其他的由于车辆间的行驶轨迹存在冲突,而导致的车辆间存在碰撞可能性的情况仍在本申请的保护范围之内。
另外,在图1所示的场景中,车辆100可以保留,而车辆200可以替换为其他的对象,比如替换为运动的物体,或静止的物体等,替换后的方案仍在本申请的保护范围之内。同样的,车辆200可以保留,而车辆100可以替换为其他的对象,比如替换为运动的物体,或静止的物体等,替换后的方案仍在本申请的保护范围之内。为便于描述下面将以保留车辆100为例进行描述。
一般地,当车辆与其他对象(比如运动的物体,或静止的物体等)间的行驶轨迹存在冲突时,往往确定出的决策只有抢行或让行两种策略。但是这种“非黑即白”的抢让行决策,可能会出现决策跳动,导致车辆出现重刹/点刹甚至接管等情况,驾驶体验较差。示例性的,其他对象的行驶轨迹可以但不限于为车辆预测出的行驶轨迹,比如,当其他对象为静止的物体时,车辆可以为该物体预测出一个运动轨迹;当其他对象为运动的物体时,车辆也可以为该物体预测出一个运动轨迹。
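In view of this, the embodiments of this application provide a control method. When the driving trajectories of a vehicle and another object (such as a moving or stationary object) may lead to a collision, the vehicle and the other object can be divided into an active party and a passive party, where the active party has right of way over the passive party. The vehicle can then be controlled to probe the intent of the other object, and the final decision to preempt or yield is made only once that intent has become clear. This produces gradual acceleration when preempting or gradual deceleration when yielding, which improves the driving experience.
By way of example, Figure 2 shows a hardware structure of a vehicle. As shown in Figure 2, the vehicle 100 may include a sensor component 110, a fusion unit 120, and an intelligent driving function component 130. The sensor component 110 is connected to the fusion unit 120 through an interface 140, and the fusion unit 120 is connected to the intelligent driving function component 130 through an interface 150.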
传感器组件110可以包括车姿传感器和/或感知传感器等。传感器组件110采集的数据可以通过接口140传输至融合单元120。车姿传感器可以实现获取车辆100的行驶状态信息,如速度、加速度、航向角、道路拓扑等等,以及获取车辆100的外部环境信息,如路况信息等。感知传感器可以实现获取车辆100外部的其他对象的信息,比如:行驶状态信息,如速度、航向角、位置、朝向、加速度、道路拓扑等等。可选地,车姿传感器可以包括陀螺仪传感器、雷达传感器、超声波传感器、相机、计算机视觉系统等中的一种或多种。其中,雷达传感器可以包括激光雷达传感器和/或毫米波雷达传感器等。感知传感器可以包括相机、雷达传感器、超声波传感器等中的一种或多种。
其中,雷达传感器可以用于利用无线电信号来感测车辆100周边环境中的物体,也可以感测物体的速度和/或行进方向等等。
相机可以用于捕捉车辆周边环境的多个图像。相机可以是静态相机或视频相机。
计算机视觉系统可以操作来处理和分析由相机捕捉的图像以便识别车辆周边环境中物体和/或特征。其中,物体和/或特征可以包括交通信号、道路边界、障碍物、其他对象等等。计算机视觉系统可以使用物体识别算法、运动中恢复结构(Structure from Motion,SFM)算法、视频跟踪和其他计算机视觉技术等。
融合单元120可以将传感器组件110采集的数据通过接口150传输至智能驾驶功能组件130,从而使得智能驾驶功能组件130可以基于传感器组件110采集的数据实现智能驾驶功能。
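The intelligent driving function component 130 can receive, through the interface 150, the data collected by the sensor component 110 and forwarded by the fusion unit 120. Based on the received data, it can then predict the driving trajectories of other objects and/or make driving decisions, so as to implement intelligent driving functions such as adaptive cruise control (ACC), lane keeping assist (LKA), highway assist (HWA), traffic jam assistant (TJA), and various other intelligent or autonomous driving functions.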
接口140可以实现传感器组件110和融合单元120间的数据传输,其可以是在ISO 130150中规定的传感器和融合单元之间的接口。接口150可以实现融合单元120和智能驾驶功能组件130之间的数据传输。接口150传输的消息内容可以包括以下一项或多项:车辆100的速度、航向角、历史轨迹信息、加速度、道路拓扑;其他对象的类型、速度、航向角、位置、朝向、加速度、道路拓扑等;车辆100与其他对象间的碰撞时间等。
在一些实施例中,智能驾驶功能组件130中可以包括物体决策模块和运动规划模块。其中,物体决策模块可以根据传感器组件110采集的数据等,确定车辆100沿着参考路径行驶时是否会与其他的交通参与者发生碰撞。例如,继续参阅图1,车辆100和200存在碰撞的情况。示例性的,参考路径可以是指车辆用于进行物体决策的参考基准,如当前道路的中心线等,该路径主要反应真实的地图信息,可以引导车辆的行驶方向。
另外,物体决策模块还可以对车辆100与其他的交通参与者存在冲突的情形进行划分,将两者划分为博弈对象或非博弈对象。其中,博弈对象是指两者之间存在相互影响。例如,继续参阅图1,车辆100的行驶决策(比如抢行或让行等)会对车辆200的行驶决策产生影响,同时,车辆200的行驶决策(比如抢行或让行等)也会对车辆100的行驶决策产生影响,因此车辆100和200为博弈对象。非博弈对象是指两者之间不存在相互影响。例如,在车辆100跟随其他车辆行驶的场景中,车辆100的行驶决策并不能影响位于其前方且同向行驶的车辆。在本申请实施例中主要是解决两者属于博弈对象时,如何做出行驶决策,以避免两者之间发生碰撞冲突。
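The motion planning module can perform lateral planning and longitudinal planning according to the driving decision made by the object decision module. Lateral planning refers to planning in the direction perpendicular to the reference path (such as the lane center line), corresponding to behaviors such as obstacle avoidance and detouring. Lateral planning may carry coordinate information but no speed information. Longitudinal planning refers to planning in the direction along the reference path, corresponding to behaviors such as acceleration and deceleration. Longitudinal planning may carry speed information but no coordinate information. By way of example, since the driving trajectory of a vehicle consists mainly of coordinate information and speed information, lateral planning and longitudinal planning together can make up the driving trajectory of the vehicle.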
可以理解的是,本申请实施例示意的结构并不构成对车辆100的具体限定。在本申请另一些实施例中,车辆100可以包括比图示更多或更少的部件,或者组合某些部件,或者拆分某些部件,或者不同的部件布置。图示的部件可以以硬件,软件或软件和硬件的组合实现。
接下来,基于上述内容,对本申请实施例提供的控制方法进行介绍。
示例性的,图3示出了一种控制方法。该方法中涉及的车辆可以但不限于为前述的车辆100。该方法涉及的场景可以但不限于是前述图1中所描述的车辆100和200间的行驶路线存在碰撞可能性的场景,即两车之间有一定的概率会发生碰撞的场景,当然也可以是车辆100和其他对象间的行驶路线存在碰撞可能性的场景。可以理解,该方法可以通过任何具有计算、处理能力的装置、设备、平台、设备集群来执行。例如,可以通过车辆100中的处理器或者车载终端等执行,此处不做限定。如图3所示,该控制方法可以包括S301至S307,具体地:
S301、将目标车辆和目标对象划分为主动方和被动方,其中,主动方相对于被动方具有优先通行权,且目标车辆和目标对象存在碰撞可能性。
本实施例中,在目标车辆采集到环境中各个交通参与者的信息后,当目标车辆感知到其与其他的交通参与者(以下简称“目标对象”)存在碰撞可能性时,目标车辆可以根据交通规则,将其和目标对象划分为主动方和被动方,其中,主动方相对于被动方具有优先通行权。例如,继续参阅图1的(A),根据交通规则,左转需让直行,因此,车辆100为主动方,车辆200为被动方。再例如,当目标车辆直行,目标对象转弯时,可以确定目标车辆为主动方,目标对象为被动方。当目标车辆转弯,目标对象直行时,可以确定目标车辆为被动方,目标对象为主动方。在一些实施例中,本申请实施例中的交通参与者可以为车辆,也可以为其他对象,此处不做限定。
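The role assignment in S301 can be sketched as follows. This is a minimal illustration that models only the single traffic rule mentioned above (a turning party yields to one going straight); the names `Maneuver` and `assign_roles` are hypothetical, and a real system would presumably consult a much richer rule set.

```python
from enum import Enum


class Maneuver(Enum):
    STRAIGHT = "straight"
    TURN = "turn"


def assign_roles(target_vehicle: Maneuver, target_object: Maneuver) -> dict:
    """Return which party is active (has right of way) and which is passive.

    Only the "turning yields to straight" rule from the text is modelled.
    Any other combination is left undecided (None) in this sketch.
    """
    if target_vehicle is Maneuver.STRAIGHT and target_object is Maneuver.TURN:
        return {"active": "target_vehicle", "passive": "target_object"}
    if target_vehicle is Maneuver.TURN and target_object is Maneuver.STRAIGHT:
        return {"active": "target_object", "passive": "target_vehicle"}
    return {"active": None, "passive": None}


roles = assign_roles(Maneuver.STRAIGHT, Maneuver.TURN)
```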
S302、获取主动方可行的第一策略集合,第一策略集合中包括至少一个第一行驶策略。
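In this embodiment, the first strategy set may be pre-stored in the target vehicle, obtained from the network in real time, or calculated in real time, depending on the actual situation; no limitation is imposed here. The first strategy set includes at least one first driving strategy. By way of example, each first driving strategy can characterize at least one feasible item of driving information of the active party along the direction of the lane in which the target vehicle is located (such as an acceleration) and one feasible item of driving information in the direction perpendicular to that lane (such as a lateral offset).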
在一些实施例中,可以至少根据预先设定的加速度的采样空间、主动方的加速度所允许的范围,以及预先设定的横向偏移范围,确定出第一策略集合。例如,可以将预先设定的加速度的采样空间和主动方的加速度所允许的范围的交集作为第一策略集合中加速度的变化范围,以及将预先设定的横向偏移范围作为策略空间中的横向偏移的变化范围。举例来说,当目标车辆为主动方时,若预先设定的加速度的采样空间为[-4.0,3.0]m/s2,目标车辆允许的加速度区间为[-3.0,2.0]m/s2,则第一策略集合中的加速度的变化范围为[-3.0,2.0]m/s2。若预先设定的横向偏移范围为[-1,1]m,则第一策略集合中的横向偏移的变化范围为[-1,1]m。
进一步地,考虑到计算复杂度和策略空间精度之间的平衡,可以设定第一策略集合中加速度的变化间隔,以及横向偏移的偏移间隔。例如,加速度可以每间隔1m/s2变化一次,横向偏移可以每间隔1m变化一次。
举例来说,若第一策略集合中的加速度的变化范围为[-3.0,2.0]m/s2,横向偏移的变化范围为[-1,1]m。此时,可以将加速度的变化间隔定为1m/s2,目标车辆的横向偏移可以定为在其保持当前车道中心行驶时,向左避让(即向左偏移车道中心线)1m,记为+1m,向右避让(即向右偏离车道中心线)1m,记为-1m。最终可生成如表1所示的第一策略集合。在表1所示的策略集合中,一个加速度和一个横向偏移可以组成一个行驶策略,例如,加速度1m/s2和横向偏移-1m组成的行驶策略为:以加速度1m/s2,且向右偏离车道中心线1m行驶。
表1
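The construction of the first strategy set described above can be sketched in a few lines. This is an illustration under the stated example values (sampling space [-4.0, 3.0] m/s², allowed range [-3.0, 2.0] m/s², lateral offset range [-1, 1] m, step 1 for both); the function name `build_first_strategy_set` and its parameters are hypothetical.

```python
from itertools import product


def build_first_strategy_set(sample_range, allowed_range, offset_range,
                             accel_step=1.0, offset_step=1.0):
    """Enumerate (acceleration, lateral_offset) pairs for the active party."""
    # The acceleration range is the intersection of the preset sampling
    # space and the range that the active party itself allows.
    a_lo = max(sample_range[0], allowed_range[0])
    a_hi = min(sample_range[1], allowed_range[1])
    if a_lo > a_hi:
        return []  # empty intersection: no feasible acceleration
    # Small epsilon guards against floating-point error in the division.
    n_acc = int((a_hi - a_lo) / accel_step + 1e-9) + 1
    n_off = int((offset_range[1] - offset_range[0]) / offset_step + 1e-9) + 1
    accels = [a_lo + i * accel_step for i in range(n_acc)]
    offsets = [offset_range[0] + j * offset_step for j in range(n_off)]
    return list(product(accels, offsets))


strategies = build_first_strategy_set((-4.0, 3.0), (-3.0, 2.0), (-1.0, 1.0))
```

With these inputs the set holds 6 accelerations times 3 lateral offsets, including for instance the strategy (1.0, -1.0), that is, an acceleration of 1 m/s² with a 1 m offset to the right of the lane center line.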
在一些实施例中,在得到第一策略集合后,可以对该第一策略集合中的各个行驶策略进行时域推演,以形成主动方可行的行驶策略对应的第一轨迹集合。示例性的,在进行时域推演时可以遵守主动方的系统延迟(比如由一个加速度向另一个加速度的过渡的时间等),限速等约束,当主动方加速到道路限速后,推演保持在道路限速匀速推演。当第一策略集合中所有的行驶策略均推演完成后,即可以得到第一轨迹集合。示例性的,第一轨迹集合中可以包括有至少一个行驶轨迹,且,第一策略集合中的每个行驶策略均可以对应有一个行驶轨迹。
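The time-domain rollout described above can be sketched as a simple forward simulation. This is an assumption-laden illustration: the system delay is modelled as a linear ramp from the current acceleration to the commanded one over a hypothetical `delay` of 0.5 s, and the horizon and time step are likewise placeholder values not taken from the source.

```python
def rollout(v0, a0, a_cmd, speed_limit, delay=0.5, horizon=6.0, dt=0.1):
    """Return a list of (t, s, v) samples for one driving strategy.

    v0, a0      : current speed (m/s) and acceleration (m/s^2)
    a_cmd       : acceleration of the sampled driving strategy (m/s^2)
    speed_limit : road speed limit (m/s); once reached, speed is held there
    """
    samples = [(0.0, 0.0, v0)]
    s, v = 0.0, v0
    steps = int(round(horizon / dt))
    for k in range(1, steps + 1):
        t = k * dt
        # Linear transition from a0 to a_cmd stands in for the system delay.
        ramp = min(t / delay, 1.0) if delay > 0 else 1.0
        a = a0 + (a_cmd - a0) * ramp
        # Clamp speed to [0, speed_limit]: no reversing, no speeding.
        v_next = min(max(v + a * dt, 0.0), speed_limit)
        s += 0.5 * (v + v_next) * dt  # trapezoidal distance update
        v = v_next
        samples.append((t, s, v))
    return samples


# Vehicle 100 in the later example: 17 km/h, -0.67 m/s^2, limit 60 km/h.
trajectory = rollout(17 / 3.6, -0.67, 1.0, 60 / 3.6)
```

Running one such rollout per strategy in the first strategy set would yield one trajectory per strategy, which matches the description of the first trajectory set.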
在得到第一策略集合后,可以执行S303。
S303、根据第一策略集合中的各个第一行驶策略、主动方当前时刻的行驶参数和被动方当前时刻的行驶参数,得到被动方在第一策略集合中各个第一行驶策略下的第二行驶策略,以得到第二策略集合,其中,第二行驶策略为以下任意一项:被动方仅抢行主动方,被动方仅让行主动方,或者,被动方既能抢行主动方,又能让行主动方。
本实施例中,可以根据主动方当前时刻的行驶参数(比如:速度、加速度、位置等),以及各个第一行驶策略,计算主动方到达其与被动方发生碰撞的位置的时间。然后,在求解在各个第一行驶策略下,被动方以其当前时刻的行驶参数(比如:速度、加速度、位置等)为初始的参数行驶,能否与主动方错开一定的时间间隔和距离间隔通过两者发生碰撞的位置,以及,在能够与主动方错开一定的时间间隔和距离间隔通过两者发生碰撞的位置时,被动方所能够做出的动作,比如只能抢行主动方,或者只能让行主动方,或者,既能抢行主动方又能让行主动方等,以得到被动方在第一策略集合中各个第一行驶策略下的第二行驶策略,从而得到第二策略集合。其中,第二行驶策略为以下任意一项:被动方仅抢行主动方,被动方仅让行主动方,或者,被动方既能抢行主动方,又能让行主动方。示例性的,当前时刻的行驶参数可以但不限于是指:在获取第一策略集合时观测到的行驶参数,或者,在求解第二策略集合时观测到的行驶参数,或者,在执行该方法前最新观测到的行驶参数。
在一些实施例中,可以使用二次规划(quadratic programming,QP)等算法,对主动方和被动方的行驶位置、行驶速度、加速度、动力学运动学约束、碰撞时间,以及道路规则等参数进行求解,以求解被动方加速/减速/避让的行驶轨迹,以及确定出被动方的行驶策略。示例性的,可以将目标对象的行驶速度、位置、朝向、加速度、道路拓扑等,输入至预训练得到的模型中,预测被动方的行驶轨迹和/或行驶策略。
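For example, as shown in Figure 4, vehicle 100 is the active party; at the current moment its speed is 17 km/h, its currently observed acceleration is -0.67 m/s², and the static speed limit of the road is 60 km/h. Taking the driving strategy of vehicle 100 with an acceleration of 1 m/s² and a lateral offset of 0 m as an example, solving with QP yields a feasible trajectory 46 for vehicle 200 (the passive party). At this time, the distance from vehicle 100 to the trajectory conflict point 43 is 20.11 m, and when vehicle 100 travels with an acceleration of 1 m/s², the time to reach the conflict point is eTTC = 3.41 s.
The width of vehicle 100 (1.9 m) widened by 0.5 m on each side is taken as the safety corridor of vehicle 100 (the area bounded by lines 41 and 42). As vehicle 200 travels along its predicted path (feasible trajectory 46), the point at which it begins to intrude into the safety corridor of vehicle 100 is taken as the intrusion point (when vehicle 200 is at position 44), and the point at which it has completely left the safety corridor is taken as the departure point (when vehicle 200 is at position 45).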
则车辆200安全抢行车辆100和安全让行车辆100的条件为:让行车辆100时,车辆200到达入侵点的时间需要在车辆100通过轨迹冲突点43后一定时间。抢行车辆100时,车辆200到达离开点的时间,需要在车辆100到达轨迹冲突点43之前一段时间。示例性的,车辆200安全抢行/让行的时间间隔可以根据不同的场景,预先设定采用不同的安全时间间隔。例如:安全时间间隔可以采用1s,即车辆200如果抢行车辆100,需要在2.41s内离开车辆100所在的安全通道,车辆200如果让行车辆100,需要在4.41s后进入车辆100所在的安全通道。
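The distance from vehicle 200 to the trajectory intersection 43 is 11.92 m, the observed acceleration of vehicle 200 is 0.0 m/s², and its current speed is 17 km/h. Taking into account the width of the safety corridor of vehicle 100 and the road topology angle at vehicle 200, the intrusion point position 44 is 7.63 m from the current position of vehicle 200, and the departure point position 45 is 13.4 m from the current position of vehicle 200. When solving the driving strategy of vehicle 200, for example with the QP (quadratic programming) method, the yield solution is constrained to a maximum displacement of 7.63 m at all times before 4.41 s, and the preempt solution is constrained to a minimum displacement of 13.4 m at all times after 2.41 s. The maximum and minimum acceleration constraints and the maximum speed constraint used in the solving process can be set according to the kinematic and dynamic constraints of vehicle 200 and the constraints of traffic rules.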
车辆100加速度为1m/s2,横向偏移为0m的行驶策略,可以同时求出符合上述条件的车辆200让行车辆100和抢行车辆100的解。即此时车辆200既可以抢行车辆100,也可以让行车辆100,此时车辆200的意图不明确。
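The feasibility reasoning above can be approximated without a QP solver. The sketch below is not the QP formulation the text describes: it swaps in a plain kinematic bound check (can the passive party clear the departure point in time at maximum acceleration, and can it stay short of the intrusion point at maximum deceleration). The acceleration limits of vehicle 200 (-3.0 and 2.0 m/s²) are assumed values, and lateral offsets are ignored.

```python
def distance_after(v0, a, t, v_max):
    """Distance covered in t seconds at constant acceleration a,
    with speed clamped to the interval [0, v_max]."""
    if a > 0:
        t_cap = max((v_max - v0) / a, 0.0)  # time until the speed limit
    elif a < 0:
        t_cap = v0 / -a                     # time until standstill
    else:
        t_cap = float("inf")
    t1 = min(t, t_cap)
    v_end = v0 + a * t1
    return v0 * t1 + 0.5 * a * t1 ** 2 + v_end * (t - t1)


def classify_passive(ettc, gap, v0, d_intrude, d_leave, a_min, a_max, v_max):
    """Classify what the passive party can safely do for one active strategy."""
    t_preempt = ettc - gap  # must have left the safety corridor by then
    t_yield = ettc + gap    # must not enter the safety corridor before then
    can_preempt = (t_preempt > 0 and
                   distance_after(v0, a_max, t_preempt, v_max) >= d_leave)
    can_yield = distance_after(v0, a_min, t_yield, v_max) <= d_intrude
    if can_preempt and can_yield:
        return "preempt_or_yield"
    if can_preempt:
        return "preempt_only"
    if can_yield:
        return "yield_only"
    return "no_solution"


# Values from the worked example; a_min and a_max are assumptions.
result = classify_passive(ettc=3.41, gap=1.0, v0=17 / 3.6, d_intrude=7.63,
                          d_leave=13.4, a_min=-3.0, a_max=2.0, v_max=60 / 3.6)
```

Because displacement is monotonic in time, checking the bound only at the two window edges is enough for this simplified model. Under the assumed limits the example evaluates to a bilateral result, in line with the text.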
对于剩余车辆100的行驶策略求解车辆200对应的行驶策略。当车辆100行驶策略为:横向偏移0m且加速度2m/s2时,可以求出车辆200让行车辆100的策略,而车辆200抢行车辆100的策略无论车辆200是否进行偏移,均无法满足安全错开约束条件和其他约束,从而无法求出可行解,所以此时车辆200只能让行车辆100,此时车辆200的意图是明确的。当车辆100行驶策略为:横向偏移0m且加速度为-3m/s2时,只能求出车辆200抢行车辆100的策略,所以此时车辆200只能抢行车辆100,此时车辆200的意图是明确的。当车辆100行驶策略为:横向偏移+1m(向左偏移),且加速度为1m/s2时,无论车辆200是否进行偏移,均无法求出满足约束条件的车辆200抢行或让行车辆100的策略,所以此时无解,即无法安全抢行或让行车辆100。求解得到车辆200所有的行驶策略后,得到的第二策略集合可以如下表2所示。在表2中除第一列和第一行之外的内容即为车辆200(即被动方)的第二策略集合。例如,参阅表2,在主动方的行驶策略为“以横向偏移1m且加速度-3m/s2行驶”时,被动方的行驶策略为“抢行主动方”,即被动方仅能抢行主动方;在主动方的行驶策略为“以横向偏移0m且加速度1m/s2行驶”时,被动方的行驶策略为“抢行和让行主动方”,即被动方既能抢行主动方也能让行主动方;在主动方的行驶策略为“以横向偏移1m且加速度1m/s2行驶”时,被动方无法避免与主动方发生碰撞此时无法求出被动方的行驶策略,即“无解”。

表2
在一些实施例中,在基于第一策略集合、主动方当前时刻的行驶参数和被动方当前时刻的行驶参数,得到第二策略集合的过程中,可以根据该第一策略集合中的各个行驶策略对应的行驶轨迹,以及被动方安全抢行和/或让行主动方时所需满足的条件,对被动方的行驶轨迹进行时域推演,以形成被动方可行的行驶策略对应的第二轨迹集合。示例性的,在进行时域推演时可以遵守被动方的系统延迟(比如由一个加速度向另一个加速度的过渡的时间等),限速等约束,当被动方加速到道路限速后,推演保持在道路限速匀速推演。当第二策略集合中所有的行驶策略均推演完成后,即可以得到第二轨迹集合。示例性的,第二轨迹集合中可以包括有至少一个行驶轨迹,且,第二策略集合中的每个行驶策略均可以对应有一个行驶轨迹。
在得到第二策略集合后,可以执行S304。
S304、根据第一策略集合和第二策略集合,确定目标策略对集合,目标策略对集合中包括至少一个可行策略对,每个可行策略对均由一个第一行驶策略和一个第二行驶策略组成。
本实施例中,在得到第一策略集合和第二策略集合后,可以由这两个策略集合确定出目标策略对集合,目标策略对集合中包括至少一个可行策略对,每个可行策略对均由一个第一行驶策略和一个第二行驶策略组成。例如,继续参阅表2,行驶策略“主动方以横向偏移1m且加速度-3m/s2行驶”和行驶策略“被动方抢行主动方”可以组成一个可行策略对,行驶策略“主动方以横向偏移0m且加速度1m/s2行驶”和行驶策略“被动方抢行和让行主动方”可以组成一个可行策略对。
另外,当被动方同时存在安全的既能抢行又能让行主动方的行驶策略时,由该行驶策略所组成的可行策略对可以称为双边解策略对。例如,继续参阅表2,行驶策略“主动方以横向偏移0m且加速度1m/s2行驶”和行驶策略“被动方抢行和让行主动方”组成的可行策略对,可以称之为双边解策略对。当被动方只能求出安全的抢行或让行主动方的行驶策略时,由该行驶策略所组成的可行策略对可以称为单边解策略对。例如,继续参阅表2,行驶策略“主动方以横向偏移1m且加速度-3m/s2行驶”和行驶策略“被动方抢行主动方”组成可行策略对,可以称之为单边解策略对。
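The pairing step in S304 and the unilateral/bilateral labelling can be sketched as below. The `Second` enum and `build_pairs` function are hypothetical names; the sample entries are the ones the text quotes from Table 2, with each first strategy written as (acceleration, lateral offset). This sketch simply drops entries with no solution rather than keeping them.

```python
from enum import Enum


class Second(Enum):
    PREEMPT_ONLY = "preempt"
    YIELD_ONLY = "yield"
    BOTH = "preempt_or_yield"
    NO_SOLUTION = "none"


def build_pairs(second_by_first):
    """Combine each first strategy with the passive party's second strategy."""
    pairs = []
    for first, second in second_by_first.items():
        if second is Second.NO_SOLUTION:
            continue  # no safe response exists for this first strategy
        kind = "bilateral" if second is Second.BOTH else "unilateral"
        pairs.append((first, second, kind))
    return pairs


# Entries mentioned in the text: (acceleration m/s^2, lateral offset m).
second_by_first = {
    (-3.0, 1.0): Second.PREEMPT_ONLY,
    (1.0, 0.0): Second.BOTH,
    (2.0, 0.0): Second.YIELD_ONLY,
    (1.0, 1.0): Second.NO_SOLUTION,
}
pairs = build_pairs(second_by_first)
```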
S305、确定目标策略对集合中各个可行策略对的执行代价,得到第一代价集合,第一代价集合中包括每个可行策略对的执行代价。
本实施例中,可以对目标策略对集合中各个可行策略的执行代价进行分析,得到第一代价集合。其中,第一代价集合中包括每个可行策略对所关联的执行代价。每个可行策略对所关联的代价用于指示执行该可行策略对的成本。
本实施例中,每个可行策略对所关联的执行代价均可以包括以下一项或多项:舒适性代价,通过性代价,偏移代价,不一致性代价,或,决策惩罚代价。
下面分别对各个代价进行介绍。
(1)舒适性代价
舒适性代价用于表征执行某个可行策略对时的舒适程度。其中,舒适性代价越高,舒适程度越低。在一些实施例中,对于主动方或被动方,其加速度变化率越小,相应的舒适性越好,则舒适性代价越小。
本实施例中,对于任意一个可行策略对,可以基于预设的舒适性代价函数,对主动方在其当前加速度下执行相应的行驶策略时的加速度变化量进行处理,以得到主动方在执行相应的行驶策略时的舒适性代价。同样的,可以基于预设的舒适性代价函数,对被动方在其当前加速度下执行相应的行驶策略时的加速度变化量进行处理,以得到被动方在执行相应的行驶策略时的舒适性代价。最后,可以将主动方的舒适性代价和被动方的舒适性代价进行加权求和(当然,也可以采用其他的计算方式,比如择一选择,求平均等,此处不做限定),以得到相应的可行策略对所关联的舒适性代价。示例性的,主动方当前的加速度可以但不限于是指:在获取第一策略集合时观测到的主动方的加速度,或者,在求解第二策略集合时观测到的主动方的加速度,或者,在执行该方法前最新观测到的主动方的加速度。
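For example, suppose the preset comfort cost function is as shown in Figure 5, namely y = 0.1429x, where x is the horizontal axis and y is the vertical axis. Suppose also that vehicle 100 described in Figure 1 is the active party and vehicle 200 is the passive party, that the currently observed acceleration of vehicle 100 is -0.67 m/s², and that the currently observed acceleration of vehicle 200 is 0 m/s². In addition, the first strategy set may be as shown in Table 1 above and the second strategy set as shown in Table 2 above. Under the driving strategy of vehicle 100 with an acceleration of 1 m/s² and a lateral offset of 0 m, the corresponding driving strategy of vehicle 200 is to preempt or yield. When the driving strategy of vehicle 200 is to preempt, its acceleration is 1.45 m/s²; when its driving strategy is to yield, its acceleration is 0.69 m/s².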
其中,对于车辆100,在该可行策略对下,其加速度变化量为eDeltaAcc=1-(-0.67)=1.67m/s2,结合图5所示的舒适性代价函数,可以确定出车辆100的舒适性代价为eComfCost=0.92×(1.67÷7)=0.2205。其中,0.92为车辆100的舒适性代价的权重,该值可以基于实际情况进行设定。
对于车辆200,由于在该可行策略对下,其存在双边解(即:既可以抢行车辆100,也可以让行车辆100),因此,车辆200的舒适性代价为抢行解和让行解对应的舒适性代价的平均(当然,也可以采用其他的计算方式,比如求和,择一选择等,此处不做限定)。其中,抢行解对应的加速度变化量为oDeltaAcc=1.45-0=1.45m/s2,此时的舒适性代价为oComfCost=1.45÷7=0.2071。让行解对应的加速度变化量为oDeltaAcc=0.69-0=0.69m/s2,此时的舒适性代价为oComfCost=0.69÷7=0.0986。车辆200在可行策略对下最终的舒适性代价oComfCost=1×(0.2071+0.0986)÷2=0.1529。其中,1为车辆200的舒适性代价的权重,该值可以基于实际情况进行设定。
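The comfort cost of vehicle 200 computed above can be reproduced with a short sketch. It assumes the linear function of Figure 5 with slope 1/7 (about 0.1429) and, as an added assumption, uses the magnitude of the acceleration change so that decelerations are penalized in the same way; `comfort_cost` is a hypothetical name.

```python
COMFORT_SLOPE = 1.0 / 7.0  # y = 0.1429 x in Figure 5


def comfort_cost(current_acc, strategy_acc, weight=1.0):
    """Weighted comfort cost for a change from current_acc to strategy_acc."""
    delta = abs(strategy_acc - current_acc)  # magnitude is an assumption
    return weight * COMFORT_SLOPE * delta


# Vehicle 200 has a bilateral solution, so the two costs are averaged.
preempt_cost = comfort_cost(0.0, 1.45)
yield_cost = comfort_cost(0.0, 0.69)
passive_cost = 1.0 * (preempt_cost + yield_cost) / 2
```

Averaging is only one of the combinations the text allows; a sum or a single selected value would change `passive_cost` accordingly.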
进一步地,可以确定出该可行策略对所关联的舒适性代价oComfCost=0.2205+0.1592=0.3797。此时,主动方和被动方的权重均为1,当然也可以为其他的值,此处不做限定。对于其他的可行策略对所关联的舒适性代价,可以参考上述的计算方式,此处不再赘述。
(2)通过性代价
通过性代价用于表征主动方和/或被动方通过两者冲突点(即两者行驶路径的冲突点)的效率。其中,通过冲突点的时间越快,相应的通过性代价越小。
本实施例中,对于任意一个可行策略对,可以基于预设的通过性代价函数,对主动方执行该可行策略对中的行驶策略时通过相应的冲突点的时间(也可以称之为“第一时间”),以及主动方在当前的加速度和速度下通过相应的冲突点的时间(也可以称之为“第二时间”)进行处理,以得到主动方在执行相应的行驶策略时的通过性代价。同样的,可以基于预设的通过性代价函数,对被动方执行该可行策略对中的行驶策略时通过相应的冲突点的时间(也可以称之为“第三时间”),以及被动方在当前的加速度和速度下通过相应的冲突点的时间(也可以称之为“第四时间”)进行处理,以得到被动方在执行相应的行驶策略时的通过性代价。最后,可以将主动方的通过性代价和被动方的通过性代价进行加权求和(当然,也可以采用其他的计算方式,比如择一选择,求平均等,此处不做限定),以得到相应的可行策略对所关联的通过性代价。
举例来说,若预设的通过性代价函数如图6所示,即y=0.0833x+0.5,其中,x为横轴,y为纵轴,且图6中横轴为执行行驶策略通过冲突点的时间与正常(即不执行行驶策略)通过冲突点的时间的时间差。同时,图1中所描述的车辆100为主动方,车辆200为被动方。另外,若在推演过程中,车辆100在当前加速度和速度下,通过冲突点(也可以称之为“碰撞点”)的时间为eRealPassTime=4.47s,车辆200在当前加速度和速度下,通过冲突点的时间为oRealPassTime=2.52s。此外,第一策略集合可以如前述的表1所示,第二策略集合可以如前述的表2所示。在车辆100对应的加速度为1m/s2,且横向偏移0m的行驶策略下,车辆200对应的行驶策略为抢行和让行。
若在推演过程中,车辆100在该行驶策略下,通过冲突点的时间eSamplePassTime=3.41s,与其正常通过冲突点的时间的通过时间差为eDeltaPassTime=eSamplePassTime–eRealPassTime=3.41-4.47=-1.06。结合图6,可以确定出车辆100在该可行策略对下的通过性代价为ePassCost=1×(0.0833×(-1.06)+0.5)=0.4117。其中,1为车辆100的通过性代价的权重,该值可以基于实际情况进行设定。
若在推演过程中,车辆200在车辆100对应的行驶策略下,其让行通过冲突点的时间为oSamplePassTime=5.02s,则该时间与其正常通过冲突点的时间的通过时间差为oDeltaPassTime=oSamplePassTime–oRealPassTime=5.02-2.52=2.5s,则车辆200在让行时的通过性代价为oPassCost=0.0833×2.5+0.5=0.708。若车辆200抢行通过冲突点的时间为oSamplePassTime=2.3s,则该时间与其正常通过冲突点的时间的通过时间差为oDeltaPassTime=oSamplePassTime–oRealPassTime=2.3-2.52=-0.22,则车辆200在抢行时的通过性代价为oPassCost=0.0833×(-0.22)+0.5=0.4817。车辆200在可行策略对下最终的通过性代价为oPassCost=1×(0.708+0.4871)÷ 2=0.5976。其中,1为车辆200的通过性代价的权重,该值可以基于实际情况进行设定。应理解的是,在基于车辆200的抢行解对应的通过性代价和让行解对应的通过性代价,确定车辆200在可行策略对下最终的通过性代价时,也可以采用其他的计算方式,比如求和,择一选择等,此处不做限定。
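Further, the passability cost associated with this feasible strategy pair can be determined as passCost = 0.4117 + 0.5976 = 1.0093. Here the weights of the active party and the passive party are both 1; other values are also possible, and no limitation is imposed here. The passability costs associated with the other feasible strategy pairs can be calculated in the same way and are not described again here.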
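The per-party passability costs in this example follow directly from the linear function of Figure 6. A minimal sketch, with `pass_cost` as a hypothetical name:

```python
PASS_SLOPE = 0.0833
PASS_INTERCEPT = 0.5  # y = 0.0833 x + 0.5 in Figure 6


def pass_cost(strategy_pass_time, normal_pass_time, weight=1.0):
    """Weighted passability cost from the pass-time difference in seconds."""
    delta = strategy_pass_time - normal_pass_time
    return weight * (PASS_SLOPE * delta + PASS_INTERCEPT)


active_cost = pass_cost(3.41, 4.47)         # vehicle 100
passive_yield_cost = pass_cost(5.02, 2.52)  # vehicle 200 yielding
passive_preempt_cost = pass_cost(2.3, 2.52)  # vehicle 200 preempting
```

A negative time difference (passing the conflict point sooner than in the normal case) lowers the cost below the 0.5 intercept, which is how the function rewards efficiency.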
(3)偏移代价
偏移代价用于表征对主动方和/或被动方在执行相应的行驶策略时发生偏移的评价。其中,当不需要偏移时,则偏移代价为0,当然也可以取其他的值,此处不做限定;当需要偏移时,可以基于预先设定的偏移量与偏移代价间的映射关系,以及,主动方执行相应的行驶策略时的偏移量,确定出主动方在执行相应的行驶策略时的偏移代价,和/或,基于预先设定的偏移量与偏移代价间的映射关系,以及,被动方执行相应的行驶策略时的偏移量,确定出被动方在执行相应的行驶策略时的偏移代价。最后,可以根据确定出的主动方在执行相应的行驶策略时的偏移代价和被动方在执行相应的行驶策略时的偏移代价,得到某个可行策略对所关联的偏移代价。
举例来说,若偏移为0m时,偏移代价取0,偏移为1m或-1m时,偏移代价为0.3。继续参阅图1,若车辆100为主动方,车辆200为被动方,且车辆100对应的行驶策略为前述表1中的行驶策略,车辆200对应的行驶策略为前述表2中的行驶策略。则,车辆100在加速度为1m/s2且横向偏移0m的行驶策略下,车辆100的偏移代价eOffsetCost=0。在推演过程中,若在该行驶策略下,车辆200无需偏移,即可求出双边解,则车辆200的偏移代价oOffsetCost=0。
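Further, the offset cost associated with this feasible strategy pair can be determined as offsetCost = 0 + 0 = 0; other calculation methods, such as selecting one of the two values or taking their average, may also be used, and no limitation is imposed here. Here the weights of the active party and the passive party are both 1; other values are also possible. The offset costs associated with the other feasible strategy pairs can be calculated in the same way and are not described again here.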
(4)不一致性代价
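The inconsistency cost characterizes an evaluation of the deviation between the driving behavior of the target object in a feasible strategy pair and the actual driving behavior of the target object. It is determined mainly from the probability that the passive party preempts the active party. To obtain this probability, the relative position and relative speed of the passive and active parties, the speed, position and acceleration of the passive party, historical observation data, and the like can be input into a preset model.
In this embodiment, when solving the driving strategy of the passive party, if a bilateral solution can be found, the inconsistency cost may be 0.5; other values are also possible, and no limitation is imposed here. When only one solution can be found and that solution is to preempt, the inconsistency cost may be 1 minus the probability that the passive party preempts the active party. When only one solution can be found and that solution is to yield, the inconsistency cost may be the probability that the passive party preempts the active party. In other words, when the driving strategy of the target object in a strategy pair is only to preempt the target vehicle, or only to yield to the target vehicle, the target probability that the target object preempts the target vehicle can be determined first, and the inconsistency cost associated with the strategy pair is then determined from that target probability. When the driving strategy of the target object in a strategy pair is that it can both preempt and yield to the target vehicle, the inconsistency cost associated with the strategy pair can be determined to be a preset cost value.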
举例来说,若被动方抢行主动方的概率为0.7,同时,图1中所描述的车辆100为主动方,车辆200为被动方,且第一策略集合可以如前述的表1所示,第二策略集合可以如前述的表2所示。
在车辆100对应的加速度为1m/s2,且横向偏移0m的行驶策略下,车辆200对应的行驶策略为抢行和让行,即存在双边解。因此,此时,车辆200的不一致性代价probCost=0.3×0.5=0.15。其中,0.3为不一致性代价权重,该值可以基于实际情况进行设定。
在车辆100对应的加速度为2m/s2,且横向偏移0m的行驶策略下,车辆200对应的行驶策略只能是让行。因此,此时,车辆200的不一致性代价probCost=0.3×0.7=0.21。其中,0.3为不一致性代价权重,该值可以基于实际情况进行设定。
在车辆100对应的加速度为-2m/s2,且横向偏移0m的行驶策略下,车辆200对应的行驶策略只能是抢行。因此,此时,车辆200的不一致性代价probCost=0.3×(1-0.7)=0.09。其中,0.3为不一致性代价权重,该值可以基于实际情况进行设定。
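The three cases above reduce to a small lookup. A minimal sketch, assuming the weight 0.3 and the preset bilateral value 0.5 from the example; the string labels and the function name `inconsistency_cost` are hypothetical.

```python
def inconsistency_cost(second_strategy, p_preempt, weight=0.3,
                       bilateral_cost=0.5):
    """Weighted inconsistency cost for one feasible strategy pair.

    second_strategy : "preempt", "yield" or "both"
    p_preempt       : predicted probability that the passive party preempts
    """
    if second_strategy == "both":
        raw = bilateral_cost       # preset value for a bilateral solution
    elif second_strategy == "preempt":
        raw = 1.0 - p_preempt      # unlikely preemption costs more
    elif second_strategy == "yield":
        raw = p_preempt            # likely preemption makes yielding costly
    else:
        raise ValueError(f"unknown strategy: {second_strategy}")
    return weight * raw


bilateral = inconsistency_cost("both", 0.7)
yield_only = inconsistency_cost("yield", 0.7)
preempt_only = inconsistency_cost("preempt", 0.7)
```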
(5)决策惩罚代价
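The decision penalty cost characterizes an evaluation of whether the intent of the passive party is clear. It may refer to the penalty cost that applies when the driving strategy of the passive party, solved on the basis of the driving strategy of the active party, is a unilateral solution or a bilateral solution. When a bilateral solution exists among the solutions, the intent of the passive party is not yet clear; it can therefore be specified that the penalty is high when the solution is unilateral and low when the solution is bilateral. By way of example, the decision penalty costs corresponding to unilateral and bilateral solutions can each be set in advance, that is, a decision penalty rule is preset, and the corresponding decision penalty cost is then determined in combination with the driving strategy of the passive party. For example, the decision penalty cost for a unilateral solution can be set to 1 and that for a bilateral solution to 0.
In addition, when a bilateral solution is found for the passive party, if the acceleration of the active party under that driving strategy differs greatly from its current acceleration, executing the strategy causes a large change in the acceleration of the active party and degrades the driving experience, so the penalty for that bilateral solution can be increased. By way of example, the difference between the acceleration of the active party under the driving strategy and its current acceleration can be processed with a preset penalty cost function to obtain an acceleration-based decision penalty cost for the corresponding feasible strategy pair. The final decision penalty cost is then determined from this acceleration-based cost and the decision penalty cost corresponding to the passive party.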
举例来说,若预设的惩罚代价函数如图7所示,即y=0.333x,其中,x为横轴,y为纵轴。同时,图1中所描述的车辆100为主动方,车辆200为被动方,且第一策略集合可以如前述的表1所示,第二策略集合可以如前述的表2所示。
在车辆100对应的在加速度为1m/s2且横向偏移0m的行驶策略下,车辆200对应的行驶策略为抢行和让行,即存在双边解。因此,基础惩罚为eBasicPunishmentCost=0。
进一步地,若车辆100的当前加速度为-0.67m/s2,则其加速度变化量为eDeltaAcc=1-(-0.67)=1.67m/s2。结合图7,可以得到该决策惩罚代价eDeltaAccPunishmentCost=1×0.333×1.67=0.55。其中,1为决策惩罚代价的权重,该值可以基于实际情况进行设定。
进一步地,可以确定出车辆100在加速度为1m/s2且横向偏移0m的行驶策略下的决策惩罚代价Cost=0+0.55=0.55。当然,也可以设定一定的权重值,比如,基础惩罚代价的权重为0.4,基于加速度变化量得到的惩罚代价的权重为0.6等,具体可根据实际情况而定,此处不做限定。
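The two-part decision penalty can be sketched as follows, assuming the rule values from the example (1 for a unilateral solution, 0 for a bilateral one) and the linear function of Figure 7. The function name `decision_penalty` is hypothetical, and using the magnitude of the acceleration change is an added assumption.

```python
PENALTY_SLOPE = 0.333  # y = 0.333 x in Figure 7


def decision_penalty(second_strategy, current_acc, strategy_acc,
                     unilateral_penalty=1.0, bilateral_penalty=0.0,
                     weight=1.0):
    """Decision penalty cost for one feasible strategy pair."""
    if second_strategy != "both":
        return unilateral_penalty  # unilateral solution: preset penalty only
    # Bilateral solution: add a penalty that grows with the acceleration
    # change the active party would need to execute the strategy.
    delta = abs(strategy_acc - current_acc)
    return bilateral_penalty + weight * PENALTY_SLOPE * delta


# Vehicle 100: current acceleration -0.67 m/s^2, strategy 1 m/s^2.
penalty = decision_penalty("both", -0.67, 1.0)
```

The result is about 0.556, which the text reports as 0.55.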
应理解的是,上述所描述的代价可以择一选取,也可以选取多个,具体可根据实际情况而定,此处不做限定。其中,当选取多个时,可以将这些代价进行加权求和或求平均等计算,并将最后的结果作为相应的可行策略对所关联的代价。另外,当第二策略集合中的行驶策略为无解时,可以将该行驶策略对应的可行策略对所关联的代价置为无穷大,即执行该可行策略对的成本非常大。
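By way of example, taking all five costs described above, summing the individual costs gives the final cost associated with the feasible strategy pair in which the active party has an acceleration of 1 m/s² and a lateral offset of 0 m: costTotal = 0.3797 + 1.0093 + 0 + 0.15 + 0.55 = 2.089.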
另外,对前述表1和表2所组成的每一个可行策略对均进行以上步骤的计算,可得如表3所示的所有的可行策略对下的总代价。其中,在表3中,除第一策略集合和第二策略集合之外的内容,即为各个可行策略对所关联的代价。
表3
在得到第一代价集合后可以执行S306。
S306、根据第一代价集合,确定目标行驶策略,目标行驶策略为与第一代价集合中最低的执行代价所关联的目标车辆的行驶策略。
本实施例中,在得到第一代价集合后,即可以根据第一代价集合,确定出目标行驶策略。其中,由于执行第一代价集合中最低的代价的成本最低,因此,可以将第一代价集合中最低的执行代价所关联的目标车辆的行驶策略,作为目标行驶策略。
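Selecting the lowest execution cost can be sketched as a plain minimum over the first cost set. In the illustration below only the value 2.089 comes from the worked example; the other finite costs are placeholders invented for the sketch, and the function names are hypothetical. Pairs without a solution carry an infinite cost, as described earlier.

```python
import math


def total_cost(components, weights=None):
    """Weighted sum of the individual cost components of one pair."""
    if weights is None:
        weights = [1.0] * len(components)
    return sum(w * c for w, c in zip(weights, components))


def select_min_cost_pair(cost_by_pair):
    """Return the pair with the lowest cost, or None if none is feasible."""
    best_pair = min(cost_by_pair, key=cost_by_pair.get)
    if math.isinf(cost_by_pair[best_pair]):
        return None  # every pair is infeasible
    return best_pair


# Keys are (acceleration, lateral offset) of the active party.
cost_by_pair = {
    (1.0, 0.0): total_cost([0.3797, 1.0093, 0.0, 0.15, 0.55]),
    (2.0, 0.0): 2.6,        # placeholder value
    (-3.0, 0.0): 3.1,       # placeholder value
    (1.0, 1.0): math.inf,   # no solution for the passive party
}
best = select_min_cost_pair(cost_by_pair)
```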
作为一种可能的实现方式,当第一代价集合中最低的代价所对应的可行策略对中的第二行驶策略为:被动方仅抢行主动方,或者,被动方仅让行主动方时,由于此时不论目标对象是主动方还是被动方,其行驶策略已经非常明确,其要么抢行,要么让行,因此,这时可以控制目标车辆执行与目标对象相反的行驶策略。示例性的,当目标对象的行驶策略为抢行目标车辆时,可以控制目标车辆迅速减速让行目标对象,即此时目标车辆的行驶策略为:让行目标对象;当目标对象的行驶策略为让行目标车辆时,可以控制目标车辆迅速加速抢行目标对象,即此时目标车辆的行驶策略为抢行目标对象,由此以使目标车辆快速响应目标对象的行驶意图。
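As another possible implementation, when the second driving strategy in the feasible strategy pair corresponding to the lowest cost in the first cost set is that the passive party can both preempt and yield to the active party, and the target vehicle is the active party while the target object is the passive party, the driving intent of the target object is not yet clear: the target object may either preempt or yield to the target vehicle. If the target vehicle rashly commits to a preempt or yield decision at this point, hard braking, jerky braking, or even a driver takeover is likely, and the driving experience is poor. In this case, the target vehicle can be controlled according to the first driving strategy in the feasible strategy pair corresponding to the lowest cost in the first cost set, so as to probe the driving intent of the target object; that is, this first driving strategy is taken as the driving strategy of the target vehicle. Because the target vehicle is the active party here, it has the higher right of way relative to the target object. Since the party with the lower right of way usually has to yield to the party with the higher right of way, the target vehicle can be controlled to accelerate, decelerate, or hold a constant speed in order to probe the driving intent of the target object. In addition, when the target vehicle is controlled to accelerate or decelerate, the change in its speed can be kept below a first threshold to reduce discomfort for the occupants, so that the target vehicle accelerates or decelerates gradually.
As yet another possible implementation, when the second driving strategy in the feasible strategy pair corresponding to the lowest cost in the first cost set is that the passive party can both preempt and yield to the active party, and the target vehicle is the passive party while the target object is the active party, the driving intent of the target object is not yet clear: the target object may either preempt or yield to the target vehicle. If the target vehicle rashly commits to a preempt or yield decision at this point, hard braking, jerky braking, or even a driver takeover is likely, and the driving experience is poor. Because the target vehicle is the passive party here, it has the lower right of way relative to the target object. Since the party with the lower right of way usually has to yield to the party with the higher right of way, the target vehicle can be controlled to decelerate in order to probe the driving intent of the target object. That is, the driving strategy of the target vehicle in this case is to yield to the target object. In addition, when the target vehicle is controlled to decelerate, the change in its speed can be kept below a second threshold to reduce discomfort for the occupants, so that the target vehicle decelerates gradually.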
也即是说,当第一代价集合中最低的代价所对应的可行策略对中被动方的行驶策略为单边解,且为抢行主动方时,主动方对被动方的纵向决策为让行。当该可行策略对中被动方的行驶策略为单边解,且为让行主动方时,主动方对被动方的纵向决策为抢行。当该可行策略对中被动方的行驶策略为双边解时,主动方对被动方的纵向决策为临界抢行/让行,即两者继续博弈。其中,当该可行策略对中被动方的行驶策略为双边解时,可以在横向上约束目标车辆避让的最大范围和最小范围,在纵向上约束目标车辆的速度的最大范围和最小范围,即进行双边约束。当该可行策略对中被动方的行驶策略为单边解时,在横向上约束目标车辆避让的最大范围或者最小范围,在纵向上约束目标车辆的速度的最大范围或者最小范围,即进行单边约束。
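The mapping from the lowest-cost pair to the target vehicle's driving strategy can be summarized in one function. This is a sketch of the decision logic as described in the implementations above; the string labels and the function name `target_strategy` are hypothetical.

```python
def target_strategy(target_is_active, first_strategy, second_strategy):
    """Map the lowest-cost feasible pair to the target vehicle's strategy.

    target_is_active : True if the target vehicle is the active party
    first_strategy   : active party's strategy, e.g. (acceleration, offset)
    second_strategy  : "preempt", "yield" or "both" (passive party)
    """
    if target_is_active:
        # The target object is the passive party: respond to its intent.
        if second_strategy == "preempt":
            return ("yield", None)
        if second_strategy == "yield":
            return ("preempt", None)
        # Intent unclear: execute the sampled first strategy as a probe.
        return ("probe", first_strategy)
    # The target vehicle is the passive party: the second strategy is
    # its own, and with unclear intent it defers to the party with the
    # higher right of way.
    if second_strategy == "preempt":
        return ("preempt", None)
    return ("yield", None)


decision = target_strategy(True, (1.0, 0.0), "both")
```

In the probing case the caller would execute the returned first strategy and re-run the whole procedure after a preset interval until a unilateral result appears.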
S307、根据目标行驶策略,对目标车辆进行控制。
本实施例中,在确定出目标行驶策略后,即可以根据该目标行驶策略,对目标车辆进行控制。示例性的,当目标行驶策略为:目标车辆抢行目标对象时,则控制目标车辆抢行目标对象;当目标行驶策略为:目标车辆让行目标对象时,则控制目标车辆让行目标对象;当目标行驶策略为:第一代价集合中最低的执行代价所关联的目标车辆的行驶策略时,则控制目标车辆执行该行驶策略,以及在间隔预设时间后,再次执行前述的S301至S307中的一个或多个步骤,一直这样循环下去,直至确定出目标车辆的行驶策略为仅抢行目标对象或者为仅让行目标对象。例如,当目标行驶策略为:第一代价集合中最低的执行代价所关联的目标车辆的行驶策略,且该行驶策略为:加速度为1m/s2且横向偏移0m时,则控制目标车辆以加速度为 1m/s2且横向偏移0m行驶。
这样,当目标车辆与目标对象距离较远时,目标对象抢行或让行目标车辆的行驶意图不明确,此时目标车辆可以做临界抢让行决策,即目标车辆可以继续向前行进并试探目标对象的行驶意图。当目标对象的行驶意图明确时,目标车辆可以基于目标对象的行驶意图,做出相应的调整,比如抢行目标对象,或者让行目标对象等。
由此,在主动方和被动方博弈初期或者一方的意图不明确的时期,可以通过双边约束准确控制横向规划/纵向规划的规划结果,并且随着决策约束由不确定性双边约束向确定性单边约束的连续性变化,做到自动驾驶系统面对博弈场景的类人,柔性连续决策。实现了在目标对象的行驶意图不明确时控制目标车辆缓慢加速或缓慢减速的效果,以及在目标对象的行驶意图明确时控制目标车辆快速响应目标对象的行驶意图的效果,避免了只有单边抢让行决策在交互过程中可能带来的因为决策跳变导致规划跳变,进而导致整个自动驾驶系统出现误刹/重刹或者突然抢行紧急接管,提升了驾驶体验。
可以理解的是,上述实施例中各步骤的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。此外,在一些可能的实现方式中,上述实施例中的各步骤可以根据实际情况选择性执行,可以部分执行,也可以全部执行,此处不做限定。
基于上述实施例中的方法,本申请实施例提供了一种控制装置。
示例性的,图8示出了一种控制装置。如图8所示,该控制装置800可以包括:划分模块810和处理模块820。其中,划分模块810用于将目标车辆和目标对象划分为主动方和被动方,其中,主动方相对于被动方具有优先通行权,且目标车辆和目标对象存在碰撞可能性。处理模块820用于获取主动方可行的第一策略集合,第一策略集合中包括至少一个第一行驶策略。处理模块820还用于根据第一策略集合中的各个第一行驶策略、主动方当前时刻的行驶参数和被动方当前时刻的行驶参数,得到被动方在第一策略集合中各个第一行驶策略下的第二行驶策略,以得到第二策略集合,其中,第二行驶策略为以下任意一项:被动方仅抢行主动方,被动方仅让行主动方,或者,被动方既能抢行主动方,又能让行主动方。处理模块820还用于根据第一策略集合和第二策略集合,确定目标策略对集合,目标策略对集合中包括至少一个可行策略对,每个可行策略对均由一个第一行驶策略和一个第二行驶策略组成。处理模块820还用于确定目标策略对集合中各个可行策略对的执行代价,得到第一代价集合,第一代价集合中包括每个可行策略对的执行代价。处理模块820还用于根据第一代价集合,确定目标行驶策略,目标行驶策略为与第一代价集合中最低的执行代价所关联的目标车辆的行驶策略。处理模块820还用于根据目标行驶策略,对目标车辆进行控制。
在一些实施例中,若目标车辆为主动方,目标对象为被动方时;处理模块820在根据第一代价集合,确定目标行驶策略时,具体用于:若目标代价所对应的可行策略对中的第二行驶策略为:被动方仅抢行主动方,则确定目标行驶策略为:目标车辆让行目标对象,其中,目标代价为第一代价集合中最低的执行代价;若目标代价所对应的可行策略对中的第二行驶策略为:被动方仅让行主动方,则确定目标行驶策略为:目标车辆抢行目标对象;若目标代价所对应的可行策略对中的第二行驶策略为:被动方既能抢行主动方,又能让行主动方,则确定目标行驶策略为:目标代价所对应的可行策略对中的第一行驶策略。
在一些实施例中,若目标车辆为被动方,目标对象为主动方时;处理模块820在根据第一代价集合,确定目标行驶策略时,具体用于:若目标代价所对应的可行策略对中的第二行驶策略为:被动方仅抢行主动方,则确定目标行驶策略为:目标车辆抢行目标对象,其中,目标代价为第一代价集合中最低的执行代价;若目标代价所对应的可行策略对中的第二行驶策略为:被动方仅让行主动方,则确定目标行驶策略为:目标车辆让行目标对象;若目标代价所对应的可行策略对中的第二行驶策略为:被动方既能抢行主动方,又能让行主动方,则确定目标行驶策略为:目标车辆让行目标对象。
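In some embodiments, the execution cost associated with a feasible strategy pair includes one or more of the following: a comfort cost, used to characterize the degree of comfort when executing the feasible strategy pair; a passability cost, used to characterize the efficiency with which the active party and/or the passive party passes the conflict point between them; an offset cost, used to characterize an evaluation of the offset incurred by the active party and/or the passive party when executing the corresponding driving strategy; an inconsistency cost, used to characterize an evaluation of the deviation between the driving behavior of the target object in the feasible strategy pair and the actual driving behavior of the target object; or a decision penalty cost, used to characterize an evaluation of whether the intent of the passive party is clear.
In some embodiments, for any first strategy pair among all the feasible strategy pairs, when determining the comfort cost associated with the first strategy pair, the processing module 820 is specifically configured to: determine, according to the acceleration corresponding to the first driving strategy in the first strategy pair and the current acceleration of the active party, a first comfort cost of the active party when executing the first driving strategy in the first strategy pair; determine, according to the acceleration corresponding to the second driving strategy in the first strategy pair and the current acceleration of the passive party, a second comfort cost of the passive party when executing the second driving strategy in the first strategy pair; and determine, according to the first comfort cost and the second comfort cost, the comfort cost associated with the first strategy pair.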
在一些实施例中,针对所有的可行策略对中的任意一个第一策略对,处理模块820在确定第一策略对所关联的通过性代价时,具体用于:根据第一时间和第二时间,确定主动方在执行第一策略对中的第一行驶策略时的第一通过性代价,其中,第一时间为主动方执行第一策略对中的第一行驶策略时通过目标点的时间,第二时间为主动方以其当前的速度和加速度通过目标点的时间,目标点为主动方的行驶路径和被动方的行驶路径的冲突点;根据第三时间和第四时间,确定被动方在执行第一策略对中的第二行驶策略时的第二通过性代价,其中,第三时间为被动方执行第一策略对中的第二行驶策略时通过目标点的时间,第四时间为被动方以其当前的速度和加速度通过目标点的时间;根据第一通过性代价和第二通过性代价,确定第一策略对所关联的通过性代价。
在一些实施例中,针对所有的可行策略对中的任意一个第一策略对,处理模块820在确定第一策略对所关联的偏移代价时,具体用于:基于预先设定的偏移量与偏移代价间的映射关系,以及,主动方执行第一策略对中的第一行驶策略时的偏移量,确定主动方在执行第一策略对中的第一行驶策略时的第一偏移代价;基于预先设定的偏移量与偏移代价间的映射关系,以及,被动方执行第一策略对中的第二行驶策略时的偏移量,确定被动方在执行第一策略对中的第二行驶策略时的第二偏移代价;根据第一偏移代价和第二偏移代价,确定第一策略对所关联的偏移代价。
在一些实施例中,针对所有的可行策略对中的任意一个第一策略对,处理模块820在确定第一策略对所关联的不一致性代价时,具体用于:当第一策略对中目标对象的行驶策略为仅抢行目标车辆,或者,为仅让行目标车辆时,确定目标对象抢行目标车辆的目标概率,以及,根据目标概率,确定不一致性代价;当第一策略对中目标对象的行驶策略为既能抢行目标车辆,又能让行目标车辆时,确定不一致性代价为预先设定的代价值。
在一些实施例中,针对所有的可行策略对中的任意一个第一策略对,处理模块820在确定第一策略对所关联的决策惩罚代价时,具体用于:基于预先设定的决策惩罚规则,以及,第一策略对中的第二行驶策略,确定第一策略对所关联的决策惩罚代价。
在一些实施例中,第一策略对中的第二行驶策略为:被动方既能抢行主动方,又能让行主动方;处理模块820在基于预先设定的决策惩罚规则,以及,第一策略对中的第二行驶策略,确定第一策略对所关联的决策惩罚代价时,具体用于:基于决策惩罚规则,以及,第一策略对中的第二行驶策略,确定第一决策惩罚代价;根据第一策略对中的第一行驶策略对应的加速度和主动方当前的加速度,确定主动方在执行第一策略对中的第一行驶策略时的第二决策惩罚代价;根据第一决策惩罚代价和第二决策惩罚代价,确定第一策略对所关联的决策惩罚代价。
应当理解的是,上述装置用于执行上述实施例中的方法,装置中相应的程序模块,其实现原理和技术效果与上述方法中的描述类似,该装置的工作过程可参考上述方法中的对应过程,此处不再赘述。
基于上述实施例中的装置,本申请实施例提供了一种车辆,该车辆可以包括图8中所示的控制装置800。
基于上述实施例中的方法,本申请实施例提供了一种计算机可读存储介质,计算机可读存储介质存储有计算机程序,当计算机程序在处理器上运行时,使得处理器执行上述实施例中的方法。
基于上述实施例中的方法,本申请实施例提供了一种计算机程序产品,当计算机程序产品在处理器上运行时,使得处理器执行上述实施例中的方法。
可以理解的是,本申请的实施例中的处理器可以是中央处理单元(central processing unit,CPU),还可以是其他通用处理器、数字信号处理器(digital signal processor,DSP)、专用集成电路(application specific integrated circuit,ASIC)、现场可编程门阵列(field programmable gate array,FPGA)或者其他可编程逻辑器件、晶体管逻辑器件,硬件部件或者其任意组合。通用处理器可以是微处理器,也可以是任何常规的处理器。
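The method steps in the embodiments of this application can be implemented in hardware, or by a processor executing software instructions. The software instructions can consist of corresponding software modules, which can be stored in random access memory (RAM), flash memory, read-only memory (ROM), programmable read-only memory (programmable ROM, PROM), erasable programmable read-only memory (erasable PROM, EPROM), electrically erasable programmable read-only memory (electrically EPROM, EEPROM), registers, a hard disk, a removable hard disk, a CD-ROM, or any other form of storage medium well known in the art. An exemplary storage medium is coupled to the processor so that the processor can read information from, and write information to, the storage medium. The storage medium can also be an integral part of the processor. The processor and the storage medium can be located in an ASIC.
The above embodiments can be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they can be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of this application are produced in whole or in part. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions can be stored in a computer-readable storage medium or transmitted through the computer-readable storage medium. The computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, radio, or microwave). The computer-readable storage medium can be any available medium that a computer can access, or a data storage device such as a server or data center that integrates one or more available media. The available medium can be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).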
可以理解的是,在本申请的实施例中涉及的各种数字编号仅为描述方便进行的区分,并不用来限制本申请的实施例的范围。

Claims (23)

  1. 一种控制方法,其特征在于,包括:
    将目标车辆和目标对象划分为主动方和被动方,其中,所述主动方相对于所述被动方具有优先通行权,且所述目标车辆和所述目标对象存在碰撞可能性;
    获取所述主动方可行的第一策略集合,所述第一策略集合中包括至少一个第一行驶策略;
    根据所述第一策略集合中的各个第一行驶策略、所述主动方当前时刻的行驶参数和所述被动方当前时刻的行驶参数,得到所述被动方在所述第一策略集合中各个第一行驶策略下的第二行驶策略,以得到第二策略集合,其中,所述第二行驶策略为以下任意一项:所述被动方仅抢行所述主动方,所述被动方仅让行所述主动方,或者,所述被动方既能抢行所述主动方,又能让行所述主动方;
    根据所述第一策略集合和所述第二策略集合,确定目标策略对集合,所述目标策略对集合中包括至少一个可行策略对,每个所述可行策略对均由一个所述第一行驶策略和一个所述第二行驶策略组成;
    确定所述目标策略对集合中各个可行策略对的执行代价,得到第一代价集合,所述第一代价集合中包括每个所述可行策略对的执行代价;
    根据所述第一代价集合,确定目标行驶策略,所述目标行驶策略为与所述第一代价集合中最低的执行代价所关联的所述目标车辆的行驶策略;
    根据所述目标行驶策略,对所述目标车辆进行控制。
  2. 根据权利要求1所述的方法,其特征在于,若所述目标车辆为主动方,所述目标对象为被动方时;
    所述根据所述第一代价集合,确定目标行驶策略,具体包括:
    若目标代价所对应的可行策略对中的第二行驶策略为:所述被动方仅抢行所述主动方,则确定所述目标行驶策略为:所述目标车辆让行所述目标对象,其中,所述目标代价为所述第一代价集合中最低的执行代价;
    若所述目标代价所对应的可行策略对中的第二行驶策略为:所述被动方仅让行所述主动方,则确定所述目标行驶策略为:所述目标车辆抢行所述目标对象;
    若所述目标代价所对应的可行策略对中的第二行驶策略为:所述被动方既能抢行所述主动方,又能让行所述主动方,则确定所述目标行驶策略为:所述目标代价所对应的可行策略对中的第一行驶策略。
  3. 根据权利要求1所述的方法,其特征在于,若所述目标车辆为被动方,所述目标对象为主动方时;
    所述根据所述第一代价集合,确定目标行驶策略,具体包括:
    若目标代价所对应的可行策略对中的第二行驶策略为:所述被动方仅抢行所述主动方,则确定所述目标行驶策略为:所述目标车辆抢行所述目标对象,其中,所述目标代价为所述第一代价集合中最低的执行代价;
    若所述目标代价所对应的可行策略对中的第二行驶策略为:所述被动方仅让行所述主动方,则确定所述目标行驶策略为:所述目标车辆让行所述目标对象;
    若所述目标代价所对应的可行策略对中的第二行驶策略为:所述被动方既能抢行所述主动方,又能让行所述主动方,则确定所述目标行驶策略为:所述目标车辆让行所述目标对象。
  4. 根据权利要求1-3任一所述的方法,其特征在于,所述可行策略对所关联的执行代价包括以下一项或多项:
    舒适性代价,用于表征执行所述可行策略对时的舒适程度;
    通过性代价,用于表征所述主动方和/或所述被动方通过两者冲突点的效率;
    偏移代价,用于表征对所述主动方和/或所述被动方在执行相应的行驶策略时发生偏移的评价;
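    an inconsistency cost, used to characterize an evaluation of the deviation between the driving behavior of the target object in the feasible strategy pair and the actual driving behavior of the target object;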
    或者,决策惩罚代价,用于表征对所述被动方的意图是否明确的评价。
  5. 根据权利要求4所述的方法,其特征在于,针对所有的所述可行策略对中的任意一个第一策略对,确定所述第一策略对所关联的舒适性代价,具体包括:
    根据所述第一策略对中的第一行驶策略对应的加速度和所述主动方当前的加速度,确定所述主动方在执行所述第一策略对中的第一行驶策略时的第一舒适性代价;
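    determining, according to the acceleration corresponding to the second driving strategy in the first strategy pair and the current acceleration of the passive party, a second comfort cost of the passive party when executing the second driving strategy in the first strategy pair;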
    根据所述第一舒适性代价和所述第二舒适性代价,确定所述第一策略对所关联的舒适性代价。
  6. 根据权利要求4或5所述的方法,其特征在于,针对所有的所述可行策略对中的任意一个第一策略对,确定所述第一策略对所关联的通过性代价,具体包括:
    根据第一时间和第二时间,确定所述主动方在执行所述第一策略对中的第一行驶策略时的第一通过性代价,其中,所述第一时间为所述主动方执行所述第一策略对中的第一行驶策略时通过所述目标点的时间,所述第二时间为所述主动方以其当前的速度和加速度通过所述目标点的时间,所述目标点为所述主动方的行驶路径和所述被动方的行驶路径的冲突点;
    根据第三时间和第四时间,确定所述被动方在执行所述第一策略对中的第二行驶策略时的第二通过性代价,其中,所述第三时间为所述被动方执行所述第一策略对中的第二行驶策略时通过所述目标点的时间,所述第四时间为所述被动方以其当前的速度和加速度通过所述目标点的时间;
    根据所述第一通过性代价和所述第二通过性代价,确定所述第一策略对所关联的通过性代价。
  7. 根据权利要求4-6任一所述的方法,其特征在于,针对所有的所述可行策略对中的任意一个第一策略对,确定所述第一策略对所关联的偏移代价,具体包括:
    基于预先设定的偏移量与偏移代价间的映射关系,以及,所述主动方执行所述第一策略对中的第一行驶策略时的偏移量,确定所述主动方在执行所述第一策略对中的第一行驶策略时的第一偏移代价;
    基于所述预先设定的偏移量与偏移代价间的映射关系,以及,所述被动方执行所述第一策略对中的第二行驶策略时的偏移量,确定所述被动方在执行所述第一策略对中的第二行驶策略时的第二偏移代价;
    根据所述第一偏移代价和所述第二偏移代价,确定所述第一策略对所关联的偏移代价。
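  8. The method according to any one of claims 4 to 7, characterized in that, for any first strategy pair among all the feasible strategy pairs, determining the inconsistency cost associated with the first strategy pair specifically comprises: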
    当所述第一策略对中所述目标对象的行驶策略为仅抢行所述目标车辆,或者,为仅让行所述目标车辆时,确定所述目标对象抢行所述目标车辆的目标概率,以及,根据所述目标概率,确定所述不一致性代价;
    当所述第一策略对中所述目标对象的行驶策略为既能抢行所述目标车辆,又能让行所述目标车辆时,确定所述不一致性代价为预先设定的代价值。
  9. 根据权利要求4-8任一所述的方法,其特征在于,针对所有的所述可行策略对中的任意一个第一策略对,确定所述第一策略对所关联的决策惩罚代价,具体包括:
    基于预先设定的决策惩罚规则,以及,所述第一策略对中的第二行驶策略,确定所述第一策略对所关联的决策惩罚代价。
  10. 根据权利要求9所述的方法,其特征在于,所述第一策略对中的第二行驶策略为:所述被动方既能抢行所述主动方,又能让行所述主动方;
    所述基于预先设定的决策惩罚规则,以及,所述第一策略对中的第二行驶策略,确定所述第一策略对所关联的决策惩罚代价,具体包括:
    基于所述决策惩罚规则,以及,所述第一策略对中的第二行驶策略,确定第一决策惩罚代价;
    根据所述第一策略对中的第一行驶策略对应的加速度和所述主动方当前的加速度,确定所述主动方在执行所述第一策略对中的第一行驶策略时的第二决策惩罚代价;
    根据所述第一决策惩罚代价和所述第二决策惩罚代价,确定所述第一策略对所关联的决策惩罚代价。
  11. 一种控制装置,其特征在于,所述装置包括:
    划分模块,用于将目标车辆和目标对象划分为主动方和被动方,其中,所述主动方相对于所述被动方具有优先通行权,且所述目标车辆和所述目标对象存在碰撞可能性;
    处理模块,用于获取所述主动方可行的第一策略集合,所述第一策略集合中包括至少一个第一行驶策略;
    所述处理模块,还用于根据所述第一策略集合中的各个第一行驶策略、所述主动方当前时刻的行驶参数和所述被动方当前时刻的行驶参数,得到所述被动方在所述第一策略集合中各个第一行驶策略下的第二行驶策略,以得到第二策略集合,其中,所述第二行驶策略为以下任意一项:所述被动方仅抢行所述主动方,所述被动方仅让行所述主动方,或者,所述被动方既能抢行所述主动方,又能让行所述主动方;
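    the processing module is further configured to determine a target strategy pair set according to the first strategy set and the second strategy set, the target strategy pair set including at least one feasible strategy pair, each feasible strategy pair consisting of one first driving strategy and one second driving strategy;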
    所述处理模块,还用于确定所述目标策略对集合中各个可行策略对的执行代价,得到第一代价集合,所述第一代价集合中包括每个所述可行策略对的执行代价;
    所述处理模块,还用于根据所述第一代价集合,确定目标行驶策略,所述目标行驶策略为与所述第一代价集合中最低的执行代价所关联的所述目标车辆的行驶策略;
    所述处理模块,还用于根据所述目标行驶策略,对所述目标车辆进行控制。
  12. 根据权利要求11所述的装置,其特征在于,若所述目标车辆为主动方,所述目标对象为被动方时;
    所述处理模块在根据所述第一代价集合,确定目标行驶策略时,具体用于:
    若目标代价所对应的可行策略对中的第二行驶策略为:所述被动方仅抢行所述主动方,则确定所述目标行驶策略为:所述目标车辆让行所述目标对象,其中,所述目标代价为所述第一代价集合中最低的执行代价;
    若所述目标代价所对应的可行策略对中的第二行驶策略为:所述被动方仅让行所述主动方,则确定所述目标行驶策略为:所述目标车辆抢行所述目标对象;
    若所述目标代价所对应的可行策略对中的第二行驶策略为:所述被动方既能抢行所述主动方,又能让行所述主动方,则确定所述目标行驶策略为:所述目标代价所对应的可行策略对中的第一行驶策略。
  13. 根据权利要求11所述的装置,其特征在于,若所述目标车辆为被动方,所述目标对象为主动方时;
    所述处理模块在根据所述第一代价集合,确定目标行驶策略时,具体用于:
    若目标代价所对应的可行策略对中的第二行驶策略为:所述被动方仅抢行所述主动方,则确定所述目标行驶策略为:所述目标车辆抢行所述目标对象,其中,所述目标代价为所述第一代价集合中最低的执行代价;
    若所述目标代价所对应的可行策略对中的第二行驶策略为:所述被动方仅让行所述主动方,则确定所述目标行驶策略为:所述目标车辆让行所述目标对象;
    若所述目标代价所对应的可行策略对中的第二行驶策略为:所述被动方既能抢行所述主动方,又能让行所述主动方,则确定所述目标行驶策略为:所述目标车辆让行所述目标对象。
  14. 根据权利要求11-13任一所述的装置,其特征在于,所述可行策略对所关联的执行代价包括以下一项或多项:
    舒适性代价,用于表征执行所述可行策略对时的舒适程度;
    通过性代价,用于表征所述主动方和/或所述被动方通过两者冲突点的效率;
    偏移代价,用于表征对所述主动方和/或所述被动方在执行相应的行驶策略时发生偏移的评价;
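    an inconsistency cost, used to characterize an evaluation of the deviation between the driving behavior of the target object in the feasible strategy pair and the actual driving behavior of the target object;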
    或者,决策惩罚代价,用于表征对所述被动方的意图是否明确的评价。
  15. 根据权利要求14所述的装置,其特征在于,针对所有的所述可行策略对中的任意一个第一策略对,所述处理模块在确定所述第一策略对所关联的舒适性代价时,具体用于:
    根据所述第一策略对中的第一行驶策略对应的加速度和所述主动方当前的加速度,确定所述主动方在执行所述第一策略对中的第一行驶策略时的第一舒适性代价;
    根据所述第一策略对中的第二行驶策略对应的加速度和所述被动方当前的加速度,确定所述被动方在执行所述第一策略对中的第二行驶策略时的第二舒适性代价;
    根据所述第一舒适性代价和所述第二舒适性代价,确定所述第一策略对所关联的舒适性代价。
  16. 根据权利要求14或15所述的装置,其特征在于,针对所有的所述可行策略对中的任意一个第一策略对,所述处理模块在确定所述第一策略对所关联的通过性代价时,具体用于:
    根据第一时间和第二时间,确定所述主动方在执行所述第一策略对中的第一行驶策略时的第一通过性代价,其中,所述第一时间为所述主动方执行所述第一策略对中的第一行驶策略时通过所述目标点的时间,所述第二时间为所述主动方以其当前的速度和加速度通过所述目标点的时间,所述目标点为所述主动方的行驶路径和所述被动方的行驶路径的冲突点;
    根据第三时间和第四时间,确定所述被动方在执行所述第一策略对中的第二行驶策略时的第二通过性代价,其中,所述第三时间为所述被动方执行所述第一策略对中的第二行驶策略时通过所述目标点的时间,所述第四时间为所述被动方以其当前的速度和加速度通过所述目标点的时间;
    根据所述第一通过性代价和所述第二通过性代价,确定所述第一策略对所关联的通过性代价。
  17. 根据权利要求14-16任一所述的装置,其特征在于,针对所有的所述可行策略对中的任意一个第一策略对,所述处理模块在确定所述第一策略对所关联的偏移代价时,具体用于:
    基于预先设定的偏移量与偏移代价间的映射关系,以及,所述主动方执行所述第一策略对中的第一行驶策略时的偏移量,确定所述主动方在执行所述第一策略对中的第一行驶策略时的第一偏移代价;
    基于所述预先设定的偏移量与偏移代价间的映射关系,以及,所述被动方执行所述第一策略对中的第二行驶策略时的偏移量,确定所述被动方在执行所述第一策略对中的第二行驶策略时的第二偏移代价;
    根据所述第一偏移代价和所述第二偏移代价,确定所述第一策略对所关联的偏移代价。
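  18. The apparatus according to any one of claims 14 to 17, characterized in that, for any first strategy pair among all the feasible strategy pairs, when determining the inconsistency cost associated with the first strategy pair, the processing module is specifically configured to: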
    当所述第一策略对中所述目标对象的行驶策略为仅抢行所述目标车辆,或者,为仅让行所述目标车辆时,确定所述目标对象抢行所述目标车辆的目标概率,以及,根据所述目标概率,确定所述不一致性代价;
    当所述第一策略对中所述目标对象的行驶策略为既能抢行所述目标车辆,又能让行所述目标车辆时,确定所述不一致性代价为预先设定的代价值。
  19. 根据权利要求14-18任一所述的装置,其特征在于,针对所有的所述可行策略对中的任意一个第一策略对,所述处理模块在确定所述第一策略对所关联的决策惩罚代价时,具体用于:
    基于预先设定的决策惩罚规则,以及,所述第一策略对中的第二行驶策略,确定所述第一策略对所关联的决策惩罚代价。
  20. 根据权利要求19所述的装置,其特征在于,所述第一策略对中的第二行驶策略为:所述被动方既能抢行所述主动方,又能让行所述主动方;
    所述处理模块在基于预先设定的决策惩罚规则,以及,所述第一策略对中的第二行驶策略,确定所述第一策略对所关联的决策惩罚代价时,具体用于:
    基于所述决策惩罚规则,以及,所述第一策略对中的第二行驶策略,确定第一决策惩罚代价;
    根据所述第一策略对中的第一行驶策略对应的加速度和所述主动方当前的加速度,确定所述主动方在执行所述第一策略对中的第一行驶策略时的第二决策惩罚代价;
    根据所述第一决策惩罚代价和所述第二决策惩罚代价,确定所述第一策略对所关联的决策惩罚代价。
  21. 一种车辆,其特征在于,包括如权利要求11-20任一所述的控制装置。
  22. 一种计算机可读存储介质,所述计算机可读存储介质存储有计算机程序,当所述计算机程序在处理器上运行时,使得所述处理器执行如权利要求1-10任一所述的方法。
  23. 一种计算机程序产品,其特征在于,当所述计算机程序产品在处理器上运行时,使得所述处理器执行如权利要求1-10任一所述的方法。
PCT/CN2023/103926 2022-09-05 2023-06-29 一种控制方法、装置及车辆 WO2024051310A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211077102.4 2022-09-05
CN202211077102.4A CN115285149A (zh) 2022-09-05 2022-09-05 一种控制方法、装置及车辆

Publications (1)

Publication Number Publication Date
WO2024051310A1 true WO2024051310A1 (zh) 2024-03-14

Family

ID=83832199

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/103926 WO2024051310A1 (zh) 2022-09-05 2023-06-29 一种控制方法、装置及车辆

Country Status (2)

Country Link
CN (1) CN115285149A (zh)
WO (1) WO2024051310A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115285149A (zh) * 2022-09-05 2022-11-04 华为技术有限公司 一种控制方法、装置及车辆

Also Published As

Publication number Publication date
CN115285149A (zh) 2022-11-04

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23862011

Country of ref document: EP

Kind code of ref document: A1