CN117508219A

CN117508219A - Vehicle path planning method and device

Info

Publication number: CN117508219A
Application number: CN202311224581.2A
Authority: CN
Inventors: 高艺璇
Original assignee: Zero Beam Technology Co ltd
Current assignee: Zero Beam Technology Co ltd
Priority date: 2023-09-21
Filing date: 2023-09-21
Publication date: 2024-02-06

Abstract

A vehicle path planning method and apparatus, the method comprising: acquiring image information and related data information of traffic participants and road conditions around a tested vehicle through cameras, laser radars and the like; determining a target traffic participant associated with a vehicle path planning task using the image information and the data information; and extracting effective feature vectors of the target traffic participants by using a feature extraction model, mapping the feature vectors into a path point coordinate sequence of the vehicle in a future period of time based on a multi-head attention mechanism and reinforcement learning, and planning a path for the vehicle based on the path point coordinate sequence. The invention can determine the target traffic participants related to the path planning task in a complex traffic environment and ignore other irrelevant traffic participants and their behaviors, thereby solving the problems that prior-art path planning methods adapt poorly and that path planning is easily affected by changes in the surrounding environment. The attention mechanism and reinforcement learning used by the invention may also enhance the interpretability of the model.

Description

Vehicle path planning method and device

Technical Field

The present invention relates to the field of automatic driving technologies, and in particular, to a vehicle path planning method and apparatus.

Background

In the automatic driving field, a vehicle acquires surrounding environment information through an environment sensing module, performs behavior decision, path planning and motion control by combining the position information and the motion state information of the vehicle, and generates a corresponding control instruction; the control instruction is then issued to the chassis actuators of the vehicle, so that automatic driving of the vehicle is realized. Existing automatic driving path planning models can be divided into two major types: rule-based and learning-based. Rule-based models have a simple structure, strong interpretability and good real-time performance, but in complex environments they are inflexible and not robust, and switching between rules makes the control actions abrupt, so that passenger riding comfort is poor.

Learning-based path planning methods generally make decisions according to the state information of the vehicle itself and the state information and road conditions of other traffic participants within a certain surrounding range, and fail to fully consider the following: the path planning task of the own vehicle, namely the tested vehicle, is related not only to traffic participants close to the tested vehicle, but also to traffic participants that are far from the own vehicle yet conflict with its future path; traffic participants near the own vehicle are not all related to the path planning task; traffic participants that do not conflict with the future path of the own vehicle need not be taken into account; the flexibility of the planning method in the face of complex and sudden environmental changes; and the interpretability of the path planning method.

Disclosure of Invention

In order to overcome the defects in the prior art, the invention provides a vehicle path planning method and device, which can capture specific traffic participants and road conditions strongly related to a vehicle path planning task and optimize the prior vehicle path planning method.

In a first aspect of the present invention, there is provided a vehicle path planning method, comprising:

acquiring image information and related data information of traffic participants around a tested vehicle and road conditions through cameras, laser radars and the like;

determining a target traffic participant associated with a vehicle path planning task using the image information and the data information;

and extracting effective feature vectors of the target traffic participants by using a feature extraction model, mapping the feature vectors into a path point coordinate sequence of the vehicle in a future period of time based on a multi-head attention mechanism and reinforcement learning, and planning a path for the vehicle based on the path point coordinate sequence.

In an alternative embodiment, the determining a target traffic participant associated with a vehicle path planning task using the image information and the data information includes:

predicting the future running track of the traffic participant, and taking the traffic participant with the future running track conflicting with the path planned in the future of the vehicle as a target traffic participant.

In an alternative embodiment, the extracting the effective feature vector of the target traffic participant using a feature extraction model includes:

extracting effective information from the image information and the data information using a convolutional neural network model and a multi-layer perceptron respectively, wherein the convolutional neural network model outputs the feature vector extracted from the image information as a first feature vector, and the multi-layer perceptron outputs the feature vector extracted from the data information as a second feature vector;

stitching the first feature vector and the second feature vector into the effective feature vector of the target traffic participant.

In an alternative embodiment, the stitching the first feature vector and the second feature vector into the effective feature vector of the target traffic participant includes:

aligning a tensor dimension of the first feature vector with a tensor dimension of the second feature vector, wherein the first feature vector characterizes image information from a camera, and the second feature vector characterizes data information acquired by a sensor;

and treating the first feature vector as the (N+2)-th second feature vector, inputting the second feature vectors into a multi-head attention network, and splicing and aggregating the first feature vector and the second feature vectors into a representation tensor of the target traffic participant and the surrounding road conditions, the representation tensor being used as the subsequent input to the multi-head attention mechanism network.

In an alternative embodiment, the multi-head attention model is combined with the Actor-Critic algorithm architecture in reinforcement learning, allowing the input tensor to be variable in dimension and arrangement order; that is, the number of target traffic participants around the tested vehicle is variable, and the arrangement order of the characterization vectors of the target traffic participants is variable; and the feature vector can be mapped into a path point coordinate sequence of the own vehicle in a future period of time.

In an alternative embodiment, the method further includes: acquiring image information and data information of traffic participants around the vehicle at the current moment and a plurality of previous moments; and taking the image information and the data information of the traffic participants around the vehicle at the current moment and the plurality of previous moments as input data of the feature extraction model.

A second aspect of the present invention provides a vehicle path planning apparatus, comprising:

the acquisition and identification module is used for acquiring image information and related data information of traffic participants around the tested vehicle and road conditions through cameras, laser radars and the like;

a target determination module for determining a target traffic participant associated with a vehicle path planning task using the image information and the data information;

and the path planning module is used for extracting effective feature vectors of the target traffic participants by utilizing a feature extraction model, mapping the feature vectors into a path point coordinate sequence of the vehicle in a future period of time based on a multi-head attention mechanism and reinforcement learning, and planning a path for the vehicle based on the path point coordinate sequence.

According to a third aspect of the invention, a vehicle control method is provided, which comprises obtaining a vehicle planning path determined by the vehicle path planning method according to the first aspect of the invention, and controlling the vehicle to run according to the vehicle planning path.

In a fourth aspect of the present invention, there is provided an electronic apparatus comprising:

at least one processor; and at least one memory communicatively coupled to the processor, wherein: the memory stores program instructions executable by the processor, the processor invoking the program instructions to perform the method according to the first aspect of the embodiments of the invention.

In a fifth aspect of the invention, a computer-readable storage medium is provided, on which a computer program is stored, which, when being executed by a computer, performs the method according to the first aspect of the embodiment of the invention.

The invention is a learning-based automatic driving path planning model that can determine the target traffic participants related to the path planning task in complex and changeable traffic environments, and ignore traffic participants and behaviors unrelated to the path planning task. The invention can enhance the decision capability of the vehicle in complex and changeable traffic environments, especially in emergencies, so that path planning generalizes better and is more flexible; it thereby solves the problems that prior-art path planning adapts poorly and is easily affected by changes in the surrounding environment.

Drawings

Fig. 1 is a flow chart of a vehicle path planning method according to an embodiment of the invention.

Fig. 2 is a flow chart of a method for identifying a target traffic participant in an embodiment of the invention.

Fig. 3 is a schematic block diagram of a vehicle path planning apparatus according to an embodiment of the invention.

Fig. 4 is a flowchart of a vehicle control method according to an embodiment of the invention.

Fig. 5 is a schematic structural view of an electronic device according to an embodiment of the present invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

It should be understood that the terms "first," "second," and "third," etc. in the claims, specification and drawings of the present disclosure are used for distinguishing between different objects and not for describing a particular sequential order. The terms "comprises" and "comprising" when used in the specification and claims of the present disclosure, specify the presence of stated features, integers, steps, operations, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

In the prior art, reinforcement learning is mostly used for training the motion controller of an automatic driving vehicle, and the common approach to path planning is rule-based. However, the surrounding environment in the real world is changeable and therefore difficult to predict accurately. Once the surrounding environment changes, for example the behavior of an obstacle, the vehicle's decision must change immediately. The learning-based path planning method has stronger generalization capability and flexibility than the rule-based method.

The inputs of path planning include information such as high-precision maps, traffic light states, destination coordinates, and the pose and position coordinates of the vehicle. The output of path planning is a function that maps different moments to the corresponding position coordinates of the vehicle, i.e. the positions that the vehicle needs to reach at given moments in the future. Path planning thus calculates a safe and comfortable trajectory with which the autonomous vehicle completes a predetermined driving task. Safe means that the vehicle keeps a proper distance from obstacles while driving, so that collisions are avoided; comfortable means providing passengers with a comfortable riding experience, for example by avoiding overly sudden acceleration and deceleration and by decelerating properly in a curve to avoid excessive centripetal acceleration; and completing the driving task means that the planned path reaches the destination without violating traffic regulations and without an unacceptable travel time caused by too low a travel speed.

Path planning methods based on deep reinforcement learning generally make decisions according to the state information of the own vehicle and the state information and road conditions of other traffic participants within a certain range around the own vehicle. They fail to fully consider that the path planning task of the own vehicle is related only to specific traffic participants, and instead let the model take all traffic participants within a certain range around the own vehicle into account. Furthermore, the number of traffic participants within a range around the vehicle is constantly changing, so the input of the path planning model needs to meet the following two requirements: first, its dimension is variable; second, the order in which the characterization vectors of the traffic participants are input to the network is variable. The conventional path planning networks in the automatic driving field, such as convolutional neural networks and multi-layer perceptrons, cannot meet both requirements at the same time.

The path planning strategy network provided by the invention focuses mainly on the traffic participants strongly related to the path planning task of the own vehicle, predicts their driving tracks over a future period of time, and prevents them from conflicting with the planned path of the own vehicle. For example, suppose the own vehicle plans to turn left at a multi-lane intersection without traffic lights, is first in line in the left-turn area, and has no other vehicle in front; the planning network then only needs to pay attention to the vehicles that intend to drive into the target lane, and does not need to pay attention to the behavior of other vehicles on the road where the own vehicle is located. The invention provides a network based on a multi-head attention mechanism to capture the dependence of the own vehicle on specific traffic participants and to meet the requirement that the strategy model accept input vectors of variable dimension and arrangement order. This works because the network structure of the multi-head attention mechanism ensures that the characterization vectors yield the same result in whatever order they are input to the model, i.e. the output is invariant to the order of the input vectors.

Path planning includes detection (Detection), tracking (Tracking), mapping (Mapping), trajectory prediction (Motion Forecasting), occupancy grid prediction (Occupancy Prediction), and finally safe path planning (Planning). The present invention relates mainly to tracking and trajectory prediction, and aims to complete the path planning task of the vehicle by identifying the particular traffic participants associated with that task and predicting their movement trajectories relative to the vehicle.
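As a minimal sketch of the "function mapping moments to position coordinates" described above, the following Python example linearly interpolates between timestamped path points. The function name `make_path_function` and the choice of linear interpolation are illustrative assumptions and are not specified by the invention.

```python
from bisect import bisect_right


def make_path_function(times, waypoints):
    """Return f(t) -> (x, y) built from timestamped path points.

    Linear interpolation is an assumption made for this sketch; the
    description only requires a mapping from moments to positions.
    """
    if len(times) != len(waypoints) or len(times) < 2:
        raise ValueError("need at least two timestamped waypoints")

    def position_at(t):
        # Clamp queries that fall outside the planned horizon.
        if t <= times[0]:
            return waypoints[0]
        if t >= times[-1]:
            return waypoints[-1]
        j = bisect_right(times, t)  # times[j - 1] <= t < times[j]
        t0, t1 = times[j - 1], times[j]
        (x0, y0), (x1, y1) = waypoints[j - 1], waypoints[j]
        w = (t - t0) / (t1 - t0)
        return (x0 + w * (x1 - x0), y0 + w * (y1 - y0))

    return position_at


path = make_path_function([0.0, 1.0, 2.0], [(0.0, 0.0), (2.0, 0.0), (2.0, 4.0)])
print(path(0.5), path(1.5))
```

A motion controller could query such a function at its own control rate, independently of the rate at which the planner emits path points.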

Referring to fig. 1, the present invention provides a vehicle path planning method, which includes:

Step 100: acquiring image information and related data information of traffic participants and road conditions around the tested vehicle through cameras, laser radars and the like.

Image information of the traffic participants and road conditions around the vehicle is acquired by the vehicle-mounted cameras, typically from the video captured by the surround-view cameras, and the traffic participants are identified from the images. Illustratively, the identification of traffic participants is accomplished in a target detection model using BEV (Bird's Eye View) images. Typically, the image is converted into image features, based on which the traffic participants are identified. The traffic participants include pedestrians, vehicles, non-movable obstacles, etc. The data information of the traffic participants around the own vehicle includes the position coordinates, the camber angle, the pose state (roll), and the like of the traffic participants including the own vehicle, and can be obtained by radar or the like. The road conditions include the number of lanes of the road where the own vehicle is located, the speed limit of each lane and the like, and can be obtained from a high-precision map.

In one embodiment, feature extraction and information aggregation are performed on the image information and the data information of the traffic participants and road conditions around the vehicle using a CNN (convolutional neural network) and an MLP (multi-layer perceptron) respectively, to obtain the required feature vectors. It should be understood that the information aggregation here is computed over all traffic participants around the own vehicle and is not limited to the target traffic participants.

Step 200: determining a target traffic participant related to the vehicle path planning task by using the image information and the data information.

All traffic participants around the own vehicle can be identified through the above step. With the path planning of the own vehicle as the task target, some of these traffic participants do not need attention. For example, when the own vehicle drives straight in the middle lane, vehicles behind it in the adjacent lanes do not need attention; when the own vehicle changes lanes to the right, vehicles in the lane to its left do not need attention; when the own vehicle changes lanes to the left, vehicles in the lane to its right do not need attention.

Therefore, all traffic participants around the vehicle can be screened according to the current and planned future movement directions of the vehicle, its lane and the surrounding road conditions, and the traffic participants whose future driving track may conflict with the future path planned for the vehicle can be identified. In one embodiment, the future travel track of each traffic participant is predicted, and a traffic participant whose future travel track lies on the target travel lane of the own vehicle or on the lane in which the own vehicle is currently traveling is taken as a target traffic participant. The future travel track needs to be calculated only for traffic participants that affect the movement direction of the vehicle or the lane in which the vehicle is located. It should be understood that the target traffic participants include fixed obstacles in the travel direction of the vehicle or on the lane in which the vehicle is traveling, the track of a fixed obstacle being constant.

After the target traffic participant has been identified and determined, it may be tracked using an intelligent tracking network. In particular, a set of tracking query vectors may be used to detect the target traffic participant from the image features and to track it continuously. Introducing the set of tracking query vectors makes it possible to model the whole process of a target participant from its appearance in the scene to its complete disappearance. The tracking query vectors are processed with the image features by a multi-head cross-attention operation and decoded by a multi-layer perceptron (MLP), finally yielding attributes of the tracked object such as its boundary position coordinates, speed, acceleration and angular speed.

Since the position coordinates of the traffic participants are known from step 100, the interactions between the determined target traffic participants and the different map elements are generated based on the high-precision map, and the future trajectory of each target traffic participant is predicted. The map may be generated online in a 3D scene, and the map elements of different categories present in the scene may be segmented, such as lane lines, sidewalks, zebra crossings, boundaries and drivable areas (these map elements are used by downstream task modules to learn the surrounding environment information). From the data information of the target traffic participants around the own vehicle and the map information, the future movement tracks of the target traffic participants in the scene can be predicted by an Actor-Critic composite network based on an attention mechanism. Occupancy grid prediction can take the bird's-eye-view feature map as a query vector, judge from the value of each occupancy grid cell whether the cell is occupied by a target traffic participant, update the future bird's-eye-view feature map through a multi-head attention mechanism, and predict multi-step future occupancy grid maps.
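The screening idea described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the constant-velocity predictor, the distance-based conflict test, the `conflict_radius` value and the dictionary field names are all hypothetical, whereas the embodiment itself tests whether the predicted track lies on the target lane or the current lane of the own vehicle.

```python
import math


def predict_constant_velocity(x, y, vx, vy, horizon_steps, dt):
    """Predict future positions assuming constant velocity (an assumption)."""
    return [(x + vx * dt * k, y + vy * dt * k) for k in range(1, horizon_steps + 1)]


def select_target_participants(ego_path, participants, dt=0.1, conflict_radius=2.0):
    """Keep participants whose predicted track comes within conflict_radius
    of the planned ego position at the same future step."""
    targets = []
    for p in participants:
        track = predict_constant_velocity(
            p["x"], p["y"], p["vx"], p["vy"], len(ego_path), dt
        )
        for (ex, ey), (px, py) in zip(ego_path, track):
            if math.hypot(ex - px, ey - py) <= conflict_radius:
                targets.append(p["id"])
                break
    return targets


# Ego drives along the x axis at 10 m/s; one path point every 0.1 s.
ego_path = [(1.0 * k, 0.0) for k in range(1, 9)]
participants = [
    {"id": "rear_adjacent", "x": -10.0, "y": 3.5, "vx": 10.0, "vy": 0.0},
    {"id": "fixed_obstacle", "x": 5.0, "y": 0.5, "vx": 0.0, "vy": 0.0},
    {"id": "crossing", "x": 6.0, "y": -3.0, "vx": 0.0, "vy": 5.0},
]
print(select_target_participants(ego_path, participants))
```

A fixed obstacle is handled by the same code path with zero velocity, which matches the statement that its track is constant.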

Step 300: extracting effective feature vectors of the target traffic participants by using a feature extraction model, mapping the feature vectors into a path point coordinate sequence of the vehicle in a future period of time based on a multi-head attention mechanism and reinforcement learning, and planning a path for the vehicle based on the path point coordinate sequence.

The effective feature vectors of the own vehicle and the target traffic participants are obtained by feature extraction using the CNN (convolutional neural network) and the MLP (multi-layer perceptron). An Actor-Critic composite neural network model based on a multi-head attention mechanism maps the feature vectors into a path point coordinate sequence of the own vehicle in a future period of time, which is used as the planned path of the own vehicle.

In one embodiment of the invention, the image information and the data information of the traffic participants and road conditions around the own vehicle at the current moment and a plurality of previous moments are obtained and used as input data of the feature extraction model. For example, the invention takes the image information and data information at 8 moments in total, i.e. the current moment and the 7 moments before it, as input to the feature extraction network; if sensing information is acquired every 0.1 second, the sensing information from the current moment and the preceding 0.7 seconds is used. Correspondingly, the finally output planned path is the path point coordinate sequence of the own vehicle at 8 future moments, which can be regarded as a function mapping the 8 future moments to the corresponding positions.

The invention mainly uses the CNN and the MLP to extract features, splices and aggregates the results into a feature vector, and then inputs the feature vector into the Actor-Critic composite network based on the multi-head attention mechanism.

As can be seen from the above, the invention is a learning-based path planning model that can determine the target traffic participants related to the path planning task in a complex traffic environment, ignore other irrelevant traffic participants and their behaviors, and improve the decision-making ability of the own vehicle, so that the path planning of the own vehicle generalizes better and is more flexible; it solves the problems that prior-art path planning adapts poorly and is easily affected by changes in the surrounding environment. The composite network consists of an Actor-Critic algorithm framework based on a multi-head attention mechanism. The own vehicle is regarded as a single agent, and the acquired perception information is regarded as the observation space of the agent. The output of the Actor part of the Actor-Critic framework is the action space of the agent, namely the path point coordinate sequence of the vehicle at 8 future moments; the Critic part is an action-value function that is responsible for evaluating the performance of the Actor in a given state and guiding the action of the Actor in the next stage. Finally, in the simulation experiment stage, only the Actor part of the trained model is needed, and the path point coordinate sequence it outputs for the 8 future moments of the vehicle is passed to the vehicle motion controller.

The training process of the CNN, the MLP and the Actor-Critic composite network based on the multi-head attention mechanism is as follows:
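The sliding window of 8 moments described above can be illustrated with a small Python buffer. The class name `ObservationHistory` and the dictionary layout of a frame are assumptions made for this sketch only.

```python
from collections import deque


class ObservationHistory:
    """Keeps the most recent frames (current moment plus previous ones)."""

    def __init__(self, num_frames=8):
        # A bounded deque drops the oldest frame automatically.
        self.frames = deque(maxlen=num_frames)

    def push(self, image, data):
        """Store one sensing result, e.g. acquired every 0.1 second."""
        self.frames.append({"image": image, "data": data})

    def ready(self):
        return len(self.frames) == self.frames.maxlen

    def as_model_input(self):
        """Return the frames oldest first, ending with the current moment."""
        if not self.ready():
            raise RuntimeError("not enough history collected yet")
        return list(self.frames)


history = ObservationHistory(num_frames=8)
for step in range(10):
    history.push(image=f"frame_{step}", data={"step": step})
window = history.as_model_input()
print(len(window), window[0]["image"], window[-1]["image"])
```

With a 0.1 second sampling period, a full buffer covers the current moment and the preceding 0.7 seconds.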

Preprocessing of training data:

In state $s_0$, sensors collect information on the traffic participants around the own vehicle, and radar and the like obtain the data information of the traffic participants. Image information is collected within 50 meters to the front, rear, left and right of the tested vehicle, which is taken as the center, so that the collected area is 100 × 100 m². The images are processed with Python's cv2 package to obtain RGB arrays, which are then passed into the CNN for feature extraction. The tensor finally output by the CNN is denoted $c_i$; $c_i$ must be aligned in dimension with the feature vectors that the MLP extracts from the data information of each vehicle. For the data information of the N vehicles (including the tested vehicle) within a specific range collected by the vehicle sensors, the vector containing the data information of the i-th vehicle is denoted $m_i$. The matrix formed by all $m_i$ is input into three MLPs respectively for feature extraction, yielding a query matrix, a key matrix and a value matrix; the query vector, key vector and value vector of the i-th vehicle are denoted $q_i$, $k_i$ and $v_i$ respectively. The aforementioned $c_i$ must have the same number of elements as $q_i$ and $v_i$, and in the subsequent calculation $c_i$ can be regarded as the (N+1)-th vehicle other than the own vehicle.

Training the agent:

The query vector $q_0$ corresponding to the tested vehicle is multiplied in turn by the N+1 key vectors $k_i$ (one tested vehicle plus N surrounding vehicles) to obtain the attention weight $a_i$ of the tested vehicle relative to each vehicle. With $d$ the number of elements of the key vector $k_i$, each weight $a_i$ is divided by $\sqrt{d}$, normalized with a softmax function, and then multiplied by the corresponding $v_i$; the results are summed to obtain the output $f$ of the multi-head attention mechanism, as shown in formula (I), which is passed through a multi-layer perceptron to obtain the output of the Actor network part. The Actor outputs a sequence of path point coordinates at 5 future moments.

$$a_i = q_0 \cdot k_i, \qquad f = \sum_{i=0}^{N} \operatorname{softmax}_i\!\left(\frac{a_i}{\sqrt{d}}\right) v_i \tag{I}$$
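The attention computation described above for the query vector of the tested vehicle can be sketched in plain Python as follows. The sketch shows a single attention head only; a multi-head layer would repeat it with separate query, key and value projections and concatenate the results. The function names are illustrative assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def ego_attention(q0, keys, values):
    """Scaled dot-product attention for the ego query q0.

    keys and values hold one vector per vehicle (any number, any order).
    Returns the aggregated feature f and the attention weights.
    """
    d = len(q0)
    # a_i = q0 . k_i, scaled by sqrt(d) before normalization.
    scores = [sum(a * b for a, b in zip(q0, k)) / math.sqrt(d) for k in keys]
    weights = softmax(scores)
    out_dim = len(values[0])
    f = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(out_dim)]
    return f, weights


q0 = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]]
values = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
f, weights = ego_attention(q0, keys, values)
print(f, weights)
```

Because the weighted sum runs over whatever vectors are supplied, the same function accepts any number of vehicles, and reordering the key and value pairs leaves the output unchanged up to floating-point rounding. This is the property the description relies on for variable input dimension and arrangement order.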

The path point coordinate sequence output by the Actor part is transmitted to a motion controller of the vehicle, the controller used in the simulation experiment is a PID controller (Proportion Integration Differentiation, proportional-integral-derivative controller), and the dynamics model is a bicycle model.
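For illustration, a PID controller driving a kinematic bicycle model can be sketched as below. The gains, the wheelbase value, the rear-axle kinematic formulation and the speed-tracking scenario are assumptions chosen for this sketch and are not the settings of the simulation experiment.

```python
import math


class PID:
    """Textbook proportional-integral-derivative controller."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def step(self, error, dt):
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def bicycle_step(state, accel, steer, dt, wheelbase=2.7):
    """One Euler step of the kinematic bicycle model (rear-axle reference).

    state is (x, y, yaw, v); steer is the front wheel angle in radians.
    """
    x, y, yaw, v = state
    x += v * math.cos(yaw) * dt
    y += v * math.sin(yaw) * dt
    yaw += v / wheelbase * math.tan(steer) * dt
    v += accel * dt
    return (x, y, yaw, v)


def track_speed(target_speed, steps=400, dt=0.1):
    """Drive straight while a PID loop regulates speed (hypothetical gains)."""
    pid = PID(kp=0.8, ki=0.1, kd=0.0)
    state = (0.0, 0.0, 0.0, 0.0)
    for _ in range(steps):
        accel = pid.step(target_speed - state[3], dt)
        state = bicycle_step(state, accel, steer=0.0, dt=dt)
    return state


print(track_speed(10.0))
```

In a full tracker, a second PID loop would typically convert the lateral or heading error relative to the next path point into the steering command.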

In state $s_0$ the agent takes action $a_0$; the environment receives the action of the agent and changes from state $s_0$ to $s_1$, and at the same time the environment returns a reward $r_1$ for the behavior of the agent. The reward consists of the following parts: whether a collision with other traffic participants occurs, the offset from the path point sequence given by the path planner, the offset from the center line of the lane, whether a collision with the road shoulder occurs, the duration of the simulation, the driving speed of the own vehicle, etc.

The Critic part acts as an action-value function Q(s, a) to evaluate the action selected by the Actor. The Critic network consists of three fully connected layers and uses the tanh function as the activation function.

The Actor and Critic networks are updated using the TD3 algorithm (Twin Delayed Deep Deterministic policy gradient algorithm); however, the update is not limited to the TD3 algorithm, and other algorithms conforming to the Actor-Critic architecture may be used. The invention changes the network structure of the agent so that the agent focuses on the relevant traffic participants and ignores the traffic participants irrelevant to the path planning task, and effectively combines the information obtained by the agent from the observation space with the attention mechanism to form the Actor part.
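A reward built from the listed components might be composed as in the following Python sketch. All weights, the target speed and the field names are hypothetical; the description names the components but gives no numerical values.

```python
from dataclasses import dataclass


@dataclass
class StepInfo:
    collided_with_participant: bool
    hit_road_shoulder: bool
    waypoint_offset_m: float    # distance from the planned path points
    centerline_offset_m: float  # distance from the lane center line
    elapsed_s: float            # simulation time used so far
    speed_mps: float            # current speed of the own vehicle


def compute_reward(info, target_speed_mps=10.0):
    """Weighted sum of penalty terms; every weight here is an assumption."""
    reward = 0.0
    if info.collided_with_participant:
        reward -= 100.0
    if info.hit_road_shoulder:
        reward -= 50.0
    reward -= 1.0 * info.waypoint_offset_m
    reward -= 0.5 * info.centerline_offset_m
    reward -= 0.01 * info.elapsed_s
    reward -= 0.1 * abs(info.speed_mps - target_speed_mps)
    return reward


clean = StepInfo(False, False, 0.2, 0.1, 5.0, 9.0)
crash = StepInfo(True, False, 0.2, 0.1, 5.0, 9.0)
print(compute_reward(clean), compute_reward(crash))
```

Making the collision penalties much larger than the tracking terms is a common design choice so that safety dominates comfort and progress during training.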
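A Critic of the kind described, with three fully connected layers and tanh activations, can be sketched in plain Python as follows. The hidden width, the weight initialization, concatenating state and action at the input, and leaving the final layer linear are assumptions made for illustration.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Create (weights, biases) for one fully connected layer."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer):
    """Apply one fully connected layer without activation."""
    weights, biases = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]


class Critic:
    """Q(s, a): three fully connected layers, tanh on the two hidden layers."""

    def __init__(self, state_dim, action_dim, hidden=32, seed=0):
        rng = random.Random(seed)
        self.layers = [
            init_layer(state_dim + action_dim, hidden, rng),
            init_layer(hidden, hidden, rng),
            init_layer(hidden, 1, rng),
        ]

    def q_value(self, state, action):
        h = list(state) + list(action)
        h = [math.tanh(z) for z in dense(h, self.layers[0])]
        h = [math.tanh(z) for z in dense(h, self.layers[1])]
        return dense(h, self.layers[2])[0]  # linear output (assumption)


# Action: 8 future path points with (x, y) each, flattened to 16 values.
critic = Critic(state_dim=4, action_dim=16)
print(critic.q_value([0.1, 0.2, 0.3, 0.4], [0.0] * 16))
```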
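The three ingredients that distinguish TD3 (twin critics with a clipped minimum, target policy smoothing, and delayed actor updates) can be sketched as below. The hyperparameter values and the callable interfaces of the target networks are assumptions for this sketch, and the gradient steps themselves are omitted.

```python
import random


def clip(value, low, high):
    return max(low, min(high, value))


def td3_target(reward, done, next_state, target_actor, target_critic_1,
               target_critic_2, gamma=0.99, noise_std=0.2, noise_clip=0.5,
               action_limit=1.0, rng=random):
    """Compute the TD3 learning target for one transition.

    target_actor(state) returns a list of action values; each target critic
    is called as critic(state, action) and returns a scalar Q value.
    """
    next_action = target_actor(next_state)
    # Target policy smoothing: add clipped Gaussian noise to the target action.
    smoothed = [
        clip(a + clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip),
             -action_limit, action_limit)
        for a in next_action
    ]
    # Clipped double Q-learning: take the smaller of the two target critics.
    q_min = min(target_critic_1(next_state, smoothed),
                target_critic_2(next_state, smoothed))
    return reward + gamma * (0.0 if done else 1.0) * q_min


def should_update_actor(step, policy_delay=2):
    """Delayed policy update: the actor is updated less often than the critics."""
    return step % policy_delay == 0


y = td3_target(
    reward=1.0, done=False, next_state=[0.0],
    target_actor=lambda s: [0.5],
    target_critic_1=lambda s, a: 3.0,
    target_critic_2=lambda s, a: 5.0,
    gamma=0.9, rng=random.Random(0),
)
print(y)
```

Both critics are regressed toward the same target, while the actor and the target networks are refreshed only on the steps where `should_update_actor` returns true.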

Referring to fig. 2, the determination of the target traffic participant related to the mission objective of the vehicle path planning in the traffic participant in the present invention is specifically accomplished by the following steps:

step 210: and respectively extracting the image information and the data information by using a convolutional neural network model and a multi-layer perceptron, wherein the convolutional neural network model outputs the extracted characteristic information as a first characteristic vector, and the multi-layer perceptron outputs the extracted characteristic information as a second characteristic vector. Wherein the first feature vector characterizes image information and the second feature vector characterizes data information acquired by the sensor.

The tensor of the CNN processing and outputting the image information is marked as c _i MLP processes data information to output tensor m containing each vehicle data information _i . I.e. CNN outputs a first eigenvector and MLP outputs a second eigenvector.

Step 220: aligning a tensor dimension of the first feature vector with a tensor dimension of the second feature vector; and regarding the first feature vector as an n+2th second feature vector, inputting the second feature vector into a multi-head attention mechanism model, splicing and aggregating the first feature vector and the second feature vector into a representation tensor of the target traffic participant and surrounding road conditions, and using the representation tensor for the subsequent input into the multi-head attention mechanism network.

Step 230: and predicting a path point coordinate sequence of the own vehicle in a future period of time by using the target traffic participant and the representation tensor of the road condition extracted by the feature extraction model as input and using an Actor part model based on a multi-head attention mechanism and an Actor-Critic algorithm framework in reinforcement learning, and planning a path for the vehicle based on the path point coordinate sequence.

After feature extraction by the CNN and the MLP, the tensor dimension of the first feature vector is aligned with the tensor dimension of the second feature vector, and the vectors are spliced and aggregated into a feature matrix, with the first feature vector treated as one more second feature vector. The feature matrix is input into the subsequent Actor-Critic composite network, which receives the spliced and aggregated feature matrix as input in order to identify the target traffic participants and plan the path of the own vehicle.
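As a non-limiting illustration, the alignment and splicing described above may be sketched as follows. The specification does not fix how the tensor dimensions are aligned; the linear projection used here, the vector sizes, and the names `project`, `build_feature_matrix` and `w_image` are assumptions made only for this sketch.

```python
# Illustrative sketch: the projection-based alignment, the sizes, and all
# names here are assumptions, not details fixed by the specification.

def project(vec, weights):
    """Apply a linear projection; `weights` has one row per output dimension."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def build_feature_matrix(second_vectors, first_vector, w_image):
    """Align the image feature with the data features and append it as a row.

    The image feature (first feature vector) is projected to the width shared
    by the data features (second feature vectors) and then treated as one
    more second feature vector.
    """
    width = len(second_vectors[0])
    if any(len(v) != width for v in second_vectors):
        raise ValueError("second feature vectors must share one width")
    aligned = project(first_vector, w_image)
    if len(aligned) != width:
        raise ValueError("projection must map the image feature to the shared width")
    return [list(v) for v in second_vectors] + [aligned]


# Hypothetical sizes: three data features of width 4, one image feature of width 6.
m = [[0.1, 0.2, 0.3, 0.4],
     [0.5, 0.6, 0.7, 0.8],
     [0.9, 1.0, 1.1, 1.2]]
c = [1.0, 0.0, 2.0, 0.0, 1.0, 0.0]

# Toy 4x6 projection that keeps the first four image channels.
w_image = [[1.0 if j == i else 0.0 for j in range(6)] for i in range(4)]

feature_matrix = build_feature_matrix(m, c, w_image)
print(len(feature_matrix), "rows of width", len(feature_matrix[0]))
```

In a trained model the projection weights would be learned rather than fixed; the toy matrix above only makes the shapes easy to follow.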

The multi-head attention mechanism network, combined with the Actor-Critic algorithm framework, allows the input tensor to vary in dimension and arrangement order, and maps the feature tensor into a path point coordinate sequence of the vehicle over a future period of time.
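The tolerance to a variable number and order of inputs can be illustrated with a small attention-pooling sketch. It is not the network of the invention: the use of an own-vehicle query vector, the random weights, the sizes, and every function name below are assumptions chosen to show the property, namely that a softmax-weighted sum over participant features does not depend on their order and accepts any number of them.

```python
import math
import random

# Illustrative sketch: weights are random and all names are hypothetical.

def matvec(w, x):
    """Multiply matrix `w` (list of rows) by vector `x`."""
    return [sum(a * b for a, b in zip(row, x)) for row in w]


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_head(query, tokens, wq, wk, wv):
    """Scaled dot-product attention of one query over a set of tokens."""
    if not tokens:
        raise ValueError("at least one token is required")
    q = matvec(wq, query)
    keys = [matvec(wk, t) for t in tokens]
    values = [matvec(wv, t) for t in tokens]
    scale = math.sqrt(len(q))
    weights = softmax([sum(a * b for a, b in zip(q, k)) / scale for k in keys])
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]


def multi_head_pool(query, tokens, heads):
    """Concatenate the outputs of several independent attention heads."""
    out = []
    for wq, wk, wv in heads:
        out.extend(attention_head(query, tokens, wq, wk, wv))
    return out


def random_matrix(rng, rows, cols):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
DIM, HEAD_DIM, NUM_HEADS = 4, 2, 2
heads = [tuple(random_matrix(rng, HEAD_DIM, DIM) for _ in range(3))
         for _ in range(NUM_HEADS)]

ego = [0.2, -0.1, 0.4, 0.0]                      # hypothetical own-vehicle feature
tokens = [[0.1, 0.2, 0.3, 0.4],
          [0.5, -0.6, 0.7, 0.8],
          [-0.9, 1.0, 0.1, 0.2]]                 # hypothetical participant features

pooled = multi_head_pool(ego, tokens, heads)
pooled_shuffled = multi_head_pool(ego, tokens[::-1], heads)   # different order
pooled_fewer = multi_head_pool(ego, tokens[:2], heads)        # fewer participants
print(len(pooled), len(pooled_fewer))
```

Reordering the participants leaves the pooled vector unchanged up to floating-point rounding, and removing a participant changes its values but not its length, so a fixed-size downstream layer can consume it.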

Referring to fig. 3, the present invention further provides a vehicle path planning apparatus, including:

The acquisition and identification module 31 is configured to acquire image information and related data information of traffic participants around the vehicle under test and of road conditions through a camera, a laser radar, and the like. Feature extraction and information aggregation are performed using a CNN (Convolutional Neural Network) and an MLP (Multilayer Perceptron) to obtain effective feature vectors, and the feature vectors are input into an Actor-Critic composite neural network based on a multi-head attention (Multi-Head Attention) mechanism. The composite neural network acquires image information and data information of traffic participants around the vehicle at the current moment and at a plurality of previous moments, and this image information and data information are used as input data of the feature extraction model. The multi-head attention model, combined with the Actor-Critic algorithm framework, allows the input tensor to vary in dimension and arrangement order, and maps the feature vectors into a path point coordinate sequence of the own vehicle over a future period of time.

A target determination module 32 for determining a target traffic participant associated with a vehicle path planning task using the image information and the data information;

For example, the future travel track of each traffic participant is predicted, and the traffic participants whose future travel tracks lie on the future target travel lane of the own vehicle, or on the travel lane in which the own vehicle is currently located, and whose tracks conflict with that of the own vehicle are taken as target traffic participants.

Specifically, information is extracted from the image information and the data information using a convolutional neural network model and a multi-layer perceptron, respectively; the convolutional neural network model outputs the extracted feature information as a first feature vector, and the multi-layer perceptron outputs the extracted feature information as a second feature vector;

the tensor dimension of the first feature vector is aligned with the tensor dimension of the second feature vector, wherein the first feature vector characterizes the image information and the second feature vector characterizes the data information acquired by the sensor; and the first feature vector is treated as the (n+2)-th second feature vector, and the first feature vector and the second feature vectors are spliced and aggregated into a representation tensor of the target traffic participants and the surrounding road conditions, the representation tensor being subsequently input into the multi-head attention mechanism network.

The path planning module 33 is configured to extract an effective feature vector of the target traffic participant by using a feature extraction model, map the feature vector to a path point coordinate sequence of the vehicle over a future period of time based on a multi-head attention mechanism and reinforcement learning, and plan a path for the vehicle based on the path point coordinate sequence. For example, the representation tensor of the target traffic participants and the road conditions extracted by the feature extraction model is used as input, a path point coordinate sequence of the own vehicle over a future period of time is predicted using the Actor part of a model based on the multi-head attention mechanism and the Actor-Critic algorithm framework in reinforcement learning, and a path is planned for the vehicle based on the path point coordinate sequence.
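By way of a hedged example, the division of labour between the Actor part and the Critic part can be sketched with two linear heads on a shared state vector. The specification does not disclose the layer structure; the linear heads, the horizon of three path points, the weights, and the names `actor_waypoints` and `critic_value` are all assumptions for illustration. In the general Actor-Critic framework the Critic's value estimate is used while training the Actor, and only the Actor is needed to produce the path point sequence.

```python
# Illustrative sketch: layer shapes, weights, and names are hypothetical.

def linear(weights, bias, x):
    """Affine map: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def actor_waypoints(state, weights, bias, horizon):
    """Map a state vector to `horizon` (x, y) path points."""
    flat = linear(weights, bias, state)
    if len(flat) != 2 * horizon:
        raise ValueError("actor head must output two coordinates per path point")
    return [(flat[2 * i], flat[2 * i + 1]) for i in range(horizon)]


def critic_value(state, weights, bias):
    """Map the same state vector to a scalar value estimate."""
    return linear([weights], [bias], state)[0]


state = [1.0, 2.0]            # stand-in for the attention-pooled representation
HORIZON = 3

actor_w = [[1.0, 0.0], [0.0, 1.0],
           [2.0, 0.0], [0.0, 2.0],
           [3.0, 0.0], [0.0, 3.0]]
actor_b = [0.0] * (2 * HORIZON)

waypoints = actor_waypoints(state, actor_w, actor_b, HORIZON)
value = critic_value(state, [0.5, 0.25], 0.0)
print(waypoints, value)
```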

As shown in fig. 4, the present invention further provides a vehicle control method, including:

Step 410: acquiring image information and related data information of traffic participants around a tested vehicle and of road conditions through cameras, laser radars and the like, and identifying each traffic participant according to the image information and the data information;

Step 420: determining a target traffic participant associated with a vehicle path planning task using the image information and the data information;

Step 430: extracting effective feature vectors of the target traffic participants by using a feature extraction model, mapping the feature vectors into a path point coordinate sequence of the vehicle over a future period of time based on a multi-head attention mechanism and reinforcement learning, and planning a path for the vehicle based on the path point coordinate sequence;

Step 440: controlling the vehicle to travel according to the vehicle planned path, for example, by using a PID controller or the like.
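Step 440 names a PID controller only as one example of a tracking controller. A minimal sketch of such a controller is given below; the class name `PID`, the gains, the time step, and the toy lateral-offset model (in which the command is integrated directly into the offset) are assumptions for illustration and do not represent real vehicle dynamics.

```python
# Illustrative sketch: gains, time step, and the toy plant are hypothetical.

class PID:
    """Textbook discrete PID controller."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self._integral = 0.0
        self._prev_error = None

    def step(self, error, dt):
        if dt <= 0:
            raise ValueError("dt must be positive")
        self._integral += error * dt
        # No derivative term on the first call, since there is no previous error.
        derivative = 0.0 if self._prev_error is None else (error - self._prev_error) / dt
        self._prev_error = error
        return self.kp * error + self.ki * self._integral + self.kd * derivative


# Toy loop: drive the lateral offset from the planned path towards zero.
controller = PID(kp=1.0, ki=0.0, kd=0.0)
offset = 1.0        # metres from the planned path (hypothetical)
DT = 0.1            # seconds
for _ in range(50):
    command = controller.step(0.0 - offset, DT)
    offset += command * DT      # toy plant: command acts as lateral velocity
print(round(offset, 4))
```

With a proportional gain only, the offset in this toy model shrinks by a constant factor each step; integral and derivative gains would normally be tuned against the actual vehicle response.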

For the specific content of this embodiment, reference may be made to the above description, which is not repeated here.

As shown in fig. 5, the present invention further provides an electronic device, including:

at least one processor; and at least one memory communicatively coupled to the processor, wherein: the memory stores program instructions executable by the processor, the processor invoking the program instructions to perform the vehicle path planning method described above.

The present invention also provides a computer readable storage medium having stored thereon a computer program which when executed by a processor implements the vehicle path planning method described above.

It is understood that the computer program comprises computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form, among others. The computer-readable storage medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a software distribution medium, and so forth.

In some embodiments of the present invention, the apparatus may include a controller and a processor, where the controller is a single chip that integrates a processor, a memory, a communication module, and the like. The processor may refer to the processor comprised by the controller. The processor may be a central processing unit (Central Processing Unit, CPU), but may also be another general purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.

Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and additional implementations are included within the scope of the preferred embodiment of the present invention in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present invention.

Those of ordinary skill in the art will appreciate that the elements and algorithm steps of the examples described in connection with the embodiments disclosed herein may be embodied in electronic hardware, in computer software, or in a combination of the two, and that the elements and steps of the examples have been generally described in terms of function in the foregoing description to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.

The above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. A vehicle path planning method, comprising:

2. The vehicle path planning method of claim 1, wherein the determining a target traffic participant associated with a vehicle path planning task using the image information and the data information comprises:

predicting the future running track of the traffic participant, and taking the traffic participant whose future running track is on a target running lane of the own vehicle or on a running lane of the own vehicle as a target traffic participant.

3. The vehicle path planning method of claim 2, wherein the extracting effective feature vectors of the target traffic participant using a feature extraction model comprises:

extracting effective information from the image information and the data information by using a convolutional neural network model and a multi-layer perceptron, wherein the convolutional neural network model outputs the extracted feature information as a first feature vector, and the multi-layer perceptron outputs the extracted feature information as a second feature vector;

4. The vehicle path planning method of claim 3, wherein the stitching the first feature vector with the second feature vector into the effective feature vector of the target traffic participant comprises:

aligning a tensor dimension of the first feature vector with a tensor dimension of the second feature vector; wherein the first feature vector characterizes valid image information extracted from the camera and the second feature vector characterizes valid data information extracted from the sensor;

and treating the first feature vector as the (n+2)-th second feature vector, and splicing and aggregating the first feature vector and the second feature vectors into a representation tensor of the target traffic participant and the surrounding road conditions, the representation tensor being subsequently input into the multi-head attention mechanism network.

5. The vehicle path planning method of claim 4, wherein the multi-head attention mechanism is combined with an Actor-Critic algorithm framework in reinforcement learning so that the input tensor is allowed to vary in dimension and arrangement order; the number of the target traffic participants around the tested vehicle is variable, and the arrangement order of the characterization vectors of the target traffic participants is variable; and the feature vector can be mapped to a sequence of path point coordinates of the vehicle over a future period of time.

6. The vehicle path planning method of claim 2, wherein the determining a target traffic participant associated with a vehicle path planning task using the image information and the data information comprises:

acquiring image information and data information of traffic participants around a tested vehicle at the current moment and a plurality of previous moments; and taking the image information and the data information of the traffic participants around the vehicle at the current moment and a plurality of previous moments as input data of the feature extraction model.

7. A vehicle path planning apparatus, characterized by comprising:

8. A vehicle control method, characterized by comprising: acquiring a vehicle planned path determined by the vehicle path planning method according to any one of claims 1 to 6, and controlling vehicle travel according to the vehicle planned path.

9. An electronic device, comprising:

at least one processor; and at least one memory communicatively coupled to the processor, wherein: the memory stores program instructions executable by the processor, and the processor invokes the program instructions to perform the method of any one of claims 1 to 6 or claim 8.

10. A computer-readable storage medium, on which a computer program is stored, which, when being run by a computer, performs the method of any one of claims 1 to 6 or claim 8.