CN115675518A

CN115675518A - Trajectory planning method and device and electronic equipment

Info

Publication number: CN115675518A
Application number: CN202211062058.XA
Authority: CN
Inventors: 赵昊玮; 柳长春
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2022-08-31
Filing date: 2022-08-31
Publication date: 2023-02-03

Abstract

The disclosure provides a trajectory planning method, a trajectory planning apparatus, and an electronic device, and relates to the technical field of data processing, in particular to the technical field of autonomous driving. The specific implementation is as follows: when interaction between a vehicle and a first obstacle is detected, acquiring a first state at a first time and M first behavior combinations within a first time period; determining, based on the first state and the M first behavior combinations, M second states at a second time at which the first time period ends; constructing a first game tree based on the first state and the M second states, wherein the first state is the state of a first node of the first game tree, each second state is the state of a second node of the first game tree, and the second node is a child node of the first node; and, when the interaction between the vehicle and the first obstacle is detected to have ended, planning the driving trajectory of the vehicle based on the first game tree.

Description

Trajectory planning method and device and electronic equipment

Technical Field

The present disclosure relates to the field of data processing technologies, and in particular, to a trajectory planning method and apparatus, and an electronic device.

Background

Vehicles, such as autonomous vehicles, need to plan safe and reasonable driving trajectories while interacting with surrounding traffic participants, which is very important in the field of autonomous driving.

At present, the trajectory planning method of an autonomous vehicle is generally to plan the traveling trajectory of the own vehicle by using the predicted trajectories of surrounding traffic participants.

Disclosure of Invention

The disclosure provides a trajectory planning method and apparatus, and an electronic device.

According to a first aspect of the present disclosure, there is provided a trajectory planning method, including:

when it is detected that a vehicle interacts with a first obstacle, acquiring a first state at a first time and M first behavior combinations within a first time period, wherein the first state includes the respective driving states of the vehicle and the first obstacle at the first time, each first behavior combination includes the respective driving behaviors of the vehicle and the first obstacle within the first time period, the first time period is a time period that starts at the first time and has a preset duration, and M is a positive integer;

determining, based on the first state and the M first behavior combinations, M second states at a second time at which the first time period ends, wherein each second state includes the simulated driving states of the vehicle and the first obstacle after each has driven according to its driving behavior in the corresponding first behavior combination;

constructing a first game tree based on the first state and the M second states, wherein the first state is a state of a first node of the first game tree, the second state is a state of a second node of the first game tree, and the second node is a child node of the first node;

and, when it is detected that the vehicle has finished interacting with the first obstacle, planning a driving trajectory of the vehicle based on the first game tree.

According to a second aspect of the present disclosure, there is provided a trajectory planning device comprising:

a first obtaining module, configured to, when it is detected that a vehicle interacts with a first obstacle, obtain a first state at a first time and M first behavior combinations within a first time period, wherein the first state includes the respective driving states of the vehicle and the first obstacle at the first time, each first behavior combination includes the respective driving behaviors of the vehicle and the first obstacle within the first time period, the first time period is a time period that starts at the first time and has a preset duration, and M is a positive integer;

a determining module, configured to determine, based on the first state and the M first behavior combinations, M second states at a second time at which the first time period ends, wherein each second state includes the simulated driving states of the vehicle and the first obstacle after each has driven according to its driving behavior in the corresponding first behavior combination;

a first building module, configured to build a first game tree based on the first state and the M second states, where the first state is a state of a first node of the first game tree, the second state is a state of a second node of the first game tree, and the second node is a child node of the first node;

and a trajectory planning module, configured to plan a driving trajectory of the vehicle based on the first game tree when it is detected that the vehicle has finished interacting with the first obstacle.

According to a third aspect of the present disclosure, there is provided an electronic device comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein,

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform any one of the methods of the first aspect.

According to a fourth aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform any one of the methods of the first aspect.

According to a fifth aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements any of the methods of the first aspect.

According to a sixth aspect of the present disclosure, there is provided an autonomous vehicle comprising the electronic device of the third aspect.

The technology of the present disclosure addresses the relatively low accuracy of trajectory planning for autonomous vehicles and improves that accuracy.

It should be understood that the statements in this section are not intended to identify key or critical features of the embodiments of the present disclosure, nor are they intended to limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.

Drawings

The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:

FIG. 1 is a schematic flow diagram of a trajectory planning method according to a first embodiment of the present disclosure;

FIG. 2 is a schematic diagram of a first game tree;

FIG. 3 is a schematic structural diagram of a trajectory planning device according to a second embodiment of the present disclosure;

FIG. 4 is a schematic block diagram of an example electronic device used to implement embodiments of the present disclosure.

Detailed Description

Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

First embodiment

As shown in fig. 1, the present disclosure provides a trajectory planning method, including the following steps:

step S101: when it is detected that a vehicle interacts with a first obstacle, acquiring a first state at a first time and M first behavior combinations within a first time period, wherein the first state includes the respective driving states of the vehicle and the first obstacle at the first time, each first behavior combination includes the respective driving behaviors of the vehicle and the first obstacle within the first time period, and the first time period is a time period that starts at the first time and has a preset duration;

step S102: determining, based on the first state and the M first behavior combinations, M second states at a second time at which the first time period ends, wherein each second state includes the simulated driving states of the vehicle and the first obstacle after each has driven according to its driving behavior in the corresponding first behavior combination;

step S103: constructing a first game tree based on the first state and the M second states, wherein the first state is a state of a first node of the first game tree, the second state is a state of a second node of the first game tree, and the second node is a child node of the first node;

step S104: when it is detected that the vehicle has finished interacting with the first obstacle, planning a driving trajectory of the vehicle based on the first game tree.

Wherein M is a positive integer.

In the embodiment, the trajectory planning method relates to the technical field of data processing, in particular to the technical field of automatic driving, and can be widely applied to automatic driving scenes. The trajectory planning method of the embodiment of the present disclosure may be executed by the trajectory planning apparatus of the embodiment of the present disclosure. The trajectory planning device of the embodiment of the present disclosure may be configured in any electronic device to execute the trajectory planning method of the embodiment of the present disclosure. The electronic device may be deployed in an autonomous vehicle to perform trajectory planning for the autonomous vehicle.

In step S101, the vehicle may be an autonomous vehicle, referred to below as the host vehicle (also called the master vehicle), and the first obstacle may be an obstacle in the vicinity of the host vehicle, referred to below as the slave vehicle.

The trajectory planning apparatus may detect whether a nearby slave vehicle is interacting with the host vehicle and, if so, plan the driving trajectory of the host vehicle so that it can travel safely. Specifically, the apparatus may determine whether the host vehicle interacts with a nearby slave vehicle from information such as the positions and driving states of the two vehicles. For example, it may determine that an interaction exists when the two vehicles are relatively close to each other or when the acceleration of the slave vehicle is relatively large.

When it is detected that the vehicle interacts with the first obstacle, a first state at a first time and M first behavior combinations within a first time period may be obtained, where the first time may be the time t0 at which the vehicle starts interacting with the first obstacle, or a later time during the interaction.

In an alternative embodiment, the first time T1 may be t0 + L × T, where T is the step length and L is a positive integer. When the first time is t0, the first state may be referred to as the initial state. It includes the respective driving states of the host vehicle and the first obstacle at time t0 and may further include their respective positions at time t0. A driving state may include speed, acceleration, angle, angular velocity, and the like.

When interaction between the host vehicle and the first obstacle is detected, a first game tree representing that interaction may be constructed, with the initial state taken as the node state of the root node. The number of layers of the first game tree is determined by the duration of the interaction between the host vehicle and the first obstacle: the longer the interaction, the more layers. The number of nodes in each layer is determined by M, the number of first behavior combinations formed from the driving behavior of the host vehicle and the driving behavior of the first obstacle at the first time.

Different first times correspond to different layers of the first game tree. For example, when the first time is t0, it corresponds to the root node, whose node state is the initial state of the host vehicle and the first obstacle at time t0. When the first time is (t0 + T), it corresponds to a first-layer node, whose node state is the state of the host vehicle and the first obstacle at time (t0 + T), and so on.

The trajectory planning apparatus may obtain the first state of the host vehicle and the first obstacle at the first time by any existing or new detection means.

The first time period starts at the first time and has a preset duration, which may be the step length T. Different first times give different first time periods. For example, if the first time is t0, the first time period runs from time t0 to time (t0 + T).

Each of the M first behavior combinations is a combination of the driving behaviors of the host vehicle and of the first obstacle within the first time period. At least one driving behavior that the host vehicle may decide on in the driving scene corresponding to the first time can be obtained, together with at least one driving behavior that the first obstacle may decide on. Combining these yields the M first behavior combinations, each of which contains one candidate driving behavior of the host vehicle and one candidate driving behavior of the first obstacle. A driving behavior may include, among other things, the longitudinal acceleration of the vehicle and the lateral angular velocity of the vehicle's heading.

For example, if the candidate driving behaviors of the host vehicle are A1 and A2 and the candidate driving behaviors of the first obstacle are B1 and B2, the M first behavior combinations are (A1, B1), (A1, B2), (A2, B1), and (A2, B2), so M = 4.
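The pairing of candidate behaviors can be sketched as a Cartesian product. This is a minimal illustration only: the `Behavior` type, its fields, and the numeric values are hypothetical and are not specified by the disclosure.

```python
from itertools import product
from typing import List, NamedTuple, Sequence, Tuple


class Behavior(NamedTuple):
    """Hypothetical driving behavior (field names are assumptions)."""
    accel: float     # longitudinal acceleration, m/s^2
    yaw_rate: float  # angular velocity of the heading, rad/s


def behavior_combinations(
    host_options: Sequence[Behavior],
    obstacle_options: Sequence[Behavior],
) -> List[Tuple[Behavior, Behavior]]:
    """Pair every candidate host behavior with every candidate obstacle behavior."""
    return list(product(host_options, obstacle_options))


# Illustrative values only.
A1, A2 = Behavior(1.0, 0.0), Behavior(-1.0, 0.0)
B1, B2 = Behavior(0.5, 0.0), Behavior(-2.0, 0.0)

combos = behavior_combinations([A1, A2], [B1, B2])
M = len(combos)  # two host options times two obstacle options
```

With two candidates on each side the product has four entries, in the same order as the (A1, B1) ... (A2, B2) listing.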

Different first times may correspond to different driving scenes. For example, the scene at time t0 may be a straight-driving scene, while the scene at time (t0 + T) may be an intersection scene. Accordingly, the first behavior combinations, and their number M, may differ from one first time to another.

In step S102, the second time T2 is the time at which the first time period ends, that is, (T1 + T). For each first behavior combination, the travel of the host vehicle under its decided driving behavior may be simulated from the driving state of the host vehicle at time T1 to estimate its driving state at the second time, one step length T later. Likewise, the travel of the first obstacle under its decided driving behavior may be simulated from the driving state of the first obstacle at time T1 to estimate its driving state at the second time. This yields the second state corresponding to that first behavior combination, and repeating it for all combinations yields M second states.
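The disclosure does not specify the motion model used for this simulation. The following sketch assumes a simple kinematic model in which acceleration and heading angular velocity are held constant over one step; `AgentState`, `propagate`, and `second_state` are hypothetical names introduced for illustration.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class AgentState:
    x: float        # position, m
    y: float        # position, m
    heading: float  # rad
    speed: float    # m/s


def propagate(state: AgentState, accel: float, yaw_rate: float,
              step_t: float) -> AgentState:
    """Advance one agent by one step (assumed model, not from the source)."""
    new_speed = max(0.0, state.speed + accel * step_t)  # no reversing
    mean_speed = 0.5 * (state.speed + new_speed)
    return AgentState(
        x=state.x + mean_speed * math.cos(state.heading) * step_t,
        y=state.y + mean_speed * math.sin(state.heading) * step_t,
        heading=state.heading + yaw_rate * step_t,
        speed=new_speed,
    )


def second_state(first_state: Tuple[AgentState, AgentState],
                 combination: Tuple[Tuple[float, float], Tuple[float, float]],
                 step_t: float) -> Tuple[AgentState, AgentState]:
    """Joint state of host and obstacle after one behavior combination."""
    host, obstacle = first_state
    host_behavior, obstacle_behavior = combination
    return (propagate(host, *host_behavior, step_t),
            propagate(obstacle, *obstacle_behavior, step_t))


# Illustrative values: each behavior is (accel, yaw_rate).
first = (AgentState(0.0, 0.0, 0.0, 10.0), AgentState(30.0, 0.0, 0.0, 8.0))
combos = [((2.0, 0.0), (0.0, 0.0)), ((-2.0, 0.0), (0.0, 0.0))]
second_states = [second_state(first, c, step_t=1.0) for c in combos]
```

Each of the M combinations produces exactly one joint second state, so `second_states` has the same length as `combos`.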

In step S103, the first state may be used as a node state of a first node of the first game tree, and the second state may be used as a node state of a second node of the first game tree, where the second node may be a child node of the first node, that is, the first node is a parent node of the second node, and the number of child nodes of the first node may be M.

It should be noted that steps S101, S102, and S103 construct one group of parent-child nodes in the first game tree. All parent-child nodes in the first game tree may be constructed in this manner until the end of the interaction between the host vehicle and the first obstacle is detected, which yields the first game tree for the interaction between the host vehicle and the first obstacle.
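The layer-by-layer construction can be sketched as follows. The `Node` class, the callbacks `combinations_for`, `simulate`, and `interaction_ended`, and the toy speed-only state are all assumptions made for illustration. In particular, the disclosure detects the end of the interaction from vehicle information, whereas this sketch stands in a simple layer-count predicate.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional, Sequence, Tuple


@dataclass(eq=False)
class Node:
    state: Any                                   # joint state of host and obstacle
    layer: int                                   # node time is t0 + layer * T
    parent: Optional["Node"] = None
    behaviors: Optional[Tuple[Any, Any]] = None  # combination on the edge from parent
    children: List["Node"] = field(default_factory=list)


def expand(node: Node, combinations: Sequence[Tuple[Any, Any]],
           simulate: Callable[[Any, Tuple[Any, Any]], Any]) -> List[Node]:
    """Steps S101 to S103 for one parent: one child per behavior combination."""
    for combo in combinations:
        child = Node(simulate(node.state, combo), node.layer + 1, node, combo)
        node.children.append(child)
    return node.children


def build_game_tree(initial_state, combinations_for, simulate, interaction_ended):
    """Repeat the expansion layer by layer until the interaction ends."""
    root = Node(initial_state, layer=0)
    leaves = [root]
    while not interaction_ended(leaves[0].layer):
        next_layer = [child for node in leaves
                      for child in expand(node, combinations_for(node), simulate)]
        if not next_layer:  # no candidate behaviors left to simulate
            break
        leaves = next_layer
    return root, leaves


# Toy example: state is (host speed, obstacle speed), behaviors are accelerations.
STEP_T = 0.5
root, leaves = build_game_tree(
    initial_state=(10.0, 8.0),
    combinations_for=lambda node: [(1.0, 0.0), (-1.0, 0.0)],
    simulate=lambda s, c: (s[0] + c[0] * STEP_T, s[1] + c[1] * STEP_T),
    interaction_ended=lambda layer: layer >= 2,
)
```

With M = 2 combinations per node and two layers below the root, the tree ends with four leaf nodes.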

FIG. 2 is a schematic structural diagram of a first game tree. As shown in FIG. 2, the root node and the first-layer nodes of the first game tree form a group of parent-child nodes. The node state of the root node may be denoted S₀, and the node state of each first-layer node may be denoted by a symbol indexed by j, where j is the number of the child node. For each first-layer node, the second-layer nodes connected to it by edges likewise form a group of parent-child nodes with it, and their node states may be denoted in the same way. As the interaction between the host vehicle and the first obstacle proceeds, this construction of parent-child nodes is repeated until the interaction ends.

In step S104, the trajectory planning apparatus may determine whether the interaction between the host vehicle and the first obstacle has ended from information such as the positions and driving states of the host vehicle and the slave vehicle. When the interaction has ended, it plans the driving trajectory of the host vehicle based on the constructed first game tree.

Because the first game tree records the driving behaviors and driving states of the host vehicle and the first obstacle throughout the simulated interaction, planning the trajectory of the host vehicle based on the first game tree takes the interaction between the host vehicle and the obstacle fully into account. A safe and reasonable driving trajectory can therefore be planned for the host vehicle, which improves the accuracy of trajectory planning for the autonomous vehicle.

Optionally, the step S104 specifically includes:

determining a target leaf node from the first game tree when it is detected that the interaction between the vehicle and the first obstacle has ended;

determining a first node set, wherein the first node set includes all nodes on the path from the root node to the target leaf node in the first game tree;

and determining a target driving trajectory of the vehicle based on the driving states of the vehicle at the times corresponding to the nodes in the first node set and the driving behaviors of the vehicle along the path corresponding to the first node set.

In this embodiment, when it is detected that the interaction between the host vehicle and the first obstacle has ended, the target leaf node may be determined from the first game tree based only on the node states of the leaf nodes in the first game tree. A leaf node may be a node in the last layer of the first game tree. The target leaf node may also be determined from the first game tree based on at least one of the driving states, positions, driving behaviors, and the like of the vehicle and the first obstacle along a first path, which may be a path from the root node to a leaf node.

All leaf results simulated in the first game tree may be scored. The scoring criteria may take into account interaction safety, interaction comfort (the ride feel during the interaction), traffic rules, and the like. A leaf node with a higher score value indicates that, at the end of or during the interaction, the host vehicle and the first obstacle are safer, the interaction is more comfortable, and both comply with traffic rules. A leaf node with a lower score value indicates poorer safety, poorer comfort, and a possible violation of traffic rules by the two.

The leaf node with the highest score value may be determined as the target leaf node, or one leaf node may be selected as the target leaf node from candidate leaf nodes with the top score values, which is not specifically limited herein.

Once the target leaf node is determined, the tree can be traced back from the target leaf node to the root node to obtain every node on the path from the root node to the target leaf node in the first game tree, which gives the first node set.

The driving behaviors of the host vehicle along the path from the root node to the target leaf node in the first game tree, and the driving states of the vehicle at the times corresponding to the nodes in the first node set, are then recorded in path order to obtain the target driving trajectory of the vehicle. The target driving trajectory may include information such as angle, angular velocity, position, acceleration, and speed.

In this embodiment, when it is detected that the interaction between the vehicle and the first obstacle has ended, a target leaf node is determined from the first game tree; a first node set is determined, which includes all nodes on the path from the root node to the target leaf node in the first game tree; and a target driving trajectory of the vehicle is determined based on the driving states of the vehicle at the times corresponding to the nodes in the first node set and the driving behaviors of the vehicle along the path corresponding to the first node set. In this way, the target driving trajectory of the host vehicle can be planned based on the first game tree.
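Tracing back from the target leaf node to the root node can be sketched with parent pointers. The `Node` fields and the string placeholders for states and behaviors are hypothetical. A real node would carry the numeric driving state of the host vehicle.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Tuple


@dataclass(eq=False)
class Node:
    host_state: Any                 # driving state of the host vehicle at this node
    parent: Optional["Node"] = None
    host_behavior: Any = None       # host behavior on the edge from the parent


def first_node_set(target_leaf: Node) -> List[Node]:
    """All nodes on the path from the root node to the target leaf node."""
    path = []
    node: Optional[Node] = target_leaf
    while node is not None:
        path.append(node)
        node = node.parent
    path.reverse()  # root first, leaf last
    return path


def target_trajectory(target_leaf: Node) -> List[Tuple[Any, Any]]:
    """Host states and the behaviors leading to them, in path order."""
    return [(n.host_state, n.host_behavior) for n in first_node_set(target_leaf)]


# Placeholder values for illustration.
root = Node("S0")
mid = Node("S1", parent=root, host_behavior="A1")
leaf = Node("S2", parent=mid, host_behavior="A2")
trajectory = target_trajectory(leaf)
```

The root node has no incoming edge, so its behavior entry is `None`.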

Optionally, the determining a target leaf node from the first game tree includes:

scoring each leaf node in the first game tree against a preset standard based on target information to obtain a score value of the leaf node, wherein the preset standard includes at least one of interaction safety, interaction comfort, and traffic rules;

selecting the leaf node with the highest score value from the first game tree as the target leaf node;

wherein the target information comprises at least one of:

driving behavior of the vehicle and the first obstacle under a first path, the first path being a path from the root node to the leaf node;

a running state of the vehicle and the first obstacle under the first path;

a position of the vehicle and the first obstacle under the first path.

In this embodiment, the first path is a path from the root node to a leaf node and may include nodes and edges. Each leaf node in the first game tree may be scored based on information about the host vehicle and the first obstacle during the interaction along the first path, taking into account criteria such as interaction safety, interaction comfort, and traffic rules. This information may include the driving behaviors of the host vehicle on the edges of the first path and the driving states and positions of the host vehicle at the nodes of the first path.

After each leaf node has been scored, the leaf node with the highest score value in the first game tree may be determined as the target leaf node. Because each leaf node is scored against the preset standard using information about the host vehicle and the first obstacle during the interaction along the first path, the scoring of leaf nodes is more accurate and reasonable, which further improves the accuracy of vehicle trajectory planning.
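The disclosure names the scoring criteria but gives no formula. The sketch below is one possible weighted score, with every threshold, weight, and field name an assumption: safety is taken from the smallest gap along the path, comfort from the largest acceleration magnitude, and traffic rules from a speed-limit check.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class LeafSummary:
    """Hypothetical per-leaf summary of the path from the root node."""
    name: str
    min_gap: float        # smallest host-obstacle distance on the path, m
    max_abs_accel: float  # largest |host acceleration| on the path, m/s^2
    max_speed: float      # largest host speed on the path, m/s


def score_leaf(leaf: LeafSummary, safe_gap: float = 5.0, accel_ref: float = 3.0,
               speed_limit: float = 16.7, w_safety: float = 1.0,
               w_comfort: float = 0.5, w_rules: float = 1.0) -> float:
    """Higher is better. All thresholds and weights are illustrative."""
    safety = min(leaf.min_gap / safe_gap, 1.0)             # 1.0 once the gap is safe
    discomfort = min(leaf.max_abs_accel / accel_ref, 1.0)  # 1.0 at harsh acceleration
    violation = 1.0 if leaf.max_speed > speed_limit else 0.0
    return w_safety * safety - w_comfort * discomfort - w_rules * violation


def select_target_leaf(leaves: Sequence[LeafSummary]) -> LeafSummary:
    """The leaf node with the highest score value."""
    return max(leaves, key=score_leaf)


leaves = [
    LeafSummary("yield", min_gap=10.0, max_abs_accel=1.0, max_speed=12.0),
    LeafSummary("rush", min_gap=2.0, max_abs_accel=3.0, max_speed=20.0),
]
target = select_target_leaf(leaves)
```

Here the "rush" leaf is penalized on all three criteria, so the "yield" leaf is selected.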

Optionally, before the detecting that the vehicle has finished interacting with the first obstacle, the method further includes:

when it is detected that the vehicle interacts with a second obstacle, constructing a second game tree based on a third state and N fourth states, wherein the third state is the state of a third node of the second game tree, each fourth state is the state of a fourth node of the second game tree, and the fourth node is a child node of the third node; the third state includes the respective driving states of the vehicle and the second obstacle at the start of their interaction; each fourth state includes the simulated driving states of the vehicle and the second obstacle after each has driven, within a second time period, according to its driving behavior in one of N second behavior combinations; each second behavior combination includes the respective driving behaviors of the vehicle and the second obstacle within the second time period; the second time period runs from the start of the interaction between the vehicle and the second obstacle to a third time; the third time matches a time corresponding to a node state in the first game tree; and N is a positive integer;

obtaining a second node set in the second game tree, the second node set including the nodes in the second game tree whose node-state times match the time corresponding to the node states of the leaf nodes in the first game tree;

the determining a target leaf node from the first game tree includes:

selecting K candidate leaf nodes from the first game tree, the K candidate leaf nodes being the leaf nodes whose score values rank in the top K, where K is a positive integer;

determining the target leaf node from the K candidate leaf nodes based on the second set of nodes.

In this embodiment, while the interaction between the host vehicle and the first obstacle is being simulated, it may also be detected whether a second obstacle interacting with the host vehicle exists nearby. If interaction between the host vehicle and a second obstacle is detected, a second game tree for that interaction may be constructed.

The third state may be the respective driving states of the host vehicle and the second obstacle at the start of their interaction. The fourth state may be the node state of a child node of the root node in the second game tree, namely the driving states obtained by simulating the host vehicle and the second obstacle each driving, within a second time period, according to its driving behavior in one of the N second behavior combinations. The second time period may run from the start of the interaction between the host vehicle and the second obstacle to a third time.

The third time matches a time corresponding to a node state in the first game tree. For example, the times corresponding to node states in the first game tree are t0 + L × T, and the start of the interaction between the host vehicle and the second obstacle may be detected between two adjacent such times. To align the times of the node states in the second game tree with those in the first game tree, and thereby simulate more accurately the effect of the same host-vehicle driving behavior on the first obstacle and on the second obstacle, the driving states of the host vehicle and the second obstacle may be simulated only over the second time period, according to the driving behaviors in each of the N second behavior combinations, and the child nodes of the root node constructed from those states. The second time period is shorter than the step length T, so the time of the node states of the root node's child nodes in the second game tree aligns with a time of a node state in the first game tree. That is, the third time may be one of the times t0 + L × T, for example (t0 + 3 × T).
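The alignment can be sketched as rounding the start of the second interaction up to the next time on the grid t0 + L × T. The function name and return values are hypothetical, and the handling of a start that falls exactly on a grid time is an assumption, since the disclosure only discusses a start between two adjacent grid times.

```python
import math
from typing import Tuple


def aligned_third_time(t0: float, step_t: float,
                       interaction_start: float) -> Tuple[int, float, float]:
    """Return (L, third time, second-period duration) for the second game tree.

    The third time is the first grid time t0 + L * step_t strictly after
    interaction_start. Assumption: a start exactly on a grid time simply
    uses a full step to the next grid time.
    """
    if step_t <= 0:
        raise ValueError("step_t must be positive")
    if interaction_start < t0:
        raise ValueError("second interaction cannot start before t0")
    layer = math.floor((interaction_start - t0) / step_t) + 1
    third_time = t0 + layer * step_t
    return layer, third_time, third_time - interaction_start


# Illustrative values: first tree starts at 0 s with T = 0.5 s,
# second interaction is detected at 1.2 s.
layer, third_time, second_period = aligned_third_time(0.0, 0.5, 1.2)
```

In this example the start falls between 1.0 s and 1.5 s, so the third time is t0 + 3 × T and the second time period is shorter than T.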

A second behavior combination may include the respective driving behaviors of the host vehicle and the second obstacle within the second time period. At least one driving behavior that the host vehicle may decide on in the second time period may be combined with at least one driving behavior that the second obstacle may decide on in the second time period to obtain the N second behavior combinations.

Then, starting from the third time and the fourth states corresponding to it, the driving states of the host vehicle and the second obstacle at each subsequent time can be estimated with the step length T, and the nodes of the other layers of the second game tree constructed from those states, until the interaction between the host vehicle and the first obstacle ends. These layers are constructed in the same way as the first game tree, which is not repeated here.

Leaf nodes in the first game tree can be matched with nodes in the second game tree by the times corresponding to their node states, which yields the second node set: the nodes in the second game tree at the time corresponding to the node states of the leaf nodes in the first game tree. For example, if the node states of the leaf nodes in the first game tree correspond to time (t0 + 10 × T), the nodes in the second game tree constructed from the states at time (t0 + 10 × T) are obtained as the second node set.

Thereafter, in a case where it is detected that the interaction between the host vehicle and the first obstacle ends, K candidate leaf nodes with top scoring values may be selected from the leaf nodes of the first game tree, and a final target leaf node may be selected from the K candidate leaf nodes based on the second node set.

In an alternative embodiment, for each candidate leaf node it may be determined whether a first set, consisting of the driving behaviors of the host vehicle along the path corresponding to each node in the second node set, intersects a second set, consisting of the driving behaviors of the host vehicle along the path from the root node to the candidate leaf node in the first game tree. If the sets intersect, the host vehicle has the same driving behaviors along both paths, and those behaviors affect its interaction with the first obstacle and with the second obstacle at the same time.

From the second node set, the node whose path in the second game tree shares driving behaviors with the path of the candidate leaf node in the first game tree may be selected, and its score value weighted together with the score value of the candidate leaf node to obtain a target score value. Nodes in the second game tree may be scored in a manner similar to leaf nodes in the first game tree, which is not repeated here.

Accordingly, the candidate leaf node with the highest target score value may be selected from the K candidate leaf nodes as the target leaf node. In this way, the travel path of the host vehicle can be planned by considering the interactive influence of the same driving behavior on the first obstacle and the second obstacle, respectively, so that the travel path of the host vehicle can be accurately planned in the case where the host vehicle interacts with a plurality of obstacles at the same time.
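The selection of the target leaf node described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the `PathNode` record, the equal weights, the choice of the best-scoring matching node from the second node set, and the fallback used when no driving behavior is shared are all assumptions introduced here.

```python
from typing import FrozenSet, List, NamedTuple, Optional, Tuple


class PathNode(NamedTuple):
    """Hypothetical record: a node plus the host-vehicle behaviors on its root path."""
    name: str
    score: float
    host_behaviors: FrozenSet[str]


def select_target_leaf(
    leaves: List[PathNode],
    second_node_set: List[PathNode],
    k: int = 2,
    w_first: float = 0.5,    # assumed weight for the first-tree score
    w_second: float = 0.5,   # assumed weight for the second-tree score
) -> Tuple[Optional[PathNode], float]:
    # Step 1: keep the K leaf nodes with the highest score values.
    candidates = sorted(leaves, key=lambda n: n.score, reverse=True)[:k]
    best, best_score = None, float("-inf")
    for cand in candidates:
        # Step 2: second-tree nodes whose path shares a host behavior with the candidate.
        shared = [n for n in second_node_set if n.host_behaviors & cand.host_behaviors]
        if shared:
            partner = max(shared, key=lambda n: n.score)  # assumption: take the best match
            target = w_first * cand.score + w_second * partner.score
        else:
            target = w_first * cand.score  # assumption: no shared behavior, no second term
        # Step 3: keep the candidate with the highest target score value.
        if target > best_score:
            best, best_score = cand, target
    return best, best_score


leaves = [
    PathNode("A", 0.9, frozenset({"accelerate"})),
    PathNode("B", 0.8, frozenset({"yield"})),
    PathNode("C", 0.3, frozenset({"accelerate"})),
]
second_nodes = [
    PathNode("X", 0.2, frozenset({"accelerate"})),
    PathNode("Y", 0.9, frozenset({"yield"})),
]
target_leaf, target_score = select_target_leaf(leaves, second_nodes)
```

In this example leaf A has the highest score in the first game tree, but leaf B is selected because yielding also scores well in the interaction with the second obstacle.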

Optionally, the determining a target driving track of the vehicle based on the driving state of the vehicle corresponding to the first node set at each time and the driving behavior of the vehicle in the path corresponding to the first node set includes:

determining an initial driving track of the vehicle based on the driving state of the vehicle corresponding to the first node set at each moment and the driving behavior of the vehicle under the path corresponding to the first node set;

and smoothing the initial running track to obtain a target running track of the vehicle.

In this embodiment, the driving behaviors of the host vehicle along the path from the root node to the target leaf node in the first game tree and the driving states of the vehicle corresponding to the first node set at each time may be recorded in the order of the path, so as to obtain an initial driving trajectory of the vehicle, where the initial driving trajectory may include information such as an angle, an angular velocity, a position, an acceleration, and a speed.
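Recording the states and behaviors in path order can be sketched by following parent links from the target leaf node back to the root node. The `TreeNode` fields below (a time stamp, a position and speed pair, and the behavior on the incoming edge) are hypothetical and far simpler than the state information listed above.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class TreeNode:
    """Hypothetical node of the first game tree, reduced to host-vehicle data."""
    time: float
    host_state: Tuple[float, float]      # (position in m, speed in m/s), assumed layout
    host_behavior: Optional[str]         # behavior on the edge from the parent; None at root
    parent: Optional["TreeNode"] = None


def path_to_root(leaf: TreeNode) -> List[TreeNode]:
    """Return the first node set: nodes from the root node down to the given leaf."""
    path = []
    node: Optional[TreeNode] = leaf
    while node is not None:
        path.append(node)
        node = node.parent
    path.reverse()
    return path


def initial_trajectory(leaf: TreeNode):
    """Record (time, state, behavior) for each node in path order."""
    return [(n.time, n.host_state, n.host_behavior) for n in path_to_root(leaf)]


root = TreeNode(0.0, (0.0, 5.0), None)
mid = TreeNode(1.0, (5.5, 6.0), "accelerate", parent=root)
leaf = TreeNode(2.0, (11.5, 6.0), "keep_speed", parent=mid)
trajectory = initial_trajectory(leaf)
```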

Then, taking the initial driving trajectory as a constraint, a smooth planned trajectory is output by using a quadratic programming method to obtain the target driving trajectory. Correspondingly, the target driving trajectory can be output to the control module for execution, so that automatic driving is realized and the riding experience of a user is improved.
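The disclosure does not specify the quadratic programming formulation. As one possible illustration only, the sketch below smooths a one-dimensional sequence (for example, positions along the path) by minimizing the sum of squared deviations from the initial trajectory plus a weighted sum of squared second differences. This treats the initial trajectory as a soft reference rather than a hard constraint, which is an assumption made here; the function name `smooth_qp` and the `weight` parameter are likewise hypothetical. Because this problem is an unconstrained quadratic program, its minimizer is found by solving a linear system.

```python
from typing import List


def smooth_qp(reference: List[float], weight: float = 10.0) -> List[float]:
    """Minimize sum (x_i - r_i)^2 + weight * sum (x_{i-1} - 2 x_i + x_{i+1})^2."""
    n = len(reference)
    # Normal equations: (I + weight * D^T D) x = r, with D the second-difference operator.
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(1, n - 1):
        idx, coef = (k - 1, k, k + 1), (1.0, -2.0, 1.0)
        for p in range(3):
            for q in range(3):
                a[idx[p]][idx[q]] += weight * coef[p] * coef[q]
    b = list(reference)
    # Gaussian elimination with partial pivoting (forward pass).
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for j in range(col, n):
                a[row][j] -= factor * a[col][j]
            b[row] -= factor * b[col]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / a[i][i]
    return x


def roughness(values: List[float]) -> float:
    """Sum of squared second differences, used here as a smoothness measure."""
    return sum((values[i - 1] - 2 * values[i] + values[i + 1]) ** 2
               for i in range(1, len(values) - 1))


initial = [0.0, 5.5, 9.0, 16.5, 20.0, 27.5]   # jagged positions from the tree path
smoothed = smooth_qp(initial, weight=10.0)
```

A practical planner would typically add inequality constraints (speed, acceleration and collision bounds) and use a dedicated QP solver; the closed-form solve above only covers the unconstrained case.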

Second embodiment

As shown in fig. 3, the present disclosure provides a trajectory planning apparatus 300, comprising:

a first obtaining module 301, configured to, when it is detected that the vehicle interacts with the first obstacle, obtain a first state at a first time and M first behavior combinations in a first time period, where the first state includes: the driving states of the vehicle and the first obstacle at the first time, respectively, the first behavior combination including: driving behaviors of the vehicle and the first obstacle in the first time period respectively, wherein the first time period is a time period starting from the first moment and having a preset time length, and M is a positive integer;

a determining module 302, configured to determine, based on the first state and the M first behavior combinations, M second states at a second time at the end of the first time period, where each second state includes: the simulated driving states of the vehicle and the first obstacle after driving according to the driving behaviors under the corresponding first behavior combination;

a first building module 303, configured to build a first game tree based on the first state and the M second states, where the first state is a state of a first node of the first game tree, the second state is a state of a second node of the first game tree, and the second node is a child node of the first node;

and a trajectory planning module 304, configured to plan a driving trajectory of the vehicle based on the first game tree in a case that it is detected that the interaction between the vehicle and the first obstacle is ended.
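The cooperation of the obtaining, determining and building modules can be sketched as a single expansion step of the first game tree. The sketch below is illustrative only: the behavior set, the mapping of behaviors to constant accelerations, the one-dimensional kinematics and all names (`BEHAVIORS`, `GameNode`, `advance`, `expand`) are assumptions introduced here. With three behaviors per participant, M equals 9.

```python
from dataclasses import dataclass, field
from itertools import product
from typing import Dict, List, Tuple

# Hypothetical behavior set mapped to longitudinal accelerations in m/s^2.
BEHAVIORS: Dict[str, float] = {"accelerate": 1.0, "keep": 0.0, "decelerate": -1.0}

AgentState = Tuple[float, float]  # (position in m, speed in m/s), assumed layout


@dataclass
class GameNode:
    time: float
    vehicle: AgentState
    obstacle: AgentState
    behaviors: Tuple[str, str] = ("", "")  # (vehicle, obstacle) behaviors leading here
    children: List["GameNode"] = field(default_factory=list)


def advance(state: AgentState, accel: float, dt: float) -> AgentState:
    """Simulate constant-acceleration motion over one time period."""
    pos, speed = state
    return (pos + speed * dt + 0.5 * accel * dt * dt, speed + accel * dt)


def expand(node: GameNode, dt: float) -> List[GameNode]:
    """Create one child (second state) per behavior combination of vehicle and obstacle."""
    for veh_b, obs_b in product(BEHAVIORS, repeat=2):
        node.children.append(GameNode(
            time=node.time + dt,
            vehicle=advance(node.vehicle, BEHAVIORS[veh_b], dt),
            obstacle=advance(node.obstacle, BEHAVIORS[obs_b], dt),
            behaviors=(veh_b, obs_b),
        ))
    return node.children


# First state: vehicle at 0 m with 5 m/s, first obstacle at 20 m with 4 m/s.
root = GameNode(time=0.0, vehicle=(0.0, 5.0), obstacle=(20.0, 4.0))
second_nodes = expand(root, dt=1.0)
```

Calling `expand` again on each child would add the next layer, which is how the tree can grow while the interaction continues.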

Optionally, the trajectory planning module 304 includes:

a first determining unit, configured to determine a target leaf node from the first game tree in a case where it is detected that the interaction between the vehicle and the first obstacle has ended;

a second determining unit, configured to determine a first node set, where the first node set includes nodes arranged under a path from a root node to a target leaf node in the first game tree;

and the third determining unit is used for determining a target running track of the vehicle based on the running state of the vehicle corresponding to the first node set at each moment and the driving behavior of the vehicle under the path corresponding to the first node set.

Optionally, the first determining unit is specifically configured to:

wherein the target information comprises at least one of:

a running state of the vehicle and the first obstacle under the first path;

a position of the vehicle and the first obstacle under the first path.

Optionally, the apparatus further includes:

a second building module, configured to build a second game tree based on a third state and N fourth states in a case where it is detected that the vehicle interacts with a second obstacle, wherein the third state is the state of a third node of the second game tree, the fourth state is the state of a fourth node of the second game tree, and the fourth node is a child node of the third node; the third state includes the driving states of the vehicle and the second obstacle at the start of the interaction, respectively, and the fourth state includes: the simulated driving states of the vehicle and the second obstacle after driving according to the driving behaviors under a second behavior combination among N second behavior combinations in a second time period, wherein the second behavior combination comprises: driving behaviors of the vehicle and the second obstacle in the second time period, respectively, wherein the second time period is a time period from the beginning of the interaction of the vehicle and the second obstacle to a third time, the third time is matched with a time corresponding to a node state in the first game tree, and N is a positive integer;

a second obtaining module, configured to obtain a second node set in the second game tree, where the second node set includes: nodes in the second game tree whose node-state time matches the time corresponding to the leaf node state in the first game tree;

the first determining unit is specifically configured to:

select K candidate leaf nodes from the first game tree, where the K candidate leaf nodes are the K leaf nodes with the highest score values, and K is a positive integer;

Optionally, the third determining unit is specifically configured to:

The trajectory planning apparatus 300 provided by the present disclosure can implement each process implemented by the trajectory planning method embodiment, and can achieve the same beneficial effects, and for avoiding repetition, the details are not repeated here.

In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, disclosure and other handling of the personal information of the users involved all comply with the relevant laws and regulations and do not violate public order and good customs.

The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.

FIG. 4 shows a schematic block diagram of an example electronic device that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.

As shown in fig. 4, the device 400 comprises a computing unit 401, which may perform various suitable actions and processes according to a computer program stored in a Read Only Memory (ROM) 402 or a computer program loaded from a storage unit 408 into a Random Access Memory (RAM) 403. In the RAM 403, various programs and data required for the operation of the device 400 can also be stored. The computing unit 401, ROM 402, and RAM 403 are connected to each other via a bus 404. An input/output (I/O) interface 405 is also connected to bus 404.

A number of components in device 400 are connected to I/O interface 405, including: an input unit 406 such as a keyboard, a mouse, or the like; an output unit 407 such as various types of displays, speakers, and the like; a storage unit 408 such as a magnetic disk, optical disk, or the like; and a communication unit 409 such as a network card, modem, wireless communication transceiver, etc. The communication unit 409 allows the device 400 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.

The computing unit 401 may be any of a variety of general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 401 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 401 executes the respective methods and processes described above, such as the trajectory planning method. For example, in some embodiments, the trajectory planning method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 408. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 400 via the ROM 402 and/or the communication unit 409. When the computer program is loaded into the RAM 403 and executed by the computing unit 401, one or more steps of the trajectory planning method described above may be performed. Alternatively, in other embodiments, the computing unit 401 may be configured to perform the trajectory planning method by any other suitable means (e.g., by means of firmware).

Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on a Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.

Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.

In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.

The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.

The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server combined with a blockchain.

It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially or in different orders, and are not limited herein as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved.

The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims

1. A trajectory planning method, comprising:

under the condition that the vehicle is detected to interact with a first obstacle, acquiring a first state at a first moment and M first behavior combinations in a first time period, wherein the first state comprises: the driving states of the vehicle and the first obstacle at the first moment, respectively, and the first behavior combination comprises: driving behaviors of the vehicle and the first obstacle in the first time period, respectively, wherein the first time period is a time period starting from the first moment and having a preset time length, and M is a positive integer;

under the condition that the vehicle is detected to finish interacting with the first obstacle, planning a running track of the vehicle on the basis of the first game tree.

2. The method of claim 1, wherein the planning of the driving trajectory of the vehicle based on the first game tree in case of detecting the end of the vehicle interaction with the first obstacle comprises:

3. The method of claim 2, wherein said determining a target leaf node from said first game tree comprises:

selecting a leaf node with the highest score value from the first game tree to obtain the target leaf node;

wherein the target information comprises at least one of:

a running state of the vehicle and the first obstacle under the first path;

a position of the vehicle and the first obstacle under the first path.

4. The method of claim 2, wherein, before it is detected that the vehicle has ended interacting with the first obstacle, the method further comprises:

under the condition that the vehicle is detected to interact with a second obstacle, constructing a second game tree based on a third state and N fourth states, wherein the third state is the state of a third node of the second game tree, the fourth state is the state of a fourth node of the second game tree, and the fourth node is a child node of the third node; the third state includes the driving states of the vehicle and the second obstacle at the start of the interaction, respectively, and the fourth state includes: the simulated driving states of the vehicle and the second obstacle after driving according to the driving behaviors under a second behavior combination among N second behavior combinations in a second time period, wherein the second behavior combination comprises: driving behaviors of the vehicle and the second obstacle in the second time period, respectively, wherein the second time period is a time period from the beginning of the interaction of the vehicle and the second obstacle to a third time, the third time is matched with a time corresponding to a node state in the first game tree, and N is a positive integer;

the determining a target leaf node from the first game tree comprises:

5. The method of claim 2, wherein the determining a target driving trajectory of the vehicle based on the driving state of the vehicle corresponding to the first node set at each time and the driving behavior of the vehicle on the path corresponding to the first node set comprises:

6. A trajectory planning apparatus comprising:

a first obtaining module, configured to, when it is detected that the vehicle interacts with a first obstacle, obtain a first state at a first time and M first behavior combinations in a first time period, where the first state includes: the driving states of the vehicle and the first obstacle at the first time, respectively, and the first behavior combination includes: driving behaviors of the vehicle and the first obstacle in the first time period, respectively, wherein the first time period is a time period starting from the first time and having a preset time length, and M is a positive integer;

and the trajectory planning module is used for planning the running trajectory of the vehicle on the basis of the first game tree under the condition that the interaction of the vehicle and the first obstacle is detected to be finished.

7. The apparatus of claim 6, wherein the trajectory planning module comprises:

a second determining unit, configured to determine a first node set, where the first node set includes nodes arranged under a path from a root node to the target leaf node in the first game tree;

8. The apparatus according to claim 7, wherein the first determining unit is specifically configured to:

wherein the target information comprises at least one of:

a running state of the vehicle and the first obstacle under the first path;

a position of the vehicle and the first obstacle under the first path.

9. The apparatus of claim 7, further comprising:

the second building module is used for building a second game tree based on a third state and N fourth states under the condition that the vehicle is detected to interact with a second obstacle, wherein the third state is the state of a third node of the second game tree, the fourth state is the state of a fourth node of the second game tree, and the fourth node is a child node of the third node; the third state includes a driving state of the vehicle and the second obstacle at the start of interaction, respectively, and the fourth state includes: simulating a driving state of the vehicle and the second obstacle after driving according to driving behaviors under a second behavior combination in N second behavior combinations in a second time period, wherein the second behavior combination comprises: driving behaviors of the vehicle and the second obstacle in the second time period respectively, wherein the second time period is a time period from the beginning of interaction of the vehicle and the second obstacle to a third time, the third time is matched with a time corresponding to a node state in the first game tree, and N is a positive integer;

the first determining unit is specifically configured to:

10. The apparatus according to claim 7, wherein the third determining unit is specifically configured to:

11. An electronic device, comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein,

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-5.

12. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-5.

13. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-5.

14. An autonomous vehicle comprising the electronic device of claim 11.