CN109857118B - Method, device, equipment and storage medium for planning driving strategy of unmanned vehicle - Google Patents

Method, device, equipment and storage medium for planning driving strategy of unmanned vehicle

Info

Publication number
CN109857118B
Authority
CN
China
Prior art keywords
neural network
network model
planning
data
driver
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910186289.3A
Other languages
Chinese (zh)
Other versions
CN109857118A (en)
Inventor
夏中谱
潘屹峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201910186289.3A
Publication of CN109857118A
Application granted
Publication of CN109857118B
Active legal status: Current
Anticipated expiration legal status

Landscapes

  • Control Of Position, Course, Altitude, Or Attitude Of Moving Bodies (AREA)

Abstract

The invention provides a method and a device for planning a driving strategy for an unmanned vehicle. The method for planning the driving strategy may comprise the following steps: according to collected planning trajectory data B and road scene data A, constructing the functional relation B_i = f_i(A_i, W) between the planning trajectory data B, the road scene data A and a neural network model W, and fitting to obtain the neural network model W, wherein i = 1, 2, 3, …, n; splicing the driver behavior trajectories recorded while the driver is driving to generate driver behavior trajectory data P; and inputting the planning trajectory data B as a negative sample and the driver behavior trajectory data P as a positive sample into the neural network model W, comparing and learning the outputs of the neural network model W in the case of the negative sample and the positive sample, and correcting the neural network model W to obtain a planned driving strategy G = f(A, W'), where W' is the corrected neural network model.

Description

Method, device, equipment and storage medium for planning driving strategy of unmanned vehicle
Technical Field
The invention relates to the field of motor vehicle driving, and in particular to a method and a device for planning a driving strategy for an unmanned vehicle, as well as a corresponding computer device and computer storage medium.
Background
In the prior art, when an unmanned vehicle system performs a road test, some uncontrollable events inevitably occur (for example, defects in the computer program itself, failure to brake when braking is needed, or failure to turn when turning is needed), which force a manual takeover: the vehicle must switch from the automatic driving mode to the manual driving mode. Once the conditions for automatic driving are met again, the vehicle switches back from the manual driving mode to the automatic driving mode, and this alternation repeats. At present, after a manual takeover, the takeover situation is mainly analyzed and handled by developers, for example by modifying rule parameters, in order to improve and optimize the driving strategy.
Because a manual takeover is needed whenever these uncontrollable events appear, the riding experience, comfort and safety of the unmanned vehicle are all affected.
Disclosure of Invention
How to improve and optimize the prior-art driving strategy so that manual takeover is no longer required when an uncontrollable event occurs while the unmanned vehicle is driving is a problem that urgently needs to be solved.
The present invention is directed to solving, at least to some extent, one of the technical problems in the related art.
According to a first aspect of the invention, there is provided a method for planning a driving strategy for an unmanned vehicle, which may comprise:
according to collected planning trajectory data B and road scene data A, constructing the functional relation B_i = f_i(A_i, W) between the planning trajectory data B, the road scene data A and a neural network model W, and fitting to obtain the neural network model W, wherein i = 1, 2, 3, …, n;
splicing the driver behavior trajectories recorded while the driver is driving to generate driver behavior trajectory data P;
inputting the planning trajectory data B as a negative sample and the driver behavior trajectory data P as a positive sample into the neural network model W, comparing and learning the outputs of the neural network model W in the case of the negative sample and the positive sample, and correcting the neural network model W to obtain a planned driving strategy G = f(A, W'), where W' is the corrected neural network model.
In one embodiment of the present invention, the planned trajectory data B as a negative sample may include:
planning trajectory data B obtained under the condition of traversing all planning trajectories.
In another embodiment of the present invention, splicing the driver behavior trajectories recorded while the driver is driving to generate the driver behavior trajectory data P may include:
splicing the driver behavior trajectories over a plurality of time periods to generate the driver behavior trajectory data P.
In still another embodiment of the present invention, inputting the planning trajectory data B as a negative sample and the driver behavior trajectory data P as a positive sample into the neural network model W, and comparing and learning the outputs of the neural network model W in the case of the negative sample and the positive sample, may include:
inputting the planning trajectory data B as a negative sample and the driver behavior trajectory data P as a positive sample into the neural network model W at the same time, and comparing and learning the outputs of the neural network model W in the case of the negative sample and the positive sample within the same time period.
According to a second aspect of the present invention, there is provided an apparatus for planning a driving strategy for an unmanned vehicle, which may include:
a constructing and fitting unit for constructing, according to the collected planning trajectory data B and road scene data A, the functional relation B_i = f_i(A_i, W) between the planning trajectory data B, the road scene data A and a neural network model W, and fitting to obtain the neural network model W, wherein i = 1, 2, 3, …, n;
a generating unit for splicing the driver behavior trajectories recorded while the driver is driving to generate driver behavior trajectory data P;
and a comparison and correction unit which inputs the planning trajectory data B as a negative sample and the driver behavior trajectory data P as a positive sample into the neural network model W, compares and learns the outputs of the neural network model W in the case of the negative sample and the positive sample, and corrects the neural network model W to obtain a planned driving strategy G = f(A, W'), where W' is the corrected neural network model.
In an embodiment of the present invention, the planned trajectory data B as a negative sample may include:
planning trajectory data B obtained under the condition of traversing all planning trajectories.
In another embodiment of the present invention, splicing the driver behavior trajectories recorded while the driver is driving to generate the driver behavior trajectory data P may include:
splicing the driver behavior trajectories over a plurality of time periods to generate the driver behavior trajectory data P.
In a further embodiment of the present invention, inputting the planning trajectory data B as a negative sample and the driver behavior trajectory data P as a positive sample into the neural network model W, and comparing and learning the outputs of the neural network model W in the case of the negative sample and the positive sample, comprises:
inputting the planning trajectory data B as a negative sample and the driver behavior trajectory data P as a positive sample into the neural network model W at the same time, and comparing and learning the outputs of the neural network model W in the case of the negative sample and the positive sample within the same time period.
According to a third aspect of the present invention, there is provided a computer device, which may include:
one or more processors;
storage means for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the methods described above.
According to a fourth aspect of the invention, there is provided a computer readable storage medium storing a computer program which, when executed by a processor, implements the method described above.
The method and device for planning a driving strategy for an unmanned vehicle described above solve the prior-art problem that the unmanned vehicle must be taken over manually when an uncontrollable event occurs during driving. That is to say, with the help of the obtained corrected neural network model W', the unmanned vehicle can remain in the automatic driving state when an uncontrollable event occurs, and the driving strategy it adopts is the same as the driving strategy that would be adopted after a manual takeover, so that the riding experience, comfort and safety of the unmanned vehicle are further improved.
The foregoing summary is provided for the purpose of illustration only and is not intended to be in any way limiting. In addition to the illustrative aspects, embodiments, and features described above, further aspects, embodiments, and features of the present invention will be readily apparent by reference to the drawings and following detailed description.
Drawings
In the drawings, like reference numerals refer to the same or similar parts or elements throughout the several views unless otherwise specified. The figures are not necessarily to scale. It is appreciated that these drawings depict only some embodiments in accordance with the disclosure and are therefore not to be considered limiting of its scope.
FIG. 1 schematically shows a flow chart of a method for planning a driving strategy according to an embodiment of a first aspect of the present invention;
FIG. 2 schematically illustrates planned trajectory data B as a negative sample, according to one embodiment of the invention;
FIG. 3 schematically shows the splicing of driver behavior trajectories recorded while the driver is driving to generate driver behavior trajectory data P, according to an embodiment of the invention;
FIG. 4 schematically shows how planning trajectory data B as a negative sample and driver behavior trajectory data P as a positive sample are input into the neural network model W, and how the outputs of the neural network model W in the case of the negative sample and the positive sample are compared and learned, according to an embodiment of the present invention;
FIG. 5 schematically illustrates the planning of a driving strategy according to an embodiment of the invention;
FIG. 6 schematically illustrates an apparatus for planning a driving strategy for an unmanned vehicle according to an embodiment of the second aspect of the present invention;
fig. 7 schematically shows an embodiment of a computer device according to a third aspect of the invention.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein, but rather are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the disclosure are for illustration purposes only and are not intended to limit the scope of the disclosure.
In describing various embodiments of the present disclosure, the term "include" and its derivatives should be interpreted as inclusive, i.e., "including but not limited to". The term "based on" should be understood as "based at least in part on". The term "one embodiment" or "the embodiment" should be understood as "at least one embodiment". Other explicit and implicit definitions may also be given below.
The present invention is described in detail below in conjunction with FIGS. 1-7.
Fig. 1 schematically shows a flow chart of a method 100 for planning a driving strategy for an unmanned vehicle according to an embodiment of the first aspect of the present invention, which may comprise the following steps:
step 102, according to the collected planning trajectory data B and road scene data A, constructing the functional relation B_i = f_i(A_i, W) between the planning trajectory data B, the road scene data A and the neural network model W, and fitting to obtain the neural network model W, wherein i = 1, 2, 3, …, n;
step 104, splicing the driver behavior trajectories recorded while the driver is driving to generate driver behavior trajectory data P;
and step 106, inputting the planning trajectory data B as a negative sample and the driver behavior trajectory data P as a positive sample into the neural network model W, comparing and learning the outputs of the neural network model W in the case of the negative sample and the positive sample, and correcting the neural network model W to obtain the planned driving strategy G = f(A, W'), where W' is the corrected neural network model.
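Steps 102-106 can be read as one shared parameter set W being applied to the road-scene data of every segment, so that all of the per-segment relations B_i = f_i(A_i, W) reuse the same model. The patent does not name a framework, network architecture or feature encoding; the following sketch is a non-authoritative illustration of that reading only, using PyTorch, fixed-length feature vectors, and invented names such as SceneSegment and TrajectoryPlanner.

```python
# Hedged sketch: framework, feature encoding and all names below are
# assumptions for illustration; they are not specified by the patent.
from dataclasses import dataclass
import torch
import torch.nn as nn

SCENE_DIM = 16   # assumed size of an encoded road-scene vector A_i
TRAJ_DIM = 8     # assumed size of an encoded planned-trajectory vector B_i

@dataclass
class SceneSegment:
    """One segment of a path, e.g. A->C1 or C1->C3 in FIG. 5."""
    scene: torch.Tensor    # A_i, shape (SCENE_DIM,)
    planned: torch.Tensor  # B_i, shape (TRAJ_DIM,)
    taken_over: bool       # True if a manual takeover occurred in this segment

class TrajectoryPlanner(nn.Module):
    """Shared model W: every relation B_i = f_i(A_i, W) reuses these parameters."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(SCENE_DIM, 64), nn.ReLU(),
            nn.Linear(64, TRAJ_DIM),
        )

    def forward(self, scene: torch.Tensor) -> torch.Tensor:
        return self.net(scene)

if __name__ == "__main__":
    # The same W is evaluated segment by segment, mirroring
    # B_1 = f_1(A_1, W), B_2 = f_2(A_2, W), ... in step 102.
    W = TrajectoryPlanner()
    segments = [SceneSegment(torch.randn(SCENE_DIM), torch.randn(TRAJ_DIM), i % 2 == 1)
                for i in range(4)]
    planned = [W(seg.scene) for seg in segments]  # one planned trajectory per segment
    print([tuple(p.shape) for p in planned])
```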
A detailed explanation of the method 100 for planning a driving strategy for an unmanned vehicle is given below in conjunction with FIG. 5. There are several paths between the departure point A and the destination point D shown in FIG. 5, such as path 1, path 2 and path 3, which are drawn in simple schematic form. There are many nodes on each path, represented by "·" (hereinafter, for convenience, a "node" is simply referred to as a "point"), such as point A, point C0, point C1, point C2, point C3, point C4, point C5, point C6, point C7 and point D on path 1. These nodes are all assumed switching points: for example, the vehicle is in the automatic driving mode, or unmanned mode, within the distance from point A to point C0; within the distance from point C0 to point C1, the vehicle needs to switch to the manual driving mode because an uncontrollable event is triggered; within the distance from point C1 to point C2, the vehicle is again in the automatic driving mode because the conditions for automatic driving are met; and so on.
It can therefore be assumed that the vehicle is in the automatic driving mode within the distance from point A to point C0, within the distance from point C1 to point C2, within the distance from point C3 to point C4, within the distance from point C5 to point C6, and within the distance from point C7 to point D; and that the vehicle is in the manual driving mode within the distance from point C0 to point C1, within the distance from point C2 to point C3, within the distance from point C4 to point C5, and within the distance from point C6 to point C7.
In step 102, constructing the functional relation B_i = f_i(A_i, W) between the planning trajectory data B, the road scene data A and the neural network model W according to the collected planning trajectory data B and road scene data A may specifically include, for example, the following:
for the point A to the point C1, planning track data B is constructed 1 Road scene data A 1 Functional relation B between neural network models W 1 =f 1 (A 1 W) in which B 1 Represents the planned trajectory from point A to point C1, A 1 Representing a road scene from point a to point C1, W represents a neural network model, and as shown in fig. 5, it can be considered that a motor vehicle, for example, an unmanned vehicle, is in an autonomous driving state, or in an unmanned state, in a range from point a to point C1. But is triggered by an uncontrollable event in planning trajectory data B 1 Road scene data A 1 Functional relationship B between neural network models W 1 =f 1 (A 1 W) the planned distance from point a to point C1, it is concluded that the vehicle is manually taken over from point C0 to point C1 (or during the time period) and is in manual driving mode due to the occurrence of an uncontrollable event at point C0. But planning trajectory data B 1 Road scene data A 1 Functional relation B between neural network models W 1 =f 1 (A 1 W) have actually planned the autodrive trajectory data for the vehicle within a distance (or time period) from C0 to C1. Subsequent needThe automatic driving track data initially planned in the automatic driving state within the distance from the point C0 to the point C1 (or within the time period) and the distance from the point C0 to the point C1 (the automatic driving track data is actually planned track data B) 1 A portion corresponding to the data within the distance from the point C0 to the point C1) is input to the neural network model W in order to compare the output results of the neural network model W in the above two cases, which will be described later.
Similarly, planned trajectory data B is constructed for points C1 to C3 2 Road scene data A 2 Functional relation B between neural network models W 2 =f 2 (A 2 W) in which B 2 Represents the planned trajectory from point C1 to point C3, A 2 Representing a road scene from point C1 to point C3, W represents a neural network model, and as shown in fig. 5, it can be considered that a motor vehicle, for example, an unmanned vehicle, is in an autonomous driving state, or in an unmanned driving state, in a range from point C1 to point C3. But is triggered by an uncontrollable event in planning trajectory data B 2 Road scene data A 2 Functional relation B between neural network models W 2 =f 2 (A 2 W) is within the planned distance from C2 point to C3 point, and it is concluded that the vehicle is manually taken over within the distance from C2 point to C3 point (or within the time period) due to the occurrence of an uncontrollable event at C2 point, and is in a manual driving mode. But planning trajectory data B 2 Road scene data A 2 Functional relation B between neural network models W 2 =f 2 (A 2 W) have actually planned the autodrive trajectory data for the vehicle within a distance (or time period) from C2 to C3. It is subsequently necessary to take over the driver behavior trajectory after the manual takeover of the motor vehicle from the point C2 to the point C3 (or during the time period) from the point C2 to the point C3 (which is actually the planned trajectory data B) and to automatically drive the vehicle in the initially planned automatic driving trajectory data in the automatic driving state 2 Corresponding to data within the distance from point C2 to point C3) is input into the neural network model W in order to compare the nervesThe network model W outputs results in both cases described above, which will be described later.
Similarly, for point C3 to point C5, planned trajectory data B is constructed 3 Road scene data A 3 Functional relation B between neural network models W 3 =f 3 (A 3 W) in which B 3 Represents the planned trajectory from point C3 to point C5, A 3 Representing a road scene from a point C3 to a point C5, W represents a neural network model, and as shown in fig. 5, it can be considered that a motor vehicle, for example, an unmanned vehicle, is in an autonomous driving state, or in an unmanned driving state, in a range from a point C3 to a point C5. But is triggered by an uncontrollable event in planning trajectory data B 3 Road scene data A 3 Functional relation B between neural network models W 3 =f 3 (A 3 W) is within the planned distance from C4 point to C5 point, and it is concluded that the vehicle is manually taken over within the distance from C4 point to C5 point (or within the time period) due to the occurrence of an uncontrollable event at C4 point, and is in a manual driving mode. But planning trajectory data B 3 Road scene data A 3 Functional relationship B between neural network models W 3 =f 3 (A 3 W) have actually planned the autodrive trajectory data for the vehicle within a distance (or time period) from C4 to C5. It is subsequently necessary to take over the driver behavior trajectory after the manual takeover of the motor vehicle from the point C4 to the point C5 (or during the time period) from the point C4 to the point C5 (which is actually the planned trajectory data B) and to automatically drive the vehicle in the initially planned automatic driving trajectory data in the automatic driving state 3 A portion corresponding to the data within the distance from the point C4 to the point C5) is input to the neural network model W in order to compare the output results of the neural network model W in the above two cases, which will be described later.
Similarly, planned trajectory data B is constructed for points C5 to C7 4 Road scene data A 4 Functional relation B between neural network models W 4 =f 4 (A 4 W) in which B 4 Representing points C5 to C7Planning a trajectory, A 4 Representing a road scene from a point C5 to a point C7, W represents a neural network model, and as shown in fig. 5, it can be considered that a motor vehicle, for example, an unmanned vehicle, is in an autonomous driving state, or in an unmanned driving state, in a range from a point C5 to a point C7. But is triggered by an uncontrollable event in planning trajectory data B 4 Road scene data A 4 Functional relation B between neural network models W 4 =f 4 (A 4 W) is within the planned distance from C6 point to C7 point, and it is concluded that the vehicle is manually taken over within the distance from C6 point to C7 point (or within the time period) due to the occurrence of an uncontrollable event at C6 point, and is in a manual driving mode. But planning trajectory data B 4 Road scene data A 4 Functional relation B between neural network models W 4 =f 4 (A 4 W) have actually planned the autodrive trajectory data for the vehicle within a distance (or time period) from C6 to C7. It is subsequently necessary to take over the driver behavior trajectory after the manual takeover of the motor vehicle from the point C6 to the point C7 (or during the time period) from the point C6 to the point C7 (which is actually the planned trajectory data B) and to automatically drive the vehicle in the initially planned automatic driving trajectory data in the automatic driving state 4 A portion corresponding to the data within the distance from the point C6 to the point C7) is input to the neural network model W in order to compare the output results of the neural network model W in the above two cases, which will be described later.
Similarly, for point C7 to point D, planned trajectory data B is constructed 5 Road scene data A 5 Functional relation B between neural network models W 5 =f 5 (A 5 W) in which B 5 Represents the planned trajectory from point C7 to point D, A 5 Representing a road scene from point C7 to point D, W represents a neural network model, and as shown in fig. 5, it can be considered that a motor vehicle, for example, an unmanned vehicle, is in a range from point C7 to point D, the motor vehicle is in an autonomous driving state, or in an unmanned state, assuming that no uncontrollable event occurs within a distance from point C7 to point D.
Of course, the distance between the departure point A and the destination D on path 1 can be further subdivided into more nodes; the number of nodes shown in FIG. 5 is only illustrative, and the nodes are not limited to those shown on path 1, path 2 and path 3.
As shown above, a plurality of planned trajectories on path 1 are obtained: B_1 = f_1(A_1, W), B_2 = f_2(A_2, W), B_3 = f_3(A_3, W), …, B_i = f_i(A_i, W).
In one embodiment of the present invention, the planning trajectory data B used as the negative sample includes planning trajectory data B obtained by traversing all planning trajectories. For example, it includes the plurality of planned trajectories obtained when path 2 is selected: B_1' = f_1'(A_1', W), B_2' = f_2'(A_2', W), B_3' = f_3'(A_3', W), …, B_i' = f_i'(A_i', W). It also includes the plurality of planned trajectories obtained when path 3 is selected: B_1'' = f_1''(A_1'', W), B_2'' = f_2''(A_2'', W), B_3'' = f_3''(A_3'', W), …, B_i'' = f_i''(A_i'', W). It further includes the plurality of planned trajectories obtained when path n (not shown in FIG. 5) is selected: B_1^(n) = f_1^(n)(A_1^(n), W), B_2^(n) = f_2^(n)(A_2^(n), W), B_3^(n) = f_3^(n)(A_3^(n), W), …, B_i^(n) = f_i^(n)(A_i^(n), W).
In an embodiment of the present invention, alternatively, the plurality of planned trajectories obtained on path 1, B_1 = f_1(A_1, W), B_2 = f_2(A_2, W), B_3 = f_3(A_3, W), …, B_i = f_i(A_i, W); the plurality of planned trajectories obtained on path 2, B_1' = f_1'(A_1', W), B_2' = f_2'(A_2', W), B_3' = f_3'(A_3', W), …, B_i' = f_i'(A_i', W); the plurality of planned trajectories obtained on path 3, B_1'' = f_1''(A_1'', W), B_2'' = f_2''(A_2'', W), B_3'' = f_3''(A_3'', W), …, B_i'' = f_i''(A_i'', W); and the plurality of planned trajectories obtained on path n, B_1^(n) = f_1^(n)(A_1^(n), W), B_2^(n) = f_2^(n)(A_2^(n), W), B_3^(n) = f_3^(n)(A_3^(n), W), …, B_i^(n) = f_i^(n)(A_i^(n), W), may all be fitted together to obtain the neural network model W, where i = 1, 2, 3, …, n. Adopting this fitting method is equivalent to taking all factors into account, such as the alternative paths/planned trajectories and road scenes between the departure point A and the destination D, so that the fitted neural network model W is more comprehensive and inclusive. Alternatively, the neural network model W may be fitted for a specific path, for example path 1.
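The joint fitting described above can be pictured as pooling every collected (A_i, B_i) pair from all candidate paths and regressing the planned trajectories onto the road scenes with a single model. The sketch below is a minimal, non-authoritative illustration of this step; PyTorch, the mean-squared-error loss, the Adam optimizer and the name fit_planner are assumptions, since the patent does not state how the fitting is carried out.

```python
# Illustrative fitting of the shared model W over all traversed paths
# (assumed MSE regression; the patent does not name a loss or optimizer).
import torch
import torch.nn as nn

def fit_planner(pairs, epochs=200, lr=1e-3):
    """pairs: list of (A_i, B_i) tensors collected over paths 1, 2, 3, ..., n."""
    scene_dim = pairs[0][0].numel()
    traj_dim = pairs[0][1].numel()
    W = nn.Sequential(nn.Linear(scene_dim, 64), nn.ReLU(), nn.Linear(64, traj_dim))
    A = torch.stack([a for a, _ in pairs])
    B = torch.stack([b for _, b in pairs])
    opt = torch.optim.Adam(W.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(W(A), B)  # make f(A_i, W) reproduce the recorded B_i
        loss.backward()
        opt.step()
    return W

# Pairs from path 1, path 2, path 3, ... are pooled so that the fitted W
# reflects every alternative path between departure point A and destination D.
pairs = [(torch.randn(16), torch.randn(8)) for _ in range(32)]
W = fit_planner(pairs)
```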
In step 104 of the present invention, it is mentioned that the driver behavior trajectories recorded while the driver is driving are spliced to generate the driver behavior trajectory data P.
After the above-described uncontrollable event occurs, the driver takes over and generates a behavior trajectory of the driver.
For example, in the case of selecting path 1, the driver behavior trajectory X_1 is obtained within the distance from point C0 to point C1, and the corresponding scene data is A_11; the driver behavior trajectory X_2 is obtained within the distance from point C2 to point C3, and the corresponding scene data is A_22; the driver behavior trajectory X_3 is obtained within the distance from point C4 to point C5, and the corresponding scene data is A_33; and the driver behavior trajectory X_4 is obtained within the distance from point C6 to point C7, and the corresponding scene data is A_44. Here the nodes A, C0, C1, …, C7, D on path 1 shown in FIG. 5 are only schematic; path 1 may actually be subdivided further to obtain the corresponding driver behavior trajectories X_n.
In one embodiment, after the driver behavior trajectories X_1, X_2, X_3, …, X_n are obtained, a stitch fitting is performed to obtain the driver behavior trajectory between the departure point A and the destination D when path 1 is selected (i.e., the route from departure point A to destination D): P = f(A_11, X_1) + f(A_22, X_2) + f(A_33, X_3) + … + f(A_nn, X_n).
As mentioned above, splicing the driver behavior trajectories recorded while the driver is driving to generate the driver behavior trajectory data P, as shown in FIG. 3, includes splicing the driver behavior trajectories over a plurality of time periods to generate the driver behavior trajectory data P. For example, as mentioned above, the driver behavior trajectory X_1 in the time period corresponding to point C0 to point C1, the driver behavior trajectory X_2 in the time period corresponding to point C2 to point C3, the driver behavior trajectory X_3 in the time period corresponding to point C4 to point C5, and the driver behavior trajectory X_4 in the time period corresponding to point C6 to point C7 are spliced together.
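The splice P = f(A_11, X_1) + f(A_22, X_2) + … + f(A_nn, X_n) amounts to ordering the takeover segments by their time periods and concatenating the driver behavior recorded in each of them. A minimal sketch of this bookkeeping follows; TakeoverSegment, stitch_driver_trajectory and the point-list representation of each X_k are hypothetical choices, not details taken from the patent.

```python
# Hypothetical stitching of takeover segments X_1..X_n into driver behavior
# trajectory data P, ordered by the time period of each manual takeover.
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]  # assumed (x, y) trajectory sample

@dataclass
class TakeoverSegment:
    start_time: float        # beginning of the manual-driving period (e.g. at C0)
    scene: List[float]       # scene data A_kk for this period
    behavior: List[Point]    # driver behavior trajectory X_k

def stitch_driver_trajectory(segments: List[TakeoverSegment]) -> List[Point]:
    """Concatenate the driver behavior trajectories in time order to form P."""
    ordered = sorted(segments, key=lambda s: s.start_time)
    P: List[Point] = []
    for seg in ordered:
        P.extend(seg.behavior)  # splice X_1, X_2, ..., X_n end to end
    return P

# e.g. takeovers on path 1 within C0-C1 and C2-C3
segs = [
    TakeoverSegment(10.0, [0.1], [(0.0, 0.0), (1.0, 0.2)]),
    TakeoverSegment(30.0, [0.3], [(5.0, 1.0), (6.0, 1.1)]),
]
P = stitch_driver_trajectory(segs)
```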
Step 106 mentions inputting the planning trajectory data B as a negative sample and the driver behavior trajectory data P as a positive sample into the neural network model W, comparing and learning the outputs of the neural network model W in the case of the negative sample and the positive sample, and correcting the neural network model W to obtain the planned driving strategy G = f(A, W'), where W' is the corrected neural network model.
Alternatively, the planned trajectory obtained in the automatic driving state is taken as the negative sample and the driver behavior trajectory as the positive sample. For example, for path 1, the plurality of planned trajectories obtained in the case of automatic driving, namely B = B_1 + B_2 + B_3 + … + B_i, where B_1 = f_1(A_1, W), B_2 = f_2(A_2, W), B_3 = f_3(A_3, W), …, B_i = f_i(A_i, W), is taken as the negative sample; the behavior trajectory obtained while the driver is driving, namely P = f(A_11, X_1) + f(A_22, X_2) + f(A_33, X_3) + … + f(A_nn, X_n), is taken as the positive sample. Here the negative sample is regarded as a less reliable sample that will need correction, and the positive sample is regarded as a reliable sample.
As shown in FIG. 4, inputting the planning trajectory data B as a negative sample and the driver behavior trajectory data P as a positive sample into the neural network model W, and comparing and learning the outputs of the neural network model W in the case of the negative sample and the positive sample, may include: inputting the planning trajectory data B as a negative sample and the driver behavior trajectory data P as a positive sample into the neural network model W at the same time, and comparing and learning the outputs of the neural network model W in the case of the negative sample and the positive sample within the same time period. For example, the planned trajectory data for the time period corresponding to point C0 to point C1 (automatic driving state, negative sample) and the driver behavior trajectory for the time period corresponding to point C0 to point C1 (obtained in the manual driving state, positive sample) are in the same time period; the planned trajectory data for the time period corresponding to point C2 to point C3 (negative sample) and the driver behavior trajectory for that time period (positive sample) are in the same time period; the planned trajectory data for the time period corresponding to point C4 to point C5 (negative sample) and the driver behavior trajectory for that time period (positive sample) are in the same time period; and the planned trajectory data for the time period corresponding to point C6 to point C7 (negative sample) and the driver behavior trajectory for that time period (positive sample) are in the same time period.
After the outputs of the neural network model W for the negative sample and the positive sample have been compared and learned, the neural network model W is corrected to obtain the planned driving strategy G = f(A, W'), where W' is the corrected neural network model. The purpose of the correction is to enable the obtained driving strategy G = f(A, W') to reflect changes in the road scene A more accurately; in subsequent use, the driving strategy G obtained by inputting the departure point A and the destination D into the corrected neural network model W' is more comfortable and safer.
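One way to realize this correction, under assumptions the patent leaves open, is to pair the planned trajectory (negative sample) and the driver behavior trajectory (positive sample) of the same time period and fine-tune a copy of W so that its output moves toward the positive sample and away from the negative one. The margin loss, the optimizer and the name correct_planner in the sketch below are illustrative choices, not the patent's stated procedure.

```python
# Illustrative correction of W into W': for each takeover time period the
# planned trajectory B_k (negative sample) and the driver trajectory P_k
# (positive sample) are compared, and the parameters are nudged so the
# model's output favors the positive sample.
import copy
import torch

def correct_planner(W, samples, epochs=100, lr=1e-4, margin=0.1):
    """samples: list of (A_k, B_k_negative, P_k_positive) tensors, same time period."""
    W_prime = copy.deepcopy(W)             # keep the original fitted W intact
    opt = torch.optim.Adam(W_prime.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        losses = []
        for A_k, B_neg, P_pos in samples:
            out = W_prime(A_k)
            d_pos = ((out - P_pos) ** 2).mean()  # distance to driver behavior
            d_neg = ((out - B_neg) ** 2).mean()  # distance to the original plan
            losses.append(torch.clamp(margin + d_pos - d_neg, min=0.0))
        loss = torch.stack(losses).mean()
        loss.backward()
        opt.step()
    return W_prime                          # corrected model W'

# After correction, the planned driving strategy G = f(A, W') simply means
# evaluating the corrected model W_prime on the road-scene encoding A.
```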
According to a second aspect of the present invention, there is provided an apparatus for planning a driving strategy for an unmanned vehicle which, as shown in FIG. 6, may include:
a constructing and fitting unit 202, used for constructing, according to the collected planning trajectory data B and the road scene data A, the functional relation B_i = f_i(A_i, W) among the planning trajectory data B, the road scene data A and the neural network model W, and fitting to obtain the neural network model W, wherein i = 1, 2, 3, …, n;
a generating unit 204, used for splicing the driver behavior trajectories recorded while the driver is driving to generate driver behavior trajectory data P;
and a comparison and correction unit 206, which inputs the planning trajectory data B as a negative sample and the driver behavior trajectory data P as a positive sample into the neural network model W, compares and learns the outputs of the neural network model W in the case of the negative sample and the positive sample, and corrects the neural network model W to obtain the planned driving strategy G = f(A, W'), where W' is the corrected neural network model.
In one embodiment of the present invention, the planned trajectory data B as a negative sample may include: planning trajectory data B obtained under the condition of traversing all planning trajectories.
In another embodiment of the present invention, splicing the driver behavior trajectories recorded while the driver is driving to generate the driver behavior trajectory data P may include: splicing the driver behavior trajectories over a plurality of time periods to generate the driver behavior trajectory data P.
In a further embodiment of the present invention, inputting the planning trajectory data B as a negative sample and the driver behavior trajectory data P as a positive sample into the neural network model W, and comparing and learning the outputs of the neural network model W in the case of the negative sample and the positive sample, includes: inputting the planning trajectory data B as a negative sample and the driver behavior trajectory data P as a positive sample into the neural network model W at the same time, and comparing and learning the outputs of the neural network model W in the case of the negative sample and the positive sample within the same time period.
In an embodiment according to the third aspect of the invention, there is provided a computer device, which may include: one or more processors; a storage device for storing one or more programs; the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method as above.
In an embodiment according to the fourth aspect of the invention, a computer-readable storage medium is provided, which stores a computer program, which when executed by a processor, implements the method as above.
Fig. 7 shows a block diagram of a computer apparatus according to an embodiment of the present invention. As shown in fig. 7, the computer apparatus includes: a memory 310 and a processor 320, the memory 310 having stored therein a computer program operable on the processor 320. The processor 320, when executing the computer program, implements the method of unmanned vehicle planning driving strategy of the above-described embodiments. The number of the memory 310 and the processor 320 may be one or more.
The apparatus/device/terminal/server further comprises:
the communication interface 330 is used for communicating with an external device to perform data interactive transmission.
Memory 310 may comprise a high-speed RAM memory and may also include a non-volatile memory, such as at least one disk memory.
If the memory 310, the processor 320 and the communication interface 330 are implemented independently, the memory 310, the processor 320 and the communication interface 330 may be connected to each other through a bus and perform communication with each other. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in FIG. 7, but this is not intended to represent only one bus or type of bus.
Optionally, in an implementation, if the memory 310, the processor 320 and the communication interface 330 are integrated on a chip, the memory 310, the processor 320 and the communication interface 330 may complete communication with each other through an internal interface.
An embodiment of the present invention provides a computer-readable storage medium, which stores a computer program, and the computer program is executed by a processor to implement the method in any one of the above embodiments.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means two or more unless specifically defined otherwise.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and alternate implementations are included within the scope of the preferred embodiment of the present invention in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present invention.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware that is related to instructions of a program, and the program may be stored in a computer-readable storage medium, and when executed, the program includes one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a separate product, may also be stored in a computer readable storage medium. The storage medium may be a read-only memory, a magnetic or optical disk, or the like.
While the invention has been described with reference to specific embodiments, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (8)

1. A method for planning a driving strategy, comprising:
according to the collected planning trajectory data B and road scene data A, establishing a functional relation B_i = f_i(A_i, W) among the planning trajectory data B, the road scene data A and a neural network model W, and fitting to obtain the neural network model W, wherein i = 1, 2, 3, …, n;
splicing the behavior tracks of the driver under the condition that the driver takes over the driving to generate behavior track data P of the driver;
inputting the planned trajectory data B as a negative sample and the driver behavior trajectory data P as a positive sample into the neural network model W, comparing and learning the output of the neural network model W under the conditions of the negative sample and the positive sample, and correcting the neural network model W to obtain a planned driving strategy G = f(A, W'), wherein W' is a corrected neural network model; the planning trajectory data B as a negative sample includes: planning trajectory data B obtained under the condition of traversing all planning trajectories.
2. A method for planning a driving strategy according to claim 1, wherein the stitching of driver behaviour trajectories in driver driving situations, generating driver behaviour trajectory data P comprises:
and splicing the behavior tracks of the driver in a plurality of time periods to generate behavior track data P of the driver.
3. A method for planning a driving strategy according to claim 2, wherein the inputting of the planned trajectory data B as negative samples and driver behavior trajectory data P as positive samples into the neural network model W, the comparative learning of the output of the neural network model W in the case of the negative and positive samples comprises:
and simultaneously inputting the planning track data B as a negative sample and the driver behavior track data P as a positive sample into the neural network model W, and comparing and learning the output of the neural network model W under the condition of the negative sample and the positive sample in the same time period.
4. An apparatus for planning a driving strategy, comprising:
the construction and fitting unit is used for constructing a functional relation B_i = f_i(A_i, W) among the planning trajectory data B, the road scene data A and the neural network model W according to the collected planning trajectory data B and the road scene data A, and fitting to obtain a neural network model W, wherein i = 1, 2, 3, …, n;
the generating unit is used for splicing the behavior tracks of the drivers under the condition that the drivers take over driving to generate behavior track data P of the drivers;
a comparison and correction unit which inputs the planned trajectory data B as a negative sample and the driver behavior trajectory data P as a positive sample into the neural network model W, compares and learns the output of the neural network model W in the case of the negative sample and the positive sample, and corrects the neural network model W to obtain a planned driving strategy G = f(A, W'), wherein W' is a corrected neural network model; the planning trajectory data B as a negative sample includes: planning trajectory data B obtained under the condition of traversing all planning trajectories.
5. An apparatus for planning a driving strategy according to claim 4, wherein the stitching of the driver behavior trajectory in the driver driving situation, generating driver behavior trajectory data P comprises:
and splicing the behavior tracks of the driver in a plurality of time periods to generate behavior track data P of the driver.
6. An apparatus for planning a driving strategy according to claim 5, wherein the inputting of the planned trajectory data B as negative samples and driver behavior trajectory data P as positive samples into the neural network model W, the comparative learning of the output of the neural network model W in the case of the negative and positive samples comprises:
and simultaneously inputting the planning track data B as a negative sample and the driver behavior track data P as a positive sample into the neural network model W, and comparing and learning the output of the neural network model W under the condition of the negative sample and the positive sample in the same time period.
7. A computer device, comprising:
one or more processors;
storage means for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-3.
8. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-3.
CN201910186289.3A 2019-03-12 2019-03-12 Method, device, equipment and storage medium for planning driving strategy of unmanned vehicle Active CN109857118B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910186289.3A CN109857118B (en) 2019-03-12 2019-03-12 Method, device, equipment and storage medium for planning driving strategy of unmanned vehicle

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910186289.3A CN109857118B (en) 2019-03-12 2019-03-12 Method, device, equipment and storage medium for planning driving strategy of unmanned vehicle

Publications (2)

Publication Number Publication Date
CN109857118A CN109857118A (en) 2019-06-07
CN109857118B true CN109857118B (en) 2022-08-16

Family

ID=66900683

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910186289.3A Active CN109857118B (en) 2019-03-12 2019-03-12 Method, device, equipment and storage medium for planning driving strategy of unmanned vehicle

Country Status (1)

Country Link
CN (1) CN109857118B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110281949B (en) * 2019-06-28 2020-12-18 清华大学 Unified hierarchical decision-making method for automatic driving
CN111339851B (en) * 2020-02-14 2022-01-11 青岛智能产业技术研究院 Unmanned vehicle remote take-over method based on scene familiarity
CN113954858A (en) * 2020-07-20 2022-01-21 华为技术有限公司 Method for planning vehicle driving route and intelligent automobile

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107491073A (en) * 2017-09-05 2017-12-19 百度在线网络技术(北京)有限公司 The data training method and device of automatic driving vehicle
CN109084798A (en) * 2018-08-29 2018-12-25 武汉环宇智行科技有限公司 Network issues the paths planning method at the control point with road attribute
CN109272108A (en) * 2018-08-22 2019-01-25 深圳市亚博智能科技有限公司 Control method for movement, system and computer equipment based on neural network algorithm
CN109358614A (en) * 2018-08-30 2019-02-19 深圳市易成自动驾驶技术有限公司 Automatic Pilot method, system, device and readable storage medium storing program for executing
CN109388138A (en) * 2017-08-08 2019-02-26 株式会社万都 Automatic driving vehicle, automatic Pilot control device and automatic Pilot control method based on deep learning

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170329331A1 (en) * 2016-05-16 2017-11-16 Magna Electronics Inc. Control system for semi-autonomous control of vehicle along learned route
US9989964B2 (en) * 2016-11-03 2018-06-05 Mitsubishi Electric Research Laboratories, Inc. System and method for controlling vehicle using neural network
EP3436783A4 (en) * 2017-06-22 2019-02-06 Baidu.com Times Technology (Beijing) Co., Ltd. Evaluation framework for predicted trajectories in autonomous driving vehicle traffic prediction
US10883844B2 (en) * 2017-07-27 2021-01-05 Waymo Llc Neural networks for vehicle trajectory planning
US10782693B2 (en) * 2017-09-07 2020-09-22 Tusimple, Inc. Prediction-based system and method for trajectory planning of autonomous vehicles

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109388138A (en) * 2017-08-08 2019-02-26 株式会社万都 Automatic driving vehicle, automatic Pilot control device and automatic Pilot control method based on deep learning
CN107491073A (en) * 2017-09-05 2017-12-19 百度在线网络技术(北京)有限公司 The data training method and device of automatic driving vehicle
CN109272108A (en) * 2018-08-22 2019-01-25 深圳市亚博智能科技有限公司 Control method for movement, system and computer equipment based on neural network algorithm
CN109084798A (en) * 2018-08-29 2018-12-25 武汉环宇智行科技有限公司 Network issues the paths planning method at the control point with road attribute
CN109358614A (en) * 2018-08-30 2019-02-19 深圳市易成自动驾驶技术有限公司 Automatic Pilot method, system, device and readable storage medium storing program for executing

Also Published As

Publication number Publication date
CN109857118A (en) 2019-06-07

Similar Documents

Publication Publication Date Title
CN109857118B (en) Method, device, equipment and storage medium for planning driving strategy of unmanned vehicle
CN109808709B (en) Vehicle driving guarantee method, device and equipment and readable storage medium
CN110406530B (en) Automatic driving method, device, equipment and vehicle
CN109305197B (en) Train control method and system and vehicle-mounted controller
EP3709231A1 (en) Vehicle track planning method, device, computer device and computer-readable storage medium
CN109165055B (en) Unmanned system component loading method and device, computer equipment and medium
CN109839937B (en) Method, device and computer equipment for determining automatic driving planning strategy of vehicle
CN111199088B (en) Method and device for reproducing scene data
JP7212486B2 (en) position estimator
DE102017100380A1 (en) Diagnostic test execution system and method
CN111123732B (en) Method and device for simulating automatic driving vehicle, storage medium and terminal equipment
DE102015214802A1 (en) Method and device for guiding a vehicle
EP3644148A1 (en) Test terminal for tests of an infrastructure of a vehicle
US11511748B2 (en) Velocity trajectory generation method, apparatus, and storage medium
CN109649382A (en) Paths planning method and device, the electronic equipment and computer-readable medium of automatic parking
CN111123728B (en) Unmanned vehicle simulation method, device, equipment and computer readable medium
US20230236088A1 (en) Computer-aided method and device for predicting speeds for vehicles on the basis of probability
DE102019111918A1 (en) Determining a route for a vehicle
CN112230632A (en) Method, apparatus, device and storage medium for automatic driving
CN109828857B (en) Vehicle fault cause positioning method, device, equipment and storage medium
CN109808689B (en) Unmanned vehicle control method, device and equipment
DE102020103754A1 (en) HYPER ASSOCIATION IN EPISODE MEMORY
CN115534998A (en) Automatic driving integrated decision-making method and device, vehicle and storage medium
CN111090269B (en) Sensor simulation method, device and storage medium based on generation of countermeasure network
Möstl et al. Controlling Concurrent Change-A Multiview Approach Toward Updatable Vehicle Automation Systems

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant