CN112085165A - Decision information generation method, device, equipment and storage medium - Google Patents

Decision information generation method, device, equipment and storage medium

Info

Publication number
CN112085165A
CN112085165A (application CN202010910254.2A)
Authority
CN
China
Prior art keywords
decision information
model
sample
information
vehicle
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010910254.2A
Other languages
Chinese (zh)
Inventor
何柳
李宇寂
尚秉旭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
FAW Group Corp
Original Assignee
FAW Group Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by FAW Group Corp filed Critical FAW Group Corp
Priority to CN202010910254.2A
Publication of CN112085165A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/049Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Traffic Control Systems (AREA)
  • Control Of Driving Devices And Active Controlling Of Vehicle (AREA)

Abstract

The invention discloses a decision information generation method, device, equipment and storage medium. The method comprises the following steps: acquiring the state parameters and environment information of the current vehicle at the current moment; performing data processing on the state parameters and the environment information to obtain a target feature vector; and inputting the target feature vector into a decision information generation model to obtain the decision information corresponding to the target feature vector, wherein the decision information generation model is a bidirectional LSTM network model, and the model structure and model parameters of the decision information generation model are obtained by training on sample state parameters, sample environment information and sample decision information.

Description

Decision information generation method, device, equipment and storage medium
Technical Field
The embodiments of the invention relate to vehicle technology, and in particular to a method, device, equipment and storage medium for generating decision information.
Background
With the development of artificial intelligence technology, intelligent products have gradually been integrated into every aspect of our lives. Automatic driving has become the development direction of future traffic: an autonomous vehicle not only provides conventional automobile functions such as acceleration, deceleration and steering, but also integrates system functions such as environmental perception, behavior decision, path planning and vehicle control.
Decision planning is a critical step in automatic driving, and whether decisions are reasonable directly determines the intelligence level of the autonomous vehicle; it is therefore also an important challenge in the development of automatic driving. Existing decision systems mainly use rule-based behavior decisions, i.e., a behavior rule base for automatic driving behaviors is established according to driving rules, traffic regulations, driving common sense and the like, vehicle states are divided according to different scenes, and vehicle behavior is determined according to rule logic.
A rule-based behavior decision system is conservative and works normally under most conditions, but it cannot adjust the vehicle's behavior decisions according to the driver's driving habits, and the hard division of vehicle states causes inconsistent vehicle behavior; the triggering conditions of the behavior rule base easily overlap, causing the system to fail. Moreover, rule-based behavior decisions cannot cover all emergency scenes, and scene traversal is insufficiently deep, so the decision accuracy of the system is difficult to improve, and bottlenecks exist in handling complex working conditions and improving algorithm performance.
Disclosure of Invention
The embodiments of the invention provide a decision information generation method, device, equipment and storage medium, which address the current defects that an intelligent vehicle cannot self-adjust its rule-based behavior during automatic driving and that scene coverage is incomplete, and improve the behavior-decision accuracy and driving safety of autonomous vehicles.
In a first aspect, an embodiment of the present invention provides a method for generating decision information, including:
acquiring state parameters and environmental information of a current vehicle at the current moment;
performing data processing on the state parameters and the environment information to obtain a target feature vector;
inputting the target feature vector into a decision information generation model to obtain the decision information corresponding to the target feature vector, wherein the decision information generation model is a bidirectional LSTM network model, and the model structure and model parameters of the decision information generation model are obtained by training on sample state parameters, sample environment information and sample decision information.
Further, the training method of the decision information generation model includes:
acquiring sample state parameters, sample environment information and sample decision information;
inputting the sample state parameters and the sample environment information into a bidirectional LSTM network model to be trained to obtain prediction decision information;
training a model structure and model parameters of the bidirectional LSTM network model to be trained according to an objective function formed by the sample decision information and the prediction decision information;
and returning to execute the operation of inputting the sample state parameters and the sample environment information into the bidirectional LSTM network model to be trained to obtain the prediction decision information until a decision information generation model is obtained.
Further, the environment information includes at least one of: a planned path generated by global navigation; the speed of a vehicle that is in front of the current vehicle and whose distance from the current vehicle is less than a preset distance; the distance between that vehicle and the current vehicle; the lane-line condition of the road section on which the current vehicle is driving; the state of obstacles around the current vehicle; and the traffic-light state.
Further, the bidirectional LSTM network model includes: a forward LSTM network model, a backward LSTM network model and a feedforward neural network model;
the inputting of the target feature vector into the decision information generation model to obtain the decision information corresponding to the target feature vector comprises:
inputting the target feature vector into the forward LSTM network model to obtain a first feature vector corresponding to the state parameters and environment information at the current time and within a preset period before the current time;
inputting the target feature vector into the backward LSTM network model to obtain a second feature vector corresponding to the state parameters and environment information at the current time and within a preset period after the current time;
and inputting the first feature vector and the second feature vector into the feedforward neural network model to obtain the decision information corresponding to the target feature vector.
Further, the bidirectional LSTM network model includes an input layer, two hidden layers and an output layer, wherein the activation function of the hidden layers is a rectified linear (ReLU) activation function and the activation function of the output layer is a softmax function.
Further, the bidirectional LSTM network model includes a forgetting gate, an input gate, an output gate and a memory unit;
the forgetting gate is realized by the following formula:
f_t = σ(W_f · [h_{t-1}, x_t] + b_f);
wherein W_f is the weight of the forgetting gate, h_{t-1} is the output of the memory unit at time t-1, x_t is the input at time t, b_f is the bias vector of the forgetting gate, and σ is the sigmoid function;
the input gate is realized by the following formula:
i_t = σ(W_i · [h_{t-1}, x_t] + b_i);
wherein W_i is the weight of the input gate and b_i is the bias vector of the input gate;
the candidate value at the current time t is calculated by the following formula:
C̃_t = tanh(W_C · [h_{t-1}, x_t] + b_C);
wherein W_C is the weight of the memory unit and b_C is the bias vector of the memory unit;
the state of the memory unit at the current time t is calculated by the following formula:
C_t = f_t * C_{t-1} + i_t * C̃_t;
wherein C_{t-1} is the state of the memory unit at time t-1;
the output gate is realized by the following formula:
o_t = σ(W_o · [h_{t-1}, x_t] + b_o);
wherein W_o is the weight of the output gate and b_o is the bias vector of the output gate;
the output of the decision information generation model is calculated by the following formula:
h_t = o_t * tanh(C_t).
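As a minimal sketch, the gate formulas above can be implemented in a few lines of numpy; the dimensions and random weights here are illustrative only, not part of the patent.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, C_prev, W_f, b_f, W_i, b_i, W_C, b_C, W_o, b_o):
    """One LSTM time step following the gate formulas above.

    Each weight matrix W_* has shape (hidden, hidden + n_in) and acts on the
    concatenation [h_{t-1}, x_t]."""
    z = np.concatenate([h_prev, x_t])       # [h_{t-1}, x_t]
    f_t = sigmoid(W_f @ z + b_f)            # forgetting gate
    i_t = sigmoid(W_i @ z + b_i)            # input gate
    C_tilde = np.tanh(W_C @ z + b_C)        # candidate value at time t
    C_t = f_t * C_prev + i_t * C_tilde      # memory-unit state at time t
    o_t = sigmoid(W_o @ z + b_o)            # output gate
    h_t = o_t * np.tanh(C_t)                # model output at time t
    return h_t, C_t

# Tiny usage with random weights (hidden size 3, input size 2).
rng = np.random.default_rng(0)
hidden, n_in = 3, 2
Ws = [rng.normal(size=(hidden, hidden + n_in)) for _ in range(4)]
bs = [np.zeros(hidden) for _ in range(4)]
h_t, C_t = lstm_step(rng.normal(size=n_in), np.zeros(hidden), np.zeros(hidden),
                     Ws[0], bs[0], Ws[1], bs[1], Ws[2], bs[2], Ws[3], bs[3])
```

Because h_t = o_t · tanh(C_t) with o_t in (0, 1), each component of the output is bounded in magnitude by 1.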
In a second aspect, an embodiment of the present invention further provides a decision information generating apparatus. The apparatus includes:
an acquisition module, configured to acquire the state parameters and environment information of the current vehicle at the current moment;
a processing module, configured to perform data processing on the state parameters and the environment information to obtain a target feature vector;
and a generating module, configured to input the target feature vector into a decision information generation model to obtain the decision information corresponding to the target feature vector, wherein the decision information generation model is a bidirectional LSTM network model, and the model structure and model parameters of the decision information generation model are obtained by training on sample state parameters, sample environment information and sample decision information.
Further, the generating module is specifically configured to:
acquiring sample state parameters, sample environment information and sample decision information;
inputting the sample state parameters and the sample environment information into a bidirectional LSTM network model to be trained to obtain prediction decision information;
training a model structure and model parameters of the bidirectional LSTM network model to be trained according to an objective function formed by the sample decision information and the prediction decision information;
and returning to execute the operation of inputting the sample state parameters and the sample environment information into the bidirectional LSTM network model to be trained to obtain the prediction decision information until a decision information generation model is obtained.
Further, the environment information includes at least one of: a planned path generated by global navigation; the speed of a vehicle that is in front of the current vehicle and whose distance from the current vehicle is less than a preset distance; the distance between that vehicle and the current vehicle; the lane-line condition of the road section on which the current vehicle is driving; the state of obstacles around the current vehicle; and the traffic-light state.
Further, the bidirectional LSTM network model includes: a forward LSTM network model, a backward LSTM network model and a feedforward neural network model;
the generating module is specifically configured to:
input the target feature vector into the forward LSTM network model to obtain a first feature vector corresponding to the state parameters and environment information at the current time and within a preset period before the current time;
input the target feature vector into the backward LSTM network model to obtain a second feature vector corresponding to the state parameters and environment information at the current time and within a preset period after the current time;
and input the first feature vector and the second feature vector into the feedforward neural network model to obtain the decision information corresponding to the target feature vector.
Further, the bidirectional LSTM network model includes an input layer, two hidden layers and an output layer, wherein the activation function of the hidden layers is a rectified linear (ReLU) activation function and the activation function of the output layer is a softmax function.
Further, the bidirectional LSTM network model includes a forgetting gate, an input gate, an output gate and a memory unit;
the forgetting gate is realized by the following formula:
f_t = σ(W_f · [h_{t-1}, x_t] + b_f);
wherein W_f is the weight of the forgetting gate, h_{t-1} is the output of the memory unit at time t-1, x_t is the input at time t, b_f is the bias vector of the forgetting gate, and σ is the sigmoid function;
the input gate is realized by the following formula:
i_t = σ(W_i · [h_{t-1}, x_t] + b_i);
wherein W_i is the weight of the input gate and b_i is the bias vector of the input gate;
the candidate value at the current time t is calculated by the following formula:
C̃_t = tanh(W_C · [h_{t-1}, x_t] + b_C);
wherein W_C is the weight of the memory unit and b_C is the bias vector of the memory unit;
the state of the memory unit at the current time t is calculated by the following formula:
C_t = f_t * C_{t-1} + i_t * C̃_t;
wherein C_{t-1} is the state of the memory unit at time t-1;
the output gate is realized by the following formula:
o_t = σ(W_o · [h_{t-1}, x_t] + b_o);
wherein W_o is the weight of the output gate and b_o is the bias vector of the output gate;
the output of the decision information generation model is calculated by the following formula:
h_t = o_t * tanh(C_t).
in a third aspect, an embodiment of the present invention further provides a computer device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor executes the computer program to implement the decision information generating method according to any one of the embodiments of the present invention.
In a fourth aspect, the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the decision information generating method according to any one of the embodiments of the present invention.
In the embodiments of the invention, the state parameters and environment information of the current vehicle at the current moment are acquired; data processing is performed on the state parameters and the environment information to obtain a target feature vector; and the target feature vector is input into a decision information generation model to obtain the decision information corresponding to the target feature vector, wherein the decision information generation model is a bidirectional LSTM network model whose model structure and model parameters are obtained by training on sample state parameters, sample environment information and sample decision information. This overcomes the defects that existing intelligent vehicles cannot self-adjust their rule-based behavior during automatic driving and that scene coverage is incomplete, and improves the behavior-decision accuracy and driving safety of autonomous vehicles.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present invention and therefore should not be considered as limiting its scope; those skilled in the art can obtain other related drawings from these drawings without inventive effort.
Fig. 1 is a flowchart of a decision information generating method according to a first embodiment of the present invention;
FIG. 1a is a flow chart of decision information generation according to a first embodiment of the present invention;
FIG. 1b is a diagram of a decision information generation model according to a first embodiment of the present invention;
fig. 2 is a schematic structural diagram of a decision information generating apparatus according to a second embodiment of the present invention;
fig. 3 is a schematic structural diagram of a computer device in a third embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present invention, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.
Example one
Fig. 1 is a flowchart of a decision information generating method according to an embodiment of the present invention. This embodiment is applicable to decision information generation, and the method may be executed by the decision information generating apparatus provided by an embodiment of the present invention, which may be implemented in software and/or hardware. As shown in fig. 1, the method specifically includes the following steps:
and S110, acquiring the state parameters and the environmental information of the current vehicle at the current moment.
Wherein the state information of the current vehicle includes one or more of: the current vehicle's position, speed, acceleration, steering-wheel angle, the pressure applied to the throttle pedal, and the pressure applied to the brake pedal. It may further include other information capable of characterizing the current vehicle state, which is not limited in the embodiments of the present invention.
Wherein the state information of the current vehicle may further include: trajectory data of the current vehicle.
The state parameters of the current vehicle may be acquired through vehicle-mounted sensors or directly through the CAN bus, which is not limited in the embodiments of the present invention.
The environment information may include at least one of: a planned path generated by global navigation; the speed of a vehicle that is in front of the current vehicle and whose distance from the current vehicle is less than a preset distance; the distance between that vehicle and the current vehicle; the lane-line condition of the road section on which the current vehicle is driving; the state of obstacles around the current vehicle; and the traffic-light state. This is not limited in the embodiments of the present invention.
The environment information can be acquired by a vehicle-mounted camera, a vehicle-mounted laser radar and other vehicle-mounted equipment.
Specifically, the state parameter and the environmental information of the current vehicle at the current time are obtained, for example, if the current time is time t, the state parameter and the environmental information of the current vehicle at time t are obtained.
And S120, performing data processing on the state parameters and the environment information to obtain a target feature vector.
Specifically, data processing is performed on the state parameters and environment information of the current vehicle at the current moment to obtain the target feature vector corresponding to them.
And S130, inputting the target feature vector into a decision information generation model to obtain the decision information corresponding to the target feature vector, wherein the decision information generation model is a bidirectional LSTM network model, and the model structure and model parameters of the decision information generation model are obtained by training on sample state parameters, sample environment information and sample decision information.
The decision information generation model is a trained neural network model whose input is the target feature vector and whose output is decision information; it may be a Long Short-Term Memory (LSTM) network model.
Wherein, the decision information generation model is a bidirectional LSTM network model, which comprises: a forward LSTM network model, a backward LSTM network model, and a feedforward neural network model.
The sample state parameters and the sample environment information are information of the same vehicle acquired at the same historical moment, and the sample decision information is the decision information corresponding to those sample state parameters and that sample environment information.
The sample state parameters, sample environment information and sample decision information may be obtained from the trajectory data of the current vehicle during driving and the environment data within a certain range around it.
Specifically, the trajectory data of the vehicle during driving and the environment data within a certain range are acquired: the trajectory data set of the vehicle is T = {T_1, T_2, ..., T_t, ..., T_n}, with the corresponding environment data set X = {X_1, X_2, ..., X_t, ..., X_n}. The trajectory data T are the behavior decisions of the vehicle at different moments, including lane changing, overtaking, cruising and emergency stopping; T_t is the behavior decision of the vehicle at the t-th time point. The environment data X is a data set of different environment information, comprising the planned path obtained through global navigation, the speed and distance of the nearest vehicle in front, the lane-line condition of the driving road section, the obstacle condition within a certain range around the vehicle, and the traffic-light condition ahead. X_t is the environment data of the vehicle at the t-th time point; the environment data at each time point is a set of these five items and is further processed into feature vectors in the range [0, 1].
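The processing of each time point's five environment items into a feature vector in [0, 1] can be sketched as a clamped min-max normalization; the field names, units and value ranges below are hypothetical examples, not specified by the patent.

```python
import numpy as np

def to_feature_vector(env_record, bounds):
    """Clamp each raw environment quantity to its range and min-max scale it into [0, 1].

    `bounds` maps each field to an assumed (min, max) range -- illustrative values only."""
    feats = []
    for key, value in env_record.items():
        lo, hi = bounds[key]
        clamped = min(max(value, lo), hi)
        feats.append((clamped - lo) / (hi - lo))
    return np.array(feats)

# Hypothetical environment record at one time point (assumed units: m/s, m, count).
record = {"lead_speed": 12.0, "gap": 35.0, "num_lanes": 3, "obstacle_dist": 20.0, "light": 1}
bounds = {"lead_speed": (0, 40), "gap": (0, 100), "num_lanes": (1, 5),
          "obstacle_dist": (0, 50), "light": (0, 2)}
vec = to_feature_vector(record, bounds)
```

Here `vec` is a five-element vector whose components all lie in [0, 1], ready to be fed to the network.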
Optionally, the training method of the decision information generation model includes:
acquiring sample state parameters, sample environment information and sample decision information;
inputting the sample state parameters and the sample environment information into a bidirectional LSTM network model to be trained to obtain prediction decision information;
training a model structure and model parameters of the bidirectional LSTM network model to be trained according to an objective function formed by the sample decision information and the prediction decision information;
and returning to execute the operation of inputting the sample state parameters and the sample environment information into the bidirectional LSTM network model to be trained to obtain the prediction decision information until a decision information generation model is obtained.
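The four training steps above (predict, form the objective, update, return and repeat until the model is obtained) can be sketched as a generic loop with early stopping. The `ToyModel` interface (`predict`, `objective`, `update`, `accuracy`) is a stand-in assumed for illustration, not the patent's actual bidirectional LSTM.

```python
class ToyModel:
    """Stand-in for the bidirectional LSTM model; each update nudges accuracy up."""
    def __init__(self):
        self.acc = 0.5
    def predict(self, samples):
        return [0] * len(samples)               # prediction decision information
    def objective(self, preds, labels):
        return sum(p != y for p, y in zip(preds, labels))
    def update(self, loss):
        self.acc = min(0.9, self.acc + 0.1)     # pretend a gradient step helped
    def accuracy(self, samples, labels):
        return self.acc

def train_decision_model(model, samples, labels, max_epochs=100, patience=5):
    """Repeat predict -> objective -> update until accuracy stops improving."""
    best_acc, stale = 0.0, 0
    for _ in range(max_epochs):
        preds = model.predict(samples)          # obtain prediction decision information
        loss = model.objective(preds, labels)   # objective from sample vs. predicted
        model.update(loss)                      # train model structure and parameters
        acc = model.accuracy(samples, labels)
        if acc > best_acc:
            best_acc, stale = acc, 0
        else:
            stale += 1
            if stale >= patience:               # accuracy no longer improves: stop
                break
    return model

trained = train_decision_model(ToyModel(), samples=[1, 2, 3], labels=[0, 0, 1])
```

The early-stopping criterion matches the description later in the text: training halts once prediction accuracy no longer improves.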
The sample state parameters, sample environment information and sample decision information are, respectively, the state parameters of the current vehicle during driving, the environment information within a certain range around the current vehicle, and the decision information corresponding to those state parameters and that environment information.
Specifically, because the sample state parameters, sample environment information and sample decision information and the acquired state parameters and environment information of the current vehicle at the current moment are information of the same vehicle at different moments, higher-quality samples can be provided for the model, so that the output of the trained model is more consistent with the current vehicle.
Optionally, the environment information includes at least one of: a planned path generated by global navigation; the speed of a vehicle that is in front of the current vehicle and whose distance from the current vehicle is less than a preset distance; the distance between that vehicle and the current vehicle; the lane-line condition of the road section on which the current vehicle is driving; the state of obstacles around the current vehicle; and the traffic-light state.
Optionally, the bidirectional LSTM network model includes: a forward LSTM network model, a backward LSTM network model and a feedforward neural network model;
the inputting of the target feature vector into the decision information generation model to obtain the decision information corresponding to the target feature vector comprises:
inputting the target feature vector into the forward LSTM network model to obtain a first feature vector corresponding to the state parameters and environment information at the current time and within a preset period before the current time;
inputting the target feature vector into the backward LSTM network model to obtain a second feature vector corresponding to the state parameters and environment information at the current time and within a preset period after the current time;
and inputting the first feature vector and the second feature vector into the feedforward neural network model to obtain the decision information corresponding to the target feature vector.
Specifically, at time t the environment data X_t is the set of the planned route obtained by global navigation, the speed and distance of the nearest vehicle ahead, the lane-line condition of the driving road section, the obstacle condition within a certain range around the vehicle, and the traffic-light condition ahead: X_t = {A_t, B_t, C_t, D_t, E_t}, wherein A_t is the planned route from global navigation at time t, B_t is the speed and distance of the nearest vehicle ahead at time t, C_t is the lane-line condition of the road section travelled at time t, D_t is the obstacle condition within a certain range around the vehicle at time t, and E_t is the traffic-light condition ahead at time t. The environment data X at each time instant is represented by vectors of the same length.
For a bidirectional LSTM network at time t, the forward LSTM obtains a feature from the current vehicle's state parameters and environment information at and before time t, i.e. x_1, x_2, ..., x_t, while the backward LSTM obtains a feature from the state parameters and environment information at and after time t, i.e. x_t, x_{t+1}, ..., x_τ. Combining the output vectors h_t^f and h_t^b of the two hidden layers yields the information of the entire sequence of the bidirectional LSTM at time t.
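A minimal numpy sketch of this forward/backward combination, assuming a toy single-layer LSTM with random weights; the sequence length, layer sizes, and gate packing are illustrative only, not the patent's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def run_lstm(seq, W, b, hidden):
    """Minimal single-layer LSTM scan; W packs the four gates row-wise
    (forget, input, output, candidate) and acts on [h, x]."""
    h, C, outs = np.zeros(hidden), np.zeros(hidden), []
    for x in seq:
        z = W @ np.concatenate([h, x]) + b
        f = sigmoid(z[:hidden])                  # forgetting gate
        i = sigmoid(z[hidden:2 * hidden])        # input gate
        o = sigmoid(z[2 * hidden:3 * hidden])    # output gate
        g = np.tanh(z[3 * hidden:])              # candidate value
        C = f * C + i * g
        h = o * np.tanh(C)
        outs.append(h)
    return outs

hidden, n_in = 4, 5
W_fwd = rng.normal(scale=0.1, size=(4 * hidden, hidden + n_in))
W_bwd = rng.normal(scale=0.1, size=(4 * hidden, hidden + n_in))
b = np.zeros(4 * hidden)
seq = [rng.normal(size=n_in) for _ in range(6)]      # feature vectors x_1..x_6
t = 3                                                # current time index
h_f = run_lstm(seq[:t + 1], W_fwd, b, hidden)[-1]    # forward pass over x_1..x_t
h_b = run_lstm(seq[t:][::-1], W_bwd, b, hidden)[-1]  # backward pass over x_6..x_t
feat = np.concatenate([h_f, h_b])                    # combined bidirectional feature
```

`feat` corresponds to the combined [h_t^f, h_t^b] vector that the feedforward network consumes.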
For the feedforward neural network, the input at time t is the feature vector composed of the forward output h_t^f of the bidirectional LSTM at time t and the backward output h_t^b at the same time. A ReLU (Rectified Linear Unit) is adopted as the activation function of the hidden layers, a Softmax function as the activation function of the output layer, and the Back Propagation Through Time (BPTT) algorithm adaptively adjusts the LSTM and multi-layer perceptron parameters over time.
Because the behavior decisions of the present system are the four cases of lane change, overtaking, cruising and emergency stop, vehicle behavior decision prediction can be understood as a mapping problem: the feature vector formed by integrating the five environmental factors of the vehicle's environmental condition is mapped to one of the four behavior decisions, so the output can be regarded as a vector over the four cases of lane change, overtaking, cruising and emergency stop.
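The mapping into the four behavior decisions can be sketched as below; the network sizes and weights are made up for illustration, and only the ReLU hidden layer and softmax output layer follow the text.

```python
import math

DECISIONS = ["lane change", "overtaking", "cruising", "emergency stop"]

def relu(v):
    return [max(0.0, x) for x in v]

def softmax(v):
    m = max(v)
    exps = [math.exp(x - m) for x in v]
    s = sum(exps)
    return [e / s for e in exps]

def decide(features, w_hidden, w_out):
    # ReLU hidden layer, then a 4-way softmax output layer.
    hidden = relu([sum(w * f for w, f in zip(row, features)) for row in w_hidden])
    probs = softmax([sum(w * h for w, h in zip(row, hidden)) for row in w_out])
    return DECISIONS[probs.index(max(probs))], probs

# Toy 2-feature input, 3-unit hidden layer, 4-way output (weights are made up).
decision, probs = decide(
    [0.8, -0.2],
    w_hidden=[[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
    w_out=[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.2, 0.2, 0.2]],
)
```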
The deep learning model is trained on the input and output data using cross validation, the parameters are adjusted, and training stops when the prediction accuracy no longer improves.
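The stop-when-accuracy-plateaus rule can be sketched as follows; `train_and_score` is a hypothetical stand-in for one round of cross-validated training, and the patience threshold is an assumption.

```python
def train_until_plateau(train_and_score, max_rounds=50, patience=3):
    best, stale, history = -1.0, 0, []
    for _ in range(max_rounds):
        score = train_and_score()  # one round of training + validation accuracy
        history.append(score)
        if score > best:
            best, stale = score, 0
        else:
            stale += 1
            if stale >= patience:
                break  # prediction accuracy no longer improves: stop training
    return best, history

# Simulated accuracy curve that plateaus at 0.9.
scores = iter([0.5, 0.7, 0.85, 0.9, 0.9, 0.89, 0.9, 0.9])
best, history = train_until_plateau(lambda: next(scores))
```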
Specifically, as shown in fig. 1a, the environmental information and the state parameter of the current vehicle at the current time are obtained, data processing is performed on the state parameter and the environmental information to obtain a feature vector corresponding to the state parameter and the environmental information, and the feature vector is input into the bidirectional LSTM network model to obtain decision information corresponding to the feature vector.
Optionally, the bidirectional LSTM network model includes: an input layer, two hidden layers and an output layer, wherein the activation function of the hidden layers is a rectified linear (ReLU) activation function, and the activation function of the output layer is a softmax function.
Optionally, the bidirectional LSTM network model includes: a forgetting gate, an input gate, an output gate and a memory unit;
the forgetting gate is realized by the following formula:
f_t = σ(W_f · [h_{t-1}, x_t] + b_f);
wherein W_f is the weight of the forgetting gate, h_{t-1} is the output of the memory cell at time t-1, x_t is the input at time t, b_f is the offset vector of the forgetting gate, and σ is the sigmoid function;
the input gate is implemented by the following formula:
i_t = σ(W_i · [h_{t-1}, x_t] + b_i);
wherein W_i is the weight of the input gate, h_{t-1} is the output of the memory cell at time t-1, x_t is the input at time t, b_i is the offset vector of the input gate, and σ is the sigmoid function;
the candidate value at the current time t is calculated by the following formula:
C̃_t = tanh(W_C · [h_{t-1}, x_t] + b_C);
wherein W_C is the weight of the memory unit, h_{t-1} is the output of the memory cell at time t-1, x_t is the input at time t, and b_C is the offset vector of the memory unit;
the state of the memory cell at the current time t is calculated by the following formula:
C_t = f_t * C_{t-1} + i_t * C̃_t;
wherein C_{t-1} is the state of the memory cell at time t-1;
the output gate is realized by the following formula:
o_t = σ(W_o · [h_{t-1}, x_t] + b_o);
wherein W_o is the weight of the output gate, h_{t-1} is the output at time t-1, x_t is the input at time t, b_o is the offset vector of the output gate, and σ is the sigmoid function;
the output of the decision information generation model is calculated by the following formula:
h_t = o_t * tanh(C_t).
the prior art generally adopts a rule-based behavior decision method, and the rule-based behavior decision method has the following defects: 1) the vehicle behavior is not consistent due to the state cutting division condition; 2) the triggering conditions of the behavior rule base are easy to overlap, so that the system fails; 3) the finite state machine has difficulty in completely covering all the operating conditions that the vehicle may encounter, and usually ignores the environmental details that may lead to decision errors; 4) due to insufficient scene depth traversal, the decision accuracy of the system is difficult to improve, and a bottleneck exists in the improvement of complex working condition processing and algorithm performance. In order to solve the above problems, the embodiment of the present invention adopts a bidirectional LSTM algorithm as a network model, where LSTM is a special RNN, and is capable of learning a long-term dependency relationship, and mainly includes three gates, namely a forgetting gate (forget gate), an input gate (input gate), an output gate (output gate), and a memory cell, and since the gate is composed of a sigmoid neural network layer and a dot product operation. The LSTM may delete or add information to the cell state. Because the output of sigmoid function is [0,1 ]]The real number vector between the two, so the transmission of the information can be controlled by multiplying the two, when the sigmoid function output is 0, any vector multiplied by the two can obtain a 0 vector, namely the 0 vector cannot pass; when the output is 1, any vector multiplied by itself, which is equivalent to passing through the gate at will. The LSTM controls the information input and output through these three gates. 
The LSTM first decides which information can pass through the cell state, namely a forget gate controlled by sigmoid: as shown in formula 1-1, the output h_{t-1} at time t-1 is combined with the input x_t at the current time t to generate a value f_t; because f_t lies in the range 0 to 1, it determines whether the information C_{t-1} learned at time t-1 passes, or partially passes.
f_t = σ(W_f · [h_{t-1}, x_t] + b_f)    (1-1)
wherein W_f is the weight of the forgetting gate, h_{t-1} is the output of the memory cell at time t-1, x_t is the input at time t, b_f is the offset vector of the forgetting gate, and σ is the sigmoid function.
The second step of the LSTM is to update information: as shown in formula 1-2, the input gate layer determines through sigmoid which values can be updated, and a tanh layer generates the new candidate value C̃_t, as shown in formula 1-3. The candidate generated by the current layer is calculated from the output at time t-1 and the input at the current time t, and may be added to the cell state.
i_t = σ(W_i · [h_{t-1}, x_t] + b_i)    (1-2)
wherein W_i is the weight of the input gate, h_{t-1} is the output of the memory cell at time t-1, x_t is the input at time t, b_i is the offset vector of the input gate, and σ is the sigmoid function.
C̃_t = tanh(W_C · [h_{t-1}, x_t] + b_C)    (1-3)
wherein W_C is the weight of the memory cell, h_{t-1} is the output of the memory cell at time t-1, x_t is the input at time t, and b_C is the offset vector of the memory cell.
Multiplying the cell state C_{t-1} at time t-1 element-wise by the forgetting gate f_t, and adding the input gate i_t multiplied element-wise by the current candidate value C̃_t, gives the cell state C_t at the current time t, which is thus determined jointly by the forgetting gate, the input gate and the candidate value, as shown in formula 1-4.
C_t = f_t * C_{t-1} + i_t * C̃_t    (1-4)
wherein C_{t-1} is the state of the memory cell at time t-1.
For the final output of the model, an initial output o_t at the current time t is first obtained by a sigmoid function, as shown in formula 1-5; then the current cell state C_t is scaled to a value between -1 and 1 using the tanh function; finally, the scaled cell state at the current time is multiplied element-wise by the output of the sigmoid function to obtain the final output of the model, as shown in formula 1-6.
o_t = σ(W_o · [h_{t-1}, x_t] + b_o)    (1-5)
wherein W_o is the weight of the output gate, h_{t-1} is the output at time t-1, x_t is the input at time t, b_o is the offset vector of the output gate, and σ is the sigmoid function.
h_t = o_t * tanh(C_t)    (1-6)
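Formulas (1-1) to (1-6) can be collected into one LSTM step; the sketch below uses scalar states and made-up weights for readability, whereas a real cell is vector-valued with trained parameters.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x_t, h_prev, c_prev, w, b):
    f_t = sigmoid(w["f"][0] * h_prev + w["f"][1] * x_t + b["f"])       # (1-1)
    i_t = sigmoid(w["i"][0] * h_prev + w["i"][1] * x_t + b["i"])       # (1-2)
    c_cand = math.tanh(w["c"][0] * h_prev + w["c"][1] * x_t + b["c"])  # (1-3)
    c_t = f_t * c_prev + i_t * c_cand                                  # (1-4)
    o_t = sigmoid(w["o"][0] * h_prev + w["o"][1] * x_t + b["o"])       # (1-5)
    h_t = o_t * math.tanh(c_t)                                         # (1-6)
    return h_t, c_t

# Illustrative parameters, not trained values.
w = {k: (0.5, 0.5) for k in ("f", "i", "c", "o")}
b = {k: 0.0 for k in ("f", "i", "c", "o")}
h, c = 0.0, 0.0
for x in [1.0, -0.5, 0.2]:  # run three time steps
    h, c = lstm_step(x, h, c, w, b)
```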
Because the LSTM has an input gate and a forgetting gate, information can be screened so that currently irrelevant content does not enter the memory cell while information from long before can be stored, and the output gate controls the influence of long-term memory on the current output.
The technical scheme adopted by the invention is as follows: 1) extract the trajectory data and environment data set of the current vehicle in different scenes; 2) perform feature value addition and standardization processing on the trajectory data, extract the trajectory data at time t as the multi-class LSTM classification label, and use the environmental information at the same time as the input feature vector at time t; 3) train the driving behavior decisions of the current vehicle in different scenes through the LSTM algorithm; 4) predict the decision information of the current vehicle in the corresponding scene according to the trained model; 5) adjust the motion trajectory of the vehicle according to the predicted decision information.
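Step 2 mentions standardization of the trajectory data; a common choice, assumed here since the text does not fix one, is z-score standardization.

```python
import math

def standardize(values):
    # z-score: subtract the mean, divide by the standard deviation.
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var) or 1.0  # guard against constant features
    return [(v - mean) / std for v in values]

speeds = [10.0, 12.0, 14.0, 16.0]  # e.g. a speed feature across samples
z = standardize(speeds)
```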
The embodiment of the invention has the following beneficial effects: the deep learning model can be continuously trained according to the driver's usual driving habits, and the parameters adjusted so that the prediction results come ever closer to the driver's behavior decision habits, avoiding problems of rule-based behavior decision systems in automatic driving such as determining state partition boundaries and incomplete scene coverage. The driving safety of the intelligent vehicle is further guaranteed, and the driver's experience is improved.
In one specific example, historical state parameters and environmental information of the vehicle and the corresponding decision information are obtained. The environmental information is a set of the planned path obtained by global navigation, the speed and distance of the nearest vehicle ahead, the lane line condition of the driving road section, the obstacle condition within a certain range around the vehicle, and the traffic light condition ahead. The decision information includes lane change, overtaking, cruising, and/or emergency stop. The state parameters and environmental information of the vehicle are used as input, the decision information is used as output, and the bidirectional LSTM is trained in a supervised manner. The trained decision information generation model is used to predict the decision information of the vehicle thereafter, and the vehicle running trajectory is further adjusted according to the prediction result. During running, the state parameters and environmental information of the current vehicle at the current moment are input into the trained decision information generation model to obtain decision information based on them, and the running trajectory of the vehicle is adjusted according to the output of the model.
In a specific example, as shown in fig. 1b, the decision information generation model includes: the bidirectional LSTM model comprises a forward LSTM model and a backward LSTM model. The feedforward neural network model includes an input layer, a hidden layer 1, a hidden layer 2, and an output layer.
The embodiment of the invention provides a neural network model-based behavior decision generation method for an automatic driving vehicle, wherein the state parameters of the vehicle and the environment information of the vehicle are used as model input, and the decision information is used as model output to train an LSTM algorithm model, so that the vehicle can adjust its behavior decision model according to the behavior decisions of the driver, bringing the decisions closer to the driver's own choices. The method thereby overcomes the defects that current intelligent vehicles cannot self-adjust based on rule learning during automatic driving and that scene coverage is incomplete, improves the behavior decision accuracy and driving safety of the automatic driving vehicle, and can be widely used.
According to the technical scheme of the embodiment, the state parameters and the environmental information of the current vehicle at the current moment are acquired; performing data processing on the state parameters and the environmental information to obtain target characteristic vectors; inputting the target characteristic vector into a decision information generation model to obtain decision information corresponding to the target characteristic vector, wherein the decision information generation model is a bidirectional LSTM network model, and the model structure and model parameters of the decision information generation model are obtained by training according to sample state parameters, sample environment information and sample decision information respectively, so that the defects that the existing intelligent vehicle cannot self-adjust based on rule learning in the automatic driving process, the scene coverage is incomplete and the like are overcome, and the behavior decision accuracy and the driving safety of the automatic driving vehicle are improved.
Example two
Fig. 2 is a schematic structural diagram of a decision information generating apparatus according to a second embodiment of the present invention. The present embodiment may be applicable to the case of generating decision information, where the apparatus may be implemented in a software and/or hardware manner, and the apparatus may be integrated in any device providing a decision information generating function. As shown in fig. 2, the decision information generating apparatus specifically includes: an acquisition module 210, a processing module 220, and a generation module 230.
The obtaining module 210 is configured to obtain a current state parameter and environmental information of a current vehicle at a current time;
the processing module 220 is configured to perform data processing on the state parameters and the environmental information to obtain a target feature vector;
the generating module 230 is configured to input the target feature vector into a decision information generating model, so as to obtain decision information corresponding to the target feature vector, where the decision information generating model is a bidirectional LSTM network model, and a model structure and model parameters of the decision information generating model are obtained by training according to a sample state parameter, sample environment information, and sample decision information, respectively.
Optionally, the generating module is specifically configured to:
acquiring sample state parameters, sample environment information and sample decision information;
inputting the sample state parameters and the sample environment information into a bidirectional LSTM network model to be trained to obtain prediction decision information;
training a model structure and model parameters of the bidirectional LSTM network model to be trained according to an objective function formed by the sample decision information and the prediction decision information;
and returning to execute the operation of inputting the sample state parameters and the sample environment information into the bidirectional LSTM network model to be trained to obtain the prediction decision information until a decision information generation model is obtained.
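The training procedure above (predict, form an objective from the sample decision information and the predicted decision information, update, repeat until the model is obtained) can be sketched with a one-parameter stand-in model and an assumed squared-error objective.

```python
def train(samples, targets, lr=0.1, tol=1e-4, max_iter=1000):
    w = 0.0  # single illustrative parameter standing in for the model
    loss = float("inf")
    for _ in range(max_iter):
        preds = [w * s for s in samples]  # predicted decision information
        loss = sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(samples)
        if loss < tol:
            break  # decision information generation model obtained
        grad = sum(2 * (p - t) * s
                   for p, t, s in zip(preds, targets, samples)) / len(samples)
        w -= lr * grad  # adjust model parameters per the objective function
    return w, loss

# Toy samples whose target relation is t = 2 * s.
w, loss = train([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```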
Optionally, the environment information includes: the vehicle-mounted navigation system comprises at least one of a planned path generated by global navigation, the speed of a vehicle which is in front of a current vehicle and has a distance with the current vehicle smaller than a preset distance, the distance between the vehicle which is in front of the current vehicle and has a distance with the current vehicle smaller than the preset distance and the current vehicle, the lane line condition of a current vehicle driving road section, the obstacle state around the current vehicle and the traffic light state.
Optionally, the bidirectional LSTM network model includes: a forward LSTM network model, a backward LSTM network model and a feedforward neural network model;
the generative model is specifically configured to:
inputting the target characteristic vector into the forward LSTM network model to obtain a first characteristic vector corresponding to state parameters and environmental information of the current time and preset time before the current time;
inputting the target characteristic vector into the backward LSTM network model to obtain a second characteristic vector corresponding to state parameters and environmental information of the current time and preset time after the current time;
and inputting the first characteristic vector and the second characteristic vector into the feedforward neural network model to obtain decision information corresponding to the target characteristic vector.
Optionally, the bidirectional LSTM network model includes: an input layer, two hidden layers and an output layer, wherein the activation function of the hidden layers is a rectified linear (ReLU) activation function, and the activation function of the output layer is a softmax function.
The product can execute the method provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.
According to the technical scheme of the embodiment, the state parameters and the environmental information of the current vehicle at the current moment are acquired; performing data processing on the state parameters and the environmental information to obtain target characteristic vectors; inputting the target characteristic vector into a decision information generation model to obtain decision information corresponding to the target characteristic vector, wherein the decision information generation model is a bidirectional LSTM network model, and the model structure and model parameters of the decision information generation model are obtained by training according to sample state parameters, sample environment information and sample decision information respectively, so that the defects that the existing intelligent vehicle cannot self-adjust based on rule learning in the automatic driving process, the scene coverage is incomplete and the like are overcome, and the behavior decision accuracy and the driving safety of the automatic driving vehicle are improved.
EXAMPLE III
Fig. 3 is a schematic structural diagram of a computer device in a third embodiment of the present invention. FIG. 3 illustrates a block diagram of an exemplary computer device 12 suitable for use in implementing embodiments of the present invention. The computer device 12 shown in FIG. 3 is only an example and should not impose any limitation on the scope of use or functionality of embodiments of the present invention.
As shown in FIG. 3, computer device 12 is in the form of a general purpose computing device. The components of computer device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 that couples various system components including the system memory 28 and the processing unit 16.
Bus 18 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, Industry Standard Architecture (ISA) bus, micro-channel architecture (MAC) bus, enhanced ISA bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus.
Computer device 12 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by computer device 12 and includes both volatile and nonvolatile media, removable and non-removable media.
The system memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM)30 and/or cache memory 32. Computer device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 3, and commonly referred to as a "hard drive"). Although not shown in FIG. 3, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to bus 18 by one or more data media interfaces. Memory 28 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the invention.
A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in memory 28, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each of which examples or some combination thereof may comprise an implementation of a network environment. Program modules 42 generally carry out the functions and/or methodologies of the described embodiments of the invention.
Computer device 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), with one or more devices that enable a user to interact with computer device 12, and/or with any devices (e.g., network card, modem, etc.) that enable computer device 12 to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface 22. In the computer device 12 of the present embodiment, the display 24 is not provided as a separate body but is embedded in the mirror surface, and when the display surface of the display 24 is not displayed, the display surface of the display 24 and the mirror surface are visually integrated. Also, computer device 12 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) via network adapter 20. As shown, network adapter 20 communicates with the other modules of computer device 12 via bus 18. It should be understood that although not shown in the figures, other hardware and/or software modules may be used in conjunction with computer device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
The processing unit 16 executes various functional applications and data processing by executing programs stored in the system memory 28, for example, to implement the decision information generating method provided by the embodiment of the present invention:
acquiring state parameters and environmental information of a current vehicle at the current moment;
performing data processing on the state parameters and the environmental information to obtain target characteristic vectors;
inputting the target characteristic vector into a decision information generation model to obtain decision information corresponding to the target characteristic vector, wherein the decision information generation model is a bidirectional LSTM network model, and the model structure and model parameters of the decision information generation model are obtained by training according to sample state parameters, sample environment information and sample decision information respectively.
Example four
A fourth embodiment of the present invention provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the decision information generating method provided in all the inventive embodiments of this application:
acquiring state parameters and environmental information of a current vehicle at the current moment;
performing data processing on the state parameters and the environmental information to obtain target characteristic vectors;
inputting the target characteristic vector into a decision information generation model to obtain decision information corresponding to the target characteristic vector, wherein the decision information generation model is a bidirectional LSTM network model, and the model structure and model parameters of the decision information generation model are obtained by training according to sample state parameters, sample environment information and sample decision information respectively.
Any combination of one or more computer-readable media may be employed. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++, or the like, as well as conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

1. A method for generating decision information, comprising:
acquiring state parameters and environmental information of a current vehicle at the current moment;
performing data processing on the state parameters and the environmental information to obtain target characteristic vectors;
inputting the target characteristic vector into a decision information generation model to obtain decision information corresponding to the target characteristic vector, wherein the decision information generation model is a bidirectional LSTM network model, and the model structure and model parameters of the decision information generation model are obtained by training according to sample state parameters, sample environment information and sample decision information respectively.
2. The method of claim 1, wherein the training method of the decision information generation model comprises:
acquiring sample state parameters, sample environment information and sample decision information;
inputting the sample state parameters and the sample environment information into a bidirectional LSTM network model to be trained to obtain prediction decision information;
training a model structure and model parameters of the bidirectional LSTM network model to be trained according to an objective function formed by the sample decision information and the prediction decision information;
and returning to execute the operation of inputting the sample state parameters and the sample environment information into the bidirectional LSTM network model to be trained to obtain the prediction decision information until a decision information generation model is obtained.
3. The method of claim 2, wherein the context information comprises: the vehicle-mounted navigation system comprises at least one of a planned path generated by global navigation, the speed of a vehicle which is in front of a current vehicle and has a distance with the current vehicle smaller than a preset distance, the distance between the vehicle which is in front of the current vehicle and has a distance with the current vehicle smaller than the preset distance and the current vehicle, the lane line condition of a current vehicle driving road section, the obstacle state around the current vehicle and the traffic light state.
4. The method of claim 3, wherein the bidirectional LSTM network model comprises a forward LSTM network model, a backward LSTM network model and a feedforward neural network model;
wherein inputting the target feature vector into the decision information generation model to obtain the decision information corresponding to the target feature vector comprises:
inputting the target feature vector into the forward LSTM network model to obtain a first feature vector corresponding to the state parameters and environment information at the current time and within a preset time period before the current time;
inputting the target feature vector into the backward LSTM network model to obtain a second feature vector corresponding to the state parameters and environment information at the current time and within a preset time period after the current time;
and inputting the first feature vector and the second feature vector into the feedforward neural network model to obtain the decision information corresponding to the target feature vector.
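As an illustration of the final fusion step of claim 4, the sketch below concatenates a toy forward feature vector and backward feature vector (standing in for the two LSTM directions' outputs) and passes them through a single-layer feedforward network with a softmax output. The dimensions, weights and decision classes are hypothetical; the patent does not disclose them.

```python
import math

def softmax(z):
    # numerically stable softmax over a list of logits
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def feedforward(h_forward, h_backward, W, b):
    # concatenate the first (forward) and second (backward) feature vectors
    x = h_forward + h_backward
    # one dense layer: logits = W @ x + b
    logits = [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
              for row, b_i in zip(W, b)]
    return softmax(logits)            # decision-class probabilities

# toy feature vectors from the forward and backward LSTM directions
h_f = [0.2, 0.7]
h_b = [0.5, 0.1]
# 3 hypothetical decision classes (e.g. keep lane / change left / change right)
W = [[1.0, 0.0, 0.0, 1.0],
     [0.0, 1.0, 1.0, 0.0],
     [0.5, 0.5, 0.5, 0.5]]
b = [0.0, 0.0, 0.0]
probs = feedforward(h_f, h_b, W, b)
```

The highest-probability entry of `probs` would be taken as the generated decision information.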
5. The method of claim 3, wherein the bidirectional LSTM network model comprises an input layer, two hidden layers and an output layer, wherein the activation function of the hidden layers is a rectified linear (ReLU) activation function and the activation function of the output layer is a softmax function.
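The layer stack of claim 5 (input layer, two hidden layers with a rectified linear activation, softmax output layer) can be sketched as below. The layer sizes and weights are made-up placeholders, not values from the patent.

```python
import math

def relu(v):
    # rectified linear ("correction") activation, element-wise
    return [max(0.0, x) for x in v]

def dense(x, W, b):
    # fully connected layer: W @ x + b
    return [sum(w * xi for w, xi in zip(row, x)) + bi
            for row, bi in zip(W, b)]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def forward(x, layers):
    (W1, b1), (W2, b2), (W3, b3) = layers
    h1 = relu(dense(x, W1, b1))         # first hidden layer (ReLU)
    h2 = relu(dense(h1, W2, b2))        # second hidden layer (ReLU)
    return softmax(dense(h2, W3, b3))   # output layer (softmax)

# placeholder 2-in / 2-out weights for each of the three dense layers
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
    ([[1.0, 0.0], [0.0, 1.0]], [0.1, -0.1]),
]
probs = forward([1.0, 2.0], layers)
```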
6. A decision information generating apparatus, comprising:
an acquisition module, configured to acquire state parameters and environment information of a current vehicle at a current time;
a processing module, configured to perform data processing on the state parameters and the environment information to obtain a target feature vector;
and a generation module, configured to input the target feature vector into a decision information generation model to obtain decision information corresponding to the target feature vector, wherein the decision information generation model is a bidirectional LSTM network model whose model structure and model parameters are obtained by training on sample state parameters, sample environment information and sample decision information.
7. The apparatus of claim 6, wherein the generation module is specifically configured to:
acquire sample state parameters, sample environment information and sample decision information;
input the sample state parameters and the sample environment information into a bidirectional LSTM network model to be trained to obtain predicted decision information;
train the model structure and the model parameters of the bidirectional LSTM network model to be trained according to an objective function constructed from the sample decision information and the predicted decision information;
and return to the operation of inputting the sample state parameters and the sample environment information into the bidirectional LSTM network model to be trained to obtain predicted decision information, until the decision information generation model is obtained.
8. The apparatus of claim 7, wherein the environment information comprises at least one of: a planned path generated by global navigation; the speed of a vehicle that is ahead of the current vehicle and whose distance from the current vehicle is less than a preset distance; the distance between that vehicle and the current vehicle; the lane line condition of the road section on which the current vehicle is traveling; the state of obstacles around the current vehicle; and the traffic light state.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method according to any of claims 1-5 when executing the program.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-5.
CN202010910254.2A 2020-09-02 2020-09-02 Decision information generation method, device, equipment and storage medium Pending CN112085165A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010910254.2A CN112085165A (en) 2020-09-02 2020-09-02 Decision information generation method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112085165A true CN112085165A (en) 2020-12-15

Family

ID=73732467

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010910254.2A Pending CN112085165A (en) 2020-09-02 2020-09-02 Decision information generation method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112085165A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107301864A (en) * 2017-08-16 2017-10-27 重庆邮电大学 (Chongqing University of Posts and Telecommunications) A deep bidirectional LSTM acoustic model based on Maxout neurons
CN109131348A (en) * 2018-07-24 2019-01-04 大连理工大学 (Dalian University of Technology) Intelligent vehicle driving decision-making method based on a generative adversarial network
CN110929431A (en) * 2020-02-03 2020-03-27 北京三快在线科技有限公司 (Beijing Sankuai Online Technology Co., Ltd.) Training method and device for a vehicle driving decision model
US10627823B1 (en) * 2019-01-30 2020-04-21 StradVision, Inc. Method and device for performing multiple agent sensor fusion in cooperative driving based on reinforcement learning

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ZHANG Lei et al., "Eight-class protein secondary structure prediction algorithm based on deep learning", Journal of Computer Applications (《计算机应用》) *
CAI Yingfeng et al., "Vehicle behavior prediction based on an attention mechanism", Journal of Jiangsu University (Natural Science Edition) *
GAO Jianbo, "Research on decision-making and control methods for autonomous vehicles", China Master's Theses Full-text Database (Engineering Science and Technology II) *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112766310A (en) * 2020-12-30 2021-05-07 际络科技(上海)有限公司 Fuel-saving lane-changing decision-making method and system
CN112896171A (en) * 2021-02-19 2021-06-04 联合汽车电子有限公司 Control method, device and equipment of vehicle, vehicle and storage medium
CN113077641B (en) * 2021-03-24 2022-06-14 中南大学 Decision mapping method and device for bus on-the-way control and storage medium
CN113077641A (en) * 2021-03-24 2021-07-06 中南大学 Decision mapping method and device for bus on-the-way control and storage medium
CN113071524A (en) * 2021-04-29 2021-07-06 深圳大学 Decision control method, decision control device, autonomous driving vehicle and storage medium
CN113673412A (en) * 2021-08-17 2021-11-19 驭势(上海)汽车科技有限公司 Key target object identification method and device, computer equipment and storage medium
CN113673412B (en) * 2021-08-17 2023-09-26 驭势(上海)汽车科技有限公司 Method and device for identifying key target object, computer equipment and storage medium
CN113741459A (en) * 2021-09-03 2021-12-03 阿波罗智能技术(北京)有限公司 Method for determining training sample and training method and device for automatic driving model
CN113741459B (en) * 2021-09-03 2024-06-21 阿波罗智能技术(北京)有限公司 Method for determining training sample and training method and device for automatic driving model
CN114067243A (en) * 2021-11-10 2022-02-18 清华大学 Automatic driving scene recognition method, system, device and storage medium
CN114228637A (en) * 2021-12-02 2022-03-25 科大讯飞股份有限公司 Vehicle power-off protection method and device, storage medium and equipment
CN114228637B (en) * 2021-12-02 2024-02-20 科大讯飞股份有限公司 Power-off protection method and device for vehicle, storage medium and equipment
CN114274980A (en) * 2022-01-27 2022-04-05 中国第一汽车股份有限公司 Trajectory control method, trajectory control device, vehicle and storage medium

Similar Documents

Publication Publication Date Title
CN112085165A (en) Decision information generation method, device, equipment and storage medium
AU2019253703B2 (en) Improving the safety of reinforcement learning models
US11835962B2 (en) Analysis of scenarios for controlling vehicle operations
US11561541B2 (en) Dynamically controlling sensor behavior
US11625036B2 (en) User interface for presenting decisions
US10324469B2 (en) System and method for controlling motion of vehicle in shared environment
US10994729B2 (en) System and method for controlling lateral motion of vehicle
US9934688B2 (en) Vehicle trajectory determination
US10809735B2 (en) System and method for a framework of robust and safe reinforcement learning application in real world autonomous vehicle application
KR102589587B1 (en) Dynamic model evaluation package for autonomous driving vehicles
CN114021840A (en) Channel switching strategy generation method and device, computer storage medium and electronic equipment
Jiao et al. End-to-end uncertainty-based mitigation of adversarial attacks to automated lane centering
US20230159047A1 (en) Learning-based critic for tuning a motion planner of autonomous driving vehicle
US11960292B2 (en) Method and system for developing autonomous vehicle training simulations
CN114475656A (en) Travel track prediction method, travel track prediction device, electronic device, and storage medium
Dubey et al. Autonomous braking and throttle system: A deep reinforcement learning approach for naturalistic driving
Yoon et al. Design of longitudinal control for autonomous vehicles based on interactive intention inference of surrounding vehicle behavior using long short-term memory
WO2023187121A1 (en) Simulation-based testing for robotic systems
US11810365B1 (en) Perception error modeling
Menendez et al. Detecting and Predicting Smart Car Collisions in Hybrid Environments from Sensor Data
EP4145358A1 (en) Systems and methods for onboard enforcement of allowable behavior based on probabilistic model of automated functional components
US20230030474A1 (en) Method and system for developing autonomous vehicle training simulations
US20230334692A1 (en) Device and method for tracking an object
CN117636306A (en) Driving track determination method, model training method, driving track determination device, model training device, electronic equipment and medium
CN114283396A (en) Method, apparatus, and computer-readable storage medium for autonomous driving

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20201215

RJ01 Rejection of invention patent application after publication