CN111007719B - Automatic driving steering angle prediction method based on domain adaptive neural network - Google Patents

Automatic driving steering angle prediction method based on domain adaptive neural network

Info

Publication number
CN111007719B
Authority
CN
China
Prior art keywords
network
domain
steering angle
data set
real
Prior art date
Legal status
Active
Application number
CN201911102180.3A
Other languages
Chinese (zh)
Other versions
CN111007719A (en)
Inventor
余宙
俞俊
邵镇炜
罗宇矗
Current Assignee
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date
Filing date
Publication date
Application filed by Hangzhou Dianzi University filed Critical Hangzhou Dianzi University
Priority to CN201911102180.3A
Publication of CN111007719A
Application granted
Publication of CN111007719B

Classifications

    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B13/00Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
    • G05B13/02Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
    • G05B13/04Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
    • G05B13/042Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators in which a parameter or coefficient is automatically adjusted to optimise the performance

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Image Analysis (AREA)
  • Steering Control In Accordance With Driving Conditions (AREA)
  • Feedback Control In General (AREA)

Abstract

The invention discloses an automatic driving steering angle prediction method based on a domain-adaptive neural network, comprising the following steps: 1. Acquire a real-scene dataset and a virtual-scene dataset and preprocess the data. 2. Extract semantic features using two convolutional neural networks with independent parameters but the same structure. 3. Input the feature vectors into a steering angle prediction network and a domain classification network, and model a pair of adversarial loss functions. 4. Train the model and optimize the loss functions. 5. Retain the semantic feature extraction network and the steering angle prediction network, and test or apply the model. The invention provides an end-to-end adversarial neural network architecture that realizes domain adaptation of an angle prediction model from a virtual-environment dataset to a real-environment dataset, and designs a suitable loss function and training method for this architecture, thereby improving the accuracy and generalization of the model across various real driving scenes.

Description

Automatic driving steering angle prediction method based on domain adaptive neural network
Technical Field
The invention relates to the field of automatic driving, in particular to a method for predicting the steering angle of an autonomous vehicle based on Domain Adaptation and an end-to-end deep neural network model.
Background
The rise of deep learning, especially its success in computer vision and natural language processing tasks, has greatly expanded the application scenarios of modern computers and profoundly influenced daily life. As one of the important applications of deep learning, automatic driving has likewise become a hot research field. The technology attempts to free humans from the task of driving: a computer system controls the vehicle automatically, improving the safety and convenience of driving, so it has excellent application prospects. The steering angle prediction module is the most important component of an automatic driving system; its function is to control the steering of the front wheels and thus the direction of travel. Its work flow is as follows: a front-facing camera captures a video or picture sequence of the surrounding road surface, which is input into the on-board computing system for processing; the steering angle at the next moment is predicted; and finally a steering control instruction is generated to steer the front wheels. Early steering angle prediction methods relied on finding image features that fit human experience, such as lanes, pedestrians and buildings, in the video or picture input, and learned the relationship between these features and the vehicle's steering angle through pattern recognition. Much recent research shows that an end-to-end neural network model that predicts the steering angle directly from raw pixel data performs better than methods based on human-designed features.
Owing to the strong representation capability of deep convolutional neural networks, an end-to-end steering prediction model can automatically learn which visual information in a road scene is linked to human driving behavior, and make correct and reasonable predictions accordingly.
However, deep learning based steering angle prediction methods are strongly limited by the size and quality of the autonomous driving dataset. An autonomous driving dataset contains samples, typically a video or picture sequence of the road scene over a period of time together with the sequence of correct steering angles over that period. Existing datasets fall into two main types: real datasets, obtained by collecting video recordings of real driving scenes together with human driving data, and virtual datasets, which simulate road scenes through Computer Graphics (CG) technology and generate labels automatically by program. It is difficult to directly obtain a large-scale, high-quality real dataset; meanwhile, due to the complexity of the automatic driving task, including the diversity of factors such as road conditions and weather, a real dataset can hardly cover everyday driving conditions effectively. Consequently, a driving decision model trained on a real dataset often makes poor decisions when processing road scenes not seen in the training set; that is, the model generalizes poorly. Compared with a real dataset, it is easier to construct a large-scale virtual dataset, and virtual datasets offer more reliable labels and greater diversity. However, since virtual scenes differ from real scenes, methods trained on virtual datasets and proven feasible and effective in the virtual environment are difficult to carry over to real scenes.
To address these shortcomings of current automatic driving steering angle prediction methods, it is necessary to develop a new method that transfers the knowledge learned by the steering angle prediction model on a virtual dataset to a real dataset, improving the generalization capability of the steering angle prediction model in real road scenes.
Disclosure of Invention
In order to overcome the defects of existing automatic driving steering angle prediction methods, the invention discloses a steering angle prediction method based on an adversarial domain-adaptive neural network, which mainly comprises the following two points:
1. A domain adaptation method is provided to realize knowledge transfer of the steering angle prediction model from the virtual environment to the real environment. The virtual dataset is used as the source domain and the real dataset as the target domain, and visual feature vectors from the two different data domains are projected into a common semantic space through a feature-level domain adaptation method. In this space, feature vectors that represent the same semantic information have very small spatial distances, and it is not possible to distinguish whether they were originally mapped from real-dataset samples or virtual-dataset samples.
2. An end-to-end adversarial deep learning method is provided to realize the steering angle prediction model, which takes real-dataset and virtual-dataset image information as input and produces two outputs simultaneously: a prediction of the steering command for the automatic driving system, and a classification of whether the image comes from the real dataset or the virtual dataset. Through training, the model automatically learns the semantic mappings from the two different visual feature spaces into the common semantic space, and further learns the action mapping from the common semantic space to the steering command space, thereby obtaining a driving decision model with strong generalization capability that is applicable to real environments.
It should be noted that this approach is in fact a universal framework that enables domain adaptation of neural network models across different autonomous driving datasets, and its principles can be modeled with any of a variety of excellent neural network architectures.
In order to solve the problems of the prior art, the technical scheme of the invention comprises the following steps:
step 1, acquiring a data set and carrying out data preprocessing.
Acquire a real dataset and a virtual dataset, and lightly preprocess the image data in both to serve as the input of the subsequent deep neural networks.
Step 2: Model the real-domain semantic mapping and the virtual-domain semantic mapping respectively, using two semantic feature extraction networks with independent parameters and the same structure. The semantic feature extraction network is implemented with a convolutional neural network; the network model parameters of the real-domain semantic mapping are denoted θ_r and those of the virtual-domain semantic mapping θ_v. Features are extracted from the real-dataset input X_r and the virtual-dataset input X_v respectively, obtaining feature vectors s_r and s_v carrying higher-level semantic information. Features from the two datasets are mixed and corresponding domain labels c_i are generated: c_i = 1 indicates that the input feature comes from the real dataset, c_i = 0 that it comes from the virtual dataset.
Step 3: Input the feature vectors s_i into the steering angle prediction network P and the domain classification network D, and model a pair of adversarial Loss Functions. The invention realizes domain adaptivity through adversarial network training, so two output layers and two loss functions need to be designed. Step 3 further comprises:
Step 3-1: Input the feature vector s_i into the domain classification network with model parameters θ_d, obtain the output ĉ_i of the domain classification network, and further compute the cross-entropy loss function as the classification loss L_d:

L_d = -(1/(N_r + N_v)) Σ_{i=1}^{N_r+N_v} [ c_i·log ĉ_i + (1 - c_i)·log(1 - ĉ_i) ]
where N_r is the real dataset sample size and N_v is the virtual dataset sample size.
Step 3-2: Input the feature vector s_i into the steering angle prediction network with model parameters θ_p, output the scalar prediction ŷ_i, and then compute the prediction loss L_p using the mean square error formula:

L_p = (1/(N_r + N_v)) Σ_{i=1}^{N_r+N_v} (ŷ_i - y_i)²
where y_i is the steering command label corresponding to input sample X_i.
Step 4: Train the semantic feature extraction networks, the domain classification network and the steering angle prediction network end to end, optimizing the model parameters θ_v, θ_r, θ_d and θ_p. The optimization targets adopted by the invention are:

(θ̂_v, θ̂_r, θ̂_p) = argmin_{θ_v, θ_r, θ_p} ( L_p - λ·L_d )
θ̂_d = argmin_{θ_d} L_d

where θ̂_v, θ̂_r, θ̂_d and θ̂_p are the model parameters obtained by optimizing θ_v, θ_r, θ_d and θ_p respectively. λ is a constant coefficient expressing the importance of the classification loss relative to the prediction loss. The neural network model is trained through the gradient back-propagation algorithm and the Adam optimizer so that the model parameters approach these optimization targets.
Step 5: Test or apply the model. When the trained model is tested or actually applied in a real environment, only the semantic feature extraction network and the steering angle prediction network need to be retained; inputting real-environment image data then yields steering angle predictions with improved generalization.
Compared with the existing method, the algorithm provided by the invention has the following advantages:
high accuracy: the invention overcomes the defects caused by the shortage of the training data set in the existing deep learning method for predicting the steering angle of the automatic driving, fully utilizes the real data set and the virtual data set, and effectively improves the generalization capability of the steering angle prediction model. Experiments prove that the model provided by the invention obtains higher accuracy on a test data set.
High efficiency: because the end-to-end steering angle prediction model is adopted, the operation flow of the steering angle prediction module is simplified, the parameters and the operation amount of the model are moderate, and the model is suitable for being deployed in a vehicle-mounted computing system with limitations in terms of storage and calculation power to carry out real-time steering control.
Drawings
FIG. 1 is an overall framework diagram of the present invention;
FIG. 2 is a diagram of the semantic feature extraction network architecture.
The following specific embodiments further illustrate the invention in conjunction with the above figures.
Detailed Description
The following will further describe an automatic driving steering angle prediction method based on a domain adaptive neural network according to the present invention with reference to the accompanying drawings.
Referring to fig. 1, an overall model framework for the method is shown. The method comprises the following specific steps:
step (1): and acquiring a data set and carrying out data preprocessing.
First, a real dataset sampled in a real driving environment and a virtual dataset generated in a computer simulation environment are acquired, either by downloading open-source datasets or by producing them. Frames are extracted from the video data in the datasets to obtain time-ordered picture sequences. Each RGB picture in a picture sequence is resized to 120 × 240. A training sample is constructed for each time t by stacking the four pictures x_{t-3}, x_{t-2}, x_{t-1} and x_t along the time dimension, forming a fourth-order input tensor X_t of size (3, 4, 120, 240). Each input sample X_t corresponds to a steering command label y_t, where y_t = α/90 ∈ (-1, 1), α is the front-wheel steering angle, negative when the car turns left and positive when it turns right. Let the real dataset sample size be N_r and the virtual dataset sample size be N_v; generally N_r < N_v. To ensure the balance of the training data, each training step randomly draws N samples from each of the two datasets, i.e. 2N samples per batch during network training.
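The sample construction above can be sketched in a few lines of NumPy. This is a minimal illustration, not the patent's own code; the helper name `make_sample` and the random frames are assumptions for demonstration.

```python
import numpy as np

def make_sample(frames, alpha):
    """Build one training sample as described in step (1).

    frames: the four consecutive RGB frames x_{t-3} .. x_t,
            each of shape (3, 120, 240).
    alpha:  front-wheel steering angle in degrees, negative for a
            left turn, positive for a right turn.
    Returns the fourth-order input tensor X_t of shape (3, 4, 120, 240)
    and the normalized steering label y_t = alpha / 90 in (-1, 1).
    """
    assert len(frames) == 4
    X_t = np.stack(frames, axis=1)   # stack along the time dimension
    y_t = alpha / 90.0               # normalize the angle to (-1, 1)
    return X_t, y_t

# hypothetical example: four random frames and a 15-degree right turn
frames = [np.random.rand(3, 120, 240).astype(np.float32) for _ in range(4)]
X_t, y_t = make_sample(frames, alpha=15.0)
```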
Step (2): modeling real domain semantic mapping and virtual domain semantic mapping separately using two semantic feature extraction networks with independent parameters but identical structure, for input X from real dataset and virtual dataset r And X v Feature extraction is performed separately. The semantic feature extraction network is composed of 5 layers of three-dimensional convolutional layers (Conv3d) and two fully-connected layers, and the network structure can be seen in FIG. 2. The convolution Kernel (Kernel) sizes of the first two convolutional layers are (32,2,5,5) and (64,2,5,5), respectively, and the sliding offset (Stride) is (1,4,4) and (1,3, 3); the convolution kernels of the last three convolutional layers are all (64,2,3,3), and the sliding offset is (1,1, 1). The residual error is added to all the convolution layers except the first two convolution layers. Flattening the output tensor of the fifth layer of convolution layer intoAnd inputting the vector into two fully-connected layers, wherein the output feature dimension is 512 and 128 respectively, namely the dimension of the finally output semantic feature vector is 128. The input of each Layer network is normalized by a Batch Normalization Layer (Batch Normalization Layer), and the output is activated by a modified Linear Unit (ReLU). Mixing X r And X v Respectively inputting two independent parameters into the model with the network structure, and outputting to obtain a feature vector with higher semantic information, namely
s_r = f(X_r; θ_r)
s_v = f(X_v; θ_v)
where θ_r and θ_v are the network model parameters of the real-domain and virtual-domain semantic mappings respectively. The features {s_i^(r)}_{i=1}^{N} and {s_i^(v)}_{i=1}^{N} are mixed into one feature set {s_i}_{i=1}^{2N}, and corresponding domain labels {c_i}_{i=1}^{2N} are generated: c_i = 1 denotes that s_i comes from the real dataset, c_i = 0 that it comes from the virtual dataset.
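The two extractors and the feature mixing described above can be sketched in PyTorch. This is an assumed implementation: the patent names no framework, does not specify the padding that keeps shapes equal for the residual additions (the asymmetric padding below is one workable choice for the even-sized time kernel), and the placement of BatchNorm after each convolution is also an assumption.

```python
import torch
import torch.nn as nn

class FeatureExtractor(nn.Module):
    """Sketch of the 5-layer Conv3d + 2-layer FC semantic feature
    extraction network of step (2); kernel/stride values follow the text."""
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv3d(3, 32, kernel_size=(2, 5, 5), stride=(1, 4, 4))
        self.bn1 = nn.BatchNorm3d(32)
        self.conv2 = nn.Conv3d(32, 64, kernel_size=(2, 5, 5), stride=(1, 3, 3))
        self.bn2 = nn.BatchNorm3d(64)
        # pad (W_l, W_r, H_t, H_b, T_front, T_back) so (T, H, W) is
        # preserved through each residual conv (assumed padding scheme)
        self.pad = nn.ConstantPad3d((1, 1, 1, 1, 1, 0), 0.0)
        self.res_convs = nn.ModuleList(
            nn.Conv3d(64, 64, kernel_size=(2, 3, 3)) for _ in range(3))
        self.res_bns = nn.ModuleList(nn.BatchNorm3d(64) for _ in range(3))
        self.fc1 = nn.Linear(64 * 2 * 9 * 19, 512)  # flatten of (64,2,9,19)
        self.fc2 = nn.Linear(512, 128)
        self.relu = nn.ReLU()

    def forward(self, x):                       # x: (B, 3, 4, 120, 240)
        x = self.relu(self.bn1(self.conv1(x)))  # -> (B, 32, 3, 29, 59)
        x = self.relu(self.bn2(self.conv2(x)))  # -> (B, 64, 2, 9, 19)
        for conv, bn in zip(self.res_convs, self.res_bns):
            x = self.relu(bn(conv(self.pad(x))) + x)  # residual layers 3-5
        x = x.flatten(1)
        x = self.relu(self.fc1(x))
        return self.fc2(x)                      # 128-dim semantic feature

f_r, f_v = FeatureExtractor(), FeatureExtractor()  # independent parameters
X_r = torch.randn(2, 3, 4, 120, 240)
X_v = torch.randn(2, 3, 4, 120, 240)
s = torch.cat([f_r(X_r), f_v(X_v)])                # mixed feature set
c = torch.cat([torch.ones(2), torch.zeros(2)])     # 1 = real, 0 = virtual
```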
And (3): the feature vector s i The steering angle prediction network P and the domain classification network D are input and a set of antagonistic loss functions is modeled. The invention realizes the field adaptability through antagonistic network training, so two output layers and two loss functions need to be designed. The step (3) further comprises the following steps:
step (3.1): the feature vector s i Input fieldThe network D is classified and the classification loss is calculated. The domain classification network is realized by using a multilayer perceptron (MLP), and specifically comprises the following steps: one linear layer with 256-dimensional output and one linear layer with 2-dimensional output, wherein a modified linear unit is used between the two linear layers for activation, and the parameter of the network is set as theta d . Feature vector s i After inputting the domain classification network D, outputting a 2-dimensional vector
Figure BDA0002270193590000066
Then obtaining the output of the domain classification network through a softmax function
Figure BDA0002270193590000071
Namely, it is
Figure BDA0002270193590000072
Further computing a cross-entropy loss function as a classification loss
Figure BDA00022701935900000710
Figure BDA0002270193590000073
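A minimal PyTorch sketch of step (3.1), under the same framework assumption as above. Note that `nn.CrossEntropyLoss` fuses the softmax and the log-likelihood, so it is fed the raw logits; the explicit softmax is shown only to mirror the formula.

```python
import torch
import torch.nn as nn

# Domain classification network D: a two-layer perceptron
# (128 -> 256 -> 2) with a ReLU between the linear layers.
D = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 2))

s = torch.randn(8, 128)                       # a mixed batch of features s_i
c = torch.tensor([1, 1, 1, 1, 0, 0, 0, 0])    # domain labels: 1=real, 0=virtual
z = D(s)                                      # 2-dim logit vectors z_i
c_hat = torch.softmax(z, dim=1)               # softmax output of D
loss_d = nn.CrossEntropyLoss()(z, c)          # cross-entropy loss L_d
```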
Step (3.2): the feature vector s i The steering angle prediction network P is input and the prediction loss is calculated. The steering angle prediction network P uses a multi-layer perceptron model in the same form as the domain classification network D, but the output dimensionality of the final output layer is 1, and the parameter of the steering angle prediction network P is set as theta p . Steering angle prediction network P output scalar
Figure BDA0002270193590000074
Post-calculation of the prediction loss function, in particular using the mean square error equation (MSE), i.e.
Figure BDA00022701935900000711
Figure BDA0002270193590000075
where y_i is the steering command label corresponding to input sample X_i.
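Step (3.2) can be sketched the same way (an illustrative stand-alone snippet; in the full model the features would come from the extractors of step (2)):

```python
import torch
import torch.nn as nn

# Steering angle prediction network P: same MLP form as D,
# but with a scalar output, trained with mean square error.
P = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 1))

s = torch.randn(8, 128)          # semantic feature vectors s_i
y = torch.rand(8) * 2 - 1        # steering labels y_i in (-1, 1)
y_hat = P(s).squeeze(1)          # scalar prediction per sample
loss_p = nn.MSELoss()(y_hat, y)  # prediction loss L_p
```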
And (4): end-to-end training is carried out on semantic feature extraction network, field classification network and steering angle prediction network, and model parameter theta is optimized v 、θ r 、θ d And theta p . The invention adopts the optimization target of
Figure BDA0002270193590000076
Figure BDA0002270193590000077
Wherein the content of the first and second substances,
Figure BDA0002270193590000078
and
Figure BDA0002270193590000079
respectively for optimizing theta v 、θ r 、θ d And theta p The resulting model parameters. λ is a coefficient constant, indicating the importance of classification loss compared to prediction loss, and is suggested to be in the range of [0.5,1 ]]. In order to enable the entire network to be trained uniformly using the gradient back propagation algorithm and the Adam optimizer, the two loss functions are combined into one joint loss function, i.e. the combined loss function
L = L_p + λ·L_d
Meanwhile, a Gradient Reversal Layer (GRL) is added before the domain classification network D. Its pseudo-function R(x) has the following form, where I is an identity matrix:

R(x) = x        (forward pass)
dR/dx = -I      (backward pass)
the model is then trained using an Adam optimizer, minimizing a joint loss function
Figure BDA0002270193590000087
Until convergence. The parameters of the Adam optimizer all adopt default values.
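The GRL and the joint training step can be sketched as follows. The gradient reversal is real PyTorch custom-autograd code; the three tiny linear networks are toy stand-ins (an assumption) so the snippet stays self-contained.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Gradient Reversal Layer: identity on the forward pass,
    negated gradient on the backward pass, as in step (4)."""
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad_out):
        return -grad_out

def grl(x):
    return GradReverse.apply(x)

# toy stand-ins for the feature extractor, steering head and classifier
F_net = nn.Linear(16, 8)
P_net = nn.Linear(8, 1)
D_net = nn.Linear(8, 2)

opt = torch.optim.Adam(list(F_net.parameters()) +
                       list(P_net.parameters()) +
                       list(D_net.parameters()))

x = torch.randn(4, 16)
y = torch.randn(4)                 # steering labels
c = torch.tensor([1, 1, 0, 0])     # domain labels

s = F_net(x)
lam = 0.5                          # lambda in the suggested range [0.5, 1]
loss_p = nn.MSELoss()(P_net(s).squeeze(1), y)
loss_d = nn.CrossEntropyLoss()(D_net(grl(s)), c)  # GRL sits before D
loss = loss_p + lam * loss_d       # joint loss L = L_p + lambda * L_d
opt.zero_grad()
loss.backward()                    # one back-propagation step
opt.step()                         # one Adam update
```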
And (5): the model is tested or applied.
The semantic feature extraction network with parameters θ̂_v and the domain classification network with parameters θ̂_d were designed to realize domain adaptation and knowledge transfer during model training, so after training only the semantic feature extraction network with parameters θ̂_r and the steering angle prediction network with parameters θ̂_p need to be retained for subsequent test experiments or field application. At test and application time, only real-environment image data needs to be input to obtain steering angle predictions with improved generalization.
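The retained inference path of step (5) amounts to composing the real-domain extractor with the steering head. A toy sketch (the tiny linear stand-ins and the conversion back to degrees via α = 90·y are assumptions consistent with the label definition y = α/90):

```python
import torch
import torch.nn as nn

# After training, only the real-domain feature extractor and the
# steering angle prediction network are kept; D and the virtual-domain
# extractor are discarded.  Toy stand-ins for the two retained networks:
F_real = nn.Linear(16, 8)
P_head = nn.Linear(8, 1)

@torch.no_grad()
def predict_steering(x_real):
    """Predict the front-wheel angle in degrees from real-environment input."""
    y = P_head(F_real(x_real)).squeeze(1)  # normalized prediction in (-1, 1)
    return y * 90.0                        # undo y = alpha / 90

angles = predict_steering(torch.randn(2, 16))
```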
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (5)

1. The automatic driving steering angle prediction method based on the domain-adaptive neural network is characterized by comprising the following steps:
step 1, acquiring a data set and carrying out data preprocessing;
acquiring a real data set and a virtual data set, and preprocessing image data in the real data set and the virtual data set to be used as the input of a subsequent deep neural network;
step 2, modeling the real-domain semantic mapping and the virtual-domain semantic mapping respectively, using two semantic feature extraction networks with independent parameters and the same structure; the semantic feature extraction network is implemented with a convolutional neural network, the network model parameters of the real-domain semantic mapping being denoted θ_r and those of the virtual-domain semantic mapping θ_v; features are extracted from the real-dataset input X_r and the virtual-dataset input X_v respectively, obtaining feature vectors s_r and s_v carrying higher-level semantic information; features from the two datasets are mixed and corresponding domain labels c_i are generated: c_i = 1 represents that the input feature comes from the real dataset, c_i = 0 that it comes from the virtual dataset;
step 3, inputting the feature vectors s_i into the steering angle prediction network P and the domain classification network D and modeling a pair of adversarial loss functions; domain adaptivity is realized through adversarial network training, so two output layers and two loss functions need to be designed:
step 3-1, inputting the feature vector s_i into the domain classification network with model parameters θ_d, obtaining the output ĉ_i of the domain classification network, and further computing the cross-entropy loss function as the classification loss L_d:

L_d = -(1/(2N)) Σ_{i=1}^{2N} [ c_i·log ĉ_i + (1 - c_i)·log(1 - ĉ_i) ]
wherein N is the number of samples drawn from each of the real dataset and the virtual dataset;
step 3-2, inputting the feature vector s_i into the steering angle prediction network with model parameters θ_p, outputting the scalar prediction ŷ_i, and then computing the prediction loss using the mean square error formula:

L_p = (1/(2N)) Σ_{i=1}^{2N} (ŷ_i - y_i)²
wherein y_i is the steering command label corresponding to input sample X_i;
step 4, training the semantic feature extraction networks, the domain classification network and the steering angle prediction network end to end, optimizing the model parameters θ_v, θ_r, θ_d and θ_p; the optimization targets adopted are:

(θ̂_v, θ̂_r, θ̂_p) = argmin_{θ_v, θ_r, θ_p} ( L_p - λ·L_d )
θ̂_d = argmin_{θ_d} L_d

wherein θ̂_v, θ̂_r, θ̂_d and θ̂_p are the model parameters obtained by optimizing θ_v, θ_r, θ_d and θ_p respectively; λ is a constant coefficient expressing the importance of the classification loss relative to the prediction loss; the neural network model is trained through the gradient back-propagation algorithm and the Adam optimizer so that the model parameters approach the optimization targets;
step 5, testing or applying the model; when the trained model is tested or actually applied in a real environment, only the semantic feature extraction network and the steering angle prediction network need to be retained, and inputting real-environment image data yields steering angle predictions with improved generalization.
2. The automatic driving steering angle prediction method based on the domain-adaptive neural network according to claim 1, wherein the step (1) is specifically realized as follows:
firstly, a real dataset sampled in a real driving environment and a virtual dataset generated in a computer simulation environment are acquired, and frames are extracted from the video data in the datasets to obtain time-ordered picture sequences; each RGB picture in a picture sequence is resized to 120 × 240; a training sample is constructed for each time t by stacking the four pictures x_{t-3}, x_{t-2}, x_{t-1} and x_t along the time dimension, forming a fourth-order input tensor X_t of size (3, 4, 120, 240); each input sample X_t corresponds to a steering command label y_t, where y_t = α/90 ∈ (-1, 1), α is the front-wheel steering angle, negative when the car turns left and positive when it turns right; let the real dataset sample size be N_r and the virtual dataset sample size be N_v; to ensure the balance of the training data, each training step randomly draws N samples from each of the two datasets, i.e. 2N samples per batch during network training.
3. The automatic driving steering angle prediction method based on the domain-adaptive neural network according to claim 2, wherein the step (2) is specifically realized as follows:
modeling real domain semantic mapping and virtual domain semantic mapping separately using two semantic feature extraction networks with independent parameters but identical structure, for input X from real dataset and virtual dataset r And X υ Respectively extracting the characteristics; the semantic feature extraction network is composed of 5 layers of three-dimensional convolution layers and two layers of full connection layers; the convolution kernel sizes of the first two convolution layers are (32,2,5,5) and (64,2,5,5), and the sliding offset is (1,4,4) and (1,3, 3); the convolution kernels of the last three convolution layers are all (64,2,3,3), and the sliding offset is (1,1, 1); the residual errors are added to all the convolution layers except the first two convolution layers; flattening the output tensor of the fifth convolutional layer into vectors, and inputting the vectors into two fully-connected layers, wherein the dimensionalities of output features of the five fully-connected layers are 512 and 128 respectively, namely the dimensionality of the finally output semantic feature vector is 128; the input of each layer network is normalized by using a batch normalization layer, and the output is activated by using a modified linear unit; x is to be r And X υ Respectively inputting two independent parameters into the model with the network structure, and outputting to obtain a feature vector with higher semantic information, namely
s r =f(X r ;θ r )
s υ =f(X υ ;θ υ )
Wherein theta is r And theta υ Respectively representing network model parameters of real domain semantic mapping and network model parameters of virtual domain semantic mapping; will be characterized by
Figure FDA0003598483130000031
And
Figure FDA0003598483130000032
mixed into a set of characteristic data
Figure FDA0003598483130000033
And generates a corresponding domain label
Figure FDA0003598483130000034
Denotes s i From the set of real data, the data is,
Figure FDA0003598483130000035
denotes s i From a virtual data set.
4. The automatic driving steering angle prediction method based on the domain-adaptive neural network according to claim 3, wherein the step (3) is specifically realized as follows:
step 3-1: the feature vector s_i is input into a domain classification network D and the classification loss is computed; the domain classification network is implemented as a multilayer perceptron, specifically a linear layer with 256-dimensional output followed by a linear layer with 2-dimensional output, with a rectified linear unit activation between the two linear layers; the parameters of this network are denoted θ_d; after the feature vector s_i is input into the D network, a 2-dimensional vector o_i = (o_i^1, o_i^2) is output, and the output d̂_i of the domain classification network is then obtained through a softmax function, namely

d̂_i^k = exp(o_i^k) / (exp(o_i^1) + exp(o_i^2)), where k = 1, 2

and the cross-entropy loss function is further computed as the classification loss

L_D = −(1/N) Σ_i [ (1 − d_i)·log d̂_i^1 + d_i·log d̂_i^2 ]

where N is the number of samples in the mixed feature set.
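The softmax and cross-entropy of step 3-1 can be sketched in plain NumPy (the binary label convention d_i ∈ {0, 1} is an assumption carried over from the label definition above):

```python
import numpy as np

def softmax(o):
    """Row-wise softmax of the raw 2-d outputs o_i of the D network."""
    e = np.exp(o - o.max(axis=1, keepdims=True))  # subtract max for stability
    return e / e.sum(axis=1, keepdims=True)

def classification_loss(o, d):
    """Cross-entropy between softmax(o) and domain labels d in {0, 1}.
    o: (N, 2) raw logits, d: (N,) integer labels."""
    p = softmax(o)                                 # (N, 2) class probabilities
    return -np.mean(np.log(p[np.arange(len(d)), d]))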
Step 3-2: the feature vector s i Inputting a steering angle prediction network P and calculating prediction loss; the P network uses a multilayer perceptron model in the same form as the D network, but the output dimension of the final output layer is 1, and the parameter of the P network is set as theta p (ii) a P-net output scalar
Figure FDA0003598483130000045
Post-calculation of the prediction loss function, in particular using the mean square error formula, i.e.
Figure FDA0003598483130000046
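A minimal PyTorch sketch of the P network and its MSE loss, reusing the MLP shape described for the D network (the 256-unit hidden layer is taken from that description; the random inputs are placeholders for real features and labels):

```python
import torch
import torch.nn as nn

# P network: same MLP form as the domain classifier D,
# but with a 1-dimensional output head (the steering angle)
p_net = nn.Sequential(
    nn.Linear(128, 256),
    nn.ReLU(),
    nn.Linear(256, 1))

s = torch.randn(4, 128)               # semantic feature vectors s_i
y = torch.randn(4)                    # ground-truth steering angles y_i
y_hat = p_net(s).squeeze(1)           # predicted angles, shape (4,)
loss_p = torch.mean((y_hat - y) ** 2) # the mean-squared-error loss of step 3-2
```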
5. The automatic driving steering angle prediction method based on the domain-adaptive neural network according to claim 4, wherein the step (4) is specifically realized as follows:
end-to-end training is carried out on the semantic feature extraction networks, the domain classification network, and the steering angle prediction network, optimizing the model parameters θ_v, θ_r, θ_d and θ_p; the optimization objectives adopted are:

(θ̂_v, θ̂_r, θ̂_p) = argmin over (θ_v, θ_r, θ_p) of ( L_P − λ·L_D )
θ̂_d = argmin over θ_d of L_D

where θ̂_v, θ̂_r, θ̂_d and θ̂_p are the model parameters obtained by the respective optimizations; λ is a constant coefficient indicating the importance of the classification loss relative to the prediction loss, with value range [0.5, 1]; in order to train the entire network uniformly with the gradient back-propagation algorithm and the Adam optimizer, the two loss functions are combined into one joint loss function

L = L_P + λ·L_D

and at the same time a gradient reversal layer is added in front of the domain classification network D; the pseudo-function R of the gradient reversal layer has the following form, where I is an identity matrix:

R(x) = x (forward pass), dR/dx = −I (backward pass)

The model is then trained with the Adam optimizer by minimizing the joint loss function L until convergence; the parameters of the Adam optimizer all take their default values.
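One common way to realize such a gradient reversal layer in PyTorch is a custom autograd Function; the claim specifies only the identity forward pass and the negated backward gradient, so folding the coefficient λ into the backward pass, as below, is an implementation assumption:

```python
import torch

class GradReverse(torch.autograd.Function):
    """Gradient reversal layer: identity in the forward pass,
    gradient multiplied by -lam in the backward pass."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)        # identity: R(x) = x

    @staticmethod
    def backward(ctx, grad_out):
        # dR/dx = -lam * I; the second return matches the `lam` input
        return -ctx.lam * grad_out, None

x = torch.ones(3, requires_grad=True)
y = GradReverse.apply(x, 0.5).sum()
y.backward()                       # x.grad is now -0.5 everywhere
```

Placed between the feature extractor and the D network, this lets a single Adam minimization of L = L_P + λ·L_D train the classifier to minimize L_D while the feature extractor receives the reversed gradient, i.e. effectively maximizes L_D.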
CN201911102180.3A 2019-11-12 2019-11-12 Automatic driving steering angle prediction method based on domain adaptive neural network Active CN111007719B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911102180.3A CN111007719B (en) 2019-11-12 2019-11-12 Automatic driving steering angle prediction method based on domain adaptive neural network

Publications (2)

Publication Number Publication Date
CN111007719A CN111007719A (en) 2020-04-14
CN111007719B true CN111007719B (en) 2022-08-05

Family

ID=70111943

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911102180.3A Active CN111007719B (en) 2019-11-12 2019-11-12 Automatic driving steering angle prediction method based on domain adaptive neural network

Country Status (1)

Country Link
CN (1) CN111007719B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111522245B (en) * 2020-06-23 2020-11-03 北京三快在线科技有限公司 Method and device for controlling unmanned equipment
CN113076815B (en) * 2021-03-16 2022-09-27 西南交通大学 Automatic driving direction prediction method based on lightweight neural network
CN113435055B (en) * 2021-07-08 2022-11-22 上海交通大学 Self-adaptive migration prediction method and system in shield cutter head torque field
WO2023184188A1 (en) * 2022-03-29 2023-10-05 华为技术有限公司 Method and apparatus for fault monitoring of neural network model in autonomous driving system
CN116977969B (en) * 2023-08-11 2023-12-26 中国矿业大学 Driver two-point pre-aiming identification method based on convolutional neural network

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107451661A (en) * 2017-06-29 2017-12-08 西安电子科技大学 A kind of neutral net transfer learning method based on virtual image data collection
CN108162974A (en) * 2016-12-06 2018-06-15 通用汽车环球科技运作有限责任公司 Use the vehicle control of road angle-data
CN108345869A (en) * 2018-03-09 2018-07-31 南京理工大学 Driver's gesture recognition method based on depth image and virtual data
CN109131348A (en) * 2018-07-24 2019-01-04 大连理工大学 A kind of intelligent vehicle Driving Decision-making method based on production confrontation network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
End-to-end driverless control based on 3D CNN-DDPG; Li Guohao; Electronic Design Engineering; 2018-11-30; Vol. 26, No. 22; pp. 156-168 *

Also Published As

Publication number Publication date
CN111007719A (en) 2020-04-14

Similar Documents

Publication Publication Date Title
CN111007719B (en) Automatic driving steering angle prediction method based on domain adaptive neural network
JP6886202B2 (en) A learning method and a learning device that generate a virtual feature map having characteristics that are the same as or similar to those of a real feature map using GAN that can be applied to domain adaptation used in a virtual driving environment, and a test method using the same. And test equipment
TWI742382B (en) Neural network system for vehicle parts recognition executed by computer, method for vehicle part recognition through neural network system, device and computing equipment for vehicle part recognition
JP7436281B2 (en) Training system for training generative neural networks
JP6957624B2 (en) Converting a source domain image to a target domain image
US20230150127A1 (en) Optimizing policy controllers for robotic agents using image embeddings
CN111950649B (en) Attention mechanism and capsule network-based low-illumination image classification method
CN111325318B (en) Neural network training method, neural network training device and electronic equipment
CN109726676B (en) Planning method for automatic driving system
CN114503162A (en) Image processing system and method with uncertainty feature point location estimation
CN113408537B (en) Remote sensing image domain adaptive semantic segmentation method
Chen et al. Sim-to-real 6d object pose estimation via iterative self-training for robotic bin picking
Yu et al. REIN the RobuTS: Robust DNN-based image recognition in autonomous driving systems
CN116935447A (en) Self-adaptive teacher-student structure-based unsupervised domain pedestrian re-recognition method and system
Tian et al. 3D scene geometry-aware constraint for camera localization with deep learning
CN115588237A (en) Three-dimensional hand posture estimation method based on monocular RGB image
CN111242870A (en) Low-light image enhancement method based on deep learning knowledge distillation technology
CN110717402B (en) Pedestrian re-identification method based on hierarchical optimization metric learning
CN116861262A (en) Perception model training method and device, electronic equipment and storage medium
KR102437959B1 (en) Device for Unsupervised Domain Adaptation in Semantic Segmentation Exploiting Inter-pixel Correlations and Driving Method Thereof
CN115630361A (en) Attention distillation-based federal learning backdoor defense method
CN112801138A (en) Multi-person attitude estimation method based on human body topological structure alignment
Moghadam et al. Online, self-supervised vision-based terrain classification in unstructured environments
Ridge et al. Learning to write anywhere with spatial transformer image-to-motion encoder-decoder networks
Chen et al. A real-time semantic visual SLAM for dynamic environment based on deep learning and dynamic probabilistic propagation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant