CN111208818B - Intelligent vehicle prediction control method based on visual space-time characteristics - Google Patents

Intelligent vehicle prediction control method based on visual space-time characteristics

Info

Publication number
CN111208818B
CN111208818B CN202010012552.XA
Authority
CN
China
Prior art keywords
time
steering wheel
feature
network
space
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010012552.XA
Other languages
Chinese (zh)
Other versions
CN111208818A (en)
Inventor
吴天昊
程洪
黄瑞
詹惠琴
周润发
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China filed Critical University of Electronic Science and Technology of China
Priority to CN202010012552.XA priority Critical patent/CN111208818B/en
Publication of CN111208818A publication Critical patent/CN111208818A/en
Application granted granted Critical
Publication of CN111208818B publication Critical patent/CN111208818B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05D SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00 Control of position, course or altitude of land, water, air, or space vehicles, e.g. automatic pilot
    • G05D1/02 Control of position or course in two dimensions
    • G05D1/021 Control of position or course in two dimensions specially adapted to land vehicles
    • G05D1/0231 Control of position or course in two dimensions specially adapted to land vehicles using optical position detecting means
    • G05D1/0246 Control of position or course in two dimensions specially adapted to land vehicles using optical position detecting means using a video camera in combination with image processing means
    • G05D1/0253 Control of position or course in two dimensions specially adapted to land vehicles using optical position detecting means using a video camera in combination with image processing means extracting relative motion information from a plurality of images taken successively, e.g. visual odometry, optical flow
    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05D SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00 Control of position, course or altitude of land, water, air, or space vehicles, e.g. automatic pilot
    • G05D1/02 Control of position or course in two dimensions
    • G05D1/021 Control of position or course in two dimensions specially adapted to land vehicles
    • G05D1/0212 Control of position or course in two dimensions specially adapted to land vehicles with means for defining a desired trajectory
    • G05D1/0221 Control of position or course in two dimensions specially adapted to land vehicles with means for defining a desired trajectory involving a learning process
    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05D SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00 Control of position, course or altitude of land, water, air, or space vehicles, e.g. automatic pilot
    • G05D1/02 Control of position or course in two dimensions
    • G05D1/021 Control of position or course in two dimensions specially adapted to land vehicles
    • G05D1/0276 Control of position or course in two dimensions specially adapted to land vehicles using signals provided by a source external to the vehicle

Abstract

The invention discloses an intelligent vehicle prediction control method based on visual space-time characteristics. A steering wheel angle prediction network is first constructed, comprising a spatial feature extraction network, N spatio-temporal feature extraction modules and a spatio-temporal feature map fusion prediction module. The spatial feature extraction network produces feature maps at different scales and different time steps, the spatio-temporal feature extraction modules extract spatio-temporal features from the feature maps at each scale, and the spatio-temporal feature map fusion prediction module fuses the spatio-temporal features of different scales to predict the steering wheel angle. After the steering wheel angle prediction network is trained, a prediction is made for the time to be predicted, and the predicted steering wheel angle is combined with the historical predicted values by exponential weighted averaging to obtain the final predicted steering wheel angle. The invention effectively extracts the spatio-temporal information in continuous image frames and fuses spatio-temporal information of different scales, thereby greatly improving the prediction control accuracy of the intelligent vehicle.

Description

Intelligent vehicle prediction control method based on visual space-time characteristics
Technical Field
The invention belongs to the technical field of intelligent vehicle control, and particularly relates to an intelligent vehicle prediction control method based on visual spatiotemporal characteristics.
Background
An intelligent vehicle end-to-end decision method automatically corrects the deviation of the vehicle according to the situation it faces while driving within a lane. A traditional intelligent vehicle decision method generally requires the following steps: a sensor module composed of cameras acquires images of the road ahead, the images are sent to a perception module that detects the lane lines in them, and the steering wheel angle required to keep the lane at the current moment is then calculated from the relationship between the lane lines, the vehicle state, the vehicle pose and the driving direction. A deep-learning-based end-to-end decision method instead treats these separate steps as a single model that directly receives information such as images from the sensors and computes the steering wheel angle required at the current moment; owing to the strong fitting capability of deep networks, the relationship between road image features and the steering wheel angle can be learned directly.
Thanks to their strong fitting and generalization capabilities, convolutional neural networks perform excellently in tasks such as image classification, image segmentation, object detection and behavior prediction. Lane keeping is in essence a mapping from the relative relationship between the vehicle pose, the driving direction and the lane lines to the corresponding steering wheel angle, and the essence of a deep-learning-based end-to-end decision algorithm is that, through training, a deep network fits this mapping in a high-dimensional space and thereby acquires the ability to compute the steering wheel angle from the image.
Patent publication CN108227707A introduces an automatic driving method based on lidar and end-to-end deep learning, which includes the following steps: converting the driving environment information acquired by the lidar into a depth map in real time; generating corresponding data-label pairs according to a matching rule and using them as training data; and feeding the training data into a deep convolutional neural network model for training, with the vehicle control quantity obtained from the trained model. The method can make end-to-end decisions from the lidar depth map, but vehicle control is a continuous process in which longer temporal relationships should be considered, and a pure deep convolutional neural network lacks the ability to extract the temporal dependence between consecutive frames.
Patent publication CN109581928A introduces an intelligent vehicle end-to-end decision method and system for highway scenes. The method mainly proposes using transfer learning to enlarge the database: for a convolutional neural network, more data means stronger robustness, and training the model on different databases via transfer learning strengthens the robustness of the algorithm across scenes and its resistance to interference. Because the network uses more data during training, overfitting is avoided and the phenomenon of low bias and high variance on the test set is alleviated. While this method improves performance, it still does not take the continuous nature of the vehicle control process into account.
Patent publication CN109656134A introduces an intelligent vehicle end-to-end decision method based on a spatio-temporal joint recurrent neural network, in which a long short-term memory (LSTM) network is used to extract the temporal dependency between consecutive data frames. However, the fusion of temporal information and spatial feature information in that method is not well-founded: the temporal dependency information extracted by the LSTM carries a large amount of redundancy when jointly computed with the image frames, and simply applying an LSTM destroys the two-dimensional features of the image, so some information is lost in that step.
Patent publication CN109615064A introduces an intelligent vehicle end-to-end decision method based on a recurrent neural network with spatio-temporal feature fusion. A convolutional neural network and an LSTM network extract the spatial features and the temporal dependency information respectively, and four different fusion schemes are tried, namely feature addition, feature subtraction, feature multiplication and feature concatenation, with feature concatenation giving the best result. Although feature concatenation improves the accuracy of the end-to-end decision network to some extent, none of these fusion schemes is well-founded, and the extraction of spatial features and of temporal features remains two separate processes.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provide an intelligent vehicle prediction control method based on visual space-time characteristics.
To achieve the above object, the intelligent vehicle prediction control method based on visual space-time characteristics of the present invention comprises the following steps:
S1: constructing a steering wheel angle prediction network, wherein the steering wheel angle prediction network comprises a spatial feature extraction network, N spatio-temporal feature extraction modules and a spatio-temporal feature map fusion prediction module, wherein:
the input of the spatial feature extraction network is the front road image detected by the intelligent vehicle; the front road image detected by the intelligent vehicle at the current time t and the previous K frames of front road images are input into the spatial feature extraction network in time order, the feature maps of the last N layers of the spatial feature extraction network are output to the corresponding n-th spatio-temporal feature extraction modules respectively, and the feature map of the last n-th layer corresponding to time t-k is denoted F_{t-k,n}, where k = 0, 1, …, K and n = 1, 2, …, N;
each spatio-temporal feature extraction module comprises a first convolutional layer, a convolutional long short-term memory (ConvLSTM) network, a second convolutional layer and a third convolutional layer, wherein:
the convolution kernel size of the first convolutional layer is 1 × 1; it reduces the dimension of the input feature map F_{t-k,n} and outputs the result to the ConvLSTM network, and the feature map output by the first convolutional layer is denoted F'_{t-k,n}, with size W × H × L;
the input of the ConvLSTM network is a combined feature map obtained by concatenating the feature map F'_{t-k,n} with the feature map F''_{t-k-1,n} output by the ConvLSTM network for the previous frame of the road image, giving a size of W × H × 2L; when t-k-1 < 0, every pixel value of the feature map F''_{t-k-1,n} is 0; the ConvLSTM network extracts the spatio-temporal features in the combined feature map and outputs a feature map F''_{t-k,n}; the K+1 feature maps F'_{t-k,n} and their corresponding combined feature maps are input into the ConvLSTM network in sequence, and the feature map F''_{t,n} corresponding to the current time t is output to the second convolutional layer;
the convolution kernel size of the second convolutional layer is 3 × 3; it convolves the input feature map F''_{t,n} and outputs the result to the third convolutional layer;
the convolution kernel size of the third convolutional layer is 3 × 3; after convolving the feature map output by the second convolutional layer, the resulting feature map is output to the spatio-temporal feature map fusion prediction module;
the spatio-temporal feature map fusion prediction module fuses the feature maps of different scales output by the N spatio-temporal feature extraction modules and outputs the predicted steering wheel angles V_{t+m} at the current time t and M future times, where m = 0, 1, …, M;
S2: acquiring a plurality of continuous front road images of the intelligent vehicle and the corresponding steering wheel angles, using the front road images as the input of the steering wheel angle prediction network and the steering wheel angles as the expected output, and training the steering wheel angle prediction network;
S3: for the time t′ to be predicted, inputting the front road image detected by the intelligent vehicle at time t′ and the previous K frames of front road images into the steering wheel angle prediction network in time order, and taking the obtained M+1 predicted steering wheel angles as the initial predicted values V_{t′+m} of the corresponding times; arranging the final predicted steering wheel angles V′_{t′-q} of the previous Q times, q = 1, 2, …, Q, and the M+1 initial predicted values V_{t′+m} in time order to obtain a sequence of predicted steering wheel angles, performing exponential weighted averaging on the sequence, and taking the result of the exponential weighted average at time t′+M as the final predicted steering wheel angle V′_{t′} at the time t′ to be predicted.
The intelligent vehicle prediction control method based on visual space-time characteristics of the present invention first constructs a steering wheel angle prediction network comprising a spatial feature extraction network, N spatio-temporal feature extraction modules and a spatio-temporal feature map fusion prediction module. The spatial feature extraction network produces feature maps at different scales and different time steps, the spatio-temporal feature extraction modules extract spatio-temporal features from the feature maps at each scale, and the spatio-temporal feature map fusion prediction module fuses the spatio-temporal features of different scales to predict the steering wheel angle. After the steering wheel angle prediction network is trained, a prediction is made for the time to be predicted, and the predicted steering wheel angle is combined with the historical predicted values by exponential weighted averaging to obtain the final predicted steering wheel angle. The invention effectively extracts the spatio-temporal information in continuous image frames and fuses spatio-temporal information of different scales, thereby greatly improving the prediction control accuracy of the intelligent vehicle.
Drawings
FIG. 1 is a block diagram of an embodiment of the intelligent vehicle predictive control method based on visual spatiotemporal features according to the present invention;
FIG. 2 is a block diagram of a steering wheel angle prediction network in accordance with the present invention;
FIG. 3 is a block diagram of a spatiotemporal feature extraction module according to the present invention;
FIG. 4 is a schematic structural diagram of a spatiotemporal feature map fusion prediction module in the present embodiment;
FIG. 5 is a comparison curve of the output value of the present invention and the label value on the Udacity Challenge II database;
FIG. 6 is a graph comparing the output value and the driver control value of the present invention in a campus environment.
Detailed Description
Specific embodiments of the present invention are described below in conjunction with the accompanying drawings so that those skilled in the art can better understand the present invention. It is to be expressly noted that in the following description, a detailed description of known functions and designs will be omitted when it may obscure the subject matter of the present invention.
Examples
FIG. 1 is a block diagram of an embodiment of the intelligent vehicle predictive control method based on visual spatiotemporal characteristics according to the present invention. As shown in FIG. 1, the intelligent vehicle predictive control method based on visual space-time characteristics of the present invention specifically includes the following steps:
S101: constructing a steering wheel angle prediction network:
A steering wheel angle prediction network is constructed. FIG. 2 is a block diagram of the steering wheel angle prediction network of the present invention. As shown in FIG. 2, the network includes a spatial feature extraction network, N spatio-temporal feature extraction modules and a spatio-temporal feature map fusion prediction module; each module is described in detail below.
The input of the spatial feature extraction network is the front road image detected by the intelligent vehicle. The front road image detected at the current time t and the previous K frames of front road images (K+1 frames in total) are input into the spatial feature extraction network in time order, the feature maps of the last N layers of the spatial feature extraction network are output to the corresponding n-th spatio-temporal feature extraction modules respectively, and the feature map of the last n-th layer corresponding to time t-k is denoted F_{t-k,n}, where k = 0, 1, …, K and n = 1, 2, …, N. In this embodiment, the spatial feature extraction network adopts the convolutional part of the Nvidia-Pilot network, a total of 15 front road images are input, and the feature maps of the last 4 layers are output, so the spatial feature extraction network yields feature maps at 15 time steps on 4 different scales.
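For illustration, the following is a minimal PyTorch sketch of a PilotNet-style backbone that returns the feature maps of its last four convolutional layers for a single frame; the channel counts and strides follow the publicly described Nvidia PilotNet convolutional stack and should be read as assumptions, since the patent only names the network.

```python
import torch
import torch.nn as nn

class SpatialFeatureNet(nn.Module):
    """Spatial feature extraction backbone (sketch). For each input frame it
    returns the feature maps of its last N = 4 conv layers, i.e. F_{t-k,1..4}."""
    def __init__(self):
        super().__init__()
        self.c1 = nn.Conv2d(3, 24, kernel_size=5, stride=2)
        self.c2 = nn.Conv2d(24, 36, kernel_size=5, stride=2)
        self.c3 = nn.Conv2d(36, 48, kernel_size=5, stride=2)
        self.c4 = nn.Conv2d(48, 64, kernel_size=3)
        self.c5 = nn.Conv2d(64, 64, kernel_size=3)

    def forward(self, img):                      # img: (B, 3, H, W)
        x1 = torch.relu(self.c1(img))
        x2 = torch.relu(self.c2(x1))
        x3 = torch.relu(self.c3(x2))
        x4 = torch.relu(self.c4(x3))
        x5 = torch.relu(self.c5(x4))
        # feature maps of the last four layers, at four different scales
        return [x2, x3, x4, x5]
```

Running this backbone on each of the K+1 frames (15 in this embodiment) yields feature maps at 15 time steps and 4 scales, which feed the spatio-temporal feature extraction modules.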
The spatio-temporal feature extraction modules extract the corresponding spatio-temporal features from the feature maps output by the spatial feature extraction network. FIG. 3 is a block diagram of the spatio-temporal feature extraction module of the present invention. As shown in FIG. 3, each spatio-temporal feature extraction module comprises a first convolutional layer, a convolutional long short-term memory (ConvLSTM) network, a second convolutional layer and a third convolutional layer, wherein:
The convolution kernel size of the first convolutional layer is 1 × 1; it reduces the dimension of the input feature map F_{t-k,n} and passes the result to the ConvLSTM network, and the feature map output by the first convolutional layer is denoted F'_{t-k,n}, with size W × H × L. The role of the first convolutional layer is to reduce the number of channels of the feature map and thereby the computation of the spatio-temporal feature extraction module. In this embodiment the ConvLSTM cell contains eight convolutional layers and several fully connected layers; unrolled over fifteen time steps, its computation is at least equivalent to that of one hundred and twenty convolutional layers, so the invention places a 1 × 1 convolutional layer in front of it to reduce the parameter count of the feature map and hence the computation of the spatio-temporal feature extraction module.
The input of the ConvLSTM network is a combined feature map obtained by concatenating the feature map F'_{t-k,n} with the feature map F''_{t-k-1,n} output by the ConvLSTM network for the previous frame of the front road image, giving a size of W × H × 2L; when t-k-1 < 0, every pixel value of the feature map F''_{t-k-1,n} is 0. The ConvLSTM network extracts the spatio-temporal features in the combined feature map and outputs a feature map F''_{t-k,n}. The K+1 feature maps F'_{t-k,n} and their corresponding combined feature maps are input into the ConvLSTM network in sequence, and the feature map F''_{t,n} corresponding to the current time t is output to the second convolutional layer. The ConvLSTM differs from an ordinary LSTM in that it replaces the multiplications of one-dimensional feature vectors by weights in the forget gate, input gate, output gate and cell state with convolutions of the two-dimensional feature map with multi-channel convolution kernels; because of this, the ConvLSTM can extract spatial information and the temporal dependency of continuous images simultaneously, and does not destroy the original spatial structure of the image the way an ordinary LSTM does.
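To make this gate-by-gate replacement concrete, here is a minimal ConvLSTM cell sketch in PyTorch; using a single shared convolution for all four gates, and the kernel size, are implementation assumptions rather than details taken from the patent.

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """Minimal ConvLSTM cell: the gate computations of an LSTM, with the
    matrix multiplications replaced by 2-D convolutions so the spatial
    structure of the feature map is preserved."""
    def __init__(self, in_channels, hidden_channels, kernel_size=3):
        super().__init__()
        padding = kernel_size // 2
        # one convolution produces all four gates (input, forget, cell, output)
        self.gates = nn.Conv2d(in_channels + hidden_channels,
                               4 * hidden_channels, kernel_size, padding=padding)

    def forward(self, x, state):
        h, c = state                                   # hidden and cell states are W x H feature maps
        z = self.gates(torch.cat([x, h], dim=1))       # convolve the combined feature map
        i, f, g, o = torch.chunk(z, 4, dim=1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        g = torch.tanh(g)
        c = f * c + i * g                              # element-wise update per spatial location
        h = o * torch.tanh(c)
        return h, (h, c)
```

Concatenating the input with the previous hidden state inside the cell plays the role of the combined feature map described above, and initializing the hidden and cell states to zero covers the case t-k-1 < 0.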
The convolution kernel size of the second convolutional layer is 3 × 3; it convolves the input feature map F''_{t,n} and outputs the result to the third convolutional layer.
The convolution kernel size of the third convolutional layer is 3 × 3; after convolving the feature map output by the second convolutional layer, the resulting feature map is output to the spatio-temporal feature map fusion prediction module.
Two convolutional layers are placed at the end of the spatio-temporal feature extraction module because, after processing by the ConvLSTM network, the temporal features inferred over the fifteen time steps have been fused into a single time step in the resulting feature map, and the two convolutional layers further screen and extract features on that basis. Compared with the prior art, the spatio-temporal feature extraction module designed by the invention extracts the spatial information and the temporal dependency information in continuous images in a more reasonable way, and can learn the temporal dependence between consecutive frames while preserving the spatial characteristics of the two-dimensional feature map.
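Putting the four parts together, one spatio-temporal feature extraction module might be sketched as follows, reusing the ConvLSTMCell above; the channel counts, padding and ReLU activation are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class SpatioTemporalModule(nn.Module):
    """One spatio-temporal feature extraction module (sketch):
    1x1 reduction -> ConvLSTM over K+1 steps -> two 3x3 convolutions."""
    def __init__(self, in_channels, reduced_channels):
        super().__init__()
        self.reduce = nn.Conv2d(in_channels, reduced_channels, kernel_size=1)   # produces F'_{t-k,n}
        self.convlstm = ConvLSTMCell(reduced_channels, reduced_channels, kernel_size=3)
        self.conv2 = nn.Conv2d(reduced_channels, reduced_channels, kernel_size=3, padding=1)
        self.conv3 = nn.Conv2d(reduced_channels, reduced_channels, kernel_size=3, padding=1)

    def forward(self, frame_features):
        # frame_features: list of K+1 maps F_{t-K,n} ... F_{t,n}, each (B, C, H, W)
        h = c = None
        for fmap in frame_features:
            x = self.reduce(fmap)                       # dimension reduction to W x H x L
            if h is None:                               # t-k-1 < 0: previous output is all zeros
                h, c = torch.zeros_like(x), torch.zeros_like(x)
            _, (h, c) = self.convlstm(x, (h, c))        # F''_{t-k,n}
        out = torch.relu(self.conv2(h))                 # refine F''_{t,n} for the current time t
        return self.conv3(out)                          # map passed to the fusion prediction module
```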
The spatio-temporal feature map fusion prediction module fuses the feature maps of different scales output by the N spatio-temporal feature extraction modules and outputs the predicted steering wheel angles V_{t+m} at the current time t and M future times, where m = 0, 1, …, M. In this embodiment, the spatio-temporal feature map fusion prediction module is implemented with fully connected layers. FIG. 4 is a schematic structural diagram of the spatio-temporal feature map fusion prediction module in this embodiment. As shown in FIG. 4, the module comprises N fully connected layers. The feature maps of the N scales obtained from the N spatio-temporal feature extraction modules are arranged from small to large by scale, and the length, width and channel number of the n-th feature map are denoted A_n, B_n and C_n respectively. The module processes the N feature maps as follows (a code sketch of this flow is given after the numbered steps):
1) The feature map of the N-th scale is flattened into a feature vector F_N of dimension A_N × B_N × C_N, which is fed into the 1st fully connected layer to output a feature vector f_N of dimension A_{N-1} × B_{N-1} × C_{N-1}.
2) Let the fully connected layer index n′ = 1.
3) The feature vector f_{N-n′+1} is added to the feature vector F_{N-n′} obtained by flattening the feature map of the (N-n′)-th scale, giving a feature vector F′_{N-n′} of dimension A_{N-n′} × B_{N-n′} × C_{N-n′}.
4) If n′ < N-1, go to step 5); otherwise go to step 7).
5) The feature vector F′_{N-n′} is fed into the (n′+1)-th fully connected layer to output a feature vector f_{N-n′} of dimension A_{N-n′-1} × B_{N-n′-1} × C_{N-n′-1}.
6) Let n′ = n′ + 1 and return to step 3).
7) The current feature vector F′_{N-n′} is fed into the N-th fully connected layer, which outputs the predicted steering wheel angles V_{t+m} at the current time t and M future times, forming the predicted steering wheel angle vector.
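A minimal sketch of this cascaded fusion head follows; the names `flat_dims` (the products A_n·B_n·C_n ordered from the smallest to the largest scale) and `num_outputs` (M+1) are introduced here for illustration and are not from the patent.

```python
import torch
import torch.nn as nn

class FusionPredictionHead(nn.Module):
    """Spatio-temporal feature map fusion prediction module (sketch):
    N fully connected layers fuse N multi-scale vectors by cascaded addition."""
    def __init__(self, flat_dims, num_outputs):
        super().__init__()
        n = len(flat_dims)
        layers = [nn.Linear(flat_dims[n - 1 - i], flat_dims[n - 2 - i]) for i in range(n - 1)]
        layers.append(nn.Linear(flat_dims[0], num_outputs))   # N-th layer -> V_{t+m}, m = 0..M
        self.fcs = nn.ModuleList(layers)

    def forward(self, feature_maps):
        # feature_maps: N tensors ordered from the smallest to the largest scale
        flat = [fm.flatten(start_dim=1) for fm in feature_maps]   # F_1 ... F_N
        v = self.fcs[0](flat[-1])                                 # f_N
        for i in range(1, len(flat)):
            v = self.fcs[i](v + flat[-1 - i])                     # F'_{N-i} = f_{N-i+1} + F_{N-i}, then next FC
        return v                                                  # predicted steering wheel angle vector
```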
S102: training the steering wheel angle prediction network:
Several continuous front road images of the intelligent vehicle and the corresponding steering wheel angles are acquired; the front road images are used as the input of the steering wheel angle prediction network and the steering wheel angles as the expected output, and the steering wheel angle prediction network is trained. The training data can come from a public database or be collected independently. The loss function employed during training is a weighted sum of the per-horizon errors:

Loss = Σ_{m=0}^{M} λ_{t+m} · L_{t+m}

where L_{t+m} is the root mean square error between the predicted and true steering wheel angles at time t+m, and λ_{t+m} is the weight assigned to time t+m, set according to actual needs, with

Σ_{m=0}^{M} λ_{t+m} = 1.
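A short sketch of this objective, with illustrative names (`weights` is assumed to hold the λ_{t+m} values):

```python
import torch

def multi_horizon_loss(pred, target, weights):
    """Weighted sum of per-horizon RMSE terms over the M+1 predicted angles."""
    # pred, target: (batch, M+1) steering wheel angles for times t ... t+M
    # weights: tensor of shape (M+1,) holding lambda_{t+m}
    rmse_per_horizon = torch.sqrt(torch.mean((pred - target) ** 2, dim=0))  # L_{t+m}
    return torch.sum(weights * rmse_per_horizon)                            # sum_m lambda_{t+m} * L_{t+m}
```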
s103: predicting the steering wheel angle:
For the time t′ to be predicted, the front road image detected by the intelligent vehicle at time t′ and the previous K frames of front road images are input into the steering wheel angle prediction network in time order, and the resulting M+1 predicted steering wheel angles are taken as the initial predicted values V_{t′+m} of the corresponding times. In order to control the steering wheel of the intelligent vehicle more smoothly, the final predicted steering wheel angles V′_{t′-q} of the previous Q times, q = 1, 2, …, Q, and the M+1 initial predicted values V_{t′+m} are arranged in time order to obtain a sequence of predicted steering wheel angles; exponential weighted averaging is performed on this sequence, and the result of the exponential weighted average at time t′+M is taken as the final predicted steering wheel angle V′_{t′} at the time t′ to be predicted. Exponential weighted averaging is a common method for processing sequence data, and its details are not repeated here.
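A minimal sketch of this smoothing step; the smoothing factor `beta` and its value are assumptions, since the patent does not specify the weighting of the exponential average.

```python
def smooth_steering_prediction(final_history, initial_preds, beta=0.9):
    """Exponentially weighted average over the Q previous final predictions
    followed by the M+1 new initial predictions; the value reached after the
    last element (time t'+M) is used as the final prediction V'_{t'}."""
    avg = None
    for v in list(final_history) + list(initial_preds):   # sequence ordered by time
        avg = v if avg is None else beta * avg + (1.0 - beta) * v
    return avg
```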
To better illustrate the technical effects of the invention, the invention was experimentally verified with a specific example. The experimental environment was as follows: the central processing unit is an Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40 GHz, the graphics processing unit is an NVIDIA GTX Titan X with 11 GB of video memory, the operating system is Ubuntu 16.04 LTS, the deep learning framework is PyTorch 1.0, the Python environment is Python 3.7, the ROS version is Kinetic, and the experimental vehicle is an FAW A70E.
The method was tested on a public end-to-end decision database for intelligent vehicles (the Udacity Challenge II database) and on a real vehicle. The evaluation index adopted is the root mean square error, i.e. the square root of the mean of the squared errors between the predicted data and the original data at corresponding points. Since the Udacity Challenge II database stores wheel angle values, the steering wheel angle predictions obtained by the method were converted into wheel angle values for comparison. FIG. 5 shows the comparison between the output values of the invention and the label values on the Udacity Challenge II database. FIG. 6 shows the comparison between the output values of the invention and the driver's control values in a campus environment. The root mean square error (RMSE) between the steering wheel angle values obtained by the method and the actual/reference values is 0.0491, showing that the method obtains accurate steering wheel angle values.
Although illustrative embodiments of the present invention have been described above to help those skilled in the art understand the present invention, it should be understood that the present invention is not limited to the scope of these embodiments. Various changes will be apparent to those skilled in the art as long as they remain within the spirit and scope of the present invention as defined by the appended claims, and all inventions utilizing the inventive concept are protected.

Claims (2)

1. An intelligent vehicle prediction control method based on visual space-time characteristics, characterized by comprising the following steps:
S1: constructing a steering wheel angle prediction network, wherein the steering wheel angle prediction network comprises a spatial feature extraction network, N spatio-temporal feature extraction modules and a spatio-temporal feature map fusion prediction module, wherein:
the input of the spatial feature extraction network is the front road image detected by an intelligent vehicle; the front road image detected by the intelligent vehicle at the current time t and the previous K frames of front road images are input into the spatial feature extraction network in time order, the feature maps of the last N layers of the spatial feature extraction network are output to the corresponding n-th spatio-temporal feature extraction modules respectively, and the feature map of the last n-th layer corresponding to time t-k is denoted F_{t-k,n}, where k = 0, 1, …, K and n = 1, 2, …, N;
each spatio-temporal feature extraction module comprises a first convolutional layer, a convolutional long short-term memory (ConvLSTM) network, a second convolutional layer and a third convolutional layer, wherein:
the convolution kernel size of the first convolutional layer is 1 × 1; it reduces the dimension of the input feature map F_{t-k,n} and outputs the result to the ConvLSTM network, and the feature map output by the first convolutional layer is denoted F'_{t-k,n}, with size W × H × L;
the input of the ConvLSTM network is a combined feature map obtained by concatenating the feature map F'_{t-k,n} with the feature map F''_{t-k-1,n} output by the ConvLSTM network for the previous frame of the road image, giving a size of W × H × 2L; when t-k-1 < 0, every pixel value of the feature map F''_{t-k-1,n} is 0; the ConvLSTM network extracts the spatio-temporal features in the combined feature map and outputs a feature map F''_{t-k,n}; the K+1 feature maps F'_{t-k,n} and their corresponding combined feature maps are input into the ConvLSTM network in sequence, and the feature map F''_{t,n} corresponding to the current time t is output to the second convolutional layer;
the convolution kernel size of the second convolutional layer is 3 × 3; it convolves the input feature map F''_{t,n} and outputs the result to the third convolutional layer;
the convolution kernel size of the third convolutional layer is 3 × 3; after convolving the feature map output by the second convolutional layer, the resulting feature map is output to the spatio-temporal feature map fusion prediction module;
the spatio-temporal feature map fusion prediction module fuses the feature maps of different scales output by the N spatio-temporal feature extraction modules and outputs the predicted steering wheel angles V_{t+m} at the current time t and M future times, where m = 0, 1, …, M; the spatio-temporal feature map fusion prediction module comprises N fully connected layers, the feature maps of the N scales obtained from the N spatio-temporal feature extraction modules are arranged from small to large by scale, the length, width and channel number of the n-th feature map are denoted A_n, B_n and C_n respectively, and the module processes the N feature maps as follows:
1) the feature map of the N-th scale is flattened into a feature vector F_N of dimension A_N × B_N × C_N, which is fed into the 1st fully connected layer to output a feature vector f_N of dimension A_{N-1} × B_{N-1} × C_{N-1};
2) let the fully connected layer index n′ = 1;
3) the feature vector f_{N-n′+1} is added to the feature vector F_{N-n′} obtained by flattening the feature map of the (N-n′)-th scale, giving a feature vector F′_{N-n′} of dimension A_{N-n′} × B_{N-n′} × C_{N-n′};
4) if n′ < N-1, go to step 5); otherwise go to step 7);
5) the feature vector F′_{N-n′} is fed into the (n′+1)-th fully connected layer to output a feature vector f_{N-n′} of dimension A_{N-n′-1} × B_{N-n′-1} × C_{N-n′-1};
6) let n′ = n′ + 1 and return to step 3);
7) the current feature vector F′_{N-n′} is fed into the N-th fully connected layer, which outputs the predicted steering wheel angles V_{t+m} at the current time t and M future times, forming the predicted steering wheel angle vector;
S2: acquiring a plurality of continuous front road images of the intelligent vehicle and the corresponding steering wheel angles, using the front road images as the input of the steering wheel angle prediction network and the steering wheel angles as the expected output, and training the steering wheel angle prediction network;
S3: for the time t′ to be predicted, inputting the front road image detected by the intelligent vehicle at time t′ and the previous K frames of front road images into the steering wheel angle prediction network in time order, and taking the obtained M+1 predicted steering wheel angles as the initial predicted values V_{t′+m} of the corresponding times; arranging the final predicted steering wheel angles V′_{t′-q} of the previous Q times, q = 1, 2, …, Q, and the M+1 initial predicted values V_{t′+m} in time order to obtain a sequence of predicted steering wheel angles, performing exponential weighted averaging on the sequence, and taking the result of the exponential weighted average at time t′+M as the final predicted steering wheel angle at the time t′ to be predicted.
2. The intelligent vehicle predictive control method of claim 1, wherein the spatial feature extraction network employs a convolutional neural network portion of an Nvidia-Pilot network.
CN202010012552.XA 2020-01-07 2020-01-07 Intelligent vehicle prediction control method based on visual space-time characteristics Active CN111208818B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010012552.XA CN111208818B (en) 2020-01-07 2020-01-07 Intelligent vehicle prediction control method based on visual space-time characteristics

Publications (2)

Publication Number Publication Date
CN111208818A CN111208818A (en) 2020-05-29
CN111208818B true CN111208818B (en) 2023-03-07

Family

ID=70785554

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010012552.XA Active CN111208818B (en) 2020-01-07 2020-01-07 Intelligent vehicle prediction control method based on visual space-time characteristics

Country Status (1)

Country Link
CN (1) CN111208818B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111862158B (en) * 2020-07-21 2023-08-29 湖南师范大学 Staged target tracking method, device, terminal and readable storage medium
CN112212872B (en) * 2020-10-19 2022-03-11 合肥工业大学 End-to-end automatic driving method and system based on laser radar and navigation map
CN113156958A (en) * 2021-04-27 2021-07-23 东莞理工学院 Self-supervision learning and navigation method of autonomous mobile robot based on convolution long-short term memory network
CN115797708B (en) * 2023-02-06 2023-04-28 南京博纳威电子科技有限公司 Power transmission and distribution synchronous data acquisition method

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103019092A (en) * 2012-12-31 2013-04-03 上海师范大学 Prediction control method for positioning platform of mechanical transmission system
CN105292243A (en) * 2015-09-29 2016-02-03 桂林电子科技大学 Prediction control method of electric power steering system of automobile
WO2017211395A1 (en) * 2016-06-07 2017-12-14 Toyota Motor Europe Control device, system and method for determining the perceptual load of a visual and dynamic driving scene
WO2019147396A1 (en) * 2018-01-23 2019-08-01 Gopro, Inc. Relative image capture device orientation calibration
CN109344701A (en) * 2018-08-23 2019-02-15 武汉嫦娥医学抗衰机器人股份有限公司 A kind of dynamic gesture identification method based on Kinect
CN109168003A (en) * 2018-09-04 2019-01-08 中国科学院计算技术研究所 A method of generating the neural network model for being used for video estimation
CN109410575A (en) * 2018-10-29 2019-03-01 北京航空航天大学 A kind of road network trend prediction method based on capsule network and the long Memory Neural Networks in short-term of nested type
CN109508375A (en) * 2018-11-19 2019-03-22 重庆邮电大学 A kind of social affective classification method based on multi-modal fusion
CN109581928A (en) * 2018-12-07 2019-04-05 电子科技大学 A kind of end-to-end decision-making technique of intelligent vehicle towards highway scene and system
CN109656134A (en) * 2018-12-07 2019-04-19 电子科技大学 A kind of end-to-end decision-making technique of intelligent vehicle based on space-time joint recurrent neural network
CN110188683A (en) * 2019-05-30 2019-08-30 北京理工大学 A kind of automatic Pilot control method based on CNN-LSTM

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
End-to-End Driving Model for Steering Control of Autonomous Vehicles with Future Spatiotemporal Features; Tianhao Wu et al.; 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems; 2019-12-31; full text *
MST-ResNet: A Multiscale Spatial Temporal ResNet for Steering Prediction; Long Wen et al.; 2019 2nd China Symposium on Cognitive Computing and Hybrid Intelligence; 2019-12-31; full text *
Research on End-to-End Decision-Making of Intelligent Vehicles Based on Spatio-Temporal Recurrent Neural Networks (基于时空递归神经网络的智能车端到端决策研究); Jin Fan (金凡); China Masters' Theses Full-text Database, Engineering Science and Technology II; 2018-10-15 (No. 10); full text *
Spatio-Temporal Deep Learning Algorithms for Dynamic Scene Understanding (面向动态场景理解的时空深度学习算法); Meng Binghao (蒙冰皓); China Masters' Theses Full-text Database, Information Science and Technology; 2018-02-15 (No. 02); full text *

Also Published As

Publication number Publication date
CN111208818A (en) 2020-05-29

Similar Documents

Publication Publication Date Title
CN111208818B (en) Intelligent vehicle prediction control method based on visual space-time characteristics
CN110298262B (en) Object identification method and device
CN111582201B (en) Lane line detection system based on geometric attention perception
CN108985269B (en) Convergence network driving environment perception model based on convolution and cavity convolution structure
CN112818903B (en) Small sample remote sensing image target detection method based on meta-learning and cooperative attention
US9286524B1 (en) Multi-task deep convolutional neural networks for efficient and robust traffic lane detection
EP3278317B1 (en) Method and electronic device
CN109726627B (en) Neural network model training and universal ground wire detection method
CN111738037B (en) Automatic driving method, system and vehicle thereof
US11940803B2 (en) Method, apparatus and computer storage medium for training trajectory planning model
Wulff et al. Early fusion of camera and lidar for robust road detection based on U-Net FCN
CN113409361B (en) Multi-target tracking method and device, computer and storage medium
CN111696110B (en) Scene segmentation method and system
CN116188999B (en) Small target detection method based on visible light and infrared image data fusion
CN112861619A (en) Model training method, lane line detection method, equipment and device
CN112686207A (en) Urban street scene target detection method based on regional information enhancement
Dinh et al. Transfer learning for vehicle detection using two cameras with different focal lengths
CN112348116A (en) Target detection method and device using spatial context and computer equipment
CN115880658A (en) Automobile lane departure early warning method and system under night scene
Zheng et al. A novel vehicle lateral positioning methodology based on the integrated deep neural network
CN112597996A (en) Task-driven natural scene-based traffic sign significance detection method
CN111860411A (en) Road scene semantic segmentation method based on attention residual error learning
CN117037119A (en) Road target detection method and system based on improved YOLOv8
CN116863241A (en) End-to-end semantic aerial view generation method, model and equipment based on computer vision under road scene
CN116630702A (en) Pavement adhesion coefficient prediction method based on semantic segmentation network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant