CN115285143A - Automatic driving vehicle navigation method based on scene classification - Google Patents
Automatic driving vehicle navigation method based on scene classification
- Publication number
- CN115285143A CN115285143A CN202210925922.8A CN202210925922A CN115285143A CN 115285143 A CN115285143 A CN 115285143A CN 202210925922 A CN202210925922 A CN 202210925922A CN 115285143 A CN115285143 A CN 115285143A
- Authority
- CN
- China
- Prior art keywords
- vehicle
- scene
- scene classification
- model
- navigation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60W—CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
- B60W60/00—Drive control systems specially adapted for autonomous road vehicles
- B60W60/001—Planning or execution of driving tasks
- B60W40/00—Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub unit, e.g. by using mathematical models
- B60W40/02—Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub unit, e.g. by using mathematical models related to ambient conditions
- B60W50/00—Details of control systems for road vehicle drive control not related to the control of a particular sub-unit, e.g. process diagnostic or vehicle driver interfaces
- B60W2050/0001—Details of the control system
- B60W2050/0002—Automatic control, details of type of controller or control system architecture
- B60W2050/0004—In digital systems, e.g. discrete-time systems involving sampling
- B60W2050/0005—Processor details or data handling, e.g. memory registers or chip architecture
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/56—Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention discloses a navigation method for automatic driving vehicles based on scene classification, in the technical field of automatic driving vehicle navigation. The method improves the navigation capability of an automatic driving vehicle when resources are limited, particularly in complex urban scenes: a classification of scene traversability is introduced, the navigation direction is determined through coarse semantic analysis of environmental information, and the vehicle is controlled to travel accordingly. Precise modeling of the environment and the recognition of objects in images are avoided; the unmanned vehicle is guided by the direct mapping between scenes and actions, thereby achieving a navigation strategy suited to consumer-grade hardware.
Description
Technical Field
The invention relates to the technical field of automatic driving vehicle navigation, in particular to an automatic driving vehicle navigation method based on scene classification.
Background
The navigation technology of automatic driving vehicles relies on the cooperation of computer vision, radar, monitoring devices, global positioning systems and the like, so that the vehicle can move from one position to another without active human operation. Because automatic driving navigation needs no human driver, it can in theory effectively avoid human driving errors, reduce the occurrence of traffic accidents and improve the transport efficiency of roads. As a result, autonomous vehicle navigation technology is gaining increasing attention.
Traditional navigation technology mainly depends on global navigation satellite systems (such as GPS), but because these are easily occluded and suffer from multipath effects, high-precision positioning and navigation are difficult to realize in complex scenes such as urban canyons and jungles. In recent years, many satellite-independent navigation technologies, such as inertial navigation, visual navigation, database-matching navigation, bionic navigation and collaborative navigation, have been researched and developed at home and abroad. A typical example is "map"-based navigation, in which a high-performance sensor (such as a laser radar) is matched against a map (such as a high-precision map) to realize accurate navigation. However, in some highly dynamic and complex scenarios it is difficult to achieve an exact match to the environment; in particular, when global information is lost the method fails, much like a GPS outage. In addition, expensive sensors restrict mass production and can hardly meet consumer-grade requirements.
In order to meet consumer-grade navigation requirements, avoiding dependence on high-performance sensors and realizing the navigation task with a low-cost visual sensor is a technical problem to be solved.
At present, artificial intelligence is growing rapidly and machine vision technology that simulates animal vision is gradually maturing, providing a new opportunity for high-precision guidance; low-cost visual navigation is expected to become the bellwether of next-generation unmanned-system navigation technology. Therefore, a novel low-cost navigation method and system based on advanced artificial-intelligence technology can provide a new solution for vehicle navigation and important technical support for high-precision navigation in mass production.
Disclosure of Invention
The invention aims to provide an automatic driving vehicle navigation method based on scene classification which improves the navigation capability of an automatic driving vehicle when resources are limited. In particular, in complex urban scenes it introduces a classification of scene traversability, determines the navigation direction through coarse semantic analysis of environmental information and controls the vehicle to advance; it bypasses precise modeling of the environment and precise identification of objects in images, and guides the unmanned vehicle through the direct mapping between scenes and actions, thereby realizing a navigation strategy suited to consumer-grade hardware.
In order to achieve the purpose, the invention adopts the technical scheme that:
an automatic driving vehicle navigation method based on scene classification comprises the following steps:
s1: acquiring and classifying images of the surrounding environment of the automatic driving vehicle by using a vehicle-mounted camera; establishing a scene classification model based on a convolutional neural network technology, and training the scene classification model;
s2: based on an extended Kalman filter framework, a system state model is established; the prior estimate of the system state variables is predicted with the extended Kalman filter, the covariance matrix of the prior state is updated in real time by an adaptive algorithm based on the filtering innovation, and the posterior estimate of the system state variables is updated with the extended Kalman filter, thereby realizing adaptive positioning of the vehicle;
s3: selecting a vehicle control mode according to the classification result of the scene classification model to control the running of the vehicle;
s4: establishing an automatic driving vehicle navigation framework; the navigation process of the framework is as follows: collect the image of the automatic driving vehicle's surroundings at the current moment as the input of the scene classification model, control the driving of the vehicle according to the output of the scene classification model, then judge the current position of the vehicle according to the adaptive positioning result, and repeat these steps until the automatic driving vehicle reaches the target position.
The S1 specifically comprises the following steps:
s1.1: acquiring a sample data set: shooting scene pictures of different environments by using a vehicle-mounted camera, and classifying the scene pictures;
s1.2: establishing a scene classification model based on a convolutional neural network architecture, carrying out normalization processing on a data set, and training the established scene classification model by using a training set;
s1.3: adjusting the hyper-parameters of the scene classification model according to the number of the object types in the scene, wherein the hyper-parameters comprise a learning rate, an attenuation rate, iteration times, the number of layers, the number of neurons in each layer, batch size and the weight of each part in a loss function; and repeating S1.2, and training the scene classification models suitable for different complexities, including high, medium and low complexity scene classification models.
The scene complexity reflects the number of object types contained in the scene pictures of the sample data set: the complexity is low when the number of object types is no more than three, medium when it is more than three and less than seven, and high when it is seven or more.
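As an illustrative sketch only (not part of the claimed method), the complexity thresholds above can be written as a small selection function; the original text leaves exactly seven object types ambiguous, and this sketch assumes it counts as high complexity:

```python
def complexity_level(num_object_types):
    """Map the number of object types in a scene picture to a scene
    complexity level: <= 3 low, 4..6 medium, >= 7 high (treating
    exactly seven as high is an assumption)."""
    if num_object_types <= 3:
        return "low"
    elif num_object_types < 7:
        return "medium"
    return "high"
```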
The S1.1 specifically comprises the following steps:
s1.1.1: manually controlling the vehicle to run, acquiring an environment image through a camera arranged at a preset position of the vehicle, generating a sample data set, and recording the shooting direction of each sample data;
the camera can rotate plus or minus eighty degrees; the sample shooting directions are five, namely front, front-left, front-right, left and right;
s1.1.2: performing semantic classification on each sample scene picture in the sample data set, and selecting a scene classification label for each sample scene picture according to an actual scene;
the semantic classification assigns the things in each sample scene picture to categories, of which there are ten: highways, non-motor-vehicle roads, pedestrians, vehicles, buildings, street lamps, trees, grassland, sand and sky; the scene classification labels comprise fifteen label categories, namely the three conditions priority, feasible and infeasible for each of the five directions (front, front-left, front-right, left and right);
s1.1.3: and dividing the sample data set subjected to scene classification into a training set and a test set according to a proportion.
The S1.2 specifically comprises the following steps:
s1.2.1: establishing a scene classification model with a shallow convolutional neural network architecture on the TensorFlow open-source framework; the model comprises convolutional layers, pooling layers and a fully connected layer; each convolution and pooling layer outputs x feature maps, the fully connected layer performs feature fusion, and the scene is mapped to fifteen output probability values by the Softmax function, as shown in formula (1):

P_i = exp(z_i) / Σ_c exp(z_c)   (1)

where z_i is the output value of the i-th node of the fully connected layer and z_c is the output value of the c-th node of the fully connected layer;
s1.2.2: resizing each scene-classified training sample to y × y as the input of the scene classification model, and applying a uniform normalization to the training and test sets:

x_norm = (x_RGB − min)/(max − min) − b   (2)

where x_RGB is the colour value of each of the three channels; max and min are the extreme values of the current image, set to 255 and 0 respectively; this adjusts the pixel intensity of each image to [−b, 1 − b] (the symmetric range [−b, b] when b = 0.5);
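An illustrative sketch of the normalization in formula (2); the value b = 0.5 is an assumption (the text does not fix b), chosen because it makes the output range symmetric about zero:

```python
import numpy as np

def normalize_image(img_rgb, b=0.5):
    """Formula (2): x_norm = (x_RGB - min)/(max - min) - b, with
    min = 0 and max = 255; for b = 0.5 each pixel intensity is
    mapped into [-0.5, 0.5]."""
    img = img_rgb.astype(np.float32)
    return (img - 0.0) / (255.0 - 0.0) - b
```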
S1.2.3: training a scene classification model by using a training set in the sample data set;
s1.2.3 specifically comprises: selecting a training platform and setting the number of training iterations, the batch size, the learning rate and the decay rate; initializing the weights and bias parameters with a random function and optimizing them after each iteration;

dropout is used in the first fully connected layer, the cuDNN acceleration library is used to exploit GPU performance, the ReLU function is adopted for nonlinear activation, and an Adam optimizer is used to optimize the cross-entropy loss.
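A minimal NumPy sketch of the Softmax output of formula (1) and the cross-entropy loss minimized by the Adam optimizer (illustrative only; the patent's implementation runs on TensorFlow with cuDNN acceleration):

```python
import numpy as np

def softmax(z):
    """Formula (1): P_i = exp(z_i) / sum_c exp(z_c) over the fifteen
    fully connected output nodes."""
    z = z - np.max(z)              # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def cross_entropy(p, y):
    """Cross-entropy between predicted probabilities p and label y."""
    return float(-np.sum(y * np.log(p + 1e-12)))
```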
Converting the one-hot encoded labels into a normalized probability distribution in which the target rank has the highest probability value. Given the set of label ranks R = {r_QF-BKX, r_QF-KX, r_QF-YX; r_YQF-BKX, r_YQF-KX, r_YQF-YX; r_ZQF-BKX, r_ZQF-KX, r_ZQF-YX; r_YF-BKX, r_YF-KX, r_YF-YX; r_ZF-BKX, r_ZF-KX, r_ZF-YX}, the ground-truth label y_i is generated from the target rank r_t and the prediction ranks r_i to optimize the training direction of the scene classification model, as shown in formula (3):

y_i = exp(−φ(r_t, r_i)) / Σ_k exp(−φ(r_t, r_k))   (3)

where r_i is the i-th prediction rank, r_k the k-th prediction rank and y_i the i-th ground-truth label; φ is a metric function that penalizes the deviation of the prediction rank from the target rank, as shown in formula (4).

When the deviation |r_i − r_t| is penalized heavily, y reduces to a one-hot encoded vector; when the penalty tends to zero, y tends to a uniform probability distribution;
given a pixel location p = [p_x, p_y]^T in the ground-truth mask, a weight map w(p) is calculated and the loss is applied to each pixel by element-wise multiplication; the weight of a pixel depends on its Euclidean distance d(p) to the nearest segmentation boundary and its vertical position h(p) in the image, as shown in formula (5):

where β is a constant adjustable according to practical conditions, and h(p) scales the rate at which the pixel weight increases as the pixel moves away from the boundary; used as a pixel-level multiplication factor, it gives pixels low in the image a high weight.
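Since the body of formula (5) is likewise not recoverable, the following sketch assumes a common exponential boundary-distance weighting modulated by the vertical-position factor h(p); the exact functional form in the patent may differ:

```python
import numpy as np

def weight_map(d, h, beta=5.0):
    """Boundary-aware pixel weights: pixels near a segmentation boundary
    (small d) get weight up to 1 + beta, decaying to 1 far away; the
    vertical-position factor h multiplies pixel-wise so that pixels low
    in the image (large h) receive higher weight. The exponential form
    is an assumption standing in for formula (5)."""
    d = np.asarray(d, dtype=np.float64)
    h = np.asarray(h, dtype=np.float64)
    return (1.0 + beta * np.exp(-d)) * h
```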
The S2 comprises the following steps:
s2.1: collecting the vehicle motion data of the on-board inertial sensor and the vehicle position and velocity measurement data of the satellite at the current moment;
s2.2: establishing a system state space model, wherein the system state space model comprises a state equation model and a measurement equation model:
the equation of state model is as shown in equation (6):
X = [ψ, δV, δ, γ, ζ]^T   (6)
where ψ, δV, δ, γ and ζ are given by formula (7):

where ψ denotes the vehicle attitude error, with ψ_E, ψ_N, ψ_U the east, north and up attitude errors; δV denotes the vehicle velocity error, with δV_E, δV_N, δV_U the east, north and up velocity errors; δ denotes the vehicle position error, with δL, δλ and δh the latitude, longitude and height errors of the vehicle position; γ denotes the accelerometer bias in the body frame, with γ_bx, γ_by, γ_bz its biases in the X, Y and Z directions; ζ denotes the gyroscope drift in the body frame, with ζ_bx, ζ_by, ζ_bz its drifts in the X, Y and Z directions;
the measurement equation model is represented by formula (8):
where P_state and P_gps denote the position data measured by the on-board inertial sensor and the satellite respectively, and v_state and v_gps the corresponding velocity data; the measurement variable Z is the matrix vector formed from the differences between the positions and velocities measured by the on-board inertial sensor and the satellite;
s2.3: reading in the measurement information of the on-board inertial sensor and performing the prediction step of the extended Kalman filter with the constructed state-equation model, thereby obtaining the prior estimate X̂_{k|k−1} of the system state variables at the current moment;
S2.4: prior state covariance matrix P k|k-1 And (3) self-adaptive estimation:
calculating filtering innovation v k :
Wherein z is k Measurement data for a satellite;
calculating the adaptive parameter λ_k:

where R_k is the observation noise covariance matrix, Q_k the state noise covariance matrix, P_{k−1} the posterior state covariance matrix at time k−1, H_k the observation matrix and H_k^T its transpose; the parameter C_0 is determined by formula (11),

where v_0^T is the transpose of v_0, the filtering innovation matrix at the initial moment, given empirically according to the actual situation;
calculating a prior state covariance matrix at the current moment:
s2.5: reading in the current satellite measurement information z_k and performing the update step of the extended Kalman filter with the constructed measurement-equation model, thereby correcting the prior state estimate and obtaining the posterior estimate of the system state variables.
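Steps S2.3 to S2.5 can be sketched as one filter cycle. Because the bodies of formulas (10) to (12) are not recoverable from the text, the fading-factor expression for λ_k below is the standard strong-tracking form and should be read as an assumption:

```python
import numpy as np

def adaptive_ekf_step(x_prior, P_prev, Phi, H, Q, R, z, C0):
    """One cycle of S2.3-S2.5. The fading factor lambda_k inflates the
    prior covariance (P_prior = lambda_k * Phi P_prev Phi^T + Q); its
    expression below is the standard strong-tracking form, assumed here
    because formulas (10)-(12) did not survive extraction."""
    v = z - H @ x_prior                     # filtering innovation, formula (9)
    M = H @ Phi @ P_prev @ Phi.T @ H.T      # model-predicted innovation covariance
    N = C0 - R - H @ Q @ H.T                # data-driven innovation covariance
    lam = max(1.0, np.trace(N) / np.trace(M))
    P_prior = lam * (Phi @ P_prev @ Phi.T) + Q   # adaptive prior covariance
    # standard EKF measurement update (S2.5)
    S = H @ P_prior @ H.T + R
    K = P_prior @ H.T @ np.linalg.inv(S)
    x_post = x_prior + K @ v
    P_post = (np.eye(len(x_prior)) - K @ H) @ P_prior
    return x_post, P_post, lam
```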
The S3 comprises the following steps:
s3.1: according to the scene classification result, calculating control selection parameters:
the scene classification result is expressed as shown in formula (13):

where the function θ is the convolutional neural network model for scene perception, outputting the fifteen probabilities P_QF-YX, P_QF-KX, P_QF-BKX, P_YQF-YX, P_YQF-KX, P_YQF-BKX, P_ZQF-YX, P_ZQF-KX, P_ZQF-BKX, P_YF-YX, P_YF-KX, P_YF-BKX, P_ZF-YX, P_ZF-KX and P_ZF-BKX; QF, YQF, ZQF, YF and ZF denote front, front-right, front-left, right and left respectively, and YX, KX and BKX denote priority, feasible and infeasible respectively; a is a model parameter, o_t is the currently observed image, and d_i is the KL distance used to select models of different complexity;
defining the control selection parameters P_QF, P_ZF and P_YF as shown in formulas (14), (15) and (16):

P_QF = P_QF-YX + P_QF-KX + P_QF-BKX   (14)

P_ZF = P_ZF-YX + P_ZF-KX + P_ZF-BKX + a(P_ZQF-YX + P_ZQF-KX + P_ZQF-BKX)   (15)

P_YF = P_YF-YX + P_YF-KX + P_YF-BKX + a(P_ZQF-YX + P_ZQF-KX + P_ZQF-BKX)   (16)
wherein a is an adjustable parameter in the range of 0.5-0.8;
s3.2: selecting a control mode according to the control selection parameters, and controlling the vehicle to move:
where π_AWC and π_RC are the two control modes, AWC denoting the weighted control mode and RC the regular control mode; B is a manually set baseline speed, and the range of [A, B] is given empirically; the velocity command u = (v, w) comprises a linear velocity and an angular velocity;
for the regular control mode, the linear velocity and angular velocity controlling the travel of the vehicle are given by formulas (18) and (19):

v = V_l · P_QF   (18)

w = Ω_a · (P_ZF − P_YF)   (19)

where the linear velocity v is the product of P_QF and the baseline linear velocity V_l, and the angular velocity w is the product of the difference between P_ZF and P_YF and the baseline angular velocity Ω_a;
for the weighted control mode, the forward speed is reduced and the turn speed is increased using adjustable empirical parameters.
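An illustrative sketch of the regular control mode of S3. The baseline speeds V_l and Ω_a and the value a = 0.65 are assumptions (the text only bounds a to 0.5-0.8), and formula (16) is reproduced exactly as printed, so its a-weighted ZQF terms cancel in the angular velocity:

```python
def control_command(p, a=0.65, V_l=1.0, Omega_a=1.0):
    """Regular control mode of S3. p maps the fifteen scene labels
    (e.g. 'QF-YX') to probabilities; a lies in [0.5, 0.8]; the baseline
    speeds V_l and Omega_a are assumed values."""
    s = lambda d: p[d + '-YX'] + p[d + '-KX'] + p[d + '-BKX']
    P_QF = s('QF')                    # formula (14)
    P_ZF = s('ZF') + a * s('ZQF')     # formula (15)
    P_YF = s('YF') + a * s('ZQF')     # formula (16), as printed
    v = V_l * P_QF                    # formula (18): linear velocity
    w = Omega_a * (P_ZF - P_YF)       # formula (19): angular velocity
    return v, w
```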
The S4 comprises the following steps:
s4.1: acquiring the current scene information with the on-board camera, and obtaining the position of the vehicle at the current moment from the adaptive positioning result;
s4.2: selecting among the high-, medium- and low-complexity scene classification models according to the Kullback-Leibler (KL) distance method, as shown in formula (20):

where p denotes the posterior probability of the current driving-environment category, p̂_i the posterior probability density of the low-, medium- or high-complexity model, d_i the KL distance, E_p the expectation operator and i the scene complexity level; the complexity level i of the current driving scene is the one minimizing d_i in formula (20), and the scene classification model of the corresponding complexity is selected;
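A sketch of the KL-distance model selection of S4.2, assuming the usual discrete Kullback-Leibler divergence d_i = Σ p·log(p/p̂_i); the model posteriors below are hypothetical placeholders:

```python
import numpy as np

def kl_distance(p, q, eps=1e-12):
    """KL distance d_i between the current scene posterior p and the
    posterior q of a candidate complexity model."""
    p = np.asarray(p, dtype=np.float64)
    q = np.asarray(q, dtype=np.float64)
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

def select_model(p, model_posteriors):
    """S4.2: choose the complexity level whose posterior is closest to
    the current driving-environment posterior in KL distance."""
    d = {name: kl_distance(p, q) for name, q in model_posteriors.items()}
    return min(d, key=d.get)
```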
s4.3: classifying the current scene by using a scene classification model with corresponding complexity, and outputting fifteen scene probability value results by using the scene classification model;
s4.4: judging whether the driving decision is feasible according to the navigation task, the surrounding environment and traffic regulations; if the driving decision is feasible, selecting a vehicle control mode from the output of the scene classification model and generating the angular velocity command for steering and the driving speed command;
if the driving decision is not feasible, generating prompt information and sending it to the user, the prompt information comprising the reason why the driving decision is not feasible and a recommended alternative driving decision;
s4.5: judging from the adaptive positioning result whether the vehicle has reached the goal; if the end point is reached, the navigation result is obtained; if not, returning to S3.1 to continue navigation.
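The navigation loop of S4 can be sketched end to end; every callable below (image source, classifier, controller, localizer and goal test) is a hypothetical placeholder for the corresponding module:

```python
def navigate(get_image, classify, control, localize, reached_goal, max_steps=10000):
    """S4 navigation framework: classify the current camera image into
    the fifteen scene probabilities, map them to a velocity command,
    update the adaptive position estimate, and stop at the goal."""
    for _ in range(max_steps):
        probs = classify(get_image())   # S4.3: scene probabilities
        v, w = control(probs)           # S4.4: control command
        pose = localize(v, w)           # S2: adaptive positioning
        if reached_goal(pose):          # S4.5: end-point check
            return pose
    return None                         # goal not reached within budget
```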
Advantageous technical effects
1. The invention provides a low-cost navigation method based on artificial-intelligence technology to solve the autonomous navigation problem of existing automatic driving vehicles. The method uses a scene classification model based on a convolutional neural network to understand the surroundings of the vehicle and forms an end-to-end automatic driving vehicle navigation method through the direct mapping between the scene classification result and vehicle control. It needs no complex intermediate modules, avoids difficulties such as perception-algorithm fusion and map construction, greatly reduces the complexity of the system, and offers low development difficulty and low hardware requirements;
2. the invention provides several schemes to enhance the precision and robustness of the navigation method: an adaptive positioning module assists in judging the navigation process; different vehicle control modes are introduced; scene classification models suited to different scene complexities are provided together with a corresponding model selection method; a two-stage scene-picture processing method is provided; and different model-training enhancement methods are introduced.
Drawings
FIG. 1 is a flowchart of a method for navigating an autonomous vehicle based on scene classification according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of a scene classification model according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of vehicle driving control provided by an embodiment of the invention;
FIG. 4 is a diagram of an autonomous vehicle navigation framework according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions and advantages thereof more clear, the technical solutions in the present application will be described below with reference to the accompanying drawings.
As shown in fig. 1, the method for navigating an automatic driving vehicle based on scene classification according to the present embodiment includes the following steps:
s1: scene classification, namely acquiring and classifying images of the surrounding environment of the automatic driving vehicle by using a vehicle-mounted camera; establishing a scene classification model based on a convolutional neural network technology, and training the scene classification model; the method specifically comprises the following steps:
s1.1: acquiring a sample data set, shooting scene pictures of different environments by using a vehicle-mounted camera, and classifying the scene pictures;
s1.2: establishing a scene classification model based on a convolutional neural network architecture, carrying out normalization processing on a data set, and training the established scene classification model by using a training set;
s1.3: adjusting the hyper-parameters of the scene classification model according to different scene complexity, wherein the hyper-parameters comprise learning rate, attenuation rate, iteration times, layer number, the number of neurons in each layer, batch size and weight of each part in a loss function; repeating S1.2, and training scene classification models suitable for different complexities, including high, medium and low complexity scene classification models;
the difference in scene complexity represents the difference in the number of object types contained in the scene pictures of the sample data set; the complexity is low when the number of object types is three or fewer, medium when it is more than three and fewer than seven, and high when it is more than seven;
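The complexity thresholds above can be expressed as a small selection routine. A minimal sketch, assuming the boundary case of exactly seven object types is treated as high complexity (the text leaves it unspecified):

```python
def complexity_level(num_object_types):
    """Map the number of object types in a scene picture to a complexity
    level: three or fewer -> low, more than three and fewer than seven
    -> medium, otherwise high (exactly seven mapped to high here as an
    assumption)."""
    if num_object_types <= 3:
        return "low"
    elif num_object_types < 7:
        return "medium"
    return "high"
```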
the S1.1 specifically comprises the following steps:
s1.1.1: manually controlling the automatic driving vehicle to run in an outdoor unstructured environment, acquiring environment images through a twistable camera arranged at the front end of the vehicle, generating a sample data set, and recording the shooting direction of each sample; the twistable camera means that the camera is mounted on a fixed base and can twist plus or minus eighty degrees around it; the sample shooting directions are divided into five: front, left front, right front, left and right; the front corresponds to a camera torsion angle within plus or minus ten degrees, the left front to more than plus ten and less than plus sixty degrees, the right front to less than minus ten and more than minus sixty degrees, the left to more than plus sixty and up to plus eighty degrees, and the right to less than minus sixty and down to minus eighty degrees;
s1.1.2: performing semantic classification on each sample scene picture in the sample data set, and selecting a scene classification label for each sample scene picture according to an actual scene;
the semantic classification means classifying the things in each sample scene picture by category, covering ten categories: highways, non-motor-vehicle roads, pedestrians, vehicles, buildings, street lamps, trees, grassland, sand and sky; the scene classification labels comprise fifteen label categories, namely the three conditions of priority, feasible and infeasible for each of the front, left front, right front, left and right directions; priority indicates a place where the vehicle is expected to travel, i.e., an open road without obstacles; feasible indicates a place where the vehicle may travel but is generally not expected to, such as non-motor-vehicle roads, grassland and sand; infeasible indicates places where the vehicle cannot travel and things that should be avoided, such as buildings, street lamps, trees, the sky, and roads with pedestrians or other obstacles;
s1.1.3: the sample data set after scene classification processing is divided into a training set and a test set at a ratio of 8:2;
the fifteen labels are shown in table 1:
TABLE 1
Direction / Condition: priority (YX); feasible (KX); infeasible (BKX)
Front (QF): QF-YX; QF-KX; QF-BKX
Left front (ZQF): ZQF-YX; ZQF-KX; ZQF-BKX
Right front (YQF): YQF-YX; YQF-KX; YQF-BKX
Left (ZF): ZF-YX; ZF-KX; ZF-BKX
Right (YF): YF-YX; YF-KX; YF-BKX
The S1.2 specifically comprises the following steps:
s1.2.1: establishing a scene classification model based on a shallow convolutional neural network architecture using the existing TensorFlow open-source framework, as shown in fig. 2; the model consists of three convolutional layers, three pooling layers and two fully connected layers; each convolutional and pooling layer outputs 32 feature maps, feature fusion is performed through the fully connected layers, and a Softmax function divides the scene into fifteen probability values for output, as shown in formula (1):
p_i = e^(z_i) / Σ_c e^(z_c)    (1)
where z_i is the output value of the i-th node of the fully connected layer and z_c is the output value of the c-th node;
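The Softmax output of formula (1) can be sketched directly; this is the standard softmax over the fifteen fully connected outputs, not the patented model itself:

```python
import math

def softmax(z):
    """Softmax over the fully connected layer outputs (formula (1)):
    p_i = exp(z_i) / sum_c exp(z_c)."""
    m = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(zi - m) for zi in z]
    s = sum(exps)
    return [e / s for e in exps]
```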
s1.2.2: adjusting the training set samples after scene classification processing to a size of 64 × 64 as the input of the scene classification model, and performing a unified normalization on the training set and the test set:
x_norm = (x_RGB - min)/(max - min) - 0.5    (2)
where x_RGB is the color value of each of the three channels; max and min are the extreme values of the current image, set to 255 and 0, respectively; the pixel intensity of each image is thus adjusted to [-0.5, 0.5];
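Formula (2) amounts to a min-max normalization shifted by 0.5; a minimal sketch:

```python
def normalize_pixel(x_rgb, max_val=255.0, min_val=0.0, b=0.5):
    """Min-max normalization of one channel value per formula (2),
    shifting intensities into the interval [-0.5, 0.5] for b = 0.5."""
    return (x_rgb - min_val) / (max_val - min_val) - b
```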
S1.2.3: training a scene classification model by using a training set in the sample data set:
using an Nvidia GTX 1060 GPU as the platform to train the model;
performing 1000 iterations to train the model, wherein the batch processing size is 1024 frames, the initial learning rate is set to be 0.001, and the attenuation is 0.05 every 10 steps; initializing the weight and the deviation parameter by a random function, and optimizing after each iteration;
in order to avoid overfitting as much as possible, dropout is used in the first full-connection layer, and a cuDNN acceleration library is used for improving GPU performance; in addition, a ReLU function is adopted for nonlinear activation, the problem of gradient disappearance is solved, and the training efficiency is improved; performing cross entropy loss optimization by using an Adam optimizer to relieve the problems of gradient sparsity and large noise;
converting the one-hot coded labels into a max-normalized probability distribution according to the rank ordering, so that the target rank has the highest probability value; given a set of label ranks R = {r_QF-BKX, r_QF-KX, r_QF-YX; r_YQF-BKX, r_YQF-KX, r_YQF-YX; r_ZQF-BKX, r_ZQF-KX, r_ZQF-YX; r_YF-BKX, r_YF-KX, r_YF-YX; r_ZF-BKX, r_ZF-KX, r_ZF-YX}, a ground-truth label y is generated from the target rank r_t to optimize the training direction of the model, as shown in formula (3):
y_i = e^(-φ(r_t, r_i)) / Σ_k e^(-φ(r_t, r_k))    (3)
where r_i is the i-th prediction rank, r_k is the k-th prediction rank, y_i is the i-th ground-truth label, and φ(r_t, r_i) is a metric function that penalizes the deviation of the predicted rank from the target rank, as shown in formula (4):
When the inter-level distance approaches infinity, y is reduced to one-hot encoded vector; when the distance tends to 0, y tends to be a uniform probability distribution;
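The conversion of one-hot labels into a rank-aware soft distribution described above can be sketched as a softmax over negated penalties. The absolute-difference metric used for φ here is an assumption, since formula (4) is not reproduced in the text:

```python
import math

def soft_labels(target_rank, ranks, phi=lambda a, b: abs(a - b)):
    """Soft label distribution in the spirit of formula (3):
    y_i = exp(-phi(r_t, r_i)) / sum_k exp(-phi(r_t, r_k)).
    phi penalizes deviation from the target rank; the absolute
    difference used as the default is an illustrative assumption."""
    w = [math.exp(-phi(target_rank, r)) for r in ranks]
    s = sum(w)
    return [wi / s for wi in w]
```

With a very large penalty the distribution collapses toward one-hot, and with a vanishing penalty it tends to uniform, matching the two limits stated above.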
the concept of image depth is introduced to distinguish near and distant elements; given a pixel location p = [p_x, p_y]^T in the ground-truth mask, a weight map w(p) is calculated and applied to the per-pixel loss by element-wise multiplication; the weight of a pixel depends on its Euclidean distance d(p) to the nearest segmentation boundary and its vertical position h(p) in the image, as shown in formula (5):
where β is a constant adjustable according to the actual situation; the height map h(p) scales the rate at which the pixel weights increase as one moves away from the boundary and, as a pixel-level multiplicative factor, assigns higher weights to lower pixels; it serves as a raw placeholder for depth data, on the principle that the lower a pixel is in the image, the closer it is to the camera;
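A possible instantiation of the weight map behaviour described above, purely illustrative since formula (5) itself is not reproduced: the weight grows with the boundary distance d(p) at a rate set by β, and the height map h(p) acts as a pixel-level multiplicative factor:

```python
import math

def pixel_weight(d, h, beta=1.0):
    """One plausible form of the weight map described for formula (5):
    the weight increases with distance d(p) from the nearest segmentation
    boundary, saturating at a rate set by beta, and is multiplied by the
    height map h(p), which gives lower (nearer) pixels higher weight.
    The exponential form is an assumption, not the patented formula."""
    return h * (1.0 - math.exp(-beta * d))
```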
s2: based on an extended Kalman filtering framework, establishing a system state model, predicting the prior estimate of the state variable at each step by extended Kalman filtering, updating the covariance matrix of the prior state in real time by an adaptive algorithm based on the filtering innovation, and updating the posterior estimate of the state variable by extended Kalman filtering, thereby realizing adaptive positioning of the vehicle; the method specifically comprises the following steps:
s2.1: vehicle motion data of a vehicle-mounted inertial sensor at the current moment and vehicle position and speed measurement data of a satellite are collected,
s2.2: establishing a system state space model, wherein the system state space model comprises a state equation model and a measurement equation model:
the state equation model is represented by formula (6):
X = [ψ, δV, δ, γ, ζ]^T    (6)
where ψ, δV, δ, γ and ζ are expanded as shown in formula (7):
where ψ denotes the vehicle attitude error variable, with ψ_E, ψ_N, ψ_U the east, north and up attitude errors respectively; δV denotes the vehicle velocity error variable, with δV_E, δV_N, δV_U the east, north and up velocity errors respectively; δ denotes the vehicle position error, with δL, δλ and δh the latitude, longitude and elevation errors of the vehicle position respectively; γ denotes the accelerometer bias in the body coordinate system, with γ_bx, γ_by, γ_bz the biases in the X, Y and Z directions respectively; ζ denotes the gyroscope drift in the body coordinate system, with ζ_bx, ζ_by, ζ_bz the drifts in the X, Y and Z directions respectively;
the measurement equation model is shown in formula (8):
Z = [P_state - P_gps, v_state - v_gps]^T    (8)
where P_state and P_gps denote the position data measured by the vehicle-mounted inertial sensor and the satellite respectively, and v_state and v_gps denote the corresponding velocity data; the measurement variable Z is a vector formed by the differences between the positions and the velocities measured by the vehicle-mounted inertial sensor and the satellite;
s2.3: reading in the measurement information of the vehicle-mounted inertial sensor and performing the prediction step of the extended Kalman filter with the constructed state equation model, so as to obtain the prior estimate x_{k|k-1} of the system state variable at the current moment;
s2.4: adaptive estimation of the prior state covariance matrix P_{k|k-1}:
the filtering innovation v_k is calculated as in formula (9):
v_k = z_k - H_k x_{k|k-1}    (9)
where z_k is the measurement data of the satellite;
the adaptive parameter λ_k is calculated as in formula (10):
where R_k is the observation noise covariance matrix, Q_k is the state noise covariance matrix, P_{k-1} is the posterior state covariance matrix at time k-1, H_k is the observation matrix, and the parameter C_0 is determined by formula (11),
where v_0^T is the transpose of v_0, the filtering innovation at the initial moment, which is given empirically according to the actual situation;
the prior state covariance matrix at the current moment is calculated as in formula (12):
P_{k|k-1} = λ_k F_{k|k-1} P_{k-1} F_{k|k-1}^T + Q_k    (12)
where F_{k|k-1} denotes the state transition matrix from time k-1 to time k;
s2.5: reading in the measurement information z_k of the current satellite and performing the update step of the extended Kalman filter with the constructed measurement equation model, so as to correct the a priori estimated system state variable and obtain the posterior estimate x_{k|k} of the system state variable;
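The predict/update cycle of S2.3-S2.5 can be sketched in scalar form. All matrix quantities of the description are reduced to scalars here, and the adaptive factor λ_k is passed in rather than computed from formulas (10)-(11), so this is an illustrative assumption rather than the patented filter:

```python
def ekf_step(x_prev, P_prev, F, Q, H, R, z, lam=1.0):
    """Scalar sketch of the adaptive EKF cycle in S2.3-S2.5.
    All quantities are scalars standing in for the matrices of the
    description; lam is the adaptive factor lambda_k that inflates
    the prior covariance based on the filtering innovation."""
    # S2.3: prediction step -> prior estimate
    x_prior = F * x_prev
    # S2.4: adaptive prior covariance (formula (12) analogue)
    P_prior = lam * (F * P_prev * F) + Q
    # filtering innovation v_k = z_k - H * x_prior (formula (9) analogue)
    v = z - H * x_prior
    # S2.5: update step -> posterior estimate
    S = H * P_prior * H + R
    K = P_prior * H / S
    x_post = x_prior + K * v
    P_post = (1.0 - K * H) * P_prior
    return x_post, P_post
```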
S3: as shown in fig. 3, the vehicle travel control is performed by selecting a vehicle control mode according to the classification result of the scene classification model; acquiring surrounding environment information through a vehicle, and selecting a corresponding scene classification model to perform multi-modal understanding on the surrounding environment information of the vehicle according to the number of the types of objects in the scene, so as to determine a driving decision according to a scene classification result; then according to the requirement of the navigation task, generating a steering and speed instruction for the vehicle, so that the automatic driving vehicle can automatically identify the emergency situation according to the understanding of the scene when executing the navigation task, and accurately give a control instruction for the vehicle; the method specifically comprises the following steps:
s3.1: according to the scene classification result, calculating a control selection parameter:
the scene classification result is expressed as shown in formula (13):
(P_QF-YX, …, P_ZF-BKX) = θ(o_t; a)    (13)
where the function θ is the convolutional neural network model used for scene perception and outputs the fifteen probabilities, a is the model parameter, and o_t is the currently observed image;
defining the control selection parameters P_QF, P_ZF and P_YF as shown in formulas (14), (15) and (16):
P_QF = P_QF-YX + P_QF-KX + P_QF-BKX    (14)
P_ZF = P_ZF-YX + P_ZF-KX + P_ZF-BKX + a(P_ZQF-YX + P_ZQF-KX + P_ZQF-BKX)    (15)
P_YF = P_YF-YX + P_YF-KX + P_YF-BKX + a(P_YQF-YX + P_YQF-KX + P_YQF-BKX)    (16)
wherein a is an adjustable parameter in the range of 0.5-0.8;
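Formulas (14)-(16) can be sketched as follows. The dictionary keys are assumed label names, and the right-side parameter P_YF is weighted by the right-front (YQF) probabilities on the view that the left-front (ZQF) terms printed in formula (16) should mirror formula (15):

```python
def control_selection_params(p, a=0.65):
    """Control selection parameters per formulas (14)-(16).
    p is a dict of the fifteen class probabilities keyed by labels such
    as 'QF-YX' (front / priority); a is the adjustable parameter in
    [0.5, 0.8]. Key names are assumptions for illustration."""
    p_qf = p["QF-YX"] + p["QF-KX"] + p["QF-BKX"]
    left_front = p["ZQF-YX"] + p["ZQF-KX"] + p["ZQF-BKX"]
    right_front = p["YQF-YX"] + p["YQF-KX"] + p["YQF-BKX"]
    p_zf = p["ZF-YX"] + p["ZF-KX"] + p["ZF-BKX"] + a * left_front
    p_yf = p["YF-YX"] + p["YF-KX"] + p["YF-BKX"] + a * right_front
    return p_qf, p_zf, p_yf
```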
s3.2: selecting a control mode according to the control selection parameters, and controlling the vehicle to move:
in some special cases the vehicle can sense surrounding obstacles, yet all fifteen probability values are small, not enough to drive the vehicle away from nearby objects; to address this problem, the method combines an adaptive weighted control (AWC) algorithm with a regular control (RC) algorithm in the navigation system; the AWC algorithm adds an adaptive adjustment factor according to the environment understanding result and drives the vehicle in the corresponding direction at the corresponding speed; the specific control mode is selected as in formula (17):
where the functions π_AWC and π_RC are the two control modes, AWC denoting the weighted control mode and RC the regular control mode; B is the manually set baseline speed, and the range [A, B] is given empirically; the velocity command u = (v, w) comprises a linear velocity and an angular velocity;
for the regular control mode, the linear velocity and the angular velocity controlling the vehicle travel are expressed by formulas (18) and (19):
v = V_l P_QF    (18)
w = Ω_a (P_ZF - P_YF)    (19)
where the linear velocity v is the product of P_QF and the baseline velocity V_l, and the angular velocity w is the product of the difference between P_ZF and P_YF and the baseline angular velocity Ω_a;
for the weighted control mode, the advancing speed is reduced and the turning speed is increased by using the adjustable empirical parameters;
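The regular control mode of formulas (18)-(19) can be sketched as follows; the baseline velocity and baseline angular velocity defaults are assumptions:

```python
def regular_control(p_qf, p_zf, p_yf, v_baseline=0.5, omega_baseline=1.0):
    """Regular control (RC) mode per formulas (18)-(19):
    linear velocity v = V_l * P_QF and angular velocity
    w = Omega_a * (P_ZF - P_YF). Baseline values are assumed."""
    v = v_baseline * p_qf
    w = omega_baseline * (p_zf - p_yf)
    return v, w
```

A positive w (left probability exceeding right) turns the vehicle toward the more open side, while v scales with the confidence that the path ahead is clear.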
s4: establishing an automatic driving vehicle navigation framework according to S1, S2 and S3; as shown in fig. 4, the navigation process of the framework comprises: collecting the image of the environment around the automatic driving vehicle at the current moment as the input of the scene classification model, performing driving control of the vehicle according to the output of the scene classification model, judging the current vehicle position according to the adaptive positioning result, and iterating until the automatic driving vehicle reaches the target position;
in this embodiment, the automatic driving vehicle navigation model is established as follows: the navigation model consists of a sensor, a control mode decision, a controller and adaptive positioning; the input of the navigation model is the processed current scene picture of the automatic driving vehicle; the sensor is responsible for image classification and generates the fifteen probabilities P_QF-YX, P_QF-KX, P_QF-BKX, P_YQF-YX, P_YQF-KX, P_YQF-BKX, P_ZQF-YX, P_ZQF-KX, P_ZQF-BKX, P_YF-YX, P_YF-KX, P_YF-BKX, P_ZF-YX, P_ZF-KX and P_ZF-BKX as the input of the control mode decision; the probability values correspond to the feasible directions of vehicle motion and indirectly determine the control mode and the speed command of the vehicle; the adaptive positioning is used to determine the vehicle position and judge the navigation progress; the method specifically comprises the following steps:
s4.1: activating the camera, acquiring the current environment information around the vehicle with the vehicle-mounted camera, and positioning the vehicle at the current moment with the adaptive positioning scheme of S2; activating the camera means sending a shooting activation signal to the camera so that it captures the environment information around the vehicle; the environment information captured by the camera is image information and is acquired periodically;
s4.2: according to the Kullback-Leibler (KL) distance method, selecting the scene classification model of high, medium or low complexity:
d_i = E_p[log(p / p_iclass)]
where p denotes the posterior probability of the current driving environment category, i indexes the complexity categories, and p1class, p2class and p3class denote the posterior probability densities of low, medium and high complexity respectively; d_i is the KL distance and E_p is the expectation function; the complexity level i of the current driving scene is determined by the minimum d_i, and the scene classification model of the corresponding complexity level is selected;
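The KL-distance model selection of S4.2 can be sketched with the standard discrete KL divergence; the reference distributions per complexity level are assumed inputs:

```python
import math

def kl_divergence(p, q):
    """Discrete KL distance D(p || q) = sum_x p(x) log(p(x)/q(x))."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def select_model(p, candidates):
    """Select the complexity level whose reference posterior density is
    closest (smallest KL distance) to the current driving-scene posterior
    p. candidates maps a level name to its reference distribution; the
    level names and distributions here are illustrative assumptions."""
    return min(candidates, key=lambda lvl: kl_divergence(p, candidates[lvl]))
```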
s4.3: classifying the current scene by using a scene classification model with corresponding complexity, and outputting fifteen scene probability value results by using the scene classification model;
s4.4: judging whether the driving decision is feasible or not according to the navigation task, the surrounding environment and the traffic regulations; if the driving decision is feasible, selecting a vehicle control mode by using a scene classification model output result, and generating an angular speed and a driving speed instruction for controlling steering;
if the driving decision is not feasible, generating prompt information and sending the prompt information to a user, wherein the prompt information comprises the reason why the driving decision is not feasible and the driving decision for recommending replacement;
s4.5: judging whether the vehicle has reached the key point according to the adaptive positioning result; if the end point is reached, the navigation result is obtained; if not, returning to S4.1 to continue the navigation.
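The overall navigation loop of S4 can be sketched with placeholder callables for the modules described above; none of the names below are from the source:

```python
def navigate(get_image, classify, select_mode, control, localize, at_goal,
             max_steps=1000):
    """Sketch of the navigation loop in S4 / fig. 4: classify the current
    scene, decide a control mode, issue a velocity command, and check the
    adaptive-positioning result until the goal is reached. All callables
    are assumed stand-ins for the modules described in the text."""
    for _ in range(max_steps):
        image = get_image()        # S4.1: on-board camera frame
        probs = classify(image)    # S4.3: fifteen scene probabilities
        mode = select_mode(probs)  # S3.2: AWC or RC mode decision
        control(mode, probs)       # steering / speed command
        if at_goal(localize()):    # S4.5: adaptive positioning check
            return True
    return False
```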
Claims (10)
1. An automatic driving vehicle navigation method based on scene classification is characterized in that: the method comprises the following steps:
s1: acquiring and classifying images of the surrounding environment of the automatic driving vehicle by using a vehicle-mounted camera; establishing a scene classification model based on a convolutional neural network technology, and training the scene classification model;
s2: based on an extended Kalman filtering framework, a system state model is established, the prior estimate of the state variable at each step is predicted by extended Kalman filtering, the covariance matrix of the prior state is updated in real time by an adaptive algorithm based on the filtering innovation, and the posterior estimate of the state variable is updated by extended Kalman filtering, realizing adaptive positioning of the vehicle;
s3: selecting a vehicle control mode according to the classification result of the scene classification model to control the running of the vehicle;
s4: establishing an automatic driving automobile navigation frame; the navigation process of the navigation framework comprises the following steps: and collecting the surrounding environment image of the automatic driving vehicle at the current moment as the input of a scene classification model, carrying out the driving control of the vehicle according to the output of the scene classification model, judging the current vehicle position according to the self-adaptive positioning result, and repeating the steps until the automatic driving vehicle reaches the target position.
2. The method of scene classification based autonomous vehicle navigation according to claim 1, characterized in that:
the S1 specifically comprises the following steps:
s1.1: acquiring a sample data set: shooting scene pictures of different environments by using a vehicle-mounted camera, and classifying the scene pictures;
s1.2: establishing a scene classification model based on a convolutional neural network architecture, carrying out normalization processing on a data set, and training the established scene classification model by using a training set;
s1.3: adjusting the hyper-parameters of the scene classification model according to the number of the object types in the scene, wherein the hyper-parameters comprise a learning rate, an attenuation rate, iteration times, the number of layers, the number of neurons in each layer, batch size and the weight of each part in a loss function; and repeating the S1.2, and training the scene classification models suitable for different complexities, including scene classification models with high, medium and low complexities.
3. The method of scene classification-based autonomous vehicle navigation according to claim 2, characterized in that:
the difference of the scene complexity represents the difference of the number of the object types contained in the scene picture in the sample data set; the complexity is low when the number of the object types is less than or equal to three, medium complexity when the number of the object types is more than three and less than seven, and high complexity when the number of the object types is more than seven.
4. The method of claim 1, wherein the method comprises:
the S1.1 specifically comprises the following steps:
s1.1.1: manually controlling the vehicle to run, acquiring an environment image through a camera arranged at a preset position of the vehicle, generating a sample data set, and recording the shooting direction of each sample data;
the camera can be twisted by plus or minus eighty degrees; the sample data shooting direction is divided into five directions, namely a front direction, a left front direction, a right front direction, a left direction and a right direction;
s1.1.2: performing semantic classification on each sample scene picture in the sample data set, and selecting a scene classification label for each sample scene picture according to an actual scene;
the semantic classification represents classifying things in the sample scene picture according to categories, including ten categories of highways, non-motor vehicle roads, pedestrians, vehicles, buildings, street lamps, trees, grasslands, sand lands and sky; the scene classification labels comprise fifteen label categories, namely, three conditions of priority, feasibility and infeasibility respectively correspond to the right front, the left front, the right front, the left side and the right side;
s1.1.3: and dividing the sample data set subjected to scene classification into a training set and a test set according to a proportion.
5. The method of scene classification based autonomous vehicle navigation according to claim 1, characterized in that:
the S1.2 specifically comprises the following steps:
s1.2.1: establishing a scene classification model based on a shallow convolutional neural network architecture using the existing TensorFlow open-source framework; the model comprises convolutional layers, pooling layers and fully connected layers; each convolutional and pooling layer outputs x feature maps, feature fusion is performed through the fully connected layers, and a Softmax function divides the scene into fifteen probability values for output, as shown in formula (1):
p_i = e^(z_i) / Σ_c e^(z_c)    (1)
where z_i is the output value of the i-th node of the fully connected layer and z_c is the output value of the c-th node;
s1.2.2: adjusting the training set samples after scene classification processing to a size of y × y as the input of the scene classification model, and performing a unified normalization on the training set and the test set:
x_norm = (x_RGB - min)/(max - min) - b    (2)
where x_RGB is the color value of each of the three channels; max and min are the extreme values of the current image, set to 255 and 0, respectively; the pixel intensity of each image is adjusted to [-b, 1-b];
S1.2.3: and training the scene classification model by using a training set in the sample data set.
6. The method of scene classification-based autonomous vehicle navigation of claim 5, characterized in that:
s1.2.3 specifically comprises the steps of selecting a training platform and setting training iteration times, batch processing size, learning rate and attenuation rate; initializing the weight and the deviation parameter by a random function, and optimizing after each iteration;
using dropout in a first full-connection layer, improving the performance of a GPU by using a cuDNN acceleration library, carrying out nonlinear activation by using a ReLU function, and carrying out cross entropy loss optimization by using an Adam optimizer;
converting the one-hot coded labels into a max-normalized probability distribution so that the target rank has the highest probability value; given a set of label ranks R = {r_QF-BKX, r_QF-KX, r_QF-YX; r_YQF-BKX, r_YQF-KX, r_YQF-YX; r_ZQF-BKX, r_ZQF-KX, r_ZQF-YX; r_YF-BKX, r_YF-KX, r_YF-YX; r_ZF-BKX, r_ZF-KX, r_ZF-YX}, a ground-truth label y_i is generated from the target rank r_t and the prediction rank r_i, and the model training direction is optimized, as shown in formula (3):
y_i = e^(-φ(r_t, r_i)) / Σ_k e^(-φ(r_t, r_k))    (3)
where r_i is the i-th prediction rank, r_k is the k-th prediction rank, y_i is the i-th ground-truth label, and φ(r_t, r_i) is a metric function penalizing the deviation of the prediction rank from the target rank, as shown in formula (4):
when |r_i - r_t| > N, y reduces to a one-hot coded vector; when |r_i - r_t| < ε, y tends to a uniform probability distribution;
given a pixel location p = [p_x, p_y]^T in the ground-truth mask, a weight map w(p) is calculated and applied to the per-pixel loss by element-wise multiplication; the weight of a pixel depends on its Euclidean distance d(p) to the nearest segmentation boundary and its vertical position h(p) in the image, as shown in formula (5):
where β is a constant adjustable according to the actual situation; h(p) scales the rate at which the pixel weight increases away from the boundary and, as a pixel-level multiplicative factor, gives lower pixels higher weight.
7. The method of scene classification based autonomous vehicle navigation according to claim 1, characterized in that:
the S2 comprises the following steps:
s2.1: collecting vehicle motion data of a vehicle-mounted inertial sensor at the current moment and vehicle position and speed measurement data of a satellite,
s2.2: establishing a system state space model, wherein the system state space model comprises a state equation model and a measurement equation model;
s2.3: reading in the measurement information of the vehicle-mounted inertial sensor and performing the prediction step of the extended Kalman filter with the constructed state equation model, so as to obtain the prior estimate x_{k|k-1} of the system state variable at the current moment;
s2.4: adaptive estimation of the prior state covariance matrix P_{k|k-1}:
the filtering innovation v_k is calculated as in formula (9):
v_k = z_k - H_k x_{k|k-1}    (9)
where z_k is the measurement data of the satellite;
the adaptive parameter λ_k is calculated as in formula (10):
where R_k is the observation noise covariance matrix, Q_k is the state noise covariance matrix, P_{k-1} is the posterior state covariance matrix at time k-1, H_k is the observation matrix, H_k^T is its transpose, and the parameter C_0 is determined by formula (11),
where v_0^T is the transpose of v_0, the filtering innovation at the initial moment, which is given empirically according to the actual situation;
the prior state covariance matrix at the current moment is calculated as in formula (12):
P_{k|k-1} = λ_k F_{k|k-1} P_{k-1} F_{k|k-1}^T + Q_k    (12)
where F_{k|k-1} denotes the state transition matrix from time k-1 to time k;
s2.5: reading in the measurement information z_k of the current satellite and performing the update step of the extended Kalman filter with the constructed measurement equation model, so as to correct the a priori estimated system state variable and obtain the posterior estimate x_{k|k} of the system state variable.
8. The method of scene classification-based autonomous vehicle navigation of claim 7, characterized in that:
the state equation model is represented by formula (6):
X = [ψ, δV, δ, γ, ζ]^T    (6)
where ψ, δV, δ, γ and ζ are expanded as shown in formula (7):
where ψ denotes the vehicle attitude error variable, with ψ_E, ψ_N, ψ_U the east, north and up attitude errors respectively; δV denotes the vehicle velocity error variable, with δV_E, δV_N, δV_U the east, north and up velocity errors respectively; δ denotes the vehicle position error, with δL, δλ and δh the latitude, longitude and elevation errors of the vehicle position respectively; γ denotes the accelerometer bias in the body coordinate system, with γ_bx, γ_by, γ_bz the biases in the X, Y and Z directions respectively; ζ denotes the gyroscope drift in the body coordinate system, with ζ_bx, ζ_by, ζ_bz the drifts in the X, Y and Z directions respectively;
the measurement equation model is shown in formula (8):
Z = [P_state - P_gps, v_state - v_gps]^T    (8)
where P_state and P_gps denote the position data measured by the vehicle-mounted inertial sensor and the satellite respectively, and v_state and v_gps denote the corresponding velocity data; the measurement variable Z is a vector formed by the differences between the positions and the velocities measured by the vehicle-mounted inertial sensor and the satellite.
9. The method of scene classification based autonomous vehicle navigation according to claim 1, characterized in that:
the S3 comprises the following steps:
s3.1: according to the scene classification result, calculating control selection parameters:
the scene classification result is expressed as shown in formula (13):
where the function θ is the convolutional neural network model for scene perception and outputs the fifteen probabilities P_QF-YX, P_QF-KX, P_QF-BKX, P_YQF-YX, P_YQF-KX, P_YQF-BKX, P_ZQF-YX, P_ZQF-KX, P_ZQF-BKX, P_YF-YX, P_YF-KX, P_YF-BKX, P_ZF-YX, P_ZF-KX and P_ZF-BKX, in which QF, YQF, ZQF, YF and ZF denote front, right front, left front, right and left respectively, and YX, KX and BKX denote priority, feasible and infeasible respectively; a is the model parameter, o_t is the currently observed image, and d_i is the KL distance used to select models of different complexity;
defining the control selection parameters P_QF, P_ZF and P_YF as shown in formulas (14), (15) and (16):
P_QF = P_QF-YX + P_QF-KX + P_QF-BKX (14)
P_ZF = P_ZF-YX + P_ZF-KX + P_ZF-BKX + a(P_ZQF-YX + P_ZQF-KX + P_ZQF-BKX) (15)
P_YF = P_YF-YX + P_YF-KX + P_YF-BKX + a(P_YQF-YX + P_YQF-KX + P_YQF-BKX) (16)
where a is an adjustable parameter in the range 0.5 to 0.8;
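The control selection parameters of formulas (14) to (16) can be sketched as follows (an illustrative sketch, not part of the claims; the dictionary key scheme and the default value of a are assumptions, and formula (16) is taken with the right-front terms, mirroring the left-front terms of formula (15)):

```python
def control_selection_params(p, a=0.6):
    """Control selection parameters of formulas (14)-(16).
    `p` maps keys such as 'QF-YX' to the fifteen scene probabilities;
    `a` is the adjustable weight in [0.5, 0.8] (default assumed).
    Directions: QF front, YQF right-front, ZQF left-front, YF right, ZF left."""
    s = lambda d: p[f'{d}-YX'] + p[f'{d}-KX'] + p[f'{d}-BKX']
    p_qf = s('QF')                 # formula (14): front terms only
    p_zf = s('ZF') + a * s('ZQF')  # formula (15): left plus weighted left-front
    p_yf = s('YF') + a * s('YQF')  # formula (16): right plus weighted right-front
    return p_qf, p_zf, p_yf
```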
S3.2: selecting a control mode according to the control selection parameters, and controlling the vehicle to move:
where the functions π_AWC and π_RC are the two control modes, AWC denoting the weighted control mode and RC the regular control mode; B is the manually set baseline speed, and the range of [A, B] is given empirically; the velocity command u = (v, w) comprises a linear velocity and an angular velocity;
for the regular control mode, the linear velocity and angular velocity at which the vehicle is controlled to travel are given by formulas (18) and (19):
v = V_l · P_QF (18)
w = Ω_a · (P_ZF - P_YF) (19)
where the linear velocity v is the product of P_QF and the baseline velocity V_l, and the angular velocity w is the product of the difference between P_ZF and P_YF and the baseline angular velocity Ω_a;
for the weighted control mode, the forward speed is reduced and the turn speed is increased using adjustable empirical parameters.
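The regular control mode of formulas (18) and (19) reduces to two products; a minimal sketch, with hypothetical baseline values for V_l and Ω_a:

```python
def regular_control(p_qf, p_zf, p_yf, v_l=1.0, omega_a=0.8):
    """Regular control mode, formulas (18)-(19): scale the baseline linear
    velocity V_l by the front parameter P_QF, and steer toward the side
    with the larger control selection parameter. The baseline values
    v_l and omega_a are hypothetical."""
    v = v_l * p_qf               # formula (18): forward speed
    w = omega_a * (p_zf - p_yf)  # formula (19): positive turns left, negative right
    return v, w
```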
10. The automatic driving vehicle navigation method based on scene classification according to claim 1, characterized in that: S4 comprises the following steps:
S4.1: acquiring current scene information with the vehicle-mounted camera, and obtaining the vehicle position at the current moment by the adaptive positioning method;
S4.2: selecting among the scene classification models of high, medium and low complexity according to the Kullback-Leibler (KL) distance method:
where p denotes the posterior probability of the current driving-environment category and is compared against the posterior probability densities of the low-, medium- and high-complexity models, d_i is the KL distance, E_p is the expectation function, and i is the scene complexity level; the complexity level i of the current driving scene is determined according to (19) by the minimum value of d_i, and the scene classification model of the corresponding complexity level is selected;
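The KL-distance model selection of S4.2 can be sketched as follows (an illustrative sketch, not part of the claims), assuming discrete posteriors and the standard form d_i = E_p[log(p / q_i)]; the example densities are hypothetical:

```python
import numpy as np

def select_model_by_kl(p, candidate_densities):
    """Select the scene classification model whose posterior density q_i is
    closest to the current posterior p under the KL distance
    d_i = E_p[log(p / q_i)]; the index of the minimum d_i is taken as the
    complexity level (0 = low, 1 = medium, 2 = high)."""
    p = np.asarray(p, dtype=float)
    dists = [float(np.sum(p * np.log(p / np.asarray(q, dtype=float))))
             for q in candidate_densities]
    return int(np.argmin(dists)), dists

# Hypothetical posteriors over three scene categories
level, d = select_model_by_kl([0.7, 0.2, 0.1],
                              [[0.6, 0.3, 0.1],   # low-complexity model
                               [0.7, 0.2, 0.1],   # medium-complexity model
                               [0.3, 0.3, 0.4]])  # high-complexity model
```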
S4.3: classifying the current scene using the scene classification model of the corresponding complexity, the model outputting the fifteen scene probability values;
S4.4: judging whether the driving decision is feasible according to the navigation task, the surrounding environment and the traffic regulations; if the driving decision is feasible, selecting a vehicle control mode from the scene classification model output, and generating the angular velocity and driving speed commands that control steering and travel;
if the driving decision is not feasible, generating prompt information and sending it to the user, the prompt information comprising the reason the driving decision is infeasible and a recommended alternative driving decision;
S4.5: judging whether the vehicle has reached a key point according to the adaptive positioning result; if the end point has been reached, the navigation result is obtained; if not, returning to S3.1 to continue navigation.
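The overall S4 cycle can be sketched with each stage passed in as a callable (all interfaces hypothetical; an illustrative sketch, not part of the claims):

```python
def navigation_loop(sense, select_model, classify, decide, act, reached,
                    max_steps=1000):
    """One possible shape of the S4.1-S4.5 cycle, with each stage supplied
    as a callable. Returns the number of control cycles executed."""
    for step in range(1, max_steps + 1):
        obs = sense()                      # S4.1: camera image + adaptive position
        model = select_model(obs)          # S4.2: KL-based complexity choice
        probs = classify(model, obs)       # S4.3: fifteen scene probabilities
        feasible, command = decide(probs)  # S4.4: feasibility check + (v, w)
        if feasible:
            act(command)                   # apply linear/angular velocity
        if reached(obs):                   # S4.5: end-point check
            return step
    return max_steps
```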
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210925922.8A CN115285143A (en) | 2022-08-03 | 2022-08-03 | Automatic driving vehicle navigation method based on scene classification |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115285143A true CN115285143A (en) | 2022-11-04 |
Family
ID=83825349
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210925922.8A Pending CN115285143A (en) | 2022-08-03 | 2022-08-03 | Automatic driving vehicle navigation method based on scene classification |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115285143A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108154472A (en) * | 2017-11-30 | 2018-06-12 | 惠州市德赛西威汽车电子股份有限公司 | Merge the parking position visible detection method and system of navigation information |
CN112232126A (en) * | 2020-09-14 | 2021-01-15 | 广东工业大学 | Dimension reduction expression method for improving variable scene positioning robustness |
WO2021017212A1 (en) * | 2019-07-26 | 2021-02-04 | 魔门塔(苏州)科技有限公司 | Multi-scene high-precision vehicle positioning method and apparatus, and vehicle-mounted terminal |
CN112868022A (en) * | 2018-10-16 | 2021-05-28 | 法弗人工智能有限公司 | Driving scenarios for autonomous vehicles |
JP2022000376A (en) * | 2020-09-23 | 2022-01-04 | 阿波▲羅▼智▲聯▼(北京)科技有限公司 | Automatic drive function control method, device, electronic apparatus and storage medium |
CN114415224A (en) * | 2022-01-25 | 2022-04-29 | 东北大学 | Vehicle fusion positioning system and method in complex limited environment |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2024124446A1 (en) * | 2022-12-14 | 2024-06-20 | Siemens Aktiengesellschaft | Measurement method, apparatus, system and electronic device for a motion control system |
CN116013091A (en) * | 2023-03-24 | 2023-04-25 | 山东康威大数据科技有限公司 | Tunnel monitoring system and analysis method based on traffic flow big data |
CN117574111A (en) * | 2024-01-15 | 2024-02-20 | 大秦数字能源技术股份有限公司 | BMS algorithm selection method, device, equipment and medium based on scene state |
CN117574111B (en) * | 2024-01-15 | 2024-03-19 | 大秦数字能源技术股份有限公司 | BMS algorithm selection method, device, equipment and medium based on scene state |
CN117975736A (en) * | 2024-03-29 | 2024-05-03 | 北京市计量检测科学研究院 | Unmanned vehicle road cooperative application scene test method and system |
CN117975736B (en) * | 2024-03-29 | 2024-06-07 | 北京市计量检测科学研究院 | Unmanned vehicle road cooperative application scene test method and system |
CN118097604A (en) * | 2024-04-24 | 2024-05-28 | 广汽埃安新能源汽车股份有限公司 | Intelligent vehicle environment sensing method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||