CN112801201A - Deep learning visual inertial navigation combined navigation design method based on standardization - Google Patents

Deep learning visual inertial navigation combined navigation design method based on standardization

Info

Publication number
CN112801201A
Authority
CN
China
Prior art keywords
module
deep learning
inertial navigation
main module
sub
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110171232.3A
Other languages
Chinese (zh)
Other versions
CN112801201B (en)
Inventor
胡斌杰
丘金光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT filed Critical South China University of Technology SCUT
Priority to CN202110171232.3A priority Critical patent/CN112801201B/en
Publication of CN112801201A publication Critical patent/CN112801201A/en
Application granted granted Critical
Publication of CN112801201B publication Critical patent/CN112801201B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C 21/00 Navigation; Navigational instruments not provided for in groups G01C 1/00 - G01C 19/00
    • G01C 21/10 Navigation; Navigational instruments not provided for in groups G01C 1/00 - G01C 19/00 by using measurements of speed or acceleration
    • G01C 21/12 Navigation; Navigational instruments not provided for in groups G01C 1/00 - G01C 19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning
    • G01C 21/16 Navigation; Navigational instruments not provided for in groups G01C 1/00 - G01C 19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation
    • G01C 21/165 Navigation; Navigational instruments not provided for in groups G01C 1/00 - G01C 19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation combined with non-inertial navigation instruments
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C 21/00 Navigation; Navigational instruments not provided for in groups G01C 1/00 - G01C 19/00
    • G01C 21/26 Navigation; Navigational instruments not provided for in groups G01C 1/00 - G01C 19/00 specially adapted for navigation in a road network
    • G01C 21/28 Navigation; Navigational instruments not provided for in groups G01C 1/00 - G01C 19/00 specially adapted for navigation in a road network with correlation of data from several navigational instruments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/20 Analysis of motion
    • G06T 7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30248 Vehicle exterior or interior
    • G06T 2207/30252 Vehicle exterior; Vicinity of vehicle

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Remote Sensing (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Automation & Control Theory (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a standardization-based deep learning visual-inertial combined navigation design method, which comprises the following steps: a standardization operation is designed and applied to the training-set labels, i.e. the mean and variance of the labels are computed, the labels are transformed to a distribution with mean 0 and variance 1, and the obtained mean and variance are stored; in the network design, in order to balance the contributions of the image data and the inertial navigation data, the inertial navigation features and the image features are mapped by the network to the same dimension m; and in the validation or test stage, the result output by the network is inverse-standardized with the previously computed mean and variance to obtain the final result. In this standardization-based deep learning visual-inertial combined navigation design method, standardizing the training-set labels removes the need to select a balance factor in the objective function, improves the generalization ability of the neural network, and improves the accuracy of relative pose prediction.

Description

Deep learning visual inertial navigation combined navigation design method based on standardization
Technical Field
The invention relates to the technical field of sensor fusion and motion estimation, in particular to a deep learning visual inertial navigation combined navigation design method based on standardization.
Background
With the continuous development of autonomous driving and unmanned aerial vehicles, high-precision, highly robust positioning has become an important prerequisite for autonomous navigation and for tasks such as exploring unknown areas. In a purely visual odometry approach, the system acquires information about the surrounding environment with a visual sensor and estimates its motion state by analyzing the image data. A visual-inertial odometer adds Inertial Measurement Unit (IMU) information to the purely visual odometer, which can improve the accuracy of motion-state estimation when visual information is lost.
Conventional visual-inertial odometry has been researched thoroughly, but problems such as data loss and data corruption are still not well solved, and a large amount of manual feature selection and extrinsic calibration between the sensors is required, which is undoubtedly time-consuming. In recent years, deep learning has achieved enormous success in the field of computer vision and is widely applied in many fields. Visual-inertial combined navigation can be treated as a regression task and can likewise be trained with deep learning. In existing deep-learning-based visual-inertial combined navigation tasks, the objective function balances the learning of translation and rotation through a balance factor, and finding this balance factor requires a large amount of training time, which undoubtedly consumes manpower and material resources. To address this problem, a new objective function needs to be designed that avoids setting the balance factor manually.
Disclosure of Invention
The invention aims to overcome the above defects in the prior art and provides a standardization-based deep learning visual-inertial combined navigation design method that avoids manually setting a balance factor between relative translation and relative rotation.
The purpose of the invention can be achieved by adopting the following technical scheme:
a deep learning visual inertial navigation combined navigation design method based on standardization comprises the following steps:
s1, establishing a deep learning network model, wherein the deep learning network model comprises a first main module, a second main module and a third main module, and the first main module is formed by stacking 10 layers of CNNs and is called as a main module A; the second master module comprises two layers of Bi-LSTM, referred to as master module B; the third main module is called as a main module C, the main module C comprises a first sub module, a second sub module and a third sub module, the first sub module is an Attention sub module, the second sub module is a two-layer Bi-LSTM sub module, and the third sub module is a full-connection layer sub module; inputting image data to a main module A to extract image characteristics; inputting inertial navigation data into the main module B to extract inertial navigation characteristics, and ensuring that the dimensionality of the inertial navigation characteristics is consistent with the dimensionality of image characteristics; the image characteristic and the inertial navigation characteristic are serially connected and input into an Attention submodule in a main module C, the output of the Attention submodule is multiplied by the input of the Attention submodule and then input into two layers of Bi-LSTM submodules of the main module C, and the output of the two layers of Bi-LSTM submodules is input into a fully-connected layer submodule to output a result;
s2, designing a loss function, standardizing the labels of the training set, transforming the labels of the training set into a distribution with a mean value of 0 and a variance of 1, storing the mean value and the variance obtained by standardized calculation, and subtracting the standardized labels from the output of the all-connection layer sub-module in the main module C to obtain a final loss function;
s3, training and storing results, wherein the activating functions adopted by the sub-modules of the full connection layer in the main module A and the main module C are Relu, the activating functions adopted by the Attention sub-module in the main module C are Relu and Sigmoid, training data are input to train the deep learning network model constructed in the step S1, and the deep learning network model is stored to an appointed path after training is finished;
and S4, inputting the test data into the deep learning network model obtained by training in the step S3 to obtain an output result, and then carrying out inverse standardization on the mean value and the variance obtained in the step S2 to obtain a prediction result.
Further, the navigation design method further comprises a test verification step, and the process is as follows:
and simulating four extreme cases, namely no data corruption, image data occluded by foreign objects, inertial navigation data loss, and image data loss; the corresponding test data for the four extreme cases are input into the deep learning network model trained in step S3 for testing, and the output of the deep learning network model is inverse-standardized with the mean and variance saved in step S2 to obtain the prediction result.
Further, the navigation design method divides the training set and the test set as follows: sequences 00-08 of the KITTI dataset are used as the training set, and sequences 09 and 10 are used as the test set.
Further, main module A is formed by stacking 10 layers of CNNs in sequence, wherein each CNN layer is a two-dimensional convolution; the convolution kernels of the first three CNN layers are 7×7, 5×5 and 5×5, and the convolution kernels of the last seven CNN layers are 3×3. Main module B consists of two layers of Bi-LSTM, each containing 512 neurons. Main module C comprises an Attention sub-module, a two-layer Bi-LSTM sub-module and a fully-connected layer sub-module, wherein each Bi-LSTM layer in the two-layer Bi-LSTM sub-module contains 1000 neurons. The Attention sub-module consists of two fully-connected layers, the activation function of the first fully-connected layer being ReLU and that of the second being Sigmoid. The fully-connected layer sub-module is formed by cascading four fully-connected layers with 512, 128, 64 and 6 neurons respectively.
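By way of illustration, the following is a minimal PyTorch sketch of main module C (the Attention sub-module, the two-layer Bi-LSTM sub-module and the fully-connected layer sub-module) as just described; main modules A and B are omitted, the hidden size of the Attention layers and the sequence handling are assumptions, and all class and variable names are hypothetical rather than taken from the patent.

    # Hedged sketch of main module C: an Attention sub-module (two fully-connected
    # layers with ReLU and Sigmoid) whose output re-weights its input, followed by a
    # two-layer Bi-LSTM (1000 neurons per layer) and a 512-128-64-6 fully-connected head.
    import torch
    import torch.nn as nn

    class AttentionSubmodule(nn.Module):
        def __init__(self, dim):
            super().__init__()
            self.fc1 = nn.Linear(dim, dim)   # first fully-connected layer, ReLU activation
            self.fc2 = nn.Linear(dim, dim)   # second fully-connected layer, Sigmoid activation

        def forward(self, x):
            w = torch.sigmoid(self.fc2(torch.relu(self.fc1(x))))
            return x * w                     # output multiplied element-wise with the input

    class MainModuleC(nn.Module):
        def __init__(self, fused_dim):
            super().__init__()
            self.attention = AttentionSubmodule(fused_dim)
            self.bilstm = nn.LSTM(fused_dim, 1000, num_layers=2,
                                  batch_first=True, bidirectional=True)
            self.head = nn.Sequential(
                nn.Linear(2 * 1000, 512), nn.ReLU(),
                nn.Linear(512, 128), nn.ReLU(),
                nn.Linear(128, 64), nn.ReLU(),
                nn.Linear(64, 6),            # relative translation (x,y,z) and rotation (x,y,z)
            )

        def forward(self, img_feat, imu_feat):
            # image features and inertial navigation features of equal dimension are concatenated
            fused = torch.cat([img_feat, imu_feat], dim=-1)
            out, _ = self.bilstm(self.attention(fused))
            return self.head(out)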
Further, the standardization in step S2 proceeds as follows:
The mean of the training-set labels is computed as

$$u = \frac{1}{n}\sum_{j=1}^{n} Y_{raw}^{(j)}$$

The variance of the training-set labels is computed as

$$\sigma^{2} = \frac{1}{n}\sum_{j=1}^{n}\left(Y_{raw}^{(j)} - u\right)^{2}$$

The training-set labels are standardized as

$$\hat{Y} = \frac{Y_{raw} - u}{\sigma}$$

wherein n is the number of training-set labels; $Y_{raw}$ is an original training-set label, comprising the relative translations along the x, y, z axes and the relative rotations about the x, y, z axes, with dimension 6; $u$ is the mean of the relative translations and relative rotations, with dimension 6; $\sigma^{2}$ is the variance of the relative translations and relative rotations, with dimension 6; $\sigma$ is the standard deviation corresponding to $\sigma^{2}$; and $\hat{Y}$ is the standardized label. All operations are applied element-wise over the 6 label dimensions.
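As an illustration, the standardization and the corresponding inverse operation of step S4 can be written as in the following hedged NumPy sketch; the function and file names are assumptions for illustration only, not the patented code.

    # Hedged sketch: z-score standardization of the 6-D training labels (relative
    # translation x,y,z and relative rotation x,y,z) and its inverse, used at test time.
    import numpy as np

    def standardize_labels(y_raw):
        """y_raw: array of shape (n, 6). Returns standardized labels and the saved mean/std."""
        u = y_raw.mean(axis=0)        # per-dimension mean, shape (6,)
        sigma = y_raw.std(axis=0)     # per-dimension standard deviation, shape (6,)
        return (y_raw - u) / sigma, u, sigma

    def inverse_standardize(y_pred, u, sigma):
        """y_pred: network output of shape (..., 6). Returns the final relative-pose prediction."""
        return y_pred * sigma + u

    # usage: standardize once before training, keep u and sigma for step S4
    # y_norm, u, sigma = standardize_labels(train_labels)
    # np.savez("label_stats.npz", u=u, sigma=sigma)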
Further, the loss function used in step S3 is as follows:

$$Loss = \frac{1}{B}\sum_{i=1}^{B}\sum_{t=1}^{k}\left|\hat{y}_{i,t} - \hat{Y}_{i,t}\right|$$

wherein B is the number of samples in a single input batch during training and i is the index of a sample within the batch; k is the dimension of the label, equal to 6; t indexes the elements shared by the output of the fully-connected layer sub-module in main module C and the standardized label; $\hat{Y}_{i,t}$ is the single element at position t of the standardized label $\hat{Y}_{i}$; $\hat{y}_{i,t}$ is the single element at position t of the output of the fully-connected layer sub-module in the third main module C after the i-th group of data of the batch has been fed into the deep learning network model; and |·| denotes taking the absolute value of the difference.
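For illustration, the loss above can be written in a few lines of PyTorch; this is a hedged sketch rather than the patented implementation, and it assumes the network output and the standardized labels are tensors of shape (B, 6).

    # Hedged sketch: mean over the batch of the summed absolute differences between
    # the network output and the standardized 6-D labels.
    import torch

    def standardized_l1_loss(pred, y_norm):
        """pred, y_norm: tensors of shape (B, 6); returns a scalar loss."""
        return torch.abs(pred - y_norm).sum(dim=1).mean()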
Further, in step S3 the learning rate is fixed at 0.0001, the number of epochs is 200, the batch size is 8, and an Adam optimizer is used.
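The training configuration just stated could be realized, for instance, as in the following hedged PyTorch sketch; here `model` stands for the full network formed by the three main modules, `train_dataset` yields image data, inertial data and already standardized labels, and all names and the (batch, 6) output shape are assumptions rather than the patented code.

    # Hedged sketch: Adam optimizer, fixed learning rate 1e-4, 200 epochs, batch size 8.
    import torch
    from torch.utils.data import DataLoader

    def train(model, train_dataset, device="cuda"):
        train_loader = DataLoader(train_dataset, batch_size=8, shuffle=True)
        optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
        model.to(device).train()
        for epoch in range(200):
            for images, imu, y_norm in train_loader:          # labels standardized in step S2
                images, imu, y_norm = images.to(device), imu.to(device), y_norm.to(device)
                pred = model(images, imu)                      # assumed (batch, 6) output
                loss = torch.abs(pred - y_norm).sum(dim=1).mean()
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
        torch.save(model.state_dict(), "model.pth")            # save to the designated path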
Further, the inverse standardization in step S4 is performed as follows:

$$Y_{inv} = \hat{y}\,\sigma + u$$

wherein $\sigma$ is the standard deviation of the training-set labels and $u$ is their mean, both saved in step S2; $\hat{y}$ is the output of the fully-connected layer sub-module in the third main module C of step S1, with dimension 6; and $Y_{inv}$ is the final predicted value after inverse standardization, with dimension 6.
Further, the simulation of the various extreme cases in step S4 is implemented as follows:
for the case of image data occluded by foreign objects, pictures are selected at random from the test set, a pixel coordinate is chosen at random within each selected picture, and a black mask block of size 100×100 centred on that pixel coordinate is added;
for the case of inertial navigation data loss, inertial navigation data are selected at random from the test set and the selected inertial navigation data are set to zero;
for the case of image data loss, pictures are selected at random from the test set and the selected pictures are replaced with pure black pictures.
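The three corruption cases can be reproduced, for example, with the following hedged NumPy sketch; the image layout (H×W×C arrays), the random number generator and the function names are assumptions for illustration only.

    # Hedged sketch of the corrupted test data used for verification:
    # 100x100 black occlusion block, zeroed inertial data, and an all-black image.
    import numpy as np

    rng = np.random.default_rng(0)

    def occlude_image(img):
        """Add a 100x100 black mask block centred on a randomly chosen pixel of img (H, W, C)."""
        h, w = img.shape[:2]
        cy, cx = int(rng.integers(h)), int(rng.integers(w))
        out = img.copy()
        out[max(cy - 50, 0):cy + 50, max(cx - 50, 0):cx + 50] = 0
        return out

    def drop_imu(imu):
        """Simulate inertial navigation data loss by setting the selected samples to zero."""
        return np.zeros_like(imu)

    def drop_image(img):
        """Simulate image data loss by replacing the selected picture with a pure black one."""
        return np.zeros_like(img)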
Compared with the prior art, the invention has the following advantages and effects:
1. In the standardization-based deep learning visual-inertial combined navigation design method of the invention, the real labels of the training set are standardized, which removes the need, present in other deep learning methods, to balance the learning of relative translation and relative rotation by manually setting a balance factor; this improves the generalization ability of the design method and saves the time spent on manually tuning the balance factor.
2. In the standardization-based deep learning visual-inertial combined navigation design method of the invention, the dimension of the image features is reduced to m and the dimension of the inertial navigation features is raised to m, so that the image features and the inertial navigation features have the same dimension.
3. In the standardization-based deep learning visual-inertial combined navigation design method of the invention, an attention mechanism is introduced that adaptively weights the features and suppresses unnecessary features, improving the accuracy of motion-state estimation.
Drawings
FIG. 1 is a flowchart of the standardization-based deep learning visual-inertial combined navigation design method disclosed in the embodiments of the present invention;
FIG. 2 is a structural diagram of the Attention sub-module in an embodiment of the present invention;
FIG. 3 is an overall diagram of the deep learning network model in an embodiment of the present invention;
FIG. 4 is a trajectory plot for scene sequence 09 of the KITTI dataset.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example one
As shown in FIG. 1, this embodiment discloses a standardization-based deep learning visual-inertial combined navigation design method in which the training set and the test set are divided as follows: sequences 00-08 of the KITTI dataset are used as the training set, and sequences 09 and 10 are used as the test set.
The method comprises the following steps:
s1, establishing a deep learning network model, wherein the deep learning network model comprises a first main module, a second main module and a third main module, and the first main module is formed by stacking 10 layers of CNNs and is called as a main module A; the second master module comprises two layers of Bi-LSTM, referred to as master module B; the third main module is called as a main module C, the main module C comprises a first sub module, a second sub module and a third sub module, the first sub module is an Attention sub module, the second sub module is a two-layer Bi-LSTM sub module, and the third sub module is a full-connection layer sub module; inputting image data to a main module A to extract image characteristics; inputting inertial navigation data into the main module B to extract inertial navigation characteristics, and ensuring that the dimensionality of the inertial navigation characteristics is consistent with the dimensionality of image characteristics; the image characteristic and the inertial navigation characteristic are serially connected and input into an Attention submodule in a main module C, the output of the Attention submodule is multiplied by the input of the Attention submodule and then input into two layers of Bi-LSTM submodules of the main module C, and the output of the two layers of Bi-LSTM submodules is input into a fully-connected layer submodule to output a result;
s2, designing a loss function, standardizing the labels of the training set, transforming the labels of the training set into a distribution with a mean value of 0 and a variance of 1, storing the mean value and the variance obtained by standardized calculation, and subtracting the standardized labels from the output of the all-connection layer sub-module in the main module C to obtain a final loss function;
s3, training and storing results, wherein an activation function adopted by a fully-connected layer sub-module in the main module A and the main module C is Relu, an activation function adopted by an Attention sub-module in the main module C is Relu and Sigmoid, the learning rate is fixed, training data are input to train the deep learning network model constructed in the step S1, and the deep learning network model is stored to an appointed path after training;
and S4, inputting the test data into the deep learning network model obtained by training in the step S3 to obtain an output result, and then carrying out inverse standardization on the mean value and the variance obtained in the step S2 to obtain a prediction result.
In addition, the navigation design method also comprises a test verification step, and the process is as follows:
and simulating four extreme cases, namely no data corruption, image data occluded by foreign objects, inertial navigation data loss, and image data loss; the corresponding test data for the four extreme cases are input into the deep learning network model trained in step S3 for testing, and the output of the deep learning network model is inverse-standardized with the mean and variance saved in step S2 to obtain the prediction result.
Example two
On the basis of the standardization-based deep learning visual-inertial combined navigation design method disclosed in the first embodiment, this embodiment further discloses the structure of the deep learning network model as follows:
The structural parameters of main module A are listed in Table 1. Main module A is formed by stacking 10 layers of CNNs in sequence, wherein each CNN layer is a two-dimensional convolution; the convolution kernel sizes of the first three CNN layers are 7×7, 5×5 and 5×5, and the convolution kernel sizes of the last seven CNN layers are 3×3. Main module B consists of two layers of Bi-LSTM, each containing 512 neurons. Main module C comprises an Attention sub-module, a two-layer Bi-LSTM sub-module and a fully-connected layer sub-module, wherein each Bi-LSTM layer in the two-layer Bi-LSTM sub-module contains 1000 neurons. The Attention sub-module consists of two fully-connected layers, whose structure is shown in FIG. 2; the activation function of the first fully-connected layer is ReLU and that of the second is Sigmoid. The fully-connected layer sub-module is formed by cascading four fully-connected layers with 512, 128, 64 and 6 neurons respectively;
Table 1 Structural parameters of the first main module (main module A) [the table is reproduced as an image in the original publication]

As shown in Table 1, in the parameter column K refers to the convolution kernel size, S to the convolution stride and P to whether zero padding is applied; zero padding is performed when the value of P is 1.
The specific implementation of step S2 is as follows:
The task of the deep learning network model is to predict the relative translations along the x, y, z axes and the relative rotations about the x, y, z axes. Because the magnitudes of the relative translations and of the relative rotations differ greatly, the relative translations are predicted well while the relative rotations are predicted poorly. Training is usually balanced by a balance factor chosen so that the relative translations and the relative rotations have comparable magnitudes, but this balance factor has to be determined through repeated experiments. Here the labels are standardized instead, i.e. the relative translations and the relative rotations are standardized, so that no balance factor needs to be added. The label standardization of step S2 proceeds as follows:
the training set label mean is calculated according to:
$$u = \frac{1}{n}\sum_{j=1}^{n} Y_{raw}^{(j)}$$

The variance of the training-set labels is calculated according to

$$\sigma^{2} = \frac{1}{n}\sum_{j=1}^{n}\left(Y_{raw}^{(j)} - u\right)^{2}$$

The training-set labels are standardized according to

$$\hat{Y} = \frac{Y_{raw} - u}{\sigma}$$

wherein n is the number of training-set labels; $Y_{raw}$ is an original training-set label, comprising the relative translations along the x, y, z axes and the relative rotations about the x, y, z axes, with dimension 6; $u$ is the mean of the relative translations and relative rotations, with dimension 6; $\sigma^{2}$ is the variance of the relative translations and relative rotations, with dimension 6; $\sigma$ is the standard deviation corresponding to $\sigma^{2}$; and $\hat{Y}$ is the standardized label.
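For comparison, prior deep-learning visual-inertial methods typically balance the two error terms with a manually tuned factor; the general shape of such an objective is, for instance,

$$L = \left\lVert \hat{p} - p \right\rVert_{1} + \beta \left\lVert \hat{\varphi} - \varphi \right\rVert_{1}$$

where $p$ and $\varphi$ denote the relative translation and relative rotation and $\beta$ must be found through repeated training runs (this form is shown only as an illustration, not as a quotation of any specific method). After the labels are standardized as above, every one of the six components has zero mean and unit variance, so a single unweighted error term suffices and $\beta$ is no longer needed.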
The loss function described in step S2 is calculated according to the following equation:
$$Loss = \frac{1}{B}\sum_{i=1}^{B}\sum_{t=1}^{k}\left|\hat{y}_{i,t} - \hat{Y}_{i,t}\right|$$

wherein B is the number of samples in a single input batch during training and i is the index of a sample within the batch; k is the dimension of the label, equal to 6; t indexes the elements shared by the output of the fully-connected layer sub-module in main module C and the standardized label; $\hat{Y}_{i,t}$ is the single element at position t of the standardized label $\hat{Y}_{i}$; $\hat{y}_{i,t}$ is the single element at position t of the output of the fully-connected layer sub-module in the third main module C after the i-th group of data of the batch has been fed into the deep learning network model; and |·| denotes taking the absolute value of the difference.
The specific implementation of step S3 is as follows:
the fixed learning rate was 0.0001, epoch was 200, batch was 8, and an Adam optimizer was used.
The specific implementation of step S4 is as follows:
inverse normalization was performed according to the following formula:
$$Y_{inv} = \hat{y}\,\sigma + u$$

wherein $\sigma$ is the standard deviation of the training-set labels and $u$ is their mean, both saved in step S2; $\hat{y}$ is the output of the fully-connected layer sub-module in the third main module C of step S1, with dimension 6; and $Y_{inv}$ is the final predicted value after inverse standardization, with dimension 6.
The various extreme cases in step S4 are simulated as follows:
for the case of image data occluded by foreign objects, pictures are selected at random from the test set, a pixel coordinate is chosen at random within each selected picture, and a black mask block of size 100×100 centred on that pixel coordinate is added;
for the case of inertial navigation data loss, inertial navigation data are selected at random from the test set and the selected inertial navigation data are set to zero;
for the case of image data loss, pictures are selected at random from the test set and the selected pictures are replaced with pure black pictures.
Four folders are created; after the four extreme cases have been simulated, the data for the three corrupted cases (image data occluded by foreign objects, inertial navigation data loss, and image data loss) are stored in the corresponding folders. The trained deep learning network model is then tested under the four extreme cases, with the following results:
Table 2 compares the standardization-based deep learning visual-inertial combined navigation method of the present invention (hereinafter Normalize_VIO) with the deep-learning-based soft fusion method (hereinafter Soft_VIO) in the case of no data corruption; the results show that the method of the present invention outperforms Soft_VIO:
Table 2 Comparison of the two methods in the case of no data corruption [the table is reproduced as an image in the original publication]
In Table 2, m denotes meters and rad denotes radians.
Table 3 compares the performance of the two methods in the case of inertial navigation data loss; the results show that Normalize_VIO achieves higher accuracy than Soft_VIO.
Table 3 Comparison of the two methods in the case of inertial navigation data loss [the table is reproduced as an image in the original publication]
Table 4 compares the two methods in the case of image data occluded by foreign objects; the results show that Normalize_VIO achieves higher accuracy than Soft_VIO.
Table 4 Comparison of the two methods in the case of image data occluded by foreign objects [the table is reproduced as an image in the original publication]
Table 5 compares the two methods in the case of image data loss; the results show that Normalize_VIO achieves higher accuracy than Soft_VIO.
Table 5 Comparison of the two methods in the case of image data loss [the table is reproduced as an image in the original publication]
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.

Claims (10)

1. A deep learning visual inertial navigation combined navigation design method based on standardization is characterized by comprising the following steps:
s1, establishing a deep learning network model, wherein the deep learning network model comprises a first main module, a second main module and a third main module, and the first main module is formed by stacking 10 layers of CNNs and is called as a main module A; the second master module comprises two layers of Bi-LSTM, referred to as master module B; the third main module is called as a main module C, the main module C comprises a first sub module, a second sub module and a third sub module, the first sub module is an Attention sub module, the second sub module is a two-layer Bi-LSTM sub module, and the third sub module is a full-connection layer sub module; inputting image data to a main module A to extract image characteristics; inputting inertial navigation data into the main module B to extract inertial navigation characteristics, and ensuring that the dimensionality of the inertial navigation characteristics is consistent with the dimensionality of image characteristics; the image characteristic and the inertial navigation characteristic are serially connected and input into an Attention submodule in a main module C, the output of the Attention submodule is multiplied by the input of the Attention submodule and then input into two layers of Bi-LSTM submodules of the main module C, and the output of the two layers of Bi-LSTM submodules is input into a fully-connected layer submodule to output a result;
s2, designing a loss function, standardizing the labels of the training set, transforming the labels of the training set into a distribution with a mean value of 0 and a variance of 1, storing the mean value and the variance obtained by standardized calculation, and subtracting the standardized labels from the output of the all-connection layer sub-module in the main module C to obtain a final loss function;
s3, training and storing results, inputting training data to train the deep learning network model constructed in the step S1, and storing the deep learning network model to a specified path after training;
and S4, inputting the test data into the deep learning network model obtained by training in the step S3 to obtain an output result, and then carrying out inverse standardization on the mean value and the variance obtained in the step S2 to obtain a prediction result.
2. The deep learning visual inertial navigation combined navigation design method based on standardization according to claim 1, wherein the navigation design method further comprises a test verification step, carried out as follows:
four extreme cases are simulated, namely no data corruption, image data occluded by foreign objects, inertial navigation data loss, and image data loss; the corresponding test data for the four extreme cases are input into the deep learning network model trained in step S3 for testing, and the output of the deep learning network model is inverse-standardized with the mean and variance saved in step S2 to obtain the prediction result.
3. The deep learning visual inertial navigation combined navigation design method based on standardization according to claim 1, wherein the training set and the test set are divided as follows: sequences 00-08 of the KITTI dataset are used as the training set, and sequences 09 and 10 are used as the test set.
4. The deep learning visual inertial navigation combined navigation design method based on standardization according to claim 1, wherein main module A is formed by stacking 10 layers of CNNs in sequence, each CNN layer being a two-dimensional convolution; the convolution kernel sizes of the first three CNN layers are 7×7, 5×5 and 5×5, and the convolution kernel sizes of the last seven CNN layers are all 3×3; main module B consists of two layers of Bi-LSTM, each containing 512 neurons; main module C comprises an Attention sub-module, a two-layer Bi-LSTM sub-module and a fully-connected layer sub-module, wherein each Bi-LSTM layer in the two-layer Bi-LSTM sub-module contains 1000 neurons; the Attention sub-module consists of two fully-connected layers, the activation function of the first fully-connected layer being ReLU and that of the second being Sigmoid; and the fully-connected layer sub-module is formed by cascading four fully-connected layers with 512, 128, 64 and 6 neurons respectively.
5. The deep learning visual inertial navigation combined navigation design method based on standardization according to claim 1, wherein the standardization in step S2 proceeds as follows:
the mean of the training-set labels is computed as

$$u = \frac{1}{n}\sum_{j=1}^{n} Y_{raw}^{(j)}$$

the variance of the training-set labels is computed as

$$\sigma^{2} = \frac{1}{n}\sum_{j=1}^{n}\left(Y_{raw}^{(j)} - u\right)^{2}$$

the training-set labels are standardized as

$$\hat{Y} = \frac{Y_{raw} - u}{\sigma}$$

wherein n is the number of training-set labels; $Y_{raw}$ is an original training-set label, comprising the relative translations along the x, y, z axes and the relative rotations about the x, y, z axes, with dimension 6; $u$ is the mean of the relative translations and relative rotations, with dimension 6; $\sigma^{2}$ is the variance of the relative translations and relative rotations, with dimension 6; $\sigma$ is the standard deviation corresponding to $\sigma^{2}$; and $\hat{Y}$ is the standardized label.
6. The deep learning visual inertial navigation combined navigation design method based on standardization according to claim 1, wherein the loss function used in step S3 is as follows:

$$Loss = \frac{1}{B}\sum_{i=1}^{B}\sum_{t=1}^{k}\left|\hat{y}_{i,t} - \hat{Y}_{i,t}\right|$$

wherein B is the number of samples in a single input batch during training and i is the index of a sample within the batch; k is the dimension of the label, equal to 6; t indexes the elements shared by the output of the fully-connected layer sub-module in main module C and the standardized label; $\hat{Y}_{i,t}$ is the single element at position t of the standardized label $\hat{Y}_{i}$; $\hat{y}_{i,t}$ is the single element at position t of the output of the fully-connected layer sub-module in main module C after the i-th group of data of the batch has been fed into the deep learning network model; and |·| denotes taking the absolute value of the difference.
7. The deep learning visual inertial navigation combined navigation design method based on standardization according to claim 1, wherein in step S3 the learning rate is fixed at 0.0001, the number of epochs is 200, the batch size is 8, and an Adam optimizer is used.
8. The deep learning visual inertial navigation combined navigation design method based on standardization according to claim 1, wherein the inverse standardization in step S4 is performed as follows:

$$Y_{inv} = \hat{y}\,\sigma + u$$

wherein $\sigma$ is the standard deviation of the training-set labels and $u$ is their mean, both saved in step S2; $\hat{y}$ is the output of the fully-connected layer sub-module in the third main module C of step S1, with dimension 6; and $Y_{inv}$ is the final predicted value after inverse standardization, with dimension 6.
9. The deep learning visual inertial navigation combined navigation design method based on standardization according to claim 1, wherein the simulation of the extreme cases in step S4 is implemented as follows:
for the case of image data occluded by foreign objects, pictures are selected at random from the test set, a pixel coordinate is chosen at random within each selected picture, and a black mask block of size 100×100 centred on that pixel coordinate is added;
for the case of inertial navigation data loss, inertial navigation data are selected at random from the test set and the selected inertial navigation data are set to zero;
for the case of image data loss, pictures are selected at random from the test set and the selected pictures are replaced with pure black pictures.
10. The deep learning visual inertial navigation combined navigation design method based on standardization according to claim 1, wherein the activation function used by main module A and by the fully-connected layer sub-module in main module C is ReLU, and the activation functions used by the Attention sub-module in main module C are ReLU and Sigmoid.
CN202110171232.3A 2021-02-08 2021-02-08 Deep learning visual inertial navigation combined navigation design method based on standardization Active CN112801201B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110171232.3A CN112801201B (en) 2021-02-08 2021-02-08 Deep learning visual inertial navigation combined navigation design method based on standardization

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110171232.3A CN112801201B (en) 2021-02-08 2021-02-08 Deep learning visual inertial navigation combined navigation design method based on standardization

Publications (2)

Publication Number Publication Date
CN112801201A true CN112801201A (en) 2021-05-14
CN112801201B CN112801201B (en) 2022-10-25

Family

ID=75814791

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110171232.3A Active CN112801201B (en) 2021-02-08 2021-02-08 Deep learning visual inertial navigation combined navigation design method based on standardization

Country Status (1)

Country Link
CN (1) CN112801201B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190102692A1 (en) * 2017-09-29 2019-04-04 Here Global B.V. Method, apparatus, and system for quantifying a diversity in a machine learning training data set
US20190325269A1 (en) * 2018-04-20 2019-10-24 XNOR.ai, Inc. Image Classification through Label Progression
US20200160177A1 (en) * 2018-11-16 2020-05-21 Royal Bank Of Canada System and method for a convolutional neural network for multi-label classification with partial annotations
US20200363815A1 (en) * 2019-05-17 2020-11-19 Nvidia Corporation Object pose estimation
CN111210435A (en) * 2019-12-24 2020-05-29 重庆邮电大学 Image semantic segmentation method based on local and global feature enhancement module

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHANG Qian et al.: "Design of a Strapdown Integrated Navigation System for a Small Unmanned Helicopter" (一种小型无人直升机捷联式组合导航系统设计), Computer Simulation (《计算机仿真》) *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113392904A (en) * 2021-06-16 2021-09-14 华南理工大学 LTC-DNN-based visual inertial navigation combined navigation system and self-learning method
WO2022262878A1 (en) * 2021-06-16 2022-12-22 华南理工大学 Ltc-dnn-based visual inertial navigation combined navigation system and self-learning method

Also Published As

Publication number Publication date
CN112801201B (en) 2022-10-25


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant