CN112801201B - Deep learning visual inertial navigation combined navigation design method based on standardization - Google Patents

Deep learning visual inertial navigation combined navigation design method based on standardization

Info

Publication number
CN112801201B
Authority
CN
China
Prior art keywords
module
sub
main module
deep learning
inertial navigation
Prior art date
Legal status
Active
Application number
CN202110171232.3A
Other languages
Chinese (zh)
Other versions
CN112801201A (en)
Inventor
胡斌杰
丘金光
Current Assignee
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date
Filing date
Publication date
Application filed by South China University of Technology SCUT
Priority to CN202110171232.3A
Publication of CN112801201A
Application granted
Publication of CN112801201B
Legal status: Active

Classifications

    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G01C21/165 Dead reckoning by integrating acceleration or speed, i.e. inertial navigation, combined with non-inertial navigation instruments
    • G01C21/28 Navigation in a road network with correlation of data from several navigational instruments
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/045 Combinations of networks
    • G06N3/048 Activation functions
    • G06N3/08 Learning methods
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/30252 Vehicle exterior; Vicinity of vehicle

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Remote Sensing (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Automation & Control Theory (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a deep learning visual inertial navigation combined navigation design method based on standardization, comprising the following steps: a standardization operation is designed for the labels of the training set, in which the mean and variance of the labels are calculated, the labels are transformed into a distribution with mean 0 and variance 1, and the computed mean and variance are stored; in the network design, in order to balance the contributions of the image data and the inertial navigation data, the inertial navigation features and the image features are mapped by the network to the same dimension m; and in the verification or test stage, the result output by the network is inverse-standardized with the stored mean and variance to obtain the final result. By standardizing the training-set labels, the method removes the need to select a balance factor in the objective function, improves the generalization ability of the neural network, and improves the accuracy of relative pose prediction.

Description

Deep learning visual inertial navigation combined navigation design method based on standardization
Technical Field
The invention relates to the technical field of sensor fusion and motion estimation, in particular to a deep learning visual inertial navigation combined navigation design method based on standardization.
Background
With the continuous development of automatic driving and unmanned aerial vehicles, high-precision, high-robustness positioning is an important prerequisite for autonomous navigation and for tasks such as exploring unknown areas. In a purely visual odometry approach, the system acquires information about the surrounding environment with a visual sensor and estimates its motion state by analysing the image data. A visual-inertial odometer adds Inertial Measurement Unit (IMU) information on top of pure visual odometry, and can improve the accuracy of motion-state estimation when the visual input is lost.
Conventional visual-inertial odometry has been studied thoroughly, but problems such as data loss and data corruption are still not well solved, and a large amount of manual feature selection and extrinsic calibration between sensors is required, which is undoubtedly time-consuming. In recent years, deep learning has achieved enormous success in the field of computer vision and is widely applied. Visual-inertial combined navigation can be treated as a regression task and can likewise be trained with deep learning. In existing deep-learning-based visual-inertial combined navigation, the objective function balances the learning of translation and rotation through a balance factor, and finding this factor requires a large amount of training time, which undoubtedly consumes manpower and material resources. To address this problem, a new objective function is needed that avoids manually setting the balance factor.
Disclosure of Invention
The invention aims to overcome the above defects in the prior art and provides a deep learning visual inertial navigation combined navigation design method based on standardization, so that manual setting of balance factors for relative translation and relative rotation is avoided.
The purpose of the invention can be achieved by the following technical solution:
a deep learning visual inertial navigation combined navigation design method based on standardization comprises the following steps:
s1, establishing a deep learning network model, wherein the deep learning network model comprises a first main module, a second main module and a third main module, and the first main module is formed by stacking 10 layers of CNNs and is called a main module A; the second main module comprises two layers of Bi-LSTM, called main module B; the third main module is called as a main module C, the main module C comprises a first sub module, a second sub module and a third sub module, the first sub module is an Attention sub module, the second sub module is a two-layer Bi-LSTM sub module, and the third sub module is a full-connection layer sub module; inputting image data to a main module A to extract image characteristics; inputting inertial navigation data into the main module B to extract inertial navigation characteristics, and ensuring that the dimensionality of the inertial navigation characteristics is consistent with the dimensionality of image characteristics; the image characteristic and the inertial navigation characteristic are serially connected and input into an Attention submodule in a main module C, the output of the Attention submodule is multiplied by the input of the Attention submodule and then input into two layers of Bi-LSTM submodules of the main module C, and the output of the two layers of Bi-LSTM submodules is input into a fully-connected layer submodule to output a result;
s2, designing a loss function, standardizing the labels of the training set, transforming the labels of the training set into distribution with a mean value of 0 and a variance of 1, storing the mean value and the variance obtained by standardized calculation, and subtracting the standardized labels from the output of the all-connection layer sub-module in the main module C to obtain a final loss function;
s3, training and storing results, wherein an activation function adopted by all-connection layer sub-modules in the main module A and the main module C is Relu, an activation function adopted by an Attention sub-module in the main module C is Relu and Sigmoid, training data are input to train the deep learning network model constructed in the step S1, and the deep learning network model is stored to an appointed path after training is finished;
and S4, inputting the test data into the deep learning network model obtained by training in the step S3 to obtain an output result, and then carrying out inverse standardization through the mean value and the variance obtained in the step S2 to obtain a prediction result.
Further, the navigation design method comprises a test verification step, the process being as follows:
Four extreme conditions are simulated: no data damage, image data occluded by a foreign object, inertial navigation data loss, and image data loss. The corresponding test data under these four conditions are input into the deep learning network model trained in step S3 for testing, and the output of the deep learning network model is inverse-standardized with the mean and variance stored in step S2 to obtain the prediction result.
Further, in the navigation design method the training set and the test set are divided as follows: sequences 00-08 of the KITTI dataset are used as the training set, and sequences 09 and 10 are used as the test set.
Further, main module A is formed by stacking 10 layers of CNNs in sequence, where every layer is a two-dimensional convolution; the convolution kernels of the first three CNN layers are 7 × 7, 5 × 5 and 5 × 5, and the convolution kernels of the remaining seven CNN layers are all 3 × 3. Main module B consists of two layers of Bi-LSTM, each containing 512 neurons. Main module C comprises an Attention sub-module, a two-layer Bi-LSTM sub-module and a fully-connected layer sub-module, where each Bi-LSTM layer in the two-layer Bi-LSTM sub-module contains 1000 neurons; the Attention sub-module consists of two fully-connected layers, the activation function of the first fully-connected layer being ReLU and that of the second being Sigmoid; the fully-connected layer sub-module is a cascade of four fully-connected layers whose numbers of neurons are 512, 128, 64 and 6 respectively.
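For illustration only, the following PyTorch sketch follows the module description above. The kernel sizes, Bi-LSTM widths and fully-connected layer sizes come from the text, while the channel widths, strides, the fused feature dimension m, the two-frame image input and the use of a single fused time step are assumptions made to keep the example self-contained.

```python
import torch
import torch.nn as nn

class MainModuleA(nn.Module):
    """Image-feature extractor: 10 stacked 2-D convolutions (7x7, 5x5, 5x5, then seven 3x3)."""
    def __init__(self, m=512):
        super().__init__()
        kernels = [7, 5, 5] + [3] * 7
        chans = [6, 64, 128, 256, 256, 512, 512, 512, 512, 512, 512]  # assumed widths; input = two stacked RGB frames
        layers = []
        for i, k in enumerate(kernels):
            layers += [nn.Conv2d(chans[i], chans[i + 1], k, stride=2, padding=k // 2),  # stride is an assumption
                       nn.ReLU(inplace=True)]
        self.cnn = nn.Sequential(*layers)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.proj = nn.Linear(chans[-1], m)          # reduce the image feature to dimension m

    def forward(self, img_pair):                     # (B, 6, H, W)
        return self.proj(self.pool(self.cnn(img_pair)).flatten(1))   # (B, m)

class MainModuleB(nn.Module):
    """Inertial-feature extractor: two-layer Bi-LSTM with 512 neurons per layer."""
    def __init__(self, imu_dim=6, m=512):
        super().__init__()
        self.lstm = nn.LSTM(imu_dim, 512, num_layers=2, batch_first=True, bidirectional=True)
        self.proj = nn.Linear(2 * 512, m)            # raise the inertial feature to dimension m

    def forward(self, imu_seq):                      # (B, T, 6)
        out, _ = self.lstm(imu_seq)
        return self.proj(out[:, -1])                 # (B, m)

class MainModuleC(nn.Module):
    """Attention sub-module, two-layer Bi-LSTM (1000 neurons per layer), FC sub-module (512-128-64-6)."""
    def __init__(self, m=512):
        super().__init__()
        d = 2 * m                                    # size of the concatenated image + inertial feature
        self.attention = nn.Sequential(nn.Linear(d, d), nn.ReLU(),
                                       nn.Linear(d, d), nn.Sigmoid())
        self.lstm = nn.LSTM(d, 1000, num_layers=2, batch_first=True, bidirectional=True)
        self.fc = nn.Sequential(nn.Linear(2 * 1000, 512), nn.ReLU(),
                                nn.Linear(512, 128), nn.ReLU(),
                                nn.Linear(128, 64), nn.ReLU(),
                                nn.Linear(64, 6))    # 3 relative translations + 3 relative rotations

    def forward(self, img_feat, imu_feat):
        x = torch.cat([img_feat, imu_feat], dim=1)   # concatenate the two same-dimension features
        x = x * self.attention(x)                    # output of the Attention sub-module times its input
        out, _ = self.lstm(x.unsqueeze(1))           # simplification: a single fused time step
        return self.fc(out[:, -1])                   # (B, 6) prediction in standardized label space
```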
Further, the standardization in step S2 proceeds as follows:
The training set label mean is calculated as

u = \frac{1}{n} \sum_{i=1}^{n} Y_{raw,i}

the training set label variance is calculated as

\sigma^{2} = \frac{1}{n} \sum_{i=1}^{n} \left( Y_{raw,i} - u \right)^{2}

and the training set labels are standardized as

\hat{Y} = \frac{Y_{raw} - u}{\sigma}

where n is the number of training set labels; Y_raw is an original training set label containing the relative translation along the x, y, z axes and the relative rotation about the x, y, z axes, of dimension 6; u is the mean of the relative translations and relative rotations, of dimension 6; σ² is the variance of the relative translations and relative rotations, of dimension 6; σ is the standard deviation corresponding to σ²; and Ŷ is the standardized label.
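As an illustration, this standardization can be written in a few lines of NumPy. The array shapes (n labels of dimension 6) follow the text; the stand-in data and the file name used to save the statistics are hypothetical.

```python
import numpy as np

def standardize_labels(y_raw):
    """y_raw: (n, 6) array of relative translations (x, y, z) and relative rotations (x, y, z)."""
    u = y_raw.mean(axis=0)               # per-dimension mean, shape (6,)
    sigma = y_raw.std(axis=0)            # per-dimension standard deviation (ddof=0 matches the 1/n variance)
    y_norm = (y_raw - u) / sigma         # zero-mean, unit-variance labels
    return y_norm, u, sigma

# usage sketch with random stand-in labels; u and sigma are saved for the test stage
y_train = np.random.randn(1000, 6)
y_norm, u, sigma = standardize_labels(y_train)
np.savez("label_stats.npz", u=u, sigma=sigma)   # hypothetical file name
```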
Further, the loss function used in step S3 is

L = \frac{1}{B \cdot k} \sum_{i=1}^{B} \sum_{t=1}^{k} \left| y_{i,t} - \hat{y}_{i,t} \right|

where B is the batch size of a single input during training and i is the index within the batch; k is the dimension of the label, equal to 6; t is the element index into the output of the fully-connected layer sub-module in main module C and into the standardized label; \hat{y}_{i,t} is the element of the standardized label Ŷ at position t for the i-th sample; y_{i,t} is the element of the output of the fully-connected layer sub-module in main module C at position t after the i-th group of data of the batch is fed into the deep learning network model; and the absolute value of their difference is taken.
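Read as the mean absolute error between the network output and the standardized label (the exact averaging constant over B and k is not stated in the text and is assumed here), the loss can be sketched in PyTorch as:

```python
import torch

def normalized_l1_loss(pred, label_norm):
    """pred: (B, 6) output of the fully-connected layer sub-module in main module C.
    label_norm: (B, 6) standardized labels. Mean absolute difference over batch and label dimensions."""
    return (pred - label_norm).abs().mean()

# under this reading the loss is equivalent to torch.nn.L1Loss() applied to the standardized labels
```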
Further, in step S3 a fixed learning rate of 0.0001 is used, the number of epochs is 200, the batch size is 8, and an Adam optimizer is adopted.
Further, the inverse standardization in step S4 is

Y_{inv} = \sigma \cdot Y_{out} + u

where σ is the standard deviation of the training set labels, u is their mean, Y_inv is the final predicted value after inverse standardization, of dimension 6, and Y_out is the output of the fully-connected layer sub-module in the third main module C in step S1, of dimension 6.
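A minimal sketch of this inverse standardization, assuming u and sigma are the statistics saved in step S2 (the values below are stand-ins):

```python
import numpy as np

def inverse_standardize(y_pred_norm, u, sigma):
    """Map the dimension-6 network output back to metric relative translation and rotation."""
    return y_pred_norm * sigma + u

# example with stand-in statistics; in practice u and sigma are the values saved in step S2
u, sigma = np.zeros(6), np.ones(6)
print(inverse_standardize(np.full(6, 0.5), u, sigma))
```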
Further, the various extreme conditions in step S4 are simulated as follows:
for image data occluded by a foreign object, pictures are randomly selected from the test set, a pixel coordinate is randomly chosen within each selected picture, and a black mask block of size 100 × 100 centred on that pixel coordinate is added;
for inertial navigation data loss, inertial navigation data are randomly selected from the test set and the selected inertial navigation data are set to zero;
for image data loss, pictures are randomly selected from the test set and the selected pictures are replaced with pure black pictures.
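The three damage conditions can be simulated with a few lines of NumPy; the image layout (H × W × 3, uint8) and the stand-in frame size are assumptions.

```python
import numpy as np

def occlude_image(img, block=100):
    """Foreign-object occlusion: add a black block of size block x block at a random centre."""
    h, w = img.shape[:2]
    cy, cx = np.random.randint(h), np.random.randint(w)
    out = img.copy()
    out[max(cy - block // 2, 0):cy + block // 2, max(cx - block // 2, 0):cx + block // 2] = 0
    return out

def drop_imu(imu_seq):
    """Inertial navigation data loss: set the selected inertial navigation data to zero."""
    return np.zeros_like(imu_seq)

def drop_image(img):
    """Image data loss: replace the selected picture with a pure black picture."""
    return np.zeros_like(img)

# example on a stand-in KITTI-sized frame and a short IMU segment
frame = np.random.randint(0, 256, (376, 1241, 3), dtype=np.uint8)
imu = np.random.randn(10, 6)
occluded, imu_lost, img_lost = occlude_image(frame), drop_imu(imu), drop_image(frame)
```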
Compared with the prior art, the invention has the following advantages and effects:
1. In the standardization-based deep learning visual inertial navigation combined navigation design method of the invention, the ground-truth labels of the training set are standardized. This avoids the need, present in other deep learning methods, to manually set a balance factor that balances the learning of relative translation and relative rotation, improves the generalization ability of the design method, and saves the time consumed by manually tuning the balance factor.
2. In the method of the invention, the dimension of the image feature is reduced to m and the dimension of the inertial navigation feature is raised to m, so that the image feature and the inertial navigation feature have the same dimension and their contributions are balanced.
3. In the method of the invention, an attention mechanism is introduced: the features are adaptively weighted, unnecessary features are suppressed, and the accuracy of motion-state estimation is improved.
Drawings
FIG. 1 is a flowchart of a standardized deep learning-based visual inertial navigation integrated navigation design method disclosed in the embodiment of the present invention;
FIG. 2 is a diagram of an Attention module in an embodiment of the present invention;
FIG. 3 is an overview of a deep learning network model in an embodiment of the invention;
FIG. 4 is a trajectory diagram for scene sequence 09 of the KITTI dataset.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example one
As shown in FIG. 1, this embodiment discloses a deep learning visual inertial navigation combined navigation design method based on standardization, in which the training set and the test set are divided as follows: sequences 00-08 of the KITTI dataset are used as the training set, and sequences 09 and 10 are used as the test set.
The method comprises the following steps:
s1, establishing a deep learning network model, wherein the deep learning network model comprises a first main module, a second main module and a third main module, and the first main module is formed by stacking 10 layers of CNNs and is called a main module A; the second main module comprises two layers of Bi-LSTM, called main module B; the third main module is called as a main module C, the main module C comprises a first sub module, a second sub module and a third sub module, the first sub module is an Attention sub module, the second sub module is a two-layer Bi-LSTM sub module, and the third sub module is a full-connection layer sub module; inputting image data into a main module A to extract image characteristics; inputting inertial navigation data into the main module B to extract inertial navigation characteristics, and ensuring that the dimensionality of the inertial navigation characteristics is consistent with the dimensionality of image characteristics; the image characteristic and the inertial navigation characteristic are serially connected and input into an Attention submodule in a main module C, the output of the Attention submodule is multiplied by the input of the Attention submodule and then input into two layers of Bi-LSTM submodules of the main module C, and the output of the two layers of Bi-LSTM submodules is input into a fully-connected layer submodule to output a result;
s2, designing a loss function, standardizing labels of the training set, converting the labels of the training set into distribution with a mean value of 0 and a variance of 1, storing the mean value and the variance obtained by standardized calculation, and subtracting the standardized labels from the output of the all-connection layer sub-module in the main module C to obtain a final loss function;
s3, training and storing results, wherein an activation function adopted by a fully-connected layer sub-module in the main module A and a fully-connected layer sub-module in the main module C is Relu, an activation function adopted by an Attention sub-module in the main module C is Relu and Sigmoid, the learning rate is fixed, training data are input to train the deep learning network model constructed in the step S1, and the deep learning network model is stored to an appointed path after training is finished;
and S4, inputting the test data into the deep learning network model obtained by training in the step S3 to obtain an output result, and then carrying out inverse standardization through the mean value and the variance obtained in the step S2 to obtain a prediction result.
In addition, the navigation design method comprises a test verification step, the process being as follows:
Four extreme conditions are simulated: no data damage, image data occluded by a foreign object, inertial navigation data loss, and image data loss. The corresponding test data under these four conditions are input into the deep learning network model trained in step S3 for testing, and the output of the deep learning network model is inverse-standardized with the mean and variance stored in step S2 to obtain the prediction result.
Example two
On the basis of the method for designing the deep learning visual inertial navigation combination navigation based on standardization disclosed by the embodiment, the embodiment further discloses that the structure of the deep learning network model is as follows:
The structural parameters of main module A are shown in Table 1. Main module A is formed by stacking 10 layers of CNNs in sequence, where every layer is a two-dimensional convolution; the convolution kernels of the first three CNN layers are 7 × 7, 5 × 5 and 5 × 5, and the convolution kernels of the remaining seven CNN layers are all 3 × 3. Main module B consists of two layers of Bi-LSTM, each containing 512 neurons. Main module C comprises an Attention sub-module, a two-layer Bi-LSTM sub-module and a fully-connected layer sub-module, where each Bi-LSTM layer in the two-layer Bi-LSTM sub-module contains 1000 neurons; the Attention sub-module consists of two fully-connected layers, its structure is shown in FIG. 2, the activation function of the first fully-connected layer is ReLU, and the activation function of the second fully-connected layer is Sigmoid; the fully-connected layer sub-module is a cascade of four fully-connected layers whose numbers of neurons are 512, 128, 64 and 6 respectively.
Table 1. Structural parameters of main module A (the table body is provided as an image in the original publication)
In the parameter columns of Table 1, K is the convolution kernel size, S is the convolution stride, and P indicates whether zero padding is applied (zero padding is performed when P is 1).
the specific implementation of step S2 is as follows:
when the task of the deep learning network model is to predict the relative translation of the x, y and z axes and the relative rotation of the x, y and z axes, the prediction effect of the relative translation of the x, y and z axes is good and the prediction effect of the relative rotation of the x, y and z axes is poor due to the fact that the magnitude of the relative translation of the x, y and z axes and the magnitude difference of the relative rotation of the x, y and z axes are extremely large; in order to balance the training of the relative translation of the x, y, and z axes and the relative rotation of the x, y, and z axes, the relative translation of the x, y, and z axes and the relative rotation of the x, y, and z axes are often equal in order of magnitude by a scaling factor, but the selection of the scaling factor requires multiple experiments to determine the scaling factor, and the label is normalized, that is, the normalization of the relative translation of the x, y, and z axes and the relative rotation of the x, y, and z axes does not require the addition of the scaling factor, and the label normalization process in step S2 is as follows:
The training set label mean is calculated according to

u = \frac{1}{n} \sum_{i=1}^{n} Y_{raw,i}

the training set label variance is calculated according to

\sigma^{2} = \frac{1}{n} \sum_{i=1}^{n} \left( Y_{raw,i} - u \right)^{2}

and the training set labels are standardized according to

\hat{Y} = \frac{Y_{raw} - u}{\sigma}

where n is the number of training set labels; Y_raw is an original training set label containing the relative translation along the x, y, z axes and the relative rotation about the x, y, z axes, of dimension 6; u is the mean of the relative translations and relative rotations, of dimension 6; σ² is the variance of the relative translations and relative rotations, of dimension 6; σ is the standard deviation corresponding to σ²; and Ŷ is the standardized label.
The loss function described in step S2 is calculated according to

L = \frac{1}{B \cdot k} \sum_{i=1}^{B} \sum_{t=1}^{k} \left| y_{i,t} - \hat{y}_{i,t} \right|

where B is the batch size of a single input during training and i is the index within the batch; k is the dimension of the label, equal to 6; t is the element index into the output of the fully-connected layer sub-module in main module C and into the standardized label; \hat{y}_{i,t} is the element of the standardized label Ŷ at position t for the i-th sample; y_{i,t} is the element of the output of the fully-connected layer sub-module in main module C at position t after the i-th group of data of the batch is fed into the deep learning network model; and the absolute value of their difference is taken.
The specific implementation of step S3 is as follows:
A fixed learning rate of 0.0001 is used, the number of epochs is 200, the batch size is 8, and the Adam optimizer is adopted.
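Translated into PyTorch, this training configuration might look as follows. Only the optimizer, learning rate, epoch count, batch size and the saving of the model come from the text; the stand-in network, data and file name are placeholders so the snippet runs on its own.

```python
import torch

# Stand-in objects; in the real method "model" is the combination of main modules A, B and C
# and the loader yields (image pair, IMU sequence, standardized label) batches of size 8.
model = torch.nn.Linear(6, 6)                                               # placeholder network
train_loader = [(torch.randn(8, 6), torch.randn(8, 6)) for _ in range(4)]   # (input, standardized label)

def normalized_l1_loss(pred, label_norm):
    return (pred - label_norm).abs().mean()

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)        # fixed learning rate 0.0001

for epoch in range(200):                                         # 200 epochs
    for inputs, label_norm in train_loader:                      # batch size 8
        loss = normalized_l1_loss(model(inputs), label_norm)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

torch.save(model.state_dict(), "normalize_vio.pth")              # save to a specified path
```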
The specific implementation of step S4 is as follows:
Inverse standardization is performed according to

Y_{inv} = \sigma \cdot Y_{out} + u

where σ is the standard deviation of the training set labels, u is their mean, Y_inv is the final predicted value after inverse standardization, of dimension 6, and Y_out is the output of the fully-connected layer sub-module in the third main module C in step S1, of dimension 6.
The various extreme conditions in step S4 are simulated as follows:
for image data occluded by a foreign object, pictures are randomly selected from the test set, a pixel coordinate is randomly chosen within each selected picture, and a black mask block of size 100 × 100 centred on that pixel coordinate is added;
for inertial navigation data loss, inertial navigation data are randomly selected from the test set and the selected inertial navigation data are set to zero;
for image data loss, pictures are randomly selected from the test set and the selected pictures are replaced with pure black pictures.
Four folders are created, and after the four extreme conditions are simulated, the data for the three damaged conditions (image data occluded by a foreign object, inertial navigation data loss and image data loss) are stored in the corresponding folders. The deep learning network model trained in step S3 is then tested under the four extreme conditions, with the following results:
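A sketch of this test procedure with stand-in data is given below; only the four-condition loop and the inverse standardization step come from the text, while the placeholder model, statistics and tensors are assumptions made so the snippet runs on its own.

```python
import torch

# Stand-in model and data; in the real test step the four folders hold the undamaged,
# occluded, IMU-loss and image-loss versions of KITTI test sequences 09 and 10.
model = torch.nn.Linear(6, 6)
u, sigma = torch.zeros(6), torch.ones(6)        # statistics saved in step S2 (placeholders)
conditions = {"no_damage": torch.randn(4, 6), "image_occluded": torch.randn(4, 6),
              "imu_lost": torch.randn(4, 6), "image_lost": torch.randn(4, 6)}

model.eval()
with torch.no_grad():
    for name, batch in conditions.items():
        pred = model(batch) * sigma + u         # inverse standardization back to metric units
        print(name, pred.shape)                 # downstream: accumulate translation/rotation errors
```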
table 2 shows that the method for designing the standardized deep learning-based visual inertial navigation integrated navigation system (hereinafter, referred to as normaize _ VIO) according to the present invention is superior to Soft-fusion method based on deep learning (hereinafter, referred to as Soft _ VIO) in the present invention, as shown in table 2, when the data is not damaged:
Table 2. Comparison of the two methods with undamaged data (the table body is provided as an image in the original publication)
In Table 2, m denotes metres and rad denotes radians.
Table 3 compares the performance of the two methods when inertial navigation data are lost; the results show that Normalize_VIO is more accurate than Soft_VIO.
Table 3. Comparison of the two methods with inertial navigation data loss (the table body is provided as an image in the original publication)
Table 4 compares the two methods when the image data are occluded by a foreign object; the results show that Normalize_VIO is more accurate than Soft_VIO.
Table 4. Comparison of the two methods with image data occluded by a foreign object (the table body is provided as an image in the original publication)
Table 5 compares the two methods when image data are lost; the results show that Normalize_VIO is more accurate than Soft_VIO.
Table 5. Comparison of the two methods with image data loss (the table body is provided as an image in the original publication)
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.

Claims (8)

1. A deep learning visual inertial navigation combined navigation design method based on standardization is characterized by comprising the following steps:
s1, establishing a deep learning network model, wherein the deep learning network model comprises a first main module, a second main module and a third main module, and the first main module is formed by stacking 10 layers of CNNs and is called a main module A; the second main module comprises two layers of Bi-LSTM, called main module B; the third main module is called as a main module C, the main module C comprises a first sub module, a second sub module and a third sub module, the first sub module is an Attention sub module, the second sub module is a two-layer Bi-LSTM sub module, and the third sub module is a full-connection layer sub module; inputting image data into a main module A to extract image characteristics; inertial navigation data are input into a main module B to extract inertial navigation characteristics, and the dimension of the inertial navigation characteristics is ensured to be consistent with the dimension of the image characteristics; the image characteristic and the inertial navigation characteristic are serially connected and input into an Attention sub-module in a main module C, the output of the Attention sub-module is multiplied by the input of the Attention sub-module and then input into two layers of Bi-LSTM sub-modules of the main module C, and the output of the two layers of Bi-LSTM sub-modules is input into a fully-connected layer sub-module to output a result;
s2, designing a loss function, standardizing labels of the training set, transforming the labels of the training set into distribution with a mean value of 0 and a variance of 1, storing the mean value and the variance obtained by standardized calculation, and subtracting the standardized labels from the output of the all-connection layer sub-module in the main module C to obtain a final loss function, wherein the standardization process comprises the following steps:
the training set label mean is calculated as

u = \frac{1}{n} \sum_{i=1}^{n} Y_{raw,i}

the training set label variance is calculated as

\sigma^{2} = \frac{1}{n} \sum_{i=1}^{n} \left( Y_{raw,i} - u \right)^{2}

and the training set labels are standardized as

\hat{Y} = \frac{Y_{raw} - u}{\sigma}

wherein n is the number of training set labels; Y_raw is an original training set label containing the relative translation along the x, y, z axes and the relative rotation about the x, y, z axes, of dimension 6; u is the mean of the relative translations and relative rotations, of dimension 6; σ² is the variance of the relative translations and relative rotations, of dimension 6; σ is the standard deviation corresponding to σ²; and Ŷ is the standardized label;
the loss function is as follows:

L = \frac{1}{B \cdot k} \sum_{i=1}^{B} \sum_{t=1}^{k} \left| y_{i,t} - \hat{y}_{i,t} \right|

wherein B is the batch size of a single input during training and i is the index within the batch; k is the dimension of the label, equal to 6; t is the element index into the output of the fully-connected layer sub-module in main module C and into the standardized label; \hat{y}_{i,t} is the element of the standardized label at position t for the i-th sample; y_{i,t} is the element of the output of the fully-connected layer sub-module in main module C at position t after the i-th group of data of the batch is fed into the deep learning network model; and the absolute value of their difference is taken;
s3, training and storing results, inputting training data to train the deep learning network model constructed in the step S1, and storing the deep learning network model to a specified path after training is finished;
and S4, inputting the test data into the deep learning network model obtained by training in the step S3 to obtain an output result, and then carrying out inverse standardization through the mean value and the variance obtained in the step S2 to obtain a prediction result.
2. The deep learning visual inertial navigation combined navigation design method based on standardization according to claim 1, wherein the navigation design method further comprises a test verification step, the process being as follows:
four extreme conditions are simulated, namely no data damage, image data occluded by a foreign object, inertial navigation data loss, and image data loss; the corresponding test data under these four conditions are input into the deep learning network model trained in step S3 for testing, and the output of the deep learning network model is inverse-standardized with the mean and variance stored in step S2 to obtain the prediction result.
3. The deep learning visual inertial navigation combined navigation design method based on standardization according to claim 1, wherein the training set and the test set are divided as follows: sequences 00-08 of the KITTI dataset are used as the training set, and sequences 09 and 10 are used as the test set.
4. The deep learning visual inertial navigation combined navigation design method based on standardization according to claim 1, wherein main module A is formed by stacking 10 layers of CNNs in sequence, where every layer is a two-dimensional convolution; the convolution kernel sizes of the first three CNN layers are 7 × 7, 5 × 5 and 5 × 5, and those of the remaining seven CNN layers are all 3 × 3; main module B consists of two layers of Bi-LSTM, each containing 512 neurons; main module C comprises an Attention sub-module, a two-layer Bi-LSTM sub-module and a fully-connected layer sub-module, where each Bi-LSTM layer in the two-layer Bi-LSTM sub-module contains 1000 neurons; the Attention sub-module consists of two fully-connected layers, the activation function of the first fully-connected layer being ReLU and that of the second being Sigmoid; and the fully-connected layer sub-module is a cascade of four fully-connected layers whose numbers of neurons are 512, 128, 64 and 6 respectively.
5. The method according to claim 1, wherein in step S3 a fixed learning rate of 0.0001 is used, the number of epochs is 200, the batch size is 8, and an Adam optimizer is adopted.
6. The deep learning visual inertial navigation combined navigation design method based on standardization according to claim 1, wherein the inverse standardization in step S4 is as follows:

Y_{inv} = \sigma \cdot Y_{out} + u

wherein σ is the standard deviation of the training set labels, u is their mean, Y_inv is the final predicted value after inverse standardization, of dimension 6, and Y_out is the output result of the fully-connected layer sub-module in the third main module C in step S1, of dimension 6.
7. The deep learning visual inertial navigation combined navigation design method based on standardization according to claim 1, wherein the various extreme conditions in step S4 are simulated as follows:
for image data occluded by a foreign object, pictures are randomly selected from the test set, a pixel coordinate is randomly chosen within each selected picture, and a black mask block of size 100 × 100 centred on that pixel coordinate is added;
for inertial navigation data loss, inertial navigation data are randomly selected from the test set and the selected inertial navigation data are set to zero;
for image data loss, pictures are randomly selected from the test set and the selected pictures are replaced with pure black pictures.
8. The deep learning visual inertial navigation combined navigation design method based on standardization according to claim 1, wherein the activation function used in main module A and by the fully-connected layer sub-module in main module C is ReLU, and the activation functions used by the Attention sub-module in main module C are ReLU and Sigmoid.
CN202110171232.3A 2021-02-08 2021-02-08 Deep learning visual inertial navigation combined navigation design method based on standardization Active CN112801201B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110171232.3A CN112801201B (en) 2021-02-08 2021-02-08 Deep learning visual inertial navigation combined navigation design method based on standardization

Publications (2)

Publication Number Publication Date
CN112801201A CN112801201A (en) 2021-05-14
CN112801201B true CN112801201B (en) 2022-10-25

Family

ID=75814791

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110171232.3A Active CN112801201B (en) 2021-02-08 2021-02-08 Deep learning visual inertial navigation combined navigation design method based on standardization

Country Status (1)

Country Link
CN (1) CN112801201B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113392904B (en) * 2021-06-16 2022-07-26 华南理工大学 LTC-DNN-based visual inertial navigation combined navigation system and self-learning method

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190102692A1 (en) * 2017-09-29 2019-04-04 Here Global B.V. Method, apparatus, and system for quantifying a diversity in a machine learning training data set
US11030486B2 (en) * 2018-04-20 2021-06-08 XNOR.ai, Inc. Image classification through label progression
CA3061717A1 (en) * 2018-11-16 2020-05-16 Royal Bank Of Canada System and method for a convolutional neural network for multi-label classification with partial annotations
US11670001B2 (en) * 2019-05-17 2023-06-06 Nvidia Corporation Object pose estimation
CN111210435B (en) * 2019-12-24 2022-10-18 重庆邮电大学 Image semantic segmentation method based on local and global feature enhancement module

Also Published As

Publication number Publication date
CN112801201A (en) 2021-05-14


Legal Events

PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant