CN110119768B - Visual information fusion system and method for vehicle positioning - Google Patents
- Publication number: CN110119768B
- Application number: CN201910332583.0A
- Authority
- CN
- China
- Prior art keywords
- image
- module
- vehicle
- image information
- visual
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G01C21/26 — Navigation; navigational instruments specially adapted for navigation in a road network
- G01S11/12 — Systems for determining distance or velocity not using reflection or reradiation, using electromagnetic waves other than radio waves
- G06F18/251 — Pattern recognition; fusion techniques of input or preprocessed data
- G06N3/045 — Neural networks; combinations of networks
- G06N3/08 — Neural networks; learning methods
- Y02T10/40 — Climate change mitigation technologies related to transportation; engine management systems
Abstract
The invention discloses a visual information fusion system and method for vehicle positioning. The system comprises a vehicle-mounted visual sensor module, an on-line visual information processing module, and an off-line training module, wherein the output end of the vehicle-mounted visual sensor module is signal-connected to the input ends of the on-line visual information processing module and the off-line training module, and the off-line training module is used to train the proposed deep neural network and to copy the network parameters obtained in the training stage to the on-line visual information processing module. The invention can effectively use the image information acquired by the plurality of vision sensors in the vehicle-mounted vision sensor module; through complementation among the image information, it avoids the problem that the positioning system cannot work normally when a single sensor fails, and improves the accuracy of vehicle positioning and the reliability of the algorithm.
Description
Technical Field
The invention relates to a visual information fusion system and a visual information fusion method, in particular to a visual information fusion system and a visual information fusion method for vehicle positioning, and belongs to the field of vehicle navigation or unmanned vehicle positioning.
Background
Vehicle positioning is an important technology in the fields of vehicle navigation and unmanned driving, and most intelligent vehicles are currently equipped with visual sensor modules for tasks such as environment sensing, visual navigation, and target identification. A conventional vehicle-mounted vision sensor module generally comprises one or more vision sensors that acquire images of the vehicle's surroundings; these images contain rich environmental information and are an important source for sensing the environment of the vehicle carrier. In recent years, thanks to the rapid development of computer vision, machine learning, and related technologies, more and more researchers have begun to use the image information collected by vision sensors to help intelligent vehicles achieve positioning.
In general, prior-art approaches to vehicle positioning by means of visual information can be broadly divided into two categories: direct methods and indirect methods. A direct method reconstructs the motion information of the vehicle from the geometric relations between images and then calculates the vehicle position, as in visual odometry. An indirect method assumes that the environment information is known and uses the visual image information, through matching or scene recognition against images in a database, to retrieve the entry consistent with the current vehicle position, thereby recovering the vehicle position or eliminating the positioning error and improving positioning accuracy.
The technical scheme of the invention belongs to the indirect category. Although there has been some development and progress in the prior art, problems remain in practical application. In particular, existing vehicle positioning methods generally rely on information from a single visual sensor, which has the disadvantage that a visual sensor often loses visual information in certain special situations, such as occlusion or violent shaking, causing the positioning algorithm to fail.
In summary, how to provide, on the basis of the prior art, a new system and method for vehicle positioning that improve the accuracy and reliability of positioning is a problem to be urgently solved by those skilled in the art.
Disclosure of Invention
In view of the above-mentioned drawbacks of the prior art, an object of the present invention is to provide a visual information fusion system and method for vehicle positioning, which are as follows.
A visual information fusion system for vehicle positioning, comprising:
the vehicle-mounted visual sensor module is used for acquiring image information in the external environment;
the on-line visual information processing module is used for processing the image information output by the vehicle-mounted visual sensor module and forming a deep neural network model;
the off-line training module is used for training the deep neural network model and copying network parameters obtained in the training stage to the on-line visual information processing module;
the output end of the vehicle-mounted vision sensor module is respectively in signal connection with the input end of the online vision information processing module and the input end of the offline training module.
Preferably, the vehicle-mounted vision sensor module comprises a plurality of vision sensors for acquiring image information in the external environment.
Preferably, the on-line visual information processing module takes the deep neural network model formed and trained by the off-line training module as its basic framework and comprises an image characterization sub-module and an image feature fusion sub-module;
the image characterization sub-module is composed of multiple paths of convolutional neural networks; its input is the image information acquired by the vehicle-mounted vision sensor module, and its output is a high-dimensional image feature vector;
the image feature fusion sub-module is composed of multiple paths of neural networks, each path comprising three fully-connected layers, wherein the scale of the first fully-connected layer is 512 × 512, the scale of the second is 512 × 128, and the scale of the third is 128 × 1; the paths are linked by a Softmax layer to form the deep neural network model. The input of the image feature fusion sub-module is the image feature vectors constructed by the image characterization sub-module, and its output is a weight value for each image.
Preferably, the offline training module comprises a training set generating sub-module and an end-to-end training sub-module;
the training set generation sub-module is used for generating a Triplet training data set;
the end-to-end training sub-module is used for training the deep neural network model by using the Triplet training data set generated by the training set generation sub-module.
A visual information fusion method for vehicle positioning, comprising the steps of:
S1, acquiring image information and completing preprocessing of the image information;
S2, inputting the image information obtained in S1 into the image characterization sub-module: the image information acquired by the different visual sensors in the vehicle-mounted visual sensor module is input into corresponding convolutional neural networks to obtain convolutional-layer feature map outputs, from which image feature vectors are constructed;
S3, inputting the image feature vectors obtained in S2 into the image feature fusion sub-module and calculating the weight value of each image;
S4, calculating the visual similarity between the image information of the current position of the vehicle and the pre-stored image information in the database, according to the image feature vectors obtained in S2 and the weight values obtained in S3;
S5, comparing the visual similarity obtained in S4 with a preset threshold: if the visual similarity is higher than the preset threshold, judging that the current position of the vehicle is the same as the corresponding pre-stored position in the database, thereby realizing vehicle positioning; if the visual similarity is lower than the preset threshold, continuing to search the next group of images.
Preferably, the image information in S1 is acquired by a vehicle-mounted vision sensor module, where the vehicle-mounted vision sensor module is a system including a plurality of vision sensors, and the plurality of vision sensors simultaneously acquire image information of surrounding environments of the vehicle;
the number of vision sensors in the vehicle-mounted vision sensor module is denoted C; time alignment is carried out using the time stamps of the image information, so that each group of images is collected at the same time and the same position, i.e., one group of image information represents one position;
each group of image information contains C images, denoted $\{I_1, I_2, \ldots, I_C\}$.
Preferably, the processing flow of the image characterization sub-module in S2 is as follows: each image in a group of image information is input into its corresponding convolutional neural network, and each path of convolutional neural network produces a set of convolutional-layer feature maps of size W × H × K, where W × H is the size of each feature map and K is the number of feature maps;
max pooling is then applied to each feature map to obtain a K-dimensional image feature vector, giving the image feature vectors of a group of image information, denoted $\{f_1, f_2, \ldots, f_C\}$.
Preferably, the weight value of each image in S3 is calculated as follows: the c-th path of the fusion network maps the feature vector $f_c$ to a scalar score $s_c$ through its three fully-connected layers, and the Softmax layer then yields

$$w_c = \frac{\exp(s_c)}{\sum_{j=1}^{C} \exp(s_j)}, \qquad c = 1, 2, \ldots, C,$$

where c is the index of the visual sensor; $f_c$ is the image feature vector of the c-th image in the group of image information; $v_c$ and the remaining parameters of the c-th fully-connected path are network parameters of the deep neural network model; and $w_c$, a function of the feature vector $f_c$, is the weight of the c-th path image features and satisfies $\sum_{c=1}^{C} w_c = 1$.
Preferably, in S4 the visual similarity between the image information of the current position of the vehicle and the pre-stored image information in the database is calculated as

$$S_{ai} = 1 - D_{ai},$$

where a is the index of the a-th group of image information; $i = 1, 2, 3, \ldots, N$, with N the number of groups of image information in the database; and $D_{ai}$ is the distance between the two groups of image information, i.e., the visual-space distance between the two corresponding positions.
Preferably, $D_{ai}$ is calculated as

$$D_{ai} = \sum_{c=1}^{C} w_c \, d_{ai}^{c},$$

where $f_c^a$ is the feature vector of the c-th image in the a-th group of image information and $w_c$ is the weight value corresponding to the c-th path image in the a-th group of image information;

$d_{ai}^{c}$ is the Euclidean distance between the corresponding c-th image feature vectors at the a-th and i-th positions, calculated as

$$d_{ai}^{c} = \left\| f_c^a - f_c^i \right\|_2,$$

where $f_c^i$ is the feature vector of the c-th image in the i-th group of image information in the database and $\|\cdot\|_2$ denotes the 2-norm of a vector.
Compared with the prior art, the invention has the advantages that:
the visual information fusion system and the visual information fusion method for vehicle positioning can effectively utilize image information acquired by a plurality of visual sensors in the vehicle-mounted visual sensor module, solve the problem that a positioning system cannot work normally under the condition that a single sensor fails in the prior art by complementation among the image information, and remarkably improve the accuracy of vehicle positioning and the reliability of an algorithm.
Meanwhile, the invention constructs high-quality image features based on the convolutional neural network, and reasonably fuses the image information from a plurality of vision sensors through the designed image feature fusion neural network, and the network has sample self-adaption capability, and can automatically adjust the weight values of different images through training, thereby further improving the environmental adaptability of the invention and the robustness performance of the method.
In addition, the invention provides reference for other technical schemes in the same field, can be used for expanding and extending based on the reference, and is applied to other technical schemes related to vehicle positioning technology or visual information fusion technology, and has high use and popularization values.
The following detailed description of the embodiments of the present invention is provided with reference to the accompanying drawings, so that the technical scheme of the present invention can be understood and mastered more easily.
Drawings
FIG. 1 is a schematic diagram of the system architecture of the present invention;
FIG. 2 is a schematic diagram of the structure of an image characterization sub-module according to the present invention;
FIG. 3 is a schematic diagram of an image feature fusion sub-module according to the present invention;
FIG. 4 is a schematic flow chart of the method of the present invention.
Detailed Description
Aiming at the defects of the prior art, the invention provides a visual information fusion system and a visual information fusion method for vehicle positioning, which are specifically as follows.
As shown in FIG. 1, a visual information fusion system for vehicle positioning comprises:
the vehicle-mounted visual sensor module is used for acquiring image information in the external environment;
the on-line visual information processing module is used for processing the image information output by the vehicle-mounted visual sensor module and forming a deep neural network model;
the off-line training module is used for training the deep neural network model and copying network parameters obtained in the training stage to the on-line visual information processing module;
the output end of the vehicle-mounted vision sensor module is respectively in signal connection with the input end of the online vision information processing module and the input end of the offline training module.
The vehicle-mounted vision sensor module comprises a plurality of vision sensors for acquiring image information in the external environment. In this embodiment, the module contains three vision sensors, i.e., C = 3, mounted at the front, the left side, and the right side of the vehicle, respectively.
The on-line visual information processing module takes the deep neural network model formed and trained by the off-line training module as its basic framework and comprises an image characterization sub-module and an image feature fusion sub-module.
As shown in FIG. 2, the image characterization sub-module is composed of multiple paths of convolutional neural networks; in this embodiment it consists of three such paths, each adopting pretrained VGG-16 as its convolutional neural network model.
The input of the image characterization submodule is the image information acquired by the vehicle-mounted vision sensor module, and the output of the image characterization submodule is a high-dimensional image feature vector.
The image feature vectors are constructed roughly as follows. Each image in a group of image information is input into its corresponding convolutional neural network, and each path yields the output of the final convolutional layer of VGG-16, i.e., a tensor of size W × H × K, where W × H is the size of each feature map and K is the number of feature maps; the last convolutional layer of VGG-16 has 512 feature maps, i.e., K = 512. Max pooling is then applied to each feature map, giving a 512-dimensional image feature vector.
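For illustration only, the following is a minimal PyTorch sketch of this feature-construction step; the patent prescribes no framework, and the module and variable names here (including the use of torchvision's pretrained VGG-16) are assumptions:

```python
import torch
import torchvision

# Pretrained VGG-16 up to (and including) its last conv/ReLU stage; the final
# max-pool of .features is dropped so the output is the last conv-layer tensor,
# a W' x H' grid of K = 512 feature maps.
backbone = torchvision.models.vgg16(weights="IMAGENET1K_V1").features[:-1].eval()

def image_feature_vector(image: torch.Tensor) -> torch.Tensor:
    """Map one preprocessed image (1 x 3 x H x W) to a 512-dim feature vector
    by max pooling each of the K = 512 feature maps over its spatial extent."""
    with torch.no_grad():
        fmaps = backbone(image)               # shape: 1 x 512 x H' x W'
    return fmaps.amax(dim=(2, 3)).squeeze(0)  # shape: (512,)

# One feature vector per camera image in a group {I_1, ..., I_C}, here C = 3:
group = [torch.randn(1, 3, 224, 224) for _ in range(3)]   # stand-in images
f1, f2, f3 = (image_feature_vector(img) for img in group)
```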
As shown in FIG. 3, the image feature fusion sub-module is formed by multiple paths of neural networks; in this embodiment it consists of three paths, each comprising three fully-connected layers. The scale of the first fully-connected layer is 512 × 512, the scale of the second is 512 × 128, and the scale of the third is 128 × 1; the paths are linked by a Softmax layer to form the deep neural network model. The input of the image feature fusion sub-module is the image feature vectors constructed by the image characterization sub-module, and its output is a weight value for each image.
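A minimal sketch of such a fusion network follows; the layer sizes are those given above, while the class name, the ReLU activations between layers, and the input handling are illustrative assumptions:

```python
import torch
import torch.nn as nn

class ImageFeatureFusion(nn.Module):
    """C parallel three-layer fully-connected paths (512 -> 512 -> 128 -> 1),
    linked by a Softmax over the C scalar scores to produce image weights."""
    def __init__(self, num_sensors: int = 3):
        super().__init__()
        self.paths = nn.ModuleList(
            nn.Sequential(
                nn.Linear(512, 512), nn.ReLU(),  # first fully-connected layer, 512 x 512
                nn.Linear(512, 128), nn.ReLU(),  # second fully-connected layer, 512 x 128
                nn.Linear(128, 1),               # third fully-connected layer, 128 x 1
            )
            for _ in range(num_sensors)
        )

    def forward(self, features: list[torch.Tensor]) -> torch.Tensor:
        # One scalar score per path; Softmax makes the C weights sum to 1.
        scores = torch.stack([path(f) for path, f in zip(self.paths, features)])
        return torch.softmax(scores.squeeze(-1), dim=0)  # (C,) weights w_1..w_C

fusion = ImageFeatureFusion(num_sensors=3)
weights = fusion([torch.randn(512) for _ in range(3)])   # three weights summing to 1
```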
The off-line training module comprises a training set generation sub-module and an end-to-end training sub-module: the training set generation sub-module is used to generate a Triplet training data set, and the end-to-end training sub-module is used to train the deep neural network model with the Triplet training data set so generated.
It should be noted that, in the off-line training module of the invention, the deep neural network is trained with batch gradient descent in an end-to-end manner. The specific training steps are as follows:
Step 1, generate Triplet samples on the training set. Each Triplet sample contains three groups of image information: the image information to be retrieved, positively correlated image information, and negatively correlated image information. The positively correlated image information comes from the same position as the image information to be retrieved, while the negatively correlated image information comes from a different position.
Based on this Triplet training data set, a training method based on the Triplet loss function is adopted. The loss function L is computed as

$$L = \max(D_{qp} - D_{qn} + m, 0),$$

where $D_{qp}$ is the visual-space distance between the image information to be retrieved and the positively correlated image information, $D_{qn}$ is the visual-space distance between the image information to be retrieved and the negatively correlated image information, and m is a preset margin; both distances are computed with the distance formula given below.
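In code, this loss is a simple hinge on the two visual-space distances; the default margin value below is an invented placeholder:

```python
import torch

def triplet_loss(d_qp: torch.Tensor, d_qn: torch.Tensor, margin: float = 0.1) -> torch.Tensor:
    """L = max(D_qp - D_qn + m, 0): drive the query closer to the positive
    group than to the negative group by at least the margin m."""
    return torch.clamp(d_qp - d_qn + margin, min=0.0)
```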
Step 2, initialize each path of the feature extraction network (the image characterization sub-module) with the pretrained VGG-16 convolutional neural network model; then fix the parameters of each convolutional path and train the image feature fusion network.
Step 3, fine-tune the parameters of the feature extraction network and the feature fusion network simultaneously.
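A sketch of this two-stage schedule, reusing the `backbone` and `fusion` objects from the sketches above; the optimizer choice and learning rates are assumptions, since the text specifies only batch gradient descent and end-to-end training:

```python
import torch

# Stage (Step 2): freeze the VGG-16 feature-extraction paths and train only
# the fusion network on Triplet batches.
for p in backbone.parameters():
    p.requires_grad = False
optimizer = torch.optim.SGD(fusion.parameters(), lr=1e-3)
# ... run Triplet-loss batches against `optimizer` here ...

# Stage (Step 3): unfreeze everything and fine-tune both networks jointly,
# typically at a lower learning rate.
for p in backbone.parameters():
    p.requires_grad = True
optimizer = torch.optim.SGD(
    list(backbone.parameters()) + list(fusion.parameters()), lr=1e-4
)
```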
As shown in FIG. 4, corresponding to the above visual information fusion system for vehicle positioning, the present invention further provides a visual information fusion method for vehicle positioning, comprising the following steps:
S1, acquiring image information and completing preprocessing of the image information.
S2, inputting the image information obtained in S1 into the image characterization sub-module: the image information acquired by the different visual sensors in the vehicle-mounted visual sensor module is input into corresponding convolutional neural networks to obtain convolutional-layer feature map outputs, from which image feature vectors are constructed.
S3, inputting the image feature vectors obtained in S2 into the image feature fusion sub-module and calculating the weight value of each image.
S4, calculating the visual similarity between the image information of the current position of the vehicle and the pre-stored image information in the database, according to the image feature vectors obtained in S2 and the weight values obtained in S3.
S5, comparing the visual similarity obtained in S4 with a preset threshold to perform position retrieval. If the visual similarity is higher than the preset threshold, it is judged that the current position of the vehicle is the same as the corresponding pre-stored position in the database, realizing vehicle positioning; if the visual similarity is lower than the preset threshold, the current position is not found at this database entry, and the search continues with the next group of images.
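Steps S4 and S5 amount to a linear scan over the N stored groups. A sketch follows, where `similarity` stands for the computation S_ai = 1 - D_ai detailed below and the default threshold is an invented placeholder:

```python
def locate(query_group, database, similarity, threshold=0.85):
    """Scan stored (position, image_group) pairs; return the position whose
    group is most similar to the query, or None if nothing clears the
    preset threshold (step S5)."""
    best_position, best_similarity = None, threshold
    for position, stored_group in database:           # i = 1, ..., N stored groups
        s_ai = similarity(query_group, stored_group)  # step S4
        if s_ai > best_similarity:
            best_position, best_similarity = position, s_ai
    return best_position
```

Returning the best match above the threshold is one reasonable reading; taken literally, the text accepts the first stored group whose similarity clears the threshold.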
Specifically, the image information in S1 is acquired by a vehicle-mounted vision sensor module, where the vehicle-mounted vision sensor module is a system including a plurality of vision sensors, and a plurality of vision sensors acquire image information of the surrounding environment of the vehicle at the same time.
The number of vision sensors in the vehicle-mounted vision sensor module is denoted C; in this embodiment there are three vision sensors, i.e., C = 3. To ensure time synchronization of the images obtained by the multiple sensors, time alignment is performed using the time stamps of the image information, so that each group of images is acquired at the same time and the same location, i.e., one group of image information represents one position.
Each group of image information contains C images, denoted $\{I_1, I_2, \ldots, I_C\}$.
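A sketch of this timestamp-based grouping; the tolerance value, data layout, and the assumption that the per-sensor streams are index-aligned are illustrative simplifications:

```python
def group_by_timestamp(streams, tol=0.02):
    """streams: one time-ordered list of (timestamp, image) pairs per sensor.
    Yields groups [I_1, ..., I_C] whose C timestamps agree within `tol`
    seconds, so that each emitted group represents a single position."""
    for frames in zip(*streams):                 # one candidate frame per sensor
        times = [t for t, _ in frames]
        if max(times) - min(times) <= tol:       # keep only aligned groups
            yield [image for _, image in frames]
```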
The processing flow of the image characterization sub-module in S2 is as follows:
Each image in a group of image information is input into its corresponding convolutional neural network, and each path of convolutional neural network produces a set of convolutional-layer feature maps of size W × H × K, where W × H is the size of each feature map and K is the number of feature maps.
Max pooling is then applied to each feature map to obtain a K-dimensional image feature vector, giving the image feature vectors of a group of image information, denoted $\{f_1, f_2, \ldots, f_C\}$.
The weight value of each image in S3 is calculated as follows: the c-th path of the fusion network maps the feature vector $f_c$ to a scalar score $s_c$ through its three fully-connected layers, and the Softmax layer then yields

$$w_c = \frac{\exp(s_c)}{\sum_{j=1}^{C} \exp(s_j)}, \qquad c = 1, 2, \ldots, C,$$

where c is the index of the visual sensor; $f_c$ is the image feature vector of the c-th image in the group of image information; $v_c$ and the remaining parameters of the c-th fully-connected path are network parameters of the deep neural network model; and $w_c$, a function of $f_c$, is the weight of the c-th path image features and satisfies $\sum_{c=1}^{C} w_c = 1$.
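As a numeric illustration of this weighting (the score values are invented for the example): if the three paths produce scores $s = (2.0,\ 1.0,\ 0.1)$, say with the front view clear and the right view partly occluded, the Softmax gives

$$w = \frac{\big(e^{2.0},\ e^{1.0},\ e^{0.1}\big)}{e^{2.0} + e^{1.0} + e^{0.1}} \approx \frac{(7.39,\ 2.72,\ 1.11)}{11.21} \approx (0.66,\ 0.24,\ 0.10),$$

so the fused distance in S4 is dominated by the most informative camera, while degraded views are down-weighted rather than discarded.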
In S4, the visual similarity between the image information of the current position of the vehicle and the pre-stored image information in the database is calculated as

$$S_{ai} = 1 - D_{ai},$$

where a is the index of the a-th group of image information; $i = 1, 2, 3, \ldots, N$, with N the number of groups of image information in the database; and $D_{ai}$ is the distance between the two groups of image information, i.e., the visual-space distance between the two corresponding positions.
$D_{ai}$ is calculated as

$$D_{ai} = \sum_{c=1}^{C} w_c \, d_{ai}^{c},$$

where $f_c^a$ is the feature vector of the c-th image in the a-th group of image information and $w_c$ is the weight value corresponding to the c-th path image in the a-th group of image information;

$d_{ai}^{c}$ is the Euclidean distance between the corresponding c-th image feature vectors at the a-th and i-th positions, calculated as

$$d_{ai}^{c} = \left\| f_c^a - f_c^i \right\|_2,$$

where $f_c^i$ is the feature vector of the c-th image in the i-th group of image information in the database and $\|\cdot\|_2$ denotes the 2-norm of a vector.
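Putting S4 together, a sketch of the distance and similarity computation under the formulas above (function and variable names are assumptions):

```python
import torch

def visual_similarity(query_feats, db_feats, weights):
    """S_ai = 1 - D_ai, where D_ai = sum_c w_c * ||f_c^a - f_c^i||_2 as in the
    formulas above; query_feats and db_feats are length-C lists of feature
    vectors, and weights is the (C,) output of the fusion network."""
    d_ai = sum(w * torch.linalg.vector_norm(fa - fi)    # w_c * Euclidean distance
               for w, fa, fi in zip(weights, query_feats, db_feats))
    return 1.0 - d_ai
```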
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.
Furthermore, it should be understood that although this specification is described in terms of embodiments, not every embodiment contains only a single independent technical solution; this manner of description is adopted merely for clarity, and the specification should be taken as a whole, with the technical solutions of the various embodiments combinable as appropriate to form other implementations that will be apparent to those skilled in the art.
Claims (5)
1. A visual information fusion system for vehicle positioning, comprising:
the vehicle-mounted visual sensor module comprises a plurality of visual sensors for acquiring image information in the external environment;
the off-line training module is used for training the deep neural network model and copying network parameters obtained in the training stage to the on-line visual information processing module;
the on-line visual information processing module comprises an image characterization sub-module and an image feature fusion sub-module and takes as its basic framework the deep neural network model trained by the off-line training module on the image information output by the vehicle-mounted visual sensor module, wherein the image characterization sub-module is composed of multiple paths of convolutional neural networks, the input of the image characterization sub-module is the image information acquired by the vehicle-mounted visual sensor module, and the output of the image characterization sub-module is a high-dimensional image feature vector;
the image feature fusion sub-module is composed of multiple paths of neural networks, each path comprising three fully-connected layers, wherein the scale of the first fully-connected layer is 512 × 512, the scale of the second fully-connected layer is 512 × 128, and the scale of the third fully-connected layer is 128 × 1; the multiple paths of neural networks are linked by means of a Softmax layer to form the deep neural network model; the input of the image feature fusion sub-module is the image feature vectors constructed by the image characterization sub-module, and the output of the image feature fusion sub-module is a weight value for different images;
the output end of the vehicle-mounted vision sensor module is respectively in signal connection with the input end of the online vision information processing module and the input end of the offline training module.
2. The visual information fusion system for vehicle localization of claim 1, wherein: the off-line training module comprises a training set generation sub-module and an end-to-end training sub-module;
the training set generation sub-module is used for generating a Triplet training data set;
the end-to-end training sub-module is used for training the deep neural network model by using the Triplet training data set generated by the training set generation sub-module.
3. A visual information fusion method for vehicle positioning, characterized by comprising the steps of:
S1, acquiring image information by a vehicle-mounted vision sensor module and completing preprocessing of the image information, wherein the vehicle-mounted vision sensor module is a system comprising a plurality of vision sensors that simultaneously acquire image information of the surrounding environment of the vehicle, and the number of vision sensors in the vehicle-mounted vision sensor module is denoted C; time alignment is carried out using the time stamps of the image information, so that each group of images is collected at the same time and the same position, i.e., one group of image information represents one position; each group of image information contains C images, denoted $\{I_1, I_2, \ldots, I_C\}$;
S2, inputting the image information obtained in S1 into an image characterization sub-module, the processing flow being that the image information acquired by the different visual sensors in the vehicle-mounted visual sensor module is input into corresponding convolutional neural networks, each path of convolutional neural network producing a set of convolutional-layer feature maps of size W × H × K, where W × H is the size of each feature map and K is the number of feature maps; max pooling is then applied to each feature map to obtain a K-dimensional image feature vector, giving the image feature vectors of a group of image information, denoted $\{f_1, f_2, \ldots, f_C\}$, i.e., image feature vectors constructed from the convolutional-layer feature map outputs;
S3, inputting the image feature vectors obtained in S2 into an image feature fusion sub-module and calculating the weight value of each image: the c-th path of the fusion network maps the feature vector $f_c$ to a scalar score $s_c$ through its three fully-connected layers, whose parameters (including $v_c$) are network parameters of the deep neural network model, and the Softmax layer yields

$$w_c = \frac{\exp(s_c)}{\sum_{j=1}^{C} \exp(s_j)}, \qquad c = 1, 2, \ldots, C,$$

where c is the index of the visual sensor and $w_c$, a function of $f_c$, is the weight of the c-th path image features, satisfying $\sum_{c=1}^{C} w_c = 1$;
S4, calculating the visual similarity between the image information of the current position of the vehicle and the pre-stored image information in the database, according to the image feature vectors obtained in S2 and the weight values obtained in S3;
S5, comparing the visual similarity obtained in S4 with a preset threshold: if the visual similarity is higher than the preset threshold, judging that the current position of the vehicle is the same as the corresponding pre-stored position in the database, thereby realizing vehicle positioning; if the visual similarity is lower than the preset threshold, continuing to search the next group of images.
4. The visual information fusion method for vehicle localization of claim 3, wherein in S4 the visual similarity between the image information of the current position of the vehicle and the pre-stored image information in the database is calculated as $S_{ai} = 1 - D_{ai}$, where a is the index of the a-th group of image information; $i = 1, 2, 3, \ldots, N$, with N the number of groups of image information in the database; and $D_{ai}$ is the distance between the two groups of image information, i.e., the visual-space distance between the two corresponding positions.
5. The visual information fusion method for vehicle localization of claim 4, wherein $D_{ai}$ is calculated as

$$D_{ai} = \sum_{c=1}^{C} w_c \, d_{ai}^{c},$$

where $f_c^a$ is the feature vector of the c-th path image in the a-th group of image information and $w_c$ is the weight value corresponding to the c-th path image in the a-th group of image information; $d_{ai}^{c}$ is the Euclidean distance between the corresponding c-th image feature vectors at the a-th and i-th positions, calculated as

$$d_{ai}^{c} = \left\| f_c^a - f_c^i \right\|_2,$$

where $f_c^i$ is the feature vector of the c-th image in the i-th group of image information in the database and $\|\cdot\|_2$ denotes the 2-norm of a vector.
Priority Applications (1)

Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910332583.0A (CN110119768B) | 2019-04-24 | 2019-04-24 | Visual information fusion system and method for vehicle positioning |
Publications (2)

Publication Number | Publication Date |
---|---|
CN110119768A | 2019-08-13 |
CN110119768B | 2023-10-31 |
Family
ID=67521291
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110514662B (en) * | 2019-09-10 | 2022-06-28 | 上海深视信息科技有限公司 | Visual detection system with multi-light-source integration |
CN110660103B (en) * | 2019-09-17 | 2020-12-25 | 北京三快在线科技有限公司 | Unmanned vehicle positioning method and device |
CN110889378B (en) * | 2019-11-28 | 2023-06-09 | 湖南率为控制科技有限公司 | Multi-view fusion traffic sign detection and identification method and system thereof |
CN111240187B (en) * | 2020-01-16 | 2023-01-13 | 南京理工大学 | Vehicle track tracking control algorithm based on vehicle error model |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10395144B2 (en) * | 2017-07-24 | 2019-08-27 | GM Global Technology Operations LLC | Deeply integrated fusion architecture for automated driving systems |
CN108921013B (en) * | 2018-05-16 | 2020-08-18 | 浙江零跑科技有限公司 | Visual scene recognition system and method based on deep neural network |
CN109242003B (en) * | 2018-08-13 | 2021-01-01 | 浙江零跑科技有限公司 | Vehicle-mounted vision system self-motion determination method based on deep convolutional neural network |
Legal Events

Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |