CN102831446A - Image appearance based loop closure detecting method in monocular vision SLAM (simultaneous localization and mapping) - Google Patents

Image appearance based loop closure detecting method in monocular vision SLAM (simultaneous localization and mapping) Download PDF

Info

Publication number
CN102831446A
CN102831446A CN2012102951816A CN201210295181A
Authority
CN
China
Prior art keywords
image
visual
clustering
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2012102951816A
Other languages
Chinese (zh)
Inventor
梁志伟
陈燕燕
朱松豪
徐国政
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Post and Telecommunication University
Original Assignee
Nanjing Post and Telecommunication University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Post and Telecommunication University filed Critical Nanjing Post and Telecommunication University
Priority to CN2012102951816A priority Critical patent/CN102831446A/en
Publication of CN102831446A publication Critical patent/CN102831446A/en
Pending legal-status Critical Current

Links

Images

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses an image appearance based loop closure detecting method in monocular vision SLAM (simultaneous localization and mapping). The method includes: acquiring images of the current scene with a monocular camera carried by a mobile robot as it advances, and extracting bag-of-visual-words features from the images of the current scene; preprocessing the images by measuring image similarity from the inner product of the image weight vectors and rejecting any current image that is highly similar to the previous history image; updating the posterior probability of the loop closure hypothesis state with a Bayesian filtering process to perform loop closure detection and judge whether the current image forms a loop closure; and verifying the loop closure detection result obtained in the previous step with an image reverse retrieval process. Further, in the process of establishing the visual dictionary, the number of clustering categories is adjusted dynamically according to the TSC (tightness and separation criterion) value, which serves as the evaluation criterion for the clustering result. Compared with the prior art, the loop closure detecting method has the advantages of high real-time performance and detection precision.

Description

Closed loop detection method based on image appearance in monocular vision SLAM
Technical Field
The invention provides a closed-loop detection method based on image appearance in monocular vision SLAM (simultaneous localization and mapping), aimed at the problem of closed-loop detection in SLAM, and belongs to the technical field of mobile robot navigation.
Background
Simultaneous localization and mapping (SLAM) is a fundamental problem and research hotspot in the field of mobile robot navigation, and the ability to localize and map simultaneously is widely regarded as a key precondition for a robot to achieve autonomous navigation. During SLAM the robot localizes itself while constructing an environment map; owing to the lack of prior knowledge and the uncertainty of the environment, the robot must judge during its travel whether its current position lies in an environment area it has already visited, and use that judgment as the basis for deciding whether the map needs to be updated. This is the closed-loop detection problem.
Due to the limited observation range of the vision sensor, monocular vision SLAM closed-loop detection faces many problems: uncertainty and error in the robot's motion may lead to data-association errors, and questions remain of how to detect visual features, how to characterize the visual scene model, and so on. Accurately establishing a scene model is the key to visual SLAM closed-loop detection, and at present most vision-based scene models are described directly by the observed environmental appearance characteristics. The BoVW (bag of visual words) algorithm is an effective image feature modeling method and is widely used for visual SLAM closed-loop detection. In this method, local features of an image are extracted with the SURF or SIFT operator and then classified to construct a visual dictionary; based on the created visual dictionary, any image can be represented by a set of visual words from the dictionary.
In visual SLAM closed-loop detection, Angeli et al. proposed a topological closed-loop detection method based on incremental vision, and Cummins et al. proposed a probabilistic closed-loop detection method based on topological appearance; both methods detect effectively in large-scale environments but cannot meet the efficiency and real-time requirements of closed-loop detection in the SLAM problem. RTAB-MAP is a real-time closed-loop detection method based on scene appearance whose strong memory management allows the robot to process every frame online over long periods, but its detection accuracy is low and false closed-loop detections occur easily.
Disclosure of Invention
The technical problem to be solved by the invention is to overcome the defects of the prior art and provide a closed-loop detection method based on image appearance in monocular vision SLAM that can effectively improve the real-time performance and accuracy of online closed-loop detection.
The invention specifically adopts the following technical scheme to solve the technical problems:
a closed loop detection method based on image appearance in monocular vision SLAM comprises the following steps:
step 1, acquiring a current scene image by using a monocular camera carried by a mobile robot in the advancing process of the mobile robot, and extracting visual bag-of-word characteristics of the current scene image;
step 2, calculating the content similarity between the current scene image and the previous frame of historical image; if the maximum value of the content similarity is smaller than a preset similarity threshold, saving the current scene image and executing step 3; otherwise, deleting the current scene image and returning to step 1 to acquire a new image, wherein the content similarity S between the current scene image I_t and the previous frame of historical image I_c is calculated according to the following formula:
S = (V_{I_t}, V_{I_c}) / (|V_{I_t}| |V_{I_c}|)
where V_{I_t} denotes the visual bag-of-words feature vector of the current scene image I_t and V_{I_c} denotes the visual bag-of-words feature vector of the previous frame of historical image I_c;
and 3, continuously updating the posterior probability of the closed-loop assumed state by using a Bayesian filtering method to perform closed-loop detection, and judging whether the current scene image is closed-loop or not.
In order to improve the accuracy of closed-loop detection, the method further verifies the detection result obtained in the step 3 by using an image reverse retrieval method, and specifically, the method further comprises the following steps:
and 4, verifying the detection result of the step 3 according to the following method: when the current image and a historical image are detected to form a closed loop in the step 3, counting the frequency of the visual words in the visual word bag characteristics of the current image in the visual word bag characteristics of each historical image; selecting the previous P historical images with the highest frequency, wherein P is a natural number; if the historical image detected in the step 3 is any one of the P historical images, the closed loop detection is considered to be correct, and the closed loop is accepted; otherwise, the closed loop detection is wrong, and the closed loop is rejected.
The invention can adopt an existing method to extract the visual bag-of-words feature of an image. Existing methods usually represent an image directly by the frequency vector of its visual words; to make the visual bag-of-words feature characterize the image more accurately, the invention borrows the tf-idf weighting method from text retrieval and represents each frame of image by a weighted word-frequency vector. Specifically, the invention extracts the visual bag-of-words feature of an image according to the following method:
step 1, extracting local visual features of an image I to obtain a local visual feature vector set of the image I;
step 2, representing the image I as a K-dimensional vector V_I, namely the visual bag-of-words feature vector of the image I:
V_I = [t_1, ..., t_j, ..., t_K]^T,   j = 1, 2, ..., K
where t_j = (n_{jI} / n_I) · log(N / N_j), K is the number of visual words in the visual dictionary, n_{jI} is the frequency of occurrence of the jth visual word in the local visual feature vector set of image I, n_I is the number of visual words appearing in the local visual feature vector set of image I, N is the number of all currently saved history images, and N_j is the number of currently saved history images whose visual bag-of-words feature contains the jth visual word.
Preferably, the visual dictionary is constructed offline by the following method:
step 1, collecting a group of environment scene images in advance and extracting local visual features of the images respectively, wherein all local visual feature vectors form a training sample set;
step 2, clustering the training sample set, and constructing a visual dictionary by taking the obtained clustering centers as visual words, wherein each clustering center is a visual word; the method specifically comprises the following steps:
step 201, setting an initial clustering category number K;
step 202, performing fuzzy K-means clustering on the training sample set with the current number of cluster categories K; in each iteration step, assigning each sample to a cluster center according to the maximum-membership criterion, where the membership R_{lj} of the lth sample to the jth cluster center is given by:
R_{lj} = (1 / ||D_l − V_j||^2) / Σ_{m=1}^{K} (1 / ||D_l − V_m||^2),   1 ≤ l ≤ M, 1 ≤ j ≤ K
where D_l denotes the lth sample in the training sample set, V_j denotes the jth cluster center, M is the number of samples in the training sample set, and K is the number of cluster categories;
and updating the clustering center according to the following formula:
V_j(t+1) = V_j(t) + [Σ_{l=1}^{M} R_{lj}(t) · (D_l − V_j(t))] / [Σ_{l=1}^{M} R_{lj}(t)]
where V_j(t) and V_j(t+1) respectively denote the jth cluster center in the tth and (t+1)th iteration steps, and R_{lj}(t) denotes the membership of the lth sample to the jth cluster center in the tth iteration step;
step 203, judging whether the TSC value of the current clustering result is within a preset range, if so, turning to step 204; if not, changing the current clustering category number K, and turning to the step 202; the TSC value is calculated according to the following formula:
TSC(V, K) = [ (1/M) Σ_{j=1}^{K} Σ_{l=1}^{M} R_{lj}^2 ||D_l − V_j||^2 ] / min_{j1,j2} ||V_{j1} − V_{j2}||^2
where min_{j1,j2} ||V_{j1} − V_{j2}||^2 is the squared distance between the two nearest cluster centers V_{j1} and V_{j2} among the K cluster centers, D_l denotes the lth sample in the training sample set, V_j denotes the jth cluster center, M is the number of samples in the training sample set, K is the number of cluster categories, and R_{lj} is the membership of the lth sample to the jth cluster center;
and step 204, constructing a visual dictionary by taking the K clustering centers as visual words, wherein each clustering center is a visual word.
Compared with the prior art, the invention has the following beneficial effects:
firstly, the acquired images are preprocessed using image similarity: scene images highly similar to the previous frame of historical image are rejected and only a small number of images that best represent the current environmental characteristics are retained for subsequent closed-loop detection, which greatly reduces the amount of computation and the demand on hardware storage capacity and improves the real-time performance of detection;
secondly, on the basis of the closed-loop detection result obtained by the Bayesian filtering update method, the result is verified with an image reverse retrieval method, which effectively improves the accuracy of the detection result;
thirdly, a weighted visual word-frequency vector is used as the visual bag-of-words feature to represent the image, so that the description of the image features is more accurate;
and fourthly, when the visual dictionary is constructed offline, the TSC criterion is adopted to evaluate the clustering effect, thereby obtaining a more accurate clustering result.
Drawings
FIG. 1 is a schematic flow chart of a closed-loop detection method based on image appearance in monocular vision SLAM according to the present invention;
FIG. 2 is a flow chart of the construction process of the visual dictionary in the method of the present invention.
Detailed Description
The technical scheme of the invention is explained in detail below with reference to the accompanying drawings:
the basic flow of the closed-loop detection method based on the image appearance in the monocular vision SLAM is shown in figure 1, and the method comprises the following steps:
step 1, the mobile robot collects a current image by using a monocular camera carried by the mobile robot, and extracts visual word bag characteristics of the current image.
In the BoVW image feature model (for details see T. Botterill, S. Mills, R. Green. Bag-of-words-driven, single-camera simultaneous localization and mapping. Journal of Field Robotics, 2011, 28(2): 204-226), a visual dictionary is constructed from a large number of local visual feature vectors of images, each local visual feature serving as a visual word in the dictionary; based on the created visual dictionary, any image can then be characterized by a set of visual words from the dictionary, namely the visual bag-of-words feature of the image. The invention adopts the following method to extract the visual bag-of-words feature of an image:
Step 1.1, extracting local visual features of an image I to obtain the local visual feature vector set of the image I. The local visual features can be extracted with the existing SURF or SIFT operators; the SURF algorithm is briefly described below, and for the SIFT algorithm see D. Lowe. Object recognition from local scale-invariant features. In Proceedings of the IEEE International Conference on Computer Vision, 1999: 1150-1157.
SURF detects feature points using a basic approximation of the Hessian matrix, whose determinant value serves as the basis for scale selection. For any point (x, y) in the image I, the Hessian matrix H(x, y, σ) at (x, y) with scale σ is defined as follows:
H(x, y, σ) = [ L_xx(x, y, σ)  L_xy(x, y, σ) ; L_xy(x, y, σ)  L_yy(x, y, σ) ]   (1)
wherein L(x, y, σ) is the LoG (Laplacian of Gaussian) response of the image, i.e., the convolution of the original image I(x, y) with a variable-scale 2-dimensional Gaussian function:
L(x,y,σ)=G(x,y,σ)*I(x,y) (2)
SURF approximates the LoG operator with the DoG (Difference of Gaussians) operator:
D(x,y,σ)=[G(x,y,kσ)-G(x,y,σ)]*I(x,y)=L(x,y,kσ)-L(x,y,σ) (3)
the DoG only needs to subtract the images after adjacent scale gaussian smoothing in calculation, thereby simplifying the calculation. The box-type filter is adopted to approximate Gaussian second derivative, and the integral image is used to quickly calculate the image convolution of the average filters, so as to obtain Dxx,Dxy,DyyIs approximately Lxx,Lxy,LyyThus, the determinant of the Hessian matrix has an approximation:
det(H_approx) = D_xx · D_yy − (ω · D_xy)^2   (4)
wherein ω is an adjusting parameter used to balance the terms of the Hessian determinant expression and is generally set to the constant 0.9 in actual calculation. The SURF scale space is divided into octaves with a constant number of layers per octave; the difference is that the image size is kept unchanged while the filter size is varied to construct the scale space, and feature points are searched at different scales.
For each detected feature point, the wavelet responses in the x and y directions are computed in a circular neighborhood of radius 6s (s being the scale of the feature point) and Gaussian-weighted around the feature point to obtain the point's response description (x, y), where x denotes the response in the x direction and y the response in the y direction. The responses are summed within a sliding window to obtain local direction vectors, and the longest vector is taken as the description vector (dominant orientation) of the feature point. A square window of size 20s is then constructed around the feature point and divided into 16 (4 × 4) sub-regions, each sub-region being further divided into 4 blocks, giving 64 description primitives. For each sub-region, Σdx, Σ|dx|, Σdy and Σ|dy| are computed and the sub-region is represented by the vector v = (Σdx, Σ|dx|, Σdy, Σ|dy|); combining the 16 vectors yields a descriptor of length 64, i.e., each feature point is described by a 64-dimensional feature vector.
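By way of illustration only, a minimal Python sketch of the local feature extraction of step 1.1 is given below; the OpenCV calls (cv2.xfeatures2d.SURF_create is only present in builds with the contrib/nonfree modules) and the Hessian threshold value are assumptions of this sketch rather than part of the claimed method, and SIFT may be substituted.

```python
# Minimal sketch of step 1.1 (local feature extraction), assuming OpenCV with the
# contrib/nonfree modules (opencv-contrib-python).  If SURF is unavailable in your
# build, cv2.SIFT_create() can be substituted (it yields 128-dim descriptors).
import cv2
import numpy as np

def extract_local_features(image_path, hessian_threshold=400):
    """Return an (n, 64) array of SURF descriptors for one scene image."""
    image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=hessian_threshold)
    keypoints, descriptors = surf.detectAndCompute(image, None)
    if descriptors is None:                      # no feature points detected
        descriptors = np.empty((0, 64), dtype=np.float32)
    return descriptors
```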
Step 1.2, extracting the visual bag-of-words feature of the image, i.e., expressing the local visual feature vector set of the image as a set of visual words according to the visual dictionary. This embodiment does not adopt the traditional method of representing an image directly by the frequency vector of its visual words; instead, it borrows the tf-idf weighting method from text retrieval and represents each frame of image by a weighted word-frequency vector. Specifically, the image I is represented as a K-dimensional vector V_I, namely the visual bag-of-words feature vector of the image I:
V_I = [t_1, ..., t_j, ..., t_K]^T,   j = 1, 2, ..., K
where t_j = (n_{jI} / n_I) · log(N / N_j), K is the number of visual words in the visual dictionary, n_{jI} is the frequency of occurrence of the jth visual word in the local visual feature vector set of image I, n_I is the number of visual words appearing in the local visual feature vector set of image I, N is the number of all currently saved history images, and N_j is the number of currently saved history images whose visual bag-of-words feature contains the jth visual word.
The visual dictionary has a crucial influence on the accuracy of visual bag-of-words feature extraction; it is constructed offline by the following method:
Step A, acquiring a group of environment scene images in advance and extracting their local visual features respectively; all local visual feature vectors form the training sample set. For example, local visual features are extracted with SURF, a total of M visual feature vectors being extracted and denoted {D_l | 1 ≤ l ≤ M}, each visual feature vector (sample) having a fixed length T (T = 64 in this embodiment).
Step B, clustering the training sample set and constructing a visual dictionary with the obtained cluster centers as visual words, each cluster center being one visual word. In order to improve clustering accuracy, the conventional fuzzy K-means clustering is improved: the TSC (tightness and separation criterion) value is used as the evaluation standard of the clustering result and the number of cluster categories K is adjusted dynamically, so that an optimized clustering result is obtained. Specifically, this step comprises the following sub-steps:
Step B1, setting the initial clustering category number K; the initial K value may be determined empirically, and is preferably 500 in the present invention.
Step B2, performing fuzzy K-means clustering on the training sample set with the current number of cluster categories K; in each iteration step, each sample is assigned to a cluster center according to the maximum-membership criterion, i.e., each sample in the training sample set is classified into the class of the cluster center to which it has the greatest membership; the membership R_{lj} of the lth sample to the jth cluster center is given by:
R_{lj} = (1 / ||D_l − V_j||^2) / Σ_{m=1}^{K} (1 / ||D_l − V_m||^2),   1 ≤ l ≤ M, 1 ≤ j ≤ K   (5)
where D_l denotes the lth sample in the training sample set, V_j denotes the jth cluster center, M is the number of samples in the training sample set, and K is the number of cluster categories;
and updating the clustering center according to the following formula:
V_j(t+1) = V_j(t) + [Σ_{l=1}^{M} R_{lj}(t) · (D_l − V_j(t))] / [Σ_{l=1}^{M} R_{lj}(t)]   (6)
where V_j(t) and V_j(t+1) respectively denote the jth cluster center in the tth and (t+1)th iteration steps, and R_{lj}(t) denotes the membership of the lth sample to the jth cluster center in the tth iteration step;
The iteration is repeated in this way and stops when a preset termination condition is met, for example when a preset number of iterations is reached or when the change of the cluster centers between two consecutive iterations falls below a preset distance threshold; at that point all local visual feature vectors in the training sample set have been classified into K classes. In this embodiment the latter is used as the termination condition, expressed as:
max_{1 ≤ j ≤ K} ||V_j(t+1) − V_j(t)|| < ε
i.e., the iteration stops when the maximum displacement of the cluster centers between two consecutive iterations falls below the preset distance threshold ε.
B3, judging whether the TSC value of the current clustering result is within a preset range, if so, turning to the step B4; if not, changing the current cluster category number K, and turning to the step B2; the TSC value is calculated according to the following formula:
TSC(V, K) = [ (1/M) Σ_{j=1}^{K} Σ_{l=1}^{M} R_{lj}^2 ||D_l − V_j||^2 ] / min_{j1,j2} ||V_{j1} − V_{j2}||^2   (7)
where min_{j1,j2} ||V_{j1} − V_{j2}||^2 is the squared distance between the two nearest cluster centers V_{j1} and V_{j2} among the K cluster centers, D_l denotes the lth sample in the training sample set, V_j denotes the jth cluster center, M is the number of samples in the training sample set, K is the number of cluster categories, and R_{lj} is the membership of the lth sample to the jth cluster center;
and step B4, constructing a visual dictionary by taking K clustering centers as visual words, wherein each clustering center is a visual word.
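As an illustration of steps B1 to B4, the following sketch runs fuzzy K-means for a given K according to formulas (5) and (6), scores the result with the TSC value of formula (7), and adjusts K; the initial K of 500 follows the embodiment, while the acceptance range for the TSC value and the rule for changing K are assumptions, since the text leaves both to the practitioner.

```python
import numpy as np

def fuzzy_kmeans(samples, K, max_iter=100, eps=1e-3, seed=0):
    """One fuzzy K-means run (formulas (5) and (6)) for a modest training set."""
    rng = np.random.default_rng(seed)
    centers = samples[rng.choice(len(samples), K, replace=False)]
    R = None
    for _ in range(max_iter):
        d2 = ((samples[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2) + 1e-12
        R = (1.0 / d2) / (1.0 / d2).sum(axis=1, keepdims=True)          # membership R_lj
        shift = (R[:, :, None] * (samples[:, None, :] - centers[None, :, :])).sum(axis=0)
        new_centers = centers + shift / R.sum(axis=0)[:, None]          # formula (6)
        done = np.max(np.linalg.norm(new_centers - centers, axis=1)) < eps
        centers = new_centers
        if done:                                                        # termination condition
            break
    return centers, R

def tsc(samples, centers, R):
    """TSC value (formula (7)): within-cluster compactness over minimum center separation."""
    d2 = ((samples[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    compactness = (R ** 2 * d2).sum() / len(samples)
    sep = ((centers[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(sep, np.inf)
    return compactness / sep.min()

def build_dictionary(samples, k_init=500, k_step=50, tsc_range=(0.0, 1.0)):
    """Adjust K until the TSC value falls inside an (assumed) acceptance range."""
    K = k_init
    while True:
        centers, R = fuzzy_kmeans(samples, K)
        score = tsc(samples, centers, R)
        if tsc_range[0] <= score <= tsc_range[1] or K <= k_step:
            return centers                       # cluster centers = visual words
        K -= k_step                              # assumed rule for changing K
```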
The construction process of the visual dictionary is shown in FIG. 2.
Step 2, calculating the content similarity between the current scene image and the previous frame of historical image; if the maximum value of the content similarity is smaller than a preset similarity threshold, the current scene image is saved and step 3 is executed; otherwise the current scene image is deleted and the process returns to step 1 to acquire a new image. The content similarity S between the current scene image I_t and the previous frame of historical image I_c is calculated according to the following formula:
S = (V_{I_t}, V_{I_c}) / (|V_{I_t}| |V_{I_c}|)
where V_{I_t} denotes the visual bag-of-words feature vector of the current scene image I_t and V_{I_c} denotes the visual bag-of-words feature vector of the previous frame of historical image I_c.
Because the mobile robot continuously captures scene images during operation, adjacent images are highly similar, so the content similarity between a newly acquired image and the image retained at the previous moment must be judged; only an image whose similarity is smaller than a certain threshold represents a new location scene image and is processed further. The similarity between images is measured by the inner product of the image weight vectors: the larger the value of S, the more similar the images. If S is below the fixed threshold, the current image is considered to represent a new scene location; if S is above the threshold, the two frames are too similar and the current image is rejected directly rather than used for closed-loop detection. In this way a large number of similar, redundant images are removed and only a small number of images that fully reflect the environmental characteristics are retained, which reduces algorithm complexity and improves the real-time performance of detection. The setting of the similarity threshold depends on the image quality and the image acquisition rate: the smaller the value, the more scene images are rejected and the higher the precision, but if the threshold is too small the robot may fail to detect a closed loop when it returns to the starting point of a loop; in practice the threshold is set on the principle of keeping fewer images rather than keeping them indiscriminately.
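A minimal sketch of this preprocessing filter is given below; the similarity threshold value and the gating function are illustrative assumptions.

```python
import numpy as np

def content_similarity(v_t, v_c):
    """Cosine similarity S of two weighted bag-of-words vectors (formula above)."""
    denom = np.linalg.norm(v_t) * np.linalg.norm(v_c)
    return float(np.dot(v_t, v_c) / denom) if denom > 0 else 0.0

def keep_for_loop_detection(v_current, v_previous_kept, similarity_threshold=0.8):
    """Keep the new image only if it differs enough from the last retained image."""
    return content_similarity(v_current, v_previous_kept) < similarity_threshold
```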
Step 3, continuously updating the posterior probability of the closed-loop hypothesis state with a Bayesian filtering method to perform closed-loop detection, and judging whether the current scene image forms a closed loop. The method itself is prior art (see the work of M. Labbé and F. Michaud on real-time appearance-based loop closure detection). A probabilistic approach is adopted: closed-loop detection is treated as a recursive Bayesian estimation problem, and the loop is detected by estimating the posterior probability distribution of the current closed-loop hypothesis state. If this probability is greater than a given threshold, a closed loop is considered detected; otherwise the current image is added to the map as a new scene image and detection continues. The basic content is as follows:
let XtFor a random variable representing the assumed state of the closed loop at time t, XtI denotes the picture ItAnd image IiMatch, end loop, at which time ItAnd IiRepresenting the same scene location; xt0 represents ItIs a new scene image, i.e. no closed loop occurs at time t. The filtering process is carried out by calculating each time i is 0, …, t to form a closed loopProbability of occurrence to estimate the posterior probability distribution p (X) of the system integrityt/It) The whole filtering process is divided into two steps of prediction/updating and recursion:
and (3) prediction: <math> <mrow> <msup> <mi>Bel</mi> <mo>-</mo> </msup> <mrow> <mo>(</mo> <msub> <mi>X</mi> <mi>t</mi> </msub> <mo>)</mo> </mrow> <mo>=</mo> <munderover> <mi>&Sigma;</mi> <mrow> <mi>i</mi> <mo>=</mo> <mn>0</mn> </mrow> <mrow> <mi>t</mi> <mo>-</mo> <mn>1</mn> </mrow> </munderover> <mi>p</mi> <mrow> <mo>(</mo> <msub> <mi>X</mi> <mi>t</mi> </msub> <mo>|</mo> <msub> <mi>X</mi> <mrow> <mi>t</mi> <mo>-</mo> <mn>1</mn> </mrow> </msub> <mo>=</mo> <mi>i</mi> <mo>)</mo> </mrow> <mi>Bel</mi> <mrow> <mo>(</mo> <msub> <mi>X</mi> <mrow> <mi>t</mi> <mo>-</mo> <mn>1</mn> </mrow> </msub> <mo>)</mo> </mrow> <mo>-</mo> <mo>-</mo> <mo>-</mo> <mrow> <mo>(</mo> <mn>9</mn> <mo>)</mo> </mrow> </mrow> </math>
updating: bel (X)t)=ηp(It|Xt)Bel-(Xt) (10)
And (3) expanding the posterior probability by using a Bayes formula to obtain the posterior probability density at the moment t:
Figure BDA00002030744300092
where eta is a normalization factor, It=IO,…,ItRepresenting the sequence of images acquired at time t.
The observation model p(I_t | X_t) is evaluated with the likelihood function L(X_t | I_t). When a closed loop occurs, the likelihood function L(X_t | I_t) is calculated as:
L(X_t = i | I_t) = Σ_{i=0}^{t} p(X_t^{(i)} | I_t) = Σ_{i=0}^{t} (V_{I_t}, V_{I_c}) / (|V_{I_t}| |V_{I_c}|)   (12)
If no closed loop occurs at the current moment, the likelihood function L(X_t | I_t) is calculated by the following formula:
L(X_t = 0 | I_t) = μ / σ + 1   (13)
where μ is the mean and σ the standard deviation of the similarities obtained by comparing the current image with each frame of historical image. Therefore, when no closed loop occurs, the likelihood value of the current image being a new scene image is also large, and the probability is updated accordingly.
The motion model p(X_t | X_{t-1}) takes empirical values:
(1) p(X_t = 0 | X_{t-1} = 0) = 0.9 denotes the probability that no closed loop occurred at time t-1 and no closed loop occurs at time t;
(2) p(X_t = i | X_{t-1} = 0) = 0.1/N (i = 0, ..., t) denotes the probability that no closed loop occurred at time t-1 and a closed loop occurs at time t, where N denotes the total number of observed images;
(3) p(X_t = 0 | X_{t-1} = j) = 0.1 (j = 0, ..., t) denotes the probability that the current image had high similarity to image j and a closed loop occurred at time t-1, but no closed loop occurs at time t;
(4) p(X_t = i | X_{t-1} = j) (i, j = 0, ..., t) denotes the probability that closed loops occur at both time t-1 and time t; this probability is defined as a discrete Gaussian curve centered at j, whose non-zero values are computed over the 8-neighborhood (i = j-4, ..., j+4) such that the Gaussian coefficients of these 9 values sum to 0.9.
The posterior probability p(X_t | I^t) is continuously updated and normalized during closed-loop detection; when the probability p(X_t | I^t) is higher than a preset closed-loop threshold T_loop, a closed loop is considered to have occurred at the current time; otherwise no closed loop has occurred and detection continues.
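The prediction/update recursion of formulas (9) and (10) together with the empirical motion model can be sketched as follows; the array layout of the hypothesis states and the handling of boundary indices are assumptions of this sketch.

```python
import numpy as np

def bayes_filter_step(belief, likelihood, p_stay_new=0.9):
    """One prediction/update step of the loop-closure filter.

    belief:      array over states {0: new place, 1..t-1: loop with image i} at time t-1
    likelihood:  array over states 0..t at time t, L(X_t = i | I_t) from formulas (12)/(13)
    Returns the normalized posterior Bel(X_t).
    """
    t_now = len(likelihood)
    predicted = np.zeros(t_now)

    # Transitions from "no loop at t-1" (state 0): 0.9 stays, 0.1 spread over loop states
    # (the text uses 0.1/N with N observed images).
    predicted[0] += p_stay_new * belief[0]
    predicted[1:] += (1.0 - p_stay_new) / max(t_now - 1, 1) * belief[0]

    # Transitions from "loop with image j at t-1": 0.1 to "no loop", 0.9 spread as a
    # discrete Gaussian over the 8-neighborhood of j (Gaussian width assumed here).
    for j in range(1, len(belief)):
        predicted[0] += 0.1 * belief[j]
        window = np.arange(j - 4, j + 5)
        weights = np.exp(-0.5 * ((window - j) / 2.0) ** 2)
        weights = 0.9 * weights / weights.sum()
        for w, i in zip(weights, window):
            if 1 <= i < t_now:
                predicted[i] += w * belief[j]

    posterior = likelihood * predicted           # update step, formula (10)
    return posterior / (posterior.sum() + 1e-12)
```

A loop closure with image i is then accepted when the posterior entry for state i exceeds the threshold T_loop.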
And 4, verifying the detection result of the step 3 according to the following method: when the current image and a historical image are detected to form a closed loop in the step 3, counting the frequency of the visual words in the visual word bag characteristics of the current image in the visual word bag characteristics of each historical image; selecting the previous P historical images with the highest frequency, wherein P is a natural number (for example, 10); if the historical image detected in the step 3 is any one of the P historical images, the closed loop detection is considered to be correct, and the closed loop is accepted; otherwise, the closed loop detection is wrong, and the closed loop is rejected.
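The reverse retrieval verification of step 4 can be sketched as follows; the representation of each image as a visual-word frequency vector and the value P = 10 are taken from the description, while the scoring function is an illustrative assumption.

```python
import numpy as np

def verify_loop_closure(current_word_freq, history_word_freqs, detected_index, top_p=10):
    """Accept the loop closure only if the detected image is among the P history images
    in which the current image's visual words occur most frequently.

    current_word_freq:  length-K occurrence counts of visual words in the current image
    history_word_freqs: list of length-K occurrence-count arrays, one per history image
    """
    present = current_word_freq > 0
    scores = [freq[present].sum() for freq in history_word_freqs]
    top = np.argsort(scores)[::-1][:top_p]
    return detected_index in top
```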

Claims (5)

1. A closed loop detection method based on image appearance in monocular vision SLAM is characterized by comprising the following steps:
step 1, acquiring a current scene image by using a monocular camera carried by a mobile robot in the advancing process of the mobile robot, and extracting visual bag-of-word characteristics of the current scene image;
step 2, calculating the content similarity between the current scene image and the previous frame of historical image; if the maximum value of the content similarity is smaller than a preset similarity threshold, saving the current scene image and executing step 3; otherwise, deleting the current scene image and returning to step 1 to acquire a new image, wherein the content similarity S between the current scene image I_t and the previous frame of historical image I_c is calculated according to the following formula:
S = (V_{I_t}, V_{I_c}) / (|V_{I_t}| |V_{I_c}|)
where V_{I_t} denotes the visual bag-of-words feature vector of the current scene image I_t and V_{I_c} denotes the visual bag-of-words feature vector of the previous frame of historical image I_c;
and 3, continuously updating the posterior probability of the closed-loop assumed state by using a Bayesian filtering method to perform closed-loop detection, and judging whether the current scene image is closed-loop or not.
2. The closed-loop detection method based on image appearance in monocular vision SLAM as claimed in claim 1, further comprising:
step 4, verifying the detection result of step 3 according to the following method: when step 3 detects that the current image and a historical image form a closed loop, counting how frequently the visual words in the visual bag-of-words feature of the current image appear in the visual bag-of-words feature of each historical image; selecting the P historical images with the highest frequency, P being a natural number; if the historical image detected in step 3 is any one of these P historical images, the closed-loop detection is considered correct and the closed loop is accepted; otherwise the closed-loop detection is wrong and the closed loop is rejected.
3. The closed-loop detection method based on image appearance in monocular vision SLAM as set forth in claim 1 or 2, characterized in that the visual bag-of-words feature of the image is extracted according to the following method:
step 101, extracting local visual features of an image I to obtain the local visual feature vector set of the image I;
step 102, representing the image I as a K-dimensional vector V_I, namely the visual bag-of-words feature vector of the image I:
V_I = [t_1, ..., t_j, ..., t_K]^T,   j = 1, 2, ..., K
where t_j = (n_{jI} / n_I) · log(N / N_j), K is the number of visual words in the visual dictionary, n_{jI} is the frequency of occurrence of the jth visual word in the local visual feature vector set of image I, n_I is the number of visual words appearing in the local visual feature vector set of image I, N is the number of all currently saved history images, and N_j is the number of currently saved history images whose visual bag-of-words feature contains the jth visual word.
4. The closed-loop image appearance-based detection method in monocular vision SLAM as claimed in claim 3, wherein the visual dictionary is constructed offline by using the following method:
step 1, collecting a group of environment scene images in advance and extracting local visual features of the images respectively, wherein all local visual feature vectors form a training sample set;
step 2, clustering the training sample set, and constructing a visual dictionary by taking the obtained clustering centers as visual words, wherein each clustering center is a visual word; the method specifically comprises the following steps:
step 201, setting an initial number of cluster categories K;
step 202, performing fuzzy K-means clustering on the training sample set according to the current number of cluster categories K; in each iteration step, assigning each sample to a cluster center according to the maximum-membership criterion, where the membership R_{lj} of the lth sample to the jth cluster center is given by:
R_{lj} = (1 / ||D_l − V_j||^2) / Σ_{m=1}^{K} (1 / ||D_l − V_m||^2),   1 ≤ l ≤ M, 1 ≤ j ≤ K
where D_l denotes the lth sample in the training sample set, V_j denotes the jth cluster center, M is the number of samples in the training sample set, and K is the number of cluster categories;
and updating the cluster centers according to the following formula:
V_j(t+1) = V_j(t) + [Σ_{l=1}^{M} R_{lj}(t) · (D_l − V_j(t))] / [Σ_{l=1}^{M} R_{lj}(t)]
where V_j(t) and V_j(t+1) respectively denote the jth cluster center in the tth and (t+1)th iteration steps, and R_{lj}(t) denotes the membership of the lth sample to the jth cluster center in the tth iteration step;
step 203, judging whether the TSC value of the current clustering result is within a preset range; if so, going to step 204; if not, changing the current number of cluster categories K and going to step 202; the TSC value is calculated according to the following formula:
TSC(V, K) = [ (1/M) Σ_{j=1}^{K} Σ_{l=1}^{M} R_{lj}^2 ||D_l − V_j||^2 ] / min_{j1,j2} ||V_{j1} − V_{j2}||^2
where min_{j1,j2} ||V_{j1} − V_{j2}||^2 is the squared distance between the two nearest cluster centers V_{j1} and V_{j2} among the K cluster centers, D_l denotes the lth sample in the training sample set, V_j denotes the jth cluster center, M is the number of samples in the training sample set, K is the number of cluster categories, and R_{lj} is the membership of the lth sample to the jth cluster center;
step 204, constructing a visual dictionary with the K cluster centers as visual words, each cluster center being one visual word.
5. The closed-loop detection method based on image appearance in monocular vision SLAM as claimed in claim 3, wherein the SURF or SIFT operator is adopted to extract the local visual features of the image.
CN2012102951816A 2012-08-20 2012-08-20 Image appearance based loop closure detecting method in monocular vision SLAM (simultaneous localization and mapping) Pending CN102831446A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2012102951816A CN102831446A (en) 2012-08-20 2012-08-20 Image appearance based loop closure detecting method in monocular vision SLAM (simultaneous localization and mapping)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2012102951816A CN102831446A (en) 2012-08-20 2012-08-20 Image appearance based loop closure detecting method in monocular vision SLAM (simultaneous localization and mapping)

Publications (1)

Publication Number Publication Date
CN102831446A true CN102831446A (en) 2012-12-19

Family

ID=47334572

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2012102951816A Pending CN102831446A (en) 2012-08-20 2012-08-20 Image appearance based loop closure detecting method in monocular vision SLAM (simultaneous localization and mapping)

Country Status (1)

Country Link
CN (1) CN102831446A (en)

Cited By (40)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103901885A (en) * 2012-12-28 2014-07-02 联想(北京)有限公司 Information processing method and information processing device
CN104062977A (en) * 2014-06-17 2014-09-24 天津大学 Full-autonomous flight control method for quadrotor unmanned aerial vehicle based on vision SLAM
CN104200483A (en) * 2014-06-16 2014-12-10 南京邮电大学 Human body central line based target detection method under multi-camera environment
CN104374395A (en) * 2014-03-31 2015-02-25 南京邮电大学 Graph-based vision SLAM (simultaneous localization and mapping) method
CN104464733A (en) * 2014-10-28 2015-03-25 百度在线网络技术(北京)有限公司 Multi-scene managing method and device of voice conversation
CN104964683A (en) * 2015-06-04 2015-10-07 上海物景智能科技有限公司 Closed loop correction method for indoor environment map creation
CN105203092A (en) * 2014-06-30 2015-12-30 联想(北京)有限公司 Information processing method and device and electronic equipment
CN105527968A (en) * 2014-09-29 2016-04-27 联想(北京)有限公司 Information processing method and information processing device
CN105865462A (en) * 2015-01-19 2016-08-17 北京雷动云合智能技术有限公司 Three dimensional SLAM method based on events with depth enhanced vision sensor
CN106339001A (en) * 2015-07-09 2017-01-18 松下电器(美国)知识产权公司 Map Production Method, Mobile Robot, And Map Production System
CN106575280A (en) * 2014-07-22 2017-04-19 香港科技大学 System and methods for analysis of user-associated images to generate non-user generated labels and utilization of the generated labels
CN106778767A (en) * 2016-11-15 2017-05-31 电子科技大学 Visual pattern feature extraction and matching process based on ORB and active vision
CN106840148A (en) * 2017-01-24 2017-06-13 东南大学 Wearable positioning and path guide method based on binocular camera under outdoor work environment
JP2017162457A (en) * 2016-03-11 2017-09-14 株式会社東芝 Image analysis system and method
CN107529650A (en) * 2017-08-16 2018-01-02 广州视源电子科技股份有限公司 Network model construction and closed loop detection method, corresponding device and computer equipment
CN108182271A (en) * 2018-01-18 2018-06-19 维沃移动通信有限公司 A kind of photographic method, terminal and computer readable storage medium
CN108229416A (en) * 2018-01-17 2018-06-29 苏州科技大学 Robot SLAM methods based on semantic segmentation technology
CN108256563A (en) * 2018-01-09 2018-07-06 深圳市沃特沃德股份有限公司 Visual dictionary closed loop detection method and device based on distance metric
CN108287550A (en) * 2018-02-01 2018-07-17 速感科技(北京)有限公司 The method of SLAM systems and construction data correlation based on data correlation and error detection
CN108647307A (en) * 2018-05-09 2018-10-12 京东方科技集团股份有限公司 Image processing method, device, electronic equipment and storage medium
CN109242899A (en) * 2018-09-03 2019-01-18 北京维盛泰科科技有限公司 A kind of real-time positioning and map constructing method based on online visual dictionary
CN109272021A (en) * 2018-08-22 2019-01-25 广东工业大学 A kind of intelligent mobile robot air navigation aid based on width study
CN109584302A (en) * 2018-11-27 2019-04-05 北京旷视科技有限公司 Camera pose optimization method, device, electronic equipment and computer-readable medium
CN109902619A (en) * 2019-02-26 2019-06-18 上海大学 Image closed loop detection method and system
WO2019136612A1 (en) * 2018-01-09 2019-07-18 深圳市沃特沃德股份有限公司 Distance measurement-based visual dictionary closed-loop detection method and device
CN110126846A (en) * 2019-05-24 2019-08-16 北京百度网讯科技有限公司 Representation method, device, system and the storage medium of Driving Scene
CN110165657A (en) * 2018-08-30 2019-08-23 中国南方电网有限责任公司 Consider substation's load characteristics clustering analysis method of user's industry attribute
CN110390356A (en) * 2019-07-03 2019-10-29 Oppo广东移动通信有限公司 Visual dictionary generation method and device, storage medium
CN110443263A (en) * 2018-05-02 2019-11-12 北京京东尚科信息技术有限公司 Closed loop detection method and device
WO2019233299A1 (en) * 2018-06-05 2019-12-12 杭州海康机器人技术有限公司 Mapping method and apparatus, and computer readable storage medium
CN110570465A (en) * 2018-06-05 2019-12-13 杭州海康机器人技术有限公司 real-time positioning and map construction method and device and computer readable storage medium
CN110633336A (en) * 2018-06-05 2019-12-31 杭州海康机器人技术有限公司 Method and device for determining laser data search range and storage medium
CN110781841A (en) * 2019-10-29 2020-02-11 北京影谱科技股份有限公司 Closed loop detection method and device based on SLAM space invariant information
CN110852327A (en) * 2019-11-07 2020-02-28 首都师范大学 Image processing method, image processing device, electronic equipment and storage medium
CN111812978A (en) * 2020-06-12 2020-10-23 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Cooperative SLAM method and system for multiple unmanned aerial vehicles
CN111882663A (en) * 2020-07-03 2020-11-03 广州万维创新科技有限公司 Visual SLAM closed-loop detection method achieved by fusing semantic information
CN112651988A (en) * 2021-01-13 2021-04-13 重庆大学 Finger-shaped image segmentation, finger-shaped plate dislocation and fastener abnormality detection method based on double-pointer positioning
CN113191435A (en) * 2021-05-07 2021-07-30 南京邮电大学 Image closed-loop detection method based on improved visual dictionary tree
CN115410140A (en) * 2022-11-02 2022-11-29 中国船舶集团有限公司第七〇七研究所 Image detection method, device, equipment and medium based on marine target
US11625870B2 (en) 2017-07-31 2023-04-11 Oxford University Innovation Limited Method of constructing a model of the motion of a mobile device and related systems

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Li Bo (李博): "Research on Visual Loop Closure Detection for Mobile Robots Based on Scene Appearance Modeling", China Doctoral Dissertations Full-text Database *
Liang Zhiwei (梁志伟) et al.: "Harmonious Navigation of Service Robots Integrating Human Motion Pattern Analysis", Journal of Southeast University (Natural Science Edition) *

Cited By (60)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103901885A (en) * 2012-12-28 2014-07-02 联想(北京)有限公司 Information processing method and information processing device
CN104374395A (en) * 2014-03-31 2015-02-25 南京邮电大学 Graph-based vision SLAM (simultaneous localization and mapping) method
CN104200483A (en) * 2014-06-16 2014-12-10 南京邮电大学 Human body central line based target detection method under multi-camera environment
CN104200483B (en) * 2014-06-16 2018-05-18 南京邮电大学 Object detection method based on human body center line in multi-cam environment
CN104062977A (en) * 2014-06-17 2014-09-24 天津大学 Full-autonomous flight control method for quadrotor unmanned aerial vehicle based on vision SLAM
CN104062977B (en) * 2014-06-17 2017-04-19 天津大学 Full-autonomous flight control method for quadrotor unmanned aerial vehicle based on vision SLAM
CN105203092B (en) * 2014-06-30 2018-12-14 联想(北京)有限公司 A kind of information processing method, device and electronic equipment
CN105203092A (en) * 2014-06-30 2015-12-30 联想(北京)有限公司 Information processing method and device and electronic equipment
CN106575280A (en) * 2014-07-22 2017-04-19 香港科技大学 System and methods for analysis of user-associated images to generate non-user generated labels and utilization of the generated labels
CN105527968A (en) * 2014-09-29 2016-04-27 联想(北京)有限公司 Information processing method and information processing device
CN104464733A (en) * 2014-10-28 2015-03-25 百度在线网络技术(北京)有限公司 Multi-scene managing method and device of voice conversation
CN104464733B (en) * 2014-10-28 2019-09-20 百度在线网络技术(北京)有限公司 A kind of more scene management method and devices of voice dialogue
CN105865462A (en) * 2015-01-19 2016-08-17 北京雷动云合智能技术有限公司 Three dimensional SLAM method based on events with depth enhanced vision sensor
CN105865462B (en) * 2015-01-19 2019-08-06 北京雷动云合智能技术有限公司 The three-dimensional S LAM method based on event with depth enhancing visual sensor
CN104964683B (en) * 2015-06-04 2018-06-01 上海物景智能科技有限公司 A kind of closed-loop corrected method of indoor environment map building
CN104964683A (en) * 2015-06-04 2015-10-07 上海物景智能科技有限公司 Closed loop correction method for indoor environment map creation
CN106339001A (en) * 2015-07-09 2017-01-18 松下电器(美国)知识产权公司 Map Production Method, Mobile Robot, And Map Production System
CN106339001B (en) * 2015-07-09 2021-01-08 松下电器(美国)知识产权公司 Map generation method, mobile robot, and map generation system
JP2017162457A (en) * 2016-03-11 2017-09-14 株式会社東芝 Image analysis system and method
CN106778767A (en) * 2016-11-15 2017-05-31 电子科技大学 Visual pattern feature extraction and matching process based on ORB and active vision
CN106778767B (en) * 2016-11-15 2020-08-11 电子科技大学 Visual image feature extraction and matching method based on ORB and active vision
CN106840148B (en) * 2017-01-24 2020-07-17 东南大学 Wearable positioning and path guiding method based on binocular camera under outdoor working environment
CN106840148A (en) * 2017-01-24 2017-06-13 东南大学 Wearable positioning and path guide method based on binocular camera under outdoor work environment
US11625870B2 (en) 2017-07-31 2023-04-11 Oxford University Innovation Limited Method of constructing a model of the motion of a mobile device and related systems
CN107529650A (en) * 2017-08-16 2018-01-02 广州视源电子科技股份有限公司 Network model construction and closed loop detection method, corresponding device and computer equipment
CN108256563A (en) * 2018-01-09 2018-07-06 深圳市沃特沃德股份有限公司 Visual dictionary closed loop detection method and device based on distance metric
WO2019136612A1 (en) * 2018-01-09 2019-07-18 深圳市沃特沃德股份有限公司 Distance measurement-based visual dictionary closed-loop detection method and device
CN108256563B (en) * 2018-01-09 2020-05-26 深圳市无限动力发展有限公司 Visual dictionary closed-loop detection method and device based on distance measurement
CN108229416A (en) * 2018-01-17 2018-06-29 苏州科技大学 Robot SLAM methods based on semantic segmentation technology
CN108229416B (en) * 2018-01-17 2021-09-10 苏州科技大学 Robot SLAM method based on semantic segmentation technology
CN108182271A (en) * 2018-01-18 2018-06-19 维沃移动通信有限公司 Photographing method, terminal and computer readable storage medium
CN108182271B (en) * 2018-01-18 2020-11-17 维沃移动通信有限公司 Photographing method, terminal and computer readable storage medium
CN108287550B (en) * 2018-02-01 2020-09-11 速感科技(北京)有限公司 SLAM system based on data association and error detection and method for constructing data association
CN108287550A (en) * 2018-02-01 2018-07-17 速感科技(北京)有限公司 SLAM system based on data correlation and error detection, and method for constructing data correlation
CN110443263A (en) * 2018-05-02 2019-11-12 北京京东尚科信息技术有限公司 Closed loop detection method and device
CN108647307A (en) * 2018-05-09 2018-10-12 京东方科技集团股份有限公司 Image processing method, device, electronic equipment and storage medium
WO2019233299A1 (en) * 2018-06-05 2019-12-12 杭州海康机器人技术有限公司 Mapping method and apparatus, and computer readable storage medium
CN110570465A (en) * 2018-06-05 2019-12-13 杭州海康机器人技术有限公司 real-time positioning and map construction method and device and computer readable storage medium
CN110633336A (en) * 2018-06-05 2019-12-31 杭州海康机器人技术有限公司 Method and device for determining laser data search range and storage medium
CN110633336B (en) * 2018-06-05 2022-08-05 杭州海康机器人技术有限公司 Method and device for determining laser data search range and storage medium
CN110570465B (en) * 2018-06-05 2022-05-20 杭州海康机器人技术有限公司 Real-time positioning and map construction method and device and computer readable storage medium
CN109272021B (en) * 2018-08-22 2022-03-04 广东工业大学 Intelligent mobile robot navigation method based on width learning
CN109272021A (en) * 2018-08-22 2019-01-25 广东工业大学 Intelligent mobile robot navigation method based on width learning
CN110165657A (en) * 2018-08-30 2019-08-23 中国南方电网有限责任公司 Clustering analysis method for substation load characteristics considering user industry attributes
CN109242899A (en) * 2018-09-03 2019-01-18 北京维盛泰科科技有限公司 Real-time positioning and map building method based on online visual dictionary
CN109242899B (en) * 2018-09-03 2022-04-19 北京维盛泰科科技有限公司 Real-time positioning and map building method based on online visual dictionary
CN109584302B (en) * 2018-11-27 2023-12-01 北京旷视科技有限公司 Camera pose optimization method, camera pose optimization device, electronic equipment and computer readable medium
CN109584302A (en) * 2018-11-27 2019-04-05 北京旷视科技有限公司 Camera pose optimization method, device, electronic equipment and computer-readable medium
CN109902619A (en) * 2019-02-26 2019-06-18 上海大学 Image closed loop detection method and system
CN110126846A (en) * 2019-05-24 2019-08-16 北京百度网讯科技有限公司 Representation method, device, system and storage medium for driving scenes
CN110390356A (en) * 2019-07-03 2019-10-29 Oppo广东移动通信有限公司 Visual dictionary generation method and device, storage medium
CN110390356B (en) * 2019-07-03 2022-03-08 Oppo广东移动通信有限公司 Visual dictionary generation method and device and storage medium
CN110781841A (en) * 2019-10-29 2020-02-11 北京影谱科技股份有限公司 Closed loop detection method and device based on SLAM space invariant information
CN110852327A (en) * 2019-11-07 2020-02-28 首都师范大学 Image processing method, image processing device, electronic equipment and storage medium
CN111812978A (en) * 2020-06-12 2020-10-23 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Cooperative SLAM method and system for multiple unmanned aerial vehicles
CN111882663A (en) * 2020-07-03 2020-11-03 广州万维创新科技有限公司 Visual SLAM closed-loop detection method achieved by fusing semantic information
CN112651988A (en) * 2021-01-13 2021-04-13 重庆大学 Finger-shaped image segmentation, finger-shaped plate dislocation and fastener abnormality detection method based on double-pointer positioning
CN113191435B (en) * 2021-05-07 2022-08-23 南京邮电大学 Image closed-loop detection method based on improved visual dictionary tree
CN113191435A (en) * 2021-05-07 2021-07-30 南京邮电大学 Image closed-loop detection method based on improved visual dictionary tree
CN115410140A (en) * 2022-11-02 2022-11-29 中国船舶集团有限公司第七〇七研究所 Image detection method, device, equipment and medium based on marine target

Similar Documents

Publication Title
CN102831446A (en) Image appearance based loop closure detecting method in monocular vision SLAM (simultaneous localization and mapping)
CN111553193B (en) Visual SLAM closed-loop detection method based on lightweight deep neural network
CN113378632B (en) Pseudo-label optimization-based unsupervised domain adaptive pedestrian re-identification method
CN107515895B (en) Visual target retrieval method and system based on target detection
Lynen et al. Placeless place-recognition
CN107633226B (en) Human body motion tracking feature processing method
CN109784223B (en) Multi-temporal remote sensing image matching method and system based on convolutional neural network
Liu et al. Visual loop closure detection with a compact image descriptor
CN110209859B (en) Method and device for recognizing places and training models of places and electronic equipment
CN109919241B (en) Hyperspectral unknown class target detection method based on probability model and deep learning
CN110097060B (en) Open set identification method for trunk image
Xia et al. Loop closure detection for visual SLAM using PCANet features
CN110084149B (en) Face verification method based on hard sample quadruple dynamic boundary loss function
CN111401144A (en) Escalator passenger behavior identification method based on video monitoring
CN107169117B (en) Hand-drawn human motion retrieval method based on automatic encoder and DTW
CN107066951B (en) Face spontaneous expression recognition method and system
CN113592894B (en) Image segmentation method based on boundary box and co-occurrence feature prediction
CN110543906B (en) Automatic skin recognition method based on Mask R-CNN model
CN110880010A (en) Visual SLAM closed loop detection algorithm based on convolutional neural network
CN110716792B (en) Target detector and construction method and application thereof
CN112329662B (en) Multi-view saliency estimation method based on unsupervised learning
CN109344720B (en) Emotional state detection method based on self-adaptive feature selection
Raparthi et al. Machine Learning Based Deep Cloud Model to Enhance Robustness and Noise Interference
CN111985332A (en) Gait recognition method for improving loss function based on deep learning
CN115393631A (en) Hyperspectral image classification method based on Bayesian layer graph convolution neural network

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication (application publication date: 20121219)