CN112396593B - Closed loop detection method based on key frame selection and local features - Google Patents
- Publication number: CN112396593B (application CN202011360902.8A)
- Authority: CN (China)
- Legal status: Active
Classifications
- G06T7/0002: Inspection of images, e.g. flaw detection
- G06F16/532: Query formulation, e.g. graphical querying
- G06F16/583: Retrieval using metadata automatically derived from the content
- G06F18/214: Generating training patterns; bootstrap methods, e.g. bagging or boosting
- G06F18/241: Classification techniques relating to the classification model
- G06N3/045: Combinations of networks
- G06T7/269: Analysis of motion using gradient-based methods
- G06V10/40: Extraction of image or video features
- G06V20/10: Terrestrial scenes
- G06T2207/10016: Video; image sequence
- G06T2207/20081: Training; learning
- G06T2207/20084: Artificial neural networks [ANN]
- G06T2207/20112: Image segmentation details
- G06T2207/20164: Salient point detection; corner detection
Abstract
The invention relates to a closed-loop detection method based on key frame selection and local features. Key frames are selected through KLT sparse optical flow tracking, so the motion speed of the mobile robot does not need to be considered, images taken at corners are handled well, and the selected key frames are more representative. Selecting key frames also reduces the amount of computation in the matching process and improves the detection speed of the whole method.
Description
Technical Field
The invention relates to the field of vision-based positioning and navigation in autonomous inspection by unmanned aerial vehicles, and in particular to a closed-loop detection method based on key frame selection and local features.
Background
In the intelligent inspection process of an unmanned aerial vehicle, the vehicle needs to autonomously determine the operations to perform according to environmental information. Autonomous positioning and environmental map sensing and construction are therefore key links in autonomous inspection. In recent years, the development of visual SLAM (simultaneous localization and mapping) technology has improved the ability of mobile robots to localize themselves and build maps autonomously. Closed-loop detection is an important component of a visual SLAM system: it detects whether the mobile robot has returned to a previously visited place, and it plays an extremely important role in reducing the positioning error of the mobile robot and in constructing a globally consistent environment map. Closed-loop detection matches the current frame against key frames and judges whether a loop is closed according to the degree of matching, so correct selection of key frames is crucial to closed-loop detection.
The Chinese patent application with publication number CN109902619A, published on June 18, 2019, discloses an image closed-loop detection method and system comprising the following steps: extracting FAST corner points for each frame image and computing BRIEF descriptors; substituting the BRIEF descriptors into a pre-established bag-of-words model to obtain the visual words corresponding to the descriptors; using the visual words to build a vector description of the image; judging, based on a tracking prediction algorithm, whether the current image is likely to produce a closed loop and predicting the position where the closed loop is likely to occur, yielding a closed-loop candidate set; evaluating the similarity between the current image and each image in the closed-loop candidate set through the visual word vectors, and taking the most similar image in the candidate set as the candidate image; normalizing the candidate image; and computing an ORB global descriptor of the normalized image to complete the structural check of the candidate image. That invention can effectively accelerate the detection algorithm and provide more accurate closed-loop detection performance.
That method belongs to the class of closed-loop detection methods based on the visual bag-of-words model: local feature points and descriptors are extracted from the input image, a BoW vector representation of the input image is obtained by means of a visual dictionary, and a tracking prediction algorithm judges whether a loop is closed. Closed-loop detection based on the visual bag-of-words model is robust to changes in viewing angle, but it has difficulty handling changes in appearance. The method also lacks key frame selection, using similarity alone to pick candidate images, so the amount of computation is large and the final detection speed suffers.
Disclosure of Invention
The invention aims to solve the problem of slow detection speed in the prior art, and provides a closed-loop detection method based on key frame selection and local features.
In order to solve the technical problems, the technical scheme adopted by the invention is as follows: a closed loop detection method based on key frame selection and local features comprises the following steps:
the method comprises the following steps:
Step one: acquire an input image from the mobile robot; determine the first frame of the input image sequence to be a key frame; extract the Shi-Tomasi corner points of the key frame preceding the current input image; iteratively track the corner points in the current input image using a sparse optical flow tracking algorithm; and if the number of corner points that cannot be tracked is greater than a threshold, determine that the current input image is a new key frame;
Step two: extract global features from the current input image using a convolutional neural network trained on an image classification dataset, and if the current input image is a key frame, insert the extracted global features into the hierarchical navigable small world (HNSW) graph of an approximate nearest neighbor search algorithm;
Step three: within the search range of the current input image, retrieve through HNSW the key frame most similar to the current input image as the closed-loop candidate key frame of the current image, and take all images between the closed-loop candidate key frame and the next key frame as the closed-loop candidate image queue;
Step four: introduce a geometric consistency check: extract ORB feature points and the corresponding local difference binary (LDB) descriptors from the input image and from each retrieved closed-loop candidate image, and match the descriptors of the input image against those of the images in the closed-loop candidate queue;
Step five: take the closed-loop candidate image whose LDB descriptors best match those of the current input image as the best closed-loop candidate image, and input the matched feature points of the two images into a random sample consensus (RANSAC) algorithm to further eliminate mismatches and solve for the fundamental matrix; if the number of inliers between the two images is less than a threshold, the two images do not form a closed loop; if the number of inliers is greater than the threshold, the two images may form a closed loop;
Step six: introduce a temporal consistency check: if the 2 consecutive frames following the current input image all satisfy the threshold condition of step five, the input image and the closed-loop candidate image are considered to form a group of closed loops.
Preferably, in step one, the corner points are iteratively tracked in the current input image using the sparse optical flow tracking algorithm KLT, specifically:

Denote the current input image by I_i and its previous key frame by I_{k-1}. Convert I_i and I_{k-1} to grayscale to obtain G_i and G_{k-1}, and extract the Shi-Tomasi corners of G_{k-1}. Assuming that the brightness of each pixel in I_i and I_{k-1} remains constant before and after the motion, compute for each corner P(x, y) of G_{k-1} its position P(x+dx, y+dy) in image G_i and the corresponding optical flow.

The specific calculation steps are as follows. Apply a Gaussian pyramid transform to G_{k-1} and G_i to obtain L layers of images at different resolutions. In layer L_m, suppose the corner P(x, y) of G_{k-1} moves to P(x+dx, y+dy) in G_i over a time dt. Because the brightness of a pixel is the same in both images before and after the motion:

I(x, y, t) = I(x+dx, y+dy, t+dt)   (1)

where I(x, y, t) is the brightness of pixel P(x, y) at time t and I(x+dx, y+dy, t+dt) is the brightness at pixel P(x+dx, y+dy) in the shifted image G_i. Expanding I(x+dx, y+dy, t+dt) by Taylor's formula:

I(x+dx, y+dy, t+dt) = I(x, y, t) + (∂I/∂x)dx + (∂I/∂y)dy + (∂I/∂t)dt + ε   (2)

where ε is an infinitesimal higher-order term and can be neglected. Equation (1) can therefore be simplified to:

(∂I/∂x)dx + (∂I/∂y)dy + (∂I/∂t)dt = 0   (3)

Dividing both sides by dt:

(∂I/∂x)(dx/dt) + (∂I/∂y)(dy/dt) + ∂I/∂t = 0   (4)

Let u and v be the velocity components of the optical flow along the X axis and Y axis respectively, i.e.

u = dx/dt, v = dy/dt   (5)

In addition, write

I_x = ∂I/∂x, I_y = ∂I/∂y, I_t = ∂I/∂t   (6)

In this case, equation (4) can be written as:

I_x u + I_y v + I_t = 0   (7)

Assuming that the pixels around P(x, y) move the same distance as P(x, y), a window of size (5, 5) is taken around P(x, y); for the pixel points q_1, q_2, ..., q_25 in the window:

[I_x(q_1) I_y(q_1); I_x(q_2) I_y(q_2); ...; I_x(q_25) I_y(q_25)] [u; v] = -[I_t(q_1); I_t(q_2); ...; I_t(q_25)]   (8)

The least squares method is used to solve this over-determined system so that the sum of matching errors within the window is minimized. Equation (8) can be abbreviated as:

A d = b   (9)

Multiplying both sides by A^T:

(A^T A) d = A^T b   (10)

The velocity vector d = (u, v) of the optical flow along the X axis and Y axis is then obtained as:

d = (A^T A)^{-1} A^T b   (11)

By solving for u and v, the position P(x+dx, y+dy) in image G_i of the corner P(x, y) of layer L_m and the corresponding optical flow can be calculated. The optical flow obtained at layer L_m is used as the initial value for layer L_{m-1}, where a refined value is computed, and so on until the optical flow of the lowest layer L_0 (the original image) and the tracked corner P(x+dx, y+dy) are obtained.
Preferably, if the number of corner points that cannot be tracked is greater than the threshold, the current input image is determined to be a new key frame, specifically:

When the corners of the key frame image G_{k-1} are tracked in the current input image G_i by KLT sparse optical flow, tracking is considered to have failed if:

(1) the corner P(x, y) falls outside the range of image G_i;

(2) the sum of matching errors in the neighborhood of the matched corners is greater than a threshold.

If the number of corner points that fail to be tracked is greater than the set threshold, the current input image I_i is considered a new key frame.
Preferably, in step two, extracting the global features of the current input image with a convolutional neural network trained on an image classification dataset specifically comprises: preprocessing the current input image I_i, since the input of the convolutional neural network requires the image to be resized, and taking the output of the second-to-last fully connected layer of the convolutional neural network as the global feature of the image.
Preferably, in step three, if the current input image is a key frame, the specific process of inserting the extracted global features into the hierarchical navigable small world graph of the approximate nearest neighbor search algorithm is as follows: if the current input image I_i is selected as a key frame, the highest layer number l_max of the feature node of image I_i in the HNSW structure is assigned at random by an exponentially decaying probability distribution function, and the feature node is inserted into every layer from l_max down to the bottom layer l_0. In each of these layers, the M nodes nearest to the new feature node are searched for, and the new feature node is connected to those M nearest nodes.
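The exponentially decaying probability distribution that assigns a node's highest layer can be sketched as follows. In the standard HNSW construction the layer is drawn as l = floor(-ln(U) * m_L) with U uniform on (0, 1); the normalization m_L = 1/ln(M) is the choice recommended in the HNSW paper and is an assumption here, as is everything else in this sketch (it shows only layer assignment, not graph construction).

```python
import math
import random

def assign_level(m_l: float, rng: random.Random) -> int:
    """Draw the highest layer l_max for a new feature node from an
    exponentially decaying distribution: P(l >= k) = exp(-k / m_l)."""
    return int(-math.log(rng.random()) * m_l)

M = 16                     # neighbors connected per layer
m_l = 1.0 / math.log(M)    # normalization constant from the HNSW paper
rng = random.Random(42)
levels = [assign_level(m_l, rng) for _ in range(10000)]
```

With this choice most nodes live only in the bottom layer (a fraction 1 - 1/M of them, about 94% for M = 16), so the upper layers form the sparse "express lanes" that make the search logarithmic.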
Preferably, in the second step, the search range of the current input image is specifically:
U_sa = U_before - U_{fr×ct}

where U_sa denotes the search range of the input image; U_before is the set of all images before the current input image; fr is the frame rate of the camera; ct is a time constant; and U_{fr×ct} is the set of the fr×ct frames immediately preceding the current input image.
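The search-range rule simply excludes the most recent fr·ct frames, which are trivially similar to the current one, from retrieval. The sketch below applies the rule and uses an exhaustive nearest-neighbor scan as a stand-in for the HNSW index; the feature vectors and parameter values are synthetic, and the function names are illustrative.

```python
import numpy as np

def search_range(num_frames: int, fr: int, ct: int) -> range:
    """Indices eligible for loop-closure retrieval: all frames before the
    current one, minus the fr*ct most recent frames (U_sa = U_before - U_frxct)."""
    return range(0, max(0, num_frames - fr * ct))

def retrieve_candidate(features: np.ndarray, query: np.ndarray, fr: int, ct: int):
    """Brute-force stand-in for the HNSW query: index of the most similar
    key-frame feature inside the search range, or None if the range is empty."""
    idx = search_range(len(features), fr, ct)
    if len(idx) == 0:
        return None
    dists = np.linalg.norm(features[idx.start:idx.stop] - query, axis=1)
    return int(np.argmin(dists))

rng = np.random.default_rng(0)
feats = rng.normal(size=(100, 8))                      # 100 stored global features
query = feats[3] + 0.01 * rng.normal(size=8)           # nearly identical to frame 3
cand = retrieve_candidate(feats, query, fr=30, ct=2)   # exclude the last 60 frames
```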
Preferably, in step four, the specific process of extracting ORB feature points and the corresponding local difference binary (LDB) descriptors from the current input image and the retrieved closed-loop candidate image queue is as follows:

ORB feature points are extracted from the current input image and from the images in the closed-loop candidate queue. For each ORB feature point k_ij, a patch S_ij of size s×s centered on k_ij is cropped and divided into c×c grid cells of equal size. For each grid cell, the average intensity I_avg and the gradients d_x and d_y are computed. A binary test is performed on every pair of grid cells of S_ij, and the resulting binary code is the binary LDB descriptor corresponding to feature point k_ij.
Preferably, the binary test performed on any two grid cells m and n of S_ij is specifically:

τ(f(m), f(n)) = 1 if f(m) > f(n), and 0 otherwise

where f(m) and f(n) denote, respectively, the average intensity I_avg or one of the gradient values d_x, d_y of grid cells m and n.
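A minimal sketch of the LDB computation described above, assuming the patch is already cropped, a c×c grid with c = 3, and the binary test applied to the mean intensity and the mean x/y gradients of each cell pair. The cell-pair ordering and the gradient operator are illustrative choices, not specified by the patent.

```python
import numpy as np
from itertools import combinations

def ldb_descriptor(patch: np.ndarray, c: int = 3) -> np.ndarray:
    """Binary LDB descriptor of a square patch: split it into c*c cells,
    compute (mean intensity, mean dx, mean dy) per cell, then run the
    binary test tau(f(m), f(n)) on every unordered pair of cells."""
    s = patch.shape[0]
    cell = s // c
    gy, gx = np.gradient(patch.astype(float))
    feats = []
    for i in range(c):
        for j in range(c):
            sl = np.s_[i * cell:(i + 1) * cell, j * cell:(j + 1) * cell]
            feats.append((patch[sl].mean(), gx[sl].mean(), gy[sl].mean()))
    bits = []
    for m, n in combinations(range(c * c), 2):   # every unordered cell pair
        for k in range(3):                       # intensity, dx, dy
            bits.append(1 if feats[m][k] > feats[n][k] else 0)
    return np.array(bits, dtype=np.uint8)

patch = np.arange(18 * 18, dtype=float).reshape(18, 18)  # dummy 18x18 patch
desc = ldb_descriptor(patch)
```

With c = 3 there are C(9, 2) = 36 cell pairs and 3 tested values per pair, so the descriptor has 108 bits; a patch always has Hamming distance 0 to its own descriptor.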
Preferably, in step four, matching the input image with the descriptors of the images in the closed-loop candidate queue is specifically:

The input image I_i is matched with the LDB descriptors of a closed-loop candidate image I_n using the Hamming distance. For each LDB descriptor d_a of the input image I_i, the two descriptors d_b1 and d_b2 closest to d_a are searched for in the candidate image I_n. If they satisfy

D(d_a, d_b1) < ε_d · D(d_a, d_b2)

then d_a and d_b1 are considered a satisfactory pair of matched features, where D(·, ·) denotes the Hamming distance between two feature descriptors and ε_d is a distance scaling factor, usually taking a value less than 1.
Preferably, the Hamming distance used to match the LDB descriptors of the input image I_i and a closed-loop candidate image I_n is specifically:

D(d_1, d_2) = Σ_i (d_1^i ⊕ d_2^i)

where d_1 and d_2 denote two LDB descriptors and d_1^i, d_2^i denote the i-th bit of each descriptor.
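The Hamming distance (popcount of the XOR of two bit strings) and the nearest/second-nearest acceptance test described above can be sketched with plain Python integers standing in for binary descriptors; the descriptor values and ε_d = 0.7 below are illustrative.

```python
def hamming(d1: int, d2: int) -> int:
    """Hamming distance between two binary descriptors: popcount of XOR."""
    return bin(d1 ^ d2).count("1")

def match(query: int, candidates: list, eps_d: float = 0.7):
    """Accept the nearest candidate only if it is clearly closer than the
    second-nearest: D(q, best) < eps_d * D(q, second). Returns the index
    of the accepted candidate, or None for an ambiguous match."""
    ranked = sorted(range(len(candidates)),
                    key=lambda i: hamming(query, candidates[i]))
    best, second = ranked[0], ranked[1]
    if hamming(query, candidates[best]) < eps_d * hamming(query, candidates[second]):
        return best
    return None

cands = [0b10110010, 0b10110000, 0b01001101]
m = match(0b10110011, cands)   # nearest has distance 1, second-nearest 2
```

The ratio-style test rejects descriptors whose two nearest neighbors are nearly equidistant, which is exactly the ambiguous case that produces mismatches.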
Compared with the prior art, the invention has the following beneficial effects:

1. Key frames are selected through KLT sparse optical flow tracking, so the motion speed of the mobile robot does not need to be considered, images taken at corners are handled well, and the selected key frames are more representative. Selecting key frames also reduces the amount of computation in the matching process and improves the detection speed of the whole method.

2. The invention checks whether two images form a closed loop through the local difference binary descriptor LDB, which both captures the geometric topological relation between the two images and verifies whether they form a closed loop, improving the precision of closed-loop detection.

3. The invention extracts global image features with a convolutional neural network trained on an image classification dataset and uses them for nearest-neighbor image retrieval, so scenes with appearance changes are handled better.
Drawings
FIG. 1 is a flow chart of a closed loop detection method based on key frame selection and local features of the present invention;
FIG. 2 is a flowchart of key frame selection for a closed loop detection method based on key frame selection and local features according to the present invention.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the patent; for the purpose of better illustrating the embodiments, certain features of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product; it will be understood by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted. The positional relationships depicted in the drawings are for illustrative purposes only and are not to be construed as limiting the present patent.
The same or similar reference numerals in the drawings of the embodiments of the present invention correspond to the same or similar components; in the description of the present invention, it should be understood that if there are terms such as "upper", "lower", "left", "right", "long", "short", etc., indicating orientations or positional relationships based on the orientations or positional relationships shown in the drawings, it is only for convenience of description and simplicity of description, but does not indicate or imply that the device or element referred to must have a specific orientation, be constructed in a specific orientation, and be operated, and therefore, the terms describing the positional relationships in the drawings are only used for illustrative purposes and are not to be construed as limitations of the present patent, and specific meanings of the terms may be understood by those skilled in the art according to specific situations.
The technical scheme of the invention is further described in detail by the specific embodiments and the accompanying drawings:
examples
Fig. 1-2 show an embodiment of a closed loop detection method based on key frame selection and local features, which includes the following steps:
Step one: the first frame of the input image sequence is taken as a key frame. Denote the current input image by I_i and its previous key frame by I_{k-1}. Convert I_i and I_{k-1} to grayscale to obtain G_i and G_{k-1}, extract the Shi-Tomasi corners of G_{k-1}, and apply a Gaussian pyramid transform to G_{k-1} and G_i to obtain L layers of images at different resolutions.

Since the brightness of each pixel in G_{k-1} and G_i remains constant before and after the motion, solving for the velocity components u and v of the optical flow along the X and Y axes gives, for each corner P(x, y) in layer L_m, its position P(x+dx, y+dy) in image G_i and the corresponding optical flow. The optical flow computed at layer L_m is used as the initial value for layer L_{m-1}, where a refined value is computed, and so on until the optical flow of the lowest layer L_0 (the original image) and the tracked corner P(x+dx, y+dy) are obtained.
When the corners of image G_{k-1} are tracked in image G_i by KLT sparse optical flow, tracking is considered to have failed if:

(1) the corner P(x, y) falls outside the range of image G_i;

(2) the sum of matching errors in the neighborhood of the matched corners in the two images is greater than a threshold.

If the number of corner points that fail to be tracked is greater than the set threshold, the current input image I_i is considered a new key frame.
Step two: the current input image I_i is preprocessed by resizing it to 224×224 pixels. A VGG16 convolutional neural network trained on the Places365-standard dataset extracts the features of image I_i, and the output of the penultimate fully connected layer of the VGG16 network is taken as the global feature f_glo,i of image I_i. If the current input image is a key frame, the extracted global feature is inserted into the hierarchical navigable small world (HNSW) graph of the approximate nearest neighbor search algorithm.
Step three: within the search range of the current input image I_i, the key frame most similar to the current input image is retrieved through HNSW as the closed-loop candidate key frame of the current image, and all images between the closed-loop candidate key frame and the next key frame are taken as the closed-loop candidate image queue. Since adjacent images in the image sequence transmitted by the mobile robot are highly similar, the retrieval range of the current input image is all key frames within U_sa:

U_sa = U_before - U_{fr×ct}

where U_before is the set of all images before the current input image I_i, fr is the frame rate of the camera, ct is a time constant, and U_{fr×ct} is the set of the fr×ct frames immediately preceding the current input image.
Step four: a geometric consistency check is introduced, and ORB feature points are extracted from the current input image I_i and from the retrieved closed-loop candidate image queue. For each ORB feature point k_ij, a patch S_ij of size s×s centered on k_ij is cropped. S_ij is then divided into c×c grid cells of equal size, and the average intensity I_avg and the gradients d_x and d_y of each grid cell are computed. For any two grid cells m and n of S_ij, the binary test is performed as follows:

τ(f(m), f(n)) = 1 if f(m) > f(n), and 0 otherwise

where f(m) and f(n) denote, respectively, the average intensity I_avg or one of the gradient values d_x, d_y of grid cells m and n. After the binary test has been executed over all c×c grid cells of S_ij, the resulting binary code is the binary LDB descriptor corresponding to feature point k_ij.
After the LDB descriptors of the current input image I_i and of the closed-loop candidate image queue have been obtained, the LDB descriptors of the input image I_i are matched against those of each image I_q,n in the closed-loop candidate queue using the Hamming distance. For each LDB descriptor d_a of I_i, the two LDB descriptors d_b1 and d_b2 closest to d_a are searched for in I_q,n. If they satisfy

D(d_a, d_b1) < ε_d · D(d_a, d_b2)

then d_a and d_b1 are considered a good feature match, where D(·, ·) denotes the Hamming distance between two descriptors and ε_d is a distance scaling factor, usually less than 1.
The Hamming distance used to match the LDB descriptors of the input image I_i and a closed-loop candidate image I_n is specifically:

D(d_1, d_2) = Σ_i (d_1^i ⊕ d_2^i)

where d_1 and d_2 denote two LDB descriptors and d_1^i, d_2^i denote the i-th bit of each descriptor.
Step five: the closed-loop candidate image whose LDB descriptors best match those of the current input image I_i is taken as the best closed-loop candidate image, and the matched feature points of the two images are input into a random sample consensus (RANSAC) algorithm to further eliminate mismatches and solve for the fundamental matrix. If the number of inliers between the two images is less than the threshold, the two images do not form a closed loop; if the number of inliers is not less than the threshold, the two images may form a closed loop.
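Step five's verification can be sketched end-to-end in NumPy: a RANSAC loop around the normalized eight-point algorithm estimates the fundamental matrix, counts inliers by Sampson distance, and applies the inlier-count threshold. Everything below (the synthetic two-view geometry, the Sampson threshold, the iteration count, and the inlier threshold of 30) is an illustrative assumption, not taken from the patent.

```python
import numpy as np

rng = np.random.default_rng(1)

def eight_point(x1, x2):
    """Fundamental matrix from >= 8 normalized correspondences: SVD null
    vector of the epipolar design matrix, then rank-2 enforcement."""
    u1, v1, u2, v2 = x1[:, 0], x1[:, 1], x2[:, 0], x2[:, 1]
    A = np.stack([u2*u1, u2*v1, u2, v2*u1, v2*v1, v2, u1, v1,
                  np.ones_like(u1)], axis=1)
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0.0                                   # enforce rank 2
    return U @ np.diag(S) @ Vt

def sampson(F, x1, x2):
    """First-order geometric error of the epipolar constraint x2^T F x1 = 0."""
    h1 = np.hstack([x1, np.ones((len(x1), 1))])
    h2 = np.hstack([x2, np.ones((len(x2), 1))])
    Fx1, Ftx2 = h1 @ F.T, h2 @ F
    num = np.sum(h2 * Fx1, axis=1) ** 2
    den = Fx1[:, 0]**2 + Fx1[:, 1]**2 + Ftx2[:, 0]**2 + Ftx2[:, 1]**2
    return num / den

def ransac_fundamental(x1, x2, iters=200, thresh=1e-6):
    best = np.zeros(len(x1), dtype=bool)
    for _ in range(iters):
        idx = rng.choice(len(x1), 8, replace=False)
        F = eight_point(x1[idx], x2[idx])
        inl = sampson(F, x1, x2) < thresh
        if inl.sum() > best.sum():
            best = inl
    return best

# Synthetic two-view geometry: camera 2 is camera 1 rotated and translated
theta = 0.1
R = np.array([[np.cos(theta), -np.sin(theta), 0],
              [np.sin(theta),  np.cos(theta), 0],
              [0.0, 0.0, 1.0]])
t = np.array([1.0, 0.2, 0.0])
X = rng.uniform([-1, -1, 4], [1, 1, 8], size=(60, 3))   # 3-D scene points
x1 = X[:, :2] / X[:, 2:]                                # normalized view-1 points
Xc2 = X @ R.T + t
x2 = Xc2[:, :2] / Xc2[:, 2:]                            # normalized view-2 points
x2[:12] = rng.uniform(-1, 1, size=(12, 2))              # 12 injected mismatches

inliers = ransac_fundamental(x1, x2)
is_loop = inliers.sum() >= 30                           # inlier-count threshold
```

RANSAC recovers (up to scale) the fundamental matrix F = [t]_x R implied by the simulated motion, so the 48 consistent correspondences survive as inliers while the injected mismatches are rejected.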
Step six: a temporal consistency check is introduced. If the 2 consecutive frames following the current input image I_i all satisfy the threshold condition of step five, the current input image and the best closed-loop candidate image are considered to form a group of closed loops.
The beneficial effects of this embodiment: 1. Key frames are selected through KLT sparse optical flow tracking, so the motion speed of the mobile robot does not need to be considered, images taken at corners are handled well, and the selected key frames are more representative; selecting key frames also reduces the amount of computation in the matching process and improves the detection speed of the whole method. 2. Whether two images form a closed loop is checked through the local difference binary descriptor LDB, which both captures the geometric topological relation between the two images and verifies whether they form a closed loop, improving the precision of closed-loop detection. 3. Global image features are extracted with a convolutional neural network trained on an image classification dataset and used for nearest-neighbor image retrieval, so scenes with appearance changes are handled better.
It should be understood that the above-described embodiments are merely examples given to illustrate the present invention clearly and are not intended to limit its embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description; it is neither necessary nor possible to exhaustively list all embodiments here. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall be included in the protection scope of the claims of the present invention.
Claims (10)
1. A closed loop detection method based on key frame selection and local features is characterized by comprising the following steps:
Step one: acquire an input image from the mobile robot; determine the first frame of the input image sequence to be a key frame; extract the Shi-Tomasi corner points of the key frame preceding the current input image; iteratively track the corner points in the current input image using a sparse optical flow tracking algorithm; and if the number of corner points that cannot be tracked is greater than a threshold, determine that the current input image is a new key frame;
Step two: extract global features from the current input image using a convolutional neural network trained on an image classification dataset, and if the current input image is a key frame, insert the extracted global features into the hierarchical navigable small world (HNSW) graph of an approximate nearest neighbor search algorithm;
Step three: within the search range of the current input image, retrieve through HNSW the key frame most similar to the current input image as the closed-loop candidate key frame of the current image, and take all images between the closed-loop candidate key frame and the next key frame as the closed-loop candidate image queue;
Step four: introduce a geometric consistency check: extract ORB feature points and the corresponding local difference binary (LDB) descriptors from the input image and from each retrieved closed-loop candidate image, and match the descriptors of the input image against those of the images in the closed-loop candidate queue;
Step five: take the closed-loop candidate image whose LDB descriptors best match those of the current input image as the best closed-loop candidate image, and input the matched feature points of the two images into a random sample consensus (RANSAC) algorithm to further eliminate mismatches and solve for the fundamental matrix; if the number of inliers between the two images is less than a threshold, the two images do not form a closed loop; if the number of inliers is greater than the threshold, the two images may form a closed loop;
Step six: introduce a temporal consistency check: if the 2 consecutive frames following the current input image all satisfy the threshold condition of step five, the input image and the closed-loop candidate image are considered to form a group of closed loops.
2. The closed-loop detection method based on key frame selection and local features according to claim 1, wherein in step one, the corner points are iteratively tracked in the current input image using the sparse optical flow tracking algorithm KLT, specifically:

The current input image is denoted I_i and its previous key frame I_{k-1}; I_i and I_{k-1} are converted to grayscale to obtain G_i and G_{k-1}; the Shi-Tomasi corners of G_{k-1} are extracted; and, assuming the brightness of each pixel in I_i and I_{k-1} remains constant before and after the motion, the position P(x+dx, y+dy) in image G_i of each corner P(x, y) of G_{k-1} and the corresponding optical flow are calculated.
3. The method according to claim 2, characterized in that deciding that the current input image is a new key frame when the number of corner points that cannot be tracked is greater than a threshold is specifically:
when KLT sparse optical flow tracking of key frame image G_{k-1} is performed in the current input image G_i, tracking of a corner point is considered to have failed if either of the following occurs:
(1) the corner point P(x, y) falls outside the range of image G_i;
(2) the sum of matching errors in the neighborhood of the matched corner point is greater than a threshold;
if the number of corner points whose tracking failed is greater than the set threshold, the current input image I_i is considered to be a new key frame.
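The two failure conditions and the key-frame decision above can be sketched as follows (function names, thresholds and the error representation are illustrative, not from the patent):

```python
def track_failed(corner, img_w, img_h, neighborhood_error_sum, err_threshold):
    """A corner's KLT tracking fails if (1) it leaves the image range, or
    (2) the sum of matching errors in its neighborhood exceeds a threshold."""
    x, y = corner
    out_of_range = not (0 <= x < img_w and 0 <= y < img_h)
    return out_of_range or neighborhood_error_sum > err_threshold

def is_new_key_frame(corners, errors, img_w, img_h, err_threshold, fail_threshold):
    """The current input image becomes a new key frame when the number of
    corners that failed to track exceeds fail_threshold."""
    failures = sum(track_failed(c, img_w, img_h, e, err_threshold)
                   for c, e in zip(corners, errors))
    return failures > fail_threshold
```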
4. The method according to claim 3, characterized in that in said step two, extracting global features of the current input image with a convolutional neural network trained on an image classification dataset is specifically: the current input image I_i is preprocessed, since the input of the convolutional neural network requires the image to be resized, and the output of the penultimate fully connected layer of the convolutional neural network is taken as the global feature of the image.
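The extraction pipeline of claim 4 is: resize to the network's fixed input size, run the truncated network, take the penultimate fully connected layer's output. A sketch with the CNN replaced by a caller-supplied stand-in function (the resize here is nearest-neighbor and the normalization step is an assumption, common in retrieval but not stated in the patent):

```python
import numpy as np

def preprocess(image, size=(224, 224)):
    """Nearest-neighbor resize to the network's required input size,
    scaled to [0, 1]. (224x224 is an assumed typical input size.)"""
    h, w = image.shape[:2]
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    return image[rows][:, cols].astype(np.float32) / 255.0

def global_feature(image, cnn_penultimate_fc):
    """cnn_penultimate_fc stands in for a classification CNN truncated at its
    penultimate fully connected layer; the output is L2-normalized so that
    nearest-neighbor search over features is scale-invariant."""
    x = preprocess(image)
    f = cnn_penultimate_fc(x)
    return f / np.linalg.norm(f)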
5. The method according to claim 3, characterized in that in said step three, if the current input image is a key frame, the specific process of inserting the extracted global features into the hierarchical navigable small world (HNSW) graph of the approximate nearest neighbor search algorithm is: if the current input image I_i is selected as a key frame, the highest level number l_max of the feature node of image I_i in the HNSW structure is assigned at random by an exponentially decaying probability distribution function, and the feature node is inserted into every layer from l_max down to the bottom layer l_0; in each layer, the M nodes nearest to the new feature node are searched, and the new feature node is connected to those M nearest nodes.
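The exponentially decaying level assignment is the standard HNSW construction: a node's top level satisfies P(level >= l) = exp(-l / m_L), so most nodes live only in the bottom layer and the layer populations shrink geometrically. A sketch (the value m_L = 1/ln(M) is the conventional choice, assumed here rather than quoted from the patent):

```python
import math
import random

def random_level(m_l=1.0 / math.log(16), rng=random):
    """Draw the highest HNSW level l_max for a new feature node:
    P(level >= l) = exp(-l / m_l). Using 1 - random() keeps the
    argument of log strictly positive."""
    return int(-math.log(1.0 - rng.random()) * m_l)
```

With M = 16 links per node, roughly 94% of nodes are assigned level 0, so upper layers stay sparse and greedy search from the top layer converges quickly.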
6. The method according to claim 1, characterized in that in said step two, the search range of the current input image is specifically:
U_sa = U_before - U_{fr×ct}
wherein U_sa denotes the search range of the input image; U_before denotes the set of all images before the current input image; fr is the frame rate of the camera; ct is a time constant; and U_{fr×ct} is the set of the fr×ct frames immediately preceding the current input image.
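The set difference above excludes the most recent fr×ct frames so that temporally adjacent (and therefore trivially similar) frames cannot be reported as loop closures. With frames identified by index, the formula can be sketched directly:

```python
def search_range(current_idx, fr, ct):
    """U_sa = U_before - U_{fr*ct}: all frames before the current one,
    minus the fr*ct frames immediately preceding it."""
    u_before = set(range(current_idx))
    u_recent = set(range(max(0, current_idx - fr * ct), current_idx))
    return u_before - u_recent
```

For example, at 30 fps with ct = 2 s, frame 100 may only be matched against frames 0 through 39.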
7. The method according to claim 1, characterized in that in said step four, the specific process of extracting ORB feature points and the corresponding local difference binary (LDB) descriptors from the current input image and from the retrieved closed-loop candidate image queue is:
ORB feature points are extracted from the current input image and from each image in the closed-loop candidate queue; for each ORB feature point k_ij, a patch S_ij of size s×s centered on k_ij is cropped and divided into c×c grid cells of equal size, and the average intensity I_avg and the gradients d_x, d_y of each grid cell are computed; a binary test is executed on every pair of grid cells of S_ij, and the resulting binary code is taken as the binary LDB descriptor corresponding to feature point k_ij.
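The LDB construction above can be sketched as follows: three binary tests per cell pair, one each on mean intensity, d_x and d_y, giving 3 * C(c^2, 2) bits in total. Patch size, cell count and the comparison direction are illustrative assumptions (the patent leaves the test itself to claim 8, whose formula is not reproduced in this text):

```python
from itertools import combinations

import numpy as np

def ldb_descriptor(gray, kp, s=32, c=4):
    """LDB sketch: crop an s*s patch around keypoint kp, split it into c*c
    equal cells, and emit 3 bits per cell pair comparing the cells'
    mean intensity, mean d_x and mean d_y."""
    x, y = kp
    half = s // 2
    patch = gray[y - half:y + half, x - half:x + half].astype(np.float32)
    cell = s // c
    stats = []
    for i in range(c):
        for j in range(c):
            block = patch[i * cell:(i + 1) * cell, j * cell:(j + 1) * cell]
            dy, dx = np.gradient(block)
            stats.append((block.mean(), dx.mean(), dy.mean()))
    bits = []
    for a, b in combinations(range(c * c), 2):
        for k in range(3):                  # intensity, d_x, d_y tests
            bits.append(1 if stats[a][k] > stats[b][k] else 0)
    return np.array(bits, dtype=np.uint8)
```

With c = 4 there are C(16, 2) = 120 cell pairs, hence a 360-bit descriptor.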
8. The closed-loop detection method based on key frame selection and local features according to claim 7, characterized in that the binary test executed on any two grid cells of S_ij is specifically:
9. The method according to claim 8, characterized in that in said step four, matching the input image with the descriptors of the images in the closed-loop candidate image queue is specifically:
the Hamming distance is used to match the input image I_i with the closed-loop candidate image I_n; for each LDB descriptor of the input image I_i, the two descriptors with the closest distance to it are searched in the candidate image I_n; if these two descriptors satisfy the following condition, the descriptor of I_i and its nearest neighbor in I_n are considered a pair of satisfactory feature matches:
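The nearest / second-nearest test of claim 9 can be sketched with a Lowe-style ratio criterion under Hamming distance. The patent's exact acceptance condition is not reproduced in this text, so the ratio threshold below is an illustrative stand-in:

```python
import numpy as np

def hamming(a, b):
    """Hamming distance between two binary (0/1) descriptor arrays."""
    return int(np.count_nonzero(a != b))

def match_descriptors(desc_query, desc_cand, ratio=0.7):
    """For each query descriptor, find its two nearest candidate descriptors
    by Hamming distance; accept the nearest only when it is unambiguous,
    i.e. clearly closer than the second nearest.
    Requires at least two candidate descriptors."""
    matches = []
    for qi, dq in enumerate(desc_query):
        dists = sorted((hamming(dq, dc), ci) for ci, dc in enumerate(desc_cand))
        (d1, c1), (d2, _) = dists[0], dists[1]
        if d1 < ratio * d2:
            matches.append((qi, c1))
    return matches
```

A query descriptor that is roughly equidistant from its two nearest candidates is rejected as ambiguous rather than matched.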
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011360902.8A CN112396593B (en) | 2020-11-27 | 2020-11-27 | Closed loop detection method based on key frame selection and local features |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112396593A CN112396593A (en) | 2021-02-23 |
CN112396593B true CN112396593B (en) | 2023-01-24 |
Family
ID=74604695
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011360902.8A Active CN112396593B (en) | 2020-11-27 | 2020-11-27 | Closed loop detection method based on key frame selection and local features |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112396593B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109631855A (en) * | 2019-01-25 | 2019-04-16 | 西安电子科技大学 | High-precision vehicle positioning method based on ORB-SLAM |
CN109902619A (en) * | 2019-02-26 | 2019-06-18 | 上海大学 | Image closed loop detection method and system |
Non-Patent Citations (1)
Title |
---|
Robot SLAM implementation based on ORB key frame matching algorithm; Ai Qinglin et al.; Journal of Mechanical & Electrical Engineering; 2016-05-20 (No. 05); full text *
Also Published As
Publication number | Publication date |
---|---|
CN112396593A (en) | 2021-02-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Ding et al. | Object detection in aerial images: A large-scale benchmark and challenges | |
CN111563442B (en) | Slam method and system for fusing point cloud and camera image data based on laser radar | |
CN108734210B (en) | Object detection method based on cross-modal multi-scale feature fusion | |
Chen et al. | Vehicle detection in high-resolution aerial images via sparse representation and superpixels | |
CN111126359B (en) | High-definition image small target detection method based on self-encoder and YOLO algorithm | |
CN114202672A (en) | Small target detection method based on attention mechanism | |
CN110781262B (en) | Semantic map construction method based on visual SLAM | |
CN110287826B (en) | Video target detection method based on attention mechanism | |
CN110738673A (en) | Visual SLAM method based on example segmentation | |
CN109785298B (en) | Multi-angle object detection method and system | |
CN113506318B (en) | Three-dimensional target perception method under vehicle-mounted edge scene | |
CN111368759B (en) | Monocular vision-based mobile robot semantic map construction system | |
CN109063549B (en) | High-resolution aerial video moving target detection method based on deep neural network | |
Dong et al. | Learning a robust CNN-based rotation insensitive model for ship detection in VHR remote sensing images | |
CN111767854B (en) | SLAM loop detection method combined with scene text semantic information | |
CN111723660A (en) | Detection method for long ground target detection network | |
Saleem et al. | Neural network-based recent research developments in SLAM for autonomous ground vehicles: A review | |
CN113724388B (en) | High-precision map generation method, device, equipment and storage medium | |
CN113704276A (en) | Map updating method and device, electronic equipment and computer readable storage medium | |
Ali et al. | A life-long SLAM approach using adaptable local maps based on rasterized LIDAR images | |
CN112651294A (en) | Method for recognizing human body shielding posture based on multi-scale fusion | |
CN111932612A (en) | Intelligent vehicle vision positioning method and device based on second-order hidden Markov model | |
CN116721206A (en) | Real-time indoor scene vision synchronous positioning and mapping method | |
CN112396593B (en) | Closed loop detection method based on key frame selection and local features | |
CN115187614A (en) | Real-time simultaneous positioning and mapping method based on STDC semantic segmentation network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||