CN112396593A - Closed loop detection method based on key frame selection and local features - Google Patents


Info

Publication number
CN112396593A
Authority
CN
China
Prior art keywords: image, input image, current input, closed, key frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011360902.8A
Other languages
Chinese (zh)
Other versions
CN112396593B (en)
Inventor
宋海龙
游林辉
胡峰
孙仝
陈政
张谨立
黄达文
王伟光
梁铭聪
黄志就
何彧
陈景尚
谭子毅
尤德柱
区嘉亮
陈宇婷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhaoqing Power Supply Bureau of Guangdong Power Grid Co Ltd
Original Assignee
Zhaoqing Power Supply Bureau of Guangdong Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhaoqing Power Supply Bureau of Guangdong Power Grid Co Ltd filed Critical Zhaoqing Power Supply Bureau of Guangdong Power Grid Co Ltd
Priority to CN202011360902.8A priority Critical patent/CN112396593B/en
Publication of CN112396593A publication Critical patent/CN112396593A/en
Application granted granted Critical
Publication of CN112396593B publication Critical patent/CN112396593B/en
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/53 Querying
    • G06F16/532 Query formulation, e.g. graphical querying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/269 Analysis of motion using gradient-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20112 Image segmentation details
    • G06T2207/20164 Salient point detection; Corner detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Library & Information Science (AREA)
  • Evolutionary Biology (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Image Analysis (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Biophysics (AREA)

Abstract

The invention relates to a closed-loop detection method based on key frame selection and local features. Key frames are selected through KLT sparse optical flow tracking, so the motion speed of the mobile robot need not be considered; images captured at turns are handled well, and the selected key frames are more representative. Selecting key frames also reduces the computation required during matching, which increases the detection speed of the whole method.

Description

Closed loop detection method based on key frame selection and local features
Technical Field
The invention relates to the field of vision-based positioning and navigation in autonomous inspection by unmanned aerial vehicles, and in particular to a closed-loop detection method based on key frame selection and local features.
Background
During intelligent UAV inspection, the unmanned aerial vehicle must autonomously decide which operations to perform based on environmental information. Autonomous positioning and the sensing and construction of an environment map are therefore key links in autonomous UAV inspection. In recent years, the development of visual SLAM (simultaneous localization and mapping) technology has improved the autonomous localization and mapping capability of mobile robots. Closed-loop detection is an important component of a visual SLAM system: it detects whether the mobile robot has returned to a previously visited place, and plays an extremely important role in reducing the robot's positioning error and constructing a globally consistent environment map. Closed-loop detection matches the current frame against key frames and judges whether a loop is closed according to the matching degree, so correct key frame selection is crucial to closed-loop detection.
The Chinese patent application with publication number CN109902619A, published on June 18, 2019, discloses an image closed-loop detection method and system comprising the following steps: extract FAST corners from each frame and compute BRIEF descriptors; look the BRIEF descriptors up in a pre-built bag-of-words model to obtain the corresponding visual words, which are used to build a vector description of the image; judge with a tracking prediction algorithm whether the current image is likely to produce a closed loop, and predict the likely position of the closed loop to obtain a closed-loop candidate set; evaluate the similarity between the current image and each image in the closed-loop candidate set through the visual word vectors, taking the most similar image in the candidate set as the candidate image; normalize the candidate image; and compute an ORB global descriptor of the normalized image to complete the structural check of the candidate image. That invention can effectively accelerate the detection algorithm and provide more accurate closed-loop detection performance.
That method belongs to the family of closed-loop detection methods based on the visual bag-of-words model: local feature points and descriptors are extracted from the input image, a BoW vector representation of the input image is obtained with the help of a visual dictionary, and a tracking prediction algorithm judges whether a loop is closed. Closed-loop detection based on the visual bag-of-words model is fairly robust to changes of viewing angle, but has difficulty coping with appearance changes. The method also lacks key frame selection and takes candidate images on similarity alone, so the computational load is large, which hurts the final detection speed.
Disclosure of Invention
The invention aims to solve the problem of slow detection speed in the prior art, and provides a closed-loop detection method based on key frame selection and local features.
In order to solve the technical problems, the invention adopts the technical scheme that: a closed loop detection method based on key frame selection and local features comprises the following steps:
Step one: an input image is acquired by the mobile robot; the first frame of the input image sequence is determined to be a key frame; the Shi-Tomasi corners of the key frame preceding the current input image are extracted and tracked iteratively in the current input image with a sparse optical flow tracking algorithm; if the number of corners that cannot be tracked is greater than a threshold, the current input image is determined to be a new key frame;
Step two: global features are extracted from the current input image with a convolutional neural network trained on an image classification dataset; if the current input image is a key frame, the extracted global features are inserted into the hierarchical navigable small world (HNSW) graph of an approximate nearest neighbor retrieval algorithm;
Step three: within the retrieval range of the current input image, the key frame most similar to the current input image is retrieved through HNSW as the closed-loop candidate key frame of the current image, and all images between the closed-loop candidate key frame and the next key frame are taken as a closed-loop candidate image queue;
Step four: a geometric consistency check is introduced; ORB feature points and the corresponding local difference binary (LDB) descriptors are extracted from the input image and from the retrieved closed-loop candidate images, and the descriptors of the input image are matched against those of each image in the closed-loop candidate image queue;
Step five: the closed-loop candidate image whose LDB descriptors best match those of the current input image is taken as the optimal closed-loop candidate image, and the matched feature points of the two images are input into a random sample consensus (RANSAC) algorithm to further eliminate mismatches and solve for the fundamental matrix; if the number of inliers between the two images is less than a threshold, the two images do not form a closed loop; if it is greater than the threshold, the two images may form a closed loop;
Step six: a temporal consistency check is introduced; if the 2 consecutive frames following the current input image both satisfy the threshold condition of step five, the input image and the closed-loop candidate image are considered to form a closed loop.
Preferably, in step one, the corners are iteratively tracked in the current input image using the sparse optical flow tracking algorithm KLT, specifically:
The previous key frame of the current input image I_i is denoted I_{k-1}. Images I_i and I_{k-1} are converted to grayscale, giving G_i and G_{k-1}. The Shi-Tomasi corners of image G_{k-1} are extracted; assuming the brightness of a pixel in I_i and I_{k-1} stays constant before and after its motion, the position P(x + dx, y + dy) and the optical flow d = (dx, dy) in image G_i of each corner P(x, y) of G_{k-1} are computed.
The specific calculation steps are as follows. The current input image I_i is converted to grayscale, giving G_i; the grayscale image of its previous key frame is G_{k-1}, from which the Shi-Tomasi corners are extracted. Gaussian pyramid transforms are applied to G_{k-1} and G_i, producing L layers of images at different resolutions. In layer L_m, suppose the corner P(x, y) of G_{k-1} moves to the point P(x + dx, y + dy) in G_i, taking a time dt. Because the brightness of a pixel is unchanged before and after its motion between the two images:

I(x, y, t) = I(x + dx, y + dy, t + dt)    (1)

where I(x, y, t) is the brightness of pixel P(x, y) at time t, and I(x + dx, y + dy, t + dt) is the brightness of the shifted pixel P(x + dx, y + dy) in image G_i. By Taylor's formula, I(x + dx, y + dy, t + dt) can be expanded as:

I(x + dx, y + dy, t + dt) ≈ I(x, y, t) + (∂I/∂x)dx + (∂I/∂y)dy + (∂I/∂t)dt + ε    (2)

where ε is a higher-order infinitesimal and can be neglected. Equation (1) can therefore be simplified to:

I(x, y, t) = I(x, y, t) + (∂I/∂x)dx + (∂I/∂y)dy + (∂I/∂t)dt    (3)

(∂I/∂x)dx + (∂I/∂y)dy + (∂I/∂t)dt = 0    (4)

Dividing both sides by dt:

(∂I/∂x)(dx/dt) + (∂I/∂y)(dy/dt) + ∂I/∂t = 0    (5)

Let u and v be the velocity components of the optical flow along the X axis and Y axis respectively, i.e.

u = dx/dt,  v = dy/dt    (6)

In addition, denote

I_x = ∂I/∂x,  I_y = ∂I/∂y,  I_t = ∂I/∂t    (7)

Equation (5) can then be written as:

I_x u + I_y v + I_t = 0    (8)

Assuming the pixels around P(x, y) move the same distance as P(x, y), a window of size (5, 5) is taken around P(x, y); for the 25 pixels p_1, ..., p_25 in the window:

[I_x(p_1) I_y(p_1); ... ; I_x(p_25) I_y(p_25)] (u, v)^T = −(I_t(p_1), ..., I_t(p_25))^T    (9)

The least-squares method is used to solve this over-determined system so that the sum of matching errors within the window is minimized. Equation (9) can be abbreviated as:

A d = b    (10)

Multiplying both sides by A^T:

(A^T A) d = A^T b    (11)

The velocity components u and v of the optical flow along the X axis and Y axis are then obtained as:

d = (u, v)^T = (A^T A)^{−1} A^T b    (12)

From the solved u and v, the position P(x + dx, y + dy) and the optical flow d = (dx, dy) of the corner P(x, y) in layer L_m of image G_i can be calculated. The optical flow obtained at layer L_m is taken as the initial value at layer L_{m-1}, where a refined value is computed, and so on down to the lowest layer L_0 (the original image), giving the final optical flow and the tracked corner P(x + dx, y + dy).
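For concreteness, the per-window least-squares solve of equations (9)-(12) can be sketched in Python with numpy; the function name and the gradient inputs below are illustrative assumptions, not code from the patent:

    import numpy as np

    def solve_window_flow(Ix, Iy, It):
        # Solve I_x*u + I_y*v = -I_t over a 5x5 window by least squares.
        # Ix, Iy, It: (5, 5) arrays of spatial/temporal gradients sampled
        # around the tracked corner. Returns d = (u, v), i.e. equation (12).
        A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)  # 25 x 2 matrix A
        b = -It.ravel()                                 # 25-vector b
        # Solving the normal equations (A^T A) d = A^T b; lstsq is the
        # numerically safer equivalent of (A^T A)^{-1} A^T b.
        d, *_ = np.linalg.lstsq(A, b, rcond=None)
        return d                                        # d = [u, v]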
Preferably, the current input image is determined to be a new key frame when the number of corners that cannot be tracked is greater than the threshold, specifically:
When the corners of key frame image G_{k-1} are tracked in the current input image G_i by KLT sparse optical flow, tracking is considered to have failed if:
(1) the corner P(x, y) moves out of the image range of G_i;
(2) the sum of matching errors in the neighborhood of the matched corner is greater than a threshold.
If the number of corners whose tracking failed is greater than the set threshold, the current input image I_i is considered a new key frame.
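A minimal sketch of this key frame rule, using OpenCV's built-in Shi-Tomasi detector and pyramidal KLT tracker, is given below; the detector parameters, error bound, and lost-corner threshold are assumed values for illustration:

    import cv2

    def is_new_keyframe(prev_key_gray, cur_gray, lost_thresh=50):
        # Track Shi-Tomasi corners of the last key frame into the current
        # frame with pyramidal KLT; declare a new key frame when too many
        # corners are lost (out of frame, or tracking error too large).
        corners = cv2.goodFeaturesToTrack(prev_key_gray, maxCorners=300,
                                          qualityLevel=0.01, minDistance=7)
        if corners is None:
            return True
        pts, status, err = cv2.calcOpticalFlowPyrLK(
            prev_key_gray, cur_gray, corners, None,
            winSize=(5, 5), maxLevel=3)
        h, w = cur_gray.shape
        inside = ((pts[:, 0, 0] >= 0) & (pts[:, 0, 0] < w) &
                  (pts[:, 0, 1] >= 0) & (pts[:, 0, 1] < h))
        ok = (status.ravel() == 1) & inside & (err.ravel() < 20.0)
        lost = len(corners) - int(ok.sum())
        return lost > lost_thresh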
Preferably, in step two, extracting global features from the current input image with a convolutional neural network trained on an image classification dataset specifically comprises: the current input image I_i is preprocessed by resizing it to the input size required by the convolutional neural network, and the output of the network's penultimate fully connected layer is taken as the global feature of the image.
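A sketch of this global feature extractor, built on torchvision's VGG16, follows. The patent trains on the Places365-standard dataset; torchvision only ships ImageNet weights, so the weight choice here is a stand-in assumption:

    import torch
    from torchvision import models, transforms

    # VGG16 backbone (assumption: ImageNet weights stand in for the
    # Places365-standard weights described in the patent).
    vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).eval()
    # Keep the classifier up to the penultimate fully connected layer,
    # whose 4096-d output serves as the global image feature.
    fc_head = torch.nn.Sequential(*list(vgg.classifier.children())[:-3])

    preprocess = transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])

    @torch.no_grad()
    def global_feature(pil_image):
        x = preprocess(pil_image).unsqueeze(0)   # 1 x 3 x 224 x 224
        x = vgg.features(x)                      # convolutional features
        x = vgg.avgpool(x).flatten(1)            # 1 x 25088
        return fc_head(x).squeeze(0)             # 4096-d global feature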
Preferably, in step two, if the current input image is a key frame, the specific process of inserting the extracted global features into the hierarchical navigable small world (HNSW) graph of the approximate nearest neighbor retrieval algorithm is: if the current input image I_i is selected as a key frame, the highest layer number l_max of the feature node of image I_i in the HNSW structure is randomly assigned by an exponentially decaying probability distribution function, and the feature node is inserted into every layer from l_max down to the bottom layer l_0. In each of these layers, the M nodes nearest to the new feature node are found and connected to it.
Preferably, in step three, the retrieval range of the current input image is specifically:

U_sa = U_before − U_{fr×ct}

where U_sa denotes the retrieval range of the input image; U_before denotes the set of all images preceding the current input image; fr is the frame rate of the camera; ct is a time constant; and U_{fr×ct} is the set of the fr × ct frames immediately preceding the current input image.
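Key frame insertion and the range-restricted query can be sketched with the hnswlib library as follows; the index parameters, k, and the frame-rate values are assumptions for illustration. hnswlib itself draws each node's top layer from an exponentially decaying distribution, matching the insertion scheme described above:

    import numpy as np
    import hnswlib

    dim = 4096                                  # VGG16 global feature size
    index = hnswlib.Index(space='l2', dim=dim)
    index.init_index(max_elements=20000, ef_construction=200, M=16)

    def insert_keyframe(feature, frame_id):
        # Only key frames are inserted into the HNSW graph.
        index.add_items(feature[np.newaxis, :], np.array([frame_id]))

    def query_loop_candidate(feature, cur_frame_id, fr=30, ct=2.0):
        # Retrieval range U_sa: skip the fr*ct most recent frames, since
        # images adjacent to the query are trivially similar to it.
        labels, dists = index.knn_query(feature[np.newaxis, :], k=10)
        horizon = cur_frame_id - int(fr * ct)
        for label, dist in zip(labels[0], dists[0]):
            if label < horizon:
                return int(label), float(dist)  # nearest admissible key frame
        return None, None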
Preferably, in step four, the specific process of extracting ORB feature points and the corresponding local difference binary (LDB) descriptors from the current input image and the retrieved closed-loop candidate image queue is:
ORB feature points are extracted from the current input image and from the images in the closed-loop candidate image queue. For each ORB feature point k_ij, a patch S_ij of size s × s centered on k_ij is cropped, and S_ij is divided into c × c grid cells of equal size. The average intensity I_avg and the gradients d_x, d_y of each grid cell are computed. For any two grid cells m and n of S_ij, a binary test is executed; the resulting binary code is the binary LDB descriptor corresponding to feature point k_ij.
Preferably, the binary test executed on any two grid cells m and n of S_ij is specifically:

τ(m, n) = 1 if f(m) > f(n), and 0 otherwise

where f(m) and f(n) denote, in turn, the average intensity I_avg and gradient d_x, d_y values of grid cells m and n, respectively.
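A simplified numpy sketch of this LDB construction is given below. It assumes the keypoint lies at least s/2 pixels inside the image border, and it omits the rotation compensation and multi-scale grids of the full LDB descriptor:

    import numpy as np

    def ldb_descriptor(gray, kp, s=32, c=4):
        # Crop an s x s patch around keypoint kp, split it into c x c
        # cells, and binary-test every cell pair on average intensity
        # and on the gradients d_x and d_y.
        x, y = int(kp[0]), int(kp[1])
        half = s // 2
        patch = gray[y - half:y + half, x - half:x + half].astype(np.float32)
        gy, gx = np.gradient(patch)              # d_y, d_x of the patch
        cell = s // c
        feats = []                               # f(m) for every grid cell
        for img in (patch, gx, gy):              # I_avg, d_x, d_y in turn
            cells = img.reshape(c, cell, c, cell).mean(axis=(1, 3))
            feats.append(cells.ravel())
        bits = []
        for f in feats:
            for m in range(len(f)):
                for n in range(m + 1, len(f)):
                    bits.append(1 if f[m] > f[n] else 0)   # tau(m, n)
        return np.packbits(np.array(bits, dtype=np.uint8))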
Preferably, in step four, the input image is matched against the descriptors of the images in the closed-loop candidate image queue as follows. The Hamming distance is used to match the LDB descriptors of the input image I_i and a closed-loop candidate image I_n: for each LDB descriptor a of the input image I_i, the two descriptors b_1, b_2 of the candidate image I_n closest to a are searched for. The pair (a, b_1) is considered a satisfactory feature match if the following condition is satisfied:

D(a, b_1) < ε_d · D(a, b_2)

where D(a, b_1) and D(a, b_2) respectively denote the Hamming distances between the feature descriptors, and ε_d is a distance scaling factor whose value is usually less than 1.
Preferably, matching the LDB descriptors of the input image I_i and the closed-loop candidate image I_n with the Hamming distance is specifically:

D(d_1, d_2) = Σ_i ( d_1(i) ⊕ d_2(i) )

where d_1, d_2 are two LDB descriptors, d_1(i) and d_2(i) denote bit i of each descriptor, and ⊕ is the XOR operation.
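This nearest/second-nearest Hamming test maps directly onto OpenCV's brute-force matcher; a short sketch, with an assumed ε_d of 0.7, follows:

    import cv2

    def match_ldb(des_query, des_cand, eps_d=0.7):
        # Match binary descriptors by Hamming distance and keep a pair
        # only if the nearest neighbour beats the second-nearest by the
        # distance scaling factor eps_d (< 1), as in the condition above.
        bf = cv2.BFMatcher(cv2.NORM_HAMMING)
        good = []
        for pair in bf.knnMatch(des_query, des_cand, k=2):
            if len(pair) < 2:
                continue
            first, second = pair
            if first.distance < eps_d * second.distance:
                good.append(first)
        return good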
Compared with the prior art, the invention has the beneficial effects that:
1. Key frames are selected through KLT sparse optical flow tracking, so the motion speed of the mobile robot need not be considered; images captured at turns are handled well, and the selected key frames are more representative. Selecting key frames also reduces the computation required during matching, which increases the detection speed of the whole method.
2. The invention verifies whether two images form a closed loop through the local difference binary (LDB) descriptor, which captures the geometric topological relation between the two images and improves the precision of closed-loop detection.
3. The invention extracts the global features of the image with a convolutional neural network trained on an image classification dataset and uses them for nearest neighbor image retrieval, so scenes with appearance changes are handled better.
Drawings
FIG. 1 is a flow chart of a closed loop detection method based on key frame selection and local features of the present invention;
FIG. 2 is a flowchart of key frame selection for a closed loop detection method based on key frame selection and local features according to the present invention.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting this patent. For the purpose of better illustrating the embodiments, certain features of the drawings may be omitted, enlarged, or reduced, and do not represent the size of an actual product; it will be understood by those skilled in the art that certain well-known structures in the drawings, and their descriptions, may be omitted. The positional relationships depicted in the drawings are for illustrative purposes only and are not to be construed as limiting this patent.
The same or similar reference numerals in the drawings of the embodiments of the present invention correspond to the same or similar components. In the description of the present invention, terms such as "upper", "lower", "left", "right", "long", and "short" that indicate orientations or positional relationships based on the drawings are used only for convenience and simplicity of description; they do not indicate or imply that the referenced device or element must have a specific orientation or be constructed and operated in a specific orientation, and are therefore not to be construed as limitations of this patent. The specific meanings of these terms may be understood by those skilled in the art according to the specific situation.
The technical scheme of the invention is further described in detail by the following specific embodiments in combination with the attached drawings:
Embodiment
FIGS. 1-2 show an embodiment of the closed-loop detection method based on key frame selection and local features, which comprises the following steps:
Step one: the first frame of the input image sequence is taken as a key frame. The previous key frame of the current input image I_i is denoted I_{k-1}; images I_i and I_{k-1} are converted to grayscale, giving G_i and G_{k-1}. The Shi-Tomasi corners of image G_{k-1} are extracted, and Gaussian pyramid transforms of G_{k-1} and G_i produce L layers of images at different resolutions.
Since the brightness of a pixel in G_{k-1} and G_i stays constant before and after its motion, solving for the velocity components u and v of the optical flow along the X and Y axes gives the position P(x + dx, y + dy) and the optical flow of corner P(x, y) in layer L_m of image G_i. The optical flow obtained at layer L_m is taken as the initial value at layer L_{m-1}, where a refined value is computed, and so on down to the lowest layer L_0 (the original image), giving the final optical flow and the tracked corner P(x + dx, y + dy).
When the corners of image G_{k-1} are tracked in image G_i by KLT sparse optical flow, tracking is considered to have failed if:
(1) the corner P(x, y) moves out of the image range of G_i;
(2) the sum of matching errors in the neighborhood of the matched corner in the two images is greater than a threshold.
If the number of corners whose tracking failed is greater than the set threshold, the current input image I_i is considered a new key frame.
Step two: the current input image I_i is preprocessed by resizing it to 224 × 224 pixels. The convolutional neural network VGG16, trained on the Places365-standard dataset, extracts the features of image I_i; the output of the penultimate fully connected layer of the VGG16 network is taken as the global feature f_glo,i of image I_i. If the current input image is a key frame, the extracted global feature is inserted into the hierarchical navigable small world (HNSW) graph of the approximate nearest neighbor retrieval algorithm.
Step three: within the retrieval range of the current input image I_i, the key frame most similar to the current input image is retrieved through HNSW as the closed-loop candidate key frame of the current image, and all images between the closed-loop candidate key frame and the next key frame are taken as the closed-loop candidate image queue. Since adjacent images in the sequence transmitted by the mobile robot are highly similar, the retrieval range of the current input image is restricted to all key frames within U_sa:

U_sa = U_before − U_{fr×ct}

where U_before is the set of all images preceding the current input image I_i, fr is the frame rate of the camera, ct is a time constant, and U_{fr×ct} is the set of the fr × ct frames immediately preceding the current input image.
Step four: a geometric consistency check is introduced. ORB feature points are extracted from the current input image I_i and from the retrieved closed-loop candidate image queue. For each ORB feature point k_ij, a patch S_ij of size s × s centered on k_ij is cropped, and S_ij is divided into c × c grid cells of equal size. The average intensity I_avg and the gradients d_x, d_y of each grid cell are computed. For any two grid cells m and n of S_ij, the following binary test is performed:

τ(m, n) = 1 if f(m) > f(n), and 0 otherwise

where f(m) and f(n) denote the average intensity I_avg and gradient d_x, d_y values of grid cells m and n, respectively. After the binary test has been executed over the c × c grid cells of S_ij, the resulting binary code is the binary LDB descriptor corresponding to feature point k_ij.
After the LDB descriptors of the current input image I_i and of the closed-loop candidate image queue are obtained, the Hamming distance is used to match the LDB descriptors of the input image I_i against those of each image I_{q,n} in the closed-loop candidate image queue. For each LDB descriptor a of I_i, the two LDB descriptors b_1, b_2 of I_{q,n} closest to a are searched for; a and b_1 are considered a good feature match if the following condition is satisfied:

D(a, b_1) < ε_d · D(a, b_2)

where D(a, b_1) and D(a, b_2) denote the Hamming distances between the descriptors, and ε_d is a distance scaling factor whose value is usually less than 1.
The Hamming distance used to match the LDB descriptors of the input image I_i and a closed-loop candidate image I_n is specifically:

D(d_1, d_2) = Σ_i ( d_1(i) ⊕ d_2(i) )

where d_1, d_2 are two LDB descriptors, d_1(i) and d_2(i) denote bit i of each descriptor, and ⊕ is the XOR operation.
Step five: the closed-loop candidate image whose LDB descriptors best match those of the current input image I_i is taken as the optimal closed-loop candidate image, and the matched feature points of the two images are input into the random sample consensus algorithm (RANSAC) to further eliminate mismatches and solve for the fundamental matrix. If the number of inliers between the two images is less than the threshold, the two images do not form a closed loop; if the number of inliers is not less than the threshold, the two images may form a closed loop.
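A sketch of this geometric check with OpenCV's RANSAC-based fundamental matrix estimation follows; the inlier threshold and the RANSAC parameters are assumed values:

    import numpy as np
    import cv2

    def geometric_check(pts_query, pts_cand, inlier_thresh=20):
        # Estimate the fundamental matrix from matched points with RANSAC
        # and accept the loop closure hypothesis only if enough inliers
        # survive the model fit.
        if len(pts_query) < 8:                   # 8-point minimum for F
            return False, None
        F, mask = cv2.findFundamentalMat(
            np.float32(pts_query), np.float32(pts_cand),
            cv2.FM_RANSAC, 1.0, 0.99)
        if F is None:
            return False, None
        return int(mask.sum()) >= inlier_thresh, F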
Step six: a temporal consistency check is introduced. If the 2 consecutive frames following the current input image I_i both satisfy the threshold condition of step five, the current input image and the optimal closed-loop candidate image are considered to form a closed loop.
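A minimal sketch of this temporal consistency rule, buffering the step-five result for the current frame and the 2 frames that follow it, might look like this:

    from collections import deque

    class TemporalCheck:
        # A loop closure is accepted only when three consecutive frames
        # (the current input and the next 2) all pass the step-five
        # geometric check against the same closed-loop candidate.
        def __init__(self, needed=3):
            self.recent = deque(maxlen=needed)

        def update(self, passed_geometric_check):
            self.recent.append(bool(passed_geometric_check))
            return (len(self.recent) == self.recent.maxlen
                    and all(self.recent))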
The beneficial effects of this embodiment: 1. Key frames are selected through KLT sparse optical flow tracking, so the motion speed of the mobile robot need not be considered; images captured at turns are handled well, and the selected key frames are more representative. Selecting key frames also reduces the computation required during matching, which increases the detection speed of the whole method. 2. Whether two images form a closed loop is verified through the local difference binary (LDB) descriptor, which captures the geometric topological relation between the two images and improves the precision of closed-loop detection. 3. The global features of the image are extracted with a convolutional neural network trained on an image classification dataset and used for nearest neighbor image retrieval, so scenes with appearance changes are handled better.
It should be understood that the above-described embodiments of the present invention are merely examples for clearly illustrating the invention and are not intended to limit its embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description; it is neither necessary nor possible to exhaustively list all embodiments. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall be included in the protection scope of the claims of the present invention.

Claims (10)

1. A closed loop detection method based on key frame selection and local features is characterized by comprising the following steps:
step one: acquiring an input image with a mobile robot; determining a first frame of the input image sequence as a key frame, extracting Shi-Tomasi corners of the key frame preceding the current input image, iteratively tracking the corners in the current input image with a sparse optical flow tracking algorithm, and, if the number of corners that cannot be tracked is greater than a threshold, determining the current input image to be a new key frame;
step two: extracting global features from the current input image with a convolutional neural network trained on an image classification dataset and, if the current input image is a key frame, inserting the extracted global features into a hierarchical navigable small world (HNSW) graph of an approximate nearest neighbor retrieval algorithm;
step three: within the retrieval range of the current input image, retrieving through HNSW the key frame most similar to the current input image as the closed-loop candidate key frame of the current image, and taking all images between the closed-loop candidate key frame and the next key frame as a closed-loop candidate image queue;
step four: introducing a geometric consistency check, extracting ORB feature points and corresponding local difference binary (LDB) descriptors from the input image and the retrieved closed-loop candidate images, and matching the descriptors of the input image against those of each image in the closed-loop candidate image queue;
step five: taking the closed-loop candidate image whose LDB descriptors best match those of the current input image as the optimal closed-loop candidate image, and inputting the matched feature points of the two images into a random sample consensus algorithm to further eliminate mismatches and solve for the fundamental matrix; if the number of inliers between the two images is less than a threshold, the two images do not form a closed loop; if the number of inliers is greater than the threshold, the two images may form a closed loop;
step six: introducing a temporal consistency check; if the 2 consecutive frames following the current input image both satisfy the threshold condition of step five, the input image and the closed-loop candidate image are considered to form a closed loop.
2. The closed-loop detection method based on key frame selection and local features as claimed in claim 1, characterized in that, in step one, the corners are iteratively tracked in the current input image using the sparse optical flow tracking algorithm KLT, specifically: the previous key frame of the current input image I_i is denoted I_{k-1}; images I_i and I_{k-1} are converted to grayscale, giving G_i and G_{k-1}; the Shi-Tomasi corners of image G_{k-1} are extracted; assuming the brightness of a pixel in I_i and I_{k-1} stays constant before and after its motion, the position P(x + dx, y + dy) and the optical flow d = (dx, dy) in image G_i of each corner P(x, y) of G_{k-1} are computed.
3. The method according to claim 2, characterized in that determining the current input image to be a new key frame when the number of corners that cannot be tracked is greater than the threshold is specifically:
when the corners of key frame image G_{k-1} are tracked in the current input image G_i by KLT sparse optical flow, tracking is considered to have failed if:
(1) the corner P(x, y) moves out of the image range of G_i;
(2) the sum of matching errors in the neighborhood of the matched corner is greater than a threshold;
if the number of corners whose tracking failed is greater than the set threshold, the current input image I_i is considered a new key frame.
4. The method according to claim 3, characterized in that, in step two, extracting global features from the current input image with a convolutional neural network trained on an image classification dataset is specifically: the current input image I_i is preprocessed by resizing it to the input size required by the convolutional neural network, and the output of the network's penultimate fully connected layer is taken as the global feature of the image.
5. The method according to claim 3, characterized in that, in step two, if the current input image is a key frame, the specific process of inserting the extracted global features into the hierarchical navigable small world (HNSW) graph of the approximate nearest neighbor retrieval algorithm is: if the current input image I_i is selected as a key frame, the highest layer number l_max of the feature node of image I_i in the HNSW structure is randomly assigned by an exponentially decaying probability distribution function, and the feature node is inserted into every layer from l_max down to the bottom layer l_0; in each of these layers, the M nodes nearest to the new feature node are found and connected to it.
6. The method according to claim 1, characterized in that, in step three, the retrieval range of the current input image is specifically:

U_sa = U_before − U_{fr×ct}

where U_sa denotes the retrieval range of the input image; U_before denotes the set of all images preceding the current input image; fr is the frame rate of the camera; ct is a time constant; and U_{fr×ct} is the set of the fr × ct frames immediately preceding the current input image.
7. The method as claimed in claim 1, characterized in that, in step four, the specific process of extracting ORB feature points and the corresponding local difference binary (LDB) descriptors from the current input image and the retrieved closed-loop candidate image queue is: extracting ORB feature points from the current input image and the closed-loop candidate image queue; for each ORB feature point k_ij, cropping a patch S_ij of size s × s centered on k_ij and dividing S_ij into c × c grid cells of equal size; computing the average intensity I_avg and the gradients d_x, d_y of each grid cell; and, for any two grid cells m and n of S_ij, executing a binary test, the resulting binary code being the binary LDB descriptor corresponding to feature point k_ij.
8. The method of claim 7, characterized in that executing the binary test on any two grid cells m and n of S_ij is specifically:

τ(m, n) = 1 if f(m) > f(n), and 0 otherwise

where f(m) and f(n) respectively denote the average intensity I_avg and gradient d_x, d_y values of grid cells m and n.
9. The method according to claim 8, characterized in that, in step four, the input image is matched against the descriptors of the images in the closed-loop candidate image queue as follows: the Hamming distance is used to match the LDB descriptors of the input image I_i and a closed-loop candidate image I_n; for each LDB descriptor a of the input image I_i, the two descriptors b_1, b_2 closest to a are searched for in the candidate image I_n; the pair (a, b_1) is considered a satisfactory feature match if the following condition is satisfied:

D(a, b_1) < ε_d · D(a, b_2)

where D(a, b_1) and D(a, b_2) respectively denote the Hamming distances between the feature descriptors, and ε_d is a distance scaling factor whose value is usually less than 1.
10. The method as claimed in claim 9, characterized in that matching the LDB descriptors of the input image I_i and the closed-loop candidate image I_n with the Hamming distance is specifically:

D(d_1, d_2) = Σ_i ( d_1(i) ⊕ d_2(i) )

where d_1, d_2 are two LDB descriptors, d_1(i) and d_2(i) denote bit i of each descriptor, and ⊕ is the XOR operation.
CN202011360902.8A 2020-11-27 2020-11-27 Closed loop detection method based on key frame selection and local features Active CN112396593B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011360902.8A CN112396593B (en) 2020-11-27 2020-11-27 Closed loop detection method based on key frame selection and local features

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011360902.8A CN112396593B (en) 2020-11-27 2020-11-27 Closed loop detection method based on key frame selection and local features

Publications (2)

Publication Number Publication Date
CN112396593A true CN112396593A (en) 2021-02-23
CN112396593B CN112396593B (en) 2023-01-24

Family

ID=74604695

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011360902.8A Active CN112396593B (en) 2020-11-27 2020-11-27 Closed loop detection method based on key frame selection and local features

Country Status (1)

Country Link
CN (1) CN112396593B (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109631855A (en) * 2019-01-25 2019-04-16 西安电子科技大学 High-precision vehicle positioning method based on ORB-SLAM
CN109902619A (en) * 2019-02-26 2019-06-18 上海大学 Image closed loop detection method and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
AI Qinglin et al.: "Robot SLAM implementation based on the ORB key frame matching algorithm", Journal of Mechanical & Electrical Engineering *

Also Published As

Publication number Publication date
CN112396593B (en) 2023-01-24

Similar Documents

Publication Publication Date Title
Ding et al. Object detection in aerial images: A large-scale benchmark and challenges
CN111507271B (en) Airborne photoelectric video target intelligent detection and identification method
CN108734210B (en) Object detection method based on cross-modal multi-scale feature fusion
CN111126359B (en) High-definition image small target detection method based on self-encoder and YOLO algorithm
CN114202672A (en) Small target detection method based on attention mechanism
CN111563442A (en) Slam method and system for fusing point cloud and camera image data based on laser radar
CN110781262B (en) Semantic map construction method based on visual SLAM
CN112258618A (en) Semantic mapping and positioning method based on fusion of prior laser point cloud and depth map
CN109785298B (en) Multi-angle object detection method and system
CN111462210B (en) Monocular line feature map construction method based on epipolar constraint
CN111368759B (en) Monocular vision-based mobile robot semantic map construction system
CN111914720B (en) Method and device for identifying insulator burst of power transmission line
CN109063549B (en) High-resolution aerial video moving target detection method based on deep neural network
CN109886159B (en) Face detection method under non-limited condition
CN114049381A (en) Twin cross target tracking method fusing multilayer semantic information
Dong et al. Learning a robust CNN-based rotation insensitive model for ship detection in VHR remote sensing images
Saleem et al. Neural network-based recent research developments in SLAM for autonomous ground vehicles: A review
CN114067128A (en) SLAM loop detection method based on semantic features
CN113704276A (en) Map updating method and device, electronic equipment and computer readable storage medium
Ali et al. A life-long SLAM approach using adaptable local maps based on rasterized LIDAR images
CN111932612A (en) Intelligent vehicle vision positioning method and device based on second-order hidden Markov model
CN116721206A (en) Real-time indoor scene vision synchronous positioning and mapping method
CN112396593B (en) Closed loop detection method based on key frame selection and local features
CN115187614A (en) Real-time simultaneous positioning and mapping method based on STDC semantic segmentation network
CN112396596A (en) Closed loop detection method based on semantic segmentation and image feature description

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant