CN116958595A - Visual SLAM loop detection improvement method based on image block region feature points - Google Patents


Info

Publication number: CN116958595A
Application number: CN202310960132.8A
Authority: CN (China)
Prior art keywords: image, image block, feature points, weight, gradient
Legal status: Pending
Other languages: Chinese (zh)
Inventors: 肖震东, 魏武, 杨姗, 柳雄顶
Current Assignee: South China University of Technology (SCUT)
Original Assignee: South China University of Technology (SCUT)
Priority date: 2023-08-01
Filing date: 2023-08-01
Publication date: 2023-10-27
Application filed by South China University of Technology (SCUT); priority to CN202310960132.8A; publication of CN116958595A

Classifications

    • G06V10/761 Proximity, similarity or dissimilarity measures
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/50 Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
    • G06V10/772 Determining representative reference patterns, e.g. averaging or distorting patterns; Generating dictionaries

Abstract

The invention belongs to the field of image processing, in particular visual SLAM technology, and discloses a visual SLAM loop detection improvement method based on image block region feature points. First, the input image is converted into a grayscale image and divided into image blocks by a grid, and the gradient direction histogram of the feature points in each block is calculated to obtain the gradient vectors of the feature points. The gradient vector weight of the feature points in each image block is then judged: a block whose gradient vector weight does not reach the set threshold is called an invalid image block, and a block whose gradient vector weight reaches the set threshold is called a valid image block. A vocabulary tree model is constructed from the valid image blocks in the current frame and the gradient vectors of the corresponding feature points, and the content information of the vocabulary tree is continuously iterated and updated. The similarity between two frames of images is measured by searching the updated vocabulary tree model, so that better image retrieval and matching results are obtained and more accurate loop detection is realized. In this way, the error accumulation and camera trajectory drift that arise during long-term camera localization and three-dimensional map construction are alleviated.

Description

Visual SLAM loop detection improvement method based on image block region feature points
Technical Field
The invention belongs to the technical field of image processing and visual simultaneous localization and mapping (SLAM), and particularly relates to an improved visual SLAM loop detection method based on image block region feature points.
Background Art
During long-term operation of a visual SLAM system, the visual sensor is easily affected by environmental noise, which causes errors to accumulate continuously; this accumulation of errors eventually leads to serious distortions in the localization and mapping results. To solve this key problem, a loop detection module is added to the visual SLAM framework: the camera acquires environment information and judges whether it has returned to a previously visited position. An accurate loop detection module can therefore provide a reliable basis for the robustness of the visual SLAM system, and is of great significance for the reasonable optimization of the camera's overall motion trajectory and of the map.
In a visual SLAM system, the loop detection module plays a key role in camera localization and map construction. According to the implementation principle, loop detection methods can be broadly classified into loop detection algorithms based on geometric information and loop detection methods based on appearance information. Loop detection based on appearance information mainly performs loop detection through image matching; it relies on the color and texture changes of the images and is easily affected by ambient illumination. The loop detection algorithm based on geometric information judges whether the current position is near some past historical position; it is simple and easy to implement, and thanks to properties such as rotation invariance it can overcome the noise produced when the camera rotates. In addition, according to the geometric feature information, loop detection based on feature points can extract key points and descriptors from the image to help complete data association.
The feature point method using geometric information completes data association between image frames mainly by extracting local regions of the input image that are rich in texture information, and then computes the camera pose and the coordinates of the corresponding 3D space points. The feature point method based on geometric information is not easily affected by changes in camera pose. To retain the advantages of descriptors, the input image information is divided into image blocks, and point feature extraction and description are carried out in the regions where the image block features are salient. However, for ordinary global image feature points, descriptor extraction and matching are computationally expensive and easily affected by noise changes, so it is significant to alleviate the accumulation of camera pose errors by means of image block information.
Disclosure of Invention
Aiming at the above problems, the method extracts feature points in image block regions: the gradient direction histogram is calculated per image block, valid image blocks are judged using weights, and loop detection is performed through the valid image blocks, so that a great amount of time is prevented from being wasted in weak-texture regions whose matching features are not obvious.
Therefore, in order to overcome the error accumulation and camera trajectory drift caused during long-term camera localization and three-dimensional map construction, the method first converts the input image into a grayscale image, divides it into image blocks with a grid, and calculates the gradient direction histogram of the feature points in each image block to obtain the gradient vectors of the feature points. The gradient vector weight of the feature points in each image block is then judged: an image block whose gradient vector weight does not reach the set threshold is called an invalid image block, and an image block whose gradient vector weight reaches the set threshold is called a valid image block. Finally, a vocabulary tree model is constructed from the valid image blocks in the current frame and the gradient vectors of the corresponding feature points, and the content information of the vocabulary tree is continuously iterated and updated. In the loop detection stage, the similarity between two frames of images is measured by searching the updated vocabulary tree model, so that better image retrieval and matching results are obtained and more accurate loop detection is realized. On this basis, the present invention has been completed. The general technical scheme is as follows: the image is divided into blocks by a grid and the gradient histogram of each image block is calculated; feature extraction is performed on the valid image blocks of the current frame through the gradient direction histograms; a bag-of-words model and a vocabulary tree are constructed; and finally the similarity of two frames of images in the bag-of-words model is judged to complete loop detection.
The technical scheme provided by the invention is as follows:
an improved visual SLAM loop detection method based on image block region feature points comprises the following steps:
Step one, acquiring image data of a current frame, converting the image data of the current frame into a grayscale image, and dividing the grayscale image of the current frame into different sub-image blocks.
Step two, dividing the whole image into a grid according to the size of the input image of the current frame. Preferably, in order to limit the size of the image blocks while ensuring that the gradient direction histogram information of the feature points in each block is reflected, the grid size is adapted to the size of the input image; that is, for an image of size h × w, it is partitioned using a grid of h/10 × w/10.
Step three, extracting feature points on the divided grayscale image blocks and calculating the gradient direction histogram, with the gradient magnitude set to p(x, y) and the gradient direction set to θ(x, y). The gradient vector of the feature points is as follows:
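The formula referenced above appears only as an image in the original publication and is not reproduced in this text. A standard definition consistent with the gradient magnitude p(x, y) and gradient direction θ(x, y) used for gradient direction histograms, given as an assumption rather than the patent's verbatim formula, is:

    p(x, y) = \sqrt{\bigl(I(x+1, y) - I(x-1, y)\bigr)^{2} + \bigl(I(x, y+1) - I(x, y-1)\bigr)^{2}}

    \theta(x, y) = \arctan\frac{I(x, y+1) - I(x, y-1)}{I(x+1, y) - I(x-1, y)}

where I(x, y) denotes the grayscale value at pixel (x, y).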
and fourthly, obtaining h/10 Xw/10 image blocks through grid division, calculating gradient direction histograms of the feature points in each image block, and counting gradient vectors of the feature points in each image block.
Step five, through the calculation of the gradient vectors of the feature points in the image block, a weight set W is used to represent the total weight value of the feature points in the image block, and the gradient direction histogram vector set W = {w_1, w_2, ..., w_n} is set, with mean weight w_m.
Step six, calculating the weights of the feature point gradient vectors in any image block W_i in the current frame and in the image blocks adjacent to it (the upper, lower, left and right adjacent image blocks). If the gradient vector weight W_i reaches the mean weight w_m, the block is regarded as a valid image block; otherwise it is an invalid image block.
Step seven, starting the search from the valid image blocks, searching for valid image blocks whose weight can meet the mean weight w_m and adding their weight values to the weight set; for each newly added valid weight value, the mean weight w_m is updated, until no new valid image block can be added.
Step eight, setting the corresponding image sequence and image block set numbers contained in the current frame, the image blocks and the feature points, and then normalizing the gradient direction histogram vectors of the valid image blocks in the current frame into unit vectors to obtain the feature vectors of the valid image blocks.
Step nine, adding the valid image block information of the current frame into the bag-of-words model to construct the vocabulary tree.
Step ten, when the next frame image arrives, repeating the above steps, judging whether the similarity between the current valid image blocks and the image frames in the bag-of-words model reaches the weight threshold, and if the condition is met, the loop detection is considered valid.
Preferably, in step seven, starting from the valid image blocks, the valid image blocks that can satisfy the mean weight w_m are searched for; the mean weight w_m serves as a dynamic weight, and whenever another valid image block is added, the mean weight w_m is dynamically updated.
Preferably, in step eight, the feature vector of a valid image block contains the magnitude and direction of the features, serves as the descriptor of the region features, and is an important factor for loop detection.
Preferably, in step ten, the weight threshold is a set weight parameter; when the similarity between the two frames of images is greater than or equal to the set weight parameter, the loop detection is considered valid.
The invention has the beneficial effects that:
(1) The invention extracts and describes feature points only in the valid image blocks, which prevents the algorithm from wasting a large amount of time on invalid image blocks whose features are not salient and greatly improves the real-time efficiency of the algorithm;
(2) The similarity between images is measured through the bag-of-words model, and the vocabulary tree in the bag-of-words model is continuously updated as the algorithm iterates, so that loop detection can be realized accurately and efficiently;
(3) In addition, the invention can also be applied to loop detection in fields such as disinfection and epidemic-prevention robots, warehouse logistics robots, unmanned autonomous driving, AR/VR, and military rescue.
Drawings
In order to present the technical solutions of the embodiments of the present invention more clearly, the drawings required in the description of the embodiments are briefly described below.
FIG. 1 is a block diagram of an improved method for visual SLAM loop detection based on image block region feature points;
FIG. 2 is a schematic illustration of meshing an input image;
FIG. 3 is a schematic diagram of gradient locations and direction vectors of feature points in an image block;
FIG. 4 is a schematic diagram of valid image blocks and invalid image blocks;
FIG. 5 shows the weight values of the valid image blocks of each frame image;
FIG. 6 is a schematic diagram of a vocabulary tree structure in a bag of words model.
Detailed Description
In order to further understand and appreciate the method of the present invention, the technical solution of the embodiments of the present invention is described below with reference to the accompanying drawings.
Example 1
Referring to fig. 1, the invention provides an improved visual SLAM loop detection method based on image block region feature points, comprising the following steps:
Preparation: as shown in the schematic diagram of FIG. 2, when the camera collects one frame of picture data, the image data is in RGB format, and the RGB image is converted into a grayscale image. Each pixel in the grayscale image has a corresponding value in the range 0-255, where 0 represents black and 255 represents white; the pixel value at a specific location can be obtained by indexing the row and column coordinates of the pixel grid.
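A minimal sketch of this preparation step, assuming OpenCV and NumPy are used (the patent does not prescribe a library, and the random frame below is only a stand-in for real camera data):

    import cv2
    import numpy as np

    # Stand-in for one frame of RGB picture data from the camera (OpenCV uses BGR order);
    # a real frame would come from cv2.VideoCapture or cv2.imread instead.
    frame_bgr = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)

    # Convert to a single-channel grayscale image with pixel values in the range 0-255.
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)

    # The pixel value at a specific location is read by indexing the row (y) and column (x).
    y, x = 120, 240
    print(gray.shape, gray.dtype, int(gray[y, x]))   # (480, 640) uint8 <value in 0-255>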
Referring to fig. 2, step one, obtaining image data of a current frame, converting the image data of the current frame into a gray scale image, and dividing the gray scale image of the current frame into different sub-image blocks.
Step two, dividing the whole image into a grid according to the size of the input image of the current frame. In order to limit the size of the image blocks and ensure that the gradient direction histogram information of the feature points in each block is reflected, the grid size is adapted to the size of the input image; that is, for an image of size h × w, it is partitioned using a grid of h/10 × w/10.
Specifically, the input image size is 600 × 800, and it is divided using a 60 × 80 mesh.
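A minimal NumPy sketch of the grid division, under one reading of the text in which h/10 × w/10 is the block size, so that the 600 × 800 example with a 60 × 80 mesh yields a 10 × 10 grid of blocks; the text could also be read as a 60 × 80 grid of 10 × 10-pixel blocks, so treat this only as an illustration:

    import numpy as np

    def divide_into_blocks(gray: np.ndarray, grid: int = 10):
        """Split a grayscale image into a grid x grid arrangement of sub-image blocks.

        Leftover border pixels (when h or w is not divisible by the grid count) are
        cropped; the patent does not specify border handling, so that is an assumption.
        """
        h, w = gray.shape
        bh, bw = h // grid, w // grid                  # block size, e.g. 60 x 80 pixels
        blocks = (gray[:grid * bh, :grid * bw]
                  .reshape(grid, bh, grid, bw)
                  .swapaxes(1, 2))                     # shape: (grid, grid, bh, bw)
        return blocks

    gray = np.random.randint(0, 256, (600, 800), dtype=np.uint8)   # stand-in image
    print(divide_into_blocks(gray).shape)                          # (10, 10, 60, 80)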
Referring to FIG. 3, step three, feature point extraction is performed on the divided grayscale image blocks and the gradient direction histogram is calculated, with the gradient magnitude set to p(x, y) and the gradient direction set to θ(x, y). The gradient vector of the feature points is as follows:
specifically, the green area is a divided image block, and the gradient direction histogram is calculated for the feature points in the image block, including the position and direction of the feature points, so as to obtain gradient vectors of the feature points.
Step four, obtaining the h/10 × w/10 image blocks through grid division, calculating the gradient direction histogram of the feature points in each image block, and counting the gradient vectors of the feature points in each image block.
Referring to FIG. 4, step five, through the calculation of the gradient vectors of the feature points in the image block, a weight set W is used to represent the total weight value of the feature points in the image block, and the gradient direction histogram vector set W = {w_1, w_2, ..., w_n} is set, with mean weight w_m.
Step six, calculating the weights of the feature point gradient vectors in any image block W_i in the current frame and in the image blocks adjacent to it (the upper, lower, left and right adjacent image blocks). If the gradient vector weight W_i reaches the mean weight w_m, the block is regarded as a valid image block; otherwise it is an invalid image block.
Specifically, the green regions are valid image blocks because their weight W reaches the mean weight w_m, while the blue regions do not reach the mean weight and are regarded as invalid image blocks.
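A minimal sketch of the weight computation and valid/invalid classification, assuming the block weight w_i is the sum of its gradient direction histogram (the patent only states that the weight is derived from the feature point gradient vectors):

    import numpy as np

    def classify_blocks(histograms: np.ndarray):
        """Given one gradient direction histogram per block (shape: n_blocks x n_bins),
        take the sum of each histogram as the block weight w_i, collect the weight set
        W = {w_1, ..., w_n}, compute the mean weight w_m, and mark blocks whose weight
        reaches w_m as valid."""
        weights = histograms.sum(axis=1)          # w_i for each block
        w_m = weights.mean()                      # mean weight w_m
        valid = weights >= w_m                    # True = valid block, False = invalid block
        return weights, w_m, valid

    histograms = np.abs(np.random.randn(100, 8))  # stand-in: 100 blocks, 8-bin histograms
    weights, w_m, valid = classify_blocks(histograms)
    print(round(float(w_m), 3), int(valid.sum()), "valid blocks out of", len(valid))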
Referring to FIG. 5, step seven, starting the search from the valid image blocks, valid image blocks whose weight meets the mean weight w_m are searched for and their weight values are added to the weight set; for each newly added valid weight value, the mean weight w_m is updated, until no new valid image block can be added.
Step eight, setting the corresponding image sequence and image block set numbers contained in the current frame, the image blocks and the feature points, and then normalizing the gradient direction histogram vectors of the valid image blocks in the current frame into unit vectors to obtain the feature vectors of the valid image blocks.
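A brief sketch of the unit-vector normalization of step eight (the bin count and values below are illustrative only):

    import numpy as np

    def block_descriptor(hist: np.ndarray):
        """Normalize the gradient direction histogram vector of a valid image block
        to a unit vector, which then serves as the block's feature vector."""
        norm = np.linalg.norm(hist)
        return hist / norm if norm > 0 else hist       # guard against an all-zero histogram

    hist = np.array([3.0, 1.0, 0.5, 2.0, 0.0, 4.0, 1.5, 0.5])   # stand-in 8-bin histogram
    desc = block_descriptor(hist)
    print(desc, np.linalg.norm(desc))                            # unit-length feature vector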
Referring to FIG. 6, step nine, adding the valid image block information of the current frame into the bag-of-words model to construct the vocabulary tree.
Specifically, the image frames acquired by the camera continuously construct and update the node information of the vocabulary tree, and the node information of the vocabulary tree is constructed and updated from the features of the valid image blocks.
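A minimal bag-of-words sketch, assuming scikit-learn's k-means for the visual words. The patent builds a multi-level vocabulary tree; for brevity this example uses a flat, one-level vocabulary, and a real vocabulary tree would apply k-means recursively at each node (as in DBoW-style implementations):

    import numpy as np
    from sklearn.cluster import KMeans

    class FlatVocabulary:
        """Flat (one-level) stand-in for the vocabulary tree built from the
        valid-block feature vectors."""

        def __init__(self, n_words: int = 50):
            self.n_words = n_words
            self.kmeans = None

        def build(self, descriptors: np.ndarray):
            # descriptors: one row per valid image block (unit-normalized histograms)
            self.kmeans = KMeans(n_clusters=self.n_words, n_init=10, random_state=0)
            self.kmeans.fit(descriptors)

        def frame_vector(self, descriptors: np.ndarray) -> np.ndarray:
            # Represent a frame by the normalized histogram of its visual words.
            words = self.kmeans.predict(descriptors)
            hist = np.bincount(words, minlength=self.n_words).astype(float)
            return hist / (np.linalg.norm(hist) + 1e-12)

    # Stand-in data: 500 valid-block descriptors with 8 bins each.
    rng = np.random.default_rng(0)
    vocab = FlatVocabulary(n_words=20)
    vocab.build(rng.random((500, 8)))
    print(vocab.frame_vector(rng.random((80, 8))).shape)    # (20,)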
Step ten, when the next frame image arrives, the above steps are repeated, and whether the similarity between the current valid image blocks and the image frames in the bag-of-words model reaches the weight threshold is judged; if the condition is met, the loop detection is considered valid.
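A minimal sketch of the similarity judgement of step ten, assuming cosine similarity between bag-of-words frame vectors and an illustrative weight threshold of 0.8 (the patent only states that a set weight parameter is used):

    import numpy as np

    def detect_loop(current_vec: np.ndarray, past_vecs: list, weight_threshold: float = 0.8):
        """Score the current frame vector against past frame vectors and report a loop
        when the best score reaches the set weight threshold."""
        if not past_vecs:
            return False, -1, 0.0
        scores = np.array([float(np.dot(current_vec, v)) for v in past_vecs])  # unit vectors -> cosine
        best = int(np.argmax(scores))
        return bool(scores[best] >= weight_threshold), best, float(scores[best])

    # Usage sketch: past_vecs would hold the frame vectors accumulated so far.
    v1 = np.array([0.6, 0.8, 0.0])
    v2 = np.array([0.0, 0.6, 0.8])
    print(detect_loop(np.array([0.6, 0.8, 0.0]), [v1, v2]))   # (True, 0, 1.0)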
Example 2
The embodiment is implemented with a handheld Kinect RGB-D camera on the basis of the above technical scheme; the resolution is 640 × 480, and 1700 groups of data were collected in an indoor environment for model training and verification. Through a comparison experiment with the traditional ORB algorithm (Oriented FAST and Rotated BRIEF), the effectiveness and accuracy of the whole method are demonstrated by comparing the model training results with the ground-truth loop similarity scores. Because current program algorithms cannot, like a human brain, accurately judge whether two images are similar or were shot from the same place and at the same angle, perception deviation and perception variation can occur. Therefore, recall and precision are compared at the same time to evaluate the effectiveness of the model, calculated as follows:
Precision = number of correct loop image frames extracted by the algorithm / number of loop image frames extracted
Recall = number of correct loop image frames extracted by the algorithm / number of loop image frames in the sample
Precision describes the probability that the loops extracted by the algorithm are truly loops. Recall refers to the probability that real loops are correctly detected among all real loops. The comparison results are as follows:
table 1 algorithm comparison
Method Precision Recall
Orb(Baseline) 0.683 0.772
PatchUp(ours) 0.718 0.816
In Table 1, compared with the traditional ORB algorithm: because the ORB algorithm adopts binary descriptors, it is sensitive to indoor image noise and changes in environmental brightness, and its loop detection precision is lower than that of the PatchUp method based on image block region feature points proposed by the invention; the probability that a detected loop is a true loop is improved to 0.718, which fully demonstrates the accuracy of the method. In addition, the ORB algorithm is weaker at handling scale changes in indoor environment images and is not as efficient and stable as the proposed PatchUp method when the handheld RGB-D camera undergoes large rotational changes during feature point extraction; the recall of 0.816 fully demonstrates the better efficiency of the method of the invention.
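A small illustration of how the precision and recall figures in Table 1 are defined; the frame indices below are hypothetical and not the data of this embodiment:

    def precision_recall(detected_frames, true_loop_frames):
        """Compute the precision and recall defined above.
        detected_frames: loop image frames extracted by the algorithm.
        true_loop_frames: loop image frames actually present in the sample."""
        detected = set(detected_frames)
        truth = set(true_loop_frames)
        correct = len(detected & truth)                     # correct loop frames extracted
        precision = correct / len(detected) if detected else 0.0
        recall = correct / len(truth) if truth else 0.0
        return precision, recall

    # Toy usage with hypothetical frame indices:
    print(precision_recall([10, 42, 77, 90], [10, 42, 55, 90, 120]))   # (0.75, 0.6)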

Claims (9)

1. A visual SLAM loop detection improvement method based on image block region feature points, characterized by comprising the following steps:
step one, acquiring image data of a current frame, converting the image data of the current frame into a gray image, and dividing the gray image of the current frame into different sub-image blocks;
step two, dividing grids of the whole image according to the size of the input image of the current frame;
step three, extracting feature points on the divided grayscale image blocks and calculating a gradient direction histogram, with the gradient magnitude of the gradient direction histogram set to p(x, y) and the gradient direction set to θ(x, y);
step four, obtaining h/10 × w/10 image blocks through grid division, calculating the gradient direction histogram of the feature points in each image block, and counting the gradient vectors of the feature points in each image block;
step five, through the calculation of the gradient vectors of the feature points in the image block, using a weight set W to represent the total weight value of the feature points in the image block, and setting the gradient direction histogram vector set W = {w_1, w_2, ..., w_n}, with the mean weight being w_m;
step six, calculating the weights of the feature point gradient vectors in any image block W_i in the current frame and in its upper, lower, left and right adjacent image blocks, as well as the positions of the image blocks; if the gradient vector weight W_i reaches the mean weight w_m, the block is regarded as a valid image block, otherwise it is an invalid image block;
step seven, starting from the valid image blocks, searching for valid image blocks whose weight can meet or exceed the mean weight w_m and adding their weight values to the weight set, and updating the mean weight w_m for each newly added valid weight value, until no new valid image block can be added;
step eight, setting the corresponding image sequence and image block set numbers contained in the current frame, the image blocks and the feature points, and then normalizing the gradient direction histogram vectors of the valid image blocks in the current frame into unit vectors to obtain the feature vectors of the valid image blocks;
step nine, adding the valid image block information of the current frame into the bag-of-words model to construct the vocabulary tree;
and step ten, when the next frame image arrives, repeating the above steps, judging whether the similarity between the current valid image blocks and the image frames in the bag-of-words model reaches the weight threshold, and if the condition is met, considering the loop detection valid.
2. The visual SLAM loop detection improvement method based on image block region feature points according to claim 1, wherein in step one and step three, the grayscale image is converted from an RGB image, the image data acquired by the camera being an RGB image, and the local feature information can be enhanced to a certain extent through graying.
3. The visual SLAM loop detection improvement method based on image block region feature points according to claim 1, wherein in step two, the grid size is adapted to the size of the input image, i.e. for an image of size h × w, a grid of h/10 × w/10 is used to divide it.
4. The visual SLAM loop detection improvement method based on image block region feature points according to claim 1, wherein in step three, the gradient vector of the feature points is as follows:
5. The visual SLAM loop detection improvement method based on image block region feature points according to claim 1, wherein in step seven, starting from the valid image blocks, valid image blocks that satisfy the mean weight w_m are searched for; the mean weight w_m serves as a dynamic weight, and when another valid image block is added, the mean weight w_m is dynamically updated.
6. The visual SLAM loop detection improvement method based on image block region feature points according to claim 1, wherein in step eight, the feature vector of the valid image block contains the magnitude and direction of the features, serves as the descriptor of the region features, and is an important factor for loop detection.
7. The visual SLAM loop detection improvement method based on image block region feature points according to claim 1, wherein in step nine, the bag-of-words model is composed of each valid image block extracted from the current frame image and its corresponding feature vector, and the vocabulary tree is composed of each corresponding frame image sequence and image block set number.
8. The visual SLAM loop detection improvement method based on image block region feature points according to claim 7, wherein in step nine, the node information of the vocabulary tree is continuously constructed and updated on the basis of the acquired image frames, and the node information of the vocabulary tree is constructed and updated as the features of the valid image blocks.
9. The visual SLAM loop detection improvement method based on image block region feature points according to claim 1, wherein in step ten, the weight threshold is a set weight parameter, and when the similarity between the two frames of images is greater than or equal to the set weight parameter, the loop detection is considered valid.
CN202310960132.8A 2023-08-01 2023-08-01 Visual SLAM loop detection improvement method based on image block region feature points Pending CN116958595A (en)

Priority Applications (1)

Application Number: CN202310960132.8A. Publication: CN116958595A (en). Priority Date: 2023-08-01. Filing Date: 2023-08-01. Title: Visual SLAM loop detection improvement method based on image block region feature points.

Publications (1)

Publication Number: CN116958595A (en). Publication Date: 2023-10-27.

Family

ID=88456428

Family Applications (1)

Application Number: CN202310960132.8A (Pending). Publication: CN116958595A (en). Priority Date: 2023-08-01. Filing Date: 2023-08-01. Title: Visual SLAM loop detection improvement method based on image block region feature points.

Country Status (1)

Country Link
CN (1) CN116958595A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117315274A (en) * 2023-11-28 2023-12-29 淄博纽氏达特机器人系统技术有限公司 Visual SLAM method based on self-adaptive feature extraction
CN117315274B (en) * 2023-11-28 2024-03-19 淄博纽氏达特机器人系统技术有限公司 Visual SLAM method based on self-adaptive feature extraction

Similar Documents

Publication Publication Date Title
Lee et al. Simultaneous traffic sign detection and boundary estimation using convolutional neural network
CN109241913B (en) Ship detection method and system combining significance detection and deep learning
CN111640157B (en) Checkerboard corner detection method based on neural network and application thereof
CN111179217A (en) Attention mechanism-based remote sensing image multi-scale target detection method
CN111862213A (en) Positioning method and device, electronic equipment and computer readable storage medium
CN110991444B (en) License plate recognition method and device for complex scene
CN110472625B (en) Chinese chess piece visual identification method based on Fourier descriptor
CN109101981B (en) Loop detection method based on global image stripe code in streetscape scene
CN112364931A (en) Low-sample target detection method based on meta-feature and weight adjustment and network model
CN116958595A (en) Visual SLAM loop detection improvement method based on image block region feature points
CN112734844B (en) Monocular 6D pose estimation method based on octahedron
CN114882222A (en) Improved YOLOv5 target detection model construction method and tea tender shoot identification and picking point positioning method
CN110659637A (en) Electric energy meter number and label automatic identification method combining deep neural network and SIFT features
CN111998862A (en) Dense binocular SLAM method based on BNN
CN111709317A (en) Pedestrian re-identification method based on multi-scale features under saliency model
Wu et al. Location recognition algorithm for vision-based industrial sorting robot via deep learning
CN113205023B (en) High-resolution image building extraction fine processing method based on prior vector guidance
Zhang et al. Scale-adaptive NN-based similarity for robust template matching
CN102324043B (en) Image matching method based on DCT (Discrete Cosine Transformation) through feature description operator and optimization space quantization
CN113762278A (en) Asphalt pavement damage identification method based on target detection
CN112784869A (en) Fine-grained image identification method based on attention perception and counterstudy
CN107291813B (en) Example searching method based on semantic segmentation scene
CN111368637A (en) Multi-mask convolution neural network-based object recognition method for transfer robot
CN111353509B (en) Key point extractor generation method of visual SLAM system
CN115187614A (en) Real-time simultaneous positioning and mapping method based on STDC semantic segmentation network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination