CN112115953B - Optimized ORB algorithm based on RGB-D camera combined plane detection and random sampling coincidence algorithm - Google Patents


Info

Publication number
CN112115953B
CN112115953B (application CN202010985540.5A)
Authority
CN
China
Prior art keywords
point
points
feature
image data
algorithm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010985540.5A
Other languages
Chinese (zh)
Other versions
CN112115953A (en)
Inventor
Cheng Ming (程明)
Si Yuchen (司雨晨)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Tech University
Original Assignee
Nanjing Tech University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Tech University filed Critical Nanjing Tech University
Priority to CN202010985540.5A priority Critical patent/CN112115953B/en
Publication of CN112115953A publication Critical patent/CN112115953A/en
Application granted granted Critical
Publication of CN112115953B publication Critical patent/CN112115953B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Abstract

The invention provides an optimized ORB algorithm based on an RGB-D camera combined with plane detection and a random sample consensus algorithm, which comprises the following steps: S1: acquire image data using an RGB-D camera, the image data comprising a color image and a depth image; S2: extract feature points of the image data using the ORB algorithm, and judge the distribution uniformity of the feature points using a feature point uniformity evaluation method; S3: for the image data parts where feature points are evenly distributed, generate a point cloud and downsample it; S4: perform plane detection and extraction on the down-sampled point cloud, and eliminate mismatches using the random sample consensus algorithm; S5: for the image data parts with uneven feature point distribution, use a set threshold value to extract feature points and eliminate overlapping feature points by a non-maximum suppression method. The invention reduces the computational load, improves the accuracy of feature point extraction and reduces mismatches, thereby meeting the accuracy and real-time requirements of mobile robots.

Description

Optimized ORB algorithm based on RGB-D camera combined plane detection and random sampling coincidence algorithm
Technical Field
The invention relates to an optimized ORB algorithm based on an RGB-D camera combined with plane detection and a random sample consensus algorithm, and belongs to the technical field of indoor mobile robot path planning and navigation.
Background
In recent years, intelligent mobile robot technology has developed rapidly and has been widely used in industry, the military, logistics, office and home services, and other fields. With the advent of RGB-D sensors, research on mobile robot positioning and SLAM using RGB-D sensors has advanced quickly. RGB-D sensors provide rich information, non-contact measurement, easy installation and use, and low cost, so they are widely applied in target identification, tracking and related fields. The first problem in robot navigation is how to build a model of the scene; over the last decade, many solutions have relied on two-dimensional sensors such as lasers and radars for map construction and robot pose estimation. With the advent of RGB-D cameras, more and more researchers have focused on using RGB-D cameras to build indoor environment models for robots, and many significant research results have been produced.
At present, visual SLAM has gradually become the mainstream positioning scheme. However, monocular SLAM cannot obtain the depth of a pixel from a single image; the depth must be estimated by triangulation or an inverse-depth method. Moreover, the depth estimated by monocular SLAM has scale uncertainty, and scale drift easily appears as positioning errors accumulate. Binocular SLAM obtains matching feature points by matching the images of the left and right cameras and then estimates the depth of the feature points by the parallax method. Binocular SLAM offers advantages such as a large measuring range, but its computational load is heavy and it places high demands on camera precision, usually requiring GPU acceleration to meet real-time requirements. The RGB-D camera is a new type of camera developed in recent years that can actively acquire the depth of image pixels through physical hardware. Compared with monocular and binocular cameras, an RGB-D camera does not need to consume large computing resources to calculate pixel depth; it can directly measure the surrounding environment and obstacles in three dimensions and generate a dense point cloud map through RGB-D SLAM technology, which facilitates subsequent navigation planning.
The feature point extraction and matching methods commonly used with RGB-D cameras include the ORB algorithm and the ICP algorithm. ORB (Oriented FAST and Rotated BRIEF) is an algorithm for fast feature point extraction and description. The ORB algorithm is divided into two parts: feature point extraction and feature point description. Feature extraction is developed from the FAST (Features from Accelerated Segment Test) algorithm, and the feature point description is improved from the BRIEF (Binary Robust Independent Elementary Features) descriptor. The ORB feature combines the FAST feature point detector with the BRIEF descriptor and improves and optimizes both. The most notable characteristic of the ORB algorithm is its high computation speed, which benefits first from using FAST to detect feature points and second from using the BRIEF algorithm to compute the descriptor: the binary-string representation of the descriptor not only saves storage space but also greatly shortens the matching time. The ICP algorithm was proposed by Besl and McKay in the 1992 article "A Method for Registration of 3-D Shapes". Its basic principle is to find nearest point pairs (p_i, q_i) between the target point cloud P and the source point cloud Q under certain constraints, and then compute the optimal matching parameters R and t that minimize an error function. This method can efficiently produce an optimal solution, but its large computational load makes it less practical for mobile robots and increases cost.
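For reference, the error function minimized by the ICP step described above can be written as below; this is the standard least-squares formulation implied by the text rather than a formula reproduced from the patent, with p_i taken from the target cloud P and q_i from the source cloud Q.

```latex
% ICP objective: find the rotation R and translation t that minimize the
% mean squared distance between matched pairs (p_i, q_i).
E(R, t) = \frac{1}{N} \sum_{i=1}^{N} \left\| p_i - \left( R\, q_i + t \right) \right\|^{2}
```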
Disclosure of Invention
In view of the above, the present invention aims to provide an optimized ORB algorithm based on an RGB-D camera combined with plane detection and a random sample consensus algorithm, which reduces the computational load, improves the accuracy of feature point extraction and reduces mismatches, so as to meet the accuracy and real-time requirements of mobile robots.
In order to solve the technical problems, the invention adopts the following technical scheme:
an optimized ORB algorithm based on an RGB-D camera combined with a plane detection and random sample consensus algorithm, the method comprising the steps of:
s1: acquiring image data using an RGB-D camera, the image data comprising a color image and a depth image;
s2: extracting feature points of the image data by using an ORB algorithm, and judging the distribution uniformity of the feature points by using a feature point uniformity evaluation method;
s3: for the image data part with the characteristic points evenly distributed, generating point clouds and downsampling the point clouds;
s4: performing plane detection and extraction on the down-sampled point cloud, and eliminating mismatches by using a random sample consensus algorithm;
s5: for the image data part with uneven feature point distribution, a set threshold value is used for feature point extraction and a non-maximum suppression method is used to eliminate overlapping feature points;
s6: for the point cloud after mismatches are eliminated in S4 and the feature points after overlapping feature points are eliminated in S5, projecting the points back onto the two-dimensional image plane, and reconstructing and equalizing the gray-level image.
Preferably, in S1, the RGB-D camera comprises a Kinect camera.
Preferably, S2 specifically comprises the following steps:
s21: judging whether a candidate point x is a feature point by using the Oriented FAST feature point extraction algorithm; when the point x is judged to be a feature point, calculating its main direction and naming it a key point, so that the key point has directivity;
the method for judging whether the point x is a feature point is as follows: a circle is drawn with the point x as the center, passing through n pixel points; it is judged whether, among the n pixel points arranged on the circle, there are at least m consecutive pixel points whose values are all greater than l_x + t or all smaller than l_x - t; if this requirement is met, the point x is judged to be a feature point;
wherein l_x represents the distance between the feature point x and a pixel point on the circle; l represents distance; t represents a threshold, i.e. the adjustment amount of the range; n = 16; 9 ≤ m ≤ 12;
s22: using the rBRIEF feature point description algorithm, Gaussian smoothing is applied to the image data; taking a key point as the center, pixel points are selected in its neighborhood to form n point pairs (x_i, y_i); the gray values I(x_i) and I(y_i) are compared, where I denotes the gray value: if I(x_i) > I(y_i) the bit takes 1, otherwise 0, generating an n-dimensional feature descriptor; the n point pairs (x_i, y_i) are defined as a 2×n matrix S,
S = [ x_1  x_2  …  x_n ;  y_1  y_2  …  y_n ]    (1)
S is rotated by θ:
S_θ = R_θ S    (2)
in formula (2), S_θ denotes the matrix obtained by rotating S by angle θ, where θ is the main-direction angle of the feature point;
the pixel point y is selected within a circle of k pixel points centered on the key point in the neighborhood, where 0 < k < n;
s23: the characteristic point distribution uniformity evaluation method is characterized in that image data are divided in different dividing modes, and an image data part with uniform characteristic point distribution and an image data part with nonuniform characteristic point distribution are obtained after the image data are divided.
Further preferably, the feature point distribution uniformity evaluation method is as follows: the image data is first divided into a plurality of sub-regions S_i; each sub-region S_i is then divided again into a plurality of secondary sub-regions S_ij, comprising the regions S_i1 to S_ij; whether the feature points in the region are uniformly distributed is evaluated according to the number of feature points in the secondary sub-regions S_ij; if the numbers of feature points in S_i1 to S_ij are similar, the distribution is uniform, where similarity is computed as follows: the variance of the feature point counts of the secondary sub-regions is calculated and used for the judgment; when the variance is smaller than 15, the feature points of sub-region S_i are judged to be uniformly distributed, otherwise the distribution is judged to be non-uniform.
Further preferably, the image data dividing method includes dividing into a center and a peripheral direction, dividing from top left to bottom right, or dividing from bottom left to top right.
Preferably, S3 specifically comprises the following steps:
s31: obtaining a point cloud by adopting the following formula (3) according to the color image and the depth image with uniformly distributed characteristic points,
Z = d / s,  X = (u - c_X) · Z / f_X,  Y = (v - c_Y) · Z / f_Y    (3)
formula (3) uses the pixel coordinate system o-u-v; c_X, c_Y, f_X, f_Y and s are camera intrinsic parameters; u, v are the pixel coordinates of the feature point; d is the depth of the feature point; the coordinates of the feature point are (X, Y, Z), and the points defined by these coordinates form the point cloud;
s32: processing is performed using a mesh filter, and a plane is extracted from the point cloud.
Preferably, S4 specifically comprises the following steps:
s41: the formula for plane extraction is as follows:
aX+bY+cZ+d=0 (4)
in the formula (4), a, b and c represent constants;
s42: a plane is extracted from the noisy point cloud data using the random sample consensus method, with the following extraction judgment condition: when the number of remaining points in the point cloud is larger than the fraction g of the total number of points, or the number of extracted planes is smaller than the threshold h, extraction is performed; after each extraction, plane extraction is performed again on the remaining points until the number of extracted planes reaches the threshold h or the number of remaining points falls below the fraction g of the total;
wherein 20% ≤ g ≤ 40% and 3 ≤ h ≤ 5.
Preferably, S5 specifically comprises the following steps:
s51: based on the oFAST algorithm of S2, the threshold t is changed to t', reducing the range between l_x + t' and l_x - t'; the range of t' is adjusted according to the feature point extraction result, and the changed value of t' is determined so that it is optimal;
s52: if a plurality of key points exist in the neighborhood of a certain key point, comparing the values J of the feature points, wherein the values J are defined as follows:
J = max( Σ(l_xy - l_x), Σ(l_x - l_xy) )    (5)
in formula (5), l_xy - l_x and l_x - l_xy represent the distances between the key point and the known feature points.
Preferably, in S6, the projection formula is:
u = f_X · X / Z + c_X,  v = f_Y · Y / Z + c_Y,  with d = s · Z    (6)
in formula (6), s is a scale factor; after projection, the gray-level image of each plane is reconstructed, and after gray-level histogram equalization is performed on it, the image becomes clearer and the noise introduced by depth is reduced.
The invention has the following beneficial effects: whether feature points are extracted from a single object or from multiple objects, the accuracy of the algorithm is significantly higher than that of the ORB algorithm or the ICP algorithm used alone; although parameters such as the camera calibration error have a small influence on the results, the accuracy of the algorithm is still far higher than that of the other algorithms. In addition, because the algorithm incorporates the ORB algorithm and adds the judgment of regional feature point distribution uniformity combined with plane extraction and random sample consensus, its running time is slightly longer than that of the ORB algorithm alone, but this does not affect the practicality of the algorithm. The ICP algorithm suffers from an excessive running time due to its large computational load, which impairs usability and increases the demands on mobile robot hardware. In other words, the algorithm of the invention sacrifices only a negligible amount of additional running time while greatly improving the accuracy of feature point extraction. Therefore, the invention improves the accuracy of feature point extraction while keeping the running time low, thereby improving the positioning precision and real-time performance of the robot.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 shows the freiburg1_desk picture before (left) and after (right) feature point extraction;
FIG. 3 shows the freiburg1_room picture before (left) and after (right) feature point extraction;
FIG. 4 shows the freiburg1_teddy picture before (left) and after (right) feature point extraction.
Detailed Description
The optimized ORB algorithm of the invention, based on an RGB-D camera combined with plane detection and a random sample consensus algorithm, is further described in detail below with reference to the accompanying drawings and specific embodiments.
Example 1
As shown in fig. 1, an optimized ORB algorithm based on an RGB-D camera combined with a plane detection and random sample consensus algorithm, the method comprising the steps of:
s1: acquiring image data using an RGB-D camera, the image data comprising a color image and a depth image;
s2: extracting feature points of the image data by using an ORB algorithm, and judging the distribution uniformity of the feature points by using a feature point uniformity evaluation method;
s3: for the image data part with the characteristic points evenly distributed, generating point clouds and downsampling the point clouds;
s4: performing plane detection and extraction on the down-sampled point cloud, and eliminating mismatches by using a random sample consensus algorithm;
s5: for the image data part with uneven feature point distribution, a set threshold value is used for feature point extraction and a non-maximum suppression method is used to eliminate overlapping feature points;
s6: for the point cloud after mismatches are eliminated in S4 and the feature points after overlapping feature points are eliminated in S5, projecting the points back onto the two-dimensional image plane, and reconstructing and equalizing the gray-level image.
Preferably, in S1, the RGB-D camera comprises a Kinect camera.
Preferably, S2 specifically comprises the following steps:
s21: judging whether a candidate point x is a feature point by using the Oriented FAST feature point extraction algorithm (i.e. the oFAST algorithm); when the point x is judged to be a feature point, calculating its main direction and naming it a key point (i.e. a detector), so that the key point has directivity;
the method for judging whether the point x is a feature point is as follows: a circle is drawn with the point x as the center, passing through 16 pixel points; it is judged whether, among the 16 pixel points arranged on the circle, there are at least 12 consecutive pixel points whose distances to the point x are all greater than l_x + t or all smaller than l_x - t; if this requirement is met, the point x is judged to be a feature point;
wherein l_x represents the distance between the point x and the 12 consecutive pixel points on the circle; l represents distance; t represents a threshold, i.e. the adjustment amount of the range; a properly set t value screens out pixel points whose distance is close to that of the point x; if t is not set or is too small, too many pixel points satisfy the judgment condition, so that too many feature points are produced; similarly, if t is too large, too few feature points are produced;
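As a concrete illustration of the segment test in S21, the following NumPy sketch checks whether a candidate pixel has at least m consecutive circle pixels all above center + t or all below center - t; the circle offsets, n = 16 and m = 12 follow this embodiment, while the function itself and the default t = 20 are illustrative assumptions rather than the patented implementation.

```python
import numpy as np

# Offsets of the 16 pixels on a radius-3 circle around the candidate pixel.
CIRCLE_16 = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
             (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]

def is_fast_corner(gray, r, c, t=20, m=12):
    """Segment test of S21: (r, c) is a feature point if at least m consecutive
    circle pixels are all > center + t or all < center - t.
    The pixel must lie at least 3 pixels away from the image border."""
    center = int(gray[r, c])
    ring = np.array([int(gray[r + dr, c + dc]) for dc, dr in CIRCLE_16])
    for flags in (ring > center + t, ring < center - t):
        wrapped = np.concatenate([flags, flags])   # allow runs that wrap around
        run = best = 0
        for f in wrapped:
            run = run + 1 if f else 0
            best = max(best, run)
        if min(best, 16) >= m:
            return True
    return False
```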
s22: using the rBRIEF feature point description algorithm, Gaussian smoothing is applied to the image data; taking a key point as the center, pixel points are selected in its neighborhood to form 6 randomly selected point pairs (x_i, y_i); the gray values I(x_i) and I(y_i) are compared, where I denotes the gray value: if I(x_i) > I(y_i) the bit takes 1, otherwise 0, generating a 6-dimensional feature descriptor; the 6 point pairs (x_i, y_i) are defined as a 2×6 matrix S,
S = [ x_1  x_2  …  x_6 ;  y_1  y_2  …  y_6 ]    (1)
S is rotated by θ:
S_θ = R_θ S    (2)
in formula (2), S_θ denotes the matrix obtained by rotating S by angle θ, where θ is the main-direction angle of the feature point;
the pixel point y is selected within a circle of k pixel points centered on the key point in the neighborhood, where 0 < k < n;
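The rotation of the sampling matrix S in formulas (1) and (2) can be sketched as follows; the 6 sampling pairs are generated randomly here for illustration, the angle theta would normally be the oFAST main direction of the key point, and the function is an illustrative reconstruction rather than the exact rBRIEF implementation of the patent.

```python
import numpy as np

def steered_brief(gray, kp_r, kp_c, theta, pairs):
    """Rotate every point pair of the 2 x n sampling matrix S by theta
    (S_theta = R_theta S, formula (2)) and compare gray values:
    bit = 1 if I(x_i) > I(y_i), otherwise 0 (step S22)."""
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    bits = []
    for (x1, y1), (x2, y2) in pairs:                       # columns of S
        rx1, ry1 = R @ np.array([x1, y1], dtype=float)
        rx2, ry2 = R @ np.array([x2, y2], dtype=float)
        p = gray[int(round(kp_r + ry1)), int(round(kp_c + rx1))]
        q = gray[int(round(kp_r + ry2)), int(round(kp_c + rx2))]
        bits.append(1 if p > q else 0)
    return bits

# Example: 6 random point pairs inside a radius-15 patch (n = 6 in this embodiment).
rng = np.random.default_rng(0)
pairs = [(tuple(rng.integers(-15, 16, 2)), tuple(rng.integers(-15, 16, 2)))
         for _ in range(6)]
```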
s23: the characteristic point distribution uniformity evaluation method is characterized in that image data are divided in different dividing modes, and an image data part with uniform characteristic point distribution and an image data part with nonuniform characteristic point distribution are obtained after the image data are divided.
Further preferably, the feature point distribution uniformity evaluation method is as follows: the image data is first divided into a plurality of sub-regions S_i; each sub-region S_i is then divided again into secondary sub-regions S_ij, comprising the regions S_i1 to S_ij; whether the feature points in the region are uniformly distributed is evaluated according to the number of feature points in the secondary sub-regions S_ij; if the numbers of feature points in S_i1 to S_ij are similar, the distribution is uniform, where similarity is computed as follows: the variance of the feature point counts of the secondary sub-regions is calculated and used for the judgment; when the variance is smaller than 15, the feature points of sub-region S_i are judged to be uniformly distributed, otherwise the distribution is judged to be non-uniform.
Further preferably, the image data dividing method includes dividing into a center and a peripheral direction, dividing from top left to bottom right, or dividing from bottom left to top right.
In the embodiment, the characteristic point distribution condition of each region is judged through image segmentation, so that different algorithms are applied according to different distribution conditions.
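A minimal sketch of this uniformity evaluation: a sub-region S_i is split into secondary sub-regions S_ij, the key points falling into each cell are counted, and the variance of the counts is compared with the threshold of 15 given above. The 2 x 2 grid and the example points are illustrative assumptions.

```python
import numpy as np

def region_is_uniform(keypoints, region, grid=(2, 2), var_threshold=15.0):
    """keypoints: iterable of (row, col); region: (r0, c0, r1, c1) sub-region S_i.
    Count the key points in each secondary sub-region S_ij and judge the
    distribution uniform when the variance of the counts is below 15."""
    r0, c0, r1, c1 = region
    rows, cols = grid
    counts = np.zeros((rows, cols))
    cell_h, cell_w = (r1 - r0) / rows, (c1 - c0) / cols
    for r, c in keypoints:
        if r0 <= r < r1 and c0 <= c < c1:
            i = min(int((r - r0) / cell_h), rows - 1)
            j = min(int((c - c0) / cell_w), cols - 1)
            counts[i, j] += 1
    return counts.var() < var_threshold

# Example: key points spread fairly evenly over a 100 x 100 sub-region.
pts = [(r, c) for r in range(5, 100, 20) for c in range(5, 100, 13)]
print(region_is_uniform(pts, (0, 0, 100, 100)))   # True: the region is uniform
```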
Preferably, S3 specifically comprises the following steps:
s31: obtaining a point cloud by adopting the following formula (3) according to the color image and the depth image with uniformly distributed characteristic points,
Z = d / s,  X = (u - c_X) · Z / f_X,  Y = (v - c_Y) · Z / f_Y    (3)
formula (3) uses the pixel coordinate system o-u-v; c_X, c_Y, f_X, f_Y and s are camera intrinsic parameters; u, v are the pixel coordinates of the feature point; d is the depth of the feature point; the coordinates of the feature point are (X, Y, Z), and the points defined by these coordinates form the point cloud;
s32: a mesh filter is used for processing, a plane is extracted from the point cloud, and a z-direction interval filter is used to filter out points that are too far away. Points that are too far away are those that are too distant from the other points; if they are not filtered out, a plane containing only a single point may be extracted, which increases the number of non-feature points.
In this embodiment, color information (i.e., gray scale) is obtained by a color image, and distance information is obtained by a depth image, so that 3D camera coordinates of pixels can be calculated, and a point cloud is generated. The present embodiment generates a point cloud from an RGB image and a depth image.
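Steps S31 and S32 can be sketched with plain NumPy as follows; the depth scale s = 1000 (depth stored in millimetres), the 2 cm voxel size and the 4 m z-cutoff are illustrative assumptions, and the simple voxel grid stands in for the mesh filter and z-direction interval filter mentioned above.

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy, s=1000.0):
    """Back-project a depth image into camera coordinates (formula (3)):
    Z = d / s, X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy."""
    v, u = np.indices(depth.shape)
    Z = depth / s
    X = (u - cx) * Z / fx
    Y = (v - cy) * Z / fy
    points = np.stack([X, Y, Z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]          # drop pixels with no depth reading

def downsample(points, voxel=0.02, z_max=4.0):
    """Keep one point per voxel and drop points beyond z_max along the z axis,
    mimicking the grid filter and z-direction interval filter of S32."""
    points = points[points[:, 2] < z_max]
    keys = np.floor(points / voxel).astype(np.int64)
    _, idx = np.unique(keys, axis=0, return_index=True)
    return points[np.sort(idx)]
```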
Preferably, S4 specifically comprises the following steps:
s41: the formula for plane extraction is as follows:
aX+bY+cZ+d=0 (4)
in the formula (4), a, b and c represent constants;
s42: a plane is extracted from the noisy point cloud data using the random sample consensus method, with the following extraction judgment condition: extraction is performed when the number of remaining points in the point cloud is greater than 30% of the total number or fewer than 3 planes have been extracted; after each extraction, plane extraction is performed again on the remaining points until the number of extracted planes reaches 3 or the number of remaining points falls below 30% of the total.
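The iterative plane extraction of S41/S42 can be sketched as follows; the single-plane RANSAC inside the loop is a generic reconstruction (the distance threshold and iteration count are assumed values), while the outer loop stops once 3 planes have been extracted or fewer than 30% of the points remain, as stated above.

```python
import numpy as np

def fit_plane_ransac(points, dist_thresh=0.02, iters=200, seed=0):
    """Fit aX + bY + cZ + d = 0 (formula (4)) by random sampling: pick 3 points,
    build the plane, count inliers, keep the plane with the most inliers."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(iters):
        p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(p1 - p0, p2 - p0)
        if np.linalg.norm(n) < 1e-9:
            continue                                  # degenerate sample
        n = n / np.linalg.norm(n)
        inliers = np.abs(points @ n - n.dot(p0)) < dist_thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return best_inliers

def extract_planes(points, g=0.30, h=3):
    """Repeat plane extraction on the remaining points until h (3) planes have
    been extracted or fewer than g (30%) of the original points remain (S42)."""
    total, planes = len(points), []
    while len(points) > g * total and len(planes) < h and len(points) >= 3:
        inliers = fit_plane_ransac(points)
        if inliers.sum() < 3:
            break
        planes.append(points[inliers])
        points = points[~inliers]                     # continue on the remainder
    return planes
```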
Preferably, S5 specifically comprises the following steps:
s51: based on the oFAST algorithm of S2, the threshold t is changed to t', reducing the range between l_x + t' and l_x - t'; the range of t' is adjusted according to the feature point extraction result, and the changed value of t' is determined so that it is optimal;
s52: if a plurality of key points exist in the neighborhood of a certain key point, comparing the values J of the feature points, wherein the values J are defined as follows:
J = max( Σ(l_xy - l_x), Σ(l_x - l_xy) )    (5)
in formula (5), l_xy - l_x and l_x - l_xy represent the distances between the key point and the known feature points.
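Formula (5) appears only as an image in the source, so the score J below follows the usual FAST corner score built from the l_xy - l_x and l_x - l_xy differences named in the text; the suppression radius of 5 pixels is an assumption, and CIRCLE_16 refers to the circle offsets from the earlier sketch.

```python
import numpy as np

def corner_score(gray, r, c, circle):
    """Score J of a key point: the larger of the summed positive differences
    l_xy - l_x and l_x - l_xy over the circle pixels (reconstruction of (5))."""
    center = int(gray[r, c])
    ring = np.array([int(gray[r + dr, c + dc]) for dc, dr in circle])
    return max(np.sum(np.clip(ring - center, 0, None)),
               np.sum(np.clip(center - ring, 0, None)))

def non_max_suppress(keypoints, scores, radius=5):
    """S52: when several key points fall in the same neighbourhood, keep only
    the one with the largest score J and discard the overlapping ones."""
    order = np.argsort(scores)[::-1]                  # strongest first
    kept = []
    for i in order:
        r, c = keypoints[i]
        if all((r - kr) ** 2 + (c - kc) ** 2 > radius ** 2 for kr, kc in kept):
            kept.append((r, c))
    return kept
```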
Preferably, in S6, the projection formula is:
u = f_X · X / Z + c_X,  v = f_Y · Y / Z + c_Y,  with d = s · Z    (6)
in formula (6), s is a scale factor that can be selected according to the actual situation; after projection, the gray-level image of each plane is reconstructed, and after gray-level histogram equalization is performed on it, the image becomes clearer and the noise introduced by depth is reduced.
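Step S6 can be sketched as below using the forward projection of formula (6) and OpenCV's histogram equalization (cv2.equalizeHist); the relation d = s · Z between the stored depth and Z, together with the gray value carried by each 3D point, are assumptions made for this illustration.

```python
import numpy as np
import cv2  # OpenCV, assumed available for gray-level histogram equalization

def reproject_and_equalize(points, gray_values, fx, fy, cx, cy, shape):
    """Project 3D points back onto the image plane (formula (6)):
    u = fx * X / Z + cx, v = fy * Y / Z + cy, rebuild the gray-level image of
    the plane and equalize its histogram (step S6)."""
    img = np.zeros(shape, dtype=np.uint8)
    for (X, Y, Z), g in zip(points, gray_values):
        if Z <= 0:
            continue
        u = int(round(fx * X / Z + cx))
        v = int(round(fy * Y / Z + cy))
        if 0 <= v < shape[0] and 0 <= u < shape[1]:
            img[v, u] = g
    return cv2.equalizeHist(img)
```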
Tables 1 and 2 compare the feature point extraction accuracy and the running time of the algorithm of the invention with those of the other algorithms.
As shown in fig. 2 to 4, the present invention extracts the characteristic points of the desk, the room and the teddy bear, and the extracted characteristic points are distributed uniformly and have representativeness and accuracy, so that the subsequent reconstructed image is clearer.
Table 1. Feature point extraction accuracy on 3 pictures for the algorithm of the invention, the ORB algorithm and the ICP algorithm

Algorithm                      freiburg1_desk   freiburg1_room   freiburg1_teddy
ORB algorithm                  83.45%           76.09%           85.64%
ICP algorithm                  83.18%           83.56%           85.08%
Algorithm of the invention     93.44%           90.28%           97.21%
Table 2. Running time on 3 pictures for the algorithm of the invention, the ORB algorithm and the ICP algorithm

Algorithm                      freiburg1_desk   freiburg1_room   freiburg1_teddy
ORB algorithm                  0.7523           0.9735           0.6285
ICP algorithm                  1.0028           1.5236           0.9658
Algorithm of the invention     0.7886           1.0032           0.6422
As can be seen from Table 1, whether feature points are extracted from a single object or from multiple objects, the accuracy of the algorithm of the present invention is significantly higher than that of the ORB algorithm or the ICP algorithm used alone; although parameters such as the camera calibration error slightly affect the results, the accuracy of the algorithm of the present invention remains significantly higher than that of the other algorithms.
As can be seen from Table 2, because the algorithm of the invention incorporates the ORB algorithm and inserts the judgment of regional feature point distribution uniformity combined with the plane extraction and random sample consensus method, its running time is slightly longer than that of the ORB algorithm alone, but the practicality of the algorithm is not affected. The ICP algorithm suffers from an excessive running time due to its large computational load, which impairs usability and increases the demands on mobile robot hardware. In other words, the algorithm of the invention sacrifices only a negligible amount of additional running time while greatly improving the accuracy of feature point extraction.
Example 2
This embodiment differs from Embodiment 1 only in that: in S21, m = 9; in S42, extraction is performed when the remaining points exceed 40% of the total number or fewer than 5 planes have been extracted.
Example 3
This embodiment differs from Embodiment 1 only in that: in S21, m = 10; in S42, extraction is performed when the remaining points exceed 20% of the total number or fewer than 4 planes have been extracted.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
In the description of the present invention, it should be understood that the directions or positional relationships indicated by the terms "upper", "lower", "front", "rear", "left", "right", etc., are based on the directions or positional relationships shown in the drawings, are merely for convenience of describing the present invention and simplifying the description, and do not indicate or imply that the devices or elements referred to must have a specific orientation, be constructed and operated in a specific orientation, and thus should not be construed as limiting the present invention.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be construed as reflecting the intention that: i.e., the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
As used herein, unless otherwise specified the use of the ordinal terms "first," "second," "third," etc., to describe a general object merely denote different instances of like objects, and are not intended to imply that the objects so described must have a given order, either temporally, spatially, in ranking, or in any other manner.
While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of the above description, will appreciate that other embodiments are contemplated within the scope of the invention as described herein. Furthermore, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter. Accordingly, many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the appended claims. The disclosure of the present invention is intended to be illustrative, but not limiting, of the scope of the invention, which is defined by the appended claims.
The foregoing is only a preferred embodiment of the invention, it being noted that: it will be apparent to those skilled in the art that various modifications and adaptations can be made without departing from the principles of the present invention, and such modifications and adaptations are intended to be comprehended within the scope of the invention.

Claims (7)

1. A method for processing an image using an optimized ORB algorithm based on an RGB-D camera combined with a plane detection and random sample consensus algorithm, the method comprising the steps of:
s1: acquiring image data using an RGB-D camera, the image data comprising a color image and a depth image;
s2: extracting feature points of the image data by using an ORB algorithm, and judging the distribution uniformity of the feature points by using a feature point uniformity evaluation method;
s3: for the image data part with the characteristic points evenly distributed, generating point clouds and downsampling the point clouds;
s4: performing plane detection and extraction on the down-sampled point cloud, and eliminating mismatches by using a random sample consensus algorithm;
s5: for the image data part with uneven feature point distribution, a set threshold value is used for feature point extraction and a non-maximum suppression method is used to eliminate overlapping feature points;
s6: for the point cloud after mismatches are eliminated in S4 and the feature points after overlapping feature points are eliminated in S5, projecting the points back onto the two-dimensional image plane, and reconstructing and equalizing the gray-level image;
s2 specifically comprises the following steps:
s21: judging whether a candidate point x is a feature point by using the Oriented FAST feature point extraction algorithm; when the point x is judged to be a feature point, calculating its main direction and naming it a key point, so that the key point has directivity;
the method for judging whether the point x is a feature point is as follows: a circle is drawn with the point x as the center, passing through n pixel points; it is judged whether, among the n pixel points arranged on the circle, there are at least m consecutive pixel points whose values are all greater than l_x + t or all smaller than l_x - t; if this requirement is met, the point x is judged to be a feature point; wherein l_x represents the distance between the feature point x and a pixel point on the circle; l represents distance; t represents a threshold, i.e. the adjustment amount of the range; n = 16; 9 ≤ m ≤ 12;
s22: using the rBRIEF feature point description algorithm, Gaussian smoothing is applied to the image data; taking a key point as the center, pixel points are selected in its neighborhood to form n point pairs (x_i, y_i); the gray values I(x_i) and I(y_i) are compared, where I denotes the gray value: if I(x_i) > I(y_i) the bit takes 1, otherwise 0, generating an n-dimensional feature descriptor; the n point pairs (x_i, y_i) are defined as a 2×n matrix S,
S = [ x_1  x_2  …  x_n ;  y_1  y_2  …  y_n ]    (1)
S is rotated by θ:
S_θ = R_θ S    (2)
in formula (2), S_θ denotes the matrix obtained by rotating S by angle θ, where θ is the main-direction angle of the feature point;
the pixel point y is selected within a circle of k pixel points centered on the key point in the neighborhood, where 0 < k < n;
s23: the characteristic point distribution uniformity evaluation method is characterized in that image data are divided in different dividing modes, and an image data part with uniform characteristic point distribution and an image data part with nonuniform characteristic point distribution are obtained after the image data are divided;
the feature point distribution uniformity evaluation method is as follows: the image data is first divided into a plurality of sub-regions S_i; each sub-region S_i is then divided again into secondary sub-regions S_ij, comprising the regions S_i1 to S_ij; whether the feature points in the region are uniformly distributed is evaluated according to the number of feature points in the secondary sub-regions S_ij; if the numbers of feature points in S_i1 to S_ij are similar, the distribution is uniform, where similarity is computed as follows: the variance of the feature point counts of the secondary sub-regions is calculated and used for the judgment; when the variance is smaller than 15, the feature points of sub-region S_i are judged to be uniformly distributed, otherwise the distribution is judged to be non-uniform.
2. The processing method according to claim 1, wherein in S1, the RGB-D camera includes a Kinect camera.
3. The processing method according to claim 1, wherein the image data dividing method includes dividing into a center and a peripheral direction, dividing from top left to bottom right, or dividing from bottom left to top right.
4. A process according to claim 3, wherein S3 comprises the steps of:
s31: obtaining a point cloud by adopting the following formula (3) according to the color image and the depth image with uniformly distributed characteristic points,
Z = d / s,  X = (u - c_X) · Z / f_X,  Y = (v - c_Y) · Z / f_Y    (3)
formula (3) uses the pixel coordinate system o-u-v; c_X, c_Y, f_X, f_Y and s are camera intrinsic parameters; u, v are the pixel coordinates of the feature point; d is the depth of the feature point; the coordinates of the feature point are (X, Y, Z), and the points defined by these coordinates form the point cloud;
s32: processing is performed using a mesh filter, and a plane is extracted from the point cloud.
5. The method according to claim 4, wherein S4 comprises the steps of:
s41: the formula for plane extraction is as follows:
aX+bY+cZ+d=0 (4)
in the formula (4), a, b and c represent constants;
s42: a plane is extracted from the noisy point cloud data using the random sample consensus method, with the following extraction judgment condition: when the number of remaining points in the point cloud is larger than the fraction g of the total number of points, or the number of extracted planes is smaller than the threshold h, extraction is performed; after each extraction, plane extraction is performed again on the remaining points until the number of extracted planes reaches the threshold h or the number of remaining points falls below the fraction g of the total;
wherein 20% ≤ g ≤ 40% and 3 ≤ h ≤ 5.
6. The method according to claim 5, wherein S5 comprises the steps of:
s51: based on the oFAST algorithm of S2, the threshold t is changed to t', reducing the range between l_x + t' and l_x - t'; the range of t' is adjusted according to the feature point extraction result, and the changed value of t' is determined so that it is optimal;
s52: if a plurality of key points exist in the neighborhood of a certain key point, comparing the values J of the feature points, wherein the values J are defined as follows:
J = max( Σ(l_xy - l_x), Σ(l_x - l_xy) )    (5)
in formula (5), l_xy - l_x and l_x - l_xy represent the distances between the key point and the known feature points.
7. The processing method according to claim 1, wherein in S6, the projection formula is:
u = f_X · X / Z + c_X,  v = f_Y · Y / Z + c_Y,  with d = s · Z    (6)
in formula (6), s is a scale factor; after projection, the gray-level image of each plane is reconstructed, and after gray-level histogram equalization is performed on it, the image becomes clearer and the noise introduced by depth is reduced;
formula (6) uses the pixel coordinate system o-u-v; c_X, c_Y, f_X, f_Y and s are camera intrinsic parameters; u, v are the pixel coordinates of the feature point; d is the depth of the feature point; the coordinates of the feature point are (X, Y, Z), and the points defined by these coordinates form the point cloud.
CN202010985540.5A 2020-09-18 2020-09-18 Optimized ORB algorithm based on RGB-D camera combined plane detection and random sampling coincidence algorithm Active CN112115953B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010985540.5A CN112115953B (en) 2020-09-18 2020-09-18 Optimized ORB algorithm based on RGB-D camera combined plane detection and random sampling coincidence algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010985540.5A CN112115953B (en) 2020-09-18 2020-09-18 Optimized ORB algorithm based on RGB-D camera combined plane detection and random sampling coincidence algorithm

Publications (2)

Publication Number Publication Date
CN112115953A CN112115953A (en) 2020-12-22
CN112115953B (en) 2023-07-11

Family

ID=73800133

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010985540.5A Active CN112115953B (en) 2020-09-18 2020-09-18 Optimized ORB algorithm based on RGB-D camera combined plane detection and random sampling coincidence algorithm

Country Status (1)

Country Link
CN (1) CN112115953B (en)

Families Citing this family (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10678244B2 (en) 2017-03-23 2020-06-09 Tesla, Inc. Data synthesis for autonomous control systems
US11157441B2 (en) 2017-07-24 2021-10-26 Tesla, Inc. Computational array microprocessor system using non-consecutive data formatting
US10671349B2 (en) 2017-07-24 2020-06-02 Tesla, Inc. Accelerated mathematical engine
US11409692B2 (en) 2017-07-24 2022-08-09 Tesla, Inc. Vector computational unit
US11893393B2 (en) 2017-07-24 2024-02-06 Tesla, Inc. Computational array microprocessor system with hardware arbiter managing memory requests
US11561791B2 (en) 2018-02-01 2023-01-24 Tesla, Inc. Vector computational unit receiving data elements in parallel from a last row of a computational array
US11215999B2 (en) 2018-06-20 2022-01-04 Tesla, Inc. Data pipeline and deep learning system for autonomous driving
US11361457B2 (en) 2018-07-20 2022-06-14 Tesla, Inc. Annotation cross-labeling for autonomous control systems
US11636333B2 (en) 2018-07-26 2023-04-25 Tesla, Inc. Optimizing neural network structures for embedded systems
US11562231B2 (en) 2018-09-03 2023-01-24 Tesla, Inc. Neural networks for embedded devices
IL305330A (en) 2018-10-11 2023-10-01 Tesla Inc Systems and methods for training machine models with augmented data
US11196678B2 (en) 2018-10-25 2021-12-07 Tesla, Inc. QOS manager for system on a chip communications
US11816585B2 (en) 2018-12-03 2023-11-14 Tesla, Inc. Machine learning models operating at different frequencies for autonomous vehicles
US11537811B2 (en) 2018-12-04 2022-12-27 Tesla, Inc. Enhanced object detection for autonomous vehicles based on field view
US11610117B2 (en) 2018-12-27 2023-03-21 Tesla, Inc. System and method for adapting a neural network model on a hardware platform
US10997461B2 (en) 2019-02-01 2021-05-04 Tesla, Inc. Generating ground truth for machine learning from time series elements
US11567514B2 (en) 2019-02-11 2023-01-31 Tesla, Inc. Autonomous and user controlled vehicle summon to a target
US10956755B2 (en) 2019-02-19 2021-03-23 Tesla, Inc. Estimating object properties using visual image data
CN112783995B (en) * 2020-12-31 2022-06-03 杭州海康机器人技术有限公司 V-SLAM map checking method, device and equipment
CN112752028B (en) * 2021-01-06 2022-11-11 南方科技大学 Pose determination method, device and equipment of mobile platform and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107220995A (en) * 2017-04-21 2017-09-29 西安交通大学 A kind of improved method of the quick point cloud registration algorithms of ICP based on ORB characteristics of image
CN110414533A (en) * 2019-06-24 2019-11-05 东南大学 A kind of feature extracting and matching method for improving ORB

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108010036B (en) * 2017-11-21 2020-01-21 江南大学 Object symmetry axis detection method based on RGB-D camera

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107220995A (en) * 2017-04-21 2017-09-29 西安交通大学 A kind of improved method of the quick point cloud registration algorithms of ICP based on ORB characteristics of image
CN110414533A (en) * 2019-06-24 2019-11-05 东南大学 A kind of feature extracting and matching method for improving ORB

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Accumulative Errors Optimization for Visual Odometry of ORB-SLAM2 Based on RGB-D Cameras; Jiangying Qin et al.; ISPRS International Journal of Geo-Information; full text *

Also Published As

Publication number Publication date
CN112115953A (en) 2020-12-22

Similar Documents

Publication Publication Date Title
CN112115953B (en) Optimized ORB algorithm based on RGB-D camera combined plane detection and random sampling coincidence algorithm
CN110443836B (en) Point cloud data automatic registration method and device based on plane features
US8199977B2 (en) System and method for extraction of features from a 3-D point cloud
WO2018127007A1 (en) Depth image acquisition method and system
Kang et al. Automatic targetless camera–lidar calibration by aligning edge with gaussian mixture model
Chen et al. Transforming a 3-d lidar point cloud into a 2-d dense depth map through a parameter self-adaptive framework
CN109472820B (en) Monocular RGB-D camera real-time face reconstruction method and device
Bartczak et al. Dense depth maps from low resolution time-of-flight depth and high resolution color views
JP2011243194A (en) Image processing method for searching correspondence point
CN106651897B (en) Parallax correction method based on super-pixel segmentation
CN116309757B (en) Binocular stereo matching method based on machine vision
WO2015124066A1 (en) Visual navigation method and device and robot
CN112085802A (en) Method for acquiring three-dimensional finger vein image based on binocular camera
CN114331879A (en) Visible light and infrared image registration method for equalized second-order gradient histogram descriptor
Zhou et al. Monet3d: Towards accurate monocular 3d object localization in real time
CN112164145A (en) Method for rapidly extracting indoor three-dimensional line segment structure based on point cloud data
Ma et al. Efficient rotation estimation for 3D registration and global localization in structured point clouds
Chen et al. Multi-stage matching approach for mobile platform visual imagery
CN107122782B (en) Balanced semi-dense stereo matching method
CN113487631A (en) Adjustable large-angle detection sensing and control method based on LEGO-LOAM
CN111179327B (en) Depth map calculation method
CN116894876A (en) 6-DOF positioning method based on real-time image
Wang et al. Target recognition and localization of mobile robot with monocular PTZ camera
Stentoumis et al. Implementing an adaptive approach for dense stereo-matching
CN112348859A (en) Asymptotic global matching binocular parallax acquisition method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant