CN113191281A - ORB feature extraction method based on region of interest and adaptive radius - Google Patents


Info

Publication number
CN113191281A
CN113191281A
Authority
CN
China
Prior art keywords
image
region
sub
interest
gray
Prior art date
Legal status: Pending
Application number
CN202110495753.4A
Other languages
Chinese (zh)
Inventor
孙超
孙佳
董璐
王远大
Current Assignee
Nanjing Yunzhikong Industrial Technology Research Institute Co ltd
Original Assignee
Nanjing Yunzhikong Industrial Technology Research Institute Co ltd
Priority date: 2021-05-07
Filing date: 2021-05-07
Publication date: 2021-07-30
Application filed by Nanjing Yunzhikong Industrial Technology Research Institute Co ltd
Priority to CN202110495753.4A
Publication of CN113191281A

Classifications

    • G06V 20/10: Scenes; scene-specific elements; terrestrial scenes
    • G06F 18/241: Pattern recognition; classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2415: Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06N 3/045: Neural networks; combinations of networks
    • G06N 3/047: Neural networks; probabilistic or stochastic networks
    • G06N 3/08: Neural networks; learning methods
    • G06V 10/25: Image preprocessing; determination of region of interest [ROI] or a volume of interest [VOI]
    • G06V 10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components


Abstract

The invention discloses an ORB feature extraction method based on a region of interest and an adaptive radius. First, a Laplacian-of-Gaussian pyramid is established to give the original image a multi-scale spatial representation. Second, a region-of-interest (RoI) marking technique is designed to mark, based on the gray variance, any region where ORB features may exist in each layer of the Laplacian-of-Gaussian pyramid. Finally, an adaptive ORB feature extraction method adjusts the template radius according to the illumination brightness and the image contrast before extracting features. The method significantly improves both the number and the uniformity of distribution of ORB feature points extracted under complex illumination.

Description

ORB feature extraction method based on region of interest and adaptive radius
Technical Field
The invention relates to the field of image processing, and in particular to an ORB (Oriented FAST and Rotated BRIEF) feature extraction method based on a region of interest and an adaptive radius.
Background
Image feature extraction is an essential step in computer vision. Because even a cheap camera can provide a large amount of environmental information such as color, shape, and texture, roboticists widely apply computer vision to automated guided vehicle (AGV) visual navigation, target detection and tracking, and related fields. Vision-based AGVs running in complex work areas face several challenging problems: large volumes of camera data, changes in feature appearance caused by non-uniform and dynamic lighting, and the intensive computation of the feature extraction process.
Currently, in order to improve the intelligence and adaptability of automated guided vehicles and reduce their reliance on navigation infrastructure, roboticists must develop computer vision techniques that can efficiently and stably identify natural landmarks in dynamic lighting environments. Since natural landmarks are not navigation infrastructure deliberately placed in the environment, prior knowledge of their distribution and appearance is difficult to acquire. Within the visual SLAM framework, the quality of feature extraction strongly affects the accuracy of feature matching, which is a key step in determining the accuracy and robustness of pose estimation. In this sense, feature detection is a difficult and important process for visual navigation with natural landmarks.
The common natural-landmark feature extraction algorithms SIFT and SURF incur heavy computation because of their high-dimensional feature descriptors, which limits the real-time performance of robot navigation when the software and hardware resources of the on-board controller are limited. By comparison, the ORB algorithm improves the efficiency and robustness of feature detection. However, existing ORB-based feature detection still has several problems: 1) feature points concentrate in regions with complex texture, while other regions yield few feature points; 2) for many real-time applications, the processing time of feature point detection is still long; 3) the number of ORB feature points drops sharply when the illumination intensity or image contrast deteriorates.
Disclosure of Invention
In view of the above deficiencies of the prior art, the present invention aims to provide an ORB feature extraction method based on a region of interest and an adaptive radius, so as to solve the problem that prior-art ORB feature extraction struggles to balance robustness, distribution uniformity, and time overhead in complex illumination environments.
To achieve this purpose, the invention adopts the following technical scheme:
An ORB feature extraction method based on a region of interest and an adaptive radius comprises the following steps:
1) collect RGB images with an on-board RGB-D camera as the bottom-layer image, i.e. the base octave 0; convolve the RGB images with a Gaussian kernel to build a Gaussian pyramid, generating a series of octave images that shrink progressively from bottom to top;
2) for each layer of the Gaussian pyramid, i.e. each octave, apply the Laplace transform to the image with different convolution kernels, so that each octave contains a group of same-size, progressively blurred images at different intervals from bottom to top, finally establishing the Laplacian-of-Gaussian pyramid;
3) combine the Laplace transform with the Gaussian pyramid to construct the Laplacian-of-Gaussian pyramid, realizing multi-resolution representation in a multi-scale space;
4) using a region-of-interest marking technique, mark any region where ORB features may exist according to the gray variance at each interval of each octave of the Laplacian-of-Gaussian pyramid;
5) for each region where ORB features may exist, i.e. each region of interest, adaptively adjust the template radius according to the illumination brightness and image contrast of the region and extract the potential ORB features.
Preferably, the RGB-D camera in step 1) is mounted on a mobile robot and captures natural-environment images $I_{x,y}$ in surroundings without deliberately placed landmarks. The image $I_{x,y}$ has size $B_1 \times B_2$ and pixel coordinates $(x, y)$, and the total number of octaves of the Gaussian pyramid is:

$$l = \left\lfloor \log_2\big(\min(B_1, B_2)\big) \right\rfloor \tag{1}$$
preferably, the laplace transform in step 2) is implemented as follows: first, a gaussian function with variable scale is defined, as follows:
Figure BDA0003054193430000021
for the image octave at the l-rd layer of the gaussian pyramid, its kernel scale σ is calculated as σ (l) ═ σ 02l/2,σ0Is an initial scale;
input image I by using a variable scale Gaussian function G (x, y, σ)x,yAnd (3) performing convolution, wherein x and y are image pixel coordinates, and obtaining a multiresolution expression of the image octaves with the same size, wherein the convolution operation is as follows:
L(x,y,σ)=G(x,y,σ)*Ix,y (3)
preferably, the laplacian of gaussian pyramid in step 3) is constructed by combining laplacian transformation with a gaussian pyramid, so as to realize multi-resolution expression in a multi-scale space, in the multi-resolution expression in the multi-scale space, a group of images which gradually become more and more blurred at different intervals form an image octave, a group of gradually reduced image octaves form a gaussian pyramid at different layers, and the image kernel scale σ at the s-th interval in the l-th octave of the laplacian of pyramid is calculated as σ(s)l=σ(l)(21/s)n-sAnd n is the total number of image intervals in the l-th layer octave.
Preferably, the region-of-interest marking technique in step 4) proceeds as follows:

1) Multi-resolution image partition: each layer of the Laplacian-of-Gaussian pyramid is first divided equally into four sub-blocks $1^{(k)}, 2^{(k)}, 3^{(k)}, 4^{(k)}$ in order to search for regions of interest that may contain ORB features, where $k$ denotes the depth of the search step;

2) Average gray level: for the given center point of any sub-block, the average gray level over the $(2a+1)\times(2a+1)$ window of its neighborhood is computed as

$$\bar{I}_m^{(k)} = \frac{1}{(2a+1)^{2}} \sum_{i=-a}^{a} \sum_{j=-a}^{a} I(x+i,\, y+j) \tag{4}$$

where $a$ is the window half-width in pixels, $m$ is one of the four sub-blocks of each interval image in the $k$-th depth search step, and $(i, j)$ are pixel coordinates within the $(2a+1)\times(2a+1)$ window;

3) Sum of gray variances: in the $k$-th depth search step, the sum of gray variances $Z_m^{(k)}$ of each sub-block is obtained as

$$Z_m^{(k)} = \sum_{(x,y)\,\in\, m} \left( I_{x,y} - \bar{I}_m^{(k)} \right)^{2} \tag{5}$$

Then the maximum and minimum sums of gray variances are determined in the $k$-th depth search step:

$$Z_{\max}^{(k)} = \max_{m} Z_m^{(k)}, \qquad Z_{\min}^{(k)} = \min_{m} Z_m^{(k)} \tag{6}$$

4) Sub-block classification: in the $k$-th ($k = 0$) depth search step, the sub-blocks of each interval image are classified into three classes in total:

$$m^{(0)} = \begin{cases} \mathrm{RoI}, & Z_m^{(0)} = Z_{\max}^{(0)} \\ \text{ignored}, & Z_m^{(0)} = Z_{\min}^{(0)} \\ N^{(1)}, & \text{otherwise} \end{cases} \tag{7}$$

where $N^{(0)}$ is the original interval image, $Z_m^{(0)}$ is the sum of gray variances of each sub-block in the 0-th depth search step, and $m^{(0)}$ is the classification result of the four sub-blocks;

5) Sub-block evaluation: from the formula above, the sub-block with $Z_m^{(0)} = Z_{\max}^{(0)}$ is identified as a region of interest (RoI) in the 0-th depth search step; the sub-block with $Z_m^{(0)} = Z_{\min}^{(0)}$ is ignored; the remaining sub-blocks are used as the initial image input $N^{(1)}$ of the $k = 1$ depth search. The process of steps 1) to 4) is repeated as another loop, and the classification result of the $k = 1$ depth search step is:

$$m^{(1)} = \begin{cases} \mathrm{RoI}, & Z_m^{(1)} = Z_{\max}^{(1)} \\ \text{ignored}, & Z_m^{(1)} = Z_{\min}^{(1)} \\ N^{(2)}, & \text{otherwise} \end{cases} \tag{8}$$

In the 1st depth search step, $N^{(1)}$ is the input image, $Z_m^{(1)}$ is the sum of gray variances of each sub-block, and $m^{(1)}$ is the classification result of the four sub-blocks. Similarly, the maximum-variance sub-block is selected as a region of interest, the minimum-variance sub-block is ignored in this step, and the remaining sub-blocks are again used as the initial image input of the $k = 2$ depth search;

6) Termination condition: once the sorted sums of gray variances $Z_m^{(k)}$ of the four sub-blocks are all smaller than the gray-variation threshold $\Delta Z$, the depth search process terminates, and all region-of-interest sub-blocks obtained over the $k$ depth search steps are accumulated into the overall region of interest $N_{\mathrm{RoI}}$, which is used as the search scope for extracting ORB features.
Preferably, the method in step 5) for extracting ORB features with an adjusted template radius is as follows:

1) Within the obtained region of interest $N_{\mathrm{RoI}}$, the gray difference between different adjacent pixels is computed with the 4-neighbor method. Let $I_{x,y}$ and $I_{x',y'}$ be the gray levels of two adjacent pixels; the gray difference is computed as

$$\delta = \left| I_{x,y} - I_{x',y'} \right| \tag{9}$$

Here the 4-neighborhood means the four horizontal and vertical neighbors of the center point $p$ at coordinates $(x, y)$; this group of pixels, called the 4-neighborhood of the center point $p$, is given by

$$(x+1, y),\ (x-1, y),\ (x, y+1),\ (x, y-1) \tag{10}$$

2) The image contrast $C$ is computed from the gray difference $\delta$ and its distribution probability $P_\delta$, where $P_\delta$ is here assumed to be normally distributed:

$$C = \sum_{\delta} \delta^{2}\, P_{\delta} \tag{11}$$

3) The template radius is adjusted according to the gray level and the image contrast as follows:

$$R_{i,j} = \alpha C + \beta I_{i,j} \tag{12}$$

where $\alpha$ and $\beta$ are adaptive coefficients in the range $(0, 1)$;

4) Feature detection: taking $p$ as the center point, find all $N$ surrounding points on the circle of radius $R_{i,j}$; for example, there are 16 points around a circle of radius 3, while circles of radius 5 or 7 are surrounded by 28 or 40 points respectively. The gray level $I_{x,y}$ of each of the $N$ surrounding points on the circle is compared with the gray level $I_p$ of the center point $p$, checking whether the difference is larger than a threshold $T$:

$$S_{p \to x} = \begin{cases} d \ (\text{darker}), & I_{x,y} \le I_p - T \\ s \ (\text{similar}), & I_p - T < I_{x,y} < I_p + T \\ b \ (\text{brighter}), & I_{x,y} \ge I_p + T \end{cases} \tag{13}$$

If the number of surrounding points that differ sufficiently from the center point exceeds a threshold $M$, the center point $p$ is identified as an ORB feature point.
The invention has the beneficial effects that:
according to the invention, a Gaussian Laplace transform method is adopted for a complex illumination natural image to realize multi-scale spatial expression of an original image, so that visual characteristics have scale invariance; any area where ORB features possibly exist is marked in each layer of the Gaussian Laplace pyramid by adopting an area-of-interest marking technology, so that excessive intensive collection of ORB features in a small area is avoided, and the range and detection time of subsequent feature detection are reduced; and (3) adjusting the radius of the template to extract the ORB features according to the illumination brightness and the image contrast by adopting a self-adaptive ORB feature extraction method, wherein the method has stronger complex illumination adaptability and improves the extraction quantity and the uniform distribution of the ORB feature points in the complex illumination environment.
Drawings
FIG. 1 is a system flow chart of the ORB feature extraction method based on the region of interest and the adaptive radius according to the present invention;
FIG. 2 is a schematic diagram of the multiresolution expression in the multiscale space after the Gaussian Laplace transform in the present invention;
FIG. 3 is a flow chart of a region of interest marking technique according to the present invention;
FIG. 4 is a block diagram of a multi-resolution image according to the present invention;
FIG. 5a is a schematic illustration of region of interest (RoI) labeling results for a first interval image of a first octave of a natural image with uniformly illuminated regions in accordance with the present invention;
FIG. 5b is a schematic representation of the region of interest (RoI) labeling results for a first interval image of a second octave of a natural image with a uniform illumination area according to the present invention;
FIG. 5c is a schematic illustration of region of interest (RoI) labeling results for a first interval image of a third octave of a natural image with a uniform illumination area according to the present invention;
FIG. 5d is a schematic illustration of the region of interest (RoI) labeling results for the first interval image of the fourth octave of the natural image with uniform illumination area in accordance with the present invention;
FIG. 6a is a schematic diagram of region of interest (RoI) labeling results for a first interval image of a first octave of a natural image with a complex illumination area according to the present invention;
FIG. 6b is a schematic diagram of the region of interest (RoI) labeling results for the first interval image of the second octave of the natural image with complex illumination areas in accordance with the present invention;
FIG. 6c is a schematic diagram of the region of interest (RoI) labeling results for the first interval image of the third octave of the natural image with complex illumination areas in accordance with the present invention;
FIG. 6d is a schematic diagram of the region of interest (RoI) labeling result of the first interval image of the fourth octave of the natural image with the complex illumination area according to the present invention;
FIG. 7a is a schematic diagram of feature detection with a template radius of 3 according to the present invention;
FIG. 7b is a schematic view of a template of the present invention having a radius of 3;
FIG. 8a is a schematic view of a template of the present invention having a radius of 5;
FIG. 8b is a schematic view of a template of the present invention having a radius of 7;
FIG. 8c is a schematic view of a template of the present invention having a radius of 9;
FIG. 9a is a diagram illustrating the ORB feature extraction result of a first complex-illumination natural image according to the present invention;
FIG. 9b is a diagram illustrating the ORB feature extraction result of a second complex-illumination natural image according to the present invention;
Detailed Description
For the understanding of those skilled in the art, the ORB feature extraction method based on a region of interest and an adaptive radius of the present invention is described in detail below with reference to the accompanying drawings.
Referring to FIG. 1, the ORB feature extraction method mainly comprises three stages: establishing a Laplacian-of-Gaussian pyramid to realize multi-scale spatial representation of the image, region-of-interest marking, and ORB feature extraction with an adaptive template radius.
The ORB feature extraction method based on the region of interest and the adaptive radius specifically comprises the following steps:
1) Collect RGB images with a vehicle-mounted RGB-D camera as the bottom-layer image, i.e. the base octave (octave 0); convolve the RGB images with a Gaussian kernel to build a Gaussian pyramid, generating a series of octave images that shrink progressively from bottom to top.
2) For each layer of the Gaussian pyramid, i.e. each octave, apply the Laplace transform to the image with different convolution kernels, so that each octave contains a group of same-size, progressively blurred images at different intervals from bottom to top, finally establishing the Laplacian-of-Gaussian pyramid.
3) Combine the Laplace transform with the Gaussian pyramid to construct the Laplacian-of-Gaussian pyramid, realizing multi-resolution representation in a multi-scale space.
4) Using a region-of-interest marking technique, mark any region where ORB features may exist according to the gray variance at each interval of each octave of the Laplacian-of-Gaussian pyramid.
5) For each region where ORB features may exist, i.e. each region of interest, adaptively adjust the template radius according to the illumination brightness and image contrast of the region and extract the potential ORB features.
First, the RGB-D camera is mounted on a mobile robot and captures natural-environment images $I_{x,y}$ in surroundings without deliberately placed landmarks. The image $I_{x,y}$ has size $B_1 \times B_2$ and pixel coordinates $(x, y)$, and the total number of octaves $l$ of the Gaussian pyramid is:

$$l = \left\lfloor \log_2\big(\min(B_1, B_2)\big) \right\rfloor \tag{1}$$

Then a Gaussian function with variable scale is defined:

$$G(x, y, \sigma) = \frac{1}{2\pi\sigma^{2}}\, e^{-\left(x^{2}+y^{2}\right)/\left(2\sigma^{2}\right)} \tag{2}$$

For the image octave at the $l$-th layer of the Gaussian pyramid (octave $l$), the kernel scale is computed as $\sigma(l) = \sigma_0 \cdot 2^{l/2}$, where $\sigma_0$ is the initial scale. The input image $I_{x,y}$ is convolved with the variable-scale Gaussian function $G(x, y, \sigma)$ to obtain a multi-resolution representation of same-size image octaves, where the convolution operation is:

$$L(x, y, \sigma) = G(x, y, \sigma) * I_{x,y} \tag{3}$$

The Laplacian-of-Gaussian pyramid is constructed by combining the Laplace transform with the Gaussian pyramid, as shown in FIG. 2, realizing multi-resolution representation in a multi-scale space. In this representation, a group of progressively blurred images at different intervals forms an image octave, while a group of progressively smaller image octaves at different layers forms the Gaussian pyramid. The kernel scale of the image at the $s$-th interval of the $l$-th octave (octave $l$) is computed as $\sigma(s)_l = \sigma(l)\,(2^{1/s})^{\,n-s}$, where $n$ is the total number of image intervals in the $l$-th octave.
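To make the pyramid construction concrete, a minimal Python sketch (OpenCV and NumPy) is given below. It follows formulas (1) to (3); note that the function name log_pyramid, the parameters sigma0 and n_intervals, the substitution of s+1 in the per-interval scale (the published formula for sigma(s)_l is undefined at s = 0), and the use of cv2.pyrDown for octave down-sampling are illustrative assumptions rather than details fixed by the patent.

```python
import cv2
import numpy as np

def log_pyramid(image_bgr, sigma0=1.6, n_intervals=3):
    """Sketch of the Laplacian-of-Gaussian pyramid of formulas (1)-(3).

    Returns a list of octaves; each octave is a list of same-size,
    progressively blurred, Laplacian-filtered interval images.
    """
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY).astype(np.float32)
    b1, b2 = gray.shape[:2]
    n_octaves = int(np.log2(min(b1, b2)))              # formula (1)

    pyramid = []
    base = gray
    for l in range(n_octaves):
        sigma_l = sigma0 * 2 ** (l / 2)                # sigma(l) = sigma0 * 2^(l/2)
        octave = []
        for s in range(n_intervals):
            # per-interval kernel scale, after sigma(s)_l = sigma(l) * (2^(1/s))^(n-s);
            # s+1 is used here so the exponent is defined at s = 0 (an assumption)
            sigma_s = sigma_l * (2 ** (1.0 / (s + 1))) ** (n_intervals - s)
            blurred = cv2.GaussianBlur(base, (0, 0), sigma_s)  # formula (3): G * I
            octave.append(cv2.Laplacian(blurred, cv2.CV_32F))  # Laplace transform
        pyramid.append(octave)
        base = cv2.pyrDown(base)                       # next, smaller octave
    return pyramid
```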
As shown in FIG. 3, the region-of-interest marking technique based on gray variance narrows the search for ORB features to the potential regions most likely to contain them in each layer and interval of the Laplacian-of-Gaussian pyramid. The specific steps are as follows:

1) Multi-resolution image partition: as shown in FIG. 4, each layer of the Laplacian-of-Gaussian pyramid is first divided equally into four sub-blocks $1^{(k)}, 2^{(k)}, 3^{(k)}, 4^{(k)}$ to search for regions of interest that may contain ORB features, where $k$ denotes the depth of the search step.

2) Average gray level: for the given center point of any sub-block, the average gray level over the $(2a+1)\times(2a+1)$ window of its neighborhood can be computed as

$$\bar{I}_m^{(k)} = \frac{1}{(2a+1)^{2}} \sum_{i=-a}^{a} \sum_{j=-a}^{a} I(x+i,\, y+j) \tag{4}$$

where $a$ is the window half-width in pixels, whose size is chosen according to the actual situation, $m$ is one of the four sub-blocks of each interval image in the $k$-th depth search step, and $(i, j)$ are pixel coordinates within the $(2a+1)\times(2a+1)$ window.

3) Sum of gray variances: in the $k$-th depth search step, the sum of gray variances $Z_m^{(k)}$ of each sub-block is obtained as

$$Z_m^{(k)} = \sum_{(x,y)\,\in\, m} \left( I_{x,y} - \bar{I}_m^{(k)} \right)^{2} \tag{5}$$

Then the maximum and minimum sums of gray variances can be determined in the $k$-th depth search step:

$$Z_{\max}^{(k)} = \max_{m} Z_m^{(k)}, \qquad Z_{\min}^{(k)} = \min_{m} Z_m^{(k)} \tag{6}$$

4) Sub-block classification: in the $k$-th ($k = 0$) depth search step, the sub-blocks of each interval image are classified into three classes in total:

$$m^{(0)} = \begin{cases} \mathrm{RoI}, & Z_m^{(0)} = Z_{\max}^{(0)} \\ \text{ignored}, & Z_m^{(0)} = Z_{\min}^{(0)} \\ N^{(1)}, & \text{otherwise} \end{cases} \tag{7}$$

where $N^{(0)}$ is the original interval image, $Z_m^{(0)}$ is the sum of gray variances of each sub-block in the 0-th depth search step, and $m^{(0)}$ is the classification result of the four sub-blocks.

5) Sub-block evaluation: by the formula above, the sub-block with $Z_m^{(0)} = Z_{\max}^{(0)}$ contains rich gray-variation information and is the region most likely to be associated with ORB features; it is therefore identified as a region of interest (RoI) in the 0-th depth search step. Conversely, the sub-block with $Z_m^{(0)} = Z_{\min}^{(0)}$ shows no significant gray variation and is ignored. The remaining sub-blocks are used as the initial image input $N^{(1)}$ of the $k = 1$ depth search. The process of steps 1) to 4) is repeated as another loop; the classification result of the $k = 1$ depth search step is

$$m^{(1)} = \begin{cases} \mathrm{RoI}, & Z_m^{(1)} = Z_{\max}^{(1)} \\ \text{ignored}, & Z_m^{(1)} = Z_{\min}^{(1)} \\ N^{(2)}, & \text{otherwise} \end{cases} \tag{8}$$

In the 1st depth search step, $N^{(1)}$ is the input image, $Z_m^{(1)}$ is the sum of gray variances of each sub-block, and $m^{(1)}$ is the classification result of the four sub-blocks. Similarly, the maximum-variance sub-block is selected as a region of interest, the minimum-variance sub-block is ignored in this step, and the remaining sub-blocks are again used as the initial image input of the $k = 2$ depth search.

6) Termination condition: once the sorted sums of gray variances $Z_m^{(k)}$ of the four sub-blocks are all smaller than the gray-variation threshold $\Delta Z$, it is almost impossible for the next depth search to extract potential ORB features from the input image, so it is reasonable to terminate the depth search process. All region-of-interest sub-blocks obtained over the $k$ depth search steps can then be accumulated into the overall region of interest $N_{\mathrm{RoI}}$, which is used as the search scope for extracting ORB features.
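A minimal NumPy sketch of this variance-driven depth search follows. The keep-max, drop-min, recurse-on-the-remaining-two rule, the variance sums $Z_m^{(k)}$, and the threshold $\Delta Z$ (delta_z) come from the method above; the recursion layout, the max_depth guard, and returning regions as (y, x, height, width) tuples are illustrative assumptions.

```python
import numpy as np

def mark_rois(interval_img, delta_z, max_depth=8):
    """Sketch of the region-of-interest marking of step 4)."""
    rois = []

    def quarters(img, y0, x0):
        # divide an image equally into its four sub-blocks, with offsets
        h, w = img.shape
        hh, hw = h // 2, w // 2
        return [(y0,      x0,      img[:hh, :hw]),
                (y0,      x0 + hw, img[:hh, hw:]),
                (y0 + hh, x0,      img[hh:, :hw]),
                (y0 + hh, x0 + hw, img[hh:, hw:])]

    def search(img, y0, x0, k):
        if k >= max_depth or min(img.shape) < 2:
            return
        blocks = quarters(img, y0, x0)
        # sum of gray variances Z_m^(k) of each sub-block, formula (5)
        z = [float(np.sum((b - b.mean()) ** 2)) for _, _, b in blocks]
        if max(z) < delta_z:                   # termination condition, step 6)
            return
        order = np.argsort(z)                  # ascending: Z_min first, Z_max last
        ymax, xmax, bmax = blocks[order[-1]]   # Z_max sub-block: keep as RoI
        rois.append((ymax, xmax, bmax.shape[0], bmax.shape[1]))
        for idx in order[1:-1]:                # middle two: input of next depth step
            yb, xb, b = blocks[idx]
            search(b, yb, xb, k + 1)
        # the Z_min sub-block is ignored

    search(np.asarray(interval_img, dtype=np.float32), 0, 0, 0)
    return rois                                # accumulated into N_RoI
```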
FIGS. 5a to 5d show the region-of-interest (RoI) marking results for the first interval image of four octaves of a natural image with uniformly illuminated regions; FIGS. 6a to 6d show the RoI marking results for the first interval image of four octaves of a natural image with complex illumination regions.
Due to irregular gray variation in complex lighting environments, a fixed template radius of 3 (FIG. 7b) makes it difficult to guarantee the number and uniform distribution of the feature points extracted in FIG. 7a. The template radius therefore needs to be adapted to the illumination brightness and image contrast when extracting ORB features. The specific steps are as follows:

1) Within the obtained region of interest $N_{\mathrm{RoI}}$, the gray difference between different adjacent pixels is computed with the 4-neighbor method. Let $I_{x,y}$ and $I_{x',y'}$ be the gray levels of two adjacent pixels; the gray difference is computed as

$$\delta = \left| I_{x,y} - I_{x',y'} \right| \tag{9}$$

The 4-neighborhood means the four horizontal and vertical neighbors of the pixel $p$ located at coordinates $(x, y)$; this group of pixels, called the 4-neighborhood of $p$, has coordinates

$$(x+1, y),\ (x-1, y),\ (x, y+1),\ (x, y-1) \tag{10}$$

2) The image contrast $C$ is computed from the gray difference $\delta$ and its distribution probability $P_\delta$; here $P_\delta$ is assumed to be normally distributed:

$$C = \sum_{\delta} \delta^{2}\, P_{\delta} \tag{11}$$

3) The template radius is adjusted according to the gray level and the image contrast as follows:

$$R_{i,j} = \alpha C + \beta I_{i,j} \tag{12}$$

where $\alpha$ and $\beta$ are adaptive coefficients in the range $(0, 1)$.

4) Feature detection: taking $p$ as the center point, find all $N$ surrounding points on the circle of radius $R_{i,j}$. For example, on a circle of radius 3 there are 16 surrounding points, as shown in FIG. 7b; circles of radius 5 or 7 have 28 or 40 surrounding points respectively, as shown in FIG. 8. The gray level $I_{x,y}$ of each of the $N$ surrounding points on the circle is compared with the gray level $I_p$ of the center point $p$, checking whether the difference is larger than a threshold $T$:

$$S_{p \to x} = \begin{cases} d \ (\text{darker}), & I_{x,y} \le I_p - T \\ s \ (\text{similar}), & I_p - T < I_{x,y} < I_p + T \\ b \ (\text{brighter}), & I_{x,y} \ge I_p + T \end{cases} \tag{13}$$

If the surrounding points show sufficient gray difference from the center point (i.e. the result of equation (13) is $d$ or $b$) and their number exceeds the threshold $M$, the center point $p$ is identified as an ORB feature point. $T$ is generally taken as 20% of the gray level $I_p$, and $M$ is typically 9; clearly, a smaller threshold $T$ or $M$ detects more ORB feature points. Meanwhile, a larger template radius is more likely to pass the center point $p$ as a feature point, because more surrounding points on the circle are available for comparing gray variation. If the number of ORB features decreases due to feature degradation (lack of gray variation) under high- or low-light conditions, the template radius can be adaptively adjusted to maintain the number and quality of feature points.
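A minimal NumPy sketch of the adaptive-radius test follows. The contrast function implements $C = \sum_\delta \delta^2 P_\delta$ over 4-neighbor gray differences (formulas (9) to (11)), and the point counts 16, 28, and 40 for radii 3, 5, and 7 match the figures above; the evenly spaced circle sampling, the 52-point count assumed for radius 9, and the default alpha and beta values are illustrative assumptions, while $T = 0.2\,I_p$ and $M = 9$ follow the typical values suggested above.

```python
import numpy as np

def contrast(img):
    """Image contrast C = sum(delta^2 * P_delta), formulas (9)-(11)."""
    img = np.asarray(img, dtype=np.float32)
    deltas = np.concatenate([np.abs(np.diff(img, axis=0)).ravel(),   # vertical neighbors
                             np.abs(np.diff(img, axis=1)).ravel()])  # horizontal neighbors
    values, counts = np.unique(deltas.astype(np.int32), return_counts=True)
    p = counts / counts.sum()                  # distribution probability P_delta
    return float(np.sum(values.astype(np.float64) ** 2 * p))

def is_orb_feature(img, y, x, alpha=0.5, beta=0.01, m_thresh=9):
    """Adaptive-radius FAST-style test of step 5), formulas (12)-(13)."""
    img = np.asarray(img, dtype=np.float32)
    c = contrast(img)                          # in practice, computed once per RoI
    raw = alpha * c + beta * img[y, x]         # formula (12): R = alpha*C + beta*I
    radii = np.array([3, 5, 7, 9])             # template radii shown in FIGS. 7-8
    r = int(radii[np.argmin(np.abs(radii - raw))])
    n_points = {3: 16, 5: 28, 7: 40, 9: 52}[r] # 52 for r = 9 is an assumed +12 step

    ip = float(img[y, x])
    t = 0.2 * ip                               # threshold T, roughly 20% of I_p
    angles = np.linspace(0.0, 2.0 * np.pi, n_points, endpoint=False)
    count = 0
    for a in angles:
        yy = int(round(y + r * np.sin(a)))
        xx = int(round(x + r * np.cos(a)))
        if 0 <= yy < img.shape[0] and 0 <= xx < img.shape[1]:
            if abs(img[yy, xx] - ip) > t:      # darker (d) or brighter (b) in (13)
                count += 1
    return count >= m_thresh                   # at least M points => feature point
```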
FIGS. 9a and 9b show the ORB feature extraction results for two natural images under complex illumination. Although the loss of gray variation tends to make feature extraction considerably harder, the method of the invention can still extract sufficient, uniformly distributed features under complex illumination conditions.
The specific applications of the present invention are many, and the above is only a preferred embodiment; the scope of the invention is not limited thereto. Any change or substitution that is obvious to those skilled in the art within the technical scope of the invention falls within its protection scope, which is therefore defined by the appended claims.

Claims (6)

1. An ORB feature extraction method based on a region of interest and an adaptive radius, characterized by comprising the following steps:
1) collecting RGB images with an on-board RGB-D camera as the bottom-layer image, i.e. the base octave 0; convolving the RGB images with a Gaussian kernel to build a Gaussian pyramid, generating a series of octave images that shrink progressively from bottom to top;
2) for each layer of the Gaussian pyramid, i.e. each octave, applying the Laplace transform to the image with different convolution kernels, so that each octave contains a group of same-size, progressively blurred images at different intervals from bottom to top, finally establishing the Laplacian-of-Gaussian pyramid;
3) combining the Laplace transform with the Gaussian pyramid to construct the Laplacian-of-Gaussian pyramid, realizing multi-resolution representation in a multi-scale space;
4) using a region-of-interest marking technique, marking any region where ORB features may exist according to the gray variance at each interval of each octave of the Laplacian-of-Gaussian pyramid;
5) for each region where ORB features may exist, i.e. each region of interest, adaptively adjusting the template radius according to the illumination brightness and image contrast of the region, and extracting the potential ORB features.
2. The ORB feature extraction method based on a region of interest and an adaptive radius according to claim 1, characterized in that the RGB-D camera in step 1) is mounted on a mobile robot and captures natural-environment images $I_{x,y}$ in surroundings without deliberately placed landmarks; the image $I_{x,y}$ has size $B_1 \times B_2$ and pixel coordinates $(x, y)$, and the total number of octaves of the Gaussian pyramid is:

$$l = \left\lfloor \log_2\big(\min(B_1, B_2)\big) \right\rfloor \tag{1}$$
3. The method of claim 1, characterized in that the Laplace transform in step 2) is implemented as follows: first, a Gaussian function with variable scale is defined:

$$G(x, y, \sigma) = \frac{1}{2\pi\sigma^{2}}\, e^{-\left(x^{2}+y^{2}\right)/\left(2\sigma^{2}\right)} \tag{2}$$

for the image octave at the $l$-th layer of the Gaussian pyramid, the kernel scale is computed as $\sigma(l) = \sigma_0 \cdot 2^{l/2}$, where $\sigma_0$ is the initial scale; the input image $I_{x,y}$ is convolved with the variable-scale Gaussian function $G(x, y, \sigma)$ to obtain a multi-resolution representation of same-size image octaves, where the convolution operation is:

$$L(x, y, \sigma) = G(x, y, \sigma) * I_{x,y} \tag{3}$$
4. The ORB feature extraction method based on a region of interest and an adaptive radius according to claim 1, characterized in that the Laplacian-of-Gaussian pyramid in step 3) is constructed by combining the Laplace transform with the Gaussian pyramid, realizing multi-resolution representation in a multi-scale space; in this representation, a group of progressively blurred images at different intervals forms an image octave, while a group of progressively smaller image octaves at different layers forms the Gaussian pyramid, and the kernel scale of the image at the $s$-th interval of the $l$-th octave of the pyramid is computed as $\sigma(s)_l = \sigma(l)\,(2^{1/s})^{\,n-s}$, where $n$ is the total number of image intervals in the $l$-th octave.
5. The ORB feature extraction method based on a region of interest and an adaptive radius according to claim 1, characterized in that the region-of-interest marking technique in step 4) is as follows:
1) multi-resolution image partition: each layer of the Laplacian-of-Gaussian pyramid is first divided equally into four sub-blocks $1^{(k)}, 2^{(k)}, 3^{(k)}, 4^{(k)}$ to search for regions of interest that may contain ORB features, where $k$ denotes the depth of the search step;
2) average gray level: for the given center point of any sub-block, the average gray level over the $(2a+1)\times(2a+1)$ window of its neighborhood is computed as

$$\bar{I}_m^{(k)} = \frac{1}{(2a+1)^{2}} \sum_{i=-a}^{a} \sum_{j=-a}^{a} I(x+i,\, y+j) \tag{4}$$

where $a$ is the window half-width in pixels, $m$ is one of the four sub-blocks of each interval image in the $k$-th depth search step, and $(i, j)$ are pixel coordinates within the $(2a+1)\times(2a+1)$ window;
3) sum of gray variances: in the $k$-th depth search step, the sum of gray variances $Z_m^{(k)}$ of each sub-block is obtained as

$$Z_m^{(k)} = \sum_{(x,y)\,\in\, m} \left( I_{x,y} - \bar{I}_m^{(k)} \right)^{2} \tag{5}$$

then the maximum and minimum sums of gray variances are determined in the $k$-th depth search step:

$$Z_{\max}^{(k)} = \max_{m} Z_m^{(k)}, \qquad Z_{\min}^{(k)} = \min_{m} Z_m^{(k)} \tag{6}$$

4) sub-block classification: in the $k$-th ($k = 0$) depth search step, the sub-blocks of each interval image are classified into three classes in total:

$$m^{(0)} = \begin{cases} \mathrm{RoI}, & Z_m^{(0)} = Z_{\max}^{(0)} \\ \text{ignored}, & Z_m^{(0)} = Z_{\min}^{(0)} \\ N^{(1)}, & \text{otherwise} \end{cases} \tag{7}$$

where $N^{(0)}$ is the original interval image, $Z_m^{(0)}$ is the sum of gray variances of each sub-block in the 0-th depth search step, and $m^{(0)}$ is the classification result of the four sub-blocks;
5) sub-block evaluation: from the formula above, the sub-block with $Z_m^{(0)} = Z_{\max}^{(0)}$ is identified as a region of interest (RoI) in the 0-th depth search step; the sub-block with $Z_m^{(0)} = Z_{\min}^{(0)}$ is ignored; the remaining sub-blocks are used as the initial image input $N^{(1)}$ of the $k = 1$ depth search; the process of steps 1) to 4) is repeated as another loop, and the classification result of the $k = 1$ depth search step is:

$$m^{(1)} = \begin{cases} \mathrm{RoI}, & Z_m^{(1)} = Z_{\max}^{(1)} \\ \text{ignored}, & Z_m^{(1)} = Z_{\min}^{(1)} \\ N^{(2)}, & \text{otherwise} \end{cases} \tag{8}$$

in the 1st depth search step, $N^{(1)}$ is the input image, $Z_m^{(1)}$ is the sum of gray variances of each sub-block, and $m^{(1)}$ is the classification result of the four sub-blocks; similarly, the maximum-variance sub-block is selected as a region of interest, the minimum-variance sub-block is ignored in this step, and the remaining sub-blocks are again used as the initial image input of the $k = 2$ depth search;
6) termination condition: once the sorted sums of gray variances $Z_m^{(k)}$ of the four sub-blocks are all smaller than the gray-variation threshold $\Delta Z$, the depth search process terminates, and all region-of-interest sub-blocks obtained over the $k$ depth search steps can be accumulated into the overall region of interest $N_{\mathrm{RoI}}$, which is used as the search scope for extracting ORB features.
6. The ORB feature extraction method based on a region of interest and an adaptive radius according to claim 1, characterized in that the method in step 5) for extracting ORB features with an adjusted template radius is as follows:
1) within the obtained region of interest $N_{\mathrm{RoI}}$, the gray difference between different adjacent pixels is computed with the 4-neighbor method; let $I_{x,y}$ and $I_{x',y'}$ be the gray levels of two adjacent pixels, the gray difference is computed as

$$\delta = \left| I_{x,y} - I_{x',y'} \right| \tag{9}$$

the 4-neighborhood means the four horizontal and vertical neighbors of the center point $p$ at coordinates $(x, y)$; this group of pixels, called the 4-neighborhood of the center point $p$, is given by

$$(x+1, y),\ (x-1, y),\ (x, y+1),\ (x, y-1) \tag{10}$$

2) the image contrast $C$ is computed from the gray difference $\delta$ and its distribution probability $P_\delta$, where $P_\delta$ is here assumed to be normally distributed:

$$C = \sum_{\delta} \delta^{2}\, P_{\delta} \tag{11}$$

3) the template radius is adjusted according to the gray level and the image contrast as follows:

$$R_{i,j} = \alpha C + \beta I_{i,j} \tag{12}$$

where $\alpha$ and $\beta$ are adaptive coefficients in the range $(0, 1)$;
4) feature detection: taking $p$ as the center point, find all $N$ surrounding points on the circle of radius $R_{i,j}$, e.g. 16 points around a circle of radius 3, while circles of radius 5 or 7 are surrounded by 28 or 40 points respectively; the gray level $I_{x,y}$ of each of the $N$ surrounding points on the circle is compared with the gray level $I_p$ of the center point $p$, checking whether the difference is larger than a threshold $T$:

$$S_{p \to x} = \begin{cases} d \ (\text{darker}), & I_{x,y} \le I_p - T \\ s \ (\text{similar}), & I_p - T < I_{x,y} < I_p + T \\ b \ (\text{brighter}), & I_{x,y} \ge I_p + T \end{cases} \tag{13}$$

if the number of surrounding points that differ sufficiently from the center point exceeds a threshold $M$, the center point $p$ is identified as an ORB feature point.
CN202110495753.4A (filed 2021-05-07, priority 2021-05-07): ORB feature extraction method based on region of interest and adaptive radius. Pending. Publication: CN113191281A (en).

Priority Applications (1)

CN202110495753.4A (priority and filing date 2021-05-07): ORB feature extraction method based on region of interest and adaptive radius

Publications (1)

CN113191281A, published 2021-07-30

Family

ID=76984103

Family Applications (1)

CN202110495753.4A (priority and filing date 2021-05-07): CN113191281A (en), pending

Country Status (1)

CN: CN113191281A (en)


Cited By (3) (* cited by examiner, † cited by third party)

    • CN114763699A (2022-07-19): Embedded bolt positioning method, embedded bolt auxiliary fixing device and using method thereof
    • CN114763699B (2022-11-08): Embedded bolt positioning method, embedded bolt auxiliary fixing device and using method thereof
    • CN115147596A (2022-10-04): Irradiation dose control method and system for heat-shrinkable tube production process


Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination