CN106997591A - A variable-scale supervoxel segmentation method for RGB-D images - Google Patents


Publication number
CN106997591A
Authority
CN
China
Legal status: Pending
Application number
CN201710168730.6A
Other languages
Chinese (zh)
Inventor
袁夏
徐鹏
周宏扬
Current Assignee
Nanjing University of Science and Technology
Original Assignee
Nanjing University of Science and Technology
Priority date
Filing date
Publication date
Application filed by Nanjing University of Science and Technology
Priority to CN201710168730.6A
Publication of CN106997591A

Classifications

  • Image Analysis (AREA)
Abstract

The invention discloses a variable-scale supervoxel segmentation method for RGB-D image data. Seed points are selected from the data by Poisson disk sampling, and the data points are then iteratively clustered according to the color distance and spatial distance between each data point and each seed point, yielding an initial supervoxel segmentation. An undirected graph is built with the initial supervoxels as vertices and the adjacency relations between supervoxels as edges, and the supervoxels are merged with a graph-based method, so that supervoxels of different scales are obtained within the same RGB-D image, realizing variable-scale supervoxel segmentation. The method belongs to data preprocessing: the segmentation produces large supervoxels in regions of high data consistency and small supervoxels where data consistency is poor, which better matches the characteristics of human visual cognition.

Description

RGB-D image variable-scale supervoxel segmentation method
Technical Field
The invention relates to a supervoxel segmentation method, in particular to a variable-scale supervoxel segmentation method suitable for RGB-D images.
Background
With progress in sensor technology, the acquisition cost of RGB-D images keeps falling, and how to preprocess RGB-D images more effectively has become an important research topic in computer vision in recent years. To make full use of the three-dimensional geometric information in an RGB-D image, over-segmenting it into supervoxels, analogous to superpixel over-segmentation of two-dimensional images, provides an effective preprocessing step and can substantially reduce the amount of data handled by subsequent algorithms.
In current supervoxel segmentation algorithms, once the scale parameter is fixed the resulting supervoxels are highly uniform in size. If multi-scale analysis is needed later, supervoxel sizes must be controlled by setting different scale factors, so a separate segmentation has to be computed for every scale analyzed, increasing the amount of computation. Following the idea of variable-scale analysis, during segmentation the regions of high data consistency should be segmented into large supervoxels and the regions of poor data consistency into small supervoxels, so that the RGB-D data are adaptively segmented into supervoxels of different scales according to the local distribution of the data. Subsequent algorithms can then perform variable-scale computation on the over-segmented data, reducing the amount of computation.
Disclosure of Invention
The invention aims to provide a variable-scale supervoxel segmentation method for RGB-D images which performs better segmentation preprocessing on an RGB-D image.
The technical solution realizing the purpose of the invention is as follows: a variable-scale supervoxel segmentation method for an RGB-D image comprises the following steps:
Step 1, seed point selection. Let one frame of RGB-D image data be P with a resolution of m rows and n columns, where each data point contains 3 color channels (r, g, b) and 1 depth channel (d). Randomly select a point p0 from P as the initial seed point, set a radius threshold R as the minimum distance between the seed points to be generated, and sample in P with a Poisson disk sampling algorithm to obtain the seed point set. Seed point selection specifically comprises the following steps:
Step 1-1, randomly select a point p0 in the frame of RGB-D image data P as the initial seed point; initialize the active sampling point queue L1 to empty, add p0 to L1, and initialize the inactive sampling point queue L2 to empty;
Step 1-2, check whether the active sampling point queue L1 is empty. If L1 is not empty, dequeue a point pi from L1 and randomly select candidate sampling points in the annulus centered at pi between radii R and 2R; if a candidate's distance to every existing seed point is greater than R, add it to L1. If after K trials no qualified candidate is found, delete pi from L1 and add it to L2. Here R is in pixel coordinate units and K is a preset value;
Step 1-3, the points in L2 are the selected seed points.
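Steps 1-1 to 1-3 describe Bridson-style Poisson disk sampling. Below is a minimal Python sketch under two stated assumptions: the names are illustrative (not from the patent), and the function returns every accepted point, which matches the patent's L2 output because each active point is eventually retired into L2:

```python
import math
import random

def poisson_disk_seeds(m, n, R, K=30, rng=random):
    """Bridson-style Poisson disk sampling over an m x n pixel grid.

    Returns seed points (row, col) whose pairwise distance is at least R.
    A background grid with cell size R/sqrt(2) makes neighbor checks O(1).
    """
    cell = R / math.sqrt(2.0)
    grid = {}  # (cell_row, cell_col) -> accepted point

    def fits(p):
        # any point closer than R must lie within 2 background cells
        ci, cj = int(p[0] // cell), int(p[1] // cell)
        for di in range(-2, 3):
            for dj in range(-2, 3):
                q = grid.get((ci + di, cj + dj))
                if q is not None and math.dist(p, q) < R:
                    return False
        return True

    p0 = (rng.uniform(0, m), rng.uniform(0, n))
    active = [p0]                       # queue L1 of active sampling points
    seeds = [p0]                        # all accepted points
    grid[(int(p0[0] // cell), int(p0[1] // cell))] = p0

    while active:
        i = rng.randrange(len(active))
        pi = active[i]
        for _ in range(K):              # K candidate trials around pi
            r = rng.uniform(R, 2 * R)   # annulus between radii R and 2R
            a = rng.uniform(0, 2 * math.pi)
            cand = (pi[0] + r * math.cos(a), pi[1] + r * math.sin(a))
            if 0 <= cand[0] < m and 0 <= cand[1] < n and fits(cand):
                active.append(cand)
                seeds.append(cand)
                grid[(int(cand[0] // cell), int(cand[1] // cell))] = cand
                break
        else:                           # no valid candidate: retire pi (move to L2)
            active.pop(i)
    return seeds
```

The background grid is the standard acceleration for the distance-to-existing-seeds test; a direct scan over all seeds would implement the patent's step 1-2 just as well, only more slowly.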
Step 2, supervoxel pre-segmentation. Convert the color space of each point in the data from RGB to Lab and convert the depth value d to three-dimensional space coordinates (x, y, z). Then, with each seed point as a cluster center, iteratively compute the distance between non-seed points and seed points by combining the color distance and the three-dimensional spatial distance, and assign each point to its closest seed point, obtaining the initial supervoxel segmentation. The pre-segmentation specifically comprises the following steps:
Step 2-1, convert the depth values of the points in the RGB-D image to three-dimensional coordinates. Let the point in row i, column j of P be pij with depth value dij; the depth value is converted using equation (1):

pij(x, y, z)T = dij((i − cx)/f, (j − cy)/f, 1)T (1)

where f is the focal length of the camera and (cx, cy) are the center coordinates of the image. The RGB color of each point in P is converted to Lab color using the standard color space conversion formula, so every point in P is represented as a 6-dimensional vector [l, a, b, x, y, z];
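The back-projection of equation (1) can be sketched as follows; `depth_to_xyz` and its argument names are illustrative, with the camera intrinsics f, cx, cy passed in as parameters, and the row index i paired with cx exactly as equation (1) writes it:

```python
import numpy as np

def depth_to_xyz(depth, f, cx, cy):
    """Back-project an (m, n) depth map to 3-D coordinates via equation (1):
    p_ij = d_ij * ((i - cx)/f, (j - cy)/f, 1)^T.
    Returns an (m, n, 3) array of (x, y, z) per pixel."""
    m, n = depth.shape
    i, j = np.meshgrid(np.arange(m), np.arange(n), indexing="ij")
    x = depth * (i - cx) / f
    y = depth * (j - cy) / f
    return np.stack([x, y, depth], axis=-1)
```

Stacking the Lab channels next to this (x, y, z) array yields the 6-dimensional [l, a, b, x, y, z] representation the text describes.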
Step 2-2, with the seed points obtained in step 1 as initial cluster centers, perform region-search clustering: compute the distance between each point in the 2R × 2R neighborhood of every cluster center and that center, and assign each non-center point to the cluster center with the minimum feature distance, completing the first clustering pass. The feature distance between two points pi and pj in P is measured by:

D(pi, pj) = dlab + λ·dxyz/R (2)

dlab = √((li − lj)² + (ai − aj)² + (bi − bj)²) (3)

dxyz = √((xi − xj)² + (yi − yj)² + (zi − zj)²) (4)

In equation (2), dlab is the color distance and dxyz the spatial distance; λ is a weight balancing color information against spatial distance information: the larger λ is, the more important the spatial distance, and the smaller λ is, the more important the color distance;
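Equations (2)-(4) can be sketched as one small function over the 6-dimensional feature vectors; the name `feature_distance` is illustrative:

```python
import math

def feature_distance(p, q, R, lam=1.0):
    """Feature distance of equations (2)-(4) between two 6-D points
    [l, a, b, x, y, z]: D = d_lab + lam * d_xyz / R."""
    d_lab = math.dist(p[:3], q[:3])   # CIELAB color distance, eq. (3)
    d_xyz = math.dist(p[3:], q[3:])   # 3-D spatial distance, eq. (4)
    return d_lab + lam * d_xyz / R    # eq. (2)
```

Dividing the spatial term by the seed radius R keeps the two terms on comparable scales, so λ alone controls the color/space trade-off.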
Step 2-3, iterative clustering. Recompute the cluster center of each class from the result of the first pass in step 2-2: the new center feature value is the mean of the features of all points in the class, and the point of the class closest to this mean is taken as the new cluster center. The class of each non-center point is then recomputed with the feature distance calculation and classification method of step 2-2. The iteration ends after k rounds; the points of each class then form an initial supervoxel.
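Steps 2-2 and 2-3 amount to a SLIC-style assign/update loop. A simplified Python sketch (illustrative names; for brevity it compares every point against every center instead of restricting the search to a 2R × 2R window, which changes only speed, not the assignment criterion):

```python
import math

def cluster_supervoxels(points, seed_idx, R, lam=1.0, iters=10):
    """Iterative clustering of steps 2-2 / 2-3 over 6-D features
    [l, a, b, x, y, z]. Returns a class label per point."""
    def dist(p, q):
        # eq. (2): color distance plus weighted, R-normalized spatial distance
        return math.dist(p[:3], q[:3]) + lam * math.dist(p[3:], q[3:]) / R

    centers = [points[s] for s in seed_idx]
    labels = [0] * len(points)
    for _ in range(iters):
        # assignment step: each point joins its minimum-feature-distance center
        for idx, p in enumerate(points):
            labels[idx] = min(range(len(centers)), key=lambda c: dist(p, centers[c]))
        # update step: mean feature per class, snapped to the nearest member point
        for c in range(len(centers)):
            members = [points[i] for i in range(len(points)) if labels[i] == c]
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                centers[c] = min(members, key=lambda q: dist(q, mean))
    return labels
```

Snapping the updated center to the nearest actual member, as the patent specifies, keeps every cluster center a real data point rather than a synthetic mean.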
Step 3, supervoxel fusion. With the initial supervoxels as vertices and the adjacency relations between neighboring supervoxels as edges, build an undirected graph G = (V, E). Measure the difference between adjacent supervoxels by the Lab color space distance and the angle between the normal vector directions of the initial supervoxels, and merge the initial supervoxels following the idea of minimizing intra-class difference while maximizing inter-class difference, obtaining the variable-scale supervoxels. Supervoxel fusion specifically comprises the following steps:
Step 3-1, let the supervoxel set obtained by the initial segmentation be C, with each supervoxel ci giving a vertex vi to build the vertex set V. For neighboring supervoxels vi, vj with a common boundary, create the edge eij to form the edge set E, and construct the undirected graph G from V and E. Compute the normal vector of each point from the three-dimensional coordinates of the points of P using the standard principal component decomposition of the coordinate covariance matrix of the k nearest neighbors, then assign each edge of G a weight w(vi, vj) using equation (5):

w(vi, vj) = α·|l̄i − l̄j|/lmax + (1 − α)·θij/π (5)

θij = cos⁻¹(ni·nj/(|ni||nj|)) (6)

where l̄i and lmax are the mean and maximum luminance of all points in supervoxel ci, respectively, θij is the angle between the normal vectors ni, nj of supervoxels ci and cj, the normal vector of a supervoxel is taken as the mean normal vector of all its points, and α is a weight factor;
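The edge weight of equations (5) and (6) can be sketched directly; the inputs are the per-supervoxel mean luminances, mean normal vectors, and the maximum luminance, and the function name is illustrative:

```python
import math

def edge_weight(l_mean_i, l_mean_j, n_i, n_j, l_max, alpha=0.5):
    """Edge weight of equations (5)-(6): a convex combination of the
    normalized luminance difference and the normal-vector angle."""
    dot = sum(a * b for a, b in zip(n_i, n_j))
    norm = math.sqrt(sum(a * a for a in n_i)) * math.sqrt(sum(b * b for b in n_j))
    # eq. (6); clamp guards against rounding just outside [-1, 1]
    theta = math.acos(max(-1.0, min(1.0, dot / norm)))
    # eq. (5): both terms lie in [0, 1], so w is in [0, 1]
    return alpha * abs(l_mean_i - l_mean_j) / l_max + (1 - alpha) * theta / math.pi
```

Because both terms are normalized to [0, 1], α moves the weight smoothly between a pure luminance criterion (α = 1) and a pure surface-orientation criterion (α = 0).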
Step 3-2, merge the initial supervoxels. Before merging, initialize each supervoxel as its own region Ai. With vk and vl denoting supervoxels in region Ai, the internal dissimilarity of Ai is defined as the maximum edge weight of the region's minimum spanning tree, as in equation (7):

Int(Ai) = max w(vk, vl), vk, vl ∈ Ai, (vk, vl) ∈ E (7)

The external dissimilarity of two regions Ai and Aj is defined by equation (8), where vm is a supervoxel in region Ai and vn is a supervoxel in region Aj:

Dif(Ai, Aj) = min w(vm, vn), vm ∈ Ai, vn ∈ Aj, (vm, vn) ∈ E (8)

MInt(Ai, Aj), the minimum internal dissimilarity of the two regions Ai and Aj, is then computed using equation (9):

MInt(Ai, Aj) = min(Int(Ai) + τ(Ai), Int(Aj) + τ(Aj)) (9)

where τ(Ai) = e/|Ai|, |Ai| is the number of points contained in region Ai, and e is a set constant.

The external and internal dissimilarities of the two regions are then compared: the two regions are merged if equation (10) is satisfied, and otherwise left unmerged:

MInt(Ai, Aj) < Dif(Ai, Aj) (10)

The region merging process is repeated until no regions in P can be merged, yielding the variable-scale supervoxels.
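The merging of step 3-2 follows the Felzenszwalb-Huttenlocher graph-based criterion. A union-find sketch under two stated assumptions: edges are processed in ascending weight order (so the first edge seen between two regions realizes Dif, and the running maximum of merged edges equals Int of the region's minimum spanning tree, by Kruskal's argument), and |Ai| here counts supervoxels rather than raw points:

```python
class Regions:
    """Union-find over supervoxel vertices for the merging of step 3-2."""

    def __init__(self, n, e=100.0):
        self.parent = list(range(n))
        self.size = [1] * n           # |A_i|, here in supervoxels
        self.internal = [0.0] * n     # Int(A_i): max MST edge inside the region
        self.e = e                    # constant in tau(A_i) = e / |A_i|

    def find(self, x):
        while self.parent[x] != x:    # path halving
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def merge_graph(self, edges):
        """edges: iterable of (w, i, j) tuples; applies eqs. (9)-(10)."""
        for w, i, j in sorted(edges):
            a, b = self.find(i), self.find(j)
            if a == b:
                continue
            # w is Dif(A_a, A_b): the smallest crossing edge, since edges
            # arrive in ascending order. Merge iff w < MInt(A_a, A_b).
            tau_a = self.e / self.size[a]
            tau_b = self.e / self.size[b]
            if w < min(self.internal[a] + tau_a, self.internal[b] + tau_b):
                self.parent[b] = a
                self.size[a] += self.size[b]
                self.internal[a] = max(self.internal[a], self.internal[b], w)
```

Because τ shrinks as a region grows, small regions merge easily and large coherent regions resist merging, which is exactly what produces large supervoxels in consistent areas and small ones elsewhere.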
Compared with the prior art, the invention has the following notable advantages: (1) the initial seed points are selected by Poisson disk sampling, which, compared with uniform grid sampling, better matches the distribution of cone cells on the retina and supports even sampling in an uneven three-dimensional point set; (2) the method integrates geometric information such as depth values and normal vectors on top of color, making it better suited to RGB-D data; (3) the method obtains supervoxels of different scales from the same frame of RGB-D data in a single computation, in clear contrast to over-segmentation methods that can only produce supervoxels of roughly uniform scale from one computation on the same frame, and is thus more favorable to reducing the computation of subsequent algorithms and to multi-scale analysis.
The invention is further described below with reference to the accompanying drawings.
Drawings
FIG. 1 shows the variable-scale supervoxel segmentation results after the pre-segmentation and fusion stages of the algorithm.
FIG. 2 is a flow chart of the RGB-D image variable-scale supervoxel segmentation method of the present invention.
Detailed Description
The invention discloses a variable-scale supervoxel segmentation method for RGB-D images, comprising the following steps:
Step 1, seed point selection. Let one frame of RGB-D image data be P with a resolution of m rows and n columns, where the information of each data point contains 3 color channels (r, g, b) and 1 depth channel (d). Randomly select a point p0 from P as the initial seed point, set a radius threshold R as the minimum distance between the seed points to be generated, and sample in P with a Poisson disk sampling algorithm to obtain the seed point set. Specifically:
Step 1-1, randomly select a point p0 in P as the initial seed point; initialize the active sampling point queue L1 to empty, add p0 to L1, and initialize the inactive sampling point queue L2 to empty;
Step 1-2, if L1 is not empty, dequeue a point pi from L1 and randomly select candidate sampling points in the annulus centered at pi between radii R and 2R (R in pixel coordinate units); if a candidate's distance to every existing seed point is greater than R, add it to L1. If after K trials no qualified candidate is found, delete pi from L1 and add it to L2.
Step 1-3, the points in L2 are the selected seed points.
Step 2, supervoxel pre-segmentation. Convert the color space of each point in P from RGB to Lab using a standard color space conversion method and convert the depth value d to three-dimensional coordinates (x, y, z); then, with each seed point as a cluster center, iteratively compute the distance between non-seed points and seed points by combining the color distance and the three-dimensional spatial distance, obtaining the initial supervoxel pre-segmentation. Specifically:
Step 2-1, convert the depth values of the points in the RGB-D image to three-dimensional coordinates. Let the point in row i, column j of P be pij with depth value dij; the depth value is converted using equation (1):

pij(x, y, z)T = dij((i − cx)/f, (j − cy)/f, 1)T (1)

where f is the focal length of the camera and (cx, cy) are the center coordinates of the image. The RGB color of each point in P is converted to Lab color using the standard color space conversion formula. Through the color space and coordinate transformations, every point in P is represented as a 6-dimensional vector [l, a, b, x, y, z];
Step 2-2, with the seed points obtained in step 1 as initial cluster centers, perform region-search clustering: compute the distance between each point in the 2R × 2R neighborhood of every cluster center and that center, and assign each non-center point to the cluster center with the minimum feature distance, completing the first clustering pass. The feature distance between two points pi and pj in P is measured by equations (2)-(4).
In equation (2), dlab is the color distance and dxyz the spatial distance; λ is the weight balancing color information against spatial distance information: the larger λ is, the more important the spatial distance, and the smaller λ is, the more important the color distance; it is set according to the requirements of the specific application.
Step 2-3, iterative clustering. Recompute the cluster center of each class from the result of the first pass in step 2-2: the new center feature value is the mean of the features of all points in the class, and the point of the class closest to this mean is taken as the new cluster center; the class of each non-center point is then recomputed with the feature distance calculation and classification method of step 2-2. The iteration ends after k rounds. After the iterative computation is finished, the points of each class form an initial supervoxel.
Step 3, supervoxel fusion. With the initial supervoxels as vertices and the adjacency relations between supervoxels as edges, construct an undirected graph G = (V, E); measure the difference between adjacent supervoxels by the Lab color space distance and the angle between the normal vector directions of the initial supervoxels, and merge the initial supervoxels following the idea of minimizing intra-class difference while maximizing inter-class difference, obtaining the variable-scale supervoxels. Specifically:
Step 3-1, let the supervoxel set obtained by the initial segmentation be C, with each supervoxel ci giving a vertex vi to build the vertex set V; for neighboring supervoxels vi, vj with a common boundary, create the edge eij to form the edge set E, and construct the undirected graph G from V and E. Compute the normal vector of each point from the three-dimensional coordinates of the points of P using the standard principal component decomposition of the coordinate covariance matrix of the k nearest neighbors, then assign each edge of G a weight w(vi, vj) using equation (5):

w(vi, vj) = α·|l̄i − l̄j|/lmax + (1 − α)·θij/π (5)

θij = cos⁻¹(ni·nj/(|ni||nj|)) (6)

where l̄i and lmax are the mean and maximum luminance of all points in supervoxel ci, respectively, θij is the angle between the normal vectors ni, nj of supervoxels ci and cj, the normal vector of a supervoxel is taken as the mean normal vector of all its points, and α is a weight factor.
Step 3-2, merge the initial supervoxels. Before merging, initialize each supervoxel as its own region Ai. With vk and vl denoting supervoxels in region Ai, the internal dissimilarity of Ai is defined as the maximum edge weight of the region's minimum spanning tree, as in equation (7):

Int(Ai) = max w(vk, vl), vk, vl ∈ Ai, (vk, vl) ∈ E (7)

The external dissimilarity of the two regions Ai and Aj is defined by equation (8), where vm is a supervoxel in region Ai and vn is a supervoxel in region Aj:

Dif(Ai, Aj) = min w(vm, vn), vm ∈ Ai, vn ∈ Aj, (vm, vn) ∈ E (8)

MInt(Ai, Aj), the minimum internal dissimilarity of the two regions Ai and Aj, is then computed using equation (9):

MInt(Ai, Aj) = min(Int(Ai) + τ(Ai), Int(Aj) + τ(Aj)) (9)

where τ(Ai) = e/|Ai|, |Ai| is the number of points contained in region Ai, and e is a set constant.

The external and internal dissimilarities of the two regions are then compared: the two regions are merged if equation (10) is satisfied, and otherwise left unmerged:

MInt(Ai, Aj) < Dif(Ai, Aj) (10)

The region merging process is repeated until no regions in P can be merged, yielding the variable-scale supervoxels.
The initial seed points are selected by Poisson disk sampling, which, compared with uniform grid sampling, better matches the distribution of cone cells on the retina and supports uniform sampling in a non-uniform three-dimensional point set.
The present invention is further illustrated by the following specific examples.
Example 1
FIG. 1 is a schematic diagram of the pre-segmentation and variable-scale segmentation of one frame of RGB-D image captured by a Kinect depth camera, together with the results.
Segmenting one frame of RGB-D image with the variable-scale supervoxel over-segmentation method comprises the following steps:
Step 1, seed point selection. One frame of RGB-D image data is P with a resolution of 480 rows and 640 columns, where the information of each data point contains 3 color channels (r, g, b) and 1 depth channel (d). Randomly select a point p0 from P as the initial seed point, set the radius threshold R = 20 as the minimum distance between the seed points to be generated, and sample in P with a Poisson disk sampling algorithm to obtain the seed point set. Specifically:
Step 1-1, randomly select a point p0 in P as the initial seed point; initialize the active sampling point queue L1 to empty, add p0 to L1, and initialize the inactive sampling point queue L2 to empty;
Step 1-2, if L1 is not empty, dequeue a point pi from L1 and randomly select candidate sampling points in the annulus centered at pi between radii R and 2R; if a candidate's distance to every existing seed point is greater than R, add it to L1. If after K = 30 trials no qualified candidate is found, delete pi from L1 and add it to L2.
Step 1-3, the points in L2 are the selected seed points.
Step 2, supervoxel pre-segmentation. Convert the color space of each point in P from RGB to Lab using the standard color space conversion formula and convert the depth value d to three-dimensional coordinates (x, y, z); then, with each seed point as a cluster center, iteratively compute the distance between non-seed points and seed points by combining the color distance and the three-dimensional spatial distance, obtaining the initial supervoxel segmentation. Specifically:
Step 2-1, convert the depth values of the points in the RGB-D image to three-dimensional coordinates. Let the point in row i, column j of P be pij with depth value dij; the depth value is converted using equation (1):

pij(x, y, z)T = dij((i − cx)/f, (j − cy)/f, 1)T (1)

According to the development manual of the Kinect depth camera, f = 570.3, cx = 320, cy = 240. The RGB color of each point in P is converted to Lab color using the standard color space conversion formula. Through the color space and coordinate transformations, every point in P is represented as a 6-dimensional vector [l, a, b, x, y, z];
Step 2-2, with the seed points obtained in step 1 as initial cluster centers, perform region-search clustering: compute the distance between each point in the 2R × 2R neighborhood of every cluster center and that center, and assign each non-center point to the cluster center with the minimum feature distance, completing the first clustering pass. The feature distance between two points pi and pj in P is measured by equations (2)-(4), where dlab is the color distance, dxyz the spatial distance, and λ = 1.
Step 2-3, iterative clustering. Recompute the cluster center of each class from the result of the first pass in step 2-2: the new center feature value is the mean of the features of all points in the class, and the point of the class closest to this mean is taken as the new cluster center; the class of each non-center point is then recomputed with the feature distance calculation and classification method of step 2-2. The iteration ends after k = 10 rounds. After the iterative computation is completed, the points of each class form an initial supervoxel, as shown by the pre-segmentation result in FIG. 1.
Step 3, supervoxel fusion. With the initial supervoxels as vertices and the adjacency relations between supervoxels as edges, construct an undirected graph G = (V, E); measure the difference between adjacent supervoxels by the Lab color space distance and the angle between the normal vector directions of the initial supervoxels, and merge the initial supervoxels following the idea of minimizing intra-class difference while maximizing inter-class difference, obtaining the variable-scale supervoxels. Specifically:
Step 3-1, let the supervoxel set obtained by the initial segmentation be C, with each supervoxel ci giving a vertex vi to build the vertex set V; for neighboring supervoxels vi, vj with a common boundary, create the edge eij to form the edge set E, and construct the undirected graph G from V and E. Compute the normal vector of each point from the three-dimensional coordinates of the points of P using the standard principal component decomposition of the coordinate covariance matrix of the k nearest neighbors, then assign each edge of G a weight w(vi, vj) using equation (5):

w(vi, vj) = α·|l̄i − l̄j|/lmax + (1 − α)·θij/π (5)

θij = cos⁻¹(ni·nj/(|ni||nj|)) (6)

where l̄i and lmax are the mean and maximum luminance of all points in supervoxel ci, respectively, θij is the angle between the normal vectors ni, nj of supervoxels ci and cj, the normal vector of a supervoxel is taken as the mean normal vector of all its points, and α = 0.5.
Step 3-2, merge the initial supervoxels. Before merging, initialize each supervoxel as its own region Ai. With vk and vl denoting supervoxels in region Ai, the internal dissimilarity of Ai is defined as the maximum edge weight of the region's minimum spanning tree, as in equation (7):

Int(Ai) = max w(vk, vl), vk, vl ∈ Ai, (vk, vl) ∈ E (7)

The external dissimilarity of two regions Ai and Aj is defined by equation (8), where vm is a supervoxel in region Ai and vn is a supervoxel in region Aj:

Dif(Ai, Aj) = min w(vm, vn), vm ∈ Ai, vn ∈ Aj, (vm, vn) ∈ E (8)

MInt(Ai, Aj), the minimum internal dissimilarity of the two regions Ai and Aj, is then computed using equation (9):

MInt(Ai, Aj) = min(Int(Ai) + τ(Ai), Int(Aj) + τ(Aj)) (9)

where τ(Ai) = e/|Ai|, |Ai| is the number of points contained in region Ai, and e = 100.

The external and internal dissimilarities of the two regions are then compared: the two regions are merged if equation (10) is satisfied, and otherwise left unmerged:

MInt(Ai, Aj) < Dif(Ai, Aj) (10)

The region merging process is repeated until no regions in P can be merged; each region finally obtained after merging is a variable-scale supervoxel, as shown in the variable-scale segmentation result in FIG. 1.
The method of the invention obtains supervoxels of different scales from the same frame of RGB-D data in a single computation, in clear contrast to over-segmentation methods that can only produce supervoxels of roughly uniform scale from one computation on the same frame, and is more favorable to reducing the computation of subsequent algorithms and to multi-scale analysis.

Claims (4)

1. A variable-scale supervoxel segmentation method for RGB-D images, characterized by comprising the following steps:
step 1, seed point selection: let one frame of RGB-D image data be P with a resolution of m rows and n columns, where the information of each data point contains 3 color channels (r, g, b) and 1 depth channel (d); randomly select a point p0 from P as the initial seed point, set a radius threshold R as the minimum distance between the seed points to be generated, and sample in P with a Poisson disk sampling algorithm to obtain the seed point set;
step 2, supervoxel pre-segmentation: convert the color space of each point in the data from RGB to Lab and convert the depth value d to three-dimensional coordinates (x, y, z); then, with each seed point as a cluster center, iteratively compute the distance between non-seed points and seed points by combining the color distance and the three-dimensional spatial distance, obtaining the initial supervoxel segmentation;
step 3, supervoxel fusion: with the initial supervoxels as vertices and the adjacency relations between supervoxels as edges, construct an undirected graph G = (V, E); measure the difference between adjacent supervoxels by the Lab color space distance and the angle between the normal vector directions of the initial supervoxels, and merge the initial supervoxels to obtain the variable-scale supervoxels, completing the variable-scale supervoxel segmentation of the RGB-D image.
2. The RGB-D image variable-scale supervoxel segmentation method according to claim 1, characterized in that the seed point selection in step 1 specifically comprises the following steps:
step 1-1, randomly select a point p0 in the frame of RGB-D image data P as the initial seed point; initialize the active sampling point queue L1 to empty, add p0 to L1, and initialize the inactive sampling point queue L2 to empty;
step 1-2, check whether the active sampling point queue L1 is empty; if L1 is not empty, dequeue a point pi from L1 and randomly select candidate sampling points in the annulus centered at pi between radii R and 2R; if a candidate's distance to every existing seed point is greater than R, add it to L1; if after K trials no qualified candidate is found, delete pi from L1 and add it to L2; here R is in pixel coordinate units and K is a preset value;
step 1-3, the points in L2 are the selected seed points.
3. The RGB-D image variable-scale supervoxel segmentation method according to claim 1 or 2, characterized in that the supervoxel pre-segmentation in step 2 specifically comprises the following steps:
step 2-1, convert the depth values of the points in the RGB-D image to three-dimensional coordinates; let the point in row i, column j of P be pij with depth value dij; the depth value is converted using equation (1):

pij(x, y, z)T = dij((i − cx)/f, (j − cy)/f, 1)T (1)

where f is the focal length of the camera and (cx, cy) are the center coordinates of the image; the RGB color of each point in P is converted to Lab color using the standard color space conversion formula, so every point in P is represented as a 6-dimensional vector [l, a, b, x, y, z];
step 2-2, with the seed points obtained in step 1 as initial cluster centers, perform region-search clustering: compute the distance between each point in the 2R × 2R neighborhood of every cluster center and that center, and assign each non-center point to the cluster center with the minimum feature distance, completing the first clustering pass; the feature distance between two points pi and pj in P is measured by:

D(pi, pj) = dlab + λ·dxyz/R (2)

dlab = √((li − lj)² + (ai − aj)² + (bi − bj)²) (3)

dxyz = √((xi − xj)² + (yi − yj)² + (zi − zj)²) (4)

in equation (2), dlab is the color distance and dxyz the spatial distance; λ is the weight balancing color information against spatial distance information: the larger λ is, the more important the spatial distance, and the smaller λ is, the more important the color distance;
step 2-3, iterative clustering: recompute the cluster center of each class from the result of the first pass in step 2-2; the new center feature value is the mean of the features of all points in the class, and the point of the class closest to this mean is taken as the new cluster center; the class of each non-center point is then recomputed with the feature distance calculation and classification method of step 2-2; the iteration ends after k rounds; after the iterative computation is finished, the points of each class form an initial supervoxel.
4. The variable-scale supervoxel segmentation method for RGB-D images according to claim 1, wherein the supervoxel fusion in step 3 specifically comprises the following steps:
Step 3-1: let C be the set of supervoxels obtained by the initial segmentation, with each supervoxel c_i represented by a vertex v_i, forming a vertex set V; for every pair of adjacent supervoxels v_i, v_j sharing a common boundary, establish an edge e_ij, forming an edge set E; construct an undirected graph G from V and E. Compute the normal vector of each point from its three-dimensional coordinates in P using the standard principal-component decomposition of the covariance matrix of its k nearest neighbors' coordinates, and then assign each edge of G a weight w(v_i, v_j) by formula (5):
w(v_i, v_j) = \alpha \, |\bar{l}_i - \bar{l}_j| / l_{max} + (1 - \alpha) \, \theta_{ij} / \pi    (5)

\theta_{ij} = \cos^{-1} \left( \vec{n}_i \cdot \vec{n}_j / (|\vec{n}_i| |\vec{n}_j|) \right)    (6)
where \bar{l}_i and l_{max} are respectively the average and maximum luminance of all points in supervoxel c_i, \theta_{ij} is the angle between the normal vectors of supervoxels c_i and c_j (the normal vector of a supervoxel is taken as the mean of the normal vectors of all its points), and α is a weighting factor;
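Step 3-1 can be illustrated with a hedged sketch (function names are ours): a PCA normal taken from the k-nearest-neighbor coordinate covariance matrix, followed by the edge weight of formulas (5)-(6):

```python
import numpy as np

def estimate_normal(neighbors):
    """PCA normal: the eigenvector of the neighbor-coordinate covariance
    matrix with the smallest eigenvalue (np.linalg.eigh returns
    eigenvalues in ascending order)."""
    cov = np.cov(np.asarray(neighbors, dtype=float).T)
    _, eigvecs = np.linalg.eigh(cov)
    return eigvecs[:, 0]

def edge_weight(l_i, l_j, n_i, n_j, l_max, alpha):
    """Formula (5): alpha-weighted mean-luminance difference plus the
    normal angle theta_ij (formula (6)) normalized by pi."""
    cos_t = np.dot(n_i, n_j) / (np.linalg.norm(n_i) * np.linalg.norm(n_j))
    theta = np.arccos(np.clip(cos_t, -1.0, 1.0))
    return alpha * abs(l_i - l_j) / l_max + (1 - alpha) * theta / np.pi
```

Clipping the cosine guards against floating-point values slightly outside [-1, 1] before the arccos.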
Step 3-2: merging the initial supervoxels. Before merging, each supervoxel is initialized as its own region A_i. With v_k and v_l supervoxels in region A_i, the internal dissimilarity is defined as the maximum edge weight of the minimum spanning tree of the region, as shown in formula (7):
Int(A_i) = max w(v_k, v_l),  v_k, v_l ∈ A_i, (v_k, v_l) ∈ MST(A_i, E)    (7)
The external dissimilarity of two regions A_i and A_j is defined by formula (8), where v_m is a supervoxel in region A_i and v_n is a supervoxel in region A_j:
Dif(A_i, A_j) = min w(v_m, v_n),  v_m ∈ A_i, v_n ∈ A_j, (v_m, v_n) ∈ E    (8)
MInt(A_i, A_j), the minimum internal dissimilarity of the two regions A_i and A_j, is then calculated by formula (9):
MInt(A_i, A_j) = min( Int(A_i) + τ(A_i),  Int(A_j) + τ(A_j) )    (9)
where τ(A_i) = e / |A_i|, |A_i| is the number of points contained in region A_i, and e is a preset constant;
The external dissimilarity of the two regions is then compared with their minimum internal dissimilarity: the two regions are merged if formula (10) is satisfied, and are left unmerged otherwise:

Dif(A_i, A_j) < MInt(A_i, A_j)    (10)

The region-merging process is repeated until no regions in P can be merged, yielding the variable-scale supervoxels.
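The merging of step 3-2 follows the Felzenszwalb-Huttenlocher graph-based scheme; below is a hedged union-find sketch (names are ours, and for brevity |A_i| here counts supervoxels rather than points as in the patent). Processing edges in increasing weight order means each region's maximum absorbed edge equals Int(A), and the first edge seen between two regions equals Dif:

```python
class Regions:
    """Union-find over supervoxel vertices; each root tracks the region
    size |A| and internal dissimilarity Int(A)."""
    def __init__(self, n, e_const):
        self.parent = list(range(n))
        self.size = [1] * n          # |A_i|, here in supervoxels
        self.internal = [0.0] * n    # Int(A_i)
        self.e = e_const             # the constant e in tau(A) = e/|A|

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def tau(self, r):
        return self.e / self.size[r]

    def merge_graph(self, edges):
        """edges: iterable of (weight, i, j); merge when formula (10),
        Dif < MInt, holds for the two current regions."""
        for w, i, j in sorted(edges):
            a, b = self.find(i), self.find(j)
            if a == b:
                continue
            # w is Dif(A_a, A_b): smallest edge between the two regions
            if w < min(self.internal[a] + self.tau(a),
                       self.internal[b] + self.tau(b)):
                self.parent[b] = a
                self.size[a] += self.size[b]
                self.internal[a] = max(self.internal[a], self.internal[b], w)
```

Because τ shrinks as regions grow, small regions merge easily (yielding large supervoxels where data are consistent) while large regions resist merging, which is what produces the variable-scale behavior.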
CN201710168730.6A 2017-03-21 2017-03-21 A kind of super voxel dividing method of RGB D image mutative scales Pending CN106997591A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710168730.6A CN106997591A (en) 2017-03-21 2017-03-21 A kind of super voxel dividing method of RGB D image mutative scales

Publications (1)

Publication Number Publication Date
CN106997591A true CN106997591A (en) 2017-08-01

Family

ID=59431712

Country Status (1)

Country Link
CN (1) CN106997591A (en)


Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101877128A (en) * 2009-12-23 2010-11-03 中国科学院自动化研究所 Method for segmenting different objects in three-dimensional scene
CN104616289A (en) * 2014-12-19 2015-05-13 西安华海盈泰医疗信息技术有限公司 Removal method and system for bone tissue in 3D CT (Three Dimensional Computed Tomography) image
CN105719295A (en) * 2016-01-21 2016-06-29 浙江大学 Intracranial hemorrhage area segmentation method based on three-dimensional super voxel and system thereof


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
PENG XU et al.: "Scale adaptive supervoxel segmentation of RGB-D image", 2016 IEEE International Conference on Robotics and Biomimetics (ROBIO) *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109146894A (en) * 2018-08-07 2019-01-04 庄朝尹 A kind of model area dividing method of three-dimensional modeling
CN109257543A (en) * 2018-11-29 2019-01-22 维沃移动通信有限公司 Screening-mode control method and mobile terminal
CN111488769A (en) * 2019-01-28 2020-08-04 北京工商大学 Unsupervised fusion point cloud superpixelization method based on light spot divergence size
CN109917419A (en) * 2019-04-12 2019-06-21 中山大学 A kind of depth fill-in congestion system and method based on laser radar and image
CN109917419B (en) * 2019-04-12 2021-04-13 中山大学 Depth filling dense system and method based on laser radar and image
CN110610505A (en) * 2019-09-25 2019-12-24 中科新松有限公司 Image segmentation method fusing depth and color information
CN110852949A (en) * 2019-11-07 2020-02-28 上海眼控科技股份有限公司 Point cloud data completion method and device, computer equipment and storage medium
CN112580641A (en) * 2020-11-23 2021-03-30 上海明略人工智能(集团)有限公司 Image feature extraction method and device, storage medium and electronic equipment
CN112580641B (en) * 2020-11-23 2024-06-04 上海明略人工智能(集团)有限公司 Image feature extraction method and device, storage medium and electronic equipment
CN112348816A (en) * 2021-01-07 2021-02-09 北京明略软件系统有限公司 Brain magnetic resonance image segmentation method, storage medium, and electronic device

Similar Documents

Publication Publication Date Title
CN106997591A (en) A kind of super voxel dividing method of RGB D image mutative scales
CN108573276B (en) Change detection method based on high-resolution remote sensing image
CN104331699B (en) A kind of method that three-dimensional point cloud planarization fast search compares
CN111462120B (en) Defect detection method, device, medium and equipment based on semantic segmentation model
CN103258203B (en) The center line of road extraction method of remote sensing image
CN102708370B (en) Method and device for extracting multi-view angle image foreground target
CN103413151B (en) Hyperspectral image classification method based on figure canonical low-rank representation Dimensionality Reduction
CN108230338B (en) Stereo image segmentation method based on convolutional neural network
CN104240244B (en) A kind of conspicuousness object detecting method based on communication mode and manifold ranking
CN107067405B (en) Remote sensing image segmentation method based on scale optimization
CN102982539B (en) Characteristic self-adaption image common segmentation method based on image complexity
CN112862792B (en) Wheat powdery mildew spore segmentation method for small sample image dataset
CN107369158B (en) Indoor scene layout estimation and target area extraction method based on RGB-D image
CN108229551B (en) Hyperspectral remote sensing image classification method based on compact dictionary sparse representation
CN110443809A (en) Structure sensitive property color images super-pixel method with boundary constraint
CN103310481A (en) Point cloud reduction method based on fuzzy entropy iteration
CN103093470A (en) Rapid multi-modal image synergy segmentation method with unrelated scale feature
CN107146219B (en) Image significance detection method based on manifold regularization support vector machine
CN103903275A (en) Method for improving image segmentation effects by using wavelet fusion algorithm
CN104408733A (en) Object random walk-based visual saliency detection method and system for remote sensing image
CN108388901B (en) Collaborative significant target detection method based on space-semantic channel
CN114299405A (en) Unmanned aerial vehicle image real-time target detection method
CN101383046B (en) Three-dimensional reconstruction method on basis of image
CN112330639A (en) Significance detection method for color-thermal infrared image
CN109345570A (en) A kind of multichannel three-dimensional colour point clouds method for registering based on geometry

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170801