CN105513070A - RGB-D salient object detection method based on foreground and background optimization - Google Patents


Info

Publication number
CN105513070A
Authority
CN
China
Prior art keywords
pixel
superpixel
saliency
foreground
RGB
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510897635.0A
Other languages
Chinese (zh)
Inventor
周圆
陈阳
崔波
霍树伟
侯春萍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University filed Critical Tianjin University
Priority to CN201510897635.0A priority Critical patent/CN105513070A/en
Publication of CN105513070A publication Critical patent/CN105513070A/en
Pending legal-status Critical Current

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 — Image analysis
    • G06T7/0002 — Inspection of images, e.g. flaw detection

Abstract

The invention discloses an RGB-D salient object detection method based on foreground and background optimization. The method comprises the following steps: initial foreground modeling is performed based on low-level feature contrast to obtain a superpixel-level initial saliency map; mid-level aggregation is applied to the superpixel-level initial saliency map to obtain a mid-level saliency map; a high-level prior is introduced into the mid-level saliency map to improve detection, generating a foreground probability; a boundary connectivity measure fusing depth information is computed and converted into a background probability; the foreground and background probabilities are jointly optimized to obtain an objective function; and the objective function is solved to obtain the optimal saliency map, realizing the detection of the salient object. The invention makes full use of an optimization framework based on foreground and background measures and of the scene's depth information, achieving a high recall rate with high precision. The method accurately locates salient objects across different scenes and object sizes, and assigns nearly uniform saliency values within the target object.

Description

An RGB-D salient object detection method based on foreground-background optimization
Technical field
The present invention relates to the field of computer vision detection, and in particular to RGB-D (color plus depth image) salient object detection that fuses depth information.
Background technology
In the field of computer vision, detecting and segmenting salient objects from natural scenes is an active problem that has spawned many significant applications. Most current salient object detection methods use color information together with various priors and obtain good results. Although scene depth plays a very important role in the human visual system, its effect on salient object detection has not yet been fully explored. With the recent appearance of a range of depth cameras, scene depth information has become much easier to acquire.
Scene depth is a good cue for saliency detection. Compared with 2D saliency detection, only a few works address saliency detection on RGB-D data. Desingh et al. [1] proposed a method to compute depth saliency and then fused the depth saliency with 2D saliency by support vector regression; however, this approach computes saliency from color and depth separately and ignores the correlation between the two. Ciptadi et al. [2] explicitly extract 3D layout, shape, and color features of the scene and simply define saliency as the dissimilarity between these features, but the result is not refined with depth and the detection performance is not very satisfactory. Peng et al. [3] built the first database dedicated to RGB-D salient object detection and proposed a multi-stage saliency detection algorithm, but its performance is still limited.
To date, no RGB-D salient object detection algorithm based on foreground-background optimization has appeared in published papers or literature at home or abroad.
Summary of the invention
The invention provides an RGB-D salient object detection method based on foreground-background optimization. The method makes full use of the depth information of the image and introduces a foreground-background optimization framework to achieve accurate detection of salient objects, as described below:
An RGB-D salient object detection method based on foreground-background optimization comprises the following steps:
Perform initial foreground modeling based on low-level feature contrast to obtain a superpixel-level initial saliency map; apply mid-level aggregation to the superpixel-level initial saliency map to obtain a mid-level saliency map;
Introduce a high-level prior into the mid-level saliency map to further improve detection, generating a foreground probability;
Compute a boundary connectivity measure fusing depth information, and convert the boundary connectivity into a background probability;
Jointly optimize the foreground probability and the background probability to obtain an objective function;
Solve the objective function to obtain the optimal saliency map, thereby detecting the salient object.
The detection method further comprises:
Given an input RGB-D image, segment it into superpixels with an over-segmentation algorithm.
The step of performing initial foreground modeling based on low-level feature contrast to obtain the superpixel-level initial saliency map is specifically:
Represent each superpixel as a feature vector; define multiple kinds of contextual information;
Compute a probability density for each superpixel from the feature vector, the contextual information, and a weighted Gaussian kernel density, and assign each probability density value to its superpixel to obtain the superpixel-level initial saliency map.
Further, the contextual information comprises:
1) the n_L nearest adjacent superpixels of the current superpixel;
2) all other superpixels in the image except the current superpixel;
3) the superpixels at the four corners of the image.
The step of applying mid-level aggregation to the superpixel-level initial saliency map to obtain the mid-level saliency map is specifically:
Obtain a weighting function from the estimated weight parameter and offset;
Apply spanning-tree processing to the superpixel-level initial saliency map using the weighting function together with a spanning-tree algorithm, obtaining the mid-level saliency map.
The step of obtaining the mid-level saliency map by spanning-tree processing of the superpixel-level initial saliency map is specifically:
Generate a saliency subset containing the m superpixels with the highest saliency values;
For each saliency seed in the subset, iteratively grow a tree by selecting the connecting edge with maximum weight and adding it to the spanning tree, then output the spanning tree;
Count the frequency with which each superpixel appears in the spanning trees and take it as the saliency value, generating the mid-level saliency map.
The step of introducing a high-level prior into the mid-level saliency map to further improve detection and generate the foreground probability is specifically:
Multiply the mid-level saliency map by a Gaussian prior model to obtain the high-level saliency map, in which the saliency value of each superpixel is the foreground probability obtained from the low-, mid-, and high-level foreground modeling.
The beneficial effects of the technical scheme provided by the invention are: the method makes full use of an optimization framework based on foreground and background measures and of the scene's depth information, achieving a very high recall rate with high precision. For different scene types and object sizes, the method accurately locates the salient object and assigns nearly uniform saliency values within the target object. It still performs well when the background and the salient region are hard to distinguish.
In addition, the method was objectively evaluated on the public NLPR image database, using four indices — precision, recall, area under the curve (AUC), and F-measure — to quantitatively measure the effect of RGB-D salient object detection, fully demonstrating the method's accuracy.
Brief description of the drawings
Fig. 1 is the flowchart of the RGB-D salient object detection method based on foreground-background optimization;
Fig. 2 compares the PR curves of this method and the reference algorithms;
Fig. 3 compares the ROC curves of this method and the reference algorithms;
Fig. 4 compares the AUC and F-measure of this method and the reference algorithms;
Fig. 5 compares qualitative results on RGB-D images.
Detailed description of the embodiments
To make the objects, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below.
Starting from the characteristics of salient object detection and a foreground-background optimization framework, this method studies how to perform effective foreground and background measurement. For the foreground measure, the method takes a multi-level approach that considers the low-, mid-, and high-level characteristics of the image to extract the most reliable foreground parts. For the background measure, the method models the image background with a proposed depth-fused boundary connectivity. The foreground and background measures are finally brought into a well-designed optimization function, and the saliency map is obtained by effective optimization.
Embodiment 1
An RGB-D salient object detection method based on foreground-background optimization, see Fig. 1, comprises the following steps:
101: Perform initial foreground modeling based on low-level feature contrast to obtain the superpixel-level initial saliency map; apply mid-level aggregation to the superpixel-level initial saliency map to obtain the mid-level saliency map;
102: Introduce a high-level prior into the mid-level saliency map to further improve detection, generating the foreground probability;
103: Compute the boundary connectivity fusing depth information, and convert the boundary connectivity into the background probability;
104: Jointly optimize the foreground probability and the background probability to obtain the objective function;
105: Solve the objective function to obtain the optimal saliency map, thereby detecting the salient object.
The detection method further comprises:
Given an input RGB-D image, segment it into superpixels with an over-segmentation algorithm.
In step 101, the step of performing initial foreground modeling based on low-level feature contrast to obtain the superpixel-level initial saliency map is specifically:
Represent each superpixel as a feature vector; define multiple kinds of contextual information;
Compute a probability density for each superpixel from the feature vector, the contextual information, and a weighted Gaussian kernel density, and assign each probability density value to its superpixel to obtain the superpixel-level initial saliency map.
The contextual information comprises:
1) the n_L nearest adjacent superpixels of the current superpixel;
2) all other superpixels in the image except the current superpixel;
3) the superpixels at the four corners of the image.
In step 101, the step of applying mid-level aggregation to the superpixel-level initial saliency map to obtain the mid-level saliency map is specifically:
Obtain a weighting function from the estimated weight parameter and offset;
Apply spanning-tree processing to the superpixel-level initial saliency map using the weighting function together with a spanning-tree algorithm, obtaining the mid-level saliency map.
Further, the step of obtaining the mid-level saliency map by spanning-tree processing of the superpixel-level initial saliency map is specifically:
Generate a saliency subset containing the m superpixels with the highest saliency values;
For each saliency seed in the subset, iteratively grow a tree by selecting the connecting edge with maximum weight and adding it to the spanning tree, then output the spanning tree;
Count the frequency with which each superpixel appears in the spanning trees and take it as the saliency value, generating the mid-level saliency map.
In step 102, the step of introducing a high-level prior into the mid-level saliency map to further improve detection and generate the foreground probability is specifically:
Multiply the mid-level saliency map by a Gaussian prior model to obtain the high-level saliency map, in which the saliency value of each superpixel is the foreground probability obtained from the low-, mid-, and high-level foreground modeling.
In summary, through steps 101-105 the embodiment of the present invention makes full use of the image's depth information and introduces a foreground-background optimization framework, achieving accurate detection of salient objects.
Embodiment 2
The scheme of Embodiment 1 is described in detail below with reference to Fig. 1 and concrete formulas:
201: Given an input RGB-D image, segment it into superpixels with an over-segmentation algorithm;
This step is specifically: each pixel in the image is represented as a six-dimensional feature vector. $[L,a,b]$ is the pixel's color in the CIELab color space (a color model published by the International Commission on Illumination in 1976), whose three channels are the lightness "L", the "a" channel running from red to green, and the "b" channel running from blue to yellow. $[x,y,z]$ is the pixel's spatial coordinate. The color distance $d_c$ and spatial distance $d_s$ between two pixels $(i,j)$ are then defined, and finally the distance metric $d$ between pixels is obtained.
Specifically,

$$d_c=\sqrt{(L_i-L_j)^2+(a_i-a_j)^2+(b_i-b_j)^2},\qquad d_s=\sqrt{(x_i-x_j)^2+(y_i-y_j)^2+(z_i-z_j)^2}$$

Combining the two in the SLIC manner gives the distance metric $d=\sqrt{d_c^2+\left(d_s/\sqrt{K/N}\right)^2 m^2}$, with which the RGB-D image can be segmented into superpixels.
Here $L_i$, $a_i$, $b_i$ are the lightness, a-channel, and b-channel values of the $i$-th pixel (and likewise $L_j$, $a_j$, $b_j$ for the $j$-th pixel); $x_i$, $y_i$, $z_i$ are the horizontal, vertical, and depth coordinates of the $i$-th pixel (and likewise for the $j$-th); $m$ is the shape-regularity coefficient; $K$ is the total number of pixels in the image; and $N$ is the number of superpixels to be generated.
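The distance metric above can be sketched in Python. The SLIC-style fusion of $d_c$ and $d_s$ with grid interval $\sqrt{K/N}$, and the default values of $m$, $K$, $N$, are illustrative assumptions; the patent text names these quantities but does not print the fused formula.

```python
import numpy as np

def combined_distance(p_i, p_j, m=10.0, K=640 * 480, N=500):
    """Combined color/space distance between two pixels, each a 6-D vector
    [L, a, b, x, y, z]. Assumes the standard SLIC fusion rule with grid
    interval S = sqrt(K / N); m, K, N defaults are illustrative."""
    p_i, p_j = np.asarray(p_i, float), np.asarray(p_j, float)
    d_c = np.linalg.norm(p_i[:3] - p_j[:3])   # CIELab color distance
    d_s = np.linalg.norm(p_i[3:] - p_j[3:])   # spatial (x, y, z) distance
    S = np.sqrt(K / N)                        # expected superpixel interval
    return np.sqrt(d_c ** 2 + (d_s / S) ** 2 * m ** 2)
```

With this metric, two pixels with identical color and position have distance 0, and a pure color difference reduces to the plain CIELab distance.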
202: After segmenting the RGB-D image into superpixels, perform initial foreground modeling based on low-level feature contrast to obtain the superpixel-level initial saliency map;
For each superpixel $P$, the feature vector is defined as $f=[c,l,r,d]$, where $c$ is the average color of all pixels in $P$ in the CIELab color space; $l$ is the center of $P$ on the image plane; $r$ is the size of the superpixel region, i.e. the number of pixels it contains; and $d$ is the mean depth of all pixels in $P$.
Multiple context sets $\psi^k$, $k\in\{L,G,B\}$, are defined: the local context $\psi^L$ contains the $n_L$ nearest adjacent superpixels of the current superpixel; the global context $\psi^G$ contains all other superpixels in the image except the current one; and the background context $\psi^B$ contains the superpixels at the four corners of the image.
Using the features and contexts defined above, the probability is estimated by weighted Gaussian kernel density modeling. The weighted Gaussian kernel density model is defined as:

$$p(P\mid\psi^k)=\hat p(f\mid F^k)=\frac{1}{n_k}\sum_{j=1}^{n_k}\alpha_j^k\, e^{-\frac{\|c-c_j^k\|^2}{2(\sigma_c^k)^2}}\, e^{-\frac{\|l-l_j^k\|^2}{2(\sigma_l^k)^2}}\, e^{-\frac{\|d-d_j^k\|^2}{2(\sigma_d^k)^2}}$$

where $p(P\mid\psi^k)$ is the potential probability density of superpixel $P$ under the given context set; $F^k$ is the feature-representation set corresponding to the superpixel set $\psi^k$; $\hat p(f\mid F^k)$ is the potential density of the feature vector $f=[c,l,r,d]$ of $P$ under $F^k$; $n_k$ is the number of superpixels in the context set; $c_j^k$, $l_j^k$, and $d_j^k$ are the average color, image-plane center, and mean depth of the $j$-th context superpixel; $\sigma_c^k$, $\sigma_l^k$, and $\sigma_d^k$ are the Gaussian bandwidths of the color, location, and depth terms; and $\alpha_j^k$ is a weight coefficient, defined as the size ratio between the target superpixel $P$ and the context superpixel.
Computing this weighted kernel density for each superpixel and assigning the resulting density value to it yields the superpixel-level initial saliency map.
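The weighted kernel-density estimate can be sketched as follows. Treating the weights $\alpha_j$ as context-to-target size ratios, and the bandwidth defaults, are our reading of the text, not values the patent fixes:

```python
import numpy as np

def kde_saliency(f, context_feats, context_sizes, target_size,
                 sigma_c=0.2, sigma_l=0.2, sigma_d=0.2):
    """Weighted Gaussian kernel density of one superpixel's feature
    f = (c, l, d) against a context set. context_feats is a list of
    (c_j, l_j, d_j) tuples; alpha_j is taken as the size ratio."""
    c, l, d = f
    n_k = len(context_feats)
    dens = 0.0
    for (c_j, l_j, d_j), r_j in zip(context_feats, context_sizes):
        alpha_j = r_j / target_size  # size-ratio weight (assumed reading)
        dens += (alpha_j
                 * np.exp(-np.sum((np.asarray(c) - np.asarray(c_j)) ** 2) / (2 * sigma_c ** 2))
                 * np.exp(-np.sum((np.asarray(l) - np.asarray(l_j)) ** 2) / (2 * sigma_l ** 2))
                 * np.exp(-(d - d_j) ** 2 / (2 * sigma_d ** 2)))
    return dens / n_k
```

A superpixel identical to every member of its context (with equal sizes) receives the maximum density 1, matching the intuition that density is high where the context is similar.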
203: Apply mid-level aggregation to the superpixel-level initial saliency map to obtain the mid-level saliency map.
Let $G=(\nu,\varepsilon,\rho)$ denote the weighted connectivity graph built on the superpixel segmentation of the RGB-D image: the vertices $\nu$ are the superpixels, and an edge $(P,Q)\in\varepsilon$ connects adjacent superpixels $P$ and $Q$. The weighting function $\rho:\varepsilon\to[0,1]$ assigns a weight to the edge connecting $P$ and $Q$; the method models $\rho$ with a logistic function [4]:

$$\rho_{P,Q}=\sigma(w^T x_{P,Q}+b),\qquad \sigma(h)=(1+\exp(-h))^{-1}$$

where $h$ stands for $w^T x_{P,Q}+b$; $x_{P,Q}$ is a feature vector capturing effective similarity and compatibility measures between $P$ and $Q$; and $\sigma$ is the sigmoid function [4]. The weight parameter $w$ and offset $b$ are learned from training data. For training, each superpixel is first labeled as foreground or background, with the criterion: a superpixel is assigned to the foreground if more than 80% of its pixels belong to the salient object, and to the background otherwise. For all superpixels in the image, edges judged to lie within the salient object are given a positive label ($y_{P,Q}=1$); all others are given a negative label ($y_{P,Q}=0$).
The weight parameter $w$ and offset $b$ are estimated by maximizing the following function:

$$\{w,b\}=\arg\max_{w,b}\sum_{\forall (P,Q)\in\varepsilon} y_{P,Q}\log\rho_{P,Q}+(1-y_{P,Q})\log(1-\rho_{P,Q})$$

where $\{w,b\}$ is the vector formed by the weight parameter $w$ and offset $b$; $y_{P,Q}$ is the assigned label; and $\rho_{P,Q}$ is the weight assigned to the edge connecting superpixels $P$ and $Q$.
The estimated $w$ and $b$ give the weighting function $\rho$, which is then combined with a spanning-tree algorithm to build the tree-generation model.
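The maximization above is ordinary logistic-regression training. The patent does not say how the optimum is reached, so the plain gradient-ascent fit below is an illustrative stand-in (learning rate and iteration count are arbitrary):

```python
import numpy as np

def sigmoid(h):
    return 1.0 / (1.0 + np.exp(-h))

def fit_edge_weights(X, y, lr=0.5, steps=2000):
    """Maximum-likelihood fit of (w, b) for rho = sigmoid(w^T x + b) by
    gradient ascent on the log-likelihood; an illustrative optimizer."""
    X, y = np.asarray(X, float), np.asarray(y, float)
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        rho = sigmoid(X @ w + b)
        grad = y - rho                      # gradient of the log-likelihood
        w += lr * X.T @ grad / len(y)
        b += lr * grad.mean()
    return w, b
```

On separable toy data the fitted $\rho$ assigns low weight to dissimilar pairs and high weight to similar ones, which is exactly the behavior the edge weighting needs.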
Given the weighted connectivity graph, the method aggregates the superpixel-level initial saliency map at the mid level to obtain the mid-level saliency map. Starting from a saliency seed, edges with larger weights are absorbed to form a partial spanning tree. The main steps of the mid-level aggregation of the initial saliency map are:
1. Generate a saliency subset $\{s_1,\dots,s_m\}$ containing the $m$ superpixels with the highest saliency values. This is done by thresholding the initial saliency map computed in the first stage and keeping the part above the threshold.
2. For each saliency seed $s_i$ in the subset, repeat the following: initialize the spanning tree $T_1^{(i)}$ with $s_i$; by Prim-style iterative tree aggregation [5], select the connecting edge $(P,Q)\in\varepsilon$ with maximum weight $\rho_{P,Q}$ and add it to $T_1^{(i)}$. When the termination condition is met, output the spanning tree $T^{(i)}$.
3. Count the frequency with which each superpixel appears in these spanning trees and take it as the saliency value, generating the saliency map.
In the first step, the method needs the set of saliency seeds. A threshold $T$ splits the superpixels of the initial saliency map into two groups, one with higher saliency values and one with lower; the higher-saliency group forms the seed set. In the iterative growth of the spanning tree, an effective termination condition is essential: it ensures that the resulting tree covers the whole salient object as far as possible while keeping out superpixels that are not inside the object. To this end the method designs a termination function with two components: (1) the probability $1-\rho_{P,Q}$ that the two superpixels joined by edge $(P,Q)$ do not belong to the same object; (2) the saliency contrast, i.e. the difference $S(P)-S(Q)$ of the saliency values from the low-level stage. The termination function is set to the mean of the two:

$$f_{P,Q}=(1-\rho_{P,Q}+S(P)-S(Q))/2$$

With this termination function, the spanning-tree algorithm checks the condition $f_{P,Q}>f_0$ at each iteration to decide whether edge $(P,Q)$ is brought into the tree; $f_0$ is a threshold parameter set at the start of the algorithm.
Following these steps, the superpixel-level initial saliency map is aggregated at the mid level into the mid-level saliency map.
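The tree growth for one seed can be sketched as below. The graph representation and the default $f_0$ are illustrative; the Prim-style greedy choice and the termination test follow the steps described above:

```python
def grow_tree(seed, edges, saliency, f0=0.5):
    """Grow one partial spanning tree from a saliency seed: repeatedly take
    the heaviest frontier edge (Prim-style) and stop when the termination
    score f = (1 - rho + S(P) - S(Q)) / 2 exceeds f0.
    `edges` maps each node to a list of (neighbor, rho) pairs;
    `saliency` maps each node to its low-level saliency value."""
    tree = {seed}
    while True:
        best = None
        for p in tree:                       # scan the current frontier
            for q, rho in edges.get(p, []):
                if q not in tree and (best is None or rho > best[2]):
                    best = (p, q, rho)
        if best is None:                     # no edges left to absorb
            break
        p, q, rho = best
        f = (1.0 - rho + saliency[p] - saliency[q]) / 2.0
        if f > f0:                           # edge too unreliable: terminate
            break
        tree.add(q)
    return tree
```

Running this over every seed and counting how often each superpixel appears in the resulting trees gives the mid-level saliency values of step 3.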
204: Introduce a high-level prior into the mid-level saliency map to further improve detection.
The embodiment of the present invention expresses this prior as a Gaussian model:

$$G(a)=\exp\!\left[-\left(\frac{(x_a-\mu_x)^2}{2\sigma_x^2}+\frac{(y_a-\mu_y)^2}{2\sigma_y^2}+\frac{(z_a-\mu_z)^2}{2\sigma_z^2}\right)\right]$$

where $(x_a,y_a,z_a)$ are the coordinates of pixel $a$ on the normalized image plane $X$-$Y$ and depth range $Z$, and $(\mu_x,\mu_y,\mu_z)$ is the mean of the coordinates of all pixels in the saliency-seed set, i.e. the center of the salient object. To account for object size, the method sets the standard deviations $(\sigma_x,\sigma_y,\sigma_z)$ to $(2o_x,2o_y,2o_z)$, where $o_x$, $o_y$, and $o_z$ are the spreads of the saliency subset's coordinates along the $X$, $Y$, and $Z$ axes respectively.
The final high-level saliency map is obtained by multiplying the mid-level saliency map by this Gaussian prior. The saliency value of each superpixel in it is the foreground probability produced by the low-, mid-, and high-level foreground modeling.
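The prior and its application can be sketched as follows. Reading the per-axis spread $o$ as the standard deviation of the seed coordinates is our interpretation of the (garbled) original text:

```python
import numpy as np

def gaussian_prior(coords, seed_coords):
    """High-level location prior: a 3-D Gaussian centred on the mean
    (x, y, z) of the saliency-seed superpixels, with per-axis sigma set
    to twice the seeds' coordinate spread (std, an assumed reading)."""
    coords = np.asarray(coords, float)        # N x 3, normalized (x, y, z)
    seeds = np.asarray(seed_coords, float)
    mu = seeds.mean(axis=0)                   # center of the salient object
    sigma = 2.0 * seeds.std(axis=0) + 1e-6    # avoid division by zero
    return np.exp(-np.sum((coords - mu) ** 2 / (2 * sigma ** 2), axis=1))

def high_level_map(mid_level, coords, seed_coords):
    """Multiply the mid-level saliency by the Gaussian prior."""
    return np.asarray(mid_level, float) * gaussian_prior(coords, seed_coords)
```

The prior equals 1 at the seed center and decays with distance, so multiplication suppresses mid-level responses far from the object.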
205: Compute the boundary connectivity fusing depth information;
First connect all pairs of adjacent superpixels $(p,q)$ and assign them color and depth weights, building an undirected weighted graph. The color weight $d_{color}(p,q)$ and depth weight $d_{depth}(p,q)$ are computed as the Euclidean distances between the two adjacent superpixels' average colors and average normalized depths, respectively; color is computed in the CIE-Lab color space. The geodesic color distance between any two superpixels [6] is defined as:

$$d_{clr}(p,q)=\min_{p_1=p,\,p_2,\dots,p_n=q}\sum_{i=1}^{n-1}d_{color}(p_i,p_{i+1})$$

where the minimum is taken over paths from $p$ to $q$ along graph edges, and $d_{color}(p_i,p_{i+1})$ is the Euclidean distance between the average colors of superpixels $p_i$ and $p_{i+1}$.
The geodesic depth distance between two superpixels is defined analogously:

$$d_{dep}(p,q)=\min_{p_1=p,\,p_2,\dots,p_n=q}\sum_{i=1}^{n-1}d_{depth}(p_i,p_{i+1})$$

where $d_{depth}(p_i,p_{i+1})$ is the Euclidean distance between the average normalized depths of superpixels $p_i$ and $p_{i+1}$.
Then, considering both color and depth, the embodiment of the present invention uses the following formula for the "spanning area" [7] of each superpixel:

$$A(p)=\sum_{i=1}^{N}e^{-\frac{d_{clr}^2(p,p_i)}{2\sigma_{clr}^2}-\frac{d_{dep}^2(p,p_i)}{2\sigma_{dep}^2}}=\sum_{i=1}^{N}S(p,p_i)$$

where $A(p)$ is the spanning area [7] of superpixel $p$; $S(p,p_i)\in(0,1]$ is the contribution of superpixel $p_i$ to the spanning area of $p$; and $d_{clr}$ and $d_{dep}$ are the geodesic color [6] and depth distances defined above.
When $p_i$ and $p$ belong to the same region, $d_{clr}(p,p_i)=0$ and $d_{dep}(p,p_i)=0$, so $S(p,p_i)=1$, guaranteeing that $p_i$ contributes a unit of area to the spanning area of $p$. When $p_i$ and $p$ do not belong to the same region, the shortest path between them contains at least one edge whose color or depth weight is very large, so at least one of $d_{clr}(p,p_i)$ and $d_{dep}(p,p_i)$ is much larger than 0, guaranteeing that $p_i$ adds little to the spanning area of $p$. $\sigma_{clr}$ and $\sigma_{dep}$ are bandwidth parameters; for stable results they are set to 10 and 0.5 respectively in the experiments. Finally, the depth-fused boundary connectivity is computed as:

$$Con(p)=\frac{\sum_{i=1}^{N}S(p,p_i)\,\delta(p_i\in Bnd)}{\sqrt{A(p)}}$$

where $Con(p)$ is the boundary connectivity of superpixel $p$; $Bnd$ is the set of superpixels on the image boundary; and $\delta(\cdot)$ takes the value 1 for superpixels on the boundary and 0 for all others.
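The computation can be sketched as follows. For brevity a single edge weight is assumed to already fuse the color and depth costs, and the square-root normalization of the spanning area follows the usual boundary-connectivity definition (the garbled original is ambiguous on this point):

```python
import heapq
import numpy as np

def geodesic_dists(adj, src, n):
    """Dijkstra over the superpixel graph; adj = {node: [(nbr, weight)]},
    where weight is assumed to fuse the color and depth edge costs."""
    dist = np.full(n, np.inf)
    dist[src] = 0.0
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def boundary_connectivity(adj, boundary, n, sigma=10.0):
    """Con(p): boundary share of the spanning area, normalized by its
    square root (assumed normalization); boundary lists boundary nodes."""
    boundary = np.asarray(boundary)
    con = np.empty(n)
    for p in range(n):
        S = np.exp(-geodesic_dists(adj, p, n) ** 2 / (2 * sigma ** 2))
        con[p] = S[boundary].sum() / np.sqrt(S.sum())
    return con
```

A superpixel cut off from the boundary by a high-cost edge receives near-zero connectivity, while a boundary superpixel receives a large value, as the text describes.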
206: Convert the boundary connectivity into a background probability, introducing the variable $w_i^{bg}$.
According to the boundary connectivity formula in step 205, when superpixel $p$ lies on the image boundary the computed connectivity $Con(p)$ is far larger than 1, and when $p$ lies at the image center the computed $Con(p)$ is close to zero.
The embodiment of the present invention adopts the following mapping function to convert the boundary connectivity into the background probability:

$$w_i^{bg}=1-e^{-\frac{Con^2(p_i)}{2\sigma_{Con}^2}}$$

where $w_i^{bg}$ is the background probability of superpixel $p_i$; $Con(p_i)$ is the boundary connectivity of $p_i$; and $\sigma_{Con}$ is a bandwidth parameter, set to 1 in the experiments.
With this conversion formula, when superpixel $p$ lies on the image boundary, $Con(p)\gg 1$ and the background probability approaches 1; when $p$ lies at the image center, $Con(p)\approx 0$ and the background probability approaches 0. The foreground and background of the image are thus effectively separated, and the background probability $w_i^{bg}$ of each superpixel is obtained.
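The mapping function is a one-liner; the sketch below uses the $\sigma_{Con}=1$ value given in the text:

```python
import numpy as np

def background_prob(con, sigma_con=1.0):
    """Map boundary connectivity to background probability:
    w_bg = 1 - exp(-Con^2 / (2 * sigma_con^2)). With sigma_con = 1,
    Con >> 1 gives w_bg near 1 and Con near 0 gives w_bg near 0."""
    con = np.asarray(con, float)
    return 1.0 - np.exp(-con ** 2 / (2 * sigma_con ** 2))
```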
207: Jointly optimize the foreground probability generated in step 204 and the background probability generated in step 206.
The embodiment of the present invention builds an objective function that favors assigning saliency value 1 to object regions and saliency value 0 to background regions; optimizing this objective yields the optimal saliency map. Let the saliency values of all $N$ superpixels be $\{s_i\}_{i=1}^N$ and define the objective function:

$$E=\sum_{i=1}^{N}w_i^{bg}s_i^2+\sum_{i=1}^{N}w_i^{fg}(s_i-1)^2+\lambda_c\sum_{i,j}w_{i,j}^{c}(s_i-s_j)^2+\lambda_d\sum_{i,j}w_{i,j}^{d}(s_i-s_j)^2$$

where $w_i^{fg}$ is the foreground probability obtained from the multi-level foreground modeling; $w_i^{bg}$ is the background probability of superpixel $p_i$; $\lambda_c$ and $\lambda_d$ are Lagrange multipliers, both set to 0.5; and $s_i$ and $s_j$ are the saliency values of the $i$-th and $j$-th superpixels. The four terms in the formula come from four different constraints.
For the first term: when the probability $w_i^{bg}$ that a superpixel belongs to the background is large, keeping the whole $E$ as small as possible forces $s_i$ to be small, so the superpixel is assigned a saliency value close to 0. Similarly, the second term assigns a saliency value close to 1 when the probability that a superpixel belongs to the foreground is large. The last two terms are smoothness constraints: they push superpixels belonging to the same region toward similar saliency values and superpixels in different regions toward different values, ensuring the smoothness and continuity of saliency values across the image. The two weights $w_{i,j}^{c}$ and $w_{i,j}^{d}$ are the color and depth weights between two adjacent superpixels, expressed as:

$$w_{i,j}^{c}=e^{-\frac{d_{color}^2(p_i,p_j)}{2\sigma_{clr}^2}}+\mu,\qquad w_{i,j}^{d}=e^{-\frac{d_{depth}^2(p_i,p_j)}{2\sigma_{dep}^2}}$$

where $d_{depth}(p_i,p_j)$ is the Euclidean distance between the average normalized depths of superpixels $p_i$ and $p_j$; $d_{color}(p_i,p_j)$ is the Euclidean distance between their average colors; and $\mu=0.1$ is a small regularization constant that keeps the result stable when the foreground and background probabilities are 0 and suppresses noise that may arise in the foreground and background modeling.
208: solve the objective function of definition in step 207, the embodiment of the present invention adopts lowest mean square solution.
The optimization objective can be expressed as:

S^* = \arg\min_S \left\{ E = \sum_{i=1}^{N} w_i^{bg} s_i^2 + \sum_{i=1}^{N} w_i^{fg} (s_i - 1)^2 + \sum_{i,j} (\lambda_c w_{i,j}^{c} + \lambda_d w_{i,j}^{d}) (s_i - s_j)^2 \right\}
For ease of calculation, define the Laplacian matrix L = D - W, where D is a diagonal matrix with diagonal elements d_{ii} = \sum_j w_{i,j}, and w_{i,j} are the elements of the weight matrix W. The energy function can be written in matrix form as follows:
E(S) = w_{bg} \cdot S^2 + w_{fg} \cdot (S - I)^2 + 2 S L S^T
where w_{bg} is the vector of super-pixel background probabilities; w_{fg} is the vector of foreground probabilities obtained from the multi-layer foreground modeling; S is the vector of super-pixel saliency values; T denotes transposition; L is the Laplacian matrix of the constructed graph. Differentiating with respect to S gives the following derivation:
\nabla_S E(S) = \nabla_S \left( w_{bg} \cdot S^2 + w_{fg} \cdot (S - I)^2 + 2 S L S^T \right) = \nabla_S \left( w_{bg} \cdot S^2 + w_{fg} \cdot (S^2 - 2S + I) + 2 S L S^T \right) = 2 w_{bg} S + 2 w_{fg} S - 2 w_{fg} + 4 S L
where \nabla_S denotes differentiation with respect to S.
Setting the derivative to 0 gives: w_{fg} = (w_{bg} + w_{fg} + 2L) S^*
where S^* is the required least-squares solution, finally expressed as: S^* = (w_{bg} + w_{fg} + 2L)^{-1} w_{fg}
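Assuming the combined affinity matrix W = \lambda_c W^c + \lambda_d W^d has already been built, the closed-form solve above can be sketched with NumPy. This is a sketch, not the patent's implementation; the probability vectors enter the system matrix as diagonals:

```python
import numpy as np

def solve_saliency(w_bg, w_fg, W):
    """Closed-form saliency S* = (W_bg + W_fg + 2L)^(-1) w_fg.

    w_bg, w_fg : (N,) per-super-pixel background/foreground probabilities,
                 used as diagonal matrices in the system.
    W          : (N, N) symmetric affinity matrix, assumed to already
                 combine the color and depth weights.
    """
    L = np.diag(W.sum(axis=1)) - W              # graph Laplacian L = D - W
    A = np.diag(w_bg + w_fg) + 2.0 * L          # system matrix
    return np.linalg.solve(A, w_fg)             # solve instead of explicit inverse
```

With W = 0 the solution reduces to w_fg / (w_bg + w_fg) per super-pixel, and a strong smoothness weight pulls the saliency values of connected super-pixels toward each other.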
After the saliency value of each super-pixel is obtained, it is assigned to every pixel that the super-pixel contains, so that each pixel in the image receives a saliency value. Through the objective function in step 207, this method aims to assign saliency value 1 to the object region and saliency value 0 to the background region; by optimizing this objective function, the optimal saliency map is obtained.
In summary, through the above steps 201-208, the embodiment of the present invention makes full use of the depth information of the image and introduces an optimization framework based on foreground and background, achieving accurate detection of salient objects.
Embodiment 3
This method mainly adopts four indicators, namely precision, recall, area under the curve (AUC) and F-measure, to quantitatively measure the performance of RGB-D salient object detection.
In statistics, the ROC curve (Receiver Operating Characteristic curve) characterizes the performance of a binary classifier as its threshold varies. To better quantify the quality expressed by the ROC curve, the embodiment of the present invention additionally uses the area under the curve (AUC), which in simple terms is the proportion that the area under the ROC curve occupies in the whole coordinate plane. The PR (Precision-Recall) curve is another standard for evaluating binary classifier performance. For saliency detection, precision at a given threshold is the number of pixels in the intersection of the obtained saliency map and the ground-truth saliency map, divided by the number of salient-object pixels in the obtained saliency map. Recall is the same intersection count divided by the number of salient-object pixels in the ground-truth saliency map. In general, precision falls as recall increases. The PR curve can likewise be summarized into a single value by an appropriate formula.
F-measure is an effective summary metric, defined as the weighted harmonic mean of precision and recall: F_\beta = \frac{(1 + \beta^2) \cdot precision \cdot recall}{\beta^2 \cdot precision + recall}, where \beta^2 = 0.3.
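The precision, recall and F-measure definitions above translate directly into code. A minimal sketch, assuming the binarized saliency and ground-truth maps are flattened into 0/1 sequences (a representational assumption for illustration):

```python
def precision_recall_f(pred_mask, gt_mask, beta2=0.3):
    """Pixel-wise precision, recall and F-measure for a binarized
    saliency map against a ground-truth mask (0/1 sequences)."""
    tp = sum(p and g for p, g in zip(pred_mask, gt_mask))   # intersection pixels
    pred_pos = sum(pred_mask)                               # salient pixels predicted
    gt_pos = sum(gt_mask)                                   # salient pixels in truth
    precision = tp / pred_pos if pred_pos else 0.0
    recall = tp / gt_pos if gt_pos else 0.0
    denom = beta2 * precision + recall
    f = (1 + beta2) * precision * recall / denom if denom else 0.0
    return precision, recall, f
```

Sweeping the binarization threshold over the saliency map and recording (precision, recall) pairs yields the PR curve described above.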
1. Quantitative result analysis
This method is tested on a public database, built by Peng et al. [3] from 1000 images extracted from the NLPR image library and manually annotated with precise ground-truth object boundaries for algorithm evaluation. Fig. 2 and Fig. 3 show the quantitative comparison of this method on the PR (precision-recall) and ROC curves; this method is labeled 'Opti'. In the PR curve plot, the horizontal axis represents recall and the vertical axis represents precision; at the same recall, a higher precision on the curve indicates a better detection result. It can be seen that this method outperforms the other compared methods on the PR index. In the ROC curve plot, the horizontal axis is the ratio of pixels wrongly judged as salient to the number of salient-object pixels in the ground-truth map, and the vertical axis is the ratio of correctly detected salient-object pixels to the number of salient-object pixels in the ground-truth map. Likewise, for a fixed abscissa, a higher ordinate indicates a better detection result. It can be seen that this method is clearly better than the multi-stage method, and on the same level as the learning-based method and the Bayesian-fusion-based method.
In addition, this method is also compared on the area under the curve (AUC) and F-measure; see Fig. 4. It can be seen that this method is clearly better than the MR_D [8] method, the DSR_D [9] method and the multi-stage method, and very close to the results of the learning-based and Bayesian-fusion-based methods. This is because the extended 2D methods process color and depth information separately, and thus ignore the strong complementarity between appearance cues and the corresponding depth cues.
2. Qualitative result analysis
Fig. 5 shows a qualitative comparison on several RGB-D images. As can be observed, for different scene types and object sizes, this method accurately localizes the salient object and assigns nearly equal saliency values within the target object, and it still works well when the background and the salient region are hard to distinguish. Fig. 5-a shows the original RGB images, Fig. 5-b the original depth maps, Fig. 5-c the MR_D [8] method, Fig. 5-d the DSR_D [9] method, Fig. 5-e the multi-stage method, Fig. 5-f the learning-based method, and Fig. 5-g the results obtained by this method. The salient object detection result of this method is clearly better than the MR_D [8], DSR_D [9] and multi-stage methods: the salient object is lit up uniformly. As can be seen from the third and last test images, this method produces more accurate results than the learning-based method, while effectively suppressing the background region when the foreground object is accurately detected. Even in complex scenes, or when the foreground object differs little from the background, this method can still effectively assign higher saliency values to the foreground object while suppressing background information.
List of references
[1] K. Desingh, K. M. Krishna, C. V. Jawahar, and D. Rajan, Depth really matters: Improving visual salient region detection with depth, Proceedings of British Machine Vision Conference, 2013.
[2] A. Ciptadi, T. Hermans, and J. M. Rehg, An in Depth View of Saliency, Proceedings of British Machine Vision Conference, 2013.
[3] H. Peng, B. Li, W. Xiong, W. Hu, and R. Ji, RGBD Salient Object Detection: A Benchmark and Algorithms, Proceedings of European Conference on Computer Vision, 2014.
[4] Weisstein, Eric W., Logistic Equation, MathWorld.
[5] R. C. Prim, Shortest connection networks and some generalizations, Bell System Technical Journal, 1957, 36(6): 1389-1401.
[6] W. Zhu, S. Liang, Y. Wei, and J. Sun, Saliency Optimization from Robust Background Detection, Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2014.
[7] U. Rutishauser, D. Walther, C. Koch, et al., Is bottom-up attention useful for object recognition, Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2004, 37-44.
[8] C. Yang, L. Zhang, H. Lu, X. Ruan, and M.-H. Yang, Saliency detection via graph-based manifold ranking, Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2013, 3166-3173.
[9] X. Li, H. Lu, L. Zhang, X. Ruan, and M.-H. Yang, Saliency detection via dense and sparse reconstruction, Proceedings of IEEE International Conference on Computer Vision, 2013, 2976-2983.
It will be appreciated by those skilled in the art that the accompanying drawings are schematic diagrams of a preferred embodiment, and that the above embodiment numbers of the present invention are for description only and do not represent the superiority of the embodiments.
The foregoing is only a preferred embodiment of the present invention and is not intended to limit the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.

Claims (7)

1. An RGB-D salient object detection method based on foreground-background optimization, characterized in that the salient object detection method comprises the following steps:
performing initial foreground modeling based on low-level feature contrast to obtain a super-pixel-level initial saliency map; performing middle-level aggregation processing on the super-pixel-level initial saliency map to obtain a middle-level saliency map;
introducing a high-level prior into the middle-level saliency map to further improve the detection result, generating a foreground probability;
computing an edge connectivity that fuses depth information, and converting the edge connectivity into a background probability;
optimizing the foreground probability and the background probability to obtain an objective function;
solving the objective function to obtain an optimal saliency map, realizing the detection of the salient object.
2. The RGB-D salient object detection method based on foreground-background optimization according to claim 1, characterized in that the salient object detection method further comprises:
for a given input RGB-D image, segmenting the RGB-D image into super-pixels by an over-segmentation algorithm.
3. The RGB-D salient object detection method based on foreground-background optimization according to claim 1, characterized in that the step of performing initial foreground modeling based on low-level feature contrast to obtain the super-pixel-level initial saliency map is specifically:
representing each super-pixel by a feature vector; defining multiple kinds of contextual information;
computing a probability density for each super-pixel from the feature vector, the contextual information and a weighted Gaussian kernel density, and assigning the corresponding probability density value to the corresponding super-pixel to obtain the super-pixel-level initial saliency map.
4. The RGB-D salient object detection method based on foreground-background optimization according to claim 3, characterized in that the contextual information comprises:
1) the n_l adjacent super-pixels nearest to the current super-pixel;
2) all super-pixels in the image except the current super-pixel;
3) the super-pixels at the four corners of the image.
5. The RGB-D salient object detection method based on foreground-background optimization according to claim 1, characterized in that performing middle-level aggregation processing on the super-pixel-level initial saliency map to obtain the middle-level saliency map is specifically:
obtaining a weighting function from the estimated weight parameter and offset;
performing spanning-tree processing on the super-pixel-level initial saliency map with the weighting function combined with a spanning-tree algorithm, obtaining the middle-level saliency map.
6. The RGB-D salient object detection method based on foreground-background optimization according to claim 5, characterized in that the step of obtaining the middle-level saliency map by performing spanning-tree processing on the super-pixel-level initial saliency map is specifically:
generating a saliency subset containing the m super-pixels with the highest saliency values;
for each saliency seed in the saliency subset, iteratively growing a tree by selecting the connecting edge with the maximum weight and adding it to the spanning tree, then outputting the spanning tree;
computing the frequency with which each super-pixel appears in the spanning trees and taking this as its saliency value, generating the middle-level saliency map.
7. The RGB-D salient object detection method based on foreground-background optimization according to claim 1, characterized in that introducing a high-level prior into the middle-level saliency map to further improve the detection result and generating the foreground probability is specifically:
multiplying the middle-level saliency map by a Gaussian-model prior to obtain a high-level saliency map, in which the saliency value of each super-pixel is the foreground probability obtained by the low-level, middle-level and high-level foreground modeling.
CN201510897635.0A 2015-12-07 2015-12-07 RGB-D salient object detection method based on foreground and background optimization Pending CN105513070A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510897635.0A CN105513070A (en) 2015-12-07 2015-12-07 RGB-D salient object detection method based on foreground and background optimization


Publications (1)

Publication Number Publication Date
CN105513070A true CN105513070A (en) 2016-04-20


Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106327507A (en) * 2016-08-10 2017-01-11 南京航空航天大学 Color image significance detection method based on background and foreground information
CN106780582A (en) * 2016-12-16 2017-05-31 西安电子科技大学 Based on the image significance detection method that textural characteristics and color characteristic are merged
CN106815843A (en) * 2016-11-30 2017-06-09 江苏城乡建设职业学院 A kind of fruit object acquisition methods based on convex closure center priori and absorbing Marcov chain
CN106991669A (en) * 2017-03-14 2017-07-28 北京工业大学 A kind of conspicuousness detection method based on depth-selectiveness difference
CN107085848A (en) * 2017-04-20 2017-08-22 安徽大学 A kind of detection method of RGB D figure conspicuousnesses
CN107274419A (en) * 2017-07-10 2017-10-20 北京工业大学 A kind of deep learning conspicuousness detection method based on global priori and local context
CN107424161A (en) * 2017-04-25 2017-12-01 南京邮电大学 A kind of indoor scene image layout method of estimation by thick extremely essence
CN107766857A (en) * 2017-10-17 2018-03-06 天津大学 The vision significance detection algorithm propagated based on graph model structure with label
CN107895162A (en) * 2017-10-17 2018-04-10 天津大学 Saliency algorithm of target detection based on object priori
CN108122239A (en) * 2016-11-29 2018-06-05 Sap欧洲公司 Use the object detection in the image data of depth segmentation
CN108154488A (en) * 2017-12-27 2018-06-12 西北工业大学 A kind of image motion ambiguity removal method based on specific image block analysis
CN108154107A (en) * 2017-12-22 2018-06-12 北京航空航天大学 A kind of method of the scene type of determining remote sensing images ownership
CN108416347A (en) * 2018-01-04 2018-08-17 天津大学 Well-marked target detection algorithm based on boundary priori and iteration optimization
CN109409376A (en) * 2018-11-05 2019-03-01 昆山紫东智能科技有限公司 For the image partition method, terminal and storage medium of solid waste object
CN110096961A (en) * 2019-04-04 2019-08-06 北京工业大学 A kind of indoor scene semanteme marking method of super-pixel rank
CN111461150A (en) * 2019-01-18 2020-07-28 上海大学 Unsupervised learning object appearance algorithm

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JING SUN et al.: "Saliency Detection Based on Integration of Boundary and Soft-Segmentation", ICIP 2012 *
CUI Bo: "Salient Object Detection Based on RGB-D Information", Wanfang Data enterprise knowledge service platform *



Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20160420