CN105046689A

CN105046689A - Method for fast segmenting interactive stereo image based on multilayer graph structure

Info

Publication number: CN105046689A
Application number: CN201510354774.9A
Authority: CN
Inventors: 马伟; 邱晓慧; 杨璐维; 邓米克; 张明亮; 段立娟
Original assignee: Beijing University of Technology
Current assignee: Beijing University of Technology
Priority date: 2015-06-24
Filing date: 2015-06-24
Publication date: 2015-11-11
Anticipated expiration: 2035-06-24
Also published as: CN105046689B

Abstract

Provided is a fast interactive stereo image segmentation method based on a multilayer graph structure. The method comprises: inputting a stereo image pair and obtaining disparity maps by means of a stereo matching algorithm; specifying part of the foreground and background in either the left or the right original image; building prior statistical models of the foreground and background color and disparity distributions from the specified pixels using CUDA parallel computation; and applying Gaussian filtering and down-sampling to the original images to obtain smaller, coarser images, which together with the original images form a multilayer graph structure. To address the complex segmentation models and low computational efficiency of conventional stereo image segmentation methods, the invention provides a new segmentation method within the theoretical framework of disparity-map-based synchronous stereo image segmentation. The method reduces model complexity, processes computation-intensive tasks in parallel, increases stereo image segmentation speed, and achieves the goal of segmenting stereo images of ordinary size.

Description

A fast interactive stereo image segmentation method based on a multilayer graph structure

Technical field

The invention belongs to the interdisciplinary field of image processing, computer graphics and computer vision, and relates to a fast interactive stereo image segmentation method based on a multilayer graph structure.

Background art

3D technology has developed rapidly in recent years. From 3D television to 3D film, it has created an urgent demand for 3D content creation and 3D editing tools. Interactive stereo image segmentation is an important part of this work and a key step in many applications, such as object recognition, tracking, image classification, image editing and image reconstruction. Stereo image segmentation is already used in practice, for example for segmenting and analyzing organs in medical images, tracking objects and understanding scenes. The efficiency of stereo image segmentation has therefore become an important research direction.

Compared with single-image segmentation, intelligent interactive segmentation of stereo images started late. Current methods face two main challenges: accuracy and speed. The two conflict, and a good balance between them is hard to reach. Much effort has gone into improving accuracy. In "StereoCut: Consistent Interactive Object Selection in Stereo Image Pairs", published at ICCV 2011, Price et al. use the disparity information between the two images of a stereo pair to improve segmentation accuracy. Their method incorporates the color, gradient and disparity of each pixel into conventional graph cut theory and obtains an optimized boundary in the stereo images by solving a max-flow problem. Although its segmentation accuracy is high, the segmentation model it builds has a huge number of edges and nodes, so computation is complex and inefficient. Most current segmentation algorithms improve speed by changing the implementation of the graph cut algorithm. Given the large number of pixels and the complex edge structure of stereo images, changing only the graph cut implementation cannot solve the problem fundamentally. Moreover, stereo image segmentation contains many computation-intensive single-instruction multiple-data (SIMD) tasks. Conventional methods do not exploit the fact that these tasks can run in parallel; they process them serially, which consumes a great deal of time and makes segmentation inefficient.

Summary of the invention

Current stereo image segmentation suffers from complex segmentation models and low computational efficiency. The present invention therefore explores a new segmentation method within the theoretical framework of disparity-map-based synchronous stereo image segmentation. It aims to reduce model complexity, process computation-intensive tasks in parallel and increase segmentation speed, so as to segment stereo images of ordinary size in real time.

To achieve this, the technical scheme of the invention is as follows. First, a stereo image pair is input and disparity maps are obtained by a stereo matching algorithm. The user specifies part of the foreground and background in either the left or the right original image. From the specified pixels, prior statistical models of the foreground and background color and disparity distributions are built using CUDA parallel computation. Gaussian filtering and down-sampling are applied to the original images to obtain smaller, coarser images, and the coarse images together with the original images form a multilayer graph structure. On this basis, constraints such as color, gradient and disparity in the multilayer graph structure are formalized within the graph cut framework to construct an energy function. To improve efficiency, CUDA parallel computation is applied during graph construction. The globally optimal result for the multilayer graph is solved with a max-flow/min-cut algorithm. The pixels with larger error at the boundary are then collected and locally optimized using conventional graph cut theory. The results of global processing and local optimization are merged to form the final segmentation. If the result is not satisfactory, the user can continue to mark erroneous regions in the image until the desired result is obtained.

Compared with the prior art, the invention has the following advantages. By constructing a stereo image segmentation model based on a multilayer graph structure, it reduces the complexity of the edges and significantly increases processing speed. In addition, it processes several computation-intensive SIMD tasks in parallel with CUDA, saving a large amount of time. Experiments show that, compared with existing methods and with the same amount of user interaction, the method significantly increases segmentation speed with little change in segmentation accuracy and consistency.

Brief description of the drawings

Fig. 1 is a flow chart of the method of the invention;

Fig. 2 shows experimental results for an application example of the invention: (a) and (b) are the input left and right images; (c) and (d) are the results of the method in "StereoCut: Consistent Interactive Object Selection in Stereo Image Pairs", published by Price et al. at ICCV 2011; (e) and (f) are the results of the present invention. The user input for the two methods is shown in (c) and (e), where the first kind of stroke marks the foreground and the second kind marks the background. The segmentation accuracy and segmentation time of both methods are also given. The laptop used for the experiments in this embodiment has an Intel(R) Pentium(R) CPU B950 at 2.10 GHz and an NVIDIA GeForce GT 540M GPU.

Embodiment

The invention is further described below with reference to the drawings and a specific embodiment.

The flow of the invention is shown in Fig. 1 and comprises the following steps:

Step 1: match the stereo images.

Read in a stereo image pair I = {I^l, I^r}, where I^l and I^r denote the left and right images respectively. The disparity maps of the left and right images, denoted D^l and D^r respectively, are computed by a stereo matching algorithm. The stereo matching algorithm used is the one proposed by Felzenszwalb et al. in the paper "Efficient Belief Propagation for Early Vision", published at CVPR 2004.

Step 2: add foreground and background cues.

The user specifies part of the foreground and background in either image through the provided interface. This embodiment uses a method similar to that of "StereoCut: Consistent Interactive Object Selection in Stereo Image Pairs", published by Price et al. at ICCV 2011: with an input device such as a mouse, touch screen or stylus, the user draws strokes of different colors on the image to mark some foreground and background pixels. As shown in Fig. 2(e), pixels covered by the first kind of stroke belong to the foreground and pixels covered by the second kind belong to the background. The subsequent steps of the invention do not depend on how the foreground and background pixels are specified in this step; other ways of specifying them can also be used.

Step 3: build the color and disparity prior models of the foreground and background.

Let F denote the set of foreground pixels specified by the user and B the set of background pixels specified by the user. The prior models of the foreground and background color and disparity can be expressed as a GMM, a histogram or multiple clusters. The invention uses the multi-cluster form, obtaining the clusters from statistics of the color and disparity of the corresponding pixel sets. To increase processing speed, a CUDA-based parallel K-means algorithm is used to cluster the color values and the disparity values of the pixels in F and B separately. The color model is processed as follows: each thread handles one pixel, computes the distance from that pixel to every foreground or background cluster, selects the smallest distance and assigns the pixel to the corresponding cluster. This yields N_c foreground color clusters C_n^F (n = 1, ..., N_c) and M_c background color clusters C_m^B (m = 1, ..., M_c), which represent the color distribution statistical models of the foreground and background respectively. In the same way, the disparity values of the pixels in F and B are clustered separately, yielding N_d foreground disparity clusters and M_d background disparity clusters, which represent the disparity distribution statistical models of the foreground and background respectively. In this embodiment, N_c = M_c = 64 and N_d = M_d = 16.
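The clustering in step 3 can be illustrated with a minimal serial sketch in Python. It is not the CUDA implementation of the embodiment: the function names, the fixed iteration count and the choice of initial centers are assumptions made for illustration. The per-pixel assignment loop is the part that the description maps to one GPU thread per pixel.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_to_clusters(pixels: List[Vector], centers: List[Vector]) -> List[int]:
    """Return, for every pixel, the index of its nearest cluster center.

    Every pixel is handled independently, so each iteration could run in
    its own thread; here the iterations simply run one after another.
    """
    return [
        min(range(len(centers)), key=lambda k: squared_distance(p, centers[k]))
        for p in pixels
    ]


def update_centers(pixels: List[Vector], assignment: List[int],
                   centers: List[Vector]) -> List[Tuple[float, ...]]:
    """Recompute each center as the mean of the pixels assigned to it."""
    dim = len(centers[0])
    new_centers = []
    for k, old in enumerate(centers):
        members = [p for p, a in zip(pixels, assignment) if a == k]
        if not members:
            new_centers.append(tuple(old))  # an empty cluster keeps its center
            continue
        new_centers.append(
            tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
        )
    return new_centers


def kmeans(pixels: List[Vector], centers: List[Vector], iterations: int = 10):
    """Plain K-means: alternate assignment and center update (assumed loop)."""
    for _ in range(iterations):
        assignment = assign_to_clusters(pixels, centers)
        centers = update_centers(pixels, assignment, centers)
    return centers, assign_to_clusters(pixels, centers)
```

Under these assumptions, the foreground color model would be obtained by calling `kmeans` on the colors of the pixels in F with 64 initial centers, and the disparity model by calling it on one-element vectors holding the disparity values with 16 initial centers.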

Step 4: global optimization based on the multilayer graph structure.

Foreground and background each tend to be clustered in the image: differences between pixels inside the foreground or background are small, while differences between pixels at the boundary are large. Exploiting this property, one representative pixel of a region is used to stand for all pixels in its neighborhood. The method obtains the representative pixels by Gaussian filtering and down-sampling, which yields smaller, coarser images. The coarse images and the original images are combined into a multilayer graph structure, and global processing is carried out on the model of this structure. The original stereo pair is denoted I = {I^l, I^r} and the coarse stereo pair I^τ = {I^{l,τ}, I^{r,τ}}, where I^l, I^{l,τ} and I^r, I^{r,τ} denote the left and right images respectively. The original and coarse stereo images are jointly represented as an undirected graph G = <ν, ε>, where ν is the set of nodes of G and ε is the set of edges. Each vertex of G corresponds to one pixel of the stereo images I and I^τ. Fast interactive stereo image segmentation assigns, under the constraint of the input strokes, a label x_i to every pixel p_i of the original stereo pair, with x_i ∈ {1, 0} denoting foreground and background respectively. The edges of G comprise the edges connecting each pixel to the source and the sink, the edges between neighboring pixels within an image, and the edges between corresponding points of the stereo images as determined by the disparity map; they also include the edges between parent and child nodes of the coarse layer and the original image. Let p_i^τ denote a pixel of the coarse-layer image. Because the coarse layer is obtained by down-sampling the original layer, each p_i^τ represents the pixels in an N_l × N_l region of the image I before sampling; in this embodiment, N_l = 3.
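The construction of the coarse layer and the parent-child relation can be sketched as follows. This is a minimal Python illustration for a single-channel image. The 3×3 binomial kernel, the border clamping and the function names are assumptions, since the description does not specify the Gaussian kernel.

```python
from typing import List, Tuple

Image = List[List[float]]

# 3x3 binomial approximation of a Gaussian kernel (weights sum to 16).
# The kernel size and weights are assumptions made for this sketch.
KERNEL = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]


def smooth_at(img: Image, r: int, c: int) -> float:
    """Gaussian-weighted average around (r, c), clamping at the border."""
    h, w = len(img), len(img[0])
    total = 0.0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr = min(max(r + dr, 0), h - 1)
            cc = min(max(c + dc, 0), w - 1)
            total += KERNEL[dr + 1][dc + 1] * img[rr][cc]
    return total / 16.0


def build_coarse_layer(img: Image, n_l: int = 3) -> Image:
    """Filter and down-sample: one coarse pixel per n_l x n_l block."""
    h, w = len(img), len(img[0])
    coarse_h = (h + n_l - 1) // n_l
    coarse_w = (w + n_l - 1) // n_l
    return [
        [
            # sample the smoothed image at the (clamped) center of each block
            smooth_at(img,
                      min(i * n_l + n_l // 2, h - 1),
                      min(j * n_l + n_l // 2, w - 1))
            for j in range(coarse_w)
        ]
        for i in range(coarse_h)
    ]


def parent_of(r: int, c: int, n_l: int = 3) -> Tuple[int, int]:
    """Coarse-layer parent of the original-layer pixel at (r, c)."""
    return r // n_l, c // n_l
```

With N_l = 3, a 6×6 image gives a 2×2 coarse layer, and `parent_of(4, 2)` is the coarse pixel `(1, 0)`. The coarse graph therefore has roughly one ninth as many nodes as the pixel-level graph.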

The fast stereo image segmentation problem on the multilayer graph structure is defined as the optimization problem of minimizing the following objective energy function:

\begin{matrix} E (X) = w_{unary} \sum_{p_{i}^{\tau} \in I^{\tau}} E_{unary} (p_{i}^{\tau}) + w_{intra} \sum_{(p_{i}^{\tau}, p_{j}^{\tau}) \in N_{intra}} E_{intra} (p_{i}^{\tau}, p_{j}^{\tau}) + w_{inter} \sum_{(p_{i}^{l, \tau}, p_{j}^{r, \tau}) \in N_{inter}} E_{inter} (p_{i}^{l, \tau}, p_{j}^{r, \tau}) \\ + w_{paternity} \sum_{(p_{i}^{\tau}, p_{i, j}) \in N_{paternity}} E_{paternity} (p_{i}^{\tau}, p_{i, j}) \end{matrix} \quad (1)

Here, E_unary(p_i^τ) is the unary term, also called the data term. It measures the similarity between the color and disparity of a coarse-layer pixel and the foreground and background color and disparity statistical models; the higher the similarity to a model, the larger the probability of the corresponding label and, by formula (2), the smaller the energy. E_intra(p_i^τ, p_j^τ) is the binary term within a coarse-layer image. It reflects the difference between each coarse-layer pixel and its four neighbors, and N_intra denotes the set of adjacency relations of all pixels in the left and right coarse-layer images. The larger the difference, the smaller this term, and according to the graph cut principle the more the neighboring pixels tend to take different labels. E_inter(p_i^{l,τ}, p_j^{r,τ}) is the binary term between the coarse images. It encodes the matching of corresponding points: the higher the degree of matching, the larger the term. N_inter denotes the set of correspondences between left and right coarse-layer pixels.

E_paternity(p_i^τ, p_{i,j}) is the binary constraint between the coarse-layer image and the original image. It represents the similarity of parent and child nodes: the smaller the difference between them, the larger the value and the less likely the boundary passes between them. N_paternity denotes the set of parent-child correspondences. w_unary, w_intra, w_inter and w_paternity are weights that balance the energy terms; here w_unary = 1, w_intra = 4000, w_inter = 8000 and w_paternity = 1000000.

(1) Definition of the unary constraint term

The unary constraint term consists of a color unary term and a disparity unary term, and is defined as follows:

E_{unary} (p_{i}^{\tau}) = w_{c} (1 - P_{c} (x_{i}^{\tau} | c_{i}^{\tau})) + w_{d} (1 - P_{d} (x_{i}^{\tau} | d_{i}^{\tau})) \quad (2)

Here, P_c(x_i^τ | c_i^τ) is the probability that a pixel with color c_i^τ takes the foreground or background label. Because a larger probability should give a smaller energy, 1 - P_c is used as the color unary term. Similarly, P_d(x_i^τ | d_i^τ) is the probability that a pixel with disparity d_i^τ takes the foreground or background label, and 1 - P_d is used as the disparity unary term. w_c and w_d are the weights of the color and disparity influences respectively, with w_c + w_d = 1.

The method represents the foreground and background color and disparity models as clusters: N_c foreground color clusters C_n^F, M_c background color clusters C_m^B, N_d foreground disparity clusters and M_d background disparity clusters. The unary term is computed from these clusters as described below.

The color unary term is computed in parallel with CUDA. The color values of all pixels are transferred from the CPU to the GPU, where all pixels are processed in parallel. Each thread handles one unlabeled pixel, and the threads are independent of one another. All threads simultaneously compute the distances from their pixel's color to the cluster centers of the foreground and background color models and find the smallest of these distances. This minimum distance describes how similar the pixel color is to the foreground or background color: the smaller the distance to the foreground or background color, the closer the colors and, by graph cut theory, the more the pixel tends to take the foreground or background label. When all threads have finished, the GPU transfers the result for each pixel back to the CPU, where the graph itself is constructed. Mathematically, the color unary term is:

1 - P_{c} (x_{i}^{\tau} | c_{i}^{\tau}) = \begin{cases} \frac{s_{i}^{\min}}{s_{i}^{\min} + t_{i}^{\min}}, & x_{i}^{\tau} = 1 \\ \frac{t_{i}^{\min}}{s_{i}^{\min} + t_{i}^{\min}}, & x_{i}^{\tau} = 0 \end{cases} \quad (3)

Here, s_i^min and t_i^min denote the minimum distances from the color c_i^τ of pixel p_i^τ to the cluster centers of the foreground and background color models respectively:

s_{i}^{\min} = \min_{n} {\| c_{i}^{\tau} - C_{n}^{F} \|}^{2}, \quad n = 1, ..., N_{c}

t_{i}^{\min} = \min_{m} {\| c_{i}^{\tau} - C_{m}^{B} \|}^{2}, \quad m = 1, ..., M_{c}

The disparity unary term is computed in the same way as the color unary term.
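Formulas (2) and (3) can be sketched in Python as follows. This is a serial illustration of the per-pixel computation that the embodiment runs in one CUDA thread per pixel. The function names, the handling of the degenerate case where both distances are zero, and the example weights w_c = w_d = 0.5 are assumptions (the description only requires w_c + w_d = 1).

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cluster_cost(value: Vector, fg_centers: List[Vector],
                 bg_centers: List[Vector]) -> Tuple[float, float]:
    """Return (cost of label 1, cost of label 0) as in formula (3).

    s_min and t_min are the smallest squared distances to the foreground
    and background cluster centers respectively.
    """
    s_min = min(squared_distance(value, c) for c in fg_centers)
    t_min = min(squared_distance(value, c) for c in bg_centers)
    denom = s_min + t_min
    if denom == 0.0:
        return 0.5, 0.5  # assumed tie-break when both distances are zero
    return s_min / denom, t_min / denom


def unary_energy(color: Vector, disparity: float,
                 fg_colors: List[Vector], bg_colors: List[Vector],
                 fg_disps: List[Vector], bg_disps: List[Vector],
                 w_c: float = 0.5, w_d: float = 0.5) -> Tuple[float, float]:
    """Formula (2): weighted sum of the color and disparity unary terms."""
    c_fg, c_bg = cluster_cost(color, fg_colors, bg_colors)
    d_fg, d_bg = cluster_cost((disparity,), fg_disps, bg_disps)
    return w_c * c_fg + w_d * d_fg, w_c * c_bg + w_d * d_bg
```

A pixel whose color coincides with a foreground cluster center gets cost 0 for the foreground label and cost 1 for the background label, so the minimum cut favors labeling it as foreground.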

(2) Definition of the binary constraint term within an image

The binary constraint term within an image has two factors, which describe the change in color and the change in disparity around a pixel, i.e. the color gradient and the disparity gradient. It is defined as follows:

E_{intra} (p_{i}^{\tau}, p_{j}^{\tau}) = f_{c} (p_{i}^{\tau}, p_{j}^{\tau}) f_{d} (p_{i}^{\tau}, p_{j}^{\tau}) | x_{i}^{\tau} - x_{j}^{\tau} | \quad (4)

Here, f_c(p_i^τ, p_j^τ) denotes the color similarity between neighboring pixels: the closer the colors, the larger its value and, by the graph cut principle, the less likely the boundary passes between the two pixels. f_d(p_i^τ, p_j^τ) denotes the disparity similarity between pixel p_i^τ and its neighbor p_j^τ: the closer the disparities, the larger its value and, by the graph cut principle, the less likely the two pixels take different labels. To reduce the error introduced by disparity, the disparity used in the disparity factor is that of the coarse layer obtained by Gaussian filtering and down-sampling. The two factors are defined as follows:

f_{c} (p_{i}^{\tau}, p_{j}^{\tau}) = \frac{1}{{\| c_{i}^{\tau} - c_{j}^{\tau} \|}^{2} + 1}, \quad (p_{i}^{\tau}, p_{j}^{\tau}) \in N_{intra} \quad (5)

f_{d} (p_{i}^{\tau}, p_{j}^{\tau}) = \frac{1}{{\| d_{i}^{\tau} - d_{j}^{\tau} \|}^{2} + 1}, \quad (p_{i}^{\tau}, p_{j}^{\tau}) \in N_{intra} \quad (6)
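Formulas (4) to (6) translate directly into a few lines of Python. The sketch below evaluates the term for one pair of neighboring coarse-layer pixels; the function names are illustrative only.

```python
from typing import Sequence

Vector = Sequence[float]


def similarity(a: Vector, b: Vector) -> float:
    """1 / (squared distance + 1), the form shared by formulas (5) and (6)."""
    return 1.0 / (sum((x - y) ** 2 for x, y in zip(a, b)) + 1.0)


def intra_energy(color_i: Vector, color_j: Vector,
                 disp_i: float, disp_j: float,
                 label_i: int, label_j: int) -> float:
    """Formula (4): f_c * f_d * |x_i - x_j| for two neighboring pixels."""
    f_c = similarity(color_i, color_j)      # formula (5)
    f_d = similarity((disp_i,), (disp_j,))  # formula (6)
    return f_c * f_d * abs(label_i - label_j)
```

The term is zero whenever the two pixels share a label, and it reaches its maximum of 1 when two pixels with identical color and disparity are given different labels, which is the case the smoothness term penalizes most.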

(3) Definition of the binary constraint term between images

The binary term between images constrains corresponding pixels in the two images to take the same label. It is defined as follows:

E_{inter} (p_{i}^{l, \tau}, p_{j}^{r, \tau}) = \frac{C (p_{i}^{l, \tau}, p_{j}^{r, \tau}) + C (p_{j}^{r, \tau}, p_{i}^{l, \tau})}{2} | x_{i}^{l, \tau} - x_{j}^{r, \tau} | \quad (7)

Here, C denotes the likelihood that p_i^{l,τ} and p_j^{r,τ} in the stereo pair are corresponding points. It is an asymmetric step function:

C (p_{i}^{l, \tau}, p_{j}^{r, \tau}) = P (x_{i}^{l, \tau} | M (p_{i}^{l, \tau}) = p_{j}^{r, \tau}, x_{j}^{r, \tau}) P (M (p_{i}^{l, \tau}) = p_{j}^{r, \tau}) \quad (8)

P(M(p_i^{l,τ}) = p_j^{r,τ}) is the probability, determined from the disparity map, that p_i^{l,τ} and p_j^{r,τ} are corresponding points. The function M(p_i^{l,τ}) gives the corresponding point on the right coarse layer of the left coarse-layer pixel p_i^{l,τ}; the correspondence is determined from the original disparity map. P(M(p_i^{l,τ}) = p_j^{r,τ}) takes the form of a consistency delta function, defined as follows:

P (M (p_{i}^{l, \tau}) = p_{j}^{r, \tau}) = \begin{cases} 1, & | p_{i}^{l, \tau} - p_{j}^{r, \tau} | = d_{i}^{l} \text{ and } | p_{j}^{r, \tau} - p_{i}^{l, \tau} | = d_{j}^{r} \\ 0, & \text{otherwise} \end{cases} \quad (9)

Here, d_i^l is the disparity between pixel p_i^{l,τ} of the left coarse layer and its corresponding point in the right image, and d_j^r is the disparity between pixel p_j^{r,τ} of the right coarse layer and its corresponding point in the left image. To better determine the correspondence between left and right pixels, the disparity of the unprocessed original disparity map is used here.

In formula (8), P(x_i^{l,τ} | M(p_i^{l,τ}) = p_j^{r,τ}, x_j^{r,τ}) denotes the probability of color similarity between p_i^{l,τ} and p_j^{r,τ} under the assumption that the disparity is entirely accurate. Because current disparity computation methods contain errors, the disparity term is dropped in order to better determine the correspondence between the left and right images, and only the color term is used, in the following form:

P (x_{i}^{l, \tau} | M (p_{i}^{l, \tau}) = p_{j}^{r, \tau}, x_{j}^{r, \tau}) = \frac{1}{{\| c_{i}^{l, \tau} - c_{j}^{r, \tau} \|}^{2} + 1} \quad (10)

Here, c_i^{l,τ} is the color value of the left coarse-layer pixel p_i^{l,τ} and c_j^{r,τ} is the color value of its corresponding point in the right coarse layer.
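Formulas (7) to (10) can be sketched in Python for one candidate pair of left and right pixels. The sketch assumes a rectified stereo pair, so that pixel positions are compared by their horizontal coordinate only; that assumption, and all function and parameter names, are illustrative and not taken from the description.

```python
from typing import Sequence

Vector = Sequence[float]


def color_similarity(a: Vector, b: Vector) -> float:
    """Formula (10): 1 / (squared color distance + 1)."""
    return 1.0 / (sum((x - y) ** 2 for x, y in zip(a, b)) + 1.0)


def match_probability(col_left: int, col_right: int,
                      disp_left: int, disp_right: int) -> float:
    """Formula (9): 1 only if both disparities agree with the offset."""
    offset = abs(col_left - col_right)
    return 1.0 if offset == disp_left and offset == disp_right else 0.0


def inter_energy(color_left: Vector, color_right: Vector,
                 col_left: int, col_right: int,
                 disp_left: int, disp_right: int,
                 label_left: int, label_right: int) -> float:
    """Formula (7): average of C(l, r) and C(r, l) times |x_l - x_r|."""
    match = match_probability(col_left, col_right, disp_left, disp_right)
    c_lr = color_similarity(color_left, color_right) * match  # formula (8)
    c_rl = color_similarity(color_right, color_left) * match
    return 0.5 * (c_lr + c_rl) * abs(label_left - label_right)
```

In this simplified sketch both factors are symmetric in the two pixels, so C(l, r) and C(r, l) coincide. The term vanishes when the left-right disparity check of formula (9) fails, and otherwise penalizes corresponding pixels that take different labels in proportion to their color similarity.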

(4) Definition of the parent-child constraint between layers

The final segmentation result must be expressed at the pixel layer. To pass the result of the coarse layer to the pixel layer while keeping parent and child pixels in the two layers consistent, the parent-child constraint between layers is defined as:

E_{paternity} (p_{i}^{\tau}, p_{i, j}) = \infty, \quad (p_{i}^{\tau}, p_{i, j}) \in N_{paternity} \quad (11)

E_paternity(p_i^τ, p_{i,j}) denotes the similarity between parent and child pixels in the two layers. Because a coarse-layer pixel p_i^τ represents all pixels in an N_l × N_l region of the original pixel layer, the label of the coarse-layer pixel stands for the labels of all pixels in the corresponding region of the pixel layer; the edge weight between parent and child pixels is therefore defined as infinity. Edges between pixels that are not parent and child are not considered.

(5) Minimization of the energy function

Because the parent-child constraint between layers is defined as infinite in the invention, the edge between a parent and its child is never cut, and the label of the parent node can be passed directly to the child node. Since computing the parent-child edges would consume a large amount of memory and increase computation time, these edges are not computed explicitly during optimization. A graph cut algorithm is used, for example the max-flow/min-cut algorithm proposed by Yuri Boykov et al. in the paper "An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Vision", published in IEEE Transactions on PAMI in 2004. Optimizing the energy function defined by the invention (formula (1)) yields the optimal labeling, i.e. the coarse-layer segmentation result. The labels of the pixels in the corresponding pixel-layer regions are then determined directly from the labels of the coarse-layer pixels. In this way the segmentation speed can be increased significantly while accuracy is maintained. However, because coarse-layer labels are passed directly to the pixel layer, larger errors arise at the boundary, where neighboring pixels differ greatly. To improve segmentation accuracy, the points with larger error at the boundary are collected and locally optimized.
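Because the parent-child edges are never cut, passing labels from the coarse layer to the pixel layer reduces to a block-wise copy. A minimal Python sketch, with an illustrative function name and the assumption that the coarse layer covers the image in N_l × N_l blocks starting at the top-left corner:

```python
from typing import List

LabelMap = List[List[int]]


def propagate_labels(coarse_labels: LabelMap, height: int, width: int,
                     n_l: int = 3) -> LabelMap:
    """Give every original pixel the label of its coarse-layer parent.

    The parent of pixel (r, c) is the coarse pixel (r // n_l, c // n_l),
    so no parent-child edges need to be stored in the graph.
    """
    return [
        [coarse_labels[r // n_l][c // n_l] for c in range(width)]
        for r in range(height)
    ]
```

For example, the coarse labels `[[1, 0]]` expand to three rows of `[1, 1, 1, 0, 0, 0]` for a 3×6 image. The resulting boundary is blocky, which is why the boundary pixels are refined afterwards.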

Step 5: local optimization of the boundary based on the original image.

The global optimization of step 4 yields a coarse segmentation boundary. Because a coarse-layer pixel p_i^τ corresponds to the set of pixels in an N_l × N_l region of the original pixel layer, its label x_i^τ is passed directly to that N_l × N_l region of the pixel layer; in this embodiment, N_l = 3. At the boundary, neighboring pixels differ greatly, so assigning the coarse-layer pixel's label directly to every pixel in the region can cause larger errors. The boundary is therefore locally optimized separately.

Before local optimization, the local boundary information is collected. The coarse segmentation boundary is first divided into two parts: upper/lower boundaries and left/right boundaries. Upper/lower boundaries are then expanded by N_l pixels above and below the boundary line, and left/right boundaries by N_l pixels to the left and right of the boundary line; in this embodiment, N_l = 3. The collected boundary pixels are locally optimized using conventional graph cut theory. Local optimization is performed on the pixel layer; because disparity computation contains errors, disparity information is not used in local optimization. The consistency of the stereo segmentation has already been ensured during global processing, and local optimization only processes local pixels, so it is carried out independently and simultaneously on the left and right images. Let I^e denote the collected local image to be processed. The local energy function is defined as:

E^{e} (X) = w_{unary} \sum_{p_{i} \in I^{e}} E_{unary}^{e} (p_{i}) + w_{intra} \sum_{(p_{i}, p_{j}) \in N_{intra}^{e}} E_{intra}^{e} (p_{i}, p_{j}) \quad (12)

E_unary^e is the unary term, i.e. the data term. It represents the similarity between a boundary pixel and the foreground and background color models; the greater the similarity, the larger its value. E_intra^e is the binary term, i.e. the smoothness term. It represents the similarity of neighboring pixels; the more similar the two pixels, the larger its value (see formula (14)) and the less likely the boundary passes between them. N_intra^e denotes the set of all adjacency relations in the boundary image. Here, w_unary + w_intra = 1.

The unary term is defined as follows:

E_{unary}^{e} (p_{i}) = P (x_{i} | c_{i}) = \frac{P (c_{i} | x_{i})}{P (c_{i} | x_{i} = 1) + P (c_{i} | x_{i} = 0)} \quad (13)

Boundary optimization is a precise local optimization and should reduce error as far as possible, so the unary term uses only the color term. It is computed in the same way as the color unary term in the global optimization.

To reduce error, the binary term also uses only the color term. It is defined as follows:

E_{intra}^{e} (p_{i}, p_{j}) = \frac{1}{{\| c_{i} - c_{j} \|}^{2} + 1} | x_{i} - x_{j} |, \quad (p_{i}, p_{j}) \in N_{intra}^{e} \quad (14)

Once the local energy function is defined, the max-flow/min-cut optimization algorithm mentioned in step 4 is used to optimize the local energy function, formula (12), yielding the optimal labeling, i.e. the local segmentation result. This result is merged with the segmentation result of step 4 to form the segmentation of the whole image pair.
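The collection of boundary pixels at the start of step 5 can be sketched in Python as follows. The sketch takes the pixel-layer label map obtained from the coarse result and returns the band of pixels to be re-optimized. Detecting boundaries as label changes between 4-neighbors, and the function name, are assumptions made for illustration.

```python
from typing import List, Set, Tuple

LabelMap = List[List[int]]


def boundary_band(labels: LabelMap, n_l: int = 3) -> Set[Tuple[int, int]]:
    """Collect pixels within n_l pixels of the coarse segmentation boundary.

    A label change between rows r and r + 1 is an upper/lower boundary and
    is expanded n_l pixels up and down; a change between columns c and
    c + 1 is a left/right boundary and is expanded n_l pixels sideways.
    """
    h, w = len(labels), len(labels[0])
    band: Set[Tuple[int, int]] = set()
    for r in range(h):
        for c in range(w):
            if r + 1 < h and labels[r][c] != labels[r + 1][c]:
                for rr in range(max(0, r - n_l + 1), min(h, r + n_l + 1)):
                    band.add((rr, c))
            if c + 1 < w and labels[r][c] != labels[r][c + 1]:
                for cc in range(max(0, c - n_l + 1), min(w, c + n_l + 1)):
                    band.add((r, cc))
    return band
```

Only the pixels in this band enter the local graph of formula (12), so the local optimization works on a small fraction of the image; the remaining pixels keep the labels passed down from the coarse layer.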

Step 6: interaction.

If the segmentation result is unsatisfactory, return to step 2 and continue adding foreground and background cues. Each added stroke triggers one complete segmentation pass, which refines the existing segmentation, until a satisfactory result is obtained.

The effectiveness of the method is illustrated by comparison with the method of "StereoCut: Consistent Interactive Object Selection in Stereo Image Pairs", published by Price et al. at ICCV 2011. Both methods use the consistency delta function (formula (9)) as the probability distribution function between corresponding points. Fig. 2 shows the comparison. Fig. 2(a) and (b) are the input left and right images; Fig. 2(c) and (d) are the results of the StereoCut method; Fig. 2(e) and (f) are the results of the present invention. The two rows below the images give the segmentation accuracy and the total segmentation time of the two methods. Accuracy (denoted A) is defined as follows:

A = \frac{1}{2} \left( \frac{\sum_{i = 1}^{N_{L}} f_{A} (c_{i}^{l} - c_{i}^{l, g})}{N_{L}} + \frac{\sum_{j = 1}^{N_{r}} f_{A} (c_{j}^{r} - c_{j}^{r, g})}{N_{r}} \right) \quad (15)

where

f_{A} (\Delta c) = \begin{cases} 1, & \Delta c = 0 \\ 0, & \Delta c \neq 0 \end{cases}

Here, N_L and N_r denote the total number of pixels in the left and right images respectively; c_i^l is the label (0 or 1) of the i-th pixel of the left image after segmentation and, correspondingly, c_j^r is the label of the j-th pixel of the right image after segmentation. c_i^{l,g} and c_j^{r,g} denote the ground truth of the left and right images respectively, so c_i^l - c_i^{l,g} reflects the difference between the label of a pixel of the left image and the ground truth. f_A is a function of this difference: it is 1 when the difference is 0 and 0 otherwise. As formula (15) shows, the accuracy for a single image is the ratio of the number of pixels that agree with the ground truth to the image size, and the accuracy for the stereo pair is the mean of the accuracies of the left and right images.
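Formula (15) amounts to averaging the per-image fraction of correctly labeled pixels. A minimal Python sketch, with labels and ground truth given as flat lists of 0 and 1 (the function names and the flat-list representation are assumptions):

```python
from typing import List


def image_accuracy(labels: List[int], truth: List[int]) -> float:
    """Fraction of pixels whose label equals the ground truth."""
    if len(labels) != len(truth) or not labels:
        raise ValueError("label and ground-truth lists must match and be non-empty")
    return sum(1 for a, b in zip(labels, truth) if a == b) / len(labels)


def stereo_accuracy(left: List[int], left_truth: List[int],
                    right: List[int], right_truth: List[int]) -> float:
    """Formula (15): mean of the left-image and right-image accuracies."""
    return 0.5 * (image_accuracy(left, left_truth)
                  + image_accuracy(right, right_truth))
```

For example, a left image with three of four pixels correct (0.75) and a right image with all pixels correct (1.0) give A = 0.875.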

The user input for the two methods is shown in Fig. 2(c) and (e) respectively: strokes of the first kind, inside the object, mark the foreground, and strokes of the second kind, outside the object, mark the background. Comparing Fig. 2(c), (d) with Fig. 2(e), (f), together with the computation times and accuracy values of the two methods, shows that, with the same amount of user interaction, the method significantly increases segmentation speed with little change in segmentation accuracy.

Claims

1. A fast interactive stereo image segmentation method based on a multilayer graph structure, characterized in that: the method first inputs a stereo image pair and obtains disparity maps by a stereo matching algorithm; part of the foreground and background is specified in either the left or the right original image; from the specified pixels, prior statistical models of the foreground and background color and disparity distributions are built using CUDA parallel computation; Gaussian filtering and down-sampling are applied to the original images to obtain smaller, coarser images, and the coarse images together with the original images form a multilayer graph structure; on this basis, constraints such as color, gradient and disparity in the multilayer graph structure are formalized within the graph cut framework to construct an energy function; to improve efficiency, CUDA parallel computation is applied during graph construction; the globally optimal result for the multilayer graph is solved with a max-flow/min-cut algorithm; the pixels with larger error at the boundary are then collected and locally optimized using conventional graph cut theory; the results of global processing and local optimization are merged to form the final segmentation; if the user does not obtain the desired effect, the user continues to mark erroneous regions in the image until the desired result is obtained;

characterized in that the method specifically comprises the following steps:

Step 1: match the stereo images;

Read in a stereo image pair I = {I^l, I^r}, where I^l and I^r denote the left and right images respectively; the disparity maps of the left and right images, denoted D^l and D^r respectively, are computed by a stereo matching algorithm;

Step 2: add foreground and background cues;

The user specifies part of the foreground and background in either image through the provided interface; with an input device such as a mouse, touch screen or stylus, the user draws strokes of different colors on the image to mark some foreground and background pixels; pixels covered by the first kind of stroke belong to the foreground and pixels covered by the second kind belong to the background; the subsequent steps of the method do not depend on how the foreground and background pixels are specified in this step, and other ways of specifying them can also be used;

Step 3: build the color and disparity prior models of the foreground and background;

Let F denote the set of foreground pixels specified by the user and B the set of background pixels specified by the user; the prior models of the foreground and background color and disparity can be expressed as a GMM, a histogram or multiple clusters; the method uses the multi-cluster form, obtaining the clusters from statistics of the color and disparity of the corresponding pixel sets; to increase processing speed, a CUDA-based parallel K-means algorithm is used to cluster the color values and the disparity values of the pixels in F and B separately; the color model is processed as follows: each thread handles one pixel, computes the distance from that pixel to every foreground or background cluster, selects the smallest distance and assigns the pixel to the corresponding cluster; this yields N_c foreground color clusters C_n^F (n = 1, ..., N_c) and M_c background color clusters C_m^B (m = 1, ..., M_c), which represent the color distribution statistical models of the foreground and background respectively; in the same way, the disparity values of the pixels in F and B are clustered separately, yielding N_d foreground disparity clusters and M_d background disparity clusters, which represent the disparity distribution statistical models of the foreground and background respectively; in this embodiment, N_c = M_c = 64 and N_d = M_d = 16;

Step 4, global optimization based on the multi-level graph structure;

Within an image, the foreground and the background are each spatially clustered: pixels inside the foreground or the background differ little from one another, whereas pixels at the boundary differ considerably. Exploiting this property, a representative pixel of a region is used to stand for all pixels in its neighborhood. This method obtains the representative pixels by Gaussian filtering and down-sampling, which yields a coarse image of smaller scale. The coarse image and the original image are combined to form the multi-level graph structure, and global processing is performed on the model of the multi-level graph structure. The original stereo image pair is denoted I = {I^l, I^r} and the coarse stereo image pair I^τ = {I^{l,τ}, I^{r,τ}}, where I^l, I^{l,τ} and I^r, I^{r,τ} denote the left and right images respectively. The original stereo images and the coarse stereo images are jointly represented as an undirected graph G = <ν, ε>, where ν is the set of nodes of G and ε is the set of edges. Each vertex of G corresponds to one pixel in the stereo images I and I^τ. Interactive fast stereo image segmentation assigns, under the constraint of the input strokes, a label x_i to each pixel p_i of the original stereo image pair, with x_i ∈ {1, 0} denoting foreground and background respectively. The edges of G comprise the edges connecting each pixel to the source and the sink, the edges connecting neighboring pixels within an image, and the edges connecting corresponding points of the stereo images as determined by the disparity map; they also comprise the edges connecting parent and child nodes between the coarse layer and the original image. Let p_i^τ be a pixel of the coarse-layer image. Because the coarse layer is obtained by down-sampling the original layer, one p_i^τ represents the pixels in an N_l × N_l region of the image I before sampling; in this embodiment, N_l = 3.
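
As an illustration of how one coarse-layer pixel can stand for an N_l × N_l region, the following sketch combines Gaussian filtering and down-sampling for N_l = 3 on a single-channel image. The 3 × 3 binomial kernel is an assumption (the description does not give the filter parameters), as are the function name and the choice to drop incomplete blocks at the image border.

```python
# 3 x 3 binomial approximation of a Gaussian; weights sum to 16 (assumed kernel).
KERNEL = [[1, 2, 1],
          [2, 4, 2],
          [1, 2, 1]]
N_L = 3  # block size used in the embodiment


def gaussian_downsample(image):
    """Return the coarse image: one Gaussian-weighted mean per 3 x 3 block.

    `image` is a list of rows of numbers. Evaluating the filter only at block
    centers is equivalent to filtering first and then keeping every third pixel.
    """
    rows = len(image) // N_L
    cols = len(image[0]) // N_L
    coarse = []
    for r in range(rows):
        coarse_row = []
        for c in range(cols):
            acc = 0.0
            for dy in range(N_L):
                for dx in range(N_L):
                    acc += KERNEL[dy][dx] * image[r * N_L + dy][c * N_L + dx]
            coarse_row.append(acc / 16.0)
        coarse.append(coarse_row)
    return coarse
```

Each entry of the returned image plays the role of a parent pixel p_i^τ whose children are the nine pixels of the corresponding block.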

The energy function defined on the multi-level graph structure is:

E(X) = w_{unary} \sum_{p_i^τ \in I^τ} E_{unary}(p_i^τ) + w_{intra} \sum_{(p_i^τ, p_j^τ) \in N_{intra}} E_{intra}(p_i^τ, p_j^τ) + w_{inter} \sum_{(p_i^{l,τ}, p_j^{r,τ}) \in N_{inter}} E_{inter}(p_i^{l,τ}, p_j^{r,τ}) + w_{paternity} \sum_{(p_i^τ, p_{i,j}) \in N_{paternity}} E_{paternity}(p_i^τ, p_{i,j}) \qquad (1)

Wherein, E_unary is the unary term, also called the data term; it represents the similarity between the color and disparity of a coarse-layer pixel and the foreground and background color and disparity statistical models, and the higher the similarity, the larger the value. E_intra is the binary term within a coarse-layer image; it reflects the difference between each pixel of the coarse-layer image and its four neighbors, and N_intra denotes the set of adjacency relations of all pixels in the left and right coarse-layer images. The larger the difference, the smaller this term, and by the principle of the graph cut algorithm the neighboring pixels then tend to take different labels. E_inter is the binary term between the coarse images; it encodes the matching result of corresponding points, and the higher the matching degree, the larger this term. N_inter denotes the set of correspondences between left and right coarse-layer pixels. E_paternity is the binary constraint between the coarse-layer image and the original image; it represents the similarity of parent and child nodes: the smaller the difference between parent and child, the larger this value and the less likely the boundary is to pass between them. N_paternity denotes the set of parent-child correspondences. w_unary, w_intra, w_inter and w_paternity are weights that balance the energy terms. In the present invention, w_unary = 1, w_intra = 4000, w_inter = 8000 and w_paternity = 1000000.

(1) Define the unary constraint term

E_{unary}(p_i^τ) = w_c (1 - P_c(x_i^τ | c_i^τ)) + w_d (1 - P_d(x_i^τ | d_i^τ)) \qquad (2)

The color unary term is computed as follows. This method computes it with a CUDA-based parallel method: the color values of all pixels are transferred from the CPU to the GPU, where all pixels are processed in parallel. Each thread represents one unlabeled pixel, and the threads are mutually independent; all threads simultaneously compute the distances from the pixel color to the cluster centers of the foreground and background color models and find the minimum distance. This minimum distance describes the similarity between the pixel color and the foreground or background color: the smaller the distance to the foreground or the background, the closer the color, and according to graph cut theory the more the pixel tends to take the foreground or background label. When all threads have finished, the result for each pixel is transferred from the GPU back to the CPU, where the graph is constructed in detail. The mathematical form of the color unary term is:

1 - P_c(x_i^τ | c_i^τ) = \begin{cases} \frac{s_i^{\min}}{s_i^{\min} + t_i^{\min}}, & x_i^τ = 1 \\ \frac{t_i^{\min}}{s_i^{\min} + t_i^{\min}}, & x_i^τ = 0 \end{cases} \qquad (3)

s_i^{\min} = \min_n \| c_i^τ - C_n^F \|^2, \quad n = 1, ..., N_c

t_i^{\min} = \min_m \| c_i^τ - C_m^B \|^2, \quad m = 1, ..., M_c
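
A small sketch of formula (3) for a single pixel follows; it evaluates s_i^min, t_i^min and the two costs sequentially rather than with one CUDA thread per pixel. The function names and the handling of the degenerate case s_i^min + t_i^min = 0 are assumptions that do not come from the description.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two color vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def color_unary_costs(color, fg_centers, bg_centers):
    """Return (cost of label 1, cost of label 0) as in formula (3).

    fg_centers / bg_centers are the cluster centers C_n^F and C_m^B.
    """
    s_min = min(squared_distance(color, c) for c in fg_centers)
    t_min = min(squared_distance(color, c) for c in bg_centers)
    total = s_min + t_min
    if total == 0:
        # Pixel coincides with both models; treat as undecided (assumption).
        return 0.5, 0.5
    return s_min / total, t_min / total
```

A pixel whose color lies close to a foreground cluster center receives a small cost for label 1 and a large cost for label 0, so the cut tends to keep it in the foreground.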

(2) Define the binary constraint term within an image

E_{intra}(p_i^τ, p_j^τ) = f_c(p_i^τ, p_j^τ) \, f_d(p_i^τ, p_j^τ) \, | x_i^τ - x_j^τ | \qquad (4)

Wherein, f_c(p_i^τ, p_j^τ) represents the color similarity between neighboring pixels; the closer the colors, the larger its value, and by the principle of the graph cut algorithm the less likely the boundary is to pass between the two. f_d(p_i^τ, p_j^τ) represents the disparity similarity of pixel p_i^τ relative to its neighboring pixel p_j^τ; the closer the two disparities, the larger its value, and by the principle of the graph cut algorithm the less likely the two are to take different labels. To reduce the error introduced by disparity, the disparity used in the disparity term in this step is the coarse-layer disparity obtained by Gaussian filtering and down-sampling. The two are defined as follows:

f_c(p_i^τ, p_j^τ) = \frac{1}{\| c_i^τ - c_j^τ \|^2 + 1}, \quad (p_i^τ, p_j^τ) \in N_{intra} \qquad (5)

f_d(p_i^τ, p_j^τ) = \frac{1}{\| d_i^τ - d_j^τ \|^2 + 1}, \quad (p_i^τ, p_j^τ) \in N_{intra} \qquad (6)
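
Formulas (4) to (6) can be checked numerically with the short sketch below. Colors are assumed to be tuples and disparities scalars; the function names are illustrative only.

```python
def color_similarity(c_i, c_j):
    """f_c of formula (5): 1 / (squared color distance + 1)."""
    dist = sum((a - b) ** 2 for a, b in zip(c_i, c_j))
    return 1.0 / (dist + 1.0)


def disparity_similarity(d_i, d_j):
    """f_d of formula (6): 1 / (squared disparity difference + 1)."""
    return 1.0 / ((d_i - d_j) ** 2 + 1.0)


def intra_energy(c_i, c_j, d_i, d_j, x_i, x_j):
    """E_intra of formula (4) for one pair of neighboring coarse-layer pixels."""
    return (color_similarity(c_i, c_j)
            * disparity_similarity(d_i, d_j)
            * abs(x_i - x_j))
```

The term is zero when both pixels take the same label and reaches its maximum of 1 when two pixels with identical color and disparity are separated, which is why the cut avoids passing between similar neighbors.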

(3) Define the binary constraint term between images

E_{inter}(p_i^{l,τ}, p_j^{r,τ}) = \frac{C(p_i^{l,τ}, p_j^{r,τ}) + C(p_j^{r,τ}, p_i^{l,τ})}{2} \, | x_i^{l,τ} - x_j^{r,τ} | \qquad (7)

C(p_i^{l,τ}, p_j^{r,τ}) = P(x_i^{l,τ} | M(p_i^{l,τ}) = p_j^{r,τ}, x_j^{r,τ}) \, P(M(p_i^{l,τ}) = p_j^{r,τ}) \qquad (8)

P(M(p_i^{l,τ}) = p_j^{r,τ}) = \begin{cases} 1, & | p_i^{l,τ} - p_j^{r,τ} | = d_i^l \text{ and } | p_j^{r,τ} - p_i^{l,τ} | = d_j^r \\ 0, & \text{otherwise} \end{cases} \qquad (9)

Wherein, d_i^l is the disparity between pixel p_i^{l,τ} of the left coarse layer and its corresponding point in the right image, and d_j^r is the disparity between pixel p_j^{r,τ} of the right coarse layer and its corresponding point in the left image. To better determine the correspondence between pixels of the left and right images, the disparity of the unprocessed original disparity map is used here.
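
Formula (9) amounts to a mutual consistency check between the left and right disparity maps. The sketch below applies it to one scanline; it assumes rectified images (corresponding points share a row), integer disparities stored as magnitudes, and uses illustrative names that are not part of the described method.

```python
def match_probability(x_left, x_right, disp_left, disp_right):
    """Formula (9) on one scanline: 1 if both disparity maps agree, else 0.

    disp_left[x]  : disparity of left pixel x towards the right image
    disp_right[x] : disparity of right pixel x towards the left image
    """
    offset = abs(x_left - x_right)
    if offset == disp_left[x_left] and offset == disp_right[x_right]:
        return 1
    return 0
```

Only pairs that pass this check contribute an edge in N_inter with non-zero weight; all other candidate pairs are ignored.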

In formula (8), P(x_i^{l,τ} | M(p_i^{l,τ}) = p_j^{r,τ}, x_j^{r,τ}) represents the probability that p_i^{l,τ} and p_j^{r,τ} are similar. A disparity term would be reliable only if the disparity were entirely accurate, but current disparity computation methods contain errors; therefore, to better determine the correspondence between the left and right images, the disparity term is discarded and only the color term is used, in the following form:

P(x_i^{l,τ} | M(p_i^{l,τ}) = p_j^{r,τ}, x_j^{r,τ}) = \frac{1}{\| c_i^{l,τ} - c_j^{r,τ} \|^2 + 1} \qquad (10)

Wherein, c_i^{l,τ} is the color value of pixel p_i^{l,τ} in the left coarse layer, and c_j^{r,τ} is the color value of its corresponding point p_j^{r,τ} in the right coarse layer.

(4) Define the parent-child constraint between the upper and lower layers

The final result of image segmentation should be expressed at the pixel layer. To pass the result of the coarse layer to the pixel layer while keeping parent and child pixels of the upper and lower layer images consistent, the parent-child constraint between the layers is defined as:

E_{paternity}(p_i^τ, p_{i,j}) = \infty, \quad (p_i^τ, p_{i,j}) \in N_{paternity} \qquad (11)

E_paternity(p_i^τ, p_{i,j}) represents the similarity between parent and child pixels of the upper and lower layers. Because a coarse-layer pixel represents all pixels in an N_l × N_l region of the original pixel layer, the label of the coarse-layer pixel stands for the labels of all pixels in the corresponding region of the pixel layer; the edge weight between parent and child pixels is therefore defined as infinity. Edges between pixels that are not in a parent-child relation are not considered.

(5) Solve for the minimum of the energy function

The parent-child constraint between the layers is defined as infinity in this method, so an edge between parent and child is never cut and the label of a parent node can be passed directly to its child nodes. Because computing the parent-child edges would consume a large amount of memory and increase computation time, these edges are not computed explicitly during optimization. The graph cut algorithm is used to minimize the energy function defined by this method (formula (1)), which yields the optimal labeling, that is, the segmentation result of the coarse layer. The labels of the pixels in the corresponding regions of the pixel layer are then determined directly from the labels of the coarse-layer pixels. In this way the segmentation speed is increased significantly while the accuracy is maintained. Because the coarse-layer labels are passed directly to the pixel layer, considerable error can arise at the boundary for pixels that differ greatly from their neighbors. To improve segmentation accuracy, the points with larger error at the boundary are collected and optimized locally.
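
Because parent-child edges are never cut, passing the coarse result to the pixel layer reduces to copying each coarse label into its N_l × N_l block. A minimal sketch, with an illustrative function name and a plain list-of-lists label map as assumptions:

```python
def propagate_labels(coarse_labels, n_l=3):
    """Copy each coarse-layer label to its n_l x n_l block of child pixels."""
    fine = []
    for row in coarse_labels:
        # Repeat every label n_l times horizontally ...
        expanded = [label for label in row for _ in range(n_l)]
        # ... and repeat the expanded row n_l times vertically.
        for _ in range(n_l):
            fine.append(list(expanded))
    return fine
```

This step involves no optimization at all, which is where the speed gain over solving the full pixel-layer graph comes from.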

Step 5, local optimization of the boundary based on the original image

The global optimization of step 4 yields a coarse segmentation boundary. Since a coarse-layer pixel p_i^τ corresponds to the set of pixels in an N_l × N_l region of the original pixel layer, its label is passed directly to that N_l × N_l region of the pixel layer. At the boundary, neighboring pixels differ greatly, so assigning the label of a coarse-layer pixel directly to all pixels of the region can introduce considerable error. The boundary is therefore optimized locally and separately.

Before local optimization, the local boundary information is collected. First, the coarse segmentation boundary is divided into two parts: upper and lower boundaries, and left and right boundaries. Then the upper and lower boundaries are expanded by N_l pixels above and below the boundary line, and the left and right boundaries are expanded by N_l pixels to the left and to the right of the boundary line; in the present invention, N_l = 3. The collected boundary pixels are optimized locally using conventional graph cut theory. Local optimization is carried out at the pixel layer, and because the disparity computation contains errors, disparity information is discarded during local optimization. The global processing already ensures the consistency of the stereo segmentation, whereas local optimization only processes local pixels; local optimization is therefore performed independently and simultaneously on the left and right images. Let I^e be the collected local graph to be processed. The local energy function is defined as:

E^e(X) = w_{unary} \sum_{p_i \in I^e} E_{unary}^e(p_i) + w_{intra} \sum_{(p_i, p_j) \in N_{intra}^e} E_{intra}^e(p_i, p_j) \qquad (12)

E_unary^e is the unary term, that is, the data term; it represents the similarity between a boundary pixel and the foreground and background color models, and the greater the similarity, the larger the value. E_intra^e is the binary term, that is, the smoothness term; it represents the similarity of neighboring pixels, and the more similar the two are, the larger the value and the less likely the boundary is to pass between them. N_intra^e denotes the set of all adjacency relations in the boundary graph. The unary term is defined as follows:

E_{unary}^e(p_i) = P(x_i | c_i) = \frac{P(c_i | x_i)}{P(c_i | x_i = 1) + P(c_i | x_i = 0)} \qquad (13)

The boundary optimization is a precise local optimization in which error should be reduced as far as possible; the unary term therefore uses only the color term. It is computed in the same way as the color unary term in the global optimization.

To reduce error, the binary term likewise uses only the color term. It is defined as follows:

E_{intra}^e(p_i, p_j) = \frac{1}{\| c_i - c_j \|^2 + 1} \, | x_i - x_j |, \quad (p_i, p_j) \in N_{intra}^e \qquad (14)

After the local energy function has been defined, the max-flow/min-cut optimization algorithm mentioned in step 4 is used to minimize the local energy function, formula (12), which yields the optimal labeling, that is, the segmentation result. This result is merged with the segmentation result of step 4 to form the segmentation result of the whole image pair.
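
The collection of boundary pixels described at the beginning of this step can be sketched as follows. The sketch assumes the pixel-layer label map obtained from step 4 is available as a list of rows, treats a boundary line as lying between two neighboring pixels with different labels, and uses an illustrative function name; none of these details are prescribed by the description.

```python
def boundary_band(labels, n_l=3):
    """Return the set of (row, col) pixels within n_l pixels of a label change."""
    height, width = len(labels), len(labels[0])
    band = set()
    for y in range(height):
        for x in range(width):
            # Left/right boundary: the line lies between columns x and x + 1.
            if x + 1 < width and labels[y][x] != labels[y][x + 1]:
                for dx in range(-n_l + 1, n_l + 1):
                    if 0 <= x + dx < width:
                        band.add((y, x + dx))
            # Upper/lower boundary: the line lies between rows y and y + 1.
            if y + 1 < height and labels[y][x] != labels[y + 1][x]:
                for dy in range(-n_l + 1, n_l + 1):
                    if 0 <= y + dy < height:
                        band.add((y + dy, x))
    return band
```

Only the pixels in the returned band would enter the local graph I^e; all other pixels keep the labels assigned by the global optimization.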

Step 6, interaction

If the user is not satisfied with the segmentation result, the method returns to step 2 and the user continues to add foreground and background cues. Each added stroke triggers one complete segmentation process, which refines the existing segmentation further, until a satisfactory result is obtained.