CN107301642A - Full-automatic foreground and background separation method based on binocular vision - Google Patents
Full-automatic foreground and background separation method based on binocular vision
- Publication number
- CN107301642A CN107301642A CN201710402285.5A CN201710402285A CN107301642A CN 107301642 A CN107301642 A CN 107301642A CN 201710402285 A CN201710402285 A CN 201710402285A CN 107301642 A CN107301642 A CN 107301642A
- Authority
- CN
- China
- Prior art keywords
- image
- disparity
- disparity map
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/136—Segmentation; Edge detection involving thresholding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/187—Segmentation; Edge detection involving region growing; involving region merging; involving connected component labelling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/194—Segmentation; Edge detection involving foreground-background segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10004—Still image; Photographic image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10024—Color image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20021—Dividing image into blocks, subimages or windows
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20112—Image segmentation details
- G06T2207/20152—Watershed segmentation
Landscapes
- Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Image Processing (AREA)
Abstract
The present invention proposes a full-automatic foreground and background separation method based on binocular vision. In the initialization phase, an initial disparity map is first generated with a local matching algorithm; a trimap is then generated from the initial disparity map, and the initial opacity α is solved with the color-linearity assumption proposed by Levin et al. In the iterative optimization stage, the opacity information of the left and right images is first incorporated into the cost aggregation function to enhance the disparity map, especially in boundary regions. The enhanced disparity map in turn provides a more reliable trimap for matting, and the disparity gradient is inserted into the matting formulation as a diffusion map. The whole optimization process iterates until a satisfactory result is obtained. Experiments verify that the method is robust, clearly reduces the error at disparity map boundaries, and improves the accuracy of image matting.
Description
Technical Field
The invention belongs to the technical field of digital image processing, and particularly relates to a full-automatic foreground and background separation method based on binocular vision.
Background
In the field of machine vision, binocular stereo vision occupies an increasingly important position and is widely applied in medical diagnosis, mobile phone photography, image tracking and other fields. Requirements on the quality of the depth image, especially on its edge regions, are becoming ever more demanding, yet conventional stereo matching algorithms find it difficult to produce accurate depth map boundaries. A new method is therefore needed to compensate for this shortcoming of stereo matching algorithms.
One promising approach is to exploit the rich boundary detail that image matting captures. However, existing classical matting algorithms require the user to first mark the definite foreground and the definite background, so automatic matting cannot be achieved. This greatly limits the range of applications for matting, for example on mobile phones. Worse, it makes the matting quality depend entirely on how representative the user-specified foreground and background are: if the definite foreground and background marked by the user are not comprehensive, the matting quality degrades sharply.
Disclosure of Invention
Aiming at the defects of existing methods, the invention provides a full-automatic foreground and background separation method based on binocular vision. A depth map can be regarded as a rough "matte"; it can therefore supply image matting with a reliable foreground-background partition and automatically generate a trimap, while the depth values also provide new information for the matting algorithm, making multi-layer matting more accurate. The invention combines a binocular stereo matching algorithm with an image matting algorithm so that the two enhance each other iteratively. Matting becomes automatic, requiring no human-computer interaction, which makes the technique more convenient, widens its range of application, and allows it to be used on existing dual-camera mobile phones.
The technical scheme of the invention is as follows:
a full-automatic foreground and background separation method based on binocular vision comprises the following steps:
S1, obtaining an initial disparity map by applying a local matching algorithm to the row-aligned binocular images;
S2, automatically generating a trimap from the initial disparity map obtained in step S1 to obtain an initial matte;
S3, merging the matting information of the initial matte obtained in step S2 into the local matching algorithm of step S1 to obtain an optimized disparity map;
S4, using the optimized disparity map obtained in step S3 and taking the disparity and color information together as a smoothing term to optimize the initial matte obtained in step S2;
S5, repeating steps S3 and S4 two or more times, and outputting the final disparity map and matte.
In the invention:
The implementation method of step S1 is:
Step S11: select a window of size (2n+1)×(2n+1) in the left and right images of the binocular pair and, for candidate disparities D = {-d_max, …, -1, 0, 1, …, d_max} (d_max being the maximum disparity range), compute the color distance cost C_I, the gradient distance cost C_∇ and the total matching cost function C:
C_I(x,y,d) = Σ_{i=-n}^{n} Σ_{j=-n}^{n} | I_l(x+i, y+j) - I_r(x+d+i, y+j) |    (1)
C_∇(x,y,d) = Σ_{i=-n}^{n} Σ_{j=-n}^{n} | ∇I_l(x+i, y+j) - ∇I_r(x+d+i, y+j) |    (2)
C(x,y,d) = C_I(x,y,d) + λ·C_∇(x,y,d)    (3)
where x and y are the horizontal and vertical coordinates of the pixel p at the center of the window, d ∈ D, i and j are integers in [-n, n], I_l and I_r are the pixel values at corresponding points of the left and right images, ∇I_l and ∇I_r are the gradient values at those points, and λ is a weight balancing the influence of color and gradient information on the matching cost.
Step S12: select the optimal disparity value d_p of pixel p by winner-takes-all:
d_lp = argmin_{d∈D, d>0} C(x,y,d)    (4)
d_rp = argmin_{d∈D, d<0} C(x,y,d)
where d_lp is the initial disparity value of pixel p in the left image and d_rp is the initial disparity value of pixel p in the right image.
Step S13: traverse all windows of size (2n+1)×(2n+1) in the left and right images, obtain the optimal disparity value of every pixel in the same way, and generate the left and right initial disparity maps.
The implementation method of step S2 is:
Step S21: segment the initial disparity map obtained in step S1 with the watershed algorithm, and binarize the segmented disparity map into foreground and background according to a preset threshold.
a. Compute the gradient of the initial disparity map obtained in step S1 and threshold it:
g(x,y) = max(grad(d(x,y)), θ)    (5)
where d(x,y) is the disparity value at any point of the initial disparity map obtained in step S1, g(x,y) is the gradient value at that point, θ is the threshold, and grad() is the gradient function.
b. Segment the gradient image obtained in step a into a foreground part and a background part with the watershed algorithm.
Step S22: apply morphological erosion to the foreground and background obtained in step S21, binarize to obtain the definite foreground and definite background, and take the eroded band as the unknown region, yielding the trimap.
Step S23: compute the opacity α of every pixel in the left and right images of the binocular pair and generate the corresponding left and right initial mattes, where the energy function is:
J(α) = min_{a,b} α^T L α    (6)
where L is the matting Laplacian matrix whose (i, j)-th entry is:
L(i,j) = Σ_{k:(i,j)∈w_k} [ δ_ij - (1/|w_k|)·(1 + (A_i - μ_k)^T (Σ_k + (ε/|w_k|)·I_3)^{-1} (A_j - μ_k)) ]    (7)
where δ_ij is the Kronecker delta, A_i is the RGB three-dimensional vector of a pixel, μ_k is the mean of the vectors A_i inside an arbitrary 3×3 window w_k of the left or right image, |w_k| is the number of pixels in window w_k, ε is a constant ensuring numerical stability, Σ_k is the 3×3 covariance matrix of the window, and I_3 is the 3×3 identity matrix.
In step S3, the opacity α of each pixel of the left and right images calculated in step S2 is merged into the local matching algorithm of step S1 to obtain the optimized disparity map, as follows:
Step S31: in the initial mattes corresponding to the left and right images obtained in step S2, i.e. the left initial matte and the right initial matte, traverse all windows of size (2n+1)×(2n+1) and, for candidate disparities D = {1, 2, …, d_max}, compute the matching cost C_α between the opacities of the left and right initial mattes:
C_α(x,y,d) = Σ_{i=-n}^{n} Σ_{j=-n}^{n} | α_l(x+i, y+j) - α_r(x+d+i, y+j) |    (8)
where x and y are the horizontal and vertical coordinates of the window-center pixel p, d ∈ D, and α_l, α_r are the opacities at corresponding points of the left and right initial mattes;
Step S32: add C_α to formula (3) and compute the optimized binocular disparity cost aggregation function:
C'(x,y,d) = C(x,y,d) + ξ·C_α(x,y,d)    (9)
where ξ is a balance parameter with value range [0, 1].
Step S33: substitute the optimized binocular disparity cost obtained in step S32 into formula (4) to obtain the optimal disparity value of pixel p in the left and right initial mattes; obtain the optimal disparity values of all pixels in the same way, and generate the left and right optimized disparity maps.
In step S4, using the optimized disparity map obtained in step S3, the disparity and color information are taken together as a smoothing term to perform weighted filtering on the opacity α of each pixel of the left and right images obtained in step S2, optimizing the initial mattes of step S2:
α'(i) = [ Σ_{j∈W(i)} W_C(I(i),I(j)) · W_D(d(i),d(j)) · α(j) ] / [ Σ_{j∈W(i)} W_C(I(i),I(j)) · W_D(d(i),d(j)) ]
where W(i) is the filtering window around pixel i, and W_C and W_D are the color and disparity distance weights, computed as:
W_C(I(i),I(j)) = exp{ -||I_i - I_j||^2 / w_c }
W_D(d(i),d(j)) = exp{ -||d_i - d_j||^2 / w_d }
where w_c and w_d are preset parameters that adjust the distance weights of the color value I and the disparity value d, respectively.
The invention provides a full-automatic foreground and background separation method based on binocular vision. In the iterative optimization stage, opacity information from the left and right images is first fused into the cost aggregation function to enhance the disparity map, especially in boundary regions. The enhanced disparity map then provides a more reliable trimap for matting, and the disparity gradient is inserted into the matting formulation as a diffusion map. The whole optimization process iterates until a satisfactory result is obtained.
The invention combines binocular disparity and image matting, makes full use of the complementary information they provide, and obtains a high-quality disparity map and matte through iterative optimization. The idea is to use the disparity map to generate and enhance the matte automatically, and to use the rich boundary detail contained in the matte to improve the disparity map. Compared with manual matting algorithms, the method realizes automatic matting, can handle image regions that are difficult to label manually, and obtains a more accurate trimap. Experiments show that the algorithm is robust, the error at disparity map boundaries is significantly reduced, and the accuracy of image matting is improved.
Drawings
FIG. 1 is a flow chart of a fully automatic foreground and background separation method based on binocular vision according to the present invention;
FIG. 2 is a schematic diagram of obtaining trimap automatically according to the present invention.
FIG. 3 is a schematic diagram of the optimized enhanced depth map and matting of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail with reference to the accompanying drawings.
Referring to fig. 1, a flowchart of the full-automatic foreground and background separation method based on binocular vision of this embodiment is shown; it includes the following steps:
Step S1: apply a local matching algorithm to the row-aligned binocular images to obtain an initial disparity map.
Step S11: select a window of size (2n+1)×(2n+1) in the left and right images of the binocular pair and, for candidate disparities D = {-d_max, …, -1, 0, 1, …, d_max} (d_max being the maximum disparity range), compute the color distance cost C_I, the gradient distance cost C_∇ and the total matching cost function C:
C_I(x,y,d) = Σ_{i=-n}^{n} Σ_{j=-n}^{n} | I_l(x+i, y+j) - I_r(x+d+i, y+j) |    (1)
C_∇(x,y,d) = Σ_{i=-n}^{n} Σ_{j=-n}^{n} | ∇I_l(x+i, y+j) - ∇I_r(x+d+i, y+j) |    (2)
C(x,y,d) = C_I(x,y,d) + λ·C_∇(x,y,d)    (3)
where x and y are the horizontal and vertical coordinates of the pixel p at the center of the window, d ∈ D, i and j are integers in [-n, n], I_l and I_r are the pixel values at corresponding points of the left and right images, ∇I_l and ∇I_r are the gradient values at those points, and λ is a weight balancing the influence of color and gradient information on the matching cost.
Step S12: select the optimal disparity value d_p of pixel p by winner-takes-all:
d_lp = argmin_{d∈D, d>0} C(x,y,d)    (4)
d_rp = argmin_{d∈D, d<0} C(x,y,d)
where d_lp is the initial disparity value of pixel p in the left image and d_rp is the initial disparity value of pixel p in the right image.
Step S13: traverse all windows of size (2n+1)×(2n+1) in the left and right images, obtain the optimal disparity value of every pixel in the same way, and generate the left and right initial disparity maps.
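For concreteness, a minimal Python sketch of steps S11-S13 for one view might look as follows; the function name, the parameter defaults and the use of SciPy for box aggregation are illustrative assumptions rather than part of the patent, and rectified single-channel images are assumed.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def initial_disparity(left, right, d_max, n=2, lam=0.5):
    """Illustrative sketch of steps S11-S13 for the left view.

    `left` and `right` are rectified single-channel images; `n`, `lam`
    and `d_max` are example values, not values fixed by the patent.
    """
    left = left.astype(np.float32)
    right = right.astype(np.float32)
    grad_l = np.gradient(left, axis=1)          # horizontal gradient of left image
    grad_r = np.gradient(right, axis=1)

    h, w = left.shape
    cost = np.empty((h, w, d_max + 1), np.float32)
    for d in range(d_max + 1):                  # candidate disparities d = 0..d_max
        # I_r(x+d, y): shift the right image left by d columns
        r_shift = np.roll(right, -d, axis=1)
        g_shift = np.roll(grad_r, -d, axis=1)
        c_i = np.abs(left - r_shift)            # color distance cost, formula (1)
        c_g = np.abs(grad_l - g_shift)          # gradient distance cost, formula (2)
        pixel_cost = c_i + lam * c_g            # total matching cost, formula (3)
        # aggregate over the (2n+1)x(2n+1) window; a mean instead of a sum
        # does not change the winner-takes-all result
        cost[:, :, d] = uniform_filter(pixel_cost, size=2 * n + 1)

    return np.argmin(cost, axis=2)              # winner-takes-all, formula (4)
```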
Step S2: referring to fig. 2, automatically generate a trimap from the initial disparity map obtained in step S1, so as to obtain the initial mattes.
Step S21: segment the disparity map obtained in step S1 with the watershed algorithm, and binarize the segmented disparity map into foreground and background according to a preset threshold.
a. Compute the gradient of the initial disparity map obtained in step S1 and threshold it:
g(x,y) = max(grad(d(x,y)), θ)    (5)
where d(x,y) is the disparity value at any point of the initial disparity map obtained in step S1, g(x,y) is the gradient value at that point, θ is the threshold, and grad() is the gradient function.
b. Segment the gradient image obtained in step a into a foreground part and a background part with the watershed algorithm.
Step S22: apply morphological erosion to the foreground and background obtained in step S21, binarize to obtain the definite foreground and definite background, and take the eroded band as the unknown region, yielding the trimap.
Step S23: according to the color-linearity assumption proposed by Levin et al. (A. Levin, D. Lischinski, and Y. Weiss. A closed-form solution to natural image matting. IEEE Trans. on PAMI, 30(2):228-242, 2008), compute the opacity α of every pixel in the left and right images and generate the corresponding left and right initial mattes, where the energy function is:
J(α) = min_{a,b} α^T L α    (6)
where L is the matting Laplacian matrix whose (i, j)-th entry is:
L(i,j) = Σ_{k:(i,j)∈w_k} [ δ_ij - (1/|w_k|)·(1 + (A_i - μ_k)^T (Σ_k + (ε/|w_k|)·I_3)^{-1} (A_j - μ_k)) ]    (7)
where δ_ij is the Kronecker delta, A_i is the RGB three-dimensional vector of a pixel, μ_k is the mean of the vectors A_i inside an arbitrary 3×3 window w_k of the left or right image, |w_k| is the number of pixels in window w_k, ε is a constant ensuring numerical stability, Σ_k is the 3×3 covariance matrix of the window, and I_3 is the 3×3 identity matrix.
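The closed-form solution of formulas (6)-(7) can be sketched as follows. Solving the soft-constrained system (L + λD)α = λb is one common way of imposing the trimap constraints; that choice, together with the ε and λ values, is an assumption of this sketch rather than something stated in the patent.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

def matting_laplacian(img, eps=1e-7, r=1):
    """Matting Laplacian of formula (7); img is HxWx3 with values in [0, 1]."""
    h, w, _ = img.shape
    win = 2 * r + 1
    n_win = win * win
    idx = np.arange(h * w).reshape(h, w)
    # pixel indices of every full (2r+1)x(2r+1) window, shape (K, n_win)
    win_idx = np.stack([idx[y:y + h - 2 * r, x:x + w - 2 * r].ravel()
                        for y in range(win) for x in range(win)], axis=-1)
    win_pix = img.reshape(-1, 3)[win_idx]                    # (K, n_win, 3)

    mu = win_pix.mean(axis=1, keepdims=True)                 # window means mu_k
    diff = win_pix - mu                                      # A_i - mu_k
    cov = np.einsum('kni,knj->kij', diff, diff) / n_win      # window covariances Sigma_k
    inv = np.linalg.inv(cov + (eps / n_win) * np.eye(3))
    vals = np.eye(n_win) - (1.0 / n_win) * (
        1.0 + np.einsum('kni,kij,kmj->knm', diff, inv, diff))

    rows = np.repeat(win_idx, n_win, axis=1).ravel()
    cols = np.tile(win_idx, (1, n_win)).ravel()
    # duplicate (row, col) entries are summed when the COO matrix is converted
    return sp.coo_matrix((vals.ravel(), (rows, cols)), shape=(h * w, h * w)).tocsr()

def solve_alpha(img, trimap, lam=100.0):
    """Solve for the opacity alpha of formula (6) with the trimap as a soft constraint."""
    h, w = trimap.shape
    L = matting_laplacian(img)
    known = ((trimap == 0) | (trimap == 255)).ravel().astype(np.float64)
    b = (trimap.ravel() == 255).astype(np.float64)
    alpha = spsolve(L + lam * sp.diags(known), lam * b)      # can be slow on large images
    return np.clip(alpha, 0.0, 1.0).reshape(h, w)
```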
Step S3: blend the opacity of each pixel of the left and right images obtained in step S2 into the local matching algorithm of step S1 to obtain the optimized disparity map.
Step S31: in the initial mattes corresponding to the left and right images obtained in step S2, i.e. the left initial matte and the right initial matte, traverse all windows of size (2n+1)×(2n+1) and, for candidate disparities D = {1, 2, …, d_max} (d_max being the maximum disparity range), compute the matching cost C_α between the opacities of the left and right initial mattes:
C_α(x,y,d) = Σ_{i=-n}^{n} Σ_{j=-n}^{n} | α_l(x+i, y+j) - α_r(x+d+i, y+j) |    (8)
where x and y are the horizontal and vertical coordinates of the window-center pixel p, d ∈ D, and α_l, α_r are the opacities at corresponding points of the left and right initial mattes.
Step S32: add C_α to formula (3) and compute the optimized binocular disparity cost aggregation function:
C'(x,y,d) = C(x,y,d) + ξ·C_α(x,y,d)    (9)
where ξ is a balance parameter with value range [0, 1].
Step S33: substitute the optimized binocular disparity cost obtained in step S32 into formula (4) to obtain the optimal disparity value of pixel p in the left and right initial mattes; obtain the optimal disparity values of all pixels in the same way, and generate the left and right optimized disparity maps.
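A sketch of steps S31-S33 under the same illustrative assumptions as the earlier disparity sketch; the cost of formula (9) is simply recomputed with the opacity term of formula (8) added before winner-takes-all.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def refine_disparity(left, right, alpha_l, alpha_r, d_max, n=2, lam=0.5, xi=0.5):
    """Add the opacity matching cost C_alpha (formula (8)) to the color and
    gradient costs and re-run winner-takes-all (formula (9)); xi in [0, 1]."""
    left = left.astype(np.float32)
    right = right.astype(np.float32)
    grad_l = np.gradient(left, axis=1)
    grad_r = np.gradient(right, axis=1)

    h, w = left.shape
    cost = np.empty((h, w, d_max + 1), np.float32)
    for d in range(d_max + 1):
        shift = lambda a: np.roll(a, -d, axis=1)     # value at (x + d, y)
        c_i = np.abs(left - shift(right))            # formula (1)
        c_g = np.abs(grad_l - shift(grad_r))         # formula (2)
        c_a = np.abs(alpha_l - shift(alpha_r))       # opacity cost, formula (8)
        total = c_i + lam * c_g + xi * c_a           # formulas (3) and (9)
        cost[:, :, d] = uniform_filter(total, size=2 * n + 1)
    return np.argmin(cost, axis=2)
```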
Step S4: using the optimized disparity map obtained in step S3, take the disparity and color information together as a smoothing term to perform weighted filtering on the opacity α of each pixel of the left and right images obtained in step S2, optimizing the initial mattes of step S2:
α'(i) = [ Σ_{j∈W(i)} W_C(I(i),I(j)) · W_D(d(i),d(j)) · α(j) ] / [ Σ_{j∈W(i)} W_C(I(i),I(j)) · W_D(d(i),d(j)) ]
where W(i) is the filtering window around pixel i, and W_C and W_D are the color and disparity distance weights, computed as:
W_C(I(i),I(j)) = exp{ -||I_i - I_j||^2 / w_c }
W_D(d(i),d(j)) = exp{ -||d_i - d_j||^2 / w_d }
where w_c and w_d are preset parameters that adjust the distance weights of the color value I and the disparity value d, respectively.
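A brute-force sketch of the weighted filtering of step S4 follows; the window radius and the w_c, w_d values are illustrative, and the double loop is written for clarity rather than speed.

```python
import numpy as np

def refine_alpha(alpha, img, disp, radius=5, wc=0.02, wd=4.0):
    """Smooth alpha with the joint color/disparity weights W_C and W_D above.

    `img` is HxWx3 with values in [0, 1]; `disp` is the optimized disparity map."""
    h, w = alpha.shape
    out = np.zeros_like(alpha)
    disp = disp.astype(np.float64)
    for y in range(h):
        for x in range(w):
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            w_c = np.exp(-np.sum((img[y0:y1, x0:x1] - img[y, x]) ** 2, axis=-1) / wc)
            w_d = np.exp(-(disp[y0:y1, x0:x1] - disp[y, x]) ** 2 / wd)
            weight = w_c * w_d
            out[y, x] = np.sum(weight * alpha[y0:y1, x0:x1]) / np.sum(weight)
    return out
```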
Step S5: repeat steps S3 and S4 for 2-3 iterations, and output the final disparity map and matte, as shown in fig. 3.
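Tying the pieces together, a hypothetical driver for steps S1-S5 could look like the following; it reuses the sketches above, uses the median disparity as an illustrative trimap threshold, and is a simplification of the embodiment rather than the embodiment itself.

```python
import cv2
import numpy as np

def separate_foreground(left_bgr, right_bgr, d_max, iters=2):
    """Hypothetical end-to-end loop over steps S1-S5, reusing initial_disparity,
    disparity_to_trimap, solve_alpha, refine_disparity and refine_alpha."""
    left_g = cv2.cvtColor(left_bgr, cv2.COLOR_BGR2GRAY).astype(np.float32)
    right_g = cv2.cvtColor(right_bgr, cv2.COLOR_BGR2GRAY).astype(np.float32)
    left = left_bgr.astype(np.float32) / 255.0
    right = right_bgr.astype(np.float32) / 255.0

    disp = initial_disparity(left_g, right_g, d_max)                 # step S1
    trimap = disparity_to_trimap(disp, fg_thresh=np.median(disp))    # step S2
    alpha_l = solve_alpha(left, trimap)
    alpha_r = solve_alpha(right, trimap)

    for _ in range(iters):                                           # step S5
        disp = refine_disparity(left_g, right_g, alpha_l, alpha_r, d_max)   # step S3
        trimap = disparity_to_trimap(disp, fg_thresh=np.median(disp))       # step S4
        alpha_l = refine_alpha(solve_alpha(left, trimap), left, disp)
        alpha_r = refine_alpha(solve_alpha(right, trimap), right, disp)
    return disp, alpha_l
```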
The foregoing description of the preferred embodiments of the present invention has been included to describe the features of the invention in detail, and is not intended to limit the inventive concepts to the particular forms of the embodiments described, as other modifications and variations within the spirit of the inventive concepts will be protected by this patent. The subject matter of the present disclosure is defined by the claims, not by the detailed description of the embodiments.
Claims (6)
1. A full-automatic foreground and background separation method based on binocular vision is characterized by comprising the following steps:
S1, obtaining an initial disparity map by applying a local matching algorithm to the row-aligned binocular images;
S2, automatically generating a trimap from the initial disparity map obtained in step S1 to obtain an initial matte;
S3, merging the matting information of the initial matte obtained in step S2 into the local matching algorithm of step S1 to obtain an optimized disparity map;
S4, using the optimized disparity map obtained in step S3 and taking the disparity and color information together as a smoothing term to optimize the initial matte obtained in step S2;
S5, repeating steps S3 and S4 two or more times, and outputting the final disparity map and matte.
2. The binocular vision based full-automatic foreground and background separation method of claim 1, wherein: the implementation method of step S1 is:
step S11: selecting a window of size (2n+1)×(2n+1) in the left and right images of the binocular pair and, for candidate disparities D = {-d_max, …, -1, 0, 1, …, d_max}, calculating the color distance cost C_I, the gradient distance cost C_∇ and the total matching cost function C:
C_I(x,y,d) = Σ_{i=-n}^{n} Σ_{j=-n}^{n} | I_l(x+i, y+j) - I_r(x+d+i, y+j) |    (1)
C_∇(x,y,d) = Σ_{i=-n}^{n} Σ_{j=-n}^{n} | ∇I_l(x+i, y+j) - ∇I_r(x+d+i, y+j) |    (2)
C(x,y,d) = C_I(x,y,d) + λ·C_∇(x,y,d)    (3)
wherein x and y are the horizontal and vertical coordinates of the pixel p at the center of the window, d ∈ D, i and j are integers in [-n, n], I_l and I_r are the pixel values at corresponding points of the left and right images, ∇I_l and ∇I_r are the gradient values at those points, and λ is a weight balancing the influence of color and gradient information on the matching cost;
step S12: selecting the optimal disparity value d_p of pixel p by winner-takes-all:
d_lp = argmin_{d∈D, d>0} C(x,y,d)    (4)
d_rp = argmin_{d∈D, d<0} C(x,y,d)
wherein d_lp is the initial disparity value of pixel p in the left image and d_rp is the initial disparity value of pixel p in the right image;
step S13: traversing all windows of size (2n+1)×(2n+1) in the left and right images, obtaining the optimal disparity value of every pixel in the same way, and generating the left and right initial disparity maps.
3. The binocular vision based full-automatic foreground and background separation method of claim 2, wherein: the implementation method of S2 is:
step S21: segmenting the initial disparity map obtained in step S1 with the watershed algorithm, and binarizing the segmented disparity map into foreground and background according to a preset threshold;
step S22: applying morphological erosion to the foreground and background obtained in step S21, binarizing to obtain the definite foreground and definite background, and taking the eroded band as the unknown region to obtain the trimap;
step S23: calculating the opacity α of every pixel in the left and right images of the binocular pair and generating the corresponding left and right initial mattes, wherein the energy function is:
J(α) = min_{a,b} α^T L α    (6)
where L is the matting Laplacian matrix whose (i, j)-th entry is:
L(i,j) = Σ_{k:(i,j)∈w_k} [ δ_ij - (1/|w_k|)·(1 + (A_i - μ_k)^T (Σ_k + (ε/|w_k|)·I_3)^{-1} (A_j - μ_k)) ]    (7)
wherein δ_ij is the Kronecker delta, A_i is the RGB three-dimensional vector of a pixel, μ_k is the mean of the vectors A_i inside an arbitrary 3×3 window w_k of the left or right image, |w_k| is the number of pixels in window w_k, ε is a constant ensuring numerical stability, Σ_k is the 3×3 covariance matrix of the window, and I_3 is the 3×3 identity matrix.
4. The binocular vision based full-automatic foreground and background separation method of claim 3, wherein: the implementation method of S21 is as follows:
a. calculating the gradient of the initial disparity map obtained in step S1 and thresholding it:
g(x,y) = max(grad(d(x,y)), θ)    (5)
wherein d(x,y) is the disparity value at any point of the initial disparity map obtained in step S1, g(x,y) is the gradient value at that point, θ is the threshold, and grad() is the gradient function;
b. segmenting the gradient image obtained in step a into a foreground part and a background part with the watershed algorithm.
5. The binocular vision based full-automatic foreground-background separation method of claim 3 or 4, wherein: in step S3, the opacity α of each pixel of the left and right images calculated in step S2 is merged into the local matching algorithm of step S1 to obtain the optimized disparity map, as follows:
step S31: in the initial mattes corresponding to the left and right images obtained in step S2, i.e. the left initial matte and the right initial matte, traversing all windows of size (2n+1)×(2n+1) and, for candidate disparities D = {1, 2, …, d_max}, calculating the matching cost C_α between the opacities of the left and right initial mattes:
C_α(x,y,d) = Σ_{i=-n}^{n} Σ_{j=-n}^{n} | α_l(x+i, y+j) - α_r(x+d+i, y+j) |    (8)
wherein x and y are the horizontal and vertical coordinates of the window-center pixel p, d ∈ D, and α_l, α_r are the opacities at corresponding points of the left and right initial mattes;
step S32: adding C_α to formula (3) and calculating the optimized binocular disparity cost aggregation function:
C'(x,y,d) = C(x,y,d) + ξ·C_α(x,y,d)    (9)
wherein ξ is a balance parameter with value range [0, 1];
step S33: substituting the optimized binocular disparity cost obtained in step S32 into formula (4) to obtain the optimal disparity value of pixel p in the left and right initial mattes, obtaining the optimal disparity values of all pixels in the same way, and generating the left and right optimized disparity maps.
6. The binocular vision based full-automatic foreground and background separation method of claim 5, wherein: in step S4, using the optimized disparity map obtained in step S3, the disparity and color information are taken together as a smoothing term to perform weighted filtering on the opacity α of each pixel of the left and right images obtained in step S2, optimizing the initial matte of step S2:
α'(i) = [ Σ_{j∈W(i)} W_C(I(i),I(j)) · W_D(d(i),d(j)) · α(j) ] / [ Σ_{j∈W(i)} W_C(I(i),I(j)) · W_D(d(i),d(j)) ]
wherein W(i) is the filtering window around pixel i, and W_C and W_D are the color and disparity distance weights, computed as:
W_C(I(i),I(j)) = exp{ -||I_i - I_j||^2 / w_c }
W_D(d(i),d(j)) = exp{ -||d_i - d_j||^2 / w_d }
wherein w_c and w_d are preset parameters that adjust the distance weights of the color value I and the disparity value d, respectively.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710402285.5A CN107301642B (en) | 2017-06-01 | 2017-06-01 | A kind of full-automatic prospect background segregation method based on binocular vision |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710402285.5A CN107301642B (en) | 2017-06-01 | 2017-06-01 | A kind of full-automatic prospect background segregation method based on binocular vision |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107301642A true CN107301642A (en) | 2017-10-27 |
CN107301642B CN107301642B (en) | 2018-05-15 |
Family
ID=60137891
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710402285.5A Active CN107301642B (en) | 2017-06-01 | 2017-06-01 | A kind of full-automatic prospect background segregation method based on binocular vision |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107301642B (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108682026A (en) * | 2018-03-22 | 2018-10-19 | 辽宁工业大学 | A kind of binocular vision solid matching method based on the fusion of more Matching units |
CN109493363A (en) * | 2018-09-11 | 2019-03-19 | 北京达佳互联信息技术有限公司 | A kind of FIG pull handle method, apparatus and image processing equipment based on geodesic distance |
CN109544622A (en) * | 2018-11-06 | 2019-03-29 | 深圳市爱培科技术股份有限公司 | A kind of binocular vision solid matching method and system based on MSER |
CN109544619A (en) * | 2018-11-06 | 2019-03-29 | 深圳市爱培科技术股份有限公司 | A kind of binocular vision solid matching method cut based on figure and system |
CN109741389A (en) * | 2018-11-27 | 2019-05-10 | 华南农业大学 | One kind being based on the matched sectional perspective matching process of region base |
CN110751668A (en) * | 2019-09-30 | 2020-02-04 | 北京迈格威科技有限公司 | Image processing method, device, terminal, electronic equipment and readable storage medium |
CN110889866A (en) * | 2019-12-04 | 2020-03-17 | 南京美基森信息技术有限公司 | Background updating method for depth map |
CN116703813A (en) * | 2022-12-27 | 2023-09-05 | 荣耀终端有限公司 | Image processing method and apparatus |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090060366A1 (en) * | 2007-08-27 | 2009-03-05 | Riverain Medical Group, Llc | Object segmentation in images |
CN103955918A (en) * | 2014-04-03 | 2014-07-30 | 吉林大学 | Full-automatic fine image matting device and method |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090060366A1 (en) * | 2007-08-27 | 2009-03-05 | Riverain Medical Group, Llc | Object segmentation in images |
CN103955918A (en) * | 2014-04-03 | 2014-07-30 | 吉林大学 | Full-automatic fine image matting device and method |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108682026A (en) * | 2018-03-22 | 2018-10-19 | 辽宁工业大学 | A kind of binocular vision solid matching method based on the fusion of more Matching units |
CN108682026B (en) * | 2018-03-22 | 2021-08-06 | 江大白 | Binocular vision stereo matching method based on multi-matching element fusion |
CN109493363A (en) * | 2018-09-11 | 2019-03-19 | 北京达佳互联信息技术有限公司 | A kind of FIG pull handle method, apparatus and image processing equipment based on geodesic distance |
CN109493363B (en) * | 2018-09-11 | 2019-09-27 | 北京达佳互联信息技术有限公司 | A kind of FIG pull handle method, apparatus and image processing equipment based on geodesic distance |
CN109544622A (en) * | 2018-11-06 | 2019-03-29 | 深圳市爱培科技术股份有限公司 | A kind of binocular vision solid matching method and system based on MSER |
CN109544619A (en) * | 2018-11-06 | 2019-03-29 | 深圳市爱培科技术股份有限公司 | A kind of binocular vision solid matching method cut based on figure and system |
CN109741389A (en) * | 2018-11-27 | 2019-05-10 | 华南农业大学 | One kind being based on the matched sectional perspective matching process of region base |
CN110751668A (en) * | 2019-09-30 | 2020-02-04 | 北京迈格威科技有限公司 | Image processing method, device, terminal, electronic equipment and readable storage medium |
CN110751668B (en) * | 2019-09-30 | 2022-12-27 | 北京迈格威科技有限公司 | Image processing method, device, terminal, electronic equipment and readable storage medium |
CN110889866A (en) * | 2019-12-04 | 2020-03-17 | 南京美基森信息技术有限公司 | Background updating method for depth map |
CN116703813A (en) * | 2022-12-27 | 2023-09-05 | 荣耀终端有限公司 | Image processing method and apparatus |
CN116703813B (en) * | 2022-12-27 | 2024-04-26 | 荣耀终端有限公司 | Image processing method and apparatus |
Also Published As
Publication number | Publication date |
---|---|
CN107301642B (en) | 2018-05-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107301642A (en) | A kind of full-automatic prospect background segregation method based on binocular vision | |
Tong et al. | Saliency detection with multi-scale superpixels | |
CN107274419B (en) | Deep learning significance detection method based on global prior and local context | |
CN102799669B (en) | Automatic grading method for commodity image vision quality | |
Seo et al. | Mixnerf: Modeling a ray with mixture density for novel view synthesis from sparse inputs | |
CN104504745B (en) | A kind of certificate photo generation method split based on image and scratch figure | |
CN102651135B (en) | Optimized direction sampling-based natural image matting method | |
CN103080979B (en) | From the system and method for photo synthesis portrait sketch | |
CN108320294B (en) | Intelligent full-automatic portrait background replacement method for second-generation identity card photos | |
Ju et al. | BDPK: Bayesian dehazing using prior knowledge | |
CN104966286A (en) | 3D video saliency detection method | |
CN112288758B (en) | Infrared and visible light image registration method for power equipment | |
Hua et al. | Extended guided filtering for depth map upsampling | |
CN107886471B (en) | Method for removing redundant objects of photo based on super-pixel voting model | |
CN104835146A (en) | Salient object segmenting method in stereo image based on depth information and image cutting | |
Du et al. | Improving RGBD saliency detection using progressive region classification and saliency fusion | |
CN107085848A (en) | Method for detecting significance of RGB-D (Red, Green and blue-D) image | |
CN108305260A (en) | Detection method, device and the equipment of angle point in a kind of image | |
CN106952292B (en) | 3D moving object detection method based on 6-degree-of-freedom scene stream clustering | |
CN107016698A (en) | Based on tapered plane smooth binocular solid matching process and device | |
CN104537381B (en) | A kind of fuzzy image recognition method based on fuzzy invariant features | |
CN116580028A (en) | Object surface defect detection method, device, equipment and storage medium | |
CN107452003A (en) | A kind of method and device of the image segmentation containing depth information | |
Wang | Image matting with transductive inference | |
CN107122782A (en) | A kind of half intensive solid matching method in a balanced way |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |