CN101299268B - Semantic object segmentation method suitable for low depth-of-field images - Google Patents

Semantic object segmentation method suitable for low depth-of-field images

Info

Publication number
CN101299268B
CN101299268B (application CN2008100400009A)
Authority
CN
China
Prior art keywords
image
sigma
formula
value
alpha
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2008100400009A
Other languages
Chinese (zh)
Other versions
CN101299268A (en)
Inventor
李伟伟
刘志
顾建栋
韩忠民
颜红波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Shanghai for Science and Technology
Original Assignee
University of Shanghai for Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Shanghai for Science and Technology filed Critical University of Shanghai for Science and Technology
Priority to CN2008100400009A priority Critical patent/CN101299268B/en
Publication of CN101299268A publication Critical patent/CN101299268A/en
Application granted granted Critical
Publication of CN101299268B publication Critical patent/CN101299268B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The present invention relates to a semantic object segmentation method suitable for low depth-of-field images. The method first introduces a gradient histogram to compute the distribution of the image in an energy space and, using the characteristics of low depth-of-field images, obtains an energy-focused saliency map. The saliency map is then refined with a bilateral filter and morphological tools, and an adaptive threshold is set and applied to obtain an initial object mask. Edge information obtained with the Canny operator is combined with this mask to produce a corrected object mask, which raises the segmentation accuracy of the object of interest. Finally, Bayesian matting is used to obtain the desired semantic object segmentation result, so that complicated image boundaries such as hair are handled finely. The method accurately segments the object of interest lying within the depth-of-field range of an image or video sequence.

Description

Semantic object segmentation method suitable for low depth-of-field images
Technical field
The present invention relates to a semantic object segmentation method applicable to low depth-of-field images. It differs from existing methods in the accuracy with which the distribution of the focused region is determined: an energy-focused saliency map is obtained from the energy space of the image. The method also uses the edge information of the image together with a matting algorithm to refine the object boundary, which improves the segmentation precision of the object.
Background art
Image segmentation is a major problem in image analysis, pattern recognition and computer vision, and at the same time a classical difficulty; the ultimate purpose of image segmentation is to segment out objects with concrete real-world meaning, i.e. semantic objects. Many classical segmentation algorithms have been proposed in the field, but when traditional image segmentation methods are relied upon to extract semantic objects from arbitrary images, the practicality and adaptability of the algorithms remain unsatisfactory, so proposing a reliable semantic object segmentation method applicable to a class of images has become a necessary research direction.
Low depth-of-field images are a special class of images that are widespread in everyday life. The depth of field is an important concept in photography: it is the range of object distances, measured along the optical axis in front of the lens or other imaging device, over which a sharp image can be obtained. Roughly speaking, it is the in-focus position together with a certain distance in front of and behind it within which the image is still sharp; this distance is the depth of field, shown as the length ΔL in Fig. 1. The depth of field is described as deep or shallow: a deep depth of field means the sharp range is long, a shallow one that it is short. A low depth-of-field image is an image with a shallow depth of field. In such an image, the sharp area is the region of interest that the photographer picks out using the depth-of-field concept, and a salient characteristic of a low depth-of-field image is that the imaged region inside the depth-of-field range is the region rich in high-frequency information. The premise of the present invention is to realise semantic object segmentation based on this uneven distribution of high-frequency information over the low depth-of-field region.
Summary of the invention
The object of the present invention is to provide a semantic object segmentation method applicable to low depth-of-field images. A low depth-of-field image is usually divided into a focused region and a defocused region; the area inside the depth-of-field range is where high-frequency information gathers, so the focused and defocused regions differ in their high-frequency content. This is the characteristic that distinguishes low depth-of-field images from general images, and it is precisely the property on which this method relies to segment semantic objects out of low depth-of-field images.
To achieve the above object, the concept of the present invention is as follows:
As shown in Fig. 2, the original low depth-of-field image is first mapped into the gradient energy space to obtain an energy-focused saliency map, which is improved by bilateral filtering and morphological processing. Next, an adaptive threshold is set to obtain the initial object mask, and the Canny operator is then used to obtain the image edge information with which the initial object mask is perfected into the amended object mask. Finally, the matting algorithm handles the complex boundaries of the object, and the accurate semantic object is extracted.
According to the above concept, the technical scheme of the present invention is:
A semantic object segmentation method applicable to low depth-of-field images, characterized in that:
A gradient histogram is first introduced to compute the distribution of the image in the energy space, and an energy-focused saliency map is obtained using the characteristics of low depth-of-field images. The saliency map is then processed with a bilateral filter and morphological tools. Next, an adaptive threshold is set and applied to obtain the initial object mask; to improve the segmentation precision of the object of interest, the edge information obtained with the Canny operator is combined with it to obtain the amended object mask. Finally, in order to handle complex image boundaries such as hair finely, Bayesian matting (Bayesian Matting) is used to obtain the desired semantic object segmentation result. The method can effectively and accurately segment the object of interest inside the depth-of-field range of an image or video sequence. The concrete steps are:
a. Gradient energy processing: the original image and a blurred image are transformed into the energy-space representation, which effectively preserves the edge information in the image; using the characteristics of low depth-of-field images, the energy-focused saliency map is obtained.
b. Bilateral filtering and morphological processing: the energy-focused saliency map is processed with a bilateral filter and morphological filtering; an adaptive threshold is then set, and thresholding yields the initial object mask.
c. Edge processing: to improve the object segmentation precision, the edge information of the object must be taken into account; the image is filtered with the Canny operator to obtain an edge object mask, which is combined with the initial object mask to obtain the amended object mask.
d. Matting: Bayesian matting is used to treat the semantic object boundary finely, giving the segmentation result of the low depth-of-field image.
The steps of the above gradient energy processing are:
First, the energy function of formula (1) is given: the partial derivatives of the image in the x and y directions are computed and the sum of their absolute values is taken as the numerator of the energy function. The maximum of the histogram of oriented gradients HoG (Histogram of Gradients) of each pixel is then found: the gradient orientations of all pixels in the 11 × 11 neighbourhood of pixel i(x, y) are distributed into 20 bins, and the largest bin count is taken as the denominator of that pixel; here i(x, y) is the image pixel value.
E_HoG = ( |∂i/∂x| + |∂i/∂y| ) / max(HoG(i(x, y)))    (1)
The characteristics of a low depth-of-field image are then examined:
1. A low depth of field is an important photographic technique: objects inside the depth-of-field range are rendered clearly, while objects and scenery outside it appear blurred, so an image can be divided into a focused region and a defocused region. Let r_fo(x, y) and r_unfo(x, y) denote the focused and defocused regions of the low depth-of-field image respectively; the original image i(x, y) can then be expressed as:
i(x, y) = r_fo(x, y) + r_unfo(x, y)    (2)
2. In a low depth-of-field image the focused region concentrates a large amount of high-frequency information, and this property can be used to find the distribution of the focused region in the image. Let α(x, y) be the blurring function, which mathematically is a low-pass filter.
i_α(x, y) = i(x, y) * α(x, y) = r_fo(x, y) * α(x, y) + r_unfo(x, y) * α(x, y)    (3)
Formula (3) expresses blurring i(x, y) to obtain the blurred image: the convolution with α(x, y) performs a low-pass filtering, whereby the texture, edges and other detail of the focused region are blurred while the defocused region hardly changes. Suppose the high-frequency content of the focused region is concentrated in f_high1 ~ f_high2, that the defocused region contains mainly low-frequency information concentrated in f_low1 ~ f_low2, and that the cut-off frequency of the low-pass filter α(x, y) is f_α; the value of f_α is adjusted so that most of the information in the band f_high1 ~ f_high2 is filtered out. In the frequency domain the blurred image is
I_α(u, v) = I(u, v) × α(u, v) = R_fo(u, v) × α(u, v) + R_unfo(u, v) × α(u, v)    (4)
From the above discussion, after the filtering formula (4) can be simplified to
I_α(u, v) = I(u, v) × α(u, v) ≈ R_unfo(u, v) × α(u, v) = R_unfo(u, v)    (5)
Formula (5) shows that after a low depth-of-field image is filtered, its high-frequency information is weakened or removed while its low-frequency information is essentially unchanged; subtracting the filtered image from the original image therefore yields the distribution of the high-frequency information in the original image.
Combining the above with formula (6) gives the energy-focused saliency map EFSM (Energy Focused Saliency Map), where E_i(x, y) and E_iα(x, y) denote the energy functions of the original image and the blurred image after they are mapped into the energy space. The energy function computed through the gradient histogram preserves edge, texture and similar information in the image and makes changes in that information stand out clearly, so the high-frequency distribution computed in the energy space is more accurate.
EFSM(x, y) = |E_i(x, y) − E_iα(x, y)|    (6)
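A minimal sketch of formulas (1) and (6) follows, assuming Sobel derivatives for the numerator of formula (1) and a Gaussian blur as the blurring function α(x, y); the blur coefficient σ = 6 is the value quoted in the embodiment below, and all other choices (OpenCV, kernel sizes) are assumptions rather than requirements of the method.

```python
import cv2
import numpy as np

def energy_map(gray):
    """E_HoG of formula (1): (|di/dx| + |di/dy|) / max(HoG(i(x, y)))."""
    gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1)
    numerator = np.abs(gx) + np.abs(gy)
    # Direction-gradient histogram: split [0, 2*pi) into 20 bins and count each
    # bin inside an 11x11 neighbourhood; the largest count is the denominator.
    angle = np.mod(np.arctan2(gy, gx), 2 * np.pi)
    bins = np.minimum((angle * 20 / (2 * np.pi)).astype(int), 19)
    hog_max = np.zeros_like(numerator)
    for b in range(20):
        count = cv2.boxFilter((bins == b).astype(np.float64), -1, (11, 11), normalize=False)
        hog_max = np.maximum(hog_max, count)
    return numerator / np.maximum(hog_max, 1.0)

def efsm(gray, blur_sigma=6.0):
    """Energy-focused saliency map of formula (6)."""
    gray = gray.astype(np.float64)
    blurred = cv2.GaussianBlur(gray, (0, 0), blur_sigma)   # i_alpha = i * alpha
    return np.abs(energy_map(gray) - energy_map(blurred))  # formula (6)
```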
The bilateral filtering and morphological processing are implemented as follows:
(1) The EFSM is smoothed with a bilateral filter, which weights the input pixel by the brightness similarity and the spatial similarity of its neighbourhood, as in formula (7):
BFSM(x, y) = (1/K) Σ_{(j,k)∈η(x,y)} exp( −((j − x)² + (k − y)²) / (2σ_b²) ) · exp( −(EFSM(j, k) − EFSM(x, y))² / (2σ_i²) ) · EFSM(j, k)
where K is the normalisation factor and η(x, y) denotes the smoothing window centred at (x, y):
K = Σ_{(j,k)∈η(x,y)} exp( −((j − x)² + (k − y)²) / (2σ_b²) ) · exp( −(EFSM(j, k) − EFSM(x, y))² / (2σ_i²) )    (7)
(2) A morphological closing is applied to BFSM to obtain MFSM.
MFSM = close(BFSM)    (8)
(3) Threshold setting: the mean value mean and the standard deviation std of MFSM are computed, and the threshold thre is set to their sum.
thre = mean(MFSM) + std(MFSM)    (9)
(4) Thresholding yields the initial object mask OOM (Original Object Mask); a code sketch of steps (1)-(4) follows formula (10).
OOM(x, y) = 1 if MFSM(x, y) ≥ thre, 0 if MFSM(x, y) < thre    (10)
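A minimal sketch of steps (1)-(4), assuming OpenCV: the σ_b = 9 and σ_i = 0.8 values follow the statistics quoted in the embodiment below, while the min-max normalisation of the EFSM (so that σ_i = 0.8 is meaningful), the structuring-element size and the use of cv2.bilateralFilter are assumptions.

```python
import cv2
import numpy as np

def initial_object_mask(efsm_map, sigma_b=9.0, sigma_i=0.8):
    # Assumption: normalise the EFSM to [0, 1] so the intensity sigma is comparable.
    efsm32 = cv2.normalize(efsm_map.astype(np.float32), None, 0, 1, cv2.NORM_MINMAX)
    # Bilateral filter of formula (7): space sigma sigma_b, intensity sigma sigma_i.
    bfsm = cv2.bilateralFilter(efsm32, d=-1, sigmaColor=sigma_i, sigmaSpace=sigma_b)
    # Morphological closing of formula (8) to connect edges and drop small gaps.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7, 7))
    mfsm = cv2.morphologyEx(bfsm, cv2.MORPH_CLOSE, kernel)
    # Adaptive threshold of formula (9), binarised as in formula (10).
    thre = float(mfsm.mean() + mfsm.std())
    return (mfsm >= thre).astype(np.uint8)  # OOM
```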
The edge processing is implemented as follows:
(1) The grey-level image is filtered with the Canny operator to obtain the image edge information.
(2) The edge information is combined with the initial object mask to obtain the edges inside the mask; because these edges are generally only parts of the original edges, a neighbourhood search expands them so that they recover the original edge length.
(3) The closed regions formed by the edges are hole-filled, giving an edge object mask EOM (Edge Object Mask) built from the edge information.
(4) OOM and EOM are superposed and a morphological opening is applied to the resulting binary map in order to smooth the contour and remove small noise points, giving the amended object mask AOM (Amended Object Mask); a code sketch of these steps is given after the list.
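A minimal sketch of steps (1)-(4), assuming OpenCV and SciPy: the Canny thresholds, the dilation standing in for the neighbourhood search, and the kernel sizes are assumptions, not values given in the patent.

```python
import cv2
import numpy as np
from scipy.ndimage import binary_fill_holes

def amended_object_mask(gray, oom):
    """gray: 8-bit grey-level image; oom: binary (0/1) initial object mask."""
    edges = cv2.Canny(gray, 50, 150)                       # image edge information
    inner = cv2.bitwise_and(edges, edges, mask=oom)        # edges inside the mask
    # Recover the original edge length: a small dilation stands in for the
    # neighbourhood search described in step (2).
    inner = cv2.dilate(inner, np.ones((3, 3), np.uint8))
    eom = binary_fill_holes(inner > 0).astype(np.uint8)    # "hole" filling -> EOM
    aom = cv2.bitwise_or(oom, eom)                         # superpose OOM and EOM
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    return cv2.morphologyEx(aom, cv2.MORPH_OPEN, kernel)   # smooth contour, drop noise
```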
The matting algorithm is implemented as follows:
(1) A trimap is obtained from the AOM: the AOM boundary is eroded by a certain width to give the focused foreground region TRI_F; the AOM is dilated by a certain pixel width and inverted to give the defocused background region TRI_D; the uncertain region TRI_U is given by formula (13). The erosion and dilation widths are decided by the proportion of the whole image occupied by the AOM: typically the eroded boundary band amounts to 6% of the AOM area and the dilated boundary band to 10% of the AOM area. Since the AOM already essentially locates the semantic object in the image, the matting algorithm is used only to treat the object boundary finely and make the result more accurate. A sketch of this trimap construction follows formulas (11)-(13).
TRI_F = erode(AOM)    (11)
TRI_D = ~dilate(AOM)    (12)
TRI_U = ~(TRI_F ∪ TRI_D)    (13)
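A minimal sketch of the trimap construction of formulas (11)-(13), assuming OpenCV: erosion and dilation are repeated with a 3 × 3 structuring element until the removed and added bands reach roughly 6% and 10% of the AOM area; the iteration cap and the 0/128/255 label encoding are assumptions.

```python
import cv2
import numpy as np

def build_trimap(aom, max_iters=100):
    """aom: binary (0/1) amended object mask."""
    area = int(aom.sum())
    kernel = np.ones((3, 3), np.uint8)
    fg = aom.copy()
    for _ in range(max_iters):                       # eroded band ~6% of the AOM area
        if area - int(fg.sum()) >= 0.06 * area:
            break
        fg = cv2.erode(fg, kernel)
    bg = aom.copy()
    for _ in range(max_iters):                       # dilated band ~10% of the AOM area
        if int(bg.sum()) - area >= 0.10 * area:
            break
        bg = cv2.dilate(bg, kernel)
    trimap = np.full(aom.shape, 128, np.uint8)       # TRI_U: unknown region (formula 13)
    trimap[fg > 0] = 255                             # TRI_F: definite foreground (formula 11)
    trimap[bg == 0] = 0                              # TRI_D: definite background (formula 12)
    return trimap
```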
(2) Bayesian matting (Bayesian Matting) is applied to the uncertain region TRI_U: each pixel inside the uncertain region must be assigned to the foreground or to the background.
The matting algorithm is based on the formula
C = αF + (1 − α)B    (14)
where C is the colour of the unknown pixel, α is the opacity, B is the background colour and F is the foreground colour. The present invention adopts Bayesian matting: a Bayesian framework is defined to formulate the parameters of formula (14), and a maximum a posteriori problem is solved, as in formula (15), where F, B and α are respectively the foreground colour, background colour and opacity to be solved for and C is the known colour of the image. With C known, the task is to find the F, B and α that maximise the probability P; L denotes the logarithm, which turns the products into sums and simplifies the computation.
arg max P(F, B, α | C)
= arg max_{F,B,α} P(C | F, B, α) P(F) P(B) P(α) / P(C)    (15)
= arg max_{F,B,α} L(C | F, B, α) + L(F) + L(B) + L(α)
Each term on the right-hand side is modelled separately and the solution is an iterative process: with α assumed known, taking partial derivatives of the modelled right-hand side reduces the problem to the simple linear system below, from which F and B are found.
[ Σ_F⁻¹ + Iα²/σ_C²      Iα(1 − α)/σ_C²          ] [ F ]   [ Σ_F⁻¹ F̄ + Cα/σ_C²      ]
[ Iα(1 − α)/σ_C²        Σ_B⁻¹ + I(1 − α)²/σ_C²  ] [ B ] = [ Σ_B⁻¹ B̄ + C(1 − α)/σ_C² ]    (16)
With the values of F and B obtained from formula (16), taking the partial derivative with respect to α gives formula (17):
α = (C − B) · (F − B) / ‖F − B‖²    (17)
The initial α can be taken as the mean value of nearby pixels; over the iterations described above, the candidate solution that maximises formula (15) is kept, which finally yields the opacity. After the matting algorithm, hair on the boundary of a complex object and other subtle differences can be identified, which greatly improves the segmentation precision.
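A minimal single-Gaussian sketch of the iteration of formulas (16)-(17) for one unknown pixel, under assumptions: the cluster statistics F̄, Σ_F, B̄, Σ_B come from nearby known foreground and background samples, σ_C models camera noise, and the parameter values and stopping rule are illustrative rather than prescribed by the patent.

```python
import numpy as np

def solve_pixel(C, F_bar, Sigma_F, B_bar, Sigma_B, alpha0, sigma_C=0.01, iters=20):
    """C, F_bar, B_bar: RGB vectors; Sigma_F, Sigma_B: 3x3 covariances; alpha0: initial opacity."""
    I = np.eye(3)
    inv_F, inv_B = np.linalg.inv(Sigma_F), np.linalg.inv(Sigma_B)
    alpha = float(alpha0)
    for _ in range(iters):
        # Formula (16): 6x6 linear system for F and B at fixed alpha.
        A = np.block([
            [inv_F + I * alpha ** 2 / sigma_C ** 2,    I * alpha * (1 - alpha) / sigma_C ** 2],
            [I * alpha * (1 - alpha) / sigma_C ** 2,   inv_B + I * (1 - alpha) ** 2 / sigma_C ** 2],
        ])
        b = np.concatenate([inv_F @ F_bar + C * alpha / sigma_C ** 2,
                            inv_B @ B_bar + C * (1 - alpha) / sigma_C ** 2])
        x = np.linalg.solve(A, b)
        F, B = x[:3], x[3:]
        # Formula (17): closed-form alpha at fixed F and B.
        alpha = float(np.clip((C - B) @ (F - B) / (np.dot(F - B, F - B) + 1e-12), 0.0, 1.0))
    return F, B, alpha
```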
Compared with the prior art, the present invention has the following outstanding features and remarkable advantages: the introduction of the energy space and the correction by edge information make the object mask better satisfy the completeness of the object while preserving object detail. The present invention can segment almost all low depth-of-field images accurately and has very strong adaptability.
Description of drawings
Fig. 1 is a schematic diagram of the depth-of-field concept used in the present invention.
Fig. 2 is the structural flow block diagram of the low depth-of-field image segmentation of the present invention.
Fig. 3 is the program flow chart of the low depth-of-field image segmentation of the present invention.
Fig. 4 is the structural block diagram of the path from EFSM to OOM in Fig. 3.
Fig. 5 is the structural block diagram of the edge processing in Fig. 2.
Fig. 6 illustrates the segmentation of an image and gives the results of the key steps.
Fig. 7 gives segmentation results for other types of low depth-of-field image.
Embodiment
An embodiment of the present invention is described in detail below with reference to the accompanying drawings:
The semantic object segmentation method based on low depth-of-field images of the present invention follows the flow chart of Fig. 3 and was implemented in software on a PC test platform with a 3.0 GHz CPU and 1024 MB of memory; Fig. 6 and Fig. 7 show the simulation test results.
Referring to Fig. 2, semantic object segmentation based on the low depth of field can segment out the objects concentrated inside the low depth-of-field range of the image and thus extract the object. The difference in high-frequency distribution between the different depth-of-field regions of the image is used to locate the focused region, and a series of algorithms then optimise the object segmentation.
Fig. 3 shows the program flow of the overall technical scheme of this embodiment.
The above technical scheme is further described below:
(1) Conversion of the spatial-domain image to the energy-space representation:
The gradient histogram is computed for every pixel i(x, y) in the image. The colour image is first converted to a grey-level image, and the grey-level image is blurred with blurring coefficient σ = 6. The energy function computes the phase angle of the pixels in an 11 × 11 window; the phase-angle range [0, 2π] is divided into 20 equal bins, the bin to which each phase angle inside the window belongs is determined, and the largest bin count is taken as the gradient-histogram value of that pixel. Note that dividing the phase angle into too few bins reduces the precision of the histogram. Taking the horizontal and vertical partial derivatives of the image in practice means filtering the image horizontally and vertically; the horizontal filter is [-1 0 1; -1 0 1; -1 0 1] and the vertical filter is [-1 -1 -1; 0 0 0; 1 1 1].
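A sketch of the embodiment's directional filtering, assuming Prewitt-type kernels consistent with the listing above and OpenCV's filter2D; the exact kernel signs are an assumption. These calls could replace the Sobel derivatives in the energy_map() sketch given earlier.

```python
import cv2
import numpy as np

H_KERNEL = np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]], np.float64)   # horizontal filter
V_KERNEL = np.array([[-1, -1, -1], [0, 0, 0], [1, 1, 1]], np.float64)   # vertical filter

def directional_gradients(gray):
    """Horizontal and vertical partial derivatives via the filters given above."""
    gray = gray.astype(np.float64)
    gx = cv2.filter2D(gray, -1, H_KERNEL)
    gy = cv2.filter2D(gray, -1, V_KERNEL)
    return gx, gy
```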
(2) From the energy-focused saliency map (EFSM) to the initial object mask (OOM):
As shown in Fig. 4, three steps are needed. The EFSM is first bilaterally filtered; from formula (7) the two parameters σ_b and σ_i must be set, and statistics over a large number of experiments give the values 9 and 0.8. These values control how strongly the edges of the EFSM are smoothed: the smaller the values, the stronger the smoothing, but a lot of valuable edge information may then be filtered out; the statistical values given above satisfy most needs. Next, a morphological closing is applied to BFSM in order to connect edges and remove unnecessary detail. Finally, formulas (9) and (10) are used to threshold MFSM and obtain the OOM.
(3) Canny filtering obtains the edge information of the object:
The edge processing follows the flow of Fig. 5. The original grey-level image is first Canny-filtered; this filter can currently be regarded as the most effective edge detector. Its basic procedure is: compute the local gradient and the edge orientation angle at every point; define edge points as the local maxima of the gradient magnitude along the gradient direction; edge points give rise to ridges in the gradient-magnitude image, so the tops of all ridges are traced and every pixel not on a ridge top is set to zero, which conveniently yields thin lines; finally, boundary chaining over 8-neighbourhoods links the edges into edge lines.
The edge information obtained by filtering the image is then judged against the OOM to eliminate cluttered edges, keeping the edges inside the mask. Because these edges may be only parts of the original edges, a neighbourhood search referring to the original edge information is carried out to prolong the edge lines and preserve their completeness. The edge mask is then extracted: the closed regions formed by the edge lines under the mask are hole-filled, giving the edge object mask.
(4) Obtaining the amended object mask and applying the matting algorithm:
Combining OOM and EOM overcomes the loss of accurate object information that either one alone would suffer; their superposition gives the amended object mask. Formulas (11), (12) and (13) then give the trimap, and the matting algorithm finally produces the segmentation result.
The concrete steps are:
a. Gradient energy processing: the original image and the blurred image are converted into the energy-space representation, which effectively preserves the edge information in the image; using the characteristics of low depth-of-field images, the energy-focused saliency map is obtained.
b. Bilateral filtering and morphological processing: the energy-focused saliency map is processed with a bilateral filter and morphological filtering; an adaptive threshold is then set, and thresholding yields the initial object mask.
c. Edge processing: to improve the object segmentation precision, the edge information of the object must be taken into account; the image is filtered with the Canny filter to obtain an edge object mask, which is combined with the initial object mask to obtain the amended object mask.
d. Matting: Bayesian matting is used to treat the semantic object boundary finely, giving the segmentation result of the low depth-of-field image.
The steps of the above gradient energy processing are:
(1) The original image i(x, y) is low-pass filtered to obtain i_α(x, y), and the following processing is applied to both images;
(2) the partial derivatives of the image in the x and y directions are taken and their absolute values form the numerator of the energy function, as in formula (1);
(3) the maximum of the histogram of oriented gradients HoG of each pixel is found and used as the denominator of that pixel's energy function;
(4) the results of the two preceding steps are substituted into formula (1), E_HoG = ( |∂i/∂x| + |∂i/∂y| ) / max(HoG(i(x, y))), where i(x, y) is the image pixel value, giving the energy distribution function E_HoG of the image;
(5) the energy function is computed for each of the two images i(x, y) and i_α(x, y), giving the two energy-space maps E_i(x, y) and E_iα(x, y);
(6) according to the low depth-of-field characteristics, formula (6), EFSM(x, y) = |E_i(x, y) − E_iα(x, y)|, gives the energy-focused saliency map.
The bilateral filtering and morphological processing are implemented as follows:
(1) the energy-focused saliency map is smoothed with a bilateral filter to obtain BFSM;
(2) a morphological closing is applied to BFSM to obtain MFSM, as in formula (8): MFSM = close(BFSM);
(3) the adaptive threshold is set: the mean value mean and the standard deviation std of MFSM are computed and the threshold thre is set as in formula (9): thre = mean(MFSM) + std(MFSM);
(4) thresholding yields the initial object mask, as in formula (10): OOM(x, y) = 1 if MFSM(x, y) ≥ thre, 0 if MFSM(x, y) < thre.
The edge processing is implemented as follows:
(1) the original image is filtered with the Canny operator to obtain the image edge information;
(2) the edge information is combined with the initial object mask to obtain the edges inside the mask; these edges are expanded by a neighbourhood search, short gaps between nearby discontinuous edges are filled and connected, and continuity is maintained with reference to the above edge information;
(3) the closed regions formed by the resulting edges are hole-filled, giving an edge object mask EOM built from the edge information;
(4) OOM and EOM are superposed and a morphological opening is applied to the resulting binary map to eliminate cluttered edge lines, giving the amended object mask AOM.
The matting algorithm is implemented as follows:
(1) A trimap is obtained from the AOM:
1. the AOM is first eroded by a certain pixel width to give the focused foreground region TRI_F, the eroded band amounting to 6% of the AOM area;
2. the AOM is dilated by a certain pixel width and then inverted to give the defocused background region TRI_D, the dilated band amounting to 10% of the AOM area;
3. the uncertain region TRI_U is given by formula (13): TRI_U = ~(TRI_F ∪ TRI_D).
(2) Bayesian matting is applied to the uncertain region TRI_U, i.e. each pixel inside the uncertain region is assigned to the foreground or the background:
1. the image is expressed as in formula (14), C = αF + (1 − α)B, where F, B and α are respectively the foreground colour, background colour and opacity to be solved for and C is the known colour of the image;
2. a Bayesian framework is defined to formulate the parameters of formula (14), and the maximum a posteriori problem is solved; the framework is formula (15): with C known, find the F, B and α that maximise the probability P;
3. formula (15) is simplified for computation: arg max P(F, B, α | C) = arg max_{F,B,α} P(C | F, B, α) P(F) P(B) P(α) / P(C) = arg max_{F,B,α} L(C | F, B, α) + L(F) + L(B) + L(α), where the logarithm L turns the products into sums;
4. each term on the right-hand side is modelled separately and solved iteratively.
After Bayesian matting, fine structures of the object boundary such as hair can be identified well, so the method suits the segmentation of objects with complex boundaries and greatly improves segmentation precision and applicability.
As described above, salient object segmentation in low depth-of-field images can be handled effectively, and the invention can also handle images whose background resembles the foreground or whose background texture is complex. An example of low depth-of-field image segmentation is given below following the program flow chart of Fig. 3; the low depth-of-field pictures come mainly from the Corel database and from the Internet, and the experimental results are shown in Fig. 6 and Fig. 7. The result pictures of Fig. 6 cover the several important steps of the algorithm, while Fig. 7 gives results for the remaining types of image in order to illustrate the practicality and accuracy of the algorithm. The experiment of Fig. 6 is described below:
Experiment: the photographer's object of interest in the image is a red flower, and extracting this object is the purpose of the algorithm. According to the depth-of-field concept, the flower lies inside the depth-of-field range and appears relatively clear, while the background lies outside the depth of field and appears blurred. Following the program flow chart of Fig. 3, the EFSM is first obtained from the characteristics of the low depth-of-field image; bilateral and morphological filtering are then applied, and thresholding gives the initial object mask. At the same time, Canny filtering of the original image yields the edge object mask, which is combined with the initial object mask to give the amended object mask. The trimap is obtained by the corresponding processing, and the matting algorithm finally realises the segmentation of the semantic object in the image. The experimental results show that the flower can be extracted well from the cluttered background.
To further illustrate the practicality of the present invention, more experimental results are given in Fig. 7: two images of people, three of animals and one of a flower, six images in total. The results show that in every image the object of interest can be well segmented and extracted with its completeness preserved, and that semantic object segmentation of low depth-of-field images can be handled even under complex backgrounds.

Claims (1)

1. A semantic object segmentation method applicable to low depth-of-field images, characterized in that a gradient histogram is introduced to compute the distribution of the image in the energy space; an energy-focused saliency map is obtained using the characteristics of low depth-of-field images; the saliency map is then processed with a bilateral filter and morphological tools; a threshold is then set and applied to obtain the initial object mask; in order to improve the segmentation precision, edge information detected with the Canny operator is combined with it to obtain the amended object mask; finally, Bayesian matting is used to obtain the semantic object segmentation result; the method can effectively and accurately segment the semantic object inside the low depth-of-field range of an image or video sequence; the concrete steps are:
a. Gradient energy processing: the energy function of formula (1) is given first: the partial derivatives of the image in the x and y directions are taken and the sum of their absolute values forms the numerator of the energy function; the maximum of the histogram of oriented gradients HoG of each pixel is found by distributing the gradient orientations of all pixels in the 11 × 11 neighbourhood of pixel i(x, y) into 20 bins and taking the largest bin count as the denominator of that pixel, where i(x, y) is the image pixel value,
E_HoG = ( |∂i/∂x| + |∂i/∂y| ) / max(HoG(i(x, y)))    (1)
1. an image is divided into a focused region and a defocused region; let r_fo(x, y) and r_unfo(x, y) be respectively the focused and defocused regions of the low depth-of-field image; the original image i(x, y) is expressed as:
i(x, y) = r_fo(x, y) + r_unfo(x, y)    (2)
2. let α(x, y) be the blurring function, mathematically a low-pass filter,
i_α(x, y) = i(x, y) * α(x, y) = r_fo(x, y) * α(x, y) + r_unfo(x, y) * α(x, y)    (3)
formula (3) expresses blurring i(x, y) to obtain the blurred image, the convolution with α(x, y) performing a low-pass filtering; suppose the high-frequency content of the focused region is concentrated in f_high1 ~ f_high2, that the defocused region contains mainly low-frequency information concentrated in f_low1 ~ f_low2, and that the cut-off frequency of the low-pass filter α(x, y) is f_α; the value of f_α is adjusted so that most of the information in the band f_high1 ~ f_high2 is filtered out; in the frequency domain the blurred image is
I_α(u, v) = I(u, v) × α(u, v) = R_fo(u, v) × α(u, v) + R_unfo(u, v) × α(u, v)    (4)
after the filtering, formula (4) is simplified to
I_α(u, v) = I(u, v) × α(u, v) ≈ R_unfo(u, v) × α(u, v) = R_unfo(u, v)    (5)
from formula (5), after the low depth-of-field image is filtered, subtracting the filtered image from the original image gives the distribution of the high-frequency information in the original image; the energy-focused saliency map EFSM is then obtained from formula (6), where E_i(x, y) and E_iα(x, y) denote respectively the energy functions of the original image and the blurred image after they are mapped into the energy space; the energy function computed through the gradient histogram preserves the edge and texture information in the image, so the high-frequency distribution computed in the energy space is more accurate,
EFSM(x, y) = |E_i(x, y) − E_iα(x, y)|    (6)
b. Bilateral filtering and morphological processing:
(1) the EFSM is smoothed with a bilateral filter, which weights the input pixel by the brightness similarity and the spatial similarity of its neighbourhood, as in formula (7):
BFSM(x, y) = (1/K) Σ_{(j,k)∈η(x,y)} exp( −((j − x)² + (k − y)²) / (2σ_b²) ) · exp( −(EFSM(j, k) − EFSM(x, y))² / (2σ_i²) ) · EFSM(j, k), where
K is the normalisation factor and η(x, y) denotes the smoothing window centred at (x, y);
K = Σ_{(j,k)∈η(x,y)} exp( −((j − x)² + (k − y)²) / (2σ_b²) ) · exp( −(EFSM(j, k) − EFSM(x, y))² / (2σ_i²) )    (7)
(2) a morphological closing is applied to BFSM to obtain MFSM;
(3) threshold setting: the mean value mean and the standard deviation std of MFSM are computed and their sum is taken as the threshold thre;
(4) thresholding yields the initial object mask OOM;
c. Edge processing:
(1) the original image is filtered with the Canny operator to obtain the image edge information;
(2) the edge information is combined with the initial object mask to obtain the edges inside the mask, which are expanded by a neighbourhood search so that they recover the original edge length and stay connected;
(3) the closed regions formed by the edges are hole-filled, giving a binary edge object mask EOM built from the edge information;
(4) OOM and EOM are superposed and a morphological opening is applied to the resulting binary map to smooth the contour and remove small noise points, giving the amended object mask AOM;
d. Matting:
(1) a trimap is obtained from the AOM: the AOM boundary is eroded by a certain width to give the focused foreground region TRI_F; the AOM is dilated by a certain pixel width and inverted to give the defocused background region TRI_D; the uncertain region TRI_U is given by formula (8) below; the erosion and dilation widths are decided by the proportion of the whole image occupied by the AOM: the eroded boundary band amounts to 6% of the AOM area and the dilated boundary band to 10% of the AOM area; since the AOM already essentially locates the semantic object in the image, the matting algorithm is used only to treat the object boundary finely: TRI_U = ~(TRI_F ∪ TRI_D)    (8);
(2) Bayesian matting is applied to the uncertain region TRI_U: each pixel inside the uncertain region is assigned to the foreground or the background; the matting algorithm is based on the formula C = αF + (1 − α)B,
where C is the colour of the unknown pixel, α is the opacity, B is the background colour and F is the foreground colour; a Bayesian framework is defined to formulate the parameters of the above formula, and a maximum a posteriori problem is solved, namely formula (9), where F, B and α are respectively the foreground colour, background colour and opacity to be solved for and C is the known colour of the image; with C known, the task is to find the F, B and α that maximise the probability P; L denotes the logarithm, which turns the products into sums and simplifies the computation,
arg max P(F, B, α | C)
= arg max_{F,B,α} P(C | F, B, α) P(F) P(B) P(α) / P(C)    (9)
= arg max_{F,B,α} L(C | F, B, α) + L(F) + L(B) + L(α)
each term on the right-hand side is modelled separately and the solution is an iterative process: with α assumed known, taking partial derivatives of the modelled right-hand side reduces the problem to the simple linear system below, from which F and B are found,
[ Σ_F⁻¹ + Iα²/σ_C²      Iα(1 − α)/σ_C²          ] [ F ]   [ Σ_F⁻¹ F̄ + Cα/σ_C²      ]
[ Iα(1 − α)/σ_C²        Σ_B⁻¹ + I(1 − α)²/σ_C²  ] [ B ] = [ Σ_B⁻¹ B̄ + C(1 − α)/σ_C² ]    (10)
with the values of F and B obtained from formula (10), taking the partial derivative with respect to α gives formula (11):
α = (C − B) · (F − B) / ‖F − B‖²    (11)
the initial α is taken as the mean value of nearby pixels, and over the above iterations the solution that maximises formula (9) is kept, finally yielding the opacity.
CN2008100400009A 2008-07-01 2008-07-01 Semantic object dividing method suitable for low depth image Expired - Fee Related CN101299268B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2008100400009A CN101299268B (en) 2008-07-01 2008-07-01 Semantic object dividing method suitable for low depth image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2008100400009A CN101299268B (en) 2008-07-01 2008-07-01 Semantic object dividing method suitable for low depth image

Publications (2)

Publication Number Publication Date
CN101299268A CN101299268A (en) 2008-11-05
CN101299268B true CN101299268B (en) 2010-08-11

Family

ID=40079089

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2008100400009A Expired - Fee Related CN101299268B (en) 2008-07-01 2008-07-01 Semantic object dividing method suitable for low depth image

Country Status (1)

Country Link
CN (1) CN101299268B (en)

Families Citing this family (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101714253B (en) * 2009-12-04 2012-05-23 西安电子科技大学 Interactive image segmentation correcting method based on geodesic active region models
CN102129694B (en) * 2010-01-18 2013-10-23 中国科学院研究生院 Method for detecting salient region of image
CN101866422A (en) * 2010-06-29 2010-10-20 上海大学 Method for extracting image attention by image based multi-characteristic integration
CN102073872B (en) * 2011-01-20 2012-10-10 中国疾病预防控制中心寄生虫病预防控制所 Image-based method for identifying shape of parasite egg
CN102254325B (en) * 2011-07-21 2012-12-05 清华大学 Method and system for segmenting motion blur scene and extracting foreground
JP2013046209A (en) * 2011-08-24 2013-03-04 Sony Corp Image processing device, control method for image processing device, and program for causing computer to execute the method
CN102663748A (en) * 2012-03-27 2012-09-12 电子科技大学 Low depth of field image segmentation method based on frequency domain
CN103854257A (en) * 2012-12-07 2014-06-11 山东财经大学 Depth image enhancement method based on self-adaptation trilateral filtering
US20140233859A1 (en) * 2013-02-15 2014-08-21 Samsung Electronics Co., Ltd. Electronic device and method of determining descriptor thereof
US9363499B2 (en) 2013-11-15 2016-06-07 Htc Corporation Method, electronic device and medium for adjusting depth values
CN104794527B (en) * 2014-01-20 2018-03-27 富士通株式会社 Disaggregated model construction method and equipment based on convolutional neural networks
CN105279752A (en) * 2014-07-25 2016-01-27 王辉 Digital image overall artistic effect processing method
WO2016197303A1 (en) * 2015-06-08 2016-12-15 Microsoft Technology Licensing, Llc. Image semantic segmentation
CN105760475A (en) * 2016-02-14 2016-07-13 天脉聚源(北京)科技有限公司 Method and device for recommending related articles in video
CN105894362A (en) * 2016-04-01 2016-08-24 天脉聚源(北京)传媒科技有限公司 Method and device for recommending related item in video
CN107248155A (en) * 2017-06-08 2017-10-13 东北大学 A kind of Cerebral venous dividing method based on SWI images
CN108257151B (en) * 2017-12-22 2019-08-13 西安电子科技大学 PCANet image change detection method based on significance analysis
CN108319009B (en) * 2018-04-11 2020-07-17 中国科学院光电技术研究所 Rapid super-resolution imaging method based on structured light modulation
CN110889410B (en) * 2018-09-11 2023-10-03 苹果公司 Robust use of semantic segmentation in shallow depth of view rendering
CN109242807B (en) * 2018-11-07 2020-07-28 厦门欢乐逛科技股份有限公司 Rendering parameter adaptive edge softening method, medium, and computer device
CN109668912B (en) * 2018-12-05 2022-01-04 马亮 Quality inspection mechanism for X-ray machine
CN115619793B (en) * 2022-12-21 2023-03-10 深圳市澳博森科技有限公司 Power adapter appearance quality detection method based on computer vision
CN117173157B (en) * 2023-10-24 2024-02-13 粤芯半导体技术股份有限公司 Patterning process quality detection method, patterning process quality detection device, patterning process quality detection equipment and storage medium
CN117474818B (en) * 2023-12-27 2024-03-15 北京科技大学 Underwater image enhancement method and device based on non-parameter Bayesian depth of field estimation

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1700238A (en) * 2005-06-23 2005-11-23 复旦大学 Method for dividing human body skin area from color digital images and video graphs
CN1916906A (en) * 2006-09-08 2007-02-21 北京工业大学 Image retrieval algorithm based on abrupt change of information

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
徐驰,徐燕凌.基于对象语义的图像分割和分类方法.重庆大学学报(自然科学版)29 8.2006,29(8),98-101. *
罗沄,章毓晋,高永英.基于分析的图像有意义区域提取.计算机学报23 12.2000,23(12),1313-1319. *

Also Published As

Publication number Publication date
CN101299268A (en) 2008-11-05

Similar Documents

Publication Publication Date Title
CN101299268B (en) Semantic object dividing method suitable for low depth image
CN107644429B (en) Video segmentation method based on strong target constraint video saliency
CN105261037B (en) A kind of moving target detecting method of adaptive complex scene
CN103164855B (en) A kind of Bayesian decision foreground extracting method in conjunction with reflected light photograph
CN109961049A (en) Cigarette brand recognition methods under a kind of complex scene
CN104050471B (en) Natural scene character detection method and system
CN110119728A (en) Remote sensing images cloud detection method of optic based on Multiscale Fusion semantic segmentation network
US11887362B2 (en) Sky filter method for panoramic images and portable terminal
CN111104943B (en) Color image region-of-interest extraction method based on decision-level fusion
CN105404847B (en) A kind of residue real-time detection method
CN108268859A (en) A kind of facial expression recognizing method based on deep learning
EP2637139A1 (en) Method and apparatus for bi-layer segmentation
CN104834912A (en) Weather identification method and apparatus based on image information detection
CN103258332B (en) A kind of detection method of the moving target of resisting illumination variation
CN104881855B (en) A kind of multi-focus image fusing method of utilization morphology and free boundary condition movable contour model
CN110135500A (en) Method for tracking target under a kind of more scenes based on adaptive depth characteristic filter
CN105513053B (en) One kind is used for background modeling method in video analysis
CN106296744A (en) A kind of combining adaptive model and the moving target detecting method of many shading attributes
CN103119625B (en) Video character separation method and device
CN112750106B (en) Nuclear staining cell counting method based on incomplete marker deep learning, computer equipment and storage medium
CN105654085A (en) Image technology-based bullet hole recognition method
CN103150738A (en) Detection method of moving objects of distributed multisensor
CN106991686A (en) A kind of level set contour tracing method based on super-pixel optical flow field
CN109242876A (en) A kind of image segmentation algorithm based on markov random file
CN105678773A (en) Low-contrast image segmentation method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20100811

Termination date: 20130701