CN102103751B - Foreground image extraction method and device - Google Patents

Foreground image extraction method and device

Info

Publication number
CN102103751B
CN102103751B, CN200910261630, CN200910261630A
Authority
CN
China
Prior art keywords
pixel
frame
area
background
connected region
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN 200910261630
Other languages
Chinese (zh)
Other versions
CN102103751A (en)
Inventor
吴治国
高辉
傅彦
陈安龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
University of Electronic Science and Technology of China
Original Assignee
Huawei Technologies Co Ltd
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd, University of Electronic Science and Technology of China
Priority to CN200910261630
Publication of CN102103751A
Application granted
Publication of CN102103751B
Expired - Fee Related
Anticipated expiration

Abstract

The invention discloses a foreground image extraction method and device. The method comprises: obtaining the distance between pixels at the same position in the i-th frame and the (i-1)-th frame; obtaining a set Z of pixels whose distance is greater than a predetermined value; obtaining from set Z a set U of pixels whose positions coincide with pixels of the foreground region of the (i-1)-th frame; performing background subtraction on pixel set U to obtain a pixel set E; and determining the foreground region of the i-th frame from the union of pixel set E, a pixel set T and a pixel set W, where pixel set W is the set of pixels in Z whose positions do not coincide with pixels of the foreground region of the (i-1)-th frame, and pixel set T is the set of pixels in the foreground region of the (i-1)-th frame whose positions do not coincide with pixels of set U. The above technical scheme can extract the foreground image of the i-th frame quickly and accurately.

Description

Foreground image extraction method and device
Technical field
The present invention relates to the field of computer vision, and in particular to foreground image extraction techniques.
Background art
Foreground image extraction is widely used in computer vision fields such as intelligent surveillance, video compression, autonomous navigation, human-computer interaction and virtual synthesis, for example to extract a portrait from a video stream.
Existing foreground image extraction techniques usually obtain the foreground region, and hence the foreground image, by an inter-frame difference method and/or a background subtraction method. However, the existing inter-frame difference and background subtraction methods suffer from problems such as a large amount of computation, holes in the result and inaccurate contours of the extracted foreground image. Existing foreground image extraction techniques therefore still need to be improved.
Summary of the invention
Embodiments of the invention provide a foreground image extraction method and device that can extract the foreground image of the i-th frame quickly and accurately.
The foreground image extraction method provided by an embodiment of the invention comprises:
obtaining the distance between pixels at the same position in the i-th frame and the (i-1)-th frame, and obtaining a set Z of pixels whose distance is greater than a predetermined value;
obtaining from pixel set Z a pixel set U whose pixel positions coincide with those of the foreground region of the (i-1)-th frame;
performing background subtraction on pixel set U to obtain a pixel set E;
determining the foreground region of the i-th frame according to the union of pixel set E, a pixel set T and a pixel set W, where pixel set W is the set of pixels in Z whose positions do not coincide with pixels of the foreground region of the (i-1)-th frame, and pixel set T is the set of pixels in the foreground region of the (i-1)-th frame whose positions do not coincide with pixels of set U.
The foreground image extraction device provided by an embodiment of the invention comprises:
a distance module, configured to obtain the distance between pixels at the same position in the i-th frame and the (i-1)-th frame;
a first set module, configured to obtain the set Z of pixels of the i-th frame whose distance is greater than a predetermined value;
a second set module, configured to obtain from pixel set Z the pixel set U whose pixel positions coincide with those of the foreground region of the (i-1)-th frame;
a background subtraction module, configured to perform background subtraction on pixel set U to obtain a pixel set E;
a foreground region module, configured to determine the foreground region of the i-th frame according to the union of pixel set E, a pixel set T and a pixel set W, where pixel set W is the set of pixels in Z whose positions do not coincide with pixels of the foreground region of the (i-1)-th frame, and pixel set T is the set of pixels in the foreground region of the (i-1)-th frame whose positions do not coincide with pixels of set U.
It can be seen from the above technical scheme that performing background subtraction only on pixel set U reduces the amount of computation of the background subtraction step, and that determining the foreground region from pixel sets E, T and W yields a fairly accurate foreground region of the i-th frame. The embodiments can therefore extract the foreground image of the i-th frame quickly and accurately.
Description of drawings
Fig. 1 is a flowchart of the foreground image extraction method of embodiment one of the invention;
Fig. 2 is a schematic flowchart of extracting a portrait from a video stream in embodiment two of the invention;
Fig. 2.1 is a schematic diagram of two adjacent image frames in embodiment two of the invention;
Fig. 2.2 is a schematic diagram of the neighbourhood scanning order in embodiment two of the invention;
Fig. 2.3 is a schematic diagram of the binary masks before and after the connected-region processing of embodiment two of the invention;
Fig. 2.4 shows the foreground image extraction result before the connected-region processing of embodiment two of the invention;
Fig. 2.5 shows the foreground image extraction result after the connected-region processing of embodiment two of the invention;
Fig. 3.1 is a schematic diagram of 6 background image frames of embodiment three of the invention;
Fig. 3.2 is a schematic diagram of a foreground object entering the background image in embodiment three of the invention;
Fig. 3.3 is a schematic diagram of the portrait of embodiment three of the invention after illumination-change and shadow processing;
Fig. 3.4 is a schematic diagram of the final portrait obtained in embodiment three of the invention;
Fig. 4 is a schematic diagram of the foreground image extraction device of embodiment four of the invention;
Fig. 5 is a schematic diagram of another foreground image extraction device of embodiment four of the invention.
Embodiment
Embodiment one: a foreground image extraction method. The flow of this method is shown in Fig. 1.
S100: obtain the distance between pixels at the same position in the i-th frame and the (i-1)-th frame.
The position in S100 may be the coordinate position of a pixel within a frame. The distance between pixels in S100 may be computed in RGB space or in YUV space. Since cameras and similar capture devices usually acquire and store image frames in RGB space, when the distance is computed in YUV space, the RGB-space (i-1)-th frame and i-th frame may first be converted into YUV-space frames, and the distance between pixels at the same position in the YUV-space i-th frame and (i-1)-th frame is then obtained. In this way every pixel of the i-th frame has a corresponding distance value, where i is a natural number.
It should be noted that, whether the pixels are in RGB space or in YUV space, the distance in S100 may be expressed in various ways, for example as a Euclidean distance or a Manhattan distance. This embodiment may use any existing distance computation to obtain the distance between pixels at the same position in the i-th frame and the (i-1)-th frame, in RGB space or in YUV space, and does not limit the specific way that distance is obtained.
S110: obtain the set Z of pixels whose distance is greater than a predetermined value. That is, S100 yields a distance value for every pair of co-located pixels in the i-th and (i-1)-th frames, so every pixel of the i-th frame has a corresponding distance value. S110 examines the distance value of each pixel of the i-th frame: if the distance value is greater than the predetermined value, the pixel belongs to pixel set Z; otherwise it does not. The predetermined value can be adjusted to the actual conditions; for example, it can be reduced when the image frames are relatively noisy, or increased when many holes appear.
Pixel set Z can be represented in the form of a binary mask. The mask contains the same number of pixels as the i-th frame, and every value in the mask is either 1 or 0. In the binary mask representing pixel set Z, the value of a pixel belonging to Z is set to 1 and the value of a pixel not belonging to Z is set to 0, so the values in the mask express pixel set Z. Of course, pixel set Z may also be represented in other ways, for example by storing the position information of every pixel belonging to Z; the position information here may be the coordinate position of the pixel in the i-th frame. This embodiment does not limit the specific representation of pixel set Z.
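As an illustration of S100 and S110, the following is a minimal numpy sketch, assuming the two frames are already available as (H, W, 3) YUV arrays; the function name and the default threshold are illustrative and not taken from the patent.

```python
import numpy as np

def frame_difference_mask(frame_i, frame_prev, threshold=20.0):
    """Per-pixel Euclidean distance between co-located pixels of the i-th and
    (i-1)-th frames, thresholded into a binary mask (pixel set Z)."""
    diff = frame_i.astype(np.float64) - frame_prev.astype(np.float64)
    dist = np.sqrt((diff ** 2).sum(axis=2))      # Euclidean distance per pixel
    return (dist > threshold).astype(np.uint8)   # 1 marks pixels belonging to Z
```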
S120: obtain from pixel set Z the pixel set U whose pixel positions coincide with those of the foreground region of the (i-1)-th frame, i.e. obtain the intersection of pixel set Z and the foreground region of the (i-1)-th frame; this intersection is pixel set U. The position in S120 may be the coordinate position of a pixel within a frame.
If pixel set Z is represented as a binary mask (pixels belonging to Z set to 1, pixels not belonging to Z set to 0) and the foreground region of the (i-1)-th frame is also represented as a binary mask (pixels belonging to that foreground region set to 1, others set to 0), then pixel set U can be obtained by applying an AND operation to the values of co-located pixels in the mask representing Z and the mask representing the foreground region of the (i-1)-th frame. In the resulting mask, which represents pixel set U, the pixels in the intersection of Z and the foreground region of the (i-1)-th frame have value 1 and all other pixels have value 0.
If pixel set Z and the foreground region of the (i-1)-th frame are not represented as binary masks but by the position information of their pixels within the frame, then the pixels whose position information appears both in Z and in the foreground region of the (i-1)-th frame can be found by comparing the position information, and the position information of those pixels is stored; the pixels represented by the stored position information form pixel set U.
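Under the binary-mask representation, S120 and the related sets W and T reduce to element-wise logic on the masks; a minimal sketch, assuming 0/1 uint8 masks (function and argument names are illustrative):

```python
import numpy as np

def split_sets(mask_z, prev_fg):
    """Derive U = Z AND F(i-1), W = Z minus F(i-1), and T = F(i-1) minus U
    from the binary mask of pixel set Z and the binary mask of the previous
    frame's foreground region."""
    z = mask_z.astype(bool)
    f = prev_fg.astype(bool)
    u = z & f          # intersection with the previous foreground
    w = z & ~f         # part of Z outside the previous foreground
    t = f & ~u         # part of the previous foreground not covered by U
    return u.astype(np.uint8), w.astype(np.uint8), t.astype(np.uint8)
```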
In this embodiment the foreground region of the (i-1)-th frame is known. For a video stream from which foreground images are to be extracted, suppose the stream contains frames 0 to N and frame 0 is a background image frame. The foreground region of frame 0 can then be initialised as empty, so that for frame 1 the foreground region of frame 0 is known; after the foreground region of frame 1 has been obtained with this embodiment, it is in turn known for frame 2, and so on. Thus the assumption that the foreground region of the (i-1)-th frame is known can always be satisfied, and this embodiment can obtain foreground images from a sequence of consecutive frames of the video stream. It should be noted that if frames 0 to N of the video stream are all background image frames, the foreground regions obtained with this embodiment for frames 1 to N should all be empty.
S130: perform background subtraction on pixel set U to obtain pixel set E.
In S130, background subtraction is not applied to the whole region of pixels whose distance exceeds the predetermined value, but only to a sub-region of it (the region corresponding to pixel set U). This reduces the amount of computation of the background subtraction and simplifies the background subtraction process. This embodiment may apply any existing background subtraction operation to pixel set U, for example an existing background subtraction based on a background image model. The background image model here may be one obtained with an existing background modelling operation, or the training-based background image model of this embodiment. The training-based procedure comprises two parts: establishing the background image model and training the background image model.
Establishing the background image model comprises: obtaining N consecutive background image frames (N being a natural number) and using part of them (for example the first half, hereafter called the frames used to establish the background image model) to establish a background image model based on a Gaussian distribution, i.e. the model is obtained from the statistics of the noise distribution of that part of the N consecutive background frames. This embodiment may establish the model from these frames in any existing way, for example by computing the per-pixel noise mean and variance over the first half of the frames; statistics other than the mean and variance may also be used, and this embodiment does not limit the concrete way the model is established from part of the frames.
Training the background image model comprises: using another part of the N consecutive background frames (for example the second half, hereafter called the frames used to train the background image model) to train the established model; the trained model is the background image model used in the background subtraction of S130. A concrete example of the training is: for any pixel x of the established model, compute a threshold from the pixel at the same position as x in each training frame, so that every training frame contributes one threshold for pixel x, and select the largest of these thresholds for pixel x. In this way every pixel of the established model has a maximum threshold, and all the maximum thresholds together form a threshold matrix; this threshold matrix is the trained background image model.
In this embodiment, if the established background image model is based on the noise mean and variance, the thresholds in the training process are also computed from the mean and variance. Likewise, if the model is established from statistics other than the mean or variance, the thresholds in the training process are computed from those other statistics.
The number of frames used to establish the background image model may be N/2, or any value between N/2 and 2N/3; correspondingly, the number of frames used to train the model may be N/2, or any value between N/3 and N/2. Other ratios between the two numbers are of course possible. A recommended practice is that the number of frames used to establish the model should not be too small, for example no fewer than 100 frames, and that the number of frames used to train the model should not be too small either, for example no fewer than N/3. In addition, the number of frames used to establish the model plus the number of frames used to train it should equal the total number N of background image frames in the video stream.
It should be noted that, for S130, this embodiment does not limit the concrete implementation of the background subtraction, and that the process of obtaining the background image model used for background subtraction may be carried out before S130, S120, S110 or S100. Also, the process of obtaining that background image model may use frames in RGB space or frames in YUV space.
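Assuming the mean/variance model with a trained per-pixel threshold matrix described above, applying it to pixel set U in S130 might look like the following sketch (function and argument names are illustrative):

```python
import numpy as np

def background_subtract_on_u(y_frame, mask_u, mean_y, sigma_y, n_max):
    """Background subtraction restricted to pixel set U: a pixel of U is kept
    as foreground (pixel set E) when its Y value deviates from the background
    mean by more than its trained per-pixel threshold."""
    deviation = np.abs(y_frame.astype(np.float64) - mean_y)
    changed = deviation > n_max * sigma_y
    return (mask_u.astype(bool) & changed).astype(np.uint8)   # binary mask of E
```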
S140: determine the foreground region of the i-th frame according to the union of pixel set E, pixel set T and pixel set W; that is, the foreground region of the i-th frame is the union of E, T and W. Here pixel set W is the set of pixels in Z whose positions do not coincide with pixels of the foreground region of the (i-1)-th frame, and pixel set T is the set of pixels in the foreground region of the (i-1)-th frame whose positions do not coincide with pixels of set U.
Pixel sets W and T may be obtained when they are needed, i.e. computed in S140; they may of course also be obtained before S140, for example before the background subtraction.
If pixel set E is represented as a binary mask, pixel sets T and W can also be represented as binary masks; the details of obtaining the masks of T and W are not repeated here. When E, T and W are all represented as binary masks, the union can be obtained by applying an OR operation to the three masks. The mask obtained by the OR operation expresses the foreground region of the i-th frame and may be called the binary mask of the foreground region of the i-th frame, or simply the binary mask of the i-th frame.
If pixel set E is represented by storing the position information of each of its pixels, pixel sets T and W can also be represented in that way, and the union is obtained by merging the position information of the pixels of E, T and W; the concrete implementation is not detailed here.
Once the foreground region of the i-th frame has been determined, the foreground image of the i-th frame is determined as well. If the determined foreground region is in the form of a binary mask, the RGB values of the pixels whose mask value is 1 can be taken from the i-th frame to obtain its foreground image. If the foreground region is not in the form of a binary mask but of stored position information, the RGB values of the pixels corresponding to each stored position can be taken from the i-th frame to obtain its foreground image.
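A minimal sketch of S140 and of reading out the foreground image under the binary-mask representation (names illustrative):

```python
import numpy as np

def foreground_of_frame(mask_e, mask_t, mask_w, rgb_frame):
    """The foreground mask of the i-th frame is the union E OR T OR W of the
    three binary masks; the foreground image keeps the RGB values of the
    i-th frame only where that mask is 1."""
    fg_mask = mask_e.astype(bool) | mask_t.astype(bool) | mask_w.astype(bool)
    fg_image = np.zeros_like(rgb_frame)
    fg_image[fg_mask] = rgb_frame[fg_mask]
    return fg_mask.astype(np.uint8), fg_image
```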
To ensure the accuracy of the extracted foreground region, this embodiment may optionally further process the foreground region obtained in S140. The further processing may be any one, or any combination, of the following three example processings:
Processing 1: obtain the R-value similarity, G-value similarity and B-value similarity between co-located pixels of the foreground region of the i-th frame and the background image model. If the R, G and B similarities of a co-located pixel meet a predetermined similarity requirement, the R, G and B values of that pixel are identical or similar to those of the background, so the pixel is probably caused by shadow, illumination change or other noise and should not belong to the foreground region. Therefore, pixels whose R, G and B similarities meet the predetermined similarity requirement are removed from the foreground region of the i-th frame; the removed pixels belong to the background region of the i-th frame.
If the foreground region of the i-th frame is represented as a binary mask, removing from it the pixels whose R, G and B similarities meet the predetermined similarity requirement means modifying the values of those pixels in the mask, for example changing them from 1 to 0.
If the foreground region of the i-th frame is represented by stored position information, removing those pixels means deleting from the stored position information the positions of the pixels whose R, G and B similarities meet the predetermined similarity requirement.
The R, G and B similarities may be obtained by ratios, differences or similar means. For example, the ratio of the R value of pixel x in the foreground region of the i-th frame to the R value of pixel x in the background image model is their R-value similarity, and the G-value and B-value similarities are defined likewise from the G and B values. Alternatively, the difference between the R values of pixel x in the foreground region of the i-th frame and in the background image model is their R-value similarity, and the G-value and B-value similarities are defined likewise. This embodiment does not limit the concrete way the R, G and B similarities between co-located pixels of the foreground region of the i-th frame and the background image model used for background subtraction (e.g. the trained background image model) are computed.
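A sketch of processing 1 using the ratio form of similarity; the tolerance band around 1 and the small constant added to avoid division by zero are illustrative choices, not values from the patent:

```python
import numpy as np

def remove_shadow_like_pixels(fg_mask, rgb_frame, bg_rgb, low=0.8, high=1.2):
    """Per-channel ratio similarity between the i-th frame and the background
    model; foreground pixels whose R, G and B ratios all lie inside the band
    are treated as shadow / illumination noise and moved to the background."""
    ratio = (rgb_frame.astype(np.float64) + 1.0) / (bg_rgb.astype(np.float64) + 1.0)
    similar = np.all((ratio > low) & (ratio < high), axis=2)
    cleaned = fg_mask.copy()
    cleaned[similar] = 0        # drop pixels whose colour matches the background
    return cleaned
```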
Processing 2: for each pixel in the foreground region of the i-th frame, check whether its neighbouring pixels are foreground pixels. If the number of foreground pixels among the neighbours of a pixel x of the foreground region of the i-th frame does not exceed half the number of its neighbours, pixel x is removed from the foreground region of the i-th frame; the removed pixel x belongs to the background region of the i-th frame. A foreground pixel here is a pixel that belongs to the foreground region of the i-th frame.
The neighbouring pixels of pixel x may be its 8-neighbourhood, i.e. the upper-left, upper, upper-right, left, right, lower-left, lower and lower-right pixels adjacent to x. The neighbours of pixel x may also be a subset of the 8-neighbourhood, for example only the upper, left, right and lower pixels adjacent to x. When the neighbours of x are its 8-neighbourhood, if fewer than 4 of the 8 neighbours are foreground pixels, pixel x is changed from the foreground region to the background region of the i-th frame; otherwise pixel x remains a pixel of the foreground region of the i-th frame.
In processing 2, the neighbourhood check may be performed on the pixels of the foreground region of the i-th frame from left to right and from top to bottom: first the leftmost pixel of the top row of the foreground region is checked, then the next pixel of the top row, until all pixels of the top row have been checked; then the leftmost pixel of the second row is checked, and so on, until the check of the rightmost pixel of the bottom row of the foreground region of the i-th frame is completed.
Processing 2 can be implemented by applying a spatial low-pass filter to the binary mask corresponding to the foreground region of the i-th frame. A concrete example of the spatial low-pass filtering is described in the following embodiment and is not detailed here.
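A sketch of processing 2 as a simple 8-neighbour vote on the binary mask, which is one possible way to realise the spatial low-pass filtering mentioned above:

```python
import numpy as np

def neighbour_vote_filter(fg_mask):
    """For each foreground pixel, count its foreground neighbours in the 3x3
    window; if fewer than 4 of the 8 neighbours are foreground, the pixel is
    moved to the background."""
    m = fg_mask.astype(np.int32)
    h, w = m.shape
    padded = np.pad(m, 1, mode='constant')
    window = sum(padded[dy:dy + h, dx:dx + w]
                 for dy in range(3) for dx in range(3))
    neighbours = window - m                      # exclude the centre pixel
    cleaned = m.copy()
    cleaned[(m == 1) & (neighbours < 4)] = 0
    return cleaned.astype(np.uint8)
```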
Processing 3: perform connected-region processing on the foreground region of the i-th frame, or on the background region of the i-th frame, or on both the foreground region and the background region of the i-th frame. The background region of the i-th frame is the part of the i-th frame outside the foreground region.
Connected-region processing of the foreground region of the i-th frame yields several connected regions, all of which may be called foreground connected regions. The foreground connected regions that do not meet the foreground-region requirement are identified, and the pixels of those regions are removed from the foreground region of the i-th frame; the removed pixels belong to the background region of the i-th frame. A foreground connected region that does not meet the foreground-region requirement may be, for example, one whose area is smaller than a first area threshold and whose left-adjacent or upper-adjacent connected region is not a foreground connected region. Connected-region processing of the foreground region thus corrects pixels of the foreground region that actually belong to the background region of the i-th frame. The first area threshold is used to pre-select foreground connected regions that may be noise: in general, a foreground connected region that is really noise has a small area, so the first area threshold picks out suspected-noise foreground connected regions, and checking whether the left-adjacent or upper-adjacent connected region of a suspected region is a foreground connected region then determines whether the suspected region really is noise.
Connected-region processing of the background region of the i-th frame yields several connected regions, all of which may be called background connected regions. The background connected regions that do not meet the background-region requirement are identified, and the pixels of those regions are removed from the background region of the i-th frame; the removed pixels belong to the foreground region of the i-th frame. A background connected region that does not meet the background-region requirement may be, for example, one whose area is smaller than a second area threshold and whose left-adjacent or upper-adjacent connected region is not a background connected region. Connected-region processing of the background region thus corrects pixels of the background region that actually belong to the foreground region of the i-th frame. The second area threshold is used to pre-select background connected regions that may be holes: in general, a background connected region that is really a hole has a small area, so the second area threshold picks out suspected holes, and checking whether the left-adjacent or upper-adjacent connected region of a suspected hole is a background connected region then determines whether it really is a hole.
When the foreground and background regions of the i-th frame are represented as a binary mask with foreground pixels set to 1 and background pixels set to 0, a concrete example of removing the pixels of a foreground connected region that does not meet the foreground-region requirement is to change their values in the binary mask of the i-th frame from 1 to 0; a concrete example of removing the pixels of a background connected region that does not meet the background-region requirement is to change their values in the binary mask of the i-th frame from 0 to 1.
When the foreground region of the i-th frame is represented by the position information of its pixels, removing the pixels of a foreground connected region that does not meet the foreground-region requirement means removing the position information of those pixels from the position-information set of the foreground region of the i-th frame; if a position-information set of the background region of the i-th frame is also stored, the position information of those pixels is added to it, otherwise this addition is not performed. Similarly, if a position-information set of the background region of the i-th frame is stored, removing the pixels of a background connected region that does not meet the background-region requirement means removing the position information of those pixels from the position-information set of the background region and adding it to the position-information set of the foreground region of the i-th frame; if no position-information set of the background region is stored, the position information of those pixels is simply added to the position-information set of the foreground region of the i-th frame.
Many connected-region algorithms exist at present. This embodiment may use any existing connected-region processing to obtain the foreground connected regions and background connected regions, or may use the following tree-based connected-region processing.
The tree-based connected-region processing is as follows. For the foreground and background regions of the i-th frame, i.e. for the whole i-th frame, each pixel is given a tree-based data structure containing a parent-node coordinate, a colour value and an area. The data structure of each pixel of the i-th frame is initialised so that the parent-node coordinate is the pixel's own coordinate, the colour value is the pixel's colour value, and the area is a predetermined initial value such as 1. The pixels x are then processed from left to right and from top to bottom as follows:
Obtain the data structure of the left neighbour. If the left neighbour of pixel x has the same colour value as pixel x, set the parent-node coordinate of x to the parent-node coordinate of the left neighbour, set the area of x to a predetermined terminal value such as 0, and increase the area of the left neighbour by the predetermined initial value, e.g. by 1. Then obtain the data structure of the upper-right neighbour; if its colour value is the same as that of pixel x, set the parent-node coordinate of the upper-right neighbour to the parent-node coordinate of the left neighbour, set its area to the predetermined terminal value such as 0, and increase the area of the left neighbour by the predetermined initial value, e.g. by 1. No further neighbours of pixel x are scanned.
If the left neighbour of pixel x has a colour value different from that of x, obtain the data structure of the upper-left neighbour. If the upper-left neighbour has the same colour value as x, set the parent-node coordinate of x to that of the upper-left neighbour, set the area of x to the predetermined terminal value such as 0, and increase the area of the upper-left neighbour by the predetermined initial value, e.g. by 1. Then obtain the data structure of the upper-right neighbour; if its colour value is the same as that of x, set its parent-node coordinate to that of the upper-left neighbour, set its area to the predetermined terminal value such as 0, and increase the area of the upper-left neighbour by the predetermined initial value, e.g. by 1. No further neighbours of pixel x are scanned.
If both the left and the upper-left neighbours of pixel x have colour values different from that of x, obtain the data structure of the upper neighbour. If the upper neighbour has the same colour value as x, set the parent-node coordinate of x to that of the upper neighbour, set the area of x to the predetermined terminal value such as 0, and increase the area of the upper neighbour by the predetermined initial value, e.g. by 1. No further neighbours of pixel x are scanned.
If the left, upper-left and upper neighbours of pixel x all have colour values different from that of x, obtain the data structure of the upper-right neighbour. If the upper-right neighbour has the same colour value as x, set the parent-node coordinate of x to that of the upper-right neighbour, set the area of x to the predetermined terminal value such as 0, and increase the area of the upper-right neighbour by the predetermined initial value, e.g. by 1. No further neighbours of pixel x are scanned.
The above connected-region processing can also be expressed as follows: the four neighbourhoods of pixel x are checked in the order left, upper-left, upper, upper-right; as soon as a neighbourhood with the same colour value as x is found, x is added to that neighbourhood, i.e. the parent-node coordinate of x is changed to that of the neighbourhood and the area of the neighbourhood is increased; pixel x can join only one neighbourhood. If the neighbourhood with the same colour value as x is the left or the upper-left one, the upper-right neighbourhood of x is also checked, and if its colour value equals that of x, the upper-right neighbourhood is merged into the neighbourhood that x joined.
After every pixel of the i-th frame has been processed in this way, several parent nodes are obtained; each parent node represents one connected region, so a set of connected regions comprising foreground connected regions and background connected regions is obtained.
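The tree-based scan above is essentially a union-find over the binary mask; the following sketch keeps one parent per pixel (flattened indices instead of coordinates) and follows the left / upper-left / upper / upper-right order described above. It is an illustration of the idea rather than the patent's exact data structure:

```python
import numpy as np

def label_regions(mask):
    """One left-to-right, top-to-bottom scan joining each pixel to the first
    of its left, upper-left, upper, upper-right neighbours with the same
    binary value; a join via the left or upper-left neighbour also merges a
    matching upper-right neighbour. Returns a label per pixel."""
    h, w = mask.shape
    parent = np.arange(h * w)

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    for y in range(h):
        for x in range(w):
            idx = y * w + x
            val = mask[y, x]
            neighbours = [(y, x - 1), (y - 1, x - 1), (y - 1, x), (y - 1, x + 1)]
            for k, (ny, nx) in enumerate(neighbours):
                if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] == val:
                    union(ny * w + nx, idx)
                    if k < 2:               # joined via left or upper-left neighbour
                        uy, ux = y - 1, x + 1
                        if 0 <= uy < h and ux < w and mask[uy, ux] == val:
                            union(ny * w + nx, uy * w + ux)
                    break
    labels = np.array([find(i) for i in range(h * w)]).reshape(h, w)
    return labels   # pixels sharing a label form one connected region
```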
After the foreground and background connected regions have been obtained, the area of each of them (i.e. the area stored at its parent node) is examined. If the area of a foreground connected region is smaller than the first area threshold and the left-adjacent, upper-left-adjacent or upper-adjacent connected region is a background connected region, that foreground connected region is judged to be noise; its pixels should belong to the background region, and their values in the binary mask of the i-th frame can be modified accordingly. If the area of a background connected region is smaller than the second area threshold and the left-adjacent, upper-left-adjacent or upper-adjacent connected region is a foreground connected region, that background connected region is judged to be a hole; its pixels should belong to the foreground region, and their values in the binary mask of the i-th frame can be modified accordingly.
The first and second area thresholds can be adjusted to the actual conditions: for example, the second area threshold can be increased when there are many holes, and the first area threshold can be increased when there is much noise.
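Continuing the previous sketch, the noise and hole correction can then use the labels and region areas; the area thresholds below are illustrative, and the check on the left- or upper-adjacent region described above is omitted for brevity:

```python
import numpy as np

def fill_small_regions(mask, labels, area_fg=50, area_bg=50):
    """Flip small foreground regions (suspected noise) to background and
    small background regions (suspected holes) to foreground."""
    cleaned = mask.copy()
    for lab in np.unique(labels):
        region = labels == lab
        area = int(region.sum())
        value = int(mask[region][0])     # 1 = foreground region, 0 = background region
        if value == 1 and area < area_fg:
            cleaned[region] = 0          # small foreground region -> treated as noise
        elif value == 0 and area < area_bg:
            cleaned[region] = 1          # small background region -> treated as a hole
    return cleaned
```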
The tree-based connected-region processing exemplified above is applied to both the foreground region and the background region of the i-th frame. If the tree-based connected-region processing is applied only to the foreground region, or only to the background region, of the i-th frame, the procedure is essentially the same as described above and is not detailed here.
It should be noted that when this embodiment includes all three of processings 1, 2 and 3, or any two of them, the order in which they are performed is not limited; preferably, when the embodiment includes both processing 2 and processing 3, processing 2 is performed before processing 3.
From the description of embodiment one above it can be seen that performing background subtraction on pixel set U reduces the region on which background subtraction is performed and hence the amount of computation of the background subtraction; training the established background image model makes the model used for background subtraction more accurate and thus improves the performance of the background subtraction; determining the foreground region of the i-th frame from pixel sets E, T and W yields the foreground region of the i-th frame quickly and fairly accurately; correcting the foreground region of the i-th frame with processing 1 effectively removes noise regions caused by shadow or illumination changes; processing 2 smooths the foreground region of the i-th frame and effectively removes spike noise from it; processing 3 fills the holes in the foreground region of the i-th frame and removes noise; and the tree-based connected-region processing effectively improves the efficiency of the connected-region step. In summary, embodiment one can extract the foreground image of the i-th frame quickly and accurately.
Embodiment two: a foreground image extraction method. Embodiment two is described below with reference to Fig. 2 and Figs. 2.1, 2.2, 2.3, 2.4 and 2.5, taking the extraction of a portrait from a video stream as an example.
Fig. 2 shows the flow of extracting a portrait from a video stream.
In Fig. 2, S1: pre-process the original image frames in the video stream to perform a colour-space transformation of the original image frames.
Because of factors such as the lens of the capture device, the video images it obtains may suffer geometric distortion, so the original image frames obtained by the capture device may contain noise. In addition, because of limitations such as the surrounding environment, the original image frames may have defects such as colour cast and low contrast. To remove noise and eliminate defects such as colour cast and low contrast, this embodiment may first apply a colour-space transformation to each original image frame of the video stream to improve the image quality of the stream. The colour-space transformation here may comprise transforming the RGB-space original image frames of the video stream into YUV-space image frames, where R denotes red, G denotes green, B denotes blue, Y denotes luminance, U denotes the chrominance offset towards blue and V denotes the chrominance offset towards red.
A concrete example of transforming the RGB-space original image frames of the video stream into YUV-space image frames is to transform every RGB-space original frame into a YUV-space frame with formula (1):
Y = 0.299 R + 0.587 G + 0.114 B
U = −0.1687 R − 0.3313 G + 0.5 B + 128
V = 0.5 R − 0.4187 G − 0.0813 B + 128     formula (1)
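A sketch of the pre-processing of S1, applying formula (1) to an (H, W, 3) RGB frame with values in 0-255 (the function name is illustrative):

```python
import numpy as np

def rgb_to_yuv(frame_rgb):
    """Transform an RGB frame into a YUV frame with formula (1)."""
    r = frame_rgb[..., 0].astype(np.float64)
    g = frame_rgb[..., 1].astype(np.float64)
    b = frame_rgb[..., 2].astype(np.float64)
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.1687 * r - 0.3313 * g + 0.5 * b + 128.0
    v = 0.5 * r - 0.4187 * g - 0.0813 * b + 128.0
    return np.stack([y, u, v], axis=-1)
```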
S2a: establish a background image model from the pre-processed image frames. The pre-processed frames include a sequence of consecutive background image frames, for example 2n consecutive background frames, where n is an integer greater than zero; a concrete example is n = 100. The pre-processed background image frames are used to establish the background image model.
Assume that the noise of each pixel in an image frame is statistically independent, that the noise of each pixel in the current frame is independent of the pixel's location, and that this noise does not depend on the noise of the same pixel in the previous frame. Under these assumptions a background image model can be established for the background by measuring the noise distribution over a number of background image frames.
From the pre-processed image frames, 2n consecutive background frames are obtained. The first n background frames serve as the data sample for establishing the background image model, and the last n background frames serve as the data sample for training the established model, i.e. the last n background frames are used to revise the established background image model.
For any pixel x in a background image frame there are three components Y, U and V. If the first n background frames are regarded as a background image matrix, the Y component of pixel x in the i-th frame of that matrix can be written as μ_x^i, and the noise random variable of pixel x in the background image matrix can be written as δ_x^i. For any frame of the background image matrix, μ_x^i = μ_x^{i′} + δ_x^i, i.e. for all i ∈ [1, ..., n] the relation μ_x^i = μ_x^{i′} + δ_x^i holds, where μ_x^{i′} is the actual value of pixel x, i.e. the actual colour value of pixel x. Because the U and V components reflect the noise much less clearly, only the Y component is considered in the noise model.
Assume that the noise random variable δ_x of pixel x in the background image matrix has zero mean, i.e. that formula (2) holds:

E(δ_x) = E({δ_x^1, δ_x^2, ..., δ_x^n, ...}) = 0     formula (2)
Then, by the central limit theorem, if the background image matrix contains enough background image frames, the statistic μ_x^i of pixel x in the background image matrix approximately follows a Gaussian distribution, i.e. formula (3) holds:

(1/n) Σ_{i=1}^{n} μ_x^i ~ N(μ̄_x, σ_x²)     formula (3)
In formula (3), μ̄_x = (1/n) Σ_{i=1}^{n} μ_x^i ≈ μ_x^{i′}, where μ̄_x is the mean of the statistic μ_x^i of pixel x.
The variance of the statistic μ_x^i of pixel x is given by formula (4):

σ_x = sqrt( (1/(n−1)) Σ_{i=1}^{n} (μ_x^i − μ̄_x)² )     formula (4)
The mean and variance of every pixel of the background image matrix thus establish the background image model, so the first n background frames suffice to establish a mean-variance background image model. Afterwards, the (n+1)-th to 2n-th background frames are used as training data samples to train the established background image model. The training process is described below, taking the training of pixel x of the background image model as an example.
For any pixel x in frames n+1 to 2n, denote the Y component of pixel x in the (n+1)-th to 2n-th background frames by μ_x^i; the threshold N_x corresponding to pixel x is obtained with formula (5):

|μ_x^i − μ̄_x| = N_x × σ_x     formula (5)
In formula (5), i ∈ [n+1, ..., 2n], μ̄_x is the mean of the statistic μ_x^i of pixel x, and σ_x is the variance of the statistic μ_x^i of pixel x.
Since the training sample contains n background image frames, pixel x has n values of N_x; the largest of them, N_max, is chosen as the final threshold for pixel x.
Since every pixel of the background image model has a final threshold, the final thresholds of all pixels form a threshold matrix N, and the threshold matrix N is the trained background image model.
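A sketch of S2a under the formulas above: the first n of the 2n background frames give the per-pixel mean and deviation of formulas (3)-(4), and the last n frames train the threshold matrix N of formula (5) by keeping the largest N_x per pixel (names illustrative):

```python
import numpy as np

def build_and_train_background_model(y_frames):
    """y_frames: (2n, H, W) array of the Y component of 2n background frames.
    Returns the per-pixel mean, deviation and trained threshold matrix N."""
    frames = y_frames.astype(np.float64)
    n = frames.shape[0] // 2
    model, train = frames[:n], frames[n:]
    mean = model.mean(axis=0)                    # mu_bar_x of formula (3)
    sigma = model.std(axis=0, ddof=1)            # formula (4), per pixel
    sigma = np.maximum(sigma, 1e-6)              # guard against division by zero
    n_x = np.abs(train - mean) / sigma           # N_x from formula (5), per training frame
    n_max = n_x.max(axis=0)                      # trained threshold matrix N
    return mean, sigma, n_max
```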
S2b: perform inter-frame difference processing on the pre-processed image frames. The two adjacent frames subjected to inter-frame difference processing may both be background image frames, or one background image frame and one frame containing the portrait, or both frames containing the portrait. A concrete example of two adjacent image frames is shown in Fig. 2.1. A frame containing the portrait is a frame containing both a foreground region and a background region.
In Fig. 2.1, the earlier of the two adjacent frames (the (i-1)-th frame) is C_{n-1}; the foreground region of C_{n-1} is F_{n-1}, its background region is B_{n-1}, and F_{n-1} is known. The later of the two adjacent frames (the i-th frame, i.e. the current frame) is C_n; the foreground region of C_n is F_n and its background region is B_n.
A concrete example of the inter-frame difference processing of this embodiment comprises:
Calculate the Euclidean distance between co-located pixels of the i-th frame and the (i-1)-th frame; this Euclidean distance can be obtained with formula (6):

sqrt( (Y(i−1) − Y(i))² + (U(i−1) − U(i))² + (V(i−1) − V(i))² )     formula (6)
In formula (6), Y(i−1), U(i−1) and V(i−1) are the Y, U and V components of pixel A(i−1) in the (i-1)-th frame, and Y(i), U(i) and V(i) are the Y, U and V components of pixel A(i) in the i-th frame. The position of A(i−1) in the (i-1)-th frame is the same as the position of A(i) in the i-th frame, i.e. their position coordinates in their respective frames are identical.
After the Euclidean distance has been calculated, it is compared with a predetermined value and a binary mask is set according to the comparison. This comparison and the setting of the binary mask can be realised with formula (7):

A(i−1) ⊗ A(i) = 1, if sqrt( (Y(i−1) − Y(i))² + (U(i−1) − U(i))² + (V(i−1) − V(i))² ) > k
A(i−1) ⊗ A(i) = 0, if sqrt( (Y(i−1) − Y(i))² + (U(i−1) − U(i))² + (V(i−1) − V(i))² ) ≤ k     formula (7)
In formula (7), k is the predetermined value, and its size can be set according to actual needs. Formula (7) states that if the Euclidean distance between A(i−1) and A(i) is greater than the predetermined value k, the value of A(i) in the binary mask is set to 1, otherwise it is set to 0. This binary mask is pixel set Z.
The binary mask set by comparing the Euclidean distance with the predetermined value covers five different situations, which can be expressed by formula (8):

A(i−1) ⊗ A(i) =
1, if A(i−1) = F and A(i) = B
1, if A(i−1) = B and A(i) = F
1, if A(i−1) = F and A(i) = F (colour changed significantly)
0, if A(i−1) = B and A(i) = B
0, if A(i−1) = F and A(i) = F (colour essentially unchanged)     formula (8)
In formula (8), F denotes a pixel of the foreground region and B denotes a pixel of the background region.
Situation 1 (A(i−1) = F, A(i) = B) and situation 2 (A(i−1) = B, A(i) = F): one of A(i−1) and A(i) is a foreground pixel and the other is a background pixel, so the calculated Euclidean distance should be large and the value of A(i) in the binary mask is set to 1.
Situation 3 (A(i−1) = F, A(i) = F): both A(i−1) and A(i) are foreground pixels and the colour components have changed considerably, for example because movement of the foreground image has caused a large change in the colour components of the pixel at this position; the calculated Euclidean distance should therefore be large and the value of A(i) in the binary mask is set to 1.
Situation 4 (A(i−1) = B, A(i) = B): both A(i−1) and A(i) are background pixels, i.e. the foreground image does not occlude this position; since the pixel at this position has the same attributes in the two adjacent frames, the calculated Euclidean distance should be small and the value of A(i) in the binary mask is set to 0.
Situation 5 (A(i−1) = F, A(i) = F): both A(i−1) and A(i) are foreground pixels and the colour components have hardly changed, for example because the foreground image has not moved; the calculated Euclidean distance should therefore be small and the value of A(i) in the binary mask is set to 0.
For making things convenient for subsequent descriptions; The pixel set note that will produce above-mentioned the 1st kind of situation is below made Q; The pixel set note that produces above-mentioned the 2nd kind of situation is made W; The pixel set note that produces above-mentioned the 3rd kind of situation is made E, the pixel set note that produces above-mentioned the 4th kind of situation is made R, the pixel set note that produces above-mentioned the 5th kind of situation is made T.
The binaryzation that comparison through above-mentioned Euclidean distance and predetermined value is provided with is covered the union that plate has been expressed pixel set Q, pixel set W and pixel set E, i.e. pixel set Z=Q ∪ W ∪ E.Pixel set Z promptly handles the target portrait area (being foreground area) that obtains through inter-frame difference.
To Fig. 2 .1, C N-1Foreground area (i.e. the foreground area of i-1 frame) F N-1Should be Q ∪ E ∪ T, and C nForeground area (i.e. the foreground area of i frame) F nShould be W ∪ E ∪ T.Yet the zone of handling the back acquisition through above-mentioned inter-frame difference is: Q ∪ E ∪ W.That is to say that owing to comprise pixel set T, therefore, this target portrait area exists empty in the target portrait area that obtains after handling through inter-frame difference; And gather Q owing to comprised pixel in this target portrait area, therefore, include the background area in this target portrait area.Thereby this target portrait area is not accurately, needs to obtain the accurate target portrait area through following processing operation.
Need to prove, can executed in parallel between the S2a of present embodiment and the S2b, also can successively carry out, present embodiment does not limit the execution sequence of S2a and S2b.
After pixel set Z has been obtained, pixel set W can be obtained through W = Z \ F_{n-1}, where "\" denotes the set difference.
After pixel set W has been obtained, pixel set U can be obtained through U = Z ∩ F_{n-1}, and pixel set U is Q ∪ E.
After pixel set U has been obtained, pixel set T can be obtained through T = F_{n-1} \ U = (Q ∪ E ∪ T) \ (Q ∪ E).
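As a minimal sketch, under the assumption that pixel set Z and the known foreground area F_{n-1} are stored as binary masks of equal size, the three derived sets can be computed elementwise as W = Z \ F_{n-1}, U = Z ∩ F_{n-1} and T = F_{n-1} \ U; the names DiffSets and splitSets are illustrative.

#include <cstdint>
#include <vector>

struct DiffSets { std::vector<uint8_t> W, U, T; };

DiffSets splitSets(const std::vector<uint8_t>& Z,      // inter-frame difference mask
                   const std::vector<uint8_t>& Fprev)  // foreground mask of frame i-1
{
    DiffSets s{std::vector<uint8_t>(Z.size()), std::vector<uint8_t>(Z.size()),
               std::vector<uint8_t>(Z.size())};
    for (size_t p = 0; p < Z.size(); ++p) {
        s.U[p] = (Z[p] && Fprev[p]) ? 1 : 0;     // U = Z ∩ F_{n-1} = Q ∪ E
        s.W[p] = (Z[p] && !Fprev[p]) ? 1 : 0;    // W = Z \ F_{n-1}
        s.T[p] = (Fprev[p] && !s.U[p]) ? 1 : 0;  // T = F_{n-1} \ U
    }
    return s;
}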
It should be noted that the operations of obtaining pixel sets W, U and T may also be placed within the following S3 or after it.
S3, background subtraction processing is performed on pixel set U to obtain pixel set E.
A concrete example of the background subtraction processing comprises: comparing pixel set U with the background image model set above, for example comparing the features of the pixels at the same positions (such as pixel mean and variance), comparing the features of the same pixel regions (such as regional pixel mean and variance), or comparing other features; if the comparison result shows a difference of a certain degree, the pixels or pixel regions of pixel set U that show the difference belong to the foreground area of the i-th frame; otherwise, they belong to the background area of the i-th frame. The foreground area determined in this way is pixel set E.
A concrete example of the background subtraction processing based on pixel mean and variance comprises: using the following formula (9) to determine the binarization mask corresponding to pixel set E:
D(x) = \begin{cases} 1, & |\mu_x - \bar{\mu}_x| > N_x \times \sigma_x \\ 0, & |\mu_x - \bar{\mu}_x| \le N_x \times \sigma_x \end{cases}    Formula (9)
In formula (9), \mu_x is the actual color value of pixel x in the i-th frame, N_x is the threshold corresponding to pixel x in the threshold matrix, \bar{\mu}_x is the mean of the statistic of pixel x, and \sigma_x is the variance of the statistic of pixel x.
Formula (9) expresses: if the absolute value of the difference between \mu_x and \bar{\mu}_x exceeds the product of the threshold corresponding to pixel x and the variance \sigma_x, pixel x is a foreground pixel, i.e. pixel x belongs to pixel set E, and the value corresponding to this pixel x in the binarization mask corresponding to pixel set E is set to 1; otherwise, pixel x is a background pixel, i.e. pixel x does not belong to pixel set E, and the value corresponding to this pixel x in the binarization mask corresponding to pixel set E is set to 0.
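A minimal sketch of the per-pixel test of formula (9) follows, assuming the background image model stores, for every pixel, the mean \bar{\mu}_x, the variance value \sigma_x and the trained threshold N_x of one color statistic; only the pixels of set U are examined. The names BgModel and backgroundSubtract are illustrative.

#include <cmath>
#include <cstdint>
#include <vector>

struct BgModel {
    std::vector<float> mean;    // \bar{\mu}_x per pixel
    std::vector<float> sigma;   // \sigma_x per pixel
    std::vector<float> thresh;  // N_x per pixel (the threshold matrix)
};

// E is set to 1 only where a pixel of set U deviates from the background model
// by more than N_x * sigma_x (formula (9)).
std::vector<uint8_t> backgroundSubtract(const std::vector<float>& frame,  // \mu_x per pixel of frame i
                                        const std::vector<uint8_t>& U,
                                        const BgModel& bg)
{
    std::vector<uint8_t> E(frame.size(), 0);
    for (size_t x = 0; x < frame.size(); ++x) {
        if (!U[x]) continue;  // only pixels belonging to set U are tested
        E[x] = (std::fabs(frame[x] - bg.mean[x]) > bg.thresh[x] * bg.sigma[x]) ? 1 : 0;
    }
    return E;
}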
After the background subtraction processing has been performed, pixel sets W, E and T are merged, and the foreground area of the current frame (i.e. the i-th frame, namely frame C_n) can be obtained.
S4, shadow detection processing and light-change removal processing are performed on the foreground area obtained after the background subtraction processing (i.e. the union of pixel sets W, E and T), so as to remove the shadow pixels in the foreground area and the noise regions produced by light changes.
The variation of the RGB values of a shadow pixel and the variation of the RGB values of a foreground pixel follow different laws; that is, for any shadow pixel x, the following formula (10) generally holds:
\frac{R}{R_0} \approx \frac{G}{G_0} \approx \frac{B}{B_0} = K    Formula (10)
In formula (10), R_0, G_0 and B_0 are the RGB values of pixel x in the background image model, and R, G and B are the RGB values of pixel x in pixel set E.
For a foreground pixel, however, the R, G and B values are not proportional to the R_0, G_0 and B_0 values of the pixel at the same position in the background image model; that is, a foreground pixel generally does not make the above formula (10) hold.
Likewise, because the variation of the RGB values of the noise produced by a light change and the variation of the RGB values of a foreground pixel also follow different laws, any noise produced by a light change generally also makes the above formula (10) hold.
Therefore, the above formula (10) can be used to perform a ratio calculation for each pixel in the foreground area; if the three ratios are identical or close, i.e. their degree of approximation meets a predetermined similarity requirement, it is determined that this pixel is a shadow pixel or noise produced by a light change and needs to be removed from the foreground area. Formula (10) can thus be used to eliminate the shadows in the foreground area and the influence brought by light changes.
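A minimal sketch of the ratio test of formula (10): if the three ratios R/R_0, G/G_0 and B/B_0 of a foreground pixel against the background image model agree with one another within a tolerance, the pixel is treated as a shadow pixel or light-change noise and should be removed from the foreground area. The tolerance value and the function name are illustrative assumptions.

#include <cmath>

struct Rgb { float r, g, b; };

// Returns true when the three ratios agree within `tol`, i.e. the pixel should
// be removed from the foreground area as shadow or light-change noise.
bool isShadowOrLightChange(const Rgb& fg, const Rgb& bg, float tol = 0.1f)
{
    const float eps = 1e-6f;            // guards against division by zero
    const float kr = fg.r / (bg.r + eps);
    const float kg = fg.g / (bg.g + eps);
    const float kb = fg.b / (bg.b + eps);
    return std::fabs(kr - kg) < tol &&  // R/R0 ≈ G/G0 ≈ B/B0 = K
           std::fabs(kg - kb) < tol &&
           std::fabs(kr - kb) < tol;
}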
S5, smoothing processing is performed on the foreground area obtained after the shadow detection and light-change removal processing, so as to eliminate glitch noise and improve the quality of the foreground image extracted from the image frames of the video stream.
A two-dimensional image can be decomposed into different frequency components in the frequency domain, where the low-frequency components describe large-scale information and the high-frequency components describe detail information, such as object edges. The smoothing processing in the present embodiment may be spatial-domain low-pass filtering performed on the binarization mask of the obtained foreground area. The spatial-domain low-pass filtering can filter out the high-frequency components in the foreground area through a low-pass filter while retaining the low-frequency part of the spatial frequencies of the foreground area, thereby reducing the visual noise in the foreground image; at the same time, after the high-frequency part of the foreground image has been removed, the originally inconspicuous low-frequency components in the foreground image become easier to identify.
The spatial-domain low-pass filtering can specifically be realized by convolution. The number of rows and columns of the convolution template is odd, usually 3; the convolution coefficients are distributed symmetrically about the center point; all convolution coefficients are positive; the value of a convolution coefficient decreases, or remains unchanged, as its distance from the center increases; and the sum of the convolution coefficients is 1, so that the brightness of the foreground image is not changed.
After the spatial-domain low-pass filtering has been performed on the binarization mask D(x) of the foreground area, the resulting binarization mask D'_s(x) of the foreground area can be expressed through the following formula (11):
D'_s(x) = \begin{cases} 1, & \sum_{m=1}^{8} H(m) \cdot D_s(x+m) \ge 4/9 \\ 0, & \text{otherwise} \end{cases}    Formula (11)
In formula (11), m denotes the number of a neighborhood; for example, m = 1 denotes the left neighborhood of pixel x and m = 2 denotes the upper-left neighborhood of pixel x, so that the value of m ranges from 1 to 8; H(m) denotes the value in the low-pass convolution template corresponding to the m-th neighborhood of pixel x; and D_s(x+m) denotes the value of the m-th neighborhood of pixel x in the binarization mask D(x) of the foreground area.
The above formula (11) expresses: if the number of foreground pixels in the eight neighborhoods of a pixel x of the foreground area reaches or exceeds 4, pixel x belongs to the foreground area and the value of pixel x in the binarization mask is 1; otherwise, pixel x does not belong to the foreground area and the value of pixel x in the binarization mask is 0.
The low-pass convolution template adopted by the above spatial-domain low-pass filtering can be as shown in the following formula (12):
H = \frac{1}{9} \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}    Formula (12)
The above spatial-domain low-pass filtering realizes the following: for each pixel in the foreground image, it is judged whether at least half of the pixels in its eight neighborhoods are foreground pixels; if at least half of the pixels in the eight neighborhoods of the pixel are judged to be foreground pixels, the pixel is determined to be a foreground pixel; otherwise, the pixel is determined to be a background pixel.
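A minimal sketch of this smoothing of formulas (11) and (12) on the binarization mask: a pixel is kept as a foreground pixel only if at least 4 of its 8 neighborhoods are foreground pixels. Leaving the border pixels unchanged is an assumption made for brevity, and the function name is illustrative.

#include <cstdint>
#include <vector>

std::vector<uint8_t> smoothMask(const std::vector<uint8_t>& mask, int width, int height)
{
    std::vector<uint8_t> out(mask);  // border pixels are simply copied
    for (int y = 1; y < height - 1; ++y) {
        for (int x = 1; x < width - 1; ++x) {
            int fgNeighbours = 0;
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx)
                    if (dx != 0 || dy != 0)  // the eight-neighborhood of pixel (x, y)
                        fgNeighbours += mask[(y + dy) * width + (x + dx)];
            out[y * width + x] = (fgNeighbours >= 4) ? 1 : 0;  // formula (11)
        }
    }
    return out;
}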
S6, connected-region processing is performed on the binarization mask obtained after the smoothing processing, so as to remove the isolated noise regions existing in the foreground area and fill the holes that should not appear in the foreground area.
The binarization mask obtained after the smoothing processing is usually composed of a plurality of mutually disconnected sub-regions, and each sub-region is a connected region. During foreground image extraction, a sub-region that should belong to the foreground area may be judged as belonging to the background area in the background subtraction processing because its difference from the corresponding background area is small, so that holes appear in the actually extracted foreground area. In addition, a sub-region that should belong to the background area may be determined as foreground area in the background subtraction processing because of slight disturbances or other reasons, so that the actually extracted foreground area contains noise sub-regions. Through the connected-region processing, the present embodiment analyses each connected region and can effectively remove the noise existing in the foreground area and fill the holes that should not appear in the foreground area.
A concrete implementation procedure of the connected-region processing that adopts a tree structure is described below. The tree-structure-based connected-region processing comprises:
regarding each pixel in the binarization mask obtained after the smoothing processing as a node; the set of nodes belonging to the same connected region constitutes a tree, every tree in the forest corresponds to a connected region, and different connected regions can be identified by the root nodes of the different trees in the forest.
The data structure set for a node is:
NODE
{
    int x, y;    // coordinate position of the parent node of the current node
    int color;   // color value of the current node
    int area;    // when the current node is a root node, the area of the corresponding connected region
}
In the above data structure, x and y express the coordinate position information of the parent node of the current node (i.e. of this node), and color expresses the color value of the current node; if the current node is a root node, area expresses the area of the connected region corresponding to the current node.
If the width of the image frame is width and its height is height, width × height nodes can be defined for the image frame; the NODE of each node can form a two-dimensional array, and the subscripts of the array can express the coordinate position information of the node in the image frame.
The above two-dimensional array is initialized. This initialization operation can be: scanning the nodes in order from left to right and from top to bottom; setting the parent node of each scanned node to the node itself, i.e. setting x and y of each node to the coordinate position information of this node; setting the area of each node to 1; and setting color according to the pixel color corresponding to the node. The pixel color here can be 0 or 1; for example, if the pixel is in the foreground area, the pixel color is 1, and if the pixel is in the background area, the pixel color is 0. That is to say, the value of color in the data structure is the value of this pixel in the corresponding binarization mask.
After the initialization operation has been completed, each node is scanned in turn in order from left to right and from top to bottom, and four of the eight neighborhoods of the current node are scanned in sequence. The scanning order of the four neighborhoods of the current node can be as shown in Fig. 2.2. The scanning order shown in Fig. 2.2 is: the left node (hereinafter called the first neighborhood), the upper-left node (hereinafter called the second neighborhood), the upper node (hereinafter called the third neighborhood) and the upper-right node (hereinafter called the fourth neighborhood).
If one of the four neighborhoods has the same color attribute as the current node, the parent node of the current node is set to the parent node of the neighborhood with the same attribute, i.e. the current node is added to the tree where the neighborhood with the same attribute is located, and the area of that tree is increased by 1. If the neighborhood with the same color attribute as the current node is the first neighborhood of the current node and the fourth neighborhood of the current node also has the same color attribute as the current node, the trees of the first neighborhood and the fourth neighborhood are merged; if the neighborhood with the same color attribute as the current node is the second neighborhood of the current node and the fourth neighborhood of the current node also has the same color attribute as the current node, the trees of the second neighborhood and the fourth neighborhood are merged. In a tree merge, the tree with the smaller area may be merged into the tree with the larger area, or the tree with the larger area may be merged into the tree with the smaller area. A tree merge may involve modifying the parent node information of a neighborhood and merging the areas; the detailed process is not described further.
It should be noted that, if the present embodiment scans the first neighborhood and judges that the first neighborhood has the same color attribute as the current node, then after the current node has been added to the tree where the first neighborhood is located, the second neighborhood and the third neighborhood of the current node need not be scanned; the fourth neighborhood of the current node is scanned directly, it is judged whether the fourth neighborhood has the same color attribute as the current node, and the subsequent processing operations are performed. Similarly, if the present embodiment scans the second neighborhood and judges that the second neighborhood has the same color attribute as the current node, then after the current node has been added to the tree where the second neighborhood is located, the third neighborhood of the current node need not be scanned; the fourth neighborhood of the current node is scanned directly, it is judged whether the fourth neighborhood has the same color attribute as the current node, and the subsequent processing operations are performed. Of course, it is also possible to scan and judge the four neighborhoods of the current node one by one. Having the same color attribute means having the same color value.
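The following is a minimal sketch of this tree-based pass, written with a flat parent array (a union-find forest) rather than the NODE two-dimensional array; it scans only the four already visited neighborhoods (left, upper-left, upper and upper-right) and merges the smaller tree into the larger one, as described above. The names Forest, findRoot, unite and labelRegions are illustrative.

#include <cstdint>
#include <numeric>
#include <utility>
#include <vector>

struct Forest {
    std::vector<int> parent;  // index of the parent node of each pixel
    std::vector<int> area;    // valid at root nodes: area of the connected region
};

static int findRoot(Forest& f, int i)
{
    while (f.parent[i] != i) { f.parent[i] = f.parent[f.parent[i]]; i = f.parent[i]; }
    return i;
}

static void unite(Forest& f, int a, int b)
{
    a = findRoot(f, a); b = findRoot(f, b);
    if (a == b) return;
    if (f.area[a] < f.area[b]) std::swap(a, b);  // merge the smaller tree into the larger one
    f.parent[b] = a;
    f.area[a] += f.area[b];
}

Forest labelRegions(const std::vector<uint8_t>& mask, int width, int height)
{
    Forest f{std::vector<int>(width * height), std::vector<int>(width * height, 1)};
    std::iota(f.parent.begin(), f.parent.end(), 0);  // initially every pixel is its own root
    const int dx[4] = {-1, -1, 0, 1};                // left, upper-left, upper, upper-right
    const int dy[4] = { 0, -1, -1, -1};
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            for (int k = 0; k < 4; ++k) {
                const int nx = x + dx[k], ny = y + dy[k];
                if (nx < 0 || nx >= width || ny < 0) continue;
                if (mask[ny * width + nx] == mask[y * width + x])  // same color attribute
                    unite(f, ny * width + nx, y * width + x);
            }
    return f;  // findRoot gives each pixel's region root; area at the root gives its size
}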
After the connected-region processing has been performed, denoising and hole-filling processing need to be performed.
As shown in Fig. 2.3, a noise region is usually surrounded by the background area, and a hole region is usually surrounded by the foreground area. In addition, the left neighborhood, upper-left neighborhood and upper neighborhood of the root node of a connected region necessarily belong to other connected regions. Therefore, when the denoising and hole-filling processing is performed, the root node of a connected region and the root node of the left neighborhood, upper-left neighborhood or upper neighborhood of this root node can be compared, i.e. the areas of the two root nodes are compared, and the connected region with the smaller area is merged into the connected region with the larger area. That is to say, if the connected region with the smaller area belongs to the foreground area, this smaller connected region is noise and needs to be removed from the foreground area; if the connected region with the smaller area belongs to the background area, this smaller connected region is a hole and needs to be filled into the foreground area.
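As a minimal sketch of the effect of this denoising and hole-filling step, assuming that a preceding labelling pass has produced, for every pixel, the index of its region's root node and the area of that region, any region whose area falls below a threshold can have its pixels flipped, which removes small foreground noise and fills small background holes. The area threshold and the names are illustrative assumptions.

#include <cstdint>
#include <vector>

void removeNoiseAndFillHoles(std::vector<uint8_t>& mask,
                             const std::vector<int>& rootOf,      // root node index per pixel
                             const std::vector<int>& regionArea,  // region area per root index
                             int areaThreshold)
{
    for (size_t p = 0; p < mask.size(); ++p) {
        if (regionArea[rootOf[p]] >= areaThreshold) continue;  // large regions are kept
        // A small region is surrounded by the opposite color, so flipping its
        // pixels removes isolated noise (1 -> 0) and fills holes (0 -> 1).
        mask[p] = mask[p] ? 0 : 1;
    }
}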
It can be seen from the above description that the tree-structure-based connected-region processing mainly finds the respective root node of each node by adopting a tree structure, i.e. merges connected regions that have the same color attribute and are adjacent, so that the individual foreground sub-regions and background sub-regions can be connected.
After the denoising and hole-filling processing, the binarization mask of the foreground area of the i-th frame can accurately express the foreground area of the i-th frame. An actual foreground extraction result is shown in Figs. 2.4 and 2.5. Fig. 2.4 is the extraction result of the foreground image before the connected-region processing; it contains a large amount of noise and holes. Fig. 2.5 is the extraction result of the foreground image after the connected-region processing; a large amount of noise has been eliminated and a large number of holes have been filled, which improves the accuracy of the foreground image extraction.
Embodiment three, a portrait extraction method. In the present embodiment, a concrete portrait extraction process is described with reference to the accompanying drawings.
First, 6 frames of background images as shown in Fig. 3.1 are obtained through an image pickup device such as a camera. These 6 frames of background images are preprocessed, i.e. they are transformed from RGB space to YUV space. The formula adopted for the color space transformation is as described in the above embodiment. The number of background image frames here is only an illustrative example; in practical applications, more background image frames, for example more than 200 frames, can be obtained.
Secondly, the background image model is set. The first 3 frames of background images are used to establish a mean-variance background image model, and the last 3 frames of background images are used as training samples to train the threshold of each pixel in the background image model, i.e. the threshold N_x of pixel x is obtained according to |\mu_x^i - \bar{\mu}_x| = N_x \times \sigma_x, i \in [n+1, ..., 2n]. Because the training samples comprise 3 frames of background images, pixel x has 3 values of N_x, and the largest of the 3 values of N_x is taken as the final threshold N_max; the thresholds N_max corresponding to all pixels form the threshold matrix N, and this threshold matrix N is the finally set background image model. This background image model is likewise a background image model based on a Gaussian distribution.
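A minimal sketch of setting up such a background image model follows, assuming the 2n background frames are given as flat per-pixel arrays of one color statistic: the first n frames give the per-pixel mean \bar{\mu}_x and \sigma_x, and each of the remaining n training frames contributes a candidate threshold |\mu_x^i - \bar{\mu}_x| / \sigma_x, of which the largest is kept as N_max. The names TrainedModel and trainBackgroundModel are illustrative.

#include <cmath>
#include <vector>

struct TrainedModel {
    std::vector<float> mean, sigma, nMax;  // per-pixel statistics and threshold matrix
};

// frames must contain at least 2 * n frames of equal size.
TrainedModel trainBackgroundModel(const std::vector<std::vector<float>>& frames, int n)
{
    const size_t pixels = frames.front().size();
    TrainedModel m{std::vector<float>(pixels, 0.f), std::vector<float>(pixels, 0.f),
                   std::vector<float>(pixels, 0.f)};
    const float eps = 1e-6f;  // avoids division by zero for perfectly constant pixels
    for (size_t x = 0; x < pixels; ++x) {
        for (int i = 0; i < n; ++i) m.mean[x] += frames[i][x];  // mean over the first n frames
        m.mean[x] /= n;
        for (int i = 0; i < n; ++i) {
            const float d = frames[i][x] - m.mean[x];
            m.sigma[x] += d * d;                                // variance over the first n frames
        }
        m.sigma[x] = std::sqrt(m.sigma[x] / n);
        for (int i = n; i < 2 * n; ++i) {                       // train on the remaining n frames
            const float nx = std::fabs(frames[i][x] - m.mean[x]) / (m.sigma[x] + eps);
            if (nx > m.nMax[x]) m.nMax[x] = nx;                 // keep the largest threshold N_max
        }
    }
    return m;
}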
Next, inter-frame difference processing is performed to obtain the pixel sets. During the process in which the portrait (i.e. the foreground image) begins to enter the background image (the portrait entering the background image is shown in Fig. 3.2), because the time difference between two adjacent frames is very short, the changed portions of the portrait in the two adjacent frames overlap. Pixel set Z is obtained using the formula, with Z = Q ∪ W ∪ E. Pixel set W is obtained using the formula W = Z \ F_{n-1}. Pixel set U is obtained using the formula U = Z ∩ F_{n-1}, i.e. U = Q ∪ E. Pixel set T is obtained using the formula T = F_{n-1} \ U = (Q ∪ E ∪ T) \ (Q ∪ E).
Afterwards, background subtraction processing is performed on pixel set U to obtain pixel set E. For example, pixel set U is differenced against the background image model: if the absolute value of the difference between the color intensity \mu_x of pixel x and the mean \bar{\mu}_x of this pixel exceeds the product of the threshold and the variance \sigma_x, pixel x is a foreground pixel; otherwise, it is a background pixel.
Then, pixel sets W, E and T are merged, so that the portrait F_n of the current frame image C_n (i.e. the i-th frame) is obtained. Portrait F_n is shown in Fig. 3.3.
Next, the light-change removal and shadow detection processing is performed. The K value of each pixel in the foreground area is calculated; if the K value of a pixel is stable, i.e. it fluctuates only within a small range, this pixel is a noise point caused by a light change or is a shadow point, and removing this pixel eliminates the influence brought by the light change or removes the shadow from the foreground area.
Next, the smoothing processing is performed. The smoothing processing can be realized by spatial-domain low-pass filtering, and the low-pass convolution template used in the low-pass filtering can be H = \frac{1}{9} \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}. Through this spatial-domain low-pass filtering, it is judged whether at least half of the pixels in the eight neighborhoods of each pixel in the portrait image are foreground pixels; if so, the pixel is a foreground pixel; otherwise, it is a background pixel.
Finally, connected-region processing, noise removal and hole-filling processing are performed. The concrete implementation procedure is as described in the above embodiment and is not described further here.
The portrait finally extracted by the present embodiment is shown in Fig. 3.4.
Embodiment four, a foreground image extraction device. The structure of this device is shown in Fig. 4.
The device in Fig. 4 comprises: a distance module 400, a first set module 410, a second set module 420, a background subtraction module 430 and a foreground area module 440. Optionally, as shown in Fig. 5, the device may further comprise any one or more of: a space conversion module 450, a shadow and light change detection module 460, a smoothing processing module 470 and a connected region module 480.
The distance module 400 is used for obtaining the distance between the pixels at the same position in the i-th frame and the i-1-th frame, so that each pixel in the i-th frame has a corresponding distance value. The position here may be the coordinate position of the pixel within the frame. The distance calculated by the distance module 400 may be a distance between pixels based on RGB space or a distance between pixels based on YUV space, for example a Euclidean distance or a Manhattan distance. If the distance is a distance between pixels based on YUV space, the space conversion module 450 of the present embodiment needs to convert the i-th frame based on RGB space and the i-1-th frame based on RGB space into an i-th frame based on YUV space and an i-1-th frame based on YUV space. After the space conversion module 450 has performed the color space transformation, the distance module 400 calculates the distance between the pixels at the same position in the color-space-transformed i-th frame and i-1-th frame.
The first set module 410 is used for obtaining the pixel set Z of pixels in the i-th frame whose distance is greater than the predetermined value. The first set module 410 may judge the distance value corresponding to each pixel in the i-th frame: if the distance value corresponding to a pixel is greater than the predetermined value, the first set module 410 determines that this pixel belongs to pixel set Z; otherwise, the first set module 410 determines that this pixel does not belong to pixel set Z. The predetermined value here can be adjusted according to actual conditions.
The first set module 410 may represent pixel set Z in the form of a binarization mask; of course, the first set module 410 may also represent pixel set Z in other manners, for example by storing the position information of each pixel belonging to pixel set Z.
The second set module 420 is used for obtaining the pixel set U of pixels in pixel set Z whose positions are the same as those of the pixels of the foreground area of the i-1-th frame; that is, the second set module 420 obtains the intersection of pixel set Z and the foreground area of the i-1-th frame, and this intersection is pixel set U.
If pixel set Z is represented in the form of a binarization mask and the foreground area of the i-1-th frame is also represented in the form of a binarization mask, the second set module 420 performs an AND operation on the values of the pixels at the same positions in the binarization mask representing pixel set Z and the binarization mask representing the foreground area of the i-1-th frame, and the result of the AND operation is the binarization mask representing pixel set U. The foreground area of the i-1-th frame here is known. Details are as described in the above embodiment.
The background subtraction module 430 is used for performing background subtraction processing on pixel set U to obtain pixel set E. The background subtraction module 430 may adopt an existing background subtraction operation to realize the background subtraction processing of pixel set U; for example, the background subtraction module 430 adopts an existing background subtraction operation based on a background image model to perform the background subtraction processing on pixel set U. The background subtraction module 430 may also obtain the background image model through training. In the case where the background image model is obtained through training, the background subtraction module 430 comprises: a model setting submodule and a background subtraction submodule.
The model setting submodule is used for establishing a background image model based on a Gaussian distribution using part of the frames of N continuous frames of background images, and training the established background image model using another part of the frames of the N continuous frames of background images.
Specifically, the model setting submodule obtains N continuous frames of background images and uses part of these frames to establish the background image model; this background image model is a background image model based on a Gaussian distribution, i.e. the model setting submodule obtains the background image model by counting the noise distribution of part of the frames of the N continuous frames of background images. The model setting submodule can establish the background image model by, for example, calculating the noise mean and variance of each pixel over the first half of the frames; of course, the model setting submodule may also establish the background image model using statistics other than the mean and variance. The model setting submodule uses another part of the frames of the above N continuous frames of background images to train the established background image model. A concrete example of the model setting submodule training the background image model is: for any pixel x in the established background image model, the model setting submodule calculates a threshold from each pixel, at the same position as pixel x, of each frame used for training the background image model, i.e. for pixel x, each frame used for training the background image model corresponds to one threshold, and the largest threshold is selected for pixel x from all these thresholds; using this method, the model setting submodule can obtain a largest threshold for every pixel in the established background image model; the model setting submodule uses all the largest thresholds to establish a threshold matrix, and this threshold matrix is the trained background image model.
If the background image model established by the model setting submodule is established on the basis of the noise mean and variance, then in the training process of the background image model the model setting submodule also calculates the thresholds from the noise mean and variance.
The background subtraction submodule is used for performing background subtraction processing on pixel set U using the trained background image model to obtain pixel set E. The concrete implementation procedure of the background subtraction processing performed by the background subtraction submodule is as described in the above embodiment and is not described further here.
The foreground area module 440 is used for determining the foreground area of the i-th frame according to the union of pixel set E, pixel set T and pixel set W. Here, pixel set W is the set of pixels in pixel set Z whose positions differ from those of the pixels of the foreground area of the i-1-th frame, and pixel set T is the set of pixels in the foreground area of the i-1-th frame whose positions differ from those of the pixels in pixel set U.
The foreground area module 440 may obtain pixel set W and pixel set T when it needs to use them, or may obtain pixel set W and pixel set T before the background subtraction module performs the background subtraction processing.
If pixel set E is represented in the form of a binarization mask, the foreground area module 440 may also represent pixel set T and pixel set W in the form of binarization masks. In the case where pixel set E, pixel set T and pixel set W are all represented in the form of binarization masks, the process by which the foreground area module 440 obtains the above union may comprise: performing an OR operation on the binarization masks of pixel set E, pixel set T and pixel set W; the binarization mask obtained after the OR operation can express the foreground area of the i-th frame, and may also be called the binarization mask of the foreground area of the i-th frame, or the binarization mask of the i-th frame.
Once the foreground area module 440 has determined the foreground area of the i-th frame, the foreground image of the i-th frame is also determined. If the foreground area of the i-th frame determined by the foreground area module 440 is in the form of a binarization mask, the foreground area module 440 can obtain from the i-th frame the RGB values of the pixels whose value in this binarization mask is 1, thereby obtaining the foreground image of the i-th frame. If the foreground area of the i-th frame determined by the foreground area module 440 is not in the form of a binarization mask but in the form of stored position information, the foreground area module 440 can obtain from the i-th frame the RGB values of the pixels corresponding to each piece of position information, thereby obtaining the foreground image of the i-th frame.
The shadow and light change detection module 460 is used for obtaining the R value similarity, G value similarity and B value similarity between the pixels in the foreground area of the i-th frame and the pixels at the same positions in the background image model, and for removing from the foreground area of the i-th frame the pixels whose R value similarity, G value similarity and B value similarity meet a predetermined similarity requirement; the removed pixels belong to the background area of the i-th frame.
The shadow and light change detection module 460 may obtain the R value similarity, G value similarity and B value similarity by means of ratios, differences or other approaches; for example, the shadow and light change detection module 460 obtains the ratio of the R value of a pixel x in the foreground area of the i-th frame to the R value of pixel x in the background image model, the ratio of the G value of pixel x in the foreground area of the i-th frame to the G value of pixel x in the background image model, and the ratio of the B value of pixel x in the foreground area of the i-th frame to the B value of pixel x in the background image model. If the shadow and light change detection module 460 judges that the above three ratios are essentially identical or approximately equal, pixel x is removed from the foreground area of the i-th frame. Details are as described in the above embodiment and are not described further here.
The smoothing processing module 470 is used for judging, for each pixel in the foreground area of the i-th frame, whether its neighboring pixels are foreground pixels; if the number of foreground pixels among the neighboring pixels of a pixel x of the foreground area of the i-th frame does not reach half of the number of neighboring pixels of pixel x, pixel x is removed from the foreground area of the i-th frame, and the removed pixel x belongs to the background area of the i-th frame.
The neighboring pixels used by the smoothing processing module 470 may be the eight-neighborhood pixels of pixel x, or may be part of these eight-neighborhood pixels. In the case where the neighboring pixels of pixel x are the eight-neighborhood pixels of pixel x, if the number of foreground pixels among the eight-neighborhood pixels of pixel x is less than 4, the smoothing processing module 470 changes pixel x from a pixel of the foreground area of the i-th frame into a pixel of the background area of the i-th frame; otherwise, the smoothing processing module 470 determines that pixel x is a pixel of the foreground area of the i-th frame.
The order in which the smoothing processing module 470 performs the neighboring-pixel judgment on each pixel of the foreground area of the i-th frame may be from left to right and from top to bottom. The smoothing processing module 470 may realize the above operations by performing spatial-domain low-pass filtering on the binarization mask corresponding to the foreground area of the i-th frame. A concrete example of the spatial-domain low-pass filtering is not described further here.
The connected region module 480 is used for performing connected-region processing on the foreground area of the i-th frame to obtain each foreground connected region; the connected region module 480 removes from the foreground area the foreground connected regions that do not meet the foreground area requirement.
Alternatively, the connected region module 480 may be used for performing connected-region processing on the background area of the i-th frame to obtain each background connected region; the connected region module 480 removes from the background area the pixels of the background connected regions that do not meet the background area requirement.
Alternatively, the connected region module 480 may be used for performing connected-region processing on both the foreground area and the background area of the i-th frame to obtain each foreground connected region and each background connected region; the connected region module 480 removes from the foreground area the foreground connected regions that do not meet the foreground area requirement, and removes from the background area the pixels of the background connected regions that do not meet the background area requirement.
The connected-region processing performed by the connected region module 480 on the foreground area of the i-th frame can yield a plurality of connected regions, all of which can be called foreground connected regions. If the area of a foreground connected region is less than a first area threshold and the connected region adjacent to the left of or above this foreground connected region is not a foreground connected region, the connected region module 480 determines that this foreground connected region does not meet the foreground area requirement.
The connected-region processing performed by the connected region module 480 on the background area of the i-th frame can yield a plurality of connected regions, all of which can be called background connected regions. If the area of a background connected region is less than a second area threshold and the connected region adjacent to the left of or above this background connected region is not a background connected region, the connected region module 480 determines that this background connected region does not meet the background area requirement.
The connected region module 480 may adopt any of various existing connected-region processing manners to obtain the plurality of foreground connected regions and background connected regions, or may adopt the following tree-based connected-region processing manner. When the following tree-based connected-region processing manner is adopted, the connected region module 480 comprises: a connected region obtaining submodule and a removal submodule.
The connected region obtaining submodule is used for initializing the data structure of each pixel in the foreground area and the background area as follows: the parent node coordinates are the coordinates of this pixel, the color value is the color value of this pixel, and the area is a predetermined initial value, e.g. 1; and for processing each pixel x of the i-th frame in turn, in order from left to right and from top to bottom, as follows:
The connected region obtaining submodule obtains the data structure of the left neighborhood; if the left neighborhood of pixel x has the same color value as pixel x, the connected region obtaining submodule sets the parent node coordinates of pixel x to the parent node coordinates of the left neighborhood, sets the area of pixel x to a predetermined terminal value, e.g. 0, and increases the area of the left neighborhood by the predetermined initial value, e.g. by 1. The connected region obtaining submodule then obtains the data structure of the upper-right neighborhood; if the color value of the upper-right neighborhood of pixel x is the same as that of pixel x, the connected region obtaining submodule sets the parent node coordinates of this upper-right neighborhood to the parent node coordinates of the left neighborhood, sets the area of the upper-right neighborhood to the predetermined terminal value, e.g. 0, and increases the area of the left neighborhood by the predetermined initial value, e.g. by 1. The connected region obtaining submodule does not scan the other neighborhoods of pixel x.
If the left neighborhood of pixel x has a different color value from pixel x, the connected region obtaining submodule obtains the data structure of the upper-left neighborhood; if the upper-left neighborhood of pixel x has the same color value as pixel x, the connected region obtaining submodule sets the parent node coordinates of pixel x to the parent node coordinates of the upper-left neighborhood, sets the area of pixel x to the predetermined terminal value, e.g. 0, and increases the area of the upper-left neighborhood by the predetermined initial value, e.g. by 1. The connected region obtaining submodule then obtains the data structure of the upper-right neighborhood; if the color value of the upper-right neighborhood of pixel x is the same as that of pixel x, the connected region obtaining submodule sets the parent node coordinates of this upper-right neighborhood to the parent node coordinates of the upper-left neighborhood, sets the area of the upper-right neighborhood to the predetermined terminal value, e.g. 0, and increases the area of the upper-left neighborhood by the predetermined initial value, e.g. by 1. The connected region obtaining submodule does not scan the other neighborhoods of pixel x.
If the left neighborhood and the upper-left neighborhood of pixel x both have a different color value from pixel x, the connected region obtaining submodule obtains the data structure of the upper neighborhood; if the upper neighborhood of pixel x has the same color value as pixel x, the connected region obtaining submodule sets the parent node coordinates of pixel x to the parent node coordinates of the upper neighborhood, sets the area of pixel x to the predetermined terminal value, e.g. 0, and increases the area of the upper neighborhood by the predetermined initial value, e.g. by 1. The connected region obtaining submodule does not scan the other neighborhoods of pixel x.
If the left neighborhood, the upper-left neighborhood and the upper neighborhood of pixel x all have a different color value from pixel x, the connected region obtaining submodule obtains the data structure of the upper-right neighborhood; if the upper-right neighborhood of pixel x has the same color value as pixel x, the connected region obtaining submodule sets the parent node coordinates of pixel x to the parent node coordinates of the upper-right neighborhood, sets the area of pixel x to the predetermined terminal value, e.g. 0, and increases the area of the upper-right neighborhood by the predetermined initial value, e.g. by 1. The connected region obtaining submodule does not scan the other neighborhoods of pixel x.
The connecting processing performed by the above connected region obtaining submodule can also be expressed as follows: the connected region obtaining submodule judges the four neighborhoods of pixel x in the order of the left neighborhood, the upper-left neighborhood, the upper neighborhood and the upper-right neighborhood; if it judges that a neighborhood has the same color value as pixel x, the connected region obtaining submodule adds pixel x to this neighborhood, i.e. it changes the parent node coordinates of pixel x to the parent node coordinates of this neighborhood and increases the area of this neighborhood; pixel x here can be added to only one neighborhood. If the neighborhood with the same color value as pixel x is the left neighborhood or the upper-left neighborhood, the connected region obtaining submodule further judges the upper-right neighborhood of pixel x; if the color value of the upper-right neighborhood is the same as the color value of pixel x, the connected region obtaining submodule also adds the upper-right neighborhood to the neighborhood that pixel x was added to.
After the connected region obtaining submodule has performed the above processing operations on every pixel of the i-th frame, a plurality of parent nodes can be obtained; one parent node represents one connected region, and such a connected region can also be called a sub-region. That is to say, through the processing of the connected region obtaining submodule, a plurality of connected regions comprising foreground connected regions and background connected regions can be obtained.
The removal submodule is used for: if the area of a foreground connected region is less than the first area threshold and the connected region adjacent to the left of or above this foreground connected region is not a foreground connected region, determining that this foreground connected region does not meet the foreground area requirement and removing this foreground connected region from the foreground area; and if the area of a background connected region is less than the second area threshold and the connected region adjacent to the left of or above this background connected region is not a background connected region, determining that this background connected region does not meet the background area requirement and removing this background connected region from the background area. The first area threshold and the second area threshold adopted by the removal submodule can be adjusted according to actual conditions.
From the description of the above embodiments, those skilled in the art can clearly understand that the present invention can be realized by software plus a necessary hardware platform; of course, it can also be implemented entirely in hardware, but in many cases the former is the better embodiment. Based on such an understanding, all or part of the contribution of the technical solution of the present invention to the background art can be embodied in the form of a software product, and this software product can be used to carry out the above method flows. The computer software product can be stored in a storage medium, such as a ROM/RAM, a magnetic disk or an optical disc, and comprises a number of instructions for causing a computer device (which may be a personal computer, a server, a network device or the like) to execute the methods described in the various embodiments of the present invention or in certain parts of the embodiments.
Although the present invention has been described through embodiments, those of ordinary skill in the art will appreciate that the present invention has many variations and modifications that do not depart from the spirit of the present invention, and the claims of the present application are intended to cover these variations and modifications.

Claims (13)

1. A foreground image extraction method, characterized by comprising:
obtaining the distance between each pixel in the i-th frame and the corresponding pixel at the same coordinate position in the i-1-th frame, judging based on the distance value corresponding to each pixel in the i-th frame, and obtaining a pixel set Z of pixels whose said distance value is greater than a predetermined value;
obtaining a pixel set U of pixels in said pixel set Z whose coordinate positions are the same as those of the pixels of the foreground area of said i-1-th frame;
performing background subtraction processing on said pixel set U to obtain a pixel set E; wherein said background subtraction processing is background subtraction processing based on a background image model, and the establishment of said background image model comprises:
obtaining N continuous frames of background images, and obtaining a background image model by counting the noise distribution of part of the frames of the N continuous frames of background images, this background image model being a background image model based on a Gaussian distribution;
training said background image model based on a Gaussian distribution using another part of the frames of said N continuous frames of background images, specifically: for any pixel x in the established background image model, calculating a threshold from each pixel, at the same coordinate position as pixel x, of each frame used for training the background image model, selecting the largest threshold for pixel x from all the thresholds, and using all the largest thresholds to establish a threshold matrix, this threshold matrix being the trained background image model;
using the trained background image model for said background subtraction processing;
determining the foreground area of said i-th frame according to the union of said pixel set E, a pixel set T and a pixel set W, said pixel set W being the set of pixels in said pixel set Z whose positions differ from those of the pixels of the foreground area of said i-1-th frame, and said pixel set T being the set of pixels in the foreground area of said i-1-th frame whose positions differ from those of the pixels in said pixel set U.
2. The method according to claim 1, characterized in that said method further comprises:
converting the i-th frame based on RGB space and the i-1-th frame based on RGB space into an i-th frame based on YUV space and an i-1-th frame based on YUV space;
said obtaining the distance between each pixel in the i-th frame and the corresponding pixel at the same coordinate position in the i-1-th frame comprising: obtaining the distance between each pixel in the i-th frame based on YUV space and the corresponding pixel at the same coordinate position in the i-1-th frame based on YUV space.
3. The method according to claim 1, characterized in that said method further comprises:
obtaining the R value similarity, G value similarity and B value similarity between the pixels in the foreground area of said i-th frame and the corresponding pixels at the same coordinate positions in said background image model, said R value similarity, G value similarity and B value similarity being obtained by means of ratios or differences;
removing from the foreground area of said i-th frame the pixels whose R value similarity, G value similarity and B value similarity meet a predetermined similarity requirement, the removed pixels belonging to the background area of the i-th frame.
4. The method according to claim 1, characterized in that said method further comprises:
judging, for each pixel in the foreground area of said i-th frame, whether its neighboring pixels are foreground pixels;
if the number of foreground pixels among the neighboring pixels of a pixel x of the foreground area of said i-th frame does not reach half of the number of neighboring pixels of said pixel x, removing said pixel x from the foreground area of said i-th frame, the removed pixel x belonging to the background area of said i-th frame.
5. The method according to claim 1, characterized in that said method further comprises:
performing connected-region processing on the foreground area of said i-th frame to obtain each foreground connected region;
removing from the foreground area the foreground connected regions that do not meet a foreground area requirement;
or said method further comprises:
performing connected-region processing on the background area of said i-th frame to obtain each background connected region;
removing from the background area the background connected regions that do not meet a background area requirement;
or said method further comprises:
performing connected-region processing on the foreground area and the background area of said i-th frame to obtain each foreground connected region and each background connected region;
removing from the foreground area the foreground connected regions that do not meet the foreground area requirement, and removing from the background area the background connected regions that do not meet the background area requirement.
6. The method according to claim 5, characterized in that said performing connected-region processing on the foreground area and the background area comprises:
setting, for each pixel in said foreground area and background area, a tree-based data structure, said data structure comprising: parent node coordinates, a color value and an area;
initializing the data structure of each pixel in said foreground area and background area as: the parent node coordinates being the coordinates of this pixel, the color value being the color value of this pixel, and the area being a predetermined initial value;
processing each pixel x of the i-th frame in turn, in order from left to right and from top to bottom, as follows:
if one of the left neighborhood, upper-left neighborhood, upper neighborhood and upper-right neighborhood of pixel x has the same color value as pixel x, changing the parent node coordinates of pixel x to the parent node coordinates of the neighborhood with the same color value, setting the area of pixel x to a predetermined terminal value, and increasing the area of the neighborhood with the same color value by the predetermined initial value;
if said left neighborhood or upper-left neighborhood has the same color value as pixel x and the color value of said upper-right neighborhood is also the same as the color value of pixel x, changing the parent node coordinates of the upper-right neighborhood to the parent node coordinates of said left neighborhood or upper-left neighborhood, setting the area of the upper-right neighborhood to the predetermined terminal value, and increasing the area of said left neighborhood or upper-left neighborhood by the predetermined initial value.
7. The method according to claim 6, characterized in that said removing from the foreground area the foreground connected regions that do not meet the foreground area requirement comprises:
if the area of a foreground connected region is less than a first area threshold and the connected region adjacent to the left of or above this foreground connected region is not a foreground connected region, this foreground connected region does not meet the foreground area requirement, and this foreground connected region is removed from the foreground area;
and said removing from the background area the background connected regions that do not meet the background area requirement comprises:
if the area of a background connected region is less than a second area threshold and the connected region adjacent to the left of or above this background connected region is not a background connected region, this background connected region does not meet the background area requirement, and this background connected region is removed from the background area.
8. A foreground image extraction apparatus, comprising:
a distance module, configured to obtain the distance between each pixel in the i-th frame and the pixel at the same coordinate position in the (i-1)-th frame;
a first set module, configured to judge, from the distance value corresponding to each pixel of the i-th frame, which pixels have a distance value greater than a predetermined value, and obtain a pixel set Z of those pixels;
a second set module, configured to obtain a pixel set U consisting of the pixels in the pixel set Z whose coordinate positions are the same as pixels in the foreground area of the (i-1)-th frame;
a background subtraction module, configured to perform background subtraction on the pixel set U to obtain a pixel set E, wherein the background subtraction module comprises:
a model setting submodule, configured to obtain N consecutive background frames and build a background image model from the noise distribution of one part of the N consecutive background frames, the background image model being based on a Gaussian distribution; and further train the Gaussian background image model with another part of the N consecutive background frames, specifically: for any pixel x in the established background image model, a threshold is computed from each pixel at the same coordinate position as pixel x in each training frame, the largest of these thresholds is selected for pixel x, and all of the selected maximum thresholds form a threshold matrix, the threshold matrix being the trained background image model;
a background subtraction submodule, configured to perform background subtraction on the pixel set U using the trained background image model, to obtain the pixel set E;
a foreground area module, configured to determine the foreground area of the i-th frame from the union of the pixel set E, a pixel set T and a pixel set W, wherein the pixel set W is the set of pixels in the pixel set Z whose positions differ from the pixels in the foreground area of the (i-1)-th frame, and the pixel set T is the set of pixels in the foreground area of the (i-1)-th frame whose positions differ from the pixels in the pixel set U.
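The set relations of claim 8 can be expressed compactly with boolean masks. The NumPy sketch below assumes the pixel "distance" is a Euclidean distance over the color channels and that the trained background model reduces to a per-pixel mean (`bg_model`) plus the per-pixel threshold matrix (`thresh_matrix`) produced by the model setting submodule; all function and variable names are illustrative, not taken from the patent.

```python
import numpy as np

def extract_foreground(frame_i, frame_prev, prev_fg_mask,
                       bg_model, thresh_matrix, diff_threshold):
    """Boolean-mask sketch of the Z / U / W / T / E set logic in claim 8."""
    f_i = frame_i.astype(np.float64)
    f_prev = frame_prev.astype(np.float64)

    # distance between co-located pixels of frame i and frame i-1
    dist = np.linalg.norm(f_i - f_prev, axis=-1)
    Z = dist > diff_threshold              # set Z: pixels that changed

    U = Z & prev_fg_mask                   # set U: changed and previously foreground
    W = Z & ~prev_fg_mask                  # set W: changed and previously background
    T = prev_fg_mask & ~U                  # set T: previously foreground, not in U

    # background subtraction on U against the trained threshold matrix
    bg_dist = np.linalg.norm(f_i - bg_model.astype(np.float64), axis=-1)
    E = U & (bg_dist > thresh_matrix)      # set E: U pixels that differ from background

    return E | T | W                       # foreground of frame i = E ∪ T ∪ W
```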
9. The apparatus according to claim 8, further comprising:
a color space conversion module, configured to convert the RGB-space i-th frame and the RGB-space (i-1)-th frame into a YUV-space i-th frame and a YUV-space (i-1)-th frame;
wherein the distance module is specifically configured to obtain the distance between each pixel in the YUV-space i-th frame and the pixel at the same coordinate position in the YUV-space (i-1)-th frame.
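As a sketch of the color space conversion module, the BT.601 coefficients below are an assumption; claim 9 does not fix the exact RGB-to-YUV conversion.

```python
import numpy as np

# BT.601 RGB -> YUV coefficients (an assumption; not specified in the claim)
RGB2YUV = np.array([[ 0.299,  0.587,  0.114],
                    [-0.147, -0.289,  0.436],
                    [ 0.615, -0.515, -0.100]])

def rgb_to_yuv(frame_rgb):
    """Convert an (H, W, 3) RGB frame to YUV."""
    return frame_rgb.astype(np.float64) @ RGB2YUV.T

def yuv_pixel_distance(frame_i_rgb, frame_prev_rgb):
    """Per-pixel Euclidean distance in YUV space between co-located pixels,
    as used by the distance module of claim 9."""
    return np.linalg.norm(rgb_to_yuv(frame_i_rgb) - rgb_to_yuv(frame_prev_rgb),
                          axis=-1)
```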
10. The apparatus according to claim 8, further comprising:
a shadow and light change detection module, configured to obtain the R-value similarity, the G-value similarity and the B-value similarity between each pixel in the foreground area of the i-th frame and the pixel at the same coordinate position in the background image model, the similarities being obtained as ratios or as differences; and remove from the foreground area of the i-th frame the pixels whose R-value, G-value and B-value similarities meet a predetermined similarity requirement, the removed pixels belonging to the background area of the i-th frame.
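A minimal sketch of the ratio-based variant of this shadow and light change check is given below; the difference-based variant named in the claim would compare |frame − background| per channel against a bound instead. The similarity bounds `low` and `high` are illustrative values, not taken from the patent.

```python
import numpy as np

def remove_shadow_pixels(frame_i, bg_model, fg_mask, low=0.8, high=1.2):
    """Drop foreground pixels whose R, G and B ratios to the background
    model all lie inside [low, high] (i.e. they 'meet the similarity
    requirement' and are treated as shadow or lighting change)."""
    eps = 1e-6  # avoid division by zero
    ratio = (frame_i.astype(np.float64) + eps) / (bg_model.astype(np.float64) + eps)
    similar = np.all((ratio > low) & (ratio < high), axis=-1)
    return fg_mask & ~similar
```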
11. The apparatus according to claim 8, further comprising:
a smoothing module, configured to judge, for each pixel in the foreground area of the i-th frame, whether its neighbouring pixels are foreground pixels; if, among the neighbouring pixels of a pixel x in the foreground area of the i-th frame, the number of foreground pixels does not reach half of the number of neighbouring pixels of pixel x, pixel x is removed from the foreground area of the i-th frame, the removed pixel x belonging to the background area of the i-th frame.
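The smoothing rule of claim 11 amounts to a neighbour-majority filter. The sketch below counts foreground neighbours with a 3x3 convolution; treating every pixel as having eight neighbours (including at the image border) is a simplification of the claim's "half of the number of neighbouring pixels".

```python
import numpy as np
from scipy.ndimage import convolve

def smooth_foreground(fg_mask):
    """Keep a foreground pixel only if at least half of its 8 neighbours
    are also foreground; otherwise move it to the background."""
    kernel = np.array([[1, 1, 1],
                       [1, 0, 1],
                       [1, 1, 1]])
    neighbour_count = convolve(fg_mask.astype(np.int32), kernel,
                               mode='constant', cval=0)
    return fg_mask & (neighbour_count >= 4)   # 4 = half of 8 neighbours
```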
12. The apparatus according to claim 8, further comprising:
a connected region module, configured to perform connected region processing on the foreground area of the i-th frame to obtain each foreground connected region, and remove from the foreground area the foreground connected regions that do not meet the foreground area requirement;
or configured to perform connected region processing on the background area of the i-th frame to obtain each background connected region, and remove from the background area the pixels in the background connected regions that do not meet the background area requirement;
or configured to perform connected region processing on both the foreground area and the background area of the i-th frame to obtain each foreground connected region and each background connected region, remove from the foreground area the foreground connected regions that do not meet the foreground area requirement, and remove from the background area the pixels in the background connected regions that do not meet the background area requirement.
13. The apparatus according to claim 12, wherein the connected region module comprises:
a connected region obtaining submodule, configured to initialize the data structure of each pixel in the foreground area and the background area so that the parent-node coordinate is the coordinate of the pixel, the color value is the color value of the pixel, and the area is a predetermined initial value, and to process each pixel x of the i-th frame, in order from left to right and from top to bottom, as follows:
if any of the left, upper-left, upper or upper-right neighbours of pixel x has the same color value as pixel x, the parent-node coordinate of pixel x is changed to the parent-node coordinate of that neighbour with the same color value, the area of pixel x is set to a predetermined terminal value, and the area of the neighbour with the same color value is increased by the predetermined initial value;
if the left neighbour or the upper-left neighbour has the same color value as pixel x, and the color value of the upper-right neighbour is also the same as the color value of pixel x, the parent-node coordinate of the upper-right neighbour is changed to the parent-node coordinate of that left or upper-left neighbour, the area of the upper-right neighbour is set to the predetermined terminal value, and the area of that left or upper-left neighbour is increased by the predetermined initial value;
a removal submodule, configured to: if the area of a foreground connected region is less than the first area threshold and the connected region adjacent to the left of, or above, the foreground connected region is not a foreground connected region, determine that the foreground connected region does not meet the foreground area requirement and remove it from the foreground area; and if the area of a background connected region is less than the second area threshold and the connected region adjacent to the left of, or above, the background connected region is not a background connected region, determine that the background connected region does not meet the background area requirement and remove it from the background area.
CN 200910261630 2009-12-18 2009-12-18 Foreground image extraction method and device Expired - Fee Related CN102103751B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 200910261630 CN102103751B (en) 2009-12-18 2009-12-18 Foreground image extraction method and device

Publications (2)

Publication Number Publication Date
CN102103751A CN102103751A (en) 2011-06-22
CN102103751B true CN102103751B (en) 2012-12-19

Family

ID=44156494

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200910261630 Expired - Fee Related CN102103751B (en) 2009-12-18 2009-12-18 Foreground image extraction method and device

Country Status (1)

Country Link
CN (1) CN102103751B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104282013B (en) * 2013-07-08 2017-11-21 浙江大华技术股份有限公司 A kind of image processing method and device for foreground target detection
CN103336965B (en) * 2013-07-18 2016-08-31 国家电网公司 Based on profile difference and the histogrammic prospect of block principal direction and feature extracting method
CN103475800B (en) * 2013-09-25 2017-04-12 北京智诺英特科技有限公司 Method and device for detecting foreground in image sequence
CN104751164A (en) * 2013-12-30 2015-07-01 鸿富锦精密工业(武汉)有限公司 Method and system for capturing movement trajectory of object
GB2544849B (en) * 2014-06-06 2020-12-09 Mitsubishi Electric Corp Video monitoring system and video monitoring method
CN106251286B (en) * 2015-12-30 2019-11-22 深圳超多维科技有限公司 Image processing method, device and equipment
CN107437255B (en) * 2016-05-27 2020-04-24 南宁富桂精密工业有限公司 Image processing apparatus and method
CN106204636B (en) * 2016-06-27 2019-03-22 北京大学深圳研究生院 Video foreground extracting method based on monitor video
CN109151376B (en) * 2017-06-27 2021-09-28 南京东大智能化系统有限公司 Intelligent video analysis monitoring method
CN108109100B (en) * 2017-12-19 2019-11-22 维沃移动通信有限公司 A kind of image processing method, mobile terminal
CN109191520B (en) * 2018-09-30 2020-08-04 湖北工程学院 Plant leaf area measuring method and system based on color calibration
CN110213460A (en) * 2019-06-24 2019-09-06 华南师范大学 A kind of prospect scattering removing method based on inverse locking phase spatial modulation
CN111968042B (en) * 2020-07-08 2023-10-20 华南理工大学 Digital document shadow removing method, system, device and storage medium
CN115761152B (en) * 2023-01-06 2023-06-23 深圳星坊科技有限公司 Image processing and three-dimensional reconstruction method and device under common light source and computer equipment

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101084527A (en) * 2004-10-28 2007-12-05 英国电讯有限公司 A method and system for processing video data
CN101282461A (en) * 2007-04-02 2008-10-08 财团法人工业技术研究院 Image processing methods

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Li Shen et al. Automatic extraction method of moving foreground in video sequences. Computer Engineering and Applications. 2007, pp. 229-239. *
Hu Min et al. Foreground object detection by kernel density estimation on clustered difference images. Journal of Image and Graphics. 2009, vol. 14, no. 10, pp. 2126-2131. *

Also Published As

Publication number Publication date
CN102103751A (en) 2011-06-22

Similar Documents

Publication Publication Date Title
CN102103751B (en) Foreground image extraction method and device
CN1178495C (en) Histogram equalization method and device in contrast extension apparatus for image processing system
CN102103753B (en) Use method and the terminal of real time camera estimation detect and track Moving Objects
CN102132323B (en) System and method for automatic image straightening
CN103404137B (en) The method and apparatus of effective sample adaptive equalization
KR101634562B1 (en) Method for producing high definition video from low definition video
CN111489346B (en) Full-reference image quality evaluation method and system
CN110782477A (en) Moving target rapid detection method based on sequence image and computer vision system
CN106548160A (en) A kind of face smile detection method
CN111510691B (en) Color interpolation method and device, equipment and storage medium
CN107610093B (en) Full-reference image quality evaluation method based on similarity feature fusion
Sandić-Stanković et al. Fast blind quality assessment of DIBR-synthesized video based on high-high wavelet subband
CN111539892A (en) Bayer image processing method, system, electronic device and storage medium
CN105279742B (en) A kind of image de-noising method quickly based on piecemeal estimation of noise energy
Qiao et al. A novel segmentation based depth map up-sampling
KR20060111528A (en) Detection of local visual space-time details in a video signal
CN105303544A (en) Video splicing method based on minimum boundary distance
CN105678718B (en) Image de-noising method and device
Ponomaryov et al. Fuzzy color video filtering technique for sequences corrupted by additive Gaussian noise
Hernandez et al. Classification of color textures with random field models and neural networks
CN110599466B (en) Hyperspectral anomaly detection method for component projection optimization separation
Yang et al. Superpixel based fusion and demosaicing for multi-focus Bayer images
CN109982079B (en) Intra-frame prediction mode selection method combined with texture space correlation
Zhang et al. MLSIM: A Multi-Level Similarity index for image quality assessment
CN105279743B (en) A kind of picture noise level estimation method based on multistage DCT coefficient

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20121219

Termination date: 20191218
