CN111062974B - Method and system for extracting foreground target by removing ghost - Google Patents
- Publication number
- CN111062974B (application number CN201911181481.XA)
- Authority
- CN
- China
- Prior art keywords
- pixel
- background
- foreground
- points
- value
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06T7/254—Analysis of motion involving subtraction of images
- G06F18/24—Classification techniques
- G06T5/73
- G06T7/215—Motion-based segmentation
- G06V10/462—Salient features, e.g. scale invariant feature transforms [SIFT]
- G06T2207/10016—Video; Image sequence
- G06T2207/20201—Motion blur correction
Abstract
The invention discloses a method and system for extracting a foreground target by removing ghosts, belonging to the technical field of foreground target detection. The method comprises the following steps: acquiring video stream information, taking the pixel values of a number of arbitrary position points of the initial frame image in the video stream as a sample library, selecting several pixel values with equal probability from the neighborhood of each of those positions in the initial frame image, and assigning them to the sample library to generate a background model; selecting any frame in the video stream and classifying each pixel point of that frame as foreground or background; updating the pixel value of each background pixel point into the sample library of the background model with a preset probability; and determining which pixel points are ghost pixel points and removing them so as to extract the foreground target. The dynamic degree of each pixel point is evaluated by introducing a dynamics model and the flicker degree, and the sampling distance threshold and the matching threshold are updated adaptively, which improves the accuracy of foreground extraction while reducing the miss rate.
Description
Technical Field
The present invention relates to the technical field of foreground object detection, and more particularly, to a method and system for extracting a foreground object by removing ghosts.
Background
Foreground object detection automatically segments the video sequence from a camera into foreground objects of interest and background, and subsequent research such as object tracking, counting, recognition and classification is based on its result. How to cope with the variability of the scene (such as dynamic backgrounds and shadows) is the biggest challenge this technology faces at the present stage.
The mainstream foreground object detection methods at present are the frame difference method, the optical flow method and the background subtraction method. The background subtraction method has become the current research hotspot owing to its combined advantages in real-time performance and accuracy, and it takes establishing a background model with high accuracy and strong adaptability as its core task. Commonly used background models fall into parametric and non-parametric types. The Gaussian Mixture Model (GMM) models the color intensity of each pixel with several Gaussian probability density functions, but it has high computational complexity and poor real-time performance and can hardly eliminate false objects caused by a dynamic background. Non-parametric background models are represented by Kernel Density Estimation (KDE), the CodeBook model and Visual Background extraction (ViBe). Unlike the parametric models, KDE estimates the background probability density from the historical pixel values at each pixel position, but its first-in first-out update strategy for observations keeps it from adapting to long-period events. CodeBook clusters pixel values into code words stored in a local dictionary and learns and updates the model using the temporal information of the pixels, but a background model trained only on a specific number of frames can hardly adapt to complex conditions such as illumination change and irregular background motion.
The ViBe algorithm establishes the background model by randomly sampling neighborhood pixel values: it first matches the pixels of the current frame against the corresponding background model, then classifies each pixel point as foreground or background using a global fixed threshold, and finally replaces background-model samples at random at a certain update rate.
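As an illustration of this classification rule, the following is a minimal sketch of the per-pixel ViBe match test (the classic fixed-threshold form, not the patent's adaptive variant). The sample count N = 20, radius R = 20 and #min = 2 are the commonly cited defaults of the original ViBe algorithm, assumed here for concreteness:

```python
import numpy as np

# Assumed defaults from the original ViBe algorithm: 20 samples per pixel,
# fixed color-distance radius R = 20, and at least 2 matches for background.
N_SAMPLES, R_FIXED, MIN_MATCHES = 20, 20, 2

def vibe_classify(pixel, samples, r=R_FIXED, min_matches=MIN_MATCHES):
    """Return True (background) if `pixel` lies within distance `r`
    of at least `min_matches` stored background samples."""
    matches = int(np.sum(np.abs(samples.astype(int) - int(pixel)) < r))
    return matches >= min_matches

# A stable pixel near the stored samples is background; an outlier is foreground.
samples = np.array([98, 101, 99, 103, 100] * 4)  # 20 history samples
print(vibe_classify(100, samples))  # True  (background)
print(vibe_classify(200, samples))  # False (foreground)
```

Each pixel position carries its own sample library, so this test runs independently per pixel.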
The ViBe algorithm has the advantages of a simple model, a small amount of calculation, high processing speed and high detection precision, but it also has some defects:
1) the strategy of classifying and updating by fixed values is difficult to adapt to dynamic backgrounds such as water flow, branch and leaf shaking and the like;
2) if a foreground object exists in a single frame used for initializing a background, a ghost image appears in subsequent detection.
In summary, the Gaussian mixture model has high computational complexity and poor real-time performance and can hardly eliminate false targets caused by dynamic backgrounds. Among the non-parametric background models represented by Kernel Density Estimation (KDE), the CodeBook model and Visual Background extraction (ViBe), KDE cannot adapt to long-period events because of its first-in first-out update strategy for observations, while CodeBook, relying only on a background model trained on a specific number of frames, can hardly adapt to complex situations such as illumination changes and irregular background motion.
Disclosure of Invention
In order to solve the above problems, the invention provides a method for extracting a foreground target by removing ghosts, the method comprising:
acquiring video stream information, taking the pixel values of a number of arbitrary position points of the initial frame image in the video stream as a sample library, selecting several pixel values with equal probability from the neighborhood of each of those positions in the initial frame image, and assigning them to the sample library to generate a background model;
selecting any frame in the video stream, and classifying each pixel point of that frame as foreground or background;
for each pixel point of the frame classified as background, determining its pixel value and, with a preset probability, randomly replacing a sample at the corresponding position point in the background model with that pixel value, i.e. updating the pixel value into the sample library of the background model with the preset probability;
obtaining the pixel saliency values of the pixel points of the frame classified as foreground and the saliency values of the corresponding position points of the updated background-model sample library, and taking the absolute difference of the visual saliency maps of the current frame and the background model to obtain the saliency difference of each pixel point; if a foreground pixel point exists whose significance degree is greater than or equal to the current threshold, it is a ghost pixel point, and the ghost pixel points are removed so as to extract the foreground target.
Optionally, the method further includes:
reclassifying the ghost pixel points as background and replacing their pixel values with the pixel values at the corresponding position points in the background model.
Optionally, the preset probability is 1/T, where T is an update-time sampling factor;
the update-time sampling factor is obtained from the background dynamics model and the flicker degree value;
and the update-time sampling factor is adjusted after the pixel value of a ghost pixel point has been replaced with the pixel value at the corresponding position point in the background model.
Optionally, the foreground/background classification of each pixel point of the frame is performed as follows:
a two-dimensional color space is defined, centered on the pixel value of the pixel point and with the sampling distance threshold as its radius; if the number of samples of the corresponding position point in the background-model sample library that fall within this color space is less than a preset matching threshold, the pixel point is taken as foreground, otherwise as background.
Optionally, the sampling distance threshold is updated according to the flicker degree of each pixel and the minimum distance between the pixel point and the center of the two-dimensional color space.
The invention also provides a system for extracting a foreground target by removing ghosts, the system comprising:
the sampling module, which acquires video stream information, takes the pixel values of a number of arbitrary position points of the initial frame image in the video stream as a sample library, selects several pixel values with equal probability from the neighborhood of each of those positions in the initial frame image, and assigns them to the sample library to generate a background model;
the classification module, which selects any frame in the video stream and classifies each pixel point of that frame as foreground or background;
the updating module, which, for each pixel point of the frame classified as background, determines its pixel value and, with a preset probability, randomly replaces a sample at the corresponding position point in the background model with that pixel value, i.e. updates the pixel value into the sample library of the background model with the preset probability;
the identification module, which obtains the pixel saliency values of the pixel points of the frame classified as foreground and the saliency values of the corresponding position points of the updated background-model sample library, takes the absolute difference of the visual saliency maps of the current frame and the background model to obtain the saliency difference of each pixel point, and, if a foreground pixel point exists whose significance degree is greater than or equal to the current threshold, removes the ghost pixel points to extract the foreground target.
Optionally, the identification module is further configured to:
reclassify the ghost pixel points as background and replace their pixel values with the pixel values at the corresponding position points in the background model.
Optionally, the preset probability is 1/T, where T is an update-time sampling factor;
the update-time sampling factor is obtained from the background dynamics model and the flicker degree value;
and the update-time sampling factor is adjusted after the pixel value of a ghost pixel point has been replaced with the pixel value at the corresponding position point in the background model.
Optionally, the foreground/background classification of each pixel point of the frame is performed as follows:
a two-dimensional color space is defined, centered on the pixel value of the pixel point and with the sampling distance threshold as its radius; if the number of samples of the corresponding position point in the background-model sample library that fall within this color space is less than a preset matching threshold, the pixel point is taken as foreground, otherwise as background.
Optionally, the sampling distance threshold and the matching threshold are updated according to the flicker degree of each pixel and the minimum distance between the pixel point and the center of the two-dimensional color space.
Compared with the prior art, the invention has the following beneficial effects:
the dynamic degree of each pixel point is evaluated by introducing a dynamics model and the flicker degree, and the sampling distance threshold and the matching threshold are updated adaptively, which reduces the miss rate while improving the accuracy of foreground extraction;
the time sub-sampling factor of each pixel point is updated adaptively according to its dynamic degree, which improves the accuracy of the background model in complex scenes;
frame-by-frame statistics of the saliency value at each pixel position serve as the basis for ghost judgment, and the background model of pixel points identified as ghosts is updated according to the matching threshold, so that ghosts are eliminated rapidly.
Drawings
FIG. 1 is a flow chart of a method for foreground object extraction using ghost removal in accordance with the present invention;
FIG. 2 is a flowchart of an embodiment of a method for foreground object extraction using ghost removal according to the present invention;
FIG. 3 is a flowchart illustrating the front/background classification of any pixel point of any frame according to an embodiment of the method for extracting a foreground object by removing ghosts of the present invention;
FIG. 4 is a flowchart illustrating a method for extracting a foreground object by removing ghosts according to an embodiment of the present invention, wherein any pixel point is determined to be a ghost pixel point;
FIG. 5 is a diagram of a system for foreground object extraction using ghost removal according to the present invention.
Detailed Description
Exemplary embodiments of the present invention will now be described with reference to the accompanying drawings; however, the invention may be embodied in many different forms and is not limited to the embodiments described herein, which are provided so that this disclosure is thorough and complete and fully conveys the scope of the invention to those skilled in the art. The terminology used in the exemplary embodiments illustrated in the accompanying drawings is not intended to limit the invention. In the drawings, the same units/elements are denoted by the same reference numerals.
Unless otherwise defined, terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Further, it will be understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense.
The invention provides a method for extracting a foreground target by removing ghosts, which comprises the following steps:
acquiring video stream information, taking the pixel values of a number of arbitrary position points of the initial frame image in the video stream as a sample library, selecting several pixel values with equal probability from the neighborhood of each of those positions in the initial frame image, and assigning them to the sample library to generate a background model;
selecting any frame in the video stream, and classifying each pixel point of that frame as foreground or background;
for each pixel point of the frame classified as background, determining its pixel value and, with a preset probability, randomly replacing a sample at the corresponding position point in the background model with that pixel value, i.e. updating the pixel value into the sample library of the background model with the preset probability;
obtaining the pixel saliency values of the pixel points of the frame classified as foreground and the saliency values of the corresponding position points of the updated background-model sample library, and taking the absolute difference of the visual saliency maps of the current frame and the background model to obtain the saliency difference of each pixel point; if a foreground pixel point exists whose significance degree is greater than or equal to the current threshold, it is a ghost pixel point, and the ghost pixel points are removed so as to extract the foreground target.
The ghost pixel points are then reclassified as background, and their pixel values are replaced with the pixel values at the corresponding position points in the background model.
The preset probability is 1/T, where T is an update-time sampling factor;
the update-time sampling factor is obtained from the background dynamics model and the flicker degree value;
and the update-time sampling factor is adjusted after the pixel value of a ghost pixel point has been replaced with the pixel value at the corresponding position point in the background model.
The foreground/background classification of each pixel point of the frame is performed as follows:
a two-dimensional color space is defined, centered on the pixel value of the pixel point and with the sampling distance threshold as its radius; if the number of samples of the corresponding position point in the background-model sample library that fall within this color space is less than a preset matching threshold, the pixel point is taken as foreground, otherwise as background.
The sampling distance threshold and the matching threshold are updated according to the flicker degree of each pixel and the minimum distance between the pixel point and the center of the two-dimensional color space.
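The exact adaptive update formulas appear in the original as images that are not reproduced in this text. As a hedged sketch only, the snippet below shows one plausible scheme consistent with the description: loosen the sampling distance threshold R and raise the matching threshold #min where the recursive minimum distance D and flicker degree g indicate a dynamic region, and slowly tighten both where the pixel looks static. Every constant here (bounds, gate, scale factors) is an assumption, not the patent's value:

```python
# All constants below are illustrative assumptions, not the patent's values.
R_MIN, R_MAX = 18.0, 40.0    # bounds on the sampling distance threshold
M_MIN, M_MAX = 2.0, 5.0      # bounds on the matching threshold #min

def update_thresholds(R, n_match, D, g, dynamic_gate=20.0):
    """Loosen R and raise #min where the pixel looks dynamic
    (large recursive minimum distance D or flicker degree g);
    slowly tighten both where it looks static."""
    if D + g > dynamic_gate:
        R = min(R_MAX, R * 1.5)
        n_match = min(M_MAX, n_match + 1.0)
    else:
        R = max(R_MIN, R * 0.95)
        n_match = max(M_MIN, n_match - 0.1)
    return R, n_match

print(update_thresholds(R=20.0, n_match=2.0, D=15.0, g=10.0))  # (30.0, 3.0)
print(update_thresholds(R=20.0, n_match=2.0, D=2.0, g=0.0))    # (19.0, 2.0)
```

The design intent matches the text: dynamic-background pixels get a looser color radius and a stricter match count, so flickering water or foliage is less likely to be misread as foreground.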
The present invention will be further illustrated with reference to the following embodiment.
As shown in FIG. 2, video stream information is obtained, the pixel values of a number of arbitrary position points of the initial frame image in the video stream are selected as a sample library, several pixel values are selected with equal probability from the neighborhood of each of those positions in the initial frame image and assigned to the sample library, and the background model is generated. Specifically:
step 1.1, inputting video stream information ═ { I ═ I0,I1,…,Ii… } (wherein I isiFor the image of the i-th frame,whereinRepresenting the pixel value at the (s, t) position in the image, the image size is M × N), and first, it is determined whether or not the image is an initial frame image. If the image is the initial frame image, the procedure goes to step 1.2. If not, go to step 2.1.
Step 1.2, initialize the background model with the pixel values of the initial video frame (Ii, i = 0). From the p-neighborhood Np(s, t) of each position (s, t), select n pixel values with equal probability and assign them to the background samples of position (s, t):
B={B(s,t)|1≤s≤M,1≤t≤N}
B(s,t)={b1(s,t),b2(s,t),…,bn(s,t)}
In the formula: b is a background model in the algorithm; b (s, t) is a background model of the location (s, t); n is a radical ofp(s, t) is the p neighborhood of location (s, t).
Selecting any frame in the video stream information, and performing front/background classification on any pixel point of any frame, as shown in fig. 3, specifically:
Step 2.1, evaluate the dynamic degree of the background. For a non-initial video frame Ii (i ≥ 1), first define a recursive minimum distance Di(s, t) and a flicker degree value gi(s, t). Following the idea of background dynamics, the recursive minimum distance captures the motion state of each pixel point over the most recent time window. The flicker degree value is defined from the characteristic that pixel points belonging to a dynamic background switch frequently between background and foreground. Specifically, the following formulas are used:
where the minimum-distance term is the minimum distance, in the two-dimensional Euclidean color space, between the pixel value Ii(s, t) of the current frame at position (s, t) and its background model B(s, t); ginc and gdec are the increase and decrease coefficients of the flicker degree, respectively; ⊕ denotes the exclusive-OR operation; and Fi(s, t) indicates whether the pixel point at position (s, t) of the i-th frame belongs to the background.
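The recursions themselves are given in the original as images that did not survive extraction, so the following is only a plausible reconstruction under stated assumptions: D is kept as an exponential moving average of the per-frame minimum color distance, and g is increased by ginc when the background/foreground label flips between consecutive frames (the XOR term) and decreased by gdec otherwise. The coefficient values are assumptions:

```python
# Assumed coefficients; the patent's actual values are not given in this text.
ALPHA, G_INC, G_DEC = 0.1, 15.0, 0.75

def update_dynamics(D_prev, d_min, g_prev, label, label_prev):
    """One recursion step per pixel: smooth the minimum color distance into D,
    and move the flicker value g up on a background/foreground flip
    (XOR of consecutive labels F_i, F_{i-1}) or down otherwise."""
    D = (1 - ALPHA) * D_prev + ALPHA * d_min
    flipped = label != label_prev            # the exclusive-OR term
    g = g_prev + G_INC if flipped else max(0.0, g_prev - G_DEC)
    return D, g

D, g = update_dynamics(D_prev=4.0, d_min=10.0, g_prev=2.0,
                       label=True, label_prev=False)
print(round(D, 2), g)  # 4.6 17.0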
Step 2.2, according to the flicker degree and the minimum distance of each pixel, the sampling distance threshold is updated adaptively as follows:
Step 2.3, to ensure the integrity of the foreground target in the detection result, the matching threshold #min(s, t) is updated adaptively for foreground pixel points using the minimum distance and the flicker degree value:
Step 2.4, classify the pixel point at position (s, t) as foreground or background according to the background model B(s, t) established there. Construct a two-dimensional Euclidean color space centered on the pixel value Ii(s, t) with the sampling distance threshold Ri(s, t) as its radius. If the number of background-model samples falling within this space is less than the matching threshold #min(s, t), the current pixel is considered foreground; otherwise it is considered background. The discrimination formula is as follows:
For each pixel point of the frame classified as background, determine its pixel value and, with a preset probability, randomly replace a sample at the corresponding position point in the background model with that pixel value, i.e. update the pixel value into the sample library of the background model with the preset probability. Specifically, the steps are as follows:
Step 3.1, adaptively update the time sampling factor of each pixel point. To adapt to the rapid changes of dynamic regions, and to keep the pixel values of slowly moving foreground objects from being absorbed into the background through the neighborhood propagation mechanism (which would degrade the accuracy of the background model), an adaptive update-time sampling factor is set for each pixel point:
where Tmax and Tmin are fixed up-adjustment and down-adjustment amplitudes used to bound the floating range of the sampling factor;
Step 3.2, for a pixel point at position (s, t) classified as background, its pixel value Ii(s, t) randomly replaces one sample in the background model B(s, t) with probability 1/Ti(s, t), where Ti(s, t) is the time sub-sampling factor; meanwhile, with probability 1/Ti(s, t), the pixel value Ii(s, t) is updated into the background model of a neighboring pixel in Np(s, t).
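Step 3.2 can be sketched as follows; the two independent 1/T coin flips, one for the pixel's own model and one for a neighbor's model, implement the conservative random update policy described above:

```python
import random

random.seed(42)

def maybe_update(own_samples, neighbor_samples, pixel, T):
    """With probability 1/T replace one random sample of the pixel's own
    model, and independently with probability 1/T replace one random
    sample of a neighbor's model (spatial propagation)."""
    if random.random() < 1.0 / T:
        own_samples[random.randrange(len(own_samples))] = pixel
    if random.random() < 1.0 / T:
        neighbor_samples[random.randrange(len(neighbor_samples))] = pixel
    return own_samples, neighbor_samples

own, nbr = maybe_update([10] * 5, [12] * 5, pixel=99, T=1)  # T=1: always fires
print(own.count(99), nbr.count(99))  # 1 1
```

With a larger T (as set adaptively in step 3.1) updates fire less often, so slowly moving foreground is less likely to leak into the background model.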
Obtain the pixel saliency values of the pixel points of the frame classified as foreground and the saliency values of the corresponding position points of the updated background-model sample library, take the absolute difference of the visual saliency maps of the current frame and the background model to obtain the saliency difference of each pixel point, and, if a foreground pixel point exists whose significance degree is greater than or equal to the current threshold, treat it as a ghost pixel point; reclassify the ghost pixel points as background and replace their pixel values with the pixel values at the corresponding position points in the background model. As shown in FIG. 4, this specifically comprises the following steps:
Step 4.1, compute the visual saliency maps of the current frame and of the background model respectively, using a histogram-contrast-based pixel saliency detection method; the saliency difference of each pixel point is then obtained as the absolute value of their difference:
Step 4.2, perform foreground detection on the current frame using the ViBe algorithm with adaptively updated thresholds, and establish a significance degree function Hi(s, t) for all pixel points. For points classified as foreground, Hi(s, t) is updated in combination with the saliency difference:
Step 4.3, define the significance degree threshold ci of the current frame as:
where β is a threshold adjustment parameter and the remaining term is the mean saliency difference of the foreground pixels; the size of β is determined by the contrast between the ghost area and the background.
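A simplified, grayscale sketch of the saliency computation in steps 4.1-4.3. Histogram-contrast saliency defines a pixel's saliency as the frequency-weighted sum of its color distance to all other intensity levels; the 3×3 toy images below are illustrative only:

```python
import numpy as np

def hc_saliency(gray):
    """Histogram-contrast saliency for a grayscale image: the saliency of
    intensity i is sum_j p(j) * |i - j| over all intensity levels j."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    hist /= hist.sum()
    levels = np.arange(256, dtype=float)
    sal_per_level = np.abs(levels[:, None] - levels[None, :]) @ hist
    return sal_per_level[gray]

frame = np.array([[10, 10, 10], [10, 200, 10], [10, 10, 10]], dtype=np.uint8)
bg = np.full((3, 3), 10, dtype=np.uint8)
diff = np.abs(hc_saliency(frame) - hc_saliency(bg))  # step 4.1 difference map
print(bool(diff[1, 1] > diff[0, 0]))  # True: the lone bright pixel stands out
```

A genuine ghost region has background appearance, so its saliency difference against the background model stays low, while a true foreground object scores high; thresholding this difference with ci is what separates the two.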
Step 4.4, ghost judgment is carried out on the foreground pixel points every other delta i frame, and H is carried out if the foreground pixel points exist and the significance degree is larger than or equal to the current threshold valuei(s,t)≥ciThen, the pixel (s, t) is considered as a ghost pixel. First, the ghost pixel point is reclassified as background, and the current pixel value is usedRandomly replacing n' samples in the background model, and adjusting a time sub-sampling factor:
n′=Ceil(#min(s,t))
where Ceil(·) denotes rounding up, i.e. the smallest integer not less than its argument.
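Step 4.4's model reseeding can be sketched as below; the sample-library size of 20 and the example threshold value are assumptions for illustration:

```python
import math
import random

random.seed(7)

def absorb_ghost(samples, pixel, min_match_threshold):
    """Reclassify a ghost pixel as background by overwriting
    n' = Ceil(#min(s, t)) randomly chosen samples with the current
    pixel value, so the model absorbs the ghost region quickly."""
    n_replace = math.ceil(min_match_threshold)
    for idx in random.sample(range(len(samples)), n_replace):
        samples[idx] = pixel
    return samples

# 20-sample library and threshold 2.4 are assumed example values.
samples = absorb_ghost([50] * 20, pixel=120, min_match_threshold=2.4)
print(samples.count(120))  # 3  (Ceil(2.4) = 3)
```

Replacing exactly Ceil(#min) samples is the minimum needed for the ghost pixel to match its own model on the next frame, which is why the ghost disappears quickly instead of fading over many updates.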
Step 4.5, repeating steps 4.1 to 4.4 until no pixel point in the image meets Hi(s,t)≥ciI.e. the foreground object no longer contains ghost areas.
Through the four steps of this embodiment, accurate, parameter-adaptive extraction of the foreground target with ghost removal is finally achieved.
The invention also provides a system for extracting a foreground target by removing ghosts, which comprises the following modules:
the sampling module 201 acquires video stream information, selects pixel values of a plurality of arbitrary position points of an initial frame image in the video stream information as a sample library, selects a plurality of pixel values with medium probability in the neighborhood of the arbitrary position of the initial frame image, assigns values to the sample library, and generates a background model;
the classification module 202 selects any frame in the video stream information, and performs foreground and background classification on any pixel point of any frame;
the updating module 203 selects any pixel point of any frame classified as the background, determines the pixel value of any pixel point, randomly replaces the sample of any corresponding position point in the background model with the pixel value of any pixel point according to the preset probability, and updates the pixel value of any pixel point to the sample library of the background model according to the preset probability;
the identification module 204 is configured to obtain a pixel significant value of a pixel value of any pixel point of any frame classified as a foreground and significant values of pixel values of a plurality of any position points of an updated background model sample library, perform subtraction on a visual significant graph of a current frame and a background model to obtain an absolute value of each pixel point to obtain a significant difference value of each pixel point, and if a foreground pixel point exists so that the significance degree is greater than or equal to a current threshold, remove a ghost pixel point and extract a foreground target;
and reclassifying the ghost pixel points, dividing the ghost pixel points into the background, and replacing the pixel values of the ghost pixel points with the pixel values of any corresponding position points in the background model.
The preset probability is 1/T, where T is an update time sampling factor;
the update time sampling factor is obtained from the background dynamics model and the flicker degree value;
and the update time sampling factor is updated after the pixel value of the ghost pixel point has been replaced with the pixel value of the corresponding position point in the background model.
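How T is derived from the background dynamics model and the flicker value is given by the specification's formulas, which are not reproduced here; the sketch below only illustrates the general shape of such an adaptive rule, with entirely assumed constants and an assumed update direction:

```python
def update_subsample_factor(t, dynamics, flicker,
                            t_min=2, t_max=256, step=0.05):
    """Illustrative adaptive rule for the time sub-sampling factor T.
    Here, high dynamics/flicker lowers T (more frequent model updates)
    so a changing background is absorbed quickly, while quiet pixels
    raise T and keep their samples longer. The direction and constants
    are assumptions, not the patent's formula."""
    activity = dynamics + flicker
    if activity > 0.5:                  # busy pixel: update the model more often
        t = max(t_min, t - step * t)
    else:                               # quiet pixel: retain samples longer
        t = min(t_max, t + step)
    return t
```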
Foreground/background classification of a pixel point of the frame is performed as follows:
a two-dimensional color space is defined with the pixel value of the pixel point as the center and the sampling distance threshold as the radius; if the number of samples of the pixel value of the corresponding position point in the background model sample library falling within this two-dimensional color space is not less than a preset matching threshold, the pixel point is classified as background, and otherwise as foreground.
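The matching test can be sketched as follows for a single pixel; for brevity the sketch uses scalar intensities rather than the two-dimensional color space of the invention, and both threshold values are assumed:

```python
import numpy as np

def classify_pixel(pixel, samples, dist_threshold=20, match_threshold=2):
    """ViBe-style test: count the model samples whose distance to the
    current pixel value falls inside the radius dist_threshold; the
    pixel is background when at least match_threshold samples match."""
    dist = np.abs(samples.astype(int) - int(pixel))
    matches = np.count_nonzero(dist < dist_threshold)
    return 'background' if matches >= match_threshold else 'foreground'
```

In the adaptive scheme described next, `dist_threshold` and `match_threshold` would not be constants but would be raised or lowered per pixel from its flicker degree and minimum sample distance.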
The sampling distance threshold and the matching threshold are updated according to the flicker degree of each pixel and the minimum distance between the pixel point and the center of the two-dimensional color space. By introducing a dynamics model and the flicker degree to evaluate the dynamic degree of the pixel points, the sampling distance threshold and the matching threshold are adaptively updated, so that the missed-detection rate is reduced while the accuracy of foreground extraction is improved;
according to the invention, the time sub-sampling factor of each pixel point is adaptively updated through its dynamic degree, improving the accuracy of the background model in complex scenes;
the invention statistically accumulates, over frames, the saliency value of each pixel position as the basis for ghost judgment, and updates the background model for pixel points identified as ghosts according to the matching threshold, thereby rapidly eliminating ghosts.
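The saliency-based ghost judgment can be sketched as follows, assuming a grayscale histogram-contrast saliency in the spirit of the cited visual-saliency literature; the one-shot threshold test below is a simplification of the invention's frame-accumulated saliency degree, and all names and thresholds are illustrative:

```python
import numpy as np

def histogram_contrast_saliency(gray):
    """Histogram-contrast saliency: a pixel is salient in proportion to
    the occurrence-weighted distance between its intensity and all other
    intensities in the frame."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    hist /= hist.sum()
    levels = np.arange(256, dtype=float)
    # saliency of intensity v: sum over u of p(u) * |v - u|
    sal_of_level = np.abs(levels[:, None] - levels[None, :]) @ hist
    sal = sal_of_level[gray]
    peak = sal.max()
    return sal / peak if peak > 0 else sal

def ghost_mask(frame_gray, bg_gray, fg_mask, sal_threshold=0.2):
    """Flag foreground pixels whose saliency difference against the
    background model meets the current threshold as ghost candidates
    (a one-shot simplification of the accumulated saliency degree)."""
    diff = np.abs(histogram_contrast_saliency(frame_gray)
                  - histogram_contrast_saliency(bg_gray))
    return fg_mask & (diff >= sal_threshold)
```

Pixels flagged by `ghost_mask` would then be reclassified as background and their model samples overwritten, which is what removes the ghost within a few frames.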
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting the same, and although the present invention is described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that: modifications and equivalents may be made to the embodiments of the invention without departing from the spirit and scope of the invention, which is to be covered by the claims.
Claims (8)
1. A method for foreground object extraction using de-ghosting, the method comprising:
acquiring video stream information, selecting the pixel values of a plurality of position points of an initial frame image in the video stream information as a sample library, selecting with equal probability a plurality of pixel values in the neighborhood of each position of the initial frame image, assigning them to the sample library, and generating a background model;
selecting any frame in the video stream information, and carrying out foreground and background classification on any pixel point of any frame;
selecting any pixel point of any frame classified as the background, determining the pixel value of any pixel point, randomly replacing a sample of a corresponding any position point in the background model by the pixel value of any pixel point according to a preset probability, and updating the pixel value of any pixel point into a sample library of the background model according to the preset probability;
acquiring pixel significant values of pixel values of any pixel points of any frame classified as a foreground and significant values of pixel values of a plurality of any position points of an updated background model sample library;
respectively obtaining visual saliency maps of the current frame and the background model by using a histogram-contrast-based image pixel saliency value detection method, obtaining the saliency difference value of each pixel point through an absolute-difference calculation, performing foreground detection on the current frame by using a ViBe algorithm with adaptive threshold updating, establishing a saliency degree function for all pixel points, and updating the saliency degree function of the points classified as foreground by combining their saliency difference values:
the saliency degree function indicates whether the pixel point at the given position of the i-th frame belongs to the background, wherein the saliency difference value of that pixel point is used in the update;
in the formula, a threshold adjustment parameter and the mean of the saliency difference values of the foreground pixel points appear, and the size of the threshold adjustment parameter is determined by the contrast between the ghost area and the background;
ghost judgment is carried out on the foreground pixel points at fixed frame intervals; if a foreground pixel point exists whose saliency degree is greater than or equal to the current saliency degree threshold, the pixel point is regarded as a ghost pixel point, and the ghost pixel points are removed to extract the foreground target.
2. The method of claim 1, further comprising:
and reclassifying the ghost pixel points, dividing the ghost pixel points into the background, and replacing the pixel values of the ghost pixel points with the pixel values of any corresponding position points in the background model.
3. The method of claim 1, wherein the preset probability is 1/T, T being an update time sampling factor;
the updating time sampling factor is obtained according to the background dynamics model and the flicker degree value;
and the update time sampling factor is updated after the pixel value of the ghost pixel point has been replaced with the pixel value of the corresponding position point in the background model.
4. The method according to claim 1, wherein the foreground/background classification of a pixel point of the frame is specifically:
determining a two-dimensional color space with the pixel value of the pixel point as the center and the sampling distance threshold as the radius, and if the number of samples of the pixel value of the corresponding position point in the background model sample library falling within the two-dimensional color space is not less than a preset matching threshold, classifying the pixel point as background, and otherwise as foreground.
5. A system for foreground object extraction using de-ghosting, the system comprising:
the sampling module is used for acquiring video stream information, selecting the pixel values of a plurality of position points of an initial frame image in the video stream information as a sample library, selecting with equal probability a plurality of pixel values in the neighborhood of each position of the initial frame image, assigning them to the sample library, and generating a background model;
the classification module selects any frame in the video stream information and classifies the foreground and the background of any pixel point of any frame;
the updating module selects any pixel point of any frame classified as the background, determines the pixel value of any pixel point, randomly replaces the sample of any position point in the background model by the pixel value of any pixel point according to the preset probability, and updates the pixel value of any pixel point to the sample library of the background model according to the preset probability;
the identification module is used for acquiring pixel significant values of pixel values of any pixel points of any frame classified as the foreground and significant values of pixel values of a plurality of any position points of the updated background model sample library;
visual saliency maps of the current frame and the background model are respectively obtained by using a histogram-contrast-based image pixel saliency value detection method, the saliency difference value of each pixel point is obtained through an absolute-difference calculation, foreground detection is performed on the current frame by using a ViBe algorithm with adaptive threshold updating, a saliency degree function is established for all pixel points, and the saliency degree function of the points classified as foreground is updated by combining their saliency difference values:
the saliency degree function indicates whether the pixel point at the given position of the i-th frame belongs to the background, wherein the saliency difference value of that pixel point is used in the update;
in the formula, a threshold adjustment parameter and the mean of the saliency difference values of the foreground pixel points appear, and the size of the threshold adjustment parameter is determined by the contrast between the ghost area and the background;
ghost judgment is carried out on the foreground pixel points at fixed frame intervals; if a foreground pixel point exists whose saliency degree is greater than or equal to the current saliency degree threshold, the pixel point is regarded as a ghost pixel point, and the ghost pixel points are removed to extract the foreground target.
6. The system of claim 5, wherein the identification module is further configured to:
and reclassifying the ghost pixel points, dividing the ghost pixel points into the background, and replacing the pixel values of the ghost pixel points with the pixel values of any corresponding position points in the background model.
7. The system of claim 5, wherein the preset probability is 1/T, T being an update time sampling factor;
the updating time sampling factor is obtained according to the background dynamics model and the flicker degree value;
and the update time sampling factor is updated after the pixel value of the ghost pixel point has been replaced with the pixel value of the corresponding position point in the background model.
8. The system according to claim 5, wherein the foreground/background classification of a pixel point of the frame is specifically:
defining a two-dimensional color space with the pixel value of the pixel point as the center and the sampling distance threshold as the radius, and if the number of samples of the pixel value of the corresponding position point in the background model sample library falling within the two-dimensional color space is not less than a preset matching threshold, classifying the pixel point as background, and otherwise as foreground.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911181481.XA CN111062974B (en) | 2019-11-27 | 2019-11-27 | Method and system for extracting foreground target by removing ghost |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111062974A CN111062974A (en) | 2020-04-24 |
CN111062974B true CN111062974B (en) | 2022-02-01 |
Family
ID=70298683
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911181481.XA Active CN111062974B (en) | 2019-11-27 | 2019-11-27 | Method and system for extracting foreground target by removing ghost |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111062974B (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112633369B (en) * | 2020-12-21 | 2023-04-07 | 浙江大华技术股份有限公司 | Image matching method and device, electronic equipment and computer-readable storage medium |
CN112634319A (en) * | 2020-12-28 | 2021-04-09 | 平安科技(深圳)有限公司 | Video background and foreground separation method and system, electronic device and storage medium |
CN112819854B (en) * | 2021-02-02 | 2023-06-13 | 歌尔光学科技有限公司 | Ghost detection method, ghost detection device, and readable storage medium |
CN113808154A (en) * | 2021-08-02 | 2021-12-17 | 惠州Tcl移动通信有限公司 | Video image processing method and device, terminal equipment and storage medium |
CN113780110A (en) * | 2021-08-25 | 2021-12-10 | 中国电子科技集团公司第三研究所 | Method and device for detecting weak and small targets in image sequence in real time |
CN114359268A (en) * | 2022-03-01 | 2022-04-15 | 杭州晨鹰军泰科技有限公司 | Foreground detection method and system |
CN114578316B (en) * | 2022-04-29 | 2022-07-29 | 北京一径科技有限公司 | Method, device and equipment for determining ghost points in point cloud and storage medium |
CN115169387B (en) * | 2022-06-20 | 2023-07-18 | 北京融合未来技术有限公司 | Method and device for detecting prospect of pulse signal, electronic equipment and storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106683062A (en) * | 2017-01-10 | 2017-05-17 | 厦门大学 | Method of checking the moving target on the basis of ViBe under a stationary camera |
CN107169997A (en) * | 2017-05-31 | 2017-09-15 | 上海大学 | Background subtraction algorithm under towards night-environment |
CN108446630A (en) * | 2018-03-20 | 2018-08-24 | 平安科技(深圳)有限公司 | Airfield runway intelligent control method, application server and computer storage media |
CN110060278A (en) * | 2019-04-22 | 2019-07-26 | 新疆大学 | The detection method and device of moving target based on background subtraction |
CN110503664A (en) * | 2019-08-07 | 2019-11-26 | 江苏大学 | One kind being based on improved local auto-adaptive sensitivity background modeling method |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10373320B2 (en) * | 2017-03-17 | 2019-08-06 | Uurmi Systems PVT, LTD | Method for detecting moving objects in a video having non-stationary background |
Non-Patent Citations (4)
Title |
---|
An Improved ViBe Algorithm Based on Visual Saliency; Peng Li et al.; 2017 International Conference on Computer Technology, Electronics and Communication (ICCTEC); 20171231; pp. 603-607 *
SuBSENSE: A Universal Change Detection Method With Local Adaptive Sensitivity; Pierre-Luc St-Charles et al.; IEEE Transactions on Image Processing; 20150131; Vol. 24, No. 1; pp. 359-373 *
Parameter-adaptive shadow-removing foreground target detection algorithm; Jin Miao et al.; Journal of Huazhong University of Science and Technology (Natural Science Edition); 20210131; Vol. 49, No. 1; pp. 73-79 *
Moving target detection algorithm based on improved visual background extraction; Mo Shaowen et al.; Acta Optica Sinica; 20160630; Vol. 36, No. 6; pp. 1-10 *
Also Published As
Publication number | Publication date |
---|---|
CN111062974A (en) | 2020-04-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111062974B (en) | Method and system for extracting foreground target by removing ghost | |
JP6509275B2 (en) | Method and apparatus for updating a background model used for image background subtraction | |
CN108875676B (en) | Living body detection method, device and system | |
Parks et al. | Evaluation of background subtraction algorithms with post-processing | |
CN109272509B (en) | Target detection method, device and equipment for continuous images and storage medium | |
Haines et al. | Background subtraction with dirichlet processes | |
CN110599523A (en) | ViBe ghost suppression method fused with interframe difference method | |
CN108805897B (en) | Improved moving target detection VIBE method | |
KR101891225B1 (en) | Method and apparatus for updating a background model | |
JP2006209755A (en) | Method for tracing moving object inside frame sequence acquired from scene | |
KR102107334B1 (en) | Method, device and system for determining whether pixel positions in an image frame belong to a background or a foreground | |
CN105741319B (en) | Improvement visual background extracting method based on blindly more new strategy and foreground model | |
WO2019197021A1 (en) | Device and method for instance-level segmentation of an image | |
Liang et al. | Deep background subtraction with guided learning | |
CN112927262A (en) | Camera lens shielding detection method and system based on video | |
KR101690050B1 (en) | Intelligent video security system | |
CN113379789B (en) | Moving target tracking method in complex environment | |
CN111931754B (en) | Method and system for identifying target object in sample and readable storage medium | |
CN113052019A (en) | Target tracking method and device, intelligent equipment and computer storage medium | |
Geng et al. | Real time foreground-background segmentation using two-layer codebook model | |
Wang et al. | AMBER: Adapting multi-resolution background extractor | |
CN111667419A (en) | Moving target ghost eliminating method and system based on Vibe algorithm | |
Huynh-The et al. | Locally statistical dual-mode background subtraction approach | |
Tang et al. | Fast background subtraction using improved GMM and graph cut | |
CN113255549B (en) | Intelligent recognition method and system for behavior state of wolf-swarm hunting |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||