Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a method and a device for detecting a moving target in a video image. In addition, the invention aims to provide a stable video image target detection method.
Another object of the present invention is to implement a two-layer Gaussian modeling algorithm.
A further object of the invention is to implement an image post-processing algorithm.
In order to achieve the above objects, the present invention provides a method for detecting a moving target in a video image, including: S1, performing background modeling on the images input from a video through a two-layer Gaussian mixture model to obtain the video image background, wherein the input of the second-layer Gaussian mixture model is the modeling result of the first-layer Gaussian mixture model; S2, performing frame-by-frame difference processing between the video image background and the input video image to obtain the video image foreground; and S3, sequentially performing, on the video image foreground, binarization based on an Otsu threshold, elimination of connected regions based on morphological erosion and dilation operations, and elimination of small regions based on pixel count, to form the foreground target of the input video image.
In addition, the present invention also provides a video image target detection apparatus, including: a background modeling unit, configured to perform background modeling on the images input from a video through a two-layer Gaussian mixture model to obtain the video image background, wherein the input of the second-layer Gaussian mixture model is the modeling result of the first-layer Gaussian mixture model; a background elimination unit, configured to perform frame-by-frame difference processing between the video image background output by the background modeling unit and the input video image to obtain the video image foreground; and a post-processing unit, configured to sequentially perform, on the video image foreground output by the background elimination unit, binarization based on an Otsu threshold, elimination of connected regions based on morphological erosion and dilation operations, and elimination of small regions based on pixel count, to form the foreground target of the input video image.
In addition, the present invention also provides a video image target detection device, including: one or more processors; and a non-transitory computer-readable storage medium storing one or more instructions, wherein the processor, when executing the instructions, is configured to: perform background modeling on the images input from a video through a two-layer Gaussian mixture model to obtain the video image background, wherein the input of the second-layer Gaussian mixture model is the modeling result of the first-layer Gaussian mixture model; perform frame-by-frame difference processing between the video image background and the input video image to obtain the video image foreground; and sequentially perform, on the video image foreground, binarization based on an Otsu threshold, elimination of connected regions based on morphological erosion and dilation operations, and elimination of small regions based on pixel count, to form the foreground target of the input video image.
The beneficial effects of the invention are as follows. First, the two-layer Gaussian mixture modeling approach provided by the invention achieves a relatively balanced trade-off between calculation speed and background modeling accuracy, and its results are stable. Second, the algorithm avoids interference from small targets through several image post-processing operations, including binarization and morphological operations. Third, the algorithm can effectively model the background of continuous video images and thereby effectively detect moving targets.
Detailed Description
In image and video processing, a video is a series of images with time-series characteristics, that is, an image sequence. In ordinary surveillance video the camera is relatively stable, so the image sequence contains a relatively stable scene, the background, which does not change. Objects that change against the background, i.e. moving objects, need to be detected because they carry valuable information, the so-called foreground information. The background and the foreground are both images, but the background is generally unchanged, whereas the foreground, being the moving target, changes in real time.
Fig. 1 is a schematic diagram of detecting a moving target in a video image based on a two-layer Gaussian mixture model according to the present invention: an input video image is processed by a two-layer Gaussian mixture model 101 to obtain the video background, background elimination 102 is then performed, and the result of the background elimination is subjected to image post-processing 103 to obtain a foreground image of the moving target.
Fig. 2 is a flowchart of the method for detecting a moving target in a video image based on a two-layer Gaussian mixture model, which mainly includes three steps: background modeling with the two-layer Gaussian mixture model S1, background elimination by subtraction S2, and image post-processing S3.
1) Background modeling S1: images are input frame by frame from the input surveillance video, and the background is modeled through a two-layer Gaussian mixture model, wherein the input of the second-layer Gaussian mixture model is the result of the first-layer modeling. The background of the video image is finally formed through the two-layer modeling.
The modeling of a single Gaussian mixture model comprises the following steps:
i. Initialize the background model $\mu$. Let the initial background mean be $\mu_0$, the initial standard deviation be $\sigma_0$, the initial difference threshold be $T$ (set to 20), and $I_{x,y}$ be the pixel value at pixel $(x, y)$:
$$\mu(x,y) = I_{x,y}$$
$$\sigma(x,y) = T$$
where $T$ is expressed in gray levels of the image, is dimensionless, and can be set manually according to the environment.
ii. Check whether pixel $I_{x,y}$ belongs to the foreground or the background by judging whether it lies within a certain range of the mean $\mu(x,y)$, where $\lambda$ is a threshold parameter:
if $|I_{x,y} - \mu(x,y)| < \lambda\,\sigma(x,y)$, then $I_{x,y}$ is background;
otherwise, $I_{x,y}$ is foreground.
iii. Update the background by learning, using the following formula, where $\alpha$ is the learning rate, which can generally be set to 1e-4:
$$\mu(x,y) = (1-\alpha)\,\mu(x,y) + \alpha\,I_{x,y}$$
iv. Repeat steps ii and iii until the stopping condition is met and the algorithm stops; the tolerance in that condition is a small constant, which may be taken as 1e-5.
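The per-pixel steps above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function names are hypothetical, the default `lam=2.5` is an assumed value for the λ threshold, only pixels classified as background are blended into the mean (one reading of the update step), the stopping test is left out so the loop simply runs over the supplied frames, and feeding the first layer's mean image to the second layer is an assumed interpretation of "the input of the second model is the result of the first". Plain nested lists are used to keep the sketch dependency-free.

```python
def init_model(frame, T=20.0):
    """Step i: the mean starts at the first frame, the standard deviation at T."""
    mu = [[float(v) for v in row] for row in frame]
    sigma = [[float(T) for _ in row] for row in frame]
    return mu, sigma


def update_model(mu, sigma, frame, lam=2.5, alpha=1e-4):
    """Steps ii and iii for one frame; returns a 0/1 foreground mask.

    lam=2.5 is an assumed default; alpha=1e-4 follows the text.
    """
    mask = []
    for y, row in enumerate(frame):
        mask_row = []
        for x, value in enumerate(row):
            if abs(value - mu[y][x]) < lam * sigma[y][x]:
                # Background pixel: blend it into the running mean (assumed
                # to apply to background pixels only).
                mu[y][x] = (1 - alpha) * mu[y][x] + alpha * value
                mask_row.append(0)
            else:
                mask_row.append(1)
        mask.append(mask_row)
    return mask


def two_layer_background(frames, T=20.0, lam=2.5, alpha=1e-4):
    """Chain two layers; layer 2 is fed layer 1's mean image (assumption)."""
    mu1, sigma1 = init_model(frames[0], T)
    mu2, sigma2 = init_model(mu1, T)
    for frame in frames[1:]:
        update_model(mu1, sigma1, frame, lam, alpha)
        update_model(mu2, sigma2, mu1, lam, alpha)
    return mu2
```

With these defaults a pixel that jumps from 100 to 200 is flagged as foreground, because the deviation of 100 exceeds λσ = 2.5 × 20 = 50, and the background mean at that pixel is left unchanged.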
2) Background elimination S2: the video background established in the first step is subtracted from the video image frame by frame to eliminate the background; the result of the difference processing is the foreground of the video image, that is, the moving target of the video image sequence.
The difference processing formula is as follows:
$$D_{x,y} = I_{x,y} - \mu(x,y)$$
where $I_{x,y}$ is the original image and $\mu(x,y)$ is the computed background.
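As a small illustration of the difference step, the sketch below computes the signed per-pixel difference exactly as in the formula. The second function, which converts the signed difference into integer gray levels in 0 to 255 by taking the magnitude, is a hypothetical convenience for feeding a histogram-based threshold and is an assumption not stated in the text. Both function names are illustrative.

```python
def subtract_background(frame, background):
    """Return D[y][x] = I[y][x] - mu[y][x] for every pixel."""
    return [
        [pixel - bg for pixel, bg in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


def difference_magnitude(diff):
    """Assumed helper: map signed differences to integer gray levels 0..255."""
    return [[min(255, int(round(abs(d)))) for d in row] for row in diff]
```

For example, a frame `[[120, 100], [90, 100]]` against a flat background of 100 gives the signed difference `[[20, 0], [-10, 0]]`.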
3) Image post-processing S3: image post-processing operations are performed on the image obtained from the difference processing to finally form the foreground target of the image. The post-processing operations are carried out in sequence and specifically comprise: binarization based on an Otsu threshold, elimination of connected regions based on morphological erosion and dilation operations, and elimination of small regions based on pixel count.
The binarization processing based on the Otsu threshold is as follows:
The Otsu threshold assumes that the image histogram has a bimodal distribution. The basic idea is to set a threshold that separates the foreground and background of the image G such that the inter-class variance between the foreground and background pixels is maximized; mathematically, the Otsu threshold t is the value that satisfies this optimality condition. Here, $\omega_0 = N_0/N$ and $\omega_1 = N_1/N$, where $N_0$, $N_1$ and $N$ represent the foreground, background and total pixel counts, respectively. $p_i$ represents the frequency of gray level $i$. $\mu_0$, $\mu_1$ and the global mean represent the mean gray levels of the foreground pixels, the background pixels and the full image, respectively. For an RGB image, t takes a value in the range 0 to 255. Once t is obtained, the segmented image $R_{seg}$ is obtained by thresholding: in $R_{seg}$, pixels representing foreground are all labeled 1, while background pixels are labeled 0.
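For reference, the between-class variance that Otsu's method conventionally maximizes can be written as $\omega_0(\mu_0-\mu)^2 + \omega_1(\mu_1-\mu)^2$, with $\mu$ the full-image mean; this is the textbook form and is assumed here rather than quoted from the text above. The sketch below searches all gray levels for the threshold that maximizes it. The function names are hypothetical, the input is assumed to hold integer gray levels in 0 to 255, and treating pixels brighter than t as foreground is likewise an assumption.

```python
def otsu_threshold(image, levels=256):
    """Return the gray level t that maximizes the between-class variance."""
    hist = [0] * levels
    for row in image:
        for v in row:
            hist[v] += 1  # v must be an integer in 0..levels-1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    mean_all = sum_all / total

    best_t, best_var = 0, -1.0
    count_low, sum_low = 0, 0
    for t in range(levels):
        # Class "low" holds pixels <= t, class "high" holds pixels > t.
        count_low += hist[t]
        sum_low += t * hist[t]
        count_high = total - count_low
        if count_low == 0 or count_high == 0:
            continue  # one class is empty, the variance is undefined
        w_low = count_low / total
        w_high = count_high / total
        mean_low = sum_low / count_low
        mean_high = (sum_all - sum_low) / count_high
        var_between = (w_low * (mean_low - mean_all) ** 2
                       + w_high * (mean_high - mean_all) ** 2)
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


def binarize(image, t):
    """Label pixels above t as 1 (assumed foreground) and the rest as 0."""
    return [[1 if v > t else 0 for v in row] for row in image]
```

On an image with one dark cluster and one bright cluster, the returned threshold falls between the two clusters, so `binarize` separates them cleanly.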
The elimination of connected regions based on morphological erosion and dilation operations is as follows:
Let A and B be two images, where A is the object to be processed, i.e. the data after binarization based on the Otsu threshold, and B is used to process A; B is called the structuring element, also known as a brush. Structuring elements are usually relatively small images. The data after Otsu-threshold binarization is first subjected to an erosion operation and then to a dilation operation, i.e. a morphological opening operation.
The erosion (Erosion) operation is:
The result of eroding X with S is the set of all points x such that S, after being translated by x, is still contained in X. In other words, the set obtained by eroding X with S is the set of origin positions of S for which S is completely contained in X.
The dilation (Dilation) operation is:
Dilation can be viewed as the dual operation of erosion and is defined as follows: translate the structuring element B by a to obtain $B_a$, and record the point a if $B_a$ hits X. The set of all points a satisfying this condition is the result of dilating X by B.
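The two set definitions above translate fairly directly into code. The following is a minimal sketch with hypothetical function names, assuming a binary image stored as nested lists of 0/1, a structuring element whose origin is its centre cell, and pixels outside the image counted as background. For a symmetric structuring element such as a 3×3 square, the "translate and hit" definition used here coincides with the usual reflected form of dilation.

```python
def _offsets(se):
    """Offsets of the active cells of se relative to its centre (the origin)."""
    cy, cx = len(se) // 2, len(se[0]) // 2
    return [(j - cy, i - cx)
            for j, row in enumerate(se)
            for i, v in enumerate(row) if v]


def _is_fg(image, y, x):
    """True if (y, x) is inside the image and set; outside counts as background."""
    return 0 <= y < len(image) and 0 <= x < len(image[0]) and bool(image[y][x])


def erode(image, se):
    """Keep a point only if se, translated there, lies entirely in the foreground."""
    offs = _offsets(se)
    return [[1 if all(_is_fg(image, y + dy, x + dx) for dy, dx in offs) else 0
             for x in range(len(image[0]))]
            for y in range(len(image))]


def dilate(image, se):
    """Mark a point if se, translated there, hits at least one foreground pixel."""
    offs = _offsets(se)
    return [[1 if any(_is_fg(image, y + dy, x + dx) for dy, dx in offs) else 0
             for x in range(len(image[0]))]
            for y in range(len(image))]


def opening(image, se):
    """Morphological opening: erosion followed by dilation."""
    return dilate(erode(image, se), se)
```

Applied with a 3×3 square structuring element, opening removes an isolated pixel while restoring a 3×3 block to its original extent, which is the noise-suppressing behavior the opening operation is used for here.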
Small-region elimination based on pixel count is then performed on the data from which connected regions have been eliminated by the morphological erosion and dilation operations, as follows:
Let the connected regions in the obtained image G be $\{A_1, A_2, \ldots, A_N\}$ and the numbers of pixels in the corresponding connected regions be $\{n_1, n_2, \ldots, n_N\}$. If $n_i$ is less than a manually set threshold, for example 30, the region is discarded and determined to be non-target; if $n_i$ is greater than the threshold, the region is retained as a target.
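A possible way to carry out this pixel-count filter is to label connected regions with a breadth-first search and keep only the large ones. The sketch below is illustrative: the function name is hypothetical, 4-connectivity is an assumption (the text does not state which connectivity is used), and a region whose size exactly equals the threshold is kept, which is also an assumption.

```python
from collections import deque


def remove_small_regions(mask, min_pixels=30):
    """Drop connected regions of a 0/1 mask that have fewer than min_pixels pixels."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            # Collect one connected region with a breadth-first search.
            region = []
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                region.append((cy, cx))
                # 4-connected neighbours (assumed connectivity).
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Regions at or above the threshold are kept as targets.
            if len(region) >= min_pixels:
                for ry, rx in region:
                    out[ry][rx] = 1
    return out
```

With the example threshold of 30, a 6×6 block (36 pixels) survives while a two-pixel speck elsewhere in the mask is removed.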
Fig. 3 shows a video image target detection apparatus 300 according to the present invention, which includes: a background modeling unit 301, configured to perform background modeling on the images input from a video through a two-layer Gaussian mixture model to obtain the video image background, wherein the input of the second-layer Gaussian mixture model is the modeling result of the first-layer Gaussian mixture model; a background elimination unit 302, configured to perform frame-by-frame difference processing between the video image background output by the background modeling unit and the input video image to obtain the video image foreground; and a post-processing unit 303, configured to sequentially perform, on the video image foreground output by the background elimination unit, binarization based on an Otsu threshold, elimination of connected regions based on morphological erosion and dilation operations, and elimination of small regions based on pixel count, to form the foreground target of the input video image.
The modeling method of each layer of the two-layer Gaussian mixture model in the background modeling unit is the same as the modeling method described above for the video image target detection method of FIG. 1.
As shown in Fig. 4, the post-processing unit 303 includes a binarization module 304 based on an Otsu threshold, a module 305 for eliminating connected regions based on morphological erosion and dilation operations, and a small-region elimination module 306 based on pixel count; the image foreground is processed by the three modules in sequence to obtain the foreground target of the input video image.
Fig. 5 shows another video image target detection device 400 according to the present invention, which includes: one or more processors 401; and
a non-transitory computer-readable storage medium 403 storing one or more instructions 402, wherein the computer-readable storage medium 403 may also store the data to be processed, which may alternatively be stored in other storage media, and wherein the processor, when executing the one or more instructions, is configured to: perform background modeling on the images input from a video through a two-layer Gaussian mixture model to obtain the video image background, wherein the input of the second-layer Gaussian mixture model is the modeling result of the first-layer Gaussian mixture model; perform frame-by-frame difference processing between the video image background and the input video image to obtain the video image foreground; and sequentially perform, on the video image foreground, binarization based on an Otsu threshold, elimination of connected regions based on morphological erosion and dilation operations, and elimination of small regions based on pixel count, to form the foreground target of the input video image.
Fig. 6 shows results obtained by the method and apparatus of the present invention, using an indoor scene as an example; urban video is processed in the same way as indoor video, the only difference being the location.
In summary, the two-layer Gaussian mixture modeling approach provided by the invention can effectively avoid the influence of interference points when detecting moving targets in video images, and achieves a relatively balanced trade-off between calculation speed and background modeling accuracy. The image post-processing operations, such as binarization and morphological operations, avoid interference from small targets. Overall, the method for detecting moving targets in video images enables automatic analysis and assessment of urban surveillance video, and can provide effective auxiliary support in handling urban problems such as illegal parking and road occupation.