Disclosure of Invention
The invention aims to overcome the shortcomings of the prior art by providing a multi-target detection method for video images with complex backgrounds, based on a multi-variation level set. The invention reduces detection complexity, better avoids falsely detecting the background as a moving target under a complex background, and improves detection accuracy.
The method comprises the following specific steps:
(1) Inputting a video in a video format;
(2) Selecting, from the input video, three consecutive video frames that contain an obvious moving object, and defining them as the first frame video image I_1, the second frame video image I_2 and the third frame video image I_3;
(3) Obtaining a binary image:
applying the inter-frame difference method to the three selected consecutive frames to obtain a binary image B;
(4) removing interference noise:
removing interference noise other than the target from the binary image B with a morphological function to obtain a denoised binary image I;
(5) obtaining a zero level set of the moving object:
(5a) clustering the denoised binary image I with the k-means clustering algorithm to obtain, for each cluster category, the sum of the distances from all data points in the category to its cluster center, the number of data points in the category, and the cluster center;
(5b) taking as radius the ratio of the sum of the distances from all data points in the cluster category to its cluster center to the number of data points in the category, and taking the cluster center point as the center, to obtain a circle, which is the zero level set φ_k of the moving object;
(6) Updating a moving target zero level set by using a multi-variation level set method:
(6a) the internal energy of the zero level set of the moving object is calculated according to the following formula:
wherein W(φ) represents the internal energy of the zero level sets of the moving objects; φ represents the set of the zero level sets φ_k; μ represents the constraint coefficient of the internal energy, with 0 < μ < 0.25; k represents the serial number of a cluster category obtained by clustering the data points in the denoised binary image I, k = 1, 2, …, K; K represents the total number of moving objects; Σ represents the summation operation; ∫ represents the integration operation; Ω represents the image plane of the second frame video image I_2; |·| represents the absolute value operation; ∇ represents the gradient operator; φ_k represents the zero level set of the k-th moving object; and x and y represent the horizontal-axis and vertical-axis coordinates of the zero level set φ_k;
(6b) obtaining a second frame video image I according to the following formula2Edge of middle moving object:
wherein g represents the edge of the moving object in the second frame video image I_2; |·| represents the absolute value operation; ∇ represents the gradient operator; G represents a Gaussian kernel; * represents the convolution operation; and I_2 represents the second of the three consecutive video frames;
(6c) the length of the zero level set of the moving object is calculated according to the following formula:
wherein L_g(φ) represents the length of the zero level sets of the moving objects; g represents the edge of the moving object in the second frame video image I_2; φ represents the set of the zero level sets φ_k; φ_k represents the zero level set of the k-th moving object; k represents the serial number of a cluster category obtained by clustering the data points in the denoised binary image I, k = 1, 2, …, K; K represents the total number of moving objects; Σ represents the summation operation; ∫ represents the integration operation; Ω represents the image plane of the second frame video image I_2; δ represents the univariate Dirac function; |·| represents the absolute value operation; ∇ represents the gradient operator; and x and y represent the horizontal-axis and vertical-axis coordinates of the zero level set φ_k;
(6d) the area of the moving target region is calculated as follows:
wherein A_g(φ) represents the area of the moving target regions; g represents the edge of the moving object in the second frame video image I_2; φ represents the set of the zero level sets φ_k; φ_k represents the zero level set of the k-th moving object; k represents the serial number of a cluster category obtained by clustering the data points in the denoised binary image I, k = 1, 2, …, K; K represents the total number of moving objects; Σ represents the summation operation; ∫ represents the integration operation; Ω represents the image plane of the second frame video image I_2; H represents the Heaviside function; and x and y represent the horizontal-axis and vertical-axis coordinates of the zero level set φ_k;
(6e) the external energy of the zero level set of the moving object is calculated according to the following formula:
wherein the left-hand side represents the external energy of the zero level sets of the moving objects; g represents the edge of the moving object in the second frame video image I_2; λ_k represents the constraint coefficient of the length of the zero level set of the moving object, an integer with 0 < λ_k < 10; k represents the serial number of a cluster category obtained by clustering the data points in the denoised binary image I, k = 1, 2, …, K; K represents the total number of moving objects; v_k represents the constraint coefficient of the area of the moving target region, with −5 < v_k < 5 and v_k ≠ 0; φ represents the set of the zero level sets φ_k; Σ represents the summation operation; L_g(φ) represents the length of the zero level sets of the moving objects; and A_g(φ) represents the area of the moving target regions;
(6f) the total energy of the zero level set of the moving object is calculated according to the following formula:
wherein the left-hand side represents the total energy of the zero level sets of the moving objects; φ represents the set of the zero level sets φ_k; W(φ) represents the internal energy of the zero level sets of the moving objects; the second term on the right-hand side represents the external energy of the zero level sets of the moving objects; g represents the edge of the moving object in the second frame video image I_2; λ_k represents the constraint coefficient of the length of the zero level set of the moving object, an integer with 0 < λ_k < 10; k represents the serial number of a cluster category obtained by clustering the data points in the denoised binary image I; and v_k represents the constraint coefficient of the area of the moving target region, with −5 < v_k < 5 and v_k ≠ 0;
(6g) Carrying out level set evolution on the moving target zero level set according to the following formula to obtain an updated moving target zero level set:
wherein the left-hand side represents the updated zero level set of the moving object; ∂ represents the derivative operation; φ_k represents the zero level set of the k-th moving object; φ represents the set of the zero level sets φ_k; k represents the serial number of a cluster category obtained by clustering the data points in the denoised binary image I, k = 1, 2, …, K; K represents the total number of moving objects; t represents the iteration step; μ represents the constraint coefficient of the internal energy, with 0 < μ < 0.25; Δ represents the Laplacian operator; div(·) represents the divergence of a vector; ∇ represents the gradient operator; |·| represents the absolute value operation; δ represents the univariate Dirac function; Σ represents the summation operation; λ_k represents the constraint coefficient of the length of the zero level set of the moving object, an integer with 0 < λ_k < 10; g represents the edge of the moving object in the second frame video image I_2; and v_k represents the constraint coefficient of the area of the target region, with −5 < v_k < 5 and v_k ≠ 0;
(7) Judging whether the updated zero level set of the moving object coincides with the edge of the moving object in the second frame video image I_2; if so, executing step (8); otherwise, executing step (6);
(8) Outputting the moving-target detection result for the second frame video image I_2.
Compared with the prior art, the invention has the following advantages:
First, the invention uses the k-means clustering algorithm to obtain the zero level sets of the moving targets and their positions, and sets the number of zero level sets according to the number of moving targets. In the prior art, the zero level set is obtained from a fixed curve, its position is chosen at random, and the number of zero level sets is not determined by the number of moving targets; as a result, in multi-target detection in a complex scene, non-moving targets and background are falsely detected as targets and the detection capability is insufficient. The invention avoids these problems and thereby improves the accuracy of moving-target detection in video images.
Second, the invention updates the zero level sets of the moving targets with the multi-variation level set method. This avoids the shortcomings of the prior art, which uses only global information of the image region and cannot detect local regions, and in which, when several moving targets are detected, each zero level set is disturbed by the zero level sets of the other targets. Because the positions and the number of the zero level sets are considered together, the accuracy of multi-target detection against a complex background in video images is improved.
Detailed Description
The present invention is described in further detail below with reference to the attached drawing figures.
The steps of the present invention will be described in further detail with reference to fig. 1.
Step 1: Input a video in a video format.
Step 2: Select, from the input video, three consecutive video frames that contain an obvious moving object, and define them as the first frame video image I_1, the second frame video image I_2 and the third frame video image I_3.
Step 3: Obtain a binary image.
Apply the inter-frame difference method to the three selected consecutive frames to obtain a binary image B.
The specific steps of the interframe difference method are as follows:
Subtract the second frame video image I_2 from the first frame video image I_1 of the three selected consecutive frames to obtain a difference image B_1.
Subtract the third frame video image I_3 from the second frame video image I_2 of the three selected consecutive frames to obtain a difference image B_2.
Apply a pixel-wise logical AND to the two difference images B_1 and B_2 to obtain the binary image B.
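The three-frame difference can be sketched as follows. This is a minimal illustration rather than the implementation used in the invention: frames are assumed to be equally sized lists of rows of integer gray levels, and the binarization threshold (`threshold`, a hypothetical parameter) is an assumption, since the text does not state how the difference images are binarized.

```python
def three_frame_difference(i1, i2, i3, threshold=25):
    """Return the binary image B (1 = moving, 0 = background).

    i1, i2, i3: three consecutive grayscale frames as lists of rows.
    threshold: assumed binarization level; not specified in the text.
    """
    def diff_mask(a, b):
        # Threshold the absolute gray-level difference of two frames.
        return [[1 if abs(pa - pb) > threshold else 0
                 for pa, pb in zip(row_a, row_b)]
                for row_a, row_b in zip(a, b)]

    b1 = diff_mask(i1, i2)  # difference image B_1
    b2 = diff_mask(i2, i3)  # difference image B_2
    # Pixel-wise AND keeps only pixels that changed in both differences.
    return [[p & q for p, q in zip(r1, r2)] for r1, r2 in zip(b1, b2)]
```

Only pixels that differ in both frame pairs survive the AND, which is why the result localizes the object in the second frame.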
In the present invention, the binary image obtained by applying the above inter-frame difference method to the three selected consecutive frames is shown in fig. 2(a). In fig. 2(a), the moving objects are the white regions and the background is the black region. As can be seen from fig. 2(a), noise in the video images causes some background pixels to be falsely detected as target, so the black background region contains a small number of white dots.
Step 4: Remove interference noise.
Remove interference noise other than the target from the binary image B with a morphological function to obtain a denoised binary image I.
In the invention, the binary image is processed with the morphological function bwareaopen of the MATLAB software, and the resulting denoised binary image is shown in fig. 2(b). In fig. 2(b), the moving objects are the white regions and the background is the black region. As can be seen from fig. 2(b), the white dots in the background region of the binary image have been removed, so the influence of background noise on target detection is reduced and the detection accuracy is improved.
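The effect of this area-opening step can be illustrated with a small stand-in written in plain Python. It is a sketch of the same idea (discard connected white regions smaller than a given pixel count), not the MATLAB routine itself; the function name `remove_small_regions` and the `min_area` value are hypothetical, and the text does not state which area threshold was used.

```python
from collections import deque

def remove_small_regions(binary, min_area, connectivity=8):
    """Keep only connected white regions with at least min_area pixels."""
    rows, cols = len(binary), len(binary[0])
    if connectivity == 8:
        steps = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                 if (dr, dc) != (0, 0)]
    else:
        steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    out = [[0] * cols for _ in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or seen[r][c]:
                continue
            # Breadth-first search collects one connected component.
            component, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                cr, cc = queue.popleft()
                component.append((cr, cc))
                for dr, dc in steps:
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and binary[nr][nc] and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            # Components below the area threshold are treated as noise.
            if len(component) >= min_area:
                for cr, cc in component:
                    out[cr][cc] = 1
    return out
```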
And 5: a zero level set of moving objects is obtained.
And clustering the binary image I after the noise is removed by adopting a K-means clustering algorithm to respectively obtain the sum of the distances from all data points in the clustering class to the clustering center, the number of all data points in the clustering class and the clustering center.
The k-means clustering algorithm comprises the following specific steps:
and randomly selecting K central points from the binary image I after the noise is removed, wherein K represents the total number of the moving targets.
Traversing all data points in the binary image I after the noise is removed according to the following formula, and dividing each data point into the nearest central point:
j (C) represents the distance from each data point in the binary image I after the noise is removed to the center point of the division, C represents a cluster category set obtained after the data points in the binary image I after the noise is removed are divided, min (·) represents minimum value taking operation, ∑ represents summation operation, m (·) represents summation operation, m represents the sum of the data points in the binary image I after the noise is removed, and the sum of the data points in the binary image I is equal to mkiRepresenting coefficients, I representing the sequence numbers of data points in the denoised binary image I, k representing the denoised binary image IThe serial number of the cluster type obtained by dividing the data points in the binary image I, K is 1,2, …, K represents the total number of moving targets, dist (·) represents a distance function, and x representsiRepresenting the ith data point, s, in the denoised binary image IkA cluster center point C representing the kth class obtained by dividing the binary image I after removing the noisekThe class k category obtained by dividing the binary image I from which the noise has been removed is shown, and ∈ shows the class belongs to the operation.
Calculating the average value of each cluster category obtained after dividing the binary image I after removing the noise according to the following formula, and taking the average value as a new cluster center point:
wherein s_k represents the cluster center point of the k-th category obtained by dividing the denoised binary image I; k represents the serial number of a category obtained by dividing the data points in the denoised binary image I; n_k represents the number of data points in the k-th cluster category; Σ represents the summation operation; the summand represents the i-th data point in the k-th cluster category; and i represents the serial number of a data point in the denoised binary image I.
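The assignment and update steps above can be sketched as follows, returning the three quantities step 5 needs for each cluster. This is an illustrative sketch under assumptions: data points are taken to be the (x, y) coordinates of the white pixels, Euclidean distance is used for dist(·), and the `initial_centers` argument is a hypothetical convenience for repeatable runs (the text selects the K initial centers at random).

```python
import math
import random

def kmeans(points, k, initial_centers=None, max_iter=100, seed=0):
    """Cluster 2-D points; return (centers, distance_sums, counts)."""
    rng = random.Random(seed)
    centers = list(initial_centers) if initial_centers else rng.sample(points, k)

    def assign(current):
        # Put every point into the cluster of its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, current[j]))
            clusters[nearest].append(p)
        return clusters

    for _ in range(max_iter):
        clusters = assign(centers)
        # New center = mean of the cluster; an empty cluster keeps its center.
        updated = [
            (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c))
            if c else centers[j]
            for j, c in enumerate(clusters)
        ]
        if updated == centers:
            break
        centers = updated

    clusters = assign(centers)
    distance_sums = [sum(math.dist(p, centers[j]) for p in c)
                     for j, c in enumerate(clusters)]
    counts = [len(c) for c in clusters]
    return centers, distance_sums, counts
```

Because the result of k-means depends on the initial centers, a random start can settle in a poor local optimum; running it several times and keeping the best result is a common remedy.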
Taking as radius the ratio of the sum of the distances from all data points in a cluster category to its cluster center to the number of data points in that category, and taking the cluster center point as the center, the resulting circle is the zero level set φ_k of the moving target.
The specific steps for obtaining the zero level set of the moving target are as follows:
The radius of the zero level set of the moving target is calculated according to the following formula:
R_k = D_k / N_k
wherein R_k represents the radius of the zero level set of the moving target; D_k represents the sum of the distances from all data points in the k-th cluster category of the denoised binary image I to the cluster center of that category; N_k represents the number of data points in the k-th cluster category; and k represents the serial number of a cluster category obtained by clustering the data points in the denoised binary image I.
With the radius R_k of the zero level set of the moving target as radius and the cluster center s_k of the k-th category, obtained by clustering the denoised binary image I, as center, draw a circle to obtain the zero level set φ_k of the moving target.
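A minimal sketch of this initialization is given below. The radius follows R_k = D_k / N_k from the text; representing the circle as a signed distance function that is negative inside and positive outside is an assumption about the sign convention, which the text does not state, and the function name `circle_level_set` is hypothetical.

```python
import math

def circle_level_set(shape, center, dist_sum, count):
    """Return (R_k, phi_k) for one cluster.

    shape: (rows, cols) of the image; center: cluster center s_k as (x, y).
    dist_sum: D_k, sum of point-to-center distances; count: N_k.
    phi_k is the signed distance to the circle (assumed negative inside).
    """
    rows, cols = shape
    radius = dist_sum / count  # R_k = D_k / N_k
    cx, cy = center
    phi = [[math.hypot(x - cx, y - cy) - radius for x in range(cols)]
           for y in range(rows)]
    return radius, phi
```

The zero level set of `phi` is exactly the circle of radius R_k around s_k, so the contour starts on the target and close to its size.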
In the invention, the k-means clustering algorithm is applied to the denoised binary image to obtain the zero level sets of the moving targets, and these zero level sets are drawn on the second frame video image to give the schematic diagram shown in fig. 2(c). In fig. 2(c), the three vehicles are the moving targets, and the three white circles on them are the zero level sets of the moving targets. As can be seen from fig. 2(c), each zero level set lies on its moving target and is close to the target in size, so the moving targets are located and the time needed by the subsequent multi-variation level set method to update the zero level sets is shortened.
Step 6: Update the zero level sets of the moving targets with the multi-variation level set method.
The internal energy of the zero level set of the moving object is calculated according to the following formula:
wherein W(φ) represents the internal energy of the zero level sets of the moving objects; φ represents the set of the zero level sets φ_k; μ represents the constraint coefficient of the internal energy, with 0 < μ < 0.25; k represents the serial number of a cluster category obtained by clustering the data points in the denoised binary image I, k = 1, 2, …, K; K represents the total number of moving objects; Σ represents the summation operation; ∫ represents the integration operation; Ω represents the image plane of the second frame video image I_2; |·| represents the absolute value operation; ∇ represents the gradient operator; φ_k represents the zero level set of the k-th moving object; and x and y represent the horizontal-axis and vertical-axis coordinates of the zero level set φ_k.
Obtaining a second frame video image I according to the following formula2Edge of middle moving object:
wherein g represents the edge of the moving object in the second frame video image I_2; |·| represents the absolute value operation; ∇ represents the gradient operator; G represents a Gaussian kernel; * represents the convolution operation; and I_2 represents the second of the three consecutive video frames.
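One common way to build such an edge function from the listed ingredients (Gaussian smoothing, gradient, magnitude) is g = 1 / (1 + |∇(G * I_2)|²). The sketch below uses that form as an assumption, since the formula itself is not reproduced in the text; the 3×3 binomial kernel standing in for the Gaussian kernel G and the replicated image borders are also assumptions.

```python
def edge_indicator(image):
    """Return g, close to 1 in flat regions and close to 0 on strong edges.

    image: grayscale frame as a list of rows.
    Assumed form: g = 1 / (1 + |grad(G * image)|^2).
    """
    rows, cols = len(image), len(image[0])

    def px(img, r, c):
        # Replicate the border by clamping indices into the image.
        return img[min(max(r, 0), rows - 1)][min(max(c, 0), cols - 1)]

    kernel = ((1, 2, 1), (2, 4, 2), (1, 2, 1))  # 3x3 binomial ~ Gaussian
    smooth = [[sum(kernel[dr + 1][dc + 1] * px(image, r + dr, c + dc)
                   for dr in (-1, 0, 1) for dc in (-1, 0, 1)) / 16.0
               for c in range(cols)]
              for r in range(rows)]

    g = []
    for r in range(rows):
        row = []
        for c in range(cols):
            # Central differences of the smoothed image.
            gx = (px(smooth, r, c + 1) - px(smooth, r, c - 1)) / 2.0
            gy = (px(smooth, r + 1, c) - px(smooth, r - 1, c)) / 2.0
            row.append(1.0 / (1.0 + gx * gx + gy * gy))
        g.append(row)
    return g
```

Small values of g mark strong edges, which is what lets the length and area terms slow the contour down at object boundaries.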
The length of the zero level set of the moving object is calculated according to the following formula:
wherein L_g(φ) represents the length of the zero level sets of the moving objects; g represents the edge of the moving object in the second frame video image I_2; φ represents the set of the zero level sets φ_k; φ_k represents the zero level set of the k-th moving object; k represents the serial number of a cluster category obtained by clustering the data points in the denoised binary image I, k = 1, 2, …, K; K represents the total number of moving objects; Σ represents the summation operation; ∫ represents the integration operation; Ω represents the image plane of the second frame video image I_2; δ represents the univariate Dirac function; |·| represents the absolute value operation; ∇ represents the gradient operator; and x and y represent the horizontal-axis and vertical-axis coordinates of the zero level set φ_k.
The area of the moving target region is calculated as follows:
wherein A_g(φ) represents the area of the moving target regions; g represents the edge of the moving object in the second frame video image I_2; φ represents the set of the zero level sets φ_k; φ_k represents the zero level set of the k-th moving object; k represents the serial number of a cluster category obtained by clustering the data points in the denoised binary image I, k = 1, 2, …, K; K represents the total number of moving objects; Σ represents the summation operation; ∫ represents the integration operation; Ω represents the image plane of the second frame video image I_2; H represents the Heaviside function; and x and y represent the horizontal-axis and vertical-axis coordinates of the zero level set φ_k.
The external energy of the zero level set of the moving object is calculated according to the following formula:
wherein the left-hand side represents the external energy of the zero level sets of the moving objects; g represents the edge of the moving object in the second frame video image I_2; λ_k represents the constraint coefficient of the length of the zero level set of the moving object, an integer with 0 < λ_k < 10; k represents the serial number of a cluster category obtained by clustering the data points in the denoised binary image I, k = 1, 2, …, K; K represents the total number of moving objects; v_k represents the constraint coefficient of the area of the moving target region, with −5 < v_k < 5 and v_k ≠ 0; φ represents the set of the zero level sets φ_k; Σ represents the summation operation; L_g(φ) represents the length of the zero level sets of the moving objects; and A_g(φ) represents the area of the moving target regions.
The total energy of the zero level set of the moving object is calculated according to the following formula:
wherein the left-hand side represents the total energy of the zero level sets of the moving objects; φ represents the set of the zero level sets φ_k; W(φ) represents the internal energy of the zero level sets of the moving objects; the second term on the right-hand side represents the external energy of the zero level sets of the moving objects; g represents the edge of the moving object in the second frame video image I_2; λ_k represents the constraint coefficient of the length of the zero level set of the moving object, an integer with 0 < λ_k < 10; k represents the serial number of a cluster category obtained by clustering the data points in the denoised binary image I; and v_k represents the constraint coefficient of the area of the moving target region, with −5 < v_k < 5 and v_k ≠ 0.
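For orientation, the energies described in step 6 can be written in the following form. This is a reconstruction under assumptions, not the formulas as filed: it takes the widely used level-set formulation without re-initialization, which the listed symbols and parameter ranges match, and sums it over the K targets. The symbols E_ext and E, the factor 1/2, the per-target terms L_g(φ_k) and A_g(φ_k), and the argument −φ_k of H are all assumptions.

```latex
\begin{align}
W(\phi) &= \sum_{k=1}^{K} \iint_{\Omega} \frac{\mu}{2}
           \bigl(|\nabla \phi_k| - 1\bigr)^{2} \, dx \, dy \\
g &= \frac{1}{1 + \bigl|\nabla (G * I_2)\bigr|^{2}} \\
L_g(\phi) &= \sum_{k=1}^{K} L_g(\phi_k), \qquad
L_g(\phi_k) = \iint_{\Omega} g \, \delta(\phi_k) \, |\nabla \phi_k| \, dx \, dy \\
A_g(\phi) &= \sum_{k=1}^{K} A_g(\phi_k), \qquad
A_g(\phi_k) = \iint_{\Omega} g \, H(-\phi_k) \, dx \, dy \\
E_{\mathrm{ext}}(\phi) &= \sum_{k=1}^{K}
           \bigl( \lambda_k \, L_g(\phi_k) + v_k \, A_g(\phi_k) \bigr) \\
E(\phi) &= W(\phi) + E_{\mathrm{ext}}(\phi)
\end{align}
```

In this form the internal energy keeps each φ_k close to a signed distance function, while the external energy pulls its zero level set toward places where g is small, that is, toward edges.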
Carrying out level set evolution on the moving target zero level set according to the following formula to obtain an updated moving target zero level set:
wherein the left-hand side represents the updated zero level set of the moving object; ∂ represents the derivative operation; φ_k represents the zero level set of the k-th moving object; φ represents the set of the zero level sets φ_k; k represents the serial number of a cluster category obtained by clustering the data points in the denoised binary image I, k = 1, 2, …, K; K represents the total number of moving objects; t represents the iteration step; μ represents the constraint coefficient of the internal energy, with 0 < μ < 0.25; Δ represents the Laplacian operator; div(·) represents the divergence of a vector; ∇ represents the gradient operator; |·| represents the absolute value operation; δ represents the univariate Dirac function; Σ represents the summation operation; λ_k represents the constraint coefficient of the length of the zero level set of the moving object, an integer with 0 < λ_k < 10; g represents the edge of the moving object in the second frame video image I_2; and v_k represents the constraint coefficient of the area of the target region, with −5 < v_k < 5 and v_k ≠ 0.
Step 7: Judge whether the updated zero level set of the moving object coincides with the edge of the moving object in the second frame video image I_2; if so, execute step 8; otherwise, execute step 6.
Step 8: Output the moving-target detection result for the second frame video image I_2.
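One explicit update of a single level set φ_k can be sketched as below. The sketch is an assumption-laden illustration, not the evolution formula as filed: it uses the standard gradient flow ∂φ_k/∂t = μ[Δφ_k − div(∇φ_k/|∇φ_k|)] + λ_k δ(φ_k) div(g ∇φ_k/|∇φ_k|) + v_k g δ(φ_k), evolves each target independently (the role of the summation over targets cannot be recovered from the text), takes φ_k negative inside the contour, and uses a smoothed Dirac function of width `eps` (a hypothetical parameter). The defaults reuse the values reported in the simulation conditions.

```python
import math

def evolve_step(phi, g, dt=5.0, mu=0.04, lam=5.0, nu=1.5, eps=1.5):
    """Return phi after one explicit time step of the assumed gradient flow."""
    rows, cols = len(phi), len(phi[0])

    def at(f, r, c):
        # Replicate the border by clamping indices into the grid.
        return f[min(max(r, 0), rows - 1)][min(max(c, 0), cols - 1)]

    def ddx(f, r, c):
        return (at(f, r, c + 1) - at(f, r, c - 1)) / 2.0

    def ddy(f, r, c):
        return (at(f, r + 1, c) - at(f, r - 1, c)) / 2.0

    # Unit normal field n = grad(phi) / |grad(phi)|.
    nx = [[0.0] * cols for _ in range(rows)]
    ny = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            px, py = ddx(phi, r, c), ddy(phi, r, c)
            mag = math.hypot(px, py) + 1e-10  # avoid division by zero
            nx[r][c], ny[r][c] = px / mag, py / mag

    new_phi = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            p = phi[r][c]
            curvature = ddx(nx, r, c) + ddy(ny, r, c)  # div(n)
            laplacian = (at(phi, r, c + 1) + at(phi, r, c - 1)
                         + at(phi, r + 1, c) + at(phi, r - 1, c) - 4.0 * p)
            # Smoothed Dirac function, nonzero only near the zero level set.
            dirac = ((1.0 + math.cos(math.pi * p / eps)) / (2.0 * eps)
                     if abs(p) <= eps else 0.0)
            # div(g n) = g * div(n) + grad(g) . n
            edge_term = (g[r][c] * curvature
                         + ddx(g, r, c) * nx[r][c] + ddy(g, r, c) * ny[r][c])
            new_phi[r][c] = p + dt * (mu * (laplacian - curvature)
                                      + lam * dirac * edge_term
                                      + nu * g[r][c] * dirac)
    return new_phi
```

Under these conventions a positive `nu` shrinks the contour and a negative `nu` expands it; repeating the step until the zero level set stops moving corresponds to the check in step 7.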
The effect of the present invention will be further described with reference to simulation experiments.
1. Simulation conditions are as follows:
The simulation of the invention is carried out in a hardware environment with the Windows 7 operating system, an Intel(R) Core(TM) i5-2400 CPU with a base frequency of 3.20 GHz and 4 GB of memory, and in the software environment MATLAB R2011b. In the experiment, the time iteration step is Δt = 5, the constraint coefficient of the internal energy is μ = 0.2/Δt, the constraint coefficient of the length of the zero level set of the moving target is λ = 5, and the constraint coefficient of the target area is v = 1.5.
2. Simulation content:
The data used in the simulation experiment of the invention are frames 291, 292 and 293 of the Highway II test video from Computer Vision and Robotics Research, and the size of each video image is 320 × 240 pixels.
3. Simulation result analysis:
The simulation experiment of the invention performs moving-target detection on video images containing multiple moving targets. As shown in fig. 2, fig. 2(a) is the binary image obtained by applying the inter-frame difference operation to frames 291, 292 and 293 of the Highway II test video, which contain multiple moving targets; fig. 2(b) is the binary image obtained by removing noise from the binary image of fig. 2(a) with the morphological function bwareaopen; fig. 2(c) is a schematic diagram of the zero level sets of the moving targets, obtained by k-means clustering of the denoised binary image of fig. 2(b) and displayed on the second frame video image; fig. 2(d) shows the moving-target detection result for the video image containing multiple moving targets, obtained by updating the zero level sets of fig. 2(c) with the multi-variation level set method.
In fig. 2(d), the vehicles are the moving targets, and the white curves are the zero level sets of the moving targets. As can be seen from fig. 2(d), the method of the invention makes the zero level set of each moving target coincide with the edge contour of that target, so multiple moving targets can be detected accurately against a complex background.