CN107680116B - Method for monitoring moving target in video image - Google Patents

Method for monitoring moving target in video image

Info

Publication number
CN107680116B
Authority
CN
China
Prior art keywords
matrix
image
sparse
norm
rank
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710711920.8A
Other languages
Chinese (zh)
Other versions
CN107680116A (en)
Inventor
张延良
李兴旺
李赓
卢冰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Henan University of Technology
Original Assignee
Henan University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Henan University of Technology filed Critical Henan University of Technology
Priority to CN201710711920.8A priority Critical patent/CN107680116B/en
Publication of CN107680116A publication Critical patent/CN107680116A/en
Application granted granted Critical
Publication of CN107680116B publication Critical patent/CN107680116B/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/20 Analysis of motion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10016 Video; Image sequence

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method for monitoring a moving target in a video image, which can be used wherever surveillance video is analyzed intelligently. A video image is formed by superimposing a background image and a foreground moving target; because the background image sequence is strongly correlated and the foreground image is sparse, the background and the foreground can be separated by robust principal component analysis. The invention adopts the weighted nuclear norm as the low-rank constraint on the matrix, so that the shrinkage threshold decreases monotonically with the corresponding singular value and large singular values are compressed by a smaller amount. The structured sparse norm is used as the foreground sparsity constraint, which effectively exploits the prior knowledge that a foreground moving target occupies a contiguous spatial region. The weighted nuclear norm and the structured sparse norm form a new cost function that is optimized by the alternating direction method of multipliers, so that moving targets in surveillance video can be detected and tracked more effectively and accurately.

Description

Method for monitoring moving target in video image
Technical Field
The invention relates to a visual analysis technology, in particular to a method for monitoring a moving target in a video image.
Background
Moving object detection separates the background image and the moving object image (also called the foreground) in a surveillance video frame sequence, and is an important basic step in intelligent video analysis. Accurate detection of moving objects is of great significance for subsequent high-level computer vision tasks (such as behavior recognition, scene analysis, and traffic control). In recent years a large number of scholars at home and abroad have studied this subject extensively, and many algorithms have been proposed. However, because moving object detection faces many challenges, such as illumination change, dynamic background, camera shake, and real-time requirements, it remains a research difficulty and hot spot in the field of machine vision.
Background modeling is the main approach to moving object detection. The idea is to establish a background model and then judge whether each pixel in the video frame to be detected conforms to the model; pixels that do not conform belong to the foreground moving object. Typical methods include the single Gaussian model, the Gaussian mixture model, kernel density estimation, visual background extraction, linear autoregressive models, and pixel-based adaptive segmentation. These methods use a single pixel as the processing unit and often make fairly strict assumptions about the background model, so they do not achieve ideal results when applied to real scenes.
On one hand, those skilled in the art have applied robust principal component analysis to the moving object detection problem, based on two characteristics of surveillance video: the strong correlation between the background image sequences and the sparsity of the foreground moving object pixels. The approach requires few assumptions, can effectively avoid false detections caused by periodic changes of the background environment, and is robust to illumination changes and the like. It regards a surveillance video image as the superposition of a background image and a foreground moving target (as shown in FIG. 1); using the low-rank constraint on the background image sequence and the sparsity constraint on the foreground moving target, the detection and tracking of the foreground moving target can be converted into the optimization of the following cost function:
    min_{L,S} rank(L) + λ||S||_0   s.t.  D = L + S   (1)
where D ∈ R^(M×N) is the observation matrix, N is the number of video frames, each column of D is the vectorization of the corresponding image frame, L is the background image matrix, and S is the foreground moving target matrix. The rank function and the l0 norm in equation (1) are non-linear and non-convex, so solving equation (1) directly is intractable.
On the other hand, those skilled in the art also replace rank(L) with the nuclear norm ||L||_* and replace ||S||_0 with the l1 norm ||S||_1, transforming equation (1) into the following new cost function:
    min_{L,S} ||L||_* + λ||S||_1   s.t.  D = L + S   (2)
where ||L||_* = Σ_i σ_i(L) (σ_i(L) being the ith singular value of the matrix L) approximates the rank of the matrix, and ||·||_1 is the l1 norm, used to approximate the l0 norm. Because the nuclear norm and the l1 norm are convex functions, equation (2) can be solved by well-developed convex optimization methods, and the moving target can still be recovered accurately and robustly under a certain amount of noise. However, this method also has two problems: (1) the l1 norm enforces sparsity on each pixel independently, without using the prior knowledge that the foreground moving target occupies contiguous regions, so scattered large non-target noise may be misjudged as foreground; (2) when the nuclear norm ||L||_* replaces the rank function rank(L) as the low-rank constraint, the nuclear norm term Σ_i σ_i(L) gives every singular value the same weight of 1, so minimizing it shrinks singular values of different sizes by the same amount; yet, by their physical meaning, the large singular values carry the main information of the image, and shrinking them equally tends to degrade the quality of the background image obtained by the decomposition.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides a method for monitoring a moving target in a video image, which can realize more effective and accurate detection and tracking of the moving target in the monitoring video image and improve the quality of a background image.
In one aspect, the present invention provides a method for monitoring a moving object in a video image, including:
s1, constructing an observation matrix D of the video image to be processed;
S2, acquiring a low-rank matrix and a sparse matrix according to a robust principal component analysis theory, a predefined weighted nuclear norm, a structured sparse norm and the observation matrix D; each column of the low-rank matrix is obtained by vectorizing the background image of the corresponding frame (the frame corresponding to that column) in the observation matrix D, and each column of the sparse matrix is obtained by vectorizing the foreground moving target image of the corresponding frame in the observation matrix D;
and S3, performing a vector-to-matrix (reshape) operation on the low-rank matrix and the sparse matrix respectively to obtain a background image frame sequence and a foreground moving target image frame sequence.
Optionally, the step S2 includes:
s21, establishing a cost function according to the robust principal component analysis theory:
    min_{L,S} rank(L) + λ||S||_0   s.t.  D = L + S   (formula one)
where L ∈ R^(M×N) is the low-rank matrix to be determined, S ∈ R^(M×N) is the sparse matrix to be determined, rank(·) denotes the rank function of a matrix, λ is the balance factor, ||·||_0 denotes the l0 norm, and D = L + S is the constraint;
S22, defining the weighted nuclear norm of the low-rank matrix L as
    ||L||_{w,*} = Σ_i w_i σ_i(L)
and defining the structured sparse norm of the matrix S as
    γ(S) = Σ_{j=1..N} Σ_{g∈G} ||s_j^g||_∞,
where w = [w_1, …, w_n], 0 ≤ w_1 ≤ … ≤ w_n, σ_1(L) ≥ … ≥ σ_n(L) > 0, and σ_i(L) denotes the ith singular value of the matrix L;
in the matrix S, the jth column s_j is obtained by vectorizing the foreground moving target image F_j ∈ R^(a×b) of the jth frame; s_j ∈ R^M is an M-dimensional vector, the indices of the vector elements of s_j are called pixel indices, and the set of all pixel indices is denoted Θ = {1, 2, …, M};
a defined e × e sliding window slides over the matrix F_j row by row and column by column; the pixels covered by the sliding window at one position are denoted s_j^g, where g is the set of indices of the covered pixels, a subset of the set Θ; the sets g corresponding to sliding windows at different positions form a set denoted G, and ||·||_∞ denotes the l∞ norm;
S23, replacing rank(L) in formula one with the weighted nuclear norm ||L||_{w,*} defined in sub-step S22 and replacing ||S||_0 in formula one with the structured sparse norm γ(S), obtaining a new cost function:
    min_{L,S} ||L||_{w,*} + λγ(S)   s.t.  D = L + S   (formula two)
optionally, the step S2 further includes:
S24, processing formula two with the augmented Lagrange multiplier method to obtain formula three:
    l(L, S, Y, μ) = ||L||_{w,*} + λγ(S) + ⟨Y, D - L - S⟩ + (μ/2)||D - L - S||_F^2   (formula three)
where Y ∈ R^(M×N) denotes the Lagrange multiplier matrix, μ > 0 denotes the penalty factor, ⟨·,·⟩ denotes the inner product operation, and ||·||_F denotes the Frobenius norm;
and S25, minimizing formula three by loop iteration to obtain L, S, and Y.
Optionally, the sub-step S25 includes:
initializing the parameters μ_0 > 0, ρ > 1, θ > 0, k = 0, L_0 = D, Y_0 = 0;
minimizing formula three by loop iteration, where L_{k+1}, S_{k+1}, and Y_{k+1} are given as follows:
    L_{k+1} = argmin_L l(L, S_k, Y_k, μ_k)
    S_{k+1} = argmin_S l(L_{k+1}, S, Y_k, μ_k)
    Y_{k+1} = Y_k + μ_k(D - L_{k+1} - S_{k+1})
    μ_{k+1} = ρμ_k   (formula four)
for the L_{k+1} update in formula four, let G_L = D - S_k + Y_k/μ_k; then L_{k+1} is determined by the following formula five:
    L_{k+1} = U S_{w/μ_k}(Σ) V^T,   (U, Σ, V) = svd(G_L)   (formula five)
judging whether
    ||D - L_{k+1} - S_{k+1}||_F / ||D||_F < θ
holds;
if it holds, the iteration terminates and, denoting k + 1 = q, the L_q and S_q obtained by the iteration are the low-rank matrix and the sparse matrix; otherwise, let k = k + 1 and repeat the iterative process until
    ||D - L_{k+1} - S_{k+1}||_F / ||D||_F < θ
holds;
where svd(·) denotes the singular value decomposition, i.e., G_L = UΣV^T, U and V are unitary matrices, and Σ is a diagonal matrix; the singular values σ_i(G_L) of G_L are arranged in decreasing order on the diagonal of Σ; the singular value shrinkage operator S_{w/μ_k}(Σ) is a diagonal matrix whose diagonal elements are defined as follows:
    [S_{w/μ_k}(Σ)]_{ii} = max(σ_i(G_L) - w_i/μ_k, 0).
optionally, the step S3 includes:
B_i = reshape(l_i, a×b), B_i ∈ R^(a×b), i = 1, …, N, is the determined background image frame sequence; wherein:
reshape(l_i, a×b) denotes the vector-to-matrix (reshape) function, L_q = [l_1, l_2, …, l_N], l_i ∈ R^(M×1), i = 1, …, N, reshape: R^(M×1) → R^(a×b), M = ab;
F_i = reshape(s_i, a×b), F_i ∈ R^(a×b), i = 1, …, N, is the determined foreground moving target image frame sequence; S_q = [s_1, s_2, …, s_N], s_i ∈ R^(M×1), i = 1, …, N.
Optionally, in formula three, the balance factor λ is set according to the size of the observation matrix D (for example, the classical robust PCA choice λ = 1/√max(M, N));
e in the e × e sliding window takes a value of 3 to 5;
the penalty factor μ_0 is initialized in inverse proportion to σ_1(D) (for example, μ_0 = 1.25/σ_1(D)); ρ = 1.05; θ = 1×10^-8, where σ_1(D) is the largest singular value of the observation matrix D.
Optionally, the step S1 includes:
S11, performing grayscale processing on each frame of the video image, the N grayscale frames being I_1, …, I_N, each with resolution a × b, i.e., I_i ∈ R^(a×b), i = 1, …, N, where R^(a×b) is the real space of size a × b;
s12, sequentially vectorizing each frame of image after gray level processing, and constructing an observation matrix D according to each vectorized frame of image;
wherein D = [Vec(I_1), …, Vec(I_N)] ∈ R^(M×N);
R^(M×N) denotes the real space of size M × N, Vec(I_i) denotes the vectorization function, Vec: R^(a×b) → R^(M×1), and M = ab.
In another aspect, the present invention further provides an apparatus for monitoring a moving object in a video image, including:
the observation matrix constructing unit is used for constructing an observation matrix D of the video image to be processed;
the low-rank matrix and sparse matrix acquisition unit is used for acquiring a low-rank matrix and a sparse matrix according to a robust principal component analysis theory, a predefined weighted nuclear norm, a predefined structured sparse norm and the observation matrix D; each column of the low-rank matrix is obtained by vectorizing a background image of a corresponding frame in the observation matrix D, and each column of the sparse matrix is obtained by vectorizing a foreground moving target image of the corresponding frame in the observation matrix D;
and the processing unit is used for performing a vector-to-matrix (reshape) operation on the low-rank matrix and the sparse matrix respectively to obtain a background image frame sequence and a foreground moving target image frame sequence.
In still another aspect, the present invention further provides an image processing apparatus, including the above apparatus for monitoring a moving object in a video image, and an image monitoring apparatus for acquiring the video image;
and the image monitoring device sends the acquired video image to a device for monitoring a moving target in the video image for processing.
The invention has the following beneficial effects:
The method for monitoring a moving target in a video image provided by the invention proposes a new cost function based on the robust principal component analysis principle and minimizes it with the alternating direction method of multipliers, so that the moving target in the surveillance video is detected and tracked more effectively and accurately.
That is to say, in the invention, the weighted nuclear norm is used as the low-rank constraint on the background image matrix, so that the shrinkage threshold decreases monotonically with the corresponding singular value and large singular values are compressed by a smaller amount. Using the structured sparse norm as the sparsity constraint on the foreground matrix effectively exploits the prior knowledge that a foreground moving target occupies a contiguous spatial region. The weighted nuclear norm and the structured sparse norm form a new cost function that is optimized by the alternating direction method of multipliers, so that the moving target in the surveillance video can be detected and tracked more effectively and accurately.
Drawings
FIG. 1 is a schematic diagram of a robust principal component analysis;
FIG. 2 is an overall flow diagram of an embodiment of the invention;
FIG. 3 is a flow chart of step 103 in an embodiment of the present invention;
FIG. 4 is a diagram illustrating the effect of foreground recognition of a frame in 6 scenes by using the method of the present invention and the conventional method;
FIG. 5 is a schematic diagram of the principle of the structured sparse norm, in which (a)-(b) show two distributions of sparse pixels and (c)-(d) show several example positions of the 3 × 3 sliding window in the two cases.
Detailed Description
For the purpose of better explaining the present invention and to facilitate understanding, the present invention will be described in detail by way of specific embodiments with reference to the accompanying drawings.
All technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
Currently, moving object detection separates the background and the moving target in a video sequence and is an important basic step in intelligent video analysis. A surveillance video image can be regarded as the superposition of the background and the foreground moving target, as shown in FIG. 1. Robust principal component analysis exploits the low-rank property of the background image sequence and the sparsity of the foreground moving target, and separates the background from the foreground moving target by establishing a cost function and minimizing it.
The existing cost function uses the nuclear norm to approximate the rank function of the matrix as the low-rank constraint; this approximation has a large error and is not accurate enough, which affects the separation. On the other hand, the existing cost function uses the l1 norm as the sparsity constraint; this treats each pixel independently and does not use the prior knowledge of the regional continuity of the foreground moving target, which affects the detection of the foreground moving target.
The embodiment of the invention provides a new cost function and minimizes it with the alternating direction method of multipliers to detect the foreground moving target. In the new cost function, the weighted nuclear norm is adopted to approximate the rank function of the matrix as the low-rank constraint on the background image, so the approximation is better and the precision higher; the structured sparse norm is used as the sparsity constraint on the foreground target, making full use of the prior knowledge of the spatial region continuity of the foreground.
Example one
As shown in fig. 2, the present invention provides a method for monitoring a moving object in a video image, the method comprising the following steps:
101, constructing an observation matrix D of a video image to be processed;
specifically, in this step 101, graying and vectorizing operations may be performed on each frame of image of the obtained monitoring video sequence to construct an observation matrix.
For example, the acquired N frames of surveillance video are converted to grayscale and the grayed N frames are denoted I_1, …, I_N; the resolution of each frame is a × b, i.e., I_i ∈ R^(a×b), i = 1, …, N, where R^(a×b) denotes the real space of size a × b.
I_1, …, I_N are then vectorized in turn to construct the observation matrix D ∈ R^(M×N), where M = ab and R^(M×N) denotes the real space of size M × N. The specific operation is as follows:
    D = [Vec(I_1), …, Vec(I_N)] ∈ R^(M×N),
where Vec(I_i) denotes the vectorization function Vec: R^(a×b) → R^(M×1), M = ab, which concatenates the columns of the matrix I_i ∈ R^(a×b) from left to right into an M × 1 vector.
And 102, establishing a new cost function for solving the background image and the foreground moving target image according to a robust principal component analysis theory, a predefined weighted nuclear norm and a structure sparse norm.
And 103, combining the observation matrix, performing iterative optimization on the cost function established in the step 102, and solving a low-rank matrix and a sparse matrix.
In this step, each column of the low-rank matrix is obtained by vectorizing the background image of the corresponding frame (i.e., the frame corresponding to the column) in the observation matrix D, and each column of the sparse matrix is obtained by vectorizing the foreground moving object image of the corresponding frame in the observation matrix D.
And 104, applying a vector-to-matrix (reshape) operation to each column of the low-rank matrix and of the sparse matrix, and obtaining the background image frame sequence and the foreground moving target image frame sequence of the surveillance video.
In step 104, each column of the low-rank matrix L_{k+1} is regarded as the vectorization of a background image frame and each column of the sparse matrix S_{k+1} as the vectorization of a foreground moving target image frame; applying a vector-to-matrix operation to each column of L_{k+1} and S_{k+1} yields the background frame image sequence and the foreground moving target frame image sequence.
For the above step 102, the following is illustrated:
102-1, establishing a cost function according to a robust principal component analysis theory:
    min_{L,S} rank(L) + λ||S||_0   s.t.  D = L + S   (1-1)
where L ∈ R^(M×N) is the low-rank matrix to be solved, consisting of the background image sequence; S ∈ R^(M×N) is the sparse matrix to be solved, consisting of the foreground image sequence; rank(·) denotes the rank function of a matrix; λ is the balance factor; ||·||_0 denotes the l0 norm; and s.t. in formula (1-1) is the abbreviation of "subject to", i.e., the constraint.
102-2, defining the weighted nuclear norm of the matrix L as
    ||L||_{w,*} = Σ_i w_i σ_i(L),
where w = [w_1, …, w_n], 0 ≤ w_1 ≤ … ≤ w_n, σ_1(L) ≥ … ≥ σ_n(L) > 0, and σ_i(L) denotes the ith singular value of the matrix L.
102-3, defining the structured sparse norm γ(S) of the matrix S.
The matrix formed by the foreground moving target frame sequence to be solved is S ∈ R^(M×N); its jth column s_j ∈ R^M is obtained by vectorizing the foreground moving target image F_j ∈ R^(a×b) of the jth frame. s_j ∈ R^M is an M-dimensional vector; the index of a vector element is called its pixel index, and the set of all pixel indices is denoted Θ = {1, 2, …, M}.
In this embodiment, an e × e sliding window may be designed (e.g., e may take a value of 3 to 5) and slid over the matrix F_j row by row and column by column. The pixels covered by the window at one position are denoted s_j^g, where g is the set of indices of the covered pixels, a subset of the set Θ; the sets g corresponding to windows at different positions form a set denoted G. The structured sparse norm is defined as
    γ(S) = Σ_{j=1..N} Σ_{g∈G} ||s_j^g||_∞,
where ||·||_∞ denotes the l∞ norm.
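A direct computation of γ(S) following this definition might look as follows (a sketch; the frame size (a, b), the window size e, and the function name are illustrative assumptions):

    import numpy as np

    def structured_sparse_norm(S, a, b, e=3):
        """gamma(S): for every frame (column of S), slide an e x e window over the a x b
        foreground image and sum the l-infinity norms over all window positions."""
        total = 0.0
        for j in range(S.shape[1]):
            F_j = S[:, j].reshape((a, b), order='F')   # undo the column-wise vectorization
            for r in range(a - e + 1):                 # slide row by row ...
                for c in range(b - e + 1):             # ... and column by column
                    window = F_j[r:r + e, c:c + e]     # the pixels s_j^g covered at this position
                    total += np.max(np.abs(window))    # l-infinity norm of the covered pixels
        return total

Because each window contributes only its largest covered value, pixels clustered into a contiguous region share windows and contribute less in total than the same pixels scattered apart, which is exactly the preference illustrated in FIG. 5.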
102-4, replacing rank(L) with the weighted nuclear norm ||L||_{w,*} and replacing ||S||_0 with the structured sparse norm γ(S), formula (1-1) is transformed into the new cost function
    min_{L,S} ||L||_{w,*} + λγ(S)   s.t.  D = L + S.   (1-2)
Accordingly, step 103, comprises the following sub-steps:
103-1, establishing the augmented Lagrangian function (1-3) of formula (1-2):
    l(L, S, Y, μ) = ||L||_{w,*} + λγ(S) + ⟨Y, D - L - S⟩ + (μ/2)||D - L - S||_F^2,   (1-3)
where Y ∈ R^(M×N) denotes the Lagrange multiplier matrix, μ > 0 denotes the penalty factor, ⟨·,·⟩ denotes the inner product operation, and ||·||_F denotes the Frobenius norm.
Optionally, λ may be set according to the size of the observation matrix D (for example, the classical robust PCA choice λ = 1/√max(M, N)).
103-2, initializing the parameters μ_0 > 0, ρ > 1, θ > 0, k = 0, L_0 = D, Y_0 = 0.
For example, one may set μ_0 in inverse proportion to σ_1(D) (e.g., μ_0 = 1.25/σ_1(D)), ρ = 1.05, and θ = 1×10^-8, where σ_1(D) is the largest singular value of the observation matrix D.
103-3, minimizing the augmented Lagrangian (1-3) by loop iteration to determine L, S, and Y:
    L_{k+1} = argmin_L l(L, S_k, Y_k, μ_k)
    S_{k+1} = argmin_S l(L_{k+1}, S, Y_k, μ_k)
    Y_{k+1} = Y_k + μ_k(D - L_{k+1} - S_{k+1})
    μ_{k+1} = ρμ_k   (1-5)
In the first formula of (1-5), let G_L = D - S_k + Y_k/μ_k; then L_{k+1} is determined by:
    L_{k+1} = U S_{w/μ_k}(Σ) V^T,   (U, Σ, V) = svd(G_L).   (1-6)
In this formula, svd(·) denotes the singular value decomposition, i.e., G_L = UΣV^T, where U and V are unitary matrices and Σ is a diagonal matrix with the singular values σ_i(G_L) of G_L arranged in decreasing order on its diagonal. The weight w_i in formula (1-6) is chosen to decrease monotonically as the corresponding singular value σ_i(L_k) increases (for example, w_i = C/σ_i(L_k) for a positive constant C).
The singular value shrinkage operator S_{w/μ_k}(Σ) is a diagonal matrix whose diagonal elements are defined as follows:
    [S_{w/μ_k}(Σ)]_{ii} = max(σ_i(G_L) - w_i/μ_k, 0).   (1-7)
In formula (1-7), a large singular value is compressed by a smaller amount (its threshold w_i/μ_k is smaller), thereby improving the quality of the background image obtained by the separation.
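The L-update of formulas (1-6) and (1-7) can be sketched as follows (a sketch only; the weights are computed here from the singular values of G_L with an assumed inverse-proportional form, whereas the text ties w_i to σ_i(L_k), and the constant C and eps are illustrative):

    import numpy as np

    def update_L(D, S_k, Y_k, mu_k, C=1.0, eps=1e-8):
        """Formulas (1-6)/(1-7): weighted singular value shrinkage of G_L = D - S_k + Y_k/mu_k."""
        G_L = D - S_k + Y_k / mu_k
        U, sigma, Vt = np.linalg.svd(G_L, full_matrices=False)  # singular values in decreasing order
        w = C / (sigma + eps)                       # assumed weights: decrease as the singular value grows
        shrunk = np.maximum(sigma - w / mu_k, 0.0)  # max(sigma_i - w_i/mu_k, 0)
        return (U * shrunk) @ Vt                    # U S_{w/mu_k}(Sigma) V^T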
103-4, judging whether
    ||D - L_{k+1} - S_{k+1}||_F / ||D||_F < θ
holds. If it holds, the iteration terminates and, denoting k + 1 = q, the L_q and S_q obtained by the iteration are the desired low-rank matrix and sparse matrix; otherwise, let k = k + 1 and return to step 103-3, as shown in FIG. 3.
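The outer loop of steps 103-2 to 103-4 can be sketched as follows. This is only a sketch: the S-update uses plain element-wise soft-thresholding as a stand-in, because the proximal step for the structured sparse norm γ(S) is not reproduced here; update_L is the helper sketched above, and the parameter defaults are illustrative.

    import numpy as np

    def rpca_wnn_admm(D, lam, mu0, rho=1.05, theta=1e-8, max_iter=500):
        """Alternating direction iteration of steps 103-2 to 103-4 (sketch)."""
        L = D.copy()                        # L_0 = D
        S = np.zeros_like(D)
        Y = np.zeros_like(D)                # Y_0 = 0
        mu = mu0
        for k in range(max_iter):
            L = update_L(D, S, Y, mu)       # weighted singular value shrinkage (formula 1-6)
            # S-update: stand-in soft-thresholding; the method instead minimizes lam * gamma(S)
            G_S = D - L + Y / mu
            S = np.sign(G_S) * np.maximum(np.abs(G_S) - lam / mu, 0.0)
            Y = Y + mu * (D - L - S)        # multiplier update
            mu = rho * mu                   # penalty update
            # stopping rule of step 103-4
            if np.linalg.norm(D - L - S, 'fro') < theta * np.linalg.norm(D, 'fro'):
                break
        return L, S

Here lam and mu0 would be chosen as in step 103-2, e.g., lam = 1/sqrt(max(M, N)) and mu0 inversely proportional to the largest singular value of D.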
In addition, the step 104 may include the following sub-steps:
104-1, denote L_q = [l_1, l_2, …, l_N], where l_i ∈ R^(M×1), i = 1, …, N. A vector-to-matrix (reshape) operation is applied to each column of L_q: B_i = reshape(l_i, a×b), where B_i ∈ R^(a×b), i = 1, …, N, which is the determined background frame image sequence; here reshape(l_i, a×b) denotes the vector-to-matrix function reshape: R^(M×1) → R^(a×b), M = ab.
104-2, denote S_q = [s_1, s_2, …, s_N], where s_i ∈ R^(M×1), i = 1, …, N. A vector-to-matrix operation is applied to each column of S_q: F_i = reshape(s_i, a×b), F_i ∈ R^(a×b), i = 1, …, N, which is the determined foreground moving target image frame sequence.
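Step 104 simply inverts the vectorization of step 101; a sketch (the column-major order matches the Vec(·) used to build D):

    def columns_to_frames(X, a, b):
        """reshape: R^(M x 1) -> R^(a x b) applied to every column of X (M x N)."""
        return [X[:, i].reshape((a, b), order='F') for i in range(X.shape[1])]

    # background frames B_i and foreground frames F_i, as in steps 104-1 and 104-2:
    # B = columns_to_frames(L_q, a, b)
    # F = columns_to_frames(S_q, a, b)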
In the embodiment, a cost function for measuring low rank of a background image and sparsity of a foreground moving target is provided, and the cost function is subjected to minimization optimization to realize detection and tracking of the foreground moving target.
The above-described embodiments may have the following advantages:
1. In the new cost function, the weighted nuclear norm replaces the nuclear norm widely adopted by existing methods as the low-rank constraint on the background image, and the weight decreases monotonically with the corresponding singular value. In this way, when the low-rank matrix (background image) is solved iteratively with the singular value shrinkage operator, large singular values are compressed by a smaller amount, which improves the quality of the separated background image.
2. In the new cost function, the structured sparse norm replaces the l1 norm widely adopted by existing methods as the sparsity constraint on the foreground matrix; this makes full use of the prior knowledge that a foreground moving target is spatially contiguous and effectively improves the detection of the foreground moving target.
3. The new cost function is minimized with the efficient alternating direction method of multipliers; through multiple iterations, the low-rank structure (corresponding to the background image) and the sparse structure (corresponding to the foreground moving target) of the original video matrix are obtained simultaneously.
In addition, the method of this embodiment was tested on real videos of the CDNET2014 data set, and 6 scenes (Highway, Office, Pedestrian, PETS2006, Overpass, and Canoe) were selected for a comparative analysis of the algorithms. Following common practice, the overall performance of the method is measured by the F-measure. Fig. 4 shows the foreground identification results of each algorithm for one frame of motion in the 6 scenes. The first column is the surveillance video frame, the second column is the real motion foreground (ground truth) provided by the data set, the third column is the motion foreground obtained by the prior-art anti-noise moving target detection algorithm based on low-rank matrix decomposition, and the fourth column is the motion foreground obtained by the method of the invention. As can be seen from FIG. 4, compared with the conventional method, the present method yields a clear foreground contour and fewer holes in the moving target, i.e., a better detection effect.
Further, Table 1 below lists the F-measure comparison of the two methods. As can be seen from Table 1, the performance of the method of the invention is superior to the conventional method in every scenario.
TABLE 1. F-measure comparison between the conventional method and the proposed method in 6 scenarios

Method                Office      Highway     Pedestrian   PETS2006    Overpass    Canoe
Conventional method   0.880224    0.820563    0.916904     0.855262    0.737762    0.702293
Proposed method       0.909801    0.910037    0.918604     0.917042    0.752604    0.846014
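The F-measure values of Table 1 compare binarized foreground masks against the ground truth; a typical computation is sketched below (the binarization threshold is an assumption, not something specified by this description):

    import numpy as np

    def f_measure(foreground, ground_truth, threshold=25.0):
        """F-measure (F1) between a recovered foreground frame and a binary ground-truth mask."""
        pred = np.abs(foreground) > threshold      # binarize the recovered foreground
        gt = ground_truth > 0
        tp = np.logical_and(pred, gt).sum()
        fp = np.logical_and(pred, ~gt).sum()
        fn = np.logical_and(~pred, gt).sum()
        precision = tp / (tp + fp + 1e-12)
        recall = tp / (tp + fn + 1e-12)
        return 2 * precision * recall / (precision + recall + 1e-12)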
Example two
In this embodiment, the method of the present invention is applied to the monitoring video of the CDNET2014Office scene to detect and track the moving object.
201, performing a graying operation on the acquired 100 frames of three-channel color surveillance video and denoting the 100 grayed frames as I_1, …, I_100; the resolution of each frame is 360 × 240;
202, vectorizing I_1, …, I_100 in turn to construct the observation matrix D ∈ R^(86400×100), where 86400 = 360 × 240. The operation is as follows:
    D = [Vec(I_1), …, Vec(I_100)] ∈ R^(86400×100),
where Vec(I_i) denotes the vectorization function, which concatenates the columns of the matrix I_i from left to right into an 86400 × 1 vector.
203, establishing a cost function based on a robust principal component analysis theory:
    min_{L,S} rank(L) + λ||S||_0   s.t.  D = L + S   (2-1)
where:
L ∈ R^(86400×100) is the low-rank matrix to be solved, consisting of the background image sequence; S ∈ R^(86400×100) is the sparse matrix to be solved, consisting of the foreground image sequence; rank(·) denotes the rank function of a matrix; λ is the balance factor; and ||·||_0 denotes the l0 norm.
204, defining the weighted nuclear norm of the matrix L as
    ||L||_{w,*} = Σ_i w_i σ_i(L),
where w = [w_1, …, w_n], 0 ≤ w_1 ≤ … ≤ w_n, σ_1(L) ≥ … ≥ σ_n(L) > 0, and σ_i(L) denotes the ith singular value of the matrix L. The weight w_i decreases monotonically with the magnitude of the corresponding singular value σ_i(L): larger singular values are given smaller weights, so they are compressed less when the weighted nuclear norm is minimized.
205, defining the structured sparse norm of the matrix S. The matrix formed by the foreground moving target image frame sequence to be solved is S ∈ R^(86400×100); its jth column s_j ∈ R^86400 is obtained by vectorizing the foreground moving target image F_j ∈ R^(360×240) of the jth frame, and the set of pixel indices it contains is denoted Θ = {1, 2, …, 86400}.
A 3 × 3 sliding window is designed and slid over the matrix F_j row by row and column by column. The pixels covered by the window at one position are denoted s_j^g, where g is the set of indices of the covered pixels, a subset of the set Θ; the sets g corresponding to windows at different positions form a set denoted G. The structured sparse norm is defined as
    γ(S) = Σ_{j=1..100} Σ_{g∈G} ||s_j^g||_∞.
Referring to FIG. 5, which further illustrates the idea of the structured sparse norm: suppose plots (a) and (b) are two candidate foreground moving targets for one frame during the optimization process, where the white points in FIG. 5 represent foreground moving target pixels (the numbers being their pixel values) and the black points represent background pixels (pixel value 0). If the l1 norm is used to measure sparsity, plots (a) and (b) give the same result. However, from the prior knowledge that a foreground moving target occupies a contiguous spatial region, image (a) is clearly more likely to be the foreground target image.
Now consider a 3 × 3 square window slid row by row and column by column, so that two adjacent window positions overlap in 6 pixels; the l∞ norm is taken within each window, and the l∞ norms over all window positions are summed as the measure of sparsity. With this measure, the sparsity of plot (a) (420) is significantly smaller than that of plot (b) (680), so plot (a) is favored as the foreground target during optimization, in accordance with the requirement that the sparsity of the moving target matrix be minimized.
It can be seen that, by employing the structured sparse norm γ(S) instead of the l1 norm as the sparsity constraint on the foreground moving target, this embodiment makes full use of the prior knowledge of the spatial region continuity of the foreground and detects the foreground moving target more accurately; a small numerical illustration is sketched below.
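The comparison can be checked on toy masks (the two 12 × 12 masks below are illustrative, not the masks of FIG. 5; structured_sparse_norm is the sketch given in Example one):

    import numpy as np

    # two candidate foreground frames with the same nine non-zero pixels: one contiguous, one scattered
    contiguous = np.zeros((12, 12))
    contiguous[4:7, 4:7] = 100.0
    scattered = np.zeros((12, 12))
    scattered[::4, ::4] = 100.0

    for name, mask in [("contiguous", contiguous), ("scattered", scattered)]:
        S = mask.flatten(order='F')[:, None]             # a one-column "video" matrix
        l1 = np.abs(S).sum()                             # l1 measure: identical for both layouts
        gamma = structured_sparse_norm(S, 12, 12, e=3)   # structured measure: smaller for the blob
        print(name, l1, gamma)

Both masks have the same l1 norm, but the structured measure is markedly smaller for the contiguous blob, so minimizing γ(S) favors the spatially continuous candidate.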
206, using the weighted nuclear norm ||L||_{w,*} as the low-rank constraint on the background image and the structured sparse norm γ(S) as the sparsity constraint on the foreground target, equation (2-1) is transformed into the following new cost function:
    min_{L,S} ||L||_{w,*} + λγ(S)   s.t.  D = L + S.   (2-2)
207, the balance factor λ is taken to be a value determined by the size of the observation matrix D (for example, λ = 1/√max(86400, 100) ≈ 0.0034).
The augmented Lagrangian function (2-4) is then established:
    l(L, S, Y, μ) = ||L||_{w,*} + λγ(S) + ⟨Y, D - L - S⟩ + (μ/2)||D - L - S||_F^2,   (2-4)
where Y denotes the Lagrange multiplier matrix, μ > 0 denotes the penalty factor, ⟨·,·⟩ denotes the inner product operation, and ||·||_F denotes the Frobenius norm.
208, initializing the parameters as in step 103-2 of Example one (μ_0 > 0 set in inverse proportion to σ_1(D), ρ = 1.05, θ = 1×10^-8, k = 0, L_0 = D, Y_0 = 0), and minimizing equation (2-4) by loop iteration to determine L, S, and Y:
    L_{k+1} = argmin_L l(L, S_k, Y_k, μ_k)
    S_{k+1} = argmin_S l(L_{k+1}, S, Y_k, μ_k)
    Y_{k+1} = Y_k + μ_k(D - L_{k+1} - S_{k+1})
    μ_{k+1} = ρμ_k   (2-5)
In the first formula of (2-5), let G_L = D - S_k + Y_k/μ_k; then L_{k+1} is determined by:
    L_{k+1} = U S_{w/μ_k}(Σ) V^T,   (U, Σ, V) = svd(G_L).   (2-6)
In this formula, svd(·) denotes the singular value decomposition, i.e., G_L = UΣV^T, where U and V are unitary matrices and Σ is a diagonal matrix with the singular values σ_i(G_L) arranged in decreasing order on its diagonal. The singular value shrinkage operator S_{w/μ_k}(Σ) is a diagonal matrix whose diagonal elements are defined as follows:
    [S_{w/μ_k}(Σ)]_{ii} = max(σ_i(G_L) - w_i/μ_k, 0).   (2-7)
From the above equation and step 204, the compression threshold w_i/μ_k applied to the singular value σ_i(G_L) decreases monotonically as the corresponding singular value σ_i(G_L) increases. In this way, small singular values are compressed more and approach zero at a faster rate, while large singular values are compressed less, so the quality of the background image is effectively preserved and a better background-foreground separation is obtained.
209, judging whether
    ||D - L_{k+1} - S_{k+1}||_F / ||D||_F < θ
holds. If it holds, the iteration terminates and, denoting k + 1 = q, the L_q and S_q obtained by the iteration are the desired low-rank matrix and sparse matrix; otherwise, let k = k + 1 and return to step 208.
210, denote L_q = [l_1, l_2, …, l_100], where l_i ∈ R^(86400×1), i = 1, …, 100. A vector-to-matrix operation is applied to each column of L_q: B_i = reshape(l_i, 360×240), where:
B_i ∈ R^(360×240), i = 1, …, 100, and reshape(l_i, 360×240) denotes the vector-to-matrix function reshape: R^(86400×1) → R^(360×240). B_i, i = 1, …, 100, is the obtained background image frame sequence.
211, denote S_q = [s_1, s_2, …, s_100], where:
s_i ∈ R^(86400×1), i = 1, …, 100.
A vector-to-matrix operation is applied to each column of S_q: F_i = reshape(s_i, 360×240), where F_i ∈ R^(360×240), i = 1, …, 100. F_i, i = 1, …, 100, is the obtained foreground moving target image frame sequence.
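Tying the sketches from Example one together for this 100-frame, 360 × 240 example (the file paths and the λ and μ_0 choices are illustrative assumptions; build_observation_matrix, rpca_wnn_admm, and columns_to_frames are the helpers sketched earlier):

    import numpy as np

    frame_paths = ["office/in%06d.jpg" % i for i in range(1, 101)]   # assumed frame files

    D = build_observation_matrix(frame_paths)       # D in R^(86400 x 100)
    lam = 1.0 / np.sqrt(max(D.shape))               # illustrative balance factor
    mu0 = 1.25 / np.linalg.norm(D, 2)               # illustrative penalty initialization (1/sigma_1(D) scale)
    L_q, S_q = rpca_wnn_admm(D, lam, mu0)           # steps 203-209 (sketch)

    # (a, b) = (360, 240) follows the notation of this example; adjust to the actual frame array shape
    backgrounds = columns_to_frames(L_q, 360, 240)  # B_i, step 210
    foregrounds = columns_to_frames(S_q, 360, 240)  # F_i, step 211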
Fig. 4, row 1, shows the results of this example: the fourth image is the foreground moving target image obtained by the method of this embodiment, and the third image is the moving foreground obtained by applying the prior-art anti-noise moving target detection algorithm based on low-rank matrix decomposition. Comparing the two images, the moving foreground contour obtained by the present method is distinct and clear, and the holes inside the target are clearly reduced relative to the conventional method. As can be seen from Table 1, the F-measure of 0.909801 is significantly higher than the 0.880224 of the conventional method, which further illustrates the effectiveness of the method of this embodiment.
In addition, the invention also provides a device for monitoring the moving target in the video image, which comprises an observation matrix constructing unit, a low-rank matrix and sparse matrix acquiring unit and a processing unit;
wherein the observation matrix constructing unit is used for constructing an observation matrix D of the video image to be processed;
the low-rank matrix and sparse matrix acquisition unit is used for acquiring a low-rank matrix and a sparse matrix according to a robust principal component analysis theory, a predefined weighted nuclear norm, a predefined structured sparse norm and the observation matrix D; the elements of each column of the low-rank matrix are obtained by vectorizing the background image of the corresponding frame in the observation matrix D, and the elements of each column of the sparse matrix are obtained by vectorizing the foreground moving target image of the corresponding frame in the observation matrix D;
and the processing unit is used for performing a vector-to-matrix (reshape) operation on the low-rank matrix and the sparse matrix respectively to obtain a background image frame sequence and a foreground moving target image frame sequence.
The apparatus of this embodiment can perform the method of any of the above embodiments, and reference is made to the above description, which is not repeated herein.
In addition, the embodiment of the present invention further provides an image processing apparatus, where the image processing apparatus may include an image monitoring apparatus for acquiring a video image and the apparatus for monitoring a moving object in the video image;
and the image monitoring device sends the acquired video image to a device for monitoring a moving target in the video image for processing.
The image processing device of this embodiment may be a background monitoring server or other servers, and the present embodiment does not limit this. The image processing equipment of the embodiment can realize more effective and accurate detection and tracking of the moving target in the monitoring video image, and improve the quality of the background image.
Finally, it should be noted that: the above-mentioned embodiments are only used for illustrating the technical solution of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (5)

1. A method for monitoring a moving object in a video image, comprising:
s1, constructing an observation matrix D of the video image to be processed;
S2, acquiring a low-rank matrix and a sparse matrix according to a robust principal component analysis theory, a predefined weighted nuclear norm, a structured sparse norm and the observation matrix D; the elements of each column of the low-rank matrix are obtained by vectorizing the background image of the corresponding frame in the observation matrix D, and the elements of each column of the sparse matrix are obtained by vectorizing the foreground moving target image of the corresponding frame in the observation matrix D;
S3, performing a vector-to-matrix (reshape) operation on the low-rank matrix and the sparse matrix respectively to obtain a background image frame sequence and a foreground moving target image frame sequence;
wherein the step S2 includes:
s21, establishing a cost function according to the robust principal component analysis theory:
    min_{L,S} rank(L) + λ||S||_0   s.t.  D = L + S   (formula one)
where L ∈ R^(M×N) is the low-rank matrix to be determined, S ∈ R^(M×N) is the sparse matrix to be determined, rank(·) denotes the rank function of a matrix, λ is the balance factor, ||·||_0 denotes the l0 norm, and D = L + S is the constraint;
S22, defining the weighted nuclear norm of the low-rank matrix L as
    ||L||_{w,*} = Σ_i w_i σ_i(L)
and defining the structured sparse norm of the sparse matrix S as
    γ(S) = Σ_{j=1..N} Σ_{g∈G} ||s_j^g||_∞,
where w = [w_1, …, w_n], 0 ≤ w_1 ≤ … ≤ w_n, σ_1(L) ≥ … ≥ σ_n(L) > 0, and σ_i(L) denotes the ith singular value of the low-rank matrix L;
in the sparse matrix S, the jth column s_j is obtained by vectorizing the foreground moving target image F_j ∈ R^(a×b) of the jth frame; s_j ∈ R^M is an M-dimensional vector, the indices of the vector elements of s_j are called pixel indices, and the set of all pixel indices is denoted Θ = {1, 2, …, M};
a defined e × e sliding window slides over the matrix F_j row by row and column by column; the pixels covered by the sliding window at one position are denoted s_j^g, where g is the set of indices of the covered pixels, a subset of the set Θ; the sets g corresponding to sliding windows at different positions form a set denoted G, and ||·||_∞ denotes the l∞ norm;
S23, replacing rank(L) in formula one with the weighted nuclear norm ||L||_{w,*} defined in sub-step S22 and replacing ||S||_0 in formula one with the structured sparse norm γ(S), obtaining a new cost function:
    min_{L,S} ||L||_{w,*} + λγ(S)   s.t.  D = L + S   (formula two)
the step S2 further includes:
S24, processing formula two with the augmented Lagrange multiplier method to obtain formula three:
    l(L, S, Y, μ) = ||L||_{w,*} + λγ(S) + ⟨Y, D - L - S⟩ + (μ/2)||D - L - S||_F^2   (formula three)
where Y ∈ R^(M×N) denotes the Lagrange multiplier matrix, μ > 0 denotes the penalty factor, ⟨·,·⟩ denotes the inner product operation, and ||·||_F denotes the Frobenius norm;
S25, minimizing formula three by loop iteration to obtain L, S, and Y;
in formula three, the balance factor λ is set according to the size of the observation matrix D (for example, λ = 1/√max(M, N));
e in the e × e sliding window takes a value of 3 to 5;
the penalty factor μ_0 is initialized in inverse proportion to σ_1(D) (for example, μ_0 = 1.25/σ_1(D)); ρ = 1.05; θ = 1×10^-8, where σ_1(D) is the largest singular value of the observation matrix D;
the sub-step S25 includes:
initializing the parameters μ_0 > 0, ρ > 1, θ > 0, k = 0, L_0 = D, Y_0 = 0;
minimizing formula three by loop iteration, where L_{k+1}, S_{k+1}, and Y_{k+1} are given as follows:
    L_{k+1} = argmin_L l(L, S_k, Y_k, μ_k)
    S_{k+1} = argmin_S l(L_{k+1}, S, Y_k, μ_k)
    Y_{k+1} = Y_k + μ_k(D - L_{k+1} - S_{k+1})
    μ_{k+1} = ρμ_k   (formula four)
for the L_{k+1} update in formula four, let G_L = D - S_k + Y_k/μ_k; then L_{k+1} is determined by the following formula five:
    L_{k+1} = U S_{w/μ_k}(Σ) V^T,   (U, Σ, V) = svd(G_L);   (formula five)
judging whether
    ||D - L_{k+1} - S_{k+1}||_F / ||D||_F < θ
holds; if it holds, the iteration terminates and, denoting k + 1 = q, the L_q and S_q obtained by the iteration are the low-rank matrix and the sparse matrix; otherwise, let k = k + 1 and repeat the iterative process until
    ||D - L_{k+1} - S_{k+1}||_F / ||D||_F < θ
holds;
where svd(·) denotes the singular value decomposition, i.e., G_L = UΣV^T, U and V are unitary matrices, and Σ is a diagonal matrix; the singular values σ_i(G_L) of G_L are arranged in decreasing order on the diagonal of Σ; and the singular value shrinkage operator S_{w/μ_k}(Σ) is a diagonal matrix whose diagonal elements are defined as follows:
    [S_{w/μ_k}(Σ)]_{ii} = max(σ_i(G_L) - w_i/μ_k, 0).
2. the method according to claim 1, wherein the step S1 includes:
S11, performing grayscale processing on each frame of the video image, the N grayscale frames being I_1, …, I_N, each with resolution a × b, i.e., I_i ∈ R^(a×b), i = 1, …, N, where R^(a×b) is the real space of size a × b;
s12, sequentially vectorizing each frame of image after gray level processing, and constructing an observation matrix D according to each vectorized frame of image;
wherein D = [Vec(I_1), …, Vec(I_N)] ∈ R^(M×N);
R^(M×N) denotes the real space of size M × N, Vec(I_i) denotes the vectorization function, Vec: R^(a×b) → R^(M×1), and M = ab.
3. The method according to claim 2, wherein the step S3 includes:
B_i = reshape(l_i, a×b), B_i ∈ R^(a×b), i = 1, …, N, is the determined background image frame sequence;
wherein:
reshape(l_i, a×b) denotes the vector-to-matrix (reshape) function, L_q = [l_1, l_2, …, l_N], l_i ∈ R^(M×1), i = 1, …, N, reshape: R^(M×1) → R^(a×b), M = ab;
F_i = reshape(s_i, a×b), F_i ∈ R^(a×b), i = 1, …, N, is the determined foreground moving target image frame sequence; S_q = [s_1, s_2, …, s_N], s_i ∈ R^(M×1), i = 1, …, N.
4. An apparatus for monitoring a moving object in a video image, comprising:
the observation matrix constructing unit is used for constructing an observation matrix D of the video image to be processed;
the low-rank matrix and sparse matrix acquisition unit is used for acquiring a low-rank matrix and a sparse matrix according to a robust principal component analysis theory, a predefined weighted nuclear norm, a predefined structured sparse norm and the observation matrix D; each column of the low-rank matrix is obtained by vectorizing a background image of a corresponding frame in the observation matrix D, and each column of the sparse matrix is obtained by vectorizing a foreground moving target image of the corresponding frame in the observation matrix D;
the processing unit is used for performing a vector-to-matrix (reshape) operation on the low-rank matrix and the sparse matrix respectively to obtain a background image frame sequence and a foreground moving target image frame sequence;
the apparatus for monitoring moving objects in video images performs the method of any of the above claims 1 to 3.
5. An image processing apparatus comprising the apparatus for monitoring a moving object in a video image according to claim 4, and image monitoring means for acquiring the video image;
and the image monitoring device sends the acquired video image to a device for monitoring a moving target in the video image for processing.
CN201710711920.8A 2017-08-18 2017-08-18 Method for monitoring moving target in video image Active CN107680116B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710711920.8A CN107680116B (en) 2017-08-18 2017-08-18 Method for monitoring moving target in video image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710711920.8A CN107680116B (en) 2017-08-18 2017-08-18 Method for monitoring moving target in video image

Publications (2)

Publication Number Publication Date
CN107680116A CN107680116A (en) 2018-02-09
CN107680116B true CN107680116B (en) 2020-07-28

Family

ID=61134696

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710711920.8A Active CN107680116B (en) 2017-08-18 2017-08-18 Method for monitoring moving target in video image

Country Status (1)

Country Link
CN (1) CN107680116B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108447029A (en) * 2018-02-12 2018-08-24 深圳创维-Rgb电子有限公司 A kind of denoising method of video sequence, device, server and storage medium
CN108961261B (en) * 2018-03-14 2022-02-15 中南大学 Optic disk region OCT image hierarchy segmentation method based on space continuity constraint
CN109002802B (en) * 2018-07-23 2021-06-15 武汉科技大学 Video foreground separation method and system based on adaptive robust principal component analysis
CN109345563B (en) * 2018-09-14 2022-05-10 南京邮电大学 Moving target detection method based on low-rank sparse decomposition
JP7034050B2 (en) * 2018-10-29 2022-03-11 京セラ株式会社 Image processing equipment, cameras, moving objects and image processing methods
CN109543650A (en) * 2018-12-04 2019-03-29 钟祥博谦信息科技有限公司 Warehouse intelligent control method and system
CN110109114B (en) * 2019-05-09 2020-11-10 电子科技大学 Scanning radar super-resolution imaging detection integrated method
CN110136164B (en) * 2019-05-21 2022-10-25 电子科技大学 Method for removing dynamic background based on online transmission transformation and low-rank sparse matrix decomposition
CN111951191B (en) * 2020-08-14 2022-05-24 新疆大学 Video image snow removing method and device and storage medium
CN113177462B (en) * 2021-04-26 2022-04-15 四川大学 Target detection method suitable for court trial monitoring
CN117640900B (en) * 2024-01-25 2024-04-26 广东天耘科技有限公司 Global security video system


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104361611A (en) * 2014-11-18 2015-02-18 南京信息工程大学 Group sparsity robust PCA-based moving object detecting method
KR101556603B1 (en) * 2014-12-30 2015-10-01 연세대학교 산학협력단 Apparatus and Method for Image Seperation using Rank Prior
CN104599292A (en) * 2015-02-03 2015-05-06 中国人民解放军国防科学技术大学 Noise-resistant moving target detection algorithm based on low rank matrix
CN106056607A (en) * 2016-05-30 2016-10-26 天津城建大学 Monitoring image background modeling method based on robustness principal component analysis

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
"Background Subtraction Based on Low-Rank and Structured Sparse Decomposition";Liu X , Zhao G , Yao J , et al.;《IEEE Transactions on Image Processing》;20150831;第2502-2514页 *
"基于加权RPCA的非局部图像去噪方法";杨国亮等;《计算机工程与设计》;20151116;第3035-3040页 *
"基于改进RPCA的非局部图像去噪算法研究";王艳芳;《中国优秀硕士学位论文全文数据库 信息科技辑》;20160215;第I138-1747页 *

Also Published As

Publication number Publication date
CN107680116A (en) 2018-02-09

Similar Documents

Publication Publication Date Title
CN107680116B (en) Method for monitoring moving target in video image
Zhang et al. AMP-Net: Denoising-based deep unfolding for compressive image sensing
Yu et al. Hallucinating very low-resolution unaligned and noisy face images by transformative discriminative autoencoders
Liu et al. Denet: A universal network for counting crowd with varying densities and scales
CN107016357B (en) Video pedestrian detection method based on time domain convolutional neural network
CN104408742B (en) A kind of moving target detecting method based on space time frequency spectrum Conjoint Analysis
Du et al. Variational image deraining
CN111080675B (en) Target tracking method based on space-time constraint correlation filtering
Kumar et al. Fast learning-based single image super-resolution
Fooladgar et al. Multi-modal attention-based fusion model for semantic segmentation of RGB-depth images
CN111260738A (en) Multi-scale target tracking method based on relevant filtering and self-adaptive feature fusion
CN107767416B (en) Method for identifying pedestrian orientation in low-resolution image
CN110135344B (en) Infrared dim target detection method based on weighted fixed rank representation
CN104732566B (en) Compression of hyperspectral images cognitive method based on non-separation sparse prior
CN110490894B (en) Video foreground and background separation method based on improved low-rank sparse decomposition
CN110879982A (en) Crowd counting system and method
Wang et al. Multi-scale fish segmentation refinement and missing shape recovery
CN104734724B (en) Based on the Compression of hyperspectral images cognitive method for weighting Laplce&#39;s sparse prior again
Xia et al. Single image rain removal via a simplified residual dense network
Anantrasirichai Atmospheric turbulence removal with complex-valued convolutional neural network
Jia et al. Effective meta-attention dehazing networks for vision-based outdoor industrial systems
Sun et al. Hyperspectral image denoising via low-rank representation and CNN denoiser
Zhang et al. DuGAN: An effective framework for underwater image enhancement
Khanna et al. Fractional derivative filter for image contrast enhancement with order prediction
CN111401209B (en) Action recognition method based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant