CN107680116B - Method for monitoring moving target in video image - Google Patents

Method for monitoring moving target in video image

Info

Publication number
CN107680116B
Authority
CN
China
Prior art keywords
matrix
image
sparse
norm
rank
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710711920.8A
Other languages
Chinese (zh)
Other versions
CN107680116A (en)
Inventor
张延良
李兴旺
李赓
卢冰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Henan University of Technology
Original Assignee
Henan University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Henan University of Technology filed Critical Henan University of Technology
Priority to CN201710711920.8A priority Critical patent/CN107680116B/en
Publication of CN107680116A publication Critical patent/CN107680116A/en
Application granted granted Critical
Publication of CN107680116B publication Critical patent/CN107680116B/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/20 Analysis of motion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10016 Video; Image sequence

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method for monitoring a moving target in a video image, which can be used wherever surveillance video is analyzed intelligently. A video image is formed by superimposing a background image and a foreground moving target; because the background image sequence is strongly correlated and the foreground image is sparse, the background and the foreground can be separated by robust principal component analysis. The invention adopts the weighted nuclear norm as the low-rank constraint on the matrix, so that the shrinkage threshold decreases monotonically with the corresponding singular value and large singular values are compressed by a smaller amount. The structured sparse norm is used as the foreground sparsity constraint, which effectively exploits the prior knowledge that a foreground moving target occupies a contiguous spatial region. The weighted nuclear norm and the structured sparse norm form a new cost function that is optimized by the alternating direction method of multipliers, so that moving targets in surveillance video can be detected and tracked more effectively and accurately.

Description

Method for monitoring moving target in video image
Technical Field
The invention relates to a visual analysis technology, in particular to a method for monitoring a moving target in a video image.
Background
Moving object detection separates the background image and the moving object image (also called the foreground) in a surveillance video frame sequence, and is an important basic step in intelligent video analysis. Accurate detection of moving objects is of great significance for subsequent high-level computer vision tasks (such as behavior recognition, scene analysis, and traffic control). In recent years a large number of scholars at home and abroad have studied this subject extensively, and many algorithms have been proposed. However, because moving object detection faces many challenges, such as illumination change, dynamic background, camera shake, and real-time requirements, it remains a research difficulty and hot spot in the field of machine vision.
Background modeling is the main approach to moving object detection. The idea is to establish a background model and then judge whether each pixel in the video frame to be detected conforms to the model; pixels that do not conform belong to the foreground moving object. Typical methods include the single Gaussian model, the Gaussian mixture model, kernel density estimation, visual background extraction, linear autoregressive models, and pixel-based adaptive segmentation. These methods use a single pixel as the processing unit and often make fairly strict assumptions about the background model, so they do not achieve ideal results when applied to real scenes.
On one hand, those skilled in the art have applied robust principal component analysis to the moving object detection problem, based on two characteristics of surveillance video: the strong correlation between the background image sequences and the sparsity of the foreground moving object pixels. The approach requires few assumptions, can effectively avoid false detections caused by periodic changes of the background environment, and is robust to illumination changes and the like. It regards a surveillance video image as the superposition of a background image and a foreground moving target (as shown in FIG. 1); using the low-rank constraint on the background image sequence and the sparsity constraint on the foreground moving target, the detection and tracking of the foreground moving target can be converted into the optimization of the following cost function:
    min_{L,S} rank(L) + λ||S||_0   s.t.  D = L + S   (1)
where D ∈ R^(M×N) is the observation matrix, N is the number of video frames, each column of D is the vectorization of the corresponding image frame, L is the background image matrix, and S is the foreground moving target matrix. The rank function and the l0 norm in equation (1) are non-linear and non-convex, so solving equation (1) directly is intractable.
On the other hand, those skilled in the art also replace rank(L) with the nuclear norm ||L||_* and replace ||S||_0 with the l1 norm ||S||_1, transforming equation (1) into the following new cost function:
    min_{L,S} ||L||_* + λ||S||_1   s.t.  D = L + S   (2)
where ||L||_* = Σ_i σ_i(L) (σ_i(L) being the ith singular value of the matrix L) approximates the rank of the matrix, and ||·||_1 is the l1 norm, used to approximate the l0 norm. Because the nuclear norm and the l1 norm are convex functions, equation (2) can be solved by well-developed convex optimization methods, and the moving target can still be recovered accurately and robustly under a certain amount of noise. However, this method also has two problems: (1) the l1 norm enforces sparsity on each pixel independently, without using the prior knowledge that the foreground moving target occupies contiguous regions, so scattered large non-target noise may be misjudged as foreground; (2) when the nuclear norm ||L||_* replaces the rank function rank(L) as the low-rank constraint, the nuclear norm term Σ_i σ_i(L) gives every singular value the same weight of 1, so minimizing it shrinks singular values of different sizes by the same amount; yet, by their physical meaning, the large singular values carry the main information of the image, and shrinking them equally tends to degrade the quality of the background image obtained by the decomposition.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides a method for monitoring a moving target in a video image, which can realize more effective and accurate detection and tracking of the moving target in the monitoring video image and improve the quality of a background image.
In one aspect, the present invention provides a method for monitoring a moving object in a video image, including:
s1, constructing an observation matrix D of the video image to be processed;
S2, acquiring a low-rank matrix and a sparse matrix according to a robust principal component analysis theory, a predefined weighted nuclear norm, a structured sparse norm and the observation matrix D; each column of the low-rank matrix is obtained by vectorizing the background image of the corresponding frame (the frame corresponding to that column) in the observation matrix D, and each column of the sparse matrix is obtained by vectorizing the foreground moving target image of the corresponding frame in the observation matrix D;
and S3, performing a vector-to-matrix (reshape) operation on the low-rank matrix and the sparse matrix respectively to obtain a background image frame sequence and a foreground moving target image frame sequence.
Optionally, the step S2 includes:
s21, establishing a cost function according to the robust principal component analysis theory:
    min_{L,S} rank(L) + λ||S||_0   s.t.  D = L + S   (formula one)
where L ∈ R^(M×N) is the low-rank matrix to be determined, S ∈ R^(M×N) is the sparse matrix to be determined, rank(·) denotes the rank function of a matrix, λ is the balance factor, ||·||_0 denotes the l0 norm, and D = L + S is the constraint;
S22, defining the weighted nuclear norm of the low-rank matrix L as
    ||L||_{w,*} = Σ_i w_i σ_i(L)
and defining the structured sparse norm of the matrix S as
    γ(S) = Σ_{j=1..N} Σ_{g∈G} ||s_j^g||_∞,
where w = [w_1, …, w_n], 0 ≤ w_1 ≤ … ≤ w_n, σ_1(L) ≥ … ≥ σ_n(L) > 0, and σ_i(L) denotes the ith singular value of the matrix L;
in the matrix S, the jth column s_j is obtained by vectorizing the foreground moving target image F_j ∈ R^(a×b) of the jth frame; s_j ∈ R^M is an M-dimensional vector, the indices of the vector elements of s_j are called pixel indices, and the set of all pixel indices is denoted Θ = {1, 2, …, M};
a defined e × e sliding window slides over the matrix F_j row by row and column by column; the pixels covered by the sliding window at one position are denoted s_j^g, where g is the set of indices of the covered pixels, a subset of the set Θ; the sets g corresponding to sliding windows at different positions form a set denoted G, and ||·||_∞ denotes the l∞ norm;
S23, replacing rank(L) in formula one with the weighted nuclear norm ||L||_{w,*} defined in sub-step S22 and replacing ||S||_0 in formula one with the structured sparse norm γ(S), obtaining a new cost function:
    min_{L,S} ||L||_{w,*} + λγ(S)   s.t.  D = L + S   (formula two)
optionally, the step S2 further includes:
S24, processing formula two with the augmented Lagrange multiplier method to obtain formula three:
    l(L, S, Y, μ) = ||L||_{w,*} + λγ(S) + ⟨Y, D - L - S⟩ + (μ/2)||D - L - S||_F^2   (formula three)
where Y ∈ R^(M×N) denotes the Lagrange multiplier matrix, μ > 0 denotes the penalty factor, ⟨·,·⟩ denotes the inner product operation, and ||·||_F denotes the Frobenius norm;
and S25, minimizing formula three by loop iteration to obtain L, S, and Y.
Optionally, the sub-step S25 includes:
initializing the parameters μ_0 > 0, ρ > 1, θ > 0, k = 0, L_0 = D, Y_0 = 0;
minimizing formula three by loop iteration, where L_{k+1}, S_{k+1}, and Y_{k+1} are given as follows:
    L_{k+1} = argmin_L l(L, S_k, Y_k, μ_k)
    S_{k+1} = argmin_S l(L_{k+1}, S, Y_k, μ_k)
    Y_{k+1} = Y_k + μ_k(D - L_{k+1} - S_{k+1})
    μ_{k+1} = ρμ_k   (formula four)
for the L_{k+1} update in formula four, let G_L = D - S_k + Y_k/μ_k; then L_{k+1} is determined by the following formula five:
    L_{k+1} = U S_{w/μ_k}(Σ) V^T,   (U, Σ, V) = svd(G_L)   (formula five)
judging whether
    ||D - L_{k+1} - S_{k+1}||_F / ||D||_F < θ
holds;
if it holds, the iteration terminates and, denoting k + 1 = q, the L_q and S_q obtained by the iteration are the low-rank matrix and the sparse matrix; otherwise, let k = k + 1 and repeat the iterative process until
    ||D - L_{k+1} - S_{k+1}||_F / ||D||_F < θ
holds;
where svd(·) denotes the singular value decomposition, i.e., G_L = UΣV^T, U and V are unitary matrices, and Σ is a diagonal matrix; the singular values σ_i(G_L) of G_L are arranged in decreasing order on the diagonal of Σ; the singular value shrinkage operator S_{w/μ_k}(Σ) is a diagonal matrix whose diagonal elements are defined as follows:
    [S_{w/μ_k}(Σ)]_{ii} = max(σ_i(G_L) - w_i/μ_k, 0).
optionally, the step S3 includes:
B_i = reshape(l_i, a×b), B_i ∈ R^(a×b), i = 1, …, N, is the determined background image frame sequence; wherein:
reshape(l_i, a×b) denotes the vector-to-matrix (reshape) function, L_q = [l_1, l_2, …, l_N], l_i ∈ R^(M×1), i = 1, …, N, reshape: R^(M×1) → R^(a×b), M = ab;
F_i = reshape(s_i, a×b), F_i ∈ R^(a×b), i = 1, …, N, is the determined foreground moving target image frame sequence; S_q = [s_1, s_2, …, s_N], s_i ∈ R^(M×1), i = 1, …, N.
Optionally, in formula three, the balance factor λ is set according to the size of the observation matrix D (for example, the classical robust PCA choice λ = 1/√max(M, N));
e in the e × e sliding window takes a value of 3 to 5;
the penalty factor μ_0 is initialized in inverse proportion to σ_1(D) (for example, μ_0 = 1.25/σ_1(D)); ρ = 1.05; θ = 1×10^-8, where σ_1(D) is the largest singular value of the observation matrix D.
Optionally, the step S1 includes:
S11, performing grayscale processing on each frame of the video image, the N grayscale frames being I_1, …, I_N, each with resolution a × b, i.e., I_i ∈ R^(a×b), i = 1, …, N, where R^(a×b) is the real space of size a × b;
s12, sequentially vectorizing each frame of image after gray level processing, and constructing an observation matrix D according to each vectorized frame of image;
wherein D = [Vec(I_1), …, Vec(I_N)] ∈ R^(M×N);
R^(M×N) denotes the real space of size M × N, Vec(I_i) denotes the vectorization function, Vec: R^(a×b) → R^(M×1), and M = ab.
In another aspect, the present invention further provides an apparatus for monitoring a moving object in a video image, including:
the observation matrix constructing unit is used for constructing an observation matrix D of the video image to be processed;
the low-rank matrix and sparse matrix acquisition unit is used for acquiring a low-rank matrix and a sparse matrix according to a robust principal component analysis theory, a predefined weighted nuclear norm, a predefined structured sparse norm and the observation matrix D; each column of the low-rank matrix is obtained by vectorizing a background image of a corresponding frame in the observation matrix D, and each column of the sparse matrix is obtained by vectorizing a foreground moving target image of the corresponding frame in the observation matrix D;
and the processing unit is used for performing a vector-to-matrix (reshape) operation on the low-rank matrix and the sparse matrix respectively to obtain a background image frame sequence and a foreground moving target image frame sequence.
In still another aspect, the present invention further provides an image processing apparatus, including the above apparatus for monitoring a moving object in a video image, and an image monitoring apparatus for acquiring the video image;
and the image monitoring device sends the acquired video image to a device for monitoring a moving target in the video image for processing.
The invention has the following beneficial effects:
The method for monitoring a moving target in a video image provided by the invention proposes a new cost function based on the robust principal component analysis principle and minimizes it with the alternating direction method of multipliers, so that the moving target in the surveillance video is detected and tracked more effectively and accurately.
That is to say, in the invention, the weighted nuclear norm is used as the low-rank constraint on the background image matrix, so that the shrinkage threshold decreases monotonically with the corresponding singular value and large singular values are compressed by a smaller amount. Using the structured sparse norm as the sparsity constraint on the foreground matrix effectively exploits the prior knowledge that a foreground moving target occupies a contiguous spatial region. The weighted nuclear norm and the structured sparse norm form a new cost function that is optimized by the alternating direction method of multipliers, so that the moving target in the surveillance video can be detected and tracked more effectively and accurately.
Drawings
FIG. 1 is a schematic diagram of a robust principal component analysis;
FIG. 2 is an overall flow diagram of an embodiment of the invention;
FIG. 3 is a flow chart of step 103 in an embodiment of the present invention;
FIG. 4 is a diagram illustrating the effect of foreground recognition of a frame in 6 scenes by using the method of the present invention and the conventional method;
FIG. 5 is a schematic diagram of the principle of the structured sparse norm, in which (a)-(b) show two distributions of sparse pixels and (c)-(d) show several example positions of the 3 × 3 sliding window in the two cases.
Detailed Description
For the purpose of better explaining the present invention and to facilitate understanding, the present invention will be described in detail by way of specific embodiments with reference to the accompanying drawings.
All technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
Currently, moving object detection separates the background and the moving target in a video sequence and is an important basic step in intelligent video analysis. A surveillance video image can be regarded as the superposition of the background and the foreground moving target, as shown in FIG. 1. Robust principal component analysis exploits the low-rank property of the background image sequence and the sparsity of the foreground moving target, and separates the background from the foreground moving target by establishing a cost function and minimizing it.
The existing cost function uses the nuclear norm to approximate the rank function of the matrix as the low-rank constraint; this approximation has a large error and is not accurate enough, which affects the separation. On the other hand, the existing cost function uses the l1 norm as the sparsity constraint; this treats each pixel independently and does not use the prior knowledge of the regional continuity of the foreground moving target, which affects the detection of the foreground moving target.
The embodiment of the invention provides a new cost function and minimizes it with the alternating direction method of multipliers to detect the foreground moving target. In the new cost function, the weighted nuclear norm is adopted to approximate the rank function of the matrix as the low-rank constraint on the background image, so the approximation is better and the precision higher; the structured sparse norm is used as the sparsity constraint on the foreground target, making full use of the prior knowledge of the spatial region continuity of the foreground.
Example one
As shown in fig. 2, the present invention provides a method for monitoring a moving object in a video image, the method comprising the following steps:
101, constructing an observation matrix D of a video image to be processed;
specifically, in this step 101, graying and vectorizing operations may be performed on each frame of image of the obtained monitoring video sequence to construct an observation matrix.
For example, the acquired N frames of surveillance video are converted to grayscale and the grayed N frames are denoted I_1, …, I_N; the resolution of each frame is a × b, i.e., I_i ∈ R^(a×b), i = 1, …, N, where R^(a×b) denotes the real space of size a × b.
I_1, …, I_N are then vectorized in turn to construct the observation matrix D ∈ R^(M×N), where M = ab and R^(M×N) denotes the real space of size M × N. The specific operation is as follows:
    D = [Vec(I_1), …, Vec(I_N)] ∈ R^(M×N),
where Vec(I_i) denotes the vectorization function Vec: R^(a×b) → R^(M×1), M = ab, which concatenates the columns of the matrix I_i ∈ R^(a×b) from left to right into an M × 1 vector.
And 102, establishing a new cost function for solving the background image and the foreground moving target image according to a robust principal component analysis theory, a predefined weighted nuclear norm and a structure sparse norm.
And 103, combining the observation matrix, performing iterative optimization on the cost function established in the step 102, and solving a low-rank matrix and a sparse matrix.
In this step, each column of the low-rank matrix is obtained by vectorizing the background image of the corresponding frame (i.e., the frame corresponding to the column) in the observation matrix D, and each column of the sparse matrix is obtained by vectorizing the foreground moving object image of the corresponding frame in the observation matrix D.
And 104, applying a vector-to-matrix (reshape) operation to each column of the low-rank matrix and of the sparse matrix, and obtaining the background image frame sequence and the foreground moving target image frame sequence of the surveillance video.
In step 104, each column of the low-rank matrix L_{k+1} is regarded as the vectorization of a background image frame and each column of the sparse matrix S_{k+1} as the vectorization of a foreground moving target image frame; applying a vector-to-matrix operation to each column of L_{k+1} and S_{k+1} yields the background frame image sequence and the foreground moving target frame image sequence.
For the above step 102, the following is illustrated:
102-1, establishing a cost function according to a robust principal component analysis theory:
    min_{L,S} rank(L) + λ||S||_0   s.t.  D = L + S   (1-1)
where L ∈ R^(M×N) is the low-rank matrix to be solved, consisting of the background image sequence; S ∈ R^(M×N) is the sparse matrix to be solved, consisting of the foreground image sequence; rank(·) denotes the rank function of a matrix; λ is the balance factor; ||·||_0 denotes the l0 norm; and s.t. in formula (1-1) is the abbreviation of "subject to", i.e., the constraint.
102-2, defining the weighted nuclear norm of the matrix L as
    ||L||_{w,*} = Σ_i w_i σ_i(L),
where w = [w_1, …, w_n], 0 ≤ w_1 ≤ … ≤ w_n, σ_1(L) ≥ … ≥ σ_n(L) > 0, and σ_i(L) denotes the ith singular value of the matrix L.
102-3, defining the structured sparse norm γ(S) of the matrix S.
The matrix formed by the foreground moving target frame sequence to be solved is S ∈ R^(M×N); its jth column s_j ∈ R^M is obtained by vectorizing the foreground moving target image F_j ∈ R^(a×b) of the jth frame. s_j ∈ R^M is an M-dimensional vector; the index of a vector element is called its pixel index, and the set of all pixel indices is denoted Θ = {1, 2, …, M}.
In this embodiment, an e × e sliding window may be designed (e.g., e may take a value of 3 to 5) and slid over the matrix F_j row by row and column by column. The pixels covered by the window at one position are denoted s_j^g, where g is the set of indices of the covered pixels, a subset of the set Θ; the sets g corresponding to windows at different positions form a set denoted G. The structured sparse norm is defined as
    γ(S) = Σ_{j=1..N} Σ_{g∈G} ||s_j^g||_∞,
where ||·||_∞ denotes the l∞ norm.
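A direct computation of γ(S) following this definition might look as follows (a sketch; the frame size (a, b), the window size e, and the function name are illustrative assumptions):

    import numpy as np

    def structured_sparse_norm(S, a, b, e=3):
        """gamma(S): for every frame (column of S), slide an e x e window over the a x b
        foreground image and sum the l-infinity norms over all window positions."""
        total = 0.0
        for j in range(S.shape[1]):
            F_j = S[:, j].reshape((a, b), order='F')   # undo the column-wise vectorization
            for r in range(a - e + 1):                 # slide row by row ...
                for c in range(b - e + 1):             # ... and column by column
                    window = F_j[r:r + e, c:c + e]     # the pixels s_j^g covered at this position
                    total += np.max(np.abs(window))    # l-infinity norm of the covered pixels
        return total

Because each window contributes only its largest covered value, pixels clustered into a contiguous region share windows and contribute less in total than the same pixels scattered apart, which is exactly the preference illustrated in FIG. 5.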
102-4, replacing rank(L) with the weighted nuclear norm ||L||_{w,*} and replacing ||S||_0 with the structured sparse norm γ(S), formula (1-1) is transformed into the new cost function
    min_{L,S} ||L||_{w,*} + λγ(S)   s.t.  D = L + S.   (1-2)
Accordingly, step 103, comprises the following sub-steps:
103-1, establishing the augmented Lagrangian function (1-3) of formula (1-2):
    l(L, S, Y, μ) = ||L||_{w,*} + λγ(S) + ⟨Y, D - L - S⟩ + (μ/2)||D - L - S||_F^2,   (1-3)
where Y ∈ R^(M×N) denotes the Lagrange multiplier matrix, μ > 0 denotes the penalty factor, ⟨·,·⟩ denotes the inner product operation, and ||·||_F denotes the Frobenius norm.
Optionally, λ may be set according to the size of the observation matrix D (for example, the classical robust PCA choice λ = 1/√max(M, N)).
103-2, initializing the parameters μ_0 > 0, ρ > 1, θ > 0, k = 0, L_0 = D, Y_0 = 0.
For example, one may set μ_0 in inverse proportion to σ_1(D) (e.g., μ_0 = 1.25/σ_1(D)), ρ = 1.05, and θ = 1×10^-8, where σ_1(D) is the largest singular value of the observation matrix D.
103-3, minimizing the augmented Lagrangian (1-3) by loop iteration to determine L, S, and Y:
    L_{k+1} = argmin_L l(L, S_k, Y_k, μ_k)
    S_{k+1} = argmin_S l(L_{k+1}, S, Y_k, μ_k)
    Y_{k+1} = Y_k + μ_k(D - L_{k+1} - S_{k+1})
    μ_{k+1} = ρμ_k   (1-5)
In the first formula of (1-5), let G_L = D - S_k + Y_k/μ_k; then L_{k+1} is determined by:
    L_{k+1} = U S_{w/μ_k}(Σ) V^T,   (U, Σ, V) = svd(G_L).   (1-6)
In this formula, svd(·) denotes the singular value decomposition, i.e., G_L = UΣV^T, where U and V are unitary matrices and Σ is a diagonal matrix with the singular values σ_i(G_L) of G_L arranged in decreasing order on its diagonal. The weight w_i in formula (1-6) is chosen to decrease monotonically as the corresponding singular value σ_i(L_k) increases (for example, w_i = C/σ_i(L_k) for a positive constant C).
The singular value shrinkage operator S_{w/μ_k}(Σ) is a diagonal matrix whose diagonal elements are defined as follows:
    [S_{w/μ_k}(Σ)]_{ii} = max(σ_i(G_L) - w_i/μ_k, 0).   (1-7)
In formula (1-7), a large singular value is compressed by a smaller amount (its threshold w_i/μ_k is smaller), thereby improving the quality of the background image obtained by the separation.
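The L-update of formulas (1-6) and (1-7) can be sketched as follows (a sketch only; the weights are computed here from the singular values of G_L with an assumed inverse-proportional form, whereas the text ties w_i to σ_i(L_k), and the constant C and eps are illustrative):

    import numpy as np

    def update_L(D, S_k, Y_k, mu_k, C=1.0, eps=1e-8):
        """Formulas (1-6)/(1-7): weighted singular value shrinkage of G_L = D - S_k + Y_k/mu_k."""
        G_L = D - S_k + Y_k / mu_k
        U, sigma, Vt = np.linalg.svd(G_L, full_matrices=False)  # singular values in decreasing order
        w = C / (sigma + eps)                       # assumed weights: decrease as the singular value grows
        shrunk = np.maximum(sigma - w / mu_k, 0.0)  # max(sigma_i - w_i/mu_k, 0)
        return (U * shrunk) @ Vt                    # U S_{w/mu_k}(Sigma) V^T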
103-4, judging whether
    ||D - L_{k+1} - S_{k+1}||_F / ||D||_F < θ
holds. If it holds, the iteration terminates and, denoting k + 1 = q, the L_q and S_q obtained by the iteration are the desired low-rank matrix and sparse matrix; otherwise, let k = k + 1 and return to step 103-3, as shown in FIG. 3.
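The outer loop of steps 103-2 to 103-4 can be sketched as follows. This is only a sketch: the S-update uses plain element-wise soft-thresholding as a stand-in, because the proximal step for the structured sparse norm γ(S) is not reproduced here; update_L is the helper sketched above, and the parameter defaults are illustrative.

    import numpy as np

    def rpca_wnn_admm(D, lam, mu0, rho=1.05, theta=1e-8, max_iter=500):
        """Alternating direction iteration of steps 103-2 to 103-4 (sketch)."""
        L = D.copy()                        # L_0 = D
        S = np.zeros_like(D)
        Y = np.zeros_like(D)                # Y_0 = 0
        mu = mu0
        for k in range(max_iter):
            L = update_L(D, S, Y, mu)       # weighted singular value shrinkage (formula 1-6)
            # S-update: stand-in soft-thresholding; the method instead minimizes lam * gamma(S)
            G_S = D - L + Y / mu
            S = np.sign(G_S) * np.maximum(np.abs(G_S) - lam / mu, 0.0)
            Y = Y + mu * (D - L - S)        # multiplier update
            mu = rho * mu                   # penalty update
            # stopping rule of step 103-4
            if np.linalg.norm(D - L - S, 'fro') < theta * np.linalg.norm(D, 'fro'):
                break
        return L, S

Here lam and mu0 would be chosen as in step 103-2, e.g., lam = 1/sqrt(max(M, N)) and mu0 inversely proportional to the largest singular value of D.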
In addition, the step 104 may include the following sub-steps:
104-1, denote L_q = [l_1, l_2, …, l_N], where l_i ∈ R^(M×1), i = 1, …, N. A vector-to-matrix (reshape) operation is applied to each column of L_q: B_i = reshape(l_i, a×b), where B_i ∈ R^(a×b), i = 1, …, N, which is the determined background frame image sequence; here reshape(l_i, a×b) denotes the vector-to-matrix function reshape: R^(M×1) → R^(a×b), M = ab.
104-2, denote S_q = [s_1, s_2, …, s_N], where s_i ∈ R^(M×1), i = 1, …, N. A vector-to-matrix operation is applied to each column of S_q: F_i = reshape(s_i, a×b), F_i ∈ R^(a×b), i = 1, …, N, which is the determined foreground moving target image frame sequence.
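Step 104 simply inverts the vectorization of step 101; a sketch (the column-major order matches the Vec(·) used to build D):

    def columns_to_frames(X, a, b):
        """reshape: R^(M x 1) -> R^(a x b) applied to every column of X (M x N)."""
        return [X[:, i].reshape((a, b), order='F') for i in range(X.shape[1])]

    # background frames B_i and foreground frames F_i, as in steps 104-1 and 104-2:
    # B = columns_to_frames(L_q, a, b)
    # F = columns_to_frames(S_q, a, b)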
In the embodiment, a cost function for measuring low rank of a background image and sparsity of a foreground moving target is provided, and the cost function is subjected to minimization optimization to realize detection and tracking of the foreground moving target.
The above-described embodiments may have the following advantages:
1. In the new cost function, the weighted nuclear norm replaces the nuclear norm widely adopted by existing methods as the low-rank constraint on the background image, and the weight decreases monotonically with the corresponding singular value. In this way, when the low-rank matrix (background image) is solved iteratively with the singular value shrinkage operator, large singular values are compressed by a smaller amount, which improves the quality of the separated background image.
2. In the new cost function, the structured sparse norm replaces the l1 norm widely adopted by existing methods as the sparsity constraint on the foreground matrix; this makes full use of the prior knowledge that a foreground moving target is spatially contiguous and effectively improves the detection of the foreground moving target.
3. The new cost function is minimized with the efficient alternating direction method of multipliers; through multiple iterations, the low-rank structure (corresponding to the background image) and the sparse structure (corresponding to the foreground moving target) of the original video matrix are obtained simultaneously.
In addition, the method of this embodiment was tested on real videos of the CDNET2014 data set, and 6 scenes (Highway, Office, Pedestrian, PETS2006, Overpass, and Canoe) were selected for a comparative analysis of the algorithms. Following common practice, the overall performance of the method is measured by the F-measure. Fig. 4 shows the foreground identification results of each algorithm for one frame of motion in the 6 scenes. The first column is the surveillance video frame, the second column is the real motion foreground (ground truth) provided by the data set, the third column is the motion foreground obtained by the prior-art anti-noise moving target detection algorithm based on low-rank matrix decomposition, and the fourth column is the motion foreground obtained by the method of the invention. As can be seen from FIG. 4, compared with the conventional method, the present method yields a clear foreground contour and fewer holes in the moving target, i.e., a better detection effect.
Further, Table 1 below lists the F-measure comparison of the two methods. As can be seen from Table 1, the performance of the method of the invention is superior to the conventional method in every scenario.
TABLE 1. F-measure comparison between the conventional method and the proposed method in 6 scenarios

Method                Office      Highway     Pedestrian   PETS2006    Overpass    Canoe
Conventional method   0.880224    0.820563    0.916904     0.855262    0.737762    0.702293
Proposed method       0.909801    0.910037    0.918604     0.917042    0.752604    0.846014
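The F-measure values of Table 1 compare binarized foreground masks against the ground truth; a typical computation is sketched below (the binarization threshold is an assumption, not something specified by this description):

    import numpy as np

    def f_measure(foreground, ground_truth, threshold=25.0):
        """F-measure (F1) between a recovered foreground frame and a binary ground-truth mask."""
        pred = np.abs(foreground) > threshold      # binarize the recovered foreground
        gt = ground_truth > 0
        tp = np.logical_and(pred, gt).sum()
        fp = np.logical_and(pred, ~gt).sum()
        fn = np.logical_and(~pred, gt).sum()
        precision = tp / (tp + fp + 1e-12)
        recall = tp / (tp + fn + 1e-12)
        return 2 * precision * recall / (precision + recall + 1e-12)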
Example two
In this embodiment, the method of the present invention is applied to the monitoring video of the CDNET2014Office scene to detect and track the moving object.
201, performing a graying operation on the acquired 100 frames of three-channel color surveillance video and denoting the 100 grayed frames as I_1, …, I_100; the resolution of each frame is 360 × 240;
202, vectorizing I_1, …, I_100 in turn to construct the observation matrix D ∈ R^(86400×100), where 86400 = 360 × 240. The operation is as follows:
    D = [Vec(I_1), …, Vec(I_100)] ∈ R^(86400×100),
where Vec(I_i) denotes the vectorization function, which concatenates the columns of the matrix I_i from left to right into an 86400 × 1 vector.
203, establishing a cost function based on a robust principal component analysis theory:
    min_{L,S} rank(L) + λ||S||_0   s.t.  D = L + S   (2-1)
where:
L ∈ R^(86400×100) is the low-rank matrix to be solved, consisting of the background image sequence; S ∈ R^(86400×100) is the sparse matrix to be solved, consisting of the foreground image sequence; rank(·) denotes the rank function of a matrix; λ is the balance factor; and ||·||_0 denotes the l0 norm.
204, defining the weighted nuclear norm of the matrix L as
    ||L||_{w,*} = Σ_i w_i σ_i(L),
where w = [w_1, …, w_n], 0 ≤ w_1 ≤ … ≤ w_n, σ_1(L) ≥ … ≥ σ_n(L) > 0, and σ_i(L) denotes the ith singular value of the matrix L. The weight w_i decreases monotonically with the magnitude of the corresponding singular value σ_i(L): larger singular values are given smaller weights, so they are compressed less when the weighted nuclear norm is minimized.
205, defining the structured sparse norm of the matrix S. The matrix formed by the foreground moving target image frame sequence to be solved is S ∈ R^(86400×100); its jth column s_j ∈ R^86400 is obtained by vectorizing the foreground moving target image F_j ∈ R^(360×240) of the jth frame, and the set of pixel indices it contains is denoted Θ = {1, 2, …, 86400}.
A 3 × 3 sliding window is designed and slid over the matrix F_j row by row and column by column. The pixels covered by the window at one position are denoted s_j^g, where g is the set of indices of the covered pixels, a subset of the set Θ; the sets g corresponding to windows at different positions form a set denoted G. The structured sparse norm is defined as
    γ(S) = Σ_{j=1..100} Σ_{g∈G} ||s_j^g||_∞.
Referring to FIG. 5, which further illustrates the idea of the structured sparse norm: suppose plots (a) and (b) are two candidate foreground moving targets for one frame during the optimization process, where the white points in FIG. 5 represent foreground moving target pixels (the numbers being their pixel values) and the black points represent background pixels (pixel value 0). If the l1 norm is used to measure sparsity, plots (a) and (b) give the same result. However, from the prior knowledge that a foreground moving target occupies a contiguous spatial region, image (a) is clearly more likely to be the foreground target image.
Now consider a 3 × 3 square window slid row by row and column by column, so that two adjacent window positions overlap in 6 pixels; the l∞ norm is taken within each window, and the l∞ norms over all window positions are summed as the measure of sparsity. With this measure, the sparsity of plot (a) (420) is significantly smaller than that of plot (b) (680), so plot (a) is favored as the foreground target during optimization, in accordance with the requirement that the sparsity of the moving target matrix be minimized.
It can be seen that, by employing the structured sparse norm γ(S) instead of the l1 norm as the sparsity constraint on the foreground moving target, this embodiment makes full use of the prior knowledge of the spatial region continuity of the foreground and detects the foreground moving target more accurately; a small numerical illustration is sketched below.
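The comparison can be checked on toy masks (the two 12 × 12 masks below are illustrative, not the masks of FIG. 5; structured_sparse_norm is the sketch given in Example one):

    import numpy as np

    # two candidate foreground frames with the same nine non-zero pixels: one contiguous, one scattered
    contiguous = np.zeros((12, 12))
    contiguous[4:7, 4:7] = 100.0
    scattered = np.zeros((12, 12))
    scattered[::4, ::4] = 100.0

    for name, mask in [("contiguous", contiguous), ("scattered", scattered)]:
        S = mask.flatten(order='F')[:, None]             # a one-column "video" matrix
        l1 = np.abs(S).sum()                             # l1 measure: identical for both layouts
        gamma = structured_sparse_norm(S, 12, 12, e=3)   # structured measure: smaller for the blob
        print(name, l1, gamma)

Both masks have the same l1 norm, but the structured measure is markedly smaller for the contiguous blob, so minimizing γ(S) favors the spatially continuous candidate.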
206, using the weighted nuclear norm ||L||_{w,*} as the low-rank constraint on the background image and the structured sparse norm γ(S) as the sparsity constraint on the foreground target, equation (2-1) is transformed into the following new cost function:
    min_{L,S} ||L||_{w,*} + λγ(S)   s.t.  D = L + S.   (2-2)
207, the balance factor λ is taken to be a value determined by the size of the observation matrix D (for example, λ = 1/√max(86400, 100) ≈ 0.0034).
The augmented Lagrangian function (2-4) is then established:
    l(L, S, Y, μ) = ||L||_{w,*} + λγ(S) + ⟨Y, D - L - S⟩ + (μ/2)||D - L - S||_F^2,   (2-4)
where Y denotes the Lagrange multiplier matrix, μ > 0 denotes the penalty factor, ⟨·,·⟩ denotes the inner product operation, and ||·||_F denotes the Frobenius norm.
208, initializing the parameters as in step 103-2 of Example one (μ_0 > 0 set in inverse proportion to σ_1(D), ρ = 1.05, θ = 1×10^-8, k = 0, L_0 = D, Y_0 = 0), and minimizing equation (2-4) by loop iteration to determine L, S, and Y:
    L_{k+1} = argmin_L l(L, S_k, Y_k, μ_k)
    S_{k+1} = argmin_S l(L_{k+1}, S, Y_k, μ_k)
    Y_{k+1} = Y_k + μ_k(D - L_{k+1} - S_{k+1})
    μ_{k+1} = ρμ_k   (2-5)
In the first formula of (2-5), let G_L = D - S_k + Y_k/μ_k; then L_{k+1} is determined by:
    L_{k+1} = U S_{w/μ_k}(Σ) V^T,   (U, Σ, V) = svd(G_L).   (2-6)
In this formula, svd(·) denotes the singular value decomposition, i.e., G_L = UΣV^T, where U and V are unitary matrices and Σ is a diagonal matrix with the singular values σ_i(G_L) arranged in decreasing order on its diagonal. The singular value shrinkage operator S_{w/μ_k}(Σ) is a diagonal matrix whose diagonal elements are defined as follows:
    [S_{w/μ_k}(Σ)]_{ii} = max(σ_i(G_L) - w_i/μ_k, 0).   (2-7)
From the above equation and step 204, the compression threshold w_i/μ_k applied to the singular value σ_i(G_L) decreases monotonically as the corresponding singular value σ_i(G_L) increases. In this way, small singular values are compressed more and approach zero at a faster rate, while large singular values are compressed less, so the quality of the background image is effectively preserved and a better background-foreground separation is obtained.
209, judging whether
    ||D - L_{k+1} - S_{k+1}||_F / ||D||_F < θ
holds. If it holds, the iteration terminates and, denoting k + 1 = q, the L_q and S_q obtained by the iteration are the desired low-rank matrix and sparse matrix; otherwise, let k = k + 1 and return to step 208.
210, denote L_q = [l_1, l_2, …, l_100], where l_i ∈ R^(86400×1), i = 1, …, 100. A vector-to-matrix operation is applied to each column of L_q: B_i = reshape(l_i, 360×240), where:
B_i ∈ R^(360×240), i = 1, …, 100, and reshape(l_i, 360×240) denotes the vector-to-matrix function reshape: R^(86400×1) → R^(360×240). B_i, i = 1, …, 100, is the obtained background image frame sequence.
211, denote S_q = [s_1, s_2, …, s_100], where:
s_i ∈ R^(86400×1), i = 1, …, 100.
A vector-to-matrix operation is applied to each column of S_q: F_i = reshape(s_i, 360×240), where F_i ∈ R^(360×240), i = 1, …, 100. F_i, i = 1, …, 100, is the obtained foreground moving target image frame sequence.
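Tying the sketches from Example one together for this 100-frame, 360 × 240 example (the file paths and the λ and μ_0 choices are illustrative assumptions; build_observation_matrix, rpca_wnn_admm, and columns_to_frames are the helpers sketched earlier):

    import numpy as np

    frame_paths = ["office/in%06d.jpg" % i for i in range(1, 101)]   # assumed frame files

    D = build_observation_matrix(frame_paths)       # D in R^(86400 x 100)
    lam = 1.0 / np.sqrt(max(D.shape))               # illustrative balance factor
    mu0 = 1.25 / np.linalg.norm(D, 2)               # illustrative penalty initialization (1/sigma_1(D) scale)
    L_q, S_q = rpca_wnn_admm(D, lam, mu0)           # steps 203-209 (sketch)

    # (a, b) = (360, 240) follows the notation of this example; adjust to the actual frame array shape
    backgrounds = columns_to_frames(L_q, 360, 240)  # B_i, step 210
    foregrounds = columns_to_frames(S_q, 360, 240)  # F_i, step 211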
Fig. 4, row 1, shows the results of this example: the fourth image is the foreground moving target image obtained by the method of this embodiment, and the third image is the moving foreground obtained by applying the prior-art anti-noise moving target detection algorithm based on low-rank matrix decomposition. Comparing the two images, the moving foreground contour obtained by the present method is distinct and clear, and the holes inside the target are clearly reduced relative to the conventional method. As can be seen from Table 1, the F-measure of 0.909801 is significantly higher than the 0.880224 of the conventional method, which further illustrates the effectiveness of the method of this embodiment.
In addition, the invention also provides a device for monitoring the moving target in the video image, which comprises an observation matrix constructing unit, a low-rank matrix and sparse matrix acquiring unit and a processing unit;
wherein the observation matrix constructing unit is used for constructing an observation matrix D of the video image to be processed;
the low-rank matrix and sparse matrix acquisition unit is used for acquiring a low-rank matrix and a sparse matrix according to a robust principal component analysis theory, a predefined weighted nuclear norm, a predefined structured sparse norm and the observation matrix D; the elements of each column of the low-rank matrix are obtained by vectorizing the background image of the corresponding frame in the observation matrix D, and the elements of each column of the sparse matrix are obtained by vectorizing the foreground moving target image of the corresponding frame in the observation matrix D;
and the processing unit is used for performing a vector-to-matrix (reshape) operation on the low-rank matrix and the sparse matrix respectively to obtain a background image frame sequence and a foreground moving target image frame sequence.
The apparatus of this embodiment can perform the method of any of the above embodiments, and reference is made to the above description, which is not repeated herein.
In addition, the embodiment of the present invention further provides an image processing apparatus, where the image processing apparatus may include an image monitoring apparatus for acquiring a video image and the apparatus for monitoring a moving object in the video image;
and the image monitoring device sends the acquired video image to a device for monitoring a moving target in the video image for processing.
The image processing device of this embodiment may be a background monitoring server or other servers, and the present embodiment does not limit this. The image processing equipment of the embodiment can realize more effective and accurate detection and tracking of the moving target in the monitoring video image, and improve the quality of the background image.
Finally, it should be noted that: the above-mentioned embodiments are only used for illustrating the technical solution of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (5)

1. A method for monitoring a moving object in a video image, comprising:
s1, constructing an observation matrix D of the video image to be processed;
S2, acquiring a low-rank matrix and a sparse matrix according to a robust principal component analysis theory, a predefined weighted nuclear norm, a structured sparse norm and the observation matrix D; the elements of each column of the low-rank matrix are obtained by vectorizing the background image of the corresponding frame in the observation matrix D, and the elements of each column of the sparse matrix are obtained by vectorizing the foreground moving target image of the corresponding frame in the observation matrix D;
S3, performing a vector-to-matrix (reshape) operation on the low-rank matrix and the sparse matrix respectively to obtain a background image frame sequence and a foreground moving target image frame sequence;
wherein the step S2 includes:
s21, establishing a cost function according to the robust principal component analysis theory:
    min_{L,S} rank(L) + λ||S||_0   s.t.  D = L + S   (formula one)
where L ∈ R^(M×N) is the low-rank matrix to be determined, S ∈ R^(M×N) is the sparse matrix to be determined, rank(·) denotes the rank function of a matrix, λ is the balance factor, ||·||_0 denotes the l0 norm, and D = L + S is the constraint;
S22, defining the weighted nuclear norm of the low-rank matrix L as
    ||L||_{w,*} = Σ_i w_i σ_i(L)
and defining the structured sparse norm of the sparse matrix S as
    γ(S) = Σ_{j=1..N} Σ_{g∈G} ||s_j^g||_∞,
where w = [w_1, …, w_n], 0 ≤ w_1 ≤ … ≤ w_n, σ_1(L) ≥ … ≥ σ_n(L) > 0, and σ_i(L) denotes the ith singular value of the low-rank matrix L;
in the sparse matrix S, the jth column s_j is obtained by vectorizing the foreground moving target image F_j ∈ R^(a×b) of the jth frame; s_j ∈ R^M is an M-dimensional vector, the indices of the vector elements of s_j are called pixel indices, and the set of all pixel indices is denoted Θ = {1, 2, …, M};
a defined e × e sliding window slides over the matrix F_j row by row and column by column; the pixels covered by the sliding window at one position are denoted s_j^g, where g is the set of indices of the covered pixels, a subset of the set Θ; the sets g corresponding to sliding windows at different positions form a set denoted G, and ||·||_∞ denotes the l∞ norm;
S23, replacing rank(L) in formula one with the weighted nuclear norm ||L||_{w,*} defined in sub-step S22 and replacing ||S||_0 in formula one with the structured sparse norm γ(S), obtaining a new cost function:
    min_{L,S} ||L||_{w,*} + λγ(S)   s.t.  D = L + S   (formula two)
the step S2 further includes:
S24, processing formula two with the augmented Lagrange multiplier method to obtain formula three:
    l(L, S, Y, μ) = ||L||_{w,*} + λγ(S) + ⟨Y, D - L - S⟩ + (μ/2)||D - L - S||_F^2   (formula three)
where Y ∈ R^(M×N) denotes the Lagrange multiplier matrix, μ > 0 denotes the penalty factor, ⟨·,·⟩ denotes the inner product operation, and ||·||_F denotes the Frobenius norm;
S25, minimizing formula three by loop iteration to obtain L, S, and Y;
in formula three, the balance factor λ is set according to the size of the observation matrix D (for example, λ = 1/√max(M, N));
e in the e × e sliding window takes a value of 3 to 5;
the penalty factor μ_0 is initialized in inverse proportion to σ_1(D) (for example, μ_0 = 1.25/σ_1(D)); ρ = 1.05; θ = 1×10^-8, where σ_1(D) is the largest singular value of the observation matrix D;
the sub-step S25 includes:
initializing the parameters μ_0 > 0, ρ > 1, θ > 0, k = 0, L_0 = D, Y_0 = 0;
minimizing formula three by loop iteration, where L_{k+1}, S_{k+1}, and Y_{k+1} are given as follows:
    L_{k+1} = argmin_L l(L, S_k, Y_k, μ_k)
    S_{k+1} = argmin_S l(L_{k+1}, S, Y_k, μ_k)
    Y_{k+1} = Y_k + μ_k(D - L_{k+1} - S_{k+1})
    μ_{k+1} = ρμ_k   (formula four)
for the L_{k+1} update in formula four, let G_L = D - S_k + Y_k/μ_k; then L_{k+1} is determined by the following formula five:
    L_{k+1} = U S_{w/μ_k}(Σ) V^T,   (U, Σ, V) = svd(G_L);   (formula five)
judging whether
    ||D - L_{k+1} - S_{k+1}||_F / ||D||_F < θ
holds; if it holds, the iteration terminates and, denoting k + 1 = q, the L_q and S_q obtained by the iteration are the low-rank matrix and the sparse matrix; otherwise, let k = k + 1 and repeat the iterative process until
    ||D - L_{k+1} - S_{k+1}||_F / ||D||_F < θ
holds;
where svd(·) denotes the singular value decomposition, i.e., G_L = UΣV^T, U and V are unitary matrices, and Σ is a diagonal matrix; the singular values σ_i(G_L) of G_L are arranged in decreasing order on the diagonal of Σ; and the singular value shrinkage operator S_{w/μ_k}(Σ) is a diagonal matrix whose diagonal elements are defined as follows:
    [S_{w/μ_k}(Σ)]_{ii} = max(σ_i(G_L) - w_i/μ_k, 0).
2. the method according to claim 1, wherein the step S1 includes:
S11, performing grayscale processing on each frame of the video image, the N grayscale frames being I_1, …, I_N, each with resolution a × b, i.e., I_i ∈ R^(a×b), i = 1, …, N, where R^(a×b) is the real space of size a × b;
s12, sequentially vectorizing each frame of image after gray level processing, and constructing an observation matrix D according to each vectorized frame of image;
wherein D = [Vec(I_1), …, Vec(I_N)] ∈ R^(M×N);
R^(M×N) denotes the real space of size M × N, Vec(I_i) denotes the vectorization function, Vec: R^(a×b) → R^(M×1), and M = ab.
3. The method according to claim 2, wherein the step S3 includes:
B_i = reshape(l_i, a×b), B_i ∈ R^(a×b), i = 1, …, N, is the determined background image frame sequence;
wherein:
reshape(l_i, a×b) denotes the vector-to-matrix (reshape) function, L_q = [l_1, l_2, …, l_N], l_i ∈ R^(M×1), i = 1, …, N, reshape: R^(M×1) → R^(a×b), M = ab;
F_i = reshape(s_i, a×b), F_i ∈ R^(a×b), i = 1, …, N, is the determined foreground moving target image frame sequence; S_q = [s_1, s_2, …, s_N], s_i ∈ R^(M×1), i = 1, …, N.
4. An apparatus for monitoring a moving object in a video image, comprising:
the observation matrix constructing unit is used for constructing an observation matrix D of the video image to be processed;
the low-rank matrix and sparse matrix acquisition unit is used for acquiring a low-rank matrix and a sparse matrix according to a robust principal component analysis theory, a predefined weighted nuclear norm, a predefined structured sparse norm and the observation matrix D; each column of the low-rank matrix is obtained by vectorizing a background image of a corresponding frame in the observation matrix D, and each column of the sparse matrix is obtained by vectorizing a foreground moving target image of the corresponding frame in the observation matrix D;
the processing unit is used for performing a vector-to-matrix (reshape) operation on the low-rank matrix and the sparse matrix respectively to obtain a background image frame sequence and a foreground moving target image frame sequence;
the apparatus for monitoring moving objects in video images performs the method of any of the above claims 1 to 3.
5. An image processing apparatus comprising the apparatus for monitoring a moving object in a video image according to claim 4, and image monitoring means for acquiring the video image;
and the image monitoring device sends the acquired video image to a device for monitoring a moving target in the video image for processing.
CN201710711920.8A 2017-08-18 2017-08-18 Method for monitoring moving target in video image Active CN107680116B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710711920.8A CN107680116B (en) 2017-08-18 2017-08-18 Method for monitoring moving target in video image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710711920.8A CN107680116B (en) 2017-08-18 2017-08-18 Method for monitoring moving target in video image

Publications (2)

Publication Number Publication Date
CN107680116A CN107680116A (en) 2018-02-09
CN107680116B true CN107680116B (en) 2020-07-28

Family

ID=61134696

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710711920.8A Active CN107680116B (en) 2017-08-18 2017-08-18 Method for monitoring moving target in video image

Country Status (1)

Country Link
CN (1) CN107680116B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108447029A (en) * 2018-02-12 2018-08-24 深圳创维-Rgb电子有限公司 A kind of denoising method of video sequence, device, server and storage medium
CN108961261B (en) * 2018-03-14 2022-02-15 中南大学 Optic disk region OCT image hierarchy segmentation method based on space continuity constraint
CN109002802B (en) * 2018-07-23 2021-06-15 武汉科技大学 Video foreground separation method and system based on adaptive robust principal component analysis
CN109345563B (en) * 2018-09-14 2022-05-10 南京邮电大学 Moving target detection method based on low-rank sparse decomposition
JP7034050B2 (en) * 2018-10-29 2022-03-11 京セラ株式会社 Image processing equipment, cameras, moving objects and image processing methods
CN109543650A (en) * 2018-12-04 2019-03-29 钟祥博谦信息科技有限公司 Warehouse intelligent control method and system
CN110109114B (en) * 2019-05-09 2020-11-10 电子科技大学 Scanning radar super-resolution imaging detection integrated method
CN110136164B (en) * 2019-05-21 2022-10-25 电子科技大学 Method for removing dynamic background based on online transmission transformation and low-rank sparse matrix decomposition
CN111951191B (en) * 2020-08-14 2022-05-24 新疆大学 Video image snow removing method and device and storage medium
CN113177462B (en) * 2021-04-26 2022-04-15 四川大学 Target detection method suitable for court trial monitoring
CN117640900B (en) * 2024-01-25 2024-04-26 广东天耘科技有限公司 Global security video system


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104361611A (en) * 2014-11-18 2015-02-18 南京信息工程大学 Group sparsity robust PCA-based moving object detecting method
KR101556603B1 (en) * 2014-12-30 2015-10-01 연세대학교 산학협력단 Apparatus and Method for Image Seperation using Rank Prior
CN104599292A (en) * 2015-02-03 2015-05-06 中国人民解放军国防科学技术大学 Noise-resistant moving target detection algorithm based on low rank matrix
CN106056607A (en) * 2016-05-30 2016-10-26 天津城建大学 Monitoring image background modeling method based on robustness principal component analysis

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
"Background Subtraction Based on Low-Rank and Structured Sparse Decomposition";Liu X , Zhao G , Yao J , et al.;《IEEE Transactions on Image Processing》;20150831;第2502-2514页 *
"基于加权RPCA的非局部图像去噪方法";杨国亮等;《计算机工程与设计》;20151116;第3035-3040页 *
"基于改进RPCA的非局部图像去噪算法研究";王艳芳;《中国优秀硕士学位论文全文数据库 信息科技辑》;20160215;第I138-1747页 *

Also Published As

Publication number Publication date
CN107680116A (en) 2018-02-09

Similar Documents

Publication Publication Date Title
CN107680116B (en) Method for monitoring moving target in video image
Zhang et al. AMP-Net: Denoising-based deep unfolding for compressive image sensing
Yu et al. Hallucinating very low-resolution unaligned and noisy face images by transformative discriminative autoencoders
Liu et al. Denet: A universal network for counting crowd with varying densities and scales
CN107016357B (en) Video pedestrian detection method based on time domain convolutional neural network
CN104408742B (en) A kind of moving target detecting method based on space time frequency spectrum Conjoint Analysis
Du et al. Variational image deraining
CN111080675B (en) Target tracking method based on space-time constraint correlation filtering
Kumar et al. Fast learning-based single image super-resolution
Fooladgar et al. Multi-modal attention-based fusion model for semantic segmentation of RGB-depth images
CN111260738A (en) Multi-scale target tracking method based on relevant filtering and self-adaptive feature fusion
CN107767416B (en) Method for identifying pedestrian orientation in low-resolution image
CN110135344B (en) Infrared dim target detection method based on weighted fixed rank representation
CN104732566B (en) Compression of hyperspectral images cognitive method based on non-separation sparse prior
CN110490894B (en) Video foreground and background separation method based on improved low-rank sparse decomposition
CN110879982A (en) Crowd counting system and method
Wang et al. Multi-scale fish segmentation refinement and missing shape recovery
CN104734724B (en) Based on the Compression of hyperspectral images cognitive method for weighting Laplce&#39;s sparse prior again
Xia et al. Single image rain removal via a simplified residual dense network
Anantrasirichai Atmospheric turbulence removal with complex-valued convolutional neural network
Jia et al. Effective meta-attention dehazing networks for vision-based outdoor industrial systems
Sun et al. Hyperspectral image denoising via low-rank representation and CNN denoiser
Zhang et al. DuGAN: An effective framework for underwater image enhancement
Khanna et al. Fractional derivative filter for image contrast enhancement with order prediction
CN111401209B (en) Action recognition method based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant