CN108965885B - Video online reconstruction and moving target detection method based on frame compression measurement


Info

Publication number
CN108965885B
CN108965885B (application CN201810564696.9A)
Authority
CN
China
Prior art keywords
video
foreground
background
frame
matrix
Prior art date
Legal status: Active
Application number
CN201810564696.9A
Other languages
Chinese (zh)
Other versions
CN108965885A (en)
Inventor
曹文飞 (Wenfei Cao)
韩国栋 (Guodong Han)
徐麟 (Lin Xu)
Current Assignee
Shaanxi Normal University
Original Assignee
Shaanxi Normal University
Priority date
Filing date
Publication date
Application filed by Shaanxi Normal University
Priority to CN201810564696.9A
Publication of CN108965885A
Application granted
Publication of CN108965885B
Legal status: Active
Anticipated expiration

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 - Details of television systems
    • H04N5/14 - Picture signal circuitry for video frequency region
    • H04N5/144 - Movement detection
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 - Television systems
    • H04N7/18 - Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a video online reconstruction and moving target detection method based on frame compression measurement. First, a camera designed on the compressed sensing principle performs compressive measurement of the monitored scene at the source end; then, the single-frame compressed measurements are channel-coded and transmitted to a monitoring center; finally, the monitoring center decodes and collects the single-frame compressed measurement data and reconstructs the background and foreground of the video frame through a compressive reconstruction algorithm. The matrix nuclear norm models the low-rank property of the video background, the total variation (TV) function models the piecewise smoothness of the video foreground, and a Laplacian mixture distribution models the sparsity of the foreground; this refined modeling improves both the reconstruction accuracy of the video and the detection accuracy of the moving target.

Description

Video online reconstruction and moving target detection method based on frame compression measurement
Technical Field
The invention belongs to the technical field of video processing, and particularly relates to a video online reconstruction and moving target detection method based on frame compression measurement.
Background
Video moving object detection based on compressive measurements is a technique that has recently emerged in the field of video processing; it couples the two basic problems of video reconstruction and moving object detection and integrates their respective strengths. Existing video moving target detection methods focus on detection over complete video data. Because such methods can only process after a complete video frame sequence has been transmitted, they run into technical bottlenecks under today's video big data background, such as the effectiveness of mass data storage, the timeliness of transmission, and the real-time capability of processing. Fortunately, a new information processing theory, compressed sensing [E.J. Candès and M.B. Wakin. An introduction to compressive sampling. IEEE Signal Processing Magazine, 25(2):21-30, 2008], offers a way around this bottleneck. The theory states that, at the source end, it suffices to compressively measure the scene, i.e., to sample and store only its critical information; then only a small amount of measurement data is transmitted over the channel; finally, the original scene can be reconstructed with high probability at the sink end from this small number of compressed measurements. Inspired by compressed sensing, researchers have proposed video reconstruction and moving object detection methods based on compressive measurements, that is, moving object detection on compressed video data. Such methods are highly challenging. The following reviews conventional approaches to moving object detection on complete video data and on compressed video data.
1. Moving target detection method on complete video data
For ease of description, we introduce some notation. Suppose a video sequence is denoted
X = [x^(1), x^(2), …, x^(K)] ∈ R^(MN×K),
where K is the number of frames and each frame image has size M × N. The video frame sequence is assumed to decompose into a background sequence and a foreground sequence, i.e.,
X = B + S,
where B = [b^(1), b^(2), …, b^(K)] and S = [s^(1), s^(2), …, s^(K)].
The aim of such methods is to extract the foreground sequence S from the complete video sequence X. From the literature, they fall into two broad categories: the first uses a batch processing mode (processing the whole sequence at once), the second an online processing mode (frame-by-frame processing).
For batch processing methods, the popular algorithms are usually implemented by optimizing an energy function: with the background and foreground sequences as optimization variables, an energy-function model is established based on constraints such as the sparsity and low-rank property of the video frame images, and both sequences are obtained simultaneously by minimizing it. The general form of the energy-function model is:
min_{B,S} (1/2)·||X - B - S||_F^2 + λ·Ω(B) + μ·Ψ(S),
where B and S are the variables to be optimized, λ and μ are regularization parameters, and Ω(B) and Ψ(S) are regularization constraints on the background sequence and the foreground sequence, respectively, usually some form of sparsity constraint. Typical literature for such methods includes [J. Wright, A. Ganesh, S. Rao, Y. Peng, and Y. Ma. Robust principal component analysis: Exact recovery of corrupted low-rank matrices via convex optimization. In NIPS, 2009], [B. Xin, Y. Tian, Y. Wang, and W. Gao. Background subtraction via generalized fused lasso foreground modeling. In CVPR, 2015], and [W. Hu, Y. Yang, W. Zhang, and Y. Xie. Moving object detection using tensor-based low-rank and saliently fused-sparse decomposition. IEEE Trans. on Image Processing, 26(2):724-737, 2017]. The first work uses a robust principal component analysis model to solve the moving object detection problem: the correlation of the background sequence is modeled by matrix low-rankness, and the foreground sequence is characterized by sparsity. The second uses a generalized fused lasso to model the foreground sequence; this modeling not only describes the sparsity of the foreground image but also accounts for the local similarity of pixel intensities, yielding a better moving target detection effect. For the foreground sequence, the third work first obtains the salient regions of the video sequence through saliency detection and then embeds the salient-region information into the generalized fused lasso model.
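As an illustrative sketch (not the algorithm of any of the cited papers), a robust-PCA-style energy minimization of this form can be approximated by alternating two proximal operators: singular-value thresholding for the low-rank background and soft thresholding for the sparse foreground. All names, dimensions, and parameter values below are assumptions for a toy example.

```python
import numpy as np

def svt(Z, tau):
    # Singular-value thresholding: proximal operator of the nuclear norm.
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def soft(Z, tau):
    # Soft thresholding: proximal operator of the l1 norm.
    return np.sign(Z) * np.maximum(np.abs(Z) - tau, 0.0)

def rpca_decompose(X, lam=None, mu=1.0, n_iter=100):
    """Alternately split X into a low-rank background B and a sparse
    foreground S (a naive alternating scheme, not exact RPCA)."""
    if lam is None:
        lam = 1.0 / np.sqrt(max(X.shape))
    B = np.zeros_like(X)
    S = np.zeros_like(X)
    for _ in range(n_iter):
        B = svt(X - S, 1.0 / mu)   # background step: low-rank prox
        S = soft(X - B, lam / mu)  # foreground step: sparse prox
    return B, S

# Toy data: rank-1 static background plus a small sparse foreground blob.
rng = np.random.default_rng(0)
bg = np.outer(rng.random(64), np.ones(20))      # 64 pixels x 20 frames
fg = np.zeros((64, 20)); fg[10:14, 5:8] = 1.0
B, S = rpca_decompose(bg + fg)
```

The two prox operators are exactly the building blocks that reappear, frame by frame, in the online and compressive methods discussed below.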
Compared with the former two works, the tensor-based decomposition of Hu et al. [IEEE Trans. on Image Processing, 26(2):724-737, 2017] models the background sequence more finely and thereby achieves better detection performance.
The other category is detection in the online processing mode. These methods are faster and can meet the practical requirement of detecting moving targets frame by frame. The general model appearing in the literature is:
min_{b_t, s_t} (1/2)·||x_t - b_t - s_t||_2^2 + λ·Ω(b_t) + μ·Ψ(s_t),
where Ω(b_t) and Ψ(s_t) are the regularization functions of the t-th background frame b_t and foreground frame s_t, respectively, typically some manifold constraint for the background and some sparsity constraint for the foreground. Specifically, based on incremental gradient descent on the Grassmannian manifold, [J. He, L. Balzano, and A. Szlam. Incremental gradient on the Grassmannian for online foreground and background separation in subsampled video. In CVPR, 2012] proposed an online video moving target detection method. Building on it, [J. Xu, V.K. Ithapu, L. Mukherjee, J.M. Rehg, and V. Singh. GOSUS: Grassmannian online subspace updates with structured-sparsity. In ICCV, 2013] further modeled the block sparsity of the foreground, giving an improved version of the He et al. method. [Y. Pang, L. Ye, X. Li, and J. Pan. Incremental learning with saliency map for moving object detection. IEEE Transactions on Circuits and Systems for Video Technology, 28(3):640-651, 2018] considered the salient features of video frames and, on the basis of the He et al. model, proposed an efficient moving target detection model and algorithm.
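To make the online processing mode concrete, the following is a deliberately simplified sketch in which the background regularizer Ω(b_t) is replaced by a running-mean background and Ψ(s_t) by a plain l1 term. It illustrates frame-by-frame detection only; it is not a reimplementation of the Grassmannian methods cited above, and the parameter values are assumptions.

```python
import numpy as np

def online_detect(frames, lam=0.1, alpha=0.05):
    """Frame-by-frame foreground detection against a running-mean
    background: a simplified stand-in for the online model
    min 0.5*||x_t - b_t - s_t||^2 + lam*Psi(s_t)."""
    b = frames[0].astype(float)      # initialize background from frame 1
    masks = []
    for x in frames[1:]:
        r = x - b                                           # residual w.r.t. background
        s = np.sign(r) * np.maximum(np.abs(r) - lam, 0.0)   # sparse foreground estimate
        masks.append(np.abs(s) > 0)                         # detection mask
        b = (1 - alpha) * b + alpha * (x - s)               # slow background update
    return masks

# Toy sequence: constant background with one bright moving pixel.
base = np.full(10, 0.5)
f1 = base.copy(); f1[3] = 1.5
f2 = base.copy(); f2[6] = 1.5
masks = online_detect([base, f1, f2])
```

The same per-frame loop structure carries over to the compressive setting below; only the data-fidelity term changes.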
Under the background of video big data, the above methods for detecting moving targets on complete video data usually encounter bottlenecks. For example, the large number of cameras distributed along urban traffic arteries capture enormous amounts of surveillance video that must be transmitted to a network control center for timely analysis; synchronously transmitting the complete video data may cause network congestion or even network paralysis. It is therefore particularly necessary to transmit only a small amount of non-redundant video data. To address this challenge, moving object detection methods on compressed video data came into being.
2. Moving object detection method on compressive video data
The aim of such methods is to reconstruct the video background from the compressed video data and to detect moving objects with a designed detection method. Among them, [V. Cevher, A. Sankaranarayanan, M.F. Duarte, D. Reddy, R.G. Baraniuk, and R. Chellappa. Compressive sensing for background subtraction. In ECCV, 2008] proposed an effective moving object detection method based on a typical compressive sensing recovery model. In [A.E. Waters, A.C. Sankaranarayanan, and R. Baraniuk. SpaRCS: Recovering low-rank and sparse matrices from compressive measurements. In NIPS, 2011], an optimization model modeling the background by low-rankness and the foreground by sparsity is proposed to solve the moving target detection problem. Building on that method, [W. Cao, Y. Wang, J. Sun, D. Meng, C. Yang, A. Cichocki, and Z. Xu. Total variation regularized tensor RPCA for background subtraction from compressive measurements. IEEE Transactions on Image Processing, 25(9):4075-4090, 2016] designs more refined models of the background and foreground sequences. Specifically, tensor low-rankness replaces matrix low-rankness to model the background sequence, a spatio-temporal total variation model replaces plain sparsity to characterize the foreground sequence, and an effective moving target detection method is proposed based on this refined modeling.
The other type is online processing based on frame-by-frame compressive measurements, which can meet the real-time requirements of practical video surveillance. The general form of the model in the literature is:
min_{b_t, s_t} (1/2)·||y_t - A(b_t + s_t)||_2^2 + λ·Ω(b_t) + μ·Ψ(s_t),
where A is some compression measurement matrix, and Ω(b_t) and Ψ(s_t) are regularization functions of the t-th background frame b_t and foreground frame s_t. In [J.F. Mota, N. Deligiannis, A.C. Sankaranarayanan, V. Cevher, and M.R. Rodrigues. Adaptive-rate reconstruction of time-varying signals with application in compressive foreground extraction. IEEE Transactions on Signal Processing, 64(14):3651-3666, 2016], the authors propose an online moving target detection algorithm based on an l1-l1 minimization model. In [H. Luong, N. Deligiannis, J. Seiler, S. Forchhammer, and A. Kaup. Compressive online robust principal component analysis via n-l1 minimization. IEEE Transactions on Image Processing, 2018], the authors generalize the l1-l1 minimization model to an n-l1 minimization model, convert the frame-compressive moving target detection problem into a sparse optimization problem, and design an efficient optimization method.
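The sparse-recovery core of such l1-based methods can be sketched with plain ISTA, assuming the background contribution is known and already subtracted from the measurements. The measurement matrix, dimensions, and parameter values below are illustrative assumptions, not those of the cited papers.

```python
import numpy as np

def ista_l1(A, y, lam=0.01, n_iter=500):
    """ISTA for min 0.5*||y - A s||^2 + lam*||s||_1: recover a sparse
    foreground s from compressed measurements y (background assumed
    known and already subtracted)."""
    L = np.linalg.norm(A, 2) ** 2                 # Lipschitz constant of the gradient
    s = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = s - A.T @ (A @ s - y) / L             # gradient step on the data term
        s = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # shrinkage step
    return s

rng = np.random.default_rng(1)
n, m = 100, 50                                    # 100-pixel frame, 50 measurements
A = rng.standard_normal((m, n)) / np.sqrt(m)      # random measurement matrix
s_true = np.zeros(n); s_true[[7, 42, 77]] = [1.0, -1.5, 2.0]
s_hat = ista_l1(A, A @ s_true)
```

With 50 measurements of a 100-pixel frame carrying only 3 active pixels, the support of the foreground is recovered despite the 2x undersampling, which is the basic promise compressed sensing brings to this problem.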
All of the above moving object detection methods, on complete and on compressed video data alike, have limitations. Methods on complete video data do not consider the compressibility of the data during scene acquisition, which under the video big data background easily causes storage burden at the source end and network congestion during transmission. Methods on compressed video data do consider compressive measurement during acquisition, but only simple sparse prior constraints are used to model the video background and foreground, and no finer modeling of the 3-dimensional video data is considered; the detection accuracy of the moving target at a given compressive sampling rate therefore still needs improvement.
Disclosure of Invention
In order to solve the above problems in the prior art, the present invention provides a video online reconstruction and moving object detection method based on frame compression measurement. The technical problem to be solved by the invention is realized by the following technical scheme: a video online reconstruction and moving target detection method based on frame compression measurement comprises the following steps:
step 1, collecting a video sequence X_0 as a training set, inputting X_0 into a robust principal component analysis model, and outputting a video prior background B_0 and a foreground sequence S_0; assigning the last k frames of the foreground sequence S_0 to obtain the video prior foreground ŝ_0, where the number of frames of X_0 is L, L being a positive integer;
step 2, collecting the compressed measurement y_t of the t-th frame video image of the monitored scene;
Step 3, establishing a reconstruction model, wherein the reconstruction model adopts a matrix nuclear norm to model the low rank of the video prior background, and adopts a total variation TV regular function and a negative logarithm Laplace mixed function to respectively model the fragment smoothness and the sparsity of the video prior foreground;
step 4, inputting the compressed measurement y_t of the t-th frame video image, the video prior background B_{t-1}, and the video prior foreground ŝ_{t-1} into the reconstruction model, obtaining the t-th frame background b_t and foreground s_t of the video by minimizing the model output, and then detecting the moving target from the foreground s_t by image threshold segmentation, where t is a positive integer;
step 5, updating the current video prior background to B_t according to the background b_t and the video prior background B_{t-1}, and updating the current video prior foreground to ŝ_t according to the foreground s_t and the video prior foreground ŝ_{t-1};
and step 6, repeating steps 2 to 5 in turn; when t = T, terminating the updating of the current video prior background B_t and the current video prior foreground ŝ_t, where T represents the monitoring time or the number of video frames.
Further, the reconstruction model in step 3 is:
min_{b_t, s_t} (1/2)·||y_t - A(b_t + s_t)||_2^2 + λ·Ω(b_t) + γ·TV(s_t) + μ·Ψ(s_t),
where A is the compression measurement matrix, λ and μ are regularization parameters, and γ = μ·τ; B_{t-1} represents the video prior background, ŝ_{t-1} the video prior foreground, b_t the t-th frame background of the video image, and s_t the t-th frame foreground of the video image.
Ω(b_t) = ||[B_{t-1}, b_t]||_* models the low-rank property of the video prior background via the matrix nuclear norm, where ||Z||_* = Σ_i σ_i(Z) and σ_i(Z) is the i-th singular value of the matrix Z;
Ψ(s_t) = -Σ_i log p(s_{t,i}; θ_{t-1}, π_{t-1});
the total variation (TV) regularization function and the negative-log Laplacian mixture function model the piecewise smoothness and the sparsity of the video prior foreground, respectively, where s_{t,i} is the i-th entry of the foreground frame s_t, θ_{t-1} is the variance vector of the Laplacian mixture distribution, π_{t-1} is the vector of mixing coefficients of the Laplacian mixture components, and τ is a parameter balancing the two foreground terms; TV(s_t) = ||D s_t||_1 is the anisotropic total variation function of the image, with D = [D_h; D_v] composed of the difference operators in the horizontal and vertical directions of the image;
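The anisotropic total variation TV(s_t) = ||D s_t||_1, with D built from the horizontal and vertical difference operators, can be computed as follows. This is a minimal sketch; forward differences with the last row/column dropped are an assumed boundary convention.

```python
import numpy as np

def anisotropic_tv(img):
    """TV(s) = ||D s||_1 with D = [D_h; D_v]: the sum of absolute
    horizontal and vertical forward differences of the image."""
    dh = np.abs(np.diff(img, axis=1)).sum()   # horizontal differences D_h
    dv = np.abs(np.diff(img, axis=0)).sum()   # vertical differences D_v
    return float(dh + dv)

flat = np.ones((4, 4))                        # constant image: TV = 0
step = np.zeros((4, 4)); step[:, 2:] = 1.0    # one vertical edge: TV = 4
```

A constant image has zero TV while a single sharp edge contributes one unit per row it crosses, which is why this term favors piecewise-smooth foregrounds.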
the Laplacian mixture probability distribution is:
p(x; θ, π) = Σ_{j=1}^{J} π_j · (1/(2θ_j)) · exp(-|x|/θ_j),
and the parameters θ and π are estimated by an expectation-maximization algorithm.
Further, in step 4, the specific steps for obtaining the t-th frame background b_t and foreground s_t of the video by minimizing the reconstruction model output are as follows: optimize the reconstruction model with a proximal gradient method and an expectation-maximization algorithm, where the proximal gradient function linearizes the data-fidelity term f(b_t, s_t) = (1/2)·||y_t - A(b_t + s_t)||_2^2 at the current iterate:
(b_t^{k+1}, s_t^{k+1}) = argmin_{b_t, s_t} λ·Ω(b_t) + γ·TV(s_t) + μ·Ψ(s_t) + (ρ/2)·||b_t - (b_t^k - ∇_{b_t} f / ρ)||_2^2 + (ρ/2)·||s_t - (s_t^k - ∇_{s_t} f / ρ)||_2^2,
where ∇_{b_t} f = ∇_{s_t} f = A^T(A(b_t^k + s_t^k) - y_t) are the gradients and ρ is a step-size parameter. By an alternating optimization method, the proximal gradient function is split into the b_t-subproblem, the s_t-subproblem, and the (θ_t, π_t) parameter subproblem, which are iterated; when the relative change is less than 1e-5, the iteration stops and the t-th frame background b_t and foreground s_t of the video are output.
The (θ_t, π_t) subproblem obtains the update expressions of the parameters through the expectation-maximization algorithm.
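The alternating scheme of step 4 can be sketched as follows. This is not the patent's exact algorithm: the combined TV-plus-mixture proximal step is replaced here by plain soft thresholding, and the step size, dimensions, and parameter values are illustrative assumptions.

```python
import numpy as np

def soft(z, tau):
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def svt_last_col(B_prev, b, tau):
    """Proximal step for lam*||[B_{t-1}, b]||_* with respect to b:
    threshold the singular values of the concatenation and return
    the updated last column."""
    Z = np.column_stack([B_prev, b])
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    Z = U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt
    return Z[:, -1]

def reconstruct_frame(A, y, B_prev, lam=0.1, gamma=0.05, n_iter=500, tol=1e-5):
    """Alternating proximal-gradient sketch for one frame (b_t, s_t)."""
    n = A.shape[1]
    b, s = B_prev.mean(axis=1), np.zeros(n)    # warm-start from the prior background
    rho = np.linalg.norm(A, 2) ** 2            # step-size parameter
    for _ in range(n_iter):
        b_old, s_old = b.copy(), s.copy()
        g = A.T @ (A @ (b + s) - y)            # shared gradient of the data term
        b = svt_last_col(B_prev, b - g / rho, lam / rho)
        s = soft(s - g / rho, gamma / rho)
        denom = np.linalg.norm(b_old) + np.linalg.norm(s_old) + 1e-12
        if (np.linalg.norm(b - b_old) + np.linalg.norm(s - s_old)) / denom < tol:
            break                              # relative change below 1e-5
    return b, s

rng = np.random.default_rng(4)
n, m = 40, 30
c = rng.random(n)
B_prev = np.tile(c[:, None], (1, 5))           # rank-1 prior background
s_true = np.zeros(n); s_true[[5, 20]] = [2.0, -2.0]
A = rng.standard_normal((m, n)) / np.sqrt(m)
b_hat, s_hat = reconstruct_frame(A, A @ (c + s_true), B_prev)
```

Note the single gradient g serves both subproblems, since the data term depends on b_t and s_t only through their sum.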
Further, the specific steps of step 5 are:
step 5.1, the update formula of the video prior background B_t is:
[U, S, V] = SVD([B_{t-1}, b_t])   formula (1)
B_t = U(:, 1:L) · S(1:L, 1:L) · V(:, 1:L)^T   formula (2)
where formula (1) performs singular value decomposition of the matrix [B_{t-1}, b_t], and the video prior background B_t is then derived by formula (2); SVD denotes the singular value decomposition of a matrix, U its left singular vectors, V its right singular vectors, and S its singular values;
step 5.2, the current video prior foreground ŝ_t is updated from the foreground s_t and the previous video prior foreground ŝ_{t-1}.
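The truncated-SVD update of step 5.1 can be sketched directly from formulas (1) and (2); the toy dimensions below are assumptions.

```python
import numpy as np

def update_prior_background(B_prev, b_new, L):
    """B_t = U(:,1:L) * S(1:L,1:L) * V(:,1:L)^T for the concatenation
    [B_{t-1}, b_t], i.e. a rank-L truncated SVD of the augmented
    background matrix."""
    Z = np.column_stack([B_prev, b_new])
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return U[:, :L] @ np.diag(s[:L]) @ Vt[:L, :]

rng = np.random.default_rng(2)
B_prev = np.outer(rng.random(30), rng.random(6))          # rank-1 prior background
b_new = B_prev[:, -1] + 0.01 * rng.standard_normal(30)    # new background frame
B_t = update_prior_background(B_prev, b_new, L=1)
```

The truncation keeps the prior background at rank at most L even as new frames are appended, which is what keeps the nuclear-norm term in the reconstruction model informative.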
further, in the step 2, a block discrete hadamard matrix which is randomly down-sampled is adopted to simulate a compression measurement matrix, the size of the discrete hadamard matrix is 32 × 32, and the sampling rate of the random matrix is set.
Compared with the prior art, the invention has the following beneficial effects. The invention provides a novel online video reconstruction and detection method over frame-compressive measurements, based on recent progress in sparsity modeling. The method models the low-rank property of the video background with the matrix nuclear norm, the piecewise smoothness of the video foreground with the total variation (TV) function, and the sparsity of the foreground with a Laplacian mixture distribution; this refined modeling improves both the reconstruction accuracy of the video and the detection accuracy of the moving target.
The method is designed on the compressed sensing theory; because sampling and compression are carried out simultaneously, it not only reduces the storage burden at the source end but also alleviates network congestion to a certain extent.
The invention can be applied to video surveillance scenarios such as traffic, campus, and military-port monitoring, and simultaneously meets the surveillance requirements of data compression and online processing.
Drawings
Fig. 1 is a schematic diagram of the whole process of video reconstruction and moving object detection according to the present invention.
FIG. 2 is a flow chart of the method of the present invention.
FIG. 3 is an alternate iteration of background, foreground, parameters of the present invention.
Fig. 4a is the real original image for video reconstruction and moving object detection of a scene at a sampling rate of 0.7 according to the present invention.
Fig. 4b is the reconstructed foreground image for video reconstruction and moving object detection of a scene at a sampling rate of 0.7 according to the present invention.
Fig. 4c is the reconstructed background image for video reconstruction and moving object detection of a scene at a sampling rate of 0.7 according to the present invention.
Fig. 5a is the real original image for video reconstruction and moving object detection of a scene at a sampling rate of 0.4 according to the present invention.
Fig. 5b is the reconstructed foreground image for video reconstruction and moving object detection of a scene at a sampling rate of 0.4 according to the present invention.
Fig. 5c is the reconstructed background image for video reconstruction and moving object detection of a scene at a sampling rate of 0.4 according to the present invention.
Fig. 6a shows the result of the present invention detecting a moving object in the 10th frame (sampling rate = 0.15).
Fig. 6b shows the result of the method proposed by Luong et al. detecting a moving object in the 10th frame (sampling rate = 0.15).
Fig. 7a shows the result of the present invention detecting a moving object in the 30th frame (sampling rate = 0.15).
Fig. 7b shows the result of the method proposed by Luong et al. detecting a moving object in the 30th frame (sampling rate = 0.15).
Fig. 8a shows the result of the present invention detecting a moving object in the 10th frame (sampling rate = 0.2).
Fig. 8b shows the result of the method proposed by Luong et al. detecting a moving object in the 10th frame (sampling rate = 0.2).
Fig. 9a shows the result of the present invention detecting a moving object in the 30th frame (sampling rate = 0.2).
Fig. 9b shows the result of the method proposed by Luong et al. detecting a moving object in the 30th frame (sampling rate = 0.2).
Detailed Description
The present invention will be described in further detail with reference to specific examples, but the embodiments of the present invention are not limited thereto.
In the description of the present invention, it is to be understood that the terms "central," "longitudinal," "lateral," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," and the like indicate orientations or positional relationships based on those shown in the drawings; they are used merely for convenience and simplicity of description, do not indicate or imply that the referenced device or element must have a particular orientation or be constructed and operated in a particular orientation, and are therefore not to be construed as limiting the invention.
Furthermore, the terms "first," "second," "third," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first," "second," etc. may explicitly or implicitly include one or more of that feature. In the description of the invention, "a plurality" means two or more unless otherwise specified.
The terms "mounted," "connected," and "coupled" are to be construed broadly and may, for example, mean fixedly connected, detachably connected, or integrally connected; mechanically or electrically connected; directly connected, indirectly connected through an intervening medium, or communicating between the interiors of two elements. The specific meaning of these terms in the present invention can be understood by those of ordinary skill in the art according to the specific situation.
As shown in fig. 1, the overall process of video reconstruction and moving object detection over video frame compression measurement is as follows. First, a camera designed on the compressed sensing principle performs compressive measurement of the monitored scene at the source end; then, the single-frame compressed measurements are channel-coded and transmitted to the monitoring center; finally, the monitoring center decodes and collects the single-frame compressed measurement data and reconstructs the background and foreground of the video frame through a compressive reconstruction algorithm.
Fig. 2 shows a flow chart of the implementation of the present invention, and the specific implementation steps are as follows:
the embodiment provides a video online reconstruction and moving target detection method based on frame compression measurement, which comprises the following steps:
step 1, collecting a video sequence X_0 as a training set, inputting X_0 into a robust principal component analysis model, and outputting a video prior background B_0 and a foreground sequence S_0; assigning the last k frames of the foreground sequence S_0 to obtain the video prior foreground ŝ_0, where the number of frames of X_0 is L, L being a positive integer;
specifically, the step 1 comprises the following specific steps:
step 1.1, construct training set
Collect a video sequence X_0 of a certain surveillance scene over a period of time; the number of frames of the sequence is L, usually L = 250. This sequence is used to initialize the video prior background and the video prior foreground.
Step 1.2, input the surveillance video sequence X_0 collected in step 1.1 into a Robust Principal Component Analysis (RPCA) method, which outputs the video prior background (background sequence) B_0 and the foreground sequence S_0 of this video segment. Assign the last k = 3 frames of the video foreground sequence to the prior foreground ŝ_0, which serves as auxiliary prior information for video moving target detection. Here k may also take the value 2, 4, or be chosen as needed.
Step 2, collect the compressed measurement y_t of the t-th frame video image of the monitored scene. This embodiment adopts a randomly down-sampled block discrete Hadamard matrix to simulate the compression measurement matrix and performs compressive measurement of the scene image. Of course, in actual operation, a camera designed on the compressed sensing principle (such as a single-pixel camera) can be purchased on the market to compressively measure the scene; the invention places no limitation on the type of camera. In this embodiment, the size of the discrete Hadamard matrix is set to 32 × 32 and the sampling rate of the random matrix to 0.7, but the sampling rate may also be set to other values, for example 0.15, 0.2, or 0.4, or chosen according to actual needs.
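One plausible reading of this block-Hadamard measurement, a separable 32 × 32 transform applied to flattened image blocks with rows kept at the given sampling rate, can be sketched as follows. The separable Kronecker construction and the row-orthonormal scaling are assumptions, not the patent's exact construction.

```python
import numpy as np

def hadamard(n):
    """Sylvester construction of the n x n Hadamard matrix (n a power of 2)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

def block_hadamard_measurement(rate, block=32, seed=0):
    """Randomly row-subsampled 2-D Hadamard operator for one
    block x block image block: rows of the separable orthonormal
    transform kron(H, H) are kept at the given sampling rate."""
    H = hadamard(block) / np.sqrt(block)   # orthonormal 1-D Hadamard
    H2 = np.kron(H, H)                     # 2-D transform acting on flattened blocks
    n = block * block
    m = max(1, int(round(rate * n)))
    rows = np.random.default_rng(seed).choice(n, size=m, replace=False)
    return H2[rows, :]                     # m x n measurement matrix

A = block_hadamard_measurement(rate=0.7)   # sampling rate 0.7, as in the embodiment
```

Because the kept rows are orthonormal, A·Aᵀ is the identity, a property that simplifies the gradient steps of the reconstruction algorithm.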
Step 3, establish a reconstruction model in which the matrix nuclear norm models the low-rank property of the video prior background, and a total variation (TV) regularization function and a negative-log Laplacian mixture function model the piecewise smoothness and the sparsity of the video prior foreground, respectively.
specifically, for video background, we use matrix kernel norm to model the low rank property of the video prior background, namely:
Ω(bt)=||[Bt-1,bt]||*
here | Z | non-calculation*=∑iσi(Z),σi(Z) is the ith singular value of the matrix Z, [ B ]t-1,bt]. For a video foreground, a negative logarithm Laplace mixing function and a total variation TV regular function are utilized to respectively model sparsity and slicing smoothness of a video prior foreground, namely:
Θ(s_t) = τ·TV(s_t) − Σ_i log p(s_t^i; π_{t-1}, σ_{t-1})

Here s̃_{t-1}^i is the i-th row of the video prior foreground S̃_{t-1}, σ_{t-1} is the variance vector of the Laplace mixture distribution, π_{t-1} is the vector of proportion coefficients of the Laplace mixture components, and τ is the parameter balancing these two terms. TV(s_t) = ||D s_t||_1 is the anisotropic total variation function of the image, where D = [D_h; D_v] consists of the difference operators in the horizontal and vertical directions of the image. The Laplace mixture probability distribution is:

p(x; π, σ) = Σ_k π_k / (2σ_k) · exp(−|x| / σ_k)

Note that the parameters π_{t-1} and σ_{t-1} can be estimated by the expectation-maximization algorithm (EM algorithm).
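The two foreground terms can be written down directly. The following sketch (helper names are our own; a zero-mean Laplace mixture is assumed, as in the distribution above) evaluates the negative log-likelihood of a foreground frame and its anisotropic total variation:

```python
import numpy as np

def laplace_mixture_nll(s, pi, sigma):
    """-sum_i log p(s_i; pi, sigma) under the zero-mean Laplace mixture
    p(x) = sum_k pi_k / (2*sigma_k) * exp(-|x| / sigma_k)."""
    s = np.asarray(s, dtype=float).ravel()
    pi = np.asarray(pi, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    dens = (pi / (2.0 * sigma)) * np.exp(-np.abs(s)[:, None] / sigma)
    return -np.sum(np.log(dens.sum(axis=1)))

def tv_aniso(img):
    """Anisotropic total variation ||D s||_1: sum of absolute
    horizontal and vertical finite differences."""
    return (np.abs(np.diff(img, axis=1)).sum()
            + np.abs(np.diff(img, axis=0)).sum())
```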
Based on the above background and foreground modeling for the t-th frame, the reconstruction model of the video is obtained as:

min_{b_t, s_t} (1/2)||y_t − A(b_t + s_t)||_2^2 + λ·Ω(b_t) + γ·TV(s_t) − μ·Σ_i log p(s_t^i; π_{t-1}, σ_{t-1})

The first term of the objective is the data-fidelity term, the second term models the low-rank property of the video prior background, the third term describes the piecewise smoothness of the t-th frame video prior foreground, and the fourth term describes its sparsity. The t-th frame background b_t and foreground s_t are obtained by minimizing this model. A is the compressed measurement matrix, λ and μ are the regularization parameters, and γ = μ·τ. B_{t-1} denotes the video prior background, S̃_{t-1} denotes the video prior foreground, b_t denotes the t-th frame background, and s_t denotes the t-th frame foreground.
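Putting the four terms together, the value of the reconstruction model can be evaluated as below. This is a sketch with hypothetical argument names, for a vectorized frame, using a 1-D difference in place of the full 2-D operator D:

```python
import numpy as np

def objective(y, A, b, s, B_prev, lam, mu, tau, pi, sigma):
    """Value of the reconstruction model at (b, s):
    fidelity + low-rank + TV smoothness + Laplace-mixture sparsity."""
    fid = 0.5 * np.sum((y - A @ (b + s)) ** 2)
    lowrank = lam * np.linalg.norm(np.column_stack([B_prev, b]), 'nuc')
    smooth = (mu * tau) * np.sum(np.abs(np.diff(s)))   # gamma = mu * tau
    dens = (np.asarray(pi) / (2.0 * np.asarray(sigma))) \
           * np.exp(-np.abs(s)[:, None] / np.asarray(sigma))
    sparse = -mu * np.sum(np.log(dens.sum(axis=1)))
    return fid + lowrank + smooth + sparse
```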
Step 4, inputting the compressed measurement y_t of the t-th frame video image, the video prior background B_{t-1}, and the video prior foreground S̃_{t-1} into the reconstruction model, obtaining the t-th frame background b_t and foreground s_t of the video by minimizing the reconstruction model, and then detecting the moving target from the foreground s_t by an image threshold segmentation method; wherein t denotes a positive integer;
Specifically, the steps of obtaining the t-th frame background b_t and foreground s_t by minimizing the reconstruction model are as follows: the reconstruction model is optimized with a proximal gradient method and the expectation-maximization algorithm (EM algorithm), where the proximal objective at iterate (b^k, s^k) is:

(1/(2ρ))||b_t − (b^k − ρ·g_b^k)||_2^2 + (1/(2ρ))||s_t − (s^k − ρ·g_s^k)||_2^2 + λ·Ω(b_t) + γ·TV(s_t) − μ·Σ_i log p(s_t^i; π_{t-1}, σ_{t-1})

where g_b^k = g_s^k = A^T(A(b^k + s^k) − y_t) are the gradients of the data-fidelity term with respect to b_t and s_t, and the step-size parameter ρ is chosen no larger than 1/||A||_2^2. By the alternating optimization method, the proximal objective is converted into three sub-problems:

Sub-problem 1 (background update): minimize (1/(2ρ))||b_t − (b^k − ρ·g_b^k)||_2^2 + λ·||[B_{t-1}, b_t]||_*;

Sub-problem 2 (foreground update): minimize (1/(2ρ))||s_t − (s^k − ρ·g_s^k)||_2^2 + γ·TV(s_t) − μ·Σ_i log p(s_t^i; π_{t-1}, σ_{t-1});

Sub-problem 3 (parameter update): re-estimate the Laplace mixture parameters (π, σ).

The three sub-problems are iterated in turn; when the relative change is less than 1e-5, the iteration terminates and the obtained t-th frame background b_t and foreground s_t of the video are output. For sub-problem 3, the update expressions of the parameters are obtained through the expectation-maximization algorithm. Figure 3 shows the alternating iteration diagram of the three sub-problems of the algorithm.
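The proximal step for a nuclear-norm term has a well-known closed form, singular value thresholding, sketched below (the function name `svt` is ours):

```python
import numpy as np

def svt(Z, thresh):
    """Singular value thresholding: the proximal operator of
    thresh * ||.||_* evaluated at the matrix Z."""
    U, sv, Vt = np.linalg.svd(Z, full_matrices=False)
    sv = np.maximum(sv - thresh, 0.0)   # soft-threshold the singular values
    return (U * sv) @ Vt
```

One common way to handle the fixed block B_{t-1} in sub-problem 1 is to apply `svt` to the concatenated matrix and keep only the last column as the updated b_t; this shortcut is our own assumption, not a statement of the patented solver.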
After the t-th frame background b_t and foreground s_t of the video are obtained, the moving target is detected from the foreground s_t by an image threshold segmentation method; for example, thresholding the foreground image yields the moving target.
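A minimal threshold-segmentation sketch (the helper name and the use of a caller-supplied threshold are our own assumptions; any standard image thresholding method would do):

```python
import numpy as np

def detect_moving_target(s_t, thresh):
    """Binarize the reconstructed foreground: pixels whose magnitude
    exceeds the threshold are declared moving-target pixels."""
    return np.abs(np.asarray(s_t)) > thresh
```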
Step 5, updating the current video prior background B_t according to the background b_t and the video prior background B_{t-1}, and updating the current video prior foreground S̃_t according to the foreground s_t and the video prior foreground S̃_{t-1}.
Specifically, the steps of step 5 are:

Step 5.1, the video prior background B_t is updated by:

[U, S, V] = SVD([B_{t-1}, b_t])   formula (1)

B_t = U(:, 1:L) · S(1:L, 1:L) · V(:, 1:L)^T   formula (2)

wherein formula (1) performs a singular value decomposition of the matrix [B_{t-1}, b_t], from which the video prior background B_t is derived by formula (2); SVD denotes the singular value decomposition of a matrix, U denotes the left singular vectors of the matrix, V denotes the right singular vectors of the matrix, and S denotes the singular values of the matrix;
Step 5.2, the current video prior foreground S̃_t is updated by:

S̃_t = [S̃_{t-1}(:, 2:L), s_t]

that is, the oldest frame foreground in S̃_{t-1} is discarded and the newly reconstructed foreground s_t is appended. Combining the prior background update B_t, the prior foreground update S̃_t, and the (t+1)-th frame compressed measurement y_{t+1} of the monitored scene, the background and foreground of the (t+1)-th frame image can be reconstructed through the above step 5. The loop proceeds in sequence until the termination condition is reached.
Step 6, looping steps 2 to 5 in sequence; when t = T, the updating of the current video prior background B_t and current video prior foreground S̃_t is terminated, wherein T denotes the monitoring time or the number of video frames, and the monitoring time can be set according to need.
Fig. 4a is a real original image for video reconstruction and moving target detection of a scene of the present invention at a sampling rate of 0.7.
Fig. 4b is the reconstructed foreground image for video reconstruction and moving target detection of a scene of the present invention at a sampling rate of 0.7.
Fig. 4c is the reconstructed background image for video reconstruction and moving target detection of a scene of the present invention at a sampling rate of 0.7.
Fig. 5a is a real original image for video reconstruction and moving target detection of a scene of the present invention at a sampling rate of 0.4.
Fig. 5b is the reconstructed foreground image for video reconstruction and moving target detection of a scene of the present invention at a sampling rate of 0.4.
Fig. 5c is the reconstructed background image for video reconstruction and moving target detection of a scene of the present invention at a sampling rate of 0.4.
To illustrate the effects of the present invention concretely, comparative experiments are described below:

As shown in the figures, at the same sampling rate the method of the invention achieves higher precision, that is, more accurate detection of the moving target. The comparative examples are shown in figs. 6a and 6b, figs. 7a and 7b, figs. 8a and 8b, and figs. 9a and 9b. These figures show the results of the method of the present invention and of the method proposed by Luong et al. [H. Luong, N. Deligiannis, J. Seiler, S. Forchhammer, and A. Kaup. Compressive online robust principal component analysis via n-ℓ1 minimization. To appear in IEEE Transactions on Image Processing, 2018]. It can be seen that at low sampling rates the method of the invention can still detect moving targets, whereas the Luong method cannot detect a clear moving target.
The foregoing is a detailed description of the invention in connection with specific preferred embodiments, and the specific implementation of the invention is not to be considered limited to these descriptions. For those skilled in the art to which the invention pertains, several simple deductions or substitutions can be made without departing from the concept of the invention, and all of these shall be considered to fall within the protection scope of the invention.

Claims (4)

1. A video online reconstruction and moving target detection method based on frame compression measurement, characterized in that the method comprises the following steps:

Step 1, collecting a video sequence X_0 as a training set, inputting the video sequence X_0 into a robust principal component analysis model, and outputting a video prior background B_0 and a foreground sequence S_0; assigning the value of the k-th frame video image of the foreground sequence S_0 to obtain the video prior foreground S̃_0, wherein the number of frames of the video sequence X_0 is L, L denoting a positive integer;
Step 2, collecting the compressed measurement y_t of the t-th frame video image of the monitored scene;

Step 3, establishing a reconstruction model, wherein the reconstruction model adopts the matrix nuclear norm to model the low-rank property of the video prior background, and adopts a total variation (TV) regular function and a negative-log Laplace mixture function to model the piecewise smoothness and the sparsity of the video prior foreground, respectively;

Step 4, inputting the compressed measurement y_t of the t-th frame video image, the video prior background B_{t-1}, and the video prior foreground S̃_{t-1} into the reconstruction model, obtaining the t-th frame background b_t and foreground s_t of the video by minimizing the reconstruction model, and then detecting the moving target from the foreground s_t by an image threshold segmentation method, wherein t is a positive integer;

Step 5, updating the current video prior background to B_t according to the background b_t and the video prior background B_{t-1}, and updating the current video prior foreground to S̃_t according to the foreground s_t and the video prior foreground S̃_{t-1};

Step 6, looping steps 2 to 5 in sequence, and when t = T, terminating the updating of the current video prior background B_t and the current video prior foreground S̃_t, wherein T denotes the monitoring time or the number of video frames;
the reconstruction model in step 3 is:

min_{b_t, s_t} (1/2)||y_t − A(b_t + s_t)||_2^2 + λ·Ω(b_t) + γ·TV(s_t) − μ·Σ_i log p(s_t^i; π_{t-1}, σ_{t-1})

wherein A is the compressed measurement matrix, λ and μ are regularization parameters, and γ = μ·τ; B_{t-1} denotes the video prior background, S̃_{t-1} denotes the video prior foreground, b_t denotes the t-th frame background of the video image, and s_t denotes the t-th frame foreground of the video image;
Ω(b_t) = ||[B_{t-1}, b_t]||_* models the low-rank property of the video prior background with the matrix nuclear norm, wherein the nuclear norm is ||Z||_* = Σ_i σ_i(Z), σ_i(Z) being the i-th singular value of the matrix Z;
the total variation (TV) regular function and the negative-log Laplace mixture function model the piecewise smoothness and the sparsity of the video prior foreground, respectively, as

τ·TV(s_t) − Σ_i log p(s_t^i; π_{t-1}, σ_{t-1})

wherein s̃_{t-1}^i is the i-th row of the video prior foreground S̃_{t-1}, σ_{t-1} is the variance vector of the Laplace mixture distribution, π_{t-1} is the vector of proportion coefficients of the Laplace mixture components, and π_{t-1} satisfies the constraints Σ_k π_{t-1,k} = 1 and π_{t-1,k} ≥ 0; τ is the parameter balancing these two terms; TV(s_t) = ||D s_t||_1 is the anisotropic total variation function of the image, and D = [D_h; D_v] consists of the difference operators in the horizontal and vertical directions of the image;
the Laplace mixture probability distribution is:

p(x; π, σ) = Σ_k π_k / (2σ_k) · exp(−|x| / σ_k)

and the parameters π_{t-1} and σ_{t-1} are estimated by the expectation-maximization algorithm.
2. The method for video online reconstruction and moving target detection based on frame compression measurement according to claim 1, characterized in that in step 4 the specific steps of obtaining the t-th frame background b_t and foreground s_t of the video by minimizing the reconstruction model are: optimizing the reconstruction model with a proximal gradient method and the expectation-maximization algorithm, wherein the proximal objective is:

(1/(2ρ))||b_t − (b^k − ρ·g_b^k)||_2^2 + (1/(2ρ))||s_t − (s^k − ρ·g_s^k)||_2^2 + λ·Ω(b_t) + γ·TV(s_t) − μ·Σ_i log p(s_t^i; π_{t-1}, σ_{t-1})

wherein g_b^k = g_s^k = A^T(A(b^k + s^k) − y_t) are the gradients of the data-fidelity term, and the step-size parameter ρ is no larger than 1/||A||_2^2; by the alternating optimization method, the proximal objective is converted into three sub-problems, namely a background update sub-problem, a foreground update sub-problem, and a Laplace mixture parameter update sub-problem; the three sub-problems are iterated, the iteration terminates when the relative change is less than 1e-5, and the t-th frame background b_t and foreground s_t of the video are output; wherein the parameter update sub-problem obtains the update expressions of the parameters through the expectation-maximization algorithm.
3. The method for video online reconstruction and moving target detection based on frame compression measurement according to claim 1 or 2, characterized in that the specific steps of step 5 are:

Step 5.1, the video prior background B_t is updated by:

[U, S, V] = SVD([B_{t-1}, b_t])   formula (1)

B_t = U(:, 1:L) · S(1:L, 1:L) · V(:, 1:L)^T   formula (2)

wherein formula (1) performs a singular value decomposition of the matrix [B_{t-1}, b_t], from which the video prior background B_t is derived by formula (2); SVD denotes the singular value decomposition of a matrix, U denotes the left singular vectors of the matrix, V denotes the right singular vectors of the matrix, and S denotes the singular values of the matrix;
Step 5.2, the current video prior foreground S̃_t is updated by:

S̃_t = [S̃_{t-1}(:, 2:L), s_t]

that is, the oldest frame foreground in S̃_{t-1} is discarded and the foreground s_t is appended.
4. The method for video online reconstruction and moving target detection based on frame compression measurement according to claim 1, characterized in that in step 2 a randomly down-sampled block discrete Hadamard matrix simulates the compressed measurement matrix, the size of the discrete Hadamard matrix is 32 × 32, and the sampling rate of the random matrix is set.
CN201810564696.9A 2018-06-04 2018-06-04 Video online reconstruction and moving target detection method based on frame compression measurement Active CN108965885B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810564696.9A CN108965885B (en) 2018-06-04 2018-06-04 Video online reconstruction and moving target detection method based on frame compression measurement


Publications (2)

Publication Number Publication Date
CN108965885A CN108965885A (en) 2018-12-07
CN108965885B true CN108965885B (en) 2020-11-10

Family

ID=64492819

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810564696.9A Active CN108965885B (en) 2018-06-04 2018-06-04 Video online reconstruction and moving target detection method based on frame compression measurement

Country Status (1)

Country Link
CN (1) CN108965885B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113329228B (en) * 2021-05-27 2024-04-26 杭州网易智企科技有限公司 Video encoding method, decoding method, device, electronic equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011091815A1 (en) * 2010-01-28 2011-08-04 Scivis Wissenschaftliche Bildverarbeitung Gmbh Tomographic imaging using poissonian detector data
CN102915562A (en) * 2012-09-27 2013-02-06 天津大学 Compressed sensing-based multi-view target tracking and 3D target reconstruction system and method
CN103745465A (en) * 2014-01-02 2014-04-23 大连理工大学 Sparse coding background modeling method
CN104599292A (en) * 2015-02-03 2015-05-06 中国人民解放军国防科学技术大学 Noise-resistant moving target detection algorithm based on low rank matrix
CN105243670A (en) * 2015-10-23 2016-01-13 北京航空航天大学 Sparse and low-rank joint expression video foreground object accurate extraction method


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Total Variation Regularized Tensor RPCA for Background Subtraction From Compressive Measurements; Wenfei Cao et al.; IEEE Xplore; 2016-12-31; full text *
Research on compressed sensing image reconstruction methods based on prior information; Chen Yiguang; China Master's Theses Full-text Database; 2013-12-31; full text *
Research on ultrasound image reconstruction based on compressed sensing; Qin Xiaowei; China Master's Theses Full-text Database; 2014-03-31; full text *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant