Background
Industrial surveillance cameras, mobile-device cameras, action cameras, and the like produce large amounts of image and video data every day. Because of device cost, mobility requirements, size limits, battery capacity, and similar constraints, these image and video sensing devices must compress the data after acquiring it and before storing or transmitting it. This sense-then-compress approach consumes energy in both the sensing step and the compression step, and it also occupies more storage space. Compressed sensing was proposed to solve this problem: it exploits the sparse structure of the measured signal so that sensing and compression of the information are combined. Signals obtained by compressed sensing must then be reconstructed by a compressed sensing recovery algorithm to restore the original signals.
Compressed sensing has not yet been applied on a large scale because of several problems: (1) most signals measured in practice are not sparse; (2) the measured signals are easily corrupted by burst noise; and (3) compressed sensing sensor hardware has not yet become widespread. Solving the last problem depends on the development of the hardware industry, whereas the first two problems can be addressed with better algorithms and techniques.
Besides sparsity, the concept on which compressed sensing was originally built, low rank is another important property of data. Many natural signals, such as video streams, can be described by low-rank models. Because the frames of a video are strongly correlated, a video stream can be modeled as low-rank matrix data. The distinction between foreground and background arises from the differences between video frames, and the variation of foreground objects makes a low-rank model of the entire data inaccurate. Since the foreground variation is generally concentrated and small relative to the amount of background data, it can be modeled as a sparse signal. Based on this idea, robust principal component analysis was proposed and applied to the separation of low-rank and sparse signals.
Because the frames of a video are both strongly correlated and different from one another, video stream data can be modeled as the superposition of a low-rank matrix and a sparse matrix. With this model, robust principal component analysis effectively recovers the foreground and background of uncompressed video stream data. For sensors based on compressed sensing, however, this approach loses its effectiveness, because only compressed measurements are available.
Disclosure of Invention
The present invention addresses the above problem and provides a method for separating the foreground and background of a video data stream obtained after compressed sensing.
The technical scheme adopted by the invention is as follows:
a foreground and background recovery method for a video data stream after compressed sensing, comprising the following steps:
S1, initialization:
uncompressed video stream data is modeled as the sum of a low-rank matrix and a sparse matrix, namely:
X=L+S
wherein L represents the low-rank matrix, i.e. the background of the video stream, S represents the sparse matrix, i.e. the foreground that changes from frame to frame, and X is the complete video stream data;
after the compression operation, the data observed by the sensor is y = F(X) + n, where F is the compression operation and n is the measurement noise, which obeys a Gaussian distribution with mean 0 and variance σ²;
the foreground and background are then recovered by a message-passing method:
S2, the low-rank matrix and the sparse matrix are estimated simultaneously by the linear estimator A, giving an estimate and an estimation error for each of the two matrices, specifically as follows:
the estimate of the sparse matrix is:
wherein
is the linear estimator's estimate of S, M is the dimension of the output obtained after the operator A is applied, N is the product of the number of rows and the number of columns of the matrix X, and
is the linear estimator's estimate of L;
the estimation error of the sparse matrix is:
wherein
is the estimation error of S input to the linear estimator A, and
is the estimation error of L input to the linear estimator A;
the estimate of the low-rank matrix is:
the estimation error of the low-rank matrix is:
wherein
S3, based on the result of step S2, a sparse matrix estimator is used to obtain a refined estimate of the sparse matrix (different sparse matrix estimators may be selected, such as a soft-threshold denoiser or a SURE-LET estimator):
wherein c_B and α_B are linear combination coefficients, chosen so that the correlation between the input and output estimation errors is 0 while the output estimation error of the module is minimized:
wherein <A, B> = tr(A^T B), and D_S is the estimation operation of the sparse matrix estimator on S;
a refined estimate of the low-rank matrix is then obtained with a low-rank matrix estimator (selectable low-rank matrix estimators include an optimal rank-r estimator, a soft-threshold low-rank matrix estimator, a hard-threshold estimator, and the like):
wherein
is the estimate of L by the low-rank matrix estimator C, and D_L is the estimation operation of the low-rank matrix estimator on L;
S4, the result of step S3 is fed back to step S2, and the estimation is iterated until the output converges, yielding the recovered foreground and background respectively.
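The two simplest estimator choices named in step S3 can be sketched as follows. This is only an illustration under stated assumptions, not the estimators of the invention: the soft-threshold denoiser is taken to be the usual elementwise shrinkage, the optimal rank-r estimator is assumed to mean the truncated singular value decomposition, and the function names, matrix sizes, threshold `tau`, and noise level are all hypothetical. The linear combination with c_B and α_B is not included.

```python
import numpy as np

def soft_threshold(R, tau):
    """Elementwise soft threshold: shrink magnitudes by tau, zero anything smaller."""
    return np.sign(R) * np.maximum(np.abs(R) - tau, 0.0)

def best_rank_r(R, r):
    """Closest rank-r matrix to R in Frobenius norm, via truncated SVD."""
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r, :]

# Hypothetical toy data: a rank-2 "background" and a 5%-dense "foreground".
rng = np.random.default_rng(0)
L_true = rng.standard_normal((60, 2)) @ rng.standard_normal((2, 40))
S_true = np.zeros((60, 40))
mask = rng.random((60, 40)) < 0.05
S_true[mask] = 5.0 * rng.standard_normal(mask.sum())
noise = 0.1 * rng.standard_normal((60, 40))

# Each estimator is applied to a noisy version of the matrix it targets.
L_hat = best_rank_r(L_true + noise, r=2)
S_hat = soft_threshold(S_true + noise, tau=0.3)
```

In this toy setting the rank-r projection removes most of the noise from the low-rank input, and the soft threshold zeroes most entries outside the sparse support.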
The estimation errors output in step S3 can be obtained by different estimation methods, and the specific method may differ between usage scenarios. In addition, to ensure the robustness of the method, the estimator involved in step S2 may be damped, i.e. its rate of change between iterations is slowed down, so that the whole method is more robust and reliable.
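Damping can be illustrated with a scalar toy iteration. The sketch below assumes damping means a convex combination of the new and the previous estimate; the helper `damp`, the factor `beta`, and the update `g` are hypothetical and stand in for the actual estimator of step S2.

```python
def damp(new, old, beta=0.5):
    """Convex combination of new and previous estimates; beta = 1 means no damping."""
    return beta * new + (1.0 - beta) * old

def g(x):
    # Toy update with fixed point x = 2. Its slope has magnitude 1.5,
    # so the undamped iteration oscillates and diverges.
    return -1.5 * x + 5.0

x_plain = 0.0
x_damped = 0.0
for _ in range(50):
    x_plain = g(x_plain)                              # undamped iteration
    x_damped = damp(g(x_damped), x_damped, beta=0.5)  # damped iteration
```

With `beta = 0.5` the effective update becomes x ← −0.25x + 2.5, which contracts toward the same fixed point, whereas the undamped iterate moves further away at every step.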
The beneficial effect of the invention is that it solves the problem of separating the foreground and background of a video data stream obtained after compressed sensing.
Examples
A segment of laboratory surveillance video is used as an example. The video contains 500 frames, each 240 × 320 pixels. The whole video is first compressed; the compression uses a randomly selected discrete cosine transform operation combined with a random positive/negative phase (random sign) operation, and the compression ratio of the video is 15%.
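A compression of this kind can be sketched on a small synthetic video. The sketch rests on several assumptions that the text does not confirm: the random phase is taken to be a random ±1 sign per pixel, "randomly selected" is taken to mean keeping a random subset of the DCT coefficients, and the 15% ratio is taken to be M/N. The frame size, frame count, noise level, and all function names are hypothetical, and the explicit DCT matrix is only practical at toy sizes; a full-size video would need a fast transform.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix (explicit form, suitable only for small n)."""
    k = np.arange(n)[:, None]
    j = np.arange(n)[None, :]
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * j + 1) * k / (2 * n))
    C[0, :] /= np.sqrt(2.0)
    return C

def make_operator(n, ratio, rng):
    """Random sign flips followed by a random subset of DCT coefficients."""
    signs = rng.choice([-1.0, 1.0], size=n)
    m = int(round(ratio * n))
    rows = np.sort(rng.choice(n, size=m, replace=False))
    C = dct_matrix(n)[rows]              # m x n with orthonormal rows

    def forward(x):
        return C @ (signs * x)

    def adjoint(y):
        return signs * (C.T @ y)

    return forward, adjoint, m

rng = np.random.default_rng(0)
frames, h, w = 10, 8, 10                 # toy sizes, not 500 frames of 240 x 320

# Background: one static image repeated in every frame (a rank-1 matrix).
background = rng.random(h * w)
L = np.tile(background[:, None], (1, frames))

# Foreground: a few changed pixels per frame (a sparse matrix).
S = np.zeros_like(L)
idx = rng.random(L.shape) < 0.05
S[idx] = rng.random(idx.sum())

X = L + S                                # pixels x frames
forward, adjoint, m = make_operator(X.size, 0.15, rng)
sigma = 0.01                             # assumed noise standard deviation
y = forward(X.ravel()) + sigma * rng.standard_normal(m)
```

Because the kept DCT rows are orthonormal and the signs square to one, applying `forward` after `adjoint` returns the input unchanged, a property that keeps linear estimation steps simple for operators of this type.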
First, the foreground and background matrices are initialized as all-zero matrices of the corresponding sizes, and the estimation error is initialized to
The whole iterative recovery process follows fig. 1: the estimated matrices are passed from module A to module B and module C, the outputs of module B and module C are then passed back to module A, and the whole process is repeated until the estimated matrices converge.
The SURE-LET estimator is used in module B to estimate the sparse matrix, and the optimal rank-r estimator is used in module C to estimate the low-rank matrix. The final recovery results are shown in figs. 3 and 4: fig. 3 shows the recovered video background image, and fig. 4 shows the corresponding foreground image.