
CN108881911B - Foreground and background recovery method for a video data stream after compressed sensing

Info

Publication number: CN108881911B
Application number: CN201810667783.7A
Authority: CN
Inventors: 袁晓军
Original assignee: University of Electronic Science and Technology of China
Current assignee: Sichuan Chuangshu Intelligent Technology Co ltd
Priority date: 2018-06-26
Filing date: 2018-06-26
Publication date: 2020-07-10
Anticipated expiration: 2038-06-26
Also published as: CN108881911A

Abstract

The invention belongs to the technical field of video information processing and image processing, and particularly relates to a foreground and background recovery method for a video data stream after compressed sensing. The method works iteratively. A linear estimator simultaneously estimates a low-rank matrix and a sparse matrix, yielding an estimate and an estimation error for each of the two matrices. A sparse matrix estimator and a low-rank matrix estimator then refine the estimates of the sparse matrix and the low-rank matrix respectively. The refined estimates are fed back to the linear estimator, and the estimation is iterated until the output converges, yielding the recovered foreground and the recovered background. The method has the beneficial effect of solving the problem of separating the foreground and the background of a video data stream obtained after compressed sensing.

Description

Foreground and background recovery method for a video data stream after compressed sensing

Technical Field

The invention belongs to the technical field of video information processing and image processing, and particularly relates to a foreground and background recovery method for a video data stream after compressed sensing.

Background

Industrial surveillance cameras, mobile-device cameras, action cameras, and the like produce large amounts of image and video data every day. Because of device cost, mobility requirements, size limits, battery capacity, and similar constraints, these sensing devices must compress image and video data after acquiring it, before storing or transmitting it. This sense-first, compress-later approach consumes energy in both the sensing step and the compression step, and also occupies more storage space. Compressed sensing was proposed to solve this problem: it exploits the sparse structure of the measured signal to combine sensing and compression into a single step. Signals obtained by compressed sensing must then be decompressed by a compressed sensing recovery algorithm to restore the original signal.

Compressed sensing has not yet been applied on a large scale because of several problems: 1. most actually measured signals are not sparse; 2. the measured signals are easily disturbed by burst noise; 3. compressed sensing sensor hardware is not yet widespread. Solving the last problem requires development in the hardware industry, while the first two can be addressed by better algorithms and techniques.

Besides sparsity, the concept on which compressed sensing was originally built, low rank is another important data property. Many natural signals, such as video streams, can be described by low-rank models. Because the frames of a video are strongly correlated, the video stream can be modeled as low-rank matrix data. The distinction between foreground and background arises from the differences between video frames: changes in foreground objects make a low-rank model of the entire data inaccurate. Since the foreground changes are generally concentrated and small relative to the amount of background data, they can be modeled as a sparse signal. Based on this idea, robust principal component analysis was proposed and applied to the separation of low-rank and sparse signals.

Because the frames of video data are both strongly correlated and different from one another, video stream data can be modeled as the superposition of a low-rank matrix and a sparse matrix. On this basis, for uncompressed video stream data, the foreground and background of the video can be effectively recovered by robust principal component analysis. For sensors based on compressed sensing, however, this approach is no longer effective because only compressed measurements are available.
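The low-rank plus sparse modeling described above can be illustrated with a small synthetic example. This is only a sketch under assumed toy sizes: the variable names, the static rank-1 background, and the two-pixel moving object are illustrative assumptions, not data from the invention.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes (assumed for illustration only).
height, width, n_frames = 6, 8, 20
n_pixels = height * width

# Static background: every column (one vectorized frame) is the same image,
# so the background matrix has rank 1.
background_image = rng.uniform(0.1, 1.0, size=n_pixels)
background = np.tile(background_image[:, None], (1, n_frames))

# Sparse foreground: a small object covering 2 pixels that moves each frame.
foreground = np.zeros((n_pixels, n_frames))
for t in range(n_frames):
    foreground[t:t + 2, t] = 1.0

# The complete video is the superposition of the two parts.
video = background + foreground

print("rank of background:", np.linalg.matrix_rank(background))
print("fraction of nonzero foreground entries:",
      np.count_nonzero(foreground) / foreground.size)
```

Each column of `video` is one vectorized frame, which is why strong correlation between frames shows up as low rank of the background part.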

Disclosure of Invention

The present invention addresses the above problem and provides a method for separating the foreground and background of a video data stream obtained after compressed sensing.

The technical scheme adopted by the invention is as follows:

a foreground and background recovery method for a video data stream after compressed sensing, comprising the steps of:

S1, initialization:

uncompressed video stream data is modeled as the sum of a low-rank matrix and a sparse matrix, namely:

X = L + S

wherein L represents the low-rank matrix, i.e. the background in the video stream, S represents the sparse matrix, i.e. the foreground that changes from frame to frame, and X is the complete video stream data;

after the compression operation, the data observed by the sensor is y = F(X) + n, where F is the compression operator and n is the measurement noise, which obeys a Gaussian distribution with mean 0 and variance σ²;

the foreground and background are recovered based on a message passing method:

S2, a linear estimator A is used to estimate the low-rank matrix and the sparse matrix simultaneously, yielding an estimate and an estimation error for each of the two matrices, specifically as follows:

the estimate of the sparse matrix is obtained as:

wherein

is the estimate of S by the linear estimator, M is the dimension of y, i.e. of the output of the compression operator, N is the product of the number of rows and columns of the matrix X, and

is the estimate of L by the linear estimator;

the estimation error of the sparse matrix is:

wherein

is the estimation error of S input to the linear estimator A, and

is the estimation error of L input to the linear estimator A;

the estimate of the low-rank matrix is obtained as:

the estimation error of the low-rank matrix is:

wherein

S3, according to the result of step S2, a sparse matrix estimator is used to further obtain the estimate of the sparse matrix (different sparse matrix estimators may be selected, such as a soft-threshold denoiser, a SURE-LET estimator, etc.):

wherein c_B and α_B are linear combination coefficients, chosen so that the correlation between the input and output estimation errors is 0 while the output estimation error of the module is minimized:

wherein <A,B> = tr(A^T B), and

D_S is the estimation operation of the sparse matrix estimator on S;

a low-rank matrix estimator is then used to further obtain the estimate of the low-rank matrix (selectable low-rank matrix estimators include an optimal rank-r estimator, a soft-threshold low-rank matrix estimator, a hard-threshold estimator, etc.):

wherein

is the estimate of L by the low-rank matrix estimator C, and

D_L is the estimation operation of the low-rank matrix estimator on L;

S4, the result of step S3 is fed back to step S2, and the estimation is iterated until the output converges, yielding the recovered foreground and background respectively.
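Two of the estimators named in step S3, the soft-threshold denoiser and the rank-r estimator, together with the inner product <A,B> = tr(A^T B), can be sketched as follows. The function names and the threshold and rank arguments are assumptions made for illustration. The linear combination with c_B and α_B is deliberately not shown, because its formulas are not reproduced in this text.

```python
import numpy as np

def soft_threshold(matrix, threshold):
    """Entry-wise soft thresholding: shrink magnitudes by `threshold`,
    setting entries whose magnitude is below it to zero."""
    return np.sign(matrix) * np.maximum(np.abs(matrix) - threshold, 0.0)

def rank_r_estimate(matrix, rank):
    """Rank-`rank` approximation by truncated SVD (the closest rank-`rank`
    matrix in Frobenius norm)."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]

def inner_product(a, b):
    """Matrix inner product <A, B> = tr(A^T B), computed entry-wise
    so that the product A^T B is never formed."""
    return float(np.sum(a * b))

# Small usage example with made-up numbers.
noisy_sparse = np.array([[0.05, -2.0], [1.5, -0.02]])
print(soft_threshold(noisy_sparse, 0.1))

rng = np.random.default_rng(0)
noisy_low_rank = rng.standard_normal((5, 4))
print(np.linalg.matrix_rank(rank_r_estimate(noisy_low_rank, 2)))
```

Soft thresholding suits the sparse foreground because it zeroes small entries, while the truncated SVD suits the background because it keeps only the dominant singular directions.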

The estimation error output in step S3 can be computed by different estimation methods, and the specific method may differ between usage scenarios. Meanwhile, to ensure robustness, the estimators involved in step S2 may be damped, i.e. their rate of change is slowed down, so that the whole method is more robust and reliable.
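One common way to slow down the rate of change of an iterate is a convex combination of the previous and the newly proposed value. The sketch below assumes this form of damping; the function name and the damping factor are hypothetical, and the text does not specify which damping rule is used.

```python
import numpy as np

def damp(previous, proposed, damping=0.5):
    """Return a convex combination of the previous and proposed estimates.

    damping = 0 accepts the proposed value unchanged; values closer to 1
    keep the iterate closer to its previous value.
    """
    if not 0.0 <= damping < 1.0:
        raise ValueError("damping must lie in [0, 1)")
    return damping * previous + (1.0 - damping) * proposed

# Example: the update moves only 75% of the way toward the proposal.
print(damp(np.zeros(3), np.ones(3), damping=0.25))
```

A larger damping factor trades convergence speed for stability.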

The method has the beneficial effect of solving the problem of separation of the foreground and the background of the video data stream obtained after the compressed sensing.

Drawings

FIG. 1 is a block diagram of the foreground and background separation method of the present invention;

FIG. 2 shows two frames of images extracted from a video segment, each frame being 240 × 320 pixels;

FIG. 3 is a video background extracted using the method of the present invention corresponding to the two frames of images in FIG. 2;

FIG. 4 is a video foreground extracted by the method of the present invention corresponding to the two frames of images in FIG. 2.

Detailed Description

The method of the present invention will be further described with reference to the accompanying drawings and examples.

Examples

A segment of surveillance video from a laboratory is used as an example. The video contains 500 frames, each of size 240 × 320 pixels. The whole video is first compressed, using a randomly selected discrete cosine transform operation combined with a random positive/negative phase (sign) operation; the compression ratio of the video is 15%.
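The compression described above can be sketched on a short vector. Several details here are assumptions, since the text does not spell them out: the random signs are applied before the DCT, the DCT is the orthonormal DCT-II, "randomly selected" means keeping a random subset of DCT coefficients, and the toy length 200 and noise level 0.01 are arbitrary. The names `make_compressor`, `forward`, and `adjoint` are hypothetical.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    matrix = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    matrix[0, :] /= np.sqrt(2.0)
    return matrix

def make_compressor(n, ratio, rng):
    """Build F (random signs, DCT, random row selection) and its adjoint."""
    signs = rng.choice([-1.0, 1.0], size=n)           # random +/- phase
    m = int(round(ratio * n))                          # number of kept rows
    rows = rng.choice(n, size=m, replace=False)        # random selection
    dct = dct_matrix(n)

    def forward(x):
        return (dct @ (signs * x))[rows]

    def adjoint(y):
        full = np.zeros(n)
        full[rows] = y
        return signs * (dct.T @ full)

    return forward, adjoint

rng = np.random.default_rng(0)
n = 200                                   # toy length, not 240 * 320 * 500
forward, adjoint = make_compressor(n, 0.15, rng)

x = rng.standard_normal(n)                # stands in for the vectorized video
clean = forward(x)
y = clean + 0.01 * rng.standard_normal(clean.shape)   # y = F(x) + n
print(y.shape)
```

For real video sizes the dense DCT matrix would be replaced by a fast transform; the dense form is used here only to keep the sketch self-contained.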

First, the foreground and background matrices are initialized as all-zero matrices of the corresponding sizes, and the estimation errors are initialized to

The whole iterative recovery process follows FIG. 1: the estimated matrices are passed from module A to module B and module C, the outputs of module B and module C are then passed back to module A, and the process is repeated until the estimated matrices converge.
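The control flow of this loop can be sketched as below. This shows only the message flow and a stopping test. The three module callables are hypothetical placeholders that the caller must supply, the relative-change stopping rule and its tolerance are assumptions, and the estimation-error bookkeeping of the actual modules is omitted.

```python
import numpy as np

def run_loop(module_a, module_b, module_c, shape, tol=1e-6, max_iters=200):
    """Iterate A -> (B, C) -> A until the estimates stop changing.

    module_a(sparse, low_rank) returns the inputs for modules B and C;
    module_b and module_c return refined sparse / low-rank estimates.
    """
    sparse = np.zeros(shape)       # all-zero initialization
    low_rank = np.zeros(shape)
    for iteration in range(1, max_iters + 1):
        sparse_input, low_rank_input = module_a(sparse, low_rank)
        new_sparse = module_b(sparse_input)
        new_low_rank = module_c(low_rank_input)
        change = (np.linalg.norm(new_sparse - sparse)
                  + np.linalg.norm(new_low_rank - low_rank))
        scale = np.linalg.norm(new_sparse) + np.linalg.norm(new_low_rank) + 1e-12
        sparse, low_rank = new_sparse, new_low_rank
        if change / scale < tol:   # relative change small: treat as converged
            break
    return sparse, low_rank, iteration

# Placeholder modules that only exercise the control flow (not real estimators):
# the updates are simple contractions with fixed points 2 and 4.
sparse, low_rank, iterations = run_loop(
    module_a=lambda s, l: (0.5 * s + 1.0, 0.5 * l + 2.0),
    module_b=lambda s: s,
    module_c=lambda l: l,
    shape=(2, 3),
)
print(iterations, sparse[0, 0], low_rank[0, 0])
```

In an actual implementation the placeholders would be replaced by the linear estimator and the chosen sparse and low-rank estimators.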

The SURE-LET estimator is used in module B to estimate the sparse matrix, and the optimal rank-r estimator is used in module C to estimate the low-rank matrix. The final recovery results are shown in FIGS. 3 and 4: FIG. 3 shows the recovered video background, and FIG. 4 shows the corresponding foreground.

Claims

1. A foreground and background recovery method for a video data stream after compressed sensing, comprising the steps of:

S1, initialization:

video stream data before compressed sensing is modeled as the sum of a low-rank matrix and a sparse matrix, i.e.:

X = L + S

after the compressed sensing operation, the data observed by the sensor is y = F(X) + n, where F is the compressed sensing operator and n is the measurement noise, which obeys a Gaussian distribution with mean 0 and variance σ²;

the estimate of the sparse matrix is obtained as:

wherein

is the estimate of S by the linear estimator, M is the dimension of y, N is the product of the number of rows and columns of the matrix X, and

is the estimate of L by the linear estimator;

the estimation error of the sparse matrix is:

wherein

is the estimation error of S input to the linear estimator A, and

is the estimation error of L input to the linear estimator A;

the estimate of the low-rank matrix is obtained as:

the estimation error of the low-rank matrix is:

wherein

S3, according to the result of step S2, a sparse matrix estimator B is used to further obtain the estimate of the sparse matrix:

wherein c_B and α_B are linear combination coefficients, and

is the estimate of S by the sparse matrix estimator:

wherein <A,B> = tr(A^T B),

a low-rank matrix estimator C is then used to further obtain the estimate of the low-rank matrix:

wherein

is the estimate of L by the low-rank matrix estimator C,

S4, the result of step S3 is fed back to the linear estimator of step S2, and the estimation is iterated until the output converges, yielding the recovered foreground and background respectively.