CN113992920A - Video compressed sensing reconstruction method based on deep expansion network - Google Patents

Video compressed sensing reconstruction method based on deep expansion network

Info

Publication number
CN113992920A
CN113992920A
Authority
CN
China
Prior art keywords
network
expansion network
deep expansion
deep
video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111239880.4A
Other languages
Chinese (zh)
Inventor
张健 (Jian Zhang)
武卓远 (Zhuoyuan Wu)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University Shenzhen Graduate School
Original Assignee
Peking University Shenzhen Graduate School
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-10-25
Filing date
2021-10-25
Publication date
2022-01-28
Application filed by Peking University Shenzhen Graduate School filed Critical Peking University Shenzhen Graduate School
Priority to CN202111239880.4A priority Critical patent/CN113992920A/en
Publication of CN113992920A publication Critical patent/CN113992920A/en
Pending legal-status Critical Current

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/42: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G06N 3/048: Activation functions
    • G06N 3/08: Learning methods
    • G06N 3/084: Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)

Abstract

A video compressed sensing reconstruction method based on a deep expansion network comprises the following steps. S1, constructing a training data set: the training data set consists of a plurality of data pairs, each composed of an observation frame formed by compressing multiple frames and the corresponding uncompressed multiple frames. S2, constructing a deep expansion network: the half-quadratic splitting algorithm for optimizing compressed sensing is unfolded into a deep expansion network, and a dense feature fusion technique is added. S3, training the deep expansion network: given the training data set and a loss function, the parameters of the deep expansion network are continuously optimized with back propagation and gradient descent until the loss function is stable. S4, applying the trained deep expansion network to video compressed sensing reconstruction: the input is the compressed observation frame and the sampling matrix, and the output is the reconstructed video frames. The method has good interpretability and achieves high reconstruction accuracy while maintaining a high reconstruction speed.

Description

Video compressed sensing reconstruction method based on deep expansion network
Technical Field
The invention belongs to the technical field of video processing, and particularly relates to a video compressed sensing reconstruction method based on a deep expansion network.
Background
Video compressed sensing is widely used in imaging systems whose purpose is to capture high-dimensional signals, such as video [1] or spectra [2], with two-dimensional sensors. By introducing an additional hardware component into the imaging system, the high-dimensional signal is compressed into a two-dimensional signal, and a reconstruction algorithm then recovers the high-dimensional signal from the two-dimensional one.
Conventional video compressed sensing reconstruction algorithms are briefly described below.
From a mathematical point of view, given the compressed observation frame and the sampling matrix, video compressed sensing amounts to solving an ill-posed inverse problem. Traditional methods introduce prior knowledge of the image or video as a regularization term and iteratively solve a sparsity-regularized optimization problem, for example using total variation [3], Gaussian mixture models [4], optical flow [5], or non-local low-rank structure [6] as the prior. Such optimization-based iterative methods can be applied directly to different sampling matrices without retraining, but their performance is limited by the chosen prior, and the iterative optimization takes a long time. In recent years, driven by the rapid development of deep learning, many video compressed sensing methods based on convolutional neural networks have emerged; they fall into two classes, deep non-expanded networks and deep expansion networks. Deep non-expanded networks learn a direct mapping from the compressed observation frames to the original frames: stacked convolutional layers have been used to learn such a mapping [7]; a deep fully-connected network has been designed for the same purpose [8]; the sampling matrix and the reconstruction network have been jointly optimized under hardware constraints [9]; BIRNAT [10] reconstructs the first frame with a convolutional neural network and the remaining frames with a bidirectional recurrent network; MetaSCI [11] adopts a lightweight backbone applicable to different sampling matrices; and RevSCI [12] uses grouped reversible 3D convolutions to handle large-scale video compressed sensing reconstruction. The drawback of this class of methods is their poor interpretability. Deep expansion networks map an optimization method into a deep network to avoid the many iterations of traditional methods; Tensor-ADMM [13], Tensor-FISTA [14], GAP-UNet [15], PnP-FFDNet [16] and others have been proposed in succession. This class has two shortcomings: first, 2D convolution is commonly used to exploit inter-frame correlation, which is not an optimal choice; second, significant information is lost as it passes between the stages of the deeply unfolded network.
Disclosure of Invention
The invention aims to overcome the defects and shortcomings of existing video compressed sensing reconstruction methods and provides a video compressed sensing reconstruction method based on a deep expansion network. The method designs a deep expansion network for training and reconstruction; it not only has good interpretability but also achieves high reconstruction accuracy while maintaining a high reconstruction speed.
The technical scheme of the invention is as follows:
A video compressed sensing reconstruction method based on a deep expansion network comprises the following steps. S1, constructing a training data set: the training data set consists of a plurality of data pairs, each composed of an observation frame formed by compressing multiple frames and the corresponding uncompressed multiple frames. S2, constructing a deep expansion network: the half-quadratic splitting algorithm for optimizing compressed sensing is unfolded into a deep expansion network, and a dense feature fusion technique is added. S3, training the deep expansion network: given the training data set and a loss function, the parameters of the deep expansion network are continuously optimized with back propagation and gradient descent until the loss function is stable. S4, applying the trained deep expansion network to video compressed sensing reconstruction: the input is the compressed observation frame and the sampling matrix, and the output is the reconstructed video frames.
Preferably, in the method for video compressed sensing reconstruction based on the deep expansion network, in step S1, a video training data set is constructed for training the deep expansion network, where the training data set is composed of a plurality of data pairs, and each data pair includes a group of consecutive video frames and a corresponding multi-frame compressed observation frame.
Preferably, in the video compressed sensing reconstruction method based on the deep expansion network, in step S2, the deep expansion network is obtained by unfolding the half-quadratic splitting algorithm for optimizing compressed sensing; its network structure is formed by alternately stacking data modules and prior modules, and 3D convolution is introduced to improve the ability of the deep expansion network to characterize inter-frame correlation; and the dense feature fusion technique is used to reduce the loss caused by information passing between different stages and to help information be adaptively transmitted across the stages.
Preferably, in the video compressed sensing reconstruction method based on the deep expansion network, in step S3, a back propagation algorithm is used to calculate gradients of the loss function with respect to each parameter in the deep expansion network, and then a gradient descent algorithm is used to optimize parameters of a network layer of the deep expansion network based on the training data set until the value of the loss function is stable, so as to obtain an optimal parameter of the deep expansion network.
Preferably, in the video compressed sensing reconstruction method based on the deep expansion network, in step S4, a rough reconstruction is first performed using the acquired observation frame and the sampling matrix; the rough result and the sampling matrix are then sent to the trained deep expansion network, whose output is a high-quality reconstruction result.
According to the technical scheme of the invention, the beneficial effects are as follows:
1. the method constructs a deep expansion network for video compressed sensing reconstruction in which 3D convolution is introduced to solve the proximal mapping, making better use of temporal and spatial correlation;
2. the method greatly surpasses prior methods in both subjective results and quantitative metrics, achieving the best reconstruction performance to date; the proposed dense feature fusion technique effectively addresses the information loss within the network and brings a gain of about 0.45 dB;
3. the method is the first to discuss how to address information loss in the video compressed sensing task and provides a feasible solution. In addition, to improve the information fusion capability, the method proposes a dense feature adaptive fusion scheme that lets the effective information in the dense features be adaptively transmitted between different stages.
For a better understanding and appreciation of the concepts, principles of operation, and effects of the invention, reference will now be made in detail to the following examples, taken in conjunction with the accompanying drawings, in which:
drawings
In order to more clearly illustrate the embodiments of the invention and the technical solutions of the prior art, the drawings needed for the detailed description are briefly introduced below.
Fig. 1 is a flowchart of the video compressed sensing reconstruction method based on a deep expansion network according to the present invention.
Fig. 2 is a schematic diagram of video compressed sensing reconstruction.
Fig. 3 is a structural diagram of the deep expansion network.
Fig. 4 is a schematic diagram of the dense feature adaptive fusion technique.
Figs. 5 and 6 are visual comparisons of the reconstruction algorithms on part of the synthetic data.
Fig. 7 is a visual comparison of the reconstruction algorithms on the real data in the experiments.
Detailed Description
In order to make the objects, technical means and advantages of the present invention more apparent, the present invention will be described in detail with reference to the accompanying drawings and specific examples. These examples are merely illustrative and not restrictive of the invention.
The invention discloses a video compressed sensing reconstruction method based on a deep expansion network, which reconstructs a high-quality video sequence from observation frames, formed by compressing multiple frames, acquired by a digital micromirror device (DMD) or liquid-crystal-on-silicon (LCOS) camera.
As shown in Fig. 1, the video compressed sensing reconstruction method based on the deep expansion network of the present invention includes the following steps:
S1, constructing a training data set: the training data set is composed of a plurality of data pairs, each composed of an observation frame formed by compressing multiple frames and the corresponding uncompressed multiple frames. Specifically, a video training data set is constructed for training the deep expansion network; each data pair comprises a group of consecutive video frames and the corresponding compressed observation frame.
To determine the optimal parameters of the proposed deep expansion network, a training data set is constructed for the video compressed sensing reconstruction problem. In the experiments, training is performed on the DAVIS data set, which contains 90 scenes at 480p resolution; during training the original images are cropped to 128 × 128, each data pair contains 8 frames, and 25600 data pairs are collected in total. Each data pair thus consists of a compressed observation frame y and the uncompressed multiple frames x; the observation frame and the sampling matrix are the input of the network, and the output is the reconstruction result x_K, with the uncompressed frames as the reconstruction target. These pairs form the network training data set S.
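The forward model behind such a pair is the standard one for video compressed sensing: each of the frames is modulated element-wise by its mask and the results are summed into a single observation frame. A minimal sketch of how one pair could be simulated (PyTorch; the function name and the random binary masks are illustrative assumptions, not taken from the patent):

```python
import torch

def make_data_pair(frames, phi):
    """Simulate one (observation frame, ground truth) training pair.

    frames: (T, H, W) tensor of T consecutive video frames, e.g. an
            8-frame 128 x 128 crop as described above.
    phi:    (T, H, W) sampling matrix, one mask per frame.
    """
    y = (phi * frames).sum(dim=0)  # modulate each frame by its mask, sum over time
    return y, frames

# Example: one synthetic pair with random binary masks.
frames = torch.rand(8, 128, 128)
phi = torch.randint(0, 2, (8, 128, 128)).float()
y, x = make_data_pair(frames, phi)
```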
S2, constructing a deep expansion network: the half-quadratic splitting algorithm for optimizing compressed sensing is unfolded into a neural network, and a dense feature fusion technique is added. The deep expansion network's structure alternately stacks data modules and prior modules, and 3D convolution is introduced to improve the network's ability to characterize inter-frame correlation. The dense feature fusion technique is used to reduce the loss caused by information passing between different stages and to help information be adaptively transmitted across the stages.
The reconstruction result of video compressed sensing can be obtained by solving the following optimization problem:

\min_{x}\ \frac{1}{2}\left\| y - \Phi x \right\|_2^2 + \lambda\,\Psi(x), \qquad (1)

where x denotes B consecutive video frames (the multiple frames x in Fig. 2), y is the observation frame compressed from those B frames, Φ is the corresponding sampling matrix, Ψ(x) is a regularization term constraining prior properties of the video frames x, and λ is the coefficient of the regularization term.
Introducing an auxiliary variable v (an intermediate variable, as shown in Fig. 2), problem (1) can be converted into a constrained optimization problem:

\min_{x, v}\ \frac{1}{2}\left\| y - \Phi v \right\|_2^2 + \lambda\,\Psi(x), \quad \text{s.t.}\ x = v. \qquad (2)

The resulting objective function can be iteratively optimized by the half-quadratic splitting algorithm, which alternates the following two subproblems:

v_k = \arg\min_{v}\ \frac{1}{2}\left\| y - \Phi v \right\|_2^2 + \frac{\eta}{2}\left\| v - x_{k-1} \right\|_2^2, \qquad (3)

x_k = \arg\min_{x}\ \lambda\,\Psi(x) + \frac{\eta}{2}\left\| x - v_k \right\|_2^2, \qquad (4)

where k denotes the iteration index of the half-quadratic splitting and η is the penalty coefficient of the splitting.
The solution of (3) admits a closed form, which can be accelerated as follows:

r_k = r_{k-1} + \left( y - \Phi x_{k-1} \right), \qquad (5)

v_k = x_{k-1} + \Phi^{\top}\left( \Phi \Phi^{\top} + \eta I \right)^{-1}\left( r_k - \Phi x_{k-1} \right), \qquad (6)

where r_0 = y and r is another intermediate variable (as shown in Fig. 2).
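Because the sampling matrix of video compressed sensing stacks per-frame diagonal masks, ΦΦᵀ is itself diagonal, and the matrix inverse in (6) reduces to an element-wise division. The following sketch shows one data-module step under this assumption (PyTorch; the function name and tensor shapes are illustrative):

```python
import torch

def data_module(x_prev, r_prev, y, phi, eta):
    """One closed-form data-module step, following Eqs. (5)-(6).

    x_prev: (T, H, W) current video estimate x_{k-1}
    r_prev: (H, W)    auxiliary variable r_{k-1}, with r_0 = y
    y:      (H, W)    compressed observation frame
    phi:    (T, H, W) sampling matrix (per-frame masks)
    eta:    penalty coefficient of the half-quadratic splitting
    """
    phi_x = (phi * x_prev).sum(dim=0)         # Phi x_{k-1}
    r = r_prev + (y - phi_x)                  # Eq. (5): accelerated residual
    denom = (phi ** 2).sum(dim=0) + eta       # diag(Phi Phi^T) + eta, per pixel
    v = x_prev + phi * ((r - phi_x) / denom)  # Eq. (6), element-wise inverse
    return v, r
```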
Subproblem (4) can be viewed as a denoising subproblem, so the x- and v-subproblems can be re-expressed as:

v_k = \mathcal{D}\left( x_{k-1}, r_k, \Phi \right), \qquad (7)

x_k = \mathcal{P}\left( \left[\, v_k \,\|\, \bar{y} \,\right] \right), \qquad (8)

where ‖ denotes a concatenation operation and ȳ denotes the regularized observation frame. Since ȳ carries rich information, it is introduced into the prior module to provide supplementary reconstruction information. One iteration of the original optimization problem is thus converted into a data module 𝒟 and a prior module 𝒫 in the network. The invention therefore unfolds the half-quadratic splitting algorithm into a deep expansion network in which the reconstruction network is formed by alternately stacking data modules and prior modules (as shown in Fig. 3). The initial value (the initialization in Fig. 2) x_0 = Φᵀy is a coarse reconstruction obtained from the observation frame and the sampling matrix; x_0, y and ȳ are then sent into the network for reconstruction. The data module and the prior module solve (7) and (8), respectively, and the prior module contains a dense feature adaptive fusion module that reduces information loss.
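Putting the pieces together, one forward pass of the deep expansion network alternates the two modules for K stages. The sketch below reuses the data_module function from the previous sketch; the prior-module interface and the learnable per-stage penalty coefficients are illustrative assumptions:

```python
import torch
import torch.nn as nn

class DeepUnfoldingNet(nn.Module):
    """K alternately stacked data modules and prior modules."""

    def __init__(self, priors):  # priors: list of K prior modules
        super().__init__()
        self.priors = nn.ModuleList(priors)
        self.eta = nn.Parameter(torch.ones(len(priors)))  # per-stage eta_k

    def forward(self, y, phi, y_bar):
        x = phi * y.unsqueeze(0)   # x_0 = Phi^T y, coarse reconstruction
        r = y.clone()              # r_0 = y
        for k, prior in enumerate(self.priors):
            v, r = data_module(x, r, y, phi, self.eta[k])  # Eqs. (5)-(6)
            # prior module refines v_k from [v_k || y_bar] (residual learning)
            x = v + prior(torch.cat([v, y_bar], dim=0))
        return x
```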
To better model the inter-frame correlation, 3D convolution is used in the prior module. Compared with a 2D convolution, a 3D convolution kernel slides not only in the two-dimensional spatial plane but also along the temporal dimension, which makes better use of inter-frame correlation in multi-frame reconstruction.
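A minimal sketch of a residual 3D-convolution block of the kind such a prior module could stack (PyTorch; the channel count is an illustrative assumption):

```python
import torch.nn as nn

class Conv3DBlock(nn.Module):
    """Residual block whose kernels slide over time as well as space."""

    def __init__(self, channels=16):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv3d(channels, channels, kernel_size=3, padding=1),
            nn.LeakyReLU(0.1, inplace=True),
            nn.Conv3d(channels, channels, kernel_size=3, padding=1),
        )

    def forward(self, x):  # x: (N, C, T, H, W)
        return x + self.body(x)
```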
As described above, the information transmitted between stages of the deep expansion network has only a limited number of channels, while the backbone used in the invention is a UNet-like structure whose internal features are multi-channel; information is therefore lost during the transmission between stages. To address this problem, the invention proposes a dense feature fusion technique, whose details are as follows (refer to Figs. 2 and 3):
as shown in FIG. 3, the network structure of the k-th stage (i.e., stage k) prior module is an encoder-decoder structure and has multiple scales, here, j ∈ [1, 2, 3 ] is used]To represent different scales, where j-1 corresponds to the shallowest layer of the network, and the characteristic diagram of the j-th scale output in the k-th stage encoder is defined as
Figure BDA0003319031800000052
Correspondingly, the input and output of the j scale in the k stage decoder are respectively
Figure BDA0003319031800000053
And
Figure BDA0003319031800000054
the output of each scale of the network can be expressed as:
Figure BDA0003319031800000055
Figure BDA0003319031800000056
it should be noted that
Figure BDA0003319031800000057
Represents the input to the a priori module(s),
Figure BDA0003319031800000058
indicating that there is no residual connection between the encoder and decoder at the scale corresponding to the deepest layer of the network.
With the dense feature fusion technique (as shown in Fig. 2), the decoder inputs are augmented with features from the previous stage and become:

D_k^{3,\mathrm{in}} = \left[\, E_k^{3} \,\|\, F_{k-1}^{3} \,\right], \qquad (11)

D_k^{2,\mathrm{in}} = \left[\, E_k^{2} \,\|\, \uparrow D_k^{3,\mathrm{out}} \,\|\, F_{k-1}^{2} \,\right], \qquad (12)

D_k^{1,\mathrm{in}} = \left[\, E_k^{1} \,\|\, \uparrow D_k^{2,\mathrm{out}} \,\|\, F_{k-1}^{1} \,\right], \qquad (13)

where ↑ denotes nearest-neighbor upsampling. The input of the decoder at each stage thus fuses the feature maps of the corresponding scales in the previous stage, reducing the information loss caused by channel-number changes and by up- and down-sampling. The features coming from the (k-1)-th stage are called dense features and are expressed as follows (a sketch of this fusion is given after Eq. (14)):

\mathcal{F}_{k-1} = \left\{ F_{k-1}^{j} \right\}_{j=1}^{3} = \left\{ \left[\, E_{k-1}^{j} \,\|\, D_{k-1}^{j,\mathrm{out}} \,\right] \right\}_{j=1}^{3}. \qquad (14)
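A sketch of this fusion for a single scale (PyTorch; the channel counts and the spatial-only upsampling factor are illustrative assumptions):

```python
import torch
import torch.nn as nn

class DenseFeatureFusion(nn.Module):
    """Builds a decoder input D_k^{j,in} per Eqs. (11)-(13): concatenate the
    current encoder feature, the upsampled deeper decoder output, and the
    same-scale dense feature from stage k-1, then mix with a 1x1x1 conv."""

    def __init__(self, c_enc, c_dec, c_dense, c_out):
        super().__init__()
        self.up = nn.Upsample(scale_factor=(1, 2, 2), mode='nearest')
        self.mix = nn.Conv3d(c_enc + c_dec + c_dense, c_out, kernel_size=1)

    def forward(self, e_kj, d_deeper, f_prev):  # 5-D tensors (N, C, T, H, W)
        d_up = self.up(d_deeper)  # bring the deeper decoder output to this scale
        return self.mix(torch.cat([e_kj, d_up, f_prev], dim=1))
```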
finally, what the prior module learns is a residual map of the multi-frame reconstruction result of the data module (as shown in fig. 3):
Figure BDA0003319031800000062
since the characteristics of different channels have different contributions to the final result when the dense characteristics are fused, a dense characteristic adaptive fusion technology is provided to ensure that information is selectively transferred between adjacent stages. Specifically, when the dense feature Fk-1And regularizing the observation frame
Figure BDA0003319031800000063
In the same way, enhancement should be achieved during fusion, and inhibition should be achieved otherwise.
The core idea is to compute the similarity between dense features and the regularized observation frame, as shown in FIG. 4, for the m-th dense feature
Figure BDA0003319031800000064
For a certain position (p, q) of the c-th channel, the similarity S (p, q, c) can be defined by an anisotropic filter
Figure BDA0003319031800000065
Multiplied by dense features, where H and W denote the height and width of the ith scale feature map, nf denotes the size of the anisotropic filter, and the calculation process can be expressed as:
Figure BDA0003319031800000066
wherein
Figure BDA0003319031800000067
u is the index of the third element in the five tuple. The adaptive dense feature map may then be calculated by:
Figure BDA0003319031800000068
where σ denotes the sigmoid function, which acts to transform the variable to the [0, 1] interval. Then, the adaptive dense feature map may be represented as:
Figure BDA0003319031800000069
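The sketch below gates one dense feature map, with the temporal axis folded into the batch for clarity; the per-position anisotropic filter taps are assumed to be produced elsewhere (for example, predicted from the regularized observation frame) and passed in as a tensor:

```python
import torch
import torch.nn.functional as F

def adaptive_dense_fusion(dense, filters, nf=3):
    """Gate a dense feature by its similarity to per-position filters.

    dense:   (N, C, H, W) dense feature from stage k-1
    filters: (N, C * nf * nf, H, W) anisotropic filter taps W
    """
    n, c, h, w = dense.shape
    # Gather the nf x nf neighbourhood of every spatial position.
    patches = F.unfold(dense, kernel_size=nf, padding=nf // 2)  # (N, C*nf*nf, H*W)
    patches = patches.view(n, c * nf * nf, h, w)
    # Similarity S: per-channel inner product of taps and neighbourhood, Eq. (16).
    sim = (filters * patches).view(n, c, nf * nf, h, w).sum(dim=2)
    # Sigmoid gating, Eqs. (17)-(18): similar content is enhanced, the rest suppressed.
    return torch.sigmoid(sim) * dense
```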
s3, training a deep expansion network, wherein the training process is as follows: based on a training data set, a loss function is given, and parameters in a deep expansion network are continuously optimized by using a back propagation and gradient descent algorithm until the loss function is stable. Specifically, a loss function is designed, the gradient of the loss function relative to each parameter in the deep expansion network is calculated by adopting a back propagation algorithm, then the parameters of the network layer are optimized by adopting a gradient descent algorithm based on a training data set until the value of the loss function is stable, namely, until a model converges, and the optimal parameters of the deep expansion network are obtained.
Taking S as the training data set, the mean square error is used as the loss function of the network:

\mathcal{L} = \frac{1}{N_k N_s} \sum_{i=1}^{N_k} \left\| x_K^{(i)} - x^{(i)} \right\|_2^2, \qquad (19)

where N_k denotes the total number of training data pairs and N_s denotes the total number of pixels of the images in each data pair. The gradients of the loss function with respect to each network parameter are calculated by the back propagation algorithm; the parameters of the network layers are then optimized by the gradient descent algorithm on the training data set until the value of the loss function is stable, giving the optimal parameters of the deep expansion network.
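A minimal sketch of this training loop (PyTorch; Adam is an assumed concrete choice of gradient-descent optimizer, and the net(y, phi) interface is illustrative):

```python
import torch

def train(net, loader, epochs=100, lr=1e-4, device='cuda'):
    """Optimize the network on the training set S until the MSE loss stabilizes."""
    optimizer = torch.optim.Adam(net.parameters(), lr=lr)
    net.to(device).train()
    for epoch in range(epochs):
        for y, phi, x_gt in loader:  # one data pair (plus masks) from S
            y, phi, x_gt = y.to(device), phi.to(device), x_gt.to(device)
            x_rec = net(y, phi)                     # reconstructed frames x_K
            loss = torch.mean((x_rec - x_gt) ** 2)  # mean square error, Eq. (19)
            optimizer.zero_grad()
            loss.backward()                         # back propagation
            optimizer.step()                        # gradient-descent update
        print(f"epoch {epoch}: loss {loss.item():.4f}")
```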
S4, applying the trained deep expansion network to carry out the video compressed sensing reconstruction process: the input is the compressed observation frame and the sampling matrix, and the output is the reconstructed video frames.
Through the training process of step S3, the optimal parameters of the deep expansion network are determined. Based on the trained model, video compressed sensing reconstruction first performs a rough reconstruction with the acquired observation frame y and the sampling matrix Φ (as shown in Fig. 2), i.e. x_0 = Φᵀy; the rough result and the sampling matrix are then sent into the trained deep expansion network, whose output is the high-quality reconstruction result.
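A sketch of this inference procedure (the interface, where the trained network takes the rough reconstruction and the sampling matrix, is illustrative):

```python
import torch

@torch.no_grad()
def reconstruct(net, y, phi):
    """Rough reconstruction x_0 = Phi^T y, then refinement by the trained network.

    y:   (H, W)    acquired compressed observation frame
    phi: (T, H, W) sampling matrix
    """
    x0 = phi * y.unsqueeze(0)  # x_0 = Phi^T y: per-frame mask-weighted copies of y
    return net(x0, phi)        # output: high-quality multi-frame reconstruction
```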
During testing, the invention reconstructs synthetic data and real data respectively. The synthetic data comprise six scenes (Kobe, Traffic, Runner, Drop, Crash and Aerial), each of dimension 256 × 256 × 8; the real data set comprises two scenes (Water Balloon and Dominoes) of dimension 512 × 512 × 10. To objectively evaluate the reconstruction accuracy of the different methods, the peak signal-to-noise ratio (PSNR) is used as the comparison metric. All experiments were run on NVIDIA Tesla V100 servers. The deep expansion network used in the experiments has K = 10 stages.
Table 1: comparison of results of different methods under synthesized data
Figure BDA0003319031800000072
As shown in Table 1 above, the deep expansion network proposed by the invention is compared on the synthetic data with ten video compressed sensing reconstruction methods: GAP-TV [3], E2E-CNN [7], DeSCI [6], PnP-FFDNet [16], BIRNAT [10], Tensor-ADMM [13], Tensor-FISTA [14], GAP-UNet [15], MetaSCI [11] and RevSCI [12]. The proposed deep expansion network achieves the highest reconstruction accuracy on the synthetic data. Figs. 5 and 6 show the reconstruction results of the methods on different synthetic scenes, and Fig. 7 shows the reconstruction results on the real data; both the full frames and the magnified details of the reconstructions can be compared.
The foregoing describes preferred embodiments of the concepts and principles of operation of the invention. The above-described embodiments should not be construed as limiting the scope of the claims; other embodiments and combinations of implementations according to the inventive concept are also within the scope of the invention.
Reference documents:
[1] Llull P, Liao X, Yuan X, et al. Coded aperture compressive temporal imaging[J]. Optics Express, 2013, 21(9): 10526-10545.
[2] Wagadarikar A A, Pitsianis N P, Sun X, et al. Video rate spectral imaging using a coded aperture snapshot spectral imager[J]. Optics Express, 2009, 17(8): 6368-6388.
[3] Yuan X. Generalized alternating projection based total variation minimization for compressive sensing[C]. 2016 IEEE International Conference on Image Processing. IEEE, 2016: 2539-2543.
[4] Yang J, Yuan X, Liao X, et al. Video compressive sensing using Gaussian mixture models[J]. IEEE Transactions on Image Processing, 2014, 23(11): 4863-4878.
[5] Reddy D, Veeraraghavan A, Chellappa R. P2C2: Programmable pixel compressive camera for high speed imaging[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2011: 329-336.
[6] Liu Y, Yuan X, Suo J, et al. Rank minimization for snapshot compressive imaging[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 41(12): 2990-3006.
[7] Qiao M, Meng Z, Ma J, et al. Deep learning for video compressive sensing[J]. APL Photonics, 2020, 5(3): 030801.
[8] Iliadis M, Spinoulas L, Katsaggelos A K. Deep fully-connected networks for video compressive sensing[J]. Digital Signal Processing, 2018, 72: 9-18.
[9] Yoshida M, Torii A, Okutomi M, et al. Joint optimization for compressive video sensing and reconstruction under hardware constraints[C]. Proceedings of the European Conference on Computer Vision, 2018: 634-649.
[10] Cheng Z, Lu R, Wang Z, et al. BIRNAT: Bidirectional recurrent neural networks with adversarial training for video snapshot compressive imaging[C]. European Conference on Computer Vision. Springer, Cham, 2020: 258-275.
[11] Wang Z, Zhang H, Cheng Z, et al. MetaSCI: Scalable and adaptive reconstruction for video compressive sensing[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021.
[12] Cheng Z, Chen B, Liu G, et al. Memory-efficient network for large-scale video compressive sensing[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021.
[13] Ma J, Liu X Y, Shou Z, et al. Deep tensor ADMM-Net for snapshot compressive imaging[C]. Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019: 10223-10232.
[14] Han X, Wu B, Shou Z, et al. Tensor FISTA-Net for real-time snapshot compressive imaging[C]. Proceedings of the AAAI Conference on Artificial Intelligence, 2020, 34(07): 10933-10940.
[15] Meng Z, Jalali S, Yuan X. GAP-net for snapshot compressive imaging[J]. arXiv preprint arXiv:2012.08364, 2020.
[16] Yuan X, Liu Y, Suo J, et al. Plug-and-play algorithms for large-scale snapshot compressive imaging[C]. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020: 1447-1457.

Claims (5)

1. a video compressed sensing reconstruction method based on a deep expansion network is characterized by comprising the following steps:
s1, constructing a training data set: the training data set is composed of a plurality of data pairs, and each data pair is composed of an observation frame formed by compressing multiple frames and a corresponding uncompressed multiple frame;
s2, constructing a deep expansion network: unfolding the half-quadratic splitting algorithm for optimizing compressed sensing into the deep expansion network, and adding a dense feature fusion technique;
s3, training the deep expansion network: based on the training data set, giving a loss function, and continuously optimizing parameters in the deep expansion network by using a back propagation and gradient descent algorithm until the loss function is stable; and
s4, applying the trained deep expansion network to carry out the video compressed sensing reconstruction process: the input is the compressed observation frame and the sampling matrix, and the output is the reconstructed video frames.
2. the video compressed sensing reconstruction method based on the deep expansion network as claimed in claim 1, wherein in step S1, a video training data set is constructed for training the deep expansion network, the training data set is composed of a plurality of data pairs, and each data pair includes a group of consecutive video frames and the corresponding observation frame formed by compressing multiple frames.
3. the video compressed sensing reconstruction method based on the deep expansion network according to claim 1, wherein in step S2, the deep expansion network is obtained by unfolding the half-quadratic splitting algorithm for optimizing compressed sensing, the network structure of the deep expansion network is formed by alternately stacking data modules and prior modules, wherein 3D convolution is introduced to improve the ability of the deep expansion network to characterize inter-frame correlation; and a dense feature fusion technique is used to reduce the loss caused by information passing between different stages and to help information be adaptively transmitted across the stages.
4. The method for video compressed sensing reconstruction based on the deep expansion network of claim 1, wherein in step S3, a back propagation algorithm is used to calculate gradients of a loss function with respect to each parameter in the deep expansion network, and then a gradient descent algorithm is used to optimize parameters of a network layer of the deep expansion network based on the training data set until values of the loss function are stable, so as to obtain optimal parameters of the deep expansion network.
5. the video compressed sensing reconstruction method based on the deep expansion network as claimed in claim 1, wherein in step S4, a rough reconstruction is first performed using the acquired observation frame and the sampling matrix, and the rough result and the sampling matrix are then sent to the trained deep expansion network, the output of which is the high-quality reconstruction result.
CN202111239880.4A 2021-10-25 2021-10-25 Video compressed sensing reconstruction method based on deep expansion network Pending CN113992920A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111239880.4A CN113992920A (en) 2021-10-25 2021-10-25 Video compressed sensing reconstruction method based on deep expansion network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111239880.4A CN113992920A (en) 2021-10-25 2021-10-25 Video compressed sensing reconstruction method based on deep expansion network

Publications (1)

Publication Number Publication Date
CN113992920A true CN113992920A (en) 2022-01-28

Family

ID=79740885

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111239880.4A Pending CN113992920A (en) 2021-10-25 2021-10-25 Video compressed sensing reconstruction method based on deep expansion network

Country Status (1)

Country Link
CN (1) CN113992920A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114841901A (en) * 2022-07-01 2022-08-02 北京大学深圳研究生院 Image reconstruction method based on generalized depth expansion network
CN117058045A (en) * 2023-10-13 2023-11-14 阿尔玻科技有限公司 Method, device, system and storage medium for reconstructing compressed image
CN117994176A (en) * 2023-12-27 2024-05-07 中国传媒大学 Depth priori optical flow guided video restoration method and system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190075309A1 (en) * 2016-12-30 2019-03-07 Ping An Technology (Shenzhen) Co., Ltd. Video compressed sensing reconstruction method, system, electronic device, and storage medium
WO2020037965A1 (en) * 2018-08-21 2020-02-27 北京大学深圳研究生院 Method for multi-motion flow deep convolutional network model for video prediction
CN112991472A (en) * 2021-03-19 2021-06-18 华南理工大学 Image compressed sensing reconstruction method based on residual dense threshold network
CN113222812A (en) * 2021-06-02 2021-08-06 北京大学深圳研究生院 Image reconstruction method based on information flow reinforced deep expansion network

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190075309A1 (en) * 2016-12-30 2019-03-07 Ping An Technology (Shenzhen) Co., Ltd. Video compressed sensing reconstruction method, system, electronic device, and storage medium
WO2020037965A1 (en) * 2018-08-21 2020-02-27 北京大学深圳研究生院 Method for multi-motion flow deep convolutional network model for video prediction
CN112991472A (en) * 2021-03-19 2021-06-18 华南理工大学 Image compressed sensing reconstruction method based on residual dense threshold network
CN113222812A (en) * 2021-06-02 2021-08-06 北京大学深圳研究生院 Image reconstruction method based on information flow reinforced deep expansion network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHUOYUAN WU et al.: "Dense Deep Unfolding Network with 3D-CNN Prior for Snapshot Compressive Imaging", 2021 IEEE/CVF International Conference on Computer Vision (ICCV), pages 2-5 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114841901A (en) * 2022-07-01 2022-08-02 北京大学深圳研究生院 Image reconstruction method based on generalized depth expansion network
CN114841901B (en) * 2022-07-01 2022-10-25 北京大学深圳研究生院 Image reconstruction method based on generalized depth expansion network
CN117058045A (en) * 2023-10-13 2023-11-14 阿尔玻科技有限公司 Method, device, system and storage medium for reconstructing compressed image
CN117994176A (en) * 2023-12-27 2024-05-07 中国传媒大学 Depth priori optical flow guided video restoration method and system

Similar Documents

Publication Publication Date Title
CN107730451B (en) Compressed sensing reconstruction method and system based on depth residual error network
CN113992920A (en) Video compressed sensing reconstruction method based on deep expansion network
CN106910161B (en) Single image super-resolution reconstruction method based on deep convolutional neural network
Hawe et al. Analysis operator learning and its application to image reconstruction
CN112884851B (en) Construction method of deep compressed sensing network based on expansion iteration optimization algorithm
CN112435191B (en) Low-illumination image enhancement method based on fusion of multiple neural network structures
CN111739077A (en) Monocular underwater image depth estimation and color correction method based on depth neural network
CN112102182B (en) Single image reflection removing method based on deep learning
CN112801877B (en) Super-resolution reconstruction method of video frame
CN111179167A (en) Image super-resolution method based on multi-stage attention enhancement network
CN113362250B (en) Image denoising method and system based on dual-tree quaternary wavelet and deep learning
CN109523513B (en) Stereoscopic image quality evaluation method based on sparse reconstruction color fusion image
CN111028150A (en) Rapid space-time residual attention video super-resolution reconstruction method
CN110136060B (en) Image super-resolution reconstruction method based on shallow dense connection network
Han et al. Tensor FISTA-Net for real-time snapshot compressive imaging
CN110189260B (en) Image noise reduction method based on multi-scale parallel gated neural network
CN114746895A (en) Noise reconstruction for image denoising
CN109886898B (en) Imaging method of spectral imaging system based on optimization heuristic neural network
Chen et al. Image denoising via deep network based on edge enhancement
CN114170286A (en) Monocular depth estimation method based on unsupervised depth learning
CN110782458A (en) Object image 3D semantic prediction segmentation method of asymmetric coding network
Zhao et al. A simple and robust deep convolutional approach to blind image denoising
CN114841859A (en) Single-image super-resolution reconstruction method based on lightweight neural network and Transformer
CN115526779A (en) Infrared image super-resolution reconstruction method based on dynamic attention mechanism
CN109615576A (en) The single-frame image super-resolution reconstruction method of base study is returned based on cascade

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20220128