CN103596010B - Video coding and decoding system based on dictionary learning and compressed sensing - Google Patents
- Publication number
- CN103596010B
- Authority
- CN
- China
- Prior art keywords
- frame
- video
- decoding
- compressed sensing
- dictionary
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
The invention relates to the field of video compressed sensing and image sparse representation, and discloses a video coding and decoding system based on compressed sensing. The video coding and decoding system based on compressed sensing is designed to make a wireless video sensing network have the advantages that the complexity and calculated amount of a coding terminal are small, the volume of data transmitted through a channel is small and a decoding terminal can carry out high-quality real-time video reconstruction. According to the technical scheme, the video coding and decoding system based on dictionary learning and compressed sensing mainly comprises the video coding terminal and the video decoding terminal, wherein the coding terminal is used for temporarily storing image pixel data of K frames, reducing the dimensionality of the image pixel data of the K frames and transmitting data after dimensionality reduction to the decoding terminal through a wireless transmitting module according to the compressed sensing theory, and the decoding terminal is used for decoding the K frames according to the compressed sensing reconstruction algorithm (namely, the improved NSL0 method), storing the K frames and finally forming a video through integration according to frame sequences and outputting the video. The video coding and decoding system based on compressed sensing is mainly applied to video compressed sensing and transmission.
Description
Technical field
The present invention relates to the field of video compressed sensing and sparse image representation, and in particular to a compressed sensing video coding and decoding system based on dictionary learning.
Background art
The invention is aimed mainly at video applications in which the encoder is resource-constrained, such as video surveillance and wireless video sensor networks. Because of the limitations of the equipment and the environment, such applications require a low-complexity, low-power encoder to ensure long-term stable operation, whereas the receiver can store large amounts of data and perform complex decoding computations.
However, conventional video coding techniques such as the H.26X and MPEG series all adopt a system structure with a complex encoder and a simple decoder: the encoder removes temporal and spatial redundancy through inter prediction, intra prediction and the discrete cosine transform (DCT) to achieve high compression efficiency. This makes the computing power and memory required of the encoder very high, far higher than those required of the decoder. Conventional video coding is therefore unsuitable for the applications above.
Compressed sensing (CS) is a theory that has emerged in signal processing in recent years. It compresses the data at the same time as the signal is acquired, sampling at a rate far below the Nyquist rate, which reduces the amount of sampled data and saves storage space while still retaining sufficient information. When the original signal needs to be recovered, a suitable reconstruction algorithm is used to restore it. By merging traditional data acquisition and compression into one step, compressed sensing requires no complex encoding computation and is well suited to situations in which the encoder is resource-constrained.
Summary of the invention
The invention aims to overcome the deficiencies of the prior art by designing a compressed sensing video coding and decoding system for wireless video sensor networks, such that the encoder has low complexity and a small computational load, little data is transmitted over the channel, and the decoder can perform high-quality real-time video reconstruction. To this end, the technical solution adopted by the invention is a compressed sensing video coding and decoding system based on dictionary learning, which mainly comprises two parts, a video encoder and a video decoder:
Encoder: according to the requirements on reconstruction accuracy and real-time performance, the frames of the video are divided into two classes, key frames (K frames) and non-key frames (CS frames). Every two frames form one group, i.e. the group of pictures (GOP) size is 2; the odd-numbered frame is the K frame, and the frame immediately following it is the CS frame of that group. For a K frame, following compressed sensing theory, the image pixel data of the K frame are stored temporarily, then reduced in dimension by the measurement matrix Φ, and the reduced data are transmitted to the decoder by the wireless transmitter module. For a CS frame, after its image pixel data are read in, the difference with the preceding K frame is computed, i.e. dv = Xcs - Xk, and the mean squared error (MSE) of dv is evaluated. If the MSE is below the lower threshold, the two frames are judged to be very similar, and a 1-bit signal is sent to notify the decoder that this CS frame needs no reconstruction and that the reconstruction of the preceding K frame is to be used directly as its reconstruction. If the MSE is above the upper threshold, dv is reduced in dimension by the measurement matrix Φ and the reduced data are sent to the decoder, together with a 1-bit signal notifying the decoder to perform dictionary learning after it has reconstructed this CS frame. If the MSE lies within the threshold range, dv is simply reduced in dimension by the measurement matrix Φ and sent.
Decoder: K frames are decoded by the compressed sensing reconstruction algorithm NSL0, an improved modified Newton method, and stored. If the encoder has sent the dictionary-update signal, the dictionary of the sparse matrix is updated with the K-singular value decomposition (K-SVD) algorithm. For a CS frame, NSL0 compressed sensing reconstruction is carried out using the coefficient matrix updated with the K frame and the measurement matrix, and the reconstructed result is added to the reconstruction of the preceding K frame to obtain the reconstructed CS frame. Finally, the frames are assembled into a video in frame order and output.
The measurement matrix is a block Gaussian random matrix.
Regarding compressed sensing theory specifically, the sparse dictionary is generated by the K-SVD dictionary learning method, and the initial sparse dictionary is set to a global dictionary, i.e. a dictionary trained on pictures of the scene in which the camera is located.
Technical features and effects of the invention:
The invention uses compressed sensing for coding and decoding in a wireless video sensor network, shifting the computational complexity from the encoder to the decoder.
By using frame differencing and a block-based measurement matrix, it effectively reduces the amount of data transmitted for CS frames and the reconstruction time while maintaining reconstruction accuracy.
By using a global dictionary as the initial dictionary and updating it periodically through dictionary learning, it effectively improves reconstruction accuracy without affecting the real-time performance of video reconstruction.
Brief description of the drawings
Fig. 1 is the hardware structure diagram of the invention.
Fig. 2 is the block diagram of the compressed sensing video coding and decoding system based on dictionary learning of the invention.
Fig. 3 is the flow chart of the dictionary learning algorithm of the invention.
Detailed description of the invention
To achieve the above object, the invention implements the entire video coding and decoding system with compressed sensing based on dictionary learning. The system mainly comprises two parts, a video encoder and a video decoder.
At the encoder, according to the requirements on reconstruction accuracy and real-time performance, the frames of the video are divided into two classes, key frames (K frames) and non-key frames (CS frames). Every two frames form one group, i.e. the group of pictures (GOP) size is 2; the odd-numbered frame is the K frame, and the frame immediately following it is the CS frame of that group. For a K frame, following compressed sensing theory, the image pixel data of the K frame are stored temporarily, then reduced in dimension by the measurement matrix Φ, and the reduced data are transmitted to the decoder by the wireless transmitter module. For a CS frame, after its image pixel data are read in, the difference with the preceding K frame is computed, i.e. dv = Xcs - Xk, and the mean squared error (MSE) of dv is evaluated. If the MSE is below the lower threshold, the two frames are judged to be very similar, and a 1-bit signal is sent to notify the decoder that this CS frame needs no reconstruction and that the reconstruction of the preceding K frame can be used directly as its reconstruction. If the MSE is above the upper threshold, the two frames differ greatly and the captured scene has changed considerably, so the dictionary should be updated to adapt to the new scene; dv is therefore reduced in dimension by the measurement matrix Φ and the reduced data are sent to the decoder, together with a 1-bit signal notifying the decoder to perform dictionary learning after it has reconstructed this CS frame. If the MSE lies within the threshold range, dv is simply reduced in dimension by the measurement matrix Φ and sent.
At the decoder, K frames are decoded by the compressed sensing reconstruction algorithm NSL0, an improved modified Newton method, and stored. If the encoder has sent the dictionary-update signal, the dictionary of the sparse matrix is updated with the K-singular value decomposition (K-SVD) algorithm. For a CS frame, NSL0 compressed sensing reconstruction is carried out using the coefficient matrix updated with the K frame and the measurement matrix, and the reconstructed result is added to the reconstruction of the preceding K frame to obtain the reconstructed CS frame. Finally, the frames are assembled into a video in frame order and output.
Here, the measurement matrix is a block Gaussian random matrix, which effectively reduces the amount of data needed for the initial transmission of the measurement matrix and reduces reconstruction time without affecting reconstruction accuracy, thus ensuring real-time performance.
The premise of compressed sensing is that the signal is sparse in the Ψ domain, i.e. the signal can be represented by a sparse dictionary Ψ and the corresponding sparse coefficients, with the number of nonzero entries in the sparse coefficients below the sparsity level K. The quality of the sparse dictionary determines the reconstruction accuracy of the signal. In this system the sparse dictionary is generated by the K-SVD dictionary learning method; because it is adaptive, it gives better reconstruction than the DCT or the wavelet transform. The initial sparse dictionary is set to a global dictionary, i.e. a dictionary trained on pictures of the scene in which the camera is located. Such an initial dictionary represents the images more sparsely, ensuring high reconstruction accuracy from the outset.
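In symbols, the sparsity premise and the measurement step can be summarised as follows. This is only a restatement for a single image column of length Nr; the column-wise view and the symbols x, s and y are notational assumptions, while the sizes of Φ and Ψ follow the description of Fig. 2.

```latex
x = \Psi s, \qquad \|s\|_0 < K, \qquad
y = \Phi x = \Phi \Psi s, \qquad
\Psi \in \mathbb{R}^{N_r \times N_r},\;
\Phi \in \mathbb{R}^{M \times N_r},\; M \ll N_r
```

Because y has only M entries while s has Nr, the system is underdetermined, and recovery relies on s having few nonzero entries.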
The invention provides a compressed sensing video coding and decoding system for wireless video sensor networks in which the encoder has low complexity and a small computational load, little data is transmitted over the channel, and the decoder can perform high-quality real-time video reconstruction.
To achieve the above object, the invention implements the entire video coding and decoding system with compressed sensing based on dictionary learning. The system mainly comprises two parts, a video encoder and a video decoder.
The invention is described in more detail below with reference to the drawings.
Fig. 1 shows the hardware block diagram of the system. It consists of the following components: a digital camera, a DSP video coding module, a wireless transmitter module, a wireless receiver module and a PC video decoding module. The digital camera, DSP video coding module and wireless transmitter module form the encoder of the system; the wireless receiver module and PC video decoding module form the decoder.
At the encoder, the digital camera is connected to the DSP video coding module through 32 multiplexed data lines, and the captured video data are passed into the module for storage and encoding; the encoded data are then transmitted to the decoder by the wireless transmitter module. The wireless receiver module of the decoder receives the encoded data and passes them to the PC, which performs the decoding and finally outputs the reconstructed video stream.
Fig. 2 shows the block diagram of the compressed sensing video coding and decoding system based on dictionary learning. Assume that every frame of the input video is an Nr × Nc signal and that a group of pictures consists of two frames (GOP = 2), the first being the K frame Xk and the second the CS frame Xcs. The whole coding and decoding process is described below for one group of pictures.
At the encoder, Xk is first read in and stored temporarily. It is then reduced by the measurement matrix Φ to the M × Nc signal Yk, as in formula (1), where Φ is an M × Nr block Gaussian random matrix with M << Nr. The image data are substantially compressed by Φ, and the resulting Yk is finally sent to the decoder. This completes the encoding of the K frame.
$$Y = \Phi X \qquad (1)$$
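Formula (1) amounts to a single matrix product. The following is a minimal sketch, assuming NumPy, a dense Gaussian matrix as a stand-in for the block matrix, and hypothetical frame sizes; the name `measure` is illustrative and does not come from the patent.

```python
import numpy as np

def measure(phi: np.ndarray, frame: np.ndarray) -> np.ndarray:
    """Formula (1): reduce an Nr x Nc frame to M x Nc, Y = Phi @ X."""
    return phi @ frame.astype(np.float64)

rng = np.random.default_rng(0)
Nr, Nc, M = 64, 48, 16                       # hypothetical sizes with M << Nr
phi = rng.standard_normal((M, Nr))           # dense Gaussian stand-in for Phi
x_k = rng.integers(0, 256, size=(Nr, Nc))    # stand-in for the pixels of a K frame
y_k = measure(phi, x_k)
print(y_k.shape)                             # (16, 48)
```

With these sizes the encoder sends 16 × 48 values instead of 64 × 48, and its only per-frame work is the matrix product.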
Xcs is then read in and differenced with the temporarily stored Xk, i.e. dv = Xcs - Xk. The mean squared error (MSE) of dv is computed and compared with preset thresholds. If the MSE is below the lower threshold Sl, the two frames are judged to be very similar; the encoder performs no encoding on this frame and only sends the 1-bit control signal "0", notifying the decoder that this CS frame needs no reconstruction and that the K-frame reconstruction can be used directly as its reconstruction. If the MSE is above the upper threshold Sh, dv is reduced by the measurement matrix Φ according to formula (1), and the reduced data Ydv are sent to the decoder together with the 1-bit control signal "1", notifying the decoder to perform dictionary learning once this CS frame has been reconstructed. If the MSE lies within the threshold range, dv is simply reduced by the measurement matrix Φ according to formula (1) and Ydv is sent.
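The three-way decision on the CS frame can be sketched as below. This is an illustration under assumptions: the function name, the return convention (`None` for "no control signal") and the threshold values in the usage lines are hypothetical, and the MSE is taken as the mean of the squared entries of dv.

```python
import numpy as np

def encode_cs_frame(x_cs, x_k, phi, s_low, s_high):
    """Return (control_bit, payload) for one CS frame.

    control_bit is "0" (decoder reuses the K frame), "1" (reconstruct,
    then relearn the dictionary) or None (no control signal).
    """
    dv = x_cs.astype(np.float64) - x_k.astype(np.float64)   # dv = Xcs - Xk
    mse = float(np.mean(dv ** 2))
    if mse < s_low:
        return "0", None             # frames nearly identical: send only the bit
    y_dv = phi @ dv                  # formula (1) applied to the difference
    if mse > s_high:
        return "1", y_dv             # scene changed: also request dictionary learning
    return None, y_dv                # within the threshold range: send Ydv only

x_k = np.zeros((8, 8))
phi = np.eye(4, 8)                   # toy 4 x 8 measurement matrix
print(encode_cs_frame(x_k + 5, x_k, phi, s_low=1.0, s_high=100.0)[0])   # None
```

Converting to floating point before subtracting avoids wrap-around when the frames are stored as unsigned 8-bit pixels.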
The decoder starts working once it receives Yk. Compressed sensing reconstruction is performed according to formula (2), where Φ is the same as at the encoder and Ψ is the Nr × Nr sparse matrix (dictionary). The reconstructed K frame $\hat{X}_k$ is stored.
$$\min_{s} \|s\|_0 \quad \text{s.t.} \quad Y = \Phi \Psi s \qquad (2)$$
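The patent does not spell out the NSL0 iteration, so the sketch below substitutes the plain smoothed-L0 (SL0) method that NSL0-type algorithms build on: it replaces the L0 norm in formula (2) with a Gaussian surrogate, takes a gradient step, and projects back onto the constraint. It is not the patent's improved Newton variant. The parameter values, problem sizes and the matrix `A` (standing for ΦΨ) are assumptions.

```python
import numpy as np

def sl0(A, y, sigma_min=1e-4, sigma_decay=0.5, mu=2.0, inner_iters=3):
    """Plain smoothed-L0: approximate min ||s||_0 subject to y = A s."""
    A_pinv = np.linalg.pinv(A)
    s = A_pinv @ y                        # minimum-L2-norm starting point
    sigma = 2.0 * np.max(np.abs(s))
    while sigma > sigma_min:
        for _ in range(inner_iters):
            # Gradient step on the smoothed sparsity measure
            s = s - mu * s * np.exp(-s ** 2 / (2.0 * sigma ** 2))
            # Project back onto the feasible set {s : A s = y}
            s = s - A_pinv @ (A @ s - y)
        sigma *= sigma_decay              # sharpen the surrogate gradually
    return s

rng = np.random.default_rng(1)
n, m, k = 64, 32, 4                       # hypothetical sizes
A = rng.standard_normal((m, n))           # plays the role of Phi @ Psi
s_true = np.zeros(n)
support = rng.choice(n, size=k, replace=False)
s_true[support] = rng.choice([-1.0, 1.0], size=k) * rng.uniform(1.0, 2.0, size=k)
y = A @ s_true
s_hat = sl0(A, y)
print(np.max(np.abs(s_hat - s_true)))     # small when recovery succeeds
```

A column of the frame would then be recovered as Ψ times the estimated coefficients; running the routine column by column is one possible arrangement, not something the patent prescribes.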
Decoding of the CS frame then begins. The decoder first checks whether a 1-bit control signal has been received. If the control signal is "0", the reconstruction results of the K frame and the CS frame are output directly in order. If there is no control signal, Ydv is reconstructed by compressed sensing according to formula (2); because the difference was taken at the encoder, $\hat{X}_{cs} = \hat{X}_k + \hat{d}_v$. If the control signal is "1", $\hat{X}_{cs}$ is first reconstructed by the method above, and the reconstruction is then used as the training signal to update the dictionary by K-SVD dictionary learning, so that it better suits the changed scene.
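The decoder-side handling of the control signal can be sketched as a small dispatcher. The function and argument names are hypothetical; `reconstruct` stands for whatever solver implements formula (2), and `learn_dictionary` for the K-SVD update, both passed in as callables so the sketch stays independent of their details.

```python
import numpy as np

def decode_cs_frame(control_bit, y_dv, x_k_hat, reconstruct, learn_dictionary):
    """Rebuild one CS frame from the stored K-frame reconstruction."""
    if control_bit == "0":
        return x_k_hat.copy()             # reuse the K frame, nothing to reconstruct
    dv_hat = reconstruct(y_dv)            # formula (2) applied to Ydv
    x_cs_hat = x_k_hat + dv_hat           # undo the encoder-side difference
    if control_bit == "1":
        learn_dictionary(x_cs_hat)        # update the dictionary for the next GOP
    return x_cs_hat

# Toy usage with an identity "solver" and a list that records training calls
trained = []
x_k_hat = np.ones((2, 2))
out = decode_cs_frame("1", 2 * np.ones((2, 2)), x_k_hat, lambda y: y, trained.append)
print(out[0, 0], len(trained))            # 3.0 1
```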
The dictionary learning method used here is the K-SVD algorithm shown in Fig. 3. First, sparse coding is performed from the initial dictionary and the training signal: with the dictionary fixed, the given data are sparsely represented over it (i.e. the data are approximated as closely as possible with as few coefficients as possible), yielding the coefficient matrix α. Then, with the coefficient matrix fixed, each dictionary atom (each column of the dictionary) is updated in turn so that it represents the training signal more closely. After several such iterations the dictionary learning is complete, giving a sparse matrix Ψ better suited to the new scene. This matrix is used for frame reconstruction in the next GOP.
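The two alternating steps can be sketched as follows. The patent does not say which sparse coder is used in the first step, so orthogonal matching pursuit (OMP) is assumed here; the training data, dictionary size, sparsity `k` and iteration count are likewise hypothetical.

```python
import numpy as np

def omp(D, x, k):
    """Greedy orthogonal matching pursuit: a k-sparse code of x over D."""
    residual, support = x.copy(), []
    coef = np.zeros(D.shape[1])
    for _ in range(k):
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j in support:
            break
        support.append(j)
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    if support:
        coef[support] = sol
    return coef

def ksvd(X, D, k, n_iter=5):
    """Alternate sparse coding and per-atom SVD updates (K-SVD)."""
    D = D / np.linalg.norm(D, axis=0)              # unit-norm atoms, works on a copy
    for _ in range(n_iter):
        # Step 1: dictionary fixed, compute the coefficient matrix alpha
        alpha = np.column_stack([omp(D, x, k) for x in X.T])
        # Step 2: coefficients' supports fixed, update each atom in turn
        for j in range(D.shape[1]):
            users = np.nonzero(alpha[j])[0]        # signals that use atom j
            if users.size == 0:
                continue
            alpha[j, users] = 0.0
            E = X[:, users] - D @ alpha[:, users]  # error without atom j
            U, S, Vt = np.linalg.svd(E, full_matrices=False)
            D[:, j] = U[:, 0]                      # best rank-1 fit gives the new atom
            alpha[j, users] = S[0] * Vt[0]         # and its updated coefficients
    return D, alpha

rng = np.random.default_rng(2)
X = rng.standard_normal((16, 200))     # stand-in training signals, one per column
D0 = rng.standard_normal((16, 32))     # stand-in for the initial global dictionary
D, alpha = ksvd(X, D0, k=3)
print(D.shape, alpha.shape)            # (16, 32) (32, 200)
```

Restricting each atom update to the signals that already use the atom is what keeps the coefficient matrix sparse across iterations.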
Finally, the reconstructions of the K frame and the CS frame are output in frame order at the frame rate, forming the output video stream.
The measurement matrix Φ is a block Gaussian random matrix. If the block size is 8 × 8, the encoder generates an 8 × 8 Gaussian random matrix Φ0 and then uses Φ0 to build the M × Nr block-diagonal matrix, whose diagonal consists of (M × Nr)/(8 × 8) copies of Φ0. This block Gaussian random matrix effectively reduces the amount of data needed for the initial transmission of the measurement matrix and reduces reconstruction time without affecting reconstruction accuracy, ensuring real-time performance. Reconstruction is one stage of the compressed sensing process, and the reconstruction algorithm is the concrete method used for it. The reconstruction algorithm used by the invention is NSL0, an improved modified Newton method. Experiments have verified that it performs best among existing compressed sensing reconstruction algorithms, because it combines high reconstruction accuracy with short reconstruction time, meeting the system's requirements for high reconstruction accuracy and real-time performance.
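A block-diagonal measurement matrix can be assembled from one small Gaussian block with a Kronecker product. The sketch below rests on an assumption that departs from the text: it uses a rectangular 4 × 8 block rather than the 8 × 8 block mentioned above, so that the assembled matrix has fewer rows than columns (M < Nr). The function name and sizes are hypothetical.

```python
import numpy as np

def block_gaussian_matrix(n_blocks, block_rows, block_cols, seed=0):
    """Block-diagonal measurement matrix built from one Gaussian block Phi0."""
    rng = np.random.default_rng(seed)
    phi0 = rng.standard_normal((block_rows, block_cols))
    # kron with the identity places a copy of phi0 in every diagonal block
    return np.kron(np.eye(n_blocks), phi0), phi0

phi, phi0 = block_gaussian_matrix(n_blocks=8, block_rows=4, block_cols=8)
print(phi.shape)          # (32, 64)
```

Only the small block (or the seed that generates it) has to be shared between encoder and decoder, which is consistent with the reduced data volume for the initial transmission of the measurement matrix described above.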
Claims (2)
1. A compressed sensing video coding and decoding system based on dictionary learning, characterized in that it mainly comprises two parts, a video encoder and a video decoder:
Encoder: according to the requirements on reconstruction accuracy and real-time performance, the frames of the video are divided into two classes, key frames (K frames) and non-key frames (CS frames); every two frames form one group, i.e. the group of pictures (GOP) size is 2, the odd-numbered frame is the K frame, and the frame immediately following it is the CS frame of that group; for a K frame, following compressed sensing theory, the image pixel data of the K frame are stored temporarily, then reduced in dimension by the measurement matrix Φ, and the reduced data are transmitted to the decoder by the wireless transmitter module; for a CS frame, after its image pixel data are read in, the difference with the preceding K frame is computed, i.e. dv = Xcs - Xk, and the mean squared error (MSE) of dv is evaluated; if the MSE is below the lower threshold, the two frames are judged to be very similar, and a 1-bit signal is sent to notify the decoder that this CS frame needs no reconstruction and that the reconstruction of the preceding K frame is to be used directly as its reconstruction; if the MSE is above the upper threshold, dv is reduced in dimension by the measurement matrix Φ and the reduced data are sent to the decoder, together with a 1-bit signal notifying the decoder to perform dictionary learning after it has reconstructed this CS frame; if the MSE lies within the threshold range, dv is simply reduced in dimension by the measurement matrix Φ and sent;
Decoder: K frames are decoded by the compressed sensing reconstruction algorithm NSL0 and stored; if the encoder has sent the dictionary-update signal, the dictionary of the sparse matrix is updated with the K-singular value decomposition (K-SVD) algorithm; for a CS frame, NSL0 compressed sensing reconstruction is carried out using the coefficient matrix updated with the K frame and the measurement matrix, and the reconstructed result is added to the reconstruction of the preceding K frame to obtain the reconstructed CS frame; finally, the frames are assembled into a video in frame order and output;
Regarding compressed sensing theory specifically, the sparse dictionary is generated by the K-SVD dictionary learning method, and the initial sparse dictionary is set to a global dictionary, i.e. a dictionary trained on pictures of the scene in which the camera is located.
2. The compressed sensing video coding and decoding system based on dictionary learning according to claim 1, characterized in that the measurement matrix is a block Gaussian random matrix.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310589803.0A CN103596010B (en) | 2013-11-20 | 2013-11-20 | Video coding and decoding system based on dictionary learning and compressed sensing |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310589803.0A CN103596010B (en) | 2013-11-20 | 2013-11-20 | Video coding and decoding system based on dictionary learning and compressed sensing |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103596010A CN103596010A (en) | 2014-02-19 |
CN103596010B true CN103596010B (en) | 2017-01-11 |
Family
ID=50085966
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310589803.0A Expired - Fee Related CN103596010B (en) | 2013-11-20 | 2013-11-20 | Video coding and decoding system based on dictionary learning and compressed sensing |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103596010B (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105554502A (en) * | 2015-12-07 | 2016-05-04 | 天津大学 | Distributed compressed sensing video encoding and decoding method based on foreground-background separation |
CN106991426B (en) * | 2016-09-23 | 2020-06-12 | 天津大学 | Remote sensing image sparse coding dictionary learning method based on embedded DSP |
CN107659315B (en) * | 2017-09-25 | 2020-11-10 | 天津大学 | Sparse binary coding circuit for compressed sensing |
CN109089123B (en) * | 2018-08-23 | 2021-08-03 | 江苏大学 | Compressed sensing multi-description coding and decoding method based on 1-bit vector quantization |
CN109194968B (en) * | 2018-09-13 | 2020-12-25 | 天津大学 | Image compression sensing method fusing information source channel decoding |
CN111192334B (en) * | 2020-01-02 | 2023-06-06 | 苏州大学 | Trainable compressed sensing module and image segmentation method |
CN113365065B (en) * | 2021-06-09 | 2024-04-26 | 湖南大学 | Lossless video coding method and decoding method for RPA robot screen recording |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102427527A (en) * | 2011-09-27 | 2012-04-25 | 西安电子科技大学 | Method for reconstructing non key frame on basis of distributed video compression sensing system |
-
2013
- 2013-11-20 CN CN201310589803.0A patent/CN103596010B/en not_active Expired - Fee Related
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102427527A (en) * | 2011-09-27 | 2012-04-25 | 西安电子科技大学 | Method for reconstructing non key frame on basis of distributed video compression sensing system |
Non-Patent Citations (2)
Title |
---|
Dictionary learning-based distributed compressive video sensing;Hung-Wei Chen et al.;《PCS2010》;20101210;full text * |
Distributed video coding using compressive sampling;Josep Prades-Nebot et al.;《PCS2009》;20090508;full text * |
Also Published As
Publication number | Publication date |
---|---|
CN103596010A (en) | 2014-02-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103596010B (en) | Video coding and decoding system based on dictionary learning and compressed sensing | |
Chen et al. | Dictionary learning-based distributed compressive video sensing | |
CN100512443C (en) | Distributive vide frequency coding method based on self adaptive Hashenhege type vector quantization | |
CN101540926B (en) | Stereo video coding-decoding method based on H.264 | |
JP2022133346A (en) | Codec using neural network | |
CN103141092B (en) | The method and apparatus carrying out encoded video signal for the super-resolution based on example of video compress use motion compensation | |
CN103618903B (en) | The high-speed low-power-consumption radio sensing network video compress method of sampling | |
CN110062239B (en) | Reference frame selection method and device for video coding | |
TWI468018B (en) | Video coding using vector quantized deblocking filters | |
CN103002283A (en) | Multi-view distributed video compression side information generation method | |
KR20070028404A (en) | Method of storing pictures in a memory using compression coding and cost function including power consumption | |
CN102572428B (en) | Side information estimating method oriented to distributed coding and decoding of multimedia sensor network | |
CN105554502A (en) | Distributed compressed sensing video encoding and decoding method based on foreground-background separation | |
CN104333757A (en) | Video coding and decoding method based on multiple description CS measurement value | |
CN109587431A (en) | A kind of multi-channel video code stream merging method, device, equipment and storage medium | |
US20090323810A1 (en) | Video encoding apparatuses and methods with decoupled data dependency | |
Bernatin et al. | Video compression based on Hybrid transform and quantization with Huffman coding for video codec | |
CN104581173A (en) | Soft decoding verification model platform | |
US20140269896A1 (en) | Multi-Frame Compression | |
CN101360237A (en) | Reference frame processing method, video decoding method and apparatus | |
CN1848960B (en) | Residual coding in compliance with a video standard using non-standardized vector quantization coder | |
Wang et al. | Deep correlated image set compression based on distributed source coding and multi-scale fusion | |
CN105306941B (en) | A kind of method for video coding | |
CN107770537B (en) | Light field image compression method based on linear reconstruction | |
KR100349058B1 (en) | video compression and decompression Apparatus |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20170111 Termination date: 20201120 |
CF01 | Termination of patent right due to non-payment of annual fee | |