CN103596010B - Video coding and decoding system based on dictionary learning and compressed sensing - Google Patents

Video coding and decoding system based on dictionary learning and compressed sensing

Info

Publication number
CN103596010B
Authority
CN
China
Prior art keywords
frame
video
decoding
compressed sensing
dictionary
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201310589803.0A
Other languages
Chinese (zh)
Other versions
CN103596010A (en)
Inventor
郭继昌
金卯亨嘉
申燊
许颖
孙骏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University
Priority to CN201310589803.0A
Publication of CN103596010A
Application granted
Publication of CN103596010B
Status: Expired - Fee Related

Landscapes

  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention relates to the field of video compressed sensing and image sparse representation, and discloses a video coding and decoding system based on compressed sensing. The video coding and decoding system based on compressed sensing is designed to make a wireless video sensing network have the advantages that the complexity and calculated amount of a coding terminal are small, the volume of data transmitted through a channel is small and a decoding terminal can carry out high-quality real-time video reconstruction. According to the technical scheme, the video coding and decoding system based on dictionary learning and compressed sensing mainly comprises the video coding terminal and the video decoding terminal, wherein the coding terminal is used for temporarily storing image pixel data of K frames, reducing the dimensionality of the image pixel data of the K frames and transmitting data after dimensionality reduction to the decoding terminal through a wireless transmitting module according to the compressed sensing theory, and the decoding terminal is used for decoding the K frames according to the compressed sensing reconstruction algorithm (namely, the improved NSL0 method), storing the K frames and finally forming a video through integration according to frame sequences and outputting the video. The video coding and decoding system based on compressed sensing is mainly applied to video compressed sensing and transmission.

Description

Compressed sensing video coding and decoding system based on dictionary learning
Technical field
The present invention relates to the fields of video compressed sensing and sparse image representation, and in particular to a compressed sensing video coding and decoding system based on dictionary learning.
Background art
The invention mainly targets video applications in which the coding end is resource-constrained, such as video surveillance and wireless video sensor networks. Because of the equipment and environments involved, these applications require a low-complexity, low-power coding end to ensure long-term stable operation, whereas the receiving end can store large amounts of data and perform complex decoding computations.
However, conventional video coding techniques such as the H.26x and MPEG series all use a structure with a complex coding end and a simple decoding end: the coding end removes temporal and spatial redundancy through inter prediction, intra prediction and the discrete cosine transform (DCT) to obtain high compression efficiency. This places demands on the computing power and memory of the encoder that are far higher than those on the decoder. Conventional video coding is therefore unsuitable for the applications above.
Compressed sensing (CS) is a theory that has emerged in signal processing in recent years. It compresses data while the signal is being acquired, at a rate far below the Nyquist sampling rate, which reduces the amount of sampled data and saves storage while still retaining enough information. When the original signal is needed, a suitable reconstruction algorithm recovers it from the samples. Compressed sensing merges conventional data acquisition and compression into one step and requires no complex encoding computation, which makes it well suited to settings where the coding end is resource-constrained.
Summary of the invention
The invention aims to overcome the shortcomings of the prior art by designing a compressed sensing video coding and decoding system for wireless video sensor networks, with low complexity and little computation at the coding end, a small volume of data transmitted over the channel, and high-quality real-time video reconstruction at the decoding end. To this end, the technical solution adopted is a compressed sensing video coding and decoding system based on dictionary learning, consisting mainly of a video coding end and a video decoding end:
Coding end: according to the requirements on reconstruction accuracy and real-time performance, the frames of the video are divided into two classes, key frames (K frames) and non-key frames (CS frames). Every two frames form one group, i.e. the group of pictures (GOP) size is 2; the odd-numbered frame is the K frame, and the frame immediately following it is the CS frame of that group. For a K frame, in accordance with compressed sensing theory, the image pixel data of the K frame is stored temporarily, then reduced in dimension by the observation matrix Φ, and the reduced data is transmitted to the decoding end by the wireless transmitter module. For a CS frame, after its image pixel data is read in, the difference with the preceding K frame is computed, dv = Xcs - Xk, and the mean squared error (MSE) of dv is evaluated. If the MSE is below the lower threshold, the two frames are judged to be very similar, and a 1-bit signal is sent to notify the decoding end that this CS frame needs no reconstruction and that the reconstruction of the preceding K frame is to be used as its reconstruction. If the MSE is above the upper threshold, dv is reduced in dimension by the observation matrix Φ and the reduced data is sent to the decoding end, together with a 1-bit signal notifying the decoding end to perform dictionary learning after it finishes reconstructing this CS frame. If the MSE lies between the two thresholds, dv is simply reduced in dimension by the observation matrix Φ and sent;
Decoding end: K frames are decoded by the compressed sensing reconstruction algorithm NSL0 (an improved modified Newton method) and stored. If the coding end has sent the signal to update the dictionary, the dictionary of the sparse matrix is updated by the K-singular value decomposition (K-SVD) algorithm. For a CS frame, NSL0 compressed sensing reconstruction is performed with the coefficient matrix updated from the K frame and the observation matrix, and the reconstructed result is added to the reconstruction of the preceding K frame to obtain the reconstructed CS frame. Finally, the frames are assembled into a video in frame order and output.
The observation matrix is a block Gaussian random matrix.
In applying compressed sensing theory, the sparse dictionary is generated by the K-SVD dictionary learning method, and the initial sparse dictionary is a global dictionary, i.e. a dictionary trained on pictures of the scene in which the camera is located.
Technical features and effects of the invention:
The invention applies compressed sensing to the coding and decoding of a wireless video sensor network, moving the computational complexity from the coding end to the decoding end.
The difference method and the block-based observation matrix effectively reduce the transmitted data volume and the reconstruction time of CS frames while preserving reconstruction accuracy.
A global dictionary is used as the initial dictionary and is updated over time through dictionary learning, which effectively improves reconstruction accuracy without affecting the real-time performance of video reconstruction.
Brief description of the drawings
Fig. 1 is the hardware structure diagram of the invention.
Fig. 2 is the block diagram of the compressed sensing video coding and decoding system based on dictionary learning.
Fig. 3 is the flow chart of the dictionary learning algorithm used in the invention.
Detailed description of the embodiments
To achieve the above object, the invention builds the whole video coding and decoding system on compressed sensing with dictionary learning. The system consists mainly of a video coding end and a video decoding end.
At the coding end, according to the requirements on reconstruction accuracy and real-time performance, the frames of the video are divided into two classes, key frames (K frames) and non-key frames (CS frames). Every two frames form one group, i.e. the group of pictures (GOP) size is 2; the odd-numbered frame is the K frame, and the frame immediately following it is the CS frame of that group. For a K frame, in accordance with compressed sensing theory, the image pixel data of the K frame is stored temporarily, then reduced in dimension by the observation matrix Φ, and the reduced data is transmitted to the decoding end by the wireless transmitter module. For a CS frame, after its image pixel data is read in, the difference with the preceding K frame is computed, dv = Xcs - Xk, and the mean squared error (MSE) of dv is evaluated. If the MSE is below the lower threshold, the two frames are judged to be very similar, and a 1-bit signal is sent to notify the decoding end that this CS frame needs no reconstruction and that the reconstruction of the preceding K frame can be used directly as its reconstruction. If the MSE is above the upper threshold, the two frames differ greatly and the captured scene has changed substantially, so the dictionary should be updated to suit the new scene; dv is therefore reduced in dimension by the observation matrix Φ and the reduced data is sent to the decoding end, together with a 1-bit signal notifying the decoding end to perform dictionary learning after it finishes reconstructing this CS frame. If the MSE lies between the two thresholds, dv is simply reduced in dimension by the observation matrix Φ and sent.
At the decoding end, K frames are decoded by the compressed sensing reconstruction algorithm NSL0 (an improved modified Newton method) and stored. If the coding end has sent the signal to update the dictionary, the dictionary of the sparse matrix is updated by the K-singular value decomposition (K-SVD) algorithm. For a CS frame, NSL0 compressed sensing reconstruction is performed with the coefficient matrix updated from the K frame and the observation matrix, and the reconstructed result is added to the reconstruction of the preceding K frame to obtain the reconstructed CS frame. Finally, the frames are assembled into a video in frame order and output.
Here the observation matrix is a block Gaussian random matrix, which effectively reduces the amount of data needed to transmit the observation matrix the first time and shortens reconstruction time without affecting reconstruction accuracy, thereby ensuring real-time performance.
The premise of compressed sensing is that the signal is sparse in the Ψ domain, i.e. the signal can be represented by a sparse dictionary Ψ and a corresponding sparse coefficient vector whose number of nonzero entries is at most the sparsity K. The quality of the sparse dictionary determines the reconstruction accuracy of the signal. In this system the sparse dictionary is generated by the K-SVD dictionary learning method; because the method is adaptive, it gives better reconstruction than the DCT or the wavelet transform. The initial sparse dictionary is a global dictionary, i.e. a dictionary trained on pictures of the scene in which the camera is located. Such an initial dictionary represents the images more sparsely, which secures high reconstruction accuracy from the outset.
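In symbols, the sparsity premise and its link to the measurements can be written as below. Here x denotes one column of a frame, s its coefficient vector and y the corresponding measurements; these three lowercase names are notation chosen for illustration, while Ψ, Φ and K are as in the text.

```latex
x = \Psi s, \qquad \|s\|_0 \le K, \qquad y = \Phi x = \Phi \Psi s
```

Reconstruction therefore looks for the sparsest s that is consistent with y and then recovers the signal as x = Ψs.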
The invention provides a compressed sensing video coding and decoding system for wireless video sensor networks, with low complexity and little computation at the coding end, a small volume of data transmitted over the channel, and high-quality real-time video reconstruction at the decoding end.
The present invention will be described in more detail below in conjunction with the accompanying drawings.
Fig. 1 shows the hardware block diagram of the system. It consists of the following parts: a digital camera, a DSP video coding module, a wireless transmitter module, a wireless receiver module and a PC video decoding module. The digital camera, DSP video coding module and wireless transmitter module make up the coding end of the system; the wireless receiver module and PC video decoding module make up the decoding end. At the coding end, the digital camera is connected to the DSP video coding module through multiplexed 32-bit data lines and passes the captured video data into the module for storage and encoding; the encoded data is then transmitted to the decoding end by the wireless transmitter module. The wireless receiver module at the decoding end receives the encoded data and passes it to the PC, which decodes it and finally outputs the reconstructed video stream.
Fig. 2 shows the block diagram of the compressed sensing video coding and decoding system based on dictionary learning. Assume that each frame of the input video is an Nr × Nc signal and that one GOP consists of two frames (GOP = 2): the first is the K frame Xk and the second is the CS frame Xcs. The whole coding and decoding process is described below for one GOP.
At the coding end, Xk is first read in and stored temporarily. It is then reduced in dimension by the observation matrix Φ to the M × Nc signal Yk, as shown in formula (1), where Φ is an M × Nr block Gaussian random matrix and M << Nr. The image data is compressed substantially by Φ, and the resulting Yk is sent to the decoding end. This completes the encoding of the K frame.
$$Y = \Phi X \qquad (1)$$
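As a rough illustration of formula (1), the following sketch applies a dense Gaussian Φ to a stand-in frame and shows the resulting dimensions. The sizes, the random frame, the scaling by 1/sqrt(M) and the helper name `measure` are all assumptions made for this example and do not come from the source.

```python
import numpy as np

def measure(frame: np.ndarray, phi: np.ndarray) -> np.ndarray:
    """Apply formula (1), Y = Phi @ X, to every column of the frame at once."""
    return phi @ frame

rng = np.random.default_rng(0)
Nr, Nc, M = 64, 48, 16                      # hypothetical sizes with M << Nr
X_k = rng.random((Nr, Nc))                  # stand-in for the pixels of a K frame
Phi = rng.standard_normal((M, Nr)) / np.sqrt(M)  # assumed Gaussian observation matrix
Y_k = measure(X_k, Phi)
print(X_k.shape, "->", Y_k.shape)
```

Only the M × Nc array `Y_k` would need to be transmitted, which is where the reduction in channel data comes from.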
Xcs is then read in and differenced with the temporarily stored Xk, i.e. dv = Xcs - Xk. The mean squared error (MSE) of dv is computed and compared with the preset thresholds. If the MSE is below the lower threshold Sl, the two frames are judged to be very similar; the coding end performs no encoding on this frame and only sends a 1-bit control signal "0", notifying the decoding end that this CS frame needs no reconstruction and that the K-frame reconstruction can be used directly as its reconstruction. If the MSE is above the upper threshold Sh, dv is reduced in dimension by the observation matrix Φ according to formula (1) and the reduced data Ydv is sent to the decoding end, together with a 1-bit control signal "1" notifying the decoding end to perform dictionary learning after this CS frame has been reconstructed. If the MSE lies between the two thresholds, dv is simply reduced in dimension by the observation matrix Φ according to formula (1) and Ydv is sent.
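The three-way decision on a CS frame can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `encode_cs_frame`, the use of `None` for "no control signal", and the threshold values in the usage lines are assumptions.

```python
import numpy as np

def encode_cs_frame(x_cs, x_k, phi, s_low, s_high):
    """Return (control_bit, measurements) for one CS frame.

    control_bit is 0 (reuse the K-frame reconstruction), 1 (reconstruct,
    then relearn the dictionary) or None (no control signal is sent).
    """
    dv = x_cs.astype(np.float64) - x_k.astype(np.float64)
    mse = float(np.mean(dv ** 2))
    if mse < s_low:
        return 0, None                 # frames nearly identical: send only the bit
    y_dv = phi @ dv                    # formula (1) applied to the difference
    if mse > s_high:
        return 1, y_dv                 # large change: also request dictionary learning
    return None, y_dv                  # moderate change: measurements only

rng = np.random.default_rng(1)
x_k = rng.random((8, 8))
phi = rng.standard_normal((4, 8))
bit_same, y_same = encode_cs_frame(x_k, x_k, phi, s_low=1e-4, s_high=0.05)
bit_mid, y_mid = encode_cs_frame(x_k + 0.1, x_k, phi, s_low=1e-4, s_high=0.05)
bit_far, y_far = encode_cs_frame(x_k + 1.0, x_k, phi, s_low=1e-4, s_high=0.05)
print(bit_same, bit_mid, bit_far)
```

Returning before the matrix product in the first branch mirrors the statement that the coding end performs no encoding at all on a near-identical frame.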
The decoding end starts working once it has received Yk. Compressed sensing reconstruction is performed according to formula (2), where Φ is identical to the one used at the coding end and Ψ is the Nr × Nr sparse matrix (dictionary); the reconstructed frame $\hat{X}_k = \Psi \hat{s}$ is stored.
$$\min \|s\|_0 \quad \text{s.t.} \quad Y = \Phi \Psi s \qquad (2)$$
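The source names NSL0 as the solver for problem (2) but does not spell out its steps. As a hedged stand-in, the sketch below implements the basic smoothed-l0 (SL0) iteration, which replaces the l0 norm by a Gaussian surrogate and alternates a gradient step with a projection onto the constraint. It is not the NSL0 variant of the patent; the parameter values and the test problem are assumptions, and `A` plays the role of the product ΦΨ.

```python
import numpy as np

def sl0(A, y, sigma_min=1e-4, sigma_decay=0.5, mu=2.0, inner_iters=3):
    """Approximate min ||s||_0 s.t. y = A s with the basic SL0 iteration."""
    A_pinv = np.linalg.pinv(A)
    s = A_pinv @ y                          # minimum l2-norm starting point
    sigma = 2.0 * np.max(np.abs(s))
    while sigma > sigma_min:
        for _ in range(inner_iters):
            # Gradient step on the smooth surrogate of the l0 norm.
            s = s - mu * s * np.exp(-s ** 2 / (2.0 * sigma ** 2))
            # Project back onto the feasible set {s : A s = y}.
            s = s - A_pinv @ (A @ s - y)
        sigma *= sigma_decay                # sharpen the surrogate gradually
    return s

rng = np.random.default_rng(0)
M, N, k = 30, 60, 4                         # hypothetical problem size
A = rng.standard_normal((M, N)) / np.sqrt(M)
s_true = np.zeros(N)
support = rng.choice(N, size=k, replace=False)
s_true[support] = rng.choice([-1.0, 1.0], size=k) * rng.uniform(1.0, 2.0, size=k)
y = A @ s_true
s_hat = sl0(A, y)
print("max abs error:", np.max(np.abs(s_hat - s_true)))
```

For a frame, the same routine would be applied column by column to Yk, and the frame recovered as Ψ times the estimated coefficients.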
Decoding of the CS frame then begins. The decoding end first checks whether a 1-bit control signal has been received. If the control signal is "0", the K-frame reconstruction is output directly, in order, as both the K frame and the CS frame. If there is no control signal, Ydv is reconstructed by compressed sensing according to formula (2) to give $\hat{d}_v$; because differencing was performed at the coding end, $\hat{X}_{cs} = \hat{X}_k + \hat{d}_v$. If the control signal is "1", $\hat{X}_{cs}$ is first reconstructed by the method above, and the reconstructed result is then used as the training signal to update the dictionary through K-SVD dictionary learning, so that it better suits the changed scene.
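The decoder-side handling of the control signal can be sketched as a small dispatch function. The callbacks `reconstruct` and `learn_dictionary` are hypothetical placeholders for NSL0 reconstruction and the K-SVD update, and passing the reconstructed CS frame as the training signal is an assumption of this sketch.

```python
import numpy as np

def decode_cs_frame(control_bit, y_dv, x_k_hat, reconstruct, learn_dictionary):
    """Return the reconstructed CS frame of one GOP.

    reconstruct(y) and learn_dictionary(frame) are hypothetical callbacks
    standing in for the reconstruction algorithm and the dictionary update.
    """
    if control_bit == 0:
        return x_k_hat.copy()              # reuse the K-frame reconstruction
    dv_hat = reconstruct(y_dv)             # recover the frame difference
    x_cs_hat = x_k_hat + dv_hat            # undo the differencing done by the encoder
    if control_bit == 1:
        learn_dictionary(x_cs_hat)         # scene changed: retrain the dictionary
    return x_cs_hat

# Toy usage: the "reconstruction" just returns its input unchanged.
calls = []
x_k_hat = np.ones((4, 4))
half = np.full((4, 4), 0.5)
out_skip = decode_cs_frame(0, None, x_k_hat, lambda y: y, calls.append)
out_plain = decode_cs_frame(None, half, x_k_hat, lambda y: y, calls.append)
out_learn = decode_cs_frame(1, half, x_k_hat, lambda y: y, calls.append)
print(len(calls))
```

Keeping the two algorithms behind callbacks makes the control flow testable without committing to a particular solver.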
The dictionary learning method is the K-SVD algorithm shown in Fig. 3. Sparse coding is performed first from the initial dictionary and the training signal: the dictionary is held fixed and the given data is sparsely represented over it (i.e. the data is approximated as closely as possible with as few coefficients as possible), yielding the coefficient matrix α. The coefficient matrix is then held fixed and each dictionary atom (each column of the dictionary) is updated in turn so that it represents the training signal more closely. Repeating this iteration several times completes dictionary learning and yields a sparse matrix Ψ better suited to the new scene. This matrix is used for reconstructing the frames of the next GOP.
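The alternation between sparse coding and atom updates can be sketched compactly in NumPy. This is a minimal illustration under assumptions: the source does not name the sparse coding method, so orthogonal matching pursuit (OMP) is used here as one common choice, and the data, dictionary size, sparsity `k` and iteration count are arbitrary.

```python
import numpy as np

def omp(D, x, k):
    """Greedy sparse coding: approximate x with at most k atoms of D."""
    residual, support = x.copy(), []
    coef, sol = np.zeros(D.shape[1]), np.zeros(0)
    for _ in range(k):
        idx = int(np.argmax(np.abs(D.T @ residual)))
        if idx in support:
            break                                   # no new atom helps further
        support.append(idx)
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    coef[support] = sol
    return coef

def ksvd(Y, D, k, n_iter=10):
    """Alternate sparse coding with atom-by-atom rank-1 SVD updates."""
    D = D / np.linalg.norm(D, axis=0, keepdims=True)  # work on a normalized copy
    for _ in range(n_iter):
        A = np.column_stack([omp(D, y, k) for y in Y.T])  # dictionary fixed
        for j in range(D.shape[1]):                       # coefficients fixed
            users = np.nonzero(A[j])[0]
            if users.size == 0:
                continue                                  # atom unused this round
            A[j, users] = 0.0
            E = Y[:, users] - D @ A[:, users]             # error without atom j
            U, S, Vt = np.linalg.svd(E, full_matrices=False)
            D[:, j] = U[:, 0]                             # best rank-1 fit of E
            A[j, users] = S[0] * Vt[0]
    return D, A

rng = np.random.default_rng(0)
Y = rng.standard_normal((16, 200))      # stand-in training signals, one per column
D0 = rng.standard_normal((16, 32))      # stand-in initial dictionary
D_new, A_new = ksvd(Y, D0, k=3, n_iter=5)
print(D_new.shape, A_new.shape)
```

Each atom update touches only the signals that currently use that atom, so the sparsity pattern found by the coding step is preserved during the dictionary step.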
Finally, the reconstructions of the K frames and CS frames are output in frame order at the frame rate, forming the output video stream.
The observation matrix Φ is a block Gaussian random matrix. If the block size is 8 × 8, the coding end generates one 8 × 8 Gaussian random matrix Φ0 and then uses Φ0 to build the M × Nr block-diagonal matrix, whose diagonal consists of (M × Nr)/(8 × 8) copies of Φ0. This block Gaussian random matrix effectively reduces the amount of data needed to transmit the observation matrix the first time and shortens reconstruction time without affecting reconstruction accuracy, thereby ensuring real-time performance.
Reconstruction is one stage of the compressed sensing process, and the reconstruction algorithm is the concrete method used for it. The reconstruction algorithm used in the invention is NSL0, an improved modified Newton method. Experiments showed it to perform best among existing compressed sensing reconstruction algorithms: it combines high reconstruction accuracy with short reconstruction time, and so meets the system's requirements on reconstruction accuracy and real-time performance.
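The block observation matrix described above can be sketched by repeating one small Gaussian block along a diagonal. The text gives an 8 × 8 block; a square block by itself yields no dimensionality reduction, so this sketch assumes a block with fewer rows than columns (4 × 8). That shape, the function name `block_gaussian_matrix` and the seed are assumptions made for illustration.

```python
import numpy as np

def block_gaussian_matrix(frame_rows, block_rows, block_cols, seed=0):
    """Build Phi by repeating one Gaussian block Phi0 along the diagonal."""
    if frame_rows % block_cols:
        raise ValueError("frame_rows must be a multiple of block_cols")
    rng = np.random.default_rng(seed)
    phi0 = rng.standard_normal((block_rows, block_cols))
    n_blocks = frame_rows // block_cols
    # The Kronecker product with the identity places phi0 on each diagonal block.
    return np.kron(np.eye(n_blocks), phi0), phi0

Phi, Phi0 = block_gaussian_matrix(frame_rows=64, block_rows=4, block_cols=8)
print(Phi.shape, Phi0.shape)
```

Because every diagonal block is the same Φ0, only that small block has to be shared between the two ends rather than the full matrix, which is consistent with the reduced transmission cost claimed for the block structure.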

Claims (2)

1. A compressed sensing video coding and decoding system based on dictionary learning, characterized in that it consists mainly of a video coding end and a video decoding end:
Coding end: according to the requirements on reconstruction accuracy and real-time performance, the frames of the video are divided into two classes, key frames (K frames) and non-key frames (CS frames); every two frames form one group, i.e. the group of pictures (GOP) size is 2, the odd-numbered frame being the K frame and the frame immediately following it being the CS frame of that group; for a K frame, in accordance with compressed sensing theory, the image pixel data of the K frame is stored temporarily, then reduced in dimension by the observation matrix Φ, and the reduced data is transmitted to the decoding end by the wireless transmitter module; for a CS frame, after its image pixel data is read in, the difference with the preceding K frame is computed, dv = Xcs - Xk, and the mean squared error (MSE) of dv is evaluated; if the MSE is below the lower threshold, the two frames are judged to be very similar, and a 1-bit signal is sent to notify the decoding end that this CS frame needs no reconstruction and that the reconstruction of the preceding K frame is to be used as its reconstruction; if the MSE is above the upper threshold, dv is reduced in dimension by the observation matrix Φ and the reduced data is sent to the decoding end, together with a 1-bit signal notifying the decoding end to perform dictionary learning after it finishes reconstructing this CS frame; if the MSE lies between the two thresholds, dv is simply reduced in dimension by the observation matrix Φ and sent;
Decoding end: K frames are decoded by the compressed sensing reconstruction algorithm NSL0 and stored; if the coding end has sent the signal to update the dictionary, the dictionary of the sparse matrix is updated by the K-singular value decomposition (K-SVD) algorithm; for a CS frame, NSL0 compressed sensing reconstruction is performed with the coefficient matrix updated from the K frame and the observation matrix, and the reconstructed result is added to the reconstruction of the preceding K frame to obtain the reconstructed CS frame; finally, the frames are assembled into a video in frame order and output;
In applying compressed sensing theory, the sparse dictionary is generated by the K-SVD dictionary learning method, and the initial sparse dictionary is a global dictionary, i.e. a dictionary trained on pictures of the scene in which the camera is located.
2. The compressed sensing video coding and decoding system based on dictionary learning according to claim 1, characterized in that the observation matrix is a block Gaussian random matrix.
CN201310589803.0A 2013-11-20 2013-11-20 Video coding and decoding system based on dictionary learning and compressed sensing Expired - Fee Related CN103596010B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310589803.0A CN103596010B (en) 2013-11-20 2013-11-20 Video coding and decoding system based on dictionary learning and compressed sensing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310589803.0A CN103596010B (en) 2013-11-20 2013-11-20 Video coding and decoding system based on dictionary learning and compressed sensing

Publications (2)

Publication Number Publication Date
CN103596010A CN103596010A (en) 2014-02-19
CN103596010B true CN103596010B (en) 2017-01-11

Family

ID=50085966

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310589803.0A Expired - Fee Related CN103596010B (en) 2013-11-20 2013-11-20 Video coding and decoding system based on dictionary learning and compressed sensing

Country Status (1)

Country Link
CN (1) CN103596010B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105554502A (en) * 2015-12-07 2016-05-04 天津大学 Distributed compressed sensing video encoding and decoding method based on foreground-background separation
CN106991426B (en) * 2016-09-23 2020-06-12 天津大学 Remote sensing image sparse coding dictionary learning method based on embedded DSP
CN107659315B (en) * 2017-09-25 2020-11-10 天津大学 Sparse binary coding circuit for compressed sensing
CN109089123B (en) * 2018-08-23 2021-08-03 江苏大学 Compressed sensing multi-description coding and decoding method based on 1-bit vector quantization
CN109194968B (en) * 2018-09-13 2020-12-25 天津大学 Image compression sensing method fusing information source channel decoding
CN111192334B (en) * 2020-01-02 2023-06-06 苏州大学 Trainable compressed sensing module and image segmentation method
CN113365065B (en) * 2021-06-09 2024-04-26 湖南大学 Lossless video coding method and decoding method for RPA robot screen recording

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102427527A (en) * 2011-09-27 2012-04-25 西安电子科技大学 Method for reconstructing non key frame on basis of distributed video compression sensing system

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102427527A (en) * 2011-09-27 2012-04-25 西安电子科技大学 Method for reconstructing non key frame on basis of distributed video compression sensing system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Dictionary learning-based distributed compressive video sensing;Hung-Wei Chen et al.;《PCS2010》;20101210;全文 *
Distributed video coding using compressive sampling;Josep Prades-Nebot et al.;《PCS2009》;20090508;全文 *

Also Published As

Publication number Publication date
CN103596010A (en) 2014-02-19

Similar Documents

Publication Publication Date Title
CN103596010B (en) Video coding and decoding system based on dictionary learning and compressed sensing
Chen et al. Dictionary learning-based distributed compressive video sensing
CN100512443C (en) Distributive vide frequency coding method based on self adaptive Hashenhege type vector quantization
CN101540926B (en) Stereo video coding-decoding method based on H.264
JP2022133346A (en) Codec using neural network
CN103141092B (en) The method and apparatus carrying out encoded video signal for the super-resolution based on example of video compress use motion compensation
CN103618903B (en) The high-speed low-power-consumption radio sensing network video compress method of sampling
CN110062239B (en) Reference frame selection method and device for video coding
TWI468018B (en) Video coding using vector quantized deblocking filters
CN103002283A (en) Multi-view distributed video compression side information generation method
KR20070028404A (en) Method of storing pictures in a memory using compression coding and cost function including power consumption
CN102572428B (en) Side information estimating method oriented to distributed coding and decoding of multimedia sensor network
CN105554502A (en) Distributed compressed sensing video encoding and decoding method based on foreground-background separation
CN104333757A (en) Video coding and decoding method based on multiple description CS measurement value
CN109587431A (en) A kind of multi-channel video code stream merging method, device, equipment and storage medium
US20090323810A1 (en) Video encoding apparatuses and methods with decoupled data dependency
Bernatin et al. Video compression based on Hybrid transform and quantization with Huffman coding for video codec
CN104581173A (en) Soft decoding verification model platform
US20140269896A1 (en) Multi-Frame Compression
CN101360237A (en) Reference frame processing method, video decoding method and apparatus
CN1848960B (en) Residual coding in compliance with a video standard using non-standardized vector quantization coder
Wang et al. Deep correlated image set compression based on distributed source coding and multi-scale fusion
CN105306941B (en) A kind of method for video coding
CN107770537B (en) Light field image compression method based on linear reconstruction
KR100349058B1 (en) video compression and decompression Apparatus

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20170111

Termination date: 20201120

CF01 Termination of patent right due to non-payment of annual fee