CN103596010A - Video coding and decoding system based on dictionary learning and compressed sensing - Google Patents

Video coding and decoding system based on dictionary learning and compressed sensing

Info

Publication number
CN103596010A
Authority
CN
China
Prior art keywords
frame
video
compressed sensing
dictionary
decoding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310589803.0A
Other languages
Chinese (zh)
Other versions
CN103596010B (en)
Inventor
郭继昌
金卯亨嘉
申燊
许颖
孙骏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2013-11-20
Filing date
2013-11-20
Publication date
2014-02-19
Application filed by Tianjin University
Priority to CN201310589803.0A
Publication of CN103596010A
Application granted
Publication of CN103596010B
Expired - Fee Related

Landscapes

  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention relates to the fields of video compressed sensing and sparse image representation, and discloses a video coding and decoding system based on compressed sensing. The system is designed for wireless video sensor networks, so that the encoder has low complexity and a small computational load, little data is transmitted over the channel, and the decoder can perform high-quality, real-time video reconstruction. The system, based on dictionary learning and compressed sensing, mainly comprises a video encoder and a video decoder. Following compressed sensing theory, the encoder temporarily stores the image pixel data of each K frame, reduces its dimensionality, and transmits the reduced data to the decoder through a wireless transmitting module. The decoder reconstructs the K frames with a compressed sensing reconstruction algorithm (the improved NSL0 method), stores them, and finally assembles the frames in order into the output video. The system is mainly applied to video compressed sensing and transmission.

Description

Compressed sensing video coding and decoding system based on dictionary learning
Technical field
The present invention relates to the fields of video compressed sensing and sparse image representation, and in particular to a compressed sensing video coding and decoding system based on dictionary learning.
Background art
The invention mainly targets video applications in which encoder resources are limited, such as video surveillance and wireless video sensor networks. Because of the limitations of the equipment and of the environment in which it is used, such applications require a low-complexity, low-power encoder to guarantee long-term stable operation, whereas the receiver can store large amounts of data and carry out complex decoding computations.
However, conventional video coding techniques, whether of the H.26x series or the MPEG series, all adopt a system structure with a complex encoder and a simple decoder: the encoder removes temporal and spatial redundancy through inter-frame prediction, intra-frame prediction and the discrete cosine transform (DCT) to achieve high compression efficiency. This places demands on the encoder's computing power and memory size that are far higher than those on the decoder. Conventional video coding is therefore unsuitable for the fields above.
Compressed sensing (CS) is a theory that has emerged in the signal processing field in recent years. It compresses the data during signal acquisition, sampling at a rate far below the Nyquist rate, so that less data is sampled and storage is saved while the samples still contain enough information. When the original signal needs to be recovered, a suitable reconstruction algorithm is applied to the samples. Compressed sensing thus merges conventional data acquisition and compression into one step and needs no complex encoding computation, which makes it well suited to settings where encoder resources are limited.
Summary of the invention
The present invention aims to overcome the deficiencies of the prior art by designing a compressed sensing video coding and decoding system for wireless video sensor networks, characterized by low encoder complexity and computational load, a small volume of data transmitted over the channel, and a decoder capable of high-quality, real-time video reconstruction. To this end, the technical solution adopted by the invention is a compressed sensing video coding and decoding system based on dictionary learning, mainly comprising two parts, a video encoder and a decoder:
Encoder: according to the requirements on reconstruction accuracy and real-time performance, the frames of the video are divided into two classes, key frames (K frames) and non-key frames (CS frames). Every two frames form one group, i.e. the group-of-pictures (GOP) size is 2; the odd-numbered frame is the K frame, and the frame immediately following it is the CS frame of that group. For a K frame, following compressed sensing theory, the image pixel data are stored temporarily and then reduced in dimensionality by the measurement matrix Φ, and the reduced data are transmitted to the decoder through the wireless transmitting module. For a CS frame, after its image pixel data are read in, the difference with the preceding K frame is taken, dv = Xcs − Xk, and the mean squared error (MSE) of dv is evaluated. If the MSE is below the lower threshold, the two frames are judged to be very similar, and a 1-bit signal is sent to notify the decoder that this CS frame needs no reconstruction and that the reconstruction of the preceding K frame is to be used directly as its reconstruction. If the MSE is above the upper threshold, dv is reduced in dimensionality by Φ, the reduced data are sent to the decoder, and a 1-bit signal is sent at the same time to notify the decoder to perform dictionary learning after it has reconstructed this CS frame. If the MSE lies within the threshold range, dv is simply reduced in dimensionality by Φ and sent;
Decoder: K frames are decoded with the compressed sensing reconstruction algorithm, an improved modified-Newton method (NSL0), and stored. If the encoder has sent the dictionary update signal, the dictionary of the sparse matrix is updated with the K-singular value decomposition (K-SVD) algorithm. For a CS frame, NSL0 compressed sensing reconstruction is carried out with the coefficient matrix updated from the K frame and the measurement matrix, and the result is added to the reconstruction of the preceding K frame to obtain the reconstructed CS frame. Finally, the frames are assembled in frame order into the output video.
The measurement matrix is a block Gaussian random matrix.
Specifically, within the compressed sensing framework, the sparse dictionary is generated with the K-SVD dictionary learning method; the initial sparse dictionary is a global dictionary, i.e. a dictionary trained on pictures of the scene in which the camera is located.
Technical features and effects of the invention:
The invention applies compressed sensing to the encoding and decoding of wireless video sensor networks, moving the computational complexity from the encoder to the decoder.
The frame-difference method and the block-based measurement matrix effectively reduce the transmitted data volume and the reconstruction time of CS frames while preserving reconstruction accuracy.
A global dictionary is used as the initial dictionary and is updated periodically through dictionary learning, which effectively improves reconstruction accuracy without affecting the real-time performance of video reconstruction.
Brief description of the drawings
Fig. 1 is the hardware structure diagram of the invention.
Fig. 2 is the block diagram of the compressed sensing video coding and decoding system based on dictionary learning.
Fig. 3 is the flow chart of the dictionary learning algorithm used in the invention.
Detailed description
To achieve the above object, the invention builds the whole video coding and decoding system on compressed sensing with dictionary learning. The system mainly comprises two parts, a video encoder and a decoder.
At the encoder, according to the requirements on reconstruction accuracy and real-time performance, the frames of the video are divided into two classes: key frames (K frames) and non-key frames (CS frames). Every two frames form one group, i.e. the group-of-pictures (GOP) size is 2; the odd-numbered frame is the K frame, and the frame immediately following it is the CS frame of that group. For a K frame, following compressed sensing theory, the image pixel data are stored temporarily and then reduced in dimensionality by the measurement matrix Φ, and the reduced data are transmitted to the decoder through the wireless transmitting module. For a CS frame, after its image pixel data are read in, the difference with the preceding K frame is taken, dv = Xcs − Xk, and the mean squared error (MSE) of dv is evaluated. If the MSE is below the lower threshold, the two frames are judged to be very similar, and a 1-bit signal is sent to notify the decoder that this CS frame needs no reconstruction and that the reconstruction of the preceding K frame can be used directly as its reconstruction. If the MSE is above the upper threshold, the two frames differ greatly, which indicates that the captured scene has changed substantially and that the dictionary should be updated to adapt to the new scene; dv is therefore reduced in dimensionality by Φ, the reduced data are sent to the decoder, and a 1-bit signal is sent at the same time to notify the decoder to perform dictionary learning after it has reconstructed this CS frame. If the MSE lies within the threshold range, dv is simply reduced in dimensionality by Φ and sent.
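The three-way decision for a CS frame can be sketched as follows. This is a minimal illustration, not the patented implementation: frames are plain nested lists, the names `CsMode`, `mse` and `classify_cs_frame` are hypothetical, the threshold values are left to the caller, and an MSE exactly equal to a threshold is treated as "within range", a case the text does not specify.

```python
from enum import Enum


class CsMode(Enum):
    SKIP = "skip"            # MSE below lower threshold: send only a 1-bit signal
    DIFF = "diff"            # MSE within range: send measurements of dv
    DIFF_AND_LEARN = "learn" # MSE above upper threshold: measurements + 1-bit signal


def mse(frame):
    """Mean of the squared entries of a 2-D list."""
    values = [v for row in frame for v in row]
    return sum(v * v for v in values) / len(values)


def classify_cs_frame(x_cs, x_k, s_low, s_high):
    """Return (mode, dv) for a CS frame and the preceding K frame."""
    # Frame difference dv = Xcs - Xk, element by element.
    dv = [[a - b for a, b in zip(r_cs, r_k)] for r_cs, r_k in zip(x_cs, x_k)]
    err = mse(dv)
    if err < s_low:
        return CsMode.SKIP, dv
    if err > s_high:
        return CsMode.DIFF_AND_LEARN, dv
    return CsMode.DIFF, dv
```

Only in the `DIFF` and `DIFF_AND_LEARN` cases would dv go on to be multiplied by Φ and transmitted; in the `SKIP` case nothing but the 1-bit signal is sent.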
At the decoder, K frames are decoded with the compressed sensing reconstruction algorithm, an improved modified-Newton method (NSL0), and stored. If the encoder has sent the dictionary update signal, the dictionary of the sparse matrix is updated with the K-singular value decomposition (K-SVD) algorithm. For a CS frame, NSL0 compressed sensing reconstruction is carried out with the coefficient matrix updated from the K frame and the measurement matrix, and the result is added to the reconstruction of the preceding K frame to obtain the reconstructed CS frame. Finally, the frames are assembled in frame order into the output video.
Here, the measurement matrix is a block Gaussian random matrix, which effectively reduces the amount of data needed to transmit the measurement matrix the first time and shortens reconstruction time without affecting reconstruction accuracy, thereby guaranteeing real-time performance.
Compressed sensing presupposes that the signal is sparse in some domain Ψ, i.e. that the signal can be represented by a sparse dictionary Ψ and the corresponding sparse coefficients, where the number of non-zero entries in the sparse coefficients is less than the sparsity K. The quality of the sparse dictionary determines the reconstruction accuracy of the signal. This system generates the sparse dictionary with the K-SVD dictionary learning method; because the learned dictionary is adaptive, it gives better reconstruction than the DCT or a wavelet transform. The initial sparse dictionary is a global dictionary, i.e. a dictionary trained on pictures of the scene in which the camera is located. Such an initial dictionary represents the images more sparsely and so secures high reconstruction accuracy from the outset.
The invention provides a compressed sensing video coding and decoding system for wireless video sensor networks, characterized by low encoder complexity and computational load, a small volume of data transmitted over the channel, and a decoder capable of high-quality, real-time video reconstruction.
The invention is described in more detail below with reference to the drawings.
Figure 1 shows the hardware block diagram of the system. It consists of the following parts: a digital camera, a DSP video encoding module, a wireless transmitting module, a wireless receiving module and a PC video decoding module. The digital camera, the DSP video encoding module and the wireless transmitting module form the encoder of the system; the wireless receiving module and the PC video decoding module form the decoder. At the encoder, the digital camera is connected to the DSP video encoding module through 32 multiplexed data lines, and the captured video data are passed into the module for storage and encoding; the encoded data are then transmitted to the decoder by the wireless transmitting module. The wireless receiving module at the decoder receives the encoded data and passes them to the PC, which decodes them and finally outputs the reconstructed video stream.
Figure 2 shows the block diagram of the compressed sensing video coding and decoding system based on dictionary learning. Assume that each frame of the input video is an Nr × Nc signal and that a group of pictures consists of two frames (GOP = 2), the first being the K frame Xk and the second the CS frame Xcs. The whole encoding and decoding process is described below for one group of pictures.
At the encoder, Xk is first read in and stored temporarily. It is then reduced by the measurement matrix Φ to an M × Nc signal Yk, as shown in formula (1), where Φ is an M × Nr block Gaussian random matrix and M ≪ Nr. Multiplication by Φ compresses the image data substantially, and the resulting Yk is finally sent to the decoder. This completes the encoding of the K frame.
Y=ΦX (1)
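Formula (1) is a single matrix product, which the following sketch makes concrete. It is an illustration under stated assumptions: a dense Gaussian matrix with N(0, 1) entries stands in for the block Gaussian matrix of the patent, the sizes are toy values, and `gaussian_matrix` and `measure` are hypothetical helper names.

```python
import random


def gaussian_matrix(m, n, seed=0):
    """M x N matrix with independent N(0, 1) entries (dense, for illustration)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]


def measure(phi, x):
    """Y = Phi X: reduce an Nr x Nc frame to an M x Nc measurement."""
    columns = list(zip(*x))
    return [[sum(p * v for p, v in zip(row, col)) for col in columns]
            for row in phi]


# Toy sizes: Nr = 8 rows, Nc = 6 columns, M = 3 measurements per column.
phi = gaussian_matrix(3, 8, seed=1)
x_k = [[float(r + c) for c in range(6)] for r in range(8)]
y_k = measure(phi, x_k)  # 3 x 6 instead of 8 x 6
```

Because the measurement is linear, measuring the difference dv = Xcs − Xk gives the same result as subtracting the two measurements, which is why the decoder can recover the CS frame by adding the reconstructed difference to the reconstructed K frame.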
Xcs is read in next, and its difference with the temporarily stored Xk is computed, dv = Xcs − Xk. The mean squared error (MSE) of dv is then calculated and compared with the preset thresholds. If the MSE is below the lower threshold Sl, the two frames are judged to be very similar; the encoder performs no encoding on this frame and only sends the 1-bit control signal "0" to notify the decoder that this CS frame needs no reconstruction and that the K frame reconstruction can be used directly as its reconstruction. If the MSE is above the upper threshold Sh, dv is reduced by the measurement matrix Φ according to formula (1), the reduced data Ydv are sent to the decoder, and the 1-bit control signal "1" is sent at the same time to notify the decoder to perform dictionary learning once this CS frame has been reconstructed. If the MSE lies within the threshold range, dv is simply reduced by Φ according to formula (1) and Ydv is sent.
The decoder starts working once it has received Yk. Compressed sensing reconstruction is carried out according to formula (2), where Φ is the same as at the encoder and Ψ is the Nr × Nr sparse matrix (dictionary), and the reconstructed K frame X̂k is stored.
min ‖s‖₀  s.t.  Y = ΦΨs (2)
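Formula (2) counts non-zero coefficients, so it cannot be minimized directly by gradient or Newton-type steps. Smoothed-ℓ0 algorithms work on a differentiable surrogate instead. As an illustration only, the Gaussian surrogate of the basic SL0 algorithm is shown below; the patent does not state which surrogate its improved NSL0 method uses, and N here denotes the length of s.

```latex
\|s\|_0 \;\approx\; N - \sum_{i=1}^{N} \exp\!\left(-\frac{s_i^{2}}{2\sigma^{2}}\right),
\qquad \sigma \to 0^{+}
```

Each exponential tends to 1 where s_i = 0 and to 0 elsewhere as σ shrinks, so the sum counts the zero entries. Such methods typically start from a large σ, where the surrogate is smooth, and decrease it gradually.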
The CS frame is decoded next. The decoder first checks whether a 1-bit control signal has been received. If the control signal is "0", then X̂cs = X̂k, and the reconstructions of the K frame and the CS frame are output in turn. If there is no control signal, Ydv is reconstructed by compressed sensing according to formula (2); because the difference was taken at the encoder, the reconstructed difference d̂v is added back to the K frame reconstruction, X̂cs = X̂k + d̂v. If the control signal is "1", the CS frame is first reconstructed by the method above, and the reconstruction result is then used as the training signal to update the dictionary through K-SVD dictionary learning, so that the dictionary better fits the changed scene.
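The decoder-side handling of the control signal can be sketched as a small dispatch function. This is a hedged outline rather than the patented decoder: `decode_cs_frame` is a hypothetical name, the reconstruction and dictionary update routines are passed in as callables because their internals (NSL0, K-SVD) are not reproduced here, and taking the reconstructed CS frame as the training signal is an assumption, since the corresponding formula is missing from the source text.

```python
def decode_cs_frame(control_bit, x_k_hat, y_dv, reconstruct, update_dictionary):
    """Rebuild a CS frame from the stored K-frame reconstruction.

    control_bit: 0, 1, or None when no control signal was received.
    reconstruct: callable mapping measurements Ydv to a reconstructed dv
                 (stands in for the compressed sensing solver).
    update_dictionary: callable invoked with the training signal
                       (stands in for K-SVD dictionary learning).
    """
    if control_bit == 0:
        # Frames judged nearly identical: reuse the K-frame reconstruction.
        return [row[:] for row in x_k_hat]

    dv_hat = reconstruct(y_dv)
    # The encoder sent a difference, so add it back to the K frame.
    x_cs_hat = [[k + d for k, d in zip(row_k, row_d)]
                for row_k, row_d in zip(x_k_hat, dv_hat)]

    if control_bit == 1:
        # Assumption: the reconstructed CS frame is the training signal.
        update_dictionary(x_cs_hat)
    return x_cs_hat
```

Passing the solver and the dictionary update in as callables keeps the control flow testable on its own, independently of how reconstruction is actually carried out.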
The dictionary learning method used here is the K-SVD algorithm, shown in Figure 3. First, sparse coding is carried out with the initial dictionary and the training signal: with the dictionary held fixed, the given data are sparsely represented over it (that is, approximated as closely as possible with as few coefficients as possible), which yields the coefficient matrix α. Next, with the coefficient matrix held fixed, each dictionary atom (each column of the dictionary) is updated in turn so that it represents the training signal more closely. Iterating these two steps several times completes the dictionary learning and yields a sparse matrix Ψ better suited to the new scene. This matrix is used to reconstruct the frames of the next GOP.
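The atom update stage of K-SVD can be sketched in a few lines. This is a simplified illustration under stated assumptions: the sparse coding stage is omitted, the rank-1 fit uses plain power iteration started from an all-ones vector (which can fail if that start happens to be orthogonal to the leading singular vector), and `rank1` and `update_atom` are hypothetical names.

```python
import math


def rank1(e, iters=50):
    """Leading singular triplet (u, sigma, v) of matrix e by power iteration."""
    n, m = len(e), len(e[0])
    v, sigma = [1.0] * m, 0.0
    u = [0.0] * n
    for _ in range(iters):
        u = [sum(e[r][c] * v[c] for c in range(m)) for r in range(n)]
        norm_u = math.sqrt(sum(x * x for x in u)) or 1.0
        u = [x / norm_u for x in u]
        v = [sum(e[r][c] * u[r] for r in range(n)) for c in range(m)]
        sigma = math.sqrt(sum(x * x for x in v))
        if sigma == 0.0:
            break
        v = [x / sigma for x in v]
    return u, sigma, v


def update_atom(y, d, a, k):
    """Update atom k in place.

    y[i]: training signal i; d[j]: atom j; a[j][i]: coefficient of atom j
    for signal i. Only signals that already use atom k are touched, which
    keeps the coefficient matrix as sparse as before.
    """
    users = [i for i, coef in enumerate(a[k]) if coef != 0.0]
    if not users:
        return
    # Residual of those signals when atom k is left out of the representation.
    e = [[y[i][r] - sum(a[j][i] * d[j][r] for j in range(len(d)) if j != k)
          for i in users] for r in range(len(d[k]))]
    u, sigma, v = rank1(e)
    d[k] = u  # new unit-norm atom
    for c, i in enumerate(users):
        a[k][i] = sigma * v[c]  # matching coefficients
```

A full K-SVD iteration would call `update_atom` for every atom after each sparse coding pass, and repeat both stages until the representation error stops improving.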
Finally, the reconstructions of the K frames and CS frames are output frame by frame, in order and at the frame rate, to form the output video stream.
The measurement matrix Φ is a block Gaussian random matrix. If the block size is 8 × 8, the encoder generates an 8 × 8 Gaussian random matrix Φ0 and then uses Φ0 to build the M × Nr block-diagonal matrix, whose diagonal consists of (M × Nr)/(8 × 8) copies of Φ0. This block Gaussian random matrix effectively reduces the amount of data needed to transmit the measurement matrix the first time and shortens reconstruction time without affecting reconstruction accuracy, thereby guaranteeing real-time performance. Reconstruction is one stage of the compressed sensing process, and the reconstruction algorithm is the specific method used in that stage. The reconstruction algorithm adopted by the invention is NSL0, an improved modified-Newton method. Experiments have verified that it performs best among existing compressed sensing reconstruction algorithms: it combines high reconstruction accuracy with short reconstruction time, which meets the system's requirements for accuracy and real-time operation.
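Building a block-diagonal measurement matrix from one small Gaussian block can be sketched as follows. The sketch rests on assumptions that go beyond the text: the block is allowed to be rectangular (fewer rows than columns) so that the assembled matrix actually reduces dimensionality, entries are drawn from N(0, 1), and `gaussian_block` and `block_diagonal` are hypothetical names.

```python
import random


def gaussian_block(rows, cols, seed=0):
    """One small Gaussian random block Phi0 with N(0, 1) entries."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def block_diagonal(block, copies):
    """Place `copies` copies of `block` along the diagonal, zeros elsewhere."""
    rows, cols = len(block), len(block[0])
    phi = [[0.0] * (cols * copies) for _ in range(rows * copies)]
    for c in range(copies):
        for i in range(rows):
            for j in range(cols):
                phi[c * rows + i][c * cols + j] = block[i][j]
    return phi


# Toy example: a 4 x 8 block repeated 3 times gives a 12 x 24 matrix.
phi0 = gaussian_block(4, 8, seed=7)
phi = block_diagonal(phi0, 3)
```

Only the small block (or the seed used to generate it) has to be shared with the decoder, since the full matrix can be rebuilt from it; this is the sense in which a block structure cuts the data needed to transmit the measurement matrix.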

Claims (3)

1. A compressed sensing video coding and decoding system based on dictionary learning, characterized in that it mainly comprises two parts, a video encoder and a decoder:
Encoder: according to the requirements on reconstruction accuracy and real-time performance, the frames of the video are divided into two classes, key frames (K frames) and non-key frames (CS frames); every two frames form one group, i.e. the group-of-pictures (GOP) size is 2; the odd-numbered frame is the K frame, and the frame immediately following it is the CS frame of that group; for a K frame, following compressed sensing theory, the image pixel data are stored temporarily and then reduced in dimensionality by the measurement matrix Φ, and the reduced data are transmitted to the decoder through the wireless transmitting module; for a CS frame, after its image pixel data are read in, the difference with the preceding K frame is taken, dv = Xcs − Xk, and the mean squared error (MSE) of dv is evaluated; if the MSE is below the lower threshold, the two frames are judged to be very similar, and a 1-bit signal is sent to notify the decoder that this CS frame needs no reconstruction and that the reconstruction of the preceding K frame is to be used directly as its reconstruction; if the MSE is above the upper threshold, dv is reduced in dimensionality by Φ, the reduced data are sent to the decoder, and a 1-bit signal is sent at the same time to notify the decoder to perform dictionary learning after it has reconstructed this CS frame; if the MSE lies within the threshold range, dv is simply reduced in dimensionality by Φ and sent;
Decoder: K frames are decoded with the compressed sensing reconstruction algorithm NSL0 and stored; if the encoder has sent the dictionary update signal, the dictionary of the sparse matrix is updated with the K-singular value decomposition (K-SVD) algorithm; for a CS frame, NSL0 compressed sensing reconstruction is carried out with the coefficient matrix updated from the K frame and the measurement matrix, and the result is added to the reconstruction of the preceding K frame to obtain the reconstructed CS frame; finally, the frames are assembled in frame order into the output video.
2. The compressed sensing video coding and decoding system based on dictionary learning according to claim 1, characterized in that the measurement matrix is a block Gaussian random matrix.
3. The compressed sensing video coding and decoding system based on dictionary learning according to claim 1, characterized in that, within the compressed sensing framework, the sparse dictionary is generated with the K-SVD dictionary learning method, and the initial sparse dictionary is a global dictionary, i.e. a dictionary trained on pictures of the scene in which the camera is located.
CN201310589803.0A 2013-11-20 2013-11-20 Video coding and decoding system based on dictionary learning and compressed sensing Expired - Fee Related CN103596010B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310589803.0A CN103596010B (en) 2013-11-20 2013-11-20 Video coding and decoding system based on dictionary learning and compressed sensing

Publications (2)

Publication Number Publication Date
CN103596010A (en) 2014-02-19
CN103596010B (en) 2017-01-11

Family

ID=50085966

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310589803.0A Expired - Fee Related CN103596010B (en) 2013-11-20 2013-11-20 Video coding and decoding system based on dictionary learning and compressed sensing

Country Status (1)

Country Link
CN (1) CN103596010B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102427527A (en) * 2011-09-27 2012-04-25 西安电子科技大学 Method for reconstructing non key frame on basis of distributed video compression sensing system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HUNG-WEI CHEN ET AL.: "Dictionary learning-based distributed compressive video sensing", 《PCS2010》, 10 December 2010 (2010-12-10) *
JOSEP PRADES-NEBOT ET AL.: "Distributed video coding using compressive sampling", 《PCS2009》, 8 May 2009 (2009-05-08) *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105554502A (en) * 2015-12-07 2016-05-04 天津大学 Distributed compressed sensing video encoding and decoding method based on foreground-background separation
CN106991426A (en) * 2016-09-23 2017-07-28 天津大学 Remote sensing images sparse coding dictionary learning method based on DSP embedded
CN106991426B (en) * 2016-09-23 2020-06-12 天津大学 Remote sensing image sparse coding dictionary learning method based on embedded DSP
CN107659315A (en) * 2017-09-25 2018-02-02 天津大学 A kind of sparse binary-coding circuit for compressed sensing
CN109089123A (en) * 2018-08-23 2018-12-25 江苏大学 Compressed sensing multi-description coding-decoding method based on the quantization of 1 bit vectors
CN109089123B (en) * 2018-08-23 2021-08-03 江苏大学 Compressed sensing multi-description coding and decoding method based on 1-bit vector quantization
CN109194968A (en) * 2018-09-13 2019-01-11 天津大学 A kind of compression of images cognitive method of fusion message source and channel decoding
CN109194968B (en) * 2018-09-13 2020-12-25 天津大学 Image compression sensing method fusing information source channel decoding
CN111192334A (en) * 2020-01-02 2020-05-22 苏州大学 Trainable compressed sensing module and image segmentation method
CN113365065A (en) * 2021-06-09 2021-09-07 湖南大学 Lossless video coding method and decoding method for RPA robot screen recording
CN113365065B (en) * 2021-06-09 2024-04-26 湖南大学 Lossless video coding method and decoding method for RPA robot screen recording

Also Published As

Publication number Publication date
CN103596010B (en) 2017-01-11

Legal Events

Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee (granted publication date: 2017-01-11; termination date: 2020-11-20)