CN102625124A - Stereo encoding device, decoding device and system - Google Patents

Stereo encoding device, decoding device and system

Info

Publication number
CN102625124A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201210055895XA
Other languages
Chinese (zh)
Other versions
CN102625124B (en)
Inventor
白慧慧
赵耀
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jiaotong University
Original Assignee
Beijing Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jiaotong University
Priority to CN201210055895.XA
Publication of CN102625124A
Application granted
Publication of CN102625124B
Legal status: Expired - Fee Related

Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention discloses a stereo encoding device, a decoding device and a system. The encoding device comprises a plurality of pairs of identical, mutually independent encoders, and each pair of encoders comprises a left-view channel and a right-view channel. Each channel comprises an odd-even frame separation module, a CS encoder, a standard encoder and a mode selection module. The left-view signal is separated into odd frames and even frames by the odd-even frame separation module; the odd frames are encoded by the CS encoder to obtain CS codes, the mode selection module controls the working mode of the CS encoder, and the even frames are encoded by the standard encoder to obtain key frames. The right-view signal is likewise separated into odd frames and even frames; the even frames are encoded by the CS encoder to obtain CS codes, the mode selection module controls the working mode of the CS encoder, and the odd frames are encoded by the standard encoder to obtain key frames. The invention further provides the decoding device and the system. The stereo encoding device, the decoding device and the system can be applied to products with multi-view or binocular stereo display.

Description

Stereo encoding device, decoding device and system
Technical field
The present invention relates to the field of video encoding and decoding technology, and in particular to a stereo encoding device, a stereo decoding device and a system.
Background
Because 3D video can provide users with a high-quality, immersive multimedia experience, it has attracted wide research interest from industry and academia. 3D video is generally divided into multi-view video representation with N views and binocular (stereo-view) video representation with two views. The basic format of binocular video comprises a left view and a right view, captured simultaneously by two closely spaced cameras. Owing to its simplicity and practicality, binocular video is currently the most widely used format on the 3D video market. However, the huge data volume of binocular or multi-view video poses a considerable challenge for 3D video applications, for example in data acquisition, compression and transmission, especially in wireless video sensor networks. In many applications, because the cameras have a low power budget, low-complexity video coding is required and the cameras cannot exchange information with one another. It is therefore necessary to develop a system that combines high compression efficiency with low complexity and requires no communication between cameras.
In recent years, the VCEG of ITU-T and the MPEG of ISO/IEC have proposed an extension of the H.264/MPEG-4 AVC standard to realize multi-view video coding (MVC). The basic idea of MVC is likewise block-based predictive coding, which exploits both the inter-view correlation and the temporal correlation within a view. Based on these two correlations, a binocular video coding scheme with an adaptive prediction structure was proposed in L. Meng, Y. Zhao, A. Wang, J. Pan and H. Bai, "Compatible Stereo Video Coding with Adaptive Prediction Structure," IEICE Trans. on Information and Systems, vol. E94-D, no. 7, pp. 1506-1509, 2011. In addition, L. Ding, S. Chien and L. Chen, "Joint Prediction Algorithm and Architecture for Stereo Video Hybrid Coding Systems," IEEE Trans. on Circuits and Systems for Video Technology, vol. 16, no. 11, pp. 1324-1337, 2006, proposed a joint prediction algorithm and designed a binocular video encoder. Although these algorithms achieve high compression efficiency, the video encoder needs higher power consumption to support predictive coding, and a transmission channel is needed for communication between the cameras. In practical applications, a communication channel between cameras is difficult to provide.
Summary of the invention
The technical problem solved by the present invention is how to reduce the power consumed by video encoding without requiring a transmission channel for communication between cameras.
To solve the above problem, a stereo encoding device is provided, characterized in that it comprises several pairs of identical, mutually independent encoders; each pair of encoders comprises a left-view channel and a right-view channel, and each channel comprises the following modules: an odd-even frame separation module, a CS (compressed sensing) encoder, a standard encoder and a mode selection module. The left-view signal is separated into odd frames and even frames by the odd-even frame separation module; the odd frames are encoded by the CS encoder to obtain CS codes, the mode selection module controls the working mode of the CS encoder, and the even frames are encoded by the standard encoder to obtain key frames. The right-view signal is separated into odd frames and even frames by the odd-even frame separation module; the even frames are encoded by the CS encoder to obtain CS codes, the mode selection module controls the working mode of the CS encoder, and the odd frames are encoded by the standard encoder to obtain key frames.
Further, as a preferred embodiment, the stereo encoding is multi-view encoding with multiple views or binocular encoding with two views.
Further, as a preferred embodiment, the mode selection module includes a SKIP mode: the mean square error between the current block and the co-located block in the adjacent key frame is computed first; if this value is less than a threshold t0, the current block is skipped and no measurements need to be transmitted.
The present invention also provides a stereo decoding device comprising several pairs of identical, mutually independent decoders; each pair of decoders comprises a left-view channel and a right-view channel, and each channel comprises the following modules: a CS reconstruction module, a standard decoder, a joint dictionary and an odd-even frame interleaving module. In the left-view channel, the received CS frames pass through the CS reconstruction module to yield odd frames, the received key frames pass through the standard decoder to yield even frames, and the odd and even frames are interleaved by the odd-even frame interleaving module to produce the left-view video sequence output. In the right-view channel, the received CS frames pass through the CS reconstruction module to yield even frames, the received key frames pass through the standard decoder to yield odd frames, and the odd and even frames are interleaved to produce the right-view video sequence output. The joint dictionary is obtained from the correlation among the data decoded by the standard decoders of the different views, and is used to control the CS reconstruction module.
The present invention also provides a stereo encoding and decoding system composed of the above encoding device and decoding device.
By using independent, low-complexity encoders, the power consumed by video encoding is reduced and no transmission channel is needed for communication between cameras.
Brief description of the drawings
The present invention and many of its attendant advantages will be more fully understood by reference to the following detailed description considered together with the accompanying drawings. The drawings described here provide a further understanding of the invention and form a part of it; the illustrative embodiments and their descriptions explain the invention and do not unduly limit it. In the drawings:
Fig. 1 is a block diagram of the distributed compressed sensing binocular video coding system based on a joint dictionary;
Fig. 2 shows the reference blocks in SINGLE mode;
Fig. 3 compares rate-distortion performance: (a) "rabbit"; (b) "soccer".
Detailed description
Embodiments of the present invention are described below with reference to Figs. 1-3.
To make the above objects, features and advantages easier to understand, the present invention is explained in further detail below with reference to the drawings and specific embodiments.
Because every view is processed in the same way, this embodiment extends readily to multi-view video coding. To keep the encoder design low in complexity, each view is first separated into an odd-frame video sequence and an even-frame video sequence, which then serve respectively as the CS frames and the key frames of distributed compressed sensing coding. In the decoder design, the inter-view correlation and the temporal correlation within a view are used to generate a joint dictionary from the key frames, which helps obtain better reconstruction results.
Embodiment 1
A stereo encoding device comprises several pairs of identical, mutually independent encoders 10. Each pair of encoders 10 comprises a left-view channel and a right-view channel, and each channel comprises the following modules: an odd-even frame separation module 1, a CS encoder 3, a standard encoder 4 and a mode selection module 2. The left-view signal is separated into odd frames and even frames by the odd-even frame separation module 1; the odd frames are encoded by the CS encoder 3 to obtain CS codes, the mode selection module 2 controls the working mode of the CS encoder 3, and the even frames are encoded by the standard encoder 4 to obtain key frames. The right-view signal is separated into odd frames and even frames by the odd-even frame separation module 1; the even frames are encoded by the CS encoder 3 to obtain CS codes, the mode selection module 2 controls the working mode of the CS encoder 3, and the odd frames are encoded by the standard encoder 4 to obtain key frames.
As shown in Fig. 1, at the encoding side each view is encoded independently, and no communication is needed between the two views.
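The frame routing described above can be sketched in a few lines. This is only an illustration under stated assumptions: frames are numbered from 1 (so the first frame is odd), and the function names `split_odd_even` and `route_view` are hypothetical, not taken from the patent.

```python
def split_odd_even(frames):
    """Split a frame list into (odd, even) frames, assuming 1-based numbering."""
    odd = frames[0::2]   # frames 1, 3, 5, ...
    even = frames[1::2]  # frames 2, 4, 6, ...
    return odd, even


def route_view(frames, view):
    """Decide which frames go to the CS encoder and which to the standard encoder.

    Left view:  odd frames -> CS encoder, even frames -> standard encoder (key frames).
    Right view: even frames -> CS encoder, odd frames -> standard encoder (key frames).
    """
    odd, even = split_odd_even(frames)
    if view == "left":
        return {"cs": odd, "key": even}
    if view == "right":
        return {"cs": even, "key": odd}
    raise ValueError("view must be 'left' or 'right'")


left = route_view(["L1", "L2", "L3", "L4"], "left")
right = route_view(["R1", "R2", "R3", "R4"], "right")
print(left)   # CS frames and key frames of the left view
print(right)  # CS frames and key frames of the right view
```

Because the two views use opposite parities, every time instant has a key frame in exactly one of the two views, and neither camera needs to know what the other is doing.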
Embodiment 2
A stereo decoding device comprises several pairs of identical, mutually independent decoders 20. Each pair of decoders 20 comprises a left-view channel and a right-view channel, and each channel comprises the following modules: a CS reconstruction module 6, a standard decoder 5, a joint dictionary 7 and an odd-even frame interleaving module 8. In the left-view channel, the received CS frames pass through the CS reconstruction module 6 to yield odd frames, the received key frames pass through the standard decoder 5 to yield even frames, and the odd and even frames are interleaved by the odd-even frame interleaving module 8 to produce the left-view video sequence output. In the right-view channel, the received CS frames pass through the CS reconstruction module 6 to yield even frames, the received key frames pass through the standard decoder 5 to yield odd frames, and the odd and even frames are interleaved by the odd-even frame interleaving module 8 to produce the right-view video sequence output. The joint dictionary 7 is obtained from the correlation among the data decoded by the standard decoders 5 of the different views, and is used to control the CS reconstruction module 6.
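The odd-even frame interleaving step at the decoder output can be sketched as follows. The function name `interleave_odd_even` is hypothetical, and 1-based frame numbering (first frame odd) is assumed.

```python
def interleave_odd_even(odd, even):
    """Merge decoded odd and even frames back into display order (1, 2, 3, ...)."""
    out = []
    for i in range(max(len(odd), len(even))):
        if i < len(odd):
            out.append(odd[i])
        if i < len(even):
            out.append(even[i])
    return out


# Left view: odd frames come from CS reconstruction, even frames from the standard decoder.
left_out = interleave_odd_even(["L1(cs)", "L3(cs)"], ["L2(key)", "L4(key)"])
# Right view: odd frames come from the standard decoder, even frames from CS reconstruction.
right_out = interleave_odd_even(["R1(key)", "R3(key)"], ["R2(cs)", "R4(cs)"])
print(left_out)
print(right_out)
```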
The CS encoder and CS reconstruction operate on the following principle:
Suppose $x \in \mathbb{R}^n$ is a discrete signal and $u$ is its coefficient vector under an orthogonal basis $\Psi$, so that $x = \Psi^T u$. If only $k$ of the $n$ coefficients are nonzero, $x$ is said to be $k$-sparse under $\Psi$. According to CS theory, there is no need to encode the $k$ nonzero coefficients as a conventional encoder would; instead, the CS encoder computes
$$y = \Phi x \qquad (1)$$
Here $\Phi$ is an $m \times n$ matrix and $y \in \mathbb{R}^m$. Because $m < n$, the original signal $x$ is compressed. In CS reconstruction, $u$ is recovered by solving the following optimization problem:
$$\min \|u\|_1 \quad \text{subject to} \quad y = \Phi \Psi^T u \qquad (2)$$
The original signal $x$ is then rebuilt from $x = \Psi^T u$.
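Formula (2) is the basis pursuit problem. One standard way to pass it to an interior-point solver is to rewrite it as a linear program with auxiliary variables. The variables $t_i$ below are introduced here purely for illustration and do not appear in the patent.

```latex
\begin{aligned}
\min_{u,\,t}\quad & \sum_{i} t_i \\
\text{subject to}\quad & -t_i \le u_i \le t_i \quad \text{for all } i, \\
& \Phi \Psi^{T} u = y .
\end{aligned}
```

At the optimum each $t_i$ equals $|u_i|$, so the objective equals $\|u\|_1$ and the two problems have the same solutions.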
In this embodiment, frames to be encoded as CS frames are processed block by block, with a block size of 16 × 16. Each block is arranged by raster scan (row by row) into an n × 1 column vector x, with n = 256. Each row of the matrix Φ consists of samples from a symmetric, independent Bernoulli distribution: every entry is ±1, with +1 and -1 each having probability 1/2. Note that the same matrix Φ is used for all blocks, which keeps the computational complexity low. By formula (1), the measurement vector y is obtained as an m × 1 column vector. y is then scalar-quantized and transmitted over the channel. Rather than a fixed orthogonal basis Ψ, this patent designs a joint dictionary that serves as Ψ in CS reconstruction. The recovery algorithm uses a general log-barrier algorithm to solve formula (2).
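The block measurement step can be illustrated with a short pure-Python sketch. Several details are assumptions rather than statements from the patent: the value m = 64, the fixed random seed, the uniform quantizer with step 8, and all function names are hypothetical choices made only so the example runs.

```python
import random


def bernoulli_matrix(m, n, seed=0):
    """m x n matrix whose entries are +1 or -1, each with probability 1/2."""
    rng = random.Random(seed)
    return [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(m)]


def raster_scan(block):
    """Flatten a 2-D block row by row into a vector of length n."""
    return [pixel for row in block for pixel in row]


def measure(phi, x):
    """Formula (1): y = Phi x."""
    return [sum(a * b for a, b in zip(row, x)) for row in phi]


def quantize(y, step):
    """Uniform scalar quantization to integer indices (assumed quantizer)."""
    return [round(v / step) for v in y]


BLOCK = 16
n = BLOCK * BLOCK          # n = 256, as in the embodiment
m = 64                     # hypothetical number of measurements, m < n
phi = bernoulli_matrix(m, n)   # the same Phi is reused for every block

# Synthetic 16 x 16 block standing in for real pixel data.
block = [[(r * BLOCK + c) % 256 for c in range(BLOCK)] for r in range(BLOCK)]
x = raster_scan(block)
y = measure(phi, x)
q = quantize(y, step=8)
print(len(x), len(y), len(q))
```

Since Φ contains only ±1, each measurement is a signed sum of pixel values and needs no multiplications, which is one reason this choice keeps the encoder cheap.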
The joint dictionary is generated as follows:
According to CS theory, the matrix Ψ should be chosen so that the signal is as sparse as possible under it, which effectively reduces the number of measurements to be transmitted. Most CS applications use a fixed orthogonal basis as Ψ, such as the discrete cosine transform or a wavelet transform. Given the temporal correlation of video signals, the current block can be predicted from its reference blocks. Therefore, if the current block is expressed as a linear combination of its reference blocks, it can be regarded as a sparse signal. For 3D video, there is also strong correlation between views. Therefore, when designing the dictionary Ψ for each block, this patent uses both the inter-view correlation and the temporal correlation within a view to build a joint dictionary. For example, if the current block x lies in odd frame Fk of the left view, then the reference blocks in even frame Fk+1 of the left view and the reference blocks in odd frame Fk of the right view are all used to generate the joint dictionary. The selected reference blocks are all possible blocks within a w × w window in the reference frame, centred on the position of the current block x. Each selected block, after raster scanning, becomes one row of the joint dictionary matrix Ψ.
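A minimal sketch of the dictionary construction follows. It assumes that "all possible blocks in the window" means every block whose top-left corner is displaced by at most p pixels in each direction (w = 2p + 1), and that candidates falling outside the frame are simply dropped; neither detail is spelled out in the text. The function names and the small 6 × 6 frames with 2 × 2 blocks are hypothetical, chosen to keep the example readable.

```python
def extract_block(frame, top, left, size):
    """Raster-scan the size x size block whose top-left corner is (top, left)."""
    return [v for r in range(top, top + size) for v in frame[r][left:left + size]]


def candidate_rows(frame, top, left, size, p):
    """All blocks displaced by at most p pixels around (top, left), one per row."""
    height, width = len(frame), len(frame[0])
    rows = []
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            t, l = top + dy, left + dx
            if 0 <= t <= height - size and 0 <= l <= width - size:
                rows.append(extract_block(frame, t, l, size))
    return rows


def joint_dictionary(temporal_key, inter_view_key, top, left, size, p):
    """Stack temporal and inter-view candidates as the rows of Psi."""
    return (candidate_rows(temporal_key, top, left, size, p)
            + candidate_rows(inter_view_key, top, left, size, p))


# Toy key frames: left-view even frame F(k+1) and right-view odd frame F(k).
left_even = [[r * 6 + c for c in range(6)] for r in range(6)]
right_odd = [[r * 6 + c + 100 for c in range(6)] for r in range(6)]

psi = joint_dictionary(left_even, right_odd, top=2, left=2, size=2, p=1)
print(len(psi), len(psi[0]))  # number of dictionary rows, length of each row
```

With rows as atoms, the relation x = Ψᵀu expresses the current block as a linear combination of the candidate reference blocks, which is what makes u sparse when one or a few candidates predict the block well.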
Mode selection works as follows:
To improve rate-distortion performance, this patent designs three modes at the encoder side.
Mode 1 is the SKIP mode. The mean square error (MSE) between the current block and the co-located block in the adjacent key frame is computed first. If this value is less than a threshold t0, the current block can be skipped and no measurements need to be transmitted. To decode a block in SKIP mode, it suffices to copy the co-located block of the adjacent key frame.
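The SKIP decision amounts to one MSE computation and one comparison. In the sketch below the threshold value and the function names are hypothetical; the patent does not give a numeric value for t0.

```python
def mse(a, b):
    """Mean square error between two equally sized, raster-scanned blocks."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


def is_skip(current, colocated_key, t0):
    """SKIP mode: true when the co-located key-frame block is close enough."""
    return mse(current, colocated_key) < t0


def decode_skip(colocated_key):
    """Decoder side of SKIP: copy the co-located block of the adjacent key frame."""
    return list(colocated_key)


current = [10, 10, 12, 12]
colocated = [10, 11, 12, 13]
T0 = 1.0  # hypothetical threshold

if is_skip(current, colocated, T0):
    reconstructed = decode_skip(colocated)  # no measurements are transmitted
    print("SKIP", reconstructed)
```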
Mode 2 is the SINGLE mode. For the current block x in frame Fk, four reference blocks xt, xb, xl and xr are selected in the adjacent key frame Fk+1, as shown in Fig. 2. These four reference blocks are taken from a window of size w × w centred on the position of x. If the offset of the four reference blocks relative to x is p, then w = 2p + 1. The minimum mean square error (MMSE) between x and the four reference blocks is then computed. If this value is less than a threshold t1, block x is CS-encoded with m1 measurements. Otherwise, m2 (m2 > m1) measurements are needed to encode block x; this case is handled as mode 3, the L1 mode. In SINGLE mode, CS reconstruction first compares the m1 received measurements with the measurements of the reference block represented by each row of the dictionary, and selects the block with the minimum mean square error as the reconstructed block. The SINGLE mode effectively reduces decoder complexity. In L1 mode, CS reconstruction solves formula (2) using the m2 received measurements.
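The three-way mode decision and the SINGLE-mode decoder can be sketched together. The order of the tests (SKIP first, then SINGLE, otherwise L1) is an assumption inferred from the mode numbering, and the thresholds, measurement counts, tiny 4-sample blocks and function names are all hypothetical.

```python
def mse(a, b):
    """Mean square error between two equally sized vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


def measure(phi, x):
    """Formula (1): y = Phi x."""
    return [sum(a * b for a, b in zip(row, x)) for row in phi]


def select_mode(x, colocated, refs, t0, t1, m1, m2):
    """Return (mode name, number of measurements to transmit)."""
    if mse(x, colocated) < t0:
        return ("SKIP", 0)
    mmse = min(mse(x, r) for r in refs)  # refs: the blocks xt, xb, xl, xr
    if mmse < t1:
        return ("SINGLE", m1)
    return ("L1", m2)


def single_reconstruct(y, phi, dictionary_rows):
    """SINGLE decoding: pick the dictionary row whose measurements best match y."""
    return min(dictionary_rows, key=lambda d: mse(measure(phi, d), y))


phi = [[1, -1, 1, -1], [1, 1, -1, -1]]       # toy 2 x 4 measurement matrix
x = [5, 5, 5, 6]                              # current block
refs = [[5, 5, 5, 5], [0, 0, 0, 0]]           # toy reference blocks
mode = select_mode(x, [9, 9, 9, 9], refs, t0=1.0, t1=1.0, m1=2, m2=4)

y = measure(phi, x)                           # what the encoder transmits
dictionary = [[1, 2, 3, 4], [5, 5, 5, 5], [0, 0, 9, 9]]
x_hat = single_reconstruct(y, phi, dictionary)
print(mode, x_hat)
```

In SINGLE mode the decoder never solves formula (2); it only applies Φ to each candidate and compares, which is where the reduction in decoding complexity comes from.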
With reference to Fig. 3, two standard video sequences, "rabbit" and "soccer", were selected to test the system of this patent. Both test sequences have a resolution of 720 × 480 and a frame rate of 30 frames per second. The standard codec is version JM 10.2 of the H.264 encoder. To demonstrate the performance of the joint dictionary, the same experimental parameters were used in the CS encoder and the H.264 encoder. Fig. 3 shows that the proposed system outperforms a system without the joint dictionary (whose dictionary is built only from adjacent key frames of the same view): for "rabbit" in Fig. 3(a) the proposed scheme gains about 0.5 dB in PSNR, and for "soccer" in Fig. 3(b) the gain exceeds 0.5 dB. A possible reason is that "soccer" contains faster motion, so its inter-view correlation is stronger than its temporal correlation, and a joint dictionary that accounts for both inter-view and temporal correlation therefore performs better.
In addition, Fig. 3 shows a PSNR gain of 0.5-1 dB for "rabbit" over the bit-rate range 50-300 kbps, and a PSNR gain of 0.5-1 dB for "soccer" over the bit-rate range 200-1800 kbps.
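For readers unfamiliar with the metric, PSNR is conventionally derived from the MSE between the original and reconstructed pixels. The sketch below uses the usual definition with a peak value of 255 for 8-bit video; the patent does not state the formula or bit depth it used, so both are assumptions here, as is the function name.

```python
import math


def psnr(reference, reconstructed, peak=255.0):
    """PSNR in dB between two equally sized pixel sequences (8-bit peak assumed)."""
    err = sum((a - b) ** 2 for a, b in zip(reference, reconstructed)) / len(reference)
    if err == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak * peak / err)


original = [100, 120, 140, 160]
decoded = [101, 119, 142, 158]
print(round(psnr(original, decoded), 2), "dB")
```

On this scale a gain of 0.5 dB corresponds to reducing the MSE by roughly 11 percent at the same bit rate.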
Embodiments of the present invention have been described in detail above; however, it will be readily apparent to those skilled in the art that many variations are possible without substantially departing from the inventive points and effects of the present invention. All such variations therefore fall within the scope of protection of the present invention.

Claims (6)

1. A stereo encoding device, characterized in that it comprises several pairs of identical, mutually independent encoders; each pair of encoders comprises a left-view channel and a right-view channel, and each channel comprises the following modules: an odd-even frame separation module, a CS encoder, a standard encoder and a mode selection module; the left-view signal is separated into odd frames and even frames by the odd-even frame separation module, the odd frames are encoded by the CS encoder to obtain CS codes, the mode selection module is used to control the working mode of the CS encoder, and the even frames are encoded by the standard encoder to obtain key frames; the right-view signal is separated into odd frames and even frames by the odd-even frame separation module, the even frames are encoded by the CS encoder to obtain CS codes, the mode selection module is used to control the working mode of the CS encoder, and the odd frames are encoded by the standard encoder to obtain key frames.
2. The stereo encoding device according to claim 1, characterized in that the stereo encoding is multi-view encoding with multiple views or binocular encoding with two views.
3. The stereo encoding device according to claim 1, characterized in that the mode selection module includes a SKIP mode: the mean square error between the current block and the co-located block in the adjacent key frame is computed first; if this value is less than a threshold t0, the current block is skipped and no measurements need to be transmitted.
4. A stereo decoding device, characterized in that it comprises several pairs of identical, mutually independent decoders; each pair of decoders comprises a left-view channel and a right-view channel, and each channel comprises the following modules: a CS reconstruction module, a standard decoder, a joint dictionary and an odd-even frame interleaving module; in the left-view channel, the received CS frames pass through the CS reconstruction module to yield odd frames, the received key frames pass through the standard decoder to yield even frames, and the odd frames and even frames are interleaved by the odd-even frame interleaving module to produce the left-view video sequence output; in the right-view channel, the received CS frames pass through the CS reconstruction module to yield even frames, the received key frames pass through the standard decoder to yield odd frames, and the odd frames and even frames are interleaved by the odd-even frame interleaving module to produce the right-view video sequence output; the joint dictionary is obtained from the correlation among the data decoded by the standard decoders of the different views, and is used to control the CS reconstruction module.
5. The stereo decoding device according to claim 4, characterized in that the stereo decoding is multi-view decoding with multiple views or binocular decoding with two views.
6. A stereo encoding and decoding system, characterized in that it is composed of the stereo encoding device according to any one of claims 1 to 3 and the stereo decoding device according to any one of claims 4 to 5.
CN201210055895.XA 2012-03-05 2012-03-05 Stereo encoding device, decoding device and system Expired - Fee Related CN102625124B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210055895.XA CN102625124B (en) 2012-03-05 2012-03-05 Stereo encoding device, decoding device and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210055895.XA CN102625124B (en) 2012-03-05 2012-03-05 Stereo encoding device, decoding device and system

Publications (2)

Publication Number Publication Date
CN102625124A true CN102625124A (en) 2012-08-01
CN102625124B CN102625124B (en) 2014-01-15

Family

ID=46564781

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210055895.XA Expired - Fee Related CN102625124B (en) 2012-03-05 2012-03-05 Stereo encoding device, decoding device and system

Country Status (1)

Country Link
CN (1) CN102625124B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101621690A (en) * 2009-07-24 2010-01-06 北京交通大学 Two-description video coding method based on Wyner-Ziv principle
US20100142620A1 (en) * 2008-12-04 2010-06-10 Electronics And Telecommunications Research Method of generating side information by correcting motion field error in distributed video coding and dvc decoder using the same
CN101742313A (en) * 2009-12-10 2010-06-16 北京邮电大学 Compression sensing technology-based method for distributed type information source coding
CN101860748A (en) * 2010-04-02 2010-10-13 西安电子科技大学 Side information generating system and method based on distribution type video encoding
CN102123278A (en) * 2010-12-10 2011-07-13 北京邮电大学 Signal source encoding method based on distributed compressive sensing technology

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108076295A (en) * 2017-12-15 2018-05-25 北京五特自动化工程有限公司 A kind of efficient machine vision communication system based on multidimensional code
CN108076295B (en) * 2017-12-15 2020-04-24 北京五特自动化工程有限公司 Machine vision communication system based on multi-dimensional code
CN115866151A (en) * 2023-02-27 2023-03-28 南昌市一境信息技术有限公司 Image communication method

Also Published As

Publication number Publication date
CN102625124B (en) 2014-01-15

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20140115

Termination date: 20160305