CN103220532A - Joint prediction encoding method and joint prediction encoding system for stereoscopic video - Google Patents

Joint prediction encoding method and joint prediction encoding system for stereoscopic video

Info

Publication number
CN103220532A
Authority
CN
China
Prior art keywords
prediction
coding
depth
ref
macro block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310158699XA
Other languages
Chinese (zh)
Other versions
CN103220532B (en)
Inventor
季向阳
汪启扉
戴琼海
张乃尧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN201310158699.XA priority Critical patent/CN103220532B/en
Publication of CN103220532A publication Critical patent/CN103220532A/en
Application granted granted Critical
Publication of CN103220532B publication Critical patent/CN103220532B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention provides a joint prediction encoding method and a joint prediction encoding system for stereoscopic video. The method comprises the following steps: 1. inputting a stereoscopic video and dividing it into multiple coding macroblocks; 2. predicting the depth-predicted disparity of the current coding macroblock by a depth prediction method, and performing depth-aided inter-view prediction encoding on the current coding macroblock; 3. performing conventional inter-view prediction encoding on the current coding macroblock; 4. performing temporal prediction encoding on the current coding macroblock; 5. calculating the rate-distortion costs of the current coding macroblock under the depth-aided inter-view prediction encoding mode, the conventional inter-view prediction encoding mode and the temporal prediction encoding mode respectively; and 6. selecting the prediction encoding mode with the best rate-distortion performance as the prediction mode of the current coding macroblock and encoding accordingly. According to the method of the embodiments, the disparity of a coding macroblock is estimated from the depth to perform disparity-compensated prediction, which reduces the bit rate required for inter-view coding in stereoscopic video encoding and improves the efficiency of stereoscopic video encoding.

Description

Joint prediction coding method and system for stereoscopic video
Technical field
The present invention relates to the field of video coding, and in particular to a joint prediction coding method and system for stereoscopic video.
Background art
With the continuous development of video technology, stereoscopic video has attracted wide attention for its vivid visual effect. In stereoscopic video, the video data consists of video sequences and depth map sequences. The video usually comprises two or even more views, and the depth map sequence contains the depth map corresponding to each view. Therefore, in stereoscopic video applications, efficiently compressing and transmitting the massive amount of video and depth data has become one of the key technical bottlenecks.
To compress stereoscopic video data efficiently, researchers have proposed multi-view video coding. In this scheme, one view of the multi-view video is coded as the base view with a conventional video coding scheme that removes temporal redundancy. For the remaining views, the scheme introduces an inter-view prediction mode: temporal prediction and inter-view prediction are combined to compress both the temporal and the inter-view redundancy of the multi-view video, thereby effectively reducing the bit rate needed to code it. Since a depth map sequence can be regarded as a multi-view grayscale video sequence, the multi-view video coding scheme can likewise be used to encode the depth maps. In current mainstream stereoscopic video coding schemes, the encoder compresses the multi-view video and the depth maps separately with multi-view video coding, obtaining a video bitstream and a depth bitstream, both of which are transmitted to the decoding end, which reconstructs the multi-view video and the depth map sequences. The decoding end then renders virtual views according to the user's needs, forming the stereoscopic video sequence required by the user, which is played on a corresponding stereoscopic display.
Although multi-view video coding can effectively compress the temporal and inter-view redundancy of the multi-view video and the depth maps, the redundancy between the multi-view video and the depth maps still cannot be compressed effectively. In stereoscopic video, the depth map characterizes the depth of the corresponding points in the video sequence. Under given shooting conditions, the disparity of each coding macroblock can be predicted from its depth values. The depth map can therefore be regarded as side information for multi-view video coding: the disparity computed from the depth can replace the disparity obtained by disparity search, reducing the bit rate needed to code the disparity and compressing the redundancy between the multi-view video and the depth maps.
Two stereoscopic video coding schemes based on joint coding of multi-view video and depth maps currently exist. In the first, the encoder renders a virtual reference frame from the depth map corresponding to the current frame to be coded and a reference video frame, thereby reducing the redundant information between the depth maps and the disparity coding. The second is a prediction method that exploits the geometric constraint between temporal motion information and inter-view disparity information to derive the correlation between them.
The shortcomings of the prior art include:
(1) extra codec buffering is required, which increases the space complexity of the codec; and
(2) the computational complexity is high, which increases the time complexity of the codec.
Summary of the invention
The present invention aims to solve at least one of the above technical deficiencies.
To this end, one object of the present invention is to propose a joint prediction coding method for stereoscopic video.
Another object of the present invention is to propose a joint prediction coding system for stereoscopic video.
To achieve the above objects, an embodiment of a first aspect of the present invention proposes a joint prediction coding method for stereoscopic video, comprising the following steps: S1: inputting a stereoscopic video and dividing it into a plurality of coding macroblocks; S2: predicting the depth-predicted disparity of the current coding macroblock by a depth prediction method, and performing depth-aided inter-view prediction coding on the current coding macroblock according to the depth-predicted disparity; S3: obtaining a disparity vector by inter-view matching, and performing conventional inter-view prediction coding on the current macroblock according to the disparity vector; S4: obtaining a motion vector by temporal motion estimation, and performing temporal prediction coding on the current coding macroblock according to the motion vector; S5: calculating the rate-distortion costs of the current coding macroblock under the depth-aided inter-view prediction coding, conventional inter-view prediction coding and temporal prediction coding modes respectively; and S6: selecting the prediction coding mode with the best rate-distortion performance as the prediction mode of the current coding macroblock and encoding accordingly.
According to the method of the embodiment of the invention, the disparity of a coding macroblock is estimated from the depth to perform disparity-compensated prediction, which reduces the bit rate needed to code the disparity in stereoscopic video coding and improves the efficiency of stereoscopic video coding.
In an embodiment of the present invention, the method further comprises: S7: judging whether all the coding macroblocks have been coded; and S8: if not, repeating steps S1-S5 for the coding macroblocks that have not yet been coded until all coding macroblocks are coded.
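As a rough illustration of this per-macroblock flow, the following Python sketch loops over the macroblocks and picks the rate-distortion-optimal mode for each one. It is only a sketch under stated assumptions: the helper names passed in through cost_fns (e.g. depth_aided_cost, interview_cost, temporal_cost) are hypothetical stand-ins for the three prediction paths of steps S2-S4 and do not come from the patent text.

```python
def encode_stereo_frame(macroblocks, cost_fns):
    """Per-macroblock mode decision corresponding to steps S1-S8.

    macroblocks: iterable of coding macroblocks obtained by dividing the input video (S1)
    cost_fns: dict mapping a mode name to a function that returns the rate-distortion
              cost of that mode for a macroblock, e.g.
              {"DADCP": depth_aided_cost, "DCP": interview_cost, "MCP": temporal_cost},
              covering steps S2-S5
    """
    decisions = []
    for mb in macroblocks:                         # S7/S8: continue until all macroblocks are coded
        costs = {mode: fn(mb) for mode, fn in cost_fns.items()}
        best = min(costs, key=costs.get)           # S6: rate-distortion-optimal prediction mode
        decisions.append((mb, best, costs[best]))  # actual entropy coding of mb would follow here
    return decisions
```

With the example costs worked out later in the description (125, 87 and 81.5 for one macroblock), this loop would select the depth-aided inter-view mode for that macroblock.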
In an embodiment of the present invention, the rate-distortion cost of the temporal prediction coding is obtained by the following formula:

J_{MCP}(\vec{m}, ref_m) = \sum_{X \in B_k} |I - I_p(\vec{m}, ref_m)| + \lambda_{motion}(r_m + r_h),

where \vec{m} is the motion vector, B_k is the current coding macroblock, ref_m is the reference frame pointed to by \vec{m}, X is each pixel in B_k, I is the luma or chroma component value of X, I_p(\vec{m}, ref_m) is the luma or chroma component value of the corresponding pixel in the reference frame pointed to by \vec{m}, \lambda_{motion} is the Lagrange multiplier of temporal prediction, r_m is the bit rate needed to code the motion vector, and r_h is the bit rate needed to code the macroblock header fields other than the motion vector.
In an embodiment of the present invention, the rate-distortion cost of the conventional inter-view prediction coding is obtained by the following formula:

J_{DCP}(\vec{d_s}, ref_d) = \sum_{X \in B_k} |I - I_p(\vec{d_s}, ref_d)| + \lambda_{motion}(r_d + r_h),

where \vec{d_s} is the disparity obtained by inter-view matching, B_k is the current coding macroblock, ref_d is the reference frame pointed to by \vec{d_s}, I_p(\vec{d_s}, ref_d) is the luma or chroma component value of the corresponding pixel in the reference frame pointed to by \vec{d_s}, X is each pixel in B_k, I is the luma or chroma component value of X, \lambda_{motion} is the Lagrange multiplier of conventional inter-view prediction, and r_d is the bit rate needed to code the searched disparity vector.
In an embodiment of the present invention, the rate-distortion cost of the depth-aided inter-view prediction coding is obtained by the following formula:

J_{DADCP}(\vec{d_z}, ref_z) = \sum_{X \in B_k} |I - I_p(\vec{d_z}, ref_z)| + \lambda_{motion} \cdot r_h',

where \vec{d_z} is the disparity computed from the depth, B_k is the current coding macroblock, ref_z is the reference frame pointed to by \vec{d_z}, I_p(\vec{d_z}, ref_z) is the luma or chroma component value of the corresponding pixel in the reference frame pointed to by \vec{d_z}, X is each pixel in B_k, I is the luma or chroma component value of X, \lambda_{motion} is the Lagrange multiplier of depth-aided inter-view prediction, and r_h' is the bit rate needed to code the macroblock header under the disparity-compensated prediction mode based on the depth-predicted disparity.
To achieve the above objects, an embodiment of another aspect of the present invention proposes a joint prediction coding system for stereoscopic video, comprising: a division module for inputting a stereoscopic video and dividing it into a plurality of coding macroblocks; a first prediction module for predicting the depth-predicted disparity of the current coding macroblock by a depth prediction method and performing depth-aided inter-view prediction coding on the current coding macroblock according to the depth-predicted disparity; a second prediction module for performing conventional inter-view prediction coding on the current macroblock; a third prediction module for performing temporal prediction coding on the current coding macroblock; a computing module for calculating the rate-distortion costs of the current coding macroblock under the depth-aided inter-view prediction coding, conventional inter-view prediction coding and temporal prediction coding modes respectively; and a selection module for selecting the prediction coding mode with the best rate-distortion performance as the prediction mode of the current coding macroblock and encoding accordingly.
According to the system of the embodiment of the invention, the disparity of a coding macroblock is estimated from the depth to perform disparity-compensated prediction, which reduces the bit rate needed to code the disparity in stereoscopic video coding and improves the efficiency of stereoscopic video coding.
In an embodiment of the present invention, the system further comprises: a judging module for judging whether all the coding macroblocks have been coded; and a processing module for, when coding is not complete, reusing the division module, the first prediction module, the second prediction module, the third prediction module, the computing module and the selection module until all coding macroblocks are coded.
In an embodiment of the present invention, the rate-distortion cost of the temporal prediction coding is obtained by the following formula:

J_{MCP}(\vec{m}, ref_m) = \sum_{X \in B_k} |I - I_p(\vec{m}, ref_m)| + \lambda_{motion}(r_m + r_h),

where \vec{m} is the motion vector, B_k is the current coding macroblock, ref_m is the reference frame pointed to by \vec{m}, X is each pixel in B_k, I is the luma or chroma component value of X, I_p(\vec{m}, ref_m) is the luma or chroma component value of the corresponding pixel in the reference frame pointed to by \vec{m}, \lambda_{motion} is the Lagrange multiplier of temporal prediction, r_m is the bit rate needed to code the motion vector, and r_h is the bit rate needed to code the macroblock header fields other than the motion vector.
In an embodiment of the present invention, the rate-distortion cost of the conventional inter-view prediction coding is obtained by the following formula:

J_{DCP}(\vec{d_s}, ref_d) = \sum_{X \in B_k} |I - I_p(\vec{d_s}, ref_d)| + \lambda_{motion}(r_d + r_h),

where \vec{d_s} is the disparity obtained by inter-view matching, B_k is the current coding macroblock, ref_d is the reference frame pointed to by \vec{d_s}, I_p(\vec{d_s}, ref_d) is the luma or chroma component value of the corresponding pixel in the reference frame pointed to by \vec{d_s}, X is each pixel in B_k, I is the luma or chroma component value of X, \lambda_{motion} is the Lagrange multiplier of conventional inter-view prediction, and r_d is the bit rate needed to code the searched disparity vector.
In an embodiment of the present invention, the rate-distortion cost of the depth-aided inter-view prediction coding is obtained by the following formula:

J_{DADCP}(\vec{d_z}, ref_z) = \sum_{X \in B_k} |I - I_p(\vec{d_z}, ref_z)| + \lambda_{motion} \cdot r_h',

where \vec{d_z} is the disparity computed from the depth, B_k is the current coding macroblock, ref_z is the reference frame pointed to by \vec{d_z}, I_p(\vec{d_z}, ref_z) is the luma or chroma component value of the corresponding pixel in the reference frame pointed to by \vec{d_z}, X is each pixel in B_k, I is the luma or chroma component value of X, \lambda_{motion} is the Lagrange multiplier of depth-aided inter-view prediction, and r_h' is the bit rate needed to code the macroblock header under the disparity-compensated prediction mode based on the depth-predicted disparity.
Additional aspects and advantages of the present invention will be set forth in part in the following description, will in part become apparent from the description, or may be learned by practice of the present invention.
Description of drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a flow chart of a joint prediction coding method for stereoscopic video according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of virtual view rendering according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of a coding prediction structure according to an embodiment of the present invention; and
Fig. 4 is a structural block diagram of a joint prediction coding system for stereoscopic video according to an embodiment of the present invention.
Embodiment
Embodiments of the present invention are described in detail below, and examples of the embodiments are illustrated in the accompanying drawings, in which identical or similar reference numerals throughout denote identical or similar elements or elements having identical or similar functions. The embodiments described below with reference to the drawings are exemplary and are intended only to explain the present invention; they are not to be construed as limiting the present invention.
Fig. 1 is a flow chart of a joint prediction coding method for stereoscopic video according to an embodiment of the present invention. As shown in Fig. 1, the joint prediction coding method for stereoscopic video according to the embodiment of the invention comprises the following steps:
Step S101: input a stereoscopic video and divide it into a plurality of coding macroblocks.
Specifically, the stereoscopic video is input and preprocessed by correction, alignment and the like, and the processed stereoscopic video is then divided into a plurality of coding macroblocks.
Step S102: predict the depth-predicted disparity of the current coding macroblock by a depth prediction method, and perform depth-aided inter-view prediction coding on the current coding macroblock according to the depth-predicted disparity.
Specifically, suppose the stereoscopic video sequence contains only the video and depth map sequences of a left and a right view. The baseline distance between the left and right views is c, and the camera focal length of both views is f. The current coding macroblock is B_k, which contains n_{B_k} pixels, and the depth value corresponding to each pixel is z_k^j. The depth-predicted disparity of the current coding macroblock B_k is predicted from the depth values of its pixels. Let the depth value of the current coding macroblock B_k be the maximum-likelihood value of the depth values of all pixels contained in B_k; then z_k can be expressed as

z_k = \arg\max_{z_k^j} prob(\{ z_k^j \mid j = 1, 2, \ldots, n_{B_k} \}),

where z_k^j is the depth value of each pixel.
Fig. 2 is a schematic diagram of virtual view rendering according to an embodiment of the present invention. As shown in Fig. 2, after the depth value corresponding to B_k is obtained, the disparity of the current coding macroblock can be computed from the mapping between depth and disparity. The predicted disparity of the current coding macroblock can be expressed as

d_k = \frac{f c}{z_k},

where d_k is the computed disparity, f is the focal length, and c is the baseline distance between the left and right views. For a coding mode with quarter-pixel precision, d_k is rounded to the nearest quarter-pixel position and used as the depth-predicted disparity of the current coding macroblock.
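The depth-to-disparity prediction just described can be sketched in a few lines of Python. This is an illustrative sketch only, assuming that the maximum-likelihood depth value is the most frequent depth in the block (which matches the worked example later in the description); the function name depth_predicted_disparity is our own and not part of the patent.

```python
import numpy as np

def depth_predicted_disparity(depth_block, focal_length, baseline):
    """Predict the disparity of one coding macroblock from its depth values.

    depth_block: 2-D array of per-pixel depth values z_k^j
    focal_length: camera focal length f (assumed equal for both views)
    baseline: baseline distance c between the left and right views
    """
    values, counts = np.unique(depth_block, return_counts=True)
    z_k = values[np.argmax(counts)]        # maximum-likelihood (most frequent) depth value
    d_k = focal_length * baseline / z_k    # depth-to-disparity mapping d_k = f*c / z_k
    return round(d_k * 4) / 4.0            # round to the nearest quarter-pixel position

# Example from the description: an 8x8 block whose most frequent depth is 62,
# with f = 100 and c = 10, gives d_k = 16.13, which rounds to 16.25 at quarter-pel precision.
```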
Step S103: obtain a disparity vector by inter-view matching, and perform conventional inter-view prediction coding on the current macroblock according to the disparity vector.
Step S104: obtain a motion vector by temporal motion estimation, and perform temporal prediction coding on the current coding macroblock according to the motion vector.
Step S105: calculate the rate-distortion costs of the current coding macroblock under the depth-aided inter-view prediction coding, conventional inter-view prediction coding and temporal prediction coding modes respectively.
Specifically, the encoder computes the rate-distortion cost under each prediction mode. Let the motion vector of the current coding macroblock B_k be \vec{m}, the searched disparity be \vec{d_s}, and the depth-predicted disparity be \vec{d_z}.
The rate-distortion cost of the temporal prediction coding of the current macroblock is obtained by the following formula:

J_{MCP}(\vec{m}, ref_m) = \sum_{X \in B_k} |I - I_p(\vec{m}, ref_m)| + \lambda_{motion}(r_m + r_h),

where \vec{m} is the motion vector, B_k is the current coding macroblock, ref_m is the reference frame pointed to by \vec{m}, X is each pixel in B_k, I is the luma or chroma component value of X, I_p(\vec{m}, ref_m) is the luma or chroma component value of the corresponding pixel in the reference frame pointed to by \vec{m}, \lambda_{motion} is the Lagrange multiplier of temporal prediction, r_m is the bit rate needed to code the motion vector, and r_h is the bit rate needed to code the macroblock header fields other than the motion vector.
The rate-distortion cost of the conventional inter-view prediction coding of the current macroblock with the searched disparity is obtained by the following formula:

J_{DCP}(\vec{d_s}, ref_d) = \sum_{X \in B_k} |I - I_p(\vec{d_s}, ref_d)| + \lambda_{motion}(r_d + r_h),

where \vec{d_s} is the disparity obtained by inter-view matching, B_k is the current coding macroblock, ref_d is the reference frame pointed to by \vec{d_s}, I_p(\vec{d_s}, ref_d) is the luma or chroma component value of the corresponding pixel in the reference frame pointed to by \vec{d_s}, X is each pixel in B_k, I is the luma or chroma component value of X, \lambda_{motion} is the Lagrange multiplier of conventional inter-view prediction, and r_d is the bit rate needed to code the searched disparity vector.
In stereoscopic video, the depth information can be regarded as side information of the video coding. It can therefore be assumed that the encoding end and the decoding end can obtain the same reconstructed depth map, so the depth-predicted disparity does not need to be written into the bitstream. The rate-distortion cost of the depth-aided inter-view prediction coding of the current macroblock using the depth-predicted disparity can therefore be expressed as

J_{DADCP}(\vec{d_z}, ref_z) = \sum_{X \in B_k} |I - I_p(\vec{d_z}, ref_z)| + \lambda_{motion} \cdot r_h',

where \vec{d_z} is the disparity computed from the depth, B_k is the current coding macroblock, ref_z is the reference frame pointed to by \vec{d_z}, I_p(\vec{d_z}, ref_z) is the luma or chroma component value of the corresponding pixel in the reference frame pointed to by \vec{d_z}, X is each pixel in B_k, I is the luma or chroma component value of X, \lambda_{motion} is the Lagrange multiplier of depth-aided inter-view prediction, and r_h' is the bit rate needed to code the macroblock header under the disparity-compensated prediction mode based on the depth-predicted disparity.
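The three rate-distortion costs above share one structure: a sum of absolute prediction differences plus a Lagrangian rate term, where only the side-information rate differs (motion vector plus header for temporal prediction, searched disparity vector plus header for conventional inter-view prediction, and the slightly larger header alone for the depth-aided mode, since the disparity itself is derived from the reconstructed depth at both ends). A minimal sketch follows, with function names of our own choosing and the prediction blocks assumed to be given:

```python
import numpy as np

def sad(block, prediction):
    """Sum of absolute differences between the current block and its prediction."""
    return int(np.abs(block.astype(np.int64) - prediction.astype(np.int64)).sum())

def j_mcp(block, pred, lam, r_m, r_h):
    """Temporal (motion-compensated) prediction cost: SAD + lambda * (r_m + r_h)."""
    return sad(block, pred) + lam * (r_m + r_h)

def j_dcp(block, pred, lam, r_d, r_h):
    """Conventional inter-view (disparity-compensated) prediction cost: SAD + lambda * (r_d + r_h)."""
    return sad(block, pred) + lam * (r_d + r_h)

def j_dadcp(block, pred, lam, r_h_prime):
    """Depth-aided inter-view prediction cost: only the header rate r_h' is paid,
    because the depth-predicted disparity is not written into the bitstream."""
    return sad(block, pred) + lam * r_h_prime
```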
Step S106: select the prediction mode corresponding to the minimum rate-distortion cost as the prediction mode of the current coding macroblock and encode accordingly.
Specifically, the encoder selects the rate-distortion-optimal prediction mode as the prediction mode of the current coding macroblock. The selection process can be expressed as

J = \min( J_{MCP}(\vec{m}, ref_m), J_{DCP}(\vec{d_s}, ref_d), J_{DADCP}(\vec{d_z}, ref_z) ),

where J_{MCP}(\vec{m}, ref_m), J_{DCP}(\vec{d_s}, ref_d) and J_{DADCP}(\vec{d_z}, ref_z) are the rate-distortion costs of temporal prediction, conventional inter-view prediction and depth-aided inter-view prediction, respectively.
In one embodiment of the invention, the video sequence used for stereoscopic video coding is the standard test video sequence "Book Arrival" in standard-definition format, with a resolution of 1024 x 768 pixels. The codec adopts the reference software JMVC (Joint Multi-view Video Coding) of the H.264/AVC multi-view extension (MVC, Multi-view Video Coding); the GOP (Group of Pictures) size of the encoder is 8 frames, and the temporal prediction coding uses a hierarchical B (hierarchical bi-predictive frame, hierarchical B frame for short) structure. Fig. 3 is a schematic diagram of the coding prediction structure according to an embodiment of the present invention. As shown in Fig. 3, virtual view rendering uses the two color videos adjacent to the virtual view together with their depth maps. This example uses the two videos of view 10 and view 8 of the "Book Arrival" sequence as the multi-view video input, where view 10 is called the left reference view and view 8 the right reference view. The quantization parameter QP for coding the multi-view video and the multi-view depth maps takes integer values between 0 and 51. The baseline distance between the left and right views is 10, and the camera focal length is 100.
Let the current coding macroblock B_k be an 8 x 8 macroblock in a frame of the view-8 video of the "Book Arrival" sequence. Its corresponding depth values are shown in the following 8 x 8 matrix.
z_{B_k} =
\begin{bmatrix}
62 & 63 & 62 & 61 & 63 & 58 & 62 & 63 \\
61 & 62 & 62 & 63 & 61 & 64 & 65 & 64 \\
62 & 61 & 57 & 61 & 63 & 59 & 63 & 63 \\
67 & 61 & 58 & 62 & 61 & 61 & 66 & 64 \\
63 & 62 & 62 & 58 & 60 & 62 & 63 & 62 \\
61 & 62 & 61 & 61 & 61 & 63 & 60 & 58 \\
62 & 62 & 61 & 62 & 62 & 62 & 60 & 62 \\
64 & 63 & 61 & 62 & 62 & 62 & 60 & 61
\end{bmatrix}

For the current coding macroblock B_k, its depth value is the maximum-likelihood value z_k of the depth values of all pixels contained in B_k:

z_k = \arg\max_{z_k^j} prob(\{ z_k^j \mid j = 1, 2, \ldots, n_{B_k} \}) = 62.

After the depth value of the current coding macroblock B_k is obtained, the predicted disparity of the current coding macroblock is

d_k = \frac{f c}{z_k} = \frac{100 \times 10}{62} = 16.13.
For a coding mode with quarter-pixel precision, d_k is rounded to the nearest quarter-pixel position, giving d_k' = 16.25. The encoder then performs inter-view prediction based on this predicted disparity: for the current coding macroblock the predicted disparity is 16.25, and the encoder finds the corresponding reference macroblock in the corresponding frame of view 10 and predicts from it. Suppose the sum of absolute prediction residuals is 50. In addition, the encoder also performs the other compensated predictions for the current macroblock, namely temporal prediction and conventional inter-view prediction. In the temporal prediction, suppose the motion vector of the current macroblock is 32 and the sum of absolute residuals of the temporal prediction is 80. In the conventional inter-view prediction, suppose the disparity obtained by the encoder through block-matching search is 16 and the sum of absolute residuals of the conventional inter-view prediction is 45.
The encoder then compares the inter prediction costs of the different modes for the macroblock. Suppose the number of bits needed to code the motion vector of the current macroblock B_k is r_m = 10, the number of bits needed to code the disparity obtained by block-matching search is r_d = 8, and the number of bits needed to code the header of B_k is r_h = 20. In the inter-view prediction based on the depth-predicted disparity, the number of bits needed to code the header of B_k is r_h' = 21: one extra bit is used to signal that the current macroblock uses inter-view prediction based on the depth-predicted disparity. In the rate-distortion optimization, the Lagrange multiplier \lambda_{motion} is set to 1.5.
Therefore, for macroblock B_k, the rate-distortion cost of its temporal prediction is
J_{MCP}(\vec{m}, ref_m) = \sum_{X \in B_k} |I - I_p(\vec{m}, ref_m)| + \lambda_{motion}(r_m + r_h) = 80 + 1.5 \times (10 + 20) = 125.
The rate-distortion cost of the conventional inter-view prediction of B_k is
J_{DCP}(\vec{d_s}, ref_d) = \sum_{X \in B_k} |I - I_p(\vec{d_s}, ref_d)| + \lambda_{motion} \times (r_d + r_h) = 45 + 1.5 \times (8 + 20) = 87.
When the depth-predicted disparity is used for prediction coding, the rate-distortion cost of the depth-aided inter-view prediction coding of B_k is
J_{DADCP}(\vec{d_z}, ref_z) = \sum_{X \in B_k} |I - I_p(\vec{d_z}, ref_z)| + \lambda_{motion} \times r_h' = 50 + 1.5 \times 21 = 81.5.
The encoder then selects the optimal inter prediction coding mode by comparing the rate-distortion costs under the different prediction modes. For the current macroblock B_k,
J = \min( J_{MCP}(\vec{m}, ref_m), J_{DCP}(\vec{d_s}, ref_d), J_{DADCP}(\vec{d_z}, ref_z) ) = \min(125, 87, 81.5) = 81.5.
Therefore, its optimal inter prediction coding mode is depth-aided inter-view prediction coding. After the optimal inter prediction mode has been obtained, the encoder performs a second rate-distortion optimized selection: it further compares the rate-distortion costs of the inter prediction mode and the intra prediction modes, and finally codes the current macroblock with the rate-distortion-optimal mode.
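The arithmetic of this example can be checked with a few lines of Python (a sketch only; the residual sums 80, 45 and 50 are the assumed SAD values from the example rather than values computed from real pixel data):

```python
lam = 1.5                                  # Lagrange multiplier from the example
r_m, r_d, r_h, r_h_prime = 10, 8, 20, 21   # bit counts assumed in the example

j_mcp   = 80 + lam * (r_m + r_h)           # temporal prediction cost: 125.0
j_dcp   = 45 + lam * (r_d + r_h)           # conventional inter-view prediction cost: 87.0
j_dadcp = 50 + lam * r_h_prime             # depth-aided inter-view prediction cost: 81.5

costs = {"MCP": j_mcp, "DCP": j_dcp, "DADCP": j_dadcp}
print(costs, min(costs, key=costs.get))    # {'MCP': 125.0, 'DCP': 87.0, 'DADCP': 81.5} DADCP
```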
According to the method of the embodiment of the invention, the disparity of a coding macroblock is estimated from the depth to perform disparity-compensated prediction, which reduces the bit rate needed to code the disparity in stereoscopic video coding and improves the efficiency of stereoscopic video coding.
Fig. 4 is a structural block diagram of a joint prediction coding system for stereoscopic video according to an embodiment of the present invention. As shown in Fig. 4, the joint prediction coding system for stereoscopic video comprises a division module 100, a first prediction module 200, a second prediction module 300, a third prediction module 400, a computing module 500 and a selection module 600.
The division module 100 is configured to input a stereoscopic video and divide it into a plurality of coding macroblocks.
Specifically, the stereoscopic video is input and preprocessed by correction, alignment and the like, and the processed stereoscopic video is then divided into a plurality of coding macroblocks.
The first prediction module 200 is configured to predict the depth-predicted disparity of the current coding macroblock by a depth prediction method and to perform depth-aided inter-view prediction coding on the current coding macroblock according to the depth-predicted disparity.
Specifically, suppose the stereoscopic video sequence contains only the video and depth map sequences of a left and a right view. The baseline distance between the left and right views is c, and the camera focal length of both views is f. The current coding macroblock is B_k, which contains n_{B_k} pixels, and the depth value corresponding to each pixel is z_k^j. The depth-predicted disparity of the current coding macroblock B_k is predicted from the depth values of its pixels. Let the depth value of the current coding macroblock B_k be the maximum-likelihood value of the depth values of all pixels contained in B_k; then z_k can be expressed as

z_k = \arg\max_{z_k^j} prob(\{ z_k^j \mid j = 1, 2, \ldots, n_{B_k} \}),

where z_k^j is the depth value of each pixel.
Fig. 2 is a schematic diagram of virtual view rendering according to an embodiment of the present invention. As shown in Fig. 2, after the depth value corresponding to B_k is obtained, the disparity of the current coding macroblock can be computed from the mapping between depth and disparity. The predicted disparity of the current coding macroblock can be expressed as

d_k = \frac{f c}{z_k},

where d_k is the computed disparity, f is the focal length, and c is the baseline distance between the left and right views. For a coding mode with quarter-pixel precision, d_k is rounded to the nearest quarter-pixel position and used as the depth-predicted disparity of the current coding macroblock.
The second prediction module 300 is configured to obtain a disparity vector by inter-view matching and to perform conventional inter-view prediction coding on the current macroblock according to the disparity vector.
The third prediction module 400 is configured to obtain a motion vector by temporal motion estimation and to perform temporal prediction coding on the current coding macroblock according to the motion vector.
The computing module 500 is configured to calculate the rate-distortion costs of the current coding macroblock under the temporal prediction and the inter-view compensated prediction modes.
Specifically, the encoder computes the rate-distortion cost under each prediction mode. Let the motion vector of the current coding macroblock B_k be \vec{m}, the searched disparity be \vec{d_s}, and the depth-predicted disparity be \vec{d_z}.
The motion-compensated prediction cost of the current macroblock is obtained by the following formula:

J_{MCP}(\vec{m}, ref_m) = \sum_{X \in B_k} |I - I_p(\vec{m}, ref_m)| + \lambda_{motion}(r_m + r_h),

where \vec{m} is the motion vector, B_k is the current coding macroblock, ref_m is the reference frame pointed to by \vec{m}, X is each pixel in B_k, I is the luma or chroma component value of X, I_p(\vec{m}, ref_m) is the luma or chroma component value of the corresponding pixel in the reference frame pointed to by \vec{m}, \lambda_{motion} is the Lagrange multiplier of temporal prediction, r_m is the bit rate needed to code the motion vector, and r_h is the bit rate needed to code the macroblock header fields other than the motion vector.
The cost of the disparity-compensated prediction of the current macroblock with the searched disparity is obtained by the following formula:

J_{DCP}(\vec{d_s}, ref_d) = \sum_{X \in B_k} |I - I_p(\vec{d_s}, ref_d)| + \lambda_{motion}(r_d + r_h),

where \vec{d_s} is the disparity obtained by inter-view matching, B_k is the current coding macroblock, ref_d is the reference frame pointed to by \vec{d_s}, I_p(\vec{d_s}, ref_d) is the luma or chroma component value of the corresponding pixel in the reference frame pointed to by \vec{d_s}, X is each pixel in B_k, I is the luma or chroma component value of X, \lambda_{motion} is the Lagrange multiplier of conventional inter-view prediction, and r_d is the bit rate needed to code the searched disparity vector.
In stereoscopic video, the depth information can be regarded as side information of the video coding. It can therefore be assumed that the encoding end and the decoding end can obtain the same reconstructed depth map, so the depth-predicted disparity does not need to be written into the bitstream. The cost of the disparity-compensated prediction of the current macroblock using the depth-predicted disparity can therefore be expressed as

J_{DADCP}(\vec{d_z}, ref_z) = \sum_{X \in B_k} |I - I_p(\vec{d_z}, ref_z)| + \lambda_{motion} \cdot r_h',

where \vec{d_z} is the disparity computed from the depth, B_k is the current coding macroblock, ref_z is the reference frame pointed to by \vec{d_z}, I_p(\vec{d_z}, ref_z) is the luma or chroma component value of the corresponding pixel in the reference frame pointed to by \vec{d_z}, X is each pixel in B_k, I is the luma or chroma component value of X, \lambda_{motion} is the Lagrange multiplier of depth-aided inter-view prediction, and r_h' is the bit rate needed to code the macroblock header under the disparity-compensated prediction mode based on the depth-predicted disparity.
The selection module 600 is configured to select the prediction mode corresponding to the minimum rate-distortion cost as the prediction mode of the current coding macroblock and to encode accordingly.
Specifically, the encoder selects the rate-distortion-optimal prediction mode as the prediction mode of the current coding macroblock. The selection process can be expressed as

J = \min( J_{MCP}(\vec{m}, ref_m), J_{DCP}(\vec{d_s}, ref_d), J_{DADCP}(\vec{d_z}, ref_z) ),

where J_{MCP}(\vec{m}, ref_m), J_{DCP}(\vec{d_s}, ref_d) and J_{DADCP}(\vec{d_z}, ref_z) are the rate-distortion costs of temporal prediction, conventional inter-view prediction and depth-aided inter-view prediction, respectively.
In one embodiment of the invention, let the current coding macroblock B_k be an 8 x 8 macroblock in a frame of the view-8 video of the "Book Arrival" sequence. Its corresponding depth values are shown in the following 8 x 8 matrix.
z_{B_k} =
\begin{bmatrix}
62 & 63 & 62 & 61 & 63 & 58 & 62 & 63 \\
61 & 62 & 62 & 63 & 61 & 64 & 65 & 64 \\
62 & 61 & 57 & 61 & 63 & 59 & 63 & 63 \\
67 & 61 & 58 & 62 & 61 & 61 & 66 & 64 \\
63 & 62 & 62 & 58 & 60 & 62 & 63 & 62 \\
61 & 62 & 61 & 61 & 61 & 63 & 60 & 58 \\
62 & 62 & 61 & 62 & 62 & 62 & 60 & 62 \\
64 & 63 & 61 & 62 & 62 & 62 & 60 & 61
\end{bmatrix}

For the current coding macroblock B_k, its depth value is the maximum-likelihood value z_k of the depth values of all pixels contained in B_k:

z_k = \arg\max_{z_k^j} prob(\{ z_k^j \mid j = 1, 2, \ldots, n_{B_k} \}) = 62.

After the depth value of the current coding macroblock B_k is obtained, the predicted disparity of the current coding macroblock is

d_k = \frac{f c}{z_k} = \frac{100 \times 10}{62} = 16.13.
For a coding mode with quarter-pixel precision, d_k is rounded to the nearest quarter-pixel position, giving d_k' = 16.25. The encoder then performs inter-view prediction based on this predicted disparity: for the current coding macroblock the predicted disparity is 16.25, and the encoder finds the corresponding reference macroblock in the corresponding frame of view 10 and predicts from it. Suppose the sum of absolute prediction residuals is 50. In addition, the encoder also performs the other compensated predictions for the current macroblock, namely temporal prediction and conventional inter-view prediction. In the temporal prediction, suppose the motion vector of the current macroblock is 32 and the sum of absolute residuals of the temporal prediction is 80. In the conventional inter-view prediction, suppose the disparity obtained by the encoder through block-matching search is 16 and the sum of absolute residuals of the conventional inter-view prediction is 45.
In one embodiment of the invention, the computing module 500 calculates the rate-distortion costs under the different prediction coding modes. For macroblock B_k, the rate-distortion cost of its temporal prediction is
J_{MCP}(\vec{m}, ref_m) = \sum_{X \in B_k} |I - I_p(\vec{m}, ref_m)| + \lambda_{motion}(r_m + r_h) = 80 + 1.5 \times (10 + 20) = 125.
The rate-distortion cost of the conventional inter-view prediction of B_k is
J_{DCP}(\vec{d_s}, ref_d) = \sum_{X \in B_k} |I - I_p(\vec{d_s}, ref_d)| + \lambda_{motion} \times (r_d + r_h) = 45 + 1.5 \times (8 + 20) = 87.
When the depth-predicted disparity is used for prediction coding, the rate-distortion cost of the depth-aided inter-view prediction coding of B_k is
J_{DADCP}(\vec{d_z}, ref_z) = \sum_{X \in B_k} |I - I_p(\vec{d_z}, ref_z)| + \lambda_{motion} \times r_h' = 50 + 1.5 \times 21 = 81.5.
The selection module 600 compares the rate-distortion costs of the encoder under the different prediction modes and selects the optimal prediction coding mode. For the current macroblock B_k,
J = \min( J_{MCP}(\vec{m}, ref_m), J_{DCP}(\vec{d_s}, ref_d), J_{DADCP}(\vec{d_z}, ref_z) ) = \min(125, 87, 81.5) = 81.5.
Therefore, its optimal inter prediction coding mode is depth-aided inter-view prediction coding. After the optimal inter prediction coding mode has been obtained, the encoder performs a second rate-distortion optimized selection: it further compares the rate-distortion costs of the inter prediction mode and the intra prediction modes, and finally codes the current macroblock with the rate-distortion-optimal mode.
According to the system of the embodiment of the invention, the disparity of a coding macroblock is estimated from the depth to perform disparity-compensated prediction, which reduces the bit rate needed to code the disparity in stereoscopic video coding and improves the efficiency of stereoscopic video coding.
Although embodiments of the present invention have been illustrated and described above, it should be understood that the above embodiments are exemplary and are not to be construed as limiting the present invention; those of ordinary skill in the art can change, modify, replace and vary the above embodiments within the scope of the present invention without departing from the principle and spirit of the present invention.

Claims (10)

1. A joint prediction coding method for stereoscopic video, characterized by comprising the following steps:
S1: inputting a stereoscopic video and dividing the stereoscopic video into a plurality of coding macroblocks;
S2: predicting the depth-predicted disparity of the current coding macroblock by a depth prediction method, and performing depth-aided inter-view prediction coding on the current coding macroblock according to the depth-predicted disparity;
S3: obtaining a disparity vector by inter-view matching, and performing conventional inter-view prediction coding on the current macroblock according to the disparity vector;
S4: obtaining a motion vector by temporal motion estimation, and performing temporal prediction coding on the current coding macroblock according to the motion vector;
S5: calculating the rate-distortion costs of the current coding macroblock under the depth-aided inter-view prediction coding, conventional inter-view prediction coding and temporal prediction coding modes respectively; and
S6: selecting the prediction coding mode with the best rate-distortion performance as the prediction mode of the current coding macroblock and encoding accordingly.
2. The joint prediction coding method for stereoscopic video according to claim 1, characterized by further comprising:
S7: judging whether all the coding macroblocks have been coded;
S8: if not, repeating steps S1-S5 for the coding macroblocks that have not yet been coded until all coding macroblocks are coded.
3. The joint prediction coding method for stereoscopic video according to claim 1, characterized in that the rate-distortion cost of the temporal prediction coding is obtained by the following formula:
J_{MCP}(\vec{m}, ref_m) = \sum_{X \in B_k} |I - I_p(\vec{m}, ref_m)| + \lambda_{motion}(r_m + r_h),

wherein \vec{m} is the motion vector, B_k is the current coding macroblock, ref_m is the reference frame pointed to by \vec{m}, X is each pixel in B_k, I is the luma or chroma component value of X, I_p(\vec{m}, ref_m) is the luma or chroma component value of the corresponding pixel in the reference frame pointed to by \vec{m}, \lambda_{motion} is the Lagrange multiplier of motion-compensated prediction, r_m is the bit rate needed to code the motion vector, and r_h is the bit rate needed to code the macroblock header fields other than the motion vector.
4. The joint prediction coding method for stereoscopic video according to claim 1, characterized in that the rate-distortion cost of the conventional inter-view prediction coding is obtained by the following formula:
J_{DCP}(\vec{d_s}, ref_d) = \sum_{X \in B_k} |I - I_p(\vec{d_s}, ref_d)| + \lambda_{motion} \times (r_d + r_h),

wherein \vec{d_s} is the disparity obtained by inter-view matching, B_k is the current coding macroblock, ref_d is the reference frame pointed to by \vec{d_s}, I_p(\vec{d_s}, ref_d) is the luma or chroma component value of the corresponding pixel in the reference frame pointed to by \vec{d_s}, X is each pixel in B_k, I is the luma or chroma component value of X, \lambda_{motion} is the Lagrange multiplier of motion-compensated prediction, and r_d is the bit rate needed to code the searched disparity vector.
5. The joint prediction coding method for stereoscopic video according to claim 1, characterized in that the rate-distortion cost of the depth-aided inter-view prediction coding is obtained by the following formula:
J_{DADCP}(\vec{d_z}, ref_z) = \sum_{X \in B_k} |I - I_p(\vec{d_z}, ref_z)| + \lambda_{motion} \times r_h',

wherein \vec{d_z} is the disparity computed from the depth, B_k is the current coding macroblock, ref_z is the reference frame pointed to by \vec{d_z}, I_p(\vec{d_z}, ref_z) is the luma or chroma component value of the corresponding pixel in the reference frame pointed to by \vec{d_z}, X is each pixel in B_k, I is the luma or chroma component value of X, \lambda_{motion} is the Lagrange multiplier of motion-compensated prediction, and r_h' is the bit rate needed to code the macroblock header under the disparity-compensated prediction mode based on the depth-predicted disparity.
6. A joint prediction coding system for stereoscopic video, characterized by comprising:
a division module for inputting a stereoscopic video and dividing the stereoscopic video into a plurality of coding macroblocks;
a first prediction module for predicting the depth-predicted disparity of the current coding macroblock by a depth prediction method, and performing depth-aided inter-view prediction coding on the current coding macroblock according to the depth-predicted disparity;
a second prediction module for obtaining a disparity vector by inter-view matching, and performing conventional inter-view prediction coding on the current macroblock according to the disparity vector;
a third prediction module for obtaining a motion vector by temporal motion estimation, and performing temporal prediction coding on the current coding macroblock according to the motion vector;
a computing module for calculating the rate-distortion costs of the current coding macroblock under the depth-aided inter-view prediction coding, conventional inter-view prediction coding and temporal prediction coding modes respectively; and
a selection module for selecting the prediction coding mode with the best rate-distortion performance as the prediction mode of the current coding macroblock and encoding accordingly.
7. The joint prediction coding system for stereoscopic video according to claim 6, characterized by further comprising:
a judging module for judging whether all the coding macroblocks have been coded; and
a processing module for, when the coding is not complete, reusing the division module, the first prediction module, the second prediction module, the third prediction module, the computing module and the selection module until all coding macroblocks are coded.
8. The joint prediction coding system for stereoscopic video according to claim 6, characterized in that the rate-distortion cost of the temporal prediction coding is obtained by the following formula:
J_{MCP}(\vec{m}, ref_m) = \sum_{X \in B_k} |I - I_p(\vec{m}, ref_m)| + \lambda_{motion}(r_m + r_h),

wherein \vec{m} is the motion vector, B_k is the current coding macroblock, ref_m is the reference frame pointed to by \vec{m}, X is each pixel in B_k, I is the luma or chroma component value of X, I_p(\vec{m}, ref_m) is the luma or chroma component value of the corresponding pixel in the reference frame pointed to by \vec{m}, \lambda_{motion} is the Lagrange multiplier of motion-compensated prediction, r_m is the bit rate needed to code the motion vector, and r_h is the bit rate needed to code the macroblock header fields other than the motion vector.
9. The joint prediction coding system for stereoscopic video according to claim 6, characterized in that the rate-distortion cost of the conventional inter-view prediction coding is obtained by the following formula:
J_{DCP}(\vec{d_s}, ref_d) = \sum_{X \in B_k} |I - I_p(\vec{d_s}, ref_d)| + \lambda_{motion} \times (r_d + r_h),

wherein \vec{d_s} is the disparity obtained by stereo matching, ref_d is the reference frame pointed to by \vec{d_s}, I_p(\vec{d_s}, ref_d) is the luma or chroma component value of the corresponding pixel in the reference frame pointed to by \vec{d_s}, X is each pixel in B_k, B_k is the current coding macroblock, I is the luma or chroma component value of X, \lambda_{motion} is the Lagrange multiplier of motion-compensated prediction, and r_d is the bit rate needed to code the searched disparity vector.
10. The joint prediction coding system for stereoscopic video according to claim 6, characterized in that the rate-distortion cost of the depth-aided inter-view prediction coding is obtained by the following formula:
J_{DADCP}(\vec{d_z}, ref_z) = \sum_{X \in B_k} |I - I_p(\vec{d_z}, ref_z)| + \lambda_{motion} \times r_h',

wherein \vec{d_z} is the depth-predicted disparity, B_k is the current coding macroblock, ref_z is the reference frame pointed to by \vec{d_z}, I_p(\vec{d_z}, ref_z) is the luma or chroma component value of the corresponding pixel in the reference frame pointed to by \vec{d_z}, X is each pixel in B_k, I is the luma or chroma component value of X, \lambda_{motion} is the Lagrange multiplier of motion-compensated prediction, and r_h' is the bit rate needed to code the macroblock header under the disparity-compensated prediction mode based on the depth-predicted disparity.
CN201310158699.XA 2013-05-02 2013-05-02 Joint prediction coding method and system for stereoscopic video Active CN103220532B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310158699.XA CN103220532B (en) 2013-05-02 2013-05-02 Joint prediction coding method and system for stereoscopic video

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310158699.XA CN103220532B (en) 2013-05-02 2013-05-02 Joint prediction coding method and system for stereoscopic video

Publications (2)

Publication Number Publication Date
CN103220532A true CN103220532A (en) 2013-07-24
CN103220532B CN103220532B (en) 2016-08-10

Family

ID=48817935

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310158699.XA Active CN103220532B (en) 2013-05-02 2013-05-02 Joint prediction coding method and system for stereoscopic video

Country Status (1)

Country Link
CN (1) CN103220532B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101222639A (en) * 2007-01-09 2008-07-16 华为技术有限公司 Inter-view prediction method, encoder and decoder of multi-viewpoint video technology
CN101170702A (en) * 2007-11-23 2008-04-30 四川虹微技术有限公司 Multi-view video coding method
CN101754042A (en) * 2008-10-30 2010-06-23 华为终端有限公司 Image reconstruction method and image reconstruction system
CN102238391A (en) * 2011-05-25 2011-11-09 深圳市融创天下科技股份有限公司 Predictive coding method and device

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103763557A (en) * 2014-01-03 2014-04-30 华为技术有限公司 Do-NBDV acquiring method and video decoding device
CN104125469A (en) * 2014-07-10 2014-10-29 中山大学 Fast coding method for high efficiency video coding (HEVC)
CN104125469B (en) * 2014-07-10 2017-06-06 中山大学 A kind of fast encoding method for HEVC
CN106303547A (en) * 2015-06-08 2017-01-04 中国科学院深圳先进技术研究院 3 d video encoding method and apparatus
CN106303547B (en) * 2015-06-08 2019-01-01 中国科学院深圳先进技术研究院 3 d video encoding method and apparatus
WO2019114024A1 (en) * 2017-12-13 2019-06-20 北京大学 Lagrange multiplication model-based coding optimization method and device in point cloud frame

Also Published As

Publication number Publication date
CN103220532B (en) 2016-08-10

Similar Documents

Publication Publication Date Title
CN104247432B (en) The efficient multi-vision-point encoding estimated using depth map and updated
CN102685532B (en) Coding method for free view point four-dimensional space video coding system
CN101072356B (en) Motion vector predicating method
CN104995916B (en) Video data decoding method and video data decoding device
CN102006480B (en) Method for coding and decoding binocular stereoscopic video based on inter-view prediction
CN102210152A (en) A method and an apparatus for processing a video signal
CN106105191A (en) For the method and apparatus processing multiview video signal
CN102801995B (en) A kind of multi-view video motion based on template matching and disparity vector prediction method
KR20090046826A (en) A method and apparatus for processing a signal
CN106028037A (en) Equipment for decoding images
CN104412597A (en) Method and apparatus of unified disparity vector derivation for 3d video coding
CN104412587A (en) Method and apparatus of inter-view candidate derivation in 3d video coding
CN104010196B (en) 3D quality scalable video coding method based on HEVC
KR20080114482A (en) Method and apparatus for illumination compensation of multi-view video coding
CN103098475B (en) Method for encoding images and device, picture decoding method and device
CN103051894B (en) A kind of based on fractal and H.264 binocular tri-dimensional video compression & decompression method
KR20120066579A (en) Apparatus and method for encoding and decoding multi-view video
CN102752588A (en) Video encoding and decoding method using space zoom prediction
CN103220532A (en) Joint prediction encoding method and joint predication encoding system for stereoscopic video
CN104429079A (en) Method and system for processing multiview videos for view synthesis using motion vector predictor list
CN102740081B (en) Method for controlling transmission errors of multiview video based on distributed coding technology
CN101959067B (en) Decision method and system in rapid coding mode based on epipolar constraint
CN105637875A (en) Method and apparatus for decoding multi-view video
CN102917233A (en) Stereoscopic video coding optimization method in space teleoperation environment
CN102316323A (en) Rapid binocular stereo-video fractal compressing and uncompressing method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant