CN103220532B - Joint predictive coding method and system for stereoscopic video - Google Patents
Joint predictive coding method and system for stereoscopic video
- Publication number
- CN103220532B CN103220532B CN201310158699.XA CN201310158699A CN103220532B CN 103220532 B CN103220532 B CN 103220532B CN 201310158699 A CN201310158699 A CN 201310158699A CN 103220532 B CN103220532 B CN 103220532B
- Authority
- CN
- China
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Abstract
The present invention proposes a joint predictive coding method and system for stereoscopic video. The method includes: S1: inputting a stereoscopic video and dividing it into a plurality of coded macroblocks; S2: predicting the depth-predicted disparity of the current coded macroblock by a depth prediction method, and performing depth-assisted inter-view predictive coding on the current coded macroblock; S3: performing conventional inter-view predictive coding on the current macroblock; S4: performing temporal predictive coding on the current coded macroblock; S5: respectively calculating the rate-distortion performance of the current coded macroblock under the depth-assisted inter-view, conventional inter-view, and temporal predictive coding modes; S6: selecting the predictive coding mode with the optimal rate-distortion performance as the prediction mode of the current coded macroblock and coding. According to the method of the embodiments of the present invention, estimating the disparity of a coded macroblock from depth for inter-view compensated prediction reduces the bit rate required for disparity coding in stereoscopic video coding and improves the efficiency of stereoscopic video coding.
Description
Technical Field
The present invention relates to the field of video coding, and in particular, to a joint predictive coding method and system for a stereoscopic video.
Background
With the continuous development of video technology, stereoscopic video has gained wide attention for its vivid visual effect. In stereoscopic video, the video data is composed of a video sequence and a depth map sequence, where the video sequence typically comprises two or more views, and the depth map sequence includes a depth map corresponding to each video view. Therefore, in stereoscopic video applications, how to effectively compress and transmit the massive video and depth-map data becomes one of the key technical bottlenecks.
In order to achieve efficient compression of stereoscopic video data, researchers have proposed multi-view video coding schemes. In the scheme, one path of video in the multi-view video is used as a basic view, and the redundancy in a time domain is compressed by adopting a traditional video coding scheme. For videos of other viewpoints, the coding scheme introduces an inter-view prediction mode, and compresses time domains and inter-view redundancies of the multi-view video through time domain prediction and inter-view prediction, so that the code rate required by coding the multi-view video is effectively reduced. Since the depth map can be viewed as a multi-view grayscale video sequence, the multi-view video coding scheme is also used to code the depth map. In the current mainstream stereo video coding scheme, an encoder respectively compresses a multi-view video and a depth map by adopting the multi-view video coding scheme to obtain two paths of code streams of the video and the depth map, and simultaneously transmits the two paths of code streams to a decoding end to reconstruct a multi-view video and depth map sequence. The decoding end further draws the virtual viewpoint according to the user requirement, thereby forming a stereoscopic video sequence required by the user and playing the stereoscopic video sequence on a corresponding stereoscopic video display.
Although multi-view video coding can efficiently compress the temporal and inter-view redundancies of multi-view video and depth maps, the redundancy between the multi-view video and the depth maps cannot be efficiently compressed. In stereoscopic video, a depth map characterizes the depth information of corresponding points in the video sequence, so given the shooting conditions, the disparity of each coded macroblock can be predicted from its depth values. The depth map can thus be regarded as side information for multi-view video coding: the disparity can be calculated from the depth instead of being obtained by disparity search, which reduces the rate required to code the disparity and compresses the redundancy between the multi-view video and the depth map.
At present, there are two stereo video coding modes based on multi-view video and depth map joint coding. One is that the encoder renders a virtual reference frame according to a depth map corresponding to a current video frame to be encoded and a reference video frame thereof, thereby reducing redundant information existing in the depth map and disparity encoding. The other method is a prediction method for obtaining the correlation between the time domain motion information and the parallax information through the geometric constraint relation between the time domain motion information and the inter-view parallax information.
The disadvantages of the prior art include:
(1) additional codec frame buffering is required, which increases the spatial complexity of the encoder and decoder;
(2) the computational complexity is high, which increases the time complexity of the encoder and decoder.
Disclosure of Invention
The object of the present invention is to solve at least one of the technical drawbacks mentioned above.
To this end, an object of the present invention is to provide a method for joint predictive coding of stereoscopic video.
Another objective of the present invention is to provide a joint predictive coding system for stereoscopic video.
To achieve the above object, an embodiment of an aspect of the present invention provides a joint prediction encoding method for stereoscopic video, including the following steps: s1: inputting a stereoscopic video and dividing the stereoscopic video into a plurality of encoded macro blocks; s2: predicting the depth prediction parallax of the current coding macro block by a depth prediction method, and carrying out depth-assisted inter-view prediction coding on the current coding macro block according to the depth prediction parallax; s3: obtaining a disparity vector by an inter-view matching method, and performing traditional inter-view predictive coding on the current macroblock according to the disparity vector; s4: obtaining a motion vector by a time domain motion estimation method, and performing time domain predictive coding on the current coding macro block according to the motion vector; s5: respectively calculating the rate-distortion performance of the current coding macro block under the depth-assisted inter-view prediction coding mode, the traditional inter-view prediction coding mode and the time domain prediction coding mode; and S6: and selecting the predictive coding mode with the optimal rate distortion performance as the predictive mode of the current coding macro block and coding.
According to the method provided by the embodiment of the invention, the parallax of the coded macro block is estimated through the depth to perform inter-view compensation prediction, so that the code rate required by parallax coding in stereoscopic video coding is reduced, and the efficiency of stereoscopic video coding is improved.
In one embodiment of the invention, the method further comprises: S7: judging whether all the coded macroblocks have been coded; S8: if not, repeating steps S1-S5 for the macroblocks not yet coded until all macroblocks have been coded.
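The per-macroblock flow above (predict under each mode, compare rate-distortion costs, repeat until every macroblock is coded) can be sketched as follows. This is a minimal illustrative sketch, not the patent's implementation; all names are ours, and the three predictors are passed in as plain cost functions.

```python
def encode_stereo_frame(macroblocks, predictors):
    """Choose the best-RD prediction mode for each macroblock (S5-S8).

    macroblocks -- iterable of macroblocks from the division step (S1)
    predictors  -- dict: mode name -> function(mb) returning an RD cost (S2-S4)
    """
    chosen = []
    for mb in macroblocks:  # repeat until all macroblocks are coded (S7-S8)
        costs = {name: cost(mb) for name, cost in predictors.items()}  # S5
        chosen.append(min(costs, key=costs.get))  # S6: minimum-RD-cost mode
    return chosen

# Toy illustration with constant-cost stand-ins for the three predictors:
modes = encode_stereo_frame(
    [0, 1],
    {"temporal": lambda mb: 125.0,
     "conventional inter-view": lambda mb: 87.0,
     "depth-assisted inter-view": lambda mb: 81.5})
print(modes)  # ['depth-assisted inter-view', 'depth-assisted inter-view']
```

In a real encoder the cost functions would run motion estimation, inter-view matching, and depth-based disparity prediction respectively; here they only return fixed costs to show the selection logic.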
In one embodiment of the present invention, the rate-distortion performance of the temporal predictive coding is obtained by the formula
$$J_{motion}(\vec{mv}) = \sum_{X \in B_k} \left| I(X) - I_{ref_m}(X + \vec{mv}) \right| + \lambda_{motion}\,(r_m + r_h),$$
wherein $\vec{mv}$ is the motion vector, $B_k$ is the current coded macroblock, $ref_m$ is the reference frame pointed to by $\vec{mv}$, $X$ is a pixel of $B_k$, $I(X)$ is the luminance or chrominance component value corresponding to $X$, $I_{ref_m}(X + \vec{mv})$ is the luminance or chrominance component value of the corresponding pixel in the reference frame, $\lambda_{motion}$ is the Lagrange multiplier for temporal prediction, $r_m$ is the rate required to code the motion vector, and $r_h$ is the rate required to code the remaining macroblock header information.
In one embodiment of the present invention, the rate-distortion performance of the conventional inter-view predictive coding is obtained by the formula
$$J_{disp}(\vec{dv}) = \sum_{X \in B_k} \left| I(X) - I_{ref_d}(X + \vec{dv}) \right| + \lambda_{motion}\,(r_d + r_h),$$
wherein $\vec{dv}$ is the disparity obtained by inter-view matching, $B_k$ is the current coded macroblock, $ref_d$ is the reference frame pointed to by $\vec{dv}$, $I_{ref_d}(X + \vec{dv})$ is the luminance or chrominance component value of the corresponding pixel in the reference frame, $X$ is a pixel of $B_k$, $I(X)$ is the luminance or chrominance component value corresponding to $X$, $\lambda_{motion}$ is the Lagrange multiplier for conventional inter-view prediction, and $r_d$ is the rate required to code the searched disparity vector.
In one embodiment of the present invention, the rate-distortion performance of the depth-assisted inter-view predictive coding is obtained by the formula
$$J_{depth}(\vec{dv}_z) = \sum_{X \in B_k} \left| I(X) - I_{ref_z}(X + \vec{dv}_z) \right| + \lambda_{motion}\,r_h',$$
wherein $\vec{dv}_z$ is the disparity calculated from depth, $B_k$ is the current coded macroblock, $ref_z$ is the reference frame pointed to by $\vec{dv}_z$, $I_{ref_z}(X + \vec{dv}_z)$ is the luminance or chrominance component value of the corresponding pixel in the reference frame, $X$ is a pixel of $B_k$, $I(X)$ is the luminance or chrominance component value corresponding to $X$, $\lambda_{motion}$ is the Lagrange multiplier for depth-assisted inter-view prediction, and $r_h'$ is the rate required to code the macroblock header information in the disparity-compensated prediction mode based on depth-predicted disparity.
In order to achieve the above object, another aspect of the present invention provides a system for joint predictive coding of stereoscopic video, including: a dividing module for inputting a stereoscopic video and dividing the stereoscopic video into a plurality of encoded macro blocks; the first prediction module is used for predicting the depth prediction parallax of the current coding macro block by a depth prediction method and carrying out depth-assisted inter-view prediction coding on the current coding macro block according to the depth prediction parallax; a second prediction module for performing conventional inter-view predictive coding on the current macroblock; a third prediction module, configured to perform time-domain prediction coding on the current coding macroblock; a calculation module, configured to calculate rate-distortion performance of the current coded macroblock in the depth-assisted inter-view prediction coding mode, the conventional inter-view prediction coding mode, and the temporal prediction coding mode, respectively; and the selection module is used for selecting the predictive coding mode with the optimal rate distortion performance as the predictive mode of the current coding macro block and coding the predictive coding mode.
According to the system provided by the embodiment of the invention, the parallax of the coded macro block is estimated through the depth to perform inter-view compensation prediction, so that the code rate required by parallax coding in stereoscopic video coding is reduced, and the efficiency of stereoscopic video coding is improved.
In one embodiment of the invention, the system further comprises: the judging module is used for judging whether all the coding macro blocks are coded; and the processing module is used for repeatedly using the dividing module, the first prediction module, the second prediction module, the third prediction module, the calculation module and the selection module until all the coding macro blocks are coded when the coding is not finished.
In one embodiment of the present invention, the rate-distortion performance of the temporal predictive coding is obtained by the formula
$$J_{motion}(\vec{mv}) = \sum_{X \in B_k} \left| I(X) - I_{ref_m}(X + \vec{mv}) \right| + \lambda_{motion}\,(r_m + r_h),$$
wherein $\vec{mv}$ is the motion vector, $B_k$ is the current coded macroblock, $ref_m$ is the reference frame pointed to by $\vec{mv}$, $X$ is a pixel of $B_k$, $I(X)$ is the luminance or chrominance component value corresponding to $X$, $I_{ref_m}(X + \vec{mv})$ is the luminance or chrominance component value of the corresponding pixel in the reference frame, $\lambda_{motion}$ is the Lagrange multiplier for temporal prediction, $r_m$ is the rate required to code the motion vector, and $r_h$ is the rate required to code the remaining macroblock header information.
In one embodiment of the present invention, the rate-distortion performance of the conventional inter-view predictive coding is obtained by the formula
$$J_{disp}(\vec{dv}) = \sum_{X \in B_k} \left| I(X) - I_{ref_d}(X + \vec{dv}) \right| + \lambda_{motion}\,(r_d + r_h),$$
wherein $\vec{dv}$ is the disparity obtained by inter-view matching, $B_k$ is the current coded macroblock, $ref_d$ is the reference frame pointed to by $\vec{dv}$, $I_{ref_d}(X + \vec{dv})$ is the luminance or chrominance component value of the corresponding pixel in the reference frame, $X$ is a pixel of $B_k$, $I(X)$ is the luminance or chrominance component value corresponding to $X$, $\lambda_{motion}$ is the Lagrange multiplier for conventional inter-view prediction, and $r_d$ is the rate required to code the searched disparity vector.
In one embodiment of the present invention, the rate-distortion performance of the depth-assisted inter-view predictive coding is obtained by the formula
$$J_{depth}(\vec{dv}_z) = \sum_{X \in B_k} \left| I(X) - I_{ref_z}(X + \vec{dv}_z) \right| + \lambda_{motion}\,r_h',$$
wherein $\vec{dv}_z$ is the disparity calculated from depth, $B_k$ is the current coded macroblock, $ref_z$ is the reference frame pointed to by $\vec{dv}_z$, $I_{ref_z}(X + \vec{dv}_z)$ is the luminance or chrominance component value of the corresponding pixel in the reference frame, $X$ is a pixel of $B_k$, $I(X)$ is the luminance or chrominance component value corresponding to $X$, $\lambda_{motion}$ is the Lagrange multiplier for depth-assisted inter-view prediction, and $r_h'$ is the rate required to code the macroblock header information in the disparity-compensated prediction mode based on depth-predicted disparity.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The foregoing and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
fig. 1 is a flowchart of a method for joint predictive coding of stereoscopic video according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of virtual viewpoint rendering according to one embodiment of the present invention;
FIG. 3 is a diagram of a coding prediction structure according to an embodiment of the present invention; and
fig. 4 is a block diagram illustrating a structure of a system for joint predictive coding of stereoscopic video according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are illustrative only and should not be construed as limiting the invention.
Fig. 1 is a flowchart illustrating a method for joint predictive coding of stereoscopic video according to an embodiment of the present invention. As shown in fig. 1, the method for joint predictive coding of stereoscopic video according to an embodiment of the present invention includes the following steps:
step S101 inputs a stereoscopic video and divides the stereoscopic video into a plurality of encoded macroblocks.
Specifically, a stereoscopic video is input and subjected to preprocessing such as correction and alignment, and the processed stereoscopic video is divided into a plurality of coding macro blocks.
Step S102, predicting the depth prediction parallax of the current coding macro block by a depth prediction method, and carrying out depth-assisted inter-view prediction coding on the current coding macro block according to the depth prediction parallax.
Specifically, it is assumed that the stereoscopic video sequence contains only the videos of the left and right viewpoints and their depth map sequences, that the baseline distance between the left and right viewpoints is $c$, and that the cameras of both viewpoints have focal length $f$. Let the current coded macroblock be $B_k$, containing $n_j$ pixels whose corresponding depth values are $z_k^{(j)}$, $j = 1, \dots, n_j$. The depth-predicted disparity of the current coded macroblock $B_k$ is predicted from these per-pixel depth values: the depth value of $B_k$ is taken as the maximum-likelihood value $z_k$ of the depth values of all the pixels it contains, which can be expressed as
$$z_k = \mathop{\arg\max}_{z}\ \#\left\{\, j : z_k^{(j)} = z \,\right\},$$
i.e., the most frequently occurring of the per-pixel depth values $z_k^{(j)}$.
Fig. 2 is a schematic diagram of virtual viewpoint rendering according to an embodiment of the present invention. As shown in fig. 2, once the depth value corresponding to $B_k$ is obtained, the disparity of the current coded macroblock can be calculated through the mapping between depth and disparity. The predicted disparity of the current coded macroblock can be expressed as
$$d_k = \frac{f \cdot c}{z_k},$$
where $d_k$ is the calculated disparity, $f$ is the focal length, and $c$ is the baseline distance between the left and right viewpoints. For coding modes of quarter-pixel precision, $d_k$ is rounded to the nearest quarter-pixel position and used as the depth-predicted disparity of the current coded macroblock.
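The depth-prediction step above can be sketched as follows, assuming the maximum-likelihood depth value is taken as the most frequent per-pixel depth in the block; function and parameter names are illustrative, not from the patent.

```python
from collections import Counter

def depth_predicted_disparity(depth_values, f, c):
    """Predict a block's disparity from its per-pixel depth values.

    depth_values -- per-pixel depth samples z_k^(j) of block B_k
    f            -- camera focal length (identical for both views)
    c            -- baseline distance between the left and right viewpoints
    """
    # Maximum-likelihood depth z_k: the most frequent depth value in the block.
    z_k = Counter(depth_values).most_common(1)[0][0]
    # Depth-to-disparity mapping: d_k = f * c / z_k.
    d_k = f * c / z_k
    # Round to the nearest quarter-pixel for quarter-pel coding modes.
    return round(d_k * 4) / 4

# With the document's camera parameters (f = 100, c = 10), a block whose
# dominant depth value is 80 maps to a disparity of 1000 / 80 = 12.5.
print(depth_predicted_disparity([80, 80, 80, 79, 81], 100, 10))  # 12.5
```

The depth value 80 here is a made-up illustration; in the patent's example the same mapping (with the macroblock's actual depth matrix) yields the predicted disparity 16.25.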
Step S103, obtaining a disparity vector by an inter-view matching method, and performing conventional inter-view predictive coding on the current macroblock according to the disparity vector.
And step S104, obtaining a motion vector by a time domain motion estimation method, and performing time domain predictive coding on the current coding macro block according to the motion vector.
Step S105, respectively calculating the rate-distortion performance of the current coded macroblock under the depth-assisted inter-view prediction, conventional inter-view prediction, and temporal prediction coding modes.
Specifically, the encoder calculates the rate-distortion performance under the different prediction modes. Let the current coded macroblock $B_k$ have motion vector $\vec{mv}$, searched disparity $\vec{dv}$, and depth-predicted disparity $\vec{dv}_z$.
The rate-distortion performance of the temporal predictive coding of the current macroblock is obtained by the formula
$$J_{motion}(\vec{mv}) = \sum_{X \in B_k} \left| I(X) - I_{ref_m}(X + \vec{mv}) \right| + \lambda_{motion}\,(r_m + r_h),$$
wherein $\vec{mv}$ is the motion vector, $B_k$ is the current coded macroblock, $ref_m$ is the reference frame pointed to by $\vec{mv}$, $X$ is a pixel of $B_k$, $I(X)$ is the luminance or chrominance component value corresponding to $X$, $I_{ref_m}(X + \vec{mv})$ is the luminance or chrominance component value of the corresponding pixel in the reference frame, $\lambda_{motion}$ is the Lagrange multiplier for temporal prediction, $r_m$ is the rate required to code the motion vector, and $r_h$ is the rate required to code the remaining macroblock header information.
The rate-distortion performance of the conventional inter-view predictive coding of the current macroblock with the searched disparity is obtained by the formula
$$J_{disp}(\vec{dv}) = \sum_{X \in B_k} \left| I(X) - I_{ref_d}(X + \vec{dv}) \right| + \lambda_{motion}\,(r_d + r_h),$$
wherein $\vec{dv}$ is the disparity obtained by inter-view matching, $B_k$ is the current coded macroblock, $ref_d$ is the reference frame pointed to by $\vec{dv}$, $I_{ref_d}(X + \vec{dv})$ is the luminance or chrominance component value of the corresponding pixel in the reference frame, $X$ is a pixel of $B_k$, $I(X)$ is the luminance or chrominance component value corresponding to $X$, $\lambda_{motion}$ is the Lagrange multiplier for conventional inter-view prediction, and $r_d$ is the rate required to code the searched disparity vector.
In stereoscopic video, depth information may be considered side information for video coding. We can therefore assume that the encoder and the decoder obtain the same reconstructed depth map, so the depth-predicted disparity does not need to be coded into the bitstream. The rate-distortion performance of depth-assisted inter-view predictive coding of the current macroblock with the depth-predicted disparity can thus be expressed as
$$J_{depth}(\vec{dv}_z) = \sum_{X \in B_k} \left| I(X) - I_{ref_z}(X + \vec{dv}_z) \right| + \lambda_{motion}\,r_h',$$
wherein $\vec{dv}_z$ is the disparity calculated from depth, $B_k$ is the current coded macroblock, $ref_z$ is the reference frame pointed to by $\vec{dv}_z$, $I_{ref_z}(X + \vec{dv}_z)$ is the luminance or chrominance component value of the corresponding pixel in the reference frame, $X$ is a pixel of $B_k$, $I(X)$ is the luminance or chrominance component value corresponding to $X$, $\lambda_{motion}$ is the Lagrange multiplier for depth-assisted inter-view prediction, and $r_h'$ is the rate required to code the macroblock header information in the disparity-compensated prediction mode based on depth-predicted disparity.
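The three cost expressions can be written out as simple helper functions (an illustrative sketch with our own names); note that the depth-assisted mode charges only the header rate r_h', since the decoder re-derives the disparity from the shared reconstructed depth map.

```python
def j_motion(sad, lam, r_m, r_h):
    # Temporal prediction: residual distortion plus motion-vector and header bits.
    return sad + lam * (r_m + r_h)

def j_disp(sad, lam, r_d, r_h):
    # Conventional inter-view prediction: the searched disparity vector must be coded.
    return sad + lam * (r_d + r_h)

def j_depth(sad, lam, r_h_z):
    # Depth-assisted inter-view prediction: the disparity is derivable at the
    # decoder from the reconstructed depth map, so only header bits are charged.
    return sad + lam * r_h_z

# With the worked example's numbers (lambda = 1.5, r_m = 10, r_d = 8,
# r_h = 20, r_h' = 21, and residual sums 80, 45, 50):
print(j_motion(80, 1.5, 10, 20),   # 125.0
      j_disp(45, 1.5, 8, 20),      # 87.0
      j_depth(50, 1.5, 21))        # 81.5
```

Here `sad` stands for the sum of absolute differences (the distortion term of the formulas above); in a real encoder it would be computed from the prediction residual of the macroblock.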
Step S106, selecting the prediction mode with the minimum rate-distortion cost as the prediction mode of the current coded macroblock, and coding.
Specifically, the encoder selects the prediction mode with the best rate-distortion performance as the prediction mode of the current coded macroblock. The selection process can be expressed as
$$mode^{*} = \arg\min\left\{ J_{motion}(\vec{mv}),\ J_{disp}(\vec{dv}),\ J_{depth}(\vec{dv}_z) \right\},$$
wherein $J_{motion}(\vec{mv})$, $J_{disp}(\vec{dv})$ and $J_{depth}(\vec{dv}_z)$ respectively denote the rate-distortion performance of temporal prediction, of conventional inter-view prediction, and of depth-assisted inter-view prediction.
In one embodiment of the invention, the video sequence encoded as stereoscopic video is the standard test sequence named "Book Arrival", with a resolution of 1024 × 768 pixels. The encoder uses JMVC (Joint Multiview Video Coding), the reference software of the multi-view extension (MVC) of the H.264/AVC standard. The GOP (Group of Pictures) length of the encoder is 8 frames, and the temporal predictive coding adopts the hierarchical-B prediction structure (hierarchical bi-directionally predicted frames). Fig. 3 is a schematic diagram of the coding prediction structure according to an embodiment of the present invention. As shown in fig. 3, virtual viewpoint rendering uses the two color videos and depth maps adjacent to the virtual viewpoint. In this embodiment, two views of the "Book Arrival" sequence, viewpoint 10 and viewpoint 8, serve as the multi-view video input, where viewpoint 10 is called the left reference viewpoint and viewpoint 8 the right reference viewpoint. The quantization parameter QP for coding the multi-view video and the multi-view depth maps is an integer in the range 0 to 51. The baseline distance between the left and right viewpoints is 10 and the focal length of the cameras is 100.
Let the current coded macroblock $B_k$ be an 8 × 8 macroblock in a frame of the viewpoint-8 video of the "Book Arrival" sequence; its corresponding depth values are shown in the 8 × 8 matrix below.
The encoder then compares the rate-distortion performance of the different inter prediction modes for the macroblock. Let the number of bits required to code the motion vector of the current macroblock $B_k$ be $r_m = 10$, the number of bits required to code the disparity obtained by block-matching search be $r_d = 8$, and the number of bits required to code the header information be $r_h = 20$. In the inter-view prediction based on depth-predicted disparity, coding the header information of $B_k$ requires $r_h' = 21$ bits: one additional bit identifies that the current macroblock uses inter-view prediction with the depth-predicted disparity. In the rate-distortion optimization, the Lagrange multiplier $\lambda_{motion}$ is set to 1.5.
Thus, for macroblock $B_k$, the rate-distortion performance of temporal prediction is
$$J_{motion} = 80 + 1.5 \times (10 + 20) = 125,$$
the rate-distortion performance of conventional inter-view prediction of $B_k$ is
$$J_{disp} = 45 + 1.5 \times (8 + 20) = 87,$$
and, when predictively coded with the depth-predicted disparity, the rate-distortion performance of the depth-assisted inter-view predictive coding of $B_k$ is
$$J_{depth} = 50 + 1.5 \times 21 = 81.5.$$
then, the encoder selects the optimal inter-prediction encoding mode by comparing the rate-distortion performance in different prediction modes. For the current macroblock Bk, Therefore, the optimal inter-prediction coding mode is depth-assisted inter-prediction coding. After the optimal inter prediction mode is obtained, the encoder will perform a second time rate distortion optimization selection. The encoder further compares the rate distortion performance of the inter-frame prediction mode and the intra-frame prediction mode, and finally selects the mode with the optimal rate distortion to encode the current macro block.
According to the method provided by the embodiment of the invention, the parallax of the coded macro block is estimated through the depth to perform inter-view compensation prediction, so that the code rate required by parallax coding in stereoscopic video coding is reduced, and the efficiency of stereoscopic video coding is improved.
Fig. 4 is a block diagram illustrating a structure of a system for joint predictive coding of stereoscopic video according to an embodiment of the present invention. As shown in fig. 4, the joint predictive coding system for stereoscopic video includes a partitioning module 100, a first prediction module 200, a second prediction module 300, a third prediction module 400, a calculation module 500, and a selection module 600.
The partitioning module 100 is used to input a stereoscopic video and partition the stereoscopic video into a plurality of encoded macroblocks.
Specifically, a stereoscopic video is input and subjected to preprocessing such as correction and alignment, and the processed stereoscopic video is divided into a plurality of coding macro blocks.
The first prediction module 200 is configured to predict a depth prediction disparity of a current coded macroblock through a depth prediction method, and perform depth-assisted inter-view prediction coding on the current coded macroblock according to the depth prediction disparity.
Specifically, it is assumed that the stereoscopic video sequence contains only the videos of the left and right viewpoints and their depth map sequences, that the baseline distance between the left and right viewpoints is $c$, and that the cameras of both viewpoints have focal length $f$. Let the current coded macroblock be $B_k$, containing $n_j$ pixels whose corresponding depth values are $z_k^{(j)}$, $j = 1, \dots, n_j$. The depth-predicted disparity of the current coded macroblock $B_k$ is predicted from these per-pixel depth values: the depth value of $B_k$ is taken as the maximum-likelihood value $z_k$ of the depth values of all the pixels it contains, which can be expressed as
$$z_k = \mathop{\arg\max}_{z}\ \#\left\{\, j : z_k^{(j)} = z \,\right\},$$
i.e., the most frequently occurring of the per-pixel depth values $z_k^{(j)}$.
Fig. 2 is a schematic diagram of virtual viewpoint rendering according to an embodiment of the present invention. As shown in fig. 2, once the depth value corresponding to $B_k$ is obtained, the disparity of the current coded macroblock can be calculated through the mapping between depth and disparity. The predicted disparity of the current coded macroblock can be expressed as
$$d_k = \frac{f \cdot c}{z_k},$$
where $d_k$ is the calculated disparity, $f$ is the focal length, and $c$ is the baseline distance between the left and right viewpoints. For coding modes of quarter-pixel precision, $d_k$ is rounded to the nearest quarter-pixel position and used as the depth-predicted disparity of the current coded macroblock.
The second prediction module 300 is configured to obtain a disparity vector through an inter-view matching method, and perform conventional inter-view prediction coding on a current macroblock according to the disparity vector.
The third prediction module 400 is configured to obtain a motion vector through a temporal motion estimation method, and perform temporal prediction coding on the current coded macroblock according to the motion vector.
The calculation module 500 is configured to calculate the rate-distortion performance of the current coded macroblock under the depth-assisted inter-view prediction, conventional inter-view prediction, and temporal prediction coding modes.
Specifically, the encoder calculates the rate-distortion performance under the different prediction modes. Let the current coded macroblock $B_k$ have motion vector $\vec{mv}$, searched disparity $\vec{dv}$, and depth-predicted disparity $\vec{dv}_z$.
The rate-distortion performance of motion compensated prediction of the current macroblock can be expressed as, the rate-distortion performance of motion compensated prediction is obtained by the following formula, wherein,is a motion vector, BkFor the current coded macroblock, refmIs composed ofPointed to as reference frame, X is BkI is the luminance or chrominance component value corresponding to X,is composed ofThe value of the luminance or chrominance component, λ, of the corresponding pixel in the pointed reference framemotionLagrange multiplier, r, for time domain predictionmThe coding rate, r, required for coding motion vectorshThe code rate required for encoding other macroblock header information besides the motion vector.
The rate-distortion performance of the search disparity compensated prediction of the current macroblock is obtained by the following formula, wherein,to match the resulting parallaxes between views, BkFor the current coded macroblock, refdIs composed ofThe reference frame that is pointed to is,is composed ofThe brightness or chroma component value of the corresponding pixel point in the pointed reference frame, X is BkI is the luminance or chrominance component value corresponding to X, λmotionLagrange multiplier, r, for conventional inter-view predictiondThe coding rate required to search for the disparity vector for coding.
In stereoscopic video, depth information may be regarded as side information for video coding, so the encoder and decoder can be assumed to obtain the same reconstructed depth map. The depth-predicted disparity therefore need not be coded into the bitstream, and the rate-distortion cost of disparity-compensated prediction of the current macroblock using the depth-predicted disparity can be expressed as:

J_z = Σ_{X∈B_k} |I(X) − I_ref_z(X + dz(X))| + λ_motion·r_h′

wherein dz(X) is the depth-calculated disparity at pixel X, B_k is the current coded macroblock, ref_z is the reference frame pointed to by dz(X), I_ref_z(X + dz(X)) is the luminance or chrominance component value of the corresponding pixel in the pointed-to reference frame, X is a pixel in B_k, I(X) is the luminance or chrominance component value at X, λ_motion is the Lagrange multiplier for depth-assisted inter-view prediction, and r_h′ is the code rate required to code the macroblock header information in the disparity-compensated prediction mode based on the depth-predicted disparity.
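The three cost formulas above can be sketched as follows. This is an illustrative sketch, not part of the patent: the function and parameter names (sad, lambda_motion, r_m, r_d, r_h, r_h_prime) are assumptions chosen to mirror the symbols in the text, and the SAD term stands for the summed absolute residuals Σ|I(X) − I_ref(·)|.

```python
# Illustrative sketch of the three rate-distortion costs described above.
# "sad" is the sum of absolute prediction residuals over the macroblock.

def rd_cost_temporal(sad, lambda_motion, r_m, r_h):
    """J_m = SAD + lambda_motion * (r_m + r_h): motion-vector bits plus header bits."""
    return sad + lambda_motion * (r_m + r_h)

def rd_cost_interview(sad, lambda_motion, r_d, r_h):
    """J_d = SAD + lambda_motion * (r_d + r_h): searched-disparity bits plus header bits."""
    return sad + lambda_motion * (r_d + r_h)

def rd_cost_depth_assisted(sad, lambda_motion, r_h_prime):
    """J_z = SAD + lambda_motion * r_h_prime: the depth-predicted disparity is
    derived identically at the decoder, so no disparity bits are spent."""
    return sad + lambda_motion * r_h_prime
```

Note how the depth-assisted cost carries no disparity-rate term, which is the source of the bit savings claimed by the invention.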
The selection module 600 is configured to select the prediction mode with the minimum rate-distortion cost as the prediction mode of the current coded macroblock and to perform encoding accordingly.
Specifically, the encoder selects the prediction mode with the best rate-distortion performance as the prediction mode for the current coded macroblock. The selection process can be expressed as

J* = min(J_m, J_d, J_z)

wherein J_m, J_d, and J_z respectively denote the rate-distortion cost of temporal prediction, of conventional inter-view prediction, and of depth-assisted inter-view prediction.
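The mode decision above can be sketched as a minimum-cost selection. The mode labels below are illustrative, not the patent's terminology:

```python
# Hedged sketch of the encoder's mode decision: pick the prediction mode
# whose rate-distortion cost J is smallest.
def select_mode(j_temporal, j_interview, j_depth):
    costs = {"temporal": j_temporal,
             "inter-view": j_interview,
             "depth-assisted": j_depth}
    return min(costs, key=costs.get)
```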
In one embodiment of the present invention, assume the current coded macroblock B_k is an 8 × 8 macroblock in a frame of view 8 of the "Book Arrival" sequence; its corresponding depth values are shown in the 8 × 8 matrix below.
For coding modes of quarter-pixel precision, d_k is rounded to the nearest quarter-pixel, giving the quantized disparity d_k′ = [d_k] = 16.25. The encoder then performs inter-view prediction based on this predicted disparity: for the current coded macroblock, the prediction disparity is 16.25, and the encoder finds the corresponding reference macroblock in the corresponding frame of view 10 for prediction. The sum of the absolute values of the prediction residuals is assumed to be 50. In addition, the encoder also performs the competing predictions, namely temporal prediction and conventional inter-view prediction, on the current macroblock. For temporal prediction, let the motion vector of the current macroblock be 32 and the sum of absolute residual values be 80. For conventional inter-view prediction, assume the disparity obtained by the encoder through block-matching search is 16 and the sum of absolute residual values is 45.
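The quarter-pixel quantization step above can be sketched as rounding to the nearest multiple of 0.25. This is an illustrative sketch under that assumption, not the patent's reference implementation:

```python
# Quantize a depth-predicted disparity to quarter-pixel precision by
# rounding to the nearest multiple of 0.25 (so a raw disparity near
# 16.25 quantizes to exactly 16.25, as in the example above).
def quantize_quarter_pel(d):
    return round(d * 4) / 4.0
```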
In one embodiment of the present invention, the calculation module 500 calculates the rate-distortion cost of each predictive coding mode. For macroblock B_k, the rate-distortion cost of temporal prediction is J_m = 80 + λ_motion·(r_m + r_h).
B_k's rate-distortion cost of conventional inter-view prediction is J_d = 45 + λ_motion·(r_d + r_h).
When predictive coding with the depth-predicted disparity, B_k's rate-distortion cost of depth-assisted inter-view prediction coding is J_z = 50 + λ_motion·r_h′.
The selection module 600 compares, via the encoder, the rate-distortion costs of the different prediction modes and selects the optimal predictive coding mode. For the current macroblock B_k, J_z is the smallest of the three costs; therefore, the optimal inter-frame predictive coding mode is depth-assisted inter-view prediction coding. After the optimal inter-frame prediction mode is obtained, the encoder performs a second rate-distortion-optimized selection: it further compares the rate-distortion cost of the inter-prediction mode with that of the intra-prediction mode, and finally selects the mode with the optimal rate-distortion to encode the current macroblock.
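The comparison can be worked through numerically using the residual sums given in the embodiment (SAD of 80 for temporal prediction, 45 for conventional inter-view prediction, 50 for depth-assisted prediction). The Lagrange multiplier and bit costs below are assumed for illustration only; the patent does not specify them in this excerpt:

```python
# Worked comparison of the three costs using the example's residual sums.
# lambda_motion and the bit counts are ASSUMED values for illustration.
lambda_motion = 4.0
r_m, r_d, r_h, r_h_prime = 6, 6, 4, 4   # assumed bit costs

j_temporal = 80 + lambda_motion * (r_m + r_h)   # temporal prediction
j_interview = 45 + lambda_motion * (r_d + r_h)  # conventional inter-view
j_depth = 50 + lambda_motion * r_h_prime        # depth-assisted (no disparity bits)

# Even though its SAD (50) exceeds the searched-disparity SAD (45),
# depth-assisted prediction can win because it spends no disparity bits.
```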
According to the system provided by the embodiment of the invention, the disparity of the coded macroblock is estimated from depth to perform inter-view compensated prediction, which reduces the code rate required for disparity coding in stereoscopic video coding and improves the efficiency of stereoscopic video coding.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made in the above embodiments by those of ordinary skill in the art without departing from the principle and spirit of the present invention.
Claims (8)
1. A joint predictive coding method for stereoscopic video, comprising the steps of:
s0: coding a depth map sequence by adopting a multi-view video coding method, and inputting depth map information obtained by decoding as side information of three-dimensional video color image coding;
s1: inputting a stereoscopic video color image sequence and dividing each frame image in the stereoscopic video color image sequence into a plurality of coding macro blocks;
s2: predicting the depth prediction parallax of the current coding macro block by a depth prediction method, and carrying out depth-assisted inter-view prediction coding on the current coding macro block according to the depth prediction parallax;
s3: obtaining a disparity vector of the current coding macro block by an inter-view matching method, and carrying out traditional inter-view predictive coding on the current macro block according to the disparity vector;
s4: obtaining a motion vector of the current coding macro block by a time domain motion estimation method, and performing time domain predictive coding on the current coding macro block according to the motion vector;
s5: respectively calculating the rate-distortion performance of the current coding macro block under the depth-assisted inter-view prediction coding mode, the traditional inter-view prediction coding mode and the time domain prediction coding mode;
s6: selecting a predictive coding mode with optimal rate distortion performance as a predictive mode of a current coding macro block and coding;
s7: judging whether all the coding macro blocks are coded;
s8: if not, repeating the steps S1-S5 for the non-coded macroblocks until all the coded macroblocks are coded;
the depth-assisted inter-view prediction coding specifically includes calculating the depth prediction disparity for each pixel point in the current coding macroblock, performing quarter-pixel precision quantization on the depth prediction disparity vector of the current coding macroblock, and performing pixel-by-pixel prediction coding according to the depth prediction disparity to obtain a prediction coding residual.
2. The joint predictive coding method of stereoscopic video according to claim 1, wherein the rate-distortion performance of the temporal predictive coding is obtained by the following formula,
wherein J_m = Σ_{X∈B_k} |I(X) − I_ref_m(X + mv_k)| + λ_motion·(r_m + r_h), where mv_k is the motion vector, B_k is the current coded macroblock, ref_m is the reference frame pointed to by mv_k, X is a pixel in B_k, I(X) is the luminance or chrominance component value at X, I_ref_m(X + mv_k) is the luminance or chrominance component value of the corresponding pixel in the pointed-to reference frame, λ_motion is the Lagrange multiplier for motion-compensated prediction, r_m is the code rate required to code the motion vector, and r_h is the code rate required to code the remaining macroblock header information.
3. The joint predictive coding method of stereoscopic video according to claim 1, wherein the rate-distortion performance of the conventional inter-view predictive coding is obtained by the following formula,
wherein J_d = Σ_{X∈B_k} |I(X) − I_ref_d(X + dv_k)| + λ_motion·(r_d + r_h), where dv_k is the disparity obtained by inter-view matching, B_k is the current coded macroblock, ref_d is the reference frame pointed to by dv_k, I_ref_d(X + dv_k) is the luminance or chrominance component value of the corresponding pixel in the pointed-to reference frame, X is a pixel in B_k, I(X) is the luminance or chrominance component value at X, λ_motion is the Lagrange multiplier for motion-compensated prediction, and r_d is the code rate required to code the searched disparity vector.
4. The joint predictive coding method of stereoscopic video according to claim 1, wherein the rate-distortion performance of the depth-assisted inter-view predictive coding is obtained by the following formula,
wherein J_z = Σ_{X∈B_k} |I(X) − I_ref_z(X + dz(X))| + λ_motion·r_h′, where B_k is the current coded macroblock, X is a pixel in B_k, I(X) is the luminance or chrominance component value at X, dz(X) is the quantized depth-calculated disparity vector corresponding to pixel X, ref_z is the reference frame pointed to by dz(X), I_ref_z(X + dz(X)) is the luminance or chrominance component value of the corresponding pixel in the pointed-to reference frame, λ_motion is the Lagrange multiplier for motion-compensated prediction, and r_h′ is the code rate required to code the macroblock header information in the disparity-compensated prediction mode based on the depth-predicted disparity.
5. A system for joint predictive coding of stereoscopic video, comprising:
a depth map coding and decoding module for coding and decoding a depth map sequence;
a dividing module for inputting a stereoscopic video and dividing the stereoscopic video into a plurality of encoded macro blocks;
the first prediction module is used for predicting the depth prediction parallax of each pixel in the current coding macro block by a depth prediction method and carrying out depth-assisted inter-view prediction coding on the current coding macro block according to the depth prediction parallax;
the second prediction module is used for obtaining the disparity vector of the current coding module by an inter-view matching method and carrying out traditional inter-view prediction coding on the current macro block according to the disparity vector;
the third prediction module is used for obtaining the motion vector of the current coding module by a time domain motion estimation method and carrying out time domain prediction coding on the current coding macro block according to the motion vector;
a calculation module, configured to calculate rate-distortion performance of the current coded macroblock in the depth-assisted inter-view prediction coding mode, the conventional inter-view prediction coding mode, and the temporal prediction coding mode, respectively;
the selection module is used for selecting the predictive coding mode with the optimal rate distortion performance as the predictive mode of the current coding macro block and coding the predictive coding mode;
the judging module is used for judging whether all the coding macro blocks are coded; and
the processing module is used for repeatedly using the dividing module, the first prediction module, the second prediction module, the third prediction module, the calculation module and the selection module until all the coding macro blocks are coded when the coding is not finished;
the first prediction module calculates the depth prediction parallax for each pixel point in the current coding macro block, quantizes the depth prediction parallax vector of the current coding macro block with quarter-pixel precision, and performs pixel-by-pixel prediction coding according to the depth prediction parallax to obtain a prediction coding residual.
6. The joint predictive coding system for stereoscopic video according to claim 5, wherein the rate-distortion performance of the temporal predictive coding is obtained by the following formula,
wherein J_m = Σ_{X∈B_k} |I(X) − I_ref_m(X + mv_k)| + λ_motion·(r_m + r_h), where mv_k is the motion vector, B_k is the current coded macroblock, ref_m is the reference frame pointed to by mv_k, X is a pixel in B_k, I(X) is the luminance or chrominance component value at X, I_ref_m(X + mv_k) is the luminance or chrominance component value of the corresponding pixel in the pointed-to reference frame, λ_motion is the Lagrange multiplier for motion-compensated prediction, r_m is the code rate required to code the motion vector, and r_h is the code rate required to code the remaining macroblock header information.
7. The joint predictive coding system for stereoscopic video according to claim 5, wherein the rate-distortion performance of the conventional inter-view predictive coding is obtained by the following formula,
wherein J_d = Σ_{X∈B_k} |I(X) − I_ref_d(X + dv_k)| + λ_motion·(r_d + r_h), where dv_k is the disparity obtained by stereo matching, ref_d is the reference frame pointed to by dv_k, I_ref_d(X + dv_k) is the luminance or chrominance component value of the corresponding pixel in the pointed-to reference frame, X is a pixel in B_k, B_k is the current coded macroblock, I(X) is the luminance or chrominance component value at X, λ_motion is the Lagrange multiplier for motion-compensated prediction, and r_d is the code rate required to code the searched disparity vector.
8. The joint predictive coding system for stereoscopic video according to claim 5, wherein the rate-distortion performance of the depth-assisted inter-view predictive coding is obtained by the following formula,
wherein J_z = Σ_{X∈B_k} |I(X) − I_ref_z(X + dz(X))| + λ_motion·r_h′, where B_k is the current coded macroblock, X is a pixel in B_k, I(X) is the luminance or chrominance component value at X, dz(X) is the quantized depth-predicted disparity vector corresponding to pixel X, ref_z is the reference frame pointed to by dz(X), I_ref_z(X + dz(X)) is the luminance or chrominance component value of the corresponding pixel in the pointed-to reference frame, λ_motion is the Lagrange multiplier for motion-compensated prediction, and r_h′ is the code rate required to code the macroblock header information in the disparity-compensated prediction mode based on the depth-predicted disparity.
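The pixel-by-pixel depth-assisted prediction recited in claims 1 and 5 can be sketched as follows. This is an illustrative sketch only: the linear depth-to-disparity model (scale, offset) and all array shapes are assumptions; real systems derive the disparity from camera parameters.

```python
# Hedged sketch of pixel-wise depth-assisted disparity prediction: for each
# pixel, derive a disparity from the reconstructed depth, quantize it to
# quarter-pixel precision, and predict from the inter-view reference frame.

def depth_assisted_residual(block, ref_row, xs, depth, scale, offset):
    """Return per-pixel prediction residuals for one row of a macroblock.

    block   : current-pixel values
    ref_row : reference-frame pixel values (same row, other view)
    xs      : horizontal positions of the block pixels in ref_row coordinates
    depth   : per-pixel reconstructed depth values
    scale, offset : ASSUMED linear depth-to-disparity model d = scale*z + offset
    """
    residuals = []
    for val, x, z in zip(block, xs, depth):
        d = round((scale * z + offset) * 4) / 4.0   # quarter-pel quantized disparity
        xr = int(round(x + d))                      # nearest reference sample
        xr = max(0, min(len(ref_row) - 1, xr))      # clamp to frame bounds
        residuals.append(val - ref_row[xr])
    return residuals
```

A perfectly matching reference row yields all-zero residuals; in practice the residuals are what the encoder transforms and codes.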
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310158699.XA CN103220532B (en) | 2013-05-02 | 2013-05-02 | The associated prediction coded method of three-dimensional video-frequency and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103220532A CN103220532A (en) | 2013-07-24 |
CN103220532B true CN103220532B (en) | 2016-08-10 |
Family
ID=48817935
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310158699.XA Active CN103220532B (en) | 2013-05-02 | 2013-05-02 | The associated prediction coded method of three-dimensional video-frequency and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103220532B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103763557B (en) * | 2014-01-03 | 2017-06-27 | 华为技术有限公司 | A kind of Do NBDV acquisition methods and video decoder |
CN104125469B (en) * | 2014-07-10 | 2017-06-06 | 中山大学 | A kind of fast encoding method for HEVC |
CN106303547B (en) * | 2015-06-08 | 2019-01-01 | 中国科学院深圳先进技术研究院 | 3 d video encoding method and apparatus |
CN108235018B (en) * | 2017-12-13 | 2019-12-27 | 北京大学 | Point cloud intra-frame coding optimization method and device based on Lagrange multiplier model |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101222639B (en) * | 2007-01-09 | 2010-04-21 | 华为技术有限公司 | Inter-view prediction method, encoder and decoder of multi-viewpoint video technology |
CN101170702B (en) * | 2007-11-23 | 2010-08-11 | 四川虹微技术有限公司 | Multi-view video coding method |
CN101754042B (en) * | 2008-10-30 | 2012-07-11 | 华为终端有限公司 | Image reconstruction method and image reconstruction system |
CN102238391B (en) * | 2011-05-25 | 2016-12-07 | 深圳市云宙多媒体技术有限公司 | A kind of predictive coding method, device |
Also Published As
Publication number | Publication date |
---|---|
CN103220532A (en) | 2013-07-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Yea et al. | View synthesis prediction for multiview video coding | |
JP5234586B2 (en) | Video encoding method and decoding method, apparatus thereof, program thereof, and storage medium storing program | |
JP2022123085A (en) | Partial cost calculation | |
CN104412597B (en) | The method and device that unified difference vector for 3D Video codings is derived | |
KR101747434B1 (en) | Apparatus and method for encoding and decoding motion information and disparity information | |
US20120189060A1 (en) | Apparatus and method for encoding and decoding motion information and disparity information | |
KR20120080122A (en) | Apparatus and method for encoding and decoding multi-view video based competition | |
CN103907346A (en) | Method and apparatus of motion and disparity vector derivation for 3D video coding and HEVC | |
CN102790892A (en) | Depth map coding method and device | |
CN102801995B (en) | A kind of multi-view video motion based on template matching and disparity vector prediction method | |
KR101893559B1 (en) | Apparatus and method for encoding and decoding multi-view video | |
CN103051894B (en) | A kind of based on fractal and H.264 binocular tri-dimensional video compression & decompression method | |
CN104995916A (en) | Video data decoding method and video data decoding apparatus | |
WO2012077634A9 (en) | Multiview image encoding method, multiview image decoding method, multiview image encoding device, multiview image decoding device, and programs of same | |
KR101598855B1 (en) | Apparatus and Method for 3D video coding | |
CN102291579A (en) | Rapid fractal compression and decompression method for multi-cast stereo video | |
CN103220532B (en) | The associated prediction coded method of three-dimensional video-frequency and system | |
CN102316323B (en) | Rapid binocular stereo-video fractal compressing and uncompressing method | |
CN102917233A (en) | Stereoscopic video coding optimization method in space teleoperation environment | |
KR20120083209A (en) | Depth map coding/decoding apparatus and method | |
Mallik et al. | HEVC based multi-view video codec using frame interleaving technique | |
KR20080006494A (en) | A method and apparatus for decoding a video signal | |
CN102263952B (en) | Quick fractal compression and decompression method for binocular stereo video based on object | |
Yea et al. | View synthesis prediction for rate-overhead reduction in ftv | |
CN102263953B (en) | Quick fractal compression and decompression method for multicasting stereo video based on object |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |