CN103220532B - The associated prediction coded method of three-dimensional video-frequency and system - Google Patents
- Publication number
- CN103220532B CN103220532B CN201310158699.XA CN201310158699A CN103220532B CN 103220532 B CN103220532 B CN 103220532B CN 201310158699 A CN201310158699 A CN 201310158699A CN 103220532 B CN103220532 B CN 103220532B
- Authority
- CN
- China
- Prior art keywords
- prediction
- depth
- predictive coding
- coding
- ref
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
The present invention proposes a joint predictive coding method and system for stereoscopic video. The method includes: S1: input a stereoscopic video and divide it into multiple coding macroblocks; S2: predict the depth-predicted disparity of the current coding macroblock by depth prediction, and perform depth-assisted inter-view predictive coding on the current macroblock; S3: perform traditional inter-view predictive coding on the current macroblock; S4: perform temporal predictive coding on the current macroblock; S5: compute the rate-distortion performance of the current macroblock under the depth-assisted inter-view, traditional inter-view, and temporal predictive coding modes respectively; S6: select the predictive coding mode with the best rate-distortion performance as the prediction mode of the current macroblock and encode with it. According to the method of the embodiments of the present invention, the disparity of a coding macroblock is estimated from depth to perform inter-view compensated prediction, which reduces the bit rate needed for disparity coding in stereoscopic video coding and improves the efficiency of stereoscopic video coding.
Description
Technical Field
The present invention relates to the field of video coding, and in particular to a joint predictive coding method and system for stereoscopic video.
Background Art
With the continuous development of video technology, stereoscopic video has attracted wide attention for its realistic visual effect. In stereoscopic video, the video data consists of video sequences and depth map sequences. The video sequences usually comprise two or more views, and the depth map sequence contains the depth map corresponding to each view. Therefore, how to efficiently compress and transmit the massive video and depth map data has become one of the key technical bottlenecks for stereoscopic video applications.
To compress stereoscopic video data efficiently, researchers proposed multi-view video coding. In this scheme, one view of the multi-view video serves as the base view and is compressed with a conventional video coding scheme to remove temporal redundancy. For the remaining views, the scheme introduces an inter-view prediction mode and compresses both the temporal and the inter-view redundancy of the multi-view video through temporal and inter-view prediction, effectively reducing the bit rate needed to encode multi-view video. Since a depth map can be regarded as a multi-view grayscale video sequence, the multi-view video coding scheme is also used to encode depth maps. In current mainstream stereoscopic video coding schemes, the encoder compresses the multi-view video and the depth maps separately with multi-view video coding, producing a video bitstream and a depth bitstream that are transmitted together to the decoder, which reconstructs the multi-view video and depth map sequences. The decoder then renders virtual views as required by the user, forming the stereoscopic video sequence to be played on a corresponding stereoscopic display.
Although multi-view video coding effectively compresses the temporal and inter-view redundancy of multi-view video and depth maps, the redundancy between the multi-view video and the depth maps remains uncompressed. In stereoscopic video, the depth map represents the depth of the corresponding points in the video sequence, so under given shooting conditions the disparity of each coding macroblock can be predicted from its depth values. The depth map can thus be treated as side information for multi-view video coding: disparity computed from depth can replace disparity obtained by search, reducing the bit rate needed to encode disparity and compressing the redundancy between the multi-view video and the depth maps.
At present there are two stereoscopic video coding approaches based on joint coding of multi-view video and depth maps. In the first, the encoder renders a virtual reference frame from the depth map corresponding to the current frame to be encoded and its reference frame, reducing the redundant information in depth and disparity coding. In the second, a prediction method derives the relationship between temporal motion information and disparity information from the geometric constraints between them.
The disadvantages of the existing techniques include:
(1) an additional codec buffer is required, increasing the space complexity of the codec;
(2) the computational complexity is high, increasing the time complexity of the codec.
Summary of the Invention
The object of the present invention is to address at least one of the technical drawbacks described above.
To this end, one object of the present invention is to propose a joint predictive coding method for stereoscopic video.
Another object of the present invention is to propose a joint predictive coding system for stereoscopic video.
To achieve the above objects, an embodiment of one aspect of the present invention proposes a joint predictive coding method for stereoscopic video, comprising the following steps: S1: inputting a stereoscopic video and dividing it into multiple coding macroblocks; S2: predicting the depth-predicted disparity of the current coding macroblock by depth prediction, and performing depth-assisted inter-view predictive coding on the current macroblock according to that disparity; S3: obtaining a disparity vector by inter-view matching, and performing traditional inter-view predictive coding on the current macroblock according to the disparity vector; S4: obtaining a motion vector by temporal motion estimation, and performing temporal predictive coding on the current macroblock according to the motion vector; S5: computing the rate-distortion performance of the current macroblock under the depth-assisted inter-view, traditional inter-view, and temporal predictive coding modes respectively; and S6: selecting the predictive coding mode with the best rate-distortion performance as the prediction mode of the current macroblock and encoding with it.
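The per-macroblock decision in steps S1-S6 can be sketched as follows. The helper functions, the dictionary layout, and the simple additive cost model J = D + λ·R are illustrative assumptions, not the patent's normative definitions; a real encoder would run the three predictions rather than read precomputed pairs.

```python
# Illustrative sketch of steps S1-S6: each candidate mode yields a
# (distortion, rate_bits) pair; the encoder keeps the cheapest mode.

def depth_assisted_inter_view(mb):   # S2: depth-predicted disparity
    return mb["depth_assisted"]

def traditional_inter_view(mb):      # S3: searched disparity
    return mb["inter_view"]

def temporal_prediction(mb):         # S4: motion estimation
    return mb["temporal"]

def encode_macroblocks(macroblocks, lam=1.5):
    """S5: rate-distortion cost J = D + lam * R per mode; S6: pick the minimum."""
    decisions = []
    for mb in macroblocks:
        candidates = {
            "depth_assisted": depth_assisted_inter_view(mb),
            "inter_view": traditional_inter_view(mb),
            "temporal": temporal_prediction(mb),
        }
        costs = {m: d + lam * r for m, (d, r) in candidates.items()}
        decisions.append(min(costs, key=costs.get))
    return decisions

# One macroblock with the (distortion, rate) pairs used in the embodiment.
mbs = [{"depth_assisted": (50, 21), "inter_view": (45, 28), "temporal": (80, 30)}]
modes = encode_macroblocks(mbs)  # -> ['depth_assisted']
```

With the embodiment's numbers, the depth-assisted mode wins because it pays no disparity bits, only a one-bit flag in the header.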
According to the method of the embodiments of the present invention, the disparity of a coding macroblock is estimated from depth to perform inter-view compensated prediction, which reduces the bit rate needed for disparity coding in stereoscopic video coding and improves the efficiency of stereoscopic video coding.
In an embodiment of the present invention, the method further comprises: S7: determining whether all coding macroblocks have been encoded; and S8: if not, repeating steps S1-S5 for the unencoded macroblocks until all coding macroblocks are encoded.
In an embodiment of the present invention, the rate-distortion performance of temporal predictive coding is obtained by the following formula: J_m = D_m + λ_motion·(r_m + r_h), where D_m is the distortion of temporal prediction, r_m is the number of bits needed to encode the motion vector, r_h is the number of bits needed to encode the header, and λ_motion is the Lagrangian multiplier.
In an embodiment of the present invention, the rate-distortion performance of traditional inter-view predictive coding is obtained by the following formula: J_d = D_d + λ_motion·(r_d + r_h), where D_d is the distortion of traditional inter-view prediction and r_d is the number of bits needed to encode the searched disparity.
In an embodiment of the present invention, the rate-distortion performance of depth-assisted inter-view predictive coding is obtained by the following formula: J_z = D_z + λ_motion·r_h′, where D_z is the distortion of depth-assisted inter-view prediction and r_h′ is the number of bits needed to encode the header, including a one-bit mode flag; the depth-predicted disparity itself is not written into the bitstream.
To achieve the above objects, an embodiment of another aspect of the present invention proposes a joint predictive coding system for stereoscopic video, comprising: a dividing module for inputting a stereoscopic video and dividing it into multiple coding macroblocks; a first prediction module for predicting the depth-predicted disparity of the current coding macroblock by depth prediction and performing depth-assisted inter-view predictive coding on the current macroblock according to that disparity; a second prediction module for performing traditional inter-view predictive coding on the current macroblock; a third prediction module for performing temporal predictive coding on the current coding macroblock; a calculation module for computing the rate-distortion performance of the current coding macroblock under the depth-assisted inter-view, traditional inter-view, and temporal predictive coding modes respectively; and a selection module for selecting the predictive coding mode with the best rate-distortion performance as the prediction mode of the current coding macroblock and encoding with it.
According to the system of the embodiments of the present invention, the disparity of a coding macroblock is estimated from depth to perform inter-view compensated prediction, which reduces the bit rate needed for disparity coding in stereoscopic video coding and improves the efficiency of stereoscopic video coding.
In an embodiment of the present invention, the system further comprises: a judging module for determining whether all coding macroblocks have been encoded; and a processing module for, when encoding is not complete, reusing the dividing module, the first prediction module, the second prediction module, the third prediction module, the calculation module, and the selection module until all coding macroblocks are encoded.
In an embodiment of the present invention, the rate-distortion performance of temporal predictive coding is obtained by the following formula: J_m = D_m + λ_motion·(r_m + r_h), where D_m is the distortion of temporal prediction, r_m is the number of bits needed to encode the motion vector, r_h is the number of bits needed to encode the header, and λ_motion is the Lagrangian multiplier.
In an embodiment of the present invention, the rate-distortion performance of traditional inter-view predictive coding is obtained by the following formula: J_d = D_d + λ_motion·(r_d + r_h), where D_d is the distortion of traditional inter-view prediction and r_d is the number of bits needed to encode the searched disparity.
In an embodiment of the present invention, the rate-distortion performance of depth-assisted inter-view predictive coding is obtained by the following formula: J_z = D_z + λ_motion·r_h′, where D_z is the distortion of depth-assisted inter-view prediction and r_h′ is the number of bits needed to encode the header, including a one-bit mode flag; the depth-predicted disparity itself is not written into the bitstream.
Additional aspects and advantages of the present invention will be set forth in part in the description that follows, and in part will become apparent from the description or be learned by practice of the invention.
Brief Description of the Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments in conjunction with the accompanying drawings, in which:
Fig. 1 is a flowchart of a joint predictive coding method for stereoscopic video according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of virtual view rendering according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of a coding prediction structure according to an embodiment of the present invention; and
Fig. 4 is a structural block diagram of a joint predictive coding system for stereoscopic video according to an embodiment of the present invention.
Detailed Description
Embodiments of the present invention are described in detail below; examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals denote the same or similar elements, or elements with the same or similar functions, throughout. The embodiments described below with reference to the drawings are exemplary, intended only to explain the present invention, and should not be construed as limiting it.
Fig. 1 is a flowchart of a joint predictive coding method for stereoscopic video according to an embodiment of the present invention. As shown in Fig. 1, the method includes the following steps.
Step S101: input a stereoscopic video and divide it into multiple coding macroblocks.
Specifically, the stereoscopic video is input and preprocessed (e.g., rectification and alignment), and the processed stereoscopic video is then divided into multiple coding macroblocks.
Step S102: predict the depth-predicted disparity of the current coding macroblock by depth prediction, and perform depth-assisted inter-view predictive coding on the current macroblock according to that disparity.
Specifically, assume the stereoscopic video sequence contains only the video and depth map sequences of the left and right views. The baseline distance between the left and right views is c, and the camera focal length of both views is f. The current coding macroblock is B_k, which contains n_j pixels; the depth value of pixel j is denoted z_k^j. The depth-predicted disparity of B_k is predicted from these per-pixel depth values. Let the depth value z_k of B_k be the maximum-likelihood value of the depth values of all pixels in B_k, i.e., the depth value taken by the largest number of pixels: z_k = argmax_z |{j : z_k^j = z}|.
Fig. 2 is a schematic diagram of virtual view rendering according to an embodiment of the present invention. As shown in Fig. 2, once the depth value corresponding to B_k is obtained, the disparity of the current coding macroblock can be computed through the mapping between depth and disparity. The predicted disparity of the current coding macroblock can be expressed as d_k = f·c/z_k, where d_k is the computed disparity, f is the focal length, and c is the baseline distance between the left and right views. For a quarter-pixel-precision coding mode, d_k is rounded to the nearest quarter-pixel position and taken as the depth-predicted disparity of the current coding macroblock.
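Assuming the standard parallel-camera relation between depth and disparity, the macroblock depth estimate and the quarter-pixel rounding described above can be sketched as follows; the function names and the sample depth block are illustrative assumptions.

```python
from collections import Counter

def block_depth(depths):
    """Maximum-likelihood depth of a macroblock: the depth value taken by
    the largest number of its pixels."""
    return Counter(depths).most_common(1)[0][0]

def depth_to_disparity(z, f, c):
    """Parallel-camera mapping: disparity d = f * c / z for focal length f,
    baseline c, and depth z."""
    return f * c / z

def round_quarter_pel(d):
    """Round a disparity to the nearest quarter-pixel position."""
    return round(d * 4) / 4

# Hypothetical depth block using the embodiment's camera parameters
# (f = 100, c = 10); most pixels lie at depth 61.5.
z_k = block_depth([61.5] * 60 + [70.0] * 4)
d_k = round_quarter_pel(depth_to_disparity(z_k, f=100, c=10))  # -> 16.25
```

With these parameters, 1000/61.5 ≈ 16.26 rounds to the quarter-pixel position 16.25, matching the disparity used later in the embodiment.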
Step S103: obtain a disparity vector by inter-view matching, and perform traditional inter-view predictive coding on the current macroblock according to the disparity vector.
Step S104: obtain a motion vector by temporal motion estimation, and perform temporal predictive coding on the current coding macroblock according to the motion vector.
Step S105: compute the rate-distortion performance of the current coding macroblock under the depth-assisted inter-view, traditional inter-view, and temporal predictive coding modes respectively.
Specifically, the encoder computes the rate-distortion performance of the different prediction modes. Let the motion vector of the current coding macroblock B_k be m_k, its searched disparity be d_k^s, and its depth-predicted disparity be d_k′.
The rate-distortion performance of temporal predictive coding of the current macroblock is obtained by the following formula: J_m = D_m + λ_motion·(r_m + r_h), where D_m is the distortion of temporal prediction, r_m is the number of bits needed to encode the motion vector m_k, r_h is the number of bits needed to encode the header, and λ_motion is the Lagrangian multiplier.
The rate-distortion performance of traditional inter-view predictive coding with the searched disparity is obtained by the following formula: J_d = D_d + λ_motion·(r_d + r_h), where D_d is the distortion of traditional inter-view prediction and r_d is the number of bits needed to encode the searched disparity d_k^s.
In stereoscopic video, depth information can be regarded as side information of video coding. We can therefore assume that the encoder and the decoder obtain the same reconstructed depth map, so the depth-predicted disparity need not be written into the bitstream. The rate-distortion performance of depth-assisted inter-view predictive coding of the current macroblock with the depth-predicted disparity can thus be expressed as J_z = D_z + λ_motion·r_h′, where D_z is the distortion of depth-assisted inter-view prediction and r_h′ is the number of bits needed to encode the header, including a one-bit flag indicating this mode.
Step S106: select the prediction mode with the smallest rate-distortion cost as the prediction mode of the current coding macroblock and encode with it.
Specifically, the encoder selects the rate-distortion-optimal prediction mode as the prediction mode of the current coding macroblock. The selection can be expressed as mode* = argmin{J_m, J_d, J_z}.
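Written out explicitly, the three rate-distortion costs and the selection rule above amount to the following sketch; the symbol names follow the text, and the functions themselves are an illustrative assumption rather than the reference-software implementation.

```python
LAM = 1.5  # lambda_motion used in the embodiment

def j_temporal(D_m, r_m, r_h, lam=LAM):
    # Temporal prediction codes a motion vector plus the header.
    return D_m + lam * (r_m + r_h)

def j_inter_view(D_d, r_d, r_h, lam=LAM):
    # Traditional inter-view prediction codes a searched disparity plus the header.
    return D_d + lam * (r_d + r_h)

def j_depth_assisted(D_z, r_h_flagged, lam=LAM):
    # The depth-predicted disparity is re-derived from the reconstructed
    # depth map at the decoder, so only the flagged header is coded.
    return D_z + lam * r_h_flagged

# Costs for the embodiment's macroblock B_k:
costs = {
    "temporal": j_temporal(80, 10, 20),          # 125.0
    "inter_view": j_inter_view(45, 8, 20),       # 87.0
    "depth_assisted": j_depth_assisted(50, 21),  # 81.5
}
best_mode = min(costs, key=costs.get)  # -> 'depth_assisted'
```

The design point is that the depth-assisted mode trades slightly higher distortion (50 vs. 45) for a lower rate, since disparity bits are omitted entirely.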
In one embodiment of the present invention, the standard test sequence "Book Arrival" in standard-definition format, with a resolution of 1024×768, is used as the input video sequence for stereoscopic video coding. The codec is JMVC (Joint Multi-view Video Coding), the reference software of the MVC (Multi-view Video Coding) extension of the H.264/AVC standard. The encoder GOP (Group of Pictures) size is 8 frames, and temporal predictive coding uses the Hierarchical B (hierarchical bidirectional prediction) structure. Fig. 3 is a schematic diagram of the coding prediction structure according to an embodiment of the present invention. As shown in Fig. 3, virtual views are rendered from the two color videos and depth maps adjacent to the virtual view. This example uses views 10 and 8 of the "Book Arrival" sequence as the multi-view input, where view 10 is the left reference view and view 8 is the right reference view. The quantization parameter QP for coding the multi-view video and depth maps is an integer between 0 and 51. The baseline distance between the left and right views is 10, and the focal length of the cameras is 100.
Let the current coding macroblock B_k be an 8×8 macroblock in one frame of the view-8 video of the "Book Arrival" sequence, with the corresponding depth values given by the following 8×8 matrix.
The encoder then compares the rate-distortion performance of the different inter prediction modes. Let the number of bits needed to encode the motion vector of the current macroblock B_k be r_m = 10, the number of bits needed to encode the disparity obtained by block-matching search be r_d = 8, and the number of bits needed to encode the header of B_k be r_h = 20. In inter-view prediction based on the depth-predicted disparity, the number of bits needed to encode the header of B_k is r_h′ = 21; the extra bit flags that the current macroblock uses inter-view prediction based on the depth-predicted disparity. In the rate-distortion optimization, the Lagrangian multiplier λ_motion is set to 1.5.
Thus, for macroblock B_k, the rate-distortion cost of temporal prediction is J_m = 80 + 1.5×(10 + 20) = 125, where 80 is the sum of absolute values of the temporal prediction residuals.
The rate-distortion cost of traditional inter-view prediction of B_k is J_d = 45 + 1.5×(8 + 20) = 87, where 45 is the sum of absolute values of the inter-view prediction residuals.
When the depth-predicted disparity is used for predictive coding, the rate-distortion cost of depth-assisted inter-view predictive coding of B_k is J_z = 50 + 1.5×21 = 81.5, where 50 is the sum of absolute values of the depth-assisted prediction residuals.
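The arithmetic of this example can be checked directly with λ_motion = 1.5 and the bit counts given above:

```python
lam = 1.5                    # lambda_motion from the embodiment
J_m = 80 + lam * (10 + 20)   # temporal: distortion 80, r_m = 10, r_h = 20
J_d = 45 + lam * (8 + 20)    # traditional inter-view: distortion 45, r_d = 8, r_h = 20
J_z = 50 + lam * 21          # depth-assisted: distortion 50, r_h' = 21
print(J_m, J_d, J_z)         # 125.0 87.0 81.5
```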
The encoder then selects the optimal inter prediction coding mode by comparing the rate-distortion costs of the different prediction modes. For the current macroblock B_k, J_z = 81.5 is the smallest of the three costs, so depth-assisted inter-view prediction is selected as its coding mode.
According to the method of the embodiments of the present invention, the disparity of a coding macroblock is estimated from depth to perform inter-view compensated prediction, which reduces the bit rate needed for disparity coding in stereoscopic video coding and improves the efficiency of stereoscopic video coding.
Fig. 4 is a structural block diagram of a joint predictive coding system for stereoscopic video according to an embodiment of the present invention. As shown in Fig. 4, the system includes a dividing module 100, a first prediction module 200, a second prediction module 300, a third prediction module 400, a calculation module 500, and a selection module 600.
The dividing module 100 inputs a stereoscopic video and divides it into multiple coding macroblocks.
Specifically, the stereoscopic video is input and preprocessed (e.g., rectification and alignment), and the processed stereoscopic video is then divided into multiple coding macroblocks.
The first prediction module 200 predicts the depth-predicted disparity of the current coding macroblock by depth prediction and performs depth-assisted inter-view predictive coding on the current macroblock according to that disparity.
Specifically, assume the stereoscopic video sequence contains only the video and depth map sequences of the left and right views. The baseline distance between the left and right views is c, and the camera focal length of both views is f. The current coding macroblock is B_k, which contains n_j pixels; the depth value of pixel j is denoted z_k^j. The depth-predicted disparity of B_k is predicted from these per-pixel depth values. Let the depth value z_k of B_k be the maximum-likelihood value of the depth values of all pixels in B_k, i.e., the depth value taken by the largest number of pixels: z_k = argmax_z |{j : z_k^j = z}|.
Fig. 2 is a schematic diagram of virtual view rendering according to an embodiment of the present invention. As shown in Fig. 2, once the depth value corresponding to B_k is obtained, the disparity of the current coding macroblock can be computed through the mapping between depth and disparity. The predicted disparity of the current coding macroblock can be expressed as d_k = f·c/z_k, where d_k is the computed disparity, f is the focal length, and c is the baseline distance between the left and right views. For a quarter-pixel-precision coding mode, d_k is rounded to the nearest quarter-pixel position and taken as the depth-predicted disparity of the current coding macroblock.
The second prediction module 300 obtains a disparity vector by inter-view matching and performs traditional inter-view predictive coding on the current macroblock according to the disparity vector.
The third prediction module 400 obtains a motion vector by temporal motion estimation and performs temporal predictive coding on the current coding macroblock according to the motion vector.
The calculation module 500 computes the rate-distortion performances of the current coding macroblock under the inter-view and compensated prediction modes.
Specifically, the encoder computes the rate-distortion performance of the different prediction modes. Let the motion vector of the current coding macroblock B_k be m_k, its searched disparity be d_k^s, and its depth-predicted disparity be d_k′.
The rate-distortion performance of motion-compensated (temporal) prediction of the current macroblock is obtained by the following formula: J_m = D_m + λ_motion·(r_m + r_h), where D_m is the distortion of temporal prediction, r_m is the number of bits needed to encode the motion vector, r_h is the number of bits needed to encode the header, and λ_motion is the Lagrangian multiplier.
The rate-distortion performance of disparity-compensated prediction with the searched disparity is obtained by the following formula: J_d = D_d + λ_motion·(r_d + r_h), where D_d is the distortion of traditional inter-view prediction and r_d is the number of bits needed to encode the searched disparity.
In stereoscopic video, depth information can be regarded as side information of video coding. We can therefore assume that the encoder and the decoder obtain the same reconstructed depth map, so the depth-predicted disparity need not be written into the bitstream. The rate-distortion performance of disparity-compensated prediction with the depth-predicted disparity can thus be expressed as J_z = D_z + λ_motion·r_h′, where D_z is the corresponding distortion and r_h′ is the number of bits needed to encode the header, including a one-bit flag indicating this mode.
The selection module 600 selects the prediction mode with the smallest rate-distortion cost as the prediction mode of the current coding macroblock and encodes with it.
Specifically, the encoder selects the rate-distortion-optimal prediction mode as the prediction mode of the current coding macroblock. The selection can be expressed as mode* = argmin{J_m, J_d, J_z}.
In one embodiment of the present invention, let the current coding macroblock B_k be an 8×8 macroblock in one frame of the view-8 video of the "Book Arrival" sequence, with the corresponding depth values given by the following 8×8 matrix.
For a quarter-pixel-precision coding mode, after d_k is rounded to the nearest quarter-pixel position, the disparity is d_k′ = [d_k] = 16.25. The encoder then performs inter-view prediction based on this predicted disparity: for the current coding macroblock, the predicted disparity is 16.25, and the encoder finds the corresponding reference macroblock in the corresponding frame of view 10 for prediction. Let the sum of absolute values of the prediction residuals be 50. In addition, the encoder performs compensated prediction on the current macroblock, namely temporal prediction and traditional inter-view prediction. For temporal prediction, let the motion vector of the current macroblock be 32 and the sum of absolute values of the temporal prediction residuals be 80. For traditional inter-view prediction, let the disparity obtained by block-matching search be 16 and the sum of absolute values of its residuals be 45.
In one embodiment of the present invention, the calculation module 500 computes the rate-distortion performance under the different predictive coding modes. For macroblock B_k, the rate-distortion cost of temporal prediction is J_m = 80 + 1.5×(10 + 20) = 125.
The rate-distortion cost of traditional inter-view prediction of B_k is J_d = 45 + 1.5×(8 + 20) = 87.
When the depth-predicted disparity is used for predictive coding, the rate-distortion cost of depth-assisted inter-view predictive coding of B_k is J_z = 50 + 1.5×21 = 81.5.
The selection module 600 compares the rate-distortion costs of the different prediction modes through the encoder and selects the optimal predictive coding mode. For the current macroblock B_k, J_z = 81.5 is the smallest of the three costs, so depth-assisted inter-view prediction is selected.
According to the system of the embodiments of the present invention, the disparity of a coding macroblock is estimated from depth to perform inter-view compensated prediction, which reduces the bit rate needed for disparity coding in stereoscopic video coding and improves the efficiency of stereoscopic video coding.
Although the embodiments of the present invention have been shown and described above, it should be understood that they are exemplary and should not be construed as limiting the present invention; those of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the present invention without departing from its principles and spirit.
Claims (8)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310158699.XA CN103220532B (en) | 2013-05-02 | 2013-05-02 | The associated prediction coded method of three-dimensional video-frequency and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310158699.XA CN103220532B (en) | 2013-05-02 | 2013-05-02 | The associated prediction coded method of three-dimensional video-frequency and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103220532A CN103220532A (en) | 2013-07-24 |
CN103220532B true CN103220532B (en) | 2016-08-10 |
Family
ID=48817935
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310158699.XA Active CN103220532B (en) | 2013-05-02 | 2013-05-02 | The associated prediction coded method of three-dimensional video-frequency and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103220532B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103763557B (en) * | 2014-01-03 | 2017-06-27 | 华为技术有限公司 | A Do‑NBDV acquisition method and video decoding device |
CN104125469B (en) * | 2014-07-10 | 2017-06-06 | 中山大学 | A kind of fast encoding method for HEVC |
CN106303547B (en) * | 2015-06-08 | 2019-01-01 | 中国科学院深圳先进技术研究院 | 3 d video encoding method and apparatus |
CN108235018B (en) * | 2017-12-13 | 2019-12-27 | 北京大学 | Point cloud intra-frame coding optimization method and device based on Lagrange multiplier model |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101222639B (en) * | 2007-01-09 | 2010-04-21 | 华为技术有限公司 | Inter-view prediction method, encoder and decoder in multi-view video technology |
CN101170702B (en) * | 2007-11-23 | 2010-08-11 | 四川虹微技术有限公司 | Multi-view video coding method |
CN101754042B (en) * | 2008-10-30 | 2012-07-11 | 华为终端有限公司 | Image reconstruction method and image reconstruction system |
CN102238391B (en) * | 2011-05-25 | 2016-12-07 | 深圳市云宙多媒体技术有限公司 | A kind of predictive coding method, device |
- 2013-05-02 CN CN201310158699.XA patent/CN103220532B/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN103220532A (en) | 2013-07-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102055982B (en) | Coding and decoding methods and devices for three-dimensional video | |
US8582904B2 (en) | Method of second order prediction and video encoder and decoder using the same | |
CN103037218B (en) | Multi-view stereoscopic video compression and decompression method based on fractal and H.264 | |
CN103051894B (en) | A kind of based on fractal and H.264 binocular tri-dimensional video compression & decompression method | |
CN102413353B (en) | Bit rate allocation method for multi-viewpoint video and depth map in stereoscopic video coding process | |
CN102006480B (en) | Coding and decoding method of binocular stereo video based on inter-view prediction | |
KR20120080122A (en) | Apparatus and method for encoding and decoding multi-view video based competition | |
CN101600108A (en) | A joint motion and disparity estimation method in multi-view video coding | |
CN101204094A (en) | Method for scalably encoding and decoding video signal | |
CN101980538B (en) | Fractal-based binocular stereoscopic video compression coding/decoding method | |
CN101883284B (en) | Video encoding/decoding method and system based on background modeling and optional differential mode | |
CN101404766A (en) | Multi-view point video signal encoding method | |
CN102905150A (en) | A New Compression and Decompression Method of Multi-View Video Fractal Coding | |
US20130335526A1 (en) | Multi-view video encoding/decoding apparatus and method | |
CN110557646B (en) | Intelligent inter-view coding method | |
CN103220532B (en) | The associated prediction coded method of three-dimensional video-frequency and system | |
CN110493603B (en) | Multi-view video transmission error control method based on rate distortion optimization of joint information source channel | |
CN102801995A (en) | Template-matching-based multi-view video motion and parallax vector prediction method | |
CN101198061A (en) | Stereoscopic Video Stream Coding Method Based on View Image Mapping | |
CN102740081B (en) | Method for controlling transmission errors of multiview video based on distributed coding technology | |
CN102316323B (en) | A Fast Fractal Compression and Decompression Method for Binocular Stereo Video | |
KR100738867B1 (en) | Coding Method and Multi-view Corrected Variation Estimation Method for Multi-view Video Coding / Decoding System | |
CN101980539B (en) | Fractal-based multi-view three-dimensional video compression coding and decoding method | |
CN101568038A (en) | Multi-viewpoint error resilient coding scheme based on disparity/movement joint estimation | |
CN101222640A (en) | Method and device for determining reference frame |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |