WO2013039348A1

WO2013039348A1 - Method for signaling image information and video decoding method using same

Info

Publication number: WO2013039348A1
Application number: PCT/KR2012/007368
Authority: WO
Inventors: 성재원; 예세훈; 손은용; 정지욱
Original assignee: 엘지전자 주식회사
Priority date: 2011-09-16
Filing date: 2012-09-14
Publication date: 2013-03-21

Abstract

The present invention relates to a method for signaling image information and to a video decoding method and apparatus using the same. The method for signaling image information according to the present invention includes: a step of calculating a global disparity vector (GDV) between a decoding target picture in a current view and a reference picture in a reference view; and a step of signaling information for deriving the GDV, wherein the decoding target picture and the reference picture can have the same picture order count (POC).

Description

Video information signaling method and video decoding method using the same

The present invention relates to a method of signaling information for decoding a 3D image and to a method of performing video decoding through prediction between different views based on the signaled information.

Recently, demand for high-resolution, high-quality images has been increasing in various application fields. However, as the resolution and quality of an image increase, the amount of information about the image also increases.

Therefore, when video information is transmitted using a medium such as a wired / wireless broadband line, or when image information is stored using an existing storage medium, information transmission cost and storage cost increase. A high-efficiency image compression technique can be used to efficiently transmit, store and reproduce information of high-resolution and high-quality images.

On the other hand, digital video broadcasting using 3D video has received attention as one of the next generation broadcasting services as it can process high resolution / large capacity video. 3D video can provide a sense of presence and immersion using a plurality of view channels.

3D video can be used in various areas such as free viewpoint video (FVV), free viewpoint TV (FTV), 3DTV, surveillance and home entertainment.

Unlike single-view video, multi-view 3D video has a high correlation between views at the same POC (picture order count). Because multi-view images capture the same scene simultaneously with several cameras, i.e., from multiple viewpoints, they contain almost the same information except for parallax and slight illumination differences, and different views are therefore highly correlated.

Therefore, in multi-view video encoding / decoding, correlation between different views can be considered. For example, the decoding target block of the current view can be predicted or decoded by referring to the block of another view.

In this case, the relationship between different views can be calculated and used for prediction.

It is an object of the present invention to provide a method and apparatus for effectively performing inter-view prediction, which is a prediction between different views, in 3D (3 Dimensional) video coding.

The present invention provides a method and apparatus for efficiently signaling information required to derive a global disparity vector (GDV) that defines a relationship between different views in 3D video coding.

It is an object of the present invention to provide a method and apparatus for enabling an encoding apparatus and a decoding apparatus to perform inter-view prediction using the same GDV without transmission of a GDV value from an encoding apparatus in 3D video coding.

An object of the present invention is to provide a method and an apparatus for using GDV already calculated in performing inter-view prediction on a decoding target block of a current view in 3D video coding.

(1) One embodiment of the present invention is a method of signaling video information for decoding 3D video, the method comprising: calculating a global disparity vector (GDV) between a decoding target picture in a current view and a reference picture in a reference view; and signaling information for deriving the GDV.

The decoding target picture and the reference picture may have the same picture order count (POC).

(2) In (1), the information for deriving the GDV may be transmitted in a sequence parameter set.

(3) In (1), the information for deriving the GDV may include information indicating a POC at which calculation of the GDV is performed.

(4) In (3), the information indicating the POC at which the calculation of the GDV is performed may indicate any one of the first POC of an intra period, the POC of a picture at or below a predetermined temporal level, the POC of the picture to be used, and the POCs of all pictures.

(5) In (1), the information for deriving the GDV may include information indicating a method of deriving the GDV for the decoding target picture.

(6) In (5), the information indicating the method of deriving the GDV for the decoding target picture may include information indicating either a method of using any one of the GDVs calculated at the POCs of pictures decoded earlier than the decoding target picture as the GDV for the decoding target picture, or a method of interpolating GDVs selected from the GDVs calculated at the POCs of pictures decoded earlier than the decoding target picture and using the interpolated GDV as the GDV for the decoding target picture, and may also include information indicating the GDVs used in the indicated method.

(7) In (5), the information indicating the method of deriving the GDV for the decoding target picture may indicate either a method of using, as the GDV for the decoding target picture, the GDV calculated at the POC closest to the decoding target picture in POC order among the GDVs calculated at the POCs of pictures decoded earlier than the decoding target picture, or a method of interpolating the GDVs calculated at the two POCs closest to the decoding target picture in POC order among those GDVs and using the interpolated GDV as the GDV for the decoding target picture.

(8) In (1), the information for deriving the GDV may include information indicating the reference view.

(9) Another embodiment of the present invention is a 3D video decoding method comprising: receiving a bitstream including information for deriving a global disparity vector (GDV); calculating a GDV for pictures decoded earlier than a decoding target picture; deriving a GDV for the decoding target picture based on the information for deriving the GDV; and performing inter-view prediction between a current view and a reference view based on the derived GDV, wherein, in the step of deriving the GDV, the GDV between the decoding target picture of the current view and the picture of the reference view having the same POC as the decoding target picture can be derived using the GDVs calculated for the pictures decoded earlier than the decoding target picture of the current view.

(10) In (9), the information for deriving the GDV may be transmitted in a set of sequence parameters in the bitstream.

(11) In (9), the information for deriving the GDV may include POC information indicating a POC at which calculation of the GDV is performed, and in the step of calculating the GDV, the GDV may be calculated at the POC indicated by the POC information.

(12) In (11), the POC information indicating the POC at which the calculation of the GDV is performed may indicate any one of the first POC of an intra period, the POC of a picture at or below a predetermined temporal level, the POC of the picture to be used, and the POCs of all pictures.

(13) In (9), the information for deriving the GDV may include GDV derivation information indicating a method of deriving the GDV for the decoding target picture, and in the step of deriving the GDV, the GDV for the decoding target picture of the current view can be derived according to the indicated method.

(14) In (13), the GDV derivation information may include information indicating either a method of using any one of the GDVs calculated at the POCs of pictures decoded earlier than the decoding target picture as the GDV for the decoding target picture, or a method of interpolating GDVs selected from those GDVs and using the interpolated GDV as the GDV for the decoding target picture, together with information indicating the GDVs used in the indicated method.

(15) In (14), when the GDV derivation information indicates that any one of the GDVs calculated at the POCs of pictures decoded earlier than the decoding target picture is used as the GDV for the decoding target picture, in the step of deriving the GDV, the GDV calculated at the POC closest to the decoding target picture in POC order among those GDVs may be derived as the GDV for the decoding target picture.

(16) In (15), when the GDV derivation information indicates the method of interpolating GDVs selected from the GDVs calculated at the POCs of pictures decoded earlier than the decoding target picture and using the interpolated GDV as the GDV, in the step of deriving the GDV, the GDV obtained by interpolating the GDVs calculated at the two POCs closest to the decoding target picture in POC order among those GDVs may be used as the GDV for the decoding target picture.

(17) In (9), the reference view may be indicated by information for deriving the GDV.

(18) In (9), the reference view may be specified by a decoding order between views.
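The two derivation methods described in (7), (15), and (16) might be sketched as follows. This is only an illustrative sketch: the function name `derive_gdv`, the dictionary layout, the linear weighting by POC distance, and the rounding are all assumptions, since the text above states only that the nearest GDV is reused or that the two nearest GDVs are interpolated.

```python
from typing import Dict, Tuple

GDV = Tuple[int, int]  # (dvx, dvy)


def derive_gdv(current_poc: int, calculated: Dict[int, GDV], interpolate: bool) -> GDV:
    """Derive a GDV for current_poc from GDVs calculated at earlier-decoded POCs.

    calculated maps a POC to the GDV calculated at that POC (hypothetical layout).
    """
    if not calculated:
        raise ValueError("no previously calculated GDV is available")
    # Order the POCs that carry a GDV by their distance to the current POC
    # (ties are broken in favor of the smaller POC).
    nearest = sorted(calculated, key=lambda poc: (abs(poc - current_poc), poc))
    if not interpolate or len(nearest) == 1:
        # Method 1: reuse the GDV calculated at the closest POC.
        return calculated[nearest[0]]
    # Method 2: interpolate the GDVs of the two closest POCs.
    p0, p1 = nearest[0], nearest[1]
    (x0, y0), (x1, y1) = calculated[p0], calculated[p1]
    # Assumption: linear weighting by POC distance, rounded to integers.
    t = (current_poc - p0) / (p1 - p0)
    return (round(x0 + t * (x1 - x0)), round(y0 + t * (y1 - y0)))


if __name__ == "__main__":
    known = {0: (8, 0), 8: (16, 4)}
    print(derive_gdv(6, known, interpolate=False))
    print(derive_gdv(4, known, interpolate=True))
```

Because both sides apply the same rule to GDVs they have each already calculated, no GDV value needs to be transmitted for the current picture. If the current POC lies outside the two nearest POCs, this sketch extrapolates rather than interpolates, which a real design might prefer to handle differently.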

According to the present invention, inter-view prediction, which is a prediction between different views in 3D (3 Dimensional) video coding, can be effectively performed.

According to the present invention, information on a global disparity vector (GDV) defining a relation between different views in 3D video coding can be signaled with a small bit amount and a low overhead.

According to the present invention, since the encoding apparatus and the decoding apparatus can perform the inter-view prediction using the same GDV in the 3D video coding without transmitting the GDV value from the encoding apparatus, the transmission efficiency can be increased.

According to the present invention, in the 3D video coding, decoding complexity can be reduced by reusing GDV that has already been calculated in performing inter-view prediction on a decoding target block of a current view.

FIG. 1 is a block diagram schematically illustrating a video encoding apparatus according to an embodiment of the present invention.

FIG. 2 is a block diagram schematically showing an image decoding apparatus according to an embodiment of the present invention.

FIG. 3 is a diagram schematically illustrating a method of performing inter-view prediction using a GDV in a 3D video encoding / decoding process.

FIG. 4 is a view for schematically explaining an example of a case of decoding multi-view video.

FIG. 5 is a diagram for explaining the POC for calculating the GDV.

FIG. 6 schematically illustrates an example of a method of calculating a GDV value to be used for inter-view prediction in a current decoding picture.

FIG. 7 is a flowchart schematically illustrating a method of signaling information about a GDV in an encoding apparatus according to the present invention.

FIG. 8 is a diagram schematically illustrating a method of calculating a GDV for a current picture in a decoding apparatus according to the present invention.

While the invention is susceptible to various modifications and alternative forms, specific embodiments thereof are shown by way of example in the drawings and are described in detail herein. However, this is not intended to limit the invention to the specific embodiments. The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the scope of the present invention. Singular expressions include plural expressions unless the context clearly dictates otherwise. In this specification, terms such as "comprises" or "having" indicate the presence of stated features, integers, steps, operations, elements, components, or combinations thereof, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, or combinations thereof.

Meanwhile, the components in the drawings described in the present invention are shown independently for convenience in explaining their different characteristic functions in the video encoding / decoding apparatus; this does not mean that each component is implemented as separate hardware or separate software. For example, two or more of the components may be combined into one component, or one component may be divided into a plurality of components. Embodiments in which components are integrated and / or separated are also included in the scope of the present invention as long as they do not depart from the essence of the present invention.

In the encoding / decoding for reproducing the three-dimensional stereoscopic image on the display device, the image input to the encoding device may be a texture image and a depth map. The depth map means an image representing the distance from the viewpoint to the surface of the object in the image. Here, the viewpoint may be, for example, a camera for photographing the corresponding image. The depth map (depth image) can be generated through a camera that captures depth.

The texture image is an image constituting a three-dimensional image and includes information other than depth information (for example, color, contrast, and the like), and may be composed of multi-view images.

In order to process a three-dimensional image, a depth map (depth image) and a texture image may be respectively processed in an encoding / decoding process to be described later, and a texture image may be processed for each view. At this time, the texture image may be referred to for processing of the depth map, and the depth map may be referred to for processing of the texture image. Also, in the case of a texture image, it may be processed by referring to an image of another view.

In this specification, for convenience of explanation, unless otherwise specified, 'image' means a texture image.

Hereinafter, a method of processing a three-dimensional image will be described with reference to the drawings.

FIG. 1 is a block diagram schematically illustrating an encoding apparatus according to an embodiment of the present invention. Referring to FIG. 1, the encoding apparatus 100 includes a picture dividing unit 105, a prediction unit 110, a transform unit 115, a quantization unit 120, a reordering unit 125, an entropy encoding unit 130, an inverse quantization unit 135, an inverse transform unit 140, a filter unit 145, and a memory 150.

The picture dividing unit 105 can divide the input picture into at least one processing unit block. The block serving as a processing unit may be a prediction unit (PU), a transform unit (TU), or a coding unit (CU).

The prediction unit 110 generates a prediction block by performing prediction on a processing unit of the picture provided by the picture dividing unit 105. The processing unit of the picture in the prediction unit 110 may be a CU, a TU, or a PU. In addition, the prediction unit 110 may determine the prediction method to be applied to the processing unit and may determine the specific details (for example, the prediction mode) of each prediction method.

The prediction unit 110 may apply any one of intra prediction, inter prediction, and inter-view prediction as a prediction method.

Through inter prediction, a prediction block can be generated by performing prediction based on information of at least one of a previous picture and a subsequent picture of the current picture. Through intra prediction, a prediction block can be generated by performing prediction based on pixel information within the current picture. Through inter-view prediction, a prediction block can be generated by referring to pictures of different views.

As a method of inter prediction, a skip mode, a merge mode, MVP (Motion Vector Prediction), or the like can be used. In the inter prediction, a reference block having the same size as the PU can be selected for the PU by selecting a reference picture. The reference block may be selected in integer pixel units. Then, a prediction block in which a residual signal with respect to the current PU is minimized and a motion vector size is also minimized is generated.

The prediction block may be generated in integer-sample units or in fractional-sample units such as half-pixel or quarter-pixel units. In this case, the motion vector can also be expressed in units smaller than an integer pixel. For example, it can be expressed in quarter-pixel units for luma samples and in eighth-pixel units for chroma samples.
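As a small illustration of fractional motion vector units, the sketch below splits one motion vector component into its integer and fractional parts. The helper name `split_mv` is hypothetical, and reading the same numeric value in eighth-sample units for chroma assumes 4:2:0 subsampling, which the text above does not state.

```python
from typing import Tuple


def split_mv(mv: int, frac_bits: int = 2) -> Tuple[int, int]:
    """Split a motion vector component stored in 1/(2**frac_bits)-sample units.

    Returns (integer_part, fractional_part) such that
    mv == integer_part * 2**frac_bits + fractional_part.
    """
    # The arithmetic shift floors toward negative infinity, so the fractional
    # part is always in the range 0 .. 2**frac_bits - 1.
    return mv >> frac_bits, mv & ((1 << frac_bits) - 1)


if __name__ == "__main__":
    print(split_mv(9))               # luma: quarter-sample units
    print(split_mv(-5))              # negative component
    print(split_mv(9, frac_bits=3))  # chroma: eighth-sample units (assumed 4:2:0)
```

The fractional part would select the interpolation filter phase, while the integer part selects the sample position in the reference picture.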

Information such as the index of the reference picture selected through inter prediction, the motion vector (e.g., a motion vector predictor), and the residual signal is entropy-encoded and transmitted to the decoding apparatus. When the skip mode is applied, the prediction block can be used as the reconstructed block, so the residual need not be generated, transformed, quantized, or transmitted.

When intra prediction is performed, the prediction mode is determined in units of PU, and prediction can be performed in units of PU. In addition, the prediction mode may be determined in units of PU, and intra prediction may be performed in units of TU.

In intra prediction, there may be 33 directional prediction modes and at least two non-directional modes. The non-directional modes may include a DC prediction mode and a planar mode.

In intra prediction, a prediction block can be generated after applying a filter to a reference sample. At this time, whether to apply the filter to the reference sample can be determined according to the intra prediction mode and / or the size of the current block.

In inter-view prediction, the current block can be predicted using a global disparity vector that specifies, in the reference view, the position of the corresponding block that can be referred to in predicting the current block of the current view.

PUs can be blocks of various sizes / types. For example, the PU may be a 2N × 2N block, a 2N × N block, an N × 2N block, or an N × N block (N is an integer). Further, in addition to the PU of the above-mentioned size, a PU such as an N × mN block, an mN × N block, a 2N × mN block, or an mN × 2N block (m <1) may be further defined.

The residual value (residual block or residual signal) between the generated prediction block and the original block is input to the transform unit 115. In addition, the prediction mode information, motion vector information, and the like used for prediction are encoded by the entropy encoding unit 130 together with the residual value and then transmitted to the decoding apparatus.

The transform unit 115 transforms the residual block on a transform-unit basis and generates transform coefficients. The transform unit 115 may perform downsampling on the texture image and the depth map before performing the transform. The downsampling may be performed on low-frequency regions of the texture image and the depth map, or on regions in which detail is not important. Downsampling can reduce complexity and improve coding efficiency.

The unit of transform in the transform unit 115 may be a TU and may have a quad-tree structure. The size of the transform unit can be set within a range of predetermined maximum and minimum sizes. The transform unit 115 may transform the residual block using DCT (Discrete Cosine Transform) and / or DST (Discrete Sine Transform).

The quantization unit 120 may quantize the residual values transformed by the transform unit 115 to generate quantization coefficients. The values calculated by the quantization unit 120 are provided to the inverse quantization unit 135 and the reordering unit 125.

The reordering unit 125 rearranges the quantization coefficients provided from the quantization unit 120. The encoding efficiency in the entropy encoding unit 130 can be increased by rearranging the quantization coefficients. The reordering unit 125 may rearrange the quantization coefficients of the two-dimensional block form into a one-dimensional vector form through a coefficient scanning method. The reordering unit 125 may increase the entropy encoding efficiency in the entropy encoding unit 130 by changing the order of the coefficient scanning based on the probabilistic statistics of the coefficients transmitted from the quantization unit.
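The rearrangement from a two-dimensional block into a one-dimensional vector can be illustrated with a zigzag scan. The choice of a zigzag pattern and the function names here are assumptions made for illustration; the text above only refers to "a coefficient scanning method" and notes that the scanning order may change with coefficient statistics.

```python
from typing import List, Tuple


def zigzag_order(n: int) -> List[Tuple[int, int]]:
    """Return (row, col) positions of an n x n block in zigzag order."""
    order: List[Tuple[int, int]] = []
    for s in range(2 * n - 1):
        # All positions on the anti-diagonal row + col == s, with row ascending.
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Alternate the traversal direction on successive anti-diagonals.
        order.extend(diag if s % 2 else reversed(diag))
    return order


def scan(block: List[List[int]]) -> List[int]:
    """Flatten a square 2D coefficient block into a 1D vector."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


def inverse_scan(coeffs: List[int], n: int) -> List[List[int]]:
    """Restore an n x n block from a 1D vector produced by scan()."""
    block = [[0] * n for _ in range(n)]
    for value, (r, c) in zip(coeffs, zigzag_order(n)):
        block[r][c] = value
    return block


if __name__ == "__main__":
    quantized = [[9, 3, 0], [2, 1, 0], [1, 0, 0]]
    vector = scan(quantized)
    print(vector)
    print(inverse_scan(vector, 3) == quantized)
```

Scanning low-frequency positions first tends to group the trailing zeros together, which is what makes the subsequent entropy coding more efficient. The decoder side applies the inverse of the same order.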

The entropy encoding unit 130 may perform entropy encoding on the quantization coefficients rearranged by the reordering unit 125. For entropy encoding, an encoding method such as Exponential Golomb, Context-Adaptive Variable Length Coding (CAVLC), or Context-Adaptive Binary Arithmetic Coding (CABAC) may be used. The entropy encoding unit 130 can encode various information received from the reordering unit 125 and the prediction unit 110, such as quantization coefficient information and block type information of the CU, prediction mode information, division unit information, PU information, transmission unit information, motion vector information, reference picture information, block interpolation information, and filtering information.
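Of the methods named above, order-0 Exponential Golomb coding of an unsigned value is simple enough to sketch. The function names and the use of bit strings instead of a packed bitstream are illustrative assumptions, not part of the described apparatus.

```python
def exp_golomb_encode(value: int) -> str:
    """Encode a non-negative integer as an order-0 Exp-Golomb bit string."""
    if value < 0:
        raise ValueError("value must be non-negative")
    # Binary form of value + 1, preceded by (length - 1) zero bits.
    code = bin(value + 1)[2:]
    return "0" * (len(code) - 1) + code


def exp_golomb_decode(bits: str) -> int:
    """Decode a single order-0 Exp-Golomb codeword from the start of bits."""
    # Count the leading zeros; the codeword then has that many bits after the '1'.
    zeros = len(bits) - len(bits.lstrip("0"))
    return int(bits[zeros:2 * zeros + 1], 2) - 1


if __name__ == "__main__":
    for v in range(5):
        print(v, exp_golomb_encode(v))
```

Small values receive short codewords, which suits syntax elements whose small values are the most frequent.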

In addition, the entropy encoding unit 130 may make certain changes to the parameter set or syntax to be transmitted, if necessary.

The entropy encoding unit 130 may multiplex the image information of the multi-view and the image information of the depth map and transmit the multiplexed image as a bit stream.

The inverse quantization unit 135 dequantizes the values quantized by the quantization unit 120, and the inverse transform unit 140 inversely transforms the values dequantized by the inverse quantization unit 135. The inverse transform unit 140 may perform upsampling on the inverse-transformed residual block when downsampling has been performed in the transform unit 115. The sampling rate of the upsampling may be determined in correspondence with the sampling rate of the downsampling performed in the transform unit 115.

The residual value generated by the inverse quantization unit 135 and the inverse transform unit 140 and the prediction block predicted by the prediction unit 110 may be combined to generate a reconstructed block.

In FIG. 1, it is explained that a residual block and a prediction block are combined through an adder to generate a restored block. At this time, the adder may be regarded as a separate unit (reconstruction block generation unit) for generating a reconstruction block.
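The adder operation can be sketched as a sample-wise sum of the prediction block and the residual block. The function name `reconstruct` and the clipping to the valid sample range for an assumed bit depth are illustrative assumptions; the text above describes only the combination itself.

```python
from typing import List


def reconstruct(pred: List[List[int]], resid: List[List[int]], bit_depth: int = 8) -> List[List[int]]:
    """Add a residual block to a prediction block, sample by sample."""
    max_val = (1 << bit_depth) - 1
    return [
        # Assumption: results are clipped to the range 0 .. 2**bit_depth - 1.
        [min(max(p + r, 0), max_val) for p, r in zip(pred_row, resid_row)]
        for pred_row, resid_row in zip(pred, resid)
    ]


if __name__ == "__main__":
    prediction = [[100, 250], [3, 128]]
    residual = [[5, 10], [-7, 0]]
    print(reconstruct(prediction, residual))
```

With an all-zero residual the output equals the prediction block, which corresponds to the skip mode case in which the prediction block serves as the reconstructed block.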

The filter unit 145 may apply at least one of a deblocking filter, an Adaptive Loop Filter (ALF), and a Sample Adaptive Offset (SAO) to the reconstructed picture as necessary.

The deblocking filter can remove the distortion caused in the boundary between the blocks in the reconstructed picture. The ALF (Adaptive Loop Filter) can perform filtering based on the comparison between the reconstructed image and the original image after the block is filtered through the deblocking filter. ALF may be performed only when high efficiency is applied. SAO restores the offset difference from the original image in units of pixels for a residual block to which a deblocking filter is applied and is applied in the form of a band offset and an edge offset.

The memory 150 may store the reconstructed block or picture calculated through the filter unit 145. The reconstructed block or picture stored in the memory 150 may be provided to the prediction unit 110 that performs inter prediction.

FIG. 2 is a block diagram schematically showing an image decoding apparatus according to an embodiment of the present invention. Referring to FIG. 2, the image decoding apparatus 200 includes an entropy decoding unit 210, a reordering unit 215, an inverse quantization unit 220, an inverse transform unit 225, a prediction unit 230, a filter unit 235, and a memory 240.

When an image bitstream is input from the image encoding apparatus, the input bitstream may be decoded according to the procedure by which the image information was processed in the image encoding apparatus.

For example, when variable length coding (VLC), such as CAVLC, is used to perform entropy encoding in the video encoding apparatus, the entropy decoding unit 210 can perform entropy decoding by implementing the same VLC table as the VLC table used in the encoding apparatus. Likewise, when CABAC is used to perform entropy encoding in the video encoding apparatus, the entropy decoding unit 210 can correspondingly perform entropy decoding using CABAC.

If the bitstream received from the encoding apparatus is one in which multi-view image information and depth map image information are multiplexed, the entropy decoding unit 210 may perform entropy decoding after demultiplexing the received bitstream.

Among the information decoded by the entropy decoding unit 210, information for generating a prediction block is provided to the prediction unit 230, and the residual values entropy-decoded by the entropy decoding unit 210 can be input to the reordering unit 215.

The reordering unit 215 may rearrange the bitstream entropy-decoded by the entropy decoding unit 210 based on the rearrangement method used in the image encoding apparatus. The reordering unit 215 may rearrange the coefficients expressed in one-dimensional vector form by restoring them to coefficients in two-dimensional block form. The reordering unit 215 may perform the reordering by receiving information related to the coefficient scanning performed in the encoding apparatus and performing inverse scanning based on the scanning order used in the encoding apparatus.

The inverse quantization unit 220 can perform inverse quantization based on the quantization parameters provided by the encoding apparatus and the coefficient values of the re-arranged blocks.

The inverse transform unit 225 may perform an inverse DCT and / or inverse DST, corresponding to the DCT and / or DST performed by the transform unit of the encoding apparatus, on the quantization result produced by the image encoding apparatus. The inverse transform may be performed based on the transmission unit determined in the encoding apparatus or the division unit of the image. The DCT and / or DST in the transform unit of the encoding apparatus may be performed selectively according to several pieces of information, such as the prediction method, the size of the current block, and the prediction direction, and the inverse transform unit 225 of the decoding apparatus may perform the inverse transform based on the transform information used by the transform unit of the encoding apparatus.

If the conversion is performed after the downsampling is performed on the residual block in the encoding apparatus, the inverse transform unit 225 may perform the upsampling on the inverse-transformed residual block corresponding to the downsampling performed in the encoding apparatus.

The prediction unit 230 may generate a prediction block based on the prediction block generation related information provided by the entropy decoding unit 210 and the previously decoded block and / or picture information provided in the memory 240.

When the prediction mode for the current block is the intra prediction mode, the prediction unit 230 can perform intra prediction, generating a prediction block based on pixel information in the current picture.

When the prediction mode for the current block is an inter prediction mode, the prediction unit 230 can perform inter prediction on the current block based on information included in at least one of a previous picture and a subsequent picture of the current picture. In this case, the motion information necessary for inter prediction of the current block provided by the encoding apparatus, for example, a motion vector and a reference picture index, can be derived by checking the skip flag, merge flag, and the like received from the encoding apparatus.

When inter-view prediction is applied to the current block, the prediction unit 230 can perform prediction on the current block using a reference picture in another view.

The reconstructed block may be generated using the prediction block generated by the prediction unit 230 and the residual block provided by the inverse transform unit 225. In FIG. 2, the prediction block and the residual block are described as being combined in an adder to generate the reconstructed block. The adder may be regarded as a separate unit (reconstructed block generation unit) that generates the reconstructed block.

When the skip mode is applied, the residual is not transmitted and the prediction block can be a restoration block.

The reconstructed block and / or picture may be provided to the filter unit 235. The filter unit 235 may apply deblocking filtering, sample adaptive offset (SAO), and / or ALF to the restored blocks and / or pictures as needed.

The memory 240 may store the reconstructed picture or block to be used as a reference picture or a reference block, and may provide the reconstructed picture to the output unit. Although not shown, the output unit can provide a 3DV image using restored multi-view pictures.

In 3D video encoding / decoding, a multi-view video sequence captured by a plurality of cameras is used. A global disparity exists between images captured at different viewpoints. The global disparity can be regarded as the global difference that exists between a picture at a specific time in the current view and a picture at the same time in a different view. This global difference between two views can be expressed through a Global Disparity Vector (GDV).

In FIG. 3, inter-view prediction is performed on a current block 310 in a picture 300 of an n-th view Vn among the multiple views.

Referring to FIG. 3, motion information of a reference block 330 in a picture 320 of an m-th view Vm among the multiple views may be referred to for inter-view prediction of the current block 310. The picture 300 of the n-th view and the picture 320 of the m-th view are pictures at the same time, that is, pictures with the same POC (Picture Order Count). The POC is information indicating the output order of a picture.

The relationship between the current block 310 and the reference block 330, which belong to different views, can be defined by GDVnm 350. Considering the relationship between the block 340 at position (x, y) in the m-th view and the reference block 330, the relationship between the upper-left pixel (x, y) of the current block 310 in the n-th view and the upper-left pixel (x', y') of the reference block 330 in the m-th view can be expressed by Equation 1.

[Equation 1]

$$(x', y') = GDV_{nm} + (x, y)$$

$$GDV_{nm} = (dvx_{nm}, dvy_{nm})$$

The current block 310 of the n-th view may be predicted (360) using, as a reference block, the reference block 330 of the m-th view specified by the GDV derived as in Equation 1.
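Equation 1 amounts to a component-wise addition, as the following minimal sketch shows. The function name `reference_block_position` and the sample coordinates are hypothetical; handling of positions that fall outside the reference picture is not covered here.

```python
from typing import Tuple


def reference_block_position(cur_xy: Tuple[int, int], gdv: Tuple[int, int]) -> Tuple[int, int]:
    """Apply Equation 1: (x', y') = GDV_nm + (x, y)."""
    x, y = cur_xy
    dvx, dvy = gdv
    # Upper-left pixel of the reference block in the m-th view.
    return (x + dvx, y + dvy)


if __name__ == "__main__":
    # Current block at (64, 32) in view n, GDV_nm = (-12, 2).
    print(reference_block_position((64, 32), (-12, 2)))
```

Because the vector is global, the same GDV is applied to every block of the picture, so only the block position changes from one call to the next.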

The encoding apparatus can select, as the global disparity between two views, the disparity that minimizes the error between two pictures of the different views at the same POC. For example, the disparity having the optimum SAD (Sum of Absolute Differences) value between the two views can be determined on a block-by-block basis. MSE (Mean Square Error), MAD (Mean Absolute Difference), or the like may be used instead of SAD.
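A brute-force search for such a disparity might look like the sketch below. It is a simplified illustration under several assumptions: the pictures are small 2D lists of luma samples, the cost is the mean absolute difference over the overlapping area (so that candidates with different overlap sizes remain comparable), and the names `overlap_mad`, `find_global_disparity`, `search_range`, and `step` are hypothetical.

```python
from typing import List, Tuple

Picture = List[List[int]]


def overlap_mad(cur: Picture, ref: Picture, dvx: int, dvy: int) -> float:
    """Mean absolute difference between cur(x, y) and ref(x + dvx, y + dvy)."""
    height, width = len(cur), len(cur[0])
    total, count = 0, 0
    for y in range(height):
        for x in range(width):
            rx, ry = x + dvx, y + dvy
            # Only samples whose displaced position lies inside ref contribute.
            if 0 <= rx < width and 0 <= ry < height:
                total += abs(cur[y][x] - ref[ry][rx])
                count += 1
    return total / count if count else float("inf")


def find_global_disparity(cur: Picture, ref: Picture, search_range: int = 4, step: int = 1) -> Tuple[int, int]:
    """Return the (dvx, dvy) in the search window with the lowest cost."""
    best_cost, best_dv = float("inf"), (0, 0)
    for dvy in range(-search_range, search_range + 1, step):
        for dvx in range(-search_range, search_range + 1, step):
            cost = overlap_mad(cur, ref, dvx, dvy)
            if cost < best_cost:
                best_cost, best_dv = cost, (dvx, dvy)
    return best_dv


if __name__ == "__main__":
    ref = [[0] * 8 for _ in range(8)]
    cur = [[0] * 8 for _ in range(8)]
    ref[4][4] = 100  # feature at (x=4, y=4) in the reference view
    cur[3][2] = 100  # same feature at (x=2, y=3) in the current view
    print(find_global_disparity(cur, ref, search_range=3))
```

Setting `step` to a block size would restrict the candidates to block-aligned disparities, which is one possible reading of the block-by-block determination mentioned above.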

A Global Disparity Vector (GDV) is a vector representation representing the selected global disparity.

The encoding apparatus may transmit the calculated GDV information in a bitstream so that the same GDV is used in the decoding apparatus.

In the case of performing inter-view prediction using the GDV, motion information is taken from the block (reference block) in the reference picture of the reference view that corresponds, via the GDV, to the current block of the current view, and is used for prediction of the current block. For example, motion information (e.g., a motion vector) of the reference block may be copied and used as a temporal motion vector of the current block, or may be used as a motion vector predictor of the current block.

FIG. 4 schematically illustrates an example of decoding multi-view video. In the example of FIG. 4, pictures of three views (V0, V1, V2) are decoded, the pictures of each view form a sequence according to the POC, and a case in which eight consecutive pictures in POC order form one group of pictures (GOP) is described as an example.

In FIG. 4, GDV ₁₀ is the GDV used to predict a current block of view V1 with view V0 as the reference view, GDV ₁₂ is the GDV used to predict a current block of view V1 with view V2 as the reference view, and GDV ₂₀ is the GDV used to predict a current block of view V2 with view V0 as the reference view. Each GDV indicates the correspondence between two blocks in pictures at the same time (POC) in two different views.

Each GDV may be transmitted in a bitstream from the encoding apparatus as described above, and the decoding apparatus may obtain the GDV from the bitstream and use it for decoding.

Referring to FIG. 4, in view V2, prediction of the current block 420 of the current picture 400 is performed with reference to a block 430 (reference block) in the reference picture 410 of view V0. The current picture 400 and the reference picture 410 have the same POC (POC = 2). As described above, GDV ₂₀ = (dvx ₂₀ , dvy ₂₀ ) is the global disparity vector (GDV) for specifying the position of the reference block in the reference view V0 when decoding the current view V2.

When the upper-left position of the current block 420 in the current picture 400 is (x, y), the upper-left position (x', y') of the corresponding block 430 (reference block) in the reference picture 410 of view V0 (the reference view) can be specified as shown in Equation 2.

<Equation 2>

(x', y') = (x, y) + GDV ₂₀

In the case of inter-view prediction, the encoding apparatus and the decoding apparatus perform prediction on a current block by referring to motion information of the block in the reference view determined by the GDV. For example, in the example of FIG. 4, the encoding apparatus and the decoding apparatus may copy the motion information of the reference block 430 at the position (x', y') specified by GDV ₂₀ and use it as a temporal motion vector of the current block 420, or may use it as a motion vector predictor of the current block 420.
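The reuse of motion information described above can be sketched as follows. The storage layout is an assumption made only for this illustration: the motion field of the reference picture is modeled as a dictionary keyed by the upper-left corner of fixed-size blocks, and `ref_motion_field` and `block_size` are hypothetical names that do not appear in the specification.

```python
from typing import Dict, Optional, Tuple

Vec = Tuple[int, int]


def inherit_motion_vector(
    cur_pos: Vec,
    gdv: Vec,
    ref_motion_field: Dict[Vec, Vec],
    block_size: int,
) -> Optional[Vec]:
    """Locate the reference block with (x', y') = (x, y) + GDV and return its
    motion vector, or None if no motion information is stored there."""
    x_ref = cur_pos[0] + gdv[0]
    y_ref = cur_pos[1] + gdv[1]
    # Assumption: snap the position to the grid on which motion is stored.
    key = ((x_ref // block_size) * block_size,
           (y_ref // block_size) * block_size)
    return ref_motion_field.get(key)
```

The returned vector could then serve either directly as the temporal motion vector of the current block or as its motion vector predictor, as the text describes.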

Alternatively, in the inter-view prediction of the current block, the GDV calculated by the encoding apparatus may not be transmitted in the bitstream as described above; instead, the decoding apparatus may calculate the GDV in the same manner as the encoding apparatus. For example, the decoding apparatus can determine the global disparity between the corresponding views on a block-by-block basis, using SAD or the like, for two pictures having the same POC in different views, in the same manner as the encoding apparatus.

In this case, the inter-view prediction for the current block in the current picture can be performed using the GDV calculated before the current picture is decoded. This can be referred to as motion information reuse in multi-view video coding.

The reused motion information, that is, the GDV used for the inter-view prediction of the current block, is a GDV derived at the POC of a picture decoded before the current picture among the pictures in the current view.

For example, in the example of FIG. 4, if decoding is performed in POC order, the GDV ₂₀ used for prediction of the current block 420 may be a GDV calculated from a picture decoded before the current picture 400 in the current view V2, which may be a single picture or a picture belonging to a previous GOP.

Hereinafter, the reuse of motion information in inter-view prediction will be described with reference to the drawings.

In the reuse of the motion information of the inter-view prediction, the encoding device does not transmit the calculated GDV value to the decoding device. The decoding apparatus calculates the GDV in the same manner as the encoding apparatus calculates the GDV and uses it for inter-view prediction.

To allow the decoding apparatus to calculate the same GDV as the encoding apparatus and use it for inter-view prediction of the current block, (1) the encoding apparatus can signal to the decoding apparatus information indicating the point in time (POC) at which the GDV is calculated, and (2) the encoding apparatus can signal to the decoding apparatus information indicating the GDV to be used for inter-view prediction of the picture being decoded.

First, with respect to the point in time at which the GDV is calculated, the encoding apparatus and the decoding apparatus calculate the GDV using decoded pictures of the current view and of another view at the same point in time (that is, at the same POC). Here, the current view is the view to be encoded/decoded.

If the POC at which the GDV is calculated is predetermined between the encoding apparatus and the decoding apparatus, the GDV can be calculated at the predetermined POC.

The encoding apparatus may signal information indicating the POC at which the GDV is calculated. That is, the encoding apparatus signals to the decoding apparatus information indicating the POC at which it calculated the GDV, and the decoding apparatus can calculate the GDV in the same manner as the encoding apparatus at the POC indicated by the signaled information.

Table 1 is a simplified representation of an example syntax that indicates the point in time (POC) at which the GDV is calculated or updated.

<Table 1>

As shown in Table 1, the encoding apparatus can signal to the decoding apparatus, in the sequence parameter set, gdv_update_interval indicating the POC at which the GDV is calculated/updated.

The gdv_update_interval specifies the point in time (POC) at which the GDV is calculated/updated. The encoding apparatus transmits gdv_update_interval indicating the POC at which it calculated the GDV, and the decoding apparatus calculates the GDV in the same manner as the encoding apparatus at the POC indicated by gdv_update_interval.

For example, the POC at which the GDV is calculated can be set according to the value of gdv_update_interval as shown in Table 2.

<Table 2>

FIG. 5 is a diagram for explaining the POCs at which the GDV is calculated. The pictures shown in FIG. 5 (..., P_k, P_{k+1}, P_{k+2}, P_{k+3}, P_{k+4}, ..., P_{k+l-3}, P_{k+l-2}, P_{k+l-1}, P_{k+l}, P_{k+l+1}, ..., where k and l are integers) belong to the same view and are shown in POC order.

In addition, the pictures shown in FIG. 5 are arranged according to the temporal level to which they belong. Considering the temporal level, a picture (slice) or block subject to inter prediction may refer to a picture whose temporal level is lower than or equal to its own. In the example of FIG. 5, assuming three temporal levels TL0, TL1, and TL2, P_k, P_{k+2}, P_{k+l-2}, P_{k+l}, and P_{k+l+1} belong to the lowest temporal level TL0, P_{k+1}, P_{k+4}, and P_{k+l-3} belong to the next temporal level TL1, and P_{k+3} and P_{k+l-1} belong to the highest temporal level TL2.

Considering again the example of Table 2, the decoding apparatus calculates the GDV in the same manner as the encoding apparatus at the POC indicated by gdv_update_interval.

For example, if the value of gdv_update_interval is 0, the decoding apparatus calculates the GDV at the POC that is the starting point of an intra period. The intra period corresponds to the period from one IDR (instantaneous decoding refresh) picture to the next IDR picture, that is, a unit period of random access. For example, assuming that P_k is an IDR picture and the next IDR picture is P_{k+l} in the example of FIG. 5, when the value of gdv_update_interval received from the encoding apparatus is 0, the decoding apparatus calculates the GDV at the POC of P_k and calculates the GDV again at the POC of P_{k+l} after the POC interval corresponding to the intra period has elapsed.

Also, in the example of Table 2, if the value of gdv_update_interval is 1, the decoding apparatus calculates the GDV at the POC of a picture whose temporal level is at or below a predetermined value. The temporal level used for calculating the GDV when the value of gdv_update_interval is 1 may be set in advance between the encoding apparatus and the decoding apparatus, or may be transmitted as separate information. For example, suppose that the GDV is calculated at a POC with a temporal level of 1 or less when the value of gdv_update_interval is 1. Since the lowest temporal level is generally set to 0, the value of the temporal level corresponding to TL0 is 0 and the value of the temporal level corresponding to TL1 is 1 in the example of FIG. 5. Therefore, when the value of gdv_update_interval is 1, in the example of FIG. 5 the decoding apparatus calculates the GDV at the POC of each of P_{k+1}, P_{k+3}, P_{k+4}, P_{k+l-3}, and P_{k+l-1}.

In the example of Table 2, if the value of gdv_update_interval is 2, the decoding apparatus calculates the GDV at the POC of a picture used as a reference picture. For example, in FIG. 5, if P_{k+2} uses P_{k+1} as a reference picture, the decoding apparatus calculates the GDV at the POC of P_{k+1}. Here, the reference picture may be a picture referred to in inter prediction between pictures in the same view, or a picture referred to in inter-view prediction between pictures in different views.

In the example of Table 2, if the value of gdv_update_interval is 3, the GDV is calculated at the POC of every picture. Therefore, in the example of FIG. 5, the decoding apparatus calculates the GDV at the POC of each picture (..., P_k, P_{k+1}, P_{k+2}, P_{k+3}, P_{k+4}, ..., P_{k+l-3}, P_{k+l-2}, P_{k+l-1}, P_{k+l}, P_{k+l+1}, ...).

As described above, since gdv_update_interval is transmitted in the Sequence Parameter Set (SPS), the decoding apparatus can calculate the GDV at the POC determined by the received gdv_update_interval value until the sequence changes and a new gdv_update_interval is signaled through the next SPS.
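The four values of gdv_update_interval described above can be summarized as a small decision function. The sketch below is only an illustration: the `PictureInfo` record, its field names, and the default threshold `max_temporal_level = 1` (taken from the example in the text) are assumptions, and the start of an intra period is modeled as an IDR picture.

```python
from dataclasses import dataclass


@dataclass
class PictureInfo:
    """Hypothetical per-picture metadata available after decoding."""
    poc: int
    is_idr: bool
    temporal_level: int
    used_as_reference: bool


def should_update_gdv(pic: PictureInfo, gdv_update_interval: int,
                      max_temporal_level: int = 1) -> bool:
    """Return True if the GDV is (re)calculated at the POC of `pic`."""
    if gdv_update_interval == 0:
        return pic.is_idr                  # start of an intra period
    if gdv_update_interval == 1:
        return pic.temporal_level <= max_temporal_level
    if gdv_update_interval == 2:
        return pic.used_as_reference       # picture used as a reference picture
    if gdv_update_interval == 3:
        return True                        # every picture
    raise ValueError("unsupported gdv_update_interval value")
```

Since both apparatuses evaluate the same rule on the same picture properties, no per-picture flag would need to be transmitted; only the single SPS value is signaled.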

In addition, the encoding apparatus may perform prediction using GDVs calculated at various POCs and select the GDV that gives the optimum result. However, since the decoding apparatus and the encoding apparatus perform inter-view prediction using the same GDV, it can also be said that the encoding apparatus calculates the GDV at the POC indicated by gdv_update_interval and performs inter-view prediction using the calculated GDV.

The calculated GDV is used for inter-view prediction in decoding for a subsequent picture. That is, the GDV calculated at the POC of the decoded picture is used for inter-view prediction of the block of the picture to be decoded subsequently. Thus, to compute a GDV at a particular POC, after the decoding of the multi-view pictures at the POC is complete, the GDV can be computed using the decoded multi-view pictures. As described above, the GDV can be calculated by calculating a global disparity on a block-by-block basis that optimizes an error (SAD, etc.) between two pictures of different views in the POC.

Meanwhile, the view with respect to which the GDV of the current view is calculated may be predetermined between the encoding apparatus and the decoding apparatus. The current view is the view to be decoded, and the view with respect to which the GDV is calculated may be the reference view that is referred to in the inter-view prediction of the current view.

For example, when a multi-view video composed of three views is encoded/decoded as in the example of FIG. 4, once the encoding/decoding order is determined, reference relationships between views can be set according to that order. Thus, the reference view of the view currently being decoded can be set to the view decoded just before the current view. For example, in FIG. 4, if the three views are decoded in the order view V1 → view V0 → view V2, the GDV of view V1 is the GDV calculated between view V1 and view V0 with view V0 as the reference view.

Alternatively, only the reference relationships between the views may be set apart from the decoding order of the views. As shown, if the view V1 is decoded referring to the view V0 and the view V2 and the view V2 is set to be decoded with reference to the view V0, the GDV of the view V1 calculated by referring to the view V0 or the view V2 is GDV ₁₀ or GDV ₁₂ , and the GDV of the view V2 calculated with reference to the view V0 becomes GDV ₂₀ .

When the reference view for calculating the GDV between the current view and the reference view is not predetermined, or when the reference view needs to be designated separately, the encoding apparatus may transmit reference information indicating the reference view (e.g., a reference direction) to the decoding apparatus. Upon receiving the reference information indicating the reference view from the encoding apparatus, the decoding apparatus can calculate the GDV between the reference view indicated by the reference information and the current view at the predetermined POC. The reference information may be included in the SPS and transmitted together with the information indicating the point in time (POC) at which the GDV is calculated/updated.

With respect to the GDV to be used for the inter-view prediction, the encoding apparatus can signal to the decoding apparatus information indicating which GDV to use at a given point in time (that is, information indicating how to derive the GDV for the picture to be decoded from already calculated GDVs), so that the decoding apparatus can use the same GDV as the encoding apparatus.

The GDV to be used at the POC currently being decoded can be derived in various ways from the GDVs already calculated at other POCs. For example, an already calculated GDV of another POC may be used as it is for inter-view prediction of the current picture, or the GDVs of already calculated POCs may be interpolated and the result used for inter-view prediction of the current picture.

The encoding apparatus can signal, through a Sequence Parameter Set (SPS), information indicating how to obtain the GDV value to be applied at the POC currently being decoded. The decoding apparatus can perform inter-view prediction on the current picture using the GDV(s) calculated at the POC(s) of pictures decoded prior to the current picture, as indicated by the information signaled from the encoding apparatus.

Table 3 is a simplified representation of an example syntax for signaling information indicating the GDV value to be applied at the POC currently being decoded.

<Table 3>

In the example of Table 3, the encoding device may signal gdv_interpolation_method in the SPS to indicate the GDV that the decoding device will use for inter-view prediction of the current picture.

In the example of Table 3, if the value of gdv_interpolation_method is 0, the decoding apparatus can use any one of the GDVs already calculated in the same view for inter-view prediction of the current picture, without interpolation.

In the example of Table 3, if the value of gdv_interpolation_method is 1, the decoding apparatus can interpolate GDV values already calculated in the same view and use the result for inter-view prediction of the current picture.

FIG. 6 schematically illustrates an example of a method of deriving the GDV value to be used for inter-view prediction of the picture currently being decoded. In the example of FIG. 6, the pictures Pa, Pb, Pc, Pd, Pe, and so on belong to the same view and are arranged in POC order.

The signaling in Table 3 will be described in detail using the pictures in the same view shown in FIG. 6 as an example.

Referring to Table 3, when the value of gdv_interpolation_method is 0, the decoding apparatus can use any of the already calculated GDVs without interpolation. For example, if the value of gdv_interpolation_method is 0, the decoding apparatus can use the GDV value calculated at the POC closest to the POC of the current picture for inter-view prediction of the current picture.

Suppose the pictures in the same view are arranged in POC order as in FIG. 6. If the picture nearest to the current picture Pc (600) is Pb, that is, if d_b is the smallest of the distances (POC differences) d_a, d_b, d_d, and d_e between the current picture and the other pictures, the GDV calculated at the POC of picture Pb can be used as the GDV for the current picture Pc.

If the value of gdv_interpolation_method is 0 and GDVs have been calculated at the POCs of two pictures that are the same distance from the current picture, the GDV calculated at the POC that comes earlier in POC order can be used as the GDV for the current picture. For example, in the example of FIG. 6, if d_b = d_d, the GDV calculated at the POC of picture Pb may be used as the GDV of the current picture Pc.
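The selection rule for gdv_interpolation_method equal to 0, including the tie-break toward the earlier POC, can be sketched as follows. The dictionary `computed`, mapping each POC at which a GDV has already been calculated to that GDV, is a hypothetical data structure used only for this illustration.

```python
from typing import Dict, Tuple

Vec = Tuple[int, int]


def select_nearest_gdv(current_poc: int, computed: Dict[int, Vec]) -> Vec:
    """Pick the GDV whose POC is closest to the current picture.
    On equal distance, the POC that comes earlier in POC order wins."""
    if not computed:
        raise ValueError("no GDV has been calculated yet")
    best_poc = min(computed, key=lambda poc: (abs(poc - current_poc), poc))
    return computed[best_poc]
```

Sorting by the pair (distance, POC) makes the tie-break explicit: among equally distant candidates, the smaller POC is preferred.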

If the value of gdv_interpolation_method is 1, the decoding apparatus can interpolate the already calculated GDV values and use the result as the GDV for the current picture. For example, if the value of gdv_interpolation_method is 1, the GDV obtained by interpolating the GDVs calculated at the two POCs closest to the POC of the current picture, among the POCs at which a GDV has been calculated, can be used as the GDV for the current picture.

Consider the example of FIG. 6, in which the pictures in the same view are arranged in POC order. When the current picture is Pc (600), the distances d_a, d_b, d_d, and d_e to the POCs at which a GDV has already been calculated are compared, and the GDVs calculated at the two POCs having the smallest distances (POC differences) from the current picture can be used. For example, assuming that d_b < d_d < d_a < d_e is satisfied, the interpolation of the GDV calculated at the POC of picture Pb and the GDV calculated at the POC of picture Pd can be used for inter-view prediction of the current picture Pc.

Assume that, among the POCs at which a GDV has already been calculated in the same view as the current picture, the POC closest to the POC of the current picture is poc1 and the POC second closest to the POC of the current picture is poc2. If the GDV calculated at poc1 is gdv1 and the GDV calculated at poc2 is gdv2, gdv, which is the GDV to be used for inter-view prediction of the current picture, can be derived by interpolation as shown in Equation 3.

<Equation 3>

gdv = α × gdv1 + β × gdv2

In Equation 3, α and β can be determined as shown in Table 4 according to the positions, in POC order, of the POCs at which gdv1 and gdv2 were calculated.

<Table 4>

Here, d1 is the distance between the POC at which gdv1 was calculated and the POC of the current picture (d1 = |poc1 - poc0|), and d2 is the distance between the POC at which gdv2 was calculated and the POC of the current picture (d2 = |poc2 - poc0|), where poc0 denotes the POC of the current picture.

Referring to Table 4, when the GDVs used were calculated at POCs on opposite sides of the POC of the current picture (that is, at one POC before and one POC after the current picture in POC order), the GDV for the current picture is derived as the sum of the GDV values, each scaled according to the distance from the POC of the current picture to the POC at which that GDV was calculated.

When the GDVs used were calculated at POCs on the same side of the POC of the current picture (that is, both at POCs before the current picture or both at POCs after the current picture in POC order), the GDV for the current picture is derived as the difference of the GDV values, each scaled according to the distance from the POC of the current picture to the POC at which that GDV was calculated.
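Equation 3 can be sketched in Python as below. The contents of Table 4 are not reproduced in this text, so the weights used here are an assumption: they are the standard linear interpolation weights (opposite sides, a scaled sum) and linear extrapolation weights (same side, a scaled difference), which match the sum/difference behavior described above but are not confirmed to be the exact entries of Table 4. The result is left as floating-point values; any rounding rule is likewise not taken from the source.

```python
from typing import Tuple

Vec = Tuple[float, float]


def interpolate_gdv(poc0: int, poc1: int, gdv1: Vec,
                    poc2: int, gdv2: Vec) -> Vec:
    """gdv = alpha * gdv1 + beta * gdv2 with assumed linear weights."""
    d1 = abs(poc1 - poc0)
    d2 = abs(poc2 - poc0)
    opposite_sides = (poc1 - poc0) * (poc2 - poc0) < 0
    if opposite_sides:
        # Assumed weights: the closer POC gets the larger weight (sum).
        alpha = d2 / (d1 + d2)
        beta = d1 / (d1 + d2)
    else:
        if d1 == d2:
            raise ValueError("poc1 and poc2 must differ")
        # Assumed weights: linear extrapolation (difference of scaled GDVs).
        alpha = d2 / (d2 - d1)
        beta = -d1 / (d2 - d1)
    return (alpha * gdv1[0] + beta * gdv2[0],
            alpha * gdv1[1] + beta * gdv2[1])
```

As a worked check under these assumed weights: with poc0 = 4, gdv1 = (-10, 0) at poc1 = 3 and gdv2 = (-16, 0) at poc2 = 6, we get d1 = 1, d2 = 2, α = 2/3, β = 1/3, and a horizontal component of about -12, which lies on the straight line through the two known values.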

Information indicating how to derive the GDV value to be used in the decoding target picture using the already calculated GDV values may be transmitted in the SPS as described above. Accordingly, the GDV to be used for inter-view prediction of the pictures to be decoded can be derived in the same manner until a new gdv_interpolation_method value is received through the next SPS in the current sequence.

Referring to FIG. 7, the encoding apparatus calculates a GDV for a current picture (S710). The GDV may be calculated by a predetermined unit in the encoding apparatus, for example, a prediction unit. The encoding apparatus can select a disparity as a global disparity between the corresponding views in order to minimize an error between two pictures of different views at the same POC. For example, a disparity having an optimum SAD (Sum of Absolute Differences) value between two views can be determined on a block-by-block basis. At this time, MSE (Mean Square Error), MAD (Mean Absolute Difference), or the like may be used instead of SAD.

The encoding apparatus signals information on the calculated GDV to the decoding apparatus (S720). Instead of signaling the calculated GDV value itself, the encoding apparatus may signal to the decoding apparatus information indicating the point in time (POC) at which the GDV is calculated in the corresponding view, and information indicating how to derive the GDV for the picture to be decoded from already calculated GDVs when performing inter-view prediction. The information on the GDV is transmitted in a bitstream, for example, in a Sequence Parameter Set (SPS). If it is necessary to indicate a reference view for the current view, the encoding apparatus may also transmit information indicating the reference view in the SPS.

Referring to FIG. 8, the decoding apparatus receives GDV information from the encoding apparatus (S810). The GDV information includes information indicating the point in time (POC) at which the GDV is calculated in the current view, and information indicating how to derive the GDV for the picture to be decoded from already calculated GDVs when performing inter-view prediction for each picture. The GDV information may also include information indicating a reference view. The GDV information may be transmitted from the encoding apparatus in a bitstream, for example, included in the SPS.

The decoding apparatus calculates GDV at a predetermined POC based on the received GDV information (S820).

The decoding apparatus can calculate the GDV with respect to the reference view at the point in time (POC) indicated by the GDV information. The GDV is calculated using already decoded pictures. The predetermined POC indicated by the GDV information may be any one of the start POC of an intra period, the POC of a picture below a predetermined temporal level, the POC of a picture used as a reference picture, and the POC of all pictures. The calculation of the GDV may be performed using SAD or the like, as in the encoding apparatus, at a POC for which the pictures of each view have been decoded. The GDV is calculated between the pictures, in the current view and the reference view, at the POC indicated by the GDV information; the reference view for the current view may be predetermined or may be signaled from the encoding apparatus.

Based on the received GDV information, the decoding apparatus derives the GDV for the current picture using the already calculated GDVs (S830). The decoding apparatus derives the GDV for the current picture from the GDV calculated at the POC of an already decoded picture, according to the method indicated by the GDV information. The decoding apparatus may use an already calculated GDV as it is as the GDV for the current picture, or may interpolate already calculated GDVs and use the result as the GDV for the current picture.

The calculation of the GDV at each POC and the derivation of the GDV for the current picture may be performed in a predetermined unit in the decoding apparatus, for example, in the prediction unit.

The decoding apparatus may decode the current picture using the GDV for the current picture (S840). The decoding apparatus can perform inter-view prediction on the current picture using the derived GDV, and can decode the current picture using the inter-view prediction. For example, as described above, the decoding apparatus specifies, by the GDV, the corresponding block in the reference picture that has the same POC as the current picture in the reference view, and may use the motion information of the corresponding block as the motion information of the current block or may predict the motion information of the current block from it.

The decoding apparatus can restore the current block by adding the prediction block generated through the inter-view prediction and the residual transmitted from the encoding apparatus. When the skip mode is applied as a method of inter-view prediction, a prediction block of the current block may be used as a reconstruction block.
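The reconstruction step can be sketched as follows. Blocks are modeled as lists of sample rows, skip mode is modeled by passing no residual, and clipping the result to the sample range of an assumed `bit_depth` is an addition made for this illustration that the text does not state.

```python
from typing import List, Optional

Block = List[List[int]]


def reconstruct_block(pred: Block, residual: Optional[Block] = None,
                      bit_depth: int = 8) -> Block:
    """Reconstruct a block as prediction + residual.
    With residual=None (skip mode) the prediction itself is returned."""
    if residual is None:
        return [row[:] for row in pred]
    max_val = (1 << bit_depth) - 1
    # Assumption: clip each reconstructed sample to [0, max_val].
    return [
        [min(max(p + r, 0), max_val) for p, r in zip(prow, rrow)]
        for prow, rrow in zip(pred, residual)
    ]
```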

In this specification, GDV for a current picture means a GDV used for inter-view prediction of blocks to be decoded in a current picture.

In the above-described exemplary system, the methods are described on the basis of flowcharts as a series of steps or blocks, but the present invention is not limited to the order of the steps, and some steps may occur in a different order. In addition, the above-described embodiments include examples of various aspects. Accordingly, it is intended that the invention include all alternatives, modifications and variations that fall within the scope of the following claims.

In the description of the present invention so far, when one component is referred to as being "connected" or "coupled" to another component, it should be understood that the component may be directly connected or coupled to the other component, or that other components may exist between the two. On the other hand, when one component is referred to as being "directly connected" or "directly coupled" to another component, it should be understood that no other component exists between the two components.

Claims

A video information signaling method comprising:
calculating a GDV (Global Disparity Vector) between a decoding target picture in a current view and a reference picture in a reference view; and
signaling information for deriving the GDV,
wherein the decoding target picture and the reference picture have the same picture order count (POC).
The method of claim 1, wherein the information for deriving the GDV is transmitted in a sequence parameter set.
The method of claim 1, wherein the information for deriving the GDV includes information indicating a POC at which calculation of the GDV is performed.
4. The method of claim 3, wherein the information indicating the POC at which the calculation of the GDV is performed indicates any one of the first POC of an intra period, the POC of a picture below a predetermined temporal level, the POC of a picture used as a reference picture, and the POC of all pictures.
The method of claim 1, wherein the information for deriving the GDV includes information indicating a method of deriving a GDV for the decoding target picture.
6. The method of claim 5, wherein the information indicating the method of deriving the GDV for the decoding target picture includes:
information indicating one of a method of using, as the GDV for the decoding target picture, a GDV selected from among the GDVs calculated at the POCs of pictures decoded earlier than the decoding target picture, and a method of interpolating GDVs selected from among the GDVs calculated at the POCs of pictures decoded earlier than the decoding target picture and using the result as the GDV for the decoding target picture; and
information indicating the GDV used in the indicated method.
6. The method of claim 5, wherein the information indicating the method of deriving the GDV for the decoding target picture is information indicating one of:
using, as the GDV for the decoding target picture, the GDV calculated at the POC closest to the decoding target picture in POC order among the GDVs calculated at the POCs of pictures decoded earlier than the decoding target picture; and
interpolating the GDVs calculated at the two POCs closest to the decoding target picture in POC order among the GDVs calculated at the POCs of pictures decoded earlier than the decoding target picture, and using the result as the GDV for the decoding target picture.
The method of claim 1, wherein the information for deriving the GDV includes information indicating the reference view.
Receiving a bitstream including information for deriving a Global Disparity Vector (GDV);
Calculating a GDV in a GOP of pictures decoded prior to a current picture to be decoded;
Deriving a GDV for a current picture to be decoded based on information for deriving the GDV; And
Performing an inter-view prediction between a current view and a reference view based on the derived GDV,
wherein, in the step of deriving the GDV,
a GDV between the current picture to be decoded of the current view and a picture of the reference view having the same POC as the current picture to be decoded is derived using the GDV calculated in the GOP of pictures decoded earlier than the current picture to be decoded.
10. The method of claim 9, wherein the information for deriving the GDV is transmitted in a sequence parameter set in the bitstream.
10. The method of claim 9, wherein the information for deriving the GDV includes POC information indicating a POC at which calculation of the GDV is performed,
Wherein the step of calculating the GDV calculates the GDV at the POC indicated by the POC information.
12. The method of claim 11, wherein the POC information indicating the POC at which the calculation of the GDV is performed indicates any one of the first POC of an intra period, the POC of a picture below a predetermined temporal level, the POC of a picture used as a reference picture, and the POC of all pictures.
10. The method of claim 9, wherein the information for deriving the GDV includes GDV derivation information indicating a method of deriving a GDV for the picture to be decoded,
In the GDV derivation step,
And derives a GDV for a current picture to be decoded according to a method indicated by the GDV derivation information.
14. The method according to claim 13, wherein the GDV derivation information includes:
information indicating one of a method of using, as the GDV for the decoding target picture, a GDV selected from among the GDVs calculated at the POCs of pictures decoded earlier than the decoding target picture, and a method of interpolating GDVs selected from among the GDVs calculated at the POCs of pictures decoded earlier than the decoding target picture and using the result as the GDV for the decoding target picture; and
information indicating the GDV used in the indicated method.
15. The method as claimed in claim 14, wherein, when the GDV derivation information indicates using one of the GDVs calculated at the POCs of pictures decoded earlier than the current picture as the GDV for the current picture to be decoded,
in the GDV derivation step,
the GDV calculated at the POC closest to the current picture in POC order among the GDVs calculated at the POCs of pictures decoded earlier than the decoding target picture is derived as the GDV for the decoding target picture.
16. The method of claim 15, wherein, when the GDV derivation information indicates interpolating GDVs selected from among the GDVs calculated at the POCs of pictures decoded prior to the decoding target picture and using the result as the GDV,
in the GDV derivation step,
the GDVs calculated at the two POCs closest to the current picture in POC order among the GDVs calculated at the POCs of pictures decoded earlier than the decoding target picture are interpolated, and the interpolated GDV is derived as the GDV for the decoding target picture.
10. The method of claim 9, wherein the reference view is indicated by information for deriving the GDV.
10. The method of claim 9, wherein the reference view is specified by a decoding order between views.