WO2015141696A1 - Image decoding device, image encoding device, and prediction device - Google Patents

Image decoding device, image encoding device, and prediction device

Info

Publication number
WO2015141696A1
Authority
WO
WIPO (PCT)
Prior art keywords
unit, image, prediction, depth, block
Prior art date
Application number
PCT/JP2015/057953
Other languages
English (en), Japanese (ja)
Inventor
知宏 猪飼, 健史 筑波
Original Assignee
シャープ株式会社 (Sharp Corporation)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by シャープ株式会社 (Sharp Corporation)
Priority to JP2016508750A (JPWO2015141696A1)
Publication of WO2015141696A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/597: using predictive coding specially adapted for multi-view video sequence encoding
    • H04N 19/119: using adaptive coding; adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • H04N 19/136: using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding; incoming video signal characteristics or properties
    • H04N 19/172: using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding; the unit being an image region, e.g. an object; the region being a picture, frame or field
    • H04N 19/23: using video object coding with coding of regions that are present throughout a whole video segment, e.g. sprites, background or mosaic
    • H04N 19/583: using predictive coding involving temporal prediction; motion estimation or motion compensation; motion compensation with overlapping blocks

Definitions

  • The present invention relates to an image decoding device, an image encoding device, and a prediction device.
  • Multi-view image encoding techniques include parallax-predictive encoding, which reduces the amount of information by predicting the parallax between images when encoding images from a plurality of viewpoints, and decoding methods corresponding to such encoding.
  • A vector representing the parallax between viewpoint images is called a displacement vector.
  • The displacement vector is a two-dimensional vector having a horizontal element (the x component) and a vertical element (the y component), and is calculated for each block, a block being an area obtained by dividing one image.
  • Each viewpoint image is encoded as a different layer among a plurality of layers.
  • A method for encoding a moving image composed of a plurality of layers is generally referred to as scalable encoding or hierarchical encoding.
  • In scalable coding, high coding efficiency is realized by performing prediction between layers.
  • A layer that serves as a reference, without itself using inter-layer prediction, is called a base layer, and the other layers are called enhancement layers.
  • Scalable encoding in which the layers are composed of viewpoint images is referred to as view scalable encoding.
  • In this case, the base layer is also called the base view, and an enhancement layer is also called a non-base view.
  • Scalable coding in which the layers are composed of a texture layer (image layer) and a depth layer (distance image layer) is called three-dimensional scalable coding.
  • In addition to view scalable coding, scalable coding includes spatial scalable coding (processing a low-resolution picture as the base layer and a high-resolution picture as the enhancement layer) and SNR scalable coding (processing a low-quality picture as the base layer and a high-quality picture as the enhancement layer).
  • In scalable coding, a base layer picture may be used as a reference picture when coding an enhancement layer picture.
  • Non-Patent Document 1 describes a technique called depth-based block partitioning (DBBP), in which partition information (a segmentation) is derived from a depth image and one prediction image is synthesized from two interpolated images using the segmentation as a mask.
  • Because the segmentation is derived from a region division based on depth pixels, partitioning with a high degree of freedom is possible, not limited to the rectangular partitions (2N×2N, 2N×N, 2N×nU, 2N×nD, N×2N, nL×2N, nR×2N).
  • Non-Patent Document 2 describes a technique for unifying the disparity vectors (DV) used for depth pixel reference in viewpoint synthesis prediction (VSP) and in DBBP: both VSP and DBBP use the neighbouring-block-based disparity vector before depth refinement (NBDV).
  • Non-Patent Document 1 has the problem that implementation is complicated, because different partitioning methods are used for VSP and DBBP.
  • Non-Patent Document 2 can unify the disparity vectors (DV) used for reference to the depth pixels of VSP and DBBP, but has the problem that encoding efficiency is lowered.
  • One aspect of the present invention is a depth-based block prediction image generation device including a segmentation deriving unit that derives segmentation information from a depth image, an image interpolating unit that generates two motion-compensated images, and an image synthesizing unit that generates a single motion-compensated image by combining the two interpolated images.
  • In one aspect, the image interpolating unit generates the two motion-compensated images by bilinear prediction.
  • One aspect of the present invention is a depth-based block prediction image generation device including a segmentation deriving unit that derives segmentation information from a depth image, an image interpolating unit that generates two motion-compensated images, and an image synthesizing unit that generates a single motion-compensated image by combining the two interpolated images, the device further including a depth division mode deriving unit that derives a division mode from the depth image, wherein the depth division mode deriving unit derives the division mode from the pixels at the four corners of the depth block.
  • One aspect of the present invention is characterized in that the depth division mode deriving unit derives the division mode from a comparison of the upper-left and lower-right depth pixels and a comparison of the upper-right and lower-left depth pixels.
  • One aspect of the present invention is a depth-based block prediction image generation device including a segmentation deriving unit that derives segmentation information from a depth image, an image interpolating unit that generates two motion-compensated images, an image synthesizing unit that generates a single motion-compensated image by combining the two interpolated images, and a depth division mode deriving unit that derives a division mode, wherein the depth division mode deriving unit derives a 2N×N or N×2N division mode, as in the sketch below.
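  • As an illustration of the corner comparisons just described, the following C sketch derives a 2N×N or N×2N division mode from the four corner samples of a depth block; the buffer layout, the enum names, and the mapping of the two diagonal comparisons onto the two modes are assumptions for illustration, not the patent's own rule.

    /* Hedged sketch: derive a horizontal (2NxN) or vertical (Nx2N) split
     * from the four corner samples of a depth block. */
    typedef enum { PART_2NxN, PART_Nx2N } DepthPartMode;

    static DepthPartMode deriveDepthPartMode(const unsigned char *depth,
                                             int stride, int w, int h)
    {
        unsigned char tl = depth[0];                        /* upper left  */
        unsigned char tr = depth[w - 1];                    /* upper right */
        unsigned char bl = depth[(h - 1) * stride];         /* lower left  */
        unsigned char br = depth[(h - 1) * stride + w - 1]; /* lower right */

        int cmpDiag0 = (tl > br);  /* upper-left vs lower-right */
        int cmpDiag1 = (tr > bl);  /* upper-right vs lower-left */
        /* If the two diagonal comparisons agree, assume a horizontal
         * depth edge (2NxN); otherwise a vertical one (Nx2N). */
        return (cmpDiag0 == cmpDiag1) ? PART_2NxN : PART_Nx2N;
    }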
  • In one aspect, the segmentation deriving unit derives segmentation information that takes the value 0 or 1 for each pixel, and the image synthesizing unit performs the composition by selecting, for each pixel of the block, one of the two interpolated images (see the sketch below).
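  • A minimal C sketch of this per-pixel composition, assuming 0/1 segmentation values and two already interpolated prediction buffers (all names are illustrative):

    /* Select, for every pixel, the sample of the first or second
     * motion-compensated image according to the 0/1 segmentation mask. */
    static void dbbpSynthesize(const short *pred0, const short *pred1,
                               const unsigned char *segMask, short *dst,
                               int width, int height, int stride)
    {
        for (int y = 0; y < height; y++)
            for (int x = 0; x < width; x++) {
                int o = y * stride + x;
                dst[o] = segMask[o] ? pred1[o] : pred0[o];
            }
    }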
  • One aspect of the present invention is an image decoding device including the depth-based block prediction image generation device and a DBBP flag decoding unit, wherein the depth-based block prediction image generation device performs DBBP prediction when the DBBP flag is 1.
  • One aspect of the present invention is an image decoding device including a depth-based block prediction image generation unit and a viewpoint synthesis prediction unit, wherein the depth-based block prediction image generation unit includes a segmentation deriving unit that derives segmentation information from a depth image, an image interpolating unit that generates two motion-compensated images, an image synthesizing unit that generates one motion-compensated image by combining the two interpolated images, and a division mode deriving unit that derives a division mode; the viewpoint synthesis prediction unit includes a partition dividing unit that performs partition division from the depth image and a depth motion vector deriving unit that derives a motion vector from the depth image; and the division mode deriving unit and the partition dividing unit share a common division mode deriving unit.
  • One aspect of the present invention is an image decoding device including a depth-based block prediction image generation unit and a merge mode parameter deriving unit, wherein the depth-based block prediction image generation unit includes a segmentation deriving unit that derives segmentation information from a depth image, an image interpolating unit that generates two motion-compensated images, and an image synthesizing unit that combines the two interpolated images to generate one motion-compensated image; the image decoding device further includes a DBBP flag decoding unit; and the merge mode parameter deriving unit converts bi-prediction to uni-prediction when the DBBP flag is 1 (a minimal sketch follows).
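  • A sketch of this bi-to-uni conversion; that the L0 parameters are the ones kept is an assumption, since the text does not say which list is dropped:

    /* When dbbp_flag is 1 and the merge candidate is bi-predictive,
     * reduce it to uni-prediction by dropping the L1 side. */
    static void dbbpRestrictToUniPred(int dbbp_flag,
                                      int *predFlagL0, int *predFlagL1,
                                      int *refIdxL1)
    {
        if (dbbp_flag && *predFlagL0 && *predFlagL1) {
            *predFlagL1 = 0;   /* keep L0, discard L1 */
            *refIdxL1   = -1;  /* mark the L1 reference as unused */
        }
    }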
  • One aspect of the present invention is an image decoding device including a depth-based block prediction image generation unit and an inter prediction parameter decoding unit, wherein the depth-based block prediction image generation unit includes a segmentation deriving unit that derives segmentation information from the depth image, an image interpolating unit that generates two motion-compensated images, and an image synthesizing unit that combines the two interpolated images to generate one motion-compensated image; the image decoding device further includes a DBBP flag decoding unit; and the inter prediction parameter decoding unit does not decode the bi-prediction value of the inter prediction identifier when the DBBP flag is 1.
  • One aspect of the present invention is an image encoding device including the depth-based block prediction image generation device and a DBBP flag encoding unit, wherein the depth-based block prediction image generation device performs DBBP prediction when the DBBP flag is 1.
  • One aspect of the present invention is an image encoding device including a depth-based block prediction image generation unit and a viewpoint synthesis prediction unit, wherein the depth-based block prediction image generation unit includes a segmentation deriving unit that derives segmentation information from a depth image, an image interpolating unit that generates two motion-compensated images, an image synthesizing unit that combines the two interpolated images to generate one motion-compensated image, and a division mode deriving unit that derives a division mode; the viewpoint synthesis prediction unit includes a partition dividing unit that performs partition division in the depth image and a depth motion vector deriving unit that derives a motion vector from the depth image; and the division mode deriving unit and the partition dividing unit share a common division mode deriving unit.
  • One aspect of the present invention is an image encoding device including a depth-based block prediction image generation unit and a merge mode parameter deriving unit, wherein the depth-based block prediction image generation unit includes a segmentation deriving unit that derives segmentation information from the depth image, an image interpolating unit that generates two motion-compensated images, and an image synthesizing unit that combines the two interpolated images to generate one motion-compensated image; the image encoding device encodes a DBBP flag; and the merge mode parameter deriving unit performs the conversion from bi-prediction to uni-prediction when the DBBP flag is 1.
  • One aspect of the present invention is an image decoding device including a depth-based block prediction image generation unit and a viewpoint synthesis prediction unit, wherein the depth-based block prediction image generation unit includes a segmentation deriving unit that derives segmentation information from a depth image, an image interpolating unit that generates two motion-compensated images, an image synthesizing unit that generates one motion-compensated image by combining the two interpolated images, and a division mode deriving unit that derives a division mode; and the viewpoint synthesis prediction unit includes a partition dividing unit that performs partition division from the depth image and a depth motion vector deriving unit that derives a motion vector from the depth image.
  • One aspect of the present invention is an image encoding device including a depth-based block prediction image generation unit and a viewpoint synthesis prediction unit, wherein the depth-based block prediction image generation unit includes a segmentation deriving unit that derives segmentation information from a depth image, an image interpolating unit that generates two motion-compensated images, an image synthesizing unit that combines the two interpolated images to generate one motion-compensated image, and a division mode deriving unit that derives a division mode; the viewpoint synthesis prediction unit includes a partition dividing unit that performs partition division from the depth image and a depth motion vector deriving unit that derives a motion vector from the depth image; and the disparity vector used by the segmentation deriving unit and the division mode deriving unit of the depth-based block prediction image generation unit to derive the position of the referenced depth image and the disparity vector used by the partition dividing unit and the depth motion vector deriving unit of the viewpoint synthesis prediction unit to derive the position of the depth image are a common disparity vector.
  • FIG. 6 is a diagram showing the division mode patterns; (a) to (h) show the partition shapes for the division modes 2N×2N, 2N×N, 2N×nU, 2N×nD, N×2N, nL×2N, nR×2N, and N×N, respectively.
  • VSP prediction unit 30374.
  • FIG. 2 is a schematic diagram showing the configuration of the image transmission system 1 according to the present embodiment.
  • The image transmission system 1 is a system that transmits codes obtained by encoding a plurality of layer images and displays the images obtained by decoding the transmitted codes.
  • The image transmission system 1 includes an image encoding device 11, a network 21, an image decoding device 31, and an image display device 41.
  • A signal T indicating a plurality of layer images (also referred to as texture images) is input to the image encoding device 11.
  • A layer image is an image that is viewed or photographed at a certain resolution and from a certain viewpoint.
  • Each of the plurality of layer images is referred to as a viewpoint image.
  • A viewpoint corresponds to the position or observation point of the photographing apparatus.
  • For example, the plurality of viewpoint images are images taken by left and right photographing devices facing the subject.
  • The image encoding device 11 encodes each of these signals to generate an encoded stream Te (encoded data); details of the encoded stream Te will be described later.
  • A viewpoint image is a two-dimensional image (planar image) observed from a certain viewpoint.
  • A viewpoint image is represented, for example, by a luminance value or a color signal value for each pixel arranged in a two-dimensional plane.
  • Hereinafter, one viewpoint image, or the signal indicating it, is referred to as a picture.
  • In the case of spatial scalable encoding, the plurality of layer images consist of a base layer image having a low resolution and enhancement layer images having a high resolution.
  • When SNR scalable encoding is performed using a plurality of layer images, the plurality of layer images consist of a base layer image with low image quality and enhancement layer images with high image quality.
  • View scalable coding, spatial scalable coding, and SNR scalable coding may be combined arbitrarily.
  • The present embodiment handles, as the plurality of layer images, the encoding and decoding of images that include at least a base layer image and an image other than the base layer image.
  • Of two layers in a reference relationship, the image on the referenced side is referred to as the first layer image, and the image on the referring side is referred to as the second layer image.
  • For example, when an enhancement layer image is encoded with reference to the base layer, the base layer image is treated as the first layer image and the enhancement layer image as the second layer image.
  • Examples of enhancement layer images include an image of a viewpoint other than the base view and a depth image.
  • A depth image (also referred to as a depth map or distance image) is an image signal composed of a signal value (called a depth value or simply depth) for each pixel arranged in a two-dimensional plane, the value corresponding to the distance from the viewpoint (the photographing device, etc.) of the subject or background contained in the subject space.
  • The pixels constituting a depth image correspond to the pixels constituting the viewpoint image; the depth map is therefore a clue for representing the three-dimensional subject space by using the viewpoint image, which is the reference image signal obtained by projecting the subject space onto a two-dimensional plane.
  • The network 21 transmits the encoded stream Te generated by the image encoding device 11 to the image decoding device 31.
  • The network 21 is the Internet, a wide area network (WAN: Wide Area Network), a local area network (LAN: Local Area Network), or a combination thereof.
  • The network 21 is not necessarily limited to a bidirectional communication network and may be a unidirectional or bidirectional communication network that transmits broadcast waves, such as terrestrial digital broadcasting or satellite broadcasting.
  • The network 21 may also be replaced by a storage medium that records the encoded stream Te, such as a DVD (Digital Versatile Disc) or a BD (Blu-ray Disc).
  • The image decoding device 31 decodes each encoded stream Te transmitted over the network 21 and generates a plurality of decoded layer images Td (decoded viewpoint images Td).
  • The image display device 41 displays all or part of the plurality of decoded layer images Td generated by the image decoding device 31. For example, in view scalable coding, a 3D image (stereoscopic image) or a free-viewpoint image is displayed when all of the images are used, and a 2D image is displayed when part of them is used.
  • The image display device 41 includes, for example, a display device such as a liquid crystal display or an organic EL (electroluminescence) display.
  • In spatial scalable coding and SNR scalable coding, when the image decoding device 31 and the image display device 41 have high processing capability, a high-quality enhancement layer image is displayed; when they have only lower processing capability, a base layer image is displayed, which requires neither the high processing capability nor the display capability needed for the enhancement layer.
  • FIG. 3 is a diagram showing the hierarchical structure of the data in the encoded stream Te.
  • The encoded stream Te illustratively includes a sequence and a plurality of pictures constituting the sequence.
  • (a) to (f) of FIG. 3 respectively show a sequence layer that defines the sequence SEQ, a picture layer that defines a picture PICT, a slice layer that defines a slice S, a slice data layer that defines slice data, a coding tree layer that defines a coding tree unit included in the slice data, and a coding unit layer that defines a coding unit (CU) included in the coding tree.
  • Sequence layer: in the sequence layer, a set of data referred to by the image decoding device 31 for decoding the sequence SEQ to be processed (hereinafter also referred to as the target sequence) is defined.
  • The sequence SEQ includes a video parameter set VPS (Video Parameter Set), a sequence parameter set SPS (Sequence Parameter Set), a picture parameter set PPS (Picture Parameter Set), pictures PICT, and supplemental enhancement information SEI (Supplemental Enhancement Information).
  • The value # indicates the layer ID.
  • FIG. 3 shows an example in which encoded data of #0 and #1, that is, of layer 0 and layer 1, exist, but the types of layers and the number of layers are not limited to this.
  • In the video parameter set VPS, for a moving image composed of a plurality of layers, a set of encoding parameters common to a plurality of moving images, and the sets of encoding parameters related to the plurality of layers included in the moving image and to the individual layers, are defined.
  • The sequence parameter set SPS defines a set of encoding parameters that the image decoding device 31 refers to in order to decode the target sequence; for example, the width and height of the picture are defined.
  • In the picture parameter set PPS, a set of encoding parameters that the image decoding device 31 refers to in order to decode each picture in the target sequence is defined.
  • For example, it includes a reference value of the quantization width used for decoding a picture (pic_init_qp_minus26) and a flag indicating application of weighted prediction (weighted_pred_flag).
  • A plurality of PPSs may exist; in that case, one of the plurality of PPSs is selected for each picture in the target sequence.
  • Picture layer: in the picture layer, a set of data referred to by the image decoding device 31 for decoding a picture PICT to be processed (hereinafter also referred to as the target picture) is defined. As shown in FIG. 3(b), the picture PICT includes slices S0 to SNS-1 (NS is the total number of slices included in the picture PICT).
  • Slice layer: in the slice layer, a set of data referred to by the image decoding device 31 for decoding a slice S to be processed (also referred to as the target slice) is defined. As shown in FIG. 3(c), the slice S includes a slice header SH and slice data SDATA.
  • The slice header SH includes a group of coding parameters that the image decoding device 31 refers to in order to determine the decoding method for the target slice.
  • Slice type designation information (slice_type) designating a slice type is an example of an encoding parameter included in the slice header SH.
  • Slice types that can be designated by the slice type designation information include (1) an I slice using only intra prediction at the time of encoding, (2) a P slice using unidirectional prediction or intra prediction at the time of encoding, and (3) a B slice using unidirectional prediction, bidirectional prediction, or intra prediction at the time of encoding.
  • The slice header SH may include a reference (pic_parameter_set_id) to the picture parameter set PPS included in the sequence layer.
  • Slice data layer: in the slice data layer, a set of data referred to by the image decoding device 31 in order to decode the slice data SDATA to be processed is defined.
  • The slice data SDATA includes coded tree blocks (CTB).
  • A CTB is a block of fixed size (for example, 64×64) constituting a slice, and may also be referred to as a largest coding unit (LCU).
  • Coding tree layer: the coding tree layer defines a set of data referred to by the image decoding device 31 in order to decode the coding tree block to be processed.
  • The coding tree unit is divided by recursive quadtree division.
  • The nodes of the tree structure obtained by the recursive quadtree division are called a coding tree.
  • An intermediate node of the quadtree is a coding tree unit (CTU), and the coding tree block itself is also defined as the topmost CTU.
  • The CTU includes a split flag (split_flag); when split_flag is 1, the CTU is split into four coding tree units CTU.
  • When split_flag is 0, the coding tree unit CTU is not split further and becomes a coding unit (CU: Coded Unit).
  • The coding unit CU is a terminal node of the coding tree layer and is not divided any further in this layer.
  • The coding unit CU is the basic unit of the encoding process.
  • When the size of the coding tree block CTB is 64×64 pixels, the size of the coding unit CU can be any of 64×64, 32×32, 16×16, and 8×8 pixels, as sketched below.
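  • The recursion implied by split_flag can be sketched as follows in C; decode_split_flag() and decode_cu() stand in for the entropy decoder and CU decoding and are assumptions:

    extern int  decode_split_flag(void);                 /* reads split_flag  */
    extern void decode_cu(int x0, int y0, int log2Size); /* decodes a leaf CU */

    /* Recursive quadtree: starting from a 64x64 CTB (log2Size = 6), each
     * node either splits into four quadrants or becomes a CU, down to 8x8. */
    void decodeCodingTree(int x0, int y0, int log2Size)
    {
        if (log2Size > 3 && decode_split_flag()) {
            int half = 1 << (log2Size - 1);
            decodeCodingTree(x0,        y0,        log2Size - 1);
            decodeCodingTree(x0 + half, y0,        log2Size - 1);
            decodeCodingTree(x0,        y0 + half, log2Size - 1);
            decodeCodingTree(x0 + half, y0 + half, log2Size - 1);
        } else {
            decode_cu(x0, y0, log2Size);
        }
    }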
  • Coding unit layer: the coding unit layer defines a set of data referred to by the image decoding device 31 in order to decode the coding unit to be processed.
  • Specifically, the coding unit consists of a CU header CUH, a prediction tree, a transform tree, and a CU header CUF.
  • In the CU header CUH, it is defined whether the coding unit is a unit using intra prediction or a unit using inter prediction.
  • The coding unit also includes a residual prediction index iv_res_pred_weight_idx indicating the weight used for residual prediction (and whether or not residual prediction is performed), and an illumination compensation flag ic_flag indicating whether or not illumination compensation prediction is used.
  • The coding unit is the root of the prediction tree (PT) and the transform tree (TT).
  • The CU header CUF is included between the prediction tree and the transform tree, or after the transform tree.
  • In the prediction tree, the coding unit is divided into one or a plurality of prediction blocks, and the position and size of each prediction block are defined.
  • A prediction block is one or more non-overlapping regions constituting the coding unit.
  • The prediction tree includes the one or more prediction blocks obtained by the above division.
  • Prediction processing is performed for each prediction block.
  • Hereinafter, the prediction block, which is the unit of prediction, is also referred to as a prediction unit (PU). More precisely, since prediction is performed in units of color components, the blocks of the individual color components, such as the luminance prediction block and the chrominance prediction blocks, are referred to as prediction blocks, while the blocks of all the color components (a luminance prediction block and chrominance prediction blocks) are collectively referred to as a prediction unit.
  • A block whose index cIdx (colour_component Idx), indicating the color component type, is 0 represents a luminance block (luminance prediction block; usually denoted L or Y), and blocks whose cIdx is 1 or 2 represent the Cb and Cr chrominance blocks (chrominance prediction blocks), respectively.
  • Intra prediction is prediction within the same picture, and inter prediction refers to prediction processing performed between mutually different pictures (for example, between display times or between layer images).
  • The division method is encoded by the division mode part_mode of the encoded data.
  • Assuming the size of the target CU is 2N×2N pixels, the division modes specified by the division mode part_mode comprise the following eight patterns in total: the four symmetric splittings of 2N×2N pixels, 2N×N pixels, N×2N pixels, and N×N pixels, and the four asymmetric motion partitions (AMP) of 2N×nU pixels, 2N×nD pixels, nL×2N pixels, and nR×2N pixels.
  • Note that N = 2^m (m is an arbitrary integer greater than or equal to 1).
  • Hereinafter, a prediction block whose division mode is an asymmetric division is also referred to as an AMP block. Since the number of divisions is 1, 2, or 4, the CU contains one to four PUs; these PUs are denoted PU0, PU1, PU2, and PU3 in order.
  • FIGS. 4(a) to 4(h) specifically show the positions of the PU partition boundaries within the CU for each division mode.
  • FIG. 4(a) shows the 2N×2N division mode, in which the CU is not divided.
  • FIGS. 4(b) and 4(e) show the partition shapes when the division mode is 2N×N and N×2N, respectively.
  • FIG. 4(h) shows the partition shape when the division mode is N×N.
  • The numbers assigned to the regions are identification numbers, and the regions are processed in the order of these numbers; that is, the identification number represents the scan order of the regions.
  • The specific value of N is defined by the size of the CU to which the PU belongs, and the specific values of nU, nD, nL, and nR are determined according to the value of N.
  • For example, a 32×32-pixel CU can be divided into prediction blocks for inter prediction of 32×32, 32×16, 16×32, 16×16, 32×8, 32×24, 8×32, and 24×32 pixels, as sketched below.
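  • The relation between the eight division modes and the partition sizes can be sketched as follows in C (first partition only; the quarter splits nU = nD = nL = nR = N/2 reproduce the 32×8, 32×24, 8×32, and 24×32 sizes of the example above):

    typedef enum { PART_2Nx2N, PART_2NxN, PART_Nx2N, PART_NxN,
                   PART_2NxnU, PART_2NxnD, PART_nLx2N, PART_nRx2N } PartMode;

    /* Width/height of partition PU0 for a CU whose side is 2N pixels. */
    static void firstPuSize(PartMode m, int side, int *w, int *h)
    {
        *w = *h = side;                              /* PART_2Nx2N default */
        switch (m) {
        case PART_2NxN:  *h = side / 2;      break;
        case PART_Nx2N:  *w = side / 2;      break;
        case PART_NxN:   *w = *h = side / 2; break;
        case PART_2NxnU: *h = side / 4;      break;  /* 32x8  for side 32 */
        case PART_2NxnD: *h = side * 3 / 4;  break;  /* 32x24 */
        case PART_nLx2N: *w = side / 4;      break;  /* 8x32  */
        case PART_nRx2N: *w = side * 3 / 4;  break;  /* 24x32 */
        default:         break;
        }
    }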
  • In the transform tree, the coding unit is divided into one or a plurality of transform blocks, and the position and size of each transform block are defined.
  • A transform block is one or more non-overlapping regions constituting the coding unit.
  • The transform tree includes the one or more transform blocks obtained by the above division.
  • The divisions in the transform tree include one in which a region of the same size as the coding unit is assigned as a transform block, and one by recursive quadtree division, like the division of the tree block described above.
  • A transform block, which is the unit of transformation, is also referred to as a transform unit (TU).
  • The prediction image of a prediction unit is derived from the prediction parameters associated with that prediction unit.
  • The prediction parameters include prediction parameters for intra prediction or prediction parameters for inter prediction.
  • The prediction parameters for inter prediction (inter prediction parameters) will now be described.
  • The inter prediction parameters consist of prediction use flags predFlagL0 and predFlagL1, reference picture indices refIdxL0 and refIdxL1, and vectors mvL0 and mvL1.
  • The prediction use flags predFlagL0 and predFlagL1 indicate whether the reference picture lists called the L0 list and the L1 list are used, respectively; a reference picture list whose flag has the value 1 is used.
  • The prediction use flag information can also be expressed by the inter prediction identifier inter_pred_idc described later. Normally, the prediction use flags are used in the prediction image generation unit and the prediction parameter memory described later, while the inter prediction identifier inter_pred_idc is used when decoding, from the encoded data, the information about which reference picture list is used.
  • Syntax elements for deriving the inter prediction parameters included in the encoded data include, for example, the division mode part_mode, the merge flag merge_flag, the merge index merge_idx, the inter prediction identifier inter_pred_idc, the reference picture index refIdxLX, the prediction vector flag mvp_LX_flag, and the difference vector mvdLX.
  • LX is a notation used when L0 prediction and L1 prediction are not distinguished; by replacing LX with L0 or L1, parameters for the L0 list and parameters for the L1 list are distinguished (the same applies hereinafter).
  • For example, refIdxL0 is the reference picture index used for L0 prediction, refIdxL1 is the reference picture index used for L1 prediction, and refIdx (refIdxLX) is the notation used when refIdxL0 and refIdxL1 are not distinguished (a sketch of these parameters follows).
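  • Collecting the parameters above into one structure gives a compact picture of what an inter-predicted PU carries (a sketch; the field names follow the LX notation expanded to L0/L1):

    /* Inter prediction parameters of one prediction unit. */
    typedef struct {
        int predFlagL0, predFlagL1; /* use of the L0 / L1 reference lists  */
        int refIdxL0,   refIdxL1;   /* indices into RefPicList0 / 1        */
        int mvL0[2],    mvL1[2];    /* vectors: [0] = x component, [1] = y */
    } InterPredParams;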
  • FIG. 5 is a conceptual diagram showing an example of the reference picture list RefPicListX.
  • In the reference picture list RefPicListX, the five rectangles arranged in a horizontal row each represent a reference picture.
  • The codes P1, P2, Q0, P3, and P4, shown in order from the left end to the right, are codes indicating the respective reference pictures.
  • The P of P1 and so on indicates viewpoint P, and the Q of Q0 indicates a viewpoint Q different from viewpoint P.
  • The subscripts of P and Q indicate the picture order count POC.
  • The downward arrow directly below refIdxLX indicates that the reference picture index refIdxLX is the index referring to the reference picture Q0 in the reference picture memory 306.
  • FIG. 6 is a conceptual diagram illustrating examples of reference pictures.
  • In FIG. 6, the horizontal axis indicates display time and the vertical axis indicates viewpoint.
  • The rectangles in FIG. 6, in two rows and three columns (six in total), represent pictures.
  • The rectangle in the second column from the left of the lower row represents the picture to be decoded (the target picture), and the remaining five rectangles represent reference pictures.
  • The reference picture Q0, indicated by the upward arrow from the target picture, is a picture that has the same display time as the target picture but a different viewpoint (view ID); the reference picture Q0 is used in displacement prediction based on the target picture.
  • The reference picture P1, indicated by the left-pointing arrow from the target picture, is a past picture at the same viewpoint as the target picture.
  • The reference picture P2, indicated by the right-pointing arrow from the target picture, is a future picture at the same viewpoint as the target picture; the reference picture P1 or P2 is used in motion prediction based on the target picture.
  • In the following, >> denotes a right shift and << denotes a left shift. As the inter prediction parameters, either the prediction use flags predFlagL0 and predFlagL1 or the inter prediction identifier inter_pred_idc may be used.
  • In the following, a determination using the prediction use flags predFlagL0 and predFlagL1 may be replaced by the inter prediction identifier inter_pred_idc, and conversely a determination using the inter prediction identifier inter_pred_idc may be replaced by the prediction use flags predFlagL0 and predFlagL1 (a conversion sketch follows).
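  • The equivalence between the two representations can be sketched as follows; the numeric values of Pred_L0, Pred_L1, and Pred_BI are assumptions following the usual HEVC convention:

    enum { PRED_L0 = 0, PRED_L1 = 1, PRED_BI = 2 };

    /* (predFlagL0, predFlagL1) -> inter_pred_idc */
    static int toInterPredIdc(int predFlagL0, int predFlagL1)
    {
        if (predFlagL0 && predFlagL1) return PRED_BI;
        return predFlagL1 ? PRED_L1 : PRED_L0;
    }

    /* inter_pred_idc -> (predFlagL0, predFlagL1) */
    static void fromInterPredIdc(int idc, int *predFlagL0, int *predFlagL1)
    {
        *predFlagL0 = (idc == PRED_L0 || idc == PRED_BI);
        *predFlagL1 = (idc == PRED_L1 || idc == PRED_BI);
    }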
  • The prediction parameter decoding (encoding) methods include a merge mode and an AMVP (Adaptive Motion Vector Prediction) mode; the merge flag merge_flag is the flag for distinguishing them.
  • In either mode, the prediction parameters of the target PU are derived using the prediction parameters of already processed blocks.
  • The merge mode is a mode that uses already derived prediction parameters as they are, without including the prediction use flag predFlagLX (inter prediction identifier inter_pred_idc), the reference picture index refIdxLX, or the vector mvLX in the encoded data.
  • The AMVP mode is a mode in which the inter prediction identifier inter_pred_idc, the reference picture index refIdxLX, and the vector mvLX are included in the encoded data.
  • The vector mvLX is encoded as a prediction vector flag mvp_LX_flag indicating a prediction vector and a difference vector mvdLX.
  • The inter prediction identifier inter_pred_idc is data indicating the types and number of reference pictures, and takes one of the values Pred_L0, Pred_L1, and Pred_BI.
  • Pred_L0 and Pred_L1 indicate that the reference pictures stored in the reference picture lists called the L0 list and the L1 list are used, respectively, and both indicate that one reference picture is used (uni-prediction).
  • Prediction using the L0 list and prediction using the L1 list are called L0 prediction and L1 prediction, respectively.
  • Pred_BI indicates that two reference pictures are used (bi-prediction), namely the two reference pictures stored in the L0 list and the L1 list.
  • The prediction vector flag mvp_LX_flag is an index indicating a prediction vector, and the reference picture index refIdxLX is an index indicating a reference picture stored in a reference picture list.
  • The merge index merge_idx is an index indicating which of the prediction parameter candidates (merge candidates) derived from already processed blocks is used as the prediction parameters of the prediction unit (target block).
  • The vector mvLX may be a motion vector or a displacement vector (disparity vector).
  • A motion vector is a vector indicating the positional shift between the position of a block in a picture of a certain layer at a certain display time and the position of the corresponding block in a picture of the same layer at a different display time (for example, an adjacent discrete time).
  • A displacement vector is a vector indicating the positional shift between the position of a block in a picture of a certain layer at a certain display time and the position of the corresponding block in a picture of a different layer at the same display time.
  • The pictures of different layers may be pictures of different viewpoints or pictures of different resolutions; in particular, a displacement vector between pictures of different viewpoints is called a disparity vector.
  • In the following, when motion vectors and displacement vectors are not distinguished, they are simply called vectors mvLX; the prediction vector and difference vector related to the vector mvLX are called the prediction vector mvpLX and the difference vector mvdLX, respectively.
  • Whether the vector mvLX and the difference vector mvdLX are motion vectors or displacement vectors is determined using the reference picture index refIdxLX associated with the vectors (see the sketch below).
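  • One way to express this determination, assuming accessor functions for the POC and view ID of a picture (both accessors are assumptions):

    extern int pocOf(int pic);     /* picture order count of a picture */
    extern int viewIdOf(int pic);  /* view ID of a picture             */

    /* A vector is a displacement (disparity) vector when its reference
     * picture, selected through refIdxLX, has the same POC as the
     * current picture but belongs to another view. */
    static int isDisplacementVector(int curPic, int refPic)
    {
        return pocOf(refPic) == pocOf(curPic)
            && viewIdOf(refPic) != viewIdOf(curPic);
    }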
  • FIG. 7 is a schematic diagram illustrating the configuration of the image decoding device 31 according to the present embodiment.
  • The image decoding device 31 includes an entropy decoding unit 301, a prediction parameter decoding unit 302, a reference picture memory (reference image storage unit, frame memory) 306, a prediction parameter memory (prediction parameter storage unit, frame memory) 307, a prediction image generation unit 308, an inverse quantization / inverse DCT unit 311, an addition unit 312, and a depth DV derivation unit 351 (not shown).
  • The prediction parameter decoding unit 302 includes an inter prediction parameter decoding unit 303 and an intra prediction parameter decoding unit 304.
  • The prediction image generation unit 308 includes an inter prediction image generation unit 309 and an intra prediction image generation unit 310.
  • The entropy decoding unit 301 performs entropy decoding on the encoded stream Te input from the outside, and separates and decodes the individual codes (syntax elements).
  • The separated codes include prediction information for generating a prediction image and residual information for generating a difference image.
  • The entropy decoding unit 301 outputs some of the separated codes to the prediction parameter decoding unit 302.
  • Some of the separated codes are, for example, the prediction mode PredMode, the division mode part_mode, the merge flag merge_flag, the merge index merge_idx, the inter prediction identifier inter_pred_idc, the reference picture index refIdxLX, the prediction vector flag mvp_LX_flag, the difference vector mvdLX, the residual prediction index iv_res_pred_weight_idx, and the illumination compensation flag ic_flag. Which codes to decode is controlled based on instructions from the prediction parameter decoding unit 302.
  • The entropy decoding unit 301 outputs quantization coefficients to the inverse quantization / inverse DCT unit 311.
  • These quantization coefficients are coefficients obtained, in the encoding process, by applying a DCT (Discrete Cosine Transform) to the residual signal and quantizing the result.
  • The entropy decoding unit 301 outputs the depth DV conversion table DepthToDisparityB to the depth DV derivation unit 351.
  • BitDepthY indicates the bit depth of the pixel values of the luminance signal and takes, for example, the value 8 (a sketch of how such a table is typically built follows).
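  • A hedged sketch of how a depth-to-disparity table of this kind is typically built in 3D-HEVC-style decoders from camera-parameter scale, offset, and precision values; these parameters and the exact rounding are assumptions, not taken from the text:

    /* Fill tab[d] with the disparity for depth value d, 0 <= d < 2^bitDepthY.
     * cpScale, cpOff, cpPrec are assumed camera parameters. */
    static void buildDepthToDisparityB(int *tab, int bitDepthY,
                                       int cpScale, int cpOff, int cpPrec)
    {
        int log2Div = bitDepthY - 1 + cpPrec;
        int offset  = (cpOff << bitDepthY) + (1 << (log2Div - 1));
        for (int d = 0; d < (1 << bitDepthY); d++)
            tab[d] = (cpScale * d + offset) >> log2Div;
    }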
  • The prediction parameter decoding unit 302 receives some of the codes from the entropy decoding unit 301 as input.
  • The prediction parameter decoding unit 302 decodes the prediction parameters corresponding to the prediction mode indicated by the prediction mode PredMode, which is one of those codes.
  • The prediction parameter decoding unit 302 outputs the prediction mode PredMode and the decoded prediction parameters to the prediction parameter memory 307 and the prediction image generation unit 308.
  • The inter prediction parameter decoding unit 303 decodes the inter prediction parameters by referring to the prediction parameters stored in the prediction parameter memory 307, based on the codes input from the entropy decoding unit 301.
  • The inter prediction parameter decoding unit 303 outputs the decoded inter prediction parameters to the prediction image generation unit 308 and stores them in the prediction parameter memory 307; details of the inter prediction parameter decoding unit 303 will be described later.
  • The intra prediction parameter decoding unit 304 decodes the intra prediction parameters by referring to the prediction parameters stored in the prediction parameter memory 307, based on the codes input from the entropy decoding unit 301.
  • An intra prediction parameter is a parameter used in the process of predicting a picture block within one picture, for example the intra prediction mode IntraPredMode.
  • The intra prediction parameter decoding unit 304 outputs the decoded intra prediction parameters to the prediction image generation unit 308 and stores them in the prediction parameter memory 307.
  • The reference picture memory 306 stores the decoded picture blocks recSamples generated by the addition unit 312 at the positions of the decoded picture blocks.
  • The prediction parameter memory 307 stores the prediction parameters at predetermined positions for each picture and block to be decoded. Specifically, the prediction parameter memory 307 stores the inter prediction parameters decoded by the inter prediction parameter decoding unit 303, the intra prediction parameters decoded by the intra prediction parameter decoding unit 304, and the prediction mode PredMode separated by the entropy decoding unit 301.
  • The stored inter prediction parameters include, for example, the prediction use flag predFlagLX, the reference picture index refIdxLX, and the vector mvLX.
  • The prediction image generation unit 308 receives the prediction mode PredMode and the prediction parameters from the prediction parameter decoding unit 302, and reads reference pictures from the reference picture memory 306. The prediction image generation unit 308 generates prediction picture blocks predSamples (prediction images) using the input prediction parameters and the read reference pictures, in the prediction mode indicated by the prediction mode PredMode.
  • When the prediction mode PredMode indicates the inter prediction mode, the inter prediction image generation unit 309 generates prediction picture blocks predSamples by inter prediction, using the inter prediction parameters input from the inter prediction parameter decoding unit 303 and the read reference pictures.
  • The prediction picture blocks predSamples correspond to prediction units PU.
  • A PU corresponds to a part of a picture composed of a plurality of pixels that is the unit of the prediction process, as described above, that is, a target block on which the prediction process is performed at one time.
  • The inter prediction image generation unit 309 reads, from the reference picture memory 306, the reference picture block located at the position indicated by the vector mvLX relative to the prediction unit, from the reference picture RefPicListLX[refIdxLX] indicated by the reference picture index refIdxLX.
  • The inter prediction image generation unit 309 performs motion compensation on the read reference picture block to generate the prediction picture blocks predSamplesLX.
  • The inter prediction image generation unit 309 further generates the prediction picture blocks predSamples by weighted prediction from the prediction picture blocks predSamplesL0 and predSamplesL1 derived from the reference pictures in the respective reference picture lists, and outputs them to the addition unit 312.
  • When the prediction mode PredMode indicates the intra prediction mode, the intra prediction image generation unit 310 performs intra prediction using the intra prediction parameters input from the intra prediction parameter decoding unit 304 and the read reference picture. Specifically, the intra prediction image generation unit 310 reads from the reference picture memory 306 a reference picture block that belongs to the picture being decoded and lies within a predetermined range from the prediction unit, among the blocks already processed.
  • The predetermined range is, for example, the range of the adjacent blocks to the left, upper left, above, and upper right, and differs depending on the intra prediction mode.
  • The intra prediction image generation unit 310 performs prediction on the read reference picture block in the prediction mode indicated by the intra prediction mode IntraPredMode, generates the prediction picture blocks predSamples, and outputs them to the addition unit 312.
  • The inverse quantization / inverse DCT unit 311 inversely quantizes the quantization coefficients input from the entropy decoding unit 301 to obtain DCT coefficients.
  • The inverse quantization / inverse DCT unit 311 performs an inverse DCT (Inverse Discrete Cosine Transform) on the obtained DCT coefficients to compute the decoded residual signal.
  • The inverse quantization / inverse DCT unit 311 outputs the computed decoded residual signal to the addition unit 312.
  • The addition unit 312 adds, for each pixel, the prediction picture blocks predSamples input from the inter prediction image generation unit 309 or the intra prediction image generation unit 310 and the signal values resSamples of the decoded residual signal input from the inverse quantization / inverse DCT unit 311, and generates the decoded picture blocks recSamples.
  • The addition unit 312 outputs the generated decoded picture blocks recSamples to the reference picture memory 306.
  • The decoded picture blocks are integrated into each picture.
  • A loop filter, such as a deblocking filter or an adaptive offset filter, is applied to the decoded picture.
  • The decoded picture is output to the outside as the decoded layer image Td.
  • FIG. 8 is a schematic diagram illustrating the configuration of the inter prediction parameter decoding unit 303 according to the present embodiment.
  • The inter prediction parameter decoding unit 303 includes an inter prediction parameter decoding control unit 3031, an AMVP prediction parameter derivation unit 3032, an addition unit 3035, a merge mode parameter derivation unit 3036, and a displacement derivation unit 30363.
  • The inter prediction parameter decoding control unit 3031 instructs the entropy decoding unit 301 to decode the codes (syntax elements) related to inter prediction, and extracts from the encoded data, for example, the division mode part_mode, the merge flag merge_flag, the merge index merge_idx, the inter prediction identifier inter_pred_idc, the reference picture index refIdxLX, the prediction vector flag mvp_LX_flag, the difference vector mvdLX, the residual prediction index iv_res_pred_weight_idx, the illumination compensation flag ic_flag, and the DBBP flag dbbp_flag.
  • When it is stated that the inter prediction parameter decoding control unit 3031 extracts a certain syntax element, this means that it instructs the entropy decoding unit 301 to decode that syntax element and that the corresponding syntax element is read out of the encoded data.
  • The inter prediction parameter decoding control unit 3031 extracts the merge index merge_idx from the encoded data when the merge flag merge_flag is 1, that is, when the prediction unit is in merge mode.
  • The inter prediction parameter decoding control unit 3031 outputs the extracted residual prediction index iv_res_pred_weight_idx, illumination compensation flag ic_flag, and merge index merge_idx to the merge mode parameter derivation unit 3036.
  • The inter prediction parameter decoding control unit 3031 uses the entropy decoding unit 301 to extract the inter prediction identifier inter_pred_idc, the reference picture index refIdxLX, the prediction vector flag mvp_LX_flag, and the difference vector mvdLX from the encoded data.
  • The inter prediction parameter decoding control unit 3031 outputs the prediction use flag predFlagLX derived from the extracted inter prediction identifier inter_pred_idc, together with the reference picture index refIdxLX, to the AMVP prediction parameter derivation unit 3032 and the prediction image generation unit 308, and also stores them in the prediction parameter memory 307.
  • The inter prediction parameter decoding control unit 3031 outputs the extracted prediction vector flag mvp_LX_flag to the AMVP prediction parameter derivation unit 3032, and outputs the extracted difference vector mvdLX to the addition unit 3035.
  • The inter prediction parameter decoding control unit 3031 decodes the DBBP flag dbbp_flag from the encoded data when the division mode PartMode has a specific value. In other cases, when dbbp_flag is not included in the encoded data, dbbp_flag is inferred to be 0.
  • FIG. 20 is a syntax table relating to the DBBP flag dbbp_flag of this embodiment.
  • The inter prediction parameter decoding control unit 3031 decodes cu_skip_flag, pred_mode, part_mode, and dbbp_flag, shown at SE1001 to SE1004 in the figure.
  • cu_skip_flag is a flag indicating whether or not the target CU is skipped.
  • When the target CU is skipped, PartMode is limited to 2N×2N and decoding of the division mode part_mode is omitted.
  • Otherwise, the division mode part_mode decoded from the encoded data is set as the division mode PartMode.
  • The inter prediction parameter decoding control unit 3031 outputs the displacement vector (NBDV) derived during the derivation of the inter prediction parameters, and the VSP mode flag VspModeFlag, a flag indicating whether viewpoint synthesis prediction is performed, to the inter prediction image generation unit 309.
  • FIG. 9 is a schematic diagram illustrating the configuration of the merge mode parameter derivation unit 3036 according to the present embodiment.
  • The merge mode parameter derivation unit 3036 includes a merge candidate derivation unit 30361, a merge candidate selection unit 30362, and a bi-prediction restriction unit 30363.
  • The merge candidate derivation unit 30361 includes a merge candidate storage unit 303611, an extended merge candidate derivation unit 30370, and a basic merge candidate derivation unit 30380.
  • The merge candidate storage unit 303611 stores the merge candidates input from the extended merge candidate derivation unit 30370 and the basic merge candidate derivation unit 30380 in a merge candidate list mergeCandList.
  • A merge candidate includes the prediction use flag predFlagLX, the vector mvLX, the reference picture index refIdxLX, the VSP mode flag VspModeFlag, the displacement vector MvDisp, and the layer ID RefViewIdx.
  • Indices are assigned to the merge candidates stored in the merge candidate list mergeCandList according to a predetermined rule.
  • FIG. 11 shows an example of the merge candidate list mergeCandList derived by the merge candidate derivation unit 30361.
  • The names in parentheses are nicknames of the merge candidates and, in the case of spatial merge candidates, correspond to the positions of the reference blocks used for derivation; a combined merge candidate and a zero merge candidate follow these, but are omitted in FIG. 11.
  • Of these merge candidates, the spatial merge candidates, the temporal merge candidate, the combined merge candidate, and the zero merge candidate are derived by the basic merge candidate derivation unit 30380.
  • The derivation order of the merge candidates is, for example: texture merge candidate (T), inter-view merge candidate (IvMC), spatial merge candidate (A1), spatial merge candidate (B1), spatial merge candidate (B0), displacement merge candidate (IvDC), VSP merge candidate (VSP), spatial merge candidate (A0), spatial merge candidate (B2), motion shift merge candidate (IvMCShift), displacement shift merge candidate (IvDCShift), and temporal merge candidate (Col); see the construction sketch below.
  • The texture merge candidate (T), inter-view merge candidate (IvMC), displacement merge candidate (IvDC), VSP merge candidate (VSP), motion shift merge candidate (IvMCShift), and displacement shift merge candidate (IvDCShift) are derived by the extended merge candidate derivation unit 30370.
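  • The ordering above, written out as a construction loop in C (the availability flags and the list representation are assumptions):

    enum Cand { T, IvMC, A1, B1, B0, IvDC, VSP, A0, B2,
                IvMCShift, IvDCShift, Col, NUM_CAND };

    /* Append the available candidates to mergeCandList in the fixed
     * derivation order; combined and zero merge candidates follow later. */
    static int buildMergeCandList(enum Cand *mergeCandList,
                                  const int available[NUM_CAND])
    {
        static const enum Cand order[NUM_CAND] =
            { T, IvMC, A1, B1, B0, IvDC, VSP, A0, B2,
              IvMCShift, IvDCShift, Col };
        int n = 0;
        for (int i = 0; i < NUM_CAND; i++)
            if (available[order[i]])
                mergeCandList[n++] = order[i];
        return n;  /* number of candidates stored */
    }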
  • FIG. 12 is a diagram showing the positions of the adjacent blocks referred to by the spatial merge candidates.
  • A0, A1, B0, B1, and B2 each correspond to the positions shown in FIG. 12, and the coordinates of these adjacent blocks are as follows.
  • The extended merge candidate derivation unit 30370 includes an inter-layer merge candidate derivation unit 30371 (inter-view merge candidate derivation unit 30371), a displacement merge candidate derivation unit 30373, and a VSP merge candidate derivation unit 30374 (VSP prediction unit 30374).
  • An extended merge candidate is a merge candidate different from the basic merge candidates described later, and includes at least one of the texture merge candidate (T), the inter-view merge candidate (IvMC), the displacement merge candidate (IvDC), the VSP merge candidate (VSP), the motion shift merge candidate (IvMCShift), and the displacement shift merge candidate (IvDCShift).
  • the inter-layer merge candidate derivation unit 30371 derives a texture merge candidate (T), an inter-view merge candidate (IvMC), and a motion shift merge candidate (IvMCShift). For these merge candidates, a block corresponding to a prediction unit is selected from reference pictures of different layers (for example, a base layer and a base view) having the same POC as the target picture, and a prediction parameter that is a motion vector included in the block is selected as a prediction parameter. It is derived by reading from the memory 307.
  • the texture merge candidate (T) is derived by the inter-layer merge candidate deriving unit 30371 when the target picture is depth.
  • the texture merge candidate (T) is derived by specifying a reference block in the texture picture having the same POC and the same view ID as the target (depth) picture and reading the motion vector of that reference block.
  • the coordinates (xRef, yRef) of the reference block are derived from the following equations, where (xPb, yPb) are the upper-left coordinates of the prediction unit and nPbW and nPbH are its width and height.
  • with the motion vector of the reference block denoted textMvLX, the motion vector mvLXT of the texture merge candidate is then derived by the following formula (see the sketch after this passage).
  • prediction parameters may be assigned in units of sub-blocks obtained by further dividing the prediction unit.
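  • Since the equations for (xRef, yRef) and mvLXT are elided in the text above, the following C++ sketch shows one plausible reconstruction. The choice of the prediction-unit center as the reference position and the rounding of textMvLX to full-sample precision are assumptions, not the author's confirmed formulas:

      // Sketch of the texture merge candidate (T) derivation under the
      // assumptions stated in the lead-in.
      void deriveTextureMergeCand(int xPb, int yPb, int nPbW, int nPbH,
                                  const int textMvLX[2], int mvLXT[2]) {
          // Assumed reference position: center of the prediction unit.
          int xRef = xPb + (nPbW >> 1);
          int yRef = yPb + (nPbH >> 1);
          (void)xRef; (void)yRef; // would index the texture motion field at [xRef][yRef]
          // Assumed quantization of the quarter-pel texture vector to full samples.
          mvLXT[0] = ((textMvLX[0] + 2) >> 2) << 2;
          mvLXT[1] = ((textMvLX[1] + 2) >> 2) << 2;
      }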
  • the inter-view merge candidate (IvMC) is derived in the inter-layer merge candidate derivation unit 30371 by reading prediction parameters, such as motion vectors, from the reference block of the reference picture ivRefPic that has the same POC as the target picture and a different view ID (refViewIdx), specified by the displacement vector derivation unit 352 described later. This process is called the temporal inter-view motion candidate derivation process.
  • the inter-layer merge candidate derivation unit 30371 first derives the reference coordinates (xRef, yRef) from the upper-left coordinates (xPb, yPb) of the block, the block width and height nPbW and nPbH, and the displacement vector (mvDisp[0], mvDisp[1]) input from the displacement vector derivation unit 352, using the following equations.
  • xRefFull = xPb + (nPbW >> 1) + ((mvDisp[0] + 2) >> 2)
  • yRefFull = yPb + (nPbH >> 1) + ((mvDisp[1] + 2) >> 2)
  • xRef = Clip3(0, PicWidthInSamplesL - 1, (xRefFull >> 3) << 3)
  • yRef = Clip3(0, PicHeightInSamplesL - 1, (yRefFull >> 3) << 3)
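  • The following C++ sketch transcribes the four equations above directly; the function name is hypothetical:

      #include <algorithm>

      // Clip3(lo, hi, v) as used throughout this document.
      static int Clip3(int lo, int hi, int v) { return std::min(std::max(v, lo), hi); }

      // Reference-block position for the inter-view merge candidate: the PU
      // center, shifted by the (quarter-pel) displacement vector, clipped to
      // the picture, and aligned to the 8-sample grid of the motion field.
      void deriveIvMcRefPos(int xPb, int yPb, int nPbW, int nPbH,
                            const int mvDisp[2],
                            int picWidthL, int picHeightL,
                            int &xRef, int &yRef) {
          int xRefFull = xPb + (nPbW >> 1) + ((mvDisp[0] + 2) >> 2);
          int yRefFull = yPb + (nPbH >> 1) + ((mvDisp[1] + 2) >> 2);
          xRef = Clip3(0, picWidthL - 1, (xRefFull >> 3) << 3);
          yRef = Clip3(0, picHeightL - 1, (yRefFull >> 3) << 3);
      }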
  • the inter-layer merge candidate derivation unit 30371 performs the temporal inter-view motion candidate derivation process in a temporal inter-view motion candidate derivation unit 303711 (not shown).
  • the temporal inter-view motion candidate derivation unit 303711 derives the reference block position (xRef, yRef) from the block coordinates (xPb, yPb), the block width and height nPbW and nPbH, and the block displacement vector mvDisp by the above process, and then derives the vector of the temporal inter-view motion candidate by referring to the vector of the prediction unit located at the reference block position (xRef, yRef) on the reference picture ivRefPic.
  • (xIvRefPb, yIvRefPb) is set to the upper-left coordinate of the prediction unit (luminance prediction block) on the reference picture ivRefPic that contains the reference block position (xRef, yRef). The reference picture list, prediction list flag, vector, and reference picture index of the prediction units on the reference picture ivRefPic are denoted refPicListLYIvRef, predFlagLYIvRef[x][y], mvLYIvRef[x][y], and refIdxLYIvRef[x][y], respectively.
  • the temporal inter-view motion candidate derivation unit 303711 determines, for each index i from 0 to the number of reference picture list elements minus 1 (num_ref_idx_lX_active_minus1), whether the POC PicOrderCnt(refPicListLYIvRef[refIdxLYIvRef[xIvRefPb][yIvRefPb]]) of the reference picture of the prediction unit on the reference picture ivRefPic is equal to PicOrderCnt(RefPicListLX[i]) of a reference picture of the target prediction unit.
  • when the reference picture referenced by the target prediction unit and the reference picture referenced by the prediction unit on the reference picture ivRefPic are the same, the temporal inter-view motion candidate derivation unit 303711 derives the vector mvLXInterView and the reference picture index refIdxLX using the prediction parameters of the prediction unit on the reference picture ivRefPic.
  • the prediction parameter may be assigned in units of sub-blocks obtained by further dividing the prediction unit.
  • with the width and height of the prediction unit denoted nPbW and nPbH and the minimum sub-block size denoted SubPbSize, the sub-block width nSbW and height nSbH are derived by the following equations.
  • (xBlk, yBlk) are the coordinates of a sub-block relative to the prediction unit (coordinates based on the upper-left coordinate of the prediction unit); xBlk takes integer values from 0 to (nPbW / nSbW - 1) and yBlk from 0 to (nPbH / nSbH - 1). With the prediction unit coordinates (xPb, yPb) and the relative sub-block coordinates (xBlk, yBlk), the in-picture coordinates of the sub-block are expressed as (xPb + xBlk * nSbW, yPb + yBlk * nSbH).
  • the temporal inter-view motion candidate derivation process is then performed in sub-block units by inputting the in-picture coordinates (xPb + xBlk * nSbW, yPb + yBlk * nSbH) of the sub-block and the sub-block width nSbW and height nSbH to the temporal inter-view motion candidate derivation unit 303711 as (xPb, yPb), nPbW, and nPbH.
  • for a sub-block whose availability flag availableFlagLXInterView is 0, the temporal inter-view motion candidate derivation unit 303711 derives the sub-block vector spMvLX, reference picture index spRefIdxLX, and prediction usage flag spPredFlagLX from the inter-view merge candidate vector mvLXInterView, reference picture index refIdxLXInterView, and prediction usage flag availableFlagLXInterView by the following equations.
  • xBlk and yBlk are sub-block addresses and take values from 0 to (nPbW / nSbW-1) and from 0 to (nPbH / nSbH-1), respectively.
  • here, the vector mvLXInterView, the reference picture index refIdxLXInterView, and the prediction usage flag availableFlagLXInterView are obtained by performing the temporal inter-view motion candidate derivation process with (xPb + (nPbW / nSbW / 2) * nSbW, yPb + (nPbH / nSbH / 2) * nSbH), that is, the center sub-block of the prediction unit, as the reference block coordinates.
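  • The following C++ sketch illustrates the sub-block loop just described. The nSbW/nSbH equations are elided in the text, so the clamping form shown is an assumption; the per-block derivation is stubbed out:

      // Stub standing in for the per-block temporal inter-view motion candidate
      // derivation described above; returns false when no candidate is found.
      static bool deriveTemporalIvCand(int x, int y, int w, int h) {
          (void)x; (void)y; (void)w; (void)h;
          return false;
      }

      // Split the PU into sub-blocks; sub-blocks without their own candidate
      // fall back to the PU-center candidate (mvLXInterView et al.).
      void deriveSubPbCandidates(int xPb, int yPb, int nPbW, int nPbH, int subPbSize) {
          int nSbW = (nPbW / subPbSize <= 1) ? nPbW : subPbSize; // assumed form
          int nSbH = (nPbH / subPbSize <= 1) ? nPbH : subPbSize;
          for (int yBlk = 0; yBlk < nPbH / nSbH; ++yBlk) {
              for (int xBlk = 0; xBlk < nPbW / nSbW; ++xBlk) {
                  bool availableFlagLXSub = deriveTemporalIvCand(
                      xPb + xBlk * nSbW, yPb + yBlk * nSbH, nSbW, nSbH);
                  if (!availableFlagLXSub) {
                      // fall back to the PU-center candidate:
                      // spMvLX[xBlk][yBlk]       = mvLXInterView;
                      // spRefIdxLX[xBlk][yBlk]   = refIdxLXInterView;
                      // spPredFlagLX[xBlk][yBlk] = availableFlagLXInterView;
                  }
              }
          }
      }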
  • the motion shift merge candidate (IvMCShift) is likewise derived in the inter-layer merge candidate derivation unit 30371 by reading prediction parameters, such as motion vectors, from a reference block of a picture that has the same POC as the target picture and a different view ID, specified by the displacement vector derivation unit 352.
  • the coordinates (xRef, yRef) of the reference block are derived from the following equations, where (xPb, yPb) are the upper-left coordinates of the prediction unit, nPbW and nPbH are its width and height, and (mvDisp[0], mvDisp[1]) is the displacement vector derived by the displacement vector derivation unit 352.
  • xRefFull = xPb + (nPbW >> 1) + ((mvDisp[0] + nPbW * 2 + 4 + 2) >> 2)
  • yRefFull = yPb + (nPbH >> 1) + ((mvDisp[1] + nPbH * 2 + 4 + 2) >> 2)
  • xRef = Clip3(0, PicWidthInSamplesL - 1, (xRefFull >> 3) << 3)
  • yRef = Clip3(0, PicHeightInSamplesL - 1, (yRefFull >> 3) << 3)
  • (Displacement merge candidate)
  • the displacement merge candidate derivation unit 30373 derives a displacement merge candidate (IvDC) and a shift displacement merge candidate (IvDcShift) from the displacement vector input from the displacement vector derivation unit 352.
  • the displacement merge candidate derivation unit 30373 generates, as the displacement merge candidate (IvDC), a vector whose horizontal component is the horizontal component mvDisp[0] of the input displacement vector (mvDisp[0], mvDisp[1]) and whose vertical component is 0, using the following equation.
  • DepthFlag is a variable that becomes 1 in the case of depth.
  • the displacement merge candidate derivation unit 30373 outputs the generated vector and the reference picture index refIdxLX of the reference layer image pointed to by the displacement vector (for example, the index of the base layer image having the same POC as the decoding target picture) to the merge candidate storage unit 303611.
  • the displacement merge candidate derivation unit 30373 also derives, as the shift displacement merge candidate (IvDCShift), a merge candidate whose vector is obtained by shifting the displacement merge candidate in the horizontal direction by the following equation.
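  • The following C++ sketch illustrates the IvDC/IvDCShift derivation. The equations themselves are elided above, so the full-sample rounding under DepthFlag and the +4 horizontal offset for IvDCShift (one luma sample in quarter-pel units) are assumptions:

      // Sketch of the displacement merge candidate (IvDC) and its shifted
      // variant (IvDCShift) under the assumptions stated in the lead-in.
      void deriveIvDcCands(const int mvDisp[2], bool depthFlag,
                           int mvIvDC[2], int mvIvDCShift[2]) {
          // Assumed: round the horizontal component to full-sample precision
          // when the target picture is depth (DepthFlag == 1).
          mvIvDC[0] = depthFlag ? ((mvDisp[0] + 2) >> 2) << 2 : mvDisp[0];
          mvIvDC[1] = 0;                  // vertical component forced to 0
          mvIvDCShift[0] = mvIvDC[0] + 4; // assumed horizontal shift
          mvIvDCShift[1] = mvIvDC[1];
      }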
  • (VSP merge candidate)
  • the VSP merge candidate derivation unit 30374 (hereinafter, VSP prediction unit 30374) derives a VSP (View Synthesis Prediction) merge candidate.
  • the VSP prediction unit 30374 divides the prediction unit into a plurality of sub-blocks (sub-prediction units), and sets the vector mvLX, the reference picture index refIdxLX, and the view ID RefViewIdx for each divided sub-block.
  • the VSP prediction unit 30374 outputs the derived VSP merge candidate to the merge candidate storage unit 303611.
  • FIG. 14 is a block diagram showing the relationship between the VSP prediction unit 30374 and other means.
  • the VSP prediction unit 30374 operates using the split flag horSplitFlag derived by the split flag deriving unit 353 and the displacement vector disparitySamples derived by the depth DV deriving unit 351.
  • the partition division unit (not shown) of the VSP prediction unit 30374 determines the sub-block size by selecting either a horizontally long rectangle (here, 8x4) or a vertically long rectangle (here, 4x8) according to the split flag horSplitFlag derived by the split flag deriving unit 353. Specifically, the sub-block width nSubBlkW and height nSubBlkH are set using the following equations.
  • the depth vector derivation unit (not shown) of the VSP prediction unit 30374 then derives, for each sub-block of the derived sub-block size, a vector mvLX[] whose horizontal component mvLX[0] is the displacement disparitySamples[] derived by the depth DV derivation unit 351 and whose vertical component mvLX[1] is 0, and thereby derives the prediction parameters of the VSP merge candidate.
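  • The following C++ sketch illustrates this sub-block assignment; treating disparitySamples as a per-sub-block array is a simplification (the text defines it per pixel), and the function name is hypothetical:

      #include <array>
      #include <vector>

      // VSP sub-block assignment: sub-block shape follows horSplitFlag (8x4
      // when set, 4x8 otherwise); each sub-block gets a purely horizontal
      // vector taken from the depth-derived disparity array.
      void assignVspSubBlocks(int nPbW, int nPbH, bool horSplitFlag,
                              const std::vector<std::vector<int>>& disparitySamples,
                              std::vector<std::vector<std::array<int, 2>>>& mvLX) {
          int nSubBlkW = horSplitFlag ? 8 : 4;
          int nSubBlkH = horSplitFlag ? 4 : 8;
          mvLX.assign(nPbW / nSubBlkW,
                      std::vector<std::array<int, 2>>(nPbH / nSubBlkH));
          for (int xBlk = 0; xBlk < nPbW / nSubBlkW; ++xBlk)
              for (int yBlk = 0; yBlk < nPbH / nSubBlkH; ++yBlk) {
                  mvLX[xBlk][yBlk][0] = disparitySamples[xBlk][yBlk]; // horizontal
                  mvLX[xBlk][yBlk][1] = 0;                            // vertical is 0
              }
      }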
  • the VSP prediction unit 30374 may control whether the VSP merge candidate is added to the merge candidate list mergeCandList according to the residual prediction index iv_res_pred_weight_idx and the illumination compensation flag ic_flag input from the inter prediction parameter decoding control unit 3031. Specifically, the VSP prediction unit 30374 may add the VSP merge candidate to the merge candidate list mergeCandList only when the residual prediction index iv_res_pred_weight_idx is 0 and the illumination compensation flag ic_flag is 0.
  • the basic merge candidate derivation unit 30380 includes a spatial merge candidate derivation unit 30381, a temporal merge candidate derivation unit 30382, a combined merge candidate derivation unit 30383, and a zero merge candidate derivation unit 30384.
  • the basic merge candidates are the merge candidates used in the base layer, that is, the merge candidates used in (non-scalable) HEVC (for example, the HEVC main profile), and include at least one of a spatial merge candidate and a temporal merge candidate.
  • the spatial merge candidate derivation unit 30381 reads the prediction parameters (prediction usage flag predFlagLX, vector mvLX, reference picture index refIdxLX) stored in the prediction parameter memory 307 according to a predetermined rule and derives them as spatial merge candidates. The prediction parameters read are those of the adjacent blocks, which are blocks within a predetermined range of the prediction unit (for example, all or some of the blocks adjoining the lower-left, upper-left, and upper-right corners of the prediction unit).
  • the derived spatial merge candidate is stored in the merge candidate storage unit 303611.
  • the spatial merge candidate derivation unit 30381 sets a merge candidate VSP mode flag mergeCandIsVspFlag that is derived by inheriting the VSP mode flag VspModeFlag of the adjacent block. That is, when the VSP mode flag VspModeFlag of the adjacent block is 1, the VSP mode flag mergeCandIsVspFlag of the corresponding spatial merge candidate is 1, and otherwise, the VSP mode flag mergeCandIsVspFlag is 0.
  • the VSP mode flag VspModeFlag is set to 0 for merge candidates derived by the temporal merge candidate derivation unit 30382, the combined merge candidate derivation unit 30383, and the zero merge candidate derivation unit 30384.
  • the temporal merge candidate derivation unit 30382 reads the prediction parameters of the block in the reference image that contains the lower-right coordinates of the prediction unit from the prediction parameter memory 307 and sets them as a merge candidate.
  • the reference image can be specified, for example, by the collocated picture index col_ref_idx signaled in the slice header, that is, by the reference picture index refIdxLX designating RefPicListX[col_ref_idx] in the reference picture list RefPicListX. The derived merge candidate is stored in the merge candidate storage unit 303611.
  • the combined merge candidate derivation unit 30383 derives combined merge candidates by combining the vectors and reference picture indexes of two different derived merge candidates, already derived and stored in the merge candidate storage unit 303611, as the L0 and L1 vectors, respectively.
  • the derived merge candidates are stored in the merge candidate storage unit 303611.
  • the zero merge candidate derivation unit 30384 derives merge candidates whose reference picture index refIdxLX is i and whose X and Y components of the vector mvLX are both 0, until the number of derived merge candidates reaches the maximum value.
  • the value i indicating the reference picture index refIdxLX is assigned in order starting from 0.
  • the derived merge candidates are stored in the merge candidate storage unit 303611.
  • the merge candidate selection unit 30362 selects, from the merge candidates stored in the merge candidate storage unit 303611, the merge candidate to which the index corresponding to the merge index merge_idx input from the inter prediction parameter decoding control unit 3031 is assigned, as the inter prediction parameter of the target PU. That is, with the merge candidate list denoted mergeCandList, the prediction parameters indicated by mergeCandList[merge_idx] are selected and output to the bi-prediction restriction unit 30363.
  • the merge candidate selection unit 30362 sets the sub-block motion compensation flag subPbMotionFlag to 1 when an inter-view merge candidate is selected as a merge candidate.
  • the merge candidate selection unit 30362 may also set the sub-block motion compensation flag subPbMotionFlag to 1 when the VSP mode flag vspModeFlag of the merge candidate is 1. In other cases, the sub-block motion compensation flag subPbMotionFlag is set to 0.
  • the bi-prediction restriction unit 30363 stores the selected merge candidate in the prediction parameter memory 307 and outputs it to the prediction image generation unit 308.
  • FIG. 10 is a schematic diagram illustrating a configuration of the AMVP prediction parameter derivation unit 3032 according to the present embodiment.
  • the AMVP prediction parameter derivation unit 3032 includes a vector candidate derivation unit 3033, a prediction vector selection unit 3034, and an addition unit 3035.
  • the vector candidate derivation unit 3033 reads a vector stored in the prediction parameter memory 307 based on the reference picture index refIdx, and generates a vector candidate list mvpListLX.
  • the reference block is a block (for example, a block at the lower left end, an upper right end, or a temporally adjacent block of the prediction unit) at a predetermined position based on the position of the prediction unit.
  • the prediction vector selection unit 3034 selects, as the prediction vector mvpLX, the vector mvpListLX[mvp_LX_flag] indicated by the prediction vector flag mvp_LX_flag input from the inter prediction parameter decoding control unit 3031 from among the vector candidates mvpListLX derived by the vector candidate derivation unit 3033.
  • the prediction vector selection unit 3034 outputs the selected prediction vector mvpLX to the addition unit 3035.
  • the addition unit 3035 adds the prediction vector mvpLX input from the prediction vector selection unit 3034 and the difference vector mvdLX input from the inter prediction parameter decoding control unit to calculate a vector mvLX.
  • the adding unit 3035 outputs the calculated vector mvLX to the predicted image generation unit 308.
  • FIG. 15 is a block diagram illustrating a configuration of the inter prediction parameter decoding control unit 3031 according to the embodiment of this invention.
  • the inter prediction parameter decoding control unit 3031 includes a split mode decoding unit 30311, an inter prediction identifier decoding unit 30312, a DBBP flag decoding unit 30313, and, not illustrated, a merge flag decoding unit, a merge index decoding unit, a reference picture index decoding unit, a vector candidate index decoding unit, a vector difference decoding unit, a residual prediction index decoding unit, and an illuminance compensation flag decoding unit.
  • the split mode decoding unit, the merge flag decoding unit, the merge index decoding unit, the inter prediction identifier decoding unit, the reference picture index decoding unit, the vector candidate index decoding unit, and the vector difference decoding unit decode the split mode part_mode, the merge flag merge_flag, the merge index merge_idx, the inter prediction identifier inter_pred_idc, the reference picture index refIdxLX, the prediction vector flag mvp_LX_flag, and the difference vector mvdLX, respectively.
  • the inter prediction identifier decoding unit decodes, for each prediction unit, the inter prediction identifier inter_pred_idc indicating L0 prediction (PRED_L0), L1 prediction (PRED_L1), or bi-prediction (PRED_BI).
  • the residual prediction index decoding unit uses the entropy decoding unit 301 to decode the residual prediction index iv_res_pred_weight_idx from the encoded data when the split mode PartMode (part_mode) of the coding unit CU is 2Nx2N; in other cases, it sets (infers) iv_res_pred_weight_idx to 0. The residual prediction index decoding unit outputs the decoded residual prediction index iv_res_pred_weight_idx to the merge mode parameter derivation unit 3036 and the inter prediction image generation unit 309.
  • the residual prediction index is a parameter for changing the operation of residual prediction; it is an index indicating the weight of residual prediction and takes the values 0, 1, and 2. When iv_res_pred_weight_idx is 0, residual prediction is not performed.
  • the vector used for residual prediction may be changed instead of changing the weight of residual prediction according to the index.
  • a flag (residual prediction flag) indicating whether to perform residual prediction may be used.
  • the illuminance compensation flag decoding unit uses the entropy decoding unit 301 to decode the illuminance compensation flag ic_flag from the encoded data when the split mode PartMode is 2Nx2N; in other cases, it sets (infers) ic_flag to 0. The illuminance compensation flag decoding unit outputs the decoded illuminance compensation flag ic_flag to the merge mode parameter derivation unit 3036 and the inter predicted image generation unit 309.
  • the displacement vector deriving unit 352, the division flag deriving unit 353, and the depth DV deriving unit 351 which are means used for deriving the prediction parameters, will be described in order.
  • the displacement vector deriving unit 352 derives the displacement vector (hereinafter MvDisp[x][y] or mvDisp[x][y]) of the coding unit (target CU) to which the target PU belongs from blocks spatially or temporally adjacent to the coding unit. Specifically, the block Col temporally adjacent to the target CU, a second temporally adjacent block AltCol, the block A1 spatially adjacent on the left, and the block B1 adjacent above are used as reference blocks, and the prediction usage flag predFlagLX, reference picture index refIdxLX, and vector mvLX of each reference block are read in order.
  • if the extracted vector mvLX is a displacement vector, it is output as the displacement vector of the adjacent block. If the prediction parameters of an adjacent block contain no displacement vector, the prediction parameters of the next adjacent block are read and a displacement vector is derived in the same manner. If no displacement vector can be derived from any adjacent block, a zero vector is output as the displacement vector.
  • the displacement vector deriving unit 352 also outputs the reference picture index and the view ID (RefViewIdx[xP][yP], where (xP, yP) are the coordinates of the target block) of the block from which the displacement vector was derived.
  • the displacement vector obtained in this way is called the NBDV (Neighbour Based Disparity Vector).
  • the displacement vector deriving unit 352 further outputs the obtained displacement vector NBDV to the depth DV deriving unit 351.
  • the depth DV deriving unit 351 derives depth-derived displacement vectors (displacement array disparitySamples).
  • the depth DV deriving unit 351 updates (refines) the displacement vector by using the displacement disparitySamples obtained from the depth as the horizontal component mvLX[0] of the vector.
  • the updated displacement vector is called the DoNBDV (Depth Oriented Neighbour Based Disparity Vector).
  • the displacement vector deriving unit 352 outputs the displacement vector (DoNBDV) to the inter-layer merge candidate deriving unit 30371, the displacement merge candidate deriving unit, and the viewpoint synthesis prediction merge candidate deriving unit. Further, the obtained displacement vector (NBDV) is output to the inter predicted image generation unit 309.
  • the VSP prediction unit 30374 derives the coordinates (xTL, yTL) obtained by shifting the target block by the displacement vector MvDisp, both in the partition division of the split flag deriving unit 353 and in the derivation of the displacement vector array disparitySamples by the depth DV deriving unit 351, and refers to points on the depth block at coordinates (xTL, yTL).
  • the DBBP prediction unit 3095 likewise derives the coordinates (xTL, yTL) obtained by shifting the target block by the displacement vector MvDisp in the segmentation unit 30952 and the DBBP division mode deriving unit 30954, and refers to points on the depth block at coordinates (xTL, yTL).
  • the VSP prediction unit 30374 derives the coordinates (xTL, yTL) of the depth block using, as the displacement vector mvDisp, the displacement vector NBDV that has not been updated (refined) by depth reference, and the depth DV deriving unit 351 derives the displacement vector of each sub-block.
  • here, no depth reference is needed to derive the displacement vector mvDisp; the depth is referenced only when the displacement vectors of the sub-blocks are derived, so the depth reference need be performed only once.
  • the DBBP prediction unit 3095, in contrast, derives the coordinates (xTL, yTL) of the depth block using, as the displacement vector mvDisp, the displacement vector DoNBDV updated by depth reference, for the segmentation unit 30952 and the DBBP division mode deriving unit 30954.
  • in this case, a depth reference using the displacement vector NBDV is performed to derive the displacement vector mvDisp, and a separate depth reference (depth transfer) using the displacement vector DoNBDV is performed for the segmentation unit 30952 and the DBBP division mode deriving unit 30954; since depth transfer is required twice, the transfer and processing amount of depth images are large.
  • moreover, when the VSP prediction unit 30374 and the DBBP prediction unit 3095 perform different depth transfers, the processing cannot be shared; one process becomes complicated while the other stays simple, resulting in a design imbalance. Since the overall worst-case complexity is determined by the larger of the two processes, making only one of them simple does not reduce the overall complexity. It is therefore preferable in design to allow the same degree of complexity where the worst case can be shared, and to standardize, between view synthesis prediction and DBBP, the criterion for when the updated displacement vector is used, so that the same displacement vector is used.
  • in the configuration below, the worst-case processing amount of the depth reference is reduced while the processing of the VSP prediction unit 30374 and the DBBP prediction unit 3095 is made common.
  • the worst case in processing amount arises when the depth reference is performed many times for small blocks. Therefore, in this embodiment, when the size of the target block is larger than a predetermined size, the displacement vector updated with reference to the depth (DoNBDV) is used, and otherwise the displacement vector not updated with reference to the depth (NBDV) is used.
  • specifically, when the target block is larger than the predetermined size, the VSP prediction unit 30374 derives the coordinates (xTL, yTL) of the depth block using the displacement vector DoNBDV updated by depth reference as the displacement vector mvDisp, and the depth DV deriving unit 351 derives the displacement vectors of the sub-blocks.
  • otherwise, the VSP prediction unit 30374 derives the coordinates (xTL, yTL) of the depth block using the displacement vector NBDV not updated by depth reference as the displacement vector mvDisp, and the depth DV deriving unit 351 derives the displacement vectors of the sub-blocks.
  • similarly, when the target block is larger than the predetermined size, the DBBP prediction unit 3095 derives the coordinates (xTL, yTL) of the depth block using the displacement vector DoNBDV updated by depth reference as the displacement vector mvDisp, and the segmentation unit 30952 and the DBBP division mode deriving unit 30954 refer to the depth.
  • otherwise, the DBBP prediction unit 3095 derives the coordinates (xTL, yTL) of the depth block using the displacement vector NBDV not updated by depth reference as the displacement vector mvDisp, and the segmentation unit 30952 and the DBBP division mode deriving unit 30954 refer to the depth.
  • FIG. 29 is a flowchart explaining the operation of the image decoding device 31 and the image encoding device 11 configured so that the VSP prediction unit 30374 and the DBBP prediction unit 3095 use a common displacement vector.
  • here, the disparity vector DoNBDV obtained by refinement using the depth image is used as the disparity vector.
  • the image decoding device 31 and the image encoding device 11 perform the following processing.
  • in view synthesis prediction, when the sum of the prediction block width nPbW and height nPbH exceeds a predetermined value (here, 16), the displacement vector MvRefinedDisp[xPb][yPb] updated using the depth image is used; otherwise, the displacement vector MvDisp[xPb][yPb] not updated using the depth image is used. This vector is used both for deriving the disparity array DisparitySamples, from which the depth DV deriving unit 351 derives the sub-block displacement vector mvLX, and for the partition division performed by the partition division unit according to the split flag horSplitFlag derived by the split flag deriving unit 353.
  • mvLXVSP = (nPbW + nPbH > 16) ? MvRefinedDisp[xPb][yPb] : MvDisp[xPb][yPb]   (Formula A1)
  • in DBBP, when the sum of the prediction block width (here, nTbS) and the prediction block height (here, nTbS) exceeds a predetermined value (here, 16), the displacement vector MvRefinedDisp[xTb][yTb] refined using the depth image is used; otherwise, the displacement vector MvDisp[xTb][yTb] not refined using the depth image is used. This vector is used for the derivation of the segmentation mask segMask in the segmentation unit 30952 and the derivation of the division mode PartMode in the DBBP division mode deriving unit 30954.
  • the predetermined size may be larger than that described above.
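  • A minimal C++ sketch of the common selection rule (Formula A1), with the threshold 16 (or 24 in the variant mentioned above); the struct and function names are hypothetical:

      struct Mv { int x, y; };

      // Common displacement-vector selection shared by view synthesis
      // prediction and DBBP: large blocks use the depth-refined vector
      // (DoNBDV), small blocks the unrefined one (NBDV), so small blocks
      // never trigger an extra depth transfer.
      Mv selectCommonDisparity(int nPbW, int nPbH,
                               const Mv &mvRefinedDisp,  // DoNBDV, refined by depth
                               const Mv &mvDisp) {       // NBDV, not refined
          const int kThreshold = 16; // predetermined value; 24 is given as a variant
          return (nPbW + nPbH > kThreshold) ? mvRefinedDisp : mvDisp;
      }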
  • FIG. 30 is a diagram illustrating a data flow of an example in which a common displacement vector is used in the VSP prediction unit 30374 and the DBBP prediction unit 3095.
  • the depth DV deriving unit 351 reads, from the reference picture memory 306, depth image #1, which is the depth block determined by the upper-left coordinates of the block and the displacement vector MvDisp[][] before updating by depth reference, and derives the displacement vector MvRefinedDisp[][] updated by the depth reference.
  • the switch 354 selects MvRefinedDisp[][] as the displacement vector mvDisp when the block size is larger than the predetermined size, and MvDisp[][] otherwise.
  • the depth DV deriving unit 351 derives the disparity array DisparitySamples in sub-block units using the displacement vector mvDisp, and the VSP prediction unit 30374 generates a predicted image by motion (displacement) compensation using the values indicated by the disparity array DisparitySamples as horizontal vectors. In doing so, the depth DV deriving unit 351 refers to depth image #2, the depth block determined by the displacement vector mvDisp. In view synthesis prediction, the 8x4 or 4x8 sub-block size is selected by the split flag deriving unit 353, which also refers to depth image #2 using the displacement vector mvDisp.
  • when the displacement vector mvDisp is MvDisp[][], the depth block referenced for deriving MvRefinedDisp[][] is the same as the depth block referenced for deriving the disparity array DisparitySamples and the split flag horSplitFlag; that is, depth image #2 is equal to depth image #1. The series of processes deriving the displacement vector MvRefinedDisp, the disparity array DisparitySamples, and the split flag horSplitFlag can therefore be performed with a single transfer of the depth image, and two depth transfers are not needed to obtain the depth image.
  • the segmentation unit 30952 and the DBBP partition mode deriving unit 30954 derive the segmentation information segMask and the partition mode PartMode using the displacement vector mvDisp.
  • the segmentation unit 30952 and the DBBP division mode deriving unit 30954 refer to the depth image # 2 that is a depth block determined by the displacement vector mvDisp.
  • when the displacement vector mvDisp is equal to MvDisp[][], depth image #2 is equal to depth image #1, and two depth transfers are not required.
  • when the displacement vector mvDisp is MvRefinedDisp[][], depth image #2 differs from depth image #1 and two depth transfers are necessary; by the selection of the switch 354, however, this occurs only for blocks larger than the predetermined size.
  • FIG. 30 also shows an example in which a DBBP prediction unit 3095C (with a DBBP division mode deriving unit 30954C) is used instead of the DBBP prediction unit 3095 (DBBP division mode deriving unit 30954).
  • in the DBBP division mode deriving unit 30954C, the derivation of the disparity array DisparitySamples by the depth DV deriving unit 351 in the view synthesis prediction process and the derivation of the split flag horSplitFlag by the split flag deriving unit 353 use a common displacement vector.
  • the image decoding device 31 and the image encoding device 11 configured as above are an image decoding device and an image encoding device that include depth-based block prediction image generation means (DBBP prediction unit 3095) and view synthesis prediction means. The depth-based block prediction image generation means includes a segmentation deriving unit 30952 for deriving segmentation information segMask from a depth image, a DBBP image interpolation unit 30951 for generating two motion compensated images, image synthesis means (depth division unit 30953) for synthesizing the two interpolated images to generate one motion compensated image, and a DBBP division mode deriving unit 30954 for deriving the division mode PartMode. The view synthesis prediction means includes a partition division unit for deriving the split flag horSplitFlag from the depth image and obtaining the sub-block size, and a depth DV deriving unit 351 for deriving the disparity array disparitySamples and obtaining the vector mvLX.
  • the disparity vector used for deriving the position of the depth image referred to by the segmentation deriving unit 30952 and the DBBP division mode deriving unit 30954 of the depth-based block prediction image generation means and the disparity vector used for deriving the position of the depth image by the partition division unit and the depth DV deriving unit 351 of the view synthesis prediction means are a common disparity vector.
  • likewise, the image decoding device 31 and the image encoding device 11 may be an image decoding device and an image encoding device that include depth-based block prediction image generation means (DBBP prediction unit 3095C) and view synthesis prediction means, in which the depth-based block prediction image generation means includes a segmentation deriving unit 30952 for deriving segmentation information from the depth image, a DBBP image interpolation unit 30951 for generating two motion compensated images, and image synthesis means for synthesizing the two interpolated images into one motion compensated image.
  • in this configuration, the disparity vector used for deriving the position of the depth image by the partition division unit and the depth DV deriving unit 351 of the view synthesis prediction means is a common disparity vector, and the DBBP division mode deriving unit 30954C and the partition division unit perform the division mode PartMode derivation and the sub-block size derivation, respectively, using the output of the common split flag deriving unit 353.
  • a division flag deriving unit 353A or the like may be used instead of the division flag deriving unit 353.
  • the common disparity vector is the disparity vector refined by depth when the block size is larger than a predetermined size, and the disparity vector before being refined by depth when the block size is equal to or smaller than the predetermined size.
  • for example, the common disparity vector is the disparity vector refined by depth when the sum of the width and height of the prediction block is larger than 16, and the disparity vector before being refined by depth otherwise.
  • alternatively, the common disparity vector may be the disparity vector refined by depth when the sum of the width and height of the prediction block is larger than 24, and the disparity vector before being refined by depth otherwise.
  • with this configuration, small blocks of the predetermined size or less do not use a disparity vector that requires refinement by the depth image, so the memory bandwidth for transferring the depth image and the processing amount for referring to the depth image are reduced.
  • the division flag deriving unit 353 refers to the depth image corresponding to the target block and derives the split flag horSplitFlag. In the following description, the coordinates of the target block given as input to the division flag deriving unit 353 are (xP, yP), its width and height are nPSW and nPSH, and the displacement vector is mvDisp.
  • the division flag deriving unit 353 refers to the depth image when the width and height of the target block are equal, but may derive the split flag horSplitFlag without referring to the depth image when the width and height of the target block are not equal. The details of the division flag deriving unit 353 are described below.
  • the division flag deriving unit 353 reads, from the reference picture memory 306, a depth image refDepPels that has the same POC as the decoding target picture and has the same view ID as the view ID (RefViewIdx) of the reference picture indicated by the displacement vector mvDisp.
  • the division flag deriving unit 353 derives coordinates (xTL, yTL) obtained by shifting the upper left coordinates (xP, yP) of the target block by the displacement vector MvDisp by the following formula.
  • xTL = xP + ((mvDisp[0] + 2) >> 2)
  • yTL = yP + ((mvDisp[1] + 2) >> 2)
  • mvDisp [0] and mvDisp [1] are the X component and the Y component of the displacement vector MvDisp, respectively.
  • the derived coordinates (xTL, yTL) indicate the coordinates of the block corresponding to the target block on the depth image refDepPels.
  • the division flag deriving unit 353 sets the flag minSubBlkSizeFlag to 1 by the following expression when the width nPSW or the height nPSH of the target block is not a multiple of 8:
  • minSubBlkSizeFlag = (nPSW % 8 != 0) || (nPSH % 8 != 0)
  • when the flag minSubBlkSizeFlag is 1, the division flag deriving unit 353 sets horSplitFlag to 1 when the height of the target block is not a multiple of 8 (when nPSH % 8 is true), and sets horSplitFlag to 0 otherwise (that is, when the width of the target block is not a multiple of 8, when nPSW % 8 is true).
  • when the flag minSubBlkSizeFlag is 0, the division flag deriving unit 353 derives the sub-block size from the depth values. Specifically, the sub-block size is derived from a comparison of the four corner points (TL, TR, BL, BR) of the prediction block.
  • the pixel value of the depth image at the upper-left (TL) coordinates of the target block is denoted refDepPelsP0, that at the upper-right (TR) refDepPelsP1, that at the lower-left (BL) refDepPelsP2, and that at the lower-right (BR) refDepPelsP3; horSplitFlag is derived from the comparison of these values, for example as horSplitFlag = ((refDepPelsP0 > refDepPelsP3) == (refDepPelsP1 > refDepPelsP2)).
  • the split flag derivation unit 353 outputs horSplitFlag to the split mode derivation unit 30954C and the VSP prediction unit 30374.
  • the division flag deriving unit 353 may also derive horSplitFlag as follows: when the width nPSW and the height nPSH of the target block differ, it is derived by a formula depending on the width and height of the target block.
  • the target block of the division flag deriving unit 353 is the prediction unit in the case of view synthesis prediction, and a block of equal width and height in the case of DBBP. In the case of DBBP, since the width and height are equal, the split flag horSplitFlag is derived in the above derivation method with reference to the four corners of the depth image.
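  • A C++ sketch of the split flag derivation just described; the four-corner comparison is given in the text only in prose, so the exact expression used below is an assumption:

      #include <vector>

      // Split flag derivation (unit 353). refDepPels is indexed [x][y] as in
      // the text; (xTL, yTL) are the depth-block coordinates derived above.
      int deriveHorSplitFlag(const std::vector<std::vector<int>> &refDepPels,
                             int xTL, int yTL, int nPSW, int nPSH) {
          bool minSubBlkSizeFlag = (nPSW % 8 != 0) || (nPSH % 8 != 0);
          if (minSubBlkSizeFlag) {
              // Fall back without reading the depth image.
              return (nPSH % 8 != 0) ? 1 : 0;
          }
          int p0 = refDepPels[xTL][yTL];                       // TL
          int p1 = refDepPels[xTL + nPSW - 1][yTL];            // TR
          int p2 = refDepPels[xTL][yTL + nPSH - 1];            // BL
          int p3 = refDepPels[xTL + nPSW - 1][yTL + nPSH - 1]; // BR
          // Assumed four-corner comparison: choose the 8x4 (horizontal) split
          // when the two diagonals agree in orientation.
          return ((p0 > p3) == (p1 > p2)) ? 1 : 0;
      }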
  • the division flag deriving unit 353A refers to the depth image corresponding to the target block and derives the division flag horSplitFlag.
  • the coordinates of the target block set as the input of the division flag deriving unit 353A are (xP, yP), the width and height are nPSW, nPSH, and the displacement vector is mvDisp.
  • the division flag deriving unit 353A reads, from the reference picture memory 306, a depth image refDepPels having the same POC as the decoding target picture and having the same view ID as the view ID (RefViewIdx) of the reference picture indicated by the displacement vector mvDisp.
  • the division flag deriving unit 353A derives the coordinates (xTL, yTL) obtained by shifting the upper left coordinates (xP, yP) of the target block by the displacement vector MvDisp by the following equations.
  • xTL = xP + ((mvDisp[0] + 2) >> 2)
  • yTL = yP + ((mvDisp[1] + 2) >> 2)
  • mvDisp [0] and mvDisp [1] are the X component and the Y component of the displacement vector MvDisp, respectively.
  • the derived coordinates (xTL, yTL) indicate the coordinates of the block corresponding to the target block on the depth image refDepPels.
  • the division flag deriving unit 353A sets the flag minSubBlkSizeFlag to 1 by the following expression when the width nPSW or the height nPSH of the target block is not a multiple of 8:
  • minSubBlkSizeFlag = (nPSW % 8 != 0) || (nPSH % 8 != 0)
  • when the flag minSubBlkSizeFlag is 1, the division flag deriving unit 353A sets horSplitFlag to 1 when the height of the target block is not a multiple of 8 (when nPSH % 8 is true), and sets horSplitFlag to 0 otherwise (that is, when the width of the target block is not a multiple of 8, when nPSW % 8 is true).
  • the division flag deriving unit 353A derives the sub-block size from the depth values when both the width and the height of the target block are multiples of 8. Specifically, the sub-block size is derived from a comparison of three corner points (TL, TR, BL) of the prediction block.
  • when the flag minSubBlkSizeFlag is 0, the pixel value of the depth image at the upper-left (TL) coordinates of the target block is denoted refDepPelsP0, that at the upper-right (TR) refDepPelsP1, and that at the lower-left (BL) refDepPelsP2.
  • the split flag deriving unit 353A derives horSplitFlag by the following equation.
  • the split flag derivation unit 353A outputs the horSplitFlag to the split mode derivation unit 30954C and the VSP prediction unit 30374.
  • the depth DV deriving unit 351 derives disparity arrays disparitySamples (horizontal vectors), which are horizontal components of depth-derived displacement vectors, in designated block units (sub-blocks).
  • the inputs of the depth DV deriving unit 351 are the depth DV conversion table DepthToDisparityB, the block width nBlkW and height nBlkH, the split flag splitFlag, the depth image refDepPels, the coordinates (xTL, yTL) of the corresponding block on the depth image refDepPels, and the view ID refViewIdx; the output is the disparity array disparitySamples (horizontal vectors).
  • the depth DV deriving unit 351 sets, for each target block, the pixels used for deriving the depth representative value maxDep. Specifically, as shown in FIG. 13, with (xSubB, ySubB) the coordinates of the sub-block relative to the upper-left coordinates (xTL, yTL) of the target block, the X coordinate xP0 of the left edge of the sub-block, the X coordinate xP1 of the right edge, the Y coordinate yP0 of the top edge, and the Y coordinate yP1 of the bottom edge are obtained from the following equations.
  • xP0 = Clip3(0, pic_width_in_luma_samples - 1, xTL + xSubB)
  • yP0 = Clip3(0, pic_height_in_luma_samples - 1, yTL + ySubB)
  • xP1 = Clip3(0, pic_width_in_luma_samples - 1, xTL + xSubB + nBlkW - 1)
  • yP1 = Clip3(0, pic_height_in_luma_samples - 1, yTL + ySubB + nBlkH - 1)
  • pic_width_in_luma_samples and pic_height_in_luma_samples represent the width and height of the image, respectively.
  • next, the depth DV deriving unit 351 derives the depth representative value maxDep of the target block. Specifically, the representative depth value maxDep is derived from the pixel values refDepPels[xP0][yP0], refDepPels[xP0][yP1], refDepPels[xP1][yP0], and refDepPels[xP1][yP1] at the four corner points of the block by the following equation: maxDep = Max(Max(refDepPels[xP0][yP0], refDepPels[xP0][yP1]), Max(refDepPels[xP1][yP0], refDepPels[xP1][yP1])).
  • the function Max (x, y) is a function that returns x if the first argument x is greater than or equal to the second argument y, and returns y otherwise.
  • the depth DV deriving unit 351 derives the disparity array disparitySamples, the horizontal component of the depth-derived displacement vector, for each pixel (x, y) in the target block (where x takes values from 0 to nBlkW - 1 and y from 0 to nBlkH - 1) using the representative depth value maxDep, the depth DV conversion table DepthToDisparityB, and the view ID refViewIdx of the layer indicated by the displacement vector (NBDV), by the following equation: disparitySamples[x][y] = DepthToDisparityB[refViewIdx][maxDep].
  • the depth DV deriving unit 351 outputs the derived disparity array disparitySamples[] to the displacement vector deriving unit 352 as (the horizontal component of) the displacement vector DoNBDV.
  • the depth DV deriving unit 351 also outputs the displacement vector (the horizontal component thereof) to the VSP prediction unit 30374.
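  • A C++ sketch of the per-sub-block depth-to-disparity conversion described above; depthToDisparityB here stands for the row of the conversion table for refViewIdx:

      #include <algorithm>
      #include <vector>

      // Clip3(lo, hi, v) as used in the equations above.
      static int Clip3(int lo, int hi, int v) { return std::min(std::max(v, lo), hi); }

      // Representative depth of a sub-block = maximum of its four corner
      // samples; the lookup table converts it to a horizontal displacement.
      // refDepPels is indexed [x][y] as in the text.
      int deriveSubBlockDisparity(const std::vector<std::vector<int>> &refDepPels,
                                  const std::vector<int> &depthToDisparityB,
                                  int xTL, int yTL, int xSubB, int ySubB,
                                  int nBlkW, int nBlkH, int picW, int picH) {
          int xP0 = Clip3(0, picW - 1, xTL + xSubB);
          int yP0 = Clip3(0, picH - 1, yTL + ySubB);
          int xP1 = Clip3(0, picW - 1, xTL + xSubB + nBlkW - 1);
          int yP1 = Clip3(0, picH - 1, yTL + ySubB + nBlkH - 1);
          int maxDep = std::max(std::max(refDepPels[xP0][yP0], refDepPels[xP0][yP1]),
                                std::max(refDepPels[xP1][yP0], refDepPels[xP1][yP1]));
          // Every pixel of the sub-block receives this same horizontal value.
          return depthToDisparityB[maxDep];
      }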
  • FIG. 16 is a schematic diagram illustrating a configuration of the inter predicted image generation unit 309 according to the present embodiment.
  • the inter prediction image generation unit 309 includes a motion displacement compensation unit 3091, a residual prediction unit 3092, an illuminance compensation unit 3093, a DBBP prediction unit 3095 (depth-based block prediction image generation device 3095), and a weighted prediction unit 3096.
  • the inter prediction image generation unit 309 performs the following processing in sub-block units when the sub-block motion compensation flag subPbMotionFlag input from the inter prediction parameter decoding unit 303 is 1, and in prediction unit units when the sub-block motion compensation flag subPbMotionFlag is 0.
  • the sub-block motion compensation flag subPbMotionFlag is set to 1 when the inter-view merge candidate is selected as the merge mode or when the VSP merge candidate is selected.
  • the inter prediction image generation unit 309 derives prediction images predSamples using the motion displacement compensation unit 3091 based on the prediction parameters.
  • when the residual prediction index iv_res_pred_weight_idx is not 0, the inter predicted image generation unit 309 sets the residual prediction execution flag resPredFlag to 1, indicating that residual prediction is to be performed, and outputs it to the motion displacement compensation unit 3091 and the residual prediction unit 3092.
  • otherwise, the residual prediction execution flag resPredFlag is set to 0 and output to the motion displacement compensation unit 3091 and the residual prediction unit 3092.
  • the weighted prediction unit 3096 derives the predicted image predSamples from one motion compensated image predSamplesL0 or predSamplesL1 in the case of uni-prediction, and from the two motion compensated images predSamplesL0 and predSamplesL1 in the case of bi-prediction.
  • the motion displacement compensation unit 3091 generates a motion prediction image predSampleLX based on the prediction use flag predFlagLX, the reference picture index refIdxLX, and the vector mvLX (motion vector or displacement vector).
  • the motion displacement compensation unit 3091 reads from the reference picture memory 306 the block at the position shifted by the vector mvLX, starting from the position of the prediction unit of the reference picture specified by the reference picture index refIdxLX, and generates the predicted image by interpolation.
  • when the vector mvLX is not of integer precision, the predicted image is generated by applying a filter, called a motion compensation filter (or displacement compensation filter), for generating pixels at fractional positions.
  • when the vector mvLX is a motion vector, the above processing is called motion compensation; when the vector mvLX is a displacement vector, it is called displacement compensation.
  • in motion displacement compensation, the prediction image of L0 prediction is referred to as predSamplesL0 and the prediction image of L1 prediction as predSamplesL1; when the two are not distinguished, they are referred to as predSamplesLX.
  • these output images are also referred to as prediction images predSamplesLX.
  • in residual prediction and illuminance compensation, when the input image and the output image are distinguished, the input image is expressed as predSamplesLX and the output image as predSamplesLX'.
  • when the residual prediction execution flag resPredFlag is 0, the motion displacement compensation unit 3091 generates the motion compensated image predSamplesLX using a motion compensation filter with 8 taps for the luminance component and 4 taps for the chrominance component.
  • when the residual prediction execution flag resPredFlag is 1, the motion compensated image predSamplesLX is generated with a 2-tap motion compensation filter for both the luminance and chrominance components.
  • the motion displacement compensation unit 3091 performs motion compensation in sub-block units. Specifically, the vector, reference picture index, and reference list use flag of the sub-block at coordinates (xCb, yCb) are derived from the following expressions.
  • MvL0[xCb + x][yCb + y] = subPbMotionFlag ? SubPbMvL0[xCb + x][yCb + y] : mvL0
  • MvL1[xCb + x][yCb + y] = subPbMotionFlag ? SubPbMvL1[xCb + x][yCb + y] : mvL1
  • RefIdxL0[xCb + x][yCb + y] = subPbMotionFlag ? SubPbRefIdxL0[xCb + x][yCb + y] : refIdxL0
  • RefIdxL1[xCb + x][yCb + y] = subPbMotionFlag ? SubPbRefIdxL1[xCb + x][yCb + y] : refIdxL1
  • PredFlagL0[xCb + x][yCb + y] = subPbMotionFlag ? SubPbPredFlagL0[xCb + x][yCb + y] : predFlagL0
  • PredFlagL1[xCb + x][yCb + y] = subPbMotionFlag ? SubPbPredFlagL1[xCb + x][yCb + y] : predFlagL1
  • SubPbMvLX, SubPbRefIdxLX, and SubPbPredFlagLX (X is 0, 1) correspond to subPbMvLX, subPbRefIdxLX, and subPbPredFlagLX described in the inter-layer merge candidate derivation unit 30371.
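  • The ternary selection above reduces to the following C++ sketch; the struct is hypothetical shorthand for the parameter set (vector, reference index, prediction flag) per list:

      // Per-position motion parameter selection: when subPbMotionFlag is 1 the
      // sub-block parameters (SubPb*) are used, otherwise the PU-level
      // parameters apply to the whole block.
      struct MotionParams {
          int mvL0[2], mvL1[2];
          int refIdxL0, refIdxL1;
          bool predFlagL0, predFlagL1;
      };

      MotionParams selectMotionParams(bool subPbMotionFlag,
                                      const MotionParams &subPb,
                                      const MotionParams &pu) {
          return subPbMotionFlag ? subPb : pu;
      }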
  • the residual prediction unit 3092 performs residual prediction when the residual prediction execution flag resPredFlag is 1; when resPredFlag is 0, it outputs the input predicted image predSamplesLX as is.
  • residual prediction is performed by estimating the residual refResSamples of the motion compensated image predSamplesLX generated by motion prediction or displacement prediction and adding it to the predicted image predSamplesLX of the target layer. Specifically, when the prediction unit uses motion prediction, it is assumed that a residual similar to that of the reference layer also occurs in the target layer, and the already derived residual of the reference layer is used as the estimated residual of the target layer.
  • when the prediction unit uses displacement prediction, the residual between the reference layer picture and the target layer picture at a time (POC) different from that of the target picture is used as the estimated residual.
  • like the motion displacement compensation unit 3091, the residual prediction unit 3092 also performs the processing in sub-block units when the sub-block motion compensation flag subPbMotionFlag is 1.
  • FIG. 17 is a block diagram showing the configuration of the residual prediction unit 3092.
  • the residual prediction unit 3092 includes a reference image interpolation unit 30922 and a residual synthesis unit 30923.
  • the reference image interpolation unit 30922 generates two residual prediction motion compensated images (the corresponding block rpSamplesLX and the reference block rpRefSamplesLX) using the vector mvLX and the residual prediction displacement vector mvDisp input from the inter prediction parameter decoding unit 303 and the reference pictures stored in the reference picture memory 306.
  • DiffPicOrderCnt (X, Y) indicates the difference between the POC of picture X and picture Y (the same applies hereinafter).
  • the target block is then assumed to be subject to displacement prediction, and the inter-view prediction flag ivRefFlag is set to 1; otherwise, motion prediction is assumed to be applied to the target block, and ivRefFlag is set to 0.
  • FIG. 18 is a diagram for explaining the corresponding block rpSamplesLX and the reference block rpRefSamplesLX when the vector mvLX is a motion vector (when the interview prediction flag ivRefFlag is 0).
  • the corresponding block rpSamplesLX, which corresponds to the prediction unit on the target layer, is located in the block on the reference layer image displaced from the position of the prediction unit by the displacement vector mvDisp, the vector indicating the positional relationship between the reference layer and the target layer.
  • FIG. 19 is a diagram for explaining the corresponding block rpSamplesLX and the reference block rpRefSamplesLX when the vector mvLX is a displacement vector (when the interview prediction flag ivRefFlag is 1).
  • the corresponding block rpSamplesLX is a block on the reference picture rpPic that has a different time from the target picture and the same view ID as the target picture.
  • the corresponding block rpSamplesLX is located in the block that is shifted by the vector mvT starting from the position of the prediction unit (target block).
  • the residual prediction unit 3092 derives the reference pictures rpPic and rpPicRef, which are the reference pictures referred to in deriving the residual prediction motion compensated images (rpSamplesLX and rpRefSamplesLX), and the vectors mvRp and mvRpRef giving the positions of the reference blocks (relative coordinates with respect to the coordinates of the target block).
  • the residual prediction unit 3092 sets as rpPic a picture having the same display time (POC) as, or the same view ID as, the target picture to which the target block belongs.
  • when the target block uses motion prediction (when the inter-view prediction flag ivRefFlag is 0), the residual prediction unit 3092 derives the reference picture rpPic from the conditions that the POC of rpPic is equal to PicOrderCntVal, the POC of the target picture, and that the view ID of rpPic is equal to the reference view ID RefViewIdx[xP][yP] of the prediction unit (which differs from the view ID of the target picture).
  • in this case, the residual prediction unit 3092 sets the displacement vector MvDisp as the vector mvRp of rpPic.
  • when the target block uses displacement prediction (when the inter-view prediction flag ivRefFlag is 1), the residual prediction unit 3092 sets as rpPic the reference picture used for generating the predicted image of the target block. That is, with the reference index of the target block RpRefIdxLY and the reference picture list RefPicListY, the reference picture rpPic is derived as RefPicListY[RpRefIdxLY].
  • furthermore, a residual prediction vector deriving unit 30924 (not shown) included in the residual prediction unit 3092 derives mvT, the vector of the prediction unit on the picture of the same POC as the target picture pointed to by the vector mvLX of the target block (which is equal to the displacement vector MvDisp), and sets the motion vector mvT as the vector mvRp of rpPic.
  • next, the residual prediction unit 3092 sets as rpPicRef a reference picture whose display time (POC) differs from that of the target picture and whose view ID also differs.
  • when the target block uses motion prediction (when the inter-view prediction flag ivRefFlag is 0), the reference picture rpPicRef is derived from the conditions that the POC of rpPicRef is equal to the POC of the reference picture RefPicListY[RpRefIdxLY] of the target block and that the view ID of rpPicRef is equal to the view ID RefViewIdx[xP][yP] of the reference picture of the displacement vector MvDisp.
  • in this case, the residual prediction unit 3092 sets, as the vector mvRpRef of rpPicRef, the sum (mvRp + mvLX) of the vector mvRp and the vector mvLX obtained by scaling the motion vector of the prediction block.
  • when the target prediction unit uses displacement prediction (when the inter-view prediction flag ivRefFlag is 1), the reference picture rpPicRef is derived from the conditions that the POC of rpPicRef is equal to the POC of the reference picture rpPic and that the view ID of rpPicRef is equal to the view ID RefViewIdx[xP][yP] of the prediction unit. Further, the residual prediction unit 3092 sets the sum (mvRp + mvLX) of the vector mvRp and the motion vector mvLX of the prediction block as the vector mvRpRef of rpPicRef.
  • mvRp and mvRpRef are derived as follows.
  • the residual prediction vector deriving unit 30924 receives the reference picture, the target block coordinates (xP, yP), the target block size nPSW, nPSH, and the vector mvLX, and derives the vector mvT and the view ID from the motion compensation parameters (vector, reference picture index, and view ID) of the prediction unit on the reference picture.
  • the residual prediction vector deriving unit 30924 derives the reference coordinates (xRef, yRef), the center coordinates of the block at the position shifted by the vector mvLX from the target block on the reference picture given as input, by the following expressions.
  • xRef = Clip3(0, PicWidthInSamplesL - 1, xP + (nPSW >> 1) + ((mvDisp[0] + 2) >> 2))
  • yRef = Clip3(0, PicHeightInSamplesL - 1, yP + (nPSH >> 1) + ((mvDisp[1] + 2) >> 2))
  • the residual prediction vector deriving unit 30924 derives the vector mvLX and the reference picture index refPicLX of the prediction unit refPU that includes the reference block coordinates (xRef, yRef).
  • when the target prediction unit uses displacement prediction (DiffPicOrderCnt(currPic, refPic) is 0) and the reference prediction unit refPU uses motion prediction (DiffPicOrderCnt(refPic, refPicListRefX[refIdxLX]) is non-zero), the vector of refPU is set as mvT and the referable flag availFlagT is set to 1.
  • next, the residual prediction vector deriving unit 30924 derives the vector of a prediction unit on a picture different from the target picture. The residual prediction vector deriving unit 30924 receives the target block coordinates (xP, yP), the target block size nPbW, nPbH, and the displacement vector mvDisp, and derives the reference block coordinates (xRef, yRef) as follows.
  • xRef = Clip3(0, PicWidthInSamplesL - 1, xP + (nPSW >> 1) + ((mvDisp[0] + 2) >> 2))
  • yRef = Clip3(0, PicHeightInSamplesL - 1, yP + (nPSH >> 1) + ((mvDisp[1] + 2) >> 2))
• The residual prediction vector deriving unit 30924 derives the vector mvLX and the reference picture index refPicLX of the prediction unit refPU that includes the reference block coordinates (xRef, yRef), and sets the availability flag availFlagT to 1.
• In this way, the vector of a block whose reference picture has the same POC as the target picture but a different view ID can be derived as mvT.
• The reference image interpolation unit 30922 generates an interpolation image of the corresponding block rpSamplesLX using the vector mvC as the vector mvLX.
• A pixel at the position obtained by shifting the coordinates (x, y) of each pixel of the interpolation image by the vector mvLX of the prediction unit is derived by linear interpolation (bilinear interpolation).
• The reference image interpolation unit 30922 uses the X coordinate of the integer-precision pixel R0 corresponding to the case where the pixel coordinates of the prediction unit are (xP, yP); here X & 3 is an expression that extracts only the lower 2 bits of X.
• Taking into account that the vector mvLX has quarter-pel fractional precision, the reference image interpolation unit 30922 generates the interpolation pixel predPartLX[x][y] as follows:
• xA = Clip3(0, picWidthInSamples - 1, xInt)
• xB = Clip3(0, picWidthInSamples - 1, xInt + 1)
• xC = Clip3(0, picWidthInSamples - 1, xInt)
• xD = Clip3(0, picWidthInSamples - 1, xInt + 1)
• yA = Clip3(0, picHeightInSamples - 1, yInt)
• yB = Clip3(0, picHeightInSamples - 1, yInt)
• yC = Clip3(0, picHeightInSamples - 1, yInt + 1)
• yD = Clip3(0, picHeightInSamples - 1, yInt + 1)
• Here, the integer pixel A is the pixel corresponding to the pixel R0, and the integer pixels B, C, and D are the integer-precision pixels adjacent to the right, below, and below-right of A, respectively.
• The reference image interpolation unit 30922 reads the reference pixels refPicLX[xA][yA], refPicLX[xB][yB], refPicLX[xC][yC], and refPicLX[xD][yD] corresponding to the integer pixels A, B, C, and D from the reference picture memory 306.
• Then, using these reference pixels and the fractional part xFrac of the X component and the fractional part yFrac of the Y component of the vector mvLX, the reference image interpolation unit 30922 derives by linear interpolation (bilinear interpolation) the interpolation pixel predPartLX[x][y], which is the pixel shifted from the pixel R0 by the fractional part of the vector mvLX:
• predPartLX[x][y] = (refPicLX[xA][yA] * (8 - xFrac) * (8 - yFrac) + refPicLX[xB][yB] * (8 - yFrac) * xFrac + refPicLX[xC][yC] * (8 - xFrac) * yFrac + refPicLX[xD][yD] * xFrac * yFrac)
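• The interpolation above can be sketched in C as follows. This is a hedged illustration, not the normative text: xFrac and yFrac are assumed to be the fractional phases matching the (8 - xFrac) weights of the formula, refPicLX is assumed to be a row-major sample array, and the final >> 6 normalization (the four weights sum to 64) is an assumption, since the formula above does not show a shift:

    #include <stdint.h>

    /* One interpolated sample from the four integer pixels A, B, C, D. */
    static int bilinear_sample(const uint8_t *refPicLX, int stride,
                               int xA, int yA, int xB, int yB,
                               int xC, int yC, int xD, int yD,
                               int xFrac, int yFrac) {
        int a = refPicLX[yA * stride + xA];  /* pixel corresponding to R0 */
        int b = refPicLX[yB * stride + xB];  /* right neighbor            */
        int c = refPicLX[yC * stride + xC];  /* lower neighbor            */
        int d = refPicLX[yD * stride + xD];  /* lower-right neighbor      */
        return (a * (8 - xFrac) * (8 - yFrac)
              + b * xFrac       * (8 - yFrac)
              + c * (8 - xFrac) * yFrac
              + d * xFrac       * yFrac) >> 6;  /* assumed normalization */
    }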
• Here, the four pixels surrounding the target pixel are used to derive the interpolation pixel in a single-step bilinear interpolation. Alternatively, the difference prediction interpolation image may be generated by a two-step linear interpolation that separates horizontal linear interpolation and vertical linear interpolation.
  • the reference image interpolation unit 30922 performs the above-described interpolation pixel derivation process on each pixel in the prediction unit, and sets a set of interpolation pixels as an interpolation block predPartLX.
  • the reference image interpolation unit 30922 outputs the derived interpolation block predPartLX to the residual synthesis unit 30923 as the corresponding block rpSamplesLX.
• The reference image interpolation unit 30922 derives the reference block rpRefSamplesLX by performing the same processing as for the corresponding block rpSamplesLX, except that the vector mvLX is replaced with the vector mvR.
  • the reference image interpolation unit 30922 outputs the reference block rpRefSamplesLX to the residual synthesis unit 30923.
• The residual synthesis unit 30923 derives a residual from the difference between the two residual prediction motion compensated images (rpSamplesLX, rpRefSamplesLX), and derives a predicted image by adding this residual to the motion compensated image.
  • the residual synthesis unit 30923 derives a corrected predicted image predSamplesLX ′ from the predicted image predSamplesLX, the corresponding block rpSamplesLX, the reference block rpRefSamplesLX, and the residual prediction index iv_res_pred_weight_idx.
• The corrected predicted image predSamplesLX' is calculated using the following formula:
• predSamplesLX'[x][y] = predSamplesLX[x][y] + ((rpSamplesLX[x][y] - rpRefSamplesLX[x][y]) >> (iv_res_pred_weight_idx - 1))
• Here, x ranges from 0 to the width of the prediction block - 1, and y ranges from 0 to the height of the prediction block - 1. When the residual prediction execution flag resPredFlag is 0, the residual synthesis unit 30923 outputs the predicted image predSamplesLX as it is, as in the following equation:
• predSamplesLX'[x][y] = predSamplesLX[x][y]
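• A minimal C sketch of this residual synthesis (the array layout and 16-bit sample type are assumptions; the formula itself is the one above):

    #include <stdint.h>

    /* Adds the inter-view residual (rpSamplesLX - rpRefSamplesLX), right-
     * shifted by iv_res_pred_weight_idx - 1, to the prediction in place.
     * iv_res_pred_weight_idx == 0 corresponds to resPredFlag == 0:
     * predSamplesLX is left unchanged. */
    static void residual_synthesis(int16_t *predSamplesLX,
                                   const int16_t *rpSamplesLX,
                                   const int16_t *rpRefSamplesLX,
                                   int width, int height,
                                   int iv_res_pred_weight_idx) {
        if (iv_res_pred_weight_idx == 0)
            return;
        for (int y = 0; y < height; y++)
            for (int x = 0; x < width; x++) {
                int i = y * width + x;
                predSamplesLX[i] = (int16_t)(predSamplesLX[i]
                    + ((rpSamplesLX[i] - rpRefSamplesLX[i])
                       >> (iv_res_pred_weight_idx - 1)));
            }
    }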
• (Illumination compensation)
• When the illumination compensation flag ic_flag is 1, the illumination compensation unit 3093 performs illumination compensation on the input predicted image predSamplesLX.
• When the illumination compensation flag ic_flag is 0, the input predicted image predSamplesLX is output as it is.
• The weighted prediction unit 3096 derives the predicted image predSamples from the L0 motion compensated image predSamplesL0 or the L1 motion compensated image predSamplesL1.
  • the prediction from L0 and the prediction from L1 are respectively derived using the following equations.
• Here, bitDepth is a value indicating the bit depth.
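• The output equations themselves are not reproduced above; purely as a hedged illustration, the following C sketch shows the usual HEVC-style uni-prediction rounding with bitDepth. The 14-bit intermediate precision and the shift/offset values are assumptions, not statements of this document:

    /* Round and clip one intermediate motion-compensated sample to the
     * pixel range given by bitDepth (assumed 14-bit intermediate data). */
    static int uni_pred_sample(int predSampleLX, int bitDepth) {
        int shift  = 14 - bitDepth;   /* assumed intermediate precision */
        int offset = 1 << (shift - 1);
        int v = (predSampleLX + offset) >> shift;
        int maxVal = (1 << bitDepth) - 1;
        return v < 0 ? 0 : (v > maxVal ? maxVal : v);
    }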
• When the DBBP mode flag dbbp_flag is 1, the DBBP prediction unit 3095 generates the predicted image predSamples by depth-based block partitioning (DBBP).
  • the depth-based block division divides the target block into two regions (region 1 and region 2) based on the segmentation of the depth image corresponding to the target block.
• An interpolation image of region 1 (hereinafter predSamplesA), an interpolation image of region 2 (hereinafter predSamplesB), and a segmentation indicating the region division are derived; the two interpolation images are then synthesized according to the segmentation, thereby generating one interpolation image (predicted image).
  • the division mode (PartMode) decoded from the encoded data is different from the division (segmentation) actually applied.
• Prediction parameters such as motion vectors decoded from the encoded data are used to derive the prediction parameters of subsequent prediction units and of prediction units in other pictures, so it is appropriate to store the prediction parameters applied to DBBP using a division mode as close as possible to the actual division (segmentation). Therefore, the DBBP prediction unit 3095 derives a division mode PartMode from the depth image in the DBBP division mode derivation unit 30954 described later, and replaces the division mode obtained by decoding with the derived division mode PartMode.
• The prediction parameter decoded by the inter prediction parameter decoding control unit 3031 is stored in the prediction parameter memory 307 according to the replaced division mode PartMode. Note that the segmentation indicates division in units of pixels, whereas the division mode indicates division in units of predetermined rectangles.
  • FIG. 1 is a block diagram illustrating a configuration of the DBBP prediction unit 3095 according to the embodiment of this invention.
  • the DBBP prediction unit 3095 includes a DBBP image interpolation unit 30951, a segmentation unit 30952, an image synthesis unit 30953, and a DBBP split mode derivation unit 30954. Note that the segmentation unit 30952 and the DBBP division mode derivation unit 30954 may be performed by the prediction parameter decoding unit 302 instead of the prediction image generation unit 101.
  • the DBBP image interpolation unit 30951 generates two interpolated images (predSamplesA, predSampleB) for each reference picture list L0 or L1 by bilinear interpolation based on the two vectors input to the DBBP prediction unit 3095.
  • the operation of the DBBP image interpolation unit 30951 related to bilinear interpolation is the same as that of the ARP reference image interpolation unit 30922.
  • the integer position xInt, yInt and the phases xFrac, yFrac are calculated from the motion vector mvLX by (Equation C-1).
  • the segmentation unit 30952 derives the segmentation information segMask from the depth block corresponding to the target block input from the reference picture memory 306.
• The depth block corresponding to the target block is a block on the depth picture that has the same POC as the decoding-target picture and the same view ID as the view ID (RefViewIdx) of the reference picture indicated by the displacement vector MvDisp, and whose upper-left coordinates are (xP + mvDisp[0], yP + mvDisp[1]).
• Here, (xP, yP) represents the coordinates of the target block, and MvDisp represents the displacement vector of the target block.
• The segmentation unit 30952 derives the representative value thresVal of the pixel values of the depth block, and derives segMask[][], which is set to 1 when a pixel value of the depth block is greater than the representative value thresVal and to 0 when it is less than or equal to thresVal.
• Specifically, the sum sumVals of the depth pixel values refSamples[x][y] is derived, and the representative value thresVal is then derived by the segmentation unit 30952 by right-shifting sumVals by a value corresponding to the base-2 logarithm of the depth block size.
• Whether each pixel exceeds the representative value thresVal is derived by the segmentation unit 30952 according to the following expression:
• segMask[x][y] = (refSamples[x][y] > thresVal)
• segMask[x][y] is a block of the same size as the target block in which each pixel value is 0 or 1.
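• A minimal C sketch of this segmentation derivation (the 8-bit depth type and the log2Size parameter, taken here as the base-2 logarithm of the number of samples in the block, are assumptions):

    #include <stdint.h>

    /* thresVal is the mean of the depth block: the sum of all depth
     * samples right-shifted by log2 of the block size. segMask marks
     * with 1 every pixel whose depth exceeds thresVal. */
    static void derive_segmentation(const uint8_t *refSamples,
                                    int width, int height, int log2Size,
                                    uint8_t *segMask) {
        int sumVals = 0;
        for (int i = 0; i < width * height; i++)
            sumVals += refSamples[i];
        int thresVal = sumVals >> log2Size;     /* representative value */
        for (int i = 0; i < width * height; i++)
            segMask[i] = (uint8_t)(refSamples[i] > thresVal); /* 0 or 1 */
    }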
• The image composition unit 30953 derives the interpolation image predSamplesLX using the segMask derived by the segmentation unit 30952 and the two interpolation images (predSamplesA and predSamplesB) derived by the DBBP image interpolation unit 30951.
  • the derived interpolation image predSamplesL0 and interpolation image predSamplesL1 are output to the weighted prediction unit 3096.
  • FIG. 22 is a diagram illustrating the image composition unit 30953. Based on the segmentation information segMask, the image composition unit 30953 selects one of the two interpolated images for each pixel, and further performs a filtering process to derive a predicted image predSamplesLX (here, PredSamplesDbbp).
• The interpolation image predSamples is set in predSamplesDbbp according to the value of partIdx, which distinguishes the two interpolation images, and the segmentation information segMask[x][y].
• Note that the image composition unit 30953 may select pixels from the two interpolation images based on the upper-left pixel segMask[0][0] of segMask. In this case, the image composition unit 30953 selects pixels from the interpolation image corresponding to a partIdx different from segMask[0][0], as follows. The image composition unit 30953 derives a flag curSegmentFlag, which indicates whether or not the values of partIdx and segMask[0][0] are equal and is set to 1 when they differ.
• The image composition unit 30953 assigns the interpolation image predSamples to the predicted image PredSamplesDbbp according to the segmentation information segMask[x][y] of each pixel, as in the following equations (shown here for the chroma components):
• PredSamplesDbbpCb[x / 2][y / 2] = predSamplesCb[x / 2][y / 2]
• PredSamplesDbbpCr[x / 2][y / 2] = predSamplesCr[x / 2][y / 2]
• The image composition unit 30953 may further filter each pixel according to the segmentation information segMask[x][y]. For example, with the segmentation information rFlag = segMask[x + 1][y] and cFlag = !rFlag, the target pixel p[x][y] and its adjacent pixels (for example, the right pixel p[x + 1][y]) are filtered using a weight of 1:2:1, as sketched below.
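• The selection and the 1:2:1 smoothing can be sketched in C as follows. This is a hedged illustration: the per-pixel selection by segMask comes from the description above, while the boundary test and the in-place horizontal filtering are assumptions used only to make the sketch concrete:

    #include <stdint.h>

    static void compose_dbbp(const int16_t *predSamplesA,
                             const int16_t *predSamplesB,
                             const uint8_t *segMask,
                             int width, int height,
                             int16_t *predSamplesDbbp) {
        /* Select one of the two interpolation images per pixel. */
        for (int i = 0; i < width * height; i++)
            predSamplesDbbp[i] = segMask[i] ? predSamplesB[i]
                                            : predSamplesA[i];

        /* 1:2:1 horizontal filtering where the segmentation changes
         * (assumed boundary test; done in place for brevity). */
        for (int y = 0; y < height; y++)
            for (int x = 1; x < width - 1; x++) {
                int i = y * width + x;
                if (segMask[i] != segMask[i + 1])
                    predSamplesDbbp[i] = (int16_t)((predSamplesDbbp[i - 1]
                        + 2 * predSamplesDbbp[i]
                        + predSamplesDbbp[i + 1] + 2) >> 2);
            }
    }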
  • the DBBP division mode deriving unit 30954 derives the division mode partMode from the depth block refSamples corresponding to the target block.
• According to the DBBP prediction unit 3095 having the above configuration, the DBBP image interpolation unit 30951 generates the two interpolation images used for DBBP prediction synthesis by bilinear interpolation. Compared with generating the interpolation images of the DBBP prediction unit with the usual 8-tap or 4-tap filters, the processing amount and the transfer amount are greatly reduced.
• Specifically, when motion compensation is performed by the motion displacement compensation unit 3091, which uses 8 taps for luminance and 4 taps for chrominance, DBBP motion compensation has very high complexity compared with motion compensation in bi-prediction of 8 × 8 blocks: up to 170% for multiplication, up to 170% for addition, and up to 119% for memory bandwidth.
• By using bilinear interpolation, the worst-case complexity of DBBP can be reduced to at most 44% for multiplication, at most 26% for addition, and 70% for memory bandwidth.
• Moreover, the decrease in coding efficiency due to the use of this bilinear interpolation is 0.00% on average over 8 sequences, so the processing amount can be reduced without lowering the coding efficiency.
• According to the DBBP prediction unit 3095 having the above configuration, the segmentation unit 30952 derives segmentation information segMask that takes 0 or 1 for each pixel, and the image composition unit 30953 performs synthesis by selecting one of the two motion compensation images at each pixel of the target block based on the segmentation information segMask. Accordingly, the processing of the image composition unit 30953 is reduced compared with a case where some pixels are synthesized by weighting the two motion compensation images, for example with a weight of 1/2.
• Furthermore, compared with a composition that refers not only to the segmentation information segMask[x][y] corresponding to each pixel (x, y) but also to the segmentation information above, below, left, and right of it (segMask[x][y - 1], segMask[x][y + 1], segMask[x - 1][y], and segMask[x + 1][y]), the processing of the image composition unit 30953 is significantly reduced.
• The DBBP division mode deriving unit 30954 included in the DBBP prediction unit 3095 may be replaced by another process; for example, a DBBP division mode deriving unit 30954A or a DBBP division mode deriving unit 30954B, described later, may be used.
  • the DBBP prediction unit 3095A includes a DBBP image interpolation unit 30951, a segmentation unit 30952, an image synthesis unit 30953, and a DBBP division mode derivation unit 30954A.
  • the DBBP prediction unit 3095A basically has the same configuration as the DBBP prediction unit 3095, but includes a DBBP split mode deriving unit 30954A instead of the DBBP split mode deriving unit 30954. Since the DBBP image interpolation unit 30951 and the segmentation unit 30952 have already been described, description thereof will be omitted.
• The DBBP prediction unit 3095A does not derive an asymmetric partition (AMP partition: SIZE_2NxnU, SIZE_2NxnD, SIZE_nLx2N, SIZE_nRx2N) as the division mode; only the two symmetric division modes N × 2N and 2N × N are targeted.
• According to the DBBP prediction unit 3095A having the above configuration, the DBBP division mode deriving unit 30954A uses only N × 2N or 2N × N as the division mode, which reduces the processing amount. That is, since only the total values partSum[0] and partSum[1] for the two division modes are derived, the processing amount can be reduced to one third compared with deriving the total values partSum[0], partSum[1], partSum[2], partSum[3], partSum[4], and partSum[5] for the six division modes.
• DBBP prediction unit 3095B
  • the DBBP prediction unit 3095B basically has the same configuration as the DBBP prediction unit 3095, but includes a DBBP split mode deriving unit 30954B instead of the DBBP split mode deriving unit 30954.
• The DBBP division mode deriving unit 30954B determines the division mode PartMode by referring only to the four corner pixels, shown in FIG. 13, of the depth block refSamples (hereinafter refDepPels) corresponding to the target block.
• Specifically, the coordinates xP0, xP1, yP0, and yP1 used for the upper-left coordinates (xP0, yP0), upper-right coordinates (xP1, yP0), lower-left coordinates (xP0, yP1), and lower-right coordinates (xP1, yP1) are derived by the following expressions:
• xP0 = Clip3(0, pic_width_in_luma_samples - 1, xTL)
• yP0 = Clip3(0, pic_height_in_luma_samples - 1, yTL)
• xP1 = Clip3(0, pic_width_in_luma_samples - 1, xTL + nPSW - 1)
• yP1 = Clip3(0, pic_height_in_luma_samples - 1, yTL + nPSH - 1)
• The DBBP division mode deriving unit 30954B derives the division flag horSplitFlag from the comparison of the upper-left pixel TL = refDepPels[xP0][yP0] with the lower-right pixel BR = refDepPels[xP1][yP1] (refDepPels[xP0][yP0] < refDepPels[xP1][yP1]) and the comparison of the upper-right pixel TR = refDepPels[xP1][yP0] with the lower-left pixel BL = refDepPels[xP0][yP1] (refDepPels[xP1][yP0] < refDepPels[xP0][yP1]).
• The DBBP division mode deriving unit 30954B then assigns 2N × N or N × 2N according to the division flag horSplitFlag. Specifically, the division mode PartMode is derived by assigning 2N × N when horSplitFlag is 1 and N × 2N when horSplitFlag is 0.
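• A hedged C sketch of this corner-based derivation follows. The document states the two comparisons (TL with BR, TR with BL) and the mapping of horSplitFlag to 2N × N / N × 2N; the exact way the two comparison results are combined into horSplitFlag is an assumption:

    #include <stdint.h>

    typedef enum { PART_2NxN, PART_Nx2N } DbbpPartMode;

    static DbbpPartMode derive_dbbp_part_mode(const uint8_t *refDepPels,
                                              int stride,
                                              int xP0, int yP0,
                                              int xP1, int yP1) {
        int tl = refDepPels[yP0 * stride + xP0];  /* upper-left  TL */
        int tr = refDepPels[yP0 * stride + xP1];  /* upper-right TR */
        int bl = refDepPels[yP1 * stride + xP0];  /* lower-left  BL */
        int br = refDepPels[yP1 * stride + xP1];  /* lower-right BR */
        /* Assumed combination of the two stated comparisons. */
        int horSplitFlag = ((tl < br) == (tr < bl));
        return horSplitFlag ? PART_2NxN : PART_Nx2N;
    }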
• According to the DBBP prediction unit 3095B having the above configuration, only limited pixels of the depth block (here, the pixels at the four corners of the block) are referred to, so the processing amount is greatly reduced compared with referring to all pixels. Further, since it is not necessary to calculate the depth representative value or to derive the total value partSum[] for each division mode, the processing amount is further reduced.
• In addition, since the division mode is derived by the simple process of comparing the upper-left pixel with the lower-right pixel of the depth and comparing the upper-right pixel with the lower-left pixel of the depth, the processing amount is greatly reduced compared with performing comparisons on all pixels.
• DBBP prediction unit 3095C
  • the DBBP prediction unit 3095C has basically the same configuration as the DBBP prediction unit 3095B, but includes a DBBP split mode deriving unit 30954C instead of the DBBP split mode deriving unit 30954B.
  • FIG. 21 is a block diagram illustrating a configuration of the DBBP prediction unit 3095C according to the embodiment of this invention.
  • the DBBP prediction unit 3095C includes a DBBP image interpolation unit 30951, a segmentation unit 30952, an image synthesis unit 30953, and a DBBP division mode derivation unit 30954C.
  • the DBBP split mode deriving unit 30954C uses the split flag deriving unit 353 to derive a split flag horSplitFlag having a value of 0 or 1.
  • the DBBP split mode deriving unit 30954C derives a split mode based on horSplitFlag. For example, the DBBP split mode deriving unit 30954C derives a split mode by assigning 2N ⁇ N when horSplitFlag is 1 and N ⁇ 2N when horSplitFlag is 0.
• An image decoding apparatus including the DBBP prediction unit 3095C and the VSP prediction unit 30374 includes, in the DBBP prediction unit 3095C, the segmentation unit 30952 that derives segmentation information from a depth image, the DBBP image interpolation unit 30951 that generates two motion compensation images, the image synthesis unit 30953 that synthesizes the two interpolation images to generate one motion compensation image, and the DBBP division mode deriving unit 30954C that derives the division mode.
• The VSP prediction unit 30374 includes a partition division unit that performs partition division according to the depth, and a depth motion (DV) derivation unit 351 that derives a motion vector from the depth image. Further, the DBBP division mode deriving unit 30954C and the partition division unit of the VSP prediction unit 30374 include the common division flag derivation unit 353.
• According to the DBBP prediction unit 3095C having the above configuration, only limited pixels of the depth block (here, the pixels at the four corners) are referred to, so the processing amount is significantly reduced compared with referring to all pixels.
• Further, according to the image decoding apparatus including the DBBP prediction unit 3095C, the DBBP prediction unit 3095C and the VSP prediction unit 30374 use the common division flag derivation unit 353, so the implementation is simplified compared with the case where the DBBP prediction unit and the VSP prediction unit derive the division method using different methods.
• In addition, since the division mode PartMode is derived by the simple process of comparing the upper-left pixel with the lower-right pixel of the depth block and comparing the upper-right pixel with the lower-left pixel of the depth block, the processing amount is significantly reduced compared with performing comparisons on all pixels.
• Note that the DBBP prediction unit 3095C and the VSP prediction unit 30374 may use the division flag derivation unit 353A instead of the division flag derivation unit 353 as the common division flag derivation unit. Also in this case, according to the image decoding apparatus including the DBBP prediction unit 3095C, the DBBP prediction unit 3095C and the VSP prediction unit 30374 use the same division flag derivation unit 353A, so the implementation is simplified compared with the case where the DBBP prediction unit and the VSP prediction unit derive the division method using different methods.
  • the image decoding apparatus is configured not to apply bi-prediction in the case of DBBP.
  • the image decoding apparatus according to the modified example includes an inter prediction parameter decoding control unit 3031A instead of the inter prediction parameter decoding control unit 3031 and a merge mode parameter deriving unit 3036A instead of the merge mode parameter deriving unit 3036. Since operations other than the inter prediction parameter decoding control unit 3031A and the merge mode parameter deriving unit 3036A are as described above, description thereof will be omitted.
• FIG. 23 is a schematic diagram illustrating the configuration of the inter prediction parameter decoding control unit 3031A according to the present embodiment.
• The inter prediction parameter decoding control unit 3031A has the same configuration as the inter prediction parameter decoding control unit 3031, but includes an inter prediction identifier decoding unit 30312A instead of the inter prediction identifier decoding unit 30312.
  • FIG. 24 is a diagram for explaining the derivation of inter_pred_flag in the inter prediction parameter decoding control unit 3031A.
  • inter_pred_flag is decoded when the slice type is B (bi-prediction is available).
  • FIG. 24A is a diagram showing values that inter_pred_flag can take, and
  • FIG. 24B shows a bit string (binarization) after CABAC decoding of inter_pred_flag.
• The inter prediction identifier decoding unit 30312A of the inter prediction parameter decoding control unit 3031A restricts bi-prediction when, for example, the following bi-prediction restriction condition 2 holds: the DBBP flag dbbp_flag is 1.
• The bit strings of inter_pred_flag are 00, 01, and 1, corresponding to 0 (PRED_L0), 1 (PRED_L1), and 2 (PRED_BI), respectively.
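• A minimal C sketch of this restriction (read_bin stands in for a CABAC bin read; the one-bin binarization used in the restricted case is an assumption, while the 00/01/1 mapping and the restriction itself come from the description above):

    enum { PRED_L0 = 0, PRED_L1 = 1, PRED_BI = 2 };

    static int decode_inter_pred_flag(int dbbp_flag,
                                      int (*read_bin)(void)) {
        if (dbbp_flag) {
            /* Bi-prediction restriction condition 2: only uni-prediction
             * values are decodable (assumed one-bin binarization). */
            return read_bin() ? PRED_L1 : PRED_L0;
        }
        if (read_bin())                        /* "1"  -> PRED_BI       */
            return PRED_BI;
        return read_bin() ? PRED_L1 : PRED_L0; /* "00" -> L0, "01" -> L1 */
    }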
  • FIG. 25 is a block diagram illustrating a configuration of the merge mode parameter deriving unit 3036A.
• The merge mode parameter derivation unit 3036A includes a merge candidate derivation unit 30361, a merge candidate selection unit 30362, and a bi-prediction restriction unit 30363A.
  • the partial operation of the merge mode parameter deriving unit 3036A and the means other than the bi-prediction limiting unit 30363A are as described above, and thus the description thereof is omitted.
• The merge mode parameter deriving unit 3036A outputs to the bi-prediction restriction unit 30363A, in addition to the prediction parameter derived by the merge candidate selection unit 30362 and the width nOrigPbW and height nOrigPbH of the prediction unit, the DBBP flag dbbp_flag.
• According to the above configuration, the interpolation image derived by the DBBP prediction unit 3095 is limited to the case of uni-prediction (the case of a reference picture from L0 or L1), so the processing amount and the transfer amount are greatly reduced compared with generating an interpolation image using bi-prediction.
  • FIG. 26 is a block diagram illustrating a configuration of the image encoding device 11 according to the present embodiment.
  • the image encoding device 11 includes a prediction image generation unit 101, a subtraction unit 102, a DCT / quantization unit 103, an entropy encoding unit 104, an inverse quantization / inverse DCT unit 105, an addition unit 106, a prediction parameter memory (prediction parameter storage). Section, frame memory) 108, reference picture memory (reference image storage unit, frame memory) 109, coding parameter determination unit 110, and prediction parameter coding unit 111.
  • the prediction parameter encoding unit 111 includes an inter prediction parameter encoding unit 112 and an intra prediction parameter encoding unit 113.
  • the predicted image generation unit 101 generates predicted picture block predSamples for each block that is an area obtained by dividing the picture for each viewpoint of the layer image T input from the outside.
  • the predicted image generation unit 101 reads the reference picture block from the reference picture memory 109 based on the prediction parameter input from the prediction parameter encoding unit 111.
  • the prediction parameter input from the prediction parameter encoding unit 111 is, for example, a motion vector or a displacement vector.
  • the predicted image generation unit 101 reads the reference picture block of the block at the position indicated by the motion vector or the displacement vector predicted from the encoded prediction unit.
  • the predicted image generation unit 101 generates predicted picture blocks predSamples using one prediction method among a plurality of prediction methods for the read reference picture block.
  • the predicted image generation unit 101 outputs the generated predicted picture block predSamples to the subtraction unit 102 and the addition unit 106. Note that since the predicted image generation unit 101 performs the same operation as the predicted image generation unit 308 already described, details of generation of the predicted picture block predSamples are omitted.
• For example, the predicted image generation unit 101 selects the prediction method that minimizes an error value based on the difference between the signal value of each pixel of a block included in the layer image and the signal value of the corresponding pixel of the predicted picture block predSamples. Note that the method of selecting the prediction method is not limited to this.
  • the plurality of prediction methods are intra prediction, motion prediction, and merge mode.
  • Motion prediction is prediction between display times among the above-mentioned inter predictions.
  • the merge mode is a prediction that uses the same reference picture block and prediction parameter as a block that has already been encoded and is within a predetermined range from the prediction unit.
  • the plurality of prediction methods are intra prediction, motion prediction, merge mode (including viewpoint synthesis prediction), and displacement prediction.
  • the displacement prediction (disparity prediction) is prediction between different layer images (different viewpoint images) in the above-described inter prediction. For displacement prediction (disparity prediction), there are predictions with and without additional prediction (residual prediction and illuminance compensation).
• When intra prediction is selected, the predicted image generation unit 101 outputs a prediction mode PredMode indicating the intra prediction mode used when generating the predicted picture block predSamples to the prediction parameter encoding unit 111.
  • the prediction image generation unit 101 stores the motion vector mvLX used when generating the prediction picture block predSamples in the prediction parameter memory 108 and outputs the motion vector mvLX to the inter prediction parameter encoding unit 112.
  • the motion vector mvLX indicates a vector from the position of the encoded prediction unit to the position of the reference picture block when the predicted picture block predSamples is generated.
  • the information indicating the motion vector mvLX may include information indicating a reference picture (for example, a reference picture index refIdxLX, a picture order number POC), and may represent a prediction parameter.
  • the predicted image generation unit 101 outputs a prediction mode PredMode indicating the inter prediction mode to the prediction parameter encoding unit 111.
• When displacement prediction is selected, the predicted image generation unit 101 stores the displacement vector used when generating the predicted picture block predSamples in the prediction parameter memory 108, and outputs it to the inter prediction parameter encoding unit 112.
  • the displacement vector dvLX indicates a vector from the position of the encoded prediction unit to the position of the reference picture block when the predicted picture block predSamples is generated.
  • the information indicating the displacement vector dvLX may include information indicating a reference picture (for example, reference picture index refIdxLX, view IDview_id) and may represent a prediction parameter.
  • the predicted image generation unit 101 outputs a prediction mode PredMode indicating the inter prediction mode to the prediction parameter encoding unit 111.
• When the merge mode is selected, the predicted image generation unit 101 outputs a merge index merge_idx indicating the selected reference picture block to the inter prediction parameter encoding unit 112. Further, the predicted image generation unit 101 outputs a prediction mode PredMode indicating the merge mode to the prediction parameter encoding unit 111.
• The predicted image generation unit 101 performs viewpoint synthesis prediction in the VSP prediction unit 30374 included in the predicted image generation unit 101, as described above. Further, in motion prediction, displacement prediction, and merge mode, when the residual prediction execution flag resPredFlag indicates that residual prediction is to be performed, the residual prediction unit 3092 included in the predicted image generation unit 101 performs residual prediction, as described above.
  • the subtraction unit 102 subtracts the signal value of the prediction picture block predSamples input from the prediction image generation unit 101 for each pixel from the signal value of the corresponding block of the layer image T input from the outside, and generates a residual signal. Generate.
  • the subtraction unit 102 outputs the generated residual signal to the DCT / quantization unit 103 and the encoding parameter determination unit 110.
  • the DCT / quantization unit 103 performs DCT on the residual signal input from the subtraction unit 102 and calculates a DCT coefficient.
  • the DCT / quantization unit 103 quantizes the calculated DCT coefficient to obtain a quantization coefficient.
  • the DCT / quantization unit 103 outputs the obtained quantization coefficient to the entropy encoding unit 104 and the inverse quantization / inverse DCT unit 105.
  • the entropy coding unit 104 receives the quantization coefficient from the DCT / quantization unit 103 and the coding parameter from the coding parameter determination unit 110.
  • the input encoding parameters include codes such as a reference picture index refIdxLX, a prediction vector flag mvp_LX_flag, a difference vector mvdLX, a prediction mode PredMode, a merge index merge_idx, a residual prediction index iv_res_pred_weight_idx, and an illumination compensation flag ic_flag.
  • the entropy encoding unit 104 generates an encoded stream Te by entropy encoding the input quantization coefficient and encoding parameter, and outputs the generated encoded stream Te to the outside.
  • the inverse quantization / inverse DCT unit 105 inversely quantizes the quantization coefficient input from the DCT / quantization unit 103 to obtain a DCT coefficient.
  • the inverse quantization / inverse DCT unit 105 performs inverse DCT on the obtained DCT coefficient to calculate a decoded residual signal.
  • the inverse quantization / inverse DCT unit 105 outputs the calculated decoded residual signal to the addition unit 106 and the encoding parameter determination unit 110.
  • the addition unit 106 adds the signal value of the prediction picture block predSamples input from the prediction image generation unit 101 and the signal value of the decoded residual signal input from the inverse quantization / inverse DCT unit 105 for each pixel, and refers to them. Generate a picture block.
  • the adding unit 106 stores the generated reference picture block in the reference picture memory 109.
  • the prediction parameter memory 108 stores the prediction parameter generated by the prediction parameter encoding unit 111 at a predetermined position for each picture and block to be encoded.
  • the reference picture memory 109 stores the reference picture block generated by the adding unit 106 at a predetermined position for each picture and block to be encoded.
  • the encoding parameter determination unit 110 selects one set from among a plurality of sets of encoding parameters.
  • the encoding parameter is a parameter to be encoded that is generated in association with the above-described prediction parameter or the prediction parameter.
  • the predicted image generation unit 101 generates predicted picture blocks predSamples using each of these sets of encoding parameters.
  • the encoding parameter determination unit 110 calculates a cost value indicating the amount of information and the encoding error for each of a plurality of sets.
  • the cost value is, for example, the sum of a code amount and a square error multiplied by a coefficient ⁇ .
  • the code amount is the information amount of the encoded stream Te obtained by entropy encoding the quantization error and the encoding parameter.
  • the square error is the sum between pixels regarding the square value of the residual value of the residual signal calculated by the subtracting unit 102.
• The coefficient λ is a preset real number greater than zero.
  • the encoding parameter determination unit 110 selects a set of encoding parameters that minimizes the calculated cost value. As a result, the entropy encoding unit 104 outputs the selected set of encoding parameters to the outside as the encoded stream Te, and does not output the set of unselected encoding parameters.
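• A minimal C sketch of this selection (the CodingParams structure and the use of doubles are illustrative assumptions; the cost rule, code amount plus the square error multiplied by λ, is the one described above):

    /* One candidate set of coding parameters with its measured rate
     * (code amount) and squared error. */
    typedef struct {
        double rate;  /* code amount of the entropy-coded stream */
        double ssd;   /* sum of squared residual values          */
    } CodingParams;

    /* Returns the index of the candidate minimizing rate + lambda * ssd. */
    static int select_coding_params(const CodingParams *cand, int n,
                                    double lambda) {
        int best = 0;
        double bestCost = cand[0].rate + lambda * cand[0].ssd;
        for (int i = 1; i < n; i++) {
            double cost = cand[i].rate + lambda * cand[i].ssd;
            if (cost < bestCost) { bestCost = cost; best = i; }
        }
        return best;
    }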
• The prediction parameter encoding unit 111 derives the prediction parameters used when generating a predicted picture based on the parameters input from the predicted image generation unit 101, and encodes the derived prediction parameters to generate a set of encoding parameters.
  • the prediction parameter encoding unit 111 outputs the generated set of encoding parameters to the entropy encoding unit 104.
  • the prediction parameter encoding unit 111 stores, in the prediction parameter memory 108, a prediction parameter corresponding to the set of the generated encoding parameters selected by the encoding parameter determination unit 110.
  • the prediction parameter encoding unit 111 operates the inter prediction parameter encoding unit 112 when the prediction mode PredMode input from the predicted image generation unit 101 indicates the inter prediction mode.
  • the prediction parameter encoding unit 111 operates the intra prediction parameter encoding unit 113 when the prediction mode PredMode indicates the intra prediction mode.
  • the inter prediction parameter encoding unit 112 derives an inter prediction parameter based on the prediction parameter input from the encoding parameter determination unit 110.
  • the inter prediction parameter encoding unit 112 includes the same configuration as the configuration in which the inter prediction parameter decoding unit 303 derives the inter prediction parameter as a configuration for deriving the inter prediction parameter.
  • the configuration of the inter prediction parameter encoding unit 112 will be described later.
• The intra prediction parameter encoding unit 113 determines the intra prediction mode IntraPredMode indicated by the prediction mode PredMode input from the encoding parameter determination unit 110 as a set of intra prediction parameters.
  • the inter prediction parameter encoding unit 112 is means corresponding to the inter prediction parameter decoding unit 303.
  • FIG. 27 is a schematic diagram illustrating the configuration of the inter prediction parameter encoding unit 112 according to the present embodiment.
  • the inter prediction parameter encoding unit 112 includes a merge mode parameter deriving unit 1121, an AMVP prediction parameter deriving unit 1122, a subtracting unit 1123, and an inter prediction parameter encoding control unit 1126.
  • the merge mode parameter derivation unit 1121 has the same configuration as the merge mode parameter derivation unit 3036 (see FIG. 9).
  • the AMVP prediction parameter derivation unit 1122 has the same configuration as the AMVP prediction parameter derivation unit 3032 (see FIG. 10).
  • the subtraction unit 1123 subtracts the prediction vector mvpLX input from the AMVP prediction parameter derivation unit 1122 from the vector mvLX input from the coding parameter determination unit 110 to generate a difference vector mvdLX.
  • the difference vector mvdLX is output to the inter prediction parameter encoding control unit 1126.
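• As an illustration, the difference vector derivation is simply a per-component subtraction (the Mv structure is an assumption):

    typedef struct { int x, y; } Mv;

    /* mvdLX = mvLX - mvpLX, the value passed to the entropy coder. */
    static Mv derive_mvd(Mv mvLX, Mv mvpLX) {
        Mv mvdLX = { mvLX.x - mvpLX.x, mvLX.y - mvpLX.y };
        return mvdLX;
    }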
• The inter prediction parameter encoding control unit 1126 instructs the entropy encoding unit 104 to encode the codes (syntax elements) related to inter prediction; the codes (syntax elements) included in the encoded data are, for example, the merge flag merge_flag, merge index merge_idx, inter prediction identifier inter_pred_idc, reference picture index refIdxLX, prediction vector flag mvp_LX_flag, and difference vector mvdLX.
• The inter prediction parameter encoding control unit 1126 includes a residual prediction index encoding unit 10311, an illumination compensation flag encoding unit 10312, a merge index encoding unit, a vector candidate index encoding unit, a division mode encoding unit, a merge flag encoding unit, an inter prediction identifier encoding unit, a reference picture index encoding unit, and a vector difference encoding unit.
• The division mode encoding unit, merge flag encoding unit, merge index encoding unit, inter prediction identifier encoding unit, reference picture index encoding unit, vector candidate index encoding unit, and vector difference encoding unit encode the division mode part_mode, merge flag merge_flag, merge index merge_idx, inter prediction identifier inter_pred_idc, reference picture index refIdxLX, prediction vector flag mvp_LX_flag, and difference vector mvdLX, respectively.
  • the residual prediction index encoding unit 10311 encodes the residual prediction index iv_res_pred_weight_idx to indicate whether or not residual prediction is performed.
  • the illuminance compensation flag encoding unit 10312 encodes the illuminance compensation flag ic_flag to indicate whether or not illuminance compensation is performed.
• When the prediction mode PredMode indicates the merge mode, the inter prediction parameter encoding control unit 1126 outputs the merge index merge_idx input from the encoding parameter determination unit 110 to the entropy encoding unit 104 to be encoded.
  • the inter prediction parameter encoding control unit 1126 performs the following process when the prediction mode PredMode input from the predicted image generation unit 101 indicates the inter prediction mode.
  • the inter prediction parameter encoding control unit 1126 integrates the reference picture index refIdxLX and the prediction vector flag mvp_LX_flag input from the encoding parameter determination unit 110 and the difference vector mvdLX input from the subtraction unit 1123.
  • the inter prediction parameter encoding control unit 1126 outputs the integrated code to the entropy encoding unit 104 to be encoded.
• The inter prediction parameter encoding control unit 1126 also includes a DBBP flag encoding unit (not shown) that encodes the DBBP flag dbbp_flag.
  • the predicted image generation unit 101 is a means corresponding to the predicted image generation unit 308 described above, and the process of generating a predicted image from the prediction parameters is the same.
• The predicted image generation unit 101 also includes the above-described residual synthesis unit 30923, in the same manner as the predicted image generation unit 308. That is, residual prediction is not performed when the size of the target block (prediction block) is equal to or smaller than a predetermined size. The predicted image generation unit 101 of the present embodiment also performs residual prediction only when the division mode part_mode of the coding unit CU is 2N × 2N; otherwise, the residual prediction index iv_res_pred_weight_idx is set to 0. Further, the residual prediction index encoding unit 10311 of the present embodiment encodes the residual prediction index iv_res_pred_weight_idx only when the division mode part_mode of the coding unit CU is 2N × 2N.
• That is, the image encoding device including the residual prediction unit 3092 is an image encoding device including a residual prediction index encoding unit that encodes the residual prediction index; it encodes the residual prediction index when the division mode of the coding unit including the target block is 2N × 2N, does not encode it otherwise, and performs residual prediction when the residual prediction index is other than 0.
  • the predicted image generation unit 101 included in the image encoding device 11 of the present embodiment includes a DBBP prediction unit 3095. Details of the operation of the DBBP prediction unit 3095 have already been described, and are omitted here.
• The DBBP prediction unit 3095 performs depth-based block prediction when the above-described DBBP flag encoding unit encodes 1 as the DBBP flag dbbp_flag.
• According to the image encoding device including the DBBP prediction unit 3095 having the above configuration, the DBBP image interpolation unit 30951 generates the two interpolation images by bilinear prediction, so the processing amount and the transfer amount are greatly reduced.
• Further, the segmentation unit 30952 derives segmentation information segMask that takes 0 or 1 for each pixel, and the image synthesis unit 30953 performs synthesis by selecting one of the two motion compensation images at each pixel of the target block based on the segmentation information segMask. Accordingly, the processing of the image synthesis unit 30953 is reduced compared with a case where some pixels are synthesized by weighting the two motion compensation images, for example with a weight of 1/2.
• According to the image encoding device including the DBBP prediction unit 3095 having the above configuration, the processing of the image composition unit 30953 is greatly reduced compared with a composition that refers not only to the segmentation information segMask[x][y] corresponding to each pixel (x, y) but also to the segmentation information above, below, left, and right of it (segMask[x][y - 1], segMask[x][y + 1], segMask[x - 1][y], and segMask[x + 1][y]).
  • the predicted image generation unit 101 included in the image encoding device 11 may include any of the DBBP prediction unit 3095A, the DBBP prediction unit 3095B, and the DBBP prediction unit 3095C instead of the DBBP prediction unit 3095.
• According to the image encoding device including any of the DBBP prediction units 3095A to 3095C having the above configuration, only limited pixels of the depth block (here, the four corner pixels) are referred to, so the processing amount is significantly reduced compared with referring to all pixels.
• Further, since the division mode is derived by the simple process of comparing the upper-left pixel with the lower-right pixel of the depth and comparing the upper-right pixel with the lower-left pixel of the depth, the processing amount is greatly reduced compared with comparing all pixels.
• Further, according to the image encoding device including the DBBP prediction unit 3095B or the DBBP prediction unit 3095C having the above configuration, only N × 2N or 2N × N is used as the division mode, as in the DBBP prediction unit 3095A, which reduces the processing amount compared with also targeting AMP division.
• Further, according to the image encoding device including the DBBP prediction unit 3095C, the DBBP prediction unit 3095C and the VSP prediction unit 30374 use the common division flag derivation unit 353, so the implementation is simplified compared with the case where the DBBP prediction unit and the VSP prediction unit derive the division method using different methods.
  • the image encoding device 11 is configured not to apply bi-prediction in the case of DBBP.
• The image encoding device 11 according to this modification includes an inter prediction parameter encoding unit 103A (not shown) instead of the inter prediction parameter encoding unit 103, and a merge mode parameter deriving unit 3036A (not shown) instead of the merge mode parameter deriving unit 3036.
  • the inter prediction parameter encoding unit 103A is means corresponding to the above-described inter prediction parameter decoding control unit 3031A.
• That is, the image encoding device includes the DBBP prediction unit 3095 and the merge mode parameter derivation unit 3036A. The DBBP prediction unit 3095 includes the segmentation unit 30952 that derives segmentation information from a depth image, the DBBP image interpolation unit 30951 that generates two motion compensation images, and the image synthesis unit 30953 that synthesizes the two interpolation images to generate one motion compensation image. The image encoding device further includes a DBBP flag encoding unit (not shown) that encodes the DBBP flag, and the merge mode parameter deriving unit 3036A performs conversion from bi-prediction to uni-prediction when the DBBP flag is 1.
• According to the above configuration, the interpolation image derived by the DBBP prediction unit 3095 is limited to uni-prediction (the case of a reference picture from L0 or L1; that is, either predFlagL0 or predFlagL1 is limited to 1), so the worst case of the processing amount and the transfer amount is greatly reduced compared with the case where an interpolation image can be generated using DBBP prediction in bi-prediction.
• According to the image encoding device including the inter prediction parameter encoding unit 103A having the above configuration, when the DBBP flag dbbp_flag is 1, a uni-prediction value (PRED_L0 or PRED_L1) is encoded as the inter prediction identifier inter_pred_idc and the bi-prediction value PRED_BI is not encoded, so performing bi-prediction is prohibited in the case of DBBP prediction. Therefore, the worst case of the processing amount and the transfer amount is greatly reduced compared with the case where an interpolation image can be generated using DBBP prediction in bi-prediction.
• A part of the image encoding device 11 and the image decoding device 31 in the above-described embodiment (for example, the entropy decoding unit 301, the prediction parameter decoding unit 302, the predicted image generation unit 101, the DCT/quantization unit 103, the entropy encoding unit 104, the inverse quantization/inverse DCT unit 105, the encoding parameter determination unit 110, the prediction parameter encoding unit 111, the entropy decoding unit 301, the prediction parameter decoding unit 302, the predicted image generation unit 308, and the inverse quantization/inverse DCT unit 311) may be realized by a computer.
  • the program for realizing the control function may be recorded on a computer-readable recording medium, and the program recorded on the recording medium may be read by a computer system and executed.
  • the “computer system” is a computer system built in either the image encoding device 11 or the image decoding device 31 and includes an OS and hardware such as peripheral devices.
• The “computer-readable recording medium” refers to a portable medium such as a flexible disk, a magneto-optical disk, a ROM, or a CD-ROM, or a storage device such as a hard disk incorporated in a computer system.
• Furthermore, the “computer-readable recording medium” may also include a medium that dynamically holds the program for a short time, such as a communication line used when the program is transmitted via a network such as the Internet or a communication line such as a telephone line, and a medium that holds the program for a certain period of time, such as a volatile memory inside a computer system serving as a server or a client in that case.
• The program may be a program for realizing a part of the functions described above, or may be a program capable of realizing the functions described above in combination with a program already recorded in a computer system.
• Part or all of the image encoding device 11 and the image decoding device 31 in the above-described embodiment may be realized as an integrated circuit such as an LSI (Large Scale Integration).
  • Each functional block of the image encoding device 11 and the image decoding device 31 may be individually made into a processor, or a part or all of them may be integrated into a processor.
  • the method of circuit integration is not limited to LSI, and may be realized by a dedicated circuit or a general-purpose processor. Further, in the case where an integrated circuit technology that replaces LSI appears due to progress in semiconductor technology, an integrated circuit based on the technology may be used.
• One aspect of the present invention is a depth-based block prediction image generation apparatus including a segmentation deriving unit that derives segmentation information from a depth image, an image interpolation unit that generates two motion compensation images, and an image synthesizing unit that synthesizes the two interpolation images to generate one motion compensation image, wherein the image interpolation unit generates the two motion compensation images by bilinear prediction.
• One aspect of the present invention is a depth-based block prediction image generation apparatus including a segmentation deriving unit that derives segmentation information from a depth image, an image interpolation unit that generates two motion compensation images, and an image synthesizing unit that synthesizes the two interpolation images to generate one motion compensation image, further including a depth division mode deriving unit that derives a division mode from the depth image, wherein the depth division mode deriving unit derives the division mode from the pixels at the four corners of the depth block.
  • One embodiment of the present invention is characterized in that the depth division mode deriving unit derives the division mode from the comparison of the upper left and lower right of the depth and the comparison of the upper right and lower left of the depth.
• One aspect of the present invention is a depth-based block prediction image generation apparatus including a segmentation deriving unit that derives segmentation information from a depth image, an image interpolation unit that generates two motion compensation images, and an image synthesizing unit that synthesizes the two interpolation images to generate one motion compensation image, further including a depth division mode deriving unit that derives a division mode from the depth image, wherein the depth division mode deriving unit derives a division mode of 2N × N or N × 2N.
• One aspect of the present invention is a depth-based block prediction image generation apparatus including a segmentation deriving unit that derives segmentation information from a depth image, an image interpolation unit that generates two motion compensation images, and an image synthesizing unit that synthesizes the two interpolation images to generate one motion compensation image, further including a depth division mode deriving unit that derives a division mode from the depth image, wherein the segmentation deriving unit derives segmentation information that takes 0 or 1 for each pixel, and the image synthesizing unit performs synthesis by selecting one of the two interpolation images at each pixel of the block.
  • One embodiment of the present invention is an image decoding device including the depth base block prediction image generation device and the DBBP flag decoding unit, and the depth base block prediction image generation device performs DBBP prediction when the DBBP flag is 1. It is characterized by performing.
• One aspect of the present invention is an image decoding apparatus including a depth-based block prediction image generation unit and a viewpoint synthesis prediction unit, wherein the depth-based block prediction image generation unit includes a segmentation derivation unit that derives segmentation information from a depth image, an image interpolation unit that generates two motion compensation images, an image synthesis unit that synthesizes the two interpolation images to generate one motion compensation image, and a division mode deriving unit that derives a division mode; the viewpoint synthesis prediction unit includes a partition division unit that performs partition division from the depth image and a depth motion vector derivation unit that derives a motion vector from the depth image; and the division mode deriving unit and the partition division unit include a common division mode derivation unit.
• One aspect of the present invention is characterized in that the division mode deriving unit derives the division mode from the comparison of the upper-left pixel with the lower-right pixel of the depth block corresponding to the target block and the comparison of the upper-right pixel with the lower-left pixel of the depth block.
• One aspect of the present invention is characterized in that the division mode deriving unit derives a horizontal absolute difference from two depth pixels of the depth block corresponding to the target block that have the same vertical coordinate, and a vertical absolute difference from two depth pixels that have the same horizontal coordinate; a vertically divided partition mode is derived when the horizontal absolute difference is larger than the vertical absolute difference, and a horizontally divided partition mode is derived otherwise; and the partition division unit divides the block into vertically long 4 × 8 sub-blocks when the horizontal absolute difference is larger than the vertical absolute difference, and into horizontally long 8 × 4 sub-blocks otherwise.
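• A hedged C sketch of this sub-block selection follows; which depth pixel pairs are compared (the corner pixels are used here) is an assumption, while the 4 × 8 / 8 × 4 rule is the one stated above:

    #include <stdint.h>
    #include <stdlib.h>

    static void choose_vsp_subblock(const uint8_t *refDepPels, int stride,
                                    int xP0, int yP0, int xP1, int yP1,
                                    int *subW, int *subH) {
        /* Horizontal difference: two pixels with the same vertical
         * coordinate; vertical difference: same horizontal coordinate. */
        int horAbsDiff = abs(refDepPels[yP0 * stride + xP0]
                           - refDepPels[yP0 * stride + xP1]);
        int verAbsDiff = abs(refDepPels[yP0 * stride + xP0]
                           - refDepPels[yP1 * stride + xP0]);
        if (horAbsDiff > verAbsDiff) { *subW = 4; *subH = 8; } /* vertical   */
        else                         { *subW = 8; *subH = 4; } /* horizontal */
    }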
• One aspect of the present invention is an image decoding apparatus including a depth-based block prediction image generation unit and a merge mode parameter derivation unit, wherein the depth-based block prediction image generation unit includes a segmentation derivation unit that derives segmentation information from a depth image, an image interpolation unit that generates two motion compensation images, and an image synthesis unit that synthesizes the two interpolation images to generate one motion compensation image; the image decoding apparatus further includes a DBBP flag decoding unit; and the merge mode parameter derivation unit converts from bi-prediction to uni-prediction when the DBBP flag is 1.
  • One aspect of the present invention is an image decoding apparatus including a depth base block prediction image generation unit and an inter prediction parameter decoding unit, wherein the depth base block prediction image generation unit includes a segmentation deriving unit for deriving segmentation information from the depth image, An image interpolation unit that generates two motion compensation images and an image synthesis unit that combines the two interpolation images to generate one motion compensation image.
  • the image decoding apparatus further includes a DBBP flag decoding unit.
  • the inter prediction parameter decoding unit does not decode a bi-prediction value as the inter prediction identifier when the DBBP flag is 1.
  • One embodiment of the present invention is an image encoding device including the depth-based block prediction image generation device and a DBBP flag encoding unit, characterized in that the depth-based block prediction image generation device performs DBBP prediction when the DBBP flag is 1.
  • One embodiment of the present invention is an image encoding device including a depth-based block prediction image generation unit and a view synthesis prediction unit, wherein the depth-based block prediction image generation unit includes a segmentation derivation unit that derives segmentation information from a depth image, an image interpolation unit that generates two motion-compensated images, an image synthesis unit that synthesizes the two interpolated images into one motion-compensated image, and a division mode derivation unit that derives a division mode.
  • the view synthesis prediction unit includes a partition division unit that performs partition division based on the depth image and a depth motion vector derivation unit that derives a motion vector from the depth image, and the division mode derivation unit and the partition division unit are provided with a common division mode derivation unit.
  • One aspect of the present invention is an image encoding device including a depth-based block prediction image generation unit and a merge mode parameter derivation unit, wherein the depth-based block prediction image generation unit includes a segmentation derivation unit that derives segmentation information from a depth image, an image interpolation unit that generates two motion-compensated images, and an image synthesis unit that synthesizes the two interpolated images into one motion-compensated image; the image encoding device further includes a DBBP flag encoding unit, and the merge mode parameter derivation unit converts bi-prediction to uni-prediction when the DBBP flag is 1.
  • One aspect of the present invention is an image decoding apparatus including a depth-based block prediction image generation unit and a view synthesis prediction unit, wherein the depth-based block prediction image generation unit includes a segmentation derivation unit that derives segmentation information from a depth image, an image interpolation unit that generates two motion-compensated images, an image synthesis unit that synthesizes the two interpolated images into one motion-compensated image, and a division mode derivation unit that derives a division mode.
  • the view synthesis prediction unit includes a partition division unit that performs partition division based on the depth image and a depth motion vector derivation unit that derives a motion vector from the depth image.
  • the common disparity vector is the disparity vector refined by depth when the block size is larger than a predetermined size, and the disparity vector before depth refinement when the block size is equal to or smaller than the predetermined size (see the selection sketch after this list).
  • the common disparity vector is the disparity vector refined by depth when the sum of the width and height of the prediction block is greater than 16, and the disparity vector before depth refinement otherwise.
  • the common disparity vector is the disparity vector refined by depth when the sum of the width and height of the prediction block is greater than 24, and the disparity vector before depth refinement otherwise.
  • One embodiment of the present invention is an image encoding device including a depth-based block prediction image generation unit and a view synthesis prediction unit, wherein the depth-based block prediction image generation unit includes a segmentation derivation unit that derives segmentation information from a depth image, an image interpolation unit that generates two motion-compensated images, an image synthesis unit that synthesizes the two interpolated images into one motion-compensated image, and a division mode derivation unit that derives a division mode.
  • the view synthesis prediction unit includes a partition division unit that performs partition division based on the depth image and a depth motion vector derivation unit that derives a motion vector from the depth image, and the disparity vector used to derive the position of the depth image referenced by the segmentation derivation unit and the division mode derivation unit of the depth-based block prediction image generation unit, and by the partition division unit and the depth motion vector derivation unit of the view synthesis prediction unit, is a common disparity vector.
  • the present invention can be suitably applied to an image decoding apparatus that decodes encoded data obtained by encoding image data, and to an image encoding apparatus that generates such encoded data. It can also be suitably applied to the data structure of the encoded data generated by an image encoding device and referenced by an image decoding device.
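
The sketches below illustrate, in C++, several of the mechanisms recited in the embodiments above; they are illustrative readings, not the claimed implementations. First, the segmentation derivation: this minimal sketch assumes 8-bit depth samples and thresholds each co-located depth sample against the rounded mean of the block's four corner samples (the threshold construction is an assumption; the embodiments only require that segmentation information be derived from the depth image).

    #include <cstdint>
    #include <vector>

    // Hypothetical sketch: derive a binary segmentation mask from the depth
    // block co-located with the current prediction block. The threshold is
    // the rounded mean of the four corner samples (an assumption).
    std::vector<uint8_t> deriveSegmentation(const uint8_t* depth, int stride,
                                            int width, int height) {
        const int tl = depth[0];
        const int tr = depth[width - 1];
        const int bl = depth[(height - 1) * stride];
        const int br = depth[(height - 1) * stride + width - 1];
        const int threshold = (tl + tr + bl + br + 2) >> 2;

        std::vector<uint8_t> mask(width * height);
        for (int y = 0; y < height; ++y)
            for (int x = 0; x < width; ++x)
                mask[y * width + x] = depth[y * stride + x] > threshold ? 1 : 0;
        return mask;
    }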
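
Next, the image synthesis unit: the embodiments and the abstract state that the two motion-compensated (interpolated) images are combined into one prediction by selecting one of the two interpolated pixels at each position. A minimal sketch, assuming packed 8-bit buffers of identical layout and illustrative names:

    #include <cstdint>

    // Sketch: per-pixel selection between two motion-compensated predictions,
    // driven by the binary segmentation mask; no averaging is performed.
    void dbbpSynthesize(const uint8_t* pred0, const uint8_t* pred1,
                        const uint8_t* mask, uint8_t* dst,
                        int width, int height) {
        for (int y = 0; y < height; ++y)
            for (int x = 0; x < width; ++x) {
                const int i = y * width + x;
                dst[i] = mask[i] ? pred1[i] : pred0[i];
            }
    }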
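
The split rule that compares one horizontal against one vertical absolute depth difference can be sketched as follows. Which sample pairs are compared is an assumption (corner samples are used here); the embodiment only requires two pixels sharing the same vertical coordinate and two sharing the same horizontal coordinate.

    #include <cstdint>
    #include <cstdlib>

    struct SubBlock { int width, height; };

    // Sketch: stronger horizontal depth variation selects vertically long
    // 4x8 sub-blocks; otherwise horizontally long 8x4 sub-blocks are used.
    SubBlock deriveSubBlockSplit(const uint8_t* depth, int stride,
                                 int width, int height) {
        const int horAbsDiff = std::abs(depth[0] - depth[width - 1]);
        const int verAbsDiff = std::abs(depth[0] - depth[(height - 1) * stride]);
        return (horAbsDiff > verAbsDiff) ? SubBlock{4, 8} : SubBlock{8, 4};
    }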
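
The merge-mode restriction when the DBBP flag is 1 can be pictured as dropping one prediction list from a bi-predictive candidate. The candidate structure and the choice of keeping list 0 are assumptions for illustration.

    // Sketch: force a merge candidate to uni-prediction when DBBP is used.
    struct MergeCandidate {
        bool usesList0;
        bool usesList1;
        // motion vectors and reference indices omitted for brevity
    };

    void restrictToUniPrediction(MergeCandidate& cand, bool dbbpFlag) {
        if (dbbpFlag && cand.usesList0 && cand.usesList1)
            cand.usesList1 = false;  // keep the list-0 motion data only
    }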
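
Finally, the size-dependent selection of the common disparity vector (threshold 16 in one embodiment, 24 in another) can be sketched directly; the vector type and names are illustrative.

    struct DisparityVector { int x, y; };

    // Sketch: large blocks (width + height above the threshold) use the
    // depth-refined disparity vector; smaller blocks reuse the unrefined one.
    DisparityVector selectCommonDisparityVector(const DisparityVector& refined,
                                                const DisparityVector& unrefined,
                                                int width, int height,
                                                int threshold = 16) {
        return (width + height > threshold) ? refined : unrefined;
    }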
  • … AMVP prediction parameter derivation unit
  • 1123 … subtraction unit
  • 1126 … inter prediction parameter coding control unit
  • 113 … intra prediction parameter encoding unit
  • 21 … network
  • 31 … image decoding device
  • 301 … entropy decoding unit
  • 302 … prediction parameter decoding unit
  • 303 … inter prediction parameter decoding unit
  • 3031, 3031A … inter prediction parameter decoding control unit
  • 30311 … division mode decoding unit
  • 30312, 30312A … inter prediction identifier decoding unit
  • 30313 … DBBP flag decoding unit
  • 3032 … AMVP prediction parameter derivation unit
  • 3035 … addition unit
  • 3036 … merge mode parameter derivation unit
  • 30361 … merge candidate output unit
  • 303611 … merge candidate storage unit
  • 30362 …
  • … prediction parameter memory (frame memory)
  • 308 … prediction image generation unit
  • 309 … inter prediction image generation unit
  • 3091 … motion displacement compensation unit
  • 3092 … residual prediction unit
  • 30922 … reference image interpolation unit
  • 30923 … residual synthesis unit
  • 30924 … residual prediction vector derivation unit
  • 3093 … illuminance compensation unit
  • 3095, 3095A, 3095B, 3095C … DBBP prediction unit (depth-based block prediction image generation device)
  • 30951 … DBBP image interpolation unit (image interpolation unit, image interpolation means)
  • 30952 … segmentation unit
  • 30953 … image composition unit
  • 30954, 30954A, 30954B, 30954C … DBBP division mode derivation unit (depth division mode derivation means)
  • 3096 … weighted prediction unit
  • 310 … intra prediction image generation unit
  • 311 … inverse quantization / inverse DCT unit
  • 312 … addition unit
  • 351 … depth DV derivation unit
  • 352 … displacement vector derivation unit
  • 353 … split mode derivation unit
  • 354 … switch
  • 41 … image display apparatus

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

Conventional depth-based block partitioning (DBBP) has the problem that the processing load is large for the interpolation processing that generates two interpolated images, the split-mode derivation processing that derives a split mode from a depth image, and the synthesis processing that combines two interpolated images according to a segmentation. The depth-based block prediction image generation device of the present invention comprises a segmentation derivation unit that derives segmentation information from a depth image, an image interpolation unit that generates two motion-compensated images, and an image synthesis unit that combines the two interpolated images and generates one motion-compensated image. The depth-based block prediction image generation device is characterized in that the image interpolation unit generates the two motion-compensated images by bilinear prediction. It is further characterized in that the split mode is derived from the pixels at the four corners of a depth block; in that synthesis is performed by selecting, at each pixel of a block, one of the two interpolated pixels; and in that bi-prediction is not performed.
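
The abstract's use of bilinear prediction for generating the two motion-compensated images replaces longer separable interpolation filters with a 2-tap filter in each direction. A minimal sketch of one bilinearly interpolated sample, assuming quarter-sample precision (fractional parts in 0..3) and ignoring clipping and picture borders; all names are illustrative.

    #include <cstdint>

    // Sketch: bilinear motion compensation of a single sample. The four
    // neighbouring integer samples are weighted by the fractional offsets;
    // the weights sum to 16, hence the +8 rounding and >>4 normalisation.
    uint8_t bilinearSample(const uint8_t* ref, int stride, int intX, int intY,
                           int fracX, int fracY) {
        const int a = ref[intY * stride + intX];
        const int b = ref[intY * stride + intX + 1];
        const int c = ref[(intY + 1) * stride + intX];
        const int d = ref[(intY + 1) * stride + intX + 1];
        const int w1 = (4 - fracX) * (4 - fracY);
        const int w2 = fracX * (4 - fracY);
        const int w3 = (4 - fracX) * fracY;
        const int w4 = fracX * fracY;
        return static_cast<uint8_t>((a * w1 + b * w2 + c * w3 + d * w4 + 8) >> 4);
    }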
PCT/JP2015/057953 2014-03-18 2015-03-17 Dispositif de décodage d'image, dispositif de codage d'image et dispositif de prédiction WO2015141696A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2016508750A JPWO2015141696A1 (ja) 2014-03-18 2015-03-17 画像復号装置、画像符号化装置および予測装置

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
JP2014054957 2014-03-18
JP2014-054957 2014-03-18
JP2014-126218 2014-06-19
JP2014126218 2014-06-19

Publications (1)

Publication Number Publication Date
WO2015141696A1 (fr)

Family

ID=54144661

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2015/057953 WO2015141696A1 (fr) 2014-03-18 2015-03-17 Dispositif de décodage d'image, dispositif de codage d'image et dispositif de prédiction

Country Status (2)

Country Link
JP (1) JPWO2015141696A1 (fr)
WO (1) WO2015141696A1 (fr)

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
FABIAN JÄGER ET AL.: "CE3: Results on Depth-based Block Partitioning (DBBP)", Joint Collaborative Team on 3D Video Coding Extensions of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 7th Meeting, San Jose, USA, 11 January 2014 (2014-01-11) *
GERHARD TECH ET AL.: "3D-HEVC Draft Text 3", Joint Collaborative Team on 3D Video Coding Extension Development of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 7th Meeting, San Jose, USA *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113196773A (zh) * 2018-12-21 2021-07-30 北京字节跳动网络技术有限公司 具有运动矢量差的Merge模式中的运动矢量精度
CN113196773B (zh) * 2018-12-21 2024-03-08 北京字节跳动网络技术有限公司 具有运动矢量差的Merge模式中的运动矢量精度
CN109788283A (zh) * 2019-01-08 2019-05-21 中南大学 一种编码单元分割方法及其系统、装置、存储介质
CN109788283B (zh) * 2019-01-08 2023-01-06 中南大学 一种编码单元分割方法及其系统、装置、存储介质

Also Published As

Publication number Publication date
JPWO2015141696A1 (ja) 2017-04-13

Similar Documents

Publication Publication Date Title
JP6469588B2 (ja) 残差予測装置、画像復号装置、画像符号化装置、残差予測方法、画像復号方法、および画像符号化方法
WO2016125685A1 (fr) Dispositif de décodage d'image, dispositif de codage d'image et dispositif de calcul de vecteur de prédiction
JP6441236B2 (ja) 画像復号装置及び画像符号化装置
WO2015194669A1 (fr) Appareil de décodage d'image, appareil de codage d'image et dispositif de génération d'image de prédiction
WO2015056719A1 (fr) Dispositif de décodage d'images et dispositif de codage d'images
JP6225241B2 (ja) 画像復号装置、画像復号方法、画像符号化装置及び画像符号化方法
JP6360053B2 (ja) 照度補償装置、画像復号装置、画像符号化装置
JP6473078B2 (ja) 画像復号装置
WO2015056620A1 (fr) Dispositif de décodage d'image, et dispositif de codage d'image
JP6118199B2 (ja) 画像復号装置、画像符号化装置、画像復号方法、画像符号化方法及びコンピュータ読み取り可能な記録媒体。
WO2015141696A1 (fr) Dispositif de décodage d'image, dispositif de codage d'image et dispositif de prédiction
WO2014103600A1 (fr) Structure de données codées et dispositif de décodage d'image
JP2016066864A (ja) 画像復号装置、画像符号化装置およびマージモードパラメータ導出装置
WO2016056587A1 (fr) Dispositif de dérivation d'agencement de déplacement, dispositif de dérivation de vecteur de déplacement, dispositif de dérivation d'indice de vues de référence par défaut, et dispositif de dérivation de table de consultation de profondeur
WO2015190510A1 (fr) Dispositif de prédiction de synthèse de points de vue, dispositif de décodage d'image, et dispositif de codage d'image
JP2017135432A (ja) 視点合成予測装置、画像復号装置及び画像符号化装置
JP6401707B2 (ja) 画像復号装置、画像復号方法、および記録媒体
JP2015080053A (ja) 画像復号装置、及び画像符号化装置
JP2014204327A (ja) 画像復号装置および画像符号化装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15765341

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2016508750

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15765341

Country of ref document: EP

Kind code of ref document: A1