WO2014103600A1 - Structure de données codées et dispositif de décodage d'image - Google Patents

Structure de données codées et dispositif de décodage d'image Download PDF

Info

Publication number
WO2014103600A1
Authority
WO
WIPO (PCT)
Prior art keywords
layer
unit
flag
prediction
picture
Prior art date
Application number
PCT/JP2013/081971
Other languages
English (en)
Japanese (ja)
Inventor
Tomohiro Ikai
Tadashi Uchiumi
Takaya Yamamoto
Original Assignee
Sharp Kabushiki Kaisha
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sharp Kabushiki Kaisha
Publication of WO2014103600A1 publication Critical patent/WO2014103600A1/fr

Links

Images

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/70 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/597 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding

Definitions

  • the present invention relates to an encoded data structure and an image decoding apparatus.
  • scalable coding includes spatial scalable coding (processing pictures with low resolution as the base layer and pictures with high resolution as the enhancement layer) and SNR scalable coding (processing pictures with low image quality as the base layer and pictures with high image quality as the enhancement layer).
  • a base layer picture may be used as a reference picture in coding an enhancement layer picture.
  • the encoded data structure of the second configuration has a restriction that when a certain picture is an IDR_W_LP picture, pictures in all layers of the same access unit must have the same RAP NAL unit type.
  • the encoded data structure of the third configuration has a NAL unit header and NAL unit data in each NAL unit. In this encoded data, composed of one or more NAL units, the NAL unit header includes a layer ID and a NAL unit type nal_unit_type that defines the type of the NAL unit data; a scalable mask indicating the type of scalability and a dimension ID indicating the content of each scalability type are specified in the video parameter set included in the NAL unit data, and the dimension ID is indexed by the layer ID.
  • the view ID and the depth flag related to three-dimensional scalability can be derived from a video parameter set having the same encoded data structure as that used for spatial scalability and image quality (SNR) scalability.
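The derivation above can be sketched as follows. This is an illustrative Python sketch, not the patent's normative procedure: the mask bit assignments and all names (SCALABLE_DEPTH, SCALABLE_VIEW, derive_view_depth) are assumptions for exposition.

```python
# Hypothetical sketch: deriving the view ID and depth flag of a layer from
# the per-layer dimension IDs signaled in the VPS extension, given a
# scalable mask that tells which scalability types are present.

SCALABLE_DEPTH = 0x1  # assumed mask bit: depth scalability is signaled
SCALABLE_VIEW = 0x2   # assumed mask bit: view scalability is signaled

def derive_view_depth(scalable_mask, dimension_ids):
    """dimension_ids: the dimension_id values of one layer, in the order
    the scalability types appear in the mask (least significant bit first)."""
    view_id, depth_flag = 0, 0
    idx = 0
    if scalable_mask & SCALABLE_DEPTH:
        depth_flag = dimension_ids[idx]  # depth dimension comes first here
        idx += 1
    if scalable_mask & SCALABLE_VIEW:
        view_id = dimension_ids[idx]     # view dimension comes second
    return view_id, depth_flag

# A layer whose dimension IDs are [1, 2] under a depth+view mask would be
# the depth picture of view 2.
print(derive_view_depth(SCALABLE_DEPTH | SCALABLE_VIEW, [1, 2]))  # (2, 1)
```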
  • (A) is a functional block diagram showing the schematic configuration of the VPS decoding unit 212D according to the embodiment of the present invention.
  • (B) is a diagram showing another configuration of the encoded data of the VPS extension related to the tool validity flag according to the embodiment of the present invention.
  • a viewpoint image is a two-dimensional image (planar image) observed at a certain viewpoint.
  • the viewpoint image is indicated by, for example, a luminance value or a color signal value for each pixel arranged in a two-dimensional plane.
  • one viewpoint image or a signal indicating the viewpoint image is referred to as a picture.
  • the plurality of layer images include a base layer image having a low resolution and an enhancement layer image having a high resolution.
  • when SNR scalable coding is performed using a plurality of layer images, the plurality of layer images consist of a base layer image with low image quality and an enhancement layer image with high image quality. Note that view scalable coding, spatial scalable coding, and SNR scalable coding may be combined arbitrarily.
  • the NAL is a layer provided to abstract communication between a VCL (Video Coding Layer) that is a layer that performs a moving image encoding process and a lower system that transmits and stores encoded data.
  • FIG. 19 is a diagram showing the relationship between the value of the NAL unit type and the type of the NAL unit.
  • a NAL unit having a NAL unit type of 0 to 15, indicated by SYNA101, contains a slice of a non-RAP picture (a picture that is not a random access picture).
  • a NAL unit having a NAL unit type of 16 to 21, indicated by SYNA102, contains a slice of a RAP (random access picture).
  • RAP pictures are roughly classified into BLA pictures, IDR pictures, and CRA pictures.
  • BLA pictures are further classified into BLA_W_LP, BLA_W_DLP, and BLA_N_LP.
  • IDR pictures are further classified into IDR_W_DLP and IDR_N_LP.
  • Pictures other than the RAP picture include an LP picture, a TSA picture, an STSA picture, and a TRAIL picture, which will be described later.
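The range-based classification described above can be sketched as follows. The two ranges (0 to 15 non-RAP, 16 to 21 RAP) follow the description of FIG. 19; the behavior for other values is an assumption for illustration.

```python
# Illustrative classification of a NAL unit type value into the picture
# categories named above.

def classify_nal_unit_type(nal_unit_type):
    if 0 <= nal_unit_type <= 15:
        # non-RAP slices: TRAIL, TSA, STSA, LP pictures, etc.
        return "non-RAP slice"
    if 16 <= nal_unit_type <= 21:
        # RAP slices: BLA (BLA_W_LP, BLA_W_DLP, BLA_N_LP),
        # IDR (IDR_W_DLP, IDR_N_LP), and CRA pictures.
        return "RAP slice"
    # Remaining values carry non-slice data (parameter sets, SEI, ...).
    return "non-slice"

print(classify_nal_unit_type(19))  # RAP slice
```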
  • Vps_extension_flag (SYNA 404 in FIG. 20) is a flag indicating whether or not the VPS further includes a VPS extension.
  • Vps_extension_data_flag (SYNA 405 in FIG. 20) is a VPS extension main body, and will be specifically described with reference to FIG.
  • num_dimensions = dimension_id_len_minus1[1] + 1.
  • num_dimensions is 2 when the scalable type is depth; when the scalable type is view, the number of viewpoints is decoded.
  • the dimension ID dimension_id (SYN 503 in FIG. 21) is information indicating the picture type for each scalable type.
  • the number of dependent layers num_direct_ref_layers (SYN 504 in FIG. 21) is information indicating the number of dependent layers ref_layer_id.
  • the dependency layer ref_layer_id (SYN 505 in FIG. 21) is information indicating the layer ID of a layer referred to by the target layer. In SYN 506 in FIG. 21, the portion indicated by "..." is information that differs for each profile or scalable type (details will be described later).
  • FIG. 2 is a diagram showing a hierarchical structure of data in the encoded data # 1.
  • the encoded data # 1 exemplarily includes a sequence and a plurality of pictures constituting the sequence.
  • (a) to (f) of FIG. 2 respectively show a sequence layer that defines a sequence SEQ, a picture layer that defines a picture PICT, a slice layer that defines a slice S, a slice data layer that defines slice data, and the layers below the slice data.
  • the slice header SH includes an encoding parameter group that is referred to by the image decoding apparatus 1 in order to determine a decoding method of the target slice.
  • Slice type designation information (slice_type) for designating a slice type is an example of an encoding parameter included in the slice header SH.
  • transform processing is performed for each transform block.
  • the transform block which is a unit of transform is also referred to as a transform unit (TU).
  • the prediction list use flag information can also be expressed by the inter prediction flag inter_pred_idx described later. Normally, the prediction list use flag is used in the prediction image generation unit and the prediction parameter memory described later, and the inter prediction flag inter_pred_idx is used when decoding, from the encoded data, information on which reference picture list is used.
  • a reference picture P1 indicated by a left-pointing arrow from the target picture is a past picture at the same viewpoint as the target picture.
  • a reference picture P2 indicated by a right-pointing arrow from the target picture is a future picture at the same viewpoint as the target picture. In motion prediction based on the target picture, the reference picture P1 or P2 is used.
  • FIG. 22A shows a case where no picture other than the first picture is a RAP picture.
  • the letter in the box indicates the name of the picture, and the number indicates the POC (the same applies hereinafter).
  • the display order is arranged from left to right in the figure. IDR0, A1, A2, B4, B5, and B6 are decoded in the order of IDR0, B4, A1, A2, B6, and B5.
  • the case where the picture indicated by B4 in FIG. 22A is changed to a RAP picture is shown in FIG. 22B to FIG. 22G.
  • IDR_W_LP is an abbreviation for Instantaneous Decoding Refresh With Leading Picture and may include an LP picture such as picture A3.
  • the picture A2 refers to the IDR0 and POC4 pictures.
  • the RPS is initialized when IDR′0 is decoded, and reference to pictures prior to IDR′0 is prohibited.
  • the POC is initialized.
  • FIG. 22C shows an example in which an IDR picture (particularly an IDR_N_LP picture) is inserted.
  • IDR_N_LP is an abbreviation of Instantaneous Decoding Refresh No Leading Picture, and the presence of LP pictures is prohibited. Therefore, the presence of the A3 picture in FIG. 22B is prohibited. Therefore, the A3 picture needs to be decoded before the IDR′0 picture by referring to the IDR0 picture instead of the IDR′0 picture.
  • when decoding starts from a CRA picture, decoding is performed from pictures later than the RAP (CRA) picture in display order, and reference to pictures prior to the RAP (CRA) picture in decoding order is prohibited. Note that the POC is not initialized by a CRA picture.
  • FIG. 22 (e) shows an example using a BLA picture (particularly a BLA_W_LP picture).
  • BLA_W_LP is an abbreviation for Broken Link Access With Leading Picture, and the presence of an LP picture is allowed.
  • the A2 picture and the A3 picture which are LP pictures of the BLA picture, may exist in the encoded data.
  • since the A2 picture is decoded before the BLA_W_LP picture, the A2 picture does not exist in encoded data that has been edited so that the BLA_W_LP picture is the first picture.
  • the inter prediction flag inter_pred_idc is data indicating the type and number of reference pictures, and takes any value of Pred_L0, Pred_L1, and Pred_Bi.
  • Pred_L0 and Pred_L1 indicate that a reference picture stored in the reference picture list called the L0 reference list or the L1 reference list, respectively, is used, and that one reference picture is used in either case (uni-prediction). Prediction using the L0 reference list and prediction using the L1 reference list are referred to as L0 prediction and L1 prediction, respectively.
  • Pred_Bi indicates that two reference pictures are used (bi-prediction), and indicates that two reference pictures stored in the L0 reference list and the L1 reference list are used.
  • a displacement vector corresponding to pictures of different viewpoints is called a disparity vector.
  • the prediction vector and the difference vector related to the vector mvLX are referred to as the prediction vector mvpLX and the difference vector mvdLX, respectively.
  • Whether the vector mvLX and the difference vector mvdLX are motion vectors or displacement vectors is determined using a reference picture index refIdxLX associated with the vectors.
  • FIG. 24 is a functional block diagram showing a schematic configuration of the header decoding unit 10.
  • the header decoding unit 10 includes a NAL unit header decoding unit 211, a VPS decoding unit 212 (video parameter set decoding unit), a layer information storage unit 213, a view depth derivation unit 214, and a tool validity information decoding unit 215.
  • the layer ID decoding unit 2111 decodes the layer ID from the encoded data.
  • the NAL unit type decoding unit 2112 decodes the NAL unit type from the encoded data.
  • the layer ID is 6-bit information from 0 to 63, for example, and a layer with a layer ID of 0 indicates a base layer.
  • the NAL unit type is 6-bit information from 0 to 63, for example, and indicates the type of data included in the NAL unit.
  • parameter set types such as VPS, SPS, and PPS, RAP pictures such as IDR pictures, CRA pictures, and BLA pictures, non-RAP pictures such as LP pictures, and SEI are identified from the NAL unit type.
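The 6-bit layer ID and 6-bit NAL unit type described above can be sketched as a two-byte header parse. The field layout below (forbidden bit, type, layer ID, temporal ID) follows the HEVC-style NAL unit header; treat it as illustrative of the stated bit widths, not as the exact layout of this patent's bitstream.

```python
# Sketch of decoding a two-byte NAL unit header carrying a 6-bit
# nal_unit_type (0..63) and a 6-bit layer ID (0..63, 0 = base layer).

def decode_nal_unit_header(two_bytes):
    bits = int.from_bytes(two_bytes, "big")
    forbidden_zero_bit = (bits >> 15) & 0x1
    nal_unit_type = (bits >> 9) & 0x3F       # 6 bits: type of the NAL unit data
    nuh_layer_id = (bits >> 3) & 0x3F        # 6 bits: 0 indicates the base layer
    nuh_temporal_id_plus1 = bits & 0x7       # 3 bits (HEVC-style, assumed)
    assert forbidden_zero_bit == 0
    return nal_unit_type, nuh_layer_id, nuh_temporal_id_plus1

# nal_unit_type = 32, layer ID = 0 (base layer), temporal ID plus 1 = 1.
header = bytes([0b01000000, 0b00000001])
print(decode_nal_unit_header(header))  # (32, 0, 1)
```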
  • the VPS decoding unit 212 decodes information used for decoding in a plurality of layers based on a defined syntax definition from the VPS and VPS extension included in the encoded data. For example, the syntax shown in FIG. 20 is decoded from the VPS, and the syntax shown in FIG. 21 is decoded from the VPS extension. The VPS extension is decoded when the flag vps_extension_flag is 1.
  • the VPS decoding unit 212 decodes a syntax element vps_max_layers_minus1 indicating the number of layers from the encoded data by an internal layer number decoding unit (not shown) and outputs the decoded element to the dimension ID decoding unit 2122 and the dependent layer ID decoding unit 2123.
  • the information is stored in the layer information storage unit 213.
  • the tool validity information decoding unit 215 decodes a flag (tool validity flag) indicating whether or not a specific tool can be used from a parameter set such as VPS, SPS, and PPS.
  • examples of the tool validity flag include the inter-view prediction flag multi_view_mv_pred_flag, the residual prediction flag multi_view_residual_pred_flag, the depth intra prediction flag enable_dmm_flag, and the motion parameter inheritance flag use_mpi_flag.
  • the inter-view prediction flag multi_view_mv_pred_flag and the residual prediction flag multi_view_residual_pred_flag are flags for tools that can be used when the target layer is other than the base layer.
  • the depth intra prediction flag enable_dmm_flag and the motion parameter inheritance flag use_mpi_flag are flags for tools that can be used when the target picture is a depth picture.
  • FIG. 37 is a view showing a part of the structure of the encoded data of the VPS extension related to the tool validity flag according to the embodiment of the present invention.
  • the encoded data of the VPS extension shown in FIG. 21 includes the encoded data of FIG. 37 (conversely, the encoded data in FIG. 37 is included in SYN 506 in FIG. 21).
  • the VPS extension data in FIG. 37 is information included when the multi-view profile or scalable type indicates depth scalable.
  • the encoded data having the configuration shown in FIG. 37 is encoded data decoded by the tool effectiveness information decoding unit 215.
  • FIG. 37 corresponds to an operation in which the view depth derivation unit 214 sets the dimension ID depth_dimension_id indicating the depth flag to 0 and the dimension ID view_dimension_id indicating the view ID to 1.
  • the encoded data of this embodiment includes the validity flags of tools that can be used when the target layer layer_id is other than 0 (in the figure, when i is other than 0), that is, when the target layer is other than the base layer: the inter-view prediction flag multi_view_mv_pred_flag and the residual prediction flag multi_view_residual_pred_flag.
  • the validity flags of tools that can be used when the target layer is depth (the depth intra prediction flag enable_dmm_flag and the motion parameter inheritance flag use_mpi_flag) are included in correspondence with the depth flag of the target layer layer_id.
  • the header decoding unit 10B includes a VPS decoding unit 212B having a dependency layer ID decoding unit 2123 for decoding the dependency layer ID that indicates the layer dependency relationship, and the tool validity flag of a tool that can be used when the target layer is depth is decoded only when the view ID of the dependency layer is equal to the view ID of the target picture.
  • the prediction parameter decoding unit 302 includes an inter prediction parameter decoding unit 303 and an intra prediction parameter decoding unit 304.
  • the predicted image generation unit 308 includes an inter predicted image generation unit 309 and an intra predicted image generation unit 310.
  • the inter prediction parameter decoding unit 303 decodes the inter prediction parameter with reference to the prediction parameter stored in the prediction parameter memory 307 based on the code input from the entropy decoding unit 301.
  • the intra prediction parameter decoding unit 304 decodes the depth intra prediction mode dmm_mode from the input code.
  • the intra prediction parameter decoding unit 304 generates an intra prediction mode IntraPredMode from the following equation using the depth intra prediction mode dmm_mode.
  • the intra prediction parameter decoding unit 304 decodes the wedgelet pattern index wedge_full_tab_idx from the input code.
  • the intra prediction parameter decoding unit 304 outputs the intra prediction parameters to the prediction image generation unit 308 and stores them in the prediction parameter memory 307.
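The equation that maps dmm_mode to IntraPredMode is not reproduced in the text above. The sketch below assumes the convention suggested by the surrounding description, where the depth intra prediction modes are appended after the 35 conventional intra modes; the mapping and the mode-name table are assumptions of this sketch.

```python
# Assumed mapping: IntraPredMode = 35 + dmm_mode, so that values 35..38
# select the four depth intra prediction modes described below.

DMM_MODE_NAMES = [
    "MODE_DMM_WFULL",         # IntraPredMode 35
    "MODE_DMM_WFULLDELTA",    # IntraPredMode 36
    "MODE_DMM_CPREDTEX",      # IntraPredMode 37
    "MODE_DMM_CPREDTEXDELTA", # IntraPredMode 38
]

def intra_pred_mode_from_dmm(dmm_mode):
    return 35 + dmm_mode

print(intra_pred_mode_from_dmm(2), DMM_MODE_NAMES[2])  # 37 MODE_DMM_CPREDTEX
```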
  • the intra prediction image generation unit 310 generates a prediction picture block using the read reference picture block and the input prediction parameter.
  • FIG. 10 is a schematic diagram illustrating a configuration of the intra predicted image generation unit 310 according to the present embodiment.
  • the intra predicted image generation unit 310 includes a direction prediction unit 3101 and a DMM prediction unit 3102.
  • when the value of the intra prediction mode IntraPredMode is 35 or more, the intra predicted image generation unit 310 generates a prediction picture block using depth intra prediction in the DMM prediction unit 3102.
  • when the value of the intra prediction mode IntraPredMode is 35, the intra predicted image generation unit 310 generates a predicted picture block using the MODE_DMM_WFULL mode in depth intra prediction. The intra predicted image generation unit 310 first generates a wedgelet pattern list. Hereinafter, a method for generating the wedgelet pattern list will be described.
  • the intra predicted image generation unit 310 selects a wedgelet pattern from the wedgelet pattern list using the wedgelet pattern index wedge_full_tab_idx included in the prediction parameter.
  • the intra predicted image generation unit 310 divides the predicted picture block into two regions according to the wedgelet pattern, and derives predicted values dmmPredPartitionDC1 and dmmPredPartitionDC2 for each region.
  • a prediction value derivation method for example, an average value of pixel values of reference picture blocks adjacent to a region is used as a prediction value.
  • 1 << (BitDepth - 1) is set as the predicted value, where BitDepth is the bit depth of the pixel.
  • the intra predicted image generation unit 310 generates a predicted picture block by filling each area with the predicted values dmmPredPartitionDC1 and dmmPredPartitionDC2.
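The MODE_DMM_WFULL fill step above can be sketched as follows. The wedgelet pattern is modeled as a 0/1 mask; deriving the two DC values from neighbouring reference samples is elided, and the 1 << (BitDepth - 1) fallback is shown for the case where no neighbours are available. All names are illustrative.

```python
# Sketch: split the block into two regions by the selected wedgelet
# pattern and fill each region with its DC prediction value.

def dmm_wfull_fill(pattern, dc1, dc2):
    """pattern: 2-D list of 0/1 (the wedgelet mask); dc1/dc2: the
    prediction values dmmPredPartitionDC1 / dmmPredPartitionDC2."""
    return [[dc1 if p == 0 else dc2 for p in row] for row in pattern]

BIT_DEPTH = 8
fallback = 1 << (BIT_DEPTH - 1)  # 128: used when neighbours are unavailable

pattern = [[0, 0, 1, 1],
           [0, 0, 1, 1],
           [0, 1, 1, 1],
           [1, 1, 1, 1]]
block = dmm_wfull_fill(pattern, dc1=64, dc2=fallback)
print(block[0])  # [64, 64, 128, 128]
```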
  • when the value of the intra prediction mode IntraPredMode is 36, the intra predicted image generation unit 310 generates a predicted picture block using the MODE_DMM_WFULLDELTA mode in depth intra prediction. First, as in the MODE_DMM_WFULL mode, the intra predicted image generation unit 310 selects a wedgelet pattern from the wedgelet pattern list and derives predicted values dmmPredPartitionDC1 and dmmPredPartitionDC2 for each region.
  • dmmOffsetDC1 = DmmQuantOffsetDC1 * Clip3(1, (1 << BitDepthY) - 1, 2^((QP / 10) - 2))
  • dmmOffsetDC2 = DmmQuantOffsetDC2 * Clip3(1, (1 << BitDepthY) - 1, 2^((QP / 10) - 2))
  • the intra prediction image generation unit 310 generates a prediction picture block by filling each region with values obtained by adding the intra prediction offsets dmmOffsetDC1 and dmmOffsetDC2 to the prediction values dmmPredPartitionDC1 and dmmPredPartitionDC2, respectively.
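The Clip3-based offset dequantization used by MODE_DMM_WFULLDELTA can be sketched as follows. Integer division for QP / 10 is an assumption of this sketch, and the function names are illustrative.

```python
# Sketch: dequantize a DMM DC offset by scaling the decoded quantized
# offset with a QP-dependent factor clipped to the sample value range.

def clip3(lo, hi, x):
    return max(lo, min(hi, x))

def dmm_offset_dc(dmm_quant_offset_dc, qp, bit_depth_y=8):
    scale = clip3(1, (1 << bit_depth_y) - 1, 2 ** ((qp // 10) - 2))
    return dmm_quant_offset_dc * scale

# QP 32 gives a scale of 2, so a quantized offset of 3 becomes 6; the
# result is added to dmmPredPartitionDC1/2 when filling the regions.
print(dmm_offset_dc(3, 32))  # 6
```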
  • when the value of the intra prediction mode IntraPredMode is 37, the intra predicted image generation unit 310 generates a prediction picture block using the MODE_DMM_CPREDTEX mode in depth intra prediction.
  • the intra predicted image generation unit 310 reads the corresponding block from the decoded picture buffer 12.
  • the intra predicted image generation unit 310 calculates the average value of the pixel values of the corresponding block.
  • the intra predicted image generation unit 310 uses the calculated average value as a threshold, and divides the corresponding block into a region 1 that is equal to or greater than the threshold and a region 2 that is equal to or less than the threshold.
  • the intra prediction image generation unit 310 divides the prediction picture block into two regions having the same shape as the regions 1 and 2.
  • the intra predicted image generation unit 310 derives predicted values dmmPredPartitionDC1 and dmmPredPartitionDC2 for each region using the same method as in the MODE_DMM_WFULL mode.
  • the intra predicted image generation unit 310 generates a predicted picture block by filling each area with the predicted values dmmPredPartitionDC1 and dmmPredPartitionDC2.
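The MODE_DMM_CPREDTEX partition step above can be sketched as follows: the co-located block is thresholded by its own average to produce the two-region mask applied to the prediction picture block. Names and the tie-breaking at exactly the threshold are illustrative assumptions.

```python
# Sketch: derive the two-region partition from the corresponding block by
# thresholding with its mean pixel value.

def cpredtex_partition(corresponding_block):
    pixels = [p for row in corresponding_block for p in row]
    threshold = sum(pixels) / len(pixels)
    # Region 1: at or above the threshold; region 2: below it.
    return [[1 if p >= threshold else 2 for p in row]
            for row in corresponding_block]

corresponding = [[10, 10, 200, 200],
                 [10, 10, 200, 200]]
mask = cpredtex_partition(corresponding)
print(mask[0])  # [2, 2, 1, 1]
```

Each region of the mask is then filled with its prediction value (dmmPredPartitionDC1 or dmmPredPartitionDC2), as in MODE_DMM_WFULL.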
  • when the value of the intra prediction mode IntraPredMode is 38, the intra predicted image generation unit 310 generates a predicted picture block using the MODE_DMM_CPREDTEXDELTA mode in depth intra prediction. First, similarly to the MODE_DMM_CPREDTEX mode, the intra predicted image generation unit 310 divides the prediction picture block into two regions and derives prediction values dmmPredPartitionDC1 and dmmPredPartitionDC2 for each region.
  • the intra prediction image generation unit 310 derives the intra prediction offsets dmmOffsetDC1 and dmmOffsetDC2, and generates a predicted picture block by filling each region with the values obtained by adding the intra prediction offsets dmmOffsetDC1 and dmmOffsetDC2 to the prediction values dmmPredPartitionDC1 and dmmPredPartitionDC2, respectively.
  • the intra predicted image generation unit 310 outputs the generated predicted picture block P to the addition unit 312.
  • the inverse quantization / inverse DCT unit 311 inversely quantizes the quantization coefficient input from the entropy decoding unit 301 to obtain a DCT coefficient.
  • the inverse quantization / inverse DCT unit 311 performs an inverse DCT (Inverse Discrete Cosine Transform) on the obtained DCT coefficients to calculate a decoded residual signal.
  • the inverse quantization / inverse DCT unit 311 outputs the calculated decoded residual signal to the adder 312.
  • the adder 312 adds, for each pixel, the prediction picture block P input from the inter prediction image generation unit 309 or the intra prediction image generation unit 310 and the signal value of the decoded residual signal input from the inverse quantization / inverse DCT unit 311 to generate a reference picture block.
  • the adding unit 312 stores the generated reference picture block in the decoded picture buffer 12, and outputs a decoded layer image Td in which the generated reference picture block is integrated for each picture to the outside.
  • FIG. 6 is a schematic diagram illustrating a configuration of the inter prediction parameter decoding unit 303 according to the present embodiment.
  • the inter prediction parameter decoding unit 303 includes an inter prediction parameter decoding control unit 3031, an AMVP prediction parameter derivation unit 3032, an addition unit 3035, and a merge prediction parameter derivation unit 3036.
  • the inter prediction parameter decoding control unit 3031 instructs the entropy decoding unit 301 to decode codes (syntax elements) related to inter prediction, and extracts syntax elements included in the encoded data, for example, the partition mode part_mode, the merge flag merge_flag, the merge index merge_idx, the inter prediction flag inter_pred_idx, the reference picture index refIdxLX, the prediction vector index mvp_LX_idx, and the difference vector mvdLX.
  • the inter prediction parameter decoding control unit 3031 uses the entropy decoding unit 301 to extract the AMVP prediction parameter from the encoded data.
  • AMVP prediction parameters include an inter prediction flag inter_pred_idc, a reference picture index refIdxLX, a vector index mvp_LX_idx, and a difference vector mvdLX.
  • the inter prediction parameter decoding control unit 3031 outputs the prediction list use flag predFlagLX derived from the extracted inter prediction flag inter_pred_idx and the reference picture index refIdxLX to the AMVP prediction parameter derivation unit 3032 and the prediction image generation unit 308 (FIG. 5).
  • the inter prediction parameter decoding control unit 3031 outputs the extracted vector index mvp_LX_idx to the AMVP prediction parameter derivation unit 3032.
  • the inter prediction parameter decoding control unit 3031 outputs the extracted difference vector mvdLX to the addition unit 3035.
  • FIG. 7 is a schematic diagram illustrating the configuration of the merge prediction parameter deriving unit 3036 according to the present embodiment.
  • the merge prediction parameter derivation unit 3036 includes a merge candidate derivation unit 30361 and a merge candidate selection unit 30362.
  • the merge candidate derivation unit 30361 includes a merge candidate storage unit 303611, an extended merge candidate derivation unit 303612, a basic merge candidate derivation unit 303613, and an MPI candidate derivation unit 303614.
  • the merge candidate storage unit 303611 stores the merge candidates input from the extended merge candidate derivation unit 303612 and the basic merge candidate derivation unit 303613.
  • the merge candidate includes a prediction list use flag predFlagLX, a vector mvLX, and a reference picture index refIdxLX.
  • an index is assigned to the stored merge candidates according to a predetermined rule. For example, “0” is assigned as an index to the merge candidate input from the extended merge candidate derivation unit 303612 or the MPI candidate derivation unit 303614.
  • the MPI candidate derivation unit 303614 derives merge candidates using the motion compensation parameters of a layer different from the target layer.
  • the layer different from the target layer is, for example, a texture layer picture having the same view IDview_id and the same POC as the target depth picture.
  • the MPI candidate derivation unit 303614 reads, from the prediction parameter memory 307, a prediction parameter of a block having the same coordinates as the target block (also referred to as a corresponding block) in a picture of a layer different from the target layer.
  • the MPI candidate derivation unit 303614 outputs the read prediction parameters to the merge candidate storage unit 303611 as merge candidates.
  • when the split flag split_flag of the CTU is also read, the split information is also included in the merge candidate.
  • Interlayer merge candidate derivation unit 3036121 receives the displacement vector from displacement vector acquisition unit 3036122.
  • the inter-layer merge candidate derivation unit 3036121 selects the block indicated by the displacement vector input from the displacement vector acquisition unit 3036122 in a picture having the same POC as the decoding target picture in another layer (for example, the base layer or base view), and reads the prediction parameter of that block, which is a motion vector, from the prediction parameter memory 307. More specifically, the prediction parameter read by the inter-layer merge candidate derivation unit 3036121 is the prediction parameter of the block containing the coordinates obtained by adding the displacement vector to the coordinates of the center point of the target block.
  • the reference block coordinates (xRef, yRef) are derived by the following equations from the target block coordinates (xP, yP), the displacement vector (mvDisp[0], mvDisp[1]), and the target block width and height nPSW and nPSH.
  • xRef = Clip3(0, PicWidthInSamplesL - 1, xP + ((nPSW - 1) >> 1) + ((mvDisp[0] + 2) >> 2))
  • yRef = Clip3(0, PicHeightInSamplesL - 1, yP + ((nPSH - 1) >> 1) + ((mvDisp[1] + 2) >> 2))
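The coordinate derivation above can be sketched directly in code: the block centre is offset by the displacement vector, which is stored in quarter-sample units (hence the rounding shift by 2), and the result is clipped to the picture bounds. Function and parameter names are illustrative.

```python
# Sketch of the reference block coordinate derivation for inter-layer
# merge candidates.

def clip3(lo, hi, x):
    return max(lo, min(hi, x))

def ref_block_coords(xP, yP, mvDisp, nPSW, nPSH, pic_w, pic_h):
    xRef = clip3(0, pic_w - 1, xP + ((nPSW - 1) >> 1) + ((mvDisp[0] + 2) >> 2))
    yRef = clip3(0, pic_h - 1, yP + ((nPSH - 1) >> 1) + ((mvDisp[1] + 2) >> 2))
    return xRef, yRef

# A 16x16 block at (64, 32) with a disparity of (-20, 0) quarter-samples.
print(ref_block_coords(64, 32, (-20, 0), 16, 16, 1920, 1080))  # (66, 39)
```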
  • the inter-layer merge candidate derivation unit 3036121 determines whether or not the prediction parameter is a motion vector (not a displacement vector) according to the determination method of the reference layer determination unit 303111 (described later) included in the inter prediction parameter decoding control unit 3031.
  • the spatial merge candidate derivation unit 3036131 reads the prediction parameters (prediction list use flag predFlagLX, vector mvLX, reference picture index refIdxLX) stored in the prediction parameter memory 307 according to a predetermined rule, and uses the read prediction parameters as merge candidates.
  • the prediction parameters to be read are the prediction parameters of blocks within a predetermined range from the decoding target block (for example, all or some of the blocks adjacent to the lower left, upper left, and upper right ends of the decoding target block).
  • the derived merge candidates are stored in the merge candidate storage unit 303611.
  • the temporal merge candidate derivation unit 3036132 reads the prediction parameter of the block in the reference image including the lower right coordinate of the decoding target block from the prediction parameter memory 307 and sets it as a merge candidate.
  • the reference picture designation method may be, for example, the reference picture index refIdxLX designated in the slice header, or the smallest reference picture index refIdxLX among the blocks adjacent to the decoding target block may be used.
  • the derived merge candidates are stored in the merge candidate storage unit 303611.
  • the combined merge candidate derivation unit 3036133 derives combined merge candidates by combining the vectors and reference picture indexes of two different merge candidates already derived and stored in the merge candidate storage unit 303611, using them as the L0 and L1 vectors, respectively.
  • the derived merge candidates are stored in the merge candidate storage unit 303611.
  • the zero merge candidate derivation unit 3036134 derives a merge candidate in which the reference picture index refIdxLX is 0 and both the X component and the Y component of the vector mvLX are 0.
  • the derived merge candidates are stored in the merge candidate storage unit 303611.
  • the prediction vector selection unit 3034 selects a vector candidate indicated by the vector index mvp_LX_idx input from the inter prediction parameter decoding control unit 3031 among the vector candidates read by the vector candidate derivation unit 3033 as the prediction vector mvpLX.
  • the prediction vector selection unit 3034 outputs the selected prediction vector mvpLX to the addition unit 3035.
  • FIG. 9 is a conceptual diagram showing an example of vector candidates.
  • a predicted vector list 602 illustrated in FIG. 9 is a list including a plurality of vector candidates derived by the vector candidate deriving unit 3033.
  • five rectangles arranged in a line on the left and right indicate areas indicating prediction vectors, respectively.
  • the downward arrow directly below the second mvp_LX_idx from the left end and mvpLX below the mvp_LX_idx indicate that the vector index mvp_LX_idx is an index referring to the vector mvpLX in the prediction parameter memory 307.
  • the addition unit 3035 adds the prediction vector mvpLX input from the prediction vector selection unit 3034 and the difference vector mvdLX input from the inter prediction parameter decoding control unit to calculate a vector mvLX.
  • the adding unit 3035 outputs the calculated vector mvLX to the predicted image generation unit 308 (FIG. 5).
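The AMVP decoding path described in the last bullets reduces to an index lookup followed by a component-wise addition. A toy Python sketch (the function name is illustrative; the patent itself gives no code):

```python
def decode_amvp_vector(vector_candidates, mvp_lx_idx, mvd_lx):
    """Select the prediction vector mvpLX indicated by mvp_LX_idx from the
    candidate list (prediction vector selection unit 3034), then add the
    difference vector mvdLX (addition unit 3035) to obtain mvLX."""
    mvp_lx = vector_candidates[mvp_lx_idx]
    # mvLX = mvpLX + mvdLX, applied to the X and Y components separately
    return (mvp_lx[0] + mvd_lx[0], mvp_lx[1] + mvd_lx[1])
```

For example, with candidates `[(0, 0), (4, -2), (1, 1)]`, index 1, and difference vector `(3, 5)`, the reconstructed vector is `(7, 3)`.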
  • prediction in which the target picture layer and the reference picture layer are the same layer is called same-layer prediction, and the vector obtained in this case is a motion vector.
  • prediction in which the target picture layer and the reference picture layer are different layers is called inter-layer prediction, and the vector obtained in this case is a displacement vector.
  • the reference layer determination unit 303111 determines whether the vector mvLX is a displacement vector using, for example, the following equation.
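Since the equation itself is not reproduced in this excerpt, the sketch below simply applies the definition given above: the vector is a displacement vector exactly when the reference picture belongs to a different layer than the target picture. This is an assumption-labeled illustration, not the patent's actual formula:

```python
def is_displacement_vector(target_layer_id, ref_layer_id):
    """Inter-layer prediction (different layers) yields a displacement
    vector; same-layer prediction yields a motion vector."""
    return target_layer_id != ref_layer_id
```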
  • the motion displacement compensation unit 3091 generates a motion displacement compensation image based on the prediction list use flag predFlagLX, the reference picture index refIdxLX, and the motion vector mvLX input from the inter prediction parameter decoding unit 303.
  • specifically, it reads from the decoded picture buffer 12 the block of the reference picture designated by the reference picture index refIdxLX, at the position shifted by the vector mvLX from the position of the target block.
  • when the vector mvLX is not of integer precision, the motion displacement compensation image is generated by applying a filter, called a motion compensation filter (or displacement compensation filter), that generates pixels at fractional positions.
  • when the vector mvLX is a motion vector, the above processing is called motion compensation; when the vector mvLX is a displacement vector, it is called displacement compensation; the two are collectively referred to as motion displacement compensation.
  • the L0-prediction motion displacement compensation image is referred to as predSamplesL0 and the L1-prediction motion displacement compensation image as predSamplesL1; when the two are not distinguished, they are called predSamplesLX.
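An integer-pel sketch of the block fetch described above (illustrative names; fractional-precision vectors would additionally require the motion/displacement compensation interpolation filter, which is omitted here):

```python
def motion_displacement_compensation(ref_picture, x0, y0, bw, bh, mv):
    """Read the bw x bh block of the reference picture at the position
    shifted by the vector mv = (mx, my) from the target block position
    (x0, y0), producing the motion displacement compensation image."""
    mx, my = mv
    return [[ref_picture[y0 + my + y][x0 + mx + x] for x in range(bw)]
            for y in range(bh)]
```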
  • the residual prediction unit 3092 performs residual prediction on the input motion displacement compensation image predSamplesLX.
  • when the residual prediction flag res_pred_flag is 0, the input motion displacement compensation image predSamplesLX is output as it is.
  • when the residual prediction flag res_pred_flag is 1, residual prediction is performed on the motion displacement compensation image predSamplesLX obtained by the motion displacement compensation unit 3091.
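A minimal sketch of this flag gating (the helper name is hypothetical, and the derivation of the residual itself is omitted; `residual` is assumed to be already available):

```python
def residual_prediction(pred_samples_lx, residual, res_pred_flag):
    """res_pred_flag == 0: pass predSamplesLX through unchanged.
    res_pred_flag == 1: add the given residual to predSamplesLX."""
    if res_pred_flag == 0:
        return pred_samples_lx
    return [[p + r for p, r in zip(prow, rrow)]
            for prow, rrow in zip(pred_samples_lx, residual)]
```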
  • illuminance compensation is performed on the assumption that the change between the pixel values of the motion displacement image in the region adjacent to the target block (for which the predicted image is generated) and the pixel values of the decoded image in that adjacent region is similar to the change between the pixel values of the target block and the pixel values of the original image of the target block.
  • the illuminance parameter estimation unit 30931 obtains the estimation parameters (illuminance change parameters) a and b from the pixels L (L0 to LN-1) around the target block and the pixels C (C0 to CN-1) around the reference block by the least squares method, using the following equations.
  • the illuminance compensation unit 3093 derives estimation parameters (illuminance change parameters) icaidx, ickidx, and icbidx according to the following formula.
  • the illuminance compensation filter unit 30932 included in the illuminance compensation unit 3093 derives a pixel compensated for illuminance change from the target pixel using the estimation parameter derived by the illuminance parameter estimation unit 30931.
  • when the estimation parameters are the decimal numbers a and b, the following equation is used.
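The floating-point variant of this estimation can be sketched as an ordinary least-squares fit of L_i ≈ a·C_i + b over the N neighboring pixels. This is an illustrative reconstruction, not the patent's exact formula; the patent's fixed-point variant (icaidx, ickidx, icbidx) replaces the division below with shift-based arithmetic:

```python
def estimate_illumination_params(L, C):
    """Least-squares fit of L_i ~ a * C_i + b, where L are pixels around
    the target block and C are pixels around the reference block."""
    n = len(L)
    s_l, s_c = sum(L), sum(C)
    s_lc = sum(l * c for l, c in zip(L, C))
    s_cc = sum(c * c for c in C)
    denom = n * s_cc - s_c * s_c
    a = (n * s_lc - s_l * s_c) / denom if denom else 1.0
    b = (s_l - a * s_c) / n
    return a, b

def apply_illumination_compensation(pixel, a, b):
    """The illuminance compensation filter derives the compensated pixel
    from the target pixel as a * x + b."""
    return a * pixel + b
```

For neighbors related exactly by L = 2·C + 1, the fit recovers a = 2 and b = 1.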
  • the reference layer depth determination unit 2153 determines whether or not the dependent layer is a texture picture (the depth flag is 0). Since the determination method has already been described, the details are omitted.
  • the predicted image generation unit 101 selects the prediction method that minimizes an error value based on the difference between the signal value of each pixel of the block included in the layer image and the signal value of the corresponding pixel of the predicted picture block P.
  • the method for selecting the prediction method is not limited to this.
  • when the predicted image generation unit 101 selects displacement prediction, it stores the displacement vector used in generating the predicted picture block P in the prediction parameter memory 108 and outputs it to the inter prediction parameter encoding unit 112.
  • the displacement vector dvLX indicates a vector from the position of the encoding target block to the position of the reference picture block when the predicted picture block P is generated.
  • the information indicating the displacement vector dvLX may include information indicating a reference picture (for example, the reference picture index refIdxLX or the view ID view_id) and may represent a prediction parameter.
  • the predicted image generation unit 101 outputs a prediction mode predMode indicating the inter prediction mode to the prediction parameter encoding unit 111.
  • the inverse quantization / inverse DCT unit 105 inversely quantizes the quantization coefficient input from the DCT / quantization unit 103 to obtain a DCT coefficient.
  • the inverse quantization / inverse DCT unit 105 performs inverse DCT on the obtained DCT coefficient to calculate an encoded residual signal.
  • the inverse quantization / inverse DCT unit 105 outputs the calculated encoded residual signal to the addition unit 106.
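The two stages of unit 105 can be sketched in simplified form: a uniform-step inverse quantization followed by an inverse DCT, shown here in one dimension for brevity (the real unit operates on 2-D blocks, and HEVC-style scaling is more elaborate; this is an assumption-labeled illustration):

```python
import math

def inverse_quantize(levels, qstep):
    """Scale each quantized level back to a DCT coefficient (simplified
    uniform-quantizer model of the first stage of unit 105)."""
    return [lv * qstep for lv in levels]

def inverse_dct_1d(coeffs):
    """Orthonormal 1-D inverse DCT-II: reconstruct the residual signal
    from DCT coefficients (second stage of unit 105, 1-D for brevity)."""
    n = len(coeffs)
    out = []
    for x in range(n):
        s = 0.0
        for k, ck in enumerate(coeffs):
            w = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            s += w * ck * math.cos(math.pi * (2 * x + 1) * k / (2 * n))
        out.append(s)
    return out
```

A DC-only coefficient reconstructs a flat block, as expected of an inverse transform.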
  • the prediction parameter encoding unit 111 stores, in the prediction parameter memory 108, a prediction parameter corresponding to the set of the generated encoding parameters selected by the encoding parameter determination unit 110.
  • the inter prediction parameter encoding unit 112 derives an inter prediction parameter based on the prediction parameter input from the encoding parameter determination unit 110.
  • the inter prediction parameter encoding unit 112 includes, as a configuration for deriving inter prediction parameters, the same configuration with which the inter prediction parameter decoding unit 303 (see FIG. 5 and the like) derives inter prediction parameters.
  • the configuration of the inter prediction parameter encoding unit 112 will be described later.
  • the intra prediction parameter encoding unit 113 determines the intra prediction mode IntraPredMode indicated by the prediction mode predMode input from the encoding parameter determination unit 110 as a set of intra prediction parameters.
  • the merge prediction parameter derivation unit 1121 has the same configuration as the merge prediction parameter derivation unit 3036 (see FIG. 7).
  • the AMVP prediction parameter derivation unit 1122 has the same configuration as the AMVP prediction parameter derivation unit 3032 (see FIG. 8).
  • the AMVP prediction parameter derivation unit 1122 receives the vector mvLX from the encoding parameter determination unit 110 when the prediction mode predMode input from the prediction image generation unit 101 indicates the inter prediction mode.
  • the AMVP prediction parameter derivation unit 1122 derives a prediction vector mvpLX based on the input vector mvLX.
  • the AMVP prediction parameter derivation unit 1122 outputs the derived prediction vector mvpLX to the subtraction unit 1123. Note that the reference picture index refIdx and the vector index mvp_LX_idx are output to the prediction parameter integration unit 1126.
  • the prediction parameter integration unit 1126 integrates the reference picture index refIdxLX and the vector index mvp_LX_idx input from the encoding parameter determination unit 110 with the difference vector mvdLX input from the subtraction unit 1123.
  • the prediction parameter integration unit 1126 outputs the integrated code to the entropy encoding unit 104.
  • the “computer-readable recording medium” may also include a medium that dynamically holds the program for a short time, such as a communication line used when the program is transmitted via a network such as the Internet or a communication line such as a telephone line, and a medium that holds the program for a certain period of time, such as a volatile memory inside a computer system serving as a server or a client.
  • the program may be a program for realizing a part of the functions described above, or a program that can realize the functions described above in combination with a program already recorded in a computer system.
  • the video parameter set decoding unit includes a dimension ID decoding unit that derives the depth flag of each layer, and the tool effectiveness information decoding unit decodes the tool validity flag only when the depth flag of the dependent layer derived by the dimension ID decoding unit indicates that the layer is not depth.
  • the encoded data may further include a plurality of dimension IDs; the first dimension ID may be defined as the view ID from bit 1 and above of the layer ID, and the second dimension ID as the depth flag from bit 0 of the layer ID.
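The bit layout just described can be sketched directly (illustrative function name; the packing follows the bullet above, with the depth flag in bit 0 of the layer ID and the view ID in the remaining upper bits):

```python
def derive_view_id_and_depth_flag(layer_id):
    """Derive the view ID (bits 1 and above) and the depth flag (bit 0)
    from a layer ID packed as described above."""
    depth_flag = layer_id & 0x1   # second dimension ID: bit 0
    view_id = layer_id >> 1       # first dimension ID: bits >= 1
    return view_id, depth_flag
```

For example, layer ID 5 (binary 101) yields view ID 2 with depth flag 1, while layer ID 4 (binary 100) yields view ID 2 with depth flag 0.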
  • the video parameter set decoding unit further includes a layer validity flag decoding unit that decodes a layer validity flag indicating the validity of each layer.
  • the present invention can be suitably applied to an image decoding apparatus that decodes encoded data obtained by encoding image data and an image encoding apparatus that generates encoded data obtained by encoding image data. Further, the present invention can be suitably applied to the data structure of encoded data generated by an image encoding device and referenced by the image decoding device.
  • NAL unit type encoding unit 212 ... VPS decoding unit (video parameter set decoding unit) 212B ... VPS decoding unit (video parameter set decoding unit) 212C ... VPS decoding unit (video parameter set decoding unit) 212D ... VPS decoding unit (video parameter set decoding unit) 2121 ... Scalable type decoding unit 2122 ... Dimension ID decoding unit 2123 ... Dependent layer ID decoding unit 2124 ... Layer validity flag decoding unit 212E ... VPS encoding unit (video parameter set encoding unit) 2121E ... Scalable type encoder 2122E ... Dimension ID encoder 2123E ... Dependent layer encoder 213 ... Layer information storage (layer parameter storage) 214 ...
  • Basic merge candidate derivation unit 3036131 Spatial merge candidate derivation unit 3036132 ... Time merge candidate derivation unit 3036133 ... Join merge candidate derivation unit 3036134 ... Zero merge candidate derivation unit 303614 ... MPI candidate derivation unit 30362 ... Merge candidate selection unit 304 ... Intra prediction parameter decoding unit 307 ... Prediction parameter memory 308 ... Prediction image generation unit 309 ... Inter prediction image generation unit 3091 ... Displacement compensation unit 3092 ... Residual prediction unit 30921 ... Residual acquisition unit 30922 ... Residual filter unit 3093 ... Illuminance compensation unit 30931 ... Illuminance parameter estimation unit 30932 ... Illumination compensation filter unit 3094 ... Prediction unit 310 ... Intra prediction image generation unit 3101 ... Direction prediction unit 3102 ... DMM prediction unit 311 ... Inverse quantization / inverse DCT unit 312 ... Addition unit 313 ... residual storage

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

An object of the present invention is to solve the problem that, when a dimension ID defined by an HEVC extension is used, there is no method for deriving the view ID and the depth flag required for 3D scalability. Moreover, since the view ID and the depth flag of a dependent layer are not restricted, there have been cases in which the memory size of the decoder increased. By determining an index into the dimension ID corresponding to the depth flag and the view ID on the basis of a scalable mask, the view ID and the depth flag are derived from the dimension ID. The memory size of the decoder is reduced by restricting the view ID or the dimension ID of a dependent layer.
PCT/JP2013/081971 2012-12-28 2013-11-27 Encoded data structure and image decoding device WO2014103600A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2012286710A JP2016034050A (ja) 2012-12-28 2012-12-28 Image decoding device, image encoding device, and data structure
JP2012-286710 2012-12-28

Publications (1)

Publication Number Publication Date
WO2014103600A1 true WO2014103600A1 (fr) 2014-07-03

Family

ID=51020694

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2013/081971 WO2014103600A1 (fr) 2012-12-28 2013-11-27 Encoded data structure and image decoding device

Country Status (2)

Country Link
JP (1) JP2016034050A (fr)
WO (1) WO2014103600A1 (fr)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015001762A1 (fr) * 2013-07-05 2015-01-08 Panasonic Intellectual Property Corporation of America Image encoding and decoding methods, image encoding and decoding devices, and image encoding/decoding device
US20220007055A1 (en) * 2019-03-26 2022-01-06 Panasonic Intellectual Property Corporation Of America Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113519162B (zh) * 2019-03-08 2023-05-23 ZTE Corporation Parameter set signaling in digital video
US11153598B2 (en) * 2019-06-04 2021-10-19 Tencent America LLC Method and apparatus for video coding using a subblock-based affine motion model

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
BYEONGDOO CHOI ET AL.: "AHG7: On Random access point pictures and picture order counts for MV-HEVC", JOINT COLLABORATIVE TEAM ON 3D VIDEO CODING EXTENSION DEVELOPMENT OF ITU-T SG 16 WP 3 AND ISO/IEC JTC 1/SC 29/WG 11 3RD MEETING, 17 January 2013 (2013-01-17), GENEVA, CH *
BYEONGDOO CHOI ET AL.: "On Random Access Pictures", JOINT COLLABORATIVE TEAM ON 3D VIDEO CODING EXTENSION DEVELOPMENT OF ITU-T SG 16 WP 3 AND ISO/IEC JTC 1/SC 29/WG 11 1ST MEETING, 16 July 2012 (2012-07-16), STOCKHOLM, SE *
GERHARD TECH ET AL.: "3D-HEVC Test Model 1", JOINT COLLABORATIVE TEAM ON 3D VIDEO CODING EXTENSION DEVELOPMENT OF ITU-T SG 16 WP 3 AND ISO/IEC JTC 1/SC 29/WG 11 1ST MEETING, 16 July 2012 (2012-07-16), STOCKHOLM, SE *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015001762A1 (fr) * 2013-07-05 2015-01-08 Panasonic Intellectual Property Corporation of America Image encoding and decoding methods, image encoding and decoding devices, and image encoding/decoding device
JP5680812B1 (ja) 2013-07-05 2015-03-04 Panasonic Intellectual Property Corporation of America Image encoding method, image decoding method, image encoding device, and image decoding device
US9706213B2 (en) 2013-07-05 2017-07-11 Sun Patent Trust Image encoding method, image decoding method, image encoding device, image decoding device, and image encoding/decoding device
US9992508B2 (en) 2013-07-05 2018-06-05 Sun Patent Trust Image encoding method, image decoding method, image encoding device, image decoding device, and image encoding/decoding device
US10321147B2 (en) 2013-07-05 2019-06-11 Sun Patent Trust Image encoding method, image decoding method, image encoding device, image decoding device, and image encoding/decoding device
US10869055B2 (en) 2013-07-05 2020-12-15 Sun Patent Trust Image encoding method, image decoding method, image encoding device, image decoding device, and image encoding/decoding device
US20220007055A1 (en) * 2019-03-26 2022-01-06 Panasonic Intellectual Property Corporation Of America Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device

Also Published As

Publication number Publication date
JP2016034050A (ja) 2016-03-10

Similar Documents

Publication Publication Date Title
JP6397421B2 (ja) Image decoding device and image encoding device
WO2014103529A1 (fr) Image decoding device and data structure
CA2909309C (fr) Harmonized inter-view and view synthesis prediction for 3D video coding
US9967592B2 (en) Block-based advanced residual prediction for 3D video coding
KR101662963B1 (ko) Apparatus, method, and computer program for 3D video coding
JP6469588B2 (ja) Residual prediction device, image decoding device, image encoding device, residual prediction method, image decoding method, and image encoding method
JP6360053B2 (ja) Illuminance compensation device, image decoding device, and image encoding device
WO2016125685A1 (fr) Image decoding device, image encoding device, and prediction vector calculation device
EP2966868B1 (fr) Method for predicting and inheriting motion information in video coding
WO2015056719A1 (fr) Image decoding device and image encoding device
US20160261888A1 (en) Method and apparatus for decoding multi-view video
WO2015056620A1 (fr) Image decoding device and image encoding device
JP6118199B2 (ja) Image decoding device, image encoding device, image decoding method, image encoding method, and computer-readable recording medium
WO2014103600A1 (fr) Encoded data structure and image decoding device
WO2015141696A1 (fr) Image decoding device, image encoding device, and prediction device
WO2016056587A1 (fr) Displacement arrangement derivation device, displacement vector derivation device, default reference view index derivation device, and depth lookup table derivation device
JP6401707B2 (ja) Image decoding device, image decoding method, and recording medium
JP2014204327A (ja) Image decoding device and image encoding device
JP2015015626A (ja) Image decoding device and image encoding device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13867746

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 13867746

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP