WO2024080216A1 - Image decoding device and image encoding device - Google Patents

Image decoding device and image encoding device

Publication number: WO2024080216A1
Authority: WO (WIPO/PCT)
Application number: PCT/JP2023/036356
Other languages: French (fr), Japanese (ja)
Prior art keywords: mode, dimd, unit, prediction, image
Inventors: 哲銘 范, 知宏 猪飼, 将伸 八杉, 友子 青野
Original Assignee: シャープ株式会社 (Sharp Corporation)

Classifications

    All classifications fall under H ELECTRICITY; H04 ELECTRIC COMMUNICATION TECHNIQUE; H04N PICTORIAL COMMUNICATION, e.g. TELEVISION; H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals:
    • H04N19/105: Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N19/11: Selection of coding mode or of prediction mode among a plurality of spatial predictive coding modes
    • H04N19/176: Adaptive coding characterised by the coding unit, the unit being an image region, e.g. an object, the region being a block, e.g. a macroblock
    • H04N19/70: Characterised by syntax aspects related to video coding, e.g. related to compression standards
    • H04N19/91: Entropy coding, e.g. variable length coding [VLC] or arithmetic coding

Definitions

  • Embodiments of the present invention relate to an image decoding device and an image encoding device.
  • a video encoding device that generates encoded data by encoding video, and a video decoding device that generates a decoded image by decoding the encoded data, are used.
  • specific video coding methods include those proposed in H.264/AVC and HEVC (High Efficiency Video Coding).
  • the images (pictures) that make up a video are managed in a hierarchical structure consisting of slices obtained by dividing the images, coding tree units (CTUs) obtained by dividing the slices, coding units (CUs) obtained by dividing the coding tree units, and transform units (TUs) obtained by dividing the coding units, and are coded/decoded for each CU.
  • a predicted image is usually generated based on a locally decoded image obtained by encoding/decoding an input image, and the prediction error (sometimes called a "difference image" or "residual image") obtained by subtracting the predicted image from the input image (original image) is encoded.
  • Methods for generating predicted images include inter-frame prediction (inter prediction) and intra-frame prediction (intra prediction).
  • Non-Patent Document 1 discloses decoder-side intra mode derivation (DIMD) prediction, in which the decoder derives a predicted image by deriving an intra direction prediction mode number using pixels in adjacent regions.
  • in Non-Patent Document 1, the intra mode is derived on the decoder side using the gradient of pixel values of the image adjacent to the target area, but there is an issue in that the angle gradient of the adjacent image and the angle gradient of the target block do not necessarily match.
  • the present invention aims to improve the accuracy of decoder-side intra mode derivation by switching intra prediction mode derivation depending on the properties of adjacent blocks and the current block.
  • It includes a reference sample derivation unit that selects adjacent images for the target block according to the DIMD mode, a gradient derivation unit that uses the selected adjacent images to derive pixel-level gradients, and an angle mode selection unit that derives the intra prediction mode from the gradient.
  • FIG. 1 is a schematic diagram showing a configuration of an image transmission system according to an embodiment of the present invention.
  • FIG. 2 is a diagram showing a hierarchical structure of data in an encoded stream.
  • FIG. 3 is a schematic diagram showing types of intra prediction modes (mode numbers).
  • FIG. 4 is a schematic diagram showing a configuration of a video decoding device.
  • FIG. 5 is a diagram showing an example of DIMD syntax.
  • FIG. 6 is a diagram explaining the binarization of the syntax dimd_mode used in the DIMD prediction unit 31046.
  • FIG. 7 is a diagram showing another example of DIMD syntax.
  • FIG. 8 is a diagram showing context settings in decoding syntax elements of dimd_mode.
  • FIG. 9 is a diagram illustrating a configuration of a predicted image generating unit.
  • FIG. 10 is a diagram showing details of a DIMD prediction unit.
  • FIG. 11 is a diagram showing an example of a reference region referred to by the DIMD prediction unit 31046.
  • FIG. 12 is a diagram showing a configuration for changing the number of lines in a reference region for DIMD prediction according to dimd_mode.
  • FIG. 13 is a diagram showing an example of a spatial filter.
  • FIG. 14 is a diagram illustrating an example of the pixels for which the gradient is derived.
  • FIG. 15 is a diagram illustrating the relationship between gradient and region.
  • FIG. 16 is a block diagram showing a configuration of an angle mode derivation unit.
  • FIG. 17 is a diagram showing an example of a reference range in gradient derivation by the DIMD prediction unit 31046.
  • FIG. 18 is a functional block diagram showing an example of the configuration of an inverse quantization and inverse transform unit.
  • FIG. 19 is a block diagram showing a configuration of a video encoding device.
  • FIG. 1 is a schematic diagram showing the configuration of an image transmission system 1 according to this embodiment.
  • the image transmission system 1 is a system that transmits an encoded stream obtained by encoding an image to be encoded, and decodes the transmitted encoded stream to display an image.
  • the image transmission system 1 is composed of a video encoding device (image encoding device) 11, a network 21, a video decoding device (image decoding device) 31, and a video display device (image display device) 41.
  • An image T is input to the video encoding device 11.
  • the network 21 transmits the encoded stream Te generated by the video encoding device 11 to the video decoding device 31.
  • the network 21 is the Internet, a wide area network (WAN), a local area network (LAN), or a combination of these.
  • the network 21 is not necessarily limited to a bidirectional communication network, and may be a unidirectional communication network that transmits broadcast waves such as terrestrial digital broadcasting and satellite broadcasting.
  • the network 21 may also be replaced by a storage medium on which the encoded stream Te is recorded, such as a DVD (Digital Versatile Disc: registered trademark) or a BD (Blu-ray Disc: registered trademark).
  • the video decoding device 31 decodes each of the encoded streams Te transmitted by the network 21 and generates one or more decoded images Td.
  • the video display device 41 displays all or part of one or more decoded images Td generated by the video decoding device 31.
  • the video display device 41 is equipped with a display device such as a liquid crystal display or an organic EL (Electro-luminescence) display. Display forms include stationary, mobile, HMD, and the like. Furthermore, when the video decoding device 31 has high processing power, it displays high-quality images; when it has only lower processing power, it displays images that do not require high processing or display capability.
  • x?y:z is a ternary operator that takes y if x is true (non-zero) and z if x is false (0).
  • BitDepthY is the luminance bit depth.
  • abs(a) is a function that returns the absolute value of a.
  • Int(a) is a function that returns the integer value of a.
  • Floor(a) is a function that returns the largest integer less than or equal to a.
  • Log2(a) is a function that returns the base-2 logarithm of a.
  • Ceil(a) is a function that returns the smallest integer greater than or equal to a.
  • a/d represents the division of a by d (rounded down to the nearest integer).
  • Min(a,b) is a function that returns the smaller of a and b.
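  • For illustration, the operators above can be rendered in C as in the following minimal sketch (names are suffixed with an underscore only to avoid clashes with the C standard library; the floor-division adjustment reflects the "rounded down" convention stated above):

      #include <math.h>

      /* x ? y : z is C's native conditional operator. */
      static int    Abs_(int a)        { return a < 0 ? -a : a; }   /* abs(a) */
      static double Floor_(double a)   { return floor(a); }         /* largest integer <= a */
      static double Ceil_(double a)    { return ceil(a); }          /* smallest integer >= a */
      static double Log2_(double a)    { return log2(a); }          /* base-2 logarithm of a */
      static int    Min_(int a, int b) { return a < b ? a : b; }    /* smaller of a and b */

      /* a/d rounded down: C's "/" truncates toward zero, so negative
       * quotients need an adjustment to match the convention above. */
      static int Div_(int a, int d)
      {
          int q = a / d;
          if ((a % d != 0) && ((a < 0) != (d < 0))) q--;
          return q;
      }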
  • FIG. 2 is a diagram showing the hierarchical structure of data in an encoded stream Te.
  • the encoded stream Te illustratively includes a sequence and a number of pictures that make up the sequence.
  • FIG. 2 shows a coded video sequence that defines a sequence SEQ, a coded picture that specifies a picture PICT, a coded slice that specifies a slice S, coded slice data that specifies slice data, a coding tree unit included in the coded slice data, and a coding unit included in the coding tree unit.
  • the coded video sequence defines a set of data to be referred to by the video decoding device 31 in order to decode the sequence SEQ to be processed.
  • the sequence SEQ includes a video parameter set VPS (Video Parameter Set), a sequence parameter set SPS (Sequence Parameter Set), a picture parameter set PPS (Picture Parameter Set), a picture PICT, and supplemental enhancement information SEI (Supplemental Enhancement Information).
  • the video parameter set VPS specifies a set of coding parameters common to multiple videos composed of multiple layers, as well as a set of coding parameters related to multiple layers and each individual layer included in the video.
  • the sequence parameter set SPS specifies a set of coding parameters that the video decoding device 31 references in order to decode the target sequence. For example, the width and height of a picture are specified. Note that there may be multiple SPSs. In that case, one of the multiple SPSs is selected from the PPS.
  • the picture parameter set PPS specifies a set of coding parameters that the video decoding device 31 references in order to decode each picture in the target sequence. For example, it includes the reference value of the quantization width used in decoding the picture (pic_init_qp_minus26) and a flag indicating the application of weighted prediction (weighted_pred_flag). Note that there may be multiple PPSs. In that case, one of the multiple PPSs is selected for each picture in the target sequence.
  • a coded picture defines a set of data to be referenced by the video decoding device 31 in order to decode a picture PICT to be processed. As shown in the coded picture of FIG. 2, the picture PICT includes slices 0 to NS-1 (NS is the total number of slices included in the picture PICT).
  • An encoded slice defines a set of data to be referenced by the video decoding device 31 in order to decode a slice S to be processed. As shown in the encoded slice of Fig. 2, a slice includes a slice header and slice data.
  • the slice header includes a set of coding parameters that the video decoding device 31 refers to in order to determine the decoding method for the target slice.
  • Slice type designation information (slice_type) that specifies the slice type is an example of a coding parameter included in the slice header.
  • Slice types that can be specified by the slice type specification information include (1) an I slice that uses only intra prediction when encoding, (2) a P slice that uses unidirectional prediction or intra prediction when encoding, and (3) a B slice that uses unidirectional prediction, bidirectional prediction, or intra prediction when encoding.
  • inter prediction is not limited to unidirectional or bidirectional prediction, and a predicted image may be generated using more reference pictures.
  • the term "P or B slice" refers to a slice that includes a block for which inter prediction can be used.
  • the slice header may include a reference to the picture parameter set PPS (pic_parameter_set_id).
  • the coded slice data specifies a set of data to be referenced by the video decoding device 31 in order to decode the slice data to be processed.
  • the slice data includes a CTU, as shown in the coded slice data of Fig. 2.
  • a CTU is a block of a fixed size (e.g., 64x64) that constitutes a slice, and is also called a Largest Coding Unit (LCU).
  • in the coding tree unit of Fig. 2, a set of data that the video decoding device 31 refers to in order to decode the CTU to be processed is specified.
  • the CTU is divided into coding units CU, which are basic units of the coding process, by recursive quad tree division (QT (Quad Tree) division), binary tree division (BT (Binary Tree) division), or ternary tree division (TT (Ternary Tree) division).
  • BT division and TT division are collectively called multi tree division (MT (Multi Tree) division).
  • a node of a tree structure obtained by recursive quad tree division is called a coding node.
  • the intermediate nodes of the quad tree, binary tree, and ternary tree are coding nodes, and the CTU itself is specified as the top coding node.
  • the CU is composed of a CU header CUH, prediction parameters, transformation parameters, quantization transformation coefficients, etc.
  • the CU header defines a prediction mode, etc.
  • Prediction processing may be performed on a CU basis, or on a sub-CU basis, which is a further division of a CU. If the size of the CU and sub-CU are equal, there is one sub-CU in the CU. If the size of the CU is larger than the size of the sub-CU, the CU is divided into sub-CUs. For example, if the CU is 8x8 and the sub-CU is 4x4, the CU is divided into 2 parts horizontally and 2 parts vertically, into 4 sub-CUs.
  • Intra prediction is prediction within the same picture, while inter prediction refers to prediction processing performed between different pictures (for example, between display times or between layer images).
  • the transform and quantization process is performed on a CU basis, but the quantized transform coefficients may be entropy coded on a subblock basis, such as 4x4.
  • the predicted image is derived from prediction parameters associated with the block, which include intra-prediction and inter-prediction parameters.
  • the intra prediction parameters consist of a luminance prediction mode IntraPredModeY and a chrominance prediction mode IntraPredModeC.
  • Figure 3 is a schematic diagram showing the types of intra prediction modes (mode numbers). As shown in the figure, there are, for example, 67 types of intra prediction modes (0 to 66). These include planar prediction (0), DC prediction (1), and angular prediction (2 to 66).
  • linear model (LM: Linear Model) prediction such as cross component linear model (CCLM: Cross Component Linear Model) prediction and multi-mode linear model (MMLM: Multi Mode Linear Model) prediction may also be used.
  • an LM mode may be added for chrominance.
  • the video decoding device 31 includes an entropy decoding unit 301, a parameter decoding unit (prediction image decoding device) 302, a loop filter 305, a reference picture memory 306, a prediction parameter memory 307, a prediction image generating unit (prediction image generating device) 308, an inverse quantization and inverse transform unit 311, an addition unit 312, and a prediction parameter derivation unit 320.
  • the video decoding device 31 may also be configured not to include the loop filter 305.
  • CTU and CU are used as processing units, but this is not limiting and processing may be performed in sub-CU units.
  • CTU and CU may be read as blocks and sub-CU as sub-blocks, and processing may be performed in block or sub-block units.
  • the entropy decoding unit 301 performs entropy decoding on the externally input encoded stream Te and parses each code (syntax element).
  • There are two types of entropy coding: one performs variable-length coding of syntax elements using a context (probability model) adaptively selected according to the type of syntax element and the surrounding circumstances, and the other performs variable-length coding of syntax elements using a predefined table or formula. The former is CABAC (Context Adaptive Binary Arithmetic Coding).
  • for a P picture or B picture, the probability model of a picture that uses the same slice type and the same slice-level quantization parameter is set as the initial state of the context. This initial state is used for the encoding and decoding processes.
  • the parsed code includes prediction information for generating a predicted image and prediction errors for generating a difference image.
  • the entropy decoding unit 301 may decode each bin of the syntax element using the variables ivlCurrRange, ivlOffset, valIdx, pStateIdx0, and pStateIdx1.
  • ivlCurrRange and ivlOffset are context-independent variables.
  • valIdx, pStateIdx0, and pStateIdx1 are context-specific variables.
  • Using the decoded bin value binVal and the most probable value valMps, the entropy decoding unit 301 updates the state of the context by the following calculation:
  • pStateIdx1 = pStateIdx1 - (pStateIdx1 >> shift1) + ((16383 * binVal) >> shift1)
  • (Bin decoding in the case of bypass)
  • In the bypass case, the entropy decoding unit 301 obtains ivlCurrRange and ivlOffset by the following calculation:
  • ivlCurrRange = ivlCurrRange << 1
  • ivlOffset = (ivlOffset << 1) | read_bits(1)
  • read_bits(1) reads one bit from the bitstream and returns that value.
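  • A compact C sketch of the above (assumptions: the pStateIdx0 update with constant 1023 is the usual companion of the quoted pStateIdx1 rule in two-rate CABAC designs and is not stated in this text, and the final compare-and-subtract step of bypass decoding is likewise the conventional completion; read_bits is the primitive defined above):

      #include <stdint.h>

      extern uint32_t read_bits(int n);   /* reads n bits from the bitstream */

      /* Update the per-context state after decoding bin binVal;
       * shift0/shift1 are the two adaptation rates of the context. */
      static void update_context(int binVal, int shift0, int shift1,
                                 uint16_t *pStateIdx0, uint16_t *pStateIdx1)
      {
          *pStateIdx0 = *pStateIdx0 - (*pStateIdx0 >> shift0) + ((1023  * binVal) >> shift0);
          *pStateIdx1 = *pStateIdx1 - (*pStateIdx1 >> shift1) + ((16383 * binVal) >> shift1);
      }

      /* Bypass decoding of one bin: the offset absorbs one bitstream bit
       * and is compared against the range; no context state is involved. */
      static int decode_bypass(uint32_t ivlCurrRange, uint32_t *ivlOffset)
      {
          *ivlOffset = (*ivlOffset << 1) | read_bits(1);
          if (*ivlOffset >= ivlCurrRange) {
              *ivlOffset -= ivlCurrRange;
              return 1;
          }
          return 0;
      }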
  • the entropy decoding unit 301 outputs the parsed syntax elements to the parameter decoding unit 302. Control of which syntax elements to parse is performed based on instructions from the parameter decoding unit 302.
  • the entropy decoding unit 301 may parse, for example, the syntax element dimd_mode shown in the syntax table of FIG. 5 as follows.
  • dimd_mode is a syntax element that selects the reference region of the DIMD. The entropy decoding unit 301 parses dimd_mode from the encoded data.
  • dimd_mode may be DIMD_MODE_TOP_LEFT mode, DIMD_MODE_TOP mode, or DIMD_MODE_LEFT mode, which may be 0, 1, or 2, respectively.
  • Figure 6(a) shows an example of binarization of dimd_mode.
  • Bin0 is a flag that selects between DIMD_MODE_TOP_LEFT and the other modes: 0 indicates DIMD_MODE_TOP_LEFT, and 1 indicates a mode other than DIMD_MODE_TOP_LEFT.
  • the syntax element assigned to Bin0 is called dimd_mode_flag
  • the syntax element assigned to Bin1 is called dimd_mode_dir (see, for example, FIG. 7).
  • 1 bit (for example, "0") is assigned to DIMD_MODE_TOP_LEFT, and 1 more bit is assigned after "1" to DIMD_MODE_TOP and DIMD_MODE_LEFT.
  • in this way, a shorter code is assigned to DIMD_MODE_TOP_LEFT than to the left-only or top-only modes, which has the effect of shortening the average code amount and improving coding efficiency.
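  • In code, this binarization parses as in the sketch below (decode_bin() is a hypothetical stand-in for decoding one bin, context-coded or bypass; the polarity of Bin1, with 0 mapping to DIMD_MODE_TOP, is an assumption consistent with the mode values 0, 1, 2 given above):

      enum DimdMode { DIMD_MODE_TOP_LEFT = 0, DIMD_MODE_TOP = 1, DIMD_MODE_LEFT = 2 };

      extern int decode_bin(void);   /* hypothetical: decodes one CABAC bin */

      static enum DimdMode parse_dimd_mode(void)
      {
          int dimd_mode_flag = decode_bin();            /* Bin0 */
          if (dimd_mode_flag == 0)
              return DIMD_MODE_TOP_LEFT;                /* codeword "0"  */
          int dimd_mode_dir = decode_bin();             /* Bin1 */
          return dimd_mode_dir == 0 ? DIMD_MODE_TOP     /* codeword "10" */
                                    : DIMD_MODE_LEFT;   /* codeword "11" */
      }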
  • In another example, dimd_mode is parsed from the encoded data and may be DIMD_LINES1 mode or DIMD_LINES2 mode, which may be 0 or 1, respectively.
  • Figure 8 shows the setting of the context (ctxInc) when parsing the syntax element of dimd_mode.
  • a context is a variable area for holding the probability (state) of CABAC, and is identified by the value of the context index ctxIdx (0, 1, 2, ...). The case where 0 and 1 are always equally probable, in other words 0.5, 0.5, is called EP (Equal Probability) or bypass. In this case, no context is used because there is no need to hold a state for a specific syntax element.
  • ctxIdx is derived by referencing ctxInc.
  • Bin0 is a syntax element indicating whether the mode is DIMD_MODE_TOP_LEFT.
  • Bin1 is a syntax element indicating whether the mode is DIMD_MODE_LEFT or DIMD_MODE_TOP.
  • Bypass is a parsing method that does not use a context.
  • dimd_mode is a syntax element that selects the DIMD reference area from the encoded data. With the above configuration, no context is used to select between DIMD_MODE_LEFT and DIMD_MODE_TOP, which has the effect of reducing memory.
  • the formula and values are not limited to the above, and the order of judgment and values may be changed.
  • ctxIdx = ( bW > bH ) ? 1 : ( bW < bH ) ? 2 : bypass
  • the DIMD mode (dimd_mode) is composed of a first bit and a second bit; the first bit selects whether the reference area of the DIMD is both above and to the left of the target block, and the second bit selects whether the reference area of the DIMD is the adjacent area to the left of or above the target block.
  • a predetermined context (e.g., 1 or 2) may be used for one block shape, and a different context (e.g., 2) may be used for another.
  • dimd_mode is decoded using the value obtained by swapping the binary value of Bin1 (1 to 0, 0 to 1; for example, 1 - Bin1) depending on whether bW > bH or bH > bW.
  • dimd_mode is derived as follows.
  • the above configuration uses different contexts depending on the shape of the target block, for example, whether the target block is square or not (and/or whether it is horizontal or vertical), so it is possible to adaptively encode the block with a short code according to its characteristics, improving performance. Also, if no context is used when the block is square, for example, this has the effect of reducing memory usage.
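  • A sketch of this shape-dependent context selection for Bin1 (CTX_BYPASS is a hypothetical marker meaning "decode this bin in bypass mode without a context"):

      #define CTX_BYPASS (-1)   /* hypothetical: no context, decode as bypass */

      /* ctxIdx = ( bW > bH ) ? 1 : ( bW < bH ) ? 2 : bypass */
      static int dimd_bin1_ctx(int bW, int bH)
      {
          if (bW > bH) return 1;    /* wider than tall */
          if (bW < bH) return 2;    /* taller than wide */
          return CTX_BYPASS;        /* square block: no context, saving memory */
      }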
  • xC and yC are variables related to the top left position of the block
  • refIdxW and refIdxH are variables related to the number of lines in the reference area for DIMD prediction.
  • dimd_mode = DIMD_MODE_TOP
  • dimd_mode = DIMD_MODE_LEFT
  • the parameter decoding unit 302 informs the entropy decoding unit 301 which syntax elements to parse.
  • the parameter decoding unit 302 outputs the syntax elements parsed by the entropy decoding unit 301 to the prediction parameter derivation unit 320.
  • the prediction parameter derivation unit 320 derives prediction parameters, for example, an intra-prediction mode IntraPredMode, by referring to the prediction parameters stored in the prediction parameter memory 307 based on the syntax elements input from the parameter decoding unit 302.
  • the prediction parameter derivation unit 320 outputs the derived prediction parameters to the predicted image generation unit 308, and also stores them in the prediction parameter memory 307.
  • the prediction parameter derivation unit 320 may derive different prediction modes for luminance and chrominance.
  • the prediction parameter derivation unit 320 may derive prediction parameters from syntax elements related to intra prediction such as those shown in FIG. 5.
  • the loop filter 305 is a filter provided in the encoding loop that removes block distortion and ringing distortion and improves image quality.
  • the loop filter 305 applies filters such as a deblocking filter, sample adaptive offset (SAO), and adaptive loop filter (ALF) to the decoded image of the CU generated by the adder 312.
  • the reference picture memory 306 stores the decoded image of the CU generated by the adder 312 in a predetermined location for each target picture and target CU.
  • the prediction parameter memory 307 stores prediction parameters at a predetermined location for each CTU or CU to be decoded. Specifically, the prediction parameter memory 307 stores the parameters decoded by the parameter decoding unit 302 and the prediction mode predMode derived by the prediction parameter derivation unit 320.
  • the prediction mode predMode, prediction parameters, etc. are input to the prediction image generation unit 308.
  • the prediction image generation unit 308 also reads a reference picture from the reference picture memory 306.
  • the prediction image generation unit 308 generates a prediction image of a block or sub-block using the prediction parameters and the read reference picture (reference picture block).
  • a reference picture block is a set of pixels on the reference picture (usually rectangular, so called a block), and is the area referenced to generate a prediction image.
  • the predicted image generation unit 310 performs intra prediction using the intra prediction parameters input from the prediction parameter derivation unit 320 and the reference pixels read from the reference picture memory 306.
  • the predicted image generation unit 308 reads adjacent blocks in a predetermined range from the target block on the target picture from the reference picture memory 306.
  • the predetermined range refers to the adjacent blocks to the left, upper left, upper, and upper right of the target block, and the area to be referenced differs depending on the intra prediction mode.
  • the predicted image generation unit 308 generates a predicted image of the current block by referring to the decoded pixel values that have been read and the prediction mode indicated by IntraPredMode.
  • the predicted image generation unit 308 outputs the generated predicted image of the block to the addition unit 312.
  • a decoded surrounding area adjacent (close) to the block to be predicted is set as the reference region R. Then, a predicted image is generated by extrapolating pixels in the reference region R in a specific direction.
  • reference region R may be set as an L-shaped region that includes the left and top of the block to be predicted (or further, the top left, top right, and bottom left).
  • the predicted image generation unit 308 includes a reference sample filter unit 3103 (second reference image setting unit), a prediction unit 3104, and a predicted image correction unit 3105 (predicted image correction unit, filter switching unit, and weighting coefficient changing unit).
  • based on each reference pixel (reference image) in the reference region R, a filtered reference image generated by applying a reference pixel filter (first filter), and the intra prediction mode, the prediction unit 3104 generates a prediction image (provisional predicted image, uncorrected predicted image) of the block to be predicted, and outputs it to the prediction image correction unit 3105.
  • the prediction image correction unit 3105 corrects the provisional predicted image according to the intra prediction mode, and generates and outputs a prediction image (corrected predicted image).
  • the reference sample filter unit 3103 derives a reference sample s[x][y] at each position (x, y) on the reference region R by referring to the reference image.
  • the reference sample filter unit 3103 applies a reference pixel filter (first filter) to the reference sample s[x][y] according to the intra prediction mode to update the reference sample s[x][y] at each position (x, y) on the reference region R (derives a filtered reference image s[x][y]).
  • a low-pass filter is applied to the position (x, y) and the reference image therearound to derive a filtered reference image.
  • a low-pass filter may be applied to some intra prediction modes.
  • the filter applied to the reference image on the reference region R in the reference sample filter unit 3103 is referred to as a "reference pixel filter (first filter)"
  • the filter that corrects the tentative predicted image in the prediction image correction unit 3105 described later is referred to as a "position-dependent filter (second filter)”.
  • the intra prediction unit generates a tentative predicted image (tentative predicted pixel value, pre-corrected predicted image) of a prediction target block based on an intra prediction mode, a reference image, and a filtered reference pixel value, and outputs the generated image to a prediction image correction unit 3105.
  • the prediction unit 3104 includes a planar prediction unit 31041, a DC prediction unit 31042, an angular prediction unit 31043, an LM prediction unit 31044, a matrix-based intra prediction unit 31045, and a DIMD prediction unit 31046 (Decoder-side Intra Mode Derivation, DIMD).
  • the prediction unit 3104 selects a specific prediction unit according to the intra prediction mode, and inputs a reference image and a filtered reference image.
  • the relationship between the intra prediction mode and the corresponding prediction unit is as follows.
  • Planar prediction: planar prediction unit 31041
  • DC prediction: DC prediction unit 31042
  • Angular prediction: angular prediction unit 31043
  • LM prediction: LM prediction unit 31044
  • Matrix intra prediction: MIP unit 31045
  • DIMD prediction: DIMD prediction unit 31046
  • (Planar prediction)
  • the planar prediction unit 31041 generates a provisional predicted image by linearly adding the reference sample s[x][y] according to the distance between the prediction target pixel position and the reference pixel position, and outputs the provisional predicted image to the predicted image correction unit 3105.
  • the DC prediction unit 31042 derives a DC predicted value equivalent to the average value of the reference samples s[x][y], and outputs a temporary predicted image q[x][y] whose pixel values are the DC predicted values.
  • the angular prediction unit 31043 generates a temporary predicted image q[x][y] using a reference sample s[x][y] in the prediction direction (reference direction) indicated by the intra prediction mode, and outputs the temporary predicted image q[x][y] to the predicted image correction unit 3105.
  • the LM prediction unit 31044 predicts pixel values of chrominance based on pixel values of luminance. Specifically, this is a method of generating a predicted image of a chrominance image (Cb, Cr) using a linear model based on a decoded luminance image.
  • LM prediction is a prediction method that uses a linear model to predict chrominance from luminance for one block.
  • the MIP unit 31045 generates a temporary predicted image q[x][y] by performing a product-sum operation on the reference sample s[x][y] derived from the adjacent region and a weighting matrix, and outputs the generated image to the predicted image correction unit 3105.
  • the DIMD prediction unit 31046 is a prediction method that generates a predicted image using an intra prediction mode that is not explicitly signaled.
  • the angle mode derivation device 310465 derives an intra prediction mode suitable for the current block using information on the neighboring region, and the DIMD prediction unit 31046 generates a temporary predicted image using this intra prediction mode. Details will be described later.
  • the predicted image correction unit 3105 corrects the provisional predicted image output from the prediction unit 3104 according to the intra prediction mode. Specifically, the predicted image correction unit 3105 derives a position-dependent weighting coefficient for each pixel of the provisional predicted image according to the reference region R and the position of the target predicted pixel. Then, the predicted image correction unit 3105 performs weighted addition (weighted averaging) of the reference sample s[][] and the provisional predicted image q[x][y] to derive a predicted image (corrected predicted image) Pred[][] obtained by correcting the provisional predicted image. Note that, in some intra prediction modes, the predicted image correction unit 3105 may set the provisional predicted image q[x][y] as a predicted image without correcting it.
  • (Example 1) FIG. 10 shows the configuration of the DIMD prediction unit 31046 in this embodiment.
  • the DIMD prediction unit 31046 is composed of a reference sample derivation unit 310460, an angle mode derivation device 310465 (gradient derivation unit 310461, angle mode derivation unit 310462), an angle mode selection unit 310463, and a temporary predicted image generation unit 310464.
  • the angle mode derivation device 310465 may include the angle mode selection unit 310463.
  • Figure 5 shows an example of the syntax of encoded data related to DIMD.
  • the prediction parameter derivation unit 320 decodes a flag dimd_flag indicating whether or not to use DIMD for each block from the encoded data. If dimd_flag for the target block is 1, the parameter decoding unit 302 does not need to decode syntax elements related to intra prediction mode (intra_mip_flag, intra_luma_mpm_flag, intra_luma_mpm_idx, intra_luma_mpm_reminder) from the encoded data.
  • intra_mip_flag is a flag indicating whether or not to perform MIP prediction.
  • intra_luma_mpm_flag is a flag indicating whether or not to use the prediction candidate Most Probable Mode (MPM).
  • intra_luma_mpm_idx is an index that specifies MPM when MPM is used.
  • intra_luma_mpm_reminder is an index that selects the remaining candidate when MPM is not used. If dimd_flag is 0, intra_luma_mpm_flag is decoded, and if intra_luma_mpm_flag is 0, intra_luma_mpm_reminder is also decoded. If dimd_flag of the current block is 1, dimd_mode of the current block is also decoded. dimd_mode indicates a reference region used to derive an intra prediction mode in DIMD prediction. The meaning of dimd_mode may be as follows:
  • dimd_flag = 1 indicates that DIMD prediction is used for the target block.
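  • The decoding order described above can be sketched as follows (parse_flag/parse_value are hypothetical wrappers around the entropy decoder, and the placement of intra_mip_flag relative to the MPM elements is an assumption):

      extern int parse_flag(const char *name);    /* hypothetical 1-bit element */
      extern int parse_value(const char *name);   /* hypothetical multi-valued element */

      static void parse_intra_mode_syntax(void)
      {
          if (parse_flag("dimd_flag")) {
              /* DIMD is used: the explicit intra-mode elements are skipped
               * and only the DIMD reference-region selector is decoded. */
              parse_value("dimd_mode");
              return;
          }
          if (parse_flag("intra_mip_flag"))
              return;                                  /* MIP-specific elements follow */
          if (parse_flag("intra_luma_mpm_flag"))
              parse_value("intra_luma_mpm_idx");       /* MPM candidate index */
          else
              parse_value("intra_luma_mpm_reminder");  /* remaining non-MPM candidate */
      }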
  • the DIMD prediction unit 31046 derives an angle indicating the texture direction in the adjacent region using pixel values. Then, a provisional predicted image is generated using an intra prediction mode corresponding to the angle. For example, (1) a gradient direction of pixel values is derived for a pixel at a predetermined position in the adjacent region. (2) The derived gradient direction is converted to a corresponding directional prediction mode (angular prediction mode).
  • a histogram of the obtained prediction direction is created for each predetermined pixel in the adjacent region.
  • a prediction mode of the most frequent value or a plurality of prediction modes including the most frequent value is selected from the histogram, and a provisional predicted image is generated using the prediction mode.
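  • Steps (1) to (4) amount to the loop sketched below (derive_gradient and gradient_to_mode are hypothetical stand-ins for the gradient derivation and mode conversion described later; NUM_MODES covers the directional modes of FIG. 3):

      #define NUM_MODES 67   /* intra prediction modes 0..66 */

      extern void derive_gradient(int point, int *Dx, int *Dy);  /* step (1), hypothetical */
      extern int  gradient_to_mode(int Dx, int Dy);              /* step (2), hypothetical */

      static int dimd_most_frequent_mode(int nPoints)
      {
          int HistMode[NUM_MODES] = { 0 };
          for (int i = 0; i < nPoints; i++) {          /* each predetermined pixel */
              int Dx, Dy;
              derive_gradient(i, &Dx, &Dy);            /* (1) gradient of pixel values */
              int modeVal = gradient_to_mode(Dx, Dy);  /* (2) direction -> angular mode */
              HistMode[modeVal]++;                     /* (3) histogram of the modes */
          }
          int best = 2;                                /* (4) most frequent angular mode */
          for (int m = 2; m < NUM_MODES; m++)
              if (HistMode[m] > HistMode[best]) best = m;
          return best;
      }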
  • the reference sample derivation unit 310460 derives a reference sample refUnit from decoded pixels recSamples adjacent to the current block. Note that the operation of the reference sample derivation unit 310460 may be performed by the reference sample filter unit 3103.
  • FIG. 11 is a diagram showing an example of a reference region referred to by the DIMD prediction unit 31046.
  • the reference sample derivation unit 310460 stores adjacent images (images in the DIMD reference region) recSamples of the current block to be used by a gradient derivation unit 310461 and the predicted image generation unit 308, which will be described later, in a sample array refUnit.
  • the reference sample derivation unit 310460 derives a sample array refUnit from the left and top areas of the target block as follows.
  • refUnit[x][y] = recSamples[xC+x][yC+y]
  • where y = -1-refIdxH..refH-1
  • refIdxW and refIdxH are constants indicating the width of the adjacent reference area to the left and the height of the adjacent reference area above.
  • For example, refIdxW = 2 or 3.
  • extending refers to using adjacent images including the lower left adjacent area in addition to the left, and using adjacent images including the upper right adjacent area in addition to the top.
  • RTL is the area that combines RL and RT.
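  • A sketch of the copy for the combined region RTL (flat array indexing; the x-ranges of RL and RT are assumptions that fill in ranges the text leaves implicit, and the boundary replacement described later is omitted):

      /* Copy the L-shaped DIMD reference region from the reconstructed
       * picture recSamples (stride picStride) into refUnit (stride refStride);
       * (xC, yC) is the top-left sample of the target block. */
      static void derive_ref_unit(const int *recSamples, int picStride,
                                  int xC, int yC, int refW, int refH,
                                  int refIdxW, int refIdxH,
                                  int *refUnit, int refStride)
      {
          /* RL: left columns, x = -1-refIdxW..-1, y = -1-refIdxH..refH-1 */
          for (int y = -1 - refIdxH; y <= refH - 1; y++)
              for (int x = -1 - refIdxW; x <= -1; x++)
                  refUnit[(y + 1 + refIdxH) * refStride + (x + 1 + refIdxW)] =
                      recSamples[(yC + y) * picStride + (xC + x)];
          /* RT: top rows, x = 0..refW-1, y = -1-refIdxH..-1 */
          for (int y = -1 - refIdxH; y <= -1; y++)
              for (int x = 0; x <= refW - 1; x++)
                  refUnit[(y + 1 + refIdxH) * refStride + (x + 1 + refIdxW)] =
                      recSamples[(yC + y) * picStride + (xC + x)];
      }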
  • the reference sample derivation unit 310460 derives refUnit from the left area of the target block, for example, the above RL.
  • the reference sample derivation unit 310460 derives refUnit from the area above the target block, for example, the above RT.
  • the reference sample derivation unit 310460 may perform the following process.
  • the reference sample derivation unit 310460 derives refUnit from the left area of the target block, for example, the above RL.
  • the reference sample derivation unit 310460 derives refUnit from the area above the target block, for example, the above RT.
  • Figure 11(b) shows another example of the reference range in the gradient derivation of DIMD prediction.
  • in DIMD_MODE_TOP, an extended region including the adjacent region to the top and the upper right is used.
  • the reference sample derivation unit 310460 may perform the following processing:
  • the reference sample derivation unit 310460 derives a sample array refUnit from the left and top areas of the target block as follows. First, the following process is carried out in the pixel range RL in the left region of the target block.
  • refUnit[x][y] = recSamples[xC+x][yC+y]
  • where y = -1-refIdxH..refH-1
  • refW = bW
  • refH = bH
  • Next, the following process is carried out in the pixel range RT in the region above the target block.
  • refUnit[x][y] = recSamples[xC+x][yC+y]
  • where y = -1-refIdxH..-1
  • refW = bW
  • refH = bH
  • RTL is the area (range of positions) that combines RL and RT.
  • the reference sample derivation unit 310460 derives refUnit from the left and bottom left areas of the target block, for example, RL_EXT.
  • the reference sample derivation unit 310460 may perform the following process.
  • the reference sample derivation unit 310460 derives refUnit from the left and bottom left areas of the target block, for example, RL_ADAP.
  • where y = -1-refIdxH..refH-1
  • the reference sample derivation unit 310460 derives refUnit from the upper and upper right areas of the target block, for example, RT_ADAP.
  • where y = -1-refIdxH..-1
  • the reference sample derivation unit 310460 may replace the value of the area that cannot be referenced because it is outside the target picture, outside the target subpicture, or outside the target slice boundary according to refUnit[x][y] with the pixel value derived above or a predetermined fixed value, for example, 1 ⁇ (bitDepth-1).
  • FIG. 12(a) shows another example of the reference range in gradient derivation for DIMD prediction.
  • the reference sample derivation unit 310460 sets the reference line numbers refIdxW and refIdxH in accordance with dimd_mode.
  • the reference sample derivation unit 310460 derives a sample array refUnit from the left and upper areas of the target block as follows.
  • the following process is performed in the pixel range RL of the area to the left of the target block.
  • refUnit[x][y] = recSamples[xC+x][yC+y]
  • where y = -1-refIdxH..refH-1
  • refW = bW
  • Next, the following process is performed in the pixel range RT of the area above the target block.
  • refUnit[x][y] = recSamples[xC+x][yC+y]
  • where y = -1-refIdxH..-1
  • refW = bW
  • the reference sample derivation unit 310460 sets the numbers of reference lines refIdxW and refIdxH in accordance with the block size.
  • refIdxW = (bW > 8 && bH > 8) ? N-1 : M-1
  • refIdxH = (bW > 8 && bH > 8) ? N-1 : M-1
  • the reference sample derivation unit 310460 derives refUnit[x][y] from recSamples[xC+x][yC+y] of RL in the left region of the target block and RT in the upper region.
  • Fig. 12(b) shows another example of the reference range in gradient derivation for DIMD prediction.
  • in this example, the direction of the reference region is selected according to dimd_mode, and at the same time, the number of lines of the reference region is also selected.
  • the reference sample derivation unit 310460 derives refUnit[x][y] from recSamples[xC+x][yC+y] of the left region RL and the top region RT.
  • the reference sample derivation unit 310460 derives refUnit[x][y] from the left area of the target block, for example, recSamples[xC+x][yC+y] of RL.
  • the reference sample derivation unit 310460 derives refUnit[x][y] from the area above the target block, for example, recSamples[xC+x][yC+y] of RT.
  • the gradient derivation unit 310461 derives an angle (angle information) indicating a texture direction based on pixel values of a gradient derivation target image.
  • the angle information may be a value representing an angle with 1/36 precision, or may be another value.
  • the gradient derivation unit 310461 derives gradients in two or more specific directions (e.g., Dx, Dy), and derives the gradient direction (angle information) from the relationship between the gradients Dx and Dy.
  • a spatial filter may be used to derive the gradient.
  • a 3x3 pixel Sobel filter corresponding to the horizontal and vertical directions as shown in Figures 13(a) and (b) may be used as the spatial filter.
  • the gradient derivation unit 310461 derives the gradient for point P[x][y] (hereinafter simply P) within the sample array refUnit[x][y] referenced and derived by the reference sample derivation unit 310460 in the gradient derivation target image. Note that it is also possible to configure the system to refer to recSamples[xC+x][yC+y] as point P instead of refUnit[x][y] without copying from recSamples to the sample array refUnit[x][y].
  • FIG. 14 shows an example of the positions of pixels to be subjected to gradient derivation in a target block of 8x8 pixels.
  • a shaded image in an adjacent region of the target block may be the image to be subjected to gradient derivation.
  • the image to be subjected to gradient derivation may also be a luminance image corresponding to the chrominance image of the target block.
  • the number of pixels to be subjected to gradient derivation, the position pattern, and the reference range of the spatial filter may be changed depending on information such as the size of the target block and the intra prediction mode of the blocks included in the adjacent region.
  • Dx and Dy are derived using the following equations.
  • Dx = -P[x-1][y-1] - 2*P[x-1][y] - P[x-1][y+1] + P[x+1][y-1] + 2*P[x+1][y] + P[x+1][y+1]
  • Dy = P[x-1][y-1] + 2*P[x][y-1] + P[x+1][y-1] - P[x-1][y+1] - 2*P[x][y+1] - P[x+1][y+1]
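  • The same two sums in C (s holds the samples around point P, laid out with row stride "stride"):

      /* 3x3 Sobel gradients at point (x, y), exactly the Dx/Dy sums above. */
      static void sobel_gradient(const int *s, int stride, int x, int y,
                                 int *Dx, int *Dy)
      {
      #define P(i, j) s[(y + (j)) * stride + (x + (i))]
          *Dx = -P(-1,-1) - 2*P(-1,0) - P(-1,1) + P(1,-1) + 2*P(1,0) + P(1,1);
          *Dy =  P(-1,-1) + 2*P(0,-1) + P(1,-1) - P(-1,1) - 2*P(0,1) - P(1,1);
      #undef P
      }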
  • the method of deriving the gradient is not limited to this, and other methods (filters, formulas, tables, etc.) may be used.
  • a Prewitt filter or a Scharr filter may be used instead of the Sobel filter, and the filter size may be 2x2 or 5x5.
  • the gradient derivation unit 310461 derives Dx and Dy using a Prewitt filter as follows.
  • Dx = P[x-1][y-1] + P[x-1][y] + P[x-1][y+1] - P[x+1][y-1] - P[x+1][y] - P[x+1][y+1]
  • Dy = -P[x-1][y-1] - P[x][y-1] - P[x+1][y-1] + P[x-1][y+1] + P[x][y+1] + P[x+1][y+1]
  • the following equation is an example of deriving Dx and Dy using a Scharr filter.
  • Dx = 3*P[x-1][y-1] + 10*P[x-1][y] + 3*P[x-1][y+1] - 3*P[x+1][y-1] - 10*P[x+1][y] - 3*P[x+1][y+1]
  • Dy = -3*P[x-1][y-1] - 10*P[x][y-1] - 3*P[x+1][y-1] + 3*P[x-1][y+1] + 10*P[x][y+1] + 3*P[x+1][y+1]
  • the gradient derivation method may be changed for each block. For example, a Sobel filter is used for a target block of 4x4 pixels, and a Scharr filter is used for blocks larger than 4x4. In this way, by using a filter with simpler calculations for small blocks, the increase in the amount of calculations for small blocks can be suppressed.
  • the gradient derivation method may be changed for each position of the pixel for which the gradient is to be derived.
  • a Sobel filter is used for the pixel for which the gradient is to be derived that is in the upper or left adjacent region
  • a Scharr filter is used for the pixel for which the gradient is to be derived that is in the upper left adjacent region.
  • the gradient derivation unit 310461 derives angle information consisting of the quadrant (hereinafter referred to as region) of the texture angle of the target block and the angle within the quadrant based on the signs and magnitude relationship of Dx and Dy. Being able to express it by region makes it possible to standardize the processing of directions that are rotationally symmetric or line symmetric.
  • the angle information is not limited to the region and the angle within the quadrant.
  • the angle information may be information only about the angle, and the region may be derived as necessary.
  • the intra direction prediction modes derived below are limited to directions from the bottom left to the top right (2 to 66 in Figure 3), and intra direction prediction modes for directions that are rotationally symmetric by 180 degrees are treated the same.
  • Fig. 15(a) is a table showing the relationship between the signs (signx, signy) of Dx and Dy, the magnitude relationship (xgty), and the region (each of Ra to Rd is a constant representing the region).
  • Fig. 15(b) shows the quadrants indicated by the regions Ra to Rd.
  • the area indicates a rough angle, and can be derived only from the signs signx, signy of Dx, Dy and the magnitude relationship xgty.
  • the gradient derivation unit 310461 derives a region from the signs signx, signy and the magnitude relationship xgty using calculations and table references.
  • the gradient derivation unit 310461 may derive the corresponding region by referencing the table in FIG. 15(a).
  • the gradient derivation unit 310461 may derive the region using a logical formula as follows.
  • region = xgty ? ( (signx ^ signy) ? 1 : 0 ) : ( (signx ^ signy) ? 2 : 3 )
  • Here, ^ indicates XOR (exclusive OR).
  • the region is expressed as a value from 0 to 3.
  • {Ra, Rb, Rc, Rd} = {0, 1, 2, 3}. Note that the way in which the region value is assigned is not limited to the above.
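  • As code, the region derivation is simply:

      /* Quadrant ("region") from the gradient sign flags signx, signy and
       * the magnitude flag xgty (|Dx| > |Dy|); ^ is XOR, as above. */
      static int derive_region(int signx, int signy, int xgty)
      {
          return xgty ? ((signx ^ signy) ? 1 : 0)
                      : ((signx ^ signy) ? 2 : 3);
      }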
  • the angle mode derivation unit 310462 derives an angle mode (a prediction mode corresponding to the gradient, for example, an intra prediction mode) based on the gradient information of each point P described above.
  • FIG. 16 is a block diagram showing one configuration of the angle mode derivation unit 310462.
  • the angle mode mode_delta may be derived as follows using a first gradient, a second gradient, and two tables.
  • the angle mode derivation unit 310462 consists of an angle coefficient derivation unit 310466 and a mode conversion unit 310467.
  • An integer expressing the ratio in increments of 1/R_UNIT is used as iRatio.
  • iRatio = int(R_UNIT*absy/absx) ≈ ratio*R_UNIT
  • the value norm_s1 is derived by shifting the first gradient (absx or absy) at a pixel by the logarithmic value x.
  • norm_s1 is used to reference the gradDivTable to derive the angle coefficient v.
  • idx is derived by the product of v and a second gradient (s0 or s1) different from the first gradient, and shifting the above logarithmic value x.
  • idx is used to reference a second table LUT (LUT') to derive the angle mode mode_delta.
  • idx = min((s0 * v) << 3 >> x, N_LUT-1)
  • Furthermore, it is also appropriate to clip the product s0*v to a value equal to or less than a predetermined value KK before shifting, so that the result does not exceed, for example, 32 bits:
  • idx = (min(s0*v, KK) << 3) >> x
  • the mode conversion unit 310467 derives and outputs the second angle mode modeVal using mode_delta.
  • modeVal = base_mode[region] + direction[region] * mode_delta
  • the angle mode derivation unit 310462 derives a histogram (frequency HistMode) of the angle mode values modeVal obtained for each point P.
  • the histogram may be obtained by incrementing the value of HistMode by 1 at each point P (hereinafter referred to as counting with a histogram).
  • the angle mode selection unit 310463 derives one or more representative values dimdModeVal (dimdModeVal0, dimdModeVal1, ...) of the angle mode using the values modeVal at multiple points P included in the gradient derivation target image.
  • the representative value of the angle mode in this embodiment is an estimated value of the directionality of the texture pattern of the target block.
  • the representative value dimdModeVal is derived from the most frequent value derived using the derived histogram.
  • the first mode dimdModeVal0 and the second mode dimdModeVal1 are derived by selecting the most frequent mode and the second most frequent mode in the frequency, respectively.
  • HistMode[x] is scanned over x, and the value of x that gives the maximum value of HistMode is set to dimdModeVal0, and the value of x that gives the second largest value is set to dimdModeVal1.
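  • A sketch of that scan (tie-breaking toward the lower mode index is an assumption the text does not fix):

      /* Return the most frequent mode in *dimdModeVal0 and the second most
       * frequent in *dimdModeVal1 from the histogram HistMode[0..nModes-1]. */
      static void select_top_two_modes(const int *HistMode, int nModes,
                                       int *dimdModeVal0, int *dimdModeVal1)
      {
          int best = 0, second = -1;
          for (int x = 1; x < nModes; x++) {
              if (HistMode[x] > HistMode[best]) {
                  second = best;
                  best = x;
              } else if (second < 0 || HistMode[x] > HistMode[second]) {
                  second = x;
              }
          }
          *dimdModeVal0 = best;
          *dimdModeVal1 = second;
      }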
  • the angle mode selection unit 310463 may set the average value of modeVal to dimdModeVal0 or dimdModeVal1.
  • the angle mode selection unit 310463 sets a predetermined mode (for example, intra prediction mode or transform mode) to dimdModeVal2 as the third mode.
  • Another mode may be set adaptively, or the third mode may not be used.
  • the angle mode selection unit 310463 may further derive weights corresponding to the representative values of each angle mode for intra prediction in the provisional predicted image generation unit 310464 described later.
  • the sum of the weights is set to 64.
  • the derivation of the weights is not limited to this, and the weights w0, w1, and w2 of the first, second, and third modes may be adaptively changed. For example, w2 may be increased or decreased according to the number of the first or second mode, or the frequency or ratio thereof.
  • the angle mode selection unit sets the corresponding weight value to 0 for any of the first to third modes when that mode is not used.
  • the angle mode selection unit 310463 selects an angle mode representative value from the multiple angle modes derived for the pixels in the gradient derivation target image, thereby enabling derivation of an angle mode with higher accuracy.
  • the angle mode selection unit 310463 selects the angle mode (representative value of the angle mode) estimated from the gradient, and outputs it together with the weight corresponding to each angle mode.
  • the region of the reference image used to derive the intra prediction mode from the reference image is changed according to dimd_mode.
  • the positions of points P of the gradient derivation unit 310461, the angle mode derivation unit 310462, and the angle mode selection unit 310463 are changed according to dimd_mode.
  • (Configuration example 1 of the reference area according to mode) FIG. 17(a) shows an example of a reference range in gradient derivation for DIMD prediction.
  • the angle mode derivation device 310465 derives Dx, Dy from each point P in the left region RDL of the target block, derives modeVal, and counts it in a histogram.
  • Dx, Dy are derived from each point P of the RDT in the range of pixels in the region above the target block, and modeVal is derived and counted in a histogram.
  • RDTL is the combined domain of RDL and RDT.
  • the gradient derivation unit 310461 and angle mode derivation unit 310462 (hereinafter referred to as the angle mode derivation device 310465) derive Dx and Dy from the left region of the target block, for example the above RDL, derive modeVal, and count it in a histogram.
  • the angle mode derivation device 310465 derives Dx and Dy from the area above the target block, for example, the RDT above, derives modeVal, and counts it in a histogram.
  • the angle mode derivation device 310465 may perform the following processing.
  • the angle mode derivation device 310465 derives Dx and Dy from the left area of the target block, for example, the above RDL, derives modeVal, and counts it in a histogram.
  • the angle mode derivation device 310465 derives Dx and Dy from the area above the target block, for example, the RDT mentioned above, derives modeVal, and counts it in a histogram.
  • Figure 17(b) shows another example of the reference range in the gradient derivation of DIMD prediction.
  • in DIMD_MODE_TOP, an extended region including the top and top-right adjacent regions is used.
  • in DIMD_MODE_TOP_LEFT, the left and top regions are used without extension.
  • the angle mode derivation device 310465 may perform the following processing:
  • the reference sample derivation unit 310460 derives Dx, Dy from the left and bottom left areas of the target block, derives modeVal, and counts it in a histogram.
  • Dx, Dy are derived at point P at position (x, y) in the left region RDL of the target block, and modeVal is derived and counted in a histogram.
  • Dx, Dy are derived at point P at position (x, y) in region RDT above the target block, and modeVal is derived and counted in a histogram.
  • RDTL is the combined domain of RDL and RDT.
  • the angle mode derivation device 310465 derives Dx, Dy from the left and bottom left areas of the target block, for example RDL_EXT, and derives modeVal and counts it in a histogram.
  • the angle mode derivation device 310465 derives Dx, Dy from the region above the target block, for example, the above RDT_EXT, and derives modeVal and counts it in a histogram.
  • the angle mode derivation device 310465 may perform the following processing.
  • the reference sample derivation unit 310460 derives Dx and Dy from the left and bottom left areas of the target block, for example, RDL_ADAP, and derives modeVal and counts it in a histogram.
  • the angle mode derivation device 310465 derives Dx and Dy from the upper and upper right regions of the target block, for example, RDT_ADAP, and derives modeVal and counts it in a histogram.
  • where x = 1..refW-2.
  • the above configuration makes it possible to switch between at least the top and left, and the left and top of the target block as the adjacent images depending on the DIMD mode. Therefore, even if the characteristics of the target block differ from the characteristics of the adjacent area to the left or above, the intra prediction mode can be derived on the decoder side with high accuracy and high efficiency.
  • when the decoder derives the intra prediction mode using the gradient of pixel values of an image adjacent to the target area, the angle gradient of the adjacent image and the angle gradient of the target block do not necessarily match. Even in such cases, the effect of improving accuracy is achieved by switching the derivation of the intra prediction mode depending on the properties of the adjacent blocks and the target block.
  • when both the top and left are used (DIMD_MODE_TOP_LEFT), the top-right and bottom-left extension regions are not used, whereas when only the left (DIMD_MODE_LEFT) or only the top (DIMD_MODE_TOP) is used, the left and bottom-left extension regions or the top and top-right extension regions are used, respectively; this has the effect of reducing the amount of processing required for sampling reference pixels, deriving gradients, and deriving histograms.
  • Fig. 12(a) shows another example of the reference range in gradient derivation for DIMD prediction.
  • the angle mode derivation device 310465 sets the reference line numbers refIdxW and refIdxH in accordance with dimd_mode.
  • the angle mode derivation device 310465 derives Dx and Dy from the left region of the target block, for example, RDL, derives modeVal, and counts it in a histogram.
  • the angle mode derivation device 310465 derives Dx and Dy from the area above the target block, for example, RDT, derives modeVal, and counts it in a histogram.
  • the angle mode derivation device 310465 sets the reference line numbers refIdxW and refIdxH in accordance with the block size.
  • refIdxW = refIdxH = (bW > 8 && bH > 8) ? N-1 : M-1
  • the angle mode derivation device 310465 derives Dx and Dy from the left region RDL and the top region RDT of the target block, derives modeVal, and counts it in a histogram.
  • (Configuration Example 6: changing the number of reference lines according to mode) Fig. 12(b) shows another example of the reference range in the gradient derivation of the DIMD prediction. In this example, the direction of the reference region is selected according to dimd_mode, and at the same time, the number of lines of the reference region is also selected.
  • refIdxW = refIdxH = M-1
  • the reference sample derivation unit 310460 sets the number of reference lines to M, derives Dx and Dy from the left region RDL and the top region RDT, derives modeVal, and counts it in a histogram.
  • the angle mode derivation device 310465 sets the number of reference lines to N, derives Dx and Dy from the area to the left of the target block, for example, RDL, derives modeVal, and counts it in a histogram.
  • the angle mode derivation device 310465 sets the number of reference lines to N, derives Dx and Dy from the area above the target block, for example, RDT, derives modeVal, and counts it in a histogram.
  • the number of lines to be referenced is switched between when both the top and left are used as the reference area, and when only the left or only the top is used. This further makes it possible to derive an intra prediction mode according to the difference in the continuity of the characteristics of the target block and the adjacent blocks, thereby improving prediction accuracy.
  • the number of reference lines when using both the top and left is set to M
  • the number of reference lines when using only the left or only the top is set to N (where M < N).
  • This configuration has the effect of reducing the amount of processing required for sampling reference pixels, deriving gradients, and deriving histograms by referring to both the top and left.
  • the reference area to the left of the target block may be the left and bottom left reference areas, and the reference area above the target block may be the top and top right reference areas, or the top left of the target block may be referenced.
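As a sketch of how the reference direction and line count could be switched together, as in Configuration Example 6: the enum values mirror the three modes used above, while the concrete M and N are placeholder parameters.

    enum DimdMode { DIMD_MODE_TOP_LEFT = 0, DIMD_MODE_TOP = 1, DIMD_MODE_LEFT = 2 };

    struct RefConfig { bool useTop; bool useLeft; int numLines; };

    // M lines when both sides are referenced, N lines when only one side is
    // referenced (M < N), matching the line-count switching described above.
    RefConfig selectReference(DimdMode mode, int M, int N)
    {
        switch (mode) {
        case DIMD_MODE_TOP:  return { true,  false, N };
        case DIMD_MODE_LEFT: return { false, true,  N };
        default:             return { true,  true,  M };  // DIMD_MODE_TOP_LEFT
        }
    }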
  • the prediction image generation unit (provisional prediction image generation unit) 310464 generates a prediction image (provisional prediction image) using one or more input angle mode representative values (intra prediction modes). When there is one intra prediction mode, an intra prediction image is generated in that intra prediction mode and output as a provisional prediction image q[x][y]. When there are multiple intra prediction modes, a prediction image (pred0, pred1, pred2) is generated in each intra prediction mode. Multiple prediction images are synthesized using the corresponding weights (w0, w1, w2) and output as a prediction image q[x][y].
  • the prediction image q[x][y] is derived as follows.
  • q[x][y] = (w0 * pred0[x][y] + w1 * pred1[x][y] + w2 * pred2[x][y]) >> 6
  • the frequency of the second mode is 0 or it is not a directional prediction mode (such as DC mode)
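A minimal sketch of the weighted synthesis above follows. Since the result is right-shifted by 6, the weights are assumed to sum to 64; whether a rounding offset is added before the shift, and exactly how a mode with zero frequency or a non-directional mode is dropped from the blend, are not specified here, so a weight of 0 is used as an illustrative stand-in for an excluded mode.

    // Blend up to three intra prediction images into the provisional
    // prediction q; w0 + w1 + w2 == 64 is an assumption implied by the >> 6.
    void blendPredictions(const int* pred0, const int* pred1, const int* pred2,
                          int w0, int w1, int w2, int* q, int numSamples)
    {
        for (int i = 0; i < numSamples; i++)
            q[i] = (w0 * pred0[i] + w1 * pred1[i] + w2 * pred2[i]) >> 6;
    }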
  • the inverse quantization and inverse transform unit 311 inverse quantizes the quantized transform coefficients input from the prediction parameter derivation unit 320 to obtain transform coefficients.
  • the quantized transform coefficients are coefficients obtained by performing frequency transform such as DCT (Discrete Cosine Transform) or DST (Discrete Sine Transform) on the prediction error in the encoding process and quantizing the transform coefficients.
  • the inverse quantization and inverse transform unit 311 performs inverse frequency transform such as inverse DCT or inverse DST on the transform coefficients to calculate the prediction error.
  • the inverse quantization and inverse transform unit 311 outputs the prediction error to the adder unit 312.
  • FIG. 18 is a block diagram showing the configuration of the inverse quantization and inverse transform unit 311 of this embodiment.
  • the inverse quantization and inverse transform unit 311 is composed of a scaling unit 31111, an inverse non-separable transform unit 31121, and an inverse separable transform unit 31123. Note that the transform coefficients decoded from the encoded data may be transformed using the angle mode derived by the angle mode derivation device 310465.
  • the inverse quantization and inverse transform unit 311 obtains the transform coefficients d[][] by scaling (inverse quantization) the quantized transform coefficients qd[][] input from the prediction parameter derivation unit 320 using the scaling unit 31111.
  • the quantized transform coefficients qd[][] are coefficients obtained by performing a transform such as DCT (Discrete Cosine Transform) or DST (Discrete Sine Transform) on the prediction error in the encoding process and quantizing it, or coefficients obtained by further performing a non-separable transform on the transformed coefficients.
  • when a non-separable transform is used, the inverse non-separable transform unit 31121 performs an inverse non-separable transform, and an inverse frequency transform such as inverse DCT or inverse DST is then performed on the resulting transform coefficients to calculate the prediction error.
  • when a non-separable transform is not used, the inverse non-separable transform unit 31121 does not perform processing, and the inverse separable transform unit 31123 performs an inverse frequency transform such as inverse DCT or inverse DST on the scaled transform coefficients to calculate the prediction error.
  • the inverse quantization and inverse transform unit 311 outputs the prediction error to the adder 312.
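The stage order of Fig. 18 can be pictured as in the sketch below; scale(), invNonSeparable() and invSeparable() are placeholders for the codec-specific kernels, and the enable flag for the non-separable stage is an assumption.

    #include <vector>

    void scale(const int* qd, int* d, int n);           // inverse quantization (placeholder)
    void invNonSeparable(const int* d, int* t, int n);  // inverse non-separable transform (placeholder)
    void invSeparable(const int* t, int* resi, int n);  // inverse DCT/DST (placeholder)

    // Pipeline of the inverse quantization and inverse transform unit 311:
    // scaling, optional inverse non-separable transform, then the inverse
    // separable transform producing the prediction error resi.
    void invQuantAndTransform(const int* qd, int* resi, int n, bool nonSepUsed)
    {
        std::vector<int> d(n), t(n);
        scale(qd, d.data(), n);
        if (nonSepUsed)
            invNonSeparable(d.data(), t.data(), n);
        else
            t = d;
        invSeparable(t.data(), resi, n);
    }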
  • the adder 312 adds, for each pixel, the predicted image of the block input from the predicted image generation unit 308 and the prediction error input from the inverse quantization and inverse transform unit 311 to generate a decoded image of the block.
  • the adder 312 stores the decoded image of the block in the reference picture memory 306, and also outputs it to the loop filter 305.
  • FIG. 19 is a block diagram showing the configuration of the video encoding device 11 according to this embodiment.
  • the video encoding device 11 includes a prediction image generating unit 101, a subtraction unit 102, a transformation/quantization unit 103, an inverse quantization/inverse transformation unit 105, an addition unit 106, a loop filter 107, a prediction parameter memory (prediction parameter storage unit, frame memory) 108, a reference picture memory (reference image storage unit, frame memory) 109, an encoding parameter determining unit 110, a parameter encoding unit 111, an entropy encoding unit 104, and a prediction parameter derivation unit 120.
  • the predicted image generating unit 101 generates a predicted image for each CU, which is an area obtained by dividing each picture of the image T.
  • the predicted image generating unit 101 operates in the same way as the predicted image generating unit 308 already explained, and so a description thereof will be omitted.
  • the subtraction unit 102 subtracts the pixel values of the predicted image of the block input from the predicted image generation unit 101 from the pixel values of image T to generate a prediction error.
  • the subtraction unit 102 outputs the prediction error to the transformation and quantization unit 103.
  • the transform and quantization unit 103 calculates transform coefficients by frequency-transforming the prediction error input from the subtraction unit 102, and derives quantized transform coefficients by quantizing them.
  • the transform/quantization unit 103 outputs the quantized transform coefficients to the entropy coding unit 104, the inverse quantization/inverse transform unit 105, and the coding parameter determination unit 110.
  • the inverse quantization and inverse transform unit 105 is the same as the inverse quantization and inverse transform unit 311 (FIG. 4) in the video decoding device 31, and a description thereof will be omitted.
  • the calculated prediction error is output to the addition unit 106.
  • the entropy coding unit 104 receives prediction parameters and quantized transform coefficients from the parameter coding unit 111.
  • the entropy coding unit 104 entropy codes the split information, prediction parameters, quantized transform coefficients, etc. to generate and output an encoded stream Te.
  • the parameter coding unit 111 instructs the entropy coding unit 104 to code the prediction parameters, quantization coefficients, etc. derived by the prediction parameter derivation unit 120.
  • the prediction parameter derivation unit 120 derives syntax elements from the parameters input from the encoding parameter determination unit 110.
  • the prediction parameter derivation unit 120 includes a configuration that is partially the same as the configuration of the prediction parameter derivation unit 320.
  • the adder 106 generates a decoded image by adding, for each pixel, the pixel values of the predicted image of the block input from the predicted image generation unit 101 and the prediction error input from the inverse quantization and inverse transform unit 105.
  • the adder 106 stores the generated decoded image in the reference picture memory 109.
  • the loop filter 107 applies a deblocking filter, SAO, and ALF to the decoded image generated by the adder 106.
  • the loop filter 107 does not necessarily have to include the above three types of filters, and may be configured, for example, as only a deblocking filter.
  • the prediction parameter memory 108 stores the prediction parameters input from the prediction parameter derivation unit 120 in a predetermined location for each target picture and CU.
  • the reference picture memory 109 stores the decoded image generated by the loop filter 107 in a predetermined location for each target picture and CU.
  • the coding parameter determination unit 110 selects one set from among multiple sets of coding parameters.
  • the coding parameters are the above-mentioned QT, BT or TT division information, prediction parameters, or parameters to be coded that are generated in relation to these.
  • the predicted image generation unit 101 generates a predicted image using these coding parameters.
  • the coding parameter determination unit 110 calculates an RD cost value indicating the amount of information and the coding error for each of the multiple sets.
  • the coding parameter determination unit 110 selects the set of coding parameters that minimizes the calculated cost value.
  • the entropy coding unit 104 outputs the selected set of coding parameters as the coding stream Te.
  • the coding parameter determination unit 110 stores the determined coding parameters in the prediction parameter memory 108.
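The selection can be pictured as the usual Lagrangian minimization. The cost form D + lambda * R and the placeholder distortion()/rate() helpers below are assumptions, since the text only says that the RD cost reflects the amount of information and the coding error.

    #include <cstddef>
    #include <vector>

    struct CandSet { /* one set of coding parameters (split information, prediction parameters, ...) */ };
    double distortion(const CandSet& c);  // coding error, e.g. SSD (placeholder)
    double rate(const CandSet& c);        // amount of information in bits (placeholder)

    // Return the index of the candidate set minimizing D + lambda * R.
    std::size_t selectCodingParameters(const std::vector<CandSet>& cands, double lambda)
    {
        if (cands.empty()) return 0;
        std::size_t best = 0;
        double bestCost = distortion(cands[0]) + lambda * rate(cands[0]);
        for (std::size_t i = 1; i < cands.size(); i++) {
            double cost = distortion(cands[i]) + lambda * rate(cands[i]);
            if (cost < bestCost) { bestCost = cost; best = i; }
        }
        return best;
    }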
  • part of the video encoding device 11 and the video decoding device 31 in the above-described embodiments, for example, the entropy decoding unit 301, the parameter decoding unit 302, the loop filter 305, the predicted image generation unit 308, the inverse quantization and inverse transform unit 311, the addition unit 312, the predicted image generation unit 101, the subtraction unit 102, the transform and quantization unit 103, the entropy encoding unit 104, the inverse quantization and inverse transform unit 105, the loop filter 107, the encoding parameter determination unit 110, and the parameter encoding unit 111, may be realized by a computer.
  • a program for realizing this control function may be recorded on a computer-readable recording medium, and the program recorded on the recording medium may be read into and executed by a computer system.
  • the "computer system” referred to here is a computer system built into either the video encoding device 11 or the video decoding device 31, and includes hardware such as an OS and peripheral devices.
  • “computer-readable recording media” refers to portable media such as flexible disks, optical magnetic disks, ROMs, and CD-ROMs, as well as storage devices such as hard disks built into computer systems.
  • “computer-readable recording media” may also include devices that dynamically store a program for a short period of time, such as a communication line when transmitting a program via a network such as the Internet or a communication line such as a telephone line, or devices that store a program for a certain period of time, such as volatile memory within a computer system that serves as a server or client in such cases.
  • the above-mentioned program may be one that realizes part of the functions described above, or may be one that can realize the functions described above in combination with a program already recorded in the computer system.
  • part or all of the video encoding device 11 and video decoding device 31 in the above-mentioned embodiments may be realized as an integrated circuit such as an LSI (Large Scale Integration).
  • Each functional block of the video encoding device 11 and video decoding device 31 may be individually made into a processor, or part or all of them may be integrated into a processor.
  • the integrated circuit method is not limited to LSI, and may be realized by a dedicated circuit or a general-purpose processor. Furthermore, if an integrated circuit technology that can replace LSI appears due to advances in semiconductor technology, an integrated circuit based on that technology may be used.
  • An image decoding device includes a reference sample derivation unit that selects a neighboring image of a current block in accordance with a DIMD mode, a gradient derivation unit that derives a pixel-by-pixel gradient using the selected neighboring image, and an angle mode selection unit that derives an intra-prediction mode from the gradient.
  • the image decoding device according to aspect 1 above is characterized in that it includes an entropy decoding unit that decodes the DIMD flag of the target block from the encoded data and, when the DIMD flag is true, further decodes the DIMD mode, and a predicted image generation unit that generates a predicted image using the derived intra prediction mode.
  • the image decoding device according to aspect 1 or 2 above is characterized in that the DIMD mode switches, as the adjacent images, at least between both the top and left, only the left, and only the top.
  • the image decoding device according to any one of aspects 1 to 3 above is characterized in that the DIMD mode is composed of a first bit and a second bit, the adjacent image being selected with the first bit indicating whether both the top and left are used and the second bit selecting between the left and the top.
  • the image decoding device is characterized in that in any one of aspects 1 to 4, the entropy decoding unit decodes the DIMD mode using a context that holds a probability for decoding the first bit, and using an equal probability without using a context for decoding the second bit.
  • the image decoding device is characterized in that in any one of aspects 1 to 5, the entropy decoding unit decodes the DIMD mode using a context that holds probabilities for decoding the first bit and the second bit.
  • the image decoding device is any one of aspects 1 to 6 above, characterized in that the entropy decoding unit derives the context index using the width and height of the target block.
  • the image decoding device according to any one of aspects 1 to 7 above is characterized in that the entropy decoding unit derives the context index for the second bit using a determination of whether or not the target block is square.
  • the image decoding device is characterized in that in any one of aspects 1 to 8 above, the gradient derivation unit changes the number of lines to be referenced depending on dimd_mode.
  • the image decoding device is characterized in that in any one of aspects 1 to 9 above, the gradient derivation unit changes the number of lines to be referenced depending on the size of the target block.
  • An image decoding device is characterized in that in any one of aspects 1 to 10 above, the gradient derivation unit changes the number of lines to be referenced and the reference direction according to dimd_mode.
  • the image encoding device includes a reference sample derivation unit that selects an adjacent image of a target block according to a DIMD mode, a gradient derivation unit that uses the selected adjacent image to derive a pixel-by-pixel gradient, and an angle mode selection unit that derives an intra prediction mode from the gradient.
  • Embodiments of the present invention can be suitably applied to a video decoding device that decodes coded data in which image data has been coded, and a video coding device that generates coded data in which image data has been coded.
  • the present invention can also be suitably applied to the data structure of coded data that is generated by a video coding device and referenced by the video decoding device.
  • 31 Image decoding device
  • 301 Entropy decoding unit
  • 302 Parameter decoding unit
  • 308 Predicted image generation unit
  • 31046 DIMD prediction unit
  • 310460 Reference sample derivation unit
  • 310465 Angle mode derivation device
  • 310461 Gradient derivation unit
  • 310462 Angle mode derivation unit
  • 310463 Angle mode selection unit
  • 310464 Provisional predicted image generation unit
  • 311 Inverse quantization and inverse transform unit
  • 312 Addition unit
  • 11 Image encoding device
  • 101 Predicted image generation unit
  • 102 Subtraction unit
  • 103 Transform and quantization unit
  • 104 Entropy encoding unit
  • 105 Inverse quantization and inverse transform unit
  • 107 Loop filter
  • 110 Encoding parameter determination unit
  • 111 Parameter encoding unit

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The present invention addresses the problem that, when an intra-prediction mode is derived on the decoder side using the gradient of a pixel value of an image adjacent to a region of interest, the angular gradient of the adjacent image and the angular gradient of a block of interest do not necessarily match each other. This image decoding device comprises: a reference sample derivation unit that selects an image adjacent to a block of interest in accordance with a DIMD mode; a gradient derivation unit that derives the gradient of a pixel unit using the selected adjacent image; and an angle mode selection unit that derives an intra-prediction mode from the gradient.

Description

Image decoding device and image encoding device
 Embodiments of the present invention relate to an image decoding device and an image encoding device.
 In order to efficiently transmit or record video, a video encoding device that generates encoded data by encoding video, and a video decoding device that generates decoded images by decoding that encoded data, are used.
 Specific examples of video coding methods include those proposed in H.264/AVC and HEVC (High-Efficiency Video Coding).
 In such video coding methods, the images (pictures) that make up a video are managed in a hierarchical structure consisting of slices obtained by dividing an image, coding tree units (CTUs) obtained by dividing a slice, coding units (also called CUs) obtained by dividing a coding tree unit, and transform units (TUs) obtained by dividing a coding unit, and are encoded/decoded per CU.
 In such video coding methods, a predicted image is usually generated based on a locally decoded image obtained by encoding/decoding the input image, and the prediction error (sometimes called a "difference image" or "residual image") obtained by subtracting the predicted image from the input image (original image) is encoded. Methods for generating predicted images include inter-picture prediction (inter prediction) and intra-picture prediction (intra prediction).
 Non-Patent Document 1 is an example of recent video encoding and decoding technology. It discloses Decoder-side Intra Mode Derivation (DIMD) prediction, in which the decoder derives a predicted image by deriving an intra directional prediction mode number using pixels in adjacent regions.
 In Non-Patent Document 1, the intra mode is derived on the decoder side using the gradient of pixel values of the image adjacent to the target area, but there is an issue in that the angle gradient of the adjacent image and the angle gradient of the target block do not necessarily match.
 The present invention aims to improve accuracy in decoder-side intra mode derivation by switching the derivation of the intra prediction mode depending on the properties of the adjacent blocks and the target block.
 The image decoding device includes a reference sample derivation unit that selects an adjacent image of the target block according to a DIMD mode, a gradient derivation unit that derives a pixel-level gradient using the selected adjacent image, and an angle mode selection unit that derives an intra prediction mode from the gradient.
 According to one aspect of the present invention, suitable intra prediction can be performed without increasing the amount of calculation for decoder-side intra mode derivation.
 FIG. 1 is a schematic diagram showing the configuration of an image transmission system according to this embodiment. FIG. 2 is a diagram showing the hierarchical structure of data in an encoded stream. FIG. 3 is a schematic diagram showing the types (mode numbers) of intra prediction modes. FIG. 4 is a schematic diagram showing the configuration of a video decoding device. FIG. 5 is an example of DIMD syntax. FIG. 6 is a diagram explaining the binarization of the syntax dimd_mode used in the DIMD prediction unit 31046. FIG. 7 is another example of DIMD syntax. FIG. 8 is a diagram showing the context settings in decoding the syntax elements of dimd_mode. FIG. 9 is a diagram showing the configuration of a predicted image generation unit. FIG. 10 is a diagram showing details of the DIMD prediction unit. FIG. 11 is a diagram showing an example of a reference region referred to by the DIMD prediction unit 31046. FIG. 12 is a diagram showing a configuration for changing the number of lines of the reference region for DIMD prediction according to dimd_mode. FIG. 13 is an example of a spatial filter. FIG. 14 is a diagram showing an example of pixels subject to gradient derivation. FIG. 15 is a diagram showing the relationship between gradients and regions. FIG. 16 is a block diagram showing the configuration of the angle mode derivation unit. FIG. 17 is a diagram showing an example of the reference range in gradient derivation by the DIMD prediction unit 31046. FIG. 18 is a functional block diagram showing an example of the configuration of the inverse quantization and inverse transform unit. FIG. 19 is a block diagram showing the configuration of a video encoding device.
(First Embodiment)
 Hereinafter, an embodiment of the present invention will be described with reference to the drawings.
 FIG. 1 is a schematic diagram showing the configuration of an image transmission system 1 according to this embodiment.
 The image transmission system 1 is a system that transmits an encoded stream obtained by encoding an image to be encoded, and decodes the transmitted encoded stream to display an image. The image transmission system 1 includes a video encoding device (image encoding device) 11, a network 21, a video decoding device (image decoding device) 31, and a video display device (image display device) 41.
 An image T is input to the video encoding device 11.
 The network 21 transmits the encoded stream Te generated by the video encoding device 11 to the video decoding device 31. The network 21 is the Internet, a wide area network (WAN), a local area network (LAN), or a combination of these. The network 21 is not necessarily limited to a bidirectional communication network, and may be a unidirectional communication network that transmits broadcast waves such as terrestrial digital broadcasting or satellite broadcasting. The network 21 may also be replaced by a storage medium on which the encoded stream Te is recorded, such as a DVD (Digital Versatile Disc: registered trademark) or a BD (Blu-ray Disc: registered trademark).
 The video decoding device 31 decodes each encoded stream Te transmitted by the network 21 and generates one or more decoded images Td.
 The video display device 41 displays all or part of the one or more decoded images Td generated by the video decoding device 31. The video display device 41 includes a display device such as a liquid crystal display or an organic EL (electro-luminescence) display. Display forms include stationary, mobile, and HMD. Furthermore, when the video decoding device 31 has high processing capability, it displays high-quality images, and when it has only lower processing capability, it displays images that do not require high processing or display capability.
<Operators>
 The operators used in this specification are described below.
 >> is a right bit shift, << is a left bit shift, & is bitwise AND, | is bitwise OR, ^ is bitwise XOR, |= is the OR assignment operator, ! is logical negation (NOT), && is logical AND, and || is logical OR.
 x ? y : z is a ternary operator that takes the value y when x is true (non-zero) and z when x is false (0).
 Clip3(a,b,c) is a function that clips c to a value between a and b inclusive; it returns a when c < a, returns b when c > b, and returns c otherwise (where a <= b).
 Clip1Y(c) is Clip3(a,b,c) with a = 0 and b = (1 << BitDepthY) - 1, where BitDepthY is the bit depth of luminance.
 abs(a) is a function that returns the absolute value of a.
 Int(a) is a function that returns the integer value of a.
 Floor(a) is a function that returns the largest integer less than or equal to a.
 Log2(a) is a function that returns the base-2 logarithm of a.
 Ceil(a) is a function that returns the smallest integer greater than or equal to a.
 a/d represents division of a by d (rounded down).
 Min(a,b) is a function that returns the smaller of a and b.
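For reference, the clipping operators transcribe directly into C++; a minimal sketch follows, in which BitDepthY is passed explicitly as a parameter, whereas the text treats it as a global variable.

    // Clip c into [a, b] (assumes a <= b), then the luminance-range variant.
    int Clip3(int a, int b, int c) { return c < a ? a : (c > b ? b : c); }
    int Clip1Y(int c, int bitDepthY) { return Clip3(0, (1 << bitDepthY) - 1, c); }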
<Structure of the encoded stream Te>
 Before describing in detail the video encoding device 11 and the video decoding device 31 according to this embodiment, the data structure of the encoded stream Te generated by the video encoding device 11 and decoded by the video decoding device 31 will be described.
 FIG. 2 is a diagram showing the hierarchical structure of data in the encoded stream Te. The encoded stream Te illustratively includes a sequence and a plurality of pictures constituting the sequence. FIG. 2 shows a coded video sequence defining a sequence SEQ, a coded picture defining a picture PICT, a coded slice defining a slice S, coded slice data defining slice data, coding tree units included in the coded slice data, and coding units included in a coding tree unit.
(Coded video sequence)
 The coded video sequence defines a set of data that the video decoding device 31 refers to in order to decode the sequence SEQ to be processed. As shown in the coded video sequence of FIG. 2, the sequence SEQ includes a video parameter set (VPS), a sequence parameter set (SPS), a picture parameter set (PPS), pictures PICT, and supplemental enhancement information (SEI).
 The video parameter set VPS specifies, for a video composed of multiple layers, a set of coding parameters common to the multiple videos, as well as sets of coding parameters related to the multiple layers and to the individual layers included in the video.
 The sequence parameter set SPS specifies a set of coding parameters that the video decoding device 31 refers to in order to decode the target sequence. For example, the width and height of a picture are specified. Note that multiple SPSs may exist, in which case one of them is selected from the PPS.
 The picture parameter set PPS specifies a set of coding parameters that the video decoding device 31 refers to in order to decode each picture in the target sequence. For example, it includes the reference value of the quantization width used for decoding a picture (pic_init_qp_minus26) and a flag (weighted_pred_flag) indicating the application of weighted prediction. Note that multiple PPSs may exist, in which case one of them is selected for each picture in the target sequence.
(Coded picture)
 A coded picture defines a set of data that the video decoding device 31 refers to in order to decode the picture PICT to be processed. As shown in the coded picture of FIG. 2, the picture PICT includes slice 0 to slice NS-1 (NS is the total number of slices included in the picture PICT).
 In the following, when there is no need to distinguish slice 0 to slice NS-1 from one another, the subscripts may be omitted. The same applies to other subscripted data included in the encoded stream Te described below.
(Coded slice)
 A coded slice defines a set of data that the video decoding device 31 refers to in order to decode the slice S to be processed. As shown in the coded slice of FIG. 2, a slice includes a slice header and slice data.
 The slice header includes a group of coding parameters that the video decoding device 31 refers to in order to determine the decoding method for the target slice. Slice type designation information (slice_type) specifying the slice type is one example of a coding parameter included in the slice header.
 Slice types that can be designated by the slice type designation information include (1) I slices, which use only intra prediction for encoding, (2) P slices, which use unidirectional prediction or intra prediction for encoding, and (3) B slices, which use unidirectional prediction, bidirectional prediction, or intra prediction for encoding. Note that inter prediction is not limited to uni-prediction and bi-prediction, and a predicted image may be generated using more reference pictures. Hereinafter, P and B slices refer to slices including blocks for which inter prediction can be used.
 Note that the slice header may include a reference to the picture parameter set PPS (pic_parameter_set_id).
(Coded slice data)
 The coded slice data defines a set of data that the video decoding device 31 refers to in order to decode the slice data to be processed. As shown in the coded slice header of FIG. 2, the slice data includes CTUs. A CTU is a block of fixed size (for example, 64x64) constituting a slice, and is also called a largest coding unit (LCU).
(Coding tree unit)
 The coding tree unit of FIG. 2 defines a set of data that the video decoding device 31 refers to in order to decode the CTU to be processed. The CTU is divided into coding units (CUs), the basic units of the coding process, by recursive quad tree (QT) partitioning, binary tree (BT) partitioning, or ternary tree (TT) partitioning. BT partitioning and TT partitioning are collectively called multi tree (MT) partitioning. The nodes of the tree structure obtained by recursive quad tree partitioning are called coding nodes. The intermediate nodes of a quad tree, binary tree, and ternary tree are coding nodes, and the CTU itself is defined as the topmost coding node.
(Coding unit)
 As shown in the coding unit of FIG. 2, a set of data that the video decoding device 31 refers to in order to decode the coding unit to be processed is defined. Specifically, a CU is composed of a CU header CUH, prediction parameters, transform parameters, quantized transform coefficients, and the like. The CU header defines the prediction mode and the like.
 Prediction processing may be performed per CU or per sub-CU, obtained by further dividing a CU. When the CU and the sub-CU are of equal size, there is one sub-CU in the CU. When the CU is larger than the sub-CU size, the CU is divided into sub-CUs. For example, when the CU is 8x8 and the sub-CU is 4x4, the CU is divided into four sub-CUs consisting of two horizontal and two vertical divisions.
 There are two types of prediction (prediction modes): intra prediction and inter prediction. Intra prediction is prediction within the same picture, while inter prediction refers to prediction processing performed between mutually different pictures (for example, between display times or between layer images).
 Transform and quantization processing is performed per CU, but the quantized transform coefficients may be entropy coded per sub-block, such as 4x4.
(Prediction parameters)
 A predicted image is derived from the prediction parameters associated with a block. The prediction parameters include those for intra prediction and those for inter prediction.
 The prediction parameters for intra prediction are described below. The intra prediction parameters are composed of a luminance prediction mode IntraPredModeY and a chrominance prediction mode IntraPredModeC. FIG. 3 is a schematic diagram showing the types (mode numbers) of intra prediction modes. As shown in the figure, there are, for example, 67 types of intra prediction modes (0 to 66): planar prediction (0), DC prediction (1), and angular prediction (2 to 66). In addition, linear model (LM) prediction such as Cross Component Linear Model (CCLM) prediction or Multi Mode Linear Model (MMLM) prediction may be used. Furthermore, an LM mode may be added for chrominance.
(Configuration of the video decoding device)
 The configuration of the video decoding device 31 (FIG. 4) according to this embodiment will be described.
 The video decoding device 31 includes an entropy decoding unit 301, a parameter decoding unit (predicted image decoding device) 302, a loop filter 305, a reference picture memory 306, a prediction parameter memory 307, a predicted image generation unit (predicted image generation device) 308, an inverse quantization and inverse transform unit 311, an addition unit 312, and a prediction parameter derivation unit 320. Note that, to match the video encoding device 11 described below, there is also a configuration in which the video decoding device 31 does not include the loop filter 305.
 In the following, examples using CTUs and CUs as the units of processing are described, but the processing is not limited to these examples and may be performed in sub-CU units. Alternatively, CTU and CU may be read as block, and sub-CU as sub-block, with processing performed in units of blocks or sub-blocks.
 The entropy decoding unit 301 performs entropy decoding on the encoded stream Te input from the outside and parses the individual codes (syntax elements). Entropy coding includes a method of variable-length coding syntax elements using a context (probability model) adaptively selected according to the type of syntax element and the surrounding circumstances, and a method of variable-length coding syntax elements using a predetermined table or formula. In the former, CABAC (Context Adaptive Binary Arithmetic Coding), a probability model updated for each encoded or decoded picture (slice) is stored in memory. Then, as the initial state of the context of a P picture or B picture, a probability model of a picture that uses the same slice type and the same slice-level quantization parameter is set from among the probability models stored in memory. This initial state is used for the encoding and decoding processes. The parsed codes include prediction information for generating a predicted image, prediction errors for generating a difference image, and the like.
 The entropy decoding unit 301 may decode each bin of a syntax element using the variables ivlCurrRange, ivlOffset, valIdx, pStateIdx0, and pStateIdx1. ivlCurrRange and ivlOffset are variables that do not depend on the context. valIdx, pStateIdx0, and pStateIdx1 are per-context variables.
(Decoding a bin using a context)
 When using a context, the entropy decoding unit 301 obtains ivlCurrRange and ivlOffset by the following calculation.
 qRangeIdx = ivlCurrRange >> 5
 pState = pStateIdx1 + 16 * pStateIdx0
 valMps = pState >> 14
 ivlLpsRange = (qRangeIdx * ((valMps ? 32767 - pState : pState) >> 9) >> 1) + 4
 ivlCurrRange = ivlCurrRange - ivlLpsRange
 Next, if ivlOffset >= ivlCurrRange, the entropy decoding unit 301 derives the bin value binVal and the variables ivlOffset and ivlCurrRange as follows.
 binVal = !valMps
 ivlOffset = ivlOffset - ivlCurrRange
 ivlCurrRange = ivlLpsRange
 Otherwise, binVal is obtained as follows.
 binVal = valMps
 Furthermore, the entropy decoding unit 301 updates the state of the context by the following calculation.
 shift0 = (shiftIdx >> 2) + 2
 shift1 = (shiftIdx & 3) + 3 + shift0
 pStateIdx0 = pStateIdx0 - (pStateIdx0 >> shift0) + (1023 * binVal >> shift0)
 pStateIdx1 = pStateIdx1 - (pStateIdx1 >> shift1) + (16383 * binVal >> shift1)
(Decoding a bin in the bypass case)
 In the bypass case, the entropy decoding unit 301 obtains ivlCurrRange and ivlOffset by the following calculation.
 ivlCurrRange = ivlCurrRange << 1
 ivlOffset = ivlOffset | read_bits(1)
 Here, read_bits(1) reads one bit from the bitstream and returns its value.
 Next, if ivlOffset >= ivlCurrRange, the entropy decoding unit 301 sets binVal and ivlOffset as follows.
 binVal = 1
 ivlOffset = ivlOffset - ivlCurrRange
 Otherwise, binVal is set as follows.
 binVal = 0
 In the bypass case, the entropy decoding unit 301 does not update the state of the context.
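The formulas above transcribe almost line for line into C++. The sketch below follows the text exactly, including the bypass update as written, and omits renormalization, which the text does not show; the read_bits() member is a placeholder for the bitstream access.

    struct Context { int pStateIdx0, pStateIdx1, shiftIdx; };

    struct BinDecoder {
        unsigned ivlCurrRange, ivlOffset;
        unsigned read_bits(int n);  // placeholder: reads n bits from the bitstream

        // Context-coded bin, following the calculation given above.
        int decodeBin(Context& c) {
            unsigned qRangeIdx = ivlCurrRange >> 5;
            int pState = c.pStateIdx1 + 16 * c.pStateIdx0;
            int valMps = pState >> 14;
            unsigned ivlLpsRange =
                (qRangeIdx * ((unsigned)(valMps ? 32767 - pState : pState) >> 9) >> 1) + 4;
            ivlCurrRange -= ivlLpsRange;
            int binVal;
            if (ivlOffset >= ivlCurrRange) {   // LPS decoded
                binVal = !valMps;
                ivlOffset -= ivlCurrRange;
                ivlCurrRange = ivlLpsRange;
            } else {                           // MPS decoded
                binVal = valMps;
            }
            // state update
            int shift0 = (c.shiftIdx >> 2) + 2;
            int shift1 = (c.shiftIdx & 3) + 3 + shift0;
            c.pStateIdx0 = c.pStateIdx0 - (c.pStateIdx0 >> shift0) + (1023 * binVal >> shift0);
            c.pStateIdx1 = c.pStateIdx1 - (c.pStateIdx1 >> shift1) + (16383 * binVal >> shift1);
            return binVal;
        }

        // Bypass bin; no context state is updated.
        int decodeBypass() {
            ivlCurrRange = ivlCurrRange << 1;  // as written in the text
            ivlOffset = ivlOffset | read_bits(1);
            if (ivlOffset >= ivlCurrRange) {
                ivlOffset -= ivlCurrRange;
                return 1;
            }
            return 0;
        }
    };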
 The entropy decoding unit 301 outputs the parsed syntax elements to the parameter decoding unit 302. Control of which syntax elements are parsed is performed based on instructions from the parameter decoding unit 302.
 The entropy decoding unit 301 may, for example, parse the syntax element dimd_mode shown in the syntax table of FIG. 5 as follows. dimd_mode is a syntax element that selects the DIMD reference region from the encoded data.
 The entropy decoding unit 301 parses dimd_mode from the encoded data. In a configuration in which the position of the DIMD reference image is changed, dimd_mode may take the values 0, 1, and 2 for the DIMD_MODE_TOP_LEFT mode, the DIMD_MODE_TOP mode, and the DIMD_MODE_LEFT mode, respectively.
 FIG. 6(a) is a diagram showing an example of the binarization of dimd_mode. binIdx is a variable indicating the bit position; Bin0 (binIdx==0) and Bin1 (binIdx==1) of the syntax element refer to the first bit and the next bit.
 Bin0 is a flag that selects between DIMD_MODE_TOP_LEFT and the other modes: 0 indicates DIMD_MODE_TOP_LEFT, and 1 indicates that the mode is not DIMD_MODE_TOP_LEFT.
 Bin1 is a flag that selects between DIMD_MODE_TOP and DIMD_MODE_LEFT: 0 indicates DIMD_MODE_TOP, and 1 indicates DIMD_MODE_LEFT.
 Note that instead of constituting one syntax element with Bin0 and Bin1, a syntax element may be assigned to each of Bin0 and Bin1, and the two syntax elements may be parsed instead of dimd_mode. Here, the syntax element assigned to Bin0 is called dimd_mode_flag, and the syntax element assigned to Bin1 is called dimd_mode_dir (see, for example, FIG. 7). In this case, the entropy decoding unit 301 may derive dimd_mode from dimd_mode_flag and dimd_mode_dir using the following formula. When dimd_mode_flag==0, dimd_mode_dir is not decoded and is set to 0.
 dimd_mode = ((dimd_mode_flag == 0) ? 0 : 1) + dimd_mode_dir
 In this example, one bit (for example, "0") is assigned to DIMD_MODE_TOP_LEFT, and one further bit following "1" is assigned for DIMD_MODE_TOP and DIMD_MODE_LEFT. In the binarization of dimd_mode, assigning a shorter code to the frequently selected case of using both the left and the top than to the left-only and top-only cases shortens the average code amount and improves coding efficiency.
 The entropy decoding unit 301 parses dimd_mode from the encoded data. In a configuration in which the number of lines of the DIMD reference image is changed as shown in FIG. 6(b), dimd_mode may take the values 0 and 1 for the DIMD_LINES1 mode and the DIMD_LINES2 mode, respectively.
 Although not shown, the binarization of dimd_mode may consist of Bin0 (binIdx==0) only.
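Combining the binarization of FIG. 6(a) with the context assignment of FIG. 8(a) described next, the parsing could look like the following sketch; BinDecoder and Context refer to the CABAC sketch above, and ctxTopLeft is an assumed context variable for Bin0.

    // Parse dimd_mode: Bin0 context-coded, Bin1 decoded in bypass.
    int parseDimdMode(BinDecoder& dec, Context& ctxTopLeft)
    {
        if (dec.decodeBin(ctxTopLeft) == 0)
            return 0;                   // DIMD_MODE_TOP_LEFT
        return 1 + dec.decodeBypass();  // 1: DIMD_MODE_TOP, 2: DIMD_MODE_LEFT
    }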
 図8はdimd_modeのシンタックス要素のパースにおけるコンテキスト(ctxInc)の設定を示す図である。コンテキストとは、CABACの確率(状態)を保持するための変数領域であり、コンテキストインデックスctxIdxの値(0, 1, 2, …)によって識別される。また常に0と1が等確率つまり0.5, 0.5の場合をEP(Equal Probability)もしくはbypassと呼ぶ。この場合は特定のシンタックス要素に対して状態を保持する必要がないのでコンテキストを用いない。ctxIdxはctxIncを参照して導出される。 Figure 8 shows the setting of the context (ctxInc) when parsing the syntax element of dimd_mode. A context is a variable area for holding the probability (state) of CABAC, and is identified by the value of the context index ctxIdx (0, 1, 2, ...). The case where 0 and 1 are always equally probable, in other words 0.5, 0.5, is called EP (Equal Probability) or bypass. In this case, no context is used because there is no need to hold a state for a specific syntax element. ctxIdx is derived by referencing ctxInc.
 図8(a)に示すように、エントロピー復号部301は、先頭のBin0の復号に対してコンテキストを用い(ctxInc=0)、Bin1に対してbypassを用いてシンタックス要素dimd_modeをパースしてもよい。Bin0はDIMD_MODE_TOP_LEFTか否かを示すシンタックス要素、Bin1はDIMD_MODE_LEFTかDIMD_MODE_TOPか否かを示すシンタックス要素である。Bypassはコンテキストを用いないパース方法である。dimd_modeは、符号化データからDIMDの参照領域を選択するシンタックス要素である。上記構成によれば、DIMD_MODE_LEFTとDIMD_MODE_TOPとの選択にコンテキストを用いないため、メモリを低減できる効果を奏する。 As shown in FIG. 8(a), the entropy decoding unit 301 may parse the syntax element dimd_mode using a context (ctxInc=0) for decoding the first Bin0, and bypass for Bin1. Bin0 is a syntax element indicating whether DIMD_MODE_TOP_LEFT, and Bin1 is a syntax element indicating whether DIMD_MODE_LEFT or DIMD_MODE_TOP. Bypass is a parsing method that does not use a context. dimd_mode is a syntax element that selects the DIMD reference area from the encoded data. With the above configuration, no context is used to select between DIMD_MODE_LEFT and DIMD_MODE_TOP, which has the effect of reducing memory.
 図8(b)に示すように、エントロピー復号部301は、先頭のBin0の復号に対してコンテキストを用い(ctxInc=0)、Bin1に対して別のコンテキスト(ctxInc=1)を用いて、符号化データからdimd_modeをパースしてもよい。上記構成によれば、全ての方向に対してコンテキストを用いるために適応的に符号化することが可能であり性能が向上する効果を奏する。 As shown in FIG. 8(b), the entropy decoding unit 301 may parse dimd_mode from the encoded data by using a context (ctxInc=0) for decoding the first Bin0 and another context (ctxInc=1) for Bin1. With the above configuration, adaptive encoding is possible to use contexts for all directions, which has the effect of improving performance.
 図8(c)に示すように、エントロピー復号部301は、対象ブロックの形状に応じて、Bin1に対して別のコンテキスト(ctxInc=1,2,3)を用いて符号化データからdimd_modeをパースしてもよい。例えば以下のように対象ブロックの幅bWと高さbHが等しい場合、横長の場合、縦長の場合に別のコンテキストの値を割り当ててもよい。 As shown in FIG. 8(c), the entropy decoding unit 301 may parse dimd_mode from the encoded data using a different context (ctxInc=1,2,3) for Bin1 depending on the shape of the target block. For example, if the width bW and height bH of the target block are equal, different context values may be assigned when the block is horizontally long and when it is vertically long, as shown below.
 ctxInc = ( bW == bH ) ? 1 : ( bW < bH ) ? 2 : 3
なお式および値は上記に限定されず、判定の順序や値を変更してもよい。例えば以下であってもよい。
ctxInc = ( bW == bH ) ? 1 : ( bW < bH ) ? 2 : 3
The formulas and values are not limited to those described above, and the order of determination and values may be changed. For example, the following may be used.
 ctxIdx = ( bW > bH ) ? 1 : ( bW < bH ) ? 2 : 3
 上記構成によれば、ブロックの形状によって、例えば横長と縦長で、異なるコンテキストを用いるために適応的に符号化することが可能であり、性能が向上する効果を奏する。
ctxIdx = ( bW > bH ) ? 1 : ( bW < bH ) ? 2 : 3
According to the above configuration, it is possible to adaptively encode the block using different contexts depending on the shape of the block, for example, between horizontal and vertical, and this provides the effect of improving performance.
 図8(d)に示すように、エントロピー復号部301は、対象ブロックの形状に応じて、Bin1に対して別のコンテキスト(ctxInc=1,2)を用いて符号化データからdimd_modeをパースしてもよい。ブロックの形状が正方形の場合にはbypassを用いて、dimd_modeをパースしてもよい。 As shown in FIG. 8(d), the entropy decoding unit 301 may parse dimd_mode from the encoded data using a different context (ctxInc=1,2) for Bin1 depending on the shape of the target block. If the shape of the block is square, dimd_mode may be parsed using bypass.
 ctxIdx = ( bW == bH ) ? bypass : ( bW < bH ) ? 1 : 2
なお式および値は上記に限定されず、判定の順序や値を変更してもよい。例えば
 ctxIdx = ( bW > bH ) ? 1 : ( bW < bH ) ? 2 : bypass
 DIMDモード(dimd_mode)は、第1ビットと第2ビットから構成され、第1ビットはDIMDの参照領域が対象ブロックの上と左の双方かどうか、第2ビットはDIMDの参照領域が対象ブロックの左もしくは上かどうかを選択肢として、上記隣接領域を選択してもよい。
ctxIdx = ( bW == bH ) ? bypass : ( bW < bH ) ? 1 : 2
The formula and values are not limited to the above, and the order of judgment and values may be changed. For example, ctxIdx = ( bW > bH ) ? 1 : ( bW < bH ) ? 2 : bypass
The DIMD mode (dimd_mode) is composed of a first bit and a second bit, and the first bit selects whether the reference area of the DIMD is both above and to the left of the target block, and the second bit selects whether the reference area of the DIMD is to the left or above the target block, and the above adjacent area may be selected.
 図8(e)に示すように、エントロピー復号部301は、対象ブロックの形状に応じて、Bin1に対して別のコンテキスト(ctxInc=1,2)を用いて符号化データからdimd_modeをパースし
てもよい。ブロックの形状が正方形の場合には所定のコンテキスト(例えば1)、それ以外の場合に別のコンテキスト(例えば2)を用いて、dimd_modeをパースしてもよい。
As shown in Fig. 8(e), the entropy decoding unit 301 may parse dimd_mode from the encoded data using a different context (ctxInc = 1, 2) for Bin1 depending on the shape of the target block. If the shape of the block is a square, a predetermined context (e.g., 1) may be used to parse dimd_mode, and if not, a different context (e.g., 2) may be used to parse dimd_mode.
 ctxIdx = ( bW == bH ) ? 1 : 2
 このとき、正方形ではない場合には、bW > bHであるのか、bH < bWに応じてBin1のバイナリの値をスワップ(1を0に、0を1にする。たとえば1 - Bin1)した値を使いdimd_modeを復号する。つまり以下のようにdimd_modeを導出する。
ctxIdx = ( bW == bH ) ? 1 : 2
In this case, if the dimd_mode is not a square, dimd_mode is decoded using the value obtained by swapping the binary value of Bin1 (1 to 0, 0 to 1, for example, 1 - Bin1) depending on whether bW > bH or bH < bW. In other words, dimd_mode is derived as follows.
 dimd_mode = ((Bin0 == 0) ? 0 : 1) + ((bW >= bH) ? Bin1 : 1-Bin1)
When the two syntax elements described above are used, dimd_mode is derived as follows.
 dimd_mode = ((dimd_mode_flag == 0) ? 0 : 1) + ((bW >= bH) ? dimd_mode_dir : 1-dimd_mode_dir)
Note that bW >= bH above may instead be bW > bH, bW <= bH, or bW < bH.
According to the above configuration, different contexts are used depending on the shape of the target block, for example on whether the target block is square (and/or horizontally long or vertically long), so the block can be coded adaptively with a short code matched to its characteristics, which improves performance. In addition, when no context is used for square blocks, for example, memory is reduced.
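For illustration only, the context selection and bin handling described above can be combined into a minimal C sketch as follows. Here decodeBinCtx() is a hypothetical stand-in for the CABAC bin decoding of the entropy decoding unit 301, not a normative interface, and the sketch assumes the variant of FIG. 8(e) together with the Bin1 swap described above.

extern int decodeBinCtx(int ctxInc);   /* hypothetical CABAC-engine call */

/* Sketch: parse dimd_mode with a shape-dependent context for Bin1. */
int parseDimdMode(int bW, int bH)
{
    int bin0 = decodeBinCtx(0);            /* Bin0: dedicated context */
    int ctxInc = (bW == bH) ? 1 : 2;       /* square vs. non-square block */
    int bin1 = decodeBinCtx(ctxInc);       /* Bin1: shape-dependent context */
    /* derive dimd_mode as in the text, swapping Bin1 when bW < bH */
    return ((bin0 == 0) ? 0 : 1) + ((bW >= bH) ? bin1 : 1 - bin1);
}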
Note that when the left reference position (xC - refIdxW - 1, yC) or the top reference position (xC, yC - refIdxH - 1) of the block cannot refer to the adjacent region, for example at a picture boundary, a tile boundary, or a slice boundary, the entropy decoding unit 301 may omit the decoding of dimd_mode and set dimd_mode = DIMD_MODE_TOP_LEFT. Here, (xC, yC) is the top-left position of the block, and refIdxW and refIdxH are variables related to the number of lines of the reference region for DIMD prediction.
As another configuration, dimd_mode = DIMD_MODE_TOP may be set when only the upper adjacent region of the block is available, and dimd_mode = DIMD_MODE_LEFT when only the left adjacent region is available.
The parameter decoding unit 302 notifies the entropy decoding unit 301 of which syntax elements to parse, and outputs the syntax elements parsed by the entropy decoding unit 301 to the prediction parameter derivation unit 320.
(Configuration of the prediction parameter derivation unit 320)
The prediction parameter derivation unit 320 derives prediction parameters, for example the intra prediction mode IntraPredMode, based on the syntax elements input from the parameter decoding unit 302, referring to the prediction parameters stored in the prediction parameter memory 307. The prediction parameter derivation unit 320 outputs the derived prediction parameters to the predicted image generation unit 308 and also stores them in the prediction parameter memory 307. The prediction parameter derivation unit 320 may derive different prediction modes for luminance and chrominance.
The prediction parameter derivation unit 320 may derive the prediction parameters from syntax elements related to intra prediction such as those shown in FIG. 5.
The loop filter 305 is a filter provided in the coding loop that removes block distortion and ringing distortion and improves image quality. The loop filter 305 applies filters such as a deblocking filter, a sample adaptive offset (SAO), and an adaptive loop filter (ALF) to the decoded image of a CU generated by the addition unit 312.
The reference picture memory 306 stores the decoded image of the CU generated by the addition unit 312 at a predetermined position for each target picture and target CU.
The prediction parameter memory 307 stores prediction parameters at a predetermined position for each CTU or CU to be decoded. Specifically, the prediction parameter memory 307 stores the parameters decoded by the parameter decoding unit 302, the prediction mode predMode derived by the prediction parameter derivation unit 320, and the like.
The prediction mode predMode, prediction parameters, and the like are input to the predicted image generation unit 308. The predicted image generation unit 308 also reads a reference picture from the reference picture memory 306, and generates a predicted image of a block or subblock using the prediction parameters and the read reference picture (reference picture block). Here, a reference picture block is a set of pixels on a reference picture (usually rectangular, and therefore called a block), and is the region referred to in order to generate the predicted image.
When the prediction mode predMode indicates an intra prediction mode (IntraPredMode), the predicted image generation unit 310 performs intra prediction using the intra prediction parameters input from the prediction parameter derivation unit 320 and the reference pixels read from the reference picture memory 306.
Specifically, the predicted image generation unit 308 reads, from the reference picture memory 306, the adjacent blocks within a predetermined range from the target block on the target picture. The predetermined range refers to the adjacent blocks to the left, upper left, above, and upper right of the target block, and the region referred to differs depending on the intra prediction mode.
The predicted image generation unit 308 generates a predicted image of the target block by referring to the read decoded pixel values and the prediction mode indicated by IntraPredMode. The predicted image generation unit 308 outputs the generated predicted image of the block to the addition unit 312.
The generation of a predicted image based on the intra prediction mode is described below. In Planar prediction, DC prediction, and Angular prediction, a decoded peripheral region adjacent (close) to the prediction target block is set as the reference region R, and the predicted image is generated by extrapolating the pixels on the reference region R in a specific direction. For example, the reference region R may be set as an L-shaped region including the left and the top (or, in addition, the upper left, upper right, and lower left) of the prediction target block.
(Details of the predicted image generation unit)
Next, the configuration of the predicted image generation unit 308 will be described in detail with reference to FIG. 9. The predicted image generation unit 308 includes a reference sample filter unit 3103 (second reference image setting unit), a prediction unit 3104, and a predicted image correction unit 3105 (predicted image correction unit, filter switching unit, weighting coefficient changing unit).
Based on the reference pixels (reference image) on the reference region R, the filtered reference image generated by applying the reference pixel filter (first filter), and the intra prediction mode, the prediction unit 3104 generates a predicted image (provisional predicted image, pre-correction predicted image) of the prediction target block and outputs it to the predicted image correction unit 3105. The predicted image correction unit 3105 corrects the provisional predicted image according to the intra prediction mode, and generates and outputs a predicted image (corrected predicted image).
The components of the predicted image generation unit 308 are described below.
(Reference sample filter unit 3103)
The reference sample filter unit 3103 derives the reference sample s[x][y] at each position (x, y) on the reference region R by referring to the reference image. In addition, the reference sample filter unit 3103 applies the reference pixel filter (first filter) to the reference samples s[x][y] according to the intra prediction mode, and updates the reference sample s[x][y] at each position (x, y) on the reference region R (derives the filtered reference image s[x][y]). Specifically, a low-pass filter is applied to the reference image at and around the position (x, y) to derive the filtered reference image. Note that the low-pass filter need not be applied to all intra prediction modes; it may be applied to only some intra prediction modes. While the filter applied to the reference image on the reference region R in the reference sample filter unit 3103 is called the "reference pixel filter (first filter)", the filter that corrects the provisional predicted image in the predicted image correction unit 3105 described later is called the "position-dependent filter (second filter)".
(Configuration of the intra prediction unit)
The intra prediction unit generates a provisional predicted image (provisional predicted pixel values, pre-correction predicted image) of the prediction target block based on the intra prediction mode, the reference image, and the filtered reference pixel values, and outputs it to the predicted image correction unit 3105. The prediction unit 3104 internally includes a Planar prediction unit 31041, a DC prediction unit 31042, an Angular prediction unit 31043, an LM prediction unit 31044, an MIP (Matrix-based Intra Prediction) unit 31045, and a DIMD (Decoder-side Intra Mode Derivation) prediction unit 31046. The prediction unit 3104 selects a specific prediction unit according to the intra prediction mode and inputs the reference image and the filtered reference image to it. The relationship between the intra prediction modes and the corresponding prediction units is as follows.
・Planar prediction: Planar prediction unit 31041
・DC prediction: DC prediction unit 31042
・Angular prediction: Angular prediction unit 31043
・LM prediction: LM prediction unit 31044
・Matrix intra prediction: MIP unit 31045
・DIMD prediction: DIMD prediction unit 31046
(Planar prediction)
The Planar prediction unit 31041 generates a provisional predicted image by linear weighted addition of the reference samples s[x][y] according to the distance between the prediction target pixel position and the reference pixel positions, and outputs it to the predicted image correction unit 3105.
(DC prediction)
The DC prediction unit 31042 derives a DC predicted value corresponding to the average value of the reference samples s[x][y], and outputs a provisional predicted image q[x][y] whose pixel values are the DC predicted value.
(Angular prediction)
The Angular prediction unit 31043 generates a provisional predicted image q[x][y] using the reference samples s[x][y] in the prediction direction (reference direction) indicated by the intra prediction mode, and outputs it to the predicted image correction unit 3105.
(LM prediction)
The LM prediction unit 31044 predicts chrominance pixel values based on luminance pixel values. Specifically, this is a scheme that generates a predicted image of the chrominance images (Cb, Cr) using a linear model based on the decoded luminance image. One type of LM prediction is CCLM (Cross-Component Linear Model) prediction. CCLM prediction is a prediction scheme that uses, for one block, a linear model for predicting chrominance from luminance.
(Matrix intra prediction)
The MIP unit 31045 generates a provisional predicted image q[x][y] by product-sum operations between the reference samples s[x][y] derived from the adjacent region and a weighting matrix, and outputs it to the predicted image correction unit 3105.
(DIMD prediction)
DIMD prediction is a prediction scheme that generates a predicted image using an intra prediction mode that is not explicitly signaled. The angle mode derivation device 310465 derives an intra prediction mode suitable for the target block using information on the adjacent region, and the DIMD prediction unit 31046 generates a provisional predicted image using this intra prediction mode. Details are described later.
(Configuration of the predicted image correction unit 3105)
The predicted image correction unit 3105 corrects the provisional predicted image output from the prediction unit 3104 according to the intra prediction mode. Specifically, for each pixel of the provisional predicted image, the predicted image correction unit 3105 derives a position-dependent weighting coefficient according to the reference region R and the position of the target prediction pixel. Then, the reference samples s[][] and the provisional predicted image q[x][y] are weighted and added (weighted averaging) to derive a predicted image (corrected predicted image) Pred[][] in which the provisional predicted image has been corrected. Note that in some intra prediction modes, the predicted image correction unit 3105 may set the provisional predicted image q[x][y] as the predicted image without correcting it.
(Embodiment 1)
FIG. 10 shows the configuration of the DIMD prediction unit 31046 in this embodiment. The DIMD prediction unit 31046 includes a reference sample derivation unit 310460, an angle mode derivation device 310465 (a gradient derivation unit 310461 and an angle mode derivation unit 310462), an angle mode selection unit 310463, and a provisional predicted image generation unit 310464. The angle mode derivation device 310465 may include the angle mode selection unit 310463.
FIG. 5 shows an example of the syntax of encoded data related to DIMD. The prediction parameter derivation unit 320 decodes, from the encoded data, a flag dimd_flag indicating whether DIMD is used for each block. When dimd_flag of the target block is 1, the parameter decoding unit 302 need not decode the syntax elements related to the intra prediction mode (intra_mip_flag, intra_luma_mpm_flag, intra_luma_mpm_idx, intra_luma_mpm_reminder) from the encoded data. intra_mip_flag is a flag indicating whether MIP prediction is performed. intra_luma_mpm_flag is a flag indicating whether the prediction candidates Most Probable Mode (MPM) are used. intra_luma_mpm_idx is an index that designates the MPM when the MPM is used. intra_luma_mpm_reminder is an index that selects one of the remaining candidates when the MPM is not used. When dimd_flag is 0, intra_luma_mpm_flag is decoded, and when intra_luma_mpm_flag is 0, intra_luma_mpm_reminder is further decoded. When dimd_flag of the target block is 1, dimd_mode of the target block is further decoded. dimd_mode indicates the reference region used for deriving the intra prediction mode in DIMD prediction. The meaning of dimd_mode may be as follows.
 dimd_mode = 0 DIMD_MODE_TOP_LEFT (use both top and left)
 dimd_mode = 2 DIMD_MODE_LEFT (use left)
 dimd_mode = 3 DIMD_MODE_TOP (use top)
When dimd_flag is 1, the DIMD prediction unit 31046 derives an angle indicating the texture direction in the adjacent region using the pixel values, and generates a provisional predicted image using the intra prediction mode corresponding to that angle. For example: (1) the gradient direction of the pixel values is derived for pixels at predetermined positions in the adjacent region; (2) the derived gradient direction is converted into the corresponding directional prediction mode (Angular prediction mode); (3) a histogram of the prediction directions obtained for the predetermined pixels in the adjacent region is created; (4) the most frequent prediction mode, or a plurality of prediction modes including the most frequent one, is selected from the histogram, and a provisional predicted image is generated using the selected mode(s). The processing in each part of the DIMD prediction unit 31046 shown in FIG. 10 is described in more detail below.
(1) Reference sample derivation unit
The reference sample derivation unit 310460 derives reference samples refUnit from the decoded pixels recSamples adjacent to the target block. The operation of the reference sample derivation unit 310460 may instead be performed by the reference sample filter unit 3103.
FIG. 11 is a diagram showing an example of the reference region referred to by the DIMD prediction unit 31046. The reference sample derivation unit 310460 stores the image adjacent to the target block (the image of the DIMD reference region) recSamples, used by the gradient derivation unit 310461 and the predicted image generation unit 308 described later, in the sample array refUnit.
(Reference region configuration example 1 according to mode)
When dimd_mode == DIMD_MODE_TOP_LEFT, the reference sample derivation unit 310460 derives the sample array refUnit from the left and upper regions of the target block as follows.
First, the following process is performed at each position (x, y) in the region RL to the left of the target block (hereinafter simply RL).
 refUnit[x][y] = recSamples[xC+x][yC+y]
where RL is x=-1-refIdxW..-1, y=-1-refIdxH..refH-1.
(xC, yC) are the top-left coordinates of the target block, and refIdxW and refIdxH are constants indicating the width of the reference region adjacent on the left and the height of the reference region adjacent above. refIdxW may be 2 or 3, and refIdxH may be 2 or 3; furthermore, each may be changed according to the block size (the same applies below). When the pixels in the region above the target block are not available, the y-coordinate range of RL is y=0..refH-1.
refW and refH indicate the width and height of the DIMD reference region. When they are the same as the size of the target block, refW = bW and refH = bH; when the region is extended, refW = bW*2 and refH = bH*2 (the same applies below). Here, extending means using an adjacent image that includes the lower-left adjacent region in addition to the left, or the upper-right adjacent region in addition to the top.
Next, the following process is performed over the pixel range RT in the region above the target block.
 refUnit[x][y] = recSamples[xC+x][yC+y]
where RT is x=-1-refIdxW..refW-1, y=-1-refIdxH..-1. When the pixels in the region to the left of the target block are not available, the x-coordinate range of RT is x=0..refW-1. RTL is the region combining RL and RT.
When dimd_mode == DIMD_MODE_LEFT, the reference sample derivation unit 310460 derives refUnit from the region to the left of the target block, for example the above RL.
When dimd_mode == DIMD_MODE_TOP, the reference sample derivation unit 310460 derives refUnit from the region above the target block, for example the above RT.
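As a concrete illustration of configuration example 1, the copy into refUnit for dimd_mode == DIMD_MODE_TOP_LEFT might be written as the following C sketch. The index offset of (refIdxW+1, refIdxH+1) used so that negative coordinates can be stored, and the stride parameters, are assumptions of the sketch rather than part of the description above.

/* Sketch: refUnit derivation for dimd_mode == DIMD_MODE_TOP_LEFT. */
void deriveRefUnitTopLeft(const int *recSamples, int picStride,
                          int xC, int yC, int refIdxW, int refIdxH,
                          int refW, int refH, int *refUnit, int refStride)
{
    /* left region RL: x = -1-refIdxW..-1, y = -1-refIdxH..refH-1 */
    for (int y = -1 - refIdxH; y <= refH - 1; y++)
        for (int x = -1 - refIdxW; x <= -1; x++)
            refUnit[(y + refIdxH + 1) * refStride + (x + refIdxW + 1)] =
                recSamples[(yC + y) * picStride + (xC + x)];
    /* top region RT: x = -1-refIdxW..refW-1, y = -1-refIdxH..-1 */
    for (int y = -1 - refIdxH; y <= -1; y++)
        for (int x = -1 - refIdxW; x <= refW - 1; x++)
            refUnit[(y + refIdxH + 1) * refStride + (x + refIdxW + 1)] =
                recSamples[(yC + y) * picStride + (xC + x)];
}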
(Reference region configuration example 2 according to mode)
The reference sample derivation unit 310460 may instead perform the following process.
When dimd_mode == DIMD_MODE_TOP_LEFT or dimd_mode == DIMD_MODE_LEFT, the reference sample derivation unit 310460 derives refUnit from the region to the left of the target block, for example the above RL.
In addition, when dimd_mode == DIMD_MODE_TOP_LEFT or dimd_mode == DIMD_MODE_TOP, the reference sample derivation unit 310460 derives refUnit from the region above the target block, for example the above RT.
(Reference region configuration example 3 according to mode: using an extended region according to dimd_mode)
FIG. 11(b) shows another example of the reference range in the gradient derivation of DIMD prediction. In this example, when dimd_mode == DIMD_MODE_LEFT, an extended region including the lower-left adjacent region in addition to the left is used; when dimd_mode == DIMD_MODE_TOP, an extended region including the upper-right adjacent region in addition to the top is used; and when dimd_mode == DIMD_MODE_TOP_LEFT, the left and upper regions are used without extension.
For example, the reference sample derivation unit 310460 may perform the following process.
When dimd_mode == DIMD_MODE_TOP_LEFT, the reference sample derivation unit 310460 derives the sample array refUnit from the left and upper regions of the target block as follows.
First, the following process is performed over the pixel range RL in the region to the left of the target block.
 refUnit[x][y] = recSamples[xC+x][yC+y]
where RL is x=-1-refIdxW..-1, y=-1-refIdxH..refH-1, refW = bW, refH = bH.
Next, the following process is performed over the pixel range RT in the region above the target block.
 refUnit[x][y] = recSamples[xC+x][yC+y]
where RT is x=-1-refIdxW..refW-1, y=-1-refIdxH..-1, refW = bW, refH = bH. RTL is the region (range of positions) combining RL and RT.
When dimd_mode == DIMD_MODE_LEFT, the reference sample derivation unit 310460 derives refUnit from the left and lower-left regions of the target block, for example RL_EXT.
RL_EXT is x=-1-refIdxW..-1, y=-1-refIdxH..refH2-1, refH2 = bH*2. When the pixels in the region above the target block are not available, y=0..refH2-1.
When dimd_mode == DIMD_MODE_TOP, the reference sample derivation unit 310460 derives refUnit from the region above the target block, for example RT_EXT.
RT_EXT is x=-1-refIdxW..refW2-1, y=-1-refIdxH..-1, refW2 = bW*2. When the pixels in the region to the left of the target block are not available, x=0..refW2-1.
(Reference region configuration example 4 according to mode: second example of using an extended region according to dimd_mode)
The reference sample derivation unit 310460 may instead perform the following process.
When dimd_mode == DIMD_MODE_TOP_LEFT or dimd_mode == DIMD_MODE_LEFT, the reference sample derivation unit 310460 derives refUnit from the left and lower-left regions of the target block, for example RL_ADAP.
RL_ADAP is x=-1-refIdxW..-1, y=-1-refIdxH..refH-1, refH = (dimd_mode == DIMD_MODE_TOP_LEFT) ? bH : bH*2. When the pixels above the target block are not available in DIMD_MODE_LEFT, y=0..refH-1.
In addition, when dimd_mode == DIMD_MODE_TOP_LEFT or dimd_mode == DIMD_MODE_TOP, the reference sample derivation unit 310460 derives refUnit from the upper and upper-right regions of the target block, for example RT_ADAP.
RT_ADAP is x=-1-refIdxW..refW-1, y=-1-refIdxH..-1, refW = (dimd_mode == DIMD_MODE_TOP_LEFT) ? bW : bW*2. When the pixels to the left of the target block are not available in DIMD_MODE_TOP, x=0..refW-1.
Subsequently, the reference sample derivation unit 310460 may replace the values of refUnit[x][y] in regions that could not be referenced, for example outside the target picture, outside the target subpicture, or outside the target slice boundary, with the pixel values derived above or with a predetermined fixed value, for example 1<<(bitDepth-1).
(Configuration example 5 according to mode)
FIG. 12(a) shows another example of the reference range in the gradient derivation of DIMD prediction. In this example, the number of lines of the reference region is changed according to dimd_lines. For example, when dimd_mode == DIMD_LINES1, M lines are referenced, and when dimd_mode == DIMD_LINES2, N lines, with N greater than M, are referenced. For example, M=3, N=4.
The reference sample derivation unit 310460 sets the reference line counts refIdxW and refIdxH according to dimd_mode.
refIdxW = (dimd_mode == DIMD_LINES1) ? M-1 : N-1
refIdxH = (dimd_mode == DIMD_LINES1) ? M-1 : N-1
The reference sample derivation unit 310460 derives the sample array refUnit from the left and upper regions of the target block as follows.
First, the following process is performed over the pixel range RL in the region to the left of the target block.
 refUnit[x][y] = recSamples[xC+x][yC+y]
where RL is x=-1-refIdxW..-1, y=-1-refIdxH..refH-1, refW = bW, refH = bH. When the pixels in the region above the target block are not available, y=0..refH-1.
Next, the following process is performed over the pixel range RT in the region above the target block.
 refUnit[x][y] = recSamples[xC+x][yC+y]
where RT is x=-1-refIdxW..refW-1, y=-1-refIdxH..-1, refW = bW, refH = bH. When the pixels in the region to the left of the target block are not available, x=0..refW-1.
The selection may further depend on the block size.
(Configuration example 1 according to block size)
The reference sample derivation unit 310460 sets the reference line counts refIdxW and refIdxH according to the block size.
refIdxW = (bW >= 8 || bH >=8) ? N-1 : M-1
refIdxH = (bW >= 8 || bH >=8) ? N-1 : M-1
Furthermore, the reference sample derivation unit 310460 derives refUnit[x][y] from recSamples[xC+x][yC+y] of RL in the region to the left of the target block and RT in the region above it.
(Configuration example 6 according to mode)
FIG. 12(b) shows another example of the reference range in the gradient derivation of DIMD prediction. In this example, the direction of the reference region and the number of lines of the reference region are selected at the same time according to dimd_mode.
When dimd_mode refers to both the left and the top, the reference sample derivation unit 310460 sets the number of reference lines to M; otherwise it sets the number of reference lines to N (M < N). For example, M=3, N=4.
refIdxW = (dimd_mode == DIMD_MODE_TOP_LEFT) ? M-1 : N-1
refIdxH = (dimd_mode == DIMD_MODE_TOP_LEFT) ? M-1 : N-1
Values other than M=3, N=4 may also be used, for example M=3, N=5.
When dimd_mode == DIMD_MODE_TOP_LEFT, the reference sample derivation unit 310460 derives refUnit[x][y] from recSamples[xC+x][yC+y] of the left region RL and the upper region RT.
When dimd_mode == DIMD_MODE_LEFT, the reference sample derivation unit 310460 derives refUnit[x][y] from the region to the left of the target block, for example recSamples[xC+x][yC+y] of RL.
When dimd_mode == DIMD_MODE_TOP, the reference sample derivation unit 310460 derives refUnit[x][y] from the region above the target block, for example recSamples[xC+x][yC+y] of RT.
(1) Gradient derivation unit
The gradient derivation unit 310461 derives an angle (angle information) indicating the texture direction based on the pixel values of the gradient derivation target image. The angle information may be a value representing an angle with 1/36 precision, or may be some other value. The gradient derivation unit 310461 derives gradients in two or more specific directions (for example, Dx and Dy) and derives the direction of the gradient (angle information) from the relationship between the gradients Dx and Dy.
A spatial filter may be used to derive the gradients. As the spatial filter, for example, the 3x3 Sobel filters corresponding to the horizontal and vertical directions shown in FIG. 13(a) and (b) may be used. The gradient derivation unit 310461 derives a gradient for each point P[x][y] (hereinafter simply P) inside the sample array refUnit[x][y] referenced and derived by the reference sample derivation unit 310460 in the gradient derivation target image. A configuration is also possible in which recSamples is not copied into the sample array refUnit[x][y], and recSamples[xC+x][yC+y] is referenced as the point P instead of refUnit[x][y].
FIG. 14 shows an example of the positions of the gradient derivation target pixels in an 8x8 target block. When the angle mode derivation device 310465 is used for intra prediction, the shaded image in the region adjacent to the target block may be the gradient derivation target image. The gradient derivation target image may also be the luminance image corresponding to the chrominance image of the target block. In this way, the number of gradient derivation target pixels, the pattern of their positions, and the reference range of the spatial filter may be changed according to information such as the size of the target block and the intra prediction modes of the blocks included in the adjacent region.
Specifically, the gradient derivation unit 310461 derives the horizontal and vertical gradients Dx and Dy for each point P by the following equations.
Dx = P[x-1][y-1] + 2*P[x-1][y] + P[x-1][y+1] - P[x+1][y-1] - 2*P[x+1][y] - P[x+1][y+1]
Dy = - P[x-1][y-1] - 2*P[x][y-1] - P[x+1][y-1] + P[x-1][y+1] + 2*P[x][y+1] + P[x+1][y+1]
The filters of FIG. 13(c) and (d), obtained by flipping the filters of FIG. 13(a) and (b) horizontally or vertically, may also be used. In that case, Dx and Dy are derived by the following equations.
Dx = - P[x-1][y-1] - 2*P[x-1][y] - P[x-1][y+1] + P[x+1][y-1] + 2*P[x+1][y] + P[x+1][y+1]
Dy = P[x-1][y-1] + 2*P[x][y-1] + P[x+1][y-1] - P[x-1][y+1] - 2*P[x][y+1] - P[x+1][y+1]
The gradient derivation method is not limited to this; other methods (filters, formulas, tables, etc.) may be used. For example, a Prewitt filter or a Scharr filter may be used instead of the Sobel filter, and the filter size may be 2x2 or 5x5. The gradient derivation unit 310461 derives Dx and Dy with a Prewitt filter as follows.
Dx = P[x-1][y-1] + P[x-1][y] + P[x-1][y+1] - P[x+1][y-1] - P[x+1][y] - P[x+1][y+1]
Dy = - P[x-1][y-1] - P[x][y-1] - P[x+1][y-1] + P[x-1][y+1] + P[x][y+1] + P[x+1][y+1]
The following equations are an example of deriving Dx and Dy with a Scharr filter.
Dx = 3*P[x-1][y-1]+10*P[x-1][y]+3*P[x-1][y+1] -3*P[x+1][y-1]-10*P[x+1][y]-3*P[x+1][y+1]
Dy = -3*P[x-1][y-1]-10*P[x][y-1]-3*P[x+1][y-1] +3*P[x-1][y+1]+10*P[x][y+1]+3*P[x+1][y+1]
The gradient derivation method may be changed for each block. For example, the Sobel filter is used for a 4x4 target block, and the Scharr filter is used for blocks larger than 4x4. By using a filter that is simpler to compute for small blocks in this way, the increase in the amount of computation for small blocks can be suppressed.
The gradient derivation method may also be changed for each position of the gradient derivation target pixel. For example, the Sobel filter is used for gradient derivation target pixels in the upper or left adjacent region, and the Scharr filter is used for gradient derivation target pixels in the upper-left adjacent region.
Based on the signs and the magnitude relationship of Dx and Dy, the gradient derivation unit 310461 derives angle information consisting of the quadrant (hereinafter referred to as the region) of the texture angle of the target block and the angle within the quadrant. Expressing directions by region makes it possible to share processing between directions that are rotationally symmetric or line symmetric. However, the angle information is not limited to a region and an angle within the quadrant; for example, the angle information may consist of the angle alone, and the region may be derived as needed. In this embodiment, the intra directional prediction modes derived below are limited to the directions from the lower left to the upper right (2 to 66 in FIG. 3), and an intra directional prediction mode in the direction rotated by 180 degrees is treated as identical to it.
FIG. 15(a) is a table showing the relationship between the signs of Dx and Dy (signx, signy), their magnitude relationship (xgty), and the region (each of Ra to Rd is a constant representing a region). FIG. 15(b) shows the quadrants indicated by the regions Ra to Rd. The gradient derivation unit 310461 derives signx, signy, and xgty as follows.
absx = abs(Dx)
absy = abs(Dy)
signx = Dx < 0 ? 1 : 0
signy = Dy < 0 ? 1 : 0
xgty = absx > absy ? 1 : 0
Here, the strict inequalities (>, <) may instead be inequalities with equality (>=, <=). The region indicates a rough angle and can be derived solely from the signs signx, signy of Dx, Dy and the magnitude relationship xgty.
The gradient derivation unit 310461 derives region from the signs signx, signy and the magnitude relationship xgty using operations or a table lookup. The gradient derivation unit 310461 may refer to the table of FIG. 15(a) and derive the corresponding region.
The gradient derivation unit 310461 may derive region using a logical expression as follows.
region = xgty ? ( (signx^signy) ? 1 : 0 ) : ( (signx^signy) ? 2 : 3)
Here, ^ denotes XOR (exclusive OR). region is represented by a value from 0 to 3, with {Ra, Rb, Rc, Rd} = {0, 1, 2, 3}. The assignment of region values is not limited to the above.
The gradient derivation unit 310461 may also derive region using another logical expression together with addition and multiplication, as follows.
region = 2*(!xgty) + (signx^signy^!xgty)
Here, the symbol ! denotes logical negation.
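As a worked example, for Dx = 5 and Dy = -3 the above gives absx = 5, absy = 3, signx = 0, signy = 1, and xgty = 1, so both expressions yield region = 1, that is, Rb under the assignment {Ra, Rb, Rc, Rd} = {0, 1, 2, 3}.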
(2) Angle mode derivation unit
The angle mode derivation unit 310462 derives an angle mode (a prediction mode corresponding to the gradient, for example an intra prediction mode) based on the gradient information of each point P described above.
FIG. 16 is a block diagram showing one configuration of the angle mode derivation unit 310462. As shown in FIG. 16, the angle mode mode_delta may be derived as follows using a first gradient, a second gradient, and two tables.
The angle mode derivation unit 310462 consists of an angle coefficient derivation unit 310466 and a mode conversion unit 310467. The angle coefficient derivation unit 310466 derives an angle coefficient iRatio (or v) based on two gradients. Here, the slope iRatio (= absy / absx) is derived based on the absolute value absx of the first gradient and the absolute value absy of the second gradient. As iRatio, an integer representing ratio in steps of 1/R_UNIT is used.
iRatio = int(R_UNIT*absy/absx) ≒ ratio*R_UNIT
R_UNIT is a power of two (1<<shiftR), for example 65536 (shiftR=16).
An example of deriving iRatio is shown below, although the derivation is not limited to this example.
 s0 = xgty ? absy : absx
 s1 = xgty ? absx : absy
 x = Floor( Log2( s1 ) )
 norm_s1 = (s1 << 4 >> x) & 15
 v = gradDivTable[norm_s1] | 8
 x += (norm_s1 != 0)
 shift = 13 - x
 if (shift < 0){
  shift = -shift
  add = (1 << (shift - 1))
  iRatio = (s0 * v + add) >> shift
 } else {
  iRatio = (s0 * v) << shift
 }
 where gradDivTable = { 0, 7, 6, 5, 5, 4, 4, 3, 3, 2, 2, 1, 1, 1, 1, 0 }
Alternatively, the "| 8" (OR with 8) in the above expression may be computed as "+8"; since every entry of gradDivTable is less than 8, the OR merely sets bit 3 and is therefore equivalent to adding 8. Similarly, the "|16", "|32", and "|64" that appear in the following description can be computed as "+16", "+32", and "+64", respectively.
The value norm_s1 is derived from the first gradient (absx or absy) at a pixel by a shift based on its logarithmic value x. The angle coefficient v is derived by referring to gradDivTable with norm_s1. Furthermore, idx is derived from the product of v and a second gradient (s0 or s1), different from the first, followed by a shift based on the logarithmic value x. The angle mode mode_delta is derived by referring to a second table LUT (LUT') with idx.
Note that idx may be clipped as follows so that it does not exceed the number of entries of the LUT.
 idx = min((s0 * v)<< 3 >> x, N_LUT-1)
Furthermore, it is also appropriate to clip the product s0*v to a predetermined value KK or less before the shift so that, for example, 32 bits are not exceeded.
 s0*v = (min(s0*v, KK)<<3) >> x
KK is, for example, (1<<(31-3))-1 = 268435455.
The mode conversion unit 310467 derives and outputs the second angle mode modeVal using mode_delta.
modeVal = base_mode[region] + direction[region] * mode_delta
The angle mode derivation unit 310462 derives a histogram (frequencies HistMode) of the angle mode values modeVal obtained for the points P. The histogram may be obtained by incrementing the value of HistMode by 1 at each point P (hereinafter referred to as counting in the histogram).
 HistMode[modeVal] += 1
 cntMode += 1
(3) Angle mode selection unit
The angle mode selection unit 310463 derives representative values dimdModeVal (dimdModeVal0, dimdModeVal1, ...) of one or more angle modes using the values modeVal at the multiple points P included in the gradient derivation target image. The representative value of the angle mode in this embodiment is an estimate of the directionality of the texture pattern of the target block. Here, the representative value dimdModeVal is derived from the most frequent value of the derived histogram. In the histogram of the angle mode values modeVal obtained for the points P, the first mode dimdModeVal0 and the second mode dimdModeVal1 are derived by selecting the most frequent mode and the next most frequent mode, respectively.
Specifically, HistMode[x] is scanned over x; the value of x giving the maximum value of HistMode is set to dimdModeVal0, and the value of x giving the second largest value is set to dimdModeVal1.
 maxVal = 0
 secondMaxVal = 0
 for (x = 0; x < cntMode; x++) {
  if (HistMode[x] > maxVal) {
   secondMaxVal = maxVal
   maxVal = HistMode[x]
   dimdModeVal1 = dimdModeVal0
   dimdModeVal0 = x
  } else if (HistMode[x] > secondMaxVal) {
   secondMaxVal = HistMode[x]
   dimdModeVal1 = x
  }
 }
The else branch ensures that dimdModeVal1 holds the second most frequent mode even when it occurs after the most frequent one. Note that the method of deriving dimdModeVal0 or dimdModeVal1 is not limited to the histogram. For example, the angle mode selection unit 310463 may set the average value of modeVal as dimdModeVal0 or dimdModeVal1.
The angle mode selection unit 310463 sets a predetermined mode (for example, an intra prediction mode or a transform mode) as the third mode dimdModeVal2. Here dimdModeVal2 = 0 (Planar), but this is not a limitation: another mode may be set adaptively, or the third mode may not be used.
The angle mode selection unit 310463 may further derive weights corresponding to the representative values of the angle modes for intra prediction in the provisional predicted image generation unit 310464 described later. For example, the weight of the third mode is set to w2 = 21, and the remainder is distributed to the weights w0 and w1 in proportion to the frequencies of the first and second modes in the histogram, with the weights summing to 64. The derivation of the weights is not limited to this; the weights w0, w1, w2 of the first, second, and third modes may be changed adaptively. For example, w2 may be increased or decreased according to the numbers of the first and second modes, or according to their frequencies or their ratio. Note that, for any of the first to third modes that is not used, the angle mode selection unit sets the corresponding weight value to 0.
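A minimal C sketch of this weight derivation, assuming the example values above (w2 = 21, weights summing to 64) and a simple integer rounding rule that is not specified in the description, is as follows.

/* Sketch: w2 = 21; the remaining 43 is split between w0 and w1 in
   proportion to the histogram frequencies of the first and second modes. */
void deriveDimdWeights(int freq0, int freq1, int *w0, int *w1, int *w2)
{
    *w2 = 21;
    int rest = 64 - *w2;
    int sum = freq0 + freq1;
    *w0 = (sum > 0) ? (rest * freq0 + sum / 2) / sum : rest;
    *w1 = rest - *w0;                /* the three weights always sum to 64 */
}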
By including an angle mode selection unit that selects an angle mode representative value from the multiple angle modes derived at the pixels in the gradient derivation target image, an angle mode with higher accuracy can be derived.
As described above, the angle mode selection unit 310463 selects the angle modes estimated from the gradients (the angle mode representative values) and outputs them together with the weight corresponding to each angle mode.
(Configuration of the adaptive gradient derivation unit 310461 and angle mode derivation unit 310462)
As described above, in this embodiment, the region of the reference image used to derive the intra prediction mode is changed according to dimd_mode. Specifically, the positions of the points P used by the gradient derivation unit 310461, the angle mode derivation unit 310462, and the angle mode selection unit 310463 are changed according to dimd_mode.
The set of position ranges (x, y) used for the gradient derivation, the angle derivation, and the histogram may be positions within the reference regions RL, RT, and RTL. That is, to apply the 3x3 filter, the position range for gradient derivation may have its start point increased by 1 and its end point decreased by 1: when the DIMD prediction reference range is x=X0..X1, y=Y0..Y1, the range for gradient derivation may be x=X0+1..X1-1, y=Y0+1..Y1-1. The (x, y) position ranges for gradient derivation corresponding to RL, RT, and RTL are referred to as RDL, RDT, and RDTL.
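For example, with refIdxW = refIdxH = 2 and refW = refH = 8, RL is x=-3..-1, y=-3..7, while the corresponding gradient derivation range RDL described below is x=-2..-2, y=-2..6, one sample narrower on each side.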
(Reference region configuration example 1 according to mode)
FIG. 17(a) shows an example of the reference range in the gradient derivation of DIMD prediction. When dimd_mode == DIMD_MODE_TOP_LEFT, the angle mode derivation device 310465 derives Dx, Dy from each point P in the region RDL to the left of the target block, derives modeVal, and counts it in the histogram.
RDL is x=-refIdxW..-2, y=-refIdxH..refH-2.
Next, Dx, Dy are derived from each point P of RDT, the pixel range in the region above the target block, and modeVal is derived and counted in the histogram.
RDT is x=-refIdxW..refW-2, y=-refIdxH..-2. RDTL is the region combining RDL and RDT.
 dimd_mode == DIMD_MODE_LEFTの場合、勾配導出部310461、角度モード導出部310462(以下、角度モード導出装置310465)は、対象ブロックの左の領域、例えば上記RDLからDx, Dyを導出し、modeValを導出しヒストグラムでカウントする。 When dimd_mode == DIMD_MODE_LEFT, the gradient derivation unit 310461 and angle mode derivation unit 310462 (hereinafter referred to as the angle mode derivation device 310465) derive Dx and Dy from the left region of the target block, for example the above RDL, derive modeVal, and count it in a histogram.
 dimd_mode == DIMD_MODE_TOPの場合、角度モード導出装置310465は、対象ブロックの上の領域、例えば上記RDTからDx, Dyを導出し、modeValを導出しヒストグラムでカウントする。 When dimd_mode == DIMD_MODE_TOP, the angle mode derivation device 310465 derives Dx and Dy from the area above the target block, for example, the RDT above, derives modeVal, and counts it in a histogram.
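The flow of configuration example 1 can be summarized by the following C sketch. The Sobel-style 3x3 gradient, the amplitude-weighted histogram count, and the helpers rec() and mode_from_gradient() are assumptions kept abstract here; the point being illustrated is only that dimd_mode selects which of RDL and RDT contribute to the histogram.

#include <stdlib.h>
#include <string.h>

enum { DIMD_MODE_TOP_LEFT, DIMD_MODE_LEFT, DIMD_MODE_TOP };
#define NUM_MODES 67   /* assumed number of intra prediction modes */

/* Abstract helpers (assumed, not defined here): rec(x, y) returns a
 * reconstructed sample relative to the top-left corner of the target
 * block; mode_from_gradient() maps a gradient pair to modeVal. */
extern int rec(int x, int y);
extern int mode_from_gradient(int Dx, int Dy);

/* Accumulate the gradient histogram over one inclusive point range. */
void scan_region(int x0, int x1, int y0, int y1, int *hist)
{
    for (int y = y0; y <= y1; y++) {
        for (int x = x0; x <= x1; x++) {
            /* 3x3 horizontal / vertical gradients (Sobel-like, assumed) */
            int Dx = rec(x-1,y-1) + 2*rec(x-1,y) + rec(x-1,y+1)
                   - rec(x+1,y-1) - 2*rec(x+1,y) - rec(x+1,y+1);
            int Dy = rec(x-1,y-1) + 2*rec(x,y-1) + rec(x+1,y-1)
                   - rec(x-1,y+1) - 2*rec(x,y+1) - rec(x+1,y+1);
            int modeVal = mode_from_gradient(Dx, Dy);
            hist[modeVal] += abs(Dx) + abs(Dy);  /* amplitude-weighted count (assumed) */
        }
    }
}

/* Configuration example 1: dimd_mode selects RDL, RDT, or both. */
void build_histogram(int dimd_mode, int refIdxW, int refIdxH,
                     int refW, int refH, int *hist)
{
    memset(hist, 0, NUM_MODES * sizeof(int));
    if (dimd_mode != DIMD_MODE_TOP)   /* RDL: region left of the block */
        scan_region(-refIdxW, -2, -refIdxH, refH - 2, hist);
    if (dimd_mode != DIMD_MODE_LEFT)  /* RDT: region above the block */
        scan_region(-refIdxW, refW - 2, -refIdxH, -2, hist);
}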
 (モードに応じた参照領域の構成例2)
 なお、角度モード導出装置310465は、下記のような処理を行ってもよい。
(Configuration example 2 of reference area according to mode)
In addition, the angle mode derivation device 310465 may perform the following processing.
 dimd_mode == DIMD_MODE_TOP_LEFTもしくはdimd_mode == DIMD_MODE_LEFTの場合、角度モード導出装置310465は、対象ブロックの左の領域、例えば上記RDLからDx, Dyを導出し、modeValを導出しヒストグラムでカウントする。 When dimd_mode == DIMD_MODE_TOP_LEFT or dimd_mode == DIMD_MODE_LEFT, the angle mode derivation device 310465 derives Dx and Dy from the left area of the target block, for example, the above RDL, derives modeVal, and counts it in a histogram.
 さらにdimd_mode == DIMD_MODE_TOP_LEFTもしくはdimd_mode == DIMD_MODE_TOPの場合、角度モード導出装置310465は、対象ブロックの上の領域、例えば上記RDTからDx, Dyを導出し、modeValを導出しヒストグラムでカウントする。 Furthermore, when dimd_mode == DIMD_MODE_TOP_LEFT or dimd_mode == DIMD_MODE_TOP, the angle mode derivation device 310465 derives Dx and Dy from the area above the target block, for example, the RDT mentioned above, derives modeVal, and counts it in a histogram.
 (モードに応じた参照領域の構成例3:dimd_modeに応じて拡張領域を利用する例)
 図17(b)は、DIMD予測の勾配導出における参照範囲の別の例を示す。この例ではdimd_mode == DIMD_MODE_LEFTの場合に、左に加えて左下の隣接領域を含めた拡張領域を用いる。また、dimd_mode == DIMD_MODE_TOPの場合に、上に加えて右上の隣接領域を含めた拡張領域を用いる。dimd_mode == DIMD_MODE_TOP_LEFTの場合には拡張せずに左と上の領域を用いる。
(Configuration example 3 of reference area according to mode: example of using an extension area according to dimd_mode)
Figure 17(b) shows another example of the reference range in the gradient derivation of DIMD prediction. In this example, when dimd_mode == DIMD_MODE_LEFT, an extended region including the adjacent region on the left and the bottom left is used. When dimd_mode == DIMD_MODE_TOP, an extended region including the adjacent region on the top and the top right is used. When dimd_mode == DIMD_MODE_TOP_LEFT, the left and top regions are used without extension.
 例えば、角度モード導出装置310465は、下記のような処理を行ってもよい。 For example, the angle mode derivation device 310465 may perform the following processing:
 dimd_mode == DIMD_MODE_TOP_LEFTの場合、参照サンプル導出部310460は、対象ブロックの左と上の領域からDx, Dyを導出し、modeValを導出しヒストグラムでカウントする。
まず対象ブロックの左の領域RDLの位置(x, y)の点PにおいてDx, Dyを導出し、modeValを導出しヒストグラムでカウントする。
ここでRDLは、x=-refIdxW..-2, y=-refIdxH..refH-2、refW = bW、refH = bH。
続いて、対象ブロックの上の領域RDTの位置(x, y)の点PにおいてDx, Dyを導出し、modeValを導出しヒストグラムでカウントする。
ここでRDTは、x=-refIdxW..refW-2, y=-refIdxH..-2、refW = bW、refH = bH。RDTLはRDLとRDTを合わせた領域である。
When dimd_mode == DIMD_MODE_TOP_LEFT, the reference sample derivation unit 310460 derives Dx, Dy from the left and top areas of the target block, derives modeVal, and counts it in a histogram.
First, Dx, Dy are derived at point P at position (x, y) in the left region RDL of the target block, and modeVal is derived and counted in a histogram.
Here, the RDL is x=-refIdxW..-2, y=-refIdxH..refH-2, refW = bW, refH = bH.
Next, Dx, Dy are derived at point P at position (x, y) in region RDT above the target block, and modeVal is derived and counted in a histogram.
Here, RDT is x=-refIdxW..refW-2, y=-refIdxH..-2, refW = bW, refH = bH. RDTL is the combined region of RDL and RDT.
 dimd_mode == DIMD_MODE_LEFTの場合、角度モード導出装置310465は、対象ブロックの左と左下の領域、例えばRDL_EXTからDx, Dyを導出し、modeValを導出しヒストグラムでカウントする。
RDL_EXTは、x=-refIdxW..-2, y=-refIdxH..refH2-2、refH2 = bH*2。
対象ブロックの上の領域の画素が利用できない場合は、y=1..refH2-2とする。
When dimd_mode == DIMD_MODE_LEFT, the angle mode derivation device 310465 derives Dx, Dy from the left and bottom left areas of the target block, for example RDL_EXT, and derives modeVal and counts it in a histogram.
RDL_EXT is x=-refIdxW..-2, y=-refIdxH..refH2-2, refH2 = bH*2.
If the pixels in the region above the target block are not available, then y=1..refH2-2.
 dimd_mode == DIMD_MODE_TOPの場合、角度モード導出装置310465は、対象ブロックの上の領域、例えば上記RDT_EXTからDx, Dyを導出し、modeValを導出しヒストグラムでカウントする。
RDT_EXTは、x=-refIdxW..refW2-2, y=-refIdxH..-2、refW2 = bW*2。
対象ブロックの左の領域の画素が利用できない場合は、x=1..refW2-2とする。
When dimd_mode == DIMD_MODE_TOP, the angle mode derivation device 310465 derives Dx, Dy from the region above the target block, for example, the above RDT_EXT, and derives modeVal and counts it in a histogram.
RDT_EXT is x=-refIdxW..refW2-2, y=-refIdxH..-2, refW2 = bW*2.
If the pixels in the area to the left of the target block are unavailable, then x=1..refW2-2.
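Configuration example 3 can be sketched as follows, reusing the scan_region() helper and mode constants from the earlier sketch. The availability flags are assumptions standing in for the "pixels not available" conditions in the text.

extern void scan_region(int x0, int x1, int y0, int y1, int *hist);
enum { DIMD_MODE_TOP_LEFT, DIMD_MODE_LEFT, DIMD_MODE_TOP };

/* Sketch of configuration example 3: extend the scanned range below-left
 * (RDL_EXT) or above-right (RDT_EXT) when only one side is used; no
 * extension when both sides are used. */
void select_regions_ext(int dimd_mode, int bW, int bH,
                        int refIdxW, int refIdxH,
                        int availLeft, int availTop, int *hist)
{
    if (dimd_mode == DIMD_MODE_TOP_LEFT) {
        /* RDL and RDT with refW = bW, refH = bH (no extension) */
        scan_region(-refIdxW, -2, -refIdxH, bH - 2, hist);
        scan_region(-refIdxW, bW - 2, -refIdxH, -2, hist);
    } else if (dimd_mode == DIMD_MODE_LEFT) {
        /* RDL_EXT: left plus below-left, refH2 = bH*2 */
        int y0 = availTop ? -refIdxH : 1;  /* fallback when the top is missing */
        scan_region(-refIdxW, -2, y0, bH * 2 - 2, hist);
    } else { /* DIMD_MODE_TOP */
        /* RDT_EXT: top plus above-right, refW2 = bW*2 */
        int x0 = availLeft ? -refIdxW : 1; /* fallback when the left is missing */
        scan_region(x0, bW * 2 - 2, -refIdxH, -2, hist);
    }
}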
 (構成例4:dimd_modeに応じて拡張領域を利用する例2)
 なお、角度モード導出装置310465は、下記のような処理を行ってもよい。
(Configuration example 4: example 2 of using the extension area according to dimd_mode)
In addition, the angle mode derivation device 310465 may perform the following processing.
 dimd_mode == DIMD_MODE_TOP_LEFTもしくはdimd_mode == DIMD_MODE_LEFTの場合、参照サンプル導出部310460は、対象ブロックの左と左下の領域、例えばRDL_ADAPからDx, Dyを導出し、modeValを導出しヒストグラムでカウントする。
RDL_ADAPは以下を用いてもよい。
x=-refIdxW..-2, y=-refIdxH..refH-2、refH = (dimd_mode == DIMD_MODE_TOP_LEFT) ? bH : bH*2
DIMD_MODE_LEFTで対象ブロックの上の領域の画素が利用できない場合は、y=1..refH-2を用いる。
When dimd_mode == DIMD_MODE_TOP_LEFT or dimd_mode == DIMD_MODE_LEFT, the reference sample derivation unit 310460 derives Dx and Dy from the left and bottom left areas of the target block, for example, RDL_ADAP, and derives modeVal and counts it in a histogram.
RDL_ADAP may use the following:
x=-refIdxW..-2, y=-refIdxH..refH-2, refH = (dimd_mode == DIMD_MODE_TOP_LEFT) ? bH : bH*2
In DIMD_MODE_LEFT, if the pixels in the area above the target block are not available, y=1..refH-2 is used.
 さらに続けてdimd_mode == DIMD_MODE_TOP_LEFTもしくはdimd_mode == DIMD_MODE_TOPの場合、角度モード導出装置310465は、対象ブロックの上と右上の領域、例えばRDT_ADAPからDx, Dyを導出し、modeValを導出しヒストグラムでカウントする。
RDT_ADAPは以下を用いてもよい。
x=-refIdxW..refW-2, y=-refIdxH..-2、refW = (dimd_mode == DIMD_MODE_TOP_LEFT) ? bW*2 : bW
DIMD_MODE_TOPで対象ブロックの左の領域の画素が利用できない場合は、x=1..refW-2を用いる。
Furthermore, if dimd_mode == DIMD_MODE_TOP_LEFT or dimd_mode == DIMD_MODE_TOP, the angle mode derivation device 310465 derives Dx and Dy from the upper and upper-right regions of the target block, for example, RDT_ADAP, derives modeVal, and counts it in a histogram.
RDT_ADAP may use the following:
x=-refIdxW..refW-2, y=-refIdxH..-2, refW = (dimd_mode == DIMD_MODE_TOP_LEFT) ? bW*2 : bW
In DIMD_MODE_TOP, if pixels in the area to the left of the target block are not available, use x=1..refW-2.
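Configuration example 4 differs from example 3 only in that both passes share one code path and the extent adapts through refH and refW, exactly as in the ternaries quoted above. A sketch, again reusing scan_region():

extern void scan_region(int x0, int x1, int y0, int y1, int *hist);
enum { DIMD_MODE_TOP_LEFT, DIMD_MODE_LEFT, DIMD_MODE_TOP };

/* Sketch of configuration example 4: RDL_ADAP and RDT_ADAP. */
void select_regions_adap(int dimd_mode, int bW, int bH,
                         int refIdxW, int refIdxH,
                         int availLeft, int availTop, int *hist)
{
    if (dimd_mode == DIMD_MODE_TOP_LEFT || dimd_mode == DIMD_MODE_LEFT) {
        /* RDL_ADAP: extend downward only when the left is used alone */
        int refH = (dimd_mode == DIMD_MODE_TOP_LEFT) ? bH : bH * 2;
        int y0 = (dimd_mode == DIMD_MODE_LEFT && !availTop) ? 1 : -refIdxH;
        scan_region(-refIdxW, -2, y0, refH - 2, hist);
    }
    if (dimd_mode == DIMD_MODE_TOP_LEFT || dimd_mode == DIMD_MODE_TOP) {
        /* RDT_ADAP: refW follows the ternary quoted in the text */
        int refW = (dimd_mode == DIMD_MODE_TOP_LEFT) ? bW * 2 : bW;
        int x0 = (dimd_mode == DIMD_MODE_TOP && !availLeft) ? 1 : -refIdxW;
        scan_region(x0, refW - 2, -refIdxH, -2, hist);
    }
}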
 上記構成によれば、DIMDモードに応じて、少なくとも、対象ブロックの上と左、左、上を上記隣接画像として切り替えることができる。従って、対象ブロックの特徴と左もしくは上の隣接領域の特徴が異なる場合であっても、デコーダ側のイントラ予測モード導出を高い精度で実現でき高効率を実現できる効果を奏する。 According to the above configuration, the adjacent image can be switched, depending on the DIMD mode, at least among the top-and-left, the left, and the top of the target block. Therefore, even when the characteristics of the target block differ from those of the left or top adjacent region, intra prediction mode derivation on the decoder side can be realized with high accuracy, achieving high coding efficiency.
 対象領域隣接画像の画素値の勾配を用いてデコーダ側でイントラ予測モードを導出する場合、隣接画像の角度勾配と対象ブロックの角度勾配は必ずしも一致しない。そのような場合でも、隣接ブロックと対象ブロックの性質に応じてイントラ予測モードの導出を切り替えて精度を向上させる効果を奏する。 When the decoder derives the intra prediction mode using the gradient of pixel values of an image adjacent to the target area, the angle gradient of the adjacent image and the angle gradient of the target block do not necessarily match. Even in such cases, the effect of improving accuracy is achieved by switching the derivation of the intra prediction mode depending on the properties of the adjacent blocks and the target block.
 さらに、上と左の両方を使う場合(DIMD_MODE_TOP_LEFT)では右上、左下の拡張領域を用いず、左だけ(DIMD_MODE_LEFT)、または、上だけ(DIMD_MODE_TOP)を使う場合では各々左と左下、上と右上の拡張領域を用いる構成では、参照画素のサンプリング、勾配導出、ヒストグラム導出の処理量を削減する効果を奏する。 Furthermore, a configuration that does not use the top-right and bottom-left extension regions when both the top and left are used (DIMD_MODE_TOP_LEFT), and that uses the left and bottom-left extension regions or the top and top-right extension regions when only the left (DIMD_MODE_LEFT) or only the top (DIMD_MODE_TOP) is used, respectively, has the effect of reducing the amount of processing for reference-pixel sampling, gradient derivation, and histogram derivation.
 (モードに応じた構成例5)
 図12(a)は、DIMD予測の勾配導出における参照範囲の別の例を示す。この例ではdimd_modeに応じて参照領域のライン数を変更する。例えば、dimd_mode == DIMD_LINES1の場合に、Mラインを参照し、dimd_mode == DIMD_LINES2の場合にMより大きいNラインを参照する。M, NはたとえばM=3, N=4。
(Configuration Example 5 According to Mode)
Fig. 12(a) shows another example of the reference range in gradient derivation for DIMD prediction. In this example, the number of lines in the reference region is changed according to dimd_mode. For example, when dimd_mode == DIMD_LINES1, M lines are referenced, and when dimd_mode == DIMD_LINES2, N lines greater than M are referenced. M and N are, for example, M=3 and N=4.
 角度モード導出装置310465は、dimd_modeに応じて参照ライン数refIdxW, refIdxHを設定する。
refIdxW = (dimd_mode == DIMD_LINES1) ? M-1 : N-1
refIdxH = (dimd_mode == DIMD_LINES1) ? M-1 : N-1
 角度モード導出装置310465は対象ブロックの左の領域、例えばRDLからDx, Dyを導出し、modeValを導出しヒストグラムでカウントする。
The angle mode derivation device 310465 sets the reference line numbers refIdxW and refIdxH in accordance with dimd_mode.
refIdxW = (dimd_mode == DIMD_LINES1) ? M-1 : N-1
refIdxH = (dimd_mode == DIMD_LINES1) ? M-1 : N-1
The angle mode derivation device 310465 derives Dx and Dy from the left region of the target block, for example, RDL, derives modeVal, and counts it in a histogram.
 角度モード導出装置310465は対象ブロックの上の領域、例えばRDTからDx, Dyを導出し、modeValを導出しヒストグラムでカウントする。 The angle mode derivation device 310465 derives Dx and Dy from the area above the target block, for example, RDT, derives modeVal, and counts it in a histogram.
 さらにブロックサイズに応じて選択してもよい。 You can also select based on block size.
 (ブロックサイズに応じた構成例1)
 角度モード導出装置310465はブロックサイズに応じて参照ライン数refIdxW, refIdxHを設定する。
refIdxW = (bW >= 8 || bH >=8) ? N-1 : M-1
refIdxH = (bW >= 8 || bH >=8) ? N-1 : M-1
 角度モード導出装置310465は、対象ブロックの左の領域RDL、上の領域RDTからDx, Dyを導出し、modeValを導出しヒストグラムでカウントする。
(モードに応じた構成例6)
 図12(b)は、DIMD予測の勾配導出における参照範囲の別の例を示す。この例ではdimd_modeに応じて、参照領域の方向を選択するのと同時に、参照領域のライン数も選択する。
(Configuration example 1 according to block size)
The angle mode derivation device 310465 sets the reference line numbers refIdxW and refIdxH in accordance with the block size.
refIdxW = (bW >= 8 || bH >= 8) ? N-1 : M-1
refIdxH = (bW >= 8 || bH >= 8) ? N-1 : M-1
The angle mode derivation device 310465 derives Dx and Dy from the left region RDL and the top region RDT of the target block, derives modeVal, and counts it in a histogram.
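A sketch of the block-size-based selection, under the assumption that M and N keep the example values M=3, N=4 from configuration example 5:

#include <stdio.h>

/* Sketch: choose the reference line indices from the block size. */
static void set_ref_lines_by_size(int bW, int bH, int *refIdxW, int *refIdxH)
{
    const int M = 3, N = 4;                    /* example values (assumed) */
    int lines = (bW >= 8 || bH >= 8) ? N : M;  /* larger blocks use more lines */
    *refIdxW = lines - 1;
    *refIdxH = lines - 1;
}

int main(void)
{
    int refIdxW, refIdxH;
    set_ref_lines_by_size(16, 4, &refIdxW, &refIdxH);
    printf("refIdxW=%d refIdxH=%d\n", refIdxW, refIdxH); /* prints 3 3 */
    return 0;
}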
(Configuration Example 6 According to Mode)
Fig. 12(b) shows another example of the reference range in the gradient derivation of DIMD prediction. In this example, the direction of the reference region is selected according to dimd_mode, and at the same time, the number of lines of the reference region is also selected.
 角度モード導出装置310465は、dimd_modeが左と上を参照する場合、参照ライン数をM、それ以外の場合、参照ライン数をN(M<N)となるように導出する。たとえばM=2, N=3。
refIdxW = (dimd_mode == DIMD_MODE_TOP_LEFT) ? M-1 : N-1
refIdxH = (dimd_mode == DIMD_MODE_TOP_LEFT) ? M-1 : N-1
 dimd_mode == DIMD_MODE_TOP_LEFTの場合、参照サンプル導出部310460は、参照ライン数をMに設定した上で、左の領域RDL、上の領域RDTからDx, Dyを導出し、modeValを導出しヒストグラムでカウントする。
The angle mode derivation device 310465 derives the number of reference lines to be M if dimd_mode refers to the left and top, and otherwise derives the number of reference lines to be N (M<N), for example, M=2, N=3.
refIdxW = (dimd_mode == DIMD_MODE_TOP_LEFT) ? M-1 : N-1
refIdxH = (dimd_mode == DIMD_MODE_TOP_LEFT) ? M-1 : N-1
When dimd_mode == DIMD_MODE_TOP_LEFT, the reference sample derivation unit 310460 sets the number of reference lines to M, derives Dx and Dy from the left region RDL and the top region RDT, derives modeVal, and counts it in a histogram.
 dimd_mode == DIMD_MODE_LEFTの場合、角度モード導出装置310465は、参照ライン数をNに設定した上で、対象ブロックの左の領域、例えばRDLからDx, Dyを導出し、modeValを導出しヒストグラムでカウントする。 When dimd_mode == DIMD_MODE_LEFT, the angle mode derivation device 310465 sets the number of reference lines to N, derives Dx and Dy from the area to the left of the target block, for example, RDL, derives modeVal, and counts it in a histogram.
 dimd_mode == DIMD_MODE_TOPの場合、角度モード導出装置310465は、参照ライン数をNに設定した上で、対象ブロックの上の領域、例えばRDTからDx, Dyを導出し、modeValを導出しヒストグラムでカウントする。 When dimd_mode == DIMD_MODE_TOP, the angle mode derivation device 310465 sets the number of reference lines to N, derives Dx and Dy from the area above the target block, for example, RDT, derives modeVal, and counts it in a histogram.
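Configuration example 6 combines the direction selection of example 1 with the line-count selection of example 5. A sketch, reusing scan_region() and the example values M=2, N=3 (assumed):

extern void scan_region(int x0, int x1, int y0, int y1, int *hist);
enum { DIMD_MODE_TOP_LEFT, DIMD_MODE_LEFT, DIMD_MODE_TOP };

/* Sketch of configuration example 6: dimd_mode selects both the
 * reference direction and the number of reference lines (M < N). */
void build_histogram_ex6(int dimd_mode, int bW, int bH, int *hist)
{
    const int M = 2, N = 3;   /* example values from the text */
    int lines = (dimd_mode == DIMD_MODE_TOP_LEFT) ? M : N;
    int refIdxW = lines - 1, refIdxH = lines - 1;
    if (dimd_mode != DIMD_MODE_TOP)   /* RDL: left region */
        scan_region(-refIdxW, -2, -refIdxH, bH - 2, hist);
    if (dimd_mode != DIMD_MODE_LEFT)  /* RDT: top region */
        scan_region(-refIdxW, bW - 2, -refIdxH, -2, hist);
}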
 上記構成によれば、参照領域隣接画像の画素値の勾配を用いてデコーダ側でイントラ予測モードを導出する構成において、参照領域として上と左の両方を使う場合と、左だけ、または、上だけを使う場合とで参照するライン数を切りかえる。これにより、さらに、対象ブロックと隣接ブロックの特徴の連続性の違いに応じたイントラ予測モードを導出することができ、予測精度が向上する効果を奏する。 With the above configuration, in a configuration in which the intra prediction mode is derived on the decoder side using the gradient of pixel values of the reference area adjacent image, the number of lines to be referenced is switched between when both the top and left are used as the reference area, and when only the left or only the top is used. This further makes it possible to derive an intra prediction mode according to the difference in the continuity of the characteristics of the target block and the adjacent blocks, thereby improving prediction accuracy.
 さらに、上と左の両方を使う場合(DIMD_MODE_TOP_LEFT)の参照ライン数をM、左だけ(DIMD_MODE_LEFT)、上だけ(DIMD_MODE_TOP)を使う場合の参照するライン数をNに設定する(ここでM<N)。この構成では、上と左の両方を参照することによる参照画素のサンプリング、勾配導出、ヒストグラム導出の処理量を削減する効果を奏する。 Furthermore, the number of reference lines when using both the top and left (DIMD_MODE_TOP_LEFT) is set to M, and the number of reference lines when using only the left (DIMD_MODE_LEFT) or only the top (DIMD_MODE_TOP) is set to N (where M<N). This configuration has the effect of reducing the amount of processing required for sampling reference pixels, deriving gradients, and deriving histograms by referring to both the top and left.
 なお、上記構成において対象ブロックの左の参照領域を左と左下の参照領域、対象ブロックの上の参照領域を上と右上の参照領域を用いる構成であってもよいし、対象ブロックの左上を参照する構成であってもよい。 In the above configuration, the reference area to the left of the target block may be the left and bottom left reference areas, and the reference area above the target block may be the top and top right reference areas, or the top left of the target block may be referenced.
 (4)予測画像生成部
 予測画像生成部(仮予測画像生成部)310464は、入力された1つ以上の角度モード代表値(イントラ予測モード)を用いて予測画像(仮予測画像)を生成する。イントラ予測モードが1つの場合は、当該イントラ予測モードによるイントラ予測画像を生成し、仮予測画像q[x][y]として出力する。イントラ予測モードが複数の場合、各イントラ予測モードによる予測画像(pred0, pred1, pred2)を生成する。対応する重み(w0,w1,w2)を用いて複数の予測画像を合成し、予測画像q[x][y]として出力する。予測画像q[x][y]は以下のように導出する。
q[x][y] = (w0 * pred0[x][y] + w1 * pred1[x][y] + w2 * pred2[x][y]) >> 6
ただし、第2のモードの頻度が0である、または、方向予測モードでない(DCモードなど)場合は、第1のイントラ予測モードによる予測画像pred0[][]を予測画像q[][]とする(q[x][y]=pred0[x][y])。
(4) Prediction image generation unit
The prediction image generation unit (provisional prediction image generation unit) 310464 generates a prediction image (provisional prediction image) using one or more input angle mode representative values (intra prediction modes). When there is one intra prediction mode, an intra prediction image is generated with that intra prediction mode and output as the provisional prediction image q[x][y]. When there are multiple intra prediction modes, a prediction image (pred0, pred1, pred2) is generated with each intra prediction mode. The multiple prediction images are combined using the corresponding weights (w0, w1, w2) and output as the prediction image q[x][y]. The prediction image q[x][y] is derived as follows.
q[x][y] = (w0 * pred0[x][y] + w1 * pred1[x][y] + w2 * pred2[x][y]) >> 6
However, if the frequency of the second mode is 0, or the second mode is not a directional prediction mode (e.g., DC mode), the prediction image pred0[][] generated with the first intra prediction mode is used as the prediction image q[][] (q[x][y]=pred0[x][y]).
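The weighted blend can be written directly from the formula above. In this sketch the weights are assumed to sum to 64 so the right shift by 6 restores the original value range; the fallback to pred0 is handled by the caller, and 8-bit samples are assumed.

#include <stdint.h>
#include <stdio.h>

/* Sketch: blend three per-mode predictions with 6-bit fixed-point
 * weights (w0 + w1 + w2 == 64), matching q[x][y] in the text. */
static void blend_pred(const uint8_t *pred0, const uint8_t *pred1,
                       const uint8_t *pred2, uint8_t *q, int n,
                       int w0, int w1, int w2)
{
    for (int i = 0; i < n; i++)
        q[i] = (uint8_t)((w0 * pred0[i] + w1 * pred1[i] + w2 * pred2[i]) >> 6);
}

int main(void)
{
    uint8_t p0[4] = {100, 100, 100, 100}, p1[4] = {50, 50, 50, 50},
            p2[4] = {200, 200, 200, 200}, q[4];
    blend_pred(p0, p1, p2, q, 4, 30, 13, 21);
    printf("q[0]=%d\n", q[0]);  /* (30*100 + 13*50 + 21*200) >> 6 = 122 */
    return 0;
}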
 (dimd_mode復号の構成)
 逆量子化・逆変換部311は、予測パラメータ導出部320から入力された量子化変換係数を逆量子化して変換係数を求める。この量子化変換係数は、符号化処理において、予測誤差に対してDCT(Discrete Cosine Transform、離散コサイン変換)、DST(Discrete Sine Transform、離散サイン変換)等の周波数変換を行い量子化して得られる係数である。逆量子化・逆変換部311は変換係数について逆DCT、逆DST等の逆周波数変換を行い、予測誤差を算出する。逆量子化・逆変換部311は予測誤差を加算部312に出力する。
(Configuration of dimd_mode decoding)
The inverse quantization and inverse transform unit 311 inverse quantizes the quantized transform coefficients input from the prediction parameter derivation unit 320 to obtain transform coefficients. These quantized transform coefficients are obtained in the encoding process by applying a frequency transform such as DCT (Discrete Cosine Transform) or DST (Discrete Sine Transform) to the prediction error and quantizing the result. The inverse quantization and inverse transform unit 311 applies an inverse frequency transform such as inverse DCT or inverse DST to the transform coefficients to calculate the prediction error, and outputs the prediction error to the adder 312.
 図18は、本実施形態の逆量子化・逆変換部311の構成を示すブロック図である。逆量子化・逆変換部311は、スケーリング部31111、逆非分離変換部31121、逆分離変換部31123から構成される。なお、角度モード導出装置310465の導出した角度モードを用いて、符号化データから復号した変換係数を変換してもよい。 FIG. 18 is a block diagram showing the configuration of the inverse quantization and inverse transform unit 311 of this embodiment. The inverse quantization and inverse transform unit 311 is composed of a scaling unit 31111, an inverse non-separable transform unit 31121, and an inverse separate transform unit 31123. Note that the transform coefficients decoded from the encoded data may be transformed using the angle mode derived by the angle mode derivation device 310465.
 逆量子化・逆変換部311は、予測パラメータ導出部320から入力された量子化変換係数qd[][]をスケーリング部31111によりスケーリング(逆量子化)して変換係数d[][]を求める。この量子化変換係数qd[][]は、符号化処理において、予測誤差に対してDCT(Discrete Cosine Transform、離散コサイン変換)、DST(Discrete Sine Transform、離散サイン変換)等の変換を行い量子化して得られる係数、もしくは、変換後の係数をさらに非分離変換した係数である。逆量子化・逆変換部311は、非分離変換フラグlfnst_idx!=0の場合、逆非分離変換部31121により逆変換を行う。さらに変換係数について逆DCT、逆DST等の逆周波数変換を行い、予測誤差を算出する。また、lfnst_idx==0の場合、逆非分離変換部31121で処理を行わず、スケーリング部31111によりスケーリングされた変換係数に逆DCT、逆DST等の逆周波数変換を行い、予測誤差を算出する。逆量子化・逆変換部311は予測誤差を加算部312に出力する。 The inverse quantization and inverse transform unit 311 scales (inverse quantizes) the quantized transform coefficients qd[][] input from the prediction parameter derivation unit 320 using the scaling unit 31111 to obtain the transform coefficients d[][]. The quantized transform coefficients qd[][] are coefficients obtained in the encoding process by applying a transform such as DCT (Discrete Cosine Transform) or DST (Discrete Sine Transform) to the prediction error and quantizing the result, or coefficients obtained by further applying a non-separable transform to the transformed coefficients. When the non-separable transform flag lfnst_idx != 0, the inverse quantization and inverse transform unit 311 performs an inverse transform using the inverse non-separable transform unit 31121, and then applies an inverse frequency transform such as inverse DCT or inverse DST to the transform coefficients to calculate the prediction error. When lfnst_idx == 0, the inverse non-separable transform unit 31121 performs no processing, and an inverse frequency transform such as inverse DCT or inverse DST is applied to the transform coefficients scaled by the scaling unit 31111 to calculate the prediction error. The inverse quantization and inverse transform unit 311 outputs the prediction error to the adder 312.
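The control flow around lfnst_idx can be summarized as follows. The three helper functions stand in for the scaling unit 31111, the inverse non-separable transform unit 31121, and the inverse separable transform unit 31123 of Fig. 18; they and the coefficient buffer bound are assumptions left abstract here.

/* Abstract helpers (assumed): the units of Fig. 18. */
extern void scaling(const int *qd, int *d, int n);             /* 31111 */
extern void inv_non_separable(const int *in, int *out, int n); /* 31121 */
extern void inv_separable(const int *d, int *res, int n);      /* 31123 */

/* Sketch: the inverse non-separable transform runs only when
 * lfnst_idx != 0, before the inverse separable (DCT/DST) transform. */
void inv_quant_transform(int lfnst_idx, const int *qd, int *res, int n)
{
    int d[1024];                    /* scaled coefficients (assumed bound) */
    scaling(qd, d, n);              /* inverse quantization */
    if (lfnst_idx != 0)
        inv_non_separable(d, d, n); /* skipped when lfnst_idx == 0 */
    inv_separable(d, res, n);       /* inverse DCT / inverse DST */
}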
 加算部312は、予測画像生成部308から入力されたブロックの予測画像と逆量子化・逆変換部311から入力された予測誤差を画素毎に加算して、ブロックの復号画像を生成する。加算部312はブロックの復号画像を参照ピクチャメモリ306に記憶し、また、ループフィルタ305に出力する。 The adder 312 adds, for each pixel, the predicted image of the block input from the predicted image generation unit 308 and the prediction error input from the inverse quantization and inverse transform unit 311 to generate a decoded image of the block. The adder 312 stores the decoded image of the block in the reference picture memory 306, and also outputs it to the loop filter 305.
  (動画像符号化装置の構成)
 次に、本実施形態に係る動画像符号化装置11の構成について説明する。図19は、本実施形態に係る動画像符号化装置11の構成を示すブロック図である。動画像符号化装置11は、予測画像生成部101、減算部102、変換・量子化部103、逆量子化・逆変換部105、加算部106、ループフィルタ107、予測パラメータメモリ(予測パラメータ記憶部、フレームメモリ)108、参照ピクチャメモリ(参照画像記憶部、フレームメモリ)109、符号化パラメータ決定部110、パラメータ符号化部111、エントロピー符号化部104、予測パラメータ導出部120を含んで構成される。
(Configuration of the video encoding device)
Next, the configuration of the video encoding device 11 according to this embodiment will be described. Fig. 19 is a block diagram showing the configuration of the video encoding device 11 according to this embodiment. The video encoding device 11 includes a prediction image generating unit 101, a subtraction unit 102, a transformation/quantization unit 103, an inverse quantization/inverse transformation unit 105, an addition unit 106, a loop filter 107, a prediction parameter memory (prediction parameter storage unit, frame memory) 108, a reference picture memory (reference image storage unit, frame memory) 109, an encoding parameter determining unit 110, a parameter encoding unit 111, an entropy encoding unit 104, and a prediction parameter derivation unit 120.
 予測画像生成部101は画像Tの各ピクチャを分割した領域であるCU毎に予測画像を生成する。予測画像生成部101は既に説明した予測画像生成部308と同じ動作であり、説明を省略する。 The predicted image generating unit 101 generates a predicted image for each CU, which is an area obtained by dividing each picture of the image T. The predicted image generating unit 101 operates in the same way as the predicted image generating unit 308 already explained, and so a description thereof will be omitted.
 減算部102は、予測画像生成部101から入力されたブロックの予測画像の画素値を、画像Tの画素値から減算して予測誤差を生成する。減算部102は予測誤差を変換・量子化部103に出力する。 The subtraction unit 102 subtracts the pixel values of the predicted image of the block input from the predicted image generation unit 101 from the pixel values of image T to generate a prediction error. The subtraction unit 102 outputs the prediction error to the transformation and quantization unit 103.
 変換・量子化部103は、減算部102から入力された予測誤差に対し、周波数変換によって変換係数を算出し、量子化によって量子化変換係数を導出する。変換・量子化部103は、量子化変換係数をエントロピー符号化部104及び逆量子化・逆変換部105、および符号化パラメータ決定部110に出力する。 The transform/quantization unit 103 calculates transform coefficients by frequency transforming the prediction error input from the subtraction unit 102, and derives quantized transform coefficients by quantizing it. The transform/quantization unit 103 outputs the quantized transform coefficients to the entropy coding unit 104, the inverse quantization/inverse transform unit 105, and the coding parameter determination unit 110.
 逆量子化・逆変換部105は、動画像復号装置31における逆量子化・逆変換部311(図4)と同じであり、説明を省略する。算出した予測誤差は加算部106に出力される。 The inverse quantization and inverse transform unit 105 is the same as the inverse quantization and inverse transform unit 311 (FIG. 4) in the video decoding device 31, and a description thereof will be omitted. The calculated prediction error is output to the addition unit 106.
 エントロピー符号化部104には、パラメータ符号化部111から予測パラメータ、量子化変換係数が入力される。エントロピー符号化部104は、分割情報、予測パラメータ、量子化変換係数等をエントロピー符号化して符号化ストリームTeを生成し、出力する。 The entropy coding unit 104 receives prediction parameters and quantized transform coefficients from the parameter coding unit 111. The entropy coding unit 104 entropy codes the split information, prediction parameters, quantized transform coefficients, etc. to generate and output an encoded stream Te.
 パラメータ符号化部111は、予測パラメータ導出部120で導出した予測パラメータ、量子化係数等の符号化をエントロピー符号化部104に指示する。 The parameter coding unit 111 instructs the entropy coding unit 104 to code the prediction parameters, quantization coefficients, etc. derived by the prediction parameter derivation unit 120.
 予測パラメータ導出部120は、符号化パラメータ決定部110から入力されたパラメータ等からシンタックス要素を導出する。予測パラメータ導出部120は、予測パラメータ導出部320の構成と、一部同一の構成を含む。 The prediction parameter derivation unit 120 derives syntax elements from the parameters input from the encoding parameter determination unit 110. The prediction parameter derivation unit 120 includes a configuration that is partially the same as the configuration of the prediction parameter derivation unit 320.
 加算部106は、予測画像生成部101から入力されたブロックの予測画像の画素値と逆量子化・逆変換部105から入力された予測誤差を画素毎に加算して復号画像を生成する。加算部106は生成した復号画像を参照ピクチャメモリ109に記憶する。 The adder 106 generates a decoded image by adding, for each pixel, the pixel values of the predicted image of the block input from the predicted image generation unit 101 and the prediction error input from the inverse quantization and inverse transform unit 105. The adder 106 stores the generated decoded image in the reference picture memory 109.
 ループフィルタ107は加算部106が生成した復号画像に対し、デブロッキングフィルタ、SAO、ALFを施す。なお、ループフィルタ107は、必ずしも上記3種類のフィルタを含まなくてもよく、例えばデブロッキングフィルタのみの構成であってもよい。 The loop filter 107 applies a deblocking filter, SAO, and ALF to the decoded image generated by the adder 106. Note that the loop filter 107 does not necessarily have to include the above three types of filters, and may be configured, for example, as only a deblocking filter.
 予測パラメータメモリ108は、予測パラメータ導出部120から入力された予測パラメータを、対象ピクチャ及びCU毎に予め定めた位置に記憶する。 The prediction parameter memory 108 stores the prediction parameters input from the prediction parameter derivation unit 120 in a predetermined location for each target picture and CU.
 参照ピクチャメモリ109は、ループフィルタ107が生成した復号画像を対象ピクチャ及びCU毎に予め定めた位置に記憶する。 The reference picture memory 109 stores the decoded image generated by the loop filter 107 in a predetermined location for each target picture and CU.
 符号化パラメータ決定部110は、符号化パラメータの複数のセットのうち、1つのセットを選択する。符号化パラメータとは、上述したQT、BTあるいはTT分割情報、予測パラメータ、あるいはこれらに関連して生成される符号化の対象となるパラメータである。予測画像生成部101は、これらの符号化パラメータを用いて予測画像を生成する。 The coding parameter determination unit 110 selects one set from among multiple sets of coding parameters. The coding parameters are the above-mentioned QT, BT or TT division information, prediction parameters, or parameters to be coded that are generated in relation to these. The predicted image generation unit 101 generates a predicted image using these coding parameters.
 符号化パラメータ決定部110は、複数のセットの各々について情報量の大きさと符号化誤差を示すRDコスト値を算出する。符号化パラメータ決定部110は、算出したコスト値が最小となる符号化パラメータのセットを選択する。これにより、エントロピー符号化部104は、選択した符号化パラメータのセットを符号化ストリームTeとして出力する。符号化パラメータ決定部110は決定した符号化パラメータを予測パラメータメモリ108に記憶する。 The coding parameter determination unit 110 calculates an RD cost value indicating the amount of information and the coding error for each of the multiple sets. The coding parameter determination unit 110 selects the set of coding parameters that minimizes the calculated cost value. As a result, the entropy coding unit 104 outputs the selected set of coding parameters as the coding stream Te. The coding parameter determination unit 110 stores the determined coding parameters in the prediction parameter memory 108.
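The selection in the coding parameter determination unit 110 is a rate-distortion minimization; the sketch below assumes the usual cost model D + lambda*R, which the text does not spell out.

/* Sketch: pick the parameter set with the smallest RD cost.
 * dist[i] is the coding error (distortion) and rate[i] the amount of
 * information (bits) of candidate set i; lambda is the Lagrange
 * multiplier (assumed cost model). */
static int select_best_set(const double *dist, const double *rate,
                           int num_sets, double lambda)
{
    int best = 0;
    double best_cost = dist[0] + lambda * rate[0];
    for (int i = 1; i < num_sets; i++) {
        double cost = dist[i] + lambda * rate[i];
        if (cost < best_cost) { best_cost = cost; best = i; }
    }
    return best;
}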
 なお、上述した実施形態における動画像符号化装置11、動画像復号装置31の一部、例えば、エントロピー復号部301、パラメータ復号部302、ループフィルタ305、予測画像生成部308、逆量子化・逆変換部311、加算部312、予測画像生成部101、減算部102、変換・量子化部103、エントロピー符号化部104、逆量子化・逆変換部105、ループフィルタ107、符号化パラメータ決定部110、パラメータ符号化部111をコンピュータで実現するようにしてもよい。その場合、この制御機能を実現するためのプログラムをコンピュータ読み取り可能な記録媒体に記録して、この記録媒体に記録されたプログラムをコンピュータシステムに読み込ませ、実行することによって実現してもよい。なお、ここでいう「コンピュータシステム」とは、動画像符号化装置11、動画像復号装置31のいずれかに内蔵されたコンピュータシステムであって、OSや周辺機器等のハードウェアを含むものとする。また、「コンピュータ読み取り可能な記録媒体」とは、フレキシブルディスク、光磁気ディスク、ROM、CD-ROM等の可搬媒体、コンピュータシステムに内蔵されるハードディスク等の記憶装置のことをいう。さらに「コンピュータ読み取り可能な記録媒体」とは、インターネット等のネットワークや電話回線等の通信回線を介してプログラムを送信する場合の通信線のように、短時間、動的にプログラムを保持するもの、その場合のサーバやクライアントとなるコンピュータシステム内部の揮発性メモリのように、一定時間プログラムを保持しているものも含んでもよい。また上記プログラムは、前述した機能の一部を実現するためのものであっても良く、さらに前述した機能をコンピュータシステムにすでに記録されているプログラムとの組み合わせで実現できるものであってもよい。 Note that a part of the video encoding device 11 and video decoding device 31 in the above-mentioned embodiment, for example, the entropy decoding unit 301, the parameter decoding unit 302, the loop filter 305, the predicted image generating unit 308, the inverse quantization and inverse transform unit 311, the addition unit 312, the predicted image generating unit 101, the subtraction unit 102, the transform and quantization unit 103, the entropy encoding unit 104, the inverse quantization and inverse transform unit 105, the loop filter 107, the encoding parameter determination unit 110, and the parameter encoding unit 111 may be realized by a computer. In this case, a program for realizing this control function may be recorded on a computer-readable recording medium, and the program recorded on the recording medium may be read into and executed by a computer system. Note that the "computer system" referred to here is a computer system built into either the video encoding device 11 or the video decoding device 31, and includes hardware such as an OS and peripheral devices. Additionally, "computer-readable recording media" refers to portable media such as flexible disks, optical magnetic disks, ROMs, and CD-ROMs, as well as storage devices such as hard disks built into computer systems. Furthermore, "computer-readable recording media" may also include devices that dynamically store a program for a short period of time, such as a communication line when transmitting a program via a network such as the Internet or a communication line such as a telephone line, or devices that store a program for a certain period of time, such as volatile memory within a computer system that serves as a server or client in such cases. Furthermore, the above-mentioned program may be one that realizes part of the functions described above, or may be one that can realize the functions described above in combination with a program already recorded in the computer system.
 また、上述した実施形態における動画像符号化装置11、動画像復号装置31の一部、または全部を、LSI(Large Scale Integration)等の集積回路として実現してもよい。動画像符号化装置11、動画像復号装置31の各機能ブロックは個別にプロセッサ化してもよいし、一部、または全部を集積してプロセッサ化してもよい。また、集積回路化の手法はLSIに限らず専用回路、または汎用プロセッサで実現してもよい。また、半導体技術の進歩によりLSIに代替する集積回路化の技術が出現した場合、当該技術による集積回路を用いてもよい。 Furthermore, part or all of the video encoding device 11 and video decoding device 31 in the above-mentioned embodiments may be realized as an integrated circuit such as an LSI (Large Scale Integration). Each functional block of the video encoding device 11 and video decoding device 31 may be individually made into a processor, or part or all of them may be integrated into a processor. The integrated circuit method is not limited to LSI, and may be realized by a dedicated circuit or a general-purpose processor. Furthermore, if an integrated circuit technology that can replace LSI appears due to advances in semiconductor technology, an integrated circuit based on that technology may be used.
 以上、図面を参照してこの発明の一実施形態について詳しく説明してきたが、具体的な構成は上述のものに限られることはなく、この発明の要旨を逸脱しない範囲内において様々な設計変更等をすることが可能である。 Although one embodiment of the present invention has been described in detail above with reference to the drawings, the specific configuration is not limited to that described above, and various design changes can be made without departing from the spirit of the present invention.
 (まとめ)
 本発明の態様1に係る画像復号装置は、DIMDモードに応じて対象ブロックの隣接画像を選択する参照サンプル導出部と、選択された隣接画像を用いて、画素単位の勾配を導出する勾配導出部、勾配からイントラ予測モードを導出する角度モード選択部を備える。
(Summary)
An image decoding device according to aspect 1 of the present invention includes a reference sample derivation unit that selects an adjacent image of a target block in accordance with a DIMD mode, a gradient derivation unit that derives a pixel-by-pixel gradient using the selected adjacent image, and an angle mode selection unit that derives an intra prediction mode from the gradient.
 本発明の態様2に係る画像復号装置は、上記態様1において、符号化データから上記対象ブロックのDIMDフラグとDIMDモードを復号するエントロピー復号部を備え、上記DIMDフラグがtrueの場合に、さらにDIMDモードを復号し、さらに導出されたイントラ予測モードを用いて予測画像生成を行う予測画像生成部を備えることを特徴とする。 The image decoding device according to aspect 2 of the present invention is characterized in that, in the above aspect 1, it includes an entropy decoding unit that decodes the DIMD flag and DIMD mode of the target block from the encoded data, and further includes a predicted image generating unit that, when the DIMD flag is true, further decodes the DIMD mode and generates a predicted image using the derived intra prediction mode.
 本発明の態様3に係る画像復号装置は、上記態様1~2のいずれかにおいて、DIMDモードは、少なくとも、上と左、左、上を上記隣接画像として切り替えることを特徴とする。 The image decoding device according to aspect 3 of the present invention is characterized in that, in any of aspects 1 and 2 above, the DIMD mode switches the adjacent image at least among the top-and-left, the left, and the top.
 本発明の態様4に係る画像復号装置は、上記態様1~3のいずれかにおいて、DIMDモードは、第1ビットと第2ビットから構成され、第1ビットは上と左か否か、第2ビットで左もしくは上かを選択肢として、上記隣接画像を選択することを特徴とする。 The image decoding device according to aspect 4 of the present invention is characterized in that, in any of aspects 1 to 3 above, the DIMD mode is composed of a first bit and a second bit, the first bit indicating whether both the top and left are used, and the second bit selecting between the left and the top, whereby the adjacent image is selected.
 本発明の態様5に係る画像復号装置は、上記態様1~4のいずれかにおいて、上記エントロピー復号部は、上記第1ビットの復号には確率を保持するコンテキストを用い、上記第2ビットはコンテキストを用いない等確率を用いて、上記DIMDモードを復号することを特徴とする。 The image decoding device according to aspect 5 of the present invention is characterized in that in any one of aspects 1 to 4, the entropy decoding unit decodes the DIMD mode using a context that holds a probability for decoding the first bit, and using an equal probability without using a context for decoding the second bit.
 本発明の態様6に係る画像復号装置は、上記態様1~5のいずれかにおいて、上記エントロピー復号部は、上記第1ビットと上記第2ビットの復号には確率を保持するコンテキストを用いて、上記DIMDモードを復号することを特徴とする。 The image decoding device according to aspect 6 of the present invention is characterized in that in any one of aspects 1 to 5, the entropy decoding unit decodes the DIMD mode using contexts that hold probabilities for decoding the first bit and the second bit.
 本発明の態様7に係る画像復号装置は、上記態様1~6のいずれかにおいて、上記エントロピー復号部は、対象ブロックの幅と高さを利用してコンテキストインデックスを導出することを特徴とする。 The image decoding device according to aspect 7 of the present invention is any one of aspects 1 to 6 above, characterized in that the entropy decoding unit derives the context index using the width and height of the target block.
 本発明の態様8に係る画像復号装置は、上記態様1~7のいずれかにおいて、上記エントロピー復号部は、上記第2ビットには対象ブロックが正方形か否かの判定を用いてコンテキストインデックスを導出することを特徴とする。 The image decoding device according to aspect 8 of the present invention is characterized in that, in any one of aspects 1 to 7 above, for the second bit, the entropy decoding unit derives a context index using a determination of whether the target block is square.
 本発明の態様9に係る画像復号装置は、上記態様1~8のいずれかにおいて、上記勾配導出部は、dimd_modeに応じて参照するライン数を変更することを特徴とする。 The image decoding device according to aspect 9 of the present invention is characterized in that in any one of aspects 1 to 8 above, the gradient derivation unit changes the number of lines to be referenced depending on dimd_mode.
 本発明の態様10に係る画像復号装置は、上記態様1~9のいずれかにおいて、上記勾配導出部は、上記対象ブロックのサイズに応じて参照するライン数を変更することを特徴とする。 The image decoding device according to aspect 10 of the present invention is characterized in that in any one of aspects 1 to 9 above, the gradient derivation unit changes the number of lines to be referenced depending on the size of the target block.
 本発明の態様11に係る画像復号装置は、上記態様1~10のいずれかにおいて、上記勾配導出部は、dimd_modeに応じて参照するライン数と参照方向を変更することを特徴とする。 An image decoding device according to aspect 11 of the present invention is characterized in that in any one of aspects 1 to 10 above, the gradient derivation unit changes the number of lines to be referenced and the reference direction according to dimd_mode.
 本発明の態様12に係る画像符号化装置は、DIMDモードに応じて対象ブロックの隣接画像を選択する参照サンプル導出部と、選択された隣接画像を用いて、画素単位の勾配を導出する勾配導出部、勾配からイントラ予測モードを導出する角度モード選択部を備える。 The image encoding device according to aspect 12 of the present invention includes a reference sample derivation unit that selects an adjacent image of a target block according to a DIMD mode, a gradient derivation unit that uses the selected adjacent image to derive a pixel-by-pixel gradient, and an angle mode selection unit that derives an intra prediction mode from the gradient.
 〔関連出願の相互参照〕
 本出願は、2022年10月11日に出願された日本国特許出願:特願2022-163200に対して優先権の利益を主張するものであり、それを参照することにより、その内容の全てが本書に含まれる。
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims the benefit of priority to Japanese patent application No. 2022-163200, filed on October 11, 2022, the entire contents of which are incorporated herein by reference.
 本発明の実施形態は、画像データが符号化された符号化データを復号する動画像復号装置、および、画像データが符号化された符号化データを生成する動画像符号化装置に好適に適用することができる。また、動画像符号化装置によって生成され、動画像復号装置によって参照される符号化データのデータ構造に好適に適用することができる。 Embodiments of the present invention can be suitably applied to a video decoding device that decodes coded data in which image data has been coded, and a video coding device that generates coded data in which image data has been coded. The present invention can also be suitably applied to the data structure of coded data that is generated by a video coding device and referenced by the video decoding device.
31 画像復号装置
301 エントロピー復号部
302 パラメータ復号部
308 予測画像生成部
31046 DIMD予測部
310460 参照サンプル導出部
310465 角度モード導出装置
310461 勾配導出部
310462 角度モード導出部
310463 角度モード選択部
310464 仮予測画像生成部
311 逆量子化・逆変換部
312 加算部
11 画像符号化装置
101 予測画像生成部
102 減算部
103 変換・量子化部
104 エントロピー符号化部
105 逆量子化・逆変換部
107 ループフィルタ
110 符号化パラメータ決定部
111 パラメータ符号化部
31 Image decoding device
301 Entropy decoding unit
302 Parameter decoding unit
308 Predicted image generation unit
31046 DIMD prediction unit
310460 Reference sample derivation unit
310465 Angle mode derivation device
310461 Gradient derivation unit
310462 Angle mode derivation unit
310463 Angle mode selection unit
310464 Provisional predicted image generation unit
311 Inverse quantization and inverse transform unit
312 Addition unit
11 Image encoding device
101 Predicted image generation unit
102 Subtraction unit
103 Transform and quantization unit
104 Entropy coding unit
105 Inverse quantization and inverse transform unit
107 Loop filter
110 Coding parameter determination unit
111 Parameter coding unit

Claims (12)

  1.  DIMDモードに応じて対象ブロックの隣接画像を選択する参照サンプル導出部と、選択された隣接画像を用いて、画素単位の勾配を導出する勾配導出部、勾配からイントラ予測モードを導出する角度モード選択部を備える画像復号装置。 An image decoding device that includes a reference sample derivation unit that selects adjacent images of a target block according to a DIMD mode, a gradient derivation unit that uses the selected adjacent images to derive pixel-by-pixel gradients, and an angle mode selection unit that derives an intra prediction mode from the gradient.
  2.  符号化データから上記対象ブロックのDIMDフラグとDIMDモードを復号するエントロピー復号部を備え、上記DIMDフラグがtrueの場合に、さらにDIMDモードを復号し、さらに導出されたイントラ予測モードを用いて予測画像生成を行う予測画像生成部を備えることを特徴とする請求項1に記載の画像復号装置。 The image decoding device according to claim 1, further comprising an entropy decoding unit that decodes the DIMD flag and DIMD mode of the target block from the encoded data, and a predicted image generating unit that further decodes the DIMD mode when the DIMD flag is true, and further generates a predicted image using the derived intra prediction mode.
  3.  DIMDモードは、少なくとも、上と左、左、上を上記隣接画像として切り替えることを特徴とする請求項1に記載の画像復号装置。 The image decoding device according to claim 1, characterized in that the DIMD mode switches the adjacent image at least among the top-and-left, the left, and the top.
  4.  DIMDモードは、第1ビットと第2ビットから構成され、第1ビットは上と左か否か、第2ビットで左もしくは上かを選択肢として、上記隣接画像を選択することを特徴とする請求項2に記載の画像復号装置。 The image decoding device according to claim 2, characterized in that the DIMD mode is composed of a first bit and a second bit, the first bit indicating whether both the top and left are used, and the second bit selecting between the left and the top, whereby the adjacent image is selected.
  5.  上記エントロピー復号部は、上記第1ビットの復号には確率を保持するコンテキストを用い、上記第2ビットはコンテキストを用いない等確率を用いて、上記DIMDモードを復号することを特徴とする請求項4に記載の画像復号装置。 The image decoding device according to claim 4, characterized in that the entropy decoding unit decodes the DIMD mode using a context that holds a probability for decoding the first bit, and using equal probability without using a context for decoding the second bit.
  6.  上記エントロピー復号部は、上記第1ビットと上記第2ビットの復号には確率を保持するコンテキストを用いて、上記DIMDモードを復号することを特徴とする請求項4に記載の画像復号装置。 The image decoding device according to claim 4, characterized in that the entropy decoding unit decodes the DIMD mode using a context that holds a probability for decoding the first bit and the second bit.
  7.  上記エントロピー復号部は、対象ブロックの幅と高さを利用してコンテキストインデックスを導出することを特徴とする請求項2に記載の画像復号装置。 The image decoding device according to claim 2, characterized in that the entropy decoding unit derives the context index using the width and height of the target block.
  8.  上記エントロピー復号部は、上記第2ビットには対象ブロックが正方形か否かの判定を用いてコンテキストインデックスを導出することを特徴とする請求項4に記載の画像復号装置。 The image decoding device according to claim 4, characterized in that, for the second bit, the entropy decoding unit derives a context index using a determination of whether the target block is square.
  9.  上記勾配導出部は、dimd_modeに応じて参照するライン数を変更することを特徴とする請求項1に記載の画像復号装置。 The image decoding device according to claim 1, characterized in that the gradient derivation unit changes the number of lines to be referenced depending on dimd_mode.
  10.  上記勾配導出部は、上記対象ブロックのサイズに応じて参照するライン数を変更することを特徴とする請求項1に記載の画像復号装置。 The image decoding device according to claim 1, characterized in that the gradient derivation unit changes the number of lines to be referenced depending on the size of the target block.
  11.  上記勾配導出部は、dimd_modeに応じて参照するライン数と参照方向を変更することを特徴とする請求項1に記載の画像復号装置。 The image decoding device according to claim 1, characterized in that the gradient derivation unit changes the number of lines to be referenced and the reference direction according to dimd_mode.
  12.  DIMDモードに応じて対象ブロックの隣接画像を選択する参照サンプル導出部と、選択された隣接画像を用いて、画素単位の勾配を導出する勾配導出部、勾配からイントラ予測モードを導出する角度モード選択部を備える画像符号化装置。 An image encoding device that includes a reference sample derivation unit that selects an adjacent image of a target block according to a DIMD mode, a gradient derivation unit that uses the selected adjacent image to derive a pixel-by-pixel gradient, and an angle mode selection unit that derives an intra prediction mode from the gradient.
PCT/JP2023/036356 2022-10-11 2023-10-05 Image decoding device and image encoding device WO2024080216A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2022-163200 2022-10-11
JP2022163200A JP2024056375A (en) 2022-10-11 2022-10-11 Image decoding device and image encoding device

Publications (1)

Publication Number Publication Date
WO2024080216A1 true WO2024080216A1 (en) 2024-04-18

Family

ID=90669197

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2023/036356 WO2024080216A1 (en) 2022-10-11 2023-10-05 Image decoding device and image encoding device

Country Status (2)

Country Link
JP (1) JP2024056375A (en)
WO (1) WO2024080216A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2012191295A (en) * 2011-03-09 2012-10-04 Canon Inc Image coding apparatus, image coding method, program, image decoding apparatus, image decoding method, and program
WO2018110462A1 (en) * 2016-12-16 2018-06-21 シャープ株式会社 Image decoding device and image encoding device
WO2019007492A1 (en) * 2017-07-04 2019-01-10 Huawei Technologies Co., Ltd. Decoder side intra mode derivation tool line memory harmonization with deblocking filter
US20190166370A1 (en) * 2016-05-06 2019-05-30 Vid Scale, Inc. Method and system for decoder-side intra mode derivation for block-based video coding
JP2019535211A (en) * 2016-10-14 2019-12-05 インダストリー アカデミー コーオペレイション ファウンデーション オブ セジョン ユニバーシティ Image encoding / decoding method and apparatus

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Z. FAN, Y. YASUGI, T. IKAI (SHARP): "Non-EE2: Adaptive reference region DIMD", 28. JVET MEETING; 20221021 - 20221028; MAINZ; (THE JOINT VIDEO EXPLORATION TEAM OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16 ), 14 October 2022 (2022-10-14), XP030304493 *

Also Published As

Publication number Publication date
JP2024056375A (en) 2024-04-23
