WO2018159526A1

WO2018159526A1 - Moving picture coding device and moving picture decoding device

Info

Publication number: WO2018159526A1
Application number: PCT/JP2018/006937
Authority: WO
Inventors: 友子青野; 知宏猪飼
Original assignee: Sharp Corporation (シャープ株式会社)
Priority date: 2017-03-03
Filing date: 2018-02-26
Publication date: 2018-09-07
Also published as: JP2020072277A

Abstract

The definition of a flag that indicates the number of non-zero transform coefficients in a coding unit (CU) is changed to reduce the amount of code for LAST, which indicates the position of the last non-zero transform coefficient in the CU.

Description

Video encoding apparatus and video decoding apparatus

One embodiment of the present invention relates to an image decoding device and an image encoding device.

In order to efficiently transmit or record a moving image, an image encoding device that generates encoded data by encoding the moving image and an image decoding device that generates a decoded image by decoding the encoded data are used.

Specific examples of moving image coding schemes include those proposed in H.264/AVC and HEVC (High-Efficiency Video Coding).

In such a moving image coding scheme, each image (picture) constituting a moving image is managed in a hierarchical structure consisting of slices obtained by dividing the image, coding units (CUs) obtained by dividing a slice, and prediction units (PUs) and transform units (TUs) obtained by dividing a coding unit, and is encoded and decoded accordingly.

In such a moving image coding scheme, a predicted image is usually generated based on a locally decoded image obtained by encoding and decoding an input image, and the prediction residual obtained by subtracting the predicted image from the input image (original image), sometimes called a "difference image" or "residual image", is encoded. Methods for generating a predicted image include inter-screen prediction (inter prediction) and intra-screen prediction (intra prediction).

The moving image encoding device encodes quantized transform coefficients obtained by applying an orthogonal transform and quantization to the prediction residual. The moving image decoding device decodes the quantized transform coefficients from the encoded data and applies inverse quantization and an inverse orthogonal transform to restore the prediction error (Non-Patent Document 1). In the encoding and decoding of quantized transform coefficients, the processing is split into position information and level information for each quantized transform coefficient in the CU, which removes redundancy and reduces the code amount (Non-Patent Document 2).

In Non-Patent Document 2, the quantized transform coefficients in a CU are encoded by splitting them into several syntax elements: a first flag indicating the presence or absence of a non-zero quantized transform coefficient in the CU; LAST, which indicates the position of the last non-zero quantized transform coefficient in scan order; a second flag indicating the presence or absence of a non-zero quantized transform coefficient in each sub-block obtained by dividing the CU; a third flag indicating whether each quantized transform coefficient in a sub-block is non-zero; and syntax indicating the level (magnitude) of each non-zero quantized transform coefficient. Thus the quantized transform coefficients are not encoded directly but are split into several pieces of information, which reduces the code amount. However, although LAST is only a single coordinate, its code amount depends little on the quantization parameter and is large.
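As a rough illustration of the decomposition described above, the following Python sketch splits a list of quantized transform coefficients, already arranged in scan order, into the five kinds of information. It is a minimal sketch under stated assumptions and not the method of Non-Patent Document 2: the function name, the 16-coefficient sub-block size, the signed levels, and the use of a scan index (rather than a coordinate) for LAST are all hypothetical.

```python
def decompose_coefficients(scanned, sub_block_size=16):
    """Split scan-ordered quantized coefficients into flag, position and level info.

    Assumptions (not from the source): `scanned` is already in scan order,
    sub-blocks are consecutive runs of `sub_block_size` coefficients, and
    LAST is reported as a scan index instead of an (x, y) coordinate.
    """
    nonzero = [i for i, c in enumerate(scanned) if c != 0]
    if not nonzero:
        # First flag = 0: the CU has no non-zero coefficient, nothing else is needed.
        return {"first_flag": 0}

    last = nonzero[-1]   # position of the last non-zero coefficient in scan order
    second_flags = []    # one flag per sub-block, up to the sub-block holding LAST
    third_flags = []     # per-coefficient non-zero flags for sub-blocks that are coded
    levels = []          # non-zero coefficient values (kept signed for simplicity)

    for start in range(0, last + 1, sub_block_size):
        block = scanned[start:min(start + sub_block_size, last + 1)]
        has_nonzero = any(c != 0 for c in block)
        second_flags.append(1 if has_nonzero else 0)
        if has_nonzero:
            third_flags.append([1 if c != 0 else 0 for c in block])
            levels.extend(c for c in block if c != 0)

    return {
        "first_flag": 1,
        "last": last,
        "second_flags": second_flags,
        "third_flags": third_flags,
        "levels": levels,
    }


# 48 coefficients: two non-zero values near DC and one isolated value at index 32.
example = [5, 0, -2] + [0] * 29 + [1] + [0] * 15
info = decompose_coefficients(example)
```

In this example the single value at index 32 forces LAST to 32 even though only three coefficients are non-zero, which hints at why the position information can be costly compared with the levels themselves.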

One aspect of the present invention has been made in view of the above problem, and its object is to provide an image decoding device and an image encoding device that can improve coding efficiency by reducing the code amount of LAST, which indicates the position of the last non-zero quantized transform coefficient in scan order.

An image encoding device according to an aspect of the present invention includes a dividing unit that divides one screen of an input moving image into coding units (CUs) each including a plurality of pixels, a transform unit that performs a predetermined transform in units of CUs to obtain transform coefficients, and a variable-length coding unit that performs variable-length coding on the transform coefficients. The variable-length coding unit includes: a first determination unit that determines the value of a first flag indicating whether a non-zero transform coefficient exists in the CU; a second determination unit that determines the value of a second flag indicating whether non-zero transform coefficients exist only in a limited region of the CU; a derivation unit that scans the transform coefficients in the CU in scan order from the DC component and derives syntax indicating the farthest non-zero position (LAST) and the non-zero coefficient values (LEVEL); a first encoding unit that refers to an encoding parameter and switches which of the first flag and the second flag is variable-length encoded; a second encoding unit that, when the first flag is variable-length encoded and a non-zero transform coefficient exists in the CU, encodes the syntax indicating LAST and LEVEL; and a third encoding unit that, when the second flag is variable-length encoded, encodes the syntax indicating LEVEL if non-zero transform coefficients exist only in the limited region of the CU, and encodes the syntax indicating LAST and LEVEL if a non-zero transform coefficient exists outside the limited region of the CU.

An image decoding device according to an aspect of the present invention includes a variable-length decoding unit that performs variable-length decoding of encoded data using a coding unit (CU) including a plurality of pixels as a processing unit and outputs syntax, and a derivation unit that derives transform coefficients from the syntax. The variable-length decoding unit includes: a first decoding unit that refers to an encoding parameter and switches which of a first flag, indicating whether a non-zero transform coefficient exists in the CU, and a second flag, indicating whether non-zero transform coefficients exist only in a limited region of the CU, is variable-length decoded; a second decoding unit that, when the first flag is variable-length decoded and indicates that a non-zero transform coefficient exists in the CU, variable-length decodes the syntax indicating LAST and LEVEL; and a third decoding unit that, when the second flag is variable-length decoded, sets LAST to the value indicating the highest-frequency component in the limited region and variable-length decodes the syntax indicating LEVEL if the second flag indicates that non-zero transform coefficients exist only in the limited region of the CU, and variable-length decodes the syntax indicating LAST and LEVEL if a non-zero transform coefficient exists outside the limited region of the CU.

According to one aspect of the present invention, it is possible to improve the image quality of moving images and improve the encoding efficiency.

FIG. 1 is a schematic diagram showing the configuration of an image transmission system according to an embodiment of the present invention.
A diagram showing the hierarchical structure of the data of an encoded stream according to an embodiment of the present invention.
A diagram showing the patterns of the PU partition modes. (a) to (h) show the partition shapes for the PU partition modes 2Nx2N, 2NxN, 2NxnU, 2NxnD, Nx2N, nLx2N, nRx2N, and NxN, respectively.
A conceptual diagram showing an example of reference pictures and reference picture lists.
A block diagram showing the configuration of an image decoding device according to an embodiment of the present invention.
A block diagram showing the configuration of an image encoding device according to an embodiment of the present invention.
A diagram showing the syntax of transform coefficients and the decoding process.
A diagram showing the configurations of a transmission device equipped with the image encoding device and a reception device equipped with the image decoding device according to an embodiment of the present invention. (a) shows the transmission device equipped with the image encoding device, and (b) shows the reception device equipped with the image decoding device.
A diagram showing the configurations of a recording device equipped with the image encoding device and a playback device equipped with the image decoding device according to an embodiment of the present invention. (a) shows the recording device equipped with the image encoding device, and (b) shows the playback device equipped with the image decoding device.
A block diagram explaining the entropy decoding unit according to an embodiment of the present invention.
A block diagram explaining the entropy encoding unit according to an embodiment of the present invention.
An example of a variable-length code table.
A flowchart showing the operation of the transform coefficient decoding process.
A flowchart showing the operation of the transform coefficient encoding process.
A flowchart showing the operation of the transform coefficient encoding process.
A diagram showing the ratio of the code amount of LAST to the total code amount.
A diagram showing the relationship between the quantization parameter, the CU size, and the number of non-zero transform coefficients.
A diagram showing the definition of cbf.
A flowchart showing the operation of the transform coefficient decoding process when the definition of cbf is changed.
Another flowchart showing the operation of the transform coefficient decoding process when the definition of cbf is changed.
Another flowchart showing the operation of the transform coefficient decoding process when the definition of cbf is changed.
A flowchart showing the operation of the transform coefficient encoding process when the definition of cbf is changed.
Another flowchart showing the operation of the transform coefficient encoding process when the definition of cbf is changed.
Another flowchart showing the operation of the transform coefficient encoding process when the definition of cbf is changed.
An example in which the definition of cbf is switched depending on the quantization parameter and the CU size.
Another flowchart showing the operation of the transform coefficient decoding process when the definition of cbf is switched.
Another flowchart showing the operation of the transform coefficient encoding process when the definition of cbf is switched.
A diagram explaining how LAST is represented.
A diagram showing the code amount required to encode LAST.
A flowchart showing the operation of the LAST decoding process.
A flowchart showing the operation of the LAST encoding process.
An example of switching the encoding method of LAST depending on the quantization parameter and the CU size.
Another example of a variable-length code table.
Another example of a variable-length code table.
A diagram comparing the variable-length code tables.
An example of switching the variable-length code table according to the quantization parameter and the CU size.
A flowchart showing the operation of determining the variable-length code table and performing the encoding and decoding of LAST.
A diagram showing scanning directions.
A diagram showing the scanning directions and the code amount required to encode LAST.
A diagram showing a target CU and its adjacent CUs.
A diagram showing the correspondence between the estimated number of non-zero transform coefficients of the target CU and the variable-length code table to be selected.

(Embodiment 1)
Hereinafter, embodiments of the present invention will be described with reference to the drawings.

FIG. 1 is a schematic diagram showing a configuration of an image transmission system 1 according to the present embodiment.

The image transmission system 1 is a system that transmits a code obtained by encoding an encoding target image, decodes the transmitted code, and displays an image. The image transmission system 1 includes an image encoding device 11, a network 21, an image decoding device 31, and an image display device 41.

The image encoding device 11 receives an image T representing a single-layer image or images of a plurality of layers. A layer is a concept used to distinguish a plurality of pictures when one or more pictures constitute a given time instant. For example, encoding the same picture in a plurality of layers that differ in image quality or resolution is scalable coding, and encoding pictures of different viewpoints in a plurality of layers is view scalable coding. When prediction is performed between pictures of a plurality of layers (inter-layer prediction, inter-view prediction), coding efficiency improves greatly. Even when no such prediction is performed (simulcast), the encoded data can be bundled together.

The network 21 transmits the encoded stream Te generated by the image encoding device 11 to the image decoding device 31. The network 21 is the Internet, a wide area network (WAN: Wide Area Network), a local area network (LAN: Local Area Network), or a combination thereof. The network 21 is not necessarily limited to a bidirectional communication network and may be a unidirectional communication network that transmits broadcast waves, such as terrestrial digital broadcasting or satellite broadcasting. The network 21 may also be replaced with a storage medium on which the encoded stream Te is recorded, such as a DVD (Digital Versatile Disc) or a BD (Blu-ray Disc).

The image decoding device 31 decodes each encoded stream Te transmitted over the network 21 and generates one or more decoded images Td.

The image display device 41 displays all or part of the one or more decoded images Td generated by the image decoding device 31. The image display device 41 includes, for example, a display device such as a liquid crystal display or an organic EL (electroluminescence) display. In spatial scalable coding and SNR scalable coding, when the image decoding device 31 and the image display device 41 have high processing capability, a high-quality enhancement layer image is displayed; when they have only lower processing capability, a base layer image is displayed, which does not require processing and display capability as high as that of the enhancement layer.

<Operator>
The operators used in this specification are described below.

>> is a right bit shift, << is a left bit shift, & is a bitwise AND, | is a bitwise OR, and |= is an OR assignment (the logical sum with another condition).

x ? y : z is a ternary operator that takes the value y when x is true (non-zero) and z when x is false (0).

Clip3(a, b, c) is a function that clips c to a value between a and b inclusive: it returns a if c < a, b if c > b, and c otherwise (where a <= b).
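The two value-selecting operators defined above can be written out as small Python functions. This is only an illustrative sketch; the Python names clip3 and ternary are chosen here and do not appear in the specification.

```python
def clip3(a, b, c):
    """Clip c to the closed range [a, b]; the caller must ensure a <= b."""
    if c < a:
        return a
    if c > b:
        return b
    return c


def ternary(x, y, z):
    """Mirror of `x ? y : z`: y when x is non-zero (true), otherwise z."""
    return y if x != 0 else z


# Clipping a value to the 8-bit sample range 0..255.
clipped_high = clip3(0, 255, 300)
clipped_low = clip3(0, 255, -5)
unchanged = clip3(0, 255, 128)
```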

<Structure of encoded stream Te>
Before the image encoding device 11 and the image decoding device 31 according to the present embodiment are described in detail, the data structure of the encoded stream Te generated by the image encoding device 11 and decoded by the image decoding device 31 will be described.

FIG. 2 is a diagram showing the hierarchical structure of data in the encoded stream Te. The encoded stream Te illustratively includes a sequence and a plurality of pictures constituting the sequence. (a) to (f) of FIG. 2 respectively show an encoded video sequence defining a sequence SEQ, an encoded picture defining a picture PICT, an encoded slice defining a slice S, encoded slice data defining slice data, a coding tree unit included in the encoded slice data, and a coding unit (CU) included in the coding tree unit.

(Encoded video sequence)
In the encoded video sequence, a set of data referred to by the image decoding device 31 for decoding the sequence SEQ to be processed is defined. As shown in FIG. 2A, the sequence SEQ includes a video parameter set VPS (Video Parameter Set), a sequence parameter set SPS (Sequence Parameter Set), a picture parameter set PPS (Picture Parameter Set), a picture PICT, and supplemental enhancement information SEI (Supplemental Enhancement Information). Here, the value shown after # indicates the layer ID. FIG. 2 shows an example in which encoded data of #0 and #1, that is, layer 0 and layer 1, exists, but the types and number of layers are not limited to this.

The video parameter set VPS defines, for a moving image composed of a plurality of layers, a set of coding parameters common to a plurality of moving images, as well as sets of coding parameters relating to the plurality of layers included in the moving image and to the individual layers.

The sequence parameter set SPS defines a set of encoding parameters that the image decoding device 31 refers to in order to decode the target sequence. For example, the width and height of the picture are defined. A plurality of SPSs may exist. In that case, one of a plurality of SPSs is selected from the PPS.

In the picture parameter set PPS, a set of encoding parameters referred to by the image decoding device 31 in order to decode each picture in the target sequence is defined. For example, a quantization width reference value (pic_init_qp_minus26) used for picture decoding and a flag (weighted_pred_flag) indicating application of weighted prediction are included. There may be a plurality of PPSs. In that case, one of a plurality of PPSs is selected from each picture in the target sequence.

(Encoded picture)
In the coded picture, a set of data referred to by the image decoding device 31 in order to decode the picture PICT to be processed is defined. As shown in FIG. 2B, the picture PICT includes slices S_0 to S_{NS-1} (NS is the total number of slices included in the picture PICT).

In the following description, when it is not necessary to distinguish the slices S_0 to S_{NS-1} from one another, the subscripts may be omitted. The same applies to other subscripted data included in the encoded stream Te described below.

(Encoded slice)
In the coded slice, a set of data referred to by the image decoding device 31 for decoding the slice S to be processed is defined. As shown in FIG. 2C, the slice S includes a slice header SH and slice data SDATA.

The slice header SH includes an encoding parameter group that is referred to by the image decoding device 31 in order to determine a decoding method of the target slice. Slice type designation information (slice_type) for designating a slice type is an example of an encoding parameter included in the slice header SH.

The slice types that can be specified by the slice type designation information include (1) I slices, which use only intra prediction at the time of encoding, (2) P slices, which use unidirectional prediction or intra prediction at the time of encoding, and (3) B slices, which use unidirectional prediction, bidirectional prediction, or intra prediction at the time of encoding. Note that inter prediction is not limited to uni-prediction and bi-prediction, and a predicted image may be generated using more reference pictures. Hereinafter, P and B slices refer to slices that include blocks for which inter prediction can be used.

Note that the slice header SH may include a reference (pic_parameter_set_id) to the picture parameter set PPS included in the encoded video sequence.

(Encoded slice data)
In the encoded slice data, a set of data referred to by the image decoding device 31 for decoding the slice data SDATA to be processed is defined. As shown in FIG. 2D, the slice data SDATA includes coding tree units (CTUs). A CTU is a block of a fixed size (for example, 64x64) that constitutes a slice, and is sometimes called a largest coding unit (LCU: Largest Coding Unit).

(Encoding tree unit)
As shown in (e) of FIG. 2, the coding tree unit defines a set of data that the image decoding device 31 refers to in order to decode the coding tree unit to be processed. The coding tree unit is divided into coding units (CU: Coding Unit), which are the basic units of the coding process, by recursive quadtree division (QT division) or binary tree division (BT division). The tree structure obtained by recursive quadtree division or binary tree division is called a coding tree (CT), and a node of the tree structure is called a coding node (CN). The intermediate nodes of the quadtree and the binary tree are coding nodes, and the coding tree unit itself is defined as the highest coding node.

CT includes, as CT information, a QT split flag (cu_split_flag) indicating whether or not to perform QT split, and a BT split mode (split_bt_mode) indicating a split method of BT split. cu_split_flag and / or split_bt_mode are transmitted for each coding node CN. When cu_split_flag is 1, the encoding node CN is divided into four encoding nodes CN. When cu_split_flag is 0 and split_bt_mode is 1, the encoding node CN is horizontally divided into two encoding nodes CN. When split_bt_mode is 2, the encoding node CN is vertically divided into two encoding nodes CN. When split_bt_mode is 0, the encoding node CN is not divided and has one encoding unit CU as a node. The encoding unit CU is a terminal node (leaf node) of the encoding node and is not further divided.
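The flag semantics above can be sketched as a function that returns the child regions of one coding node. This is a minimal illustration, not the decoder of the embodiment: the function name and the (x, y, width, height) tuple layout are hypothetical, and the sketch assumes that "horizontally divided" means a horizontal boundary (upper and lower halves) and "vertically divided" a vertical boundary (left and right halves), which the text does not spell out.

```python
def split_coding_node(x, y, width, height, cu_split_flag, split_bt_mode=0):
    """Return the regions produced from one coding node as (x, y, width, height).

    Assumption: split_bt_mode 1 yields upper/lower halves and split_bt_mode 2
    yields left/right halves; the source only says "horizontally"/"vertically".
    """
    if cu_split_flag == 1:
        # QT split: four equally sized coding nodes.
        hw, hh = width // 2, height // 2
        return [(x, y, hw, hh), (x + hw, y, hw, hh),
                (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]
    if split_bt_mode == 1:
        # BT split into two nodes stacked vertically (horizontal boundary).
        hh = height // 2
        return [(x, y, width, hh), (x, y + hh, width, hh)]
    if split_bt_mode == 2:
        # BT split into two nodes side by side (vertical boundary).
        hw = width // 2
        return [(x, y, hw, height), (x + hw, y, hw, height)]
    # No split: the node is a leaf and holds a single coding unit.
    return [(x, y, width, height)]


# A 64x64 coding tree unit split by QT, then one 32x32 child split by BT mode 2.
children = split_coding_node(0, 0, 64, 64, cu_split_flag=1)
grandchildren = split_coding_node(*children[0], cu_split_flag=0, split_bt_mode=2)
```

Applying the function recursively, driven by the flags decoded for each node, would walk the whole coding tree down to its leaf coding units.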

When the size of the coding tree unit CTU is 64x64 pixels, the size of the coding unit can be any of 64x64, 64x32, 32x64, 32x32, 64x16, 16x64, 32x16, 16x32, 16x16, 64x8, 8x64, 32x8, 8x32, 16x8, 8x16, 8x8, 64x4, 4x64, 32x4, 4x32, 16x4, 4x16, 8x4, 4x8, or 4x4 pixels.
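The 25 sizes listed above are exactly the combinations of widths and heights taken from 64, 32, 16, 8, and 4. The short sketch below enumerates them; the function name and the min_size parameter are illustrative assumptions, and the minimum side of 4 is inferred from the list rather than stated as a rule in the text.

```python
def possible_cu_sizes(ctu_size=64, min_size=4):
    """List every (width, height) reachable by halving each side independently."""
    sides = []
    side = ctu_size
    while side >= min_size:
        sides.append(side)
        side //= 2
    return [(w, h) for w in sides for h in sides]


sizes = possible_cu_sizes()
count = len(sizes)  # number of distinct CU sizes for a 64x64 CTU
```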

(Encoding unit)
As shown in (f) of FIG. 2, the coding unit defines a set of data that the image decoding device 31 refers to in order to decode the coding unit to be processed. Specifically, the coding unit includes a prediction tree, a transform tree, and a CU header CUH. The CU header defines a prediction mode, a division method (PU partition mode), and the like.

In the prediction tree, the prediction parameters (reference picture index, motion vector, etc.) of each prediction unit (PU) obtained by dividing the coding unit into one or more parts are defined. In other words, a prediction unit is one of one or more non-overlapping regions constituting the coding unit. The prediction tree includes the one or more prediction units obtained by the above division. Hereinafter, a unit obtained by further dividing a prediction unit is referred to as a "sub-block". A sub-block is composed of a plurality of pixels. When the prediction unit and the sub-block are equal in size, the prediction unit contains one sub-block. When the prediction unit is larger than the sub-block, the prediction unit is divided into sub-blocks. For example, when the prediction unit is 8x8 and the sub-block is 4x4, the prediction unit is divided into four sub-blocks, two horizontally by two vertically.
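The sub-block division in the 8x8 example can be checked with a few lines of Python. The function name and the raster ordering of the returned origins are assumptions made for this sketch only.

```python
def sub_block_origins(pu_width, pu_height, sb_width=4, sb_height=4):
    """Return the top-left corner of every sub-block inside a prediction unit.

    Assumes the PU dimensions are multiples of the sub-block dimensions.
    """
    return [(x, y)
            for y in range(0, pu_height, sb_height)
            for x in range(0, pu_width, sb_width)]


# An 8x8 prediction unit with 4x4 sub-blocks: two across, two down.
origins = sub_block_origins(8, 8)
```

Here origins holds four entries, matching the four sub-blocks in the text, while a 4x4 prediction unit would yield a single sub-block.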

The prediction process may be performed for each prediction unit (sub block).

Division in the prediction tree falls roughly into two cases: intra prediction and inter prediction. Intra prediction is prediction within the same picture, and inter prediction is prediction performed between different pictures (for example, between display times or between layer images).

In the case of intra prediction, there are 2Nx2N (the same size as the encoding unit) and NxN division methods.

In the case of inter prediction, the division method is encoded by the PU partition mode (part_mode) of the encoded data and includes 2Nx2N (the same size as the coding unit), 2NxN, 2NxnU, 2NxnD, Nx2N, nLx2N, nRx2N, and NxN. 2NxN and Nx2N indicate 1:1 symmetric division, and 2NxnU, 2NxnD and nLx2N, nRx2N indicate 1:3 and 3:1 asymmetric division. The PUs included in the CU are denoted PU0, PU1, PU2, and PU3 in order.

(a) to (h) of FIG. 3 specifically show the partition shape (the positions of the PU partition boundaries) in each PU partition mode. (a) of FIG. 3 shows the 2Nx2N partition, and (b), (c), and (d) show the 2NxN, 2NxnU, and 2NxnD partitions (horizontal partitions), respectively. (e), (f), and (g) show the partitions (vertical partitions) for Nx2N, nLx2N, and nRx2N, respectively, and (h) shows the NxN partition. The horizontal and vertical partitions are collectively referred to as rectangular partitions, and 2Nx2N and NxN are collectively referred to as square partitions.
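The eight partition shapes can be tabulated as rectangles inside a square coding unit. The sketch below is illustrative only: the function name and tuple layout are hypothetical, and it assumes that 2NxnU and nLx2N place the narrow quarter-size part first (top or left) while 2NxnD and nRx2N place it last, and that NxN orders PU0 to PU3 in raster order. The text gives only the 1:3 and 3:1 ratios, so these placements are assumptions.

```python
def pu_partitions(part_mode, size):
    """Return PU rectangles (x, y, width, height) for a size x size CU (size = 2N)."""
    s = size
    n = size // 2   # N
    q = size // 4   # N/2: the short side of an asymmetric 1:3 or 3:1 split
    shapes = {
        "2Nx2N": [(0, 0, s, s)],
        "2NxN":  [(0, 0, s, n), (0, n, s, n)],
        "2NxnU": [(0, 0, s, q), (0, q, s, s - q)],          # assumed 1:3, narrow part on top
        "2NxnD": [(0, 0, s, s - q), (0, s - q, s, q)],      # assumed 3:1, narrow part at bottom
        "Nx2N":  [(0, 0, n, s), (n, 0, n, s)],
        "nLx2N": [(0, 0, q, s), (q, 0, s - q, s)],          # assumed 1:3, narrow part on left
        "nRx2N": [(0, 0, s - q, s), (s - q, 0, q, s)],      # assumed 3:1, narrow part on right
        "NxN":   [(0, 0, n, n), (n, 0, n, n), (0, n, n, n), (n, n, n, n)],
    }
    return shapes[part_mode]


# PU0 and PU1 of a 32x32 CU in 2NxnU mode.
asymmetric = pu_partitions("2NxnU", 32)
```

Whatever the mode, the returned rectangles tile the coding unit without overlap, which is the property the text states for prediction units.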

In the transform tree, the coding unit is divided into one or more transform units, and the position and size of each transform unit are defined. In other words, a transform unit is one of one or more non-overlapping regions constituting the coding unit. The transform tree includes the one or more transform units obtained by the above division.

Division in the transform tree includes assigning a region of the same size as the coding unit as a transform unit, and using recursive quadtree division as in the CU division described above.

Transform processing is performed for each transform unit.

(Prediction parameter)
A predicted image of a prediction unit (PU: Prediction Unit) is derived from the prediction parameters associated with the PU. The prediction parameters are either prediction parameters for intra prediction or prediction parameters for inter prediction. The prediction parameters for inter prediction (inter prediction parameters) are described below. The inter prediction parameters consist of the prediction list use flags predFlagL0 and predFlagL1, the reference picture indexes refIdxL0 and refIdxL1, and the motion vectors mvL0 and mvL1. The prediction list use flags predFlagL0 and predFlagL1 indicate whether the reference picture lists called the L0 list and the L1 list, respectively, are used; the corresponding reference picture list is used when the value is 1. In this specification, a "flag indicating whether or not XX" means XX when the flag is non-zero (for example, 1) and not XX when it is 0; in logical negation, logical product, and the like, 1 is treated as true and 0 as false (the same applies hereinafter). However, other values may be used as the true and false values in an actual device or method.

Syntax elements for deriving the inter prediction parameters included in the encoded data include, for example, the PU partition mode part_mode, the merge flag merge_flag, the merge index merge_idx, the inter prediction identifier inter_pred_idc, the reference picture index refIdxLX, the prediction vector index mvp_LX_idx, and the difference vector mvdLX.

(Reference picture list)
The reference picture list is a list of reference pictures stored in the reference picture memory 306. FIG. 4 is a conceptual diagram illustrating an example of reference pictures and reference picture lists. In FIG. 4A, each rectangle is a picture, each arrow is a reference relationship between pictures, and the horizontal axis is time; I, P, and B in the rectangles denote an intra picture, a uni-prediction picture, and a bi-prediction picture, respectively, and the numbers in the rectangles indicate the decoding order. As shown in the figure, the decoding order of the pictures is I0, P1, B2, B3, B4, and the display order is I0, B3, B2, B4, P1. FIG. 4B shows an example of the reference picture lists. A reference picture list is a list representing candidate reference pictures, and one picture (slice) may have one or more reference picture lists. In the illustrated example, the target picture B3 has two reference picture lists, the L0 list RefPicList0 and the L1 list RefPicList1. When the target picture is B3, the reference pictures are I0, P1, and B2, and the reference picture lists have these pictures as elements. For each prediction unit, the reference picture index refIdxLX specifies which picture in the reference picture list RefPicListX is actually referred to. The figure shows an example in which the reference pictures P1 and B2 are referred to by refIdxL0 and refIdxL1.

(Merge prediction and AMVP prediction)
The prediction parameter decoding (encoding) method includes a merge prediction (merge) mode and an AMVP (Adaptive Motion Vector Prediction) mode. The merge flag merge_flag is a flag for identifying these. The merge mode is a mode in which the prediction list use flag predFlagLX (or inter prediction identifier inter_pred_idc), the reference picture index refIdxLX, and the motion vector mvLX are not included in the encoded data and are derived from the prediction parameters of already processed neighboring PUs. The AMVP mode is a mode in which the inter prediction identifier inter_pred_idc, the reference picture index refIdxLX, and the motion vector mvLX are included in the encoded data. The motion vector mvLX is encoded as a prediction vector index mvp_LX_idx for identifying the prediction vector mvpLX and a difference vector mvdLX.

The inter prediction identifier inter_pred_idc is a value indicating the type and number of reference pictures, and takes one of PRED_L0, PRED_L1, and PRED_BI. PRED_L0 and PRED_L1 indicate that reference pictures managed by the reference picture lists of the L0 list and the L1 list are used, respectively, and that one reference picture is used (single prediction). PRED_BI indicates that two reference pictures are used (bi-prediction BiPred), and reference pictures managed by the L0 list and the L1 list are used. The prediction vector index mvp_LX_idx is an index indicating a prediction vector, and the reference picture index refIdxLX is an index indicating a reference picture managed in the reference picture list. Note that LX is a description method used when L0 prediction and L1 prediction are not distinguished from each other. By replacing LX with L0 and L1, parameters for the L0 list and parameters for the L1 list are distinguished.

The merge index merge_idx is an index that indicates which of the prediction parameter candidates (merge candidates) derived from already processed PUs is used as the prediction parameter of the PU to be decoded.

(Motion vector)
The motion vector mvLX indicates a shift amount between blocks on two different pictures. A prediction vector and a difference vector related to the motion vector mvLX are referred to as a prediction vector mvpLX and a difference vector mvdLX, respectively.

(Inter prediction identifier inter_pred_idc and prediction list use flag predFlagLX)
The relationship between the inter prediction identifier inter_pred_idc and the prediction list use flags predFlagL0 and predFlagL1 is as follows and can be converted into each other.

inter_pred_idc = (predFlagL1 << 1) + predFlagL0
predFlagL0 = inter_pred_idc & 1
predFlagL1 = inter_pred_idc >> 1
Note that a prediction list use flag or an inter prediction identifier may be used as the inter prediction parameter. Further, the determination using the prediction list use flag may be replaced with the determination using the inter prediction identifier. Conversely, the determination using the inter prediction identifier may be replaced with the determination using the prediction list use flag.
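The mutual conversion given by the three formulas above can be checked with a short round trip. The function names are illustrative; the values PRED_L0 = 1 and PRED_L1 = 2 are what the first formula implies for uni-prediction from one list, and PRED_BI = 3 is an example value, so all three constants should be read as assumptions of this sketch.

```python
PRED_L0, PRED_L1, PRED_BI = 1, 2, 3  # assumed values consistent with the formulas


def to_inter_pred_idc(pred_flag_l0, pred_flag_l1):
    """inter_pred_idc = (predFlagL1 << 1) + predFlagL0"""
    return (pred_flag_l1 << 1) + pred_flag_l0


def to_pred_flags(inter_pred_idc):
    """Return (predFlagL0, predFlagL1) recovered from inter_pred_idc."""
    return inter_pred_idc & 1, inter_pred_idc >> 1


# Round trip for the three usable combinations of the two flags.
round_trip = {
    flags: to_pred_flags(to_inter_pred_idc(*flags))
    for flags in [(1, 0), (0, 1), (1, 1)]
}
```

Because each flag occupies its own bit of inter_pred_idc, the conversion loses no information in either direction.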

(Determination of bi-prediction biPred)
The flag biPred, which indicates whether bi-prediction BiPred is used, can be derived from whether the two prediction list use flags are both 1. For example, it can be derived by the following formula.

biPred = (predFlagL0 == 1 && predFlagL1 == 1)
The flag biPred can also be derived depending on whether or not the inter prediction identifier is a value indicating that two prediction lists (reference pictures) are used. For example, it can be derived by the following formula.

biPred = (inter_pred_idc == PRED_BI) ? 1 : 0
The above formula can also be expressed by the following formula.

biPred = (inter_pred_idc == PRED_BI)
For example, a value of 3 can be used for PRED_BI.
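The two derivations of biPred can be checked against each other with a short Python sketch. The function names are hypothetical. The values 1 and 2 for PRED_L0 and PRED_L1 are an assumption that follows from inter_pred_idc = (predFlagL1 << 1) + predFlagL0; only the value 3 for PRED_BI is given in the text, and only as an example.

```python
# Assumed numeric values; only PRED_BI = 3 is mentioned in the text.
PRED_L0, PRED_L1, PRED_BI = 1, 2, 3


def bi_pred_from_flags(pred_flag_l0: int, pred_flag_l1: int) -> int:
    """biPred = (predFlagL0 == 1 && predFlagL1 == 1)."""
    return int(pred_flag_l0 == 1 and pred_flag_l1 == 1)


def bi_pred_from_idc(inter_pred_idc: int) -> int:
    """biPred = (inter_pred_idc == PRED_BI)."""
    return int(inter_pred_idc == PRED_BI)


# Both derivations agree for every inter prediction type.
for l0, l1 in [(1, 0), (0, 1), (1, 1)]:
    idc = (l1 << 1) + l0
    assert bi_pred_from_flags(l0, l1) == bi_pred_from_idc(idc)
```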

(Intra prediction mode)
The luminance intra prediction mode IntraPredModeY has 67 modes, corresponding to planar prediction (0), DC prediction (1), and direction prediction (2 to 66). The color difference intra prediction mode IntraPredModeC has 68 modes, obtained by adding a Color Component Linear Mode (CCLM) to the above 67 modes. CCLM is a mode in which the pixel value of the target pixel in the target color component is derived by linear prediction with reference to the pixel value of another color component encoded before the target color component. The color components are luminance Y, color difference Cb, and color difference Cr. Different intra prediction modes may be assigned to luminance and color difference, and the prediction mode is encoded and decoded in units of CU or PU.
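The mode numbering can be illustrated with a small Python sketch. The function name is hypothetical, and assigning number 67 to the added color difference mode is taken from the later description of the LM mode (67) for IntraPredModeC.

```python
def intra_mode_kind(mode: int, is_chroma: bool = False) -> str:
    """Classify an intra prediction mode number into the categories named in the text."""
    if mode == 0:
        return "planar"
    if mode == 1:
        return "DC"
    if 2 <= mode <= 66:
        return "directional"
    if mode == 67 and is_chroma:
        return "CCLM"  # the one mode added for color difference (assumed number 67)
    raise ValueError(f"invalid intra prediction mode {mode}")


kinds = [intra_mode_kind(m, is_chroma=True) for m in (0, 1, 2, 66, 67)]
```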

(Configuration of image decoding device)
Next, the configuration of the image decoding device 31 according to the present embodiment will be described. FIG. 5 is a schematic diagram illustrating a configuration of the image decoding device 31 according to the present embodiment. The image decoding device 31 includes an entropy decoding unit 301, a prediction parameter decoding unit (prediction image decoding device) 302, a loop filter 305, a reference picture memory 306, a prediction parameter memory 307, a prediction image generation unit (prediction image generation device) 308, an inverse quantization / inverse DCT unit 311, and an addition unit 312.

The prediction parameter decoding unit 302 includes an inter prediction parameter decoding unit 303 and an intra prediction parameter decoding unit 304. The predicted image generation unit 308 includes an inter predicted image generation unit 309 and an intra predicted image generation unit 310.

The entropy decoding unit 301 performs entropy decoding on the coded stream Te input from the outside, and separates and decodes individual codes (syntax elements). The separated code includes a prediction parameter for generating a prediction image and residual information for generating a difference image.

The entropy decoding unit 301 outputs a part of the separated code to the prediction parameter decoding unit 302. This part of the separated code includes, for example, the prediction mode predMode, the PU partition mode part_mode, the merge flag merge_flag, the merge index merge_idx, the inter prediction identifier inter_pred_idc, the reference picture index ref_Idx_lX, the prediction vector index mvp_LX_idx, and the difference vector mvdLX. Which codes are decoded is controlled based on an instruction from the prediction parameter decoding unit 302. The entropy decoding unit 301 outputs the quantization coefficient to the inverse quantization / inverse DCT unit 311. The quantization coefficient is a coefficient obtained in the encoding process by performing DCT (Discrete Cosine Transform) on the residual signal and quantizing the result.

A detailed block diagram of the entropy decoding unit 301 is shown in FIG. The entropy decoding unit 301 includes a header decoding unit 1001, a CT information decoding unit 1002, a CU decoding unit 1003, and a decoding module 1004.

(Decoding module)
Hereinafter, the schematic operation of each module will be described. The decoding module 1004 performs a decoding process for decoding syntax values from the encoded data. Based on the encoded data and syntax type supplied from the header decoding unit 1001, the CT information decoding unit 1002, and the CU decoding unit 1003, the decoding module 1004 decodes syntax values encoded by a fixed-length encoding method or an entropy encoding method such as CABAC, and returns the decoded syntax values to the supplier.

(Header decoding unit)
The header decoding unit 1001 uses the decoding module 1004 to decode the VPS, SPS, PPS, and slice header of the encoded data input from the image encoding device 11.

(CT information decoding unit)
The CT information decoding unit 1002 uses the decoding module 1004 to perform decoding processing of the encoding tree unit and the encoding tree from the encoded data input from the image encoding device 11. The CT information decoding unit 1002 uses the decoding module 1004 to decode the tree unit header CTUH as CTU information included in the CTU. Next, as CT information, the CT information decoding unit 1002 decodes a QT division flag indicating whether or not the target CT is QT-divided, and a BT division mode indicating whether or not the target CT is BT-divided and, in the case of BT division, the BT division method. The target CT is recursively divided and decoded until the QT division flag and the BT division mode no longer signal further division. Finally, the tree unit footer CTUF is decoded as CTU information.

The tree unit header CTUH and the tree unit footer CTUF include coding parameters referred to by the image decoding device 31 in order to determine a decoding method of the target coding tree unit. In addition to the QT division flag and the BT division mode, the CT information may include parameters applied in the target CT and lower coding nodes.
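The recursive division driven by the QT division flag and the BT division mode can be sketched in Python as below. This is a sketch under stated assumptions, not the decoder of the embodiment: read_qt_flag and read_bt_mode are hypothetical callbacks standing in for the decoding module 1004, the BT division mode values (0 = no division, 1 = horizontal division, 2 = vertical division) are assumed, and QT division is assumed not to be signalled once a BT division has occurred.

```python
def decode_ct(x, y, w, h, read_qt_flag, read_bt_mode, allow_qt=True):
    """Recursively divide a coding tree node; return leaf CUs as (x, y, w, h)."""
    if allow_qt and read_qt_flag(x, y, w, h):
        hw, hh = w // 2, h // 2
        leaves = []
        for dy in (0, hh):          # four quadrants in raster order
            for dx in (0, hw):
                leaves += decode_ct(x + dx, y + dy, hw, hh, read_qt_flag, read_bt_mode)
        return leaves
    bt = read_bt_mode(x, y, w, h)
    if bt == 1:                     # assumed: horizontal division into upper and lower halves
        return (decode_ct(x, y, w, h // 2, read_qt_flag, read_bt_mode, False)
                + decode_ct(x, y + h // 2, w, h // 2, read_qt_flag, read_bt_mode, False))
    if bt == 2:                     # assumed: vertical division into left and right halves
        return (decode_ct(x, y, w // 2, h, read_qt_flag, read_bt_mode, False)
                + decode_ct(x + w // 2, y, w // 2, h, read_qt_flag, read_bt_mode, False))
    return [(x, y, w, h)]           # no further division: this node is a CU


# Toy signalling: QT-divide the 16x16 root, then BT-divide its first 8x8 quadrant vertically.
qt = lambda x, y, w, h: w == 16
bt = lambda x, y, w, h: 2 if (x, y, w, h) == (0, 0, 8, 8) else 0
leaves = decode_ct(0, 0, 16, 16, qt, bt)
```

With this toy signalling the 16x16 unit ends up as two 4x8 CUs followed by three 8x8 CUs.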

(CU decoding unit)
The CU decoding unit 1003 uses the decoding module 1004 to decode, for the lowest coding node CN (i.e., CU), the PUI information (merge flag (merge_flag), merge index (merge_idx), prediction motion vector index (mvp_idx), reference image index (ref_idx_lX), inter prediction identifier (inter_pred_flag), difference vector (mvdLX), etc.), the quantized transform coefficients (residual_coding), and the TTI information (TU partition flag SP_TU (split_transform_flag), CU residual flag CBP_TU (cbf_cb, cbf_cr, cbf_luma), etc.).
Here, decoding of the quantized residual will be described. The quantized residual is expressed by the CU residual flag CBP_TU (cbf_luma, cbf_cb, cbf_cr), the syntax expressing the positions of the non-zero quantized transform coefficients (last_sig_coeff_x_prefix, last_sig_coeff_y_prefix, last_sig_coeff_x_suffix, last_sig_coeff_y_suffix, coded_sub_block_flag, sig_coeff_flag), and the syntax expressing the coefficient level (coeff_abs_level_greater1, coeff_abs_level_greater2, coeff_abs_level_remaining, coeff_sign_flag). Hereinafter, the quantized transform coefficient is referred to as a transform coefficient.

CBP_TU is a flag indicating whether a non-zero transform coefficient is included in the luminance component and the color difference components (Cb, Cr) of a certain CU. When cbf_luma is 1, the luminance component of the CU includes a non-zero transform coefficient; when cbf_luma is 0, it does not. Similarly, when cbf_cb and cbf_cr are 1, the Cb component and the Cr component of the CU, respectively, include a non-zero transform coefficient, and when cbf_cb and cbf_cr are 0, the Cb component and the Cr component of the CU, respectively, include no non-zero transform coefficient. When cbf_luma, cbf_cb, or cbf_cr is 0, the transform coefficients of the corresponding component are all 0 in the CU, so the other syntax expressing the transform coefficients is neither encoded nor decoded for that component.

FIG. 7 (1) shows an example of the syntax when CBP_TU = 1 (a non-zero transform coefficient exists in the CU), the CU size is 8x8, and the scan direction is diagonal. The CU size ranges from 4x4 to 128x128, and there are three scan directions (diagonal, horizontal, and vertical), but the same processing is performed for other CU sizes and scan directions. Hereinafter, unless otherwise specified, the luminance component and the color difference components are not distinguished.

First, the syntax shown in FIG. 7 is decoded in order. LAST is the position of the last non-zero transform coefficient when the CU is scanned in the specified scan direction, with the upper left coordinate of the CU taken as (0,0), and is expressed by four variable-length-coded syntax elements (last_sig_coeff_x_prefix, last_sig_coeff_y_prefix, last_sig_coeff_x_suffix, last_sig_coeff_y_suffix). When the variable length code table is that of FIG. 12, last_sig_coeff_x_prefix = “11111”, last_sig_coeff_x_suffix = “01”, last_sig_coeff_y_prefix = “11111”, and last_sig_coeff_y_suffix = “00” represent the coordinates (7, 6) (FIG. 7 (2)). In the variable length code table of FIG. 12, the part represented by “1” or “0” is the prefix, and the part represented by “X” is the suffix. Here, “X” represents “1” or “0”. Next, the CU is divided into fixed-size sub-blocks (for example, 4x4). The flag coded_sub_block_flag, which indicates the presence or absence of a non-zero transform coefficient in each sub-block, is decoded starting from the sub-block including the position of LAST (FIG. 7 (3)). However, because non-zero coefficients exist in the sub-block containing LAST and in the upper left sub-block containing the DC component, coded_sub_block_flag is not encoded for these sub-blocks and is always set to 1. When coded_sub_block_flag = 0, the sub-block includes no non-zero transform coefficient, so all transform coefficient values of the sub-block are 0. When coded_sub_block_flag = 1, the sub-block includes non-zero transform coefficients, so sig_coeff_flag, which indicates whether each transform coefficient of the sub-block is non-zero, is decoded (FIG. 7 (4)). When sig_coeff_flag = 0, the transform coefficient value is 0. When sig_coeff_flag = 1, the syntax expressing the coefficient level (coeff_abs_level_greater1, coeff_abs_level_greater2, coeff_abs_level_remaining, coeff_sign_flag) is decoded, and each transform coefficient value is derived from these values (FIG. 7 (5)).

The above operation will be described with reference to the flowchart of FIG.

In S1301, the CU decoding unit 1003 sets all transform coefficients in the CU to 0. In S1302, the CU decoding unit 1003 decodes the cbf using the decoding module 1004. In S1303, the CU decoding unit 1003 checks whether cbf is 1. When cbf is not 1, the CU decoding unit 1003 ends the process. When cbf is 1, the process proceeds to S1304, where the CU decoding unit 1003 decodes the syntax representing LAST and derives the LAST position. In S1305, the CU decoding unit 1003 decodes the coded_sub_block_flag of each subblock using the subblock including the position of LAST as a starting point. Next, the CU decoding unit 1003 performs S1306 to S1308 for all subblocks before the subblock including LAST. In S1306, the CU decoding unit 1003 refers to coded_sub_block_flag to check whether each subblock has a non-zero transform coefficient. When coded_sub_block_flag = 1, the process proceeds to S1307, and the CU decoding unit 1003 decodes all the sig_coeff_flags of the subblock. When coded_sub_block_flag = 0, the process proceeds to S1308, and the CU decoding unit 1003 sets all sig_coeff_flags of the subblock to 0. Next, the CU decoding unit 1003 performs S1309 to S1311 for all sig_coeff_flags. In S1309, the CU decoding unit 1003 checks whether sig_coeff_flag = 1. When sig_coeff_flag = 1, the CU decoding unit 1003 decodes the syntax (coeff_abs_level_greater1, coeff_abs_level_greater2, coeff_abs_level_remaining, coeff_sign_flag) expressing the coefficient level in S1310, and derives the transform coefficient value by referring to the decoding result in S1311.
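The flow of S1301 to S1311 can be sketched in Python as follows. This is a simplified illustration, not the decoding process of the embodiment. The syntax values are assumed to have been entropy-decoded already and are passed in a hypothetical dictionary syn; the four coefficient-level syntax elements are collapsed into a single signed value under the key "level"; and diag is an assumed diagonal scan order used both between sub-blocks and inside each sub-block.

```python
def diag(n):
    """Positions (x, y) of an n x n grid in an assumed diagonal scan order."""
    return [(x, d - x) for d in range(2 * n - 1) for x in range(n) if 0 <= d - x < n]


def decode_coefficients(syn, size, sub=4):
    coeffs = [[0] * size for _ in range(size)]             # S1301
    if syn["cbf"] != 1:                                    # S1302, S1303
        return coeffs
    last_x, last_y = syn["last"]                           # S1304
    last_sb = (last_x // sub, last_y // sub)
    sb_scan = diag(size // sub)
    # Visit sub-blocks from the one containing LAST back to the upper left one.
    for sb in reversed(sb_scan[: sb_scan.index(last_sb) + 1]):
        # S1305: the flag is not coded (always 1) for the LAST and DC sub-blocks.
        csbf = 1 if sb in (last_sb, (0, 0)) else syn["coded_sub_block_flag"][sb]
        if csbf == 0:                                      # S1306 -> S1308
            continue
        for x, y in diag(sub):                             # S1307
            pos = (sb[0] * sub + x, sb[1] * sub + y)
            if syn["sig_coeff_flag"].get(pos, 0) == 1:     # S1309
                coeffs[pos[1]][pos[0]] = syn["level"][pos] # S1310, S1311
    return coeffs


# Hypothetical syntax for an 8x8 CU with three non-zero coefficients, LAST at (4, 1).
example = {
    "cbf": 1,
    "last": (4, 1),
    "coded_sub_block_flag": {(0, 1): 0},
    "sig_coeff_flag": {(0, 0): 1, (1, 0): 1, (4, 1): 1},
    "level": {(0, 0): 9, (1, 0): -2, (4, 1): 1},
}
block = decode_coefficients(example, 8)   # block[y][x] holds the coefficient at (x, y)
```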

The inter prediction parameter decoding unit 303 decodes the inter prediction parameter with reference to the prediction parameter stored in the prediction parameter memory 307 based on the code input from the entropy decoding unit 301. The inter prediction parameter decoding unit 303 outputs the decoded inter prediction parameter to the prediction image generation unit 308 and stores it in the prediction parameter memory 307.

The intra prediction parameter decoding unit 304 refers to the prediction parameter stored in the prediction parameter memory 307 on the basis of the code input from the entropy decoding unit 301 and decodes the intra prediction parameter. The intra prediction parameter is a parameter used in a process of predicting a CU within one picture, for example, an intra prediction mode IntraPredMode. The intra prediction parameter decoding unit 304 outputs the decoded intra prediction parameter to the prediction image generation unit 308 and stores it in the prediction parameter memory 307.

The loop filter 305 applies filters such as a deblocking filter, a sample adaptive offset (SAO), and an adaptive loop filter (ALF) to the decoded image of the CU generated by the adding unit 312.

The reference picture memory 306 stores the decoded image of the CU generated by the adding unit 312 at a predetermined position for each decoding target picture and CU.

The prediction parameter memory 307 stores the prediction parameter in a predetermined position for each decoding target picture and prediction unit (or sub-block, fixed-size block, pixel). Specifically, the prediction parameter memory 307 stores the inter prediction parameter decoded by the inter prediction parameter decoding unit 303, the intra prediction parameter decoded by the intra prediction parameter decoding unit 304, and the prediction mode predMode separated by the entropy decoding unit 301. The stored inter prediction parameters include, for example, a prediction list use flag predFlagLX (inter prediction identifier inter_pred_idc), a reference picture index refIdxLX, and a motion vector mvLX.

The prediction image generation unit 308 receives the prediction mode predMode input from the entropy decoding unit 301 and the prediction parameter from the prediction parameter decoding unit 302. Further, the predicted image generation unit 308 reads a reference picture from the reference picture memory 306. The prediction image generation unit 308 generates a prediction image of the PU using the input prediction parameter and the read reference picture in the prediction mode indicated by the prediction mode predMode.

Here, when the prediction mode predMode indicates the inter prediction mode, the inter prediction image generation unit 309 generates a prediction image of the PU by inter prediction, using the inter prediction parameter input from the inter prediction parameter decoding unit 303 and the read reference picture.

For a reference picture list (L0 list or L1 list) whose prediction list use flag predFlagLX is 1, the inter prediction image generation unit 309 reads from the reference picture memory 306 the reference picture block located, in the reference picture indicated by the reference picture index refIdxLX, at the position indicated by the motion vector mvLX relative to the decoding target PU. The inter prediction image generation unit 309 performs prediction based on the read reference picture block to generate a prediction image of the PU. The inter prediction image generation unit 309 outputs the generated prediction image of the PU to the addition unit 312.

When the prediction mode predMode indicates the intra prediction mode, the intra predicted image generation unit 310 performs intra prediction using the intra prediction parameter input from the intra prediction parameter decoding unit 304 and the read reference picture. Specifically, the intra predicted image generation unit 310 reads, from the reference picture memory 306, those already-decoded PUs that belong to the decoding target picture and lie within a predetermined range from the decoding target PU. The predetermined range is, for example, one of the left, upper left, upper, and upper right adjacent PUs when the decoding target PU moves sequentially in the so-called raster scan order, and differs depending on the intra prediction mode. The raster scan order is an order in which each row, from the upper end to the lower end of a picture, is traversed from the left end to the right end.

The intra predicted image generation unit 310 performs prediction in the prediction mode indicated by the intra prediction mode IntraPredMode for the read adjacent PU, and generates a predicted image of the PU. The intra predicted image generation unit 310 outputs the generated predicted image of the PU to the adding unit 312.

When the intra prediction parameter decoding unit 304 derives different intra prediction modes for luminance and color difference, the intra predicted image generation unit 310 generates a predicted image of the luminance PU by one of planar prediction (0), DC prediction (1), and direction prediction (2 to 66) according to the luminance prediction mode IntraPredModeY, and generates a predicted image of the color difference PU by one of planar prediction (0), DC prediction (1), direction prediction (2 to 66), and LM mode (67) according to the color difference prediction mode IntraPredModeC.

The inverse quantization / inverse DCT unit 311 inversely quantizes the quantization coefficient input from the entropy decoding unit 301 to obtain a DCT coefficient. The inverse quantization / inverse DCT unit 311 performs inverse DCT (Inverse Discrete Cosine Transform) on the obtained DCT coefficient to calculate a residual signal. The inverse quantization / inverse DCT unit 311 outputs the calculated residual signal to the addition unit 312.

The addition unit 312 adds, for each pixel, the prediction image of the PU input from the inter prediction image generation unit 309 or the intra prediction image generation unit 310 and the residual signal input from the inverse quantization / inverse DCT unit 311 to generate a decoded image of the PU. The addition unit 312 stores the generated decoded image of the PU in the reference picture memory 306, and outputs to the outside a decoded image Td in which the generated decoded images of the PUs are integrated for each picture.
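The pixel-by-pixel addition performed by the addition unit can be illustrated with a minimal Python sketch. The function name and the sample values are hypothetical, and no clipping to a valid pixel range is modelled because the text does not describe it.

```python
def reconstruct(pred, resid):
    """Add the prediction image and the residual signal pixel by pixel."""
    return [[p + r for p, r in zip(prow, rrow)] for prow, rrow in zip(pred, resid)]


pred = [[100, 102], [98, 97]]    # hypothetical 2x2 prediction image of a PU
resid = [[-3, 0], [2, 5]]        # hypothetical residual signal
decoded = reconstruct(pred, resid)
```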

(Configuration of image encoding device)
Next, the configuration of the image encoding device 11 according to the present embodiment will be described. FIG. 6 is a block diagram illustrating a configuration of the image encoding device 11 according to the present embodiment. The image encoding device 11 includes a prediction image generation unit 101, a subtraction unit 102, a DCT / quantization unit 103, an entropy encoding unit 104, an inverse quantization / inverse DCT unit 105, an addition unit 106, a loop filter 107, a prediction parameter memory (prediction parameter storage unit, frame memory) 108, a reference picture memory (reference image storage unit, frame memory) 109, an encoding parameter determination unit 110, and a prediction parameter encoding unit 111. The prediction parameter encoding unit 111 includes an inter prediction parameter encoding unit 112 and an intra prediction parameter encoding unit 113.

The predicted image generation unit 101 generates, for each picture of the image T, a predicted image P of the prediction unit PU for each encoding unit CU that is an area obtained by dividing the picture. Here, the predicted image generation unit 101 reads a decoded block from the reference picture memory 109 based on the prediction parameter input from the prediction parameter encoding unit 111. The prediction parameter input from the prediction parameter encoding unit 111 is, for example, a motion vector in the case of inter prediction. The predicted image generation unit 101 reads a block at a position on the reference image indicated by the motion vector with the target PU as a starting point. In the case of intra prediction, the prediction parameter is, for example, an intra prediction mode. A pixel value of an adjacent PU used in the intra prediction mode is read from the reference picture memory 109, and a predicted image P of the PU is generated. The predicted image generation unit 101 generates a predicted image P of the PU using one prediction method among a plurality of prediction methods for the read reference picture block. The predicted image generation unit 101 outputs the generated predicted image P of the PU to the subtraction unit 102.

Note that the predicted image generation unit 101 has the same operation as that of the predicted image generation unit 308 already described, and therefore description thereof is omitted here.

The prediction image generation unit 101 generates a prediction image P of the PU based on the pixel value of the reference block read from the reference picture memory, using the parameter input from the prediction parameter encoding unit. The predicted image generated by the predicted image generation unit 101 is output to the subtraction unit 102 and the addition unit 106.

The subtraction unit 102 subtracts the signal value of the predicted image P of the PU input from the predicted image generation unit 101 from the pixel value of the corresponding PU of the image T, and generates a residual signal. The subtraction unit 102 outputs the generated residual signal to the DCT / quantization unit 103.

The DCT / quantization unit 103 performs DCT on the residual signal input from the subtraction unit 102 and calculates a DCT coefficient. The DCT / quantization unit 103 quantizes the calculated DCT coefficient to obtain a quantization coefficient. The DCT / quantization unit 103 outputs the obtained quantization coefficient to the entropy coding unit 104 and the inverse quantization / inverse DCT unit 105.

The entropy encoding unit 104 receives a quantization coefficient from the DCT / quantization unit 103 and receives a prediction parameter from the prediction parameter encoding unit 111. The input prediction parameters include, for example, codes such as a reference picture index ref_Idx_lX, a prediction vector index mvp_LX_idx, a difference vector mvdLX, a prediction mode pred_mode_flag, and a merge index merge_idx.

The entropy encoding unit 104 entropy-encodes the input division information, prediction parameters, quantization transform coefficients, and the like to generate an encoded stream Te, and outputs the generated encoded stream Te to the outside.

A detailed block diagram of the entropy encoding unit 104 is shown in FIG. The entropy encoding unit 104 includes a header encoding unit 1101, a CT information encoding unit 1102, a CU encoding unit 1103, and an encoding module 1104. The entropy encoding unit 104 encodes the header information supplied from the prediction parameter encoding unit 111, the prediction parameter, and the quantized transform coefficient supplied from the DCT / quantization unit 103, and outputs encoded data.

(Header encoding unit)
The header encoding unit 1101 encodes the VPS, SPS, PPS, and slice header using the encoding module 1104.

(CT information encoding unit)
The CT information encoding unit 1102 uses the encoding module 1104 to perform CTU and CT encoding processing. The CT information encoding unit 1102 uses the encoding module 1104 to encode the tree unit header CTUH as CTU information included in the CTU. Next, as CT information, the CT information encoding unit 1102 encodes a QT division flag indicating whether or not the target CT is QT-divided, and a BT division mode indicating whether or not the target CT is BT-divided and, in the case of BT division, the division method. The target CT is recursively divided and encoded until the QT division flag and the BT division mode no longer signal further division. Finally, the tree unit footer CTUF is encoded as CTU information.

(CU encoding unit)
The CU encoding unit 1103 uses the encoding module 1104 to encode, for the lowest coding node CN (i.e., CU), the PUI information (merge flag (merge_flag), merge index (merge_idx), prediction motion vector index (mvp_idx), reference image index (ref_idx_lX), inter prediction identifier (inter_pred_flag), difference vector (mvdLX), etc.), the quantized transform coefficients (residual_coding), and the TTI information (TU partition flag SP_TU (split_transform_flag), CU residual flag CBP_TU (cbf_cb, cbf_cr, cbf_luma), etc.).

(Encoding module)
The encoding module 1104 performs an encoding process that encodes various prediction parameters, quantized transform coefficients, and the like by a fixed-length encoding method or entropy encoding. More specifically, the encoding module 1104 encodes the syntax values supplied from the header encoding unit 1101, the CT information encoding unit 1102, and the CU encoding unit 1103 using fixed-length encoding or an entropy encoding scheme such as CABAC, and outputs encoded data. Here, the operation of the encoding process of the quantized transform coefficient (transform coefficient) will be described using the flowchart of FIG. In S1401, the CU encoding unit 1103 counts the number of non-zero transform coefficients in the CU. In S1402, the CU encoding unit 1103 checks the presence or absence of a non-zero transform coefficient in the CU. When there is no non-zero transform coefficient, the process proceeds to S1403, and the CU encoding unit 1103 sets cbf to 0. When there is a non-zero transform coefficient, the process proceeds to S1404, and the CU encoding unit 1103 sets cbf to 1. In S1405, the CU encoding unit 1103 encodes cbf. In S1406, the CU encoding unit 1103 checks whether or not cbf is 0. When cbf = 0, the CU encoding unit 1103 ends the process. When cbf = 1, the CU encoding unit 1103 derives the syntax representing LAST in S1407 and encodes the syntax representing LAST in S1408. Next, the CU encoding unit 1103 performs S1409 to S1413 for the sub-blocks before the sub-block including LAST. In S1409, the CU encoding unit 1103 counts the number of non-zero transform coefficients in the sub-block. In S1410, the CU encoding unit 1103 checks whether or not there is a non-zero transform coefficient. If there is no non-zero transform coefficient, the process proceeds to S1411 and coded_sub_block_flag is set to 0; if there is a non-zero transform coefficient, the process proceeds to S1412 and coded_sub_block_flag is set to 1. In S1413, the CU encoding unit 1103 encodes coded_sub_block_flag.
Next, the CU encoding unit 1103 performs the processing of S1414 to S1418 for each sub-block. In S1414, the CU encoding unit 1103 checks whether or not coded_sub_block_flag = 1. If coded_sub_block_flag = 1, the process proceeds to S1415, and S1415 to S1418 are performed on each transform coefficient in the sub-block. If coded_sub_block_flag = 0, the CU encoding unit 1103 proceeds to S1419 and sets all sig_coeff_flags in the sub-block to 0. In S1415, the CU encoding unit 1103 checks whether or not the transform coefficient value is 0. If the transform coefficient value is 0, the process proceeds to S1416 and sig_coeff_flag is set to 0; if the transform coefficient value is not 0, the process proceeds to S1417 and sig_coeff_flag is set to 1. In S1418, sig_coeff_flag is encoded. Next, the CU encoding unit 1103 performs S1420 to S1421 for all transform coefficients. In S1420, the CU encoding unit 1103 checks whether or not sig_coeff_flag = 1. If sig_coeff_flag = 1, the syntax expressing the coefficient level is derived and encoded in S1421. The above is the detailed description of the entropy encoding unit 104.
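The derivation of cbf, LAST, coded_sub_block_flag, and sig_coeff_flag on the encoder side can be sketched in Python as follows. This is an illustration under assumptions rather than the encoder of the embodiment: derive_syntax and diag are hypothetical names, the diagonal scan order is assumed, entropy coding itself is omitted, and the flag values are simply collected in a dictionary. The sketch records coded_sub_block_flag as 1 for the sub-block containing LAST and for the upper left sub-block, for which the flag is not actually encoded.

```python
def diag(n):
    """Positions (x, y) of an n x n grid in an assumed diagonal scan order."""
    return [(x, d - x) for d in range(2 * n - 1) for x in range(n) if 0 <= d - x < n]


def derive_syntax(coeffs, sub=4):
    """coeffs[y][x] is the transform coefficient at (x, y) of a square CU."""
    size = len(coeffs)
    if sum(v != 0 for row in coeffs for v in row) == 0:        # S1401 to S1403
        return {"cbf": 0}
    syn = {"cbf": 1, "coded_sub_block_flag": {}, "sig_coeff_flag": {}}   # S1404
    sb_scan = diag(size // sub)
    scan = [(sx * sub + x, sy * sub + y) for sx, sy in sb_scan for x, y in diag(sub)]
    last = [p for p in scan if coeffs[p[1]][p[0]] != 0][-1]    # S1407
    syn["last"] = last
    last_sb = (last[0] // sub, last[1] // sub)
    for sb in sb_scan[: sb_scan.index(last_sb) + 1]:           # S1409 to S1413
        cells = [(sb[0] * sub + x, sb[1] * sub + y) for x, y in diag(sub)]
        if sb in (last_sb, (0, 0)):
            flag = 1                                           # not encoded, always 1
        else:
            flag = int(any(coeffs[y][x] != 0 for x, y in cells))
        syn["coded_sub_block_flag"][sb] = flag
        if flag:                                               # S1414 to S1418
            for x, y in cells:
                syn["sig_coeff_flag"][(x, y)] = int(coeffs[y][x] != 0)
    return syn


block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][4] = 9, -2, 1   # indexed as block[y][x]
syn = derive_syntax(block)
```

In this example LAST is (4, 1), and the sub-block at (0, 1), which lies between the upper left sub-block and the LAST sub-block in the assumed scan order, gets coded_sub_block_flag = 0.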

The inverse quantization / inverse DCT unit 105 inversely quantizes the quantization coefficient input from the DCT / quantization unit 103 to obtain a DCT coefficient. The inverse quantization / inverse DCT unit 105 performs inverse DCT on the obtained DCT coefficient to calculate a residual signal. The inverse quantization / inverse DCT unit 105 outputs the calculated residual signal to the addition unit 106.

The addition unit 106 adds, for each pixel, the signal value of the prediction image P of the PU input from the prediction image generation unit 101 and the signal value of the residual signal input from the inverse quantization / inverse DCT unit 105 to generate a decoded image. The addition unit 106 stores the generated decoded image in the reference picture memory 109.

The loop filter 107 applies a deblocking filter, a sample adaptive offset (SAO), and an adaptive loop filter (ALF) to the decoded image generated by the addition unit 106.

The prediction parameter memory 108 stores the prediction parameter generated by the encoding parameter determination unit 110 at a predetermined position for each encoding target picture and CU.

The reference picture memory 109 stores the decoded image generated by the loop filter 107 at a predetermined position for each picture to be encoded and each CU.

The encoding parameter determination unit 110 selects one set from among a plurality of sets of encoding parameters. The encoding parameter is a parameter to be encoded that is generated in association with the above-described QTBT division parameter and prediction parameter. The predicted image generation unit 101 generates a predicted image P of the PU using each of these encoding parameter sets.

The encoding parameter determination unit 110 calculates, for each of the plurality of sets, an RD cost value indicating the amount of information and the encoding error. The RD cost value is, for example, the sum of the code amount and the square error multiplied by a coefficient λ. The code amount is the information amount of the encoded stream Te obtained by entropy encoding the quantization error and the encoding parameter. The square error is the sum over pixels of the squared residual values of the residual signal calculated by the subtraction unit 102. The coefficient λ is a preset real number greater than zero. The encoding parameter determination unit 110 selects the set of encoding parameters that minimizes the calculated RD cost value. As a result, the entropy encoding unit 104 outputs the selected set of encoding parameters to the outside as the encoded stream Te, and does not output the unselected sets of encoding parameters. The encoding parameter determination unit 110 stores the determined encoding parameter in the prediction parameter memory 108.
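The selection by RD cost can be illustrated with a short Python sketch. The function names, the candidate tuples, and the numbers are hypothetical; the cost follows the example form given in the text, that is, the code amount plus λ times the square error.

```python
def rd_cost(code_amount, square_error, lam):
    """RD cost = code amount + lambda * square error (example form from the text)."""
    return code_amount + lam * square_error


def select_parameters(candidates, lam):
    """candidates: list of (name, code_amount, square_error); return the lowest-cost one."""
    return min(candidates, key=lambda c: rd_cost(c[1], c[2], lam))


candidates = [("A", 120, 40.0), ("B", 80, 90.0), ("C", 200, 10.0)]
best_small_lambda = select_parameters(candidates, lam=0.5)   # costs: 140, 125, 205
best_large_lambda = select_parameters(candidates, lam=4.0)   # costs: 280, 440, 240
```

A small λ favours the set with the smallest code amount, while a large λ favours the set with the smallest error.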

The prediction parameter encoding unit 111 derives a format for encoding from the parameters input from the encoding parameter determination unit 110 and outputs the format to the entropy encoding unit 104. Deriving the format for encoding is, for example, deriving a difference vector from a motion vector and a prediction vector. Also, the prediction parameter encoding unit 111 derives parameters necessary for generating a prediction image from the parameters input from the encoding parameter determination unit 110 and outputs the parameters to the prediction image generation unit 101. The parameter necessary for generating the predicted image is, for example, a motion vector in units of sub-blocks.
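The derivation of a difference vector from a motion vector and a prediction vector can be written as a minimal Python sketch. The function names are hypothetical, and the sketch assumes the usual relation mvLX = mvpLX + mvdLX, applied per component.

```python
def derive_mvd(mv, mvp):
    """Difference vector mvdLX = mvLX - mvpLX, per component (x, y)."""
    return (mv[0] - mvp[0], mv[1] - mvp[1])


def reconstruct_mv(mvp, mvd):
    """Decoder-side inverse under the same assumption: mvLX = mvpLX + mvdLX."""
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])


mv, mvp = (13, -4), (12, -6)     # hypothetical motion vector and prediction vector
mvd = derive_mvd(mv, mvp)
```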

The inter prediction parameter encoding unit 112 derives an inter prediction parameter such as a difference vector based on the prediction parameter input from the encoding parameter determination unit 110. As a configuration for deriving the parameters necessary for generating the prediction image output to the prediction image generation unit 101, the inter prediction parameter encoding unit 112 includes a configuration that is partly the same as the configuration with which the inter prediction parameter decoding unit 303 (see FIG. 5 and the like) derives inter prediction parameters. Likewise, as a configuration for deriving the prediction parameters necessary for generating the prediction image output to the prediction image generation unit 101, the intra prediction parameter encoding unit 113 includes a configuration that is partly the same as the configuration with which the intra prediction parameter decoding unit 304 (see FIG. 5 and the like) derives intra prediction parameters.

The intra prediction parameter encoding unit 113 derives a format (for example, MPM_idx, rem_intra_luma_pred_mode) for encoding from the intra prediction mode IntraPredMode input from the encoding parameter determination unit 110.

(CBP_TU and LAST)
In the image encoding device and the image decoding device according to Embodiment 1, CBP_TU (cbf_luma, cbf_cb, cbf_cr), LAST, coded_sub_block_flag, and sig_coeff_flag are used for the transform coefficients to limit the transform coefficient levels that must be coded, which reduces the code amount. FIG. 15 is a graph showing a breakdown of the code amount. FIG. 15 shows that the sum of last_sig_coeff_x_prefix, last_sig_coeff_y_prefix, last_sig_coeff_x_suffix, and last_sig_coeff_y_suffix, which are used to express LAST, reaches about 10% of the code amount, which is considerable for syntax that represents a single coordinate.

FIG. 16 shows the average number of non-zero transform coefficients, aggregated for each CU size and quantization parameter over five HD-resolution sequences. Here, width and height are the width and height of the CU, and 22 and 27 are quantization parameters. In FIG. 16, (1) shows luminance, (2) shows color difference (Cb), and (3) shows color difference (Cr), in each case giving the average number of non-zero transform coefficients included in one CU for intra coding and inter coding. In inter prediction of the color difference components and in part of the luminance component, the average number of non-zero coefficients per CU is often less than 3. Because the transform concentrates the energy in the low-frequency components, the non-zero transform coefficients are considered to be concentrated in the DC component and one or two low-frequency AC components.

Considering these, the following describes a method for reducing the code amount of LAST by changing the definition of CBP_TU to indicate the presence or absence of non-zero transform coefficients in the CU. Hereinafter, cbf_luma, cbf_cb, and cbf_cr are represented as cbf.

First, the following four types (1) to (4) of cbf are defined. For each of these definitions, FIGS. 17 (1) to (4) respectively show the codes and code amounts of the syntax elements involved in encoding and decoding the transform coefficients, for the cases where the non-zero transform coefficients are DC only, AC0 only, AC1 only, DC and AC0 only, DC and AC1 only, AC0 and AC1 only, DC, AC0, and AC1 only, and otherwise. The coded_sub_block_flag and the syntax representing the transform coefficient level are omitted because their code amount does not change among (1) to (4); only cbf, LAST, and sig_coeff_flag are shown. The variable length code table of FIG. 12 is used for LAST.

(1) cbf is 1 bit. cbf = 0 indicates that the CU has no non-zero transform coefficient, and cbf = 1 indicates that the CU has a non-zero transform coefficient (the conventional method described above).

(2) cbf is at most 2 bits. cbf = 0 indicates that the CU has no non-zero transform coefficient, cbf = 10 indicates that the only non-zero transform coefficient is the DC component, and cbf = 11 indicates that a non-zero transform coefficient exists in a component other than the DC component.

(3) cbf is at most 2 bits. cbf = 0 indicates that the CU has no non-zero transform coefficient, cbf = 10 indicates that non-zero transform coefficients exist only in the DC and AC0 components, and cbf = 11 indicates that a non-zero transform coefficient exists in a component other than the DC and AC0 components.

(4) cbf is at most 2 bits. cbf = 0 indicates that the CU has no non-zero transform coefficient, cbf = 10 indicates that non-zero transform coefficients exist only in the DC, AC0, and AC1 components, and cbf = 11 indicates that a non-zero transform coefficient exists in a component other than the DC, AC0, and AC1 components.

FIG. 17 (1) shows the syntax and code amount for (1) above. With (2) to (4) above, when the coefficients cannot be expressed by cbf = 10 (that is, when cbf = 11), the code amount increases by 1 bit compared with (1) above. The code amount can therefore be reduced efficiently if these methods are used under conditions where, in most cases, the non-zero transform coefficients are concentrated near the DC component. For example, one of the methods (2) to (4) is used for inter coding of the color difference components, and method (1) is used for intra / inter coding of the luminance component and for intra coding of the color difference components. Which of (2) to (4) is used for inter coding of the color difference components may be defined in advance between the image encoding device and the image decoding device, or may be signaled in the SPS, the PPS, or the slice header.

FIGS. 18 and 19 are flowcharts showing the operations of the entropy decoding unit 301 (CU decoding unit 1003) and the entropy encoding unit 104 (CU encoding unit 1103) when the methods (2) to (4) are used.

FIG. 18 (2) is a flowchart showing the decoding process of (2) above. In S1801, the CU decoding unit 1003 sets all transform coefficients in the CU to 0. In S1802, the CU decoding unit 1003 decodes cbf. In S1803, the CU decoding unit 1003 checks whether cbf is 0. When cbf is 0, the CU decoding unit 1003 ends the process. When cbf is not 0, the process advances to S1804 to check whether cbf = 10. If cbf is not 10 (that is, cbf = 11), S1805 to S1807 are executed. S1805 is the same as S1304 in FIG. 13, S1806 is the same as S1305 in FIG. 13, and S1807 is the same as S1306 to S1308 in FIG. 13. If cbf = 10, the process proceeds to S1808, and the CU decoding unit 1003 sets (0, 0) to LAST. In S1809, the CU decoding unit 1003 sets sig_coeff_flag of the DC component to 1 and sig_coeff_flag of the other (AC) components to 0. In S1810, the CU decoding unit 1003 decodes the transform coefficient level; this process is the same as S1309 to S1311 in FIG. 13.

FIG. 18 (3) is a flowchart showing the decoding process of (3) above. FIG. 18 (3) is the same as FIG. 18 (2) except that the processing of S1808 and S1809 is changed to S18081 and S18091. In S18081 of FIG. 18 (3), the CU decoding unit 1003 sets (0, 1) to LAST. In S18091, the CU decoding unit 1003 decodes sig_coeff_flag of the DC component and the AC0 component, and sets sig_coeff_flag of the other AC components to 0.

FIG. 18 (4) is a flowchart showing the decoding process of (4) above. FIG. 18 (4) is the same as FIG. 18 (2) except that the processing of S1808 and S1809 in FIG. 18 (2) is changed to S18082 and S18092. In S18082 of FIG. 18 (4), the CU decoding unit 1003 sets (1, 0) to LAST. In S18092, the CU decoding unit 1003 decodes sig_coeff_flag of the DC, AC0, and AC1 components, and sets sig_coeff_flag of the other AC components to 0.
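The branching on cbf in FIGS. 18 (2) to (4) can be sketched as follows. This is only an illustration under stated assumptions: the bits are supplied as an iterator of the characters "0" and "1" (the actual entropy decoding is not modeled), and CbfResult, read_cbf, and interpret_cbf are hypothetical names. The LAST values and component lists are those given in the text for methods (2) to (4).

```python
from typing import Iterator, NamedTuple, Optional, Tuple

# For cbf == "10": the LAST value that is set instead of being decoded, and the
# components whose sig_coeff_flag is handled explicitly (taken from the text).
CBF10_RULES = {
    2: ((0, 0), ("DC",)),
    3: ((0, 1), ("DC", "AC0")),
    4: ((1, 0), ("DC", "AC0", "AC1")),
}


class CbfResult(NamedTuple):
    cbf: str
    last: Optional[Tuple[int, int]]  # fixed LAST for cbf "10", otherwise None
    last_in_stream: bool             # True when LAST must be decoded (cbf "11")
    components: Tuple[str, ...]      # components handled explicitly for cbf "10"


def read_cbf(bits: Iterator[str]) -> str:
    """Read the prefix code 0 / 10 / 11 used by methods (2) to (4)."""
    first = next(bits)
    return first if first == "0" else first + next(bits)


def interpret_cbf(bits: Iterator[str], method: int) -> CbfResult:
    """Mirror the S1803 / S1804 branches for method 2, 3, or 4."""
    cbf = read_cbf(bits)
    if cbf == "0":    # no non-zero coefficient: nothing more to decode
        return CbfResult(cbf, None, False, ())
    if cbf == "11":   # general case: LAST and sig_coeff_flag come from the stream
        return CbfResult(cbf, None, True, ())
    last, components = CBF10_RULES[method]
    return CbfResult(cbf, last, False, components)
```

For example, interpret_cbf(iter("10"), 3) reports the fixed LAST (0, 1) and the components DC and AC0, so no LAST syntax has to be read for that CU.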

FIG. 19 (2) is a flowchart showing the encoding process of (2) above. In S1901, the CU encoding unit 1103 counts the number of non-zero transform coefficients in the CU. In S1902, the CU encoding unit 1103 checks whether there is a non-zero transform coefficient in the CU. When there is no non-zero transform coefficient, the process proceeds to S1903, and the CU encoding unit 1103 sets cbf to 0. If there is a non-zero transform coefficient, the process proceeds to S1904, where the CU encoding unit 1103 checks whether the only non-zero transform coefficient is the DC component. If so, the process proceeds to S1905, where cbf is set to 10. If a non-zero transform coefficient exists in a component other than the DC component, the process proceeds to S1906, and cbf is set to 11. In S1907, the CU encoding unit 1103 encodes cbf. In S1908, the CU encoding unit 1103 checks whether cbf is 0. When cbf = 0, the CU encoding unit 1103 ends the process. If cbf is not 0, the process proceeds to S1909, where the CU encoding unit 1103 checks whether cbf = 11. When cbf = 11, the CU encoding unit 1103 performs the processing of S1910 to S1912. S1910 is the same as S1407 to S1408 in FIG. 14, S1911 is the same as S1409 to S1413 in FIG. 14, and S1912 is the same as S1414 to S1419 in FIG. 14. When cbf = 10, in S1914, sig_coeff_flag of the DC component is set to 1 and sig_coeff_flag of the other (AC) components is set to 0. In S1913, the CU encoding unit 1103 encodes the syntax expressing the coefficient levels of the non-zero transform coefficients. S1913 is the same as S1420 to S1421 in FIG. 14.

FIG. 19 (3) is a flowchart showing the encoding process of (3) above. FIG. 19 (3) is the same as FIG. 19 (2) except that the processing of S1904 and S1914 is changed to S19041 and S19141. In S19041 of FIG. 19 (3), the CU encoding unit 1103 checks whether there are non-zero transform coefficients in components other than the DC and AC0 components. In S19141, the CU encoding unit 1103 encodes sig_coeff_flag of the DC component and the AC0 component, and sets sig_coeff_flag of the other AC components to 0.

FIG. 19 (4) is a flowchart showing the encoding process of (4) above. FIG. 19 (4) is the same as FIG. 19 (2) except that the processing of S1904 and S1914 in FIG. 19 (2) is changed to S19042 and S19142. In S19042 of FIG. 19 (4), the CU encoding unit 1103 checks whether there are non-zero transform coefficients in components other than the DC, AC0, and AC1 components. In S19142, the CU encoding unit 1103 encodes sig_coeff_flag of the DC, AC0, and AC1 components, and sets sig_coeff_flag of the other AC components to 0.
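The determination of cbf on the encoding side (S1902 to S1906 and their variants S19041 and S19042) can be sketched as a set comparison. The sketch assumes that AC0 is the coefficient at (0, 1) and AC1 the coefficient at (1, 0), which is inferred from the LAST values set in FIGS. 18 (3) and (4); ALLOWED_POSITIONS and choose_cbf are hypothetical names.

```python
from typing import Iterable, Tuple

# Positions that cbf == "10" is allowed to cover under each method.
# Assumption: AC0 = (0, 1) and AC1 = (1, 0), inferred from the fixed LAST values.
ALLOWED_POSITIONS = {
    2: {(0, 0)},
    3: {(0, 0), (0, 1)},
    4: {(0, 0), (0, 1), (1, 0)},
}


def choose_cbf(nonzero_positions: Iterable[Tuple[int, int]], method: int) -> str:
    """Return "0", "10", or "11" for the non-zero coefficient positions of a CU."""
    nonzero = set(nonzero_positions)
    if not nonzero:
        return "0"   # no non-zero transform coefficient in the CU
    if nonzero <= ALLOWED_POSITIONS[method]:
        return "10"  # all non-zero coefficients lie in the limited region
    return "11"      # a non-zero coefficient exists outside the limited region
```

A CU whose only non-zero coefficient is AC0 is signaled with cbf = 11 under method (2) but with cbf = 10 under method (3), which is why the choice of method matters for the code amount of LAST.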

As described above, in the first embodiment, the code amount of LAST is reduced by changing the definition of cbf. By using this method under conditions where the majority of non-zero transform coefficients are concentrated near the DC component, the amount of codes can be reduced.

(Embodiment 2)
In the first embodiment of the present application, inter coding of the color difference components was described as a condition under which, in most cases, the non-zero transform coefficients are concentrated near the DC component. The second embodiment describes a case in which the CU size (CUwidth: CU width, CUheight: CU height) and the quantization parameter are added to the conditions under which, in most cases, the non-zero transform coefficients are concentrated near the DC component.

As shown in FIG. 16, the number of non-zero transform coefficients depends on the CU size and the quantization parameter (QP) as well as on whether intra coding or inter coding is used. When QP <= THQ1, the method (1) of the first embodiment is used; when THQ1 < QP, the method (2) of the first embodiment is used for specific CU sizes. For example, when THQ1 < QP <= THQ2, (2) of the first embodiment is used if CUmax >= THC1, and (1) of the first embodiment is used if CUmax < THC1. Here, CUmax = max (CUwidth, CUheight). Further, when THQ2 < QP, (2) of the first embodiment is used if CUmax >= THC2, (3) of the first embodiment is used if THC3 <= CUmax < THC2, and (1) of the first embodiment is used if CUmax < THC3. Note that THQ1 < THQ2 and THC1 > THC2 > THC3.

For example, FIG. 20 (1) is an example for inter coding of the luminance component, with THQ1 = 22, THQ2 = 32, THC1 = 128, THC2 = 64, and THC3 = 32.
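The threshold rule for inter coding of the luminance component can be written as a small selection function. This is a sketch that uses the example thresholds of FIG. 20 (1) as defaults; the function name is hypothetical, and QP equal to THQ2 is assumed to belong to the middle QP band.

```python
def select_cbf_method(qp: int, cu_width: int, cu_height: int,
                      thq1: int = 22, thq2: int = 32,
                      thc1: int = 128, thc2: int = 64, thc3: int = 32) -> int:
    """Return which cbf definition (1), (2), or (3) of Embodiment 1 to use."""
    cu_max = max(cu_width, cu_height)
    if qp <= thq1:
        return 1                          # low QP: always the 1-bit cbf
    if qp <= thq2:                        # THQ1 < QP <= THQ2 (assumed band edge)
        return 2 if cu_max >= thc1 else 1
    if cu_max >= thc2:                    # THQ2 < QP
        return 2
    if cu_max >= thc3:
        return 3
    return 1
```

Because the rule depends only on QP and the CU size, both of which are known to the encoder and the decoder before cbf is processed, no additional signaling is needed to keep the two sides in agreement.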

As another example, when QP <= THQ1, (1) of the first embodiment is used if THC2 <= CUmax < THC1 and CUmin >= THC3, and (2) of the first embodiment is used otherwise. Here, CUmin = min (CUwidth, CUheight). Furthermore, when QP > THQ1, (2) of the first embodiment is used regardless of the CU size.

For example, FIG. 20 (2) is an example for inter coding of the color difference components, with THQ1 = 22, THC1 = 64, THC2 = 32, and THC3 = 16.

As another example, when QP <= THQ1, (1) of the first embodiment is used if CUmax >= THC1, (2) of the first embodiment is used if CUmax <= THC2, and (3) of the first embodiment is used otherwise. Further, when THQ1 < QP, (1) of the first embodiment is used if CUmax >= THC1, and (2) of the first embodiment is used otherwise.

For example, FIG. 20 (3) is an example of intra coding of the color difference component, and THQ1 = 27, THC1 = 32, and THC2 = 2.

Furthermore, different threshold values may be set for Cb (the example of FIG. 20 (3)) and Cr, as shown for the color difference Cr in FIG. 20 (4).

FIGS. 21 and 22 are flowcharts showing the operations of the entropy decoding unit 301 (CU decoding unit 1003) and the entropy encoding unit 104 (CU encoding unit 1103) when the method of the second embodiment is used. Steps denoted by the same numbers as in FIGS. 18 and 19 are the same processes as in FIGS. 18 and 19.

In FIG. 21, in S2101, the CU decoding unit 1003 determines the definition of cbf from the CU size and the quantization parameter. The definitions of cbf are as shown in FIG. 21 (2). When the definition of cbf is (1) of the first embodiment, LAST and sig_coeff_flag are extracted from the encoded data, so nothing is set for the case of cbf = 10. When the definition of cbf is (2) of the first embodiment, (0, 0) is set to LAST, and the DC component is set as the non-zero transform coefficient for the case of cbf = 10. When the definition of cbf is (3) of the first embodiment, (0, 1) is set to LAST, sig_coeff_flag of the DC and AC0 components is encoded / decoded, and the DC and AC0 components are set as the non-zero transform coefficients for the case of cbf = 10. When the definition of cbf is (4) of the first embodiment, (1, 0) is set to LAST, sig_coeff_flag of the DC, AC0, and AC1 components is encoded / decoded, and the DC, AC0, and AC1 components are set as the non-zero transform coefficients for the case of cbf = 10. In S2102, the CU decoding unit 1003 determines, from the definition of cbf, the LAST value, the sig_coeff_flag to be decoded, and the non-zero transform coefficients for the case of cbf = 10. In S2108, the CU decoding unit 1003 sets LAST to the value determined in S2102. In S2109, the CU decoding unit 1003 decodes sig_coeff_flag of the transform coefficients determined in S2102, and sets sig_coeff_flag of the other transform coefficients to 0.

In FIG. 22, in S2204, the CU encoding unit 1103 checks whether the non-zero transform coefficients in the CU satisfy the definition of cbf = 10 determined in S2102. In S2214, the CU encoding unit 1103 encodes sig_coeff_flag of the transform coefficients determined in S2102 and sets sig_coeff_flag of the other transform coefficients to 0.

As described above, in the second embodiment, when the code amount of LAST is reduced by changing the definition of cbf, adding the CU size or the quantization parameter to the conditions makes it possible to extend the range to which the changed definition of cbf is applied and to increase the reduction in code amount.

(Embodiment 3)
In the first and second embodiments of the present application, the LAST code amount reduction method in the case where the non-zero transform coefficients are concentrated near the DC component has been described. In the third embodiment, a code amount reduction method in the case where a non-zero transform coefficient exists in a high frequency region and the LAST coordinate becomes large will be described.

FIG. 23 shows an example in which LAST is at the position (14, 6). If the LAST coordinates in the CU are expressed directly (in one stage), they are expressed in 17 bits (1111111010, 1111100) using the variable length code table of FIG. 12.

As shown in FIG. 23, instead of directly expressing the LAST coordinates in the CU, LAST can also be expressed in stages using the position of the sub-block containing LAST and the position of LAST within that sub-block. In this case, the position of the sub-block containing LAST is (3, 1), and the position of LAST within the sub-block at position (3, 1) is (2, 2). Using the variable length code table, these are expressed in 12 bits, (1110, 10) and (110, 110). Therefore, when the position of LAST is far from (0, 0), the code amount can be reduced by encoding in two stages, the position of the sub-block containing LAST and the position of LAST within the sub-block, rather than directly encoding the coordinates of LAST in the CU.

LAST is expressed in two-dimensional coordinates, but the encoding method is the same in both the horizontal and vertical directions, so the following description is based on one dimension (either horizontal or vertical).

FIG. 24 is a diagram showing the LAST position and the code amount necessary to express it. The code table shown in FIG. 12 is used as the variable length code table. FIG. 24 (1) shows the case where the LAST coordinates in the CU are encoded directly, and FIG. 24 (2) shows the case where the encoding is performed in two stages: the position of the sub-block containing LAST and the position of LAST within the sub-block. If the LAST position is 0 to 3, that is, if LAST is included in the upper left sub-block of the CU, the code amount is smaller when the LAST coordinates in the CU are encoded directly. In all other cases, the code amount is smaller when encoding is performed in two stages, the position of the sub-block containing LAST and the position of LAST within the sub-block.
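The code amounts quoted above can be reproduced with a short sketch. FIG. 12 is not reproduced in this text, so the codeword construction below is an assumption: a unary prefix followed by a fixed-length suffix in the style of HEVC's last_sig_coeff prefix / suffix binarization, chosen because it yields exactly the codewords quoted for the position (14, 6). A sub-block size of 4 is likewise inferred from that example, and the function names are hypothetical.

```python
SUB_BLOCK_SIZE = 4  # assumed: (14, 6) maps to sub-block (3, 1), inner position (2, 2)


def last_codeword(pos: int) -> str:
    """Codeword for one LAST coordinate under the assumed table-A binarization."""
    if pos < 0:
        raise ValueError("position must be non-negative")
    if pos < 4:
        return "1" * pos + "0"            # 0, 10, 110, 1110: prefix only
    prefix = 4
    while True:
        suffix_len = (prefix >> 1) - 1
        base = (1 << suffix_len) * (2 + (prefix & 1))
        if pos < base + (1 << suffix_len):
            break                         # pos falls in the group of this prefix
        prefix += 1
    suffix = format(pos - base, "0{}b".format(suffix_len))
    return "1" * prefix + "0" + suffix


def direct_bits(pos: int) -> int:
    """Bits needed to code the coordinate in one stage."""
    return len(last_codeword(pos))


def two_stage_bits(pos: int) -> int:
    """Bits needed to code the sub-block index and the position inside it."""
    sub_block, inner = divmod(pos, SUB_BLOCK_SIZE)
    return len(last_codeword(sub_block)) + len(last_codeword(inner))
```

Under these assumptions, direct coding of (14, 6) costs direct_bits(14) + direct_bits(6) = 17 bits and two-stage coding costs two_stage_bits(14) + two_stage_bits(6) = 12 bits, matching the figures in the text; direct coding is cheaper only for coordinates 0 to 3.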

FIG. 25 is a flowchart showing the operation when S1304 in FIG. 13 and S1805 in FIG. 18 are replaced by the method of the third embodiment.

In S2501, the CU decoding unit 1003 decodes the position of the sub block including the LAST and the position of the LAST in the sub block. In S2502, the CU decoding unit 1003 derives the coordinates of the LAST in the CU from the position of the subblock including the decoded LAST and the position of the LAST in the subblock.

FIG. 26 is a flowchart showing the operation when S1407 to S1408 in FIG. 14 and S1910 in FIG. 19 are replaced by the method of the third embodiment.

In S2601, the CU encoding unit 1103 derives the LAST position in the CU. The CU encoding unit 1103 derives the position of the sub-block containing LAST in S2602, derives the position of LAST within the sub-block in S2603, and encodes the position of the sub-block containing LAST and the position of LAST within the sub-block in S2604.
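The derivations in S2602 and S2603 and the reverse derivation in S2502 amount to an integer division and its inverse. The following sketch assumes 4x4 sub-blocks, as implied by the (14, 6) example; split_last and merge_last are hypothetical names.

```python
from typing import Tuple

SUB_BLOCK_SIZE = 4  # assumed sub-block width and height

Position = Tuple[int, int]


def split_last(last: Position) -> Tuple[Position, Position]:
    """Encoder side: sub-block position and position within the sub-block."""
    x, y = last
    sub_block = (x // SUB_BLOCK_SIZE, y // SUB_BLOCK_SIZE)
    inner = (x % SUB_BLOCK_SIZE, y % SUB_BLOCK_SIZE)
    return sub_block, inner


def merge_last(sub_block: Position, inner: Position) -> Position:
    """Decoder side: recover the LAST coordinates in the CU."""
    return (sub_block[0] * SUB_BLOCK_SIZE + inner[0],
            sub_block[1] * SUB_BLOCK_SIZE + inner[1])
```

For the example of FIG. 23, split_last((14, 6)) gives the sub-block (3, 1) and the inner position (2, 2), and merge_last restores (14, 6).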

Furthermore, as shown in FIG. 16, in the intra coding of the luminance component and the color difference component, the number of non-zero transform coefficients included in the CU is large. In other words, LAST is likely to be in the high frequency region. Therefore, the amount of LAST code can be reduced by encoding LAST in two stages in intra encoding and directly encoding LAST in inter encoding.

Also, as shown in FIG. 16, even in intra coding of the luminance component and the color difference components, the number of non-zero transform coefficients included in the CU depends on the quantization parameter (QP) and the CU size. Therefore, the code amount of LAST can be further reduced by switching, according to the quantization parameter and the CU size, between encoding the LAST coordinates directly and encoding them in two stages (the position of the sub-block and the position within the sub-block). For example, when QP < THQ1, LAST is encoded in two stages if CUmax >= THC1 and directly if CUmax < THC1. When QP >= THQ1, LAST is encoded in two stages if CUmax >= THC2 and directly if CUmax < THC2. FIG. 27 (1) is an example in which THQ1 = 32, THC1 = 32, and THC2 = 64 in intra coding of the luminance component.

As another example, LAST may be encoded in two stages if QP <= THQ2, CUmax >= THC3, and CUmin >= THC4, and directly otherwise. FIG. 27 (2) shows an example in which THQ2 = 22, THC3 = 32, and THC4 = 16 in intra coding of the color difference components.

As another example, different threshold values may be set for Cb and Cr. For example, for Cr, whose pixel values vary less than those of Cb, encoding LAST in two stages has only a limited effect. Separate thresholds may therefore be set for Cb and Cr, for example by encoding LAST in two stages if QP <= THQ2 and CUmax = CUmin = THC5, and encoding LAST directly otherwise. FIG. 27 (3) shows an example in which THQ2 = 22 and THC5 = 32 in Cb intra coding.

As described above, the effect of reducing the code amount of LAST can be enhanced by encoding in two stages, the position of the sub-block containing LAST and the position of LAST within the sub-block, only when LAST is highly likely to lie outside the upper left sub-block of the CU.
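The first switching rule (the example of FIG. 27 (1) for intra coding of the luminance component) can be sketched as a single predicate. The default thresholds are the example values from the text; the function name is hypothetical.

```python
def use_two_stage_last(qp: int, cu_width: int, cu_height: int,
                       thq1: int = 32, thc1: int = 32, thc2: int = 64) -> bool:
    """Return True when LAST should be encoded in two stages, False for direct coding."""
    cu_max = max(cu_width, cu_height)
    # A lower QP leaves more non-zero coefficients, so the smaller size
    # threshold THC1 applies; at higher QP the larger threshold THC2 applies.
    size_threshold = thc1 if qp < thq1 else thc2
    return cu_max >= size_threshold
```

With these values, a 32x32 CU uses two-stage coding at QP 27 but direct coding at QP 37, while a 64x64 CU uses two-stage coding in both cases.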

(Embodiment 4)
In the third embodiment of the present application, a method of encoding LAST in two stages was described as a code amount reduction method for the case where non-zero transform coefficients exist in the high frequency region and the coordinates of LAST become large. Embodiment 4 of the present application describes switching of the variable length code table used for LAST coding as another code amount reduction method for the same case.

In Embodiments 1 to 3, the variable length code table A in FIG. 12 is used. The variable length code table A in FIG. 12 has the advantage that the code amount is small when LAST is small. However, because the code amount grows faster as LAST increases, it is disadvantageous when large values of LAST are frequent. The fourth embodiment describes variable length code tables for cases where a large LAST is likely to occur, for example intra coding of the luminance component. With these tables, the code amount for a small LAST is larger than with the variable length code table of FIG. 12, but the code amount for a large LAST does not become too large.

FIGS. 28 (1) to (4) are examples of variable length code tables that are effective when LAST is large (the code amount does not become so large). As in FIG. 12, the portion indicated by "1" and "0" is last_sig_coeff_Z_prefix (Z is x or y), and the portion indicated by "X" is last_sig_coeff_Z_suffix (Z is x or y). Here, "X" is "1" or "0". For example, the code amount when LAST is "3" is 4 bits when using the variable length code table A of FIG. 12, 4 bits when using the variable length code table B of FIG. 28 (1), 3 bits when using the variable length code table C of FIG. 28 (2), 3 bits when using the variable length code table D of FIG. 28 (3), and 4 bits when using the variable length code table E of FIG. 28 (4).

The code amount when LAST is “7” is 7 bits when using the variable length code table A in FIG. 12, 6 bits when using the variable length code table B in FIG. 28 (1), and the variable length code in FIG. 28 (2). If Table C is used, it will be 5 bits, if variable length code table D of FIG. 28 (3) is used, it will be 4 bits, and if variable length code table E of FIG. 28 (4) is used, it will be 4 bits.

The code amount when LAST is “15” is 10 bits when using the variable length code table A of FIG. 12, 8 bits when using the variable length code table B of FIG. 28 (1), and the variable length code of FIG. 28 (2). If Table C is used, it will be 8 bits, if variable length code table D of FIG. 28 (3) is used, it will be 6 bits, and if variable length code table E of FIG. 28 (4) is used, it will be 5 bits.

As described above, the code amount of LAST can be reduced by using one of the variable length code tables of FIGS. 28 (1) to (4) in intra coding of the luminance component, where a large LAST is likely to occur, and using the variable length code table of FIG. 12 in the other cases (inter coding and coding of the color difference components).

(Modification 1)
In the fourth embodiment of the present application, intra coding of the luminance component was described as a condition under which non-zero transform coefficients exist in the high frequency region and the LAST coordinates become large. The first modification describes a method of switching the variable length code table used for LAST encoding in which the CU size (CUwidth: CU width, CUheight: CU height) and the quantization parameter (QP) are added to the conditions under which non-zero transform coefficients exist in the high frequency region and the LAST coordinates become large.

FIG. 29 is a table showing the code amount necessary for LAST encoding when the variable length code tables A to E of FIGS. 12 and 28 are used. In FIG. 29, A is the variable length code table of FIG. 12, B is that of FIG. 28 (1), C is that of FIG. 28 (2), D is that of FIG. 28 (3), and E is that of FIG. 28 (4). When LAST is spread over the entire high frequency range, the variable length code table B, which is close to fixed length coding, is preferable. When LAST is concentrated on specific high frequency components, the code amount of LAST can be further reduced by switching among the variable length code tables C, D, and E according to where it is concentrated. For example, when a large number of non-zero transform coefficients exist in high frequency components, the variable length code table E is used.

As shown in FIG. 16, the number of non-zero transform coefficients in the CU depends on the quantization parameter and the CU size. When QP <= THQ1, the variable length code table E (FIG. 28 (4)) is used if CUmin >= THC2, the variable length code table A (FIG. 12) is used if CUmax <= THC3, and the variable length code table D (FIG. 28 (3)) is used otherwise. When QP > THQ1, the variable length code table E (FIG. 28 (4)) is used if CUmin >= THC1, the variable length code table A (FIG. 12) is used if CUmax <= THC3, and the variable length code table D (FIG. 28 (3)) is used otherwise. Here, CUmax = max (CUwidth, CUheight) and CUmin = min (CUwidth, CUheight). FIG. 30 (1) is an example in which THQ1 = 22, THC1 = 64, THC2 = 32, and THC3 = 4 in intra coding of the luminance component.
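The table selection for intra coding of the luminance component can be sketched as follows. The defaults are the example thresholds of FIG. 30 (1); the function name is hypothetical, and the returned letters refer to the variable length code tables A, D, and E named in the text, whose contents are not modeled here.

```python
def select_last_table(qp: int, cu_width: int, cu_height: int,
                      thq1: int = 22, thc1: int = 64,
                      thc2: int = 32, thc3: int = 4) -> str:
    """Return the name of the variable length code table used for LAST."""
    cu_max = max(cu_width, cu_height)
    cu_min = min(cu_width, cu_height)
    # The size from which table E is used depends on the QP band.
    e_threshold = thc2 if qp <= thq1 else thc1
    if cu_min >= e_threshold:
        return "E"   # large CU: a large LAST is likely
    if cu_max <= thc3:
        return "A"   # very small CU: LAST is small
    return "D"
```

As with the other threshold rules, the decoder can evaluate the same function on the same QP and CU size, so the chosen table does not need to be signaled.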

As another example, when QP <= THQ1, the variable length code table D (FIG. 28 (3)) is used if CUmin >= THC1, and the variable length code table A (FIG. 12) is used otherwise. FIG. 30 (2) shows an example in which THQ1 = 22 and THC1 = 16 in intra coding of the color difference components.

As another example, when QP <= THQ1, the variable length code table D (FIG. 28 (3)) is used if CUmax <= THC1 and CUmin >= THC2, and the variable length code table A (FIG. 12) is used otherwise. FIG. 30 (3) shows an example in which THQ1 = 22, THC1 = 64, and THC2 = 32 in inter coding of the luminance component.

In the above, A, D, and E are used for the variable length code table, but B and C may be used instead of D and E.

FIG. 31 (1) is a flowchart showing a process of determining a variable length code table based on the quantization parameter and the CU size, and encoding or decoding LAST. In S3101, the CU decoding unit 1003 refers to the quantization parameter and the CU size to determine the variable length code table used for LAST encoding or decoding. In S3102, the CU decoding unit 1003 encodes or decodes LAST using the variable length code table determined in S3101. Here, the encoding process of LAST is the same as the process shown in FIG. 14, and the decoding process is the same as the process shown in FIG. 13.

As described above, when the non-zero transform coefficient exists in the high frequency region and the LAST coordinate becomes large, by switching the variable length code table used for the LAST encoding according to the CU size and the quantization parameter, The code amount can be further reduced by taking advantage of the characteristics of the variable length code table.

(Modification 2)
In the first modification, the variable length code table to be used is switched depending on the quantization parameter and the CU size. In the second modification, a method for switching the variable length code table used for LAST encoding according to the scan direction will be described.

FIG. 32 shows the scan order (reverse order) of the diagonal scan, the horizontal scan, and the vertical scan when the CU size is 16x16. In the horizontal and vertical scans, the number of bits required for LAST encoding increases at an earlier stage of the scan order than in the diagonal scan. FIG. 33 (1) shows the code amount of LAST for the diagonal scan and for the horizontal / vertical scans when the variable length code table A of FIG. 12 is used. It can be seen that in the horizontal and vertical scans the code amount of LAST increases at an early stage of the scan. FIG. 33 (2) shows an example in which, in the horizontal scan, the variable length code table of FIG. 28 is used in the horizontal direction and the variable length code table of FIG. 12 is used in the vertical direction. FIG. 33 (3) is an example in which, in the vertical scan, the variable length code table of FIG. 28 is used in the vertical direction and the variable length code table of FIG. 12 is used in the horizontal direction. FIG. 33 (4) shows the code amount of LAST in this case. It can be seen that the code amount in the high frequency region is reduced compared with the case shown in FIG. 33 (1), where only the variable length code table of FIG. 12 is used.

FIG. 31 (2) is a flowchart showing a process of determining a variable length code table according to the scan direction and encoding or decoding LAST. The difference from FIG. 31 (1) is that S3101 in FIG. 31 (1) is changed to S31011 in FIG. 31 (2), and the other processes are the same, and thus the description thereof is omitted. In S31011, the CU decoding unit 1003 determines a variable length code table to be used for LAST encoding or decoding with reference to the scan direction.

As described above, when non-zero transform coefficients exist in the high frequency region and the LAST coordinates become large, the code amount can be further reduced by switching the variable length code table used for LAST encoding according to the scan direction, taking advantage of the characteristics of each variable length code table.

(Modification 3)
In the third modification, a method of switching the variable-length code table used for LAST encoding according to the number of non-zero transform coefficients of adjacent CUs of the target CU will be described.

Let NA and NL be the numbers of non-zero transform coefficients included in the upper adjacent CU (CU_A) and the left adjacent CU (CU_L) of the target CU (CU_C) shown in FIG. 34. From these, an estimated value N of the number of non-zero transform coefficients of the target CU is derived. When the target CU is inter-predicted and only one of CU_A and CU_L is inter-predicted, the number of non-zero transform coefficients of that inter-predicted adjacent CU is used as the estimated value N for the target CU; when both CU_A and CU_L are inter-predicted, the average value of NA and NL is used as N.

if (pred. mode of CU_C == "inter") {
    if (pred. mode of CU_A == "inter" && pred. mode of CU_L == "inter")
        N = (NA + NL) >> 1
    else if (pred. mode of CU_A == "inter")
        N = NA
    else
        N = NL
}
When the target CU is intra-predicted and only one of CU_A and CU_L is intra-predicted, the number of non-zero transform coefficients of that intra-predicted adjacent CU is used as the estimated value N for the target CU; when both CU_A and CU_L are intra-predicted, the average value of NA and NL is used as N.

if (pred. mode of CU_C == "intra") {
    if (pred. mode of CU_A == "intra" && pred. mode of CU_L == "intra")
        N = (NA + NL) >> 1
    else if (pred. mode of CU_A == "intra")
        N = NA
    else
        N = NL
}
Alternatively, when CU_A, CU_L, and CU_C are all intra-predicted, the number of non-zero transform coefficients of whichever of CU_A and CU_L has an intra prediction direction closer to that of CU_C may be used as the estimated value N.

if (pred. mode of CU_C == "intra") {
    if (pred. mode of CU_A == "intra" && pred. mode of CU_L == "intra") {
        if (diff (CU_C, CU_A) >= diff (CU_C, CU_L))
            N = NL
        else
            N = NA
    }
}
Here, diff (A, B) = max (A, B) - min (A, B), evaluated on the intra prediction directions of the two CUs.

Also, when the target CU and an adjacent CU differ in size, the number of non-zero transform coefficients may be scaled. If the areas of CU_A, CU_L, and CU_C are AA, AL, and AC, scaling according to the following formulas is performed first.

NA = NA * AC / AA
NL = NL * AC / AL
Next, the estimated value N of the number of non-zero transform coefficients of the target CU is derived as described above. Based on the estimated value N, the variable length code table for LAST used in the target CU is selected. If N <= TH1, LAST is encoded using the variable length code table A of FIG. 12, and if N > TH1, LAST is encoded using the variable length code table E of FIG. 28 (4). FIG. 35 (1) is an example with TH1 = 7. Alternatively, LAST may be encoded using the variable length code table A of FIG. 12 if N <= TH1, the variable length code table D of FIG. 28 (3) if TH1 < N <= TH2, and the variable length code table E of FIG. 28 (4) if N > TH2. FIG. 35 (2) is an example with TH1 = 3 and TH2 = 7.
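The scaling, the estimation of N, and the table selection can be combined in one sketch. It assumes integer arithmetic for the scaling, uses the thresholds of FIG. 35 (2) as defaults, and returns None when neither adjacent CU shares the prediction mode of the target CU, a case the text does not specify. Neighbour, estimate_nonzero_count, and select_last_table are hypothetical names.

```python
from typing import NamedTuple, Optional


class Neighbour(NamedTuple):
    pred_mode: str       # "intra" or "inter"
    nonzero_count: int   # number of non-zero transform coefficients
    area: int            # width * height of the adjacent CU


def estimate_nonzero_count(target_mode: str, target_area: int,
                           above: Neighbour, left: Neighbour) -> Optional[int]:
    """Estimate N for the target CU from adjacent CUs with the same prediction mode."""
    counts = [n.nonzero_count * target_area // n.area   # scale to the target area
              for n in (above, left) if n.pred_mode == target_mode]
    if len(counts) == 2:
        return (counts[0] + counts[1]) >> 1             # average of both neighbours
    if len(counts) == 1:
        return counts[0]
    return None                                         # not covered by the text


def select_last_table(n: int, th1: int = 3, th2: int = 7) -> str:
    """Three-way selection in the manner of FIG. 35 (2)."""
    if n <= th1:
        return "A"
    if n <= th2:
        return "D"
    return "E"
```

For example, with two inter-predicted neighbours of the same size holding 4 and 7 non-zero coefficients, the estimate is N = 5 and the table D is selected.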

FIG. 31 (3) is a flowchart showing a process of determining the variable-length code table based on the estimated number of non-zero transform coefficients of the target CU and encoding or decoding LAST. The only difference from FIG. 31 (1) is that S3101 in FIG. 31 (1) is replaced with S31012 in FIG. 31 (2); the other steps are the same, and their description is omitted. In S31012, the CU decoding unit 1003 refers to the estimated number of non-zero transform coefficients of the target CU to determine the variable-length code table used for encoding or decoding LAST.

As described above, when non-zero transform coefficients exist in the high frequency range and the LAST coordinates become large, the variable-length code table used for LAST encoding is switched depending on the number of non-zero coefficients of adjacent CUs. Thus, the code amount can be further reduced by taking advantage of the characteristics of the variable-length code table.

An image encoding apparatus according to an aspect of the present invention includes means for dividing one screen of an input moving image into coding units (CUs) each including a plurality of pixels, means for performing a predetermined transform in units of the CU and outputting transform coefficients, and variable length coding means for variable-length coding the transform coefficients. The variable length coding means includes: means for determining the value of a first flag indicating whether or not a non-zero transform coefficient exists in the CU; means for determining the value of a second flag indicating whether or not non-zero transform coefficients exist only within a limited region of the CU (the DC component only, the DC component and the first AC component, or the DC component and the first and second AC components); means for deriving syntax indicating the position (LAST) of the non-zero coefficient farthest from the DC component when the transform coefficients in the CU are scanned from the DC component in scan order, and the non-zero coefficient values (LEVEL); means for switching, with reference to a coding parameter (prediction mode (intra or inter), quantization parameter, or CU size), which of the first flag and the second flag is variable-length coded, and variable-length coding that flag; means for coding the syntax indicating LAST and LEVEL when the first flag is variable-length coded and a non-zero transform coefficient exists in the CU; and means for, when the second flag is variable-length coded, coding the syntax indicating LEVEL if non-zero transform coefficients exist only within the limited region of the CU, and coding the syntax indicating LAST and LEVEL if a non-zero transform coefficient exists outside the limited region of the CU.

An image decoding apparatus according to an aspect of the present invention includes means for variable-length decoding encoded data using a coding unit (CU) including a plurality of pixels as a processing unit and outputting syntax, and means for deriving transform coefficients from the syntax. The variable length decoding means includes: means for switching, with reference to a coding parameter (prediction mode (intra or inter), quantization parameter, or CU size), which of a first flag and a second flag is variable-length decoded, and variable-length decoding that flag, the first flag indicating whether or not a non-zero transform coefficient exists in the CU and the second flag indicating whether or not non-zero transform coefficients exist only within a limited region of the CU (the DC component only, the DC component and the first AC component, or the DC component and the first and second AC components); means for variable-length decoding the syntax indicating LAST and LEVEL when the first flag is variable-length decoded and indicates that a non-zero transform coefficient exists in the CU; and means for, when the second flag is variable-length decoded, setting LAST to the position of the highest-frequency component in the limited region and decoding the variable length code indicating LEVEL if the second flag indicates that non-zero transform coefficients exist only within the limited region of the CU, and variable-length decoding the syntax indicating LAST and LEVEL if a non-zero transform coefficient exists outside the limited region of the CU.
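The handling of the second flag described in this aspect can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of the embodiment: the function name second_flag_syntax is hypothetical, the flag polarity (1 meaning "non-zero coefficients only within the limited region") is an assumption, and a CU without any non-zero coefficient is deliberately left out of scope.

```python
def second_flag_syntax(coeffs, region_len):
    """Decide what is coded for one CU when the second flag is in use.

    coeffs: transform coefficients of the CU in scan order, DC first.
    region_len: number of leading scan positions in the limited region
        (1 = DC only, 2 = DC and first AC, 3 = DC and first and second AC).
    Returns (second_flag, last, coded_syntax), where last is a scan index.
    """
    nonzero = [i for i, c in enumerate(coeffs) if c != 0]
    if not nonzero:
        raise ValueError("this sketch assumes at least one non-zero coefficient")
    last = nonzero[-1]
    if last < region_len:
        # LAST is not coded: the decoder sets it to the highest-frequency
        # position of the limited region, so only LEVEL is coded.
        return 1, region_len - 1, ["LEVEL"]
    # A non-zero coefficient lies outside the limited region:
    # both LAST and LEVEL are coded.
    return 0, last, ["LAST", "LEVEL"]
```

When the limited region is the DC component only and the DC coefficient is the only non-zero coefficient, no LAST syntax is coded at all, which is where the code amount for LAST is saved.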

An image encoding apparatus according to an aspect of the present invention includes means for dividing one screen of an input moving image into coding units (CUs) each including a plurality of pixels, means for performing a predetermined transform in units of the CU and outputting transform coefficients, and variable length coding means for variable-length coding the transform coefficients. The variable length coding means includes: means for determining the value of a first flag indicating whether or not a non-zero transform coefficient exists in the CU; means for variable-length coding the first flag; means for deriving syntax indicating the position (LAST of the CU) of the non-zero coefficient farthest from the DC component when the transform coefficients in the CU are scanned from the DC component in scan order, and the non-zero coefficient values (LEVEL); means for dividing the CU into sub-blocks; means for deriving the position of the sub-block including LAST and the position of LAST within that sub-block (LAST of the sub-block); means for switching, when the first flag indicates that a non-zero transform coefficient exists in the CU and with reference to a coding parameter (prediction mode (intra or inter), quantization parameter, or CU size), whether the LAST of the CU or the LAST of the sub-block is variable-length coded, and performing the variable-length coding; and means for coding the syntax indicating LEVEL.

An image decoding apparatus according to an aspect of the present invention includes means for variable-length decoding encoded data using a coding unit (CU) including a plurality of pixels as a processing unit and outputting syntax, and means for deriving transform coefficients from the syntax. The variable length decoding means includes: means for decoding a first flag indicating whether or not a non-zero transform coefficient exists in the CU; means for decoding, when the first flag indicates that a non-zero transform coefficient exists in the CU, either the LAST of the CU or the LAST of the sub-block with reference to a coding parameter (prediction mode (intra or inter), quantization parameter, or CU size); means for deriving the LAST of the CU when the LAST of the sub-block is decoded; and means for decoding the syntax indicating LEVEL.
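The relation between the LAST of the CU and the LAST of the sub-block can be illustrated with a short Python sketch. The 4x4 sub-block size, the (x, y) coordinate representation of LAST, and the names split_last and merge_last are all assumptions made for illustration. The text above states only that the sub-block position and the position within the sub-block are derived on the encoder side and that the LAST of the CU is derived from them on the decoder side.

```python
SUB = 4  # assumed sub-block width and height, in coefficients


def split_last(x, y, sub=SUB):
    """Encoder side: split the CU-level LAST (x, y) into the position of
    the sub-block containing it and the position inside that sub-block."""
    sub_pos = (x // sub, y // sub)
    last_in_sub = (x % sub, y % sub)
    return sub_pos, last_in_sub


def merge_last(sub_pos, last_in_sub, sub=SUB):
    """Decoder side: derive the CU-level LAST from the sub-block position
    and the LAST of the sub-block."""
    x = sub_pos[0] * sub + last_in_sub[0]
    y = sub_pos[1] * sub + last_in_sub[1]
    return x, y
```

Because the position inside a sub-block has a small, fixed range, it can be coded with short codes even when the CU-level coordinates are large.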

An image encoding apparatus according to an aspect of the present invention includes means for dividing one screen of an input moving image into coding units (CUs) each including a plurality of pixels, means for performing a predetermined transform in units of the CU and outputting transform coefficients, and variable length coding means for variable-length coding the transform coefficients. The variable length coding means includes: means for determining the value of a first flag indicating whether or not a non-zero transform coefficient exists in the CU; means for variable-length coding the first flag; means for deriving syntax indicating the position (LAST) of the non-zero coefficient farthest from the DC component when the transform coefficients in the CU are scanned from the DC component in scan order, and the non-zero coefficient values (LEVEL); means for switching, when the first flag indicates that a non-zero transform coefficient exists in the CU and with reference to a coding parameter (prediction mode (intra or inter), quantization parameter, CU size, number of non-zero coefficients of adjacent CUs, or scan direction), the variable-length code table used for coding LAST, and variable-length coding LAST; and means for coding the syntax indicating LEVEL.

An image decoding apparatus according to an aspect of the present invention includes means for variable-length decoding encoded data using a coding unit (CU) including a plurality of pixels as a processing unit and outputting syntax, and means for deriving transform coefficients from the syntax. The variable length decoding means includes: means for decoding a first flag indicating whether or not a non-zero transform coefficient exists in the CU; means for switching, when the first flag indicates that a non-zero transform coefficient exists in the CU and with reference to a coding parameter (prediction mode (intra or inter), quantization parameter, CU size, number of non-zero coefficients of adjacent CUs, or scan direction), the variable-length code table used for decoding LAST, and variable-length decoding LAST; and means for decoding the syntax indicating LEVEL.

(Example of software implementation)
Note that part of the image encoding device 11 and the image decoding device 31 in the above-described embodiment, for example, the entropy decoding unit 301, the prediction parameter decoding unit 302, the loop filter 305, the predicted image generation unit 308, the inverse quantization / inverse DCT unit 311, the addition unit 312, the predicted image generation unit 101, the subtraction unit 102, the DCT / quantization unit 103, the entropy encoding unit 104, the inverse quantization / inverse DCT unit 105, the loop filter 107, the encoding parameter determination unit 110, and the prediction parameter encoding unit 111, may be realized by a computer. In that case, a program for realizing these control functions may be recorded on a computer-readable recording medium, and the program recorded on the recording medium may be read and executed by a computer system. Here, the “computer system” is a computer system built into either the image encoding device 11 or the image decoding device 31 and includes an OS and hardware such as peripheral devices. The “computer-readable recording medium” refers to a portable medium such as a flexible disk, a magneto-optical disk, a ROM, or a CD-ROM, or a storage device such as a hard disk built into a computer system. Furthermore, the “computer-readable recording medium” may include a medium that dynamically holds a program for a short time, such as a communication line used when the program is transmitted via a network such as the Internet or via a communication line such as a telephone line, and a medium that holds the program for a certain period of time, such as a volatile memory inside a computer system serving as a server or a client in that case. The program may be one that realizes part of the functions described above, or one that realizes the functions described above in combination with a program already recorded in the computer system.

Further, part or all of the image encoding device 11 and the image decoding device 31 in the above-described embodiment may be realized as an integrated circuit such as an LSI (Large Scale Integration). Each functional block of the image encoding device 11 and the image decoding device 31 may be individually made into a processor, or a part or all of them may be integrated into a processor. Further, the method of circuit integration is not limited to LSI, and may be realized by a dedicated circuit or a general-purpose processor. Further, in the case where an integrated circuit technology that replaces LSI appears due to progress in semiconductor technology, an integrated circuit based on the technology may be used.

(Application examples)
The image encoding device 11 and the image decoding device 31 described above can be used by being mounted on various devices that perform transmission, reception, recording, and reproduction of moving images. The moving image may be a natural moving image captured by a camera or the like, or may be an artificial moving image (including CG and GUI) generated by a computer or the like.

First, it will be described with reference to FIG. 8 that the above-described image encoding device 11 and image decoding device 31 can be used for transmission and reception of moving images.

(A) of FIG. 8 is a block diagram showing the configuration of a transmission device PROD_A equipped with the image encoding device 11. As illustrated in FIG. 8A, the transmission device PROD_A includes an encoding unit PROD_A1 that obtains encoded data by encoding a moving image, a modulation unit PROD_A2 that obtains a modulated signal by modulating a carrier wave with the encoded data obtained by the encoding unit PROD_A1, and a transmission unit PROD_A3 that transmits the modulated signal obtained by the modulation unit PROD_A2. The above-described image encoding device 11 is used as the encoding unit PROD_A1.

The transmission device PROD_A may further include, as sources of the moving image input to the encoding unit PROD_A1, a camera PROD_A4 that captures moving images, a recording medium PROD_A5 on which moving images are recorded, an input terminal PROD_A6 for inputting moving images from the outside, and an image processing unit A7 that generates or processes images. FIG. 8A illustrates a configuration in which the transmission device PROD_A includes all of these, but some of them may be omitted.

Note that the recording medium PROD_A5 may hold a non-encoded moving image, or a moving image encoded by a recording encoding scheme different from the transmission encoding scheme. In the latter case, a decoding unit (not shown) that decodes the encoded data read from the recording medium PROD_A5 in accordance with the recording encoding scheme may be interposed between the recording medium PROD_A5 and the encoding unit PROD_A1.

(B) of FIG. 8 is a block diagram showing the configuration of a receiving device PROD_B equipped with the image decoding device 31. As shown in FIG. 8B, the receiving device PROD_B includes a receiving unit PROD_B1 that receives the modulated signal, a demodulation unit PROD_B2 that obtains encoded data by demodulating the modulated signal received by the receiving unit PROD_B1, and a decoding unit PROD_B3 that obtains a moving image by decoding the encoded data obtained by the demodulation unit PROD_B2. The above-described image decoding device 31 is used as the decoding unit PROD_B3.

The receiving device PROD_B may further include, as supply destinations of the moving image output by the decoding unit PROD_B3, a display PROD_B4 that displays the moving image, a recording medium PROD_B5 for recording the moving image, and an output terminal PROD_B6 for outputting the moving image to the outside. FIG. 8B illustrates a configuration in which the receiving device PROD_B includes all of these, but some of them may be omitted.

Note that the recording medium PROD_B5 may be for recording a non-encoded moving image, or for recording a moving image encoded by a recording encoding scheme different from the transmission encoding scheme. In the latter case, an encoding unit (not shown) that encodes the moving image acquired from the decoding unit PROD_B3 in accordance with the recording encoding scheme may be interposed between the decoding unit PROD_B3 and the recording medium PROD_B5.

Note that the transmission medium for transmitting the modulated signal may be wireless or wired. Further, the transmission mode for transmitting the modulated signal may be broadcasting (here, a transmission mode in which the transmission destination is not specified in advance) or communication (here, a transmission mode in which the transmission destination is specified in advance). That is, the transmission of the modulated signal may be realized by any of wireless broadcasting, wired broadcasting, wireless communication, and wired communication.

For example, a broadcasting station (broadcasting equipment, etc.) / receiving station (television receiver, etc.) of terrestrial digital broadcasting is an example of a transmission device PROD_A / receiving device PROD_B that transmits and receives a modulated signal by wireless broadcasting. A broadcasting station (broadcasting equipment, etc.) / receiving station (television receiver, etc.) of cable television broadcasting is an example of a transmission device PROD_A / receiving device PROD_B that transmits and receives a modulated signal by wired broadcasting.

In addition, a server (workstation, etc.) / client (television receiver, personal computer, smartphone, etc.) of a VOD (Video On Demand) service or a video sharing service using the Internet is an example of a transmission device PROD_A / receiving device PROD_B that transmits and receives a modulated signal by communication (normally, either a wireless or wired transmission medium is used in a LAN, and a wired transmission medium is used in a WAN). Here, the personal computer includes a desktop PC, a laptop PC, and a tablet PC. The smartphone also includes a multi-function mobile phone terminal.

In addition to the function of decoding the encoded data downloaded from the server and displaying it on the display, the video sharing service client has a function of encoding a moving image captured by the camera and uploading it to the server. That is, the client of the video sharing service functions as both the transmission device PROD_A and the reception device PROD_B.

Next, the fact that the above-described image encoding device 11 and image decoding device 31 can be used for recording and reproduction of moving images will be described with reference to FIG.

FIG. 9A is a block diagram showing the configuration of a recording device PROD_C equipped with the image encoding device 11 described above. As shown in FIG. 9A, the recording device PROD_C includes an encoding unit PROD_C1 that obtains encoded data by encoding a moving image, and a writing unit PROD_C2 that writes the encoded data obtained by the encoding unit PROD_C1 to a recording medium PROD_M. The above-described image encoding device 11 is used as the encoding unit PROD_C1.

The recording medium PROD_M may be (1) of a type built into the recording device PROD_C, such as an HDD (Hard Disk Drive) or SSD (Solid State Drive), (2) of a type connected to the recording device PROD_C, such as an SD memory card or USB (Universal Serial Bus) flash memory, or (3) of a type loaded into a drive device (not shown) built into the recording device PROD_C, such as a DVD (Digital Versatile Disc) or BD (Blu-ray Disc: registered trademark).

In addition, the recording device PROD_C may further include, as sources of the moving image input to the encoding unit PROD_C1, a camera PROD_C3 that captures moving images, an input terminal PROD_C4 for inputting moving images from the outside, a receiving unit PROD_C5 for receiving moving images, and an image processing unit PROD_C6 that generates or processes images. FIG. 9A illustrates a configuration in which the recording device PROD_C includes all of these, but some of them may be omitted.

The receiving unit PROD_C5 may receive a non-encoded moving image, or may receive encoded data encoded by a transmission encoding scheme different from the recording encoding scheme. In the latter case, a transmission decoding unit (not shown) that decodes encoded data encoded by the transmission encoding scheme may be interposed between the receiving unit PROD_C5 and the encoding unit PROD_C1.

Examples of such a recording device PROD_C include a DVD recorder, a BD recorder, and an HDD (Hard Disk Drive) recorder (in this case, the input terminal PROD_C4 or the receiving unit PROD_C5 is the main source of moving images). In addition, a camcorder (in this case, the camera PROD_C3 is the main source of moving images), a personal computer (in this case, the receiving unit PROD_C5 or the image processing unit C6 is the main source of moving images), a smartphone (in this case, the camera PROD_C3 or the receiving unit PROD_C5 is the main source of moving images), and the like are also examples of such a recording device PROD_C.

(B) of FIG. 9 is a block diagram showing the configuration of a playback device PROD_D equipped with the above-described image decoding device 31. As shown in FIG. 9 (b), the playback device PROD_D includes a reading unit PROD_D1 that reads encoded data written on the recording medium PROD_M, and a decoding unit PROD_D2 that obtains a moving image by decoding the encoded data read by the reading unit PROD_D1. The above-described image decoding device 31 is used as the decoding unit PROD_D2.

The recording medium PROD_M may be (1) of a type built into the playback device PROD_D, such as an HDD or SSD, (2) of a type connected to the playback device PROD_D, such as an SD memory card or USB flash memory, or (3) of a type loaded into a drive device (not shown) built into the playback device PROD_D, such as a DVD or BD.

In addition, the playback device PROD_D may further include, as supply destinations of the moving image output by the decoding unit PROD_D2, a display unit PROD_D3 that displays the moving image, an output terminal PROD_D4 for outputting the moving image to the outside, and a transmission unit PROD_D5 that transmits the moving image. FIG. 9B illustrates a configuration in which the playback device PROD_D includes all of these, but some of them may be omitted.

The transmission unit PROD_D5 may transmit a non-encoded moving image, or may transmit encoded data encoded by a transmission encoding scheme different from the recording encoding scheme. In the latter case, it is preferable to interpose an encoding unit (not shown) that encodes the moving image by the transmission encoding scheme between the decoding unit PROD_D2 and the transmission unit PROD_D5.

Examples of such a playback device PROD_D include a DVD player, a BD player, and an HDD player (in this case, the output terminal PROD_D4 to which a television receiver or the like is connected is the main supply destination of moving images). In addition, a television receiver (in this case, the display PROD_D3 is the main supply destination of moving images), digital signage (also referred to as an electronic signboard or an electronic bulletin board; the display PROD_D3 or the transmission unit PROD_D5 is the main supply destination of moving images), a desktop PC (in this case, the output terminal PROD_D4 or the transmission unit PROD_D5 is the main supply destination of moving images), a laptop or tablet PC (in this case, the display PROD_D3 or the transmission unit PROD_D5 is the main supply destination of moving images), a smartphone (in this case, the display PROD_D3 or the transmission unit PROD_D5 is the main supply destination of moving images), and the like are also examples of such a playback device PROD_D.

(Hardware implementation and software implementation)
Each block of the image decoding device 31 and the image encoding device 11 described above may be realized in hardware by a logic circuit formed on an integrated circuit (IC chip), or may be realized in software using a CPU (Central Processing Unit).

In the latter case, each of the above devices includes a CPU that executes the instructions of a program realizing each function, a ROM (Read Only Memory) that stores the program, a RAM (Random Access Memory) into which the program is loaded, and a storage device (recording medium) such as a memory that stores the program and various data. The object of the embodiment of the present invention can also be achieved by supplying each of the above devices with a recording medium on which the program code (executable program, intermediate code program, or source program) of the control program of each device, which is software realizing the above-described functions, is recorded in a computer-readable manner, and by having the computer (or CPU or MPU) read and execute the program code recorded on the recording medium.

Examples of the recording medium include tapes such as magnetic tapes and cassette tapes; magnetic disks such as floppy (registered trademark) disks and hard disks; optical discs such as CD-ROM (Compact Disc Read-Only Memory), MO disc (Magneto-Optical disc), MD (Mini Disc), DVD (Digital Versatile Disc), CD-R (CD Recordable), and Blu-ray Disc (registered trademark); cards such as IC cards (including memory cards) and optical cards; semiconductor memories such as mask ROM, EPROM (Erasable Programmable Read-Only Memory), EEPROM (Electrically Erasable and Programmable Read-Only Memory: registered trademark), and flash ROM; and PLD (Programmable Logic Device), FPGA (Field Programmable Gate Array), and the like.

Further, each of the above devices may be configured to be connectable to a communication network, and the program code may be supplied via the communication network. The communication network is not particularly limited as long as it can transmit the program code. For example, the Internet, an intranet, an extranet, a LAN (Local Area Network), an ISDN (Integrated Services Digital Network), a VAN (Value-Added Network), a CATV (Community Antenna Television / Cable Television) communication network, a virtual private network (Virtual Private Network), a telephone line network, a mobile communication network, a satellite communication network, and the like can be used. The transmission medium constituting the communication network may be any medium that can transmit the program code, and is not limited to a specific configuration or type. For example, wired media such as IEEE (Institute of Electrical and Electronics Engineers) 1394, USB, power line carrier, cable TV line, telephone line, and ADSL (Asymmetric Digital Subscriber Line) line can be used, as can wireless media such as infrared (for example, IrDA (Infrared Data Association) or remote control), Bluetooth (registered trademark), IEEE 802.11 wireless, HDR (High Data Rate), NFC (Near Field Communication), DLNA (Digital Living Network Alliance: registered trademark), a mobile phone network, a satellite line, and a terrestrial digital broadcasting network. The embodiment of the present invention can also be realized in the form of a computer data signal embedded in a carrier wave, in which the program code is embodied by electronic transmission.

The embodiments of the present invention are not limited to the above-described embodiments, and various modifications are possible within the scope shown in the claims. Embodiments obtained by appropriately combining technical means disclosed in different embodiments are also included in the technical scope of the present invention. Furthermore, a new technical feature can be formed by combining the technical means disclosed in each embodiment.

(Cross-reference of related applications)
This application claims the benefit of priority to Japanese Patent Application No. 2017-040322 filed on March 3, 2017, the entire contents of which are incorporated herein by reference.

Embodiments of the present invention can be suitably applied to an image decoding apparatus that decodes encoded data in which image data is encoded, and to an image encoding apparatus that generates encoded data in which image data is encoded. Further, they can be suitably applied to the data structure of encoded data generated by the image encoding apparatus and referenced by the image decoding apparatus.

10 CT information decoding unit 11 Image encoding device 20 CU decoding unit 31 Image decoding device 41 Image display device

Claims

In a video encoding device that encodes an input video,
A dividing unit that divides one screen of the input moving image into coding units (CU) including a plurality of pixels;
An output unit that performs a predetermined conversion in units of the CU and outputs a conversion coefficient;
A variable length coding unit for variable length coding the transform coefficient,
The variable length encoding unit includes:
A first determination unit that determines a value of a first flag indicating whether or not a non-zero transform coefficient exists in the CU;
A second determination unit that determines a value of a second flag indicating whether or not a non-zero transform coefficient exists only in a limited region of the CU;
A derivation unit for deriving a syntax indicating a farthest position (LAST) and a non-zero coefficient value (LEVEL) by scanning conversion coefficients from the DC component in the scan order in the CU;
A first encoding unit that refers to an encoding parameter and performs variable length encoding by switching which of the first flag and the second flag is variable length encoded;
When the first flag is variable length encoded, if there is a non-zero transform coefficient in the CU, a second encoding unit that encodes syntax indicating LAST and LEVEL;
And a third encoding unit that, when the second flag is variable-length encoded, encodes a syntax indicating LEVEL if a non-zero transform coefficient exists only in the limited region of the CU, and encodes a syntax indicating LAST and LEVEL if a non-zero transform coefficient exists outside the limited region of the CU.
In a video decoding device for decoding a video,
A variable length decoding unit;
An output unit that performs variable-length decoding of encoded data using a coding unit (CU) composed of a plurality of pixels as a processing unit, and outputs a syntax;
A derivation unit for deriving a transform coefficient from the syntax,
The variable length decoding unit includes:
A first decoding unit that refers to an encoding parameter, switches which of a first flag indicating whether or not a non-zero transform coefficient exists in the CU and a second flag indicating whether or not a non-zero transform coefficient exists only in a limited region of the CU is variable-length decoded, and variable-length decodes that flag;
A second decoding unit that variable-length decodes the syntax indicating LAST and LEVEL when the first flag is variable-length decoded and indicates that a non-zero transform coefficient exists in the CU;
And a third decoding unit that, when the second flag is variable-length decoded, sets LAST to the position of the highest-frequency component in the limited region and decodes the variable length code indicating LEVEL if the second flag indicates that a non-zero transform coefficient exists only in the limited region of the CU, and variable-length decodes the syntax indicating LAST if a non-zero transform coefficient exists outside the limited region of the CU.
The moving picture encoding apparatus or moving picture decoding apparatus according to claim 1 or 2, wherein the limited region of the CU is a position of a DC component of a transform coefficient.
The moving picture encoding apparatus or moving picture decoding apparatus according to claim 1 or 2, wherein the limited region of the CU is a position of a DC component and a first AC component of a transform coefficient.
3. The moving picture coding apparatus or moving picture decoding apparatus according to claim 1, wherein the coding parameter is one of a prediction mode (intra or inter), a quantization parameter, and a CU size.
In a video encoding device that encodes an input video,
A dividing unit that divides one screen of the input moving image into coding units (CU) including a plurality of pixels;
An output unit that performs a predetermined conversion in units of the CU and outputs a conversion coefficient;
A variable length coding unit for variable length coding the transform coefficient,
The variable length encoding unit includes:
A determination unit for determining a value of a first flag indicating whether or not a non-zero conversion coefficient exists in the CU;
A first encoding unit for variable-length encoding the first flag;
A first deriving unit for deriving a syntax indicating a farthest position (LAST of CU) and a non-zero coefficient value (LEVEL) by scanning conversion coefficients from the DC component in the scan order in the CU;
A dividing unit for dividing the CU into sub-blocks;
A second derivation unit for deriving a position of a subblock including LAST and a position of LAST in the subblock (LAST of the subblock);
A second encoding unit that, when the first flag indicates the presence of a non-zero transform coefficient in the CU, refers to an encoding parameter, switches between variable-length encoding the LAST of the CU and the LAST of the sub-block, and performs the encoding;
And a third encoding unit that encodes syntax indicating LEVEL.
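The derivation steps of the encoder claim above can be sketched in Python as follows. This is only a sketch under stated assumptions: the coefficients are given as a list already arranged in scan order (DC first), each sub-block occupies 16 consecutive scan positions, and the choice between the two LAST representations is made from the CU size with an arbitrary threshold. All function names, SUBBLOCK_COEFFS, and the threshold are hypothetical. The sketch also does not model how the positions of zero coefficients before LAST are signalled.

```python
SUBBLOCK_COEFFS = 16  # assumption: 4x4 sub-blocks, contiguous in scan order


def derive_cu_last(coeffs):
    """Scan index of the last non-zero coefficient, or None if all are zero."""
    for idx in range(len(coeffs) - 1, -1, -1):
        if coeffs[idx] != 0:
            return idx
    return None


def derive_subblock_last(cu_last, subblock_coeffs=SUBBLOCK_COEFFS):
    """Split LAST of the CU into (sub-block index, LAST within the sub-block)."""
    return divmod(cu_last, subblock_coeffs)


def prefer_subblock_last(cu_size, threshold=16):
    """Hypothetical switching rule based on one encoding parameter (CU size)."""
    return cu_size >= threshold


def build_syntax(coeffs, cu_size):
    """Collect the syntax elements an encoder would pass to entropy coding."""
    cu_last = derive_cu_last(coeffs)
    if cu_last is None:
        return {"first_flag": 0}  # no non-zero coefficient in the CU
    syntax = {"first_flag": 1,
              "levels": [c for c in coeffs[:cu_last + 1] if c != 0]}
    if prefer_subblock_last(cu_size):
        sb_index, sb_last = derive_subblock_last(cu_last)
        syntax["subblock_index"] = sb_index
        syntax["subblock_last"] = sb_last
    else:
        syntax["cu_last"] = cu_last
    return syntax
```

For example, if the last non-zero coefficient sits at scan index 20, the sub-block representation under these assumptions is sub-block 1 with LAST at position 4 inside it, which needs a smaller value range than the CU-level index.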
In a video decoding device for decoding a video,
A variable length decoding unit;
An output unit that performs variable-length decoding of encoded data using a coding unit (CU) composed of a plurality of pixels as a processing unit, and outputs a syntax;
A derivation unit for deriving a transform coefficient from the syntax,
The variable length decoding unit includes:
A first decoding unit for decoding a first flag indicating whether or not non-zero transform coefficients exist in the CU;
A second decoding unit that, if the first flag indicates that a non-zero transform coefficient exists in the CU, decodes either the LAST of the CU or the LAST of the sub-block with reference to an encoding parameter;
A derivation unit for deriving the LAST of the CU when the LAST of the sub-block has been decoded;
and a third decoding unit for decoding the syntax indicating LEVEL.
The moving picture coding apparatus or moving picture decoding apparatus according to claim 6, wherein the coding parameter is any one of a prediction mode (intra or inter), a quantization parameter, and a CU size.
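On the decoder side, deriving the LAST of the CU from the LAST of the sub-block can be illustrated with the short Python sketch below. It assumes that the index of the sub-block containing LAST is already known to the decoder and that each sub-block covers 16 consecutive scan positions. Neither assumption is stated in the claims, and the function name is hypothetical.

```python
def derive_cu_last_from_subblock(subblock_index, subblock_last,
                                 subblock_coeffs=16):
    """Rebuild the CU-level scan index of LAST from its sub-block form.

    subblock_index: index of the sub-block that contains LAST
        (assumed to be available to the decoder).
    subblock_last: position of LAST inside that sub-block.
    subblock_coeffs: coefficients per sub-block (assumption: 16).
    """
    if not 0 <= subblock_last < subblock_coeffs:
        raise ValueError("subblock_last lies outside the sub-block")
    return subblock_index * subblock_coeffs + subblock_last
```

This is the inverse of splitting a CU-level index with divmod(cu_last, 16), so an encoder and decoder that share the same sub-block size would agree on the position of LAST.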
In a video encoding device that encodes an input video,
A dividing unit that divides one screen of the input moving image into coding units (CU) including a plurality of pixels;
An output unit that performs a predetermined transform in units of the CU and outputs transform coefficients;
A variable length coding unit for variable length coding the transform coefficient,
The variable length encoding unit includes:
A determination unit for determining a value of a first flag indicating whether or not a non-zero transform coefficient exists in the CU;
A first encoding unit for variable-length encoding the first flag;
A derivation unit for deriving syntax indicating the position of the non-zero transform coefficient farthest from the DC component (LAST) and the values of the non-zero coefficients (LEVEL), by scanning the transform coefficients in the CU in scan order starting from the DC component;
A second encoding unit that, when the first flag indicates that a non-zero transform coefficient exists in the CU, performs variable-length encoding of LAST while switching the variable-length code table used for encoding LAST with reference to an encoding parameter;
and a third encoding unit that encodes the syntax indicating LEVEL.
In a video decoding device for decoding a video,
A variable length decoding unit;
An output unit that performs variable-length decoding of encoded data using a coding unit (CU) composed of a plurality of pixels as a processing unit, and outputs a syntax;
A derivation unit for deriving a transform coefficient from the syntax,
The variable length decoding unit includes:
A first decoding unit for decoding a first flag indicating whether or not non-zero transform coefficients exist in the CU;
A second decoding unit that, when the first flag indicates that a non-zero transform coefficient exists in the CU, performs variable-length decoding of LAST while switching the variable-length code table used for decoding LAST with reference to an encoding parameter;
and a third decoding unit for decoding the syntax indicating LEVEL.
The video encoding device or video decoding device according to claim 9 or 10, wherein the encoding parameter is any one of a prediction mode (intra or inter), a quantization parameter, a CU size, the number of non-zero coefficients of adjacent CUs, and a scan direction.
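The table switching recited in the two preceding independent claims can be illustrated with the following Python sketch, in which the encoder and the decoder select the same code for LAST from a shared encoding parameter. The two codes used here, a unary code and an order-0 Exp-Golomb code, are stand-ins chosen for illustration and are not the code tables of the specification. The rule that maps the prediction mode to a table and all function names are likewise hypothetical.

```python
def unary_encode(value):
    """Unary code: 'value' ones followed by a terminating zero."""
    return "1" * value + "0"


def unary_decode(bits):
    """Return (value, remaining bits)."""
    n = bits.index("0")
    return n, bits[n + 1:]


def expgolomb_encode(value):
    """Order-0 Exp-Golomb code of a non-negative integer."""
    code = bin(value + 1)[2:]
    return "0" * (len(code) - 1) + code


def expgolomb_decode(bits):
    """Return (value, remaining bits)."""
    zeros = bits.index("1")  # number of leading zeros
    end = 2 * zeros + 1
    return int(bits[zeros:end], 2) - 1, bits[end:]


TABLES = {
    "unary": (unary_encode, unary_decode),
    "expgolomb": (expgolomb_encode, expgolomb_decode),
}


def select_table(pred_mode):
    """Hypothetical rule: pick the LAST code table from the prediction mode."""
    return "unary" if pred_mode == "inter" else "expgolomb"


def encode_last(last, pred_mode):
    return TABLES[select_table(pred_mode)][0](last)


def decode_last(bits, pred_mode):
    return TABLES[select_table(pred_mode)][1](bits)
```

Because the selection depends only on a parameter that both sides already know, no extra bits are needed to tell the decoder which table was used. The same pattern would apply if the selection were keyed on the quantization parameter, the CU size, the number of non-zero coefficients of adjacent CUs, or the scan direction.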