WO2012090962A1

WO2012090962A1 - Image decoding device, image encoding device, data structure of encoded data, arithmetic decoding device, and arithmetic encoding device

Info

Publication number: WO2012090962A1
Application number: PCT/JP2011/080126
Authority: WO
Inventors: 知宏猪飼; 将伸八杉
Original assignee: シャープ株式会社
Priority date: 2010-12-28
Filing date: 2011-12-26
Publication date: 2012-07-05

Abstract

A video decoding device (1) comprises: a final-coefficient decoding unit (121) for detecting, in reverse scan order in a coefficient matrix (MX10), the position of the final coefficient; a position determination unit (122) for determining whether or not, in the coefficient matrix (MX10), the final coefficient is positioned in the first row containing DC coefficient DC; and a two-step decoding unit (123) for, in a case where the position of the final coefficient is in the first row, performing decoding of the coefficient matrix (MX10) separately for the first row and rows other than the first row.

Description

Image decoding device, image encoding device, data structure of encoded data, arithmetic decoding device, arithmetic encoding device

The present invention relates to an image decoding apparatus for decoding a coefficient matrix, an image encoding apparatus for encoding a coefficient matrix, and a data structure of encoded data obtained by encoding the coefficient matrix. The present invention also relates to an arithmetic decoding device that decodes encoded data that has been arithmetically encoded, and an image decoding device that includes such an arithmetic decoding device. The present invention also relates to an arithmetic encoding device that generates encoded data that has been arithmetically encoded, and an image encoding device that includes such an arithmetic encoding device.

In order to efficiently transmit or record a moving image, a moving image encoding device that generates encoded data by encoding the moving image and a moving image decoding device that generates a decoded image by decoding the encoded data are used.

Specific examples of moving image encoding methods include H.264/MPEG-4 AVC (Non-Patent Documents 1 and 2), the method adopted in the KTA software, which is a codec for joint development in VCEG (Video Coding Expert Group), and the method adopted in the TMuC (Test Model under Consideration) software, which is its successor codec (Non-Patent Document 3).

In these moving image encoding methods, the prediction residual obtained by subtracting a predicted image from the original image is transformed, and the quantized transform coefficients are then encoded to generate encoded data.

The transform coefficients of this prediction residual occupy most of the encoded data. More specifically, although it depends on the encoding settings and the image characteristics, 50 to 80% of the encoded data is said to be information related to transform coefficients. It is therefore desirable to encode the transform coefficients efficiently.

The transform coefficients are expressed as a two-dimensional matrix of a predetermined size. The elements of this matrix are scanned in a predetermined order and stored in a one-dimensional array, and the transform coefficients are then encoded.

For example, in H.264/AVC, the scan order is chosen so that the correlation between the values of transform coefficients can be used effectively, and the transform coefficients are stored in a one-dimensional array by so-called zigzag scanning (Non-Patent Documents 1 and 2). H.264/AVC adopts CAVLC (Context-Adaptive Variable Length Coding) as one of its variable length coding methods. H.264/AVC also employs a coding method called CABAC (Context-Adaptive Binary Arithmetic Coding).

In CAVLC, variable length coding is performed based on the absolute value (level) and the sign (positive or negative) of each transform coefficient and on the runs of consecutive zero-valued coefficients (0 run; zero run). Furthermore, in CAVLC, VLC (Variable Length Coding) tables corresponding to the run and level encoding are used. In addition, when the number of non-zero coefficients (TotalCoeff) and the number of trailing coefficients whose absolute value is 1 (TrailingOnes) are encoded, one of a plurality of VLC tables is adaptively selected according to the situation of the surrounding blocks.
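The quantities named above can be illustrated with a short sketch. It is not the H.264/AVC CAVLC syntax itself: the function name `cavlc_style_stats` and the forward-order (run, level, sign) triples are assumptions made for illustration, and no VLC table lookup is performed. Only the cap of three on TrailingOnes follows H.264 CAVLC.

```python
def cavlc_style_stats(coeffs):
    """Derive CAVLC-style statistics from a scanned one-dimensional array.

    Hypothetical helper for illustration only; it computes the quantities
    and does not produce any actual VLC codewords.
    """
    # TotalCoeff: number of non-zero coefficients.
    total_coeff = sum(1 for c in coeffs if c != 0)

    # TrailingOnes: coefficients of magnitude 1 at the high-frequency end,
    # counted over the non-zero coefficients and capped at 3 as in H.264 CAVLC.
    trailing_ones = 0
    for c in reversed(coeffs):
        if c == 0:
            continue
        if abs(c) == 1 and trailing_ones < 3:
            trailing_ones += 1
        else:
            break

    # (zero run preceding the coefficient, level, sign) for each non-zero value.
    triples = []
    run = 0
    for c in coeffs:
        if c == 0:
            run += 1
        else:
            triples.append((run, abs(c), "-" if c < 0 else "+"))
            run = 0
    return total_coeff, trailing_ones, triples


stats = cavlc_style_stats([3, 0, 1, -1, 0, 0, 1, 0])
print(stats)
```

For the sample array, the sketch reports four non-zero coefficients, three of which are trailing ones.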

Also, TMuC (Test Model under Consideration) employs the following VLC encoding method (Non-Patent Document 3). First, in the encoding process, the position and value of the last coefficient are encoded. Next, in a procedure called run mode, sets of a zero run and a coefficient are encoded. This run mode is executed until a predetermined condition is satisfied, such as the appearance of a coefficient having a large level. When the run mode ends, a procedure for sequentially encoding the remaining coefficients is executed, and the encoding process ends when all the coefficients have been encoded.

The following supplements the above. First, in such an encoding method, an image (picture) constituting a moving image is managed by a hierarchical structure consisting of slices obtained by dividing the image, largest coding units (LCU: Largest Coding Unit) obtained by dividing a slice, coding units (CU: Coding Unit) obtained by dividing a largest coding unit, and blocks and partitions obtained by dividing a coding unit, and in many cases the image is encoded with the block as the minimum unit.

Further, as described above, in such an encoding method, a predicted image is usually generated based on a locally decoded image obtained by encoding and decoding an input image, and the transform coefficients obtained by applying a DCT (Discrete Cosine Transform) to the difference image between the predicted image and the input image (sometimes referred to as the “residual image” or “prediction residual”) are encoded.

As described above, specific coding methods for transform coefficients include context-adaptive variable length coding (CAVLC: Context-based Adaptive Variable Length Coding) and context-adaptive binary arithmetic coding (CABAC: Context-based Adaptive Binary Arithmetic Coding).

Further, as described above, in CAVLC, the transform coefficients are zigzag-scanned into a one-dimensional vector, and then syntax indicating the value of each transform coefficient, syntax representing the length of consecutive zeros (also called a run), and the like are encoded.

In CABAC, binarization processing is performed on various syntaxes representing transform coefficients, and the binary data obtained by the binarization processing is arithmetically encoded. Examples of the various syntaxes include a flag sig_flag indicating whether or not the transform coefficient is 0 and a flag last_flag indicating whether or not the transform coefficient is the last transform coefficient in the processing order.

In CABAC, when one symbol (one bit of binary data, also referred to as a Bin) is encoded, the context index assigned to the frequency component to be processed is referred to, and arithmetic coding is performed according to the occurrence probability indicated by the probability state index included in the context variable specified by that context index. The occurrence probability specified by the probability state index is updated every time one symbol is encoded.
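The relationship between a context index, a context variable, and a probability state that is updated after every symbol can be sketched with a toy model. The state range, the probability mapping, and the update steps below are hypothetical; the table-driven state transitions of actual CABAC are not reproduced, and no arithmetic coder is implemented. The sketch only accumulates the ideal code length.

```python
import math


class ContextVariable:
    """Toy context variable: a most probable symbol (MPS) and a state index."""

    MAX_STATE = 30  # hypothetical upper bound on the probability state index

    def __init__(self):
        self.mps = 0    # current most probable symbol
        self.state = 0  # 0 means roughly equiprobable; larger means more skewed

    def lps_probability(self):
        # Hypothetical mapping from state index to least-probable-symbol probability.
        return 0.5 * (0.95 ** self.state)

    def update(self, bin_value):
        # The state is updated every time one symbol (bin) is processed.
        if bin_value == self.mps:
            self.state = min(self.state + 1, self.MAX_STATE)
        elif self.state == 0:
            self.mps = 1 - self.mps
        else:
            self.state = max(self.state - 2, 0)


def ideal_code_length(bins_with_ctx, num_contexts):
    """Sum of -log2(p) over all bins, where p comes from the selected context."""
    contexts = [ContextVariable() for _ in range(num_contexts)]
    bits = 0.0
    for ctx_idx, bin_value in bins_with_ctx:
        ctx = contexts[ctx_idx]  # the context index selects the context variable
        p_lps = ctx.lps_probability()
        p = 1.0 - p_lps if bin_value == ctx.mps else p_lps
        bits += -math.log2(p)
        ctx.update(bin_value)
    return bits


# Twenty identical bins coded with one context cost fewer than twenty bits.
print(ideal_code_length([(0, 0)] * 20, num_contexts=1))
```

Because each context adapts separately, assigning bins with similar statistics to the same context index is what lets the estimated probabilities become skewed and the code length shrink.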

Non-Patent Document 3 discloses a technique for assigning, to the syntax representing each transform coefficient in the frequency domain corresponding to the block to be processed, a context index corresponding to the position of that transform coefficient in the frequency domain.

According to this technique, when the distribution of transform coefficients in the frequency domain is substantially stationary over a plurality of blocks, for example, when the directions of edges in the plurality of blocks are substantially the same, arithmetic encoding using an appropriate occurrence probability can be performed for each transform coefficient, so that high encoding efficiency can be obtained.

However, conventional variable length coding of transform coefficients by VLC may be inefficient for the following reasons.

First, the conventional coefficient scanning method for the matrix of transform coefficients is constant regardless of the coefficient distribution on the matrix. For example, a typical scanning method is the above-described zigzag scan, but the conventional technique uses only this zigzag scan regardless of the coefficient distribution on the matrix. For this reason, only a VLC table dedicated to zigzag scanning has been defined.

Therefore, there is room for improvement in coding efficiency when the coefficient distribution on the matrix is biased. A more specific example is given below with reference to FIG. 35. FIG. 35 shows an example in which transform coefficients represented by a two-dimensional matrix are stored in a one-dimensional array by zigzag scanning. (a) of the figure shows an example of scanning the matrix MX100 of 4 × 4 transform coefficients, and (b) of the figure shows an example of the one-dimensional array ARR100 in which the transform coefficients are stored by the scan.

The numbers in the matrix MX100 shown in (a) of FIG. 35 represent the transform coefficient values. In the matrix MX100, the non-zero coefficients are distributed in the first column, while the zero coefficients are concentrated in the three right-hand columns.

The dotted arrows in (a) of the figure represent the scan order of the matrix MX100. That is, the scan proceeds from the upper left of the matrix MX100 in the order indicated by the dotted arrows, and the transform coefficients are stored in the one-dimensional array ARR100 shown in (b) of the figure.

In the matrix MX100 shown in (a) of FIG. 35, the zero coefficients are concentrated in the three right-hand columns. In the one-dimensional array ARR100 shown in (b) of FIG. 35, however, they are stored divided into RUN101 and RUN102.

As described above, when the non-zero transform coefficients are concentrated at the upper end or the left end of the matrix, such as the first row or the first column of the transform coefficient matrix, there is a problem that zero coefficients that are concentrated when viewed in the plane are divided in the one-dimensional array obtained by the scan.
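The splitting of zero runs can be reproduced with a small sketch. The coefficient values below are made up for illustration and are not the values of the matrix MX100 in FIG. 35; `zigzag_order` and `zero_runs` are hypothetical helper names.

```python
from itertools import groupby


def zigzag_order(n):
    """Return the (row, col) positions of an n x n matrix in zigzag scan order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col indexes one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)  # even diagonals run from bottom-left to top-right
        order.extend((r, s - r) for r in rows)
    return order


def zero_runs(arr):
    """Lengths of the maximal runs of zeros in a one-dimensional array."""
    return [len(list(g)) for is_zero, g in groupby(arr, key=lambda c: c == 0) if is_zero]


# Hypothetical 4 x 4 matrix whose non-zero coefficients all lie in the first column.
matrix = [
    [9, 0, 0, 0],
    [5, 0, 0, 0],
    [3, 0, 0, 0],
    [1, 0, 0, 0],
]
scanned = [matrix[r][c] for r, c in zigzag_order(4)]
print(scanned)
print(zero_runs(scanned))
```

Although the twelve zeros form a single rectangular region in the matrix, the zigzag scan interleaves them with the first-column coefficients, so the one-dimensional array contains several separate zero runs.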

The present invention has been made in view of the above-described problems, and an object of the present invention is to realize an image decoding apparatus and the like that can improve the encoding efficiency when the distribution of transform coefficients is biased.

In addition, the technical scope of the present invention is not limited to the case where VLC is applied to the encoding method, but extends to the case where CABAC is applied. Details thereof will be described later.

In order to solve the above problems, an image decoding apparatus according to the present invention is an image decoding apparatus that reproduces a coefficient matrix by decoding and reverse scanning encoded quantized transform coefficients included in encoded data obtained by encoding image data, the apparatus comprising: last coefficient position detecting means for detecting the position of the last non-zero coefficient in the reverse scan order in the coefficient matrix; position determination means for determining whether or not the position of the last non-zero coefficient is within a position determination range extending, in the coefficient matrix, from the row or column including the coefficient of the DC component to a predetermined row or column; and two-stage decoding means for, when the position of the last non-zero coefficient is within the position determination range, decoding the coefficient matrix separately in a first decoding target range including the position of the last non-zero coefficient and in a second decoding target range other than the first decoding target range.

In the above configuration, the quantized transform coefficient is obtained by orthogonally transforming and quantizing the prediction residual obtained by subtracting the predicted image from the image data in the image data encoding process. Also, the quantized transform coefficient in the block to be encoded can be expressed in the form of a coefficient matrix.

Also, the encoded quantized transform coefficient is obtained by scanning and encoding a coefficient matrix composed of quantized transform coefficients. Scanning the coefficient matrix means storing the coefficients included in the coefficient matrix in a one-dimensional array.

Here, reverse scanning is to rearrange the decoded coefficients at corresponding positions on the coefficient matrix.

The reverse scan order is a predetermined order in which reverse scan is performed. For example, a zigzag scan can be employed in the reverse scan order.

The DC component coefficient in the coefficient matrix is specifically the coefficient in the first row and the first column. It is also the coefficient that comes first in the reverse scan order in the zigzag scan.

A coefficient tends to be a non-zero coefficient, whose value is not zero, in the low frequency components near the DC component. Conversely, a coefficient tends to be a zero coefficient, whose value is zero, as the frequency component becomes higher. In the zigzag scan, the low frequency component coefficients are processed before the high frequency component coefficients.

The last non-zero coefficient is the last non-zero coefficient in reverse scan order when viewed from the DC component coefficient. Furthermore, all coefficients in order after the last non-zero coefficient are all zero coefficients.

According to the above configuration, it is determined whether the position of the last non-zero coefficient lies (1) between the first row and the predetermined row or (2) between the first column and the predetermined column of the coefficient matrix.

The above (1) and (2) are collectively referred to as a position determination range. The position determination range can also be said to be an end portion including a coefficient of a direct current component in the coefficient matrix.
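The detection of the last non-zero coefficient and the position determination can be sketched as follows. The function names and the parameters `max_row` and `max_col`, which stand in for the "predetermined row or column", are hypothetical, and the zigzag scan is assumed as the scan order.

```python
def zigzag_order(n):
    """Return the (row, col) positions of an n x n matrix in zigzag scan order."""
    order = []
    for s in range(2 * n - 1):
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)
        order.extend((r, s - r) for r in rows)
    return order


def last_nonzero_position(matrix):
    """Position of the last non-zero coefficient in scan order, or None if all are zero."""
    last = None
    for r, c in zigzag_order(len(matrix)):
        if matrix[r][c] != 0:
            last = (r, c)
    return last


def in_position_determination_range(pos, max_row=0, max_col=0):
    """True if pos lies in rows 0..max_row or in columns 0..max_col (0-based indices)."""
    r, c = pos
    return r <= max_row or c <= max_col


# Hypothetical matrices: one biased toward the first column, one spread out.
column_biased = [[9, 0, 0, 0], [5, 0, 0, 0], [3, 0, 0, 0], [1, 0, 0, 0]]
spread = [[9, 0, 0, 0], [5, 2, 0, 0], [0, 0, 4, 0], [0, 0, 0, 0]]

for m in (column_biased, spread):
    pos = last_nonzero_position(m)
    print(pos, in_position_determination_range(pos))
```

With the default parameters the position determination range is just the first row and the first column, so only the column-biased matrix qualifies for the two-stage processing.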

Further, according to the above configuration, when the position determination range includes the position of the last non-zero coefficient, the coefficient matrix is decoded in two stages: a first decoding target range including the position of the last non-zero coefficient, and a second decoding target range other than the first decoding target range.

The first decoding target range may include the position of the DC component coefficient. The decoding process can therefore be performed, for example, as follows based on the position of the last non-zero coefficient. First, in the first decoding target range, the coefficients from the last non-zero coefficient to the DC component coefficient are decoded. Then, in the second decoding target range, the coefficients from the last non-zero coefficient in the scan order within that range to the first coefficient in the scan order are decoded. The zigzag scan described above can be applied as the scan order in the second decoding target range.

By the way, non-zero coefficients tend to be biased to horizontal or vertical components when the residual is relatively small and the total number of coefficients is small. For example, in the case of coding of an inter prediction block, such a tendency is seen.

Also, the coding efficiency is better when runs of consecutive zero coefficients (hereinafter referred to as “runs”) are kept together as much as possible, that is, when they are encoded without being divided.

In the above configuration, the first decoding target range is decoded separately from the second decoding target range. For this reason, when non-zero coefficients are concentrated in the first decoding target range, the code amount of the code data to be decoded can be suppressed.

Also, in the second decoding target range, the run is encoded without being divided by the coefficients in the first decoding target range, and the encoding is efficiently performed. Therefore, the code amount to be decoded can be suppressed.

Thus, when there is a bias in the distribution of coefficients, it is possible to reduce the code amount of code data to be decoded.
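The two-stage idea can be sketched from the scan side and the reverse-scan side together. This is a simplified illustration under stated assumptions, not the embodiment itself: the first range is taken to be the first row or first column containing the last non-zero coefficient, the second range is every other position in zigzag order, coefficients are listed in forward order for readability, and no entropy coding is performed. All function names are hypothetical.

```python
def zigzag_order(n):
    """Return the (row, col) positions of an n x n matrix in zigzag scan order."""
    order = []
    for s in range(2 * n - 1):
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)
        order.extend((r, s - r) for r in rows)
    return order


def second_range_positions(n, axis):
    """Zigzag positions outside the first row (axis='row') or first column (axis='col')."""
    skip = 0 if axis == "row" else 1
    return [p for p in zigzag_order(n) if p[skip] != 0]


def two_stage_scan(matrix):
    """Split the coefficients into (axis, first_range, second_range), or None."""
    n = len(matrix)
    nonzero = [p for p in zigzag_order(n) if matrix[p[0]][p[1]] != 0]
    if not nonzero:
        return None
    last_r, last_c = nonzero[-1]
    if last_r == 0:
        axis, first_pos = "row", [(0, c) for c in range(last_c + 1)]
    elif last_c == 0:
        axis, first_pos = "col", [(r, 0) for r in range(last_r + 1)]
    else:
        return None  # outside the position determination range: use the ordinary scan
    first = [matrix[r][c] for r, c in first_pos]
    second = [matrix[r][c] for r, c in second_range_positions(n, axis)]
    while second and second[-1] == 0:  # drop zeros after the last non-zero coefficient
        second.pop()
    return axis, first, second


def two_stage_reverse_scan(n, axis, first, second):
    """Place the coefficients of both ranges back into an n x n matrix."""
    matrix = [[0] * n for _ in range(n)]
    for i, value in enumerate(first):
        r, c = (0, i) if axis == "row" else (i, 0)
        matrix[r][c] = value
    for (r, c), value in zip(second_range_positions(n, axis), second):
        matrix[r][c] = value
    return matrix


# Hypothetical matrix: first column non-zero plus one coefficient in the first row.
matrix = [[9, 2, 0, 0], [5, 0, 0, 0], [3, 0, 0, 0], [1, 0, 0, 0]]
axis, first, second = two_stage_scan(matrix)
print(axis, first, second)
print(two_stage_reverse_scan(4, axis, first, second) == matrix)
```

In this example the first range contains no zeros at all and the second range contains a single coefficient, whereas a single zigzag pass over the same matrix would interleave a run of five zeros between the first-column coefficients.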

An image encoding apparatus according to the present invention is an image encoding apparatus that outputs encoded data by performing orthogonal transform on a prediction residual obtained by subtracting a predicted image from image data and by scanning and encoding a coefficient matrix composed of quantized transform coefficients, the apparatus comprising: last coefficient position detecting means for detecting the position of the last non-zero coefficient in the scan order in the coefficient matrix; position determination means for determining whether or not the position of the last non-zero coefficient is within a position determination range extending, in the coefficient matrix, from the row or column including the coefficient of the DC component to a predetermined row or column; and two-stage encoding means for, when the position of the last non-zero coefficient is within the position determination range, encoding the coefficient matrix separately in a first encoding target range including the position of the last non-zero coefficient and in a second encoding target range other than the first encoding target range.

The data structure of the encoded data according to the present invention is a data structure of encoded data generated by performing orthogonal transform on a prediction residual obtained by subtracting a predicted image from image data and by scanning and encoding a coefficient matrix composed of quantized transform coefficients, the data structure including: position information indicating the position of the last non-zero coefficient in the scan order in the coefficient matrix; first coefficient encoded data in which the coefficients in a first encoding target range, which includes the last non-zero coefficient and extends in the coefficient matrix from the row or column including the DC component coefficient to a predetermined row or column, are sequentially encoded; and second coefficient encoded data in which the coefficients in a second encoding target range other than the first encoding target range are sequentially encoded.

According to the image encoding device or the data structure of the encoded data configured as described above, the same effects as those of the image decoding device according to the present invention can be obtained.

The image encoding device may include, in the data structure of the encoded data, position information indicating the position of the last non-zero coefficient in the scan order in the coefficient matrix, first coefficient encoded data in which the coefficients in a first range, which includes the last non-zero coefficient and extends in the coefficient matrix from the row or column including the DC component coefficient to a predetermined row or column, are sequentially encoded, and second coefficient encoded data in which the coefficients in a second range other than the first range are sequentially encoded. The image encoding device may encode this information as, for example, side information.

The above problem can also be expressed as follows from another viewpoint. That is, when CABAC is applied to the encoding method and the distribution of transform coefficients in the frequency domain varies from block to block, for example, when the directions of edges in a plurality of adjacent blocks differ greatly from each other, an appropriate occurrence probability cannot be used for each transform coefficient even if the technique disclosed in Non-Patent Document 3 is used, which causes a problem that the encoding efficiency is lowered.

Therefore, the present invention can be expressed as follows from another viewpoint. That is, the present invention has been made in view of the above problems, and its object is to provide an arithmetic encoding device and an arithmetic decoding device with high coding efficiency even when the distribution of transform coefficients in the frequency domain varies from block to block.

In order to solve the above-described problem, the arithmetic decoding apparatus according to the present invention is an arithmetic decoding device that decodes encoded data obtained by arithmetically encoding, for each transform coefficient obtained by frequency-transforming a target image for each unit region, one or more types of syntax representing the transform coefficient, the device comprising: context index assigning means for assigning, to each syntax in the target frequency domain corresponding to the unit region to be processed, a context index determined according to the type of the syntax and the position of the syntax in the target frequency domain; syntax decoding means for sequentially arithmetically decoding each syntax in the target frequency domain based on the probability state specified by the context index assigned to that syntax; transform coefficient restoring means for restoring each transform coefficient from each syntax decoded by the syntax decoding means; and context index changing means for changing the context index assigned to an undecoded syntax in the target frequency domain according to the bias of the distribution of the restored non-zero transform coefficients in the target frequency domain.

According to the arithmetic decoding apparatus configured as described above, the context index assigning means assigns, to each syntax in the target frequency domain, a context index determined according to the type of the syntax and the position of the syntax in the target frequency domain. Further, the context index changing means changes the context index assigned to an undecoded syntax in the target frequency domain according to the bias of the distribution of the decoded non-zero transform coefficients in the target frequency domain, and the syntax decoding means decodes the undecoded syntax based on the probability state specified by the changed context index assigned to it.

Therefore, according to the arithmetic decoding apparatus configured as described above, an appropriate probability state for decoding the undecoded syntax can be specified according to the bias of the distribution of the decoded non-zero transform coefficients in the target frequency domain, and the undecoded syntax can be decoded based on that appropriate probability state.
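How a context index might first be assigned by position and then changed according to the bias of already decoded coefficients can be sketched as follows. Every concrete rule here is an assumption for illustration: the number of contexts `NUM_CTX`, the position rule `min(r + c, ...)`, the factor-of-two bias threshold, and the remapped index sets are all hypothetical and are not taken from the embodiment.

```python
NUM_CTX = 6  # hypothetical number of contexts per context set


def initial_ctx_index(r, c):
    """Position-dependent context index (hypothetical rule based on r + c)."""
    return min(r + c, NUM_CTX - 1)


def distribution_bias(decoded_flags):
    """Classify the bias of the decoded non-zero flags.

    decoded_flags maps (row, col) to True when the decoded coefficient is non-zero.
    Returns 'horizontal' when non-zero coefficients lie mostly right of the
    diagonal, 'vertical' when mostly below it, and None otherwise.
    """
    upper = sum(1 for (r, c), nz in decoded_flags.items() if nz and c > r)
    lower = sum(1 for (r, c), nz in decoded_flags.items() if nz and r > c)
    if upper > 2 * lower:  # hypothetical threshold
        return "horizontal"
    if lower > 2 * upper:
        return "vertical"
    return None


def changed_ctx_index(r, c, bias):
    """Context index for an undecoded syntax after the bias has been observed."""
    if bias == "horizontal":
        return NUM_CTX + min(r, NUM_CTX - 1)      # separate set, indexed by row
    if bias == "vertical":
        return 2 * NUM_CTX + min(c, NUM_CTX - 1)  # separate set, indexed by column
    return initial_ctx_index(r, c)


decoded = {(0, 0): True, (0, 1): True, (0, 2): True, (1, 0): False}
bias = distribution_bias(decoded)
print(bias, initial_ctx_index(1, 3), changed_ctx_index(1, 3, bias))
```

The point of the remapping is that blocks with differently oriented coefficient distributions no longer share the same probability states, so each context set can adapt to one kind of bias.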

According to an arithmetic coding apparatus having a configuration corresponding to the above configuration, each syntax representing a transform coefficient can be encoded based on an appropriate probability state. Therefore, even when the bias of the transform coefficients differs for each unit region, encoded data having high encoding efficiency can be generated.

According to the arithmetic decoding apparatus configured as described above, it is possible to decode such encoded data with high encoding efficiency.

The unit region corresponds to, for example, a TU (transform unit) in TMuC (Test Model under Consideration) or a block obtained by dividing the TU.

In addition, the arithmetic coding apparatus according to the present invention is an arithmetic coding apparatus that generates encoded data by arithmetically encoding, for each transform coefficient obtained by frequency-transforming a target image for each unit region, one or more types of syntax representing the transform coefficient, the apparatus comprising: context index assigning means for assigning, to each syntax representing each transform coefficient in the target frequency domain corresponding to the unit region to be processed, a context index determined according to the type of the syntax and the position of the syntax in the target frequency domain; syntax encoding means for sequentially arithmetically encoding each syntax in the target frequency domain based on the probability state specified by the context index assigned to that syntax; and context index changing means for changing the context index assigned to an uncoded syntax in the target frequency domain according to the bias of the distribution of the encoded non-zero transform coefficients in the target frequency domain.

According to the arithmetic coding apparatus configured as described above, the context index assigning means assigns, to each syntax in the target frequency domain, a context index determined according to the type of the syntax and the position of the syntax in the target frequency domain. Further, the context index changing means changes the context index assigned to an uncoded syntax in the target frequency domain according to the bias of the distribution of the encoded non-zero transform coefficients in the target frequency domain, and the syntax encoding means encodes the uncoded syntax based on the probability state specified by the changed context index assigned to it.

Therefore, according to the arithmetic coding apparatus configured as described above, an appropriate probability state for encoding the uncoded syntax can be specified according to the bias of the distribution of the encoded non-zero transform coefficients in the target frequency domain, and the uncoded syntax can be encoded based on that appropriate probability state. As a result, even if the bias of the transform coefficients differs for each unit region, encoded data having high encoding efficiency can be generated.

Also, the data structure of the encoded data according to the present invention is a data structure of encoded data obtained by arithmetically encoding, for each transform coefficient obtained by frequency-transforming an original image for each unit region, one or more types of syntax representing the transform coefficient, wherein the one or more types of syntax include a flag indicating whether or not the transform coefficient is 0, and an arithmetic decoding device that decodes the encoded data assigns, to each syntax in the target frequency domain corresponding to the unit region to be processed, a context index determined according to the type of the syntax and the position of the syntax in the target frequency domain, sequentially arithmetically decodes each syntax in the target frequency domain based on the probability state specified by the context index assigned to that syntax, and changes the context index assigned to an undecoded syntax in the target frequency domain according to the bias of the distribution of the restored flags in the target frequency domain.

One aspect of the present invention is as follows. That is, the image decoding apparatus according to the present invention includes: last coefficient position detecting means for detecting the position of the last non-zero coefficient in the reverse scan order in the coefficient matrix; position determination means for determining whether or not the position of the last non-zero coefficient is within a position determination range extending, in the coefficient matrix, from the row or column including the coefficient of the DC component to a predetermined row or column; and two-stage decoding means for, when the position of the last non-zero coefficient is within the position determination range, decoding the coefficient matrix separately in a first decoding target range including the position of the last non-zero coefficient and in a second decoding target range other than the first decoding target range.

The image coding apparatus according to the present invention includes: last coefficient position detecting means for detecting the position of the last non-zero coefficient in the scan order in the coefficient matrix; position determination means for determining whether or not the position of the last non-zero coefficient is within a position determination range extending, in the coefficient matrix, from the row or column including the coefficient of the DC component to a predetermined row or column; and two-stage encoding means for, when the position of the last non-zero coefficient is within the position determination range, encoding the coefficient matrix separately in a first encoding target range including the position of the last non-zero coefficient and in a second encoding target range other than the first encoding target range.

The data structure of the encoded data according to the present invention is a data structure including: position information indicating the position of the last non-zero coefficient in the scan order in the coefficient matrix; first coefficient encoded data in which the coefficients in a first encoding target range, which includes the last non-zero coefficient and extends in the coefficient matrix from the row or column including the DC component coefficient to a predetermined row or column, are sequentially encoded; and second coefficient encoded data in which the coefficients in a second encoding target range other than the first encoding target range are sequentially encoded.

Therefore, even when the coefficient distribution is biased, it is possible to improve the coding efficiency by the variable length coding method.

Another aspect of the present invention is as follows. That is, the arithmetic decoding device according to the present invention is an arithmetic decoding device that decodes encoded data obtained by arithmetically encoding, for each transform coefficient obtained by frequency-transforming a target image for each unit region, one or more types of syntax representing the transform coefficient, the device comprising: context index assigning means for assigning, to each syntax in the target frequency domain corresponding to the unit region to be processed, a context index determined according to the type of the syntax and the position of the syntax in the target frequency domain; syntax decoding means for sequentially arithmetically decoding each syntax in the target frequency domain based on the probability state specified by the context index assigned to that syntax; transform coefficient restoring means for restoring each transform coefficient from each syntax decoded by the syntax decoding means; and context index changing means for changing the context index assigned to an undecoded syntax in the target frequency domain according to the bias of the distribution of the restored non-zero transform coefficients in the target frequency domain.

In addition, the arithmetic coding apparatus according to the present invention is an arithmetic coding apparatus that generates encoded data by arithmetically encoding, for each transform coefficient obtained by frequency-transforming a target image for each unit region, one or more types of syntax representing the transform coefficient, the apparatus comprising: context index assigning means for assigning, to each syntax representing each transform coefficient in the target frequency domain corresponding to the unit region to be processed, a context index determined according to the type of the syntax and the position of the syntax in the target frequency domain; syntax encoding means for sequentially arithmetically encoding each syntax in the target frequency domain based on the probability state specified by the context index assigned to that syntax; and context index changing means for changing the context index assigned to an uncoded syntax in the target frequency domain according to the bias of the distribution of the encoded non-zero transform coefficients in the target frequency domain.

Also, the data structure of the encoded data according to the present invention is a data structure of encoded data obtained by arithmetically encoding, for each transform coefficient obtained by frequency-transforming an original image for each unit region, one or more types of syntax representing the transform coefficient. The one or more types of syntax include a flag indicating whether or not the transform coefficient is 0. An arithmetic decoding device that decodes the encoded data assigns, to each syntax in the target frequency domain corresponding to the unit region to be processed, a context index determined according to the type of the syntax and the position of the syntax in the target frequency domain, sequentially arithmetically decodes each syntax in the target frequency domain based on the probability state specified by the context index assigned to that syntax, and changes the context index assigned to an undecoded syntax in the target frequency domain according to the bias of the distribution of the restored flags in the target frequency domain.

According to the arithmetic decoding apparatus configured as described above, it is possible to decode encoded data with high encoding efficiency.
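The idea of changing the context index for undecoded syntax according to the already restored flags can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the 4 × 4 table `BASE_CTX`, the function name `context_index`, and the rule of looking up the transposed position when non-zero flags lean toward the bottom-left are hypothetical and are not taken from this description.

```python
# Hypothetical 4x4 position-to-context table for a "coefficient is non-zero" flag.
# The values are illustrative only.
BASE_CTX = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 14],
]


def context_index(row, col, decoded_flags):
    """Return the context index for the flag at (row, col).

    decoded_flags maps (row, col) -> 0 or 1 for the flags restored so far.
    """
    # Count restored non-zero flags above and below the main diagonal.
    top_right = sum(v for (r, c), v in decoded_flags.items() if c > r)
    bottom_left = sum(v for (r, c), v in decoded_flags.items() if r > c)
    # Assumed rule: when the non-zero flags lean toward the bottom-left,
    # use the index of the transposed position for the undecoded flag.
    if bottom_left > top_right:
        return BASE_CTX[col][row]
    return BASE_CTX[row][col]
```

With no flags restored yet, the index depends only on the position; once the restored flags show a bias, the same position can map to a different index and therefore to a different probability state.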

Other objects, features, and strengths of the present invention will be fully understood from the following description. The advantages of the present invention will become apparent from the following description with reference to the accompanying drawings.

It is a block diagram showing a configuration example of the TU information decoding unit provided in the moving image decoding apparatus according to one embodiment of the present invention. It is a functional block diagram showing the schematic configuration of the moving image decoding apparatus. It is a diagram showing the structure of the encoded data generated by the moving image encoding apparatus according to one embodiment of the present invention and decoded by the moving image decoding apparatus. FIG. 4 is a diagram for explaining a coefficient matrix scan, where (a) shows an example of an 8 × 8 coefficient matrix scan, and (b) shows a one-dimensional array obtained from the scan result. It is a diagram showing the relationship between the processing in the two-stage decoding unit provided in the TU information decoding unit and the input/output data of that processing. It is a diagram illustrating the two-stage coefficient encoding process or coefficient decoding process for the coefficient matrix, where (a) shows the first stage and (b) shows the second stage. It is a flowchart showing the flow of the coefficient decoding process in the TU information decoding unit. It is a functional block diagram showing a configuration example of the moving image encoding apparatus according to one embodiment of the present invention. It is a block diagram showing a configuration example of the variable length encoding unit provided in the moving image encoding apparatus. It is a diagram showing the relationship between the processing in the two-stage encoding unit provided in the TU information encoding unit according to the present embodiment and the input/output data of that processing. It is a flowchart showing the flow of the coefficient encoding process in the TU information encoding unit.
It is a diagram showing an example of the syntax of the coefficient encoding process in the TU information encoding unit. It is a diagram showing an example of the syntax of the coefficient encoding process in the TU information encoding unit. It is a diagram showing an example of a table defining the code number (CodeNum) corresponding to each combination of "run" and "isLevelOne" when the maximum value of the run is 5. It is a diagram showing an example of the VLC table for zigzag scanning when the maximum value of the run is 5. It is a diagram explaining the maximum value of "run". It is a diagram showing another example of a table defining the code number (CodeNum) corresponding to each combination of "run" and "isLevelOne" when the maximum value of the run is 5. It is a diagram showing another example of the VLC table, for linear scanning, when the maximum value of the run is 5. It is a functional block diagram showing a configuration example of the TU information decoding unit according to another embodiment of the present invention. It is a diagram showing the coefficient encoding process or coefficient decoding process for the last coefficient of a coefficient matrix and the second coefficient from the last. It is a diagram illustrating the two-stage coefficient encoding process or coefficient decoding process for a coefficient matrix, where (a) shows the first stage and (b) shows the second stage. It is a flowchart showing the flow of the coefficient decoding process in the TU information decoding unit. It is a functional block diagram showing a configuration example of the TU information encoding unit according to another embodiment of the present invention. It is a flowchart showing the flow of the coefficient decoding process in the TU information encoding unit.
It is a diagram showing another example of the syntax of the coefficient encoding process in the TU information encoding unit. It is a diagram showing another example of the syntax of the coefficient encoding process in the TU information encoding unit. It is a diagram explaining the coefficient encoding process and coefficient decoding process when the second-stage flag is omitted. It is a flowchart showing the flow of the coefficient encoding process in the TU information encoding unit according to a modification. It is a flowchart showing the flow of the coefficient decoding process in the TU information decoding unit according to a modification. It is a diagram showing another example of the syntax of the coefficient encoding process in the TU information encoding unit according to the modification. It is a diagram showing another example of the syntax of the coefficient encoding process in the TU information encoding unit according to the modification. It is a diagram explaining the expansion of the encoding target region of the first stage. It is a diagram explaining an example of limiting the number of coefficients to encode. It is a diagram explaining another example of limiting the number of coefficients to encode. It is a diagram illustrating the conventional case where transform coefficients represented by a two-dimensional matrix are stored in a one-dimensional array by a zigzag scan, where (a) shows an example of scanning a 4 × 4 matrix of transform coefficients, and (b) shows an example of a one-dimensional array in which the transform coefficients are stored by the scan. It is a block diagram showing the configuration of the quantized residual information decoding unit provided in the moving image decoding apparatus according to another embodiment of the present invention. It is a diagram showing the data structure of the encoded data according to the embodiment.
(a) shows the configuration of the picture layer of the encoded data, (b) shows the configuration of the slice layer included in the picture layer, (c) shows the configuration of the LCU layer included in the slice layer, (d) shows the configuration of the leaf CU included in the CU layer, (e) shows the configuration of the inter prediction information for the leaf CU, (f) shows the configuration of the intra prediction information for the leaf CU, and (g) shows the configuration of the filter parameter included in the slice header. It is a diagram showing each syntax included in the quantized residual information of the encoded data according to the embodiment. It is a diagram for explaining each syntax included in the quantized residual information of the encoded data according to the embodiment, where (a) illustrates the non-zero transform coefficients in the frequency domain to be processed, (b) illustrates the values of the syntax sig_flag in the frequency domain to be processed, (c) illustrates the values of the syntax last_flag in the frequency domain to be processed, and (d) illustrates the values of the syntax abs_greater_one in the frequency domain to be processed. It is a block diagram showing the configuration of the moving image decoding apparatus according to the embodiment. It is a block diagram showing the configuration of the variable length code decoding unit 11 provided in the moving image decoding apparatus according to the embodiment. It is a diagram for explaining the operation of the context index allocation unit provided in the quantized residual information decoding unit according to the embodiment, where (a) shows an example of the context index allocated by the context index allocation unit to each frequency component when the frequency domain to be processed has 4 × 4 components.
(b) shows an example of the context index allocated by the context index allocation unit when the frequency domain to be processed has 8 × 8 components. It is a diagram for explaining the operation of the transform coefficient decoding unit provided in the quantized residual information decoding unit according to the embodiment, where (a) shows the scan line in each scan, (b) shows the scan order of a zigzag scan, and (c) shows an example of the scan order of an adaptive scan. It is a flowchart showing the flow of the adaptive scan process performed by the transform coefficient decoding unit according to the embodiment. It is a diagram for explaining the adaptive scan performed by the transform coefficient decoding unit according to the embodiment, where (a) shows the region over which the variable scanTopRight, representing the appearance frequency of non-zero transform coefficients, is counted, and (b) shows the region over which the variable scanBottomLeft, representing the appearance frequency of non-zero transform coefficients, is counted. It is a diagram for explaining a first specific example of the context index change process performed by the context index changing unit provided in the quantized residual information decoding unit according to the embodiment, where (a) illustrates a distribution of non-zero transform coefficients, (b) illustrates the context indexes before being changed by the context index changing unit, and (c) illustrates the context indexes after being changed by the context index changing unit. It is a flowchart showing the flow of processing by the context index changing unit in the first specific example of the context index change process.
It is a sequence diagram showing an example of the cooperative operation of the transform coefficient decoding unit and the context index changing unit in the first specific example of the context index change process. It is a flowchart showing the flow of processing by the context index changing unit in a second specific example of the context index change process. It is a diagram for explaining the second specific example of the context index change process, where (a) illustrates the frequency components referred to in order to calculate the variables coeffBottomLeft and coeffTopRight when the frequency domain to be processed has 4 × 4 components, (b) illustrates the context indexes before being changed by the context index changing unit, and (c) illustrates the context indexes after being changed by the context index changing unit. It is a diagram for explaining the second specific example of the context index change process, where (a) illustrates the frequency components referred to in order to calculate the variables coeffBottomLeft and coeffTopRight when the frequency domain to be processed has 8 × 8 components, (b) illustrates the context indexes before being changed by the context index changing unit, and (c) illustrates the context indexes after being changed by the context index changing unit. It is a diagram for explaining the second specific example of the context index change process, where (a) illustrates the frequency components referred to in order to calculate the variables coeffBottomLeft and coeffTopRight when the frequency domain to be processed has 8 × 8 components, (b) illustrates the context indexes before being changed by the context index changing unit, and (c) illustrates the context indexes after being changed by the context index changing unit. It is a sequence diagram showing an example of the cooperative operation of the transform coefficient decoding unit and the context index changing unit in the second specific example of the context index change process.
It is a diagram for explaining a third modification of the embodiment, where (a) and (b) show the frequency components whose context index can be changed and the frequency components whose context index is not changed. It is a diagram for explaining a fourth modification of the embodiment, where (a) shows the context index determined by the context index changing unit when there is almost no bias in the decoded non-zero transform coefficients, (b) shows the context index determined by the context index changing unit when the decoded non-zero transform coefficients are biased in the horizontal direction, and (c) shows the context index determined by the context index changing unit when the decoded non-zero transform coefficients are biased in the vertical direction. It is a diagram for explaining a fifth modification of the embodiment, where (a) shows the context index determined by the context index changing unit when there is almost no bias in the decoded non-zero transform coefficients, (b) shows the context index determined by the context index changing unit when the decoded non-zero transform coefficients are biased in the horizontal direction, and (c) shows the context index determined by the context index changing unit when the decoded non-zero transform coefficients are biased in the vertical direction. It is a diagram for explaining a sixth modification of the embodiment, illustrating the frequency components that the context index changing unit refers to in order to calculate coeffTopRight and coeffBottomLeft. It is a block diagram showing the configuration of the moving image encoding apparatus according to the embodiment. It is a block diagram showing the configuration of the variable length code encoding unit provided in the moving image encoding apparatus according to the embodiment.
It is a block diagram showing the configuration of the quantized residual information encoding unit provided in the variable length code encoding unit according to the embodiment. It is a diagram showing the configuration of a transmitting apparatus equipped with the moving image encoding apparatus and a receiving apparatus equipped with the moving image decoding apparatus, where (a) shows the transmitting apparatus equipped with the moving image encoding apparatus, and (b) shows the receiving apparatus equipped with the moving image decoding apparatus. It is a diagram showing the configuration of a recording apparatus equipped with the moving image encoding apparatus and a reproducing apparatus equipped with the moving image decoding apparatus, where (a) shows the recording apparatus equipped with the moving image encoding apparatus, and (b) shows the reproducing apparatus equipped with the moving image decoding apparatus.

<< Embodiment 1 >>
An embodiment of a decoding apparatus and an encoding apparatus according to the present invention will be described below with reference to the drawings.
[1] Embodiment 1-1
An embodiment of the present invention will be described with reference to FIGS. First, an overview of the moving picture decoding apparatus (image decoding apparatus) 1 and the moving picture encoding apparatus (image encoding apparatus) 2 will be described with reference to FIG. FIG. 2 is a functional block diagram showing a schematic configuration of the moving picture decoding apparatus 1.

The video decoding device 1 and the video encoding device 2 shown in FIG. implement technology adopted in the H.264 / MPEG-4 AVC standard, technology adopted in the KTA software, which is a codec for joint development in VCEG (Video Coding Expert Group), and technology adopted in its successor codec, the TMuC (Test Model under Consideration) software.

The moving image decoding apparatus 1 receives encoded data (data structure of encoded data) # 1 obtained by encoding a moving image by the moving image encoding apparatus 2. The video decoding device 1 decodes the input encoded data # 1 and outputs the video # 2 to the outside. Prior to detailed description of the moving picture decoding apparatus 1, the configuration of the encoded data # 1 will be described below.

[Configuration of encoded data]
The configuration of encoded data # 1 that is generated by the video encoding device 2 and decoded by the video decoding device 1 will be described with reference to FIG. The encoded data # 1 has a hierarchical structure including a sequence layer, a GOP (Group Of Pictures) layer, a picture layer, a slice layer, and a largest coding unit (LCU: Largest Coding Unit) layer.

FIG. 3 shows the hierarchical structure below the picture layer in the encoded data # 1. FIGS. 3A to 3D are diagrams showing the structures of the picture layer PICT, the slice layer S, the LCU layer LCU, and a leaf CU included in the LCU (denoted as CUL in FIG. 3D), respectively.

(Picture layer)
The picture layer PICT is a set of data referred to by the video decoding device 1 in order to decode a target picture that is a processing target picture. As shown in FIG. 3A, the picture layer PICT includes a picture header PH and slice layers S_1 to S_NS (NS is the total number of slice layers included in the picture layer PICT). Hereinafter, when it is not necessary to distinguish each of the slice layers S_1 to S_NS, the subscripts may be omitted. The same applies to other subscripted data included in the encoded data # 1.

The picture header PH includes a coding parameter group referred to by the video decoding device 1 in order to determine a decoding method of the target picture. For example, the encoding mode information (entropy_coding_mode_flag) indicating the variable length encoding mode used in encoding by the moving image encoding device 2 is an example of an encoding parameter included in the picture header PH. When entropy_coding_mode_flag is 0, the picture has been encoded by CAVLC (Context-based Adaptive Variable Length Coding).

(Slice layer)
Each slice layer S included in the picture layer PICT is a set of data referred to by the video decoding device 1 in order to decode a target slice that is a processing target slice. As shown in FIG. 3B, the slice layer S includes a slice header SH and LCU layers LCU_1 to LCU_NC (NC is the total number of LCUs included in the slice S).

The slice header SH includes a coding parameter group that the moving image decoding apparatus 1 refers to in order to determine a decoding method of the target slice. Slice type designation information (slice_type) for designating a slice type is an example of an encoding parameter included in the slice header SH. In addition, the slice header SH includes a filter parameter FP that is referred to by a loop filter (not shown) included in the video decoding device 1.

Slice types that can be specified by the slice type specification information include (1) I slices, which use only intra prediction at the time of encoding, (2) P slices, which use unidirectional prediction or intra prediction at the time of encoding, and (3) B slices, which use unidirectional prediction, bidirectional prediction, or intra prediction at the time of encoding.
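The relationship between slice types and the prediction methods they permit can be written down as a small lookup, shown here as a minimal sketch. The names `Pred`, `ALLOWED`, and `is_allowed` are hypothetical and introduced only for illustration; the permitted combinations follow the three slice types listed above.

```python
from enum import Enum


class Pred(Enum):
    INTRA = "intra"
    UNI = "unidirectional"
    BI = "bidirectional"


# Prediction methods usable at encoding time for each slice type.
ALLOWED = {
    "I": {Pred.INTRA},
    "P": {Pred.INTRA, Pred.UNI},
    "B": {Pred.INTRA, Pred.UNI, Pred.BI},
}


def is_allowed(slice_type, pred):
    """Return True if the prediction method may be used in the slice type."""
    return pred in ALLOWED[slice_type]
```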

(LCU layer)
Each LCU layer LCU included in the slice layer S is a set of data that the video decoding device 1 refers to in order to decode the target LCU that is the processing target LCU. As shown in (c) of FIG. 3, the LCU layer LCU includes an LCU header LCUH and a plurality of coding units (CU: Coding Unit) CU_1 to CU_NL obtained by quadtree division of the LCU.

The size that each CU can take depends on the LCU size and the hierarchical depth included in the sequence parameter set SPS of the encoded data # 1. For example, when the size of the LCU is 128 × 128 pixels and the maximum hierarchical depth is 5, a CU included in the LCU can take any of five sizes: 128 × 128 pixels, 64 × 64 pixels, 32 × 32 pixels, 16 × 16 pixels, and 8 × 8 pixels. A CU that is not further divided is called a leaf CU.
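The example above can be reproduced with a short calculation. This is a minimal sketch under the assumption, inferred from the example, that the maximum hierarchical depth counts the LCU itself as the first level and that each level halves the side length; the function name `cu_sizes` is hypothetical.

```python
def cu_sizes(lcu_size, max_depth):
    """Return the side lengths a CU can take, largest first.

    Each quadtree level halves the side length of the level above it.
    """
    return [lcu_size >> depth for depth in range(max_depth)]


print(cu_sizes(128, 5))  # [128, 64, 32, 16, 8]
```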

(LCU header)
The LCU header LCUH includes an encoding parameter referred to by the video decoding device 1 in order to determine a decoding method of the target LCU. Specifically, as shown in FIG. 3C, it includes CU partition information SP_CU that specifies a partition pattern of the target LCU into leaf CUs, and a quantization parameter difference Δqp (mb_qp_delta) that specifies the size of the quantization step.

CU division information SP_CU is information that specifies the shape and size of each CU (and leaf CU) included in the target LCU, and the position in the target LCU. Note that the CU partition information SP_CU does not necessarily need to explicitly include the shape and size of the leaf CU. For example, the CU partition information SP_CU may be a set of flags (split_coding_unit_flag) indicating whether or not the entire LCU or a partial region of the LCU is divided into four. In that case, the shape and size of each leaf CU can be specified by using the shape and size of the LCU together.
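A minimal sketch of how a set of split flags can determine the leaf CUs is given below. The traversal order (depth-first, top-left child first), the rule that no flag is read for a CU already at the minimum size, and the names `parse_leaf_cus`, `lcu_size`, and `min_size` are assumptions made for illustration, not details stated in this description.

```python
def parse_leaf_cus(flags, lcu_size, min_size):
    """Consume split flags depth-first and return leaf CUs as (x, y, size)."""
    flag_iter = iter(flags)
    leaves = []

    def visit(x, y, size):
        # Assumption: a CU at the minimum size carries no split flag.
        split = size > min_size and next(flag_iter) == 1
        if not split:
            leaves.append((x, y, size))
            return
        half = size // 2
        # Visit the four quadrants: top-left, top-right, bottom-left, bottom-right.
        for dy in (0, half):
            for dx in (0, half):
                visit(x + dx, y + dy, half)

    visit(0, 0, lcu_size)
    return leaves


# A 64x64 LCU split once, with its bottom-left 32x32 quadrant split again.
leaves = parse_leaf_cus([1, 0, 0, 1, 0, 0, 0, 0, 0], lcu_size=64, min_size=8)
```

Only the flags and the LCU size are needed to recover the position and size of every leaf CU, which is why the shape and size need not be transmitted explicitly.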

Further, the quantization parameter difference Δqp is a difference qp−qp ′ between the quantization parameter qp in the target LCU and the quantization parameter qp ′ in the LCU encoded immediately before the LCU.
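Because Δqp = qp − qp ′, a decoder can recover each quantization parameter by accumulating the differences. The following is a minimal sketch; the function name `reconstruct_qp` and the starting value `initial_qp` (the parameter assumed for the LCU preceding the first one) are hypothetical.

```python
def reconstruct_qp(initial_qp, deltas):
    """Recover the qp of each LCU from successive differences Δqp."""
    qps = []
    prev = initial_qp
    for delta in deltas:
        prev = prev + delta  # qp = qp' + Δqp
        qps.append(prev)
    return qps


print(reconstruct_qp(26, [0, 2, -3]))  # [26, 28, 25]
```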

(Leaf CU)
A CU (leaf CU) that cannot be further divided is treated as a prediction unit (PU: Prediction Unit) and a transform unit (TU: Transform Unit).

As shown in (d) of FIG. 3, the leaf CU (denoted as CUL in (d) of FIG. 3) includes (1) PU information PUI that is referred to when the moving image decoding apparatus 1 generates a predicted image, and (2) TU information TUI that is referred to when the moving image decoding apparatus 1 decodes the residual data. The PU information PUI may include a skip flag SKIP. When the value of the skip flag SKIP is 1, the TU information is omitted.

The PU information PUI includes prediction type information PT and prediction information PI, as shown in FIG. The prediction type information PT is information that specifies whether intra prediction or inter prediction is used as a predicted image generation method for the target leaf CU (target PU). The prediction information PI is composed of intra prediction information or inter prediction information depending on which prediction method the prediction type information PT designates. Hereinafter, a PU to which intra prediction is applied is also referred to as an intra PU, and a PU to which inter prediction is applied is also referred to as an inter PU.

The PU information PUI includes information specifying the shape and size of each partition included in the target PU and the position in the target PU. Here, the partition is one or a plurality of non-overlapping areas constituting the target leaf CU, and the generation of the predicted image is performed in units of partitions.

As shown in FIG. 3 (d), the TU information TUI includes TU partition information SP_TU that specifies a partition pattern of the target leaf CU (target TU) into blocks, and quantized prediction residuals QD_1 to QD_NT (NT is the total number of blocks included in the target TU).

TU partition information SP_TU is information that specifies the shape and size of each block included in the target TU and the position in the target TU. Each TU can be, for example, a size from 64 × 64 pixels to 2 × 2 pixels. Here, the block is one or a plurality of non-overlapping areas constituting the target leaf CU, and prediction residual encoding / decoding is performed in units of TUs or blocks obtained by dividing TUs.
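The layout of a leaf CU described above, including the omission of the TU information when the skip flag is 1, can be sketched as a small data structure. The class and field names (`PUInfo`, `TUInfo`, `LeafCU`, `read_leaf_cu`) and the field types are hypothetical choices for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class PUInfo:
    skip: bool = False                      # skip flag SKIP
    pt: str = "intra"                       # prediction type information PT
    pi: dict = field(default_factory=dict)  # prediction information PI


@dataclass
class TUInfo:
    sp_tu: list  # TU partition information SP_TU
    qd: list     # quantized prediction residuals QD_1 .. QD_NT


@dataclass
class LeafCU:
    pui: PUInfo
    tui: Optional[TUInfo] = None  # absent when the skip flag is 1


def read_leaf_cu(pui: PUInfo, read_tui: Callable[[], TUInfo]) -> LeafCU:
    """Build a leaf CU, reading TU information only when it is present."""
    tui = None if pui.skip else read_tui()
    return LeafCU(pui, tui)
```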

Each quantized prediction residual QD is encoded data generated by the moving image encoding apparatus 2 performing the following processes 1 to 3 on a target block that is a processing target block. Process 1: A DCT (Discrete Cosine Transform) is performed on the prediction residual obtained by subtracting the prediction image from the encoding target image. Process 2: The transform coefficient obtained in Process 1 is quantized. Process 3: The transform coefficient quantized in Process 2 is variable-length encoded. The quantization parameter qp described above represents the magnitude of the quantization step QP used when the moving image encoding apparatus 2 quantizes the transform coefficient (QP = 2^{qp/6}).
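The relation QP = 2^{qp/6} means that the quantization step doubles every time qp increases by 6. The following minimal sketch illustrates Process 2 under the assumption that quantization is a division by the step followed by rounding to the nearest integer; the rounding rule and the names `quant_step` and `quantize` are not taken from this description.

```python
def quant_step(qp):
    """Quantization step QP for a quantization parameter qp."""
    return 2.0 ** (qp / 6)


def quantize(coeffs, qp):
    """Quantize a 2-D list of transform coefficients (assumed rounding rule)."""
    step = quant_step(qp)
    return [[int(round(c / step)) for c in row] for row in coeffs]


print(quant_step(12))                     # 4.0
print(quantize([[9, -7], [3, 0]], qp=12))  # [[2, -2], [1, 0]]
```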

(Prediction information PI)
As described above, there are two types of prediction information PI: inter prediction information and intra prediction information.

The inter prediction information includes an encoding parameter that is referred to when the video decoding device 1 generates an inter predicted image by inter prediction. More specifically, the inter prediction information includes inter PU partition information that specifies a partition pattern of the target PU (inter PU) into each partition, and inter prediction parameters for each partition.

The inter prediction parameters include a reference image index, an estimated motion vector index, and a motion vector residual.

On the other hand, the intra prediction information includes an encoding parameter that is referred to when the video decoding device 1 generates an intra predicted image by intra prediction. More specifically, the intra prediction information includes intra PU partition information for specifying a partition pattern of the target PU (intra PU) into each partition, and intra prediction parameters for each partition. The intra prediction parameter is a parameter for designating an intra prediction method (prediction mode) for each partition.

[Video decoding device]
Hereinafter, the configuration of the video decoding device 1 according to the present embodiment will be described with reference to FIGS.

(Outline of video decoding device)
The video decoding device 1 generates a predicted image for each partition, generates a decoded image # 2 by adding the generated predicted image and the prediction residual decoded from the encoded data # 1, and outputs the generated decoded image # 2 to the outside.

Here, the generation of the predicted image is performed with reference to the encoding parameters obtained by decoding the encoded data # 1. The encoding parameters are parameters referred to in order to generate a predicted image. In addition to prediction parameters such as a motion vector referred to in inter-screen prediction and a prediction mode referred to in intra-screen prediction, they include the size and shape of partitions, the size and shape of blocks, and residual data between the original image and the predicted image. Hereinafter, a set of all information excluding the residual data among the information included in the encoding parameters is referred to as side information.

Further, in the following description, the case where the prediction unit is a partition constituting the LCU will be described as an example. However, the present embodiment is not limited to this, and can also be applied to the case where the prediction unit is a unit larger than the partition and to the case where the prediction unit is a unit smaller than the partition.

In the following, a frame (picture), a slice, an LCU, a block, and a partition to be decoded are referred to as a target frame, a target slice, a target LCU, a target block, and a target partition, respectively.

The LCU size is, for example, 64 × 64 pixels, and the partition size is, for example, 64 × 64 pixels, 32 × 32 pixels, 16 × 16 pixels, 8 × 8 pixels, 4 × 4 pixels, or the like. These sizes do not limit the present embodiment, and the LCU and the partitions may have other sizes.

(Configuration of video decoding device)
Referring to FIG. 2 again, the schematic configuration of the moving picture decoding apparatus 1 will be described as follows. FIG. 2 is a functional block diagram showing a schematic configuration of the moving picture decoding apparatus 1.

As shown in FIG. 2, the moving picture decoding apparatus 1 includes a variable length code demultiplexing unit 11, a TU information decoding unit 12, an inverse quantization / inverse transform unit 13, a predicted image generation unit (channel decoding unit) 14, an adder (channel decoding means) 15, and a frame memory 16.

[Variable length code demultiplexer]
The variable-length code demultiplexing unit 11 demultiplexes the encoded data # 1 for one frame input to the video decoding device 1, thereby separating it into the various kinds of information included in the hierarchical structure shown in FIG. For example, the variable length code demultiplexing unit 11 refers to information included in various headers, and sequentially separates the encoded data # 1 into slices and LCUs.

Here, the various headers include (1) information on the method of dividing the target frame into slices, and (2) information on the size, shape, and position of the LCU belonging to the target slice.

Then, the variable length code demultiplexing unit 11 refers to the CU partition information SP_CU included in the encoded LCU header LCUH and divides the target LCU into leaf CUs. In addition, the variable-length code demultiplexing unit 11 acquires the TU information TUI and the PU information PUI for the target leaf CU (CUL).

The variable length code demultiplexing unit 11 supplies the TU information TUI obtained for the target leaf CU to the TU information decoding unit 12. In addition, the variable length code demultiplexing unit 11 supplies the PU information PUI obtained for the target leaf CU to the predicted image generation unit 14.

[TU information decoding unit]
The TU information decoding unit 12 decodes the TU information TUI supplied from the variable length code demultiplexing unit 11.

Specifically, the TU information decoding unit 12 first decodes the TU partition information SP_TU from the TU information TUI for the target leaf CU supplied from the variable length code demultiplexing unit 11.

Further, the TU information decoding unit 12 divides the target leaf CU into one or a plurality of blocks according to the decoded TU division information SP_TU.

Then, the TU information decoding unit 12 decodes the TU partition information SP_TU and the quantized prediction residual QD from the TU information TUI for each block. Here, the quantized prediction residual QD for each block can be expressed in a format in which quantized transform coefficients are arranged in a two-dimensional matrix. Hereinafter, a two-dimensional matrix representation of quantized transform coefficients is referred to as a coefficient matrix. Details of the configuration for obtaining the coefficient matrix from the TU information TUI in the TU information decoding unit 12 will be described later.

The TU information decoding unit 12 supplies the decoded TU information TUI ′ including the decoded TU partition information SP_TU ′ and the quantized prediction residual QD ′ to the inverse quantization / inverse transform unit 13.

[Inverse quantization / inverse transform unit]
The inverse quantization / inverse transform unit 13 performs inverse quantization and inverse transformation of the quantized prediction residual for each block of the target leaf CU based on the decoded TU information TUI ′ supplied from the TU information decoding unit 12. The inverse quantization / inverse transform unit 13 performs inverse quantization and an inverse DCT (Inverse Discrete Cosine Transform) on the quantized prediction residual QD ′ included in the decoded TU information TUI ′, thereby restoring the prediction residual D for each pixel of each target partition. The inverse quantization / inverse transform unit 13 supplies the restored prediction residual D to the adder 15.
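The two steps performed by this unit can be sketched as follows. This is a minimal floating-point illustration that assumes the textbook orthonormal inverse DCT and the step size QP = 2^{qp/6} given earlier; an actual codec would normally use an integer transform with its own scaling, and the function names are hypothetical.

```python
import math


def dequantize(qd, qp):
    """Scale quantized coefficients back by the quantization step."""
    step = 2.0 ** (qp / 6)
    return [[c * step for c in row] for row in qd]


def idct_1d(coeffs):
    """Orthonormal inverse DCT (inverse of DCT-II) of a 1-D sequence."""
    n = len(coeffs)
    out = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt((1 if k == 0 else 2) / n)
            total += scale * c * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
        out.append(total)
    return out


def idct_2d(block):
    """Separable 2-D inverse DCT: transform the rows, then the columns."""
    rows = [idct_1d(row) for row in block]
    cols = [idct_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def restore_residual(qd, qp):
    """Restore the per-pixel prediction residual D from QD'."""
    return idct_2d(dequantize(qd, qp))
```

For a 4 × 4 block whose only non-zero entry is a DC value of 2 with qp = 12, the dequantized coefficient is 8 and every restored residual sample equals 8 / 4 = 2.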

[Predicted image generator]
For each partition included in the target leaf CU, the predicted image generation unit 14 refers to a local decoded image P ′ that is a decoded image around the partition, and generates a predicted image Pred by intra prediction or inter prediction.

Specifically, the predicted image generation unit 14 operates as follows. First, the predicted image generation unit 14 decodes the PU information PUI for the target leaf CU supplied from the variable length code demultiplexing unit 11. Subsequently, the predicted image generation unit 14 determines a division pattern for each partition of the target leaf CU according to the PU information PUI. Further, the predicted image generation unit 14 selects a prediction mode of each partition according to the PU information PUI, and assigns each selected prediction mode to each partition.

Then, the predicted image generation unit 14 generates a predicted image Pred for each partition included in the target leaf CU with reference to the selected prediction mode and the pixel values of the local decoded image P ′ around the partition. The predicted image generation unit 14 supplies the predicted image Pred generated for the target leaf CU to the adder 15.

[Adder]
The adder 15 adds the predicted image Pred supplied from the predicted image generation unit 14 and the prediction residual D supplied from the inverse quantization / inverse transform unit 13, thereby generating the decoded image P for the target leaf CU.

[Frame memory]
The decoded images P are sequentially recorded in the frame memory 16. At the time of decoding the target LCU, the frame memory 16 holds the decoded images corresponding to all the LCUs decoded before the target LCU (for example, all the LCUs preceding it in raster scan order).

Note that, when the decoded image generation processing for each LCU has been completed for all the LCUs in an image, the moving image decoding apparatus 1 outputs to the outside the decoded image # 2 corresponding to one frame of the encoded data # 1 input to the moving image decoding apparatus 1.

(About coefficient scanning)
Here, before describing the details of the configuration of the TU information decoding unit 12, an example of scanning of quantized transform coefficients will be described with reference to FIG. 4. FIG. 4 is a diagram illustrating scanning of quantized transform coefficients. FIG. 4A shows a scan example of the 8 × 8 coefficient matrix MX10, and FIG. 4B shows the one-dimensional array ARR10 obtained as the scan result.

As described above, the quantized transform coefficients (hereinafter simply referred to as coefficients) can be expressed in the form of an 8 × 8 coefficient matrix MX10 as shown in FIG. 4A. The numbers shown in FIG. 4A indicate the scan order. In the coding of coefficients, the coefficient matrix MX10 is first scanned in the scan order (which is also the order of the dotted arrows) shown in FIG. 4A. Scanning is a process of converting coordinates in the coefficient matrix MX10 into a one-dimensional scan order index. When the coefficients are scanned in the numerical order shown in FIG. 4A, they are stored in the one-dimensional array ARR10.

More specifically, scanning starts from “1” in the upper left of the coefficient matrix MX10 shown in FIG. 4A. That is, the DC component coefficient (hereinafter referred to as DC coefficient) DC is scanned first. Subsequently, the coefficients are scanned in a zigzag manner from the position “2” on the right side of the DC coefficient DC, in order, up to the last coefficient FIN of the coefficient matrix at the position “64”. The DC coefficient DC is at the coordinates (0, 0) in the coefficient matrix. The scan order index is obtained by subtracting 1 from the number shown in FIG. 4A, so the scan order index of the DC coefficient DC is 0.

As a result, the coefficients from the DC coefficient DC to the coefficient FIN are stored in the one-dimensional array ARR10 shown in FIG. 4B.
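
The conversion between coordinates and scan order indices can be illustrated with a short Python sketch. It assumes the conventional zigzag pattern, which starts at the DC coefficient and moves first to the position on its right; the drawing itself is not reproduced here, so the exact pattern is an assumption. The names zigzag_order, SCAN, and scan are hypothetical helpers, while scanX and scanY follow the function names used later in this description.

```python
def zigzag_order(n=8):
    """Return (x, y) coordinates of an n x n block in zigzag scan order.

    x is the column, y is the row; scan order index 0 is the DC coefficient.
    Assumption: conventional zigzag pattern (not taken from the drawings).
    """
    coords = [(x, y) for y in range(n) for x in range(n)]
    # Anti-diagonals (constant x + y) are visited in increasing order.
    # Odd diagonals are walked with y increasing, even ones with x increasing,
    # so index 1 is the position to the right of the DC coefficient.
    return sorted(coords, key=lambda p: (p[0] + p[1],
                                         p[1] if (p[0] + p[1]) % 2 else p[0]))


SCAN = zigzag_order(8)  # table mapping scan order index -> (x, y)


def scanX(idx):
    """Convert a scan order index into an x coordinate."""
    return SCAN[idx][0]


def scanY(idx):
    """Convert a scan order index into a y coordinate."""
    return SCAN[idx][1]


def scan(matrix):
    """Store the coefficients of a 2-D matrix in a 1-D array in scan order."""
    return [matrix[y][x] for x, y in SCAN]
```

Under this assumption, scan order index 27 corresponds to the coordinates (6, 0), that is, the seventh column of the first row.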

Incidentally, while the coefficient values of high-frequency components tend to be zero or close to zero, the coefficient values of low-frequency components are likely to be non-zero or large. For this reason, the zigzag scan uses a scan order in which the low-frequency component coefficients are scanned early.

When the scan is completed up to “64” and all the coefficients are stored in the one-dimensional array ARR10, the encoding process is started thereafter.

Note that the encoding process is performed in the reverse order of the scan. That is, the encoding process is performed from “64” to “1”. Hereinafter, this processing order is also referred to as reverse zigzag scanning.

As described above, the coefficient at a position close to “64” is a high-frequency component, and is often “0”. Therefore, in the encoding process, encoding of such zero coefficients is omitted, and the process is performed based on the last non-zero coefficient as viewed from the DC coefficient. Hereinafter, the non-zero coefficient that appears last from the viewpoint of the DC coefficient is referred to as the last coefficient. Details of the coefficient encoding process will be described separately.

In the above, for convenience of explanation, an example of scanning an 8 × 8 coefficient matrix is shown, but the same can be said for a case of scanning a 4 × 4 coefficient matrix.

(About coefficient coding)
Hereinafter, the coefficient encoded data encoded by the moving image encoding apparatus 2 will be described. If the last coefficient is in neither the first row nor the first column of the coefficient matrix, the moving picture coding apparatus 2 is assumed to have performed coding from the last coefficient to the DC coefficient in the reverse order of scanning, as described above. The encoding process in this case is hereinafter referred to as a batch coefficient encoding process.

Further, when the last coefficient is in the first row or the first column of the coefficient matrix, the moving picture coding apparatus 2 performs the coefficient encoding process in two stages.

Specifically, this is a two-stage process consisting of a first-stage coefficient encoding process, in which the first row or first column containing the last coefficient is the first target range of encoding, and a second-stage coefficient encoding process, in which the rows or columns other than that first row or first column are the second target range of encoding.

In the first-stage coefficient encoding process, the coefficients from the last coefficient to the DC coefficient are scanned sequentially and encoded in order within the first target range.

In the second-stage coefficient encoding process, the coefficients from the last non-zero coefficient in the second target range to the first coefficient in the scan order are zigzag-scanned and encoded in order within the second target range.

The details of the coefficient coded data in the moving picture coding apparatus 2 will be described later.

[Last coefficient]
The last coefficient is encoded with the position of the last coefficient last_pos, the level of the last coefficient level, and the sign of the last coefficient sign. Here, the coefficient level means the absolute value of the coefficient. last_pos takes a value in the form of a scan order index. The case where the last coefficient is the 10th in the scanning order (the DC coefficient is the first) and the value is “−1” is as follows.

last_pos: 9
level: 1, sign: 1
In the case where the coefficient encoding process is performed in two stages, the last coefficient is detected and encoded in the target range in each stage. For example, if it is the first stage, it is the last coefficient itself of the entire coefficient matrix, and if it is the second stage, it is the last non-zero coefficient in the scan order in the second target range.
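
As a minimal sketch of how the three syntax elements of the last coefficient could be derived from a scanned one-dimensional array, the following Python function searches from the end of the array. The function name last_coefficient and the dictionary layout are hypothetical; the sign convention (1 for a negative value, 0 for a positive value) is inferred from the examples in this description.

```python
def last_coefficient(arr):
    """Return last_pos, level and sign of the last non-zero coefficient.

    arr holds the coefficients in scan order (index 0 is the DC coefficient).
    Returns None when every coefficient is zero.
    """
    for idx in range(len(arr) - 1, -1, -1):
        if arr[idx] != 0:
            return {"last_pos": idx,               # scan order index
                    "level": abs(arr[idx]),        # absolute value
                    "sign": 1 if arr[idx] < 0 else 0}
    return None
```

For an array whose 10th element is −1 and whose later elements are all zero, this yields last_pos 9, level 1, and sign 1, in line with the example above. In the two-stage case, the same search would be applied to the array of each target range.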

[Run mode]
The run mode is a mode for encoding the number of consecutive zero coefficients (0 run). In the run mode, for each coefficient, a 0 run run, a coefficient level level, and a coefficient sign sign are encoded. The case where one zero coefficient lies between the non-zero coefficient to be encoded and the previously encoded non-zero coefficient, and the value of the coefficient to be encoded is “1”, is as follows.

run: 1, level: 1, sign: 0
In Non-Patent Document 3, a case where a coefficient having a level of 2 or more appears is used as one of the end conditions of the run mode.
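
The run-mode symbols can be sketched in Python as follows. This is only an illustration under stated assumptions: the coefficients are visited in reverse scan order starting from the one next to the last coefficient, the end conditions of the run mode are not modeled, and the zeros remaining between the final non-zero coefficient and the DC position are simply returned as a count because their signaling is not specified here. The name run_mode_symbols is hypothetical.

```python
def run_mode_symbols(arr, last_pos):
    """List (run, level, sign) triples for the coefficients before last_pos.

    arr is the 1-D array in scan order; the coefficients are visited in
    reverse scan order, from index last_pos - 1 down to index 0.
    Returns the triples and the number of zeros left over at the DC side.
    """
    symbols = []
    run = 0  # zeros seen since the previously emitted non-zero coefficient
    for idx in range(last_pos - 1, -1, -1):
        c = arr[idx]
        if c == 0:
            run += 1
            continue
        symbols.append((run, abs(c), 1 if c < 0 else 0))
        run = 0
    return symbols, run
```

For the array [1, 0, −1] with the last coefficient at index 2, the single emitted symbol is run 1, level 1, sign 0, which corresponds to the example above.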

[Non-run mode]
The non-run mode is a mode that encodes one by one even with zero coefficients. In the non-run mode, for each coefficient, the coefficient level level and the coefficient sign sign are encoded. The case where the value of the coefficient to be encoded is “−6” and the case where the coefficient to be encoded is a zero coefficient are as follows.

level: 6, sign: 1
level: 0
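
For comparison, the non-run mode can be sketched in the same way. The name non_run_mode_symbols is hypothetical, and omitting the sign for a zero coefficient is an assumption based on the second example above, which lists only the level.

```python
def non_run_mode_symbols(coeffs):
    """Emit one symbol per coefficient, including zero coefficients."""
    out = []
    for c in coeffs:
        if c == 0:
            out.append({"level": 0})  # assumption: no sign for a zero level
        else:
            out.append({"level": abs(c), "sign": 1 if c < 0 else 0})
    return out
```
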
[Remarks]
The encoded data shown above is merely an example, and the syntax elements may be encoded with names or definitions different from the above. For example, the coefficient level level in the run mode may be encoded as the syntax element isLevelOne (indicating whether the level is 1) and, additionally, the syntax element level_magnitude_minus2 (when the level is 2 or more, the value obtained by subtracting 2 from the level is encoded) (see “level_magnitude_minus2” in Non-Patent Document 3).

Note that, for the last coefficient, the level may be encoded as the syntax element last_pos_level.

Further, the level in the non-run mode may be encoded as the syntax element level_magnitude.

(TU information decoding unit)
Next, the configuration of the TU information decoding unit 12 will be described in more detail with reference to FIG. 1. FIG. 1 is a block diagram showing the configuration of the TU information decoding unit 12.

Hereinafter, a configuration in which the TU information decoding unit 12 decodes the encoded quantized prediction residual QD, that is, the coefficient encoded data C10 among the encoded data included in the TU information TUI, will be described.

However, the TU information decoding unit 12 is not limited to this, and can decode data other than the coefficient encoded data C10 included in the encoded data, for example, side information.

In the following, it is assumed that the size of the coefficient matrix MX10 obtained by decoding the coefficient encoded data C10 is, for example, 8 × 8.

As shown in FIG. 1, the TU information decoding unit 12 includes a last coefficient decoding unit (last coefficient position detection unit) 121, a position determination unit (position determination unit, distance determination unit) 122, a two-stage decoding unit 123, a collective decoding unit 124, a linear scan VLC table TBL11, and a zigzag scan VLC table TBL12.

The last coefficient decoding unit 121 decodes the last coefficient from the coefficient encoded data C10. Specifically, the last coefficient decoding unit 121 decodes the last coefficient position last_pos, the last coefficient level level, and the last coefficient sign sign included in the coefficient coded data C10.

The position determination unit 122 determines the last coefficient position last_pos decoded by the last coefficient decoding unit 121. The position determination unit 122 determines whether to perform the coefficient decoding process in the two-stage decoding unit 123 or the collective decoding unit 124 according to the position of the last coefficient.

Specifically, letting the position of the last coefficient be last_pos (= LastIdx₁), the position determination unit 122 calculates the coordinates (LastX₁, LastY₁) of the last coefficient in the coefficient matrix as
LastX₁ = scanX(LastIdx₁), LastY₁ = scanY(LastIdx₁).

Here, scanX is a function that converts a scan order index into an x coordinate, and scanY is a function that converts a scan order index into a y coordinate.

The position determination unit 122 then evaluates whether the following determination formula (1) is true or false.

(LastX₁ == 0 && LastY₁ ≧ ThY₁) ||
(LastX₁ ≧ ThX₁ && LastY₁ == 0) (1)
In the determination formula (1), “&&” is a logical product operator, and “||” is a logical sum operator.

In the determination formula (1), the following two determinations are made. First, it is determined whether or not the last coefficient position LastIdx₁ is in either the first row or the first column of the coefficient matrix (“LastX₁ == 0”, “LastY₁ == 0”).

Second, it is determined whether or not the last coefficient position LastIdx₁ is a predetermined distance or more away from the DC coefficient position (X, Y) = (0, 0) (“LastY₁ ≧ ThY₁” and “LastX₁ ≧ ThX₁”).

This is because the effect of adapting the scan order is expected to be greater when the last coefficient position LastIdx₁ is sufficiently far from the position of the DC coefficient.

Conversely, if the last coefficient position LastIdx₁ is in the first row or the first column but is closer to the DC coefficient than the thresholds ThX₁ and ThY₁, the configuration refrains from adaptation.

In the case of an 8 × 8 coefficient matrix, the thresholds ThX₁ and ThY₁ can each be set to about “3”.

When the result of the determination formula (1) is “true”, the position determination unit 122 determines to perform a two-stage coefficient decoding process. In this case, the position determination unit 122 notifies the determination result to the two-stage decoding unit 123.

On the other hand, when the result of the determination formula (1) is “false”, the position determination unit 122 determines to perform the coefficient decoding process collectively. In this case, the position determination unit 122 notifies the determination result to the batch decoding unit 124.
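
The determination formula (1) translates directly into a small predicate. The following Python sketch uses the hypothetical name use_two_stage and takes the threshold value 3 mentioned above for an 8 × 8 coefficient matrix as its default; other values are equally possible.

```python
def use_two_stage(last_x, last_y, th_x=3, th_y=3):
    """Evaluate determination formula (1).

    True  -> two-stage coefficient decoding (two-stage decoding unit 123)
    False -> collective coefficient decoding (collective decoding unit 124)
    """
    in_first_column_far = (last_x == 0 and last_y >= th_y)
    in_first_row_far = (last_x >= th_x and last_y == 0)
    return in_first_column_far or in_first_row_far
```

With these defaults, a last coefficient at (6, 0) selects the two-stage process, whereas a last coefficient at (1, 0), which is in the first row but close to the DC coefficient, selects the collective process.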

The two-stage decoding unit 123 receives the notification from the position determination unit 122 and performs a two-stage coefficient decoding process. In this case, the two-stage decoding unit 123 refers to the linear scan VLC table TBL11 and the zigzag scan VLC table TBL12, and performs a two-stage coefficient decoding process on the target block. Details of the two-stage decoding unit 123 will be described later. The two-stage decoding unit 123 outputs the decoded coefficient matrix MX10.

The collective decoding unit 124 receives the notification from the position determination unit 122 and performs a coefficient decoding process in a batch. In this case, the collective decoding unit 124 refers to the zigzag scan VLC table TBL12 and performs coefficient decoding processing on the target block in a batch. The collective decoding unit 124 outputs the decoded coefficient matrix MX10.

The linear scan VLC table TBL11 is a table in which a bit string of encoded data is associated with each parameter. The linear scanning VLC table TBL11 is used in the first-stage coefficient decoding process of the two-stage decoding unit 123. A plurality of linear scan VLC tables TBL11 are defined according to the context. The context refers to the status of the target block and its surrounding blocks. The status of the target block and its surrounding blocks refers to various parameters of each block and values that can be derived from the various parameters.

The zigzag scan VLC table TBL12 is a table in which a bit string of encoded data is associated with each parameter. The zigzag scan VLC table TBL12 is used in both the second-stage coefficient decoding process of the two-stage decoding unit 123 and the coefficient decoding process of the collective decoding unit 124. A plurality of zigzag scan VLC tables TBL12 are defined according to the context.

(Configuration of collective decoding unit)
Next, the configuration of the collective decoding unit 124 will be described in more detail with reference to FIG. The batch decoding unit 124 includes a batch coefficient decoding unit 241 and a batch reverse scanning unit 242. In the following description, it is assumed that the position determination unit 122 has decided to perform the coefficient decoding process all at once.

The collective coefficient decoding unit 241 outputs a one-dimensional array of coefficients by decoding the encoded coefficients with the coefficient encoded data C10 as input for the target block. That is, the collective coefficient decoding unit 241 reproduces a one-dimensional array having the 64 coefficients of the target block as elements.

The collective coefficient decoding unit 241 performs coefficient decoding, for example, as follows. First, the collective coefficient decoding unit 241 performs decoding in the run mode. That is, the collective coefficient decoding unit 241 decodes the 0 run run, the coefficient level level, and the coefficient sign sign for each coefficient until a predetermined condition is satisfied. Examples of the predetermined condition include whether or not the coefficient value exceeds a threshold value, and whether or not a predetermined number of coefficients have been decoded.

When the run mode ends, the batch coefficient decoding unit 241 performs decoding in the non-run mode. In other words, the collective coefficient decoding unit 241 decodes the coefficient level level and the coefficient sign sign. The collective coefficient decoding unit 241 reproduces a one-dimensional array by decoding the coefficients in this way.

The collective inverse scan unit 242 reproduces the 8 × 8 coefficient matrix MX10 by performing inverse scan on the one-dimensional array output by the collective coefficient decoding unit 241. Inverse scanning refers to a process of sequentially reading elements stored in a one-dimensional array and rearranging them in a two-dimensional coefficient matrix. Further, the technique of Non-Patent Document 3 can be adopted for the reverse scan of the collective reverse scan unit 242.
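
The inverse scan can be sketched in Python as follows. The sketch assumes the conventional zigzag pattern as the scan order (the actual order is defined by the drawings and Non-Patent Document 3), and the names zigzag_order and inverse_scan are hypothetical.

```python
def zigzag_order(n=8):
    """(x, y) coordinates in zigzag scan order; index 0 is the DC position.

    Assumption: conventional zigzag pattern moving right from the DC position.
    """
    coords = [(x, y) for y in range(n) for x in range(n)]
    return sorted(coords, key=lambda p: (p[0] + p[1],
                                         p[1] if (p[0] + p[1]) % 2 else p[0]))


def inverse_scan(arr, n=8):
    """Rearrange a 1-D array in scan order into an n x n coefficient matrix."""
    matrix = [[0] * n for _ in range(n)]
    for idx, (x, y) in enumerate(zigzag_order(n)):
        matrix[y][x] = arr[idx]  # rows are indexed by y, columns by x
    return matrix
```

Reading the elements in increasing index order and writing each one to its (X, Y) position is the reverse of the scan performed on the encoding side.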

(Outline of the two-stage decoding process)
Next, the outline of the two-stage decoding process by the two-stage decoding unit 123 will be described with reference to FIG. 5. FIG. 5 is a diagram showing the relationship between the process in the two-stage decoding unit 123 and the input / output data related to the process.

As shown in FIG. 5, the two-stage decoding unit 123 receives the coefficient encoded data C10 included in the TU information TUI as input, performs a decoding process, and outputs a two-dimensional coefficient matrix MX10. That is, the two-stage decoding unit 123 outputs an 8 × 8 coefficient matrix MX10 including quantized coefficients as elements by decoding processing.

In the following description, it is assumed that the position determination unit 122 has decided to perform a two-stage coefficient decoding process.

Also, the coefficient encoded data C10 includes first coefficient encoded data C11, a second stage flag F1, and second coefficient encoded data C12. The second stage flag F1 indicates whether or not the second stage decoding process needs to be executed.

In the two-stage decoding unit 123, first, the first-stage coefficient decoding process is executed as follows. That is, first, the first coefficient encoded data C11 and the second stage flag F1 are input, and the coefficients are decoded in the first stage while referring to the linear scan VLC table TBL11. As a result, a one-dimensional array ARR1 is output. The size of the one-dimensional array ARR1 is “8”.

Then, a linear reverse scan is performed on the one-dimensional array ARR1. Accordingly, the coefficient matrix MX11 (8 × 1), corresponding to the first row of the 8 × 8 coefficient matrix MX10, is reproduced.

Further, according to the second-stage flag F1, the two-stage decoding unit 123 performs the second-stage coefficient decoding process. That is, the second-stage coefficient decoding is performed with the second coefficient encoded data C12 as input and with reference to the zigzag scan VLC table TBL12. As a result, a one-dimensional array ARR2 is output.

Then, reverse scanning is performed on the one-dimensional array ARR2. Thus, the coefficient matrix MX12 (8 × 7), which contains the remaining coefficients of the 8 × 8 coefficient matrix MX10, is reproduced. In the coefficient matrix MX12, the eighth row is empty, but the matrix can be processed as if it were an 8 × 8 coefficient matrix. In other words, the processing order of the collective coefficient process is applicable. The technique of Non-Patent Document 3 can also be adopted for this reverse scan.

Thereafter, in order to adjust the positions of the reproduced coefficients, a coefficient matrix MX13 is generated by copying the coefficients of the coefficient matrix MX12 with the first row left empty.

In the two-stage decoding unit 123, the coefficient matrix MX11 and the coefficient matrix MX13 obtained by the first-stage and second-stage coefficient decoding processes are merged and output as a coefficient matrix MX10.
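
The final shift-and-merge step can be sketched in Python for the case in which the first target range is the first row (the first-column case would be the transposed equivalent). The function name merge_two_stage is hypothetical, and MX12 is assumed to be supplied as a list of seven rows of eight coefficients.

```python
def merge_two_stage(arr1, mx12):
    """Combine the first-stage and second-stage results into MX10.

    arr1: 1-D array ARR1 holding the n coefficients of the first row.
    mx12: rows reproduced in the second stage (n - 1 rows of n coefficients).
    """
    n = len(arr1)
    # MX11: only the first row is populated (linear inverse scan of ARR1).
    mx11_row = list(arr1)
    # MX13: copy of MX12 with the first row left empty (shifted down one row).
    mx13 = [[0] * n] + [list(row) for row in mx12[:n - 1]]
    # MX10: first row taken from MX11, remaining rows taken from MX13.
    return [mx11_row] + mx13[1:]
```

Because MX11 and MX13 occupy disjoint rows, the merge reduces to placing the first row on top of the shifted second-stage rows.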

(Configuration of two-stage decoding unit)
Next, the configuration of the two-stage decoding unit 123 will be described in more detail with reference to FIG. 1 again. As shown in FIG. 1, the two-stage decoding unit 123 includes a first coefficient decoding unit 231, a first inverse scanning unit 232, a flag determining unit 233, a second coefficient decoding unit 234, and a second inverse scanning unit 235.

The first coefficient decoding unit 231 receives the first coefficient encoded data C11 as input, and decodes the coefficient included in the row or column including the last coefficient in the coefficient matrix. That is, the first coefficient decoding unit 231 decodes the coefficient included in the first row or the first column of the coefficient matrix.

The first coefficient decoding unit 231 supplies the decoded one-dimensional array ARR1 to the first reverse scanning unit 232.

The first reverse scanning unit 232 performs reverse scanning on the one-dimensional array ARR1 and rearranges it in the two-dimensional array MX11.

The flag determination unit 233 determines whether or not the second-stage decoding process is to be performed by examining the value of the second-stage flag F1.

The second coefficient decoding unit 234 receives the second coefficient encoded data C12 as input, and decodes the coefficients included in the rows or columns other than the row or column including the last coefficient in the coefficient matrix. That is, the second coefficient decoding unit 234 decodes the coefficients included in the rows or columns other than the first row or the first column of the coefficient matrix. The second coefficient decoding unit 234 supplies the decoded one-dimensional array ARR2 to the second reverse scanning unit 235.

The second reverse scanning unit 235 performs reverse scanning on the one-dimensional array ARR2 and rearranges it in the two-dimensional array MX12.

(Concrete example)
Next, a specific example of the two-stage coefficient decoding process of the two-stage decoding unit 123 will be described with reference to FIG. 6. FIG. 6 is a diagram illustrating a two-stage coefficient encoding process or coefficient decoding process of the coefficient matrix MX10. (A) of the figure shows the first stage, and (b) of the figure shows the second stage.

In the example shown in FIG. 6, the last coefficient LT is at the darkly shaded position, which is the seventh column of the first row of the coefficient matrix MX10. For this reason, the last coefficient LT is at a position away from the DC coefficient DC by a predetermined distance or more in the first row. In the first row, the non-zero coefficients are shaded.

As shown in FIG. 6A, in the first-stage coefficient decoding process, the processing target range (hereinafter referred to as a first target range) R10 is the first row.

Also, as shown in FIG. 6B, in the second-stage coefficient decoding process, the processing target range (hereinafter referred to as a second target range) R20 consists of the rows other than the first row. In FIG. 6B, the positions that have been processed (encoded / decoded) in the first stage are indicated by hatching.

Hereinafter, details of the coefficient decoding process in the first stage and the second stage will be described.

First, the first-stage coefficient decoding process is executed as follows. That is, first, the first coefficient decoding unit 231 decodes from the last coefficient LT to the DC coefficient DC in the first target range R10.

Referring to FIG. 6A, the coefficients decoded by the first coefficient decoding unit 231 are those included in the region R11, which runs continuously from the coefficient next to the last coefficient LT to the DC coefficient DC in the first target range R10. In this way, the first coefficient decoding unit 231 actually sets the region R11, in which non-zero coefficients may exist, as the decoding target.

Note that the eighth column of the first row is a position that appears after the last coefficient LT in the scan order of the entire coefficient matrix, and therefore, the value of the coefficient at this position is treated as “0”.

Subsequently, the first reverse scanning unit 232 performs reverse scanning on the coefficients decoded by the first coefficient decoding unit 231. Specifically, the first reverse scanning unit 232 performs reverse scanning as follows.

That is, the first reverse scanning unit 232 calculates the coordinates (X, Y) in the coefficient matrix as
X = scanX(idx), Y = scanY(idx).
Here, scanX is a function that converts the scan order index idx into the x coordinate X, and scanY is a function that converts the scan order index idx into the y coordinate Y.

Also, scanX and scanY can be realized by a table that defines the correspondence between idx and (X, Y).

For example, the position of the DC coefficient is idx = 0 in terms of the scan order index, and the corresponding coordinate expression is (X, Y) = (scanX(0), scanY(0)) = (0, 0).

In the example shown in FIG. 6A, the position of the last coefficient LT is the scan order index LastIdx₁ = 27. In coordinate expression, the position of the last coefficient LT is (LastX₁, LastY₁) = (6, 0).

The first reverse scanning unit 232 performs the reverse scan in this way to restore the first row of the coefficient matrix.
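
As a rough Python sketch of the first-stage reconstruction, the first row can be rebuilt from the last coefficient and the coefficients of the region R11. The function name rebuild_first_row and the argument layout are hypothetical; the coefficients of R11 are assumed to arrive in decoding order, from the one next to the last coefficient down to the DC coefficient.

```python
def rebuild_first_row(last_x, last_value, r11_values, n=8):
    """Reproduce the first row of the coefficient matrix.

    last_x:     x coordinate of the last coefficient LT in the first row.
    last_value: signed value of the last coefficient LT.
    r11_values: coefficients of region R11 in decoding order
                (position last_x - 1 first, DC position last).
    """
    row = [0] * n  # positions after LT in the row stay "0"
    row[last_x] = last_value
    for i, value in enumerate(r11_values):
        row[last_x - 1 - i] = value
    return row
```

With last_x = 6, the eighth column is never written and therefore keeps the value 0, which matches the treatment of the position after the last coefficient described above.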

When the first-stage coefficient decoding process is completed as described above, the second-stage coefficient decoding process is started according to the flag determination of the flag determination unit 233.

Next, the second-stage coefficient decoding process is executed as follows. First, the second coefficient decoding unit 234 decodes, within the second target range R20, the coefficients from the last coefficient K1 in the scan order in the second target range R20 to the first coefficient ST in the scan order in the second target range R20.

Subsequently, the second reverse scanning unit 235 performs reverse scanning on the coefficients decoded by the second coefficient decoding unit 234. Specifically, the second reverse scanning unit 235 performs reverse scanning as follows.

Here, the second reverse scanning unit 235 applies the scan order in the case where the coefficient decoding process is performed collectively for the second target range R20.

In addition, the second reverse scanning unit 235 actually sets a region R21 from the coefficient K1 to the coefficient ST as a reverse scan target. At this time, the last coefficient in the region R21 is the coefficient K1. The second reverse scanning unit 235 sequentially performs reverse scanning from the coefficient ST to the coefficient K1.

The second reverse scanning unit 235 performs the reverse scanning in this manner to reproduce the second to eighth rows of the coefficient matrix.

As described above, the two-stage decoding unit 123 reproduces the first to eighth rows of the coefficient matrix, includes the reproduced coefficient matrix in the decoded TU information TUI ′, and supplies it to the inverse quantization / inverse transform unit 13.

(Process flow)
Next, the flow of coefficient decoding processing in the TU information decoding unit 12 will be described with reference to FIG. 7. FIG. 7 is a flowchart showing the flow of coefficient decoding processing in the TU information decoding unit 12.

First, when the last coefficient decoding unit 121 decodes the last coefficient LT (S11), the position determination unit 122 determines whether or not the position of the last coefficient LT satisfies a predetermined condition (S12).

If the position of the last coefficient LT does not satisfy the predetermined condition (NO in S12), the batch decoding unit 124 performs batch coefficient decoding processing (S13).

That is, in S13, the collective coefficient decoding unit 241 first decodes the coefficients in order from the coefficient next to the last coefficient (S131). Then, the collective inverse scan unit 242 performs inverse scan on the one-dimensional array in which the decoded coefficients are stored to reproduce the coefficient matrix (S132). The collective decoding unit 124 outputs the two-dimensional coefficient matrix obtained in this way, and the process ends.

On the other hand, when the position of the last coefficient LT satisfies a predetermined condition (YES in S12), the two-stage decoding unit 123 performs a two-stage coefficient decoding process as follows.

That is, first, the first coefficient decoding unit 231 decodes from the coefficient next to the last coefficient to the DC coefficient (S14).

Next, the first inverse scanning unit 232 performs inverse scanning on the coefficients decoded in S14 (S15). Thereby, the first row or the first column of the coefficient matrix is reproduced.

Next, the flag determination unit 233 decodes the second stage flag (S16), and determines whether or not the second stage flag is “1” (S17). If the second stage flag is not “1” (NO in S17), since there is no non-zero coefficient in the region to be decoded in the second stage, the second stage coefficient decoding process is not performed. The coefficient decoding process ends.

On the other hand, when the second stage flag is “1” (YES in S17), the second stage coefficient decoding process is executed. That is, the second coefficient decoding unit 234 decodes the remaining coefficients (S18). Then, the second inverse scanning unit 235 performs inverse scanning on the coefficients decoded in S18, thereby reproducing the coefficient matrix for the remaining coefficients (S19).

The first-stage and second-stage coefficient decoding processes are performed, so that the entire coefficient matrix is reproduced, and the process ends.
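
The branching of steps S11 to S19 can be summarized by the following Python sketch, which only traces which steps are executed and does not decode any data. It assumes that the predetermined condition of S12 is the determination formula (1) with thresholds of 3; the function name decoding_steps is hypothetical.

```python
def decoding_steps(last_x, last_y, second_stage_flag, th_x=3, th_y=3):
    """Return the labels of the flowchart steps executed for one block."""
    steps = ["S11", "S12"]  # decode last coefficient, then judge its position
    two_stage = ((last_x == 0 and last_y >= th_y) or
                 (last_x >= th_x and last_y == 0))
    if not two_stage:
        # Collective decoding: decode coefficients, then inverse scan.
        return steps + ["S131", "S132"]
    # First stage: decode the first row or column, inverse scan, read the flag.
    steps += ["S14", "S15", "S16", "S17"]
    if second_stage_flag == 1:
        # Second stage: decode the remaining coefficients and inverse scan.
        steps += ["S18", "S19"]
    return steps
```

When the second-stage flag is not 1, the trace ends after S17, reflecting that no non-zero coefficient exists in the region to be decoded in the second stage.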

(Action / Effect)
As described above, the moving picture decoding apparatus 1 according to the present invention reproduces the coefficient matrix MX10 by decoding and inverse scanning the coefficient encoded data C10 included in the encoded data # 1 obtained by encoding the image data. The moving picture decoding apparatus 1 includes: the last coefficient decoding unit 121, which detects the position of the last coefficient LT in the reverse scan order in the coefficient matrix MX10; the position determination unit 122, which determines whether or not the position of the last coefficient LT is in the first target range R10, that is, the first row or the first column of the coefficient matrix MX10 containing the DC coefficient DC; and the two-stage decoding unit 123, which, when the position of the last coefficient LT is in the first target range R10, performs the decoding process of the coefficient matrix MX10 separately for the first target range R10 and for the second target range R20 other than the first target range R10.

For this reason, when the coefficient distribution is biased, it is possible to reduce the code amount of the encoded data to be decoded.

In the decoding process of the second target range R20, the second target range R20 may be further divided into a new first target range and a new second target range, and the above processing may be performed recursively.

[Moving picture encoding device]
The configuration of the moving image encoding device 2 according to this embodiment is described below with reference to FIG. Note that members identical to those already described are given the same reference signs, and their description is omitted.

(Outline of video encoding device)
Generally speaking, the moving image encoding device 2 is a device that generates and outputs encoded data # 1 by encoding the input image # 10.

(Configuration of video encoding device)
FIG. 8 is a functional block diagram showing the configuration of the moving image encoding device 2. As illustrated in FIG. 8, the moving image encoding device 2 includes an encoding setting unit 21, an inverse quantization / inverse transform unit 22, a predicted image generation unit 23, an adder 24, a frame memory 25, a subtractor 26, a transform / quantization unit 27, and a variable length coding unit 28.

The encoding setting unit 21 generates image data related to encoding and various setting information based on the input image # 10.

Specifically, the encoding setting unit 21 generates the following image data and setting information.

First, the encoding setting unit 21 generates the leaf CU image # 100 for the target leaf CU by sequentially dividing the input image # 10 into slice units and LCU units.

Also, the encoding setting unit 21 generates header information H ′ based on the result of the division process. The header information H ′ includes (1) information about the size, shape, and position of the LCUs belonging to the target slice, and (2) CU information CU ′ about the size, shape, and position of the leaf CUs belonging to each LCU.

Furthermore, the encoding setting unit 21 refers to the leaf CU image # 100 and the CU information CU 'to generate PU setting information PUI'. The PU setting information PUI 'includes information on all combinations of (1) possible division patterns for each partition of the target leaf CU and (2) prediction modes that can be assigned to each partition.

The encoding setting unit 21 supplies the leaf CU image # 100 to the subtractor 26.

Also, the encoding setting unit 21 supplies the header information H ′ to the variable length encoding unit 28. Also, the encoding setting unit 21 supplies the PU setting information PUI ′ to the predicted image generation unit 23.

The inverse quantization / inverse transform unit 22 performs inverse quantization and inverse DCT (Inverse Discrete Cosine Transform) on the quantized prediction residual for each block supplied from the transform / quantization unit 27, thereby restoring the prediction residual for each block. Further, the inverse quantization / inverse transform unit 22 integrates the prediction residuals for the blocks according to the division pattern specified by the TU partition information, and generates a prediction residual D for the target leaf CU. The inverse quantization / inverse transform unit 22 supplies the generated prediction residual D for the target leaf CU to the adder 24.

The predicted image generation unit 23 refers to the locally decoded image P ′ recorded in the frame memory 25 and the PU setting information PUI ′, and generates a predicted image Pred for the target leaf CU. When performing inter-channel prediction in the prediction of chrominance, the predicted image generation unit 23 refers to the luminance decoded image P_Y. The predicted image generation unit 23 sets the prediction parameter obtained by the predicted image generation process in the PU setting information PUI ′, and transfers the set PU setting information PUI ′ to the variable length coding unit 28. Note that the predicted image generation process performed by the predicted image generation unit 23 is the same as that performed by the predicted image generation unit 13 included in the video decoding device 1, and thus description thereof is omitted here.

The adder 24 adds the predicted image Pred supplied from the predicted image generation unit 23 and the prediction residual D supplied from the inverse quantization / inverse transform unit 22, thereby generating a decoded image P for the target leaf CU.

The generated decoded image P is sequentially recorded in the frame memory 25. At the time of decoding the target LCU, the frame memory 25 holds decoded images corresponding to all the LCUs decoded before the target LCU (for example, all the LCUs preceding it in the raster scan order).

The subtracter 26 generates a prediction residual D for the target leaf CU by subtracting the prediction image Pred from the leaf CU image # 100. The subtractor 26 supplies the generated prediction residual D to the transform / quantization unit 27.

The transform / quantization unit 27 performs a DCT transform (Discrete Cosine Transform) and quantization on the prediction residual D to generate a quantized prediction residual.

Specifically, the transform / quantization unit 27 refers to the leaf CU image # 100 and the CU information CU ', and determines the division pattern of the target leaf CU into one or a plurality of blocks. Further, according to the determined division pattern, the prediction residual D is divided into prediction residuals for each block.

The transform / quantization unit 27 generates a prediction residual in the frequency domain by performing a DCT (Discrete Cosine Transform) on the prediction residual for each block, and then quantizes the prediction residual in the frequency domain. Thus, a quantized prediction residual for each block is generated.
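The transform followed by quantization can be illustrated with the following minimal sketch. It assumes an orthonormal two-dimensional DCT-II and a uniform quantizer with a single step size; the description does not specify the quantizer, so the function quantize and its parameter step are assumptions made only for illustration.

```python
import math


def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block (list of rows)."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):            # vertical frequency
        for v in range(n):        # horizontal frequency
            total = 0.0
            for y in range(n):
                for x in range(n):
                    total += (block[y][x]
                              * math.cos((2 * y + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * x + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out


def quantize(coefficients, step):
    """Uniform quantization with one step size (an assumption of this sketch)."""
    return [[int(round(c / step)) for c in row] for row in coefficients]


# A flat 8x8 residual block has energy only in the DC coefficient.
flat_block = [[4] * 8 for _ in range(8)]
quantized = quantize(dct_2d(flat_block), step=8)
```

For the flat block above, only the DC position of the quantized matrix is non-zero, which is the kind of biased coefficient distribution the later scan and coefficient encoding steps operate on.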

Also, the transform / quantization unit 27 generates TU setting information TUI ′ including the generated quantized prediction residual for each block, TU partition information that specifies the partition pattern of the target leaf CU, and information about all possible partition patterns of the target leaf CU into blocks.

The transform / quantization unit 27 supplies the generated TU setting information TUI 'to the inverse quantization / inverse transform unit 22 and the variable length coding unit 28.

The variable length encoding unit 28 generates and outputs encoded data # 1 based on the TU setting information TUI ′, the PU setting information PUI ′, and the header information H ′. Details of the variable length coding unit 28 will be described below.

(Variable length coding unit)
Next, the configuration of the variable length coding unit 28 will be described in more detail with reference to FIG. FIG. 9 is a block diagram illustrating a configuration example of the variable length coding unit 28.

As shown in FIG. 9, the variable length coding unit 28 includes a TU information coding unit 31, a header information coding unit 32, a PUI information coding unit 33, and a coded data multiplexing unit 34.

The TU information encoding unit 31 encodes the TU setting information TUI ′ and supplies it to the encoded data multiplexing unit 34. The header information encoding unit 32 encodes the header information H ′ and supplies the encoded header information H ′ to the encoded data multiplexing unit 34. The PUI information encoding unit 33 encodes the PU setting information PUI ′ and supplies the encoded PU setting information PUI ′ to the encoded data multiplexing unit 34.

The encoded data multiplexing unit 34 multiplexes the encoded TU setting information TUI ', header information H', and PU setting information PUI 'to generate encoded data # 1, and outputs it.

Here, the configuration of the TU information encoding unit 31 will be described in more detail as follows. Hereinafter, a configuration for encoding a quantized prediction residual, that is, a coefficient matrix included in the TU setting information TUI ′ will be described.

The TU information encoding unit 31 includes a last coefficient detection unit (last coefficient position detection unit) 310, a position determination unit 320, a two-stage encoding unit 330, a batch encoding unit 340, a linear scan VLC table TBL21, and a zigzag scan VLC table TBL22.

The last coefficient detection unit 310 performs a reverse zigzag scan on the coefficient matrix to detect the last coefficient. The last coefficient detection unit 310 can employ, for example, the technique of Non-Patent Document 3 as the last coefficient detection method. The last coefficient detection unit 310 encodes the position (last_pos), the level (level), and the positive / negative sign (sign) of the detected last coefficient and includes them in the coefficient encoded data.
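Detection of the last coefficient by a reverse zigzag scan can be sketched as follows. The sketch assumes the conventional zigzag order starting at the DC position and moving first to the right; the helper names zigzag_order and find_last_coefficient, and the convention that sign is 1 for a negative coefficient, are assumptions of this illustration and not taken from Non-Patent Document 3.

```python
def zigzag_order(width, height):
    """(row, col) positions in zigzag scan order, starting at the DC position."""
    order = []
    for d in range(width + height - 1):
        diagonal = [(r, d - r) for r in range(height) if 0 <= d - r < width]
        if d % 2 == 0:
            diagonal.reverse()   # even anti-diagonals run from bottom-left to top-right
        order.extend(diagonal)
    return order


def find_last_coefficient(matrix):
    """Return (last_pos, level, sign) of the last non-zero coefficient, or None."""
    order = zigzag_order(len(matrix[0]), len(matrix))
    for pos in range(len(order) - 1, -1, -1):   # reverse zigzag scan
        row, col = order[pos]
        value = matrix[row][col]
        if value != 0:
            return pos, abs(value), 1 if value < 0 else 0
    return None


mx10 = [[0] * 8 for _ in range(8)]
mx10[0][0] = 5     # DC coefficient
mx10[0][2] = -1    # last coefficient, located in the first row
last = find_last_coefficient(mx10)
```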

The position determination unit 320 determines whether or not the last coefficient position last_pos detected by the last coefficient detection unit 310 satisfies a predetermined condition. Specifically, the position determination unit 320 evaluates whether the above-described determination formula (1) is true or false.

When the result of the determination formula (1) is “true”, the position determination unit 320 decides to perform the two-stage coefficient encoding process. In this case, the position determination unit 320 notifies the two-stage encoding unit 330 of the determination result.

On the other hand, when the result of the determination formula (1) is “false”, the position determination unit 320 decides to perform the coefficient encoding process in a batch. In this case, the position determination unit 320 notifies the batch encoding unit 340 of the determination result.

The two-stage encoding unit 330 receives the notification from the position determination unit 320 and performs the two-stage coefficient encoding process. In this case, the two-stage encoding unit 330 refers to the linear scan VLC table TBL21 and the zigzag scan VLC table TBL22, and performs the two-stage coefficient encoding process on the target block. Details of the two-stage encoding unit 330 will be described later. The two-stage encoding unit 330 outputs the encoded coefficient matrix MX10.

The batch encoding unit 340 receives the notification from the position determination unit 320 and performs the coefficient encoding process in a batch. In this case, the batch encoding unit 340 refers to the zigzag scan VLC table TBL22 and performs the coefficient encoding process on the target block in a batch. The batch encoding unit 340 outputs the encoded coefficient matrix MX10.
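The selection between the two encoding paths can be sketched as follows. Determination formula (1) itself is not reproduced in this part of the description, so the predicate below is only a hypothetical stand-in: it assumes the condition is that the last coefficient lies in the first row or the first column at least min_distance positions away from the DC coefficient, and min_distance is an invented parameter.

```python
def should_use_two_stage(last_row, last_col, min_distance=2):
    """Hypothetical stand-in for determination formula (1)."""
    in_first_row = last_row == 0 and last_col >= min_distance
    in_first_col = last_col == 0 and last_row >= min_distance
    return in_first_row or in_first_col


def select_encoder(last_row, last_col):
    """Return which unit is notified: the two-stage unit 330 or the batch unit 340."""
    if should_use_two_stage(last_row, last_col):
        return "two_stage_encoding_unit_330"
    return "batch_encoding_unit_340"


choice_first_row = select_encoder(0, 5)   # last coefficient far along the first row
choice_interior = select_encoder(3, 3)    # last coefficient inside the block
```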

The linear scan VLC table TBL21 is a table in which the correspondence between each parameter and a bit string of encoded data is defined. The linear scan VLC table TBL21 is used in the first-stage coefficient encoding process of the two-stage encoding unit 330. A plurality of linear scan VLC tables TBL21 are defined according to the context. Note that the linear scan VLC table TBL11 in the moving picture decoding apparatus 1 is a table that allows the definition of the linear scan VLC table TBL21 to be looked up in reverse.

The zigzag scan VLC table TBL22 is a table in which the correspondence between each parameter and a bit string of encoded data is defined. The zigzag scan VLC table TBL22 is used in both the second-stage coefficient encoding process of the two-stage encoding unit 330 and the coefficient encoding process in the batch encoding unit 340. A plurality of zigzag scan VLC tables TBL22 are defined in accordance with the context.

Further, the zigzag scan VLC table TBL12 in the moving picture decoding apparatus 1 is a table that allows the definition of the zigzag scan VLC table TBL22 to be looked up in reverse.

(Batch encoding unit)
Next, the configuration of the batch encoding unit 340 will be described in more detail with reference to FIG. The batch encoding unit 340 includes a batch scan unit 341 and a batch coefficient encoding unit 342. In the following description, it is assumed that the position determination unit 320 has decided to perform the coefficient encoding process in a batch.

The batch scan unit 341 sequentially stores the coefficients read by performing the zigzag scan on the two-dimensional coefficient matrix MX10 in a one-dimensional array. The batch scan unit 341 supplies the one-dimensional array obtained by scanning to the batch coefficient encoding unit 342.

The batch coefficient encoding unit 342 encodes the one-dimensional array supplied from the batch scan unit 341 with reference to the zigzag scan VLC table TBL22, and outputs coefficient encoded data.

The batch coefficient encoding unit 342 encodes coefficients, for example, as follows. First, the batch coefficient encoding unit 342 performs encoding in the run mode. That is, while a predetermined condition holds, the batch coefficient encoding unit 342 encodes, for each non-zero coefficient, the length of the preceding zero run (run), the coefficient level (level), and the coefficient sign (sign). Examples of the predetermined condition include whether or not the coefficient value exceeds a threshold value, and whether or not a predetermined number of coefficients have been encoded.

When the run mode ends, the batch coefficient encoding unit 342 performs encoding in the non-run mode. In other words, the batch coefficient encoding unit 342 encodes the coefficient level (level) and the coefficient sign (sign) of each remaining coefficient. The batch coefficient encoding unit 342 generates coefficient encoded data by encoding the coefficients in this way.
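The switch from the run mode to the non-run mode can be sketched as below. The sketch only produces symbols, not VLC bit strings, and takes the coefficients already arranged in encoding order. The parameters level_threshold and min_remaining are hypothetical stand-ins for the "predetermined condition", and the convention that sign is 1 for a negative value is likewise an assumption.

```python
def encode_symbols(coeffs, level_threshold=1, min_remaining=0):
    """Turn coefficients (in encoding order) into run-mode and non-run-mode symbols."""
    symbols = []
    n = len(coeffs)
    i = 0
    run_mode = True
    while i < n and run_mode:
        run = 0
        while i + run < n and coeffs[i + run] == 0:
            run += 1                              # count the zero run
        if i + run == n:
            symbols.append(("run", run, None, None))   # no further non-zero coefficient
            return symbols
        value = coeffs[i + run]
        symbols.append(("run", run, abs(value), 1 if value < 0 else 0))
        i += run + 1
        # Hypothetical end-of-run-mode condition: a large level, or few coefficients left.
        if abs(value) > level_threshold or n - i <= min_remaining:
            run_mode = False
    while i < n:                                  # non-run mode: level and sign only
        value = coeffs[i]
        sign = None if value == 0 else (1 if value < 0 else 0)
        symbols.append(("level", abs(value), sign))
        i += 1
    return symbols


symbols = encode_symbols([0, 0, 1, 0, -3, 0, 2, 0])
```

In this example the level-3 coefficient exceeds level_threshold, so the remaining three coefficients are represented one by one in the non-run mode.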

(Outline of the two-stage encoding process)
Next, an overview of the two-stage encoding process performed by the two-stage encoding unit 330 will be described with reference to FIG. FIG. 10 is a diagram showing the relationship between the process in the two-stage encoding unit 330 and the input / output data related to the process.

As shown in FIG. 10, the two-stage encoding unit 330 performs an encoding process with a two-dimensional coefficient matrix MX10 as an input, and outputs coefficient encoded data C10. It is assumed here that the position determination unit 320 has decided to perform the two-stage coefficient encoding process.

In the two-stage encoding unit 330, first, the first-stage encoding process is executed as follows. That is, first, a scan process is performed on the coefficient matrix MX11 (8 × 1) in the first row of the coefficient matrix MX10. This scanning process is performed linearly from the DC coefficient included in the first row to the last coefficient. In addition, a one-dimensional array ARR1 is output by this scanning process. The size of the one-dimensional array ARR1 is “8” at the maximum.

Then, a process of encoding the one-dimensional array ARR1 is executed with reference to the linear scan VLC table TBL21. As a result, the coefficient encoded data C11 is output.

Subsequently, in the two-stage encoding unit 330, the second-stage coefficient encoding process is executed as follows. That is, first, the coefficient matrix MX12 (8 × 7), which consists of the remaining coefficients of the coefficient matrix MX10, is copied for position adjustment. That is, an 8 × 8 coefficient matrix MX13 storing the coefficients of the coefficient matrix MX12 (8 × 7) from the first row is generated. By copying in this way, the same processing order as in the batch coefficient encoding process can be applied.

Subsequently, a one-dimensional array ARR22 is generated by performing a zigzag scan on the coefficient matrix MX13. In this scan process, when the coefficient matrix MX13 includes a non-zero coefficient, the second stage flag F1 is set.

Further, in the two-stage encoding unit 330, the coefficient stored in the one-dimensional array ARR22 is encoded by referring to the zigzag scan VLC table TBL22. Thereby, coefficient encoded data C12 is generated.

In the two-stage encoding unit 330, the coefficient encoded data C11, the second-stage flag F1, and the coefficient encoded data C12 obtained by the first-stage and second-stage coefficient encoding processes are merged and output as the coefficient encoded data C10.
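The data flow of FIG. 10 up to the point where the VLC tables are consulted can be sketched as follows. The sketch assumes the first-row case, takes the column last_x of the last coefficient LT as an input, and uses the conventional zigzag order; the function names and the zero padding of the last row of MX13 are assumptions of this illustration.

```python
def zigzag_order(width, height):
    """(row, col) positions in zigzag scan order, starting at the DC position."""
    order = []
    for d in range(width + height - 1):
        diagonal = [(r, d - r) for r in range(height) if 0 <= d - r < width]
        if d % 2 == 0:
            diagonal.reverse()
        order.extend(diagonal)
    return order


def two_stage_scan(mx10, last_x):
    """Return (ARR1, second-stage flag F1, ARR22) for an 8x8 matrix MX10."""
    height, width = len(mx10), len(mx10[0])
    # First stage: linear scan of the first row (MX11) from DC to the last coefficient.
    arr1 = list(mx10[0][:last_x + 1])
    # Second stage: copy the remaining rows (MX12) to the top of MX13;
    # the bottom row is padded with zeros (an assumption of this sketch).
    mx13 = [list(row) for row in mx10[1:]] + [[0] * width]
    arr22 = [mx13[r][c] for r, c in zigzag_order(width, height)]
    flag_f1 = 1 if any(v != 0 for v in arr22) else 0
    return arr1, flag_f1, arr22


mx10 = [[0] * 8 for _ in range(8)]
mx10[0][0], mx10[0][3], mx10[1][0] = 9, 1, 2
arr1, flag_f1, arr22 = two_stage_scan(mx10, last_x=3)
```

Encoding ARR1 with the linear scan VLC table and ARR22 with the zigzag scan VLC table would then yield C11 and C12, which are merged with F1 into C10.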

(Configuration of two-stage encoding unit)
Next, the configuration of the two-stage encoding unit 330 will be described in more detail with reference to FIG. 9 again. As shown in FIG. 9, the two-stage encoding unit 330 includes a first scanning unit 331, a first coefficient encoding unit 332, a second scanning unit 333, a flag setting unit 334, and a second coefficient encoding unit 335.

The first scan unit 331 performs a linear scan on the first row or the first column in the 8 × 8 coefficient matrix MX10 and stores the scanned coefficients in the one-dimensional array ARR1. Here, the linear scan is to perform a linear scan from the DC coefficient to the last coefficient for the first row or the first column in the 8 × 8 coefficient matrix MX10.

The first coefficient encoding unit 332 generates the first coefficient encoded data C11 by encoding the coefficients stored in the one-dimensional array ARR1 generated by the first scanning unit 331 while referring to the linear scan VLC table TBL21.

The coefficient encoding process in the first coefficient encoding unit 332 is the same as the coefficient encoding process in the batch coefficient encoding unit 342 except that the scan order is a straight line and the number of coefficients is different. For example, the number of coefficients encoded by the first coefficient encoding unit 332 is at most (LastX₁ + 1) or (LastY₁ + 1).

The second scan unit 333 performs a zigzag scan on the remaining part other than the first row or the first column in the 8 × 8 coefficient matrix MX10, and stores the scanned coefficients in the one-dimensional array ARR2.

The flag setting unit 334 determines whether or not the one-dimensional array ARR2 has a non-zero coefficient, and sets a second stage flag indicating the determination result. That is, the second stage flag indicates whether or not the second stage decoding process needs to be executed.

The second coefficient encoding unit 335 generates the second coefficient encoded data C12 by encoding the non-zero coefficients stored in the one-dimensional array ARR2 while referring to the zigzag scan VLC table TBL22.

Note that the coefficient encoding process in the second coefficient encoding unit 335 is the same as the coefficient encoding process in the collective coefficient encoding unit 342 except that the position of the target coefficient is shifted by one row or one column.

(Concrete example)
Next, referring to FIG. 6 again, a specific example of the two-stage coefficient encoding processing of the two-stage encoding unit 330 will be described.

As shown in FIG. 6A, since the last coefficient LT is located in the first row at a predetermined distance or more from the DC coefficient DC, the first target range R10 is the target of the coefficient encoding process in the first stage.

At this time, the first-stage coefficient encoding process is executed as follows. That is, first, the first scanning unit 331 linearly scans the first target range R10 from the coefficient next to the last coefficient LT to the DC coefficient DC, and stores the scanned coefficients in the one-dimensional array ARR1.

Subsequently, the first coefficient encoding unit 332 encodes the coefficients stored in the one-dimensional array ARR1 from the coefficient next to the last coefficient LT to the DC coefficient DC while referring to the linear scan VLC table TBL21. The order in which the first coefficient encoding unit 332 encodes the coefficients is indicated by solid line arrows in FIG.

When the first stage coefficient encoding process is completed as described above, the second stage coefficient encoding process is started.

Further, as shown in FIG. 6B, in the second stage coefficient encoding process, the second target range R20 is a processing target.

At this time, the second-stage coefficient encoding process is executed as follows. That is, first, the second scanning unit 333 performs a zigzag scan on the second target range R20 with the coefficient ST as a base point. That is, the zigzag scan is performed in the reverse order of the solid line arrows shown in FIG. The second scanning unit 333 stores the scanned coefficients in order in the one-dimensional array ARR2.

Next, the second coefficient encoding unit 335 encodes the coefficients stored in the one-dimensional array ARR2.

In the second stage, the region R21 from the coefficient K1 to the coefficient ST is actually the target of the encoding process.

The order in which the second coefficient encoding unit 335 encodes the coefficients is indicated by solid line arrows in FIG. That is, the second coefficient encoding unit 335 performs encoding from the last coefficient K1 in the second stage to the first coefficient ST in the second stage, and generates second coefficient encoded data C12.

In this way, the two-stage encoding unit 330 performs the coefficient encoding process and generates the coefficient encoded data C10 including the first coefficient encoded data C11, the second-stage flag F1, and the second coefficient encoded data C12.

(Process flow)
Next, the flow of coefficient encoding processing in the TU information encoding unit 31 will be described with reference to FIG. FIG. 11 is a flowchart showing the flow of coefficient encoding processing in the TU information encoding unit 31.

First, when the last coefficient detection unit 310 derives the last coefficient LT of the coefficient matrix MX10 (S21), the position determination unit 320 determines whether the position of the last coefficient LT satisfies a predetermined condition (S22).

When the position of the last coefficient LT does not satisfy the predetermined condition (NO in S22), the batch encoding unit 340 performs the coefficient encoding process in a batch (S23).

That is, in S23, the collective scanning unit 341 first scans the coefficient matrix MX10 and stores the scanned coefficients in a one-dimensional array (S231). Then, the collective coefficient encoding unit 342 sequentially encodes the coefficients stored in the one-dimensional array (S232). The batch encoding unit 340 outputs the coefficient encoded data C10 obtained in this way, and the process ends.

Note that the batch coefficient encoding unit 342 first encodes the coefficient in the run mode, and when the predetermined condition is satisfied, switches to the non-run mode and encodes the coefficient.

On the other hand, when the position of the last coefficient LT satisfies a predetermined condition (YES in S22), the two-stage encoding unit 330 performs a two-stage coefficient encoding process as follows.

That is, first, the first scanning unit 331 scans the coefficient from the coefficient next to the last coefficient to the DC coefficient to store the coefficient in the one-dimensional array ARR1 (S24).

Next, the first coefficient encoding unit 332 encodes the coefficient of the one-dimensional array ARR1 obtained in S24 from the coefficient next to the last coefficient to the DC coefficient (S25). The first coefficient encoding unit 332 outputs first coefficient encoded data C11 generated by encoding.

Next, the second scanning unit 333 scans the remaining coefficients (S26). That is, the second scanning unit 333 performs scanning on the region R21. As a result, the coefficients in the region R21 are stored in the one-dimensional array ARR2.

Here, the flag setting unit 334 determines whether there is a remaining coefficient (S27). That is, the flag setting unit 334 determines whether or not a non-zero coefficient exists in the one-dimensional array ARR2.

If there is no remaining coefficient (NO in S27), the flag setting unit 334 sets “0 (no remaining coefficient)” to the second stage flag F1 (S28). Then, in the second stage, the processing ends without encoding the coefficients.

On the other hand, when there is a remaining coefficient (YES in S27), the flag setting unit 334 sets “1 (with remaining coefficient)” to the second stage flag F1 (S29).

Next, the second coefficient encoding unit 335 encodes the remaining coefficients in the region R21 (S30). The second coefficient encoding unit 335 outputs the second coefficient encoded data C12 generated by encoding, and the process ends.

Note that the second coefficient encoding unit 335 may first encode the coefficient in the run mode, and may encode the coefficient by switching to the non-run mode when a predetermined condition is satisfied. Further, the second coefficient encoding unit 335 may encode the coefficient only in the non-run mode.

(Action / Effect)
As described above, the video encoding device 2 outputs the coefficient encoded data C10 by scanning and encoding the coefficient matrix MX10, which consists of the quantized prediction residual QD obtained by orthogonally transforming and quantizing the prediction residual D, the prediction residual D being obtained by subtracting the predicted image Pred from the input image # 10. The video encoding device 2 includes: the last coefficient detection unit 310, which detects the position of the last coefficient LT in the scan order in the coefficient matrix MX10; the position determination unit 320, which determines whether or not the position of the last coefficient LT is within the first-stage target range R10, that is, the first row or the first column of the coefficient matrix MX10 including the DC coefficient DC; and the two-stage encoding unit 330, which, when the position of the last coefficient LT is within the first-stage target range R10, encodes the coefficient matrix MX10 in two stages, namely the first-stage target range R10 and the second-stage target range R20 other than the first-stage target range R10.

With this configuration, when there is a bias in the distribution of coefficients, it is possible to prevent the 0 run from being divided during the coefficient encoding process and improve the encoding efficiency.

(Example)
In the following, a detailed embodiment of the video encoding device 2 will be further described.

[Syntax]
FIGS. 12 and 13 show an example of the syntax of the coefficient encoding process in the TU information encoding unit 31 of the moving image encoding apparatus 2. It should be noted that switching between performing the coefficient encoding process all at once or in two stages is performed in the upper syntax. In other words, the determination of the position of the last coefficient LT is performed using the upper syntax.

The syntax shown in FIG. 12 will be described as follows. When “residual_block_vlc_1” on the first line is called, the two-stage coefficient encoding process is started.

Also, in “residual_block_vlc_1a” of SYN11, the first-stage coefficient encoding process is called. The syntax element “more_coeff_flag” of SYN 12 indicates whether or not to perform the second-stage coefficient encoding process.

When “more_coeff_flag” is “1”, the second-stage coefficient encoding process is called by “residual_block_vlc_1a” in SYN13.

The syntax shown in FIG. 13 indicates the coefficient encoding process in “residual_block_vlc_1a”. As shown in FIG. 13, the last coefficient is encoded in SYN 21. More specifically, first, “last_pos_table_idx” is encoded. The syntax element “last_pos_table_idx” indicates a combination of a flag (levelMagnitudeGreaterThanOneFlag) indicating whether or not the level of the last coefficient is greater than 1 and the position (last_pos) of the last coefficient. These combinations are defined in a table, and each combination is specified by an index.

When “levelMagnitudeGreaterThanOneFlag” is “1”, the level of the last coefficient is larger than 1, so the value obtained by subtracting 2 is encoded (last_pos_level).

Then, the sign of the last coefficient is encoded (last_pos_sign).
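The values carried by SYN 21 can be sketched as follows. The table that maps the pair (levelMagnitudeGreaterThanOneFlag, last_pos) to “last_pos_table_idx” is not reproduced in this part of the description, so the sketch keeps the two components separate instead of inventing an index. The function name and the convention that the sign value is 1 for a negative coefficient are assumptions.

```python
def last_coefficient_symbols(last_pos, value):
    """Derive the values signalled for the last coefficient (illustrative only)."""
    level = abs(value)
    greater_than_one = 1 if level > 1 else 0
    symbols = {
        "last_pos": last_pos,
        "levelMagnitudeGreaterThanOneFlag": greater_than_one,
    }
    if greater_than_one:
        # The level is at least 2 here, so level - 2 is what gets encoded.
        symbols["last_pos_level"] = level - 2
    symbols["last_pos_sign"] = 1 if value < 0 else 0   # assumed sign convention
    return symbols


big = last_coefficient_symbols(6, -3)
one = last_coefficient_symbols(6, 1)
```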

Subsequently, in SYN 22, the run mode and the non-run mode are executed. More specifically, first, the loop of the run mode is executed in SYN 220 under a predetermined condition.

Here, “isLevelOne_run” of SYN 221 is a syntax element indicating a combination of a flag (isLevelOne) indicating whether or not the level of the next non-zero coefficient is 1 and the length (run) of the zero run. Also, in SYN 222, processing is skipped for the coefficients in the run of consecutive zeros.

Also, when “isLevelOne” is not “1”, the coefficient level is larger than 1, so the value obtained by subtracting 2 from the actual coefficient value (level_magnitude_minus2) and its sign (level_sign) are encoded.

On the other hand, when “isLevelOne” is “1”, the sign of the coefficient value (level_sign) is encoded.

The loop termination condition “i <= i_max” is explained as follows. When the target block is 8 × 8, the initial value of “i” is “63 − last_pos”. The value of “i_max” is “63 − last_pos + scanX (last_pos)” in the first stage, and “63” in the second stage.
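As a worked illustration of these loop bounds for an 8 × 8 block, the following sketch computes the initial value of i and i_max for each stage. The function name loop_bounds and the parameter scan_x_of_last, which stands for scanX (last_pos), are names chosen for this illustration.

```python
def loop_bounds(last_pos, stage, scan_x_of_last=0):
    """Return (initial i, i_max) for an 8x8 block.

    scan_x_of_last stands for scanX(last_pos) and is used in the first stage only.
    """
    i_start = 63 - last_pos
    if stage == 1:
        i_max = 63 - last_pos + scan_x_of_last
    else:
        i_max = 63
    return i_start, i_max


# First stage: last coefficient at scan position 6, i.e. column 3 of the first row.
first_stage = loop_bounds(last_pos=6, stage=1, scan_x_of_last=3)
# Second stage: last remaining coefficient at scan position 10.
second_stage = loop_bounds(last_pos=10, stage=2)
```

With these values the first-stage loop covers scanX (last_pos) + 1 values of i, matching the number of first-row positions from the DC coefficient to the last coefficient.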

Also, “runMode” becomes 0 when a predetermined condition is satisfied in the loop of SYN220. When the target block is 8 × 8, it is a condition that two coefficients of level 2 or higher appear or “i_max-i” is equal to or less than a threshold value.

The threshold value is for controlling the run mode to end when the remaining number of coefficients is n or less. This threshold value is determined according to block attributes such as whether the prediction mode is inter or intra, and values of luminance and color difference. For example, the threshold value n = 15.
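The condition under which “runMode” becomes 0 for an 8 × 8 block can be written as a small predicate. The function name and argument names are chosen for this sketch; the default threshold of 15 is the example value given above, and in practice it would depend on the block attributes.

```python
def run_mode_continues(levels_ge2_seen, i, i_max, threshold=15):
    """Return True while the run mode should continue for an 8x8 block.

    levels_ge2_seen: number of coefficients of level 2 or higher encoded so far.
    """
    two_large_levels = levels_ge2_seen >= 2
    few_remaining = (i_max - i) <= threshold
    return not (two_large_levels or few_remaining)


still_running = run_mode_continues(levels_ge2_seen=0, i=10, i_max=63)
stopped_by_levels = run_mode_continues(levels_ge2_seen=2, i=10, i_max=63)
stopped_by_count = run_mode_continues(levels_ge2_seen=0, i=50, i_max=63)
```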

When the run mode loop of SYN 220 is completed, the loop of non-run mode of SYN 230 is entered, and the remaining coefficients are encoded. The non-run mode includes a “level_magnitude” syntax element and a “level_sign” syntax element.

The syntax element “level_magnitude” is a coefficient level. When this level is not 0, “level_sign” that is a positive / negative sign is encoded.

As described above, at this level of detail, the syntax of the coefficient encoding process can be shared between the first stage and the second stage. In more detail, the first-stage and second-stage coefficient encoding processes differ in the scan order, the number of target coefficients, the VLC table to be used, the coefficient indicated by “last_pos_table_idx”, and the like.

[VLC table referenced by the batch coefficient encoding unit]
Hereinafter, a specific example of the zigzag scan VLC table TBL22 referred to by the collective coefficient encoding unit 342 in the moving image encoding device 2 will be described with reference to FIGS. 14, 15, and 16.

FIG. 14 shows a table defining the correspondence of code numbers (CodeNum) corresponding to combinations of “run” and “isLevelOne” when the maximum value of run is 5.

FIG. 15 shows an example of the zigzag scan VLC table TBL22 when the maximum value of the run is 5. The zigzag scan VLC table TBL22 shown in the figure is an example of the definition when the maximum value of run is “5”, P slice, and 8 × 8 conversion.

In the coefficient encoding process, the code number corresponding to the combination of “run” and “isLevelOne” is read from the table shown in FIG. 14, and the code corresponding to the read code number is then obtained by referring to the zigzag scan VLC table TBL22 shown in FIG.

In the run mode, the zigzag scan VLC table TBL22 is referred to based on the combination of “run” and “isLevelOne”. The zigzag scan VLC table TBL22 is determined based on the maximum value of “run” at the position of the coefficient to be encoded.

Here, the maximum value of “run” will be described with reference to FIG. In FIG. 16, the coefficient at the position of T1 (hereinafter referred to as coefficient T1) is the object of encoding.

At this time, encoding is performed in the order of solid line arrows shown in FIG. 16 after the coefficient T1. Encoding after the coefficient T1 is performed up to the DC coefficient DC. That is, the number of coefficients from the coefficient processed next to the coefficient T1 to the DC coefficient DC is the maximum value of run. In other words, the maximum value of run is determined from the number of unprocessed coefficients among the coefficients to be encoded. In the example shown in FIG. 16, the maximum value of run is “5”.

When the coefficient T2 at the lower left of the coefficient T1 is a non-zero coefficient, run is “0”, and when the DC coefficient DC is a non-zero coefficient that appears next to the coefficient T1, run is “4”. When run is “5”, it means that there is no non-zero coefficient after coefficient T1.
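
The relationship between the position of the coefficient being encoded and the maximum value of run can be sketched as follows. This is only an illustration under stated assumptions: it assumes the conventional 8 × 8 zigzag order, and it assumes that the coefficient T1 is located at (x, y) = (2, 0), which is consistent with the maximum run of 5 described above. The helper names zigzag_order, max_run, and run_to are hypothetical.

```python
def zigzag_order(n):
    """Return the (x, y) positions of an n x n block in zigzag scan order."""
    order = []
    for d in range(2 * n - 1):
        diag = [(x, d - x) for x in range(n) if 0 <= d - x < n]
        # Even diagonals run toward the first row, odd diagonals away from it.
        order.extend(diag if d % 2 == 0 else diag[::-1])
    return order


def max_run(order, pos):
    """Number of coefficients still unprocessed after pos in reverse scan order."""
    return order.index(pos)


def run_to(order, cur, nxt):
    """Number of coefficients skipped between cur and the next non-zero coefficient nxt."""
    return order.index(cur) - order.index(nxt) - 1


order = zigzag_order(8)
t1 = (2, 0)  # assumed position of coefficient T1
print(max_run(order, t1))         # maximum value of run
print(run_to(order, t1, (1, 1)))  # next non-zero coefficient is T2 (lower left)
print(run_to(order, t1, (0, 0)))  # next non-zero coefficient is the DC coefficient
```

Under these assumptions the three values are 5, 0, and 4, matching the description of FIG. 16.
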

The numbers shown at each position in FIG. 16 indicate the code numbers obtained when the next non-zero coefficient appears at that position. The code numbers shown in FIG. 14 are those when isLevelOne = true in FIG.

In the example of the zigzag scan VLC table TBL22 shown in FIG. 15, the larger the code number, the longer the assigned code.

However, in the same figure, a shorter run does not necessarily have a smaller code number. This is because the code number is determined based on the planar positional relationship between the coefficients.

The following is a more detailed description based on FIG. In the region to which the run mode is applied, “isLevelOne = true” is highly likely. Therefore, in the same figure, relatively short codes are assigned to the cases where “isLevelOne” is “true”, and relatively long codes are assigned to the cases where “isLevelOne” is “false”.

Further, in FIG. 15, since the coefficients adjacent to the left and lower left of the coefficient T1 are highly likely to be the next non-zero coefficient, relatively short codes are also assigned to them (see code numbers 1 and 2).

In the moving image encoding device 2, a plurality of zigzag scan VLC tables TBL22 as exemplified above are defined in accordance with the context, such as the maximum value of run and the attributes of the block. The batch coefficient encoding unit 342 refers to the zigzag scan VLC table TBL22 while switching it appropriately according to the context.

[VLC table referred to by first coefficient encoding unit]
Next, a specific example of the linear scan VLC table TBL21 referred to by the first coefficient encoding unit 332 in the moving image encoding device 2 will be described with reference to FIGS. 17 and 18.

FIG. 17 shows a table defining the correspondence of code numbers (CodeNum) corresponding to combinations of “run” and “isLevelOne” when the maximum value of run is 5.

FIG. 18 shows the linear scan VLC table TBL21 when the maximum value of run is 5.

The first coefficient encoding unit 332 acquires the corresponding code number by referring to the table shown in FIG. 17 based on the combination of “run” and “isLevelOne”. Then, based on the acquired code number, the first coefficient encoding unit 332 obtains the corresponding code by referring to the linear scan VLC table TBL21 shown in FIG. 18, and thereby encodes the coefficient.

Incidentally, since the region encoded by the first coefficient encoding unit 332 is a straight line, the linear scan VLC table TBL21, unlike the zigzag scan VLC table TBL22, need not be defined in consideration of the planar positional relationship.

Therefore, as shown in FIG. 18, in the linear scan VLC table TBL21, a code whose length is proportional to the length of run is preferably assigned. However, run = 5 means that no non-zero coefficient exists among the remaining coefficients to be processed, and since this event occurs frequently, a short code is assigned to it.
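
The two-step lookup (combination of “run” and “isLevelOne” to code number, then code number to code) might be sketched as follows. The actual contents of FIGS. 17 and 18 are not reproduced here: the mapping in code_number and the unary-style code in vlc_code are invented purely for illustration, and only mimic the properties stated above (code length grows with run, and run = 5 receives a short code).

```python
MAX_RUN = 5  # maximum value of run assumed in this example


def code_number(run, is_level_one):
    """Hypothetical (run, isLevelOne) -> CodeNum mapping for a linear scan."""
    if run == MAX_RUN:
        # No non-zero coefficient remains: a frequent event, so the smallest number.
        return 0
    # Shorter runs get smaller numbers; isLevelOne = true comes before false.
    return 1 + 2 * run + (0 if is_level_one else 1)


def vlc_code(num):
    """Hypothetical unary-style code: larger code numbers get longer codes."""
    return "0" * num + "1"


def encode(run, is_level_one):
    """Two-step lookup: code number first, then the code itself."""
    return vlc_code(code_number(run, is_level_one))


for run in range(MAX_RUN + 1):
    print(run, encode(run, True))
```

A real table would be tuned to measured symbol statistics; the point of the sketch is only the separation between the code number table and the VLC table.
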
[2] Embodiment 1-2
Another embodiment of the present invention will be described with reference to FIGS. 19 to 26 as follows. For convenience of explanation, members having the same functions as those in the drawings described in the embodiment 1-1 are denoted by the same reference numerals and description thereof is omitted.

[Image decoding device: TU information decoding unit]
First, the configuration of the TU information decoding unit 12A according to the present embodiment will be described with reference to FIG. 19. FIG. 19 is a functional block diagram illustrating another example of the configuration of the TU information decoding unit 12A.

The TU information decoding unit 12A determines the positions of the last coefficient and the second non-zero coefficient from the last (hereinafter referred to as the second coefficient from the last), and performs a two-stage coefficient decoding process based on this determination. Hereinafter, the last coefficient and the second coefficient from the last are collectively referred to as the last two coefficients.

As shown in FIG. 19, the TU information decoding unit 12A differs from the TU information decoding unit 12 shown in FIG. 1 in the following points. That is, in the TU information decoding unit 12A, the last coefficient decoding unit 121, the position determination unit 122, the two-stage decoding unit 123, and the collective decoding unit 124 are replaced with the last two coefficient decoding unit 125, the position determination unit 122A, the two-stage decoding unit 123A, and the collective decoding unit 124A, respectively.

The last two coefficient decoding unit 125 decodes the last coefficient and the second coefficient from the last from the coefficient encoded data C10.

Specifically, the last two coefficient decoding unit 125 first decodes the last coefficient position last_pos (= LastIdx_1a), the last coefficient level level, and the last coefficient sign sign, which are included in the coefficient encoded data C10. Further, the last two coefficient decoding unit 125 decodes the position LastIdx_1b, the level level, and the positive / negative sign sign of the second coefficient from the last.

The position determination unit 122A determines the position of the last coefficient and the position of the second coefficient from the last, both of which are decoded by the last two coefficient decoding unit 125.

First, using the scanX and scanY functions described above, the position determination unit 122A obtains the coordinates (LastX_1a, LastY_1a) of the last coefficient from the last coefficient position LastIdx_1a, and the coordinates (LastX_1b, LastY_1b) of the second coefficient from the last from the position LastIdx_1b of the second coefficient from the last.

Then, the position determination unit 122A calculates true / false of the following determination formula (2).

((LastX_1a == 0 && LastY_1a ≧ ThY₁) &&
 (LastX_1b == 0 && LastY_1b ≧ ThY₁)) ||
((LastX_1a ≧ ThX₁ && LastY_1a == 0) &&
 (LastX_1b ≧ ThX₁ && LastY_1b == 0))    (2)
The meanings of the operators and of ThX₁ and ThY₁ in the determination formula (2) are the same as those in the determination formula (1).

In the determination formula (2), the position of the second coefficient from the last is determined in addition to the position of the last coefficient. That is, it is determined whether or not the second coefficient from the last is also in the first row or the first column and is at least the threshold away from the position of the DC coefficient.
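
The determination formula (2) can be written as a small predicate. The following is a sketch rather than the embodiment's implementation; the function name use_two_stage is hypothetical, and the value ThY₁ = 3 used in the example calls is an assumption (only ThX₁ = 3 appears in the worked example of this embodiment).

```python
def use_two_stage(last_a, last_b, th_x, th_y):
    """Evaluate determination formula (2).

    last_a: (LastX_1a, LastY_1a), coordinates of the last coefficient.
    last_b: (LastX_1b, LastY_1b), coordinates of the second coefficient from the last.
    """
    xa, ya = last_a
    xb, yb = last_b
    # Both coefficients lie in the first column, at least th_y away from DC.
    both_in_first_column = (xa == 0 and ya >= th_y) and (xb == 0 and yb >= th_y)
    # Both coefficients lie in the first row, at least th_x away from DC.
    both_in_first_row = (xa >= th_x and ya == 0) and (xb >= th_x and yb == 0)
    return both_in_first_column or both_in_first_row


print(use_two_stage((6, 0), (5, 0), th_x=3, th_y=3))  # both in the first row
print(use_two_stage((6, 0), (4, 1), th_x=3, th_y=3))  # second one leaves the first row
```

The first call corresponds to the case where two-stage decoding is selected, and the second to the case where the coefficients are decoded collectively.
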

When the result of the determination formula (2) is “true”, the position determination unit 122A determines to perform a two-stage coefficient decoding process. In this case, the position determination unit 122A notifies the determination result to the two-stage decoding unit 123A.

On the other hand, when the result of the determination formula (2) is “false”, the position determination unit 122A determines to perform the coefficient decoding process collectively. In this case, the position determination unit 122A notifies the determination result to the collective decoding unit 124A.

The two-stage decoding unit 123A receives the notification from the position determination unit 122A and performs a two-stage coefficient decoding process. The function of the two-stage decoding unit 123A is the same as that of the two-stage decoding unit 123 except that the decoding range is from the coefficient next to the second coefficient from the last to the DC coefficient.

The collective decoding unit 124A receives the notification from the position determination unit 122A and performs the coefficient decoding process collectively. The function of the collective decoding unit 124A is the same as that of the collective decoding unit 124 except that the decoding range is from the coefficient next to the second coefficient from the last to the DC coefficient.

(Concrete example)
Next, a specific example of a two-stage coefficient decoding process will be described with reference to FIGS. 20 and 21. FIG. 20 is a diagram illustrating the coefficient encoding process or the coefficient decoding process for the last coefficient and the second coefficient from the last of the coefficient matrix MX10. FIG. 21A shows the first stage, and FIG. 21B shows the second stage.

Further, in FIG. 20, the dark shading indicates the last coefficient LT1. In FIG. 20 and FIG. 21A, light shading indicates a non-zero coefficient. In FIG. 21A, positions that have been processed (encoded / decoded) up to the first stage are indicated by hatching. In FIG. 21B, the positions that have been processed (encoded / decoded) in the first stage are indicated by hatching.

First, specific operations of the last two coefficient decoding unit 125 and the position determination unit 122A will be described with reference to FIG. 20.

As shown in FIG. 20, the last two coefficient decoding unit 125 decodes the last coefficient LT1 and the second coefficient LT2 from the last. Between the position of the last coefficient LT1 and the position of the second coefficient LT2 from the last, there is a run of length 11. In other words, the last two coefficient decoding unit 125 decodes the region R41 between the last coefficient LT1 and the second coefficient LT2 from the last shown in FIG. 20.

Then, the last two coefficient decoding unit 125 obtains the position of the last coefficient LT1 and the position of the second coefficient LT2 from the last. The coordinates (LastX_1a, LastY_1a) of the last coefficient LT1 are (6, 0). In addition, the coordinates (LastX_1b, LastY_1b) of the second coefficient LT2 from the last are (5, 0).
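
The conversion from scan positions to coordinates can be checked with a short sketch. It assumes the conventional 8 × 8 zigzag order; scan_x and scan_y below are hypothetical stand-ins for the scanX and scanY functions, whose exact definition is not reproduced here. The scan position of LT1 is taken as 27, the last_pos value that appears in the coefficient encoded data example of this embodiment, and the scan position of LT2 is derived from it and the run of length 11.

```python
def zigzag_order(n):
    """Return the (x, y) positions of an n x n block in zigzag scan order."""
    order = []
    for d in range(2 * n - 1):
        diag = [(x, d - x) for x in range(n) if 0 <= d - x < n]
        order.extend(diag if d % 2 == 0 else diag[::-1])
    return order


ORDER = zigzag_order(8)


def scan_x(idx):
    """x coordinate (column) of scan position idx."""
    return ORDER[idx][0]


def scan_y(idx):
    """y coordinate (row) of scan position idx."""
    return ORDER[idx][1]


last_idx_1a = 27                    # scan position of the last coefficient LT1
last_idx_1b = last_idx_1a - 11 - 1  # a run of 11 zeros lies between LT1 and LT2

print((scan_x(last_idx_1a), scan_y(last_idx_1a)))
print((scan_x(last_idx_1b), scan_y(last_idx_1b)))
```

Under the assumed scan order the two positions map to (6, 0) and (5, 0), in agreement with the coordinates given above.
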

Subsequently, the position determination unit 122A determines the position of the last coefficient LT1 and the position of the second coefficient LT2 from the last.

Here, since (LastX_1a, LastY_1a) = (6, 0), the last coefficient LT1 is included in the first row, and LastX_1a is larger than the threshold ThX₁ (= 3). Likewise, since (LastX_1b, LastY_1b) = (5, 0), the second coefficient LT2 from the last is included in the first row, and LastX_1b is larger than the threshold ThX₁ (= 3).

Therefore, since the determination formula (2) is “true”, the position determination unit 122A determines to perform a two-stage coefficient decoding process.

Next, a specific example of the two-stage coefficient decoding process of the two-stage decoding unit 123A will be described with reference to FIG. 21. FIG. 21A shows the first-stage coefficient decoding process and coefficient encoding process, and FIG. 21B shows the second-stage coefficient decoding process and coefficient encoding process.

As shown in FIG. 21A, the two-stage decoding unit 123A sets the first target range R30 in the first row as the decoding target in the first-stage coefficient decoding process. In practice, as shown in FIG. 21A, the two-stage decoding unit 123A decodes the positions in the first target range R30 where coefficients to be decoded actually exist, that is, the region R31 from the coefficient CF10 next to the second coefficient LT2 from the last to the DC coefficient.

Subsequently, as illustrated in FIG. 21B, the two-stage decoding unit 123A sets the second target range R40 other than the first row as the decoding target in the second-stage coefficient decoding process. In practice, as shown in FIG. 21B, the two-stage decoding unit 123A sets the region R42, which contains the remaining coefficients that have not yet been decoded, as the decoding target. Specifically, the two-stage decoding unit 123A decodes from the last coefficient K2 in the region R42 to the first coefficient ST in the scan order.
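
The division of the coded positions into the regions R41, R31, and R42 can be illustrated as follows. This is a sketch under assumptions: the conventional 8 × 8 zigzag order is assumed, LT1 and LT2 are placed at (6, 0) and (5, 0) as in the example, and the variable names are hypothetical. The exact positions drawn in FIG. 21 are not reproduced here.

```python
def zigzag_order(n):
    """Return the (x, y) positions of an n x n block in zigzag scan order."""
    order = []
    for d in range(2 * n - 1):
        diag = [(x, d - x) for x in range(n) if 0 <= d - x < n]
        order.extend(diag if d % 2 == 0 else diag[::-1])
    return order


order = zigzag_order(8)
index = {pos: i for i, pos in enumerate(order)}
lt1, lt2 = (6, 0), (5, 0)

# R41: positions strictly between LT2 and LT1 in scan order (the zero run).
r41 = [p for p in order if index[lt2] < index[p] < index[lt1]]
# R31: first-row positions from CF10 (next to LT2) down to the DC coefficient.
r31 = [(x, 0) for x in range(lt2[0] - 1, -1, -1)]
# R42: positions before LT2 in scan order that are not in the first row,
# listed in reverse scan order (from K2 toward ST).
r42 = [p for p in reversed(order) if index[p] < index[lt2] and p[1] != 0]

print(len(r41), r31, len(r42))
```

Together with LT1 and LT2 themselves, the three regions cover every scan position up to and including the last coefficient, so no position is decoded twice or skipped.
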

(Process flow)
Next, the flow of coefficient decoding processing in the TU information decoding unit 12A will be described using FIG. 22. FIG. 22 is a flowchart showing the flow of coefficient decoding processing in the TU information decoding unit 12A.

First, the last two coefficient decoding unit 125 decodes the last coefficient LT1 and the second coefficient LT2 from the last (S11A).

Next, the position determination unit 122A determines whether or not the last coefficient LT1 and the second coefficient LT2 from the last satisfy the predetermined condition (determination formula (2)) (S12A).

When the last coefficient LT1 and the second coefficient LT2 from the last do not satisfy the predetermined condition (NO in S12A), the collective decoding unit 124A performs the coefficient decoding process on the remaining coefficients in a lump (S13).

On the other hand, if the last coefficient LT1 and the second coefficient LT2 from the last satisfy the predetermined condition (YES in S12A), the two-stage decoding unit 123A performs the two-stage coefficient decoding process as follows.

That is, first, the first coefficient decoding unit 231 decodes the coefficient from the coefficient CF10 next to the second coefficient from the last to the DC coefficient (S14A). In other words, the first coefficient decoding unit 231 decodes the coefficient included in the first row or the first column in the coefficient matrix.

Next, the first reverse scanning unit 232 performs reverse scanning on the coefficients decoded in S14A (S15A). Thereby, the first row or the first column of the coefficient matrix is reproduced.

When the second stage flag is “1” (YES in S17), the second-stage coefficient decoding process is executed. That is, the second coefficient decoding unit 234 decodes the remaining coefficients (S18A). Here, the remaining coefficients are the coefficients in the region R42, that is, the region in which coefficients are actually encoded, excluding the region R41 between the last coefficient LT1 and the second coefficient LT2 from the last and the region from the coefficient CF10 next to the second coefficient LT2 from the last to the DC coefficient. The region R42 is the region from the coefficient K2 to the coefficient ST in the reverse scan order.

Then, the second inverse scanning unit 235 performs inverse scanning on the coefficients decoded in S18A, thereby reproducing the coefficient matrix for the remaining coefficients (S19A).

[Moving picture encoding device: TU information encoding unit]
Next, the configuration of the TU information encoding unit 31A according to the present embodiment will be described using FIG. 23. FIG. 23 is a functional block diagram illustrating another example of the configuration of the TU information encoding unit 31A.

The TU information encoding unit 31A determines the positions of the last coefficient and the second coefficient from the last, and performs a two-stage coefficient encoding process based on this determination.

As shown in FIG. 23, the TU information encoding unit 31A differs from the TU information encoding unit 31 illustrated in FIG. 9 in the following points. That is, in the TU information encoding unit 31A, the last coefficient detection unit 310, the position determination unit 320, and the two-stage encoding unit 330 are replaced with the second to last coefficient detection unit 312, the position determination unit 320A, and the two-stage encoding unit 330A, respectively.

The second to last coefficient detection unit 312 detects and encodes the position of the last coefficient and the second coefficient from the last. Further, the second to last coefficient detection unit 312 notifies the position determination unit 320A of the detected last coefficient and the position of the second coefficient from the last.

The position determination unit 320A determines whether or not the position of the last coefficient and the position of the second coefficient from the last, both detected by the second to last coefficient detection unit 312, satisfy a predetermined condition. Specifically, the position determination unit 320A evaluates whether the above-described determination formula (2) is true or false.

When the result of the determination formula (2) is “true”, the position determination unit 320A determines to perform a two-stage coefficient encoding process. In this case, the position determination unit 320A notifies the determination result to the two-stage encoding unit 330A.

On the other hand, when the result of the determination formula (2) is “false”, the position determination unit 320A determines to perform the coefficient encoding process collectively. In this case, the position determination unit 320A notifies the determination result to the batch encoding unit 340.

The two-stage encoding unit 330A receives the notification from the position determination unit 320A and performs a two-stage coefficient encoding process. The function of the two-stage encoding unit 330A is the same as that of the two-stage encoding unit 330 except that the encoding range is from the coefficient next to the second coefficient from the last to the DC coefficient.

(Concrete example)
Next, a specific example of a two-stage coefficient encoding process will be described with reference to FIGS. 20 and 21 again.

First, the specific operations of the second to last coefficient detection unit 312 and the position determination unit 320A will be described with reference to FIG. 20.

As shown in FIG. 20, the second to last coefficient detection unit 312 detects the position of the last coefficient LT1 and the position of the second coefficient LT2 from the last. For example, the second to last coefficient detection unit 312 performs a reverse zigzag scan on the coefficient matrix to detect the position of the last coefficient LT1. The second to last coefficient detection unit 312 further detects the second coefficient LT2 from the last by continuing the reverse zigzag scan from the last coefficient LT1 over the region R41.

The second to last coefficient detection unit 312 obtains the position of the last coefficient LT1 and the position of the second coefficient LT2 from the last. As described above, the coordinates (LastX_1a, LastY_1a) of the last coefficient LT1 are (6, 0). In addition, the coordinates (LastX_1b, LastY_1b) of the second coefficient LT2 from the last are (5, 0).

Also, the second to last coefficient detection unit 312 encodes the last coefficient position last_pos, the level level, and the sign sign of the last coefficient LT1. In addition, the second to last coefficient detection unit 312 encodes the zero run run, the level level, and the positive / negative sign sign of the second coefficient LT2 from the last.

Subsequently, the position determination unit 320A determines the position of the last coefficient LT1 and the position of the second coefficient LT2 from the last.

As described above, since the determination formula (2) is “true”, the position determination unit 320A determines to perform a two-stage coefficient encoding process.

Next, a specific example of the two-stage coefficient encoding process in the two-stage encoding unit 330A will be described with reference to FIG. 21.

As shown in FIG. 21A, the two-stage encoding unit 330A sets the first target range R30 as the encoding target in the first-stage coefficient encoding process. In practice, as shown in FIG. 21A, the two-stage encoding unit 330A encodes the positions in the first target range R30 where coefficients to be encoded actually exist, that is, the region R31 from the coefficient CF10 next to the second coefficient LT2 from the last to the DC coefficient.

Subsequently, as illustrated in FIG. 21B, the two-stage encoding unit 330A sets the second target range R40 as the encoding target in the second-stage coefficient encoding process. In practice, as shown in FIG. 21B, the two-stage encoding unit 330A sets the region R42, which contains the remaining coefficients that have not yet been encoded, as the encoding target. Specifically, the two-stage encoding unit 330A encodes from the last coefficient K2 in the region R42 to the first coefficient ST in the scan order.

[Example of coefficient coded data]
In the example illustrated in FIGS. 20 and 21, the TU information encoding unit 31A outputs, for example, the following coefficient encoded data.

last_pos = 27, level = 1, sign = 0
run = 11, level = 2, sign = 0
run = 1, level = 1, sign = 1
...
In the example of the coefficient encoded data, the first two rows are output by the second to last coefficient detection unit 312. The third row is output by the first-stage coefficient encoding process in the two-stage encoding unit 330A.

[Reference example]
For reference, coefficient encoded data that will be output when the above-described TU information encoding unit 31 is given a coefficient matrix as shown in FIG. 20 is as follows.

last_pos = 27, level = 1, sign = 0
run = 0, level = 2, sign = 0
run = 1, level = 1, sign = 1
...
In the example of the coefficient encoded data, the first row is output by the last coefficient detection unit 310. The second and third rows are output by the first-stage coefficient encoding process in the two-stage encoding unit 330.
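
The difference between the two outputs comes from where the linear scan along the first row starts. The following sketch run-level encodes a first row toward the DC coefficient. The row contents are hypothetical values chosen to be consistent with the symbols listed above (the positions x = 0 to 2 are assumed to be zero, since the listing is truncated), sign = 1 is assumed to denote a negative coefficient, and encode_row is a hypothetical helper.

```python
def encode_row(row, start_x):
    """Run-level encode row[start_x], row[start_x - 1], ..., row[0] (toward DC).

    Returns (run, level, sign) tuples, plus a final (run,) tuple if only
    zero coefficients remain at the end.
    """
    symbols, run = [], 0
    for x in range(start_x, -1, -1):
        value = row[x]
        if value == 0:
            run += 1
        else:
            symbols.append((run, abs(value), 1 if value < 0 else 0))
            run = 0
    if run:
        symbols.append((run,))
    return symbols


# Hypothetical first row: LT1 = 1 at x = 6, LT2 = 2 at x = 5, and -1 at x = 3.
row = [0, 0, 0, -1, 0, 2, 1, 0]

# Reference example: the linear scan starts next to the last coefficient LT1.
print(encode_row(row, 5))
# Embodiment 1-2: LT2 is already coded, so the scan starts at CF10 (x = 4).
print(encode_row(row, 4))
```

In the reference example the coefficient LT2 is reached with run = 0 along the row, whereas in Embodiment 1-2 it is coded beforehand with the zigzag run of 11, so the linear scan begins one symbol later.
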

(Process flow)
Next, the flow of coefficient encoding processing in the TU information encoding unit 31A will be described using FIG. 24. FIG. 24 is a flowchart showing the flow of coefficient encoding processing in the TU information encoding unit 31A.

First, the second to last coefficient detection unit 312 detects and encodes the last coefficient LT1 in the zigzag scan order and the second coefficient LT2 from the last (S21A).

Next, the position determination unit 320A determines whether or not the last coefficient LT1 and the second coefficient LT2 from the last satisfy the predetermined condition (determination formula (2)) (S22A).

If the last coefficient LT1 and the second coefficient LT2 from the last do not satisfy the predetermined condition (NO in S22A), the batch encoding unit 340 performs the coefficient encoding process on the remaining coefficients collectively (S23).

On the other hand, when the last coefficient LT1 and the second coefficient LT2 from the last satisfy the predetermined condition (YES in S22A), the two-stage encoding unit 330A performs a two-stage coefficient encoding process as follows.

That is, first, the first scanning unit 331 scans the coefficient from the coefficient CF10 next to the second coefficient from the last to the DC coefficient, and stores the coefficient in the one-dimensional array ARR1 (S24A).

Next, the first coefficient encoding unit 332 encodes the one-dimensional array ARR1 obtained in S24A from the coefficient CF10 to the DC coefficient (S25A). The first coefficient encoding unit 332 outputs first coefficient encoded data C11 generated by encoding.

Next, the second scanning unit 333 scans the remaining coefficients (S26A). That is, the second scanning unit 333 scans from the coefficient K2 to the coefficient ST in the region R42. As a result, the coefficients in the region R42 are stored in the one-dimensional array ARR2.

Here, when there is a remaining coefficient (YES in S27), the flag setting unit 334 sets “1 (with remaining coefficient)” to the second stage flag F1 (S29), and the second coefficient encoding unit 335 encodes the remaining coefficients in the region R42 (S30A).

Then, the second coefficient encoding unit 335 outputs the second coefficient encoded data C12 generated by encoding, and the process ends.

(Example)
Hereinafter, a detailed example of the TU information encoding unit 31A will be further described.

[Syntax]
FIG. 25 and FIG. 26 show an example of the syntax of the coefficient encoding process in the TU information encoding unit 31A. It should be noted that switching between performing the coefficient encoding process all at once or in two stages is performed in the upper syntax. In other words, the determination of the position of the last coefficient LT is performed using the upper syntax.

The syntax shown in FIG. 25 will be described as follows. When “residual_block_vlc_2” on the first line is called, a two-stage coefficient encoding process is started.

In “residual_block_vlc_2a” of SYN31, the first-stage coefficient encoding process is called. “more_coeff_flag” of SYN32 is a syntax element indicating whether or not to perform the second-stage coefficient encoding process.

When “more_coeff_flag” is “1”, the second-stage coefficient encoding process is called by “residual_block_vlc_1a” in SYN33. SYN33 is the same as SYN13 shown in FIG.
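
The control flow of this syntax can be summarized in a short sketch. It does not reproduce FIG. 25: the function below only lists which syntax structures would be invoked, it assumes that “more_coeff_flag” is set to 1 exactly when a non-zero coefficient remains for the second stage, and the argument names are hypothetical.

```python
def residual_block_vlc_2(first_stage_coeffs, second_stage_coeffs):
    """List the syntax structures invoked for a two-stage coded block."""
    # SYN31: first-stage coefficient coding.
    calls = [("residual_block_vlc_2a", first_stage_coeffs)]
    # SYN32: flag indicating whether the second stage is performed (assumed rule).
    more_coeff_flag = 1 if any(v != 0 for v in second_stage_coeffs) else 0
    calls.append(("more_coeff_flag", more_coeff_flag))
    if more_coeff_flag == 1:
        # SYN33: second-stage coefficient coding.
        calls.append(("residual_block_vlc_1a", second_stage_coeffs))
    return calls


print(residual_block_vlc_2([0, -1, 0, 0, 0], [0, 0, 0]))
print(residual_block_vlc_2([0, -1, 0, 0, 0], [0, 1, 0]))
```

When no non-zero coefficient remains, the block ends after the flag, so the cost of the second stage is a single syntax element.
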

The syntax shown in FIG. 26 indicates the coefficient encoding process in “residual_block_vlc_2a”. As shown in FIG. 26, in the SYN 41, the last coefficient is encoded. More specifically, first, “last_pos_table_idx” is encoded. Since “last_pos_table_idx” has already been described with reference to FIG. 13, the description thereof is omitted here.

Then, the sign of the last coefficient is encoded (last_pos_sign).

In the subsequent SYN 42, the second coefficient from the end is encoded. For the second coefficient from the end, zigzag scan and run mode are used. Details are as follows.

Here, SYN 421 includes the syntax element “isLevelOne_run” described with reference to FIG. Also, in SYN 422, the positions corresponding to the run of consecutive zeros are skipped.

Also, in SYN 423, when “isLevelOne” is not “1”, the coefficient level is larger than 1, so the value obtained by subtracting 2 from the actual coefficient value (level_magnitude_minus2) and its sign (level_sign) are encoded.

On the other hand, when “isLevelOne” is “1” in SYN 423, the sign of the coefficient value (level_sign) is encoded.
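
The level coding in SYN 423 can be sketched as a pair of functions. This is an illustration only: in the actual syntax “isLevelOne” is coded jointly with the run as “isLevelOne_run”, which is not modeled here, level_sign = 1 is assumed to denote a negative coefficient, and the function names are hypothetical.

```python
def level_syntax(value):
    """Syntax elements describing the level of a non-zero coefficient."""
    if value == 0:
        raise ValueError("only non-zero coefficients carry a level")
    sign = 1 if value < 0 else 0  # assumed convention: 1 means negative
    if abs(value) == 1:
        # Magnitude is implied by isLevelOne; only the sign is coded.
        return {"isLevelOne": 1, "level_sign": sign}
    # Magnitude is at least 2, so it is coded as (magnitude - 2).
    return {"isLevelOne": 0,
            "level_magnitude_minus2": abs(value) - 2,
            "level_sign": sign}


def level_value(elems):
    """Inverse of level_syntax: rebuild the coefficient value."""
    if elems["isLevelOne"]:
        magnitude = 1
    else:
        magnitude = elems["level_magnitude_minus2"] + 2
    return -magnitude if elems["level_sign"] else magnitude


for v in (1, -1, 2, -5):
    print(v, level_syntax(v))
```

Because a magnitude of 1 needs no explicit magnitude element, the most frequent case in the run mode region costs only the sign.
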

Subsequently, in SYN 43, a linear scan is executed. This is the same as the case where SYN 22 shown in FIG. 13 is executed in the first-stage coefficient encoding process, and a description thereof will be omitted.

[3] Modifications
In the following, preferred modifications of the TU information decoding unit 12 and the TU information decoding unit 12A described for the moving image decoding device 1, and of the TU information encoding unit 31 and the TU information encoding unit 31A described for the moving image encoding device 2, will be described.

[3-1] Omission of the second-stage flag
Even when the two-stage coefficient encoding process is performed, depending on the characteristics of the image, there are many cases in which coefficients are included in rows or columns other than the first row or the first column. In such cases, in the coefficient encoding process, the second stage flag may be omitted instead of being encoded each time. Further, a flag indicating whether or not the second stage flag is omitted may be encoded for each processing unit such as a sequence, a picture, a slice, or an LCU, and the processing may be switched as appropriate.

Also, in this modification, a configuration example is shown that allows appropriate processing whether or not a last coefficient exists in the second stage. That is, in the present modification, the last coefficient in the second stage is not encoded using last_pos in the second coefficient encoded data. Furthermore, the last coefficient in the second stage does not exist if there is no non-zero coefficient in the region to be encoded in the second stage.

Hereinafter, two modified examples will be described with reference to FIG. 27. FIG. 27 is a diagram illustrating the coefficient encoding process and the coefficient decoding process when the second stage flag is omitted.

In FIG. 27, in the first stage, the first target range R50 that is the first row is the processing range. More specifically, in actuality, in the first stage, a linear region R51 from the coefficient next to the last coefficient LT to the DC coefficient DC is to be processed.

Further, in FIG. 27, in the second stage, the range R60 other than the first row is the processing range. More specifically, in the second stage, the coefficient encoding process is started from the last coefficient K3 in the zigzag scan in the range R60. The coefficient K3 may be a non-zero coefficient or a zero coefficient. Then, the coefficient encoding process is executed up to the first coefficient ST in the range R60. Accordingly, the region R61 is actually processed.

In the second stage coefficient encoding process, the process starts from the run mode.

[First Modification]
In the first modification, a case where a run is divided between the first stage and the second stage will be described. In this case, the TU information encoding unit 31 encodes the coefficient matrix MX10 illustrated in FIG. 27 as follows.

First, the TU information encoding unit 31 encodes the last coefficient LT as follows.

last_pos = 15, level = 1, sign = 1
Subsequently, as shown in FIG. 27, even when none of the remaining coefficients in the region R51 is a non-zero coefficient, the run is terminated at the end of the region R51. The TU information encoding unit 31 outputs the following encoded data.

run = 5
In the second stage, the run is reset, and the coefficient encoding process is continued from the coefficient K3 to the first coefficient ST in the region R61. The TU information encoding unit 31 outputs the following encoded data.

run = 2, level = 1, sign = 1
run = 1, level = 1, sign = 0
run = 5
This completes the coefficient encoding process.

[Second Modification]
In the second modification, a case where the run is continued between the first stage and the second stage will be described. In this case, the TU information encoding unit 31 encodes the coefficient matrix MX10 illustrated in FIG. 27 as follows.

First, the TU information encoding unit 31 encodes the last coefficient LT as follows. This is the same as in the case of the first modification.

last_pos = 15, level = 1, sign = 1
Subsequently, since no non-zero coefficient appears until the region R61, the run length in the region R51 and the run length in the region R61 are added together and encoded as follows.

run = 7, level = 1, sign = 1
run = 1, level = 1, sign = 0
run = 5
In this case, the row of run = 7 is encoded as the first stage, and the remaining two rows are encoded as the second stage.

This completes the coefficient encoding process.
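
The two modifications differ only in whether the run counter is reset at the boundary between the stages. The following sketch reproduces both outputs. The coefficient lists are hypothetical values chosen so that the symbols match the encoded data listed above (five zeros in the region R51 and ten coefficients in the region R61, both in processing order), sign = 1 is assumed to denote a negative coefficient, and the function names are hypothetical.

```python
def run_level(coeffs):
    """Run-level encode coeffs in processing order.

    Returns (run, level, sign) tuples, plus a final (run,) tuple if the
    sequence ends with zero coefficients.
    """
    symbols, run = [], 0
    for value in coeffs:
        if value == 0:
            run += 1
        else:
            symbols.append((run, abs(value), 1 if value < 0 else 0))
            run = 0
    if run:
        symbols.append((run,))
    return symbols


def first_modification(stage1, stage2):
    """The run is terminated at the end of stage 1 and reset for stage 2."""
    return run_level(stage1) + run_level(stage2)


def second_modification(stage1, stage2):
    """The run continues across the boundary between the two stages."""
    return run_level(stage1 + stage2)


r51 = [0, 0, 0, 0, 0]                  # from the coefficient next to LT to DC
r61 = [0, 0, -1, 0, 1, 0, 0, 0, 0, 0]  # from K3 to ST (hypothetical values)

print(first_modification(r51, r61))
print(second_modification(r51, r61))
```

The first modification spends one extra run symbol at the stage boundary, while the second merges the runs 5 and 2 into a single run of 7.
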

(Processing flow of TU information encoding unit)
Next, the flow of coefficient encoding processing in the TU information encoding unit 31 according to this modification will be described with reference to FIG. 28. FIG. 28 is a flowchart showing the flow of the coefficient encoding process in the TU information encoding unit 31 according to this modification.

Since the processing of S21 to S23 is as described with reference to FIG. 11, the description thereof is omitted. The processing from S24B onward is as follows.

When the position of the last coefficient LT satisfies a predetermined condition (YES in S22), the two-stage encoding unit 330 performs a two-stage coefficient encoding process as follows.

That is, first, the first scanning unit 331 scans the coefficients after the last coefficient LT until the DC coefficient is passed, and stores them in the one-dimensional array ARR1 (S24B). That is, if the DC coefficient DC in the region R51 is a zero coefficient, the first scanning unit 331 continues by scanning the coefficients in the region R61. If the DC coefficient DC is a non-zero coefficient, the first-stage coefficient encoding process stops there.

Next, the first coefficient encoding unit 332 encodes the coefficients of the one-dimensional array ARR1 obtained in S24B, from the coefficient next to the last coefficient LT until the DC coefficient is passed (S25B). The first coefficient encoding unit 332 outputs first coefficient encoded data C11 generated by encoding.

Next, the second scanning unit 333 scans the remaining coefficients in the region R61 (S26B). That is, the second scanning unit 333 starts scanning from the coefficient next to the coefficient scanned last in S24B. As a result, the coefficients in the region Q1 are stored in the one-dimensional array ARR2.

Next, the second coefficient encoding unit 335 encodes the remaining coefficients in the region R61 (S30B). The second coefficient encoding unit 335 outputs the second coefficient encoded data C12 generated by encoding, and the process ends.
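
The division of the coefficients between the two stages in S24B to S30B can be sketched as follows. This is only an illustration under stated assumptions: the coefficients are given as plain integer lists already arranged in scan order, the first stage is assumed to end at (and include) the first non-zero coefficient found after passing a zero DC coefficient, and the name split_two_stage is hypothetical.

```python
def split_two_stage(first_row_after_last, rest_in_scan_order):
    """Split coefficients into ARR1 (first stage) and ARR2 (second stage).

    first_row_after_last: coefficients from the one next to the last
        coefficient LT up to and including the DC coefficient (final element).
    rest_in_scan_order: coefficients of the remaining region in its scan order.
    """
    arr1 = list(first_row_after_last)
    rest = list(rest_in_scan_order)
    consumed = 0
    # If DC is zero, the open run of zeros continues into the remaining region.
    # Assumption: the first stage extends up to and including the first
    # non-zero coefficient found there.
    if arr1 and arr1[-1] == 0:
        for value in rest:
            arr1.append(value)
            consumed += 1
            if value != 0:
                break
    # The second stage starts right after the coefficient scanned last above.
    arr2 = rest[consumed:]
    return arr1, arr2


# DC (the final element of the first list) is zero, so stage 1 continues.
arr1, arr2 = split_two_stage([0, 3, 0, 0], [0, 0, 2, 1, 0])
print(arr1, arr2)
```

When the DC coefficient is non-zero, the first list is returned unchanged as ARR1 and the whole remaining region becomes ARR2.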

(Processing flow of TU information decoding unit)
Next, the flow of the coefficient decoding process in the TU information decoding unit 12 according to this modification will be described with reference to FIG. 29. FIG. 29 is a flowchart showing the flow of the coefficient decoding process in the TU information decoding unit 12 according to this modification.

Since S11 to S13 are as described with reference to FIG. 7, their description is omitted here. The steps from S14B onward are as follows.

When the position of the last coefficient LT satisfies a predetermined condition (YES in S12), the two-stage decoding unit 123 performs a two-stage coefficient decoding process as follows.

That is, first, the first coefficient decoding unit 231 decodes a coefficient from the coefficient next to the last coefficient LT until the DC coefficient is passed (S14B).

Next, the first reverse scanning unit 232 performs reverse scanning on the coefficients decoded in S14B (S15B). That is, the first reverse scanning unit 232 performs reverse scanning from the coefficient next to the last coefficient LT until the DC coefficient is passed.

Next, the second-stage coefficient decoding process is executed. That is, the second coefficient decoding unit 234 decodes the remaining coefficients (S18B). Then, the second inverse scanning unit 235 performs inverse scanning on the coefficients decoded in S18B, thereby reproducing the coefficient matrix for the remaining coefficients (S19B).

(Syntax)
FIGS. 30 and 31 show an example of the syntax of the coefficient encoding process in the TU information encoding unit 31 according to this modification. Note that switching between performing the coefficient encoding process all at once and performing it in two stages is done in the higher-level syntax. In other words, the position of the last coefficient LT is determined in the higher-level syntax.

The syntax shown in FIG. 30 will be described as follows. "Residual_block_vlc_1a" on the first line is called to start a two-stage coefficient encoding process.

First, in the SYN 51, the last coefficient is encoded. SYN51 is the same as SYN21 shown in FIG.

Subsequently, in “residual_block_vlc_3a” of SYN 52 and SYN 53, the first-stage coefficient encoding process and the second-stage coefficient encoding process are called, respectively.

The syntax shown in FIG. 31 indicates the coefficient encoding process in “residual_block_vlc_3a”. As shown in FIG. 31, in SYN 61, the run mode and the non-run mode are called in order. SYN61 is the same as SYN22 shown in FIG.

As described above, at this level of detail, the syntax of the coefficient encoding process can be shared between the first stage and the second stage.

[3-2] Expansion of First Stage Encoding Target Area
Hereinafter, a modified example that expands the first-stage encoding target area will be described with reference to FIG. That is, the position determination unit 122 of the TU information decoding unit 12 and the position determination unit 320 of the TU information encoding unit 31 may determine whether or not the position of the last coefficient is within the range up to the second row or the second column. The following description takes the position determination unit 320 as an example.

Specifically, the position determination unit 320 calculates true / false of the following determination formula (3).

(LastX ≦ 1 && LastY ≧ ThY) ||
(LastX ≧ ThX && LastY ≦ 1)   (3)
When the result of the determination formula (3) is “true”, the position determination unit 320 determines that the two-stage coefficient encoding process is to be performed. In this case, the position determination unit 320 notifies the two-stage encoding unit 330 of the determination result.

This will be described in more detail with reference to FIG. 32. In the example shown in FIG. 32, the last coefficient LT is in the sixth column of the second row. Therefore, in the example shown in FIG. 32, the determination formula (3) is “true”.
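
The determination formula (3) can be written out as a small function. This is a minimal sketch that assumes LastX and LastY are zero-based column and row indices (so that “≦ 1” covers the first two columns or rows); the function name is_two_stage and the threshold values used in the example are hypothetical and are not taken from the embodiment.

```python
def is_two_stage(last_x: int, last_y: int, th_x: int, th_y: int) -> bool:
    """Evaluate determination formula (3).

    last_x, last_y: zero-based column and row of the last coefficient.
    th_x, th_y: thresholds ThX and ThY.
    """
    return (last_x <= 1 and last_y >= th_y) or (last_x >= th_x and last_y <= 1)


# Last coefficient in the sixth column of the second row: x = 5, y = 1.
# th_x = th_y = 2 are illustrative thresholds only.
print(is_two_stage(5, 1, th_x=2, th_y=2))
```

With these illustrative thresholds the second clause of the formula holds, which corresponds to a horizontally biased distribution.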

At this time, the first target range R70 to be subjected to the first-stage coefficient decoding process is an area of 2 rows and 8 columns, and is not a linear area as described above. Therefore, the scan order in the first-stage coefficient decoding process may be changed to a flat one as shown by the solid line arrow in FIG.

Also, the scan order in the second target range R80, which is the target of the second-stage coefficient decoding process, is the same as when the coefficient decoding process is performed in a lump. That is, in the copy process from the coefficient matrix MX12 to the coefficient matrix MX13 shown in FIG. 5, the coefficient matrix MX12 may be copied to the coefficient matrix MX13 with the first and second rows left empty.

The same applies to the position determination unit 122. The same applies to the position determination unit 122A of the TU information decoding unit 12A and the position determination unit 320A of the TU information encoding unit 31A. That is, it may be determined whether or not the position of the second coefficient from the end is up to the second row or the second column.

According to the configuration of the present modification, since the first-stage encoding target area is expanded, whether or not the coefficients are biased in the horizontal or vertical direction can be determined with a more relaxed criterion. As a result, non-zero coefficients biased toward the second row or the second column of the coefficient matrix can be encoded efficiently, and the opportunities for improving the coding efficiency increase.

Further, the rows or columns that can be used for the determination are not limited to the second row or the second column, and the number of rows or columns can be further increased. Alternatively, it may be combined with only a part of a row or column, for example, only the entire first column and the left half of the second column.

Further, the first-stage encoding target region may be expanded when the last coefficient is in the first row or the first column and the coefficient next to the last coefficient is in the second row or the second column, or when the last coefficient is in the second row or the second column and the coefficient next to the last coefficient is in the first row or the first column. These conditions may be combined as appropriate with the determination conditions described in the above embodiments. Combining them appropriately increases the opportunities for improving the coding efficiency.

Note that a criterion other than the position may be used as the determination condition. For example, when the level of the last coefficient is larger than 1, even if the last coefficient is located in the first row or the first column, the two-stage encoding may not be performed.

[3-3] Restriction on Number of Coding Coefficients
A modification example that restricts the number of coefficients to be coded will be described below with reference to FIGS. 33 and 34. The TU information encoding unit 31 or the TU information decoding unit 12 may limit the coefficients subjected to the coefficient encoding process or the coefficient decoding process when the transform size is large, that is, when the size of the coefficient matrix is large. The following description takes the TU information encoding unit 31 as an example.

First, a modified example in which the TU information encoding unit 31 limits the number of coefficients encoded for an intra 16 × 16 coefficient matrix to 64 in the scan order will be described with reference to FIG.

As shown in FIG. 33, the TU information encoding unit 31 first detects the 64 coefficients from the first to the 64th in the scan order. Then, the TU information encoding unit 31 performs the two-stage coefficient encoding process on the detected 64 coefficients.

That is, as illustrated in FIG. 33, in the first stage coefficient encoding process, the TU information encoding unit 31 targets the region R31 in which the detected coefficient is present in the first target range R30. In the second stage coefficient encoding process, the TU information encoding unit 31 targets the region R41 in which the detected coefficient exists in the second target range R40.

Next, a modification example in which the TU information encoding unit 31 limits the number of coefficients encoded for an inter 16 × 16 coefficient matrix to 64 will be described with reference to FIG.

As illustrated in FIG. 34, the TU information encoding unit 31 first detects 64 coefficients included in an 8 × 8 size with a DC coefficient as a base point. Then, the TU information encoding unit 31 performs a two-stage coefficient encoding process on the detected 64 coefficients.

That is, as illustrated in FIG. 34, in the first-stage coefficient encoding process, the TU information encoding unit 31 targets the region R51, which is the part of the first target range R50 in which the detected coefficients are present. In the second-stage coefficient encoding process, the TU information encoding unit 31 targets the region R61, which is the part of the second target range R60 (the rows other than the first row) in which the detected coefficients are present.
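
The two ways of restricting the coefficients described above can be sketched as follows. This is a minimal illustration, not the syntax of the embodiment: it assumes that the scan order is a zigzag scan starting at the DC coefficient, and the names zigzag_order, first_n_in_scan_order, and upper_left_region are hypothetical.

```python
def zigzag_order(n):
    """Return the (row, col) positions of an n x n matrix in zigzag scan order."""
    order = []
    for s in range(2 * n - 1):
        # Positions on the anti-diagonal row + col == s, top-right to bottom-left.
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Reverse the direction on every other anti-diagonal.
        order.extend(diag if s % 2 else reversed(diag))
    return order


def first_n_in_scan_order(matrix, count=64):
    """Keep only the first `count` coefficients in scan order (intra example)."""
    positions = zigzag_order(len(matrix))[:count]
    return [(pos, matrix[pos[0]][pos[1]]) for pos in positions]


def upper_left_region(matrix, size=8):
    """Keep only the size x size block based at the DC coefficient (inter example)."""
    return [row[:size] for row in matrix[:size]]


# A 16 x 16 matrix whose entries encode their own position, for illustration only.
mx = [[16 * r + c for c in range(16)] for r in range(16)]
limited_intra = first_n_in_scan_order(mx)
limited_inter = upper_left_region(mx)
print(len(limited_intra), len(limited_inter), len(limited_inter[0]))
```

Both restrictions keep 64 coefficients, but they select different positions: the first follows the scan order and therefore yields a roughly triangular region, while the second yields a square region.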

The same applies to the position determination unit 122. The same applies to the position determination unit 122A of the TU information decoding unit 12A and the position determination unit 320A of the TU information encoding unit 31A.

According to this modification, when the coefficient matrix size is large, it is possible to prevent a coefficient that does not need to be processed from being subjected to coefficient encoding processing or coefficient decoding processing.

Further, this modification is merely an example; the number of coefficients is not limited to 64 and can be set arbitrarily. Furthermore, the prediction mode of the coefficient matrix to which each restriction is applied is not limited to the above. For example, the upper-left 8 × 8 region may be detected for the intra 16 × 16 coefficient matrix.

(Conclusion)
Finally, each block of the moving picture decoding apparatus 1 and the moving picture encoding apparatus 2 described above may be realized in hardware by a logic circuit formed on an integrated circuit (IC chip), or may be realized in software using a CPU (Central Processing Unit).

In the latter case, each device includes a CPU that executes the instructions of a program realizing each function, a ROM (Read Only Memory) that stores the program, a RAM (Random Access Memory) into which the program is loaded, and a storage device (recording medium) such as a memory that stores the program and various types of data. The object of the present invention can also be achieved by supplying each of the above devices with a recording medium on which the program code (executable program, intermediate code program, or source program) of the control program of each device, which is software realizing the above-described functions, is recorded in a computer-readable manner, and by having the computer (or CPU or MPU) read and execute the program code recorded on the recording medium.

Examples of the recording medium include tapes such as magnetic tape and cassette tape; disks including magnetic disks such as floppy (registered trademark) disks and hard disks, and optical disks such as CD-ROM, MO, MD, DVD, CD-R, and Blu-ray Disc (registered trademark); cards such as IC cards (including memory cards) and optical cards; semiconductor memories such as mask ROM, EPROM, EEPROM, and flash ROM; and logic circuits such as PLD (Programmable Logic Device) and FPGA (Field Programmable Gate Array).

Also, each of the above devices may be configured to be connectable to a communication network, and the program code may be supplied via the communication network. The communication network is not particularly limited as long as it can transmit the program code. For example, the Internet, an intranet, an extranet, a LAN, ISDN, a VAN, a CATV communication network, a virtual private network (Virtual Private Network), a telephone line network, a mobile communication network, a satellite communication network, and the like can be used. The transmission medium constituting the communication network may be any medium that can transmit the program code, and is not limited to a specific configuration or type. For example, wired media such as IEEE 1394, USB, power line carrier, cable TV lines, telephone lines, and ADSL (Asymmetric Digital Subscriber Line) lines can be used, as can wireless media such as infrared (for example, IrDA and remote control), Bluetooth (registered trademark), IEEE 802.11 wireless, HDR (High Data Rate), NFC (Near Field Communication), DLNA (Digital Living Network Alliance), mobile phone networks, satellite lines, and terrestrial digital networks. The present invention can also be realized in the form of a computer data signal embedded in a carrier wave in which the program code is embodied by electronic transmission.

The present invention is not limited to the above-described embodiment, and various modifications can be made within the scope indicated in the claims. That is, embodiments obtained by combining technical means appropriately modified within the scope of the claims are also included in the technical scope of the present invention.

For example, the image decoding device according to the present embodiment described above decodes a moving image from encoded data. However, the present invention is applicable to image decoding devices in general, regardless of whether the image is a moving image or a still image. The same applies to the image encoding device.

As described above, the moving image decoding apparatus 1 according to the present invention reproduces the coefficient matrix MX10 by decoding and inverse scanning the coefficient encoded data C10, and includes: the last coefficient decoding unit 121 that detects the position of the last coefficient in the reverse scan order; the position determination unit 122 that determines whether or not the position of the last coefficient is in the first row or the first column including the DC coefficient DC in the coefficient matrix MX10; and the two-stage decoding unit 123 that, when the position of the last coefficient is in the first row or the first column, performs the decoding process on the coefficient matrix MX10 separately for the first row or the first column and for the rows or columns other than the first row or the first column. Therefore, it is possible to improve the encoding efficiency when the distribution of transform coefficients is biased.

Further, the present invention can also be expressed as follows. That is, an image decoding apparatus according to the present invention is an image decoding apparatus that reproduces a coefficient matrix by decoding and inverse scanning encoded quantized transform coefficients included in encoded data obtained by encoding image data, and includes: last coefficient position detecting means for detecting the position of the last non-zero coefficient in the reverse scan order in the coefficient matrix; position determination means for determining whether or not the position of the last non-zero coefficient is within a position determination range extending from the row or column including the coefficient of the DC component in the coefficient matrix to a predetermined row or column; and two-stage decoding means for, when the position of the last non-zero coefficient is within the position determination range, decoding the coefficient matrix separately for a first decoding target range including the position of the last non-zero coefficient and a second decoding target range other than the first decoding target range.

In the image decoding device according to the present invention, it is preferable that the position determination means determines whether or not the position of the last non-zero coefficient is in the first row or the first column in the coefficient matrix, and that the first decoding target range in the decoding process of the two-stage decoding means coincides with the position determination range.

If the non-zero coefficient is biased horizontally or vertically, the last non-zero coefficient is likely to be in the first row or the first column in the coefficient matrix. In the above configuration, the position determination range is linear.

In addition, if the position determination range matches the first decoding target range, then when the non-zero coefficients are biased in the horizontal or vertical direction, the first decoding target range can be decoded in a relatively simple scan order.

The image decoding apparatus according to the present invention preferably further comprises distance determining means for determining whether or not the position of the last non-zero coefficient is a predetermined distance or more away from the position of the coefficient of the DC component, and the two-stage decoding means preferably performs the decoding process when the position of the last non-zero coefficient is the predetermined distance or more away from the position of the DC component coefficient.

The farther the position of the last non-zero coefficient is from the position of the DC component coefficient, the more likely the run is to be divided.

According to the above configuration, since the decoding process is performed in two stages when the position of the last non-zero coefficient is away from the position of the DC component coefficient, it is possible to prevent the run from being divided.

In the image decoding apparatus according to the present invention, it is preferable that the two-stage decoding unit performs a zigzag scan in the reverse scan in the decoding of the second decoding target range.

Zigzag scanning is a scanning method adopted in Non-Patent Document 3 and the like. In addition, for example, when the first decoding target range and the second decoding target range are collectively decoded, the technique of Non-Patent Document 3 can be employed.

According to the above configuration, it is possible to apply the control in the case of performing batch decoding processing to the second decoding target range. For this reason, it is possible to divert the control in the case of performing batch decoding processing to the decoding processing in the second decoding target range.

The image decoding apparatus according to the present invention preferably includes recursive control means that performs control such that, for the decoding process of the second decoding target range by the two-stage decoding means, the position detection by the last coefficient position detecting means, the position determination by the position determination means, and the decoding by the two-stage decoding means are performed recursively.

According to the above configuration, in the decoding process of the second decoding target range, the second decoding target range is further divided into a new first decoding target range and a new second decoding target range. Then, a two-stage decoding process is performed recursively. Note that the new second decoding target range may be further recursively processed. In addition, the recursive process can be repeatedly executed under conditions such as “perform until the new second decoding target range reaches a predetermined size”.

This makes it possible to improve the coding efficiency when the coefficients are distributed unevenly in the horizontal direction or the vertical direction.

In the image decoding apparatus according to the present invention, the two-stage decoding means preferably refers, in decoding the first decoding target range, to a variable length code table that defines combinations of a variable length code and a length of consecutive non-zero coefficients, and in which, apart from the shortest variable length code, the longer the variable length code is, the longer the length of consecutive non-zero coefficients is.

The variable length code table is, for example, a VLC table. The VLC tables disclosed in Non-Patent Document 3 are defined in accordance with the spatial characteristics of the coefficient matrix. That is, in these VLC tables, a longer variable length code does not necessarily correspond to a longer run of consecutive non-zero coefficients.

On the other hand, in the above configuration, since the first decoding target range is a linear range, the spatial characteristics described above need not be taken into consideration.

Rather, it is preferable to perform the decoding process using the variable length code table in which the length of the continuous non-zero coefficient is longer as the length of the variable length code is longer as in the above configuration.

When the length of consecutive non-zero coefficients is the longest, that is, when all remaining coefficients in the range are non-zero coefficients, the frequency of occurrence is high, so assigning the shortest possible variable length code improves the encoding efficiency. For lengths other than the longest, the table may be configured so that a longer variable length code corresponds to a longer run of consecutive non-zero coefficients.
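
One possible table with the property described above can be sketched as follows. This is an illustrative construction only, not the table of the embodiment: the codewords, the helper names build_run_table and decode_run, and the choice of a unary-style prefix code are all assumptions. The longest length receives the shortest codeword, and for every other length the codeword grows with the length.

```python
def build_run_table(max_run):
    """Map each length 0..max_run to a prefix-free codeword (bit string)."""
    # The longest length is assumed to be the most frequent: shortest codeword.
    table = {max_run: "1"}
    # Every other length: longer length, longer codeword ("01", "001", ...).
    for run in range(max_run):
        table[run] = "0" * (run + 1) + "1"
    return table


def decode_run(bits, table):
    """Decode one length from the front of `bits`; return (length, bits consumed)."""
    inverse = {code: run for run, code in table.items()}
    prefix = ""
    for bit in bits:
        prefix += bit
        if prefix in inverse:
            return inverse[prefix], len(prefix)
    raise ValueError("no codeword matched")


table = build_run_table(4)
print(table)
print(decode_run("0011", table))
```

Because every codeword is a run of zeros terminated by a single one, no codeword is a prefix of another, so the decoder can stop at the first match.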

The image decoding apparatus according to the present invention preferably comprises non-zero coefficient position detection means for detecting the position of a predetermined non-zero coefficient counted from the last non-zero coefficient in the reverse scan order in the coefficient matrix, and the position determination means preferably determines whether or not the positions from the last non-zero coefficient to the predetermined non-zero coefficient are in the position determination range.

According to the above configuration, it is possible to prevent the decoding process from being performed in two stages even though only the last non-zero coefficient happens to be in the first range.

In the image decoding device according to the present invention, it is preferable that the two-stage decoding means performs decoding while switching modes, switching from a non-zero coefficient length decoding mode that decodes the length of consecutive non-zero coefficients to a sequential decoding mode that decodes successive coefficients when a predetermined mode switching condition is satisfied, and that the mode switching condition for decoding the first decoding target range differs from the mode switching condition for decoding the second decoding target range.

It is conceivable that the first range is smaller than the second range. In such a case, the appearance tendency of the non-zero coefficients may be completely different between the first range and the second range. Therefore, if the same mode switching condition is used for both ranges, the switching condition may be inappropriate for the appearance tendency of the first range or for that of the second range. In such a case, the encoding efficiency may be reduced.

The mode switching condition includes the number of decoded coefficients and the number of coefficients whose absolute values exceed a threshold.

According to the above configuration, since the mode switching condition in the first range can be appropriately set, it is possible to prevent the encoding efficiency from being lowered.
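
Range-specific mode switching conditions can be sketched as follows. This is a minimal illustration: the text only states that the condition involves the number of decoded coefficients and the number of coefficients whose absolute values exceed a threshold, so the way the two counts are combined (a logical OR here), the field names, and all numeric values are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SwitchCondition:
    max_decoded: int      # switch once this many coefficients have been decoded
    level_threshold: int  # absolute value above which a coefficient counts as large
    max_large: int        # switch once this many large coefficients have been seen


def should_switch(decoded_levels, cond):
    """Return True when the decoder should leave the length decoding mode."""
    large = sum(1 for v in decoded_levels if abs(v) > cond.level_threshold)
    return len(decoded_levels) >= cond.max_decoded or large >= cond.max_large


# Hypothetical settings: the smaller first range switches earlier.
FIRST_RANGE = SwitchCondition(max_decoded=4, level_threshold=1, max_large=1)
SECOND_RANGE = SwitchCondition(max_decoded=16, level_threshold=1, max_large=3)

levels = [1, -2]
print(should_switch(levels, FIRST_RANGE), should_switch(levels, SECOND_RANGE))
```

Keeping the condition as data makes it straightforward to hold one instance per decoding target range.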

An image encoding apparatus according to the present invention is an image encoding apparatus that outputs encoded data by applying an orthogonal transform to a prediction residual obtained by subtracting a predicted image from image data and by scanning and encoding a coefficient matrix composed of the quantized transform coefficients, and includes: last coefficient position detecting means for detecting the position of the last non-zero coefficient in the scan order in the coefficient matrix; position determination means for determining whether or not the position of the last non-zero coefficient is within a position determination range extending from the row or column including the coefficient of the DC component in the coefficient matrix to a predetermined row or column; and two-stage encoding means for, when the position of the last non-zero coefficient is within the position determination range, encoding the coefficient matrix separately for a first encoding target range including the position of the last non-zero coefficient and a second encoding target range other than the first encoding target range.

The data structure of the encoded data according to the present invention is a data structure of encoded data generated by applying an orthogonal transform to a prediction residual obtained by subtracting a predicted image from image data and by scanning and encoding a coefficient matrix composed of the quantized transform coefficients, and includes: position information indicating the position of the last non-zero coefficient in the scan order in the coefficient matrix; first coefficient encoded data in which the coefficients in a first encoding target range, which includes the last non-zero coefficient and extends from the row or column including the DC component coefficient in the coefficient matrix to a predetermined row or column, are sequentially encoded; and second coefficient encoded data in which the coefficients in a second encoding target range other than the first encoding target range are sequentially encoded.

The image encoding device may include, in the data structure of the encoded data: position information indicating the position of the last non-zero coefficient in the scan order in the coefficient matrix; first coefficient encoded data obtained by sequentially encoding the coefficients in a first range, which includes the last non-zero coefficient and extends from the row or column including the coefficient of the DC component in the coefficient matrix to a predetermined row or column; and second coefficient encoded data obtained by sequentially encoding the coefficients in a second range other than the first range. The image encoding device may encode this information in, for example, side information.
<< Embodiment 2 >>
Another embodiment of the decoding device and the encoding device according to the present invention will be described below with reference to the drawings. Note that the decoding apparatus according to the present embodiment decodes a moving image from encoded data. Therefore, hereinafter, this is referred to as “moving image decoding apparatus”. In addition, the encoding device according to the present embodiment generates encoded data by encoding a moving image. Therefore, in the following, this is referred to as a “video encoding device”.

However, the scope of application of the present invention is not limited to this. That is, as will be apparent from the following description, the features of the present invention can be realized without assuming a plurality of frames. That is, the present invention can be applied to a general decoding apparatus and a general encoding apparatus regardless of whether the target is a moving image or a still image.

(Configuration of encoded data # 1)
Prior to the description of the moving picture decoding apparatus 1 according to the present embodiment, the configuration of encoded data # 1, which is generated by the moving picture encoding apparatus 2 according to the present embodiment and decoded by the moving picture decoding apparatus 1, will be described with reference to FIG. The encoded data # 1 has a hierarchical structure including a sequence layer, a GOP (Group Of Pictures) layer, a picture layer, a slice layer, and a largest coding unit (LCU: Largest Coding Unit) layer.

FIG. 37 shows the hierarchical structure below the picture layer in the encoded data # 1. FIGS. 37 (a) to 37 (f) respectively show the structures of the picture layer P, the slice layer S, the LCU layer LCU, a leaf CU included in the LCU (denoted as CUL in FIG. 37 (d)), the inter prediction information PI_Inter, which is the prediction information PI for an inter prediction (inter-screen prediction) partition, and the intra prediction information PI_Intra, which is the prediction information PI for an intra prediction (intra-screen prediction) partition.

(Picture layer)
The picture layer P is a set of data that is referenced by the video decoding device 1 in order to decode a target picture that is a processing target picture. As shown in FIG. 37 (a), the picture layer P includes a picture header PH and slice layers S1 to SNs (Ns is the total number of slice layers included in the picture layer P).

The picture header PH includes a coding parameter group referred to by the video decoding device 1 in order to determine a decoding method of the target picture. For example, the encoding mode information (entropy_coding_mode_flag) indicating the variable length encoding mode used in encoding by the moving image encoding device 2 is an example of an encoding parameter included in the picture header PH.

(Slice layer)
Each slice layer S included in the picture layer P is a set of data referred to by the video decoding device 1 in order to decode a target slice that is a slice to be processed. As shown in FIG. 37 (b), the slice layer S includes a slice header SH and LCU layers LCU1 to LCUNc (Nc is the total number of LCUs included in the slice S).

The slice header SH includes a coding parameter group that the moving image decoding apparatus 1 refers to in order to determine a decoding method of the target slice. Slice type designation information (slice_type) for designating a slice type is an example of an encoding parameter included in the slice header SH.

Slice types that can be specified by the slice type designation information include (1) an I slice that uses only intra prediction at the time of encoding, (2) a P slice that uses unidirectional prediction or intra prediction at the time of encoding, and (3) a B slice that uses unidirectional prediction, bidirectional prediction, or intra prediction at the time of encoding.

Also, the slice header SH includes a filter parameter FP that is referred to by an adaptive filter included in the video decoding device 1. The configuration of the filter parameter FP will be described later and will not be described here.

(LCU layer)
Each LCU layer LCU included in the slice layer S is a set of data that the video decoding device 1 refers to in order to decode the target LCU that is the processing target LCU.

The LCU layer LCU is composed of a plurality of coding units (CU: Coding Units) obtained by hierarchically dividing the LCU into a quadtree. In other words, the LCU layer LCU is a coding unit corresponding to the highest level in a hierarchical structure that recursively includes a plurality of CUs. As shown in FIG. 37 (c), each CU included in the LCU layer LCU has a hierarchical structure that recursively includes a CU header CUH and a plurality of CUs obtained by dividing the CU into a quadtree.

The size of each CU excluding the LCU is half the size of the CU to which the CU directly belongs (that is, the CU one layer higher than the CU), and the sizes that each CU can take depend on the LCU size and the hierarchical depth included in the sequence parameter set SPS of the encoded data # 1. For example, when the size of the LCU is 128 × 128 pixels and the maximum hierarchical depth is 5, the CUs in the hierarchical levels at and below the LCU can take five sizes: 128 × 128 pixels, 64 × 64 pixels, 32 × 32 pixels, 16 × 16 pixels, and 8 × 8 pixels. A CU that is not further divided is called a leaf CU.
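
The relationship between the LCU size, the maximum hierarchical depth, and the possible CU sizes can be checked with a short calculation. This is a minimal sketch in which the function name cu_sizes is hypothetical and a CU size is expressed as the length of one side in pixels.

```python
def cu_sizes(lcu_size, max_depth):
    """List the possible CU side lengths, halving once per hierarchical level."""
    return [lcu_size >> depth for depth in range(max_depth)]


print(cu_sizes(128, 5))  # [128, 64, 32, 16, 8]
```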

(CU header)
The CU header CUH includes a coding parameter referred to by the video decoding device 1 in order to determine a decoding method of the target CU. Specifically, as shown in FIG. 37 (c), a CU division flag SP_CU for specifying whether or not the target CU is further divided into four subordinate CUs is included. When the CU division flag SP_CU is 0, that is, when the CU is not further divided, the CU is a leaf CU.
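
How the CU division flag SP_CU determines the set of leaf CUs can be sketched as a recursive traversal. The sketch below is illustrative only: the function name parse_cu_tree, the depth-first flag order, the raster order of the four child CUs, and the rule that a CU of the minimum size carries no flag are all assumptions that are not stated in this description.

```python
from typing import Iterator


def parse_cu_tree(flags: Iterator[int], x: int, y: int, size: int,
                  min_size: int) -> list[tuple[int, int, int]]:
    """Return the leaf CUs as (x, y, size), reading one SP_CU flag per CU.

    Assumptions (not from the source): flags arrive in depth-first order,
    the four child CUs are visited in raster order, and a CU at min_size
    carries no flag because it cannot be divided further.
    """
    if size > min_size and next(flags) == 1:
        half = size // 2
        leaves = []
        for dy in (0, half):
            for dx in (0, half):
                leaves += parse_cu_tree(flags, x + dx, y + dy, half, min_size)
        return leaves
    return [(x, y, size)]  # SP_CU == 0 (or minimum size): this is a leaf CU


# Divide a 64 x 64 LCU once, then divide only its first 32 x 32 child again.
leaves = parse_cu_tree(iter([1, 1, 0, 0, 0, 0, 0, 0, 0]), 0, 0, 64, 8)
print(len(leaves))  # 7 leaf CUs: four of 16 x 16 and three of 32 x 32
```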

(Leaf CU)
A CU that is not further divided (leaf CU) is handled as a prediction unit (PU: Prediction Unit) and a transform unit (TU: Transform Unit).

As shown in FIG. 37 (d), the leaf CU (denoted as CUL in FIG. 37 (d)) includes (1) PU information PUI that is referred to when the moving image decoding apparatus 1 generates a predicted image, and (2) The TU information TUI that is referred to when the residual data is decoded by the moving picture decoding apparatus 1 is included.

The skip flag SKIP is a flag indicating whether or not the skip mode is applied to the target PU. When the value of the skip flag SKIP is 1, that is, when the skip mode is applied to the target leaf CU, the PU information PUI and the TU information TUI in the leaf CU are omitted. Note that the skip flag SKIP is omitted for the I slice.

The PU information PUI includes a skip flag SKIP, prediction type information PT, and prediction information PI as shown in FIG. The prediction type information PT is information that specifies whether intra prediction or inter prediction is used as a predicted image generation method for the target leaf CU (target PU). The prediction information PI includes intra prediction information PI_Intra or inter prediction information PI_Inter depending on which prediction method is specified by the prediction type information PT. Hereinafter, a PU to which intra prediction is applied is also referred to as an intra PU, and a PU to which inter prediction is applied is also referred to as an inter PU.

As shown in FIG. 37 (d), the TU information TUI includes a quantization parameter difference Δqp (tu_qp_delta) that specifies the magnitude of the quantization step, TU partition information SP_TU that specifies the division pattern of the target leaf CU (target TU) into blocks, and quantization residual information QD.

The quantization parameter difference Δqp is a difference qp−qp ′ between the quantization parameter qp in the target TU and the quantization parameter qp ′ in the TU encoded immediately before the TU.

TU partition information SP_TU is information that specifies the shape and size of each block included in the target TU and the position in the target TU. Each TU can be, for example, a size from 64 × 64 pixels to 2 × 2 pixels. Here, the block is one or a plurality of non-overlapping areas constituting the target leaf CU, and encoding / decoding of the prediction residual is performed in units of blocks.

The quantization residual information QD is encoded data generated by the moving picture encoding apparatus 2 performing the following processes 1 to 3 on a target block that is a processing target block. Process 1: DCT transform (Discrete Cosine Transform) is performed on the prediction residual obtained by subtracting the prediction image from the encoding target image. Process 2: The transform coefficient obtained in Process 1 is quantized. Process 3: The transform coefficient quantized in Process 2 is variable-length encoded. The quantization parameter qp described above represents the magnitude of the quantization step QP used when the moving image encoding apparatus 2 quantizes the transform coefficient (QP = 2 ^{qp / 6} ). Various syntaxes included in the quantization residual information QD will be described later.

(Inter prediction information PI_Inter)
The inter prediction information PI_Inter includes a coding parameter that is referred to when the video decoding device 1 generates an inter prediction image by inter prediction. As shown in FIG. 37 (e), the inter prediction information PI_Inter includes inter PU partition information SP_Inter that specifies the division pattern of the target PU into partitions, and inter prediction parameters PP_Inter1 to PP_InterNe for each partition (Ne is the total number of inter prediction partitions included in the target PU).

Specifically, the inter-PU partition information SP_Inter is information for designating the shape and size of each inter prediction partition included in the target PU (inter PU) and the position in the target PU.

An inter PU can be divided into a total of eight types of partitions by four symmetric splittings of 2N × 2N pixels, 2N × N pixels, N × 2N pixels, and N × N pixels, and four asymmetric splittings of 2N × nU pixels, 2N × nD pixels, nL × 2N pixels, and nR × 2N pixels. Here, the specific value of N is defined by the size of the CU to which the PU belongs, and the specific values of nU, nD, nL, and nR are determined according to the value of N. For example, an inter PU of 128 × 128 pixels can be divided into inter prediction partitions of 128 × 128 pixels, 128 × 64 pixels, 64 × 128 pixels, 64 × 64 pixels, 128 × 32 pixels, 128 × 96 pixels, 32 × 128 pixels, and 96 × 128 pixels.
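
As a rough illustration, the eight splittings can be tabulated for a given CU size. The sketch below is not taken from this description: the function name inter_partition_sizes is hypothetical, and the assumption that each asymmetric splitting divides the PU at one quarter or three quarters of its edge length is inferred only from the 128 × 128 pixel example above.

```python
def inter_partition_sizes(cu_size: int) -> dict[str, list[tuple[int, int]]]:
    """Map each inter split type to its partition sizes as (width, height).

    cu_size is the 2N edge length. The quarter / three-quarter asymmetric
    split is an assumption inferred from the 128 x 128 example in the text.
    """
    n = cu_size // 2
    q = cu_size // 4  # short side of an asymmetric partition (assumed)
    return {
        "2Nx2N": [(cu_size, cu_size)],
        "2NxN": [(cu_size, n)] * 2,
        "Nx2N": [(n, cu_size)] * 2,
        "NxN": [(n, n)] * 4,
        "2NxnU": [(cu_size, q), (cu_size, cu_size - q)],
        "2NxnD": [(cu_size, cu_size - q), (cu_size, q)],
        "nLx2N": [(q, cu_size), (cu_size - q, cu_size)],
        "nRx2N": [(cu_size - q, cu_size), (q, cu_size)],
    }


for split_type, parts in inter_partition_sizes(128).items():
    print(split_type, parts)
```

Under these assumptions, the distinct partition sizes produced for a 128 × 128 pixel PU are exactly the eight sizes listed in the example.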

(Inter prediction parameter PP_Inter)
As illustrated in FIG. 37E, the inter prediction parameter PP_Inter includes a reference image index RI, an estimated motion vector index PMVI, and a motion vector residual MVD.

The motion vector residual MVD is encoded data generated by the moving image encoding device 2 executing the following processes 4 to 6. Process 4: An encoded / decoded locally decoded image (more precisely, an image obtained by performing deblocking processing and adaptive filter processing on the encoded / decoded locally decoded image) is selected, and the motion vector mv for the target partition is derived with reference to the selected encoded / decoded locally decoded image (hereinafter also referred to as “reference image”). Process 5: An estimation method is selected, and an estimated value (hereinafter also referred to as “estimated motion vector”) pmv of the motion vector mv assigned to the target partition is derived using the selected estimation method. Process 6: The motion vector residual MVD obtained by subtracting the estimated motion vector pmv derived in Process 5 from the motion vector mv derived in Process 4 is encoded.

The reference image index RI described above designates the encoded / decoded locally decoded image (reference image) selected in Process 4, and the estimated motion vector index PMVI described above designates the estimation method selected in Process 5. Examples of the estimation methods that can be selected in Process 5 include (1) a method in which, in the locally decoded image being encoded / decoded (more precisely, an image obtained by performing deblocking processing and adaptive filter processing on the already decoded region of the locally decoded image being encoded / decoded), the median of the motion vectors assigned to the partitions adjacent to the target partition (hereinafter also referred to as “adjacent partitions”) is used as the estimated motion vector pmv, and (2) a method in which, in an encoded / decoded locally decoded image, the motion vector assigned to the partition occupying the same position as the target partition (often referred to as a “collocated partition”) is used as the estimated motion vector pmv.

Note that the prediction parameter PP for a partition on which unidirectional prediction is performed includes one each of the reference image index RI, the estimated motion vector index PMVI, and the motion vector residual MVD, as shown in FIG. The prediction parameter PP for a partition on which bidirectional prediction (weighted prediction) is performed, however, includes two reference image indexes RI1 and RI2, two estimated motion vector indexes PMVI1 and PMVI2, and two motion vector residuals MVD1 and MVD2.

(Intra prediction information PI_Intra)
The intra prediction information PI_Intra includes an encoding parameter that is referred to when the video decoding device 1 generates an intra predicted image by intra prediction. As shown in FIG. 37 (f), the intra prediction information PI_Intra includes intra PU partition information SP_Intra that specifies the division pattern of the target PU (intra PU) into partitions, and intra prediction parameters PP_Intra1 to PP_IntraNa for each partition (Na is the total number of intra prediction partitions included in the target PU).

Specifically, the intra PU partition information SP_Intra is information that specifies the shape and size of each intra prediction partition included in the target PU, and its position in the target PU. The intra PU partition information SP_Intra includes an intra split flag (intra_split_flag) that specifies whether or not the target PU is divided into partitions. If the intra split flag is 1, the target PU is divided symmetrically into four partitions. If the intra split flag is 0, the target PU is not divided and the target PU itself is treated as one partition. Therefore, if the size of the target PU is 2N × 2N pixels, the intra prediction partition can take either of the sizes 2N × 2N pixels (no division) and N × N pixels (four divisions) (where N = 2^n and n is an arbitrary integer of 1 or more). For example, a 128 × 128 pixel intra PU can be divided into 128 × 128 pixel and 64 × 64 pixel intra prediction partitions.

(Intra prediction parameter PP_Intra)
As shown in FIG. 37 (f), the intra prediction parameter PP_Intra includes an estimation flag MPM and a residual prediction mode index RIPM. The intra prediction parameter PP_Intra is a parameter for designating an intra prediction method (prediction mode) for each partition.

The estimation flag MPM is a flag indicating whether or not the prediction mode estimated based on the prediction modes allocated to the partitions around the target partition that is the processing target is the same as the prediction mode for the target partition. Here, examples of partitions around the target partition include a partition adjacent to the upper side of the target partition and a partition adjacent to the left side of the target partition.

The residual prediction mode index RIPM is an index that is included in the intra prediction parameter PP_Intra when the estimated prediction mode and the prediction mode for the target partition are different, and that designates the prediction mode assigned to the target partition.

(Filter parameter FP)
As described above, the slice header SH includes the filter parameter FP that is referred to by the adaptive filter included in the video decoding device 1. The filter parameter FP includes a filter coefficient group as shown in FIG. The filter coefficient group includes (1) tap number designation information for designating the number of taps of the filter, (2) filter coefficients a_0 to a_{NT-1} (NT is the total number of filter coefficients included in the filter coefficient group), and (3) an offset o.

(Configuration of quantization residual information QD)
FIG. 38 is a table showing each syntax included in the quantization residual information QD (indicated as residual_block_pipe (tuSize) in FIG. 38).

As shown in FIG. 38, the quantization residual information QD includes the syntax sig_flag, last_flag, abs_greater_one, coeff_abs_level, and sign. The syntax sig_flag and the syntax last_flag are also referred to as a significant map. The various syntaxes included in the quantization residual information QD are encoded by context adaptive binary arithmetic coding (CABAC: Context-based Adaptive Binary Arithmetic Coding).

FIG. 39A is a diagram illustrating a non-zero transform coefficient (hereinafter also referred to as a non-zero change coefficient) in an 8 × 8 pixel block. In FIG. 39A, the horizontal axis represents the horizontal frequency, and the vertical axis represents the vertical frequency. Hereinafter, the transform coefficient for the horizontal frequency u (0 ≦ u ≦ 7) and the vertical frequency v (0 ≦ v ≦ 7) will be expressed as Coeff (u, v). In FIG. 39A, the transform coefficient Coeff (0, 0) represents the DC component, and the other transform coefficients represent components other than the DC component.

The variable-length code decoding unit 11 included in the moving picture decoding apparatus 1, which will be described later, scans the target block from the low frequency side (upper left in FIG. 39A) toward the high frequency side (lower right in FIG. 39A) and sequentially decodes the transform coefficients Coeff (u, v). A specific scan order will be described later.

In the following description, among the partial areas included in the frequency domain, the partial area specified by the horizontal frequency u (0 ≦ u ≦ 7) and the vertical frequency v (0 ≦ v ≦ 7) is also called the frequency component (u, v).

The syntax sig_flag is a flag indicating, for each frequency component (u, v), the presence or absence of a non-zero transform coefficient. FIG. 39 (b) shows the value of the syntax sig_flag when the transform coefficients to be decoded are those shown in FIG. 39 (a). As shown in FIG. 39 (b), the syntax sig_flag is a flag that, for each frequency component (u, v), takes 0 if the transform coefficient is 0 and takes 1 if the transform coefficient is not 0.

The syntax last_flag is a flag indicating, for each frequency component (u, v), whether or not the transform coefficient is the last transform coefficient in the scan order. FIG. 39 (c) is a diagram illustrating the syntax last_flag when the transform coefficients to be decoded are those illustrated in FIG. 39 (a). The syntax last_flag is a flag that, for each frequency component (u, v), takes 1 if the transform coefficient is the last transform coefficient in the scan order, and takes 0 otherwise. The variable length code decoding unit 11 sequentially decodes the syntax last_flag from the low frequency side, and performs the transform coefficient decoding processing up to the frequency component for which last_flag = 1.

The syntax abs_greater_one is a flag indicating, for each frequency component (u, v), whether or not the absolute value of the transform coefficient exceeds 1. FIG. 39D is a diagram illustrating the syntax abs_greater_one when the transform coefficients to be decoded are those illustrated in FIG. As shown in FIG. 39D, the syntax abs_greater_one is a flag that takes 1 when the absolute value of the transform coefficient exceeds 1, and takes 0 when the absolute value of the transform coefficient does not exceed 1.

The syntax coeff_abs_level is a syntax indicating the magnitude of the absolute value of the transform coefficient when the absolute value of the transform coefficient exceeds 1, and is encoded as the value obtained by subtracting 2 from the absolute value. The syntax sign is a flag indicating the sign of the transform coefficient (whether positive or negative). It takes sign = 1 when the value of the transform coefficient is negative, and sign = 0 when the value of the transform coefficient is positive.

The variable-length code decoding unit 11 included in the video decoding device 1 can generate the transform coefficient Coeff (u, v) for each frequency component by decoding the syntax sig_flag, last_flag, abs_greater_one, coeff_abs_level, and sign from the quantization residual information QD. For example, if sig_flag = 1, abs_greater_one = 0, and sign = 1 for the frequency component (u, v), the variable-length code decoding unit 11 generates Coeff (u, v) = −1, and if sig_flag = 1, abs_greater_one = 1, coeff_abs_level = 3, and sign = 0, it generates Coeff (u, v) = +5.
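
The way a transform coefficient follows from the decoded syntax values can be sketched as below. The function name reconstruct_coeff and its default arguments are hypothetical; the rules themselves (absolute value 1 when abs_greater_one = 0, absolute value coeff_abs_level + 2 otherwise, negative when sign = 1) follow the definitions given above.

```python
def reconstruct_coeff(sig_flag: int, abs_greater_one: int = 0,
                      coeff_abs_level: int = 0, sign: int = 0) -> int:
    """Rebuild one transform coefficient Coeff(u, v) from its syntax values."""
    if sig_flag == 0:
        return 0  # no non-zero coefficient at this frequency component
    # coeff_abs_level carries (absolute value - 2) and is used only when
    # the absolute value exceeds 1.
    magnitude = coeff_abs_level + 2 if abs_greater_one else 1
    return -magnitude if sign else magnitude


# The two examples from the text.
print(reconstruct_coeff(1, abs_greater_one=0, sign=1))                     # -1
print(reconstruct_coeff(1, abs_greater_one=1, coeff_abs_level=3, sign=0))  # 5
```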

(Moving picture decoding apparatus 1)
Hereinafter, the video decoding device 1 according to the present embodiment will be described with reference to FIGS. The moving picture decoding apparatus 1 is a decoding device that includes, as parts thereof, technology adopted in H.264 / MPEG-4 AVC, technology adopted in KTA software, which is a codec for joint development in VCEG (Video Coding Expert Group), and technology adopted in TMuC (Test Model under Consideration) software, which is its successor codec.

FIG. 40 is a block diagram showing a configuration of the moving picture decoding apparatus 1. As shown in FIG. 40, the moving image decoding apparatus 1 includes a variable length code decoding unit 11, a predicted image generation unit 12, an inverse quantization / inverse transform unit 13, an adder 14, a frame memory 15, and a loop filter 16. As illustrated in FIG. 40, the predicted image generation unit 12 includes a motion vector restoration unit 12a, an inter predicted image generation unit 12b, an intra predicted image generation unit 12c, and a prediction method determination unit 12d. The moving picture decoding apparatus 1 is an apparatus for generating moving picture # 2 by decoding encoded data # 1.

(Variable-length code decoding unit 11)
FIG. 41 is a block diagram illustrating a main configuration of the variable length code decoding unit 11. As shown in FIG. 41, the variable-length code decoding unit 11 includes a quantization residual information decoding unit 111, a prediction parameter decoding unit 112, a prediction type information decoding unit 113, and a filter parameter decoding unit 114.

The variable length code decoding unit 11 uses the prediction parameter decoding unit 112 to decode the prediction parameter PP related to each partition from the encoded data # 1, and supplies the decoded prediction parameter PP to the predicted image generation unit 12. Specifically, for an inter prediction partition, the prediction parameter decoding unit 112 decodes the inter prediction parameter PP_Inter including the reference image index RI, the estimated motion vector index PMVI, and the motion vector residual MVD from the encoded data # 1, and supplies these to the motion vector restoration unit 12a. On the other hand, for an intra prediction partition, the intra prediction parameter PP_Intra including the estimation flag MPM, the residual prediction mode index RIPM, and the additional index AI is decoded from the encoded data # 1, and these are supplied to the intra prediction image generation unit 12c.

Also, the variable length code decoding unit 11 decodes, in the prediction type information decoding unit 113, the prediction type information PT for each partition from the encoded data # 1, and supplies this to the prediction method determination unit 12d. Further, the variable length code decoding unit 11 decodes, in the quantization residual information decoding unit 111, the quantization residual information QD related to the block and the quantization parameter difference Δqp related to the TU including the block from the encoded data # 1, and supplies these to the inverse quantization / inverse transform unit 13. In addition, the variable length code decoding unit 11 decodes, in the filter parameter decoding unit 114, the filter parameter FP from the encoded data # 1, and supplies this to the loop filter 16. Note that a specific configuration of the quantization residual information decoding unit 111 will be described later, and a description thereof is omitted here.

(Predicted image generation unit 12)
The predicted image generation unit 12 identifies whether each partition is an inter prediction partition for performing inter prediction or an intra prediction partition for performing intra prediction based on the prediction type information PT for each partition. In the former case, the inter prediction image Pred_Inter is generated, and the generated inter prediction image Pred_Inter is supplied to the adder 14 as the prediction image Pred. In the latter case, the intra prediction image Pred_Intra is generated, and the generated intra prediction image Pred_Intra is supplied to the adder 14 as the prediction image Pred. Note that, when the skip mode is applied to the processing target PU, the predicted image generation unit 12 omits decoding of the other parameters belonging to the PU.

(Motion vector restoration unit 12a)
The motion vector restoration unit 12a restores the motion vector mv related to each inter prediction partition from the motion vector residual MVD related to that partition and the restored motion vector mv ′ related to another partition. Specifically, (1) the estimated motion vector pmv is derived from the restored motion vector mv ′ according to the estimation method specified by the estimated motion vector index PMVI, and (2) the motion vector mv is obtained by adding the derived estimated motion vector pmv and the motion vector residual MVD. It should be noted that the restored motion vector mv ′ relating to other partitions can be read from the frame memory 15. The motion vector restoration unit 12a supplies the restored motion vector mv to the inter predicted image generation unit 12b together with the corresponding reference image index RI.
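
The two restoration steps can be sketched as follows for the median-based estimation method. This is a minimal sketch under stated assumptions: the function names are hypothetical, the median is taken separately for each vector component, and statistics.median_low is used so that the estimate is always one of the neighboring values; none of these details is specified in this description.

```python
from statistics import median_low

MotionVector = tuple[int, int]


def estimate_pmv_median(neighbor_mvs: list[MotionVector]) -> MotionVector:
    """Step (1): derive pmv as the component-wise median of restored
    motion vectors mv' of adjacent partitions (component-wise is assumed)."""
    xs = [mv[0] for mv in neighbor_mvs]
    ys = [mv[1] for mv in neighbor_mvs]
    return (median_low(xs), median_low(ys))


def restore_mv(pmv: MotionVector, mvd: MotionVector) -> MotionVector:
    """Step (2): mv = pmv + MVD."""
    return (pmv[0] + mvd[0], pmv[1] + mvd[1])


neighbors = [(4, -2), (6, 0), (5, 3)]  # restored mv' of adjacent partitions
pmv = estimate_pmv_median(neighbors)   # (5, 0)
print(restore_mv(pmv, (-1, 2)))        # (4, 2)
```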

(Inter prediction image generation unit 12b)
The inter prediction image generation unit 12b generates a motion compensated image mc related to each inter prediction partition by inter-screen prediction. Specifically, using the motion vector mv supplied from the motion vector restoration unit 12a, it generates the motion compensated image mc from the adaptive filtered decoded image P_ALF ′ designated by the reference image index RI, which is also supplied from the motion vector restoration unit 12a. Here, the adaptive filtered decoded image P_ALF ′ is an image obtained by performing the filtering process by the loop filter 16 on a decoded image for which decoding of the entire frame has already been completed, and the inter prediction image generation unit 12b can read out the pixel value of each pixel constituting the adaptive filtered decoded image P_ALF ′ from the frame memory 15. The motion compensated image mc generated by the inter predicted image generation unit 12b is supplied to the prediction method determination unit 12d as an inter predicted image Pred_Inter.

(Intra predicted image generation unit 12c)
The intra predicted image generation unit 12c generates a predicted image Pred_Intra related to each intra prediction partition. Specifically, first, a prediction mode is specified based on the intra prediction parameter PP_Intra supplied from the variable length code decoding unit 11, and the specified prediction mode is assigned to the target partition in, for example, raster scan order.

Here, specification of the prediction mode based on the intra prediction parameter PP_Intra can be performed as follows. (1) The estimation flag MPM is decoded, and if the estimation flag MPM indicates that the prediction mode for the target partition to be processed is the same as the prediction mode assigned to a partition around the target partition, the prediction mode assigned to the partition around the target partition is assigned to the target partition. (2) On the other hand, if the estimation flag MPM indicates that the prediction mode for the target partition to be processed is not the same as the prediction mode assigned to a partition around the target partition, the residual prediction mode index RIPM is decoded, and the prediction mode indicated by the residual prediction mode index RIPM is assigned to the target partition.
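
The two cases above can be summarized in a short sketch. The function name decode_intra_mode is hypothetical, the value 1 of the estimation flag MPM is assumed to mean "same as the estimated prediction mode", and the derivation of the estimated mode from the surrounding partitions, as well as any remapping of the RIPM index, is outside the scope of the sketch.

```python
from typing import Optional


def decode_intra_mode(mpm_flag: int, estimated_mode: int,
                      ripm: Optional[int] = None) -> int:
    """Select the prediction mode for the target partition.

    Assumption: mpm_flag == 1 means the estimated mode applies; the
    estimated mode itself is supplied by the caller.
    """
    if mpm_flag == 1:
        return estimated_mode  # case (1): reuse the estimated mode
    if ripm is None:
        raise ValueError("RIPM must be decoded when the MPM flag is 0")
    return ripm                # case (2): mode designated by RIPM


print(decode_intra_mode(1, estimated_mode=2))          # 2
print(decode_intra_mode(0, estimated_mode=2, ripm=7))  # 7
```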

The intra predicted image generation unit 12c generates a predicted image Pred_Intra from the (local) decoded image P by intra prediction according to the prediction method indicated by the prediction mode assigned to the target partition. The intra predicted image Pred_Intra generated by the intra predicted image generation unit 12c is supplied to the prediction method determination unit 12d. Note that the intra predicted image generation unit 12c can also be configured to generate the predicted image Pred_Intra from the adaptive filtered decoded image P_ALF by intra prediction.

(Prediction method determination unit 12d)
The prediction method determination unit 12d determines whether each partition is an inter prediction partition on which inter prediction should be performed or an intra prediction partition on which intra prediction should be performed, based on the prediction type information PT about the PU to which each partition belongs. In the former case, the inter prediction image Pred_Inter generated by the inter prediction image generation unit 12b is supplied to the adder 14 as the prediction image Pred. In the latter case, the intra predicted image Pred_Intra generated by the intra predicted image generation unit 12c is supplied to the adder 14 as the predicted image Pred.

(Inverse quantization / inverse transform unit 13)
The inverse quantization / inverse transform unit 13 (1) inversely quantizes the transform coefficient Coeff decoded from the quantization residual information QD, (2) performs an inverse DCT (Discrete Cosine Transform) on the transform coefficient Coeff_IQ obtained by the inverse quantization, and (3) supplies the prediction residual D obtained by the inverse DCT to the adder 14. Note that, when inversely quantizing the transform coefficient Coeff decoded from the quantization residual information QD, the inverse quantization / inverse transform unit 13 derives the quantization step QP from the quantization parameter difference Δqp supplied from the variable length code decoding unit 11. The quantization parameter qp can be derived by adding the quantization parameter difference Δqp to the quantization parameter qp ′ relating to the TU that was inversely quantized / inversely DCT transformed immediately before, and the quantization step QP can be derived from the quantization parameter qp by, for example, QP = 2 ^{qp / 6} . The generation of the prediction residual D by the inverse quantization / inverse transform unit 13 is performed in units of TUs or of blocks obtained by dividing TUs.
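
The derivation of the quantization step can be sketched as below. The function name quantization_step is hypothetical; the two relations used, qp = qp ′ + Δqp and QP = 2^(qp / 6), are the ones stated above, with the latter given in the text only as an example.

```python
def quantization_step(prev_qp: int, delta_qp: int) -> tuple[int, float]:
    """Return (qp, QP) for the target TU.

    prev_qp is qp', the quantization parameter of the TU processed
    immediately before; delta_qp is the decoded difference (tu_qp_delta).
    """
    qp = prev_qp + delta_qp
    return qp, 2.0 ** (qp / 6)


qp, step = quantization_step(prev_qp=22, delta_qp=2)
print(qp, step)  # 24 16.0
```

Under this relation, the quantization step doubles each time qp increases by 6.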

For example, when the size of the target block is 8 × 8 pixels, the pixel position in the target block is denoted as (i, j) (0 ≦ i ≦ 7, 0 ≦ j ≦ 7), and the value of the prediction residual D at the position (i, j) is denoted as D (i, j), the inverse DCT transform performed by the inverse quantization / inverse transform unit 13 is given by, for example, the following mathematical formula (1).
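
Mathematical formula (1) itself is not reproduced in this text. Assuming it is the standard two-dimensional 8 × 8 inverse DCT, which is consistent with the normalization factors C (u) and C (v) defined for it, the formula would read as follows. The pairing of the pixel index i with the horizontal frequency u and of j with the vertical frequency v is likewise an assumption.

```latex
D(i, j) = \frac{1}{4} \sum_{u=0}^{7} \sum_{v=0}^{7}
          C(u)\, C(v)\, \mathrm{Coeff\_IQ}(u, v)\,
          \cos\frac{(2i + 1)\, u\, \pi}{16}\,
          \cos\frac{(2j + 1)\, v\, \pi}{16}
\qquad (1)
```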

Here, Coeff_IQ (u, v) (0 ≦ u ≦ 7, 0 ≦ v ≦ 7) represents inversely quantized transform coefficients for the horizontal frequency u and the vertical frequency v. C (u) and C (v) are given as follows.

・ C (u) = 1 / √2 (u = 0)
・ C (u) = 1 (u ≠ 0)
・ C (v) = 1 / √2 (v = 0)
・ C (v) = 1 (v ≠ 0)
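
A direct, unoptimized evaluation of an 8 × 8 inverse DCT using the factors C (u) and C (v) above can be sketched as follows. The sketch assumes the standard inverse DCT with an overall factor of 1/4 and with i paired with u and j with v; the function name inverse_dct_8x8 and the row-major layout coeff_iq[v][u] are hypothetical choices.

```python
import math


def inverse_dct_8x8(coeff_iq: list[list[float]]) -> list[list[float]]:
    """Return the prediction residual d[j][i] from coefficients coeff_iq[v][u].

    Assumption: standard 8 x 8 inverse DCT with an overall factor of 1/4.
    """
    def c(k: int) -> float:
        return 1 / math.sqrt(2) if k == 0 else 1.0

    d = [[0.0] * 8 for _ in range(8)]
    for j in range(8):
        for i in range(8):
            total = 0.0
            for v in range(8):
                for u in range(8):
                    total += (c(u) * c(v) * coeff_iq[v][u]
                              * math.cos((2 * i + 1) * u * math.pi / 16)
                              * math.cos((2 * j + 1) * v * math.pi / 16))
            d[j][i] = total / 4
    return d


# A block holding only a DC coefficient yields a flat residual.
block = [[0.0] * 8 for _ in range(8)]
block[0][0] = 8.0
residual = inverse_dct_8x8(block)
print(round(residual[0][0], 6), round(residual[7][7], 6))  # 1.0 1.0
```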
(Adder 14)
The adder 14 generates the decoded image P by adding the prediction image Pred supplied from the prediction image generation unit 12 and the prediction residual D supplied from the inverse quantization / inverse transform unit 13. The generated decoded image P is stored in the frame memory 15.

(Loop filter 16)
The loop filter 16 has (1) a function as a deblocking filter (DF) that performs smoothing (deblocking processing) on the image around a block boundary or partition boundary in the decoded image P, and (2) a function as an adaptive filter (ALF: Adaptive Loop Filter) that performs adaptive filter processing using the filter parameter FP on the image on which the deblocking filter has acted.

(Quantization residual information decoding unit 111)
The quantization residual information decoding unit 111 is a configuration for decoding the quantized transform coefficient Coeff (PosX, PosY) for each frequency component (PosX, PosY) from the quantization residual information QD included in the encoded data # 1. Here, PosX and PosY are indexes representing the position of each frequency component in the frequency domain, and correspond to the above-described horizontal frequency u and vertical frequency v, respectively. The various syntaxes included in the quantization residual information QD are encoded by context adaptive binary arithmetic coding (CABAC: Context-based Adaptive Binary Arithmetic Coding). Hereinafter, the quantized transform coefficient Coeff may be simply referred to as a transform coefficient Coeff.

FIG. 36 is a block diagram showing a configuration of the quantized residual information decoding unit 111. As shown in FIG. 36, the quantized residual information decoding unit 111 includes a context index allocating unit 111a, a transform coefficient decoding unit 111b, a context variable managing unit 111c, and a context index changing unit 111d.

(Context index allocation unit 111a)
The context index allocation unit 111a allocates a context index ctxIdx to each frequency component according to the position of the frequency component in the frequency domain. More specifically, the context index assigning unit 111a assigns a context index ctxIdx determined by the following expression to the frequency component specified by PosX and PosY.

ctxIdx = ((PosY >> shift) << 2) + (PosX >> shift)
Here, the symbol “>>” represents a right bit shift operation, and the symbol “<<” represents a left bit shift operation. When the frequency domain is 4 × 4 components, shift = 0, and when the frequency domain is 8 × 8 components, shift = 1.
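
The allocation rule can be tried out with a minimal sketch. The function name context_index is hypothetical, and the sketch assumes that the row term of the expression uses PosY and the column term uses PosX, which is the combination under which every frequency component of a 4 × 4 region receives its own index and the frequency component (1, 0) receives index 1. The optional offset models a context index offset set for the frequency region to be processed.

```python
def context_index(pos_x: int, pos_y: int, size: int, offset: int = 0) -> int:
    """Return ctxIdx for the frequency component (pos_x, pos_y).

    size is the edge length of the frequency domain (4 or 8 components).
    Assumption: PosY supplies the row term and PosX the column term.
    """
    shift = {4: 0, 8: 1}[size]
    return offset + ((pos_y >> shift) << 2) + (pos_x >> shift)


print(context_index(1, 0, size=4))              # 1
print(context_index(1, 0, size=4, offset=200))  # 201
# In an 8 x 8 region, each adjacent 2 x 2 group shares one index.
print([context_index(x, y, size=8) for x, y in [(0, 0), (0, 1), (1, 0), (1, 1)]])
```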

FIG. 42 (a) is a diagram showing an example of the context index ctxIdx assigned by the context index assigning unit 111a to each frequency component when the frequency region to be processed is 4 × 4 components. FIG. 42B is a diagram illustrating an example of the context index ctxIdx allocated by the context index allocation unit 111a to each frequency component when the frequency region to be processed is 8 × 8 components.

As shown in FIG. 42 (a), when the frequency domain is 4 × 4 components, the context index ctxIdx is individually assigned to each frequency component. On the other hand, as shown in FIG. 42 (b), when the frequency domain is 8 × 8 components, the context index ctxIdx is assigned to each adjacent 2 × 2 component. For example, the context index ctxIdx = 0 shown in FIG. 42B is assigned to the frequency components (0, 0), (0, 1), (1, 0), (1, 1).

Note that the context index ctxIdx illustrated in FIGS. 42A to 42B may be used when decoding any one of the various syntaxes sig_flag, last_flag, abs_greater_one, coeff_abs_level, and sign described later (the same applies to the context indexes exemplified in other drawings). However, the specific context index allocated by the context index allocation unit 111a may differ depending on the type of syntax to be decoded. Therefore, the context index allocation unit 111a can also be described as allocating, to each syntax in the target frequency domain, a context index determined according to the type of the syntax and the position of the syntax in the target frequency domain.

Also, the context index ctxIdx for each frequency component illustrated in FIGS. 42A to 42B may be an increment value from the offset of the context index set for each frequency region to be processed (the same applies to the context indexes illustrated in other drawings). For example, when the context index offset set for the frequency region to be processed is 200, ctxIdx = 201 is assigned to the frequency component (1, 0) in FIG. The same applies to other frequency components.

(Transform coefficient decoding unit 111b)
The transform coefficient decoding unit 111b refers to the context index ctxIdx assigned to the target frequency component that is the frequency component to be processed, and decodes the various syntaxes sig_flag, last_flag, abs_greater_one, coeff_abs_level, and sign for the target frequency component. Also, the transform coefficient Coeff (PosX, PosY) for each frequency component is generated from the decoded syntaxes.

Also, the transform coefficient decoding unit 111b supplies the decoded syntax sig_flag (PosX, PosY) to the context index changing unit 111d.

(Decoding process of various syntaxes)
Decoding processing of various syntaxes sig_flag, last_flag, abs_greater_one, coeff_abs_level, and sign by the transform coefficient decoding unit 111b will be described. The transform coefficient decoding unit 111b generates multi-value data for each syntax by performing binary arithmetic code decoding processing and multi-value processing for each syntax to be decoded.

The binary arithmetic code decoding process is a process of decoding encoded data regarding the syntax to be decoded, included in the quantized residual information QD, into binary data. When decoding one symbol (1 bit of binary data, also referred to as a Bin), the transform coefficient decoding unit 111b first derives the context index ctxIdx assigned to the frequency component (PosX, PosY) to be processed. Next, it acquires the context variable CV specified by the context index ctxIdx from the context variable management unit 111c. Here, the context variable CV includes (1) a dominant symbol MPS (most probable symbol) having a high occurrence probability, and (2) a probability state index pStateIdx designating the occurrence probability of the dominant symbol MPS. The transform coefficient decoding unit 111b decodes the encoded data regarding the syntax to be decoded into binary data by performing arithmetic decoding according to the occurrence probability specified by the probability state index pStateIdx included in the acquired context variable CV. The dominant symbol MPS is 0 or 1. The dominant symbol MPS and the probability state index pStateIdx are updated by the context variable management unit 111c every time one symbol is decoded.

The multi-value process is a process for converting binary data decoded by the binary arithmetic code decoding process into multi-value data. By performing multi-value processing for each syntax, multi-value data for each syntax is generated.

The transform coefficient decoding unit 111b performs the binary arithmetic code decoding process and the multi-value process for the various syntaxes on the frequency components (PosX, PosY) to be processed in accordance with the scan order described later, and generates the transform coefficients Coeff (PosX, PosY) from the multi-value data generated for the various syntaxes by the multi-value process.

(Scanning order of frequency components by transform coefficient decoding unit 111b)
The scan of each frequency component by the transform coefficient decoding unit 111b is performed by either a zigzag scan or an adaptive scan. Hereinafter, each scan will be described with reference to FIGS. 43 to 44.

FIG. 43 (a) is a diagram for explaining a scan line in each scan. In both cases of zigzag scanning and adaptive scanning, scanning of each frequency component is performed along each scanning line indicated by a broken line in FIG. Each scan line can be characterized by PosX + PosY = L. Hereinafter, a scan line that satisfies PosX + PosY = L is also referred to as a scan line (L). For example, scan line (2) is a scan for frequency components (2, 0), (1, 1), (0, 2) that satisfy PosX + PosY = 2.

In either case of the zigzag scan and the adaptive scan, the scan is performed along the scan line (L + 1) after the scan is performed along the scan line (L).

Further, in FIG. 43A, since the frequency component along the scan line (0) is only (0, 0), the frequency component (0, 0) is scanned regardless of the scan direction. The same applies to the scan line (14).
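
The relation PosX + PosY = L can be made concrete with a short sketch that lists the frequency components on one scan line. The function name and the default 8 × 8 size are assumptions for illustration; the listing order (largest PosX first) is merely one choice and does not by itself fix the scan direction.

```python
def scan_line(L, width=8, height=8):
    """Return the components (PosX, PosY) with PosX + PosY == L.

    Components are listed from the largest PosX to the smallest.
    """
    x_max = min(L, width - 1)           # PosX cannot exceed Width - 1
    x_min = max(0, L - (height - 1))    # PosY = L - PosX cannot exceed Height - 1
    return [(x, L - x) for x in range(x_max, x_min - 1, -1)]


line2 = scan_line(2)     # the components named in the text for scan line (2)
line0 = scan_line(0)     # a single component
line14 = scan_line(14)   # a single component in an 8 x 8 region
```

For an 8 × 8 region, scan lines (0) and (14) each contain exactly one component, which is why the scan direction is irrelevant for them.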

(1) Zigzag Scan
FIG. 43B is a diagram showing the scan order of each frequency component by the zigzag scan. As shown in FIG. 43 (b), in the zigzag scan, each frequency component is scanned from the low frequency side to the high frequency side by alternately repeating a scan in the lower left direction (also referred to as BottomLeft or DownLeft) and a scan in the upper right direction (also referred to as TopRight or UpRight). That is, the scan direction is reversed between the scan line (L) and the scan line (L + 1). In FIG. 43B, the numbers assigned to the frequency components represent the scan order of the frequency components in the zigzag scan.
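
As a minimal sketch of the zigzag scan, the following generates a scan order by reversing the direction on each successive scan line. Which direction is used on scan line (1) is an assumption exposed as the hypothetical parameter first_direction; the function name is likewise illustrative.

```python
def zigzag_order(width, height, first_direction="BottomLeft"):
    """Return the components (PosX, PosY) in zigzag scan order.

    'first_direction' is the direction assumed for scan line (1); the
    direction is reversed on every following scan line.
    """
    order = []
    direction = first_direction
    for L in range(width + height - 1):
        x_max = min(L, width - 1)
        x_min = max(0, L - (height - 1))
        # BottomLeft: PosX decreases while PosY increases along the line.
        line = [(x, L - x) for x in range(x_max, x_min - 1, -1)]
        if direction == "TopRight":
            line.reverse()
        order.extend(line)
        if L >= 1:  # scan line (0) has one component, so no direction is consumed
            direction = "TopRight" if direction == "BottomLeft" else "BottomLeft"
    return order


order_4x4 = zigzag_order(4, 4)
```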

(2) Adaptive Scan
In the adaptive scan, on the other hand, the scan order for the frequency components yet to be decoded is determined according to the bias in the appearance frequency of non-zero transform coefficients among the transform coefficients already decoded. More specifically, the scan direction in the scan line (L + 1) is determined according to the bias in the appearance frequency of non-zero transform coefficients decoded from the scan line (0) to the scan line (L). FIG. 43C shows an example of the scan order of the adaptive scan.

FIG. 44 is a flowchart showing a flow of adaptive scan processing performed by the transform coefficient decoding unit 111b.

(Step S101)
First, the transform coefficient decoding unit 111b initializes the scan direction by setting the scan direction scanDirection to BottomLeft (lower left direction). Note that the scan direction scanDirection initialized in this step is applied to the scan line (1).

(Step S102)
Subsequently, the transform coefficient decoding unit 111b sets variables PosX and PosY along the scan line (L) to be processed. Here, the value of L is sequentially updated with L = 0 as an initial value.

(Step S103)
Subsequently, the transform coefficient decoding unit 111b decodes the syntax sig_flag for the frequency component specified by PosX and PosY set in step S102.

(Step S104)
Subsequently, the transform coefficient decoding unit 111b determines whether or not the value of the syntax sig_flag decoded in step S103 is non-zero.

(Step S105)
When the value of the syntax sig_flag is non-zero (Yes in Step S104), the variable length code decoding unit 11 determines whether PosX> PosY.

(Step S106)
If PosX> PosY (Yes in step S105), the transform coefficient decoding unit 111b adds 1 to the variable scanTopRight. Here, the variable scanTopRight is a variable representing the frequency of appearance of non-zero transform coefficients for frequency components satisfying PosX> PosY. The shaded area in FIG. 45A indicates an area for counting the variable scanTopRight.

(Step S107)
If PosX> PosY is not satisfied (No in step S105), the transform coefficient decoding unit 111b determines whether PosY> PosX.

(Step S108)
If PosY > PosX (Yes in step S107), the transform coefficient decoding unit 111b adds 1 to the variable scanBottomLeft. Here, the variable scanBottomLeft is a variable representing the frequency of appearance of non-zero transform coefficients for frequency components satisfying PosX < PosY. The shaded area in FIG. 45 (b) indicates the area over which the variable scanBottomLeft is counted.

(Step S109)
Subsequently, the transform coefficient decoding unit 111b decodes the syntax last_flag for the frequency component to be decoded.

(Step S110)
Subsequently, the transform coefficient decoding unit 111b determines whether or not the value of the syntax last_flag decoded in step S109 is 1. If the value of the syntax last_flag is 1, the process ends.

(Step S111)
When the value of the syntax sig_flag is 0 (No in step S104), or when the value of the syntax last_flag is not 1 (No in step S110), the transform coefficient decoding unit 111b determines whether PosX and PosY for the target frequency component correspond to a turn-back position in the frequency domain. Here, the target frequency component is at a turn-back position when the following condition is satisfied.

- When the scan direction scanDirection = DownLeft:
PosX = 0 or PosY = Height−1
- When the scan direction scanDirection = UpRight:
PosY = 0 or PosX = Width−1

Here, "Height" represents the number of frequency components in the vertical direction of the frequency domain, and "Width" represents the number of frequency components in the horizontal direction of the frequency domain. For example, as shown in FIG. 39A, when the frequency domain is composed of 8 × 8 components, Height = 8 and Width = 8.

If PosX and PosY for the target frequency component do not correspond to a turn-back position in the frequency domain (No in this step), the process returns to step S102, and PosX and PosY are set for the next frequency component in the processing order along the already set scan direction scanDirection.

(Step S112)
When PosX and PosY for the target frequency component correspond to a turn-back position in the frequency domain (Yes in step S111), the transform coefficient decoding unit 111b determines whether scanTopRight ≧ scanBottomLeft.

(Step S113)
If scanTopRight ≧ scanBottomLeft (Yes in step S112), the scan direction scanDirection for the next scan line in the processing order is set to BottomLeft. The process then returns to step S102, and PosX and PosY are set along the next scan line in the processing order with BottomLeft as the scan direction.

(Step S114)
If scanTopRight ≧ scanBottomLeft is not satisfied (No in step S112), the scan direction scanDirection for the next scan line in the processing order is set to TopRight. The process then returns to step S102, and PosX and PosY are set along the next scan line in the processing order with TopRight as the scan direction.

In the above description, the case where the scan direction in the scan line (L + 1) is determined according to the bias in the appearance frequency of the non-zero transform coefficients decoded from the scan line (0) to the scan line (L) has been taken as an example, but the present embodiment is not limited to this. The scan direction in the scan line (L + 1) may instead be determined according to the bias in the appearance frequency of the non-zero transform coefficients decoded along the scan line (L) alone. That is, scanTopRight and scanBottomLeft may be initialized for each scan line.
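
The flow of steps S101 to S114 can be sketched as below. This is a simplified illustration, not the device itself: the callables decode_sig and decode_last are hypothetical stand-ins for the arithmetic decoding of sig_flag and last_flag, and reaching the end of a scan line is treated as reaching the turn-back position.

```python
def adaptive_scan(width, height, decode_sig, decode_last):
    """Return the order in which components (PosX, PosY) are visited."""
    scan_top_right = 0       # non-zero count where PosX > PosY
    scan_bottom_left = 0     # non-zero count where PosY > PosX
    direction = "BottomLeft"                      # S101
    order = []
    for L in range(width + height - 1):
        x_max = min(L, width - 1)
        x_min = max(0, L - (height - 1))
        line = [(x, L - x) for x in range(x_max, x_min - 1, -1)]
        if direction == "TopRight":
            line.reverse()
        for x, y in line:                         # S102
            order.append((x, y))
            if decode_sig(x, y):                  # S103, S104
                if x > y:                         # S105
                    scan_top_right += 1           # S106
                elif y > x:                       # S107
                    scan_bottom_left += 1         # S108
                if decode_last(x, y):             # S109, S110
                    return order
        # End of line corresponds to the turn-back position (S111).
        if scan_top_right >= scan_bottom_left:    # S112
            direction = "BottomLeft"              # S113
        else:
            direction = "TopRight"                # S114
    return order


# Toy input: a single non-zero coefficient at (0, 1), no last_flag set.
nonzero = {(0, 1)}
order = adaptive_scan(4, 4, lambda x, y: (x, y) in nonzero, lambda x, y: False)
```

In this toy input, the non-zero coefficient on the PosY > PosX side makes scanBottomLeft exceed scanTopRight, so scan line (2) is traversed in the TopRight direction. Resetting the two counters at the end of each line would give the per-line variant mentioned above.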

(Context variable manager 111c)
The context variable management unit 111c manages the context variable CV corresponding to each context. Each context variable CV is stored in a memory provided in the context variable management unit 111c. Each context variable CV includes (1) a dominant symbol MPS (most probable symbol) having a high occurrence probability, and (2) a probability state index pStateIdx designating the occurrence probability of the dominant symbol MPS. Each context variable CV is specified by a context index ctxIdx. The context variable CV specified by the context index ctxIdx is also expressed as CV (ctxIdx).

The context variable management unit 111c updates the probability state index pStateIdx every time the transform coefficient decoding unit 111b decodes one symbol. By updating the probability state index pStateIdx, the occurrence probability specified by the probability state index pStateIdx changes.

In arithmetic codes, decoding is performed according to the occurrence probability of symbols. In general, when the symbol of a certain syntax element is 1, the probability that the next decoded symbol is also 1 is high, so the occurrence probability of the decoded value (the probability state index pStateIdx) is adaptively updated so as to increase. Further, since the occurrence probability of symbols for low frequency components differs from that of symbols for high frequency components, the context (probability state index pStateIdx) to be used is switched for each frequency component. In this way, conditional probabilities are used, in which the occurrence probability differs depending on conditions such as the frequency component.
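
The adaptive update of a context variable can be illustrated with the following toy model. The transition rule used here (step the state up by one when the decoded symbol equals the MPS, step it down by one otherwise, and flip the MPS when the state is already 0) and the bound MAX_STATE are assumptions chosen for simplicity; the description does not specify the actual transition rule, and practical arithmetic coders typically use table-driven transitions instead.

```python
from dataclasses import dataclass

MAX_STATE = 62  # assumed upper bound of pStateIdx for this sketch only


@dataclass
class ContextVariable:
    """Toy context variable CV: dominant symbol and probability state."""
    mps: int = 0            # dominant symbol MPS, 0 or 1
    p_state_idx: int = 0    # larger value = MPS assumed more probable

    def update(self, decoded_bin: int) -> None:
        if decoded_bin == self.mps:
            # The MPS occurred again: raise its estimated probability.
            self.p_state_idx = min(self.p_state_idx + 1, MAX_STATE)
        elif self.p_state_idx == 0:
            # The MPS was already least certain: swap the dominant symbol.
            self.mps = 1 - self.mps
        else:
            self.p_state_idx -= 1


# One context variable per context index, as with CV (ctxIdx).
contexts = {ctx_idx: ContextVariable() for ctx_idx in range(4)}
for b in (0, 0, 1):
    contexts[2].update(b)
```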

(Context index change unit 111d)
The context index changing unit 111d refers to the syntax sig_flag (PosX, PosY) supplied from the transform coefficient decoding unit 111b, determines the bias in the distribution of transform coefficients in the frequency domain, and changes the context index assigned to the undecoded components in the frequency domain according to the determination result.

(Context index change processing example 1)
A first specific example of the context index change process by the context index changing unit 111d is described below.

In this processing example, at the stage where a predetermined number of syntaxes sig_flag for frequency components on the low frequency side have been decoded, the context index changing unit 111d refers to the decoded syntaxes sig_flag and determines the bias in the distribution of non-zero transform coefficients in the frequency domain. It then changes the context index assigned to the undecoded components according to the determination result.

When the frequency domain to be processed is an 8 × 8 component, the context index changing unit 111d refers, for example, to the syntax sig_flag supplied from the transform coefficient decoding unit 111b for the hatched components in FIG., that is, the syntax sig_flag for the frequency components (0, 0), (0, 1), (0, 2), (1, 0), (1, 1), and (2, 0), and determines the bias in the distribution of non-zero transform coefficients for those frequency components. In other words, the context index changing unit 111d refers to the syntax sig_flag for the frequency components along the scan line (0) to the scan line (2), and determines the bias in the distribution of non-zero transform coefficients for those frequency components.

FIG. 47 is a flowchart showing a specific processing flow of the context index changing unit 111d in this processing example.

(Step S201)
The context index changing unit 111d first sets variables PosX and PosY. Here, the variables PosX and PosY are sequentially set according to a predetermined scan order (the above-described zigzag scan or adaptive scan) from the low frequency side of the frequency region to be processed.

(Step S202)
Subsequently, the context index changing unit 111d determines whether or not the value of the syntax sig_flag for the frequency component specified by PosX and PosY is non-zero. The syntax sig_flag to be determined in this step is supplied from the transform coefficient decoding unit 111b.

(Step S203)
When the value of the syntax sig_flag is non-zero (Yes in step S202), the context index changing unit 111d determines whether PosX > PosY.

(Step S204)
If PosX> PosY (Yes in step S203), the context index changing unit 111d adds 1 to the variable coeffTopRight and proceeds to the process of step S207. Here, the variable coeffTopRight is a variable representing the frequency of appearance of non-zero transform coefficients for frequency components satisfying PosX> PosY.

(Step S205)
If PosX > PosY is not satisfied (No in step S203), the context index changing unit 111d determines whether PosY > PosX. If PosY > PosX is not satisfied either (that is, if PosX = PosY), the process proceeds to step S207.

(Step S206)
If PosY> PosX (Yes in step S205), the context index changing unit 111d adds 1 to the variable coeffBottomLeft and proceeds to the process of step S207. Here, the variable coeffBottomLeft is a variable representing the frequency of occurrence of non-zero transform coefficients for frequency components satisfying PosX <PosY.

(Step S207)
Subsequently, the context index changing unit 111d determines whether or not processing for a predetermined number of components has been completed. Here, for example, when the frequency region to be processed is an 8 × 8 component, the predetermined number may be 6 or 10 as shown in FIG. If the processing for the predetermined number of components has not been completed, the process returns to step S201, and PosX and PosY for the next component are set in the processing order.

(Step S208)
When the processing for the predetermined number of components is completed (Yes in step S207), the context index changing unit 111d determines whether coeffBottomLeft > coeffTopRight. If coeffBottomLeft ≦ coeffTopRight, the process ends without changing the context index. In other words, if the number of non-zero transform coefficients in the region satisfying PosX > PosY is equal to or greater than the number of non-zero transform coefficients in the region satisfying PosY > PosX, the process ends without changing the context index.

FIG. 46B is a diagram illustrating an example of the context indexes allocated by the context index allocation unit 111a and left unchanged, when the frequency domain to be processed is an 8 × 8 component.

(Step S209)
When coeffBottomLeft > coeffTopRight (Yes in step S208), the context index changing unit 111d changes the context index ctxIdx for the undecoded components in the frequency domain. That is, when the number of non-zero transform coefficients in the region satisfying PosY > PosX is larger than the number of non-zero transform coefficients in the region satisfying PosX > PosY, the context index ctxIdx for the undecoded components in the frequency domain is changed.

FIG. 46C is a diagram illustrating an example of the context index changed in this step when the frequency domain to be processed is an 8 × 8 component. As shown in FIG. 46 (c), in this step the context index assigned to each frequency component by the context index assigning unit 111a is subjected to a line-symmetric transformation with the diagonal line satisfying PosX = PosY as the axis of symmetry. That is, the context index ctxIdx (PosX, PosY) allocated to the frequency component (PosX, PosY) by the context index allocation unit 111a is changed to ctxIdx (PosY, PosX) in this step. In other words, if the context index for the frequency component (PosX, PosY) after the change in this step is represented as ctxIdx ′ (PosX, PosY), then
ctxIdx ′ (PosX, PosY) = ctxIdx (PosY, PosX)
is satisfied.

In the above description, the context index ctxIdx is changed when coeffBottomLeft > coeffTopRight, but the opposite configuration may also be used, that is, a configuration in which the context index ctxIdx is changed when coeffBottomLeft < coeffTopRight and is not changed otherwise.
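
Steps S201 to S209 can be sketched as follows. The dictionary representation keyed by (PosX, PosY), the function names, and the assumption of a square frequency region are all illustrative choices, not part of the description. For simplicity the sketch transposes the whole table; entries for components already decoded are simply never consulted again.

```python
def count_bias(sig_flags, positions):
    """Count non-zero sig_flags on each side of the diagonal PosX = PosY."""
    coeff_top_right = coeff_bottom_left = 0
    for x, y in positions:                      # S201
        if sig_flags.get((x, y), 0) != 0:       # S202
            if x > y:                           # S203
                coeff_top_right += 1            # S204
            elif y > x:                         # S205
                coeff_bottom_left += 1          # S206
    return coeff_top_right, coeff_bottom_left


def change_ctx_idx(ctx_idx, sig_flags, positions):
    """Return ctxIdx' with ctxIdx'(PosX, PosY) = ctxIdx(PosY, PosX) if biased."""
    top_right, bottom_left = count_bias(sig_flags, positions)
    if bottom_left > top_right:                 # S208
        return {(x, y): ctx_idx[(y, x)] for (x, y) in ctx_idx}   # S209
    return dict(ctx_idx)                        # unchanged


# Toy 4 x 4 table whose context index depends only on PosX.
table = {(x, y): x for x in range(4) for y in range(4)}
low_freq = [(0, 0), (1, 0), (0, 1), (2, 0), (1, 1), (0, 2)]  # scan lines (0)-(2)
changed = change_ctx_idx(table, {(0, 1): 1, (0, 2): 1, (1, 0): 1}, low_freq)
```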

FIG. 48 is a sequence diagram illustrating an example of a cooperative operation between the context index changing unit 111d that performs the operation of this processing example and the transform coefficient decoding unit 111b.

(Step S301)
The transform coefficient decoding unit 111b decodes transform coefficients along the scan line (0) to the scan line (2). Further, the decoded syntax sig_flag for each frequency component along those scan lines is supplied to the context index changing unit 111d. In FIG. 48, a set of decoded syntax sig_flag for each frequency component along the scan line (0) to the scan line (2) is expressed as sig_flag (0) to (2).

(Step S302)
The context index changing unit 111d refers to the value of the decoded syntax sig_flag supplied from the transform coefficient decoding unit 111b, and determines the bias in the distribution of non-zero transform coefficients for the frequency components along the scan line (0) to the scan line (2).

(Step S303)
Subsequently, the context index changing unit 111d changes the context index for the undecoded frequency components in accordance with the bias in the distribution of non-zero transform coefficients for the frequency components along the scan line (0) to the scan line (2). The changed context index ctxIdx ′ is supplied to the transform coefficient decoding unit 111b.

(Step S304)
The transform coefficient decoding unit 111b decodes transform coefficients for undecoded frequency components using the changed context index ctxIdx ′ supplied from the context index changing unit 111d.

In general, if the non-zero transform coefficients appear with a bias when some of the frequency components have been decoded, there is a high possibility that the same bias is present in the remaining frequency components as well.
As described above, the context index changing unit 111d changes the context index for the undecoded syntax in accordance with the bias in the distribution of non-zero transform coefficients in the frequency domain, so that each syntax can be decoded using a more appropriate occurrence probability.

(Context index change processing example 2)
Subsequently, a second specific example of the context index changing process by the context index changing unit 111d will be described.

In this processing example, the determination of the non-zero transform coefficient distribution in the frequency domain and the change of the context index are performed for each scan line.

FIG. 49 is a flowchart showing a specific processing flow of the context index changing unit 111d in this processing example. As shown in FIG. 49, in this processing example, steps S201, S207, and S209 for the above-described (context index change processing example 1) are replaced with steps S401, S407, and S409 described below, respectively.

(Step S401)
The context index changing unit 111d sequentially sets variables PosX and PosY along the scan lines (0) to (L).

(Step S407)
The context index changing unit 111d determines whether or not the processing for the scan lines (0) to (L) has been completed.

(Step S409)
When coeffBottomLeft> coeffTopRight (Yes in step S208), the context index changing unit 111d changes the context index ctxIdx for the frequency component along the scan line (L + 1).

FIG. 50 is a diagram for explaining the present processing example when the frequency domain to be processed is 4 × 4 components. As shown in FIG. 50A, for example, the context index ctxIdx for each frequency component along the scan line (3) is changed according to the values of coeffBottomLeft and coeffTopRight calculated along the scan line (0) to the scan line (2).

FIG. 50 (b) shows an example of the context index assigned to each frequency component along the scan line (0) to the scan line (3) by the context index assigning unit 111a. An example of the context index changed by the context index changing unit 111d according to the values of coeffBottomLeft and coeffTopRight calculated along the scan line (0) to the scan line (2) is also shown.

As shown in FIG. 50C, the context index ctxIdx (PosX, PosY) allocated along the scan line (3) by the context index allocation unit 111a is changed to ctxIdx (PosY, PosX) by the context index changing unit 111d.

FIG. 51 is a diagram for explaining the present processing example when the frequency region to be processed is an 8 × 8 component. As shown in FIG. 51A, for example, the context index ctxIdx for each frequency component along the scan line (4) is changed according to the values of coeffBottomLeft and coeffTopRight calculated along the scan line (0) to the scan line (3).

FIG. 51 (b) shows an example of the context index assigned to the frequency components along the scan lines (0) to (4) by the context index assigning unit 111a. An example of the context index changed by the context index changing unit 111d is also shown.

As shown in FIG. 51 (c), the context index ctxIdx (PosX, PosY) allocated along the scan line (L + 1) by the context index allocation unit 111a is changed to ctxIdx (PosY, PosX) by the context index changing unit 111d. In other words, if the context index for the frequency component (PosX, PosY) after the change by the context index changing unit 111d is expressed as ctxIdx ′ (PosX, PosY), then
ctxIdx ′ (PosX, PosY) = ctxIdx (PosY, PosX)
is satisfied.

In FIG. 51 (c), the context indexes assigned to the frequency components along the scan line (2) and the scan line (3) are also changed. However, since decoding of the transform coefficients for these frequency components has already been completed, the changed context indexes are not referred to in the actual decoding process.

In the above description, the case where the scan direction in the scan line (L + 1) is determined according to the values of scanTopRight and scanBottomLeft calculated along the scan line (0) to the scan line (L) is taken as an example. However, the present embodiment is not limited to this, and the scan direction in the scan line (L + 1) may be determined according to the values of scanTopRight and scanBottomLeft calculated along the scan line (L). That is, scanTopRight and scanBottomLeft may be initialized for each scan line.

Also, scanTopRight and scanBottomLeft may be initialized every time a predetermined number (for example, two) of scan lines have been scanned. For example, a configuration may be adopted in which the context index assigned to the frequency components along the scan line (L + 1) is changed according to the scanTopRight and scanBottomLeft accumulated by scanning the scan line (L-1) and the scan line (L).

In this processing example, it is preferable, for example, not to change the context index allocated to the frequency components along the scan lines (0) to (2), and to change the context index assigned to the frequency components along the scan line (3) and subsequent scan lines.

This is because, at the time when the scan lines (0) to (2) are decoded, a sufficient number of frequency components have not yet been decoded, so that an error is likely to occur in estimating the bias in the distribution of non-zero transform coefficients for the frequency components along such scan lines. The above configuration prevents the decrease in encoding efficiency resulting from such estimation errors.

FIG. 52B shows an example of the context index assigned by the context index assigning unit 111a to the frequency components (0, 3), (1, 2), (2, 1), (3, 0) along the scan line (3), and FIG. 52C shows an example of the context index changed by the context index changing unit 111d.

FIG. 53 is a sequence diagram illustrating an example of a cooperative operation between the context index changing unit 111d that performs the operation of the present processing example and the transform coefficient decoding unit 111b.

(Step S501)
The transform coefficient decoding unit 111b decodes transform coefficients along the scan line (0) to the scan line (2). Further, the decoded syntax sig_flag for each frequency component along those scan lines is supplied to the context index changing unit 111d. In FIG. 53, a set of decoded syntax sig_flag for each frequency component along scan line (0) to scan line (2) is denoted as sig_flag (0) to (2) (the same applies hereinafter).

(Step S502)
The context index changing unit 111d refers to the value of the decoded syntax sig_flag supplied from the transform coefficient decoding unit 111b, and determines the bias in the distribution of non-zero transform coefficients for the frequency components along the scan line (0) to the scan line (2).

(Step S503)
Subsequently, the context index changing unit 111d changes the context index for the undecoded frequency components along the scan line (3) in accordance with the bias in the distribution of non-zero transform coefficients for the frequency components along the scan line (0) to the scan line (2). The changed context index ctxIdx ′ is supplied to the transform coefficient decoding unit 111b. In FIG. 53, the set of changed context indexes for the undecoded frequency components along the scan line (3) is denoted as ctxIdx ′ (3) (the same applies hereinafter).

(Step S504)
The transform coefficient decoding unit 111b decodes transform coefficients for undecoded frequency components along the scan line (3) using the changed context index ctxIdx ′ supplied from the context index changing unit 111d. Also, the decoded syntax sig_flag for each frequency component along the scan line (3) is supplied to the context index changing unit 111d.

(Step S505)
The context index changing unit 111d refers to the value of the decoded syntax sig_flag supplied from the transform coefficient decoding unit 111b, and determines the bias in the distribution of non-zero transform coefficients for the frequency components along the scan lines (0) to (3).

(Step S506)
Subsequently, the context index changing unit 111d changes the context index for the undecoded frequency components along the scan line (4) in accordance with the bias in the distribution of non-zero transform coefficients for the frequency components along the scan line (3). The changed context index ctxIdx ′ is supplied to the transform coefficient decoding unit 111b.

(Step S507)
The transform coefficient decoding unit 111b decodes transform coefficients for undecoded frequency components along the scan line (4) using the changed context index ctxIdx ′ supplied from the context index changing unit 111d. Also, the decoded syntax sig_flag for each frequency component along the scan line (4) is supplied to the context index changing unit 111d.

(Step S508)
The context index changing unit 111d refers to the value of the decoded syntax sig_flag supplied from the transform coefficient decoding unit 111b, and determines the bias in the distribution of non-zero transform coefficients for the frequency components along the scan lines (0) to (4).

Thereafter, by repeating the same operations as in steps S506 to S508, the transform coefficients for each frequency component included in the processing target frequency region are decoded.

Also when the context index changing unit 111d performs the operation of this processing example, the transform coefficient decoding unit 111b can perform decoding of each syntax using a more appropriate occurrence probability.
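
The per-scan-line variant of this processing example can be sketched as follows: the counts accumulated along scan lines (0) to (L) decide whether the context indexes on scan line (L + 1) are mirrored. The function names, the dictionary representation keyed by (PosX, PosY), and the square region of side size are assumptions for illustration.

```python
def scan_line(L, size):
    """Components (PosX, PosY) with PosX + PosY == L in a size x size region."""
    x_max = min(L, size - 1)
    x_min = max(0, L - (size - 1))
    return [(x, L - x) for x in range(x_max, x_min - 1, -1)]


def ctx_idx_for_next_line(ctx_idx, sig_flags, L, size):
    """Return the context indexes to use on scan line (L + 1).

    'sig_flags' holds the sig_flag values decoded on scan lines (0)..(L).
    """
    coeff_top_right = coeff_bottom_left = 0
    for line in range(L + 1):                       # S401, S407
        for x, y in scan_line(line, size):
            if sig_flags.get((x, y), 0) != 0:
                if x > y:
                    coeff_top_right += 1
                elif y > x:
                    coeff_bottom_left += 1
    swap = coeff_bottom_left > coeff_top_right      # S208
    return {                                        # S409
        (x, y): ctx_idx[(y, x)] if swap else ctx_idx[(x, y)]
        for x, y in scan_line(L + 1, size)
    }


# Toy 4 x 4 table whose context index depends only on PosX.
table = {(x, y): x for x in range(4) for y in range(4)}
line3 = ctx_idx_for_next_line(table, {(0, 1): 1, (0, 2): 1}, L=2, size=4)
```

Calling the function once per scan line, each time with the sig_flag values decoded so far, mirrors the alternation between the two units described in steps S501 to S508.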

<Modification 1>
In the above-described (context index change processing example 1) and (context index change processing example 2), a configuration in which the transform coefficient decoding unit 111b supplies sig_flag to the context index changing unit 111d, and the context index changing unit 111d receives sig_flag and determines the bias in the distribution of non-zero transform coefficients, has been described with reference to FIG. 5, but the present embodiment is not limited to this. For example, a configuration may be used in which the transform coefficient decoding unit 111b supplies the variables scanBottomLeft and scanTopRight to the context index changing unit 111d, and the context index changing unit 111d determines the bias in the distribution of non-zero transform coefficients by comparing the value of the variable scanBottomLeft with the value of the variable scanTopRight. In such a configuration, when scanBottomLeft > scanTopRight, the context index ctxIdx for the undecoded components in the frequency domain is changed.

With the above configuration, the processing amount by the context index changing unit 111d can be reduced.

<Modification 2>
In addition, a configuration may be used in which the transform coefficient decoding unit 111b supplies the scan direction scanDirection for each scan line to the context index changing unit 111d, and the context index changing unit 111d determines the bias in the distribution of non-zero transform coefficients by referring to the scan direction scanDirection. In such a configuration, when scanDirection = TopRight, the context index ctxIdx for the undecoded components in the frequency domain is changed.

Also with the above configuration, the processing amount by the context index changing unit 111d can be reduced.

<Modification 3>
Further, the context index changing unit 111d may be configured to limit the frequency components whose context index is subject to change, and to change the context index only for frequency components at predetermined positions (PosX, PosY).

For example, a configuration may be adopted in which the context index can be changed only for frequency components satisfying the following condition:
ABS (PosX−PosY) ≧ TH1
Here, ABS (...) is a symbol representing the absolute value of the expression in parentheses, and TH1 represents a predetermined threshold value. FIG. 54A is a diagram showing the frequency components whose context index can be changed when the frequency region to be processed is 4 × 4 components and TH1 = 1. In the example shown in FIG. 54A, the context index can be changed for the hatched frequency components, but the context index is not changed for the other frequency components. The same applies to the case where the frequency region to be processed is an 8 × 8 component.

For frequency components near the diagonal line satisfying PosX = PosY in the frequency domain, changing the context index may not improve the encoding efficiency as expected. For example, when a diagonal edge exists in the processing target block, non-zero transform coefficients are generated for frequency components near the diagonal line satisfying PosX = PosY, and changing the context index for such components may not improve the encoding efficiency as expected. This is because, for frequency components in the vicinity of the diagonal line satisfying PosX = PosY, an error is likely to occur in estimating the bias in the distribution of non-zero transform coefficients.

According to this modification, the encoding efficiency can be improved while reducing the processing amount.

Moreover, in this modification, the configuration may be such that the context index can be changed only for frequency components satisfying the following condition:
PosX ≦ TH2 or PosY ≦ TH2
FIG. 54B is a diagram showing the frequency components whose context index can be changed when the frequency domain to be processed has 4 × 4 components and TH2 = 1. In the example shown in FIG. 54B, the context index can be changed for the hatched frequency components, but is not changed for the other frequency components. The same applies when the frequency domain to be processed has 8 × 8 components.

As shown in FIG. 54B, by adopting a configuration in which the context index can be changed only for frequency components in the vicinity of PosX = 0 and PosY = 0, the encoding efficiency can be improved while the processing amount is reduced.
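
As an illustration only, the two eligibility conditions of this modification can be sketched in Python as follows. The function names and the list-of-lists layout (rows indexed by PosY, columns by PosX) are assumptions introduced for this sketch and do not appear in the embodiment.

```python
def changeable_off_diagonal(pos_x, pos_y, th1):
    """Condition ABS(PosX - PosY) >= TH1: exclude components near the diagonal."""
    return abs(pos_x - pos_y) >= th1


def changeable_near_axes(pos_x, pos_y, th2):
    """Condition PosX <= TH2 or PosY <= TH2: keep components near PosX = 0 or PosY = 0."""
    return pos_x <= th2 or pos_y <= th2


def changeable_map(size, condition, threshold):
    """Hypothetical helper: entry [pos_y][pos_x] is True where the
    context index may be changed under the given condition."""
    return [[condition(x, y, threshold) for x in range(size)]
            for y in range(size)]
```

For a 4 × 4 frequency domain with TH1 = 1, the first condition excludes only the four components on the diagonal PosX = PosY.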

<Modification 4>
In the above description, the context index changing unit 111d has been described as changing the context index ctxIdx (PosX, PosY) for undecoded frequency components to ctxIdx ′ (PosX, PosY) = ctxIdx (PosY, PosX) in accordance with the bias of the distribution of the decoded non-zero transform coefficients in the frequency domain to be processed, but the method of changing the context index in the present embodiment is not limited to this example.

This modification describes a configuration in which different context indexes are allocated for each of the following three cases in the frequency domain to be processed: (1) the decoded non-zero transform coefficients are biased in the X direction (horizontal direction), (2) the decoded non-zero transform coefficients are biased in the Y direction (vertical direction), and (3) the decoded non-zero transform coefficients have little bias.

First, the context index changing unit 111d determines the context group ctxgroup by comparing coeffTopRight and coeffBottomLeft. Specifically, the context index changing unit 111d determines the context group ctxgroup as follows.

ctxgroup = 1: when coeffTopRight ≧ 2 × coeffBottomLeft
ctxgroup = 2: when 2 × coeffTopRight ≦ coeffBottomLeft
ctxgroup = 0: otherwise (coeffTopRight < 2 × coeffBottomLeft and 2 × coeffTopRight > coeffBottomLeft)
That is, the context index changing unit 111d determines ctxgroup = 1 when the decoded non-zero transform coefficients are biased in the horizontal direction, ctxgroup = 2 when they are biased in the vertical direction, and ctxgroup = 0 when they have almost no bias.

Subsequently, the context index changing unit 111d determines a context index ctxIdx (PosX, PosY) for the frequency components (PosX, PosY) by the following formula.

ctxIdx (PosX, PosY) = (4 × PosY + PosX) + 16 × ctxgroup
FIG. 55A is a diagram showing the context index determined by the context index changing unit 111d when coeffTopRight < 2 × coeffBottomLeft and 2 × coeffTopRight > coeffBottomLeft, that is, when there is almost no bias in the decoded non-zero transform coefficients. FIG. 55B is a diagram showing the context index determined by the context index changing unit 111d when coeffTopRight ≧ 2 × coeffBottomLeft, that is, when the decoded non-zero transform coefficients are biased in the horizontal direction. FIG. 55C is a diagram showing the context index determined by the context index changing unit 111d when 2 × coeffTopRight ≦ coeffBottomLeft, that is, when the decoded non-zero transform coefficients are biased in the vertical direction.

As shown in FIGS. 55A to 55C, different context index sets are determined for the frequency domain to be processed according to the bias of the decoded non-zero transform coefficient.

Assuming that the context index assigning unit 111a assigns the context index shown in FIG. 55 (a) to each frequency component in the frequency domain to be processed, the context index changing unit 111d changes the context index assigned to the undecoded frequency components to the context index shown in FIG. 55 (b) or 55 (c) according to the bias of the decoded non-zero transform coefficients.
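
The context group decision and the context index formula of this modification can be sketched as follows. This is a minimal illustration for a 4 × 4 frequency domain; the Python function names are assumptions made for the sketch. When both counts are 0, both inequalities hold at once; the sketch tests the horizontal condition first, which is an assumption about a tie the text does not address.

```python
def context_group(coeff_top_right, coeff_bottom_left):
    """Return ctxgroup: 1 = horizontal bias, 2 = vertical bias, 0 = little bias."""
    if coeff_top_right >= 2 * coeff_bottom_left:
        return 1  # checked first (assumed tie-break when both counts are 0)
    if 2 * coeff_top_right <= coeff_bottom_left:
        return 2
    return 0


def ctx_idx(pos_x, pos_y, ctxgroup):
    """ctxIdx(PosX, PosY) = (4 * PosY + PosX) + 16 * ctxgroup."""
    return (4 * pos_y + pos_x) + 16 * ctxgroup
```

With this formula each context group occupies its own block of 16 indexes (0 to 15, 16 to 31, and 32 to 47), so the three cases never share a context.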

<Modification 5>
Alternatively, different sets of context indexes may be used for the case where the distribution of the decoded non-zero transform coefficients is biased and the case where it has almost no bias, and, when the distribution is biased, the context indexes may additionally be swapped according to whether the decoded non-zero transform coefficients are biased in the horizontal direction or in the vertical direction.

In this modification, the context index changing unit 111d first determines the context group ctxgroup and the replacement index swap by comparing coeffTopRight and coeffBottomLeft. Specifically, the context index changing unit 111d determines the context group ctxgroup and the replacement index swap as follows.

ctxgroup = 1, swap = 0: when coeffTopRight ≧ 2 × coeffBottomLeft
ctxgroup = 1, swap = 1: when 2 × coeffTopRight ≦ coeffBottomLeft
ctxgroup = 0, swap = 0: when coeffTopRight < 2 × coeffBottomLeft and 2 × coeffTopRight > coeffBottomLeft
That is, the context index changing unit 111d determines ctxgroup = 1 when the decoded non-zero transform coefficients are biased, and ctxgroup = 0 when they have almost no bias. In addition, it determines the replacement index swap = 0 when the decoded non-zero transform coefficients are biased in the horizontal direction, and swap = 1 when they are biased in the vertical direction. The context index changing unit 111d then determines the context index ctxIdx (PosX, PosY) for the frequency component (PosX, PosY) according to the value of swap, as follows.

When swap = 1: ctxIdx (PosX, PosY) = (4 × PosX + PosY) + 16 × ctxgroup
When swap = 0: ctxIdx (PosX, PosY) = (4 × PosY + PosX) + 16 × ctxgroup
FIG. 56A is a diagram showing the context index determined by the context index changing unit 111d when coeffTopRight < 2 × coeffBottomLeft and 2 × coeffTopRight > coeffBottomLeft, that is, when there is almost no bias in the decoded non-zero transform coefficients. FIG. 56B is a diagram showing the context index determined by the context index changing unit 111d when coeffTopRight ≧ 2 × coeffBottomLeft, that is, when the decoded non-zero transform coefficients are biased in the horizontal direction. FIG. 56C is a diagram showing the context index determined by the context index changing unit 111d when 2 × coeffTopRight ≦ coeffBottomLeft, that is, when the decoded non-zero transform coefficients are biased in the vertical direction.

As shown in FIGS. 56A to 56C, different context indexes are determined for the frequency domain to be processed according to the bias of the decoded non-zero transform coefficient.

Note that the swap value for the biased case may also be determined in the opposite manner to the above description; that is, the replacement index swap = 1 may be determined when the decoded non-zero transform coefficients are biased in the horizontal direction, and swap = 0 when they are biased in the vertical direction.

Assuming that the context index assigning unit 111a assigns the context index shown in FIG. 56 (a) to each frequency component in the frequency domain to be processed, the context index changing unit 111d changes the context index assigned to the undecoded frequency components to the context index shown in FIG. 56 (b) or 56 (c) according to the bias of the decoded non-zero transform coefficients.

Note that the context index changing unit 111d according to the present modification can also be described as changing the context index assigned to undecoded syntax in the frequency domain to be processed when an index indicating the bias of the distribution of the decoded non-zero transform coefficients in that frequency domain is larger than a predetermined threshold.

For example, if the above index is defined as
R = coeffTopRight − 2 × coeffBottomLeft
the context index changing unit 111d according to the present modification can be regarded as changing the context index as shown in FIG. 56B when the index R is 0 or more.

Similarly, if the above index is defined as
R ′ = coeffBottomLeft − 2 × coeffTopRight
the context index changing unit 111d according to the present modification can be regarded as changing the context index as shown in FIG. 56 (c) when the index R ′ is 0 or more.
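
The decision of this modification, written in terms of the indices R and R ′, can be sketched as follows for a 4 × 4 frequency domain. The function names are assumptions made for the sketch, and R is tested before R ′, which only matters in the tie where both counts are 0.

```python
def group_and_swap(coeff_top_right, coeff_bottom_left):
    """Return (ctxgroup, swap) from the bias indices R and R'."""
    r = coeff_top_right - 2 * coeff_bottom_left        # R  >= 0: horizontal bias
    r_prime = coeff_bottom_left - 2 * coeff_top_right  # R' >= 0: vertical bias
    if r >= 0:
        return 1, 0
    if r_prime >= 0:
        return 1, 1
    return 0, 0


def ctx_idx_with_swap(pos_x, pos_y, ctxgroup, swap):
    """swap = 1 exchanges the roles of PosX and PosY in the index formula."""
    if swap == 1:
        return (4 * pos_x + pos_y) + 16 * ctxgroup
    return (4 * pos_y + pos_x) + 16 * ctxgroup
```

Because the vertical case reuses the horizontal contexts in transposed form, this sketch needs 32 context indexes rather than the 48 of a three-group scheme.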

<Modification 6>
In the above description, a configuration has been described in which the context index changing unit 111d calculates, as coeffTopRight, the number of decoded non-zero transform coefficients for frequency components satisfying PosX > PosY, and calculates, as coeffBottomLeft, the number of decoded non-zero transform coefficients for frequency components satisfying PosX < PosY. However, the present embodiment is not limited to this.

For example, the context index changing unit 111d may calculate, as coeffTopRight, the number of decoded non-zero transform coefficients for frequency components with PosY = 0 or 1, and may calculate, as coeffBottomLeft, the number of decoded non-zero transform coefficients for frequency components with PosX = 0 or 1.

FIG. 57 is a diagram illustrating the frequency components referred to by the context index changing unit 111d according to the present modification in order to calculate coeffTopRight and coeffBottomLeft. In FIG. 57, the frequency components hatched with lines rising to the right are referred to in calculating coeffTopRight, and the frequency components hatched with lines falling to the right are referred to in calculating coeffBottomLeft.

The context index changing unit 111d adopts the configuration of the present modification, thereby preventing a reduction in encoding efficiency due to an estimation error of the non-zero transform coefficient distribution.
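
The two ways of counting described so far can be compared with the following sketch. The data layout is an assumption: sig_flag is a list of rows indexed as sig_flag[PosY][PosX], holding 1 or 0 for decoded components and None for components not yet decoded. Under a literal reading of this modification, a component with both PosX ≦ 1 and PosY ≦ 1 contributes to both counts, and the sketch follows that reading.

```python
def count_bias(sig_flag, use_edge_rows=False):
    """Return (coeffTopRight, coeffBottomLeft) from decoded sig_flag values.

    use_edge_rows=False: count PosX > PosY versus PosX < PosY.
    use_edge_rows=True:  count PosY in {0, 1} versus PosX in {0, 1}.
    """
    top_right = 0
    bottom_left = 0
    for pos_y, row in enumerate(sig_flag):
        for pos_x, flag in enumerate(row):
            if not flag:  # skip undecoded (None) and zero coefficients
                continue
            if use_edge_rows:
                if pos_y <= 1:
                    top_right += 1
                if pos_x <= 1:
                    bottom_left += 1
            elif pos_x > pos_y:
                top_right += 1
            elif pos_x < pos_y:
                bottom_left += 1
    return top_right, bottom_left
```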

<Modification 7>
Further, the context index changing unit 111d may be configured to calculate, as coeffTopRight, a weighted average of the absolute values of the decoded transform coefficients for frequency components satisfying PosX > PosY, and to calculate, as coeffBottomLeft, a weighted average of the absolute values of the decoded transform coefficients for frequency components satisfying PosX < PosY.

For example, the context index changing unit 111d may be configured to calculate coeffTopRight and coeffBottomLeft using the following equations (2) and (3), respectively.

Here, Σ in Equation (2) represents the sum over decoded transform coefficients satisfying PosX > PosY, and Σ in Equation (3) represents the sum over decoded transform coefficients satisfying PosX < PosY. The symbol | ... | represents an absolute value.

In Equations (2) and (3), the weighting factor w (PosX, PosY) can be set so as to more effectively prevent a decrease in coding efficiency due to an error in estimating the bias of the non-zero transform coefficient distribution. For example, the weighting factor w (PosX, PosY) can be set to satisfy
w (PosX = 0, PosY) > w (PosX ≠ 0, PosY)
w (PosX, PosY = 0) > w (PosX, PosY ≠ 0)

Also, in Equations (2) and (3), instead of | Coeff (PosX, PosY) |, the value of the syntax sig_flag decoded for the frequency component (PosX, PosY) may be used.
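
Equations (2) and (3) are not reproduced in this text, so the following is only a sketch of one form consistent with the description: an unnormalized weighted sum of absolute values over each triangular region. The absence of normalization, the function names, the example weight, and the coeff[PosY][PosX] layout with None for undecoded components are all assumptions.

```python
def example_weight(pos_x, pos_y):
    """Hypothetical weight satisfying w(0, y) > w(x != 0, y) and w(x, 0) > w(x, y != 0)."""
    return (2 if pos_x == 0 else 1) * (2 if pos_y == 0 else 1)


def weighted_bias(coeff, weight=example_weight):
    """Return (coeffTopRight, coeffBottomLeft) as weighted sums of |Coeff|."""
    top_right = 0
    bottom_left = 0
    for pos_y, row in enumerate(coeff):
        for pos_x, value in enumerate(row):
            if value is None:  # not yet decoded
                continue
            term = weight(pos_x, pos_y) * abs(value)
            if pos_x > pos_y:
                top_right += term
            elif pos_x < pos_y:
                bottom_left += term
    return top_right, bottom_left
```

Passing decoded sig_flag values in place of the coefficients gives the variant that uses sig_flag instead of | Coeff (PosX, PosY) |.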

(Data structure of encoded data)
The data structure of the encoded data of the present invention has the following characteristics. That is, it is a data structure of encoded data obtained by arithmetically encoding, for each transform coefficient obtained by frequency-transforming the original image for each unit region, one or more types of syntax representing that transform coefficient. The one or more types of syntax include a flag indicating whether or not the transform coefficient is 0. An arithmetic decoding device that decodes the encoded data assigns, to each syntax in the target frequency domain corresponding to the unit region to be processed, a context index determined according to the type of the syntax and the position of the syntax in the target frequency domain, sequentially arithmetically decodes each syntax in the target frequency domain based on the probability state specified by the context index assigned to that syntax, and changes the context index assigned to undecoded syntax in the target frequency domain according to the bias of the distribution of the decoded flags in the target frequency domain.

It is also possible to define a context complexity flag indicating whether or not to change the context index assigned to undecoded syntax in the target frequency domain according to the bias of the distribution of each syntax in the target frequency domain. When the context complexity flag is 0, indicating low complexity, the context index is not changed according to the distribution bias. When the context complexity flag is 1, indicating high complexity, the context index is changed according to the distribution bias.

The context complexity flag may be included in the picture header PH or the slice header SH shown in FIG. It may also be included in information other than the picture header PH and the slice header SH, for example, in a picture parameter set or a sequence parameter set in H.264. The context complexity flag is decoded by a header decoding unit or a parameter set decoding unit (not shown) in the variable length code decoding unit 11 and transmitted to the quantization residual information decoding unit 111. When the context complexity flag is 0, indicating low complexity, the operation of the context index changing unit 111d is not performed. When the context complexity flag is 1, indicating high complexity, the operation of the context index changing unit 111d is performed.

In this way, the complexity of the context allocation unit can be controlled by switching the context allocation method using the context complexity flag. When the context complexity flag indicates low complexity, the amount of switching processing, which is common to all the modifications, can be reduced. In addition, when the flag is applied to a configuration in which contexts are switched according to the presence or absence of bias, as in Modification 4 and Modification 5, it is not necessary to prepare separate contexts for the biased and unbiased cases, so that the memory can be reduced. When the context complexity flag indicates high complexity, higher encoding efficiency can be realized than when it indicates low complexity.
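
How the flag gates the context change can be sketched as follows, using the three-group scheme of Modification 4 on a 4 × 4 frequency domain as the high-complexity path. The function names and the choice of Modification 4 as the example are assumptions made for this sketch.

```python
def ctx_idx_with_flag(pos_x, pos_y, coeff_top_right, coeff_bottom_left,
                      context_complexity_flag):
    """Return the context index; the bias is consulted only when the flag is 1."""
    base = 4 * pos_y + pos_x
    if context_complexity_flag == 0:
        return base  # low complexity: position-only context, no bias test
    if coeff_top_right >= 2 * coeff_bottom_left:
        ctxgroup = 1
    elif 2 * coeff_top_right <= coeff_bottom_left:
        ctxgroup = 2
    else:
        ctxgroup = 0
    return base + 16 * ctxgroup


def contexts_needed(context_complexity_flag):
    """Number of context variables to keep in memory in this sketch."""
    return 16 if context_complexity_flag == 0 else 48
```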

As described above, according to the above configuration, the context is switched not only according to the position in the frequency domain of the transform coefficient to be decoded but also according to the bias of the decoded non-zero transform coefficients. Therefore, even when the bias of the transform coefficients differs from one unit region to another, an appropriate probability state is used according to the bias, so that encoded data having high encoding efficiency can be decoded.

(Moving picture encoding device 2)
The configuration of the moving picture encoding apparatus 2 according to the present embodiment will be described with reference to FIGS. The moving picture encoding apparatus 2 is an encoding device that includes, as parts thereof, technology adopted in H.264 / MPEG-4 AVC, technology adopted in KTA software, which is a codec for joint development in VCEG (Video Coding Expert Group), and technology adopted in TMuC (Test Model under Consideration) software, which is the successor codec. In the following, the same parts as those already described are denoted by the same reference numerals, and the description thereof is omitted.

FIG. 58 is a block diagram showing a configuration of the moving picture encoding apparatus 2. As illustrated in FIG. 58, the moving image encoding device 2 includes a predicted image generation unit 21, a transform / quantization unit 22, an inverse quantization / inverse transform unit 23, an adder 24, a frame memory 25, a loop filter 26, a variable length code encoding unit 27, and a subtracter 28. As also shown in FIG. 58, the predicted image generation unit 21 includes an intra predicted image generation unit 21a, a motion vector detection unit 21b, an inter predicted image generation unit 21c, a prediction method control unit 21d, and a motion vector redundancy deletion unit 21e. The moving image encoding device 2 is a device that generates encoded data # 1 by encoding moving image # 10 (encoding target image).

(Predicted image generation unit 21)
The predicted image generation unit 21 recursively divides the processing target LCU into one or a plurality of lower-order CUs, further divides each leaf CU into one or a plurality of partitions, and generates, for each partition, an inter predicted image Pred_Inter using inter-screen prediction or an intra predicted image Pred_Intra using intra-screen prediction. The generated inter predicted image Pred_Inter and intra predicted image Pred_Intra are supplied to the adder 24 and the subtracter 28 as the predicted image Pred.

Note that the predicted image generation unit 21 omits encoding of other parameters belonging to a PU to which the skip mode is applied. Also, (1) the mode of division into lower CUs and partitions in the target LCU, (2) whether to apply the skip mode, and (3) which of the inter predicted image Pred_Inter and the intra predicted image Pred_Intra to generate for each partition are determined so as to optimize the encoding efficiency.

(Intra predicted image generation unit 21a)
The intra predicted image generation unit 21a generates a predicted image Pred_Intra for each partition by intra prediction. Specifically, (1) a prediction mode used for intra prediction is selected for each partition, and (2) a prediction image Pred_Intra is generated from the decoded image P using the selected prediction mode. The intra predicted image generation unit 21a supplies the generated intra predicted image Pred_Intra to the prediction method control unit 21d.

In addition, the intra predicted image generation unit 21a determines an estimated prediction mode for the target partition from the prediction modes assigned to the peripheral partitions of the target partition, and supplies a flag indicating whether or not the estimated prediction mode and the prediction mode actually selected for the target partition are the same, as a part of the intra prediction parameter PP_Intra, to the variable length code encoding unit 27 via the prediction scheme control unit 21d. The variable length code encoding unit 27 includes the flag in the encoded data # 1.

In addition, when the estimated prediction mode for the target partition is different from the prediction mode actually selected for the target partition, the intra predicted image generation unit 21a supplies the residual prediction mode index RIPM indicating the prediction mode for the target partition, as a part of the intra prediction parameter PP_Intra, to the variable length code encoding unit 27 via the prediction scheme control unit 21d. The variable length code encoding unit 27 includes the residual prediction mode index in the encoded data # 1.

(Motion vector detection unit 21b)
The motion vector detection unit 21b detects a motion vector mv for each partition. Specifically, it detects the motion vector mv for the target partition by (1) selecting an adaptive filtered decoded image P_ALF ′ to be used as a reference image, and (2) searching the selected adaptive filtered decoded image P_ALF ′ for the region that best approximates the target partition. Here, the adaptive filtered decoded image P_ALF ′ is an image obtained by applying adaptive filter processing by the loop filter 26 to a decoded image whose entire frame has already been decoded, and the motion vector detection unit 21b can read out the pixel value of each pixel constituting the adaptive filtered decoded image P_ALF ′ from the frame memory 25. The motion vector detection unit 21b supplies the detected motion vector mv to the inter predicted image generation unit 21c and the motion vector redundancy deletion unit 21e, together with the reference image index RI designating the adaptive filtered decoded image P_ALF ′ used as the reference image.

(Inter prediction image generation unit 21c)
The inter predicted image generation unit 21c generates a motion compensated image mc for each inter prediction partition by inter-screen prediction. Specifically, using the motion vector mv supplied from the motion vector detection unit 21b, it generates the motion compensated image mc from the adaptive filtered decoded image P_ALF ′ designated by the reference image index RI supplied from the motion vector detection unit 21b. Like the motion vector detection unit 21b, the inter predicted image generation unit 21c can read out the pixel value of each pixel constituting the adaptive filtered decoded image P_ALF ′ from the frame memory 25. The inter predicted image generation unit 21c supplies the generated motion compensated image mc (inter predicted image Pred_Inter) to the prediction method control unit 21d together with the reference image index RI supplied from the motion vector detection unit 21b.

(Prediction method controller 21d)
The prediction scheme control unit 21d compares the intra predicted image Pred_Intra and the inter predicted image Pred_Inter with the encoding target image, and selects whether to perform intra prediction or inter prediction. When intra prediction is selected, the prediction scheme control unit 21d supplies the intra predicted image Pred_Intra as the predicted image Pred to the adder 24 and the subtracter 28, and supplies the intra prediction parameter PP_Intra supplied from the intra predicted image generation unit 21a to the variable length code encoding unit 27. On the other hand, when inter prediction is selected, the prediction scheme control unit 21d supplies the inter predicted image Pred_Inter as the predicted image Pred to the adder 24 and the subtracter 28, and supplies the reference image index RI, together with the estimated motion vector index PMVI and the motion vector residual MVD supplied from the motion vector redundancy deletion unit 21e described later, to the variable length code encoding unit 27 as the inter prediction parameter PP_Inter. In addition, the prediction scheme control unit 21d supplies prediction type information PT, which indicates which of the intra predicted image Pred_Intra and the inter predicted image Pred_Inter has been selected, to the variable length code encoding unit 27.

(Motion vector redundancy deleting unit 21e)
The motion vector redundancy deletion unit 21e deletes redundancy in the motion vector mv detected by the motion vector detection unit 21b. Specifically, (1) an estimation method used for estimating the motion vector mv is selected, (2) an estimated motion vector pmv is derived according to the selected estimation method, and (3) the estimated motion vector pmv is subtracted from the motion vector mv. As a result, a motion vector residual MVD is generated. The motion vector redundancy deleting unit 21e supplies the generated motion vector residual MVD to the prediction method control unit 21d together with the estimated motion vector index PMVI indicating the selected estimation method.
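
Step (3) above, and the matching reconstruction on the decoding side, can be sketched as follows. The estimated motion vector pmv is taken as an input because the estimation methods themselves are not detailed here; the function names and the (x, y) tuple representation are assumptions made for the sketch.

```python
def motion_vector_residual(mv, pmv):
    """MVD = mv - pmv, computed per component."""
    return (mv[0] - pmv[0], mv[1] - pmv[1])


def reconstruct_motion_vector(mvd, pmv):
    """Inverse operation: mv = pmv + MVD."""
    return (pmv[0] + mvd[0], pmv[1] + mvd[1])
```

When the estimate is close to the detected motion vector, the residual components are small, which is what makes the residual cheaper to encode than the motion vector itself.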

(Transformation / quantization unit 22)
The transform / quantization unit 22 (1) performs DCT transform (Discrete Cosine Transform) for each block (transform unit) on the prediction residual D obtained by subtracting the predicted image Pred from the encoding target image, (2) quantizes the transform coefficient Coeff_IQ obtained by the DCT transform, and (3) supplies the transform coefficient Coeff obtained by the quantization to the variable length code encoding unit 27 and the inverse quantization / inverse transform unit 23. The transform / quantization unit 22 also (1) selects a quantization step QP to be used for quantization for each TU, (2) supplies a quantization parameter difference Δqp indicating the size of the selected quantization step QP to the variable length code encoding unit 27, and (3) supplies the selected quantization step QP to the inverse quantization / inverse transform unit 23. Here, the quantization parameter difference Δqp is the difference value obtained by subtracting the value of the quantization parameter qp ′ for the TU that was DCT transformed / quantized immediately before from the value of the quantization parameter qp (for example, QP = 2^{qp / 6}) for the TU to be DCT transformed / quantized.
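
The relation between the quantization parameter, the quantization step, and the transmitted difference can be sketched as follows, taking the example relation in which the step doubles for every increase of 6 in the parameter. The function names are assumptions made for this sketch.

```python
def quantization_step(qp):
    """Example relation: quantization step QP = 2 ** (qp / 6)."""
    return 2.0 ** (qp / 6)


def qp_difference(qp, previous_qp):
    """Delta qp: current TU's parameter minus that of the TU processed just before."""
    return qp - previous_qp


def qp_from_difference(delta_qp, previous_qp):
    """Decoder-side recovery of qp from the transmitted difference."""
    return previous_qp + delta_qp
```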

Note that, when the size of the target block is 8 × 8 pixels, for example, the DCT transform performed by the transform / quantization unit 22 gives the transform coefficient before quantization Coeff_IQ (u, v) (0 ≦ u ≦ 7, 0 ≦ v ≦ 7) for the horizontal frequency u and the vertical frequency v by, for example, the following mathematical formula (4).

Here, D (i, j) (0 ≦ i ≦ 7, 0 ≦ j ≦ 7) represents the prediction residual D at the position (i, j) in the target block. C (u) and C (v) are given as follows.

・ C (u) = 1 / √2 (u = 0)
・ C (u) = 1 (u ≠ 0)
・ C (v) = 1 / √2 (v = 0)
・ C (v) = 1 (v ≠ 0)
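
Formula (4) is not reproduced in this text. The sketch below therefore assumes the standard 8 × 8 two-dimensional DCT that matches the definitions of C (u) and C (v) given above, including a scale factor of 1/4; the function names and the d[j][i] layout (i horizontal, j vertical) are also assumptions.

```python
import math


def c(k):
    """C(k) = 1 / sqrt(2) for k = 0, and 1 otherwise."""
    return 1.0 / math.sqrt(2.0) if k == 0 else 1.0


def dct_8x8(d):
    """Return coeff[v][u] for an 8 x 8 prediction residual block d[j][i]."""
    n = 8
    coeff = [[0.0] * n for _ in range(n)]
    for v in range(n):
        for u in range(n):
            total = 0.0
            for j in range(n):
                for i in range(n):
                    total += (d[j][i]
                              * math.cos((2 * i + 1) * u * math.pi / 16)
                              * math.cos((2 * j + 1) * v * math.pi / 16))
            coeff[v][u] = 0.25 * c(u) * c(v) * total
    return coeff
```

Under this assumed normalization, a block whose residual is 1 everywhere yields a DC coefficient of 8 and zeros elsewhere, up to floating-point rounding.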
(Inverse quantization / inverse transform unit 23)
The inverse quantization / inverse transform unit 23 (1) inversely quantizes the quantized transform coefficient Coeff, (2) performs inverse DCT (Discrete Cosine Transform) transform on the transform coefficient Coeff_IQ obtained by the inverse quantization, and (3) supplies the prediction residual D obtained by the inverse DCT transform to the adder 24. When the quantized transform coefficient Coeff is inversely quantized, the quantization step QP supplied from the transform / quantization unit 22 is used. Note that the prediction residual D output from the inverse quantization / inverse transform unit 23 is the prediction residual D input to the transform / quantization unit 22 with a quantization error added, but a common name is used for both here for simplicity. A more specific operation of the inverse quantization / inverse transform unit 23 is substantially the same as that of the inverse quantization / inverse transform unit 13 included in the video decoding device 1.

(Adder 24)
The adder 24 generates the (local) decoded image P by adding the predicted image Pred selected by the prediction scheme control unit 21d to the prediction residual D generated by the inverse quantization / inverse transform unit 23. The (local) decoded image P generated by the adder 24 is supplied to the loop filter 26 and stored in the frame memory 25, and is used as a reference image in intra prediction.

(Variable-length code encoding unit 27)
The variable length code encoding unit 27 generates encoded data # 1 by variable-length encoding (1) the quantized transform coefficient Coeff and Δqp supplied from the transform / quantization unit 22, (2) the prediction parameter PP (inter prediction parameter PP_Inter and intra prediction parameter PP_Intra) supplied from the prediction scheme control unit 21d, (3) the prediction type information PT, and (4) the filter parameter FP supplied from the loop filter 26.

FIG. 59 is a block diagram showing a configuration of the variable-length code encoding unit 27. As shown in FIG. 59, the variable-length code encoding unit 27 includes a quantization residual information encoding unit 271 that encodes the quantized transform coefficient Coeff, a prediction parameter encoding unit 272 that encodes the prediction parameter PP, a prediction type information encoding unit 273 that encodes the prediction type information, and a filter parameter encoding unit 274 that encodes the filter parameter FP. Since the specific configuration of the quantization residual information encoding unit 271 will be described later, the description thereof is omitted here.

(Subtractor 28)
The subtracter 28 generates the prediction residual D by subtracting the prediction image Pred selected by the prediction method control unit 21d from the encoding target image. The prediction residual D generated by the subtracter 28 is DCT transformed / quantized by the transform / quantization unit 22.

(Loop filter 26)
The loop filter 26 has (1) a function as a deblocking filter (DF: Deblocking Filter) that performs smoothing (deblocking processing) on the image around a block boundary or partition boundary in the decoded image P, and (2) a function as an adaptive filter (ALF: Adaptive Loop Filter) that performs adaptive filter processing using the filter parameter FP on the image on which the deblocking filter has acted.

(Quantization residual information encoding unit 271)
The quantization residual information encoding unit 271 generates quantization residual information QD by performing context adaptive binary arithmetic coding (CABAC) on the transform coefficient Coeff. The syntax included in the generated quantization residual information QD is as shown in FIG.

FIG. 60 is a block diagram illustrating a configuration of the quantization residual information encoding unit 271. As illustrated in FIG. 60, the quantization residual information encoding unit 271 includes a context index allocation unit 271a, a transform coefficient encoding unit 271b, a context variable management unit 271c, and a context index change unit 271d.

(Context index allocation unit 271a)
The context index assigning unit 271a assigns, to each frequency component in the frequency domain to be processed, a context index ctxIdx according to the position of that frequency component in the frequency domain. The specific operation of the context index allocating unit 271a is the same as that of the context index allocating unit 111a included in the video decoding device 1, and thus the description thereof is omitted here.

(Context variable management unit 271c)
The context variable management unit 271c manages the context variable CV corresponding to each context. The context variable CV managed by the context variable management unit 271c includes (1) a dominant symbol MPS (most probable symbol) having a high occurrence probability, and (2) a probability state index pStateIdx that specifies the occurrence probability of the dominant symbol MPS. Each context variable CV is specified by a context index ctxIdx.

The context variable management unit 271c updates the occurrence probability specified by the probability state index pStateIdx every time the transform coefficient encoding unit 271b encodes one symbol.
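
The bookkeeping described above can be illustrated with the following deliberately simplified sketch. The update rule (one step up on the dominant symbol, one step down otherwise, and a flip of the MPS at state 0), the upper limit of 62, and all class and method names are assumptions chosen for illustration; an actual CABAC implementation uses predefined state transition tables instead.

```python
from dataclasses import dataclass

MAX_STATE = 62  # assumed upper limit of pStateIdx in this sketch


@dataclass
class ContextVariable:
    mps: int = 0          # dominant symbol (most probable symbol)
    p_state_idx: int = 0  # larger value = MPS assumed more probable


class ContextVariableManager:
    """Holds one ContextVariable per context index ctxIdx."""

    def __init__(self, num_contexts):
        self.cv = [ContextVariable() for _ in range(num_contexts)]

    def get(self, ctx_idx):
        return self.cv[ctx_idx]

    def update(self, ctx_idx, symbol):
        """Simplified adaptation after one symbol is encoded (not a real table)."""
        cv = self.cv[ctx_idx]
        if symbol == cv.mps:
            cv.p_state_idx = min(cv.p_state_idx + 1, MAX_STATE)
        elif cv.p_state_idx == 0:
            cv.mps = 1 - cv.mps  # the other symbol becomes dominant
        else:
            cv.p_state_idx -= 1
```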

(Transform coefficient coding unit 271b)
The transform coefficient encoding unit 271b refers to the context index ctxIdx assigned to the target frequency component, which is the frequency component to be processed, and encodes the transform coefficient Coeff (PosX, PosY) for the target frequency component, thereby generating encoded data for the syntax sig_flag, last_flag, abs_greater_one, coeff_abs_level, and sign.

Also, the transform coefficient encoding unit 271b supplies the syntax sig_flag (PosX, PosY) to the context index changing unit 271d.

(Encoding process of various syntaxes)
A process for generating encoded data for various syntaxes sig_flag, last_flag, abs_greater_one, coeff_abs_level, and sign by the transform coefficient encoding unit 271b will be described. The transform coefficient encoding unit 271b generates encoded data for each syntax by performing binarization processing and binary arithmetic code encoding processing on each syntax to be encoded.

Binarization processing is processing that converts input multi-value data into binary data. By performing binarization processing for each syntax, binary data for each syntax is generated.
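As one possible illustration of binarization, the following sketch uses unary binarization, in which a non-negative value n is converted into n ones followed by a terminating zero. The choice of unary binarization and the function name unary_binarize are assumptions made for this example; the binarization actually applied to each syntax is not specified in this passage.

```python
def unary_binarize(value: int) -> list[int]:
    """Convert a non-negative multi-value input into a list of Bins (unary code)."""
    if value < 0:
        raise ValueError("unary binarization is defined for non-negative values only")
    # n ones followed by a single terminating zero.
    return [1] * value + [0]
```

Each element of the returned list is one Bin, and each Bin would then be passed to the binary arithmetic code encoding processing.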

Binary arithmetic code encoding processing is processing for arithmetically encoding binary data for each syntax. When encoding one symbol (1 bit of binary data, also referred to as a Bin), the transform coefficient encoding unit 271b refers to the context index ctxIdx (PosX, PosY) assigned to the frequency component to be processed, and acquires the context variable CV (ctxIdx) specified by that context index from the context variable management unit 271c. The transform coefficient encoding unit 271b then arithmetically encodes the binary data of the syntax to be encoded according to the occurrence probability specified by the probability state index pStateIdx included in the acquired context variable CV (ctxIdx), thereby generating encoded data for that syntax. The occurrence probability specified by the probability state index pStateIdx is updated by the context variable management unit 271c every time one symbol is encoded.

The scan order of each frequency component by the transform coefficient encoding unit 271b is substantially the same as that described in (Scan order of each frequency component by the transform coefficient decoding unit 111b). However, “decoding” in (scan order of each frequency component by the transform coefficient decoding unit 111b) is read as “encoding”. The transform coefficient encoding unit 271b encodes last_flag = 1 for the last transform coefficient in the scan order.

(Context index change unit 271d)
The context index changing unit 271d refers to the syntax sig_flag (PosX, PosY) supplied from the transform coefficient encoding unit 271b, determines the bias of the non-zero transform coefficient distribution in the frequency domain, and, depending on the determination result, changes the context index allocated to the uncoded components in the frequency domain.

The context index changing process by the context index changing unit 271d is the same as (context index changing process example 1) and (context index changing process example 2) by the context index changing unit 111d, and thus description thereof is omitted here. Further, the context index changing unit 271d may be configured to perform the same processing as in <Modification 1> to <Modification 7> described for the context index changing unit 111d. However, "decoding" in (context index changing process example 1), (context index changing process example 2), and <Modification 1> to <Modification 7> should be read as "encoding".

The moving picture encoding apparatus 2 of the present configuration can also encode a context complexity flag indicating whether or not to change the context index assigned to the syntax of an uncoded component in the target frequency domain in accordance with the distribution of each syntax in the target frequency domain. The context complexity flag is encoded by being included in the picture header PH or the slice header SH shown in FIG. When information other than the picture header PH and the slice header SH is encoded, for example a picture parameter set or a sequence parameter set as in H.264, a parameter encoded there may serve as the context complexity flag. The context complexity flag is determined and encoded by a header encoding unit or a parameter set encoding unit (not shown) in the variable length code encoding unit 27. The context complexity flag to be encoded is transmitted to the quantization residual information encoding unit 271. When the context complexity flag is 0, indicating low complexity, the context index changing unit 271d does not perform the changing operation. When the flag is 1, indicating high complexity, the changing operation is performed. When the changing operation is not performed, the complexity is low and the coding efficiency is low. When the changing operation is performed, the complexity is high and the coding efficiency is high. Therefore, according to the configuration including the context complexity flag, the balance between complexity and encoding efficiency can be selected.
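The gating by the context complexity flag can be sketched as follows. The function name apply_context_complexity_flag, the dictionary representation of the context index table, and the change_fn callback are all hypothetical and are introduced only for this example.

```python
from typing import Callable, Dict, Tuple

Position = Tuple[int, int]       # (PosX, PosY) of a frequency component
CtxTable = Dict[Position, int]   # context index ctxIdx per position


def apply_context_complexity_flag(
    ctx_idx: CtxTable,
    context_complexity_flag: int,
    change_fn: Callable[[CtxTable], CtxTable],
) -> CtxTable:
    """Return the context index table to use for the uncoded components."""
    if context_complexity_flag == 0:
        # Low complexity: the changing operation is skipped.
        return dict(ctx_idx)
    # High complexity: the changing operation is performed.
    return change_fn(ctx_idx)
```

Because the flag is carried in the encoded data, a decoder reading the same flag can make the same choice, so both sides keep using identical context indices.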

(Additional notes)
Each block of the moving picture decoding apparatus 1 and the moving picture encoding apparatus 2 described above may be realized in hardware by a logic circuit formed on an integrated circuit (IC chip), or may be realized in software using a CPU (Central Processing Unit).

Examples of the recording medium include tapes such as magnetic tapes and cassette tapes; disks including magnetic disks such as floppy (registered trademark) disks and hard disks, and optical disks such as CD-ROM, MO, MD, DVD, and CD-R; cards such as IC cards (including memory cards) and optical cards; semiconductor memories such as mask ROM, EPROM, EEPROM, and flash ROM; and logic circuits such as a PLD (Programmable Logic Device) or an FPGA (Field Programmable Gate Array).

Also, each of the above devices may be configured to be connectable to a communication network, and the program code may be supplied via the communication network. The communication network is not particularly limited as long as it can transmit the program code. For example, the Internet, an intranet, an extranet, a LAN, ISDN, a VAN, a CATV communication network, a virtual private network (Virtual Private Network), a telephone line network, a mobile communication network, a satellite communication network, and the like can be used. The transmission medium constituting the communication network may be any medium that can transmit the program code, and is not limited to a specific configuration or type. For example, wired media such as IEEE 1394, USB, power line carrier, cable TV lines, telephone lines, and ADSL (Asymmetric Digital Subscriber Line) lines can be used, as can wireless media such as infrared (IrDA, remote control), Bluetooth (registered trademark), IEEE 802.11 wireless, HDR (High Data Rate), NFC (Near Field Communication), DLNA (Digital Living Network Alliance), mobile phone networks, satellite lines, and terrestrial digital networks.

The present invention is not limited to the above-described embodiments, and various modifications are possible within the scope shown in the claims. Embodiments obtained by appropriately combining technical means disclosed in different embodiments are also included in the technical scope of the present invention.

As described above, the moving picture decoding apparatus 1 according to the present invention includes the quantization residual information decoding unit 111. The quantized residual information decoding unit 111 includes a context index allocating unit 111a that allocates a context index to each syntax in the target frequency domain, and a context allocated to each syntax in the target frequency domain. A transform coefficient decoding unit 111b that sequentially performs arithmetic decoding based on the probability state specified by the index and restores each transform coefficient from each decoded syntax, and a distribution of restored non-zero transform coefficients in the target frequency domain And a context index changing unit 111d that changes a context index assigned to an undecoded syntax in the target frequency domain according to the bias. Thereby, an arithmetic decoding apparatus with high encoding efficiency can be realized.

Further, the present invention can also be expressed as follows. That is, the arithmetic decoding device according to the present invention is an arithmetic decoding device that decodes encoded data obtained by arithmetically encoding, for each transform coefficient obtained by frequency-transforming a target image for each unit region, one or more types of syntax representing that transform coefficient. The device includes: context index allocating means for allocating, to each syntax in a target frequency domain corresponding to the unit region to be processed, a context index determined according to the type of the syntax and the position of the syntax in the target frequency domain; syntax decoding means for sequentially arithmetically decoding each syntax in the target frequency domain based on the probability state specified by the context index allocated to that syntax; transform coefficient restoring means for restoring each transform coefficient from each syntax decoded by the syntax decoding means; and context index changing means for changing the context index allocated to undecoded syntax in the target frequency domain according to the bias of the distribution of the restored non-zero transform coefficients in the target frequency domain.

Also, when the horizontal and vertical coordinates in the target frequency domain are represented as u and v, respectively, and the context index assigned to an undecoded syntax by the context index allocating means is represented as ctxIdx (u, v), it is preferable that, when the distribution of the decoded non-zero transform coefficients in the target frequency domain is biased in a predetermined one of the horizontal direction and the vertical direction, the context index changing means changes the context index ctxIdx (u, v) assigned to the undecoded syntax to the context index ctxIdx ′ (u, v) given by
ctxIdx ′ (u, v) = ctxIdx (v, u).

According to the above configuration, when the distribution of the decoded non-zero transform coefficients in the target frequency domain is biased in a predetermined one of the horizontal direction and the vertical direction, the context index assigned to the undecoded syntax by the context index allocating means is changed to the context index obtained by swapping the arguments of that context index.

Therefore, according to the above configuration, the syntax of transform coefficients distributed with a bias in the predetermined direction can be decoded using the context indices that are used when the transform coefficients are distributed with a bias in the direction other than the predetermined direction.

For example, when the predetermined direction is the vertical direction, the syntax of transform coefficients distributed with a bias in the vertical direction can be decoded using the context indices that are used when the transform coefficients are distributed with a bias in the horizontal direction.

Therefore, according to the above configuration, even if the transform coefficients are distributed with a bias in either the vertical direction or the horizontal direction, each syntax can be decoded based on an appropriate probability state corresponding to the bias. For this reason, according to the above configuration, encoding efficiency can be improved even when the bias of the transform coefficients changes for each unit region.
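The relation ctxIdx ′ (u, v) = ctxIdx (v, u) can be illustrated with the following minimal sketch. The function name swap_undecoded_context_indices and the representation of the context index table as a dictionary keyed by (u, v) are assumptions for this example; the table is assumed to be square so that (v, u) is always a valid position.

```python
from typing import Dict, Iterable, Tuple

Position = Tuple[int, int]  # (u, v): horizontal and vertical coordinates


def swap_undecoded_context_indices(
    ctx_idx: Dict[Position, int],
    undecoded: Iterable[Position],
) -> Dict[Position, int]:
    """Apply ctxIdx'(u, v) = ctxIdx(v, u) to the undecoded positions only."""
    changed = dict(ctx_idx)  # already decoded positions keep their index
    for (u, v) in undecoded:
        # Read from the original table so that the result does not depend
        # on the order in which positions are visited.
        changed[(u, v)] = ctx_idx[(v, u)]
    return changed
```

For example, with a 2x2 table in which position (1, 0) has index 1 and position (0, 1) has index 2, changing only the undecoded position (1, 0) gives it index 2 while leaving (0, 1) untouched.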

In addition, it is preferable that the context index changing means changes the context index assigned to undecoded syntax in the target frequency domain when an index indicating the distribution bias of the decoded non-zero transform coefficients in the target frequency domain is larger than a predetermined threshold.

According to the above configuration, the context index changing means changes the context index assigned to undecoded syntax in the target frequency domain when an index representing the distribution bias of the decoded non-zero transform coefficients in the target frequency domain is larger than a predetermined threshold. Each syntax can therefore be decoded without being affected by the estimation error of the non-zero transform coefficient distribution bias that may occur when the index is small.
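The threshold decision can be sketched as follows. The bias index used here (the number of decoded non-zero coefficients with v > u minus the number with u > v, taking the vertical direction as the predetermined direction) is a hypothetical measure chosen for illustration and is not taken from this passage; the names bias_index and should_change_context are likewise assumptions.

```python
from typing import Dict, Tuple

Position = Tuple[int, int]  # (u, v): horizontal and vertical coordinates


def bias_index(sig_flags: Dict[Position, int]) -> int:
    """Hypothetical bias measure over the decoded sig_flag values."""
    vertical = sum(1 for (u, v), s in sig_flags.items() if s and v > u)
    horizontal = sum(1 for (u, v), s in sig_flags.items() if s and u > v)
    return vertical - horizontal


def should_change_context(sig_flags: Dict[Position, int], threshold: int) -> bool:
    """Change context indices only when the bias index exceeds the threshold."""
    return bias_index(sig_flags) > threshold
```

With few decoded coefficients the index stays small, so the change is not triggered by a bias estimate that rests on too little evidence.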

According to an arithmetic coding apparatus having a configuration corresponding to the above configuration, each syntax can be encoded without being affected by the estimation error of the non-zero transform coefficient distribution bias, so encoded data with high encoding efficiency can be generated.

Further, it is preferable that the context index changing means changes the context index assigned to those undecoded syntaxes in the target frequency domain for which at least one of the horizontal coordinate and the vertical coordinate in the target frequency domain is not more than a predetermined threshold.

The inventor has found that, for syntax in which both the horizontal coordinate and the vertical coordinate in the target frequency domain are larger than the predetermined threshold, changing the context index does not always improve the encoding efficiency as expected.

According to the above configuration, the context index changing unit is configured such that, of the undecoded syntax in the target frequency domain, both the horizontal coordinate and the vertical coordinate in the target frequency domain are the predetermined threshold values. Since the context index assigned to a larger syntax is not changed, the coding efficiency can be improved while reducing the processing amount.
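The restriction on which undecoded positions are subject to the change can be sketched as follows. The names is_change_target and positions_to_change are hypothetical, and positions are assumed to be given as (u, v) pairs.

```python
from typing import Iterable, Set, Tuple

Position = Tuple[int, int]  # (u, v): horizontal and vertical coordinates


def is_change_target(u: int, v: int, threshold: int) -> bool:
    """True when at least one coordinate is not more than the threshold."""
    return u <= threshold or v <= threshold


def positions_to_change(undecoded: Iterable[Position], threshold: int) -> Set[Position]:
    """Select the undecoded positions whose context index may be changed."""
    return {(u, v) for (u, v) in undecoded if is_change_target(u, v, threshold)}
```

Positions for which both coordinates exceed the threshold are left out, which is where the reduction in processing amount comes from.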

The one or more types of syntax include a first flag indicating whether or not the transform coefficient is 0, and the context index changing means includes the undecoded first flag in the target frequency domain. It is preferable that the context index to be assigned is changed.

According to the above configuration, since the context index assigned to the first flag indicating whether or not the transform coefficient is 0 is changed, the code amount of the first flag can be reduced.

The first flag corresponds to the flag sig_flag used in, for example, TMuC (Test Model under Consideration).

The one or more types of syntax include a second flag indicating whether or not it is the last transform coefficient in the processing order, and the context index changing means performs undecoding in the target frequency domain. It is preferable that the context index assigned to the second flag is changed.

According to the above configuration, since the context index assigned to the second flag indicating whether or not it is the last conversion coefficient in the processing order is changed, the code amount of the second flag can be reduced.

The second flag corresponds to the flag last_flag used in, for example, TMuC (Test Model under Consideration).

An image decoding apparatus according to the present invention includes the arithmetic decoding apparatus, an inverse frequency converting unit that generates a residual image by performing inverse frequency conversion on a transform coefficient decoded by the arithmetic decoding apparatus, and the inverse frequency transform. And a decoded image generating means for generating a decoded image by adding the residual image generated by the means and the predicted image predicted from the generated decoded image.

According to the image decoding device configured as described above, a decoded image is generated by adding a residual image, generated by performing inverse frequency conversion on the transform coefficients decoded by the arithmetic decoding device, and a predicted image predicted from the generated decoded image. A decoded image can thus be generated from encoded data with high encoding efficiency.

In addition, the arithmetic coding apparatus according to the present invention is an arithmetic coding apparatus that generates encoded data by arithmetically encoding, for each transform coefficient obtained by frequency-transforming a target image for each unit region, one or more types of syntax representing that transform coefficient. The apparatus includes: context index allocating means for allocating, to each syntax representing each transform coefficient in a target frequency domain corresponding to the unit region to be processed, a context index determined according to the type of the syntax and the position of the syntax in the target frequency domain; syntax encoding means for sequentially arithmetically encoding each syntax in the target frequency domain based on the probability state specified by the context index allocated to that syntax; and context index changing means for changing the context index allocated to uncoded syntax in the target frequency domain according to the bias of the distribution of the coded non-zero transform coefficients in the target frequency domain.

The image coding apparatus according to the present invention includes a transform coefficient generating unit that generates a transform coefficient by frequency-converting a residual image between a coding target image and a predicted image for each unit region, and the arithmetic coding device described above. The arithmetic coding device generates encoded data by arithmetically encoding one or more types of syntax representing the transform coefficient generated by the transform coefficient generating unit.

According to the image encoding device configured as described above, the arithmetic coding device encodes the transform coefficients generated by frequency-converting the residual image between the encoding target image and the predicted image for each unit region, so encoded data with high encoding efficiency can be generated.

Also, the data structure of the encoded data according to the present invention is a data structure of encoded data obtained by arithmetically encoding, for each transform coefficient obtained by frequency-transforming an original image for each unit region, one or more types of syntax representing that transform coefficient. The one or more types of syntax include a flag indicating whether or not a transform coefficient is 0. An arithmetic decoding device that decodes the encoded data assigns, to each syntax in the target frequency domain corresponding to the unit region to be processed, a context index determined according to the type of the syntax and the position of the syntax in the target frequency domain; sequentially arithmetically decodes each syntax in the target frequency domain based on the probability state specified by the context index assigned to that syntax; and changes the context index allocated to undecoded syntax in the target frequency domain according to the bias of the distribution of the restored flags in the target frequency domain.

The arithmetic decoding device that decodes the encoded data configured as described above is configured to assign a context to an undecoded syntax in the target frequency domain according to a bias of a distribution of the restored flag in the target frequency domain. Since the index is changed, the decoding process is performed using an appropriate probability state for decoding the undecoded syntax.

In addition, it is preferable that the arithmetic coding device that generates the coded data, in substantially the same manner as the arithmetic decoding device, changes the context index assigned to uncoded syntax in the target frequency domain according to the bias of the distribution of the flags that have been coded in the target frequency domain. Since the encoded data generated by such an arithmetic encoding device is generated using an appropriate probability state, it is encoded data with high encoding efficiency.

<< Application example >>
The above-described moving image encoding device 2 and moving image decoding device 1 can be used by being mounted on various devices that perform transmission, reception, recording, and reproduction of moving images. The moving image may be a natural moving image captured by a camera or the like, or may be an artificial moving image (including CG and GUI) generated by a computer or the like.

First, it will be described with reference to FIG. 61 that the above-described moving image encoding device 2 and moving image decoding device 1 can be used for transmission and reception of moving images.

FIG. 61 (a) is a block diagram showing a configuration of a transmission apparatus PROD_A in which the moving picture encoding apparatus 2 is mounted. As illustrated in FIG. 61 (a), the transmission device PROD_A includes an encoding unit PROD_A1 that obtains encoded data by encoding a moving image, a modulation unit PROD_A2 that obtains a modulated signal by modulating a carrier wave with the encoded data obtained by the encoding unit PROD_A1, and a transmission unit PROD_A3 that transmits the modulated signal obtained by the modulation unit PROD_A2. The moving image encoding apparatus 2 described above is used as the encoding unit PROD_A1.

The transmission device PROD_A may further include, as supply sources of the moving image input to the encoding unit PROD_A1, a camera PROD_A4 that captures a moving image, a recording medium PROD_A5 on which the moving image is recorded, an input terminal PROD_A6 for inputting the moving image from the outside, and an image processing unit A7 that generates or processes an image. FIG. 61 (a) illustrates a configuration in which the transmission device PROD_A includes all of these, but a part thereof may be omitted.

The recording medium PROD_A5 may hold a non-encoded moving image, or may hold a moving image encoded by a recording encoding scheme different from the transmission encoding scheme. In the latter case, a decoding unit (not shown) for decoding the encoded data read from the recording medium PROD_A5 according to the recording encoding scheme may be interposed between the recording medium PROD_A5 and the encoding unit PROD_A1.

FIG. 61 (b) is a block diagram showing a configuration of a receiving device PROD_B in which the moving image decoding device 1 is mounted. As illustrated in FIG. 61 (b), the receiving device PROD_B includes a receiving unit PROD_B1 that receives a modulated signal, a demodulating unit PROD_B2 that obtains encoded data by demodulating the modulated signal received by the receiving unit PROD_B1, and a decoding unit PROD_B3 that obtains a moving image by decoding the encoded data obtained by the demodulating unit PROD_B2. The moving picture decoding apparatus 1 described above is used as the decoding unit PROD_B3.

The receiving device PROD_B may further include, as supply destinations of the moving image output by the decoding unit PROD_B3, a display PROD_B4 for displaying the moving image, a recording medium PROD_B5 for recording the moving image, and an output terminal PROD_B6 for outputting the moving image to the outside. FIG. 61 (b) illustrates a configuration in which the receiving device PROD_B includes all of these, but a part thereof may be omitted.

The recording medium PROD_B5 may be used for recording a non-encoded moving image, or for recording a moving image encoded by a recording encoding scheme different from the transmission encoding scheme. In the latter case, an encoding unit (not shown) for encoding the moving image acquired from the decoding unit PROD_B3 according to the recording encoding scheme may be interposed between the decoding unit PROD_B3 and the recording medium PROD_B5.

Note that the transmission medium for transmitting the modulated signal may be wireless or wired. Further, the transmission mode for transmitting the modulated signal may be broadcasting (here, a transmission mode in which the transmission destination is not specified in advance) or communication (here, a transmission mode in which the transmission destination is specified in advance). That is, the transmission of the modulated signal may be realized by any of wireless broadcasting, wired broadcasting, wireless communication, and wired communication.

For example, a terrestrial digital broadcast broadcasting station (broadcasting equipment or the like) / receiving station (such as a television receiver) is an example of a transmitting device PROD_A / receiving device PROD_B that transmits and receives a modulated signal by wireless broadcasting. Further, a broadcasting station (such as broadcasting equipment) / receiving station (such as a television receiver) of cable television broadcasting is an example of a transmitting device PROD_A / receiving device PROD_B that transmits and receives a modulated signal by cable broadcasting.

Also, a server (workstation etc.) / Client (television receiver, personal computer, smart phone etc.) such as VOD (Video On Demand) service and video sharing service using the Internet is a transmitting device for transmitting and receiving modulated signals by communication. This is an example of PROD_A / reception device PROD_B (usually, either a wireless or wired transmission medium is used in a LAN, and a wired transmission medium is used in a WAN). Here, the personal computer includes a desktop PC, a laptop PC, and a tablet PC. The smartphone also includes a multi-function mobile phone terminal.

In addition to the function of decoding the encoded data downloaded from the server and displaying it on the display, the video sharing service client has a function of encoding a moving image captured by the camera and uploading it to the server. That is, the client of the video sharing service functions as both the transmission device PROD_A and the reception device PROD_B.

Next, it will be described with reference to FIG. 62 that the above-described moving picture encoding apparatus 2 and moving picture decoding apparatus 1 can be used for recording and reproduction of moving pictures.

FIG. 62 (a) is a block diagram showing a configuration of a recording apparatus PROD_C in which the above-described moving picture encoding apparatus 2 is mounted. As shown in FIG. 62 (a), the recording apparatus PROD_C includes an encoding unit PROD_C1 that obtains encoded data by encoding a moving image, and a writing unit PROD_C2 that writes the encoded data obtained by the encoding unit PROD_C1 to the recording medium PROD_M. The moving image encoding apparatus 2 described above is used as the encoding unit PROD_C1.

The recording medium PROD_M may be (1) of a type built into the recording device PROD_C, such as an HDD (Hard Disk Drive) or SSD (Solid State Drive), (2) of a type connected to the recording device PROD_C, such as an SD memory card or USB (Universal Serial Bus) flash memory, or (3) of a type loaded into a drive device (not shown) built into the recording device PROD_C, such as a DVD (Digital Versatile Disc) or BD (Blu-ray Disc: registered trademark).

The recording device PROD_C may further include, as supply sources of the moving image input to the encoding unit PROD_C1, a camera PROD_C3 that captures a moving image, an input terminal PROD_C4 for inputting a moving image from the outside, a receiving unit PROD_C5 for receiving a moving image, and an image processing unit C6 that generates or processes an image. FIG. 62 (a) illustrates a configuration in which the recording apparatus PROD_C includes all of these, but a part of the configuration may be omitted.

The receiving unit PROD_C5 may receive a non-encoded moving image, or may receive encoded data encoded by a transmission encoding scheme different from the recording encoding scheme. In the latter case, a transmission decoding unit (not shown) that decodes encoded data encoded by the transmission encoding scheme may be interposed between the receiving unit PROD_C5 and the encoding unit PROD_C1.

Examples of such a recording device PROD_C include a DVD recorder, a BD recorder, and an HDD (Hard Disk Drive) recorder (in this case, the input terminal PROD_C4 or the receiving unit PROD_C5 is the main supply source of moving images). In addition, a camcorder (in this case, the camera PROD_C3 is the main supply source of moving images), a personal computer (in this case, the receiving unit PROD_C5 or the image processing unit C6 is the main supply source of moving images), and a smartphone (in this case, the camera PROD_C3 or the receiving unit PROD_C5 is the main supply source of moving images) are also examples of such a recording device PROD_C.

FIG. 62 (b) is a block diagram showing the configuration of the playback device PROD_D in which the above-described video decoding device 1 is mounted. As shown in FIG. 62 (b), the playback device PROD_D includes a reading unit PROD_D1 that reads encoded data written to the recording medium PROD_M, and a decoding unit PROD_D2 that obtains a moving image by decoding the encoded data read by the reading unit PROD_D1. The moving picture decoding apparatus 1 described above is used as the decoding unit PROD_D2.

Note that the recording medium PROD_M may be (1) of a type built into the playback device PROD_D, such as an HDD or SSD, (2) of a type connected to the playback device PROD_D, such as an SD memory card or USB flash memory, or (3) of a type loaded into a drive device (not shown) built into the playback device PROD_D, such as a DVD or BD.

In addition, the playback device PROD_D may further include, as supply destinations of the moving image output by the decoding unit PROD_D2, a display PROD_D3 that displays the moving image, an output terminal PROD_D4 that outputs the moving image to the outside, and a transmission unit PROD_D5 that transmits the moving image. FIG. 62 (b) illustrates a configuration in which the playback apparatus PROD_D includes all of these, but a part of the configuration may be omitted.

The transmission unit PROD_D5 may transmit a non-encoded moving image, or may transmit encoded data encoded by a transmission encoding scheme different from the recording encoding scheme. In the latter case, it is preferable to interpose an encoding unit (not shown) that encodes a moving image with the transmission encoding scheme between the decoding unit PROD_D2 and the transmission unit PROD_D5.

Examples of such a playback device PROD_D include a DVD player, a BD player, and an HDD player (in this case, the output terminal PROD_D4, to which a television receiver or the like is connected, is the main supply destination of moving images). In addition, a television receiver (in this case, the display PROD_D3 is the main supply destination of moving images), digital signage (also referred to as an electronic signboard or an electronic bulletin board; in this case, the display PROD_D3 or the transmission unit PROD_D5 is the main supply destination of moving images), a desktop PC (in this case, the output terminal PROD_D4 or the transmission unit PROD_D5 is the main supply destination of moving images), a laptop or tablet PC (in this case, the display PROD_D3 or the transmission unit PROD_D5 is the main supply destination of moving images), and a smartphone (in this case, the display PROD_D3 or the transmission unit PROD_D5 is the main supply destination of moving images) are also examples of such a playback device PROD_D.

(About correspondence with HEVC)
Note that the LCU (Largest Coding Unit) in each of the above embodiments corresponds to the root of a coding tree (Coding Tree) in HEVC (High Efficiency Video Coding), which has been proposed as a successor to H.264 / MPEG-4 AVC, and a leaf CU corresponds to a CU (Coding Unit, also called a leaf of the coding tree) in HEVC. Moreover, the PU and TU in each of the above embodiments correspond to the prediction tree (Prediction Tree) and the transform tree (Transform Tree) in HEVC, respectively. Moreover, a partition of a PU in the above embodiments corresponds to a PU (Prediction Unit) in HEVC, and a block obtained by dividing a TU in the above embodiments corresponds to a TU (Transform Unit) in HEVC.

The present invention can be suitably applied to a decoding device that decodes encoded data and an encoding device that generates encoded data. Further, the present invention can be suitably applied to the data structure of encoded data generated by the encoding device and referenced by the decoding device.

<< Embodiment 1-1 >>
1 video decoding device (image decoding device)
121 Last coefficient decoding unit (last coefficient position detecting means)
122 Position determination unit (position determination means, distance determination means)
123 Two-stage decoding unit (two-stage decoding means)
125 Last-two-coefficient decoding unit (non-zero coefficient position detecting means)
2 Video encoding device (image encoding device)
310 Last coefficient detection unit (last coefficient position detection means)
320 Position determination unit (position determination means)
330 Two-stage encoding unit (two-stage encoding means)
C10 Coefficient encoded data (Data structure of encoded data)
LT Last coefficient (last non-zero coefficient)
MX10 Coefficient matrix
R10, R30, R50, R70 First target range (position determination range)
TBL11, TBL21 VLC table for linear scanning (variable length code table)
<< Embodiment 2 >>
1 Video decoding device (image decoding device)
11 Variable length code decoding unit
111 Quantization residual information decoding unit (arithmetic decoding device)
111a Context index allocation unit (context index allocation means)
111b Transform coefficient decoding unit (syntax decoding means, transform coefficient restoring means)
111c Context variable management unit
111d Context index change unit (context index change means)
12 Predictive image generation unit
15 Frame memory
2 Video encoding device (image encoding device)
21 Predictive image generation unit
25 Frame memory
27 Variable length code encoding unit
271 Quantization residual information encoding unit (arithmetic encoding device)
271a Context index allocation unit (context index allocation means)
271b Transform coefficient coding unit (syntax coding means)
271c Context variable management unit
271d Context index change unit (context index change means)

Claims

An image decoding apparatus that reproduces a coefficient matrix by decoding and inverse-scanning encoded quantized transform coefficients included in encoded data obtained by encoding image data, the image decoding apparatus comprising:
last coefficient position detecting means for detecting the position of the last non-zero coefficient in the reverse scan order in the coefficient matrix;
position determining means for determining whether or not the position of the last non-zero coefficient is in a position determination range extending from a row or column including the coefficient of the DC component to a predetermined row or column in the coefficient matrix; and
two-stage decoding means for, when the position of the last non-zero coefficient is in the position determination range, dividing the coefficient matrix into a first decoding target range including the position of the last non-zero coefficient and a second decoding target range other than the first decoding target range, and performing decoding processing on each of the two ranges.
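The position determination and range split recited above can be illustrated with a minimal Python sketch. It assumes a zigzag scan order and a position determination range consisting of the first `k` rows (rows only, for simplicity), and it takes the first target range to coincide with that range. The names `zigzag_order`, `last_nonzero_position`, and `split_ranges` are hypothetical and do not appear in the specification. The sketch shows only how coefficient positions are partitioned, not the variable length decoding itself.

```python
def zigzag_order(n):
    """Return the (row, col) positions of an n x n block in a zigzag scan order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col indexes each anti-diagonal
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Alternate the traversal direction on successive anti-diagonals.
        order.extend(diag if s % 2 else reversed(diag))
    return order


def last_nonzero_position(matrix, order):
    """Position of the last non-zero coefficient in scan order, or None."""
    for r, c in reversed(order):
        if matrix[r][c] != 0:
            return (r, c)
    return None


def split_ranges(matrix, k=1):
    """Partition positions into (first, second) target ranges.

    Assumption of this sketch: the position determination range is rows
    0..k-1. `first` is empty when the last non-zero coefficient lies
    outside that range, in which case the whole block is handled in one pass.
    """
    n = len(matrix)
    order = zigzag_order(n)
    last = last_nonzero_position(matrix, order)
    if last is None or last[0] >= k:
        return [], order
    first = [(r, c) for r in range(k) for c in range(n)]  # row-wise scan
    second = [p for p in order if p[0] >= k]              # remaining rows, zigzag
    return first, second
```

For a 4 x 4 block whose last non-zero coefficient sits in the first row, `split_ranges` returns the four first-row positions as the first range and the remaining twelve positions, in zigzag order, as the second range.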
The image decoding apparatus according to claim 1, wherein the position determining means determines whether the position of the last non-zero coefficient is in the first row or the first column of the coefficient matrix, and
the first decoding target range in the decoding processing of the two-stage decoding means coincides with the position determination range.
The image decoding apparatus further comprising distance determining means for determining whether the position of the last non-zero coefficient is separated from the position of the coefficient of the DC component by a predetermined distance or more,
wherein the two-stage decoding means performs the decoding processing when the position of the last non-zero coefficient is separated from the position of the coefficient of the DC component by the predetermined distance or more.
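The additional distance determination can be sketched as follows. The claim does not fix a distance metric, so this example assumes the Manhattan distance from the DC position (0, 0). The function names and the default values of `k` and `min_distance` are hypothetical.

```python
def is_far_from_dc(last_pos, min_distance):
    """True if the last non-zero coefficient is at least min_distance from DC.

    Assumption: distance is the Manhattan distance from the DC position
    (0, 0); the text leaves the metric open.
    """
    row, col = last_pos
    return row + col >= min_distance


def use_two_stage(last_pos, k=1, min_distance=2):
    """Two-stage processing applies only if both determinations succeed."""
    # Position determination: within the first k rows or the first k columns.
    in_range = last_pos[0] < k or last_pos[1] < k
    return in_range and is_far_from_dc(last_pos, min_distance)
```

Under these assumptions, a last coefficient at (0, 3) selects two-stage processing, while one at (0, 1) does not, because it lies too close to the DC coefficient for the split to pay off.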
The image decoding apparatus according to any one of claims 1 to 3, wherein the two-stage decoding means performs the inverse scan as a zigzag scan when decoding the second decoding target range.
The image decoding apparatus according to claim 1, further comprising recursive control means for performing control such that, in the decoding processing of the second decoding target range by the two-stage decoding means, the position detection by the last coefficient position detecting means, the position determination by the position determining means, and the decoding by the two-stage decoding means are performed recursively.
The two-stage decoding means is a variable length code table that defines a combination of a variable length code and a length of continuous non-zero coefficients in the decoding of the first decoding target range, wherein the shortest variable length code is 3. The image decoding apparatus according to claim 2, wherein the variable length code table refers to a variable length code table in which the length of the continuous non-zero coefficient is longer as the length of the variable length code is longer.
The image decoding apparatus according to claim 1, further comprising non-zero coefficient position detecting means for detecting, in the coefficient matrix, the position of a predetermined-numbered non-zero coefficient counted from the last non-zero coefficient in the reverse scan order,
wherein the position determining means determines whether or not the positions from the last non-zero coefficient to the predetermined-numbered non-zero coefficient are within the position determination range.
The image decoding apparatus according to any one of claims 1 to 7, wherein the two-stage decoding means switches from a non-zero coefficient length decoding mode, in which the length of consecutive non-zero coefficients is decoded, to a sequential decoding mode, in which coefficients are decoded sequentially, when a predetermined mode switching condition is satisfied, and the mode switching condition for decoding in the first decoding target range differs from the mode switching condition for decoding in the second decoding target range.
An image encoding apparatus that outputs encoded data by scanning and encoding a coefficient matrix composed of quantized transform coefficients of a prediction residual obtained by subtracting a predicted image from image data, the image encoding apparatus comprising:
last coefficient position detecting means for detecting the position of the last non-zero coefficient in the scan order in the coefficient matrix;
position determining means for determining whether or not the position of the last non-zero coefficient is in a position determination range extending from a row or column including the coefficient of the DC component to a predetermined row or column in the coefficient matrix; and
two-stage encoding means for, when the position of the last non-zero coefficient is in the position determination range, dividing the coefficient matrix into a first encoding target range including the position of the last non-zero coefficient and a second encoding target range other than the first encoding target range, and performing encoding processing on each of the two ranges.
A data structure of encoded data generated by scanning and encoding a coefficient matrix composed of quantized transform coefficients obtained by orthogonally transforming and quantizing a prediction residual obtained by subtracting a predicted image from image data, the data structure comprising:
position information indicating the position of the last non-zero coefficient in the scan order in the coefficient matrix;
first coefficient encoded data obtained by sequentially encoding the coefficients in a first encoding target range that includes the last non-zero coefficient and extends, in the coefficient matrix, from the row or column including the coefficient of the DC component to a predetermined row or column; and
second coefficient encoded data obtained by sequentially encoding the coefficients in a second encoding target range other than the first encoding target range.
An arithmetic decoding device that decodes encoded data obtained by arithmetically encoding, for each transform coefficient obtained by frequency-transforming a target image for each unit region, one or more types of syntax representing the transform coefficient, the arithmetic decoding device comprising:
context index allocating means for allocating, to each syntax in the target frequency domain corresponding to the unit region to be processed, a context index determined according to the type of the syntax and the position of the syntax in the target frequency domain;
syntax decoding means for sequentially arithmetically decoding each syntax in the target frequency domain based on the probability state specified by the context index allocated to the syntax;
transform coefficient restoring means for restoring each transform coefficient from the syntax decoded by the syntax decoding means; and
context index changing means for changing the context index to be allocated to an undecoded syntax in the target frequency domain according to a bias in the distribution of the restored non-zero transform coefficients in the target frequency domain.
The arithmetic decoding device according to claim 11, wherein, when the horizontal and vertical coordinates in the target frequency domain are denoted by u and v, respectively, and the context index allocated to an undecoded syntax by the context index allocating means is denoted by ctxIdx(u, v),
the context index changing means, when the distribution of the decoded non-zero transform coefficients in the target frequency domain is biased toward a predetermined one of the horizontal direction and the vertical direction, changes the context index ctxIdx(u, v) allocated to the undecoded syntax to the context index ctxIdx'(u, v) given by
ctxIdx'(u, v) = ctxIdx(v, u).
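The coordinate swap ctxIdx'(u, v) = ctxIdx(v, u) amounts to transposing the table of context indices. The following minimal Python sketch assumes a square block and a storage convention in which `ctx_idx[v][u]` holds ctxIdx(u, v); both the convention and the function name `transpose_context_indices` are assumptions made for illustration.

```python
def transpose_context_indices(ctx_idx):
    """Return the table ctxIdx' with ctxIdx'(u, v) = ctxIdx(v, u).

    Convention (an assumption of this sketch): ctx_idx[v][u] holds
    ctxIdx(u, v), where u is the horizontal and v the vertical coordinate.
    The block is assumed to be square.
    """
    n = len(ctx_idx)
    # Entry [v][u] of the result is ctxIdx'(u, v) = ctxIdx(v, u) = ctx_idx[u][v].
    return [[ctx_idx[u][v] for u in range(n)] for v in range(n)]
```

Because the swap only re-indexes an existing table, no additional probability states are needed: a horizontally biased block simply reuses the contexts that would otherwise serve the mirrored positions.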
The arithmetic decoding device according to claim 11 or 12, wherein the context index changing means changes the context index to be allocated to an undecoded syntax in the target frequency domain when an index representing the bias in the distribution of the decoded non-zero transform coefficients in the target frequency domain is larger than a predetermined threshold.
The arithmetic decoding device according to any one of claims 11 to 13, wherein the context index changing means changes the context index allocated to those undecoded syntaxes in the target frequency domain for which at least one of the horizontal coordinate and the vertical coordinate in the target frequency domain is not more than a predetermined threshold.
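The threshold on the bias index and the restriction to positions near the axes can be combined in one short Python sketch. The bias index used here (the count of decoded non-zero coefficients with u > v minus the count with v > u) is a hypothetical choice, as are the function names, the default thresholds `bias_th` and `coord_th`, and the assumption that the predetermined direction is horizontal. The table convention is `ctx_idx[v][u]` = ctxIdx(u, v).

```python
def horizontal_bias(decoded_nonzero):
    """Hypothetical bias index over decoded non-zero positions (u, v).

    Counts coefficients on the horizontal side (u > v) minus those on the
    vertical side (v > u); the text does not prescribe this measure.
    """
    return sum((u > v) - (v > u) for u, v in decoded_nonzero)


def changed_context_index(ctx_idx, u, v, decoded_nonzero, bias_th=2, coord_th=1):
    """Context index for the undecoded syntax at (u, v) after the optional change.

    Convention (assumption): ctx_idx[v][u] holds ctxIdx(u, v).
    """
    biased = horizontal_bias(decoded_nonzero) > bias_th
    near_axis = u <= coord_th or v <= coord_th  # at least one small coordinate
    if biased and near_axis:
        return ctx_idx[u][v]  # ctxIdx(v, u): the swapped index
    return ctx_idx[v][u]      # ctxIdx(u, v): left unchanged
```

With these defaults, the index is swapped only when the decoded coefficients lean clearly toward the horizontal side and the position lies within the first two rows or columns; all other positions keep their originally allocated index.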
The arithmetic decoding device according to claim 11, wherein the one or more types of syntax include a first flag indicating whether or not the transform coefficient is 0, and
the context index changing means changes the context index allocated to an undecoded first flag in the target frequency domain.
The arithmetic decoding device according to claim 11, wherein the one or more types of syntax include a second flag indicating whether or not the transform coefficient is the last transform coefficient in the processing order, and
the context index changing means changes the context index allocated to an undecoded second flag in the target frequency domain.
An image decoding apparatus comprising:
the arithmetic decoding device according to any one of claims 11 to 16;
inverse frequency transform means for generating a residual image by inverse-frequency-transforming the transform coefficients decoded by the arithmetic decoding device; and
decoded image generating means for generating a decoded image by adding the residual image generated by the inverse frequency transform means to a predicted image predicted from an already generated decoded image.
An arithmetic encoding device that generates encoded data by arithmetically encoding, for each transform coefficient obtained by frequency-transforming a target image for each unit region, one or more types of syntax representing the transform coefficient, the arithmetic encoding device comprising:
context index allocating means for allocating, to each syntax representing a transform coefficient in the target frequency domain corresponding to the unit region to be processed, a context index determined according to the type of the syntax and the position of the syntax in the target frequency domain;
syntax encoding means for sequentially arithmetically encoding each syntax in the target frequency domain based on the probability state specified by the context index allocated to the syntax; and
context index changing means for changing the context index to be allocated to an unencoded syntax in the target frequency domain according to a bias in the distribution of the encoded non-zero transform coefficients in the target frequency domain.
An image encoding apparatus comprising:
transform coefficient generating means for generating transform coefficients by frequency-transforming, for each unit region, a residual image between an encoding target image and a predicted image; and
the arithmetic encoding device according to claim 18,
wherein the arithmetic encoding device generates encoded data by arithmetically encoding one or more types of syntax representing the transform coefficients generated by the transform coefficient generating means.
A data structure of encoded data obtained by arithmetically encoding, for each transform coefficient obtained by frequency-transforming an original image for each unit region, one or more types of syntax representing the transform coefficient, wherein
the one or more types of syntax include a flag indicating whether or not the transform coefficient is 0, and
an arithmetic decoding device that decodes the encoded data allocates, to each syntax in the target frequency domain corresponding to the unit region to be processed, a context index determined according to the type of the syntax and the position of the syntax in the target frequency domain, sequentially arithmetically decodes each syntax in the target frequency domain based on the probability state specified by the context index allocated to the syntax, and changes the context index allocated to an undecoded syntax in the target frequency domain according to the distribution of the restored flags in the target frequency domain.