TW201808004A

TW201808004A - Encoding device, decoding device, encoding method, and decoding method

Info

Publication number: TW201808004A
Application number: TW106125561A
Authority: TW
Inventors: 大川真人; 齋藤秀雄; 西孝啓; 遠間正真
Original assignee: 美商松下電器（美國）知識產權公司
Priority date: 2016-07-29
Filing date: 2017-07-28
Publication date: 2018-03-01
Also published as: WO2018021373A1

Abstract

An encoding device (100) for encoding blocks to be encoded in an image is provided with: an intra/inter determination unit (1061) that determines whether intra prediction or inter prediction is to be used for a block to be encoded; and a frequency transformation unit (1063) that selectively uses a plurality of frequency transformation bases including DCT-II and DCT-V to perform a first frequency transformation on the prediction error of the block to be encoded. In the first frequency transformation, the basis of DCT-V is used if intra prediction is used for the block to be encoded, and the basis of DCT-II is used if inter prediction is used for the block to be encoded.

Description

Encoding device, decoding device, encoding method and decoding method

FIELD OF THE INVENTION

The present disclosure relates to an encoding device, a decoding device, an encoding method, and a decoding method.

BACKGROUND OF THE INVENTION

A video coding standard called HEVC (High Efficiency Video Coding) has been standardized by the JCT-VC (Joint Collaborative Team on Video Coding).

PRIOR ART DOCUMENTS

Non-Patent Documents

Non-Patent Document 1: H.265 (ISO/IEC 23008-2 HEVC (High Efficiency Video Coding))

SUMMARY OF THE INVENTION

Problems to Be Solved by the Invention

Such encoding and decoding technology calls for further improvement in compression efficiency.

Accordingly, the present disclosure provides an encoding device, a decoding device, an encoding method, and a decoding method that can achieve further improvement in compression efficiency.

Means for Solving the Problems

An encoding device according to one aspect of the present disclosure encodes an encoding target block of an image and includes a processor and a memory. Using the memory, the processor determines which of intra prediction and inter prediction is used for the encoding target block, and performs a first frequency transform on the prediction error of the encoding target block by selectively using a plurality of frequency transform bases including DCT-II and DCT-V. In the first frequency transform, the DCT-V basis is used when intra prediction is used for the encoding target block, and the DCT-II basis is used when inter prediction is used for the encoding target block.

A decoding device according to one aspect of the present disclosure decodes a decoding target block of an image and includes a processor and a memory. Using the memory, the processor determines which of intra prediction and inter prediction is used for the decoding target block, and performs a first inverse frequency transform on the prediction error of the decoding target block by selectively using a plurality of inverse frequency transform bases including the inverse transforms of DCT-II and DCT-V. In the first inverse frequency transform, the basis of the inverse transform of DCT-V is used when intra prediction is used for the decoding target block, and the basis of the inverse transform of DCT-II is used when inter prediction is used for the decoding target block.

Note that these general or specific aspects may be implemented as a system, a method, an integrated circuit, a computer program, or a computer-readable recording medium such as a CD-ROM, or as any combination of systems, methods, integrated circuits, computer programs, and recording media.

Effects of the Invention

The present disclosure can provide an encoding device, a decoding device, an encoding method, and a decoding method that can achieve further improvement in compression efficiency.

MODES FOR CARRYING OUT THE INVENTION

(Underlying Knowledge Forming the Basis of the Present Disclosure)

In conventional video coding such as H.265/HEVC (High Efficiency Video Coding), the DCT-II basis is basically used as the frequency transform basis for intra-predicted blocks; only 4×4 luma blocks use the DST-VII basis.

Because intra prediction uses spatially adjacent blocks, the residual signal, which is the difference between the input signal and the prediction signal, is often spatially biased. In particular, the residual tends to be small near the boundaries with the adjacent blocks on the left and upper sides. Consequently, when DCT-II, whose DC (0th-order) basis is comparatively flat, is used, high-order components may become prominent, which degrades compression efficiency.

Accordingly, an encoding device according to one aspect of the present disclosure encodes an encoding target block of an image and includes a processor and a memory. Using the memory, the processor determines which of intra prediction and inter prediction is used for the encoding target block, and performs a first frequency transform on the prediction error of the encoding target block by selectively using a plurality of frequency transform bases including DCT-II and DCT-V. In the first frequency transform, the DCT-V basis is used when intra prediction is used for the encoding target block, and the DCT-II basis is used when inter prediction is used for the encoding target block.

With this configuration, when intra prediction is used for the encoding target block, the encoding target block can be transformed using the DCT-V basis. In DCT-V, the DC component has a smaller amplitude at positions close to the reference pixels, so DCT-V is well suited to transforming intra prediction errors. The encoding device can therefore achieve further improvement in compression efficiency.
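The difference between the two DC bases can be seen numerically. The following is a minimal sketch, assuming the textbook orthonormal definitions of DCT-II and DCT-V; the normalization used in the disclosure's own figure may differ, and the function names `dct2_basis` and `dct5_basis` are hypothetical.

```python
import math


def dct2_basis(n):
    """Orthonormal DCT-II basis; row i is the i-th basis vector of length n."""
    return [
        [
            math.sqrt(2.0 / n)
            * (math.sqrt(0.5) if i == 0 else 1.0)
            * math.cos(math.pi * i * (2 * j + 1) / (2 * n))
            for j in range(n)
        ]
        for i in range(n)
    ]


def dct5_basis(n):
    """Orthonormal DCT-V basis (textbook form, assumed here)."""
    m = 2 * n - 1

    def eps(k):
        # Extra 1/sqrt(2) weight on the first row and the first column.
        return math.sqrt(0.5) if k == 0 else 1.0

    return [
        [
            2.0 / math.sqrt(m) * eps(i) * eps(j) * math.cos(2 * math.pi * i * j / m)
            for j in range(n)
        ]
        for i in range(n)
    ]


if __name__ == "__main__":
    n = 4
    print("DCT-II DC basis:", [round(v, 3) for v in dct2_basis(n)[0]])
    print("DCT-V  DC basis:", [round(v, 3) for v in dct5_basis(n)[0]])
```

For N = 4, the DCT-II DC basis is flat (0.5 at every position), whereas the DCT-V DC basis under this definition is about 0.378 at position 0 and about 0.535 elsewhere. Position 0 is the sample nearest the reference pixels, which is where the intra residual tends to be small.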

In the encoding device according to one aspect of the present disclosure, the processor may, for example, further determine whether the size of the encoding target block is less than or equal to a threshold size. In the first frequency transform, when intra prediction is used for the encoding target block, (i) the DCT-V basis is used if the size of the encoding target block is less than or equal to the threshold size, and (ii) the DCT-II basis is used if the size of the encoding target block is greater than the threshold size.

This allows the encoding target block to be transformed by switching between the DCT-II and DCT-V bases according to the size of the encoding target block for which intra prediction is used. When the block size is large, the prediction error tends to be small across the block as a whole, so DCT-II suits the transform of the prediction error of the encoding target block. When the block size is small, the prediction error tends to be smaller for pixels closer to the reference pixels, so DCT-V suits the transform of the prediction error of the encoding target block. Therefore, transforming the encoding target block with the DCT-V basis when its size is less than or equal to the threshold size, and with the DCT-II basis when its size is greater than the threshold size, enables further improvement in compression efficiency.
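The selection rule described above can be sketched as follows. This is an illustrative sketch only: the names `Basis` and `select_first_transform_basis` are hypothetical, and comparing the longer side of the block with the threshold is an assumption, since the text does not state how "size" is measured for non-square blocks.

```python
from enum import Enum


class Basis(Enum):
    DCT2 = "DCT-II"
    DCT5 = "DCT-V"


def select_first_transform_basis(is_intra, width, height, threshold_size=None):
    """Pick the basis for the first frequency transform of one block.

    is_intra       -- True if intra prediction is used for the block
    threshold_size -- optional size threshold; None means no size check
    """
    if not is_intra:
        # Inter-predicted blocks use DCT-II.
        return Basis.DCT2
    if threshold_size is None:
        # Intra-predicted blocks use DCT-V when no size check applies.
        return Basis.DCT5
    # Assumption: "size" is taken as the longer side of the block.
    if max(width, height) <= threshold_size:
        return Basis.DCT5
    return Basis.DCT2


if __name__ == "__main__":
    print(select_first_transform_basis(True, 4, 4, threshold_size=8).value)
    print(select_first_transform_basis(True, 16, 16, threshold_size=8).value)
    print(select_first_transform_basis(False, 4, 4, threshold_size=8).value)
```

A decoder would apply the same rule to choose between the inverse transforms of DCT-V and DCT-II, so both sides reach the same decision without an explicit per-block basis flag.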

In the encoding device according to one aspect of the present disclosure, the processor may, for example, further write information on the threshold size into a bitstream.

This allows the information on the threshold size to be written into the bitstream. A threshold size determined adaptively according to the input image can thus be used, which enables further improvement in compression efficiency.

In the encoding device according to one aspect of the present disclosure, the processor may, for example, further determine which of a plurality of transform modes, including a first transform mode and a second transform mode, is to be applied to the encoding target block, perform the first frequency transform when the first transform mode is applied, and perform a second frequency transform different from the first frequency transform when the second transform mode is applied.

This allows the frequency transform to be switched by transform mode. The frequency transform can thus be made more efficient, which enables further improvement in compression efficiency.

In the encoding device according to one aspect of the present disclosure, the processor may, for example, further write information on the transform mode applied to the encoding target block into the bitstream.

This allows the information on the transform mode applied to the encoding target block to be written into the bitstream. The transform mode can thus be determined adaptively according to the input image, which enables further improvement in compression efficiency.

An encoding method according to one aspect of the present disclosure is a method of encoding an encoding target block of an image. The method determines which of intra prediction and inter prediction is used for the encoding target block, and performs a first frequency transform on the prediction error of the encoding target block by selectively using a plurality of frequency transform bases including DCT-II and DCT-V. In the first frequency transform, the DCT-V basis is used when intra prediction is used for the encoding target block, and the DCT-II basis is used when inter prediction is used for the encoding target block.

This provides the same effects as the encoding device described above.

With the decoding device according to one aspect of the present disclosure, when intra prediction is used for the decoding target block, the decoding target block can be inversely transformed using the basis of the inverse transform of DCT-V. In DCT-V, the DC component has a smaller amplitude at positions close to the reference pixels, so DCT-V is well suited to transforming intra prediction errors. The decoding device can therefore achieve further improvement in compression efficiency.

In the decoding device according to one aspect of the present disclosure, the processor may, for example, further determine whether the size of the decoding target block is less than or equal to a threshold size. In the first inverse frequency transform, when intra prediction is used for the decoding target block, (i) the basis of the inverse transform of DCT-V is used if the size of the decoding target block is less than or equal to the threshold size, and (ii) the basis of the inverse transform of DCT-II is used if the size of the decoding target block is greater than the threshold size.

This allows the decoding target block to be inversely transformed by switching between the bases of the inverse transforms of DCT-II and DCT-V according to the size of the decoding target block for which intra prediction is used. When the block size is large, the prediction error tends to be small across the block as a whole, so DCT-II suits the transform of the prediction error of the decoding target block. When the block size is small, the prediction error tends to be smaller for pixels closer to the reference pixels, so DCT-V suits the transform of the prediction error of the decoding target block. Therefore, inversely transforming the decoding target block with the basis of the inverse transform of DCT-V when its size is less than or equal to the threshold size, and with the basis of the inverse transform of DCT-II when its size is greater than the threshold size, enables further improvement in compression efficiency.

In the decoding device according to one aspect of the present disclosure, the processor may, for example, further decode information on the threshold size from a bitstream.

This allows the information on the threshold size to be decoded from the bitstream. A threshold size determined adaptively according to the input image can thus be used, which enables further improvement in compression efficiency.

In the decoding device according to one aspect of the present disclosure, the processor may, for example, further determine which of a plurality of transform modes, including a first transform mode and a second transform mode, is to be applied to the decoding target block, perform the first inverse frequency transform when the first transform mode is applied, and perform a second inverse frequency transform different from the first inverse frequency transform when the second transform mode is applied.

This allows the inverse frequency transform to be switched by transform mode. The frequency transform can thus be made more efficient, which enables further improvement in compression efficiency.

In the decoding device according to one aspect of the present disclosure, the processor may, for example, further decode information on the transform mode applied to the decoding target block from the bitstream.

This allows the information on the transform mode applied to the decoding target block to be decoded from the bitstream. The transform mode can thus be determined adaptively according to the input image, which enables further improvement in compression efficiency.

A decoding method according to one aspect of the present disclosure is a method of decoding a decoding target block of an image. The method determines which of intra prediction and inter prediction is used for the decoding target block, and performs a first inverse frequency transform on the prediction error of the decoding target block by selectively using a plurality of inverse frequency transform bases including the inverse transforms of DCT-II and DCT-V. In the first inverse frequency transform, the basis of the inverse transform of DCT-V is used when intra prediction is used for the decoding target block, and the basis of the inverse transform of DCT-II is used when inter prediction is used for the decoding target block.

This provides the same effects as the decoding device described above.

Note that these general or specific aspects may be implemented as a system, an integrated circuit, a computer program, or a computer-readable recording medium such as a CD-ROM, or as any combination of systems, integrated circuits, computer programs, and recording media.

Embodiments are described in detail below with reference to the drawings.

Each of the embodiments described below shows a general or specific example. The numerical values, shapes, materials, components, arrangement and connection of the components, steps, order of the steps, and so on shown in the following embodiments are merely examples and are not intended to limit the scope of the claims. Among the components in the following embodiments, those not recited in the independent claims representing the broadest concept are described as optional components.

(Embodiment 1)

[Outline of Encoding Device]

First, an outline of the encoding device according to Embodiment 1 is described. FIG. 1 is a block diagram showing the functional configuration of the encoding device 100 according to Embodiment 1. The encoding device 100 is a video/image encoding device that encodes a video/image block by block.

As shown in FIG. 1, the encoding device 100 encodes an image block by block and includes a division unit 102, a subtraction unit 104, a transform unit 106, a quantization unit 108, an entropy encoding unit 110, an inverse quantization unit 112, an inverse transform unit 114, an addition unit 116, a block memory 118, a loop filter unit 120, a frame memory 122, an intra prediction unit 124, an inter prediction unit 126, and a prediction control unit 128.

The encoding device 100 can be implemented by, for example, a general-purpose processor and a memory. In this case, when the processor executes a software program stored in the memory, the processor functions as the division unit 102, the subtraction unit 104, the transform unit 106, the quantization unit 108, the entropy encoding unit 110, the inverse quantization unit 112, the inverse transform unit 114, the addition unit 116, the loop filter unit 120, the intra prediction unit 124, the inter prediction unit 126, and the prediction control unit 128. Alternatively, the encoding device 100 may be implemented as one or more dedicated electronic circuits corresponding to the division unit 102, the subtraction unit 104, the transform unit 106, the quantization unit 108, the entropy encoding unit 110, the inverse quantization unit 112, the inverse transform unit 114, the addition unit 116, the loop filter unit 120, the intra prediction unit 124, the inter prediction unit 126, and the prediction control unit 128.

Each component of the encoding device 100 is described below.

[Division Unit]

The division unit 102 divides each picture included in the input video into a plurality of blocks and outputs each block to the subtraction unit 104. For example, the division unit 102 first divides a picture into blocks of a fixed size (for example, 128×128). Such a fixed-size block is sometimes called a coding tree unit (CTU). The division unit 102 then divides each fixed-size block into blocks of variable size (for example, 64×64 or smaller) by recursive quadtree and/or binary tree block division. Such a variable-size block is sometimes called a coding unit (CU), a prediction unit (PU), or a transform unit (TU). In this embodiment, CU, PU, and TU need not be distinguished; some or all of the blocks in a picture may serve as the processing unit for CU, PU, and TU.

FIG. 2 shows an example of block division in Embodiment 1. In FIG. 2, solid lines indicate block boundaries produced by quadtree block division, and dashed lines indicate block boundaries produced by binary tree block division.

Here, block 10 is a square block of 128×128 pixels (a 128×128 block). This 128×128 block 10 is first divided into four square 64×64 blocks (quadtree block division).

The upper-left 64×64 block is further divided vertically into two rectangular 32×64 blocks, and the left 32×64 block is further divided vertically into two rectangular 16×64 blocks (binary tree block division). As a result, the upper-left 64×64 block is divided into two 16×64 blocks 11 and 12 and a 32×64 block 13.

The upper-right 64×64 block is divided horizontally into two rectangular 64×32 blocks 14 and 15 (binary tree block division).

The lower-left 64×64 block is divided into four square 32×32 blocks (quadtree block division). Of these four 32×32 blocks, the upper-left and lower-right blocks are divided further. The upper-left 32×32 block is divided vertically into two rectangular 16×32 blocks, and the right 16×32 block is further divided horizontally into two 16×16 blocks (binary tree block division). The lower-right 32×32 block is divided horizontally into two 32×16 blocks (binary tree block division). As a result, the lower-left 64×64 block is divided into a 16×32 block 16, two 16×16 blocks 17 and 18, two 32×32 blocks 19 and 20, and two 32×16 blocks 21 and 22.

The lower-right 64×64 block 23 is not divided.

As described above, in FIG. 2, block 10 is divided into thirteen variable-size blocks 11 to 23 by recursive quadtree and binary tree block division. Such division is sometimes called QTBT (quad-tree plus binary tree) division.
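The division of block 10 described above can be reproduced with a small recursive structure. The following is a sketch under assumed conventions: the tuple encoding (`"Q"` for a quadtree split, `"V"` and `"H"` for vertical and horizontal binary splits, `None` for an undivided block) and the names `leaf_blocks` and `FIG2_TREE` are hypothetical and are not the bitstream syntax of any codec.

```python
def leaf_blocks(width, height, tree):
    """Return the (width, height) of every leaf block, in traversal order."""
    if tree is None:
        # Leaf: the block is not divided any further.
        return [(width, height)]
    kind, children = tree
    if kind == "Q":
        # Quadtree: upper-left, upper-right, lower-left, lower-right.
        sizes = [(width // 2, height // 2)] * 4
    elif kind == "V":
        # Binary tree, vertical division: left half, right half.
        sizes = [(width // 2, height)] * 2
    elif kind == "H":
        # Binary tree, horizontal division: top half, bottom half.
        sizes = [(width, height // 2)] * 2
    else:
        raise ValueError("unknown split kind: %r" % (kind,))
    if len(children) != len(sizes):
        raise ValueError("wrong number of children for split %r" % (kind,))
    leaves = []
    for (w, h), child in zip(sizes, children):
        leaves.extend(leaf_blocks(w, h, child))
    return leaves


# The division of the 128x128 block 10 as described for FIG. 2.
FIG2_TREE = ("Q", [
    ("V", [("V", [None, None]), None]),          # upper left: blocks 11-13
    ("H", [None, None]),                         # upper right: blocks 14, 15
    ("Q", [                                      # lower left: blocks 16-22
        ("V", [None, ("H", [None, None])]),
        None,
        None,
        ("H", [None, None]),
    ]),
    None,                                        # lower right: block 23
])

if __name__ == "__main__":
    blocks = leaf_blocks(128, 128, FIG2_TREE)
    print(len(blocks), "leaf blocks")
    for w, h in blocks:
        print("%dx%d" % (w, h))
```

Traversing the tree yields thirteen leaf blocks whose areas add up to 128×128, in the same order as blocks 11 to 23 in the text.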

Although one block is divided into four or two blocks in FIG. 2 (quadtree or binary tree block division), division is not limited to these. For example, one block may be divided into three blocks (ternary tree block division). Division that includes such ternary tree block division is sometimes called MBT (multi-type tree) division.

[Subtraction Unit]

The subtraction unit 104 subtracts the prediction signal (prediction samples) from the original signal (original samples) for each block produced by the division unit 102. That is, the subtraction unit 104 calculates the prediction error (also called the residual) of the encoding target block (hereinafter referred to as the current block). The subtraction unit 104 then outputs the calculated prediction error to the transform unit 106.

The original signal is the input signal of the encoding device 100 and represents the image of each picture constituting the video (for example, a luma signal and two chroma signals). In the following, a signal representing an image is sometimes also called a sample.

[Transform Unit]

The transform unit 106 transforms the prediction error in the spatial domain into transform coefficients in the frequency domain and outputs the transform coefficients to the quantization unit 108. Specifically, the transform unit 106 applies, for example, a predetermined discrete cosine transform (DCT) or discrete sine transform (DST) to the prediction error in the spatial domain.

The transform unit 106 may also adaptively select a transform type from among a plurality of transform types and transform the prediction error into transform coefficients using the transform basis function corresponding to the selected transform type. Such a transform is sometimes called EMT (explicit multiple core transform) or AMT (adaptive multiple transform).

The plurality of transform types include, for example, DCT-II, DCT-V, DCT-VIII, DST-I, and DST-VII. FIG. 3 is a table showing the transform basis function corresponding to each transform type. In FIG. 3, N denotes the number of input pixels. The transform type may be selected from among these transform types according to, for example, the type of prediction (intra prediction or inter prediction) or the intra prediction mode.
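FIG. 3 is not reproduced in this text. For orientation, the basis functions of these five transform types are commonly written as follows, with i, j = 0, 1, ..., N−1. These are the textbook orthonormal forms and are given here as an assumption; the normalization factors in FIG. 3 may differ.

```latex
\begin{align*}
\text{DCT-II:}   \quad & T_i(j) = \omega_0 \sqrt{\tfrac{2}{N}}\,
    \cos\!\left(\frac{\pi\, i\, (2j+1)}{2N}\right),
    & \omega_0 &= \begin{cases}\sqrt{1/2} & i = 0\\ 1 & i \neq 0\end{cases} \\
\text{DCT-V:}    \quad & T_i(j) = \omega_0\, \omega_1\, \frac{2}{\sqrt{2N-1}}\,
    \cos\!\left(\frac{2\pi\, i\, j}{2N-1}\right),
    & \omega_0 &= \begin{cases}\sqrt{1/2} & i = 0\\ 1 & i \neq 0\end{cases},\quad
      \omega_1 = \begin{cases}\sqrt{1/2} & j = 0\\ 1 & j \neq 0\end{cases} \\
\text{DCT-VIII:} \quad & T_i(j) = \sqrt{\tfrac{4}{2N+1}}\,
    \cos\!\left(\frac{\pi\, (2i+1)(2j+1)}{4N+2}\right) \\
\text{DST-I:}    \quad & T_i(j) = \sqrt{\tfrac{2}{N+1}}\,
    \sin\!\left(\frac{\pi\, (i+1)(j+1)}{N+1}\right) \\
\text{DST-VII:}  \quad & T_i(j) = \sqrt{\tfrac{4}{2N+1}}\,
    \sin\!\left(\frac{\pi\, (2i+1)(j+1)}{2N+1}\right)
\end{align*}
```

In the DCT-V form, the 0th-order basis (i = 0) carries the extra weight ω₁ at j = 0, so its amplitude at the first sample position is smaller than at the others, whereas the 0th-order DCT-II basis is constant.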

Information indicating whether such EMT or AMT is applied (called, for example, an AMT flag) and information indicating the selected transform type are signaled at the CU level. The signaling of this information need not be limited to the CU level and may be at another level (for example, the sequence level, picture level, slice level, tile level, or CTU level).

The transform unit 106 may also re-transform the transform coefficients (the transform result). Such a re-transform is sometimes called AST (adaptive secondary transform) or NSST (non-separable secondary transform). For example, the transform unit 106 re-transforms each sub-block (for example, each 4×4 sub-block) included in the block of transform coefficients corresponding to the intra prediction error. Information indicating whether NSST is applied and information on the transform matrix used in NSST are signaled at the CU level. The signaling of this information need not be limited to the CU level and may be at another level (for example, the sequence level, picture level, slice level, tile level, or CTU level).

[Quantization Unit]

量化部108是對從轉換部106輸出的轉換係數進行量化。具體來說，量化部108是以預定的掃描順序掃描當前區塊的轉換係數，且根據與所掃描的轉換係數相對應之量化參數(QP)來對該轉換係數進行量化。並且，量化部108會將當前區塊之已量化的轉換係數(以下，稱為量化係數)輸出到熵編碼部110及逆量化部112。The quantization unit 108 quantizes the conversion coefficients output from the conversion unit 106. Specifically, the quantization unit 108 scans the conversion coefficients of the current block in a predetermined scanning order, and quantizes the conversion coefficients according to a quantization parameter (QP) corresponding to the scanned conversion coefficients. In addition, the quantization unit 108 outputs the quantized conversion coefficients (hereinafter, referred to as quantization coefficients) of the current block to the entropy encoding unit 110 and the inverse quantization unit 112.

The predetermined scanning order is an order used for quantization and inverse quantization of transform coefficients. For example, the predetermined scanning order is defined as ascending order of frequency (from low frequency to high frequency) or descending order of frequency (from high frequency to low frequency).
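The scan-then-quantize behaviour of the quantization unit can be sketched as follows. This is a minimal illustration, not the method of the disclosure: the mapping from QP to step size (doubling every 6 QP values, as in H.265) and the anti-diagonal scan used to approximate "ascending frequency" are assumptions, and the names `q_step`, `scan_order`, `quantize`, and `dequantize` are hypothetical.

```python
# Sketch only: the QP-to-step mapping and the scan rule below are assumptions.

def q_step(qp):
    """Quantization step; doubles every 6 QP values (H.265-style, assumed)."""
    return 2.0 ** ((qp - 4) / 6.0)


def scan_order(size):
    """(row, col) positions ordered from low to high frequency.

    Frequency is approximated by row + col (an anti-diagonal scan).
    """
    positions = [(r, c) for r in range(size) for c in range(size)]
    return sorted(positions, key=lambda p: (p[0] + p[1], p[0]))


def quantize(block, qp):
    """Scan the block and return the quantized levels as a 1-D list."""
    step = q_step(qp)
    return [round(block[r][c] / step) for r, c in scan_order(len(block))]


def dequantize(levels, size, qp):
    """Rebuild a size x size block of coefficients from scanned levels."""
    step = q_step(qp)
    block = [[0.0] * size for _ in range(size)]
    for level, (r, c) in zip(levels, scan_order(size)):
        block[r][c] = level * step
    return block
```

Under these assumptions a larger QP gives a larger step, so the worst-case reconstruction error (half a step per coefficient) grows with QP.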

The quantization parameter is a parameter that defines the quantization step (quantization width). For example, as the value of the quantization parameter increases, the quantization step also increases. In other words, as the value of the quantization parameter increases, the quantization error increases.

[Entropy encoding unit]

The entropy encoding unit 110 generates an encoded signal (encoded bitstream) by variable-length encoding the quantized coefficients input from the quantization unit 108. Specifically, the entropy encoding unit 110, for example, binarizes the quantized coefficients and arithmetically encodes the resulting binary signal.

[Inverse quantization unit]

The inverse quantization unit 112 inverse-quantizes the quantized coefficients input from the quantization unit 108. Specifically, the inverse quantization unit 112 inverse-quantizes the quantized coefficients of the current block in the predetermined scanning order. The inverse quantization unit 112 then outputs the inverse-quantized transform coefficients of the current block to the inverse transform unit 114.

[Inverse transform unit]

The inverse transform unit 114 restores the prediction error by inverse-transforming the transform coefficients input from the inverse quantization unit 112. Specifically, the inverse transform unit 114 restores the prediction error of the current block by applying to the transform coefficients an inverse transform corresponding to the transform performed by the transform unit 106. The inverse transform unit 114 then outputs the restored prediction error to the addition unit 116.

Note that because information is lost through quantization, the restored prediction error does not match the prediction error calculated by the subtraction unit 104. That is, the restored prediction error includes a quantization error.

[Addition unit]

The addition unit 116 reconstructs the current block by adding the prediction error input from the inverse transform unit 114 and the prediction samples input from the prediction control unit 128. The addition unit 116 then outputs the reconstructed block to the block memory 118 and the loop filter unit 120. The reconstructed block is sometimes called a local decoded block.

[Block memory]

The block memory 118 is a storage unit for storing blocks that are referred to in intra prediction and that belong to the picture to be encoded (hereinafter referred to as the current picture). Specifically, the block memory 118 stores the reconstructed blocks output from the addition unit 116.

[Loop filter unit]

The loop filter unit 120 applies loop filtering to the blocks reconstructed by the addition unit 116 and outputs the filtered reconstructed blocks to the frame memory 122. A loop filter is a filter used inside the coding loop (an in-loop filter) and includes, for example, a deblocking filter (DF), a sample adaptive offset (SAO), and an adaptive loop filter (ALF).

In ALF, a least-squares error filter for removing coding distortion is applied. For example, for each 2×2 sub-block in the current block, one filter is selected from a plurality of filters according to the direction and activity of the local gradient, and the selected filter is applied.

Specifically, each sub-block (for example, each 2×2 sub-block) is first classified into one of a plurality of classes (for example, 15 or 25 classes). The classification is based on the direction and activity of the gradient. For example, a classification value C (for example, C = 5D + A) is calculated from the direction value D of the gradient (for example, 0 to 2 or 0 to 4) and the activity value A of the gradient (for example, 0 to 4). The sub-block is then classified into one of the plurality of classes (for example, 15 or 25 classes) according to the classification value C.

The direction value D of the gradient is derived, for example, by comparing the gradients in a plurality of directions (for example, the horizontal, vertical, and two diagonal directions). The activity value A of the gradient is derived, for example, by adding the gradients in a plurality of directions and quantizing the sum.

Based on the result of this classification, the filter for the sub-block is determined from among the plurality of filters.
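The classification steps above can be sketched as follows. Only the relation C = 5D + A and the value ranges come from the text; the dominance threshold `ratio`, the activity quantization `step`, the meaning assigned to each value of D, and all function names are hypothetical choices made for illustration.

```python
# Toy classification; thresholds and the activity step are hypothetical.

def direction_value(grads, ratio=2.0):
    """Return D in 0..4 from gradient sums for four directions.

    grads = (horizontal, vertical, diagonal_0, diagonal_1).
    D = 0 means no dominant direction; D = i + 1 means direction i dominates.
    """
    strongest = max(range(4), key=lambda i: grads[i])
    others = [g for i, g in enumerate(grads) if i != strongest]
    if grads[strongest] > 0 and grads[strongest] >= ratio * max(others):
        return strongest + 1
    return 0


def activity_value(grads, step=8):
    """Return A in 0..4 by summing the gradients and quantizing the sum."""
    return min(4, sum(grads) // step)


def class_index(d, a):
    """Classification value C = 5D + A."""
    return 5 * d + a


def select_filter(filters, grads):
    """Pick the filter whose index equals the class of the sub-block."""
    d = direction_value(grads)
    a = activity_value(grads)
    return filters[class_index(d, a)]
```

With D in 0 to 4 and A in 0 to 4 the index covers 25 classes; restricting D to 0 to 2 gives the 15-class variant.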

As the shape of a filter used in ALF, for example, a circularly symmetric shape is used. FIGS. 4A to 4C show several examples of filter shapes used in ALF. FIG. 4A shows a 5×5 diamond-shaped filter, FIG. 4B shows a 7×7 diamond-shaped filter, and FIG. 4C shows a 9×9 diamond-shaped filter. Information indicating the filter shape is signaled at the picture level. The signaling of the information indicating the filter shape need not be limited to the picture level and may be performed at another level (for example, the sequence level, slice level, tile level, CTU level, or CU level).
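As a small illustration of the diamond shapes named above, the following sketch builds a tap mask for an odd filter size. The rule used here (a tap is present when its city-block distance from the centre does not exceed half the size) is an assumption about how the diamond is drawn, and `diamond_mask` and `tap_count` are hypothetical helper names.

```python
def diamond_mask(size):
    """Return a size x size 0/1 mask with a diamond of taps around the centre."""
    if size % 2 == 0:
        raise ValueError("size must be odd")
    r = size // 2
    return [
        [1 if abs(y - r) + abs(x - r) <= r else 0 for x in range(size)]
        for y in range(size)
    ]


def tap_count(size):
    """Number of filter taps inside the diamond."""
    return sum(sum(row) for row in diamond_mask(size))


if __name__ == "__main__":
    for n in (5, 7, 9):
        print(n, tap_count(n))
        for row in diamond_mask(n):
            print("".join("#" if v else "." for v in row))
```

Under this rule the 5×5, 7×7, and 9×9 diamonds contain 13, 25, and 41 taps respectively.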

Whether ALF is on or off is determined, for example, at the picture level or the CU level. For example, for luma, whether to apply ALF is determined at the CU level, and for chroma, whether to apply ALF is determined at the picture level. Information indicating whether ALF is on or off is signaled at the picture level or the CU level. The signaling of this information need not be limited to the picture level or the CU level and may be performed at another level (for example, the sequence level, slice level, tile level, or CTU level).

The coefficient sets of the selectable filters (for example, up to 15 or 25 filters) are signaled at the picture level. The signaling of the coefficient sets need not be limited to the picture level and may be performed at another level (for example, the sequence level, slice level, tile level, CTU level, CU level, or sub-block level).

[Frame memory]

The frame memory 122 is a storage unit for storing reference pictures used in inter prediction and is sometimes called a frame buffer. Specifically, the frame memory 122 stores the reconstructed blocks filtered by the loop filter unit 120.

[Intra prediction unit]

The intra prediction unit 124 generates a prediction signal (intra prediction signal) by performing intra prediction (also called intra-picture prediction) of the current block with reference to blocks in the current picture stored in the block memory 118. Specifically, the intra prediction unit 124 generates the intra prediction signal by performing intra prediction with reference to samples (for example, luma values and chroma values) of blocks adjacent to the current block, and outputs the intra prediction signal to the prediction control unit 128.

For example, the intra prediction unit 124 performs intra prediction using one of a plurality of predefined intra prediction modes. The plurality of intra prediction modes include one or more non-directional prediction modes and a plurality of directional prediction modes.

The one or more non-directional prediction modes include, for example, the Planar prediction mode and the DC prediction mode defined in the H.265/HEVC (High-Efficiency Video Coding) standard (Non-Patent Document 1).

The plurality of directional prediction modes include, for example, the prediction modes in 33 directions defined in the H.265/HEVC standard. In addition to the 33 directions, the plurality of directional prediction modes may further include prediction modes in 32 additional directions (65 directional prediction modes in total). FIG. 5 shows the 67 intra prediction modes used in intra prediction (2 non-directional prediction modes and 65 directional prediction modes). The solid arrows indicate the 33 directions defined in the H.265/HEVC standard, and the dashed arrows indicate the 32 additional directions.

A luma block may also be referred to in the intra prediction of a chroma block. That is, the chroma component of the current block may be predicted from the luma component of the current block. Such intra prediction is sometimes called CCLM (cross-component linear model) prediction. An intra prediction mode for a chroma block that refers to a luma block in this way (for example, called a CCLM mode) may be added as one of the intra prediction modes for chroma blocks.
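One way to picture a cross-component linear model is the sketch below, which predicts chroma as `alpha * luma + beta`. The text does not say how the model parameters are obtained; fitting them by least squares on already reconstructed neighbouring samples is an assumption made here, chroma subsampling is ignored, and the names `fit_linear_model` and `predict_chroma` are hypothetical.

```python
def fit_linear_model(luma_neighbors, chroma_neighbors):
    """Least-squares fit of chroma = alpha * luma + beta on neighbouring samples."""
    n = len(luma_neighbors)
    sum_l = sum(luma_neighbors)
    sum_c = sum(chroma_neighbors)
    sum_ll = sum(l * l for l in luma_neighbors)
    sum_lc = sum(l * c for l, c in zip(luma_neighbors, chroma_neighbors))
    denom = n * sum_ll - sum_l * sum_l
    if denom == 0:
        # Flat luma neighbourhood: fall back to the mean chroma value.
        return 0.0, sum_c / n
    alpha = (n * sum_lc - sum_l * sum_c) / denom
    beta = (sum_c - alpha * sum_l) / n
    return alpha, beta


def predict_chroma(luma_block, alpha, beta):
    """Apply the linear model to every reconstructed luma sample of the block."""
    return [[alpha * l + beta for l in row] for row in luma_block]
```

For instance, neighbours that follow chroma = 2 × luma + 5 exactly yield alpha = 2 and beta = 5, so a luma sample of 50 is predicted as chroma 105.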

The intra prediction unit 124 may also correct the intra-predicted pixel values based on the gradient of reference pixels in the horizontal and vertical directions. Intra prediction accompanied by such correction is sometimes called PDPC (position dependent intra prediction combination). Information indicating whether PDPC is applied (for example, called a PDPC flag) is signaled, for example, at the CU level. The signaling of this information need not be limited to the CU level and may be performed at another level (for example, the sequence level, picture level, slice level, tile level, or CTU level).

[Inter prediction unit]

The inter prediction unit 126 generates a prediction signal (inter prediction signal) by performing inter prediction (also called inter-picture prediction) of the current block with reference to a reference picture that is stored in the frame memory 122 and that differs from the current picture. Inter prediction is performed in units of the current block or of sub-blocks (for example, 4×4 blocks) within the current block. For example, the inter prediction unit 126 performs a motion search (motion estimation) in the reference picture for the current block or sub-block. The inter prediction unit 126 then performs motion compensation using the motion information (for example, a motion vector) obtained by the motion search, thereby generating the inter prediction signal of the current block or sub-block. The inter prediction unit 126 outputs the generated inter prediction signal to the prediction control unit 128.

The motion information used for motion compensation is signaled. A motion vector predictor may be used in signaling the motion vector. That is, the difference between the motion vector and the motion vector predictor may be signaled.
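Signaling the difference rather than the motion vector itself can be sketched as below. The tuple representation of a motion vector and the function names are hypothetical; the point is only that the decoder recovers the motion vector by adding the signaled difference to the same predictor.

```python
def motion_vector_difference(mv, mvp):
    """Encoder side: difference between the motion vector and its predictor."""
    return (mv[0] - mvp[0], mv[1] - mvp[1])


def reconstruct_motion_vector(mvd, mvp):
    """Decoder side: add the signaled difference back to the predictor."""
    return (mvd[0] + mvp[0], mvd[1] + mvp[1])


if __name__ == "__main__":
    mv, mvp = (13, -7), (12, -4)
    mvd = motion_vector_difference(mv, mvp)
    print(mvd, reconstruct_motion_vector(mvd, mvp))
```

When the predictor is close to the actual motion vector, the difference is small and typically cheaper to encode.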

The inter prediction signal may be generated using not only the motion information of the current block obtained by the motion search but also the motion information of adjacent blocks. Specifically, a prediction signal based on the motion information obtained by the motion search and a prediction signal based on the motion information of an adjacent block may be weighted and added to generate the inter prediction signal in units of sub-blocks within the current block. Such inter prediction (motion compensation) is sometimes called OBMC (overlapped block motion compensation).
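The weighted addition of the two prediction signals can be sketched as follows. The text gives no weights, so the default weight of 0.25 for the neighbour-based prediction is purely hypothetical, as is the function name `blend_predictions`.

```python
def blend_predictions(pred_own, pred_neighbor, w_neighbor=0.25):
    """Weighted sum of two sub-block prediction signals.

    pred_own: prediction from the current block's own motion information.
    pred_neighbor: prediction from an adjacent block's motion information.
    w_neighbor: hypothetical weight given to the neighbour-based prediction.
    """
    w_own = 1.0 - w_neighbor
    return [
        [w_own * a + w_neighbor * b for a, b in zip(row_own, row_nb)]
        for row_own, row_nb in zip(pred_own, pred_neighbor)
    ]
```

For example, blending own-prediction samples of 100 with neighbour-prediction samples of 60 and 80 at the default weight gives 90 and 95.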

In this OBMC mode, information indicating the size of the sub-blocks used for OBMC (for example, called the OBMC block size) is signaled at the sequence level. Information indicating whether the OBMC mode is applied (for example, called an OBMC flag) is signaled at the CU level. The signaling of this information need not be limited to the sequence level and the CU level and may be performed at another level (for example, the picture level, slice level, tile level, CTU level, or sub-block level).

The motion information may also be derived on the decoding device side without being signaled. For example, the merge mode defined in the H.265/HEVC standard may be used. Alternatively, for example, the motion information may be derived by performing a motion search on the decoding device side. In that case, the motion search is performed without using the pixel values of the current block.

The mode in which a motion search is performed on the decoding device side is described here. This mode is sometimes called the PMMVD (pattern matched motion vector derivation) mode or the FRUC (frame rate up-conversion) mode.

First, a list of a plurality of candidates, each having a motion vector predictor, is generated with reference to the motion vectors of already encoded blocks that are spatially or temporally adjacent to the current block (the list may be shared with the merge list). An evaluation value is then calculated for each candidate in the candidate list, and one candidate is selected based on the evaluation values.

A motion vector for the current block is then derived based on the motion vector of the selected candidate. Specifically, for example, the motion vector of the selected candidate is used as is as the motion vector for the current block. Alternatively, for example, the motion vector for the current block may be derived by performing pattern matching in the region surrounding the position in the reference picture that corresponds to the motion vector of the selected candidate.

The evaluation value is calculated by pattern matching between the region in the reference picture corresponding to the motion vector and a predetermined region.
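The candidate selection described above can be sketched as follows. The text does not specify the matching cost; the sum of absolute differences (SAD) used here is an assumption, as are the function names and the simplification that every candidate region lies inside the reference picture. Here `target` stands for the predetermined region.

```python
def region(picture, left, top, width, height):
    """Extract a width x height region whose top-left sample is (left, top)."""
    return [row[left:left + width] for row in picture[top:top + height]]


def sad(a, b):
    """Sum of absolute differences between two equally sized regions."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def select_candidate(candidates, ref_picture, target, x, y):
    """Return (best_mv, cost) over candidate motion vectors (dx, dy).

    The evaluation value of a candidate is the SAD between `target` and the
    region of `ref_picture` displaced from (x, y) by that candidate.
    Candidates are assumed to stay inside the reference picture.
    """
    height, width = len(target), len(target[0])
    best_mv, best_cost = None, None
    for dx, dy in candidates:
        cost = sad(region(ref_picture, x + dx, y + dy, width, height), target)
        if best_cost is None or cost < best_cost:
            best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

The selected motion vector could then be used as is or serve as the starting point of a local refinement search around the position it indicates.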

Either first pattern matching or second pattern matching is used as the pattern matching. The first pattern matching and the second pattern matching are sometimes called bilateral matching and template matching, respectively.

In the first pattern matching, pattern matching is performed between two blocks that lie in two different reference pictures and along the motion trajectory of the current block. Therefore, in the first pattern matching, a region in another reference picture along the motion trajectory of the current block is used as the predetermined region for calculating the evaluation value of a candidate.

FIG. 6 illustrates pattern matching (bilateral matching) between two blocks along a motion trajectory. As shown in FIG. 6, in the first pattern matching, two motion vectors (MV0, MV1) are derived by searching, among pairs of blocks that lie in two different reference pictures (Ref0, Ref1) and along the motion trajectory of the current block (Cur block), for the best-matching pair.

Under the assumption of a continuous motion trajectory, the motion vectors (MV0, MV1) pointing to the two reference blocks are proportional to the temporal distances (TD0, TD1) between the current picture (Cur Pic) and the two reference pictures (Ref0, Ref1). For example, when the current picture is temporally located between the two reference pictures and the temporal distances from the current picture to the two reference pictures are equal, the first pattern matching derives mirror-symmetric bidirectional motion vectors.
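The proportionality can be illustrated with a short sketch. It assumes that the current picture lies between Ref0 and Ref1, that TD0 and TD1 are positive distances, and that MV1 therefore points in the opposite direction to MV0; the function name `paired_motion_vector` is hypothetical.

```python
from fractions import Fraction


def paired_motion_vector(mv0, td0, td1):
    """Derive MV1 from MV0 on a straight motion trajectory.

    Assumes Ref0 and Ref1 are on opposite sides of the current picture,
    so MV1 = -MV0 * TD1 / TD0 (exact rational arithmetic, no rounding).
    """
    scale = Fraction(-td1, td0)
    return (mv0[0] * scale, mv0[1] * scale)


if __name__ == "__main__":
    print(paired_motion_vector((4, -2), 1, 1))  # equal distances: mirror image
    print(paired_motion_vector((4, -2), 2, 1))  # Ref1 is closer than Ref0
```

With equal temporal distances the result is the mirror image of MV0, which matches the mirror-symmetric case described above.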

In the second pattern matching, pattern matching is performed between a template in the current picture (blocks adjacent to the current block in the current picture, for example the upper and/or left adjacent blocks) and a block in the reference picture. Therefore, in the second pattern matching, a block adjacent to the current block in the current picture is used as the predetermined region for calculating the evaluation value of a candidate.

FIG. 7 illustrates pattern matching (template matching) between a template in the current picture and a block in the reference picture. As shown in FIG. 7, in the second pattern matching, the motion vector of the current block is derived by searching the reference picture (Ref0) for the block that best matches the blocks adjacent to the current block (Cur block) in the current picture (Cur Pic).
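A minimal sketch of template matching is given below. The template is taken as the single row above and the single column to the left of the block, the cost is a sum of absolute differences, and the search is an exhaustive scan over a small window; all three choices, and the function names, are assumptions made for illustration.

```python
def template_samples(picture, x, y, size):
    """Samples directly above and directly left of the size x size block at (x, y)."""
    above = [picture[y - 1][x + i] for i in range(size)]
    left = [picture[y + j][x - 1] for j in range(size)]
    return above + left


def template_match(cur, ref, x, y, size, search_range):
    """Return the (dx, dy) whose template in `ref` best matches the one in `cur`.

    Requires x >= 1 and y >= 1 so the current template exists.
    """
    target = template_samples(cur, x, y, size)
    best_mv, best_cost = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = x + dx, y + dy
            # Skip positions whose block or template would leave the picture.
            if rx < 1 or ry < 1 or rx + size > len(ref[0]) or ry + size > len(ref):
                continue
            cand = template_samples(ref, rx, ry, size)
            cost = sum(abs(a - b) for a, b in zip(target, cand))
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv
```

Because only already reconstructed samples around the current block are used, the same search can be repeated on the decoding device side without access to the pixel values of the current block itself.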

Information indicating whether the FRUC mode is applied (for example, called a FRUC flag) is signaled at the CU level. When the FRUC mode is applied (for example, when the FRUC flag is true), information indicating the pattern matching method (first pattern matching or second pattern matching) (for example, called a FRUC mode flag) is signaled at the CU level. The signaling of this information need not be limited to the CU level and may be performed at another level (for example, the sequence level, picture level, slice level, tile level, CTU level, or sub-block level).

The motion information may also be derived on the decoding device side by a method different from the motion search. For example, a correction amount for the motion vector may be calculated for each pixel using surrounding pixel values, based on a model that assumes uniform linear motion.

The mode in which a motion vector is derived based on a model that assumes uniform linear motion is described here. This mode is sometimes called the BIO (bi-directional optical flow) mode.

FIG. 8 illustrates the model that assumes uniform linear motion. In FIG. 8, (v_x, v_y) denotes the velocity vector, and τ₀ and τ₁ denote the temporal distances between the current picture (Cur Pic) and the two reference pictures (Ref₀, Ref₁), respectively. (MVx₀, MVy₀) denotes the motion vector corresponding to the reference picture Ref₀, and (MVx₁, MVy₁) denotes the motion vector corresponding to the reference picture Ref₁.

Under the assumption of uniform linear motion with velocity vector (v_x, v_y), (MVx₀, MVy₀) and (MVx₁, MVy₁) are expressed as (v_x τ₀, v_y τ₀) and (-v_x τ₁, -v_y τ₁), respectively, and the following optical flow equation (1) holds.

[Mathematical formula 1]

$$\frac{\partial I^{(k)}}{\partial t} + v_x \frac{\partial I^{(k)}}{\partial x} + v_y \frac{\partial I^{(k)}}{\partial y} = 0 \qquad (1)$$

Here, I^(k) denotes the luma value of the motion-compensated reference image k (k = 0, 1). The optical flow equation states that the sum of the following three terms is zero: (i) the temporal derivative of the luma value, (ii) the product of the horizontal velocity and the horizontal component of the spatial gradient of the reference image, and (iii) the product of the vertical velocity and the vertical component of the spatial gradient of the reference image. By combining this optical flow equation with Hermite interpolation, a block-level motion vector obtained from the merge list or the like is corrected on a per-pixel basis.

The motion vector may also be derived on the decoding device side by a method different from the derivation based on the model that assumes uniform linear motion. For example, motion vectors may be derived in units of sub-blocks based on the motion vectors of a plurality of adjacent blocks.

The mode in which motion vectors are derived in units of sub-blocks based on the motion vectors of a plurality of adjacent blocks is described here. This mode is sometimes called the affine motion compensation prediction mode.

FIG. 9 illustrates the derivation of motion vectors in units of sub-blocks based on the motion vectors of a plurality of adjacent blocks. In FIG. 9, the current block contains sixteen 4×4 sub-blocks. Here, the motion vector v₀ of the upper-left corner control point of the current block is derived based on the motion vectors of adjacent blocks, and the motion vector v₁ of the upper-right corner control point of the current block is derived based on the motion vectors of adjacent sub-blocks. The motion vector (v_x, v_y) of each sub-block in the current block is then derived from the two motion vectors v₀ and v₁ by the following equation (2).

[Mathematical formula 2]

Here, x and y denote the horizontal position and the vertical position of the sub-block, respectively, and w denotes a predetermined weighting coefficient.
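Equation (2) itself is not reproduced in this text. As an illustration only, the sketch below uses the widely known four-parameter affine model driven by two control-point motion vectors, which may or may not coincide with equation (2). Taking w equal to the block width, evaluating the model at each sub-block centre, and the name `affine_subblock_mvs` are all assumptions.

```python
def affine_subblock_mvs(v0, v1, block_size=16, sub_size=4, w=None):
    """Per-sub-block motion vectors from two control-point motion vectors.

    v0: motion vector of the upper-left corner control point, (x, y).
    v1: motion vector of the upper-right corner control point, (x, y).
    w:  weighting coefficient; assumed here to default to the block width.
    Returns a dict mapping the sub-block top-left position to its vector.
    """
    if w is None:
        w = block_size
    a = (v1[0] - v0[0]) / w  # scaling and rotation term shared by both components
    b = (v1[1] - v0[1]) / w
    mvs = {}
    for y in range(0, block_size, sub_size):
        for x in range(0, block_size, sub_size):
            cx = x + sub_size / 2  # sub-block centre (assumed evaluation point)
            cy = y + sub_size / 2
            mvs[(x, y)] = (a * cx - b * cy + v0[0], b * cx + a * cy + v0[1])
    return mvs
```

When v0 equals v1 the model reduces to pure translation and every sub-block receives the same motion vector; differing control-point vectors produce zoom or rotation across the 16 sub-blocks.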

In this affine motion compensation prediction mode, several different methods may be provided for deriving the motion vectors of the upper-left and upper-right corner control points. Information indicating the affine motion compensation prediction mode (for example, called an affine flag) is signaled at the CU level. The signaling of this information need not be limited to the CU level and may be performed at another level (for example, the sequence level, picture level, slice level, tile level, CTU level, or sub-block level).

[Prediction control unit]

The prediction control unit 128 selects either the intra prediction signal or the inter prediction signal and outputs the selected signal as the prediction signal to the subtraction unit 104 and the addition unit 116.

[Outline of the decoding device]

Next, an outline of a decoding device capable of decoding the encoded signal (encoded bitstream) output from the encoding device 100 described above is given. FIG. 10 is a block diagram showing the functional configuration of the decoding device 200 according to Embodiment 1. The decoding device 200 is a moving picture/picture decoding device that decodes a moving picture or picture in units of blocks.

As shown in FIG. 10, the decoding device 200 includes an entropy decoding unit 202, an inverse quantization unit 204, an inverse transform unit 206, an addition unit 208, a block memory 210, a loop filter unit 212, a frame memory 214, an intra prediction unit 216, an inter prediction unit 218, and a prediction control unit 220.

The decoding device 200 is implemented by, for example, a general-purpose processor and a memory. In this case, when the processor executes a software program stored in the memory, the processor functions as the entropy decoding unit 202, the inverse quantization unit 204, the inverse transform unit 206, the addition unit 208, the loop filter unit 212, the intra prediction unit 216, the inter prediction unit 218, and the prediction control unit 220. Alternatively, the decoding device 200 may be implemented as one or more dedicated electronic circuits corresponding to the entropy decoding unit 202, the inverse quantization unit 204, the inverse transform unit 206, the addition unit 208, the loop filter unit 212, the intra prediction unit 216, the inter prediction unit 218, and the prediction control unit 220.

Each component included in the decoding device 200 is described below.

[Entropy decoding unit]

The entropy decoding unit 202 entropy-decodes the encoded bitstream. Specifically, the entropy decoding unit 202, for example, arithmetically decodes the encoded bitstream into a binary signal. The entropy decoding unit 202 then debinarizes the binary signal. In this way, the entropy decoding unit 202 outputs quantized coefficients to the inverse quantization unit 204 in units of blocks.

[Inverse quantization unit]

The inverse quantization unit 204 inverse-quantizes the quantized coefficients of the block to be decoded (hereinafter referred to as the current block) input from the entropy decoding unit 202. Specifically, the inverse quantization unit 204 inverse-quantizes each quantized coefficient of the current block based on the quantization parameter corresponding to that quantized coefficient. The inverse quantization unit 204 then outputs the inverse-quantized coefficients of the current block (that is, the transform coefficients) to the inverse transform unit 206.

[Inverse transform unit]

The inverse transform unit 206 restores the prediction error by inverse-transforming the transform coefficients input from the inverse quantization unit 204.

For example, when information decoded from the encoded bitstream indicates that EMT or AMT is applied (for example, the AMT flag is true), the inverse transform unit 206 inverse-transforms the transform coefficients of the current block based on the decoded information indicating the transform type.

Also, for example, when information decoded from the encoded bitstream indicates that NSST is applied, the inverse transform unit 206 applies an inverse re-transform to the transform coefficients.

[Addition unit]

加法部208會對自逆轉換部206輸入的預測誤差、與自預測控制部220輸入的預測樣本進行加法運算，藉此再構成當前區塊。而且，加法部208會將再構成的區塊輸出到區塊記憶體210及迴路濾波部212。 [區塊記憶體]The addition unit 208 reconstructs the current block by adding the prediction error input from the inverse conversion unit 206 to the prediction samples input from the prediction control unit 220. The addition unit 208 then outputs the reconstructed block to the block memory 210 and the loop filtering unit 212. [Block memory]

區塊記憶體210是用於保存為在框內預測中參照的區塊且為解碼對象圖片(以下，稱為當前圖片)內的區塊的儲存部。具體來說，區塊記憶體210會保存從加法部208輸出的再構成區塊。 [迴路濾波部]The block memory 210 is a storage unit for storing blocks that are referenced in intra-frame prediction and that belong to the picture to be decoded (hereinafter referred to as the current picture). Specifically, the block memory 210 stores the reconstructed blocks output from the addition unit 208. [Loop filtering unit]

迴路濾波部212會對藉由加法部208再構成的區塊施行迴路濾波，且將已進行濾波的再構成區塊輸出到框記憶體214及顯示裝置等。The loop filtering unit 212 performs loop filtering on the blocks reconstructed by the adding unit 208, and outputs the filtered reconstructed blocks to the frame memory 214, the display device, and the like.

當顯示從編碼位元流中解讀出的ALF之開啟/關閉的資訊顯示的是ALF為開啟的情況下，可根據局部的梯度之方向及活動性而從複數個濾波器之中選擇1個濾波器，且將所選擇的濾波器適用於再構成區塊。 [框記憶體]When the ALF on/off information decoded from the encoded bit stream indicates that ALF is on, one filter is selected from a plurality of filters according to the direction and activity of the local gradient, and the selected filter is applied to the reconstructed block. [Frame memory]

框記憶體214是用於保存框間預測所用的參照圖片之儲存部，有時也被稱為框緩衝器(frame buffer)。具體來說，框記憶體214會保存已藉由迴路濾波部212而被濾波的再構成區塊。 [框內預測部]The frame memory 214 is a storage unit for storing reference pictures used for inter-frame prediction, and is sometimes referred to as a frame buffer. Specifically, the frame memory 214 stores the reconstructed blocks that have been filtered by the loop filtering unit 212. [In-frame prediction section]

框內預測部216是根據已從編碼位元流中解讀出的框內預測模式，並參照保存於區塊記憶體210的當前圖片內之區塊來進行框內預測，藉此生成預測訊號(框內預測訊號)。具體來說，框內預測部216是參照與當前區塊相鄰的區塊之樣本(例如亮度值、色差值)來進行框內預測，藉此生成框內預測訊號，並將框內預測訊號輸出至預測控制部220。The intra-frame prediction unit 216 generates a prediction signal (intra-frame prediction signal) by performing intra-frame prediction based on the intra-frame prediction mode decoded from the encoded bit stream, with reference to blocks in the current picture stored in the block memory 210. Specifically, the intra-frame prediction unit 216 generates the intra-frame prediction signal by performing intra-frame prediction with reference to samples (for example, luminance values and color difference values) of blocks adjacent to the current block, and outputs the intra-frame prediction signal to the prediction control unit 220.

再者，在色差區塊的框內預測中選擇參照亮度區塊的框內預測模式之情況下，框內預測部216也可以根據當前區塊的亮度成分，來預測當前區塊的色差成分。When the intra-frame prediction mode that refers to the luminance block is selected in the intra-frame prediction of the color difference block, the intra-frame prediction unit 216 may predict the color difference component of the current block based on the luminance component of the current block.

又，在已從編碼位元流中解讀出的資訊顯示的是適用PDPC的情況下，框內預測部216會根據水平/垂直方向的參照像素之梯度來補正框內預測後的像素值。 [框間預測部]When the information decoded from the coded bit stream indicates that PDPC is applicable, the intra-frame prediction unit 216 corrects the pixel values after intra-frame prediction based on the gradient of the reference pixels in the horizontal / vertical direction. [Inter-frame prediction section]

框間預測部218是參照保存於框記憶體214的參照圖片，來預測當前區塊。預測是以當前區塊或當前區塊內的子區塊(例如4x4區塊)之單位來進行。例如，框間預測部218會利用從編碼位元流中解讀出的動態資訊(例如移動向量)來進行動態補償，藉此生成當前區塊或子區塊的框間預測訊號，並將框間預測訊號輸出至預測控制部220。The inter-frame prediction unit 218 predicts the current block with reference to a reference picture stored in the frame memory 214. The prediction is performed in units of the current block or of a sub-block (for example, a 4x4 block) within the current block. For example, the inter-frame prediction unit 218 performs motion compensation using motion information (for example, a motion vector) decoded from the encoded bit stream, thereby generating an inter-frame prediction signal for the current block or sub-block, and outputs the inter-frame prediction signal to the prediction control unit 220.
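As a rough illustration of the motion compensation described above, the following sketch copies a block from a reference picture at a position displaced by the motion vector. It assumes a whole-sample motion vector and simply clamps coordinates at the picture border; the fractional-sample interpolation that a practical codec needs is omitted. The function name `motion_compensate` and its parameters are hypothetical and do not come from the specification.

```python
def motion_compensate(ref, x0, y0, w, h, mv):
    """Return the w x h prediction block for the block whose top-left is (x0, y0).

    ref: reference picture as a list of rows of samples.
    mv:  (dx, dy) motion vector in whole samples (an assumption of this sketch).
    """
    height, width = len(ref), len(ref[0])
    dx, dy = mv
    pred = []
    for y in range(y0 + dy, y0 + dy + h):
        yy = min(max(y, 0), height - 1)      # clamp rows at the picture border
        row = []
        for x in range(x0 + dx, x0 + dx + w):
            xx = min(max(x, 0), width - 1)   # clamp columns at the picture border
            row.append(ref[yy][xx])
        pred.append(row)
    return pred


# Tiny 4x4 reference picture whose sample value encodes its position.
ref = [[10 * r + c for c in range(4)] for r in range(4)]
print(motion_compensate(ref, 1, 1, 2, 2, (1, -1)))
```

The same function can be called per sub-block (for example, per 4x4 block) when motion information is available at sub-block granularity.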

再者，在顯示從編碼位元流中解讀出的資訊為適用OBMC模式的情況下，框間預測部218會使用的不只有藉由動態搜尋所得到的當前區塊之動態資訊，還有相鄰區塊的動態資訊，以生成框間預測訊號。Furthermore, when information decoded from the encoded bit stream indicates that the OBMC mode is applied, the inter-frame prediction unit 218 generates the inter-frame prediction signal using not only the motion information of the current block obtained by motion search but also the motion information of adjacent blocks.

又，顯示從編碼位元流中解讀出的資訊適用FRUC模式的情況下，框間預測部218會依照從編碼流解讀出的型樣匹配之方法(雙向匹配或模板匹配)來進行動態搜尋，藉此導出動態資訊。並且，框間預測部218會使用已導出的動態資訊來進行動態補償。Also, when information decoded from the encoded bit stream indicates that the FRUC mode is applied, the inter-frame prediction unit 218 derives motion information by performing a motion search according to the pattern matching method (bilateral matching or template matching) decoded from the encoded stream. The inter-frame prediction unit 218 then performs motion compensation using the derived motion information.

又，在適用BIO模式的情況下，框間預測部218會根據假設了等速直線運動的模型來導出移動向量。又，在顯示從編碼位元流中解讀出的資訊適用仿射動態補償預測模式的情況下，框間預測部218會根據複數個相鄰區塊的移動向量，以子區塊單位來導出移動向量。 [預測控制部]When the BIO mode is applied, the inter-frame prediction unit 218 derives a motion vector based on a model that assumes uniform linear motion. Also, when information decoded from the encoded bit stream indicates that the affine motion compensation prediction mode is applied, the inter-frame prediction unit 218 derives a motion vector for each sub-block based on the motion vectors of a plurality of adjacent blocks. [Prediction control unit]

預測控制部220會選擇框內預測訊號及框間預測訊號的任一個，且將所選擇的訊號作為預測訊號而輸出至加法部208。 [編碼裝置的轉換部之內部構成]The prediction control unit 220 selects one of the intra-frame prediction signal and the inter-frame prediction signal, and outputs the selected signal to the addition unit 208 as a prediction signal. [Internal Structure of Conversion Unit of Encoding Device]

接著，參照圖11來說明編碼裝置100的轉換部106之內部構成的一例。Next, an example of the internal configuration of the conversion unit 106 of the encoding device 100 will be described with reference to FIG. 11.

圖11是顯示實施形態1之編碼裝置100的轉換部106之內部構成的方塊圖。轉換部106具備有框內/框間判定部1061、基底選擇部1062、與頻率轉換部1063。FIG. 11 is a block diagram showing the internal configuration of the conversion unit 106 of the encoding device 100 according to the first embodiment. The conversion unit 106 includes an in-frame / inter-frame determination unit 1061, a base selection unit 1062, and a frequency conversion unit 1063.

框內/框間判定部1061會判定為了編碼對象區塊所使用的是框內預測及框間預測的哪一個。例如，框內/框間判定部1061是根據與對輸入圖像訊號、壓縮圖像進行局部解碼所得到的圖像訊號之比較結果，來判定是使用框內預測及框間預測的哪一個。The in-frame / inter-frame determination unit 1061 determines which of intra-frame prediction and inter-frame prediction is used for the coding target block. For example, the in-frame / inter-frame determination unit 1061 makes this determination based on the result of comparing the input image signal with an image signal obtained by locally decoding the compressed image.

基底選擇部1062是根據框內/框間判定部1061的判定結果，從包含DCT-II及DCT-V的複數個頻率轉換的基底之中，選擇1個基底。具體來說，當為了編碼對象區塊所使用的是框內預測的情形下，基底選擇部1062是選擇DCT-V的基底。另一方面，當為了編碼對象區塊所使用的是框間預測的情形下，基底選擇部1062是選擇DCT-II的基底。The base selection unit 1062 selects one base from among a plurality of frequency conversion bases including DCT-II and DCT-V, based on the determination result of the in-frame / inter-frame determination unit 1061. Specifically, when intra-frame prediction is used for the coding target block, the base selection unit 1062 selects the base of DCT-V. On the other hand, when inter-frame prediction is used for the coding target block, the base selection unit 1062 selects the base of DCT-II.

頻率轉換部1063是使用以基底選擇部1062所選擇的基底，來進行對編碼對象區塊的預測誤差(殘差)之頻率轉換。也就是說，頻率轉換部1063是選擇性地使用包含DCT-II及DCT-V的複數個頻率轉換的基底，來進行對編碼對象區塊的預測誤差之頻率轉換。具體來說，當為了編碼對象區塊所使用的是框內預測的情形下，頻率轉換部1063是使用DCT-V的基底來進行頻率轉換。另一方面，當為了編碼對象區塊所使用的是框間預測的情形下，頻率轉換部1063是使用DCT-II的基底來進行頻率轉換。The frequency conversion unit 1063 performs frequency conversion on the prediction error (residual) of the coding target block using the base selected by the base selection unit 1062. That is, the frequency conversion unit 1063 selectively uses a plurality of frequency conversion bases including DCT-II and DCT-V to perform frequency conversion on the prediction error of the coding target block. Specifically, when intra-frame prediction is used for the coding target block, the frequency conversion unit 1063 performs frequency conversion using the base of DCT-V. On the other hand, when inter-frame prediction is used for the coding target block, the frequency conversion unit 1063 performs frequency conversion using the base of DCT-II.
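The selection rule described above can be sketched as a small function. This is only an illustration of the decision logic; the function name `select_basis` and the string labels for the prediction type and the bases are hypothetical.

```python
def select_basis(pred_mode: str) -> str:
    """Pick the frequency conversion base from the prediction type."""
    if pred_mode == "intra":
        return "DCT-V"    # intra-frame prediction -> base of DCT-V
    if pred_mode == "inter":
        return "DCT-II"   # inter-frame prediction -> base of DCT-II
    raise ValueError(f"unknown prediction mode: {pred_mode!r}")


for mode in ("intra", "inter"):
    print(mode, "->", select_basis(mode))
```

Because the choice depends only on the prediction type, which the decoder can also determine, a scheme of this shape would not need an extra flag per block to identify the base.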

再者，由頻率轉換部1063輸出的編碼對象區塊之係數，是藉由量化部108及逆量化部112而被量化及被逆量化。逆轉換部114是對已量化及逆量化的編碼對象區塊之係數進行逆頻率轉換。此時，逆轉換部114是根據基底選擇部1062的選擇結果之資訊，來選擇逆頻率轉換的基底。也就是說，在為了編碼對象區塊所使用的是框內預測的情況下是選擇DCT-V的逆轉換之基底，在為了編碼對象區塊所使用的是框間預測的情況下是選擇DCT-II的逆轉換之基底，且利用所選擇的基底來實施逆頻率轉換。 [編碼裝置的轉換部之動作]The coefficients of the coding target block output by the frequency conversion unit 1063 are quantized by the quantization unit 108 and inversely quantized by the inverse quantization unit 112. The inverse conversion unit 114 performs inverse frequency conversion on the quantized and inversely quantized coefficients of the coding target block. At this time, the inverse conversion unit 114 selects the base of the inverse frequency conversion based on information on the selection result of the base selection unit 1062. That is, the base of the inverse conversion of DCT-V is selected when intra-frame prediction is used for the coding target block, the base of the inverse conversion of DCT-II is selected when inter-frame prediction is used for the coding target block, and the inverse frequency conversion is performed using the selected base. [Operation of the conversion unit of the encoding device]

接著，參照圖12來具體地說明如以上所構成之轉換部106的動作。圖12是顯示實施形態1之編碼裝置100的轉換部106之處理的流程圖。Next, the operation of the conversion unit 106 configured as described above will be specifically described with reference to FIG. 12. FIG. 12 is a flowchart showing processing performed by the conversion unit 106 of the encoding device 100 according to the first embodiment.

首先，框內/框間判定部1061會判定為了編碼對象區塊所使用的是框內預測及框間預測的哪一個(S101)。在此，當為了編碼對象區塊所使用的是框內預測的情形下(S101的框內)，基底選擇部1062是選擇DCT-V的基底(S102)。另一方面，當為了編碼對象區塊所使用的是框間預測的情形下(S101的框間)，基底選擇部1062是選擇DCT-II的基底(S103)。最後，頻率轉換部1063是使用在步驟S102或步驟S103所選擇的基底，來進行對編碼對象區塊的預測誤差之頻率轉換(S104)。 [解碼裝置的逆轉換部之內部構成]First, the in-frame / inter-frame determination unit 1061 determines which of the intra-frame prediction and the inter-frame prediction is used for the coding target block (S101). Here, when intra-frame prediction is used for the coding target block (in the frame of S101), the base selection unit 1062 selects the base of DCT-V (S102). On the other hand, when inter-frame prediction is used for the coding target block (between the frames of S101), the base selection unit 1062 selects the base of DCT-II (S103). Finally, the frequency conversion unit 1063 performs frequency conversion on the prediction error of the coding target block using the base selected in step S102 or step S103 (S104). [Internal configuration of the inverse conversion unit of the decoding device]

接著，說明解碼裝置200的逆轉換部206之內部構成。Next, the internal configuration of the inverse conversion unit 206 of the decoding device 200 will be described.

圖13是顯示實施形態1之解碼裝置200的逆轉換部206之內部構成的方塊圖。逆轉換部206具備框內/框間判定部2061、基底選擇部2062、及逆頻率轉換部2063。FIG. 13 is a block diagram showing the internal configuration of the inverse conversion unit 206 of the decoding device 200 according to the first embodiment. The inverse conversion unit 206 includes an in-frame / inter-frame determination unit 2061, a base selection unit 2062, and an inverse frequency conversion unit 2063.

框內/框間判定部2061會判定為了解碼對象區塊所使用的是框內預測及框間預測的哪一個。例如，框內/框間判定部2061會根據從位元流得到的資訊來進行判定。The in-frame / inter-frame determination unit 2061 determines which of intra-frame prediction and inter-frame prediction is used to decode the target block. For example, the in-frame / inter-frame determination unit 2061 makes a determination based on the information obtained from the bit stream.

基底選擇部2062是根據框內/框間判定部2061的判定結果，從包含DCT-II及DCT-V的逆轉換之複數個逆頻率轉換的基底當中，選擇1個基底。具體來說，當為了解碼對象區塊所使用的是框內預測的情形下，基底選擇部2062是選擇DCT-V的逆轉換之基底。另一方面，當為了解碼對象區塊所使用的是框間預測的情形下，基底選擇部2062是選擇DCT-II的逆轉換之基底。The base selection unit 2062 selects one base from among a plurality of inverse frequency conversion bases including the inverse conversions of DCT-II and DCT-V, based on the determination result of the in-frame / inter-frame determination unit 2061. Specifically, when intra-frame prediction is used for the decoding target block, the base selection unit 2062 selects the base of the inverse conversion of DCT-V. On the other hand, when inter-frame prediction is used for the decoding target block, the base selection unit 2062 selects the base of the inverse conversion of DCT-II.

逆頻率轉換部2063是使用由基底選擇部2062所選擇的基底，來進行對解碼對象區塊的係數之逆頻率轉換。也就是說，逆頻率轉換部2063是選擇性地使用包含DCT-II及DCT-V的逆轉換之複數個逆頻率轉換的基底，來進行對解碼對象區塊的係數之逆頻率轉換。The inverse frequency conversion unit 2063 performs inverse frequency conversion on the coefficients of the decoding target block using the base selected by the base selection unit 2062. In other words, the inverse frequency conversion unit 2063 selectively performs the inverse frequency conversion on the coefficients of the decoding target block by using a plurality of inverse frequency conversion bases including the inverse conversion of DCT-II and DCT-V.
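To make the selective inverse frequency conversion concrete, here is a minimal one-dimensional sketch. It assumes the textbook orthonormal definitions of DCT-II and DCT-V, for which the inverse is the transpose; the scaling may differ from the basis functions of FIG. 3, which are not reproduced in this section. A real codec applies the conversion separably in two dimensions with integer arithmetic, which is omitted here. All function names are hypothetical.

```python
import math


def dct2_matrix(n):
    """Orthonormal DCT-II; row i holds the i-th basis function."""
    return [[(math.sqrt(0.5) if i == 0 else 1.0) * math.sqrt(2.0 / n)
             * math.cos(math.pi * i * (2 * j + 1) / (2 * n))
             for j in range(n)] for i in range(n)]


def dct5_matrix(n):
    """Orthonormal DCT-V (textbook normalization, an assumption of this sketch)."""
    def eps(k):
        return math.sqrt(0.5) if k == 0 else 1.0
    scale = 2.0 / math.sqrt(2 * n - 1)
    return [[scale * eps(i) * eps(j) * math.cos(2 * math.pi * i * j / (2 * n - 1))
             for j in range(n)] for i in range(n)]


def forward(basis, samples):
    """Forward conversion: project the samples onto each basis function."""
    return [sum(b * s for b, s in zip(row, samples)) for row in basis]


def inverse(basis, coeffs):
    """Inverse conversion: for an orthonormal basis this is the transpose."""
    n = len(basis)
    return [sum(basis[i][j] * coeffs[i] for i in range(n)) for j in range(n)]


def inverse_convert(coeffs, pred_mode):
    """Choose the inverse base from the prediction type, then apply it."""
    n = len(coeffs)
    basis = dct5_matrix(n) if pred_mode == "intra" else dct2_matrix(n)
    return inverse(basis, coeffs)


residual = [1.0, 3.0, 4.0, 6.0]
coeffs = forward(dct5_matrix(4), residual)   # encoder side, intra block
restored = inverse_convert(coeffs, "intra")  # decoder side
print([round(v, 6) for v in restored])
```

Without quantization in between, the decoder-side result matches the original residual up to floating-point rounding, provided both sides choose the same base.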

更具體來說，當為了解碼對象區塊所使用的是框內預測的情形下，逆頻率轉換部2063是使用DCT-V的逆轉換之基底來進行逆頻率轉換。另一方面，當為了解碼對象區塊所使用的是框間預測的情形下，逆頻率轉換部2063是使用DCT-II的逆轉換之基底來進行逆頻率轉換。 [解碼裝置的逆轉換部之動作]More specifically, when intra-frame prediction is used for decoding the target block, the inverse frequency conversion unit 2063 performs inverse frequency conversion using the base of the inverse conversion of DCT-V. On the other hand, when inter-frame prediction is used for decoding a target block, the inverse frequency conversion unit 2063 performs inverse frequency conversion using the base of the inverse conversion of DCT-II. [Operation of the Inverse Conversion Unit of the Decoding Device]

接著，參照圖14來具體地說明如以上所構成之逆轉換部206的動作。圖14是顯示實施形態1之解碼裝置200的逆轉換部206之處理的流程圖。Next, the operation of the inverse conversion unit 206 configured as described above will be specifically described with reference to FIG. 14. FIG. 14 is a flowchart showing processing performed by the inverse conversion unit 206 of the decoding device 200 according to the first embodiment.

首先，框內/框間判定部2061會判定為了解碼對象區塊所使用的是框內預測及框間預測的哪一個(S201)。在此，當為了解碼對象區塊所使用的是框內預測的情形下(S201的框內)，基底選擇部2062是選擇DCT-V的逆轉換之基底(S202)。另一方面，當為了解碼對象區塊所使用的是框間預測的情形下(S201的框間)，基底選擇部2062是選擇DCT-II的逆轉換之基底(S203)。最後，逆頻率轉換部2063是使用在步驟S202或步驟S203所選擇的基底，來進行對解碼對象區塊的係數之逆頻率轉換(S204)。 [效果等]First, the in-frame / inter-frame determination unit 2061 determines which of intra-frame prediction and inter-frame prediction is used for the decoding target block (S201). When intra-frame prediction is used for the decoding target block (intra-frame in S201), the base selection unit 2062 selects the base of the inverse conversion of DCT-V (S202). On the other hand, when inter-frame prediction is used for the decoding target block (inter-frame in S201), the base selection unit 2062 selects the base of the inverse conversion of DCT-II (S203). Finally, the inverse frequency conversion unit 2063 performs inverse frequency conversion on the coefficients of the decoding target block using the base selected in step S202 or step S203 (S204). [Effects, etc.]

如上所述，根據本實施形態之編碼裝置100的轉換部106及解碼裝置200的逆轉換部206，當為了當前區塊所使用的是框內預測的情況下，可以使用DCT-V的基底或DCT-V的逆轉換之基底來對當前區塊進行轉換或逆轉換。由於在DCT-V中是在直流成分中於接近參照像素的位置上使振幅變小，因此DCT-V適合框內預測的預測誤差之轉換/逆轉換。因此，編碼裝置100及解碼裝置200可以實現更進一步的壓縮效率之提升。As described above, with the conversion unit 106 of the encoding device 100 and the inverse conversion unit 206 of the decoding device 200 according to this embodiment, when intra-frame prediction is used for the current block, the current block can be converted or inversely converted using the base of DCT-V or the base of the inverse conversion of DCT-V. In DCT-V, the amplitude of the DC component is reduced at positions close to the reference pixels, so DCT-V is suited to the conversion / inverse conversion of intra-frame prediction errors. The encoding device 100 and the decoding device 200 can therefore achieve a further improvement in compression efficiency.

在此，參照圖15A~圖16B來具體地說明DCT-V。Here, the DCT-V will be specifically described with reference to FIGS. 15A to 16B.

圖15A是表示32×32尺寸的區塊中之DCT-II的轉換特性之圖表。圖15B是表示32×32尺寸的區塊中之DCT-V的轉換特性之圖表。圖16A是表示4×4尺寸的區塊中之DST-VII的轉換特性之圖表。圖16B是表示4×4尺寸的區塊中之DCT-V的轉換特性之圖表。在圖15A~圖16B中，橫軸是表示從參照像素起的距離，縱軸是表示振幅。FIG. 15A is a graph showing the conversion characteristics of DCT-II in a 32 × 32 size block. FIG. 15B is a graph showing the conversion characteristics of DCT-V in a 32 × 32 size block. FIG. 16A is a graph showing the conversion characteristics of DST-VII in a 4 × 4 size block. FIG. 16B is a graph showing the conversion characteristics of DCT-V in a 4 × 4 size block. In FIGS. 15A to 16B, the horizontal axis indicates the distance from the reference pixel, and the vertical axis indicates the amplitude.

DCT-II是類型II的離散餘弦轉換。在DCT-II中，是使用圖3所示的基底(基底函數)。如圖15A所示，在直流(0階)中，無論距離如何，振幅值都是固定的。又，因此，對於在區塊內預測誤差較為一樣的框間預測之區塊，DCT-II是有效的。DCT-II is the type-II discrete cosine transform. DCT-II uses the base (basis function) shown in FIG. 3. As shown in FIG. 15A, the amplitude of the DC (0th-order) component is constant regardless of the distance. DCT-II is therefore effective for inter-frame-predicted blocks, in which the prediction error is relatively uniform within the block.

DCT-V是類型V的離散餘弦轉換。在DCT-V中，是使用圖3所示的基底(基底函數)。如圖15B所示，在比較大的區塊中，DCT-V具有與DCT-II相近的轉換特性。又，如圖16B所示，在比較小的區塊中，在直流中於接近參照像素的位置上會使振幅變小，故DCT-V的轉換特性與DST-VII的轉換特性相似。DCT-V is a discrete cosine transform of type V. In DCT-V, the base (base function) shown in FIG. 3 is used. As shown in FIG. 15B, in a relatively large block, DCT-V has a conversion characteristic similar to DCT-II. Also, as shown in FIG. 16B, in a relatively small block, the amplitude becomes smaller at a position closer to the reference pixel in DC, so the conversion characteristics of DCT-V are similar to the conversion characteristics of DST-VII.
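The behaviour of the lowest-order basis functions can be checked numerically. The sketch below uses the textbook orthonormal definitions of DCT-II, DCT-V, and DST-VII, which are assumed here and may differ in scaling from the basis functions of FIG. 3 (not reproduced in this section). It prints the first few samples of each lowest-order basis function for block lengths 4 and 32, with index 0 taken as the sample nearest the reference pixels. The function names are hypothetical.

```python
import math


def dct2_dc(n):
    """0th-order (DC) basis function of an orthonormal DCT-II."""
    return [math.sqrt(1.0 / n)] * n


def dct5_dc(n):
    """0th-order basis function of an orthonormal DCT-V (assumed normalization)."""
    scale = 2.0 / math.sqrt(2 * n - 1)
    return [scale * math.sqrt(0.5) * (math.sqrt(0.5) if j == 0 else 1.0)
            for j in range(n)]


def dst7_lowest(n):
    """Lowest-order basis function of an orthonormal DST-VII."""
    scale = math.sqrt(4.0 / (2 * n + 1))
    return [scale * math.sin(math.pi * (j + 1) / (2 * n + 1)) for j in range(n)]


for n in (4, 32):
    print(f"N={n}")
    for name, fn in (("DCT-II", dct2_dc), ("DCT-V", dct5_dc), ("DST-VII", dst7_lowest)):
        row = fn(n)
        # Index 0 is the sample nearest the reference pixels.
        print(f"  {name:8s} first four samples:", [round(v, 3) for v in row[:4]])
```

Under these definitions only the first sample of the DCT-V row is reduced (by a factor of 1/sqrt(2) relative to the others), so the reduced sample accounts for one in four samples at length 4 but only one in thirty-two at length 32, where the row is nearly flat like DCT-II.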

在框內預測的區塊中，在接近參照像素的像素(左側及上側的像素)中，會有預測誤差變小的傾向。但是，由於當預測誤差較小的情況下會有採用較大的區塊之傾向，因此在較大的區塊中，在接近參照像素的像素中會難以顯現預測誤差變小的傾向。In an intra-frame-predicted block, the prediction error tends to be smaller at pixels close to the reference pixels (the pixels on the left and upper sides). However, because larger blocks tend to be chosen when the prediction error is small, this tendency for the prediction error to be smaller near the reference pixels is less apparent in larger blocks.

從而，可以將在較大的區塊中具有與DCT-II相似的轉換特性，且在較小的區塊中具有與DST-VII相似的轉換特性之DCT-V，適用於框內預測的區塊之頻率轉換，藉此從較小的尺寸到較大的尺寸都有效地對框內預測的區塊進行頻率轉換/逆頻率轉換。從而，可以將DCT-V的基底使用在框內預測的區塊之轉換，藉此實現更進一步的壓縮效率之提升。 (實施形態1的變形例1)Accordingly, by applying DCT-V, which has conversion characteristics similar to DCT-II in larger blocks and similar to DST-VII in smaller blocks, to the frequency conversion of intra-frame-predicted blocks, frequency conversion / inverse frequency conversion can be performed effectively on intra-frame-predicted blocks from small sizes to large sizes. Using the base of DCT-V for the conversion of intra-frame-predicted blocks thus achieves a further improvement in compression efficiency. (Modification 1 of Embodiment 1)

接著，針對實施形態1的變形例1進行說明。在本變形例中，使用在頻率轉換及逆頻率轉換的基底是取決於當前區塊的尺寸，這一點與上述實施形態1不同。以下，將參照圖17~圖20，以與實施形態1不同之點為中心來具體地說明本變形例。 [編碼裝置的轉換部之內部構成]Next, a first modification of the first embodiment will be described. In this modification, the basis used for frequency conversion and inverse frequency conversion depends on the size of the current block, which is different from the first embodiment described above. Hereinafter, this modification will be specifically described with reference to Figs. 17 to 20, focusing on the differences from the first embodiment. [Internal Structure of Conversion Unit of Encoding Device]

圖17是顯示實施形態1的變形例1之編碼裝置100的轉換部106A之內部構成的方塊圖。轉換部106A具備框內/框間判定部1061、基底選擇部1062A、頻率轉換部1063、及尺寸判定部1064A。FIG. 17 is a block diagram showing the internal configuration of the conversion unit 106A of the encoding device 100 according to the first modification of the first embodiment. The conversion unit 106A includes an in-frame / inter-frame determination unit 1061, a base selection unit 1062A, a frequency conversion unit 1063, and a size determination unit 1064A.

尺寸判定部1064A會判定編碼對象區塊的尺寸是否為閾值尺寸以下。The size determination unit 1064A determines whether the size of the encoding target block is equal to or smaller than a threshold size.

當為了編碼對象區塊所使用的是框內預測的情形下，只要編碼對象區塊的尺寸為閾值尺寸以下，基底選擇部1062A即選擇DCT-V的基底。另一方面，即使在為了編碼對象區塊所使用的是框內預測的情形下，只要編碼對象區塊的尺寸比閾值尺寸更大，基底選擇部1062A即選擇DCT-II的基底。又，當為了編碼對象區塊所使用的是框間預測的情形下，基底選擇部1062A是與上述實施形態1同樣地選擇DCT-II的基底。When intra-frame prediction is used for the coding target block, as long as the size of the coding target block is equal to or smaller than the threshold size, the base selection unit 1062A selects the base of the DCT-V. On the other hand, even when intra-frame prediction is used for the coding target block, as long as the size of the coding target block is larger than the threshold size, the base selection unit 1062A selects the base of the DCT-II. When inter-frame prediction is used for the coding target block, the base selection unit 1062A selects the base of DCT-II in the same manner as in the first embodiment.
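The size-dependent rule above can be sketched as follows. The text does not say how the size of a non-square block is compared with the threshold, so this sketch assumes the longer side is compared; the default threshold of 16 is only an illustrative value, and the function name `select_basis_by_size` is hypothetical.

```python
def select_basis_by_size(pred_mode, width, height, threshold=16):
    """Intra blocks no larger than the threshold use DCT-V; all others use DCT-II."""
    # Assumption of this sketch: the longer side decides the block "size".
    if pred_mode == "intra" and max(width, height) <= threshold:
        return "DCT-V"
    return "DCT-II"


for mode, w, h in (("intra", 8, 8), ("intra", 32, 32), ("inter", 8, 8)):
    print(mode, f"{w}x{h}", "->", select_basis_by_size(mode, w, h))
```

The decoder would need to apply the same rule with the same threshold so that the inverse conversion uses the matching base.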

作為表示用來切換基底的區塊尺寸之邊界的閾值尺寸，是使用例如已於標準規格中事先定義的固定尺寸(例如16×16像素)。又，閾值尺寸也可以根據位元流中所包含的訊號來決定，且也可以由外部裝置或使用者輸入。例如，閾值尺寸也可以根據框內預測模式、量化參數、或預測誤差等來決定。 [編碼裝置的轉換部之動作]As the threshold size indicating the boundary of the block size for switching the base, for example, a fixed size (for example, 16 × 16 pixels) that has been previously defined in a standard specification is used. In addition, the threshold size may be determined according to a signal included in the bit stream, and may also be input by an external device or a user. For example, the threshold size may be determined according to an intra-frame prediction mode, a quantization parameter, or a prediction error. [Operation of Conversion Unit of Encoding Device]

接著，參照圖18來具體地說明如以上所構成之本變形例的轉換部106A之動作。圖18是顯示實施形態1的變形例1之編碼裝置100的轉換部106A之處理的流程圖。Next, the operation of the conversion unit 106A of the present modification configured as described above will be specifically described with reference to FIG. 18. FIG. 18 is a flowchart showing processing performed by the conversion unit 106A of the encoding device 100 according to the first modification of the first embodiment.

首先，框內/框間判定部1061會判定為了編碼對象區塊所使用的是框內預測及框間預測的哪一個(S101)。在此，當為了編碼對象區塊所使用的是框內預測的情形下(S101的框內)，尺寸判定部1064A會判定編碼對象區塊的尺寸是否為閾值尺寸以下(S111)。在此，當編碼對象區塊的尺寸為閾值尺寸以下的情況下(S111的是)，基底選擇部1062A會選擇DCT-V的基底(S102)。First, the in-frame / inter-frame determination unit 1061 determines which of the intra-frame prediction and the inter-frame prediction is used for the coding target block (S101). Here, when intra-frame prediction is used for the encoding target block (in the frame of S101), the size determination unit 1064A determines whether the size of the encoding target block is equal to or smaller than a threshold size (S111). Here, when the size of the coding target block is equal to or smaller than the threshold size (YES in S111), the base selection unit 1062A selects the base of the DCT-V (S102).

當為了編碼對象區塊所使用的是框間預測的情況下(S101的框間)、或者，編碼對象區塊的尺寸比閾值尺寸更大的情況下(S111的否)，基底選擇部1062A會選擇DCT-II的基底(S103)。最後，頻率轉換部1063是使用在步驟S102或步驟S103所選擇的基底，來進行對編碼對象區塊的預測誤差之頻率轉換(S104)。 [解碼裝置的逆轉換部之內部構成]When inter-frame prediction is used for the coding target block (inter-frame in S101), or when the size of the coding target block is larger than the threshold size (No in S111), the base selection unit 1062A selects the base of DCT-II (S103). Finally, the frequency conversion unit 1063 performs frequency conversion on the prediction error of the coding target block using the base selected in step S102 or step S103 (S104). [Internal configuration of the inverse conversion unit of the decoding device]

接著，說明解碼裝置200的逆轉換部206A之內部構成。圖19是顯示實施形態1的變形例1之解碼裝置200的逆轉換部206A之內部構成的方塊圖。逆轉換部206A具備有框內/框間判定部2061、基底選擇部2062A、逆頻率轉換部2063、及尺寸判定部2064A。Next, the internal configuration of the inverse conversion unit 206A of the decoding device 200 will be described. FIG. 19 is a block diagram showing the internal configuration of the inverse conversion unit 206A of the decoding device 200 according to the first modification of the first embodiment. The inverse conversion unit 206A includes an in-frame / inter-frame determination unit 2061, a base selection unit 2062A, an inverse frequency conversion unit 2063, and a size determination unit 2064A.

尺寸判定部2064A會判定解碼對象區塊的尺寸是否為閾值尺寸以下。The size determination unit 2064A determines whether the size of the decoding target block is equal to or smaller than the threshold size.

當為了解碼對象區塊所使用的是框內預測的情形下，只要解碼對象區塊的尺寸為閾值尺寸以下，基底選擇部2062A即選擇DCT-V的基底。另一方面，即使在為了解碼對象區塊所使用的是框內預測的情形下，只要解碼對象區塊的尺寸比閾值尺寸更大，基底選擇部2062A即選擇DCT-II的基底。又，當為了解碼對象區塊所使用的是框間預測的情形下，基底選擇部2062A是與上述實施形態1同樣地選擇DCT-II的基底。When intra-frame prediction is used for the decoding target block, as long as the size of the decoding target block is equal to or smaller than the threshold size, the base selection unit 2062A selects the base of the DCT-V. On the other hand, even when intra-frame prediction is used for the decoding target block, as long as the size of the decoding target block is larger than the threshold size, the base selection unit 2062A selects the base of DCT-II. When inter-frame prediction is used for the decoding target block, the base selection unit 2062A selects the base of DCT-II in the same manner as in the first embodiment.

作為閾值尺寸，是使用與編碼裝置100的轉換部106A所用的閾值尺寸相同的尺寸。 [解碼裝置的逆轉換部之動作]As the threshold size, the same size as the threshold size used by the conversion unit 106A of the encoding device 100 is used. [Operation of the Inverse Conversion Unit of the Decoding Device]

接著，參照圖20來具體地說明如以上所構成之本變形例的逆轉換部206A之動作。圖20是顯示實施形態1的變形例1之解碼裝置200的逆轉換部206A的處理之流程圖。Next, the operation of the inverse conversion unit 206A of the present modification configured as described above will be specifically described with reference to FIG. 20. FIG. 20 is a flowchart showing processing of the inverse conversion unit 206A of the decoding device 200 according to the first modification of the first embodiment.

首先，框內/框間判定部2061會判定為了解碼對象區塊所使用的是框內預測及框間預測的哪一個(S201)。在此，當為了解碼對象區塊所使用的是框內預測的情形下(S201的框內)，尺寸判定部2064A會判定解碼對象區塊的尺寸是否為閾值尺寸以下(S211)。在此，當解碼對象區塊的尺寸為閾值尺寸以下的情況下(S211的是)，基底選擇部2062A會選擇DCT-V的逆轉換之基底(S202)。First, the in-frame / inter-frame determination unit 2061 determines which of the intra-frame prediction and the inter-frame prediction is used to decode the target block (S201). Here, when intra-frame prediction is used for the decoding target block (in the frame of S201), the size determination unit 2064A determines whether the size of the decoding target block is equal to or smaller than the threshold size (S211). Here, when the size of the decoding target block is equal to or smaller than the threshold size (YES in S211), the base selection unit 2062A selects the base of the inverse conversion of DCT-V (S202).

當為了解碼對象區塊所使用的是框間預測的情況下(S201的框間)、或者，解碼對象區塊的尺寸比閾值尺寸更大的情況下(S211的否)，基底選擇部2062A會選擇DCT-II的逆轉換之基底(S203)。最後，逆頻率轉換部2063是使用在步驟S202或步驟S203所選擇的基底，來進行對解碼對象區塊的係數之逆頻率轉換(S204)。 [效果等]When inter-frame prediction is used for the decoding target block (inter-frame in S201), or when the size of the decoding target block is larger than the threshold size (No in S211), the base selection unit 2062A selects the base of the inverse conversion of DCT-II (S203). Finally, the inverse frequency conversion unit 2063 performs inverse frequency conversion on the coefficients of the decoding target block using the base selected in step S202 or step S203 (S204). [Effects, etc.]

如以上，根據本實施形態之編碼裝置100的轉換部106A及解碼裝置200的逆轉換部206A，可以因應於使用框內預測的當前區塊之尺寸，來切換DCT-II及DCT-V的基底，而對當前區塊進行轉換/逆轉換。若區塊尺寸較大，會有預測誤差在區塊內整體性地變小的傾向，而使DCT-II適合區塊的預測誤差之轉換。另一方面，若區塊尺寸較小，則會有預測誤差越接近參照像素的像素變得越小的傾向，而使DCT-V適合區塊的預測誤差之轉換。因此，若當前區塊的尺寸為閾值尺寸以下時，是藉由DCT-V來對當前區塊進行轉換/逆轉換，若編碼對象區塊的尺寸大於閾值尺寸時，是藉由DCT-II來對當前區塊進行轉換/逆轉換，藉此可以實現更進一步的壓縮效率之提升。 (實施形態1的變形例2)As described above, with the conversion unit 106A of the encoding device 100 and the inverse conversion unit 206A of the decoding device 200 according to this embodiment, the current block can be converted / inversely converted while switching between the bases of DCT-II and DCT-V according to the size of the current block for which intra-frame prediction is used. When the block size is large, the prediction error tends to be small throughout the block, which makes DCT-II suitable for converting the prediction error of the block. On the other hand, when the block size is small, the prediction error tends to be smaller at pixels closer to the reference pixels, which makes DCT-V suitable for converting the prediction error of the block. Therefore, a further improvement in compression efficiency can be achieved by converting / inversely converting the current block with DCT-V when the size of the current block is equal to or smaller than the threshold size, and with DCT-II when the size of the coding target block is larger than the threshold size. (Modification 2 of Embodiment 1)

接著，針對實施形態1的變形例2進行說明。在本變形例中，是在將閾值尺寸的資訊寫入位元流內之點上，與上述實施形態1的變形例1不同。以下，將參照圖21~圖25，以與實施形態1之變形例1不同之點為中心來具體地說明本變形例。 [編碼裝置的轉換部之內部構成]Next, a second modification of the first embodiment will be described. This modification is different from the first modification of the first embodiment in that the information of the threshold size is written in the bit stream. Hereinafter, this modification will be specifically described with reference to Figs. 21 to 25, focusing on differences from the first modification of the first embodiment. [Internal Structure of Conversion Unit of Encoding Device]

圖21是顯示實施形態1的變形例2之編碼裝置100的轉換部106B之內部構成的方塊圖。轉換部106B具備框內/框間判定部1061、基底選擇部1062A、頻率轉換部1063、尺寸判定部1064A、及閾值尺寸決定部1065B。FIG. 21 is a block diagram showing the internal configuration of the conversion unit 106B of the encoding device 100 according to the second modification of the first embodiment. The conversion unit 106B includes an in-frame / inter-frame determination unit 1061, a base selection unit 1062A, a frequency conversion unit 1063, a size determination unit 1064A, and a threshold size determination unit 1065B.

閾值尺寸決定部1065B是因應於輸入圖像訊號等而自適應地決定閾值尺寸。所決定的閾值尺寸是用在尺寸判定部1064A中。The threshold size determination unit 1065B determines the threshold size adaptively in response to an input image signal or the like. The determined threshold size is used in the size determination section 1064A.

又，所決定的閾值尺寸之資訊是被輸出至熵編碼部110，且被寫入至位元流內。閾值尺寸的資訊是指顯示閾值尺寸的資訊，例如是顯示閾值尺寸其本身的值、或是顯示閾值尺寸的索引(index)。閾值尺寸的資訊是被寫入到例如圖22之(i)~(v)所示的複數個標頭之至少1個中。The information on the determined threshold size is output to the entropy coding unit 110 and written into the bit stream. The threshold size information is information indicating the threshold size, for example the value of the threshold size itself or an index indicating the threshold size. The threshold size information is written into, for example, at least one of the plurality of headers shown in (i) to (v) of FIG. 22.

圖22是顯示實施形態1之變形例2或3中的閾值尺寸或轉換模式的資訊之位元流內的位置的複數個例子之圖。圖22之(i)是顯示在視訊參數集內有閾值尺寸或轉換模式的資訊。圖22之(ii)是顯示在視訊流的序列參數集內有閾值尺寸或轉換模式的資訊。圖22之(iii)是顯示在圖片的圖片參數集內有閾值尺寸或轉換模式的資訊。圖22之(iv)是顯示在片段的片段標頭內有閾值尺寸或轉換模式的資訊。圖22的(v)是顯示用於進行動態圖系統或視訊解碼器的設置(set-up)或初始化之參數的群組內有閾值尺寸或轉換模式的資訊之情形。閾值尺寸或轉換模式的資訊存在於複數個階層(例如，圖片參數集及片段標頭)的情況下，存在於低階層(例如片段標頭)中的閾值尺寸或轉換模式的資訊，會覆寫存在於更高的階層(例如圖片參數集)中的閾值尺寸或轉換模式的資訊。FIG. 22 is a diagram showing several examples of positions in the bit stream of the threshold size or conversion mode information in Modification 2 or 3 of Embodiment 1. (i) of FIG. 22 shows the threshold size or conversion mode information in a video parameter set. (ii) of FIG. 22 shows the threshold size or conversion mode information in a sequence parameter set of the video stream. (iii) of FIG. 22 shows the threshold size or conversion mode information in a picture parameter set of a picture. (iv) of FIG. 22 shows the threshold size or conversion mode information in a slice header of a slice. (v) of FIG. 22 shows the threshold size or conversion mode information in a group of parameters for setting up or initializing a video system or a video decoder. When the threshold size or conversion mode information exists at a plurality of levels (for example, a picture parameter set and a slice header), the information at the lower level (for example, the slice header) overwrites the information at the higher level (for example, the picture parameter set).
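The override rule for information present at several levels can be sketched as a lookup that walks from the highest level to the lowest and keeps the last value found. The level names, the dictionary layout, and the function name `resolve_threshold` are hypothetical and do not reflect any actual bit stream syntax.

```python
from typing import Optional

# Header levels ordered from highest to lowest (illustrative names only).
LEVELS = ("vps", "sps", "pps", "slice_header")


def resolve_threshold(headers: dict) -> Optional[int]:
    """Return the threshold size carried by the lowest level that has one."""
    value = None
    for level in LEVELS:
        if headers.get(level) is not None:
            value = headers[level]   # a lower level overwrites a higher one
    return value


# The slice header value takes precedence over the sequence-level value.
print(resolve_threshold({"sps": 16, "slice_header": 8}))
# With no slice-level value, the picture-level value applies.
print(resolve_threshold({"sps": 16, "pps": 32}))
```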

The threshold size information need only be written when it changes. That is, when the threshold size is the same as the one used immediately before, writing of the threshold size information may be skipped.

[Operation of the Transformation Unit of the Encoding Device]

Next, the operation of the transformation unit 106B of this modification, configured as described above, is described in detail with reference to FIG. 23. FIG. 23 is a flowchart showing the processing of the transformation unit 106B of the encoding device 100 according to Modification 2 of Embodiment 1.

First, the intra/inter determination unit 1061 determines whether intra prediction or inter prediction is used for the block to be encoded (S101). When intra prediction is used for the block to be encoded (Intra in S101), the threshold size determination unit 1065B adaptively determines the threshold size and outputs the information on the determined threshold size to the entropy encoding unit 110 (S121). The processing from step S111 onward is then executed.

[Internal Configuration of the Inverse Transformation Unit of the Decoding Device]

Next, the internal configuration of the inverse transformation unit 206B of the decoding device 200 is described. FIG. 24 is a block diagram showing the internal configuration of the inverse transformation unit 206B of the decoding device 200 according to Modification 2 of Embodiment 1. The inverse transformation unit 206B includes an intra/inter determination unit 2061, a basis selection unit 2062A, an inverse frequency transformation unit 2063, a size determination unit 2064A, and a threshold size obtaining unit 2065B.

The threshold size obtaining unit 2065B obtains the threshold size from the bitstream. For example, it obtains the threshold size based on the threshold size information that the entropy decoding unit 202 has parsed from the bitstream. The threshold size obtained here is used by the size determination unit 2064A.

[Operation of the Inverse Transformation Unit of the Decoding Device]

Next, the operation of the inverse transformation unit 206B of this modification, configured as described above, is described in detail with reference to FIG. 25. FIG. 25 is a flowchart showing the processing of the inverse transformation unit 206B of the decoding device 200 according to Modification 2 of Embodiment 1.

First, the intra/inter determination unit 2061 determines whether intra prediction or inter prediction is used for the block to be decoded (S201). When intra prediction is used for the block to be decoded (Intra in S201), the threshold size obtaining unit 2065B obtains the threshold size from the bitstream (S221). The processing from step S211 onward is then executed.

[Effects, etc.]

As described above, with the transformation unit 106B of the encoding device 100 and the inverse transformation unit 206B of the decoding device 200 according to this modification, the threshold size information can be included in the bitstream. The threshold size can therefore be determined adaptively according to the input image, which enables a further improvement in compression efficiency.

(Modification 3 of Embodiment 1)

Next, Modification 3 of Embodiment 1 is described. This modification differs from Modification 2 of Embodiment 1 in that it can switch between a first transformation mode, which uses the transformation and inverse transformation of Modification 2 of Embodiment 1, and a second transformation mode, which uses a different transformation and inverse transformation. This modification is described in detail below with reference to FIGS. 26 to 29, focusing on the differences from Modification 2 of Embodiment 1.

[Internal Configuration of the Transformation Unit of the Encoding Device]

FIG. 26 is a block diagram showing the internal configuration of the transformation unit 106C of the encoding device 100 according to Modification 3 of Embodiment 1. The transformation unit 106C includes an intra/inter determination unit 1061, a basis selection unit 1062C, a frequency transformation unit 1063, a size determination unit 1064A, a threshold size determination unit 1065B, and a transformation mode determination unit 1066C.

The transformation mode determination unit 1066C determines which of a plurality of transformation modes, including the first transformation mode and the second transformation mode, is applied to the block to be encoded. The transformation modes may differ, for example, in their selectable bases, or may share the same selectable bases but differ in how a basis is selected.

Information on the transformation mode applied to the block to be encoded is output to the entropy encoding unit 110 and written into the bitstream. The transformation mode information is information for identifying the transformation mode, for example a flag or an index indicating the transformation mode. Like the threshold size information, it is written into, for example, at least one of the headers shown in (i) to (v) of FIG. 22. The transformation mode information and the threshold size information need not be written into the same header and may be written into different headers.

When the first transformation mode is applied, the basis selection unit 1062C selects one basis from among a plurality of frequency transformation bases including DCT-II and DCT-V, as in Modification 2 of Embodiment 1. When the second transformation mode is applied, the basis selection unit 1062C selects one basis from among a plurality of frequency transformation bases different from those of the first transformation mode.

The frequency transformation unit 1063 performs a frequency transformation on the prediction error of the block to be encoded using the basis selected by the basis selection unit 1062C. That is, it performs a first frequency transformation when the first transformation mode is applied and a second frequency transformation when the second transformation mode is applied.

The first frequency transformation is the same as the frequency transformation of Modification 2 of Embodiment 1. The second frequency transformation differs from the first frequency transformation; for example, it uses a basis different from that of the first frequency transformation. The basis used in the first frequency transformation and the basis used in the second frequency transformation need not differ under all conditions, and may be the same under specific conditions (for example, when the block size is equal to or smaller than the threshold size).

[Operation of the Transformation Unit of the Encoding Device]

Next, the operation of the transformation unit 106C of this modification, configured as described above, is described in detail with reference to FIG. 27. FIG. 27 is a flowchart showing the processing of the transformation unit 106C of the encoding device 100 according to Modification 3 of Embodiment 1.

First, the intra/inter determination unit 1061 determines whether intra prediction or inter prediction is used for the block to be encoded (S101). When intra prediction is used for the block to be encoded (Intra in S101), the transformation mode determination unit 1066C determines the transformation mode to be applied to the block to be encoded and outputs it to the entropy encoding unit 110 (S131).

If the determined transformation mode is the first transformation mode (first transformation mode in S132), the processing from step S121 onward is executed. If the determined transformation mode is the second transformation mode (second transformation mode in S132), the basis selection unit 1062C selects a basis for the second transformation mode (S133).

The frequency transformation unit 1063 then performs the frequency transformation of the block to be encoded using the basis selected in step S102, step S103, or step S133 (S104).
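The branching of FIG. 27 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name and the tuple labels are hypothetical, the size comparison after S121 (DCT-V at or below the threshold, DCT-II above it) is assumed rather than quoted from the flowchart, the inter branch uses DCT-II as stated in the abstract, and DST-VII stands in for whatever basis S133 would select.

```python
from typing import List, Tuple


def encoder_transform_flow(
    is_intra: bool, mode: int, block_size: int, threshold: int
) -> Tuple[str, List[Tuple[str, int]]]:
    """Return (basis name, information passed to the entropy encoder)."""
    signalled: List[Tuple[str, int]] = []
    if not is_intra:  # S101: inter prediction
        return "DCT-II", signalled
    signalled.append(("transform_mode", mode))  # S131: output the chosen mode
    if mode == 1:  # S132: first transformation mode
        signalled.append(("threshold_size", threshold))  # S121
        # Assumed size comparison for the steps that follow S121.
        basis = "DCT-V" if block_size <= threshold else "DCT-II"
    else:  # S132: second transformation mode
        basis = "DST-VII"  # S133: placeholder choice for illustration only
    return basis, signalled
```

In this sketch an inter block passes nothing to the entropy encoder, an intra block in the first mode passes the mode and the threshold size, and an intra block in the second mode passes the mode only.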

As bases for the second transformation mode, the eight types of bases, Type I to Type VIII, defined for each of the DCT and the DST according to boundary conditions and symmetry may be used. In that case, the basis for the block to be encoded may be selected from the sixteen bases in total based on, for example, the prediction error, or an evaluation value that takes into account the code amount of the prediction error and of information related to encoding the prediction error. For example, when selecting based on the prediction error, the basis that minimizes the residual may be selected. The basis for the block to be encoded may also be selected according to the intra prediction mode, which indicates, for example, the direction of intra prediction. Alternatively, a single basis may be selected in a fixed manner instead of selecting the basis from among a plurality of frequency transformation bases. For example, the DST-VII basis may be used in a fixed manner for the 4×4 size while the basis is selected adaptively for other sizes. When the basis is selected in a fixed manner, a signal indicating the selected basis need not be written into the bitstream; such a signal may be written into the bitstream only when the basis is selected adaptively.
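As a rough illustration of selecting a basis from the prediction error, the following Python sketch compares only two of the sixteen candidate bases (DCT-II and DST-VII) on a one-dimensional residual. The cost used here, the sum of absolute transform coefficients, is an assumed stand-in for the evaluation value; the text does not specify one. All function names are hypothetical.

```python
import math
from typing import Callable, Dict, List

Matrix = List[List[float]]


def dct2_matrix(n: int) -> Matrix:
    """Orthonormal DCT-II basis; row k is the k-th basis function."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append(
            [scale * math.cos(math.pi * (2 * j + 1) * k / (2 * n)) for j in range(n)]
        )
    return rows


def dst7_matrix(n: int) -> Matrix:
    """DST-VII basis; row k is the k-th basis function."""
    scale = math.sqrt(4.0 / (2 * n + 1))
    return [
        [scale * math.sin(math.pi * (2 * k + 1) * (j + 1) / (2 * n + 1)) for j in range(n)]
        for k in range(n)
    ]


def l1_cost(basis: Matrix, residual: List[float]) -> float:
    """Sum of absolute transform coefficients (an assumed stand-in cost)."""
    return sum(abs(sum(b * x for b, x in zip(row, residual))) for row in basis)


# Only two candidates are listed here; the text allows up to sixteen.
CANDIDATES: Dict[str, Callable[[int], Matrix]] = {
    "DCT-II": dct2_matrix,
    "DST-VII": dst7_matrix,
}


def select_basis(residual: List[float]) -> str:
    """Return the name of the candidate basis with the lowest cost."""
    n = len(residual)
    return min(CANDIDATES, key=lambda name: l1_cost(CANDIDATES[name](n), residual))
```

Under this cost, a residual that grows away from the block edge, such as `[1.0, 2.0, 3.0, 4.0]`, favours DST-VII, while a flat residual such as `[3.0, 3.0, 3.0, 3.0]` favours DCT-II.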

The first transformation mode and the second transformation mode may differ in their selectable bases, or may share the same selectable bases but differ in how a basis is selected. Whether the selectable bases of the two modes are mutually exclusive or overlapping may also be switched depending on the block size, and the modes may be configured so that the same basis is selectable at different block sizes while different bases are selectable at the same block size.

As a more specific example, in the second transformation mode, the DST-VII basis may be selected in a fixed manner when the block size is 4×4, and either the DST-I basis or the DST-VII basis may be selected adaptively when the block size exceeds 4×4. In this case, in the first transformation mode, the DCT-V basis may be selected in a fixed manner when the block size is 16×16 or smaller, and the DCT-II basis may be selected in a fixed manner when the block size exceeds 16×16.
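This example can be written as a small lookup, sketched here in Python. The function name is hypothetical, blocks are assumed to be square with a side of at least 4, and the mode is passed as the integer 1 or 2.

```python
from typing import List


def candidate_bases(mode: int, block_size: int) -> List[str]:
    """Selectable bases for a square intra block of side block_size (assumed >= 4)."""
    if mode == 1:
        # First mode: one fixed basis, switched at 16x16.
        return ["DCT-V"] if block_size <= 16 else ["DCT-II"]
    if mode == 2:
        # Second mode: fixed DST-VII at 4x4, adaptive choice above 4x4.
        return ["DST-VII"] if block_size <= 4 else ["DST-I", "DST-VII"]
    raise ValueError(f"unknown transformation mode: {mode}")
```

A single-element result corresponds to a fixed selection; a two-element result corresponds to an adaptive selection between the listed bases.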

As another specific example, in the second transformation mode, the DST-VII basis may be selected in a fixed manner when the block size is 4×4; when the block size exceeds 4×4, one basis set may be selected from among a plurality of basis sets each containing one or more bases, and one basis may then be selected from the selected basis set. The plurality of basis sets may include, for example, a first basis set consisting of the DST-I basis and the DST-VII basis, and a second basis set consisting of the DCT-VIII basis and the DST-VII basis. The basis set may be selected according to, for example, the intra prediction mode, which indicates the direction of intra prediction. In the first transformation mode, the DCT-V basis may be selected in a fixed manner when the block size is 16×16 or smaller, and the DCT-II basis may be selected in a fixed manner when the block size exceeds 16×16.

[Internal Configuration of the Inverse Transformation Unit of the Decoding Device]

Next, the internal configuration of the inverse transformation unit 206C of the decoding device 200 is described. FIG. 28 is a block diagram showing the internal configuration of the inverse transformation unit 206C of the decoding device 200 according to Modification 3 of Embodiment 1. The inverse transformation unit 206C includes an intra/inter determination unit 2061, a basis selection unit 2062C, an inverse frequency transformation unit 2063, a size determination unit 2064A, a threshold size obtaining unit 2065B, and a transformation mode determination unit 2066C.

The transformation mode determination unit 2066C determines which of a plurality of transformation modes, including the first transformation mode and the second transformation mode, is applied to the block to be decoded. For example, it determines the transformation mode based on the transformation mode information that the entropy decoding unit 202 has parsed from the bitstream.

When the first transformation mode is applied, the basis selection unit 2062C selects one basis from among a plurality of inverse frequency transformation bases including those of the inverse transformations of DCT-II and DCT-V, as in Modification 2 of Embodiment 1. When the second transformation mode is applied, the basis selection unit 2062C selects one basis from among a plurality of inverse frequency transformation bases different from those of the first transformation mode.

[Operation of the Inverse Transformation Unit of the Decoding Device]

Next, the operation of the inverse transformation unit 206C of this modification, configured as described above, is described in detail with reference to FIG. 29. FIG. 29 is a flowchart showing the processing of the inverse transformation unit 206C of the decoding device 200 according to Modification 3 of Embodiment 1.

First, the intra/inter determination unit 2061 determines whether intra prediction or inter prediction is used for the block to be decoded (S201). When intra prediction is used for the block to be decoded (Intra in S201), the transformation mode determination unit 2066C determines which of the plurality of transformation modes is applied to the block to be decoded (S232). When the first transformation mode is applied (first transformation mode in S232), the processing from step S221 onward is executed. When the second transformation mode is applied (second transformation mode in S232), the basis selection unit 2062C selects a basis for the second transformation mode (S233). The basis selected here is the inverse transformation basis corresponding to the basis for the second transformation mode that was selected in the encoding device 100.

The inverse frequency transformation unit 2063 then performs the inverse frequency transformation of the block to be decoded using the basis selected in step S202, step S203, or step S233 (S204).

[Effects, etc.]

As described above, with the transformation unit 106C of the encoding device 100 and the inverse transformation unit 206C of the decoding device 200 according to this modification, the frequency transformation can be switched by means of the transformation mode. This makes the frequency transformation more efficient and enables a further improvement in compression efficiency.

In addition, with the transformation unit 106C of the encoding device 100 and the inverse transformation unit 206C of the decoding device 200 according to this modification, information on the transformation mode applied to the current block can be included in the bitstream. The transformation mode can therefore be determined adaptively according to the input image, which enables a further improvement in compression efficiency.

Although this modification has been described mainly for the case where two transformation modes (the first transformation mode and the second transformation mode) are used, the number of transformation modes is not limited to two. For example, a third transformation mode and/or a fourth transformation mode may be used in addition to the first and second transformation modes.

(Other Modifications of Embodiment 1)

The encoding device and the decoding device according to one or more aspects of the present disclosure have been described above based on the embodiment and its modifications, but the present disclosure is not limited to the embodiment and modifications. Forms obtained by applying, to the embodiment or a modification, various changes that a person of ordinary skill in the art could conceive, and forms constructed by combining constituent elements of different modifications, may also fall within the scope of one or more aspects of the present disclosure, provided they do not depart from the gist of the present disclosure.

For example, in Embodiment 1 and each modification described above, the DST-VII basis may be selected for a 4×4 luma block, as in the prior art. In that case, the DCT-V basis may be selected in the frequency transformation of blocks that use intra prediction and are not 4×4 luma blocks. Likewise, for chroma blocks that use intra prediction, the DCT-II basis may be selected, as in the prior art. In that case, the DCT-V basis may be selected for luma blocks that use intra prediction.
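One way to read these two options together is sketched below in Python. Combining them into a single rule is an assumption made for illustration; the text presents them as separate possibilities, and the function name is hypothetical.

```python
def intra_block_basis(is_luma: bool, width: int, height: int) -> str:
    """Basis for an intra-predicted block when both options above are combined."""
    if not is_luma:
        return "DCT-II"  # chroma: same basis as in the prior art
    if width == 4 and height == 4:
        return "DST-VII"  # 4x4 luma: same basis as in the prior art
    return "DCT-V"  # every other intra-predicted luma block
```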

In Embodiment 1 and each modification described above, the intra/inter determination unit 1061 determines whether intra prediction or inter prediction is used based on the result of comparing the input image signal with the image signal obtained by locally decoding the compressed image, but the determination may be made based on other signals.

In Embodiment 1 and each modification described above, the bases of the orthogonal transformations DCT-II and DCT-V are used, but bases of non-orthogonal transformations with similar transformation characteristics may be used in place of the DCT-II and DCT-V bases.

In Embodiment 1 and each modification described above, the basis is selected according to the type of prediction (intra or inter) or the block size, but the selection is not limited to this. For example, the intra prediction mode, the quantization parameter, or the residual of the block to be encoded may be evaluated, and the basis may be selected according to the result of the evaluation.

In Embodiment 1 and each modification described above, the basis is selected without distinguishing between luma and chroma, but the selection is not limited to this. For example, the DCT-II basis may be used in a fixed manner for chroma blocks regardless of whether intra prediction or inter prediction is used.

In Embodiment 1 and each modification described above, the DCT-II and DCT-V bases are used when intra prediction is used for the current block, but the bases are not limited to these. For example, the DST-VII basis may be used in addition to the DCT-II and DCT-V bases. For example, when intra prediction is applied to the current block, (i) the DST-VII basis may be used when the size of the current block is equal to or smaller than the threshold size, and (ii) the DCT-V basis may be used when the size of the current block is larger than the threshold size.

In Embodiment 1 and each modification described above, the transformation unit or the inverse transformation unit includes a basis selection unit, but the basis selection unit need not be provided as an explicit, separate unit. In that case, the function of the basis selection unit may be integrated into the frequency transformation unit or the inverse frequency transformation unit.

(Embodiment 2)

In the embodiment and each modification described above, each functional block can typically be realized by an MPU, memory, and the like. The processing performed by each functional block is typically realized by a program execution unit such as a processor reading and executing software (a program) stored in a recording medium such as a ROM. The software may be distributed by downloading or the like, or may be distributed stored on a recording medium such as a semiconductor memory. Each functional block can, of course, also be realized by hardware (a dedicated circuit).

The processing described in the embodiment and each modification may be realized by centralized processing using a single device (system) or by distributed processing using a plurality of devices. The program may be executed by a single processor or by a plurality of processors. That is, either centralized processing or distributed processing may be performed.

The present invention is not limited to the above embodiments; various changes are possible, and these are also included within the scope of the present invention.

Application examples of the moving picture encoding method (image encoding method) and the moving picture decoding method (image decoding method) shown in the embodiment and each modification described above, and systems using them, are described below. The system is characterized by having an image encoding device that uses the image encoding method, an image decoding device that uses the image decoding method, and an image encoding and decoding device that includes both. The other components of the system may be changed as appropriate.

[Usage Example]

FIG. 30 is a diagram showing the overall configuration of a content supply system ex100 that implements a content delivery service. The area in which the communication service is provided is divided into cells of a desired size, and base stations ex106, ex107, ex108, ex109, and ex110, which are fixed wireless stations, are installed in the respective cells.

In this content supply system ex100, devices such as a computer ex111, a game console ex112, a camera ex113, a home appliance ex114, and a smartphone ex115 are connected to the Internet ex101 via an Internet service provider ex102 or a communication network ex104 and via the base stations ex106 to ex110. The content supply system ex100 may be configured by combining and connecting any of the above elements. The devices may also be connected to one another directly or indirectly via a telephone network, short-range wireless communication, or the like, without going through the base stations ex106 to ex110, which are fixed wireless stations. A streaming server ex103 is connected to the devices such as the computer ex111, the game console ex112, the camera ex113, the home appliance ex114, and the smartphone ex115 via the Internet ex101 or the like. The streaming server ex103 is also connected, via a satellite ex116, to terminals and the like in a hotspot inside an aircraft ex117.

Wireless access points, hotspots, or the like may be used in place of the base stations ex106 to ex110. The streaming server ex103 may be connected directly to the communication network ex104 without going through the Internet ex101 or the Internet service provider ex102, and may be connected directly to the aircraft ex117 without going through the satellite ex116.

The camera ex113 is a device, such as a digital camera, capable of capturing still images and moving pictures. The smartphone ex115 is a smartphone, a mobile phone, a PHS (Personal Handyphone System) terminal, or the like that supports the mobile communication systems generally called 2G, 3G, 3.9G, and 4G, as well as the system that will be called 5G.

The home appliance ex118 is, for example, a refrigerator or a device included in a household fuel cell cogeneration system.

In the content supply system ex100, a terminal with an image capture function connects to the streaming server ex103 through the base station ex106 or the like, which makes live delivery and the like possible. In live delivery, a terminal (the computer ex111, the game console ex112, the camera ex113, the home appliance ex114, the smartphone ex115, a terminal inside the aircraft ex117, or the like) performs the encoding processing described in the embodiment and each modification above on still image or moving picture content captured by the user with that terminal, multiplexes the video data obtained by the encoding with audio data obtained by encoding the sound corresponding to the video, and transmits the resulting data to the streaming server ex103. That is, each terminal functions as an image encoding device according to one aspect of the present invention.

The streaming server ex103, in turn, streams the transmitted content data to clients that have requested it. The clients are the computer ex111, the game console ex112, the camera ex113, the home appliance ex114, the smartphone ex115, terminals inside the aircraft ex117, and the like, each capable of decoding the data encoded as described above. Each device that receives the delivered data decodes and plays it back. That is, each device functions as an image decoding device according to one aspect of the present invention.

[Distributed Processing]

The streaming server ex103 may also be realized as a plurality of servers or computers that process, record, and distribute data in a distributed manner. For example, the streaming server ex103 may be realized by a CDN (Contents Delivery Network), in which content is distributed through a network connecting a large number of edge servers spread around the world. In a CDN, a physically nearby edge server is dynamically assigned according to the client, and latency can be reduced by caching and delivering content at that edge server. In addition, when some error occurs or the communication state changes, for example because of increased traffic, processing can be spread across a plurality of edge servers, or delivery can be switched to another edge server so that distribution continues while bypassing the failed part of the network. High-speed, stable distribution can thus be achieved.
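The edge-server assignment and failover described above can be sketched as follows. This is only an illustration under stated assumptions: the `EdgeServer` class, the planar coordinates, and the `healthy` flag are hypothetical and are not part of the disclosure, and a real CDN would use network measurements rather than straight-line distance.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class EdgeServer:
    name: str
    x: float              # hypothetical planar position of the server
    y: float
    healthy: bool = True  # False once a failure has been detected


def pick_edge(servers, client_x, client_y):
    """Return the nearest healthy edge server, or None if none is available."""
    candidates = [s for s in servers if s.healthy]
    if not candidates:
        return None
    return min(candidates, key=lambda s: hypot(s.x - client_x, s.y - client_y))


servers = [EdgeServer("edge-a", 0, 0), EdgeServer("edge-b", 10, 0), EdgeServer("edge-c", 3, 4)]
primary = pick_edge(servers, 1, 1)   # nearest server to the client
primary.healthy = False              # simulate a failure on that server
fallback = pick_edge(servers, 1, 1)  # delivery switches to the next nearest server
```

Marking a server unhealthy and selecting again models the switch of the delivering server to another edge server.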

Beyond distributing the delivery itself, the encoding of captured data may be performed at each terminal, on the server side, or shared between the two. As one example, encoding generally uses two processing loops. The first loop detects the complexity or the code amount of the image on a frame or scene basis. The second loop performs processing that maintains image quality while improving coding efficiency. For example, having the terminal perform the first encoding pass and the server that receives the content perform the second pass reduces the processing load on each terminal while improving the quality and efficiency of the content. In this case, if there is a request to receive and decode in near real time, the data from the terminal's first encoding pass can be received and played by another terminal, which allows more flexible real-time distribution.
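The two-loop split can be illustrated with a toy sketch. It assumes, purely for illustration, that complexity is the mean absolute difference from the previous frame and that the second loop divides a bit budget in proportion to that complexity; the disclosure does not specify either measure, and frames are modeled as flat lists of sample values.

```python
def first_pass(frames):
    """Loop 1 (e.g. on the terminal): estimate a complexity value per frame."""
    complexities = []
    previous = None
    for frame in frames:
        if previous is None:
            # No reference yet: measure the frame against zero.
            value = sum(abs(p) for p in frame) / len(frame)
        else:
            value = sum(abs(a - b) for a, b in zip(frame, previous)) / len(frame)
        complexities.append(value)
        previous = frame
    return complexities


def second_pass(complexities, total_bits):
    """Loop 2 (e.g. on the server): split the bit budget by measured complexity."""
    total = sum(complexities)
    if total == 0:
        return [total_bits / len(complexities)] * len(complexities)
    return [total_bits * c / total for c in complexities]


frames = [[10, 10, 10, 10], [10, 10, 10, 10], [20, 20, 20, 20]]
budget = second_pass(first_pass(frames), total_bits=1000)
```

A practical rate controller would also enforce a minimum allocation per frame; that is omitted here to keep the two roles visible.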

As another example, the camera ex113 or the like extracts feature quantities from an image, compresses data about the feature quantities as metadata, and transmits it to the server. The server performs compression suited to the meaning of the image, for example by judging the importance of an object from the feature quantities and switching the quantization accuracy accordingly. Feature quantity data is particularly effective for improving the accuracy and efficiency of motion vector prediction when the server compresses the data again. Alternatively, the terminal may perform simple coding such as VLC (variable length coding), and the server may perform coding with a higher processing load such as CABAC (context-adaptive binary arithmetic coding).
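Switching the quantization accuracy by object importance might look like the sketch below. The importance score in [0, 1], the `base_qp` and `max_delta` parameters, and the linear mapping are all assumptions made for illustration; only the 0 to 51 quantization parameter range follows H.264/HEVC practice for 8-bit video.

```python
def qp_for_region(importance, base_qp=32, max_delta=8):
    """Map an importance score in [0, 1] to a quantization parameter.

    More important regions receive a lower QP, i.e. finer quantization.
    """
    importance = min(1.0, max(0.0, importance))
    qp = base_qp + max_delta - round(2 * max_delta * importance)
    return min(51, max(0, qp))


background_qp = qp_for_region(0.0)  # coarsest quantization
subject_qp = qp_for_region(1.0)     # finest quantization
```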

As yet another example, in a stadium, shopping mall, factory, or the like, a plurality of terminals may capture a plurality of videos of almost the same scene. In this case, the encoding can be distributed among the terminals that captured the videos and, as needed, other terminals that did not capture and the server, with work assigned in units such as a GOP (Group of Picture), a picture, or a tile obtained by dividing a picture. This reduces delay and brings the processing closer to real time.
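A minimal sketch of assigning encoding work in GOP units is given below. The round-robin policy and the worker names are assumptions; the disclosure only says that work is assigned per GOP, picture, or tile.

```python
def assign_gops(num_frames, gop_size, workers):
    """Split frame indices into GOPs and hand them to workers round-robin.

    Returns a dict mapping each worker to a list of (start, end) frame
    ranges, where end is exclusive.
    """
    assignment = {worker: [] for worker in workers}
    for gop_index, start in enumerate(range(0, num_frames, gop_size)):
        end = min(start + gop_size, num_frames)
        worker = workers[gop_index % len(workers)]
        assignment[worker].append((start, end))
    return assignment


plan = assign_gops(num_frames=10, gop_size=4, workers=["terminal-1", "server"])
```

The same loop applies to picture or tile units by changing what one work item represents.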

Because the plurality of videos show almost the same scene, the server may also manage and/or instruct the terminals so that the videos captured at the respective terminals can reference one another. Alternatively, the server may receive the encoded data from each terminal and then change the reference relationships among the plurality of data, or correct or replace pictures themselves and re-encode them. This makes it possible to generate streams in which the quality and efficiency of each individual piece of data is improved.

The server may also distribute video data after transcoding it, that is, after changing its coding scheme. For example, the server may convert an MPEG-family coding scheme into a VP-family scheme, or convert H.264 into H.265.

In this way, encoding can be performed by a terminal or by one or more servers. Accordingly, although the description below names a "server" or a "terminal" as the entity that performs a given process, part or all of the processing attributed to the server may be performed by the terminal, and part or all of the processing attributed to the terminal may be performed by the server. The same applies to decoding.

[3D, multi-angle]

In recent years, it has become increasingly common to combine and use images or videos of different scenes, or of the same scene from different angles, captured by a plurality of terminals such as cameras ex113 and/or smartphones ex115 that are almost synchronized with one another. The videos captured by the terminals are integrated on the basis of, for example, the relative positional relationship between the terminals obtained separately, or regions in which feature points contained in the videos match.

In addition to encoding two-dimensional moving images, the server may encode a still image and transmit it to the receiving terminal, either automatically on the basis of scene analysis of the moving image or at a time specified by the user. Furthermore, when the server can obtain the relative positional relationship between the capturing terminals, it can go beyond two-dimensional moving images and generate the three-dimensional shape of a scene from videos of the same scene captured from different angles. The server may also separately encode three-dimensional data generated from a point cloud or the like, and may select, or reconstruct and generate, the video to be transmitted to the receiving terminal from the videos captured by the plurality of terminals, based on the result of recognizing or tracking a person or object using the three-dimensional data.

In this way, the user can enjoy a scene by freely selecting the video corresponding to each capturing terminal, and can also enjoy content in which a video from an arbitrary viewpoint is cut out of three-dimensional data reconstructed from a plurality of images or videos. As with video, sound may be recorded from a plurality of different angles, and the server may multiplex the sound from a specific angle or space with the matching video and transmit the result.

In recent years, content that associates the real world with a virtual world, such as virtual reality (VR) and augmented reality (AR), has also become widespread. For VR images, the server may create separate viewpoint images for the right eye and the left eye and encode them with a scheme that permits reference between the viewpoint videos, such as Multi-View Coding (MVC), or may encode them as separate streams that do not reference each other. When the separate streams are decoded, they are played in synchronization so that a virtual three-dimensional space is reproduced according to the user's viewpoint.

For AR images, the server superimposes virtual object information in a virtual space onto camera information of the real space, based on the three-dimensional position or the movement of the user's viewpoint. The decoding device may obtain or hold the virtual object information and the three-dimensional data, generate two-dimensional images according to the movement of the user's viewpoint, and create the superimposed data by joining them smoothly. Alternatively, the decoding device may transmit the movement of the user's viewpoint to the server along with a request for virtual object information; the server then creates superimposed data from the three-dimensional data it holds to match the received viewpoint movement, encodes the superimposed data, and distributes it to the decoding device. The superimposed data may carry, in addition to RGB, an α value indicating transparency; the server sets the α value of the parts other than the object created from the three-dimensional data to, for example, 0, and encodes the data with those parts transparent. Alternatively, the server may set a predetermined RGB value as the background, as in chroma keying, and generate data in which the parts other than the object take the background color.
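How an α value of 0 leaves the camera image visible outside the object can be shown with a small compositing sketch. The pixel representation (tuples of 8-bit RGB values, α as a float in [0, 1]) and the function name are assumptions for illustration; this is standard "over" blending, not a method specified by the disclosure.

```python
def composite(camera, overlay):
    """Blend RGBA overlay pixels over RGB camera pixels.

    camera:  list of (r, g, b) tuples
    overlay: list of (r, g, b, a) tuples with a in [0, 1];
             a = 0 means fully transparent, so the camera pixel shows through.
    """
    blended = []
    for (cr, cg, cb), (r, g, b, a) in zip(camera, overlay):
        blended.append((
            round(a * r + (1 - a) * cr),
            round(a * g + (1 - a) * cg),
            round(a * b + (1 - a) * cb),
        ))
    return blended


camera_pixels = [(100, 100, 100), (100, 100, 100)]
overlay_pixels = [(255, 0, 0, 1.0),  # part of the virtual object: opaque
                  (0, 0, 0, 0.0)]    # outside the object: alpha set to 0
result = composite(camera_pixels, overlay_pixels)
```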

Similarly, decoding of the distributed data may be performed at each client terminal, on the server side, or shared between them. As one example, a terminal may first send a reception request to the server, another terminal may receive and decode the requested content, and the decoded signal may then be sent to a device having a display. Distributing the processing and selecting suitable content regardless of the performance of the communicating terminal itself makes it possible to play data with good image quality. As another example, large image data may be received by a TV or the like while a partial region, such as a tile obtained by dividing the picture, is decoded and displayed on a viewer's personal terminal. This lets viewers share the overall picture while checking, close at hand, the area they are responsible for or a region they want to examine in more detail.

In the future, it is expected that, wherever multiple short-range, medium-range, or long-range wireless links are available indoors or outdoors, content will be received seamlessly by using a distribution system standard such as MPEG-DASH and switching to data suited to the link currently in use. The user can then switch in real time while freely selecting not only his or her own terminal but also decoding devices or display devices such as displays installed indoors or outdoors. Decoding can also be performed while switching the terminal that decodes and the terminal that displays, based on, for example, the user's own position information. This makes it possible, while traveling to a destination, to display map information on part of the wall or ground of a nearby building in which a displayable device is embedded. It is also possible to switch the bit rate of the received data according to how easily the encoded data can be accessed on the network, for example when the encoded data is cached on a server that the receiving terminal can access quickly, or copied to an edge server of a content delivery service.

[Scalable coding]

Content switching will be described using the scalable stream shown in FIG. 31, which is compression-encoded by applying the moving image encoding method described in the above embodiment and its modifications. The server may hold a plurality of streams with the same content but different quality as individual streams. Alternatively, content may be switched by exploiting the temporal/spatial scalability of a stream that is encoded in layers as illustrated. That is, the decoding side decides which layer to decode up to according to internal factors such as its own performance and external factors such as the state of the communication band, and can thereby switch freely between low-resolution and high-resolution content. For example, when a user wants to continue watching at home, on a device such as an Internet TV, a video that he or she had been watching on the smartphone ex115 while on the move, that device only needs to decode the same stream up to a different layer, which reduces the load on the server side.
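The decoder-side choice of how many layers to decode can be sketched as follows. The `Layer` fields, the notion of a numeric decode cost, and the capacity parameter are hypothetical; they stand in for the internal factor (device performance) and the external factor (available bandwidth) named above.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    layer_id: int         # 0 is the base layer
    cumulative_kbps: int  # bandwidth needed to receive layers 0..layer_id
    decode_cost: int      # hypothetical cost of decoding layers 0..layer_id


def highest_decodable_layer(layers, bandwidth_kbps, device_capacity):
    """Return the id of the highest layer that fits both constraints.

    The base layer is always selected; enhancement layers are added in
    order until one exceeds the bandwidth or the device capacity.
    """
    ordered = sorted(layers, key=lambda layer: layer.layer_id)
    chosen = ordered[0].layer_id
    for layer in ordered[1:]:
        if layer.cumulative_kbps > bandwidth_kbps or layer.decode_cost > device_capacity:
            break
        chosen = layer.layer_id
    return chosen


stream = [Layer(0, 500, 1), Layer(1, 1500, 2), Layer(2, 4000, 4)]
on_the_move = highest_decodable_layer(stream, bandwidth_kbps=2000, device_capacity=2)
at_home = highest_decodable_layer(stream, bandwidth_kbps=5000, device_capacity=4)
```

Both devices read the same stream; only the stopping layer differs, which is why the server does not need to prepare separate streams.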

Beyond the scalable configuration described above, in which pictures are encoded layer by layer and an enhancement layer sits above the base layer, the enhancement layer may contain meta-information based on, for example, image statistics, and the decoding side may generate higher-quality content by applying super-resolution to base-layer pictures according to that meta-information. Super-resolution here may mean either an improvement in the signal-to-noise ratio at the same resolution or an increase in resolution. The meta-information includes, for example, information identifying the linear or nonlinear filter coefficients used in the super-resolution processing, or information identifying parameter values for the filtering, machine learning, or least-squares computation used in the super-resolution processing.

Alternatively, a picture may be divided into tiles or the like according to the meaning of objects in the image, and the decoding side may decode only a partial region by selecting the tiles to decode. Furthermore, by storing object attributes (person, car, ball, and so on) and positions within the video (such as coordinates within the same image) as meta-information, the decoding side can identify the position of a desired object from the meta-information and determine the tiles that contain it. For example, as shown in FIG. 32, the meta-information is stored using a data storage structure separate from the pixel data, such as an SEI message in HEVC. This meta-information indicates, for example, the position, size, or color of the main object.
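Determining which tiles contain an object from its position in the meta-information can be sketched as below. The uniform tile grid and the bounding box format `(x, y, width, height)` in pixels are assumptions; HEVC also permits non-uniform tile spacing, which this sketch does not cover.

```python
def tiles_for_object(bbox, picture_w, picture_h, cols, rows):
    """Return the (col, row) indices of uniform tiles that overlap bbox.

    bbox is (x, y, width, height) in pixels, taken from the meta-information.
    """
    x, y, w, h = bbox
    tile_w = picture_w / cols
    tile_h = picture_h / rows
    first_col = int(x // tile_w)
    last_col = min(cols - 1, int((x + w - 1) // tile_w))
    first_row = int(y // tile_h)
    last_row = min(rows - 1, int((y + h - 1) // tile_h))
    return [(c, r)
            for r in range(first_row, last_row + 1)
            for c in range(first_col, last_col + 1)]


# A 1920x1080 picture split into 4x2 tiles of 480x540 pixels each.
needed = tiles_for_object((1000, 100, 50, 50), 1920, 1080, cols=4, rows=2)
```

Only the tiles in the returned list need to be decoded to display the object.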

The meta-information may also be stored in units consisting of a plurality of pictures, such as a stream, a sequence, or a random access unit. The decoding side can then obtain, for example, the time at which a specific person appears in the video and, by matching this against picture-level information, identify the picture in which the object is present and the object's position within that picture.

[Web page optimization]

FIG. 33 shows an example of a web page displayed on the computer ex111 or the like. FIG. 34 shows an example of a web page displayed on the smartphone ex115 or the like. As shown in FIG. 33 and FIG. 34, a web page may contain a plurality of link images, each a link to image content, and its appearance differs depending on the device used to view it. When a plurality of link images are visible on the screen, then until the user explicitly selects a link image, or until a link image approaches the center of the screen or fits entirely within the screen, the display device (decoding device) displays a still image or I-picture (intra picture) of each content item as the link image, displays video resembling an animated GIF using a plurality of still images or I-pictures, or receives only the base layer and decodes and displays the video.

When a link image is selected by the user, the display device decodes the base layer with the highest priority. If the HTML constituting the web page contains information indicating that the content is scalable, the display device may decode up to the enhancement layer. Furthermore, to ensure real-time performance, before a selection is made or when the communication band is severely constrained, the display device can reduce the delay between the decoding time and the display time of the first picture (the delay from the start of decoding the content to the start of display) by decoding and displaying only forward-reference pictures (I-pictures, P-pictures (predictive pictures), and B-pictures (bidirectionally predictive pictures) that use forward reference only). The display device may also deliberately ignore the reference relationships between pictures, roughly decode all B-pictures and P-pictures as forward-reference pictures, and then perform normal decoding as time passes and more pictures are received.

[Autonomous driving]

When still images or video data such as two-dimensional or three-dimensional map information are transmitted and received for autonomous driving or driving assistance of a vehicle, the receiving terminal may receive, in addition to image data belonging to one or more layers, information such as weather or road construction as meta-information, and decode them in association with each other. The meta-information may belong to a layer or may simply be multiplexed with the image data.

In this case, because the car, drone, aircraft, or the like that contains the receiving terminal moves, the receiving terminal transmits its position information when it makes a reception request, which enables seamless reception and decoding while switching among the base stations ex106 to ex110. The receiving terminal can also dynamically switch how much meta-information it receives, or how far it updates the map information, according to the user's selection, the user's situation, or the state of the communication band.

As described above, in the content supply system ex100, a client can receive, decode, and play in real time the encoded information transmitted by a user.

[Distribution of personal content]

In the content supply system ex100, not only high-quality, long content from video distributors but also low-quality, short content from individuals can be distributed by unicast or multicast. Such personal content is expected to keep increasing. To make personal content better, the server may perform editing before encoding. This can be realized, for example, with the following configuration.

In real time during capture, or after capture once the data has been accumulated, the server performs recognition processing such as capture-error detection, scene search, meaning analysis, and object detection on the original images or the encoded data. Based on the recognition results, the server then edits the content manually or automatically, for example by correcting defocus or camera shake, deleting scenes of low importance such as scenes that are darker than other pictures or out of focus, emphasizing object edges, or changing the color tone. The server encodes the edited data based on the editing result. It is also well known that viewership drops when a video runs too long, so the server may, based on the image-processing results, automatically cut not only the low-importance scenes mentioned above but also scenes with little motion, so that the content fits within a specific time range that depends on the capture time. Alternatively, the server may generate and encode a digest based on the result of the meaning analysis of the scenes.

Personal content may, as captured, include material that infringes copyright, moral rights of authors, portrait rights, or the like, and it may also be inconvenient for the individual when, for example, the content is shared beyond the intended range. Therefore, for example, the server may deliberately change faces of people at the periphery of the screen, the inside of a house, or the like into out-of-focus images before encoding. The server may also recognize whether the face of a person other than a person registered in advance appears in the image to be encoded and, if so, apply processing such as a mosaic to the face. Alternatively, as pre-processing or post-processing for encoding, the user may, from the viewpoint of copyright or the like, specify a person or background region in the image to be processed, and the server may replace the specified region with another video or blur it. For a person, the image of the face can be replaced while the person is tracked through the moving image.
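The mosaic processing applied to a face region before encoding can be sketched as follows. The grayscale image modeled as a 2D list, the region format `(x, y, width, height)`, and the block-averaging approach are illustrative assumptions; face detection itself is outside the scope of the sketch.

```python
def mosaic(image, region, block):
    """Return a copy of image with region pixelated in block x block cells.

    image:  2D list of integer sample values (one grayscale plane)
    region: (x, y, width, height) of the area to hide, e.g. a detected face
    """
    x0, y0, w, h = region
    out = [row[:] for row in image]
    for by in range(y0, y0 + h, block):
        for bx in range(x0, x0 + w, block):
            ys = range(by, min(by + block, y0 + h))
            xs = range(bx, min(bx + block, x0 + w))
            values = [image[y][x] for y in ys for x in xs]
            mean = sum(values) // len(values)
            # Every pixel in the cell takes the cell's mean value.
            for y in ys:
                for x in xs:
                    out[y][x] = mean
    return out


picture = [[0, 10, 20, 30],
           [40, 50, 60, 70]]
hidden = mosaic(picture, region=(0, 0, 2, 2), block=2)
```

Pixels outside the region are left untouched, so only the specified area loses detail.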

Because viewing of personal content with a small data size demands strong real-time performance, the decoding device, depending on the bandwidth, first receives the base layer with the highest priority and decodes and plays it. The decoding device may receive the enhancement layer during this period and, when the content is played two or more times, as in loop playback, play high-quality video that includes the enhancement layer. A stream coded scalably in this way can offer an experience in which the video is rough when unselected or first viewed but the stream progressively becomes smarter and the image improves. Besides scalable coding, the same experience can be offered by composing, as a single stream, a rough stream played the first time and a second stream encoded with reference to the first.

[Other use cases]

These encoding and decoding processes are generally performed in the LSI ex500 included in each terminal. The LSI ex500 may be a single chip or may be composed of a plurality of chips. Alternatively, software for moving image encoding or decoding may be stored on some recording medium readable by the computer ex111 or the like (such as a CD-ROM, flexible disk, or hard disk), and the encoding or decoding may be performed using that software. Furthermore, when the smartphone ex115 is equipped with a camera, moving image data obtained by the camera may be transmitted. In that case, the moving image data is data encoded by the LSI ex500 of the smartphone ex115.

The LSI ex500 may also be configured to download and activate application software. In this case, the terminal first determines whether it supports the coding scheme of the content or whether it is capable of executing a specific service. If the terminal does not support the coding scheme of the content or is not capable of executing the specific service, the terminal downloads a codec or application software and then obtains and plays the content.

At least one of the moving image encoding device (image encoding device) and the moving image decoding device (image decoding device) of the above embodiment and its modifications can be incorporated not only into the content supply system ex100 over the Internet ex101 but also into a digital broadcasting system. Because multiplexed data in which video and sound are multiplexed is carried on broadcast radio waves and transmitted and received via a satellite or the like, a digital broadcasting system is better suited to multicast, whereas the content supply system ex100 lends itself to unicast; the encoding and decoding processes, however, can be applied in the same way.

[Hardware configuration]

FIG. 35 shows the smartphone ex115. FIG. 36 shows a configuration example of the smartphone ex115. The smartphone ex115 includes an antenna ex450 for transmitting and receiving radio waves to and from the base station ex110, a camera unit ex465 capable of capturing video and still images, and a display unit ex458 that displays decoded data such as video captured by the camera unit ex465 and video received by the antenna ex450. The smartphone ex115 further includes an operation unit ex466 such as a touch panel; a sound output unit ex457 such as a speaker for outputting voice or other sound; a sound input unit ex456 such as a microphone for inputting sound; a memory unit ex467 capable of storing encoded or decoded data such as captured video or still images, recorded sound, received video or still images, and mail; and a slot unit ex464 serving as the interface with the SIM ex468, which identifies the user and authenticates access to the network and to various data. An external memory may be used in place of the memory unit ex467.

The main control unit ex460, which comprehensively controls the display unit ex458, the operation unit ex466, and the like, is connected through the bus ex470 to the power supply circuit unit ex461, the operation input control unit ex462, the video signal processing unit ex455, the camera interface unit ex463, the display control unit ex459, the modulation/demodulation unit ex452, the multiplexing/demultiplexing unit ex453, the sound signal processing unit ex454, the slot unit ex464, and the memory unit ex467.

When the power key is turned on by a user operation, the power supply circuit unit ex461 supplies power from the battery pack to each unit, starting the smartphone ex115 into an operable state.

The smartphone ex115 performs processing such as calls and data communication under the control of the main control unit ex460, which includes a CPU, ROM, RAM, and the like. During a call, the audio signal picked up by the audio input unit ex456 is converted into a digital audio signal by the audio signal processing unit ex454, subjected to spread-spectrum processing by the modulation/demodulation unit ex452, subjected to digital-to-analog conversion and frequency conversion by the transmission/reception unit ex451, and then transmitted via the antenna ex450. Received data is amplified, subjected to frequency conversion and analog-to-digital conversion, despread by the modulation/demodulation unit ex452, converted into an analog audio signal by the audio signal processing unit ex454, and then output from the audio output unit ex457. In data communication mode, text, still-image, or video data is sent to the main control unit ex460 via the operation input control unit ex462 in response to operation of the operation unit ex466 or the like of the main body, and transmission and reception are processed in the same way. When video, still images, or video and audio are transmitted in data communication mode, the video signal processing unit ex455 compression-encodes the video signal stored in the memory unit ex467 or the video signal input from the camera unit ex465, using the moving picture encoding method described in the above embodiment and its variations, and sends the encoded video data to the multiplexing/demultiplexing unit ex453. The audio signal processing unit ex454 encodes the audio signal picked up by the audio input unit ex456 while the camera unit ex465 is capturing video or still images, and sends the encoded audio data to the multiplexing/demultiplexing unit ex453. The multiplexing/demultiplexing unit ex453 multiplexes the encoded video data and the encoded audio data in a predetermined manner; the result undergoes modulation and conversion in the modulation/demodulation unit (modulation/demodulation circuit unit) ex452 and the transmission/reception unit ex451 and is transmitted via the antenna ex450.
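As a non-limiting illustration, the multiplexing performed by the multiplexing/demultiplexing unit ex453, together with the inverse separation needed on the receiving side, might be sketched as follows. The disclosure only says the data is multiplexed "in a predetermined manner", so the framing used here (a one-byte stream identifier followed by a two-byte payload length) and the names `VIDEO_ID`, `AUDIO_ID`, `multiplex`, and `demultiplex` are purely hypothetical assumptions.

```python
import struct

# Hypothetical stream identifiers; the disclosure does not define a container format.
VIDEO_ID = 0
AUDIO_ID = 1


def multiplex(video_packets, audio_packets):
    """Interleave encoded video and audio packets into a single byte stream.

    Each packet is framed as: 1-byte stream id, 2-byte big-endian length, payload.
    Payloads must therefore be at most 65535 bytes long.
    """
    out = bytearray()
    for i in range(max(len(video_packets), len(audio_packets))):
        for stream_id, packets in ((VIDEO_ID, video_packets), (AUDIO_ID, audio_packets)):
            if i < len(packets):
                payload = packets[i]
                out += struct.pack(">BH", stream_id, len(payload)) + payload
    return bytes(out)


def demultiplex(muxed):
    """Split a multiplexed byte stream back into (video_packets, audio_packets)."""
    streams = {VIDEO_ID: [], AUDIO_ID: []}
    pos = 0
    while pos < len(muxed):
        stream_id, length = struct.unpack_from(">BH", muxed, pos)
        pos += 3  # size of the hypothetical header
        streams[stream_id].append(muxed[pos:pos + length])
        pos += length
    return streams[VIDEO_ID], streams[AUDIO_ID]
```

In this sketch the two streams are simply interleaved packet by packet; a practical system would also carry timing information so that video and audio can be synchronized.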

When video attached to an e-mail or chat message, or video linked from a web page or the like, is received, the multiplexed data received via the antenna ex450 must be decoded. To do so, the multiplexing/demultiplexing unit ex453 demultiplexes the multiplexed data into a bitstream of video data and a bitstream of audio data, supplies the encoded video data to the video signal processing unit ex455 via the synchronous bus ex470, and supplies the encoded audio data to the audio signal processing unit ex454. The video signal processing unit ex455 decodes the video signal using a moving picture decoding method corresponding to the moving picture encoding method described in the above embodiment and its variations, and the video or still image contained in the linked moving picture file is displayed on the display unit ex458 via the display control unit ex459. The audio signal processing unit ex454 decodes the audio signal, and the audio is output from the audio output unit ex457. Because real-time streaming has become widespread, audio playback may, depending on the user's circumstances, occur in places where producing sound is socially inappropriate. As a default, it is therefore preferable to play back only the video data without playing the audio signal. The audio may be played back in synchronization only when the user performs an operation such as clicking on the video data.
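The default behavior described above, in which only video is played until the user acts, might be sketched as follows. This is a minimal illustration under stated assumptions: the class name `StreamPlayer` and its methods are hypothetical and do not correspond to any unit named in the disclosure.

```python
from dataclasses import dataclass


@dataclass
class StreamPlayer:
    """Hypothetical playback state for a received video stream."""

    video_playing: bool = False
    audio_enabled: bool = False  # default: audio stays off

    def start(self):
        # Start playback with video only; audio remains off by default.
        self.video_playing = True

    def on_user_select(self):
        # The user clicked on the video, so audio may now be played in sync.
        self.audio_enabled = True
```

Keeping the audio flag separate from the video flag means that starting playback never produces sound on its own; only an explicit user action does.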

Although the smartphone ex115 has been used as an example here, three implementation forms of terminal are conceivable: a transmitting/receiving terminal that has both an encoder and a decoder, a transmitting terminal that has only an encoder, and a receiving terminal that has only a decoder. Furthermore, the digital broadcasting system has been described as receiving or transmitting multiplexed data in which audio data and the like are multiplexed with video data; however, character data or other data related to the video may be multiplexed in addition to the audio data, and the video data itself may be received or transmitted instead of multiplexed data.

Although the main control unit ex460, which includes a CPU, has been described as controlling the encoding and decoding processing, terminals are often also equipped with a GPU. A configuration is therefore possible in which a large area is processed at once by exploiting the performance of the GPU, using memory shared by the CPU and the GPU or memory whose addresses are managed so that both can use it. This shortens encoding time, ensures real-time performance, and achieves low latency. In particular, it is efficient to perform motion search, deblocking filtering, SAO (Sample Adaptive Offset), and transform and quantization on the GPU rather than the CPU, all at once in units of pictures or the like.

INDUSTRIAL APPLICABILITY

The present disclosure can be used in, for example, television receivers, digital video recorders, car navigation systems, mobile phones, digital cameras, and digital video cameras.

10~23‧‧‧Block
100‧‧‧Encoding device
102‧‧‧Splitting unit
104‧‧‧Subtraction unit
106, 106A, 106B, 106C‧‧‧Transform unit
108‧‧‧Quantization unit
110‧‧‧Entropy encoding unit
112, 204‧‧‧Inverse quantization unit
114, 206, 206A, 206B, 206C‧‧‧Inverse transform unit
116, 208‧‧‧Addition unit
118, 210‧‧‧Block memory
120, 212‧‧‧Loop filter unit
122, 214‧‧‧Frame memory
124, 216‧‧‧Intra prediction unit
126, 218‧‧‧Inter prediction unit
128, 220‧‧‧Prediction control unit
200‧‧‧Decoding device
202‧‧‧Entropy decoding unit
1061, 1061A, 2061‧‧‧Intra/inter determination unit
1062, 1062A, 1062C, 2062, 2062A, 2062C‧‧‧Basis selection unit
1063‧‧‧Frequency transform unit
1064A, 2064A‧‧‧Size determination unit
1065B‧‧‧Threshold size determination unit
1066C, 2066C‧‧‧Transform mode determination unit
2063‧‧‧Inverse frequency transform unit
2065B‧‧‧Threshold size acquisition unit
MV0, MV1, MVx₀, MVy₀, MVx₁, MVy₁, v0, v1‧‧‧Motion vector
Ref0, Ref1‧‧‧Reference picture
TD0, TD1, τ₀, τ₁‧‧‧Distance
S101~S104, S201~S204, S111, S121, S131, S132, S133, S211, S221, S232, S233‧‧‧Step
ex100‧‧‧Content supply system
ex101‧‧‧Internet
ex102‧‧‧Internet service provider
ex103‧‧‧Streaming server
ex104‧‧‧Communication network
ex106, ex107, ex108, ex109, ex110‧‧‧Base station
ex111‧‧‧Computer
ex112‧‧‧Game console
ex113‧‧‧Camera
ex114‧‧‧Home appliance
ex115‧‧‧Smartphone
ex116‧‧‧Satellite
ex117‧‧‧Airplane
ex450‧‧‧Antenna
ex451‧‧‧Transmission/reception unit
ex452‧‧‧Modulation/demodulation unit (modulation/demodulation circuit unit)
ex453‧‧‧Multiplexing/demultiplexing unit
ex454‧‧‧Audio signal processing unit
ex455‧‧‧Video signal processing unit
ex456‧‧‧Audio input unit
ex457‧‧‧Audio output unit
ex458‧‧‧Display unit
ex459‧‧‧Display control unit
ex460‧‧‧Main control unit
ex461‧‧‧Power supply circuit unit
ex462‧‧‧Operation input control unit
ex463‧‧‧Camera interface unit
ex464‧‧‧Slot unit
ex465‧‧‧Camera unit
ex466‧‧‧Operation unit
ex467‧‧‧Memory unit
ex468‧‧‧SIM
ex470‧‧‧Bus
ex500‧‧‧LSI

FIG. 1 is a block diagram showing the functional configuration of the encoding device according to Embodiment 1.
FIG. 2 is a diagram showing an example of block splitting in Embodiment 1.
FIG. 3 is a table showing the transform basis functions corresponding to each transform type.
FIG. 4A is a diagram showing an example of the shape of a filter used in ALF.
FIG. 4B is a diagram showing another example of the shape of a filter used in ALF.
FIG. 4C is a diagram showing another example of the shape of a filter used in ALF.
FIG. 5 is a diagram showing the 67 intra prediction modes used in intra prediction.
FIG. 6 is a diagram for explaining pattern matching (bilateral matching) between two blocks along a motion trajectory.
FIG. 7 is a diagram for explaining pattern matching (template matching) between a template in the current picture and a block in a reference picture.
FIG. 8 is a diagram for explaining a model that assumes uniform linear motion.
FIG. 9 is a diagram for explaining the derivation of a motion vector for each sub-block based on the motion vectors of a plurality of neighboring blocks.
FIG. 10 is a block diagram showing the functional configuration of the decoding device according to Embodiment 1.
FIG. 11 is a block diagram showing the internal configuration of the transform unit of the encoding device according to Embodiment 1.
FIG. 12 is a flowchart showing the processing of the transform unit of the encoding device according to Embodiment 1.
FIG. 13 is a block diagram showing the internal configuration of the inverse transform unit of the decoding device according to Embodiment 1.
FIG. 14 is a flowchart showing the processing of the inverse transform unit of the decoding device according to Embodiment 1.
FIG. 15A is a graph showing the transform characteristics of DCT-II for a block of size 32×32.
FIG. 15B is a graph showing the transform characteristics of DCT-V for a block of size 32×32.
FIG. 16A is a graph showing the transform characteristics of DCT-II for a block of size 4×4.
FIG. 16B is a graph showing the transform characteristics of DCT-V for a block of size 4×4.
FIG. 17 is a block diagram showing the internal configuration of the transform unit of the encoding device according to Variation 1 of Embodiment 1.
FIG. 18 is a flowchart showing the processing of the transform unit of the encoding device according to Variation 1 of Embodiment 1.
FIG. 19 is a block diagram showing the internal configuration of the inverse transform unit of the decoding device according to Variation 1 of Embodiment 1.
FIG. 20 is a flowchart showing the processing of the inverse transform unit of the decoding device according to Variation 1 of Embodiment 1.
FIG. 21 is a block diagram showing the internal configuration of the transform unit of the encoding device according to Variation 2 of Embodiment 1.
FIG. 22 is a diagram showing several examples of the position, within the bitstream, of the threshold size or transform mode information in Variation 2 or 3 of Embodiment 1.
FIG. 23 is a flowchart showing the processing of the transform unit of the encoding device according to Variation 2 of Embodiment 1.
FIG. 24 is a block diagram showing the internal configuration of the inverse transform unit of the decoding device according to Variation 2 of Embodiment 1.
FIG. 25 is a flowchart showing the processing of the inverse transform unit of the decoding device according to Variation 2 of Embodiment 1.
FIG. 26 is a block diagram showing the internal configuration of the transform unit of the encoding device according to Variation 3 of Embodiment 1.
FIG. 27 is a flowchart showing the processing of the transform unit of the encoding device according to Variation 3 of Embodiment 1.
FIG. 28 is a block diagram showing the internal configuration of the inverse transform unit of the decoding device according to Variation 3 of Embodiment 1.
FIG. 29 is a flowchart showing the processing of the inverse transform unit of the decoding device according to Variation 3 of Embodiment 1.
FIG. 30 is a diagram showing the overall configuration of a content supply system that implements a content delivery service.
FIG. 31 is a diagram showing an example of a coding structure in scalable coding.
FIG. 32 is a diagram showing an example of a coding structure in scalable coding.
FIG. 33 is a diagram showing an example of a web page display screen.
FIG. 34 is a diagram showing an example of a web page display screen.
FIG. 35 is a diagram showing an example of a smartphone.
FIG. 36 is a block diagram showing a configuration example of a smartphone.

104‧‧‧Subtraction unit

106‧‧‧Transform unit

108‧‧‧Quantization unit

114‧‧‧Inverse transform unit

1061‧‧‧Intra/inter determination unit

1062‧‧‧Basis selection unit

1063‧‧‧Frequency transform unit

Claims

An encoding device that encodes an encoding target block of an image, the encoding device comprising a processor and memory, wherein the processor, using the memory: determines which of intra prediction and inter prediction is used for the encoding target block; and performs a first frequency transform on a prediction error of the encoding target block by selectively using a plurality of frequency transform bases including DCT-II and DCT-V, and wherein, in the first frequency transform, the DCT-V basis is used when intra prediction is used for the encoding target block, and the DCT-II basis is used when inter prediction is used for the encoding target block.
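By way of non-limiting illustration only, the basis selection and first frequency transform recited above might be sketched as follows. The sketch assumes the orthonormal DCT-II and DCT-V basis functions commonly used in video coding (the table of FIG. 3 is not reproduced in this section, so its exact form is an assumption), a square block, and a separable two-dimensional transform; the function names are hypothetical.

```python
import math


def dct2_basis(n):
    """n x n orthonormal DCT-II matrix; row i is the i-th basis vector (assumed form)."""
    rows = []
    for i in range(n):
        w0 = math.sqrt(0.5) if i == 0 else 1.0
        rows.append([
            w0 * math.sqrt(2.0 / n) * math.cos(math.pi * i * (2 * j + 1) / (2 * n))
            for j in range(n)
        ])
    return rows


def dct5_basis(n):
    """n x n orthonormal DCT-V matrix; row i is the i-th basis vector (assumed form)."""
    scale = 2.0 / math.sqrt(2 * n - 1)
    rows = []
    for i in range(n):
        w0 = math.sqrt(0.5) if i == 0 else 1.0
        row = []
        for j in range(n):
            w1 = math.sqrt(0.5) if j == 0 else 1.0
            row.append(w0 * w1 * scale * math.cos(2.0 * math.pi * i * j / (2 * n - 1)))
        rows.append(row)
    return rows


def select_basis(uses_intra, n):
    """DCT-V for intra-predicted blocks, DCT-II for inter-predicted blocks."""
    return dct5_basis(n) if uses_intra else dct2_basis(n)


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def first_frequency_transform(residual, uses_intra):
    """Separable 2-D transform of a square prediction-error block: T * X * T^T."""
    t = select_basis(uses_intra, len(residual))
    t_transposed = [list(col) for col in zip(*t)]
    return matmul(matmul(t, residual), t_transposed)
```

Because both matrices are orthonormal in this sketch, the choice of basis changes how the prediction error's energy is distributed over the coefficients without changing its total.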

The encoding device according to claim 1, wherein the processor further determines whether the size of the encoding target block is less than or equal to a threshold size, and, in the first frequency transform, when intra prediction is used for the encoding target block: (i) the DCT-V basis is used if the size of the encoding target block is less than or equal to the threshold size; and (ii) the DCT-II basis is used if the size of the encoding target block is greater than the threshold size.
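The size-dependent selection recited above might, purely as an illustration, be expressed as the following decision function. It assumes the block size is represented by a single integer (for example, the block width in samples); the claim does not fix how size is measured, and the function name is hypothetical.

```python
def select_basis_name(uses_intra: bool, block_size: int, threshold_size: int) -> str:
    """Return the name of the basis used in the first frequency transform.

    Intra-predicted blocks no larger than the threshold use DCT-V;
    larger intra-predicted blocks and all inter-predicted blocks use DCT-II.
    """
    if uses_intra and block_size <= threshold_size:
        return "DCT-V"
    return "DCT-II"
```

With a threshold of 8, for instance, an intra-predicted block of size 8 falls on the DCT-V side of the comparison because the condition is "less than or equal to".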

The encoding device according to claim 2, wherein the processor further writes information on the threshold size into a bitstream.

The encoding device according to any one of claims 1 to 3, wherein the processor further determines which of a plurality of transform modes, including a first transform mode and a second transform mode, is applied to the encoding target block, performs the first frequency transform when the first transform mode is applied, and performs a second frequency transform different from the first frequency transform when the second transform mode is applied.

The encoding device according to claim 4, wherein the processor further writes information on the transform mode applied to the encoding target block into a bitstream.

An encoding method for encoding an encoding target block of an image, the encoding method comprising: determining which of intra prediction and inter prediction is used for the encoding target block; and performing a first frequency transform on a prediction error of the encoding target block by selectively using a plurality of frequency transform bases including DCT-II and DCT-V, wherein, in the first frequency transform, the DCT-V basis is used when intra prediction is used for the encoding target block, and the DCT-II basis is used when inter prediction is used for the encoding target block.

A decoding device that decodes a decoding target block of an image, the decoding device comprising a processor and memory, wherein the processor, using the memory: determines which of intra prediction and inter prediction is used for the decoding target block; and performs a first inverse frequency transform on a prediction error of the decoding target block by selectively using a plurality of inverse frequency transform bases including the inverse transforms of DCT-II and DCT-V, and wherein, in the first inverse frequency transform, the basis of the inverse transform of DCT-V is used when intra prediction is used for the decoding target block, and the basis of the inverse transform of DCT-II is used when inter prediction is used for the decoding target block.
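As a non-limiting sketch, the first inverse frequency transform recited above can be illustrated by showing that a decoder applying the same selection rule recovers the prediction error. The sketch assumes orthonormal DCT-II and DCT-V matrices, so that each inverse is simply the transpose, a square block, and a separable transform; all function names are hypothetical.

```python
import math


def basis(kind, n):
    """n x n orthonormal matrix for kind "DCT-II" or "DCT-V" (assumed basis functions)."""
    rows = []
    for i in range(n):
        w0 = math.sqrt(0.5) if i == 0 else 1.0
        row = []
        for j in range(n):
            if kind == "DCT-II":
                value = math.sqrt(2.0 / n) * math.cos(math.pi * i * (2 * j + 1) / (2 * n))
            else:  # "DCT-V"
                w1 = math.sqrt(0.5) if j == 0 else 1.0
                value = (w1 * 2.0 / math.sqrt(2 * n - 1)
                         * math.cos(2.0 * math.pi * i * j / (2 * n - 1)))
            row.append(w0 * value)
        rows.append(row)
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def basis_for(uses_intra, n):
    """Same rule on both sides: DCT-V for intra prediction, DCT-II for inter prediction."""
    return basis("DCT-V" if uses_intra else "DCT-II", n)


def forward_transform(residual, uses_intra):
    """Encoder side: C = T * X * T^T."""
    t = basis_for(uses_intra, len(residual))
    return matmul(matmul(t, residual), transpose(t))


def first_inverse_frequency_transform(coeffs, uses_intra):
    """Decoder side: X = T^T * C * T, valid because T is orthonormal."""
    t = basis_for(uses_intra, len(coeffs))
    return matmul(matmul(transpose(t), coeffs), t)
```

The round trip only works when the decoder's intra/inter determination matches the encoder's, which is why the selection depends solely on information available to both sides.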

The decoding device according to claim 7, wherein the processor further determines whether the size of the decoding target block is less than or equal to a threshold size, and, in the first inverse frequency transform, when intra prediction is used for the decoding target block: (i) the basis of the inverse transform of DCT-V is used if the size of the decoding target block is less than or equal to the threshold size; and (ii) the basis of the inverse transform of DCT-II is used if the size of the decoding target block is greater than the threshold size.

The decoding device according to claim 8, wherein the processor further parses information on the threshold size from a bitstream.

The decoding device according to any one of claims 7 to 9, wherein the processor further determines which of a plurality of transform modes, including a first transform mode and a second transform mode, is applied to the decoding target block, performs the first inverse frequency transform when the first transform mode is applied, and performs a second inverse frequency transform different from the first inverse frequency transform when the second transform mode is applied.

The decoding device according to claim 10, wherein the processor further parses information on the transform mode applied to the decoding target block from a bitstream.

A decoding method for decoding a decoding target block of an image, the decoding method comprising: determining which of intra prediction and inter prediction is used for the decoding target block; and performing a first inverse frequency transform on a prediction error of the decoding target block by selectively using a plurality of inverse frequency transform bases including the inverse transforms of DCT-II and DCT-V, wherein, in the first inverse frequency transform, the basis of the inverse transform of DCT-V is used when intra prediction is used for the decoding target block, and the basis of the inverse transform of DCT-II is used when inter prediction is used for the decoding target block.