TW201834456A - Image coding apparatus, image decoding apparatus, and method


Info

Publication number
TW201834456A
TW201834456A
Authority
TW
Taiwan
Prior art keywords
motion vector
candidate
block
vector predictor
decoding
Prior art date
Application number
TW106140332A
Other languages
Chinese (zh)
Inventor
安倍清史
西孝啓
遠間正真
橋本隆
Original Assignee
美商松下電器(美國)知識產權公司 (Panasonic Intellectual Property Corporation of America)
Priority date
Filing date
Publication date
Application filed by 美商松下電器(美國)知識產權公司
Publication of TW201834456A

Classifications

    • H — ELECTRICITY
    • H04 — ELECTRIC COMMUNICATION TECHNIQUE
    • H04N — PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 — Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 — … using predictive coding
    • H04N19/503 — … involving temporal prediction
    • H04N19/51 — Motion estimation or motion compensation
    • H04N19/513 — Processing of motion vectors
    • H04N19/517 — Processing of motion vectors by encoding
    • H04N19/52 — Processing of motion vectors by encoding by predictive encoding

Abstract

An encoding device (100), when extracting at least one motion vector predictor candidate for a block to be encoded from a plurality of candidate motion vectors, encodes mode information identifying an extraction method, selects, for the block to be encoded, the extraction method identified by the mode information from among a first extraction method and a second extraction method, and extracts the at least one motion vector predictor candidate in accordance with the selected extraction method. The first extraction method is based on an evaluation result for each of the plurality of candidate motion vectors, the evaluation being obtained using a reconstructed image of an encoded region of the moving picture without using the image region of the block to be encoded. The second extraction method is based on a predetermined priority order for the plurality of candidate motion vectors.

Description

Encoding device, decoding device, encoding method, and decoding method

FIELD OF THE INVENTION
The present disclosure relates to an encoding device and the like that encode a moving picture composed of a plurality of pictures.

BACKGROUND
Conventionally, H.265 exists as a standard for encoding moving pictures. H.265 is also known as HEVC (High Efficiency Video Coding).
Prior art documents

Non-patent literature
Non-Patent Document 1: H.265 (ISO/IEC 23008-2 HEVC (High Efficiency Video Coding))

SUMMARY OF THE INVENTION
Problem to be solved by the invention: While a further improvement in coding efficiency is desired, such an improvement in coding efficiency tends to increase the processing load.

Accordingly, the present disclosure provides an encoding device and the like that can potentially improve coding efficiency while suppressing an increase in processing load.
Means for solving the problem

An encoding device according to one aspect of the present disclosure encodes a moving picture and includes a processing circuit and a memory connected to the processing circuit. Using the memory, the processing circuit: obtains a plurality of candidate motion vectors based on the motion vector of each of a plurality of encoded blocks corresponding to a block to be encoded in the moving picture; extracts at least one motion vector predictor candidate for the block to be encoded from the plurality of candidate motion vectors; derives a motion vector of the block to be encoded with reference to a reference picture included in the moving picture; encodes the difference between a motion vector predictor among the extracted at least one motion vector predictor candidate and the derived motion vector of the block to be encoded; and performs motion compensation on the block to be encoded using the derived motion vector. In extracting the at least one motion vector predictor candidate, the processing circuit encodes mode information identifying an extraction method, selects, for the block to be encoded, the extraction method identified by the mode information from among a first extraction method and a second extraction method, and extracts the at least one motion vector predictor candidate in accordance with the selected extraction method. The first extraction method is based on an evaluation result for each of the plurality of candidate motion vectors, the evaluation being obtained using a reconstructed image of an encoded region of the moving picture without using the image region of the block to be encoded. The second extraction method is based on a priority order determined in advance for the plurality of candidate motion vectors.
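The two extraction methods described above can be sketched as follows. This is an illustrative Python sketch, not the claimed implementation: the candidate list, template shape, and cost function are hypothetical, and a SAD (sum of absolute differences) over already-reconstructed template pixels merely stands in for the unspecified "evaluation result" of the first extraction method.

```python
import numpy as np

def extract_by_priority(candidates, num_predictors=2):
    """Second extraction method: pick candidates in a predetermined
    priority order (here, simply the order in which neighbouring blocks
    were scanned), dropping duplicates."""
    unique = []
    for mv in candidates:
        if mv not in unique:
            unique.append(mv)
        if len(unique) == num_predictors:
            break
    return unique

def extract_by_evaluation(candidates, ref_pic, template, tpl_pos, num_predictors=2):
    """First extraction method: score each candidate using only
    reconstructed pixels (a template near the current block), never the
    current block's own pixels, and keep the best-scoring candidates."""
    y0, x0 = tpl_pos
    h, w = template.shape
    def cost(mv):
        dy, dx = mv
        patch = ref_pic[y0 + dy:y0 + dy + h, x0 + dx:x0 + dx + w]
        return np.abs(patch.astype(int) - template.astype(int)).sum()  # SAD
    return sorted(candidates, key=cost)[:num_predictors]

# Toy data: a reconstructed reference picture and a template of
# reconstructed pixels located at (row 2, col 2), size 2x4.
ref = np.arange(100).reshape(10, 10)
tpl = ref[2:4, 2:6]
```

With this toy data, the candidate whose displaced patch best matches the template is ranked first, whereas the priority-based method simply keeps the first distinct candidates in scan order.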

These general or specific aspects may be implemented as a system, a device, a method, an integrated circuit, a computer program, or a non-transitory recording medium such as a computer-readable CD-ROM, or as any combination of systems, devices, methods, integrated circuits, computer programs, and recording media.
Advantageous effects of the invention

An encoding device and the like according to one aspect of the present disclosure can improve coding efficiency while suppressing an increase in processing load.

DESCRIPTION OF EMBODIMENTS
Embodiments are described below in detail with reference to the drawings.

Note that the embodiments described below each show a general or specific example. The numerical values, shapes, materials, constituent elements, the arrangement and connection of the constituent elements, steps, and the order of steps shown in the following embodiments are mere examples and are not intended to limit the scope of the claims. Moreover, among the constituent elements in the following embodiments, those not recited in the independent claims representing the broadest concept are described as optional constituent elements.
(Embodiment 1)

First, an outline of Embodiment 1 is given as one example of an encoding device and a decoding device to which the processes and/or configurations described in the aspects of the present disclosure presented below can be applied. Embodiment 1, however, is merely one example of an encoding device and a decoding device to which those processes and/or configurations can be applied; the processes and/or configurations described in the aspects of the present disclosure can also be implemented in an encoding device and a decoding device different from those of Embodiment 1.

When the processes and/or configurations described in the aspects of the present disclosure are applied to Embodiment 1, any of the following may be performed, for example:
(1) In the encoding device or decoding device of Embodiment 1, replacing, among the plurality of constituent elements of that device, a constituent element corresponding to a constituent element described in an aspect of the present disclosure with the constituent element described in that aspect.
(2) In the encoding device or decoding device of Embodiment 1, after making arbitrary changes, such as adding, replacing, or removing functions or processes, to some of the plurality of constituent elements of that device, replacing a constituent element corresponding to a constituent element described in an aspect of the present disclosure with the constituent element described in that aspect.
(3) In the method implemented by the encoding device or decoding device of Embodiment 1, after making arbitrary changes, such as adding processes or replacing or removing some of the plurality of processes included in the method, replacing a process corresponding to a process described in an aspect of the present disclosure with the process described in that aspect.
(4) Combining some of the plurality of constituent elements of the encoding device or decoding device of Embodiment 1 with a constituent element described in an aspect of the present disclosure, a constituent element having part of the functions of such a constituent element, or a constituent element performing part of the processes performed by such a constituent element.
(5) Combining a constituent element having part of the functions of some of the plurality of constituent elements of the encoding device or decoding device of Embodiment 1, or a constituent element performing part of the processes performed by some of those constituent elements, with a constituent element described in an aspect of the present disclosure, a constituent element having part of the functions of such a constituent element, or a constituent element performing part of the processes performed by such a constituent element.
(6) In the method implemented by the encoding device or decoding device of Embodiment 1, replacing, among the plurality of processes included in the method, a process corresponding to a process described in an aspect of the present disclosure with the process described in that aspect.
(7) Combining some of the plurality of processes included in the method implemented by the encoding device or decoding device of Embodiment 1 with a process described in an aspect of the present disclosure.

Note that the ways in which the processes and/or configurations described in the aspects of the present disclosure are implemented are not limited to the above examples. For example, they may be implemented in a device used for a purpose different from that of the moving picture/picture encoding device or moving picture/picture decoding device disclosed in Embodiment 1, and the processes and/or configurations described in the aspects may each be implemented independently. Processes and/or configurations described in different aspects may also be combined.
[Outline of the encoding device]

First, an outline of the encoding device according to Embodiment 1 is given. FIG. 1 is a block diagram showing the functional configuration of encoding device 100 according to Embodiment 1. Encoding device 100 is a moving picture/picture encoding device that encodes a moving picture/picture on a block-by-block basis.

As shown in FIG. 1, encoding device 100 is a device that encodes a picture on a block-by-block basis, and includes division unit 102, subtraction unit 104, transform unit 106, quantization unit 108, entropy coding unit 110, inverse quantization unit 112, inverse transform unit 114, addition unit 116, block memory 118, loop filter unit 120, frame memory 122, intra prediction unit 124, inter prediction unit 126, and prediction control unit 128.

Encoding device 100 may be implemented as, for example, a general-purpose processor and memory. In this case, when the processor executes a software program stored in the memory, the processor functions as division unit 102, subtraction unit 104, transform unit 106, quantization unit 108, entropy coding unit 110, inverse quantization unit 112, inverse transform unit 114, addition unit 116, loop filter unit 120, intra prediction unit 124, inter prediction unit 126, and prediction control unit 128. Alternatively, encoding device 100 may be implemented as one or more dedicated electronic circuits corresponding to one or more of these constituent elements.

Each constituent element included in encoding device 100 is described below.
[Division unit]

Division unit 102 splits each picture included in the input moving picture into a plurality of blocks and outputs each block to subtraction unit 104. For example, division unit 102 first splits a picture into blocks of a fixed size (for example, 128x128). This fixed-size block is referred to as a coding tree unit (CTU). Division unit 102 then splits each fixed-size block into blocks of variable sizes (for example, 64x64 or smaller) based on recursive quadtree and/or binary tree block splitting. A variable-size block is sometimes referred to as a coding unit (CU), a prediction unit (PU), or a transform unit (TU). Note that in this embodiment, CUs, PUs, and TUs need not be distinguished; some or all of the blocks in a picture may serve as the processing unit for CUs, PUs, and TUs.

FIG. 2 shows one example of block splitting according to Embodiment 1. In FIG. 2, solid lines represent block boundaries resulting from quadtree block splitting, and dashed lines represent block boundaries resulting from binary tree block splitting.

Here, block 10 is a square block of 128x128 pixels (128x128 block). This 128x128 block 10 is first split into four square 64x64 blocks (quadtree block splitting).

The upper-left 64x64 block is further split vertically into two rectangular 32x64 blocks, and the left 32x64 block is further split vertically into two rectangular 16x64 blocks (binary tree block splitting). As a result, the upper-left 64x64 block is split into two 16x64 blocks 11 and 12, and a 32x64 block 13.

The upper-right 64x64 block is split horizontally into two rectangular 64x32 blocks 14 and 15 (binary tree block splitting).

The lower-left 64x64 block is split into four square 32x32 blocks (quadtree block splitting). Of the four 32x32 blocks, the upper-left block and the lower-right block are split further. The upper-left 32x32 block is split vertically into two rectangular 16x32 blocks, and the right 16x32 block is further split horizontally into two 16x16 blocks (binary tree block splitting). The lower-right 32x32 block is split horizontally into two 32x16 blocks (binary tree block splitting). As a result, the lower-left 64x64 block is split into 16x32 block 16, two 16x16 blocks 17 and 18, two 32x32 blocks 19 and 20, and two 32x16 blocks 21 and 22.

The lower-right 64x64 block 23 is not split.

As described above, in FIG. 2, block 10 is split into thirteen variable-size blocks 11 through 23 based on recursive quadtree and binary tree block splitting. This kind of splitting is referred to as QTBT (quad-tree plus binary tree) splitting.
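The recursive QTBT splitting described above can be sketched as a small tree walk. The tree encoding below ("q" for a quadtree split, "v"/"h" for vertical/horizontal binary splits) is an assumption made purely for illustration; `fig2` reproduces the splits of the FIG. 2 example and yields its thirteen leaf blocks.

```python
# Each node is either a leaf (None), ("q", kids) for a quadtree split into
# four equal squares, or ("v", kids) / ("h", kids) for a binary split into
# two side-by-side / stacked halves.
def blocks(x, y, w, h, node):
    """Yield (x, y, w, h) for every leaf block of a QTBT split tree."""
    if node is None:
        yield (x, y, w, h)
        return
    kind, kids = node
    if kind == "q":                      # quadtree: four equal squares
        hw, hh = w // 2, h // 2
        corners = [(x, y), (x + hw, y), (x, y + hh), (x + hw, y + hh)]
        for (sx, sy), kid in zip(corners, kids):
            yield from blocks(sx, sy, hw, hh, kid)
    elif kind == "v":                    # binary split, left / right halves
        for i, kid in enumerate(kids):
            yield from blocks(x + i * (w // 2), y, w // 2, h, kid)
    elif kind == "h":                    # binary split, top / bottom halves
        for i, kid in enumerate(kids):
            yield from blocks(x, y + i * (h // 2), w, h // 2, kid)

# The FIG. 2 example: a 128x128 CTU split into blocks 11 through 23.
fig2 = ("q", [
    ("v", [("v", [None, None]), None]),           # upper-left: 16x64, 16x64, 32x64
    ("h", [None, None]),                          # upper-right: 64x32, 64x32
    ("q", [("v", [None, ("h", [None, None])]),    # lower-left quadrant
           None, None,
           ("h", [None, None])]),
    None,                                         # lower-right 64x64, not split
])
leaves = list(blocks(0, 0, 128, 128, fig2))       # 13 leaf blocks covering the CTU
```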

Note that while FIG. 2 shows one block being split into four or two blocks (quadtree or binary tree block splitting), splitting is not limited to this. For example, one block may be split into three blocks (ternary tree block splitting). Splitting including such ternary tree block splitting is referred to as MBT (multi-type tree) splitting.
[Subtraction unit]

Subtraction unit 104 subtracts a prediction signal (prediction samples) from the original signal (original samples) in units of the blocks split by division unit 102. In other words, subtraction unit 104 calculates the prediction error (also referred to as the residual) of the block to be encoded (hereinafter referred to as the current block). Subtraction unit 104 then outputs the calculated prediction error to transform unit 106.
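The residual computation above amounts to a per-sample subtraction; a minimal sketch, with arbitrary illustration values for the block contents:

```python
import numpy as np

# Hypothetical 4x4 original luma block and a flat prediction for it.
original = np.array([[52, 55, 61, 66],
                     [63, 59, 55, 90],
                     [62, 59, 68, 113],
                     [63, 58, 71, 122]], dtype=np.int16)
prediction = np.full((4, 4), 60, dtype=np.int16)

# Prediction error (residual) handed to the transform stage.
residual = original - prediction
```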

The original signal is the input signal of encoding device 100, and is a signal representing the image of each picture making up the moving picture (for example, a luma signal and two chroma signals). Hereinafter, a signal representing an image is also referred to as samples.
[Transform unit]

Transform unit 106 transforms prediction errors in the spatial domain into transform coefficients in the frequency domain, and outputs the transform coefficients to quantization unit 108. More specifically, transform unit 106 applies, for example, a predefined discrete cosine transform (DCT) or discrete sine transform (DST) to prediction errors in the spatial domain.

Transform unit 106 may also adaptively select a transform type from among a plurality of transform types, and transform prediction errors into transform coefficients using a transform basis function corresponding to the selected transform type. This is sometimes referred to as EMT (explicit multiple core transform) or AMT (adaptive multiple transform).

The plurality of transform types include, for example, DCT-II, DCT-V, DCT-VIII, DST-I, and DST-VII. FIG. 3 is a table showing the transform basis functions corresponding to each transform type. In FIG. 3, N denotes the number of input pixels. The selection of a transform type from among these transform types may depend on, for example, the type of prediction (intra prediction or inter prediction), or on the intra prediction mode.
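The table of FIG. 3 is not reproduced here, but the basis functions it tabulates follow standard closed forms. As an illustration, the orthonormal DCT-II basis for N input pixels can be generated and applied as follows (a generic textbook form, not a specific codec's integer approximation):

```python
import math

def dct2_basis(N):
    """N x N DCT-II basis matrix T: row k is the k-th basis function
    sampled at n = 0..N-1, in the standard orthonormal form."""
    T = []
    for k in range(N):
        scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        T.append([scale * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                  for n in range(N)])
    return T

def transform_1d(T, x):
    """Coefficient c[k] = sum over n of T[k][n] * x[n]."""
    return [sum(T[k][n] * x[n] for n in range(len(x))) for k in range(len(x))]

T = dct2_basis(4)
c = transform_1d(T, [1.0, 1.0, 1.0, 1.0])  # flat input: all energy in the DC term
```

A two-dimensional transform of a block is obtained by applying such a 1-D transform along rows and then along columns (the separable case discussed below).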

Information indicating whether to apply EMT or AMT (referred to as, for example, an AMT flag) and information indicating the selected transform type are signaled at the CU level. Note that the signaling of such information need not be limited to the CU level, and may be at another level (for example, the sequence level, picture level, slice level, tile level, or CTU level).

Moreover, transform unit 106 may re-transform the transform coefficients (transform result). Such re-transformation is sometimes referred to as AST (adaptive secondary transform) or NSST (non-separable secondary transform). For example, transform unit 106 re-transforms each sub-block (for example, each 4x4 sub-block) included in the block of transform coefficients corresponding to an intra prediction error. Information indicating whether to apply NSST and information related to the transform matrix used by NSST are signaled at the CU level. Note that the signaling of such information need not be limited to the CU level, and may be at another level (for example, the sequence level, picture level, slice level, tile level, or CTU level).

Here, a separable transform is a method in which the transform is performed a plurality of times by separately applying a transform in each direction corresponding to the number of dimensions of the input, while a non-separable transform is a method in which, when the input is multidimensional, two or more dimensions are collectively regarded as one dimension and the transform is performed all at once.

One example of a non-separable transform is the following: when the input is a 4x4 block, the block is regarded as a single array of 16 elements, and the transform applies a 16x16 transform matrix to that array.
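The flatten-then-multiply mechanics of that example can be sketched as follows. The 16x16 matrix below is a random orthogonal stand-in generated for illustration; the actual NSST matrices are fixed by the codec, not chosen at random.

```python
import numpy as np

def non_separable_transform(block, matrix):
    """Treat a 4x4 block as a length-16 vector and apply a 16x16 matrix."""
    flat = block.reshape(16)
    return (matrix @ flat).reshape(4, 4)

# Stand-in orthogonal 16x16 matrix (illustration only).
rng = np.random.default_rng(0)
M, _ = np.linalg.qr(rng.standard_normal((16, 16)))

block = rng.standard_normal((4, 4))
coeffs = non_separable_transform(block, M)
```

Because the stand-in matrix is orthogonal, the transform preserves the block's energy (sum of squares), which is the property one expects of such a basis change.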

Similarly, another example of a non-separable transform is a method (Hypercube Givens Transform) in which, after a 4x4 input block is regarded as a single array of 16 elements, the array undergoes a plurality of transforms such as Givens rotations.
[Quantization unit]

Quantization unit 108 quantizes the transform coefficients output from transform unit 106. More specifically, quantization unit 108 scans the transform coefficients of the current block in a predetermined scanning order, and quantizes the scanned transform coefficients based on a quantization parameter (QP) corresponding to the transform coefficients. Quantization unit 108 then outputs the quantized transform coefficients (hereinafter referred to as quantized coefficients) of the current block to entropy coding unit 110 and inverse quantization unit 112.
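A minimal sketch of QP-driven quantization and the matching dequantization is given below. The step-size relation (the step roughly doubling each time QP increases by 6, as in H.265-style codecs) is used here in a simplified floating-point form; a real codec uses integer scaling tables and different rounding, so this is an illustration of the principle only.

```python
def q_step(qp):
    """Illustrative quantization step size for a given QP
    (simplified: step doubles every 6 QP units)."""
    return 2.0 ** ((qp - 4) / 6.0)

def quantize(coeffs, qp):
    """Map each transform coefficient to an integer level."""
    step = q_step(qp)
    return [round(c / step) for c in coeffs]

def dequantize(levels, qp):
    """Inverse quantization: scale levels back to coefficient magnitudes."""
    step = q_step(qp)
    return [l * step for l in levels]

coeffs = [100.0, -37.5, 4.0, 0.6]
levels = quantize(coeffs, qp=28)     # coarser levels at larger QP
recon = dequantize(levels, qp=28)    # reconstruction carries a quantization error
```

Comparing `recon` with `coeffs` shows the information lost in quantization, which is why the restored prediction errors later in the pipeline differ from the original ones.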

The predetermined scanning order is an order for the quantization/inverse quantization of transform coefficients. For example, the predetermined scanning order is defined as ascending order of frequency (from low to high frequency) or descending order of frequency (from high to low frequency).
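As an illustration of a low-to-high frequency order, the sketch below walks a square coefficient block one anti-diagonal at a time (positions with a small row+col sum correspond to lower frequencies). This is only an example ordering; the scan actually used by a given codec is fixed by its specification.

```python
def diagonal_scan(n):
    """Positions of an n x n coefficient block in ascending order of
    frequency, one anti-diagonal (constant row + col) at a time."""
    order = []
    for s in range(2 * n - 1):      # s = row + col, low frequency first
        for row in range(n):
            col = s - row
            if 0 <= col < n:
                order.append((row, col))
    return order
```

For a 4x4 block this visits the DC position (0, 0) first and the highest-frequency position (3, 3) last.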

The quantization parameter is a parameter defining the quantization step size (quantization width). For example, as the value of the quantization parameter increases, the quantization step size also increases. In other words, as the value of the quantization parameter increases, the quantization error increases.
[Entropy coding unit]

Entropy coding unit 110 generates an encoded signal (encoded bit stream) by variable-length encoding the quantized coefficients input from quantization unit 108. More specifically, entropy coding unit 110, for example, binarizes the quantized coefficients and arithmetically encodes the binary signal.
[Inverse quantization unit]

Inverse quantization unit 112 inverse-quantizes the quantized coefficients input from quantization unit 108. More specifically, inverse quantization unit 112 inverse-quantizes the quantized coefficients of the current block in a predetermined scanning order. Inverse quantization unit 112 then outputs the inverse-quantized transform coefficients of the current block to inverse transform unit 114.
[Inverse transform unit]

Inverse transform unit 114 restores prediction errors by inverse-transforming the transform coefficients input from inverse quantization unit 112. More specifically, inverse transform unit 114 restores the prediction errors of the current block by applying, to the transform coefficients, an inverse transform corresponding to the transform applied by transform unit 106. Inverse transform unit 114 then outputs the restored prediction errors to addition unit 116.

Note that since information is lost in quantization, the restored prediction errors do not match the prediction errors calculated by subtraction unit 104. In other words, the restored prediction errors include quantization errors.
[Addition unit]

Addition unit 116 reconstructs the current block by adding the prediction errors input from inverse transform unit 114 and the prediction samples input from prediction control unit 128. Addition unit 116 then outputs the reconstructed block to block memory 118 and loop filter unit 120. A reconstructed block is also referred to as a local decoded block.
[Block memory]

Block memory 118 is storage for storing blocks that are in the picture to be encoded (hereinafter referred to as the current picture) and are referred to in intra prediction. More specifically, block memory 118 stores the reconstructed blocks output from addition unit 116.
[Loop filter unit]

The loop filter unit 120 applies loop filtering to blocks reconstructed by the addition unit 116 and outputs the filtered reconstructed blocks to the frame memory 122. A loop filter is a filter used within the coding loop (an in-loop filter) and includes, for example, a deblocking filter (DF), sample adaptive offset (SAO), and an adaptive loop filter (ALF).

In ALF, a least-squares error filter for removing coding distortion is applied; for example, for each 2x2 sub-block in the current block, one filter selected from among a plurality of filters based on the direction and activity of the local gradient is applied.

Specifically, sub-blocks (for example, 2x2 sub-blocks) are first classified into a plurality of classes (for example, 15 or 25 classes). The classification of a sub-block is based on the direction and activity of the gradient. For example, a classification value C (for example, C = 5D + A) is calculated from the gradient direction value D (for example, 0 to 2 or 0 to 4) and the gradient activity value A (for example, 0 to 4), and the sub-block is classified into one of the plurality of classes (for example, 15 or 25 classes) based on the classification value C.

The gradient direction value D is derived by, for example, comparing gradients in a plurality of directions (for example, horizontal, vertical, and two diagonal directions). The gradient activity value A is derived by, for example, summing the gradients in a plurality of directions and quantizing the sum.

Based on the result of this classification, the filter for the sub-block is determined from among the plurality of filters.
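The classification step described above can be sketched as follows. The direction value D and the quantized activity value A are assumed to have been derived already from the local gradients, using the wider ranges given in the text (D and A each in 0..4, giving 25 classes).

```python
# Sketch of the ALF sub-block classification C = 5D + A described above.
# D (gradient direction, assumed 0..4) and A (quantized gradient activity,
# assumed 0..4) are taken as already computed; the mapping to one of the
# 25 classes is the formula from the text.

def alf_class(direction_d, activity_a):
    assert 0 <= direction_d <= 4, "direction value out of assumed range"
    assert 0 <= activity_a <= 4, "activity value out of assumed range"
    return 5 * direction_d + activity_a  # class index in 0..24

print(alf_class(0, 0))   # lowest class
print(alf_class(4, 4))   # highest class
```

Each of the 25 class indices then selects one filter from the signaled filter set.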

As the shape of the filter used in ALF, a circularly symmetric shape, for example, may be used. FIGS. 4A to 4C illustrate a plurality of examples of filter shapes used in ALF: FIG. 4A shows a 5x5 diamond-shaped filter, FIG. 4B a 7x7 diamond-shaped filter, and FIG. 4C a 9x9 diamond-shaped filter. Information indicating the filter shape is signaled at the picture level. Note that the signaling of the information indicating the filter shape need not be limited to the picture level and may be at another level (for example, the sequence level, slice level, tile level, CTU level, or CU level).

Whether ALF is on or off is determined at, for example, the picture level or the CU level. For example, whether to apply ALF is determined at the CU level for luma and at the picture level for chroma. Information indicating ALF on/off is signaled at the picture level or the CU level. Note that the signaling of the information indicating ALF on/off need not be limited to the picture level or the CU level and may be at another level (for example, the sequence level, slice level, tile level, or CTU level).

The coefficient set of the plurality of selectable filters (for example, up to 15 or 25 filters) is signaled at the picture level. Note that the signaling of the coefficient set need not be limited to the picture level and may be at another level (for example, the sequence level, slice level, tile level, CTU level, CU level, or sub-block level). [Frame Memory]

The frame memory 122 is storage for holding reference pictures used in inter-frame prediction and is sometimes referred to as a frame buffer. Specifically, the frame memory 122 stores reconstructed blocks filtered by the loop filter unit 120. [Intra-Frame Prediction Unit]

The intra-frame prediction unit 124 generates a prediction signal (intra-frame prediction signal) by performing intra-frame prediction (also referred to as intra-picture prediction) of the current block with reference to blocks in the current picture stored in the block memory 118. Specifically, the intra-frame prediction unit 124 performs intra-frame prediction with reference to samples (for example, luma and chroma values) of blocks adjacent to the current block, thereby generating an intra-frame prediction signal, and outputs the intra-frame prediction signal to the prediction control unit 128.

For example, the intra-frame prediction unit 124 performs intra-frame prediction using one of a plurality of predefined intra-frame prediction modes. The plurality of intra-frame prediction modes includes one or more non-directional prediction modes and a plurality of directional prediction modes.

The one or more non-directional prediction modes include, for example, the Planar prediction mode and the DC prediction mode defined in the H.265/HEVC (High-Efficiency Video Coding) specification (Non-Patent Literature 1).

The plurality of directional prediction modes includes, for example, the prediction modes for the 33 directions defined in the H.265/HEVC specification. The plurality of directional prediction modes may further include prediction modes for 32 additional directions beyond those 33 (for a total of 65 directional prediction modes). FIG. 5 illustrates the 67 intra-frame prediction modes (2 non-directional prediction modes and 65 directional prediction modes) used in intra-frame prediction; the solid arrows represent the 33 directions defined in the H.265/HEVC specification, and the dashed arrows represent the 32 additional directions.

Furthermore, the luma block may be referred to in intra-frame prediction of a chroma block. That is, the chroma component of the current block may be predicted based on the luma component of the current block. Such intra-frame prediction is sometimes referred to as CCLM (cross-component linear model) prediction. This intra-frame prediction mode for a chroma block that refers to the luma block (referred to as, for example, the CCLM mode) may be added as one of the intra-frame prediction modes for chroma blocks.

The intra-frame prediction unit 124 may also correct the intra-frame-predicted pixel values based on the gradients of reference pixels in the horizontal/vertical directions. Intra-frame prediction with such correction is sometimes referred to as PDPC (position dependent intra prediction combination). Information indicating whether PDPC is applied (referred to as, for example, a PDPC flag) is signaled at, for example, the CU level. Note that the signaling of this information need not be limited to the CU level and may be at another level (for example, the sequence level, picture level, slice level, tile level, or CTU level). [Inter-Frame Prediction Unit]

The inter-frame prediction unit 126 generates a prediction signal (inter-frame prediction signal) by performing inter-frame prediction (also referred to as inter-picture prediction) of the current block with reference to a reference picture stored in the frame memory 122, the reference picture being different from the current picture. Inter-frame prediction is performed in units of the current block or of sub-blocks (for example, 4x4 blocks) within the current block. For example, the inter-frame prediction unit 126 performs a motion search (motion estimation) within the reference picture for the current block or sub-block. The inter-frame prediction unit 126 then performs motion compensation using motion information (for example, a motion vector) obtained by the motion search, thereby generating an inter-frame prediction signal for the current block or sub-block, and outputs the generated inter-frame prediction signal to the prediction control unit 128.

The motion information used for motion compensation is signaled. A motion vector predictor may be used in signaling the motion vector; that is, the difference between the motion vector and the motion vector predictor may be signaled.
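The predictor-difference signaling just described amounts to transmitting `mvd = mv - mvp` and reconstructing `mv = mvp + mvd` at the decoder. A minimal sketch, with illustrative integer (x, y) vectors:

```python
# Sketch of motion-vector-difference signaling: the encoder sends only the
# difference between the motion vector and its predictor, and the decoder
# adds the difference back. The specific vectors are illustrative.

def encode_mvd(mv, mvp):
    return (mv[0] - mvp[0], mv[1] - mvp[1])

def decode_mv(mvp, mvd):
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])

mv, mvp = (13, -7), (12, -5)
mvd = encode_mvd(mv, mvp)     # (1, -2): small values are cheaper to code
print(decode_mv(mvp, mvd))    # recovers (13, -7)
```

A good predictor makes the transmitted difference small, which is the point of signaling the difference rather than the vector itself.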

Furthermore, the inter-frame prediction signal may be generated using not only the motion information of the current block obtained by the motion search but also the motion information of adjacent blocks. Specifically, the inter-frame prediction signal may be generated in units of sub-blocks within the current block by weighted addition of a prediction signal based on the motion information obtained by the motion search and a prediction signal based on the motion information of an adjacent block. Such inter-frame prediction (motion compensation) is sometimes referred to as OBMC (overlapped block motion compensation).

In this OBMC mode, information indicating the size of the sub-blocks used for OBMC (referred to as, for example, the OBMC block size) is signaled at the sequence level. Information indicating whether the OBMC mode is applied (referred to as, for example, an OBMC flag) is signaled at the CU level. Note that the levels at which this information is signaled need not be limited to the sequence level and the CU level and may be another level (for example, the picture level, slice level, tile level, CTU level, or sub-block level).

The motion information may also be derived on the decoding device side without being signaled. For example, the merge mode defined in the H.265/HEVC specification may be used. The motion information may also be derived by, for example, performing a motion search on the decoding device side. In this case, the motion search is performed without using the pixel values of the current block.

Here, a mode in which a motion search is performed on the decoding device side is described. This mode is sometimes referred to as the PMMVD (pattern matched motion vector derivation) mode or the FRUC (frame rate up-conversion) mode.

First, a list of a plurality of candidates each having a motion vector predictor is generated with reference to the motion vectors of coded blocks that are spatially or temporally adjacent to the current block (the list may be shared with the merge list). An evaluation value is then calculated for each candidate in the candidate list, and one candidate is selected based on the evaluation values.
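The candidate selection described above can be sketched as follows. The evaluation function here is a placeholder: as stated later in the text, the real evaluation value comes from pattern matching between a region in the reference picture and a predetermined region, and treating lower cost as better is an assumption about the sign convention.

```python
# Sketch of selecting one candidate from a FRUC-style candidate list by
# evaluation value. `evaluate` stands in for the pattern-matching cost;
# lower cost = better match (assumed convention).

def select_candidate(candidates, evaluate):
    return min(candidates, key=evaluate)

# Illustrative candidate motion vectors and illustrative matching costs.
candidates = [(4, 1), (3, 0), (8, -2)]
cost = {(4, 1): 120.0, (3, 0): 95.5, (8, -2): 140.25}
best = select_candidate(candidates, cost.__getitem__)
print(best)  # the candidate with the smallest cost
```

The selected candidate's vector then either becomes the block's motion vector directly or seeds a local refinement, as described next.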

A motion vector for the current block is then derived based on the motion vector of the selected candidate. Specifically, for example, the motion vector of the selected candidate is derived as-is as the motion vector for the current block. Alternatively, for example, the motion vector for the current block may be derived by performing pattern matching in the region surrounding the position in the reference picture that corresponds to the motion vector of the selected candidate.

Note that the evaluation value is calculated by pattern matching between a region in the reference picture corresponding to the motion vector and a predetermined region.

As the pattern matching, first pattern matching or second pattern matching is used. The first pattern matching and the second pattern matching are sometimes referred to as bilateral matching and template matching, respectively.

In the first pattern matching, pattern matching is performed between two blocks in two different reference pictures that lie along the motion trajectory of the current block. Accordingly, in the first pattern matching, a region in another reference picture along the motion trajectory of the current block is used as the predetermined region for calculating the candidate evaluation value described above.

FIG. 6 is a diagram for describing pattern matching (bilateral matching) between two blocks along a motion trajectory. As illustrated in FIG. 6, in the first pattern matching, two motion vectors (MV0, MV1) are derived by searching, among pairs of blocks that lie along the motion trajectory of the current block (Cur block) and are in the two different reference pictures (Ref0, Ref1), for the best-matching pair.

Under the assumption of a continuous motion trajectory, the motion vectors (MV0, MV1) pointing to the two reference blocks are proportional to the temporal distances (TD0, TD1) between the current picture (Cur Pic) and the two reference pictures (Ref0, Ref1). For example, when the current picture is temporally located between the two reference pictures and the temporal distances from the current picture to the two reference pictures are equal, the first pattern matching derives mirror-symmetric bidirectional motion vectors.
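The proportionality constraint can be sketched as follows. Signed temporal distances are an assumption used here to encode which side of the current picture each reference lies on; with equal and opposite distances the result is the mirror-symmetric pair mentioned above.

```python
# Sketch of the bilateral-matching constraint: along a continuous motion
# trajectory, MV1 is MV0 scaled by the ratio of temporal distances TD1/TD0.
# Signed distances (an assumption for this sketch) encode direction, so
# td0 and td1 with opposite signs place Ref0 and Ref1 on opposite sides
# of the current picture.

def paired_mv(mv0, td0, td1):
    scale = td1 / td0
    return (mv0[0] * scale, mv0[1] * scale)

mv0 = (4.0, -2.0)
print(paired_mv(mv0, 1, -1))   # mirror symmetry: equal, opposite distances
print(paired_mv(mv0, 1, -2))   # Ref1 twice as far: vector twice as long
```

The bilateral search then only has to optimize one vector of the pair; the other follows from this constraint.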

In the second pattern matching, pattern matching is performed between a template in the current picture (blocks adjacent to the current block in the current picture (for example, the upper and/or left adjacent blocks)) and a block in the reference picture. Accordingly, in the second pattern matching, the blocks adjacent to the current block in the current picture are used as the predetermined region for calculating the candidate evaluation value described above.

FIG. 7 is a diagram for describing pattern matching (template matching) between a template in the current picture and a block in a reference picture. As illustrated in FIG. 7, in the second pattern matching, the motion vector of the current block is derived by searching the reference picture (Ref0) for the block that best matches the blocks adjacent to the current block (Cur block) in the current picture (Cur Pic).
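A minimal template-matching sketch: search a window in the reference picture for the position whose samples best match (smallest sum of absolute differences) the template built from already-reconstructed neighbors. The 1-D "pictures", the SAD criterion, and the window size are illustrative assumptions; the text does not fix a particular matching cost.

```python
# Sketch of template matching: slide the template over a search window in
# the reference picture and keep the offset with the smallest SAD.
# The 1-D rows and the SAD cost are simplifying assumptions.

def sad(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def template_match(template, ref_row, search_range):
    best_pos, best_cost = None, float("inf")
    for pos in range(search_range):
        cost = sad(template, ref_row[pos:pos + len(template)])
        if cost < best_cost:
            best_pos, best_cost = pos, cost
    return best_pos, best_cost

template = [10, 20, 30]              # samples adjacent to the current block
ref_row = [0, 9, 21, 30, 50, 60]     # samples from the reference picture
print(template_match(template, ref_row, 4))
```

Because the template consists only of neighboring reconstructed samples, the decoder can run the same search and reach the same motion vector without any signaled vector.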

Information indicating whether the FRUC mode is applied (referred to as, for example, a FRUC flag) is signaled at the CU level. When the FRUC mode is applied (for example, when the FRUC flag is true), information indicating the pattern matching method (first pattern matching or second pattern matching) (referred to as, for example, a FRUC mode flag) is signaled at the CU level. Note that the signaling of this information need not be limited to the CU level and may be at another level (for example, the sequence level, picture level, slice level, tile level, CTU level, or sub-block level).

The motion information may also be derived on the decoding device side by a method different from a motion search. For example, a correction amount for the motion vector may be calculated in units of pixels using surrounding pixel values, based on a model assuming uniform linear motion.

Here, a mode for deriving a motion vector based on a model assuming uniform linear motion is described. This mode is sometimes referred to as the BIO (bi-directional optical flow) mode.

FIG. 8 is a diagram for describing the model assuming uniform linear motion. In FIG. 8, (vx, vy) denotes a velocity vector, and τ0 and τ1 denote the temporal distances between the current picture (Cur Pic) and the two reference pictures (Ref0, Ref1), respectively. (MVx0, MVy0) denotes the motion vector corresponding to reference picture Ref0, and (MVx1, MVy1) denotes the motion vector corresponding to reference picture Ref1.

Under the assumption of uniform linear motion with velocity vector (vx, vy), (MVx0, MVy0) and (MVx1, MVy1) are expressed as (vxτ0, vyτ0) and (−vxτ1, −vyτ1), respectively, and the following optical flow equation (1) holds. [Math 1] $$\frac{\partial I^{(k)}}{\partial t} + v_x \frac{\partial I^{(k)}}{\partial x} + v_y \frac{\partial I^{(k)}}{\partial y} = 0 \qquad (1)$$

Here, I(k) denotes the luma value of reference picture k (k = 0, 1) after motion compensation. This optical flow equation expresses that the sum of the following is equal to zero: (i) the time derivative of the luma value, (ii) the product of the horizontal velocity and the horizontal component of the spatial gradient of the reference picture, and (iii) the product of the vertical velocity and the vertical component of the spatial gradient of the reference picture. Based on a combination of this optical flow equation and Hermite interpolation, a block-level motion vector obtained from, for example, a merge list is corrected in units of pixels.

A motion vector may also be derived on the decoding device side by a method different from derivation based on a model assuming uniform linear motion. For example, a motion vector may be derived in units of sub-blocks based on the motion vectors of a plurality of adjacent blocks.

Here, a mode for deriving a motion vector in units of sub-blocks based on the motion vectors of a plurality of adjacent blocks is described. This mode is referred to as the affine motion compensation prediction mode.

FIG. 9 is a diagram for describing derivation of sub-block-level motion vectors based on the motion vectors of a plurality of adjacent blocks. In FIG. 9, the current block includes sixteen 4x4 sub-blocks. Here, the motion vector v0 of the top-left corner control point of the current block is derived based on the motion vector of an adjacent block, and the motion vector v1 of the top-right corner control point of the current block is derived based on the motion vector of an adjacent sub-block. The motion vector (vx, vy) of each sub-block within the current block is then derived from the two motion vectors v0 and v1 by the following equation (2). [Math 2] $$v_x = \frac{v_{1x} - v_{0x}}{w}\,x - \frac{v_{1y} - v_{0y}}{w}\,y + v_{0x}, \qquad v_y = \frac{v_{1y} - v_{0y}}{w}\,x + \frac{v_{1x} - v_{0x}}{w}\,y + v_{0y} \qquad (2)$$

Here, x and y denote the horizontal and vertical position of the sub-block, respectively, and w denotes a predetermined weighting coefficient.
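The sub-block vector derivation from the two control points v0 (top-left) and v1 (top-right) can be sketched directly in code. The control-point vectors and the value of w below are illustrative; following the text, w is treated simply as a given coefficient.

```python
# Sketch of deriving a per-sub-block motion vector from the two
# control-point vectors v0 (top-left) and v1 (top-right), per the affine
# model of equation (2). The values of v0, v1, and w are illustrative.

def affine_subblock_mv(v0, v1, x, y, w):
    vx = (v1[0] - v0[0]) / w * x - (v1[1] - v0[1]) / w * y + v0[0]
    vy = (v1[1] - v0[1]) / w * x + (v1[0] - v0[0]) / w * y + v0[1]
    return (vx, vy)

v0, v1, w = (2.0, 0.0), (4.0, 1.0), 16.0
print(affine_subblock_mv(v0, v1, 0, 0, w))   # at the origin this equals v0
print(affine_subblock_mv(v0, v1, 8, 4, w))   # an interior sub-block position
```

Evaluating this at each sub-block position (x, y) yields the 16 per-sub-block vectors of the 4x4 sub-block grid in FIG. 9.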

In this affine motion compensation prediction mode, the method of deriving the motion vectors of the top-left and top-right corner control points may include several different modes. Information indicating this affine motion compensation prediction mode (referred to as, for example, an affine flag) is signaled at the CU level. Note that the signaling of the information indicating this affine motion compensation prediction mode need not be limited to the CU level and may be at another level (for example, the sequence level, picture level, slice level, tile level, CTU level, or sub-block level). [Prediction Control Unit]

The prediction control unit 128 selects either the intra-frame prediction signal or the inter-frame prediction signal and outputs the selected signal to the subtraction unit 104 and the addition unit 116 as the prediction signal. [Outline of Decoding Device]

Next, an outline of a decoding device capable of decoding the encoded signal (encoded bitstream) output from the encoding device 100 described above is given. FIG. 10 is a block diagram showing the functional configuration of a decoding device 200 according to Embodiment 1. The decoding device 200 is a moving picture/picture decoding device that decodes a moving picture/picture in units of blocks.

As illustrated in FIG. 10, the decoding device 200 includes an entropy decoding unit 202, an inverse quantization unit 204, an inverse transform unit 206, an addition unit 208, a block memory 210, a loop filter unit 212, a frame memory 214, an intra-frame prediction unit 216, an inter-frame prediction unit 218, and a prediction control unit 220.

The decoding device 200 may be realized as, for example, a general-purpose processor and memory. In this case, when the processor executes a software program stored in the memory, the processor functions as the entropy decoding unit 202, inverse quantization unit 204, inverse transform unit 206, addition unit 208, loop filter unit 212, intra-frame prediction unit 216, inter-frame prediction unit 218, and prediction control unit 220. The decoding device 200 may also be realized as one or more dedicated electronic circuits corresponding to the entropy decoding unit 202, inverse quantization unit 204, inverse transform unit 206, addition unit 208, loop filter unit 212, intra-frame prediction unit 216, inter-frame prediction unit 218, and prediction control unit 220.

Each component included in the decoding device 200 is described below. [Entropy Decoding Unit]

The entropy decoding unit 202 entropy-decodes the encoded bitstream. Specifically, the entropy decoding unit 202, for example, arithmetic-decodes the encoded bitstream into a binary signal and then debinarizes the binary signal. The entropy decoding unit 202 thereby outputs quantized coefficients to the inverse quantization unit 204 in units of blocks. [Inverse Quantization Unit]

The inverse quantization unit 204 inverse-quantizes the quantized coefficients of the block to be decoded (hereinafter, the current block), which are input from the entropy decoding unit 202. Specifically, for each quantized coefficient of the current block, the inverse quantization unit 204 inverse-quantizes the coefficient based on the quantization parameter corresponding to that coefficient. The inverse quantization unit 204 then outputs the inverse-quantized coefficients (that is, transform coefficients) of the current block to the inverse transform unit 206. [Inverse Transform Unit]

The inverse transform unit 206 restores the prediction error by inverse-transforming the transform coefficients input from the inverse quantization unit 204.

For example, when information parsed from the encoded bitstream indicates that EMT or AMT is applied (for example, the AMT flag is true), the inverse transform unit 206 inverse-transforms the transform coefficients of the current block based on the parsed information indicating the transform type.

Also, for example, when information parsed from the encoded bitstream indicates that NSST is applied, the inverse transform unit 206 applies an inverse re-transform to the transform coefficients. [Addition Unit]

The addition unit 208 reconstructs the current block by adding the prediction error input from the inverse transform unit 206 and the prediction samples input from the prediction control unit 220. The addition unit 208 then outputs the reconstructed block to the block memory 210 and the loop filter unit 212. [Block Memory]

The block memory 210 is storage for holding blocks that are referred to in intra-frame prediction and that are within the picture being decoded (hereinafter, the current picture). Specifically, the block memory 210 stores reconstructed blocks output from the addition unit 208. [Loop Filter Unit]

The loop filter unit 212 applies loop filtering to blocks reconstructed by the addition unit 208 and outputs the filtered reconstructed blocks to the frame memory 214, a display device, and the like.

When the ALF on/off information parsed from the encoded bitstream indicates that ALF is on, one filter is selected from among a plurality of filters based on the direction and activity of the local gradient, and the selected filter is applied to the reconstructed block. [Frame Memory]

The frame memory 214 is storage for holding reference pictures used in inter-frame prediction and is sometimes referred to as a frame buffer. Specifically, the frame memory 214 stores reconstructed blocks filtered by the loop filter unit 212. [Intra-Frame Prediction Unit]

The intra-frame prediction unit 216 generates a prediction signal (intra-frame prediction signal) by performing intra-frame prediction with reference to blocks in the current picture stored in the block memory 210, in accordance with the intra-frame prediction mode parsed from the encoded bitstream. Specifically, the intra-frame prediction unit 216 performs intra-frame prediction with reference to samples (for example, luma and chroma values) of blocks adjacent to the current block, thereby generating an intra-frame prediction signal, and outputs the intra-frame prediction signal to the prediction control unit 220.

Note that when an intra-frame prediction mode that refers to the luma block is selected for intra-frame prediction of a chroma block, the intra-frame prediction unit 216 may predict the chroma component of the current block based on the luma component of the current block.

Also, when information parsed from the encoded bitstream indicates that PDPC is applied, the intra-frame prediction unit 216 corrects the intra-frame-predicted pixel values based on the gradients of reference pixels in the horizontal/vertical directions. [Inter-Frame Prediction Unit]

The inter prediction unit 218 predicts the current block with reference to reference pictures stored in the frame memory 214. Prediction is performed in units of the current block or of sub-blocks within the current block (for example, 4x4 blocks). For example, the inter prediction unit 218 performs motion compensation using motion information (for example, a motion vector) parsed from the encoded bitstream, thereby generating an inter prediction signal for the current block or sub-block, and outputs the inter prediction signal to the prediction control unit 220.

Note that when information parsed from the encoded bitstream indicates that OBMC mode is to be applied, the inter prediction unit 218 generates the inter prediction signal using not only the motion information of the current block obtained by motion search, but also the motion information of adjacent blocks.

Furthermore, when information parsed from the encoded bitstream indicates that FRUC mode is to be applied, the inter prediction unit 218 derives motion information by performing motion search in accordance with the pattern matching method (bilateral matching or template matching) parsed from the encoded stream. The inter prediction unit 218 then performs motion compensation using the derived motion information.

Furthermore, when BIO mode is to be applied, the inter prediction unit 218 derives motion vectors based on a model assuming uniform linear motion. Moreover, when information parsed from the encoded bitstream indicates that affine motion compensation prediction mode is to be applied, the inter prediction unit 218 derives motion vectors in units of sub-blocks based on the motion vectors of a plurality of adjacent blocks.

[Prediction control unit]

The prediction control unit 220 selects either the intra prediction signal or the inter prediction signal, and outputs the selected signal to the adder 208 as the prediction signal.

(Embodiment 2)

The encoding apparatus and decoding apparatus in this embodiment have the same configurations and functions as in Embodiment 1, but are characterized by the processing operations of, among others, the inter prediction units 126 and 218.

[Underlying knowledge forming the basis of the present disclosure]

FIG. 11 is a flowchart showing motion compensation performed by another encoding apparatus that forms the basis of the present disclosure. Note that in FIG. 11 and subsequent figures, a motion vector is denoted as MV.

The encoding apparatus performs motion compensation on each prediction block, which corresponds to the prediction unit described above. Here, the encoding apparatus first acquires a plurality of candidate motion vectors for the prediction block based on information such as the motion vectors of a plurality of encoded blocks temporally or spatially surrounding the prediction block (step S101).

Next, from among the plurality of candidate motion vectors acquired in step S101, the encoding apparatus extracts each of N candidate motion vectors (N being an integer of 2 or greater) as a motion vector predictor candidate in accordance with a predetermined priority order (step S102). Note that the priority order is determined in advance for each of the N candidate motion vectors.

Next, the encoding apparatus selects one motion vector predictor candidate from among the N motion vector predictor candidates as the motion vector predictor of the prediction block. At this time, the encoding apparatus encodes, into the stream, motion vector predictor selection information for identifying the selected motion vector predictor (step S103). Note that the stream is the encoded signal or encoded bitstream described above.

Next, the encoding apparatus derives the motion vector of the prediction block with reference to an encoded reference picture (step S104). At this time, the encoding apparatus further encodes, into the stream, the difference value between the derived motion vector and the motion vector predictor as differential motion vector information. Note that an encoded reference picture is a picture composed of a plurality of blocks reconstructed after encoding.

Finally, the encoding apparatus generates a prediction image of the prediction block by performing motion compensation on the prediction block using the derived motion vector and the encoded reference picture (step S105). Note that the prediction image is the inter prediction signal described above.
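Purely for illustration (this sketch is not part of the disclosed apparatus, and the candidate values and priority order are hypothetical), the extraction of step S102 can be expressed as follows: the candidate motion vectors are scanned in a predetermined priority order, and the first N distinct candidates are kept as motion vector predictor candidates.

```python
def extract_predictor_candidates(candidates, n):
    """Step S102 (sketch): scan candidate MVs in a predetermined
    priority order (here simply list order) and keep the first n
    distinct ones as motion vector predictor candidates."""
    out = []
    for mv in candidates:
        if mv not in out:
            out.append(mv)
        if len(out) == n:
            break
    return out

# Candidate MVs gathered from surrounding encoded blocks (step S101);
# the values are made up for this example.
candidates = [(3, -1), (3, -1), (0, 2), (5, 5)]
predictor_candidates = extract_predictor_candidates(candidates, n=2)
print(predictor_candidates)  # [(3, -1), (0, 2)]
```

In step S103, only the index of the chosen entry within this list would then be signaled as the motion vector predictor selection information.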

FIG. 12 is a flowchart showing motion compensation performed by another decoding apparatus that forms the basis of the present disclosure.

The decoding apparatus performs motion compensation on each prediction block. Here, the decoding apparatus first acquires a plurality of candidate motion vectors for the prediction block based on information such as the motion vectors of a plurality of decoded blocks temporally or spatially surrounding the prediction block (step S111).

Next, from among the plurality of candidate motion vectors acquired in step S111, the decoding apparatus extracts each of N candidate motion vectors (N being an integer of 2 or greater) as a motion vector predictor candidate in accordance with a predetermined priority order (step S112). Note that the priority order is determined in advance for each of the N candidate motion vectors.

Next, the decoding apparatus decodes the motion vector predictor selection information from the input stream, and, using the decoded motion vector predictor selection information, selects one motion vector predictor candidate from among the N motion vector predictor candidates as the motion vector predictor of the prediction block (step S113).

Next, the decoding apparatus decodes the differential motion vector information from the input stream, and derives the motion vector of the prediction block by adding the difference value, which is the decoded differential motion vector information, to the selected motion vector predictor (step S114).

Finally, the decoding apparatus generates a prediction image of the prediction block by performing motion compensation on the prediction block using the derived motion vector and a decoded reference picture (step S115).

Here, in the examples shown in FIG. 11 and FIG. 12, a predetermined priority order is used to extract the N motion vector predictor candidates. However, each of the plurality of candidate motion vectors may instead be evaluated in order to obtain a motion vector predictor whose difference from the motion vector of the prediction block is smaller. That is, an evaluation value may be calculated for each of the plurality of candidate motion vectors acquired in step S101 or step S111, and the N motion vector predictor candidates may be extracted from among the plurality of candidate motion vectors based on the calculated evaluation values.

FIG. 13 is a diagram for explaining one example of a method for calculating an evaluation value.

One example of a method for calculating an evaluation value is the template matching method. In the template matching method, the reconstructed image of an encoded region or decoded region in the video is used to calculate the evaluation value. Note that in FIG. 13, encoded regions and decoded regions are collectively referred to as processed regions, and the picture to be encoded and the picture to be decoded are collectively referred to as the picture being processed. Furthermore, a prediction block to be encoded and a prediction block to be decoded are collectively referred to as the prediction block being processed.

Specifically, the encoding apparatus calculates the difference value between the following two reconstructed images: the reconstructed image of the encoded region located around the prediction block being processed in the picture to be encoded, and the reconstructed image of the encoded region located around the block specified by a candidate motion vector in an encoded reference picture. For example, the difference value may be calculated as the sum of absolute differences of pixel values.

Similarly to the encoding apparatus, the decoding apparatus calculates the difference value between the following two reconstructed images: the reconstructed image of the decoded region located around the prediction block being processed in the picture to be decoded, and the reconstructed image of the decoded region located around the block specified by a candidate motion vector in a decoded reference picture. For example, the difference value may be calculated as the sum of absolute differences of pixel values.

Note that hereinafter, the block specified by a candidate motion vector in a reference picture is referred to as the specified block. The specified block is located at the position indicated by the candidate motion vector, taking the spatial position of the prediction block being processed as a reference. Furthermore, the position of the processed region relative to the specified block in the reference picture is equal to the position of the processed region relative to the prediction block being processed in the picture being processed. Moreover, the processed region located around the prediction block being processed or the specified block may be the region adjacent to the left of those blocks and the region adjacent above them, or may be only the region adjacent to the left, or only the region adjacent above. For example, if both the region adjacent to the left and the region adjacent above exist, both regions may be used to calculate the evaluation value; if either region does not exist, only the existing region may be used to calculate the evaluation value.

The encoding apparatus and decoding apparatus calculate the evaluation value using the obtained difference value. For example, the smaller the difference value, the higher the calculated evaluation value. Note that the encoding apparatus and decoding apparatus may calculate the evaluation value using information other than the difference value in addition to the difference value.

Note that the method for calculating an evaluation value via the template matching method shown in the example in FIG. 13 is merely one example; the method is not limited to this. For example, the positions of the regions used for evaluation and the method for determining whether a region can be used are not limited to the example in FIG. 13, and other positions or methods may be used.
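Purely for illustration, the template matching evaluation described above can be sketched as follows; the template shape (a single row above the block), the picture contents, and the mapping from difference value to evaluation value are assumptions for this sketch, not part of the disclosure.

```python
def sad(a, b):
    """Sum of absolute differences of the pixel values of two
    equal-sized regions (lists of rows)."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def template_match_eval(cur_template, ref_pic, block_pos, cand_mv, th, tw):
    """Evaluate one candidate MV by template matching: compare the
    reconstructed region above the prediction block being processed
    with the same-shaped region above the block the candidate MV
    specifies in the reference picture.  A smaller difference value
    yields a higher evaluation value."""
    y = block_pos[0] + cand_mv[0]   # row of the specified block
    x = block_pos[1] + cand_mv[1]   # column of the specified block
    ref_template = [row[x:x + tw] for row in ref_pic[y - th:y]]
    return 1.0 / (1.0 + sad(cur_template, ref_template))

# A 6x6 reference picture with pixel value 10*row + col, purely for
# illustration; the candidate MV (1, 0) points one row down.
ref = [[10 * r + c for c in range(6)] for r in range(6)]
score = template_match_eval([[22, 23]], ref, block_pos=(2, 2),
                            cand_mv=(1, 0), th=1, tw=2)
print(score)  # 1.0 — the template matches the reference region exactly
```

In practice the evaluation may also use the region to the left of the block, or both regions, as described above.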

FIG. 14 is a diagram for explaining another example of a method for calculating an evaluation value.

Another example of a method for calculating an evaluation value is the bilateral matching method. In the bilateral matching method as well, the reconstructed image of an encoded region or decoded region in the video is used to calculate the evaluation value. Note that in FIG. 14, encoded regions and decoded regions are collectively referred to as processed regions, and the picture to be encoded and the picture to be decoded are collectively referred to as the picture being processed. Furthermore, a prediction block to be encoded and a prediction block to be decoded are collectively referred to as the prediction block being processed.

Specifically, the encoding apparatus calculates the difference value between the following two reconstructed images: the reconstructed image of the block specified by a candidate motion vector in encoded reference picture 1, and the reconstructed image of the block specified by a symmetric motion vector in encoded reference picture 2. Both the block specified by the candidate motion vector and the block specified by the symmetric motion vector are encoded regions. For example, the difference value may be calculated as the sum of absolute differences of pixel values.

Similarly to the encoding apparatus, the decoding apparatus calculates the difference value between the following two reconstructed images: the reconstructed image of the block specified by a candidate motion vector in decoded reference picture 1, and the reconstructed image of the block specified by a symmetric motion vector in decoded reference picture 2. Both the block specified by the candidate motion vector and the block specified by the symmetric motion vector are decoded regions. For example, the difference value may be calculated as the sum of absolute differences of pixel values.

Note that the symmetric motion vector is a motion vector generated by scaling the candidate motion vector in accordance with the display time intervals described above. Furthermore, the blocks specified by the candidate motion vector and by the symmetric motion vector are each located at the position indicated taking the spatial position of the prediction block being processed as a reference.

The encoding apparatus and decoding apparatus calculate the evaluation value using the obtained difference value. For example, the smaller the difference value, the higher the calculated evaluation value. Note that the encoding apparatus and decoding apparatus may calculate the evaluation value using information other than the difference value in addition to the difference value.

Note that the method for calculating an evaluation value via the bilateral matching method shown in the example in FIG. 14 is merely one example; the method is not limited to this. For example, the method for specifying the positions of the processed regions used for evaluation is not limited to the example shown in FIG. 14.
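Purely for illustration, the generation of the symmetric motion vector used in bilateral matching can be sketched as follows; the sign convention (reference pictures on opposite temporal sides of the picture being processed) and the interval parameters are assumptions for this sketch.

```python
def symmetric_mv(cand_mv, d0, d1):
    """Generate the symmetric MV by scaling the candidate MV by the
    ratio of display time intervals: d0 is the interval from the
    picture being processed to reference picture 1, d1 the interval
    to reference picture 2 on the opposite temporal side (hence the
    sign flip)."""
    scale = -d1 / d0
    return (cand_mv[0] * scale, cand_mv[1] * scale)

# A candidate MV of (4, -2) toward a picture two intervals away maps
# to a symmetric MV toward a picture one interval away the other way.
print(symmetric_mv((4, -2), d0=2, d1=1))  # (-2.0, 1.0)
```

The difference value is then the SAD between the block the candidate MV specifies in reference picture 1 and the block the symmetric MV specifies in reference picture 2.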

By extracting the N motion vector predictor candidates from among the plurality of candidate motion vectors based on such evaluation values, it may be possible to improve the prediction accuracy of the prediction block. Note that both the template matching method and the bilateral matching method are methods used in the FRUC mode described above. Accordingly, a method for extracting motion vector predictor candidates based on such evaluation values is also referred to as an extraction method based on evaluation results obtained via FRUC.

Here, in order to extract N motion vector predictor candidates for a prediction block from among a plurality of candidate motion vectors, a first extraction method based on evaluation results obtained via FRUC and a second extraction method based on a predetermined priority order can be used.

For example, in order to extract two motion vector predictor candidates from among a plurality of candidate motion vectors, the encoding apparatus and decoding apparatus extract one motion vector predictor candidate using the first extraction method, and extract the remaining motion vector predictor candidate using the second extraction method. In such a case, it is conceivable that a separate candidate list is used for each of the first extraction method and the second extraction method. These candidate lists are lists indicating a plurality of candidate motion vectors.

Accordingly, in the case described above, although prediction accuracy may improve, a plurality of mutually different candidate lists must be created for a single prediction block, which gives rise to the problem of increased processing load.

In view of this, the encoding apparatus 100 and decoding apparatus 200 in this embodiment perform motion compensation on a prediction block by using the extraction method based on evaluation results obtained via FRUC, while using a single candidate list for the prediction block.

Specifically, the encoding apparatus 100 in this embodiment extracts at least one motion vector predictor candidate for the block to be encoded from among a plurality of candidate motion vectors. At this time, the encoding apparatus 100 extracts all of the at least one motion vector predictor candidate based on an evaluation result for each of the plurality of candidate motion vectors, the evaluation using not the image region of the block to be encoded, but the reconstructed image of an encoded region in the video.

Furthermore, the decoding apparatus 200 in this embodiment extracts at least one motion vector predictor candidate for the block to be decoded from among a plurality of candidate motion vectors. At this time, the decoding apparatus 200 extracts all of the at least one motion vector predictor candidate based on an evaluation result for each of the plurality of candidate motion vectors, the evaluation using not the image region of the block to be decoded, but the reconstructed image of a decoded region in the video.

That is, the encoding apparatus 100 and decoding apparatus 200 in this embodiment extract all of the motion vector predictor candidates via the extraction method based on evaluation results obtained via FRUC. In other words, all of the motion vector predictor candidates are extracted without using the extraction method based on a predetermined priority order. Here, "all of the motion vector predictor candidates" may be one motion vector predictor candidate or a plurality of motion vector predictor candidates. Accordingly, since the extraction method based on a predetermined priority order is not used, no dedicated candidate list is needed for that extraction method, and motion compensation can be performed on a prediction block using a single candidate list.

[Extracting one motion vector predictor candidate using only FRUC]

FIG. 15 is a flowchart showing one example of motion compensation performed by the encoding apparatus 100 in this embodiment. When the encoding apparatus 100 shown in FIG. 1 encodes a video composed of a plurality of pictures, the inter prediction unit 126 of the encoding apparatus 100 and other components execute the processing shown in FIG. 15.

Specifically, the inter prediction unit 126 performs motion compensation on each prediction block, corresponding to the prediction unit described above, with that prediction block being the block to be encoded. Here, the inter prediction unit 126 first acquires a plurality of candidate motion vectors for the prediction block based on information such as the motion vectors of a plurality of encoded blocks temporally or spatially surrounding the prediction block (step S201). For example, the information such as the motion vector of an encoded block may be the motion vector used for motion compensation of that encoded block, or may include not only that motion vector but also a display time interval, namely the display time interval between the picture containing the encoded block and the picture to be encoded. For example, the plurality of candidate motion vectors are motion vectors obtained by scaling each of the motion vectors of the plurality of encoded blocks in accordance with the display time intervals. Furthermore, the plurality of encoded blocks surrounding the prediction block may be, for example, the plurality of encoded blocks adjacent to each of the lower left, upper left, and upper right of the prediction block to be encoded, as well as all or some of the plurality of encoded blocks included in a picture different from the picture to be encoded.
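Purely for illustration, the scaling of a surrounding block's motion vector in accordance with display time intervals can be sketched as follows; the parameterization by two interval values is an assumption for this sketch.

```python
def scale_candidate_mv(mv, d_neighbor, d_current):
    """Scale a surrounding encoded block's MV by the ratio of display
    time intervals so that it becomes a candidate MV for the current
    prediction block.  d_neighbor is the interval spanned by the
    neighbor's MV; d_current is the interval between the current
    picture and its reference picture."""
    s = d_current / d_neighbor
    return (mv[0] * s, mv[1] * s)

# A neighbor MV of (6, -3) spanning three picture intervals, rescaled
# to the current block's one-interval reference distance:
print(scale_candidate_mv((6, -3), d_neighbor=3, d_current=1))  # (2.0, -1.0)
```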

Next, the inter prediction unit 126 calculates an evaluation value for each of the plurality of candidate motion vectors acquired in step S201, using the reconstructed image of an encoded region. That is, the inter prediction unit 126 calculates these evaluation values based on FRUC, namely the template matching method or the bilateral matching method. The inter prediction unit 126 then selects, from among the plurality of candidate motion vectors, the one candidate motion vector with the highest evaluation value as the motion vector predictor of the prediction block (step S202). That is, the inter prediction unit 126 extracts all of the at least one motion vector predictor candidate described above by selecting, from among the plurality of candidate motion vectors, only the one candidate motion vector with the best evaluation result.
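Purely for illustration, the selection in step S202 can be sketched as follows; the evaluation values shown are hypothetical stand-ins for template matching or bilateral matching scores.

```python
def select_predictor(candidates, evaluate):
    """Step S202 (sketch): evaluate every candidate MV with a
    FRUC-based evaluation function (template or bilateral matching)
    and keep only the single candidate with the highest evaluation
    value as the motion vector predictor."""
    return max(candidates, key=evaluate)

# Hypothetical evaluation values keyed by candidate MV.
scores = {(0, 1): 0.2, (2, 2): 0.9, (-1, 0): 0.5}
best = select_predictor(list(scores), lambda mv: scores[mv])
print(best)  # (2, 2)
```

Because every candidate is ranked by the same evaluation, a single candidate list suffices, in contrast to the two-list approach described earlier.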

Note that the inter prediction unit 126 may correct the selected motion vector predictor by finely shifting it within the surrounding region so that the evaluation value obtained via FRUC becomes higher. That is, the inter prediction unit 126 may correct the motion vector predictor by finely searching for a region that yields a higher evaluation value via FRUC.

Next, the inter prediction unit 126 derives the motion vector of the prediction block with reference to an encoded reference picture (step S203). At this time, the inter prediction unit 126 further calculates the difference value between the derived motion vector and the motion vector predictor. The entropy encoding unit 110 encodes this difference value into the stream as differential motion vector information. That is, the entropy encoding unit 110 encodes the difference between the motion vector predictor, which is the selected candidate motion vector, and the derived motion vector of the block to be encoded.

Finally, the inter prediction unit 126 generates a prediction image of the prediction block by performing motion compensation on the prediction block using the derived motion vector and the encoded reference picture (step S204).

Note that instead of performing motion compensation in units of prediction blocks as described above, the inter prediction unit 126 may derive motion vectors in the same manner in units of sub-blocks obtained by splitting the prediction block, and perform motion compensation in units of sub-blocks.

FIG. 16 is a flowchart showing one example of motion compensation performed by the decoding apparatus 200 in this embodiment. When the decoding apparatus 200 shown in FIG. 10 decodes an encoded video composed of a plurality of pictures, the inter prediction unit 218 of the decoding apparatus 200 and other components execute the processing shown in FIG. 16.

Specifically, the inter prediction unit 218 performs motion compensation on each prediction block, corresponding to the prediction unit described above, with that prediction block being the block to be decoded. Here, the inter prediction unit 218 first acquires a plurality of candidate motion vectors for the prediction block based on information such as the motion vectors of a plurality of decoded blocks temporally or spatially surrounding the prediction block (step S211). For example, the information such as the motion vector of a decoded block may be the motion vector used for motion compensation of that decoded block, or may include not only that motion vector but also a display time interval, namely the display time interval between the picture containing the decoded block and the picture to be decoded. For example, the plurality of candidate motion vectors are motion vectors obtained by scaling each of the motion vectors of the plurality of decoded blocks in accordance with the display time intervals. Furthermore, the plurality of decoded blocks surrounding the prediction block may be, for example, the plurality of decoded blocks adjacent to each of the lower left, upper left, and upper right of the prediction block to be decoded, as well as all or some of the plurality of decoded blocks included in a picture different from the picture to be decoded.

Next, the inter prediction unit 218 calculates an evaluation value for each of the plurality of candidate motion vectors acquired in step S211, using the reconstructed image of a decoded region. That is, the inter prediction unit 218 calculates these evaluation values based on FRUC, namely the template matching method or the bilateral matching method. The inter prediction unit 218 then selects, from among the plurality of candidate motion vectors, the one candidate motion vector with the highest evaluation value as the motion vector predictor of the prediction block (step S212). That is, the inter prediction unit 218 extracts all of the at least one motion vector predictor candidate described above by selecting, from among the plurality of candidate motion vectors, only the one candidate motion vector with the best evaluation result.

Furthermore, the inter prediction unit 218 may correct the selected motion vector predictor by finely shifting it within the surrounding region so that its FRUC evaluation value becomes higher. That is, the inter prediction unit 218 may correct the motion vector predictor by finely searching for a region that yields a higher FRUC evaluation value.
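This fine correction can be sketched as a small local search around the selected predictor. The window radius and the `evaluate` callback (assumed to return a FRUC-style score, higher being better) are illustrative assumptions:

```python
def refine_predictor(mv, evaluate, radius=1):
    """Finely shift the selected predictor within a small surrounding
    window and keep the position with the highest evaluation value."""
    best_mv, best_score = mv, evaluate(mv)
    for dx in range(-radius, radius + 1):
        for dy in range(-radius, radius + 1):
            cand = (mv[0] + dx, mv[1] + dy)
            score = evaluate(cand)
            if score > best_score:
                best_mv, best_score = cand, score
    return best_mv
```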

Next, the inter prediction unit 218 derives the motion vector of the prediction block using motion vector difference information that the entropy decoding unit 202 has decoded from the stream input to the decoding device 200 (step S213). Specifically, the inter prediction unit 218 adds the decoded motion vector difference information, i.e., the difference value, to the selected motion vector predictor, thereby deriving the motion vector of the prediction block. That is, the entropy decoding unit 202 decodes difference information indicating the difference between two motion vectors, and the inter prediction unit 218 adds the motion vector predictor, i.e., the selected candidate motion vector, to the difference indicated by the decoded difference information, thereby deriving the motion vector of the current block, i.e., the prediction block.
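The derivation in step S213 amounts to a component-wise addition, which can be sketched as:

```python
def derive_motion_vector(predictor, mv_difference):
    """Decoder-side derivation (step S213 sketch): add the decoded motion
    vector difference to the selected predictor, component by component,
    to recover the motion vector of the prediction block."""
    return (predictor[0] + mv_difference[0], predictor[1] + mv_difference[1])
```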

Finally, the inter prediction unit 218 performs motion compensation on the prediction block using the derived motion vector and a decoded reference picture, thereby generating a predicted image for the prediction block (step S214).

Instead of performing motion compensation in units of prediction blocks as described above, the inter prediction unit 218 may likewise derive motion vectors and perform motion compensation in units of sub-blocks obtained by partitioning the prediction block.

Although one motion vector predictor candidate is extracted in the examples shown in FIG. 15 and FIG. 16, a plurality of motion vector predictor candidates may be extracted. [Extracting a plurality of motion vector predictor candidates using only FRUC]

FIG. 17 is a flowchart showing another example of motion compensation performed by the encoding device 100 in the present embodiment. When the encoding device 100 shown in FIG. 1 encodes a moving picture composed of a plurality of pictures, the inter prediction unit 126 of the encoding device 100 and related units execute the processing shown in FIG. 17.

Specifically, the inter prediction unit 126 performs motion compensation on each prediction block corresponding to the prediction unit described above, the prediction block being the current block to be encoded. First, the inter prediction unit 126 obtains a plurality of candidate motion vectors for the prediction block based on information such as the motion vectors of a plurality of encoded blocks that temporally or spatially surround the prediction block (step S201).

Next, the inter prediction unit 126 uses the reconstructed image of the already-encoded region to calculate an evaluation value for each of the candidate motion vectors obtained in step S201. That is, the inter prediction unit 126 calculates the evaluation values based on FRUC, i.e., the template matching method or the bilateral matching method. Based on these evaluation values, the inter prediction unit 126 then extracts each of N candidate motion vectors (N being an integer of 2 or more) from among the candidate motion vectors as a motion vector predictor candidate (step S202a). In other words, based on the evaluation results described above, the inter prediction unit 126 extracts N candidate motion vectors from the plurality of candidate motion vectors as the entirety of the at least one motion vector predictor candidate. More specifically, the inter prediction unit 126 extracts, from the plurality of candidate motion vectors, the N candidates ranked highest in evaluation result, that is, the N candidates with the highest evaluation values, each as a motion vector predictor candidate.
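The top-N extraction of step S202a can be sketched as a simple ranking. `evaluate` is an assumed callback returning the FRUC evaluation value of a candidate (higher being better):

```python
def extract_top_n(candidates, evaluate, n):
    """Step S202a sketch: rank all candidate motion vectors by their FRUC
    evaluation value and keep the top N as motion vector predictor
    candidates."""
    ranked = sorted(candidates, key=evaluate, reverse=True)
    return ranked[:n]
```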

Furthermore, for each of the N extracted motion vector predictor candidates, the inter prediction unit 126 may correct that candidate by finely shifting it within the surrounding region so that its FRUC evaluation value becomes higher. That is, the inter prediction unit 126 may correct these motion vector predictor candidates by finely searching for regions that yield higher FRUC evaluation values.

The inter prediction unit 126 then selects the motion vector predictor of the prediction block from among the N extracted motion vector predictor candidates (step S202b). At this time, the inter prediction unit 126 outputs motion vector predictor selection information, which is information for identifying the selected motion vector predictor. The entropy encoding unit 110 encodes this motion vector predictor selection information into the stream.

In selecting the motion vector predictor, the inter prediction unit 126 may use the original image of the prediction block, i.e., the current block to be encoded. For example, for each of the N motion vector predictor candidates, the inter prediction unit 126 calculates the difference between the image of the block specified by that candidate and the original image of the prediction block, and selects the candidate with the smallest difference as the motion vector predictor of the prediction block. Alternatively, the inter prediction unit 126 may derive the motion vector of the prediction block by performing a motion search using the original image of the prediction block. In that case, for each of the N motion vector predictor candidates, the inter prediction unit 126 calculates the difference between the image of the block specified by that candidate and the image of the block specified by the derived motion vector, and selects the candidate with the smallest difference as the motion vector predictor of the prediction block.
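The first of these encoder-side selection options can be sketched as follows. `fetch_block` is a hypothetical helper returning the pixels of the block a candidate specifies; note that this comparison against the original image is only possible at the encoder, which is why the chosen candidate is signalled to the decoder as selection information:

```python
def select_predictor_with_original(candidates, original, fetch_block):
    """Encoder-side selection sketch (step S202b): compare the block
    specified by each predictor candidate against the original image of
    the prediction block and pick the candidate with the smallest
    absolute difference."""
    def cost(mv):
        return sum(abs(a - b) for a, b in zip(original, fetch_block(mv)))
    return min(candidates, key=cost)
```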

Next, the inter prediction unit 126 derives the motion vector of the prediction block by referring to an encoded reference picture (step S203). At this time, the inter prediction unit 126 further calculates the difference value between the derived motion vector and the motion vector predictor. The entropy encoding unit 110 encodes this difference value into the stream as motion vector difference information.
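The difference value encoded in step S203 is the component-wise difference between the derived motion vector and the selected predictor, sketched as:

```python
def motion_vector_difference(motion_vector, predictor):
    """Encoder-side sketch (step S203): the value written to the stream is
    the component-wise difference between the derived motion vector and
    the selected predictor; the decoder adds it back to the predictor to
    recover the motion vector."""
    return (motion_vector[0] - predictor[0], motion_vector[1] - predictor[1])
```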

Finally, the inter prediction unit 126 performs motion compensation on the prediction block using the derived motion vector and an encoded reference picture, thereby generating a predicted image for the prediction block (step S204).

Instead of performing motion compensation in units of prediction blocks as described above, the inter prediction unit 126 may likewise derive motion vectors and perform motion compensation in units of sub-blocks obtained by partitioning the prediction block.

FIG. 18 is a flowchart showing another example of motion compensation performed by the decoding device 200 in the present embodiment. When the decoding device 200 shown in FIG. 10 decodes an encoded moving picture composed of a plurality of pictures, the inter prediction unit 218 of the decoding device 200 and related units execute the processing shown in FIG. 18.

Specifically, the inter prediction unit 218 performs motion compensation on each prediction block corresponding to the prediction unit described above, the prediction block being the current block to be decoded. First, the inter prediction unit 218 obtains a plurality of candidate motion vectors for the prediction block based on information such as the motion vectors of a plurality of decoded blocks that temporally or spatially surround the prediction block (step S211).

Next, the inter prediction unit 218 uses the reconstructed image of the already-decoded region to calculate an evaluation value for each of the candidate motion vectors obtained in step S211. That is, the inter prediction unit 218 calculates the evaluation values based on FRUC, i.e., the template matching method or the bilateral matching method. Based on these evaluation values, the inter prediction unit 218 then extracts each of N candidate motion vectors (N being an integer of 2 or more) from among the candidate motion vectors as a motion vector predictor candidate (step S212a). In other words, based on the evaluation results described above, the inter prediction unit 218 extracts N candidate motion vectors from the plurality of candidate motion vectors as the entirety of the at least one motion vector predictor candidate. More specifically, the inter prediction unit 218 extracts, from the plurality of candidate motion vectors, the N candidates ranked highest in evaluation result, that is, the N candidates with the highest evaluation values, each as a motion vector predictor candidate.

Furthermore, for each of the N extracted motion vector predictor candidates, the inter prediction unit 218 may correct that candidate by finely shifting it within the surrounding region so that its FRUC evaluation value becomes higher. That is, the inter prediction unit 218 may correct these motion vector predictor candidates by finely searching for regions that yield higher FRUC evaluation values.

Next, the inter prediction unit 218 uses motion vector predictor selection information, which the entropy decoding unit 202 has decoded from the stream input to the decoding device 200, to select one of the N extracted motion vector predictor candidates as the motion vector predictor of the prediction block (step S212b). That is, the entropy decoding unit 202 decodes the motion vector predictor selection information, which is information for identifying the motion vector predictor, and the inter prediction unit 218 selects, from among the N extracted motion vector predictor candidates, the candidate identified by the decoded selection information as the motion vector predictor.

Next, the inter prediction unit 218 derives the motion vector of the prediction block using motion vector difference information that the entropy decoding unit 202 has decoded from the stream input to the decoding device 200 (step S213). Specifically, the inter prediction unit 218 adds the decoded motion vector difference information, i.e., the difference value, to the selected motion vector predictor, thereby deriving the motion vector of the prediction block. That is, the entropy decoding unit 202 decodes difference information indicating the difference between two motion vectors, and the inter prediction unit 218 adds the selected motion vector predictor to the difference indicated by the decoded difference information, thereby deriving the motion vector of the current block, i.e., the prediction block.

Finally, the inter prediction unit 218 performs motion compensation on the prediction block using the derived motion vector and a decoded reference picture, thereby generating a predicted image for the prediction block (step S214).

Instead of performing motion compensation in units of prediction blocks as described above, the inter prediction unit 218 may likewise derive motion vectors and perform motion compensation in units of sub-blocks obtained by partitioning the prediction block.

FIG. 19 is a diagram for explaining methods of extracting N motion vector predictor candidates from a plurality of candidate motion vectors.

In the examples shown in FIG. 17 and FIG. 18, the inter prediction units 126 and 218 extract, from the plurality of candidate motion vectors, the N candidates with the highest evaluation values, each as a motion vector predictor candidate. Specifically, when N=2, as shown in (a) of FIG. 19, the inter prediction units 126 and 218 extract, from among all the candidate motion vectors, the two candidates with the highest evaluation values as motion vector predictor candidates 1 and 2, respectively.

Alternatively, the inter prediction units 126 and 218 may classify the candidate motion vectors into N groups and extract, from each of the N groups, the one candidate motion vector with the best evaluation result within that group, thereby extracting the entirety of the at least one motion vector predictor candidate. Specifically, when N=2, as shown in (b) of FIG. 19, the inter prediction units 126 and 218 classify all the candidate motion vectors into two groups. The first group contains, for example, candidate motion vectors obtained from the motion vectors of blocks within the current picture; the second group contains, for example, candidate motion vectors obtained from the motion vectors of blocks within pictures other than the current picture.

The inter prediction units 126 and 218 then extract, from the first group, the one candidate motion vector with the highest evaluation value within that group as motion vector predictor candidate 1, and extract, from the second group, the one candidate motion vector with the highest evaluation value within that group as motion vector predictor candidate 2.
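Method (b) can be sketched as taking the best-evaluated candidate of each pre-classified group. The grouping itself (for example, candidates derived from blocks in the current picture versus candidates derived from blocks in other pictures) is assumed to have been done beforehand, and `evaluate` is again an assumed FRUC-style scoring callback (higher being better):

```python
def extract_one_per_group(groups, evaluate):
    """Method (b) of FIG. 19 sketch: the best-evaluated candidate motion
    vector of each non-empty group becomes one motion vector predictor
    candidate, yielding candidates with mutually different properties."""
    return [max(group, key=evaluate) for group in groups if group]
```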

Alternatively, the inter prediction units 126 and 218 may classify the candidate motion vectors into M groups (M being an integer greater than N). From each of the M groups, the inter prediction units 126 and 218 select the one candidate motion vector with the best evaluation result within that group as a representative candidate motion vector. They may then extract, from the M selected representative candidate motion vectors, the N representatives ranked highest in evaluation result as the entirety of the at least one motion vector predictor candidate.

Specifically, when M=3, as shown in (c) of FIG. 19, the inter prediction units 126 and 218 classify all the candidate motion vectors into three groups. The first group contains, for example, candidate motion vectors obtained from the motion vectors of blocks to the left of the current block within the current picture; the second group contains, for example, candidate motion vectors obtained from the motion vectors of blocks above the current block within the current picture; the third group contains, for example, candidate motion vectors obtained from the motion vectors of blocks within pictures other than the current picture.

The inter prediction units 126 and 218 then select, from the first group, the one candidate motion vector with the highest evaluation value within that group as representative candidate motion vector 1; from the second group, the one with the highest evaluation value within that group as representative candidate motion vector 2; and from the third group, the one with the highest evaluation value within that group as representative candidate motion vector 3.

Next, the inter prediction units 126 and 218 extract, from the three representative candidate motion vectors, the two with the highest evaluation values, each as a motion vector predictor candidate.
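Method (c) can be sketched as a two-stage selection: first one representative per group, then the top N representatives by evaluation value. `evaluate` is again an assumed FRUC-style scoring callback (higher being better):

```python
def extract_via_representatives(groups, evaluate, n):
    """Method (c) of FIG. 19 sketch: select one representative per group
    (the candidate with the best evaluation value within that group),
    then keep the N representatives with the highest evaluation values
    as the motion vector predictor candidates."""
    representatives = [max(group, key=evaluate) for group in groups if group]
    representatives.sort(key=evaluate, reverse=True)
    return representatives[:n]
```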

Although two motion vector predictor candidates are extracted in the example shown in FIG. 19, the number is not limited to two; three or more motion vector predictor candidates may be extracted. [Effects of Embodiment 2, etc.]

The encoding device according to the present embodiment is an encoding device that encodes a moving picture, and includes a processing circuit and a memory connected to the processing circuit. Using the memory, the processing circuit: obtains a plurality of candidate motion vectors based on the motion vector of each of a plurality of encoded blocks corresponding to the current block in the moving picture; extracts at least one motion vector predictor candidate for the current block from the candidate motion vectors; derives the motion vector of the current block by referring to a reference picture included in the moving picture; encodes the difference between a motion vector predictor among the extracted at least one motion vector predictor candidate and the derived motion vector of the current block; and performs motion compensation on the current block using the derived motion vector of the current block. In extracting the at least one motion vector predictor candidate, the processing circuit extracts the entirety of the at least one motion vector predictor candidate based on an evaluation result for each of the candidate motion vectors, the evaluation using the reconstructed image of the already-encoded region of the moving picture rather than the image region of the current block. The memory may be the frame memory 122 or another memory, and the processing circuit may include, for example, the inter prediction unit 126 and the entropy encoding unit 110.

In this way, the entirety of the at least one motion vector predictor candidate is extracted based on the evaluation result for each of the candidate motion vectors, that is, the evaluation result obtained by FRUC, which uses the reconstructed image of the already-encoded region of the moving picture rather than the image region of the current block. This can improve the prediction accuracy of the current block, i.e., the prediction block, and thus improve coding efficiency. Furthermore, in the present embodiment, motion vector predictor candidates need not be extracted according to a predetermined priority order. Therefore, once a candidate list for extraction based on the FRUC evaluation results has been generated, all the motion vector predictor candidates can be extracted without generating a separate candidate list for priority-based extraction. Coding efficiency can thus be improved while suppressing any increase in processing load.

In extracting the at least one motion vector predictor candidate, the processing circuit may select, from the candidate motion vectors, only the one candidate with the best evaluation result, thereby extracting the entirety of the at least one motion vector predictor candidate, and, in encoding the difference, may encode the difference between the selected candidate motion vector, i.e., the motion vector predictor, and the derived motion vector of the current block.

In this way, as shown in FIG. 15 for example, one motion vector predictor candidate can be extracted and selected as the motion vector predictor. In contrast, when a plurality of motion vector predictor candidates are extracted and one motion vector predictor is selected from among them, information for identifying the selected motion vector predictor must be encoded and included in the stream. In the example shown in FIG. 15, however, since one motion vector predictor candidate is extracted and selected as the motion vector predictor, no such information needs to be encoded, and the amount of code can therefore be reduced.

Alternatively, in extracting the at least one motion vector predictor candidate, the processing circuit may extract, from the candidate motion vectors, N candidates (N being an integer of 2 or more) based on the evaluation results as the entirety of the at least one motion vector predictor candidate. The processing circuit may further select the motion vector predictor from the N extracted motion vector predictor candidates, encode selection information for identifying the selected motion vector predictor, and, in encoding the difference, encode the difference between the selected motion vector predictor and the derived motion vector of the current block.

In this way, as shown in FIG. 17 for example, a plurality of motion vector predictor candidates can be extracted, and from among them a candidate with higher prediction accuracy can be selected as the motion vector predictor using the image of the current block, i.e., the prediction block. Coding efficiency can thus be improved. Moreover, since selection information for identifying the motion vector predictor selected in this way is encoded, the decoding device can, by decoding this selection information, appropriately identify the motion vector predictor candidate that was selected as the motion vector predictor in the encoding device. The decoding device can thus appropriately decode the encoded moving picture.

Furthermore, in extracting the at least one motion vector predictor candidate, the processing circuit may extract, from the candidate motion vectors, the N candidates ranked highest in evaluation result as the entirety of the at least one motion vector predictor candidate. For example, the evaluation result of each candidate motion vector is better the smaller the difference between the reconstructed image of a first encoded region specified by that candidate motion vector and a second encoded reconstructed image.

In this way, as shown in (a) of FIG. 19 for example, N motion vector predictor candidates with high prediction accuracy can be preferentially selected from the candidate motion vectors.

Furthermore, in extracting the at least one motion vector predictor candidate, the processing circuit may classify the candidate motion vectors into N groups and extract, from each of the N groups, the one candidate motion vector with the best evaluation result within that group, thereby extracting the entirety of the at least one motion vector predictor candidate. For example, as shown in (b) of FIG. 19, the candidate motion vectors are classified into N groups with mutually different properties. Since the one candidate with the best evaluation result is extracted from each of the N groups, N motion vector predictor candidates with mutually different properties and high prediction accuracy can be extracted. As a result, the selection range of the motion vector predictor can be widened, and the likelihood of selecting a motion vector predictor with higher prediction accuracy can be increased.

Furthermore, in extracting the at least one motion vector predictor candidate, the processing circuit may classify the plurality of candidate motion vectors into M groups (M being an integer greater than N), select, from each of the M groups, the one candidate motion vector with the best evaluation result in that group as a representative candidate motion vector, and extract, from the M selected representative candidate motion vectors, the top N representative candidate motion vectors ranked in order of best evaluation result, as all of the at least one motion vector predictor candidate.

In this way, for example, as shown in (c) of FIG. 19, even when the plurality of candidate motion vectors are classified into more groups than the number of motion vector predictor candidates to be extracted (that is, N), N motion vector predictor candidates that differ in nature from one another and have high prediction accuracy can still be extracted.
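The two-stage variant of (c) of FIG. 19 can be sketched as follows, again with a hypothetical `evaluate` cost function and an assumed grouping; only the select-representatives-then-take-top-N structure is taken from the description.

```python
def extract_via_representatives(groups, evaluate, n):
    """Pick the best MV of each of the M groups, then keep the top N of those."""
    representatives = [min(g, key=evaluate) for g in groups if g]  # M representatives
    return sorted(representatives, key=evaluate)[:n]              # top N of the M

cost = {(0, 1): 1.0, (4, 0): 8.0, (1, 1): 3.0, (2, 0): 2.0, (5, 5): 9.0, (3, 3): 4.0}
groups = [[(0, 1), (4, 0)], [(1, 1)], [(2, 0), (5, 5)], [(3, 3)]]  # M = 4 groups
predictors = extract_via_representatives(groups, cost.get, 2)      # N = 2 < M
```

Here M = 4 representatives are reduced to the N = 2 best-evaluated ones, so diversity across groups is considered first and overall cost second.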

The decoding device according to the present embodiment is a decoding device that decodes an encoded moving picture, and includes a processing circuit and a memory connected to the processing circuit. Using the memory, the processing circuit: obtains a plurality of candidate motion vectors based on the motion vector of each of a plurality of decoded blocks corresponding to a current block to be decoded in the moving picture; extracts at least one motion vector predictor candidate for the current block from the plurality of candidate motion vectors; decodes difference information indicating the difference between two motion vectors; derives the motion vector of the current block by adding a motion vector predictor among the extracted at least one motion vector predictor candidate to the difference indicated by the decoded difference information; and performs motion compensation on the current block using the derived motion vector of the current block. In extracting the at least one motion vector predictor candidate, all of the at least one motion vector predictor candidate are extracted based on the evaluation result of each of the plurality of candidate motion vectors, where the evaluation uses reconstructed images of decoded regions in the moving picture rather than the image region of the current block. Note that the memory may be frame memory 214 or other memory, and the processing circuit may include, for example, inter prediction unit 218 and entropy decoding unit 202.

In this way, all of the at least one motion vector predictor candidate are extracted based on the evaluation result of each of the plurality of candidate motion vectors, that is, based on evaluation performed by FRUC, where the evaluation uses reconstructed images of decoded regions in the moving picture rather than the image region of the current block. Accordingly, the prediction accuracy of the current block, i.e., the prediction block, can be improved, and coding efficiency can be improved. Moreover, in the present embodiment, motion vector predictor candidates need not be extracted according to a predetermined priority order. Therefore, generating a single candidate list for extraction based on the FRUC evaluation results suffices to extract all the motion vector predictor candidates; a separate candidate list for priority-order extraction need not be generated. This suppresses an increase in processing load while improving coding efficiency.

Furthermore, in extracting the at least one motion vector predictor candidate, the processing circuit may select, from the plurality of candidate motion vectors, only the one candidate motion vector with the best evaluation result, thereby extracting all of the at least one motion vector predictor candidate, and, in deriving the motion vector of the current block, may derive the motion vector of the current block by adding the selected candidate motion vector, i.e., the motion vector predictor, to the difference indicated by the decoded difference information.

In this way, for example, as shown in FIG. 16, one motion vector predictor candidate is extracted and selected as the motion vector predictor. By contrast, when a plurality of motion vector predictor candidates are extracted, identification information must be decoded from the stream, namely information for identifying, among those motion vector predictor candidates, the one selected by the encoding device. In the example shown in FIG. 16, however, since a single motion vector predictor candidate is extracted and selected as the motion vector predictor, no such information needs to be decoded. The amount of coding can therefore be reduced.

Furthermore, in extracting the at least one motion vector predictor candidate, the processing circuit may extract, from the plurality of candidate motion vectors, N candidate motion vectors (N being an integer of 2 or more) based on the evaluation results, as all of the at least one motion vector predictor candidate. The processing circuit further decodes selection information for identifying the motion vector predictor, selects, from the N extracted motion vector predictor candidates, the motion vector predictor candidate identified by the decoded selection information as the motion vector predictor, and, in deriving the motion vector of the current block, derives the motion vector of the current block by adding the selected motion vector predictor to the difference indicated by the decoded difference information.

In this way, for example, as shown in FIG. 18, a plurality of motion vector predictor candidates can be extracted, and from among them a motion vector predictor candidate with high prediction accuracy can be selected as the motion vector predictor by means of the selection information. Coding efficiency can therefore be improved.
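The decoder-side derivation in the two cases above (a single extracted candidate as in FIG. 16, versus N ≥ 2 candidates identified by signalled selection information as in FIG. 18) can be sketched as follows. The function name and the way the decoded values are passed in are assumptions; in a real decoder the difference and the selection index would come from entropy-decoded syntax elements.

```python
def derive_mv(candidates, mv_delta, selection_index=None):
    """Add the decoded MV difference to the chosen motion vector predictor."""
    if selection_index is None:
        predictor = candidates[0]               # single candidate: nothing signalled
    else:
        predictor = candidates[selection_index]  # index decoded from the stream
    return (predictor[0] + mv_delta[0], predictor[1] + mv_delta[1])

mv_single = derive_mv([(2, 1)], (1, -1))             # FIG. 16 case: no selection info
mv_multi = derive_mv([(2, 1), (0, 3)], (1, -1), 1)   # FIG. 18 case: index = 1
```

The single-candidate path saves the bits of the selection index, while the multi-candidate path trades those bits for a wider choice of predictors.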

Furthermore, in extracting the at least one motion vector predictor candidate, the processing circuit may extract, from the plurality of candidate motion vectors, the top N candidate motion vectors ranked in order of best evaluation result, as all of the at least one motion vector predictor candidate. For example, the evaluation result of each of the plurality of candidate motion vectors is better the smaller the difference between the reconstructed image of a first decoded region specified by that candidate motion vector and a second decoded reconstructed image.

In this way, for example, as shown in (a) of FIG. 19, N motion vector predictor candidates with higher prediction accuracy can be preferentially selected from the plurality of candidate motion vectors.

Furthermore, in extracting the at least one motion vector predictor candidate, the processing circuit may classify the plurality of candidate motion vectors into N groups and, from each of the N groups, extract the one candidate motion vector with the best evaluation result in that group, thereby extracting all of the at least one motion vector predictor candidate. For example, as shown in (b) of FIG. 19, the plurality of candidate motion vectors may be classified into N groups that differ in nature from one another. Since the one motion vector predictor candidate with the best evaluation result is extracted from each of the N groups, N motion vector predictor candidates that differ in nature from one another and have high prediction accuracy can be extracted. As a result, the selection range of motion vector predictors can be expanded, raising the likelihood that a motion vector predictor with higher prediction accuracy is selected.

Furthermore, in extracting the at least one motion vector predictor candidate, the processing circuit may classify the plurality of candidate motion vectors into M groups (M being an integer greater than N), select, from each of the M groups, the one candidate motion vector with the best evaluation result in that group as a representative candidate motion vector, and extract, from the M selected representative candidate motion vectors, the top N representative candidate motion vectors ranked in order of best evaluation result, as all of the at least one motion vector predictor candidate.

In this way, for example, as shown in (c) of FIG. 19, even when the plurality of candidate motion vectors are classified into more groups than the number of motion vector predictor candidates to be extracted (that is, N), N motion vector predictor candidates that differ in nature from one another and have high prediction accuracy can still be extracted.

These general or specific aspects may be implemented as a system, a device, a method, an integrated circuit, a computer program, or a non-transitory recording medium such as a computer-readable CD-ROM, or as any combination of systems, devices, methods, integrated circuits, computer programs, and recording media.
(Embodiment 3) [Switching between FRUC and priority order]

Although the encoding device and the decoding device according to the present embodiment have the same configuration as in Embodiment 1, they are characterized by the processing operations of inter prediction units 126 and 218. That is, like Embodiment 2, the present embodiment addresses the problem described above in [Underlying knowledge forming the basis of the present disclosure], namely that a plurality of mutually different candidate lists must be created for one prediction block, which increases the processing load.

In extracting the at least one motion vector predictor candidate, the encoding device 100 according to the present embodiment encodes mode information for identifying an extraction method. The encoding device 100 then selects, for the current block to be encoded, the extraction method identified by the mode information from a first extraction method and a second extraction method, and extracts the at least one motion vector predictor candidate in accordance with the selected extraction method. Here, the first extraction method is an extraction method based on the evaluation result of each of the plurality of candidate motion vectors, where the evaluation uses reconstructed images of encoded regions in the moving picture rather than the image region of the current block. The second extraction method is an extraction method based on a priority order predetermined for the plurality of candidate motion vectors.

Likewise, in extracting the at least one motion vector predictor candidate, the decoding device 200 according to the present embodiment decodes mode information for identifying an extraction method. The decoding device 200 then selects, for the current block to be decoded, the extraction method identified by the decoded mode information from the first extraction method and the second extraction method, and extracts the at least one motion vector predictor candidate in accordance with the selected extraction method.

That is, the encoding device 100 and the decoding device 200 according to the present embodiment switch, for each prediction block, the method of extracting the at least one motion vector predictor candidate between extraction based on FRUC evaluation results and extraction based on a predetermined priority order.

This eliminates the need to create a plurality of mutually different candidate lists for a prediction block, suppressing an increase in processing load.

FIG. 20 is a flowchart showing a method of selecting a motion vector predictor performed by the encoding device 100 and the decoding device 200 according to the present embodiment.

Inter prediction units 126 and 218 determine whether the mode information indicates 0 or 1 (step S301). The mode information is information for identifying the method of extracting the at least one motion vector predictor candidate. Specifically, when the mode information = 0, it indicates the first extraction method, that is, extraction based on FRUC evaluation results. When the mode information = 1, it indicates the second extraction method, that is, extraction according to a predetermined priority order.

Here, when the mode information is determined to indicate 0, inter prediction units 126 and 218 extract at least one motion vector predictor candidate based on FRUC evaluation results, as in Embodiment 2 (step S302). Specifically, inter prediction units 126 and 218 evaluate each of the plurality of candidate motion vectors using reconstructed images of encoded or decoded regions, and extract at least one motion vector predictor candidate from the plurality of candidate motion vectors based on the evaluation results.

On the other hand, when inter prediction units 126 and 218 determine in step S301 that the mode information indicates 1, they extract N motion vector predictor candidates (N being an integer of 2 or more), as in the examples shown in FIG. 11 and FIG. 12 (step S303). Specifically, inter prediction units 126 and 218 extract N motion vector predictor candidates from the plurality of candidate motion vectors according to a predetermined priority order.

When at least one motion vector predictor candidate has been extracted in step S302, inter prediction units 126 and 218 determine whether the number of extracted motion vector predictor candidates is plural (step S304). Here, when inter prediction units 126 and 218 determine that the number of extracted motion vector predictor candidates is one (No in step S304), they select that extracted motion vector predictor candidate as the motion vector predictor of the current block, i.e., the prediction block (step S305).

On the other hand, when inter prediction units 126 and 218 determine that the number of extracted motion vector predictor candidates is plural (Yes in step S304), they select, from the plurality of motion vector predictor candidates, the motion vector predictor candidate identified by motion vector predictor selection information as the motion vector predictor of the prediction block (step S306).

When N motion vector predictor candidates have been extracted in step S303, inter prediction units 126 and 218 likewise select, from the N motion vector predictor candidates, the motion vector predictor candidate indicated by the motion vector predictor selection information as the motion vector predictor of the prediction block (step S306).
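The flow of steps S301 to S306 can be sketched as follows. This is an illustrative reading of the flowchart only: the cost function, the priority list, and the way the mode and selection index are passed in are assumptions standing in for entropy-decoded syntax.

```python
def select_predictor(mode, candidates, evaluate, priority, n, selection_index):
    """FIG. 20 sketch: choose a motion vector predictor for one prediction block."""
    if mode == 0:                                    # S301 -> S302: FRUC-based extraction
        extracted = sorted(candidates, key=evaluate)[:n]
    else:                                            # S301 -> S303: predetermined priority
        extracted = sorted(candidates, key=priority.index)[:n]
    if len(extracted) == 1:                          # S304 No -> S305
        return extracted[0]
    return extracted[selection_index]                # S304 Yes / S303 -> S306

cost = {(1, 0): 2.0, (0, 1): 1.0, (2, 2): 3.0}       # toy FRUC evaluation results
priority = [(2, 2), (1, 0), (0, 1)]                  # toy predetermined order
mv_fruc = select_predictor(0, list(cost), cost.get, priority, 1, 0)
mv_prio = select_predictor(1, list(cost), cost.get, priority, 2, 1)
```

Both branches draw on the same set of candidate motion vectors, which is what removes the need for two separate candidate lists per prediction block.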

Note that the mode information can be encoded into a stream by entropy encoding unit 110 of the encoding device 100, and decoded from the stream by entropy decoding unit 202 of the decoding device 200.

Such mode information may be encoded in the header region of any of the sequence layer, the picture layer, and the slice layer. That is, entropy encoding unit 110 encodes, in the header region of a given layer, mode information for identifying the extraction method applied to each block included in that layer, and entropy decoding unit 202 decodes, from the header region of the layer, the mode information for identifying the extraction method applied to each block included in that layer.

This makes it possible to switch between FRUC-based extraction and predetermined-priority-order extraction for each sequence, picture, or slice. Alternatively, the mode information may be encoded into the stream in units of prediction blocks. That is, entropy encoding unit 110 encodes, for each block included in the moving picture, mode information for identifying the extraction method applied to that block, and entropy decoding unit 202 decodes, for each block included in the moving picture, the mode information for identifying the extraction method applied to that block. This makes it possible to switch between the extraction methods for each prediction block.

Note that the above values indicated by the mode information (0 or 1) are merely examples; other values may be used. The mode information may also indicate an identifier other than a numerical value. That is, the mode information may indicate any identifier that can distinguish extraction based on FRUC evaluation results from extraction according to a predetermined priority order.

The motion vector predictor selection information described above is encoded into the stream in units of prediction blocks by entropy encoding unit 110 of the encoding device 100, and decoded from the stream by entropy decoding unit 202 of the decoding device 200.
[Effects of Embodiment 3, etc.]

The encoding device according to the present embodiment is an encoding device that encodes a moving picture, and includes a processing circuit and a memory connected to the processing circuit. Using the memory, the processing circuit: obtains a plurality of candidate motion vectors based on the motion vector of each of a plurality of encoded blocks corresponding to a current block to be encoded in the moving picture; extracts at least one motion vector predictor candidate for the current block from the plurality of candidate motion vectors; derives the motion vector of the current block with reference to a reference picture included in the moving picture; encodes the difference between a motion vector predictor among the extracted at least one motion vector predictor candidate and the derived motion vector of the current block; and performs motion compensation on the current block using the derived motion vector of the current block. In extracting the at least one motion vector predictor candidate, the processing circuit encodes mode information for identifying an extraction method, selects, for the current block, the extraction method identified by the mode information from a first extraction method and a second extraction method, and extracts the at least one motion vector predictor candidate in accordance with the selected extraction method. The first extraction method is an extraction method based on the evaluation result of each of the plurality of candidate motion vectors, where the evaluation uses reconstructed images of encoded regions in the moving picture rather than the image region of the current block; the second extraction method is an extraction method based on a priority order predetermined for the plurality of candidate motion vectors. Note that the memory may be frame memory 122 or other memory, and the processing circuit may include, for example, inter prediction unit 126 and entropy encoding unit 110.

In this way, either the first extraction method based on FRUC evaluation results or the second extraction method based on a predetermined priority order can be applied to the current block to be encoded, i.e., the prediction block, in accordance with the mode information; that is, the extraction method can be switched. Accordingly, the prediction accuracy of the prediction block can be improved, and coding efficiency can be improved. Moreover, in the present embodiment, since either of the first and second extraction methods can be applied to a prediction block, there is no need to separately generate, for the prediction block, a candidate list for the first extraction method and a candidate list for the second extraction method. This suppresses an increase in processing load while improving coding efficiency.

In encoding the mode information, the mode information may be encoded in the header region of any of the sequence layer, the picture layer, and the slice layer in the stream of the moving picture, the mode information identifying the extraction method applied to each block included in that layer.

This makes it possible to switch the extraction method in units of sequences, pictures, or slices. Compared with switching the extraction method in units of blocks such as prediction blocks, the amount of coding for the mode information can be further suppressed.

In encoding the mode information, the mode information may also be encoded for each block included in the moving picture, the mode information identifying the extraction method applied to that block.

This makes it possible to switch the extraction method in units of blocks such as prediction blocks. Compared with switching the extraction method in units of sequences, pictures, or slices, the likelihood of improving the prediction accuracy of each block can be increased.

The decoding device according to the present embodiment is a decoding device that decodes an encoded moving picture, and includes a processing circuit and a memory connected to the processing circuit. Using the memory, the processing circuit: obtains a plurality of candidate motion vectors based on the motion vector of each of a plurality of decoded blocks corresponding to a current block to be decoded in the moving picture; extracts at least one motion vector predictor candidate for the current block from the plurality of candidate motion vectors; decodes difference information indicating the difference between two motion vectors; derives the motion vector of the current block by adding a motion vector predictor among the extracted at least one motion vector predictor candidate to the difference indicated by the decoded difference information; and performs motion compensation on the current block using the derived motion vector of the current block. In extracting the at least one motion vector predictor candidate, the processing circuit decodes mode information for identifying an extraction method, selects, for the current block, the extraction method identified by the decoded mode information from a first extraction method and a second extraction method, and extracts the at least one motion vector predictor candidate in accordance with the selected extraction method. The first extraction method is an extraction method based on the evaluation result of each of the plurality of candidate motion vectors, where the evaluation uses reconstructed images of decoded regions in the moving picture rather than the image region of the current block; the second extraction method is an extraction method based on a priority order predetermined for the plurality of candidate motion vectors. Note that the memory may be frame memory 214 or other memory, and the processing circuit may include, for example, inter prediction unit 218 and entropy decoding unit 202.

In this way, either the first extraction method based on FRUC evaluation results or the second extraction method based on a predetermined priority order can be applied to the current block to be decoded, i.e., the prediction block, in accordance with the mode information; that is, the extraction method can be switched. Accordingly, the prediction accuracy of the prediction block can be improved, and coding efficiency can be improved. Moreover, in the present embodiment, since either of the first and second extraction methods can be applied to a prediction block, there is no need to separately generate, for the prediction block, a candidate list for the first extraction method and a candidate list for the second extraction method. This suppresses an increase in processing load while improving coding efficiency.

In the decoding of the mode information, the mode information may be decoded from the header region of any one of the sequence layer, the picture layer, and the slice layer in the stream of the moving picture, the mode information identifying the extraction method applied to each block included in that layer.

In this way, the extraction method can be switched per sequence, per picture, or per slice. Moreover, compared with switching the extraction method per block (for example, per prediction block), the coding amount of the mode information can be further reduced.

Alternatively, in the decoding of the mode information, the mode information may be decoded for each block included in the moving picture, the mode information identifying the extraction method applied to that block.

In this way, the extraction method can be switched per block, for example per prediction block. Compared with switching per sequence, per picture, or per slice, this increases the likelihood of improving the prediction accuracy of each block.
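The switching described above can be pictured as a simple dispatch on the decoded mode information. The following is an illustrative sketch only; the flag values and the string labels are assumptions, not taken from the disclosure:

```python
def pick_extraction_method(mode_info):
    """Dispatch on decoded mode information: the first extraction method
    ranks candidates by FRUC evaluation, while the second follows the
    predefined priority order. (Flag values are illustrative only.)
    """
    if mode_info == 0:
        return "fruc_evaluation"   # first extraction method
    return "priority_order"        # second extraction method

# Mode info may be decoded per block, or once per sequence/picture/slice header.
methods = [pick_extraction_method(m) for m in (0, 1, 0)]
```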

These general or specific aspects may be implemented as a system, an apparatus, a method, an integrated circuit, a computer program, or a non-transitory recording medium such as a computer-readable CD-ROM, or as any combination of systems, apparatuses, methods, integrated circuits, computer programs, and recording media.

(Embodiment 4)

[Extracting motion vector predictor candidates from a common candidate list by FRUC and by priority order]

The encoding apparatus and the decoding apparatus according to the present embodiment have the same configurations as those of Embodiment 1, but are characterized by the processing operations of the inter-frame prediction units 126 and 218. That is, like Embodiments 2 and 3, the present embodiment addresses the problem described above in [Underlying knowledge forming the basis of the present disclosure]: a plurality of mutually different candidate lists must be created for a single prediction block, which increases the processing load.

The encoding apparatus 100 according to the present embodiment extracts, for an encoding target block, N (N is an integer of 2 or more) motion vector predictor candidates from a plurality of candidate motion vectors. In doing so, the encoding apparatus 100 generates a candidate list that indicates the plurality of candidate motion vectors and is common to a first extraction method and a second extraction method. The encoding apparatus 100 then extracts M (M is an integer of at least 1 and less than N) motion vector predictor candidates from the candidate motion vectors indicated by the common candidate list in accordance with the first extraction method, and extracts the remaining L (L = N - M) motion vector predictor candidates from those candidate motion vectors in accordance with the second extraction method. Here, the first extraction method is an extraction method based on an evaluation result of each of the candidate motion vectors, specifically on evaluation results obtained by FRUC, in which the evaluation uses not the image region of the encoding target block but reconstructed images of encoded regions in the moving picture. The second extraction method is an extraction method based on a priority order defined in advance for the plurality of candidate motion vectors.
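The split of the N candidates into M FRUC-ranked candidates and L priority-ordered candidates, all drawn from one shared list, can be sketched as follows. This is a minimal illustrative sketch: the `evaluate` callable stands in for the FRUC evaluation and is not part of the original disclosure.

```python
def extract_predictor_candidates(candidates, evaluate, n, m):
    """Extract N = M + L predictor candidates from one common list.

    candidates: common candidate list (one entry per candidate MV),
                already ordered by the predefined priority.
    evaluate:   FRUC-style evaluation function; higher is better.
    """
    # First extraction method: top-M candidates by FRUC evaluation.
    ranked = sorted(candidates, key=evaluate, reverse=True)
    group1 = ranked[:m]

    # Second extraction method: walk the same list in priority order,
    # skipping entries already taken, until L more are collected.
    l = n - m
    group2 = [c for c in candidates if c not in group1][:l]
    return group1 + group2

# Toy example: candidates are (vx, vy) pairs; the evaluation favors short MVs.
cands = [(4, 0), (1, 1), (0, 2), (3, 3)]
picked = extract_predictor_candidates(
    cands, lambda v: -(abs(v[0]) + abs(v[1])), n=2, m=1)
```

Because both extractions read the same `candidates` list, no second per-block list has to be built, which is the point of the common candidate list.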

Similarly, the decoding apparatus 200 according to the present embodiment extracts, for a decoding target block, N (N is an integer of 2 or more) motion vector predictor candidates from a plurality of candidate motion vectors. In doing so, the decoding apparatus 200 generates a candidate list that indicates the plurality of candidate motion vectors and is common to the first extraction method and the second extraction method. The decoding apparatus 200 then extracts M (M is an integer of at least 1 and less than N) motion vector predictor candidates from the candidate motion vectors indicated by the common candidate list in accordance with the first extraction method, and extracts L (L = N - M) motion vector predictor candidates from those candidate motion vectors in accordance with the second extraction method.

FIG. 21 is a flowchart showing an example of motion compensation performed by the encoding apparatus 100 according to the present embodiment. When the encoding apparatus 100 shown in FIG. 1 encodes a moving picture composed of a plurality of pictures, the inter-frame prediction unit 126 and other components of the encoding apparatus 100 execute the processing shown in FIG. 21.

Specifically, the inter-frame prediction unit 126 performs motion compensation for each prediction block (the encoding target block), each prediction block corresponding to the prediction unit described above. At this time, the inter-frame prediction unit 126 first obtains a plurality of candidate motion vectors for the prediction block, based on information such as the motion vectors of a plurality of encoded blocks that are temporally or spatially located around the prediction block (step S201). The inter-frame prediction unit 126 then generates a candidate list that indicates the candidate motion vectors obtained in step S201 and is common to the extraction method based on evaluation results obtained by FRUC and the extraction method based on the predefined priority order.

Next, the inter-frame prediction unit 126 calculates an evaluation value for each of the candidate motion vectors obtained in step S201, using reconstructed images of encoded regions. That is, the inter-frame prediction unit 126 calculates these evaluation values by FRUC, i.e., by the template matching method or the bilateral matching method. Then, based on these evaluation values, the inter-frame prediction unit 126 extracts M candidate motion vectors from the candidate motion vectors indicated by the common candidate list, each as a motion vector predictor candidate 1 (step S202aa). That is, the inter-frame prediction unit 126 extracts the top M candidate motion vectors in descending order of evaluation value. Furthermore, for each of the M extracted motion vector predictor candidates 1, the inter-frame prediction unit 126 may correct the candidate by shifting it in small steps within its surrounding region so that the FRUC evaluation value becomes higher; that is, it may finely search the region that yields a higher FRUC evaluation value and correct the motion vector predictor candidates 1 accordingly.
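The fine search around an extracted candidate can be pictured as a small exhaustive scan of the surrounding region. This is an illustrative sketch under stated assumptions: `evaluate` stands in for the FRUC evaluation, and the integer search window is an assumption rather than a detail of the disclosure.

```python
def refine_candidate(mv, evaluate, radius=2):
    """Finely search the surrounding region of a candidate MV and keep
    the position whose FRUC-style evaluation value is highest.

    mv:       (vx, vy) integer motion vector predictor candidate.
    evaluate: evaluation function; higher means a better match.
    """
    best, best_val = mv, evaluate(mv)
    for dx in range(-radius, radius + 1):
        for dy in range(-radius, radius + 1):
            probe = (mv[0] + dx, mv[1] + dy)
            val = evaluate(probe)
            if val > best_val:
                best, best_val = probe, val
    return best

# Toy evaluation peaking at (3, -1): the candidate (2, 0) is corrected.
peak = (3, -1)
ev = lambda v: -((v[0] - peak[0]) ** 2 + (v[1] - peak[1]) ** 2)
refined = refine_candidate((2, 0), ev)
```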

In addition, the inter-frame prediction unit 126 extracts L candidate motion vectors from the candidate motion vectors indicated by the common candidate list in accordance with the predefined priority order, each as a motion vector predictor candidate 2 (step S202ab).

The inter-frame prediction unit 126 then selects one of the M motion vector predictor candidates 1 and the L motion vector predictor candidates 2 as the motion vector predictor of the prediction block (step S202b). At this time, the inter-frame prediction unit 126 outputs motion vector predictor selection information for identifying the selected motion vector predictor, and the entropy coding unit 110 encodes this motion vector predictor selection information into the stream.

Next, the inter-frame prediction unit 126 derives the motion vector of the prediction block with reference to an encoded reference picture (step S203). At this time, the inter-frame prediction unit 126 further calculates the difference between the derived motion vector and the motion vector predictor, and the entropy coding unit 110 encodes this difference into the stream as differential motion vector information.
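The difference encoded in step S203 is simply the componentwise difference between the derived motion vector and the selected predictor; only this (usually small) residual is entropy-coded. A minimal sketch, with motion vectors as integer pairs:

```python
def differential_motion_vector(mv, mvp):
    """Step S203: compute the difference between the derived motion
    vector and the selected motion vector predictor; this difference is
    what gets entropy-coded as differential motion vector information."""
    return (mv[0] - mvp[0], mv[1] - mvp[1])

# A derived MV of (5, -2) with predictor (3, 1) yields a small residual.
mvd = differential_motion_vector((5, -2), (3, 1))
```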

Finally, the inter-frame prediction unit 126 performs motion compensation on the prediction block using the derived motion vector and the encoded reference picture, thereby generating a predicted image of the prediction block (step S204).

Instead of performing motion compensation per prediction block as described above, the inter-frame prediction unit 126 may derive motion vectors in the same manner per sub-block obtained by partitioning the prediction block, and perform motion compensation per sub-block.

FIG. 22 is a flowchart showing an example of motion compensation performed by the decoding apparatus 200 according to the present embodiment. When the decoding apparatus 200 shown in FIG. 10 decodes a coded moving picture composed of a plurality of pictures, the inter-frame prediction unit 218 and other components of the decoding apparatus 200 execute the processing shown in FIG. 22.

Specifically, the inter-frame prediction unit 218 performs motion compensation for each prediction block (the decoding target block), each prediction block corresponding to the prediction unit described above. At this time, the inter-frame prediction unit 218 first obtains a plurality of candidate motion vectors for the prediction block, based on information such as the motion vectors of a plurality of decoded blocks that are temporally or spatially located around the prediction block (step S211).

The inter-frame prediction unit 218 then generates a candidate list that indicates the candidate motion vectors obtained in step S211 and is common to the extraction method based on evaluation results obtained by FRUC and the extraction method based on the predefined priority order.

Next, the inter-frame prediction unit 218 calculates an evaluation value for each of the candidate motion vectors obtained in step S211, using reconstructed images of decoded regions. That is, the inter-frame prediction unit 218 calculates these evaluation values by FRUC, i.e., by the template matching method or the bilateral matching method. Then, based on these evaluation values, the inter-frame prediction unit 218 extracts M candidate motion vectors from the candidate motion vectors indicated by the common candidate list, each as a motion vector predictor candidate 1 (step S212aa). That is, the inter-frame prediction unit 218 extracts the top M candidate motion vectors in descending order of evaluation value. Furthermore, for each of the M extracted motion vector predictor candidates 1, the inter-frame prediction unit 218 may correct the candidate by shifting it in small steps within its surrounding region so that the FRUC evaluation value becomes higher; that is, it may finely search the region that yields a higher FRUC evaluation value and correct the motion vector predictor candidates 1 accordingly.

Furthermore, the inter-frame prediction unit 218 extracts L candidate motion vectors from the candidate motion vectors indicated by the common candidate list in accordance with the predefined priority order, each as a motion vector predictor candidate 2 (step S212ab).

Next, using motion vector predictor selection information, the inter-frame prediction unit 218 selects one of the extracted M motion vector predictor candidates 1 and L motion vector predictor candidates 2 as the motion vector predictor of the prediction block (step S212b). That is, the entropy decoding unit 202 decodes the motion vector predictor selection information, which identifies the motion vector predictor of the decoding target block (the prediction block), and the inter-frame prediction unit 218 selects, from the extracted N motion vector predictor candidates 1 and 2, the candidate identified by the decoded motion vector predictor selection information as the motion vector predictor of the prediction block.

Next, the inter-frame prediction unit 218 derives the motion vector of the prediction block using differential motion vector information, which the entropy decoding unit 202 obtains by decoding the stream input to the decoding apparatus 200 (step S213). Specifically, the inter-frame prediction unit 218 adds the decoded difference value indicated by the differential motion vector information to the selected motion vector predictor, thereby deriving the motion vector of the prediction block. That is, the entropy decoding unit 202 decodes the differential motion vector information, which indicates the difference between two motion vectors, and the inter-frame prediction unit 218 adds the selected motion vector predictor to the difference indicated by the decoded information, thereby deriving the motion vector of the decoding target block.
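The decoder-side steps S212b and S213 amount to an index lookup followed by an addition, as the following sketch illustrates (illustrative only; motion vectors are modeled as integer pairs and the candidate ordering is an assumption):

```python
def decode_motion_vector(candidates, selection_index, mvd):
    """Steps S212b/S213: pick the predictor identified by the decoded
    selection information, then add the decoded difference to it."""
    mvp = candidates[selection_index]
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])

# With predictor (3, 1) selected and decoded difference (2, -3),
# the block's motion vector is recovered.
mv = decode_motion_vector([(0, 0), (3, 1)], selection_index=1, mvd=(2, -3))
```

This addition is the exact inverse of the subtraction the encoder performs in step S203, which is why only the difference needs to be transmitted.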

Finally, the inter-frame prediction unit 218 performs motion compensation on the prediction block using the derived motion vector and the decoded reference picture, thereby generating a predicted image of the prediction block (step S214).

Instead of performing motion compensation per prediction block as described above, the inter-frame prediction unit 218 may derive motion vectors in the same manner per sub-block obtained by partitioning the prediction block, and perform motion compensation per sub-block.

Here, in the present embodiment, the extraction according to the second extraction method performed by the inter-frame prediction units 126 and 218, i.e., the extraction according to the predefined priority order, may also make use of the result of the first extraction method, i.e., the result of the extraction based on evaluation results obtained by FRUC. That is, the inter-frame prediction units 126 and 218 extract the L motion vector predictor candidates, in a priority order that makes use of the evaluation results of the first extraction method, from the at least one candidate motion vector that remains in the common candidate list after the M motion vector predictor candidates already extracted by the first extraction method are excluded.

FIG. 23 is a diagram for explaining the methods of extracting motion vector predictor candidates in the present embodiment.

For example, the inter-frame prediction units 126 and 218 classify the candidate motion vectors indicated by the common candidate list into K (K is an integer of 2 or more) groups. In the extraction of the M motion vector predictor candidates 1, the inter-frame prediction units 126 and 218 extract, from the candidate motion vectors indicated by the common candidate list, the top M candidate motion vectors in order of best evaluation result as the M motion vector predictor candidates 1. Then, in the extraction of the L motion vector predictor candidates 2, the inter-frame prediction units 126 and 218 extract, in accordance with the priority order, L motion vector predictor candidates 2 from the one or more candidate motion vectors in the common candidate list that belong to any of the at least one group other than the groups to which the M motion vector predictor candidates 1 belong.

Specifically, as shown in (a) of FIG. 23, when K = 3, M = 1, and L = 1, the inter-frame prediction units 126 and 218 first classify the candidate motion vectors indicated by the common candidate list into three groups G1 to G3. Group G1 is the group of candidate motion vectors obtained from, for example, the motion vector of the block to the left of the encoding target block in the encoding target picture. Group G2 is the group of candidate motion vectors obtained from, for example, the motion vector of the block above the encoding target block in the encoding target picture. Group G3 is the group of candidate motion vectors obtained from, for example, the motion vectors of blocks in pictures other than the encoding target picture.

Next, the inter-frame prediction units 126 and 218 extract, from the candidate motion vectors indicated by the common candidate list, the candidate motion vector with the highest evaluation value as motion vector predictor candidate 1. They then extract, in accordance with the priority order, one candidate motion vector as motion vector predictor candidate 2 from the one or more candidate motion vectors in the common candidate list that belong to either of the groups G2 and G3, i.e., the groups other than the group G1 to which motion vector predictor candidate 1 belongs.
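The group-based extraction of FIG. 23(a) can be sketched as follows. This is an illustrative sketch: the list entries, group labels, and `evaluate` function are assumptions standing in for the real candidate list and FRUC evaluation.

```python
def extract_from_groups(candidates, evaluate):
    """FIG. 23(a) with K = 3, M = 1, L = 1 (illustrative sketch).

    candidates: list of (mv, group) pairs in predefined priority order.
    evaluate:   FRUC-style evaluation function; higher is better.
    """
    # Candidate 1: best FRUC evaluation over the whole common list.
    cand1_mv, cand1_group = max(candidates, key=lambda c: evaluate(c[0]))

    # Candidate 2: highest-priority candidate belonging to a group other
    # than the one candidate 1 belongs to.
    cand2_mv = next(mv for mv, g in candidates if g != cand1_group)
    return cand1_mv, cand2_mv

cands = [((1, 0), "G1"), ((2, 2), "G1"), ((0, 3), "G2"), ((4, 4), "G3")]
c1, c2 = extract_from_groups(cands, lambda v: -(abs(v[0]) + abs(v[1])))
```

Excluding candidate 1's group keeps the two extracted predictors from being near-duplicates of each other.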

Alternatively, the inter-frame prediction units 126 and 218 classify the candidate motion vectors indicated by the common candidate list into K groups and, in the extraction of the M motion vector predictor candidates 1, extract the top M candidate motion vectors in order of best evaluation result as the M motion vector predictor candidates 1, as above. In addition, the inter-frame prediction units 126 and 218 identify, as the next motion vector predictor candidate, the candidate motion vector with the best evaluation result among the candidate motion vectors in the common candidate list that belong to any of the at least one group other than the groups to which the M motion vector predictor candidates 1 belong. Then, in the extraction of the L motion vector predictor candidates 2, the inter-frame prediction units 126 and 218 extract, in accordance with the priority order, L motion vector predictor candidates 2 from the one or more candidate motion vectors in the common candidate list that belong to the same group as the identified next motion vector predictor candidate.

Specifically, as shown in (b) of FIG. 23, when K = 3, M = 1, and L = 1, the inter-frame prediction units 126 and 218 first classify the candidate motion vectors indicated by the common candidate list into three groups, as in the example above.

Next, the inter-frame prediction units 126 and 218 extract, from the candidate motion vectors indicated by the common candidate list, the candidate motion vector with the highest evaluation value as motion vector predictor candidate 1. Furthermore, among the candidate motion vectors in the common candidate list that belong to either of the groups G2 and G3, i.e., the groups other than the group G1 to which motion vector predictor candidate 1 belongs, they identify the candidate motion vector 4 with the highest evaluation value as the next motion vector predictor candidate. The inter-frame prediction units 126 and 218 then extract, in accordance with the priority order, one candidate motion vector as motion vector predictor candidate 2 from the one or more candidate motion vectors in the common candidate list that belong to the same group as the identified next motion vector predictor candidate, i.e., the group G2 to which candidate motion vector 4 belongs.
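The variant of FIG. 23(b) differs from the previous one in that the FRUC evaluation only picks which *group* the second candidate comes from; within that group the priority order decides. Again an illustrative sketch with assumed entries, labels, and evaluation function:

```python
def extract_from_groups_variant(candidates, evaluate):
    """FIG. 23(b) with K = 3, M = 1, L = 1 (illustrative sketch).

    candidates: list of (mv, group) pairs in predefined priority order.
    evaluate:   FRUC-style evaluation function; higher is better.
    """
    # Candidate 1: best FRUC evaluation over the whole common list.
    cand1_mv, cand1_group = max(candidates, key=lambda c: evaluate(c[0]))

    # Identify the best-evaluated candidate among the remaining groups;
    # only its group is used for the second extraction.
    rest = [(mv, g) for mv, g in candidates if g != cand1_group]
    _, next_group = max(rest, key=lambda c: evaluate(c[0]))

    # Candidate 2: highest-priority candidate in that group.
    cand2_mv = next(mv for mv, g in candidates if g == next_group)
    return cand1_mv, cand2_mv

cands = [((1, 0), "G1"), ((0, 3), "G2"), ((5, 5), "G3"), ((0, 2), "G3")]
c1, c2 = extract_from_groups_variant(cands, lambda v: -(abs(v[0]) + abs(v[1])))
```

In this toy run the best-evaluated remaining candidate (0, 2) merely selects group G3; the priority order then yields (5, 5), not (0, 2), as candidate 2.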

FIG. 24 is a diagram showing an example of the common candidate list.

For example, for the encoding target block or decoding target block (hereinafter simply referred to as the processing target block) shown in (a) of FIG. 24, the inter-frame prediction units 126 and 218 generate the common candidate list shown in (b) of FIG. 24. This common candidate list is composed of an L0 list and an L1 list.

Specifically, the inter-frame prediction units 126 and 218 include in the common candidate list candidate motion vectors based on the motion vectors of adjacent blocks 1, 2, and 5, which are adjacent to the processing target block. Adjacent block 1 is the block adjacent to the lower left of the processing target block, adjacent block 2 is the block adjacent to the upper right of the processing target block, and adjacent block 5 is the block adjacent to the upper left of the processing target block.

For example, adjacent block 1 is encoded or decoded with motion vectors mvL01 and mvL11, adjacent block 2 with motion vectors mvL02 and mvL12, and adjacent block 5 with motion vector mvL05. In this case, as shown in (b) of FIG. 24, the inter-frame prediction units 126 and 218 include candidate motion vectors based on these motion vectors in the common candidate list as spatial candidate motion vectors. The inter-frame prediction units 126 and 218 may also include, as spatial candidate motion vectors, candidate motion vectors based on the motion vectors of other adjacent blocks, for example adjacent block 3 to the right of adjacent block 2 or adjacent block 4 below adjacent block 1. Furthermore, the inter-frame prediction units 126 and 218 may scale the motion vector of an adjacent block according to the display time interval and include the scaled motion vector in the candidate list as a candidate motion vector.

Furthermore, the inter-frame prediction units 126 and 218 may include temporal candidate motion vectors and a combined bi-predictive candidate motion vector (mvL0b, mvL1b) in the candidate list. The temporal candidate motion vectors include, for example, a Col candidate motion vector (mvL0t, mvL1t) and a unilateral candidate motion vector (mvL0u, mvL1u). The Col candidate motion vector (mvL0t, mvL1t) is a candidate motion vector based on the motion vector of a block in a picture other than the picture containing the processing target block, for example a block located at the same position as the processing target block. The Col candidate motion vector (mvL0t, mvL1t) may also be a candidate motion vector obtained by scaling such a motion vector according to the display time interval. The Col candidate motion vector may further be based on the motion vector of a block located at a position different from that of the processing target block, and a plurality of mutually different Col candidate motion vectors may be included in the candidate list. The unilateral candidate motion vector (mvL0u, mvL1u) is a candidate motion vector based on the motion vector of a block in a picture other than the picture containing the processing target block, at a position that takes into account the amount of movement over time relative to the position of the processing target block. The combined bi-predictive candidate motion vector (mvL0b, mvL1b) is a candidate motion vector generated by combining motion vectors from the L0 list and the L1 list of the candidate list.
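The scaling by display time interval mentioned for the Col candidate can be sketched as below. This is a minimal sketch under stated assumptions: the two-distance formulation (distance spanned by the original motion vector versus distance to the current reference) is a common convention and not a detail stated in this text.

```python
def scale_motion_vector(mv, td, tb):
    """Scale a collocated block's motion vector by the ratio of display
    time intervals: td is the interval spanned by the original MV, tb the
    interval between the current picture and its reference picture.
    """
    return (round(mv[0] * tb / td), round(mv[1] * tb / td))

# A collocated MV spanning 4 picture intervals, reused across 2 intervals,
# is halved before entering the candidate list.
mvL0t = scale_motion_vector((8, -4), td=4, tb=2)
```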

In the present embodiment, the candidate list illustrated, for example, in (b) of FIG. 24 can be used in common by the extraction method based on FRUC evaluation results and the extraction method based on a predetermined priority order.
[Effects of Embodiment 4, etc.]

The coding apparatus according to the present embodiment is a coding apparatus that codes a moving picture, and includes a processing circuit and a memory connected to the processing circuit. Using the memory, the processing circuit: obtains a plurality of candidate motion vectors based on the motion vector of each of a plurality of coded blocks corresponding to a coding target block in the moving picture; extracts, from the plurality of candidate motion vectors, N (N is an integer of 2 or more) motion vector predictor candidates for the coding target block; selects a motion vector predictor from among the N extracted motion vector predictor candidates, and codes selection information for identifying the selected motion vector predictor; derives the motion vector of the coding target block with reference to a reference picture included in the moving picture; codes the difference between the derived motion vector of the coding target block and the selected motion vector predictor; and performs motion compensation on the coding target block using the derived motion vector of the coding target block. In the extraction of the N motion vector predictor candidates, the processing circuit generates a candidate list indicating the plurality of candidate motion vectors, the candidate list being common to a first extraction method and a second extraction method; extracts, from the plurality of candidate motion vectors indicated in the common candidate list, M (M is an integer of 1 or more and less than N) motion vector predictor candidates in accordance with the first extraction method; and extracts, from the plurality of candidate motion vectors indicated in the common candidate list, L (L = N - M) motion vector predictor candidates in accordance with the second extraction method. The first extraction method is an extraction method based on an evaluation result of each of the plurality of candidate motion vectors, the evaluation using, instead of the image region of the coding target block, reconstructed images of coded regions in the moving picture. The second extraction method is an extraction method based on a priority order predetermined for the plurality of candidate motion vectors. Note that the memory may be the frame memory 122 or another memory, and the processing circuit may include, for example, the inter prediction unit 126 and the entropy coding unit 110.
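The two-stage extraction from a single common candidate list can be sketched as follows. This is an illustrative Python sketch: `evaluate` stands in for the FRUC-style cost computed on reconstructed regions and `priority` for the predetermined priority order, and both, along with the function names, are assumptions rather than definitions from the specification.

```python
def extract_predictor_candidates(candidate_list, evaluate, priority, m, l):
    """Extract N = m + l motion vector predictor candidates from one
    common candidate list.

    First extraction method: the m candidates with the best (lowest)
    evaluation cost, computed without the target block's own pixels.
    Second extraction method: l further candidates in a predetermined
    priority order, skipping those already taken by the first method."""
    picked = sorted(candidate_list, key=evaluate)[:m]   # first method
    for mv in sorted(candidate_list, key=priority):     # second method
        if len(picked) == m + l:
            break
        if mv not in picked:
            picked.append(mv)
    return picked
```

With M = 1 and L = 1, the first slot goes to the lowest-cost candidate and the second to the highest-priority candidate not already chosen, both read from the same list.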

With this configuration, M motion vector predictor candidates can be extracted in accordance with the first extraction method, that is, based on FRUC evaluation results. Accordingly, the prediction accuracy of the coding target block, that is, the prediction block, can be improved, and coding efficiency can be increased. Furthermore, in the present embodiment, a candidate list common to the first extraction method and the second extraction method can be generated. That is, the common candidate list can be referred to both when M motion vector predictor candidates are extracted in accordance with the first extraction method and when L motion vector predictor candidates are extracted in accordance with the second extraction method based on the predetermined priority order. As a result, there is no need to generate, for each prediction block, separate candidate lists for the first extraction method and the second extraction method. Accordingly, an increase in processing load can be suppressed while coding efficiency is improved.

Furthermore, in the extraction in accordance with the second extraction method, the processing circuit may extract the L motion vector predictor candidates, in accordance with the priority order that makes use of the evaluation results of the first extraction method, from at least one candidate motion vector among the plurality of candidate motion vectors indicated in the common candidate list, excluding the M motion vector predictor candidates extracted by the first extraction method.

For example, as illustrated in FIG. 21, the extraction in accordance with the second extraction method (e.g., step S202ab) can refer to the result of the extraction in accordance with the first extraction method (e.g., step S202aa). This makes it possible to prevent the same candidate motion vector from being extracted as a motion vector predictor candidate by both the first extraction method and the second extraction method.

Furthermore, in the extraction of the N motion vector predictor candidates, the processing circuit may classify the plurality of candidate motion vectors indicated in the common candidate list into K (K is an integer of 2 or more) groups. In the extraction in accordance with the first extraction method, the top M candidate motion vectors in order of better evaluation result are extracted, as the M motion vector predictor candidates, from the plurality of candidate motion vectors indicated in the common candidate list. In the extraction in accordance with the second extraction method, the L motion vector predictor candidates are extracted, in accordance with the priority order, from one or more candidate motion vectors that, in the common candidate list, belong to any of at least one group other than the groups to which the M motion vector predictor candidates respectively belong. For example, the evaluation result of each of the plurality of candidate motion vectors is better as the difference between the reconstructed image of a first coded region specified by the candidate motion vector and a second coded reconstructed image is smaller.

For example, as illustrated in (a) of FIG. 23, the plurality of candidate motion vectors can be classified into K groups of mutually different natures. One (M = 1) motion vector predictor candidate with the best evaluation result can then be extracted from the K groups as a whole, and another (L = 1) motion vector predictor candidate can be extracted, in accordance with the predetermined priority order, from the groups other than the group to which that motion vector predictor candidate belongs. This makes it possible to extract two (N = 2) motion vector predictor candidates that differ in nature from each other and have high prediction accuracy. As a result, the selection range of the motion vector predictor can be broadened, and the likelihood that a motion vector predictor with higher prediction accuracy is selected can be increased.
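Under the assumption that each candidate carries a group label (e.g., spatial vs. temporal) and that cost and priority functions are available, the FIG. 23(a)-style scheme with M = 1 and L = 1 might look like this sketch; the grouping, cost, and priority here are illustrative assumptions.

```python
def extract_from_different_groups(candidates, group_of, evaluate, priority):
    """FIG. 23(a)-style sketch: first pick the best-evaluated candidate
    over all K groups, then pick a second candidate by priority order
    from the groups the first pick does not belong to."""
    best = min(candidates, key=evaluate)
    other_groups = [mv for mv in candidates if group_of(mv) != group_of(best)]
    second = min(other_groups, key=priority)  # predetermined priority order
    return [best, second]
```

Note that the second pick honors the priority order even when another candidate in those groups has a better evaluation result.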

Furthermore, in the extraction of the N motion vector predictor candidates, the processing circuit may classify the plurality of candidate motion vectors indicated in the common candidate list into K (K is an integer of 2 or more) groups. In the extraction in accordance with the first extraction method, the top M candidate motion vectors in order of better evaluation result are extracted, as the M motion vector predictor candidates, from the plurality of candidate motion vectors indicated in the common candidate list, and, further, the candidate motion vector with the best evaluation result is identified as a next motion vector predictor candidate from among the candidate motion vectors that, in the common candidate list, belong to any of at least one group other than the groups to which the M motion vector predictor candidates respectively belong. In the extraction in accordance with the second extraction method, the L motion vector predictor candidates are extracted, in accordance with the priority order, from one or more candidate motion vectors that, in the common candidate list, belong to the same group as the group to which the identified next motion vector predictor candidate belongs.

For example, as illustrated in (b) of FIG. 23, the plurality of candidate motion vectors can be classified into K groups of mutually different natures. One (M = 1) motion vector predictor candidate with the best evaluation result can then be extracted from the K groups as a whole, and a next motion vector predictor candidate can be identified from the groups other than the group to which that motion vector predictor candidate belongs. In addition, another (L = 1) motion vector predictor candidate can be extracted, in accordance with the priority order, from the same group as the group to which the next motion vector predictor candidate belongs. This makes it possible to extract two (N = 2) motion vector predictor candidates that differ in nature from each other and have high prediction accuracy. As a result, the selection range of the motion vector predictor can be broadened, and the likelihood that a motion vector predictor with higher prediction accuracy is selected can be increased.
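The FIG. 23(b)-style variant adds an intermediate step: the best-evaluated candidate outside the first pick's group only identifies a group, and the second candidate is then taken from that group by priority order. A sketch, again with assumed grouping, cost, and priority functions:

```python
def extract_via_next_candidate(candidates, group_of, evaluate, priority):
    """FIG. 23(b)-style sketch: identify the "next" predictor candidate
    (best evaluation outside the first pick's group), then extract the
    second candidate by priority order from that same group."""
    best = min(candidates, key=evaluate)
    others = [mv for mv in candidates if group_of(mv) != group_of(best)]
    nxt = min(others, key=evaluate)          # identifies the group only
    same_group = [mv for mv in others if group_of(mv) == group_of(nxt)]
    second = min(same_group, key=priority)
    return [best, second]
```

The "next" candidate itself is not necessarily the one extracted; the priority order within its group decides the actual second pick.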

The decoding apparatus according to the present embodiment is a decoding apparatus that decodes a coded moving picture, and includes a processing circuit and a memory connected to the processing circuit. Using the memory, the processing circuit: obtains a plurality of candidate motion vectors based on the motion vector of each of a plurality of decoded blocks corresponding to a decoding target block in the moving picture; extracts, from the plurality of candidate motion vectors, N (N is an integer of 2 or more) motion vector predictor candidates for the decoding target block; decodes selection information for identifying a motion vector predictor of the decoding target block; selects, as the motion vector predictor, the motion vector predictor candidate identified by the decoded selection information from among the N extracted motion vector predictor candidates; decodes difference information indicating the difference between two motion vectors; derives the motion vector of the decoding target block by adding the selected motion vector predictor to the difference indicated by the decoded difference information; and performs motion compensation on the decoding target block using the derived motion vector of the decoding target block. In the extraction of the N motion vector predictor candidates, the processing circuit generates a candidate list indicating the plurality of candidate motion vectors, the candidate list being common to a first extraction method and a second extraction method; extracts, from the plurality of candidate motion vectors indicated in the common candidate list, M (M is an integer of 1 or more and less than N) motion vector predictor candidates in accordance with the first extraction method; and extracts, from the plurality of candidate motion vectors indicated in the common candidate list, L (L = N - M) motion vector predictor candidates in accordance with the second extraction method. The first extraction method is an extraction method based on an evaluation result of each of the plurality of candidate motion vectors, the evaluation using, instead of the image region of the decoding target block, reconstructed images of decoded regions in the moving picture. The second extraction method is an extraction method based on a priority order predetermined for the plurality of candidate motion vectors. Note that the memory may be the frame memory 214 or another memory, and the processing circuit may include, for example, the inter prediction unit 218 and the entropy decoding unit 202.
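On the decoding side, deriving the motion vector from the decoded selection information and difference information reduces to an index lookup plus an addition, as in this minimal sketch (the names are illustrative, not from the specification):

```python
def reconstruct_motion_vector(predictor_candidates, selection_index, mv_diff):
    """Decoder-side sketch: the motion vector predictor identified by the
    decoded selection information, plus the decoded difference, yields
    the motion vector of the decoding target block."""
    mvp = predictor_candidates[selection_index]
    return (mvp[0] + mv_diff[0], mvp[1] + mv_diff[1])
```

Because encoder and decoder build the same common candidate list, the transmitted index identifies the same predictor on both sides.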

With this configuration, M motion vector predictor candidates can be extracted in accordance with the first extraction method, that is, based on FRUC evaluation results. Accordingly, the prediction accuracy of the decoding target block, that is, the prediction block, can be improved, and coding efficiency can be increased. Furthermore, in the present embodiment, a candidate list common to the first extraction method and the second extraction method can be generated. That is, the common candidate list can be referred to both when M motion vector predictor candidates are extracted in accordance with the first extraction method and when L motion vector predictor candidates are extracted in accordance with the second extraction method based on the predetermined priority order. As a result, there is no need to generate, for each prediction block, separate candidate lists for the first extraction method and the second extraction method. Accordingly, an increase in processing load can be suppressed while coding efficiency is improved.

Furthermore, in the extraction in accordance with the second extraction method, the processing circuit may extract the L motion vector predictor candidates, in accordance with the priority order that makes use of the evaluation results of the first extraction method, from at least one candidate motion vector among the plurality of candidate motion vectors indicated in the common candidate list, excluding the M motion vector predictor candidates extracted by the first extraction method.

For example, as illustrated in FIG. 22, the extraction in accordance with the second extraction method (e.g., step S212ab) refers to the result of the extraction in accordance with the first extraction method (e.g., step S212aa). This makes it possible to prevent the same candidate motion vector from being extracted as a motion vector predictor candidate by both the first extraction method and the second extraction method.

Furthermore, in the extraction of the N motion vector predictor candidates, the processing circuit may classify the plurality of candidate motion vectors indicated in the common candidate list into K (K is an integer of 2 or more) groups. In the extraction in accordance with the first extraction method, the top M candidate motion vectors in order of better evaluation result are extracted, as the M motion vector predictor candidates, from the plurality of candidate motion vectors indicated in the common candidate list. In the extraction in accordance with the second extraction method, the L motion vector predictor candidates are extracted, in accordance with the priority order, from one or more candidate motion vectors that, in the common candidate list, belong to any of at least one group other than the groups to which the M motion vector predictor candidates respectively belong. For example, the evaluation result of each of the plurality of candidate motion vectors is better as the difference between the reconstructed image of a first decoded region specified by the candidate motion vector and a second decoded reconstructed image is smaller.
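The evaluation used by the first extraction method compares two reconstructed regions rather than the target block itself, with a smaller difference counting as a better result. A minimal sketch of such a cost follows; using a sum of absolute differences over flattened pixel arrays is an assumption, since the specification does not fix the metric.

```python
def evaluation_cost(region_a, region_b):
    """Sum of absolute differences between two reconstructed regions;
    a smaller value is a better evaluation result for the candidate."""
    return sum(abs(a - b) for a, b in zip(region_a, region_b))
```

A candidate whose pointed-to reconstructed region matches the reference region more closely therefore ranks higher in the first extraction method.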

For example, as illustrated in (a) of FIG. 23, the plurality of candidate motion vectors can be classified into K groups of mutually different natures. One (M = 1) motion vector predictor candidate with the best evaluation result can then be extracted from the K groups as a whole, and another (L = 1) motion vector predictor candidate can be extracted, in accordance with the predetermined priority order, from the groups other than the group to which that motion vector predictor candidate belongs. This makes it possible to extract two (N = 2) motion vector predictor candidates that differ in nature from each other and have high prediction accuracy. As a result, the selection range of the motion vector predictor can be broadened, and the likelihood that a motion vector predictor with higher prediction accuracy is selected can be increased.

Furthermore, in the extraction of the N motion vector predictor candidates, the processing circuit may classify the plurality of candidate motion vectors indicated in the common candidate list into K (K is an integer of 2 or more) groups. In the extraction in accordance with the first extraction method, the top M candidate motion vectors in order of better evaluation result are extracted, as the M motion vector predictor candidates, from the plurality of candidate motion vectors indicated in the common candidate list, and, further, the candidate motion vector with the best evaluation result is identified as a next motion vector predictor candidate from among the candidate motion vectors that, in the common candidate list, belong to any of at least one group other than the groups to which the M motion vector predictor candidates respectively belong. In the extraction in accordance with the second extraction method, the L motion vector predictor candidates are extracted, in accordance with the priority order, from one or more candidate motion vectors that, in the common candidate list, belong to the same group as the group to which the identified next motion vector predictor candidate belongs.

For example, as illustrated in (b) of FIG. 23, the plurality of candidate motion vectors can be classified into K groups of mutually different natures. One (M = 1) motion vector predictor candidate with the best evaluation result can then be extracted from the K groups as a whole, and a next motion vector predictor candidate can be identified from the groups other than the group to which that motion vector predictor candidate belongs. In addition, another (L = 1) motion vector predictor candidate can be extracted, in accordance with the priority order, from the same group as the group to which the next motion vector predictor candidate belongs. This makes it possible to extract two (N = 2) motion vector predictor candidates that differ in nature from each other and have high prediction accuracy. As a result, the selection range of the motion vector predictor can be broadened, and the likelihood that a motion vector predictor with higher prediction accuracy is selected can be increased.

These general or specific aspects may be implemented by a system, an apparatus, a method, an integrated circuit, a computer program, or a non-transitory recording medium such as a computer-readable CD-ROM, or by any combination of systems, apparatuses, methods, integrated circuits, computer programs, and recording media.
[Implementation Examples]

FIG. 25 is a block diagram illustrating an implementation example of the coding apparatus 100 according to each of the above embodiments. The coding apparatus 100 includes a processing circuit 160 and a memory 162. For example, the plurality of constituent elements of the coding apparatus 100 illustrated in FIG. 1 are implemented by the processing circuit 160 and the memory 162 illustrated in FIG. 25.

The processing circuit 160 is a circuit that performs information processing and that can access the memory 162. For example, the processing circuit 160 may be a dedicated or general-purpose electronic circuit that codes a moving picture. The processing circuit 160 may be a processor such as a CPU, and may be an aggregate of a plurality of electronic circuits. For example, the processing circuit 160 may also serve the roles of, among the plurality of constituent elements of the coding apparatus 100 illustrated in FIG. 1, the constituent elements other than those for storing information.

The memory 162 is a general-purpose or dedicated memory that stores information used by the processing circuit 160 to code the moving picture. The memory 162 may be an electronic circuit, and may be connected to the processing circuit 160. The memory 162 may also be included in the processing circuit 160, and may be an aggregate of a plurality of electronic circuits. The memory 162 may be a magnetic disk, an optical disc, or the like, and may be expressed as storage, a recording medium, or the like. The memory 162 may be a non-volatile memory or a volatile memory.

For example, the memory 162 may store the moving picture to be coded, or may store a bit string corresponding to the coded moving picture. The memory 162 may also store a program used by the processing circuit 160 to code the moving picture.

For example, the memory 162 may also serve the roles of, among the plurality of constituent elements of the coding apparatus 100 illustrated in FIG. 1, the constituent elements for storing information. Specifically, the memory 162 may serve the roles of the block memory 118 and the frame memory 122 illustrated in FIG. 1. More specifically, the memory 162 may store processed sub-blocks, processed blocks, processed pictures, and the like.

Note that the coding apparatus 100 need not implement all of the plurality of constituent elements illustrated in FIG. 1 and the like, and need not perform all of the plurality of processes described above. Some of the plurality of constituent elements illustrated in FIG. 1 and the like may be included in another apparatus, and some of the plurality of processes described above may be performed by another apparatus. By implementing some of the plurality of constituent elements illustrated in FIG. 1 and the like and performing some of the plurality of processes described above, the coding apparatus 100 can appropriately process the moving picture with a small amount of coding.

FIG. 26 is a block diagram illustrating an implementation example of the decoding apparatus 200 according to each of the above embodiments. The decoding apparatus 200 includes a processing circuit 260 and a memory 262. For example, the plurality of constituent elements of the decoding apparatus 200 illustrated in FIG. 10 are implemented by the processing circuit 260 and the memory 262 illustrated in FIG. 26.

The processing circuit 260 is a circuit that performs information processing and that can access the memory 262. For example, the processing circuit 260 is a general-purpose or dedicated electronic circuit that decodes a moving picture. The processing circuit 260 may be a processor such as a CPU, and may be an aggregate of a plurality of electronic circuits. For example, the processing circuit 260 may also serve the roles of, among the plurality of constituent elements of the decoding apparatus 200 illustrated in FIG. 10, the constituent elements other than those for storing information.

The memory 262 is a general-purpose or dedicated memory that stores information used by the processing circuit 260 to decode the moving picture. The memory 262 may be an electronic circuit, and may be connected to the processing circuit 260. The memory 262 may also be included in the processing circuit 260, and may be an aggregate of a plurality of electronic circuits. The memory 262 may be a magnetic disk, an optical disc, or the like, and may be expressed as storage, a recording medium, or the like. The memory 262 may be a non-volatile memory or a volatile memory.

For example, the memory 262 may store a bit string corresponding to the coded moving picture, or may store a moving picture corresponding to the decoded bit string. The memory 262 may also store a program used by the processing circuit 260 to decode the moving picture.

For example, the memory 262 may also serve the roles of, among the plurality of constituent elements of the decoding apparatus 200 illustrated in FIG. 10, the constituent elements for storing information. Specifically, the memory 262 may serve the roles of the block memory 210 and the frame memory 214 illustrated in FIG. 10. More specifically, the memory 262 may store processed sub-blocks, processed blocks, processed pictures, and the like.

Note that the decoding apparatus 200 need not implement all of the plurality of constituent elements illustrated in FIG. 10 and the like, and need not perform all of the plurality of processes described above. Some of the plurality of constituent elements illustrated in FIG. 10 and the like may be included in another apparatus, and some of the plurality of processes described above may be performed by another apparatus. By implementing some of the plurality of constituent elements illustrated in FIG. 10 and the like and performing some of the plurality of processes described above, the decoding apparatus 200 can appropriately process the moving picture with a small amount of coding.
[Supplementary Notes]

上述各實施形態中的編碼裝置100及解碼裝置200亦可分別作為圖像編碼裝置及圖像解碼裝置來利用,亦可分別作為動態圖像編碼裝置及動態圖像解碼裝置來利用。或者,編碼裝置100及解碼裝置200可分別作為框間預測裝置來利用。也就是說,編碼裝置100及解碼裝置200亦可分別僅對應於框間預測部126及框間預測部218。The encoding device 100 and the decoding device 200 in each of the above embodiments may be used as an image encoding device and an image decoding device, respectively, or may be used as a moving image encoding device and a moving image decoding device, respectively. Alternatively, the encoding device 100 and the decoding device 200 can be utilized as inter-frame prediction devices, respectively. That is to say, the encoding device 100 and the decoding device 200 may correspond to only the inter-frame prediction unit 126 and the inter-frame prediction unit 218, respectively.

又，在上述各實施形態中，雖然是將預測區塊作為編碼對象區塊或解碼對象區塊來編碼或解碼，但是編碼對象區塊或解碼對象區塊並不限於預測區塊，亦可為子區塊，且亦可為其他區塊。Further, in each of the above embodiments, a prediction block is encoded or decoded as the block to be encoded or the block to be decoded, but the block to be encoded or decoded is not limited to a prediction block, and may be a sub-block or another block.

又,在上述各實施形態中,各構成要件可由專用之硬體構成,亦可藉由執行適合於各構成要件之軟體程式來實現。各構成要件亦可藉由CPU或處理器等之程式執行部將已記錄於硬碟或半導體記憶體等記錄媒體的軟體程式讀取並執行來實現。Further, in each of the above embodiments, each constituent element may be constituted by a dedicated hardware, or may be realized by executing a software program suitable for each constituent element. Each component can also be realized by reading and executing a software program recorded on a recording medium such as a hard disk or a semiconductor memory by a program execution unit such as a CPU or a processor.

具體來說,編碼裝置100及解碼裝置200亦可各自具備有處理電路(Processing Circuitry)、及電連接於該處理電路之可由該處理電路存取之儲存裝置(Storage)。Specifically, the encoding device 100 and the decoding device 200 may each include a processing circuit (Processing Circuitry) and a storage device (Storage) electrically connected to the processing circuit and accessible by the processing circuit.

處理電路包含專用的硬體及程式執行部之至少一個，且是利用儲存裝置來執行處理。又，儲存裝置在處理電路包含程式執行部的情形下，會儲存可藉由該程式執行部執行之軟體程式。The processing circuit includes at least one of dedicated hardware and a program execution unit, and executes processing using the storage device. When the processing circuit includes a program execution unit, the storage device stores a software program executed by the program execution unit.

在此,實現上述各實施形態之編碼裝置100或解碼裝置200等的軟體,是如以下的程式。Here, the software for realizing the encoding device 100, the decoding device 200, and the like of the above embodiments is as follows.

也就是說,該程式是使電腦執行依照圖15~圖18及圖20~圖22當中的任一個所示的流程圖之處理。That is, the program is a process for causing the computer to execute the flowchart shown in any one of FIGS. 15 to 18 and FIGS. 20 to 22.

又,如上所述,各構成要件亦可為電路。這些電路亦可整體構成為1個電路,亦可是各自不同的電路。又,各構成要件亦可利用通用的處理器來實現,亦可利用專用的處理器來實現。Further, as described above, each constituent element may be an electric circuit. These circuits may be integrally formed as one circuit or may be different circuits. Further, each component may be implemented by a general-purpose processor or by a dedicated processor.

又,亦可令另外的構成要件執行特定的構成要件所執行的處理。又,亦可將執行處理的順序變更,且亦可將複數個處理並行來執行。又,亦可使編碼解碼裝置具備有編碼裝置100及解碼裝置200。Further, it is also possible to cause another constituent element to execute the processing performed by the specific constituent element. Further, the order of executing the processing may be changed, or a plurality of processing may be performed in parallel. Further, the coding and decoding apparatus may be provided with the coding apparatus 100 and the decoding apparatus 200.

於說明中所用的第1及第2等的序數也可以適當地更換。又，對於構成要件等，可將序數重新給與，亦可去除。Ordinal numbers such as "first" and "second" used in the description may be replaced as appropriate. Further, ordinal numbers may be newly given to, or removed from, constituent elements and the like.

以上，雖然根據各實施形態來說明編碼裝置100及解碼裝置200的態樣，但編碼裝置100及解碼裝置200的態樣並非限定於這些實施形態之態樣。只要不脫離本揭示之主旨，而將本發明所屬技術領域中具有通常知識者可設想得到之各種變形施行於實施形態者、或組合不同的實施形態中的構成要件而建構之形態，均可包含於編碼裝置100及解碼裝置200的態樣之範圍內。 (實施形態5)Although aspects of the encoding device 100 and the decoding device 200 have been described above based on the respective embodiments, the aspects of the encoding device 100 and the decoding device 200 are not limited to these embodiments. Forms obtained by applying various modifications conceivable by a person of ordinary skill in the art to the embodiments, or by combining constituent elements from different embodiments, may also be included within the scope of the aspects of the encoding device 100 and the decoding device 200, as long as they do not depart from the spirit of the present disclosure. (Embodiment 5)

在以上之各實施形態中,功能方塊的每一個通常可藉由MPU及記憶體等來實現。又,功能方塊的每一個所進行之處理,通常是藉由使處理器等程式執行部將已記錄於ROM等記錄媒體的軟體(程式)讀出並執行來實現。該軟體可藉由下載等來發布,亦可記錄於半導體記憶體等記錄媒體來發布。再者,當然也可以藉由硬體(專用電路)來實現各功能方塊。In each of the above embodiments, each of the functional blocks can be generally implemented by an MPU, a memory, or the like. Further, the processing performed by each of the function blocks is usually realized by causing a program execution unit such as a processor to read and execute a software (program) recorded on a recording medium such as a ROM. The software can be distributed by downloading or the like, or can be recorded on a recording medium such as a semiconductor memory. Furthermore, it is of course also possible to implement the functional blocks by hardware (dedicated circuit).

又,在各實施形態中所說明的處理,可藉由利用單一的裝置(系統)而集中處理來實現、或者亦可藉由利用複數個裝置而分散處理來實現。又,執行上述程式的處理器可為單個,亦可為複數個。亦即,可進行集中處理、或者亦可進行分散處理。Further, the processing described in each embodiment can be realized by a centralized processing using a single device (system), or can be realized by distributed processing using a plurality of devices. Moreover, the processor executing the above program may be a single or a plurality of processors. That is, it is possible to perform centralized processing or distributed processing.

本發明不受以上之實施例所限定，可進行各種的變更，且該等變更亦包含於本發明之範圍內。The present invention is not limited to the above embodiments; various modifications can be made, and such modifications are also included within the scope of the present invention.

在此，更進一步地說明上述各實施形態所示之動態圖像編碼方法(圖像編碼方法)或動態圖像解碼方法(圖像解碼方法)的應用例及利用其之系統。該系統之特徵在於具有使用圖像編碼方法之圖像編碼裝置、使用圖像解碼方法之圖像解碼裝置、及具備兩者之圖像編碼解碼裝置。針對系統中的其他構成，可以因應於情況而適當地變更。 [使用例]Here, application examples of the moving image encoding method (image encoding method) or the moving image decoding method (image decoding method) described in each of the above embodiments, and systems using them, will be further described. The system is characterized by including an image encoding device using the image encoding method, an image decoding device using the image decoding method, and an image encoding/decoding device including both. Other configurations in the system may be changed as appropriate depending on the situation. [Usage Examples]

圖27是顯示實現內容發送服務(content delivery service)的內容供給系統ex100之整體構成的圖。將通訊服務的提供地區分割成所期望的大小,且在各格區(cell)內分別設置固定無線電台即基地台ex106、ex107、ex108、ex109、ex110。FIG. 27 is a diagram showing the overall configuration of a content supply system ex100 that realizes a content delivery service. The area where the communication service is provided is divided into a desired size, and base stations ex106, ex107, ex108, ex109, and ex110, which are fixed radio stations, are provided in each of the cells.

在此內容供給系統ex100中，可透過網際網路服務提供者ex102或通訊網ex104、及基地台ex106~ex110，將電腦ex111、遊戲機ex112、相機ex113、家電ex114、及智慧型手機ex115等各機器連接到網際網路ex101。該內容供給系統ex100亦可構成為組合並連接上述之任一要件。亦可在不透過作為固定無線電台之基地台ex106~ex110的情況下，將各機器透過電話網或近距離無線等直接或間接地相互連接。又，串流伺服器(streaming server)ex103，是透過網際網路ex101等而與電腦ex111、遊戲機ex112、相機ex113、家電ex114、及智慧型手機ex115等各機器相連接。又，串流伺服器ex103是透過衛星ex116而與飛機ex117內之熱點(hot spot)內的終端等連接。In the content supply system ex100, devices such as a computer ex111, a game machine ex112, a camera ex113, a home appliance ex114, and a smartphone ex115 are connected to the Internet ex101 via an Internet service provider ex102 or a communication network ex104 and base stations ex106 to ex110. The content supply system ex100 may also be configured by combining and connecting any of the above elements. The devices may also be connected to each other directly or indirectly via a telephone network, short-range wireless, or the like, without going through the base stations ex106 to ex110, which are fixed wireless stations. Further, a streaming server ex103 is connected to the devices such as the computer ex111, the game machine ex112, the camera ex113, the home appliance ex114, and the smartphone ex115 via the Internet ex101 or the like. Further, the streaming server ex103 is connected to terminals and the like in a hot spot inside an aircraft ex117 via a satellite ex116.

再者，亦可取代基地台ex106~ex110，而使用無線存取點或熱點等。又，串流伺服器ex103可在不透過網際網路ex101或網際網路服務提供者ex102的情形下直接與通訊網ex104連接，亦可在不透過衛星ex116的情形下直接與飛機ex117連接。Furthermore, wireless access points, hot spots, or the like may be used instead of the base stations ex106 to ex110. Further, the streaming server ex103 may be connected directly to the communication network ex104 without going through the Internet ex101 or the Internet service provider ex102, and may be connected directly to the aircraft ex117 without going through the satellite ex116.

相機ex113是數位相機等可進行靜態圖攝影、及動態圖攝影之機器。又，智慧型手機ex115為對應於一般稱作2G、3G、3.9G、4G、還有今後被稱為5G的移動通訊系統之方式的智慧型電話機、行動電話機、或者PHS(Personal Handyphone System(個人手持電話系統))等。The camera ex113 is a device capable of still image shooting and video shooting, such as a digital camera. The smartphone ex115 is a smartphone, a mobile phone, a PHS (Personal Handyphone System) device, or the like that supports mobile communication systems generally called 2G, 3G, 3.9G, or 4G, as well as what will hereafter be called 5G.

家電ex118可為冰箱、或包含於家庭用燃料電池汽電共生系統(cogeneration system)之機器等。The home appliance ex118 may be a refrigerator or a machine included in a household fuel cell cogeneration system.

在內容供給系統ex100中，具有攝影功能之終端是透過基地台ex106等來連接到串流伺服器ex103，藉此使實況(live)即時發送等變得可行。在實況即時發送中，終端(電腦ex111、遊戲機ex112、相機ex113、家電ex114、智慧型手機ex115、及飛機ex117內的終端等)是對使用者利用該終端所攝影之靜態圖或動態圖內容進行在上述各實施形態所說明的編碼處理，並對藉由編碼而得到的影像資料、及將對應於影像的聲音進行編碼而成的聲音資料進行多工化，來將所獲得的資料傳送至串流伺服器ex103。亦即，各終端是作為本發明的一個態樣的圖像編碼裝置而發揮功能。In the content supply system ex100, a terminal having a shooting function is connected to the streaming server ex103 via the base station ex106 or the like, which makes live streaming and the like possible. In live streaming, a terminal (the computer ex111, the game machine ex112, the camera ex113, the home appliance ex114, the smartphone ex115, a terminal in the aircraft ex117, or the like) performs the encoding processing described in each of the above embodiments on still image or video content shot by a user using that terminal, multiplexes the video data obtained by the encoding with audio data obtained by encoding the audio corresponding to the video, and transmits the obtained data to the streaming server ex103. In other words, each terminal functions as an image encoding device according to one aspect of the present invention.

另一方面，串流伺服器ex103會進行內容資料之串流發送，該內容資料即是對有要求之客戶端(client)傳送的內容資料。客戶端是指可將已經過上述編碼處理之資料解碼的電腦ex111、遊戲機ex112、相機ex113、家電ex114、智慧型手機ex115、及飛機ex117內之終端等。已接收到所發送之資料的各機器會將所接收到之資料解碼處理並播放。亦即，各機器是作為本發明之一個態樣的圖像解碼裝置而發揮功能。 [分散處理]On the other hand, the streaming server ex103 streams content data to clients that request it. A client is the computer ex111, the game machine ex112, the camera ex113, the home appliance ex114, the smartphone ex115, a terminal in the aircraft ex117, or the like, capable of decoding the data that has undergone the above encoding processing. Each device that has received the transmitted data decodes and plays back the received data. In other words, each device functions as an image decoding device according to one aspect of the present invention. [Distributed Processing]

又，串流伺服器ex103亦可為複數個伺服器或複數台電腦，且將資料分散並處理或記錄以進行發送。例如，串流伺服器ex103可藉由CDN(內容傳遞網路，Contents Delivery Network)來實現，亦可藉由分散於全世界的多數個邊緣伺服器(edge server)與連接邊緣伺服器之間的網路來實現內容發送。在CDN上，會因應於客戶來動態地分配在物理上相近之邊緣伺服器。並且，可以藉由將內容快取(cache)及發送至該邊緣伺服器來減少延遲。又，由於可以在發生某種錯誤時或因流量之增加等而改變通訊狀態時，以複數個邊緣伺服器將處理分散、或將發送主體切換為其他的邊緣伺服器，來繞過已發生障礙的網路的部分以持續發送，因此可以實現高速且穩定的發送。The streaming server ex103 may also be a plurality of servers or a plurality of computers that distribute data for processing, recording, or delivery. For example, the streaming server ex103 may be realized as a CDN (Contents Delivery Network), in which content delivery is realized by a network connecting many edge servers distributed around the world. On a CDN, a physically close edge server is dynamically assigned in response to each client, and delay can be reduced by caching and delivering content to that edge server. Further, when some kind of error occurs, or when the communication state changes due to an increase in traffic or the like, processing can be distributed among a plurality of edge servers, or delivery can be switched to another edge server to bypass the failed part of the network and continue, so high-speed and stable delivery can be realized.
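The edge-server assignment described above can be sketched in a few lines. This is only an illustrative sketch, not part of the disclosure: the server list, coordinates, and planar distance metric are all invented stand-ins for the geographic/IP routing a real CDN would use, and failover is modeled simply by skipping unhealthy servers.

```python
# Hypothetical sketch of dynamic edge-server assignment: pick the
# physically closest healthy edge server for a client, skipping any
# server on which a failure has occurred. All names are illustrative.
import math

EDGE_SERVERS = [
    {"name": "edge-tokyo", "lat": 35.68, "lon": 139.69, "healthy": True},
    {"name": "edge-osaka", "lat": 34.69, "lon": 135.50, "healthy": True},
    {"name": "edge-fukuoka", "lat": 33.59, "lon": 130.40, "healthy": False},
]

def distance(client, server):
    # Rough planar distance; a real CDN would use routing metrics.
    return math.hypot(client["lat"] - server["lat"],
                      client["lon"] - server["lon"])

def assign_edge(client):
    healthy = [s for s in EDGE_SERVERS if s["healthy"]]
    if not healthy:
        raise RuntimeError("no edge server available")
    return min(healthy, key=lambda s: distance(client, s))

client = {"lat": 34.70, "lon": 135.20}  # a client near Osaka
print(assign_edge(client)["name"])
```

Marking a server unhealthy and re-running the assignment models the failover behavior: traffic is rerouted around the failed part of the network without the client changing anything.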

又，不僅是發送本身的分散處理，已攝影的資料之編碼處理亦可在各終端進行，且也可在伺服器側進行，亦可互相分擔來進行。作為一例，一般在編碼處理中，會進行2次處理循環。在第1次的循環中是檢測在框或場景單位下之圖像的複雜度或編碼量。又，在第2次的循環中是進行維持畫質並提升編碼效率的處理。例如，藉由使終端進行第1次的編碼處理，且使接收內容之伺服器側進行第2次的編碼處理，可以減少在各終端之處理負荷並且提升內容的質與效率。此時，只要有以近乎即時的方式來進行接收並解碼的要求，也可以用其他終端來接收並播放終端已進行的第一次之編碼完成資料，因此也可做到更靈活的即時發送。Further, not only delivery itself but also the encoding of shot data may be performed in a distributed manner: it may be done at each terminal, on the server side, or shared between them. As an example, encoding generally involves two processing loops. The first loop detects the complexity or the amount of code of an image in units of frames or scenes, and the second loop performs processing that maintains image quality and improves encoding efficiency. For example, by having the terminal perform the first-pass encoding and having the server that receives the content perform the second-pass encoding, the processing load on each terminal can be reduced while the quality and efficiency of the content are improved. In this case, when there is a request to receive and decode in near real time, the first-pass encoded data produced by the terminal can also be received and played back by another terminal, which allows more flexible real-time delivery.

作為其他的例子，相機ex113等是從圖像中進行特徵量擷取，並將與特徵量相關之資料作為元資料(meta data)來壓縮並傳送至伺服器。伺服器會進行例如從特徵量判斷目標(object)之重要性並切換量化精度等的因應圖像之意義的壓縮。特徵量資料對於在伺服器之再度壓縮時的運動向量預測之精度及效率提升特別有效。又，亦可在終端進行VLC(可變長度編碼)等之簡易的編碼，並在伺服器進行CABAC(上下文參考之適應性二值算術編碼方式)等處理負荷較大的編碼。As another example, the camera ex113 or the like extracts feature quantities from an image, compresses data related to the feature quantities as metadata, and transmits it to the server. The server performs compression in accordance with the meaning of the image, for example by judging the importance of an object from the feature quantities and switching the quantization precision. The feature quantity data is particularly effective in improving the precision and efficiency of motion vector prediction when the server compresses the data again. Alternatively, simple encoding such as VLC (variable-length coding) may be performed at the terminal, while encoding with a large processing load such as CABAC (context-adaptive binary arithmetic coding) is performed at the server.

此外，作為其他的例子，在運動場、購物商場、或工廠等中，會有藉由複數個終端拍攝幾乎相同的場景之複數個影像資料存在的情況。此時，可利用已進行攝影之複數個終端、與因應需要而沒有進行攝影之其他終端及伺服器，以例如GOP(圖片群組，Group of Picture)單位、圖片單位、或將圖片分割而成之圖塊(tile)單位等來各自分配編碼處理而進行分散處理。藉此，可以減少延遲，而更加能夠實現即時性(real-time)。As another example, in a stadium, a shopping mall, a factory, or the like, there may be a plurality of pieces of video data in which almost the same scene is shot by a plurality of terminals. In this case, distributed processing can be performed by assigning encoding tasks, for example in GOP (Group of Pictures) units, picture units, or tile units obtained by dividing a picture, among the plurality of terminals that did the shooting and, as needed, other terminals and servers that did not. In this way, delay can be reduced and real-time performance can be better realized.

又，由於複數個影像資料幾乎為相同的場景，因此亦可利用伺服器進行管理及/或指示，以將在各終端所攝影之影像資料互相地配合參照。或者，亦可使伺服器接收來自各終端之編碼完成資料，並在複數個資料間變更參照關係、或者補正或更換圖片本身並重新編碼。藉此，可以生成已提高一個個資料之質與效率的串流(stream)。Since the plurality of pieces of video data show almost the same scene, the server may perform management and/or give instructions so that the video data shot by the terminals can be mutually cross-referenced. Alternatively, the server may receive the encoded data from the terminals, change the reference relationships among the plurality of pieces of data, or correct or replace pictures themselves and re-encode them. In this way, a stream in which the quality and efficiency of each piece of data have been improved can be generated.

又,伺服器亦可在進行變更影像資料之編碼方式的轉碼(transcode)後再發送影像資料。例如,伺服器亦可將MPEG類之編碼方式轉換為VP類,亦可將H.264轉換為H.265。Moreover, the server can also transmit the image data after transcoding the encoding method of the changed image data. For example, the server can also convert the encoding method of the MPEG class into a VP class, and can also convert H.264 to H.265.

如此,即可藉由終端或1個以上的伺服器來進行編碼處理。因此,以下雖然使用「伺服器」或「終端」等記載來作為進行處理之主體,但亦可在終端進行在伺服器進行之處理的一部分或全部,且亦可在伺服器進行在終端進行之處理的一部分或全部。又,有關於上述內容,針對解碼處理也是同樣的。 [3D、多角度]In this way, the encoding process can be performed by the terminal or by one or more servers. Therefore, although the following describes the main body of the processing using the descriptions such as "server" or "terminal", some or all of the processing performed by the server may be performed at the terminal, or the server may perform the processing at the terminal. Part or all of the treatment. Further, regarding the above, the same applies to the decoding process. [3D, multi-angle]

近年來，以下作法也在逐漸增加中，即，將以彼此幾乎同步的複數台相機ex113及/或智慧型手機ex115等之終端所攝影到的不同場景、或者將從不同的角度攝影相同的場景之圖像或影像加以整合並利用。各終端所攝影到之影像會根據另外取得的終端間之相對的位置關係、或者包含於影像之特徵點為一致的區域等而被整合。In recent years, it has become increasingly common to integrate and use images or videos of different scenes shot by terminals such as a plurality of cameras ex113 and/or smartphones ex115 that are almost synchronized with each other, or of the same scene shot from different angles. The videos shot by the terminals are integrated based on the relative positional relationship between the terminals obtained separately, regions in which feature points included in the videos match, and the like.

伺服器不僅對二維的動態圖像進行編碼，亦可根據動態圖像的場景解析等而自動地、或者在使用者所指定的時刻中，對靜態圖進行編碼並傳送至接收終端。此外，伺服器在可以取得攝影終端間之相對的位置關係的情況下，不僅是二維動態圖像，還可以根據相同場景從不同的角度所攝影之影像，來生成該場景之三維形狀。再者，伺服器亦可將藉由點雲(point cloud)而生成之三維的資料另外編碼，亦可根據使用三維資料來辨識或追蹤人物或目標的結果，而從複數個終端所攝影的影像中選擇、或再構成並生成欲傳送至接收終端的影像。The server not only encodes two-dimensional video, but may also encode a still image, automatically based on scene analysis of the video or the like, or at a time specified by the user, and transmit it to the receiving terminal. Furthermore, when the server can obtain the relative positional relationship between the shooting terminals, it can generate not only two-dimensional video but also the three-dimensional shape of a scene from videos of the same scene shot from different angles. The server may separately encode three-dimensional data generated as a point cloud, or may select or reconstruct, from the videos shot by the plurality of terminals, the video to be transmitted to the receiving terminal, based on the result of recognizing or tracking a person or an object using the three-dimensional data.

如此,使用者可以任意選擇對應於各攝影終端之各影像來享受場景,也可以享受從利用複數個圖像或影像再構成之三維資料中切出任意視點而成的影像之內容。此外,與影像同樣地,聲音也可從複數個不同的角度進行收音,且伺服器亦可配合影像,將來自特定之角度或空間的聲音與影像進行多工化並傳送。In this way, the user can arbitrarily select the respective images corresponding to the respective imaging terminals to enjoy the scene, and can also enjoy the content of the image obtained by cutting out arbitrary viewpoints from the three-dimensional data reconstructed from the plurality of images or images. In addition, similar to the image, the sound can be collected from a plurality of different angles, and the server can also cooperate with the image to multiplex and transmit the sound and image from a specific angle or space.

又，近年來，Virtual Reality(虛擬實境，VR)及Augmented Reality(擴增虛擬實境，AR)等將現實世界與虛擬世界建立對應之內容也逐漸普及。在VR圖像的情形下，伺服器亦可分別製作右眼用及左眼用之視點圖像，並藉由Multi-View Coding(多視圖編碼，MVC)等在各視點影像間進行容許參照之編碼，亦可不互相參照而作為不同的串流來進行編碼。在不同的串流之解碼時，可使其互相同步來播放，以因應使用者之視點來重現虛擬的三維空間。In recent years, content that links the real world with the virtual world, such as Virtual Reality (VR) and Augmented Reality (AR), has also become widespread. In the case of VR images, the server may create viewpoint images for the right eye and the left eye separately, and either encode them so that reference between the viewpoint videos is allowed, using Multi-View Coding (MVC) or the like, or encode them as separate streams without mutual reference. When the separate streams are decoded, they can be played back in synchronization with each other so that a virtual three-dimensional space is reproduced in accordance with the user's viewpoint.

在AR圖像的情形下，伺服器會根據三維之位置或使用者之視點的移動，將虛擬空間上之虛擬物體資訊重疊於現實空間之相機資訊。解碼裝置亦可取得或保持虛擬物體資訊及三維資料，並因應使用者之視點的移動而生成二維圖像並順暢地連結，藉以製作重疊資料。或者，亦可為解碼裝置除了虛擬物體資訊之委託之外還將使用者的視點之移動也傳送至伺服器，且伺服器配合從保持於伺服器之三維資料中所接收到的視點的移動來製作重疊資料，並將重疊資料編碼且發送至解碼裝置。再者，亦可為重疊資料除了RGB以外還具有顯示穿透度的α值，伺服器將從三維資料所製作出之目標以外的部分之α值設定為0等，並在該部分為穿透狀態下進行編碼。或者，伺服器亦可如色度鍵(chroma key)的形式，將規定之值的RGB值設定為背景，而生成目標以外之部分是形成為背景色之資料。In the case of AR images, the server superimposes virtual object information in the virtual space on camera information in the real space, based on a three-dimensional position or the movement of the user's viewpoint. The decoding device may obtain or hold the virtual object information and three-dimensional data, generate two-dimensional images in accordance with the movement of the user's viewpoint, and smoothly connect them, thereby creating superimposed data. Alternatively, the decoding device may transmit the movement of the user's viewpoint to the server in addition to a request for virtual object information, and the server may create superimposed data in accordance with the received movement of the viewpoint from the three-dimensional data held by the server, encode the superimposed data, and transmit it to the decoding device. The superimposed data may have, in addition to RGB, an alpha value indicating transparency; the server may set the alpha value of portions other than the object created from the three-dimensional data to 0 or the like, and encode those portions in a transparent state. Alternatively, as with a chroma key, the server may set an RGB value of a predetermined value as the background, and generate data in which portions other than the object are set to the background color.
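The per-pixel behavior of the two compositing schemes described above can be sketched as follows. This is a hedged illustration, not the disclosed method: it shows standard alpha blending (alpha 0 means the camera image shows through) and a chroma-key test against one designated key color; the value ranges and key color are assumptions.

```python
# Illustrative per-pixel compositing of superimposed AR data over a
# camera frame. Pixels are (R, G, B) tuples in 0..255; superimposed
# pixels carry an extra alpha channel (0 = transparent, 255 = opaque).

def alpha_composite(camera_px, overlay_px):
    r, g, b, a = overlay_px
    alpha = a / 255.0
    # Blend overlay over camera; alpha 0 leaves the camera pixel as-is.
    return tuple(round(alpha * o + (1 - alpha) * c)
                 for o, c in zip((r, g, b), camera_px))

def chroma_key(camera_px, overlay_rgb, key_rgb=(0, 255, 0)):
    # Pixels matching the designated key color let the camera show through.
    return camera_px if overlay_rgb == key_rgb else overlay_rgb

print(alpha_composite((100, 100, 100), (200, 0, 0, 128)))  # half-opaque red
print(chroma_key((10, 20, 30), (0, 255, 0)))               # key color: camera
```

Setting the alpha of everything outside the generated object to 0, as the text describes, makes `alpha_composite` return the camera pixel unchanged there, so only the object is overlaid.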

同樣地，被發送之資料的解碼處理可在客戶端即各終端進行，亦可在伺服器側進行，亦可互相分擔而進行。作為一例，亦可使某個終端暫時將接收要求傳送至伺服器，並在其他終端接收因應該要求的內容且進行解碼處理，再將解碼完成之訊號傳送至具有顯示器的裝置。藉由不依靠可通訊之終端本身的性能而將處理分散並選擇適當之內容的作法，可以播放畫質良好的資料。又，作為其他的例子，亦可用TV等接收大尺寸之圖像資料，並將圖片分割後之圖塊等一部分的區域解碼，並顯示於鑑賞者之個人終端。藉此，可以將整體圖片共有化，並且可以就近確認自己負責的領域或想要更詳細地確認之區域。Similarly, decoding of the transmitted data may be performed at each terminal as a client, on the server side, or shared between them. As an example, one terminal may temporarily send a reception request to the server, another terminal may receive the content corresponding to that request and perform decoding, and the decoded signal may then be transmitted to a device having a display. By distributing the processing and selecting appropriate content regardless of the performance of the communicable terminal itself, data of good picture quality can be played back. As another example, large-size image data may be received by a TV or the like, and a partial region, such as a tile obtained by dividing the picture, may be decoded and displayed on a viewer's personal terminal. In this way, the whole picture can be shared, while the region the viewer is responsible for, or wishes to check in more detail, can be examined at hand.

又，今後可預想到下述情形：不論屋內外，在近距離、中距離、或長距離之無線通訊為可複數使用的狀況下，利用MPEG-DASH等之發送系統規格，一邊對連接中的通訊切換適當的資料一邊無縫地接收內容。藉此，使用者不僅對本身之終端，連設置於屋內外之顯示器等的解碼裝置或顯示裝置都可自由地選擇並且即時切換。又，可以做到根據本身的位置資訊等，一邊切換要進行解碼之終端及要進行顯示之終端並一邊進行解碼。藉此，也可在往目的地之移動中，一邊在埋入有可顯示之元件的鄰近建築物的牆面或地面的一部分顯示地圖資訊，一邊移動。又，也可做到如下情形，即，令編碼資料快取到可以在短時間內從接收終端進行存取之伺服器、或者複製到內容傳遞伺服器(content delivery server)中的邊緣伺服器等，根據在網路上對編碼資料的存取容易性，來切換接收資料之位元率(bit-rate)。 [可調式編碼]In the future, it is expected that, indoors or outdoors, in situations where multiple short-range, mid-range, or long-range wireless connections can be used, content will be received seamlessly using delivery system standards such as MPEG-DASH while switching to appropriate data over the active connections. In this way, the user can freely select and switch in real time not only the user's own terminal but also decoding devices or display devices such as displays installed indoors or outdoors. It also becomes possible to decode while switching, based on the user's own position information and the like, between the terminal that performs decoding and the terminal that performs display. This makes it possible, while moving toward a destination, to display map information on part of the wall surface or ground of a nearby building in which a displayable device is embedded. It is also possible to switch the bit rate of received data based on how easily the encoded data can be accessed over the network, for example by caching the encoded data on a server that can be accessed from the receiving terminal in a short time, or by copying it to an edge server in a content delivery server. [Scalable Coding]

關於內容之切換，是利用圖25所示之可調整的串流來進行說明，該可調整的串流應用了在上述各實施形態中所示之動態圖像編碼方法，並進行壓縮編碼。雖然伺服器具有複數個內容相同而質卻不同的串流來作為個別的串流也無妨，但亦可如圖示般構成為藉由分層來進行編碼，而實現時間上/空間上可調整之串流，並活用該串流的特徵來切換內容。亦即，藉由使解碼側因應性能這種內在要因與通訊頻帶之狀態等的外在要因來決定要解碼至哪一層，解碼側即可自由地切換低解析度之內容與高解析度之內容來解碼。例如，當想在回家後以網路電視等機器收看於移動中以智慧型手機ex115收看之影像的後續時，該機器只要將相同的串流解碼至不同的層即可，因此可以減輕伺服器側的負擔。Switching of content will be described using the scalable stream shown in FIG. 25, which is compression-encoded by applying the moving image encoding method described in each of the above embodiments. The server may hold a plurality of individual streams with the same content but different quality; alternatively, as illustrated, it may realize a temporally/spatially scalable stream by encoding in layers, and switch content by exploiting the characteristics of that stream. In other words, by having the decoding side decide up to which layer to decode, based on internal factors such as its performance and external factors such as the state of the communication band, the decoding side can freely switch between decoding low-resolution content and high-resolution content. For example, when a user wants to continue watching, at home on a device such as an Internet TV, a video he or she was watching on the smartphone ex115 while on the move, the device only needs to decode the same stream up to a different layer, so the burden on the server side can be reduced.
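The layer-selection decision described above can be sketched as follows. This is an illustrative sketch only: the layer table, bitrates, and resolutions are invented, and the rule (decode up to the highest layer that both the communication band and the display can sustain, with the base layer always available) is one simple way to combine the internal and external factors mentioned in the text.

```python
# Illustrative layer selection for a temporally/spatially scalable
# stream: the base layer is always decodable; each enhancement layer
# is taken only if bandwidth and display resolution both allow it.

LAYERS = [
    {"id": 0, "bitrate_kbps": 500,  "height": 360},   # base layer
    {"id": 1, "bitrate_kbps": 2000, "height": 720},   # enhancement 1
    {"id": 2, "bitrate_kbps": 8000, "height": 2160},  # enhancement 2
]

def choose_layer(bandwidth_kbps, display_height):
    chosen = LAYERS[0]  # fall back to the base layer
    for layer in LAYERS[1:]:
        if (layer["bitrate_kbps"] <= bandwidth_kbps
                and layer["height"] <= display_height):
            chosen = layer
    return chosen["id"]

print(choose_layer(3000, 1080))   # e.g. smartphone on a mobile network
print(choose_layer(50000, 2160))  # e.g. Internet TV at home
```

The same stream serves both devices in the example from the text: the smartphone decodes up to a lower layer on the move, and the TV at home decodes the same stream up to a higher layer.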

此外，如上述地，除了實現按每一層將圖片編碼、且在基本層之上位存在增強層(enhancement layer)之具可調整性(scalability)的構成以外，亦可使增強層包含有根據圖像之統計資訊等的元資訊，且使解碼側根據元資訊對基本層之圖片進行超解析，藉此來生成高畫質化之內容。所謂超解析可以是相同解析度中的SN比之提升、以及解析度之擴大的任一種。元資訊包含：用於特定超解析處理中使用之線形或非線形的濾波係數之資訊、或者特定超解析處理中使用之濾波處理、機械學習或最小平方運算中的參數值之資訊等。Further, as described above, in addition to a scalable configuration in which pictures are encoded per layer and an enhancement layer exists above the base layer, the enhancement layer may include meta information such as statistical information based on the image, and the decoding side may super-resolve the pictures of the base layer based on the meta information, thereby generating high-quality content. Super-resolution here may be either an improvement in the SN ratio at the same resolution or an increase in resolution. The meta information includes information for identifying linear or nonlinear filter coefficients used in super-resolution processing, or information identifying parameter values in filter processing, machine learning, or least-squares operations used in super-resolution processing, and the like.

或者，亦可構成為因應圖像內之目標等的含義而將圖片分割為圖塊等，且使解碼側選擇欲解碼之圖塊，藉此僅將一部分之區域解碼。又，藉由將目標之屬性(人物、車、球等)與影像內之位置(同一圖像中的座標位置等)作為元資訊加以儲存，解碼側即可根據元資訊特定出所期望之目標的位置，並決定包含該目標之圖塊。例如，如圖29所示，可使用HEVC中的SEI訊息等與像素資料為不同之資料保存構造來保存元資訊。此元資訊是表示例如主目標之位置、尺寸、或色彩等。Alternatively, the picture may be divided into tiles or the like in accordance with the meaning of objects or the like in the image, and the decoding side may select a tile to decode, thereby decoding only a partial region. Further, by storing attributes of an object (person, car, ball, etc.) and its position within the video (coordinate position in the same image, etc.) as meta information, the decoding side can identify the position of a desired object based on the meta information and determine the tile that contains the object. For example, as shown in FIG. 29, the meta information may be stored using a data storage structure different from that of the pixel data, such as an SEI message in HEVC. This meta information indicates, for example, the position, size, or color of the main object.

又，亦可以串流、序列或隨機存取單位等由複數個圖片構成之單位來保存元資訊。藉此，解碼側可以取得特定人物出現在影像內之時刻等，且藉由與圖片單位之資訊對照，可以特定出目標存在之圖片、以及目標在圖片內的位置。 [網頁之最佳化]The meta information may also be stored in units made up of a plurality of pictures, such as a stream, a sequence, or a random access unit. In this way, the decoding side can obtain, for example, the time at which a specific person appears in the video, and by matching this against per-picture information, can identify the picture in which the object exists and the position of the object within the picture. [Web Page Optimization]

圖30是顯示電腦ex111等中的網頁的顯示畫面例之圖。圖31是顯示智慧型手機ex115等中的網頁的顯示畫面例之圖。如圖30及圖31所示，在網頁包含複數個對圖像內容之鏈接即鏈接圖像的情況下，其外觀會依閱覽之元件而不同。在畫面上可看到複數個鏈接圖像的情況下，直至使用者明確地選擇鏈接圖像、或者鏈接圖像接近畫面之中央附近或鏈接圖像之整體進入畫面內為止，顯示裝置(解碼裝置)都是顯示具有各內容之靜態圖或I圖片(框內編碼畫面，Intra Picture)作為鏈接圖像、或者以複數個靜態圖或I圖片等來顯示gif動畫之形式的影像、或者僅接收基本層來將影像解碼及顯示。FIG. 30 is a diagram showing an example of a display screen of a web page on the computer ex111 or the like. FIG. 31 is a diagram showing an example of a display screen of a web page on the smartphone ex115 or the like. As shown in FIG. 30 and FIG. 31, when a web page contains a plurality of link images that are links to image content, their appearance differs depending on the viewing device. When a plurality of link images are visible on the screen, until the user explicitly selects a link image, or a link image approaches the center of the screen, or the whole of a link image enters the screen, the display device (decoding device) displays, as the link image, a still image or an I picture (Intra Picture) of each content, displays video in a form such as an animated GIF using a plurality of still images or I pictures, or receives only the base layer and decodes and displays the video.

在已由使用者選擇出鏈接圖像的情況下，顯示裝置會將基本層設為最優先來解碼。再者，只要在構成網頁之HTML中具有表示屬於可調整之內容的資訊，亦可使顯示裝置解碼至增強層。又，為了擔保即時性，在選擇之前或通訊頻帶非常吃緊的情況下，顯示裝置可以藉由僅解碼及顯示前向參照(forward reference)之圖片(I圖片(框內編碼畫面)、P圖片(預測畫面，Predictive Picture)、僅前向參照之B圖片(雙向預估編碼畫面，Bidirectionally Predictive Picture))，來減低開頭圖片之解碼時刻與顯示時刻之間的延遲(從內容之解碼開始到顯示開始之間的延遲)。又，顯示裝置亦可特意無視圖片之參照關係，而將所有的B圖片及P圖片設成前向參照來粗略地解碼，並隨著時間經過使接收之圖片增加來進行正常的解碼。 [自動行駛]When a link image has been selected by the user, the display device decodes the base layer with the highest priority. Furthermore, if the HTML constituting the web page contains information indicating that the content is scalable, the display device may decode up to the enhancement layer. To guarantee real-time performance, before selection or when the communication band is very tight, the display device can reduce the delay between the decoding time and the display time of the first picture (the delay from the start of decoding of the content to the start of display) by decoding and displaying only forward-reference pictures (I pictures (intra-coded pictures), P pictures (predictive pictures), and forward-reference-only B pictures (bidirectionally predictive pictures)). The display device may also deliberately ignore the reference relationships between pictures, roughly decode all B pictures and P pictures as forward references, and perform normal decoding as the number of received pictures increases over time. [Automatic Driving]

Furthermore, when still images or video data such as two- or three-dimensional map information are transmitted and received for automatic driving or driving support of an automobile, the receiving terminal may receive, in addition to image data belonging to one or more layers, information such as weather or construction as meta information, and decode the data in association with the meta information. The meta information may belong to a layer or may simply be multiplexed with the image data.

At this time, since the automobile, drone, airplane, or the like containing the receiving terminal moves, the receiving terminal can achieve seamless reception and decoding while switching among the base stations ex106 to ex110 by transmitting its position information when making a reception request. Moreover, the receiving terminal can dynamically switch the extent to which the meta information is received, or the extent to which the map information is updated, in accordance with the user's selection, the user's situation, or the state of the communication band.

As described above, in the content supply system ex100, the client can receive, decode, and play back encoded information transmitted by the user in real time. [Distribution of personal content]

Moreover, in the content supply system ex100, not only high-quality, long content from video distributors but also unicast or multicast distribution of low-quality, short content from individuals is possible. Such personal content is expected to continue to increase in the future. To turn personal content into better content, the server may perform an editing process before the encoding process. This can be realized, for example, by the following configuration.

The server performs recognition processing, such as detection of shooting errors, scene search, semantic analysis, and object detection, on the original image or the encoded data, either in real time at the time of shooting or after accumulation. Based on the recognition result, the server then edits the content, either manually or automatically, by, for example, correcting out-of-focus shots or camera shake, deleting scenes of low importance such as scenes darker than other pictures or out of focus, emphasizing the edges of objects, or changing the color tone. The server encodes the edited data based on the editing result. It is also well known that an excessively long shooting time reduces viewership, so the server may, based on the image processing result, automatically clip not only scenes of low importance as described above but also scenes with little motion, so that the content fits within a specific time range according to the shooting time. Alternatively, the server may generate and encode a digest based on the result of the semantic analysis of the scenes.
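The automatic clipping described above can be illustrated with a minimal sketch: scenes are dropped in order of increasing importance until the total duration fits a target upper bound. The `importance` score and the scene representation are assumptions for the example; in the text, such a score would come from the recognition processing (darkness, focus, motion).

```python
# Sketch: drop the least important scenes (the importance score is
# assumed to already combine darkness/focus/motion cues from the
# recognition processing) until the total duration fits the target.

def clip_to_duration(scenes, max_seconds):
    """scenes: list of dicts with 'duration' (s) and 'importance' (float).
    Returns the kept scenes in their original order."""
    kept = list(scenes)
    # Remove scenes starting from the lowest importance score.
    for scene in sorted(scenes, key=lambda s: s["importance"]):
        if sum(s["duration"] for s in kept) <= max_seconds:
            break
        kept.remove(scene)
    return kept

scenes = [
    {"duration": 30, "importance": 0.9},
    {"duration": 40, "importance": 0.2},  # dark, low-motion scene
    {"duration": 20, "importance": 0.7},
]
print(sum(s["duration"] for s in clip_to_duration(scenes, 60)))  # 50
```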

Furthermore, there are cases in which personal content, left as-is, infringes copyright, moral rights of authors, portrait rights, or the like, and cases in which sharing beyond the intended range is inconvenient for the individual. Accordingly, for example, the server may deliberately change the faces of people in the periphery of the screen, the inside of a house, or the like into out-of-focus images and encode them. The server may also recognize whether the face of a person different from a person registered in advance appears in the image to be encoded and, if so, perform processing such as applying a mosaic to the face. Alternatively, as pre-processing or post-processing for encoding, the user may designate, from the standpoint of copyright or the like, a person or background region in the image to be processed, and the server may replace the designated region with another image, blur the focus, or the like. In the case of a person, the image of the face portion can be replaced while tracking the person in the moving picture.

Moreover, since viewing of personal content with a small amount of data places strong demands on real-time performance, the decoding device first receives the base layer with the highest priority and decodes and plays it back, although this also depends on the bandwidth. The decoding device may receive the enhancement layer during this period and, when the content is played back two or more times, such as in looped playback, play back high-quality video including the enhancement layer. With a stream that is scalably encoded in this way, it is possible to provide an experience in which the video is a rough moving picture when unselected or first viewed, but the stream gradually becomes smarter and the image improves. Besides scalable coding, the same experience can be provided by configuring, as a single stream, a rough stream played back the first time and a second stream encoded with reference to the first moving picture. [Other usage examples]
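The playback policy above — base layer first for immediacy, base plus enhancement on a later pass once the enhancement layer has arrived — can be sketched as follows. The function and its parameters are assumptions for the example, not the embodiment's API.

```python
# Sketch: play the base layer immediately; if the enhancement layer
# finishes arriving during the first pass (e.g. looped playback),
# subsequent passes are decoded at high quality.

def layers_for_pass(pass_number, enhancement_available):
    """Return which scalable layers to decode for a given playback pass."""
    if pass_number == 0 or not enhancement_available:
        return ["base"]             # rough but immediate first playback
    return ["base", "enhancement"]  # higher-quality replay

print(layers_for_pass(0, enhancement_available=False))  # ['base']
print(layers_for_pass(1, enhancement_available=True))   # ['base', 'enhancement']
```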

These encoding and decoding processes are generally performed in the LSI ex500 included in each terminal. The LSI ex500 may be a single chip or may consist of a plurality of chips. Software for encoding or decoding moving pictures may be installed on some type of recording medium (such as a CD-ROM, a flexible disk, or a hard disk) readable by the computer ex111 or the like, and the encoding or decoding processes may be performed using that software. Furthermore, when the smartphone ex115 is equipped with a camera, moving picture data acquired by that camera may be transmitted. The moving picture data at this time is data encoded by the LSI ex500 included in the smartphone ex115.

The LSI ex500 may also be configured to download and activate application software. In this case, the terminal first determines whether it supports the encoding scheme of the content or whether it has the capability to execute a specific service. When the terminal does not support the encoding scheme of the content or does not have the capability to execute the specific service, the terminal downloads a codec or application software and then acquires and plays back the content.
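The capability check described above can be sketched as a small decision step. The codec names and the action labels are placeholders for illustration; the text does not specify particular schemes.

```python
# Sketch: before playback, check whether the terminal supports the
# content's encoding scheme; if not, download a codec or application
# software first. Codec names here are placeholders.

SUPPORTED = {"hevc", "avc"}

def prepare_playback(content_codec, supported=SUPPORTED):
    actions = []
    if content_codec not in supported:
        actions.append("download_codec")  # fetch codec or application software
    actions.append("acquire_and_play")
    return actions

print(prepare_playback("avc"))  # ['acquire_and_play']
print(prepare_playback("vvc"))  # ['download_codec', 'acquire_and_play']
```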

Moreover, not limited to the content supply system ex100 via the Internet ex101, at least one of the moving picture encoding device (image encoding device) and the moving picture decoding device (image decoding device) of the above embodiments may also be incorporated into a digital broadcasting system. Since multiplexed data in which video and audio are multiplexed is carried on broadcast radio waves and transmitted and received via a satellite or the like, there is the difference that this configuration is suited to multicast, in contrast to the content supply system ex100, whose configuration facilitates unicast; however, the same applications are possible with regard to the encoding and decoding processes. [Hardware configuration]

FIG. 32 is a diagram showing the smartphone ex115. FIG. 33 is a diagram showing a configuration example of the smartphone ex115. The smartphone ex115 includes an antenna ex450 for transmitting and receiving radio waves to and from the base station ex110, a camera unit ex465 capable of capturing video and still images, and a display unit ex458 that displays decoded data, such as video captured by the camera unit ex465 and video received by the antenna ex450. The smartphone ex115 further includes an operation unit ex466 such as a touch panel, an audio output unit ex457 such as a speaker for outputting audio, an audio input unit ex456 such as a microphone for inputting audio, a memory unit ex467 capable of storing encoded data such as captured video or still images, recorded audio, received video or still images, and mail, or decoded data, and a slot unit ex464 serving as an interface to a SIM ex468 for identifying a user and for authentication of access to various data, including network access. An external memory may be used instead of the memory unit ex467.

The main control unit ex460, which comprehensively controls the display unit ex458, the operation unit ex466, and the like, is connected via a bus ex470 to a power supply circuit unit ex461, an operation input control unit ex462, a video signal processing unit ex455, a camera interface unit ex463, a display control unit ex459, a modulation/demodulation unit ex452, a multiplexing/demultiplexing unit ex453, an audio signal processing unit ex454, the slot unit ex464, and the memory unit ex467.

When the power key is turned on by a user operation, the power supply circuit unit ex461 supplies power to each unit from a battery pack, thereby starting the smartphone ex115 into an operable state.

The smartphone ex115 performs processing such as calls and data communication under the control of the main control unit ex460, which includes a CPU, ROM, and RAM. During a call, the audio signal processing unit ex454 converts an audio signal picked up by the audio input unit ex456 into a digital audio signal, the modulation/demodulation unit ex452 applies spread-spectrum processing to it, and the transmission/reception unit ex451 applies digital-to-analog conversion processing and frequency conversion processing before transmission via the antenna ex450. Received data is amplified, subjected to frequency conversion processing and analog-to-digital conversion processing, despread by the modulation/demodulation unit ex452, converted into an analog audio signal by the audio signal processing unit ex454, and then output from the audio output unit ex457. In data communication mode, text, still image, or video data is sent to the main control unit ex460 via the operation input control unit ex462 through operation of the operation unit ex466 or the like of the main body, and transmission and reception processing is performed in the same manner. When transmitting video, still images, or video and audio in data communication mode, the video signal processing unit ex455 compression-encodes a video signal stored in the memory unit ex467 or a video signal input from the camera unit ex465 by the moving picture encoding method described in each of the above embodiments, and sends the encoded video data to the multiplexing/demultiplexing unit ex453. The audio signal processing unit ex454 encodes an audio signal picked up by the audio input unit ex456 while the camera unit ex465 captures video or still images, and sends the encoded audio data to the multiplexing/demultiplexing unit ex453. The multiplexing/demultiplexing unit ex453 multiplexes the encoded video data and the encoded audio data in a predetermined manner, the modulation/demodulation unit (modulation/demodulation circuit unit) ex452 and the transmission/reception unit ex451 apply modulation processing and conversion processing, and the result is transmitted via the antenna ex450.

When video attached to an e-mail or web chat, or video linked from a web page or the like, is received, in order to decode the multiplexed data received via the antenna ex450, the multiplexing/demultiplexing unit ex453 demultiplexes the multiplexed data into a bitstream of video data and a bitstream of audio data, supplies the encoded video data to the video signal processing unit ex455 via the synchronization bus ex470, and supplies the encoded audio data to the audio signal processing unit ex454. The video signal processing unit ex455 decodes the video signal by a moving picture decoding method corresponding to the moving picture encoding method described in each of the above embodiments, and the video or still image included in the linked moving picture file is displayed on the display unit ex458 via the display control unit ex459. The audio signal processing unit ex454 decodes the audio signal, and audio is output from the audio output unit ex457. Since real-time streaming has become widespread, depending on the user's situation, audio may be played back in places where producing sound is socially inappropriate. Therefore, as an initial setting, a configuration in which only the video data is played back without playing the audio signal is preferable. Audio may be played back in synchronization only when the user performs an operation such as clicking on the video data.

Although the smartphone ex115 has been described here as an example, three types of implementation are conceivable as terminals: a transmission/reception terminal having both an encoder and a decoder, a transmission terminal having only an encoder, and a reception terminal having only a decoder. Furthermore, the digital broadcasting system has been described as receiving or transmitting multiplexed data in which audio data and the like are multiplexed with video data; however, the multiplexed data may also have text data related to the video multiplexed in addition to the audio data, and the video data itself, rather than multiplexed data, may be received or transmitted.

Although the main control unit ex460 including a CPU has been described as controlling the encoding or decoding processes, terminals often include a GPU. Accordingly, a configuration is also possible in which a wide area is processed at once by exploiting the performance of the GPU, using memory shared by the CPU and GPU or memory whose addresses are managed so as to be usable in common. This shortens the encoding time, ensures real-time performance, and achieves low delay. In particular, it is efficient to perform the processes of motion estimation, deblocking filtering, SAO (Sample Adaptive Offset), and transform/quantization all at once, in units of pictures or the like, by the GPU instead of the CPU. Industrial applicability

The present disclosure can be used in, for example, televisions, digital video recorders, car navigation systems, mobile phones, digital cameras, digital video cameras, video conferencing systems, and electronic mirrors.

10~23‧‧‧Blocks
100‧‧‧Encoding device
102‧‧‧Splitter
104‧‧‧Subtractor
106‧‧‧Transformer
108‧‧‧Quantizer
110‧‧‧Entropy encoder
112, 204‧‧‧Inverse quantizer
114, 206‧‧‧Inverse transformer
116, 208‧‧‧Adder
118, 210‧‧‧Block memory
120, 212‧‧‧Loop filter
122, 214‧‧‧Frame memory
124, 216‧‧‧Intra predictor
126, 218‧‧‧Inter predictor
128, 220‧‧‧Prediction controller
160, 260‧‧‧Processing circuitry
162, 262‧‧‧Memory
200‧‧‧Decoding device
202‧‧‧Entropy decoder
Cur block‧‧‧Current block
Cur Pic‧‧‧Current picture
ex100‧‧‧Content supply system
ex101‧‧‧Internet
ex102‧‧‧Internet service provider
ex103‧‧‧Streaming server
ex104‧‧‧Communication network
ex106, ex107, ex108, ex109, ex110‧‧‧Base stations
ex111‧‧‧Computer
ex112‧‧‧Game console
ex113‧‧‧Camera
ex114‧‧‧Home appliance
ex115‧‧‧Smartphone
ex116‧‧‧Satellite
ex117‧‧‧Airplane
ex450‧‧‧Antenna
ex451‧‧‧Transmission/reception unit
ex452‧‧‧Modulation/demodulation unit
ex453‧‧‧Multiplexing/demultiplexing unit
ex454‧‧‧Audio signal processing unit
ex455‧‧‧Video signal processing unit
ex456‧‧‧Audio input unit
ex457‧‧‧Audio output unit
ex458‧‧‧Display unit
ex459‧‧‧Display control unit
ex460‧‧‧Main control unit
ex461‧‧‧Power supply circuit unit
ex462‧‧‧Operation input control unit
ex463‧‧‧Camera interface unit
ex464‧‧‧Slot unit
ex465‧‧‧Camera unit
ex466‧‧‧Operation unit
ex467‧‧‧Memory unit
ex468‧‧‧SIM
ex470‧‧‧Bus
MV0, MV1, MVx0, MVy0, MVx1, MVy1, v0, v1‧‧‧Motion vectors
Ref0, Ref1‧‧‧Reference pictures
S101~S105, S111~S115, S201~S204, S211~S214, S202a, S202b, S212a, S212b, S202aa, S202ab, S212aa, S212ab, S301~S306‧‧‧Steps
TD0, TD1‧‧‧Distances

FIG. 1 is a block diagram showing the functional configuration of an encoding device according to Embodiment 1. FIG. 2 is a diagram showing an example of block splitting in Embodiment 1. FIG. 3 is a table showing transform basis functions corresponding to each transform type. FIG. 4A is a diagram showing an example of the shape of a filter used in ALF. FIG. 4B is a diagram showing another example of the shape of a filter used in ALF. FIG. 4C is a diagram showing another example of the shape of a filter used in ALF. FIG. 5 is a diagram showing the 67 intra prediction modes used in intra prediction. FIG. 6 is a diagram for explaining pattern matching (bilateral matching) between two blocks along a motion trajectory. FIG. 7 is a diagram for explaining pattern matching (template matching) between a template in the current picture and a block in a reference picture. FIG. 8 is a diagram for explaining a model assuming uniform linear motion. FIG. 9 is a diagram for explaining derivation of a motion vector in units of sub-blocks based on motion vectors of a plurality of neighboring blocks. FIG. 10 is a block diagram showing the functional configuration of a decoding device according to Embodiment 1. FIG. 11 is a flowchart showing motion compensation performed by another encoding device forming the basis of the present disclosure. FIG. 12 is a flowchart showing motion compensation performed by another decoding device forming the basis of the present disclosure. FIG. 13 is a diagram for explaining an example of a method of calculating an evaluation value. FIG. 14 is a diagram for explaining another example of the method of calculating an evaluation value. FIG. 15 is a flowchart showing an example of motion compensation performed by the encoding device in Embodiment 2. FIG. 16 is a flowchart showing an example of motion compensation performed by the decoding device in Embodiment 2. FIG. 17 is a flowchart showing another example of motion compensation performed by the encoding device in Embodiment 2. FIG. 18 is a flowchart showing another example of motion compensation performed by the decoding device in Embodiment 2. FIGS. 19(a) to 19(c) are diagrams for explaining a method of extracting N motion vector predictor candidates from a plurality of candidate motion vectors in Embodiment 2. FIG. 20 is a flowchart showing a method of selecting a motion vector predictor performed by the encoding device and the decoding device in Embodiment 3. FIG. 21 is a flowchart showing an example of motion compensation performed by the encoding device in Embodiment 4. FIG. 22 is a flowchart showing an example of motion compensation performed by the decoding device in Embodiment 4. FIGS. 23(a) and 23(b) are diagrams for explaining a method of extracting motion vector predictor candidates in Embodiment 4. FIGS. 24(a) and 24(b) are diagrams showing an example of a common candidate list in Embodiment 4. FIG. 25 is a block diagram showing an implementation example of the encoding device according to each embodiment. FIG. 26 is a block diagram showing an implementation example of the decoding device according to each embodiment. FIG. 27 is a diagram showing the overall configuration of a content supply system that implements a content delivery service. FIG. 28 is a diagram showing an example of a coding structure in scalable coding. FIG. 29 is a diagram showing an example of a coding structure in scalable coding. FIG. 30 is a diagram showing an example of a display screen of a web page. FIG. 31 is a diagram showing an example of a display screen of a web page. FIG. 32 is a diagram showing an example of a smartphone. FIG. 33 is a block diagram showing a configuration example of a smartphone.

Claims (8)

An encoding device that encodes a moving picture, comprising: processing circuitry; and memory connected to the processing circuitry, wherein the processing circuitry, using the memory: obtains a plurality of candidate motion vectors based on motion vectors of each of a plurality of encoded blocks corresponding to an encoding target block in the moving picture; extracts at least one motion vector predictor candidate for the encoding target block from the plurality of candidate motion vectors; derives a motion vector of the encoding target block with reference to a reference picture included in the moving picture; encodes a difference between the derived motion vector of the encoding target block and a motion vector predictor among the extracted at least one motion vector predictor candidate; and performs motion compensation on the encoding target block using the derived motion vector of the encoding target block, wherein, in the extraction of the at least one motion vector predictor candidate, the processing circuitry: encodes mode information for identifying an extraction method; selects, for the encoding target block, the extraction method identified by the mode information from among a first extraction method and a second extraction method; and extracts the at least one motion vector predictor candidate in accordance with the selected extraction method, the first extraction method being an extraction method based on an evaluation result of each of the plurality of candidate motion vectors, the evaluation using a reconstructed image of an encoded region in the moving picture without using the image region of the encoding target block, and the second extraction method being an extraction method based on a priority order predetermined for the plurality of candidate motion vectors. The encoding device according to claim 1, wherein, in the encoding of the mode information, mode information for identifying an extraction method for each block included in a layer is encoded in a header region of any one of a sequence layer, a picture layer, and a slice layer in a stream of the moving picture. The encoding device according to claim 1, wherein, in the encoding of the mode information, mode information for identifying an extraction method for a block is encoded for each block included in the moving picture.
A decoding device that decodes an encoded moving picture, comprising: processing circuitry; and memory connected to the processing circuitry, wherein the processing circuitry, using the memory: obtains a plurality of candidate motion vectors based on motion vectors of each of a plurality of decoded blocks corresponding to a decoding target block in the moving picture; extracts at least one motion vector predictor candidate for the decoding target block from the plurality of candidate motion vectors; decodes difference information indicating a difference between two motion vectors; derives a motion vector of the decoding target block by adding, to the difference indicated by the decoded difference information, a motion vector predictor among the extracted at least one motion vector predictor candidate; and performs motion compensation on the decoding target block using the derived motion vector of the decoding target block, wherein, in the extraction of the at least one motion vector predictor candidate, the processing circuitry: decodes mode information for identifying an extraction method; selects, for the decoding target block, the extraction method identified by the decoded mode information from among a first extraction method and a second extraction method; and extracts the at least one motion vector predictor candidate in accordance with the selected extraction method, the first extraction method being an extraction method based on an evaluation result of each of the plurality of candidate motion vectors, the evaluation using a reconstructed image of an encoded region in the moving picture without using the image region of the encoding target block, and the second extraction method being an extraction method based on a priority order predetermined for the plurality of candidate motion vectors. The decoding device according to claim 4, wherein, in the decoding of the mode information, mode information for identifying an extraction method for each block included in a layer is decoded from a header region of any one of a sequence layer, a picture layer, and a slice layer in a stream of the moving picture. The decoding device according to claim 4, wherein, in the decoding of the mode information, mode information for identifying an extraction method for a block is decoded for each block included in the moving picture.
An encoding method for encoding a moving picture, the encoding method comprising: obtaining a plurality of candidate motion vectors based on a motion vector of each of a plurality of encoded blocks corresponding to an encoding target block in the moving picture; extracting at least one motion vector predictor candidate for the encoding target block from the plurality of candidate motion vectors; deriving a motion vector of the encoding target block with reference to a reference picture included in the moving picture; encoding a difference between a motion vector predictor among the extracted at least one motion vector predictor candidate and the derived motion vector of the encoding target block; and performing motion compensation on the encoding target block using the derived motion vector of the encoding target block, wherein in the extraction of the at least one motion vector predictor candidate: mode information identifying an extraction method is encoded; an extraction method identified by the mode information is selected for the encoding target block from among a first extraction method and a second extraction method; and the at least one motion vector predictor candidate is extracted according to the selected extraction method, the first extraction method being an extraction method based on an evaluation result of each of the plurality of candidate motion vectors, the evaluation using a reconstructed image of an encoded region in the moving picture without using the image region of the encoding target block, and the second extraction method being an extraction method based on a priority order predetermined for the plurality of candidate motion vectors.
A decoding method for decoding an encoded moving picture, the decoding method comprising: obtaining a plurality of candidate motion vectors based on a motion vector of each of a plurality of decoded blocks corresponding to a decoding target block in the moving picture; extracting at least one motion vector predictor candidate for the decoding target block from the plurality of candidate motion vectors; decoding difference information indicating a difference between two motion vectors; deriving a motion vector of the decoding target block by adding a motion vector predictor among the extracted at least one motion vector predictor candidate to the difference indicated by the decoded difference information; and performing motion compensation on the decoding target block using the derived motion vector of the decoding target block, wherein in the extraction of the at least one motion vector predictor candidate: mode information identifying an extraction method is decoded; an extraction method identified by the decoded mode information is selected for the decoding target block from among a first extraction method and a second extraction method; and the at least one motion vector predictor candidate is extracted according to the selected extraction method, the first extraction method being an extraction method based on an evaluation result of each of the plurality of candidate motion vectors, the evaluation using a reconstructed image of a decoded region in the moving picture without using the image region of the decoding target block, and the second extraction method being an extraction method based on a priority order predetermined for the plurality of candidate motion vectors.
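The derivation step that the device and method claims share, adding the decoded difference to the selected motion vector predictor, reduces to a component-wise addition. A minimal sketch under an assumed (horizontal, vertical) tuple representation, not taken from the patent:

```python
def derive_motion_vector(predictor, decoded_diff):
    """Motion vector of the decoding target block = predictor + decoded
    difference, applied independently to each component (assumed layout)."""
    return (predictor[0] + decoded_diff[0], predictor[1] + decoded_diff[1])
```

E.g. a predictor of `(4, -2)` combined with a decoded difference of `(-1, 3)` gives `(3, 1)`.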
TW106140332A 2016-11-22 2017-11-21 Image coding apparatus, image decoding apparatus, and method TW201834456A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201662425249P 2016-11-22 2016-11-22
US62/425,249 2016-11-22

Publications (1)

Publication Number Publication Date
TW201834456A true TW201834456A (en) 2018-09-16

Family

ID=62195912

Family Applications (1)

Application Number Title Priority Date Filing Date
TW106140332A TW201834456A (en) 2016-11-22 2017-11-21 Image coding apparatus, image decoding apparatus, and method

Country Status (2)

Country Link
TW (1) TW201834456A (en)
WO (1) WO2018097116A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020116241A1 * 2018-12-07 2020-06-11 Panasonic Intellectual Property Corporation of America Encoding device, decoding device, encoding method and decoding method

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20120112725A * 2010-02-09 2012-10-11 Nippon Telegraph and Telephone Corporation Predictive coding method for motion vector, predictive decoding method for motion vector, video coding device, video decoding device, and programs therefor
BR112012019680A2 (en) * 2010-02-09 2016-05-03 Nippon Telegraph & Telephone predictive motion vector coding method, predictive motion vector decoding method, moving image coding apparatus, moving image decoding apparatus and programs thereof.
JP5711514B2 * 2010-12-14 2015-04-30 Nippon Telegraph and Telephone Corporation Encoding device, decoding device, encoding method, decoding method, encoding program, and decoding program
JP5651560B2 * 2011-09-07 2015-01-14 Japan Broadcasting Corporation (NHK) Motion vector prediction apparatus, encoding apparatus, decoding apparatus, and programs thereof

Also Published As

Publication number Publication date
WO2018097116A1 (en) 2018-05-31
