JP6459761B2

JP6459761B2 - Moving picture coding apparatus, moving picture coding method, and moving picture coding computer program

Info

Publication number: JP6459761B2
Application number: JP2015094346A
Authority: JP
Inventors: 章弘屋森
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2015-05-01
Filing date: 2015-05-01
Publication date: 2019-01-30
Anticipated expiration: 2035-05-01
Also published as: JP2016213615A

Description

The present invention relates to, for example, a moving image encoding apparatus, a moving image encoding method, and a moving image encoding computer program.

Moving image data generally has a very large data volume. Therefore, a device that handles moving image data compresses the data by encoding it when transmitting it to another device or storing it in a storage device. Typical moving image encoding schemes in use include Moving Picture Experts Group phase 2 (MPEG-2/H.262), MPEG-4, and H.264 MPEG-4 Advanced Video Coding (H.264 MPEG-4 AVC), which were developed by the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) or the International Telecommunication Union - Telecommunication Standardization Sector (ITU-T). After MPEG-4 AVC/H.264, standardization has been carried forward by the jointly established Joint Collaborative Team on Video Coding (JCT-VC), and High Efficiency Video Coding (HEVC, MPEG-H/H.265; formally ISO/IEC 23008 or ITU-T H.265) has been established as a new coding standard.

In HEVC, the sizes of the blocks into which each picture of the moving image data is divided can be chosen more freely than in earlier moving image encoding schemes. FIG. 1 is a diagram illustrating an example of picture division in HEVC.

As shown in FIG. 1, a picture 100 is divided into Coding Tree Units (CTUs), which are coding blocks, and the CTUs 101 are encoded in raster scan order. The size of the CTU 101 can be selected from 64x64 to 16x16 pixels. However, the size of the CTU 101 is fixed within a sequence.

The CTU 101 is further divided into a plurality of Coding Units (CUs) 102 in a quadtree structure. The CUs 102 in one CTU 101 are encoded in Z-scan order. The size of the CU 102 is variable and is selected from the CU partitioning modes of 8x8 to 64x64 pixels. The CU 102 is the unit at which an encoding mode, either the intra prediction encoding mode or the inter prediction encoding mode, is selected. The CU 102 is processed individually in units of Prediction Units (PUs) 103 or Transform Units (TUs) 104. The PU 103 is the unit at which prediction is performed according to the encoding mode. For example, the PU 103 is the unit to which a prediction mode is applied in the intra prediction encoding mode, and the unit at which motion compensation is performed in the inter prediction encoding mode. When the intra prediction encoding mode is applied, for example, the size of the PU 103 can be selected from 2Nx2N and NxN (where N is the CU size/2).
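The quadtree division and Z-scan order described above can be sketched as follows. This is a minimal illustration rather than the HEVC reference procedure; the function name `z_scan_leaves`, the `should_split` callback, and the example split rule are hypothetical.

```python
from typing import Callable, Iterator, Tuple

Block = Tuple[int, int, int]  # (x, y, size) in pixels


def z_scan_leaves(x: int, y: int, size: int,
                  should_split: Callable[[int, int, int], bool],
                  min_size: int = 8) -> Iterator[Block]:
    """Yield the leaf blocks of a quadtree in Z-scan order."""
    if size > min_size and should_split(x, y, size):
        half = size // 2
        # Z order: top-left, top-right, bottom-left, bottom-right.
        for dy in (0, half):
            for dx in (0, half):
                yield from z_scan_leaves(x + dx, y + dy, half,
                                         should_split, min_size)
    else:
        yield (x, y, size)


# Hypothetical split rule: split the 64x64 CTU once, then split only
# its top-left 32x32 CU once more.
def example_rule(x: int, y: int, size: int) -> bool:
    return size == 64 or (size == 32 and x == 0 and y == 0)


leaves = list(z_scan_leaves(0, 0, 64, example_rule))
```

With this rule the CTU yields four 16x16 CUs from the top-left quadrant first, followed by the three remaining 32x32 CUs.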

On the other hand, the TU 104 is the unit of orthogonal transformation. In the intra prediction encoding mode, the TU 104 is also the unit at which a prediction block is generated. The size of the TU 104 is selected from 4x4 to 32x32 pixels. The TU 104 is divided in a quadtree structure and processed in Z-scan order.
Note that the intra prediction encoding mode exploits the high spatial correlation of moving image data: the encoding target block of the encoding target picture is encoded using information of an already encoded region of the same picture. The inter prediction encoding mode, on the other hand, exploits the high temporal correlation of moving image data: the encoding target block of the encoding target picture is encoded using information of another, already encoded picture.

In HEVC, when the sizes of the CU, PU, and TU are determined for a CTU of interest, an encoding cost is calculated, for example, for each combination of the possible CU, PU, and TU sizes and the encoding mode. The combination of CU, PU, and TU sizes and encoding mode that minimizes the encoding cost is then applied when the CTU of interest is encoded.

A technique has also been proposed that uses a spatial activity value representing the complexity of a target block when determining the size of the block to be encoded (see, for example, Patent Document 1). In the moving image encoding method disclosed in Patent Document 1, when the encoding condition of a target block is determined, a first encoding condition for small partitioning is used as the encoding condition of the target block if a first spatial activity value, which represents the complexity of at least a partial region of the target block, is smaller than a first threshold. If the first spatial activity value is greater than or equal to the first threshold, a second encoding condition for large partitioning is used as the encoding condition of the target block.

International Publication No. 2010/150486

In the technique disclosed in Patent Document 1, the greater the complexity of the encoding target block, the larger the size of that block becomes.

Meanwhile, rate-distortion optimization (RDO) has been proposed as a method for determining the encoding mode to apply so as to optimize the balance between compression efficiency and the image quality of the moving image obtained by decoding the encoded moving image data. In RDO, the encoding mode is determined by considering both the distortion, which is an error statistic between the data before and after encoding, and the rate, which is the code amount of the block to be encoded. The encoding mode with the best RD characteristic, which represents the relationship between distortion and rate, is selected.
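A common way to realize this kind of selection is to minimize a Lagrangian cost J = D + λR over the candidate modes. The sketch below assumes that formulation, which the text above does not spell out; the candidate names, the distortion and rate values, and the λ values are made up for illustration.

```python
def rdo_select(candidates, lam):
    """Return the candidate with the smallest Lagrangian cost D + lam * R."""
    return min(candidates, key=lambda c: c["distortion"] + lam * c["rate"])


# Hypothetical candidates: mode A is cheap in bits but distorts more,
# mode B spends more bits for less distortion.
candidates = [
    {"mode": "A", "distortion": 100.0, "rate": 10.0},
    {"mode": "B", "distortion": 60.0, "rate": 40.0},
]

low_lambda_choice = rdo_select(candidates, lam=1.0)["mode"]   # bits are cheap
high_lambda_choice = rdo_select(candidates, lam=2.0)["mode"]  # bits are costly
```

A larger λ penalizes rate more heavily, so the choice shifts from mode B to mode A as λ grows.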

Here, when the size of the encoding target block is determined based on its complexity as described in Patent Document 1, the RD characteristic for the determined size is not necessarily the best, and the RD characteristic for another size is sometimes better.

Accordingly, an object of the present specification is to provide a moving image encoding apparatus capable of appropriately determining the size of the block used as the unit of encoding.

According to one embodiment, a moving image encoding apparatus is provided. The moving image encoding apparatus includes: an encoding mode determination unit that calculates an offset value according to the complexity of each of a plurality of sub-blocks, obtained by dividing a block on a picture included in moving image data, and the similarity between the sub-blocks, that selects the block as the encoding unit when a first encoding cost for encoding the block with the block as the encoding unit is less than or equal to the total of the offset value and the sum of second encoding costs of the respective sub-blocks for encoding the block with the sub-blocks as the encoding unit, and that selects the sub-blocks as the encoding unit when the first encoding cost is greater than that total; and an encoding unit that encodes the block in each selected encoding unit.

The objects and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.
It should be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention as claimed.

The moving image encoding apparatus disclosed in this specification can appropriately determine the size of the block used as the unit of encoding.

A diagram showing an example of picture division in HEVC.
(a) and (b) are diagrams each showing an example of block division.
(a) and (b) are diagrams showing the RD characteristics of the pictures shown in FIG. 2(a) and FIG. 2(b), respectively, when the TU size is set to a preset value.
A schematic configuration diagram of a moving image encoding apparatus according to one embodiment.
An operation flowchart of the encoding mode determination process.
An operation flowchart of the moving image encoding process.
A configuration diagram of a computer that operates as a moving image encoding apparatus by running a computer program that implements the functions of the units of the moving image encoding apparatus according to the above embodiment or its modifications.

The moving image encoding apparatus will be described below with reference to the drawings. First, an example of dividing a picture into blocks in accordance with HEVC is described.

FIG. 2(a) and FIG. 2(b) each show an example of block division. The picture 200 shown in FIG. 2(a) shows a row of trees, and the picture 210 shown in FIG. 2(b) represents white noise. Each block 201 shown on the picture 200 and each block 211 shown on the picture 210 represents the TU that minimizes the encoding cost when the encoding cost is calculated by the following equation for each combination of CU, PU, and TU sizes. In this example, the encoding cost was calculated on the assumption that the picture is encoded in the intra prediction encoding mode, which does not refer to information of other pictures.

$$\text{(encoding cost)} = \text{(prediction error)} + \lambda \times \text{(mode information amount)} \qquad (1)$$

Here, the prediction error is calculated, for example, as the sum of absolute errors between corresponding pixels of the block for which the encoding cost is calculated and its prediction block. The mode information amount is the amount of information used in the encoding mode for which the encoding cost is calculated. λ is a Lagrange multiplier.
As shown in FIG. 2(a) and FIG. 2(b), the more complex the region, the smaller the TU size that is selected. In particular, for the picture 210, the smallest TU size is selected over the entire picture. However, for a picture showing such a complex scene, a smaller division size does not necessarily give a better RD characteristic.

FIG. 3(a) and FIG. 3(b) are diagrams showing the RD characteristics of the pictures shown in FIG. 2(a) and FIG. 2(b), respectively, when the TU size is set to a preset value. In FIG. 3(a) and FIG. 3(b), the horizontal axis represents the amount of generated information, that is, the rate (unit: Mbps), and the vertical axis represents the peak signal-to-noise ratio (PSNR), that is, the distortion (unit: dB). Accordingly, an RD characteristic closer to the upper left is better. In FIG. 3(a), graphs 301 to 304 represent the RD characteristics of the picture 200 for TU sizes of 32, 16, 8, and 4, respectively. Similarly, in FIG. 3(b), graphs 311 to 314 represent the RD characteristics of the picture 210 for TU sizes of 32, 16, 8, and 4, respectively.
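For reference, the PSNR plotted on the vertical axis is conventionally computed from the mean squared error between the original and decoded pictures. The following is a minimal sketch, assuming 8-bit pixels (peak value 255) and pictures given as flat lists of pixel values; the function name `psnr` is illustrative and not taken from this document.

```python
import math


def psnr(original, decoded, max_value=255):
    """Return the PSNR in dB between two equal-length pixel sequences."""
    if len(original) != len(decoded):
        raise ValueError("pictures must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((o - d) ** 2 for o, d in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical pictures
    return 10.0 * math.log10(max_value ** 2 / mse)


# Every pixel is off by exactly one level, so the MSE is 1.
example_psnr = psnr([10, 20, 30, 40], [11, 21, 29, 41])
```

A higher PSNR means less distortion, which is why curves closer to the upper left of an RD plot are better.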

As shown in FIG. 3(a) and FIG. 3(b), for both the picture 200 and the picture 210, the RD characteristic is best when the TU size is 32 and degrades as the TU size decreases.

Thus, determining the size of the blocks into which a picture is divided based on the encoding cost calculated according to equation (1) does not necessarily optimize the RD characteristic. On the other hand, the pictures 200 and 210 are relatively complex over the entire picture. Therefore, if the block size is determined based on complexity as described in Patent Document 1, a large block size is likely to be selected over the entire picture, even where a small block size would give a better RD characteristic at some local positions on the picture.

Here, as a result of intensive research, the present inventor has found that, for a block that can be divided into a plurality of sub-blocks, the RD characteristic is likely to be better without division into sub-blocks when the following two conditions are satisfied.
(1) The subject shown in each sub-block is complex.
(2) The subjects shown in the sub-blocks are similar to each other.

Therefore, when determining the encoding mode and the size of the block used as the encoding unit, the moving image encoding apparatus according to the present embodiment calculates, for a block of interest, the complexity of each of a plurality of sub-blocks into which the block can be divided and the similarity between the sub-blocks. The moving image encoding apparatus calculates an offset that has a positive value when the complexity of each sub-block and the similarity between the sub-blocks satisfy conditions (1) and (2) above. Further, the moving image encoding apparatus calculates the encoding cost of each sub-block for the case where the block of interest is encoded with the sub-blocks as the encoding unit, and the encoding cost for the case where the block of interest is encoded with the block itself as the encoding unit. When comparing the sum of the encoding costs of the sub-blocks with the encoding cost of the block of interest itself, the moving image encoding apparatus adds the offset to the sum of the encoding costs of the sub-blocks. This makes a block that satisfies conditions (1) and (2) less likely to be divided into sub-blocks.

In the present embodiment, the moving image encoding apparatus is assumed to conform to HEVC. The moving image encoding apparatus divides a picture into CTUs and selects, for each CTU, a combination of CU size, PU size, TU size, and encoding mode. The moving image encoding apparatus then encodes each CTU according to the combination selected for it.

FIG. 4 is a schematic configuration diagram of a moving image encoding apparatus according to one embodiment. The moving image encoding apparatus 1 includes a motion vector calculation unit 11, an encoding mode determination unit 12, an encoding unit 13, and a storage unit 14.

These units of the moving image encoding apparatus 1 are each formed as a separate circuit. Alternatively, they may be mounted on the moving image encoding apparatus 1 as a single integrated circuit in which the circuits corresponding to the units are integrated. Further, they may be functional modules implemented by a computer program executed on a processor of the moving image encoding apparatus 1.

A picture to be encoded is divided into a plurality of CTUs by, for example, a control unit (not shown) that controls the entire moving image encoding apparatus 1. The CTUs are input to the moving image encoding apparatus 1 in, for example, raster scan order, and the moving image encoding apparatus 1 encodes each CTU. Each unit of the moving image encoding apparatus 1 is described below.

When the encoding target picture is a picture to which the inter prediction encoding mode is applicable, such as a P picture or a B picture, the motion vector calculation unit 11 calculates a motion vector for each PU applicable to the CTU to be encoded. For each PU, the motion vector calculation unit 11 performs block matching against reference pictures that have already been encoded and that the encoding target picture can refer to, and determines the reference picture that best matches the PU and the position of the matching region on that reference picture. The motion vector calculation unit 11 then calculates, as the motion vector, a vector representing the spatial displacement between the PU and that region.
For each PU, the motion vector calculation unit 11 may instead use as the motion vector the displacement between the PU and the corresponding region on the reference picture that minimizes the total of the sum of absolute differences between corresponding pixels and the code amount of the motion vector.
For each PU, the motion vector calculation unit 11 outputs the motion vector and information indicating the reference picture to which the motion vector refers to the encoding mode determination unit 12.
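The block matching described above can be illustrated with an exhaustive search that minimizes the sum of absolute differences. This is only a sketch under simplifying assumptions: a single reference picture, integer-pixel displacements, and no motion vector code amount in the criterion. The helper names (`sad`, `crop`, `full_search`) and the toy pictures are hypothetical.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def crop(picture, x, y, w, h):
    """Extract the w x h region whose top-left corner is (x, y)."""
    return [row[x:x + w] for row in picture[y:y + h]]


def full_search(current, reference, x, y, w, h, search_range):
    """Return (mvx, mvy, sad) of the best match within +/- search_range."""
    target = crop(current, x, y, w, h)
    height, width = len(reference), len(reference[0])
    best = None
    for mvy in range(-search_range, search_range + 1):
        for mvx in range(-search_range, search_range + 1):
            rx, ry = x + mvx, y + mvy
            # Skip candidates that fall outside the reference picture.
            if rx < 0 or ry < 0 or rx + w > width or ry + h > height:
                continue
            cost = sad(target, crop(reference, rx, ry, w, h))
            if best is None or cost < best[2]:
                best = (mvx, mvy, cost)
    return best


# Toy data: the current picture is the reference shifted by (2, 1) pixels.
reference = [[10 * y + x for x in range(8)] for y in range(8)]
current = [[reference[min(y + 1, 7)][min(x + 2, 7)] for x in range(8)]
           for y in range(8)]
best = full_search(current, reference, x=2, y=2, w=2, h=2, search_range=2)
```

To follow the alternative criterion in the text, a term for the motion vector code amount would be added to `cost` before the comparison.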

The encoding mode determination unit 12 obtains the applicable combinations of CU size, PU size, TU size, and encoding mode for the CTU to be encoded. For each combination, the encoding mode determination unit 12 calculates an encoding cost, which is an estimate of the code amount. Based on the encoding cost of each combination, the encoding mode determination unit 12 then determines the combination of CU size, PU size, TU size, and encoding mode to apply to the CTU to be encoded.

The encoding mode determination unit 12 generates a prediction block for each applicable combination of CU size, PU size, TU size, and encoding mode. The prediction block is generated from an encoded reference picture or another encoded block according to the encoding mode included in the combination of interest. The encoding modes include, for example, the intra prediction encoding mode and the inter prediction encoding mode. When a prediction block is generated based on the intra prediction encoding mode, the encoding mode determination unit 12 generates a prediction block for each of the plurality of modes, specified in HEVC for example, that define how a prediction block is generated. When a prediction block is generated based on the inter prediction encoding mode, the encoding mode determination unit 12 generates a prediction block for each applicable motion vector generation method or prediction method (for example, direct mode or merge mode). When the encoding target picture is a B picture, for which bidirectional prediction is possible, the encoding mode determination unit 12 generates not only prediction blocks based on a motion vector from unidirectional prediction but also prediction blocks based on two motion vectors, one from prediction in each direction.
In the following, for convenience, the term prediction mode refers not only to a mode that defines how a prediction block is generated in the intra prediction encoding mode, but also to a combination of a motion vector generation method or prediction method and unidirectional or bidirectional prediction in the inter prediction encoding mode.

To calculate the encoding cost of a combination of interest, the encoding mode determination unit 12 calculates, for example, a prediction error for the TUs included in that combination. In the present embodiment, the encoding mode determination unit 12 calculates the sum of absolute pixel differences SAD as the prediction error according to the following equation.

$$SAD = \sum \left| OrgPixel - PredPixel \right|$$

Here, OrgPixel is the value of a pixel in the TU included in the combination of interest, and PredPixel is the value of the corresponding pixel of the prediction block.

Instead of the SAD, the encoding mode determination unit 12 may calculate as the prediction error, for example, the sum of absolute transformed differences (SATD), that is, the sum of the absolute values obtained by applying a Hadamard transform to the difference image between the TU of interest and the prediction block. For a TU for which no prediction block is generated, such as a TU located at the upper left corner of the picture when the encoding mode of interest is the intra prediction encoding mode, the encoding mode determination unit 12 may calculate an activity as the prediction error. The activity ACT is calculated, for example, according to the following equation.

$$ACT = \sum \left| OrgPixel - AveB \right|$$

Here, AveB is the average of the pixel values over the entire TU.
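The two alternative error measures can be sketched as follows. The sketch assumes square TUs whose side is a power of two, an unnormalized Hadamard transform built by Sylvester's construction, and an activity defined as the sum of absolute differences between each pixel and the TU mean AveB. The function names and the normalization are assumptions; practical encoders often scale the SATD.

```python
def hadamard(n):
    """Return the n x n Hadamard matrix (n a power of two), Sylvester form."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def satd(block, pred):
    """Sum of absolute values of the Hadamard-transformed difference block."""
    n = len(block)
    diff = [[block[i][j] - pred[i][j] for j in range(n)] for i in range(n)]
    h = hadamard(n)
    # H * D * H; the Sylvester matrix is symmetric, so H equals its transpose.
    transformed = matmul(matmul(h, diff), h)
    return sum(abs(v) for row in transformed for v in row)


def activity(block):
    """Sum of absolute differences between each pixel and the block mean."""
    pixels = [v for row in block for v in row]
    ave_b = sum(pixels) / len(pixels)
    return sum(abs(v - ave_b) for v in pixels)


flat_tu = [[5] * 4 for _ in range(4)]
flat_pred = [[3] * 4 for _ in range(4)]
flat_satd = satd(flat_tu, flat_pred)          # only the DC coefficient is nonzero
checker_act = activity([[0, 2], [2, 0]])      # mean is 1, each pixel differs by 1
```

For a constant difference block, all energy lands in the DC coefficient, which makes the SATD easy to check by hand.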

The encoding mode determination unit 12 calculates the encoding cost Cost for the combination of interest according to the following equation.

$$Cost = SAD + \lambda \times R$$

Here, SAD is the prediction error calculated for the TU included in the combination of interest. R is the mode information amount, which is an estimate of the code amount for items other than the orthogonal transform coefficients, such as motion vectors and flags indicating the prediction mode. λ is a Lagrange multiplier.
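As a small worked example, the cost can be evaluated for one TU as below, assuming the cost takes the form Cost = SAD + λR implied by the definitions above. The pixel values, the mode bit count, and the value of λ are made up for illustration.

```python
def coding_cost(tu, pred, mode_bits, lam):
    """Return SAD + lam * R for one TU and its prediction block."""
    sad = sum(abs(o - p)
              for row_o, row_p in zip(tu, pred)
              for o, p in zip(row_o, row_p))
    return sad + lam * mode_bits


# Hypothetical 2x2 TU and prediction block.
tu = [[10, 12], [8, 9]]
pred = [[9, 12], [10, 9]]
example_cost = coding_cost(tu, pred, mode_bits=4, lam=0.5)  # SAD is 3
```

Here the SAD is 1 + 0 + 2 + 0 = 3 and the mode term is 0.5 x 4 = 2, so the cost is 5.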

The encoding mode determination unit 12 obtains the combination of CU size, PU size, TU size, and prediction mode to apply separately for the inter prediction encoding mode and the intra prediction encoding mode. In doing so, the encoding mode determination unit 12 determines, for example, for each TU of the same size, the PU size and prediction mode that minimize the encoding cost. Further, the encoding mode determination unit 12 compares the encoding cost obtained with the PU size and prediction mode determined for that TU against the sum of the encoding costs obtained with the PU size and prediction mode determined for each of the TUs produced by dividing that TU into four (hereinafter called sub-TUs for convenience). The encoding mode determination unit 12 then selects the TU size according to the result of the comparison. By recursively repeating this processing in order from the largest TU size, the encoding mode determination unit 12 obtains a combination of CU size, PU size, TU size, and prediction mode for each of the inter prediction encoding mode and the intra prediction encoding mode. Then, for each CU, the encoding mode determination unit 12 takes, of the combination obtained for the inter prediction encoding mode and the combination obtained for the intra prediction encoding mode, the one with the smaller encoding cost as the combination to apply to the CTU to be encoded.

As described with reference to FIG. 1, the size of the TU is selected from 4x4 to 32x32 pixels. The encoding mode determination unit 12 therefore compares, for example, the encoding cost of a TU of 32x32 pixels with the sum of the encoding costs of the four sub-TUs of 16x16 pixels obtained by dividing that TU into four. Similarly, the encoding mode determination unit 12 compares the encoding cost of a TU of 16x16 pixels with the sum of the encoding costs of the four sub-TUs of 8x8 pixels obtained by dividing that TU into four. Further, the encoding mode determination unit 12 compares the encoding cost of a TU of 8x8 pixels with the sum of the encoding costs of the four sub-TUs of 4x4 pixels obtained by dividing that TU into four.
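One way to organize these size comparisons is a recursion over the TU quadtree. The text only states that the comparison is repeated recursively from the largest TU size, so the bottom-up evaluation order used below, the function name `choose_tu_sizes`, the `cost_of` callback, and the toy cost function are all assumptions made for illustration. No offset is applied in this sketch.

```python
def choose_tu_sizes(x, y, size, cost_of, min_size=4):
    """Return (total_cost, [(x, y, size), ...]) for the TU at (x, y)."""
    whole = cost_of(x, y, size)
    if size <= min_size:
        return whole, [(x, y, size)]
    half = size // 2
    split_cost, split_tus = 0, []
    # Evaluate the four sub-TUs in Z-scan order.
    for dy in (0, half):
        for dx in (0, half):
            cost, tus = choose_tu_sizes(x + dx, y + dy, half, cost_of, min_size)
            split_cost += cost
            split_tus += tus
    # Keep the undivided TU unless dividing it is strictly cheaper.
    if whole <= split_cost:
        return whole, [(x, y, size)]
    return split_cost, split_tus


# Toy cost: proportional to area, with a penalty for large TUs
# anchored at the picture origin.
def toy_cost(x, y, size):
    return size * size + (1000 if (x, y) == (0, 0) and size > 8 else 0)


total, chosen = choose_tu_sizes(0, 0, 32, toy_cost)
```

With the toy cost, only the top-left 16x16 region is divided into 8x8 TUs, and the other three 16x16 TUs stay undivided.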

In the present embodiment, when comparing the encoding costs for different TU sizes, the encoding mode determination unit 12 makes the TU of interest more likely to be selected than the four sub-TUs obtained by dividing it when conditions (1) and (2) above are satisfied.

例えば、着目するTUの符号化コストをC0とし、着目するTUを４分割した４個のサブTUのそれぞれの符号化コストをC1[i](i=1,2,3,4)とする。この場合、符号化モード決定部１２は、次式が満たされる場合に、TUのサイズとして、着目するTUのサイズ、すなわち、大きい方のTUのサイズを選択する。一方、次式が満たされない場合、符号化モード決定部１２は、TUのサイズとして、サブTUのサイズ、すなわち、小さい方のTUのサイズを選択する。 For example, let C0 be the encoding cost of the TU of interest, and let C1[i] (i=1,2,3,4) be the encoding costs of the four sub-TUs obtained by dividing the TU of interest into four. In this case, the encoding mode determination unit 12 selects the size of the TU of interest, that is, the larger TU size, when the following inequality is satisfied. When the inequality is not satisfied, the encoding mode determination unit 12 selects the sub-TU size, that is, the smaller TU size.

$$C0 \le \sum_{i=1}^{4} C1[i] + \mathit{offset}$$

ここでoffsetは、着目するTUを４分割した４個のTUのそれぞれの複雑度及び４個のTU間の類似度に基づいて決定されるオフセット値である。 Here, offset is an offset value determined based on the complexity of each of the four TUs obtained by dividing the TU of interest into four and on the similarity between those four TUs.
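The size decision can be sketched in a few lines of Python. This is only an illustration, taking the test to be C0 ≤ ΣC1[i] + offset as stated for step S105 of the flowchart; the function name `choose_tu_size` and the string return values are hypothetical and do not appear in the patent.

```python
from typing import Sequence


def choose_tu_size(c0: float, c1: Sequence[float], offset: float = 0.0) -> str:
    """Return "large" to keep the TU of interest, "small" to use its four sub-TUs.

    c0     -- encoding cost C0 of the TU of interest
    c1     -- encoding costs C1[i] of the four sub-TUs
    offset -- bias added to the sub-TU side (hypothetical default of 0)
    """
    if len(c1) != 4:
        raise ValueError("expected the costs of exactly four sub-TUs")
    # A larger offset raises the right-hand side, so the larger TU wins more often.
    return "large" if c0 <= sum(c1) + offset else "small"
```

With an offset of 0 this reduces to a plain cost comparison; a positive offset lets the larger TU be chosen even when its cost slightly exceeds the summed sub-TU costs.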

符号化コストC1[i]が大きいほど、対応するサブTUとその予測ブロック間の予測誤差が大きいので、そのサブTUに写っている被写体は複雑であると想定される。また、各サブTUの符号化コストC1[i]の分散が小さいほど、各サブTUに写っている被写体は類似していると想定される。したがって、本実施形態では、符号化モード決定部１２は、各サブTUについて算出された符号化コストC1[i]を、そのサブTUについての複雑度とし、各サブTUの符号化コストC1[i]の分散を、各サブTU間の類似度とする。 The larger the encoding cost C1[i], the larger the prediction error between the corresponding sub-TU and its prediction block, so the subject shown in that sub-TU is assumed to be complex. Also, the smaller the variance of the encoding costs C1[i] of the sub-TUs, the more similar the subjects shown in the sub-TUs are assumed to be. Therefore, in this embodiment, the encoding mode determination unit 12 uses the encoding cost C1[i] calculated for each sub-TU as the complexity of that sub-TU, and uses the variance of the encoding costs C1[i] of the sub-TUs as the similarity between the sub-TUs.

この場合、offsetは、例えば、次式に従って算出される。 In this case, offset is calculated, for example, according to the following equation.

ここでμは定数であり、例えば、0よりも大きく、かつ、1以下の値に設定される。またVar(C1[i])は、各サブTUの符号化コストC1[i]の分散であり、次式で算出される。閾値Th1は、例えば、サブTUの画素数に定数α（例えば、1〜5）を乗じた値に設定される。また、閾値Th2は、着目するTUに含まれるサブTUの数（本実施形態では、4）に画素の取り得る値の最大値（例えば、255）と定数β（例えば、0.05〜0.1）を乗じた値に設定される。 Here, μ is a constant and is set, for example, to a value greater than 0 and not greater than 1. Var(C1[i]) is the variance of the encoding costs C1[i] of the sub-TUs and is calculated by the following equation. The threshold Th1 is set, for example, to the number of pixels in a sub-TU multiplied by a constant α (for example, 1 to 5). The threshold Th2 is set to the number of sub-TUs included in the TU of interest (4 in this embodiment) multiplied by the maximum value a pixel can take (for example, 255) and by a constant β (for example, 0.05 to 0.1).

すなわち、各サブTUが一定以上複雑である場合、かつ、サブTU間のシーンが類似しているほどoffsetは大きな値となり、その結果として大きい方のTUのサイズが選択され易くなる。 That is, when each sub-TU has at least a certain level of complexity, offset takes a larger value the more similar the scenes in the sub-TUs are, and as a result the larger TU size is more likely to be selected.
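The offset equation itself is not reproduced in this text, so the following Python sketch only illustrates the behaviour described in words: offset is non-zero when every sub-TU is complex enough and grows as the variance of the costs shrinks. The threshold definitions follow the description (Th1 = sub-TU pixel count × α, Th2 = number of sub-TUs × maximum pixel value × β). The linear form `mu * (th2 - var)`, the use of the population variance, and the reading of "each sub-TU is complex" as min(C1[i]) ≥ Th1 are all assumptions made for illustration, not the patented formula.

```python
from statistics import pvariance
from typing import Sequence, Tuple


def thresholds(sub_tu_pixels: int, alpha: float = 2.0, beta: float = 0.05,
               max_pixel: int = 255, num_sub_tus: int = 4) -> Tuple[float, float]:
    """Th1 = sub-TU pixel count * alpha; Th2 = sub-TU count * max pixel value * beta."""
    return sub_tu_pixels * alpha, num_sub_tus * max_pixel * beta


def compute_offset(c1: Sequence[float], th1: float, th2: float, mu: float = 0.5) -> float:
    """Hypothetical offset: zero unless all sub-TUs are complex and mutually similar."""
    var = pvariance(c1)  # assumption: population variance of the four costs
    if min(c1) < th1 or var >= th2:
        return 0.0       # some sub-TU too simple, or sub-TUs too dissimilar
    # Assumed shape only: the smaller the variance, the larger the offset.
    return mu * (th2 - var)
```

For 8x8 sub-TUs (64 pixels) with the default constants, `thresholds(64)` gives Th1 = 128 and Th2 = 51, so four equal costs of 200 yield the largest offset this sketch can produce.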

なお、変形例によれば、符号化モード決定部１２は、各サブTUの予測誤差を、そのサブTUについての複雑度とし、各サブTUの予測誤差の分散を、各サブTU間の類似度としてもよい。この場合、符号化モード決定部１２は、例えば、次式に従ってoffsetを算出する。 According to a modification, the encoding mode determination unit 12 may use the prediction error of each sub-TU as the complexity of that sub-TU, and the variance of the prediction errors of the sub-TUs as the similarity between the sub-TUs. In this case, the encoding mode determination unit 12 calculates offset, for example, according to the following equation.

ここでt1[i](i=1,2,3,4)は、各サブTUについて、そのサブTUと対応する予測ブロック間の差分画像をアダマール変換した後の各画素の値の絶対値和SATDである。そしてVar(t1[i])は、各サブTUのSATDの分散である。閾値Th3は、例えば、サブTUの画素数に定数α（例えば、1〜5）を乗じた値に設定される。また、閾値Th4は、着目するTUに含まれるサブTUの数（本実施形態では、4）に画素の取り得る値の最大値（例えば、255）と定数β（例えば、0.05〜0.1）を乗じた値に設定される。SATDは、周波数領域でのサブTUの成分の大きさを表している。そのため、Var(t1[i])が小さいほど、各サブTUに写っている被写体はより類似していると想定される。この変形例における定数μの値は、上記の（６）式におけるμの値よりも大きい方が好ましい。 Here, t1[i] (i=1,2,3,4) is, for each sub-TU, the SATD, that is, the sum of the absolute values of the pixel values obtained by applying the Hadamard transform to the difference image between that sub-TU and its corresponding prediction block. Var(t1[i]) is the variance of the SATDs of the sub-TUs. The threshold Th3 is set, for example, to the number of pixels in a sub-TU multiplied by a constant α (for example, 1 to 5). The threshold Th4 is set to the number of sub-TUs included in the TU of interest (4 in this embodiment) multiplied by the maximum value a pixel can take (for example, 255) and by a constant β (for example, 0.05 to 0.1). The SATD represents the magnitude of the components of a sub-TU in the frequency domain. Therefore, the smaller Var(t1[i]) is, the more similar the subjects shown in the sub-TUs are assumed to be. The value of the constant μ in this modification is preferably larger than the value of μ in equation (6) above.
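As an illustration of the SATD used in this modification, the sketch below applies a two-dimensional Hadamard transform to the difference between a block and its prediction block and sums the absolute values of the result. It is written in plain Python for clarity; the helper names are hypothetical, the block size is assumed to be a power of two, and no normalization factor is applied, which is a choice made here rather than something the text specifies.

```python
from typing import List

Matrix = List[List[int]]


def hadamard_matrix(n: int) -> Matrix:
    """Sylvester-construction Hadamard matrix of order n (n must be a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two square matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def satd(block: Matrix, pred: Matrix) -> int:
    """Sum of absolute values of the Hadamard-transformed difference image."""
    n = len(block)
    diff = [[block[r][c] - pred[r][c] for c in range(n)] for r in range(n)]
    h = hadamard_matrix(n)
    # The Sylvester matrix is symmetric, so H * D * H is the 2-D transform.
    transformed = matmul(matmul(h, diff), h)
    return sum(abs(v) for row in transformed for v in row)
```

A flat difference image concentrates all of its energy in a single transform coefficient, whereas a textured one spreads it over many coefficients, which is why the SATD serves as a frequency-domain measure of the prediction error.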

また他の変形例によれば、各サブTU間の類似度は、サブTU間のSADの差の絶対値で評価されてもよい。この場合、符号化モード決定部１２は、例えば、次式に従ってoffsetを算出する。 According to another modification, the similarity between the sub-TUs may be evaluated by the absolute value of the difference in SAD between sub-TUs. In this case, the encoding mode determination unit 12 calculates offset, for example, according to the following equation.

ここでs1[i](i=1,2,3,4)は、各サブTUについて、そのサブTUと対応する予測ブロック間のSADである。また、関数max(s1[i])は、s1[i](i=1,2,3,4)のうちの最大値を出力する関数であり、関数min(s1[i])は、s1[i](i=1,2,3,4)のうちの最小値を出力する関数である。閾値Th5は、例えば、サブTUの画素数に定数α（例えば、1〜5）を乗じた値に設定される。また、閾値Th6は、着目するTUに含まれるサブTUの数（本実施形態では、4）に画素の取り得る値の最大値（例えば、255）と定数β（例えば、0.05〜0.1）を乗じた値に設定される。 Here, s1[i] (i=1,2,3,4) is, for each sub-TU, the SAD between that sub-TU and its corresponding prediction block. The function max(s1[i]) outputs the maximum of s1[i] (i=1,2,3,4), and the function min(s1[i]) outputs the minimum of s1[i] (i=1,2,3,4). The threshold Th5 is set, for example, to the number of pixels in a sub-TU multiplied by a constant α (for example, 1 to 5). The threshold Th6 is set to the number of sub-TUs included in the TU of interest (4 in this embodiment) multiplied by the maximum value a pixel can take (for example, 255) and by a constant β (for example, 0.05 to 0.1).

なお、(max(s1[i])- min(s1[i]))が0である場合、各サブTUに写っているシーンは同一である可能性がある。そこでこの場合には、符号化モード決定部１２は、（９）式によらず、大きい方のTUのサイズを選択することが好ましい。 When (max(s1[i]) - min(s1[i])) is 0, the scenes shown in the sub-TUs may be identical. In this case, the encoding mode determination unit 12 therefore preferably selects the larger TU size regardless of equation (9).

さらに他の変形例によれば、各サブTU間の類似度は、各サブTUの符号化コストC1[i]のうちの最大値と最小値の差として算出されてもよい。また、各サブTUの複雑度は、（３）式で算出されるアクティビティとして算出されてもよい。 According to still another modification, the similarity between the sub-TUs may be calculated as the difference between the maximum and minimum of the encoding costs C1[i] of the sub-TUs. The complexity of each sub-TU may also be calculated as the activity given by equation (3).
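The SAD-based similarity measure and its special case can be sketched as follows. The sketch computes the SAD of a block against its prediction, takes max(s1[i]) - min(s1[i]) as the spread, and selects the larger TU outright when the spread is 0. The function names are hypothetical, and the offset is passed in as a precomputed number because the offset equation for this modification is not reproduced in the text.

```python
from typing import List, Sequence


def sad(block: List[List[int]], pred: List[List[int]]) -> int:
    """Sum of absolute differences between a block and its prediction block."""
    return sum(abs(b - p)
               for block_row, pred_row in zip(block, pred)
               for b, p in zip(block_row, pred_row))


def sad_spread(s1: Sequence[int]) -> int:
    """max(s1[i]) - min(s1[i]); a small spread means the sub-TUs look alike."""
    return max(s1) - min(s1)


def select_with_sad_spread(c0: float, c1: Sequence[float],
                           s1: Sequence[int], offset: float) -> str:
    """Select "large" or "small", forcing "large" when all SADs are equal."""
    if sad_spread(s1) == 0:
        return "large"  # scenes may be identical, so skip the cost comparison
    return "large" if c0 <= sum(c1) + offset else "small"
```

The same `sad_spread` helper also covers the further modification mentioned above if it is fed the encoding costs C1[i] instead of the SADs.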

図５は、符号化モード決定処理の動作フローチャートである。符号化モード決定部１２はCTUごとに、下記の動作フローチャートに従って、符号化対象のCTUに適用するCUサイズ、PUサイズ、TUサイズ及び符号化モードの組み合わせを決定する。 FIG. 5 is an operation flowchart of the encoding mode determination process. For each CTU, the encoding mode determination unit 12 determines a combination of a CU size, a PU size, a TU size, and an encoding mode to be applied to the encoding target CTU according to the following operation flowchart.

符号化モード決定部１２は、イントラ予測符号化モードを着目する符号化モードに設定する（ステップＳ１０１）。そして符号化モード決定部１２は、最大のTUサイズを着目するTUサイズに設定する（ステップＳ１０２）。 The encoding mode determination unit 12 sets the intra prediction encoding mode as the encoding mode of interest (step S101). Then, the encoding mode determination unit 12 sets the maximum TU size as the TU size of interest (step S102).

符号化モード決定部１２は、着目するTUサイズを持つTUのそれぞれと、そのTUを４分割したサブTUのそれぞれについて、符号化コストの最小値C0、C1[i]を算出する（ステップＳ１０３）。 The encoding mode determination unit 12 calculates the minimum values C0 and C1 [i] of the encoding cost for each TU having the TU size of interest and each of the sub TUs obtained by dividing the TU into four (step S103). .

符号化モード決定部１２は、着目するTUサイズを持つTUのそれぞれについて、そのTUを４分割した各サブTUの複雑度及びサブTU間の類似度に基づいてoffsetを算出する（ステップＳ１０４）。 The encoding mode determination unit 12 calculates an offset for each TU having a target TU size based on the complexity of each sub-TU obtained by dividing the TU into four and the similarity between sub-TUs (step S104).

符号化モード決定部１２は、着目するTUサイズを持つTUのそれぞれについて、そのTUの符号化コストC0が、各サブTUの符号化コストの和ΣC1[i]とoffsetの合計以下か否か判定する（ステップＳ１０５）。そして符号化モード決定部１２は、符号化コストC0が各サブTUの符号化コストの和ΣC1[i]とoffsetの合計以下である場合（ステップＳ１０５−Ｙｅｓ）、着目するTUのサイズを選択する（ステップＳ１０６）。一方、符号化コストC0が各サブTUの符号化コストの和ΣC1[i]とoffsetの合計より大きい場合（ステップＳ１０５−Ｎｏ）、符号化モード決定部１２は、サブTUのサイズを選択する（ステップＳ１０７）。 The encoding mode determination unit 12 determines, for each TU having a target TU size, whether the encoding cost C0 of the TU is less than or equal to the sum of the encoding costs ΣC1 [i] of each sub TU and offset (Step S105). Then, when the encoding cost C0 is equal to or less than the sum of the encoding costs ΣC1 [i] and offset of each sub TU (step S105—Yes), the encoding mode determination unit 12 selects the size of the TU of interest. (Step S106). On the other hand, when the encoding cost C0 is larger than the sum of the encoding costs ΣC1 [i] of each sub TU and offset (No in step S105), the encoding mode determination unit 12 selects the size of the sub TU ( Step S107).

符号化モード決定部１２は、サブTUのサイズが選択されたTUが有るか否か判定する（ステップＳ１０８）。サブTUサイズが選択されたTUが一つ以上あれば（ステップＳ１０８−Ｙｅｓ）、符号化モード決定部１２は、サブTUのサイズがTUの取り得る最小サイズか否か判定する（ステップＳ１０９）。なお、TUの取り得る最小サイズは、HEVCで規定される4x4画素であってもよく、あるいは、4x4画素以上で、予め設定されるサイズであってもよい。サブTUのサイズがTUの取り得る最小サイズでなければ（ステップＳ１０９−Ｎｏ）、符号化モード決定部１２は、サブTUのサイズを着目するTUのサイズに設定する（ステップＳ１１０）。そして符号化モード決定部１２は、サブTUのサイズが選択されたTUのそれぞれについて、サブTUとTUとして、ステップＳ１０３以降の処理を繰り返す。 The encoding mode determination unit 12 determines whether there is any TU for which the sub-TU size was selected (step S108). If there is at least one such TU (step S108-Yes), the encoding mode determination unit 12 determines whether the sub-TU size is the minimum size a TU can take (step S109). The minimum size a TU can take may be the 4x4 pixels defined by HEVC, or may be a preset size of 4x4 pixels or larger. If the sub-TU size is not the minimum size a TU can take (step S109-No), the encoding mode determination unit 12 sets the sub-TU size as the TU size of interest (step S110). Then, for each TU for which the sub-TU size was selected, the encoding mode determination unit 12 treats its sub-TUs as TUs and repeats the processing from step S103 onward.

一方、ステップＳ１０８において、サブTUのサイズが選択されたTUが無い場合（ステップＳ１０８−Ｎｏ）、符号化モード決定部１２は、TUの分割を終了する。また、ステップＳ１０９においてサブTUのサイズがTUの取り得る最小サイズである場合（ステップＳ１０９−Ｙｅｓ）も、符号化モード決定部１２は、TUの分割を終了する。その後、符号化モード決定部１２は、着目する符号化モードがインター予測符号化モードか否か判定する（ステップＳ１１１）。着目する符号化モードがイントラ予測符号化モードであれば（ステップＳ１１１−Ｎｏ）、符号化モード決定部１２は、インター予測符号化モードを着目する符号化モードに設定する（ステップＳ１１２）。そして符号化モード決定部１２は、ステップＳ１０２以降の処理を繰り返す。 On the other hand, in step S108, when there is no TU for which the size of the sub TU is selected (step S108-No), the encoding mode determination unit 12 ends the division of the TU. Also, when the size of the sub TU is the minimum size that can be taken by the TU in step S109 (step S109—Yes), the encoding mode determination unit 12 ends the division of the TU. Thereafter, the coding mode determination unit 12 determines whether or not the coding mode of interest is the inter prediction coding mode (step S111). If the encoding mode of interest is the intra prediction encoding mode (step S111-No), the encoding mode determination unit 12 sets the inter prediction encoding mode to the encoding mode of interest (step S112). Then, the encoding mode determination unit 12 repeats the processing after step S102.
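The size-by-size loop of steps S103 to S110 amounts to a quadtree descent, which can be sketched recursively. This is a minimal illustration, not the flowchart itself: the `cost` and `offset` callbacks stand in for the encoder's cost and offset calculations and are hypothetical, as is the `(x, y, size)` representation of a TU.

```python
from typing import Callable, List, Tuple

Block = Tuple[int, int, int]              # (x, y, size) of a square TU
Metric = Callable[[int, int, int], float]  # maps (x, y, size) to a number


def split_tu(x: int, y: int, size: int, cost: Metric, offset: Metric,
             min_size: int = 4) -> List[Block]:
    """Return the TUs chosen for the square region at (x, y) of the given size."""
    if size <= min_size:
        return [(x, y, size)]             # minimum size reached (step S109-Yes)
    half = size // 2
    subs = [(x, y), (x + half, y), (x, y + half), (x + half, y + half)]
    c0 = cost(x, y, size)
    c1_sum = sum(cost(sx, sy, half) for sx, sy in subs)
    if c0 <= c1_sum + offset(x, y, size):
        return [(x, y, size)]             # keep the larger TU (step S106)
    leaves: List[Block] = []
    for sx, sy in subs:                   # sub-TU size chosen (step S107): descend
        leaves.extend(split_tu(sx, sy, half, cost, offset, min_size))
    return leaves
```

A recursive descent visits the same comparisons as the level-by-level loop in the flowchart; it simply handles each TU's subtree before moving to its sibling.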

一方、着目する符号化モードがインター予測符号化モードであれば（ステップＳ１１１−Ｙｅｓ）、符号化モード決定部１２は、CUごとに、上記のTU分割結果及び対応するPUと予測モードとに基づいて、インター予測符号化モードの符号化コストとイントラ予測符号化モードの符号化コストを算出する。そして符号化モード決定部１２は、インター予測符号化モードとイントラ予測符号化モードのうち、符号化コストが小さい方をそのCUについて適用する符号化モードとする（ステップＳ１１３）。なお、符号化モード決定部１２は、CUについても、取り得る最大サイズのCUから順に、CUの符号化コストと、そのCUを4分割した4個のサブCUのそれぞれの符号化コストの和とを比較して、符号化コストが小さい方を選択すればよい。ただし、TU及びPUは、複数のCUにわたって設定されることはないので、上記のTU分割の結果により、着目するCUが4個のサブCUに分割できない場合には、着目するCUは、それ以上分割されない。そして符号化モード決定部１２は、符号化モード決定処理を終了する。 On the other hand, if the encoding mode of interest is the inter prediction encoding mode (step S111-Yes), the encoding mode determination unit 12 calculates, for each CU, the encoding cost of the inter prediction encoding mode and the encoding cost of the intra prediction encoding mode based on the above TU division result and the corresponding PU and prediction mode. The encoding mode determination unit 12 then takes whichever of the inter prediction encoding mode and the intra prediction encoding mode has the smaller encoding cost as the encoding mode applied to that CU (step S113). For CUs as well, the encoding mode determination unit 12 may, starting from the largest possible CU size, compare the encoding cost of a CU with the sum of the encoding costs of the four sub-CUs obtained by dividing that CU into four, and select whichever has the smaller encoding cost. However, because TUs and PUs are never set across multiple CUs, if the above TU division result means that the CU of interest cannot be divided into four sub-CUs, the CU of interest is not divided any further. The encoding mode determination unit 12 then ends the encoding mode determination process.

なお、符号化対象ピクチャが、インター予測符号化モードの適用がない、Iピクチャである場合には、符号化モード決定部１２は、ステップＳ１１０以降の処理を行わずに符号化モード決定処理を終了する。 When the encoding target picture is an I picture, to which the inter prediction encoding mode does not apply, the encoding mode determination unit 12 ends the encoding mode determination process without performing the processing from step S110 onward.

符号化モード決定部１２は、符号化対象のCTUに適用するCUサイズ、PUサイズ、TUサイズ及び符号化モードの組み合わせを決定すると、その組み合わせを符号化部１３に出力する。なお、インター予測符号化されるCUについては、符号化モード決定部１２は、そのCUに含まれるPUについて算出された動きベクトルも符号化部１３へ出力する。 When the encoding mode determination unit 12 determines a combination of the CU size, the PU size, the TU size, and the encoding mode to be applied to the encoding target CTU, the encoding mode determination unit 12 outputs the combination to the encoding unit 13. Note that, for a CU to be inter-predictively encoded, the encoding mode determination unit 12 also outputs a motion vector calculated for a PU included in the CU to the encoding unit 13.

符号化部１３は、符号化モード決定部１２により決定されたCUサイズ、PUサイズ及びTUサイズに従って、符号化対象のCTUを分割する。そして符号化部１３は、符号化モード決定部１２により決定されたCUごとの符号化モードに従って、そのCUに含まれる各TUの予測ブロックを作成する。そして符号化部１３は、符号化対象のCTUに含まれる各TUと対応する予測ブロック間の予測誤差信号を、TUごとに直交変換して得られる直交変換係数を量子化及び可変長符号化することで、符号化対象のCTUを符号化する。 The encoding unit 13 divides the CTU to be encoded according to the CU size, PU size, and TU size determined by the encoding mode determination unit 12. The encoding unit 13 then creates a prediction block for each TU included in a CU according to the encoding mode determined for that CU by the encoding mode determination unit 12. The encoding unit 13 then encodes the CTU by quantizing and variable-length encoding the orthogonal transform coefficients obtained by orthogonally transforming, for each TU, the prediction error signal between each TU included in the CTU and its corresponding prediction block.

再度図４を参照すると、符号化部１３は、予測ブロック生成部２１と、予測誤差信号算出部２２と、直交変換部２３と、量子化部２４と、復号部２５と、可変長符号化部２６とを有する。 Referring again to FIG. 4, the encoding unit 13 includes a prediction block generation unit 21, a prediction error signal calculation unit 22, an orthogonal transform unit 23, a quantization unit 24, a decoding unit 25, and a variable length encoding unit 26.

予測ブロック生成部２１は、予測ブロックを生成する。予測ブロック生成部２１は、符号化対象のCTUに含まれる着目するCUがインター予測符号化される場合、そのCUに含まれる各PUについて、そのPUに設定される動きベクトルを用いて参照ピクチャを動き補償することで予測ブロックを生成する。また、予測ブロック生成部２１は、着目するCUがイントラ予測符号化される場合、そのCUに含まれる各PUについて、そのPUに適用される予測モードに従って、そのPUの周囲の符号化済みのブロックに含まれる画素から予測ブロックを生成する。 The prediction block generation unit 21 generates prediction blocks. When the CU of interest in the CTU to be encoded is inter prediction encoded, the prediction block generation unit 21 generates a prediction block for each PU included in that CU by motion-compensating the reference picture using the motion vector set for that PU. When the CU of interest is intra prediction encoded, the prediction block generation unit 21 generates a prediction block for each PU included in that CU from pixels in the already-encoded blocks surrounding the PU, according to the prediction mode applied to that PU.

予測ブロック生成部２１は、予測ブロックを予測誤差信号算出部２２へ出力する。 The prediction block generation unit 21 outputs the prediction block to the prediction error signal calculation unit 22.

予測誤差信号算出部２２は、符号化対象のCTU内の各画素について、予測ブロックの対応する画素との差分演算を実行する。そして予測誤差信号算出部２２は、その差分演算により得られた各画素に対応する差分値を、予測誤差信号とする。予測誤差信号算出部２２は、予測誤差信号を直交変換部２３へ出力する。 The prediction error signal calculation unit 22 performs a difference calculation for each pixel in the CTU to be encoded with the corresponding pixel of the prediction block. And the prediction error signal calculation part 22 makes the difference value corresponding to each pixel obtained by the difference calculation a prediction error signal. The prediction error signal calculation unit 22 outputs the prediction error signal to the orthogonal transformation unit 23.

直交変換部２３は、符号化対象のCTU内のTUごとに、予測誤差信号を直交変換することにより、直交変換係数の組を求める。例えば、直交変換部２３は、直交変換処理として、離散コサイン変換（Discrete Cosine Transform、DCT）を用いることで、直交変換係数として、TUごとのDCT係数の組を得る。あるいは、直交変換部２３は、直交変換処理として、アダマール変換を利用してもよい。 The orthogonal transform unit 23 obtains a set of orthogonal transform coefficients by orthogonally transforming the prediction error signal for each TU in the CTU to be encoded. For example, the orthogonal transform unit 23 uses the discrete cosine transform (DCT) as the orthogonal transform process and thereby obtains a set of DCT coefficients for each TU as the orthogonal transform coefficients. Alternatively, the orthogonal transform unit 23 may use the Hadamard transform as the orthogonal transform process.

直交変換部２３は、TUごとの直交変換係数の組を量子化部２４へ出力する。 The orthogonal transform unit 23 outputs the set of orthogonal transform coefficients for each TU to the quantization unit 24.

量子化部２４は、TUごとに、直交変換係数を量子化することにより、その直交変換係数の量子化係数を算出する。この量子化処理は、一定区間に含まれる信号値を一つの信号値で表す処理である。そしてその一定区間は、量子化幅と呼ばれる。例えば、量子化部２４は、直交変換係数から、量子化幅に相当する所定数の下位ビットを切り捨てることにより、その直交変換係数を量子化する。量子化幅は、量子化パラメータによって決定される。例えば、量子化部２４は、量子化パラメータの値に対する量子化幅の値を表す関数にしたがって、使用される量子化幅を決定する。またその関数は、量子化パラメータの値に対する単調増加関数とすることができ、予め設定される。 For each TU, the quantization unit 24 quantizes the orthogonal transform coefficient to calculate a quantization coefficient of the orthogonal transform coefficient. This quantization process is a process that represents a signal value included in a certain section as one signal value. The fixed interval is called a quantization width. For example, the quantization unit 24 quantizes the orthogonal transform coefficient by truncating a predetermined number of lower bits corresponding to the quantization width from the orthogonal transform coefficient. The quantization width is determined by the quantization parameter. For example, the quantization unit 24 determines a quantization width to be used according to a function representing a quantization width value with respect to a quantization parameter value. The function can be a monotonically increasing function with respect to the value of the quantization parameter, and is set in advance.
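The truncation-based quantization described above can be illustrated with integer shifts. In this sketch the quantization width is assumed to be a power of two, 2**shift, so that dropping low-order bits is a right shift of the magnitude. The mapping `quant_shift` from quantization parameter to shift is a made-up monotonically non-decreasing function used only as an example; it is not the HEVC mapping and is not taken from the patent.

```python
def quant_shift(qp: int) -> int:
    """Hypothetical mapping: the quantization width doubles every 6 QP steps."""
    if qp < 0:
        raise ValueError("qp must be non-negative")
    return qp // 6


def quantize(coef: int, shift: int) -> int:
    """Drop `shift` low-order bits of the magnitude, keeping the sign.

    Every coefficient whose magnitude lies in one interval of width
    2**shift maps to the same level.
    """
    sign = -1 if coef < 0 else 1
    return sign * (abs(coef) >> shift)
```

For example, with a shift of 3 the coefficients 32 through 39 all map to level 4, which is the sense in which a whole interval of signal values is represented by a single value.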

また量子化部２４は、HEVCなどの動画像符号化規格に対応した様々な量子化パラメータ決定方法の何れかに従って量子化パラメータを決定すればよい。量子化部２４は、例えば、MPEG-2の標準テストモデル5に関する量子化パラメータの算出方法を用いることができる。なお、MPEG-2の標準テストモデル5に関する量子化パラメータの算出方法に関しては、例えば、http://www.mpeg.org/MPEG/MSSG/tm5/Ch10/Ch10.htmlで特定されるURLを参照されたい。 The quantization unit 24 may determine the quantization parameter according to any of various quantization parameter determination methods compatible with a moving image coding standard such as HEVC. The quantization unit 24 can use, for example, the quantization parameter calculation method of MPEG-2 Standard Test Model 5. For that calculation method, see, for example, the URL http://www.mpeg.org/MPEG/MSSG/tm5/Ch10/Ch10.html.

量子化部２４は、量子化処理を実行することにより、直交変換係数を表すために使用されるビットの数を削減できるので、符号量を低減できる。量子化部２４は、量子化係数を復号部２５及び可変長符号化部２６へ出力する。 By executing the quantization process, the quantization unit 24 can reduce the number of bits used to represent the orthogonal transform coefficients, and can therefore reduce the code amount. The quantization unit 24 outputs the quantized coefficients to the decoding unit 25 and the variable length encoding unit 26.

復号部２５は、符号化対象のCTUの量子化係数から、そのCTUよりも後のCTU及び符号化対象のCTUを含むピクチャよりも符号化順で後のピクチャを符号化するために参照される参照ピクチャを生成する。そのために、復号部２５は、量子化係数に、量子化パラメータにより決定された量子化幅に相当する所定数を乗算することにより、量子化係数を逆量子化する。この逆量子化により、各TUの直交変換係数の組、例えば、DCT係数の組が復元される。その後、復号部２５は、TUごとに、直交変換係数の組を逆直交変換処理する。例えば、直交変換部２３がDCTを用いている場合、復号部２５は、各TUに対して逆DCT処理を実行する。逆量子化処理及び逆直交変換処理を量子化信号に対して実行することにより、符号化前の予測誤差信号と同程度の情報を有する予測誤差信号が再生される。 From the quantized coefficients of the CTU to be encoded, the decoding unit 25 generates a reference picture that is referred to in order to encode CTUs after that CTU and pictures later in encoding order than the picture containing that CTU. To do so, the decoding unit 25 inversely quantizes the quantized coefficients by multiplying them by a predetermined number corresponding to the quantization width determined by the quantization parameter. This inverse quantization restores the set of orthogonal transform coefficients of each TU, for example, a set of DCT coefficients. The decoding unit 25 then applies the inverse orthogonal transform to the set of orthogonal transform coefficients for each TU. For example, when the orthogonal transform unit 23 uses the DCT, the decoding unit 25 performs an inverse DCT on each TU. Applying the inverse quantization process and the inverse orthogonal transform process to the quantized signal reproduces a prediction error signal carrying about the same amount of information as the prediction error signal before encoding.

復号部２５は、予測ブロックの各画素値に、その画素に対応する再生された予測誤差信号を加算する。これらの処理を各予測ブロックについて実行することにより、復号部２５は、その後に符号化されるPUに対する予測ブロックを生成するために参照されるブロックを復元する。さらに、復号部２５は、復元したブロックに対してデブロッキングフィルタ処理を実行してもよい。 The decoding unit 25 adds, to each pixel value of the prediction block, the reproduced prediction error signal corresponding to that pixel. By executing these processes for each prediction block, the decoding unit 25 restores the blocks that are referred to in order to generate prediction blocks for PUs encoded afterwards. The decoding unit 25 may further apply deblocking filter processing to the restored blocks.

復号部２５は、ブロックを復元する度に、その復元されたブロックを、記憶部１４に記憶する。 Each time the decoding unit 25 restores a block, it stores the restored block in the storage unit 14.
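The local decoding path can be sketched in two small steps: inverse quantization by multiplication, and reconstruction by adding the reproduced prediction error to the prediction block. The inverse orthogonal transform that sits between these two steps is omitted to keep the example short. Clipping the result to the valid pixel range is an assumption added here; the text does not mention it. The function names are hypothetical.

```python
from typing import List

Matrix = List[List[int]]


def dequantize(levels: Matrix, qstep: int) -> Matrix:
    """Multiply each quantized level by the number corresponding to the quantization width."""
    return [[v * qstep for v in row] for row in levels]


def reconstruct(pred: Matrix, residual: Matrix, max_pixel: int = 255) -> Matrix:
    """Add the reproduced prediction error to the prediction block, pixel by pixel.

    Clipping to [0, max_pixel] is an assumption of this sketch.
    """
    return [[min(max(p + r, 0), max_pixel) for p, r in zip(pred_row, res_row)]
            for pred_row, res_row in zip(pred, residual)]
```

Because the encoder rebuilds its reference blocks from the quantized data rather than from the original pixels, it predicts from the same pictures a decoder would have.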

記憶部１４は、復元されたブロックを一時的に記憶する。各ブロックの符号化順序にしたがって、１枚のピクチャ分の復元されたブロックを結合することで、後続するピクチャの符号化の際に参照されるピクチャが得られる。記憶部１４は、符号化対象ピクチャが参照する可能性がある、予め定められた所定枚数分のピクチャを記憶し、記憶しているピクチャの枚数がその所定枚数を超えると、符号化順序が古いピクチャから順に破棄する。 The storage unit 14 temporarily stores the restored blocks. Combining the restored blocks of one picture according to the encoding order of the blocks yields a picture that is referred to when subsequent pictures are encoded. The storage unit 14 stores a predetermined number of pictures that the encoding target picture may refer to, and when the number of stored pictures exceeds that predetermined number, it discards pictures in order starting from the oldest in encoding order.
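The discard policy of the storage unit 14 behaves like a fixed-capacity queue. The sketch below uses `collections.deque` with `maxlen`, which drops the oldest entry when a new one is appended to a full queue. The class name and the use of integer picture identifiers are hypothetical, and pictures are assumed to be added in encoding order.

```python
from collections import deque
from typing import Deque, List


class ReferencePictureStore:
    """Keeps at most `capacity` pictures, discarding the oldest in encoding order."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._pictures: Deque[int] = deque(maxlen=capacity)

    def add(self, picture_id: int) -> None:
        # Appending to a full deque silently drops the leftmost (oldest) entry.
        self._pictures.append(picture_id)

    def ids(self) -> List[int]:
        """Stored picture identifiers, oldest first."""
        return list(self._pictures)
```

A store created with a capacity of 2 that receives pictures 0, 1, and 2 in that order ends up holding pictures 1 and 2.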

さらに、記憶部１４は、インター予測符号化されたPUのそれぞれについての動きベクトルを記憶する。 Furthermore, the storage unit 14 stores a motion vector for each of the PUs subjected to inter prediction encoding.

可変長符号化部２６は、符号化対象のCTUに含まれる量子化係数を可変長符号化する。さらに、可変長符号化部２６は、予測ブロックの作成に利用された動きベクトルなども可変長符号化する。そして可変長符号化部２６は、その可変長符号化によって得られた符号化ビットを、HEVCなどに従って所定の順序に並べたビットストリームを出力する。なお、可変長符号化部２６は、可変長符号化方式として、Context-based Adaptive Binary Arithmetic Coding(CABAC)といった算術符号化処理を用いることができる。あるいは、可変長符号化部２６は、可変長符号化方式として、Context-based Adaptive Variable Length Coding (CAVLC)といったハフマン符号化処理を用いてもよい。 The variable length coding unit 26 performs variable length coding on the quantization coefficient included in the CTU to be coded. Furthermore, the variable length coding unit 26 also performs variable length coding on a motion vector or the like used to create a prediction block. Then, the variable length coding unit 26 outputs a bit stream in which coded bits obtained by the variable length coding are arranged in a predetermined order according to HEVC or the like. The variable length coding unit 26 can use arithmetic coding processing such as Context-based Adaptive Binary Arithmetic Coding (CABAC) as the variable length coding method. Alternatively, the variable length coding unit 26 may use Huffman coding processing such as Context-based Adaptive Variable Length Coding (CAVLC) as the variable length coding method.

制御部（図示せず）は、出力されたビットストリームを所定の順序で結合し、HEVCなどに従ったヘッダ情報などを付加することで、符号化された動画像データを含むビットストリームを得る。 A control unit (not shown) combines the output bit streams in a predetermined order and adds header information according to HEVC or the like to obtain a bit stream including encoded moving image data.

図６は、動画像符号化装置１による動画像符号化処理の動作フローチャートである。動画像符号化装置１はCTUごとに、下記の動作フローチャートに従って符号化する。 FIG. 6 is an operation flowchart of the moving image encoding process performed by the moving image encoding device 1. The moving image encoding apparatus 1 performs encoding for each CTU according to the following operation flowchart.

動きベクトル算出部１１は、符号化対象ピクチャが、PピクチャまたはBピクチャといった、インター予測符号化モードが適用可能なピクチャである場合、符号化対象のCTUについて適用可能なPUのそれぞれについて、動きベクトルを算出する（ステップＳ２０１）。 When the encoding target picture is a picture to which the inter prediction encoding mode is applicable, such as a P picture or a B picture, the motion vector calculation unit 11 calculates a motion vector for each PU applicable to the CTU to be encoded (step S201).

符号化モード決定部１２は、符号化対象のCTUに適用される、CUサイズ、PUサイズ、TUサイズ、及び符号化モードの組み合わせを決定する（ステップＳ２０２）。 The encoding mode determination unit 12 determines a combination of a CU size, a PU size, a TU size, and an encoding mode to be applied to the encoding target CTU (step S202).

符号化部１３の予測ブロック生成部２１は、決定した組み合わせに含まれる符号化モードに従って、PUごとに予測ブロックを算出する（ステップＳ２０３）。そして符号化部１３の予測誤差信号算出部２２は、TUごとに、TUとそのTUに対応する予測ブロック間で画素ごとに差分演算することで、予測誤差信号を算出する（ステップＳ２０４）。 The prediction block generation unit 21 of the encoding unit 13 calculates a prediction block for each PU according to the encoding mode included in the determined combination (step S203). And the prediction error signal calculation part 22 of the encoding part 13 calculates a prediction error signal for every TU by calculating the difference for every pixel between TU and the prediction block corresponding to the TU (step S204).

その後、符号化部１３の直交変換部２３は、予測誤差信号をTUごとに直交変換して直交変換係数の組を算出する（ステップＳ２０５）。そして符号化部１３の量子化部２４は、各直交変換係数を量子化する（ステップＳ２０６）。 Thereafter, the orthogonal transform unit 23 of the encoding unit 13 performs orthogonal transform on the prediction error signal for each TU to calculate a set of orthogonal transform coefficients (step S205). Then, the quantization unit 24 of the encoding unit 13 quantizes each orthogonal transform coefficient (step S206).

符号化部１３の復号部２５は、量子化係数を逆量子化及び逆直交変換して再生した予測誤差信号と予測ブロックを加算して参照ブロックを生成する（ステップＳ２０７）。そして復号部２５は、その参照ブロックを記憶部１４に記憶する。一方、符号化部１３の可変長符号化部２６は、各量子化係数を可変長符号化する（ステップＳ２０８）。そして動画像符号化装置１は、動画像符号化処理を終了する。 The decoding unit 25 of the encoding unit 13 adds the prediction error signal reproduced by inverse quantization and inverse orthogonal transform of the quantized coefficient and the prediction block to generate a reference block (step S207). Then, the decoding unit 25 stores the reference block in the storage unit 14. On the other hand, the variable length coding unit 26 of the coding unit 13 performs variable length coding on each quantization coefficient (step S208). Then, the moving image encoding apparatus 1 ends the moving image encoding process.

以上に説明してきたように、この動画像符号化装置は、符号化対象のブロックについて、そのブロックを分割した複数のサブブロックのそれぞれの複雑度が所定以上で、各サブブロック間の類似度が高いほど、そのブロックを分割され難くする。これにより、この動画像符号化装置は、符号化単位のサイズ、例えば、直交変換の単位のサイズを適切に決定できる。 As described above, for a block to be encoded, this moving image encoding apparatus makes the block harder to divide when the complexity of each of the sub-blocks obtained by dividing it is at or above a predetermined level, and the more so the higher the similarity between the sub-blocks. As a result, this moving image encoding apparatus can appropriately determine the size of the encoding unit, for example, the size of the orthogonal transform unit.

なお、インター予測符号化モードが適用されるピクチャについては、着目するTUを分割した各サブTUの複雑度が高く、かつ、各サブTU間の類似度が高い場合でも、参照ピクチャ上に各サブTUと良好に一致する領域が存在することがある。このような場合には、着目するTUがサブTUに分割されたとしても、各サブTUについての予測誤差が非常に少なく、RD特性が比較的良好となることがある。そこで、符号化モード決定部１２は、符号化モードとしてインター予測符号化モードが適用される場合には、offsetを常に0として、TUサイズを決定してもよい。 For pictures to which the inter prediction coding mode is applied, even if the complexity of each sub-TU obtained by dividing the TU of interest is high and the similarity between the sub-TUs is high, each sub-TU is displayed on the reference picture. There may be regions that match well with the TU. In such a case, even if the target TU is divided into sub-TUs, the prediction error for each sub-TU is very small, and the RD characteristics may be relatively good. Therefore, the encoding mode determination unit 12 may determine the TU size with offset always set to 0 when the inter prediction encoding mode is applied as the encoding mode.

また、他の変形例によれば、動画像符号化装置は、直交変換の単位だけでなく、符号化モードを選択する単位を決定する際にも、上記の処理を適用してもよい。例えば、動画像符号化装置は、着目するCUとそのCUを４分割したサブCU間で符号化コストを比較する際に、各サブCUの複雑度と各サブCU間の類似度に応じて決定されるオフセットを、各サブCUの符号化コストの和に加えてもよい。 According to another modification, the moving image encoding apparatus may apply the above processing not only when determining the unit of orthogonal transform but also when determining the unit for which an encoding mode is selected. For example, when comparing the encoding cost of a CU of interest with that of the sub-CUs obtained by dividing the CU into four, the moving image encoding apparatus may add an offset, determined according to the complexity of each sub-CU and the similarity between the sub-CUs, to the sum of the encoding costs of the sub-CUs.

図７は、上記の実施形態またはその変形例による動画像符号化装置の各部の機能を実現するコンピュータプログラムが動作することにより、動画像符号化装置として動作するコンピュータの構成図である。 FIG. 7 is a configuration diagram of a computer that operates as a moving image encoding apparatus when a computer program that realizes the functions of the respective units of the moving image encoding apparatus according to the above-described embodiment or its modification is operated.

コンピュータ１００は、ユーザインターフェース部１０１と、通信インターフェース部１０２と、記憶部１０３と、記憶媒体アクセス装置１０４と、プロセッサ１０５とを有する。プロセッサ１０５は、ユーザインターフェース部１０１、通信インターフェース部１０２、記憶部１０３及び記憶媒体アクセス装置１０４と、例えば、バスを介して接続される。 The computer 100 includes a user interface unit 101, a communication interface unit 102, a storage unit 103, a storage medium access device 104, and a processor 105. The processor 105 is connected to the user interface unit 101, the communication interface unit 102, the storage unit 103, and the storage medium access device 104 via, for example, a bus.

ユーザインターフェース部１０１は、例えば、キーボードとマウスなどの入力装置と、液晶ディスプレイといった表示装置とを有する。または、ユーザインターフェース部１０１は、タッチパネルディスプレイといった、入力装置と表示装置とが一体化された装置を有してもよい。そしてユーザインターフェース部１０１は、例えば、ユーザの操作に応じて、符号化する動画像データを選択する操作信号をプロセッサ１０５へ出力する。 The user interface unit 101 includes, for example, an input device such as a keyboard and a mouse, and a display device such as a liquid crystal display. Alternatively, the user interface unit 101 may include a device such as a touch panel display in which an input device and a display device are integrated. Then, the user interface unit 101 outputs, for example, an operation signal for selecting moving image data to be encoded to the processor 105 in accordance with a user operation.

通信インターフェース部１０２は、コンピュータ１００を、動画像データを生成する装置、例えば、ビデオカメラと接続するための通信インターフェース及びその制御回路を有してもよい。そのような通信インターフェースは、例えば、High-Definition Multimedia Interface(HDMI)(登録商標)、またはUniversal Serial Bus（ユニバーサル・シリアル・バス、USB）とすることができる。 The communication interface unit 102 may include a communication interface for connecting the computer 100 to a device that generates moving image data, for example, a video camera, and a control circuit thereof. Such a communication interface may be, for example, High-Definition Multimedia Interface (HDMI) (registered trademark) or Universal Serial Bus (Universal Serial Bus, USB).

さらに、通信インターフェース部１０２は、イーサネット（登録商標）などの通信規格に従った通信ネットワークに接続するための通信インターフェース及びその制御回路を有してもよい。 Furthermore, the communication interface unit 102 may include a communication interface for connecting to a communication network according to a communication standard such as Ethernet (registered trademark) and a control circuit thereof.

In this case, the communication interface unit 102 acquires the moving image data to be encoded from another device connected to the communication network and passes the data to the processor 105. The communication interface unit 102 may also output encoded moving image data received from the processor 105 to another device via the communication network.

The storage unit 103 includes, for example, a readable and writable semiconductor memory and a read-only semiconductor memory. The storage unit 103 stores a computer program for the moving image encoding process that is executed on the processor 105, and data generated during or as a result of that process.

The storage medium access device 104 is a device that accesses a storage medium 106 such as a magnetic disk, a semiconductor memory card, or an optical storage medium. For example, the storage medium access device 104 reads, from the storage medium 106, a computer program for the moving image encoding process to be executed on the processor 105 and passes it to the processor 105.

The processor 105 generates encoded moving image data by executing the computer program for the moving image encoding process according to the above-described embodiment or modification. The processor 105 then stores the generated encoded moving image data in the storage unit 103 or outputs it to another device via the communication interface unit 102.

Note that a computer program that causes a processor to execute the functions of the respective units of the moving image encoding apparatus 1 may be provided in a form recorded on a computer-readable medium. Such a recording medium, however, does not include a carrier wave.

All examples and specific terms recited herein are intended for pedagogical purposes, to help the reader understand the present invention and the concepts contributed by the inventor to furthering the art. They are to be construed as being without limitation to such specifically recited examples and conditions, or to the organization of any example in this specification that relates to showing the superiority or inferiority of the invention. Although the embodiments of the present invention have been described in detail, it should be understood that various changes, substitutions, and alterations can be made thereto without departing from the spirit and scope of the invention.

The following supplementary notes are further disclosed regarding the embodiment described above and its modifications.
(Appendix 1)
A moving image encoding apparatus comprising:
an encoding mode determination unit that calculates an offset value according to the complexity of each of a plurality of sub-blocks, obtained by dividing a block on a picture included in moving image data, and the similarity between the sub-blocks, that selects the block as the encoding unit when a first encoding cost for encoding the block with the block as the encoding unit is less than or equal to the total of the offset value and the sum of the second encoding costs of the respective sub-blocks for encoding the block with the sub-blocks as the encoding unit, and that selects the sub-block as the encoding unit when the first encoding cost is greater than the total; and
an encoding unit that encodes the block for each selected encoding unit.
(Appendix 2)
The moving picture encoding apparatus according to appendix 1, wherein the encoding mode determination unit sets, for each of the plurality of sub-blocks, the second encoding cost of the sub-block as the complexity of the sub-block.
(Appendix 3)
The moving image encoding apparatus according to appendix 1, wherein the encoding mode determination unit sets, for each of the plurality of sub-blocks, the sum of absolute differences between corresponding pixels of the sub-block and a prediction block for the sub-block as the complexity of the sub-block.
(Appendix 4)
The moving image encoding apparatus according to appendix 2 or 3, wherein the encoding mode determination unit calculates the variance of the complexities of the plurality of sub-blocks as the similarity.
(Appendix 5)
The moving picture coding apparatus according to appendix 2 or 3, wherein the coding mode determination unit calculates a difference between a maximum value and a minimum value of the complexity of each of the plurality of sub-blocks as the similarity.
(Appendix 6)
The moving image encoding apparatus according to any one of appendices 1 to 5, wherein the encoding mode determination unit sets the offset value to 0 when the block is encoded based on an inter prediction encoding mode, in which the block is encoded with reference to another, already encoded picture, and calculates the offset value according to the complexity of each of the plurality of sub-blocks and the similarity between the sub-blocks when the block is encoded based on an intra prediction encoding mode, in which the block is encoded with reference to an already encoded region of the picture.
(Appendix 7)
The moving image encoding apparatus according to any one of appendices 1 to 6, wherein the encoding unit orthogonally transforms the prediction error between the block and a prediction block for the block for each selected encoding unit to calculate orthogonal transform coefficients for each encoding unit, and encodes the orthogonal transform coefficients of each encoding unit.
(Appendix 8)
A moving image encoding method comprising:
calculating an offset value according to the complexity of each of a plurality of sub-blocks, obtained by dividing a block on a picture included in moving image data, and the similarity between the sub-blocks;
selecting the block as the encoding unit when a first encoding cost for encoding the block with the block as the encoding unit is less than or equal to the total of the offset value and the sum of the second encoding costs of the respective sub-blocks for encoding the block with the sub-blocks as the encoding unit, and selecting the sub-block as the encoding unit when the first encoding cost is greater than the total; and
encoding the block for each selected encoding unit.
(Appendix 9)
A computer program for moving image encoding that causes a computer to execute a process comprising:
calculating an offset value according to the complexity of each of a plurality of sub-blocks, obtained by dividing a block on a picture included in moving image data, and the similarity between the sub-blocks;
selecting the block as the encoding unit when a first encoding cost for encoding the block with the block as the encoding unit is less than or equal to the total of the offset value and the sum of the second encoding costs of the respective sub-blocks for encoding the block with the sub-blocks as the encoding unit, and selecting the sub-block as the encoding unit when the first encoding cost is greater than the total; and
encoding the block for each selected encoding unit.
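The selection rule recited in Appendices 1 and 3 to 6 can be illustrated with a minimal Python sketch. It is not the implementation of the embodiment. All function names are hypothetical, the use of the population variance for Appendix 4 is an assumption (the appendices do not say which variance is meant), and the mapping from complexity and similarity to an offset value is not specified in the appendices, so it is left to a caller-supplied function; `example_offset` is a made-up placeholder used only to exercise the code.

```python
from statistics import pvariance
from typing import Callable, Sequence

Block = Sequence[Sequence[int]]  # rows of pixel values


def sad(block: Block, prediction: Block) -> int:
    """Appendix 3: sum of absolute differences between corresponding pixels."""
    return sum(
        abs(p - q)
        for row, pred_row in zip(block, prediction)
        for p, q in zip(row, pred_row)
    )


def similarity_variance(complexities: Sequence[float]) -> float:
    """Appendix 4: variance of the sub-block complexities (population variance assumed)."""
    return pvariance(complexities)


def similarity_range(complexities: Sequence[float]) -> float:
    """Appendix 5: difference between the largest and smallest complexity."""
    return max(complexities) - min(complexities)


def example_offset(complexities: Sequence[float], similarity: float) -> float:
    """Hypothetical placeholder; the real mapping is not given in the appendices."""
    return 10.0 if similarity < 1.0 else 0.0


def select_coding_unit(
    first_cost: float,
    second_costs: Sequence[float],
    complexities: Sequence[float],
    offset_fn: Callable[[Sequence[float], float], float],
    similarity_fn: Callable[[Sequence[float]], float] = similarity_variance,
    intra: bool = True,
) -> str:
    """Return "block" or "sub-block" following the comparison in Appendix 1."""
    # Appendix 6: the offset is 0 for inter prediction and is computed from
    # the complexities and their similarity for intra prediction.
    offset = offset_fn(complexities, similarity_fn(complexities)) if intra else 0.0
    total = sum(second_costs) + offset
    # The block is chosen when its cost does not exceed the total.
    return "block" if first_cost <= total else "sub-block"
```

With four sub-blocks whose second encoding costs sum to 100 and a first encoding cost of 105, the placeholder offset of 10 makes the sketch choose the block in the intra case, while the inter case (offset 0) chooses the sub-blocks. Passing `similarity_fn=similarity_range` switches from the Appendix 4 measure to the Appendix 5 measure without changing the comparison itself.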

Description of Symbols
1 Moving image encoding apparatus
11 Motion vector calculation unit
12 Encoding mode determination unit
13 Encoding unit
14 Storage unit
21 Prediction block generation unit
22 Prediction error signal calculation unit
23 Orthogonal transform unit
24 Quantization unit
25 Decoding unit
26 Variable length encoding unit
100 Computer
101 User interface unit
102 Communication interface unit
103 Storage unit
104 Storage medium access device
105 Processor

Claims

A moving image encoding apparatus comprising:
an encoding mode determination unit that calculates an offset value according to the complexity of each of a plurality of sub-blocks, obtained by dividing a block on a picture included in moving image data, and the similarity between the sub-blocks, that selects the block as the encoding unit when a first encoding cost for encoding the block with the block as the encoding unit is less than or equal to the total of the offset value and the sum of the second encoding costs of the respective sub-blocks for encoding the block with the sub-blocks as the encoding unit, and that selects the sub-block as the encoding unit when the first encoding cost is greater than the total; and
an encoding unit that encodes the block for each selected encoding unit.

The moving image encoding apparatus according to claim 1, wherein the encoding mode determination unit sets, for each of the plurality of sub-blocks, the second encoding cost of the sub-block as the complexity of the sub-block.

The moving image encoding apparatus according to claim 1, wherein the encoding mode determination unit sets, for each of the plurality of sub-blocks, the sum of absolute differences between corresponding pixels of the sub-block and a prediction block for the sub-block as the complexity of the sub-block.

The moving image encoding apparatus according to claim 2, wherein the encoding mode determination unit calculates a variance of the complexity of each of the plurality of sub-blocks as the similarity.

The moving image encoding apparatus according to claim 2 or 3, wherein the encoding mode determination unit calculates the difference between the maximum value and the minimum value of the complexities of the plurality of sub-blocks as the similarity.

A moving image encoding method comprising:
calculating an offset value according to the complexity of each of a plurality of sub-blocks, obtained by dividing a block on a picture included in moving image data, and the similarity between the sub-blocks;
selecting the block as the encoding unit when a first encoding cost for encoding the block with the block as the encoding unit is less than or equal to the total of the offset value and the sum of the second encoding costs of the respective sub-blocks for encoding the block with the sub-blocks as the encoding unit, and selecting the sub-block as the encoding unit when the first encoding cost is greater than the total; and
encoding the block for each selected encoding unit.

A computer program for moving image encoding that causes a computer to execute a process comprising:
calculating an offset value according to the complexity of each of a plurality of sub-blocks, obtained by dividing a block on a picture included in moving image data, and the similarity between the sub-blocks;
selecting the block as the encoding unit when a first encoding cost for encoding the block with the block as the encoding unit is less than or equal to the total of the offset value and the sum of the second encoding costs of the respective sub-blocks for encoding the block with the sub-blocks as the encoding unit, and selecting the sub-block as the encoding unit when the first encoding cost is greater than the total; and
encoding the block for each selected encoding unit.