JP2015073213A

JP2015073213A - Image decoder, image encoder, encoded data converter, and interest area display system

Info

Publication number: JP2015073213A
Application number: JP2013208138A
Authority: JP
Inventors: 山本　智幸; Tomoyuki Yamamoto; 智幸山本; 知宏猪飼; Tomohiro Igai; 健史筑波; Kenji Tsukuba
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2013-10-03
Filing date: 2013-10-03
Publication date: 2015-04-16

Abstract

PROBLEM TO BE SOLVED: To reduce the code amount of hierarchically encoded data while keeping the picture quality of the reproduced image of a region of interest high. SOLUTION: An image decoder 1 includes a skip slice determination part, a non-skip CTU decoding part, and a skip CTU generation part. A slice header decoding part decodes, from a slice header, a skip slice flag indicating whether a target slice is a skip slice or a non-skip slice. When the skip slice flag indicates that the target slice is a skip slice, the slice header decoding part decodes the number of skip CTUs, which is the number of CTUs included in the target slice, and the skip CTU generation part generates decoded images of as many CTUs as that number indicates, thereby generating a decoded image of the target slice. When the skip slice flag indicates that the target slice is a non-skip slice, the non-skip CTU decoding part decodes the CTUs included in the target slice, thereby generating a reproduced image of the target slice.

Description

The present invention relates to an image decoding device that decodes hierarchically encoded data, in which an image is encoded hierarchically, and to an image encoding device that generates hierarchically encoded data by encoding an image hierarchically.

Images and moving images are among the kinds of information transmitted in communication systems or recorded in storage devices. Techniques for encoding such images (hereinafter including moving images) for transmission and storage are known.

Known moving image coding schemes include AVC (H.264/MPEG-4 Advanced Video Coding) and its successor codec HEVC (High-Efficiency Video Coding) (Non-Patent Document 1).

In these moving image coding schemes, a predicted image is normally generated based on a locally decoded image obtained by encoding and decoding the input image, and the prediction residual obtained by subtracting the predicted image from the input image (original image) is encoded. The prediction residual is sometimes called a "difference image" or "residual image". Methods for generating a predicted image include inter-picture prediction (inter prediction) and intra-picture prediction (intra prediction).
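
The subtraction described above can be illustrated with a minimal sketch. It assumes a lossless round trip (a real codec would transform and quantize the residual before encoding it), and the function names and sample values are hypothetical, not taken from the specification.

```python
def prediction_residual(original, predicted):
    """Return the per-pixel residual: original image minus predicted image."""
    return [[o - p for o, p in zip(orow, prow)]
            for orow, prow in zip(original, predicted)]


def reconstruct(predicted, residual):
    """Add the residual back onto the predicted image (decoder side)."""
    return [[p + r for p, r in zip(prow, rrow)]
            for prow, rrow in zip(predicted, residual)]


# Illustrative 2x2 luma blocks (values are made up for the example).
original = [[120, 121], [119, 118]]
predicted = [[118, 121], [120, 118]]
residual = prediction_residual(original, predicted)
```

Because the prediction is close to the original, most residual samples are near zero, which is what makes the residual cheaper to encode than the image itself.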

In intra prediction, predicted images within a picture are generated sequentially based on the locally decoded image of the same picture.

In inter prediction, a predicted image is generated by motion compensation between pictures. A decoded picture used to generate a predicted image in inter prediction is called a reference picture.

Also known is a technique that generates encoded data from a plurality of mutually related moving images by dividing them into layers (hierarchies) and encoding them; this is called hierarchical coding. Encoded data generated by hierarchical coding is also called hierarchically encoded data.

SHVC (Scalable HEVC), which is based on HEVC, is known as a representative hierarchical coding technique (Non-Patent Document 2).

SHVC supports spatial scalability, temporal scalability, and SNR scalability. In the case of spatial scalability, for example, a plurality of moving images with different resolutions are divided into layers and encoded to generate hierarchically encoded data. For example, an image downsampled from the original image to a desired resolution is encoded as a lower layer. The original image is then encoded as an upper layer after inter-layer prediction is applied to remove redundancy between layers.

MV-HEVC (Multi View HEVC), which is also based on HEVC, is known as another representative hierarchical coding technique (Non-Patent Document 3).

MV-HEVC supports view scalability. In view scalability, moving images corresponding to a plurality of different viewpoints (views) are divided into layers and encoded to generate hierarchically encoded data. For example, the moving image corresponding to the basic viewpoint (base view) is encoded as a lower layer. Moving images corresponding to other viewpoints are then encoded as upper layers after inter-layer prediction is applied.

Inter-layer prediction in SHVC and MV-HEVC includes inter-layer image prediction and inter-layer motion prediction. Inter-layer image prediction generates a predicted image using a decoded image of a lower layer. Inter-layer motion prediction derives predicted values of motion information using the motion information of a lower layer. A picture used for prediction in inter-layer prediction is called an inter-layer reference picture, and a layer that contains an inter-layer reference picture is called a reference layer. In the following, reference pictures used for inter prediction and reference pictures used for inter-layer prediction are collectively referred to simply as reference pictures.

In SHVC and MV-HEVC, a predicted image can be generated using any of inter prediction, intra prediction, and inter-layer image prediction.

One application of SHVC and MV-HEVC is a video application that takes a region of interest into account. For example, a video playback terminal normally plays back video of the entire area at a relatively low resolution. When the viewer designates part of the displayed video as a region of interest, that region is displayed on the playback terminal at high resolution.

Such a region-of-interest video application can be realized with hierarchically encoded data in which relatively low-resolution video of the entire area is encoded as lower-layer encoded data and high-resolution video of the region of interest is encoded as upper-layer encoded data. That is, when the entire area is played back, only the lower-layer encoded data is decoded and played back; when the high-resolution video of the region of interest is played back, the upper-layer encoded data is transmitted in addition to the lower-layer encoded data. The application can thus be realized with less transmission bandwidth than when encoded data for the low-resolution video and encoded data for the high-resolution video are both sent.

"Recommendation H.265 (04/13)", ITU-T (published June 7, 2013)
JCT3V-E1004_v6, "MV-HEVC Draft Text 5", Joint Collaborative Team on 3D Video Coding Extension Development of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 5th Meeting: Vienna, AT, 27 Jul. - 2 Aug. 2013 (published August 7, 2013)
JCTVC-N1008_v1, "SHVC Draft 3", Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 14th Meeting: Vienna, AT, 25 July - 2 Aug. 2013 (published August 20, 2013)

However, when upper-layer encoded data corresponding to high-resolution video of the entire area is transmitted regardless of the region of interest, the code amount increases greatly compared with transmitting only the lower-layer encoded data.

In addition, generating upper-layer encoded data that contains only the region of interest requires a large amount of processing. For example, when a different region of interest is designated for each user, different upper-layer encoded data must be generated for each user. If generating such data requires a large amount of processing, it is difficult to generate and transmit upper-layer encoded data corresponding to the region of interest for a large number of users.

The present invention has been made in view of the above problems. Its object is to realize an image encoding device and an image decoding device that, in a hierarchical coding scheme, can encode and decode both upper-layer encoded data corresponding to the entire area and upper-layer encoded data corresponding to a region of interest. A further object is to realize a data structure of encoded data from which upper-layer encoded data corresponding to a region of interest can be generated from upper-layer encoded data corresponding to the entire area without generating a decoded image, and an encoded data conversion device that generates the upper-layer encoded data corresponding to the region of interest from that upper-layer encoded data.

To solve the above problems, an image decoding device according to the present invention decodes upper-layer encoded data included in hierarchically encoded data and restores an upper-layer decoded picture. The device includes a slice header decoding unit that decodes an upper-layer slice header, a non-skip CTU decoding unit that decodes, based on slice data, decoded images of CTUs belonging to upper-layer non-skip slices, and a skip CTU generation unit that generates decoded images of CTUs belonging to upper-layer skip slices. The slice header decoding unit decodes or infers a skip slice flag indicating whether the target slice is a skip slice or a non-skip slice. When the skip slice flag indicates that the target slice is a skip slice, the slice header decoding unit decodes a skip CTU count, which is the number of CTUs included in the target slice, and the skip CTU generation unit generates decoded images of the number of CTUs indicated by the skip CTU count, thereby generating the decoded image of the target slice. When the skip slice flag indicates that the target slice is a non-skip slice, the non-skip CTU decoding unit decodes the CTUs included in the target slice, thereby generating the decoded image of the target slice.
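
The branching described above can be sketched as follows. This is an illustration under stated assumptions, not the specification's implementation: `SliceHeader`, `decode_slice`, and the two callbacks are hypothetical names, and how a skip CTU's pixels are actually produced is left abstract because this passage does not define it.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence


@dataclass
class SliceHeader:
    """Hypothetical container for the two header fields discussed above."""
    skip_slice_flag: bool   # True: skip slice, False: non-skip slice
    num_skip_ctus: int = 0  # only meaningful when skip_slice_flag is True


def decode_slice(
    header: SliceHeader,
    slice_data: Sequence[Any],
    decode_ctu: Callable[[Any], Any],         # stands in for the non-skip CTU decoding unit
    generate_skip_ctu: Callable[[int], Any],  # stands in for the skip CTU generation unit
) -> List[Any]:
    """Return the decoded CTUs of one upper-layer slice."""
    if header.skip_slice_flag:
        # Skip slice: no slice data is consumed; the CTU count comes from the header.
        return [generate_skip_ctu(i) for i in range(header.num_skip_ctus)]
    # Non-skip slice: every CTU is decoded from its slice data.
    return [decode_ctu(payload) for payload in slice_data]
```

A skip slice therefore needs only its header, which is why it costs less code amount and less processing than a non-skip slice.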

Preferably, the image decoding device further includes a parameter set decoding unit that decodes display area information, and the display area designated by the display area information contains no tile that includes a skip slice and contains tiles whose slices are all non-skip slices.

Preferably, in the image decoding device, the skip CTU count is decoded from the slice header by applying a decoding process for a variable-length code whose maximum value is the number of CTUs included in the tile to which the target slice belongs.

Preferably, in the image decoding device, when the picture containing the target slice contains one tile, the skip slice flag decoded or inferred by the slice header decoding unit indicates that the target slice is a non-skip slice.

Preferably, the image decoding device further includes a parameter set decoding unit that decodes an all-tile dependency identifier included in a parameter set, and when the all-tile dependency identifier indicates that there is no motion dependency other than between corresponding tiles, or that there is no inter-layer dependency other than between corresponding tiles, the skip slice flag decoded or inferred by the slice header decoding unit indicates that the target slice is a non-skip slice.

Preferably, in the image decoding device, when the picture containing the target slice contains two or more tiles, the slice header decoding unit decodes the skip slice flag, and when the picture containing the target slice contains one tile, the slice header decoding unit infers, as the value of the skip slice flag, a value indicating that the target slice is a non-skip slice.

Preferably, in the image decoding device, when the picture containing the target slice contains two or more tiles, the slice header decoding unit selects either the non-skip CTU decoding unit or the skip CTU generation unit according to the value of the skip slice flag to decode the CTUs included in the target slice, and when the picture containing the target slice contains one tile, the slice header decoding unit selects the non-skip CTU decoding unit to decode the CTUs included in the target slice.
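
The decode-or-infer rule and the resulting path selection for single-tile and multi-tile pictures can be sketched as below. The function names and the `read_flag` callback (standing in for reading one flag from the slice header) are assumptions made for illustration.

```python
from typing import Callable


def parse_skip_slice_flag(num_tiles_in_picture: int, read_flag: Callable[[], int]) -> bool:
    """Decode the flag when the picture has two or more tiles; otherwise infer it."""
    if num_tiles_in_picture >= 2:
        return bool(read_flag())  # flag is present in the slice header
    return False                  # single tile: inferred to be a non-skip slice


def select_ctu_path(num_tiles_in_picture: int, skip_slice_flag: bool) -> str:
    """Choose which unit handles the CTUs of the target slice."""
    if num_tiles_in_picture >= 2 and skip_slice_flag:
        return "skip_ctu_generation"
    return "non_skip_ctu_decoding"
```

With a single tile the flag is never read from the bitstream, so a slice header in a single-tile picture carries no extra bits for it.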

Preferably, in the image decoding device, the all-tile dependency identifier includes both information indicating whether there is motion dependency other than between corresponding tiles and information indicating whether there is inter-layer dependency other than between corresponding tiles.

Preferably, the image decoding device further includes a parameter set decoding unit that decodes an all-tile dependency identifier included in a parameter set, and when the picture containing the target slice contains two or more tiles and the all-tile dependency identifier indicates that there is no motion dependency other than between corresponding tiles, or that there is no inter-layer dependency other than between corresponding tiles, the skip slice flag decoded or inferred by the slice header decoding unit indicates that the target slice is a non-skip slice.

To solve the above problems, an image encoding device according to the present invention generates upper-layer encoded data from an input image. The device includes a slice header encoding unit that encodes an upper-layer slice header, a non-skip CTU encoding unit that encodes, based on slice data, decoded images of CTUs belonging to upper-layer non-skip slices, a skip CTU generation unit that generates decoded images of CTUs belonging to upper-layer skip slices, and a parameter set encoding unit that encodes display area information. The slice header encoding unit encodes a skip slice flag indicating whether the target slice is a skip slice or a non-skip slice. When the skip slice flag indicates that the target slice is a skip slice, the slice header encoding unit encodes a skip CTU count, which is the number of CTUs included in the target slice, and the skip CTU generation unit generates decoded images of the number of CTUs indicated by the skip CTU count, thereby generating the decoded image of the target slice. When the skip slice flag indicates that the target slice is a non-skip slice, the non-skip CTU encoding unit generates the CTUs included in the target slice, thereby generating the decoded image of the target slice. The display area designated by the display area information contains no tile that includes a skip slice and contains tiles whose slices are all non-skip slices.

To solve the above problems, an encoded data conversion device according to the present invention is a hierarchically encoded data conversion device that converts input hierarchically encoded data based on input region-of-interest information and outputs the converted hierarchically encoded data. The device includes a slice modification unit that modifies upper-layer encoded data based on the region-of-interest information. The slice modification unit modifies the upper-layer encoded data by changing slices that are included in the upper-layer encoded data but not included in the region of interest indicated by the region-of-interest information from non-skip slices to skip slices.

Preferably, in the encoded data conversion device, the slice modification unit modifies the upper-layer encoded data by changing slices that are included in the upper-layer encoded data and belong to tiles not included in the region of interest from non-skip slices to skip slices.
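
A minimal sketch of this tile-based slice modification follows. It assumes that a tile counts as "included in the region of interest" when its rectangle overlaps the region at all, which this passage does not spell out, and all names (`Slice`, `overlaps`, `convert_slices`) and the rectangle layout are hypothetical. The point it illustrates is that the conversion rewrites slice-level data only and never produces a decoded image.

```python
from dataclasses import dataclass, replace
from typing import List, Sequence, Tuple

Rect = Tuple[int, int, int, int]  # (x0, y0, x1, y1), right and bottom exclusive


@dataclass(frozen=True)
class Slice:
    tile_id: int
    num_ctus: int          # kept so a skip slice can still signal its CTU count
    skip: bool = False
    payload: bytes = b""   # slice data; empty for skip slices


def overlaps(a: Rect, b: Rect) -> bool:
    """True when rectangles a and b share at least one pixel."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def convert_slices(slices: Sequence[Slice], tile_rects: Sequence[Rect], roi: Rect) -> List[Slice]:
    """Turn slices of tiles outside the region of interest into skip slices."""
    converted = []
    for s in slices:
        if overlaps(tile_rects[s.tile_id], roi):
            converted.append(s)  # inside the region: left untouched
        else:
            converted.append(replace(s, skip=True, payload=b""))  # slice data dropped
    return converted
```

Since each user's region of interest only changes which slices are marked as skip, the same stored upper-layer data can be converted per user without re-encoding.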

Preferably, the encoded data conversion device further includes a parameter set modification unit, which modifies the parameter set by rewriting the display area information included in the parameter set so that the display area it indicates matches the region of interest indicated by the region-of-interest information.

To solve the above problems, a region-of-interest display system according to the present invention displays an entire picture and a partial region corresponding to a region of interest of the picture using stored hierarchically encoded data. The system includes a region-of-interest notification unit that supplies region-of-interest information, a hierarchically encoded data conversion unit that converts the hierarchically encoded data based on the region-of-interest information to generate converted hierarchically encoded data, a hierarchical video decoding unit that decodes the converted hierarchically encoded data and outputs an upper-layer decoded picture and a lower-layer decoded picture, and a display control unit that outputs the lower-layer decoded picture as an entire display image and outputs the upper-layer decoded picture as a region-of-interest display image.

Preferably, in the region-of-interest display system, the display control unit does not output a region-of-interest display image when the region-of-interest information indicates that no region of interest is designated.

Preferably, in the region-of-interest display system, when the region-of-interest information changes, the display control unit outputs, from the time the change is detected until the hierarchical video decoding unit supplies an upper-layer decoded picture for the changed region-of-interest information, the partial region of the lower-layer decoded picture indicated by the changed region-of-interest information as the region-of-interest display image.
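
The display control behaviour, including the fallback to the lower layer while the upper layer catches up, can be sketched as follows. This is an illustration only: the function names are hypothetical, pictures are plain nested lists, the region is assumed to be given in lower-layer pixel coordinates, and scaling the cropped region up to display size is omitted.

```python
from typing import List, Optional, Tuple

Rect = Tuple[int, int, int, int]  # (x0, y0, x1, y1), right and bottom exclusive


def crop(picture: List[list], region: Rect) -> List[list]:
    """Cut the given region out of a picture stored as rows of samples."""
    x0, y0, x1, y1 = region
    return [row[x0:x1] for row in picture[y0:y1]]


def roi_display_image(roi: Optional[Rect], lower_picture, upper_picture, upper_picture_roi):
    """Pick what to show as the region-of-interest display image."""
    if roi is None:
        return None  # no region designated: output nothing
    if upper_picture is not None and upper_picture_roi == roi:
        return upper_picture  # upper layer already matches the current region
    # Region changed and the matching upper-layer picture has not arrived yet:
    # fall back to the corresponding part of the lower-layer picture.
    return crop(lower_picture, roi)
```

The fallback means the viewer sees the newly selected region immediately at low resolution, and the image sharpens once the decoder delivers the matching upper-layer picture.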

When the value of the skip slice flag included in an upper-layer slice header indicates that the target slice is a skip slice, the image decoding device according to the present invention can generate the decoded image of the target slice with the skip CTU generation unit, without using slice data. Because the skip CTU generation unit generates the decoded image without slice data, the image quality is lower than when the non-skip CTU decoding unit decodes the image with reference to slice data, but the decoded image is generated with less processing and from less encoded data (slice data). Therefore, when decoding hierarchically encoded data in which slices inside the region of interest are encoded as non-skip slices and slices outside the region of interest are encoded as skip slices, the image decoding device according to the present invention can generate a decoded image with little processing from encoded data with a small code amount, and output it as a decoded picture, without impairing the quality of the decoded image inside the region of interest.

A flow diagram of the process by which the skip slice determination unit of the hierarchical video decoding device according to an embodiment of the present invention determines whether a target slice is a skip slice.
A diagram illustrating the layer structure of hierarchically encoded data according to the embodiment; (a) shows the hierarchical video encoding device side, and (b) shows the hierarchical video decoding device side.
A diagram illustrating the structure of hierarchically encoded data according to the embodiment; (a) shows the sequence layer defining a sequence SEQ, (b) shows the picture layer defining a picture PICT, (c) shows the slice layer defining a slice S, (d) shows the CTU layer defining a coding tree unit CTU, and (e) shows the CU layer defining a coding unit (CU) included in the coding tree unit CTU.
A diagram illustrating the relationship between a picture and tiles and slices in hierarchically encoded data according to the embodiment; (a) illustrates the divided regions when a picture is divided into tiles and slices, and (b) illustrates the relationship between tiles and slices in the structure of the encoded data.
A functional block diagram showing the schematic configuration of the hierarchical video decoding device.
A functional block diagram illustrating the configuration of the base decoding unit included in the hierarchical video decoding device.
Part of the syntax table referred to when decoding the PPS, relating to tile information.
A diagram illustrating tile rows and tile columns when a picture is divided into tiles.
Part of the syntax table referred to when decoding the SPS extension included in the SPS, relating to tile information.
A table showing the correspondence between values of the all-layer dependency identifier and motion dependency and inter-layer dependency.
A functional block diagram illustrating the configuration of the slice decoding unit included in the hierarchical video decoding device.
Part of the syntax table referred to when decoding the slice header, relating to slice position information.
Part of the syntax table referred to when decoding the slice header, relating to skip slice information.
A flow diagram of the process of determining whether a target slice is a skip slice.
A flow diagram showing the procedure of the slice decoding process.
Part of the syntax table referred to when decoding the SPS, relating to display area information.
A diagram illustrating the relationship between a display area, which is a partial region within a picture, and display area position information.
A functional block diagram showing the schematic configuration of a hierarchical video encoding device according to an embodiment of the present invention.
A functional block diagram illustrating the configuration of the slice encoding unit included in the hierarchical video encoding device.
A functional block diagram showing the schematic configuration of a hierarchically encoded data conversion device according to an embodiment of the present invention.
A block diagram showing the configuration of a region-of-interest display system realized by combining the hierarchical video decoding device, the hierarchical video encoding device, and the hierarchically encoded data conversion device.
A diagram showing the configurations of a transmitting device equipped with the hierarchical video encoding device and a receiving device equipped with the hierarchical video decoding device; (a) shows the transmitting device, and (b) shows the receiving device.
A diagram showing the configurations of a recording device equipped with the hierarchical video encoding device and a playback device equipped with the hierarchical video decoding device; (a) shows the recording device, and (b) shows the playback device.

The hierarchical video decoding device 1, the hierarchical video encoding device 2, and the encoded data conversion device 3 according to an embodiment of the present invention are described below with reference to FIGS. 1 to 23.

[Overview]
A hierarchical video decoding device (image decoding device) 1 according to the present embodiment decodes encoded data that has been hierarchically encoded by a hierarchical video encoding device (image encoding device) 2. Hierarchical coding is a coding scheme that encodes a moving image hierarchically, from low quality to high quality. Hierarchical coding is standardized in, for example, SVC and SHVC. The quality of a moving image here refers broadly to any factor that affects how the moving image looks, subjectively or objectively. It includes, for example, "resolution", "frame rate", "image quality", and "pixel representation accuracy". Accordingly, when moving images are said below to differ in quality, this typically means that they differ in, for example, "resolution", but it is not limited to this. For example, moving images quantized with different quantization steps (that is, moving images encoded with different amounts of coding noise) can also be said to differ in quality.

From the viewpoint of the type of information that is layered, hierarchical coding techniques are sometimes classified into (1) spatial scalability, (2) temporal scalability, (3) SNR (Signal to Noise Ratio) scalability, and (4) view scalability. Spatial scalability is a technique for layering in terms of resolution or image size. Temporal scalability is a technique for layering in terms of frame rate (the number of frames per unit time). SNR scalability is a technique for layering in terms of coding noise. View scalability is a technique for layering in terms of the viewpoint position associated with each image.

The encoded data conversion device 3 according to the present embodiment converts encoded data that has been hierarchically encoded by the hierarchical video encoding device 2 and generates encoded data for a predetermined region of interest (region-of-interest encoded data). The region-of-interest encoded data can be decoded by the hierarchical video decoding device 1 according to the present embodiment.

Before describing the hierarchical video encoding device 2, the hierarchical video decoding device 1, and the hierarchically encoded data conversion device 3 according to the present embodiment in detail, the following first describes (1) the layer structure of the hierarchically encoded data that is generated by the hierarchical video encoding device 2 or the hierarchically encoded data conversion device 3 and decoded by the hierarchical video decoding device 1, and then (2) specific examples of the data structures that can be adopted in each layer.

[Layer structure of hierarchically encoded data]
Encoding and decoding of hierarchically encoded data are described here with reference to FIG. 2. FIG. 2 schematically shows a case in which a moving image is hierarchically encoded and decoded in three layers: a lower layer L3, a middle layer L2, and an upper layer L1. That is, in the example shown in FIGS. 2(a) and 2(b), of the three layers, the upper layer L1 is the highest layer and the lower layer L3 is the lowest layer.

In the following, a decoded image of a specific quality that can be decoded from hierarchically encoded data is referred to as the decoded image of a specific layer (or the decoded image corresponding to a specific layer), for example, the decoded image POUT#A of the upper layer L1.

FIG. 2(a) shows hierarchical video encoding devices 2#A to 2#C, which hierarchically encode input images PIN#A to PIN#C, respectively, to generate encoded data DATA#A to DATA#C. FIG. 2(b) shows hierarchical video decoding devices 1#A to 1#C, which decode the hierarchically encoded data DATA#A to DATA#C, respectively, to generate decoded images POUT#A to POUT#C.

First, the encoding device side is described with reference to FIG. 2(a). The input images PIN#A, PIN#B, and PIN#C supplied to the encoding device side come from the same original image but differ in image quality (resolution, frame rate, picture quality, and so on). The image quality decreases in the order PIN#A, PIN#B, PIN#C.

The hierarchical video encoding device 2#C of the lower layer L3 encodes the input image PIN#C of the lower layer L3 to generate the encoded data DATA#C of the lower layer L3. DATA#C contains the basic information needed to decode the decoded image POUT#C of the lower layer L3 (indicated by "C" in FIG. 2). Because the lower layer L3 is the lowest layer, the encoded data DATA#C of the lower layer L3 is also referred to as basic encoded data.

The hierarchical video encoding device 2#B of the middle layer L2 encodes the input image PIN#B of the middle layer L2 while referring to the encoded data DATA#C of the lower layer, and generates the encoded data DATA#B of the middle layer L2. In addition to the basic information "C" contained in the encoded data DATA#C, the encoded data DATA#B of the middle layer L2 contains the additional information needed to decode the decoded image POUT#B of the middle layer (indicated by "B" in FIG. 2).

The hierarchical video encoding device 2#A of the upper layer L1 encodes the input image PIN#A of the upper layer L1 while referring to the encoded data DATA#B of the middle layer L2, and generates the encoded data DATA#A of the upper layer L1. In addition to the basic information "C" needed to decode the decoded image POUT#C of the lower layer L3 and the additional information "B" needed to decode the decoded image POUT#B of the middle layer L2, the encoded data DATA#A of the upper layer L1 contains the additional information needed to decode the decoded image POUT#A of the upper layer (indicated by "A" in FIG. 2).

Thus, the encoded data DATA#A of the upper layer L1 contains information about decoded images of a plurality of different qualities.

Next, the decoding device side is described with reference to FIG. 2(b). On the decoding device side, the decoding devices 1#A, 1#B, and 1#C, which correspond to the upper layer L1, the middle layer L2, and the lower layer L3, respectively, decode the encoded data DATA#A, DATA#B, and DATA#C and output the decoded images POUT#A, POUT#B, and POUT#C.

It is also possible to reproduce a moving image of a specific quality by extracting part of the information in higher-layer hierarchically encoded data and decoding the extracted information in a specific lower-layer decoding device.

For example, the hierarchical decoding device 1#B of the middle layer L2 may extract, from the hierarchically encoded data DATA#A of the upper layer L1, the information needed to decode the decoded image POUT#B (that is, "B" and "C" contained in DATA#A) and decode the decoded image POUT#B. In other words, on the decoding device side, the decoded images POUT#A, POUT#B, and POUT#C can all be decoded from the information contained in the hierarchically encoded data DATA#A of the upper layer L1.
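
The extraction described above can be sketched as follows. This is a minimal illustration only, assuming that DATA#A is modeled as a list of (label, payload) pairs tagged with the "A", "B", and "C" labels of FIG. 2; the names `REQUIRED` and `extract_for_layer` and the payload bytes are hypothetical and do not appear in the embodiment.

```python
# Hypothetical model: each chunk of DATA#A carries the label used in FIG. 2.
# Labels each layer needs in the three-layer example (L3 lowest, L1 highest).
REQUIRED = {
    "L3": {"C"},
    "L2": {"C", "B"},
    "L1": {"C", "B", "A"},
}


def extract_for_layer(data, target_layer):
    """Keep only the chunks whose label is needed to decode target_layer."""
    needed = REQUIRED[target_layer]
    return [(label, payload) for label, payload in data if label in needed]


# DATA#A of the upper layer L1 contains "C", "B", and "A".
data_a = [("C", b"base"), ("B", b"mid"), ("A", b"top")]

# The middle-layer decoder 1#B only needs "C" and "B".
data_for_l2 = extract_for_layer(data_a, "L2")
```

Filtering by label keeps the original chunk order, so a decoder that expects lower-layer data first is not disturbed by the extraction.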

The hierarchically encoded data is not limited to the three layers described above. It may be hierarchically encoded in two layers, or in more than three layers.

The hierarchically encoded data may also be configured so that part or all of the encoded data for the decoded image of a specific layer is encoded independently of the other layers, so that decoding the specific layer does not require reference to information from other layers. For example, in the example described above with reference to FIGS. 2(a) and 2(b), "C" and "B" are referred to in order to decode the decoded image POUT#B, but this is not the only option. The hierarchically encoded data can also be configured so that the decoded image POUT#B can be decoded using "B" alone. For example, a hierarchical video decoding device can be configured that decodes the decoded image POUT#B from two inputs: hierarchically encoded data consisting only of "B", and the decoded image POUT#C.

When SNR scalability is to be realized, the same original image can be used as the input images PIN#A, PIN#B, and PIN#C, and the hierarchically encoded data can be generated so that the decoded images POUT#A, POUT#B, and POUT#C differ in picture quality. In that case, the hierarchical video encoding device of a lower layer generates the hierarchically encoded data by quantizing the prediction residual with a larger quantization width than the hierarchical video encoding device of a higher layer.
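
As a rough illustration of the effect of the quantization width, the sketch below quantizes the same prediction residual with two different step sizes. It assumes plain uniform scalar quantization; the functions `quantize`, `dequantize`, and `abs_error`, the residual samples, and the step values 8 and 2 are all illustrative assumptions, not values taken from the embodiment.

```python
def quantize(residual, step):
    """Uniform quantization: map each residual sample to an integer level."""
    return [round(r / step) for r in residual]


def dequantize(levels, step):
    """Reconstruct residual samples from quantization levels."""
    return [q * step for q in levels]


def abs_error(original, reconstructed):
    """Sum of absolute differences between original and reconstruction."""
    return sum(abs(o - r) for o, r in zip(original, reconstructed))


residual = [7, -3, 12, 1]  # the same residual is fed to both layers

# Lower layer: larger quantization width, coarser reconstruction.
coarse = dequantize(quantize(residual, 8), 8)
# Upper layer: smaller quantization width, finer reconstruction.
fine = dequantize(quantize(residual, 2), 2)

coarse_error = abs_error(residual, coarse)
fine_error = abs_error(residual, fine)
```

With these sample values the larger step leaves a larger reconstruction error, which corresponds to the lower picture quality of the lower layer.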

For convenience of explanation, this document defines the following terms. Unless otherwise noted, the terms below are used to denote the technical matters described.

Upper layer: A layer positioned above a given layer is referred to as an upper layer. For example, in FIG. 2, the upper layers of the lower layer L3 are the middle layer L2 and the upper layer L1. A decoded image of an upper layer is a decoded image of higher quality (for example, higher resolution, higher frame rate, or higher picture quality).

Lower layer: A layer positioned below a given layer is referred to as a lower layer. For example, in FIG. 2, the lower layers of the upper layer L1 are the middle layer L2 and the lower layer L3. A decoded image of a lower layer is a decoded image of lower quality.

Target layer: The layer that is currently the target of decoding or encoding.

Reference layer: A specific lower layer that is referred to in order to decode the decoded image corresponding to the target layer is referred to as a reference layer.

In the example shown in FIGS. 2(a) and 2(b), the reference layers of the upper layer L1 are the middle layer L2 and the lower layer L3. However, this is not the only option. The hierarchically encoded data can also be configured so that decoding a specific upper layer does not require reference to all of its lower layers. For example, the hierarchically encoded data can be configured so that the reference layer of the upper layer L1 is only one of the middle layer L2 and the lower layer L3.
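
The relationship between a target layer and its reference layers can be sketched as a small dependency lookup. This is a minimal sketch under the assumption that the direct reference layers of each layer are given as a dictionary; the function name `layers_needed` and the two example tables are hypothetical.

```python
def layers_needed(target, reference_layers):
    """Collect the target layer and every layer it depends on, directly or
    indirectly, by following the reference-layer relation."""
    needed = []
    stack = [target]
    while stack:
        layer = stack.pop()
        if layer not in needed:
            needed.append(layer)
            stack.extend(reference_layers.get(layer, []))
    return needed


# Configuration of FIG. 2: L1 refers to L2, which in turn refers to L3.
chained = {"L1": ["L2"], "L2": ["L3"], "L3": []}

# Alternative configuration: L1 refers to L3 only and never needs L2.
direct = {"L1": ["L3"], "L2": ["L3"], "L3": []}

needed_chained = layers_needed("L1", chained)
needed_direct = layers_needed("L1", direct)
```

In the second configuration the middle layer L2 is not collected for L1, which matches the case in which not all lower layers are reference layers.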

Base layer: The layer positioned at the very bottom is referred to as the base layer. The decoded image of the base layer is the lowest-quality decoded image that can be decoded from the encoded data, and is called the basic decoded image. In other words, the basic decoded image is the decoded image corresponding to the lowest layer. The partial encoded data of the hierarchically encoded data that is needed to decode the basic decoded image is called the basic encoded data. For example, the basic information "C" contained in the hierarchically encoded data DATA#A of the upper layer L1 is the basic encoded data.

Enhancement layer: An upper layer of the base layer is referred to as an enhancement layer.

Layer identifier: A layer identifier identifies a layer and corresponds one-to-one with a layer. The hierarchically encoded data contains layer identifiers, which are used to select the partial encoded data needed to decode the decoded image of a specific layer. The subset of the hierarchically encoded data associated with the layer identifier of a specific layer is also called a layer representation.

In general, decoding the decoded image of a specific layer uses the layer representation of that layer and/or the layer representations of its lower layers. That is, decoding the decoded image of the target layer uses the layer representation of the target layer and/or the layer representations of one or more layers among the lower layers of the target layer.

Inter-layer prediction: Inter-layer prediction means predicting the syntax element values of the target layer, the coding parameters used to decode the target layer, and the like, based on the syntax element values contained in the layer representation of a layer other than the target layer (a reference layer), on values derived from those syntax element values, and on decoded images. Inter-layer prediction that predicts motion-prediction information from reference layer information is sometimes called motion information prediction. Inter-layer prediction that predicts from the decoded image of a lower layer is sometimes called inter-layer image prediction (or inter-layer texture prediction). The layer used for inter-layer prediction is, for example, a lower layer of the target layer. Prediction performed within the target layer without using a reference layer is sometimes called intra-layer prediction.

The above terms are used purely for convenience of explanation, and the technical matters described may be expressed using other terms.

[Data structure of hierarchically encoded data]
The following takes as an example the case in which HEVC and its extensions are used as the coding scheme that generates the encoded data of each layer. However, this is not the only option, and the encoded data of each layer may be generated with a coding scheme such as MPEG-2 or H.264/AVC.

The lower layer and the upper layer may be encoded with different coding schemes. The encoded data of the individual layers may be supplied to the hierarchical video decoding device 1 over different transmission paths or over the same transmission path.

For example, when ultra-high-definition video (moving image, 4K video data) is scalably encoded into a base layer and one enhancement layer and transmitted, the base layer may carry video data obtained by downscaling and interlacing the 4K video data, encoded with MPEG-2 or H.264/AVC and transmitted over a television broadcast network, while the enhancement layer may carry the 4K video (progressive), encoded with HEVC and transmitted over the Internet.

(Base layer)
FIG. 3 illustrates the data structure of encoded data that can be adopted in the base layer (the hierarchically encoded data DATA#C in the example of FIG. 2). The hierarchically encoded data DATA#C includes, by way of example, a sequence and a plurality of pictures that make up the sequence.

FIG. 3 shows the hierarchical structure of the data in the hierarchically encoded data DATA#C. Parts (a) to (e) of FIG. 3 respectively show the sequence layer that defines a sequence SEQ, the picture layer that defines a picture PICT, the slice layer that defines a slice S, the CTU layer that defines a coding tree unit (Coding Tree Unit; CTU), and the CU layer that defines a coding unit (Coding Unit; CU) included in the coding tree unit CTU.

(Sequence layer)
The sequence layer defines the set of data that the hierarchical video decoding device 1 refers to in order to decode the sequence SEQ to be processed (hereinafter also referred to as the target sequence). As shown in FIG. 3(a), the sequence SEQ includes a video parameter set VPS (Video Parameter Set), a sequence parameter set SPS (Sequence Parameter Set), a picture parameter set PPS (Picture Parameter Set), pictures PICT_1 to PICT_NP (NP is the total number of pictures included in the sequence SEQ), and supplemental enhancement information SEI (Supplemental Enhancement Information).

The video parameter set VPS defines the number of layers included in the encoded data and the dependencies between layers.

The sequence parameter set SPS defines the set of coding parameters that the hierarchical video decoding device 1 refers to in order to decode the target sequence. A plurality of SPSs may exist in the encoded data. In that case, the SPS used for decoding is selected from the plurality of candidates for each target sequence. The SPS used to decode a specific sequence is also called the active SPS. In the following, unless otherwise noted, SPS means the active SPS for the target sequence.

The picture parameter set PPS defines the set of coding parameters that the hierarchical video decoding device 1 refers to in order to decode each picture in the target sequence. A plurality of PPSs may exist in the encoded data. In that case, one of the plurality of PPSs is selected for each picture in the target sequence. The PPS used to decode a specific picture is also called the active PPS. In the following, unless otherwise noted, PPS means the active PPS for the target picture.

The active SPS and the active PPS may be set to a different SPS and PPS for each layer.

(Picture layer)
The picture layer defines the set of data that the hierarchical video decoding device 1 refers to in order to decode the picture PICT to be processed (hereinafter also referred to as the target picture). As shown in FIG. 3(b), the picture PICT includes slice headers SH_1 to SH_NS and slices S_1 to S_NS (NS is the total number of slices included in the picture PICT).

In the following, when there is no need to distinguish the individual slice headers SH_1 to SH_NS or slices S_1 to S_NS, the subscripts may be omitted. The same applies to the other subscripted data contained in the hierarchically encoded data DATA#C described below.

The slice header SH_k contains a group of coding parameters that the hierarchical video decoding device 1 refers to in order to determine how to decode the corresponding slice S_k. For example, it contains an SPS identifier (seq_parameter_set_id) that designates an SPS and a PPS identifier (pic_parameter_set_id) that designates a PPS. Slice type designation information (slice_type), which designates the slice type, is another example of a coding parameter contained in the slice header SH.
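
One way to picture how the identifiers in the slice header select the active parameter sets is the lookup below. It is only a sketch: the identifiers `seq_parameter_set_id` and `pic_parameter_set_id` come from the text above, while the dictionary-based tables, the `name` field, and the function `activate_parameter_sets` are assumptions made for illustration.

```python
def activate_parameter_sets(slice_header, sps_table, pps_table):
    """Look up the SPS and PPS designated by the identifiers in a slice header."""
    sps = sps_table[slice_header["seq_parameter_set_id"]]
    pps = pps_table[slice_header["pic_parameter_set_id"]]
    return sps, pps


# Several SPS and PPS candidates may exist in the encoded data.
sps_table = {0: {"name": "SPS0"}, 1: {"name": "SPS1"}}
pps_table = {0: {"name": "PPS0"}, 3: {"name": "PPS3"}}

# A slice header carrying the identifiers and the slice type.
slice_header = {
    "seq_parameter_set_id": 1,
    "pic_parameter_set_id": 3,
    "slice_type": "P",
}

active_sps, active_pps = activate_parameter_sets(slice_header, sps_table, pps_table)
```

Keeping the parameter sets in tables keyed by identifier lets different slices, or different layers, activate different SPSs and PPSs without repeating the parameters themselves.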

The slice types that can be designated by the slice type designation information include (1) I slices, which use only intra prediction for encoding, (2) P slices, which use unidirectional prediction or intra prediction for encoding, and (3) B slices, which use unidirectional prediction, bidirectional prediction, or intra prediction for encoding.
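
The three slice types listed above can be summarized as a table of permitted prediction methods. The sketch below is illustrative only; the mode labels "intra", "uni", and "bi" and the helper `is_prediction_allowed` are hypothetical names.

```python
# Prediction methods permitted for each slice type, as listed in the text.
ALLOWED_PREDICTION = {
    "I": {"intra"},
    "P": {"intra", "uni"},
    "B": {"intra", "uni", "bi"},
}


def is_prediction_allowed(slice_type, mode):
    """Return True if the prediction mode may be used in the given slice type."""
    return mode in ALLOWED_PREDICTION[slice_type]


p_allows_bi = is_prediction_allowed("P", "bi")   # bidirectional is B-only
b_allows_bi = is_prediction_allowed("B", "bi")
```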

(Slice layer)
The slice layer defines the set of data that the hierarchical video decoding device 1 refers to in order to decode the slice S to be processed (also referred to as the target slice). As shown in FIG. 3(c), the slice S includes coding tree units CTU_1 to CTU_NC (NC is the total number of CTUs included in the slice S).

(CTU layer)
The CTU layer defines the set of data that the hierarchical video decoding device 1 refers to in order to decode the coding tree unit CTU to be processed (hereinafter also referred to as the target CTU). A coding tree unit is sometimes also called a coding tree block (CTB: Coding Tree Block) or a largest coding unit (LCU: Largest Coding Unit).

The coding tree unit CTU includes a CTU header CTUH and coding unit information CU_1 to CU_NL (NL is the total number of pieces of coding unit information included in the CTU). The relationship between the coding tree unit CTU and the coding unit information CU is described first.

The coding tree unit CTU is divided into units that specify the block sizes used for intra prediction or inter prediction and for transform processing.

These units of the coding tree unit CTU are obtained by recursive quadtree partitioning. The tree structure obtained by this recursive quadtree partitioning is hereinafter referred to as a coding tree.

In the following, a unit corresponding to a leaf, that is, a terminal node of the coding tree, is referred to as a coding node. Because the coding node is the basic unit of the encoding process, a coding node is hereinafter also referred to as a coding unit (CU).

That is, the coding unit information (hereinafter referred to as CU information) CU_1 to CU_NL is the information corresponding to each coding node (coding unit) obtained by recursively quadtree-partitioning the coding tree unit CTU.

The root of the coding tree is associated with the coding tree unit CTU. In other words, the coding tree unit CTU is associated with the top node of the quadtree structure that recursively contains a plurality of coding nodes.

Each coding node is half the width and half the height of its parent coding node (that is, the node one level above it).

The size of the coding tree unit CTU and the sizes that each coding unit can take depend on the size designation information for the minimum coding node and on the difference in hierarchy depth between the maximum and minimum coding nodes, both of which are contained in the sequence parameter set SPS. For example, when the minimum coding node size is 8×8 pixels and the difference in hierarchy depth between the maximum and minimum coding nodes is 3, the coding tree unit CTU is 64×64 pixels, and a coding node can take any of four sizes: 64×64 pixels, 32×32 pixels, 16×16 pixels, or 8×8 pixels.
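
The arithmetic in this example can be checked with a few lines of code. The sketch assumes that each additional depth level halves the side length, as stated above for parent and child coding nodes; the function name `coding_node_sizes` and its parameter names are hypothetical.

```python
def coding_node_sizes(min_size, depth_diff):
    """Return the side lengths a coding node can take, largest first.

    min_size   -- side length of the minimum coding node, in pixels
    depth_diff -- difference in hierarchy depth between the maximum and
                  minimum coding nodes
    """
    ctu_size = min_size << depth_diff  # each depth level doubles the side
    return [ctu_size >> depth for depth in range(depth_diff + 1)]


sizes = coding_node_sizes(8, 3)  # -> [64, 32, 16, 8]
ctu_size = sizes[0]              # -> 64
```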

(CTU header)
The CTU header CTUH contains the coding parameters that the hierarchical video decoding device 1 refers to in order to determine how to decode the target CTU. Specifically, as shown in FIG. 3(d), it contains CTU partition information SP_CTU, which designates the pattern by which the target CTU is partitioned into CUs, and a quantization parameter difference Δqp (qp_delta), which designates the size of the quantization step.

The CTU partition information SP_CTU is information representing the coding tree used to partition the CTU. Specifically, it designates the shape and size of each CU included in the target CTU and its position within the target CTU.

The CTU partition information SP_CTU need not explicitly contain the shapes and sizes of the CUs. For example, SP_CTU may be a set of flags, each indicating whether the entire target CTU or a partial region of the CTU is to be split into four. In that case, the shape and size of each CU can be determined by also using the shape and size of the CTU.
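
A set of split flags of this kind can be turned into CU positions and sizes as sketched below. The sketch assumes square CTUs, depth-first traversal of the four quadrants in raster order, and that no flag is sent for a node that is already at the minimum size; these conventions, the function `parse_coding_tree`, and the example flag sequence are assumptions for illustration and are not specified in this passage.

```python
def parse_coding_tree(flags, x, y, size, min_size):
    """Return a list of (x, y, size) tuples, one per CU.

    flags is an iterator of 0/1 split flags, consumed depth-first.
    """
    if size > min_size and next(flags) == 1:
        half = size // 2
        cus = []
        # Visit the four quadrants: top-left, top-right, bottom-left, bottom-right.
        for dy in (0, half):
            for dx in (0, half):
                cus += parse_coding_tree(flags, x + dx, y + dy, half, min_size)
        return cus
    # Not split (or already at minimum size): this node is a CU.
    return [(x, y, size)]


# 64x64 CTU: split the root, then split only the top-right 32x32 quadrant.
split_flags = iter([1, 0, 1, 0, 0, 0, 0, 0, 0])
cus = parse_coding_tree(split_flags, 0, 0, 64, 8)
covered_area = sum(size * size for _, _, size in cus)
```

Because the CTU size is known, the flags alone are enough to recover every CU's position and size, which is why the explicit shapes and sizes can be omitted.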

The quantization parameter difference Δqp is the difference qp - qp' between the quantization parameter qp of the target CTU and the quantization parameter qp' of the CTU encoded immediately before the target CTU.
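
Since Δqp = qp - qp', a decoder can recover each quantization parameter as qp = qp' + Δqp. The sketch below assumes that some initial value is available as the predictor for the first CTU; that initial value, the function `reconstruct_qps`, and the sample deltas are illustrative assumptions.

```python
def reconstruct_qps(initial_qp, deltas):
    """Recover per-CTU quantization parameters from successive differences."""
    qps = []
    previous = initial_qp  # assumed predictor for the first CTU
    for delta in deltas:
        previous = previous + delta  # qp = qp' + delta_qp
        qps.append(previous)
    return qps


qps = reconstruct_qps(26, [0, 2, -3])  # -> [26, 28, 25]
```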

(CU layer)
The CU layer defines the set of data that the hierarchical video decoding device 1 refers to in order to decode the CU to be processed (hereinafter also referred to as the target CU).

Before describing the specific contents of the data included in the CU information CU, the tree structure of the data included in a CU is described. A coding node is the root node of a prediction tree (PT) and of a transform tree (TT). The prediction tree and the transform tree are described below.

In the prediction tree, the coding node is divided into one or more prediction blocks, and the position and size of each prediction block are defined. In other words, a prediction block is one of the one or more non-overlapping regions that make up the coding node. The prediction tree contains the one or more prediction blocks obtained by this division.

Prediction processing is performed for each prediction block. Hereinafter, a prediction block, which is the unit of prediction, is also referred to as a prediction unit (PU).

Broadly speaking, division in the prediction tree (hereinafter abbreviated as PU division) falls into two cases: intra prediction and inter prediction.

For intra prediction, the division methods are 2N×2N (the same size as the coding node) and N×N.

For inter prediction, the division methods include 2N×2N (the same size as the coding node), 2N×N, 2N×nU, 2N×nD, N×2N, nL×2N, and nR×2N.

In the transform tree, the coding node is divided into one or more transform blocks, and the position and size of each transform block are defined. In other words, a transform block is one of the one or more non-overlapping regions that make up the coding node. The transform tree contains the one or more transform blocks obtained by this division.

Division in the transform tree takes one of two forms: a region of the same size as the coding node is assigned as a single transform block, or the coding node is divided by recursive quadtree division in the same way as the tree block division described above.

Transform processing is performed for each transform block. Hereinafter, a transform block, which is the unit of transform, is also referred to as a transform unit (TU).

(Data structure of CU information)
Next, the specific contents of the data included in the CU information CU are described with reference to FIG. 3(e). As shown in FIG. 3(e), the CU information CU specifically contains a skip flag SKIP, prediction tree information (hereinafter abbreviated as PT information) PTI, and transform tree information (hereinafter abbreviated as TT information) TTI.

The skip flag SKIP indicates whether skip mode is applied to the target PU. When the value of the skip flag SKIP is 1, that is, when skip mode is applied to the target CU, part of the PT information PTI and all of the TT information TTI in that CU information CU are omitted. The skip flag SKIP is omitted in I slices.

[PT information]
The PT information PTI is information about the prediction tree (hereinafter abbreviated as PT) included in the CU. In other words, the PT information PTI is a set of information about each of the one or more PUs included in the PT, and is referred to by the hierarchical video decoding device 1 when it generates a predicted image. As shown in FIG. 3(e), the PT information PTI contains prediction type information PType and prediction information PInfo.

The prediction type information PType specifies the predicted image generation method for the target PU. In the base layer, it specifies whether intra prediction or inter prediction is used.

The prediction information PInfo is the prediction information used by the prediction method specified by the prediction type information PType. In the base layer, it contains intra prediction information PP_Intra in the case of intra prediction and inter prediction information PP_Inter in the case of inter prediction.

The inter prediction information PP_Inter contains the prediction information that the hierarchical video decoding device 1 refers to when generating an inter predicted image by inter prediction. More specifically, PP_Inter contains inter PU division information, which specifies the pattern by which the target CU is divided into inter PUs, and inter prediction parameters (motion compensation parameters) for each inter PU. The inter prediction parameters include, for example, a merge flag (merge_flag), a merge index (merge_idx), an estimated motion vector index (mvp_idx), a reference picture index (ref_idx), an inter prediction flag (inter_pred_flag), and a motion vector residual (mvd).

The intra prediction information PP_Intra contains the coding parameters that the hierarchical video decoding device 1 refers to when generating an intra predicted image by intra prediction. More specifically, PP_Intra contains intra PU division information, which specifies the pattern by which the target CU is divided into intra PUs, and intra prediction parameters for each intra PU. An intra prediction parameter specifies the intra prediction method (prediction mode) for each intra PU.

Here, the intra prediction parameters are parameters for restoring the intra prediction (prediction mode) of each intra PU. The parameters for restoring the prediction mode include mpm_flag, a flag related to the MPM (Most Probable Mode; the same applies hereinafter); mpm_idx, an index for selecting an MPM; and rem_idx, an index for specifying a prediction mode other than an MPM. An MPM is an estimated prediction mode that is likely to be selected for the target partition. For example, the MPMs may include an estimated prediction mode derived from the prediction modes assigned to partitions surrounding the target partition, as well as the DC mode and the Planar mode, which generally have a high probability of occurrence.
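How these three parameters might combine to restore a prediction mode can be sketched as below. The text above does not specify the restoration procedure, so this sketch assumes an HEVC-style rule: when mpm_flag is set, mpm_idx picks a mode from the MPM candidate list; otherwise rem_idx counts through the modes that are not MPM candidates. The function name and the candidate list passed in are hypothetical, and the candidates are assumed to be distinct.

```python
def restore_intra_mode(mpm_flag, mpm_idx, rem_idx, mpm_candidates):
    """Restore a luma prediction mode number from mpm_flag, mpm_idx, and rem_idx."""
    if mpm_flag:
        # The mode is one of the most probable modes.
        return mpm_candidates[mpm_idx]
    # Otherwise rem_idx indexes the modes remaining after the MPMs are removed.
    # Assumed HEVC-style rule: step past every candidate at or below the mode.
    mode = rem_idx
    for candidate in sorted(mpm_candidates):
        if mode >= candidate:
            mode += 1
    return mode


# Hypothetical candidates: 0 (Planar), 1 (DC), 26 (a directional mode).
print(restore_intra_mode(True, 2, 0, [0, 1, 26]))    # 26
print(restore_intra_mode(False, 0, 0, [0, 1, 26]))   # 2
print(restore_intra_mode(False, 0, 24, [0, 1, 26]))  # 27
```

Because the MPM candidates are excluded from the range addressed by rem_idx, the non-MPM index needs fewer distinct values than the full set of prediction modes.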

In the following, the term "prediction mode" used on its own refers to the luminance prediction mode unless otherwise noted. The chrominance prediction mode is written as "chrominance prediction mode" to distinguish it from the luminance prediction mode. The parameters for restoring the prediction mode also include chroma_mode, a parameter for specifying the chrominance prediction mode.

[TT information]
The TT information TTI is information about the transform tree (hereinafter abbreviated as TT) included in the CU. In other words, the TT information TTI is a set of information about each of the one or more transform blocks included in the TT, and is referred to by the hierarchical video decoding device 1 when it decodes residual data.

As shown in FIG. 3(e), the TT information TTI contains TT division information SP_TT, which specifies the pattern by which the target CU is divided into transform blocks, and quantized prediction residuals QD_1 to QD_NT (where NT is the total number of blocks included in the target CU).

Specifically, the TT division information SP_TT is information for determining the shape of each transform block included in the target CU and its position within the target CU. For example, SP_TT can be realized by information indicating whether the target node is to be divided (split_transform_unit_flag) and information indicating the depth of that division (trafoDepth).

For example, when the CU size is 64×64, each transform block obtained by division can take a size from 32×32 pixels down to 4×4 pixels.
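The range in this example can be enumerated with a short sketch. The upper limit of 32 and lower limit of 4 are taken from the example itself and are treated here as assumed default parameters; the function name is hypothetical.

```python
def transform_block_sizes(cu_size, max_tb=32, min_tb=4):
    """List the square transform block edge lengths reachable by quadtree division."""
    # Assumption: a transform block never exceeds max_tb, so a 64x64 CU
    # must be divided at least once.
    size = min(cu_size, max_tb)
    sizes = []
    while size >= min_tb:
        sizes.append(size)
        size //= 2  # each quadtree level halves the edge length
    return sizes


print(transform_block_sizes(64))  # [32, 16, 8, 4]
```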

Each quantized prediction residual QD is encoded data generated by the hierarchical video encoding device 2 applying the following Processes 1 to 3 to the target block, which is the transform block being processed.

Process 1: Apply a frequency transform (for example, a DCT (Discrete Cosine Transform) or a DST (Discrete Sine Transform)) to the prediction residual obtained by subtracting the predicted image from the image to be encoded;
Process 2: Quantize the transform coefficients obtained in Process 1;
Process 3: Variable-length encode the transform coefficients quantized in Process 2.
Note that the quantization parameter qp described above represents the magnitude of the quantization step QP that the hierarchical video encoding device 2 used when quantizing the transform coefficients (QP = 2^(qp/6)).
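The relation QP = 2^(qp/6) can be checked numerically with the following sketch; the function name is hypothetical.

```python
def quantization_step(qp):
    """Quantization step size QP for a quantization parameter qp: QP = 2^(qp/6)."""
    return 2.0 ** (qp / 6.0)


for qp in (0, 6, 12, 18):
    print(qp, quantization_step(qp))
```

Under this relation, every increase of 6 in qp doubles the quantization step.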

(PU division information)
Assuming the size of the target CU is 2N×2N pixels, the PU division types specified by the PU division information comprise the following eight patterns in total: four symmetric splittings (2N×2N, 2N×N, N×2N, and N×N pixels) and four asymmetric splittings (2N×nU, 2N×nD, nL×2N, and nR×2N pixels). Here N = 2^m, where m is any integer of 1 or more. Hereinafter, a prediction unit obtained by dividing the target CU is referred to as a prediction block or a partition.
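The eight patterns can be illustrated as rectangles inside a CU. The text above does not state the dividing position for the asymmetric splittings, so this sketch assumes the HEVC convention in which the CU is divided at one quarter of its width or height (a 1:3 ratio). The function name and the string keys are hypothetical.

```python
def pu_partitions(part_mode, cu_size):
    """Return the partitions of a cu_size x cu_size CU as (x, y, width, height)."""
    s = cu_size       # 2N
    n = cu_size // 2  # N
    q = cu_size // 4  # assumed quarter position for asymmetric splittings
    table = {
        "2Nx2N": [(0, 0, s, s)],
        "2NxN": [(0, 0, s, n), (0, n, s, n)],
        "Nx2N": [(0, 0, n, s), (n, 0, n, s)],
        "NxN": [(0, 0, n, n), (n, 0, n, n), (0, n, n, n), (n, n, n, n)],
        "2NxnU": [(0, 0, s, q), (0, q, s, s - q)],
        "2NxnD": [(0, 0, s, s - q), (0, s - q, s, q)],
        "nLx2N": [(0, 0, q, s), (q, 0, s - q, s)],
        "nRx2N": [(0, 0, s - q, s), (s - q, 0, q, s)],
    }
    return table[part_mode]


print(pu_partitions("2NxnU", 32))  # [(0, 0, 32, 8), (0, 8, 32, 24)]
```

In every pattern the partitions tile the CU exactly, so their areas sum to the CU area.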

(Enhancement layer)
The encoded data included in the layer representation of the enhancement layer (hereinafter, enhancement layer encoded data) can also adopt, for example, a data structure substantially the same as that shown in FIG. 3. In enhancement layer encoded data, however, additional information can be added and parameters can be omitted, as described below.

In the slice layer, identification information for the hierarchies of spatial scalability, temporal scalability, SNR scalability, and view scalability (dependency_id, temporal_id, quality_id, and view_id, respectively) may be encoded.

The prediction type information PType included in the CU information CU specifies whether the predicted image generation method for the target CU is intra prediction, inter prediction, or inter-layer image prediction. PType includes a flag that specifies whether the inter-layer image prediction mode is applied (the inter-layer image prediction flag). The inter-layer image prediction flag is sometimes called texture_rl_flag, inter_layer_pred_flag, or base_mode_flag.

In the enhancement layer, the CU type of the target CU may be specified as one of intra CU, inter-layer CU, inter CU, or skip CU.

An intra CU can be defined in the same way as an intra CU in the base layer. For an intra CU, the inter-layer image prediction flag is set to "0" and the prediction mode flag is set to "0".

An inter-layer CU can be defined as a CU that uses the decoded image of a reference layer picture to generate the predicted image. For an inter-layer CU, the inter-layer image prediction flag is set to "1" and the prediction mode flag is set to "0".

A skip CU can be defined in the same way as in the HEVC scheme described above. For example, for a skip CU, the skip flag is set to "1".

An inter CU may be defined as a CU that is not skipped and to which motion compensation (MC) is applied. For an inter CU, for example, the skip flag is set to "0" and the prediction mode flag is set to "1".
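The flag settings listed for the four CU types can be summarized as a small classification sketch. The order in which the flags are tested is an assumption made for illustration, and the function and argument names are hypothetical.

```python
def cu_type(skip_flag, inter_layer_pred_flag, pred_mode_flag):
    """Classify an enhancement layer CU from the three flags described above."""
    if skip_flag == 1:
        return "skip"         # skip flag = 1
    if pred_mode_flag == 1:
        return "inter"        # skip flag = 0, prediction mode flag = 1
    if inter_layer_pred_flag == 1:
        return "inter-layer"  # inter-layer flag = 1, prediction mode flag = 0
    return "intra"            # inter-layer flag = 0, prediction mode flag = 0


print(cu_type(0, 1, 0))  # inter-layer
```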

As described above, the encoded data of the enhancement layer may be generated by a coding scheme different from that of the lower layer. That is, the encoding and decoding processes of the enhancement layer do not depend on the type of codec used for the lower layer.

The lower layer may be encoded by, for example, the MPEG-2 or H.264/AVC scheme.

In enhancement layer encoded data, the VPS may be extended to include parameters representing the reference structure between layers.

In enhancement layer encoded data, the SPS, PPS, and slice header may also be extended to include information about the decoded image of the reference layer used for inter-layer image prediction (for example, syntax for directly or indirectly deriving the inter-layer reference picture set, inter-layer reference picture list, base control information, and the like described later).

The parameters described above may each be encoded individually, or several parameters may be encoded jointly. When several parameters are encoded jointly, an index is assigned to each combination of parameter values, and the assigned index is encoded. In addition, when a parameter can be derived from another parameter or from already decoded information, encoding of that parameter can be omitted.
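Joint encoding by index assignment can be sketched as follows. The enumeration order of the combinations and all names are assumptions made for illustration; an actual scheme may assign indices in any agreed order.

```python
from itertools import product


def build_joint_table(value_sets):
    """Assign an index to every combination of parameter values."""
    combos = list(product(*value_sets))
    to_index = {combo: index for index, combo in enumerate(combos)}
    return combos, to_index


# Two hypothetical parameters: a binary flag and a three-valued mode.
combos, to_index = build_joint_table([(0, 1), (0, 1, 2)])
index = to_index[(1, 2)]  # encoder side: combination -> index
print(index)              # 5
print(combos[index])      # decoder side: index -> combination, (1, 2)
```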

[Relationship among pictures, tiles, and slices]
Next, pictures, tiles, and slices, which are important concepts in the present invention, are described with reference to FIG. 4 in terms of their relationship to one another and to the encoded data. FIG. 4 is a diagram explaining the relationship between a picture and its tiles and slices in hierarchically encoded data. A tile is associated with a rectangular partial region within a picture and with the encoded data for that partial region. A slice is associated with a partial region within a picture and with the encoded data for that partial region, that is, the slice header and slice data for that partial region.

FIG. 4(a) illustrates the divided regions when a picture is divided into tiles and slices. In FIG. 4(a), the picture is divided into six rectangular tiles (T00, T01, T02, T10, T11, T12). Tiles T00, T02, T10, and T12 each contain one slice (slices S00, S02, S10, and S12, respectively). Tile T01, in contrast, contains two slices (slices S01a and S01b), and tile T11 contains two slices (slices S11a and S11b).

FIG. 4(b) illustrates the relationship between tiles and slices in the structure of the encoded data. First, the encoded data consists of a plurality of VCL (Video Coding Layer) NAL units and non-VCL NAL units. The video coding layer encoded data corresponding to one picture consists of a plurality of VCL NALs. When a picture is divided into tiles, the encoded data corresponding to the picture contains the encoded data corresponding to the tiles in tile raster order. That is, when the picture is divided into tiles as shown in FIG. 4(a), the encoded data corresponding to the tiles is included in the order T00, T01, T02, T10, T11, T12. When a tile is divided into a plurality of slices, the encoded data corresponding to the slices is included in the encoded data corresponding to the tile, starting from the slice whose first CTU comes earliest in the CTU raster scan order within the tile. For example, when tile T01 contains slices S01a and S01b as shown in FIG. 4(a), the encoded data corresponding to the slices is included in the encoded data corresponding to tile T01 in the order S01a, S01b.
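The ordering rule can be sketched as a sort over slice records. The dictionary layout and the first-CTU addresses of slices S01b and S11b are hypothetical values chosen for illustration; only the tile and slice names come from FIG. 4(a).

```python
def bitstream_order(slices):
    """Order slices by tile raster order, then by first CTU address within the tile."""
    return sorted(
        slices,
        key=lambda s: (s["tile_row"], s["tile_col"], s["first_ctu_in_tile"]),
    )


slices = [
    {"name": "S11b", "tile_row": 1, "tile_col": 1, "first_ctu_in_tile": 6},
    {"name": "S12", "tile_row": 1, "tile_col": 2, "first_ctu_in_tile": 0},
    {"name": "S01b", "tile_row": 0, "tile_col": 1, "first_ctu_in_tile": 4},
    {"name": "S00", "tile_row": 0, "tile_col": 0, "first_ctu_in_tile": 0},
    {"name": "S10", "tile_row": 1, "tile_col": 0, "first_ctu_in_tile": 0},
    {"name": "S01a", "tile_row": 0, "tile_col": 1, "first_ctu_in_tile": 0},
    {"name": "S11a", "tile_row": 1, "tile_col": 1, "first_ctu_in_tile": 0},
    {"name": "S02", "tile_row": 0, "tile_col": 2, "first_ctu_in_tile": 0},
]
print([s["name"] for s in bitstream_order(slices)])
```

Because all slices of one tile are contiguous in this order, the encoded data for a single tile can be located as one consecutive run of slices.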

As the above description shows, the encoded data corresponding to a specific tile in a picture is associated with the encoded data corresponding to one or more slices. Therefore, if decoded images of the slices associated with a tile can be generated, a decoded image of the partial region of the picture corresponding to that tile can be generated.

Unless otherwise noted, the following description assumes the relationship among pictures, tiles, slices, and encoded data described above.

[Hierarchical video decoding device]
The configuration of the hierarchical video decoding device 1 according to the present embodiment is described below with reference to FIGS. 1 to 17.

(Configuration of the hierarchical video decoding device)
The schematic configuration of the hierarchical video decoding device 1 is described with reference to FIG. 5. FIG. 5 is a functional block diagram showing the schematic configuration of the hierarchical video decoding device 1. The hierarchical video decoding device 1 decodes hierarchically encoded data DATA (hierarchically encoded data DATAF provided by the hierarchical video encoding device 2, or hierarchically encoded data DATAR provided by the encoded data conversion device 3) and generates a decoded image POUT#T of the target layer. In the following description, the target layer is assumed to be an enhancement layer whose reference layer is the base layer. The target layer is therefore also an upper layer relative to the reference layer. Conversely, the reference layer is also a lower layer relative to the target layer.

As shown in FIG. 5, the hierarchical video decoding device 1 includes a NAL demultiplexing unit 11, a parameter set decoding unit 12, a tile setting unit 13, a slice decoding unit 14, a base decoding unit 15, and a decoded picture management unit 16.

The NAL demultiplexing unit 11 demultiplexes the hierarchically encoded data DATA, which is transmitted in units of NAL units in the NAL (Network Abstraction Layer).

The NAL is a layer provided to abstract the communication between the VCL (Video Coding Layer) and the lower system that transmits and stores the encoded data.

The VCL is the layer that performs video encoding processing; encoding is carried out in the VCL. The lower system referred to here corresponds to the H.264/AVC and HEVC file formats and to MPEG-2 Systems.

In the NAL, the bitstream generated by the VCL is divided into units called NAL units and transmitted to the destination lower system. A NAL unit contains encoded data encoded by the VCL and a header that allows the encoded data to be delivered properly to the destination lower system. The encoded data of each layer is stored in NAL units, and is thereby NAL-multiplexed and transmitted to the hierarchical video decoding device 1.

In addition to the NALs generated by the VCL, the hierarchically encoded data DATA contains NALs that carry parameter sets (VPS, SPS, PPS), SEI, and the like. These NALs are called non-VCL NALs, in contrast to VCL NALs.

The NAL demultiplexing unit 11 demultiplexes the hierarchically encoded data DATA and extracts the target layer encoded data DATA#T and the reference layer encoded data DATA#R. Of the NALs included in the target layer encoded data DATA#T, the NAL demultiplexing unit 11 supplies the non-VCL NALs to the parameter set decoding unit 12 and the VCL NALs to the slice decoding unit 14.
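The routing performed by the NAL demultiplexing unit 11 can be sketched as below. The representation of a NAL unit as a dictionary with `layer_id` and `is_vcl` fields is hypothetical, and the sketch assumes exactly two layers, so that any NAL not belonging to the target layer is treated as reference layer data.

```python
def demultiplex(nal_units, target_layer_id):
    """Split NAL units into inputs for the three downstream units."""
    parameter_set_input = []  # non-VCL NALs of the target layer (DATA#T)
    slice_input = []          # VCL NALs of the target layer (DATA#T)
    base_input = []           # reference layer data (DATA#R)
    for nal in nal_units:
        if nal["layer_id"] != target_layer_id:
            base_input.append(nal)
        elif nal["is_vcl"]:
            slice_input.append(nal)
        else:
            parameter_set_input.append(nal)
    return parameter_set_input, slice_input, base_input


stream = [
    {"layer_id": 0, "is_vcl": False, "kind": "SPS"},
    {"layer_id": 1, "is_vcl": False, "kind": "PPS"},
    {"layer_id": 0, "is_vcl": True, "kind": "slice"},
    {"layer_id": 1, "is_vcl": True, "kind": "slice"},
]
params, slices_in, base = demultiplex(stream, target_layer_id=1)
print(len(params), len(slices_in), len(base))  # 1 1 2
```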

The parameter set decoding unit 12 decodes the parameter sets, that is, the VPS, SPS, and PPS, from the input non-VCL NALs and supplies them to the tile setting unit 13 and the slice decoding unit 14.

The tile setting unit 13 derives tile information for the picture from the input parameter sets and supplies it to the slice decoding unit 14. The tile information includes at least the tile division information of the picture. The tile setting unit 13 is described in detail later.

The slice decoding unit 14 generates a decoded picture, or a partial region of a decoded picture, from the input VCL NALs, parameter sets, tile information, and reference pictures, and records it in the buffer in the decoded picture management unit 16. The slice decoding unit 14 is described in detail later.

The decoded picture management unit 16 records the input decoded pictures and base decoded pictures in its internal decoded picture buffer (DPB), and also generates reference picture lists and determines output pictures. The decoded picture management unit 16 outputs the decoded pictures recorded in the DPB to the outside as output pictures POUT#T at predetermined timings.

The base decoding unit 15 decodes base decoded pictures from the reference layer encoded data DATA#R. A base decoded picture is a decoded picture of the reference layer that is used when decoding a decoded picture of the target layer. The base decoding unit 15 records the base decoded pictures it has decoded in the DPB in the decoded picture management unit 16.

The detailed configuration of the base decoding unit 15 is described with reference to FIG. 6. FIG. 6 is a functional block diagram illustrating the configuration of the base decoding unit 15.

As shown in FIG. 6, the base decoding unit 15 includes a base NAL demultiplexing unit 151, a base parameter set decoding unit 152, a base tile setting unit 153, a base slice decoding unit 154, and a base decoded picture management unit 156.

The base NAL demultiplexing unit 151 demultiplexes the reference layer encoded data DATA#R to extract the VCL NALs and non-VCL NALs, and supplies the non-VCL NALs to the base parameter set decoding unit 152 and the VCL NALs to the base slice decoding unit 154.

The base parameter set decoding unit 152 decodes the parameter sets, that is, the VPS, SPS, and PPS, from the input non-VCL NALs and supplies them to the base tile setting unit 153 and the base slice decoding unit 154.

The base tile setting unit 153 derives tile information for the picture from the input parameter sets and supplies it to the base slice decoding unit 154.

The base slice decoding unit 154 generates a decoded picture, or a partial region of a decoded picture, from the input VCL NALs, parameter sets, tile information, and reference pictures, and records it in the buffer in the base decoded picture management unit 156.

The base decoded picture management unit 156 records the input decoded pictures in its internal DPB, and also generates reference picture lists and determines output pictures. The base decoded picture management unit 156 outputs the decoded pictures recorded in the DPB as base decoded pictures at predetermined timings.

(Tile setting unit 13)
The tile setting unit 13 derives and outputs tile information for the picture from the input parameter sets.

In the present embodiment, the tile information generated by the tile setting unit 13 broadly comprises tile structure information and tile dependency information.

The tile structure information indicates the number of tiles in the picture and the size of each tile. When tiles are associated with the partial regions obtained by dividing the picture into a grid, the number of tiles in the picture equals the product of the number of tiles in the horizontal direction and the number of tiles in the vertical direction.
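For a grid-shaped division, the tile count and the raster order of the tiles can be sketched as follows. The function name is hypothetical, and the two-digit row/column labels follow the tile naming used in FIG. 4(a) purely for illustration.

```python
def tile_grid(num_columns, num_rows):
    """Enumerate the (row, column) positions of a tile grid in raster order."""
    return [(row, col) for row in range(num_rows) for col in range(num_columns)]


tiles = tile_grid(num_columns=3, num_rows=2)
print(len(tiles))                           # 6, the product 3 * 2
print(["T%d%d" % (r, c) for r, c in tiles])
```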

The tile dependency information indicates the dependencies involved in decoding the tiles in a picture. Here, the dependency in decoding a tile means the degree to which the tile depends on decoded pixels and syntax values belonging to regions outside the tile. Regions outside the tile include regions outside the tile on the target picture, regions outside the tile on reference pictures, and regions outside the tile on base decoded pictures.

以下、タイル設定部１３により生成されるタイル情報の詳細について、入力されるパラメータセットに基づく導出過程を含めて説明する。 Hereinafter, details of tile information generated by the tile setting unit 13 will be described including a derivation process based on an input parameter set.

タイル情報は、パラメータセットに含まれるSPSやPPSに含まれるタイル情報に係るシンタックスの値に基づいて導出される。タイル情報に係るシンタックスについて図７と図９を参照して説明する。 The tile information is derived based on the syntax value related to the tile information included in the SPS or PPS included in the parameter set. The syntax related to tile information will be described with reference to FIGS.

(PPS tile information)
FIG. 7 shows the part of the syntax table, referred to by the parameter decoding unit 12 when decoding the PPS included in the parameter set, that relates to tile information.

The tile-related syntax included in the PPS (PPS tile information) includes a multi-tile enabled flag (tiles_enabled_flag). A flag value of 1 indicates that the picture consists of two or more tiles. A flag value of 0 indicates that the picture consists of a single tile, that is, the picture and the tile coincide.

When multiple tiles are enabled (tiles_enabled_flag is true), the PPS tile information additionally includes information indicating the number of tile columns (num_tile_columns_minus1), information indicating the number of tile rows (num_tile_rows_minus1), and a flag indicating whether the tile sizes are uniform (uniform_spacing_flag).

num_tile_columns_minus1 is a syntax element whose value is the number of tiles in the horizontal direction of the picture minus 1. Likewise, num_tile_rows_minus1 is a syntax element whose value is the number of tiles in the vertical direction of the picture minus 1. Therefore, the number of tiles in the picture, NumTilesInPic, is calculated by the following equation.

NumTilesInPic = (num_tile_columns_minus1 + 1) * (num_tile_rows_minus1 + 1)
A uniform_spacing_flag value of 1 indicates that the tiles in the picture are uniform in size, that is, the tiles have equal widths and equal heights. A uniform_spacing_flag value of 0 indicates that the tiles in the picture are not uniform in size, that is, the widths and heights of the tiles do not necessarily match.

When the tile sizes are not uniform (uniform_spacing_flag is 0), the PPS tile information additionally includes, for each tile column in the picture, information indicating the tile width (column_width_minus1[i]) and, for each tile row in the picture, information indicating the tile height (row_height_minus1[i]).

In addition, when multiple tiles are enabled, the PPS tile information includes a flag (loop_filter_across_tiles_enabled_flag) indicating whether a loop filter is applied across tile boundaries.

The relationship between tile rows, tile columns, and the picture is now described with reference to FIG. 8, which illustrates the tile rows and tile columns obtained when a picture is divided into tiles. In the example of FIG. 8, the picture is divided into 4 tile columns and 3 tile rows, and contains 12 tiles in total. For example, tile column 0 (TileCol0) contains tiles T00, T10, and T20, and tile row 0 (TileRow0) contains tiles T00, T01, T02, and T03. The width of tile column i is denoted ColWidth[i] in CTU units, and the height of tile row j is denoted RowHeight[j] in CTU units. Therefore, a tile belonging to tile column i and tile row j has width ColWidth[i] and height RowHeight[j].

Based on the above PPS tile information, the tile setting unit 13 derives the tile structure information. The tile structure information includes an array that maps a raster scan CTB address to a tile scan CTB address (CtbAddrRsToTs[ctbAddrRs]), an array that maps a tile scan CTB address to a raster scan CTB address (CtbAddrTsToRs[ctbAddrTs]), the tile identifier for each tile scan CTB address (TileId[ctbAddrTs]), the width of each tile column (ColumnWidthInLumaSamples[i]), and the height of each tile row (RowHeightInLumaSamples[j]).

When uniform_spacing_flag is 1, the width of each tile column is calculated from the picture size and the number of tile columns in the picture. For example, the width of the i-th tile column in CTU units, ColWidth[i], is calculated by the following equation, where PicWidthInCtbsY is the number of CTUs in the horizontal direction of the picture and "/" denotes integer division.

ColWidth[i] = ( (i+1) * PicWidthInCtbsY ) / ( num_tile_columns_minus1 + 1 ) - ( i * PicWidthInCtbsY ) / ( num_tile_columns_minus1 + 1 )
That is, ColWidth[i] is calculated as the difference between the (i+1)-th and the i-th boundary positions obtained by dividing the picture equally by the number of tile columns.

On the other hand, when uniform_spacing_flag is 0, ColWidth[i], the width of the i-th tile column in CTU units, is set to (column_width_minus1[i] + 1).

ColumnWidthInLumaSamples[i] is set to ColWidth[i] multiplied by the CTU width in pixels.

The height of each tile row in CTU units, RowHeight[j], is calculated in the same way as the tile column width, using PicHeightInCtbsY (the number of CTUs in the vertical direction of the picture) instead of PicWidthInCtbsY, num_tile_rows_minus1 instead of num_tile_columns_minus1, and row_height_minus1[i] instead of column_width_minus1[i].

RowHeightInLumaSamples[j] is set to RowHeight[j] multiplied by the CTU height in pixels.
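
The derivation of the tile column widths and tile row heights can be illustrated with a minimal Python sketch. The function names `tile_sizes_in_ctus` and `sizes_in_luma_samples` are hypothetical and not part of the disclosed syntax. The sketch assumes integer division in the uniform case and, following the text, one explicitly signalled size per column (or row) in the non-uniform case. In the uniform case the resulting sizes can differ by one CTU when the picture size is not divisible by the number of tiles.

```python
def tile_sizes_in_ctus(pic_size_in_ctbs, num_tiles_minus1,
                       uniform_spacing_flag, size_minus1=None):
    """Return ColWidth[] (or RowHeight[]) in CTU units.

    pic_size_in_ctbs: PicWidthInCtbsY for columns, PicHeightInCtbsY for rows.
    num_tiles_minus1: num_tile_columns_minus1 or num_tile_rows_minus1.
    size_minus1: column_width_minus1[] or row_height_minus1[] (non-uniform case).
    """
    n = num_tiles_minus1 + 1
    if uniform_spacing_flag:
        # Difference between the (i+1)-th and i-th boundary positions.
        return [((i + 1) * pic_size_in_ctbs) // n - (i * pic_size_in_ctbs) // n
                for i in range(n)]
    # Non-uniform case: each signalled value plus 1 (assumed one value per tile).
    return [s + 1 for s in size_minus1]


def sizes_in_luma_samples(sizes_in_ctus, ctb_log2_size_y):
    """Multiply CTU-unit sizes by the CTU size (2 ** ctb_log2_size_y pixels)."""
    return [s << ctb_log2_size_y for s in sizes_in_ctus]
```

For example, a picture 10 CTUs wide split uniformly into 4 tile columns yields column widths of 2, 3, 2, and 3 CTUs, which sum to the picture width.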

Next, the method of deriving the array that maps tile scan CTB addresses to raster scan CTB addresses (CtbAddrTsToRs[ctbAddrTs]) is described.

First, colBd[i], the boundary position of the i-th tile column, and rowBd[j], the boundary position of the j-th tile row, are calculated by the following equations. colBd[0] and rowBd[0] are set to 0.

colBd[i+1] = colBd[i] + ColWidth[i]
rowBd[j+1] = rowBd[j] + RowHeight[j]
Next, the tile scan CTU address associated with the CTU identified by each raster scan CTU address (ctbAddrRs) in the picture is derived by the following procedure.

The position (tbX, tbY) of the target CTU within the picture, in CTU units, is calculated from ctbAddrRs by the following equations. Here, the operator "%" is the remainder operator: "A % B" is the remainder obtained by dividing integer A by integer B.

tbX = ctbAddrRs % PicWidthInCtbsY
tbY = ctbAddrRs / PicWidthInCtbsY
Next, the position (tileX, tileY), in tile units, of the tile containing the target CTU is derived. tileX is set to the largest value of i for which the expression (tbX >= colBd[i]) is true. Similarly, tileY is set to the largest value of j for which the expression (tbY >= rowBd[j]) is true.

CtbAddrRsToTs[ctbAddrRs] is set to the sum of two values: the total number of CTUs contained in the tiles that precede tile (tileX, tileY) in tile scan order, and the position, in raster scan order within that tile, of the CTU located at (tbX - colBd[tileX], tbY - rowBd[tileY]) inside tile (tileX, tileY).

CtbAddrTsToRs[ctbAddrTs] is set to the value of k for which CtbAddrRsToTs[k] equals ctbAddrTs.

TileId[ctbAddrTs] is set to the tile identifier of the tile to which the CTU indicated by ctbAddrTs belongs. The tile identifier tileId(tileX, tileY) of the tile located at position (tileX, tileY), in tile units, within the picture is calculated by the following equation.

tileId(tileX, tileY) = (tileY * (num_tile_columns_minus1 + 1)) + tileX
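
The derivation of the three arrays can be sketched in Python as follows. This is an illustrative sketch only: the function name `build_tile_scan_tables` is hypothetical, and it assumes that tile scan order visits the tiles in raster order (row by row, left to right), which the text above does not state explicitly.

```python
def build_tile_scan_tables(col_width, row_height):
    """Derive CtbAddrRsToTs, CtbAddrTsToRs and TileId from ColWidth/RowHeight.

    col_width, row_height: tile column widths and tile row heights in CTU units.
    """
    pic_w, pic_h = sum(col_width), sum(row_height)
    num_cols, num_rows = len(col_width), len(row_height)

    # Boundary positions: colBd[0] = rowBd[0] = 0.
    col_bd = [0]
    for w in col_width:
        col_bd.append(col_bd[-1] + w)
    row_bd = [0]
    for h in row_height:
        row_bd.append(row_bd[-1] + h)

    n = pic_w * pic_h
    rs_to_ts, ts_to_rs, tile_id = [0] * n, [0] * n, [0] * n
    for rs in range(n):
        tb_x, tb_y = rs % pic_w, rs // pic_w
        # Largest index whose boundary position does not exceed the CTU position.
        tile_x = max(i for i in range(num_cols) if tb_x >= col_bd[i])
        tile_y = max(j for j in range(num_rows) if tb_y >= row_bd[j])
        # CTUs in preceding tiles: full tile rows above, then tiles to the left.
        preceding = row_bd[tile_y] * pic_w + row_height[tile_y] * col_bd[tile_x]
        # Raster scan position of the CTU inside its own tile.
        inside = ((tb_y - row_bd[tile_y]) * col_width[tile_x]
                  + (tb_x - col_bd[tile_x]))
        ts = preceding + inside
        rs_to_ts[rs] = ts
        ts_to_rs[ts] = rs
        tile_id[ts] = tile_y * num_cols + tile_x
    return rs_to_ts, ts_to_rs, tile_id
```

For a picture of 4 x 2 CTUs split into two tile columns of width 2 and one tile row of height 2, the raster scan addresses 0 to 7 map to the tile scan addresses 0, 1, 4, 5, 2, 3, 6, 7.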
(SPS tile information)
FIG. 9 shows the part of the syntax table, referred to by the parameter decoding unit 12 when decoding the SPS included in the parameter set, that relates to tile information. The SPS extension (sps_extension) included in the SPS of an upper layer includes an all-tile dependency identifier (all_tiles_decoding_dependency_idc) as tile dependency information.

The value of the all-tile dependency identifier carries dependency information for the decoding of all tiles included in the encoded data. Specifically, it carries information on motion dependency and inter-layer dependency.

Motion dependency indicates whether, in inter prediction within a layer, a tile of the target picture can be decoded without depending on the decoded pixels or the syntax of the area of the reference picture (the area outside the reference picture tile) that is not included in the area located at the same position as that tile (the corresponding tile area on the reference picture). In the following, the case where a tile can be decoded without depending on the decoded pixels or syntax of the area outside the reference picture tile is described as "no motion dependency", and the other case as "motion dependency present".

Inter-layer dependency indicates whether, in inter-layer prediction, a tile of the target picture can be decoded without depending on the decoded pixels or the syntax of the area of the inter-layer reference picture (the area outside the inter-layer reference picture tile) that is not included in the area located at the same position as that tile (the corresponding tile area on the inter-layer reference picture). In the following, the case where a tile can be decoded without depending on the decoded pixels or syntax of the area outside the inter-layer reference picture tile is described as "no inter-layer dependency", and the other case as "inter-layer dependency present".

FIG. 10 shows the correspondence between the value of the all-tile dependency identifier and motion dependency and inter-layer dependency. As shown in FIG. 10, a value of 0 indicates that both motion dependency and inter-layer dependency are present. A value of 1 indicates that motion dependency is present and inter-layer dependency is absent. A value of 2 indicates that motion dependency is absent and inter-layer dependency is present. A value of 3 indicates that neither motion dependency nor inter-layer dependency is present.

With this correspondence, when the value of the all-tile dependency identifier is written in binary, the lower bit corresponds to the presence or absence of inter-layer dependency and the upper bit corresponds to the presence or absence of motion dependency, with a bit value of 1 indicating that the dependency is absent. The relationship between the value of the all-tile dependency identifier and motion dependency and inter-layer dependency is not limited to this correspondence, but this correspondence is preferable because the presence or absence of motion dependency alone, or of inter-layer dependency alone, can be determined easily.
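
As an illustration of why this bit layout is convenient, the following Python sketch decodes the identifier with two bit tests. The function name `decode_all_tiles_dependency_idc` is hypothetical; the mapping itself follows the correspondence described for FIG. 10.

```python
def decode_all_tiles_dependency_idc(idc):
    """Return (has_motion_dependency, has_inter_layer_dependency) for idc in 0..3."""
    if not 0 <= idc <= 3:
        raise ValueError("all_tiles_decoding_dependency_idc must be in 0..3")
    # Lower bit set to 1 means inter-layer dependency is absent.
    has_inter_layer_dependency = (idc & 1) == 0
    # Upper bit set to 1 means motion dependency is absent.
    has_motion_dependency = (idc & 2) == 0
    return has_motion_dependency, has_inter_layer_dependency
```

Each dependency can be checked independently of the other, without a table lookup over all four values.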

The tile dependency information need not be included in the SPS; it may be included in another parameter set, for example the VPS or the PPS.

Although the all-tile dependency identifier has been described as the tile dependency information, other information may be used instead. More generally, information that includes any of the following can be used as tile dependency information.

(1) Information indicating whether decoded pixels in the area outside the tile on a reference picture are referred to when decoding any tile
(2) Information indicating whether coding parameters (for example, motion information) of the area outside the tile on a reference picture are referred to when decoding any tile
(3) Information indicating whether decoded pixels in the area outside the tile on an inter-layer reference picture are referred to when decoding any tile
(4) Information indicating whether coding parameters (for example, motion information) of the area outside the tile on an inter-layer reference picture are referred to when decoding any tile
For example, a binary flag that carries the information of (1) above can be used as tile dependency information. As another example, ternary information may be used as tile dependency information, in which "0" indicates that, for (1), decoded pixels in the area outside the tile are referred to; "1" indicates that, for (1), decoded pixels in the area outside the tile are not referred to and, for (2), coding parameters of the area outside the tile are referred to; and "2" indicates that, for (1), decoded pixels in the area outside the tile are not referred to and, for (2), coding parameters of the area outside the tile are not referred to.

(Slice decoding unit 14)
The slice decoding unit 14 generates and outputs a decoded picture based on the input VCL NAL, parameter set, and tile information.

The schematic configuration of the slice decoding unit 14 is described with reference to FIG. 11, which is a functional block diagram showing that configuration.

The slice decoding unit 14 includes a slice header decoding unit 141, a slice position setting unit 142, a skip slice determination unit 143, and a CTU decoding unit 144. The CTU decoding unit 144 further includes a non-skip CTU decoding unit 144N and a skip CTU generation unit 144S.

(Slice header decoding unit)
The slice header decoding unit 141 decodes the slice header based on the input VCL NAL and parameter set, and outputs it to the slice position setting unit 142, the skip slice determination unit 143, and the CTU decoding unit 144.

The slice header includes information on the position of the slice within the picture (SH slice position information) and information on skip slices (SH skip slice information). Each is described below with an example of the syntax table that the slice header decoding unit 141 refers to when decoding the slice header.

FIG. 12 shows the part of the syntax table, referred to by the slice header decoding unit 141 when decoding the slice header, that relates to slice position information.

The slice header includes, as slice position information, a first-slice-in-picture flag (first_slice_segment_in_pic_flag). A flag value of 1 indicates that the target slice is the first slice in the picture in decoding order. A flag value of 0 indicates that the target slice is not the first slice in the picture in decoding order.

The slice header also includes, as slice position information, a slice PPS identifier (slice_pic_parameter_set_id). The slice PPS identifier is the identifier of the PPS associated with the target slice, and the tile information to be associated with the target slice is identified through this PPS identifier.

The slice header further includes, as slice position information, a slice segment address (slice_segment_address). The slice segment address specifies the position of the target slice, that is, the position within the picture of the first CTU in decoding order among the CTUs in the target slice, as an address in CTU raster scan order within the picture.

FIG. 13 shows the part of the syntax table, referred to by the slice header decoding unit 141 when decoding the slice header, that relates to skip slice information.

The skip slice information is decoded when the slice header belongs to an upper layer and the slice header extension is enabled (slice_segment_header_extension_present_flag is true). Therefore, in the base layer, which cannot be an upper layer, the slice header does not include skip slice information.

When multiple tiles are enabled (tiles_enabled_flag is true) and the all-tile dependency identifier indicates that at least one of motion dependency and inter-layer dependency is absent (all_tiles_decoding_dependency_idc > 0), the slice header includes a non-important tile flag (non_significant_tile_flag) as skip slice information. Furthermore, when the non-important tile flag is true, the slice header includes a syntax element representing the number of CTUs in the slice (num_ctu_in_slice_segment_minus1). This syntax element is also referred to as the number of skip CTUs, and the non-important tile flag is also referred to as the skip slice flag. It is preferable to constrain the non-important tile flags of the slices contained in the same tile to have the same value.

Here, num_ctu_in_slice_segment_minus1 is decoded as a binarized non-negative integer of bit length M. The bit length M is calculated by the following equations, based on the number of CTUs in the tile that contains the target slice.

M = Ceil( Log2( ColWidth[tileX] * RowHeight[tileY] ) )
where
tileX = tileIdx % (num_tile_columns_minus1 + 1)
tileY = tileIdx / (num_tile_columns_minus1 + 1)
tileIdx = TileId[ CtbAddrRsToTs[ slice_segment_address ] ]
The value of M derived by the above equations is Ceil(K), that is, the smallest integer greater than or equal to K, where K is the base-2 logarithm of the number of CTUs (ColWidth[tileX] * RowHeight[tileY]) in the tile identified by the tile identifier tileIdx.
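
The bit-length calculation can be sketched in Python as follows. The function name `skip_ctu_count_bit_length` and its parameter names are hypothetical. The arrays `rs_to_ts` and `tile_id` stand for CtbAddrRsToTs and TileId and are assumed to have been derived beforehand. The sketch uses integer arithmetic for Ceil(Log2(n)) to avoid floating-point rounding.

```python
def skip_ctu_count_bit_length(slice_segment_address, rs_to_ts, tile_id,
                              col_width, row_height):
    """Return the bit length M used to code num_ctu_in_slice_segment_minus1."""
    num_cols = len(col_width)  # num_tile_columns_minus1 + 1
    tile_idx = tile_id[rs_to_ts[slice_segment_address]]
    tile_x = tile_idx % num_cols
    tile_y = tile_idx // num_cols
    ctus_in_tile = col_width[tile_x] * row_height[tile_y]
    # For n >= 1, (n - 1).bit_length() equals Ceil(Log2(n)).
    return (ctus_in_tile - 1).bit_length()
```

A tile of 4 CTUs gives M = 2, which is enough to represent the values 0 to 3 that num_ctu_in_slice_segment_minus1 can take within that tile.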

num_ctu_in_slice_segment_minus1 may also be binarized by a method other than the above. For example, using the number of CTUs in the picture, numCtbsInPics, an integer N may be set to N = Ceil(Log2(numCtbsInPics)), and the syntax element may be binarized as a non-negative integer of bit length N. However, when multiple tiles are used, a slice is generally contained within a tile, so when num_ctu_in_slice_segment_minus1 is transmitted, the number of CTUs in the slice never exceeds the number of CTUs in the tile. It is therefore preferable to use a code length based on the number of CTUs in the tile, which is smaller than the number of CTUs in the picture, because the syntax element can then be coded with fewer bits.

The skip slice information included in the slice header, described with reference to FIG. 13, can also be expressed as follows: the slice header of the base layer does not include skip slice information, and the slice header of an upper layer includes skip slice information under a predetermined condition.

Adding skip slice information only to the slice headers of upper layers in this way is particularly useful when the base layer conforms to an existing standard, for example HEVC. This is because HEVC is already in widespread use, which makes it difficult to change the slice header syntax. In such a case, adding skip slice information only to upper layers under a predetermined condition makes it possible to provide the skip slice function in upper layers, where it can be used in applications such as high-resolution display of a region of interest, without impairing the backward compatibility of the base layer.

(Slice position setting unit)
The slice position setting unit 142 identifies the position of the slice within the picture based on the input slice header and tile information, and outputs it to the CTU decoding unit 144.

Let (ctbX[i], ctbY[i]) denote the position within the picture, in CTU units, of the i-th CTU in the slice, and let ctbAddrTs[i] denote its tile scan address. The position (ctbX[0], ctbY[0]) within the picture and the tile scan address ctbAddrTs[0] of the first CTU of the slice, that is, the 0-th CTU, are calculated by the following equations.

ctbAddrTs[0] = CtbAddrRsToTs[slice_segment_address]
ctbX[0] = slice_segment_address % PicWidthInCtbsY
ctbY[0] = slice_segment_address / PicWidthInCtbsY
Here, CtbAddrRsToTs[X] is the array that converts a raster scan address into a tile scan address, and is included in the tile information input to the slice position setting unit.

The position (ctbX[i], ctbY[i]) within the picture of the i-th CTU (i > 0) in the slice is calculated by the following equations.

ctbAddrTs[i] = ctbAddrTs[i-1] + 1
ctbX[i] = CtbAddrTsToRs[ctbAddrTs[i]] % PicWidthInCtbsY
ctbY[i] = CtbAddrTsToRs[ctbAddrTs[i]] / PicWidthInCtbsY
That is, the tile scan address of the target CTU is set to the tile scan address of the immediately preceding CTU plus 1. The resulting tile scan address is then converted into a raster scan address using the conversion array CtbAddrTsToRs included in the tile information. The position (ctbX[i], ctbY[i]) of the CTU within the picture is derived from the raster scan address and the picture width in CTU units.

To calculate the position (ctbXInLumaPixels[i], ctbYInLumaPixels[i]) of the CTU within the picture in luma pixel units from (ctbX[i], ctbY[i]), each component is multiplied by the CTU size. For example, using CtbLog2SizeY, the base-2 logarithm of the CTU width in luma pixel units, the position can be calculated as follows.

ctbXInLumaPixels[i] = ctbX[i] << CtbLog2SizeY
ctbYInLumaPixels[i] = ctbY[i] << CtbLog2SizeY
Through the above processing, the slice position setting unit 142 calculates and outputs the position within the picture of each CTU in the slice.
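
The position calculation can be sketched in Python as follows. The function name `slice_ctu_positions` is hypothetical, and the number of CTUs in the slice (`num_ctus`) is assumed to be known to the caller; for a skip slice it would be num_ctu_in_slice_segment_minus1 + 1. The arrays `rs_to_ts` and `ts_to_rs` stand for CtbAddrRsToTs and CtbAddrTsToRs.

```python
def slice_ctu_positions(slice_segment_address, num_ctus, rs_to_ts, ts_to_rs,
                        pic_width_in_ctbs, ctb_log2_size_y):
    """Return (ctbX, ctbY, ctbXInLumaPixels, ctbYInLumaPixels) per CTU in the slice."""
    positions = []
    # Tile scan address of the first CTU in the slice.
    ts = rs_to_ts[slice_segment_address]
    for _ in range(num_ctus):
        rs = ts_to_rs[ts]  # back to a raster scan address
        x, y = rs % pic_width_in_ctbs, rs // pic_width_in_ctbs
        positions.append((x, y, x << ctb_log2_size_y, y << ctb_log2_size_y))
        ts += 1            # the next CTU follows in tile scan order
    return positions
```

For the first CTU the round trip through both arrays returns slice_segment_address itself, so the result agrees with the equations given for ctbX[0] and ctbY[0].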

(Skip slice determination unit)
The skip slice determination unit 143 determines whether the target slice is a skip slice based on the input slice header. When the target slice is determined to be a skip slice, the slice header is output to the skip CTU generation unit 144S in the CTU decoding unit 144. When the target slice is determined to be a non-skip slice, the slice header and the slice data are output to the non-skip CTU decoding unit 144N in the CTU decoding unit 144.

The processing executed by the skip slice determination unit 143 to determine whether the target slice is a skip slice is described with reference to FIG. 1, which is a flowchart of that processing. The determination is performed by the following steps S101 to S105.

(S101) If the number of tiles in the picture is 2 or more (YES in S101), the process proceeds to S102. Otherwise (NO in S101), the process proceeds to S105. The determination in S101 can be made by referring to the value of tiles_enabled_flag: if tiles_enabled_flag is 1, the picture contains two or more tiles, and if tiles_enabled_flag is 0, the picture contains exactly one tile.

(S102) If, for all tiles included in the sequence, either motion dependency or inter-layer dependency is absent (YES in S102), the process proceeds to S103. Otherwise (NO in S102), the process proceeds to S105. The determination in S102 can be made by referring to the all-tiles dependency identifier (all_tiles_decoding_dependency_idc) included in the SPS tile information described above.

(S103) If the value of the non-important tile flag (non_significant_tile_flag) is equal to 1 (YES in S103), the process proceeds to S104. Otherwise (NO in S103), the process proceeds to S105.

(S104) The target slice is set as a skip slice, and the process ends.

(S105) The target slice is set as a non-skip slice, and the process ends.

According to the above procedure, the target slice is set as a non-skip slice when a condition under which no non-important tile can exist holds (the number of tiles in the picture is 1, or both motion dependency and inter-layer dependency are present). Otherwise, whether the target slice is a non-skip slice or a skip slice is determined according to the value of the non-important tile flag.
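
The flow of steps S101 to S105 can be sketched as follows. This is an illustrative Python sketch, not the apparatus itself: the function name `is_skip_slice` is hypothetical, and mapping a YES in S102 to `all_tiles_decoding_dependency_idc > 0` is an assumption taken from the syntax-table condition the text gives for FIG. 13.

```python
def is_skip_slice(tiles_enabled_flag: int,
                  all_tiles_decoding_dependency_idc: int,
                  non_significant_tile_flag: int) -> bool:
    """Return True when the target slice is set as a skip slice (S104)."""
    # S101: a picture has two or more tiles only when tiles_enabled_flag is 1.
    if tiles_enabled_flag != 1:
        return False  # S105: non-skip slice
    # S102 (assumed encoding): a value greater than 0 signals that motion
    # dependency or inter-layer dependency is absent for all tiles.
    if all_tiles_decoding_dependency_idc <= 0:
        return False  # S105: non-skip slice
    # S103: the non-important tile flag decides between S104 and S105.
    return non_significant_tile_flag == 1
```

A slice is reported as a skip slice only when all three checks pass; any earlier NO falls through to the non-skip result.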

According to the above procedure, skipping a tile can be prevented when the number of tiles in the picture is 1. Treating the entire picture of the upper layer as a skip slice is of little use, because in that case decoding only the lower layer serves the same purpose. It is therefore preferable to be able to prevent tile skipping when the number of tiles is 1.

The above procedure also prevents a tile from being skipped when both motion dependency and inter-layer dependency are present. When there is motion dependency, decoding the tile to which the target slice belongs may refer to pixel values or syntax values outside the region of the reference picture that is co-located with that tile. Encoding a tile outside the region of interest as a skip slice would therefore affect the decoded image inside the region of interest in subsequent pictures, which is undesirable. When there is inter-layer dependency, the lower-layer tiles corresponding to tiles outside the region of interest must be decoded. This is undesirable because it rules out reducing complexity by leaving regions outside the region of interest undecoded in the lower layer, and it rules out decoding the lower-layer tile corresponding to the region of interest in parallel with the tile of the target layer. It is therefore preferable that a tile can be skipped only when either motion dependency or inter-layer dependency is absent, and step S102 above realizes this restriction.

Note that, in the process described with reference to FIG. 1, the determination on the number of tiles in S101 may be omitted and the process may continue as if YES had been selected in S101. In that case the entire picture may end up being skipped, but the amount of determination processing is reduced.

Also, in the process described with reference to FIG. 1, the determination in S102, namely whether "no tile has motion dependency, or no tile has inter-layer dependency", may be replaced by a determination of whether "no tile has motion dependency". Such a configuration is particularly effective when the target layer does not serve as a reference layer, and it reduces the determination processing. This is because, even if a tile of the target layer is changed to a skip slice, there is no possibility that the tile is referred to by inter-layer prediction in a higher layer.

Note that the process shown in FIG. 14 may be used instead of the process described with reference to FIG. 1. FIG. 14 differs from the flow of FIG. 1 in that, when the determination corresponding to S101 or S102 of FIG. 1 is NO, the target slice is not set directly as a non-skip slice. Instead, S106a is executed, which first sets the value of non_significant_tile_flag to 0. The skip slice determination based on FIG. 14 is executed according to the following steps S101a to S106a. The processing of S103, S104, and S105 is the same as the processing with the same reference signs described with reference to FIG. 1, and its description is omitted.

(S101a) If the number of tiles in the picture is 2 or more (YES in S101a), the process proceeds to S102a. Otherwise (NO in S101a), the process proceeds to S106a.

(S102a) If, for all tiles included in the sequence, either motion dependency or inter-layer dependency is absent (YES in S102a), the process proceeds to S103. Otherwise (NO in S102a), the process proceeds to S106a.

(S106a) The value of the non-important tile flag (non_significant_tile_flag) is set to 0, and the process proceeds to S103.

The determination process described with reference to FIG. 14 has the advantage that it can be implemented with fewer determinations than the process described with reference to FIG. 1. The reason is as follows. The number of tiles in a picture can be determined from the PPS tile information, so the determination in S101a can be executed once per PPS and its result stored. Likewise, the all-tiles dependency identifier used for the determination in S102a is included in the SPS tile information, so the determination in S102a can be executed once per SPS and its result stored.

Note that, when non_significant_tile_flag is decoded based on the syntax table shown in FIG. 13, the same skip slice determination as the process described with reference to FIG. 14 can also be realized by setting the inferred value of non_significant_tile_flag to 0 when it is not present in the encoded data. This is because, in the syntax table of FIG. 13, non_significant_tile_flag is decoded only when S101a is YES (tiles_enabled_flag is 1) and S102a is YES (all_tiles_decoding_dependency_idc > 0).
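
The inference-based variant can be sketched as below, in Python for illustration only. The function names and the use of `None` to represent a flag that is absent from the encoded data are assumptions made for this sketch.

```python
from typing import Optional


def infer_non_significant_tile_flag(tiles_enabled_flag: int,
                                    all_tiles_decoding_dependency_idc: int,
                                    coded_flag: Optional[int]) -> int:
    """Return the flag value examined in S103.

    coded_flag is None when the flag is not present in the encoded data.
    """
    flag_is_coded = (tiles_enabled_flag == 1
                     and all_tiles_decoding_dependency_idc > 0)
    if not flag_is_coded or coded_flag is None:
        return 0  # corresponds to S106a: the value is taken to be 0
    return coded_flag


def is_skip_slice_fig14(tiles_enabled_flag: int,
                        all_tiles_decoding_dependency_idc: int,
                        coded_flag: Optional[int]) -> bool:
    # S103 then S104 (skip slice) or S105 (non-skip slice)
    return infer_non_significant_tile_flag(
        tiles_enabled_flag, all_tiles_decoding_dependency_idc, coded_flag) == 1
```

Because the two presence conditions depend only on PPS-level and SPS-level values, an implementation could evaluate `flag_is_coded` once per parameter set and reuse it for every slice, which is the saving the text describes.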

(CTU decoding unit)
In outline, the CTU decoding unit 144 generates the decoded image of a slice by decoding, based on the input slice header, slice data, and parameter set, the decoded image of the region corresponding to each CTU included in the slice. The decoded image of the slice is output as a part of the decoded picture, at the position indicated by the input slice position. Depending on the result from the skip slice determination unit 143, the decoded image of each CTU is produced by either the non-skip CTU decoding unit 144N or the skip CTU generation unit 144S included in the CTU decoding unit 144.

The non-skip CTU decoding unit 144N decodes and outputs the decoded image of each CTU included in a non-skip slice, based on the input slice header and slice data. Specifically, it decodes the PT information included in the slice data to generate a predicted image of the target CTU, decodes the TT information included in the slice data to generate a prediction residual of the target CTU, and generates the decoded image of the target CTU from the predicted image and the prediction residual.

(Skip CTU generation unit)
The skip CTU generation unit 144S generates and outputs the decoded image of each CTU included in a skip slice, based on the input slice header. The pixel value p(x,y) of the decoded image of a CTU generated by the skip CTU generation unit 144S can be obtained, for example, by the following equation, where (x,y) is the pixel position within the CTU and BitDepth is the bit depth of the output image.

p(x,y) = 1 << (BitDepth - 1)
In other words, every pixel of the decoded image is set to a fixed value that depends only on the bit depth, regardless of its position within the CTU: the maximum value representable at that bit depth, plus 1, divided by 2. For example, when the bit depth of the output image is 8 bits, the value 128 is set as the decoded pixel value of every pixel in the skip CTU.
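
A minimal sketch of this constant fill, in Python for illustration: the function name `skip_ctu_samples` and the representation of a CTU as a square list of rows are assumptions, not part of the described apparatus.

```python
from typing import List


def skip_ctu_samples(ctb_log2_size_y: int, bit_depth: int) -> List[List[int]]:
    """Build the decoded samples of one skip CTU.

    Every sample is 1 << (BitDepth - 1), the mid-level value for the bit depth.
    """
    size = 1 << ctb_log2_size_y       # CTU width and height in luminance pixels
    value = 1 << (bit_depth - 1)      # 128 for 8 bits, 512 for 10 bits
    return [[value] * size for _ in range(size)]
```

Note that no slice data is read anywhere in this function, which matches the requirement that the skip CTU be generated from the slice header alone.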

Note that the pixel values of the decoded image need not be generated by the above method. They may be derived by another method that does not depend on slice data. For example, when the parameter set indicates that decoded pixels of a reference picture or an inter-layer reference picture can be referred to, the decoded pixels of that reference picture or inter-layer reference picture may be copied and used as the pixel values of the skip CTU. When an inter-layer reference picture is used, the decoded image may be scaled as necessary before being copied.

(Slice decoding process flow)
The slice decoding process in the slice decoding unit 14 is described with reference to FIG. 15. FIG. 15 is a flowchart showing the procedure of the slice decoding process.

(S201) The slice header decoding unit 141 decodes the slice header from the VCL NAL based on the parameter set. It outputs the slice header to the slice position setting unit 142, and outputs the slice header and the slice data included in the VCL NAL to the skip slice determination unit 143. The process proceeds to S202.

(S202) The slice position setting unit 142 derives the slice position based on the tile information and the slice header, and outputs it to the CTU decoding unit 144. The process proceeds to S203.

(S203) The skip slice determination unit 143 determines, based on the slice header, whether the target slice is a skip slice or a non-skip slice. If the target slice is a non-skip slice, the slice header and the slice data are output to the non-skip CTU decoding unit 144N, and the process proceeds to S204a. If the target slice is a skip slice, the slice header is output to the skip CTU generation unit 144S, and the process proceeds to S204b.

(S204a) The non-skip CTU decoding unit 144N sequentially decodes the CTUs from the slice data and outputs the decoded image of each decoded CTU as the partial region of the decoded picture corresponding to the position indicated by the input slice position. The process proceeds to S205.

(S204b) The skip CTU generation unit 144S generates, by a predetermined derivation process, decoded images for the number of CTUs indicated by the slice header, and outputs the decoded image of each generated CTU as the partial region of the decoded picture corresponding to the position indicated by the input slice position. The process proceeds to S205.

(S205) The CTU decoding unit 144 outputs the decoded images of the CTUs included in the slice, as output from the non-skip CTU decoding unit 144N or the skip CTU generation unit 144S, as the decoded image of the slice, which is a part of the decoded picture.
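
The branch between S204a and S204b can be sketched as follows. This is a simplified Python illustration under stated assumptions: the `SliceHeader` fields, the callables `decode_ctu` and `generate_skip_ctu`, and the representation of slice data as one byte string per CTU are all hypothetical and stand in for the units described above.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

Block = List[List[int]]        # decoded samples of one CTU
Position = Tuple[int, int]     # CTU position in luminance pixels


@dataclass
class SliceHeader:
    is_skip_slice: bool        # outcome of the S203 determination
    num_ctus: int              # number of CTUs in the slice


def decode_slice(header: SliceHeader,
                 slice_data: Optional[Sequence[bytes]],
                 ctu_positions: Sequence[Position],
                 decode_ctu: Callable[[bytes], Block],
                 generate_skip_ctu: Callable[[], Block]
                 ) -> List[Tuple[Position, Block]]:
    if header.is_skip_slice:
        # S204b: generate as many CTUs as the header indicates; no slice data used.
        blocks = [generate_skip_ctu() for _ in range(header.num_ctus)]
    else:
        # S204a: decode each CTU from the slice data in order.
        blocks = [decode_ctu(ctu_bits) for ctu_bits in (slice_data or [])]
    # S205: pair each CTU image with its position in the decoded picture.
    return list(zip(ctu_positions, blocks))
```

In the skip branch the `slice_data` argument is never touched, so it can be passed as `None`.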

(Effect of video decoding device 1)
The hierarchical video decoding device 1 (hierarchical image decoding device) according to the present embodiment described above is an image decoding device that decodes the upper-layer encoded data included in hierarchically encoded data and restores the decoded picture of the upper layer. It includes a slice header decoding unit 141 that decodes the slice header of the upper layer, a non-skip CTU decoding unit 144N that decodes, based on slice data, the decoded image of each CTU belonging to a non-skip slice of the upper layer, and a skip CTU generation unit 144S that generates the decoded image of each CTU belonging to a skip slice of the upper layer. The slice header decoding unit 141 decodes or infers a skip slice flag indicating whether the target slice is a skip slice or a non-skip slice. When the skip slice flag indicates that the target slice is a skip slice, the slice header decoding unit 141 decodes the number of skip CTUs, and the decoded image of the target slice is generated by having the skip CTU generation unit 144S generate decoded images for as many CTUs as the number of skip CTUs indicates. When the skip slice flag indicates that the target slice is a non-skip slice, the decoded image of the target slice is generated by having the non-skip CTU decoding unit 144N decode the CTUs included in the target slice.

As described above, when the value of the skip slice flag included in the slice header of the upper layer indicates that the target slice is a skip slice, the hierarchical video decoding device 1 according to the present invention can generate the decoded image in the target slice with the skip CTU generation unit, without using slice data. Because the skip CTU generation unit generates the decoded image without using slice data, the image quality is lower than when the non-skip CTU decoding unit decodes the image with reference to slice data, but the decoded image is generated with less processing and from less encoded data. Therefore, when decoding hierarchically encoded data in which slices inside the region of interest are encoded as non-skip slices and slices outside the region of interest are encoded as skip slices, the hierarchical video decoding device 1 can generate a decoded image from encoded data with a small code amount and with a small amount of processing, and output it as a decoded picture, without impairing the quality of the decoded image inside the region of interest.

[Modification 1: Display using a conformance window]
This section describes a configuration of the hierarchical video decoding device 1 according to the present embodiment in which display area information is decoded from the parameter set and the partial region of the decoded picture indicated by that information is output to the outside as the final output picture. In this configuration, the display area information decoded from the parameter set is restricted to a partial region of the decoded picture that contains no skip slice. The specific configuration and procedure are described below.

The parameter set decoding unit 12 decodes the display area information from the input target layer encoded data DATA#T. The display area information is included, for example, in the SPS and is decoded according to the syntax table shown in FIG. 16. FIG. 16 shows the part of the syntax table referred to by the parameter set decoding unit 12 during SPS decoding that relates to the display area information.

The display area information decoded from the SPS includes a display area flag (conformance_flag). The display area flag indicates whether information representing the position of the display area (display area position information) is additionally included in the SPS. That is, a display area flag of 1 indicates that the display area position information is additionally included, and a display area flag of 0 indicates that it is not.

When the display area flag is 1, the display area information decoded from the SPS further includes, as display area position information, a display area left offset (conf_win_left_offset), a display area right offset (conf_win_right_offset), a display area top offset (conf_win_top_offset), and a display area bottom offset (conf_win_bottom_offset).

When the display area flag is 0, the display area is set to the entire picture. When the display area flag is 1, the display area is set to the partial region of the picture indicated by the display area position information. The display area is also called a conformance window.

The relationship between the display area position information and the display area is described with reference to FIG. 17. FIG. 17 illustrates the relationship between the display area, which is a partial region of the picture, and the display area position information. As shown in the figure, the display area is contained within the picture. The display area top offset is the distance between the top edge of the picture and the top edge of the display area, the display area left offset is the distance between the left edge of the picture and the left edge of the display area, the display area right offset is the distance between the right edge of the picture and the right edge of the display area, and the display area bottom offset is the distance between the bottom edge of the picture and the bottom edge of the display area. The display area position information therefore uniquely specifies the position and size of the display area within the picture. The display area information may be any other information that uniquely specifies the position and size of the display area within the picture.
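
How the four offsets determine the display area can be sketched as follows, in Python for illustration. The function name `display_area` is hypothetical, and the sketch assumes all offsets are already expressed in luminance pixels, which the text does not state explicitly.

```python
from typing import Tuple


def display_area(pic_width: int, pic_height: int, conformance_flag: int,
                 left: int = 0, right: int = 0,
                 top: int = 0, bottom: int = 0) -> Tuple[int, int, int, int]:
    """Return the display area as (x, y, width, height) within the picture."""
    if conformance_flag == 0:
        # No position information: the display area is the whole picture.
        return (0, 0, pic_width, pic_height)
    # Each offset is the distance from a picture edge to the matching
    # display-area edge, so position and size follow directly.
    return (left, top, pic_width - left - right, pic_height - top - bottom)
```

For instance, a 1920x1088 picture with a bottom offset of 8 and all other offsets 0 would give a 1920x1080 display area starting at the top-left corner.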

The display area is restricted so as not to include any region corresponding to a skip slice. In other words, the display area specified by the display area position information decoded from the parameter set is restricted so as to contain no skip slice. Because the decoded image of a skip slice generally has lower image quality than the decoded image of a non-skip slice, restricting the display area to non-skip slices keeps the image quality of the partial region of the decoded picture output by the hierarchical video decoding device high.

In general, determining the position of a slice within a picture is harder than determining the position of a tile within a picture. This is because the tile positions can be calculated from the PPS, whereas the slice position (the position of each CTU the slice contains) cannot be identified until the slice header and slice data are decoded. Therefore, by requiring that the display area not include any region corresponding to a tile that contains a skip slice, it becomes easier to determine whether the display area satisfies the restriction than when requiring only that the display area not include a skip slice.

Note that a configuration that neither outputs nor refers to the region corresponding to a skip slice may omit the skip CTU generation unit 144S. In that case, when the display area does not include, and decoding does not refer to, any region corresponding to a tile that contains a skip slice, derivation of the decoded image of the skip slice portion by the skip CTU generation unit 144S can be omitted, which further reduces the amount of processing.

(Configuration of hierarchical video encoding device)
A schematic configuration of the hierarchical video encoding device 2 is described with reference to FIG. 18. FIG. 18 is a functional block diagram showing the schematic configuration of the hierarchical video encoding device 2. The hierarchical video encoding device 2 encodes the input image PIN#T of the target layer while referring to the reference layer encoded data DATA#R, and generates the hierarchically encoded data DATA of the target layer. The reference layer encoded data DATA#R is assumed to have already been encoded by the hierarchical video encoding device corresponding to the reference layer.

As shown in FIG. 18, the hierarchical video encoding device 2 includes a NAL multiplexing unit 21, a parameter set encoding unit 22, a tile setting unit 23, a slice encoding unit 24, a decoded picture management unit 16, and a base decoding unit 15.

The NAL multiplexing unit 21 stores the input target layer encoded data DATA#T and the reference layer encoded data DATA#R in NAL units, thereby generating NAL-multiplexed hierarchical video encoded data DATA, and outputs it to the outside.

The parameter set encoding unit 22 sets the parameter sets (VPS, SPS, and PPS) used for encoding the input image based on the input tile information and the input image, packetizes them in the VCL NAL format as a part of the target layer encoded data DATA#T, and supplies them to the NAL multiplexing unit 21.

The parameter set encoded by the parameter set encoding unit 22 preferably includes the display area information described with reference to FIGS. 16 and 17. As described for the configuration of Modification 1 of the hierarchical video decoding device 1, the display area information is preferably restricted so as not to include any region corresponding to a skip slice. Alternatively, the display area information is preferably restricted so as not to include any region corresponding to a tile that contains a skip slice.

The tile setting unit 23 sets the tile information of a picture based on the input image and supplies it to the parameter set encoding unit 22 and the slice encoding unit 24. For example, it sets tile information indicating that the picture is divided into M×N tiles, where M and N are arbitrary positive integers. As another example, the tile information may be set so that the picture is divided into tiles of a predetermined size (for example, tiles of 128×128 pixels).
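
For the fixed-size example, the resulting tile grid can be estimated as sketched below, in Python for illustration. The function name `tile_grid_for_fixed_size` is hypothetical, and the sketch assumes that tiles at the right and bottom picture edges may be smaller than the nominal size, which the text does not specify.

```python
from typing import Tuple


def tile_grid_for_fixed_size(pic_width: int, pic_height: int,
                             tile_width: int = 128,
                             tile_height: int = 128) -> Tuple[int, int]:
    """Return (columns, rows) of tiles needed to cover the picture."""
    # Ceiling division: a partial tile at the edge still counts as a tile.
    columns = -(-pic_width // tile_width)
    rows = -(-pic_height // tile_height)
    return columns, rows
```

Under these assumptions a 1920x1080 picture with 128x128 tiles would need 15 tile columns and 9 tile rows, the last row being only partially filled.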

Based on the input image, the parameter set, the tile information, and the reference pictures recorded in the decoded picture management unit 16, the slice encoding unit 24 encodes the part of the input image corresponding to each slice constituting the picture, generates the encoded data of that part, and supplies it to the NAL multiplexing unit 21 as a part of the target layer encoded data DATA#T. The slice encoding unit 24 is described in detail later.

The decoded picture management unit 16 is the same component as the decoded picture management unit 16 of the hierarchical video decoding device 1 already described. However, the decoded picture management unit 16 of the hierarchical video encoding device 2 does not need to output the pictures recorded in its internal DPB as output pictures, so that output can be omitted. The description given for the decoded picture management unit 16 of the hierarchical video decoding device 1 also applies to that of the hierarchical video encoding device 2 by reading "decoding" as "encoding".

The base decoding unit 15 is the same component as the base decoding unit 15 of the hierarchical video decoding device 1 already described, and its detailed description is omitted.

(Slice encoding unit)
Next, the configuration of the slice encoding unit 24 is described in detail with reference to FIG. 19. FIG. 19 is a functional block diagram showing the schematic configuration of the slice encoding unit 24.

As shown in FIG. 19, the slice encoding unit 24 includes a slice header setting unit 241, a slice position setting unit 242, a skip slice determination unit 243, and a CTU encoding unit 244. The CTU encoding unit 244 internally includes a non-skip CTU encoding unit 244N and a skip CTU generation unit 244S.

The slice header setting unit 241 generates, based on the input parameter set and slice position information, the slice header used for encoding the input image, which is input in units of slices. The generated slice header is output as a part of the slice encoded data and is also supplied, together with the input image, to the skip slice determination unit 243.

スライスヘッダ設定部２４１で生成されるスライスヘッダには、図１２を参照して説明したSHスライス位置情報、および、図１３を参照して説明したスライススキップ情報が少なくとも含まれる。 The slice header generated by the slice header setting unit 241 includes at least the SH slice position information described with reference to FIG. 12 and the slice skip information described with reference to FIG. 13.

スライス位置設定部２４２は、入力されるタイル情報に基づいてピクチャ内のスライス位置を決定してスライスヘッダ設定部２４１に供給する。 The slice position setting unit 242 determines a slice position in the picture based on the input tile information and supplies the slice position to the slice header setting unit 241.

スキップスライス判定部２４３は、階層動画像復号装置１のスライス復号部１４に含まれるスキップスライス判定部１４３に対応する構成要素であり、入力されるスライスヘッダに基づいて、対象スライスがスキップスライスか否かを判定する。スキップスライス判定部２４３における上記判定は、スキップスライス判定部１４３と同一の判定基準が利用できる。スキップスライス判定部２４３は、対象スライスがスキップスライスと判定した場合、スライスヘッダをCTU符号化部２４４内のスキップCTU生成部２４４Ｓに出力する。対象スライスが非スキップスライスと判定した場合、スライスヘッダおよび入力画像（対象スライス部分）をCTU符号化部２４４内の非スキップCTU符号化部２４４Ｎに出力する。 The skip slice determination unit 243 corresponds to the skip slice determination unit 143 included in the slice decoding unit 14 of the hierarchical video decoding device 1, and determines, based on the input slice header, whether the target slice is a skip slice. This determination can use the same criteria as the skip slice determination unit 143. When the target slice is determined to be a skip slice, the skip slice determination unit 243 outputs the slice header to the skip CTU generation unit 244S in the CTU encoding unit 244. When the target slice is determined to be a non-skip slice, it outputs the slice header and the input image (the target slice portion) to the non-skip CTU encoding unit 244N in the CTU encoding unit 244.

CTU符号化部２４４は、入力されるパラメータセット、スライスヘッダに基づいて、入力画像（対象スライス部分）をCTU単位で符号化して、対象スライスに係るスライスデータおよび復号画像（復号ピクチャ）を生成して出力する。CTUの符号化は、スキップスライス判定部２４３の判定結果に従って、非スキップCTU符号化部２４４Ｎ、または、スキップCTU生成部２４４Ｓの何れかで実行される。 The CTU encoding unit 244 encodes the input image (the target slice portion) CTU by CTU based on the input parameter set and slice header, and generates and outputs the slice data and the decoded image (decoded picture) of the target slice. Each CTU is encoded by either the non-skip CTU encoding unit 244N or the skip CTU generation unit 244S, according to the determination result of the skip slice determination unit 243.

非スキップCTU符号化部２４４Ｎは、非スキップスライスに含まれるCTUを符号化することで対応するスライスデータの部分データ、および、当該CTUの復号画像を生成する。符号化では、まず、入力画像の性質に基づいて対象CTUの予測パラメータ（ＰＴ情報）を導出する。次に、前記予測パラメータに基づいて対象CTUの予測画像を生成し、入力画像と予測画像の差分を予測残差情報（ＴＴ情報）として生成する。ＰＴ情報とＴＴ情報を合わせて、対象CTUに相当するスライスデータの部分データを生成するとともに、ＴＴ情報を復号して対象CTUの予測残差を生成し、当該予測画像と予測残差に基づいて対象CTUの復号画像を生成する。 The non-skip CTU encoding unit 244N encodes each CTU included in a non-skip slice, thereby generating the corresponding partial data of the slice data and the decoded image of that CTU. In encoding, the prediction parameters (PT information) of the target CTU are first derived based on the properties of the input image. Next, a predicted image of the target CTU is generated based on those prediction parameters, and the difference between the input image and the predicted image is generated as prediction residual information (TT information). The PT information and the TT information are combined to generate the partial data of the slice data corresponding to the target CTU. In addition, the TT information is decoded to generate the prediction residual of the target CTU, and the decoded image of the target CTU is generated from the predicted image and the prediction residual.
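The residual and reconstruction steps described for the non-skip CTU encoding unit 244N can be illustrated with a minimal numeric sketch. It is not the encoder of this embodiment: the transform is omitted, plain uniform quantization stands in for the TT information, and the function name `encode_ctu_samples` and the parameter `qstep` are hypothetical names introduced only for illustration.

```python
from typing import List, Tuple


def encode_ctu_samples(src: List[int], pred: List[int],
                       qstep: int = 4) -> Tuple[List[int], List[int]]:
    """Return (quantized residual levels, reconstructed samples).

    Assumption: uniform quantization with step `qstep` replaces the real
    transform and quantization that would produce the TT information.
    """
    # Difference between the input image and the predicted image.
    residual = [s - p for s, p in zip(src, pred)]
    # Stand-in for the TT information written to the slice data.
    levels = [round(r / qstep) for r in residual]
    # "Decode" the TT information to recover the prediction residual.
    recon_residual = [lv * qstep for lv in levels]
    # Decoded image = predicted image + decoded prediction residual.
    recon = [p + r for p, r in zip(pred, recon_residual)]
    return levels, recon


levels, recon = encode_ctu_samples([100, 104, 91, 77], [96, 96, 96, 80])
```

The reconstruction is computed from the quantized levels rather than from the original residual, so the encoder holds the same decoded image that a decoder would obtain.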

スキップCTU生成部２４４Ｓは、入力されるスライスヘッダに基づいて、スキップスライスに含まれるCTUの復号画像を生成して出力する。スキップCTU生成部２４４Ｓの復号画像生成の処理は、階層動画像復号装置１のスライス復号部１４が含むスキップCTU生成部１４４Ｓと同一の処理を用いることができる。 The skip CTU generation unit 244S generates and outputs the decoded images of the CTUs included in a skip slice based on the input slice header. Its decoded image generation can use the same processing as the skip CTU generation unit 144S included in the slice decoding unit 14 of the hierarchical video decoding device 1.

（動画像符号化装置２の効果） (Effects of the video encoding device 2)
以上説明した本実施形態に係る階層動画像符号化装置２は、入力画像から上位レイヤの符号化データを生成する階層動画像符号化装置（階層画像符号化装置）であって、上位レイヤのスライスヘッダを符号化するスライスヘッダ符号化部２４と、上位レイヤの非スキップスライスに属するＣＴＵの復号画像をスライスデータに基づいて符号化する非スキップＣＴＵ符号化部２４４Ｎと、上位レイヤのスキップスライスに属するＣＴＵの復号画像を生成するスキップＣＴＵ生成部２４４Ｓを備えている。 The hierarchical video encoding device 2 according to the present embodiment described above is a hierarchical video encoding device (hierarchical image encoding device) that generates encoded data of an upper layer from an input image. It includes a slice header encoding unit 24 that encodes the slice header of the upper layer, a non-skip CTU encoding unit 244N that encodes, based on slice data, the decoded images of CTUs belonging to non-skip slices of the upper layer, and a skip CTU generation unit 244S that generates the decoded images of CTUs belonging to skip slices of the upper layer.

したがって、本発明に係る階層動画像符号化装置２は、注目領域外の領域に代表される重要度が比較的低い領域をスキップスライスを用いて符号化できる。その場合、非スキップスライスを用いて符号化する場合に較べて復号画像の画質は低いが、より少ない処理量で、より少ない符号量の符号化データが生成できる。したがって、階層動画像符号化装置２は、注目領域に含まれる領域内のスライスを非スキップスライス、注目領域に含まれない領域内のスライスをスキップスライスとして入力画像を符号化して階層符号化データを生成する場合に、注目領域内の復号画像の品質を損なうことなく、少ない符号量の階層符号化データを生成できる。 Therefore, the hierarchical video encoding apparatus 2 according to the present invention can encode a region having a relatively low importance represented by a region outside the region of interest using a skip slice. In that case, the image quality of the decoded image is lower than when encoding using non-skip slices, but encoded data with a smaller code amount can be generated with a smaller amount of processing. Therefore, the hierarchical video encoding apparatus 2 encodes an input image by using a slice in a region included in the region of interest as a non-skip slice and a slice in a region not included in the region of interest as a skip slice, and generates hierarchical encoded data. In the case of generation, hierarchically encoded data with a small code amount can be generated without deteriorating the quality of the decoded image in the attention area.

また、階層動画像符号化装置２は、表示領域情報を符号化するパラメータセット符号化部を備えており、該表示領域情報はスキップスライスを含むタイルを含まないように制限されている。したがって、階層動画像符号化装置２により符号化された階層符号化データを復号して再生した場合であっても、品質の低い復号画像に相当する領域が再生することを抑止できるため、再生端末による再生画像の品質を保証できる。 In addition, the hierarchical video encoding device 2 includes a parameter set encoding unit that encodes display area information, and the display area information is restricted so that the display area contains no tile that includes a skip slice. Therefore, even when the hierarchically encoded data generated by the hierarchical video encoding device 2 is decoded and played back, regions corresponding to low-quality decoded images are kept from being displayed, so the quality of the image reproduced on the playback terminal can be guaranteed.

〔階層符号化データ変換装置３〕 [Hierarchical encoded data conversion device 3]
図２０を用いて、階層符号化データ変換装置３の概略構成を説明する。図２０は、階層符号化データ変換装置３の概略的構成を示した機能ブロック図である。階層符号化データ変換装置３は、入力される階層符号化データＤＡＴＡを変換して、入力される注目領域情報に係る階層符号化データＤＡＴＡ−ＲＯＩを生成する。なお、階層符号化データＤＡＴＡは階層動画像符号化装置２により生成された階層符号化データである。また、階層符号化データＤＡＴＡ−ＲＯＩを階層動画像復号装置１に入力することで注目領域情報に係る上位レイヤの動画像を再生できる。 The schematic configuration of the hierarchical encoded data conversion device 3 is described with reference to FIG. 20. FIG. 20 is a functional block diagram showing the schematic configuration of the hierarchical encoded data conversion device 3. The hierarchical encoded data conversion device 3 converts the input hierarchically encoded data DATA to generate hierarchically encoded data DATA-ROI corresponding to the input attention area information. The hierarchically encoded data DATA is hierarchically encoded data generated by the hierarchical video encoding device 2. By inputting the hierarchically encoded data DATA-ROI to the hierarchical video decoding device 1, the upper-layer video corresponding to the attention area information can be played back.

図２０に示すように、階層符号化データ変換装置３は、NAL逆多重化部１１、NAL多重化部２１、パラメータセット復号部１２、タイル設定部１３、パラメータセット修正部３２、スライス修正部３４を含む。 As shown in FIG. 20, the hierarchical encoded data conversion device 3 includes a NAL demultiplexing unit 11, a NAL multiplexing unit 21, a parameter set decoding unit 12, a tile setting unit 13, a parameter set correction unit 32, and a slice correction unit 34.

NAL逆多重化部１１、パラメータセット復号部１２、タイル設定部１３は、それぞれ、階層動画像復号装置１が含む同名の構成要素と同じ機能を有するため、同一の符号を付与して説明を省略する。 The NAL demultiplexing unit 11, the parameter set decoding unit 12, and the tile setting unit 13 each have the same function as the component of the same name in the hierarchical video decoding device 1. They are therefore given the same reference numerals, and their description is omitted.

NAL多重化部２１は、階層動画像符号化装置２が含む同名の構成要素と同じ機能を有するため、同一の符号を付与して説明を省略する。 Since the NAL multiplexing unit 21 has the same function as the component of the same name included in the hierarchical video encoding device 2, the same reference numeral is assigned and the description thereof is omitted.

パラメータセット修正部３２は、入力される注目領域情報とタイル情報に基づいて、入力されるパラメータセット情報を修正して出力する。 The parameter set correction unit 32 corrects and outputs the input parameter set information based on the input attention area information and tile information.

注目領域情報は、動画像を構成するピクチャにおいて、ユーザー（例えば再生動画像の視聴者）が指定するピクチャの部分領域である。注目領域情報は、例えば矩形の領域で指定される。その場合、例えば、注目領域を表わす矩形の上辺、下辺、左辺、右辺のピクチャ全体の対応する辺（上辺、下辺、左辺、または、右辺）からの位置のオフセットを注目領域情報として指定できる。なお、矩形以外の形状の領域（例えば、円、多角形、物体抽出により抽出した物体を示す領域）を注目領域として使用してもよいが、以下では説明の簡単のため矩形の注目領域を想定する。なお、矩形以外の領域に対して、以下に記載する内容を適用する場合、例えば、注目領域を包含する面積最小の矩形を以下の説明における注目領域とみなして適用できる。 The attention area information represents a partial region of the pictures constituting the video, designated by a user (for example, a viewer of the reproduced video). The attention area is specified, for example, as a rectangular region. In that case, the attention area information can be given as the offsets of the top, bottom, left, and right sides of the rectangle from the corresponding sides of the whole picture. A region of another shape (for example, a circle, a polygon, or a region indicating an object obtained by object extraction) may also be used as the attention area, but a rectangular attention area is assumed below to simplify the description. When the following description is applied to a non-rectangular region, the minimum-area rectangle enclosing that region can, for example, be treated as the attention area.
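As a minimal sketch of this representation, the following Python code derives the enclosing rectangle of an arbitrary set of pixel positions and expresses it as offsets from the picture sides. The function names, the axis-aligned reading of "minimum-area rectangle", and the convention that `right` and `bottom` are exclusive coordinates are all assumptions made for this illustration.

```python
from typing import Dict, Iterable, Tuple

Rect = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def bounding_rect(points: Iterable[Tuple[int, int]]) -> Rect:
    """Smallest axis-aligned rectangle enclosing all (x, y) pixel positions."""
    pts = list(points)
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


def roi_offsets(rect: Rect, pic_width: int, pic_height: int) -> Dict[str, int]:
    """Offsets of each rectangle side from the corresponding picture side."""
    left, top, right, bottom = rect
    return {
        "left": left,
        "right": pic_width - right,
        "top": top,
        "bottom": pic_height - bottom,
    }


rect = bounding_rect([(10, 20), (30, 5), (25, 40)])
offsets = roi_offsets(rect, pic_width=64, pic_height=48)
```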

パラメータセット修正部３２は、入力される注目領域情報の示す注目領域と一致するように、入力されるパラメータセットに含まれるSPSの表示領域情報を書き換える。SPSの表示領域情報として図１６を参照して説明したシンタックスを用いる場合、表示領域情報は次のＳ３０１からＳ３０３の手順で書き換えられる。 The parameter set correction unit 32 rewrites the display area information of the SPS included in the input parameter set so as to match the attention area indicated by the input attention area information. When the syntax described with reference to FIG. 16 is used as the display area information of SPS, the display area information is rewritten by the following steps S301 to S303.

（Ｓ３０１）注目領域がピクチャ全体と一致するか否かを判定する。一致する場合、Ｓ３０２に進み、一致しない場合、Ｓ３０３に進む。 (S301) It is determined whether or not the attention area matches the entire picture. If they match, the process proceeds to S302, and if they do not match, the process proceeds to S303.

（Ｓ３０２）上書き前の表示領域フラグの値が１であった場合には、当該表示領域フラグの値を０に上書きし、かつ、表示領域オフセット（conf_win_left_offset、conf_win_right_offset、conf_win_top_offset、conf_win_bottom_offset）をSPSから取り除いて処理を終了する。 (S302) If the value of the display area flag before overwriting is 1, the flag is overwritten with 0, the display area offsets (conf_win_left_offset, conf_win_right_offset, conf_win_top_offset, conf_win_bottom_offset) are removed from the SPS, and the process ends.

（Ｓ３０３）表示領域フラグの値を１に上書きする。表示領域オフセットの各オフセットを注目領域を表わす矩形の各辺のピクチャの対応する辺との位置のオフセットの値に設定する。例えば、注目領域上辺のピクチャ上辺に対する位置オフセットを表示領域上オフセット（conf_win_top_offset）の値に設定する。なお、書き換え前の表示領域フラグの値が１であった場合には、上記設定した表示領域オフセットの値を用いて、元の表示領域オフセットの値を上書きする。書き換え前の表示領域フラグの値が０であった場合には、上記設定した表示領域オフセットをSPSの表示領域フラグの直後に挿入する。 (S303) The value of the display area flag is overwritten with 1. Each display area offset is set to the offset of the corresponding side of the rectangle representing the attention area from the corresponding side of the picture. For example, the offset of the top side of the attention area from the top side of the picture is set as the value of the display area top offset (conf_win_top_offset). If the value of the display area flag before rewriting was 1, the original display area offset values are overwritten with the values set above. If the value of the display area flag before rewriting was 0, the display area offsets set above are inserted immediately after the display area flag in the SPS.
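Steps S301 to S303 can be sketched as follows. This is an illustration under stated assumptions, not the bitstream-level procedure: the `Sps` class is a hypothetical in-memory stand-in holding only the display area fields, `display_area_flag` is an illustrative field name, and the offsets are kept in luma samples without any unit scaling that a real SPS syntax may apply.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Rect:
    left: int
    top: int
    right: int    # exclusive
    bottom: int   # exclusive


@dataclass
class Sps:
    """Hypothetical, simplified SPS: display area fields only."""
    pic_width: int
    pic_height: int
    display_area_flag: int = 0
    # (left, right, top, bottom) offsets; present only when the flag is 1.
    conf_win_offsets: Optional[Tuple[int, int, int, int]] = None


def rewrite_display_area(sps: Sps, roi: Rect) -> Sps:
    whole = Rect(0, 0, sps.pic_width, sps.pic_height)
    if roi == whole:                      # S301: attention area is the whole picture
        sps.display_area_flag = 0         # S302: clear the flag and drop the offsets
        sps.conf_win_offsets = None
        return sps
    sps.display_area_flag = 1             # S303: set the flag and (over)write offsets
    sps.conf_win_offsets = (
        roi.left,                         # conf_win_left_offset
        sps.pic_width - roi.right,        # conf_win_right_offset
        roi.top,                          # conf_win_top_offset
        sps.pic_height - roi.bottom,      # conf_win_bottom_offset
    )
    return sps


sps = rewrite_display_area(Sps(1920, 1080), Rect(480, 270, 1440, 810))
```

Because the sketch works on a parsed structure, overwriting existing offsets and inserting new ones collapse into the same assignment; the distinction between the two cases matters only when the SPS is edited as a serialized bitstream.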

スライス修正部３４は、入力されるパラメータセットとタイル情報に基づいて、入力されるスライスデータを注目領域情報を考慮したスライスデータに修正して出力する。 Based on the input parameter set and tile information, the slice correcting unit 34 corrects the input slice data to slice data considering the attention area information and outputs the slice data.

スライスデータの修正は、概略的には、注目領域に含まれないタイルに含まれるスライスをスキップスライスに設定し、少なくとも一部の領域が注目領域に含まれるタイルに含まれるスライスを非スキップスライスに設定する修正である。スライスデータ修正処理は次のＳ４０１〜Ｓ４０４に示す手順で実行される。 In outline, the slice data correction sets slices contained in tiles that lie outside the attention area to skip slices, and sets slices contained in tiles that at least partly overlap the attention area to non-skip slices. The slice data correction process is executed according to the following steps S401 to S404.

（Ｓ４０１）ピクチャが含む各タイルを順次対象タイルに設定し、以下のＳ４０２からＳ４０４の処理を実行する。 (S401) Each tile included in the picture is set in turn as the target tile, and the following processes S402 to S404 are executed.

（Ｓ４０２）対象タイルが、パラメータセットのSPSの含む表示領域情報が示す表示領域の完全に外側か否かを判定する。例えば、以下のＣ１〜Ｃ４に示す条件の何れかが真であれば、対象タイルが表示領域の外側と判定される。 (S402) It is determined whether the target tile lies completely outside the display area indicated by the display area information in the SPS of the parameter set. For example, if any of the conditions C1 to C4 below is true, the target tile is determined to be outside the display area.

Ｃ１：対象タイル右辺が表示領域左辺と一致、または、前者が後者の左に位置する、 C1: The right side of the target tile coincides with the left side of the display area, or the former lies to the left of the latter.
Ｃ２：対象タイル左辺が表示領域右辺と一致、または、前者が後者の右に位置する、 C2: The left side of the target tile coincides with the right side of the display area, or the former lies to the right of the latter.
Ｃ３：対象タイル上辺が表示領域下辺と一致、または、前者が後者の下に位置する、 C3: The top side of the target tile coincides with the bottom side of the display area, or the former lies below the latter.
Ｃ４：対象タイル下辺が表示領域上辺と一致、または、前者が後者の上に位置する。 C4: The bottom side of the target tile coincides with the top side of the display area, or the former lies above the latter.

対象タイルが表示領域の外側と判定された場合、Ｓ４０３に進む。それ以外の場合、Ｓ４０４に進む。 If it is determined that the target tile is outside the display area, the process proceeds to S403. Otherwise, the process proceeds to S404.

なお、上記判定処理において表示領域の代わりに注目領域情報を直接スライス修正部３４に入力して表示領域の代わりに注目領域を用いても構わない。 In the determination process, attention area information may be directly input to the slice correction unit 34 instead of the display area, and the attention area may be used instead of the display area.

（Ｓ４０３）対象タイルに含まれるスライスをスキップスライスに設定する。すなわち、対象タイルに含まれる全てのスライスに対して、スライスヘッダに含まれるスキップスライスフラグの値を１に上書きする。上書き前のスキップスライスフラグの値が０である場合、スライスに含まれるCTUの数をスキップCTU数としてスライスヘッダのスキップスライスフラグの直後に追加し、かつ、スライスデータを全て削除する。 (S403) A slice included in the target tile is set as a skip slice. That is, the value of the skip slice flag included in the slice header is overwritten with 1 for all slices included in the target tile. When the value of the skip slice flag before overwriting is 0, the number of CTUs included in the slice is added as the number of skip CTUs immediately after the skip slice flag in the slice header, and all slice data is deleted.

（Ｓ４０４）対象タイルに含まれるスライスを非スキップスライスに設定する。なお、通常はスライス修正前のスライスデータは非スキップスライスであり、その場合は、入力されたスライスヘッダとスライスデータをそのまま出力する。 (S404) A slice included in the target tile is set as a non-skip slice. Normally, slice data before slice correction is a non-skip slice, and in this case, the input slice header and slice data are output as they are.
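Steps S401 to S404 can be sketched as follows. The `Slice` and `Tile` classes are hypothetical in-memory stand-ins introduced for this illustration, and rectangle edges are given as boundary positions, so a tile whose right edge equals the left edge of the display area touches it without overlapping. The sketch treats a tile as outside when at least one of C1 to C4 holds, because no tile can lie on both sides of the display area at once.

```python
from dataclasses import dataclass, field
from typing import List, NamedTuple, Optional


class Rect(NamedTuple):
    left: int
    top: int
    right: int
    bottom: int


@dataclass
class Slice:
    num_ctus: int
    skip_slice_flag: int = 0
    num_skip_ctus: Optional[int] = None   # present only for skip slices
    slice_data: bytes = b""


@dataclass
class Tile:
    rect: Rect
    slices: List[Slice] = field(default_factory=list)


def tile_outside(tile: Rect, disp: Rect) -> bool:
    c1 = tile.right <= disp.left     # C1: tile is entirely to the left
    c2 = tile.left >= disp.right     # C2: tile is entirely to the right
    c3 = tile.top >= disp.bottom     # C3: tile is entirely below
    c4 = tile.bottom <= disp.top     # C4: tile is entirely above
    return c1 or c2 or c3 or c4


def correct_slices(tiles: List[Tile], disp: Rect) -> List[Tile]:
    for tile in tiles:                           # S401: visit every tile
        if tile_outside(tile.rect, disp):        # S402: outside the display area?
            for s in tile.slices:                # S403: convert to skip slices
                if s.skip_slice_flag == 0:
                    s.num_skip_ctus = s.num_ctus
                    s.slice_data = b""
                s.skip_slice_flag = 1
        # S404: slices of overlapping tiles are passed through unchanged
    return tiles


disp = Rect(64, 64, 192, 192)
tiles = [Tile(Rect(0, 0, 64, 64), [Slice(4, slice_data=b"abcd")]),
         Tile(Rect(64, 64, 128, 128), [Slice(4, slice_data=b"abcd")])]
correct_slices(tiles, disp)
```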

（階層符号化データ変換処理フロー） (Hierarchical encoded data conversion process flow)
階層符号化データ変換装置３による階層符号化データ変換処理は、Ｓ５０１〜Ｓ５０６に示す手順を順次実行することで実現される。 The hierarchical encoded data conversion process performed by the hierarchical encoded data conversion device 3 is realized by executing steps S501 to S506 in order.

（Ｓ５０１）NAL逆多重化部１１は、入力された階層符号化データＤＡＴＡを逆多重化する。得られた対象レイヤ符号化データＤＡＴＡ＃Ｔのうち、パラメータセットに係る部分（非VCL NAL）をパラメータセット復号部１２に出力し、スライスレイヤ（スライスヘッダ、スライスデータ）に係る部分をスライス修正部３４に出力する。得られた参照レイヤ符号化データＤＡＴＡ＃ＲはNAL多重化部２１に出力される。 (S501) The NAL demultiplexing unit 11 demultiplexes the input hierarchically encoded data DATA. Of the resulting target layer encoded data DATA#T, the portion related to the parameter sets (non-VCL NAL) is output to the parameter set decoding unit 12, and the portion related to the slice layer (slice header and slice data) is output to the slice correction unit 34. The resulting reference layer encoded data DATA#R is output to the NAL multiplexing unit 21.

（Ｓ５０２）パラメータセット復号部１２は、入力された非VCL NALからパラメータセット（VPS、SPS、PPS）を復号して、パラメータセット修正部３２とタイル設定部１３に出力する。 (S502) The parameter set decoding unit 12 decodes the parameter set (VPS, SPS, PPS) from the input non-VCL NAL and outputs it to the parameter set correction unit 32 and the tile setting unit 13.

（Ｓ５０３）パラメータセット修正部３２は、入力される注目領域情報に基づいて入力されるパラメータセットを修正し、NAL多重化部２１とスライス修正部３４に出力する。 (S503) The parameter set correction unit 32 corrects the input parameter set based on the input attention area information, and outputs it to the NAL multiplexing unit 21 and the slice correction unit 34.

（Ｓ５０４）タイル設定部１３は、入力されるパラメータセットからタイル情報を導出してスライス修正部３４に出力する。 (S504) The tile setting unit 13 derives tile information from the input parameter set and outputs the tile information to the slice correction unit 34.

（Ｓ５０５）スライス修正部３４は、入力されるタイル情報と修正後のパラメータセットに基づいて、入力されるスライスヘッダ、スライスデータを修正して、NAL多重化部２１に出力する。 (S505) The slice correcting unit 34 corrects the input slice header and slice data based on the input tile information and the corrected parameter set, and outputs them to the NAL multiplexing unit 21.

（Ｓ５０６）NAL多重化部２１は、入力される修正後のパラメータセットと修正後のスライスヘッダとスライスデータを修正後の対象レイヤの符号化データとして、入力される参照レイヤ符号化データＤＡＴＡ＃Ｒと多重化して階層符号化データＤＡＴＡ−ＲＯＩとして外部に出力する。 (S506) The NAL multiplexing unit 21 treats the input corrected parameter set, corrected slice header, and slice data as the corrected encoded data of the target layer, multiplexes them with the input reference layer encoded data DATA#R, and outputs the result to the outside as hierarchically encoded data DATA-ROI.
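The data flow of S501 to S506 can be summarized in a short sketch. The six stage functions are passed in as parameters because their internals are outside the scope of the sketch; their names and signatures are assumptions made for illustration and do not correspond to any real library.

```python
from typing import Any, Callable, Tuple


def convert_hierarchical_data(
    data: Any,
    roi: Any,
    *,
    demux: Callable[[Any], Tuple[Any, Any, Any]],
    decode_param_sets: Callable[[Any], Any],
    fix_param_sets: Callable[[Any, Any], Any],
    derive_tiles: Callable[[Any], Any],
    fix_slices: Callable[[Any, Any, Any], Any],
    mux: Callable[[Any, Any, Any], Any],
) -> Any:
    # S501: split DATA into non-VCL NAL, slice layer, and reference layer DATA#R.
    non_vcl, slice_layer, ref_layer = demux(data)
    # S502: decode the parameter sets (VPS, SPS, PPS).
    param_sets = decode_param_sets(non_vcl)
    # S503: rewrite the parameter sets according to the attention area.
    param_sets_roi = fix_param_sets(param_sets, roi)
    # S504: derive tile information from the decoded parameter sets.
    tiles = derive_tiles(param_sets)
    # S505: correct slice headers and slice data using tiles and corrected sets.
    slices_roi = fix_slices(slice_layer, tiles, param_sets_roi)
    # S506: multiplex the corrected target layer with DATA#R into DATA-ROI.
    return mux(param_sets_roi, slices_roi, ref_layer)
```

Note that the reference layer data bypasses every correction stage and reaches the multiplexer unchanged, which is why only the upper layer loses code amount in the conversion.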

（階層符号化データ変換装置３の効果） (Effects of the hierarchical encoded data conversion device 3)
以上説明した本実施形態に係る階層符号化データ変換装置３は、対象レイヤ（上位レイヤ）の符号化データを注目領域情報に基づいて修正するスライス修正部３４を備えている。スライス修正部３４は、注目領域情報の示す注目領域に基づいて、上位レイヤの符号化データに含まれるスライスであって、注目領域に含まれないスライスを非スキップスライスからスキップスライスに変更する。 The hierarchical encoded data conversion device 3 according to the present embodiment described above includes the slice correction unit 34, which corrects the encoded data of the target layer (upper layer) based on the attention area information. Based on the attention area indicated by the attention area information, the slice correction unit 34 changes those slices in the upper-layer encoded data that are not included in the attention area from non-skip slices to skip slices.

上記の階層符号化データ変換装置３によれば、入力された階層符号化データを変換して、上位レイヤにおいて注目領域に含まれないスライスをスキップスライスに変換した階層符号化データを生成できる。スキップスライスは非スキップスライスに較べて符号量が少ないため、変換前に較べて符号量の少ない符号化データを生成できる。加えて、変換後の符号化データにおいて、注目領域に関しては、非スキップスライスで符号化されているため、変換前に較べて注目領域の復号画像の画質は低下しない。 According to the above-described hierarchically encoded data conversion device 3, it is possible to convert the input hierarchically encoded data and generate hierarchically encoded data in which a slice not included in the attention area in the upper layer is converted into a skip slice. Since the skip slice has a smaller code amount than that of the non-skip slice, encoded data having a smaller code amount than before conversion can be generated. In addition, in the encoded data after conversion, since the region of interest is encoded with a non-skip slice, the image quality of the decoded image of the region of interest does not deteriorate compared to before conversion.

〔注目領域表示システム〕 [Attention area display system]
上述した階層動画像復号装置１、階層動画像符号化装置２、及び、階層符号化データ変換装置３を組み合わせて、注目領域情報を表示するシステム（注目領域表示システムＳＹＳ）を構成できる。 A system that displays attention area information (attention area display system SYS) can be configured by combining the hierarchical video decoding device 1, the hierarchical video encoding device 2, and the hierarchical encoded data conversion device 3 described above.

図２１に基づいて、上述した階層動画像復号装置１、階層動画像符号化装置２、及び、階層符号化データ変換装置３の組み合わせにより、注目領域表示システムが構成できることを説明する。図２１は、階層動画像復号装置１、階層動画像符号化装置２、及び、階層符号化データ変換装置３の組み合わせによる注目領域表示システムの構成を示したブロック図である。注目領域表示システムＳＹＳは、概略的には、品質の異なる入力画像を階層符号化して蓄積しておき、ユーザーからの注目領域情報に応じて蓄積された階層符号化データを変換して提供し、変換した階層符号化データを復号することで注目領域（ＲＯＩ）に係る高品質の再生画像を表示する。 Based on FIG. 21, it will be described that the attention area display system can be configured by a combination of the above-described hierarchical video decoding device 1, hierarchical video encoding device 2, and hierarchical encoded data conversion device 3. FIG. 21 is a block diagram illustrating a configuration of a region of interest display system that is a combination of the hierarchical video decoding device 1, the hierarchical video encoding device 2, and the hierarchical encoded data conversion device 3. The attention area display system SYS is generally provided by hierarchically encoding and storing input images having different qualities, and converting and providing the hierarchically encoded data accumulated according to attention area information from the user, By decoding the converted hierarchically encoded data, a high-quality reproduced image related to the region of interest (ROI) is displayed.

図２１に示すように、注目領域表示システムＳＹＳは、階層動画像符号化部ＳＹＳ１Ａ、階層動画像符号化部ＳＹＳ１Ｂ、階層符号化データ蓄積部ＳＹＳ２、階層符号化データ変換部ＳＹＳ３、階層動画像復号部ＳＹＳ４、表示制御部ＳＹＳ５、ＲＯＩ表示部ＳＹＳ６、全体表示部ＳＹＳ７、ＲＯＩ通知部ＳＹＳ８を構成要素として含む。 As shown in FIG. 21, the attention area display system SYS includes, as constituent elements, a hierarchical video encoding unit SYS1A, a hierarchical video encoding unit SYS1B, a hierarchical encoded data storage unit SYS2, a hierarchical encoded data conversion unit SYS3, a hierarchical video decoding unit SYS4, a display control unit SYS5, an ROI display unit SYS6, an entire display unit SYS7, and an ROI notification unit SYS8.

階層動画像符号化部ＳＹＳ１Ａ、ＳＹＳ１Ｂには、前述の階層動画像符号化装置２を利用できる。 The above-described hierarchical video encoding device 2 can be used for the hierarchical video encoding units SYS1A and SYS1B.

階層符号化データ蓄積部ＳＹＳ２は、階層符号化データを蓄積し、要求に応じて階層符号化データを供給する。階層符号化データ蓄積部ＳＹＳ２として、記録媒体（メモリ、ハードディスク、光学ディスク）を備えたコンピュータが利用できる。 The hierarchically encoded data storage unit SYS2 stores hierarchically encoded data and supplies the hierarchically encoded data as required. A computer having a recording medium (memory, hard disk, optical disk) can be used as the hierarchically encoded data storage unit SYS2.

階層符号化データ変換部ＳＹＳ３には、前述の階層符号化データ変換装置３が利用できる。 The hierarchical encoded data conversion device 3 described above can be used as the hierarchical encoded data conversion unit SYS3.

階層動画像復号部ＳＹＳ４には、前述の階層動画像復号装置１が利用できる。 The hierarchical video decoding device 1 can be used for the hierarchical video decoding unit SYS4.

表示制御部ＳＹＳ５は、注目領域情報に基づいて、復号ピクチャをＲＯＩ表示画像としてＲＯＩ表示部ＳＹＳ６に提供するか、または、復号ピクチャを全体表示画像として全体表示部ＳＹＳ７に供給する。 The display control unit SYS5 provides the decoded picture as the ROI display image to the ROI display unit SYS6 based on the attention area information, or supplies the decoded picture as the entire display image to the entire display unit SYS7.

表示制御部ＳＹＳ５は、注目領域情報で注目領域が指定されている場合、階層動画像復号部から入力される復号ピクチャであって、下位レイヤの復号ピクチャを全体表示画像として全体表示部ＳＹＳ７に供給する一方で、階層動画像復号部から入力される復号ピクチャであって、上位レイヤの復号ピクチャをＲＯＩ表示画像としてＲＯＩ表示部ＳＹＳ６に供給する。なお、注目領域情報で注目領域が指定されていない場合、ＲＯＩ表示部ＳＹＳ６にはＲＯＩ表示画像は供給されない。 When an attention area is designated by the attention area information, the display control unit SYS5 supplies the lower-layer decoded picture input from the hierarchical video decoding unit to the entire display unit SYS7 as the entire display image, and supplies the upper-layer decoded picture input from the hierarchical video decoding unit to the ROI display unit SYS6 as the ROI display image. When no attention area is designated by the attention area information, no ROI display image is supplied to the ROI display unit SYS6.

表示制御部ＳＹＳ５は、注目領域情報で注目領域が指定されていない場合、階層動画像復号部から入力される復号ピクチャであって、下位レイヤの復号ピクチャを全体表示画像として全体表示部ＳＹＳ７に供給する一方で、ＲＯＩ表示部ＳＹＳ６には復号ピクチャを供給しない。 When no attention area is designated by the attention area information, the display control unit SYS5 supplies the lower-layer decoded picture input from the hierarchical video decoding unit to the entire display unit SYS7 as the entire display image, and supplies no decoded picture to the ROI display unit SYS6.

なお、表示制御部ＳＹＳ５は、注目領域情報が変更された場合に、当該注目領域情報に係る階層符号化データの上位レイヤの復号ピクチャが階層動画像復号部ＳＹＳ４から供給されるまでの間、階層符号化データの下位レイヤの復号ピクチャの部分領域であって、注目領域に対応する部分をＲＯＩ表示画像としてＲＯＩ表示部ＳＹＳ６に供給しても構わない。下位レイヤの復号ピクチャの部分領域であって、注目領域に対応する部分は、該注目領域に係る上位レイヤの復号ピクチャに較べて画質は低いが、ユーザーが注目領域の指定後に、階層符号化データ変換部への通知及び変換処理に伴う遅延を待たずに注目領域をＲＯＩ表示部ＳＹＳ６に表示ができるという利点がある。 When the attention area information is changed, the display control unit SYS5 may, until the upper-layer decoded picture of the hierarchically encoded data for the new attention area is supplied from the hierarchical video decoding unit SYS4, supply the portion of the lower-layer decoded picture that corresponds to the attention area to the ROI display unit SYS6 as the ROI display image. That portion of the lower-layer decoded picture has lower image quality than the upper-layer decoded picture for the attention area, but it has the advantage that, after the user designates an attention area, the attention area can be shown on the ROI display unit SYS6 without waiting for the delay caused by notifying the hierarchical encoded data conversion unit and by the conversion process.
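The routing performed by the display control unit SYS5, including the lower-layer fallback, can be sketched as follows. Pictures are modeled as lists of sample rows, `route_pictures` and `crop` are hypothetical names, and the lower and upper layers are assumed to share one coordinate system (a real system with spatial scalability would scale the attention area before cropping).

```python
from typing import List, Optional, Tuple

Picture = List[List[int]]
Rect = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def crop(pic: Picture, rect: Rect) -> Picture:
    left, top, right, bottom = rect
    return [row[left:right] for row in pic[top:bottom]]


def route_pictures(roi: Optional[Rect], lower_pic: Picture,
                   upper_pic: Optional[Picture] = None
                   ) -> Tuple[Picture, Optional[Picture]]:
    """Return (entire display image, ROI display image or None)."""
    if roi is None:
        # No attention area designated: nothing goes to the ROI display unit.
        return lower_pic, None
    if upper_pic is None:
        # Upper-layer picture for the new attention area has not arrived yet:
        # fall back to the matching portion of the lower-layer picture.
        return lower_pic, crop(lower_pic, roi)
    # Normal case: lower layer to the entire display, upper layer to the ROI display.
    return lower_pic, upper_pic


lower = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
whole, roi_img = route_pictures((1, 0, 3, 2), lower)
```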

ＲＯＩ表示部ＳＹＳ６は、ＲＯＩ表示画像を所定の表示領域の所定の表示位置に表示する。例えば、表示領域はテレビの画面であり、表示位置はその部分領域（例えば右上隅の矩形領域）である。また、例えば、表示領域は携帯型端末（スマートフォンやタブレット型コンピュータ）のディスプレイであり、表示位置はその全体である。 The ROI display unit SYS6 displays the ROI display image at a predetermined display position in a predetermined display area. For example, the display area is a television screen, and the display position is a partial area (for example, a rectangular area in the upper right corner). Further, for example, the display area is a display of a portable terminal (smart phone or tablet computer), and the display position is the whole.

全体表示部ＳＹＳ７は、全体表示画像を所定の表示領域の所定の表示位置に表示する。例えば、表示領域はテレビの画面であり、表示位置はその全体である。なお、全体表示部ＳＹＳ７とＲＯＩ表示部ＳＹＳ６の表示領域が同じ場合、ＲＯＩ表示画像を全体表示画像の上に重ねるように表示することが好ましい。なお、ＲＯＩ表示部ＳＹＳ６および全体表示部ＳＹＳ７は、入力される画像を表示領域のサイズに一致するサイズに拡大または縮小して表示しても構わない。 The entire display unit SYS7 displays the entire display image at a predetermined display position in a predetermined display area. For example, the display area is a television screen, and the display position is the whole. In addition, when the display area of whole display part SYS7 and ROI display part SYS6 is the same, it is preferable to display so that a ROI display image may be superimposed on a whole display image. Note that the ROI display unit SYS6 and the entire display unit SYS7 may display the input image enlarged or reduced to a size that matches the size of the display area.

ＲＯＩ通知部ＳＹＳ８は、所定の方法でユーザーが指定した注目領域情報を通知する。例えば、ユーザーは全体表示画像が表示された表示領域上で、注目領域に相当する領域を指定することでＲＯＩ通知部に注目領域を伝えることができる。なお、ＲＯＩ通知部ＳＹＳ８は、ユーザーの指定がない場合は、注目領域が無いことを示す情報を注目領域情報として通知する。 The ROI notification unit SYS8 notifies attention area information designated by the user by a predetermined method. For example, the user can inform the ROI notification unit of the attention area by designating an area corresponding to the attention area on the display area where the entire display image is displayed. Note that the ROI notification unit SYS8 notifies information indicating that there is no attention area as attention area information when there is no user designation.

（注目領域表示システムのフロー） (Flow of the attention area display system)
注目領域表示システムによる処理は、階層符号化データ生成蓄積処理と注目領域データ生成再生処理に分けることができる。 Processing by the attention area display system can be divided into a hierarchical encoded data generation and storage process and an attention area data generation and playback process.

階層符号化データ生成蓄積処理では、異なる品質の入力画像から階層符号化データを生成して蓄積する。階層符号化データ生成蓄積処理は、Ｔ１０１からＴ１０３の手順で実行される。 In the hierarchically encoded data generation and accumulation process, hierarchically encoded data is generated and stored from input images of different qualities. The hierarchically encoded data generation / accumulation process is executed in the sequence from T101 to T103.

（Ｔ１０１）階層動画像符号化部ＳＹＳ１Ｂは、入力される低品質の入力画像を符号化し、生成された階層符号化データを階層動画像符号化部ＳＹＳ１Ａに供給する。つまり、階層動画像符号化部ＳＹＳ１Ｂは、入力画像から、階層動画像符号化部ＳＹＳ１Ａにおいて参照レイヤ（下位レイヤ）として使用される階層符号化データを生成して出力する。 (T101) The hierarchical moving image encoding unit SYS1B encodes the input low-quality input image, and supplies the generated hierarchical encoded data to the hierarchical moving image encoding unit SYS1A. That is, the hierarchical moving image encoding unit SYS1B generates and outputs hierarchical encoded data used as a reference layer (lower layer) in the hierarchical moving image encoding unit SYS1A from the input image.

（Ｔ１０２）階層動画像符号化部ＳＹＳ１Ａは、入力される高品質の入力画像を、入力された階層符号化データを参照レイヤの符号化データとして符号化し、階層符号化データを生成して階層符号化データ蓄積部ＳＹＳ２に出力する。 (T102) The hierarchical video encoding unit SYS1A encodes the input high-quality input image using the input hierarchically encoded data as the encoded data of the reference layer, generates hierarchically encoded data, and outputs it to the hierarchical encoded data storage unit SYS2.

（Ｔ１０３）階層符号化データ蓄積部ＳＹＳ２は、入力された階層符号化データに適切なインデックスを付けて内部の記録媒体に記録する。 (T103) The hierarchically encoded data storage unit SYS2 attaches an appropriate index to the input hierarchically encoded data and records it on an internal recording medium.

注目領域データ生成再生処理では、階層符号化データ蓄積部ＳＹＳ２から階層符号化データを読み出し、注目領域に相当する階層符号化データに変換し、変換した階層符号化データを復号して再生及び表示する。注目領域データ生成再生処理は、以下のＴ２０１〜Ｔ２０７の手順で実行される。 In the attention area data generation / reproduction processing, the hierarchically encoded data is read from the hierarchically encoded data storage unit SYS2, converted into hierarchically encoded data corresponding to the attention area, and the converted hierarchically encoded data is decoded and reproduced and displayed. . The attention area data generation / reproduction processing is executed in the following steps T201 to T207.

（Ｔ２０１）ユーザーの選択した動画像に関する階層符号化データが階層符号化データ蓄積部ＳＹＳ２から階層符号化データ変換部ＳＹＳ３に供給される。 (T201) Hierarchical encoded data related to the moving image selected by the user is supplied from the hierarchically encoded data storage unit SYS2 to the hierarchically encoded data conversion unit SYS3.

（Ｔ２０２）ＲＯＩ通知部ＳＹＳ８は、ユーザーの指定した注目領域情報を階層符号化データ変換部ＳＹＳ３、および、表示制御部ＳＹＳ５に通知する。 (T202) The ROI notification unit SYS8 notifies the attention area information designated by the user to the hierarchically encoded data conversion unit SYS3 and the display control unit SYS5.

（Ｔ２０３）階層符号化データ変換部ＳＹＳ３は、入力された注目領域情報に基づいて、入力された階層符号化データを変換して、階層動画像復号部ＳＹＳ４に出力する。 (T203) The hierarchical encoded data conversion unit SYS3 converts the input hierarchical encoded data based on the input attention area information, and outputs the converted hierarchical encoded data to the hierarchical video decoding unit SYS4.

(T204) The hierarchical video decoding unit SYS4 decodes the input (converted) hierarchically encoded video data and outputs the reconstructed decoded pictures of the upper layer and the lower layer to the display control unit SYS5.

(T205) Based on the input attention area information, the display control unit SYS5 outputs the input decoded pictures to the ROI display unit SYS6 and the entire display unit SYS7.

(T206) The entire display unit SYS7 displays the input entire-display image.

(T207) The ROI display unit SYS6 displays the input ROI display image.
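The data flow of steps T203 to T207 can be sketched as follows. This is only an illustration under stated assumptions: `Roi`, `crop`, `run_roi_playback`, and the list-of-rows picture type are hypothetical names introduced here, the converter and decoder are passed in as callables because their internals are described elsewhere, and cropping the upper-layer picture to the attention area is an assumed way for the display control unit SYS5 to form the ROI display image.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Picture = List[List[int]]  # hypothetical picture type: rows of sample values


@dataclass(frozen=True)
class Roi:
    """Hypothetical attention area: a rectangle in upper-layer sample units."""
    x: int
    y: int
    width: int
    height: int


def crop(picture: Picture, roi: Roi) -> Picture:
    """Cut the attention area out of a decoded picture (assumed SYS5 behaviour)."""
    rows = picture[roi.y:roi.y + roi.height]
    return [row[roi.x:roi.x + roi.width] for row in rows]


def run_roi_playback(
    stored: bytes,
    roi: Roi,
    convert: Callable[[bytes, Roi], bytes],
    decode: Callable[[bytes], Tuple[Picture, Picture]],
) -> Tuple[Picture, Picture]:
    """Return (entire_display_image, roi_display_image) for one picture."""
    converted = convert(stored, roi)   # T203: SYS3 converts using the attention area
    upper, lower = decode(converted)   # T204: SYS4 decodes upper and lower layers
    entire_image = lower               # T205/T206: lower layer goes to SYS7
    roi_image = crop(upper, roi)       # T205/T207: upper layer goes to SYS6
    return entire_image, roi_image
```

Passing the converter and decoder as parameters keeps the sketch independent of any particular codec implementation.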

(Effect of the attention area display system)
The attention area display system SYS according to the present embodiment described above includes: an attention area notification unit (ROI notification unit SYS8) that supplies attention area information; a hierarchically encoded data conversion unit SYS3 that converts hierarchically encoded data based on the attention area information to generate converted hierarchically encoded data; a hierarchical video decoding unit SYS4 that decodes the converted hierarchically encoded data and outputs decoded pictures of the upper layer and the lower layer; a display control unit SYS5; an attention area display unit (ROI display unit SYS6); and an entire display unit SYS7. The display control unit SYS5 supplies the decoded picture of the lower layer to the entire display unit SYS7 and supplies the decoded picture of the upper layer to the attention area display unit.

With the attention area display system SYS described above, the entire decoded picture of the lower layer can be displayed together with the decoded picture of the area specified by the attention area information. The decoded picture of the specified area has high image quality because it is decoded using the upper-layer encoded data of the hierarchically encoded data. In addition, the hierarchically encoded data converted based on the attention area has a smaller code amount than the hierarchically encoded data before conversion. Using the attention area display system SYS therefore makes it possible to reproduce a high-quality decoded picture of the attention area while reducing the bandwidth required to transfer the hierarchically encoded data.

(Application examples to other hierarchical video encoding/decoding systems)
The hierarchical video encoding device 2 and the hierarchical video decoding device 1 described above can be mounted on and used in various devices that transmit, receive, record, and play back moving images. The moving image may be a natural moving image captured by a camera or the like, or an artificial moving image (including CG and GUI) generated by a computer or the like.

With reference to FIG. 22, it is explained that the hierarchical video encoding device 2 and the hierarchical video decoding device 1 described above can be used to transmit and receive moving images. FIG. 22(a) is a block diagram showing the configuration of a transmission device PROD_A equipped with the hierarchical video encoding device 2.

As shown in FIG. 22(a), the transmission device PROD_A includes an encoding unit PROD_A1 that obtains encoded data by encoding a moving image, a modulation unit PROD_A2 that obtains a modulated signal by modulating a carrier wave with the encoded data obtained by the encoding unit PROD_A1, and a transmission unit PROD_A3 that transmits the modulated signal obtained by the modulation unit PROD_A2. The hierarchical video encoding device 2 described above is used as the encoding unit PROD_A1.

As sources of the moving image input to the encoding unit PROD_A1, the transmission device PROD_A may further include a camera PROD_A4 that captures moving images, a recording medium PROD_A5 on which moving images are recorded, an input terminal PROD_A6 for inputting moving images from outside, and an image processing unit A7 that generates or processes images. FIG. 22(a) illustrates a configuration in which the transmission device PROD_A includes all of these, but some of them may be omitted.

The recording medium PROD_A5 may hold unencoded moving images, or moving images encoded with a recording coding scheme that differs from the transmission coding scheme. In the latter case, a decoding unit (not shown) that decodes the encoded data read from the recording medium PROD_A5 in accordance with the recording coding scheme should be placed between the recording medium PROD_A5 and the encoding unit PROD_A1.

FIG. 22(b) is a block diagram showing the configuration of a reception device PROD_B equipped with the hierarchical video decoding device 1. As shown in FIG. 22(b), the reception device PROD_B includes a reception unit PROD_B1 that receives a modulated signal, a demodulation unit PROD_B2 that obtains encoded data by demodulating the modulated signal received by the reception unit PROD_B1, and a decoding unit PROD_B3 that obtains a moving image by decoding the encoded data obtained by the demodulation unit PROD_B2. The hierarchical video decoding device 1 described above is used as the decoding unit PROD_B3.

As destinations of the moving image output by the decoding unit PROD_B3, the reception device PROD_B may further include a display PROD_B4 that displays moving images, a recording medium PROD_B5 for recording moving images, and an output terminal PROD_B6 for outputting moving images to the outside. FIG. 22(b) illustrates a configuration in which the reception device PROD_B includes all of these, but some of them may be omitted.

The recording medium PROD_B5 may be used to record unencoded moving images, or moving images encoded with a recording coding scheme that differs from the transmission coding scheme. In the latter case, an encoding unit (not shown) that encodes the moving image obtained from the decoding unit PROD_B3 in accordance with the recording coding scheme should be placed between the decoding unit PROD_B3 and the recording medium PROD_B5.

The transmission medium that carries the modulated signal may be wireless or wired. The transmission mode may be broadcasting (here, a mode in which the destination is not specified in advance) or communication (here, a mode in which the destination is specified in advance). In other words, the modulated signal may be transmitted by any of wireless broadcasting, wired broadcasting, wireless communication, and wired communication.

For example, a broadcasting station (broadcasting equipment or the like) and a receiving station (television receiver or the like) for terrestrial digital broadcasting are an example of a transmission device PROD_A and a reception device PROD_B that send and receive a modulated signal by wireless broadcasting. Likewise, a broadcasting station (broadcasting equipment or the like) and a receiving station (television receiver or the like) for cable television broadcasting are an example of a transmission device PROD_A and a reception device PROD_B that send and receive a modulated signal by wired broadcasting.

A server (workstation or the like) and a client (television receiver, personal computer, smartphone, or the like) for a VOD (Video On Demand) service or a video sharing service over the Internet are an example of a transmission device PROD_A and a reception device PROD_B that send and receive a modulated signal by communication (in a LAN, either a wireless or a wired transmission medium is normally used; in a WAN, a wired transmission medium is used). Here, personal computers include desktop PCs, laptop PCs, and tablet PCs, and smartphones include multi-function mobile phone terminals.

A client of a video sharing service has a function of encoding a moving image captured by a camera and uploading it to the server, in addition to a function of decoding encoded data downloaded from the server and displaying it on a display. That is, a client of a video sharing service functions as both the transmission device PROD_A and the reception device PROD_B.

With reference to FIG. 23, it is explained that the hierarchical video encoding device 2 and the hierarchical video decoding device 1 described above can be used to record and play back moving images. FIG. 23(a) is a block diagram showing the configuration of a recording device PROD_C equipped with the hierarchical video encoding device 2 described above.

As shown in FIG. 23(a), the recording device PROD_C includes an encoding unit PROD_C1 that obtains encoded data by encoding a moving image, and a writing unit PROD_C2 that writes the encoded data obtained by the encoding unit PROD_C1 to a recording medium PROD_M. The hierarchical video encoding device 2 described above is used as the encoding unit PROD_C1.

The recording medium PROD_M may be (1) of a type built into the recording device PROD_C, such as an HDD (Hard Disk Drive) or SSD (Solid State Drive); (2) of a type connected to the recording device PROD_C, such as an SD memory card or USB (Universal Serial Bus) flash memory; or (3) of a type loaded into a drive device (not shown) built into the recording device PROD_C, such as a DVD (Digital Versatile Disc) or BD (Blu-ray Disc: registered trademark).

As sources of the moving image input to the encoding unit PROD_C1, the recording device PROD_C may further include a camera PROD_C3 that captures moving images, an input terminal PROD_C4 for inputting moving images from outside, a reception unit PROD_C5 for receiving moving images, and an image processing unit C6 that generates or processes images. FIG. 23(a) illustrates a configuration in which the recording device PROD_C includes all of these, but some of them may be omitted.

The reception unit PROD_C5 may receive unencoded moving images, or encoded data encoded with a transmission coding scheme that differs from the recording coding scheme. In the latter case, a transmission decoding unit (not shown) that decodes the encoded data encoded with the transmission coding scheme should be placed between the reception unit PROD_C5 and the encoding unit PROD_C1.

Examples of such a recording device PROD_C include a DVD recorder, a BD recorder, and an HDD (Hard Disk Drive) recorder (in these cases, the input terminal PROD_C4 or the reception unit PROD_C5 is the main source of moving images). A camcorder (in this case, the camera PROD_C3 is the main source of moving images), a personal computer (in this case, the reception unit PROD_C5 or the image processing unit C6 is the main source of moving images), and a smartphone (in this case, the camera PROD_C3 or the reception unit PROD_C5 is the main source of moving images) are also examples of such a recording device PROD_C.

FIG. 23(b) is a block diagram showing the configuration of a playback device PROD_D equipped with the hierarchical video decoding device 1 described above. As shown in FIG. 23(b), the playback device PROD_D includes a reading unit PROD_D1 that reads encoded data written on the recording medium PROD_M, and a decoding unit PROD_D2 that obtains a moving image by decoding the encoded data read by the reading unit PROD_D1. The hierarchical video decoding device 1 described above is used as the decoding unit PROD_D2.

The recording medium PROD_M may be (1) of a type built into the playback device PROD_D, such as an HDD or SSD; (2) of a type connected to the playback device PROD_D, such as an SD memory card or USB flash memory; or (3) of a type loaded into a drive device (not shown) built into the playback device PROD_D, such as a DVD or BD.

As destinations of the moving image output by the decoding unit PROD_D2, the playback device PROD_D may further include a display PROD_D3 that displays moving images, an output terminal PROD_D4 for outputting moving images to the outside, and a transmission unit PROD_D5 that transmits moving images. FIG. 23(b) illustrates a configuration in which the playback device PROD_D includes all of these, but some of them may be omitted.

The transmission unit PROD_D5 may transmit unencoded moving images, or encoded data encoded with a transmission coding scheme that differs from the recording coding scheme. In the latter case, an encoding unit (not shown) that encodes the moving image with the transmission coding scheme should be placed between the decoding unit PROD_D2 and the transmission unit PROD_D5.

Examples of such a playback device PROD_D include a DVD player, a BD player, and an HDD player (in these cases, the output terminal PROD_D4, to which a television receiver or the like is connected, is the main destination of moving images). A television receiver (in this case, the display PROD_D3 is the main destination of moving images), digital signage (also called an electronic signboard or electronic bulletin board; the display PROD_D3 or the transmission unit PROD_D5 is the main destination of moving images), a desktop PC (in this case, the output terminal PROD_D4 or the transmission unit PROD_D5 is the main destination of moving images), a laptop or tablet PC (in this case, the display PROD_D3 or the transmission unit PROD_D5 is the main destination of moving images), and a smartphone (in this case, the display PROD_D3 or the transmission unit PROD_D5 is the main destination of moving images) are also examples of such a playback device PROD_D.

(Hardware implementation and software implementation)
Finally, each block of the hierarchical video decoding device 1 and the hierarchical video encoding device 2 may be implemented in hardware by a logic circuit formed on an integrated circuit (IC chip), or in software using a CPU (Central Processing Unit).

In the latter case, each of the above devices includes a CPU that executes the instructions of a control program realizing each function, a ROM (Read Only Memory) that stores the program, a RAM (Random Access Memory) into which the program is loaded, and a storage device (recording medium) such as a memory that stores the program and various data. The object of the present invention can also be achieved by supplying each of the above devices with a recording medium on which the program code (executable program, intermediate code program, or source program) of the control program of each device, which is software realizing the functions described above, is recorded in computer-readable form, and by having the computer (or a CPU or MPU (Micro Processing Unit)) read and execute the program code recorded on the recording medium.

Examples of the recording medium include: tapes such as magnetic tapes and cassette tapes; disks, including magnetic disks such as floppy (registered trademark) disks and hard disks, and optical disks such as CD-ROM (Compact Disc Read-Only Memory), MO (Magneto-Optical), MD (Mini Disc), DVD (Digital Versatile Disk), and CD-R (CD Recordable); cards such as IC cards (including memory cards) and optical cards; semiconductor memories such as mask ROM, EPROM (Erasable Programmable Read-only Memory), EEPROM (registered trademark) (Electrically Erasable and Programmable Read-only Memory), and flash ROM; and logic circuits such as PLD (Programmable Logic Device) and FPGA (Field Programmable Gate Array).

Each of the above devices may also be configured to be connectable to a communication network, and the program code may be supplied via the communication network. The communication network is not particularly limited as long as it can transmit the program code. For example, the Internet, an intranet, an extranet, a LAN (Local Area Network), an ISDN (Integrated Services Digital Network), a VAN (Value-Added Network), a CATV (Community Antenna Television) communication network, a virtual private network (Virtual Private Network), a telephone network, a mobile communication network, or a satellite communication network can be used. The transmission medium constituting the communication network is likewise not limited to a specific configuration or type as long as it can transmit the program code. For example, wired media such as IEEE (Institute of Electrical and Electronics Engineers) 1394, USB, power line carrier, cable TV lines, telephone lines, and ADSL (Asymmetric Digital Subscriber Line) lines can be used, as can wireless media such as infrared (IrDA (Infrared Data Association) or a remote control), Bluetooth (registered trademark), IEEE 802.11 wireless, HDR (High Data Rate), NFC (Near Field Communication), DLNA (Digital Living Network Alliance), mobile phone networks, satellite links, and terrestrial digital networks. The present invention can also be realized in the form of a computer data signal embedded in a carrier wave, in which the program code is embodied by electronic transmission.

The present invention is not limited to the embodiments described above, and various modifications are possible within the scope of the claims. Embodiments obtained by appropriately combining technical means disclosed in different embodiments are also included in the technical scope of the present invention.

The present invention is suitably applicable to a hierarchical image decoding device that decodes encoded data in which image data is hierarchically encoded, and to a hierarchical image encoding device that generates encoded data in which image data is hierarchically encoded. It is also suitably applicable to the data structure of hierarchically encoded data that is generated by the hierarchical image encoding device and referenced by the hierarchical image decoding device.

DESCRIPTION OF SYMBOLS
1 Hierarchical video decoding device (image decoding device)
11 NAL demultiplexing unit
12 Parameter set decoding unit
13 Tile setting unit
14 Slice decoding unit
141 Slice header decoding unit
142 Slice position setting unit
143, 243 Skip slice determination unit
144 CTU decoding unit
144N Non-skip CTU decoding unit
144S Skip CTU generation unit
15 Base decoding unit
16 Decoded picture management unit
2 Hierarchical video encoding device (image encoding device)
21 NAL multiplexing unit
22 Parameter set encoding unit
23 Tile setting unit
24 Slice encoding unit
241 Slice header setting unit
242 Slice position setting unit
244 CTU encoding unit
244N Non-skip CTU encoding unit
244S Skip CTU encoding unit
3 Hierarchically encoded data conversion device
32 Parameter set correction unit
34 Slice correction unit
SYS Attention area display system
SYS1A, SYS1B Hierarchical video encoding unit
SYS2 Hierarchically encoded data storage unit
SYS3 Hierarchically encoded data conversion unit
SYS4 Hierarchical video decoding unit
SYS5 Display control unit
SYS6 ROI display unit (attention area display unit)
SYS7 Entire display unit
SYS8 ROI notification unit (attention area notification unit)

Claims

An image decoding apparatus that decodes higher layer encoded data included in hierarchically encoded data and restores a decoded picture of an upper layer,
A slice header decoding unit for decoding the slice header of the upper layer;
A non-skip CTU decoding unit that decodes a decoded image of a CTU belonging to a non-skip slice of an upper layer based on slice data;
A skip CTU generation unit that generates a decoded image of a CTU belonging to a skip slice of an upper layer;
The slice header decoding unit decodes or estimates a skip slice flag indicating whether the target slice is a skip slice or a non-skip slice,
When the skip slice flag indicates that the target slice is a skip slice, the slice header decoding unit decodes the number of skip CTUs, which is the number of CTUs included in the target slice, and a decoded image of the target slice is generated by having the skip CTU generation unit generate decoded images of as many CTUs as the number of skip CTUs indicates,
When the skip slice flag indicates that the target slice is a non-skip slice, a decoded image of the target slice is generated by having the non-skip CTU decoding unit decode the CTUs included in the target slice.
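The branching recited in this claim can be illustrated with a short sketch. It is a simplified reading, not the claimed implementation: `SliceHeader`, `decode_slice`, and the two callables standing in for the non-skip CTU decoding unit and the skip CTU generation unit are hypothetical names, and what a generated skip CTU actually contains is left to the callable.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class SliceHeader:
    """Hypothetical parsed slice header with only the fields used here."""
    skip_slice_flag: bool
    num_skip_ctus: int = 0  # meaningful only when skip_slice_flag is True


def decode_slice(
    header: SliceHeader,
    coded_ctus: Sequence[bytes],
    decode_ctu: Callable[[bytes], object],
    generate_skip_ctu: Callable[[int], object],
) -> List[object]:
    """Return the decoded CTUs of one upper-layer slice."""
    if header.skip_slice_flag:
        # Skip slice: no slice data is parsed; the header's CTU count
        # tells the generator how many CTUs to produce.
        return [generate_skip_ctu(i) for i in range(header.num_skip_ctus)]
    # Non-skip slice: every CTU is decoded from its slice data.
    return [decode_ctu(data) for data in coded_ctus]
```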

A parameter set decoding unit for decoding display area information;
The display area specified by the display area information includes no tile that contains a skip slice, and all slices in each tile that it does include are non-skip slices. The image decoding device according to claim 1.

The skip CTU number is decoded from a slice header by applying a decoding process of a variable length code having a maximum value of the number of CTUs included in a tile to which the target slice belongs. Image decoding apparatus.
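The claim does not state which variable-length code is used. Purely as an illustration of a code whose length depends on a known maximum value, the sketch below decodes a truncated binary code; choosing that code, the bit-string input, and the name `decode_truncated_binary` are all assumptions made here. In the setting of the claim, `max_value` would be the number of CTUs in the tile to which the target slice belongs.

```python
from typing import Tuple


def decode_truncated_binary(bits: str, pos: int, max_value: int) -> Tuple[int, int]:
    """Decode one value in 0..max_value from a string of '0'/'1' characters.

    Returns (value, new_position). Smaller values use k bits and larger
    values use k + 1 bits, where k = floor(log2(max_value + 1)).
    """
    n = max_value + 1              # number of distinct values
    k = n.bit_length() - 1         # floor(log2(n))
    u = (1 << (k + 1)) - n         # how many values get the short k-bit code
    value = int(bits[pos:pos + k], 2) if k > 0 else 0
    pos += k
    if value >= u:
        # Long codeword: read one more bit and remove the offset.
        value = ((value << 1) | int(bits[pos])) - u
        pos += 1
    return value, pos
```

Bounding the code by the tile's CTU count means small tiles spend fewer header bits on the count than a code sized for the whole picture would.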

When the picture including the target slice includes one tile, the skip slice flag decoded or estimated by the slice header decoding unit indicates that the target slice is a non-skip slice. The image decoding device according to claim 3.

A parameter set decoding unit for decoding an all-tile dependency identifier included in the parameter set;
When the all-tile dependency identifier indicates that there is no motion dependency other than between corresponding tiles or no inter-layer dependency other than between corresponding tiles, the skip slice flag decoded or estimated by the slice header decoding unit indicates that the target slice is a non-skip slice. The image decoding apparatus according to claim 1.

When the picture including the target slice includes two or more tiles, the slice header decoding unit decodes the skip slice flag,
The slice header decoding unit estimates a value indicating that the target slice is a non-skip slice as a value of the skip slice flag when a picture including the target slice includes one tile. 5. The image decoding device according to 4.
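The decode-or-estimate rule in this claim amounts to a conditional read. In the sketch below, `get_skip_slice_flag` and the `read_flag` callable (standing in for reading one flag from the slice header) are hypothetical names introduced for illustration.

```python
from typing import Callable


def get_skip_slice_flag(num_tiles_in_picture: int, read_flag: Callable[[], bool]) -> bool:
    """Return the skip slice flag, reading it only when it can be present."""
    if num_tiles_in_picture >= 2:
        # Two or more tiles: the flag is decoded from the slice header.
        return read_flag()
    # One tile: nothing is read; the flag is estimated as "non-skip".
    return False
```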

When the picture including the target slice includes two or more tiles, the slice header decoding unit selects either the non-skip CTU decoding unit or the skip CTU generation unit according to the value of the skip slice flag. To decode the CTU contained in the target slice,
The slice header decoding unit selects a non-skip CTU decoding unit to decode a CTU included in the target slice when a picture including the target slice includes one tile. Image decoding device.

6. The all-tile dependency identifier includes both information indicating presence / absence of motion dependence other than between corresponding tiles and information indicating presence / absence of inter-layer dependence other than between corresponding tiles. Image decoding device.

A parameter set decoding unit for decoding an all-tile dependency identifier included in the parameter set;
The picture including the target slice includes two or more tiles, and the all-tile dependency identifier indicates that there is no motion dependency other than between corresponding tiles or no inter-layer dependency other than between corresponding tiles. 4. The image decoding device according to claim 1, wherein the skip slice flag decoded or estimated by the slice header decoding unit indicates that the target slice is a non-skip slice.

An image encoding device that generates encoded data of an upper layer from an input image,
A slice header encoding unit that encodes a slice header of an upper layer;
A non-skip CTU encoding unit that encodes a decoded image of a CTU belonging to a non-skip slice of an upper layer based on slice data;
A skip CTU generation unit that generates a decoded image of a CTU belonging to a skip slice of an upper layer;
A parameter set encoder for encoding display area information;
The slice header encoding unit encodes a skip slice flag indicating whether the target slice is a skip slice or a non-skip slice,
When the skip slice flag indicates that the target slice is a skip slice, the slice header encoding unit encodes the number of skip CTUs, which is the number of CTUs included in the target slice, and the skip CTU generation unit generates decoded images of as many CTUs as the number of skip CTUs indicates, thereby generating a decoded image of the target slice,
When the skip slice flag indicates that the target slice is a non-skip slice, the decoded image of the target slice is generated by encoding the CTUs included in the target slice with the non-skip CTU encoding unit,
The display area specified by the display area information includes no tile that contains a skip slice; every slice in each tile included in the display area is a non-skip slice.
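
The constraint on the display area can be checked as in the sketch below. It makes a simplifying assumption that is not in the claim: the display area is represented as a set of tile identifiers rather than as a rectangle. The `Slice` type and the function name are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Slice:
    tile_id: int   # tile that contains this slice (hypothetical field)
    is_skip: bool  # value of the skip slice flag


def display_area_is_valid(display_tiles: Iterable[int], slices: Iterable[Slice]) -> bool:
    """True when no tile inside the display area contains a skip slice."""
    display = set(display_tiles)
    return all(not s.is_skip for s in slices if s.tile_id in display)
```

An encoder following the claim would only emit display area information for which this check holds.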

A hierarchical encoded data conversion device that converts input hierarchical encoded data based on input attention area information and outputs the converted hierarchical encoded data,
A slice correction unit for correcting the encoded data of the upper layer based on the attention area information,
The slice correction unit corrects the encoded data of the upper layer by changing, based on the attention area indicated by the attention area information, a slice that is included in the encoded data of the upper layer and is not included in the attention area from a non-skip slice to a skip slice.

The hierarchical encoded data conversion apparatus according to claim 11, wherein the slice correction unit corrects the encoded data of the upper layer by changing a slice that is included in the encoded data of the upper layer and is contained in a tile not included in the attention area from a non-skip slice to a skip slice.
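
A minimal sketch of the slice correction, assuming the attention area has already been mapped to a set of tile identifiers. The `CodedSlice` type, its fields, and the function name are hypothetical. Dropping the slice data of a converted slice is likewise an assumption of this sketch, based on the abstract's description of a skip slice as one whose header carries the number of skip CTUs.

```python
from dataclasses import dataclass, replace
from typing import Iterable, List


@dataclass(frozen=True)
class CodedSlice:
    tile_id: int   # tile that contains this slice
    num_ctus: int  # number of CTUs in the slice (kept as the skip CTU count)
    is_skip: bool  # skip slice flag
    payload: bytes # coded slice data; assumed empty for a skip slice


def convert_to_skip_outside_roi(
    slices: Iterable[CodedSlice], roi_tiles: Iterable[int]
) -> List[CodedSlice]:
    """Turn non-skip slices in tiles outside the attention area into skip slices."""
    roi = set(roi_tiles)
    corrected = []
    for s in slices:
        if s.tile_id in roi or s.is_skip:
            corrected.append(s)  # inside the attention area, or already a skip slice
        else:
            corrected.append(replace(s, is_skip=True, payload=b""))
    return corrected
```

Slices inside the attention area pass through unchanged, so their picture quality is unaffected by the conversion.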

The hierarchical encoded data conversion device according to claim 11 or 12, further comprising a parameter set correction unit,
wherein the parameter set correction unit corrects the parameter set by rewriting the display area information included in the parameter set so that the display area indicated by the display area information matches the attention area indicated by the attention area information.
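
The parameter set correction amounts to overwriting one field. The sketch below assumes the display area and the attention area are both plain rectangles in the same coordinate system; `Rect`, `ParameterSet`, and `other_fields` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class Rect:
    left: int
    top: int
    width: int
    height: int


@dataclass(frozen=True)
class ParameterSet:
    display_area: Rect                                # display area information
    other_fields: dict = field(default_factory=dict)  # everything else, left untouched


def correct_parameter_set(ps: ParameterSet, attention_area: Rect) -> ParameterSet:
    """Return a copy whose display area matches the attention area."""
    return replace(ps, display_area=attention_area)
```

The input parameter set is not modified; a corrected copy is returned so that the original hierarchical encoded data stays intact.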

An attention area display system that displays an entire picture and a partial area corresponding to the attention area of the picture using the accumulated hierarchical encoded data,
An attention area notification unit for supplying attention area information;
A hierarchically encoded data converter that converts hierarchically encoded data based on the attention area information to generate converted hierarchically encoded data; and
A hierarchical moving picture decoding unit that decodes the converted hierarchically encoded data and outputs a decoded picture of an upper layer and a decoded picture of a lower layer;
An attention area display system comprising: a display control unit that outputs the decoded picture of the lower layer as a whole display image and outputs the decoded picture of the upper layer as an attention area display image.

The attention area display system according to claim 14, wherein the display control unit does not output an attention area display image when the attention area information indicates that no attention area is designated.

The attention area display system according to claim 14 or 15, wherein, when the attention area information changes, the display control unit outputs, as the attention area display image, the partial area of the decoded picture of the lower layer indicated by the changed attention area information, from the time the change is determined until the decoded picture of the upper layer corresponding to the changed attention area information is supplied from the hierarchical moving picture decoding unit.
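
The behavior of the display control unit across claims 14 to 16 can be sketched as below. The sketch assumes pictures are 2-D lists of samples, the attention area is a `(left, top, width, height)` tuple in lower-layer sample coordinates, and any scaling between layers is ignored. All names are hypothetical.

```python
from typing import List, Optional, Tuple

Rect = Tuple[int, int, int, int]  # (left, top, width, height)
Picture = List[List[int]]


def crop(picture: Picture, rect: Rect) -> Picture:
    """Cut the given rectangle out of a picture."""
    left, top, width, height = rect
    return [row[left:left + width] for row in picture[top:top + height]]


def attention_area_image(
    roi: Optional[Rect],
    lower_picture: Picture,
    upper_picture: Optional[Picture],
    upper_picture_roi: Optional[Rect],
) -> Optional[Picture]:
    """Pick the attention area display image for the current frame."""
    if roi is None:
        return None  # no attention area designated: output nothing
    if upper_picture is not None and upper_picture_roi == roi:
        return upper_picture  # upper layer already matches the current attention area
    # Attention area just changed: fall back to a crop of the lower-layer picture.
    return crop(lower_picture, roi)
```

The fallback keeps the attention area display responsive while the converter and decoder catch up with the new attention area.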