JP2017034729A

JP2017034729A - Dynamic image predictive decoding method and dynamic image predictive decoding device

Info

Publication number: JP2017034729A
Application number: JP2016224323A
Authority: JP
Inventors: Junya Takigami; Chunsen Bun; Tio Ken Tang
Original assignee: NTT Docomo Inc
Current assignee: NTT Docomo Inc
Priority date: 2016-11-17
Filing date: 2016-11-17
Publication date: 2017-02-09
Anticipated expiration: 2032-07-06
Also published as: JP6272979B2

Abstract

PROBLEM TO BE SOLVED: In the NAL unit header of the conventional method, bits are allocated to both nal_ref_flag and nal_unit_type even when the value of nal_ref_flag is uniquely determined by the value of nal_unit_type, which makes the design inefficient. SOLUTION: A moving picture predictive decoding method includes an input step of inputting compressed image data, and a decoding step of decoding NAL unit header information and a reference picture set (RPS) to restore the compressed image data as reproduced image data. The RPS identifies the set of pictures used for inter-screen prediction of the associated picture, and the NAL unit header information includes nal_unit_type, which uniquely indicates whether the reproduced image data is used for inter-screen prediction when decoding another picture in the same temporal layer. SELECTED DRAWING: Figure 5

Description

The present invention relates to a moving picture predictive decoding method and a moving picture predictive decoding apparatus.

In conventional moving picture compression techniques, the bitstream is encapsulated in network abstraction layer (NAL) units. NAL units provide self-contained packets and give the video layer a uniform identity across different network environments. The NAL unit header contains the information required by the system layer. The NAL unit header becomes part of the packet header in a packet network and is designed to be operated on by media-aware network elements (MANEs).

The prior-art NAL unit header includes the following syntax elements. nal_ref_flag indicates whether the NAL unit is used for reference in the decoding process of other NAL units. nal_unit_type indicates the type of content carried by the NAL unit; a NAL unit carries information such as a parameter set, a coded slice, or a supplemental enhancement information (SEI) message. temporal_id indicates the temporal identifier of the NAL unit.

The prior art is described in Non-Patent Document 1.

Benjamin Bross et al., "High efficiency video coding (HEVC) text specification draft 7", Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, 9th Meeting: Geneva, CH, 27th April - 7th May 2012

The NAL unit header is a limited resource, because MANEs are designed to inspect only a minimal number of bytes at the beginning of a packet. In the prior art, the NAL unit header is only 2 bytes long. Every syntax element of the NAL unit header is therefore important and should convey as much information as possible, uncorrelated with the other syntax elements.

For most NAL unit types, nal_ref_flag is unnecessary because it must be set to a fixed value. In the specification described in Non-Patent Document 1, only three NAL unit types allow nal_ref_flag to take either 0 or 1. For the other NAL unit types defined in the specification, the value of nal_ref_flag is fixed. This is shown in Table 1.

Table 1 shows the correspondence between the value of nal_unit_type (the "NAL unit type range" column) and the values that nal_ref_flag can take (the "Possible nal_ref_flag" column). The NAL unit types whose nal_unit_type value is 1, 2, or 3 can take 0 or 1 as the value of nal_ref_flag. The remaining NAL unit types are reserved or unspecified.

Thus, even when the value of nal_ref_flag is uniquely determined by the value of nal_unit_type, the conventional method allocates bits to both nal_ref_flag and nal_unit_type, which is an inefficient design.

The solution to the above problem is not to send nal_ref_flag explicitly in the NAL unit header but to imply it from the NAL unit type. For the three NAL unit types whose content can be either a reference picture or a non-reference picture, three NAL unit types that imply nal_ref_flag equal to 1 are added. The original three NAL unit types then imply nal_ref_flag equal to 0.

To solve the above problem, a moving picture predictive decoding method according to the present invention is a moving picture predictive decoding method executed by a moving picture predictive decoding apparatus, comprising: an input step of inputting compressed image data for a plurality of pictures constituting a moving picture, the compressed image data including a reference picture set (RPS) and being encapsulated in NAL units together with NAL unit header information; and a decoding step of decoding the NAL unit header information and the RPS and restoring the compressed image data as reproduced image data, wherein the plurality of pictures constituting the moving picture are classified into a plurality of temporal layers, the RPS identifies the set of pictures used for inter-screen prediction of the associated picture, the NAL unit header information includes nal_unit_type uniquely indicating whether the reproduced image data is used for inter-screen prediction when decoding another picture in the same temporal layer, and the RPS of the other picture does not include non-reference pictures of the same temporal layer.

In the moving picture predictive decoding method according to the present invention, the NAL unit header information may include nal_unit_type uniquely indicating whether the reproduced image data is used for inter-screen prediction when decoding a picture of the same temporal layer that follows in decoding order, and the RPS of the picture that follows in decoding order may be configured not to include non-reference pictures of the same temporal layer.

A moving picture predictive decoding apparatus according to the present invention comprises: input means for inputting compressed image data for a plurality of pictures constituting a moving picture, the compressed image data including a reference picture set (RPS) and being encapsulated in NAL units together with NAL unit header information; and decoding means for decoding the NAL unit header information and the RPS and restoring the compressed image data as reproduced image data, wherein the plurality of pictures constituting the moving picture are classified into a plurality of temporal layers, the RPS identifies the set of pictures used for inter-screen prediction of the associated picture, the NAL unit header information includes nal_unit_type uniquely indicating whether the reproduced image data is used for inter-screen prediction when decoding another picture in the same temporal layer, and the RPS of the other picture does not include non-reference pictures of the same temporal layer.

In the moving picture predictive decoding apparatus according to the present invention, the NAL unit header information may include nal_unit_type uniquely indicating whether the reproduced image data is used for inter-screen prediction when decoding a picture of the same temporal layer that follows in decoding order, and the RPS of the picture that follows in decoding order may be configured not to include non-reference pictures of the same temporal layer.

The effect of the present invention is to save the bit used for nal_ref_flag and make it available for other signaling. This is a more efficient use of the NAL unit header. Another possibility is to extend the NAL unit type from 6 bits to 7 bits. At present, half of the 64 available nal_unit_type values are assigned to existing NAL unit types, and the remaining 32 values are reserved for defining new NAL unit types in the future. By using three of these reserved values and extending the NAL unit type to 7 bits, 93 further NAL unit types (128 - 32 - 3 = 93) can be defined in the future.
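The count of 93 can be checked with a few lines of arithmetic. The snippet below is only a restatement of the figures quoted in the paragraph above; the variable names are illustrative and do not come from the specification.

```python
# Count of NAL unit type code points, using the figures quoted in the text.
BITS_BEFORE = 6  # width of nal_unit_type in the prior art
BITS_AFTER = 7   # width after absorbing the nal_ref_flag bit

total_before = 2 ** BITS_BEFORE        # 64 values with a 6-bit field
total_after = 2 ** BITS_AFTER          # 128 values with a 7-bit field
already_assigned = total_before // 2   # 32 values used by existing types
newly_added = 3                        # the three types added by this scheme

remaining = total_after - already_assigned - newly_added
print(remaining)  # 93
```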

FIG. 1 is a block diagram showing a moving picture predictive encoding apparatus according to an embodiment of the present invention.
FIG. 2 is a block diagram showing a moving picture predictive decoding apparatus according to an embodiment of the present invention.
FIG. 3 is a flowchart showing the processing flow of a moving picture predictive encoding method according to an embodiment of the present invention.
FIG. 4 is a flowchart showing the detailed flow of part of the processing of the moving picture predictive encoding method according to an embodiment of the present invention.
FIG. 5 is a flowchart showing the processing flow of a moving picture predictive decoding method according to an embodiment of the present invention.
FIG. 6 is a flowchart showing the detailed flow of part of the processing of the moving picture predictive decoding method according to an embodiment of the present invention.
FIG. 7 is a diagram showing the hardware configuration of a computer for executing a program recorded on a recording medium.
FIG. 8 is a perspective view of a computer for executing a program stored in a recording medium.
FIG. 9 is a block diagram showing a configuration example of a moving picture predictive encoding program.
FIG. 10 is a block diagram showing a configuration example of a moving picture predictive decoding program.

Hereinafter, embodiments of the present invention will be described with reference to FIGS. 1 to 10.

First, the moving picture predictive encoding method according to the present invention will be described. FIG. 1 is a block diagram showing a moving picture predictive encoding apparatus according to an embodiment of the present invention. Reference numeral 101 denotes an input terminal, 102 a block divider, 103 a prediction signal generator, 104 a frame memory, 105 a subtractor, 106 a transformer, 107 a quantizer, 108 an inverse quantizer, 109 an inverse transformer, 110 an adder, 111 an entropy encoder, 112 an output terminal, and 113 an input terminal. The input terminal 101 corresponds to the input means. The subtractor 105, the transformer 106, the quantizer 107, and the entropy encoder 111 correspond to the encoding means. The inverse quantizer 108, the inverse transformer 109, and the adder 110 correspond to the decoding means.

The operation of the moving picture predictive encoding apparatus configured as described above is as follows. A moving picture signal consisting of a plurality of images is input to the input terminal 101. The image to be encoded is divided into a plurality of regions by the block divider 102. In this embodiment the image is divided into blocks of 8x8 pixels, but it may be divided into blocks of other sizes or shapes. Next, a prediction signal is generated for the region to be encoded (hereinafter called the target block). In this embodiment, two prediction methods are used: inter-screen prediction and intra-screen prediction.

In inter-screen prediction, a reproduced image that was previously encoded and then restored is used as a reference image, and motion information that gives the prediction signal with the smallest error for the target block is obtained from this reference image. This process is called motion detection. Depending on the case, the target block may be subdivided and an inter-screen prediction method determined for each of the resulting small regions. In that case, the most efficient division method for the target block as a whole, and the motion information for each small region, are determined from among the various division methods. In this embodiment, this is performed by the prediction signal generator 103, to which the target block is input via line L102 and the reference image via line L104. A plurality of previously encoded and restored images are used as reference images. The details are the same as in any of the conventional techniques MPEG-2, MPEG-4, and H.264. The motion information and the small-region division method determined in this way are sent via line L112 to the entropy encoder 111, encoded, and sent out from the output terminal 112. Information on which of the plurality of reference images the prediction signal is obtained from (the reference index) is also sent to the entropy encoder 111 via line L112. The prediction signal generator 103 obtains reference image signals from the frame memory 104 on the basis of the small-region division method and the reference image and motion information corresponding to each small region, and generates the prediction signal. The inter-screen prediction signal generated in this way is sent to the subtractor 105 via line L103.
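As an illustration of motion detection, the following is a minimal full-search block-matching sketch. It is not the method of the embodiment: the text only says that the motion information giving the smallest error is chosen, so the sum of absolute differences (SAD) as the error measure, the single reference image, the integer-pixel search window, and the names `sad` and `motion_search` are all assumptions made for the example.

```python
def sad(block, ref, top, left):
    """Sum of absolute differences between block and the same-sized window of ref at (top, left)."""
    return sum(
        abs(block[y][x] - ref[top + y][left + x])
        for y in range(len(block))
        for x in range(len(block[0]))
    )


def motion_search(block, ref, top, left, search_range):
    """Return ((dy, dx), error) with the smallest SAD within +/-search_range of (top, left)."""
    h, w = len(block), len(block[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + h > len(ref) or x + w > len(ref[0]):
                continue  # candidate window falls outside the reference image
            err = sad(block, ref, y, x)
            if best is None or err < best[1]:
                best = ((dy, dx), err)
    return best


# Toy data: a 6x6 reference image and a 2x2 target block copied from position (2, 3).
ref = [[10 * y + x for x in range(6)] for y in range(6)]
block = [row[3:5] for row in ref[2:4]]
print(motion_search(block, ref, top=1, left=1, search_range=2))  # ((1, 2), 0)
```

A real encoder would repeat this search over several reference images and over candidate subdivisions of the target block, and would signal the chosen reference index and division method along with the motion vector.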

In intra-screen prediction, an intra-screen prediction signal is generated using already reproduced pixel values spatially adjacent to the target block. Specifically, the prediction signal generator 103 obtains already reproduced pixel signals in the same screen from the frame memory 104 and extrapolates them to generate the intra-screen prediction signal. Information on the extrapolation method is sent via line L112 to the entropy encoder 111, encoded, and sent out from the output terminal 112. The intra-screen prediction signal generated in this way is sent to the subtractor 105. The method by which the prediction signal generator 103 generates the intra-screen prediction signal is the same as the conventional H.264 method. Of the inter-screen prediction signal and the intra-screen prediction signal obtained as described above, the one with the smaller error is selected and sent to the subtractor 105.

The subtractor 105 subtracts the prediction signal (via line L103) from the signal of the target block (via line L102) to generate a residual signal. This residual signal is subjected to a discrete cosine transform by the transformer 106, and each of its coefficients is quantized by the quantizer 107. Finally, the entropy encoder 111 encodes the quantized transform coefficients, which are sent out from the output terminal 112 together with the information on the prediction method.

To perform intra-screen or inter-screen prediction for subsequent target blocks, the compressed signal of the target block is inversely processed and restored. That is, the quantized transform coefficients are inversely quantized by the inverse quantizer 108 and then subjected to an inverse discrete cosine transform by the inverse transformer 109, restoring the residual signal. The adder 110 adds the restored residual signal to the prediction signal sent via line L103, reproducing the signal of the target block, which is stored in the frame memory 104. This embodiment uses the transformer 106 and the inverse transformer 109, but other transform processing may be used in their place. In some cases, the transformer 106 and the inverse transformer 109 may be omitted.
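The encode-then-reconstruct loop can be sketched as follows. This is a simplified illustration rather than the embodiment itself: it takes the case mentioned above in which the transform is omitted, treats a block as a flat list of pixel values, and assumes a uniform scalar quantizer with step `qstep`. The function names and the quantizer are assumptions for the example.

```python
def encode_block(target, prediction, qstep):
    """Subtract the prediction and quantize the residual (transform omitted)."""
    # round() maps each scaled residual to the nearest integer level
    # (Python rounds exact .5 ties to the nearest even integer).
    return [round((t - p) / qstep) for t, p in zip(target, prediction)]


def reconstruct_block(levels, prediction, qstep):
    """Dequantize the levels and add the prediction back, as the local decoder does."""
    return [p + q * qstep for q, p in zip(levels, prediction)]


target = [53, 55, 61, 67]
prediction = [50, 50, 60, 60]
levels = encode_block(target, prediction, qstep=4)
reproduced = reconstruct_block(levels, prediction, qstep=4)
print(levels)      # [1, 1, 0, 2]
print(reproduced)  # [54, 54, 60, 68]
```

The reproduced block, not the original, is what goes into the frame memory, so that the encoder predicts from the same data the decoder will have.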

Information on the display order of each image, the type used to encode the image (intra-screen predictive coding, inter-screen predictive coding, or bidirectional predictive coding), and the NAL unit type is input from the input terminal 113, and the prediction signal generator 103 operates on the basis of this information. This information is also sent to the entropy encoder 111 via line L113, encoded, and sent out from the output terminal 112. The operation of the entropy encoder 111 for encoding the NAL unit type is described later.

Next, the moving picture predictive decoding method according to the present invention will be described. FIG. 2 is a block diagram of a moving picture predictive decoding apparatus according to an embodiment of the present invention. Reference numeral 201 denotes an input terminal, 202 a data analyzer, 203 an inverse quantizer, 204 an inverse transformer, 205 an adder, 206 an output terminal, 207 a frame memory, 208 a prediction signal generator, and 209 a frame memory manager. The input terminal 201 corresponds to the input means. The data analyzer 202, the inverse quantizer 203, the inverse transformer 204, and the adder 205 correspond to the decoding means. Other components may be used as the decoding means, and the inverse transformer 204 may be omitted.

The operation of the moving picture predictive decoding apparatus configured as described above is as follows. Compressed data that has been compression-encoded by the method described above is input from the input terminal 201. The compressed data includes the residual signal obtained by predicting and encoding a target block, which is one of the blocks into which an image is divided, together with information related to generation of the prediction signal. In addition to the NAL unit type, the information related to generation of the prediction signal includes, for inter-screen prediction, information on block division (block size), motion information, and the reference index described above, and, for intra-screen prediction, information on the method of extrapolation from surrounding already reproduced pixels.

The data analyzer 202 extracts from the compressed data the residual signal of the target block, the information related to generation of the prediction signal (including the NAL unit type), the quantization parameter, and the display order information of the image. The operation of the data analyzer 202 for extracting the NAL unit type is described later. The residual signal of the target block is inversely quantized by the inverse quantizer 203 on the basis of the quantization parameter (via line L202). The result is subjected to an inverse discrete cosine transform by the inverse transformer 204.

Next, information related to generation of the prediction signal, such as the display order information of the target image, the image encoding type, the NAL unit type, and the reference index, is sent to the prediction signal generator 208 via line L206. On the basis of this information, the prediction signal generator 208 accesses the frame memory 207, obtains a reference signal from among the plurality of reference images (via line L207), and generates the prediction signal. The prediction signal is sent via line L208 to the adder 205 and added to the restored residual signal to reproduce the target block signal, which is output from the output terminal 206 via line L205 and at the same time stored in the frame memory 207.

The frame memory 207 stores reproduced images used for decoding and reproducing subsequent images.

Tables 2 and 3 show two alternative syntaxes for the use of the 2 bytes of the NAL unit header.

In Tables 2 and 3, the number in parentheses in the Descriptor column is the number of bits of the corresponding item.

In the NAL unit header syntax of Table 2, nal_ref_flag is replaced by a reserved bit (reserved). This bit is ignored by current decoding apparatuses, but a new meaning or semantics can be assigned to it for future decoding apparatuses. The bit arrangement in Table 2 is for illustration only, and the reserved bit may be placed elsewhere in the 2-byte header.

In the NAL unit header syntax of Table 3, 7 bits are allocated to nal_unit_type, so that up to 128 different nal_unit_type values can be defined. Although this embodiment chooses to allocate 7 bits to nal_unit_type, the bit saved from nal_ref_flag may instead be allocated to temporal_id.

Table 4 shows the NAL unit types in this embodiment.

Table 4 shows the value of nal_ref_flag inferred from the value of nal_unit_type. As shown in the second column of Table 4, the NAL unit types can be grouped into several categories, as follows.
1) RAP slice: NAL units containing a coded slice of a random access picture
2) TLA slice: NAL units containing a coded slice of a temporal layer access picture
3) TFD slice: NAL units containing a coded slice of a picture tagged for discard
4) Other slice: NAL units containing a coded slice that is none of the above
5) Parameter Set: NAL units containing a video, sequence, picture, or adaptation parameter set
6) Information: NAL units containing an access delimiter, filler data, or supplemental enhancement information (SEI)

In this embodiment, three new NAL unit types, corresponding to nal_unit_type (picture type) values 9, 10, and 11, are added to the prior-art nal_unit_type values. NAL units with these values carry the same slice types as NAL units with nal_unit_type values 1, 2, and 3, respectively: nal_unit_type 1 contains coded slices of non-RAP, non-TFD, non-TLA pictures, nal_unit_type 2 contains coded slices of TFD pictures, and nal_unit_type 3 contains coded slices of non-TFD TLA pictures.
The difference from the prior art is that, in this embodiment, values 1, 2, and 3 are coded slices belonging to non-reference pictures, whereas values 9, 10, and 11 are coded slices belonging to reference pictures.
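On the receiving side, the implied nal_ref_flag can be recovered with a simple lookup. The sketch below is an illustration only. Table 4 is not reproduced in this text, so the ranges are collected from the prose: values 1 to 3 and 9 to 11 from the paragraph above, and the other ranges from the description of FIG. 4. Returning `None` for every other value is a choice made for the example, and `infer_nal_ref_flag` is a hypothetical name.

```python
# nal_unit_type ranges and the nal_ref_flag value each one implies,
# as stated in the prose (Table 4 itself is not reproduced here).
IMPLIED_NAL_REF_FLAG = [
    (range(1, 4), 0),    # 1-3: coded slices of non-reference pictures
    (range(4, 9), 1),    # 4-8: RAP slices
    (range(9, 12), 1),   # 9-11: coded slices of reference pictures
    (range(25, 29), 1),  # 25-28: parameter sets
    (range(29, 32), 0),  # 29-31: information (delimiter, filler data, SEI)
]


def infer_nal_ref_flag(nal_unit_type):
    """Return the implied nal_ref_flag, or None for values the text does not cover."""
    for values, flag in IMPLIED_NAL_REF_FLAG:
        if nal_unit_type in values:
            return flag
    return None


print(infer_nal_ref_flag(2))   # 0
print(infer_nal_ref_flag(10))  # 1
```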

The values assigned to each category are not limited to those above. Furthermore, each category may be extended into several subcategories, and new values may be assigned to those subcategories using the reserved values in Table 4.

FIG. 3 shows the operation of the moving picture predictive encoding apparatus for encoding the NAL unit header in this embodiment. In step 110, the apparatus acquires the video data to be packetized. In step 120, it encodes the first bit of the NAL unit, which is always fixed to 0. In step 130, it determines and encodes nal_unit_type. In step 140, it encodes temporal_id, and in step 150, it encodes the five reserved bits (reserved_one_5bits), completing the NAL unit header. In step 160, it packetizes the remaining payload, and the process ends.
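Steps 120 to 150 amount to packing four fields into two bytes. The following sketch assumes the 7-bit nal_unit_type variant (Table 3) and writes the fields most significant bit first, in the order of the steps. The 3-bit width of temporal_id is inferred here as what is left of the 16 bits after 1 + 7 + 5, and the default value 1 for reserved_one_5bits is an assumption suggested by its name. The function names are hypothetical.

```python
def pack_nal_header(nal_unit_type, temporal_id, reserved_one_5bits=1):
    """Pack the header fields in the order of steps 120 to 150 into two bytes."""
    if not 0 <= nal_unit_type < 128:
        raise ValueError("nal_unit_type must fit in 7 bits")
    if not 0 <= temporal_id < 8:
        raise ValueError("temporal_id must fit in 3 bits (assumed width)")
    if not 0 <= reserved_one_5bits < 32:
        raise ValueError("reserved_one_5bits must fit in 5 bits")
    first_bit = 0  # step 120: the first bit is always 0
    word = (first_bit << 15) | (nal_unit_type << 8) | (temporal_id << 5) | reserved_one_5bits
    return word.to_bytes(2, "big")


def parse_nal_header(header):
    """Inverse of pack_nal_header: split the first two bytes back into fields."""
    word = int.from_bytes(header[:2], "big")
    return {
        "first_bit": word >> 15,
        "nal_unit_type": (word >> 8) & 0x7F,
        "temporal_id": (word >> 5) & 0x07,
        "reserved_one_5bits": word & 0x1F,
    }


header = pack_nal_header(nal_unit_type=9, temporal_id=2)
print(header.hex())              # 0941
print(parse_nal_header(header))
```

With the Table 2 variant, the same code would apply with a 6-bit nal_unit_type and one extra reserved bit, whose position the text leaves open.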

図４に上述のステップ１３０におけるｎａｌ＿ｕｎｉｔ＿ｔｙｐｅの決定及び符号化における処理の詳細を示す。 FIG. 4 shows details of the process for determining and encoding nal_unit_type in step 130 described above.

ステップ２１０において、動画像予測符号化装置はパケット化されるデータがランダム・アクセス・ピクチャ（ＲＡＰ）のいずれかに属する符号化スライスであるか否かを判定し、ＲＡＰのいずれかに属する符号化スライスである場合（ＹＥＳ）はステップ２２０に進む。そうでない場合（ＮＯ）はステップ２３０に進む。 In step 210, the video predictive coding apparatus determines whether or not the data to be packetized is a coded slice belonging to any of the random access pictures (RAP), and encodes belonging to any of the RAPs. If it is a slice (YES), the process proceeds to step 220. If not (NO), the process proceeds to Step 230.

ステップ２２０において、動画像予測符号化装置はＲＡＰタイプに応じて、ｎａｌ＿ｒｅｆ＿ｆｌａｇが１であることを暗示する４から８までのｎａｌ＿ｕｎｉｔ＿ｔｙｐｅを符号化し、ステップ１４０に進む。 In step 220, the video predictive encoding apparatus encodes 4 to 8 nal_unit_types implying that nal_ref_flag is 1 according to the RAP type, and then proceeds to step 140.

ステップ２３０において、動画像予測符号化装置はパケット化されるデータがパラメータ・セットであるか否かを判定し、パラメータ・セットである場合（ＹＥＳ）はステップ２４０に進む。そうでない場合（ＮＯ）はステップ２５０に進む。 In step 230, the video predictive coding apparatus determines whether or not the data to be packetized is a parameter set. If the data is a parameter set (YES), the process proceeds to step 240. Otherwise (NO), go to step 250.

ステップ２４０において、動画像予測符号化装置はパラメータ・セットに応じて、ｎａｌ＿ｒｅｆ＿ｆｌａｇが１であることを暗示する２５から２８までのｎａｌ＿ｕｎｉｔ＿ｔｙｐｅを符号化し、ステップ１４０に進む。 In step 240, the video predictive coding apparatus codes nal_unit_type from 25 to 28 implying that nal_ref_flag is 1 according to the parameter set, and proceeds to step 140.

In step 250, the moving picture predictive encoding apparatus determines whether the data to be packetized is information data. If it is (YES), the process proceeds to step 260. If not (NO), the process proceeds to step 270.

In step 260, the moving picture predictive encoding apparatus encodes, according to the information type, a nal_unit_type value from 29 to 31, which implies that nal_ref_flag is 0, and proceeds to step 140.

In step 270, the moving picture predictive encoding apparatus determines whether the data to be packetized is a reference picture. If it is (YES), the process proceeds to step 280. If not (NO), the process proceeds to step 290. This determination is made based on the inter-picture reference information output from the prediction signal generator.

The conditional branch in step 270 may instead be as follows. At step 270, the video data must be either a reference picture or a non-reference picture. The moving picture predictive encoding apparatus determines whether the picture is a reference picture. If it is (YES), the process proceeds to step 280. If not (NO), the process proceeds to step 290.

In step 280, the moving picture predictive encoding apparatus encodes, according to the slice type, a nal_unit_type value from 9 to 11, which implies that nal_ref_flag is 1, and proceeds to step 140.

In step 290, the moving picture predictive encoding apparatus encodes, according to the slice type, a nal_unit_type value from 1 to 3, which implies that nal_ref_flag is 0, and proceeds to step 140.
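The decision sequence of steps 210 to 290 can be sketched as follows. The value ranges and the implied nal_ref_flag come from the steps above; the function names and the `subtype_index` parameter are hypothetical, because this section does not say how a specific value is chosen within each range from the RAP type, parameter set, information type, or slice type.

```python
def nal_unit_type_range(is_rap_slice: bool, is_parameter_set: bool,
                        is_information_data: bool,
                        is_reference_picture: bool):
    """Return (candidate nal_unit_type values, implied nal_ref_flag).

    The checks are made in the same order as the flow of FIG. 4.
    """
    if is_rap_slice:                 # step 210 -> step 220
        return range(4, 9), 1
    if is_parameter_set:             # step 230 -> step 240
        return range(25, 29), 1
    if is_information_data:          # step 250 -> step 260
        return range(29, 32), 0
    if is_reference_picture:         # step 270 -> step 280
        return range(9, 12), 1
    return range(1, 4), 0            # step 290


def choose_nal_unit_type(candidates: range, subtype_index: int) -> int:
    """Pick one value from the range.

    subtype_index is a HYPOTHETICAL stand-in for the type-dependent
    choice that this section leaves unspecified.
    """
    if not 0 <= subtype_index < len(candidates):
        raise ValueError("subtype_index outside the candidate range")
    return candidates[subtype_index]
```

Because the RAP check comes first, a RAP slice is assigned a value from 4 to 8 even though it is also a reference picture.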

FIG. 5 shows the operation of the moving picture predictive decoding apparatus when decoding the NAL unit header in this embodiment. In step 310, the apparatus acquires the next packet to be decoded. In step 320, it decodes the first bit of the NAL unit (forbidden_zero_bit), which is always fixed to 0. In step 330, it decodes nal_unit_type and sets the value of nal_ref_flag. In step 340, it decodes temporal_id, and in step 350, it decodes the five reserved bits (reserved_one_5bits), which completes the NAL unit header. In step 360, it reads the remaining payload from the packet, and the process ends.

FIG. 6 shows the details of the process of decoding nal_unit_type and setting the value of nal_ref_flag in step 330 above.

In step 400, the moving picture predictive decoding apparatus obtains the value of nal_unit_type by decoding the NAL unit header.

In step 410, the moving picture predictive decoding apparatus determines whether the value of nal_unit_type is from 1 to 3. If it is (YES), the NAL unit contains a coded slice of a non-reference picture, and the process proceeds to step 420. If not (NO), the process proceeds to step 430.

In step 420, the moving picture predictive decoding apparatus sets the value of nal_ref_flag to 0 and proceeds to step 340.

In step 430, the moving picture predictive decoding apparatus determines whether the value of nal_unit_type is from 4 to 11. If it is (YES), the NAL unit contains a coded slice of a random access picture or a coded slice of a reference picture, and the process proceeds to step 440. If not (NO), the process proceeds to step 450.

In step 440, the moving picture predictive decoding apparatus sets the value of nal_ref_flag to 1 and proceeds to step 340.

In step 450, the moving picture predictive decoding apparatus determines whether the value of nal_unit_type is from 25 to 28. If it is (YES), the NAL unit contains a parameter set, and the process proceeds to step 460. If not (NO), the process proceeds to step 470.

In step 460, the moving picture predictive decoding apparatus sets the value of nal_ref_flag to 1 and proceeds to step 340.

In step 470, the moving picture predictive decoding apparatus determines whether the value of nal_unit_type is from 29 to 31. If it is (YES), the NAL unit contains information data, and the process proceeds to step 480. If not (NO), nal_unit_type has an invalid value, and the process proceeds to step 490.

In step 480, the moving picture predictive decoding apparatus sets the value of nal_ref_flag to 0 and proceeds to step 340.

In step 490, the moving picture predictive decoding apparatus treats the value of nal_ref_flag as undefined and proceeds to step 340.
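The decision chain of steps 410 to 490 can be sketched as a single function. The ranges are those given in the steps above; the function name `infer_nal_ref_flag` and the use of `None` to represent the undefined case of step 490 are assumptions made for illustration.

```python
from typing import Optional


def infer_nal_ref_flag(nal_unit_type: int) -> Optional[int]:
    """Derive nal_ref_flag from nal_unit_type, following FIG. 6."""
    if 1 <= nal_unit_type <= 3:      # step 410: slice of a non-reference picture
        return 0                     # step 420
    if 4 <= nal_unit_type <= 11:     # step 430: RAP slice or reference picture slice
        return 1                     # step 440
    if 25 <= nal_unit_type <= 28:    # step 450: parameter set
        return 1                     # step 460
    if 29 <= nal_unit_type <= 31:    # step 470: information data
        return 0                     # step 480
    return None                      # step 490: invalid type, flag undefined
```

Whichever branch is taken, the caller continues with temporal_id (step 340).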

In this embodiment, nal_ref_flag is set through the logical decisions described above, but its value may instead be set using a lookup table of nal_ref_flag indexed by nal_unit_type. Table 5 is an example of such a lookup table.

In Table 5, the 32 entries of nal_ref_flag are set to the same values as in the last column of Table 4.
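A table-driven variant can be sketched as follows. The table below is not a copy of Table 5; it is rebuilt from the ranges stated in steps 410 to 490, and the entries for values outside those ranges (0 and 12 to 24) are left as `None` because this section does not give them. The names `NAL_REF_FLAG_TABLE` and `lookup_nal_ref_flag` are hypothetical.

```python
# 32 entries indexed by nal_unit_type. None marks values for which
# this section gives no nal_ref_flag (an assumption of this sketch).
NAL_REF_FLAG_TABLE = [None] * 32
for t in range(1, 4):       # non-reference picture slices
    NAL_REF_FLAG_TABLE[t] = 0
for t in range(4, 12):      # RAP slices and reference picture slices
    NAL_REF_FLAG_TABLE[t] = 1
for t in range(25, 29):     # parameter sets
    NAL_REF_FLAG_TABLE[t] = 1
for t in range(29, 32):     # information data
    NAL_REF_FLAG_TABLE[t] = 0


def lookup_nal_ref_flag(nal_unit_type: int):
    """Return the table entry for nal_unit_type (0 to 31)."""
    if not 0 <= nal_unit_type < len(NAL_REF_FLAG_TABLE):
        raise ValueError("nal_unit_type outside the 32-entry table")
    return NAL_REF_FLAG_TABLE[nal_unit_type]
```

A table replaces the chain of range comparisons with a single indexed read, which may suit a network element that inspects many headers.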

Note that the method of inferring or setting nal_ref_flag described above is not limited to the moving picture predictive decoding apparatus and can also be applied to MANEs.

In this embodiment, the moving picture predictive decoding apparatus may choose not to set nal_ref_flag and may instead use the value of nal_unit_type directly when determining whether a decoded picture is a reference picture. Expressed logically: if the nal_unit_type of the picture is 1, 2, or 3, the picture is a non-reference picture. Otherwise, the picture is a reference picture and is stored so that other pictures can use it for reference.
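The direct test can be sketched in a few lines. The names `is_reference_picture` and `store_if_reference` and the list standing in for the picture store are hypothetical, and the sketch assumes the NAL unit is already known to carry a picture.

```python
NON_REFERENCE_TYPES = frozenset({1, 2, 3})


def is_reference_picture(nal_unit_type: int) -> bool:
    """True unless nal_unit_type marks a non-reference picture."""
    return nal_unit_type not in NON_REFERENCE_TYPES


def store_if_reference(picture_store: list, picture, nal_unit_type: int) -> None:
    """Keep the decoded picture only if later pictures may reference it."""
    if is_reference_picture(nal_unit_type):
        picture_store.append(picture)
```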

In this embodiment, the definitions of reference picture and non-reference picture apply to the video as a whole. However, these definitions may no longer be accurate once the video has undergone selective frame dropping, in which pictures of higher temporal layers are discarded.

In such a situation, some reference pictures may become pictures that are never actually referenced. To avoid this, reference pictures with nal_unit_type 9, 10, or 11 and non-reference pictures with nal_unit_type 1, 2, or 3 may be defined as follows.

A reference picture is a picture that is used for inter-picture prediction by at least one other picture in the same temporal layer as that picture.

A non-reference picture is a picture that is not used for inter-picture prediction by any other picture in the same temporal layer as that picture.

In the conventional method described in Non-Patent Document 1, inter-picture prediction is governed by the contents of the reference picture set (RPS), which specifies which pictures are available for inter-picture prediction. The definitions above may therefore be restated as follows.

A non-reference picture (nal_unit_type 1, 2, or 3) is not included in the RPS of any other picture in the same temporal layer as that picture.

A reference picture (nal_unit_type 9, 10, or 11) is included in the RPS of at least one other picture in the same temporal layer as that picture.
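The RPS-based definitions can be illustrated with a small check. The `Picture` record, its fields, and the function name are hypothetical; the sketch assumes each picture carries an identifier, its temporal layer, and the set of picture identifiers in its RPS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Picture:
    pic_id: int
    temporal_id: int
    rps: frozenset = frozenset()  # ids of the pictures in this picture's RPS


def is_reference_within_layer(target: Picture, pictures) -> bool:
    """True if some OTHER picture of the same temporal layer lists target in its RPS."""
    return any(
        other.pic_id != target.pic_id
        and other.temporal_id == target.temporal_id
        and target.pic_id in other.rps
        for other in pictures
    )


# Example: picture 1 is referenced only by picture 2, which lies in a
# higher temporal layer, so under this definition it is a non-reference picture.
pictures = [
    Picture(0, 0),
    Picture(1, 0, frozenset({0})),
    Picture(2, 1, frozenset({1})),
]
```

In the example, discarding temporal layer 1 removes picture 2 and leaves the classification of pictures 0 and 1 unchanged, which is the property the revised definitions aim for.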

The moving picture predictive encoding program and the moving picture predictive decoding program according to the present invention, which cause a computer to function as the moving picture predictive encoding apparatus and the moving picture predictive decoding apparatus described above, are provided as programs stored on a recording medium. Examples of the recording medium include a floppy (registered trademark) disk, a CD-ROM, a DVD, a ROM, and a semiconductor memory.

FIG. 7 is a diagram showing the hardware configuration of a computer for executing a program recorded on a recording medium, and FIG. 8 is a perspective view of a computer for executing a program stored on a recording medium. The term computer here includes DVD players, set-top boxes, mobile phones, and similar devices that have a CPU and perform processing and control by software.

As shown in FIG. 7, the computer 30 comprises a reading device 12 such as a floppy (registered trademark) disk drive, a CD-ROM drive, or a DVD drive; a working memory (RAM) 14 in which the operating system is resident; a memory 16 that stores the program stored on the recording medium 10; a display device 18 such as a display; a mouse 20 and a keyboard 22 as input devices; a communication device 24 for transmitting and receiving data and the like; and a CPU 26 that controls execution of the program. When the recording medium 10 is inserted into the reading device 12, the computer 30 can access, through the reading device 12, the moving picture predictive encoding and decoding programs stored on the recording medium 10, and these programs enable it to operate as the moving picture predictive encoding apparatus and decoding apparatus according to the present invention.

As shown in FIG. 8, the moving picture predictive encoding program or the moving picture predictive decoding program may be provided over a network as a computer data signal 40 superimposed on a carrier wave. In this case, the computer 30 can store the program received by the communication device 24 in the memory 16 and execute it.

Specifically, as shown in FIG. 9, the moving picture predictive encoding program P100 comprises an input module P101 that inputs a plurality of images constituting a moving picture, and an encoding module P102 that encodes the images by either intra-picture prediction or inter-picture prediction, generates compressed image data, and packetizes the data together with packet header information. The packet header information includes a picture type, and the encoding module P102 determines the picture type so that it uniquely indicates whether the encoded picture data is used for reference when decoding other pictures.

Similarly, as shown in FIG. 10, the moving picture predictive decoding program P200 comprises an input module P201 that inputs compressed image data, obtained by encoding a plurality of images constituting a moving picture by either intra-picture prediction or inter-picture prediction and packetized together with packet header information, and a decoding module P202 that restores the packet header information and the compressed image data. The packet header information includes a picture type that uniquely indicates whether the restored picture data is used for reference when decoding other pictures, and the decoding module P202 determines, based on the picture type, whether the restored picture data is used for reference when decoding other pictures.

The decoding module P202 may determine whether the restored picture data is used for reference when decoding other pictures based on a pre-stored correspondence table that associates each picture type with information indicating whether the restored picture data is used for reference when decoding other pictures.

To solve the problem described above, a moving picture predictive encoding apparatus according to the present invention comprises input means for inputting a plurality of images constituting a moving picture, and encoding means for encoding the images by either intra-picture prediction or inter-picture prediction, generating compressed image data, and packetizing the data together with packet header information. The packet header information includes a picture type, and the encoding means determines the picture type so that it uniquely indicates whether the encoded picture data is used for reference when decoding other pictures.

A moving picture predictive decoding apparatus according to the present invention comprises input means for inputting compressed image data, obtained by encoding a plurality of images constituting a moving picture by either intra-picture prediction or inter-picture prediction and packetized together with packet header information, and decoding means for restoring the packet header information and the compressed image data. The packet header information includes a picture type that uniquely indicates whether the restored picture data is used for reference when decoding other pictures, and the decoding means determines, based on the picture type, whether the restored picture data is used for reference when decoding other pictures.

The decoding means of the moving picture predictive decoding apparatus according to the present invention determines whether the restored picture data is used for reference when decoding other pictures based on a pre-stored correspondence table that associates each picture type with information indicating whether the restored picture data is used for reference when decoding other pictures.

A moving picture predictive encoding method according to the present invention comprises an input step of inputting a plurality of images constituting a moving picture, and an encoding step of encoding the images by either intra-picture prediction or inter-picture prediction, generating compressed image data, and packetizing the data together with packet header information. The packet header information includes a picture type, and the encoding step determines the picture type so that it uniquely indicates whether the encoded picture data is used for reference when decoding other pictures.

A moving picture predictive decoding method according to the present invention comprises an input step of inputting compressed image data, obtained by encoding a plurality of images constituting a moving picture by either intra-picture prediction or inter-picture prediction and packetized together with packet header information, and a decoding step of restoring the packet header information and the compressed image data. The packet header information includes a picture type that uniquely indicates whether the restored picture data is used for reference when decoding other pictures, and the decoding step determines, based on the picture type, whether the restored picture data is used for reference when decoding other pictures.

The decoding step of the moving picture predictive decoding method according to the present invention determines whether the restored picture data is used for reference when decoding other pictures based on a pre-stored correspondence table that associates each picture type with information indicating whether the restored picture data is used for reference when decoding other pictures.

A moving picture predictive encoding program according to the present invention comprises an input module that inputs a plurality of images constituting a moving picture, and an encoding module that encodes the images by either intra-picture prediction or inter-picture prediction, generates compressed image data, and packetizes the data together with packet header information. The packet header information includes a picture type, and the encoding module determines the picture type so that it uniquely indicates whether the encoded picture data is used for reference when decoding other pictures.

A moving picture predictive decoding program according to the present invention comprises an input module that inputs compressed image data, obtained by encoding a plurality of images constituting a moving picture by either intra-picture prediction or inter-picture prediction and packetized together with packet header information, and a decoding module that restores the packet header information and the compressed image data. The packet header information includes a picture type that uniquely indicates whether the restored picture data is used for reference when decoding other pictures, and the decoding module determines, based on the picture type, whether the restored picture data is used for reference when decoding other pictures.

The decoding module of the moving picture predictive decoding program according to the present invention determines whether the restored picture data is used for reference when decoding other pictures based on a pre-stored correspondence table that associates each picture type with information indicating whether the restored picture data is used for reference when decoding other pictures.

To solve the problem described above, a moving picture predictive encoding apparatus according to the present invention comprises input means for inputting a plurality of pictures constituting a moving picture, and encoding means for encoding the pictures, generating compressed image data, and encapsulating the data in NAL units together with NAL unit header information. The plurality of pictures constituting the moving picture are classified into a plurality of temporal layers, the NAL unit header information includes nal_unit_type, and the encoding means determines nal_unit_type so that it uniquely indicates whether the encoded picture data is used for reference when decoding other pictures in the same temporal layer.

A moving picture predictive decoding apparatus according to the present invention comprises input means for inputting compressed image data in which a plurality of pictures constituting a moving picture have been encoded and encapsulated in NAL units together with NAL unit header information, and decoding means for restoring the NAL unit header information and the compressed image data. The plurality of pictures constituting the moving picture are classified into a plurality of temporal layers, the NAL unit header information includes a nal_unit_type that uniquely indicates whether the restored picture data is used for reference when decoding other pictures in the same temporal layer, and the decoding means restores the compressed image data based on nal_unit_type.

A moving picture predictive encoding method according to the present invention comprises an input step of inputting a plurality of pictures constituting a moving picture, and an encoding step of encoding the pictures, generating compressed image data, and encapsulating the data in NAL units together with NAL unit header information. The plurality of pictures constituting the moving picture are classified into a plurality of temporal layers, the NAL unit header information includes nal_unit_type, and the encoding step determines nal_unit_type so that it uniquely indicates whether the encoded picture data is used for reference when decoding other pictures in the same temporal layer.

A moving picture predictive decoding method according to the present invention comprises an input step of inputting compressed image data in which a plurality of pictures constituting a moving picture have been encoded and encapsulated in NAL units together with NAL unit header information, and a decoding step of restoring the NAL unit header information and the compressed image data. The plurality of pictures constituting the moving picture are classified into a plurality of temporal layers, the NAL unit header information includes a nal_unit_type that uniquely indicates whether the restored picture data is used for reference when decoding other pictures in the same temporal layer, and the decoding step restores the compressed image data based on nal_unit_type.

A moving picture predictive encoding program according to the present invention comprises an input module that inputs a plurality of pictures constituting a moving picture, and an encoding module that encodes the pictures, generates compressed image data, and encapsulates the data in NAL units together with NAL unit header information. The plurality of pictures constituting the moving picture are classified into a plurality of temporal layers, the NAL unit header information includes nal_unit_type, and the encoding module determines nal_unit_type so that it uniquely indicates whether the encoded picture data is used for reference when decoding other pictures in the same temporal layer.

A moving picture predictive decoding program according to the present invention comprises an input module that inputs compressed image data in which a plurality of pictures constituting a moving picture have been encoded and encapsulated in NAL units together with NAL unit header information, and a decoding module that restores the NAL unit header information and the compressed image data. The plurality of pictures constituting the moving picture are classified into a plurality of temporal layers, the NAL unit header information includes a nal_unit_type that uniquely indicates whether the restored picture data is used for reference when decoding other pictures in the same temporal layer, and the decoding module restores the compressed image data based on nal_unit_type.

A moving picture predictive encoding apparatus according to the present invention comprises input means for inputting a plurality of pictures constituting a moving picture, and encoding means for encoding the pictures, generating compressed image data, and encapsulating the data in NAL units together with NAL unit header information. The NAL unit header information includes nal_unit_type, and the encoding means determines nal_unit_type so that it uniquely indicates whether the encoded picture data is used for reference when decoding other pictures.

A moving picture predictive decoding apparatus according to the present invention comprises input means for inputting compressed image data in which a plurality of pictures constituting a moving picture have been encoded and encapsulated in NAL units together with NAL unit header information, and decoding means for restoring the NAL unit header information and the compressed image data. The NAL unit header information includes a nal_unit_type that uniquely indicates whether the restored picture data is used for reference when decoding other pictures, and the decoding means decodes the compressed image data based on nal_unit_type.

The decoding means of the moving picture predictive decoding apparatus according to the present invention may decode the compressed image data based on a pre-stored correspondence table that associates nal_unit_type with information indicating whether the restored picture data is used for reference when decoding other pictures.

A moving picture predictive encoding method according to the present invention comprises an input step of inputting a plurality of pictures constituting a moving picture, and an encoding step of encoding the pictures, generating compressed image data, and encapsulating the data in NAL units together with NAL unit header information. The NAL unit header information includes nal_unit_type, and the encoding step determines nal_unit_type so that it uniquely indicates whether the encoded picture data is used for reference when decoding other pictures.

A moving picture predictive decoding method according to the present invention comprises: an input step of inputting compressed image data in which a plurality of pictures constituting a moving picture have been encoded and encapsulated in a NAL unit together with NAL unit header information; and a decoding step of restoring the NAL unit header information and the compressed image data. The NAL unit header information includes nal_unit_type, which uniquely indicates whether or not the restored picture data is used for reference when decoding other pictures, and the decoding step decodes the compressed image data on the basis of nal_unit_type.

The decoding step of the moving picture predictive decoding method according to the present invention may decode the compressed image data on the basis of a correspondence table, stored in advance, that associates nal_unit_type with information indicating whether or not the restored picture data is used for reference when decoding other pictures.

DESCRIPTION OF SYMBOLS: 101 ... input terminal, 102 ... block divider, 103 ... prediction signal generator, 104 ... frame memory, 105 ... subtractor, 106 ... transformer, 107 ... quantizer, 108 ... inverse quantizer, 109 ... inverse transformer, 110 ... adder, 111 ... entropy encoder, 112 ... output terminal, 113 ... input terminal, 201 ... input terminal, 202 ... data analyzer, 203 ... inverse quantizer, 204 ... inverse transformer, 205 ... adder, 206 ... output terminal, 207 ... frame memory, 208 ... prediction signal generator.

Claims

A moving picture predictive decoding method executed by a moving picture predictive decoding device, the method comprising:
an input step of inputting compressed image data for a plurality of pictures constituting a moving picture, the compressed image data including a reference picture set (RPS) and being encapsulated in a NAL unit together with NAL unit header information; and
a decoding step of decoding the NAL unit header information and the RPS, and restoring the compressed image data as reproduced image data,
wherein the plurality of pictures constituting the moving picture are classified into a plurality of temporal layers,
the RPS identifies a set of pictures used for inter-picture prediction of the associated picture,
the NAL unit header information includes nal_unit_type that uniquely indicates whether or not the reproduced image data is used for inter-picture prediction when decoding other pictures of the same temporal layer, and
the RPS of the other pictures does not include non-reference pictures of the same temporal layer.

The moving picture predictive decoding method according to claim 1, wherein
the NAL unit header information includes nal_unit_type that uniquely indicates whether or not the reproduced image data is used for inter-picture prediction when decoding pictures of the same temporal layer that are subsequent in decoding order, and
the RPS of the pictures that are subsequent in decoding order does not include non-reference pictures of the same temporal layer.

A moving picture predictive decoding apparatus comprising:
input means for inputting compressed image data for a plurality of pictures constituting a moving picture, the compressed image data including a reference picture set (RPS) and being encapsulated in a NAL unit together with NAL unit header information; and
decoding means for decoding the NAL unit header information and the RPS, and restoring the compressed image data as reproduced image data,
wherein the plurality of pictures constituting the moving picture are classified into a plurality of temporal layers,
the RPS identifies a set of pictures used for inter-picture prediction of the associated picture,
the NAL unit header information includes nal_unit_type that uniquely indicates whether or not the reproduced image data is used for inter-picture prediction when decoding other pictures of the same temporal layer, and
the RPS of the other pictures does not include non-reference pictures of the same temporal layer.

The moving picture predictive decoding apparatus according to claim 3, wherein
the NAL unit header information includes nal_unit_type that uniquely indicates whether or not the reproduced image data is used for inter-picture prediction when decoding pictures of the same temporal layer that are subsequent in decoding order, and
the RPS of the pictures that are subsequent in decoding order does not include non-reference pictures of the same temporal layer.