JP3344576B2

JP3344576B2 - Image encoding device and image encoding method, image decoding device and image decoding method

Info

Publication number: JP3344576B2
Application number: JP2000184491A
Authority: JP
Inventors: 輝彦鈴木; 陽一矢ヶ崎
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 1996-09-09
Filing date: 2000-06-20
Publication date: 2002-11-11
Anticipated expiration: 2016-09-20
Also published as: JP2001045496A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to an image encoding device and image encoding method, an image decoding device and image decoding method, and a recording medium and recording method. In particular, it relates to such devices, methods, and media that are suitable for use when, for example, moving image data is recorded on a recording medium such as a magneto-optical disk or magnetic tape and is reproduced and displayed on a display or the like, or when, as in video conference systems, videophone systems, broadcasting equipment, and multimedia database retrieval systems, moving image data is transmitted from a transmitting side to a receiving side over a transmission path and is received and displayed, or edited and recorded, on the receiving side.

[0002]

[Description of the Related Art] In systems that transmit moving image data to a remote location, such as video conference systems and videophone systems, image data is compression-coded by exploiting its line correlation and inter-frame correlation so that the transmission path is used efficiently.

[0003] A representative high-efficiency coding scheme for moving images is the MPEG (Moving Picture Experts Group) scheme for storage-oriented moving picture coding. It was discussed in ISO-IEC/JTC1/SC2/WG11 and proposed as a draft standard, and it adopts a hybrid scheme that combines motion-compensated predictive coding with DCT (Discrete Cosine Transform) coding.

[0004] MPEG defines several profiles and levels to support a variety of applications and functions. The most basic is Main Profile at Main Level (MP@ML).

[0005] FIG. 42 shows an example configuration of an MP@ML encoder in the MPEG scheme.

[0006] Image data to be encoded is input to a frame memory 31 and temporarily stored. A motion vector detector 32 then reads the image data stored in the frame memory 31 in units of macroblocks, each composed of, for example, 16 × 16 pixels, and detects their motion vectors.

[0007] Here, the motion vector detector 32 processes the image data of each frame as an I picture, a P picture, or a B picture. Which picture type is applied to each sequentially input frame is determined in advance (for example, the frames are processed as I, B, P, B, P, ..., B, P).

[0008] That is, the motion vector detector 32 refers to a predetermined reference frame among the images stored in the frame memory 31 and performs pattern matching (block matching) between that reference frame and a small block (macroblock) of 16 pixels × 16 lines of the frame currently being encoded, thereby detecting the motion vector of the macroblock.
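
The block matching described above can be illustrated with a minimal full-search sketch. The sum of absolute differences as the matching cost, the search range, and the names `sad` and `find_motion_vector` are assumptions made for illustration; the text does not specify the cost function or the search strategy.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def find_motion_vector(cur, ref, bx, by, n=16, search=4):
    """Full search over displacements in [-search, search].
    Returns ((dx, dy), error) for the best-matching reference block."""
    height, width = len(ref), len(ref[0])
    best_mv, best_err = (0, 0), sad(cur, ref, bx, by, 0, 0, n)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame.
            if not (0 <= bx + dx and bx + dx + n <= width
                    and 0 <= by + dy and by + dy + n <= height):
                continue
            err = sad(cur, ref, bx, by, dx, dy, n)
            if err < best_err:
                best_mv, best_err = (dx, dy), err
    return best_mv, best_err
```

The returned error is the kind of prediction error that the mode decision in the following paragraphs compares against the macroblock's own activity.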

[0009] In MPEG, there are four image prediction modes: intra coding (intra-frame coding), forward predictive coding, backward predictive coding, and bidirectional predictive coding. An I picture is intra coded; a P picture is either intra coded or forward predictive coded; and a B picture is intra coded, forward predictive coded, backward predictive coded, or bidirectionally predictive coded.

[0010] That is, for an I picture, the motion vector detector 32 sets the intra coding mode as the prediction mode. In this case, the motion vector detector 32 does not detect a motion vector and outputs the prediction mode (intra prediction mode) to a VLC (variable length coding) unit 36 and a motion compensator 42.

[0011] For a P picture, the motion vector detector 32 performs forward prediction and detects the motion vector. It then compares the prediction error resulting from the forward prediction with, for example, the variance of the macroblock being encoded (a macroblock of the P picture). If the variance of the macroblock is smaller than the prediction error, the motion vector detector 32 sets the intra coding mode as the prediction mode and outputs it to the VLC unit 36 and the motion compensator 42. If the prediction error resulting from the forward prediction is smaller, it sets the forward predictive coding mode as the prediction mode and outputs it, together with the detected motion vector, to the VLC unit 36 and the motion compensator 42.

[0012] For a B picture, the motion vector detector 32 performs forward prediction, backward prediction, and bidirectional prediction, and detects the respective motion vectors. It then finds the smallest of the prediction errors for forward, backward, and bidirectional prediction (hereinafter referred to as the minimum prediction error where appropriate) and compares that minimum prediction error with, for example, the variance of the macroblock being encoded (a macroblock of the B picture). If the comparison shows that the variance of the macroblock is smaller than the minimum prediction error, the motion vector detector 32 sets the intra coding mode as the prediction mode and outputs it to the VLC unit 36 and the motion compensator 42. If the minimum prediction error is smaller, it sets the prediction mode that yielded the minimum prediction error as the prediction mode and outputs it, together with the corresponding motion vector, to the VLC unit 36 and the motion compensator 42.
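
The mode decision for P and B pictures can be sketched as follows. This is an illustration under assumptions: the macroblock "variance" is taken here as the sum of squared deviations from the block mean, the prediction errors are assumed to be supplied in a comparable measure, and the function names and the tie-breaking rule are hypothetical.

```python
def block_variance(block):
    """Activity of a macroblock: sum of squared deviations from its mean
    (an assumed measure; the text only says 'variance, for example')."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels)


def select_prediction_mode(block, prediction_errors):
    """prediction_errors maps a mode name ('forward', 'backward',
    'bidirectional') to its prediction error. A P picture passes only
    'forward'; a B picture passes all three."""
    # Mode with the minimum prediction error.
    best_mode = min(prediction_errors, key=prediction_errors.get)
    # Intra coding wins when the block itself is 'cheaper' than the residual.
    # On a tie this sketch keeps the inter mode; the text leaves ties open.
    if block_variance(block) < prediction_errors[best_mode]:
        return "intra"
    return best_mode
```

For an I picture no comparison is needed, since the mode is always intra.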

[0013] On receiving both a prediction mode and a motion vector from the motion vector detector 32, the motion compensator 42 reads the encoded and already locally decoded image data stored in a frame memory 41 in accordance with that prediction mode and motion vector, and supplies it as a predicted image to arithmetic units 33 and 40.

[0014] The arithmetic unit 33 reads from the frame memory 31 the same macroblock as the image data that the motion vector detector 32 read from the frame memory 31, and computes the difference between that macroblock and the predicted image from the motion compensator 42. This difference value is supplied to a DCT unit 34.

[0015] On the other hand, when the motion compensator 42 receives only a prediction mode from the motion vector detector 32, that is, when the prediction mode is the intra coding mode, it outputs no predicted image. In this case, the arithmetic unit 33 (and likewise the arithmetic unit 40) performs no particular processing and outputs the macroblock read from the frame memory 31 to the DCT unit 34 as is.

[0016] The DCT unit 34 applies DCT processing to the output of the arithmetic unit 33 and supplies the resulting DCT coefficients to a quantizer 35. The quantizer 35 sets a quantization step (quantization scale) according to the amount of data stored in a buffer 37 (buffer feedback) and quantizes the DCT coefficients from the DCT unit 34 with that quantization step. The quantized DCT coefficients (hereinafter referred to as quantized coefficients where appropriate) are supplied to the VLC unit 36 together with the quantization step that was set.
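
As a rough illustration of the DCT processing mentioned above, the following sketch implements a one-dimensional orthonormal DCT-II and its inverse. MPEG applies a two-dimensional transform to 8 × 8 blocks, which can be obtained by applying a one-dimensional transform to the rows and then to the columns; the function names here are illustrative only.

```python
import math


def _scale(k, n):
    # Orthonormal scaling: the DC term uses sqrt(1/n), the others sqrt(2/n).
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)


def dct_1d(samples):
    """Orthonormal DCT-II of a list of samples."""
    n = len(samples)
    return [
        _scale(k, n) * sum(
            s * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i, s in enumerate(samples)
        )
        for k in range(n)
    ]


def idct_1d(coefficients):
    """Inverse of dct_1d (DCT-III with the same scaling)."""
    n = len(coefficients)
    return [
        sum(
            _scale(k, n) * c * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for k, c in enumerate(coefficients)
        )
        for i in range(n)
    ]
```

A flat input concentrates all of its energy in the first (DC) coefficient, which is why smooth image regions compress well after quantization.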

[0017] The VLC unit 36 converts the quantized coefficients supplied from the quantizer 35 into variable-length codes, such as Huffman codes, in accordance with the quantization step also supplied from the quantizer 35, and outputs them to the buffer 37. The VLC unit 36 also variable-length codes the quantization step from the quantizer 35, as well as the prediction mode (a mode indicating which of intra coding (intra-picture predictive coding), forward predictive coding, backward predictive coding, and bidirectional predictive coding has been set) and the motion vector from the motion vector detector 32, and outputs them to the buffer 37.

[0018] The buffer 37 temporarily stores the data from the VLC unit 36, smooths the data rate, and, for example, outputs the data to a transmission path or records it on a recording medium.

[0019] The buffer 37 also outputs its data storage amount to the quantizer 35, and the quantizer 35 sets the quantization step according to that amount. That is, when the buffer 37 is about to overflow, the quantizer 35 increases the quantization step, which reduces the amount of quantized coefficient data. When the buffer 37 is about to underflow, the quantizer 35 decreases the quantization step, which increases the amount of quantized coefficient data. In this way, overflow and underflow of the buffer 37 are prevented.
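
The buffer feedback can be sketched as a simple threshold rule. The thresholds, the unit step change, and the 1 to 31 range of the quantization step are assumptions for illustration; the text states only the direction in which the step moves.

```python
def next_quantization_step(step, fullness, capacity,
                           low=0.25, high=0.75, min_step=1, max_step=31):
    """Return the quantization step for the next macroblock.

    fullness / capacity is the buffer occupancy. `low`, `high`, and the
    step limits are hypothetical tuning values, not taken from the text.
    """
    ratio = fullness / capacity
    if ratio > high:
        step += 1   # near overflow: coarser quantization, fewer bits
    elif ratio < low:
        step -= 1   # near underflow: finer quantization, more bits
    return max(min_step, min(max_step, step))
```

A practical rate controller would typically adjust the step more smoothly, but the direction of the correction is the same.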

[0020] The quantized coefficients and the quantization step output by the quantizer 35 are supplied not only to the VLC unit 36 but also to an inverse quantizer 38. The inverse quantizer 38 inversely quantizes the quantized coefficients from the quantizer 35 in accordance with the quantization step also from the quantizer 35, thereby converting them back into DCT coefficients. These DCT coefficients are supplied to an IDCT unit (inverse DCT unit) 39. The IDCT unit 39 applies inverse DCT processing to the DCT coefficients and supplies the result to the arithmetic unit 40.
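
A minimal sketch of quantization and inverse quantization with a single uniform step follows. It is a simplification under stated assumptions: truncation toward zero is chosen for illustration, and the per-coefficient weighting matrices used by actual MPEG quantizers are omitted.

```python
def quantize(coefficients, step):
    """Divide each DCT coefficient by the step and truncate toward zero."""
    return [int(c / step) for c in coefficients]


def dequantize(levels, step):
    """Scale the quantized levels back up; the truncation loss remains."""
    return [level * step for level in levels]
```

For example, `quantize([100, -37, 5, 0], 8)` gives `[12, -4, 0, 0]`, and dequantizing with the same step gives `[96, -32, 0, 0]`: the coefficients are recovered only approximately, and a larger step discards more detail.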

[0021] In addition to the output of the IDCT unit 39, the arithmetic unit 40 receives from the motion compensator 42, as described above, the same data as the predicted image supplied to the arithmetic unit 33. The arithmetic unit 40 adds the signal from the IDCT unit 39 (the prediction residual) to the predicted image from the motion compensator 42, thereby locally decoding the original image (when the prediction mode is intra coding, however, the output of the IDCT unit 39 passes through the arithmetic unit 40 unchanged and is supplied to the frame memory 41). This decoded image is identical to the decoded image obtained on the receiving side.
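
Why the locally decoded image matches the receiver's image can be seen in a much-reduced sketch. The example below is an assumption-laden simplification: it predicts each scalar sample from the previous reconstructed sample, omits the DCT and motion compensation entirely, and uses hypothetical function names. It keeps only the closed prediction loop, in which the encoder predicts from what the decoder will actually have.

```python
def encode(samples, step):
    """Predict each sample from the previous *reconstructed* sample."""
    levels, local_decoded = [], []
    previous = 0
    for sample in samples:
        residual = sample - previous          # role of arithmetic unit 33
        level = int(residual / step)          # quantizer
        previous = previous + level * step    # inverse quantizer + adder
        levels.append(level)
        local_decoded.append(previous)
    return levels, local_decoded


def decode(levels, step):
    """Receiver side: the same reconstruction, driven only by the levels."""
    decoded, previous = [], 0
    for level in levels:
        previous = previous + level * step
        decoded.append(previous)
    return decoded
```

Because both sides run the same reconstruction on the same quantized levels, quantization error does not accumulate as drift between encoder and decoder.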

[0022] The decoded image (locally decoded image) obtained by the arithmetic unit 40 is supplied to and stored in the frame memory 41, and is thereafter used as a reference image (reference frame) for images that are inter coded (forward predictive coded, backward predictive coded, or bidirectionally predictive coded).

[0023] Next, FIG. 43 shows an example configuration of an MP@ML decoder in MPEG that decodes the encoded data output from the encoder of FIG. 42.

[0024] Encoded data transmitted over a transmission path is received by a receiving device (not shown), or encoded data recorded on a recording medium is reproduced by a reproducing device (not shown), and is supplied to and stored in a buffer 101.

[0025] An IVLC unit (inverse VLC unit, i.e., variable-length decoder) 102 reads the encoded data stored in the buffer 101 and variable-length decodes it, thereby separating the encoded data into motion vectors, prediction modes, quantization steps, and quantized coefficients. Of these, the motion vectors and prediction modes are supplied to a motion compensator 107, and the quantization steps and quantized coefficients are supplied to an inverse quantizer 103.

[0026] The inverse quantizer 103 inversely quantizes the quantized coefficients supplied from the IVLC unit 102 in accordance with the quantization step also supplied from the IVLC unit 102, and outputs the resulting DCT coefficients to an IDCT unit 104. The IDCT unit 104 applies an inverse DCT to the DCT coefficients from the inverse quantizer 103 and supplies the result to an arithmetic unit 105.

[0027] In addition to the output of the IDCT unit 104, the arithmetic unit 105 also receives the output of the motion compensator 107. That is, as with the motion compensator 42 of FIG. 42, the motion compensator 107 reads an already decoded image stored in a frame memory 106 in accordance with the motion vector and prediction mode from the IVLC unit 102, and supplies it as a predicted image to the arithmetic unit 105. The arithmetic unit 105 decodes the original image by adding the signal from the IDCT unit 104 (the prediction residual) to the predicted image from the motion compensator 107. This decoded image is supplied to and stored in the frame memory 106. When the output of the IDCT unit 104 is intra coded, that output passes through the arithmetic unit 105 unchanged and is supplied to and stored in the frame memory 106.

[0028] The decoded image stored in the frame memory 106 is used as a reference image for images decoded thereafter, and is also read out as appropriate and supplied to, for example, a display (not shown) for display.

[0029] In MPEG-1 and MPEG-2, B pictures are not used as reference images, and are therefore not stored in the frame memory 41 (FIG. 42) of the encoder or the frame memory 106 (FIG. 43) of the decoder.

[0030] In addition to MP@ML described above, MPEG defines various other profiles and levels and provides a variety of tools. One representative MPEG tool is scalability.

[0031] That is, MPEG introduces scalable coding schemes that provide scalability across different image sizes and frame rates. With spatial scalability, for example, decoding only the lower layer bitstream yields only a small image, whereas decoding both the lower layer and upper layer bitstreams yields a large image.

[0032] FIG. 44 shows an example configuration of an encoder that implements spatial scalability. In spatial scalability, for example, the lower layer corresponds to an image signal with a small image size, and the upper layer corresponds to an image signal with a large image size.

[0033] For example, the image to be encoded is input as is to an upper layer encoding unit 201 as the upper layer image, while a version of the image with fewer pixels, obtained by thinning out the image to be encoded (and thus having lower resolution and a smaller size), is input to a lower layer encoding unit 202 as the lower layer image.
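
The thinning step that produces the lower layer image can be sketched as plain subsampling. The factor of 2 and the absence of a low-pass filter before subsampling are simplifying assumptions; the text says only that pixels are thinned out.

```python
def decimate(image, factor=2):
    """Keep every `factor`-th pixel in both directions.
    A practical encoder would usually low-pass filter first to limit
    aliasing; that step is omitted in this sketch."""
    return [row[::factor] for row in image[::factor]]
```

A 4 × 4 input thus becomes a 2 × 2 lower layer image.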

[0034] The lower layer encoding unit 202 predictively codes the lower layer image, for example, in the same manner as in FIG. 42, and outputs the resulting lower layer bitstream. The lower layer encoding unit 202 also generates an image in which the locally decoded lower layer image is enlarged to the same size as the upper layer image (hereinafter referred to as an enlarged image where appropriate). This enlarged image is supplied to the upper layer encoding unit 201.

[0035] The upper layer encoding unit 201 likewise predictively codes the upper layer image, for example, in the same manner as in FIG. 42, and outputs the resulting upper layer bitstream. The upper layer encoding unit 201, however, performs predictive coding using the enlarged image from the lower layer encoding unit 202 as a reference image as well.

[0036] The upper layer bitstream and the lower layer bitstream are multiplexed and output as encoded data.

[0037] FIG. 45 shows an example configuration of the lower layer encoding unit 202 of FIG. 44. In the figure, parts corresponding to those in FIG. 42 are given the same reference numerals. That is, the lower layer encoding unit 202 is configured in the same way as the encoder of FIG. 42, except that an upsampling unit 211 is newly provided.

[0038] In the upsampling unit 211, the locally decoded lower layer image output by the arithmetic unit 40 is upsampled (interpolated), so that it is enlarged to the same image size as the upper layer image, and is supplied to the upper layer encoding unit 201.
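
The enlargement performed by the upsampling unit can be sketched with pixel replication. The text does not name the interpolation filter, so replication by a factor of 2 is an assumption chosen for brevity; a real implementation would more likely use a smoother interpolation filter.

```python
def upsample(image, factor=2):
    """Enlarge an image by repeating each pixel `factor` times
    horizontally and each row `factor` times vertically."""
    enlarged = []
    for row in image:
        wide = [pixel for pixel in row for _ in range(factor)]
        # Copy the widened row so the output rows are independent lists.
        enlarged.extend(list(wide) for _ in range(factor))
    return enlarged
```

The result has the same dimensions as the upper layer image and can therefore serve as a reference for it.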

[0039] FIG. 46 shows an example configuration of the upper layer encoding unit 201 of FIG. 44. In the figure, parts corresponding to those in FIG. 42 are given the same reference numerals. That is, the upper layer encoding unit 201 is configured basically in the same way as the encoder of FIG. 42, except that weighting units 221 and 222 and an arithmetic unit 223 are newly provided.

[0040] The weighting unit 221 multiplies the predicted image output by the motion compensator 42 by a weight W and outputs the result to the arithmetic unit 223. In addition to the output of the weighting unit 221, the arithmetic unit 223 also receives the output of the weighting unit 222, which multiplies the enlarged image supplied from the lower layer encoding unit 202 by a weight (1 - W) and supplies the result to the arithmetic unit 223.

[0041] The arithmetic unit 223 adds the outputs of the weighting units 221 and 222 and outputs the sum as a predicted image to the arithmetic units 33 and 40.
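
The combined effect of the two weighting units and the adder is the per-pixel blend W × (motion-compensated prediction) + (1 - W) × (enlarged image). A minimal sketch follows, with images as nested lists and an illustrative function name:

```python
def blend_prediction(motion_prediction, enlarged_image, w):
    """Per-pixel W * prediction + (1 - W) * enlarged lower layer image."""
    return [
        [w * p + (1 - w) * e for p, e in zip(pred_row, enl_row)]
        for pred_row, enl_row in zip(motion_prediction, enlarged_image)
    ]
```

With W = 1 the upper layer is predicted only from its own earlier pictures, and with W = 0 only from the enlarged lower layer image; intermediate values mix the two.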

[0042] Thereafter, the upper layer encoding unit 201 performs the same processing as in FIG. 42.

[0043] Accordingly, the upper layer encoding unit 201 performs predictive coding using as reference images not only the upper layer image but also the enlarged image from the lower layer encoding unit 202, that is, the lower layer image.

[0044] The weight W used in the weighting unit 221 is set in advance (and therefore so is the weight 1 - W used in the weighting unit 222). The weight W is also supplied to the VLC unit 36 and variable-length coded.

[0045] Next, FIG. 47 shows an example configuration of a decoder that implements spatial scalability.

[0046] The encoded data output from the encoder of FIG. 44 is separated into an upper layer bitstream and a lower layer bitstream, which are supplied to an upper layer decoding unit 231 and a lower layer decoding unit 232, respectively.

[0047] The lower layer decoding unit 232 decodes the lower layer bitstream in the same manner as in FIG. 43 and outputs the resulting decoded lower layer image. The lower layer decoding unit 232 also enlarges the decoded lower layer image to the same size as the upper layer image, thereby generating an enlarged image. This enlarged image is supplied to the upper layer decoding unit 231.

[0048] The upper layer decoding unit 231 likewise decodes the upper layer bitstream, for example, in the same manner as in FIG. 43. The upper layer decoding unit 231, however, performs decoding using the enlarged image from the lower layer decoding unit 232 as a reference image as well.

[0049] FIG. 48 shows an example configuration of the lower layer decoding unit 232 of FIG. 47. In the figure, parts corresponding to those in FIG. 43 are given the same reference numerals. That is, the lower layer decoding unit 232 is configured in the same way as the decoder of FIG. 43, except that an upsampling unit 241 is newly provided.

[0050] In the upsampling unit 241, the decoded lower layer image output by the arithmetic unit 105 is upsampled (interpolated), so that it is enlarged to the same image size as the upper layer image, and is supplied to the upper layer decoding unit 231.

[0051] FIG. 49 shows an example configuration of the upper layer decoding unit 231 of FIG. 47. In the figure, parts corresponding to those in FIG. 43 are given the same reference numerals. That is, the upper layer decoding unit 231 is configured basically in the same way as the decoder of FIG. 43, except that weighting units 251 and 252 and an arithmetic unit 253 are newly provided.

[0052] In addition to the processing described with reference to FIG. 43, the IVLC unit 102 extracts the weight W from the encoded data and outputs it to the weighting units 251 and 252. The weighting unit 251 multiplies the predicted image output by the motion compensator 107 by the weight W and outputs the result to the arithmetic unit 253. In addition to the output of the weighting unit 251, the arithmetic unit 253 also receives the output of the weighting unit 252, which multiplies the enlarged image supplied from the lower layer decoding unit 232 by a weight (1 - W) and supplies the result to the arithmetic unit 253.

[0053] The arithmetic unit 253 adds the outputs of the weighting units 251 and 252 and outputs the sum as a predicted image to the arithmetic unit 105.

[0054] As described above, the upper layer decoding unit 231 performs decoding using as reference images not only the upper layer image but also the enlarged image from the lower layer decoding unit 232, that is, the lower layer image.

[0055] The processing described above is applied to both the luminance signal and the color difference signals. As the motion vector for the color difference signals, however, the motion vector of the luminance signal multiplied by 1/2, for example, is used.

[0056] At present, various high-efficiency coding schemes for moving images have been standardized besides the MPEG scheme described above. For example, the ITU-T has specified the H.261 and H.263 schemes, mainly as coding schemes for communication. Like the MPEG scheme, H.261 and H.263 basically combine motion-compensated predictive coding with DCT transform coding; although details such as header information differ, the basic configuration of the encoder and decoder is the same as in the MPEG scheme.

[0057]

[Problems to be Solved by the Invention] Image composition systems, which combine a plurality of images to form a single image, use a technique called chroma keying, for example. In this technique, an object is shot in front of a background of a specific uniform color such as blue, the region that is not blue is extracted from the shot, and that region is composited into another image. The signal indicating the extracted region is called a key signal.

[0058] FIG. 50 is a diagram for explaining a conventional image composition method. Here, image F1 is the background and image F2 is the foreground. Image F2 is obtained by shooting an object (here, a person) in front of a background of a specific color and extracting the region that is not that color, and key signal K1 is the signal indicating the extracted region.

[0059] In the image composition system, the background image F1 and the foreground image F2 are composited according to the key signal K1 to generate a composite image F3. This composite image F3 is, for example, MPEG coded and transmitted.
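
Key extraction and composition can be sketched as follows, with pixels as RGB tuples. This is an idealized illustration: the key color is matched exactly, whereas a real chroma keyer would use a tolerance and usually a soft-edged key, and the names `extract_key` and `composite` are hypothetical.

```python
BLUE = (0, 0, 255)  # assumed key color of the studio backdrop


def extract_key(image, key_color=BLUE):
    """Key signal: 1 where the pixel belongs to the object, 0 on the backdrop."""
    return [[0 if pixel == key_color else 1 for pixel in row] for row in image]


def composite(background, foreground, key):
    """Take the foreground pixel where the key is 1, else the background pixel."""
    return [
        [f if k else b for b, f, k in zip(b_row, f_row, k_row)]
        for b_row, f_row, k_row in zip(background, foreground, key)
    ]
```

In the notation of the text, `extract_key` applied to F2 plays the role of K1, and `composite(F1, F2, K1)` produces F3.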

[0060] When the composite image F3 is encoded and transmitted in this way, only the encoded data for the composite image F3 is transmitted, so information such as the key signal K1 is lost. It is therefore difficult for the receiving side to re-edit or recompose the image, for example, to change only the background F1 while leaving the foreground F2 as is.

[0061] Therefore, as shown in FIG. 51, for example, a method is conceivable in which the images F1 and F2 and the key signal K1 are each encoded independently and the resulting bitstreams are multiplexed. In this case, on the receiving side, as shown in FIG. 52, for example, the multiplexed data is demultiplexed to obtain the bitstreams of the images F1 and F2 and of the key signal K1, and each bitstream is decoded. The composite image F3 is then generated by performing synthesis using the decoded images F1 and F2 and the decoded key signal K1. In this case, the receiving side can re-edit and re-synthesize the image, for example by replacing only the background F1 with another image while leaving the foreground F2 as it is.
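As a rough illustration of multiplexing separately encoded bitstreams and recovering them on the receiving side, the sketch below packs each stream as a record of (stream id, payload length, payload). This record layout is purely an assumption made for the example; the patent does not specify a multiplex format, and the names `multiplex` and `demultiplex` are hypothetical.

```python
import struct

HEADER = ">BI"                      # 1-byte stream id, 4-byte big-endian length
HEADER_SIZE = struct.calcsize(HEADER)


def multiplex(streams):
    """Pack a {stream_id: bytes} mapping into a single byte string."""
    out = bytearray()
    for stream_id, payload in streams.items():
        out += struct.pack(HEADER, stream_id, len(payload))
        out += payload
    return bytes(out)


def demultiplex(data):
    """Recover the {stream_id: bytes} mapping from multiplexed data."""
    streams = {}
    pos = 0
    while pos < len(data):
        stream_id, length = struct.unpack_from(HEADER, data, pos)
        pos += HEADER_SIZE
        streams[stream_id] = data[pos:pos + length]
        pos += length
    return streams


# Placeholder payloads standing in for the encoded F1, F2, and K1 bitstreams.
encoded = {1: b"bits-of-F1", 2: b"bits-of-F2", 3: b"bits-of-K1"}
recovered = demultiplex(multiplex(encoded))
```

Because each stream survives intact, a receiver could decode stream 2 and stream 3 and pair them with a different background instead of stream 1.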

[0062] The composite image F3 is composed of the images F1 and F2; in the same way, any image can be regarded as being composed of a plurality of images (objects). If such a unit constituting an image is called a VO (Video Object), a scheme that performs encoding on a VO basis is currently being standardized as MPEG4 in ISO-IEC/JTC1/SC29/WG11.

[0063] However, at present, neither a method for efficiently encoding VOs nor a method for encoding key signals has been established, and these remain unsolved problems.

[0064] In addition, MPEG4 specifies the provision of a scalability function, but no specific method has been proposed for realizing scalability for a VO whose position and size change over time.

[0065] That is, for example, when a person approaching from a distance is treated as a VO, the position and size of the VO change over time. Therefore, when a lower-layer image is used as a reference image in the predictive encoding of an upper-layer image, the relative positional relationship between the upper-layer image and the lower-layer image used as the reference image must be made clear.

[0066] Furthermore, when scalability is performed on a VO basis, the conditions for a skip macroblock in the lower layer do not necessarily apply as they are to a skip macroblock in the upper layer.

[0067] The present invention has been made in view of such circumstances, and makes it possible to easily realize encoding on a VO basis.

【００６８】[0068]

[Means for Solving the Problems] The image encoding apparatus according to claim 1 comprises: scaling means for enlarging or reducing a second image based on the difference in resolution between a first image and the second image; first image encoding means for predictively encoding the first image using the output of the scaling means as a reference image; and position determining means for determining the positions of the first and second images in a predetermined absolute coordinate system and outputting first and second position information on the positions of the first and second images, respectively. The first image encoding means recognizes the position of the first image based on the first position information, converts the second position information in accordance with the enlargement or reduction ratio used when the scaling means enlarged or reduced the second image, recognizes the position corresponding to the conversion result as the position of the reference image, and performs the predictive encoding.

[0069] In the image encoding method according to claim 2, the image encoding apparatus comprises: scaling means for enlarging or reducing a second image based on the difference in resolution between a first image and the second image; first image encoding means for predictively encoding the first image using the output of the scaling means as a reference image; and position determining means for determining the positions of the first and second images in a predetermined absolute coordinate system and outputting first and second position information on the positions of the first and second images, respectively. The method causes the first image encoding means to recognize the position of the first image based on the first position information, to convert the second position information in accordance with the enlargement or reduction ratio used when the scaling means enlarged or reduced the second image, to recognize the position corresponding to the conversion result as the position of the reference image, and to perform the predictive encoding.

[0070] The image decoding apparatus according to claim 3 comprises: scaling means for enlarging or reducing the second image decoded by second image decoding means, based on the difference in resolution between the first and second images; and first image decoding means for decoding the first image using the output of the scaling means as a reference image. The encoded data includes first and second position information on the positions of the first and second images, respectively, in a predetermined absolute coordinate system. The first image decoding means recognizes the position of the first image based on the first position information, converts the second position information in accordance with the enlargement or reduction ratio used when the scaling means enlarged or reduced the second image, recognizes the position corresponding to the conversion result as the position of the reference image, and decodes the first image.

[0071] In the image decoding method according to claim 4, the image decoding apparatus comprises: scaling means for enlarging or reducing the second image decoded by second image decoding means, based on the difference in resolution between the first and second images; and first image decoding means for decoding the first image using the output of the scaling means as a reference image. When the encoded data includes first and second position information on the positions of the first and second images, respectively, in a predetermined absolute coordinate system, the method causes the first image decoding means to recognize the position of the first image based on the first position information, to convert the second position information in accordance with the enlargement or reduction ratio used when the scaling means enlarged or reduced the second image, to recognize the position corresponding to the conversion result as the position of the reference image, and to decode the first image.

【００７２】[0072]

【００７３】[0073]

【００７４】[0074]

【００７５】[0075]

【００７６】[0076]

【００７７】[0077]

【００７８】[0078]

【００７９】[0079]

[0080] In the image encoding apparatus according to claim 1 and the image encoding method according to claim 2, the scaling means enlarges or reduces the second image based on the difference in resolution between the first and second images, and the first image encoding means predictively encodes the first image using the output of the scaling means as a reference image. The position determining means determines the positions of the first and second images in a predetermined absolute coordinate system and outputs first and second position information on those positions, respectively. The first image encoding means recognizes the position of the first image based on the first position information, converts the second position information in accordance with the enlargement or reduction ratio used when the scaling means enlarged or reduced the second image, recognizes the position corresponding to the conversion result as the position of the reference image, and performs the predictive encoding.

[0081] In the image decoding apparatus according to claim 3 and the image decoding method according to claim 4, the scaling means enlarges or reduces the second image decoded by the second image decoding means, based on the difference in resolution between the first and second images, and the first image decoding means decodes the first image using the output of the scaling means as a reference image. When the encoded data includes first and second position information on the positions of the first and second images, respectively, in a predetermined absolute coordinate system, the first image decoding means recognizes the position of the first image based on the first position information, converts the second position information in accordance with the enlargement or reduction ratio used when the scaling means enlarged or reduced the second image, recognizes the position corresponding to the conversion result as the position of the reference image, and decodes the first image.

【００８２】[0082]

【００８３】[0083]

【００８４】[0084]

【００８５】[0085]

【００８６】[0086]

[Embodiments of the Invention] FIG. 1 shows an embodiment of an encoder to which the present invention is applied.

[0087] The image data to be encoded is input to the VO constructing unit 1, which extracts the objects making up the input image and constructs VOs. The VO constructing unit 1 also generates a key signal for each VO and outputs it, together with the corresponding VO, to the VOP constructing units 2_1 to 2_N. That is, when N VOs, VO1 to VO#N, are constructed in the VO constructing unit 1, the N VOs are output, together with their corresponding key signals, to the VOP constructing units 2_1 to 2_N, respectively.

[0088] Specifically, for example, when the image data to be encoded contains the background F1, the foreground F2, and the key signal K1, as shown in FIG. 51 described above, so that a composite image can be generated from them by chroma keying, the VO constructing unit 1 outputs, for example, the foreground F2 as VO1 and the key signal K1 as the key signal of VO1 to the VOP constructing unit 2_1. The VO constructing unit 1 further outputs the background F1 as VO2 to the VOP constructing unit 2_2. No key signal is needed for the background, so none is output (none is generated).

[0089] When the image data to be encoded does not contain a key signal, for example when it is an already composited image, the VO constructing unit 1 segments the image into regions according to a predetermined algorithm, thereby extracting one or more regions, and generates a key signal corresponding to each region. The VO constructing unit 1 then takes the sequence of each extracted region as a VO and outputs it, together with the generated key signal, to the corresponding VOP constructing unit 2_n (where n = 1, 2, ..., N).

[0090] The VOP constructing unit 2_n constructs VOPs (VO Planes) from the output of the VO constructing unit 1. That is, it extracts the object from each frame and takes, for example, the smallest rectangle enclosing the object as the VOP. In doing so, the VOP constructing unit 2_n constructs the VOP so that its numbers of horizontal and vertical pixels are each a multiple of, for example, 16. Having constructed a VOP, the VOP constructing unit 2_n outputs it to the VOP encoding unit 3_n together with a key signal for extracting the image data (for example, the luminance and color-difference signals) of the object portion contained in the VOP (this key signal is supplied from the VO constructing unit 1, as described above).

[0091] The VOP constructing unit 2_n further detects size data (VOP size) representing the size of the VOP (for example, its horizontal and vertical lengths) and offset data (VOP offset) representing the position of the VOP in the frame (for example, its coordinates when the upper-left corner of the frame is taken as the origin), and supplies these data to the VOP encoding unit 3_n as well.
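The construction of a VOP rectangle, its size data, and its offset data can be sketched as follows. This is a minimal illustration that assumes the object is given as a binary key signal, that the rectangle is grown toward the right and bottom to reach a multiple of 16, and that it is not clipped at the frame boundary; none of these details are stated in the text, and the name `vop_rectangle` is hypothetical.

```python
def vop_rectangle(key, multiple=16):
    """Return ((offset_x, offset_y), (width, height)) for a binary key signal.

    The offset is the upper-left corner of the smallest rectangle enclosing
    all nonzero key pixels, with the frame's upper-left corner as the origin.
    Width and height are rounded up to the next multiple of `multiple`.
    """
    rows = [y for y, row in enumerate(key) if any(row)]
    cols = [x for x in range(len(key[0])) if any(row[x] for row in key)]
    if not rows:
        return None  # no object present in this frame

    def round_up(n):
        return -(-n // multiple) * multiple

    offset = (cols[0], rows[0])
    size = (round_up(cols[-1] - cols[0] + 1), round_up(rows[-1] - rows[0] + 1))
    return offset, size


# A 40x40 frame whose object occupies x = 5..24 and y = 3..19.
key = [[1 if 5 <= x <= 24 and 3 <= y <= 19 else 0 for x in range(40)]
       for y in range(40)]
vop_offset, vop_size = vop_rectangle(key)
```

Here the tight bounding box is 20 by 17 pixels, so both dimensions round up to 32.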

[0092] The VOP encoding unit 3_n encodes the output of the VOP constructing unit 2_n by a scheme conforming to a standard such as MPEG or H.263, and outputs the resulting bitstream to the multiplexing unit 4. The multiplexing unit 4 multiplexes the bitstreams from the VOP encoding units 3_1 to 3_N and either transmits the resulting multiplexed data over a transmission path 5 such as a terrestrial broadcast, a satellite link, or a CATV network, or records it on a recording medium 6 such as a magnetic disk, a magneto-optical disk, an optical disc, or a magnetic tape.

[0093] VO and VOP are now described.

[0094] A VO is the sequence of each object making up a composite image when a sequence of composite images exists, and a VOP is a VO at a given time. For example, if there is a composite image F3 formed by combining images F1 and F2, the time series of image F1 and the time series of image F2 are each a VO, and image F1 or F2 at a given time is each a VOP. A VO can therefore be regarded as the set of VOPs of the same object at different times.

[0095] When, for example, image F1 is the background and image F2 is the foreground, the composite image F3 is obtained by combining images F1 and F2 using a key signal for extracting image F2. In this case, the VOP of image F2 is taken to include, as appropriate, the key signal in addition to the image data (luminance and color-difference signals) making up image F2.

[0096] A sequence of image frames does not change in either size or position, whereas a VO may change in size and position. That is, even VOPs belonging to the same VO may differ in size and position from one time to another.

[0097] Specifically, FIG. 2 shows a composite image made up of the background image F1 and the foreground image F2.

[0098] Image F1 is, for example, a shot of a natural landscape, and the sequence of the entire image forms one VO (referred to as VO0). Image F2 is, for example, a shot of a person walking, and the sequence of the smallest rectangles enclosing the person forms one VO (referred to as VO1).

[0099] In this case, since VO0 is a landscape image, neither its position nor its size basically changes, as with an ordinary image frame. In contrast, since VO1 is an image of a person, its size and position change as the person moves left or right, or toward the front or the back in the drawing. Therefore, although FIG. 2 shows VO0 and VO1 at the same time, their positions and sizes are not necessarily the same.

[0100] For this reason, the VOP encoding unit 3_n in FIG. 1 includes in its output bitstream, in addition to the encoded VOP data, information on the position (coordinates) and size of the VOP in a predetermined absolute coordinate system. In FIG. 2, the vector indicating the position of VO0 (its VOP) at a certain time is denoted OST0, and the vector indicating the position of VO1 (its VOP) at the same time is denoted OST1.

[0101] Next, FIG. 3 shows a basic configuration example of the VOP encoding unit 3_n of FIG. 1.

[0102] The image signal (image data) from the VOP constructing unit 2_n (the luminance and color-difference signals making up the VOP) is input to the image signal encoding unit 11. The image signal encoding unit 11 is basically configured in the same way as the encoder of FIG. 42 described above, and encodes the VOP by a scheme conforming to a standard such as MPEG or H.263. The motion and texture information obtained by encoding the VOP in the image signal encoding unit 11 is supplied to the multiplexing unit 13.

[0103] The key signal from the VOP constructing unit 2_n is input to the key signal encoding unit 12, where it is encoded by, for example, DPCM (Differential Pulse Code Modulation). The key signal information obtained by the encoding in the key signal encoding unit 12 is likewise supplied to the multiplexing unit 13.
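For readers unfamiliar with DPCM, the following minimal sketch shows the basic idea on one scan line of a key signal: each sample is replaced by its difference from the previous sample. It assumes a first-order predictor that starts from 0 and applies no quantization or entropy coding; the patent only names DPCM as an example and does not fix these details, and the function names are hypothetical.

```python
def dpcm_encode(samples):
    """Replace each sample by its difference from the previous sample."""
    previous = 0
    residuals = []
    for sample in samples:
        residuals.append(sample - previous)
        previous = sample
    return residuals


def dpcm_decode(residuals):
    """Rebuild the samples by accumulating the differences."""
    previous = 0
    samples = []
    for residual in residuals:
        previous += residual
        samples.append(previous)
    return samples


key_row = [0, 0, 255, 255, 255, 0]  # one line of a key signal (illustrative)
residuals = dpcm_encode(key_row)
restored = dpcm_decode(residuals)
```

Because a key signal is mostly flat inside and outside the object, most residuals are zero, which is what makes this kind of differential coding attractive for it.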

[0104] In addition to the outputs of the image signal encoding unit 11 and the key signal encoding unit 12, the multiplexing unit 13 is supplied with the size data (VOP size) and offset data (VOP offset) from the VOP constructing unit 2_n. The multiplexing unit 13 multiplexes these and outputs the result to the buffer 14. The buffer 14 temporarily stores the output of the multiplexing unit 13, smooths the amount of data, and outputs it.

[0105] Besides DPCM, the key signal encoding unit 12 can also encode the key signal by, for example, motion-compensating the key signal according to the motion vectors detected in the predictive encoding performed by the image signal encoding unit 11, and computing the difference from the key signal of a temporally preceding or following VOP.

[0106] The amount of data resulting from the encoding of the key signal in the key signal encoding unit 12 (buffer feedback) can also be supplied to the image signal encoding unit 11. In that case, the image signal encoding unit 11 determines the quantization step taking into account the amount of data from the key signal encoding unit 12 as well.

[0107] Next, FIG. 4 shows a configuration example of the VOP encoding unit 3_n of FIG. 1 that realizes scalability.

[0108] The VOP (image data) from the VOP constructing unit 2_n, together with its key signal, size data (VOP size), and offset data (VOP offset), is supplied to the image layering unit 21.

[0109] The image layering unit 21 generates image data of a plurality of layers from the VOP (that is, it layers the VOP). For example, when spatial scalability encoding is performed, the image layering unit 21 outputs the input image data and key signal unchanged as the image data and key signal of the upper layer, and also reduces them (lowers their resolution), for example by decimating the pixels that make them up, and outputs the result as the image data and key signal of the lower layer.
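The spatial layering described above can be sketched as follows. The example assumes plain decimation (keeping every second pixel in each direction, with no low-pass filtering) and nested-list images; the decimation factor and the names `decimate` and `spatial_layers` are assumptions made for illustration only.

```python
def decimate(plane, factor=2):
    """Keep every `factor`-th pixel horizontally and vertically."""
    return [row[::factor] for row in plane[::factor]]


def spatial_layers(image, key, factor=2):
    """Return ((lower_image, lower_key), (upper_image, upper_key)).

    The upper layer is the input as-is; the lower layer is the input
    reduced in resolution by decimation.
    """
    lower = (decimate(image, factor), decimate(key, factor))
    upper = (image, key)
    return lower, upper


image = [[y * 4 + x for x in range(4)] for y in range(4)]  # 4x4 test image
key = [[0, 0, 1, 1]] * 4                                    # 4x4 key signal
(lower_image, lower_key), (upper_image, upper_key) = spatial_layers(image, key)
```

With a factor of 2, the reduction ratio is 1/2, so the magnification needed later to match the upper layer is 2.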

[0110] It is also possible to use the input VOP as the lower-layer data and to raise the resolution of the VOP by some method (increase the number of pixels) to obtain the upper-layer data.

[0111] The number of layers can be three or more, but for simplicity the case of two layers is described here.

[0112] When temporal scalability encoding is performed, for example, the image layering unit 21 outputs the image data and key signal as lower-layer or upper-layer data, for example alternately, depending on the time. That is, if the VOPs making up a certain VO are input in the order VOP0, VOP1, VOP2, VOP3, ..., the image layering unit 21 outputs VOP0, VOP2, VOP4, VOP6, ... as lower-layer data and VOP1, VOP3, VOP5, VOP7, ... as upper-layer data. In temporal scalability, the VOPs thinned out in this way simply become the lower-layer and upper-layer data, and the image data is not enlarged or reduced (no resolution conversion is performed), although it is also possible to do so.
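The alternating assignment of VOPs to layers in temporal scalability amounts to splitting the input sequence by index parity. A minimal sketch, using strings as stand-ins for VOPs (the function name `temporal_layers` is hypothetical):

```python
def temporal_layers(vops):
    """Split a VOP sequence into (lower_layer, upper_layer).

    Even-indexed VOPs (VOP0, VOP2, ...) go to the lower layer and
    odd-indexed VOPs (VOP1, VOP3, ...) go to the upper layer.
    """
    return vops[0::2], vops[1::2]


vops = ["VOP%d" % i for i in range(8)]
lower_layer, upper_layer = temporal_layers(vops)
```

A decoder that handles only the lower layer sees half the frame rate; decoding both layers restores the full sequence.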

[0113] When SNR (Signal to Noise Ratio) scalability encoding is performed, for example, the image layering unit 21 outputs the input image data and key signal unchanged as the data of both the upper layer and the lower layer. That is, in this case the image data and key signals of the lower and upper layers are identical.

[0114] For spatial scalability when encoding is performed VOP by VOP, the following three types, for example, are conceivable.

[0115] Suppose, for example, that a composite image made up of images F1 and F2 as shown in FIG. 2 is input as a VOP. In the first type of spatial scalability, as shown in FIG. 5, the entire input VOP (FIG. 5(A)) is used as the upper layer (enhancement layer), and a reduced version of the entire VOP (FIG. 5(B)) is used as the lower layer (base layer).

[0116] In the second type of spatial scalability, as shown in FIG. 6, some of the objects making up the input VOP (FIG. 6(A); here, the part corresponding to image F2) are extracted and used as the upper layer, and a reduced version of the entire VOP (FIG. 6(B)) is used as the lower layer. This extraction is performed in the same way as in, for example, the VOP constructing unit 2_n, so the extracted object can itself be regarded as one VOP.

[0117] In the third type of spatial scalability, as shown in FIGS. 7 and 8, the objects (VOPs) making up the input VOP are extracted, and an upper layer and a lower layer are generated for each object. FIG. 7 shows the case where the upper and lower layers are generated from the background (image F1) of the VOP in FIG. 2, and FIG. 8 shows the case where they are generated from the foreground (image F2) of the VOP in FIG. 2.

[0118] Which of these types of scalability is used is determined in advance, and the image layering unit 21 layers the VOP so that encoding with the predetermined scalability can be performed.

[0119] Furthermore, from the size data and offset data of the VOP input to it (hereinafter referred to as the initial size data and initial offset data, respectively, as appropriate), the image layering unit 21 calculates (determines) offset data representing the positions, in a predetermined absolute coordinate system, of the generated lower-layer and upper-layer VOPs, and size data indicating their sizes.

[0120] The method of determining the offset data (position information) and size data of the lower-layer and upper-layer VOPs is described here, taking as an example the case where the second type of scalability described above (FIG. 6) is performed.

[0121] In this case, the offset data FPOS_B of the lower layer is determined, as shown in FIG. 9(A) for example, so that when the lower-layer image data is enlarged (interpolated) based on the difference between its resolution and that of the upper layer, that is, when the lower-layer image is enlarged by the ratio that makes its size match that of the upper-layer image (the reciprocal of the reduction ratio used when the lower-layer image was generated by reducing the upper-layer image; hereinafter referred to as the magnification FR, as appropriate), the offset data of the enlarged image in the absolute coordinate system matches the initial offset data. Similarly, the size data FSZ_B of the lower layer is determined so that the size data of the enlarged image obtained by enlarging the lower-layer image by the magnification FR matches the initial size data.
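The condition stated above, that scaling FPOS_B and FSZ_B by FR must reproduce the initial offset and size data, implies that the lower-layer values are the initial values divided by FR. The sketch below works this out under the assumption that a single magnification FR applies to both axes; the function name and the numeric values are illustrative only.

```python
def lower_layer_geometry(initial_offset, initial_size, fr):
    """Return (FPOS_B, FSZ_B) such that scaling each by FR gives the
    initial offset data and initial size data, respectively."""
    fpos_b = tuple(v / fr for v in initial_offset)
    fsz_b = tuple(v / fr for v in initial_size)
    return fpos_b, fsz_b


FR = 2                                  # lower layer was reduced to 1/2
initial_offset = (32, 16)               # illustrative initial offset data
initial_size = (352, 288)               # illustrative initial size data
FPOS_B, FSZ_B = lower_layer_geometry(initial_offset, initial_size, FR)

# Enlarging by FR must bring the lower layer back to the initial geometry.
enlarged_offset = tuple(v * FR for v in FPOS_B)
enlarged_size = tuple(v * FR for v in FSZ_B)
```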

[0122] The offset data FPOS_E of the upper layer, on the other hand, is determined, as shown in FIG. 9(B) for example, as the coordinates of, for example, the upper-left vertex of the smallest rectangle with dimensions that are multiples of 16 (the VOP) enclosing the object extracted from the input VOP, these coordinates being obtained from the initial offset data. The size data FSZ_E of the upper layer is determined as, for example, the horizontal and vertical lengths of that smallest multiple-of-16 rectangle enclosing the object extracted from the input VOP.

【０１２３】Accordingly, in this case, suppose that the offset data FPOS_B and the size data FSZ_B of the lower layer are converted according to the magnification FR (the results are hereinafter referred to as the converted offset data FPOS_B and the converted size data FSZ_B, respectively), that an image frame of the size given by the converted size data FSZ_B is placed in the absolute coordinate system at the position given by the converted offset data FPOS_B, and that the enlarged image obtained by multiplying the lower layer image data by FR is arranged in that frame (FIG. 9(A)). If the upper layer image is likewise arranged in the absolute coordinate system according to the offset data FPOS_E and the size data FSZ_E of the upper layer (FIG. 9(B)), each pixel of the enlarged image and the corresponding pixel of the upper layer image are located at the same position. In FIG. 9, for example, the person in the upper layer image and the person in the enlarged image are located at the same position.
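The relationship between FPOS_B, FSZ_B, and the magnification FR can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, and exact rational arithmetic is assumed so that scaling by FR returns the initial values without rounding.

```python
from fractions import Fraction


def lower_layer_geometry(initial_offset, initial_size, fr):
    """Choose FPOS_B and FSZ_B so that scaling them by FR reproduces
    the initial offset and size data."""
    fr = Fraction(fr)
    fpos_b = tuple(Fraction(v) / fr for v in initial_offset)
    fsz_b = tuple(Fraction(v) / fr for v in initial_size)
    return fpos_b, fsz_b


def convert(fpos_b, fsz_b, fr):
    """Converted offset and size data: the lower layer values scaled by FR."""
    fr = Fraction(fr)
    return tuple(v * fr for v in fpos_b), tuple(v * fr for v in fsz_b)
```

With an initial offset of (64, 32), an initial size of (352, 288), and FR = 2, the lower layer gets FPOS_B = (32, 16) and FSZ_B = (176, 144), and converting these by FR places the enlarged image back at (64, 32) with size (352, 288).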

【０１２４】In the first and third types of scalability as well, the offset data FPOS_B and FPOS_E and the size data FSZ_B and FSZ_E are determined in the same manner, so that corresponding pixels of the enlarged lower layer image and of the upper layer image are located at the same position in the absolute coordinate system.

【０１２５】Alternatively, the offset data FPOS_B and FPOS_E and the size data FSZ_B and FSZ_E can be determined as follows, for example.

【０１２６】That is, the offset data FPOS_B of the lower layer can be determined so that, as shown in FIG. 10(A), for example, the offset data of the enlarged lower layer image coincides with a predetermined position in the absolute coordinate system, such as the origin.

【０１２７】On the other hand, the offset data FPOS_E of the upper layer can be determined as follows, as shown in FIG. 10(B), for example. The coordinates of, for example, the upper-left vertex of the 16-fold minimum rectangle enclosing the object extracted from the input VOP are obtained on the basis of the initial offset data, and FPOS_E is set to the value obtained by subtracting the initial offset data from those coordinates.

【０１２８】In the case of FIG. 10, the size data FSZ_B of the lower layer and the size data FSZ_E of the upper layer are determined in the same manner as in the case of FIG. 9.

【０１２９】When the offset data FPOS_B and FPOS_E are determined as described above, corresponding pixels of the enlarged lower layer image and of the upper layer image are again located at the same position in the absolute coordinate system.
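The reason both ways of choosing the offsets keep the layers aligned is that only the displacement between the upper layer image and the enlarged lower layer image matters. The following sketch compares the two choices. The function names are hypothetical, and `vertex_abs` stands for the upper-left vertex of the 16-fold minimum rectangle expressed in absolute coordinates.

```python
def fig9_offsets(initial_offset, vertex_abs):
    """FIG. 9 style: the enlarged lower layer sits at the initial offset."""
    return tuple(initial_offset), tuple(vertex_abs)


def fig10_offsets(initial_offset, vertex_abs):
    """FIG. 10 style: the enlarged lower layer sits at the origin, and
    FPOS_E is the vertex minus the initial offset."""
    enlarged_lower = (0, 0)
    fpos_e = (vertex_abs[0] - initial_offset[0],
              vertex_abs[1] - initial_offset[1])
    return enlarged_lower, fpos_e


def displacement(enlarged_lower, fpos_e):
    """Position of the upper layer image relative to the enlarged lower layer."""
    return (fpos_e[0] - enlarged_lower[0], fpos_e[1] - enlarged_lower[1])
```

For an initial offset of (64, 32) and a vertex at (80, 48), both choices give the same displacement of (16, 16), so corresponding pixels coincide in either case.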

【０１３０】Returning to FIG. 4, the upper layer image data, key signal, offset data FPOS_E, and size data FSZ_E generated in the image layering unit 21 are delayed in the delay circuit 22 by the processing time of the lower layer encoding unit 25, described later, and are then supplied to the upper layer encoding unit 23. The lower layer image data, key signal, offset data FPOS_B, and size data FSZ_B are supplied to the lower layer encoding unit 25. The magnification FR is supplied to the upper layer encoding unit 23 and the resolution conversion unit 24 via the delay circuit 22.

【０１３１】The lower layer encoding unit 25 encodes the image data (second image) and the key signal of the lower layer, includes the offset data FPOS_B and the size data FSZ_B in the resulting encoded data (bit stream), and supplies it to the multiplexing unit 26.

【０１３２】The lower layer encoding unit 25 also locally decodes the encoded data and outputs the resulting locally decoded lower layer image data to the resolution conversion unit 24. The resolution conversion unit 24 enlarges (or reduces) the lower layer image data from the lower layer encoding unit 25 according to the magnification FR, thereby restoring it to its original size, and outputs the resulting enlarged image to the upper layer encoding unit 23.

【０１３３】Meanwhile, the upper layer encoding unit 23 encodes the image data (first image) and the key signal of the upper layer, includes the offset data FPOS_E and the size data FSZ_E in the resulting encoded data (bit stream), and supplies it to the multiplexing unit 26. In the upper layer encoding unit 23, the upper layer image data is encoded using the enlarged image supplied from the resolution conversion unit 24 as an additional reference image.

【０１３４】The multiplexing unit 26 multiplexes the outputs of the upper layer encoding unit 23 and the lower layer encoding unit 25 and outputs the result.

【０１３５】The lower layer encoding unit 25 supplies the upper layer encoding unit 23 with the size data FSZ_B, the offset data FPOS_B, the motion vector MV, the flag COD, and other data of the lower layer, and the upper layer encoding unit 23 performs its processing while referring to these data as necessary. The details will be described later.

【０１３６】FIG. 11 shows a detailed configuration example of the lower layer encoding unit 25 of FIG. 4. In the figure, parts corresponding to those in FIG. 42 are denoted by the same reference numerals. That is, the lower layer encoding unit 25 is basically configured in the same manner as the encoder of FIG. 42, except that a key signal encoding unit 43 and a key signal decoding unit 44 are newly provided.

【０１３７】The image data from the image layering unit 21 (FIG. 4), that is, the VOP of the lower layer, is supplied to and stored in the frame memory 31 as in the case of FIG. 42, and the motion vector detector 32 detects motion vectors in units of macroblocks.

【０１３８】However, the motion vector detector 32 of the lower layer encoding unit 25 is also supplied with the size data FSZ_B and the offset data FPOS_B of the lower layer VOP, and detects the motion vector of each macroblock on the basis of the size data FSZ_B and the offset data FPOS_B.

【０１３９】That is, as described above, the size and position of a VOP change with time (frame). To detect its motion vector, it is therefore necessary to set a reference coordinate system for the detection and to detect the motion in that coordinate system. Here, the motion vector detector 32 uses the absolute coordinate system described above as the reference coordinate system, arranges the VOP to be encoded and the VOP serving as the reference image in the absolute coordinate system according to the size data FSZ_B and the offset data FPOS_B, and detects the motion vector.
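Why the absolute coordinate system is needed can be illustrated with a small sketch. It assumes, for illustration only, that block positions are known relative to the upper-left corner of each VOP and that the offset data gives that corner in absolute coordinates. The function names are hypothetical, and the search that finds the best-matching block is not shown.

```python
def to_absolute(fpos, local_xy):
    """Map a position inside a VOP to the absolute coordinate system."""
    return (fpos[0] + local_xy[0], fpos[1] + local_xy[1])


def motion_vector(cur_fpos, cur_block_xy, ref_fpos, ref_block_xy):
    """Displacement from the current block to its match in the reference VOP,
    measured in absolute coordinates."""
    cx, cy = to_absolute(cur_fpos, cur_block_xy)
    rx, ry = to_absolute(ref_fpos, ref_block_xy)
    return (rx - cx, ry - cy)
```

If the current VOP has offset (40, 24) and the reference VOP has offset (32, 24), a block at local position (16, 0) that matches the reference block at local position (24, 0) has not moved at all: the vector is (0, 0), although the local positions differ by 8 pixels because the VOP frame itself shifted.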

【０１４０】Furthermore, the motion vector detector 32 is supplied by the key signal decoding unit 44 with a decoded key signal, obtained by encoding the key signal of the lower layer and decoding the encoded result. The motion vector detector 32 extracts the object from the VOP using this decoded key signal and detects its motion vector. The decoded key signal, rather than the original key signal (the key signal before encoding), is used to extract the object because the decoded key signal is what is used on the receiving side.

【０１４１】The detected motion vector (MV) is supplied, together with the prediction mode, to the VLC unit 36 and the motion compensator 42, and also to the upper layer encoding unit 23 (FIG. 4).

【０１４２】When motion compensation is performed, it is likewise necessary, as described above, to detect motion in the reference coordinate system, so the size data FSZ_B and the offset data FPOS_B are supplied to the motion compensator 42. In addition, for the same reason as for the motion vector detector 32, the decoded key signal is supplied from the key signal decoding unit 44 to the motion compensator 42.

【０１４３】The VOP whose motion vector has been detected is converted into quantized data as in the case of FIG. 42 and supplied to the VLC unit 36. As in the case of FIG. 42, the VLC unit 36 is supplied with the quantized data, the quantization step, the motion vector, and the prediction mode. It is also supplied with the size data FSZ_B and the offset data FPOS_B from the image layering unit 21, and it variable-length encodes all of these data. The VLC unit 36 is further supplied with the encoding result of the key signal (the bit stream of the key signal) from the key signal encoding unit 43, and it variable-length encodes this as well and outputs it.

【０１４４】That is, the key signal encoding unit 43 encodes the key signal from the image layering unit 21, for example as described with reference to FIG. 3, and outputs the encoding result to the VLC unit 36. The encoding result of the key signal is also supplied to the key signal decoding unit 44, which decodes it and outputs the decoded key signal to the motion vector detector 32, the motion compensator 42, and the resolution conversion unit 24 (FIG. 4).

【０１４５】In addition to the key signal of the lower layer, the key signal encoding unit 43 is supplied with the size data FSZ_B and the offset data FPOS_B, and, as in the motion vector detector 32, it recognizes the position and extent of the key signal in the absolute coordinate system on the basis of these data.

【０１４６】The VOP whose motion vector has been detected is encoded as described above and is also locally decoded, as in the case of FIG. 42, and stored in the frame memory 41. This decoded image is used as a reference image as described above and is also output to the resolution conversion unit 24.

【０１４７】In MPEG4, unlike MPEG1 and MPEG2, B pictures are also used as reference images, so B pictures are also locally decoded and stored in the frame memory 41 (at present, however, B pictures are used as reference images only in the upper layer).

【０１４８】Meanwhile, as described with reference to FIG. 42, the VLC unit 36 determines, for the macroblocks of I, P, and B pictures, whether each is to be a skip macroblock, and sets the flags COD and MODB indicating the result. The flags COD and MODB are likewise variable-length encoded and transmitted. The flag COD is also supplied to the upper layer encoding unit 23.

【０１４９】FIG. 12 shows a configuration example of the upper layer encoding unit 23 of FIG. 4. In the figure, parts corresponding to those in FIG. 11 or FIG. 42 are denoted by the same reference numerals. That is, the upper layer encoding unit 23 is basically configured in the same manner as the lower layer encoding unit 25 of FIG. 11 or the encoder of FIG. 42, except that a key signal encoding unit 51, a frame memory 52, and a key signal decoding unit 53 are newly provided.

【０１５０】The image data from the image layering unit 21 (FIG. 4), that is, the VOP of the upper layer, is supplied to and stored in the frame memory 31 as in the case of FIG. 42, and the motion vector detector 32 detects motion vectors in units of macroblocks. In this case as well, as in FIG. 11, the motion vector detector 32 is supplied with the size data FSZ_E and the offset data FPOS_E in addition to the upper layer VOP, and with the decoded key signal from the key signal decoding unit 53. As in the case described above, the motion vector detector 32 recognizes the position of the upper layer VOP in the absolute coordinate system on the basis of the size data FSZ_E and the offset data FPOS_E, extracts the object contained in the VOP on the basis of the decoded key signal, and detects the motion vector of each macroblock.

【０１５１】As described with reference to FIG. 42, the motion vector detectors 32 of the upper layer encoding unit 23 and the lower layer encoding unit 25 process VOPs according to a predetermined sequence. Here, the sequence is set as follows, for example.

【０１５２】In the case of spatial scalability, as shown in FIG. 13(A) and FIG. 13(B), the VOPs of the upper layer are processed in the order P, B, B, B, ..., and the VOPs of the lower layer in the order I, P, P, P, ..., for example.

【０１５３】In this case, the P picture that is the first VOP of the upper layer is encoded using, for example, the lower layer VOP at the same time (here, an I picture) as a reference image. Each B picture that is the second or a subsequent VOP of the upper layer is encoded using, for example, the immediately preceding upper layer VOP and the lower layer VOP at the same time as that B picture as reference images. That is, here, a B picture of the upper layer, like a P picture of the lower layer, is used as a reference image when encoding other VOPs.
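The reference pattern for spatial scalability can be written down compactly. The sketch below is an illustration of this example sequence only; the function name and the zero-based time index `k` are assumptions introduced here.

```python
def spatial_refs(k):
    """Picture type and reference VOPs for the upper layer VOP at time index k
    (k = 0 is the first VOP). Each reference is a (layer, time index) pair."""
    if k < 0:
        raise ValueError("time index must be non-negative")
    if k == 0:
        # First upper layer VOP: P picture predicted from the lower layer
        # VOP (an I picture) at the same time.
        return "P", [("lower", 0)]
    # Later VOPs: B pictures predicted from the preceding upper layer VOP
    # and the lower layer VOP at the same time.
    return "B", [("upper", k - 1), ("lower", k)]
```

For example, `spatial_refs(3)` returns `("B", [("upper", 2), ("lower", 3)])`.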

【０１５４】The lower layer is encoded in the same manner as in, for example, MPEG1, MPEG2, or H.263.

【０１５５】SNR scalability can be regarded as spatial scalability with the magnification FR equal to 1, so it is processed in the same manner as the spatial scalability described above.

【０１５６】In the case of temporal scalability, suppose, for example, that the VO consists of VOP0, VOP1, VOP2, VOP3, ... as described above, with VOP1, VOP3, VOP5, VOP7, ... assigned to the upper layer (FIG. 14(A)) and VOP0, VOP2, VOP4, VOP6, ... assigned to the lower layer (FIG. 14(B)). Then, as shown in FIG. 14, the VOPs of the upper layer are processed in the order B, B, B, ..., and the VOPs of the lower layer in the order I, P, P, P, ..., for example.

【０１５７】In this case, the first VOP of the upper layer, VOP1 (a B picture), is encoded using, for example, VOP0 (an I picture) and VOP2 (a P picture) of the lower layer as reference images. The second VOP of the upper layer, VOP3 (a B picture), is encoded using, for example, the upper layer VOP1, which was encoded immediately before it as a B picture, and the lower layer VOP4 (a P picture), which is the image at the time (frame) following VOP3, as reference images. The third VOP of the upper layer, VOP5 (a B picture), is encoded in the same way as VOP3, using, for example, the upper layer VOP3, encoded immediately before it as a B picture, and the lower layer VOP6 (a P picture), the image at the time (frame) following VOP5, as reference images.
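The layer assignment and reference pattern of this temporal scalability example follow a simple rule, sketched below. The function names are hypothetical, and the pattern for VOPs beyond VOP5 is an extrapolation of the three cases given in the text.

```python
def temporal_layer(n):
    """Layer of VOPn in the example: odd indices go to the upper layer."""
    return "upper" if n % 2 else "lower"


def temporal_refs(n):
    """Reference VOPs, as (layer, VOP index) pairs, for upper layer VOPn."""
    if n < 1 or n % 2 == 0:
        raise ValueError("upper layer VOPs have odd indices")
    if n == 1:
        # VOP1 is predicted from the two neighbouring lower layer VOPs.
        return [("lower", 0), ("lower", 2)]
    # VOP3, VOP5, ...: the previous upper layer VOP and the next lower layer VOP.
    return [("upper", n - 2), ("lower", n + 1)]
```

For example, `temporal_refs(5)` returns `[("upper", 3), ("lower", 6)]`, matching the description of VOP5.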

【０１５８】As described above, for a VOP of one layer (here, the upper layer), a VOP of another layer (a scalable layer; here, the lower layer) can be used as a reference image for encoding P and B pictures. When a VOP of one layer is encoded using a VOP of another layer as a reference image in this way, that is, here, when a lower layer VOP is used as a reference image for predictive encoding of an upper layer VOP, the motion vector detector 32 of the upper layer encoding unit 23 (FIG. 12) sets and outputs a flag ref_layer_id indicating this (when there are three or more layers, the flag ref_layer_id indicates the layer to which the VOP used as the reference image belongs).

【０１５９】Furthermore, the motion vector detector 32 of the upper layer encoding unit 23 sets and outputs a flag ref_select_code (reference image information) that indicates, in accordance with the flag ref_layer_id for the VOP, which layer's VOP is used as the reference image for forward predictive encoding and for backward predictive encoding, respectively.

【０１６０】FIG. 15(A) and FIG. 15(B) show the flag ref_select_code for P pictures and B pictures, respectively.

【０１６１】For example, when a P picture of the upper layer (enhancement layer) is encoded using, as a reference image, the VOP of the same layer that is decoded (locally decoded) immediately before it, the flag ref_select_code is set to “00”. When the P picture is encoded using, as a reference image, the VOP displayed immediately before it that belongs to a different layer (here, the lower layer; the reference layer), the flag ref_select_code is set to “01”. When the P picture is encoded using, as a reference image, the VOP displayed immediately after it that belongs to a different layer, the flag ref_select_code is set to “10”. When the P picture is encoded using, as a reference image, the VOP of a different layer at the same time, the flag ref_select_code is set to “11” (FIG. 15(A)).

【０１６２】On the other hand, when a B picture of the upper layer is encoded using the VOP of a different layer at the same time as the reference image for forward prediction and the VOP of the same layer decoded immediately before it as the reference image for backward prediction, the flag ref_select_code is set to “00”. When the B picture is encoded using a VOP of the same layer as the reference image for forward prediction and the VOP of a different layer displayed immediately before it as the reference image for backward prediction, the flag ref_select_code is set to “01”. When the B picture is encoded using the VOP of the same layer decoded immediately before it as the reference image for forward prediction and the VOP of a different layer displayed immediately after it as the reference image for backward prediction, the flag ref_select_code is set to “10”. When the B picture is encoded using the VOP of a different layer displayed immediately before it as the reference image for forward prediction and the VOP of a different layer displayed immediately after it as the reference image for backward prediction, the flag ref_select_code is set to “11” (FIG. 15(B)).
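The ref_select_code assignments for P and B pictures can be collected into lookup tables. The sketch below transcribes only what the text states and is not a reproduction of FIG. 15. The table names, the `references` helper, and the labels `"same"` and `"other"` are hypothetical, and `None` marks a timing that the text does not specify.

```python
# Each reference is (layer, timing). "same" is the layer of the VOP being
# encoded; "other" is a different layer (the one given by ref_layer_id).
P_REF = {
    "00": ("same", "decoded immediately before"),
    "01": ("other", "displayed immediately before"),
    "10": ("other", "displayed immediately after"),
    "11": ("other", "same time"),
}

# For B pictures: (forward reference, backward reference).
B_REF = {
    "00": (("other", "same time"), ("same", "decoded immediately before")),
    "01": (("same", None), ("other", "displayed immediately before")),
    "10": (("same", "decoded immediately before"),
           ("other", "displayed immediately after")),
    "11": (("other", "displayed immediately before"),
           ("other", "displayed immediately after")),
}


def references(picture_type, ref_select_code):
    """List the reference VOPs implied by a picture type and ref_select_code."""
    if picture_type == "P":
        return [P_REF[ref_select_code]]
    if picture_type == "B":
        return list(B_REF[ref_select_code])
    raise ValueError("only P and B pictures carry ref_select_code")
```

A table-driven form like this makes it easy to see which codes refer to the other layer at the same time: “11” for P pictures and “00” (forward reference) for B pictures.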

【０１６３】The predictive encoding methods described with reference to FIGS. 13 and 14 are only examples. Which VOP of which layer is used as the reference image in forward predictive encoding, backward predictive encoding, or bidirectional predictive encoding can be set freely within, for example, the range described with reference to FIG. 15.

【０１６４】The terms “spatial scalability”, “temporal scalability”, and “SNR scalability” have been used above for convenience. However, when the reference image used for predictive encoding is set as described with reference to FIG. 15, that is, when the syntax shown in FIG. 15 is used, it becomes difficult to distinguish clearly between spatial, temporal, and SNR scalability on the basis of the flag ref_select_code. Conversely, using the flag ref_select_code makes it unnecessary to draw such distinctions between types of scalability.

【０１６５】If the types of scalability described above are nevertheless to be associated with the flag ref_select_code, the correspondence is, for example, as follows. For a P picture, a ref_select_code of “11” means that the VOP at the same time in the layer indicated by the flag ref_layer_id is used as the reference image (the reference image for forward prediction), so this case corresponds to spatial scalability or SNR scalability. All values of ref_select_code other than “11” correspond to temporal scalability.

【０１６６】For a B picture, a ref_select_code of “00” likewise means that the VOP at the same time in the layer indicated by the flag ref_layer_id is used as the reference image for forward prediction, so this case corresponds to spatial scalability or SNR scalability. All values of ref_select_code other than “00” correspond to temporal scalability.
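The correspondence between ref_select_code and the type of scalability can be expressed as a small classifier. This is a sketch under the assumption, taken from the description of SNR scalability as spatial scalability with FR equal to 1, that the magnification can be used to separate the two same-time cases. The function name and the optional `fr` parameter are hypothetical.

```python
def scalability_kind(picture_type, ref_select_code, fr=None):
    """Classify a (picture type, ref_select_code) pair by scalability type."""
    same_time_code = {"P": "11", "B": "00"}
    if picture_type not in same_time_code:
        raise ValueError("picture_type must be 'P' or 'B'")
    if ref_select_code != same_time_code[picture_type]:
        return "temporal"
    if fr is None:
        return "spatial or SNR"
    # SNR scalability is treated as spatial scalability with FR equal to 1.
    return "SNR" if fr == 1 else "spatial"
```

For example, `scalability_kind("B", "00", fr=2)` returns `"spatial"`, while `scalability_kind("B", "10")` returns `"temporal"`.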

【０１６７】When the VOP at the same time in a different layer (here, the lower layer) is used as the reference image for predictive encoding of an upper layer VOP, there is no motion between the two, so the motion vector is always 0 (0, 0).

【０１６８】Returning to FIG. 12, the motion vector detector 32 of the upper layer encoding unit 23 sets the flags ref_layer_id and ref_select_code as described above and supplies them to the motion compensator 42 and the VLC unit 36.

【０１６９】The motion vector detector 32 also detects motion vectors by referring, in accordance with the flags ref_layer_id and ref_select_code, not only to the frame memory 31 but also, as necessary, to the frame memory 52.

【０１７０】The frame memory 52 is supplied with the enlarged image of the locally decoded lower layer from the resolution conversion unit 24 (FIG. 4). That is, the resolution conversion unit 24 enlarges the locally decoded lower layer VOP using, for example, an interpolation filter, thereby generating an enlarged image that is FR times the size of that VOP, that is, an enlarged image of the same size as the upper layer VOP corresponding to the lower layer VOP, and supplies it to the upper layer encoding unit 23. The frame memory 52 stores the enlarged image supplied from the resolution conversion unit 24 in this way.

【０１７１】Accordingly, when the magnification FR is 1, the resolution conversion unit 24 supplies the locally decoded VOP from the lower layer encoding unit 25 to the upper layer encoding unit 23 as it is, without any particular processing.
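The behaviour of the resolution conversion unit 24 can be illustrated with a minimal sketch. The text does not specify the interpolation filter, so plain pixel replication with an integer magnification is used here as a stand-in. The function name and the representation of an image as a list of rows are assumptions.

```python
def enlarge(image, fr):
    """Enlarge a 2-D list of pixel values by an integer factor fr using
    pixel replication (a stand-in for the unspecified interpolation filter)."""
    if fr < 1:
        raise ValueError("this sketch handles integer factors >= 1 only")
    if fr == 1:
        # FR = 1: the locally decoded VOP is passed through unchanged.
        return image
    out = []
    for row in image:
        wide = [pixel for pixel in row for _ in range(fr)]  # repeat columns
        out.extend(list(wide) for _ in range(fr))           # repeat rows
    return out
```

Enlarging `[[1, 2], [3, 4]]` by a factor of 2 yields a 4 by 4 image in which each input pixel covers a 2 by 2 block.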

【０１７２】The motion vector detector 32 is supplied with the size data FSZ_B and the offset data FPOS_B from the lower layer encoding unit 25 and with the magnification FR from the delay circuit 22 (FIG. 4). When the enlarged image stored in the frame memory 52 is used as the reference image, that is, when the lower layer VOP at the same time as an upper layer VOP is used as the reference image for predictive encoding of that upper layer VOP (in this case, as described with reference to FIG. 15, the flag ref_select_code is set to “11” for a P picture (FIG. 15(A)) and to “00” for a B picture (FIG. 15(B))), the motion vector detector 32 multiplies the size data FSZ_B and the offset data FPOS_B corresponding to the enlarged image by the magnification FR. On the basis of the multiplication results, it recognizes the position of the enlarged image in the absolute coordinate system and detects the motion vector.

【０１７３】The motion vector detector 32 is also supplied with the motion vector and the prediction mode of the lower layer, which are used in the following case. When the flag ref_select_code for a B picture of the upper layer is “00” and the magnification FR is 1, that is, in the case of SNR scalability (although here an upper layer VOP is used for predictive encoding of the upper layer, and in this respect the SNR scalability referred to here differs from that defined in MPEG2), the upper layer and the lower layer are the same image, so the motion vector and the prediction mode of the lower layer image at the same time can be used as they are for predictive encoding of the upper layer B picture. In this case, therefore, the motion vector detector 32 performs no particular processing for the upper layer B picture and adopts the motion vector and the prediction mode of the lower layer as they are.

【０１７４】In this case, in the upper layer coding unit 23, the motion vector and the prediction mode are not output from the motion vector detector 32 to the VLC unit 36 (and are therefore not transmitted). This is because the receiving side can recognize the motion vector and the prediction mode of the upper layer from the decoding result of the lower layer.
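
The reuse rule for upper-layer B pictures can be sketched as follows. This is an illustrative outline only: the function name `upper_layer_b_motion`, the `search` callback standing in for ordinary motion estimation, and the returned `transmit` flag are all hypothetical and do not come from the text.

```python
def upper_layer_b_motion(ref_select_code, fr, lower_mv, lower_mode, search):
    """Choose the motion vector and prediction mode for an upper-layer B-picture macroblock.

    Returns (motion_vector, prediction_mode, transmit), where `transmit` says
    whether the pair has to be sent to the VLC stage.
    """
    if ref_select_code == "00" and fr == 1:
        # SNR-scalability case: both layers are the same image, so the
        # lower-layer result is adopted unchanged and nothing is transmitted;
        # the decoder recovers it from the lower-layer decoding result.
        return lower_mv, lower_mode, False
    # Otherwise run the ordinary search (hypothetical callback).
    mv, mode = search()
    return mv, mode, True
```

Passing the search as a callback keeps the sketch independent of any particular motion estimation method.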

【０１７５】As described above, the motion vector detector 32 detects motion vectors using the enlarged image as a reference image in addition to the VOPs of the upper layer and, as described with reference to FIG. 42, sets the prediction mode that minimizes the prediction error (or variance). The motion vector detector 32 also sets and outputs other necessary information, such as the flags ref_select_code and ref_layer_id.

【０１７６】In FIG. 12, the flag COD, which indicates whether a macroblock constituting an I or P picture in the lower layer is a skip macroblock, is supplied from the lower layer coding unit 25 to the motion vector detector 32, the VLC unit 36, and the motion compensator 42. This will be described later.

【０１７７】A macroblock for which a motion vector has been detected is encoded in the same manner as described above, and the VLC unit 36 outputs a variable-length code as the encoding result.

【０１７８】The VLC unit 36 of the upper layer coding unit 23 sets and outputs the flags COD and MODB, as in the lower layer coding unit 25. As described above, the flag COD indicates whether a macroblock of an I or P picture is a skip macroblock, whereas the flag MODB indicates whether a macroblock of a B picture is a skip macroblock.

【０１７９】In addition to the quantization coefficients, the quantization step, the motion vector, and the prediction mode, the VLC unit 36 is also supplied with the magnification FR, the flags ref_select_code and ref_layer_id, the size data FSZ_E, the offset data FPOS_E, and the output of the key signal encoding unit 51. The VLC unit 36 variable-length encodes all of these data and outputs them.

【０１８０】Meanwhile, a macroblock for which a motion vector has been detected is encoded, then locally decoded as described above and stored in the frame memory 41. The motion compensator 42 then performs motion compensation in the same manner as the motion vector detector 32, using as reference images not only the locally decoded VOP of the upper layer stored in the frame memory 41 but also the locally decoded and enlarged VOP of the lower layer stored in the frame memory 52, and generates a predicted image.

【０１８１】That is, in addition to the motion vector and the prediction mode, the motion compensator 42 is supplied with the flags ref_select_code and ref_layer_id, the decoded key signal, the magnification FR, the size data FSZ_B and FSZ_E, and the offset data FPOS_B and FPOS_E. The motion compensator 42 identifies the reference image to be motion-compensated based on the flags ref_select_code and ref_layer_id. When the locally decoded VOP of the upper layer or the enlarged image is used as the reference image, it recognizes the position and size of that image in the absolute coordinate system based on the size data FSZ_E and the offset data FPOS_E, or on the size data FSZ_B and the offset data FPOS_B, respectively, and generates a predicted image using the magnification FR and the decoded key signal as necessary.

【０１８２】Meanwhile, the key signal of the VOP of the upper layer is supplied to the key signal encoding unit 51. The key signal encoding unit 51 encodes the key signal in the same manner as, for example, the key signal encoding unit 43 in FIG. 11, and supplies the result to the VLC unit 36 and the key signal decoding unit 53. The key signal decoding unit 53 decodes the encoded key signal from the key signal encoding unit 51. As described above, the decoded key signal is supplied to the motion vector detector 32 and the motion compensator 42 and is used to extract the VOP of the upper layer.

【０１８３】Next, FIG. 16 shows the configuration of an embodiment of a decoder that decodes the bitstream output from the encoder of FIG. 1.

【０１８４】The bitstream output from the encoder of FIG. 1 and transmitted via the transmission path 5 is received by a receiving device (not shown), or the bitstream recorded on the recording medium 6 is reproduced by a reproducing device (not shown), and is supplied to the demultiplexing unit 71.

【０１８５】The demultiplexing unit 71 separates the input bitstream into bitstreams VO1, VO2, ... for the respective VOs and supplies each to the corresponding VOP decoding unit 72n. The VOP decoding unit 72n decodes, from the bitstream supplied by the demultiplexing unit 71, the VOPs (image data) constituting the VO, the key signal, the size data (VOP size), and the offset data (VOP offset), and supplies them to the image reconstruction unit 73.

【０１８６】The image reconstruction unit 73 reconstructs the original image based on the outputs of the VOP decoding units 72_1 to 72_N. The reconstructed image is supplied to, for example, the monitor 74 and displayed.

【０１８７】Next, FIG. 17 shows a basic configuration example of the VOP decoding unit 72n of FIG. 16.

【０１８８】The bitstream from the demultiplexing unit 71 (FIG. 16) is input to the demultiplexing unit 81, where key signal information and motion and texture information are extracted. The key signal information is supplied to the key signal decoding unit 82, and the motion and texture information is supplied to the image signal decoding unit 83. In addition, the demultiplexing unit 71 extracts the size data (VOP size) and the offset data (VOP offset) from its input bitstream and supplies them to the image reconstruction unit 73 (FIG. 16).

【０１８９】The key signal decoding unit 82 and the image signal decoding unit 83 decode the key signal information and the motion and texture information, respectively, and supply the resulting key signal and VOP image data (luminance and color difference signals) to the image reconstruction unit 73.

【０１９０】When the key signal encoding unit 12 of FIG. 3 has encoded the key signal by motion-compensating it according to the motion vector detected by the image signal encoding unit 11, the motion vector used by the image signal decoding unit 83 to decode the image is supplied to the key signal decoding unit 82, and the key signal decoding unit 82 decodes the key signal using that motion vector.

【０１９１】Next, FIG. 18 shows a configuration example of the VOP decoding unit 72n of FIG. 16 that realizes scalability.

【０１９２】The bitstream supplied from the demultiplexing unit 71 (FIG. 16) is input to the demultiplexing unit 91, where it is separated into the bitstream of the upper layer VOPs and the bitstream of the lower layer VOPs. The upper layer bitstream is delayed in the delay circuit 92 by the processing time of the lower layer decoding unit 95 and then supplied to the upper layer decoding unit 93, while the lower layer bitstream is supplied to the lower layer decoding unit 95.

【０１９３】The lower layer decoding unit 95 decodes the lower layer bitstream and supplies the resulting decoded image and key signal of the lower layer to the resolution conversion unit 94. The lower layer decoding unit 95 also supplies the upper layer decoding unit 93 with the information needed to decode the VOPs of the upper layer, such as the size data FSZ_B, the offset data FPOS_B, the motion vectors (MV), the prediction modes, and the flag COD, all obtained by decoding the lower layer bitstream.

【０１９４】The upper layer decoding unit 93 decodes the upper layer bitstream supplied via the delay circuit 92, referring to the outputs of the lower layer decoding unit 95 and the resolution conversion unit 94 as necessary, and outputs the resulting decoded image, key signal, size data FSZ_E, and offset data FPOS_E of the upper layer. The upper layer decoding unit 93 also outputs the magnification FR, obtained by decoding the upper layer bitstream, to the resolution conversion unit 94. The resolution conversion unit 94 converts the decoded image of the lower layer using the magnification FR from the upper layer decoding unit 93, in the same manner as the resolution conversion unit 24 in FIG. 4. The enlarged image obtained by this conversion is supplied to the upper layer decoding unit 93 and, as described above, is used to decode the upper layer bitstream.

【０１９５】Next, FIG. 19 shows a configuration example of the lower layer decoding unit 95 of FIG. 18. In the figure, parts corresponding to those of the decoder of FIG. 43 are denoted by the same reference numerals. That is, the lower layer decoding unit 95 is configured basically in the same way as the decoder of FIG. 43, except that a key signal decoding unit 108 is newly provided.

【０１９６】The lower layer bitstream from the demultiplexing unit 91 is supplied to the buffer 101 and stored there. The IVLC unit 102 reads the bitstream from the buffer 101 as appropriate, according to the processing state of the subsequent blocks, and variable-length decodes it to separate the quantization coefficients, the motion vectors, the prediction modes, the quantization step, the encoded key signal data, the size data FSZ_B, the offset data FPOS_B, the flag COD, and so on. The quantization coefficients and the quantization step are supplied to the inverse quantizer 103, and the motion vectors and prediction modes are supplied to the motion compensator 107 and the upper layer decoding unit 93 (FIG. 18). The size data FSZ_B and the offset data FPOS_B are supplied to the motion compensator 107, the key signal decoding unit 108, the image reconstruction unit 73 (FIG. 16), and the upper layer decoding unit 93, and the flag COD is supplied to the upper layer decoding unit 93. The encoded key signal data is supplied to the key signal decoding unit 108.

【０１９７】The inverse quantizer 103, the IDCT unit 104, the arithmetic unit 105, the frame memory 106, and the motion compensator 107 perform the same processing as the inverse quantizer 38, the IDCT unit 39, the arithmetic unit 40, the frame memory 41, and the motion compensator 42, respectively, of the lower layer coding unit 25 in FIG. 11. The VOP of the lower layer is thereby decoded and supplied to the image reconstruction unit 73, the upper layer decoding unit 93, and the resolution conversion unit 94 (FIG. 18).

【０１９８】The key signal decoding unit 108 likewise performs the same processing as the key signal decoding unit 44 of the lower layer coding unit 25 in FIG. 11, thereby decoding the encoded key signal data. The resulting key signal is supplied to the image reconstruction unit 73, the upper layer decoding unit 93, and the resolution conversion unit 94.

【０１９９】Next, FIG. 20 shows a configuration example of the upper layer decoding unit 93 of FIG. 18. In the figure, parts corresponding to those in FIG. 43 are denoted by the same reference numerals. That is, the upper layer decoding unit 93 is configured basically in the same way as the decoder of FIG. 43, except that a key signal decoding unit 111 and a frame memory 112 are newly provided.

【０２００】The upper layer bitstream from the demultiplexing unit 91 is supplied to the IVLC unit 102 via the buffer 101. The IVLC unit 102 variable-length decodes the upper layer bitstream to separate the quantization coefficients, the motion vectors, the prediction modes, the quantization step, the encoded key signal data, the size data FSZ_E, the offset data FPOS_E, the magnification FR, the flags ref_layer_id, ref_select_code, COD, and MODB, and so on. As in FIG. 19, the quantization coefficients and the quantization step are supplied to the inverse quantizer 103, and the motion vectors and prediction modes are supplied to the motion compensator 107. The size data FSZ_E and the offset data FPOS_E are supplied to the motion compensator 107, the key signal decoding unit 111, and the image reconstruction unit 73 (FIG. 16), and the flags COD, MODB, ref_layer_id, and ref_select_code are supplied to the motion compensator 107. The encoded key signal data is supplied to the key signal decoding unit 111, and the magnification FR is supplied to the motion compensator 107 and the resolution conversion unit 94 (FIG. 18).

【０２０１】In addition to the data described above, the motion compensator 107 is supplied with the motion vectors, the flag COD, the size data FSZ_B, and the offset data FPOS_B of the lower layer from the lower layer decoding unit 95 (FIG. 18). The frame memory 112 is supplied with the enlarged image from the resolution conversion unit 94.

【０２０２】The inverse quantizer 103, the IDCT unit 104, the arithmetic unit 105, the frame memory 106, the motion compensator 107, and the frame memory 112 perform the same processing as the inverse quantizer 38, the IDCT unit 39, the arithmetic unit 40, the frame memory 41, the motion compensator 42, and the frame memory 52, respectively, of the upper layer coding unit 23 in FIG. 12. The VOP of the upper layer is thereby decoded and supplied to the image reconstruction unit 73.

【０２０３】The key signal decoding unit 111 likewise performs the same processing as the key signal decoding unit 53 of the upper layer coding unit 23 in FIG. 12, thereby decoding the encoded key signal data. The resulting key signal is supplied to the image reconstruction unit 73.

【０２０４】In the VOP decoding unit 72n having the upper layer decoding unit 93 and the lower layer decoding unit 95 configured as described above, the decoded image, key signal, size data FSZ_E, and offset data FPOS_E for the upper layer (hereinafter collectively referred to as upper layer data, as appropriate) and the decoded image, key signal, size data FSZ_B, and offset data FPOS_B for the lower layer (hereinafter collectively referred to as lower layer data, as appropriate) are obtained. The image reconstruction unit 73 reconstructs an image from the upper layer data or the lower layer data, for example, as follows.

【０２０５】For example, suppose the first spatial scalability (FIG. 5) has been performed, that is, the entire input VOP is used as the upper layer and a reduced version of the entire VOP is used as the lower layer. When both the lower layer data and the upper layer data have been decoded, the image reconstruction unit 73 uses only the upper layer data: it extracts the decoded image (VOP) of the upper layer, whose size corresponds to the size data FSZ_E, with its key signal as necessary, and places it at the position indicated by the offset data FPOS_E. When only the lower layer data has been decoded, for example because an error occurred in the upper layer bitstream or because the monitor 74 supports only low-resolution images, the image reconstruction unit 73 uses only the lower layer data: it extracts the decoded image (VOP) of the lower layer, whose size corresponds to the size data FSZ_B, with its key signal as necessary, and places it at the position indicated by the offset data FPOS_B.

【０２０６】Suppose instead that the second spatial scalability (FIG. 6) has been performed, that is, a part of the input VOP is used as the upper layer and a reduced version of the entire VOP is used as the lower layer. When both the lower layer data and the upper layer data have been decoded, the image reconstruction unit 73 enlarges the decoded image of the lower layer, whose size corresponds to the size data FSZ_B, according to the magnification FR to generate an enlarged image. The image reconstruction unit 73 then multiplies the offset data FPOS_B by FR and places the enlarged image at the position corresponding to the resulting value. Finally, the image reconstruction unit 73 places the decoded image of the upper layer, whose size corresponds to the size data FSZ_E, at the position indicated by the offset data FPOS_E.

【０２０７】In this case, the portion covered by the decoded image of the upper layer is displayed at a higher resolution than the rest of the image.

【０２０８】When the decoded image of the upper layer is placed, it is combined with the enlarged image; this combination is performed using the key signal of the upper layer.
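
The key-based combination can be illustrated with a small sketch. It assumes, for illustration only, that images are lists of pixel rows, that the canvas already holds the enlarged lower-layer image with its origin at the origin of the absolute coordinate system, and that the upper-layer key is a hard (binary) key; the function name `composite` is hypothetical.

```python
def composite(canvas, upper, key, fpos_e):
    """Overlay the upper-layer image onto the canvas wherever the key is 1.

    canvas -- rows of pixels already containing the enlarged lower-layer image
    upper  -- rows of pixels of the decoded upper-layer image
    key    -- rows of 0/1 values with the same shape as `upper` (hard key)
    fpos_e -- (x, y) position of the upper-layer image, i.e. FPOS_E
    """
    x0, y0 = fpos_e
    out = [row[:] for row in canvas]  # copy so the input canvas is left untouched
    for dy, (image_row, key_row) in enumerate(zip(upper, key)):
        for dx, (pixel, k) in enumerate(zip(image_row, key_row)):
            if k:  # inside the object: take the high-resolution pixel
                out[y0 + dy][x0 + dx] = pixel
    return out
```

For a soft key, the same loop would blend the two pixels in proportion to the key value instead of selecting one of them; that variant is not shown here.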

【０２０９】Although not shown in FIG. 18 (or FIG. 16), the magnification FR is also supplied, in addition to the data described above, from the upper layer decoding unit 93 (VOP decoding unit 72n) to the image reconstruction unit 73, which uses it to generate the enlarged image.

【０２１０】On the other hand, when the second spatial scalability has been performed and only the lower layer data has been decoded, the image is reconstructed in the same manner as in the case of the first spatial scalability described above.

【０２１１】Furthermore, when the third spatial scalability (FIGS. 7 and 8) has been performed, that is, for each object constituting the input VOP, the entire object is used as the upper layer and a decimated version of the entire object is used as the lower layer, the image is reconstructed in the same manner as in the case of the second spatial scalability described above.
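
The reconstruction cases for the three kinds of spatial scalability can be summarized in a short sketch. The function name `reconstruction_plan`, the integer codes 1 to 3 for the scalability type, and the string labels are assumptions made for illustration; the returned list gives the images in the order in which they would be placed.

```python
def reconstruction_plan(scalability, upper_decoded):
    """Return which images the reconstruction step places, in order.

    scalability   -- 1, 2, or 3 (first, second, or third spatial scalability)
    upper_decoded -- True when the upper layer data was decoded as well
    """
    if scalability not in (1, 2, 3):
        raise ValueError("scalability must be 1, 2, or 3")
    if not upper_decoded:
        # Only lower layer data is available: place the lower-layer image.
        return ["lower"]
    if scalability == 1:
        # The upper layer covers the whole VOP, so it is used alone.
        return ["upper"]
    # Second and third scalability: enlarged lower layer first, then the
    # upper layer combined on top of it using the upper-layer key signal.
    return ["enlarged_lower", "upper"]
```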

【０２１２】As described above, the offset data FPOS_B and FPOS_E are set so that corresponding pixels of the enlarged image of the lower layer and the image of the upper layer are located at the same position in the absolute coordinate system. Reconstructing the image as described above therefore yields an accurate image free of positional misalignment.
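
The alignment property can be made concrete with a small coordinate sketch. It assumes that the enlarged image is placed at FPOS_B multiplied by FR and the upper-layer image at FPOS_E, as in the reconstruction described earlier, and that FR is an integer; the function name `enlarged_to_upper` is hypothetical.

```python
def enlarged_to_upper(pixel, fpos_b, fpos_e, fr):
    """Map a pixel of the enlarged lower-layer image to upper-layer coordinates.

    pixel  -- (i, j) index inside the enlarged image
    fpos_b -- (x, y) offset FPOS_B of the lower-layer VOP
    fpos_e -- (x, y) offset FPOS_E of the upper-layer VOP
    fr     -- magnification FR (assumed to be an integer here)
    """
    i, j = pixel
    # Absolute position of the pixel: the enlarged image sits at FPOS_B * FR.
    abs_x = fpos_b[0] * fr + i
    abs_y = fpos_b[1] * fr + j
    # The same absolute position expressed relative to the upper-layer origin.
    return abs_x - fpos_e[0], abs_y - fpos_e[1]
```

A result with a negative component, or one beyond the upper-layer size FSZ_E, simply means that the pixel lies outside the region covered by the upper layer.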

【０２１３】Next, the syntax for scalability will be described, taking the MPEG4 VM (Verification Model) as an example.

【０２１４】FIG. 21 shows the structure of a bitstream obtained by scalable encoding.

【０２１５】The bitstream is organized in units of VS (Video Session Class), and each VS consists of one or more VOs (Video Object Class). A VO consists of one or more VOLs (Video Object Layer Class): one VOL when the image is not layered, and as many VOLs as there are layers when it is. A VOL consists of VOPs.
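
The containment hierarchy of the bitstream (VS, VO, VOL, VOP) can be pictured as nested data classes. This is only a structural sketch: the class fields, in particular the `time` attribute of a VOP, and the helper `layer_count` are hypothetical and are not part of the syntax shown in the figures.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VOP:
    time: int  # hypothetical: display time of this video object plane


@dataclass
class VOL:
    video_object_layer_id: int  # e.g. 0 for the lower layer, 1 for the upper layer
    vops: List[VOP] = field(default_factory=list)


@dataclass
class VO:
    vols: List[VOL] = field(default_factory=list)  # one VOL per layer


@dataclass
class VS:
    vos: List[VO] = field(default_factory=list)  # one or more video objects


def layer_count(vo: VO) -> int:
    """Number of layers of a VO: 1 when the image is not layered."""
    return len(vo.vols)


session = VS(vos=[VO(vols=[VOL(0, [VOP(0)]), VOL(1, [VOP(0)])])])
```

Here `session` holds a single VO coded in two layers, so `layer_count(session.vos[0])` evaluates to 2.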

【０２１６】FIGS. 22 and 23 show the syntax of the VS and the VO, respectively. A VO is a bitstream corresponding to a sequence of an entire image or of a part of an image (an object), so a VS consists of a set of such sequences (a VS thus corresponds to, for example, a single program).

【０２１７】FIG. 24 shows the syntax of the VOL.

【０２１８】The VOL is a class for scalability and is identified by the number given in video_object_layer_id (the portion indicated by A1 in FIG. 24). For example, video_object_layer_id is set to 0 for the VOL of the lower layer and to 1 for the VOL of the upper layer. As described above, the number of scalable layers is not limited to two and may be any number of three or more.

【０２１９】Whether each VOL is an entire image or a part of an image is identified by video_object_layer_shape (the portion indicated by A2 in FIG. 24). video_object_layer_shape is a flag indicating the shape of the VOL and is set, for example, as follows.

【０２２０】When the shape of the VOL is rectangular, video_object_layer_shape is set to, for example, "00". When the VOL has the shape of a region extracted by a hard key (a binary signal that takes either the value 0 or the value 1), video_object_layer_shape is set to, for example, "01". When the VOL has the shape of a region extracted by a soft key (a gray-scale signal that can take continuous values in the range 0 to 1), that is, when it is combined using a soft key, video_object_layer_shape is set to, for example, "10".
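
The example code assignments for video_object_layer_shape can be written as a lookup table. The dictionary name `SHAPE_CODES`, the descriptive strings, and the function `describe_shape` are illustrative assumptions; only the three codes and their meanings are taken from the text, which itself presents them as examples.

```python
# Example assignments of the 2-bit video_object_layer_shape flag.
SHAPE_CODES = {
    "00": "rectangular",
    "01": "binary (extracted by a hard key)",
    "10": "gray-scale (extracted by a soft key)",
}


def describe_shape(code):
    """Return the shape description for a 2-bit video_object_layer_shape code."""
    try:
        return SHAPE_CODES[code]
    except KeyError:
        # Codes not covered by the description above are rejected in this sketch.
        raise ValueError(f"no shape assigned to code {code!r}") from None
```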

【０２２１】Here, video_object_layer_shape is set to "00" when the shape of the VOL is rectangular and the position and size of the VOL in the absolute coordinate system do not change with time, that is, are constant. In this case, the size (width and height) is given by video_object_layer_width and video_object_layer_height (the portion indicated by A7 in FIG. 24). video_object_layer_width and video_object_layer_height are both 10-bit fixed-length flags and, when video_object_layer_shape is "00", are transmitted only once, at the beginning (because, as described above, the size of the VOL in the absolute coordinate system is constant when video_object_layer_shape is "00").
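
Reading two 10-bit fixed-length fields can be sketched as follows. The sketch assumes, purely for illustration, that the bitstream is available as a string of '0' and '1' characters, that the bits are stored most significant bit first, and that video_object_layer_width is immediately followed by video_object_layer_height; the actual field order and surrounding syntax are defined by the figure, not by this code. The helper names are hypothetical.

```python
def read_bits(bits, pos, n):
    """Read an n-bit unsigned value (MSB first) from a '0'/'1' string at `pos`."""
    chunk = bits[pos:pos + n]
    if len(chunk) != n:
        raise ValueError("bitstream ended before the field was complete")
    return int(chunk, 2), pos + n


def read_layer_size(bits, pos=0):
    """Read video_object_layer_width and video_object_layer_height (10 bits each)."""
    width, pos = read_bits(bits, pos, 10)
    height, pos = read_bits(bits, pos, 10)
    return (width, height), pos


example = format(352, "010b") + format(288, "010b")
layer_size, next_pos = read_layer_size(example)
```

Because each field is 10 bits long, the largest value it can represent is 1023.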

【０２２２】Whether the VOL is the lower layer or the upper layer is indicated by scalability, a 1-bit flag (the portion indicated by A3 in FIG. 24). When the VOL is the lower layer, scalability is set to, for example, 1; otherwise, scalability is set to, for example, 0.

【０２２３】When a VOL uses an image in another VOL as a reference image, the VOL to which the reference image belongs is indicated by ref_layer_id (the portion indicated by A4 in FIG. 24), as described above. ref_layer_id is transmitted only for the upper layer.

【０２２４】In FIG. 24, hor_sampling_factor_n and hor_sampling_factor_m, indicated by A5, give a value corresponding to the horizontal length of the VOP of the lower layer and a value corresponding to the horizontal length of the VOP of the upper layer, respectively. The horizontal length of the upper layer relative to the lower layer (the horizontal resolution magnification) is therefore given by hor_sampling_factor_n/hor_sampling_factor_m.

【０２２５】さらに、図２４においてＡ６で示すver_sa
mpling_factor_nとver_sampling_factor_mは、下位レイ
ヤのＶＯＰの垂直方向の長さに対応する値と、上位レイ
ヤのＶＯＰの垂直方向の長さに対応する値をそれぞれ示
す。従って、下位レイヤに対する上位レイヤの垂直方向
の長さ（垂直方向の解像度の倍率）は、式ver_sampling
_factor_n/ver_sampling_factor_mで与えられる。Further, ver_sampling_factor_n and ver_sampling_factor_m, indicated by A6 in FIG. 24, indicate a value corresponding to the vertical length of the VOP of the lower layer and a value corresponding to the vertical length of the VOP of the upper layer, respectively. Therefore, the vertical length of the upper layer relative to the lower layer (the vertical resolution magnification) is given by ver_sampling_factor_n / ver_sampling_factor_m.
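The two ratios described above can be illustrated with a minimal sketch. The function names and the truncating conversion to whole pixels are assumptions made for illustration only; the text specifies just the ratios hor_sampling_factor_n / hor_sampling_factor_m and ver_sampling_factor_n / ver_sampling_factor_m.

```python
from fractions import Fraction


def layer_magnification(hor_n, hor_m, ver_n, ver_m):
    """Return (horizontal, vertical) magnification of the upper layer
    relative to the lower layer as exact fractions."""
    if hor_m <= 0 or ver_m <= 0:
        raise ValueError("sampling_factor_m values must be positive")
    return Fraction(hor_n, hor_m), Fraction(ver_n, ver_m)


def upper_layer_size(lower_width, lower_height, hor_n, hor_m, ver_n, ver_m):
    """Scale a lower-layer VOP size to the upper layer.

    Truncation to an integer is an assumption of this sketch.
    """
    hor, ver = layer_magnification(hor_n, hor_m, ver_n, ver_m)
    return int(lower_width * hor), int(lower_height * ver)
```

For instance, with all four factors chosen so that both ratios equal 2, a 176 x 144 lower-layer VOP corresponds to a 352 x 288 upper-layer VOP.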

【０２２６】次に、図２５は、ＶＯＰ（Video Object P
lane Class）のシンタクスを示している。Next, FIG. 25 shows the syntax of the VOP (Video Object Plane Class).

【０２２７】ＶＯＰの大きさ（横と縦の長さ）は、例え
ば、１０ビット固定長のVOP_widthとVOP_height（図２
５において、Ｂ１で示す部分）で表される。また、ＶＯ
Ｐの絶対座標系における位置は、例えば、１０ビット固
定長のVOP_horizontal_spatial_mc_ref（図２５におい
て、Ｂ２で示す部分）とVOP_vertical_mc_ref（図２５
において、Ｂ３で示す部分）で表される。なお、VOP_wi
dthまたはVOP_heightは、ＶＯＰの水平方向または垂直
方向の長さをそれぞれ表し、これらは、上述のサイズデ
ータＦＳＺ＿ＢやＦＳＺ＿Ｅに相当する。また、VOP_ho
rizontal_spatial_mc_refまたはVOP_vertical_mc_ref
は、ＶＯＰの水平方向または垂直方向の座標（ｘまたは
ｙ座標）をそれぞれ表し、これらは、上述のオフセット
データＦＰＯＳ＿ＢやＦＰＯＳ＿Ｅに相当する。The size of a VOP (its horizontal and vertical lengths) is represented by, for example, VOP_width and VOP_height, each of a fixed length of 10 bits (the portion indicated by B1 in FIG. 25). The position of the VOP in the absolute coordinate system is represented by, for example, VOP_horizontal_spatial_mc_ref (the portion indicated by B2 in FIG. 25) and VOP_vertical_mc_ref (the portion indicated by B3 in FIG. 25), each of a fixed length of 10 bits. VOP_width and VOP_height represent the horizontal and vertical lengths of the VOP, respectively, and correspond to the size data FSZ_B and FSZ_E described above. VOP_horizontal_spatial_mc_ref and VOP_vertical_mc_ref represent the horizontal and vertical coordinates (x and y coordinates) of the VOP, respectively, and correspond to the offset data FPOS_B and FPOS_E described above.

【０２２８】VOP_width，VOP_height，VOP_horizontal_
spatial_mc_ref、およびVOP_vertical_mc_refは、video
_object_layer_shapeが「００」以外の場合にのみ伝送
される。即ち、video_object_layer_shapeが「００」の
場合、上述したように、ＶＯＰの大きさおよび位置はい
ずれも一定であるから、VOP_width，VOP_height，VOP_h
orizontal_spatial_mc_ref、およびVOP_vertical_mc_re
fは伝送する必要がない。この場合、受信側では、ＶＯ
Ｐは、その左上の頂点が、例えば、絶対座標系の原点に
一致するように配置され、また、その大きさは、図２４
で説明したvideo_object_layer_widthおよびvideo_obje
ct_layer_heightから認識される。VOP_width, VOP_height, VOP_horizontal_spatial_mc_ref, and VOP_vertical_mc_ref are transmitted only when video_object_layer_shape is other than “00”. That is, when video_object_layer_shape is “00”, the size and position of the VOP are both constant, as described above, so VOP_width, VOP_height, VOP_horizontal_spatial_mc_ref, and VOP_vertical_mc_ref need not be transmitted. In this case, on the receiving side, the VOP is arranged such that its upper-left vertex coincides with, for example, the origin of the absolute coordinate system, and its size is recognized from video_object_layer_width and video_object_layer_height described with reference to FIG. 24.

【０２２９】図２５においてＢ４で示すref_select_cod
eは、図１５で説明したように、参照画像として用いる
画像を表すもので、同図に示すように、ＶＯＰのシンタ
クスにおいて規定されている。ref_select_code, indicated by B4 in FIG. 25, represents the image used as a reference image, as described with reference to FIG. 15, and is specified in the VOP syntax as shown in FIG. 25.

【０２３０】次に、図２６は、ＶＯＰ（Video Object P
lane Class）のシンタクスの他の例を示している。Next, FIG. 26 shows another example of the syntax of the VOP (Video Object Plane Class).

【０２３１】この実施の形態においても、図２５におけ
る場合と同様に、ＶＯＰの大きさおよび位置に関する情
報は、video_object_layer_shapeが「００」以外の場合
に伝送される。In this embodiment, as in the case of FIG. 25, information on the size and position of the VOP is transmitted when the video_object_layer_shape is other than “00”.

【０２３２】但し、この実施の形態では、video_object
_layer_shapeが「００」以外の場合、今回伝送するＶＯ
Ｐの大きさが、前回伝送したＶＯＰの大きさと等しいか
どうかを示す１ビットのフラグload_VOP_size（図２６
において、Ｃ１で示す部分）が伝送される。このload_V
OP_sizeは、今回のＶＯＰの大きさが、直前に復号化さ
れるＶＯＰの大きさと等しい場合、または等しくない場
合、例えば、それぞれ０または１とされる。However, in this embodiment, when video_object_layer_shape is other than “00”, a 1-bit flag load_VOP_size (the portion indicated by C1 in FIG. 26) is transmitted, indicating whether the size of the VOP transmitted this time is equal to the size of the VOP transmitted last time. load_VOP_size is set to, for example, 0 when the size of the current VOP is equal to the size of the VOP decoded immediately before, and to 1 when it is not.

【０２３３】そして、load_VOP_sizeが０の場合、VOP_w
idth，VOP_height（図２６において、Ｃ２で示す部分）
は伝送されず、また、load_VOP_sizeが１の場合のみ、V
OP_width，VOP_heightは伝送される。ここで、VOP_widt
h，VOP_heightは、図２５で説明したものと同様のもの
である。When load_VOP_size is 0, VOP_width and VOP_height (the portion indicated by C2 in FIG. 26) are not transmitted; VOP_width and VOP_height are transmitted only when load_VOP_size is 1. Here, VOP_width and VOP_height are the same as those described with reference to FIG. 25.
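The conditional transmission controlled by load_VOP_size can be sketched as follows. This is a minimal illustration, not the syntax of FIG. 26 itself: the representation of the bitstream as a list of (name, value, bit length) tuples and the function names are assumptions, and the 10-bit length is taken from the description of FIG. 25.

```python
def write_vop_size(prev_size, cur_size):
    """Encoder side: list the fields to emit for the VOP size.

    prev_size / cur_size are (width, height) tuples; prev_size may be
    None for the first VOP (an assumption of this sketch).
    """
    if cur_size == prev_size:
        # Size unchanged: only the 1-bit flag is sent.
        return [("load_VOP_size", 0, 1)]
    width, height = cur_size
    return [
        ("load_VOP_size", 1, 1),
        ("VOP_width", width, 10),
        ("VOP_height", height, 10),
    ]


def read_vop_size(fields, prev_size):
    """Decoder side: recover the VOP size from the emitted fields."""
    it = iter(fields)
    _, flag, _ = next(it)
    if flag == 0:
        # Reuse the size of the VOP decoded immediately before.
        return prev_size
    _, width, _ = next(it)
    _, height, _ = next(it)
    return (width, height)
```

When the size does not change, 1 bit is spent instead of 20, which is the saving the text attributes to this flag.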

【０２３４】なお、図２５や図２６において、VOP_widt
hまたはVOP_heightとしては、今回のＶＯＰの横の長さ
または縦の長さと、直前に復号されるＶＯＰの横の長さ
または縦の長さとの差分値（以下、適宜、大きさ差分と
いう）それぞれを用いることが可能である。Note that, in FIGS. 25 and 26, VOP_width and VOP_height may instead carry the difference between the horizontal or vertical length of the current VOP and the horizontal or vertical length of the VOP decoded immediately before (hereinafter referred to as a size difference, as appropriate).

【０２３５】実際の画像では、ＶＯＰの大きさが変化す
る頻度はそれほど多くなく、従って、load_VOP_sizeが
１の場合のみ、VOP_width，VOP_heightを伝送するよう
にすることで、冗長なビットを削減することが可能とな
る。また、大きさ差分を用いる場合には、さらに、情報
量の低減化を図ることが可能となる。In actual images, the size of a VOP does not change very often. Therefore, transmitting VOP_width and VOP_height only when load_VOP_size is 1 makes it possible to eliminate redundant bits. When the size difference is used, the amount of information can be reduced further.

【０２３６】なお、大きさ差分を用いる場合、その算出
は、図１１および図１２におけるＶＬＣ器３６において
行われ、さらに、このＶＬＣ器３６では、大きさ差分
が、例えば可変長符号化されて出力される。また、この
場合、図１９および図２０のＩＶＬＣ器１０２では、大
きさ差分と、直前に復号されたＶＯＰの大きさとを加算
することで、今回復号するＶＯＰの大きさが認識され
る。When the size difference is used, it is calculated in the VLC unit 36 in FIGS. 11 and 12, and the VLC unit 36 further outputs the size difference after, for example, variable-length encoding it. In this case, the IVLC unit 102 in FIGS. 19 and 20 recognizes the size of the VOP to be decoded this time by adding the size difference to the size of the VOP decoded immediately before.

【０２３７】一方、ＶＯＰの位置に関する情報について
は、絶対座標系における座標そのものではなく、今回の
ＶＯＰの座標と直前に復号されるＶＯＰ（前回のＶＯ
Ｐ）の座標との差分値（以下、適宜、位置差分という）
が、diff_VOP_horizontal_ref，diff_VOP_vertical_ref
（図２６において、Ｃ３で示す部分）によって伝送され
る。On the other hand, as information on the position of the VOP, what is transmitted by diff_VOP_horizontal_ref and diff_VOP_vertical_ref (the portion indicated by C3 in FIG. 26) is not the coordinates themselves in the absolute coordinate system but the difference between the coordinates of the current VOP and the coordinates of the VOP decoded immediately before (the previous VOP) (hereinafter referred to as a position difference, as appropriate).

【０２３８】ここで、直前に復号されるＶＯＰの絶対座
標系におけるｘまたはｙ座標をVOP_horizontal_mc_spat
ial_ref_prevまたはVOP_vertical_mc_spatial_ref_prev
と表すとき、diff_VOP_horizontal_refまたはdiff_VOP_
vertical_refは、図１１および図１２におけるＶＬＣ器
３６において、図２５に示したVOP_hirizontal_mc_spat
ial_refまたはVOP_vertical_mc_spatial_refを用い、次
式にしたがって、それぞれ計算される。Here, when the x or y coordinate, in the absolute coordinate system, of the VOP decoded immediately before is denoted by VOP_horizontal_mc_spatial_ref_prev or VOP_vertical_mc_spatial_ref_prev, diff_VOP_horizontal_ref or diff_VOP_vertical_ref is calculated in the VLC unit 36 in FIGS. 11 and 12 according to the following equations, using VOP_horizontal_mc_spatial_ref or VOP_vertical_mc_spatial_ref shown in FIG. 25.

【０２３９】[0239]

diff_VOP_horizontal_ref = VOP_horizontal_mc_spatial_ref − VOP_horizontal_mc_spatial_ref_prev
diff_VOP_vertical_ref = VOP_vertical_mc_spatial_ref − VOP_vertical_mc_spatial_ref_prev
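The position difference and its inverse on the decoding side can be sketched as below. The function names and the (x, y) tuple representation are assumptions of this sketch; only the subtraction itself comes from the equations above, and the reconstruction by addition is the natural inverse.

```python
def position_diff(cur_pos, prev_pos):
    """Encoder side: (diff_VOP_horizontal_ref, diff_VOP_vertical_ref).

    cur_pos and prev_pos are (x, y) coordinates of the current VOP and of
    the VOP decoded immediately before, in the absolute coordinate system.
    """
    return (cur_pos[0] - prev_pos[0], cur_pos[1] - prev_pos[1])


def reconstruct_position(diff, prev_pos):
    """Decoder side: add the position difference to the previous position."""
    return (prev_pos[0] + diff[0], prev_pos[1] + diff[1])
```

Because a VOP typically moves only slightly between consecutive pictures, the differences tend to be small in magnitude, which is what makes them cheaper to code than the absolute coordinates.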

【０２４０】なお、図１１および図１２におけるＶＬＣ
器３６では、diff_VOP_horizontal_ref，diff_VOP_vert
ical_refは、それぞれ可変長符号化されて出力される。Note that, in the VLC unit 36 in FIGS. 11 and 12, diff_VOP_horizontal_ref and diff_VOP_vertical_ref are each variable-length encoded and output.

【０２４１】即ち、ＶＬＣ器３６では、まず、diff_VOP
_horizontal_refまたはdiff_VOP_vertical_refに対応し
て、図２６においてＣ４で示す位置に配置されるdiff_s
ize_horizontalまたはdiff_size_verticalが、図２７に
示すテーブルにしたがって求められ、可変長符号（Cod
e）にそれぞれ変換される。さらに、ＶＬＣ器３６で
は、diff_VOP_horizontal_refまたはdiff_VOP_vertical
_refが、diff_size_horizontalまたはdiff_size_vertic
alに対応して、図２８に示すテーブルにしたがって可変
長符号（Code）にそれぞれ変換される。そして、このよ
うに可変長符号に変換されたdiff_VOP_horizontal_re
f，diff_VOP_vertical_ref，diff_size_horizontal、お
よびdiff_size_verticalが、他のデータに多重化されて
伝送される。That is, in the VLC unit 36, first, diff_size_horizontal or diff_size_vertical, which is placed at the position indicated by C4 in FIG. 26, is determined according to the table shown in FIG. 27 in correspondence with diff_VOP_horizontal_ref or diff_VOP_vertical_ref, and is converted into a variable-length code (Code). Further, in the VLC unit 36, diff_VOP_horizontal_ref or diff_VOP_vertical_ref is converted into a variable-length code (Code) according to the table shown in FIG. 28, in correspondence with diff_size_horizontal or diff_size_vertical. The diff_VOP_horizontal_ref, diff_VOP_vertical_ref, diff_size_horizontal, and diff_size_vertical thus converted into variable-length codes are then multiplexed with other data and transmitted.

【０２４２】この場合、図１９および図２０のＩＶＬＣ
器１０２では、diff_size_horizontalまたはdiff_size_
verticalから、diff_VOP_horizontal_refまたはdiff_VO
P_vertical_refの可変長符号の長さが認識され、その認
識結果に基づいて、それぞれ可変長復号される。In this case, in the IVLC unit 102 in FIGS. 19 and 20, the length of the variable-length code of diff_VOP_horizontal_ref or diff_VOP_vertical_ref is recognized from diff_size_horizontal or diff_size_vertical, and each is variable-length decoded on the basis of the recognition result.

【０２４３】以上のように、位置差分を伝送する場合に
おいは、図２５における場合に比較して、やはり情報量
を低減することが可能となる。As described above, transmitting the position difference also reduces the amount of information compared with the case of FIG. 25.

【０２４４】なお、図２６においてＣ５で示すref_sele
ct_codeは、図２５で説明したものと同様のものであ
る。Note that ref_select_code, indicated by C5 in FIG. 26, is the same as that described with reference to FIG. 25.

【０２４５】次に、図２９は、マクロブロックのシンタ
クスを示している。Next, FIG. 29 shows the syntax of a macroblock.

【０２４６】まず、図２９（Ａ）は、ＩまたはＰピクチ
ャ（ＶＯＰ）を構成するマクロブロックのシンタクスを
示しており、先頭のｆｉｒｓｔ＿ＭＭＲ＿ｃｏｄｅの後
に配置されるフラグＣＯＤは、そのＣＯＤより後に配置
されているデータがあるかどうかを示す。First, FIG. 29(A) shows the syntax of a macroblock constituting an I or P picture (VOP). The flag COD, placed after the leading first_MMR_code, indicates whether there is any data placed after that COD.

【０２４７】即ち、図１１の下位レイヤ符号化部２５お
よび図１２の上位レイヤ符号化部２３を構成するＶＬＣ
器３６は、ＩまたはＰピクチャを構成するマクロブロッ
クについて得られるＤＣＴ係数（ＤＣＴ係数の量子化結
果）が、すべて０であり、かつその動きベクトルが０の
とき、そのＩまたはＰピクチャのマクロブロックをスキ
ップマクロブロックとし、この場合、ＣＯＤを１とす
る。従って、ＣＯＤが１の場合には、そのマクロブロッ
クについて伝送すべきデータが存在しないため、フラグ
ＣＯＤより後のデータは伝送されない。That is, when the DCT coefficients (the quantization results of the DCT coefficients) obtained for a macroblock constituting an I or P picture are all 0 and its motion vector is 0, the VLC unit 36 constituting the lower layer encoding unit 25 in FIG. 11 and the upper layer encoding unit 23 in FIG. 12 treats that macroblock of the I or P picture as a skip macroblock and, in this case, sets COD to 1. Therefore, when COD is 1, there is no data to be transmitted for the macroblock, so the data following the flag COD is not transmitted.
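The skip decision for an I or P picture macroblock can be sketched in a few lines. The function name and the flat list of quantized coefficients are assumptions made for illustration; the condition itself (all quantized DCT coefficients 0 and a zero motion vector) is the one stated above.

```python
def cod_flag(quantized_dct_coeffs, motion_vector):
    """Return COD for a macroblock: 1 for a skip macroblock, else 0.

    quantized_dct_coeffs: iterable of quantized DCT coefficients of the
    whole macroblock; motion_vector: (x, y) tuple.
    """
    all_zero = all(c == 0 for c in quantized_dct_coeffs)
    return 1 if all_zero and motion_vector == (0, 0) else 0
```

When the function returns 1, nothing further would be written for that macroblock.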

【０２４８】一方、ＩまたはＰピクチャのＤＣＴ係数
に、０以外のＡＣ成分が存在するとき、ＶＬＣ器３６で
は、フラグＣＯＤが０とされ、その後に続くデータが必
要に応じて伝送される。なお、フラグＣＯＤの後に配置
されるＭＣＢＰＣは、マクロブロックのタイプを示すも
ので、このＭＣＢＰＣにしたがって、それに続く必要な
データが伝送される。On the other hand, when an AC component other than 0 exists in the DCT coefficient of the I or P picture, the VLC unit 36 sets the flag COD to 0, and the subsequent data is transmitted as necessary. The MCBPC arranged after the flag COD indicates the type of the macroblock, and necessary data following the MCBPC is transmitted according to the MCBPC.

【０２４９】ここで、Ｉピクチャがスキップマクロブロ
ックとなる場合は、基本的にはないので、Ｉピクチャに
ついてのＣＯＤは伝送されない（伝送しないようにする
ことができる）。Here, since a macroblock of an I picture basically never becomes a skip macroblock, COD is not transmitted for I pictures (its transmission can be omitted).

【０２５０】次に、図２９（Ｂ）は、Ｂピクチャ（ＶＯ
Ｐ）を構成するマクロブロックのシンタクスを示してお
り、先頭のｆｉｒｓｔ＿ＭＭＲ＿ｃｏｄｅの後に配置さ
れるフラグＭＯＤＢは、図２９（Ａ）におけるフラグＣ
ＯＤに対応し、そのＭＯＤＢより後に配置されているデ
ータがあるかどうか、即ち、Ｂピクチャのマクロブロッ
クのタイプを表す。Next, FIG. 29(B) shows the syntax of a macroblock constituting a B picture (VOP). The flag MODB, placed after the leading first_MMR_code, corresponds to the flag COD in FIG. 29(A) and indicates whether there is any data placed after that MODB, that is, the type of the macroblock of the B picture.

【０２５１】図１１および図１２のＶＬＣ器３６では、
ＭＯＤＢは、例えば、図３０に示すような可変長符号に
符号化されて伝送される。In the VLC unit 36 in FIGS. 11 and 12, MODB is encoded into, for example, a variable-length code as shown in FIG. 30 and transmitted.

【０２５２】即ち、本実施の形態では、ＭＯＤＢを可変
長符号化するためのテーブル（可変長符号化するための
テーブルは、その可変長符号を可変長復号化するために
も用いられるので、以下、適宜、その両方を含めて可変
長テーブルという）として、図３０（Ａ）および図３０
（Ｂ）に示す２種類が用意されており、図３０（Ａ）の
可変長テーブル（以下、適宜、ＭＯＤＢテーブルＡとい
う）においては、ＭＯＤＢについて３つの可変長符号
が、また、図３０（Ｂ）の可変長テーブル（以下、適
宜、ＭＯＤＢテーブルＢという）においては、ＭＯＤＢ
について２つの可変長符号が割り当てられている。That is, in the present embodiment, two types of table for variable-length encoding MODB are provided, as shown in FIGS. 30(A) and 30(B) (a table for variable-length encoding is also used for variable-length decoding the resulting code, so both are hereinafter referred to as a variable-length table, as appropriate). In the variable-length table of FIG. 30(A) (hereinafter referred to as MODB table A, as appropriate), three variable-length codes are assigned to MODB; in the variable-length table of FIG. 30(B) (hereinafter referred to as MODB table B, as appropriate), two variable-length codes are assigned to MODB.

【０２５３】図１１および図１２のＶＬＣ器３６は、Ｍ
ＯＤＢテーブルＡを用いる場合、Ｂピクチャを構成する
マクロブロックが、それが復号されるまでに復号され
る、他のフレームのマクロブロックについてのデータ
（量子化係数や動きベクトルなど）だけを用いて復号す
ることができるか、または、その直前に復号されるＩま
たはＰピクチャの、対応する位置におけるマクロブロッ
ク（いま処理をしようとしているＢピクチャのマクロブ
ロックの位置と同一の位置にあるＩまたはＰピクチャの
マクロブロック）がスキップマクロブロックであるとき
（ＣＯＤが１のとき）、そのＢピクチャのマクロブロッ
クをスキップマクロブロックとし、ＭＯＤＢを０とす
る。そして、この場合、ＭＢＴＹＰＥおよびＣＢＰＢを
含む、ＭＯＤＢより後のデータは伝送されない。When using MODB table A, the VLC unit 36 in FIGS. 11 and 12 treats a macroblock constituting a B picture as a skip macroblock and sets MODB to 0 when that macroblock can be decoded using only data (quantized coefficients, motion vectors, and the like) on macroblocks of other frames that are decoded before it, or when the macroblock at the corresponding position in the I or P picture decoded immediately before (the macroblock of the I or P picture at the same position as the macroblock of the B picture currently being processed) is a skip macroblock (COD is 1). In this case, the data following MODB, including MBTYPE and CBPB, is not transmitted.

【０２５４】また、マクロブロックについてのＤＣＴ係
数（量子化されたＤＣＴ係数）がすべて、例えば０など
の同一の値であるが、そのマクロブロックについての動
きベクトルが存在する場合（動きベクトルを伝送する必
要がある場合）、ＭＯＤＢは「１０」とされ、その後に
続くＭＢＴＹＰＥが伝送される。If the DCT coefficients (quantized DCT coefficients) of a macroblock all have the same value such as 0, for example, but a motion vector exists for the macroblock (the motion vector is transmitted). If necessary), the MODB is set to “10”, and the subsequent MBTYPE is transmitted.

【０２５５】さらに、マクロブロックについての少なく
とも１つのＤＣＴ係数が０でなく（ＤＣＴ係数が存在
し）、そのマクロブロックについての動きベクトルが存
在する場合、ＭＯＤＢは「１１」とされ、その後に続く
ＭＢＴＹＰＥおよびＣＢＰＢが伝送される。Further, if at least one DCT coefficient of a macroblock is not 0 (DCT coefficients are present) and a motion vector is present for the macroblock, MODB is set to “11”, and the subsequent MBTYPE and CBPB are transmitted.

【０２５６】ここで、ＭＢＴＹＰＥは、マクロブロック
の予測モードおよびそのマクロブロックに含まれるデー
タ（フラグ）を示すものであり、また、ＣＢＰＢは、マ
クロブロック中のどのブロックにＤＣＴ係数が存在する
かを示す６ビットのフラグである。即ち、マクロブロッ
クは、図３１に示すように、４個の輝度信号についての
８×８画素のブロックと、色差信号Ｃｂ，Ｃｒについて
の８×８画素のブロックとの合計で６個のブロックで構
成され、図１１および図１２のＤＣＴ器３４では、この
ブロックごとにＤＣＴ処理が施されるが、図１１および
図１２のＶＬＣ器３６では、６ビットのＣＢＰＢの各ビ
ットが、６個のブロックそれぞれにおけるＤＣＴ係数が
存在するかどうかで０または１とされる。Here, MBTYPE indicates the prediction mode of the macroblock and the data (flags) contained in the macroblock, and CBPB is a 6-bit flag indicating which blocks in the macroblock contain DCT coefficients. That is, as shown in FIG. 31, a macroblock consists of six blocks in total: four 8×8-pixel blocks for the luminance signal and 8×8-pixel blocks for the color difference signals Cb and Cr. The DCT unit 34 in FIGS. 11 and 12 performs DCT processing on each of these blocks, and the VLC unit 36 in FIGS. 11 and 12 sets each bit of the 6-bit CBPB to 0 or 1 depending on whether DCT coefficients exist in the corresponding one of the six blocks.

【０２５７】具体的には、マクロブロックを構成する６
個のブロックについて、例えば、図３１に示すように、
１乃至６のブロック番号が設定されているものとする
と、ＶＬＣ器３６は、例えば、ＣＢＰＢの第Ｎビット
（ここでは、例えば、最下位ビットを第１ビットとし、
最上位ビットを第６ビットとする）を、ブロック番号Ｎ
のブロックにＤＣＴ係数が存在しない場合に０とし、存
在する場合に１とする。従って、ＣＢＰＢが０（「００
００００」）である場合、そのマクロブロックについて
のＤＣＴ係数は存在しないことを意味する。Specifically, assuming that block numbers 1 to 6 are assigned to the six blocks constituting a macroblock, for example as shown in FIG. 31, the VLC unit 36 sets the N-th bit of CBPB (here, for example, the least significant bit is the first bit and the most significant bit is the sixth bit) to 0 when no DCT coefficient exists in the block with block number N, and to 1 when one exists. Therefore, when CBPB is 0 (“000000”), no DCT coefficients exist for that macroblock.
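The bit assignment for CBPB can be made concrete with a short sketch. The function names and the use of a list of booleans are assumptions for illustration; the mapping of block number N to the N-th bit, counted from the least significant bit, follows the example convention given in the text.

```python
def cbpb_from_blocks(block_has_coeffs):
    """Build the 6-bit CBPB value.

    block_has_coeffs: sequence of six booleans; index 0 corresponds to
    block number 1, index 5 to block number 6.
    """
    if len(block_has_coeffs) != 6:
        raise ValueError("a macroblock has exactly six blocks")
    value = 0
    for n, has_coeffs in enumerate(block_has_coeffs, start=1):
        if has_coeffs:
            # Bit N (LSB = bit 1) flags block number N.
            value |= 1 << (n - 1)
    return value


def cbpb_bits(value):
    """Format CBPB as a 6-character bit string, most significant bit first."""
    return format(value, "06b")
```

A macroblock whose blocks 1 and 6 carry DCT coefficients gives the value 33, which is written as the bit string 100001.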

【０２５８】一方、図１１および図１２のＶＬＣ器３６
において、ＭＯＤＢテーブルＢ（図３０（Ｂ））が用い
られる場合、ＭＯＤＢは、ＭＯＤＢテーブルＡが用いら
れる場合に「１０」または「１１」とされるときに、そ
れぞれ「０」または「１０」とされる。従って、ＭＯＤ
ＢテーブルＢが用いられる場合は、スキップマクロブロ
ックは生じない。On the other hand, when MODB table B (FIG. 30(B)) is used in the VLC unit 36 in FIGS. 11 and 12, MODB is set to “0” or “10” in the cases where it would be set to “10” or “11”, respectively, when MODB table A is used. Therefore, no skip macroblock occurs when MODB table B is used.
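The two MODB tables, as far as they are described in the text, can be written out as small lookup tables. This is a sketch: the case labels ("skip", "mbtype_only", "mbtype_and_cbpb") and the function names are hypothetical names introduced here, and the code strings are those quoted in the description rather than a transcription of FIG. 30.

```python
# Codes as quoted in the text for MODB table A and MODB table B.
MODB_TABLE_A = {"skip": "0", "mbtype_only": "10", "mbtype_and_cbpb": "11"}
MODB_TABLE_B = {"mbtype_only": "0", "mbtype_and_cbpb": "10"}


def encode_modb(case, table):
    """Return the variable-length code for a macroblock case."""
    try:
        return table[case]
    except KeyError:
        raise ValueError(f"case {case!r} is not defined in this table")


def decode_modb(bits, table):
    """Read one MODB code from the start of a bit string.

    Returns (case, number_of_bits_consumed). Both tables are prefix-free,
    so at most one code can match.
    """
    for case, code in table.items():
        if bits.startswith(code):
            return case, len(code)
    raise ValueError("no MODB code matches")
```

With table B the skip case simply cannot be expressed, which mirrors the statement that no skip macroblock occurs when that table is in use.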

【０２５９】次に、ＭＢＴＹＰＥは、図１１および図１
２のＶＬＣ器３６において、例えば、図３２に示すよう
な可変長符号に符号化されて伝送される。Next, MBTYPE is encoded by the VLC unit 36 in FIGS. 11 and 12 into, for example, a variable-length code as shown in FIG. 32 and transmitted.

【０２６０】即ち、本実施の形態では、ＭＢＴＹＰＥの
ための可変長テーブルとして、図３２（Ａ）および図３
２（Ｂ）に示す２種類が用意されており、図３２（Ａ）
の可変長テーブル（以下、適宜、ＭＢＴＹＰＥテーブル
Ａという）においては、ＭＢＴＹＰＥについて４つの可
変長符号が、また、図３２（Ｂ）の可変長テーブル（以
下、適宜、ＭＢＴＹＰＥテーブルＢという）において
は、ＭＢＴＹＰＥについて３つの可変長符号が割り当て
られている。That is, in the present embodiment, two types of variable-length table for MBTYPE are provided, as shown in FIGS. 32(A) and 32(B). In the variable-length table of FIG. 32(A) (hereinafter referred to as MBTYPE table A, as appropriate), four variable-length codes are assigned to MBTYPE; in the variable-length table of FIG. 32(B) (hereinafter referred to as MBTYPE table B, as appropriate), three variable-length codes are assigned to MBTYPE.

【０２６１】図１１および図１２のＶＬＣ器３６は、Ｍ
ＢＴＹＰＥテーブルＡを用いる場合、予測モードが双方
向予測符号化モード（Interpolate MC + Q）であるとき
には、ＭＢＴＹＰＥを「０１」に可変長符号化する。そ
して、この場合、ＤＱＵＡＮＴ，ＭＶＤf，ＭＶＤbが伝
送される。ここで、ＤＱＵＡＮＴは量子化ステップを、
ＭＶＤｆまたはＭＶＤｂは前方向予測または後方向予測
に用いられる動きベクトルをそれぞれ示す。なお、ＤＱ
ＵＡＮＴとしては、量子化ステップそのものではなく、
今回の量子化ステップと前回の量子化ステップとの差分
を用いることが可能である。When using MBTYPE table A, the VLC unit 36 in FIGS. 11 and 12 variable-length encodes MBTYPE to “01” when the prediction mode is the bidirectional predictive coding mode (Interpolate MC + Q). In this case, DQUANT, MVDf, and MVDb are transmitted. Here, DQUANT indicates the quantization step, and MVDf and MVDb indicate the motion vectors used for forward prediction and backward prediction, respectively. Note that DQUANT may carry the difference between the current quantization step and the previous quantization step instead of the quantization step itself.

【０２６２】また、予測モードが後方予測符号化モード
（Backward MC + Q）であるときには、ＭＢＴＹＰＥは
「００１」に可変長符号化され、ＤＱＵＡＮＴ，ＭＶＤ
bが伝送される。When the prediction mode is the backward predictive coding mode (Backward MC + Q), MBTYPE is variable-length encoded to “001”, and DQUANT and MVDb are transmitted.

【０２６３】さらに、予測モードが前方予測符号化モー
ド（Forward MC + Q）であるときには、ＭＢＴＹＰＥは
「０００１」に可変長符号化され、ＤＱＵＡＮＴ，ＭＶ
Ｄfが伝送される。Further, when the prediction mode is the forward predictive coding mode (Forward MC + Q), MBTYPE is variable-length encoded to “0001”, and DQUANT and MVDf are transmitted.

【０２６４】また、予測モードがＨ．２６３に規定され
ているダイレクトモード（Direct codingモード）であ
るときには、ＭＢＴＹＰＥは「１」とされ、ＭＶＤＢが
伝送される。When the prediction mode is the direct mode (Direct coding mode) defined in H.263, MBTYPE is set to “1”, and MVDB is transmitted.

【０２６５】ここで、上述の場合においては、インター
符号化モードとして、前方予測符号化モード、後方予測
符号化モード、および両方光予測モードの３種類につい
てしか説明しなかったが、ＭＰＥＧ４では、この３種類
に、ダイレクトモードを加えた４種類が規定されてお
り、従って、図１１および図１２の動きベクトル検出器
３２では、Ｂピクチャについては、例えば、イントラ符
号化モード、前方予測符号化モード、後方予測符号化モ
ード、両方向予測モード、またはダイレクトモードのう
ちの、予測誤差を最も少なくするものが予測モードとし
て設定されるようになされている。なお、ダイレクトモ
ードについての詳細は後述する。Here, the above description covered only three inter coding modes, namely the forward predictive coding mode, the backward predictive coding mode, and the bidirectional predictive mode. MPEG4, however, defines four modes, which add the direct mode to these three. Accordingly, for a B picture, the motion vector detector 32 in FIGS. 11 and 12 sets, as the prediction mode, whichever of, for example, the intra coding mode, the forward predictive coding mode, the backward predictive coding mode, the bidirectional predictive mode, and the direct mode minimizes the prediction error. The direct mode is described in detail later.

【０２６６】一方、図１１および図１２のＶＬＣ器３６
において、ＭＢＴＹＰＥテーブルＢ（図３２（Ｂ））が
用いられる場合、ＭＢＴＹＰＥは、ＭＰＴＹＰＥテーブ
ルＡが用いられる場合に「０１」、「００１」、または
「０００１」とされるときに、それぞれ「１」、「０
１」、または「００１」とされる。従って、ＭＢＴＹＰ
ＥテーブルＢが用いられる場合は、予測モードとしてダ
イレクトモードは設定されない。On the other hand, when MBTYPE table B (FIG. 32(B)) is used in the VLC unit 36 in FIGS. 11 and 12, MBTYPE is set to “1”, “01”, or “001” in the cases where it would be set to “01”, “001”, or “0001”, respectively, when MBTYPE table A is used. Therefore, when MBTYPE table B is used, the direct mode is not set as the prediction mode.
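The MBTYPE codes and the fields that follow each of them can be summarized in a sketch. The mode labels and function name are hypothetical; the code strings and the transmitted fields are those quoted in the description, not a transcription of FIG. 32.

```python
# MBTYPE codes as quoted in the text.
MBTYPE_TABLE_A = {
    "direct": "1",
    "interpolate": "01",
    "backward": "001",
    "forward": "0001",
}
MBTYPE_TABLE_B = {"interpolate": "1", "backward": "01", "forward": "001"}

# Fields the text says are transmitted after MBTYPE for each mode.
FIELDS_AFTER_MBTYPE = {
    "direct": ("MVDB",),
    "interpolate": ("DQUANT", "MVDf", "MVDb"),
    "backward": ("DQUANT", "MVDb"),
    "forward": ("DQUANT", "MVDf"),
}


def encode_mbtype(mode, table):
    """Return (variable-length code, fields that follow) for a mode."""
    if mode not in table:
        raise ValueError(f"mode {mode!r} is not defined in this table")
    return table[mode], FIELDS_AFTER_MBTYPE[mode]
```

Every mode that table B can express costs one bit less than in table A, at the price of not being able to signal the direct mode.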

【０２６７】次に、図３３を参照して、ダイレクトモー
ドについて説明する。Next, the direct mode will be described with reference to FIG.

【０２６８】例えば、いま、ＶＯＰ０，ＶＯＰ１，ＶＯ
Ｐ２，ＶＯＰ３の順で表示される４つのＶＯＰが存在
し、ＶＯＰ０およびＶＯＰ３がＰピクチャ（Ｐ−ＶＯ
Ｐ）で、ＶＯＰ１およびＶＯＰ２がＢピクチャ（Ｂ−Ｖ
ＯＰ）であるとする。また、ＶＯＰ０，ＶＯＰ１，ＶＯ
Ｐ２およびＶＯＰ３は、ＶＯＰ０，ＶＯＰ３，ＶＯＰ
１，ＶＯＰ２の順で符号化／復号化されるものとする。For example, suppose that there are four VOPs displayed in the order VOP0, VOP1, VOP2, VOP3, where VOP0 and VOP3 are P pictures (P-VOPs) and VOP1 and VOP2 are B pictures (B-VOPs). Suppose also that VOP0, VOP1, VOP2, and VOP3 are encoded/decoded in the order VOP0, VOP3, VOP1, VOP2.

【０２６９】以上のような条件下において、例えば、Ｖ
ＯＰ１のダイレクトモードでの予測符号化は、次のよう
に行われる。Under these conditions, predictive coding of, for example, VOP1 in the direct mode is performed as follows.

【０２７０】即ち、ＶＯＰ１の直前に符号化（復号化）
されるＰピクチャ、即ち、図３３の実施の形態ではＶＯ
Ｐ３において、これから符号化しようとするＶＯＰ１の
マクロブロック（符号化対象マクロブロック）と同一位
置にあるマクロブロック（対応マクロブロック）につい
ての動きベクトルをＭＶとするとき、ダイレクトモード
においては、この動きベクトルＭＶおよび所定のベクト
ルＭＶＤＢから、符号化対象マクロブロックを前方予測
符号化するための動きベクトルＭＶＦと、後方予測符号
化するための動きベクトルＭＶＢが、次式にしたがって
計算される。That is, let MV be the motion vector of the macroblock (corresponding macroblock) that, in the P picture encoded (decoded) immediately before VOP1, namely VOP3 in the embodiment of FIG. 33, is located at the same position as the macroblock of VOP1 about to be encoded (encoding target macroblock). In the direct mode, the motion vector MVF for forward predictive coding of the encoding target macroblock and the motion vector MVB for backward predictive coding are calculated from this motion vector MV and a predetermined vector MVDB according to the following equations.

【０２７１】
ＭＶＦ＝（ＴＲＢ×ＭＶ）／ＴＲＤ＋ＭＶＤＢ
ＭＶＢ＝（ＴＲＢ−ＴＲＤ）×ＭＶ／ＴＲＤ
但し、動きベクトルＭＶＢが上式により計算されるのは、ベクトルＭＶＤＢが０の場合で、このベクトルＭＶＤＢが０でない場合には、動きベクトルＭＶＢは次式にしたがって計算される。
ＭＶＢ＝ＭＶＦ−ＭＶ
なお、ＴＲＢは、ＶＯＰ１から、その直前に表示されるＩまたはＰピクチャ（図３３の実施の形態ではＶＯＰ０）までの距離を表し、ＴＲＤは、表示順で、ＶＯＰ１の直前と直後にあるＩまたはＰピクチャ（図３３の実施の形態ではＶＯＰ０とＶＯＰ３）の間隔を表す。

MVF = (TRB × MV) / TRD + MVDB
MVB = (TRB − TRD) × MV / TRD
However, the motion vector MVB is calculated by the above equation only when the vector MVDB is 0; when MVDB is not 0, the motion vector MVB is calculated according to the following equation.
MVB = MVF − MV
Here, TRB represents the distance from VOP1 to the I or P picture displayed immediately before it (VOP0 in the embodiment of FIG. 33), and TRD represents the interval between the I or P pictures immediately before and immediately after VOP1 in display order (VOP0 and VOP3 in the embodiment of FIG. 33).

【０２７２】図１１および図１２の動きベクトル検出器
３２は、ＢピクチャのＶＯＰについては、ベクトルＭＶ
ＤＢを種々の値に変化させ（但し、ベクトルＭＶＤＢ
は、動きベクトルＭＶと方向が同一のベクトル）、例え
ば、上式にしたがって得られる動きベクトルＭＶＦおよ
びＭＶＢを用いて予測符号化を行うことにより生じる予
測誤差が、イントラ符号化モード、前方予測符号化モー
ド、後方予測符号化モード、および両方向予測モードの
うちのいずれのものよりも小さいとき、予測モードとし
てダイレクトモードを設定する。For a VOP of a B picture, the motion vector detector 32 in FIGS. 11 and 12 varies the vector MVDB over various values (where MVDB is a vector in the same direction as the motion vector MV) and sets the direct mode as the prediction mode when, for example, the prediction error produced by predictive coding using the motion vectors MVF and MVB obtained from the above equations is smaller than that of any of the intra coding mode, the forward predictive coding mode, the backward predictive coding mode, and the bidirectional predictive mode.

【０２７３】なお、図３３の実施の形態においては、Ｔ
ＲＢ＝１，ＴＲＤ＝３とされており、従って、動きベク
トルＭＶＦは、ＭＶ／３＋ＭＶＤＢで与えられる。ま
た、動きベクトルＭＶＢは、ＭＶＤＢが０のときは２Ｍ
Ｖ／３で、ＭＶＤＢが０でないときは−２ＭＶ／３＋Ｍ
ＶＤＢで与えられる。In the embodiment of FIG. 33, TRB = 1 and TRD = 3, so the motion vector MVF is given by MV/3 + MVDB. The motion vector MVB is given by −2MV/3 when MVDB is 0 (from (TRB − TRD) × MV / TRD with TRB = 1 and TRD = 3), and by −2MV/3 + MVDB when MVDB is not 0.
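The direct-mode equations, MVF = (TRB × MV) / TRD + MVDB, and MVB = (TRB − TRD) × MV / TRD when MVDB is 0 or MVB = MVF − MV otherwise, can be checked numerically with a small sketch. The function name, the (x, y) tuple representation, and the use of exact fractions instead of any codec-specific rounding are assumptions of this sketch.

```python
from fractions import Fraction


def direct_mode_vectors(mv, mvdb, trb, trd):
    """Compute (MVF, MVB) for the direct mode.

    mv:   motion vector of the corresponding macroblock, as (x, y)
    mvdb: the vector MVDB, as (x, y)
    trb:  distance from the B picture to the preceding I or P picture
    trd:  interval between the surrounding I or P pictures
    """
    mvf = tuple(Fraction(trb * c, trd) + d for c, d in zip(mv, mvdb))
    if all(d == 0 for d in mvdb):
        # MVDB is 0: MVB = (TRB - TRD) x MV / TRD
        mvb = tuple(Fraction((trb - trd) * c, trd) for c in mv)
    else:
        # MVDB is not 0: MVB = MVF - MV
        mvb = tuple(f - c for f, c in zip(mvf, mv))
    return mvf, mvb
```

With TRB = 1 and TRD = 3 as in FIG. 33, a corresponding-macroblock vector MV = (3, 6) and MVDB = (0, 0) give MVF = (1, 2) and MVB = (−2, −4), that is, MV/3 and −2MV/3.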

【０２７４】ところで、予測モードがダイレクトモード
とされた場合においては、符号化対象マクロブロックの
符号化／復号化に、最も最近に符号化／復号化されるＰ
ピクチャ（図３３の実施の形態ではＶＯＰ３）における
対応マクロブロックの動きベクトルＭＶが必要となる。When the prediction mode is the direct mode, encoding/decoding the encoding target macroblock requires the motion vector MV of the corresponding macroblock in the most recently encoded/decoded P picture (VOP3 in the embodiment of FIG. 33).

【０２７５】しかしながら、ＶＯＰは、その大きさおよ
び位置が変化する場合があり（上述したように、video_
object_layer_shapeが「１０」または「０１」の場
合）、この場合、対応マクロブロックが存在するとは限
らない。従って、大きさや位置が変化するＶＯＰを対象
とした符号化／復号化を行う場合においては、無条件に
ダイレクトモードを使用したのでは、処理をすることが
できなくなる状態が生じることになる。However, the size and position of a VOP may change (as described above, when video_object_layer_shape is “10” or “01”), and in that case a corresponding macroblock does not necessarily exist. Therefore, when encoding/decoding VOPs whose size or position changes, using the direct mode unconditionally leads to situations in which processing cannot be carried out.

【０２７６】そこで、本実施の形態では、ダイレクトモ
ードは、符号化対象マクロブロックを有するＶＯＰ（Ｂ
ピクチャのＶＯＰ）が、最も最近に復号されるＰピクチ
ャのＶＯＰと、その大きさが同一である場合のみ使用可
能とする。具体的には、上述したVOP_widthおよびVOP_h
eightで表されるＶＯＰの大きさが変化しない場合の
み、ダイレクトモードの使用を許可するようにする。Therefore, in the present embodiment, the direct mode can be used only when the VOP containing the encoding target macroblock (the VOP of a B picture) has the same size as the VOP of the most recently decoded P picture. Specifically, use of the direct mode is permitted only when the size of the VOP represented by VOP_width and VOP_height described above does not change.

【０２７７】従って、ダイレクトモードに対応するＭＢ
ＴＹＰＥの可変長符号が定義されているＭＢＴＹＰＥテ
ーブルＡは、基本的には、符号化対象マクロブロックを
有するＢピクチャのＶＯＰの大きさと、最も最近に復号
されるＰピクチャのＶＯＰの大きさとが同一である場合
にのみ用いられる。Accordingly, MBTYPE table A, which defines the variable-length code of MBTYPE corresponding to the direct mode, is basically used only when the size of the VOP of the B picture containing the encoding target macroblock is the same as the size of the VOP of the most recently decoded P picture.

【０２７８】なお、ＭＯＤＢテーブルＡ（図３０
（Ａ））は、ＭＰＥＧ４において規定されており、ま
た、このＭＯＤＢテーブルＡを用いる場合においては、
ＭＯＤＢが「０」のときであって、図１５で説明したre
f_select_codeが「００」でないときは、予測モードを
ダイレクトモードとすることが規定されている。従っ
て、ＭＯＤＢテーブルＡも、基本的には、符号化対象マ
クロブロックを有するＢピクチャのＶＯＰの大きさと、
最も最近に復号されるＰピクチャのＶＯＰの大きさとが
同一である場合にのみ用いられる。MODB table A (FIG. 30(A)) is defined in MPEG4, and when this MODB table A is used, it is specified that the prediction mode is the direct mode when MODB is “0” and ref_select_code described with reference to FIG. 15 is not “00”. Accordingly, MODB table A is also basically used only when the size of the VOP of the B picture containing the encoding target macroblock is the same as the size of the VOP of the most recently decoded P picture.

【０２７９】以上から、ＭＯＤＢテーブルＡおよびＭＢ
ＴＹＰＥテーブルＡが用いられる場合、ＭＯＤＢが
「０」か、または、ＭＢＴＹＰＥが「１」のとき、予測
モードはダイレクトモードとなる。From the above, when MODB table A and MBTYPE table A are used, the prediction mode is the direct mode when MODB is “0” or MBTYPE is “1”.

【０２８０】なお、video_object_layer_shapeが「０
０」の場合は、ＶＯＰの大きさは変化しないため、この
場合も、ＭＯＤＢテーブルＡおよびＭＢＴＹＰＥテーブ
ルＡが用いられることになる。Note that when video_object_layer_shape is “00”, the size of the VOP does not change, so MODB table A and MBTYPE table A are used in this case as well.

【０２８１】一方、符号化対象マクロブロックを有する
ＢピクチャのＶＯＰの大きさと、最も最近に復号される
ＰピクチャのＶＯＰの大きさとが異なる場合、ダイレク
トモードを使用することができないので、ＭＢＴＹＰＥ
は、ＭＢＴＹＰＥテーブルＢ（図３２（Ｂ））を用いて
可変長符号化／可変長復号化される。On the other hand, when the size of the VOP of the B picture containing the encoding target macroblock differs from the size of the VOP of the most recently decoded P picture, the direct mode cannot be used, so MBTYPE is variable-length encoded/decoded using MBTYPE table B (FIG. 32(B)).

【０２８２】また、符号化対象マクロブロックを有する
ＢピクチャのＶＯＰの大きさと、最も最近に復号される
ＰピクチャのＶＯＰの大きさとが異なる場合、少なくと
もＭＰＴＹＰＥは伝送する必要があるので、即ち、ＭＢ
ＴＹＰＥおよびＣＢＰＢの両方を伝送せずに済むことは
ないので、ＭＯＤＢは、ＭＢＴＹＰＥおよびＣＢＰＢの
両方を転送しない場合が定義されているＭＯＤＢテーブ
ルＡ（図３０（Ａ））ではなく、そのような場合が定義
されていないＭＯＤＢテーブルＢ（図３０（Ｂ））を用
いて可変長符号化／可変長復号化される。Also, when the size of the VOP of the B picture containing the encoding target macroblock differs from the size of the VOP of the most recently decoded P picture, at least MBTYPE must be transmitted; that is, it is never the case that neither MBTYPE nor CBPB is transmitted. MODB is therefore variable-length encoded/decoded using not MODB table A (FIG. 30(A)), which defines the case in which neither MBTYPE nor CBPB is transmitted, but MODB table B (FIG. 30(B)), which does not define such a case.

【０２８３】以上のように、ＶＯＰの大きさの変化に対
応して、用いる可変長テーブルを選択（変更）すること
で、その符号化の結果得られるデータのデータ量を低減
することが可能となる。As described above, selecting (changing) the variable-length table to be used according to changes in the size of the VOP makes it possible to reduce the amount of data obtained as a result of the encoding.
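The size-dependent choice between the A and B tables can be sketched as a single predicate. The function name and the return labels are hypothetical; the rule (tables A when video_object_layer_shape is “00” or the B-picture VOP has the same size as the most recently decoded P-picture VOP, tables B otherwise) is the one described in the preceding paragraphs.

```python
def select_tables_by_size(video_object_layer_shape, b_vop_size, last_p_vop_size):
    """Return the (MODB table, MBTYPE table) pair to use for a B-picture VOP.

    Sizes are (width, height) tuples.
    """
    size_is_fixed = video_object_layer_shape == "00"
    if size_is_fixed or b_vop_size == last_p_vop_size:
        # Direct mode and skip macroblocks are usable.
        return ("MODB table A", "MBTYPE table A")
    # Size differs: no corresponding macroblock is guaranteed to exist.
    return ("MODB table B", "MBTYPE table B")
```

Both encoder and decoder can evaluate this rule from information they already share, so no extra flag is needed to signal which table is in use.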

【０２８４】即ち、ＭＯＤＢテーブルＡ（図３０
（Ａ））だけを用いた場合、ＭＯＤＢが１ビットの可変
長符号にされる場合が１通りと、２ビットの可変長符号
にされる場合が２通りだけ存在する。一方、ＭＯＤＢテ
ーブルＢ（図３０（Ｂ））を用いた場合、ＭＯＤＢが１
ビットの可変長符号にされる場合が１通りと、２ビット
の可変長符号にされる場合が１通りだけ存在する。従っ
て、ＭＯＤＢテーブルＡおよびＢの両方を用いる場合、
ＭＯＤＢテーブルＡだけを用いる場合に比較して、ＭＯ
ＤＢが２ビットの可変長符号にされる頻度が減少し、そ
の結果、データ量を低減することができる。That is, when only MODB table A (FIG. 30(A)) is used, there is one case in which MODB becomes a 1-bit variable-length code and two cases in which it becomes a 2-bit variable-length code. When MODB table B (FIG. 30(B)) is used, on the other hand, there is one case in which MODB becomes a 1-bit variable-length code and only one case in which it becomes a 2-bit variable-length code. Therefore, when both MODB tables A and B are used, MODB becomes a 2-bit variable-length code less often than when only MODB table A is used, and the amount of data can be reduced as a result.

【０２８５】同様に、ＭＢＴＹＰＥテーブルＡ（図３２
（Ａ））によれば、ＭＢＴＹＰＥが最長で４ビットの可
変長符号にされる場合があるが、ＭＢＴＹＰＥテーブル
Ｂ（図３２（Ｂ））では、ＭＢＴＹＰＥは、最長でも３
ビットの可変長符号にしかされない。従って、やはり、
データ量を低減することができる。Similarly, with MBTYPE table A (FIG. 32(A)), MBTYPE may become a variable-length code of up to 4 bits, whereas with MBTYPE table B (FIG. 32(B)), MBTYPE becomes a variable-length code of at most 3 bits. The amount of data can therefore be reduced in this case as well.

【０２８６】ところで、以上のように、複数のＭＯＤＢ
テーブルやＭＢＴＹＰＥテーブルを用いる場合、下位レ
イヤと、ｒｅｆ＿ｓｅｌｅｃｔ＿ｃｏｄｅが「００」以
外となっている上位レイヤとについては問題ないが、ｒ
ｅｆ＿ｓｅｌｅｃｔ＿ｃｏｄｅが「００」となっている
上位レイヤについては、次のような問題が生じる。When a plurality of MODB tables and MBTYPE tables are used as described above, no problem arises for the lower layer or for an upper layer whose ref_select_code is other than “00”, but the following problem arises for an upper layer whose ref_select_code is “00”.

【０２８７】即ち、上位レイヤにおいて、Ｂピクチャの
処理対象マクロブロックについてのフラグｒｅｆ＿ｓｅ
ｌｅｃｔ＿ｃｏｄｅが「００」である場合というのは、
図３４に示すように、その処理対象マクロブロックを、
同一レイヤ（ここでは上位レイヤ）におけるＩまたはＰ
ピクチャと、そのレイヤと異なるレイヤ（ここでは下位
レイヤ）の同一時刻における画像（拡大画像）とが、必
要に応じて、参照画像として用いられる場合である（図
１５）。That is, in the upper layer, the flag ref_select_code of the processing target macroblock of a B picture being “00” means, as shown in FIG. 34, that an I or P picture in the same layer (here, the upper layer) and an image (enlarged image) at the same time instant in a different layer (here, the lower layer) are used, as necessary, as reference images for that macroblock (FIG. 15).

【０２８８】一方、ダイレクトモードは、時刻の異なる
２つのＩまたはＰピクチャの間にあるＢピクチャを、そ
の直前に復号されるＰピクチャの動きベクトルを用いて
予測符号化するものである。On the other hand, in the direct mode, a B picture located between two I or P pictures at different times is predictively coded using a motion vector of a P picture decoded immediately before.

[0289] Accordingly, although the direct mode cannot be applied when ref_select_code is "00", the direct mode may nevertheless be set as the prediction mode when MBTYPE table A is used.

[0290] In this embodiment, therefore, when the flag ref_select_code for the B-picture macroblock being processed in the upper layer is "00", MBTYPE is variable-length coded and decoded by one of the following two methods, referred to as the first method and the second method.

[0291] In the first method, when the flag ref_select_code for the B-picture macroblock being processed in the upper layer is "00", MBTYPE table B is used instead of MBTYPE table A. Since the direct mode is not defined in MBTYPE table B, as described above, the direct mode is never set as the prediction mode in the case shown in FIG. 34.

[0292] In the second method, the following quasi-direct mode is defined as a prediction mode analogous to the direct mode. When MBTYPE table A is used while the flag ref_select_code for the B-picture macroblock being processed in the upper layer is "00", the quasi-direct mode, rather than the direct mode, is assigned to the MBTYPE variable-length code "1".

[0293] In the quasi-direct mode, in the case shown in FIG. 34, forward prediction uses as its reference image (prediction reference image) the enlarged image obtained by enlarging the lower-layer (different-layer) image by the magnification FR, and backward prediction uses as its reference image the image of the upper layer (same layer) encoded (decoded) immediately before.

[0294] Further, as shown in FIG. 35, let MV be the motion vector of the corresponding macroblock (the macroblock at the same position as the macroblock being encoded) in the enlarged image used as the reference image for forward prediction. The vector given by the following equation is then used as the motion vector MVB for backward prediction.

[0295] MVB = MV × FR + MVDB

[0296] That is, the motion vector MV of the corresponding macroblock in the lower layer is multiplied by FR, the vector MVDB is added to the result, and the sum is used as the motion vector MVB for backward prediction.
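As a rough illustration of the relation MVB = MV × FR + MVDB, the following Python sketch computes the backward motion vector component by component. The function name, the tuple representation of vectors, and the use of a single scalar FR for both components are assumptions made for this example; rounding to the codec's motion-vector precision is not modeled.

```python
def backward_motion_vector(mv, fr, mvdb):
    """Return MVB = MV * FR + MVDB, applied to each vector component.

    mv   -- (x, y) motion vector MV of the corresponding lower-layer macroblock
    fr   -- magnification FR (assumed here to be one scalar for both axes)
    mvdb -- (x, y) vector MVDB added after scaling
    """
    return tuple(m * fr + d for m, d in zip(mv, mvdb))


# Example with hypothetical values: MV = (3, -2), FR = 2, MVDB = (1, 0)
mvb = backward_motion_vector((3, -2), 2, (1, 0))
print(mvb)  # (7, -4)
```

Because the decoder can evaluate the same expression from MV, FR, and MVDB, MVB itself never needs to appear in the bit stream.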

[0297] In this case, the motion vector MVB is not transmitted, because it can be obtained from the motion vector MV, the magnification FR, and MVDB. Accordingly, on the receiving (decoder) side, when variable-length decoding is performed using MBTYPE table A while the flag ref_select_code for the B-picture macroblock being processed in the upper layer is "00", the motion vector MVB of a macroblock whose MBTYPE is "1" is obtained from the motion vector MV of the corresponding lower-layer macroblock, the magnification FR, and the vector MVDB.

[0298] Since the motion vector MVB, which would be redundant data, is thus not transmitted, coding efficiency can be improved.

[0299] Next, with reference to the flowcharts of FIGS. 36 and 37, a description is given of how the variable-length tables used in the VLC unit 36 of FIGS. 11 and 12 and the IVLC unit 102 of FIGS. 19 and 20 are determined, that is, how it is decided which of MODB tables A and B and which of MBTYPE tables A and B to use.

[0300] FIG. 36 shows the method of determining the variable-length tables used for the lower layer.

[0301] In this case, first, in step S31, it is determined whether the size of the VOP changes, for example by referring to video_object_layer_shape, VOP_width, and VOP_height described with reference to FIG. 25, or load_VOP_size described with reference to FIG. 26. If it is determined in step S31 that the size of the VOP does not change, the process proceeds to step S32, where it is decided to use MODB table A and MBTYPE table A, and the process ends. If, on the other hand, it is determined in step S31 that the size of the VOP changes, the process proceeds to step S33, where it is decided to use MODB table B and MBTYPE table B, and the process ends.
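The lower-layer decision of FIG. 36 can be sketched as a single branch. This is only an illustrative sketch: the function name and the boolean argument `vop_size_changes` are hypothetical, and how that flag would be derived from video_object_layer_shape, VOP_width, VOP_height, or load_VOP_size is not modeled here.

```python
def select_tables_lower_layer(vop_size_changes):
    """Return (MODB table, MBTYPE table) for a lower-layer VOP.

    vop_size_changes -- True if the VOP size changes (outcome of step S31)
    """
    if not vop_size_changes:
        return ("A", "A")  # step S32: MODB table A and MBTYPE table A
    return ("B", "B")      # step S33: MODB table B and MBTYPE table B


print(select_tables_lower_layer(False))  # ('A', 'A')
print(select_tables_lower_layer(True))   # ('B', 'B')
```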

[0302] Next, FIG. 37 shows the method of determining the variable-length tables used for the upper layer.

[0303] In this case, first, in step S41, it is determined whether ref_select_code is "00". If it is determined in step S41 that ref_select_code is "00", that is, if the lower-layer VOP at the same time is used as a reference image for the upper-layer VOP about to be processed, the process proceeds to step S42, where it is decided to use MODB table A and MBTYPE table B, and the process ends.

[0304] In this case, when the quasi-direct mode is used, the decision is made to use MBTYPE table A rather than MBTYPE table B. That is, in step S42, MBTYPE table B is selected when the first method is applied, and MBTYPE table A is selected when the second method is applied.

[0305] If, on the other hand, it is determined in step S41 that ref_select_code is not "00", the process proceeds to step S43. In steps S43 through S45, the same processing as in steps S31 through S33 of FIG. 36 is performed to determine the MODB table and MBTYPE table to be used.
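The upper-layer decision of FIG. 37, including the choice between the first and second methods in step S42, might be sketched as follows. The function name, the string representation of ref_select_code, and the flag `use_quasi_direct` (True for the second method) are assumptions introduced for this example only.

```python
def select_tables_upper_layer(ref_select_code, vop_size_changes,
                              use_quasi_direct=False):
    """Return (MODB table, MBTYPE table) for an upper-layer VOP.

    ref_select_code  -- two-character string such as "00" or "01"
    vop_size_changes -- True if the VOP size changes
    use_quasi_direct -- True for the second method (quasi-direct mode),
                        False for the first method
    """
    if ref_select_code == "00":                     # step S41
        mbtype_table = "A" if use_quasi_direct else "B"
        return ("A", mbtype_table)                  # step S42
    # Steps S43 to S45 repeat the lower-layer decision (steps S31 to S33).
    return ("B", "B") if vop_size_changes else ("A", "A")


print(select_tables_upper_layer("00", False))                         # ('A', 'B')
print(select_tables_upper_layer("00", False, use_quasi_direct=True))  # ('A', 'A')
print(select_tables_upper_layer("01", True))                          # ('B', 'B')
```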

[0306] Next, with reference to FIGS. 38 through 40, a description is given of the processing of skip macroblocks in the lower layer encoding unit 25 of FIG. 11 and the upper layer encoding unit 23 of FIG. 12, and in the lower layer decoding unit 95 of FIG. 19 and the upper layer decoding unit 93 of FIG. 20.

[0307] Here, as described above, it is assumed that a macroblock of an I picture basically never becomes a skip macroblock, so the description covers P and B pictures.

[0308] Also, as described above, no skip macroblock occurs when MODB table B is used, so skip macroblock processing is performed only when MODB table A is used.

[0309] First, FIG. 38 is a flowchart illustrating the processing of skip macroblocks in the lower layer encoding unit 25 of FIG. 11 and the lower layer decoding unit 95 of FIG. 19.

[0310] First, in step S1, it is determined whether the macroblock being processed belongs to a P picture or a B picture. If it is determined in step S1 that the macroblock belongs to a P picture, the process proceeds to step S2, where it is determined whether the COD of that macroblock is 1. If it is determined in step S2 that the COD is 1, the process proceeds to step S3, where the macroblock is determined to be a skip macroblock and is handled as such. That is, in this case all the quantized coefficients (DCT coefficients) of the macroblock are taken to be 0, and its motion vector is also taken to be 0.

[0311] If it is determined in step S2 that the COD of the macroblock being processed is not 1, the process proceeds to step S4, where the macroblock is processed normally. That is, in this case the P-picture macroblock is treated as having a nonzero DCT coefficient or a nonzero motion vector.

[0312] If, on the other hand, it is determined in step S1 that the macroblock being processed belongs to a B picture, the process proceeds to step S5, where it is determined whether the COD is 1 for the macroblock at the same position (the corresponding macroblock) in the I or P picture decoded immediately before that B-picture macroblock is decoded. If it is determined in step S5 that the COD of the corresponding macroblock is 1, the process proceeds to step S6, where the macroblock being processed is determined to be a skip macroblock and is handled as such.

[0313] Suppose, for example, that the images (VOPs) to be processed are displayed in the sequence I/P, B, I/P (where I/P denotes an I or P picture), as shown in FIG. 40(A), and that they are encoded and decoded in the order leftmost I/P, rightmost I/P, and then the second picture from the left, B. Suppose further that a macroblock of the second picture from the left, the B picture, is being processed.

[0314] In this case, the rightmost I/P picture is encoded and decoded using the leftmost I/P picture as a reference image. Accordingly, if the COD of the corresponding macroblock in the rightmost I/P picture is 1 for the B-picture macroblock being processed, that is, if the corresponding macroblock is a skip macroblock, the image did not change between the leftmost I/P picture and the rightmost I/P picture. For this reason, as described above, when the macroblock being processed belongs to a B picture and the COD of its corresponding macroblock is 1, the macroblock being processed is treated as a skip macroblock.

[0315] In this case, the processing (predictive coding/decoding) of the B-picture macroblock is performed in the same way as for the corresponding macroblock of the rightmost I/P picture, so its motion vector is treated as 0 and all its DCT coefficients are treated as 0. (On the encoder side, as described above, only MODB is transmitted; the subsequent CBPB, MBTYPE, and so on are not transmitted.)

[0316] Returning to FIG. 38, if it is determined in step S5 that the COD of the corresponding macroblock is not 1, the process proceeds to step S7, where it is determined whether the MODB of the B-picture macroblock being processed is 0. If it is determined in step S7 that the MODB is 0, the process proceeds to step S8, where the macroblock being processed is determined to be a skip macroblock and is handled as such.

[0317] Suppose, for example, that the images (VOPs) to be processed are displayed and encoded/decoded in the same order as in FIG. 40(A), as shown in FIG. 40(B), and that, again as in FIG. 40(A), a macroblock of the second picture from the left, the B picture, is being processed.

[0318] In this case, the COD of the corresponding macroblock in the rightmost I/P picture is not 1 for the B-picture macroblock being processed; that is, the corresponding macroblock is not a skip macroblock. The image therefore changed between the leftmost I/P picture and the rightmost I/P picture.

[0319] On the other hand, since the MODB of the B-picture macroblock being processed is 0, either this macroblock can be decoded using only the data of macroblocks of other frames that are decoded before it, or the corresponding macroblock in the I or P picture decoded immediately before it is a skip macroblock (its COD is 1). As noted above, however, the COD is not 1. It follows that the B-picture macroblock being processed can be decoded using the data of macroblocks of other frames that are decoded before it (hereinafter referred to, where appropriate, as decoded data).

[0320] Consider, then, the case in which the image changes between the leftmost I/P picture and the rightmost I/P picture and yet the B-picture macroblock being processed can be decoded using only decoded data. As shown in FIG. 40(B), for example, let MV1 be the motion vector used when the corresponding macroblock in the rightmost I/P picture (the portion drawn with a solid line) is processed with the leftmost I/P picture as the reference image. The case in question is then one in which the average of the two predicted images (the portions drawn with dotted lines in FIG. 40(B)), obtained by motion-compensating the leftmost I/P picture with the motion vector MV2 and the rightmost I/P picture with the motion vector MV3, where MV2 and MV3 are MV1 multiplied by, for example, 1/2 and -1/2 respectively, matches the macroblock being processed (that is, no prediction error arises).

[0321] From the above, in step S8 of FIG. 38, the processing (predictive coding/decoding) of the B-picture macroblock being processed uses, as its motion vectors, the motion vectors MV2 (MVF) and MV3 (MVB) obtained from the motion vector MV1 of the corresponding macroblock in the rightmost I/P picture, and uses, as its pixel values (image data), the average of the predicted images described above.
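The derivation of MV2 (MVF) and MV3 (MVB) from MV1 and the averaging of the two predictions can be sketched as below. This is a simplified illustration only: the function names are hypothetical, the factors 1/2 and -1/2 are the example values mentioned in the text (they correspond to a B picture midway between the two I/P pictures), and exact fractions with plain averaging stand in for whatever rounding a real codec applies.

```python
from fractions import Fraction


def direct_mode_vectors(mv1, forward_scale=Fraction(1, 2),
                        backward_scale=Fraction(-1, 2)):
    """Derive MV2 (MVF) and MV3 (MVB) by scaling MV1 component-wise."""
    mvf = tuple(forward_scale * c for c in mv1)
    mvb = tuple(backward_scale * c for c in mv1)
    return mvf, mvb


def average_prediction(forward_block, backward_block):
    """Average two equally sized predicted blocks sample by sample."""
    return [(f + b) / 2 for f, b in zip(forward_block, backward_block)]


# Hypothetical example: MV1 = (4, -6)
mvf, mvb = direct_mode_vectors((4, -6))
print(tuple(int(c) for c in mvf), tuple(int(c) for c in mvb))  # (2, -3) (-2, 3)

# Two tiny "predicted blocks" given as flat lists of sample values
print(average_prediction([10, 20], [30, 40]))  # [20.0, 30.0]
```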

[0322] That is, in this case the prediction mode of the macroblock being processed is, for example, the direct mode described above. In H.263, the direct mode is applied to PB pictures; accordingly, the B picture in this embodiment is a broad concept that includes the B pictures of MPEG1 and MPEG2, the PB pictures of H.263, and the like.

[0323] If, on the other hand, it is determined in step S7 that the MODB of the B-picture macroblock being processed is not 0, the process proceeds to step S9, where normal processing is performed as in step S4.
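The lower-layer flow of FIG. 38 (steps S1 through S9) can be summarized in a short sketch. The function name, the argument names, and the three result labels are hypothetical; MODB is represented here by its code string, so "MODB is 0" becomes a comparison with "0".

```python
def classify_lower_layer_macroblock(picture_type, cod=None,
                                    corresponding_cod=None, modb=None):
    """Classify a lower-layer macroblock following the flow of FIG. 38.

    picture_type      -- "P" or "B"
    cod               -- COD of the macroblock itself (P pictures)
    corresponding_cod -- COD of the co-located macroblock in the I or P
                         picture decoded immediately before (B pictures)
    modb              -- MODB code string of the macroblock (B pictures)
    """
    if picture_type == "P":                                   # step S1
        return "skip_zero" if cod == 1 else "normal"          # S2 -> S3 / S4
    if picture_type == "B":
        if corresponding_cod == 1:                            # step S5
            return "skip_zero"      # S6: zero motion vector, zero coefficients
        if modb == "0":                                       # step S7
            return "skip_direct"    # S8: MV2/MV3 from MV1, averaged prediction
        return "normal"                                       # step S9
    raise ValueError("only P and B pictures are covered by this flow")


print(classify_lower_layer_macroblock("P", cod=1))                         # skip_zero
print(classify_lower_layer_macroblock("B", corresponding_cod=0, modb="0")) # skip_direct
```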

[0324] Next, FIG. 39 is a flowchart illustrating the processing of skip macroblocks in the upper layer encoding unit 23 of FIG. 12 and the upper layer decoding unit 93 of FIG. 20.

[0325] In this case, steps S11 through S14 perform the same processing as steps S1 through S4 of FIG. 38, respectively. That is, P pictures are processed identically in the lower layer and the upper layer.

[0326] If, on the other hand, it is determined in step S11 that the macroblock being processed belongs to a B picture, the process proceeds to step S15, where it is determined whether the flag ref_select_code for that macroblock is "00". If it is determined in step S15 that ref_select_code is not "00", that is, if the B-picture macroblock is not processed using the lower-layer image at the same time as a reference image, the process proceeds to steps S16 through S20, which perform the same processing as steps S5 through S9 of FIG. 38, respectively.

[0327] If it is determined in step S15 that the flag ref_select_code for the B-picture macroblock being processed is "00", that is, if the B-picture macroblock is processed using the lower-layer image at the same time as a reference image, the process proceeds to step S21, where it is determined whether the MODB of the B-picture macroblock being processed is 0.

[0328] If it is determined in step S21 that the MODB of the B-picture macroblock being processed is 0, the process proceeds to step S22, where the macroblock being processed is determined to be a skip macroblock and is handled as such. If it is determined in step S21 that the MODB of the B-picture macroblock being processed is not 0, the process proceeds to step S23, where normal processing is performed as in step S4 of FIG. 38.

[0329] Suppose, for example, that the upper-layer images (VOPs) to be processed are displayed in the sequence I/P, B, B, ..., as shown in FIG. 40(C), that the lower-layer images are displayed in a similar sequence, and that the lower-layer and upper-layer images are encoded and decoded alternately. When ref_select_code for the B pictures of the upper layer is "00", the images are encoded and decoded in this order.

[0330] In this case, suppose that the value of ref_select_code were not checked in step S15, that is, that the same processing as described with reference to FIG. 38 were performed. As shown in FIG. 40(C), the B-picture macroblock of the upper layer being processed is encoded and decoded using as a reference image either the lower-layer image at the same time (the enlarged image) or the immediately preceding decoded image of the upper layer (the leftmost I/P picture), and no frame later than that B picture is referenced. Nevertheless, whether the macroblock being processed is a skip macroblock would be decided by the value of COD (or MODB) for the corresponding macroblock in such a later frame.

[0331] It is undesirable, however, to decide whether a macroblock is a skip macroblock on the basis of a frame that is not referenced when that macroblock is encoded or decoded.

[0332] In the embodiment of FIG. 39, therefore, the decision for a B picture of the upper layer is based on ref_select_code. When it is "00", that is, when the B-picture macroblock is processed using as a reference image either the lower-layer image at the same time (the enlarged image) or the immediately preceding decoded image of the upper layer (the leftmost I/P picture), as shown in FIG. 40(C), whether the macroblock is a skip macroblock is decided according to the MODB of the B-picture macroblock being processed, regardless of the COD or MODB of the corresponding macroblock in any later frame.

[0333] When ref_select_code is "00", the MODB of the B-picture macroblock being processed generally becomes 0 when the reference image is the immediately preceding decoded image of the upper layer (the leftmost I/P picture) rather than the lower-layer image at the same time. The processing (predictive coding/decoding) of that macroblock is therefore performed with the immediately preceding decoded image as the reference image and with the motion vector set to 0.
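The upper-layer flow of FIG. 39 differs from the lower-layer flow only in the B-picture branch, where ref_select_code is examined first. The sketch below is illustrative: the function name, argument names, and result labels are hypothetical, and MODB is again represented by its code string.

```python
def classify_upper_layer_macroblock(picture_type, ref_select_code=None,
                                    cod=None, corresponding_cod=None,
                                    modb=None):
    """Classify an upper-layer macroblock following the flow of FIG. 39."""
    if picture_type == "P":                               # steps S11 to S14
        return "skip_zero" if cod == 1 else "normal"
    if picture_type != "B":
        raise ValueError("only P and B pictures are covered by this flow")
    if ref_select_code != "00":                           # step S15
        # Steps S16 to S20 mirror steps S5 to S9 of the lower-layer flow.
        if corresponding_cod == 1:
            return "skip_zero"
        if modb == "0":
            return "skip_direct"
        return "normal"
    # ref_select_code is "00": only the macroblock's own MODB matters; the
    # COD or MODB of a later, unreferenced frame is deliberately ignored.
    if modb == "0":                                       # step S21
        return "skip_previous_zero_mv"                    # step S22
    return "normal"                                       # step S23


# The co-located COD of a later frame has no effect when ref_select_code is "00".
print(classify_upper_layer_macroblock("B", ref_select_code="00",
                                      corresponding_cod=1, modb="11"))  # normal
```

The label `skip_previous_zero_mv` stands for the handling described above: the immediately preceding decoded image of the upper layer serves as the reference image and the motion vector is taken to be 0.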

[0334] The processing of skip macroblocks has been described above. In this processing, whether the macroblock being processed belongs to the upper layer or the lower layer is determined on the basis of the flag scalability described with reference to FIG. 24.

[0335] The COD of the lower layer is supplied to the motion vector detector 32, the VLC unit 36, and the motion compensator 42 of FIG. 12 for the following reason.

[0336] In the case of the temporal scalability shown in FIG. 14, for example, a lower-layer image is used as a reference image for prediction in the upper layer, as described above. In this case, for example, VOP0 of the lower layer, VOP1 of the upper layer, and VOP2 of the lower layer are temporally consecutive images. Accordingly, if these three VOPs satisfy the condition described with reference to FIG. 40(A), the macroblock of VOP1 of the upper layer is a skip macroblock, and a skip macroblock requires no particular processing. Determining whether the condition described with reference to FIG. 40(A) is satisfied requires the COD of VOP2 of the lower layer. For this reason, the COD of the lower layer is supplied to the motion vector detector 32, the VLC unit 36, and the motion compensator 42 in the upper layer encoding unit 23 of FIG. 12.

[0337] At present, MPEG4 specifies that, except when the prediction mode is direct, DQUANT for the quantization step is to be transmitted even when all the DCT coefficients of a macroblock take a predetermined value such as 0 as a result of quantization (that is, when no DCT coefficients exist). Transmitting DQUANT when the macroblock has no DCT coefficients is, however, redundant.

[0338] For this reason, the VLC unit 36 of FIGS. 11 and 12 and the IVLC unit 102 of FIGS. 19 and 20 handle the quantization step DQUANT as follows.

[0339] First, in step S51, it is determined whether CBPB is 0. If CBPB is determined to be 0, the macroblock has no DCT coefficients, so the process proceeds to step S56, where the quantization step is ignored, and the process ends. (The encoder side does not transmit the quantization step DQUANT, and the decoder side does not, and indeed cannot, extract the quantization step DQUANT from the bit stream.)

[0340] As described with reference to FIG. 30, CBPB is not transmitted in some cases. In such cases, the processing of step S51 is skipped and the processing of step S52 is performed.

[0341] If, on the other hand, it is determined in step S51 that CBPB is not 0, the process proceeds to step S52, where it is determined whether MODB is "0". If it is determined in step S52 that MODB is "0", CBPB is not transmitted, as described with reference to FIG. 30, and the macroblock therefore has no DCT coefficients. The process accordingly proceeds to step S56, where the quantization step is ignored, and the process ends.

[0342] If it is determined in step S52 that MODB is not "0", the process proceeds to step S53, where it is determined which of MODB tables A and B is used for the variable-length coding/decoding of MODB. If it is determined in step S53 that MODB table B is used, step S54 is skipped and the process proceeds to step S55. If it is determined in step S53 that MODB table A is used, the process proceeds to step S54, where it is determined whether MODB is "10".

[0343] If it is determined in step S54 that MODB is "10", that is, if MODB table A is used and MODB is "10", CBPB is again not transmitted, as described with reference to FIG. 30, and the macroblock therefore has no DCT coefficients. The process accordingly proceeds to step S56, where the quantization step is ignored, and the process ends.

[0344] If, on the other hand, it is determined in step S54 that MODB is not "10", the processing for the quantization step is performed (the encoder side transmits the quantization step DQUANT, and the decoder side extracts the quantization step DQUANT from the bit stream), and the process ends.

[0345] As described above, the quantization step is ignored when the macroblock has no DCT coefficients, that is, when CBPB is 0, when MODB is "0" or "10" with MODB table A in use, or when MODB is "0" with MODB table B in use. The redundancy of the data can therefore be reduced.
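The decision of steps S51 through S56 reduces to a predicate that tells whether DQUANT is present for a macroblock. The following sketch is illustrative only: the function name and argument names are hypothetical, MODB is represented by its code string, and CBPB is represented by an integer bit pattern, with `None` standing for the case in which CBPB is not transmitted.

```python
def dquant_is_transmitted(modb, modb_table, cbpb=None):
    """Return True if DQUANT is transmitted (encoder) or extracted (decoder).

    modb       -- MODB code string, e.g. "0", "10", or "11"
    modb_table -- "A" or "B", the MODB table in use
    cbpb       -- CBPB bit pattern as an int, or None if CBPB is not transmitted
    """
    if cbpb is not None and cbpb == 0:        # step S51 (skipped without CBPB)
        return False                          # step S56: DQUANT ignored
    if modb == "0":                           # step S52
        return False                          # step S56
    if modb_table == "A" and modb == "10":    # steps S53 and S54
        return False                          # step S56
    return True                               # step S55: DQUANT processed


print(dquant_is_transmitted("11", "A", cbpb=0b101))  # True
print(dquant_is_transmitted("10", "A"))              # False
print(dquant_is_transmitted("10", "B", cbpb=0b001))  # True
```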

[0346] CBPB is transmitted with the value "0" when MODB is set to "11" using MODB table A or to "10" using MODB table B. Such cases basically do not arise, however, because MODB "10" or "0", respectively, could be used instead. Accordingly, although the embodiment of FIG. 41 determines the value of CBPB in the first step, S51, from the standpoint of processing efficiency it is preferable to perform this determination immediately before the processing of step S55.

[0347] The processing of FIG. 41 is applicable whichever of the first and second methods described above is used.

[0348] As described above, a VO whose position and size change is placed in an absolute coordinate system and processed there. This makes predictive coding/decoding possible for each VO, and also makes it possible to realize scalability for VOs.

[0349] Furthermore, the processing of a skipped macroblock is determined in consideration of the flag ref_select_code, which indicates the reference image used for that skipped macroblock, so efficient processing is possible.

[0350] In addition, when the upper-layer and lower-layer images are identical and the decoded lower-layer image at the same time is used as the reference image for predictive encoding of the upper layer, the motion vector of the upper layer is not transmitted and only that of the lower layer is transmitted. The amount of data can therefore be reduced.

[0351] The processing described in this embodiment as being performed in units of macroblocks may also be performed in units other than macroblocks.

[0352] In this embodiment, two MODB tables, A and B, are prepared and one of them is selected for use, but three or more MODB tables may be prepared. The same applies to the MBTYPE table.

【０３５３】[0353]

[Effects of the Invention] According to the image encoding device of claim 1 and the image encoding method of claim 2, the second image is enlarged or reduced based on the difference in resolution between the first and second images, and the first image is predictively encoded using the result as a reference image. Meanwhile, the positions of the first and second images in a predetermined absolute coordinate system are determined, and first or second position information relating to the position of the first or second image, respectively, is output. In this case, the position of the first image is recognized based on the first position information, the second position information is converted in accordance with the enlargement ratio or reduction ratio applied to the second image, the position corresponding to the conversion result is recognized as the position of the reference image, and predictive encoding is performed. Consequently, scalability can be realized for, for example, an image whose position changes with time.
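The position conversion described above can be illustrated with a short sketch. It assumes that "converting the second position information in accordance with the enlargement or reduction ratio" means multiplying the second image's offset in the absolute coordinate system by that ratio; the function names, the tuple representation of positions, and the use of `Fraction` for the ratio are all hypothetical choices made for the example.

```python
from fractions import Fraction


def reference_image_position(second_position, scale):
    """Scale the second image's absolute-coordinate offset by the
    enlargement or reduction ratio (assumed conversion rule)."""
    x, y = second_position
    return (x * scale, y * scale)


def displacement_from_reference(first_position, second_position, scale):
    """Offset of the first image relative to the scaled reference image,
    both expressed in the same absolute coordinate system."""
    ref_x, ref_y = reference_image_position(second_position, scale)
    return (first_position[0] - ref_x, first_position[1] - ref_y)


# Second (lower-resolution) image at (8, 6), enlarged by a factor of 2;
# first image located at (20, 12) in the absolute coordinate system.
scale = Fraction(2)
print(reference_image_position((8, 6), scale))
print(displacement_from_reference((20, 12), (8, 6), scale))
```

Keeping the ratio as an exact fraction avoids rounding drift when the ratio is not an integer; a real codec would additionally have to define how fractional positions are rounded, which this sketch leaves open.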

[0354] According to the image decoding device of claim 3 and the image decoding method of claim 4, the decoded second image is enlarged or reduced based on the difference in resolution between the first and second images, and the first image is decoded using the result as a reference image. When the encoded data includes first or second position information relating to the position of the first or second image, respectively, in a predetermined absolute coordinate system, the position of the first image is recognized based on the first position information, the second position information is converted in accordance with the enlargement ratio or reduction ratio applied to the second image, the position corresponding to the conversion result is recognized as the position of the reference image, and the first image is decoded. Consequently, scalability can be realized for, for example, an image whose position changes with time.

【０３５５】[0355]

【０３５６】[0356]

【０３５７】[0357]

【０３５８】[0358]

[Brief description of the drawings]

FIG. 1 is a block diagram showing an embodiment of an encoder to which the present invention is applied.

FIG. 2 is a diagram for explaining that the position and size of a VO change with time.

FIG. 3 is a block diagram showing a configuration example of the VOP encoding units 3_1 to 3_N in FIG. 1.

FIG. 4 is a block diagram showing another configuration example of the VOP encoding units 3_1 to 3_N in FIG. 1.

FIG. 5 is a diagram for explaining spatial scalability.

FIG. 6 is a diagram for explaining spatial scalability.

FIG. 7 is a diagram for explaining spatial scalability.

FIG. 8 is a diagram for explaining spatial scalability.

FIG. 9 is a diagram for explaining a method of determining VOP size data and offset data.

FIG. 10 is a diagram for explaining a method of determining VOP size data and offset data.

FIG. 11 is a block diagram showing a configuration example of the lower layer encoding unit 25 in FIG. 4.

FIG. 12 is a block diagram showing a configuration example of the upper layer encoding unit 23 in FIG. 4.

FIG. 13 is a diagram for explaining spatial scalability.

FIG. 14 is a diagram for explaining temporal scalability.

FIG. 15 is a diagram for explaining the reference select code (ref_select_code).

FIG. 16 is a block diagram showing the configuration of an embodiment of a decoder to which the present invention is applied.

FIG. 17 is a block diagram showing a configuration example of the VOP decoding units 72_1 to 72_N in FIG. 16.

FIG. 18 is a block diagram showing another configuration example of the VOP decoding units 72_1 to 72_N in FIG. 16.

FIG. 19 is a block diagram showing a configuration example of the lower layer decoding unit 95 in FIG. 18.

FIG. 20 is a block diagram showing a configuration example of the upper layer decoding unit 93 in FIG. 18.

FIG. 21 is a diagram showing the syntax of a bit stream obtained by scalable encoding.

FIG. 22 is a diagram showing the syntax of a VS.

FIG. 23 is a diagram showing the syntax of a VO.

FIG. 24 is a diagram showing the syntax of a VOL.

FIG. 25 is a diagram showing the syntax of a VOP.

FIG. 26 is a diagram showing the syntax of a VOP.

FIG. 27 is a diagram showing the variable-length codes of diff_size_horizontal and diff_size_vertical.

FIG. 28 is a diagram showing the variable-length codes of diff_VOP_horizontal_ref and diff_VOP_vertical_ref.

FIG. 29 is a diagram showing the syntax of a macroblock.

FIG. 30 is a diagram showing the variable-length codes of MODB.

FIG. 31 is a diagram showing a configuration example of a macroblock.

FIG. 32 is a diagram showing the variable-length codes of MBTYPE.

FIG. 33 is a diagram for explaining predictive encoding in the direct mode.

FIG. 34 is a diagram for explaining predictive encoding of a B picture in the upper layer.

FIG. 35 is a diagram for explaining the quasi-direct mode.

FIG. 36 is a flowchart for explaining a method of determining the variable-length table used for the lower layer.

FIG. 37 is a flowchart for explaining a method of determining the variable-length table used for the upper layer.

FIG. 38 is a flowchart for explaining the processing of a skipped macroblock in the lower layer.

FIG. 39 is a flowchart for explaining the processing of a skipped macroblock in the upper layer.

FIG. 40 is a diagram for explaining the processing of a skipped macroblock.

FIG. 41 is a flowchart for explaining the processing of the quantization step DQUANT.

FIG. 42 is a block diagram showing the configuration of an example of a conventional encoder.

FIG. 43 is a block diagram showing the configuration of an example of a conventional decoder.

FIG. 44 is a block diagram showing the configuration of an example of a conventional encoder that performs scalable encoding.

FIG. 45 is a block diagram showing a configuration example of the lower layer encoding unit 202 in FIG. 44.

FIG. 46 is a block diagram showing a configuration example of the upper layer encoding unit 201 in FIG. 44.

FIG. 47 is a block diagram showing the configuration of an example of a conventional decoder that performs scalable decoding.

FIG. 48 is a block diagram showing a configuration example of the lower layer decoding unit 232 in FIG. 47.

FIG. 49 is a block diagram showing a configuration example of the upper layer decoding unit 231 in FIG. 47.

FIG. 50 is a diagram for explaining a conventional image synthesizing method.

FIG. 51 is a diagram for explaining an encoding method that enables re-editing and re-synthesis of images.

FIG. 52 is a diagram for explaining a decoding method that enables re-editing and re-synthesis of images.

[Explanation of symbols]

1 VO constructing unit; 2_1 to 2_N VOP constructing units; 3_1 to 3_N VOP encoding units; 4 multiplexing unit; 21 image layering unit; 23 upper layer encoding unit; 24 resolution conversion unit; 25 lower layer encoding unit; 26 multiplexing unit; 31 frame memory; 32 motion vector detector; 33 arithmetic unit; 34 DCT unit; 35 quantizer; 36 VLC unit; 38 inverse quantizer; 39 IDCT unit; 40 arithmetic unit; 41 frame memory; 42 motion compensator; 43 key signal encoding unit; 44 key signal decoding unit; 51 key signal encoding unit; 52 key signal decoding unit; 53 frame memory; 71 demultiplexing unit; 72_1 to 72_N VOP decoding units; 73 image reconstruction unit; 91 demultiplexing unit; 93 upper layer decoding unit; 94 resolution conversion unit; 95 lower layer decoding unit; IVLC unit; 103 inverse quantizer; 104 IDCT unit; 105 arithmetic unit; 106 frame memory; 107 motion compensator; 108, 111 key signal decoding units; 112 frame memory

Claims

(57) [Claims]

1. An image encoding device that encodes a first image using a second image having a resolution different from that of the first image, comprising: enlarging/reducing means for enlarging or reducing the second image based on the difference in resolution between the first and second images; first image encoding means for predictively encoding the first image using the output of the enlarging/reducing means as a reference image; second image encoding means for encoding the second image; position determining means for determining the positions of the first and second images in a predetermined absolute coordinate system and outputting first or second position information relating to the position of the first or second image, respectively; and multiplexing means for multiplexing the outputs of the first image encoding means, the second image encoding means, and the position determining means, wherein the first image encoding means recognizes the position of the first image based on the first position information, converts the second position information in accordance with the enlargement ratio or reduction ratio at which the enlarging/reducing means enlarged or reduced the second image, recognizes the position corresponding to the conversion result as the position of the reference image, and performs predictive encoding.

2. An image encoding method for an image encoding device that encodes a first image using a second image having a resolution different from that of the first image, the image encoding device comprising: enlarging/reducing means for enlarging or reducing the second image based on the difference in resolution between the first and second images; first image encoding means for predictively encoding the first image using the output of the enlarging/reducing means as a reference image; second image encoding means for encoding the second image; position determining means for determining the positions of the first and second images in a predetermined absolute coordinate system and outputting first or second position information relating to the position of the first or second image, respectively; and multiplexing means for multiplexing the outputs of the first image encoding means, the second image encoding means, and the position determining means, wherein the first image encoding means is caused to recognize the position of the first image based on the first position information, to convert the second position information in accordance with the enlargement ratio or reduction ratio at which the enlarging/reducing means enlarged or reduced the second image, to recognize the position corresponding to the conversion result as the position of the reference image, and to perform predictive encoding.

3. An image decoding device that decodes encoded data obtained by predictively encoding a first image using a second image having a resolution different from that of the first image, comprising: second image decoding means for decoding the second image; enlarging/reducing means for enlarging or reducing, based on the difference in resolution between the first and second images, the second image decoded by the second image decoding means; and first image decoding means for decoding the first image using the output of the enlarging/reducing means as a reference image, wherein the encoded data includes first or second position information relating to the position of the first or second image, respectively, in a predetermined absolute coordinate system, and the first image decoding means recognizes the position of the first image based on the first position information, converts the second position information in accordance with the enlargement ratio or reduction ratio at which the enlarging/reducing means enlarged or reduced the second image, recognizes the position corresponding to the conversion result as the position of the reference image, and decodes the first image.

4. An image decoding method for an image decoding device that decodes encoded data obtained by predictively encoding a first image using a second image having a resolution different from that of the first image, the image decoding device comprising: second image decoding means for decoding the second image; enlarging/reducing means for enlarging or reducing, based on the difference in resolution between the first and second images, the second image decoded by the second image decoding means; and first image decoding means for decoding the first image using the output of the enlarging/reducing means as a reference image, wherein the encoded data includes first or second position information relating to the position of the first or second image, respectively, in a predetermined absolute coordinate system, and the first image decoding means is caused to recognize the position of the first image based on the first position information, to convert the second position information in accordance with the enlargement ratio or reduction ratio at which the enlarging/reducing means enlarged or reduced the second image, to recognize the position corresponding to the conversion result as the position of the reference image, and to decode the first image.