
JP2008034893A - Multi-viewpoint image decoder

Info

Publication number: JP2008034893A
Application number: JP2006190026A
Authority: JP
Inventors: Hiroya Nakamura; 博哉中村
Original assignee: Victor Company of Japan Ltd
Current assignee: Victor Company of Japan Ltd
Priority date: 2006-03-28
Filing date: 2006-07-11
Publication date: 2008-02-14

Abstract

PROBLEM TO BE SOLVED: In a conventional stereoscopic image decoder, the decoded image signal is output frame by frame or field by field and is then delayed in turn by one frame or one field for synchronization, so a rearrangement buffer is required for each viewpoint and delay may occur.

SOLUTION: A decoded image management section 208 controls a decoded image buffer 207 based on the number V of viewpoints supplied from a decoded image management information calculation section 202, a viewpoint number v specifying the viewpoint of each viewpoint image, and a number d indicating the output order of the decoded images at each viewpoint. A decoded image output section 209 outputs the decoded images stored in the decoded image buffer 207 to a multi-view image display device or the like as viewpoint images M(v), synchronizing the viewpoints with one another based on the viewpoint number v and the number d, both of which the decoded image management information calculation section 202 calculates from the number V of viewpoints and a decoded image output order number o.

COPYRIGHT: (C)2008,JPO&INPIT

Description

The present invention relates to a multi-view image decoding apparatus, and more particularly to a multi-view image decoding apparatus that decodes encoded multi-view image data captured from different viewpoints.

<Video coding system>
Today, moving images that are continuous on the time axis are handled as digital signal information. For efficient broadcasting, transmission, and storage of that information, apparatuses and systems have become widespread that comply with encoding schemes such as MPEG (Moving Picture Experts Group), which compress video by exploiting temporal redundancy through motion-compensated prediction and spatial redundancy through orthogonal transforms such as the discrete cosine transform.

The MPEG-2 Video (ISO/IEC 13818-2) encoding scheme, established in 1995, is defined as a general-purpose video compression encoding scheme. It supports interlaced as well as progressive scanned images, and HDTV (high definition) as well as SDTV (standard definition). It is widely used in applications such as storage media, including DVD (Digital Versatile Disk) optical discs and magnetic tape for digital VTRs of the D-VHS (registered trademark) standard, and digital broadcasting.

In addition, the MPEG-4 Visual (ISO/IEC 14496-2) encoding scheme, which targets higher encoding efficiency for applications such as network transmission and portable terminals, was standardized and established as an international standard in 1998.

Furthermore, in 2003, an encoding scheme called MPEG-4 AVC/H.264 was established as an international standard through the joint work of the Joint Technical Committee (ISO/IEC) of the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC), and the International Telecommunication Union Telecommunication Standardization Sector (ITU-T). The standard is numbered 14496-10 by ISO/IEC and H.264 by ITU-T, and is hereinafter referred to as the AVC/H.264 encoding scheme.
The AVC/H.264 encoding scheme achieves higher encoding efficiency than earlier encoding schemes such as MPEG-2 Video and MPEG-4 Visual.

In P pictures of encoding schemes such as MPEG-2 Video and MPEG-4 Visual, motion-compensated prediction was performed only from the immediately preceding I picture or P picture in display order. In the AVC/H.264 encoding scheme, by contrast, a plurality of pictures can be used as reference pictures, and motion compensation can be performed by selecting the best one for each block. In addition to pictures that precede in display order, already-encoded pictures that follow in display order can also be referenced. B pictures of encoding schemes such as MPEG-2 Video and MPEG-4 Visual referenced one preceding reference picture in display order, one following reference picture, or both at the same time; in the last case the average of the two pictures was used as the predicted picture, and the difference data between the target picture and the predicted picture was encoded.

In the AVC/H.264 encoding scheme, on the other hand, prediction is no longer restricted to one preceding and one following picture in display order; any reference picture can be referenced for prediction regardless of whether it precedes or follows. It is also possible to use a B picture as a reference picture.

Furthermore, in the AVC/H.264 encoding scheme, both reference pictures and non-reference pictures must be stored in memory so that decoded images can be reordered from encoding order into output order (the order desirable for display on the destination display device or the like). A mechanism has therefore been introduced that manages reference pictures and non-reference pictures uniformly in a single memory called the decoded picture buffer. In the encoded sequence (bitstream), information indicating the output order, called the picture order count, is encoded for each image, numbering the images in decoded-image output order. Among the images that have been decoded and stored in the decoded picture buffer, images are output sequentially starting from the one with the smallest picture order count. The picture order count is reset at an IDR picture (a picture that guarantees that subsequent pictures can be decoded correctly without using information from pictures that precede it in encoding order).
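The output rule described above can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not the AVC/H.264 decoded picture buffer process itself: the function name `output_in_poc_order` and the `capacity` parameter are hypothetical, reference-picture marking and the IDR reset are ignored, and a picture is released only when the buffer would otherwise exceed its capacity.

```python
import heapq


def output_in_poc_order(decoded, capacity):
    """Yield (poc, label) pairs in ascending picture order count.

    decoded  -- iterable of (poc, label) pairs in encoding (decoding) order
    capacity -- hypothetical maximum number of pictures held for reordering
    """
    heap = []  # min-heap keyed on picture order count
    for poc, label in decoded:
        heapq.heappush(heap, (poc, label))
        # When the buffer is over capacity, release the smallest POC.
        if len(heap) > capacity:
            yield heapq.heappop(heap)
    # Flush whatever remains once the stream ends.
    while heap:
        yield heapq.heappop(heap)


# Encoding order I0, P3, B1, B2 is reordered into output order 0, 1, 2, 3.
stream = [(0, "I0"), (3, "P3"), (1, "B1"), (2, "B2")]
reordered = list(output_in_poc_order(stream, capacity=2))
```

The sketch only shows why pictures must be held in memory before output: a picture with a small picture order count may arrive after one with a larger count.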

<Multi-view image coding system>
Meanwhile, in binocular stereoscopic television, a left-eye image and a right-eye image are captured from two different directions by two cameras and displayed on the same screen to present a stereoscopic image. In this case, the left-eye image and the right-eye image are transmitted or recorded separately as independent images. However, this requires about twice the amount of information of a single two-dimensional image.

For this reason, a method has been proposed in which one of the left and right images is taken as the main image and the information of the other image (the sub-image) is compressed by a general compression encoding method to reduce the amount of information (see, for example, Patent Document 1). In the stereoscopic television image transmission system described in Patent Document 1, for each small region, the relative position with high correlation in the other image is found, and the positional shift (disparity vector) and the difference signal (prediction residual signal) are transmitted. The difference signal is also transmitted and recorded because, although an image close to the sub-image can be restored from the main image and the disparity information (the amount of shift or positional deviation), sub-image information that the main image lacks, such as regions hidden behind objects, cannot be restored.

In 1996, a stereo image encoding scheme called the Multi-view Profile was added (ISO/IEC 13818-2/AMD3) to the MPEG-2 Video (ISO/IEC 13818-2) encoding scheme, the international standard for single-view image encoding. The MPEG-2 Video Multi-view Profile is a two-layer encoding scheme that encodes the left-eye image in the base layer and the right-eye image in the enhancement layer. In addition to motion-compensated prediction exploiting temporal redundancy and the discrete cosine transform exploiting spatial redundancy, it compresses using disparity-compensated prediction exploiting redundancy between viewpoints.

Furthermore, a method has been proposed that uses a single encoding path and a single decoding path when encoding and decoding multi-view images (see, for example, Patent Document 2). In this method, on the encoding side (transmitting side), as shown in FIG. 20, the image signals of a plurality of channels obtained by shooting from a plurality of directions are delayed in turn by sequencing means 301, one frame or one field per channel. Control means 304 outputs a switch signal with a frame period or field period and controls a multiplexer 302 so that the signals of the plurality of channels are taken out of the sequencing means 301 in sequence.

The multiplexer 302 inserts the signals one frame or one field at a time, arranging them in sequence into a single image signal. Encoding means 303 encodes the image signal output from the multiplexer 302 and outputs an encoded bit string to the transmission path.

On the decoding side (receiving side), as shown in FIG. 21, decoding means 305 decodes the encoded bit string. Control means 308 outputs, from the decoded image signal, a switch signal with a frame period or field period and controls a demultiplexer 306. Each frame or field of the decoded image signal is thereby extracted in sequence from the demultiplexer 306 and supplied to synchronizing means 307, which delays them in turn by one frame or one field to synchronize them and outputs them as image signals of a plurality of channels, as at the time of shooting.
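The conventional frame-sequential arrangement described above can be sketched as follows. This is only an illustration of the ordering, under the assumption that each channel is a list of frame labels of equal length; the function names `multiplex` and `demultiplex` are hypothetical, and the per-channel delays applied by the sequencing and synchronizing means are not modeled.

```python
def multiplex(channels):
    """Arrange the frames of several channels into one sequence.

    The result is ch0[0], ch1[0], ..., ch0[1], ch1[1], ... which is the
    frame-by-frame ordering produced on the encoding side.
    """
    return [frame for group in zip(*channels) for frame in group]


def demultiplex(stream, num_channels):
    """Split a frame-sequential stream back into its channels."""
    return [stream[c::num_channels] for c in range(num_channels)]


left = ["L0", "L1", "L2"]
right = ["R0", "R1", "R2"]
single = multiplex([left, right])       # one image signal
restored = demultiplex(single, 2)       # back to two channels
```

Because frames that share a capture time occupy consecutive positions in the single signal, the receiving side has to hold the earlier ones until the last channel arrives, which is the source of the buffer and delay discussed in this document.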

Patent Document 1: Japanese Patent Laid-Open No. 61-144191 (JP-A 61-144191)
Patent Document 2: Japanese Patent Laid-Open No. 10-243419

However, in the conventional stereoscopic image encoding/decoding method and apparatus, at encoding time the image signals of the plurality of channels that make up the multi-view image signal are arranged in sequence one frame or one field at a time into a single image signal, which is input to the encoding means 303 and then further reordered into encoding order and encoded by the encoding means 303. A buffer is therefore needed for the frame-by-frame or field-by-field rearrangement, and delay may occur as a result.

Likewise at decoding time, the image signal decoded by the decoding means 305 is output arranged in sequence one frame or one field at a time and is then synchronized by delaying in turn by one frame or one field. A buffer for rearrangement is therefore needed for each viewpoint, and delay may occur as a result.

The present invention has been made in view of the above, and its object is to provide a multi-view image decoding apparatus that reduces the buffer size and shortens the delay time.

To achieve the above object, a first invention is a multi-view image decoding apparatus that decodes encoded data obtained by encoding a multi-view image signal. The multi-view image signal includes the image signals of respective viewpoints obtained at a plurality of set viewpoints, where the image signal of one viewpoint is either an image signal obtained by actually shooting from that viewpoint or an image signal generated as if virtually shot from that viewpoint. The apparatus comprises: decoding means for decoding the encoded data to generate a decoded multi-view image signal; storage means for storing the decoded multi-view image signal in an image buffer as needed; and output means for retrieving the decoded multi-view image signal from the image buffer, synchronizing the decoded image signals of the viewpoints with one another, and outputting the decoded image signal of each viewpoint as its own independent channel.

With this invention, when the input format of the display device or the like is a parallel input format with a plurality of independent channels, the decoded multi-view image signal stored in the image buffer can be output with each viewpoint on its own independent channel to match that input format. The decoded image signal of each viewpoint can therefore be obtained without first outputting the decoded signal arranged in sequence frame by frame or field by field and then synchronizing it, as the conventional stereoscopic image decoding method and apparatus require.

To achieve the above object, a second invention is a multi-view image decoding apparatus that decodes encoded data obtained by encoding a multi-view image signal. The multi-view image signal includes the image signals of respective viewpoints obtained at a plurality of set viewpoints, where the image signal of one viewpoint is either an image signal obtained by actually shooting from that viewpoint or an image signal generated as if virtually shot from that viewpoint. The apparatus comprises: decoding means for decoding the encoded data to generate a decoded multi-view image signal; storage means for storing the decoded multi-view image signal in an image buffer as needed; and output means for retrieving the decoded multi-view image signal from the image buffer, synchronizing the decoded image signals of the viewpoints with one another, interleaving them, and outputting them serially on a single channel.

With this invention, when the input format of the display device or the like is a serial input format with a single channel, the decoded multi-view image signal stored in the image buffer can be output serially on one channel, with the decoded image signals of the viewpoints synchronized with one another and interleaved, to match that input format. The decoded image signal of each viewpoint can therefore be obtained without first outputting the decoded signal arranged in sequence frame by frame or field by field and then synchronizing it, as the conventional stereoscopic image decoding method and apparatus require.
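The two output forms of the first and second inventions can be contrasted with a small sketch. It assumes, purely for illustration, that the image buffer is a dictionary keyed by a (viewpoint index, time index) pair and that every viewpoint has an image at every time index; the names `output_parallel` and `output_serial` are hypothetical.

```python
def output_parallel(buffer, num_views):
    """Return one independent channel per viewpoint.

    Channel i lists the images of viewpoint i in ascending time index, so
    the same position in every channel belongs to the same time instant.
    """
    times = sorted({t for (_, t) in buffer})
    return [[buffer[(view, t)] for t in times] for view in range(num_views)]


def output_serial(buffer, num_views):
    """Return a single interleaved channel.

    For each time index in ascending order, the images of all viewpoints
    are emitted one after another.
    """
    times = sorted({t for (_, t) in buffer})
    return [buffer[(view, t)] for t in times for view in range(num_views)]


image_buffer = {(0, 0): "a0", (1, 0): "b0", (0, 1): "a1", (1, 1): "b1"}
channels = output_parallel(image_buffer, 2)
interleaved = output_serial(image_buffer, 2)
```

In both cases the images are read straight from the shared buffer, so no separate reordering stage is needed to bring the viewpoints back into step.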

Furthermore, to achieve the above object, a third invention is a multi-view image decoding apparatus that decodes encoded data obtained by encoding a multi-view image signal. The multi-view image signal includes the image signals of respective viewpoints obtained at a plurality of set viewpoints, where the image signal of one viewpoint is either an image signal obtained by actually shooting from that viewpoint or an image signal generated as if virtually shot from that viewpoint. The apparatus comprises: decoding means for decoding encoded data in which the multi-view image signal, the number V of viewpoints of the multi-view image signal, and a decoded image output order number o (an integer) are each encoded, where o collectively represents the number v identifying each viewpoint and the number d indicating the output order of the decoded image at each viewpoint, to generate the decoded multi-view image signal, the number V of viewpoints, and the decoded image output order number o; calculation means for dividing the decoded output order number o by the decoded number V of viewpoints using integer arithmetic, taking the quotient (an integer) as the number d indicating the output order of the decoded image at each viewpoint and the remainder (an integer of at least 0 and less than V) as the number v identifying each viewpoint; storage means for storing the decoded multi-view image signal in an image buffer; and output means for retrieving the decoded multi-view image signal from the image buffer in accordance with the numbers d and v supplied from the calculation means, and outputting the decoded image signals of the viewpoints constituting the decoded multi-view image signal in synchronization with one another.
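The calculation performed by the calculation means can be sketched in a few lines of Python. The function names are hypothetical, and the restriction to non-negative o is an assumption made here to keep integer division unambiguous. The packing function is simply the inverse implied by the quotient and remainder definition in the text.

```python
def split_output_order_number(o, num_views):
    """Recover (v, d) from the decoded image output order number o.

    d is the integer quotient o // V (output order within a viewpoint) and
    v is the remainder o % V (viewpoint number, 0 <= v < V).
    """
    if num_views <= 0:
        raise ValueError("number of viewpoints V must be positive")
    if o < 0:
        raise ValueError("this sketch assumes a non-negative o")
    d, v = divmod(o, num_views)
    return v, d


def pack_output_order_number(v, d, num_views):
    """Inverse mapping implied by the quotient/remainder definition."""
    if not 0 <= v < num_views:
        raise ValueError("viewpoint number v must satisfy 0 <= v < V")
    return d * num_views + v


# With V = 3 viewpoints, o = 7 gives quotient d = 2 and remainder v = 1.
v, d = split_output_order_number(7, 3)
```

Since images with the same d map to V consecutive values of o, sorting by o alone also yields a valid output order, which is what allows o to stand in for a single-view output order number.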

With this invention, the number V of viewpoints of the multi-view image signal and the decoded image output order number o, which collectively represents the number v identifying each viewpoint and the number d indicating the output order of the decoded image at each viewpoint, are decoded. From o and V, the number v identifying the viewpoint of each image signal constituting the multi-view image signal and the number d indicating the output order of the decoded image at each viewpoint are calculated, so that the images can be output appropriately to a multi-view image display device or the like. In addition, when a conventional single-view image encoding/decoding scheme is extended to the multi-view image encoding/decoding scheme of the present invention, the decoded image output order number o can be handled as the number that indicates decoded-image output order in the conventional scheme (for example, the picture order count in AVC/H.264).

According to the present invention, there is no need to output the decoded signal arranged in sequence frame by frame or field by field and then synchronize it, as in the conventional stereoscopic image decoding method and apparatus. No image buffer for synchronization is required, and the delay time can be shortened.

Also, according to the present invention, encoded data is decoded in which the decoded image output order number o, which collectively represents the viewpoint number v identifying the viewpoint of each image signal constituting the multi-view image signal and the number d indicating the output order of the decoded image at each viewpoint, is encoded together with the number V of viewpoints of the multi-view image signal. The images can therefore be output appropriately to a multi-view image display device or the like in accordance with v and d.

Consequently, according to the present invention, when a conventional single-view image encoding/decoding scheme is extended to a multi-view image encoding/decoding scheme, handling and decoding the decoded image output order number o as the number indicating decoded-image output order in the conventional scheme (for example, the picture order count in AVC/H.264) maintains compatibility with the conventional encoding/decoding scheme with only a small modification.

Embodiments of the present invention are described below with reference to the drawings. First, a multi-view image encoding apparatus that generates the encoded data to be decoded by the multi-view image decoding apparatus of the present invention is described. FIG. 1 is a block diagram of an example of this multi-view image encoding apparatus, and FIG. 2 is a flowchart explaining the processing procedure of the multi-view image encoding apparatus of FIG. 1.

As shown in FIG. 1, the multi-view image encoding apparatus includes an encoding management unit 101, a decoded image output order number calculation unit 102, a reordering buffer 103, a motion/disparity compensated prediction unit 104, an encoding mode determination unit 105, a residual signal calculation unit 106, a residual signal encoding unit 107, a residual signal decoding unit 108, a residual signal superimposing unit 109, a decoded image buffer 110, and an encoded bit string generation unit 111.

Next, the operation of the multi-view image encoding apparatus shown in FIG. 1 is described with reference to the flowchart of FIG. 2. First, the encoding management unit 101 in FIG. 1 is supplied with the number of viewpoints of the multi-view image supplied to the apparatus and with camera parameter information such as the position and orientation of the camera at each viewpoint at the time of shooting. This information is also needed on the receiving side to display the multi-view image, so the number of viewpoints V and the camera parameter information are supplied via the decoded image output order number calculation unit 102 to the encoded bit string generation unit 111, which generates them as an encoded bit string (step S101 in FIG. 2).

Furthermore, for each image constituting the multi-view image, the encoding management unit 101 is supplied with information identifying its viewpoint and with information such as a time stamp. Based on this information, the input order of the images, and so on, it manages which viewpoint each image belongs to, the output order of the decoded images at each viewpoint (that is, the output order at each viewpoint of the decoded images obtained when the decoding side decodes the encoded bit string produced by this multi-view image encoding apparatus), synchronization between viewpoints, and the like. Each image is assigned, and associated with, a viewpoint number v identifying its viewpoint and a number d indicating the output order of the decoded image at that viewpoint.

Here, how values are assigned to the viewpoint number v identifying each viewpoint is described. Each viewpoint is assigned a viewpoint number v that is either 0 or a positive integer less than the number of viewpoints V.
Images of the same viewpoint are assigned the same value, and images of different viewpoints are assigned different values. Assigning values to the viewpoint number v of each image in this way makes it possible to identify, from v, which viewpoint each image belongs to.

Next, how values are assigned to the number d indicating the output order of the decoded image at each viewpoint is described. Each number d is assigned an integer whose value increases with the output order of the decoded images. When decoded images of different viewpoints have the same output time, however, they are given the same value of d. With values assigned in this way, the decoding side can output images in the desired order by outputting from the image with the smallest d, and can output the viewpoints in synchronization because images of different viewpoints with equal d are known to have the same output time. In addition, the encoding management unit 101 manages the encoding order of the images and the reference images used for motion-compensated prediction and disparity-compensated prediction.
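The way the numbers v and d drive synchronized output on the decoding side can be sketched as follows. This is an illustrative assumption about one possible bookkeeping, not the apparatus's actual control logic: the function name `synchronized_output` is hypothetical, images are represented only by their (v, d) pairs, and a time instant is emitted only once all V viewpoints for that d are present.

```python
from collections import defaultdict


def synchronized_output(images, num_views):
    """Group decoded images by d and emit them in ascending d.

    images    -- iterable of (v, d) pairs in decoding order
    num_views -- the number of viewpoints V
    Returns a list of time instants; each is a list of (v, d) ordered by v.
    Instants for which some viewpoint is still missing are held back.
    """
    by_d = defaultdict(dict)
    for v, d in images:
        by_d[d][v] = (v, d)
    return [
        [by_d[d][v] for v in range(num_views)]
        for d in sorted(by_d)
        if len(by_d[d]) == num_views
    ]


# Decoding order differs from output order; equal d means equal output time.
decoded = [(0, 0), (1, 0), (0, 2), (1, 2), (0, 1), (1, 1)]
instants = synchronized_output(decoded, 2)
```

Equal values of d are what allow images from different viewpoints to be matched without any timing information beyond the numbering itself.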

FIG. 3 shows an example of values assigned to the viewpoint number v identifying the viewpoint and the number d indicating the output order of the decoded image at each viewpoint, for each of the images constituting the multi-view image encoded by the multi-view image encoding apparatus of FIG. 1. In the figure, the vertical axis represents the viewpoint direction and the horizontal axis represents the time direction. M(v) (v = 0, 1, 2, ..., V-1) denotes a viewpoint image constituting the multi-view image, where v is the viewpoint number identifying each viewpoint. Further, m(v, d) (v = 0, 1, 2, ..., V-1; d = 0, 1, 2, ...) denotes an image constituting the viewpoint image M(v), where v is the viewpoint number identifying each viewpoint and d is the number indicating the output order of the decoded image at each viewpoint. For example, when two images m(v, d) are compared, they are images of the same viewpoint if their values of v are equal and images of different viewpoints if their values of v differ. Likewise, they are images of the same time if their values of d are equal; if their values of d differ, they are images of different times, and the one with the smaller value is the earlier image.

Besides encoding the number of viewpoints V of the multi-view image signal, the viewpoint number v and the number d indicating the output order of the decoded images at each viewpoint could each be encoded individually. In the present invention, however, both are encoded together as a single decoded image output order number o. When a conventional single-viewpoint image encoding/decoding scheme is extended to the multi-view image encoding/decoding scheme of this embodiment, the number o, which represents v and d together, is treated as the number that indicates the output order of decoded images in the conventional scheme, and is encoded and decoded as such. This keeps compatibility with the conventional single-viewpoint scheme with only small modifications. Specifically, when the AVC/H.264 scheme is extended to a multi-view encoding scheme, the number o is treated as the picture order count, which is the number that indicates the output order of decoded images in AVC/H.264.

The decoded image output order number calculation unit 102 of FIG. 1 calculates the decoded image output order number o from the number of viewpoints V of the multi-view image to be encoded, the viewpoint number v identifying each viewpoint, and the number d indicating the output order of the decoded images at each viewpoint (step S103 in FIG. 2). The number o is calculated by the following equation.

o = d·V + v   (1)
FIG. 4 shows an example in which the decoded image output order number o is calculated by equation (1) and assigned to each image of a five-viewpoint (V = 5) multi-view image encoded by the multi-view image encoding apparatus of FIG. 1. The encoded bit string generation unit 111 encodes the number o calculated by the decoded image output order number calculation unit 102 into a bit string (step S104 in FIG. 2).
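As an illustration of equation (1), the following Python sketch tabulates o for a five-viewpoint sequence. The function name `output_order_number` and the choice to tabulate only d = 0 to 3 are assumptions made for this example and are not taken from the figures.

```python
def output_order_number(d: int, v: int, num_views: int) -> int:
    """Equation (1): o = d * V + v."""
    if d < 0:
        raise ValueError("d must be a non-negative integer")
    if not 0 <= v < num_views:
        raise ValueError("v must satisfy 0 <= v < V")
    return d * num_views + v


V = 5  # five viewpoints, as in the V = 5 example
# table[v][d] holds o for image m(v, d); only d = 0..3 is tabulated here.
table = [[output_order_number(d, v, V) for d in range(4)] for v in range(V)]

for v, row in enumerate(table):
    print(f"M({v}):", row)
```

Images with the same d occupy V consecutive values of o, so ordering by o orders first by output time and then by viewpoint.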

On the decoding side, the multi-view image decoding apparatus described later decodes the bit string encoded by the multi-view image encoding apparatus of FIG. 1 to obtain the number of viewpoints V and the decoded image output order number o. From these it calculates the viewpoint number v (an integer with 0 ≤ v < V) and the number d (an integer) indicating the output order of the decoded images at each viewpoint, such that equation (1) is satisfied. Specifically, d is the quotient obtained by dividing o by V in integer arithmetic, and v is the remainder of that division. Alternatively, v may be calculated by the following equation after d has been calculated.

v = o − d·V   (2)
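The decoder-side split of o into v and d can be sketched in Python as follows. The function name `split_output_order_number` is hypothetical. The sketch computes v with equation (2), which gives the same result as taking the remainder directly.

```python
from typing import Tuple


def split_output_order_number(o: int, num_views: int) -> Tuple[int, int]:
    """Recover (v, d) from o so that o = d * V + v with 0 <= v < V."""
    if num_views <= 0:
        raise ValueError("V must be a positive integer")
    if o < 0:
        raise ValueError("o must be a non-negative integer")
    d = o // num_views     # quotient of the integer division
    v = o - d * num_views  # equation (2); identical to o % num_views
    return v, d


print(split_output_order_number(17, 5))
```

For V = 5 and o = 17 this yields v = 2 and d = 3, which satisfies equation (1) since 3·5 + 2 = 17.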
The rearrangement buffer 103 stores the supplied multi-view image. The viewpoint images M(v) (v = 0, 1, 2, ..., V−1) constituting the multi-view image can be input in two ways: in parallel, over an independent channel for each viewpoint, or serially over a single channel, as a signal in which the image signals of the viewpoints are interleaved. Methods of interleaving the image signals of the viewpoints include interleaving in units of pixels, in units of groups of pixels, in units of horizontal lines, in units of images, and in units of groups of images.

Examples of the interleave structure of the input viewpoint images M(v) are described with reference to FIGS. 5 to 12. FIG. 5 shows an example in which the signals of the viewpoints are interleaved in units of pixels. In the figure, p(v, i) represents a pixel of each viewpoint image, where v is the viewpoint number identifying the viewpoint and i is the pixel index.
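Pixel-unit interleaving can be sketched in Python as below. The exact ordering in FIG. 5 is not reproduced in the text, so the sketch assumes that the serial stream cycles through all viewpoints for each pixel index i. The function names are hypothetical.

```python
def interleave_pixels(views):
    """Serialize views[v][i] = p(v, i) as p(0,0), p(1,0), ..., p(0,1), p(1,1), ..."""
    num_pixels = len(views[0])
    if any(len(view) != num_pixels for view in views):
        raise ValueError("all viewpoint images must have the same pixel count")
    return [views[v][i] for i in range(num_pixels) for v in range(len(views))]


def deinterleave_pixels(stream, num_views):
    """Split the serial stream back into one pixel list per viewpoint."""
    if len(stream) % num_views != 0:
        raise ValueError("stream length must be a multiple of V")
    return [stream[v::num_views] for v in range(num_views)]


views = [[10, 11, 12], [20, 21, 22]]  # two viewpoints, three pixels each
stream = interleave_pixels(views)
print(stream)
print(deinterleave_pixels(stream, 2))
```

The same pattern applies to the coarser units described for the other figures if each list element stands for a line, a slice, or a whole image instead of a pixel.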

FIG. 6 shows an example in which the signals of the viewpoints are interleaved in units of groups of pixels. In the figure, py(v, iy) represents a pixel of the luminance signal of each viewpoint image, where v is the viewpoint number identifying the viewpoint and iy is the index of the luminance pixel. pu(v, iu) represents a pixel of the color difference signal (U), where iu is the index of that pixel. pv(v, iv) represents a pixel of the color difference signal (V), where iv is the index of that pixel.

FIG. 7 shows an example in which the signals of the viewpoints are interleaved in units of pixel blocks of 16 horizontal by 16 vertical pixels, or of 16 horizontal by 8 vertical pixels. In the figure, b(v, ib) represents a pixel block of each viewpoint image, where v is the viewpoint number identifying the viewpoint and ib is the pixel block index. Because each pixel block is scanned along horizontal lines, this can be regarded as a kind of interleaving in units of groups of pixels.

FIG. 8 shows an example in which the signals of the viewpoints are interleaved in units of horizontal lines. In the figure, l(v, j) represents a line of each viewpoint image, where v is the viewpoint number identifying the viewpoint and j is the line index. Because a line consists of a plurality of pixels, this can be regarded as a kind of interleaving in units of groups of pixels.

As shown in FIG. 9, when the signals of the viewpoints are interleaved by combining them into a single image, scanning the combined image along horizontal lines makes this, too, a kind of interleaving in units of horizontal lines. FIG. 9(A) shows a single image in which the signals of the viewpoints are combined, and FIG. 9(B) shows the image interleaved in units of horizontal lines. In the figure, v is the viewpoint number identifying the viewpoint, and d is the number indicating the capture order of the images at each viewpoint (the number indicating the output order of the decoded images at each viewpoint). The j in l(v, j) is the line index.

FIG. 10 shows an example in which the signals of the viewpoints are interleaved in units of slices, each of which groups a plurality of lines. In the figure, s(v, k) represents a slice of each viewpoint image, where v is the viewpoint number identifying the viewpoint and k is the index of the slice. Because a slice consists of a plurality of pixels, this can be regarded both as a kind of interleaving in units of groups of pixels and as a kind of interleaving in units of pixel blocks.

FIG. 11 shows an example in which the signals of the viewpoints are interleaved in units of images. In the figure, m(v, d) represents an image constituting each viewpoint image, where v is the viewpoint number identifying the viewpoint and d is the number indicating the capture order of the images at each viewpoint (the number indicating the output order of the decoded images at each viewpoint).

FIG. 12 shows an example in which the signals of the viewpoints are interleaved in units of groups of images. In the figure, as in FIG. 11, m(v, d) represents an image constituting each viewpoint image, where v is the viewpoint number identifying the viewpoint and d is the number indicating the capture order of the images at each viewpoint (the number indicating the output order of the decoded images at each viewpoint).

In the multi-view image encoding apparatus of FIG. 1, whichever of the methods of FIGS. 5 to 12 is used to input the image signal, the signal is input to and stored in the rearrangement buffer 103 as it arrives, without delay. The image signals stored in the rearrangement buffer 103 are then supplied, in units of pixel blocks and in the encoding order controlled by the encoding management unit 101, to the motion/disparity compensation prediction unit 104 and the residual signal calculation unit 106 (step S106 in FIG. 2).

The encoding order controlled by the encoding management unit 101 is described with reference to FIG. 13. FIG. 13 shows an example of the encoding order of the images m(v, d) constituting a five-viewpoint (V = 5) multi-view image, together with the reference relationships for motion compensation and disparity compensation. In the figure, the viewpoint images M(0), M(2), and M(4) are encoded without disparity compensation prediction, that is, without referring to images of other viewpoints. For example, the image m(0, 0) of the viewpoint image M(0) is encoded as a picture that is coded independently within the frame, without referring to any other image.

The image m(0, 3) of the viewpoint image M(0) is encoded by motion compensation prediction, using as the reference image the decoded image of m(0, 0), which precedes it in display order at the same viewpoint. The image m(0, 1) of the viewpoint image M(0) is encoded by motion compensation prediction, using as reference images the decoded images of m(0, 0), which precedes it in display order at the same viewpoint, and m(0, 3), which follows it.

The viewpoint images M(1) and M(3), on the other hand, are encoded using disparity compensation prediction, which predicts from images of other viewpoints as reference images, in addition to motion compensation prediction. For example, the image m(1, 1) of the viewpoint image M(1) is encoded by motion compensation prediction using as reference images the decoded images of m(1, 0), which precedes it in display order at the same viewpoint, and m(1, 3), which follows it, and also by disparity compensation prediction using as reference images the decoded images of m(0, 1) and m(2, 1) of other viewpoints.

When the image m(1, 1) of the viewpoint image M(1) is encoded, the reference images m(1, 0), m(1, 3), m(0, 1), and m(2, 1) must already have been encoded and decoded and must be stored in the decoded image buffer 110. In this example, the images may therefore be encoded in the order m(0, 0), m(2, 0), m(1, 0), m(4, 0), m(3, 0), m(0, 3), m(2, 3), m(1, 3), m(4, 3), m(3, 3), m(0, 1), m(2, 1), m(1, 1), m(4, 1), m(3, 1), m(0, 2), m(2, 2), m(1, 2), m(4, 2), m(3, 2), m(0, 6), m(2, 6), m(1, 6), m(4, 6), m(3, 6), m(0, 4), m(2, 4), m(1, 4), and so on. The encoding order number c indicating this order corresponds one-to-one with the decoded image output order number o and is managed by the encoding management unit 101.
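The constraint that every reference image is encoded before the image that uses it can be checked with a short Python sketch. The reference lists below are only the three relationships stated in the text. Numbering c from 0 and the names `CODING_ORDER`, `REFERENCES`, and `references_ready` are assumptions made for illustration.

```python
V = 5

# First 15 images of the example encoding order, as (v, d) pairs.
CODING_ORDER = [
    (0, 0), (2, 0), (1, 0), (4, 0), (3, 0),
    (0, 3), (2, 3), (1, 3), (4, 3), (3, 3),
    (0, 1), (2, 1), (1, 1), (4, 1), (3, 1),
]

# Encoding order number c for each image (0-based here by assumption).
coding_number = {image: c for c, image in enumerate(CODING_ORDER)}

# Output order number o for each position c, from equation (1).
c_to_o = [d * V + v for (v, d) in CODING_ORDER]

# Reference relationships stated in the description.
REFERENCES = {
    (0, 3): [(0, 0)],
    (0, 1): [(0, 0), (0, 3)],
    (1, 1): [(1, 0), (1, 3), (0, 1), (2, 1)],
}


def references_ready(image):
    """True if every reference of `image` has a smaller encoding order number."""
    return all(coding_number[ref] < coding_number[image]
               for ref in REFERENCES.get(image, []))


print(all(references_ready(image) for image in REFERENCES))
print(c_to_o)
```

Since each c maps to a distinct o, a lookup table in either direction is enough to convert between encoding order and output order.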

The motion/disparity compensation prediction unit 104 performs the disparity compensation prediction described above in addition to motion compensation prediction of the same kind as in the conventional AVC/H.264 scheme and similar schemes (step S107 in FIG. 2). Motion compensation prediction uses as the reference image an image of the same viewpoint that precedes or follows in display order. Disparity compensation prediction can share the same processing by using an image of another viewpoint as the reference image instead. Under the control of the encoding management unit 101, block matching is performed between the pixel block supplied from the rearrangement buffer 103 and the reference image supplied from the decoded image buffer 110. A motion vector is detected for motion compensation prediction, and a disparity vector for disparity compensation prediction. A motion compensation prediction/disparity compensation prediction block signal is then generated and supplied, together with the motion vector/disparity vector, to the encoding mode determination unit 105.

The encoding management unit 101 controls the candidate combinations of whether to perform motion compensation prediction or disparity compensation prediction, the number of reference images, which decoded images to use as reference images, the pixel block size, and so on. Under this control, motion compensation prediction/disparity compensation prediction is performed for every combination that is a candidate encoding mode, and each resulting prediction block signal and motion vector/disparity vector is supplied to the encoding mode determination unit 105. The pixel block size candidates here are the small blocks obtained by further dividing the pixel block. For example, when the pixel block is 16 horizontal by 16 vertical pixels (16×16), it is divided into small blocks such as 16×8, 8×16, 8×8, 8×4, 4×8, and 4×4, motion compensation prediction is performed on them, and the results are taken as candidates.

The encoding mode determination unit 105 determines which of motion compensation prediction and disparity compensation prediction, with which reference images and in which pixel block units, should be selected and combined to achieve efficient encoding, and thereby decides the encoding mode. It supplies the resulting encoding mode and the corresponding motion vector/disparity vector to the encoded bit string generation unit 111, and supplies the corresponding motion compensation prediction/disparity compensation prediction block signal to the residual signal calculation unit 106 (step S108 in FIG. 2).

For example, when motion compensation predictions from a preceding and a following reference image on the time axis are combined, a candidate block is generated by averaging, pixel by pixel, the motion compensation prediction block obtained from the preceding reference image and the one obtained from the following reference image. Motion compensation prediction and disparity compensation prediction can also be combined. When the pixel values are averaged, the average need not be 1:1, and weights such as 1:2 or 1:3 may be used. When the pixel block is divided into small blocks from 4×4 to 16×16 pixels as encoding mode candidates, a different prediction method can be used for each small block.
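The weighted combination of two prediction blocks can be sketched in Python as follows. The text states only the weight ratios, so rounding to the nearest integer is an assumption here, and the function name `combine_predictions` is hypothetical.

```python
def combine_predictions(block_a, block_b, weight_a=1, weight_b=1):
    """Weighted per-pixel average of two equally sized prediction blocks."""
    total = weight_a + weight_b
    return [
        # Adding total // 2 before the integer division rounds to nearest.
        [(weight_a * a + weight_b * b + total // 2) // total
         for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(block_a, block_b)
    ]


forward = [[100, 50]]   # block predicted from the preceding reference image
backward = [[110, 53]]  # block predicted from the following reference image

print(combine_predictions(forward, backward))        # 1:1 average
print(combine_predictions(forward, backward, 1, 3))  # 1:3 weighting
```

The same function covers a combination of a motion compensation prediction block and a disparity compensation prediction block, since only the pixel values matter.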

There are various methods of determining the encoding mode. One example is to calculate the code amount and the distortion amount for each encoding mode and to select the mode that gives the best balance between the two. In this determination, a residual signal is first calculated for each combination of encoding modes, and the bit length of the code string obtained by encoding the residual signal, the vectors, and the encoding mode is calculated as the code amount. The encoded residual signal is then decoded and added to the prediction signal to give a decoded signal, and the sum of absolute errors, or the sum of squared errors, between this decoded signal and the image signal before encoding is calculated as the distortion amount. The code amount is multiplied by a predetermined multiplier and added to the distortion amount to give an evaluation value. The combination with the smallest evaluation value among all candidate combinations of encoding modes is selected as the encoding mode of the pixel block.
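The evaluation value described above can be sketched in Python. The candidate mode names and their distortion and code amounts below are made-up illustrative numbers, and the function names are hypothetical.

```python
def sum_of_squared_errors(original, decoded):
    """Distortion amount as the sum of squared pixel differences."""
    return sum((a - b) ** 2 for a, b in zip(original, decoded))


def evaluation_value(distortion, code_bits, multiplier):
    """Evaluation value = distortion amount + multiplier * code amount."""
    return distortion + multiplier * code_bits


def choose_mode(candidates, multiplier):
    """Return the candidate mode with the smallest evaluation value.

    `candidates` maps a mode name to a (distortion, code_bits) pair.
    """
    return min(candidates,
               key=lambda mode: evaluation_value(*candidates[mode], multiplier))


# Hypothetical candidates: (distortion amount, code amount in bits).
candidates = {
    "motion_16x16": (400, 30),
    "disparity_16x16": (350, 60),
    "motion_8x8": (300, 120),
}

print(choose_mode(candidates, multiplier=2.0))
print(choose_mode(candidates, multiplier=0.5))
```

A larger multiplier favors modes that cost fewer bits, while a smaller one favors modes with lower distortion.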

The residual signal calculation unit 106 subtracts the motion compensation prediction/disparity compensation prediction block signal supplied from the encoding mode determination unit 105 from the pixel block signal supplied from the rearrangement buffer 103 to obtain a residual signal (step S109 in FIG. 2). The residual signal encoding unit 107 performs residual signal encoding processing such as orthogonal transform and quantization on the residual signal input from the residual signal calculation unit 106 to calculate an encoded residual signal (step S110 in FIG. 2).

The encoding management unit 101 manages whether the encoded image will be used as a reference image for motion compensation prediction of an image that follows in encoding order, or for disparity compensation prediction of another viewpoint (step S111 in FIG. 2). If it will be used as a reference image, the encoded residual signal is decoded and the decoded image signal is stored sequentially in the decoded image buffer 110 in units of pixel blocks (steps S112 to S114 in FIG. 2).

First, the residual signal decoding unit 108 performs residual signal decoding processing such as inverse quantization and inverse orthogonal transform on the encoded residual signal input from the residual signal encoding unit 107 to generate a decoded residual signal (step S112 in FIG. 2). The residual signal superimposing unit 109 superimposes the decoded residual signal supplied from the residual signal decoding unit 108 on the prediction signal supplied from the encoding mode determination unit 105 to calculate a decoded image signal (step S113 in FIG. 2), and stores the decoded image signal sequentially in the decoded image buffer 110 in units of pixel blocks (step S114 in FIG. 2). The decoded image signal stored in the decoded image buffer 110 serves, as necessary, as a reference image for motion compensation prediction of an image that follows in encoding order or for disparity compensation prediction of another viewpoint.
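The local decoding loop of steps S109 to S113 can be sketched in Python as below. To keep the example short, the orthogonal transform is omitted and a uniform scalar quantizer with a hypothetical step size stands in for the residual signal encoding. Both are simplifying assumptions, not the processing of units 107 and 108.

```python
def subtract(block, prediction):
    """Step S109: residual = pixel block - prediction block."""
    return [[x - p for x, p in zip(rx, rp)] for rx, rp in zip(block, prediction)]


def quantize(residual, step):
    """Simplified stand-in for residual encoding: uniform scalar quantization."""
    def level(r):
        magnitude = (abs(r) + step // 2) // step  # round magnitude to nearest
        return magnitude if r >= 0 else -magnitude
    return [[level(r) for r in row] for row in residual]


def dequantize(levels, step):
    """Simplified stand-in for residual decoding (step S112)."""
    return [[q * step for q in row] for row in levels]


def superimpose(prediction, decoded_residual):
    """Step S113: decoded image = prediction + decoded residual."""
    return [[p + r for p, r in zip(rp, rr)]
            for rp, rr in zip(prediction, decoded_residual)]


block = [[52, 55], [61, 59]]
prediction = [[50, 50], [60, 60]]

residual = subtract(block, prediction)
levels = quantize(residual, step=4)
reconstructed = superimpose(prediction, dequantize(levels, step=4))
print(residual, levels, reconstructed)
```

The reconstructed block, not the original, is what goes into the decoded image buffer, so the encoder predicts from the same data the decoder will have.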

The encoded bit string generation unit 111 sequentially encodes the encoding mode and the motion vector or disparity vector input from the encoding mode determination unit 105, the encoded residual signal input from the residual signal encoding unit 107, and other data, using entropy coding such as Huffman coding or arithmetic coding, to generate an encoded bit string (step S115 in FIG. 2).

The processing from step S106 to step S115 in FIG. 2 is repeated in units of pixel blocks until all pixel blocks in the image being encoded have been encoded (steps S105 to S116 in FIG. 2). The processing from step S103 to step S116 in FIG. 2 is in turn repeated for each image to be encoded (steps S102 to S117 in FIG. 2).

Next, the multi-view image decoding apparatus of the present invention is described with reference to the drawings. FIG. 14 is a block diagram of an embodiment of the multi-view image decoding apparatus according to the present invention, and FIG. 15 is a flowchart explaining the processing procedure of the multi-view image decoding apparatus of FIG. 14. As shown in FIG. 14, the multi-view image decoding apparatus of this embodiment includes an encoded bit string decoding unit 201, a decoded image management information calculation unit 202, a motion/disparity compensation prediction unit 203, a prediction signal synthesis unit 204, a residual signal decoding unit 205, a residual signal superimposing unit 206, a decoded image buffer 207, a decoded image management unit 208, and a decoded image output unit 209.

The operation of the multi-view image decoding apparatus of this embodiment is described with reference to the flowchart of FIG. 15. First, the encoded bit string decoding unit 201 decodes the encoded bit string that was encoded by the multi-view image encoding apparatus of FIG. 1 using entropy coding such as Huffman coding or arithmetic coding, and obtains camera parameter information such as the number of viewpoints V of the multi-view image and the camera viewpoint positions (step S201 in FIG. 15). This camera parameter information is needed when the decoded multi-view image is displayed on the output destination, such as a multi-view image display apparatus. The number of viewpoints V is also used to calculate the viewpoint number v, which identifies the viewpoint of each image constituting the multi-view image, and the number d, which indicates the output order of the decoded images at each viewpoint.

The encoded bit string decoding unit 201 also obtains the decoded image output order number o, which is needed to calculate the viewpoint number v identifying the viewpoint of each image constituting the multi-view image and the number d indicating the output order of the decoded images at each viewpoint (step S203 in FIG. 15).

From the number of viewpoints V and the decoded image output order number o supplied from the encoded bit string decoding unit 201, the decoded image management information calculation unit 202 calculates the viewpoint number v (an integer with 0 ≤ v < V) identifying the viewpoint of each image and the number d (an integer) indicating the output order of the decoded images at each viewpoint, such that equation (1) is satisfied (step S204 in FIG. 15). Specifically, d is the quotient obtained by dividing o by V in integer arithmetic, and v is the remainder of that division. Alternatively, v may be calculated by equation (2) after d has been calculated.

The number of viewpoints V and the decoded image output order number o supplied from the encoded bit string decoding unit 201, together with the viewpoint number v and the number d calculated by the decoded image management information calculation unit 202, are supplied to the decoded image management unit 208 described later and are used to manage the decoded images stored in the decoded image buffer 207.

As in the case of encoding, when a conventional single-viewpoint decoding scheme is extended to a multi-view decoding scheme, the multi-view image decoding apparatus of this embodiment treats the decoded image output order number o as the number that indicates the output order of decoded images in the conventional single-viewpoint decoding scheme. This keeps compatibility with the conventional decoding scheme.

For example, when the AVC/H.264 scheme is extended to a multi-view encoding scheme, the decoded image output order number o is treated as the picture order count, which is the number that indicates the output order of decoded images in AVC/H.264. Because the multi-view image decoding apparatus of this embodiment decodes images in the order in which they were encoded on the encoding side, the encoding order is equal to the decoding order. The encoded bit string decoding unit 201 also obtains information on the pixel block to be decoded, such as the encoding mode, the motion vector or disparity vector, and the encoded residual signal (the encoded prediction residual signal) (step S206 in FIG. 15).

Next, the motion/disparity compensation prediction unit 203 performs motion compensation prediction/disparity compensation prediction according to the encoding mode decoded by the encoded bit string decoding unit 201 (step S207 in FIG. 15). In this prediction, the image supplied from the decoded image buffer 207 according to the encoding mode is referred to, and the pixel block at the position indicated by the motion vector/disparity vector decoded by the encoded bit string decoding unit 201 is taken as the motion compensation prediction/disparity compensation prediction block. The pixel block may be divided into small blocks, each with a different prediction method and motion vector/disparity vector. The pixel block may also be predicted from a plurality of reference pictures. In such cases, a plurality of motion compensation predictions/disparity compensation predictions are performed to obtain a plurality of prediction blocks.

When the pixel block is divided into small blocks or is predicted from a plurality of reference pictures, the prediction signal synthesis unit 204 combines the plurality of prediction blocks to generate the prediction signal of the pixel block (step S208 in FIG. 15). Meanwhile, the residual signal decoding unit 205 applies residual signal decoding processing such as inverse quantization and inverse orthogonal transform to the encoded residual signal input from the encoded bit string decoding unit 201, and generates a decoded residual signal (step S209 in FIG. 15).

The residual signal superimposing unit 206 superimposes the decoded residual signal supplied from the residual signal decoding unit 205 on the prediction signal supplied from the prediction signal synthesis unit 204 to calculate a decoded image signal (step S210 in FIG. 15), and sequentially stores that decoded image signal in the decoded image buffer 207 in units of pixel blocks (step S211 in FIG. 15). The decoded image signal stored in the decoded image buffer 207 serves, as necessary, as a reference image when decoding images that follow in encoding order.
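Steps S210 and S211 can be pictured with a minimal sketch. It assumes 8-bit samples and clips the sum to the valid sample range, which this description does not state; the function names `superimpose_residual` and `store_block` are hypothetical and are not taken from the embodiment.

```python
def superimpose_residual(prediction, residual, max_value=255):
    """Add a decoded residual block to a prediction block, sample by sample.

    Clipping to [0, max_value] is an assumption of this sketch.
    """
    return [
        [min(max(p + r, 0), max_value) for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(prediction, residual)
    ]


def store_block(picture, block, top, left):
    """Write a reconstructed pixel block into a picture held in the buffer."""
    for dy, row in enumerate(block):
        picture[top + dy][left:left + len(row)] = row


prediction = [[100, 120], [250, 3]]
residual = [[5, -20], [10, -7]]
picture = [[0] * 4 for _ in range(4)]  # stands in for one picture in buffer 207

block = superimpose_residual(prediction, residual)
store_block(picture, block, top=2, left=2)
```

Here the lower-right 2x2 region of `picture` receives the reconstructed block, and the two out-of-range sums are clipped to 255 and 0.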

The processing from step S206 to step S211 in FIG. 15 described above is repeated pixel block by pixel block until all pixel blocks in the encoded image have been decoded (steps S205 to S212 in FIG. 15).

The decoded image management unit 208 manages the decoded image signals stored in the decoded image buffer 207 in association with the number of viewpoints V, the decoded image output order number o, the viewpoint number v that identifies the viewpoint of each image, and the number d that indicates the output order of the decoded image at each viewpoint, all of which are supplied from the decoded image management information calculation unit 202. Based on these parameters, the decoded image management unit 208 determines whether to output a decoded image stored in the decoded image buffer 207 (step S213 in FIG. 15).

When the decoding order differs from the output order of the decoded images, so that the output timing differs, a delay is required, and in some cases a decoded image is not output at that time. For each decoded image signal stored in the decoded image buffer 207, the decoded image management unit 208 identifies the viewpoint by the number v and manages the output order of the decoded image at each viewpoint by the number d. It synchronizes the viewpoints with one another so that images whose number d is equal across the viewpoints are output simultaneously or consecutively, and controls the output so that images are output in ascending order of d. Under the control of the decoded image management unit 208, the decoded image output unit 209 outputs the decoded images stored in the decoded image buffer 207 to a multi-view image display device or the like with the decoded image signals of the viewpoints synchronized with one another (step S214 in FIG. 15).
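The output control described here can be sketched as follows. This is an illustrative model only: the class `DecodedImageManager` and its methods are hypothetical names, d is assumed to start at 0 and increase by 1, and a picture is dropped from the buffer when it is output, whereas an actual decoder would keep any picture still needed as a reference.

```python
class DecodedImageManager:
    """Tracks decoded pictures by (v, d) and releases them view-synchronized."""

    def __init__(self, num_views):
        self.num_views = num_views  # number of viewpoints V
        self.buffer = {}            # (v, d) -> decoded picture
        self.next_d = 0             # next output index d to release

    def store(self, v, d, picture):
        self.buffer[(v, d)] = picture

    def pop_ready(self):
        """Return, in ascending d, every complete set of V pictures."""
        ready = []
        while all((v, self.next_d) in self.buffer for v in range(self.num_views)):
            ready.append(
                [self.buffer.pop((v, self.next_d)) for v in range(self.num_views)]
            )
            self.next_d += 1
        return ready


mgr = DecodedImageManager(num_views=2)
mgr.store(0, 0, "m(0,0)")
first = mgr.pop_ready()    # view 1 is still missing for d = 0
mgr.store(1, 0, "m(1,0)")
second = mgr.pop_ready()   # both views of d = 0 are now available
mgr.store(0, 2, "m(0,2)")
mgr.store(1, 2, "m(1,2)")
third = mgr.pop_ready()    # d = 1 has not been decoded yet, so d = 2 waits
mgr.store(0, 1, "m(0,1)")
mgr.store(1, 1, "m(1,1)")
fourth = mgr.pop_ready()   # d = 1 and then d = 2 are released in order
```

Holding back d = 2 until d = 1 is complete corresponds to the delay that arises when the decoding order differs from the output order.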

There are two methods of outputting the images of the decoded viewpoint images M(v) with the viewpoints synchronized with one another: outputting the image signal of each viewpoint in parallel on its own independent channel, and interleaving the image signals of the viewpoints and outputting them serially on a single channel. When the decoded image output unit 209 outputs the image signal of each viewpoint in parallel on independent channels, it outputs the image signals of the viewpoints at the same time instant, that is, the decoded image signals of the viewpoints whose number d (the output order of the decoded image at each viewpoint) is equal, in synchronization with one another (steps S213 and S214 in FIG. 15).

Referring to FIG. 3, which was used in the description of the multi-view image encoding apparatus above, outputting the images m(v, d) of each viewpoint image M(v) in ascending order of d allows the decoded images to be output in the order desirable for display on the destination multi-view image display device or the like. In addition, outputting the decoded image signals that share the same value of d at the same time synchronizes the decoded image signals of the viewpoints with one another. Here, starting the output of the decoded images only after the images of all viewpoints have been decoded allows the decoded image of every viewpoint to be output without omission (steps S213 and S214 in FIG. 15).

When the decoded image output unit 209 outputs the viewpoints serially on a single channel as an interleaved signal, it proceeds in ascending order of d and interleaves the images of the viewpoints at the same time instant, that is, the decoded image signals of the viewpoints whose number d is equal, so that they are output in synchronization. Methods of serially outputting the viewpoints as an interleaved signal include interleaving the signals of the viewpoints pixel by pixel, in units of groups of pixels, in units of horizontal lines, in units of images, and in units of groups of images.

The interleave structure of the output decoded image signal is the same as the interleave structure of the viewpoint images input to the multi-view image encoding apparatus described above and shown in FIGS. 5 to 12. As in the case of parallel output on independent channels, with the methods that interleave the signals of the viewpoints pixel by pixel, in units of groups of pixels, or in units of horizontal lines, the image signals of the viewpoints whose number d is equal are interleaved in the corresponding unit and output, for example as shown in FIGS. 5 to 10. The image signals of the viewpoints can thereby be output at the same time, and the viewpoints can be output in synchronization with one another.
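Pixel-unit and line-unit interleaving of pictures that share the same d can be sketched as below. The exact layouts of FIGS. 5 to 10 are not reproduced in this text, so the ordering used here (viewpoint 0 first, then viewpoint 1, and so on within each unit) is an assumption, and the function names are hypothetical.

```python
def interleave_lines(views):
    """Interleave same-d pictures of all viewpoints in units of horizontal lines.

    views: one picture per viewpoint, each a list of rows of equal size.
    """
    height = len(views[0])
    if any(len(view) != height for view in views):
        raise ValueError("all viewpoints must have the same number of lines")
    return [view[y] for y in range(height) for view in views]


def interleave_pixels(views):
    """Interleave same-d pictures of all viewpoints pixel by pixel."""
    height = len(views[0])
    width = len(views[0][0])
    return [
        [view[y][x] for x in range(width) for view in views]
        for y in range(height)
    ]


view_a = [["a0", "a1"], ["a2", "a3"]]  # picture of viewpoint 0 for some d
view_b = [["b0", "b1"], ["b2", "b3"]]  # picture of viewpoint 1 for the same d

by_line = interleave_lines([view_a, view_b])
by_pixel = interleave_pixels([view_a, view_b])
```

In both cases every output unit carries samples of all viewpoints for the same d, which is what keeps the viewpoints synchronized on a single serial channel.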

In this case, starting the interleaving and output of the decoded images only after the images of all viewpoints to be interleaved have been decoded allows the decoded image of every viewpoint to be output without omission (steps S213 and S214 in FIG. 15). With the method that interleaves in units of images, the image signals of the viewpoints whose number d is equal are output consecutively, image by image, as shown in FIG. 11, so that the viewpoints are output in synchronization with one another. The processing from step S203 to step S214 in FIG. 15 is then repeated for each encoded image (steps S202 to S215 in FIG. 15).

As described above, in the multi-view image encoding apparatus shown in FIG. 1, the input images are written to and stored in the rearrangement buffer 103 as they arrive, without delay, whether the viewpoints are input in parallel on independent channels or input serially on a single channel as an interleaved signal. Consequently, unlike the conventional stereoscopic image encoding method and apparatus, there is no need to arrange the images sequentially frame by frame or field by field before encoding, no image buffer for that sequential arrangement is required, and the delay time can be shortened.

Meanwhile, when the multi-view image decoding apparatus of this embodiment outputs the decoded images stored in the decoded image buffer 207 to a multi-view image display device or the like, it outputs them in a format that matches the input of that device: either in parallel with each viewpoint on its own independent channel, or serially on a single channel as a signal in which the viewpoints are interleaved. Consequently, unlike the conventional stereoscopic image decoding method and apparatus, there is no need to output the decoded images sequentially frame by frame or field by field and then synchronize them, no image buffer for that synchronization is required, and the delay time can be shortened.

In addition, the multi-view image encoding apparatus of FIG. 1 encodes the decoded image output order number o, which collectively indicates the viewpoint number v identifying the viewpoint of each image constituting the multi-view image signal and the number d indicating the output order of the decoded image at each viewpoint, together with the number of viewpoints V of the multi-view image signal. The multi-view image decoding apparatus of this embodiment can reliably decode the encoded data produced in this way, and can output the images appropriately to a multi-view image display device or the like according to the viewpoint number v and the number d.
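The claims of this document state that d is the integer quotient of o divided by V and that v is the remainder of that division. A minimal sketch of that relation and of its inverse follows; the function names are hypothetical, and o is assumed to be non-negative.

```python
def split_output_order(o, num_views):
    """Recover (v, d) from the decoded image output order number o.

    d is the integer quotient o // V and v is the remainder o % V.
    """
    if num_views <= 0 or o < 0:
        raise ValueError("expects V > 0 and a non-negative o")
    d, v = divmod(o, num_views)
    return v, d


def combine_output_order(v, d, num_views):
    """Inverse mapping, o = d * V + v, as it follows from the quotient/remainder rule."""
    if not 0 <= v < num_views:
        raise ValueError("v must satisfy 0 <= v < V")
    return d * num_views + v


V = 3
pairs = [split_output_order(o, V) for o in range(6)]
round_trip = [combine_output_order(v, d, V) for v, d in pairs]
```

With V = 3, the values o = 0, 1, 2 map to viewpoints 0, 1, 2 at d = 0, and o = 3, 4, 5 map to the same viewpoints at d = 1, so only o and V need to be carried in the encoded data.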

When a conventional single-view image encoding/decoding scheme is extended to the multi-view image encoding/decoding scheme of this embodiment, the decoded image output order number o, which collectively indicates the viewpoint number v and the number d, is treated as the number that indicates the output order of decoded images in the conventional single-view scheme (for example, the picture order count of AVC/H.264) and is encoded and decoded as such. This provides compatibility with the conventional single-view image encoding/decoding scheme with only a small modification.

The present invention is not limited to the embodiment described above. For example, when the multi-view image decoding apparatus of this embodiment outputs the decoded images stored in the decoded image buffer 207 to a multi-view image display device or the like and the pixels need to be rearranged, a line buffer for temporary storage may be provided inside or outside the decoded image output unit 209. The decoded image signal read from the decoded image buffer 207 may be written temporarily to the line buffer and then output with its pixels rearranged as needed, or it may be written to the line buffer with its pixels already rearranged and then output as needed. These variations are also included in the present invention.

Likewise, in the multi-view image encoding apparatus of FIG. 1, input images are described as being written to and stored in the rearrangement buffer 103 as they arrive, without delay. However, when the pixels need to be rearranged on storage, for example because the format of the input images differs from the storage format of the rearrangement buffer 103, a line buffer for temporary storage may be provided. The input image signal may be written temporarily to the line buffer and then stored in the rearrangement buffer 103 with its pixels rearranged as needed, or it may be written to the line buffer with its pixels already rearranged and then stored in the rearrangement buffer 103 as needed.

In the description above, the decoded image output unit 209 outputs the decoded images stored in the decoded image buffer 207 with the image signals of the viewpoints synchronized with one another, under the control of the decoded image management unit 208 and based on the viewpoint number v and the number d that the decoded image management information calculation unit 202 calculates from the number of viewpoints V and the decoded image output order number o. However, when the viewpoints are interleaved in units of images and output serially on a single channel, the output can instead be based directly on the decoded image output order number o. In that case, outputting the decoded images in ascending order of o interleaves the viewpoints in units of images and outputs them in synchronization with one another.
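The image-unit case can be illustrated with a short sketch. It assumes that v and d are the remainder and integer quotient of o divided by V, as stated in the claims, and uses a made-up decoding order; `output_in_o_order` is a hypothetical name.

```python
def output_in_o_order(decoded):
    """Return pictures sorted by ascending decoded image output order number o.

    decoded: list of (o, picture) pairs in decoding order.
    """
    return [picture for _, picture in sorted(decoded, key=lambda item: item[0])]


V = 2
decoding_order = [0, 4, 1, 5, 2, 3]  # an assumed decoding order, for illustration only
decoded = [(o, f"m({o % V},{o // V})") for o in decoding_order]

serial_output = output_in_o_order(decoded)
```

Sorting by o alone places the pictures of viewpoints 0 and 1 for d = 0 first, then those for d = 1, and so on, which is the image-unit interleave, so v and d do not have to be computed separately for this output mode.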

In the description above, the multi-view images that are encoded and decoded may be multi-view images actually shot from different viewpoints. They may also be converted or generated viewpoint images, for example images of a virtual viewpoint that was not actually shot and whose position is interpolated from neighboring viewpoints; encoding and decoding such images is included in the present invention. Multi-view images produced by computer graphics or the like may also be encoded and decoded, and this too is included in the present invention.

For example, for a multi-view image signal comprising image signals of four viewpoints A, B, C, and D, three cases are possible: (1) the image signals of all four viewpoints are obtained by actually shooting at each viewpoint; (2) the image signals of all four viewpoints are generated as if virtually shot at each viewpoint; and (3) actually shot and virtually generated image signals are mixed, for example with the image signals of viewpoints A and B obtained by actual shooting and those of viewpoints C and D generated as if virtually shot.

The viewpoints of the multi-view images used in the present invention may be arranged in any way. This is described with reference to FIGS. 16 to 19. The numbers shown in FIGS. 16 to 19 are the viewpoint numbers v (v = 0, 1, 2, ...) that indicate the viewpoint positions. FIG. 16 shows an example in which the viewpoints are arranged horizontally; the images are shot with the cameras lined up in the horizontal direction. FIG. 17 shows an example in which the viewpoints are arranged vertically; the images are shot with the cameras lined up in the vertical direction. FIGS. 18 and 19 show examples in which the viewpoints are arranged in two dimensions, horizontally and vertically; the images are shot with the cameras arranged in both the horizontal and vertical directions. The values of the viewpoint number v must correspond one-to-one with the viewpoints, and each must be an integer of 0 or more and less than the number of viewpoints V, but the numbers may be assigned in any order as long as the encoding side and the decoding side remain consistent, for example through camera parameters.
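The constraint on v can be checked mechanically. The sketch below assumes a hypothetical 2x2 camera arrangement and a hypothetical helper `is_valid_view_numbering`; it only tests that the assigned numbers are exactly the integers 0 to V-1, each used once.

```python
def is_valid_view_numbering(view_numbers):
    """True if the numbers are a one-to-one assignment of 0 .. V-1."""
    return sorted(view_numbers) == list(range(len(view_numbers)))


# Hypothetical 2x2 arrangement: (row, column) position -> viewpoint number v.
grid_numbering = {(0, 0): 0, (0, 1): 1, (1, 0): 3, (1, 1): 2}

valid = is_valid_view_numbering(list(grid_numbering.values()))
duplicate = is_valid_view_numbering([0, 1, 1, 2])     # 1 is used twice
out_of_range = is_valid_view_numbering([0, 1, 2, 4])  # 4 is not less than V = 4
```

Any ordering that passes this check is acceptable under the rule above, provided the encoder and decoder agree on which position each number denotes.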

The multi-view image encoding and decoding processing described above can of course be implemented as transmission, storage, and reception apparatus using hardware, and it can also be implemented by firmware stored in a ROM (read-only memory), flash memory, or the like, or by software running on a computer or the like. The firmware program or software program may be provided recorded on a recording medium readable by a computer or the like, provided from a server over a wired or wireless network, or provided as a data broadcast of terrestrial or satellite digital broadcasting.

FIG. 1 is a block diagram of an example of a multi-view image encoding apparatus that generates the encoded bit string decoded by the decoding apparatus of the present invention.
FIG. 2 is a flowchart for explaining the multi-view image encoding processing of FIG. 1.
FIG. 3 is a diagram explaining an example in which values are assigned to the viewpoint number v that identifies the viewpoint of each image and to the number d that indicates the output order of the decoded image at each viewpoint.
FIG. 4 is a diagram explaining an example in which values are assigned to the decoded image output order number o of each image.
FIG. 5 is a diagram explaining an example in which the signals of the viewpoints are interleaved pixel by pixel.
FIG. 6 is a diagram explaining an example in which the signals of the viewpoints are interleaved in units of groups of pixels.
FIG. 7 is a diagram explaining an example in which the signals of the viewpoints are interleaved in units of pixel blocks such as 16x16 or 16x8 pixels.
FIG. 8 is a diagram explaining an example in which the signals of the viewpoints are interleaved in units of horizontal lines.
FIG. 9 is a diagram explaining an example in which the signals of the viewpoints are interleaved in a format that combines them into a single image.
FIG. 10 is a diagram explaining an example in which the signals of the viewpoints are interleaved in units of slices, each grouping a plurality of lines.
FIG. 11 is a diagram explaining an example in which the signals of the viewpoints are interleaved in units of images.
FIG. 12 is a diagram explaining an example in which the signals of the viewpoints are interleaved in units of groups of images.
FIG. 13 is a diagram explaining an example of the encoding order of the images and the reference relationships of motion compensation and disparity compensation.
FIG. 14 is a block diagram of an embodiment of the multi-view image decoding apparatus of the present invention.
FIG. 15 is a flowchart for explaining the multi-view image decoding processing of FIG. 14.
FIG. 16 is a diagram explaining an example in which the viewpoints are arranged horizontally.
FIG. 17 is a diagram explaining an example in which the viewpoints are arranged vertically.
FIG. 18 is a diagram explaining an example in which the viewpoints are arranged in two dimensions, horizontally and vertically.
FIG. 19 is a diagram explaining an example in which the viewpoints are arranged in two dimensions, horizontally and vertically.
FIG. 20 is a configuration diagram of an example of a conventional stereoscopic multi-view image encoding apparatus.
FIG. 21 is a configuration diagram of an example of a conventional stereoscopic image decoding apparatus.

Explanation of symbols

101 Encoding management unit
102 Decoded image output order number calculation unit
103 Rearrangement buffer
104 Motion/disparity compensation prediction unit
105 Coding mode determination unit
106 Residual signal calculation unit
107 Residual signal encoding unit
108 Residual signal decoding unit
109 Residual signal superimposing unit
110 Decoded image buffer
111 Encoded bit string generation unit
201 Encoded bit string decoding unit
202 Decoded image management information calculation unit
203 Motion/disparity compensation prediction unit
204 Prediction signal synthesis unit
205 Residual signal decoding unit
206 Residual signal superimposing unit
207 Decoded image buffer
208 Decoded image management unit
209 Decoded image output unit

Claims

A multi-view image decoding apparatus that decodes encoded data obtained by encoding a multi-view image signal, the multi-view image signal including image signals of respective viewpoints obtained at a plurality of set viewpoints, the image signal of each viewpoint being either an image signal obtained by actually shooting from that viewpoint or an image signal generated as if virtually shot from that viewpoint, the apparatus comprising:
decoding means for decoding the encoded data and generating a decoded multi-view image signal;
storage means for storing the decoded multi-view image signal in an image buffer as needed; and
output means for extracting the decoded multi-view image signal from the image buffer and outputting the decoded image signals of the respective viewpoints on independent channels, one channel for the decoded image signal of each viewpoint.

A multi-view image decoding apparatus that decodes encoded data obtained by encoding a multi-view image signal, the multi-view image signal including image signals of respective viewpoints obtained at a plurality of set viewpoints, the image signal of each viewpoint being either an image signal obtained by actually shooting from that viewpoint or an image signal generated as if virtually shot from that viewpoint, the apparatus comprising:
decoding means for decoding the encoded data and generating a decoded multi-view image signal;
storage means for storing the decoded multi-view image signal in an image buffer as needed; and
output means for extracting the decoded multi-view image signal from the image buffer, synchronizing the decoded image signals of the respective viewpoints with one another, interleaving the decoded image signals of the respective viewpoints, and outputting them serially on a single channel.

A multi-view image decoding apparatus that decodes encoded data obtained by encoding a multi-view image signal, the multi-view image signal including image signals of respective viewpoints obtained at a plurality of set viewpoints, the image signal of each viewpoint being either an image signal obtained by actually shooting from that viewpoint or an image signal generated as if virtually shot from that viewpoint, the apparatus comprising:
decoding means for decoding the encoded data, in which are encoded the multi-view image signal, the number of viewpoints V of the multi-view image signal, and a decoded image output order number o (an integer) that collectively indicates a number v identifying each viewpoint and a number d indicating the output order of the decoded image at each viewpoint, and for generating the decoded multi-view image signal, the number of viewpoints V of the multi-view image signal, and the decoded image output order number o;
calculating means for calculating the quotient (an integer) obtained by integer division of the decoded image output order number o by the decoded number of viewpoints V as the number d indicating the output order of the decoded image at each viewpoint, and for calculating the remainder of that division (an integer of 0 or more and less than V) as the number v identifying each viewpoint;
storage means for storing the decoded multi-view image signal in an image buffer; and
output means for extracting the decoded multi-view image signal from the image buffer according to the number d indicating the output order of the decoded image at each viewpoint and the number v identifying each viewpoint, both supplied from the calculating means, and for outputting the decoded image signals of the respective viewpoints constituting the decoded multi-view image signal in synchronization with one another.