JP2009159466A

JP2009159466A - Method, apparatus and program for decoding multi-viewpoint image

Info

Publication number: JP2009159466A
Application number: JP2007337315A
Authority: JP
Inventors: Hiroya Nakamura; 博哉中村
Original assignee: Victor Company of Japan Ltd
Current assignee: Victor Company of Japan Ltd
Priority date: 2007-12-27
Filing date: 2007-12-27
Publication date: 2009-07-16

Abstract

PROBLEM TO BE SOLVED: To generate an encoded bit stream by performing encoding while omitting redundant viewpoint dependent information on the encoding side, and to derive the omitted viewpoint dependent information when decoding the encoded bit stream.

SOLUTION: Because the syntax element whose index (i) is "0" is not encoded in the supplied encoded bit stream, a reference viewpoint number information decoding section 404 does not decode the syntax element whose index (i) is "0" and decodes the syntax elements whose index (i) is "1" or more. A reference viewpoint information decoding section 406 decodes a syntax element indicating the viewpoint ID of a viewpoint used as a reference for inter-viewpoint prediction. In that case, however, the reference viewpoint information decoding section 406 does not decode the syntax elements whose index (i) is "0" or "1" and decodes the syntax elements whose index (i) is "2" or more.

COPYRIGHT: (C)2009, JPO&INPIT

Description

The present invention relates to a multi-view image decoding method, a multi-view image decoding apparatus, and a multi-view image decoding program, and more particularly to a method, apparatus, and program for decoding multi-view image encoded data obtained by encoding multi-view images captured from different viewpoints.

<Video coding schemes>
Devices and systems compliant with coding schemes such as MPEG (Moving Picture Experts Group) are now in widespread use. These schemes handle moving images that are continuous along the time axis as digital signal information and, for efficient broadcasting, transmission, or storage of that information, compress it using motion-compensated prediction, which exploits redundancy in the time direction, and orthogonal transforms such as the discrete cosine transform, which exploit redundancy in the spatial direction.

The MPEG-2 Video (ISO/IEC 13818-2) coding scheme, established in 1995, is defined as a general-purpose video compression coding scheme. It supports interlaced scanned images as well as progressive scanned images, and HDTV (high definition) as well as SDTV (standard definition). It is widely used in applications such as storage media, including DVD (Digital Versatile Disk) optical discs and magnetic tape for digital VTRs conforming to the D-VHS (registered trademark) standard, and digital broadcasting.

In addition, the MPEG-4 Visual (ISO/IEC 14496-2) coding scheme, which targets higher coding efficiency for applications such as network transmission and portable terminals, was standardized and established as an international standard in 1998.

Furthermore, the joint technical committee (ISO/IEC) of the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC), together with the International Telecommunication Union Telecommunication Standardization Sector (ITU-T), formed the JVT (Joint Video Team). Through this collaboration, a coding scheme called MPEG-4 AVC/H.264 was established as an international standard in 2003 (it carries the standard number 14496-10 in ISO/IEC and H.264 in ITU-T; hereinafter it is referred to as the AVC/H.264 coding scheme). The AVC/H.264 coding scheme achieves higher coding efficiency than earlier coding schemes such as MPEG-2 Video and MPEG-4 Visual.

In P pictures (forward predictive coded pictures) of coding schemes such as MPEG-2 Video and MPEG-4 Visual, motion-compensated prediction was performed only from the immediately preceding I picture or P picture in display order. In the AVC/H.264 coding scheme, by contrast, P pictures and B pictures can use multiple pictures as reference pictures, and motion compensation can be performed by selecting the best of them for each block. In addition to pictures that precede in display order, already-encoded pictures that follow in display order can also be referenced. B pictures in coding schemes such as MPEG-2 Video and MPEG-4 Visual referenced one forward reference picture in display order, one backward reference picture, or both at the same time; when both were referenced, the average of the two pictures served as the predicted picture, and the difference data between the target picture and the predicted picture was encoded.

In the AVC/H.264 coding scheme, on the other hand, B pictures are not bound by the constraint of one forward and one backward picture in display order; any reference picture, forward or backward, can be referenced for prediction. A B picture can also itself be referenced as a reference picture. For temporal inter prediction (motion-compensated prediction) of P pictures and B pictures, reference picture lists are defined to specify which of the candidate reference pictures is actually referenced. Reference pictures are registered in a reference picture list and are identified by an index, called the reference index. Two lists are defined, reference picture list 0 and reference picture list 1. A P slice can perform inter prediction by referencing only reference pictures registered in reference picture list 0, whereas a B slice can perform inter prediction by referencing reference pictures registered in both reference picture list 0 and reference picture list 1. Prediction that references a reference picture registered in reference picture list 0 is called list 0 prediction, and prediction that references a reference picture registered in reference picture list 1 is called list 1 prediction.

Furthermore, whereas the coding mode was determined per picture in MPEG-2 Video and per video object plane (VOP) in MPEG-4, the AVC/H.264 coding scheme uses the slice as the coding unit, so that different slice types such as I slices, P slices, and B slices can be mixed within one picture.

Further, the AVC/H.264 coding scheme defines a VCL (Video Coding Layer), which encodes and decodes the video pixel signal (coding mode, motion vectors, DCT coefficients, etc.), and a NAL (Network Abstraction Layer).

A coded bit stream encoded with the AVC/H.264 coding scheme is composed of NAL units, each of which is one segment of the NAL. NAL units are either VCL NAL units, which contain data encoded by the VCL (coding mode, motion vectors, DCT coefficients, etc.), or non-VCL NAL units, which do not contain data generated by the VCL. Non-VCL NAL units include the SPS (sequence parameter set), which contains parameter information relating to the coding of the entire sequence; the PPS (picture parameter set), which contains parameter information relating to the coding of a picture; and SEI (supplemental enhancement information), which is not essential for decoding VCL-coded data.

The header (leading part) of each NAL unit contains a flag that always has the value "0" (forbidden_zero_bit), an identifier indicating whether the unit contains an SPS, a PPS, or a slice that serves as a reference picture (nal_ref_idc), and an identifier indicating the type of the NAL unit (nal_unit_type). nal_unit_type is specified to take a value from "1" to "5" for VCL NAL units; for non-VCL NAL units, for example, SEI has the value "6", SPS the value "7", and PPS the value "8". On the decoding side, the type of a NAL unit can be identified from the nal_unit_type identifier contained in its header.
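The header fields described above can be illustrated with a minimal Python sketch. The bit widths used here (1 bit for forbidden_zero_bit, 2 bits for nal_ref_idc, 5 bits for nal_unit_type, packed into a single byte) come from the AVC/H.264 specification and are not stated in this text; the names `NalHeader`, `parse_nal_header`, and `classify_nal_unit` are hypothetical helpers introduced only for illustration.

```python
from typing import NamedTuple


class NalHeader(NamedTuple):
    forbidden_zero_bit: int
    nal_ref_idc: int
    nal_unit_type: int


# Non-VCL types named in the text; other values are not covered by this sketch.
NON_VCL_NAMES = {6: "SEI", 7: "SPS", 8: "PPS"}


def parse_nal_header(first_byte: int) -> NalHeader:
    """Split the first byte of a NAL unit into its three header fields.

    Assumed layout (from the H.264 specification, not from this text):
    1 bit forbidden_zero_bit, 2 bits nal_ref_idc, 5 bits nal_unit_type.
    """
    if not 0 <= first_byte <= 0xFF:
        raise ValueError("expected a single byte value")
    return NalHeader(
        forbidden_zero_bit=(first_byte >> 7) & 0x01,
        nal_ref_idc=(first_byte >> 5) & 0x03,
        nal_unit_type=first_byte & 0x1F,
    )


def classify_nal_unit(header: NalHeader) -> str:
    """Return "VCL" for types 1 to 5, a name for SEI/SPS/PPS, else "other"."""
    if 1 <= header.nal_unit_type <= 5:
        return "VCL"
    return NON_VCL_NAMES.get(header.nal_unit_type, "other")
```

For instance, the byte 0x67 has nal_unit_type 7, so `classify_nal_unit(parse_nal_header(0x67))` yields "SPS".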

The basic coding unit in the AVC/H.264 coding scheme is the slice, obtained by dividing a picture, and each VCL NAL unit corresponds to one slice. A unit called the access unit, which groups several NAL units, is therefore defined, and one access unit contains one coded picture.

<Multi-view image coding schemes>
In binocular stereoscopic television, a left-eye image and a right-eye image captured from two different directions by two cameras are generated and displayed on the same screen to present a stereoscopic image. In this case, the left-eye image and the right-eye image are transmitted or recorded separately as independent images. This, however, requires roughly twice the amount of information of a single two-dimensional image.

A technique has therefore been proposed in which one of the left and right images is treated as the main image and the information of the other image (the sub-image) is compressed by a general compression coding method to reduce the amount of information (see, for example, Patent Document 1). In the stereoscopic television image transmission scheme described in Patent Document 1, the relative position with high correlation in the other image is found for each small region, and the positional shift (disparity vector) and the difference signal (prediction residual signal) are transmitted. The difference signal is also transmitted and recorded because, although an image close to the sub-image can be reconstructed from the main image and the disparity information (the amounts of shift and positional deviation), sub-image information that the main image lacks, such as parts hidden behind objects, cannot be reconstructed.

In 1996, a stereo image coding scheme called the Multi-view Profile was added (ISO/IEC 13818-2/AMD3) to the MPEG-2 Video (ISO/IEC 13818-2) coding scheme, the international standard for single-view image coding. The MPEG-2 Video Multi-view Profile is a two-layer coding scheme that encodes the left-eye image in the base layer and the right-eye image in the enhancement layer. In addition to motion-compensated prediction, which exploits redundancy in the time direction, and the discrete cosine transform, which exploits redundancy in the spatial direction, it compresses using disparity-compensated prediction, which exploits redundancy between viewpoints.

A technique has also been proposed for reducing the amount of information in multi-view images captured by three or more cameras by using motion-compensated prediction and disparity-compensated prediction (see, for example, Patent Document 2). The high-efficiency image coding scheme described in Patent Document 2 performs pattern matching against reference pictures from multiple viewpoints and selects the motion-compensated or disparity-compensated predicted image with the smallest error, thereby improving coding efficiency.

In addition, the JVT is working on the standardization of Multiview Video Coding (MVC; hereinafter referred to as the MVC scheme), which extends the AVC/H.264 coding scheme to multi-view images. At present, the latest published version is the draft standard JD4.0 (Joint Draft 4.0) (see, for example, Non-Patent Document 1). Like the MPEG-2 Video Multi-view Profile described above, the MVC scheme improves coding efficiency by incorporating prediction between viewpoints.

The reference dependencies between viewpoints, and between the coding target images that make up each viewpoint's images, when the image of each viewpoint of a multi-view image is encoded with the MVC scheme and the resulting coded bit stream is decoded, are described below using the case of eight viewpoints as an example.

FIG. 10 shows an example of the reference dependencies between images when a multi-view image consisting of eight viewpoints is encoded. The vertical axis represents the spatial dimension across viewpoints (referred to in this specification as the viewpoint direction), and the horizontal axis represents the time dimension in capture (display) order (referred to in this specification as the time direction). P(v, t) (viewpoint v = 0, 1, 2, ...; time t = 0, 1, 2, ...) denotes the image of viewpoint v at time t.

The image pointed to by the head of an arrow is the image to be encoded/decoded, and the image at the tail of the arrow is the reference picture referenced by temporal inter prediction (motion-compensated prediction) or inter-view prediction (disparity-compensated prediction) when that image is encoded/decoded. More specifically, the reference picture used for temporal inter prediction is the image at the tail of a horizontal arrow, and the reference picture used for inter-view prediction is the image at the tail of a vertical arrow.

Here, temporal inter prediction (motion-compensated prediction) is prediction that references an image at another time, and inter-view prediction (disparity-compensated prediction) is prediction that references an image of another viewpoint. Only images that precede in the encoding/decoding order in the time direction can be used as references for temporal inter prediction, and only images that precede in the encoding/decoding order in the viewpoint direction (the encoding/decoding order along the viewpoint dimension) can be used as references for inter-view prediction.

Suppose the encoding/decoding order of the viewpoints in the viewpoint direction is viewpoint 0, viewpoint 2, viewpoint 1, viewpoint 4, viewpoint 3, viewpoint 6, viewpoint 5, viewpoint 7. The images P(0, t) of viewpoint 0 are all encoded/decoded in the same way as in ordinary AVC/H.264, using temporal inter prediction (motion-compensated prediction) without referencing images of other viewpoints. Viewpoints other than viewpoint 0 (viewpoints 1 to 7) use inter-view prediction (disparity-compensated prediction), which predicts from the decoded images of other viewpoints. For example, image P(2, 0) of viewpoint 2 is encoded/decoded using inter-view prediction with the decoded image of image P(0, 0) of viewpoint 0, another viewpoint, as its reference picture.

Image P(1, 0) of viewpoint 1 is encoded/decoded using inter-view prediction with the decoded images of image P(0, 0) of viewpoint 0 and image P(2, 0) of viewpoint 2, both other viewpoints, as reference pictures. The images of all viewpoints at the same time t = 0 are encoded/decoded in the viewpoint order given above, P(0, 0), P(2, 0), P(1, 0), P(4, 0), P(3, 0), P(6, 0), P(5, 0), P(7, 0); the images of all viewpoints at t = 4 are then encoded in the same viewpoint order, P(0, 4), P(2, 4), P(1, 4), P(4, 4), P(3, 4), P(6, 4), P(5, 4), P(7, 4). Encoding/decoding of the images of each viewpoint at t = 2 follows.
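The picture ordering described above can be sketched in a few lines of Python. The viewpoint order and the time order t = 0, 4, 2 are taken from the text; the function name `coding_order` and the representation of P(v, t) as a `(v, t)` tuple are assumptions made only for this illustration.

```python
from typing import List, Sequence, Tuple


def coding_order(view_order: Sequence[int],
                 time_order: Sequence[int]) -> List[Tuple[int, int]]:
    """List pictures P(v, t) as (v, t) tuples in encoding/decoding order.

    All viewpoints of one time instant are processed, in viewpoint-direction
    order, before moving on to the next time instant.
    """
    return [(v, t) for t in time_order for v in view_order]


# Viewpoint order from the eight-viewpoint example in the text.
VIEW_ORDER = [0, 2, 1, 4, 3, 6, 5, 7]
# Time instants in the order the text walks through them (t = 0, then 4, then 2).
TIME_ORDER = [0, 4, 2]

order = coding_order(VIEW_ORDER, TIME_ORDER)
```

Under these assumptions the first eight entries of `order` are the t = 0 pictures in viewpoint order, followed by the eight t = 4 pictures and then the t = 2 pictures.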

Inter-view prediction is incorporated by extending the reference picture lists already defined in the AVC/H.264 scheme so that, in addition to the reference pictures used for temporal inter prediction (motion-compensated prediction), reference pictures used for inter-view prediction can also be registered.

Further, the MVC scheme has a mechanism for encoding, for the sequence as a whole, the number of viewpoints of the multi-view image being encoded, the encoding/decoding order in the viewpoint direction, and the reference dependencies between viewpoints introduced by inter-view prediction. This information is encoded by extending the SPS (sequence parameter set), the parameter set for sequence information. In addition, an improved version of the syntax structure of the SPS MVC extension defined in JD4.0 of the MVC scheme has been proposed in order to reduce the code amount (see Non-Patent Document 2). The syntax structure of this SPS MVC extension is described with reference to FIG. 22.

In FIG. 22, "num_views_minus1" is a parameter for encoding the number of viewpoints encoded in the coded bit stream; its value is the number of viewpoints minus 1. It is followed by "view_id[i]", which is encoded repeatedly, once per viewpoint, in the encoding/decoding order in the viewpoint direction. "view_id[i]" gives the viewpoint ID of the viewpoint whose position in that order is index i; that is, it is the viewpoint ID of the i-th viewpoint in the encoding/decoding order in the viewpoint direction. In this specification, array indices (subscripts) start from 0. For example, the first element of the array "view_id[i]" is "view_id[0]" and the next is "view_id[1]". Ordinals are likewise counted from zero: the viewpoint encoded/decoded first in the viewpoint direction is the 0th, and the viewpoint encoded/decoded next is the 1st.

The syntax elements that follow, "num_anchor_refs_l0[i]", "anchor_ref_l0[i][j]", "num_anchor_refs_l1[i]", "anchor_ref_l1[i][j]", "num_non_anchor_refs_l0[i]", "non_anchor_ref_l0[i][j]", "num_non_anchor_refs_l1[i]", and "non_anchor_ref_l1[i][j]", are viewpoint dependency information indicating the reference dependencies between viewpoints.

"num_anchor_refs_l0[i]" is the number of viewpoints that can be used as references for inter-view prediction in reference picture list 0 for anchor pictures of the i-th viewpoint in the encoding/decoding order in the viewpoint direction. One "num_anchor_refs_l0[i]" exists for each viewpoint.

An anchor picture is an image that can be decoded without referencing an image at a different display time as a reference picture. Only anchor pictures of other viewpoints at the same time can be used as reference pictures when an anchor picture is decoded; anchor pictures therefore cannot use temporal inter prediction. For example, when encoding with the reference dependencies shown in FIG. 10, P(0, 0), P(1, 0), P(2, 0), P(0, 4), P(1, 4), P(2, 4), and so on are anchor pictures.

The viewpoint that is 0th in the encoding/decoding order in the viewpoint direction (the viewpoint encoded/decoded first) is always encoded without inter-view prediction, that is, without referencing other viewpoints. For any viewpoint encoded without inter-view prediction, the number of viewpoints that can be referenced is expressed as 0. For the 0th viewpoint in the encoding/decoding order in the viewpoint direction, the number of viewpoints used as references for inter-view prediction is therefore always "0", so "num_anchor_refs_l0[0]" is omitted. Encoding accordingly starts from the 1st viewpoint in the encoding/decoding order in the viewpoint direction (the viewpoint encoded/decoded next), that is, from "num_anchor_refs_l0[i]" with index i equal to "1".

"anchor_ref_l0[i][j]" in FIG. 22 gives the viewpoint ID of the viewpoint used as the j-th inter-view prediction reference in reference picture list 0 for anchor pictures of the i-th viewpoint in the encoding/decoding order in the viewpoint direction. For each viewpoint, there are as many "anchor_ref_l0[i][j]" elements as the value of "num_anchor_refs_l0[i]".

"num_anchor_refs_l1[i]" is the number of viewpoints that can be used as references for inter-view prediction in reference picture list 1 for anchor pictures of the i-th viewpoint in the encoding/decoding order in the viewpoint direction. One "num_anchor_refs_l1[i]" exists for each viewpoint. For the same reason as above, the number of viewpoints that can be referenced in inter-view prediction is always 0 for the 0th viewpoint in the encoding/decoding order in the viewpoint direction, so "num_anchor_refs_l1[0]" is omitted. Encoding accordingly starts from the 1st viewpoint in the encoding/decoding order in the viewpoint direction (the viewpoint encoded/decoded next), that is, from "num_anchor_refs_l1[i]" with index i equal to 1. "anchor_ref_l1[i][j]" gives the viewpoint ID of the viewpoint used as the j-th inter-view prediction reference in reference picture list 1 for anchor pictures of the i-th viewpoint in the encoding/decoding order in the viewpoint direction. For each viewpoint, there are as many "anchor_ref_l1[i][j]" elements as the value of "num_anchor_refs_l1[i]".

"num_non_anchor_refs_l0[i]" is the number of viewpoints that can be used as references for inter-view prediction in reference picture list 0 for non-anchor pictures of the i-th viewpoint in the encoding/decoding order in the viewpoint direction. One "num_non_anchor_refs_l0[i]" exists for each viewpoint.

A non-anchor picture is any image other than an anchor picture. When a non-anchor picture is decoded, an image at a different display time can be referenced as a reference picture; in other words, temporal inter prediction can also be used. For example, in FIG. 10, P(0, 1), P(1, 1), P(2, 1), P(0, 2), P(1, 2), P(2, 2), and so on are non-anchor pictures. For the same reason as above, the number of viewpoints that can be referenced in inter-view prediction is always "0" for the 0th viewpoint in the encoding/decoding order in the viewpoint direction, so "num_non_anchor_refs_l0[0]" is omitted. Encoding accordingly starts from the 1st viewpoint in the encoding/decoding order in the viewpoint direction (the viewpoint encoded/decoded next), that is, from "num_non_anchor_refs_l0[i]" with index i equal to "1".

"non_anchor_ref_l0[i][j]" in FIG. 22 gives the viewpoint ID of the viewpoint used as the j-th inter-view prediction reference in reference picture list 0 for non-anchor pictures of the i-th viewpoint in the encoding/decoding order in the viewpoint direction. For each viewpoint, there are as many "non_anchor_ref_l0[i][j]" elements as the value of "num_non_anchor_refs_l0[i]".

"num_non_anchor_refs_l1[i]" is the number of viewpoints that can be used as references for inter-view prediction in reference picture list 1 for non-anchor pictures of the i-th viewpoint in the encoding/decoding order in the viewpoint direction. One "num_non_anchor_refs_l1[i]" exists for each viewpoint. For the same reason as above, the number of viewpoints that can be referenced in inter-view prediction is always "0" for the 0th viewpoint in the encoding/decoding order in the viewpoint direction, so "num_non_anchor_refs_l1[0]" is omitted. Encoding accordingly starts from the 1st viewpoint in the encoding/decoding order in the viewpoint direction (the viewpoint encoded/decoded next), that is, from "num_non_anchor_refs_l1[i]" with index i equal to "1".

Further, "non_anchor_ref_l1[i][j]" indicates the viewpoint ID of the viewpoint used as the j-th inter-view prediction reference in reference picture list 1 for non-anchor pictures of the i-th viewpoint in the encoding/decoding order in the viewpoint direction. For each viewpoint, there are as many "non_anchor_ref_l1[i][j]" elements as the value of "num_non_anchor_refs_l1[i]". Each syntax element is encoded as an unsigned value using exponential Golomb coding.

The exponential Golomb coding used here is a kind of universal coding that performs variable-length coding without a conversion table. An exponential Golomb code consists of a prefix, which is a run of "0" bits, followed by a single "1" bit, followed by a suffix, which is a string of "0" or "1" bits with the same number of bits as the prefix. With n the number of prefix bits and s the value of the suffix, the value ν of a bit string encoded with the unsigned exponential Golomb code is given by the following equation.

ν = 2ⁿ − 1 + s    (1)

FIG. 23 shows the relationship between bit strings encoded with the unsigned exponential Golomb code and their code numbers. For example, when the bit string to be decoded is "0001010", it starts with three consecutive "0" bits, so the number of prefix bits n is "3". Skipping the "1" that follows, the 3-bit suffix is "010", so the suffix value s is "2" in decimal. From equation (1), the code number ν of this bit string is therefore 9 (= 2³ − 1 + 2).
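The decoding procedure for the unsigned exponential Golomb code can be sketched in a few lines of Python. This is only an illustration of equation (1): the function name `decode_ue` and the use of a string of "0"/"1" characters as the bit string are assumptions made for readability, not part of the described apparatus.

```python
def decode_ue(bits, pos=0):
    """Decode one unsigned Exp-Golomb code from a string of '0'/'1' characters.

    Returns (value, next_pos), where next_pos is the index just past the code.
    """
    n = 0
    # Count the prefix: the run of '0' bits before the separator '1'.
    while pos + n < len(bits) and bits[pos + n] == "0":
        n += 1
    suffix_start = pos + n + 1          # skip the prefix and the single '1'
    suffix = bits[suffix_start:suffix_start + n]
    if pos + n >= len(bits) or len(suffix) != n:
        raise ValueError("truncated Exp-Golomb code")
    s = int(suffix, 2) if n else 0      # an empty suffix has value 0
    # Equation (1): v = 2^n - 1 + s
    return (1 << n) - 1 + s, suffix_start + n


value, next_pos = decode_ue("0001010")
print(value, next_pos)
```

For the bit string "0001010" used in the text, the prefix length is 3 and the suffix value is 2, so the sketch returns the code number 9 and reports that 7 bits were consumed.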

FIG. 24 shows an example of the syntax elements of the MVC extension of the SPS, and their values, when a multi-viewpoint image consisting of eight viewpoints is encoded with the reference dependencies shown in FIG. 10 according to the syntax structure of FIG. 22 defined in the MVC method. Here, the encoding/decoding order of the viewpoints in the viewpoint direction is viewpoint 0, viewpoint 2, viewpoint 1, viewpoint 4, viewpoint 3, viewpoint 6, viewpoint 5, viewpoint 7. The viewpoint ID of viewpoint 0 is "8", that of viewpoint 1 is "9", that of viewpoint 2 is "10", that of viewpoint 3 is "11", that of viewpoint 4 is "12", that of viewpoint 5 is "13", that of viewpoint 6 is "14", and that of viewpoint 7 is "15".

First, since the multi-viewpoint image shown in FIG. 10 has eight viewpoints, the value "7" is encoded for "num_views_minus1" with the unsigned exponential Golomb code. The resulting bit string is "0001000", which is 7 bits long. Next, for "view_id[0]", the value "8", which is the viewpoint ID of viewpoint 0, the 0th (first) viewpoint in the encoding/decoding order in the viewpoint direction, is encoded with the unsigned exponential Golomb code; the resulting bit string is "0001001", which is 7 bits long. Similarly, for "view_id[1]", the value "10", which is the viewpoint ID of viewpoint 2, the 1st viewpoint in that order (the one following the 0th), is encoded as the bit string "0001011", and for "view_id[2]", the value "9", which is the viewpoint ID of viewpoint 1, the 2nd viewpoint in that order, is encoded as the bit string "0001010". "view_id[3]" through "view_id[7]" are encoded in the same way.
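The bit strings quoted above can be reproduced with a short encoder sketch. The helper name `encode_ue` and the string representation of bits are assumptions for illustration; the viewpoint order and viewpoint IDs are the ones given in the text for FIG. 24.

```python
def encode_ue(value):
    """Unsigned Exp-Golomb code of a non-negative integer, as a '0'/'1' string."""
    if value < 0:
        raise ValueError("unsigned Exp-Golomb codes need value >= 0")
    code = bin(value + 1)[2:]               # binary of value + 1, starts with '1'
    return "0" * (len(code) - 1) + code     # prefix zeros, then '1' and the suffix


# Viewpoints in encoding/decoding order, with viewpoint ID = viewpoint number + 8.
coding_order = [0, 2, 1, 4, 3, 6, 5, 7]
view_ids = [v + 8 for v in coding_order]

# num_views_minus1 followed by view_id[0] .. view_id[7]
header_bits = encode_ue(len(view_ids) - 1) + "".join(encode_ue(v) for v in view_ids)

print(encode_ue(7), encode_ue(8), encode_ue(10), encode_ue(9))
print(len(header_bits))
```

The first four codes match the bit strings given for "num_views_minus1", "view_id[0]", "view_id[1]", and "view_id[2]". Note that viewpoint ID 15 needs a 9-bit code, since 15 + 1 no longer fits in four binary digits.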

Next, the syntax elements of the viewpoint dependency information are encoded. Viewpoint 0, which is the 0th (first) viewpoint in the encoding/decoding order in the viewpoint direction, never refers to another viewpoint, so "num_anchor_refs_l0[0]" and "num_anchor_refs_l1[0]" are not encoded. The anchor pictures of viewpoint 2, which follows viewpoint 0 in the encoding/decoding order in the viewpoint direction, are encoded with only viewpoint 0 as a reference picture in reference picture list 0 and without using reference picture list 1. Since one viewpoint is referred to in inter-view prediction in reference picture list 0, the value "1" is encoded for "num_anchor_refs_l0[1]", and for "anchor_ref_l0[1][0]" the value "8", which is the viewpoint ID of the referenced viewpoint 0, is encoded with the unsigned exponential Golomb code. The resulting bit string is "0001001", which is 7 bits long.

Next, the value "0" is encoded for "num_anchor_refs_l1[1]". The anchor pictures of viewpoint 1, which is encoded next, refer to viewpoint 0 as a reference picture in reference picture list 0 and to viewpoint 2 as a reference picture in reference picture list 1, so one viewpoint is referred to in inter-view prediction in each list. Accordingly, the value "1" is encoded for "num_anchor_refs_l0[2]", and the value "8", which is the viewpoint ID of the referenced viewpoint 0, is encoded for "anchor_ref_l0[2][0]". Likewise, the value "1" is encoded for "num_anchor_refs_l1[2]", and the value "10", which is the viewpoint ID of the referenced viewpoint 2, is encoded for "anchor_ref_l1[2][0]". The remaining syntax elements of the viewpoint dependency information are encoded in the same way.
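As a rough sketch of the conventional scheme described above, the anchor-picture dependency elements for the first three viewpoints in encoding/decoding order can be serialized as follows. The function name, the data layout, and the element order (count first, then viewpoint IDs, list 0 before list 1) are assumptions that follow the order of the description; FIG. 22 itself is not reproduced here.

```python
def encode_ue(value):
    """Unsigned Exp-Golomb code of a non-negative integer, as a '0'/'1' string."""
    code = bin(value + 1)[2:]
    return "0" * (len(code) - 1) + code


def write_anchor_deps_conventional(anchor_refs):
    """Serialize anchor dependency elements in the conventional way.

    anchor_refs[i] = (list0_view_ids, list1_view_ids) for coding-order index i.
    Nothing is written for index 0, because its counts are always 0.
    """
    pieces = []
    for i in range(1, len(anchor_refs)):
        for refs in anchor_refs[i]:                       # list 0, then list 1
            pieces.append(encode_ue(len(refs)))           # num_anchor_refs_lX[i]
            pieces.extend(encode_ue(v) for v in refs)     # anchor_ref_lX[i][j]
    return "".join(pieces)


# Values from the text: index 1 (viewpoint 2) refers to ID 8 in list 0 only;
# index 2 (viewpoint 1) refers to ID 8 in list 0 and ID 10 in list 1.
anchor_refs = [([], []), ([8], []), ([8], [10])]
bits = write_anchor_deps_conventional(anchor_refs)
print(bits, len(bits))
```

In this sketch the reference "anchor_ref_l0[1][0]" costs 7 bits even though, for the viewpoint at index 1, the only possible reference is the viewpoint at index 0.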

By encoding these parameters for the sequence as a whole on the encoding side, that is, the number of viewpoints and the viewpoint dependency information of each viewpoint, the decoding side can determine the reference dependencies of each viewpoint for the sequence as a whole. The reference dependency information of each viewpoint is used in decoding processes such as initialization of the reference picture lists for inter-view predicted pictures.

JP-A 61-144191
JP-A 6-98312
Joint Draft 4.0 on Multiview Video Coding, Joint Video Team of ISO/IEC MPEG & ITU-T VCEG, JVT-X209, July 2007
Comments on MVC high level syntax, J.H.Yang et al., Joint Video Team of ISO/IEC MPEG & ITU-T VCEG, JVT-Y061, October 2007

In the MVC method, when a multi-viewpoint image with many viewpoints is encoded, coding efficiency can be improved further by adding inter-view prediction (disparity-compensated prediction), which exploits redundancy between viewpoints, to inter prediction in the temporal direction (motion-compensated prediction), which exploits temporal redundancy, and to orthogonal transforms, which exploit spatial redundancy.

For inter-view prediction, the MVC method has a mechanism for encoding, for the sequence as a whole, viewpoint dependency information that indicates the reference dependencies between viewpoints. This information is encoded by extending the SPS (sequence parameter set), which is the parameter set for sequence information. However, since the SPS carries important parameters for the entire sequence, its code amount should be reduced as far as possible while still providing the required functions.

The present invention has been made in view of the above points. Its object is to provide a multi-viewpoint image decoding method, a multi-viewpoint image decoding apparatus, and a multi-viewpoint image decoding program that, when decoding an encoded bit string generated on the encoding side by omitting redundant viewpoint dependency information, derive the viewpoint dependency information whose encoding was omitted.

To achieve the above object, the first invention is a multi-viewpoint image decoding method for decoding encoded data to be decoded in an encoded bit string obtained by encoding a multi-viewpoint image signal, the multi-viewpoint image signal including image signals of respective viewpoints obtained at a plurality of set viewpoints, the image signal of one viewpoint being either an image signal obtained by actually shooting from that viewpoint or an image signal generated as if shot virtually from that viewpoint, wherein
the encoded data to be decoded includes: first encoded data obtained by encoding information specifying the encoding/decoding order among the plurality of viewpoints used in encoding the image signals of the respective viewpoints; second encoded data obtained by encoding, for each of the 1st and subsequent viewpoints, information on the number (0 or more) of viewpoints that can be referred to at the i-th viewpoint (where i is a natural number) in inter-view prediction, which refers to decoded image signals of other viewpoints when encoding the image signal of each viewpoint, where the viewpoint encoded first in the encoding/decoding order among the plurality of viewpoints is taken as the 0th viewpoint and the viewpoint encoded next as the 1st viewpoint, and where 0 indicates that no viewpoint can be referred to in inter-view prediction and that the viewpoint is encoded without inter-view prediction; and third encoded data obtained by encoding, for each viewpoint encoded j-th in the encoding/decoding order (where j is a natural number of 2 or more) for which the number of viewpoints that can be referred to in inter-view prediction is 1 or more, information specifying the viewpoints referred to in inter-view prediction at that viewpoint, and
the method comprises: a first step of decoding the first encoded data to obtain the information specifying the encoding/decoding order among the plurality of viewpoints; a second step of decoding the second encoded data to obtain the information on the number of viewpoints that can be referred to in inter-view prediction; a third step of, when the number of viewpoints that can be referred to in inter-view prediction at the 1st viewpoint, obtained by decoding in the second step, is 1, setting the information specifying the viewpoint referred to in inter-view prediction to the information specifying the 0th viewpoint, obtained from the information specifying the encoding/decoding order among the plurality of viewpoints decoded in the first step; and a fourth step of decoding the third encoded data for each such viewpoint to obtain the information specifying the viewpoints referred to in inter-view prediction.

To achieve the above object, the second invention is a multi-viewpoint image decoding apparatus for decoding encoded data to be decoded in an encoded bit string obtained by encoding a multi-viewpoint image signal, the multi-viewpoint image signal including image signals of respective viewpoints obtained at a plurality of set viewpoints, the image signal of one viewpoint being either an image signal obtained by actually shooting from that viewpoint or an image signal generated as if shot virtually from that viewpoint, wherein
the encoded data to be decoded includes: first encoded data obtained by encoding information specifying the encoding/decoding order among the plurality of viewpoints used in encoding the image signals of the respective viewpoints; second encoded data obtained by encoding, for each of the 1st and subsequent viewpoints, information on the number (0 or more) of viewpoints that can be referred to at the i-th viewpoint (where i is a natural number) in inter-view prediction, which refers to decoded image signals of other viewpoints when encoding the image signal of each viewpoint, where the viewpoint encoded first in the encoding/decoding order among the plurality of viewpoints is taken as the 0th viewpoint and the viewpoint encoded next as the 1st viewpoint, and where 0 indicates that no viewpoint can be referred to in inter-view prediction and that the viewpoint is encoded without inter-view prediction; and third encoded data obtained by encoding, for each viewpoint encoded j-th in the encoding/decoding order (where j is a natural number of 2 or more) for which the number of viewpoints that can be referred to in inter-view prediction is 1 or more, information specifying the viewpoints referred to in inter-view prediction at that viewpoint, and
the apparatus comprises: first decoding means for decoding the first encoded data to obtain the information specifying the encoding/decoding order among the plurality of viewpoints; second decoding means for decoding the second encoded data to obtain the information on the number of viewpoints that can be referred to in inter-view prediction; third decoding means for, when the number of viewpoints that can be referred to in inter-view prediction at the 1st viewpoint, obtained by decoding in the second step, is 1, setting the information specifying the viewpoint referred to in inter-view prediction to the information specifying the 0th viewpoint, obtained from the information specifying the encoding/decoding order among the plurality of viewpoints decoded in the first step; and fourth decoding means for decoding the third encoded data for each such viewpoint to obtain the information specifying the viewpoints referred to in inter-view prediction.

Further, to achieve the above object, the third invention is a multi-viewpoint image decoding program that causes a computer to execute the steps of the first invention to decode the same encoded data as in the first invention.

According to the present invention, when a multi-viewpoint image is decoded, the information specifying the viewpoint used as the inter-view prediction reference for the 1st viewpoint in the encoding/decoding order in the viewpoint direction can be derived from an encoded bit string in which that information was omitted, and the multi-viewpoint image can then be decoded. The code amount of the SPS (sequence parameter set) encoded bit string generated on the encoding side can therefore be reduced.

Also, according to the present invention, reducing the code amount of the encoded bit string not only reduces the amount of data transmitted and the amount of data recorded on a storage medium or the like, but also reduces the amount of encoding/decoding processing and improves the error resilience of the SPS.

Embodiments of the present invention are described below with reference to the drawings.

(Multi-viewpoint image encoding apparatus and multi-viewpoint image encoding method)
First, the multi-viewpoint image encoding method and multi-viewpoint image encoding apparatus that generate the encoded bit string decoded by the multi-viewpoint image decoding method, multi-viewpoint image decoding apparatus, and multi-viewpoint image decoding program of the present invention are described.

FIG. 1 is a block diagram of an example of the multi-viewpoint image encoding apparatus. As shown in the figure, the apparatus includes an encoding management unit 101, a sequence information encoding unit 102, a picture information encoding unit 103, an image signal encoding unit 104, and a multiplexing unit 105; it encodes the input multi-viewpoint image signal and outputs encoded data (an encoded bit string). The multi-viewpoint image signal includes image signals of respective viewpoints obtained at two, or three or more, set viewpoints. The image signal of one viewpoint is either an image signal obtained by actually shooting from that viewpoint or an image signal generated as if shot virtually from that viewpoint.

The multi-viewpoint image encoding apparatus is described here as an apparatus based on the MVC method, which extends the AVC/H.264 coding method to multi-viewpoint images.

First, the syntax structure of the encoded bit string generated by the multi-viewpoint image encoding apparatus of FIG. 1 is described. FIG. 11 shows the syntax structure of the MVC extension of the SPS according to the present invention.

In the syntax structure shown in FIG. 11, as in the conventional example shown in FIG. 22, the syntax element "num_views_minus1", which indicates the number of viewpoints encoded in the encoded bit string minus 1, is first encoded with the unsigned integer exponential Golomb code. Then the syntax element "view_id[i]", which indicates the viewpoint ID of the i-th viewpoint in the encoding/decoding order in the viewpoint direction, is encoded repeatedly for each viewpoint, consecutively in that order.

The syntax elements that follow, "num_anchor_refs_l0[i]", "anchor_ref_l0[i][j]", "num_anchor_refs_l1[i]", "anchor_ref_l1[i][j]", "num_non_anchor_refs_l0[i]", "non_anchor_ref_l0[i][j]", "num_non_anchor_refs_l1[i]", and "non_anchor_ref_l1[i][j]", are the viewpoint dependency information indicating the reference dependencies between viewpoints. "num_anchor_refs_l0[i]", "num_anchor_refs_l1[i]", "num_non_anchor_refs_l0[i]", and "num_non_anchor_refs_l1[i]" are encoded in the encoding/decoding order for each viewpoint except the one with index i equal to "0".

In the multi-viewpoint image encoding method, apparatus, and program, and in the multi-viewpoint image decoding method, apparatus, and program of the present invention described later, it is defined that, for the 1st viewpoint in the encoding/decoding order in the viewpoint direction, only the decoded image of the 0th viewpoint can be used as an inter-view prediction reference.

Further, when the number of viewpoints that can be used as inter-view prediction references in reference picture list 0 for anchor pictures of the 1st viewpoint in the encoding/decoding order in the viewpoint direction, that is, "num_anchor_refs_l0[1]", is "1", the viewpoint ID of the viewpoint used as the inter-view prediction reference in that list, that is, the value of "anchor_ref_l0[1][0]", is defined to be the viewpoint ID of the 0th viewpoint in the encoding order in the viewpoint direction, that is, the value of "view_id[0]".

Similarly, when the number of viewpoints that can be used as inter-view prediction references in reference picture list 1 for anchor pictures of the 1st viewpoint in the encoding/decoding order in the viewpoint direction, that is, "num_anchor_refs_l1[1]", is "1", the viewpoint ID of the viewpoint used as the inter-view prediction reference in that list, that is, the value of "anchor_ref_l1[1][0]", is defined to be the viewpoint ID of the 0th viewpoint in the encoding order in the viewpoint direction, that is, the value of "view_id[0]".

Similarly, when the number of viewpoints that can be used as inter-view prediction references in reference picture list 0 for non-anchor pictures of the 1st viewpoint in the encoding/decoding order in the viewpoint direction, that is, "num_non_anchor_refs_l0[1]", is "1", the viewpoint ID of the viewpoint used as the inter-view prediction reference in that list, that is, the value of "non_anchor_ref_l0[1][0]", is defined to be the viewpoint ID of the 0th viewpoint in the encoding order in the viewpoint direction, that is, the value of "view_id[0]".

Similarly, when the number of viewpoints that can be used as inter-view prediction references in reference picture list 1 for non-anchor pictures of the 1st viewpoint in the encoding/decoding order in the viewpoint direction, that is, "num_non_anchor_refs_l1[1]", is "1", the viewpoint ID of the viewpoint used as the inter-view prediction reference in that list, that is, the value of "non_anchor_ref_l1[1][0]", is defined to be the viewpoint ID of the 0th viewpoint in the encoding order in the viewpoint direction, that is, the value of "view_id[0]".

With these rules defined on both the encoding side and the decoding side, the values of the syntax elements with index i equal to "1", namely "anchor_ref_l0[1][0]", "anchor_ref_l1[1][0]", "non_anchor_ref_l0[1][0]", and "non_anchor_ref_l1[1][0]", can be derived on the decoding side even if the encoding side does not encode them. In the present invention, therefore, the encoding and decoding of "anchor_ref_l0[1][0]", "anchor_ref_l1[1][0]", "non_anchor_ref_l0[1][0]", and "non_anchor_ref_l1[1][0]" are always omitted.

In the present invention, the structure is such that "anchor_ref_l0[i][j]", "anchor_ref_l1[i][j]", "non_anchor_ref_l0[i][j]", and "non_anchor_ref_l1[i][j]" are encoded only for syntax elements with index i greater than "1", that is, only for the 2nd and subsequent viewpoints in the encoding/decoding order in the viewpoint direction, in numbers given by the respective values of "num_anchor_refs_l0[i]", "num_anchor_refs_l1[i]", "num_non_anchor_refs_l0[i]", and "num_non_anchor_refs_l1[i]".
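The syntax described above, with counts written from index 1 and reference IDs written from index 2, can be sketched as a writer/parser pair. All function names and the dictionary layout are assumptions for illustration, as is the element order (anchor elements for all viewpoints, then non-anchor elements; list 0 before list 1); the non-anchor values in the example are hypothetical, since the values of FIG. 12 are not quoted in the text. The sketch does not guard against truncated input.

```python
def encode_ue(value):
    """Unsigned Exp-Golomb code of a non-negative integer, as a '0'/'1' string."""
    code = bin(value + 1)[2:]
    return "0" * (len(code) - 1) + code


def decode_ue(bits, pos):
    """Decode one unsigned Exp-Golomb code at bits[pos:]; return (value, next_pos)."""
    n = 0
    while bits[pos + n] == "0":
        n += 1
    end = pos + 2 * n + 1
    return int(bits[pos + n:end], 2) - 1, end


KINDS = ("anchor", "non_anchor")


def write_sps_mvc_ext(view_ids, deps):
    """deps[kind][i] = (list0_ids, list1_ids) for coding-order index i."""
    out = [encode_ue(len(view_ids) - 1)] + [encode_ue(v) for v in view_ids]
    for kind in KINDS:
        for i in range(1, len(view_ids)):          # counts for i = 0 are omitted
            for refs in deps[kind][i]:
                out.append(encode_ue(len(refs)))
                if i >= 2:                         # reference IDs for i = 1 are omitted
                    out.extend(encode_ue(v) for v in refs)
    return "".join(out)


def parse_sps_mvc_ext(bits):
    pos = 0
    num_views_minus1, pos = decode_ue(bits, pos)
    view_ids = []
    for _ in range(num_views_minus1 + 1):
        v, pos = decode_ue(bits, pos)
        view_ids.append(v)
    deps = {}
    for kind in KINDS:
        per_view = [([], [])]                      # i = 0: no inter-view references
        for i in range(1, len(view_ids)):
            lists = []
            for _lst in range(2):                  # list 0, then list 1
                count, pos = decode_ue(bits, pos)
                if i == 1:
                    if count > 1:
                        raise ValueError("count for index 1 must be 0 or 1")
                    refs = [view_ids[0]] * count   # derived from view_id[0], not read
                else:
                    refs = []
                    for _k in range(count):
                        v, pos = decode_ue(bits, pos)
                        refs.append(v)
                lists.append(refs)
            per_view.append(tuple(lists))
        deps[kind] = per_view
    return view_ids, deps


view_ids = [8, 10, 9]
deps = {
    "anchor": [([], []), ([8], []), ([8], [10])],       # values from the text
    "non_anchor": [([], []), ([], []), ([8], [10])],    # hypothetical values
}
bits = write_sps_mvc_ext(view_ids, deps)
parsed = parse_sps_mvc_ext(bits)
print(parsed == (view_ids, deps), len(bits))
```

The round trip recovers "anchor_ref_l0[1][0]" = 8 even though no bits were spent on it, which is the derivation the decoding side performs.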

If only viewpoints that precede in the encoding/decoding order in the viewpoint direction can be used as inter-view prediction references, then only the decoded image of the 0th viewpoint can be used as an inter-view prediction reference when the image of the 1st viewpoint in that order is encoded. For this viewpoint (index i equal to "1"), "num_anchor_refs_l0[i]", "num_anchor_refs_l1[i]", "num_non_anchor_refs_l0[i]", and "num_non_anchor_refs_l1[i]" can therefore take only the value "0" or "1" and cannot take a value of 2 or more. Consequently, with the above rules of the present invention applied, the same prediction structures as in the conventional example can still be set and encoded even if the encoding and decoding of the syntax elements with index i equal to "1", namely "anchor_ref_l0[1][0]", "anchor_ref_l1[1][0]", "non_anchor_ref_l0[1][0]", and "non_anchor_ref_l1[1][0]", are always omitted.

Moreover, always omitting the encoding and decoding of the syntax elements with index i equal to "1", namely "anchor_ref_l0[1][0]", "anchor_ref_l1[1][0]", "non_anchor_ref_l0[1][0]", and "non_anchor_ref_l1[1][0]", reduces the code amount of the SPS (sequence parameter set) encoded bit string. Reducing the code amount of the encoded bit string not only reduces the amount of data transmitted and the amount of data recorded on a storage medium or the like, but also reduces the amount of encoding/decoding processing.
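The size of this reduction can be estimated with a small calculation: each omitted reference would have cost one unsigned exponential Golomb code of "view_id[0]". The helper names below are assumptions, and the non-anchor counts in the example are hypothetical; only the anchor counts for the viewpoint at index 1 (list 0: 1, list 1: 0) come from the example in the text.

```python
def ue_len(value):
    """Length in bits of the unsigned Exp-Golomb code of a non-negative integer."""
    return 2 * (value + 1).bit_length() - 1


def bits_saved(view_id0, counts_for_index1):
    """Bits saved by not writing the reference IDs of the viewpoint at index 1.

    counts_for_index1 holds the four counts num_anchor_refs_l0[1],
    num_anchor_refs_l1[1], num_non_anchor_refs_l0[1], num_non_anchor_refs_l1[1],
    each 0 or 1. Every count of 1 corresponds to one omitted code of view_id[0].
    """
    return sum(counts_for_index1) * ue_len(view_id0)


# view_id[0] = 8; anchor counts from the text, non-anchor counts assumed to be 0.
print(bits_saved(8, [1, 0, 0, 0]))
# Upper bound for the same viewpoint ID: all four lists refer to the 0th viewpoint.
print(bits_saved(8, [1, 1, 1, 1]))
```

The saving grows with the magnitude of "view_id[0]", because larger viewpoint IDs need longer exponential Golomb codes.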

On the decoding side of the present invention, as described later, when a multi-viewpoint image is decoded, the information specifying the viewpoint used as the inter-view prediction reference for the 1st viewpoint in the encoding/decoding order in the viewpoint direction can be derived from an encoded bit string in which that information was omitted, and the multi-viewpoint image can then be decoded. This has the effect that the code amount of the SPS (sequence parameter set) encoded bit string generated on the encoding side can be reduced.

In particular, because the SPS is parameter information concerning the encoding of the entire sequence, it is indispensable for decoding any other data in the encoded bit string that belongs to that sequence, and it is the most important information in the encoded bit string. The SPS is sometimes decoded on its own to obtain system information about the encoded bit string, and it must be decoded first before picture information or slice header information can be decoded, so the reduction in processing load matters more for the SPS than for other data. In addition, depending on the system, the SPS may be transmitted separately from the VCL NAL units, which are the encoded bit strings carrying the coding mode, motion/disparity vectors, coded residual signals, and so on. In that case too, reducing the code amount and processing load of the SPS has a greater effect than it would for other data.

Reducing the amount of code generated for the SPS encoded bit string also lowers the risk that an error occurs in the SPS encoded bit string during transmission, which improves error resilience. Because the SPS is parameter information concerning the encoding of the entire sequence, an error in its encoded bit string has a fatal effect on the whole sequence, so the improved SPS error resilience of this embodiment is a very significant benefit.

FIG. 12 shows an example of the syntax elements of the MVC extension part of the SPS, and their values, when an 8-viewpoint multi-view image is encoded with the reference dependency shown in FIG. 10 according to the syntax structure shown in FIG. 11. As in the conventional example shown in FIG. 24, the viewpoint-direction encoding/decoding order is viewpoint 0, viewpoint 2, viewpoint 1, viewpoint 4, viewpoint 3, viewpoint 6, viewpoint 5, viewpoint 7. The viewpoint ID is "8" for viewpoint 0, "9" for viewpoint 1, "10" for viewpoint 2, "11" for viewpoint 3, "12" for viewpoint 4, "13" for viewpoint 5, "14" for viewpoint 6, and "15" for viewpoint 7.

Comparing the syntax structure shown in FIG. 12 with the conventional syntax structure shown in FIG. 24, in both the conventional example and this embodiment the syntax element "num_anchor_refs_l0[1]", which indicates the number of viewpoints usable as inter-view prediction references in reference picture list 0 for anchor pictures of the 1st viewpoint in the viewpoint-direction encoding/decoding order, has the value "1".

In the conventional syntax structure shown in FIG. 24, however, the syntax element "anchor_ref_l0[1][0]", which indicates the viewpoint ID of the viewpoint used as the 0th inter-view prediction reference in reference picture list 0 for anchor pictures of the 1st viewpoint in the viewpoint-direction encoding/decoding order, is encoded with the value "8", the viewpoint ID of the referenced viewpoint 0, using an unsigned exponential-Golomb code. The resulting bit string is "0001001", which is 7 bits long. In the syntax structure shown in FIG. 12, by contrast, the encoding of "anchor_ref_l0[1][0]" is omitted. The decoding side derives the value of the omitted "anchor_ref_l0[1][0]".
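The 7-bit figure can be checked with a short Python sketch of the unsigned exponential-Golomb code used in AVC/H.264. The function name `ue` and the string-of-bits return type are choices made here for illustration only.

```python
def ue(value: int) -> str:
    """Return the unsigned exponential-Golomb code of value as a bit string."""
    if value < 0:
        raise ValueError("ue(v) is defined for non-negative integers only")
    info = bin(value + 1)[2:]          # binary representation of value + 1
    prefix = "0" * (len(info) - 1)     # one leading zero per bit after the first
    return prefix + info


code = ue(8)            # viewpoint ID "8" of viewpoint 0
print(code, len(code))  # prints: 0001001 7
```

Each omitted syntax element with the value "8" therefore saves 7 bits in the SPS in this example.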

Therefore, with the syntax structure shown in FIG. 12, the code amount of the SPS encoded bit string generated by encoding is smaller than in the conventional example, and the syntax is more streamlined.

Next, the operation of the multi-view image encoding apparatus of FIG. 1 is described. In FIG. 1, the encoding management unit 101 first computes new parameters as needed from the externally set encoding parameters, and manages encoding-related information including the parameter information for the entire sequence (SPS), the parameter information for pictures (PPS), and the header information for picture slices (slice headers). The encoding management unit 101 also manages the reference dependencies and the encoding/decoding order of the images to be encoded.

For the reference dependencies, it manages, per viewpoint, whether decoded images of other viewpoints are referenced. Per picture or slice, it manages whether inter-view prediction (disparity-compensated prediction), which uses a decoded image of another viewpoint as a reference image, is performed when the target image is encoded; whether the decoded image obtained by encoding and then decoding the target image is used as a reference image when target images of other viewpoints are encoded; and which of the candidate reference images is referenced. For the encoding/decoding order, it manages the order so that, given these reference dependencies, the decoding side can start decoding an image of the encoded bit string only after the reference images that the image refers to have been decoded.

Next, the sequence information encoding unit 102 encodes the parameter information for the entire sequence (SPS) managed by the encoding management unit 101. Here, the MVC extension part of the SPS is also encoded, according to the syntax structure shown in FIG. 11.

FIG. 2 is a block diagram of an example of the sequence information encoding unit 102. As shown in FIG. 2, the sequence information encoding unit 102 comprises a non-MVC-extension sequence information encoding unit 201, a viewpoint number information encoding unit 202, an encoding order information encoding unit 203, a reference viewpoint number information encoding unit 204, and a reference viewpoint information encoding unit 205. The reference viewpoint number information encoding unit 204 and the reference viewpoint information encoding unit 205 together constitute a viewpoint dependency information encoding unit 206. The viewpoint number information encoding unit 202 and the encoding order information encoding unit 203, together with the viewpoint dependency information encoding unit 206, constitute an SPS MVC extension part encoding unit 207.

The non-MVC-extension sequence information encoding unit 201 encodes the sequence information other than the MVC extension part, that is, the SPS (sequence parameter set) of the AVC/H.264 scheme. The SPS MVC extension part encoding unit 207, which consists of the viewpoint number information encoding unit 202, the encoding order information encoding unit 203, the reference viewpoint number information encoding unit 204, and the reference viewpoint information encoding unit 205, encodes the MVC extension part of the information for the entire sequence (SPS) according to the syntax structure shown in FIG. 11.

First, the viewpoint number information encoding unit 202 encodes the syntax element "num_views_minus1", whose value is the number of viewpoints encoded in the encoded bit string minus "1". Next, the encoding order information encoding unit 203 encodes the syntax elements "view_id[i]", each indicating the viewpoint ID of the i-th viewpoint in the viewpoint-direction encoding/decoding order, in that order.

Next, the reference viewpoint number information encoding unit 204 and the reference viewpoint information encoding unit 205, which constitute the viewpoint dependency information encoding unit 206, encode the viewpoint dependency information. The viewpoint dependency information encoded here consists of the syntax elements described above: "num_anchor_refs_l0[i]", "anchor_ref_l0[i][j]", "num_anchor_refs_l1[i]", "anchor_ref_l1[i][j]", "num_non_anchor_refs_l0[i]", "non_anchor_ref_l0[i][j]", "num_non_anchor_refs_l1[i]", and "non_anchor_ref_l1[i][j]".

Of the viewpoint dependency information, the reference viewpoint number information encoding unit 204 encodes the syntax elements "num_anchor_refs_l0[i]", "num_anchor_refs_l1[i]", "num_non_anchor_refs_l0[i]", and "num_non_anchor_refs_l1[i]", which indicate the number of viewpoints usable as inter-view prediction references in reference picture list 0 and reference picture list 1 for the anchor pictures and non-anchor pictures of each viewpoint. As stated above, however, in this embodiment the reference viewpoint number information encoding unit 204 does not encode these syntax elements for index i equal to "0"; it encodes them only for index i of "1" or more.

The reference viewpoint information encoding unit 205 encodes the syntax elements "anchor_ref_l0[i][j]", "anchor_ref_l1[i][j]", "non_anchor_ref_l0[i][j]", and "non_anchor_ref_l1[i][j]", which indicate the viewpoint ID of the viewpoint used as the j-th inter-view prediction reference in reference picture list 0 and reference picture list 1 for the anchor pictures and non-anchor pictures of each viewpoint. As stated above, however, in this embodiment the reference viewpoint information encoding unit 205 does not encode these syntax elements for index i equal to "0" or "1"; it encodes them only for index i of "2" or more.

Returning to FIG. 1, the picture information encoding unit 103 encodes the picture-related information (PPS) managed by the encoding management unit 101. The image signal encoding unit 104 encodes, slice by slice, the slice-related information (slice header) managed by the encoding management unit 101 and the supplied image signal to be encoded. Inter-view prediction may be used when the image signal is encoded, in which case the reference image for inter-view prediction is selected on the basis of the viewpoint dependency information.

The multiplexing unit 105 takes the encoded bit string of the sequence information produced by the sequence information encoding unit 102, the encoded bit string of the picture information produced by the picture information encoding unit 103, and the encoded bit string of the slice information and image signal produced by the image signal encoding unit 104, adds to each the header information needed to handle it as a NAL unit, and multiplexes them into the encoded bit string of the multi-view image.

Next, the multi-view image encoding procedure performed by the multi-view image encoding apparatus shown in FIG. 1 is described with reference to the flowcharts of FIGS. 3 to 9. The processing in each step is the same as that described with the block diagrams of FIGS. 1 and 2, so only the procedure is described here, with each step related to the corresponding block of FIGS. 1 and 2.

First, the parameter information for encoding the entire sequence is encoded to generate its encoded bit string (step S101 in FIG. 3). The processing of step S101 corresponds to the encoding operation of the sequence information encoding unit 102 in the multi-view image encoding apparatus of FIG. 1.

An example of the sequence information encoding procedure of step S101 is described in more detail with the flowchart of FIG. 4. First, the sequence information other than the MVC extension part is encoded (step S111). The processing of step S111 corresponds to the encoding operation of the non-MVC-extension sequence information encoding unit 201 in FIG. 2.

Next, the MVC extension part of the information for the entire sequence (SPS) is encoded according to the syntax structure shown in FIG. 11 (steps S112 to S114). First, the information on the number of viewpoints encoded in the encoded bit string is encoded (step S112). The processing of step S112 corresponds to the encoding operation of the viewpoint number information encoding unit 202 in FIG. 2. Then the viewpoint ID of each viewpoint is encoded in the viewpoint-direction encoding/decoding order (step S113). The processing of step S113 corresponds to the encoding operation of the encoding order information encoding unit 203 in FIG. 2.

An example of the procedure of step S113, which encodes the viewpoint IDs in the viewpoint-direction encoding/decoding order, is described in more detail with the flowchart of FIG. 5. First, the index i indicating the position in the viewpoint-direction encoding/decoding order is set to "0" (step S121). Next, it is determined whether the value of index i is less than or equal to (number of viewpoints - 1) (step S122). If it is not, the encoding process of step S113 ends. If it is, the process proceeds to step S123, and steps S122 to S124 are repeated until the value of index i exceeds (number of viewpoints - 1). In step S123, the syntax element "view_id[i]", which indicates the viewpoint ID of the i-th viewpoint in the viewpoint-direction encoding/decoding order, is encoded. In step S124, "1" is added to index i and the process returns to step S122.
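The loop of FIG. 5 can be sketched in Python as follows. This is an illustrative sketch only: the function names `ue` and `encode_view_ids` are hypothetical, and it assumes that each "view_id[i]" is written with the same unsigned exponential-Golomb code that the text mentions for "anchor_ref_l0[1][0]".

```python
def ue(value: int) -> str:
    """Unsigned exponential-Golomb code of value, as a bit string."""
    info = bin(value + 1)[2:]
    return "0" * (len(info) - 1) + info


def encode_view_ids(view_id):
    """Encode view_id[i] for every viewpoint, in viewpoint-direction coding order."""
    num_views = len(view_id)
    bits = []
    i = 0                            # step S121
    while i <= num_views - 1:        # step S122
        bits.append(ue(view_id[i]))  # step S123
        i += 1                       # step S124
    return bits


# Coding order from the example: viewpoints 0, 2, 1, 4, 3, 6, 5, 7.
view_id = [8, 10, 9, 12, 11, 14, 13, 15]
encoded = encode_view_ids(view_id)
print(encoded[0], encoded[1])
```

The explicit `while` loop mirrors the flowchart steps one to one; an ordinary `for` loop over `view_id` would be equivalent.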

Returning to the flowchart of FIG. 4, after the processing of step S113 described above, the viewpoint dependency information is encoded in step S114. The processing of step S114 corresponds to the encoding operation of the viewpoint dependency information encoding unit 206 in FIG. 2, which consists of the reference viewpoint number information encoding unit 204 and the reference viewpoint information encoding unit 205.

An example of the procedure of step S114, which encodes the viewpoint dependency information, is described in more detail with the flowchart of FIG. 6. In step S114, the viewpoint dependency information of the anchor pictures is encoded first (step S131), followed by the viewpoint dependency information of the non-anchor pictures (step S132). When step S132 is complete, the viewpoint dependency information encoding process of FIG. 6 ends.

An example of the procedure of step S131, which encodes the viewpoint dependency information of the anchor pictures, is described in more detail with the flowchart of FIG. 7. The viewpoint with index i equal to "0", that is, the viewpoint encoded/decoded 0th (first) in the viewpoint-direction encoding/decoding order, is always encoded without inter-view prediction, in other words without referring to any other viewpoint. The number of viewpoints it uses as inter-view prediction references is therefore always "0", so in the process of FIG. 7 "num_anchor_refs_l0[0]" and "num_anchor_refs_l1[0]" are not encoded and their values are always taken to be "0". Accordingly, the index i indicating the position in the viewpoint-direction encoding/decoding order is set to "1" (step S141).

Next, it is determined whether the value of index i is less than or equal to (number of viewpoints - 1) (step S142). If it is not, the encoding of the anchor picture viewpoint dependency information ends. If it is, the process proceeds to step S143, and steps S142 to S155 are repeated until the value of index i exceeds (number of viewpoints - 1).

In step S143, the syntax element "num_anchor_refs_l0[i]" is encoded. It indicates the number of viewpoints usable as inter-view prediction references in reference picture list 0 for anchor pictures of the i-th viewpoint in the viewpoint-direction encoding/decoding order.

Next, in step S144, it is determined whether index i is greater than "1". If it is, the process proceeds to step S145, where index j is set to "0". It is then determined whether the value of index j is less than the value of "num_anchor_refs_l0[i]" (step S146). If it is, steps S146 to S148 are repeated until the value of index j reaches or exceeds the value of "num_anchor_refs_l0[i]".

In step S147, the syntax element "anchor_ref_l0[i][j]" is encoded, and the process proceeds to step S148. This element indicates the viewpoint ID of the viewpoint used as the j-th inter-view prediction reference in reference picture list 0 for anchor pictures of the i-th viewpoint in the viewpoint-direction encoding/decoding order. In step S148, "1" is added to index j and the process returns to step S146.

If, on the other hand, index i is not greater than "1" in step S144, that is, index i is "1", or if the value of index j is found in step S146 to be greater than or equal to the value of "num_anchor_refs_l0[i]", the process proceeds to step S149. In step S149, the syntax element "num_anchor_refs_l1[i]" is encoded. It indicates the number of viewpoints usable as inter-view prediction references in reference picture list 1 for anchor pictures of the i-th viewpoint in the viewpoint-direction encoding/decoding order.

Next, it is determined whether index i is greater than "1" (step S150). If it is, the process proceeds to step S151, where index j is set to "0". It is then determined whether the value of index j is less than the value of "num_anchor_refs_l1[i]" (step S152). If it is, steps S152 to S154 are repeated until the value of index j reaches or exceeds the value of "num_anchor_refs_l1[i]".

While the value of index j is less than the value of "num_anchor_refs_l1[i]", the syntax element "anchor_ref_l1[i][j]" is encoded (step S153). This element indicates the viewpoint ID of the viewpoint used as the j-th inter-view prediction reference in reference picture list 1 for anchor pictures of the i-th viewpoint in the viewpoint-direction encoding/decoding order. Then "1" is added to index j (step S154) and the process returns to step S152.

If, on the other hand, index i is not greater than "1" in step S150, that is, index i is "1", or if the value of index j is found in step S152 to be greater than or equal to the value of "num_anchor_refs_l1[i]", the process proceeds to step S155. In step S155, "1" is added to index i and the process returns to step S142.
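The flow of FIG. 7 (steps S141 to S155) can be summarized with the following Python sketch. It is an illustration under stated assumptions rather than the implementation of the embodiment: the function name `encode_anchor_dependency` and its list-based arguments are hypothetical, and the function returns the (name, value) pairs that would be written instead of entropy-coded bits. The sample data for three viewpoints is invented for the example.

```python
def encode_anchor_dependency(num_views, num_refs_l0, refs_l0, num_refs_l1, refs_l1):
    """List the anchor-picture dependency syntax elements that are actually encoded."""
    symbols = []
    i = 1                                                # S141: index 0 is never encoded
    while i <= num_views - 1:                            # S142
        symbols.append((f"num_anchor_refs_l0[{i}]", num_refs_l0[i]))       # S143
        if i > 1:                                        # S144: omitted when i == 1
            j = 0                                        # S145
            while j < num_refs_l0[i]:                    # S146
                symbols.append((f"anchor_ref_l0[{i}][{j}]", refs_l0[i][j]))  # S147
                j += 1                                   # S148
        symbols.append((f"num_anchor_refs_l1[{i}]", num_refs_l1[i]))       # S149
        if i > 1:                                        # S150: omitted when i == 1
            j = 0                                        # S151
            while j < num_refs_l1[i]:                    # S152
                symbols.append((f"anchor_ref_l1[{i}][{j}]", refs_l1[i][j]))  # S153
                j += 1                                   # S154
        i += 1                                           # S155
    return symbols


# Hypothetical 3-viewpoint example; viewpoint IDs in coding order are 8, 10, 9.
symbols = encode_anchor_dependency(
    num_views=3,
    num_refs_l0=[0, 1, 1], refs_l0=[[], [8], [8]],
    num_refs_l1=[0, 0, 1], refs_l1=[[], [], [10]],
)
for name, value in symbols:
    print(name, value)
```

In this example "anchor_ref_l0[1][0]" never appears in the output even though "num_anchor_refs_l0[1]" is "1", which is the omission the embodiment relies on.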

Next, an example of the procedure of step S132 in FIG. 6, which encodes the viewpoint dependency information of the non-anchor pictures, is described in more detail with the flowchart of FIG. 8. For the reason given above, in the process of FIG. 8 "num_non_anchor_refs_l0[0]" and "num_non_anchor_refs_l1[0]" are not encoded and their values are always taken to be "0". Accordingly, index i is set to "1" (step S161), and it is then determined whether the value of index i is less than or equal to (number of viewpoints - 1) (step S162).

If the value of index i is not less than or equal to (number of viewpoints - 1), the encoding of the non-anchor picture viewpoint dependency information ends. If it is, the process proceeds to step S163, and steps S162 to S175 are repeated until the value of index i exceeds (number of viewpoints - 1).

In step S163, the syntax element "num_non_anchor_refs_l0[i]" is encoded. It indicates the number of viewpoints usable as inter-view prediction references in reference picture list 0 for non-anchor pictures of the i-th viewpoint in the viewpoint-direction encoding/decoding order.

Next, in step S164, it is determined whether index i is greater than "1". If it is, the process proceeds to step S165, where index j is set to "0". It is then determined whether the value of index j is less than the value of "num_non_anchor_refs_l0[i]" (step S166). If it is, steps S166 to S168 are repeated until the value of index j reaches or exceeds the value of "num_non_anchor_refs_l0[i]".

In step S167, the syntax element "non_anchor_ref_l0[i][j]" is encoded, and the process proceeds to step S168. This element indicates the viewpoint ID of the viewpoint used as the j-th inter-view prediction reference in reference picture list 0 for non-anchor pictures of the i-th viewpoint in the viewpoint-direction encoding/decoding order. In step S168, "1" is added to index j and the process returns to step S166.

If, on the other hand, index i is not greater than "1" in step S164, that is, index i is "1", or if the value of index j is found in step S166 to be greater than or equal to the value of "num_non_anchor_refs_l0[i]", the process proceeds to step S169. In step S169, the syntax element "num_non_anchor_refs_l1[i]" is encoded. It indicates the number of viewpoints usable as inter-view prediction references in reference picture list 1 for non-anchor pictures of the i-th viewpoint in the viewpoint-direction encoding/decoding order.

Next, in step S170, it is determined whether index i is greater than "1". If it is, the process proceeds to step S171, where index j is set to "0". It is then determined whether the value of index j is less than the value of "num_non_anchor_refs_l1[i]" (step S172). If it is, steps S172 to S174 are repeated until the value of index j reaches or exceeds the value of "num_non_anchor_refs_l1[i]".

In step S173, the syntax element "non_anchor_ref_l1[i][j]" is encoded, and the process proceeds to step S174. This element indicates the viewpoint ID of the viewpoint used as the j-th inter-view prediction reference in reference picture list 1 for non-anchor pictures of the i-th viewpoint in the viewpoint-direction encoding/decoding order. In step S174, "1" is added to index j and the process returns to step S172.

If, on the other hand, index i is not greater than "1" in step S170, that is, index i is "1", or if the value of index j is found in step S172 to be greater than or equal to the value of "num_non_anchor_refs_l1[i]", the process proceeds to step S175. In step S175, "1" is added to index i and the process returns to step S162.

In the above description of the processing procedures of FIG. 7 and FIG. 8, the processing in steps S143 and S149 of FIG. 7 and steps S163 and S169 of FIG. 8 corresponds to the encoding operation of the reference viewpoint number information encoding unit 204 in FIG. 2, and the processing in steps S147 and S153 of FIG. 7 and steps S167 and S173 of FIG. 8 corresponds to the encoding operation of the reference viewpoint information encoding unit 205 in FIG. 2.

The description now returns to the flowchart of FIG. 4. When the processing in step S114 is completed, the sequence information encoding process of FIG. 4 ends.

The description now returns to the flowchart of FIG. 3. When the processing in step S101, described above with the flowcharts of FIGS. 4 to 8, is completed, the process proceeds to step S102. In step S102, the encoded bit string of the parameter information relating to the encoding of the entire sequence is multiplexed to obtain a multiplexed encoded bit string. The processing in step S102 corresponds to the multiplexing operation of the multiplexing unit 105 in the multi-view image encoding apparatus of FIG. 1.

In the next step S103, parameter information and the like relating to the encoding of a picture is encoded, and an encoded bit string of that parameter information is generated. The processing in step S103 corresponds to the encoding operation of the picture information encoding unit 103 in the multi-view image encoding apparatus of FIG. 1.

Subsequently, in step S104, the encoded bit string of the parameter information relating to the encoding of the picture is multiplexed to obtain a multiplexed encoded bit string. The processing in step S104 corresponds to the multiplexing operation of the multiplexing unit 105 in the multi-view image encoding apparatus of FIG. 1.

Subsequently, in step S105, the slice information and the image signal are encoded. The processing in step S105 corresponds to the processing operation of the image signal encoding unit 104 in the multi-view image encoding apparatus of FIG. 1.

Subsequently, in step S106, following the bit strings multiplexed in steps S102 and S104, the encoded bit strings of the decoded image output order number, the encoding mode, the motion vector or disparity vector, the encoded residual signal, and the like are multiplexed, as necessary, into one encoded bit string or into a plurality of encoded bit strings. The processing in step S106 corresponds to the multiplexing operation of the multiplexing unit 105 in the multi-view image encoding apparatus of FIG. 1.

After the multiplexing operation in step S106 ends, it is determined whether the encoding process has been completed for all images of the multi-view image to be encoded (step S107). If it has, this multi-view image encoding procedure ends. If it has not, the process returns to step S105, and the processing from step S105 to step S106 is repeated until the encoding process has been completed for all images of the multi-view image to be encoded.

Next, the multiplexing and transmission procedure of the multiplexing unit 105 for transmission via a network will be described with reference to the flowchart of FIG. 9. In FIG. 9, the multiplexing unit 105 takes the data obtained by multiplexing the encoded bit string of the sequence information, the encoded bit string of the picture information, and the encoded bit string of the slice information and image signal, and packetizes it as necessary in accordance with a standard such as the MPEG-2 Systems format, the MP4 file format, or RTP (step S181).

Subsequently, the multiplexing unit 105 adds a packet header to each packet as necessary, in accordance with a standard such as the MPEG-2 Systems format, the MP4 file format, or RTP (step S182), and then transmits the packets via the network (step S183).

(Multi-view image decoding apparatus and multi-view image decoding method)
Next, the multi-view image decoding method, multi-view image decoding apparatus, and multi-view image decoding program according to the present invention, which decode the encoded data generated by the multi-view image encoding method, multi-view image encoding apparatus, and multi-view image encoding program described with FIGS. 1 to 12, will be described with reference to the drawings.

FIG. 13 is a block diagram of an embodiment of the multi-view image decoding apparatus according to the present invention. As shown in FIG. 13, the multi-view image decoding apparatus of this embodiment includes a separation unit 301, a decoding management unit 302, a sequence information decoding unit 303, a picture information decoding unit 304, and an image signal decoding unit 305. It receives an encoded bit string obtained by encoding a multi-view image signal, decodes it, and outputs the multi-view image signal.

Next, the operation of the multi-view image decoding apparatus shown in FIG. 13 will be described in relation to the AVC/H.264 coding scheme. First, the separation unit 301 receives the encoded bit string that was encoded by the multi-view image encoding apparatus shown in FIG. 1 and transmitted via a network. The encoded bit string need not be supplied by network reception only: it may also be read from a storage medium such as a DVD, or received from a broadcast such as BS or terrestrial broadcasting.

The separation unit 301 also removes the packet headers from the supplied encoded bit string and separates it into NAL units. It then evaluates the identifier (nal_unit_type), contained in the header of each separated NAL unit, that distinguishes the type of the NAL unit. If the NAL unit is an encoded bit string carrying parameter information relating to the encoding of the entire sequence, it is supplied to the sequence information decoding unit 303. If it is an encoded bit string carrying parameter information and the like relating to the encoding of a picture, it is supplied to the picture information decoding unit 304. If it is a VCL NAL unit, that is, an encoded bit string carrying the encoding mode, motion/disparity vectors, the encoded residual signal, and the like, it is supplied to the image signal decoding unit 305.
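The routing performed by the separation unit 301 can be illustrated with a short sketch. The numeric nal_unit_type values below are taken from the published H.264/AVC specification rather than from this document, so they should be read as assumptions (the draft in force when this application was filed may have differed); the function name and the returned labels are likewise hypothetical.

```python
# Assumed nal_unit_type values (from the published H.264/AVC specification,
# not from this document).
SEQUENCE_TYPES = {7, 15}          # SPS, subset SPS
PICTURE_TYPES = {8}               # PPS
VCL_TYPES = {1, 2, 3, 4, 5, 20}   # coded slices, including the MVC slice extension


def route_nal_unit(nal: bytes) -> str:
    """Name the (hypothetical) destination unit for one NAL unit."""
    if not nal:
        raise ValueError("empty NAL unit")
    nal_unit_type = nal[0] & 0x1F  # low 5 bits of the first header byte
    if nal_unit_type in SEQUENCE_TYPES:
        return "sequence_information_decoding_unit_303"
    if nal_unit_type in PICTURE_TYPES:
        return "picture_information_decoding_unit_304"
    if nal_unit_type in VCL_TYPES:
        return "image_signal_decoding_unit_305"
    return "not_routed"            # e.g. SEI; not covered by the text above


for first_byte in (0x67, 0x68, 0x65, 0x06):
    print(hex(first_byte), route_nal_unit(bytes([first_byte])))
```

NAL units that match none of the three groups are simply labeled as not routed here, since the text above does not say how they are handled.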

The sequence information decoding unit 303 decodes the encoded bit string, separated by the separation unit 301, in which the parameter information (SPS) relating to the encoding of the entire sequence is encoded. Here, the MVC extension part of the SPS is also decoded in accordance with the syntax structure shown in FIG. 11.

FIG. 14 is a block diagram showing the configuration of an embodiment of the sequence information decoding unit 303. As shown in FIG. 14, the sequence information decoding unit 303 comprises a decoding unit 401 for sequence information other than the MVC extension part, a viewpoint number information decoding unit 402, a decoding order information decoding unit 403, a reference viewpoint number information decoding unit 404, a reference viewpoint information generation unit 405, a reference viewpoint information decoding unit 406, and a switch 407. The switch 407 switches according to the parameter information (SPS) relating to the encoding of the entire sequence, the syntax structure of the MVC extension of the SPS shown in FIG. 11, and the decoding results of the decoding units 401 to 404 and 406, and supplies the encoded bit string to the decoding units 401 to 404 and 406 in turn.

The reference viewpoint number information decoding unit 404, the reference viewpoint information generation unit 405, and the reference viewpoint information decoding unit 406 together constitute a viewpoint dependency information decoding unit 408. The viewpoint number information decoding unit 402 and the decoding order information decoding unit 403, together with the viewpoint dependency information decoding unit 408, constitute an SPS MVC extension part decoding unit 409.

The decoding unit 401 for sequence information other than the MVC extension part in FIG. 14 decodes, from the encoded bit string, the sequence information other than the MVC extension part, that is, the SPS (sequence parameter set) of the AVC/H.264 scheme. The SPS MVC extension part decoding unit 409, which comprises the viewpoint number information decoding unit 402, the decoding order information decoding unit 403, the reference viewpoint number information decoding unit 404, the reference viewpoint information generation unit 405, and the reference viewpoint information decoding unit 406, decodes from the encoded bit string the MVC extension part of the information relating to the entire sequence (SPS) in accordance with the syntax structure shown in FIG. 11.

In the SPS MVC extension part decoding unit 409, the viewpoint number information decoding unit 402 first decodes from the encoded bit string the syntax element "num_views_minus1", which indicates the number of viewpoints encoded in the encoded bit string minus "1". Next, the decoding order information decoding unit 403 sequentially decodes, for each viewpoint, the syntax element "view_id[i]", which indicates the viewpoint ID of the i-th viewpoint in the encoding/decoding order in the viewpoint direction. Because the supplied encoded bit string carries "view_id[i]" in the encoding/decoding order in the viewpoint direction, the decoder can tell in which order the viewpoints were encoded.

Next, the viewpoint dependency information decoding unit 408, which comprises the reference viewpoint number information decoding unit 404, the reference viewpoint information generation unit 405, and the reference viewpoint information decoding unit 406, decodes the viewpoint dependency information from the encoded bit string. The viewpoint dependency information decoded here consists of the syntax elements described earlier: "num_anchor_refs_l0[i]", "anchor_ref_l0[i][j]", "num_anchor_refs_l1[i]", "anchor_ref_l1[i][j]", "num_non_anchor_refs_l0[i]", "non_anchor_ref_l0[i][j]", "num_non_anchor_refs_l1[i]", and "non_anchor_ref_l1[i][j]".

Of the viewpoint dependency information, the reference viewpoint number information decoding unit 404 decodes the syntax elements "num_anchor_refs_l0[i]", "num_anchor_refs_l1[i]", "num_non_anchor_refs_l0[i]", and "num_non_anchor_refs_l1[i]", which indicate, for each viewpoint, the number of viewpoints that can be used as references for inter-view prediction in reference picture list 0 and reference picture list 1 for anchor pictures and for non-anchor pictures. However, as described above, in the present invention these syntax elements are never encoded in the supplied bit string for index i equal to "0". The reference viewpoint number information decoding unit 404 therefore does not decode them for index i equal to "0" and decodes them only for index i equal to "1" or more.

The reference viewpoint information decoding unit 406 decodes the syntax elements "anchor_ref_l0[i][j]", "anchor_ref_l1[i][j]", "non_anchor_ref_l0[i][j]", and "non_anchor_ref_l1[i][j]", which indicate the viewpoint ID of the viewpoint used as the j-th inter-view prediction reference in reference picture list 0 and reference picture list 1 for anchor pictures and for non-anchor pictures of each viewpoint. The number of elements decoded is given by "num_anchor_refs_l0[i]", "num_anchor_refs_l1[i]", "num_non_anchor_refs_l0[i]", and "num_non_anchor_refs_l1[i]", respectively. However, as described above, in the present invention these syntax elements are never encoded in the supplied bit string for i equal to "0" or "1". The reference viewpoint information decoding unit 406 therefore does not decode them for i equal to "0" or "1" and decodes them only for i equal to "2" or more.

For the reason given above, the reference viewpoint information generation unit 405 operates as follows. When the value of the syntax element "num_anchor_refs_l0[1]", which indicates the number of inter-view prediction reference viewpoints in reference picture list 0 for anchor pictures of the first viewpoint in the encoding/decoding order in the viewpoint direction, is "1", the unit sets the value of the syntax element "anchor_ref_l0[1][0]", which indicates the viewpoint ID of the viewpoint used as the 0th inter-view prediction reference in that list, to "view_id[0]", the viewpoint ID of the 0th viewpoint in the encoding/decoding order in the viewpoint direction.

Similarly, when the value of "num_anchor_refs_l1[1]" is "1", the value of "anchor_ref_l1[1][0]" is set to "view_id[0]"; when the value of "num_non_anchor_refs_l0[1]" is "1", the value of "non_anchor_ref_l0[1][0]" is set to "view_id[0]"; and when the value of "num_non_anchor_refs_l1[1]" is "1", the value of "non_anchor_ref_l1[1][0]" is set to "view_id[0]".
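The derivation rule applied by the reference viewpoint information generation unit 405 is the same for all four lists, which the following sketch makes explicit. The function name and the dictionary layout are illustrative assumptions; only the rule itself (a count of 1 for viewpoint 1 implies a reference to "view_id[0]") comes from the description above.

```python
def derive_first_view_refs(view_id, num_refs):
    """Derive the uncoded reference lists of viewpoint 1 (coding order).

    view_id  : list of viewpoint IDs in encoding/decoding order
    num_refs : dict mapping a list name (e.g. "anchor_l0") to its per-viewpoint
               counts; only the entry for viewpoint 1 is consulted
    Both the argument layout and the list names are hypothetical.
    """
    derived = {}
    for name, counts in num_refs.items():
        # Viewpoint 1 can only refer to viewpoint 0, so a count of 1
        # leaves exactly one possible reference: view_id[0].
        derived[name] = [view_id[0]] if counts[1] == 1 else []
    return derived


refs_for_view1 = derive_first_view_refs(
    view_id=[4, 2, 6],
    num_refs={
        "anchor_l0": [0, 1, 2],
        "anchor_l1": [0, 0, 1],
        "non_anchor_l0": [0, 1, 1],
        "non_anchor_l1": [0, 0, 0],
    },
)
print(refs_for_view1)
```

Because the derived value depends only on "view_id[0]" and the decoded count, no bits need to be spent on the reference viewpoint ID of viewpoint 1.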

The description now returns to FIG. 13. The management information for the entire sequence decoded by the sequence information decoding unit 303 is supplied to the decoding management unit 302 and used for decoding management. The picture information decoding unit 304 decodes the encoded bit string, separated by the separation unit 301, in which the parameter information (PPS) relating to the encoding of a picture is encoded, and supplies the decoded parameter information (PPS) to the decoding management unit 302 as picture management information for use in decoding management.

The image signal decoding unit 305 decodes the encoded bit string (encoded data) to be decoded, supplied from the separation unit 301, to obtain an image signal. It does so on the basis of the picture management information and the decoded management information for the entire sequence, which includes the viewpoint number information, the decoding order information, and the viewpoint dependency information, all supplied from the decoding management unit 302. An image signal may be decoded using inter-view prediction, in which case the viewpoint dependency information is also used to determine the reference image for inter-view prediction.

Next, the multi-view image decoding procedure performed by the multi-view image decoding apparatus of this embodiment shown in FIG. 13 will be described with reference to the flowcharts of FIGS. 15 to 21. The processing operation of each step is the same as that described with the block diagrams of FIGS. 13 and 14, so only the processing procedure is described here, with each step related to FIGS. 13 and 14.

First, in step S201 of FIG. 15, the encoded bit string is separated into NAL units. The reception and separation procedure of step S201 for the case in which the encoded bit string is transmitted via a network will be described in detail with reference to the flowchart of FIG. 21. In the separation process of step S201, the encoded bit string is first received via the network (step S281). Then the packet headers that were added in accordance with the standard used for the received bit string, such as the MPEG-2 Systems format, the MP4 file format, or RTP, are decoded and removed (step S282). Finally, the encoded bit string is separated into NAL units (step S283).

The description now returns to FIG. 15. The identifier (nal_unit_type) that distinguishes the type of the NAL unit, contained in the header of the NAL unit separated in step S201 of FIG. 15, is evaluated to determine whether the NAL unit is the parameter information (SPS) relating to the encoding of the entire sequence, that is, sequence information (step S202). If it is sequence information, the process proceeds to step S205. If it is determined to be not sequence information but picture information (PPS) (step S203), the process proceeds to step S206.

If the NAL unit is neither sequence information nor picture information, the process proceeds to step S204. Step S204 determines whether the NAL unit is a VCL NAL unit, that is, an encoded bit string carrying the encoding mode, the motion vector or disparity vector, the encoded residual signal, and the like. If it is a VCL NAL unit, the process proceeds to step S207. The processing in steps S201, S202, S203, and S204 corresponds to the processing operation of the separation unit 301 in the multi-view image decoding apparatus of FIG. 13.

Next, in step S205, the encoded bit string in which the parameter information relating to the encoding of the entire sequence is encoded is decoded to obtain that parameter information. The processing in step S205 corresponds to the decoding operation of the sequence information decoding unit 303 in the multi-view image decoding apparatus of FIG. 13.

An example of the sequence information decoding procedure of step S205 will be described in more detail with the flowchart of FIG. 16. In the sequence information decoding process, the sequence information other than the MVC extension part is decoded first (step S211). The processing in step S211 corresponds to the decoding operation of the decoding unit 401 for sequence information other than the MVC extension part in FIG. 14.

Following step S211, the MVC extension part of the information relating to the entire sequence (SPS) is decoded in accordance with the syntax structure shown in FIG. 11 (steps S212 to S214). First, the information on the number of viewpoints encoded in the encoded bit string is decoded (step S212). The processing in step S212 corresponds to the decoding operation of the viewpoint number information decoding unit 402 in FIG. 14. Following step S212, the viewpoint ID information of each viewpoint, encoded in the decoding order in the viewpoint direction, is decoded (step S213). The decoding process in step S213 corresponds to the decoding operation of the decoding order information decoding unit 403 in FIG. 14.

An example of the procedure of step S213 for decoding the viewpoint ID of each viewpoint, encoded in the decoding order in the viewpoint direction, will be described in more detail with the flowchart of FIG. 17. In the decoding process of step S213, the index i indicating the encoding/decoding order in the viewpoint direction is first set to "0" (step S221), and it is then determined whether the value of the index i is less than or equal to (number of viewpoints - 1) (step S222). If it is not, the decoding process of step S213 ends. If it is, the process proceeds to step S223, and the processing from step S222 to step S224 is repeated until the value of the index i exceeds (number of viewpoints - 1).

Step S223 decodes the syntax element "view_id[i]", which indicates the viewpoint ID of the i-th viewpoint in the encoding/decoding order in the viewpoint direction. Step S224 then adds "1" to the index i, and the process returns to step S222.
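The loop of FIG. 17 can be sketched as follows. To avoid making claims about the entropy coding, this sketch models the bit string as an iterator of already-parsed integer values; that modeling choice and the function name are assumptions made for illustration.

```python
def decode_view_ids(num_views_minus1, symbols):
    """Read view_id[i] for i = 0..num_views_minus1 from an iterator of values.

    `symbols` stands in for the entropy decoder (hypothetical interface).
    """
    view_id = []
    i = 0                                # S221
    while i <= num_views_minus1:         # S222
        view_id.append(next(symbols))    # S223: decode view_id[i]
        i += 1                           # S224
    return view_id


order = decode_view_ids(2, iter([0, 2, 1]))
print(order)
```

The position of each ID in the returned list is its place in the encoding/decoding order, which is what later allows "view_id[0]" to be used when a reference viewpoint ID has to be derived rather than decoded.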

The description now returns to the flowchart of FIG. 16. Following the viewpoint ID decoding process of step S213 described with FIG. 17, the viewpoint dependency information is decoded (step S214). The processing in step S214 corresponds to the decoding operation of the viewpoint dependency information decoding unit 408 in FIG. 14, which comprises the reference viewpoint number information decoding unit 404, the reference viewpoint information generation unit 405, and the reference viewpoint information decoding unit 406.

An example of the viewpoint dependency information decoding procedure of step S214 will be described with the flowchart of FIG. 18. In step S214, the viewpoint dependency information of anchor pictures is decoded first (step S231), and the viewpoint dependency information of non-anchor pictures is decoded next (step S232), which completes the decoding process.

Next, an example of the procedure of step S231 in FIG. 18 for decoding the viewpoint dependency information of anchor pictures will be described in more detail with the flowchart of FIG. 19.

In the procedure of FIG. 19 for decoding the viewpoint dependency information of anchor pictures, the syntax elements "num_anchor_refs_l0[0]" and "num_anchor_refs_l1[0]" are not encoded in the encoded bit string. These elements indicate the number of viewpoints that can be used as references for inter-view prediction in reference picture list 0 and reference picture list 1 for anchor pictures of the 0th viewpoint in the encoding/decoding order in the viewpoint direction (the viewpoint that is encoded/decoded first). Their values are therefore always set to "0" (step S241). Accordingly, the index i indicating the encoding/decoding order in the viewpoint direction is set to "1" (step S242).

Subsequently, it is determined whether the value of the index i is less than or equal to (number of viewpoints - 1) (step S243). If it is not, the decoding process for the viewpoint dependency information of anchor pictures ends. If it is, the process proceeds to step S244, and the processing from step S243 to step S260 is repeated until the value of the index i exceeds (number of viewpoints - 1).

Step S244 decodes the syntax element "num_anchor_refs_l0[i]", which indicates the number of viewpoints that can be used as references for inter-view prediction in reference picture list 0 for anchor pictures of the i-th viewpoint in the encoding/decoding order in the viewpoint direction. Step S245 then determines whether the index i is greater than "1". When the index i is greater than "1", the process proceeds to step S248. When it is not, that is, when the index i is "1", the process proceeds to step S246, which determines whether the value of "num_anchor_refs_l0[1]" is "1".

As described above, "num_anchor_refs_l0[1]" takes the value "0" or "1". When its value is "0", the process proceeds to step S252; when its value is "1", the process proceeds to step S247. As described above, for index i equal to "1" the encoded bit string does not carry the syntax element indicating the viewpoint ID of the viewpoint used as the inter-view prediction reference in reference picture list 0 for anchor pictures of the first viewpoint in the encoding/decoding order in the viewpoint direction. Step S247 therefore takes the viewpoint ID of the 0th viewpoint in that order (the viewpoint encoded/decoded first), that is, the value of "view_id[0]" decoded in step S223 of FIG. 17, as the value of "anchor_ref_l0[1][0]", and the process proceeds to step S252.

On the other hand, when step S245 determines that the index i is greater than "1", the index j is set to "0" (step S248), and it is then determined whether the value of the index j is smaller than "num_anchor_refs_l0[i]" (step S249). When the value of the index j is greater than or equal to "num_anchor_refs_l0[i]", the process proceeds to step S252. When it is smaller, the process proceeds to step S250, and the processing from step S249 to step S251 is repeated until the value of the index j becomes greater than or equal to "num_anchor_refs_l0[i]".

In step S250, the syntax element "anchor_ref_l0[i][j]" is decoded. It indicates the viewpoint ID of the viewpoint used as the j-th inter-view prediction reference in reference picture list 0 for anchor pictures of the i-th viewpoint in the encoding/decoding order in the viewpoint direction. The process then proceeds to step S251. In step S251, "1" is added to the value of index j, and the process returns to step S249.

In step S252, the syntax element "num_anchor_refs_l1[i]" is decoded. It indicates the number of viewpoints that can be used as inter-view prediction references in reference picture list 1 for anchor pictures of the i-th viewpoint in the encoding/decoding order in the viewpoint direction. It is then determined whether index i is greater than "1" (step S253). If index i is greater than "1", the process proceeds to step S256; if index i is not greater than "1", that is, if index i is "1", the process proceeds to step S254.

In step S254, it is determined whether the value of "num_anchor_refs_l1[1]" is "1". Here, as described above, "num_anchor_refs_l1[1]" takes a value of "0" or "1". If the value of "num_anchor_refs_l1[1]" is "0", the process proceeds to step S260; if the value is "1", the process proceeds to step S255. As described above, for index i equal to "1", the encoded bit string does not contain the syntax element indicating the viewpoint ID of the viewpoint used as the inter-view prediction reference in reference picture list 1 for anchor pictures of the 1st viewpoint in the encoding/decoding order in the viewpoint direction. In step S255, the viewpoint ID of the 0th viewpoint in that order (the viewpoint encoded/decoded first), that is, the value of "view_id[0]" decoded in step S223 of FIG. 17, is set as the value of "anchor_ref_l1[1][0]", and the process proceeds to step S260.

On the other hand, if it is determined in step S253 that the value of index i is greater than "1", index j is set to "0" (step S256), and it is then determined whether the value of index j is smaller than "num_anchor_refs_l1[i]" (step S257). If the value of index j is equal to or greater than "num_anchor_refs_l1[i]", the process proceeds to step S260. If the value of index j is smaller than "num_anchor_refs_l1[i]", the process proceeds to step S258, and the processing from step S257 to step S259 is repeated until the value of index j becomes equal to or greater than "num_anchor_refs_l1[i]".

In step S258, the syntax element "anchor_ref_l1[i][j]" is decoded. It indicates the viewpoint ID of the viewpoint used as the j-th inter-view prediction reference in reference picture list 1 for anchor pictures of the i-th viewpoint in the encoding/decoding order in the viewpoint direction. The process then proceeds to step S259. In step S259, "1" is added to the value of index j, and the process returns to step S257. In step S260, "1" is added to the value of index i, and the process returns to step S243.
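The anchor-picture flow of FIG. 19 can be summarized as a short sketch. This is only an illustration under stated assumptions, not the implementation of the embodiment: the function name, the `read_value` callback (which stands in for whatever entropy decoder returns the next syntax element), and the dictionary layout of the result are all hypothetical. The range check for viewpoint 1 is likewise an added safeguard; the description only states that the value is "0" or "1".

```python
from typing import Callable, Dict, List


def decode_anchor_view_dependency(
    read_value: Callable[[], int],  # hypothetical: returns the next decoded syntax element
    view_id: List[int],             # view_id[i] in encoding/decoding order (step S223)
) -> Dict[str, List[List[int]]]:
    """Build anchor_ref_l0 / anchor_ref_l1 following the flow of FIG. 19."""
    num_views = len(view_id)
    refs = {
        "anchor_ref_l0": [[] for _ in range(num_views)],
        "anchor_ref_l1": [[] for _ in range(num_views)],
    }
    # Index 0 is never coded: the 0th viewpoint has no inter-view references.
    for i in range(1, num_views):
        # List 0 is handled first (S244 to S251), then list 1 (S252 to S259).
        for name in ("anchor_ref_l0", "anchor_ref_l1"):
            count = read_value()  # num_anchor_refs_l0[i] or num_anchor_refs_l1[i]
            if i == 1:
                if count not in (0, 1):
                    raise ValueError("count for viewpoint 1 must be 0 or 1")
                if count == 1:
                    # Not present in the bit string: derived from view_id[0]
                    # (steps S247 and S255).
                    refs[name][1] = [view_id[0]]
            else:
                # Present in the bit string: decoded one by one
                # (steps S250 and S258).
                refs[name][i] = [read_value() for _ in range(count)]
    return refs
```

The only branch that does not consume input is the one for viewpoint 1 with a count of 1, which is where the omitted syntax element is reconstructed from "view_id[0]".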

Next, an example of the procedure for decoding the viewpoint-dependent information of non-anchor pictures in step S232 of FIG. 18 is described in more detail with reference to the flowchart of FIG. 20.

In the decoding of the viewpoint-dependent information of non-anchor pictures in FIG. 20, the syntax elements "num_non_anchor_refs_l0[0]" and "num_non_anchor_refs_l1[0]", which indicate the number of viewpoints that can be used as inter-view prediction references in reference picture list 0 and reference picture list 1 for non-anchor pictures of the 0th viewpoint in the encoding/decoding order in the viewpoint direction, are not encoded in the encoded bit string, and their values are always taken to be "0" (step S261). Index i is therefore set to "1" (step S262).

Next, it is determined whether the value of index i is less than or equal to (number of viewpoints - 1) (step S263). If the value of index i is not less than or equal to (number of viewpoints - 1), the decoding of the viewpoint-dependent information of non-anchor pictures ends. If the value of index i is less than or equal to (number of viewpoints - 1), the process proceeds to step S264, and the processing from step S263 to step S280 is repeated until the value of index i exceeds (number of viewpoints - 1).

In step S264, the syntax element "num_non_anchor_refs_l0[i]" is decoded. It indicates the number of viewpoints that can be used as inter-view prediction references in reference picture list 0 for non-anchor pictures of the i-th viewpoint in the encoding/decoding order in the viewpoint direction. Then, in step S265, it is determined whether index i is greater than "1". If index i is greater than "1", the process proceeds to step S268; if index i is not greater than "1", that is, if index i is "1", the process proceeds to step S266.

In step S266, it is determined whether the value of "num_non_anchor_refs_l0[1]" is "1". Here, as described above, "num_non_anchor_refs_l0[1]" takes a value of "0" or "1". If the value of "num_non_anchor_refs_l0[1]" is "0", the process proceeds to step S272; if the value is "1", the process proceeds to step S267.

As described above, for index i equal to "1", the encoded bit string does not contain the syntax element indicating the viewpoint ID of the viewpoint used as the inter-view prediction reference in reference picture list 0 for non-anchor pictures of the 1st viewpoint in the encoding/decoding order in the viewpoint direction. Therefore, in step S267, the viewpoint ID of the 0th viewpoint in that order (the viewpoint encoded/decoded first), that is, the value of "view_id[0]" decoded in step S223 of FIG. 17, is set as the value of "non_anchor_ref_l0[1][0]", and the process proceeds to step S272.

On the other hand, if it is determined in step S265 that the value of index i is greater than "1", index j is set to "0" (step S268), and it is then determined whether the value of index j is smaller than "num_non_anchor_refs_l0[i]" (step S269). If the value of index j is equal to or greater than "num_non_anchor_refs_l0[i]", the process proceeds to step S272. If the value of index j is smaller than "num_non_anchor_refs_l0[i]", the process proceeds to step S270, and the processing from step S269 to step S271 is repeated until the value of index j becomes equal to or greater than "num_non_anchor_refs_l0[i]".

In step S270, the syntax element "non_anchor_ref_l0[i][j]" is decoded. It indicates the viewpoint ID of the viewpoint used as the j-th inter-view prediction reference in reference picture list 0 for non-anchor pictures of the i-th viewpoint in the encoding/decoding order in the viewpoint direction. The process then proceeds to step S271. In step S271, "1" is added to the value of index j, and the process returns to step S269.

In step S272, the syntax element "num_non_anchor_refs_l1[i]" is decoded. It indicates the number of viewpoints that can be used as inter-view prediction references in reference picture list 1 for non-anchor pictures of the i-th viewpoint in the encoding/decoding order in the viewpoint direction. It is then determined whether index i is greater than "1" (step S273). If index i is greater than "1", the process proceeds to step S276; if index i is not greater than "1", that is, if index i is "1", the process proceeds to step S274.

In step S274, it is determined whether the value of "num_non_anchor_refs_l1[1]" is "1". Here, as described above, "num_non_anchor_refs_l1[1]" takes a value of "0" or "1". If the value of "num_non_anchor_refs_l1[1]" is "0", the process proceeds to step S280; if the value is "1", the process proceeds to step S275.

As described above, for index i equal to "1", the encoded bit string does not contain the syntax element indicating the viewpoint ID of the viewpoint used as the inter-view prediction reference in reference picture list 1 for non-anchor pictures of the 1st viewpoint in the encoding/decoding order in the viewpoint direction. Therefore, in step S275, the viewpoint ID of the 0th viewpoint in that order (the viewpoint encoded/decoded first), that is, the value of "view_id[0]" decoded in step S223 of FIG. 17, is set as the value of "non_anchor_ref_l1[1][0]", and the process proceeds to step S280.

On the other hand, if it is determined in step S273 that the value of index i is greater than "1", index j is set to "0" (step S276), and it is then determined whether the value of index j is smaller than "num_non_anchor_refs_l1[i]" (step S277). If the value of index j is equal to or greater than the value of "num_non_anchor_refs_l1[i]", the process proceeds to step S280. If the value of index j is smaller than the value of "num_non_anchor_refs_l1[i]", the process proceeds to step S278, and the processing from step S277 to step S279 is repeated until the value of index j becomes equal to or greater than "num_non_anchor_refs_l1[i]".

In step S278, the syntax element "non_anchor_ref_l1[i][j]" is decoded. It indicates the viewpoint ID of the viewpoint used as the j-th inter-view prediction reference in reference picture list 1 for non-anchor pictures of the i-th viewpoint in the encoding/decoding order in the viewpoint direction. The process then proceeds to step S279. In step S279, "1" is added to the value of index j, and the process returns to step S277. In step S280, "1" is added to the value of index i, and the process returns to step S263.
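To make the effect on the bit string concrete, the non-anchor flow of FIG. 20 can be sketched at the bit level. The sketch assumes that every syntax element is coded as an unsigned Exp-Golomb code; this passage does not state the coding of each element, so that is an assumption. The `BitReader` class and the function name are hypothetical helpers introduced only for illustration.

```python
from typing import List, Tuple


class BitReader:
    """Reads bits from a string of '0'/'1' characters (illustrative only)."""

    def __init__(self, bits: str) -> None:
        self.bits = bits
        self.pos = 0

    def read_bit(self) -> int:
        bit = int(self.bits[self.pos])
        self.pos += 1
        return bit

    def read_ue(self) -> int:
        """Decode one unsigned Exp-Golomb code word."""
        leading_zeros = 0
        while self.read_bit() == 0:
            leading_zeros += 1
        suffix = 0
        for _ in range(leading_zeros):
            suffix = (suffix << 1) | self.read_bit()
        return (1 << leading_zeros) - 1 + suffix


def decode_non_anchor_refs(
    reader: BitReader, view_id: List[int]
) -> Tuple[List[List[int]], List[List[int]]]:
    """Return (non_anchor_ref_l0, non_anchor_ref_l1) following FIG. 20."""
    refs_l0: List[List[int]] = [[] for _ in view_id]
    refs_l1: List[List[int]] = [[] for _ in view_id]
    for i in range(1, len(view_id)):            # S262, S263, S280
        for refs in (refs_l0, refs_l1):
            count = reader.read_ue()            # S264 / S272
            if i == 1:
                if count == 1:                  # S266 / S274
                    refs[1] = [view_id[0]]      # S267 / S275: derived, no bits read
            else:
                # S268 to S271 / S276 to S279: read each reference viewpoint ID
                refs[i] = [reader.read_ue() for _ in range(count)]
    return refs_l0, refs_l1
```

With three viewpoints whose IDs are 0, 1 and 2, the bit string "010" "1" "011" "1" "010" "1" (counts 1 and 0 for viewpoint 1, then count 2 with IDs 0 and 1, then count 0 for viewpoint 2) is consumed in full, and no bits are spent on the reference of viewpoint 1.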

In the above description of the procedures of FIG. 19 and FIG. 20, the decoding in steps S244 and S252 of FIG. 19 and in steps S264 and S272 of FIG. 20 corresponds to the decoding operation of the reference viewpoint number information decoding unit 404 in FIG. 14. The processing in steps S247 and S255 of FIG. 19 and in steps S267 and S275 of FIG. 20 corresponds to the derivation operation of the reference viewpoint information generation unit 405 in FIG. 14. Furthermore, the decoding in steps S250 and S258 of FIG. 19 and in steps S270 and S278 of FIG. 20 corresponds to the decoding operation of the reference viewpoint information decoding unit 406 in FIG. 14.

The description now returns to FIG. 16. When the processing of step S214 described above with reference to FIGS. 18 to 20 is completed, the sequence information decoding process of FIG. 16 ends.

The description now returns to the flowchart of FIG. 15. When the sequence information decoding of step S205 described above with reference to FIGS. 16 to 20 is completed, the process proceeds to step S208. In step S206, on the other hand, parameter information relating to the encoding of pictures is decoded. The processing of step S206 corresponds to the decoding operation of the picture information decoding unit 304 of the multi-viewpoint image decoding apparatus in FIG. 13. When the decoding of step S206 is completed, the process proceeds to step S208. In step S207, slice information and the image signal are decoded. The processing of step S207 corresponds to the decoding operation of the image signal decoding unit 305 of the multi-viewpoint image decoding apparatus in FIG. 13. When the processing of step S207 is completed, the process proceeds to step S208.

In step S208, it is determined whether decoding of the entire encoded bit string to be decoded has been completed. If it has, the multi-viewpoint image decoding procedure ends. If it has not, the process returns to the first step S201, and the processing from step S201 to step S208 is repeated until decoding of the entire encoded bit string to be decoded is completed.

In the above description, the syntax elements indicating the number of viewpoints that can be used as inter-view prediction references in reference picture list 0 and reference picture list 1, for anchor pictures and for non-anchor pictures of the 1st viewpoint in the encoding/decoding order in the viewpoint direction, were described as being encoded on the encoding side and decoded on the decoding side. However, if only viewpoints that precede in the encoding/decoding order in the viewpoint direction can be referred to in inter-view prediction, the only viewpoint that the 1st viewpoint can refer to in inter-view prediction is the 0th viewpoint.

Accordingly, for the 1st viewpoint, the number of viewpoints that can be used as inter-view prediction references takes a value of "0" or "1". A value of "0" means that the viewpoint is encoded/decoded without referring to any other viewpoint, and a value of "1" means that it can be encoded/decoded with reference to another viewpoint. Therefore, for the 1st viewpoint, the information on the number of viewpoints that can be used as inter-view prediction references can be replaced by binary information indicating whether or not another viewpoint is referred to, or by binary information indicating whether that number is "0" or "1", which is encoded on the encoding side and decoded on the decoding side.
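The possible saving from this substitution can be illustrated with a small calculation. The sketch assumes that the count would otherwise be written as an unsigned Exp-Golomb code and that the binary information is written as a single bit; both are assumptions made for illustration, and `ue_code` and `flag_code` are hypothetical helper names.

```python
def ue_code(code_num: int) -> str:
    """Unsigned Exp-Golomb code word for a non-negative integer."""
    if code_num < 0:
        raise ValueError("code_num must be non-negative")
    body = format(code_num + 1, "b")        # binary form of code_num + 1
    return "0" * (len(body) - 1) + body     # prefix of leading zeros, then body


def flag_code(num_refs: int) -> str:
    """One-bit alternative, valid only for the 1st viewpoint (count 0 or 1)."""
    if num_refs not in (0, 1):
        raise ValueError("the 1st viewpoint can reference at most one viewpoint")
    return str(num_refs)
```

Under these assumptions a count of "0" costs one bit either way ("1" as an Exp-Golomb code, "0" as a flag), while a count of "1" costs three bits as the Exp-Golomb code "010" but only one bit as a flag.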

In the above description, the multi-viewpoint images used for encoding and decoding may be images actually captured from different viewpoints. However, viewpoint images that have been converted or generated, for example by interpolating from surrounding viewpoints the image at a virtual viewpoint position that was not actually captured, can also be encoded and decoded, and this is included in the present invention.

For example, for a multi-viewpoint image signal made up of the image signals of four viewpoints A, B, C, and D, three cases are conceivable: (1) the image signals of all four viewpoints were obtained by actual capture at each viewpoint; (2) the image signals of all four viewpoints were generated as if virtually captured at each viewpoint; and (3) actually captured and virtually generated image signals are mixed, for example the image signals of viewpoints A and B were obtained by actual capture and the image signals of viewpoints C and D were generated as if virtually captured.

Multi-viewpoint images such as computer graphics can also be encoded and decoded, and this is included in the present invention. Furthermore, the multi-viewpoint image encoding and decoding processing described above can of course be realized as transmission, storage, and reception apparatus using hardware, and can also be realized by firmware stored in a ROM (read-only memory), flash memory, or the like, or by software on a computer or the like. The firmware program or software program can be provided recorded on a recording medium readable by a computer or the like, provided from a server over a wired or wireless network, or provided as a data broadcast of terrestrial or satellite digital broadcasting.

FIG. 1 is a block diagram of an example of a multi-viewpoint image encoding apparatus that generates the encoded bit string decoded by the present invention.
FIG. 2 is a block diagram of an example of the sequence information encoding unit 102 of the multi-viewpoint image encoding apparatus in FIG. 1.
FIG. 3 is a flowchart of the multi-viewpoint image encoding process that produces the encoded bit string decoded by the present invention.
FIG. 4 is a flowchart of the sequence information encoding process of step S101 in FIG. 3.
FIG. 5 is a flowchart of the process of step S113 in FIG. 4, which encodes the viewpoint IDs in the encoding/decoding order in the viewpoint direction.
FIG. 6 is a flowchart of the viewpoint-dependent information encoding process of step S114 in FIG. 4.
FIG. 7 is a flowchart of the process of step S131 in FIG. 6, which encodes the viewpoint-dependent information of anchor pictures.
FIG. 8 is a flowchart of the process of step S132 in FIG. 6, which encodes the viewpoint-dependent information of non-anchor pictures.
FIG. 9 is a flowchart of the packetization and transmission process for transmission over a network.
FIG. 10 is a diagram showing an example of the reference dependencies between images when a multi-viewpoint image consisting of eight viewpoints is encoded.
FIG. 11 is a diagram showing an example of the syntax structure of the MVC extension part of the SPS of the encoded bit string decoded by the present invention.
FIG. 12 is a diagram showing an example of the syntax elements of the MVC extension part of the SPS and their values when encoding is performed with the prediction reference dependencies of FIG. 10, based on the syntax structure of FIG. 11.
FIG. 13 is a block diagram of an embodiment of the multi-viewpoint image decoding apparatus of the present invention.
FIG. 14 is a block diagram of an embodiment of the sequence information decoding unit 303 of the multi-viewpoint image decoding apparatus in FIG. 13.
FIG. 15 is a flowchart of the multi-viewpoint image decoding process of the present invention.
FIG. 16 is a flowchart of the sequence information decoding process of step S205 in FIG. 15.
FIG. 17 is a flowchart of the process of step S213 in FIG. 16, which decodes the viewpoint IDs encoded in the encoding/decoding order in the viewpoint direction.
FIG. 18 is a flowchart of the viewpoint-dependent information decoding process of step S214 in FIG. 16.
FIG. 19 is a flowchart of the process of step S231 in FIG. 18, which decodes the viewpoint-dependent information of anchor pictures.
FIG. 20 is a flowchart of the process of step S232 in FIG. 18, which decodes the viewpoint-dependent information of non-anchor pictures.
FIG. 21 is a flowchart of the reception process for reception over a network.
FIG. 22 is a diagram showing an example of the syntax structure of the MVC extension part of the SPS in a conventional example.
FIG. 23 is a diagram showing an example of the relationship between bit strings encoded with the unsigned Exp-Golomb code and their code numbers.
FIG. 24 is a diagram showing an example of the syntax elements of the MVC extension part of the SPS and their values when encoding is performed with the prediction reference dependencies of FIG. 10, based on the syntax structure of FIG. 22.

Explanation of symbols

101 Encoding management unit
102 Sequence information encoding unit
103 Picture information encoding unit
104 Image signal encoding unit
105 Multiplexing unit
201 Encoding unit for sequence information other than the MVC extension part
202 Viewpoint number information encoding unit
203 Encoding order information encoding unit
204 Reference viewpoint number information encoding unit
205 Reference viewpoint information encoding unit
206 Viewpoint-dependent information encoding unit
207 SPS MVC extension part encoding unit
301 Separation unit
302 Decoding management unit
303 Sequence information decoding unit
304 Picture information decoding unit
305 Image signal decoding unit
401 Decoding unit for sequence information other than the MVC extension part
402 Viewpoint number information decoding unit
403 Decoding order information decoding unit
404 Reference viewpoint number information decoding unit
405 Reference viewpoint information generation unit
406 Reference viewpoint information decoding unit
407 Switch
408 Viewpoint-dependent information decoding unit
409 SPS MVC extension part decoding unit

Claims

A multi-view image decoding method for decoding encoded data to be decoded within an encoded bit string obtained by encoding a multi-viewpoint image signal, the multi-viewpoint image signal including image signals of respective viewpoints obtained at a plurality of set viewpoints, the image signal of each viewpoint being either an image signal obtained by actual capture from that viewpoint or an image signal generated as if virtually captured from that viewpoint, wherein
the encoded data to be decoded includes:
first encoded data obtained by encoding information specifying the encoding/decoding order of the viewpoints among the plurality of viewpoints in the encoding of the image signal of each viewpoint;
second encoded data obtained by encoding, for each of the 1st and subsequent viewpoints, where the viewpoint encoded first in the encoding/decoding order among the plurality of viewpoints is the 0th viewpoint and the viewpoint encoded next is the 1st viewpoint, information on the number, zero or more, of viewpoints that can be referred to in inter-view prediction, which refers to decoded image signals of other viewpoints, when the image signal of the i-th viewpoint (where i is a natural number) is encoded (a number of 0 means that there is no viewpoint that can be referred to in inter-view prediction and that the viewpoint is encoded without using inter-view prediction); and
third encoded data obtained by encoding, for each viewpoint that is encoded j-th (where j is a natural number of 2 or more) in the encoding/decoding order and for which the number of viewpoints that can be referred to in inter-view prediction is one or more, information specifying the viewpoint to be referred to in inter-view prediction,
the method comprising:
a first step of decoding the first encoded data to obtain the information specifying the encoding/decoding order among the plurality of viewpoints;
a second step of decoding the second encoded data to obtain the information on the number of viewpoints that can be referred to in the inter-view prediction;
a third step of, for the 1st viewpoint, when the number of viewpoints that can be referred to in the inter-view prediction obtained by decoding in the second step is 1, setting, as the information specifying the viewpoint to be referred to in the inter-view prediction, information specifying the 0th viewpoint obtained from the information specifying the encoding/decoding order among the plurality of viewpoints decoded in the first step; and
a fourth step of decoding the third encoded data for each of the viewpoints to obtain the information specifying the viewpoint to be referred to in the inter-view prediction.

A multi-viewpoint image decoding apparatus that decodes encoded data to be decoded in an encoded bit string obtained by encoding a multi-viewpoint image signal, the multi-viewpoint image signal including image signals of the respective viewpoints obtained at a plurality of set viewpoints, the image signal of each viewpoint being either an image signal obtained by actually photographing from that viewpoint or an image signal generated as if virtually photographed from that viewpoint, wherein
The encoded data to be decoded includes:
First encoded data obtained by encoding information specifying the encoding/decoding order among the plurality of viewpoints used in encoding the image signal of each viewpoint;
Second encoded data obtained by encoding, for each of the 1st and subsequent viewpoints, where the viewpoint encoded first in the encoding/decoding order among the plurality of viewpoints is the 0th viewpoint and the viewpoint encoded next is the 1st viewpoint, information on the number (zero or more) of viewpoints that can be referred to in inter-view prediction, which refers to decoded image signals of other viewpoints, when encoding the image signal of the viewpoint encoded i-th (where i is a natural number) (a number of 0 indicates that there is no viewpoint that can be referred to in inter-view prediction and that this viewpoint is encoded without using inter-view prediction); and
Third encoded data obtained by encoding, for each viewpoint that is encoded j-th (where j is a natural number of 2 or more) in the encoding/decoding order and for which the number of viewpoints that can be referred to in inter-view prediction is one or more, information specifying the viewpoint to be referred to in the inter-view prediction, and
The apparatus comprises:
First decoding means for decoding the first encoded data to obtain information specifying the encoding/decoding order among the plurality of viewpoints;
Second decoding means for decoding the second encoded data to obtain information on the number of viewpoints that can be referred to in the inter-view prediction;
Third decoding means for setting, for the 1st viewpoint, when the number of viewpoints that can be referred to in the inter-view prediction obtained by decoding in the second step is 1, the information specifying the 0th viewpoint, obtained from the information specifying the encoding/decoding order among the plurality of viewpoints decoded in the first step, as the information specifying the viewpoint to be referred to in the inter-view prediction; and
Fourth decoding means for decoding the third encoded data for each of the viewpoints to obtain information specifying the viewpoint to be referred to in the inter-view prediction.
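
The three kinds of encoded data that the apparatus receives can be illustrated from the producing side. The following Python sketch is a hypothetical illustration and not the bit-level syntax of the claims: it assumes the data is represented as a flat list of non-negative integers prior to entropy coding, that the first encoded data is written as a viewpoint count followed by the viewpoint IDs in encoding/decoding order, and that all reference counts precede all reference viewpoint IDs. The function name `encode_view_dependency` is an assumption introduced here.

```python
from typing import List


def encode_view_dependency(view_order: List[int], refs: List[List[int]]) -> List[int]:
    """Serialize view-dependency information, omitting derivable elements.

    view_order: viewpoint IDs in encoding/decoding order (index 0 = 0th viewpoint).
    refs[i]:    viewpoint IDs that the i-th viewpoint may refer to.
    """
    n = len(view_order)
    if n < 1 or len(refs) != n:
        raise ValueError("view_order and refs must be non-empty and equally long")
    if refs[0]:
        raise ValueError("the 0th viewpoint cannot use inter-view prediction")
    if n > 1 and refs[1] not in ([], [view_order[0]]):
        # Assumption of this sketch: only the 0th viewpoint precedes the 1st.
        raise ValueError("the 1st viewpoint can only refer to the 0th viewpoint")

    # First encoded data: the encoding/decoding order of the viewpoints.
    symbols = [n] + list(view_order)

    # Second encoded data: reference counts for the 1st and later viewpoints.
    # The count of the 0th viewpoint is always 0 and is not written.
    symbols += [len(refs[i]) for i in range(1, n)]

    # Third encoded data: reference viewpoint IDs for the 2nd and later viewpoints.
    # The reference of the 1st viewpoint can be derived by the decoder, so it is omitted.
    for j in range(2, n):
        symbols += refs[j]

    return symbols


print(encode_view_dependency([10, 12, 11], [[], [10], [10, 12]]))
```

For three viewpoints with these dependencies the sketch emits 8 symbols; writing the count of the 0th viewpoint and the reference of the 1st viewpoint explicitly would take 10.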

A multi-viewpoint image decoding program that causes a computer to decode encoded data to be decoded in an encoded bit string obtained by encoding a multi-viewpoint image signal, the multi-viewpoint image signal including image signals of the respective viewpoints obtained at a plurality of set viewpoints, the image signal of each viewpoint being either an image signal obtained by actually photographing from that viewpoint or an image signal generated as if virtually photographed from that viewpoint, wherein
The encoded data to be decoded includes:
First encoded data obtained by encoding information specifying the encoding/decoding order among the plurality of viewpoints used in encoding the image signal of each viewpoint;
Second encoded data obtained by encoding, for each of the 1st and subsequent viewpoints, where the viewpoint encoded first in the encoding/decoding order among the plurality of viewpoints is the 0th viewpoint and the viewpoint encoded next is the 1st viewpoint, information on the number (zero or more) of viewpoints that can be referred to in inter-view prediction, which refers to decoded image signals of other viewpoints, when encoding the image signal of the viewpoint encoded i-th (where i is a natural number) (a number of 0 indicates that there is no viewpoint that can be referred to in inter-view prediction and that this viewpoint is encoded without using inter-view prediction); and
Third encoded data obtained by encoding, for each viewpoint that is encoded j-th (where j is a natural number of 2 or more) in the encoding/decoding order and for which the number of viewpoints that can be referred to in inter-view prediction is one or more, information specifying the viewpoint to be referred to in the inter-view prediction, and
The program causes the computer to execute:
A first step of decoding the first encoded data to obtain information specifying the encoding/decoding order among the plurality of viewpoints;
A second step of decoding the second encoded data to obtain information on the number of viewpoints that can be referred to in the inter-view prediction;
A third step of setting, for the 1st viewpoint, when the number of viewpoints that can be referred to in the inter-view prediction obtained by decoding in the second step is 1, the information specifying the 0th viewpoint, obtained from the information specifying the encoding/decoding order among the plurality of viewpoints decoded in the first step, as the information specifying the viewpoint to be referred to in the inter-view prediction; and
A fourth step of decoding the third encoded data for each of the viewpoints to obtain information specifying the viewpoint to be referred to in the inter-view prediction.