JP6273828B2

JP6273828B2 - Image coding apparatus, image coding method, image decoding apparatus, and image decoding method

Info

Publication number: JP6273828B2
Application number: JP2013264961A
Authority: JP
Inventors: 章弘屋森
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2013-12-24
Filing date: 2013-12-24
Publication date: 2018-02-07
Anticipated expiration: 2033-12-24
Also published as: JP2015122606A

Description

The present invention relates to an image encoding device, an image encoding method, an image decoding device, and an image decoding method.

For encoding moving images, an encoding scheme called H.264, recommended by the ITU-T (International Telecommunication Union Telecommunication Standardization Sector), is widely used. The International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) call the same technology MPEG-4 AVC (MPEG-4 Part 10 Advanced Video Coding). The two names are sometimes written together as H.264/AVC. The H.264/AVC scheme achieves higher compression efficiency than earlier schemes such as the MPEG-2 encoding scheme.

H.264/AVC encoding uses a technique called inter-frame prediction (inter prediction), in which the current video frame is predicted with reference to a different video frame. Predicting the current video frame with reference to a temporally earlier video frame is called forward prediction. Predicting the current video frame with reference to a temporally later video frame is called backward prediction. Predicting the current video frame with reference to a plurality of video frames located temporally before or after it is called bidirectional prediction (Bi-directive Prediction).

A video frame encoded without using inter-frame prediction is called an I (Intra-coded) picture. A video frame encoded using only forward prediction is called a P (Predicted) picture. A video frame encoded by selectively using forward prediction, backward prediction, or bidirectional prediction is called a B (Bi-directional Predicted) picture.

To improve compression efficiency, the H.264/AVC scheme defines a prediction mode called the direct mode. The direct mode can be applied to B pictures. There are two types of direct mode: the temporal direct mode and the spatial direct mode.

The temporal direct mode is a prediction mode that, when predicting a pixel block in a B picture, uses the motion vector of the pixel block at the same position in an already encoded picture. The spatial direct mode is a prediction mode that, when predicting a pixel block in a B picture, uses the motion vectors of the pixel blocks located around it. Either direct mode reduces the amount of information, because motion vector information for the pixel block being encoded does not have to be transmitted to the decoding side.

In recent years, an extension called MVC (Multi-view Video Coding) (hereinafter, the MVC scheme) has been defined so that the above encoding scheme can be applied to multi-view video such as stereoscopic video. The MVC scheme adds a prediction mode (hereinafter, inter-view prediction) in which, when a pixel block of a video frame corresponding to one viewpoint is encoded, a pixel block of a video frame corresponding to another viewpoint is referred to. However, the MVC scheme does not apply the temporal direct mode to pixel blocks that use inter-view prediction.

Regarding the MVC scheme, a method has been proposed (hereinafter, the proposed method) in which, when the referenced video frame is encoded using inter-view prediction, the temporal direct mode is applied to the referring video frame by using a motion vector set in a video frame of another viewpoint.

Japanese Laid-open Patent Publication No. 2012-182616 (JP 2012-182616 A)

Applying the temporal direct mode described above is an effective way to reduce the amount of information. Making the temporal direct mode applicable to B pictures encoded using inter-view prediction is therefore expected to contribute to improved encoding efficiency. Although the temporal direct mode of the H.264/AVC scheme has been described here as an example, the same applies to the merge mode of the next-generation standard, H.265/HEVC (High Efficiency Video Coding).

Accordingly, in one aspect, an object of the present invention is to provide an image encoding device, an image encoding method, an image decoding device, and an image decoding method capable of improving encoding efficiency in multi-view video encoding.

According to one aspect of the present disclosure, there is provided an image encoding device including: a storage unit that stores a plurality of moving images respectively corresponding to a plurality of viewpoints; and a calculation unit that detects, from among the images included in a first moving image of the plurality of moving images, a target image having inter-view motion information that refers to an image at the same time included in a second moving image, and that encodes an image region of the target image using a motion vector of the image at the same time. The calculation unit adjusts the length of the motion vector used for encoding the image region based on a first time interval and a second time interval. The first time interval is the interval between the target image and a first reference image, in the first moving image, that is referred to from the image region of the target image. The second time interval is the interval between the image at the same time and a second reference image, in the second moving image, that is referred to by the motion vector.

According to the present invention, encoding efficiency in multi-view video encoding can be improved.

A diagram illustrating an example of the image encoding device according to the first embodiment.
A diagram for explaining the temporal direct mode.
A first diagram for explaining the relationship between inter-view prediction and the temporal direct mode.
A second diagram for explaining the relationship between inter-view prediction and the temporal direct mode.
A third diagram for explaining the relationship between inter-view prediction and the temporal direct mode.
A diagram illustrating an example of the system according to the second embodiment.
A diagram illustrating an example of hardware capable of realizing the functions of the encoding device according to the second embodiment.
A first block diagram illustrating an example of the functions of the encoding device according to the second embodiment.
A second block diagram illustrating an example of the functions of the encoding device according to the second embodiment.
A third block diagram illustrating an example of the functions of the encoding device according to the second embodiment.
A first diagram for explaining the method of calculating the reference vector according to the second embodiment.
A second diagram for explaining the method of calculating the reference vector according to the second embodiment.
A first diagram for explaining the method of correcting the reference vector according to the second embodiment.
A second diagram for explaining the method of correcting the reference vector according to the second embodiment.
A first block diagram illustrating an example of the functions of the decoding device according to the second embodiment.
A second block diagram illustrating an example of the functions of the decoding device according to the second embodiment.
A third block diagram illustrating an example of the functions of the decoding device according to the second embodiment.
A first flowchart illustrating the flow of the encoding process according to the second embodiment.
A second flowchart illustrating the flow of the encoding process according to the second embodiment.
A third flowchart illustrating the flow of the encoding process according to the second embodiment.
A first diagram for explaining the encoding method according to the third embodiment.
A second diagram for explaining the encoding method according to the third embodiment.

Embodiments of the present invention will be described below with reference to the accompanying drawings. In this specification and the drawings, elements having substantially the same function are denoted by the same reference numerals, and redundant description of them may be omitted.

<1. First Embodiment>
The first embodiment will be described with reference to FIG. 1. FIG. 1 is a diagram illustrating an example of an image encoding device according to the first embodiment.

As illustrated in FIG. 1, the image encoding device 10 according to the first embodiment includes a storage unit 11 and a calculation unit 12.
The storage unit 11 is a volatile storage device such as a RAM (Random Access Memory), or a non-volatile storage device such as an HDD (Hard Disk Drive) or a flash memory. The calculation unit 12 is a processor such as a CPU (Central Processing Unit) or a DSP (Digital Signal Processor). However, the calculation unit 12 may be an electronic circuit such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array). The calculation unit 12 executes, for example, a program stored in the storage unit 11 or another memory.

The storage unit 11 stores a plurality of moving images 21 and 22 respectively corresponding to a plurality of viewpoints #1 and #2. The example of FIG. 1 shows the case of two viewpoints, but the same applies when there are three or more viewpoints. In the example of FIG. 1, the moving image 21 corresponding to viewpoint #1 is set to Base-View, and the moving image 22 corresponding to viewpoint #2 is set to non-Base-View. The description below follows this example.

Base-View means that the moving image contains no image for which inter prediction referring to an image of another viewpoint is performed. That is, when the images Pic₁₀, ..., Pic₁₅ of the moving image 21 corresponding to viewpoint #1 are encoded and decoded, the images Pic₂₀, ..., Pic₂₅ of the moving image 22 corresponding to viewpoint #2 are not referred to. Because the moving image 21 is encoded and decoded using only the information of the images Pic₁₀, ..., Pic₁₅, it can be encoded and decoded even without any information on the images Pic₂₀, ..., Pic₂₅ of the moving image 22 corresponding to viewpoint #2.

Non-Base-View, on the other hand, means that the moving image contains an image for which inter prediction referring to an image of another viewpoint is performed. That is, when the images Pic₂₀, ..., Pic₂₅ of the moving image 22 corresponding to viewpoint #2 are encoded and decoded, at least one of the images Pic₁₀, ..., Pic₁₅ of the moving image 21 corresponding to viewpoint #1 is referred to. In other words, information on at least one of the images Pic₁₀, ..., Pic₁₅ of the moving image 21 corresponding to viewpoint #1 is used for encoding and decoding the moving image 22.

The calculation unit 12 detects, from among the images included in one moving image 22, a target image CurrPic having inter-view motion information mvCol that refers to an image ColPic at the same time included in the other moving image 21. That is, the calculation unit 12 detects a target image CurrPic that includes an image region whose prediction mode is set to inter-view prediction. In the example of FIG. 1, the image Pic₂₄ is detected as the target image CurrPic. The image ColPic referred to by the inter-view motion information mvCol is the image Pic₁₄.
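The detection step can be pictured with a small sketch. The following Python fragment is only an illustration under assumed data structures: the dictionary layout, the `mode` strings, and the function name `find_target_pictures` are hypothetical and do not come from the specification.

```python
def find_target_pictures(pictures):
    """Return the pictures that contain at least one inter-view predicted region.

    Hypothetical layout: each picture is a dict with a "name" and a list of
    "regions"; each region is a dict whose "mode" is "intra", "inter", or
    "inter_view" (prediction from the same-time picture of another viewpoint).
    """
    return [
        pic for pic in pictures
        if any(region["mode"] == "inter_view" for region in pic["regions"])
    ]


# Non-Base-View pictures of viewpoint #2 (contents invented for illustration).
view2 = [
    {"name": "Pic22", "regions": [{"mode": "intra"}]},
    {"name": "Pic23", "regions": [{"mode": "inter"}, {"mode": "inter"}]},
    {"name": "Pic24", "regions": [{"mode": "inter"}, {"mode": "inter_view"}]},
]

targets = [pic["name"] for pic in find_target_pictures(view2)]
print(targets)
```

With the invented contents above, only the picture standing in for Pic₂₄ is selected, which mirrors the example of FIG. 1.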

The calculation unit 12 encodes the image region of the target image CurrPic using the motion information basemvCol₀ and basemvCol₁ of the image ColPic at the same time as the target image CurrPic. The referenced image ColPic is encoded before the target image CurrPic. The motion information basemvCol₀ and basemvCol₁ of the image ColPic is therefore available when the target image CurrPic is encoded.

For example, the calculation unit 12 identifies, among the image regions included in the image ColPic, one or more image regions that the inter-view motion information mvCol associates with the image region of the target image CurrPic. The calculation unit 12 then encodes the image region of the target image CurrPic based on the motion information basemvCol₀ and basemvCol₁ of the identified image regions. In the example of FIG. 1, motion information mvL₀ and mvL₁ is generated from the motion information basemvCol₀ and basemvCol₁, and the target image CurrPic is encoded using mvL₀ and mvL₁.

For example, when the time tbL₀ between the images Pic₁₂ and Pic₁₄ differs from the time tdL₀ between the images Pic₂₂ and Pic₂₄, the calculation unit 12 scales the motion information basemvCol₀ by the ratio (tdL₀/tbL₀). The calculation unit 12 then uses the scaled motion information (tdL₀/tbL₀)*basemvCol₀ as the motion information mvL₀. Similarly, the calculation unit 12 scales the motion information basemvCol₁ by the ratio (tdL₁/tbL₁) and uses the scaled motion information (tdL₁/tbL₁)*basemvCol₁ as the motion information mvL₁.
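As a rough illustration of this scaling, the sketch below applies the ratios (tdL₀/tbL₀) and (tdL₁/tbL₁) to two-component vectors. The function name `scale_motion`, the tuple representation of a vector, and all numeric values are assumptions made for the example. Exact rational arithmetic is used for readability; a practical codec would more likely use fixed-point integer arithmetic.

```python
from fractions import Fraction


def scale_motion(base_mv, td, tb):
    """Scale base_mv componentwise by the ratio td / tb."""
    if tb == 0:
        raise ValueError("tb must be non-zero to form the ratio td / tb")
    ratio = Fraction(td, tb)
    return tuple(ratio * component for component in base_mv)


# Illustrative values only (not taken from the specification).
basemv_col0, tb_l0, td_l0 = (8, -4), 2, 1   # L0 direction
basemv_col1, tb_l1, td_l1 = (-6, 3), 1, 2   # L1 direction

mv_l0 = scale_motion(basemv_col0, td_l0, tb_l0)   # (tdL0 / tbL0) * basemvCol0
mv_l1 = scale_motion(basemv_col1, td_l1, tb_l1)   # (tdL1 / tbL1) * basemvCol1
print(mv_l0, mv_l1)
```

When the two intervals are equal, the ratio is 1 and the vector of ColPic is reused unchanged.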

As described above, the motion information mvL₀ and mvL₁ used for encoding the target image CurrPic is obtained from the motion information basemvCol₀ and basemvCol₁ of the image ColPic pointed to by the inter-view motion information mvCol. Therefore, even without the motion information mvL₀ and mvL₁ of the target image CurrPic, the target image CurrPic can be decoded as long as the motion information basemvCol₀ and basemvCol₁ of the image ColPic is available. In other words, encoding efficiency improves by the amount saved by not including mvL₀ and mvL₁ in the information for decoding (the encoded data). The above technique can also be applied when decoding the non-Base-View moving image 22, and it contributes to improved encoding efficiency.

The first embodiment has been described above.
<2. Second Embodiment>
Next, a second embodiment will be described. For convenience of explanation, the MVC scheme is used as an example here, but the scope of application of the technique according to the second embodiment is not limited to it.

[2-1. Temporal direct mode]
The temporal direct mode will be described with reference to FIG. 2. FIG. 2 is a diagram for explaining the temporal direct mode.

The temporal direct mode focuses on the continuity of a moving image and determines the motion vector of the macroblock to be encoded (hereinafter, CurrMB) from the motion vector of a temporally adjacent macroblock. Applying the temporal direct mode increases the temporal correlation of motion, and it also improves encoding efficiency because motion vector information for CurrMB does not have to be transmitted to the decoding side.

Bidirectional prediction in the H.264/AVC scheme is realized by combining List0 (L0) direction prediction and List1 (L1) direction prediction. The L0 direction is often set to the forward direction, and the L1 direction is often set to the backward direction. However, bidirectional prediction in the H.264/AVC scheme also allows a plurality of different forward predictions, or a plurality of different backward predictions, to be combined. For this reason, the prediction directions are expressed as the L0 direction and the L1 direction. The same expressions may be used in this document.

The reference direction is expressed for each picture using a parameter called ref_idx. For example, ref_idx in the L0 direction is written ref_idx_l0, and ref_idx in the L1 direction is written ref_idx_l1. In the temporal direct mode, the macroblock at the same position as CurrMB is identified in the I picture or P picture having the smallest ref_idx in the L1 direction, and the motion vector of CurrMB is determined on the basis of the motion vector of that macroblock. The motion vector used as the basis for determining the motion vector of CurrMB may be referred to as the reference vector.

In the example of FIG. 2, CurrMB in CurrPic, which is a B picture, is the macroblock to be encoded in the temporal direct mode. When predicting CurrMB in a B picture, the temporal direct mode looks at an already encoded picture (hereinafter, ColPic) and uses the motion vector basemvCol of the macroblock located at the same position as CurrMB in ColPic (hereinafter, ColMB). In the example of FIG. 2, the P picture located in the L1 direction (ref_idx = 0) is ColPic. The motion vector basemvCol is the reference vector. The picture pointed to by the reference vector basemvCol may be referred to as RefPicCol.

In the temporal direct mode, the motion vector mvL₀ from CurrPic to RefPicCol and the motion vector mvL₁ from CurrPic to ColPic are calculated based on the reference vector basemvCol. For example, mvL₀ and mvL₁ are obtained by scaling the reference vector basemvCol based on the ratio of picture intervals (for example, differences in POC (PictureOrderCount)).

In the example of FIG. 2, the time between RefPicCol and ColPic is td, and the time between RefPicCol and CurrPic is tb. In this case, the motion vector (tb/td)*basemvCol, obtained by scaling the reference vector basemvCol by the ratio (tb/td), is mvL₀, and (mvL₀ - basemvCol) is mvL₁. The temporal direct mode uses the motion vectors mvL₀ and mvL₁ obtained in this way to predict CurrMB. In this example, because mvL₀ and mvL₁ are derived from the reference vector basemvCol, the motion information mvL₀ and mvL₁ does not have to be transmitted to the decoding side.
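The derivation of mvL₀ and mvL₁ can be sketched as follows. This is a minimal illustration, assuming that td and tb are taken as POC differences (ColPic minus RefPicCol, and CurrPic minus RefPicCol); the function name `temporal_direct`, the argument names, and the numeric values are hypothetical. Exact fractions stand in for whatever rounding a real codec applies.

```python
from fractions import Fraction


def temporal_direct(basemv_col, poc_curr, poc_col, poc_ref_col):
    """Derive (mvL0, mvL1) from the reference vector basemvCol.

    td: interval between RefPicCol and ColPic.
    tb: interval between RefPicCol and CurrPic.
    mvL0 = (tb / td) * basemvCol, mvL1 = mvL0 - basemvCol.
    """
    td = poc_col - poc_ref_col
    tb = poc_curr - poc_ref_col
    if td == 0:
        raise ValueError("td is zero; the ratio tb / td is undefined")
    ratio = Fraction(tb, td)
    mv_l0 = tuple(ratio * c for c in basemv_col)
    mv_l1 = tuple(a - b for a, b in zip(mv_l0, basemv_col))
    return mv_l0, mv_l1


# Illustrative POC values: RefPicCol at 0, CurrPic at 2, ColPic at 4.
mv_l0, mv_l1 = temporal_direct((8, -4), poc_curr=2, poc_col=4, poc_ref_col=0)
print(mv_l0, mv_l1)
```

In this example CurrPic lies halfway between RefPicCol and ColPic, so mvL₀ is half of basemvCol and mvL₁ is the remaining half with the opposite sign.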

How the motion vectors used in the direct mode are held is determined by the value of direct_8x8_inference_flag in the SPS (sequence_parameter_set). When direct_8x8_inference_flag = 1, the motion vectors used for the direct mode are held in units of 8x8. When direct_8x8_inference_flag = 0, they are held in units of 4x4.
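To make the effect of the flag concrete, the short sketch below maps the flag value to the block unit and counts how many direct-mode vectors a 16x16 macroblock would then hold. The helper names are hypothetical and the count is simple arithmetic, not a statement about any particular implementation.

```python
def direct_mv_unit(direct_8x8_inference_flag):
    """Return the block unit (in pixels) in which direct-mode vectors are held."""
    if direct_8x8_inference_flag not in (0, 1):
        raise ValueError("direct_8x8_inference_flag must be 0 or 1")
    return 8 if direct_8x8_inference_flag == 1 else 4


def direct_mvs_per_macroblock(direct_8x8_inference_flag, mb_size=16):
    """Count the vector positions inside one mb_size x mb_size macroblock."""
    unit = direct_mv_unit(direct_8x8_inference_flag)
    return (mb_size // unit) ** 2


coarse = direct_mvs_per_macroblock(1)   # 8x8 units
fine = direct_mvs_per_macroblock(0)     # 4x4 units
print(coarse, fine)
```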

The temporal direct mode has been described above.
[2-2. Inter-view prediction and temporal direct mode]
Next, the relationship between inter-view prediction and the temporal direct mode will be described with reference to FIGS. 3 to 5. The example of the temporal direct mode shown in FIG. 2 applies the temporal direct mode to the encoding of a moving image corresponding to a single viewpoint, so inter-view prediction is not taken into account there. The relationship between inter-view prediction and the temporal direct mode is considered here.

FIG. 3 is a first diagram for explaining the relationship between inter-view prediction and the temporal direct mode. FIG. 4 is a second diagram for explaining the relationship between inter-view prediction and the temporal direct mode. FIG. 5 is a third diagram for explaining the relationship between inter-view prediction and the temporal direct mode.

(2-2-1. Inter-view prediction)
The MVC scheme concerns the encoding of multi-view video. In addition to prediction modes such as L0 direction prediction and L1 direction prediction, the MVC scheme uses a new prediction mode (inter-view prediction) that refers to a picture of a moving image corresponding to another viewpoint. Inter-view prediction refers to a picture of another viewpoint at the same time (same POC) as the picture to be encoded.

Consider the example of FIG. 3. FIG. 3 illustrates pictures of the moving image corresponding to viewpoint #1 (Base-View) and pictures of the moving image corresponding to viewpoint #2 (non-Base-View).

In the example of FIG. 3, the macroblock MB#2 in the B picture Pic#2 is the target of encoding based on inter-view prediction. Each picture of the moving image corresponding to viewpoint #1 is encoded without referring to the pictures of the moving image corresponding to viewpoint #2. Assume that the picture Pic#1 corresponding to viewpoint #1 has already been encoded. In this case, the motion vector used for encoding the macroblock MB#2 is an inter-view motion vector (iMV) pointing to the macroblock MB#1 included in the B picture Pic#1 of viewpoint #1 (the other viewpoint).

As described above, inter-view prediction refers to a picture of another viewpoint at the same time (same POC) as the picture to be encoded. Therefore, if the B picture Pic#2 is taken as CurrPic of the temporal direct mode and the B picture Pic#1 pointed to by the inter-view motion vector iMV is taken as ColPic, the POC difference between CurrPic and ColPic is 0. Likewise, if the B picture Pic#2 is taken as ColPic, the B picture Pic#1 becomes RefPicCol, and the POC difference between RefPicCol and ColPic is 0. In either case, the motion vector of CurrPic cannot be obtained by scaling.
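The reason scaling fails can be checked with a few lines of code. The sketch below is illustrative only: the `Picture` tuple, its fields, and the function name are assumptions, and the POC values are invented. It shows that a reference across viewpoints at the same time yields a POC difference of 0, so no ratio of intervals can be formed.

```python
from collections import namedtuple

# Hypothetical picture descriptor: viewpoint number and picture order count.
Picture = namedtuple("Picture", ["view", "poc"])


def is_temporal_scaling_possible(col_pic, ref_pic_col):
    """Scaling needs a non-zero POC difference between ColPic and RefPicCol."""
    return (col_pic.poc - ref_pic_col.poc) != 0


pic_view2 = Picture(view=2, poc=4)      # stands in for the B picture Pic#2
pic_view1 = Picture(view=1, poc=4)      # stands in for Pic#1, same time, other viewpoint
earlier_view2 = Picture(view=2, poc=2)  # an earlier picture in viewpoint #2

inter_view_case = is_temporal_scaling_possible(pic_view2, pic_view1)
temporal_case = is_temporal_scaling_possible(pic_view2, earlier_view2)
print(inter_view_case, temporal_case)
```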

Section H.7.4.3 of the H.264 specification (T-REC-H.264-200903-I!!PDF-E.pdf) states that, when ColPic contains a block that performs inter-view prediction, the applicable direct mode is limited to the spatial direct mode. However, the temporal direct mode, which preserves the temporal continuity of motion vectors across the preceding and following pictures, can be expected to improve encoding efficiency more than the spatial direct mode, which simply uses the motion vectors of neighboring blocks. For this reason, the second embodiment examines the relationship between inter-view prediction and the temporal direct mode and proposes a method that makes the temporal direct mode applicable to the encoding of multi-view video.

(2-2-2. When the reference block is an inter-view prediction block)
Refer to FIG. 4. The example of FIG. 4 shows the case in which the ColPic macroblock located at the same position as the CurrPic macroblock corresponding to viewpoint #2 refers, through the inter-view motion vector iMV, to a picture corresponding to viewpoint #2. In this case, scaling the inter-view motion vector iMV does not yield a motion vector usable for encoding CurrPic.

Therefore, as shown in FIG. 4, the present inventor has proposed a method of setting the motion vector of the block pointed to by the inter-view motion vector iMV as the reference vector basemvCol. With this method, the motion vector of CurrPic is obtained by scaling the reference vector basemvCol based on the interval td (POC difference) between the pictures I₁₂ and P₁₅ and the interval tb (POC difference) between the pictures I₂₂ and B₂₄. The temporal direct mode can therefore be applied even in the case of FIG. 4. The second embodiment also takes encoding by the method shown in FIG. 4 into account.

(2-2-3. When the reference block is a block of another viewpoint)
Refer to FIG. 5. The example of FIG. 5 shows a case where the CurrPic macroblock of viewpoint #2 refers, through an inter-view motion vector iMV, to a picture corresponding to viewpoint #1. In this case, the length tb (POC difference) of the motion vector used for inter prediction of CurrPic is 0. The reference vector basemvCol therefore cannot be scaled by the ratio between tb and the time td that corresponds to the length of the reference vector basemvCol.

The second embodiment therefore proposes a method that sets the motion vector of the block pointed to by the inter-view motion vector iMV as the reference vector basemvCol and obtains the motion vector of CurrPic without using the time tb that corresponds to the length of the inter-view motion vector iMV. In the example of FIG. 5, for instance, the reference vector basemvCol can be scaled by the ratio between the time td, which corresponds to the length of the reference vector basemvCol, and the time that corresponds to the interval between pictures I₂₂ and B₂₄. With this method, the temporal direct mode can be applied even in a case such as FIG. 5.
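
The handling of an inter-view vector described above can be sketched as follows. This is only an illustrative sketch, not the patented implementation: the `Block` record, the `follow_imv` callback, and the function names are hypothetical, and the rounding rule is an assumption. The sketch borrows the motion vector of the block that iMV points to as basemvCol, then scales it by an interval measured in the current view instead of tb.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Callable, Optional, Tuple

Vec = Tuple[int, int]


@dataclass
class Block:
    mv: Vec            # vector stored for the block
    inter_view: bool   # True when mv is an inter-view vector (iMV)
    poc_span: int = 0  # POC difference covered by mv; 0 for an iMV


def resolve_base_vector(block: Block,
                        follow_imv: Callable[[Block], Optional[Block]]
                        ) -> Optional[Tuple[Vec, int]]:
    """Return (basemvCol, td), following one iMV hop when needed."""
    if block.inter_view:
        target = follow_imv(block)      # block that iMV points to
        if target is None or target.inter_view:
            return None                 # no temporal vector to borrow
        block = target
    return block.mv, block.poc_span


def scale_base_vector(base: Vec, td: int, interval: int) -> Vec:
    """Scale basemvCol by interval/td (assumed: round to nearest integer)."""
    if td == 0:
        raise ValueError("td must be non-zero")
    ratio = Fraction(interval, td)
    return tuple(round(c * ratio) for c in base)
```

For example, if the borrowed vector spans a POC difference of 3 (as the labels I₁₂ and P₁₅ suggest) and the interval in the current view is 2 (I₂₂ to B₂₄), the vector is scaled by 2/3.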

The relationship between inter-view prediction and the temporal direct mode has been described above.
[2-3. System]
Next, a system according to the second embodiment will be described with reference to FIG. 6. FIG. 6 is a diagram illustrating an example of the system according to the second embodiment.

As shown in FIG. 6, the system according to the second embodiment includes an encoding device 100 and a decoding device 200. The encoding device 100 encodes a plurality of moving images that respectively correspond to a plurality of viewpoints. The encoded data corresponding to the moving images encoded by the encoding device 100 is transmitted to the decoding device 200, for example via a network or a portable recording medium. The decoding device 200 decodes the encoded data acquired from the encoding device 100 and restores the plurality of moving images.

In the following, for convenience of explanation, the description assumes that moving images of two viewpoints (Base-View and non-Base-View) are encoded and decoded. However, the scope of the technique according to the second embodiment is not limited to this, and the technique is also applicable when the number of viewpoints is three or more.

The system has been described above.
[2-4. Hardware example]
Next, hardware capable of realizing the functions of the encoding device 100 will be described with reference to FIG. 7. FIG. 7 is a diagram illustrating an example of hardware capable of realizing the functions of the encoding device according to the second embodiment.

The functions of the encoding device 100 can be realized using, for example, the hardware resources of the information processing apparatus illustrated in FIG. 7. That is, the functions of the encoding device 100 are realized by controlling the hardware shown in FIG. 7 with a computer program.

As shown in FIG. 7, this hardware mainly includes a CPU 902, a ROM (Read Only Memory) 904, a RAM 906, a host bus 908, and a bridge 910. The hardware further includes an external bus 912, an interface 914, an input unit 916, an output unit 918, a storage unit 920, a drive 922, a connection port 924, and a communication unit 926.

The CPU 902 functions, for example, as an arithmetic processing unit or a control unit, and controls all or part of the operation of each component based on various programs recorded in the ROM 904, the RAM 906, the storage unit 920, or the removable recording medium 928. The ROM 904 is an example of a storage device that stores programs read by the CPU 902, data used for calculation, and the like. The RAM 906 temporarily or permanently stores, for example, a program read by the CPU 902 and various parameters that change while the program is executed.

These elements are connected to one another via, for example, the host bus 908, which is capable of high-speed data transmission. The host bus 908 is in turn connected, for example via the bridge 910, to the external bus 912, whose data transmission speed is relatively low. As the input unit 916, for example, a mouse, a keyboard, a touch panel, a touch pad, a button, a switch, or a lever is used. A remote controller capable of transmitting control signals using infrared rays or other radio waves may also be used as the input unit 916.

As the output unit 918, for example, a display device such as a CRT (Cathode Ray Tube), an LCD (Liquid Crystal Display), a PDP (Plasma Display Panel), or an ELD (Electro-Luminescence Display) is used. An audio output device such as a speaker or headphones, or a printer, may also be used as the output unit 918. In other words, the output unit 918 is a device capable of outputting information visually or audibly.

The storage unit 920 is a device for storing various data. As the storage unit 920, for example, a magnetic storage device such as an HDD is used. A semiconductor storage device such as an SSD (Solid State Drive) or a RAM disk, an optical storage device, or a magneto-optical storage device may also be used as the storage unit 920.

The drive 922 is a device that reads information recorded on the removable recording medium 928, which is a detachable recording medium, or writes information to the removable recording medium 928. As the removable recording medium 928, for example, a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory is used.

The connection port 924 is a port for connecting an external connection device 930, such as a USB (Universal Serial Bus) port, an IEEE 1394 port, a SCSI (Small Computer System Interface) port, an RS-232C port, or an optical audio terminal. As the external connection device 930, for example, a printer is used.

The communication unit 926 is a communication device for connecting to the network 932. As the communication unit 926, for example, a communication circuit for a wired or wireless LAN (Local Area Network), a communication circuit for WUSB (Wireless USB), a communication circuit or router for optical communication, a communication circuit or router for ADSL (Asymmetric Digital Subscriber Line), or a communication circuit for a mobile phone network is used. The network 932 connected to the communication unit 926 is a wired or wireless network and includes, for example, the Internet, a LAN, a broadcast network, and a satellite communication line.

The hardware capable of realizing the functions of the encoding device 100 has been described above. The functions of the decoding device 200 can also be realized using the hardware illustrated in FIG. 7. That is, the hardware shown in FIG. 7 is also an example of hardware capable of realizing the functions of the decoding device 200. A detailed description of the hardware capable of realizing the functions of the decoding device 200 is therefore omitted.

[2-5. Functions of the encoding device]
Next, the functions of the encoding device 100 will be described with reference to FIGS. 8 to 10.

FIG. 8 is a first block diagram illustrating an example of the functions of the encoding device according to the second embodiment. FIG. 9 is a second block diagram illustrating an example of the functions of the encoding device according to the second embodiment. FIG. 10 is a third block diagram illustrating an example of the functions of the encoding device according to the second embodiment.

(2-5-1. Overall)
As shown in FIG. 8, the encoding device 100 includes a first viewpoint encoding unit 101, a statistical information acquisition unit 102, and a second viewpoint encoding unit 103. The functions of the first viewpoint encoding unit 101, the statistical information acquisition unit 102, and the second viewpoint encoding unit 103 can be realized using the above-described CPU 902 and the like.

The first viewpoint encoding unit 101 receives pictures of the moving image corresponding to the Base-View (hereinafter, Base-View pictures). The first viewpoint encoding unit 101 encodes the input Base-View pictures. The functions of the first viewpoint encoding unit 101 are described in detail later.

The statistical information acquisition unit 102 acquires statistical information obtained when the first viewpoint encoding unit 101 encodes a Base-View picture. The statistical information includes, for example, the quantization scale value (hereinafter, qP value) and the number of difference coefficients. The statistical information acquired by the statistical information acquisition unit 102 is input to the second viewpoint encoding unit 103.

The second viewpoint encoding unit 103 receives pictures of the moving image corresponding to the non-Base-View (hereinafter, non-Base-View pictures). The second viewpoint encoding unit 103 encodes the non-Base-View pictures using the statistical information together with information on the motion vectors and pictures that the first viewpoint encoding unit 101 used for encoding. The functions of the second viewpoint encoding unit 103 are described in detail later.

(2-5-2. Details of the first viewpoint encoding unit 101)
The functions of the first viewpoint encoding unit 101 will now be described further with reference to FIG. 9.

As shown in FIG. 9, the first viewpoint encoding unit 101 includes a subtractor 111, an orthogonal transform / quantization unit 112, a variable length encoding unit 113, an inverse orthogonal transform / inverse quantization unit 114, and an adder 115. The first viewpoint encoding unit 101 further includes a frame memory 116, a motion compensation unit 117, a motion vector detection unit 118, a motion vector memory 119, a reference vector acquisition unit 120, a direct vector calculation unit 121, and a mode determination unit 122.

The subtractor 111 receives macroblock data (hereinafter, MB data) obtained by dividing a Base-View picture into blocks (MB) of 16 × 16 pixels. The subtractor 111 subtracts the MB data of the predicted image output from the motion compensation unit 117 from the input MB data to generate prediction error data. The prediction error data generated by the subtractor 111 is input to the orthogonal transform / quantization unit 112.

The orthogonal transform / quantization unit 112 orthogonally transforms the input prediction error data in units of 8 × 8 pixel blocks or 4 × 4 pixel blocks. Applicable orthogonal transforms include, for example, the DCT (Discrete Cosine Transform) and the Hadamard transform. Through such an orthogonal transform, the prediction error data is converted into data of horizontal and vertical frequency components.

The orthogonal transform / quantization unit 112 also quantizes the frequency component data obtained by the orthogonal transform to generate quantized data. The quantized data generated by the orthogonal transform / quantization unit 112 is input to the variable length encoding unit 113 and the inverse orthogonal transform / inverse quantization unit 114.

The variable length encoding unit 113 performs variable length encoding on the quantized data to generate encoded data. Variable length encoding is an encoding method that assigns codes of variable length according to the frequency of appearance of symbols. For example, the variable length encoding unit 113 assigns a short code to a combination of coefficients that appears frequently and a long code to a combination of coefficients that appears rarely. Applicable variable length encoding methods include CAVLC (Context-Adaptive Variable Length Coding) and CABAC (Context-Adaptive Binary Arithmetic Coding).

The inverse orthogonal transform / inverse quantization unit 114 dequantizes the quantized data to restore the frequency component data. The inverse orthogonal transform / inverse quantization unit 114 then applies an inverse orthogonal transform to the frequency component data to restore the prediction error data. The restored prediction error data is input to the adder 115. The adder 115 adds the MB data generated by the motion compensation unit 117 through motion compensation and the prediction error data restored by the inverse orthogonal transform / inverse quantization unit 114 to generate a locally decoded image.
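
The path from the subtractor through transform, quantization, and local decoding can be illustrated with a small sketch. It assumes a 4 × 4 Hadamard transform (one of the transforms named in the text) and a plain uniform quantizer with a hypothetical step size `step`; the function names are illustrative and do not come from the patent, and a real H.264 encoder uses its own integer transform and quantization tables.

```python
# 4x4 Hadamard matrix; it is symmetric and H * H = 4 * I.
H = [[1, 1, 1, 1],
     [1, 1, -1, -1],
     [1, -1, -1, 1],
     [1, -1, 1, -1]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def forward(residual):
    """Prediction error -> frequency components (H * X * H)."""
    return matmul(matmul(H, residual), H)


def inverse(coeffs):
    """Frequency components -> prediction error (H * Y * H / 16)."""
    return [[v / 16 for v in row] for row in matmul(matmul(H, coeffs), H)]


def quantize(coeffs, step):
    return [[round(v / step) for v in row] for row in coeffs]


def dequantize(levels, step):
    return [[v * step for v in row] for row in levels]


def encode_and_reconstruct(org, pred, step):
    """Return (quantized levels, locally decoded block)."""
    residual = [[o - p for o, p in zip(ro, rp)] for ro, rp in zip(org, pred)]
    levels = quantize(forward(residual), step)
    restored = inverse(dequantize(levels, step))
    recon = [[p + r for p, r in zip(rp, rr)] for rp, rr in zip(pred, restored)]
    return levels, recon
```

The encoder keeps `recon` rather than `org` as reference data so that it predicts from the same pixels the decoder will have.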

The locally decoded image generated by the adder 115 is input to the frame memory 116. A deblocking filter may be applied to the locally decoded image. The frame memory 116 stores the locally decoded image input from the adder 115. The locally decoded image stored in the frame memory 116 is referred to by the motion compensation unit 117. It is also used by the second viewpoint encoding unit 103 to encode non-Base-View pictures.

The motion compensation unit 117 performs motion compensation on the locally decoded image stored in the frame memory 116 using the motion vector stored in the motion vector memory 119, and generates motion-compensated MB data (MB data of the reference picture). The motion vector detection unit 118 performs a motion search using the MB data of the input Base-View picture (MB data of the target picture) and the MB data of the reference picture, and generates a motion vector.

One motion search method is, for example, the block matching method. The motion search uses, for example, the magnitude of the sum of absolute pixel differences. For example, the evaluation value cost used for motion compensation can be expressed as in equation (1) below, using the absolute-difference term SAD_cost and the evaluation value MV_cost, which corresponds to the code amount of the motion vector. SAD stands for Sum of Absolute Differences, and MV stands for Motion Vector.

cost = SAD_cost + MV_cost ... (1)

The motion vector detection unit 118 detects, for example, the motion vector that minimizes the evaluation value cost. The motion vector detected by the motion vector detection unit 118 is input to the motion vector memory 119 and the mode determination unit 122. The motion vector memory 119 stores the motion vector detected by the motion vector detection unit 118.
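
A block-matching search that minimizes the evaluation value of equation (1) might look like the sketch below. It is an exhaustive search over a small window, written for clarity rather than speed. The MV_cost term here is a stand-in (a weight `lam` times the L1 length of the candidate vector), which is an assumption made for illustration, and all function names are hypothetical.

```python
def sad(block_a, block_b):
    """Sum of absolute pixel differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def crop(frame, top, left, size):
    """Cut a size x size block out of a frame stored as a list of rows."""
    return [row[left:left + size] for row in frame[top:top + size]]


def motion_search(cur, ref, top, left, size, search_range, lam=1):
    """Return (cost, (dx, dy)) minimizing cost = SAD_cost + MV_cost."""
    target = crop(cur, top, left, size)
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue  # candidate block falls outside the reference picture
            sad_cost = sad(target, crop(ref, y, x, size))
            mv_cost = lam * (abs(dx) + abs(dy))  # assumed code-amount proxy
            cost = sad_cost + mv_cost
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best
```

Because MV_cost grows with the vector length, a slightly worse match close to the origin can win over a perfect match far away when `lam` is large.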

The reference vector acquisition unit 120 determines whether the macroblock to be encoded (CurrMB) is a macroblock that uses bidirectional prediction. When CurrMB uses bidirectional prediction, the reference vector acquisition unit 120 identifies the ColPic macroblock (ColMB) located at the same position as CurrMB and acquires the motion vector of ColMB from the motion vector memory 119 (see FIG. 2). The motion vector acquired by the reference vector acquisition unit 120 is input to the direct vector calculation unit 121 as the reference vector basemvCol.

The direct vector calculation unit 121 scales the reference vector basemvCol in proportion to the temporal distances and generates the motion vectors of CurrMB (direct vectors). In the example of FIG. 2, for instance, the reference vector basemvCol is scaled based on the ratio between the times tb and td to generate the direct vectors mvL₀ and mvL₁. The direct vectors generated by the direct vector calculation unit 121 in this way are input to the mode determination unit 122.
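
The scaling performed by the direct vector calculation unit can be sketched as follows, using the usual temporal direct relations mvL0 = basemvCol × tb / td and mvL1 = mvL0 − basemvCol. This is a simplified sketch: the function name is hypothetical, and the exact rounding is an assumption (the H.264 standard specifies a fixed-point computation that this example does not reproduce).

```python
from fractions import Fraction


def temporal_direct_vectors(basemv_col, tb, td):
    """Derive (mvL0, mvL1) from the co-located vector and POC distances.

    tb: POC distance from the current picture to its forward reference.
    td: POC distance spanned by basemv_col.
    """
    if td == 0:
        raise ValueError("td must be non-zero")
    ratio = Fraction(tb, td)
    # Forward vector: the co-located vector shortened to the tb/td fraction.
    mv_l0 = tuple(round(c * ratio) for c in basemv_col)
    # Backward vector: what remains of the co-located vector, reversed.
    mv_l1 = tuple(a - b for a, b in zip(mv_l0, basemv_col))
    return mv_l0, mv_l1
```

With basemvCol = (8, −4), tb = 1, and td = 2, the forward vector is half of basemvCol and the backward vector is the negated other half.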

The mode determination unit 122 selects, from among five prediction modes (intra prediction mode, forward prediction mode, backward prediction mode, bidirectional prediction mode, and direct mode), the prediction mode with the lowest coding cost. For example, the mode determination unit 122 calculates, for each prediction mode, the coding cost given by equations (2) to (6) below.

cost_direct is the coding cost of the direct mode. cost_forward is the coding cost of the forward prediction mode. cost_backward is the coding cost of the backward prediction mode. cost_bidirection is the coding cost of the bidirectional prediction mode. cost_intra is the coding cost of the intra prediction mode. *org denotes the MB data of the target picture. *ref denotes the MB data of the reference picture. *mv is a candidate motion vector for CurrMB. *prevmv denotes a prediction vector based on the macroblocks located around CurrMB.

cost_direct = SAD(*org, *ref) ... (2)
cost_forward = SAD(*org, *ref) + MV_COST(*mv, *prevmv) ... (3)
cost_backward = SAD(*org, *ref) + MV_COST(*mv, *prevmv) ... (4)
cost_bidirection = SAD(*org, *ref) + MV_COST(*mv, *prevmv) ... (5)
cost_intra = ACT(*org) ... (6)

SAD(·) is a function that calculates the sum of absolute pixel differences, as shown in equation (7) below. In the H.264/AVC scheme, one macroblock can be divided into a plurality of sub-blocks. For example, when one macroblock is divided into four 8 × 8 sub-blocks, the four sums of absolute differences calculated for the individual sub-blocks constitute the result of SAD(·). The same applies when the sub-block unit is 8 × 16 pixels, 16 × 8 pixels, 4 × 8 pixels, 8 × 4 pixels, or 4 × 4 pixels.

SAD(*org, *ref) = Σ|*org − *ref| ... (7)

MV_COST(·) is a function that calculates an evaluation value proportional to the code amount of the motion vector, and is expressed as in equation (8) below. Here, λ is a weighting constant that adjusts the degree of influence on the coding cost, and Table[·] is a table that converts the magnitude of the vector difference into an equivalent code amount. Both are set in advance.

MV_COST(*mv, *prevmv) = λ * (Table[*mv − *prevmv]) ... (8)

As shown in equation (6), the coding cost of the intra prediction mode is expressed by a function ACT(·) called the activity. In the intra prediction mode, CurrMB itself is orthogonally transformed. The coding cost is therefore obtained from the degree to which the pixel value of each pixel in CurrMB deviates from the average value (AveMB), that is, the activity. The function ACT(·) is expressed by equation (9) below.

ACT(*org) = Σ|*org − AveMB| ... (9)

As is clear from equations (2) to (6), the coding cost of the direct mode does not include MV_COST. Therefore, when prediction works well and the SAD is similar and relatively small in all prediction modes, the coding cost of the direct mode is the lowest and the direct mode is more likely to be selected.
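
The mode decision based on equations (2) to (9) can be illustrated with the following sketch. It is not the patent's implementation: blocks are flattened to pixel lists, the function and mode names are hypothetical, and the way Table[·] is indexed (by the clamped L1 size of the vector difference) is an assumption, since the text only says that the table converts the magnitude of the difference into a code amount.

```python
def sad(org, ref):
    """Equation (7): sum of absolute pixel differences."""
    return sum(abs(o - r) for o, r in zip(org, ref))


def act(org):
    """Equation (9): total absolute deviation from the block mean AveMB."""
    ave_mb = sum(org) / len(org)
    return sum(abs(p - ave_mb) for p in org)


def mv_cost(mv, prevmv, lam, table):
    """Equation (8), with an assumed table index (clamped L1 difference)."""
    diff = abs(mv[0] - prevmv[0]) + abs(mv[1] - prevmv[1])
    return lam * table[min(diff, len(table) - 1)]


def select_mode(org, candidates, prevmv, lam, table):
    """candidates maps a mode name to (ref_pixels, mv); returns (best, costs)."""
    costs = {"intra": act(org)}                  # equation (6)
    for mode, (ref, mv) in candidates.items():
        cost = sad(org, ref)                     # equations (2) to (5)
        if mode != "direct":                     # direct mode has no MV_COST
            cost += mv_cost(mv, prevmv, lam, table)
        costs[mode] = cost
    best = min(costs, key=costs.get)
    return best, costs
```

When every inter mode predicts the block equally well, the direct mode has the lowest cost because it is the only inter mode without the MV_COST term.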

Having calculated the coding cost for each prediction mode as described above, the mode determination unit 122 selects the prediction mode with the lowest coding cost. The mode determination unit 122 also stores the motion vector used for encoding in the selected prediction mode in the motion vector memory 119. In addition, the mode determination unit 122 notifies the motion compensation unit 117 of the selected prediction mode.

As described above, the encoding of Base-View pictures is performed independently of the encoding of non-Base-View pictures.
(2-5-3. Details of the second viewpoint encoding unit 103)
The functions of the second viewpoint encoding unit 103 will now be described further with reference to FIG. 10.

As shown in FIG. 10, the second viewpoint encoding unit 103 includes a subtractor 131, an orthogonal transform / quantization unit 132, a variable length encoding unit 133, an inverse orthogonal transform / inverse quantization unit 134, and an adder 135. The second viewpoint encoding unit 103 also includes a frame memory 136, a motion compensation unit 137, a motion vector detection unit 138, and a motion vector memory 139. The second viewpoint encoding unit 103 further includes a reference vector acquisition unit 140, a reference vector correction unit 141, a direct vector calculation unit 142, and a mode determination unit 143.

The subtractor 131 receives the MB data of a non-Base-View picture. As in the first viewpoint encoding unit 101, the subtractor 131 subtracts the MB data of the predicted image output from the motion compensation unit 137 from the input MB data to generate prediction error data. The prediction error data generated by the subtractor 131 is input to the orthogonal transform / quantization unit 132.

The orthogonal transform / quantization unit 132 orthogonally transforms the input prediction error data in units of 8 × 8 pixel blocks or 4 × 4 pixel blocks. Applicable orthogonal transforms include, for example, the DCT and the Hadamard transform. Through such an orthogonal transform, the prediction error data is converted into data of horizontal and vertical frequency components.

The orthogonal transform / quantization unit 132 also quantizes the frequency component data obtained by the orthogonal transform to generate quantized data. The quantized data generated by the orthogonal transform / quantization unit 132 is input to the variable length encoding unit 133 and the inverse orthogonal transform / inverse quantization unit 134. The variable length encoding unit 133 performs variable length encoding on the quantized data to generate encoded data. Applicable variable length encoding methods include CAVLC and CABAC.

The inverse orthogonal transform / inverse quantization unit 134 dequantizes the quantized data to restore the frequency component data. The inverse orthogonal transform / inverse quantization unit 134 then applies an inverse orthogonal transform to the frequency component data to restore the prediction error data. The restored prediction error data is input to the adder 135. The adder 135 adds the MB data generated by the motion compensation unit 137 through motion compensation and the prediction error data restored by the inverse orthogonal transform / inverse quantization unit 134 to generate a locally decoded image.

The locally decoded image generated by the adder 135 is input to the frame memory 136 and the motion compensation unit 137. A deblocking filter may be applied to the locally decoded image. The frame memory 136 stores the locally decoded image input from the adder 135. The locally decoded image stored in the frame memory 136 is referred to by the motion compensation unit 137.

The motion compensation unit 137 performs motion compensation on the locally decoded image stored in the frame memory 136 using the motion vector stored in the motion vector memory 139, and generates motion-compensated MB data (MB data of the reference picture). The motion vector detection unit 138 performs a motion search using the MB data of the input non-Base-View picture (MB data of the target picture) and the MB data of the reference picture, and generates a motion vector.

Note that when a reference picture of another viewpoint (Base-View in this example) is available, the motion vector detection unit 138 acquires the reference picture from the frame memory 116 of the first viewpoint encoding unit 101. The motion vector detection unit 138 then performs a motion search using the MB data of the acquired reference picture and generates a motion vector.

The motion vector detected by the motion vector detection unit 138 is input to the motion vector memory 139 and the mode determination unit 143. The motion vector memory 139 stores the motion vector detected by the motion vector detection unit 138.

The reference vector acquisition unit 140 determines whether the macroblock to be encoded (CurrMB) uses bidirectional prediction. If it does, the reference vector acquisition unit 140 determines whether the motion vector used for predicting CurrMB is an inter-viewpoint motion vector iMV.

When the motion vector used for predicting CurrMB is an inter-viewpoint motion vector iMV, the reference vector acquisition unit 140 identifies the macroblock of ColPic located at the same position as CurrMB (ColMB). The reference vector acquisition unit 140 then acquires the motion vector of ColMB from the motion vector memory 119 of the first viewpoint encoding unit 101 (see FIG. 5). Here, ColPic is the Base-View picture at the same time (same POC) as CurrPic. In this case, the motion vector acquired by the reference vector acquisition unit 140 is input to the reference vector correction unit 141.

On the other hand, when the motion vector used for predicting CurrMB is not an inter-viewpoint motion vector iMV, the reference vector acquisition unit 140 identifies the macroblock of ColPic located at the same position as CurrMB (ColMB). In this case, ColPic is a non-Base-View picture. The reference vector acquisition unit 140 also determines whether the motion vector used for predicting ColMB is an inter-viewpoint motion vector iMV.

When the motion vector used for predicting ColMB is an inter-viewpoint motion vector iMV, the reference vector acquisition unit 140 identifies the macroblock of the Base-View picture (RefMB) pointed to by that inter-viewpoint motion vector iMV. The reference vector acquisition unit 140 then acquires the motion vector of RefMB from the motion vector memory 119 of the first viewpoint encoding unit 101 (see FIG. 4). In this case, the motion vector acquired by the reference vector acquisition unit 140 is input to the reference vector correction unit 141.

On the other hand, when the motion vector used for predicting ColMB is not an inter-viewpoint motion vector iMV, the reference vector acquisition unit 140 identifies the macroblock of ColPic located at the same position as CurrMB (ColMB) and acquires the motion vector of ColMB from the motion vector memory 139 (see FIG. 2). In this case, the motion vector acquired by the reference vector acquisition unit 140 is input to the direct vector calculation unit 142 as the reference vector basemvCol.
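The three cases handled by the reference vector acquisition unit 140 can be summarized as a small decision function. This is only a sketch of the branching described above: the function name, the tuple layout, and the use of `None` for a CurrMB that does not use bidirectional prediction (a case the text does not spell out) are assumptions. The integers are the reference numerals of the motion vector memory read from and of the unit that receives the vector.

```python
def reference_vector_source(curr_is_bipred, curr_uses_imv, col_uses_imv):
    """Return (block, mv_memory, next_unit) for the reference vector, or None.

    mv_memory: 119 = Base-View motion vector memory, 139 = non-Base-View one.
    next_unit: 141 = reference vector correction, 142 = direct vector calculation.
    """
    if not curr_is_bipred:
        return None  # assumption: not covered by the description
    if curr_uses_imv:
        # CurrMB is predicted across views: ColPic is the same-POC Base-View picture.
        return ("ColMB in Base-View ColPic", 119, 141)
    if col_uses_imv:
        # ColMB is predicted across views: follow its iMV into the Base-View picture.
        return ("RefMB in Base-View picture", 119, 141)
    # Neither block uses an inter-viewpoint vector: no correction is applied.
    return ("ColMB in non-Base-View ColPic", 139, 142)


for flags in [(True, True, False), (True, False, True), (True, False, False)]:
    print(flags, "->", reference_vector_source(*flags))
```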

The reference vector correction unit 141 corrects the motion vector input from the reference vector acquisition unit 140 based on the statistical information input from the statistical information acquisition unit 102, and generates the reference vector basemvCol. The correction method used when generating the reference vector basemvCol will be described later. The reference vector basemvCol generated by the reference vector correction unit 141 is input to the direct vector calculation unit 142.

The direct vector calculation unit 142 scales the reference vector basemvCol by temporal proportion and generates the motion vector of CurrMB (direct vector). The direct vector generated by the direct vector calculation unit 142 is input to the mode determination unit 143.

The mode determination unit 143 selects the prediction mode with the lowest encoding cost from among five prediction modes (intra prediction mode, forward prediction mode, backward prediction mode, bidirectional prediction mode, and direct mode). The mode determination unit 143 also stores the motion vector used for encoding in the selected prediction mode in the motion vector memory 139. Furthermore, the mode determination unit 143 notifies the motion compensation unit 137 of the selected prediction mode.
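The selection performed by the mode determination unit 143 amounts to taking the minimum over per-mode costs. The sketch below assumes the costs have already been computed elsewhere; how the encoding cost is measured, the mode labels, and the tie-breaking rule (earlier entry in `MODES` wins) are assumptions, not part of this description.

```python
MODES = ("intra", "forward", "backward", "bidirectional", "direct")


def select_mode(costs):
    """Return the mode with the smallest cost; ties go to the earlier entry in MODES."""
    return min(MODES, key=lambda mode: costs[mode])


# Hypothetical per-mode encoding costs for one macroblock.
costs = {"intra": 900, "forward": 410, "backward": 455,
         "bidirectional": 380, "direct": 395}
print(select_mode(costs))
```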

As described above, non-Base-View pictures are encoded by making appropriate use of the information used for encoding Base-View pictures.

(2-5-4. Supplementary explanation of the reference vector calculation)
Here, with reference to FIGS. 11 to 14, a supplementary explanation is given of the calculation of the reference vector basemvCol when the motion vector of CurrMB is an inter-viewpoint motion vector iMV. A supplementary explanation of the direct vector calculation is also given.

FIG. 11 is a first diagram for explaining the reference vector calculation method according to the second embodiment. FIG. 12 is a second diagram for explaining the reference vector calculation method according to the second embodiment. FIG. 13 is a first diagram for explaining the reference vector correction method according to the second embodiment. FIG. 14 is a second diagram for explaining the reference vector correction method according to the second embodiment.

(Calculation of reference vector and direct vector)
Refer to FIG. 11. In the example of FIG. 11, picture B₂₄ of viewpoint #2 (non-Base-View) is CurrPic, which contains the macroblock to be encoded (CurrMB). The motion vector used for predicting CurrMB is an inter-viewpoint motion vector iMV. In this case, picture B₁₄ of viewpoint #1 (Base-View) is ColPic.

CurrPic and ColPic are highly similar pictures. However, because the shooting position, angle, and so on differ, CurrPic and ColPic are not exactly the same picture. The inter-viewpoint motion vector iMV therefore has a finite length, and the block iMB referenced by the inter-viewpoint motion vector iMV serves as ColMB.

When CurrPic is a B picture, ColPic at the same time (same POC) as CurrPic is also a B picture. Therefore, as shown in FIG. 11, motion vectors in the L0 direction and the L1 direction are used for predicting ColMB.

The reference vector correction unit 141 takes the L0-direction motion vector acquired by the reference vector acquisition unit 140 as the reference vector basemvCol₀ and the L1-direction motion vector as the reference vector basemvCol₁. At this time, the reference vector correction unit 141 corrects the reference vectors basemvCol₀ and basemvCol₁ using the inter-viewpoint motion vector iMV. This correction method will be described later.

Once the reference vectors basemvCol₀ and basemvCol₁ are obtained, the direct vector calculation unit 142 scales them to calculate the direct vectors mvL₀ and mvL₁.

In the example of FIG. 11, the reference vector basemvCol₀ is scaled using the ratio between the interval tbL₀ (POC difference) between pictures I₁₂ and B₁₄ and the interval tdL₀ (POC difference) between pictures I₂₂ and B₂₄. Similarly, the reference vector basemvCol₁ is scaled using the ratio between the interval tbL₁ (POC difference) between pictures B₁₄ and P₁₅ and the interval tdL₁ (POC difference) between pictures B₂₄ and P₂₅.

In other words, the interval tdL₀ between CurrPic and the non-Base-View picture with the smallest ref_idx_l0, and the interval tbL₀ between ColPic and the Base-View picture with the smallest ref_idx_l0, are used for scaling. Likewise, the interval tdL₁ between CurrPic and the non-Base-View picture with the smallest ref_idx_l1, and the interval tbL₁ between ColPic and the Base-View picture with the smallest ref_idx_l1, are used for scaling. In this case, the direct vectors mvL₀ and mvL₁ are obtained by the following equations (10) and (11).

mvL₀ = basemvCol₀ · (tbL₀ / tdL₀)   … (10)
mvL₁ = basemvCol₁ · (tbL₁ / tdL₁)   … (11)
As described above, by scaling with tbL₀, tdL₀, tbL₁, and tdL₁, a direct vector can be obtained from the reference vector even when a picture of another viewpoint is referenced in predicting CurrMB. That is, the temporal direct mode can be applied even when CurrMB uses inter-viewpoint reference. Various modifications are conceivable, such as selectively using one reference vector, or selecting and using one or more direct vectors.
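Equations (10) and (11) apply the same operation to each list direction, so a single helper covers both. The sketch below uses floating-point division for readability; practical codecs such as H.264 use fixed-point integer arithmetic for this kind of scaling, and the description does not state which arithmetic is intended. The function name and the sample values are assumptions.

```python
def scale_reference_vector(basemv_col, tb, td):
    """Apply mv = basemvCol * (tb / td) to both vector components."""
    if td == 0:
        raise ValueError("td must be non-zero")
    return (basemv_col[0] * tb / td, basemv_col[1] * tb / td)


# Hypothetical values: (x, y) reference vectors and POC differences.
mv_l0 = scale_reference_vector((8, -4), tb=2, td=4)   # equation (10)
mv_l1 = scale_reference_vector((-6, 2), tb=1, td=2)   # equation (11)
print(mv_l0, mv_l1)
```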

The example of FIG. 12 shows a method of calculating the direct vectors mvL₀ and mvL₁ from the reference vectors basemvCol₀ and basemvCol₁ and then selecting one direct vector for use. The direct vector mvL₀ is obtained by scaling the reference vector basemvCol₀ using the picture intervals tbL₀ and td. The direct vector mvL₁, on the other hand, is obtained by inverting the reference vector basemvCol₁ toward the L0 direction (mvL₁ = −basemvCol₁). The direct vector is selected based on, for example, the encoding cost.
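The FIG. 12 variant can be sketched as two candidate vectors followed by a cost-based choice. The cost function is left as a caller-supplied parameter because the description only says that the choice is based on, for example, the encoding cost; the function names and the toy cost used in the example (sum of absolute components) are assumptions for illustration only.

```python
def fig12_candidates(basemv_col0, basemv_col1, tb_l0, td):
    """Build the two candidates: scaled basemvCol0 and inverted basemvCol1."""
    mv_l0 = (basemv_col0[0] * tb_l0 / td, basemv_col0[1] * tb_l0 / td)
    mv_l1 = (-basemv_col1[0], -basemv_col1[1])  # mvL1 = -basemvCol1
    return mv_l0, mv_l1


def select_direct_vector(candidates, cost_fn):
    """Pick the candidate with the smallest cost under cost_fn."""
    return min(candidates, key=cost_fn)


candidates = fig12_candidates((8, 4), (2, -6), tb_l0=1, td=2)
# Toy stand-in for an encoding cost: sum of absolute vector components.
chosen = select_direct_vector(candidates, lambda mv: abs(mv[0]) + abs(mv[1]))
print(candidates, chosen)
```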

In the example of FIG. 12, a direct vector pointing to picture B₂₃ is calculated. The information on the picture that the direct vector points to can be specified, for example, in SEI (Supplemental Enhancement Information). SEI is an example of header information. In the example of FIG. 12, a direct vector pointing to the picture with ref_idx_l0 = 2 is specified.

By applying the methods described above, a direct vector can be obtained from a reference vector. The reference vector correction methods are described below.

(Reference vector correction #1: maximum area)
The first correction method is described here.

Refer to FIG. 13. As already described, pictures from different viewpoints are not exactly the same picture because of differences in shooting position and angle. The inter-viewpoint motion vector iMV therefore has a finite length. That is, the block iMB pointed to by the inter-viewpoint motion vector iMV is unlikely to coincide exactly with any of the macroblocks included in ColPic. For example, consider the case where the macroblocks MB#1, …, #4 of ColPic and the block iMB pointed to by the inter-viewpoint motion vector iMV are in the positional relationship illustrated in FIG. 13. The macroblocks MB#1, …, #4 are an example of peripheral blocks.

Assume that macroblocks MB#1, …, #4 and block iMB each have a size of N × M. In the example of FIG. 13, the area where macroblocks MB#1 and #2 overlap block iMB is 0. The area where macroblock MB#3 overlaps block iMB is num_y × (N − num_x), and the area where macroblock MB#4 overlaps block iMB is num_y × num_x. In the example of FIG. 13, num_x is larger than (N − num_x). Therefore, the overlap area between macroblock MB#4 and block iMB is the largest.

The first correction method finds the macroblock with the largest overlap area (MB#4 in this example) and sets the motion vector used for predicting that macroblock as the reference vector basemvCol. That is, when the first correction method is adopted, the reference vector correction unit 141 calculates the overlap area between block iMB and each peripheral block, and sets the motion vector used for predicting the macroblock with the largest overlap area as the reference vector basemvCol. According to the first correction method, the reference vector basemvCol can be set in consideration of the inter-viewpoint motion vector iMV while suppressing an increase in computational load.
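The first correction method can be sketched as an overlap computation followed by a maximum. The sketch assumes axis-aligned blocks of width N and height M identified by their top-left pixel coordinates; the coordinate convention, the function names, and the sample positions and vectors are assumptions made for illustration.

```python
def overlap_area(a, b, n, m):
    """Overlap of two blocks of width n and height m with top-left corners a, b."""
    w = max(0, min(a[0], b[0]) + n - max(a[0], b[0]))
    h = max(0, min(a[1], b[1]) + m - max(a[1], b[1]))
    return w * h


def max_overlap_mv(imb_pos, neighbours, n, m):
    """neighbours: list of (top_left, motion_vector); returns the vector of the
    neighbour that overlaps block iMB the most."""
    _, mv = max(neighbours, key=lambda nb: overlap_area(imb_pos, nb[0], n, m))
    return mv


# Four 16x16 macroblocks MB#1..MB#4 with hypothetical motion vectors.
neighbours = [((0, 0), (1, 1)), ((16, 0), (2, 2)),
              ((0, 16), (3, 3)), ((16, 16), (4, 4))]
imb_pos = (10, 16)  # block iMB overlaps only MB#3 and MB#4
print([overlap_area(imb_pos, pos, 16, 16) for pos, _ in neighbours])
print(max_overlap_mv(imb_pos, neighbours, 16, 16))
```

With these positions the overlaps are 0, 0, 96, and 160 pixels, so the vector of MB#4 is chosen, matching the situation described for FIG. 13.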

(Reference vector correction #2: weighted average)
The second correction method is described here. Consider the case where the macroblocks MB#1, …, #4 of ColPic and the block iMB pointed to by the inter-viewpoint motion vector iMV are in the positional relationship illustrated in FIG. 14. Assume also that macroblocks MB#1, …, #4 and block iMB each have a size of N × M. The second correction method takes a weighted average of the motion vectors of the macroblocks MB#1, …, #4 located around block iMB and sets the resulting vector as the reference vector basemvCol. Two examples (Example 1 and Example 2) of how to calculate the weighted average are given below.

(Example 1: Calculation of the weighted average based on overlap area)
In the example of FIG. 14, the area S₁ where macroblock MB#1 overlaps block iMB is dx_A × dy_A. The area S₂ where macroblock MB#2 overlaps block iMB is dx_B × dy_A. The area S₃ where macroblock MB#3 overlaps block iMB is dx_A × dy_B. The area S₄ where macroblock MB#4 overlaps block iMB is dx_B × dy_B. Note that dx_A, dx_B, dy_A, and dy_B are expressed in units of pixels.

In Example 1, the evaluation value Costxy used for calculating the weighted average is calculated based on the overlap areas S₁, …, S₄ described above. The evaluation value Costxy is given by the following equation (12), where K is a preset weighting coefficient. The reference vector basemvCol is then calculated as a weighted average that uses the evaluation value Costxy as the weight. For example, the reference vector basemvCol is calculated according to the following equation (13), where basemvCol_x and basemvCol_y are the x component and y component of the reference vector basemvCol, respectively, and MVx_i and MVy_i are the x component and y component of the motion vector used for predicting macroblock MB#i, respectively.

Costxy_i = K × S_i (i = 1, …, 4)   … (12)
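Example 1 can be sketched in Python as follows. Equation (13) itself is not reproduced in this text, so the normalized weighted average below (sum of weight times component, divided by the sum of weights) is an assumption that is merely consistent with the surrounding description. The function names and sample values are also assumptions. Under this normalized form the constant K cancels out.

```python
def overlap_areas(dx_a, dx_b, dy_a, dy_b):
    """Areas S1..S4 of the overlaps with MB#1..MB#4, in pixels."""
    return [dx_a * dy_a, dx_b * dy_a, dx_a * dy_b, dx_b * dy_b]


def weighted_reference_vector(mvs, areas, k=1.0):
    """Weighted average of the four neighbour vectors with weights K * S_i.

    The normalized form is an assumed reading of equation (13).
    """
    weights = [k * s for s in areas]  # equation (12): Costxy_i = K * S_i
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one overlap area must be non-zero")
    x = sum(w * mv[0] for w, mv in zip(weights, mvs)) / total
    y = sum(w * mv[1] for w, mv in zip(weights, mvs)) / total
    return (x, y)


areas = overlap_areas(dx_a=4, dx_b=12, dy_a=6, dy_b=10)
mvs = [(4, 0), (0, 4), (4, 4), (0, 0)]  # hypothetical vectors of MB#1..MB#4
print(areas, weighted_reference_vector(mvs, areas))
```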

(Example 2: Calculation of the weighted average based on overlap area and quantization value)
In Example 2, the evaluation value Costxy used for calculating the weighted average is calculated based on the overlap areas S₁, …, S₄ described above, the quantization value used for encoding (quantization scale value: qP value), and the number of effective coefficients. The qP value and the number of effective coefficients are acquired by the statistical information acquisition unit 102. If the number of effective coefficients used for encoding and the absolute values of their components are small, the prediction result can be judged to be highly reliable. Likewise, if the effective coefficients are comparable, the smaller the qP value used, the more reliable the prediction result can be judged to be. Example 2 therefore makes use of these values.

The evaluation value Costxy is given by the following equation (14), where K is a preset weighting coefficient, qP_i is the qP value used for encoding macroblock MB#i, and numCoef_i is the number of effective coefficients used for encoding macroblock MB#i. As is clear from equation (14), the evaluation value Costxy increases as the overlap area increases. The evaluation value Costxy also increases as the qP value increases. Furthermore, the evaluation value Costxy increases as the number of effective coefficients decreases.

The reference vector basemvCol is calculated as a weighted average that uses the evaluation value Costxy given in this way as the weight. For example, the reference vector basemvCol is calculated according to equation (13) above, where basemvCol_x and basemvCol_y are the x component and y component of the reference vector basemvCol, respectively, and MVx_i and MVy_i are the x component and y component of the motion vector used for predicting macroblock MB#i, respectively.

The reference vector basemvCol is corrected by the methods described above, and a direct vector usable for predicting CurrMB is obtained.
The functions of the encoding device 100 have been described above.

[2-6. Functions of the decoding device]
Next, the functions of the decoding device 200 will be described with reference to FIGS. 15 to 17.

FIG. 15 is a first block diagram illustrating an example of the functions of the decoding device according to the second embodiment. FIG. 16 is a second block diagram illustrating an example of the functions of the decoding device according to the second embodiment. FIG. 17 is a third block diagram illustrating an example of the functions of the decoding device according to the second embodiment.

(2-6-1. Overall)
As illustrated in FIG. 15, the decoding device 200 includes a first viewpoint decoding unit 201, a statistical information acquisition unit 202, and a second viewpoint decoding unit 203. The functions of the first viewpoint decoding unit 201, the statistical information acquisition unit 202, and the second viewpoint decoding unit 203 can be realized using the above-described CPU 902 or the like.

Encoded data obtained by encoding Base-View pictures (hereinafter, Base-View encoded data) is input to the first viewpoint decoding unit 201. The first viewpoint decoding unit 201 decodes the input Base-View encoded data. The functions of the first viewpoint decoding unit 201 will be described in detail later.

The statistical information acquisition unit 202 acquires the statistical information obtained when the first viewpoint decoding unit 201 decodes the Base-View encoded data. The statistical information includes, for example, the qP value and the number of effective difference coefficients. The statistical information acquired by the statistical information acquisition unit 202 is input to the second viewpoint decoding unit 203.

Encoded data obtained by encoding non-Base-View pictures (hereinafter, non-Base-View encoded data) is input to the second viewpoint decoding unit 203. The second viewpoint decoding unit 203 decodes the non-Base-View encoded data using the statistical information and the information on the motion vectors and pictures that the first viewpoint decoding unit 201 used for decoding. The functions of the second viewpoint decoding unit 203 will be described in detail later.

(2-6-2. Details of the first viewpoint decoding unit 201)
Here, the functions of the first viewpoint decoding unit 201 will be further described with reference to FIG. 16.

As illustrated in FIG. 16, the first viewpoint decoding unit 201 includes a variable length decoding unit 211, an inverse orthogonal transform / inverse quantization unit 212, a prediction mode acquisition unit 213, a switch 214, and an intra prediction unit 215. The first viewpoint decoding unit 201 also includes a motion vector acquisition unit 216, a frame memory 217, a motion compensation unit 218, a switch 219, and an adder 220. Furthermore, the first viewpoint decoding unit 201 includes a motion vector memory 221, a reference vector acquisition unit 222, and a direct vector calculation unit 223.

Base-View encoded data (the Base-View side bitstream) is input to the variable length decoding unit 211. The variable length decoding unit 211 performs variable length decoding on the Base-View encoded data using a scheme corresponding to the variable length encoding used by the encoding device 100, and restores information such as the quantized data. At this time, the variable length decoding unit 211 also restores header information such as the SPS (Sequence Parameter Set) and PPS (Picture Parameter Set), the prediction mode and motion vector of each macroblock, and information such as difference coefficients.

Information such as the prediction mode restored by the variable length decoding unit 211 is input to the prediction mode acquisition unit 213. Information such as the header information and difference coefficients restored by the variable length decoding unit 211 is input to the statistical information acquisition unit 202. The quantized data restored by the variable length decoding unit 211 is input to the inverse orthogonal transform / inverse quantization unit 212.

The inverse orthogonal transform / inverse quantization unit 212 dequantizes the quantized data input from the variable length decoding unit 211 to restore the frequency component data. It then applies an inverse orthogonal transform to the frequency component data to restore the prediction error data. The restored prediction error data is input to the adder 220. The adder 220 adds the MB data input via the switch 219 and the prediction error data restored by the inverse orthogonal transform / inverse quantization unit 212 to generate a decoded image.

The decoded image generated by the adder 220 is input to the frame memory 217 and is also output as the decoding result. A deblocking filter may be applied to the decoded image. The frame memory 217 stores the decoded image input from the adder 220. The decoded image stored in the frame memory 217 is referenced by the motion compensation unit 218.

The prediction mode acquisition unit 213 determines the prediction mode (intra prediction mode, forward prediction mode, backward prediction mode, bidirectional prediction mode, or direct mode) for each macroblock. The block division size may be attached to the prediction mode. The switch 214 is switched according to the result of the prediction mode determination. When the prediction mode is the intra prediction mode, the intra prediction unit 215 decodes the target macroblock using intra prediction. When the prediction mode is the forward prediction mode, the backward prediction mode, or the bidirectional prediction mode, the motion vector acquisition unit 216 acquires the motion vector information restored by the variable length decoding unit 211.

When the prediction mode is the direct mode, the reference vector acquisition unit 222 acquires the motion vector used when decoding ColPic from the motion vector memory 221 and sets the acquired motion vector as the reference vector basemvCol. The reference vector basemvCol set by the reference vector acquisition unit 222 is input to the direct vector calculation unit 223. The direct vector calculation unit 223 scales the reference vector basemvCol to calculate the direct vector.

The direct vector calculated by the direct vector calculation unit 223 is stored in the motion vector memory 221. The direct vector stored in the motion vector memory 221 is referenced by the motion compensation unit 218. The motion compensation unit 218 performs motion compensation on the decoded image (reference picture) stored in the frame memory 217 using either the direct vector or the motion vector acquired by the motion vector acquisition unit 216, and generates motion-compensated MB data. The motion-compensated MB data is input to the adder 220 via the switch 219.
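The way the Base-View decoder obtains the vector for motion compensation depends on the prediction mode, which can be sketched as a small dispatch function. The mode labels, the function name, and the parameters are assumptions for illustration. In particular, the tb / td scaling in the direct-mode branch is an assumed ratio-based form; this passage only states that the reference vector basemvCol is scaled.

```python
INTER_MODES = ("forward", "backward", "bidirectional")


def motion_vector_for_block(mode, parsed_mv=None, col_mv=None, tb=1, td=1):
    """Return the vector used for motion compensation, or None for intra blocks."""
    if mode == "intra":
        return None  # intra prediction needs no motion vector
    if mode in INTER_MODES:
        return parsed_mv  # vector restored from the bitstream
    if mode == "direct":
        # Vector derived from the ColPic vector (basemvCol); assumed tb/td scaling.
        return (col_mv[0] * tb / td, col_mv[1] * tb / td)
    raise ValueError(f"unknown prediction mode: {mode}")


print(motion_vector_for_block("forward", parsed_mv=(3, -1)))
print(motion_vector_for_block("direct", col_mv=(8, 4), tb=1, td=2))
```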

As described above, Base-View encoded data is decoded independently of non-Base-View encoded data.

(2-6-3. Details of second viewpoint decoding unit 203)
Next, the functions of the second viewpoint decoding unit 203 are described further with reference to FIG. 17.

As illustrated in FIG. 17, the second viewpoint decoding unit 203 includes a variable length decoding unit 231, an inverse orthogonal transform / inverse quantization unit 232, a prediction mode acquisition unit 233, a switch 234, and an intra prediction unit 235. It also includes a motion vector acquisition unit 236, a motion compensation unit 237, a frame memory 238, a switch 239, and an adder 240. It further includes a motion vector memory 241, a reference vector acquisition unit 242, a reference vector correction unit 243, and a direct vector calculation unit 244.

Non-Base-View encoded data (the bitstream on the non-Base-View side) is input to the variable length decoding unit 231. The variable length decoding unit 231 performs variable length decoding on the non-Base-View encoded data, using a scheme that corresponds to the variable length encoding used by the encoding device 100, and restores information such as quantized data. At this time, the variable length decoding unit 231 also restores header information such as the SPS and PPS, the prediction mode and motion vector of each macroblock, and information such as difference coefficients.

Information such as the prediction mode restored by the variable length decoding unit 231 is input to the prediction mode acquisition unit 233. The quantized data restored by the variable length decoding unit 231 is input to the inverse orthogonal transform / inverse quantization unit 232.

The inverse orthogonal transform / inverse quantization unit 232 dequantizes the quantized data input from the variable length decoding unit 231 to restore the frequency component data. It then applies an inverse orthogonal transform to the frequency component data to restore the prediction error data. The restored prediction error data is input to the adder 240. The adder 240 adds the MB data input via the switch 239 to the prediction error data restored by the inverse orthogonal transform / inverse quantization unit 232 to generate a decoded image.

The decoded image generated by the adder 240 is input to the frame memory 238 and is also output as the decoding result. A deblocking filter may be applied to the decoded image. The frame memory 238 stores the decoded image input from the adder 240. The decoded image stored in the frame memory 238 is referenced by the motion compensation unit 237.

The prediction mode acquisition unit 233 determines the prediction mode (intra prediction mode, forward prediction mode, backward prediction mode, bidirectional prediction mode, or direct mode) of each macroblock. The block partition size may be attached to the prediction mode. The switch 234 is switched according to the determined prediction mode. When the prediction mode is the intra prediction mode, the intra prediction unit 235 decodes the target macroblock using intra prediction. When the prediction mode is the forward, backward, or bidirectional prediction mode, the motion vector acquisition unit 236 acquires the motion vector information restored by the variable length decoding unit 231.

When the prediction mode is the direct mode, the reference vector acquisition unit 242 acquires the motion vector used to predict CurrMB from the motion vector memory 241. The reference vector acquisition unit 242 then determines whether the motion vector used to predict CurrMB is an inter-view motion vector iMV.

When the motion vector used to predict CurrMB is an inter-view motion vector iMV, the reference vector acquisition unit 242 identifies the macroblock of ColPic (ColMB) located at the same position as CurrMB. The reference vector acquisition unit 242 then acquires the motion vector of ColMB from the motion vector memory 221 of the first viewpoint decoding unit 201 (see FIG. 5). Here, ColPic is the Base-View picture at the same time (same POC) as CurrPic. In this case, the motion vector acquired by the reference vector acquisition unit 242 is input to the reference vector correction unit 243.

On the other hand, when the motion vector used to predict CurrMB is not an inter-view motion vector iMV, the reference vector acquisition unit 242 identifies the macroblock of ColPic (ColMB) located at the same position as CurrMB. In this case, ColPic is a non-Base-View picture. The reference vector acquisition unit 242 then determines whether the motion vector used to predict ColMB is an inter-view motion vector iMV.

When the motion vector used to predict ColMB is an inter-view motion vector iMV, the reference vector acquisition unit 242 identifies the macroblock (RefMB) of the Base-View picture pointed to by that inter-view motion vector iMV. The reference vector acquisition unit 242 then acquires the motion vector of RefMB from the motion vector memory 221 of the first viewpoint decoding unit 201 (see FIG. 4). In this case, the motion vector acquired by the reference vector acquisition unit 242 is input to the reference vector correction unit 243.

On the other hand, when the motion vector used to predict ColMB is not an inter-view motion vector iMV, the reference vector acquisition unit 242 identifies the macroblock of ColPic (ColMB) located at the same position as CurrMB and acquires the motion vector of ColMB from the motion vector memory 241 (see FIG. 2). In this case, the motion vector acquired by the reference vector acquisition unit 242 is input to the direct vector calculation unit 244 as the reference vector basemvCol.
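The three cases handled by the reference vector acquisition unit 242 can be summarized as a small decision function. The following is a minimal sketch, not the patented implementation: the `MotionVector` type, the function name, and the string labels for the vector sources are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class MotionVector:
    vec: Tuple[int, int]
    inter_view: bool  # True when the vector is an inter-view motion vector iMV


def select_reference_source(
    curr_mv: MotionVector, col_mv: Optional[MotionVector] = None
) -> Tuple[str, bool]:
    """Return (source label, needs_correction) for the vector behind basemvCol.

    The labels are hypothetical. needs_correction is True when the vector
    passes through the reference vector correction unit 243 before scaling.
    """
    if curr_mv.inter_view:
        # CurrMB is predicted by iMV: use ColMB of the Base-View picture
        # (motion vector memory 221), then correct it.
        return ("base_view_ColMB", True)
    if col_mv is not None and col_mv.inter_view:
        # ColMB is predicted by iMV: use RefMB of the Base-View picture
        # (motion vector memory 221), then correct it.
        return ("base_view_RefMB", True)
    # Neither vector is inter-view: use ColMB of the same view
    # (motion vector memory 241) directly as basemvCol.
    return ("same_view_ColMB", False)
```

Only the last case bypasses the correction unit, which matches the ordinary temporal direct mode within a single view.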

The reference vector correction unit 243 corrects the motion vector input from the reference vector acquisition unit 242, based on the statistical information input from the statistical information acquisition unit 202, and generates the reference vector basemvCol. The correction method used to generate the reference vector basemvCol is the same as the correction method used by the encoding device 100 during encoding. The reference vector basemvCol generated by the reference vector correction unit 243 is input to the direct vector calculation unit 244.

The direct vector calculation unit 244 scales the reference vector basemvCol in proportion to the temporal distances and generates the motion vector of CurrMB (the direct vector). The scaling method is the same as the one used by the encoding device 100 during encoding. The direct vector generated by the direct vector calculation unit 244 is stored in the motion vector memory 241, where it is referenced by the motion compensation unit 237.

Using either the direct vector or the motion vector acquired by the motion vector acquisition unit 236, the motion compensation unit 237 performs motion compensation on the decoded image (reference picture) stored in the frame memory 238 and generates motion-compensated MB data. The motion-compensated MB data is input to the adder 240 via the switch 239.

As described above, non-Base-View encoded data is decoded using the decoding results of Base-View encoded data.

This concludes the description of the functions of the decoding device 200.

[2-7. Processing flow]
Next, the flow of the encoding process executed by the encoding device 100 is described with reference to FIGS. 18 to 20.

FIG. 18 is a first flowchart showing the flow of the encoding process according to the second embodiment. FIG. 19 is a second flowchart showing the flow of the encoding process according to the second embodiment. FIG. 20 is a third flowchart showing the flow of the encoding process according to the second embodiment.

(2-7-1. Overall)
The overall flow of the encoding process is described with reference to FIG. 18.

In the following, the picture to be encoded is denoted CurrPic, the macroblock to be encoded is denoted CurrMB, and the picture pointed to by the motion vector mvCol used to predict CurrMB is denoted ColPic. The inter-view motion vector is denoted iMV, the reference vector basemvCol, and the direct vectors mvL₀ and mvL₁. The macroblock in ColPic located at the same position as CurrMB is denoted ColMB, the motion vector used to predict ColMB is denoted mvRef, the picture pointed to by mvRef is denoted RefPicCol, and the block pointed to by iMV is denoted iMB.

(S101) The encoding device 100 determines whether the encoding scheme is the MVC scheme and CurrPic is a picture encoded using bidirectional prediction (a B picture). If the encoding scheme is the MVC scheme and CurrPic is a B picture, the process proceeds to S102. If the encoding scheme is not the MVC scheme or CurrPic is not a B picture, the process proceeds to S108.

(S102) The encoding device 100 determines whether ColPic is a picture of another viewpoint. If ColPic is a picture of another viewpoint (that is, if mvCol is an inter-view motion vector iMV), the process proceeds to S103. Otherwise, the process proceeds to S105.

(S103) The encoding device 100 calculates the reference vector basemvCol using the inter-view motion vector iMV (mvCol). The calculation of the reference vector basemvCol is described in detail later.

(S104) The encoding device 100 scales the reference vector basemvCol to calculate the direct vectors mvL₀ and mvL₁. For example, the encoding device 100 scales the reference vector basemvCol by the method illustrated in FIGS. 11 and 12. When the process of S104 is complete, the process proceeds to S108.

(S105) The encoding device 100 determines whether RefPicCol is a picture of another viewpoint. If RefPicCol is a picture of another viewpoint (that is, if mvRef is an inter-view motion vector iMV), the process proceeds to S106. Otherwise, the process proceeds to S108.

(S106) The encoding device 100 calculates the reference vector basemvCol using the inter-view motion vector iMV (mvRef). The calculation of the reference vector basemvCol is described in detail later.

(S107) The encoding device 100 scales the reference vector basemvCol to calculate the direct vectors mvL₀ and mvL₁. For example, the encoding device 100 scales the reference vector basemvCol by the method illustrated in FIG. 4.

(S108) The encoding device 100 determines a suitable prediction mode for predicting CurrMB. For example, from among the five prediction modes (intra prediction mode, forward prediction mode, backward prediction mode, bidirectional prediction mode, and direct mode), the encoding device 100 selects the one with the lowest encoding cost as the determination result.

(S109) The encoding device 100 encodes CurrPic using the prediction mode selected in S108.

(S110) The encoding device 100 determines whether the encoding process has finished for all macroblocks included in CurrPic (one picture's worth). If it has finished for all macroblocks included in CurrPic, the series of processes shown in FIG. 18 ends. If it has not, the process proceeds to S102.
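The branching in S101 to S110 can be traced with a short function. This is a minimal sketch under stated assumptions: the function and parameter names are hypothetical, the transition from S107 to S108 is assumed because the text does not state it explicitly, and the cost values passed to `choose_mode` are placeholders for whatever encoding cost the encoder actually measures.

```python
from typing import Dict, List


def encoding_steps(
    is_mvc: bool,
    is_b_picture: bool,
    col_pic_other_view: bool,
    ref_pic_col_other_view: bool,
) -> List[str]:
    """Return the step labels visited for one macroblock."""
    steps = ["S101"]
    if is_mvc and is_b_picture:
        steps.append("S102")
        if col_pic_other_view:
            # mvCol is an inter-view vector: derive basemvCol, then scale.
            steps += ["S103", "S104"]
        else:
            steps.append("S105")
            if ref_pic_col_other_view:
                # mvRef is an inter-view vector: derive basemvCol, then scale.
                steps += ["S106", "S107"]
    # Mode decision, encoding, and the end-of-picture check always run.
    steps += ["S108", "S109", "S110"]
    return steps


def choose_mode(costs: Dict[str, float]) -> str:
    """S108: pick the prediction mode with the lowest encoding cost."""
    return min(costs, key=costs.get)
```

For instance, `choose_mode({"intra": 9.0, "direct": 2.5})` picks `"direct"`, which is the case where the direct vectors computed in S104 or S107 pay off.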

(2-7-2. Calculation of reference vector)
Next, the processes of S103 and S106 are described further with reference to FIG. 19.

(S121) The encoding device 100 calculates the overlap area between the block iMB pointed to by the inter-view motion vector iMV and each of the neighboring blocks located around iMB.

(S122) The encoding device 100 determines whether to select the neighboring block with the largest overlap area calculated in S121. In other words, the encoding device 100 determines whether to adopt the correction method for the reference vector basemvCol illustrated in FIG. 13. The correction method to adopt is assumed to be set in advance. If the neighboring block with the largest overlap area is to be selected, the process proceeds to S123. Otherwise, the process proceeds to S124.

(S123) The encoding device 100 identifies the neighboring block that contains the center coordinates of block iMB. The encoding device 100 then sets the motion vector of the identified neighboring block as the reference vector basemvCol. The description of FIG. 13 introduced a method that calculates the overlap area for each neighboring block and selects the neighboring block with the largest overlap area, but identifying the neighboring block that contains the center coordinates of block iMB yields a similar result. When the process of S123 is complete, the series of processes shown in FIG. 19 ends.

(S124) The encoding device 100 computes a weighted average of the motion vectors of the neighboring blocks, based on the overlap area between iMB and each neighboring block. For example, the encoding device 100 calculates the weighted average based on equations (12) to (14) above. The weighted average calculation is described in detail later.

(S125) The encoding device 100 sets the vector obtained by the weighted average of the motion vectors of the neighboring blocks as the reference vector basemvCol. When the process of S125 is complete, the series of processes shown in FIG. 19 ends.
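The overlap computation of S121 and the center-based shortcut of S123 can be illustrated as follows. This is a sketch under assumptions that are not stated in the text: blocks are taken to be 16x16 pixels on a regular grid, block iMB is addressed by the pixel coordinates of its top-left corner, and the function names are hypothetical.

```python
from typing import Dict, Tuple

MB = 16  # assumed block size in pixels


def overlap_areas(x: int, y: int, size: int = MB) -> Dict[Tuple[int, int], int]:
    """S121: overlap area between the size x size block at pixel (x, y)
    and each grid-aligned block it touches, keyed by (column, row)."""
    areas = {}
    for row in range(y // size, (y + size - 1) // size + 1):
        for col in range(x // size, (x + size - 1) // size + 1):
            w = min(x + size, (col + 1) * size) - max(x, col * size)
            h = min(y + size, (row + 1) * size) - max(y, row * size)
            areas[(col, row)] = w * h
    return areas


def block_containing_center(x: int, y: int, size: int = MB) -> Tuple[int, int]:
    """S123: grid block that contains the center of the block at (x, y)."""
    return ((x + size // 2) // size, (y + size // 2) // size)


# A block displaced to (20, 22) overlaps four grid blocks.
areas = overlap_areas(20, 22)
largest = max(areas, key=areas.get)
```

In this example the largest overlap (120 of 256 pixels) falls on grid block (1, 1), which is also the block containing the center (28, 30), so the center test avoids computing the four areas when only the largest one is needed.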

(2-7-3. Calculation of weighted average)
Next, the process of S124 is described further with reference to FIG. 20.

(S131) The encoding device 100 selects one of the neighboring blocks that overlap block iMB.

(S132) The encoding device 100 acquires the overlap area between block iMB and the neighboring block selected in S131.

(S133) The encoding device 100 calculates an evaluation value based on the overlap area acquired in S132. For example, the encoding device 100 calculates the evaluation value Costxy based on equation (12) or equation (14) above.

(S134) The encoding device 100 determines whether it has selected every neighboring block that overlaps block iMB and calculated the evaluation value Costxy for each of them. If every neighboring block that overlaps block iMB has been selected, the process proceeds to S135. Otherwise, the process proceeds to S131.

(S135) Using the evaluation value Costxy of each neighboring block calculated in S133, the encoding device 100 calculates the weighted average of the motion vectors used to predict the neighboring blocks. For example, the encoding device 100 calculates the weighted average based on equation (13) above. When the process of S135 is complete, the series of processes shown in FIG. 20 ends.
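The loop of S131 to S135 amounts to a weighted mean of the neighboring motion vectors. Equations (12) to (14) are not reproduced in this part of the text, so the sketch below makes a simplifying assumption: the evaluation value of each block is taken to be its overlap area alone. The function name and the input layout are hypothetical.

```python
from typing import List, Tuple

Vector = Tuple[float, float]


def weighted_average_mv(blocks: List[Tuple[int, Vector]]) -> Vector:
    """blocks holds one (overlap_area, motion_vector) pair per neighboring
    block that overlaps iMB. Returns the weighted average vector."""
    # S131-S134: one evaluation value per block (assumed: the overlap area).
    costs = [float(area) for area, _ in blocks]
    total = sum(costs)
    if total == 0:
        raise ValueError("no neighboring block overlaps iMB")
    # S135: average the vectors with the evaluation values as weights.
    mvx = sum(c * mv[0] for c, (_, mv) in zip(costs, blocks)) / total
    mvy = sum(c * mv[1] for c, (_, mv) in zip(costs, blocks)) / total
    return (mvx, mvy)


# Four neighbors with overlap areas 120, 40, 72, and 24 (256 in total).
base_mv = weighted_average_mv(
    [(120, (4, 0)), (40, (8, 0)), (72, (4, 4)), (24, (0, 8))]
)
```

A weight that also reflects quantization information, as one of the claims mentions, would only change how `costs` is computed; the averaging step stays the same.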

This concludes the description of the flow of the encoding process executed by the encoding device 100.

As described above, applying the technique of the second embodiment makes it possible to use the temporal direct mode even when inter-view prediction is applied to CurrPic and ColPic. This contributes to a further improvement in coding efficiency in multi-view video coding.

This concludes the description of the second embodiment.

<3. Third Embodiment>
Next, a third embodiment is described with reference to FIGS. 21 and 22. FIG. 21 is a first diagram for describing the encoding method according to the third embodiment. FIG. 22 is a second diagram for describing the encoding method according to the third embodiment. The third embodiment corresponds to an application to the H.265/HEVC scheme and provides a mechanism for combining inter-view prediction with merge mode. In the H.265/HEVC scheme, the PU (Prediction Unit) is used as the unit of predictive coding.

Merge mode is a method in which a PU whose motion parameters can be reused as they are is selected from among the PUs located in the temporal and spatial neighborhood of the PU to be encoded (hereinafter, CurrPU), and the motion parameters of the selected PU are shared. When merge mode is applied, only an index indicating the position of the selected PU needs to be encoded, which contributes to improved coding efficiency. The following describes the spatial neighbor prediction and temporal neighbor prediction adopted in the H.265/HEVC scheme, and the relationship between these prediction methods and inter-view prediction.

The H.265/HEVC scheme adopts a motion vector prediction technique called AMVP (Adaptive Motion Vector Prediction). In AMVP, the candidates for the predicted value include not only spatially neighboring motion vectors in the same picture but also temporally neighboring motion vectors in an already encoded reference picture.

For example, the motion vectors (mvL_A0, mvL_A1, mvL_B0, mvL_B1, mvL_B2) of the PUs containing pixels A₀, A₁, B₀, B₁, and B₂ shown in FIG. 21(A) are the spatial-neighbor candidates for the predicted value. The motion vectors (mvL_C0, mvL_C1) of the PUs containing pixels C₀ and C₁ shown in FIG. 21(B) are the temporal-neighbor motion vector candidates. The candidates for the predicted value are then narrowed down from among mvL_A selected from mvL_A0 and mvL_A1, mvL_B selected from mvL_B0, mvL_B1, and mvL_B2, mvL_C selected from mvL_C0 and mvL_C1, and the zero vector.

CurrPU is predicted using the candidates narrowed down in this way. Here, attention is paid to how the temporal-neighbor predicted value is derived. In the H.265/HEVC scheme, the same scaling process as in the temporal direct mode of the H.264/AVC scheme is performed when the temporal-neighbor predicted value (the motion vector of CurrPU) is derived.

For example, the motion vector mvCol that predicts the PU in the temporally neighboring picture (ColPic) located at the same position as CurrPU (the position containing pixel C₁) is scaled. The scaling uses the distance td between ColPic and the reference picture referred to by the target PU of ColPic (for example, the PU containing pixel C₁), and the distance tb between the picture containing CurrPU (CurrPic) and the reference picture referred to by CurrPU. The scaling method is the same as in the temporal direct mode illustrated in FIG. 2.
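The scaling by the two temporal distances can be written out in a few lines. This is a minimal sketch assuming the plain proportional form, scaled = mvCol × tb / td; the function name is hypothetical, and the floating-point division stands in for the fixed-point rounding and clipping that practical codecs use.

```python
from typing import Tuple


def scale_motion_vector(
    mv_col: Tuple[int, int], tb: int, td: int
) -> Tuple[float, float]:
    """Scale mvCol by tb / td.

    tb: distance between CurrPic and the reference picture of CurrPU.
    td: distance between ColPic and the reference picture of its target PU.
    """
    if td == 0:
        raise ValueError("td must be non-zero")
    return (mv_col[0] * tb / td, mv_col[1] * tb / td)


# mvCol spans two picture intervals, CurrPU's reference is one interval away.
predicted = scale_motion_vector((8, -4), tb=1, td=2)
```

Here the predicted vector is half of mvCol, (4.0, -2.0), because the current PU's reference is half as far away in time.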

When inter-view prediction is applied to CurrPU, a situation such as that shown in FIG. 22 is expected to arise. That is, CurrPU on the non-Base-View (Dependent View) side refers, through the inter-view motion vector iMV, to ColPU of the Base-View picture (for example, the PU containing pixel D₁). In this case, the scaling process described above cannot be performed. However, applying the method illustrated in FIG. 11 makes the scaling possible, so the motion vector of CurrPU can be predicted using the temporal-neighbor predicted value. As a result, improved coding efficiency can be expected in the H.265/HEVC scheme as well.

This concludes the description of the third embodiment.

Preferred embodiments have been described above with reference to the accompanying drawings, but the present invention is not limited to these examples. It is clear that a person skilled in the art could conceive of various variations and modifications within the scope of the claims, and these naturally also fall within the technical scope of the present invention.

10 Image encoding device
11 Storage unit
12 Calculation unit
21, 22 Moving image
Pic₁₀, Pic₁₁, Pic₁₂, Pic₁₃, Pic₁₄, Pic₁₅, Pic₂₀, Pic₂₁, Pic₂₂, Pic₂₃, Pic₂₄, Pic₂₅, ColPic Image
CurrPic Target image
basemvCol₀, basemvCol₁, mvL₀, mvL₁ Motion information
mvCol Inter-viewpoint motion information
tbL₀, tbL₁, tdL₀, tdL₁ Time

Claims

An image encoding device comprising:
a storage unit that stores a plurality of moving images respectively corresponding to a plurality of viewpoints; and
a calculation unit that detects, from among the images included in a first moving image of the plurality of moving images, a target image having inter-viewpoint motion information that refers to an image at the same time included in a second moving image, and encodes an image region of the target image using a motion vector of the image at the same time,
wherein the calculation unit adjusts the length of the motion vector used for encoding the image region, based on a first time interval between the target image and a first reference image referred to from the image region of the target image in the first moving image, and a second time interval between the image at the same time and a second reference image referred to by the motion vector in the second moving image.

The image encoding device according to claim 1, wherein the calculation unit identifies, from among the image regions included in the image at the same time, one or more image regions associated with the image region of the target image by the inter-viewpoint motion information, and encodes the image region of the target image based on the motion vectors of the identified image regions.

The image encoding device according to claim 2, wherein, when a plurality of image regions are associated with the image region of the target image by the inter-viewpoint motion information, the calculation unit selects the image region having the largest size from among the identified image regions, and encodes the image region of the target image based on the motion vector of the selected image region.

The image encoding device according to claim 2, wherein, when a plurality of image regions are associated with the image region of the target image by the inter-viewpoint motion information, the calculation unit sets a weight for the motion vector of each identified image region based on the size of that image region, generates statistical information on the motion vectors taking the set weights into account, and encodes the image region of the target image based on the statistical information.

The image encoding device according to claim 2, wherein, when a plurality of image regions are associated with the image region of the target image by the inter-viewpoint motion information, the calculation unit sets a weight for the motion vector of each identified image region based on the size and quantization information of that image region, generates statistical information on the motion vectors taking the set weights into account, and encodes the image region of the target image based on the statistical information.

An image encoding method in which a computer:
acquires the moving images from a storage unit storing a plurality of moving images respectively corresponding to a plurality of viewpoints;
detects, from among the images included in a first moving image of the plurality of moving images, a target image having inter-viewpoint motion information that refers to an image at the same time included in a second moving image, and encodes an image region of the target image using a motion vector of the image at the same time; and
in the encoding process, adjusts the length of the motion vector used for encoding the image region based on a first time interval between the target image and a first reference image referred to from the image region of the target image in the first moving image, and a second time interval between the image at the same time and a second reference image referred to by the motion vector in the second moving image.
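
The length adjustment in the preceding claim can be illustrated with a simple linear scaling. The sketch assumes that the borrowed motion vector is multiplied by the ratio of the first time interval to the second time interval, in the manner of temporal motion vector scaling in existing codecs; the claim itself only requires that the adjustment be based on the two intervals. Intervals are assumed to be signed, so a reference in the opposite temporal direction reverses the vector. The name `scale_motion_vector` is hypothetical.

```python
from typing import Tuple


def scale_motion_vector(
    mv: Tuple[int, int],
    first_interval: int,
    second_interval: int,
) -> Tuple[float, float]:
    """Scale a motion vector borrowed from the image at the same time.

    first_interval:  signed time from the target image to its first reference image.
    second_interval: signed time from the same-time image to the second
                     reference image that `mv` points to.
    Assumption: the adjustment is the linear ratio first_interval / second_interval.
    """
    if second_interval == 0:
        raise ValueError("the second time interval must be non-zero")
    return (
        mv[0] * first_interval / second_interval,
        mv[1] * first_interval / second_interval,
    )
```

For example, if the motion vector `(8, -4)` in the second moving image spans two frame intervals while the target image refers to a reference image one frame interval away in the same direction, the sketch halves the vector to `(4.0, -2.0)`.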

An image decoding device comprising:
a storage unit that stores encoded data obtained by encoding a plurality of moving images respectively corresponding to a plurality of viewpoints; and
an arithmetic unit that, in the process of decoding the encoded data, detects, from among the images included in a first moving image of the plurality of moving images, a target image having inter-viewpoint motion information that refers to an image at the same time included in a second moving image, and decodes an image region of the target image using a motion vector of the image at the same time,
wherein the arithmetic unit adjusts the length of the motion vector used for decoding the image region based on a first time interval between the target image and a first reference image referred to from the image region of the target image in the first moving image, and a second time interval between the image at the same time and a second reference image referred to by the motion vector in the second moving image.

An image decoding method in which a computer:
acquires the encoded data from a storage unit storing encoded data obtained by encoding a plurality of moving images respectively corresponding to a plurality of viewpoints;
in the process of decoding the encoded data, detects, from among the images included in a first moving image of the plurality of moving images, a target image having inter-viewpoint motion information that refers to an image at the same time included in a second moving image, and decodes an image region of the target image using a motion vector of the image at the same time; and
in the decoding process, adjusts the length of the motion vector used for decoding the image region based on a first time interval between the target image and a first reference image referred to from the image region of the target image in the first moving image, and a second time interval between the image at the same time and a second reference image referred to by the motion vector in the second moving image.