JP4355914B2

JP4355914B2 - Multi-view image transmission system and method, multi-view image compression device and method, multi-view image decompression device and method, and program

Info

Publication number: JP4355914B2
Application number: JP2003343303A
Authority: JP
Inventors: 順一郎石井
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2003-10-01
Filing date: 2003-10-01
Publication date: 2009-11-04
Anticipated expiration: 2023-10-01
Also published as: JP2005110113A

Description

The present invention relates to a multi-view image compression device and method for compressing multi-view images, including stereoscopic images; to a multi-view image decompression device and method for decompressing image data compressed by such a device or method; and to a multi-view image transmission system and method comprising the multi-view image compression device and the multi-view image decompression device.

Stereoscopic video systems for transmitting stereoscopic images, which are more striking than planar images, have long been proposed. Such systems include binocular stereoscopic video systems, which exploit the binocular parallax of the human eye by using two parallax images (left and right), and multi-view stereoscopic video systems, which use images of a single object captured from multiple viewpoints.

Various stereoscopic image transmission systems have been proposed for transmitting stereoscopic images obtained by capturing a single object from multiple viewpoints in this way.

As prior art for such stereoscopic image transmission systems, Patent Document 1 discloses a high-efficiency image coding scheme. FIG. 34 is a block diagram showing the configuration of the stereoscopic image compression device in this conventional stereoscopic image transmission system.

In this conventional stereoscopic image compression device, pattern matching units 4005 to 4007 each perform pattern matching, that is, motion compensation or parallax compensation, between the picture to be encoded 4001 and one of the reference pictures 4002 to 4004, which differ from it temporally or spatially, and obtain a compensation vector. The selection unit 4008 selects the reference picture with the smallest error among reference pictures 4002 to 4004 and transmits a selection flag identifying it together with the compensation vector. The encoder 4009 obtains the prediction error between the reference picture indicated by the selection flag and the picture to be encoded, and transmits it to the receiving side.

Because this prior art uses multiple temporally or spatially separated parallax images as reference pictures, it can improve prediction efficiency. However, its prediction structure is more complex than that of existing video coding standards, so the configuration of existing video LSIs must be changed substantially, which increases cost.

As another prior art, Patent Document 2 discloses a high-efficiency stereoscopic video encoding/decoding device and method. FIG. 35 shows the configuration of this prior art. It comprises a delay device 4101 that delays one of the left and right parallax images, which are input to the encoding device simultaneously, by one picture period; image processing units 4102 and 4103 that combine the image delayed by one picture period by the delay device 4101 with the other, undelayed image by placing them side by side or one above the other within a single picture; and an encoding unit 4104 that encodes the picture combined by the image processing units 4102 and 4103. The encoding unit 4104 performs encoding conforming to the MPEG (Moving Picture Experts Group) standard. When detecting motion vectors for a P frame or a B frame, it adds, to the motion vector search range centered on the position in the reference picture that coincides with the block being encoded, a second search range whose center is shifted by half a picture, and obtains the motion vector over both ranges.

This prior art has the advantage of improving prediction efficiency, because the block with the smaller prediction error can be selected from two similar regions in the spatially arranged left and right parallax images. However, when many long motion vectors that cross the boundary between the multiplexed images are selected, the motion vector code amount increases substantially, and Patent Document 2 does not address this problem at all.

Such a system can transmit not only stereoscopic images composed of multiple parallax images but also multi-view images obtained by capturing a single object from multiple viewpoints, so in a broad sense it can be described as a multi-view image transmission system. Hereinafter, a system for transmitting multi-view images obtained by capturing a single object from multiple viewpoints is therefore referred to as a multi-view image transmission system.
Patent Document 1: JP-A-6-98312
Patent Document 2: JP-A-11-113026

The conventional multi-view image transmission systems and methods described above have the following problems.
(1) With the technique of Patent Document 1, the configurations of the codec LSIs for planar video developed to date cannot be used as they are and must be changed substantially.
(2) With the technique of Patent Document 2, when many motion vectors that cross the boundary of the multiplexed image formed from the left and right parallax images are selected, the motion vector code amount increases substantially, and stereoscopic images cannot be compressed and transmitted efficiently.

An object of the present invention is to provide a multi-view image transmission system and method, a multi-view image compression device and method, and a multi-view image decompression device and method that can use the configurations of the codec LSIs for planar video developed to date almost unchanged and can compress and transmit multi-view images efficiently.

To achieve the above object, the present invention provides a multi-view image compression device that compresses the data amount by decomposing a plurality of multi-view images, obtained by capturing a given object from a plurality of viewpoints, into motion vectors and difference images, encoding them, and outputting them as a video stream, the device comprising:
multi-view image multiplexing means for spatially multiplexing the plurality of multi-view images to generate a single multiplexed image;
motion vector detection means for detecting a motion vector by taking a predetermined region of the multiplexed image as the motion vector search range and selecting, from among the blocks of the multiplexed image used as the reference picture, the block that is similar to the target block of the multiplexed image being encoded and gives the highest prediction efficiency;
motion vector decomposition means for decomposing the motion vector detected by the motion vector detection means into an offset vector, which leads to an offset block located, within the multi-view image containing the selected block, at the same coordinates as the local coordinates of the target block within its own multi-view image, and a local motion vector, which leads from the offset block to the selected block; and
multiplexing means for multiplexing the local motion vector and the offset vector produced by the motion vector decomposition means into the video stream and outputting the result.

According to the present invention, the start and end points of the offset vector have the same local coordinates within their respective multi-view images, so the code representing the offset vector needs only as many bits as are required to indicate which of the N multi-view images the end point of the vector lies in. The local motion vector can be represented with the small number of bits needed to express local motion within a single multi-view image. Therefore, decomposing the motion vector into an offset vector and a local motion vector according to the present invention substantially reduces the motion vector code amount compared with encoding the motion vector without decomposition. In addition, because the multiple multi-view images are multiplexed into a single large picture, the prediction structure of existing video standards can be used as is, and the configurations of the codec LSIs for planar video developed to date can be used almost unchanged, so low-cost, highly efficient compression of multi-view images can be achieved.

In another multi-view image compression device of the present invention, the plurality of multi-view images are a right-eye image and a left-eye image, or a right-eye image, a left-eye image, and an additional image that is either the right-eye image or the left-eye image at twice the horizontal resolution.

In another multi-view image compression device of the present invention, the plurality of multi-view images may consist of a plurality of stereoscopic images and a planar image of higher resolution than the stereoscopic images, and the device may further comprise discrimination means that determines whether the target block lies within a stereoscopic image or within the planar image of the multiplexed image, outputs the motion vector detected by the motion vector detection means to the motion vector decomposition means when the target block lies within a stereoscopic image, and outputs that motion vector to the multiplexing means when the target block lies within the planar image.
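The routing performed by the discrimination means can be sketched as follows. This is only an illustration under stated assumptions: the rectangle-based layout description, the function names `in_rect` and `route_motion_vector`, and the returned labels are hypothetical and do not appear in the patent, which does not specify how the region test is implemented.

```python
def in_rect(point, rect):
    """Return True if point (x, y) lies inside rect (x, y, width, height)."""
    x, y = point
    rx, ry, rw, rh = rect
    return rx <= x < rx + rw and ry <= y < ry + rh


def route_motion_vector(block_xy, stereo_rects):
    """Decide where the detected motion vector of a target block goes.

    block_xy     -- top-left corner of the target block in the multiplexed image
    stereo_rects -- assumed list of rectangles occupied by the stereoscopic images
    """
    if any(in_rect(block_xy, r) for r in stereo_rects):
        # Target block is in a stereoscopic image: decompose the vector.
        return "decompose"
    # Target block is in the planar image: pass the vector on unchanged.
    return "multiplex_directly"


# Hypothetical layout: two 320x240 stereoscopic images on top,
# one 640x480 planar image below them.
STEREO_RECTS = [(0, 0, 320, 240), (320, 0, 320, 240)]
print(route_motion_vector((16, 32), STEREO_RECTS))
print(route_motion_vector((100, 300), STEREO_RECTS))
```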

In another multi-view image compression device of the present invention, the code representing the local motion vector may be encoded in accordance with the motion vector format of the MPEG standard.

In another multi-view image compression device of the present invention, the code representing the offset vector may be expressed by referring to a fixed-length code table of at least [log₂(M−1)] + 1 bits, predetermined for each of the M images arranged in the multiplexed image, where [x] denotes the greatest integer not exceeding x.
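As a worked illustration of the bit count [log₂(M−1)] + 1, the following sketch computes the minimum code width and builds one possible fixed-length table. The function names and the choice of plain binary codewords are assumptions made for illustration; the patent only requires that some predetermined table of at least this width exists. For M ≥ 2, Python's `int.bit_length()` applied to M−1 gives exactly floor(log₂(M−1)) + 1.

```python
def offset_code_bits(m):
    """Minimum fixed-length code width for M images: floor(log2(M-1)) + 1."""
    if m < 2:
        # The formula is undefined for M = 1 (log2 of 0).
        raise ValueError("at least two images are required")
    return (m - 1).bit_length()


def build_offset_table(m):
    """Map each image index 0..M-1 to a binary codeword of the minimum width.

    Plain binary numbering is an assumption; any predetermined table works.
    """
    bits = offset_code_bits(m)
    return {index: format(index, f"0{bits}b") for index in range(m)}


for m in (2, 4, 5, 9):
    print(m, offset_code_bits(m))
print(build_offset_table(3))
```

With two images a single bit suffices, and four images need two bits, which is far less than a full-range motion vector spanning the multiplexed picture would require.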

Furthermore, in another multi-view image compression device of the present invention, the code representing the offset vector may be variable-length coded according to the distance between the viewpoint of the multi-view image containing the target block and that of the parallax image containing the selected block, or may be run-length coded over a group of offset vectors corresponding to two or more mutually adjacent blocks in the multiplexed picture.
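The run-length option can be sketched as below, assuming each block's offset vector is represented by the index of the image it points into and that adjacent blocks are visited in scan order. The representation as `(index, run)` pairs and the function names are hypothetical; the patent does not define the run-length syntax in this passage.

```python
def rle_encode(offset_indices):
    """Collapse consecutive identical offset indices into (index, run) pairs."""
    runs = []
    for value in offset_indices:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([value, 1])   # start a new run
    return [tuple(run) for run in runs]


def rle_decode(runs):
    """Expand (index, run) pairs back into the per-block offset indices."""
    decoded = []
    for value, count in runs:
        decoded.extend([value] * count)
    return decoded


# Offsets of six adjacent blocks: most point into the same image.
offsets = [0, 0, 0, 1, 1, 0]
encoded = rle_encode(offsets)
print(encoded)
print(rle_decode(encoded) == offsets)
```

Run-length coding pays off when neighboring blocks tend to be predicted from the same image, which is plausible because their content is spatially correlated.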

In another multi-view image compression device of the present invention, the video stream compressed by the multi-view image compression means may be a video stream conforming to the MPEG standard; the code representing the offset vector may be inserted into the user data section, the header section, or both, of the MPEG-compliant stream; and a flag indicating the position where the offset vector is present, a flag indicating the coding format of the offset vector, and a flag indicating the arrangement order of the images within the multiplexed image may be inserted into the user data section of the MPEG-compliant stream.

To achieve the above object, the present invention also provides a multi-view image decompression device that restores the original multi-view images by receiving and decompressing a video stream generated as follows. A multiplexed image, obtained by spatially multiplexing a plurality of multi-view images of a given object captured from a plurality of viewpoints, is decomposed into motion vectors and difference images and encoded. In doing so, a motion vector is detected by taking a predetermined region of the multiplexed image as the motion vector search range and selecting, from among the blocks of the multiplexed image used as the reference picture, the block that is similar to the target block being encoded and gives the highest prediction efficiency. The detected motion vector is decomposed into an offset vector, which leads to an offset block located, within the multi-view image containing the selected block, at the same coordinates as the local coordinates of the target block within its own multi-view image, and a local motion vector, which leads from the offset block to the selected block. The local motion vector and the offset vector are then encoded and multiplexed to produce the video stream. The decompression device comprises:
separation means for separating the local motion vector and the offset vector contained in the received video stream;
motion vector synthesis means for synthesizing a motion vector from the local motion vector and the offset vector separated by the separation means; and
multiplexed image restoration means for forming a predicted image from the motion vector synthesized by the motion vector synthesis means and a reference picture in the received video stream, and restoring the original multiplexed image by adding the predicted image to the difference image contained in the video stream.
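The synthesis and restoration steps on the receiving side can be sketched as follows. The sketch assumes that vectors are `(dx, dy)` pixel displacements, that synthesis is simple vector addition of the offset vector and the local motion vector, and that pictures are lists of rows of integer samples. The function names, the square block shape, and the absence of clipping and sub-pixel interpolation are all simplifications for illustration, not details taken from the patent.

```python
def synthesize_motion_vector(offset, local):
    """Recombine the offset vector and local motion vector (assumed additive)."""
    return (offset[0] + local[0], offset[1] + local[1])


def restore_block(reference, block_xy, motion_vector, residual):
    """Restore one block: predicted samples from the reference plus the residual.

    reference     -- reference multiplexed picture as a list of rows
    block_xy      -- top-left corner (x, y) of the target block
    motion_vector -- synthesized (dx, dy)
    residual      -- square difference block as a list of rows
    """
    x, y = block_xy
    ref_x, ref_y = x + motion_vector[0], y + motion_vector[1]
    size = len(residual)
    return [
        [reference[ref_y + j][ref_x + i] + residual[j][i] for i in range(size)]
        for j in range(size)
    ]


# Tiny 8x4 reference picture whose sample value encodes its position.
reference = [[row * 8 + col for col in range(8)] for row in range(4)]
mv = synthesize_motion_vector(offset=(4, 0), local=(1, 1))
block = restore_block(reference, (0, 0), mv, residual=[[1, 0], [0, -1]])
print(mv, block)
```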

In the present invention, the decompression device receives from the multi-view image compression device a video stream in which each motion vector has been decomposed into an offset vector and a local motion vector and encoded, separates the local motion vector and the offset vector from the stream, and combines them to obtain the original motion vector. It then restores the original multi-view images by decompressing the received video stream using the synthesized motion vector. Because the start and end points of the offset vector have the same local coordinates within their respective multi-view images, the code representing the offset vector needs only as many bits as are required to indicate which of the N multi-view images the end point of the vector lies in. The local motion vector can be represented with the small number of bits needed to express local motion within a single multi-view image. The motion vector code amount is therefore substantially smaller than when the motion vector is encoded without decomposition. In addition, because the original multi-view images are obtained by restoring the multiplexed image, in which they were multiplexed into a single large picture, and then separating it, the prediction structure of existing video standards can be used as is, and the configurations of the codec LSIs for planar video developed to date can be used almost unchanged, so low-cost, highly efficient compression of multi-view images can be achieved.

In another multi-view image decompression device of the present invention, the plurality of multi-view images may consist of a plurality of stereoscopic images and a planar image of higher resolution than the stereoscopic images, and the device may further comprise discrimination means that detects the arrangement order of the stereoscopic images and the planar image within the multiplexed image, outputs the motion vector separated from the video stream by the separation means to the motion vector synthesis means as a local motion vector when the target block lies within a stereoscopic image, and outputs the motion vector separated from the video stream by the separation means to the multiplexed image restoration means when the target block lies within the planar image.

To achieve the above object, the present invention also provides a multi-view image transmission system that decomposes a plurality of multi-view images, obtained by capturing a given object from a plurality of viewpoints, into motion vectors and difference images, encodes them, transmits them as a video stream, and restores the original multi-view images by receiving and decompressing the transmitted video stream, the system comprising:
a multi-view image compression device having multi-view image multiplexing means for spatially multiplexing the plurality of multi-view images to generate a single multiplexed image; motion vector detection means for detecting a motion vector by taking a predetermined region of the multiplexed image as the motion vector search range and selecting, from among the blocks of the multiplexed image used as the reference picture, the block that is similar to the target block of the multiplexed image being encoded and gives the highest prediction efficiency; motion vector decomposition means for decomposing the motion vector detected by the motion vector detection means into an offset vector, which leads to an offset block located, within the multi-view image containing the selected block, at the same coordinates as the local coordinates of the target block within its own multi-view image, and a local motion vector, which leads from the offset block to the selected block; and multiplexing means for encoding the local motion vector and the offset vector produced by the motion vector decomposition means, multiplexing them into the video stream, and outputting the result; and
a multi-view image decompression device having separation means for separating the local motion vector and the offset vector contained in the video stream received from the multi-view image compression device; motion vector synthesis means for synthesizing a motion vector from the local motion vector and the offset vector separated by the separation means; and multiplexed image restoration means for forming a predicted image from the motion vector synthesized by the motion vector synthesis means and a reference picture in the received video stream, and restoring the original multiplexed image by adding the predicted image to the difference image contained in the video stream.

In another multi-view image transmission system of the present invention, the plurality of multi-view images may consist of a plurality of stereoscopic images and a planar image of higher resolution than the stereoscopic images.
In this case, the multi-view image compression device further comprises first discrimination means that determines whether the target block lies within a stereoscopic image or within the planar image of the multiplexed image, outputs the motion vector detected by the motion vector detection means to the motion vector decomposition means when the target block lies within a stereoscopic image, and outputs that motion vector to the multiplexing means when the target block lies within the planar image.
The multi-view image decompression device further comprises second discrimination means that detects the arrangement order of the stereoscopic images and the planar image within the multiplexed image, outputs the motion vector separated from the video stream by the separation means to the motion vector synthesis means as a local motion vector when the target block lies within a stereoscopic image, and outputs the motion vector separated from the video stream by the separation means to the multiplexed image restoration means when the target block lies within the planar image.

In the present invention, the multi-view image compression device decomposes each motion vector into an offset vector and a local motion vector and transmits them within the video stream, and the multi-view image decompression device separates the offset vector and the local motion vector from the received video stream, combines them to recover the original motion vector, and uses that motion vector to restore the original multiplexed image from the received video stream. Because the start and end points of the offset vector have the same local coordinates within their respective multi-view images, the code representing the offset vector needs only as many bits as are required to indicate which of the N multi-view images the end point of the vector lies in. The local motion vector can be represented with the small number of bits needed to express local motion within a single multi-view image. Transmitting the video stream in this way therefore substantially reduces the motion vector code amount compared with encoding the motion vector without decomposition. In addition, the multi-view image compression device multiplexes the multiple multi-view images into a single large picture before transmission, and the multi-view image decompression device obtains the original multi-view images by separating the restored multiplexed image. The prediction structure of existing video standards can therefore be used as is, and the configurations of the codec LSIs for planar video developed to date can be used almost unchanged, so low-cost, highly efficient compression of multi-view images can be achieved.

As described above, the present invention provides the following effects.
(1) In the multi-view image transmission system, the multi-view image compression device decomposes each detected motion vector into an offset vector and a local motion vector, multiplexes them into the video stream, and transmits it. The multi-view image decompression device separates the offset vector and the local motion vector from the received video stream, combines them to recover the original motion vector, and uses that motion vector to restore the original multiplexed image from the received video stream. The motion vector code amount is therefore substantially smaller than when the motion vector is encoded without decomposition.
(2) In the multi-view image transmission system, the multi-view image compression device multiplexes the multiple multi-view images to be transmitted into a single large picture before transmission, and the multi-view image decompression device obtains the original multi-view images by separating the restored multiplexed image. The prediction structure of existing video standards can therefore be used as is, and the configurations of the codec LSIs for planar video developed to date can be used almost unchanged, so low-cost, highly efficient compression of multi-view images can be achieved.

Next, embodiments of the present invention are described in detail with reference to the drawings. The description uses a stereoscopic image transmission system, which is one kind of multi-view image transmission system.

(First embodiment)
FIG. 1 is a block diagram showing the configuration of the stereoscopic image transmission system according to the first embodiment of the present invention. The stereoscopic image transmission system of this embodiment comprises a stereoscopic image compression device 10, which compresses the data amount by decomposing a plurality of parallax images into motion vectors and difference images, encoding them, and outputting them as a video stream; a stereoscopic image decompression device 20; and a transmission path connecting the stereoscopic image compression device 10 and the stereoscopic image decompression device 20.

The stereoscopic image compression apparatus 10 comprises a stereoscopic image multiplexing unit 104, a multiplexed image compression unit 105, a motion vector decomposition unit 106, and a transmission/recording unit 107. As shown in FIG. 1, the stereoscopic image decompression apparatus 20 comprises a reception/reproduction unit 108, a multiplexed image decompression unit 109, a motion vector synthesis unit 110, and a stereoscopic image separation unit 111.

The stereoscopic image multiplexing unit 104 spatially multiplexes an input stereoscopic image, which consists of a plurality of parallax images, into a single multiplexed image. In the following, the stereoscopic image input to the stereoscopic image multiplexing unit 104 is assumed to consist of N images, from the first eye image 101_1 to the Nth eye image 101_N.

The motion vector decomposition unit 106 decomposes the motion vector detected by the multiplexed image compression unit 105 into an offset vector that leads to an offset block and a local motion vector that leads from the offset block to the selected block. Here, the offset block is the block that, within the view image of the reference picture containing the block selected during motion vector detection, is located at the same local coordinates as the encoding target block has within its own view image.

In the stereoscopic image multiplexing unit 104, the N images of the stereoscopic image are arranged spatially and multiplexed into one large image, as shown in FIG. 2. Besides the arrangement shown in FIG. 2, the images may be lined up vertically or horizontally, and the order of the parallax images need not be as shown in FIG. 2. The multiplexed image is compressed by the multiplexed image compression unit 105; the motion vector information obtained in that process is decomposed into an offset vector and a local motion vector by the motion vector decomposition unit 106 and inserted into the output stream. The compressed stereoscopic image is transmitted or recorded as a single stream by the transmission/recording unit 107.
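The spatial multiplexing step can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the names `multiplex_views` and `demultiplex_views`, the nested-list image representation, and the row-major grid with `cols` views per row are all assumptions, since the text permits any arrangement and order.

```python
from typing import List

Image = List[List[int]]  # rows of pixel values (hypothetical grayscale representation)


def multiplex_views(views: List[Image], cols: int) -> Image:
    """Tile equally sized view images into one large image, row-major."""
    if not views:
        raise ValueError("at least one view is required")
    h, w = len(views[0]), len(views[0][0])
    if any(len(v) != h or any(len(r) != w for r in v) for v in views):
        raise ValueError("all views must have the same size")
    rows = -(-len(views) // cols)  # ceiling division
    canvas = [[0] * (w * cols) for _ in range(h * rows)]
    for k, view in enumerate(views):
        top, left = (k // cols) * h, (k % cols) * w
        for y in range(h):
            canvas[top + y][left:left + w] = view[y]
    return canvas


def demultiplex_views(canvas: Image, n: int, cols: int, h: int, w: int) -> List[Image]:
    """Inverse of multiplex_views: cut the n views back out of the large image."""
    views = []
    for k in range(n):
        top, left = (k // cols) * h, (k % cols) * w
        views.append([row[left:left + w] for row in canvas[top:top + h]])
    return views
```

Because the receiver has to undo the same arrangement, whatever layout is chosen must be known on both sides.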

The multiplexed image produced by the stereoscopic image multiplexing unit 104 is divided into a grid of macroblocks, and motion vector detection and motion compensation are performed per macroblock. A macroblock is the block of 16 pixels × 16 lines used in standards such as MPEG, as shown in FIG. 3.

The specific method for inserting the offset vector and the local motion vector into the output stream will be described collectively after the first to fifth embodiments have been described.

The multiplexed image compressed by the stereoscopic image compression apparatus 10 is sent over the transmission path to the stereoscopic image decompression apparatus 20, where it is received or reproduced as a single stream by the reception/reproduction unit 108. The received or reproduced multiplexed image is decompressed by the multiplexed image decompression unit 109. In that process, the motion vector synthesis unit 110 synthesizes a single motion vector from the received local motion vector and offset vector, and the multiplexed image is restored using the synthesized motion vector information. The decompressed multiplexed image is separated by the stereoscopic image separation unit 111 into a first eye image 112_1, a second eye image 112_2, ..., and an Nth eye image 112_N. An image in which the first eye image 112_1 to the Nth eye image 112_N are arranged column by column is then created and displayed on a stereoscopic display, realizing N-eye stereoscopic display.

Next, the method of encoding the multiplexed image in the multiplexed image compression unit 105 will be described. As shown in FIG. 4, the multiplexed image compression unit 105 comprises a motion vector detection unit 304, a DCT transform unit 310, a quantization unit 311, an inverse quantization unit 315, a variable length coding unit 312, a multiplexing unit 313, an inverse DCT transform unit 316, a prediction memory 303, and a motion compensation unit 306.

As shown in FIG. 4, the multiplexed image compression unit 105 has almost the same configuration as the various existing planar video compression units. It has only the functions needed to compress planar video efficiently, such as motion compensation, which exploits the temporal correlation of moving images, and the DCT (Discrete Cosine Transform), which is used to remove high-frequency components in the spatial direction; it has no function specific to stereoscopic images. It differs from an existing unit only in that the motion vector 305 detected by the motion vector detection unit 304 is output to the motion vector decomposition unit 106, and the local motion vector 308 and offset vector 309 from the motion vector decomposition unit 106 are input to the multiplexing unit 313. In FIG. 4 the motion vector decomposition unit 106 is located outside the multiplexed image compression unit 105, but it may instead be included inside the multiplexed image compression unit 105.

The motion vector detection unit 304 uses the entire multiplexed image as the motion vector search range. It detects a motion vector by selecting, from among the blocks of the multiplexed image used as the reference image, the block similar to the encoding target block of the multiplexed image being encoded that gives the highest prediction efficiency.

The multiplexed image compression unit 105 operates as follows. First, the motion vector detection unit 304 compares, block by block, the multiplexed image input to the multiplexed image compression unit 105 with a past or future reference image stored in the prediction memory 303, and detects the motion vector 305. Prediction efficiency can be improved at this point by selecting the block with the smallest prediction error from among the similar portions of the N parallax images arranged in the multiplexed image. Using this motion vector information, the motion compensation unit 306 reads the corresponding data from the reference image stored in the prediction memory 303 to form a predicted image, and the difference from the input multiplexed image is taken. Meanwhile, the motion vector is decomposed into the offset vector 309 and the local motion vector 308 by the motion vector decomposition unit 106. The difference image passes through the DCT transform unit 310, the quantization unit 311, and the variable length coding unit 312, and is multiplexed by the multiplexing unit 313, together with the local motion vector information and the offset vector information, into a single moving image stream 314. In addition, the reference image used to compress the multiplexed image of the next frame is stored in the prediction memory 303 after passing through the inverse quantization unit 315 and the inverse DCT transform unit 316.
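The motion vector detection described above can be illustrated with a small block-matching sketch. The text only states that the block with the smallest prediction error is selected; the sum of absolute differences (SAD) as the error measure, the exhaustive pixel-by-pixel search, and the function names `sad` and `find_motion_vector` are assumptions made for illustration.

```python
from typing import List, Tuple

Image = List[List[int]]


def sad(cur: Image, ref: Image, cx: int, cy: int, rx: int, ry: int, size: int) -> int:
    """Sum of absolute differences between a block of cur and a block of ref."""
    return sum(
        abs(cur[cy + dy][cx + dx] - ref[ry + dy][rx + dx])
        for dy in range(size)
        for dx in range(size)
    )


def find_motion_vector(cur: Image, ref: Image, cx: int, cy: int,
                       size: int) -> Tuple[Tuple[int, int], int]:
    """Search the whole reference image for the best match of the block at (cx, cy).

    Returns ((dx, dy), cost). Ties are broken in favour of the shorter vector.
    """
    best = None
    height, width = len(ref), len(ref[0])
    for ry in range(height - size + 1):
        for rx in range(width - size + 1):
            cost = sad(cur, ref, cx, cy, rx, ry, size)
            candidate = (cost, abs(rx - cx) + abs(ry - cy), rx - cx, ry - cy)
            if best is None or candidate < best:
                best = candidate
    return (best[2], best[3]), best[0]
```

Searching the entire multiplexed reference image is what allows a block in one view to be predicted from a similar block in a different view.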

Next, the motion vector decomposition method used in the motion vector decomposition unit 106 and the method of encoding the decomposed vectors will be described in more detail.

Motion vector decomposition is explained with reference to FIG. 5. In the following, the block at local coordinates (x, y) within the k-th eye image is denoted by the block coordinates (k, x, y). The encoding target block 401 is in the sixth eye image at local coordinates (i, j), so its block coordinates are (6, i, j). The reference block 403 is in the first eye image at local coordinates (i', j'), so its block coordinates are (1, i', j'). The motion vector 405 input to the motion vector decomposition unit 106 is decomposed as follows. It is split into an offset vector 407, which leads from the encoding target block 401 at block coordinates (6, i, j) to the offset block 406 located at the same local coordinates in the first eye image, that is, at block coordinates (1, i, j), and a local motion vector 408, which leads from the offset block 406 to the reference block 403 at block coordinates (1, i', j').
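The decomposition can be sketched in a few lines. The `Layout` type, the row-major grid of 1-based views, pixel-unit coordinates, and the rule that a block belongs to the view containing its top-left corner are assumptions for illustration; the text does not fix any of them.

```python
from typing import NamedTuple, Tuple

Vec = Tuple[int, int]


class Layout(NamedTuple):
    cols: int    # views per row in the multiplexed image (hypothetical layout)
    view_w: int  # width of one view in pixels
    view_h: int  # height of one view in pixels


def view_origin(layout: Layout, k: int) -> Vec:
    """Top-left corner of the k-th view (1-based) in the multiplexed image."""
    idx = k - 1
    return ((idx % layout.cols) * layout.view_w, (idx // layout.cols) * layout.view_h)


def view_of(layout: Layout, x: int, y: int) -> int:
    """1-based index of the view that contains the global position (x, y)."""
    return (y // layout.view_h) * layout.cols + (x // layout.view_w) + 1


def decompose(layout: Layout, target_xy: Vec, mv: Vec) -> Tuple[int, Vec, Vec]:
    """Split mv into (reference view index, offset vector, local motion vector)."""
    tx, ty = target_xy
    rx, ry = tx + mv[0], ty + mv[1]
    otx, oty = view_origin(layout, view_of(layout, tx, ty))
    ref_view = view_of(layout, rx, ry)
    orx, ory = view_origin(layout, ref_view)
    offset = (orx - otx, ory - oty)                  # target block -> offset block
    local = (mv[0] - offset[0], mv[1] - offset[1])   # offset block -> reference block
    return ref_view, offset, local
```

For six views of 64 × 48 pixels arranged three per row, a target block at local (16, 32) in view 6 and a reference block at local (20, 30) in view 1 give the motion vector (-124, -50), which splits into the offset vector (-128, -48) and the local motion vector (4, -2).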

The local motion vector only needs to express local motion within a parallax image, so it can be variable-length coded according to the motion vector coding method of a planar video standard.

Next, the method of encoding the offset vector will be described. The offset vector code only needs to express which parallax image contains the selected block. The first method to consider is therefore to assign each parallax image a fixed-length code of at least [log₂(M − 1)] + 1 bits, as in FIG. 6, and to encode by referring to this table (here [x] is the largest integer not exceeding x, and M is the number of parallax images in the multiplexed image).
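The bit count in this formula can be checked with a short sketch. The function names are hypothetical, and the plain binary assignment in `fixed_length_code` is only one possible table; the actual table of FIG. 6 is not reproduced here.

```python
def offset_code_bits(m: int) -> int:
    """Minimum fixed-length code size [log2(M - 1)] + 1 for M parallax images."""
    if m < 2:
        raise ValueError("the formula is defined for M >= 2")
    # For m >= 2, (m - 1).bit_length() equals floor(log2(m - 1)) + 1.
    return (m - 1).bit_length()


def fixed_length_code(view: int, m: int) -> str:
    """Binary code for a 1-based view index (an assumed assignment, not FIG. 6)."""
    if not 1 <= view <= m:
        raise ValueError("view index out of range")
    return format(view - 1, "0{}b".format(offset_code_bits(m)))
```

With M = 6 this gives 3 bits per offset vector, and the same 3 bits suffice up to M = 8.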

Furthermore, the block selected in motion vector detection often lies in the same viewpoint image as the encoding target block, and is less likely to lie in an image whose viewpoint is far away. The offset vector information can therefore be encoded efficiently by using a short code when the viewpoint distance between the encoding target block and the reference block is small and a long code when it is large. As one example, a variable length code table such as the one shown in FIG. 7 can be prepared, and the code can be determined from the parallax image to which the encoding target block belongs and the parallax image to which the detected reference block belongs.
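One way to realize such a distance-dependent code is sketched below. It is not the table of FIG. 7, which is not reproduced here: the truncated unary code, the use of the view index difference as the viewpoint distance, and the tie-break toward the lower index are all assumptions chosen to keep the example short and uniquely decodable.

```python
from typing import List, Tuple


def ranked_views(target: int, m: int) -> List[int]:
    """Views 1..m ordered by distance from the target view (nearest first)."""
    return sorted(range(1, m + 1), key=lambda k: (abs(k - target), k))


def vlc_encode(target: int, ref: int, m: int) -> str:
    """Truncated unary code: nearer reference views get shorter codes."""
    rank = ranked_views(target, m).index(ref)
    return "1" * rank + ("0" if rank < m - 1 else "")


def vlc_decode(target: int, bits: str, m: int) -> Tuple[int, int]:
    """Return (reference view, number of bits consumed) from the front of bits."""
    rank = 0
    while rank < m - 1 and bits[rank] == "1":
        rank += 1
    consumed = rank + (1 if rank < m - 1 else 0)
    return ranked_views(target, m)[rank], consumed
```

A reference in the same view as the target costs a single bit here, at the price of longer codes for distant views.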

Because offset vectors are correlated between adjacent blocks, the offset vector information can also be run-length encoded. That is, the offset vector information is expressed as "the value of the offset vector and the number of times the value repeats". For example, suppose that, as shown in FIG. 8, the motion vectors 703 of five consecutive blocks 702 in the sixth eye image refer, from left to right, to blocks 704 belonging to the second eye image, the second eye image, the second eye image, the sixth eye image, and the sixth eye image. Listing the offset vectors in order gives "22266"; expressed in run-length form this is "three 2s, two 6s", that is, "2362". When run-length encoding is used, the offset vector value and the run count may each be coded with either a fixed-length or a variable-length code. Runs may be delimited over the entire multiplexed screen, per slice, or at the boundaries between individual parallax images.
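The run-length example above can be reproduced with a short sketch. The function names are hypothetical, and the (value, count) pairs are left as integers; how each pair is then coded (fixed or variable length) is a separate choice, as the text notes.

```python
from itertools import groupby
from typing import List, Tuple


def rle_encode(values: List[int]) -> List[Tuple[int, int]]:
    """Collapse consecutive equal offset values into (value, run length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def rle_decode(runs: List[Tuple[int, int]]) -> List[int]:
    """Expand (value, run length) pairs back into the original sequence."""
    return [value for value, count in runs for _ in range(count)]
```

Applied to the sequence 2, 2, 2, 6, 6 this yields the pairs (2, 3) and (6, 2), which correspond to the string "2362" in the text.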

A flag indicating the encoding format of the offset vector, such as whether the offset vector information uses a fixed-length or variable-length code and whether it is run-length encoded, is inserted into a field of the output stream where the user can insert arbitrary data, such as the user data section or the private data section. A flag indicating the arrangement order of the parallax images in the multiplexed image is also inserted into the user data section.

The offset vector information and local motion vector information obtained by the above processing are inserted into the output stream. The local motion vector information is encoded according to the motion vector coding scheme of the planar video coding standard and inserted as the motion vector code. The offset vector information is inserted as additional information either in the user data section or immediately before or after the local motion vector code, and a flag indicating where the offset vector information is located in the stream is inserted into the user data section.

Next, the method of decoding the multiplexed image in the multiplexed image decompression unit 109 will be described. As shown in FIG. 9, the multiplexed image decompression unit 109, like the multiplexed image compression unit 105, has the same configuration as an existing planar video decompression unit. That is, it only needs to be able to decompress a stream output by a general planar video compression unit such as the one described for the multiplexed image compression unit 105.

As shown in FIG. 9, the multiplexed image decompression unit 109 comprises a separation unit 803, a variable length decoding unit 807, an inverse quantization unit 808, an inverse DCT transform unit 809, a motion compensation unit 812, and a prediction memory 813. The variable length decoding unit 807, the inverse quantization unit 808, the inverse DCT transform unit 809, the motion compensation unit 812, and the prediction memory 813 together function as multiplexed image restoration means for restoring the compressed, encoded multiplexed image.

The multiplexed image decompression unit 109 operates as follows. First, the single moving image stream 802 input to the multiplexed image decompression unit 109 is separated by the separation unit 803 into difference image data 804, an offset vector 805, and a local motion vector 806. The difference image data 804 is variable-length decoded, inverse quantized, and inverse DCT transformed by the variable length decoding unit 807, the inverse quantization unit 808, and the inverse DCT transform unit 809, respectively, and is thereby decoded into a difference image. The offset vector 805 and the local motion vector 806 are combined into a single motion vector 811 by the motion vector synthesis unit 110. Using the synthesized motion vector 811, the motion compensation unit 812 forms a predicted image from a past or future reference image stored in the prediction memory 813. The multiplexed image is then restored by adding the predicted image and the difference image output from the inverse DCT transform unit 809.
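The final reconstruction step, adding the motion-compensated prediction to the decoded difference image, can be sketched per block as follows. The function name and the nested-list representation are assumptions, and clipping to the valid pixel range, which a real decoder would perform, is omitted for brevity.

```python
from typing import List, Tuple

Image = List[List[int]]


def reconstruct_block(ref: Image, residual: Image, x: int, y: int,
                      mv: Tuple[int, int]) -> Image:
    """Predicted block (read from ref at (x, y) + mv) plus the decoded residual."""
    rx, ry = x + mv[0], y + mv[1]
    return [
        [ref[ry + dy][rx + dx] + residual[dy][dx] for dx in range(len(residual[0]))]
        for dy in range(len(residual))
    ]
```

Here `mv` is the single motion vector obtained after the offset vector and the local motion vector have been combined.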

Next, the method of decoding the offset vector 805 and the local motion vector 806 and the motion vector synthesis method used in the motion vector synthesis unit 110 will be described in more detail.

First, the motion vector code in the moving image stream is detected and acquired as the local motion vector information. The local motion vector is decoded according to the motion vector decoding scheme defined in the relevant video standard.

The offset vector is obtained by detecting the flag in the user data that indicates where the offset vector information is located, and then extracting the offset vector code from the user data section of the moving image stream or from immediately before or after the local motion vector code. To decode the offset vector code, the flag indicating the offset vector encoding format, which is inserted in the user data section, is detected, and decoding is performed according to the format agreed in advance between the transmitting and receiving sides.

To synthesize the motion vector, the flag indicating the arrangement order of the parallax images in the multiplexed image is detected from the user data section, and the position of the offset block in the multiplexed image is calculated from it. The original motion vector is then synthesized by vector addition of the offset vector and the local motion vector.
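The synthesis step can be sketched as follows, under stated assumptions: views are numbered from 1 and laid out row-major with `cols` views per row, coordinates are in pixels, and the decoded offset code directly gives the reference view index. The function names are hypothetical.

```python
from typing import Tuple

Vec = Tuple[int, int]


def view_origin(k: int, cols: int, view_w: int, view_h: int) -> Vec:
    """Top-left corner of the k-th view (1-based) in the multiplexed image."""
    idx = k - 1
    return ((idx % cols) * view_w, (idx // cols) * view_h)


def synthesize(target_xy: Vec, ref_view: int, local_mv: Vec,
               cols: int, view_w: int, view_h: int) -> Vec:
    """Rebuild the full motion vector from the offset code and the local vector."""
    tx, ty = target_xy
    target_view = (ty // view_h) * cols + (tx // view_w) + 1
    otx, oty = view_origin(target_view, cols, view_w, view_h)
    orx, ory = view_origin(ref_view, cols, view_w, view_h)
    offset = (orx - otx, ory - oty)  # same local coordinates, different view
    return (offset[0] + local_mv[0], offset[1] + local_mv[1])
```

For six 64 × 48 views arranged three per row, a target block at global position (144, 80) in view 6 with reference view 1 and local motion vector (4, -2) yields the motion vector (-124, -50).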

In the stereoscopic image transmission system of this embodiment, the multiplexed image compression unit 105 of the stereoscopic image compression apparatus 10 decomposes the motion vector 305 detected by the motion vector detection unit 304 into the local motion vector 308 and the offset vector 309 and multiplexes them into the moving image stream. Because the start point and the end point of the offset vector 309 have the same local coordinates within their respective parallax images, the code representing the offset vector 309 needs only as many bits as are required to indicate which of the N parallax images the end point of the vector lies in. Likewise, the local motion vector 308 can be represented with the small number of bits needed to express local motion within a single parallax image. Decomposing the motion vector 305 into the offset vector 309 and the local motion vector 308 as in this embodiment therefore greatly reduces the amount of motion vector code compared with encoding the motion vector 305 without decomposition. In addition, because the N parallax images are multiplexed onto one large screen, the prediction structure of existing video standards can be used as is, and the configuration of the various planar-video codec LSIs developed to date can be used almost unchanged, so low-cost and highly efficient stereoscopic image compression can be realized.

This embodiment has been described on the assumption that the multi-view image is obtained by capturing one object from different viewpoints, but the number of objects is not limited to one. The present invention can also be applied to images obtained by photographing each of a plurality of objects from a plurality of viewpoints. In that case the encoding target block belongs to the group of images corresponding to one of the objects, so the search can be made efficient by limiting the motion vector search range to the image group corresponding to that object, as a predetermined area, instead of the entire multiplexed image. The stereoscopic image transmission system of this embodiment can therefore be used not only to transmit a plurality of stereoscopic images but also to transmit a plurality of multi-view images obtained by photographing given objects from a plurality of viewpoints.

(Second embodiment)
Next, a stereoscopic image transmission system according to a second embodiment of the present invention will be described.

The first embodiment described above transmits N parallax images as the stereoscopic image. The second embodiment of the present invention transmits, in addition to the N parallax images, a planar image whose resolution is N times that of one of the N parallax images in the column direction, included in the stereoscopic image.

The transmitted N parallax images produce stereoscopic video when shown on a stereoscopic display at the image display side. If the display side has only an ordinary flat display, however, one of the parallax images has to be stretched horizontally for display, and the horizontal resolution is degraded. So that high-definition display is possible whether the display side has a stereoscopic display or a flat display, a planar image whose resolution is N times that of one of the N parallax images in the column direction is included in the stereoscopic image and transmitted together with the N parallax images.

FIG. 10 shows the configuration of the stereoscopic image transmission system of this embodiment. In FIG. 10, components identical to those in FIG. 1 are given the same reference numerals, and their description is omitted.
As shown in FIG. 10, the stereoscopic image transmission system of this embodiment comprises a stereoscopic image compression apparatus 30, a stereoscopic image decompression apparatus 40, and a transmission path connecting the stereoscopic image compression apparatus 30 and the stereoscopic image decompression apparatus 40.

The stereoscopic image compression apparatus 30 comprises a stereoscopic image multiplexing unit 204, a multiplexed image compression unit 205, a motion vector decomposition unit 106, and a transmission/recording unit 107. The stereoscopic image decompression apparatus 40 comprises a reception/reproduction unit 108, a multiplexed image decompression unit 209, a motion vector synthesis unit 110, and a stereoscopic image separation unit 211.

In this embodiment, the stereoscopic image compression apparatus 30 receives, together with the stereoscopic image composed of the N parallax images (the first eye image 101_1 to the Nth eye image 101_N), a planar image 201 whose resolution is N times that of one of the N parallax images in the column direction.

In the stereoscopic image multiplexing unit 204, the N input images (the first eye image 101_1 to the Nth eye image 101_N) and the planar image 201 are arranged spatially as shown in FIG. 11 and multiplexed into one large image. Besides the arrangement shown in FIG. 11, the images may be lined up vertically or horizontally, and the order of the parallax images and the planar image need not be as shown in FIG. 11. The multiplexed image is compressed by the multiplexed image compression unit 205. When the encoding target block is within the N parallax images, the motion vector information obtained in that process is decomposed into an offset vector and a local motion vector by the motion vector decomposition unit and inserted into the moving image stream. When the encoding target block is within the planar image, the motion vector is inserted into the moving image stream as is. The compressed stereoscopic image and planar image are transmitted or recorded as a single stream by the transmission/recording unit 107.

The compressed multiplexed image is received or reproduced as a single stream by the reception/reproduction unit 108 and decompressed by the multiplexed image decompression unit 209. In that process, the motion vector synthesis unit 110 synthesizes a single motion vector from the received local motion vector and offset vector, and the multiplexed image is restored using the synthesized motion vector information. The decompressed multiplexed image is separated by the stereoscopic image separation unit 211 into the first to Nth eye images 112_1 to 112_N and a planar image 212. For N-eye stereoscopic display, the first to Nth eye images 112_1 to 112_N are arranged column by column and displayed on a stereoscopic display; for planar display, the planar image 212 is displayed as is on a flat display.

Next, the method of encoding the multiplexed image in the multiplexed image compression unit 205 will be described. The multiplexed image compression unit 205 is configured as shown in FIG. 12. In FIG. 12, components identical to those in FIG. 4 are given the same reference numerals, and their description is omitted.

As shown in FIG. 12, the multiplexed image compression unit 205 comprises a motion vector detection unit 304, a DCT transform unit 310, a quantization unit 311, an inverse quantization unit 315, a variable length coding unit 312, a multiplexing unit 313, an inverse DCT transform unit 316, a prediction memory 303, a motion compensation unit 306, and a determination unit 202.

The multiplexed image compression unit 205 of this embodiment has the same configuration as the multiplexed image compression unit 105 of the first embodiment shown in FIG. 4, except that the determination unit 202 is provided.

The determination unit 202 determines whether the encoding target block is within the N parallax images or within the planar image. When the block is within the N parallax images, the determination unit outputs the motion vector 305 detected by the motion vector detection unit 304 to the motion vector decomposition unit 106. When the block is within the planar image, it outputs the motion vector 305 detected by the motion vector detection unit 304 to the multiplexing unit 313 in place of the local motion vector 308.

In FIG. 12, the motion vector decomposition unit 106 is provided outside the multiplexed image compression unit 205, but it may instead be included inside the multiplexed image compression unit 205.

The multiplexed image compression unit 205 operates as follows. First, the motion vector detection unit 304 compares the input multiplexed image, block by block, with a past or future reference image stored in the prediction memory 303 and detects a motion vector 305. If the encoding target block lies within the N parallax images, the motion vector search range is the portion of the reference multiplexed image in which the N parallax images are arranged, and the block with the smallest prediction error among the N similar portions is selected. If the encoding target block lies within the planar image, the search range is limited to the portion of the reference multiplexed image in which the planar image is arranged, and motion vector detection is performed in the same way as when a single moving image is encoded. Using this motion vector information, the motion compensation unit 306 reads the corresponding data from the reference image stored in the prediction memory 303 to form a predicted image, and the difference between the predicted image and the input multiplexed image is taken. As for the motion vector 305, the determination unit 202 determines whether the encoding target block lies within the N parallax images or within the planar image. If the block lies within the N parallax images, the motion vector decomposition unit 106 decomposes the motion vector 305 into an offset vector 309 and a local motion vector 308. If the block lies within the planar image, the motion vector decomposition unit 106 is bypassed, and the motion vector 305 obtained by the motion vector detection unit 304 is sent unchanged to the multiplexing unit 313. The difference image passes through the DCT transform unit 310, the quantization unit 311, and the variable length coding unit 312, and the multiplexing unit 313 multiplexes it, together with the local motion vector 308 information and the offset vector 309 information, into a single moving image stream. In addition, the reference image used to compress the multiplexed image of the next frame is stored in the prediction memory 303 via the inverse quantization unit 315 and the inverse DCT transform unit 316.
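The routing performed by the determination unit 202 can be illustrated with a minimal Python sketch. The layout (parallax images tiled in the upper rows, the planar image below), the names `Layout` and `route_motion_vector`, and the choice of the offset vector as the tile-aligned multiple nearest to the detected vector are all assumptions made for illustration; the actual decomposition rule is the one defined in the first embodiment.

```python
from typing import NamedTuple, Optional, Tuple

Vec = Tuple[int, int]


class Layout(NamedTuple):
    """Hypothetical layout of the multiplexed image (an assumption)."""
    tile_w: int      # width of one parallax image in pixels
    tile_h: int      # height of one parallax image in pixels
    parallax_h: int  # rows [0, parallax_h) hold parallax images, rows below hold the planar image


def nearest_multiple(value: int, step: int) -> int:
    """Multiple of `step` closest to `value`, using integer arithmetic only."""
    return (value + step // 2) // step * step


def route_motion_vector(block: Vec, mv: Vec, layout: Layout) -> Tuple[Optional[Vec], Vec]:
    """Return (offset_vector, vector_sent_to_multiplexer) for one block.

    Planar-image blocks bypass decomposition: no offset vector is produced
    and the detected motion vector is forwarded unchanged.
    """
    _, by = block
    if by >= layout.parallax_h:
        return None, mv
    # Assumed decomposition: offset jumps between tiles, local is the remainder.
    offset = (nearest_multiple(mv[0], layout.tile_w),
              nearest_multiple(mv[1], layout.tile_h))
    local = (mv[0] - offset[0], mv[1] - offset[1])
    return offset, local


layout = Layout(tile_w=100, tile_h=80, parallax_h=80)
print(route_motion_vector((10, 20), (203, -2), layout))
print(route_motion_vector((10, 100), (5, 5), layout))
```

With this assumed rule the local motion vector stays small whenever the best match is the co-located block in another parallax image, which is the situation the offset vector is meant to exploit.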

The motion vector decomposition method used by the motion vector decomposition unit 106 and the method of encoding the decomposed vectors are the same as in the first embodiment. The difference is that the offset vector information inserted in the user data section contains, in addition to the arrangement order of the N parallax images in the multiplexed image, data indicating the arrangement order of the planar image.

Next, the method by which the multiplexed image decompression unit 209 of the present embodiment decodes a multiplexed image will be described. The multiplexed image decompression unit 209 is configured as shown in FIG. 13. In FIG. 13, components identical to those in FIG. 9 are denoted by the same reference numerals, and their description is omitted.

As shown in FIG. 13, the multiplexed image decompression unit 209 comprises a separation unit 803, a variable length decoding unit 807, an inverse quantization unit 808, an inverse DCT transform unit 809, a motion compensation unit 812, a prediction memory 813, and a determination unit 206.

The multiplexed image decompression unit 209 of the present embodiment has the same configuration as the multiplexed image decompression unit 109 of the first embodiment shown in FIG. 9, except that the determination unit 206 is added.

The determination unit 206 detects a flag indicating the arrangement order of the N parallax images and the planar image in the multiplexed image. If the decoding target block lies within the N parallax images, the determination unit 206 outputs the motion vector separated from the moving image stream by the separation unit 803 unchanged, as the local motion vector 806, to the motion vector synthesis unit 110. If the decoding target block lies within the planar image, it outputs the motion vector separated from the moving image stream by the separation unit 803, as the motion vector 811, to the motion compensation unit 812.

First, in the multiplexed image decompression unit 209, the separation unit 803 separates the input moving image stream into difference image data 804, an offset vector 805, and a motion vector. When the target block lies within the planar image, no offset vector exists, so the separation unit 803 does not output the offset vector 805. Next, the determination unit 206 detects the flag indicating the arrangement order of the N parallax images and the planar image in the multiplexed image. If the decoding target block lies within the N parallax images, the determination unit 206 outputs the motion vector from the separation unit 803 unchanged, as the local motion vector 806, to the motion vector synthesis unit 110. If the decoding target block lies within the planar image, the determination unit 206 outputs the motion vector from the separation unit 803, as the motion vector 811, to the motion compensation unit 812.

The difference image data 804 is variable-length decoded, inverse quantized, and inverse DCT transformed by the variable length decoding unit 807, the inverse quantization unit 808, and the inverse DCT transform unit 809, respectively, and is thereby decoded into a difference image. The offset vector 805 and the local motion vector 806 are combined into a single motion vector 811 by the motion vector synthesis unit 110. Using the combined motion vector 811, the motion compensation unit 812 forms a predicted image from a past or future reference image stored in the prediction memory 813. The multiplexed image is then restored by adding the predicted image to the difference image from the inverse DCT transform unit 809.
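The decoder-side handling of the vectors can be sketched as follows. This is only an illustration: it assumes that synthesis in the motion vector synthesis unit 110 is a component-wise sum of the offset vector and the local motion vector (the exact rule is given in the first embodiment), and the function names are hypothetical.

```python
from typing import List, Optional, Tuple

Vec = Tuple[int, int]


def synthesize_motion_vector(separated_mv: Vec, offset: Optional[Vec]) -> Vec:
    """Return the motion vector handed to motion compensation.

    offset is None for blocks inside the planar image (no offset vector is
    transmitted), in which case the separated vector is used unchanged.
    Otherwise the separated vector is treated as the local motion vector and
    added to the offset vector (assumed synthesis rule).
    """
    if offset is None:
        return separated_mv
    return (offset[0] + separated_mv[0], offset[1] + separated_mv[1])


def restore_block(predicted: List[List[int]], residual: List[List[int]]) -> List[List[int]]:
    """Restore a block as predicted image plus decoded difference image."""
    return [[p + r for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(predicted, residual)]


print(synthesize_motion_vector((3, -2), (200, 0)))
print(synthesize_motion_vector((5, 5), None))
print(restore_block([[10, 20], [30, 40]], [[1, -1], [0, 2]]))
```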

The offset vector and the local motion vector are decoded by the same method as in the first embodiment. The only slight difference is that data indicating the arrangement order of the N parallax images and the planar image in the multiplexed image is detected, and the position of the offset block within the multiplexed image is calculated from it.

In the present embodiment, the planar image is obtained by stretching one of the N parallax images in the column direction, but the planar image used here may instead be an image that shows a different object. In that case, the motion vector search for an encoding target block within the planar image covers only the portion in which the planar image is arranged rather than the entire multiplexed image, and the motion vector search for an encoding target block within the N parallax images covers only the portion in which the N parallax images are arranged. With this configuration, different video can be shown to different viewpoints, so that multiple users can view different information at the same time.

The present embodiment has been described on the assumption that no offset vector exists when the encoding target block lies within the planar image. However, an offset vector may also be used for blocks within the planar image, so that motion prediction is performed between the planar image and the stereoscopic image. In that case, the determination unit 202 in the multiplexed image compression unit 205 shown in FIG. 12 and the determination unit 206 in the multiplexed image decompression unit 209 shown in FIG. 13 are unnecessary.

(Third embodiment)
Next, a stereoscopic image transmission system according to a third embodiment of the present invention will be described.

The stereoscopic image transmission system of the second embodiment transmits a planar image together with N parallax images. In the second embodiment, however, the N parallax images and the planar image differ in size and resolution, so their spatial correlation is low, and motion prediction between the planar image and the parallax images does not yield efficient compression. The stereoscopic image transmission system of the present embodiment therefore divides the planar image into N planar partial images of the same size as the parallax images and multiplexes them together with the N parallax images, so that efficient compression is achieved.

The third embodiment of the present invention differs from the first embodiment shown in FIG. 1 only in the following respect. In addition to the first to Nth eye images 101_1 to 101_N, which are the N parallax images, a planar image whose resolution in the column direction is N times that of one of the N parallax images is input, and the stereoscopic image multiplexing unit 104 divides the input planar image into N planar partial images before multiplexing them together with the N parallax images.

In the stereoscopic image multiplexing unit of the present embodiment, the input planar image is divided, by taking every Nth column as shown in FIG. 14, into N planar partial images of the same size as the parallax images. As shown in FIG. 15, the N parallax images and the N planar partial images, 2N images in total, are then multiplexed into one large multiplexed image. The multiplexing method is not limited to the one shown in FIG. 15: the images may be arranged vertically or horizontally, and the order in which the parallax images and planar partial images are arranged need not be as shown in FIG. 15. Furthermore, if the planar image in FIG. 15 is, for example, an image with four times the horizontal resolution of the first eye image, the first eye image and the first planar partial image are exactly the same image, so either of them may be omitted and a dummy image inserted in its place. Dividing the planar image into N planar partial images in this way turns it into N images that have the same size as the N parallax images and a high spatial correlation with them.
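The column-wise division, and the reverse procedure used on the receiving side, can be sketched in Python as below. This is a minimal illustration, assuming that the i-th planar partial image collects columns i, i+N, i+2N, ... of the planar image (FIG. 14 itself is not reproduced here) and that images are plain lists of pixel rows; the function names are hypothetical.

```python
from typing import List

Image = List[List[int]]  # rows of pixel values


def split_planar_image(image: Image, n: int) -> List[Image]:
    """Divide a planar image into n partial images by taking every n-th column."""
    return [[row[i::n] for row in image] for i in range(n)]


def merge_planar_parts(parts: List[Image]) -> Image:
    """Reverse of split_planar_image: re-interleave the columns."""
    n = len(parts)
    height = len(parts[0])
    width = sum(len(part[0]) for part in parts)
    merged = [[0] * width for _ in range(height)]
    for i, part in enumerate(parts):
        for y, row in enumerate(part):
            merged[y][i::n] = row  # put columns back at positions i, i+n, ...
    return merged


planar = [[0, 1, 2, 3, 4, 5, 6, 7],
          [10, 11, 12, 13, 14, 15, 16, 17]]
parts = split_planar_image(planar, 4)
print(parts[0])                             # first planar partial image
print(merge_planar_parts(parts) == planar)  # round trip check
```

Under this assumption, if the first eye image was itself produced by keeping columns 1, N+1, 2N+1, ... of the same source picture, it coincides with the first planar partial image, which is why one of the two can be replaced by a dummy image.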

The multiplexed image produced in this way is compressed and decompressed by the same method as in the first embodiment shown in FIG. 1. The subsequent operation of the present embodiment is therefore described with reference to FIG. 1, which shows the configuration of the first embodiment.

The multiplexed image produced by the stereoscopic image multiplexing unit is compressed by the multiplexed image compression unit 105. The motion vector information obtained in this process is decomposed by the motion vector decomposition unit 106 into an offset vector and a local motion vector, which are inserted into the moving image stream. The compressed stereoscopic image and planar image are transmitted or recorded as a single stream by the transmission/recording unit 107.

In the stereoscopic image decompression device 20, the compressed multiplexed image is received or reproduced as a single stream by the reception/reproduction unit 108 and decompressed by the multiplexed image decompression unit 109. In this process, the motion vector synthesis unit 110 combines the received local motion vector and offset vector into a single motion vector, and the multiplexed image is restored using the combined motion vector information. The stereoscopic image separation unit 111 separates the decompressed multiplexed image into the first to Nth eye images and the first to Nth planar partial images, and the first to Nth planar partial images are reassembled into a single planar image by the reverse of the procedure shown in FIG. 14. For N-eye stereoscopic display, the first to Nth eye images are arranged column by column and displayed on a stereoscopic display; for planar display, the planar image is displayed as it is on a planar display.

In the present embodiment, the N parallax images and the N planar partial images all have the same size and resolution and are highly correlated spatially with one another. The method of encoding the multiplexed image in the multiplexed image compression unit 105, the motion vector decomposition method, the method of encoding the decomposed vectors, the method of decoding the multiplexed image, and the method of decoding the offset vector and the local motion vector are therefore the same as in the first embodiment with N replaced by 2N. The difference is that the offset vector information inserted in the user data section contains, in addition to the arrangement order of the N parallax images in the multiplexed image, data indicating the arrangement order of the N planar partial images.

(Fourth embodiment)
Next, a stereoscopic image transmission system according to a fourth embodiment of the present invention will be described. In the second embodiment described above, a planar image is transmitted together with the N parallax images to allow for the case where the display side has only a planar display. In the fourth embodiment, the original first to Nth eye images are input at their original resolution, without being thinned out in the column direction, so that stereoscopic or planar display is possible whether the display side has a 1-eye, 2-eye, ..., or N-eye display.

In the stereoscopic image multiplexing unit of the present embodiment, these images are arranged spatially and multiplexed into one large image. For example, when N = 4, the first to Nth eye images are placed in the multiplexed image as they are, as shown in FIG. 16. The multiplexing method is not limited to the one shown in FIG. 16: the images may be arranged vertically or horizontally, and the order in which the parallax images are arranged need not be as shown in FIG. 16.

The image multiplexed by the stereoscopic image multiplexing unit is compressed by the multiplexed image compression unit 105. The motion vector information obtained in this process is decomposed by the motion vector decomposition unit 106 into an offset vector and a local motion vector, which are inserted into the moving image stream. The compressed stereoscopic image and planar image are transmitted or recorded as a single stream by the transmission/recording unit 107.

In the stereoscopic image decompression device 20, the compressed stereoscopic image and planar image are received or reproduced as a single stream by the reception/reproduction unit 108 and decompressed by the multiplexed image decompression unit 109. In this process, the motion vector synthesis unit 110 combines the received local motion vector and offset vector into a single motion vector, and the multiplexed image is restored using the combined motion vector information. The stereoscopic image separation unit 111 separates the decompressed multiplexed image into the first to Nth eye images. For N-eye stereoscopic display, the 1st, (N+1)th, (2N+1)th, ... columns of each of the first to Nth eye images are extracted, arranged column by column, and displayed on a stereoscopic display. For planar display, any one of the first to Nth eye images is displayed as it is on a planar display. For k-eye stereoscopic display (2 ≦ k < N), the 1st, (k+1)th, (2k+1)th, ... columns of each of the first to kth eye images are extracted, arranged column by column, and displayed on a stereoscopic display.
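The column selection for k-eye display can be sketched in Python as follows. The sketch assumes that "arranged column by column" means cycling through the views in order (view 1, view 2, ..., view k, then view 1 again); the function name and the list-of-rows image representation are illustrative choices, not part of the described system.

```python
from typing import List

Image = List[List[int]]  # rows of pixel values


def interleave_for_k_eye(views: List[Image], k: int) -> Image:
    """Build a k-eye display image from full-resolution eye images.

    From each of the first k views, keep columns 1, k+1, 2k+1, ...
    (indices 0, k, 2k, ... here), then interleave the kept columns
    one column at a time in view order (assumed ordering).
    """
    selected = [[row[::k] for row in view] for view in views[:k]]
    height = len(selected[0])
    display = []
    for y in range(height):
        row = []
        # zip yields, for each kept column index, one pixel per view
        for pixels in zip(*(view[y] for view in selected)):
            row.extend(pixels)
        display.append(row)
    return display


# Four one-row views; view v holds the values v*10 + column index.
views = [[[v * 10 + c for c in range(4)]] for v in range(4)]
print(interleave_for_k_eye(views, 2))
print(interleave_for_k_eye(views, 4))
```

Setting k = N reproduces the N-eye case, and planar display simply uses one of the views unchanged.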

In the present embodiment, all N parallax images have the same size and resolution. The method of encoding the multiplexed image in the multiplexed image compression unit 105, the motion vector decomposition method, the method of encoding the decomposed vectors, the method of decoding the multiplexed image, and the method of decoding the offset vector and the local motion vector can therefore be the same as in the first embodiment. The difference is that, in addition to the flags inserted in the first embodiment, a flag indicating that all N parallax images are high resolution (not thinned out from the original) is inserted in the user data section.

(Fifth embodiment)
Next, a fifth embodiment of the present invention will be described. The first to fourth embodiments assume that the display side has an N-eye stereoscopic display or a planar display. The present embodiment makes display possible even when the display side has a k-eye display, where k is a divisor of N.

In the present embodiment, in the stereoscopic image transmission system of the first embodiment shown in FIG. 1, let k be any divisor of an integer N of 3 or more; for every such k, all the parallax images needed for display on a k-eye display are input. For example, when N = 6, the divisors are 1, 2, 3, and 6, so the following are input to the stereoscopic image multiplexing unit: a planar image for a 1-eye (planar) display, two parallax images for a 2-eye stereoscopic display, three parallax images for a 3-eye display, and six parallax images for a 6-eye display. The following description uses the case N = 6.

FIG. 17 shows the sizes of the parallax images input to the stereoscopic image multiplexing unit and the relationship between their viewpoints. In the stereoscopic image multiplexing unit, these parallax images are spatially multiplexed into one large image. All parallax images other than the six parallax images for the 6-eye display are each divided, as shown in FIGS. 18 to 20, into a plurality of parallax partial images of the same size as the sixth eye image, and, as shown in FIG. 21, all the parallax images and parallax partial images, 24 images in total, are multiplexed. FIG. 18 shows how the image for a planar display is divided into a plurality of partial images, FIG. 19 shows how the images for a 2-eye display are divided into a plurality of partial images, and FIG. 20 shows how the images for a 3-eye display are divided into a plurality of partial images.

The multiplexing method is not limited to the one shown in FIG. 21: the images may be arranged vertically or horizontally, and the order in which the parallax images and parallax partial images are arranged need not be as shown in the figure. Furthermore, images that are duplicated among the parallax images and parallax partial images may be replaced with dummy images. If there are many dummy images, the multiplexed image itself may be made smaller so that only the necessary images are placed in it. For example, referring to FIGS. 18 to 20, the first partial image of the first viewpoint for 1-eye display, the first partial image of the first viewpoint for 2-eye display, and the first partial image of the first viewpoint for 3-eye display duplicate one another, so it is sufficient to place one of them in the multiplexed image. Dividing groups of parallax images with different resolutions into a plurality of parallax partial images in this way turns all the parallax images into a group of images of the same size with high spatial correlation.
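The total of 24 images for N = 6 can be checked with a short Python calculation. It assumes that each viewpoint image for a k-eye display has N/k times as many columns as a 6-eye image and is therefore split into N/k partial images; this assumption is not stated explicitly above but is consistent with the stated total. The function names are illustrative.

```python
from typing import Dict, List


def divisors(n: int) -> List[int]:
    """All positive divisors of n, in ascending order."""
    return [k for k in range(1, n + 1) if n % k == 0]


def images_per_display(n: int) -> Dict[int, int]:
    """Number of same-size images contributed by each k-eye display.

    A k-eye display needs k viewpoint images; each is assumed to be
    split into n // k partial images of the size of an n-eye image.
    """
    return {k: k * (n // k) for k in divisors(n)}


counts = images_per_display(6)
print(counts)                # images per display type
print(sum(counts.values()))  # total before removing duplicated images
```

Each display type contributes N images under this assumption, so the total before removing duplicates is N times the number of divisors of N.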

In the stereoscopic image decompression device 20, the compressed multiplexed image is received or reproduced as a single stream by the reception/reproduction unit 108 and decompressed by the multiplexed image decompression unit 109. In this process, the motion vector synthesis unit 110 combines the received local motion vector and offset vector into a single motion vector, and the multiplexed image is restored using the combined motion vector information. The stereoscopic image separation unit 111 separates the decompressed multiplexed image into parallax images and parallax partial images, and the parallax partial images are reassembled, by the reverse of the procedures shown in FIGS. 18 to 20, into one planar image, two images for 2-eye display, and three images for 3-eye display. For 6-eye stereoscopic display, the first to sixth viewpoint images for 6-eye display are arranged column by column and displayed on a 6-eye stereoscopic display. For 3-eye stereoscopic display, the first to third viewpoint images for 3-eye display are arranged column by column and displayed on a 3-eye stereoscopic display. For 2-eye stereoscopic display, the first and second viewpoint images for 2-eye display are arranged column by column and displayed on a 2-eye stereoscopic display. For planar display, the first viewpoint image for 1-eye (planar) display is displayed as it is on a planar display.

In the present embodiment, let M be the total number of parallax images and parallax partial images multiplexed into the multiplexed image. The M multiplexed images all have the same size and resolution and are highly correlated spatially with one another. The method of encoding the multiplexed image in the multiplexed image compression unit, the motion vector decomposition method, the method of encoding the decomposed vectors, the method of decoding the multiplexed image, and the method of decoding the offset vector and the local motion vector can therefore be the same as in the first embodiment with N replaced by M. The difference is that the offset vector information inserted in the user data section indicates the arrangement order of the M parallax images and parallax partial images in the multiplexed image rather than the arrangement order of N parallax images.

In the first to fifth embodiments described above, the compression side inserts into the moving image stream the information needed to decode the offset vector: a flag indicating where in the stream the offset vector is located, a flag indicating the encoding format of the offset vector, and a flag indicating the arrangement order, within the multiplexed image, of the parallax images, of the parallax images and the planar image, or of the parallax images and the parallax partial images. As one example, the format of the offset vector information inserted in a field into which the user can place arbitrary data, such as the user data section, is described below with reference to FIGS. 22 to 26. A specific method for inserting the offset vector into an MPEG stream is described.

As shown in FIG. 22, an MPEG stream has a hierarchical structure consisting of the following layers: a sequence layer, which stores information shared by the entire stream, such as image size, aspect ratio, and frame rate; a GOP (Group Of Pictures) layer, which groups a plurality of pictures; a picture layer, which represents a frame or field that can be handled as a single still image; a slice layer, which is a region of macroblocks joined in a horizontal band; a macroblock layer, which is a region of 16 pixels × 16 lines and the unit of motion compensation; and a block layer, which is a region of 8 pixels × 8 pixels and the unit of the DCT transform. Because offset vector information is not part of the MPEG standard, conformance with the standard can be maintained by inserting the offset vector information into the user data section, which can store arbitrary data. Under the MPEG standard, the positions at which a user data section can be inserted are limited, for example to before or after the sequence header or the picture header. To insert one frame's worth of offset vector information by the method described below, a user data section can be provided before or after each picture header and the information stored there.

As shown in FIG. 23, a flag 31 indicating the presence or absence of offset vector information is placed at the head of the offset vector information. When the flag 31 is "1", it is followed by multiplexed image information (muxed image information) 32, which indicates the arrangement order of the multiplexed image, and offset vector information (offset vector information) 33, which indicates the location of the offset vectors in the stream and their encoding format. If the offset vectors are located in the user data, the encoded offset vector data (offset vector data) 34 follows.

The section holding the multiplexed image information 32 stores data indicating the arrangement order of the parallax images, planar image, and parallax partial images in the multiplexed image. As shown in FIG. 24, the multiplexed image information 32 begins with a multiplexed image structure flag (muxed image structure) 41 indicating what kinds of image make up the multiplexed image. These bits may be assigned, for example, as follows.

00: The multiplexed image consists of parallax images only.
01: The multiplexed image consists of parallax images and a planar image.
10: The multiplexed image consists of high-resolution (not decimated from the original) first to N-th eye images.
11: The multiplexed image consists of the image for an N-eye display and, for every k, all the parallax images needed for display on a k-eye display.

When the flag is "01" (parallax images plus planar image), a further flag indicating the layout format of the planar image is inserted.

0: The planar image is placed as is, without being divided.
1: The planar image is divided into a plurality of planar partial images and placed.

Next, the N number 42, the concrete value of N indicating the maximum number of viewpoints of the stereoscopic image, is stored. This field may be fixed length or variable length. It is followed by the multiplexed image arrangement (muxed image arrangement) subsection 43, which indicates the arrangement of the multiplexed image.

As shown in FIG. 24, the multiplexed image arrangement subsection 43 begins with the horizontal partial image count (muxed partition image number width (mW)) 44, which is the number of parallax images (or planar partial images or parallax partial images) across the multiplexed image, and the vertical partial image count (muxed partition image number height (mH)) 45, which is the number down the multiplexed image. However, when the multiplexed image consists of parallax images and a planar image and the planar image is placed without being divided, the parallax images can only be lined up in a single horizontal row, so a 2-bit flag indicating the positional relationship between the planar image and the parallax images is inserted in place of these two fields.

00: The planar image is placed above the parallax images.
01: The planar image is placed below the parallax images.
10: The planar image is placed to the right of the parallax images.
11: The planar image is placed to the left of the parallax images.

In this case, mW and mH are set to N and 1, respectively. Next, arrangement data (muxed img)[y][x] 46 is stored in order from the top-left image block of the multiplexed image toward the bottom-right block. The values stored in the arrangement data (muxed img)[y][x] 46 may be determined, for example, as follows. If the total number of parallax images (or parallax images and planar partial images, or parallax images and parallax partial images) placed in the multiplexed image is M, each entry is a fixed-length code with the minimum number of bits needed to represent M + 1 kinds of image (the M images plus a dummy image), that is, ⌈log₂(M + 1)⌉ bits. For example, when N = 6 and parallax images and planar partial images are placed in the multiplexed image, M = 12. Since 12 + 1 values can be represented in 4 bits, the codes may be assigned to the parallax images and planar partial images as follows.

0000: Dummy image
0001: First eye image
0010: Second eye image
0011: Third eye image
0100: Fourth eye image
0101: Fifth eye image
0110: Sixth eye image
0111: First planar partial image
1000: Second planar partial image
1001: Third planar partial image
1010: Fourth planar partial image
1011: Fifth planar partial image
1100: Sixth planar partial image
1101 to 1111: Reserved

As an example, suppose the multiplexed image consists of 3 × 5 = 15 image blocks, with the parallax images and planar partial images placed as shown in FIG. 25. The fields arrangement data [0][0] 46 to arrangement data [2][4] 46 of FIG. 24 then hold the following codes.

0100 1001 0010 0000 1011 1100 0110 0101 0111 0000 0000 0001 1000 1010 0011
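The arrangement data of the 3 × 5 example can be decoded with a few lines of Python. This is a sketch under stated assumptions: the names `code_width`, `build_labels`, and `decode_arrangement` are hypothetical, the code width is taken as the minimum number of bits for M + 1 values (which gives the 4 bits used in the N = 6 example), and [y][x] is read as row then column, so the 15 codes form 3 rows of 5 blocks.

```python
def code_width(m: int) -> int:
    """Bits needed for M image kinds plus one dummy code (M + 1 values)."""
    return m.bit_length()  # equals ceil(log2(m + 1)) for m >= 1


def build_labels(n_views: int, n_planar_parts: int) -> list[str]:
    """Code 0 is the dummy image, then eye images, then planar partial images."""
    labels = ["dummy"]
    labels += [f"eye{i}" for i in range(1, n_views + 1)]
    labels += [f"planar{i}" for i in range(1, n_planar_parts + 1)]
    return labels


def decode_arrangement(bits: str, m_w: int, m_h: int,
                       labels: list[str]) -> list[list[str]]:
    """Turn the fixed-length codes into an m_h x m_w grid of labels."""
    width = code_width(len(labels) - 1)
    codes = [int(bits[i:i + width], 2) for i in range(0, len(bits), width)]
    if len(codes) != m_w * m_h:
        raise ValueError("code count does not match mW x mH")
    # Reserved codes (beyond the label table) raise IndexError here.
    return [[labels[c] for c in codes[r * m_w:(r + 1) * m_w]]
            for r in range(m_h)]


EXAMPLE_BITS = ("0100 1001 0010 0000 1011 1100 0110 0101 "
                "0111 0000 0000 0001 1000 1010 0011").replace(" ", "")
grid = decode_arrangement(EXAMPLE_BITS, m_w=5, m_h=3,
                          labels=build_labels(6, 6))
```

With these assumptions the top row of `grid` reads fourth eye image, third planar partial image, second eye image, dummy image, fifth planar partial image.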

As shown in FIG. 26, the offset vector information 33 section stores data indicating the encoding format of the offset vectors and where they are inserted in the stream. It begins with an offset vector encoding format flag (offset vector format) 47 indicating the encoding format. This flag has the following meanings.

00: The codes representing the offset vectors are fixed-length coded (constant length coding: CLC) and not run-length coded.
01: The codes representing the offset vectors are variable-length coded (variable length coding: VLC) and not run-length coded.
10: The codes representing the offset vectors are fixed-length coded (CLC) and run-length coded.
11: The codes representing the offset vectors are variable-length coded (VLC) and run-length coded.

Next, an offset vector storage location flag (offset vector location) 48 indicating where in the stream the offset vector codes are located is inserted. This flag has, for example, the following meanings.

00: The offset vector codes are inserted in the user data section.
01: The offset vector codes are inserted in the macroblock header.
10: The offset vector codes are inserted immediately after the motion vector codes.
11: The offset vector codes are inserted immediately after the motion vector codes.

When this flag is other than "00" (that is, the codes are outside the user data section), run-length coding cannot be used, so the upper bit of the offset vector encoding format flag 47 is forced to "0". When the offset vector encoding format flag 47 indicates run-length coding ("10" or "11"), additional information about the run-length coding is inserted. The offset vector length encoding format flag (offset vector length format) 49 indicates the format of the code representing the number of consecutive offset vectors, and has the following meanings.

0: The code representing the number of consecutive offset vectors is fixed-length coded (CLC).
1: The code representing the number of consecutive offset vectors is variable-length coded (VLC).

Next, as further run-length information, an offset vector run-length period flag (offset vector RL separate period) 50 is inserted, indicating the unit (period) at which the run-length codes of the offset vectors are delimited. The flag has the following meanings.

00: The run-length codes of the offset vectors are delimited per picture.
01: The run-length codes of the offset vectors are delimited per slice.
10: The run-length codes of the offset vectors are delimited per block image (parallax image or).
11: Reserved

Next, the offset vector data 34 section stores the encoded offset vector data, as shown in FIG. 27. The storage format in this section depends on the value of the offset vector encoding format flag 47 in the offset vector information 33 section.
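The flag fields of the offset vector information 33 described above can be read with a small parser. The sketch below is illustrative only: the field widths (2, 2, 1, and 2 bits) are inferred from the code values listed in the text, back-to-back packing of the fields is an assumption, and the names `OffsetVectorInfo` and `parse_offset_vector_info` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Meanings of the 2-bit location flag 48 and period flag 50, as listed above.
LOCATIONS = {0b00: "user_data", 0b01: "macroblock_header",
             0b10: "after_motion_vector", 0b11: "after_motion_vector"}
PERIODS = {0b00: "picture", 0b01: "slice", 0b10: "block_image",
           0b11: "reserved"}


@dataclass
class OffsetVectorInfo:
    variable_length: bool                 # low bit of format flag 47
    run_length: bool                      # high bit of format flag 47
    location: str                         # location flag 48
    length_is_vlc: Optional[bool] = None  # flag 49, only with run-length
    rl_period: Optional[str] = None       # flag 50, only with run-length


def parse_offset_vector_info(bits: str) -> tuple[OffsetVectorInfo, int]:
    """Parse the flags from a '0'/'1' string; return (info, bits consumed)."""
    fmt = int(bits[0:2], 2)
    loc = int(bits[2:4], 2)
    info = OffsetVectorInfo(variable_length=bool(fmt & 0b01),
                            run_length=bool(fmt & 0b10),
                            location=LOCATIONS[loc])
    pos = 4
    if info.run_length:
        if loc != 0b00:
            raise ValueError("run-length coding requires the user data section")
        info.length_is_vlc = bits[pos] == "1"
        info.rl_period = PERIODS[int(bits[pos + 1:pos + 3], 2)]
        pos += 3
    return info, pos
```

For instance, the bit string "0100" would parse as variable-length coded, not run-length coded, and stored in the user data section.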

FIG. 27(a) shows the structure of the offset vector data 34 when it is not run-length coded, and FIG. 27(b) shows its structure when it is run-length coded.

When run-length coding is not used, fixed-length or variable-length encoded data 51 representing the offset vectors is lined up as shown in FIG. 27(a). The data is ordered from the top-left macroblock toward the bottom-right macroblock, as shown in FIG. 28. The meaning of the codes and their bit counts follow the method of the first embodiment and FIGS. 6 and 7. When run-length coding is used, the offset vector value (run) 54 and the number of times it repeats (length) 55 are stored as shown in FIG. 27(b), each as either a fixed-length or a variable-length code. The data order again follows FIG. 28. A run-length end code (run-length end code) 56 marking the end is stored at the end of the run-length codes.
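The run-length form of FIG. 27(b) pairs each offset vector value with its repeat count and closes with an end code. A minimal sketch at the symbol level follows; the `END` marker and the function names are hypothetical, and the actual bit patterns of the value, count, and end code are not modeled.

```python
from itertools import groupby

END = "END"  # hypothetical stand-in for the run-length end code 56


def rle_encode(offsets):
    """Collapse repeats into (value, count) pairs followed by the end marker.

    In the text's terms, 'value' is the run 54 and 'count' is the length 55.
    """
    pairs = [(value, len(list(group))) for value, group in groupby(offsets)]
    return pairs + [END]


def rle_decode(stream):
    """Expand (value, count) pairs until the end marker is reached."""
    out = []
    for item in stream:
        if item == END:
            break
        value, count = item
        out.extend([value] * count)
    return out
```

Because neighboring macroblocks often reference the same image, long runs of identical offset vectors are plausible, which is the case run-length coding is suited to.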

When the offset vector codes are located somewhere other than the user data, for example in the macroblock header or before or after the motion vector code, run-length coding is not performed, and the offset vector code for each macroblock is inserted as a fixed-length or variable-length code.

In the first to fifth embodiments described so far, existing encoders and decoders for planar video that conform to a video compression standard such as MPEG can be used almost unchanged as the multiplexed image compression units 10 and 30 and the multiplexed image decompression units 20 and 40. In that case, the stream produced by the multiplexed image compression units 10 and 30 is an MPEG-compliant stream. The codes representing the offset vectors are inserted into the user data section, the header section, or both, while the flag indicating where the offset vectors are located, the flag indicating their encoding format, and the flag indicating the arrangement order of the images in the multiplexed image (parallax images, parallax images and a planar image, or parallax images and parallax partial images) are inserted into the user data section. In the multiplexed image decompression units 20 and 40, the video stream is decoded according to the MPEG standard and the offset vectors are decoded by the methods shown in the first to fifth embodiments. Because existing video encoders and decoders can be used almost unchanged, stereoscopic images can be transmitted efficiently and at low cost.

Although not shown in the figures, the stereoscopic image compression devices 10 and 30 and the stereoscopic image decompression devices 20 and 40 of the first to fifth embodiments each include a recording medium on which is recorded a program for executing the stereoscopic image compression method and stereoscopic image decompression method described above. The recording medium may be a magnetic disk, a semiconductor memory, or another recording medium. The program is read from the recording medium into the stereoscopic image compression devices 10 and 30 and the stereoscopic image decompression devices 20 and 40 and controls their operation. Specifically, the CPU in each device, under control of the program, instructs the device's hardware resources to perform specific processing, and the processing described above is thereby realized.

The first to fifth embodiments were described using a stereoscopic image transmission system consisting of a stereoscopic image compression device and a stereoscopic image decompression device. The present invention is not limited to the case where the transmitted images are stereoscopic images, however; it applies equally to transmitting multi-view images obtained by photographing a given object from a plurality of viewpoints. Because stereoscopic images are included among multi-view images, the stereoscopic image compression device and the stereoscopic image decompression device correspond to a multi-view image compression device and a multi-view image decompression device, and the stereoscopic image transmission system corresponds to a multi-view image transmission system. Likewise, the stereoscopic image compression method and the stereoscopic image decompression method correspond to a multi-view image compression method and a multi-view image decompression method, and the stereoscopic image transmission method corresponds to a multi-view image transmission method.

Next, an example of the present invention is described in detail with reference to the drawings.
The simplest stereoscopic image is one in which the multi-view images consist of two parallax images, a right-eye image and a left-eye image. If an additional image that doubles the horizontal resolution of either the right-eye image or the left-eye image is used together with these two images, display is also possible when the display side is a planar display.

As a concrete example of the present invention, the case of compressing and transmitting a right-eye image, a left-eye image, and an additional image for raising the left-eye image to high resolution is described. This example is equivalent to the third embodiment described above with N = 2: the right-eye image corresponds to the first eye image, the left-eye image to the second eye image, and the additional image to the first planar partial image.

In the following description, for convenience, the right-eye image, left-eye image, and additional image are each 176 pixels × 288 lines. For stereoscopic display, the right-eye image and left-eye image are interleaved pixel by pixel and viewed as a stereoscopic image of 352 pixels × 288 lines. For planar display, the left-eye image and the additional image are interleaved pixel by pixel and viewed as a planar image of 352 pixels × 288 lines.
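The pixel-by-pixel interleaving used for display can be sketched as follows, with images represented as lists of rows. The function name `interleave_columns` is hypothetical, and which image supplies the even-numbered columns is an assumption, since the text does not state it.

```python
def interleave_columns(first, second):
    """Alternate the pixels of two equally sized images along each row.

    Two W x H images yield one 2W x H image; 'first' supplies columns
    0, 2, 4, ... (an assumption about the display's pixel order).
    """
    if len(first) != len(second):
        raise ValueError("images must have the same number of lines")
    out = []
    for row_a, row_b in zip(first, second):
        if len(row_a) != len(row_b):
            raise ValueError("images must have the same width")
        merged = []
        for a, b in zip(row_a, row_b):
            merged.extend((a, b))
        out.append(merged)
    return out
```

Applied to two 176 × 288 images, this yields the 352 × 288 image described above, whether the pair is right-eye and left-eye (stereoscopic display) or left-eye and additional (planar display).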

The right-eye image, left-eye image, and additional image are input to the stereoscopic image multiplexing unit 104 of FIG. 1 and multiplexed into one large image as shown in FIG. 29. The images may be arranged either vertically or horizontally; here, for convenience, they are arranged horizontally. That is, they are multiplexed into a multiplexed image of 528 pixels × 288 lines.
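Horizontal multiplexing, and the separation later performed on the receiving side, amount to concatenating and slicing rows. The following is a minimal sketch with hypothetical function names, assuming all images share the same size and are represented as lists of rows.

```python
def multiplex_horizontally(images):
    """Place equally sized images side by side, left to right."""
    rows = len(images[0])
    return [[px for img in images for px in img[r]] for r in range(rows)]


def demultiplex(muxed, count):
    """Split a side-by-side image back into 'count' equal-width images."""
    width = len(muxed[0]) // count
    return [[row[i * width:(i + 1) * width] for row in muxed]
            for i in range(count)]
```

Three 176 × 288 inputs give a 528 × 288 multiplexed image, matching the size stated above, and `demultiplex(muxed, 3)` recovers the three inputs.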

Next, the operation of the multiplexed image compression unit 105 shown in FIG. 4 is described. First, the motion vector detection unit 304 in the multiplexed image compression unit 105 refers to past or future images stored in the prediction memory 303, performs block matching in macroblock units, and detects motion vectors. As FIG. 30 shows, for one block to be encoded there are three similar blocks in the reference image. Therefore, the point of minimum prediction error (for example, the sum of squared differences between the block to be encoded and the reference block) is first found in the neighborhood of each of the three similar blocks, and the one of the three with the smallest prediction error is then selected. The motion vector obtained in this way is decomposed by the motion vector decomposition unit 106 into a local motion vector and an offset vector. The local motion vector is encoded according to the motion vector coding method of the existing video coding standard. As for the offset vector, if for example the block to be encoded is in the left-eye image and the selected block is in the right-eye image, the offset vector table of FIG. 31 gives the code "10". Motion vector detection and motion vector decomposition are performed in this way for every macroblock in the frame to be encoded. The remaining operation of the multiplexed image compression unit 105 is as described in the first embodiment and elsewhere, and is omitted.

The offset vectors obtained above and the various information about them are inserted into the user data section. The method of inserting the information about the offset vectors is described below on the basis of the example above. Here it is assumed that all offset vector information is inserted into the user data section, that the offset vector codes are variable-length coded as in FIG. 31, and that run-length coding is not performed. The arrangement of the parallax images in the multiplexed image is assumed to be as shown in FIG. 30.

First, the multiplexed image structure flag of the multiplexed image information section is "01", because in this example the multiplexed image consists of parallax images and a planar image. Immediately after it, the code "1" is inserted as the planar image layout flag, indicating that the planar image is divided into a plurality of planar partial images and placed. Next, the N number holds 2, the maximum number of viewpoints. If this field is an 8-bit fixed-length field, its value is "00000010".

Next, the horizontal partial image count 44 and the vertical partial image count 45 at the head of the multiplexed image arrangement subsection 43 are set to 3 and 1, respectively. If these are also 8-bit fixed-length fields, their values are "00000011" and "00000001". Next, the multiplexed image arrangement data 46 stores data indicating which image is placed in each image block of the multiplexed image. Since there are three images in total, 2 bits suffice to distinguish them. The codes for the multiplexed image are then as follows.

00: Reserved
01: Right-eye image
10: Left-eye image
11: Additional image for the left eye

Based on the arrangement of FIG. 30, the following is therefore inserted into the multiplexed image arrangement data 46.

01 10 11

Next, the offset vector encoding format flag 47 indicating the encoding format is inserted at the head of the offset vector information 33. Because the offset vectors are variable-length coded and not run-length coded, "01" is inserted into the offset vector encoding format flag 47. Then "00", indicating that all the offset vector information is stored in the user data, is inserted into the offset vector storage location flag 48, which indicates where the offset vector information is stored. Because run-length coding is not used, the offset vector length encoding format flag 49 and the offset vector run-length period flag 50, which carry run-length information, are not inserted and are skipped. The offset vector data 34 section that follows holds the offset vector codes of all macroblocks obtained by the motion vector decomposition unit 106.
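Putting the field values of this example together gives a short bit string. The sketch below assembles only the fields named in the walkthrough, assumes they are packed back to back in the order described, and omits the presence flag 31, whose width the text does not give. The helper names `field` and `build_example_header` are hypothetical.

```python
def field(value: int, width: int) -> str:
    """Format an unsigned value as a fixed-width binary string."""
    return format(value, f"0{width}b")


def build_example_header() -> str:
    """Concatenate the header fields of the N = 2 example."""
    parts = [
        field(0b01, 2),  # structure flag 41: parallax images + planar image
        field(1, 1),     # planar layout flag: planar image is divided
        field(2, 8),     # N number 42 (8-bit fixed length, as in the text)
        field(3, 8),     # horizontal partial image count 44 (mW)
        field(1, 8),     # vertical partial image count 45 (mH)
        # arrangement data 46: right-eye, left-eye, additional image
        "".join(field(code, 2) for code in (0b01, 0b10, 0b11)),
        field(0b01, 2),  # format flag 47: VLC, no run-length coding
        field(0b00, 2),  # location flag 48: user data section
    ]
    return "".join(parts)
```

Under these assumptions the header occupies 37 bits, after which the offset vector codes of the offset vector data 34 section would follow.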

As an example, if the offset vectors of the first 15 macroblocks in the first row of the multiplexed image are obtained as shown in FIG. 32, the following codes are inserted into the offset vector data 34 section, in this order, according to the table of FIG. 31.

0 0 10 11 0 11 0 0 0 10 10 0 11 11 10

Note that the right-eye image, left-eye image, and additional image are each 176 pixels wide, so each holds 176 / 16 = 11 macroblocks horizontally, and the image being encoded switches from the right-eye image to the left-eye image at the 12th macroblock from the left. Consequently, the codes for the 11th and 12th macroblocks differ even if their offset vectors refer to the same left-eye image. Offset vector data is inserted in the same way for the second and subsequent rows.
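The example bit string can be split back into per-macroblock codes without separators if the code words form a prefix code. The sketch below assumes the code set is "0", "10", and "11", which is inferred from the example itself; the actual table of FIG. 31 is not reproduced here, and the function name `split_codes` is hypothetical.

```python
# Example offset vector data for the first 15 macroblocks, without spaces.
EXAMPLE = "00101101100010100111110"


def split_codes(bits: str) -> list[str]:
    """Split a bit string into code words from the assumed set 0, 10, 11."""
    out, i = [], 0
    while i < len(bits):
        if bits[i] == "0":
            out.append("0")
            i += 1
        else:
            if i + 1 >= len(bits):
                raise ValueError("truncated code word")
            out.append(bits[i:i + 2])
            i += 2
    return out


codes = split_codes(EXAMPLE)
macroblocks_per_image = 176 // 16  # 11 macroblocks across each sub-image
```

Mapping each code word to a referenced image additionally requires knowing which image the current macroblock lies in, which is why the macroblock count per sub-image matters.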

In the multiplexed image decompression unit 109, the single video stream input to the unit is first separated by the separation unit 803 into difference image data 804, offset vector information 805, and motion vector information 806.

Next, the motion vector synthesis unit 110 synthesizes a single motion vector 811 from the offset vector information and the local motion vector information. To do so, the offset vector and the local motion vector must first be decoded.

For the local motion vector, the motion vector code in the video stream is detected and taken as the local motion vector information. The local motion vector is decoded according to the motion vector decoding method defined by the video standard in use.

Next, decoding the offset vectors requires information such as the arrangement order of the multiplexed image and the encoding format of the offset vectors, so the information about the offset vectors inserted in the user data section is decoded first, and then the offset vector codes themselves. In this example, the information decoded from the multiplexed image information 32 shows that the number of viewpoints is 2 and that the multiplexed image consists of three images (a left-eye image, a right-eye image, and an additional image for the left eye), placed from the left of the multiplexed image in the order right-eye image, left-eye image, additional image. The information decoded from the offset vector information 33 shows that the offset vectors are variable-length coded and not run-length coded. With this information, the offset vector codes of all macroblocks in the frame can be decoded unambiguously, and the original motion vector can be restored by combining the offset vector with the local motion vector.
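The relationship between decomposition on the compression side and synthesis on the decompression side can be illustrated for the horizontal arrangement of this example. The exact decomposition rule is defined in the first embodiment and is not reproduced here, so the sketch assumes that the offset vector is a whole number of sub-image widths (176 pixels) and that the local motion vector is the small remainder. The names `decompose` and `synthesize` are hypothetical.

```python
SUB_WIDTH = 176  # width of each sub-image in the example


def decompose(mv_x: int, sub_width: int = SUB_WIDTH) -> tuple[int, int]:
    """Split a horizontal motion vector into (offset_index, local_mv_x).

    offset_index counts whole sub-images to the right (+) or left (-);
    it is chosen as the nearest multiple so the local part stays small
    (an assumption, not the rule of the first embodiment).
    """
    offset_index = (2 * mv_x + sub_width) // (2 * sub_width)
    return offset_index, mv_x - offset_index * sub_width


def synthesize(offset_index: int, local_mv_x: int,
               sub_width: int = SUB_WIDTH) -> int:
    """Recombine the offset and local parts into the original vector."""
    return offset_index * sub_width + local_mv_x
```

A block in the left-eye image that matches a block 178 pixels to its left, in the right-eye image, would decompose into an offset of one sub-image to the left and a local displacement of 2 pixels to the left. The small local part is what keeps the amount of motion vector code low.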

Using the synthesized motion vector information, the motion compensation unit 812 forms a predicted image from the past or future reference images stored in the prediction memory 813. The multiplexed image is then restored by adding the predicted image and the difference image. The remaining operation of the multiplexed image decompression unit 109 is as described in the first embodiment and elsewhere, and is omitted.

The decompressed multiplexed image is then separated by the multiplexed image separation unit 111 into the right-eye image, the left-eye image, and the additional image for the left eye, and output.

The method described above exploits the spatial correlation among the right-eye image, left-eye image, and additional image while suppressing growth in the amount of motion vector code, so that stereoscopic images and high-resolution planar images can be compressed and transmitted efficiently.

Finally, the graph of FIG. 33 shows the result of a simulation comparing the motion vector code amount when a stereoscopic image is compressed by the prior-art spatial arrangement method with the code amount when it is compressed by the offset space arrangement method of the present invention. The simulation was performed under the following conditions.
(1) Video sequence used
Sequence name: Stereoscopic video standard chart No.9: Amusement Park
Image size: The original right-eye and left-eye images are both HDTV size (1920 pixels × 1035 lines), but resampling and trimming were applied so that the right-eye image, the left-eye image, and the additional image are each 160 pixels × 240.
Number of frames: 450 frames (30 [frames/second] × 15 [seconds])
(2) Encoding conditions
Encoder used: MPEG-2 TM-5 (Test Model 5) encoder
(A test model is a codec, that is, an encoder and decoder set, used as a standard in the video industry as a baseline for performance comparison when developing encoders and decoders.)
Image degradation: The bit rate was not fixed; instead, the DCT coefficient quantization value (roughly equivalent to the degree of image degradation) was held constant. Because the encoding conditions essentially affect only the DCT coefficient code amount, whereas the motion vector code amount depends on the complexity of the sequence (such as the amount of motion), differences in the encoding conditions cause almost no difference in the motion vector code amount.
Motion vector detection algorithm: simple block matching (search for the point of minimum sum of absolute differences)
(3) Stereoscopic image compression algorithm
Two types: the offset space arrangement method (this embodiment) and the spatial arrangement method (prior art)
Offset space arrangement method: a method in which the motion vector is decomposed into a local motion vector and an offset vector according to this embodiment.
Spatial arrangement method: the same as the offset space arrangement method except that the motion vector is not decomposed.
(4) Offset vector coding conditions
Coding table: the table of FIG. 31 is used.
Run-length encoding: not used.
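The motion vector detection listed in the conditions above, simple block matching by searching for the minimum sum of absolute differences (SAD), can be sketched as follows. This is an illustrative full search over integer-pixel displacements, not the TM-5 implementation; the block size, search range, and function names are assumptions.

```python
def sad(cur, ref, cx, cy, rx, ry, n):
    """Sum of absolute differences between the n x n block of cur at
    (cx, cy) and the n x n block of ref at (rx, ry)."""
    return sum(
        abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
        for j in range(n)
        for i in range(n)
    )


def block_match(cur, ref, cx, cy, n=16, search=8):
    """Full search: return ((dx, dy), cost) with the smallest SAD.

    Candidates that would read outside the reference frame are skipped.
    Ties keep the first candidate found in scan order.
    """
    height, width = len(ref), len(ref[0])
    best_cost, best_mv = None, None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, cx, cy, rx, ry, n)
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (dx, dy)
    return best_mv, best_cost
```

In the spatial arrangement setting, `ref` would be a whole multiplexed frame, so the best match for a block in one image may lie in a neighbouring image, which is what produces the long motion vectors that the decomposition is meant to shorten.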

The simulation result of FIG. 33, obtained under the above conditions, shows the motion vector code amount generated in each of frames 1 to 450 when encoding is performed by the offset space arrangement method and by the spatial arrangement method. In FIG. 33, the solid line indicates the motion vector code amount V of the spatial arrangement method (for which the offset vector code amount is zero), and the dash-dot line indicates the motion vector code amount of the offset space arrangement method, that is, the local motion vector code amount (V_l) plus the offset vector code amount (V_o). The broken line indicates only the local motion vector code amount (V_l) of the offset space arrangement method. The gap between the dash-dot line and the broken line therefore corresponds to the offset vector code amount (V_o). The graph confirms that motion vector decomposition effectively reduces the motion vector code amount (by up to 19%). Although decomposition adds the offset vector code amount as overhead, the reduction obtained by shortening the (local) motion vectors in the offset space arrangement method is large (compare the solid line with the dash-dot line), so that even in total the code amount never exceeded that of the spatial arrangement method. The offset vector code amount may be reduced further by run-length encoding or the like.
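The relationship between the three curves can be written as a short calculation. The per-frame numbers used here are made up purely for illustration and are not read from FIG. 33; the function name is hypothetical.

```python
def reduction_ratio(v_spatial, v_local, v_offset):
    """Relative saving of the offset space arrangement method.

    v_spatial: motion vector code amount V of the spatial arrangement method
    v_local:   local motion vector code amount V_l
    v_offset:  offset vector code amount V_o (the overhead)
    """
    return (v_spatial - (v_local + v_offset)) / v_spatial


# Hypothetical frame: V = 1200 bits, V_l = 900 bits, V_o = 150 bits.
saving = reduction_ratio(1200, 900, 150)  # 150 / 1200 = 0.125
```

A positive value means that the shortening of the local motion vectors outweighs the offset vector overhead for that frame.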

A block diagram showing the configuration of the stereoscopic image transmission system of the first embodiment of the present invention.
A diagram showing an example of spatial multiplexing of parallax images in the first embodiment of the present invention.
A diagram for explaining a macroblock.
A block diagram showing the configuration of the multiplexed image compression unit 105 in the first embodiment of the present invention.
A diagram showing motion vector decomposition in the first embodiment of the present invention.
A diagram showing an example of a fixed-length code table referred to when encoding an offset vector.
A diagram showing an example of a variable-length code table referred to when encoding an offset vector.
A diagram showing an example of a motion vector sequence for explaining run-length encoding.
A block diagram showing the configuration of the multiplexed image decompression unit 109 in the first embodiment of the present invention.
A block diagram showing the configuration of the stereoscopic image transmission system of the second embodiment of the present invention.
A diagram showing an example of spatial multiplexing of parallax images and a planar image in the second embodiment of the present invention.
A block diagram showing the configuration of the multiplexed image compression unit 205 in the second embodiment of the present invention.
A block diagram showing the configuration of the multiplexed image decompression unit 209 in the second embodiment of the present invention.
A diagram showing a method of dividing a planar image into a plurality of planar partial images in the third embodiment of the present invention.
A diagram showing an example of spatial multiplexing of parallax images and planar partial images in the third embodiment of the present invention.
A diagram showing an example of spatial multiplexing of parallax images in the fourth embodiment of the present invention.
A diagram showing the relationship between the sizes and viewpoints of the input parallax images in the fifth embodiment of the present invention.
A diagram showing a method of dividing an image for a flat display into a plurality of partial images in the fifth embodiment of the present invention.
A diagram showing a method of dividing an image for a binocular display into a plurality of partial images in the fifth embodiment of the present invention.
A diagram showing a method of dividing an image for a trinocular display into a plurality of partial images in the fifth embodiment of the present invention.
A diagram showing an example of spatial multiplexing of parallax images and parallax partial images in the fifth embodiment of the present invention.
A diagram for explaining how an offset vector is inserted into an MPEG stream.
A diagram showing an example of insertion of offset vector information in the user data part.
A diagram showing an example of the data arrangement in the multiplexed image information 32 in FIG. 23.
A diagram showing an example of the arrangement of parallax images, planar partial images, and a dummy image in a multiplexed image, for explaining the codes inserted in the arrangement data 46 field of FIG. 23.
A diagram showing an example of the data arrangement in the offset vector information 33 section in FIG. 23.
A diagram showing an example of the data arrangement in the offset vector data 34 section in FIG. 23.
A diagram showing the scanning direction of offset vectors within a multiplexed image.
A diagram showing how the right-eye image, the left-eye image, and the additional image are multiplexed in one example of the present invention.
A diagram for explaining the motion vector detection method.
A diagram showing the offset vector table in one example of the present invention.
A diagram showing an example of offset vectors in one example of the present invention.
A graph showing the result of a simulation comparing the motion vector code amount when a stereoscopic image is compressed by the prior-art spatial arrangement method with the code amount when it is compressed by the offset space arrangement method of the present invention.
A block diagram showing the configuration of a conventional stereoscopic image compression apparatus.
A block diagram showing the configuration of another conventional stereoscopic image compression apparatus.

Explanation of symbols

10 Stereoscopic image compression apparatus
20 Stereoscopic image decompression apparatus
30 Stereoscopic image compression apparatus
31 Flag
32 Multiplexed image information
33 Offset vector information
34 Offset vector data
40 Stereoscopic image decompression apparatus
41 Multiplexed image configuration flag
42 Number N
43 Multiplexed image arrangement subsection
44 Number of horizontal partial pixels
45 Number of vertical partial pixels
46 Arrangement data
47 Offset vector encoding format flag
48 Offset vector storage position flag
49 Offset vector length encoding format flag
50 Offset vector run-length period flag
51 Coding
54 Offset vector value
55 Number of consecutive offset vector values
56 Run-length end code
101_1 to 101_N First to N-th eye image inputs
104 Stereoscopic image multiplexing unit
105 Multiplexed image compression unit
106 Motion vector decomposition unit
107 Transmission/recording unit
108 Reception/reproduction unit
109 Multiplexed image decompression unit
110 Motion vector synthesis unit
111 Stereoscopic image separation unit
112_1 to 112_N First to N-th eye image outputs
303 Prediction memory
304 Motion vector detection unit
305 Motion vector
306 Motion compensation unit
308 Local motion vector
309 Offset vector
310 DCT unit
311 Quantization unit
312 Variable-length encoding unit
313 Multiplexing unit
315 Inverse quantization unit
316 Inverse DCT unit
401 Encoding target block
403 Reference block
405 Motion vector
406 Offset block
407 Offset vector
408 Local motion vector
702 Consecutive blocks
703 Motion vector
704 Reference block
803 Separation unit
804 Difference image data
805 Offset vector
806 Local motion vector
807 Variable-length decoding unit
808 Inverse quantization unit
809 Inverse DCT unit
810 Motion vector synthesis unit
811 Motion vector
812 Motion compensation unit
813 Prediction memory

Claims

A multi-viewpoint image compression apparatus that compresses the data amount of a plurality of multi-viewpoint images, obtained by photographing a predetermined object from a plurality of viewpoints, by decomposing them into motion vectors and difference images, encoding these, and outputting them as a moving image stream, the apparatus comprising:
multi-view image multiplexing means for multiplexing the plurality of multi-viewpoint images in an image space to generate one multiplexed image;
motion vector detection means for detecting a motion vector by taking a predetermined block of the multiplexed image as a motion vector search range and selecting, as a reference image, from among the blocks constituting the multiplexed image, a block similar to the encoding target block of the multiplexed image so as to increase the prediction efficiency;
motion vector decomposing means for decomposing the motion vector detected by the motion vector detection means into an offset vector, which extends to an offset block located, within the multi-viewpoint image containing the selected block, at the same coordinates as the local coordinates of the encoding target block within its own multi-viewpoint image, and a local motion vector, which extends from the offset block to the selected block; and
multiplexing means for multiplexing the local motion vector and the offset vector decomposed by the motion vector decomposing means into the moving image stream and outputting the result.
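For illustration only, the decomposition recited above can be sketched for the simple case of equal-width images arranged side by side. The tile width of 160 pixels, the rule that the image containing the selected block is judged by that block's top-left corner, and the names `tile_index` and `decompose` are assumptions made for this sketch and are not part of the claim.

```python
from typing import Tuple

Vec = Tuple[int, int]

TILE_WIDTH = 160  # assumed width of each image in the multiplexed frame


def tile_index(x: int, tile_width: int = TILE_WIDTH) -> int:
    """Index of the image (counted from the left) containing column x."""
    return x // tile_width


def decompose(block_xy: Vec, mv: Vec, tile_width: int = TILE_WIDTH) -> Tuple[Vec, Vec]:
    """Split a motion vector into (offset vector, local motion vector).

    block_xy is the top-left corner of the encoding target block in the
    multiplexed frame. The offset block has the same local coordinates
    as the target block, but inside the image holding the selected block.
    """
    bx, _ = block_xy
    selected_x = bx + mv[0]
    shift = tile_index(selected_x, tile_width) - tile_index(bx, tile_width)
    offset = (shift * tile_width, 0)
    local = (mv[0] - offset[0], mv[1] - offset[1])
    return offset, local


offset, local = decompose((16, 32), (163, -2))
```

By construction the two parts always sum back to the detected motion vector, so a decoder can restore it by simple addition.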

The multi-viewpoint image compression apparatus according to claim 1, wherein the plurality of multi-viewpoint images are a right-eye image and a left-eye image.

The multi-viewpoint image compression apparatus according to claim 1, wherein the plurality of multi-viewpoint images are a right-eye image, a left-eye image, and an additional image in which either the right-eye image or the left-eye image is doubled in the horizontal direction.

The multi-viewpoint image compression apparatus according to claim 1, wherein the plurality of multi-viewpoint images are composed of a plurality of stereoscopic images and a planar image having a higher resolution than the stereoscopic images,
the apparatus further comprising determination means for determining whether the encoding target block is in a stereoscopic image or in the planar image of the multiplexed image, outputting the motion vector detected by the motion vector detecting means to the motion vector decomposing means when the encoding target block is in a stereoscopic image, and outputting the motion vector detected by the motion vector detecting means to the multiplexing means when the encoding target block is in the planar image.

The multi-viewpoint image compression apparatus according to claim 1, wherein the code representing the local motion vector is encoded according to a motion vector format in the MPEG standard.

The multi-viewpoint image compression apparatus according to claim 1, wherein the code representing the offset vector is expressed by referring to a fixed-length code table of at least [log₂(M−1)] + 1 bits (where [x] is the largest integer not exceeding x) predetermined for each of the M images arranged in the multiplexed image.
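As a worked illustration of the bit count [log₂(M−1)] + 1, the following sketch evaluates it with integer arithmetic. It relies on the fact that, for a positive integer n, Python's `int.bit_length()` equals floor(log₂ n) + 1; the function name is hypothetical.

```python
def fixed_length_bits(m: int) -> int:
    """Minimum code length [log2(M-1)] + 1 for M multiplexed images.

    Requires M >= 2 so that log2(M - 1) is defined.
    """
    if m < 2:
        raise ValueError("at least two images are required")
    return (m - 1).bit_length()


# M = 2 -> 1 bit, M = 3 or 4 -> 2 bits, M = 5 -> 3 bits
bits_for_three_images = fixed_length_bits(3)
```

For the three-image arrangement of the example (right-eye, left-eye, and additional image) this gives a 2-bit code per offset vector.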

The code representing the offset vector is variable-length-coded according to the viewpoint distance between the multi-view image including the encoding target block and the parallax image including the selected block. The multi-viewpoint image compression apparatus described.

The multi-viewpoint image compression apparatus according to any one of claims 1 to 5, wherein the code representing the offset vector is run-length encoded over a group of offset vectors respectively corresponding to two or more blocks adjacent to each other on the multiplexed screen.
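Run-length encoding of offset vectors across adjacent blocks can be sketched generically as pairs of (offset vector value, number of consecutive occurrences). This is a plain run-length coder shown for illustration; the actual bitstream layout, including any end code, is not reproduced, and the function names are hypothetical.

```python
def run_length_encode(values):
    """Collapse consecutive equal offset vector values into (value, count) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1] = (v, runs[-1][1] + 1)
        else:
            runs.append((v, 1))
    return runs


def run_length_decode(runs):
    """Expand (value, count) pairs back into one value per block."""
    return [v for v, count in runs for _ in range(count)]


# Offset vector codes of eight consecutive macroblocks (made-up example).
codes = [0, 0, 0, 1, 1, 0, 0, 0]
runs = run_length_encode(codes)
```

The saving depends on neighbouring blocks tending to reference the same image, so that long runs of identical offset vectors occur.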

The multi-viewpoint image compression apparatus according to any one of claims 1 to 8, wherein the video stream compressed by the multi-viewpoint image compression means is a video stream compliant with the MPEG standard,
the code representing the offset vector is inserted into one or both of a user data part and a header part of the stream compliant with the MPEG standard, and
a flag indicating the position where the offset vector exists, a flag indicating the encoding format of the offset vector, and a flag indicating the arrangement order of the images in the multiplexed image are inserted into the user data part of the stream compliant with the MPEG standard.

A multi-viewpoint image decompression device that receives a moving image stream and restores the original multi-viewpoint images by decompressing it, the moving image stream being obtained, when a multiplexed image formed by spatially multiplexing a plurality of multi-viewpoint images obtained by photographing a predetermined object from a plurality of viewpoints is decomposed into motion vectors and difference images and encoded, by: taking a predetermined block of the multiplexed image as a motion vector search range; detecting a motion vector by selecting, as a reference image, from among the blocks constituting the multiplexed image, a block similar to the encoding target block of the multiplexed image so as to increase the prediction efficiency; decomposing the detected motion vector into an offset vector, which extends to an offset block located, within the multi-viewpoint image containing the block selected as the reference image, at the same coordinates as the local coordinates of the encoding target block within its own multi-viewpoint image, and a local motion vector, which extends from the offset block to the selected block; and encoding and multiplexing the decomposed local motion vector and offset vector, the device comprising:
separating means for separating the local motion vector and the offset vector included in the received moving image stream;
motion vector synthesizing means for synthesizing a motion vector from the local motion vector and the offset vector separated by the separating means; and
multiplexed image restoration means for forming a predicted image from the motion vector synthesized by the motion vector synthesizing means and a reference image in the received moving image stream, and restoring the original multiplexed image by taking the sum of the predicted image and the difference image included in the moving image stream.

The multi-viewpoint image decompression device according to claim 10, wherein the plurality of multi-viewpoint images are composed of a plurality of stereoscopic images and a planar image having a higher resolution than the stereoscopic images,
the device further comprising discriminating means for detecting the arrangement order of the stereoscopic images and the planar image in the multiplexed image, outputting the motion vector separated from the moving image stream by the separating means to the motion vector synthesizing means as the local motion vector when the encoding target block is in a stereoscopic image, and outputting the motion vector separated from the moving image stream by the separating means to the multiplexed image restoration means when the encoding target block is in the planar image.

A multi-viewpoint image transmission system in which a plurality of multi-viewpoint images obtained by photographing a predetermined object from a plurality of viewpoints are decomposed into motion vectors and difference images, encoded, and transmitted as a moving image stream, and the transmitted moving image stream is received and decompressed to restore the original multi-viewpoint images, the system comprising:
a multi-viewpoint image compression apparatus having: multi-view image multiplexing means for multiplexing the plurality of multi-viewpoint images in an image space to generate one multiplexed image; motion vector detection means for detecting a motion vector by taking a predetermined area of the multiplexed image as a motion vector search range and selecting, as a reference image, from among the blocks constituting the multiplexed image, a block similar to the encoding target block of the multiplexed image so that the prediction efficiency is highest; motion vector decomposing means for decomposing the motion vector detected by the motion vector detection means into an offset vector, which extends to an offset block located, within the multi-viewpoint image containing the selected block, at the same coordinates as the local coordinates of the encoding target block within its own multi-viewpoint image, and a local motion vector, which extends from the offset block to the selected block; and multiplexing means for encoding the local motion vector and the offset vector decomposed by the motion vector decomposing means, multiplexing them into the moving image stream, and outputting the result; and
a multi-viewpoint image decompression device having: separating means for separating the local motion vector and the offset vector included in the moving image stream received from the multi-viewpoint image compression apparatus; motion vector synthesizing means for synthesizing a motion vector from the local motion vector and the offset vector separated by the separating means; and multiplexed image restoration means for forming a predicted image from the motion vector synthesized by the motion vector synthesizing means and a reference image in the received moving image stream, and restoring the original multiplexed image by taking the sum of the predicted image and the difference image included in the moving image stream.

The plurality of multi-viewpoint images are composed of a plurality of stereoscopic images and a planar image having a higher resolution than the stereoscopic images,
The multi-viewpoint image compression apparatus includes:
It is determined whether the encoding target block is in the stereoscopic image or the planar image of the multiplexed image, and when the encoding target block is in the stereoscopic image, the motion detected by the motion vector detecting means First discriminating means for outputting a vector to the motion vector decomposing means and outputting a motion vector detected by the motion vector detecting means to the multiplexing means when the encoding target block is in a plane image. In addition,
The multi-viewpoint image decompression device includes:
The arrangement order of the stereoscopic image and the planar image in the multiplexed image is detected, and when the encoding target block is in the stereoscopic image, the motion vector separated from the moving image stream by the separation means is used as the local motion vector. A second discriminating unit that outputs to the motion vector synthesizing unit, and outputs the motion vector separated from the moving image stream by the separating unit to the multiplexed image restoring unit when the encoding target block is in the plane image; The multi-viewpoint image transmission system according to claim 12, further comprising:

A multi-viewpoint image compression method that compresses the data amount of a plurality of multi-viewpoint images, obtained by photographing a predetermined object from a plurality of viewpoints, by decomposing them into motion vectors and difference images, encoding these, and outputting them as a moving image stream, the method comprising:
Generating a single multiplexed image by multiplexing the plurality of multi-viewpoint images in an image space;
Detecting a motion vector by taking a predetermined block of the multiplexed image as a motion vector search range and selecting, as a reference image, from among the blocks constituting the multiplexed image, a block similar to the encoding target block of the multiplexed image so as to increase the prediction efficiency;
Decomposing the detected motion vector into an offset vector, which extends to an offset block located, within the multi-viewpoint image containing the selected block, at the same coordinates as the local coordinates of the encoding target block within its own multi-viewpoint image, and a local motion vector, which extends from the offset block to the selected block; and
Multiplexing the decomposed local motion vector and offset vector into the moving image stream and outputting the result.

The multi-viewpoint image compression method according to claim 14, wherein the plurality of multi-viewpoint images are a right-eye image and a left-eye image.

The multi-viewpoint image compression method according to claim 14, wherein the plurality of multi-viewpoint images are a right-eye image, a left-eye image, and an additional image in which either the right-eye image or the left-eye image is doubled in the horizontal direction.

The multi-viewpoint image compression method according to any one of items 1 to 16, wherein the plurality of multi-viewpoint images are composed of a plurality of stereoscopic images and a planar image having a higher resolution than the stereoscopic images,
the method further comprising a step of determining whether the encoding target block is in a stereoscopic image or in the planar image of the multiplexed image, proceeding to the step of decomposing the detected motion vector into an offset vector and a local motion vector when the encoding target block is in a stereoscopic image, and multiplexing the detected motion vector into the moving image stream and outputting it when the encoding target block is in the planar image.

The multi-viewpoint image compression method according to any one of claims 14 to 17, wherein the code representing the local motion vector is encoded according to a motion vector format in the MPEG standard.

The video stream compressed by the multi-viewpoint image compression means is a video stream that complies with the MPEG standard,
The code representing the offset vector is inserted into one or both of a user data part and a header part in a stream compliant with the MPEG standard,
The flag indicating the position where the offset vector exists, the flag indicating the encoding format of the offset vector, and the flag indicating the arrangement order of the images in the multiplexed image are a user in the stream compliant with the MPEG standard. The multi-viewpoint image compression method according to claim 14, wherein the multi-viewpoint image compression method is inserted into a data portion.

A multi-viewpoint image decompression method that receives a moving image stream and restores the original multi-viewpoint images by decompressing it, the moving image stream being obtained, when a multiplexed image formed by spatially multiplexing a plurality of multi-viewpoint images obtained by photographing a predetermined object from a plurality of viewpoints is decomposed into motion vectors and difference images and encoded, by: taking a predetermined block of the multiplexed image as a motion vector search range; detecting a motion vector by selecting, as a reference image, from among the blocks constituting the multiplexed image, a block similar to the encoding target block of the multiplexed image so as to increase the prediction efficiency; decomposing the detected motion vector into an offset vector, which extends to an offset block located, within the multi-viewpoint image containing the block selected as the reference image, at the same coordinates as the local coordinates of the encoding target block within its own multi-viewpoint image, and a local motion vector, which extends from the offset block to the selected block; and encoding and multiplexing the decomposed local motion vector and offset vector, the method comprising:
Separating the local motion vector and the offset vector included in the received moving image stream;
Synthesizing a motion vector from the separated local motion vector and the offset vector; and
Forming a predicted image from the synthesized motion vector and a reference image in the received moving image stream, and restoring the original multiplexed image by taking the sum of the predicted image and the difference image included in the moving image stream.

The plurality of multi-viewpoint images are composed of a plurality of stereoscopic images and a planar image having a higher resolution than the stereoscopic images,
The arrangement order of the stereoscopic image and the planar image in the multiplexed image is detected. When the encoding target block is in the stereoscopic image, the motion vector separated from the video stream is synthesized as the local motion vector. The process further includes the step of proceeding to the step of restoring the original multiplexed image using the motion vector separated from the video stream when the encoding target block is in the planar image. The multi-viewpoint image decompression method according to claim 20.

A multi-viewpoint image transmission method in which a plurality of multi-viewpoint images obtained by photographing a predetermined object from a plurality of viewpoints are decomposed into motion vectors and difference images, encoded, and transmitted as a moving image stream, and in which the transmitted moving image stream is received and decompressed to restore the original multi-viewpoint images, the method comprising the steps of:
Generating a single multiplexed image by multiplexing the plurality of multi-viewpoint images in an image space;
Detecting a motion vector by setting a predetermined block of the multiplexed image as a motion vector search range and selecting, as a reference image from among the blocks constituting the multiplexed image, a block similar to the encoding target block of the multiplexed image so that prediction efficiency is high;
Decomposing the detected motion vector into an offset vector and a local motion vector, the offset vector reaching an offset block that lies in the multi-viewpoint image containing the selected block, at the same coordinates as the local coordinates of the encoding target block within its own multi-viewpoint image, and the local motion vector running from the offset block to the selected block;
Encoding the decomposed local motion vector and offset vector, multiplexing them into the moving image stream, and outputting the stream;
Separating the local motion vector and the offset vector included in the received moving image stream;
Synthesizing a motion vector from the separated local motion vector and offset vector; and
Forming a predicted image from the synthesized motion vector and a reference image in the received moving image stream, and restoring the original multiplexed image by taking the sum of the predicted image and the difference image included in the moving image stream.
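
The decomposing and synthesizing steps of this method can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the multiplexed image is a grid of equally sized view tiles, that a block is addressed by its top-left pixel, and that the view containing the selected block is the tile holding that block's top-left pixel. The names `Vec`, `decompose`, and `synthesize` are hypothetical and do not appear in the claims.

```python
from typing import NamedTuple, Tuple


class Vec(NamedTuple):
    x: int
    y: int


def decompose(target: Vec, mv: Vec, tile_w: int, tile_h: int) -> Tuple[Vec, Vec]:
    """Split a detected motion vector into (offset vector, local motion vector).

    Assumption: views are tiled on a regular tile_w x tile_h grid.
    """
    selected = Vec(target.x + mv.x, target.y + mv.y)
    # The offset block has the same local coordinates as the target block,
    # but lies in the tile (view) that contains the selected block.
    offset = Vec(
        (selected.x // tile_w - target.x // tile_w) * tile_w,
        (selected.y // tile_h - target.y // tile_h) * tile_h,
    )
    # The local motion vector runs from the offset block to the selected block.
    local = Vec(mv.x - offset.x, mv.y - offset.y)
    return offset, local


def synthesize(offset: Vec, local: Vec) -> Vec:
    """Recover the original motion vector on the decompression side."""
    return Vec(offset.x + local.x, offset.y + local.y)


# Target block in the second 320x240 view, matched to a block in the first view.
offset, local = decompose(Vec(336, 16), Vec(-318, 3), 320, 240)
print(offset, local)              # Vec(x=-320, y=0) Vec(x=2, y=3)
print(synthesize(offset, local))  # Vec(x=-318, y=3)
```

In this sketch the large component of the vector is carried entirely by the offset vector, which only takes values that are multiples of the tile size, while the local motion vector stays small.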

The plurality of multi-viewpoint images are composed of a plurality of stereoscopic images and a planar image having a higher resolution than the stereoscopic images,
In the process of compressing the multi-viewpoint images, it is determined whether the encoding target block is in a stereoscopic image or in the planar image of the multiplexed image; when the encoding target block is in a stereoscopic image, the process proceeds to the step of decomposing the detected motion vector into an offset vector and a local motion vector, and when the encoding target block is in the planar image, the detected motion vector is multiplexed into the moving image stream and output,
In the process of decompressing the multi-viewpoint images, the arrangement order of the stereoscopic images and the planar image in the multiplexed image is detected; when the encoding target block is in a stereoscopic image, the process proceeds to the step of synthesizing a motion vector, treating the motion vector separated from the moving image stream as the local motion vector, and when the encoding target block is in the planar image, the process proceeds to the step of restoring the original multiplexed image using the motion vector separated from the moving image stream. The multi-viewpoint image transmission method according to claim 22, further comprising these steps.
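
The stereoscopic/planar branching described in this claim might look as follows in Python. This is a sketch under stated assumptions, not the claimed implementation: the stereoscopic views are assumed to occupy the rows above `stereo_height` on a regular tile grid, the planar image is assumed to lie below them, and the dictionary keys `"mv"` and `"offset"` are hypothetical stand-ins for fields of the moving image stream.

```python
from typing import Dict, Tuple

Vec = Tuple[int, int]


def in_stereo_region(target: Vec, stereo_height: int) -> bool:
    # Assumed layout: stereoscopic tiles on top, planar image underneath.
    return target[1] < stereo_height


def encode_vectors(target: Vec, mv: Vec, tile_w: int, tile_h: int,
                   stereo_height: int) -> Dict[str, Vec]:
    """Compression side: decompose only for blocks in a stereoscopic image."""
    if not in_stereo_region(target, stereo_height):
        return {"mv": mv}  # planar block: the detected vector is sent as is
    sx, sy = target[0] + mv[0], target[1] + mv[1]
    offset = ((sx // tile_w - target[0] // tile_w) * tile_w,
              (sy // tile_h - target[1] // tile_h) * tile_h)
    local = (mv[0] - offset[0], mv[1] - offset[1])
    # The stream's motion vector field carries the local motion vector.
    return {"mv": local, "offset": offset}


def decode_vector(target: Vec, fields: Dict[str, Vec], stereo_height: int) -> Vec:
    """Decompression side: synthesize only for blocks in a stereoscopic image."""
    if not in_stereo_region(target, stereo_height):
        return fields["mv"]
    local, offset = fields["mv"], fields["offset"]
    return (offset[0] + local[0], offset[1] + local[1])
```

For a stereoscopic block the decoder treats the stream's motion vector field as the local motion vector and adds the offset vector; for a planar block it uses the field directly.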

A program for causing a computer to execute a multi-viewpoint image compression method that compresses the data amount by decomposing a plurality of multi-viewpoint images, obtained by photographing a predetermined object from a plurality of viewpoints, into motion vectors and difference images, encoding them, and outputting them as a moving image stream, the program causing the computer to execute:
A process of multiplexing the plurality of multi-viewpoint images in an image space to generate one multiplexed image;
A process of detecting a motion vector by setting a predetermined block of the multiplexed image as a motion vector search range and selecting, as a reference image from among the blocks constituting the multiplexed image, a block similar to the encoding target block of the multiplexed image so that prediction efficiency is high;
A process of decomposing the detected motion vector into an offset vector and a local motion vector, the offset vector reaching an offset block that lies in the multi-viewpoint image containing the selected block, at the same coordinates as the local coordinates of the encoding target block within its own multi-viewpoint image, and the local motion vector running from the offset block to the selected block; and
A process of multiplexing the decomposed local motion vector and offset vector into the moving image stream and outputting the stream.
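
The multiplexing and motion vector detection processes can be sketched as below. The claims do not name a similarity measure, so this example substitutes the sum of absolute differences (SAD) as a common stand-in for "high prediction efficiency". The row-major tiling, the explicit `candidates` list used as the search range, and all function names are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Image = List[List[int]]


def multiplex(views: Sequence[Image], cols: int) -> Image:
    """Tile equally sized views row-major into one multiplexed image.

    Assumes len(views) is a multiple of cols.
    """
    height = len(views[0])
    out: Image = []
    for start in range(0, len(views), cols):
        group = views[start:start + cols]
        for y in range(height):
            out.append([pixel for view in group for pixel in view[y]])
    return out


def sad(img: Image, ax: int, ay: int, bx: int, by: int, n: int) -> int:
    """Sum of absolute differences between two n x n blocks of img."""
    return sum(abs(img[ay + j][ax + i] - img[by + j][bx + i])
               for j in range(n) for i in range(n))


def detect_motion_vector(img: Image, target: Tuple[int, int], n: int,
                         candidates: Sequence[Tuple[int, int]]) -> Tuple[int, int]:
    """Pick the candidate block most similar to the target block."""
    tx, ty = target
    bx, by = min(candidates, key=lambda c: sad(img, tx, ty, c[0], c[1], n))
    return (bx - tx, by - ty)


# Two 6x4 views; view1 is view0 shifted left by one pixel (a crude disparity).
view0 = [[10 * y + x for x in range(6)] for y in range(4)]
view1 = [row[1:] + [0] for row in view0]
mux = multiplex([view0, view1], cols=2)
candidates = [(x, y) for y in range(3) for x in range(5)]  # 2x2 blocks in view0
print(detect_motion_vector(mux, (7, 1), 2, candidates))    # (-5, 0)
```

The detected vector (-5, 0) crosses from the second view into the first; on a 6-pixel-wide tile grid it would split into an offset of (-6, 0) and a local motion vector of (1, 0).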

The plurality of multi-viewpoint images are composed of a plurality of stereoscopic images and a planar image having a higher resolution than the stereoscopic images,
It is determined whether the encoding target block is in a stereoscopic image or in the planar image of the multiplexed image; when the encoding target block is in a stereoscopic image, the process proceeds to the process of decomposing the detected motion vector into an offset vector and a local motion vector, and when the encoding target block is in the planar image, the computer is further caused to execute a process of multiplexing the detected motion vector into the moving image stream and outputting it. The program described.

A program for causing a computer to execute a multi-viewpoint image decompression method that receives and decompresses a moving image stream to restore the original multi-viewpoint images, the moving image stream having been obtained, when a multiplexed image formed by spatially multiplexing a plurality of multi-viewpoint images obtained by photographing a predetermined object from a plurality of viewpoints is decomposed into motion vectors and difference images and encoded, by: setting a predetermined block of the multiplexed image as a motion vector search range; detecting a motion vector by selecting, as a reference image from among the blocks constituting the multiplexed image, a block similar to the encoding target block of the multiplexed image so that prediction efficiency is high; decomposing the detected motion vector into an offset vector and a local motion vector, the offset vector reaching an offset block that lies in the multi-viewpoint image containing the block selected as the reference image, at the same coordinates as the local coordinates of the encoding target block within its own multi-viewpoint image, and the local motion vector running from the offset block to the selected block; and encoding and multiplexing the decomposed local motion vector and offset vector, the program causing the computer to execute:
A process of separating the local motion vector and the offset vector included in the received moving image stream;
A process of synthesizing a motion vector from the separated local motion vector and offset vector; and
A process of forming a predicted image from the synthesized motion vector and a reference image in the received moving image stream, and restoring the original multiplexed image by taking the sum of the predicted image and the difference image included in the moving image stream.
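
The final restoring process, predicted image plus difference image, can be shown with a small per-block Python sketch. It assumes integer pixel values held in nested lists, square blocks addressed by their top-left pixel, and no clipping or quantization; `predict_block`, `difference_block`, and `restore_block` are hypothetical names.

```python
from typing import List, Tuple

Image = List[List[int]]
Vec = Tuple[int, int]


def predict_block(reference: Image, target: Vec, mv: Vec, n: int) -> Image:
    """Copy the n x n block that the motion vector points to in the reference."""
    rx, ry = target[0] + mv[0], target[1] + mv[1]
    return [[reference[ry + j][rx + i] for i in range(n)] for j in range(n)]


def difference_block(current: Image, reference: Image, target: Vec,
                     mv: Vec, n: int) -> Image:
    """Compression side: difference image = current block - predicted block."""
    pred = predict_block(reference, target, mv, n)
    tx, ty = target
    return [[current[ty + j][tx + i] - pred[j][i] for i in range(n)]
            for j in range(n)]


def restore_block(reference: Image, target: Vec, mv: Vec, diff: Image) -> Image:
    """Decompression side: restored block = predicted block + difference image."""
    n = len(diff)
    pred = predict_block(reference, target, mv, n)
    return [[pred[j][i] + diff[j][i] for i in range(n)] for j in range(n)]
```

Because the difference image is formed against the same prediction that the decoder builds from the synthesized motion vector, adding the two restores the block exactly in this lossless sketch.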

The plurality of multi-viewpoint images are composed of a plurality of stereoscopic images and a planar image having a higher resolution than the stereoscopic images,
The arrangement order of the stereoscopic images and the planar image in the multiplexed image is detected; when the encoding target block is in a stereoscopic image, the process proceeds to the process of synthesizing a motion vector, treating the motion vector separated from the moving image stream as the local motion vector, and when the encoding target block is in the planar image, the process proceeds to the process of restoring the original multiplexed image using the motion vector separated from the moving image stream. The program according to claim 26, further causing the computer to execute these processes.

A program for causing a computer to execute a multi-viewpoint image transmission method in which a plurality of multi-viewpoint images obtained by photographing a predetermined object from a plurality of viewpoints are decomposed into motion vectors and difference images, encoded, and transmitted as a moving image stream, and in which the transmitted moving image stream is received and decompressed to restore the original multi-viewpoint images, the program causing the computer to execute:
A process of multiplexing the plurality of multi-viewpoint images in an image space to generate one multiplexed image;
A process of detecting a motion vector by setting a predetermined block of the multiplexed image as a motion vector search range and selecting, as a reference image from among the blocks constituting the multiplexed image, a block similar to the encoding target block of the multiplexed image so that prediction efficiency is high;
A process of decomposing the detected motion vector into an offset vector and a local motion vector, the offset vector reaching an offset block that lies in the multi-viewpoint image containing the selected block, at the same coordinates as the local coordinates of the encoding target block within its own multi-viewpoint image, and the local motion vector running from the offset block to the selected block;
A process of encoding the decomposed local motion vector and offset vector, multiplexing them into the moving image stream, and outputting the stream;
A process of separating the local motion vector and the offset vector included in the received moving image stream;
A process of synthesizing a motion vector from the separated local motion vector and offset vector; and
A process of forming a predicted image from the synthesized motion vector and a reference image in the received moving image stream, and restoring the original multiplexed image by taking the sum of the predicted image and the difference image included in the moving image stream.
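
The processes of multiplexing the two vectors into the stream and separating them again can be sketched with Python's standard `struct` module. The fixed layout of four little-endian signed 16-bit integers is purely an assumption for illustration; the claims do not specify a bitstream syntax, and a real encoder would typically apply entropy coding instead.

```python
import struct
from typing import Tuple

Vec = Tuple[int, int]

# Hypothetical layout: offset.x, offset.y, local.x, local.y as signed 16-bit ints.
_FORMAT = "<4h"


def multiplex_vectors(offset: Vec, local: Vec) -> bytes:
    """Transmitting side: pack both vectors into a stream payload."""
    return struct.pack(_FORMAT, offset[0], offset[1], local[0], local[1])


def separate_vectors(payload: bytes) -> Tuple[Vec, Vec]:
    """Receiving side: recover (offset vector, local motion vector)."""
    ox, oy, lx, ly = struct.unpack(_FORMAT, payload)
    return (ox, oy), (lx, ly)


payload = multiplex_vectors((-320, 0), (2, 3))
print(len(payload))               # 8
print(separate_vectors(payload))  # ((-320, 0), (2, 3))
```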

The plurality of multi-viewpoint images are composed of a plurality of stereoscopic images and a planar image having a higher resolution than the stereoscopic images,
In the process of compressing the multi-viewpoint images, it is determined whether the encoding target block is in a stereoscopic image or in the planar image of the multiplexed image; when the encoding target block is in a stereoscopic image, the process proceeds to the process of decomposing the detected motion vector into an offset vector and a local motion vector, and when the encoding target block is in the planar image, the detected motion vector is multiplexed into the moving image stream and output,
In the process of decompressing the multi-viewpoint images, the arrangement order of the stereoscopic images and the planar image in the multiplexed image is detected; when the encoding target block is in a stereoscopic image, the process proceeds to the process of synthesizing a motion vector, treating the motion vector separated from the moving image stream as the local motion vector, and when the encoding target block is in the planar image, the process proceeds to the process of restoring the original multiplexed image using the motion vector separated from the moving image stream. The program according to claim 28, further causing the computer to execute these processes.