JP6949163B2

JP6949163B2 - Element image group generator, encoder, decoder, and program

Info

Publication number: JP6949163B2
Application number: JP2020050152A
Authority: JP
Inventors: 一宏原; 河北　真宏; 真宏河北; 洗井　淳; 淳洗井; 三科　智之; 智之三科; 菊池　宏; 宏菊池; 三浦　雅人; 雅人三浦; 直人岡市; 隼人渡邉
Original assignee: Japan Broadcasting Corp
Current assignee: Japan Broadcasting Corp
Priority date: 2016-06-16
Filing date: 2020-03-19
Publication date: 2021-10-13
Anticipated expiration: 2036-06-16
Also published as: JP2020108174A

Description

The present invention relates to an element image group generation device, an encoding device, a decoding device, and a program.

The integral method has been proposed and studied as one imaging method for three-dimensional (3D) television that can be viewed from an arbitrary viewpoint. In the integral method, imaging uses a group of lenses arranged on a plane or a spherical surface. The arranged group of lenses is called a lens array, and each individual lens in it is called an element lens. The output signal of a 3D television camera (imaging device) that adopts the integral method is a signal in which the stereoscopic imaging signals obtained from the lens group are integrated.

It is known that the resolution of a stereoscopic image captured by the integral method depends on the number of pixels of the image sensor and on the number of element lenses. The more element lenses there are, and the more pixels there are in the image generated by each element lens (hereinafter, an "element image"), the higher the resolution of the captured stereoscopic image.
Similarly, when a stereoscopic image is displayed by the integral method, a natural, high-resolution stereoscopic image can be reproduced by increasing the number of element lenses and the number of pixels per element image.

The range within which a stereoscopic image displayed by the integral method can be viewed correctly is called the viewing zone. The viewing zone corresponds to the angle over which an element image is projected by its element lens, and it depends on the focal length of the element lens and on the size of the element image.
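As a worked illustration of how the viewing zone depends on these two quantities, the relation commonly used in the integral imaging literature can be written as follows. This is a sketch under the assumption that the element image lies at the focal distance $f$ behind its element lens and has width $w$; the symbols $\Omega$, $w$, and $f$ are introduced here for illustration and do not appear in the source.

```latex
\Omega = 2\arctan\!\left(\frac{w}{2f}\right)
```

For example, $w = 1$ mm and $f = 2$ mm would give $\Omega = 2\arctan(0.25) \approx 28.1^\circ$. Under this relation, a shorter focal length or a larger element image widens the viewing zone.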

One way to acquire an element image group for the integral method is to capture the subject with an image sensor through a lens array. This limits the size of the subject that can be captured, but because the image sensor acquires the element image group directly, the processing from capture to display can be performed immediately. To capture a wide space or a subject of unrestricted size by the integral method, a different approach is used: multi-viewpoint video of the subject is captured from multiple viewpoints, and a three-dimensional model of the subject is generated in a computer graphics (CG) space.

Patent Document 1 describes a method of acquiring an element image group by virtually photographing a three-dimensional model, obtained from multi-viewpoint images, through a lens array placed in the CG space.

Japanese Unexamined Patent Publication No. 2013-196532

The method described in Patent Document 1 requires a process that generates a three-dimensional model from multi-viewpoint images. Generating an element image group therefore requires an arithmetic unit with high processing power, and the computation takes time. On the other hand, because generating a three-dimensional model makes it possible to track changes in the model along the time axis, regions that cannot be displayed because they are hidden behind the subject (hereinafter, "occlusion regions") can be interpolated.

A method of generating an element image group from multi-viewpoint images without generating a three-dimensional model is also known. With this method, depending on the structure of the lens array, the generated element image group must include components that do not belong to any element image (out-of-element-image components). Complex arithmetic is therefore required, just as when a three-dimensional model is generated, and generating the element image group takes time.

Typical lens arrays for capturing and displaying integral stereoscopic images are the square-arrangement lens array and the delta-arrangement lens array.
The delta arrangement places circular element lenses in a staggered, close-packed pattern, like stacked bales.
That is, in the delta arrangement, circular element lenses of the same size are arranged so that the center points of adjacent element lenses form equilateral triangles. When circular element lenses of the same size are arranged without overlapping, the delta arrangement gives the largest number of element lenses per unit area. Using a delta-arrangement lens array increases the number of element lenses in the vertical direction and improves the resolution of the displayed integral stereoscopic image.
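The statement about lens density can be checked with a short calculation. Assuming ideal circular lenses of diameter $d$ that touch their neighbors (the symbols $d$ and $\eta$ are introduced here for illustration and are not from the source), the fraction of the plane covered by lenses is:

```latex
\eta_{\text{square}} = \frac{\pi (d/2)^2}{d^2} = \frac{\pi}{4} \approx 0.785,
\qquad
\eta_{\text{delta}} = \frac{\pi (d/2)^2}{d \cdot \frac{\sqrt{3}}{2}\, d} = \frac{\pi}{2\sqrt{3}} \approx 0.907
```

Under these assumptions, adjacent lens rows in the delta arrangement are spaced $(\sqrt{3}/2)\,d$ apart rather than $d$, so about $2/\sqrt{3} \approx 1.155$ times as many rows fit in the same height, and roughly 9% of the array area remains as gaps between the lenses.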

When an element image group is generated for a delta-arrangement lens array, the generated group contains not only element image components but also components outside the element images. This is because there are gaps between the circular element lenses of the delta arrangement, and the light passing through those gaps is also captured.
To generate such an element image group from multi-viewpoint images, multiple pixels of the multi-viewpoint images must be extracted and an interpolation must be performed on their pixel values. As a result, generating the element image group involves complicated computation and takes a long time.

The present invention was made in recognition of the above problems, and aims to provide an element image group generation device that can generate an element image group with a small amount of computation or in a short computation time.
The present invention also aims to provide a system (an encoding device and a decoding device) that can encode and decode element image groups with high efficiency by applying the configuration of the above element image group generation device.

[1] To solve the above problems, one aspect of the present invention is an element image group generation device that generates an element image group consisting of element images for displaying a stereoscopic image by the integral method, the device comprising: a multi-viewpoint image group acquisition unit that acquires a multi-viewpoint image group in which images viewed from multiple viewpoints are arranged; a mask image storage unit that stores a mask image holding, as the pixel value of each pixel, the degree to which a component outside the element images is contained in the corresponding pixel of the multi-viewpoint image group; an out-of-element-image component subtraction unit that subtracts the pixel values of the pixels of the mask image stored in the mask image storage unit from the pixel values of the pixels of the multi-viewpoint image group acquired by the multi-viewpoint image group acquisition unit; and an image conversion unit that converts the multi-viewpoint image group output from the out-of-element-image component subtraction unit into an element image group.

[2] Another aspect of the present invention is an encoding device comprising: an element image group acquisition unit that acquires an element image group consisting of element images; an image conversion unit that converts the element image group acquired by the element image group acquisition unit into a multi-viewpoint image group; a mask image storage unit that stores a mask image holding, as the pixel value of each pixel, the degree to which a component outside the element images is contained in the corresponding pixel of the multi-viewpoint image group; a multi-viewpoint image selection unit that, by referring to the mask image stored in the mask image storage unit, determines for the image of each viewpoint included in the multi-viewpoint image group output from the image conversion unit whether its out-of-element-image component is at or below a predetermined value, and selects and outputs only the images of viewpoints whose out-of-element-image component is at or below the predetermined value; and a multi-viewpoint image encoding unit that encodes and outputs only the images of the viewpoints selected by the multi-viewpoint image selection unit from the multi-viewpoint image group.

[3] The encoding device according to one aspect of the present invention is the above encoding device, further comprising an image size conversion unit that converts the image size of each element image included in the element image group acquired by the element image group acquisition unit by inserting or deleting pixels in at least one of the vertical and horizontal directions so that the pitch of the element images becomes an integer multiple of the pixel pitch, wherein the image conversion unit converts the element image group whose size has been converted by the image size conversion unit into the multi-viewpoint image group.

[4] The encoding device according to one aspect of the present invention is the encoding device according to claim 3, wherein, when converting the image size, the image size conversion unit uses the pixel values of pixels included in the element image before conversion as the pixel values of the pixels at the corresponding positions in the element image after conversion.

[5] Another aspect of the present invention is a decoding device that decodes encoded multi-viewpoint images, converts them into an element image group consisting of element images for displaying a stereoscopic image by the integral method, and outputs the element image group, the device comprising: a multi-viewpoint image decoding unit that decodes a code obtained by encoding multi-viewpoint images, which are images of a plurality of viewpoints; a missing multi-viewpoint image generation unit that generates images of missing viewpoints based on the viewpoint images decoded by the multi-viewpoint image decoding unit, and outputs a multi-viewpoint image group consisting of the decoded viewpoint images and the generated images; a mask image storage unit that stores a mask image holding, as the pixel value of each pixel, the degree to which a component outside the element images is contained in the corresponding pixel of the multi-viewpoint image group; an out-of-element-image component subtraction unit that subtracts the pixel values of the pixels of the mask image stored in the mask image storage unit from the pixel values of the pixels of the multi-viewpoint image group output from the missing multi-viewpoint image generation unit; and an image conversion unit that converts the multi-viewpoint image group output from the out-of-element-image component subtraction unit into the element image group.

[6] The decoding device according to one aspect of the present invention further comprises an image size conversion unit that converts the image size of each element image included in the element image group output by the image conversion unit by inserting or deleting pixels in at least one of the vertical and horizontal directions, so as to invert the image size conversion performed at the time of encoding.

[7] In the decoding device according to one aspect of the present invention, when converting the image size, the image size conversion unit uses the pixel values of pixels included in the element image before conversion as the pixel values of the pixels at the corresponding positions in the element image after conversion.

[8] Another aspect of the present invention is a program for causing a computer to function as an element image group generation device that generates an element image group consisting of element images for displaying a stereoscopic image by the integral method, the device comprising: a multi-viewpoint image group acquisition unit that acquires a multi-viewpoint image group in which images viewed from multiple viewpoints are arranged; a mask image storage unit that stores a mask image holding, as the pixel value of each pixel, the degree to which a component outside the element images is contained in the corresponding pixel of the multi-viewpoint image group; an out-of-element-image component subtraction unit that subtracts the pixel values of the pixels of the mask image stored in the mask image storage unit from the pixel values of the pixels of the multi-viewpoint image group acquired by the multi-viewpoint image group acquisition unit; and an image conversion unit that converts the multi-viewpoint image group output from the out-of-element-image component subtraction unit into an element image group.

[9] Another aspect of the present invention is a program for causing a computer to function as an encoding device comprising: an element image group acquisition unit that acquires an element image group consisting of element images; an image conversion unit that converts the element image group acquired by the element image group acquisition unit into a multi-viewpoint image group; a mask image storage unit that stores a mask image holding, as the pixel value of each pixel, the degree to which a component outside the element images is contained in the corresponding pixel of the multi-viewpoint image group; a multi-viewpoint image selection unit that, by referring to the mask image stored in the mask image storage unit, determines for the image of each viewpoint included in the multi-viewpoint image group output from the image conversion unit whether its out-of-element-image component is at or below a predetermined value, and selects and outputs only the images of viewpoints whose out-of-element-image component is at or below the predetermined value; and a multi-viewpoint image encoding unit that encodes and outputs only the images of the viewpoints selected by the multi-viewpoint image selection unit from the multi-viewpoint image group.

[10] Another aspect of the present invention is a program for causing a computer to function as a decoding device that decodes encoded multi-viewpoint images, converts them into an element image group consisting of element images for displaying a stereoscopic image by the integral method, and outputs the element image group, the device comprising: a multi-viewpoint image decoding unit that decodes a code obtained by encoding multi-viewpoint images, which are images of a plurality of viewpoints; a missing multi-viewpoint image generation unit that generates images of missing viewpoints based on the viewpoint images decoded by the multi-viewpoint image decoding unit, and outputs a multi-viewpoint image group consisting of the decoded viewpoint images and the generated images; a mask image storage unit that stores a mask image holding, as the pixel value of each pixel, the degree to which a component outside the element images is contained in the corresponding pixel of the multi-viewpoint image group; an out-of-element-image component subtraction unit that subtracts the pixel values of the pixels of the mask image stored in the mask image storage unit from the pixel values of the pixels of the multi-viewpoint image group output from the missing multi-viewpoint image generation unit; and an image conversion unit that converts the multi-viewpoint image group output from the out-of-element-image component subtraction unit into the element image group.
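The viewpoint selection described in aspects [2] and [9] can be sketched as follows. This is a minimal illustration, not the patented implementation: the source does not say how the out-of-element-image component of a whole viewpoint image is aggregated from the mask, so the mean mask value per viewpoint is assumed here. The function name `select_viewpoints`, the nested-list layout `mask_views[v][u]`, and the threshold value are all hypothetical.

```python
def select_viewpoints(mask_views, threshold):
    """Return the (v, u) indices of viewpoints to encode.

    mask_views[v][u] is the part of the mask image that corresponds to the
    viewpoint in row v, column u, stored as a list of pixel rows.
    ASSUMPTION: a viewpoint is kept when the mean of its mask values is at
    or below `threshold`; the aggregation rule is not given in the source.
    """
    selected = []
    for v, row in enumerate(mask_views):
        for u, mask in enumerate(row):
            pixels = [p for line in mask for p in line]
            if sum(pixels) / len(pixels) <= threshold:
                selected.append((v, u))
    return selected


# Hypothetical 2 x 2 viewpoint grid, each viewpoint mask being 1 x 2 pixels.
mask_views = [
    [[[0, 0]], [[255, 255]]],
    [[[10, 30]], [[0, 100]]],
]
kept = select_viewpoints(mask_views, threshold=20)
```

Only the viewpoints listed in `kept` would then be passed to the multi-viewpoint image encoding unit; the remaining viewpoints are the ones a decoder would have to regenerate.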

According to the present invention, element images can be generated by a process that subtracts the pixel values of a mask image. In other words, the amount of processing or the processing time needed to generate element images can be reduced.

A block diagram showing the schematic functional configuration of the element image group generation device according to the first embodiment of the present invention.
A schematic diagram for explaining the arrangement of the lenses in a delta-arrangement lens array and the size conversion ratio used by the image size conversion unit in the same embodiment.
A block diagram showing the schematic functional configuration of the mask image generation device that generates the mask image used by the element image group generation device of the same embodiment.
A schematic diagram showing an example of an image in the process of generating the mask image used in the same embodiment.
A schematic diagram showing an example of the mask image used in the same embodiment. The mask image is generated by the above mask image generation device.
A schematic diagram showing an example of a multi-viewpoint image group that the multi-viewpoint image group acquisition unit of the element image group generation device of the same embodiment acquired from outside and that was resized by the image size conversion unit.
A schematic diagram showing an example of the multi-viewpoint image group output by the out-of-element-image component subtraction unit of the element image group generation device of the same embodiment.
A block diagram showing the schematic functional configuration of the element image group generation device according to the second embodiment.
A block diagram showing the schematic functional configuration of the encoding device and the decoding device according to the third embodiment.
A schematic diagram showing an example of the result of viewpoint selection by the multi-viewpoint image selection unit of the same embodiment.
A schematic diagram showing an example of an image in the process of restoring the multi-viewpoint images on the decoding device side in the same embodiment.
A schematic diagram showing the multi-viewpoint image group restored on the decoding device side in the same embodiment.
A schematic diagram showing an example of the multi-viewpoint image group obtained when no image size conversion is performed.
A schematic diagram showing an example of the multi-viewpoint image group obtained when image size conversion is performed.
A schematic diagram for explaining the positional relationship of pixels before and after image size conversion in a modification of the embodiment.
A schematic diagram showing the reference relationship between pixel values before and after conversion when the image size conversion unit of the same modification performs size conversion.
A schematic diagram showing the multi-viewpoint image group generated from an element image group resized according to the same modification.

Next, embodiments of the present invention will be described with reference to the drawings.
[First Embodiment]
FIG. 1 is a block diagram showing the schematic functional configuration of the element image group generation device according to this embodiment. The element image group generation device 1 generates an element image group consisting of element images for displaying a stereoscopic image by the integral method. As illustrated, the element image group generation device 1 includes a mask image storage unit 10, a multi-viewpoint image group acquisition unit 21, an image size conversion unit 22, an out-of-element-image component subtraction unit 23, an image conversion unit 31, an image size conversion unit 32, and an element image group output unit 33.
In the following, every pixel value in an image is at least a predetermined minimum value and at most a predetermined maximum value. Here, the minimum value is zero. When an image consists of multiple channels (for example, the three primary colors R (red), G (green), and B (blue)), the pixel value of each channel is at least the minimum value and at most the maximum value.

The mask image storage unit 10 stores a mask image. The stored mask image is generated in advance based on the structure of the lens array used to display the stereoscopic image. The mask image is data that has a pixel value for each pixel, and the pixel value of each pixel represents the amount of out-of-element-image component at that pixel. A mask pixel value equal to the minimum value (zero) indicates that the pixel has no out-of-element-image component; a mask pixel value equal to the maximum value indicates that the out-of-element-image component at that pixel is at its maximum. The mask image stored in the mask image storage unit 10 is generated on the assumption that a delta-arrangement lens array is used. In other words, the mask image storage unit 10 stores a mask image that holds, as the pixel value of each pixel, the degree to which components outside the element images (out-of-element-image components) are contained in the pixels of the multi-viewpoint image group. How the mask image is generated is described later.

The multi-viewpoint image group acquisition unit 21 acquires multi-viewpoint images from outside the device. A multi-viewpoint image group consists of images obtained by capturing a subject with cameras at multiple viewpoints. That is, the multi-viewpoint image group acquisition unit 21 acquires a multi-viewpoint image group in which images viewed from multiple viewpoints are arranged. For example, when there are 64 viewpoints in total (8 vertical × 8 horizontal), the multi-viewpoint images can be captured by 64 cameras arranged 8 vertical × 8 horizontal. Alternatively, images can be captured with fewer cameras than viewpoints and the remaining viewpoints obtained by interpolation. For example, images captured by 32 cameras can be used to generate, by interpolation, images for 32 other viewpoints, giving multi-viewpoint images for 64 viewpoints in total. When images are interpolated, interpolation between captured viewpoints (rather than extrapolation) is used. The multi-viewpoint image group acquisition unit 21 may obtain the multi-viewpoint images directly from the cameras, or may acquire from outside image data captured by the cameras in advance.

The image size conversion unit 22 converts the size of the multi-viewpoint images acquired by the multi-viewpoint image group acquisition unit 21. Specifically, it changes the size of the multi-viewpoint images to match the vertical-to-horizontal arrangement ratio of a lens array in a delta arrangement. More specifically, it converts the image size so that the number of pixels in the vertical direction becomes (2/SQRT(3)) times (≈ 1.1547 times) the original, where "SQRT()" denotes the square root. In other words, for each element image in the element image group, the image size conversion unit 22 inserts or deletes pixels in at least one of the vertical and horizontal directions so that the pitch of the element images becomes an integer multiple of the pixel pitch. The image size conversion unit 22 does not change the image size in the horizontal direction. It obtains the value of each pixel after the conversion by interpolating the pixel values before the conversion. Instead of interpolation, the pixel values after size conversion may be obtained by the method of Modification 3 described later.
The pixel ratio used in the image size conversion is explained further below with reference to another drawing.
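The vertical-only resize described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `resize_vertical`, the list-of-rows image representation, and the choice of linear interpolation between neighbouring rows are all assumptions, since the text only states that pixel values are obtained by interpolation.

```python
import math

STRETCH = 2 / math.sqrt(3)  # vertical factor applied before subtraction (about 1.1547)
SHRINK = math.sqrt(3) / 2   # inverse factor applied afterwards (about 0.8660)

def resize_vertical(image, scale):
    """Scale a grayscale image (a list of equal-length rows) vertically.

    The width is left unchanged. Each output row is a linear interpolation
    of the two nearest source rows (an assumed interpolation kernel).
    """
    src_h = len(image)
    dst_h = max(1, round(src_h * scale))
    out = []
    for y in range(dst_h):
        # Map the output row index back to a fractional source row position.
        pos = y * (src_h - 1) / (dst_h - 1) if dst_h > 1 else 0.0
        y0 = min(int(pos), src_h - 1)
        y1 = min(y0 + 1, src_h - 1)
        w = pos - y0
        out.append([(1 - w) * a + w * b for a, b in zip(image[y0], image[y1])])
    return out
```

Applying `STRETCH` and then `SHRINK` returns an image to its original row count whenever the rounding in both directions is consistent, which matches the description of the two conversions as inverses of each other.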

The element image external component subtraction unit 23 subtracts the mask image read from the mask image storage unit 10 from the multi-viewpoint images (after image size conversion) output by the image size conversion unit 22. The subtraction is performed pixel by pixel. In other words, the pixel values of the mask image stored in the mask image storage unit 10 are subtracted from the pixel values of the multi-viewpoint image group acquired by the multi-viewpoint image group acquisition unit 21, between corresponding pixels of the two images. The minimum pixel value is zero, so if the result of a subtraction is negative, the value of that pixel is set to zero.
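The pixel-wise subtraction with clamping at zero can be sketched as below. The function name `subtract_mask` and the list-of-rows representation are illustrative assumptions; the clamping rule itself is the one stated in the text.

```python
def subtract_mask(view_image, mask_image):
    """Subtract mask pixel values from image pixel values, clamping at zero.

    Both arguments are lists of rows of numbers with identical dimensions.
    """
    if len(view_image) != len(mask_image) or any(
        len(r) != len(m) for r, m in zip(view_image, mask_image)
    ):
        raise ValueError("image and mask must have the same size")
    return [
        [max(p - m, 0) for p, m in zip(row, mask_row)]
        for row, mask_row in zip(view_image, mask_image)
    ]
```

The size check reflects why the image size conversion comes first: the subtraction only makes sense when each pixel has a corresponding mask pixel.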

The image conversion unit 31 converts the multi-viewpoint image group output from the element image external component subtraction unit 23 into an element image group.
The conversion from a multi-viewpoint image group to an element image group can itself be performed with existing techniques. Starting from a multi-viewpoint image group, in which pixels are grouped by viewpoint, the image conversion unit 31 rearranges the pixels to generate an element image group, in which pixels are grouped by element lens.
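The text leaves the rearrangement to existing techniques. One common idealized mapping is sketched below under explicit assumptions: views are tiled on a regular grid, every view has exactly one pixel per element lens, lenses lie on a square grid (the delta-arrangement offset is ignored), and no per-lens flip is applied. The function names are hypothetical. Under these assumptions, pixel (s, t) of view (u, v) becomes pixel (u, v) of the element image behind lens (s, t).

```python
def views_to_elements(mv, n_views_y, n_views_x):
    """Rearrange a tiled multi-viewpoint image into a tiled element image group."""
    h, w = len(mv), len(mv[0])
    lens_y, lens_x = h // n_views_y, w // n_views_x  # pixels per view = lenses
    out = [[0] * w for _ in range(h)]
    for u in range(n_views_y):
        for v in range(n_views_x):
            for s in range(lens_y):
                for t in range(lens_x):
                    out[s * n_views_y + u][t * n_views_x + v] = mv[u * lens_y + s][v * lens_x + t]
    return out

def elements_to_views(el, n_views_y, n_views_x):
    """Inverse rearrangement: element image group back to multi-viewpoint tiles."""
    h, w = len(el), len(el[0])
    lens_y, lens_x = h // n_views_y, w // n_views_x
    out = [[0] * w for _ in range(h)]
    for u in range(n_views_y):
        for v in range(n_views_x):
            for s in range(lens_y):
                for t in range(lens_x):
                    out[u * lens_y + s][v * lens_x + t] = el[s * n_views_y + u][t * n_views_x + v]
    return out
```

The two functions are exact inverses of each other, so no pixel values are altered by the conversion; only their positions change.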

The image size conversion unit 32 converts the size of the image output from the image conversion unit 31. This conversion is the inverse of the size conversion performed by the image size conversion unit 22 described above. Specifically, the image size conversion unit 32 converts the image size so that the number of pixels in the vertical direction becomes (SQRT(3)/2) times (≈ 0.8660 times) the original. In other words, for each element image in the element image group, it inserts or deletes pixels in at least one of the vertical and horizontal directions so that the pitch of the element images becomes an integer multiple of the pixel pitch. The image size conversion unit 32 does not change the image size in the horizontal direction. Here too, the value of each pixel after the conversion is obtained by interpolating the pixel values before the conversion. Instead of interpolation, the method of Modification 3 described later may be used.
As a result of this size conversion by the image size conversion unit 32, the pitch of the element lenses becomes an integer multiple of the pixel pitch.
The element image group output unit 33 then outputs the element image group obtained by the above processing.

FIG. 2 is a diagram for explaining the size conversion ratio used by the image size conversion unit 22 and the image size conversion unit 32. It is a plan view of part of a large number of element lenses arranged in a delta arrangement. Only four of the element lenses are shown; the others are omitted. The triangles shown in the figure have the center points of the circular element lenses as their vertices and are equilateral.
As illustrated, in a delta arrangement the ratio of the horizontal pitch of the element lenses to their vertical pitch is 2 : SQRT(3). The conversion ratios used by the image size conversion unit 22 and the image size conversion unit 32 are determined from this ratio.
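The 2 : SQRT(3) ratio follows from the geometry of the equilateral triangle. The short derivation below uses two symbols that are not in the original and are introduced here only for illustration: p for the horizontal pitch between adjacent lenses in a row (the side of the triangle) and h for the vertical pitch between adjacent rows (the height of the triangle).

```latex
% p: horizontal pitch between adjacent element lenses in a row (assumed symbol)
% h: vertical pitch between adjacent rows of element lenses (assumed symbol)
h = p \sin 60^\circ = \frac{\sqrt{3}}{2}\, p
\quad\Longrightarrow\quad
p : h = 2 : \sqrt{3},
\qquad
\frac{p}{h} = \frac{2}{\sqrt{3}} \approx 1.1547,
\qquad
\frac{h}{p} = \frac{\sqrt{3}}{2} \approx 0.8660
```

The two quotients are the vertical scale factors quoted for the image size conversion unit 22 and the image size conversion unit 32 respectively.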

Next, a method of generating the mask image used by the element image group generating device 1 is described.
FIG. 3 is a block diagram showing the schematic functional configuration of a mask image generation device. As illustrated, the mask image generation device 40 includes a three-dimensional model storage unit 41, an element image group generation unit 42, an image size conversion unit 43, an image conversion unit 44, an inversion unit 45, and a mask image output unit 46. In the method described here, the mask image generation device 40 generates an element image group from a three-dimensional model and converts that element image group into a multi-viewpoint image group. The function of each unit is as follows.

The three-dimensional model storage unit 41 stores a three-dimensional model for computer graphics. For generating the mask image, the stored model is a 100% white plane located on the lens array used for displaying the stereoscopic image. When the model is "100% white", the pixel values obtained by rendering the model directly are the maximum value (when pixels are expressed in the three RGB primaries, the maximum value in each of the R, G, and B channels).

The element image group generation unit 42 generates the element image group obtained when the three-dimensional model stored in the three-dimensional model storage unit 41 is the subject, in accordance with the given lens array structure (position, number, size, focal length, and so on).

The image size conversion unit 43 changes the image size of the element image group generated by the element image group generation unit 42. Specifically, it scales the image by (2/SQRT(3)) times (≈ 1.1547 times) in the vertical direction and does not change the image size in the horizontal direction. The value of each pixel after the conversion is obtained by interpolating the pixel values before the conversion. As explained with reference to FIG. 2, this scaling ratio is determined by the lens pitch of the element lenses that make up the delta-arrangement lens array. With the image size changed in this way, the lens pitch of the element lenses and the pixel pitch of the display device are in an integer ratio.
The image size conversion unit 43 performs this size conversion in order to match the size of the multi-viewpoint images to be processed, that is, the multi-viewpoint images from which the components outside the element image represented by the mask image are to be subtracted. If the sizes already match, the size conversion by the image size conversion unit 43 is omitted.

The image conversion unit 44 converts the element image group (after the image size change) output from the image size conversion unit 43 into a multi-viewpoint image group. The image output from the image conversion unit 44 is described later with reference to a drawing.
The conversion from an element image group to a multi-viewpoint image group can itself be performed with existing techniques. Starting from an element image group, in which pixels are grouped by element lens, the image conversion unit 44 rearranges the pixels to generate a multi-viewpoint image group, in which pixels are grouped by viewpoint.

The inversion unit 45 performs negative/positive inversion on each pixel of the multi-viewpoint image group output from the image conversion unit 44. The inversion is expressed as: output pixel value = maximum pixel value - input pixel value. This processing converts white pixels to black and black pixels to white, and pixels of intermediate gradation are inverted in the same way. The image output from the inversion unit 45 is described later with reference to a drawing. The image obtained as a result of the processing by the inversion unit 45 is the mask image.
The mask image output unit 46 outputs the mask image generated by the inversion unit 45 to the outside.
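The inversion formula can be written directly in code. In this sketch the function name `invert` and the default maximum of 255 (8-bit pixels) are assumptions; the text does not fix a bit depth.

```python
def invert(image, max_value=255):
    """Negative/positive inversion: output = max_value - input, per pixel.

    `image` is a list of rows of numbers in the range 0..max_value.
    The 8-bit default for `max_value` is an assumption.
    """
    return [[max_value - p for p in row] for row in image]
```

Applying `invert` twice returns the original image, which is why the white regions of the rendered plane end up as the zero-valued (no subtraction) regions of the mask.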

For a lens array defined by given parameters, the calculation of the element images needs to be performed only once. By storing the generated mask image, the same mask image can be reused for the same lens array. The mask image is therefore stored in the mask image storage unit 10.

FIG. 4 is a schematic diagram showing an example of the image output from the image conversion unit 44 in the course of generating a mask image. The illustrated image is the result of the image conversion unit 44 converting the element image group into a multi-viewpoint image group. In this image, white pixels (pixels with large values) reflect the original three-dimensional model (the 100% white plane), and black pixels (pixels with small values) reflect components outside the element image. Pixels of intermediate gradation indicate, by their values, the degree of the components outside the element image (the smaller the pixel value, the higher that degree). In other words, the image shows how the components outside the element image are distributed after conversion from the element image group to the multi-viewpoint image group.

FIG. 5 is a schematic diagram showing an example of a mask image generated by the mask image generation device. The illustrated image is obtained by the inversion unit 45 applying negative/positive inversion to the image shown in FIG. 4. In the image of FIG. 5, therefore, white pixels (pixels with large values) correspond to components outside the element image, and the larger the pixel value, the greater the degree to which the pixel contains such components.

Instead of generating the mask image with the mask image generation device 40 shown in FIG. 3, the mask image may be generated as follows. A pure white object (for example, a pure white flat plate) is photographed with a light field camera, an IP (integral photography) camera, or the like, and the mask image is generated by inverting the luminance of the resulting image.

Next, an example of a multi-viewpoint image group processed by the element image group generating device is described.
FIG. 6 is a schematic diagram showing an example of a multi-viewpoint image group that has been acquired from outside by the multi-viewpoint image group acquisition unit 21 of the element image group generating device 1 and resized by the image size conversion unit 22. The illustrated image consists of multi-viewpoint images obtained from 64 viewpoints in total (8 vertical × 8 horizontal).

FIG. 7 is a schematic diagram showing an example of the multi-viewpoint image group output by the element image external component subtraction unit 23 of the element image group generating device 1. With the mask image of FIG. 5 stored in advance in the mask image storage unit 10, the multi-viewpoint image of FIG. 7 is the result of subtracting the mask image (FIG. 5) read from the mask image storage unit 10 from the multi-viewpoint image of FIG. 6. Comparing FIG. 6 (before the mask image is subtracted) with FIG. 7 (after the mask image is subtracted), the subtraction has a relatively strong effect mainly in the images corresponding to peripheral viewpoints. This corresponds to the distribution of whitish pixels (pixels with relatively large values) in the mask image of FIG. 5.

As described above, the element image group generating device 1 according to this embodiment subtracts a mask image, calculated and stored in advance, from the multi-viewpoint images, and can therefore generate multi-viewpoint images in which the components outside the element image are suppressed (the output of the element image external component subtraction unit 23). The multi-viewpoint image group output from the element image external component subtraction unit 23 is converted into an element image group by the image conversion unit 31, and the image size conversion unit 32 then converts the image size back (the inverse of the size conversion by the image size conversion unit 22), so that the desired element image group can be generated.

[Second Embodiment]
Next, a second embodiment of the present invention is described. Matters already described in the preceding embodiment may be omitted below; the description focuses on matters specific to this embodiment.
FIG. 8 is a block diagram showing the schematic functional configuration of the element image group generating device according to this embodiment. As illustrated, the element image group generating device 2 includes a mask image storage unit 11, a multi-viewpoint image group acquisition unit 21, an element image external component subtraction unit 23, an image conversion unit 31, and an element image group output unit 33.

In this embodiment, the mask image storage unit 11 stores a mask image, as does the mask image storage unit 10 of the preceding embodiment. The mask image storage unit 11, however, is characterized in that it stores a mask image that has already been resized according to the structure of the lens array used. That is, the stored mask image has been size-converted in advance to 1 times its horizontal size and (SQRT(3)/2) times (≈ 0.8660 times) its vertical size.
Accordingly, the element image external component subtraction unit 23 of this embodiment subtracts the mask image stored in the mask image storage unit 11 from the multi-viewpoint images acquired by the multi-viewpoint image group acquisition unit 21. For each pixel, if the result of the subtraction is negative, the value of that pixel is set to zero.

This embodiment therefore does not require the size conversion performed by the image size conversion unit 22 and the image size conversion unit 32 of the element image group generating device 1 according to the first embodiment, which makes it possible to generate the element image group even faster.

[Third Embodiment]
Next, a third embodiment of the present invention is described. Matters already described in the preceding embodiments may be omitted below; the description focuses on matters specific to this embodiment.
FIG. 9 is a block diagram showing the schematic functional configuration of the coding device and the decoding device according to this embodiment. The coding device 3 converts an input element image group into multi-viewpoint images and encodes them. The code generated by the coding device 3 is passed to the decoding device 4 by transmission or via a recording medium. The decoding device 4 acquires the code output from the coding device 3, decodes it, converts the result into an element image group, and outputs it. In other words, the decoding device 4 decodes the encoded multi-viewpoint images, converts them into an element image group made up of element images for displaying a stereoscopic image by the integral method, and outputs that group.

As illustrated, the coding device 3 includes a mask image storage unit 13, an element image group acquisition unit 51, an image size conversion unit 52, an image conversion unit 53, a multi-viewpoint image selection unit 54, and a multi-viewpoint image coding unit 55.
The decoding device 4 includes a mask image storage unit 14, a multi-viewpoint image decoding unit 61, a missing multi-viewpoint image generation unit 62, an element image external component subtraction unit 63, an image conversion unit 64, an image size conversion unit 65, and an element image group output unit 66.

The functions of the units of the coding device 3 and the decoding device 4, and the flow of processing in these devices, are described below.
First, the element image group acquisition unit 51 of the coding device 3 acquires an element image group from outside. An element image group is an image in which a plurality of element images are arranged. The element image group acquisition unit 51 acquires, for example, an element image group captured using element lenses and an image sensor. It may acquire the element image group as a still image, or as video made up of a time series of images.
The element image group acquired by the element image group acquisition unit 51 contains components outside the element image.

The image size conversion unit 52 converts the image size in the same way as the image size conversion unit 22 of the first embodiment. Specifically, it converts the size of the image output from the element image group acquisition unit 51 and outputs the result.
The image conversion unit 53 converts the element image group output from the image size conversion unit 52 into a multi-viewpoint image group. That is, the element image group acquired by the element image group acquisition unit 51 is converted into a multi-viewpoint image group.

Like the mask image storage unit 10 of the first embodiment, the mask image storage unit 13 stores in advance a mask image representing the degree of the components outside the element image. That is, it stores a mask image whose pixel values hold the degree to which components outside the element image are contained in the pixels of the multi-viewpoint image group. The method of generating the mask image is as described for the mask image generation device 40 in the first embodiment.

The multi-viewpoint image selection unit 54 excludes, from the multi-viewpoint image group output by the image conversion unit 53, the images from viewpoints that contain components outside the element image, and selects only the images from viewpoints that do not. It decides whether the image from a given viewpoint contains components outside the element image on the basis of the pixel values of the mask image read from the mask image storage unit 13.
To make this decision, the multi-viewpoint image selection unit 54 uses, for example, a suitably chosen threshold on the mask image pixel values. For the region occupied by the image of a particular viewpoint in the multi-viewpoint image group, it calculates the average of the pixel values in the corresponding region of the mask image. If the calculated average is greater than or equal to the threshold, it judges that the image of that viewpoint contains components outside the element image. Conversely, if the average is less than the threshold, it judges that the image does not.
In other words, by referring to the mask image stored in the mask image storage unit 13, the multi-viewpoint image selection unit 54 determines, for the image of each viewpoint in the multi-viewpoint image group output from the image conversion unit 53, whether the components outside the element image are at or below a predetermined value, and selects and outputs only the images of viewpoints for which they are.
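The threshold test on the per-view mask average can be sketched as follows. The function name `select_views`, the regular tiling of views within the mask, and the return format (a list of grid indices) are assumptions made for illustration; the comparison follows the rule that an average at or above the threshold marks a view as containing components outside the element image.

```python
def select_views(mask, n_views_y, n_views_x, threshold):
    """Return the (row, col) indices of views whose mask average is below threshold.

    `mask` is a list of rows covering all views, tiled on an
    n_views_y x n_views_x grid of equally sized regions (assumed layout).
    """
    h, w = len(mask), len(mask[0])
    vh, vw = h // n_views_y, w // n_views_x
    selected = []
    for u in range(n_views_y):
        for v in range(n_views_x):
            region = [
                mask[y][x]
                for y in range(u * vh, (u + 1) * vh)
                for x in range(v * vw, (v + 1) * vw)
            ]
            mean = sum(region) / len(region)
            # Average >= threshold: the view is judged to contain
            # components outside the element image and is excluded.
            if mean < threshold:
                selected.append((u, v))
    return selected
```

Because the decision depends only on the mask, it can be computed once per lens array and reused for every frame of a video.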

The multi-viewpoint image coding unit 55 encodes and outputs only those images, out of the multi-viewpoint image group output from the image conversion unit 53, that were selected by the multi-viewpoint image selection unit 54. That is, it encodes and outputs only images that do not contain components outside the element image. The encoding of the images (video) itself can be performed by applying existing techniques as appropriate.

As described above, the coding device 3 according to this embodiment excludes multi-viewpoint images that contain components outside the element image from encoding, based on the decision made by the multi-viewpoint image selection unit 54. That is, the coding device 3 encodes only multi-viewpoint images that contain element image components alone, and outputs a code that includes no multi-viewpoint image containing components outside the element image.

On the decoding device 4 side, the multi-viewpoint image decoding unit 61 decodes the multi-viewpoint images output from the coding device 3. That is, it decodes the code obtained by encoding multi-viewpoint images, which are images from a plurality of viewpoints. As described above, because the multi-viewpoint image selection unit 54 of the coding device 3 has selected among the multi-viewpoint images, the images of some viewpoints are missing from the images decoded by the multi-viewpoint image decoding unit 61.

The missing multi-viewpoint image generation unit 62 generates the missing multi-viewpoint images, that is, the images of the viewpoints excluded by the multi-viewpoint image selection unit 54 as containing components outside the element image. It then combines the images decoded by the multi-viewpoint image decoding unit 61 with the missing-viewpoint images it has generated and outputs them as a single multi-viewpoint image group. In other words, it generates the images of the missing viewpoints from the viewpoint images decoded by the multi-viewpoint image decoding unit 61, and outputs a multi-viewpoint image group consisting of the decoded images and the generated images.
Specifically, the missing multi-viewpoint image generation unit 62 generates the images of the missing viewpoints from the multi-viewpoint image group decoded by the multi-viewpoint image decoding unit 61. As one concrete method, it generates the image of a missing viewpoint by duplicating the decoded image whose viewpoint position is closest to the missing viewpoint. Alternatively, it generates the image of a missing viewpoint by viewpoint interpolation using the decoded images of the surrounding viewpoints.
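The first of the two methods, copying the nearest decoded view, can be sketched as below. The function name `fill_missing_views`, the dictionary keyed by view-grid index, the use of squared distance in grid indices as the notion of "closest", and the tie-breaking by index order are all assumptions; the text only says that the closest decoded viewpoint is duplicated.

```python
def fill_missing_views(decoded, n_views_y, n_views_x):
    """Fill every missing view with a copy of the nearest decoded view.

    `decoded` maps (row, col) view indices to images (lists of rows).
    Distance is measured in view-grid indices (an assumption); ties are
    broken by the smaller index so the result is deterministic.
    """
    if not decoded:
        raise ValueError("at least one decoded view is required")
    full = {}
    for u in range(n_views_y):
        for v in range(n_views_x):
            if (u, v) in decoded:
                full[(u, v)] = decoded[(u, v)]
            else:
                nearest = min(
                    decoded,
                    key=lambda k: ((k[0] - u) ** 2 + (k[1] - v) ** 2, k),
                )
                # Copy the rows so later edits do not alias the source view.
                full[(u, v)] = [row[:] for row in decoded[nearest]]
    return full
```

The interpolation-based alternative mentioned in the text would replace the copy with a blend of several neighbouring decoded views; it is not shown here.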

Like the mask image storage unit 13, the mask image storage unit 14 stores in advance a mask image representing the degree of the components outside the element image. That is, the mask image storage unit 14 stores a mask image whose pixel values hold the degree to which components outside the element image are contained in the corresponding pixels of the multi-viewpoint image group.
The element image external component subtraction unit 63 subtracts the mask image read from the mask image storage unit 14 from the multi-viewpoint image group output by the missing multi-viewpoint image generation unit 62. Specifically, it subtracts the pixel value of each pixel of the mask image from the pixel value of the corresponding pixel of the multi-viewpoint image group. If the result of the subtraction is negative for a pixel, the pixel value of that pixel is set to zero. Through this processing, the element image external component subtraction unit 63 removes the components outside the element image from the multi-viewpoint image group output by the missing multi-viewpoint image generation unit 62.
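The subtraction with clamping at zero described above can be sketched in a few lines. The function name `subtract_mask` and the representation of an image as a list of pixel rows are assumptions made for the illustration; the pixel values are arbitrary sample numbers.

```python
def subtract_mask(view, mask):
    """Subtract the mask from the view pixel by pixel, clamping negatives to 0."""
    return [
        [max(pixel - leak, 0) for pixel, leak in zip(view_row, mask_row)]
        for view_row, mask_row in zip(view, mask)
    ]


view = [[10, 5], [0, 200]]   # pixel values of one viewpoint image
mask = [[3, 8], [1, 50]]     # degree of the out-of-element-image component
cleaned = subtract_mask(view, mask)
```

Here the pixels whose mask value exceeds the image value (5 - 8 and 0 - 1) end up at zero rather than negative.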

The image conversion unit 64 converts the multi-viewpoint images output from the element image external component subtraction unit 63 into element images. The technique for converting a multi-viewpoint image group into an element image group is as already described.
The image size conversion unit 65 converts the image size of the element images output from the image conversion unit 64. This conversion is the inverse of the conversion performed by the image size conversion unit 52 of the coding device 3. That is, for each element image in the element image group output by the image conversion unit 64, the image size conversion unit 65 converts the image size by inserting or deleting pixels in at least one of the vertical and horizontal directions so as to invert the image size conversion performed at the time of encoding. In this respect, the image size conversion unit 65 converts the image size in the same manner as the image size conversion unit 32 in the first embodiment.
The element image group output unit 66 outputs the element images whose size has been converted by the image size conversion unit 65.
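As a rough illustration of the conversion performed by the image conversion unit 64, the sketch below assumes the rearrangement commonly used in integral imaging: pixel (i, j) of the viewpoint image (u, v) becomes pixel (u, v) of the element image (i, j). That index convention, the function name `views_to_element_images`, and the nested-list layout are assumptions for the example; the sketch ignores the delta arrangement and any image inversion handled elsewhere in the document.

```python
def views_to_element_images(views):
    """Rearrange views[u][v][i][j] into elements[i][j][u][v]."""
    n_u, n_v = len(views), len(views[0])
    n_i, n_j = len(views[0][0]), len(views[0][0][0])
    return [
        [
            [[views[u][v][i][j] for v in range(n_v)] for u in range(n_u)]
            for j in range(n_j)
        ]
        for i in range(n_i)
    ]


# 2 x 2 viewpoints, each a 1 x 3 image; the value encodes (u, v, j).
views = [
    [[[100 * u + 10 * v + j for j in range(3)]] for v in range(2)]
    for u in range(2)
]
elements = views_to_element_images(views)
```

With this toy input, the result holds 1 x 3 element images of 2 x 2 pixels each, one pixel per viewpoint.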

Next, a specific processing example in the present embodiment will be described with reference to simulated image data.
FIG. 10 is a schematic diagram showing an example of the result of viewpoint selection by the multi-viewpoint image selection unit 54. The image shown is a multi-viewpoint image group containing images of 64 viewpoints in total, 8 vertically (8 rows) by 8 horizontally (8 columns). As described above, the multi-viewpoint image selection unit 54 refers to the mask image storage unit 13, excludes the viewpoint images that contain components outside the element image, and outputs only the viewpoint images that do not contain such components. In the example shown, the multi-viewpoint image selection unit 54 excludes the viewpoint images outside the frame line as images of components outside the element image, and selects the viewpoint images inside the frame line as the images to be output. Specifically, it excludes all columns of row 1; columns 1, 2, 7, and 8 of row 2; columns 1 and 8 of each of rows 3 to 7; and columns 1, 2, 7, and 8 of row 8. It selects for output columns 3 to 6 of row 2; columns 2 to 7 of each of rows 3 to 7; and columns 3 to 6 of row 8.
The multi-viewpoint image coding unit 55 then encodes and outputs only the viewpoint images selected by the multi-viewpoint image selection unit 54 (the images that do not contain components outside the element image).
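The selection pattern of the FIG. 10 example can be written down directly from the row and column ranges given in the text. The function name `is_selected` and the 1-based (row, column) convention are assumptions for this sketch; in the embodiment the decision is derived from the mask image rather than hard-coded.

```python
def is_selected(row, col):
    """True if the viewpoint at 1-based (row, col) is kept in the FIG. 10 example."""
    if row == 1:
        return False              # every column of row 1 is excluded
    if row in (2, 8):
        return 3 <= col <= 6      # rows 2 and 8 keep columns 3 to 6
    return 2 <= col <= 7          # rows 3 to 7 keep columns 2 to 7


positions = [(r, c) for r in range(1, 9) for c in range(1, 9)]
selected = [p for p in positions if is_selected(*p)]
excluded = [p for p in positions if not is_selected(*p)]

# Print the 8 x 8 grid: '#' marks an encoded viewpoint, '.' an excluded one.
for r in range(1, 9):
    print("".join("#" if is_selected(r, c) else "." for c in range(1, 9)))
```

Counting the ranges gives 4 + 5 × 6 + 4 = 38 encoded viewpoints and 26 excluded ones out of 64.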

As illustrated with the figure, the coding device 3 according to the present embodiment can improve coding efficiency because the multi-viewpoint image selection unit 54 excludes the images of components outside the element image.

FIG. 11 is a schematic diagram showing an example of images in the process of restoring the multi-viewpoint images on the decoding device side. As shown, this multi-viewpoint image group does not include the viewpoint images excluded by the multi-viewpoint image selection unit 54 of the coding device 3. As described above, the missing multi-viewpoint image generation unit 62 of the decoding device 4 complements the multi-viewpoint image group shown in the figure by generating the images of the missing viewpoints. In the illustrated example, the missing multi-viewpoint image generation unit 62 generates the missing images according to the arrows in the figure, as follows. The image at row 2, column 3 is used to generate the images at row 1, columns 2 and 3. The image at row 2, column 4 is used to generate the image at row 1, column 4. The image at row 2, column 5 is used to generate the image at row 1, column 5. The image at row 2, column 6 is used to generate the images at row 1, columns 6 and 7. The image at row 3, column 2 is used to generate the images at row 2, column 1 and row 3, column 1. The image at row 3, column 3 is used to generate the images at row 1, column 1 and row 2, column 2. The image at row 3, column 6 is used to generate the images at row 1, column 8 and row 2, column 7. The image at row 3, column 7 is used to generate the images at row 2, column 8 and row 3, column 8. In each of rows 4 to 6, the image at column 2 is used to generate the image at column 1 of the same row, and the image at column 7 is used to generate the image at column 8 of the same row. The image at row 7, column 2 is used to generate the images at row 7, column 1; row 8, column 1; and row 8, column 2. The image at row 7, column 7 is used to generate the images at row 7, column 8; row 8, column 7; and row 8, column 8.
The reference pattern of viewpoint images shown here is only one example of a method for restoring the multi-viewpoint images. The images of the missing viewpoints may instead be generated by duplication or interpolation using another reference pattern.
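The reference pattern of FIG. 11 can be captured as a lookup table from each missing viewpoint to the decoded viewpoint it is generated from. The sketch below transcribes the pairs listed in the text and fills the gaps by plain duplication; the names `SOURCES`, `REFERENCE`, and `complete_by_duplication` are hypothetical, and the string labels merely stand in for real image data.

```python
# Decoded (row, col) -> missing viewpoints generated from it, per the FIG. 11 text.
SOURCES = {
    (2, 3): [(1, 2), (1, 3)],
    (2, 4): [(1, 4)],
    (2, 5): [(1, 5)],
    (2, 6): [(1, 6), (1, 7)],
    (3, 2): [(2, 1), (3, 1)],
    (3, 3): [(1, 1), (2, 2)],
    (3, 6): [(1, 8), (2, 7)],
    (3, 7): [(2, 8), (3, 8)],
    (7, 2): [(7, 1), (8, 1), (8, 2)],
    (7, 7): [(7, 8), (8, 7), (8, 8)],
}
for r in (4, 5, 6):                 # rows 4 to 6 borrow from the adjacent column
    SOURCES[(r, 2)] = [(r, 1)]
    SOURCES[(r, 7)] = [(r, 8)]

# Invert the table: missing viewpoint -> decoded viewpoint to copy.
REFERENCE = {dst: src for src, dsts in SOURCES.items() for dst in dsts}


def complete_by_duplication(decoded):
    """Return the full viewpoint set, copying each missing view from its source."""
    full = dict(decoded)
    for dst, src in REFERENCE.items():
        full[dst] = decoded[src]
    return full


# Stand-in data: each decoded view is just a label such as "view33".
decoded = {
    (r, c): f"view{r}{c}"
    for r in range(1, 9) for c in range(1, 9)
    if (r, c) not in REFERENCE
}
full = complete_by_duplication(decoded)
```

The table covers 26 missing viewpoints, which together with the 38 decoded ones restores all 64. Replacing the copy with an interpolation over several sources would follow the same structure.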

FIG. 12 is a schematic diagram showing the multi-viewpoint image group obtained as the result of restoration on the decoding device side. As shown, the viewpoint images excluded by the multi-viewpoint image selection unit 54 of the coding device 3 (see FIG. 10) have been restored by the missing multi-viewpoint image generation unit 62 of the decoding device 4. Of course, the original multi-viewpoint images before encoding (the images as output from the image conversion unit 53) and the restored images (the images as output from the missing multi-viewpoint image generation unit 62) are not identical. However, the viewpoint images restored by the missing multi-viewpoint image generation unit 62 are considered sufficient in practice as approximations. Moreover, the restored viewpoint images are images of viewpoints that originally contained components outside the element image, and they are subject to subtraction (as the minuend images) in the processing by the element image external component subtraction unit 63.

As described above, in the present embodiment, reducing the number of multi-viewpoint images encoded by the multi-viewpoint image coding unit 55 reduces the size of the encoded (compressed) data. In addition, viewpoint images that contain components outside the element image have many high-frequency components, so not having to encode such viewpoint images makes highly efficient coding possible. Therefore, when the encoded data is transmitted over a communication line or the like, transmission efficiency can be improved. Likewise, when the encoded data is recorded on a recording medium or the like, recording efficiency can be improved.

Next, modifications of the embodiments will be described.
[Modification 1]
In the third embodiment, a mask image generated in advance is stored in the mask image storage unit 13 on the coding device 3 side and in the mask image storage unit 14 on the decoding device 4 side.
As a modification, the coding device 3 may also encode the mask image and pass the encoded mask image to the decoding device 4. The decoding device 4 then decodes the mask image from the received code, and the element image external component subtraction unit 63 uses that mask image.

[Modification 2]
In the above embodiments, the image size conversion unit converts the image size so that the pixel pitch matches the sampling points of the lenses constituting the lens array.
As a modification, the size conversion for matching the pixel pitch to the sampling points of the lenses constituting the lens array may be omitted.
Even when the size conversion is omitted, the effects of subtracting the components outside the element image based on the mask image (the element image external component subtraction unit 23 of the first embodiment) and of selecting the multi-viewpoint images (the multi-viewpoint image selection unit 54 of the second embodiment) can still be obtained.
However, as explained below, performing the image size conversion makes it possible to reduce noise and to raise coding efficiency.

FIG. 13 is a schematic diagram showing an example of the multi-viewpoint image group obtained when the image size conversion is not performed. The illustrated example is a multi-viewpoint image group made up of 64 viewpoints in total, 8 vertically by 8 horizontally.
FIG. 14 is a schematic diagram showing an example of the multi-viewpoint image group obtained when the image size conversion is performed. The illustrated example is likewise a multi-viewpoint image group made up of 64 viewpoints in total, 8 vertically by 8 horizontally.
Comparing the image in FIG. 13 with the image in FIG. 14 shows that a large amount of random noise occurs in the multi-viewpoint image group when the image size is not converted. This is because a lens array with a delta arrangement is used: when the size conversion is not performed (the case of FIG. 13), the pixel pitch of the display monitor does not match the sampling points of the lenses constituting the lens array. When the two do not match, the components outside the element image are spread out and appear as random noise. On the other hand, when the size conversion is performed (the case of FIG. 14), the pixel pitch of the display monitor matches the sampling points of the lenses constituting the lens array (the size is converted so that they match), and in the resulting multi-viewpoint image group the components outside the element image are concentrated at specific positions.

When the multi-viewpoint image coding unit 55 encodes images, coding becomes more efficient as the size of the block serving as the basic unit of the coding process increases. Because the components outside the element image tend to be concentrated locally in the multi-viewpoint image group obtained with the size conversion (FIG. 14), that group can be encoded more efficiently than the multi-viewpoint image group obtained without the size conversion (FIG. 13).

[Modification 3]
In each of the embodiments described above, the image size conversion unit (reference numerals 22, 32, 43, 52, and 65) calculates the pixel values after size conversion by interpolating the pixel values of adjacent or neighboring pixels.
In this modification, the image size conversion unit does not perform that interpolation when converting the image size. Instead, it refers to the pixel value of an adjacent pixel and copies that value as it is. That is, when converting the image size, the image size conversion unit uses the pixel value of a pixel in the element image before conversion as the pixel value of the pixel at the corresponding position in the element image after conversion. This copying of pixel values can be applied both to size conversion that reduces the number of pixels and to size conversion that increases it. When the image size is converted so that the number of pixels decreases, some pixel values may be discarded without being referenced. When the image size is converted so that the number of pixels increases, a plurality of pixels having the same pixel value may be generated in at least part of the image.
When the image size is converted in this way, as long as the positions of the pixels added when the number of pixels increases are known, the conversion in the opposite direction (a conversion that reduces the number of pixels) can restore the original exactly. That is, the conversion can be made reversible.

Here, the calculation of pixel positions in this modification will be described.
FIG. 15 is a schematic diagram for explaining the positional relationship of pixels before and after the image size conversion. FIG. 15(a) is a plan view of the arrangement of the element images before the image size conversion.
FIG. 15(b) is a plan view of the arrangement of the element images after the image size conversion. In both (a) and (b), the element images are arranged in a delta arrangement. When each element image is a circle of diameter R, the vertical pitch of the element images in this delta arrangement (the vertical lens pitch) is R·(√3/2). When the vertical lens pitch in (a) corresponds to A pixels and the vertical lens pitch in (b) corresponds to B pixels, the positions of the pixels newly added by the size conversion are expressed by equation (1) below. Here, B > A.

The floor function is used on both sides of the inequality in equation (1). The floor function takes one real argument and returns the largest integer that does not exceed that argument.
The pixel positions at which pixels are newly added in the size conversion are the values of n that satisfy equation (1) above.

Although the case of converting the size in the vertical direction is described here, the same approach can be applied to size conversion in the horizontal direction.

FIG. 16 is a schematic diagram showing the reference relationship of pixel values in the size conversion performed by the image size conversion unit. The figure represents the pixel value references before and after the size conversion as cross-sectional views of the pixel array, and shows the cross section of only part of the element image group. In the figure, each of the broken lines H1, H2, and H3 indicates a boundary between element images. That is, in the illustrated cross section, the range from broken line H1 to H2 corresponds to one element image, and the range from broken line H2 to H3 corresponds to another element image. The distance from the center of an element image to the center of an adjacent element image is the lens pitch (the pitch of the element images). The distance from H1 to H2 and the distance from H2 to H3 are each also equal to the lens pitch. In the illustrated example, the lens pitch before the size conversion corresponds to 5.5 pixels (the value of A), and the lens pitch after the size conversion corresponds to 6.0 pixels (the value of B). That is, after the size conversion, the lens pitch equals an integer number of pixels.
In other words, in this example there are pixels that belong to more than one element image before the size conversion, whereas after the size conversion every pixel belongs to exactly one element image. The values A = 5.5 and B = 6.0 are used here to keep the illustration simple. In practice, one element image may be configured as an array of several tens or several hundreds of pixels in each of the vertical and horizontal directions.

In the figure, each pixel is numbered for convenience.
Before the size conversion, one element image corresponds to 5.5 pixels. The first element image (the range from broken line H1 to H2) contains pixels 1 to 5 and the upper half of pixel 6. The second element image (the range from broken line H2 to H3) contains the lower half of pixel 6 and pixels 7 to 11.
After the size conversion, one element image corresponds to 6.0 pixels. The first element image (the range from broken line H1 to H2) contains pixels 1 to 6. The second element image (the range from broken line H2 to H3) contains pixels 7 to 12.

The image size conversion unit refers to the pixel values of the image before the size conversion and generates the image after the size conversion. In the illustrated example, pixel 12 in the converted image is the newly added pixel. When converting the size so that the number of pixels increases, the image size conversion unit always places the added pixel at the end of a lens (the edge of an element image). In the converted image, pixels 1 to 11 take their values from pixels 1 to 11 of the image before conversion, respectively. That is, the image size conversion unit uses the pixel value of pixel 1 before conversion as the pixel value of pixel 1 after conversion, the pixel value of pixel 2 before conversion as the pixel value of pixel 2 after conversion, and so on. Pixel 12 in the converted image takes its value from pixel 11 of the image before conversion.

Because the positions of the added pixels can be obtained by calculation in this way, the image before the size conversion can be completely restored from the image after the size conversion. That is, in this modification, the size conversion that increases the number of pixels is a reversible conversion.
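The copy-based enlargement and its exact inverse can be sketched for the FIG. 16 example (A = 5.5, B = 6.0). Equation (1) is not reproduced in this text, so the sketch does not implement it. Instead it assumes the mapping "destination pixel n copies source pixel ceil(n·A/B)" (1-based), chosen only because it reproduces the reference relationship described for FIG. 16. All function names are hypothetical, and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction
from math import ceil


def source_index(n, a, b):
    """1-based source pixel copied into 1-based destination pixel n (assumed rule)."""
    return ceil(Fraction(n) * a / b)


def added_positions(n_out, a, b):
    """Destination positions that repeat the source pixel of their predecessor."""
    return [n for n in range(2, n_out + 1)
            if source_index(n, a, b) == source_index(n - 1, a, b)]


def enlarge_by_copy(src, a, b):
    """Enlarge a 1-D pixel run from pitch a to pitch b by copying, not interpolating."""
    n_out = int(len(src) * b / a)
    return [src[source_index(n, a, b) - 1] for n in range(1, n_out + 1)]


def shrink_by_delete(dst, a, b):
    """Inverse conversion: delete exactly the pixels that enlargement added."""
    drop = set(added_positions(len(dst), a, b))
    return [value for n, value in enumerate(dst, start=1) if n not in drop]


A, B = Fraction(11, 2), Fraction(6)   # 5.5 and 6.0 pixels per lens pitch
before = list(range(1, 12))           # pixels 1 to 11, labelled by their number
after = enlarge_by_copy(before, A, B)
recovered = shrink_by_delete(after, A, B)
```

Under this assumed rule, `after` consists of pixels 1 to 11 followed by a copy of pixel 11 at position 12, and `recovered` equals `before`. The decoder needs only A and B to locate the added pixels, so no side information about their positions has to be transmitted.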

FIG. 17 is a schematic diagram showing a multi-viewpoint image group generated from an element image group whose size was converted according to this modification. In the illustrated example, the multi-viewpoint image group contains multi-viewpoint images from 64 viewpoints in total, 8 vertically by 8 horizontally. The element image group underlying the illustrated multi-viewpoint image group was size-converted in both the vertical and horizontal directions; that is, the size conversion added pixels in both directions. In the figure, the multi-viewpoint images that contain pixels added by the size conversion are enclosed in a frame. Specifically, of the multi-viewpoint images arranged in 8 rows by 8 columns, the images of all viewpoints in row 8 and the images of all viewpoints in column 8 contain pixels added by the size conversion. Even if the compression ratio of these viewpoint images is raised during encoding (that is, even if their code size is reduced), the added pixels are deleted again on the decoding side, so degradation of image quality can be suppressed.

In summary, Modification 3 provides the following effects.
The first effect is that, because the pixel values of the pixels before conversion are used as they are for the pixels after conversion, the size conversion requires little computation. That is, no arithmetic for interpolating pixel values is needed.
The second effect is that the image size conversion can be made reversible.
The third effect is that degradation of the image quality of the pixels added during the size conversion does not become degradation in the final decoded result. This is because the pixels added in the size conversion on the encoding side are deleted in the size conversion on the decoding side.

Furthermore, the information of the multi-viewpoint images that contain pixels added in the size conversion on the encoding side may be left out of the code. This can be done, for example, by having the multi-viewpoint image selection unit 54 shown in FIG. 9 not select the multi-viewpoint images that contain added pixels.

The functions of the element image group generation device, the coding device, the decoding device, and the mask image generation device in each of the embodiments and modifications described above may be realized by a computer.
In that case, a program for realizing the functions of each device may be recorded on a computer-readable recording medium, and the functions may be realized by having a computer system read and execute the program recorded on that medium. The term "computer system" here includes an OS and hardware such as peripheral devices. The term "computer-readable recording medium" refers to a portable medium such as a flexible disk, a magneto-optical disk, a ROM, or a CD-ROM, or to a storage device such as a hard disk built into a computer system. The "computer-readable recording medium" may also include a medium that holds the program dynamically for a short time, such as a communication line used when the program is transmitted over a network such as the Internet or over a communication line such as a telephone line, as well as a medium that holds the program for a certain period, such as volatile memory inside a computer system serving as the server or client in that case. The program may realize only part of the functions described above, or may realize those functions in combination with a program already recorded in the computer system.

The embodiments of the present invention have been described in detail above with reference to the drawings. However, the specific configuration is not limited to these embodiments and also includes designs and the like within a range that does not depart from the gist of the invention.

The present invention can be used, for example, in businesses that generate content using stereoscopic images (video), such as broadcasting and content distribution.

1, 2 Element image group generating device
3 Coding device
4 Decoding device
10, 11, 13, 14 Mask image storage unit
21 Multi-viewpoint image group acquisition unit
22 Image size conversion unit
23 Element image external component subtraction unit
31 Image conversion unit
32 Image size conversion unit
33 Element image group output unit
40 Mask image generating device
41 Three-dimensional model storage unit
42 Element image group generation unit
43 Image size conversion unit
44 Image conversion unit
45 Inversion unit
46 Mask image output unit
51 Element image group acquisition unit
52 Image size conversion unit
53 Image conversion unit
54 Multi-viewpoint image selection unit
55 Multi-viewpoint image coding unit
61 Multi-viewpoint image decoding unit
62 Insufficient multi-viewpoint image generation unit
63 Element image external component subtraction unit
64 Image conversion unit
65 Image size conversion unit
66 Element image group output unit

Claims

A coding device comprising:
an element image group acquisition unit that acquires an element image group consisting of element images;
an image conversion unit that converts the element image group acquired by the element image group acquisition unit into a multi-viewpoint image group;
a mask image storage unit that stores a mask image holding, as the pixel value of each pixel, the degree to which components other than the element images are contained in the corresponding pixel of the multi-viewpoint image group;
a multi-viewpoint image selection unit that refers to the pixel values of the mask image stored in the mask image storage unit, which indicate the degree to which components other than the element images are contained in the pixels of the multi-viewpoint image group, and that, for the image of each viewpoint included in the multi-viewpoint image group output from the image conversion unit, calculates the average of the pixel values in the area of the mask image corresponding to the area of that viewpoint's image, determines whether the average indicates that the components outside the element images are at or below a predetermined value, and selects and outputs only the images of viewpoints for which the components outside the element images are at or below the predetermined value; and
a multi-viewpoint image coding unit that encodes and outputs only those images of the multi-viewpoint image group that belong to the viewpoints selected by the multi-viewpoint image selection unit.
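The selection step recited in this claim can be illustrated with a minimal Python sketch. It assumes that the mask image is tiled in the same layout as the multi-viewpoint image group, that every viewpoint image has the same size, and that a larger mask value means more components from outside the element images. The function name `select_viewpoints` and its parameters are hypothetical and do not appear in the specification.

```python
from statistics import mean


def select_viewpoints(mask, view_h, view_w, threshold):
    """Return (row, col) indices of the viewpoint images to be encoded.

    mask      : 2D list of numbers, assumed to be tiled like the
                multi-viewpoint image group (hypothetical layout).
    view_h/w  : height and width of one viewpoint image in pixels.
    threshold : the "predetermined value" of the claim (assumed scale).
    """
    rows = len(mask) // view_h
    cols = len(mask[0]) // view_w
    selected = []
    for r in range(rows):
        for c in range(cols):
            # Mask pixels covering the area of viewpoint image (r, c).
            region = [
                mask[y][x]
                for y in range(r * view_h, (r + 1) * view_h)
                for x in range(c * view_w, (c + 1) * view_w)
            ]
            # Keep the viewpoint only if its mean mask value is at or
            # below the threshold.
            if mean(region) <= threshold:
                selected.append((r, c))
    return selected


# Example: a 4x4 mask holding four 2x2 viewpoint regions.
example_mask = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 100, 100],
    [0, 40, 100, 100],
]
example_selected = select_viewpoints(example_mask, 2, 2, 10)
```

Only the viewpoints returned by such a function would be passed to the multi-viewpoint image coding unit. The remaining viewpoints are simply not encoded.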

The coding device according to claim 1, further comprising:
an image size conversion unit that, for each element image included in the element image group acquired by the element image group acquisition unit, converts the image size by inserting or deleting pixels in at least one of the vertical and horizontal directions so that the pitch of the element images becomes an integral multiple of the pixel pitch,
wherein the image conversion unit converts the element image group whose size has been converted by the image size conversion unit into the multi-viewpoint image group.
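To illustrate why an integer element image pitch is useful, the sketch below shows one common way of rearranging an element image group into viewpoint images: the pixel at offset (v, u) inside every element image is gathered into the viewpoint image indexed (v, u). This mapping is an assumption made for illustration, since the claim does not fix how the conversion is performed. The function name `elements_to_views` and the square pitch `p` are hypothetical.

```python
def elements_to_views(group, p):
    """Rearrange an element image group into viewpoint images.

    group : 2D list of pixel values whose height and width are
            multiples of p (each element image is assumed p x p).
    p     : element image pitch in pixels (must be an integer).
    Returns a dict mapping (v, u) to a 2D list with one pixel per
    element image.
    """
    height, width = len(group), len(group[0])
    if height % p or width % p:
        raise ValueError("element image pitch must divide the image size")
    n_rows, n_cols = height // p, width // p
    return {
        (v, u): [
            [group[j * p + v][i * p + u] for i in range(n_cols)]
            for j in range(n_rows)
        ]
        for v in range(p)
        for u in range(p)
    }


# Example: 2x2 element images, each 2x2 pixels with values 1..4.
example_group = [
    [1, 2, 1, 2],
    [3, 4, 3, 4],
    [1, 2, 1, 2],
    [3, 4, 3, 4],
]
example_views = elements_to_views(example_group, 2)
```

With a non-integer pitch, the indices `j * p + v` and `i * p + u` would not land on whole pixels, which is one way to see why the size conversion comes before the conversion to viewpoint images.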

The coding device according to claim 2, wherein, when converting the image size, the image size conversion unit sets the pixel value of a pixel included in the element image before conversion as the pixel value of the pixel at the corresponding position in the element image after conversion.
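A size conversion that copies pixel values rather than interpolating them can be sketched as nearest-neighbour resampling. The following is a minimal illustration under that assumption. The function name `resize_element_image` and the choice of `floor(y * old / new)` as the "corresponding position" are hypothetical, and the specification may define the correspondence differently.

```python
def resize_element_image(img, new_h, new_w):
    """Resize one element image by duplicating or dropping pixels.

    Every output pixel takes the value of an existing input pixel, so
    no new (interpolated) pixel values are introduced.
    """
    old_h, old_w = len(img), len(img[0])
    return [
        [img[y * old_h // new_h][x * old_w // new_w] for x in range(new_w)]
        for y in range(new_h)
    ]


# Example: a 3x3 element image enlarged to 4x4 and reduced to 2x2.
example_img = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
example_enlarged = resize_element_image(example_img, 4, 4)
example_reduced = resize_element_image(example_img, 2, 2)
```

Enlarging repeats some rows and columns (pixel insertion), and reducing skips some (pixel deletion). In both cases the output contains only values that were already present in the input element image.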

A program for causing a computer to function as a coding device comprising:
an element image group acquisition unit that acquires an element image group consisting of element images;
an image conversion unit that converts the element image group acquired by the element image group acquisition unit into a multi-viewpoint image group;
a mask image storage unit that stores a mask image holding, as the pixel value of each pixel, the degree to which components other than the element images are contained in the corresponding pixel of the multi-viewpoint image group;
a multi-viewpoint image selection unit that refers to the pixel values of the mask image stored in the mask image storage unit, which indicate the degree to which components other than the element images are contained in the pixels of the multi-viewpoint image group, and that, for the image of each viewpoint included in the multi-viewpoint image group output from the image conversion unit, calculates the average of the pixel values in the area of the mask image corresponding to the area of that viewpoint's image, determines whether the average indicates that the components outside the element images are at or below a predetermined value, and selects and outputs only the images of viewpoints for which the components outside the element images are at or below the predetermined value; and
a multi-viewpoint image coding unit that encodes and outputs only those images of the multi-viewpoint image group that belong to the viewpoints selected by the multi-viewpoint image selection unit.