JP2017126264A

JP2017126264A - Information processor, information processing method and program

Info

Publication number: JP2017126264A
Application number: JP2016006228A
Authority: JP
Inventors: 檜垣　欣成; Kinsei Higaki; 欣成檜垣
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2016-01-15
Filing date: 2016-01-15
Publication date: 2017-07-20

Abstract

PROBLEM TO BE SOLVED: To provide an information processor capable of speeding up processing that searches for corresponding blocks between images without reducing accuracy. SOLUTION: The information processor acquires a plurality of image data. A projection operation is performed on each acquired image data to generate a vector, with a reduced number of dimensions, corresponding to a pixel to be processed. The similarity of the pixels to be processed is calculated using the generated vectors. SELECTED DRAWING: Figure 3

Description

The present invention relates to processing that searches a plurality of images for corresponding local regions.

Conventionally, there are techniques for estimating a parallax map or a distance map from a plurality of images with different viewpoints (a parallax image group). A parallax map or distance map estimated in this way is used, for example, to measure the three-dimensional shape of a subject. It may also be used to change the focus position, depth of field, viewpoint, illumination, and the like by image processing after shooting. This kind of post-capture adjustment by image processing is called computational photography and has been commercialized in some cameras.

Techniques for estimating a parallax map or distance map can be divided into methods that search for corresponding feature points between two images and methods that search for corresponding blocks. Feature-point search yields only sparse information, so it can play no more than a supplementary role in the applications described above. Block search, by contrast, can estimate dense distance information and is therefore applicable to those applications. With block search, the estimation accuracy depends heavily on the block size. For this reason, a commonly used approach runs the corresponding-block search several times on the input images with different block sizes. In addition, a larger block improves the accuracy of the correspondence but increases the amount of computation and the processing time. There is therefore a need for a technique that reduces the amount of computation and speeds up the search for corresponding blocks. Patent Document 1 describes reducing the amount of computation by converting the resolution of the input image according to the scene.

Patent Document 1: JP 2001-126065 A

Non-Patent Document 1: Dimitris Achlioptas, "Database-friendly Random Projections," Proceedings of the 20th ACM SIGMOD-SIGACT-SIGART Symposium, 2001, pp. 274-281
Non-Patent Document 2: Robert Calderbank, Stephen Howard, and Sina Jafarpour, "Construction of a Large Class of Deterministic Sensing Matrices That Satisfy a Statistical Isometry Property," IEEE Journal of Selected Topics in Signal Processing, 2010, Vol. 4, No. 2, pp. 358-374

However, because resolution conversion reduces the amount of information, the technique described in Patent Document 1 loses block-matching accuracy when the resolution conversion is not appropriate. This is because detailed features in the image disappear when its resolution is lowered. The technique is therefore subject to the constraint that the distance range to which each part of the input image belongs must be estimated accurately in advance.

An information processing apparatus according to the present invention comprises: acquisition means for acquiring a plurality of image data; generation means for performing a projection operation on each image data acquired by the acquisition means to generate a vector, with a reduced number of dimensions, corresponding to a pixel to be processed; and calculation means for calculating the similarity of the pixel to be processed using the vector generated by the generation means.

According to the present invention, the process of searching for corresponding blocks between images can be sped up without reducing accuracy.

FIG. 1 is a block diagram illustrating the configuration of the information processing apparatus, used in the description of Embodiment 1.
FIG. 2 is a block diagram illustrating the configuration of the information processing apparatus, used in the description of Embodiment 1.
FIG. 3 is a flowchart illustrating the flow of processing by the information processing apparatus, used in the description of Embodiment 1.
FIG. 4 is a diagram illustrating an example of the similarity, used in the description of Embodiment 1.
FIG. 5 is a diagram illustrating an example of the parallax map, used in the description of Embodiment 1.
FIG. 6 is a flowchart illustrating the flow of processing by the information processing apparatus, used in the description of Embodiment 2.
FIG. 7 is a diagram illustrating an example of the similarity, used in the description of Embodiment 2.
FIG. 8 is a diagram illustrating an example of the parallax map, used in the description of Embodiment 2.
FIG. 9 is a schematic diagram explaining the processing, used in the description of Embodiments 1 and 2.

<Embodiment 1>
The first embodiment describes an example in which the corresponding-block search is sped up by applying dimension compression to each local region (hereinafter, a block) of the image data. First, the configuration of the information processing apparatus according to the first embodiment is described.

FIG. 1 is a diagram illustrating an example of the configuration of the information processing apparatus according to the first embodiment. The information processing apparatus 100 of the first embodiment (hereinafter, the processing apparatus 100) includes a CPU 101, a RAM 102, a ROM 103, a secondary storage device 104, an input interface 105, and an output interface 106. These components are connected to one another via a system bus 107. The processing apparatus 100 is connected to an external storage device 108 and an operation unit 110 via the input interface 105, and to the external storage device 108 and a display device 109 via the output interface 106.

The CPU 101 is a processor that executes programs stored in the ROM 103, using the RAM 102 as work memory, and centrally controls each component of the processing apparatus 100 via the system bus 107. The various processes described later are executed in this way. The secondary storage device 104 stores the various data handled by the processing apparatus 100; an HDD is used in this embodiment. The CPU 101 can write data to and read data from the secondary storage device 104 via the system bus 107. Besides an HDD, various storage devices such as an optical disk drive or flash memory can be used as the secondary storage device 104.

The input interface 105 is a serial bus interface such as USB or IEEE 1394, and data, commands, and the like are input from external devices to the processing apparatus 100 through it. Via the input interface 105, the processing apparatus 100 acquires data from the external storage device 108 (for example, a storage medium such as a hard disk, memory card, CF card, SD card, or USB memory). Also via the input interface 105, the processing apparatus 100 acquires user commands entered with the operation unit 110. The operation unit 110 is an input device such as a mouse or keyboard and is used to input the user's instructions to the processing apparatus 100.

Like the input interface 105, the output interface 106 includes a serial bus interface such as USB or IEEE 1394. A video output terminal such as DVI or HDMI (registered trademark) may also be used. Data and the like are output from the processing apparatus 100 to external devices via the output interface 106. The processing apparatus 100 displays images by outputting processed images and the like to the display device 109 (any of various image display devices, such as a liquid crystal display) via the output interface 106. The components of the processing apparatus 100 are not limited to those above, and other components may be included, but their description is omitted here. Although an information processing apparatus such as a PC has been described as an example, part of the configuration shown in FIG. 1 may instead be connected over a network.

The processing performed by the processing apparatus 100 of the first embodiment is described below with reference to the functional block diagram in FIG. 2, the flowchart in FIG. 3, and the schematic diagram in FIG. 9(a). As shown in FIG. 2, the processing apparatus 100 functions as a data acquisition unit 201, a projection calculation unit 202, a collation unit 203, and a correspondence determination unit 204. The CPU 101 implements the functions of these units by reading and executing a control program stored in the ROM 103. Alternatively, the processing apparatus 100 may be configured with dedicated processing circuits corresponding to the respective units. The flow of processing performed by each unit is described below.

In step S301, the data acquisition unit 201 acquires the image data to be processed via the input interface 105 or from the secondary storage device 104. The acquired image data is a plurality of image data, each corresponding to one of a plurality of images. The images include, for example, images from a plurality of different viewpoints captured by one or more cameras. As the imaging means, a multi-lens camera in which several small cameras are arranged side by side may be used, or a plenoptic camera that captures images from multiple viewpoints simultaneously by means of a built-in microlens array. The processing apparatus 100 determines one of the acquired images as a base image 901. The base image 901 is the acquired image for which information such as a parallax map is to be estimated; it may be set in advance by the user or determined automatically by the processing apparatus 100. Hereinafter, one of the input images other than the base image is referred to as a reference image 906.

In step S302, the data acquisition unit 201 extracts one block 902 corresponding to one pixel of the base image 901 (hereinafter, the target pixel) and outputs it to the projection calculation unit 202. A block consists of a plurality of pixels. The block 902 may be a rectangular region centered on the target pixel.

In step S303, the projection calculation unit 202 performs a projection operation on the block 902 of the base image 901 input from the data acquisition unit 201. One such projection operation multiplies a vector 903 (a column vector), obtained by reshaping the block 902, by a wide (horizontally long) projection matrix 904 described later. The projection operation produces, from the vector 903 corresponding to the block 902, a vector 905 of lower dimension (with fewer elements). For example, as shown in FIG. 9(a), multiplying a block of N pixels by a matrix of size M × N produces an M-dimensional vector 905. Because the purpose of the projection operation is to reduce the data size, M < N is assumed.
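As a minimal sketch of the projection in step S303, the following reshapes a 5 × 5 block into a 25-element vector and multiplies it by a 10 × 25 matrix. The function names, the block size, the random pixel values, and the use of ±1 entries for the matrix are illustrative assumptions, not details fixed by this step.

```python
import random

def flatten_block(block):
    """Turn a 2-D block (list of rows) into a flat list of N pixel values."""
    return [value for row in block for value in row]

def project(matrix, vector):
    """Multiply an M x N matrix by an N-element vector, giving M elements."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

rng = random.Random(0)   # fixed seed so the sketch is repeatable
N, M = 25, 10            # assumed sizes: 5 x 5 block reduced to 10 dimensions

# Hypothetical input: a 5 x 5 block of 8-bit pixel values.
block = [[rng.randrange(256) for _ in range(5)] for _ in range(5)]

# Placeholder projection matrix with entries +1 or -1 (an assumption here).
matrix = [[rng.choice((-1, 1)) for _ in range(N)] for _ in range(M)]

vector_903 = flatten_block(block)         # N-dimensional vector
vector_905 = project(matrix, vector_903)  # M-dimensional projected vector
```

Each similarity evaluation then operates on M values instead of N.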

One condition that the projection matrix should satisfy is given by the Johnson-Lindenstrauss lemma, as described in Non-Patent Document 1. According to this lemma, it is mathematically guaranteed that when the difference between any two vectors is multiplied by a projection matrix with M < N that satisfies certain conditions, the norm (length) of the difference is almost completely preserved even though the dimension is reduced. The information needed to calculate the similarity of two vectors is closely related to the norm of their difference, because calculating the similarity involves operations such as taking that difference. Consequently, even if the block is converted to a lower-dimensional vector by multiplying it by such a projection matrix, the information the block carries for similarity calculation is not lost. In this embodiment, the size of the data used to calculate the similarity can thus be reduced without losing information, so the amount of computation in the corresponding-block search can be reduced while maintaining accuracy.
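For reference, the guarantee invoked here is commonly stated as follows. This is the textbook form of the lemma rather than a formula taken from this document; the symbols ε, n, and P are notation introduced only for this statement.

```latex
\text{For any } 0<\varepsilon<1 \text{ and any } n \text{ points in } \mathbb{R}^{N},
\text{ there is a linear map } P:\mathbb{R}^{N}\to\mathbb{R}^{M}
\text{ with } M = O\!\left(\varepsilon^{-2}\log n\right)
\text{ such that, for every pair of points } u, v,
\qquad
(1-\varepsilon)\,\lVert u-v\rVert_2^{2}
\;\le\; \lVert Pu-Pv\rVert_2^{2}
\;\le\; (1+\varepsilon)\,\lVert u-v\rVert_2^{2}.
```

For a random matrix R with independent ±1 entries, the scaled map P = R/√M satisfies this bound with high probability. The scale factor is the same for every block, so it does not change which candidate block is judged most similar.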

Section 1.1 of Non-Patent Document 1 gives specific examples of projection matrices that preserve the norm of the difference between any two vectors even when the dimension is reduced. One is a binary matrix in which each element independently takes the value +1 or -1 at random, each with probability 1/2. A sparse matrix in which each element independently takes +√3 with probability 1/6, -√3 with probability 1/6, and 0 with probability 2/3 is known to have the same property. Projection matrices satisfying this norm-preserving condition are not limited to these; a circulant matrix generated from a random vector may also be used. Furthermore, the projection matrix need not be random: it may be a uniquely determined matrix such as the discrete chirp matrices, Delsarte-Goethals codes, or BCH codes described in Section 2 of Non-Patent Document 2.
Note that it is computationally preferable to use the same projection matrix for every block. In this embodiment, therefore, a matrix with M rows and N columns (M < N) determined by any of random numbers, a discrete chirp matrix, a Delsarte-Goethals code, or a BCH code can be used. In such a matrix, values of equal magnitude and opposite sign may occur in approximately equal numbers. Furthermore, the random numbers may be values rounded to discrete values.
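The three random constructions mentioned above can be sketched as follows. The function names are hypothetical, and the circulant variant (the first M cyclic shifts of a random ±1 vector) is one assumed reading of "a circulant matrix generated from a random vector"; the deterministic constructions of Non-Patent Document 2 are not attempted here.

```python
import math
import random

def sign_matrix(m, n, rng):
    """Each entry is +1 or -1, independently, with probability 1/2 each."""
    return [[rng.choice((-1.0, 1.0)) for _ in range(n)] for _ in range(m)]

def sparse_matrix(m, n, rng):
    """Each entry is +sqrt(3) (prob. 1/6), 0 (prob. 2/3) or -sqrt(3) (prob. 1/6)."""
    root3 = math.sqrt(3.0)
    values = (root3, 0.0, -root3)
    return [rng.choices(values, weights=(1, 4, 1), k=n) for _ in range(m)]

def circulant_matrix(m, n, rng):
    """First m rows of the circulant matrix built from one random +/-1 vector.

    Row i is the first row cyclically shifted right by i positions.
    """
    first = [rng.choice((-1.0, 1.0)) for _ in range(n)]
    return [first[n - i:] + first[:n - i] for i in range(m)]

rng = random.Random(0)                      # fixed seed for repeatability
dense = sign_matrix(10, 25, rng)
sparse = sparse_matrix(10, 25, rng)
circulant = circulant_matrix(10, 25, rng)
```

In the sparse variant roughly two thirds of the entries are zero, so the corresponding multiplications can be skipped.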

In step S304, the data acquisition unit 201 extracts, from the reference image 906, a plurality of blocks 907 each corresponding to a pixel at a specific position. The positions of the blocks 907 may be obtained by shifting the position of the block 902 by each of a plurality of predetermined relative positions. These relative positions are the candidate parallax values; they may be a fixed set of values independent of the position of the block 902, or a set that differs for each position of the block 902.
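A possible sketch of the candidate extraction in step S304 is shown below. It assumes purely horizontal candidate offsets and a shift direction of `cx + d`; both are illustrative choices, as are the function names and the toy image.

```python
def extract_block(image, cx, cy, half):
    """Return the (2*half+1)^2 pixels centred on (cx, cy), flattened row by row."""
    return [image[y][x]
            for y in range(cy - half, cy + half + 1)
            for x in range(cx - half, cx + half + 1)]

def candidate_blocks(reference, cx, cy, half, disparities):
    """Map each candidate disparity d to the reference block centred on (cx + d, cy).

    Candidates whose block would cross the left or right image border are skipped.
    The caller is assumed to keep cy at least `half` pixels from the top and bottom.
    """
    width = len(reference[0])
    blocks = {}
    for d in disparities:
        x = cx + d
        if half <= x < width - half:
            blocks[d] = extract_block(reference, x, cy, half)
    return blocks

# Toy 5 x 10 image whose pixel value encodes its position (10 * row + column).
image = [[10 * y + x for x in range(10)] for y in range(5)]
blocks_907 = candidate_blocks(image, cx=4, cy=2, half=1, disparities=[0, 2, 10])
```

Here the candidate with offset 10 falls outside the image and is dropped, leaving two candidate blocks.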

In step S305, as in step S303, the projection calculation unit 202 performs the projection operation on the vectors 908 obtained by reshaping each of the blocks 907, generating a plurality of low-dimensional vectors 909. In this embodiment, the plurality of blocks corresponding to the block 902 of the base image are extracted first and the projection operation is then applied to each of them, but the processing is not limited to this order. For example, a single block 907 corresponding to the block 902 of the base image may be extracted and projected, and this processing repeated while changing the position of the block 907.

Note that the projection matrix used in the projection operation of step S305 is the same as the one used in step S303. That is, the same projection matrix must be used within a first set consisting of a block of the base image and the plurality of blocks of the reference image that are compared with it in the similarity calculation described later; for other sets, a projection matrix different from the one used for the first set may be used.

In step S306, the collation unit 203 calculates the similarity between the vector 905 and the vectors 909, that is, between the vector 905 projected in step S303 and each of the plurality of vectors 909 projected in step S305. The similarity may be calculated with a commonly known measure such as the sum of squared differences, the sum of absolute differences, or normalized cross-correlation, but is not limited to these. Conventional block matching calculates the similarity between the unprojected vectors 903 and 908, and the amount of computation grows with the block size. The method described in this embodiment instead calculates the similarity between vectors of reduced dimension, which reduces the amount of computation.
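The three measures named in step S306 can be written compactly as below. This is a sketch with assumed function names; the normalized cross-correlation is given in its plain form without mean subtraction, which is one common variant and is not confirmed by the text.

```python
import math

def ssd(a, b):
    """Sum of squared differences: 0 for identical vectors, larger means less similar."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def sad(a, b):
    """Sum of absolute differences: 0 for identical vectors, larger means less similar."""
    return sum(abs(p - q) for p, q in zip(a, b))

def ncc(a, b):
    """Normalized cross-correlation (no mean subtraction): larger means more similar."""
    numerator = sum(p * q for p, q in zip(a, b))
    denominator = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
    return numerator / denominator if denominator else 0.0
```

Note that SSD and SAD are dissimilarities, so the best match minimizes them, whereas the best match maximizes NCC. The cost of each call is proportional to the vector length, which is M after projection rather than N.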

In step S307, the correspondence determination unit 204 estimates the parallax at the position of the block 902 of the base image 901, based on the similarities calculated in step S306 and the positions of the blocks 907 of the reference image relative to the block 902 of the base image. In the simplest method, the distance d between the block 902 and the block 907 with the highest similarity is taken as the parallax value. Alternatively, the similarity is regarded as a function of the distance between the block 902 and the block 907, the distance at which this function takes its maximum is found by function fitting, and that distance is taken as the parallax value. The method of estimating the parallax from the similarity is not limited to these, and any known technique may be used. The parallax value may also be converted to a distance value as necessary.
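Both estimation methods in step S307 can be sketched as follows. The text does not specify the fitting function, so the three-point parabola used here is an assumption, as are the function names and the sample similarity values.

```python
def best_disparity(scores):
    """Return the candidate disparity with the highest similarity.

    `scores` maps each integer disparity d to its similarity value.
    """
    return max(scores, key=scores.get)

def refine_parabola(scores, d):
    """Refine d by fitting a parabola through the scores at d-1, d and d+1.

    Falls back to d itself when a neighbour is missing or the three
    points lie on a straight line.
    """
    if d - 1 not in scores or d + 1 not in scores:
        return float(d)
    left, mid, right = scores[d - 1], scores[d], scores[d + 1]
    curvature = left - 2.0 * mid + right
    if curvature == 0:
        return float(d)
    return d + 0.5 * (left - right) / curvature

# Hypothetical similarity values sampled from 1 - (d - 3.25)^2.
scores = {2: -0.5625, 3: 0.9375, 4: 0.4375}
coarse = best_disparity(scores)
fine = refine_parabola(scores, coarse)
```

The coarse estimate is restricted to the candidate values, while the fitted estimate can fall between them.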

In step S308, the processing apparatus 100 determines whether the blocks 902 corresponding to all target pixels of the base image 901 for which the parallax is to be calculated have been processed. If not, the process returns to step S302, selects the block 902 corresponding to an unprocessed target pixel, and performs steps S302 to S307.

The above is the processing performed by the processing apparatus 100 of the first embodiment; it is shown schematically in FIG. 9(a). With this processing, the number of dimensions of the blocks in the images can be reduced without reducing the amount of information, so the parallax map can be generated at high speed.
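Steps S302 to S308 can be put together in a small end-to-end sketch. Everything concrete here is assumed for illustration: the synthetic images (the reference is the base shifted right by 3 pixels), the ±1 matrix, the horizontal-only search with position `cx + d`, and the use of the sum of squared differences as the measure.

```python
import random

def block_vector(image, cx, cy, half):
    """Flatten the square block centred on (cx, cy) into a list."""
    return [image[y][x]
            for y in range(cy - half, cy + half + 1)
            for x in range(cx - half, cx + half + 1)]

def project(matrix, vector):
    """Multiply an M x N matrix by an N-element vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def ssd(a, b):
    """Sum of squared differences (smaller means more similar)."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def disparity_map(base, reference, matrix, half, disparities):
    """Estimate one disparity per interior pixel of the base image."""
    height, width = len(base), len(base[0])
    result = {}
    for cy in range(half, height - half):            # S308: loop over target pixels
        for cx in range(half, width - half):
            low_base = project(matrix, block_vector(base, cx, cy, half))  # S302-S303
            best_d, best_cost = None, None
            for d in disparities:                    # S304-S306: each candidate
                x = cx + d
                if not (half <= x < width - half):
                    continue
                low_ref = project(matrix, block_vector(reference, x, cy, half))
                cost = ssd(low_base, low_ref)
                if best_cost is None or cost < best_cost:
                    best_d, best_cost = d, cost
            result[(cx, cy)] = best_d                # S307: keep the best candidate
    return result

rng = random.Random(1)
height, width, half, shift = 9, 24, 2, 3
base = [[rng.randrange(256) for _ in range(width)] for _ in range(height)]
# Synthetic reference image: the base image shifted right by `shift` pixels.
reference = [[base[y][x - shift] if x >= shift else 0 for x in range(width)]
             for y in range(height)]
matrix = [[rng.choice((-1, 1)) for _ in range(25)] for _ in range(10)]
dmap = disparity_map(base, reference, matrix, half, range(0, 6))
```

For clarity the sketch re-projects reference blocks for every target pixel; an implementation concerned with speed would compute each projected vector once and reuse it.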

To illustrate the effect of this embodiment, an example in which the above processing was actually applied to image data is shown below.

Two stereo images having parallax only in the horizontal direction are used as the input image data. The block size is 5 × 5 pixels, and the normalized cross-correlation (NCC) shown in Expression (1) is calculated as the similarity of each pair of blocks, for each coordinate (x₀, y₀) of the base image and each estimated parallax d.

Here, F(x, y) is the pixel value at coordinates (x, y) of the base image, G(x, y) is the pixel value at coordinates (x, y) of the reference image, and B is the set of coordinates in the block centered on (x₀, y₀). With a block size of 5 × 5 pixels, B is a set of 25 coordinates when no projection operation is performed. When the projection operation is performed, B is a set of 10 coordinates; the matrix by which the vector corresponding to the block is multiplied has size 10 × 25, and each element independently takes the value 0 or 1 with probability 1/2. The projection operation thus reduces the dimension of the vector by 60%.
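Expression (1) itself does not appear in this text. A standard normalized cross-correlation consistent with the definitions of F, G, and B given above would take the following form. This is an assumed reconstruction: whether the original subtracts the block means, and whether the shift is applied as x + d or x − d, cannot be determined from the surrounding description.

```latex
\mathrm{NCC}(x_0, y_0, d) =
\frac{\displaystyle\sum_{(x,y)\in B} F(x,y)\, G(x+d,\,y)}
     {\sqrt{\displaystyle\sum_{(x,y)\in B} F(x,y)^{2}\;
            \displaystyle\sum_{(x,y)\in B} G(x+d,\,y)^{2}}}
```

In the projected case, F and G would be read as the elements of the 10-dimensional projected vectors rather than raw pixel values, in line with the statement that B then contains 10 coordinates.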

For each coordinate (x₀, y₀), the value of d that maximizes the NCC is searched for and is taken as the value of the parallax map at (x₀, y₀). FIG. 4 shows the relationship between NCC and d at one coordinate (x₀, y₀). FIG. 4(a) shows the case without the projection operation and FIG. 4(b) the case with it; the triangle (△) in each figure marks the point of maximum NCC. The value of d at which the NCC is maximal is the same with and without the projection operation, and the dependence of the NCC on d is similar in both cases. FIG. 5 shows the result of applying a 5 × 5 pixel median filter to the calculated parallax maps. FIG. 5(a) shows the case without the projection operation, FIG. 5(b) the case with it, and FIG. 5(c) the distribution obtained by subtracting FIG. 5(a) from FIG. 5(b). Except at positions where accurate parallax estimation is difficult in principle, such as near the image edges and object contours, the difference between the two is almost zero. Thus, even when the dimension of the vectors used to calculate the similarity is reduced, the block information is preserved as long as the projection matrix satisfies the predetermined condition, so the result of the parallax estimation is not substantially affected.

The above is the processing of the first embodiment. With this processing, the amount of computation for the similarity calculation can be reduced without impairing the estimation accuracy of the parallax map.

<Embodiment 2>
In the method of the first embodiment, blocks are extracted from the image data one at a time, vectorized, and multiplied by a matrix, which is computationally inefficient. In the second embodiment, a plurality of different kernels of the same size are therefore generated in the same way as the projection matrix described above. Each generated kernel is convolved with the base image and with the reference image to calculate a plurality of projected images for each, and the similarity is calculated using these projected images. Because two-dimensional convolution can be computed efficiently with the fast Fourier transform, this method can be faster than multiplying each block by a matrix.

This embodiment is the same as the first embodiment except that, instead of the low-dimensional vector obtained by projecting the vector corresponding to a block, a vector formed by arranging the pixel values at the same coordinates in the plurality of projected images obtained as described above is used to calculate the similarity.
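The relationship between the two embodiments can be illustrated with a short sketch: filtering the whole image with a kernel gives, at every pixel, the same value as multiplying that pixel's block by one row of a projection matrix. The code below uses a direct sliding dot product (cross-correlation) rather than an FFT, and the function names, image size, and ±1 kernel entries are assumptions. A true convolution produces the same output if each kernel is flipped in both axes.

```python
import random

def correlate_valid(image, kernel):
    """Sliding dot product of `kernel` over `image`, without padding.

    Output pixel (x, y) is computed from the block whose top-left corner is (x, y).
    """
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(image), len(image[0])
    return [[sum(kernel[j][i] * image[y + j][x + i]
                 for j in range(kh) for i in range(kw))
             for x in range(w - kw + 1)]
            for y in range(h - kh + 1)]

def pixel_vector(projected_images, x, y):
    """Stack the values at the same coordinates of all projected images."""
    return [projected[y][x] for projected in projected_images]

rng = random.Random(2)
size, count = 5, 10   # assumed: ten 5 x 5 kernels, i.e. M = 10 and N = 25
kernels = [[[rng.choice((-1, 1)) for _ in range(size)] for _ in range(size)]
           for _ in range(count)]
image = [[rng.randrange(256) for _ in range(12)] for _ in range(8)]

# One projected image per kernel, then one M-dimensional vector per pixel.
projected_images = [correlate_valid(image, kernel) for kernel in kernels]
vector_at_3_2 = pixel_vector(projected_images, 3, 2)
```

Each kernel plays the role of one matrix row, so the vector stacked at a pixel equals the projected vector that the first embodiment would compute for the block at that position.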

The processing performed by the processing apparatus 100 of the first embodiment is described below with reference to the functional block diagram in FIG. 2, the flowchart in FIG. 6, and the schematic diagram in FIG. 9(b).

In step S601, as in step S301, the data acquisition unit 201 acquires a plurality of image data to be processed, corresponding to a plurality of images, via the input interface 105 or from the secondary storage device 104. The processing apparatus 100 also designates one of the images as the base image 901 and another as the reference image 906.

In step S602, the projection calculation unit 202 convolves a plurality of kernels 910 with the base image 901, generates a projected image group 911 consisting of one projected image per kernel, and outputs it to the collation unit 203. Likewise, it convolves the same plurality of kernels 910 with the reference image 906, generates a projected image group 913 consisting of one projected image per kernel, and outputs it to the collation unit 203. The element values of each kernel used for the convolution are generated by the same method as the projection matrix described in the first embodiment. Step S602 thus produces projected image groups 911 and 913, each containing as many projected images as there are kernels. The projection calculation unit 202 applies the kernels to the base image 901 and to the reference image 906 in the same order and outputs the projected images to the collation unit 203 in the order in which they were generated. The collation unit 203 processes the projected image groups in a way that preserves the correspondence between the projected image of the base image 901 and the projected image of the reference image 906 obtained with the same kernel. For example, the collation unit 203 acquires the projected images in the order in which the projection calculation unit 202 output them.
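As a minimal sketch of step S602, the convolution with a plurality of kernels can be written with NumPy's FFT routines. The function name project_images, the zero padding at the image border, the "same"-size cropping, and the random ±1 kernel values are assumptions of this sketch; the embodiment does not specify them. Stacking the outputs in kernel order keeps the correspondence between the two projected image groups.

```python
import numpy as np

def project_images(image, kernels):
    """Convolve `image` (H, W) with each kernel (M, kh, kw) via FFT.

    Returns a stack of M projected images of shape (M, H, W), ordered like
    `kernels`. Borders are zero-padded (an assumption of this sketch).
    """
    h, w = image.shape
    _, kh, kw = kernels.shape
    # Pad to the full linear-convolution size to avoid circular wrap-around.
    fh, fw = h + kh - 1, w + kw - 1
    f_img = np.fft.rfft2(image, s=(fh, fw))
    f_ker = np.fft.rfft2(kernels, s=(fh, fw), axes=(-2, -1))
    full = np.fft.irfft2(f_img * f_ker, s=(fh, fw), axes=(-2, -1))
    # Crop the centre so every projected image has the input size.
    top, left = (kh - 1) // 2, (kw - 1) // 2
    return full[:, top:top + h, left:left + w]

rng = np.random.default_rng(0)
base_image = rng.random((16, 20))                  # stands in for image 901
ref_image = rng.random((16, 20))                   # stands in for image 906
kernels = rng.choice([-1.0, 1.0], size=(4, 3, 3))  # hypothetical kernels 910

# The same kernels in the same order are applied to both images, so index m
# of each stack refers to the same kernel.
group_911 = project_images(base_image, kernels)
group_913 = project_images(ref_image, kernels)
print(group_911.shape, group_913.shape)
```

Because both groups are produced by one function from one kernel array, element m of a vector taken from either group always originates from kernel m.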

In step S603, the collation unit 203 selects one pixel position at which the pixel group 912 is to be extracted from the projected image group 911 of the base image. That is, it selects the pixel position of the target pixel in the projected image group 911 of the base image.

In step S604, the collation unit 203 extracts from the projected image group 911 of the base image the pixel group 912 located at the pixel position of the target pixel, and generates a vector 905 from it.

In step S605, the collation unit 203 extracts from the projected image group 913 of the reference image a plurality of pixel groups 914, each corresponding to a specific position, and generates a plurality of vectors 909. The positions of the pixel groups 914 may be positions displaced from the position of the pixel group 912 by a plurality of predetermined relative positions. The relative positions between the pixel group 912 and the pixel groups 914 may thus be the same as the relative positions described for step S304.

As described above, the collation unit 203 processes the projected image groups in a way that preserves the correspondence between the projected images of the base image 901 and of the reference image 906 obtained with the same kernel. Consequently, corresponding elements of the generated vector 905 and vectors 909 are based on projected images generated with the same kernel.

In step S606, the collation unit 203 calculates the similarity between the vector 905 and each vector 909. Processing in this way requires less computation than calculating the similarity directly from blocks of the same size as the kernel.
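Steps S604 to S606 can be illustrated with the following sketch. It assumes that the projected image groups are stored as arrays of shape (M, H, W), that the candidate positions lie along the horizontal direction only, and that the similarity is a zero-mean normalized cross-correlation; these choices, the function name ncc, and the synthetic data are assumptions and not details stated in the embodiment.

```python
import numpy as np

def ncc(u, v, eps=1e-12):
    """Zero-mean normalized cross-correlation of two vectors (one common definition)."""
    u = u - u.mean()
    v = v - v.mean()
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + eps))

rng = np.random.default_rng(1)
# Synthetic stand-ins for the projected image groups 911 and 913 (M = 8 kernels).
base_group = rng.standard_normal((8, 20, 30))
ref_group = np.roll(base_group, shift=3, axis=2)   # reference shifted by 3 pixels

y, x = 10, 12                                      # target pixel position (S603)
vec_905 = base_group[:, y, x]                      # S604: one value per kernel

candidates = range(0, 6)                           # assumed relative positions
vecs_909 = [ref_group[:, y, x + d] for d in candidates]   # S605

scores = [ncc(vec_905, v) for v in vecs_909]       # S606
print(int(np.argmax(scores)))
```

Each similarity here is computed from M values rather than from all pixels of a kernel-sized block, which is where the reduction in computation comes from.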

In step S607, as in step S307, the correspondence determination unit 204 estimates the parallax at the selected pixel position of the base image based on the similarities calculated in step S606 and the distances between the pixel positions of the pixel group 912 and the pixel groups 914. In the simplest method, the distance between the pixel positions of the pixel group 912 and the pixel group 914 that gives the maximum similarity is taken as the parallax value. Alternatively, the similarity is regarded as a function of the distance between the pixel positions of the pixel group 912 and the pixel group 914, the distance at which this function attains its local maximum is calculated by function fitting, and that distance is taken as the parallax value.
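The two estimation methods of step S607 can be sketched as follows. The embodiment does not name the fitted function; a parabola through the three samples around the peak is assumed here as one simple choice, and the function name estimate_disparity and the sample data are hypothetical.

```python
import numpy as np

def estimate_disparity(distances, scores):
    """Return (simple, refined) parallax estimates.

    simple  : distance with the maximum similarity.
    refined : vertex of a parabola fitted to the three samples around the
              maximum (assumed fitting function); falls back to `simple`
              when the peak is at either end or the fit is not concave.
    """
    distances = np.asarray(distances, dtype=float)
    scores = np.asarray(scores, dtype=float)
    i = int(np.argmax(scores))
    simple = float(distances[i])
    if i == 0 or i == len(scores) - 1:
        return simple, simple
    a, b, _ = np.polyfit(distances[i - 1:i + 2], scores[i - 1:i + 2], 2)
    if a >= 0:
        return simple, simple
    return simple, float(-b / (2.0 * a))

# Synthetic similarity curve whose true peak lies between two sampled distances.
distances = np.arange(7)
scores = 1.0 - 0.05 * (distances - 3.3) ** 2
simple, refined = estimate_disparity(distances, scores)
print(simple, round(refined, 3))
```

The fitted estimate can fall between the sampled distances, which is the reason for treating the similarity as a function of distance rather than only taking the best sample.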

In step S608, it is determined whether all pixels of the base image for which parallax is to be calculated have been processed. If not, the process returns to step S603, a pixel position corresponding to an unprocessed pixel is selected, and steps S603 to S607 are performed.

The above is the processing performed by the processing apparatus 100 in the second embodiment. With this processing, the blocks in the image can be reduced without losing the necessary information, so the parallax map can be generated at high speed.
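The overall flow of steps S601 to S608 can be sketched end to end as below. This is a sketch under several assumptions: the search is horizontal only, the kernels are random ±1 values, the similarity is a zero-mean normalized cross-correlation, and the per-pixel loop of steps S603 to S608 is replaced by an equivalent loop over candidate distances that handles all pixels at once. The function names project and parallax_map are hypothetical.

```python
import numpy as np

def project(image, kernels):
    """Slide each kernel over the image (valid region only); returns (M, H', W')."""
    k = kernels.shape[-1]
    windows = np.lib.stride_tricks.sliding_window_view(image, (k, k))
    return np.einsum("yxij,mij->myx", windows, kernels)

def normalise(group, eps=1e-12):
    """Zero-mean, unit-length vectors along the kernel axis, so NCC is a dot product."""
    g = group - group.mean(axis=0, keepdims=True)
    return g / (np.linalg.norm(g, axis=0, keepdims=True) + eps)

def parallax_map(base, ref, kernels, max_d):
    nb = normalise(project(base, kernels))       # S602, base image
    nr = normalise(project(ref, kernels))        # S602, reference image
    _, h, w = nb.shape
    best_score = np.full((h, w), -np.inf)
    best_d = np.zeros((h, w), dtype=int)
    for d in range(max_d + 1):                   # candidate relative positions
        score = np.full((h, w), -np.inf)
        # Base pixel x is compared with reference pixel x + d (S604 to S606).
        score[:, :w - d] = np.sum(nb[:, :, :w - d] * nr[:, :, d:], axis=0)
        better = score > best_score              # S607: keep the maximum similarity
        best_score[better] = score[better]
        best_d[better] = d
    return best_d

rng = np.random.default_rng(2)
base = rng.standard_normal((24, 40))
ref = np.roll(base, 4, axis=1)                   # synthetic 4-pixel shift
kernels = rng.choice([-1.0, 1.0], size=(6, 5, 5))  # M = 6 kernels of N = 25 elements
disp = parallax_map(base, ref, kernels, max_d=8)
print(disp[10, 10])
```

In this synthetic case every base pixel whose true match lies inside the reference image is assigned the shift used to create the reference image.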

When NCC is used as the similarity and the input image has n pixels, the computational cost without the projection operation is of order O(Nn). With the projection operation of this embodiment, two computations are performed, namely convolution using the fast Fourier transform and NCC calculation using the reduced blocks, so the cost is O(n log n) + O(Mn). Comparing the two, the ratio is roughly N to M, which shows that the smaller M is, the faster the method of this embodiment becomes. In other words, similarity calculation using vectors is faster the smaller the number of dimensions M. The number of dimensions of the vectors in this embodiment corresponds to the number of kernels. The total number of kernels used in this embodiment is therefore assumed to be smaller than the number of pixels contained in a kernel.
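As a rough numerical illustration of these orders, the following sketch evaluates N·n and n·log2(n) + M·n for one hypothetical setting. The image size, block size, and number of kernels are assumed values, and constant factors hidden by the O notation are ignored, so the result indicates a trend rather than a measured speed-up.

```python
import math

def op_counts(n, N, M):
    """Order-of-magnitude operation counts, ignoring constant factors."""
    direct = N * n                          # O(Nn): full blocks of N pixels
    projected = n * math.log2(n) + M * n    # O(n log n) + O(Mn)
    return direct, projected

n = 640 * 480      # assumed number of pixels in the input image
N = 15 * 15        # assumed number of pixels in a block (and in a kernel)
M = 16             # assumed number of kernels, M < N

direct, projected = op_counts(n, N, M)
print(direct, round(projected), round(direct / projected, 2))
```

With these assumed values the n log n term is comparable to the Mn term, so the ratio stays below N/M; it approaches N/M as the Mn term comes to dominate.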

FIG. 7 shows the relationship between NCC and d at the same single coordinate (x_0, y_0) as in FIG. 4. Compared with FIG. 4A, the value of d at which NCC is maximal is the same, and the dependence of NCC on d is similar. FIG. 8 shows the result of applying a 5 × 5 pixel median filter to the calculated parallax map. FIG. 8A is for the case where no projection operation is performed, and FIG. 8B is the distribution obtained by subtracting FIG. 5A from FIG. 8A. As in the first embodiment, the difference between the two is almost zero except at positions where estimation is difficult. As one example, the computation time required to calculate the parallax map was 6.4 seconds for FIG. 5A and 3.7 seconds for FIG. 8A.

The above is the processing of the second embodiment. This processing speeds up the computation without impairing the estimation accuracy of the parallax map.

<Other embodiments>
The embodiments above were described using, as an example, processing that searches for corresponding local regions (partial regions) in two captured images with different shooting conditions. The processing is not limited to captured images, however; it may be any processing that searches for corresponding local regions between two images, regardless of how the images to be processed were obtained. Likewise, although processing that searches for local regions between two images was described as an example, the number of images to be processed is not limited to two, and the processing may search for local regions across a plurality of images.

The present invention can also be realized by supplying a program that implements one or more functions of the above-described embodiments to a system or apparatus via a network or a storage medium, and having one or more processors in a computer of that system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

201 Data acquisition unit
202 Projection calculation unit
203 Collation unit
204 Correspondence determination unit

Claims

An information processing apparatus comprising:
acquisition means for acquiring a plurality of image data;
generation means for performing a projection operation on each image data acquired by the acquisition means and generating a vector with a reduced number of dimensions corresponding to a pixel to be processed; and
calculation means for calculating the similarity of the pixel to be processed using the vector generated by the generation means.

The information processing apparatus according to claim 1, wherein the generation means generates the vector with the reduced number of dimensions by obtaining the product of a projection matrix of M rows and N columns (M < N), determined by any one of a random number, a discrete chirp matrix, a Delsarte-Goethals code, and a BCH code, and a vector corresponding to the pixel values of a partial region composed of a plurality of pixels in each image indicated by the plurality of image data.

The generating means performs a convolution operation between a plurality of kernels each including a value determined by any one of a random number, a discrete chirp matrix, a Delsarte-Goethals code, and a BCH code and each image indicated by the plurality of image data. 2. The information processing apparatus according to claim 1, wherein a projection image is generated using a vector, and a vector corresponding to a pixel value of a pixel group at a predetermined position of the projection image is generated as the vector with the reduced number of dimensions. .

The information processing apparatus according to claim 3, wherein the generation means generates the vectors so that the value of each element of the vector corresponding to the pixel values of a pixel group at a predetermined position in a first projected image group, obtained by convolving the plurality of kernels with a first image among the plurality of images indicated by the plurality of image data, and the value of each element of the vector corresponding to the pixel values of a pixel group at a predetermined position in a second projected image group, obtained by convolving the plurality of kernels with a second image different from the first image, are values corresponding to projected images obtained by convolving the same kernel with the respective images.

The information processing apparatus according to claim 3, wherein a total number of the kernels is smaller than a number of elements included in the kernel.

The information processing apparatus according to claim 2, wherein in the projection matrix, there are approximately the same number of the same values with the signs inverted.

6. The information processing apparatus according to claim 3, wherein in the kernel, there are substantially the same number of the same values with inverted signs.

The information processing apparatus according to claim 2, wherein the matrix is a cyclic matrix.

The information processing apparatus according to claim 2, wherein the random number is a value rounded to a discrete value.

The information processing apparatus according to claim 1, further comprising a calculation unit that calculates parallax or a distance between the subject and the camera using the similarity.

An information processing method comprising:
an acquisition step of acquiring a plurality of image data;
a generation step of performing a projection operation on each image data acquired in the acquisition step and generating a vector with a reduced number of dimensions corresponding to a pixel to be processed; and
a calculation step of calculating the similarity of the pixel to be processed using the vector generated in the generation step.

A program for causing a computer to function as each means of the information processing apparatus according to any one of claims 1 to 10.