JP2020170390A

JP2020170390A - Image processing device

Info

Publication number: JP2020170390A
Application number: JP2019072017A
Authority: JP
Inventors: 壮石過; Takeshi Ishika
Original assignee: Toshiba Development and Engineering Corp
Current assignee: Toshiba Development and Engineering Corp
Priority date: 2019-04-04
Filing date: 2019-04-04
Publication date: 2020-10-15
Anticipated expiration: 2039-04-04
Also published as: JP7334051B2

Abstract

To efficiently perform matching between stereo camera images. SOLUTION: According to one aspect of the present invention, an image processing device includes an acquisition unit, a search unit, and an identification unit. The acquisition unit acquires a first image in which an object is photographed by a first camera and a second image in which the object is photographed by a second camera. The search unit identifies the parallax of pixels in the first image with respect to corresponding pixels in the second image in descending order from a predetermined upper limit value, and searches for a region that includes pixels whose parallax has been identified and in which the area of such pixels exceeds a threshold. The identification unit sets a search range including the region, and identifies the parallax of the pixels in the search range. SELECTED DRAWING: Figure 2

Description

The present invention relates to an image processing technique for 3D measurement of a subject.

Conventionally, in order to control a robot hand that performs picking, packing, or the like, matching is performed between stereo camera images of an object, and the 3D shape of the object is estimated. Further, by matching against a 3D model of the object prepared in advance, the position and posture of the photographed object are estimated. The robot hand is then controlled so as to perform actions such as picking and packing according to the estimated position and posture of the object.

Patent Document 1 proposes performing imaging multiple times at a plurality of focus positions based on coarse distance information acquired with a distance sensor, and synthesizing these images to obtain an all-in-focus image, so that a subject with height differences can be measured precisely.

Patent Document 1: JP-A-2016-020891

The matching between stereo camera images described above is usually performed over the entire field of view of the camera. Similarly, matching with a 3D model for estimating the position and orientation of an object is usually performed over the entire field of view of the camera. The amount of calculation increases with the number of pixels in the stereo camera images, but carelessly reducing the image resolution is undesirable from the viewpoint of ensuring matching accuracy. Further, when a large number of objects exist, these processes must be performed again each time the action on one object is completed, so the adverse effect of the large amount of calculation becomes more serious. For example, it may be difficult to speed up work such as picking up parts from a large pile.

Although the technique described in Patent Document 1 may contribute to improving the accuracy of matching, it does not change the amount of calculation required for the matching itself.

An object of the present invention is to efficiently perform matching between stereo camera images.

According to one aspect of the present invention, an image processing device includes an acquisition unit, a search unit, and an identification unit. The acquisition unit acquires a first image in which an object is photographed by a first camera and a second image in which the object is photographed by a second camera. The search unit identifies the parallax of pixels in the first image with respect to the corresponding pixels in the second image in descending order from a predetermined upper limit value, and searches for a first region that includes pixels whose parallax has been identified and in which the area of such pixels exceeds a threshold. The identification unit sets a search range including the first region, and identifies the parallax of the pixels in the search range.

According to the present invention, matching between stereo camera images can be performed efficiently.

FIG. 1 is a diagram illustrating a 3D measurement system including the image processing device according to the first embodiment.
FIG. 2 is a block diagram illustrating the image processing device according to the first embodiment.
FIG. 3 is an explanatory diagram of the subject area search process performed by the image processing device of FIG. 2.
FIG. 4 is a diagram illustrating the cumulative result image when the parallax is 25 pixels in the example of FIG. 3.
FIG. 5 is a diagram illustrating the cumulative result image when the parallax is 21 pixels in the example of FIG. 3.
FIG. 6 is a diagram illustrating the cumulative result image when the parallax is 17 pixels in the example of FIG. 3.
FIG. 7 is a flowchart illustrating the operation of the image processing device of FIG. 2.
FIG. 8 is a flowchart illustrating the details of step S210 of FIG. 7.
FIG. 9 is a flowchart illustrating the details of step S220 of FIG. 7.
FIG. 10 is an explanatory diagram of the search range setting process performed by the image processing device of FIG. 2.
FIG. 11 is a block diagram illustrating the image processing device according to the second embodiment.
FIG. 12 is an explanatory diagram of the subject area search process performed by the image processing device of FIG. 11.
FIG. 13 is a diagram illustrating the cumulative result image when the parallax is 20 pixels in the example of FIG. 12.
FIG. 14 is a diagram illustrating the cumulative result image when the parallax is 16 pixels in the example of FIG. 12.
FIG. 15 is a diagram illustrating the cumulative result image when the parallax is 12 pixels in the example of FIG. 12.
FIG. 16 is a flowchart illustrating the operation of the image processing device of FIG. 11.
FIG. 17 is a flowchart illustrating the details of step S420 of FIG. 16.
FIG. 18 is an explanatory diagram of the search range setting process performed by the image processing device of FIG. 11.
FIG. 19 is a diagram illustrating a 3D model of an object.

Hereinafter, embodiments will be described with reference to the drawings. In the following, elements that are the same as or similar to elements already described are given the same or similar reference numerals, and duplicate explanations are basically omitted.

(First Embodiment)
The image processing device according to the first embodiment can be incorporated into, for example, the 3D measurement system shown in FIG. 1. This 3D measurement system includes a multi-lens camera 10, a projector 20, a projector/camera control device 30, and an image processing device 100 according to the present embodiment.

The multi-lens camera 10 is attached to, for example, a robot hand, and photographs the subject 60 while looking down at a table 70 on which the subject 60 is placed. Here, the subject 60 can also be called an object. The object may refer to one article, or to a plurality of articles of the same type or of different types. In order to realize the position and orientation estimation described later, a 3D model of an article of the same type as each object may be prepared in advance. In FIG. 1 the multi-lens camera 10 includes two cameras, but it may include three or more cameras.

The projector/camera control device 30 controls the multi-lens camera 10 and the projector 20. Specifically, the projector/camera control device 30 instructs the projector 20 to project a predetermined measurement pattern for performing the 3D measurement described later. The projector/camera control device 30 also instructs the multi-lens camera 10 to photograph the subject 60 while the measurement pattern is being projected.

The projector 20 projects the measurement pattern according to a command from the projector/camera control device 30. The camera 11 and the camera 12 included in the multi-lens camera 10 then take pictures according to a command from the projector/camera control device 30, and each generates a captured image.

The image processing device 100 acquires a plurality of captured images from the multi-lens camera 10 and performs the various 3D measurements described later on these images. For example, it estimates the parallax in the multi-lens camera 10, the distance from the multi-lens camera 10 to the subject 60, the 3D shape of the subject 60, and the position and orientation of the subject 60.

The image processing device 100 includes a processor that performs input/output control, communication control, read/write control, and various kinds of image processing (for example, the subject area search, search range setting, and 3D measurement described later).

Here, the processor is typically a CPU (Central Processing Unit) and/or a GPU (Graphics Processing Unit), but may be a microcomputer, an FPGA (Field Programmable Gate Array), a DSP (Digital Signal Processor), or the like.

The image processing device 100 further includes a memory that temporarily stores the program executed by the processor to realize such processing and the data used by the program, such as image data, data defining a subject area, data defining a search range, parallax data, distance data, depth map data, point cloud data, 3D data, and 3D model data of one or more articles. The memory may include a RAM (Random Access Memory) having a work area into which such programs and data are loaded.

The image processing device 100 can further use an interface (I/F) for connecting to an external device such as the multi-lens camera 10. The I/F may be built into the image processing device 100 or may be externally attached to it.

The I/F receives images from, for example, the multi-lens camera 10. The I/F may be a wired communication I/F such as an optical fiber cable or an HDMI (registered trademark) (High-Definition Multimedia Interface) cable, or may be a wireless communication I/F that uses a wireless communication technology such as Bluetooth (registered trademark) or Wi-Fi (registered trademark).

Hereinafter, the description of a configuration example of the image processing device 100 is continued with reference to FIG. 2.
As illustrated in FIG. 2, the image processing device 100 includes an image acquisition unit 101, an area search unit 102, a parallax identification unit 103, a distance estimation unit 105, a 3D shape estimation unit 106, a position/orientation estimation unit 107, and a 3D model storage unit 108.

The image acquisition unit 101 acquires, from the camera 11 and the camera 12 included in the multi-lens camera 10, a first image and a second image in which the subject 60 (object) is photographed by the camera 11 and the camera 12, respectively. The image acquisition unit 101 sends the acquired first image and second image to the area search unit 102. The image acquisition unit 101 may correspond to, for example, the above-mentioned processor and I/F.

The area search unit 102 receives the first image and the second image from the image acquisition unit 101. The area search unit 102 searches the first image and/or the second image for an area showing the portion of the subject 60 (object) that is relatively close to the camera 11 and the camera 12. Such an area can also be called a subject area. The area search unit 102 may correspond to, for example, the above-mentioned processor.

For example, in a use case where a robot hand picks up objects from above a large pile, the robot hand usually picks up objects in order starting from the one that is closest at that moment, that is, the one stacked at the highest position, no matter how many objects are in the field of view. Therefore, if the matching search range is limited to the vicinity of such a subject area, the amount of calculation for matching can be greatly reduced without hindering the movement of the robot hand.

Specifically, the area search unit 102 identifies the parallax of pixels in the first image with respect to the corresponding pixels in the second image, in order from a predetermined upper limit value (k). That is, as illustrated in FIG. 3, the area search unit 102 identifies the parallax while decreasing the target value, such as 25 pixels, ..., 21 pixels, ..., 17 pixels, and so on. In the following description, the first image is used as the reference and the second image is shifted for matching, but these roles may be reversed. Here, the upper limit value may be set based on a theoretical upper limit, for example the length of the epipolar line, or may be set to a value suited to the practical environment of the 3D measurement system of FIG. 1.

First, the area search unit 102 shifts each pixel included in the second image by k pixels along the epipolar line to generate a comparison image, and performs block matching against the comparison image for each pixel in the first image. For example, the area search unit 102 calculates the SAD (Sum of Absolute Differences) between a pixel block centered on a pixel of interest in the first image and a pixel block centered on the pixel at the same position in the comparison image. If the SAD is less than a threshold, that is, if the matching succeeds, the area search unit 102 sets the value of the pixel at the same position as the pixel of interest to "1"; conversely, if the SAD is equal to or greater than the threshold, that is, if the matching fails, it sets the value of that pixel to "0". In this way it generates a result image (for example, a binary image) for parallax = k. The value of each pixel in this result image indicates whether the parallax of the pixel at the same position in the first image is k ("1") or not ("0"). The area search unit 102 saves this result image for parallax = k as the (initial) cumulative result image (for example, a binary image).
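The single-parallax matching step described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes rectified grayscale images stored as lists of rows, so that the epipolar line is an image row, and it assumes that a pixel at column x in the first image corresponds to column x - shift in the second image. The names `sad`, `result_image`, `half`, and `sad_threshold` are hypothetical, and border pixels whose block would leave the image are simply left at 0.

```python
def sad(first, second, y, x, shift, half):
    """Sum of absolute differences between the block centered at (y, x) in
    `first` and the block displaced by `shift` columns in `second`."""
    total = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            total += abs(first[y + dy][x + dx] - second[y + dy][x + dx - shift])
    return total


def result_image(first, second, shift, half=1, sad_threshold=10):
    """Binary result image for one parallax value: 1 where matching succeeds."""
    height, width = len(first), len(first[0])
    result = [[0] * width for _ in range(height)]
    # Skip pixels whose block (or shifted block) would fall outside the image.
    for y in range(half, height - half):
        for x in range(half + shift, width - half):
            if sad(first, second, y, x, shift, half) < sad_threshold:
                result[y][x] = 1
    return result
```

Calling `result_image(first, second, k)` for the upper limit k would give the initial cumulative result image in this sketch.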

Next, the area search unit 102 identifies a smaller parallax. That is, the area search unit 102 shifts each pixel included in the second image by k-1 pixels along the epipolar line to generate a comparison image, and performs block matching against the comparison image for each pixel in the first image. For example, the area search unit 102 calculates the SAD between a pixel block centered on a pixel of interest in the first image and a pixel block centered on the pixel at the same position in the comparison image. If the SAD is less than the threshold, the area search unit 102 sets the value of the pixel at the same position as the pixel of interest to "1"; conversely, if the SAD is equal to or greater than the threshold, it sets the value of that pixel to "0". In this way it generates a result image for parallax = k-1. The value of each pixel in this result image indicates whether the parallax of the pixel at the same position in the first image is k-1 ("1") or not ("0"). The area search unit 102 updates the cumulative result image by adding the result image for parallax = k-1 to it. The value of each pixel in the updated cumulative result image indicates whether the parallax of the pixel at the same position in the first image is k-1 or more, in other words, whether its parallax has been identified ("1") or not ("0").

In this way, the area search unit 102 identifies the parallax while decreasing the value in the order k-2, k-3, ... as necessary, and updates the cumulative result image. As a result, the value of each pixel in the cumulative result image updated by adding the result image for parallax = i indicates whether the parallax of the pixel at the same position in the first image is i or more, in other words, whether its parallax has been identified ("1") or not ("0"). The number of pixels having a value of "1" in the cumulative result image therefore increases gradually. As an example, FIG. 4 shows the cumulative result image for parallax = 25 pixels, FIG. 5 shows the cumulative result image for parallax = 21 pixels, and FIG. 6 shows the cumulative result image for parallax = 17 pixels. In each figure, pixels having a value of "1" are drawn in black and pixels having a value of "0" are drawn in white.
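The descending sweep and the accumulation of result images might look like the sketch below. It assumes that "adding" a result image to a binary cumulative image saturates at 1 (a logical OR), which is an interpretation rather than something the text states. The callable `result_for`, which returns the binary result image for a given parallax, and the function names are hypothetical.

```python
def accumulate(cumulative, result):
    """Merge a per-parallax result image into the cumulative image (saturating add)."""
    return [[1 if c or r else 0 for c, r in zip(c_row, r_row)]
            for c_row, r_row in zip(cumulative, result)]


def sweep(result_for, upper_limit, lower_limit):
    """Yield (parallax, cumulative image) for upper_limit, upper_limit - 1, ..., lower_limit."""
    cumulative = None
    for parallax in range(upper_limit, lower_limit - 1, -1):
        result = result_for(parallax)
        # The first result image becomes the initial cumulative result image.
        cumulative = result if cumulative is None else accumulate(cumulative, result)
        yield parallax, cumulative
```

Because the sweep is a generator, a caller can stop consuming it as soon as a subject area has been found, which mirrors the early termination that makes the descending order useful.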

Each time the cumulative result image is saved or updated, the area search unit 102 searches the cumulative result image for an area (subject area) that includes pixels having a value of "1" and in which the area of the pixels having a value of "1" exceeds a threshold. Here, such an area may be an area in which pixels having a value of "1" are continuous, or may be an area of a predetermined shape based on a certain pixel, for example a rectangle of N pixels by N pixels. "Continuous" refers at least to the case where pixels having a value of "1" are adjacent in the horizontal or vertical direction, but may also cover the case where they are adjacent in a diagonal direction. "Adjacent" refers at least to the case where pixels having a value of "1" are lined up without being separated by a pixel having a value of "0", but may also cover the case where pixels having a value of "1" are separated by one or more pixels having a value of "0". For example, when pixels are arranged in the order "1", "1", "0", "1", "1" in the horizontal or vertical direction, these five pixels may be regarded as continuous or as not continuous. When such a subject area is found, the area search unit 102 generates data defining the area and sends it to the parallax identification unit 103 together with the first image and the second image.
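One of the variants described above, a region of continuous "1" pixels under the strictest reading of "continuous" (horizontal or vertical adjacency only, no gaps), can be sketched with a breadth-first flood fill. The function name `find_subject_area` and the choice to return the first qualifying region in scan order are assumptions made for illustration.

```python
from collections import deque


def find_subject_area(cumulative, area_threshold):
    """Return the pixel set of the first 4-connected region of 1s whose area
    exceeds area_threshold, or None if no such region exists yet."""
    height, width = len(cumulative), len(cumulative[0])
    seen = set()
    for y0 in range(height):
        for x0 in range(width):
            if cumulative[y0][x0] != 1 or (y0, x0) in seen:
                continue
            # Flood-fill the region containing (y0, x0).
            region, queue = set(), deque([(y0, x0)])
            seen.add((y0, x0))
            while queue:
                y, x = queue.popleft()
                region.add((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and cumulative[ny][nx] == 1 and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            if len(region) > area_threshold:
                return region
    return None
```

Supporting diagonal adjacency or tolerated gaps would only change the neighbor offsets in the inner loop.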

The parallax identification unit 103 receives the first image, the second image, and the data defining the subject area from the area search unit 102. The parallax identification unit 103 sets a search range including the subject area, and identifies the parallax of the pixels in the search range. The parallax identification unit 103 generates parallax data representing the parallax identification results for the pixels in the search range, and sends it to the distance estimation unit 105. The parallax data may include, for example, the coordinates of each pixel of the first image and the parallax vector for that pixel. The parallax identification unit 103 may also send the parallax data to an external device (not shown). The parallax identification unit 103 may correspond to, for example, the above-mentioned processor (and I/F).

The parallax identification unit 103 may continue identifying the parallax in order from a value smaller than the parallax already identified by the area search unit 102, expanding the subject area and, accordingly, expanding (resetting) the search range.

Specifically, the parallax identification unit 103 sets a search range including the subject area found by the area search unit 102. For example, the parallax identification unit 103 may use a rectangle including the subject area as the search range, or may set a search range that follows the edge of the subject area and is larger than it.
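The rectangular variant of the search range can be sketched as a bounding box around the subject area. The `margin` parameter and the `(top, left, bottom, right)` tuple with inclusive bounds are assumptions for illustration; the text only requires that the range include the subject area.

```python
def search_range(region, height, width, margin=0):
    """Inclusive bounding rectangle (top, left, bottom, right) of `region`,
    grown by `margin` pixels on each side and clipped to the image."""
    ys = [y for y, _ in region]
    xs = [x for _, x in region]
    top = max(min(ys) - margin, 0)
    left = max(min(xs) - margin, 0)
    bottom = min(max(ys) + margin, height - 1)
    right = min(max(xs) + margin, width - 1)
    return top, left, bottom, right
```

A nonzero margin leaves room for the subject area to grow at smaller parallax values before the range has to be reset.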

Then, the parallax identification unit 103 identifies a parallax smaller than the value already identified by the area search unit 102 (here denoted m). That is, the parallax identification unit 103 shifts each pixel included in the second image by m-1 pixels along the epipolar line to generate a comparison image, and performs block matching against the comparison image for each pixel within the search range of the first image. For example, the parallax identification unit 103 calculates the SAD between a pixel block centered on a pixel of interest within the search range of the first image and a pixel block centered on the pixel at the same position in the comparison image. If the SAD is less than the threshold, the parallax identification unit 103 sets the value of the pixel at the same position as the pixel of interest to "1"; conversely, if the SAD is equal to or greater than the threshold, it sets the value of that pixel to "0". In this way it generates a result image for parallax = m-1. The value of each pixel in this result image indicates whether the parallax of the pixel at the same position within the search range of the first image is m-1 ("1") or not ("0"). The parallax identification unit 103 adds this result image for parallax = m-1 to the cumulative result image and thereby updates the subject area. The value of each pixel in the updated cumulative result image indicates, at least within the search range of the first image, whether the parallax of the pixel at the same position is m-1 or more, in other words, whether its parallax has been identified ("1") or not ("0").

Subsequently, the parallax identification unit 103 determines whether the subject area updated in this way has reached the boundary of the currently set search range. If the subject area has not reached the boundary of the currently set search range, the parallax identification unit 103 proceeds to identify a smaller parallax. On the other hand, if the subject area has reached the boundary of the search range, the parallax identification unit 103 enlarges and resets the search range, as described below, so that the subject area is included in the search range.

Specifically, the parallax identification unit 103 enlarges and resets the search range so that it includes the subject area in the current cumulative result image. However, as described above, the parallax identification unit 103 has not performed matching on pixels outside the search range as it was before the reset, so the subject area whose parallax is actually m-1 or more may also reach the boundary of the search range after the reset. Therefore, the parallax identification unit 103 performs the above matching for parallax = m-1 on the portion of the search range that was added by the reset, and updates the cumulative result image again. The parallax identification unit 103 then determines again whether the subject area has reached the boundary of the currently set search range. If the subject area has reached the boundary of the search range, the parallax identification unit 103 needs to repeat the same process. On the other hand, if the subject area has not reached the boundary of the search range, the parallax identification unit 103 proceeds to identify a smaller parallax and, if necessary, to enlarge the search range. Finally, when the size of the area including the pixels whose parallax has been identified (the subject area) reaches a predetermined target value, the parallax identification unit 103 ends the parallax identification.
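The boundary check and re-expansion loop for a single parallax value can be sketched as below. Several simplifications are assumed: the subject area is taken to be every "1" pixel inside the current search range rather than one connected region, the range grows by a fixed `step` on every side, and `matches(y, x)` is a hypothetical per-pixel test standing in for the SAD block matching at the current parallax. The loop also stops once the range covers the whole image.

```python
def touches_boundary(pixels, rect):
    """True if any pixel lies on the edge of the inclusive rectangle `rect`."""
    top, left, bottom, right = rect
    return any(y in (top, bottom) or x in (left, right) for y, x in pixels)


def expand(rect, step, height, width):
    """Grow `rect` by `step` pixels on every side, clipped to the image."""
    top, left, bottom, right = rect
    return (max(top - step, 0), max(left - step, 0),
            min(bottom + step, height - 1), min(right + step, width - 1))


def match_with_expansion(cumulative, rect, matches, step=1):
    """Match one parallax value inside `rect`, enlarging `rect` while the
    identified pixels touch its boundary. Updates `cumulative` in place and
    returns the final search range."""
    height, width = len(cumulative), len(cumulative[0])
    done = set()  # pixels already tested at this parallax value
    while True:
        top, left, bottom, right = rect
        for y in range(top, bottom + 1):
            for x in range(left, right + 1):
                if (y, x) not in done:  # only the newly added part is matched
                    done.add((y, x))
                    if matches(y, x):
                        cumulative[y][x] = 1
        identified = {(y, x) for y in range(top, bottom + 1)
                      for x in range(left, right + 1) if cumulative[y][x] == 1}
        whole_image = rect == (0, 0, height - 1, width - 1)
        if whole_image or not touches_boundary(identified, rect):
            return rect
        rect = expand(rect, step, height, width)
```

The `done` set reflects the point made above that only the portion added by the reset needs to be matched again for the same parallax value.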

For example, the parallax identification unit 103 may end the parallax identification when the length of the diagonal of the rectangle circumscribing the subject area reaches a target value determined based on the 3D shape of the object. Here, the 3D shape of the object can be registered as, for example, the 3D model shown in FIG. 19. The 3D model may be created based on CAD data of the object, or may be created based on 3D imaging data of the object. Let A, B, C, and D be the vertices of the smallest rectangle that circumscribes the 3D model of FIG. 19 when viewed from the direction in which the robot (hand) grips the object. The target value may be determined based on the diagonal length AD or BC of this rectangle. The diagonal length can be calculated as the three-dimensional Euclidean distance between vertex A and vertex D, or as the three-dimensional Euclidean distance between vertex B and vertex C.
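The diagonal computation and the termination test can be sketched as follows. The vertex coordinates are made-up example values, and how the 3D diagonal is converted into a target comparable with the subject area measured in the image is not specified in the text, so `reached_target` simply compares two lengths that are assumed to be in the same unit.

```python
import math


def diagonal_length(vertex_a, vertex_d):
    """Three-dimensional Euclidean distance between two rectangle vertices."""
    return math.dist(vertex_a, vertex_d)


def rect_diagonal(rect):
    """Diagonal of an inclusive (top, left, bottom, right) rectangle, in pixels."""
    top, left, bottom, right = rect
    return math.hypot(bottom - top, right - left)


def reached_target(current_diagonal, target):
    """True once the circumscribing rectangle's diagonal has reached the target."""
    return current_diagonal >= target


# Hypothetical vertices A and D of the circumscribing rectangle of the 3D model.
target = diagonal_length((0.0, 0.0, 0.0), (3.0, 4.0, 0.0))
```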

FIG. 10 shows an example of how the parallax identification unit 103 sets the search range. In the example of FIG. 10, the search for the subject area has been completed at parallax = k-1. Therefore, in stage S1, the parallax identification unit 103 sets the search range so as to include the subject area whose parallax is k-1 or more.

In stage S2, the parallax identification unit 103 performs matching for parallax = k-2 within the currently set search range. As a result, the region having the value "1" in the cumulative result image has reached the boundary (right edge) of the currently set search range.

Therefore, in stage S3, the parallax identification unit 103 expands the search range and performs matching for parallax = k-2 within the expanded portion. As a result, the region having the value "1" in the cumulative result image again reaches the boundary (right edge) of the currently set search range.

Therefore, in stage S4, the parallax identification unit 103 expands the search range and performs matching for parallax = k-2 within the expanded portion. As a result, the region having the value "1" in the cumulative result image is contained within the currently set search range.

Next, in stage S5, the parallax identification unit 103 performs matching for parallax = k-3 within the currently set search range. As a result, the region having the value "1" in the cumulative result image has reached the boundary (right edge) of the currently set search range.

Therefore, in stage S6, the parallax identification unit 103 expands the search range and performs matching for parallax = k-3 within the expanded portion. As a result, the region having the value "1" in the cumulative result image again reaches the boundary (right edge) of the currently set search range.

Therefore, in stage S7, the parallax identification unit 103 expands the search range and performs matching for parallax = k-3 within the expanded portion. As a result, the region having the value "1" in the cumulative result image is contained within the currently set search range.

Next, in stage S8, the parallax identification unit 103 performs matching for parallax = k-4 within the currently set search range. As a result, the size of the region having the value "1" in the cumulative result image has reached the predetermined target value. Therefore, the parallax identification unit 103 ends the parallax identification.

The distance estimation unit 105 receives the parallax data from the parallax identification unit 103. Based on the parallax data, the distance estimation unit 105 calculates, for each (successfully matched) pixel within the search range of the first image, the estimated distance from the point on the surface of the subject 60 corresponding to that pixel to the camera 11 and the camera 12, that is, the estimated depth. The distance estimation unit 105 sends distance data representing the estimated distance to the 3D shape estimation unit 106. The distance data may include, for example, the coordinates of each pixel within the search range of the first image and the distance (depth) of that pixel. The distance estimation unit 105 may also send the distance data to an external device (not shown). The distance estimation unit 105 may correspond to, for example, the above-mentioned processor (and I/F).
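The description does not state which formula the distance estimation unit 105 uses. As a hedged illustration, the sketch below assumes a rectified, parallel stereo pair, for which depth is commonly computed as focal length times baseline divided by parallax. The function name and all parameter values are hypothetical.

```python
def depth_from_parallax(parallax_px, focal_px, baseline_m):
    """Depth in metres for a rectified parallel stereo pair (assumed model).

    parallax_px: identified parallax in pixels
    focal_px:    focal length in pixels (hypothetical calibration value)
    baseline_m:  distance between the two cameras in metres (hypothetical)
    """
    if parallax_px <= 0:
        return None  # no finite depth can be derived from a non-positive parallax
    return focal_px * baseline_m / parallax_px

near = depth_from_parallax(20, 1000.0, 0.1)
far = depth_from_parallax(12, 1000.0, 0.1)
```

Under this model a larger parallax means a smaller depth, which is why identifying the parallax in descending order finds the part of the subject closest to the cameras first.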

The 3D shape estimation unit 106 receives the distance data from the distance estimation unit 105. Based on the distance data, the 3D shape estimation unit 106 estimates the 3D shape of the subject 60 (object) in the search range. Specifically, the 3D shape estimation unit 106 may, for example, convert the distance data into an image to generate a depth map (data) within the search range. The depth map is, for example, image data in which each pixel has a pixel value proportional to its corresponding distance. In addition to, or instead of, the depth map, the 3D shape estimation unit 106 may generate point cloud data of the subject 60 within the search range. The point cloud data may include 3D position data of a large number of points on the surface of the subject 60. The 3D shape estimation unit 106 sends 3D data representing the estimated 3D shape (for example, the depth map and/or the point cloud data) to the position/orientation estimation unit 107. The 3D shape estimation unit 106 may also send the 3D data to an external device (not shown). The 3D shape estimation unit 106 may correspond to, for example, the above-mentioned processor (and I/F).
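One way to turn per-pixel depths into point cloud data is pinhole back-projection. The sketch below assumes a pinhole camera model with focal length `focal_px` and principal point `(cx, cy)`; none of these names or values appear in the description, and the actual conversion used by the 3D shape estimation unit 106 is not specified.

```python
def back_project(u, v, depth, focal_px, cx, cy):
    """Map pixel (u, v) with a known depth to a 3D point (X, Y, Z), pinhole model assumed."""
    x = (u - cx) * depth / focal_px
    y = (v - cy) * depth / focal_px
    return (x, y, depth)

def point_cloud(distance_data, focal_px, cx, cy):
    """distance_data: iterable of (u, v, depth) entries for matched pixels."""
    return [back_project(u, v, d, focal_px, cx, cy) for u, v, d in distance_data]

# Hypothetical distance data: two matched pixels and their depths in metres.
cloud = point_cloud([(120, 80, 2.0), (100, 80, 2.5)], focal_px=100.0, cx=100.0, cy=80.0)
```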

The position/orientation estimation unit 107 receives the 3D data of the subject 60 from the 3D shape estimation unit 106, and reads the 3D model data of one or more articles from the 3D model storage unit 108. The position/orientation estimation unit 107 matches the 3D data of the subject 60 against the read 3D model data, and estimates the position and/or orientation of the subject 60. The position/orientation estimation unit 107 sends position/orientation data representing the estimated position and/or estimated orientation to an external device (not shown), for example a robot hand (or its control device). The position/orientation estimation unit 107 may correspond to, for example, the above-mentioned processor and I/F.

Here, when the subject 60 may be any one of a plurality of types of articles, the position/orientation estimation unit 107 may perform article recognition processing on the subject 60 before estimating its position and orientation. That is, the position/orientation estimation unit 107 may determine which article's 3D model data should be matched against the 3D data of the subject 60.

The 3D model storage unit 108 stores 3D model data of one or more articles prepared in advance. The 3D model storage unit 108 may store, for example, 3D model data of articles that are to be picked up. The 3D model data stored in the 3D model storage unit 108 is read by the position/orientation estimation unit 107. The 3D model storage unit 108 may correspond to, for example, the above-mentioned memory or auxiliary storage device. Here, the auxiliary storage device may be an HDD (Hard Disk Drive), an SSD (Solid State Drive), a flash memory, or the like.

The operation of the image processing device of FIG. 2 is described below with reference to FIGS. 7 to 9.
First, a variable p for counting the identified parallax is initialized to a predetermined upper limit value (k) (step S201).

The area search unit 102 matches the first image against the second image while decrementing the variable p initialized in step S201, and searches for the subject area (step S210). A specific example of step S210 is described later with reference to FIG. 8.

Continuing from step S210, the parallax identification unit 103 keeps decrementing the variable p while identifying the parallax of the pixels in the search range, thereby enlarging the subject area, and sets the search range so as to include the subject area, until the size of the subject area reaches a predetermined target value (step S220). A specific example of step S220 is described later with reference to FIG. 9.

Based on the parallax identification result (parallax data) for each pixel obtained in step S220, the distance estimation unit 105 estimates the distance from the point on the surface of the subject 60 (object) corresponding to that pixel to the camera 11 and the camera 12 (step S245).

The 3D shape estimation unit 106 estimates the 3D shape of the subject 60 based on the distance estimation result (distance data) obtained in step S245 (step S246).

The position/orientation estimation unit 107 matches the estimated 3D shape of the subject 60 obtained in step S246 (the 3D data of the subject 60) against the 3D model data of one or more articles read from the 3D model storage unit 108, and estimates the position and/or orientation of the subject 60 (step S247).

Next, the details of step S210 are described with reference to FIG. 8. In the example of FIG. 8, the process starts from step S211.
In step S211, the area search unit 102 performs block matching for parallax = p and generates a result image. Specifically, the area search unit 102 generates a comparison image by shifting each pixel of the second image by p pixels along the epipolar line, and performs block matching against the comparison image for each pixel of the first image. For example, the area search unit 102 calculates the SAD (sum of absolute differences) between a pixel block centered on a pixel of interest in the first image and a pixel block centered on the pixel at the same position in the comparison image. The area search unit 102 may then generate the result image for parallax = p by setting the value of the pixel at the same position as the pixel of interest to "1" if the SAD is less than a threshold, and to "0" if the SAD is greater than or equal to the threshold.
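The block matching of step S211 can be sketched as follows. This is an illustration under stated assumptions, not the implementation of the area search unit 102: the images are plain lists of grayscale rows, the epipolar lines are assumed to be horizontal, the second image is assumed to be shifted to the right, and pixels whose block would leave the image are simply left at "0". The block half-width `half` and the value of `sad_threshold` are hypothetical.

```python
def result_image(first, second, p, half=1, sad_threshold=10):
    """Binary result image for parallax p: 1 where the block SAD is below the threshold."""
    h, w = len(first), len(first[0])
    out = [[0] * w for _ in range(h)]
    for y in range(half, h - half):
        for x in range(half + p, w - half):
            sad = 0
            for dy in range(-half, half + 1):
                for dx in range(-half, half + 1):
                    # Reading second[...][x + dx - p] is equivalent to comparing
                    # against the second image shifted right by p pixels.
                    sad += abs(first[y + dy][x + dx] - second[y + dy][x + dx - p])
            if sad < sad_threshold:
                out[y][x] = 1
    return out

# Tiny demo: the first image is the second image shifted right by 2 pixels.
second = [[(x * 37 + y * 91) % 251 for x in range(8)] for y in range(5)]
first = [[row[x - 2] if x >= 2 else 0 for x in range(8)] for row in second]
hit = result_image(first, second, p=2)
miss = result_image(first, second, p=0)
```

In the demo, interior pixels match at p = 2 because the two blocks are identical there, while at p = 0 the textured pattern produces a large SAD.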

Next, the area search unit 102 adds the result image for parallax = p generated in step S211 to the cumulative result image, thereby updating the cumulative result image (step S212). When step S212 is performed for the first time, the result image for parallax = p = k may be stored as is as the (first) cumulative result image.

Subsequently, the area search unit 102 detects a region containing pixels having the value "1" in the cumulative result image updated in step S212 (step S213). The area search unit 102 determines whether or not the area of the "1" pixels contained in the region detected in step S213 exceeds a threshold (step S214).

If it is determined in step S214 that the area of the "1" pixels contained in the region is less than or equal to the threshold, the process proceeds to step S215. If it is determined in step S214 that the area of the "1" pixels contained in the region exceeds the threshold, the area search unit 102 adopts the region detected in step S213 as the search result for the subject area, generates and outputs data defining the subject area, and the process ends. In step S215, the area search unit 102 decrements the variable p, and the process returns to step S211.
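The loop of steps S211 to S215 can be sketched as below. Several points are assumptions made for illustration: `match(p)` is a hypothetical callback returning the binary result image for parallax p, the cumulative result image is updated with a logical OR so that its values stay "1"/"0", the detected region is taken to be the largest 4-connected group of "1" pixels, and the loop stops at p = 0 as a safeguard that the flowchart description does not mention.

```python
from collections import deque

def largest_region(img):
    """Largest 4-connected set of pixels with value 1, as a set of (y, x)."""
    h, w = len(img), len(img[0])
    seen, best = set(), set()
    for y in range(h):
        for x in range(w):
            if img[y][x] != 1 or (y, x) in seen:
                continue
            region, queue = set(), deque([(y, x)])
            seen.add((y, x))
            while queue:
                cy, cx = queue.popleft()
                region.add((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and img[ny][nx] == 1 and (ny, nx) not in seen:
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            if len(region) > len(best):
                best = region
    return best

def search_subject_area(match, k, area_threshold):
    """Lower p from k until the detected region's area exceeds the threshold."""
    p, cumulative = k, None
    while p >= 0:
        result = match(p)                                    # S211
        if cumulative is None:                               # S212, first pass
            cumulative = [row[:] for row in result]
        else:                                                # S212
            cumulative = [[a | b for a, b in zip(ra, rb)]
                          for ra, rb in zip(cumulative, result)]
        region = largest_region(cumulative)                  # S213
        if len(region) > area_threshold:                     # S214
            return p, region, cumulative
        p -= 1                                               # S215
    return None, set(), cumulative

# Hypothetical per-parallax result images for a 3x3 view.
results = {
    3: [[0, 0, 0], [0, 1, 0], [0, 0, 0]],
    2: [[0, 0, 0], [1, 0, 1], [0, 0, 0]],
}
found_p, found_region, _ = search_subject_area(lambda p: results[p], k=3, area_threshold=2)
```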

Next, the details of step S220 are described with reference to FIG. 9. In the example of FIG. 9, the process starts from step S221.
In step S221, the parallax identification unit 103 sets a search range that includes the subject area found in step S210, and the process proceeds to step S222. In step S222, the parallax identification unit 103 decrements the variable p.

Next, the parallax identification unit 103 performs block matching within the search range for parallax = p as decremented in step S222, and generates a result image (step S223). Specifically, the parallax identification unit 103 generates a comparison image by shifting each pixel of the second image by p pixels along the epipolar line, and performs block matching against the comparison image for each pixel within the search range of the first image. For example, the parallax identification unit 103 calculates the SAD between a pixel block centered on a pixel of interest within the search range of the first image and a pixel block centered on the pixel at the same position in the comparison image. The parallax identification unit 103 then generates the result image for parallax = p by setting the value of the pixel at the same position as the pixel of interest to "1" if the SAD is less than the threshold, and to "0" if the SAD is greater than or equal to the threshold.

Subsequently, the parallax identification unit 103 adds the result image for parallax = p generated in step S223 to the cumulative result image, thereby updating the cumulative result image (step S224). The area search unit 102 detects a region containing pixels having the value "1" in the cumulative result image updated in step S224 (step S225).

The parallax identification unit 103 determines whether or not the size of the region detected in step S225 has reached the target value (step S232). If it is determined in step S232 that the size of the region has reached the target value, the process ends. If it is determined in step S232 that the size of the region has not reached the target value, the process proceeds to step S226.

In step S226, the parallax identification unit 103 determines whether or not the region detected in step S225 has reached the boundary of the currently set search range. If it is determined in step S226 that the region has not reached the boundary of the search range, the process returns to step S222. If it is determined in step S226 that the region has reached the boundary of the search range, the process proceeds to step S227.

In step S227, the parallax identification unit 103 expands and resets the search range so as to include the (enlarged) subject area detected in step S225.

Next, the parallax identification unit 103 performs block matching for the current parallax = p within the search range, more precisely within the portion added in step S227, and generates a result image (step S228). Specifically, the parallax identification unit 103 generates a comparison image by shifting each pixel of the second image by p pixels along the epipolar line, and performs block matching against the comparison image for each pixel within (the added portion of) the search range of the first image. For example, the parallax identification unit 103 calculates the SAD between a pixel block centered on a pixel of interest within the search range of the first image and a pixel block centered on the pixel at the same position in the comparison image. The parallax identification unit 103 then generates the result image for parallax = p by setting the value of the pixel at the same position as the pixel of interest to "1" if the SAD is less than the threshold, and to "0" if the SAD is greater than or equal to the threshold.

Subsequently, the parallax identification unit 103 adds the result image for parallax = p generated in step S228 to the cumulative result image, thereby updating the cumulative result image (step S229). The area search unit 102 detects a region containing pixels having the value "1" in the cumulative result image updated in step S229 (step S230).

The parallax identification unit 103 determines whether or not the size of the region detected in step S230 has reached the target value (step S233). If it is determined in step S233 that the size of the region has reached the target value, the process ends. If it is determined in step S233 that the size of the region has not reached the target value, the process proceeds to step S231.

In step S231, the parallax identification unit 103 determines whether or not the region detected in step S230 has reached the boundary of the currently set search range. If it is determined in step S231 that the region has not reached the boundary of the search range, the process returns to step S222. If it is determined in step S231 that the region has reached the boundary of the search range, the process returns to step S227.
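The control flow of steps S221 to S233 can be sketched as follows. The sketch makes several simplifying assumptions that are not in the description: regions are sets of (y, x) pixels rather than images, all matched pixels are treated as one subject area, the search range is the bounding rectangle of the subject area plus a fixed `margin`, the size is measured as the diagonal of the bounding rectangle in pixels, and a region touching an edge of the image itself is not treated as reaching the boundary. `match(p, window, exclude)` is a hypothetical callback that returns the pixels matched at parallax p inside `window` but outside `exclude`.

```python
import math

def bounding_box(pixels):
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    return min(ys), min(xs), max(ys), max(xs)

def expand(box, margin, h, w):
    y0, x0, y1, x1 = box
    return (max(0, y0 - margin), max(0, x0 - margin),
            min(h - 1, y1 + margin), min(w - 1, x1 + margin))

def diagonal(pixels):
    y0, x0, y1, x1 = bounding_box(pixels)
    return math.hypot(y1 - y0, x1 - x0)

def touches(pixels, window, h, w):
    """True if the region lies on a window edge that can still be pushed outward."""
    y0, x0, y1, x1 = window
    by0, bx0, by1, bx1 = bounding_box(pixels)
    return ((by0 == y0 and y0 > 0) or (bx0 == x0 and x0 > 0) or
            (by1 == y1 and y1 < h - 1) or (bx1 == x1 and x1 < w - 1))

def identify(match, region, p, target, h, w, margin=1):
    window = expand(bounding_box(region), margin, h, w)           # S221
    while p > 0:
        p -= 1                                                    # S222
        region = region | match(p, window, None)                  # S223 to S225
        while True:
            if diagonal(region) >= target:                        # S232 / S233
                return p, region, window
            if not touches(region, window, h, w):                 # S226 / S231
                break
            old = window
            window = expand(bounding_box(region), margin, h, w)   # S227
            region = region | match(p, window, old)               # S228 to S230
    return p, region, window

# Hypothetical ground-truth parallax per pixel on a 1x10 image.
truth = {(0, 4): 5, (0, 3): 4, (0, 5): 4, (0, 2): 4, (0, 6): 4, (0, 1): 3, (0, 7): 3}

def inside(px, box):
    return box is not None and box[0] <= px[0] <= box[2] and box[1] <= px[1] <= box[3]

def demo_match(p, window, exclude):
    return {px for px, d in truth.items()
            if d == p and inside(px, window) and not inside(px, exclude)}

final_p, final_region, final_window = identify(demo_match, {(0, 4)}, 5, 4.0, 1, 10)
```

In the demo, matching at p = 4 first fills the initial window, the window is then widened once, and the re-match of only the added columns brings the diagonal to the target, mirroring stages S2 to S4 of FIG. 10 in miniature.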

As described above, the image processing device according to the first embodiment identifies the parallax of each pixel of the stereo camera images in descending order from a predetermined upper limit value, and thereby finds (part of) the region of a subject close to the cameras at an early stage. The image processing device then sets a search range so as to include the region found, and identifies the parallax within that search range. Therefore, this image processing device can limit the range over which matching for parallax identification and matching for estimating the position and orientation of the subject are performed to a part of the field of view of the cameras. That is, compared with using the entire field of view of the cameras as the search range, it can identify the parallax and/or estimate the position and orientation of a subject close to the cameras while reducing the number of matching operations, and hence the amount of computation required for matching. Accordingly, in a use case where, for example, a robot hand picks up objects from the top of a large pile, this image processing device can speed up the work by shortening the time between picking up one object and picking up the next.

(Second Embodiment)
The 3D measurement system illustrated in FIG. 1 can also incorporate an image processing device 300 according to the second embodiment in place of the image processing device 100 described above.

Like the image processing device 100 described above, the image processing device 300 acquires a plurality of captured images from the multi-lens camera 10 and performs various 3D measurements on these images. The image processing device 300 may have the same hardware configuration as the image processing device 100.

A configuration example of the image processing device 300 is described below with reference to FIG. 11.
As illustrated in FIG. 11, the image processing device 300 includes an image acquisition unit 101, an area search unit 102, a parallax identification unit 303, a distance estimation unit 105, a 3D shape estimation unit 106, a position/orientation estimation unit 107, and a 3D model storage unit 108.

The parallax identification unit 303 receives the first image, the second image, and data defining the subject area from the area search unit 102. The parallax identification unit 303 sets a search range that includes the subject area, and identifies the parallax of the pixels in the search range. The parallax identification unit 303 generates parallax data representing the parallax identification result for the pixels in the search range, and sends it to the distance estimation unit 105. The parallax data may include, for example, the coordinates of each pixel of the first image and the parallax vector for that pixel. The parallax identification unit 303 may also send the parallax data to an external device (not shown). The parallax identification unit 303 may correspond to, for example, the above-mentioned processor (and I/F).

The parallax identification unit 303 may continue the parallax identification in order, starting from a value smaller than the parallax already identified by the area search unit 102, enlarging the subject area and expanding (resetting) the search range accordingly. As described later, the parallax identification unit 303 differs from the parallax identification unit 103 described above in two respects. First, when it expands the search range as a result of identifying the parallax for a certain value, it identifies the parallax for the expanded search range from the predetermined upper limit value down to that value. Second, when a more promising subject area exists within the expanded search range, it resets the current search range and restarts the parallax identification.

Specifically, the parallax identification unit 303 sets a search range that includes the subject area found by the area search unit 102. For example, the parallax identification unit 303 may use a rectangle containing the subject area as the search range, or may set a search range that follows the edge of the subject area and is larger than the subject.

The parallax identification unit 303 then identifies a parallax smaller than the value already identified by the area search unit 102 (denoted m here). That is, the parallax identification unit 303 generates a comparison image by shifting each pixel of the second image by m-1 pixels along the epipolar line, and performs block matching against the comparison image for each pixel within the search range of the first image. For example, the parallax identification unit 303 calculates the SAD between a pixel block centered on a pixel of interest within the search range of the first image and a pixel block centered on the pixel at the same position in the comparison image. The parallax identification unit 303 then generates the result image for parallax = m-1 by setting the value of the pixel at the same position as the pixel of interest to "1" if the SAD is less than the threshold, and to "0" if the SAD is greater than or equal to the threshold. The value of each pixel in this result image indicates whether or not the parallax of the pixel at the same position within the search range of the first image is m-1 ("1"/"0"). The parallax identification unit 303 adds the result image for parallax = m-1 to the cumulative result image to update the subject area. The value of each pixel in this cumulative result image indicates whether or not the parallax of the pixel at the same position, at least within the search range of the first image, is m-1 or more, in other words, whether or not its parallax has been identified ("1"/"0").

Subsequently, the parallax identification unit 303 determines whether or not the subject area updated in this way has reached the boundary of the currently set search range. If the subject area has not reached the boundary of the currently set search range, the parallax identification unit 303 identifies a smaller parallax. If the subject area has reached the boundary of the search range, the parallax identification unit 303 expands and resets the search range as described below so that the subject area is contained in the search range.

Specifically, the parallax identification unit 303 expands and resets the search range so as to include the subject area in the current cumulative result image. However, as described above, the parallax identification unit 303 has not performed matching on the pixels outside the search range as it was before the reset, so the subject area whose parallax is actually m-1 or more may also reach the boundary of the search range after the reset. Therefore, the parallax identification unit 303 performs the above-described matching from the predetermined upper limit value down to m-1 on the portion of the search range that was added by the reset, and updates the cumulative result image again.

Here, suppose that the portion of the search range added by the reset contains a region in which a parallax larger than m-1, that is, a parallax from the upper limit value down to m, has been identified, and that this region is not contiguous with the subject area in the current search range. In that case, the region may correspond to a subject located closer to the cameras than the subject corresponding to the subject area in the current search range. The parallax identification unit 303 therefore discards the current search range (and the subject area it contains), newly sets a search range that includes this (subject) region (that is, resets the search range), and restarts the parallax identification for the pixels in that search range. The parallax identification unit 303 then determines again whether or not the subject area has reached the boundary of the currently set search range. If the subject area has reached the boundary of the search range, the parallax identification unit 303 needs to repeat the same process. If it has not, the parallax identification unit 303 identifies a smaller parallax and, if necessary, expands the search range. Finally, when the size of the region containing the pixels whose parallax has been identified (the subject area) reaches a predetermined target value, the parallax identification unit 303 ends the parallax identification.
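The re-matching and reset decision of the second embodiment can be sketched as below. This is only an illustration under assumptions: regions are sets of (y, x) pixels, "not contiguous" is interpreted as having no shared or 4-adjacent pixel, all nearer pixels found in the added portion are treated as a single region, and `match(q, window, exclude)` is a hypothetical callback that returns the pixels matched at parallax q inside `window` but outside `exclude`. How the identification proceeds after a reset is left to the caller, since the description does not fix the parallax from which it restarts.

```python
def adjacent(pixels_a, pixels_b):
    """True if some pixel of pixels_a is shared with or 4-adjacent to pixels_b."""
    for y, x in pixels_a:
        for n in ((y, x), (y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if n in pixels_b:
                return True
    return False

def rematch_added_portion(match, k, p, window, old, region):
    """Match the added portion for parallaxes k down to p; report whether to reset."""
    nearer = set()
    for q in range(k, p, -1):                  # parallaxes k, k-1, ..., p+1
        nearer |= match(q, window, old)
    same = match(p, window, old)               # current parallax p
    if nearer and not adjacent(nearer, region):
        return nearer, True                    # discard the old area and reset
    return region | nearer | same, False

def inside(px, box):
    return box is not None and box[0] <= px[0] <= box[2] and box[1] <= px[1] <= box[3]

def make_match(truth):
    def match(q, window, exclude):
        return {px for px, d in truth.items()
                if d == q and inside(px, window) and not inside(px, exclude)}
    return match

old_window, new_window = (0, 3, 0, 5), (0, 0, 0, 9)
# Case 1: a nearer, detached subject (parallax 7) lies in the added portion.
reset_region, reset_flag = rematch_added_portion(
    make_match({(0, 8): 7}), 8, 4, new_window, old_window, {(0, 4)})
# Case 2: only a pixel at the current parallax lies in the added portion.
grown_region, grown_flag = rematch_added_portion(
    make_match({(0, 2): 4}), 8, 4, new_window, old_window, {(0, 4)})
```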

Suppose that two subjects, the subject 61 and the subject 62, are arranged as illustrated in FIG. 12. In the example of FIG. 12, the subject 62 lies on top of the subject 61, and the subject 62 is closer to the camera than the subject 61. Owing to the postures of the two subjects, however, a part of the subject 61 protrudes to a position closer to the camera than the subject 62.

With the subjects 61 and 62 arranged in this way, the area search unit 102 may find a part of the subject 61 as the subject region. The parallax identification unit 303 then identifies the parallax for the portions of the subject 61 farther from the camera and expands the search range. Suppose, hypothetically, that after identifying the parallax of the pixels in the search range for a certain value and then expanding the search range, the unit identified the parallax of the pixels in the expanded portion for that value only. In the portion where the subjects 61 and 62 overlap, the subject 62 has a larger parallax than the subject 61, so even if the subject 62 were present in the expanded portion it could be overlooked.

The parallax identification unit 303 therefore prevents such an oversight by identifying the parallax of the pixels in the expanded portion of the search range as well, from the predetermined value down to the value currently of interest. FIGS. 13, 14, and 15 show the cumulative result images for the example of FIG. 12 at parallaxes of 20 pixels, 16 pixels, and 12 pixels, respectively. At the stage of FIG. 13, a part of the subject 61 is included in the search range. At the stage of FIG. 14, the search range has been reset so as to encompass a part of the subject 62. At the stage of FIG. 15, the positional relationship between the subjects 61 and 62, namely that the subject 62 lies on top of the subject 61, can be seen.

FIG. 18 shows an example of how the parallax identification unit 303 sets the search range. In the example of FIG. 18, the search for the subject region has been completed at parallax = k-1. In stage S11, therefore, the parallax identification unit 303 sets the search range so as to encompass the subject region whose parallax is k-1 or more.

In stage S12, the parallax identification unit 303 performs matching for parallax = k-2 within the currently set search range. As a result, the region having the value "1" in the cumulative result image reaches the boundary (upper edge and right edge) of the currently set search range.

In stage S13, therefore, the parallax identification unit 303 expands the search range. In stage S14, the parallax identification unit 303 performs matching for parallax = k, k-1, and k-2 within the expanded portion. As a result, the region having the value "1" in the cumulative result image reaches the boundary (upper edge and right edge) of the currently set search range.

In stage S15, therefore, the parallax identification unit 303 expands the search range further. In stage S16, the parallax identification unit 303 performs matching for parallax = k, k-1, and k-2 within the expanded portion. As a result, a region for which parallax = k and/or k-1 has been identified is detected, and because this region is discontinuous with the subject region in the current search range, the search range is reset. In addition, the region having the value "1" in the cumulative result image reaches the boundary (left edge) of the currently set search range.

In stage S17, therefore, the parallax identification unit 303 expands the search range further. In stage S18, the parallax identification unit 303 performs matching for parallax = k, k-1, and k-2 within the expanded portion. The region having the value "1" in the cumulative result image is now encompassed by the currently set search range.

Next, in stage S19, the parallax identification unit 303 performs matching for parallax = k-3 within the currently set search range. As a result, the size of the region having the value "1" in the cumulative result image reaches the target value, and the parallax identification unit 303 ends parallax identification.
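Stages S11 to S19 rely on two tests: whether the region of "1" pixels has reached the boundary of the search range, and whether its size has reached the target value. The following is a rough sketch of both, assuming the cumulative result image is a list of rows of 0/1 values, the search range is a rectangle (top, left, bottom, right) with exclusive bottom and right edges, and "size" means a pixel count. None of these representations, nor the names `touches_boundary` and `region_size`, come from the patent.

```python
def touches_boundary(cumulative, rect):
    """True if any '1' pixel lies on the outermost rows or columns of rect."""
    top, left, bottom, right = rect
    for r in range(top, bottom):
        for c in range(left, right):
            on_edge = r in (top, bottom - 1) or c in (left, right - 1)
            if on_edge and cumulative[r][c] == 1:
                return True
    return False


def region_size(cumulative, rect):
    """Number of '1' pixels inside rect (size as pixel count is an assumption)."""
    top, left, bottom, right = rect
    return sum(
        cumulative[r][c]
        for r in range(top, bottom)
        for c in range(left, right)
    )
```

In terms of the walk-through, `touches_boundary` returning true corresponds to the situations after stages S12, S14, and S16 that trigger an expansion, and `region_size` reaching the target corresponds to the end condition in stage S19.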

The operation of the image processing apparatus of FIG. 11 is described below with reference to FIGS. 16 and 17. The flowchart of FIG. 16 is the flowchart of FIG. 7 with step S220 replaced by step S420.

In step S420, continuing from step S210, the parallax identification unit 303 keeps decrementing the variable p while identifying the parallax of the pixels in the search range, thereby growing the subject region, and sets the search range so as to encompass the subject region, until the size of the subject region reaches a predetermined target value (step S420). A specific example of step S420 is described later with reference to FIG. 17.
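The outer loop of step S420 can be outlined roughly as below. This is a sketch under stated assumptions, not the flowchart of FIG. 17: the callbacks `match_at`, `grow_range`, and `size_of` are hypothetical stand-ins for the matching, search range setting, and size measurement described in the text, and the lower bound `p_min` is an added guard so that the loop always terminates.

```python
def run_step_s420(p, search_range, match_at, grow_range, size_of, target, p_min=0):
    """Decrease the parallax p until the subject region reaches the target size.

    match_at(p, search_range): identifies parallax p inside the search range.
    grow_range(search_range):  returns a range enclosing the subject region.
    size_of():                 returns the current subject region size.
    All three callbacks and p_min are assumptions for illustration.
    """
    while size_of() < target and p > p_min:
        p -= 1                                   # move to the next smaller parallax
        match_at(p, search_range)                # grow the subject region
        search_range = grow_range(search_range)  # keep the region enclosed
    return p, search_range
```

The function returns the last parallax value that was matched together with the final search range, which a caller could pass on to later processing.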

Next, the details of step S420 are described with reference to FIG. 17. The flowchart of FIG. 17 is the flowchart of FIG. 9 with steps S228 and S229 replaced by steps S428 and S429, respectively, and with steps S434 and S435 added.

In step S428, the parallax identification unit 303 performs block matching within the expanded search range, that is, within the portion expanded in step S227, for each parallax from the predetermined upper limit value (k) down to the current parallax = p, and generates a result image for each (step S428). Specifically, for every integer i satisfying p ≦ i ≦ k, the parallax identification unit 303 generates a comparison image by shifting each pixel of the second image by i pixels along the epipolar line, and performs block matching against that comparison image for each pixel in the expanded portion of the search range of the first image. For example, the parallax identification unit 303 calculates the SAD between a pixel block centered on a pixel of interest in the search range of the first image and a pixel block centered on the pixel at the same position in the comparison image. The parallax identification unit 303 then generates the result image for parallax = i by setting the value of the pixel at the same position as the pixel of interest to "1" if the SAD is less than a threshold, and to "0" if the SAD is equal to or greater than the threshold.
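The SAD-based result image of step S428 can be sketched in plain Python as follows. Several points are assumptions made for illustration only: the images are rectified so that epipolar lines coincide with rows, the second image is shifted to the right, images are lists of rows of integer intensities, and block pixels falling outside the image are simply skipped. The names `shift_right`, `sad`, and `result_image` are hypothetical.

```python
def shift_right(image, i, fill=0):
    """Comparison image: each row of `image` shifted i pixels to the right."""
    width = len(image[0])
    return [
        [row[x - i] if x - i >= 0 else fill for x in range(width)]
        for row in image
    ]


def sad(first, comparison, y, x, half):
    """Sum of absolute differences over the block centred on (y, x)."""
    height, width = len(first), len(first[0])
    total = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            yy, xx = y + dy, x + dx
            if 0 <= yy < height and 0 <= xx < width:  # skip out-of-image pixels
                total += abs(first[yy][xx] - comparison[yy][xx])
    return total


def result_image(first, second, i, pixels, threshold, half=1):
    """Binary result image for parallax i, evaluated only at `pixels`."""
    comparison = shift_right(second, i)
    result = [[0] * len(first[0]) for _ in first]
    for y, x in pixels:
        if sad(first, comparison, y, x, half) < threshold:
            result[y][x] = 1  # SAD below threshold: parallax i identified here
    return result
```

Calling `result_image` once for each i from k down to p, with `pixels` restricted to the expanded portion of the search range, would give the set of result images that step S428 describes.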

Subsequently, the parallax identification unit 303 determines whether, for parallax = k to p+1, a region with pixel value = "1" has been detected that is discontinuous with the subject region in the search range (step S434). If such a region is detected, the process proceeds to step S435; otherwise, the process proceeds to step S429.

In step S435, the parallax identification unit 303 discards the currently set search range and newly sets a search range that encompasses the detected region, and the process proceeds to step S429.
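Steps S434 and S435 can be illustrated with the sketch below. The patent does not define how discontinuity is tested or how large the new search range is, so both are assumptions here: a detected region counts as discontinuous when none of its pixels overlaps or is 8-adjacent to a subject pixel, and the new search range is the bounding box of the detected region plus a margin. The names `is_discontinuous`, `bounding_range`, and `maybe_reset` are hypothetical.

```python
def is_discontinuous(detected, subject):
    """True if no detected pixel overlaps or is 8-adjacent to a subject pixel."""
    subject = set(subject)
    for r, c in detected:
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if (r + dr, c + dc) in subject:
                    return False
    return True


def bounding_range(pixels, margin, height, width):
    """Bounding box (top, left, bottom, right) of pixels, padded and clipped."""
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    return (
        max(min(rows) - margin, 0),
        max(min(cols) - margin, 0),
        min(max(rows) + 1 + margin, height),
        min(max(cols) + 1 + margin, width),
    )


def maybe_reset(search_range, detected, subject, margin, height, width):
    """Return (search_range, was_reset) following the S434/S435 decision."""
    if detected and is_discontinuous(detected, subject):
        # S435: discard the current range and enclose the detected region.
        return bounding_range(detected, margin, height, width), True
    return search_range, False  # S434 "no": keep the current range
```

Here `detected` would hold the pixels marked "1" for parallax = k to p+1 in the expanded portion, and `subject` the pixels of the subject region in the current search range.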

In step S429, the parallax identification unit 303 updates the cumulative result image by adding to it the result images for parallax = k to p generated in step S428.
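The update in step S429 might look like the following sketch. The text only says that the result images are added to the cumulative result image; since the cumulative image is elsewhere described in terms of pixels with the value "1", this sketch assumes the addition saturates at 1. The name `accumulate` is hypothetical.

```python
def accumulate(cumulative, result_images):
    """Add binary result images into the cumulative image, saturating at 1.

    Saturation (keeping the cumulative image binary) is an assumption;
    the source only states that the result images are added.
    """
    for result in result_images:
        for r, row in enumerate(result):
            for c, value in enumerate(row):
                if value:
                    cumulative[r][c] = 1
    return cumulative
```

The function modifies `cumulative` in place and also returns it, so it can be called once per batch of result images for parallax = k to p.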

As described above, the image processing apparatus according to the second embodiment differs from the image processing apparatus according to the first embodiment in two respects. First, when it has identified the parallax of the pixels in the search range for a certain value and then expanded the search range, it identifies the parallax for the expanded portion from the predetermined upper limit value down to that value. Second, when a more promising subject region is found within the expanded search range, it resets the search range being set and restarts parallax identification. Compared with the image processing apparatus according to the first embodiment, this image processing apparatus may require more computation for parallax identification, but when a second subject that is closer to the camera than the first subject corresponding to the initially found subject region exists at a location away from that subject region, the second subject is less likely to be overlooked.

The embodiments described above are merely specific examples intended to aid understanding of the concept of the present invention, and are not intended to limit its scope. Various components may be added to, removed from, or changed in the embodiments without departing from the gist of the present invention.

Several functional units have been described in the above embodiments, but these are merely examples of how each functional unit may be implemented. For example, a plurality of functional units described as being implemented in one device may be implemented across a plurality of separate devices, and conversely, functional units described as being implemented across a plurality of separate devices may be implemented in one device.

The various functional units described in each of the above embodiments may be realized using circuits. A circuit may be a dedicated circuit that realizes a specific function, or a general-purpose circuit such as a processor.

At least part of the processing in each of the above embodiments can also be realized by using, for example, a processor mounted in a general-purpose computer as the basic hardware. A program that realizes the above processing may be provided stored on a computer-readable recording medium. The program is stored on the recording medium as a file in an installable format or a file in an executable format. Examples of the recording medium include magnetic disks, optical disks (CD-ROM, CD-R, DVD, and the like), magneto-optical disks (MO and the like), and semiconductor memories. Any recording medium may be used as long as it can store the program and can be read by a computer. The program that realizes the above processing may also be stored on a computer (server) connected to a network such as the Internet and downloaded to a computer (client) via the network.

10 ... Multi-lens camera
11, 12 ... Camera
20 ... Projector
30 ... Projector/camera control device
60, 61, 62 ... Subject
70 ... Stand
100, 300 ... Image processing apparatus
101 ... Image acquisition unit
102 ... Area search unit
103, 303 ... Parallax identification unit
105 ... Distance estimation unit
106 ... 3D shape estimation unit
107 ... Position/orientation estimation unit
108 ... 3D model storage unit

Claims

An image processing apparatus comprising:
an acquisition unit that acquires a first image of an object captured by a first camera and a second image of the object captured by a second camera;
a search unit that identifies the parallax of pixels in the first image with respect to corresponding pixels in the second image in descending order from a predetermined upper limit value, and searches for a first region that contains pixels whose parallax has been identified and in which the area of those pixels exceeds a threshold; and
an identification unit that sets a search range encompassing the first region and identifies the parallax of the pixels in the search range.

The image processing apparatus according to claim 1, wherein the identification unit identifies the parallax of the pixels in the search range for a first value smaller than the already identified parallax and updates the first region, and, when the first region reaches the boundary of the search range, expands and resets the search range so as to encompass the first region and then identifies the parallax of the pixels in the search range for a second value smaller than the first value.

The image processing apparatus according to claim 1, wherein the identification unit identifies the parallax of the pixels in the search range for a first value smaller than the already identified parallax and updates the first region, and, when the first region reaches the boundary of the search range, expands and resets the search range so as to encompass the first region and then identifies the parallax of the pixels in the expanded search range from the upper limit value down to the first value.

The image processing apparatus according to claim 3, wherein the identification unit detects, among the pixels in the expanded search range, a second region for which a parallax larger than the first value has been identified, and, when the second region is discontinuous with the first region, discards the search range being set, newly sets a search range encompassing the second region, and restarts identification of the parallax.

The image processing apparatus according to any one of claims 1 to 4, wherein the identification unit ends identification of the parallax when the size of a region containing the pixels in the search range whose parallax has been identified reaches a predetermined target value.

The image processing apparatus according to any one of claims 1 to 5, further comprising a distance estimation unit that estimates, based on the identified parallax, the distance from the point on the object corresponding to each pixel in the search range to the first camera and the second camera.

The image processing apparatus according to claim 6, further comprising a 3D shape estimation unit that estimates the 3D shape of the object in the search range based on the distance estimation result.

The image processing apparatus according to claim 7, further comprising a position/orientation estimation unit that matches the estimated 3D shape of the object against 3D models of one or more articles prepared in advance and estimates at least one of the position and the orientation of the object.