JP2012226645A - Image processing apparatus, image processing method, recording medium, and program

Info

Publication number: JP2012226645A
Application number: JP2011094938A
Authority: JP
Inventors: Tsutomu Sawada
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2011-04-21
Filing date: 2011-04-21
Publication date: 2012-11-15

Abstract

PROBLEM TO BE SOLVED: To enable three-dimensional coordinates to be detected accurately.

SOLUTION: An acquisition unit acquires a fisheye image, which is an image captured through a fisheye lens; a conversion unit converts the fisheye image to generate a converted image; and a recognition unit recognizes an image of a recognition target from the converted image. The present technology is applicable to an image processing apparatus.

Description

本技術は画像処理装置および方法、記録媒体並びにプログラムに関し、特に正確に３次元座標を検出することができるようにした画像処理装置および方法、記録媒体並びにプログラムに関する。 The present technology relates to an image processing apparatus and method, a recording medium, and a program, and more particularly, to an image processing apparatus and method, a recording medium, and a program that can accurately detect three-dimensional coordinates.

カメラにおいて魚眼レンズが使用される場合がある。魚眼レンズは３６０度の範囲の画像を撮像できるため、例えば周囲を全体的に監視するのに好適である。 Fisheye lenses may be used in cameras. Since the fisheye lens can capture an image in a range of 360 degrees, it is suitable for monitoring the entire surroundings, for example.

このような魚眼レンズを有するカメラをロボットの頭上に配置し、周囲を観察することが提案されている（例えば特許文献１）。 It has been proposed to arrange a camera having such a fisheye lens on the robot's head and observe the surroundings (for example, Patent Document 1).

Patent Document 1: JP 2008-210403 A

しかしながら、魚眼レンズを用いたカメラにより撮像された画像はひずみが大きいので、その画像から、そこに写されている被写体の３次元位置を正確に推定することは困難である。 However, since an image captured by a camera using a fisheye lens has a large distortion, it is difficult to accurately estimate the three-dimensional position of the subject imaged from the image.

本技術はこのような状況に鑑みてなされたものであり、正確に３次元座標を検出できるようにする。 The present technology has been made in view of such a situation, and enables accurate detection of three-dimensional coordinates.

One aspect of the present technology is an image processing apparatus including an acquisition unit that acquires a fisheye image, which is an image captured through a fisheye lens; a conversion unit that converts the fisheye image to generate a converted image; and a recognition unit that recognizes an image of a recognition target from the converted image.

前記変換画像は、前記魚眼画像を円筒面上に展開した展開画像とすることができる。 The converted image may be a developed image obtained by developing the fisheye image on a cylindrical surface.

前記展開画像上の前記認識対象の画像の大きさを検出する検出部をさらに備えることができる。 The image processing apparatus may further include a detection unit that detects a size of the recognition target image on the developed image.

前記認識対象までの距離を、前記認識対象の画像の大きさから演算する演算部をさらに備えることができる。 The image processing apparatus may further include a calculation unit that calculates a distance to the recognition target from a size of the recognition target image.

前記演算部は、前記距離と大きさの関係を規定する関係式から、前記認識対象の画像の大きさに対応する前記認識対象までの距離を演算することができる。 The calculation unit can calculate the distance to the recognition target corresponding to the size of the recognition target image from a relational expression that defines the relationship between the distance and the size.

The image processing apparatus may further include a selection unit that selects one of a plurality of the relational expressions prepared for a plurality of the recognition targets having different sizes.

１つの前記関係式を選択するために距離を測定する測定部をさらに備えることができる。 In order to select one of the relational expressions, a measuring unit that measures a distance may be further provided.

The image processing apparatus may further include a specifying unit that specifies a specific recognition target in order to select one of the relational expressions.

前記演算部はさらに、新たな前記関係式を演算し、追加することができる。 The calculation unit can further calculate and add a new relational expression.

前記関係式を更新する更新部をさらに備えることができる。 An updating unit for updating the relational expression can be further provided.

前記関係式から検出された前記認識対象までの距離をパラメータとして、前記魚眼画像の座標系を、世界座標系に変換する座標変換部をさらに備えることができる。 The image processing apparatus may further include a coordinate conversion unit that converts a coordinate system of the fisheye image into a world coordinate system using a distance to the recognition target detected from the relational expression as a parameter.

本技術の一側面の画像処理方法、記録媒体及びプログラムは、上述した本技術の一側面の画像処理装置に対応する画像処理方法、記録媒体及びプログラムである。 An image processing method, a recording medium, and a program according to one aspect of the present technology are an image processing method, a recording medium, and a program corresponding to the image processing device according to one aspect of the present technology.

本技術の一側面においては、魚眼レンズを介して撮影した画像である魚眼画像が取得され、魚眼画像を変換して、変換画像が生成され、変換画像から認識対象の画像が検出される。 In one aspect of the present technology, a fish-eye image that is an image captured through a fish-eye lens is acquired, the fish-eye image is converted, a converted image is generated, and a recognition target image is detected from the converted image.

以上のように、本技術の一側面によれば、正確に３次元座標を検出することができる。 As described above, according to one aspect of the present technology, three-dimensional coordinates can be accurately detected.

FIG. 1 is a block diagram showing the configuration of the image processing apparatus of the present technology.
FIG. 2 is a flowchart explaining the three-dimensional position estimation process of the image processing apparatus of the present technology.
FIG. 3 is a diagram explaining the three-dimensional position estimation process of the image processing apparatus of the present technology.
FIG. 4 is a diagram explaining the relationship between the size of the subject image and the distance.
FIG. 5 is a graph showing the relationship between the size of the subject image and the distance.
FIG. 6 is a diagram explaining the coordinate system of the fisheye image.
FIG. 7 is a diagram explaining the world coordinate system.
FIG. 8 is a flowchart explaining the learning process.
FIG. 9 is a diagram explaining the expressions representing the relationship between the size of the subject image and the distance.
FIG. 10 is a flowchart explaining the three-dimensional position estimation process of the image processing apparatus of the present technology.
FIG. 11 is a flowchart explaining the three-dimensional position estimation process of the image processing apparatus of the present technology.
FIG. 12 is a flowchart explaining the relational expression selection process.
FIG. 13 is a diagram explaining the principle of obtaining the distance from the size of the subject image.
FIG. 14 is a flowchart explaining the relational expression selection process.
FIG. 15 is a flowchart explaining the three-dimensional position estimation process of the image processing apparatus of the present technology.
FIG. 16 is a flowchart explaining the three-dimensional position estimation process of the image processing apparatus of the present technology.
FIG. 17 is a flowchart explaining the three-dimensional position estimation process of the image processing apparatus of the present technology.
FIG. 18 is a flowchart explaining the three-dimensional position estimation process of the image processing apparatus of the present technology.
FIG. 19 is a diagram explaining the development of an image.
FIG. 20 is a block diagram showing the configuration of a personal computer.

Hereinafter, modes for carrying out the technology (hereinafter referred to as embodiments) will be described. The description will be given in the following order.
1. First embodiment
2. Second embodiment
3. Third embodiment
4. Fourth embodiment
5. Modifications
6. Application of the present technology to a program
7. Others

<1. First Embodiment>
[Configuration of image processing apparatus]

図１は、本技術の画像処理装置１の構成を示すブロック図である。画像処理装置１は、カメラ１１、モータ１２、マイクロプロセッサ１３、入力部１４、出力部１５、通信部１６、メモリ１７、およびステレオカメラ１８を有している。 FIG. 1 is a block diagram illustrating a configuration of an image processing apparatus 1 according to the present technology. The image processing apparatus 1 includes a camera 11, a motor 12, a microprocessor 13, an input unit 14, an output unit 15, a communication unit 16, a memory 17, and a stereo camera 18.

カメラ１１は魚眼レンズ１１Ａを有し、所定の範囲の監視領域の被写体を撮像する。モータ１２は、カメラ１１を駆動し、その方向を所定の方向に指向させる。マイクロプロセッサ１３は各部を制御する。入力部１４はキーボード、マウス、スイッチ等により構成され、ユーザにより操作され、操作に対応する信号をマイクロプロセッサ１３に出力する。出力部１５は、LCD(Liquid Crystal Display)、スピーカなどにより構成され、所定の画像やメッセージを表示したり、音声を出力する。 The camera 11 has a fisheye lens 11A and images a subject in a monitoring area within a predetermined range. The motor 12 drives the camera 11 and directs its direction in a predetermined direction. The microprocessor 13 controls each part. The input unit 14 includes a keyboard, a mouse, a switch, and the like. The input unit 14 is operated by a user and outputs a signal corresponding to the operation to the microprocessor 13. The output unit 15 includes an LCD (Liquid Crystal Display), a speaker, and the like, and displays a predetermined image or message or outputs a sound.

通信部１６は、インターネットその他のネットワークを介して他の装置と通信する。メモリ１７は、各種の情報やプログラムを記憶する。ステレオカメラ１８は必要に応じて付加され、測距のための画像を撮像する。 The communication unit 16 communicates with other devices via the Internet or other networks. The memory 17 stores various information and programs. The stereo camera 18 is added as necessary to capture an image for distance measurement.

マイクロプロセッサ１３は、撮像部３１、画像変換部３２、顔認識部３３、判定部３４、検出部３５、演算部３６、座標変換部３７、指示部３８、選択部３９、測定部４０、記憶部４１、更新部４２、制御部４３の機能ブロックを有している。 The microprocessor 13 includes an imaging unit 31, an image conversion unit 32, a face recognition unit 33, a determination unit 34, a detection unit 35, a calculation unit 36, a coordinate conversion unit 37, an instruction unit 38, a selection unit 39, a measurement unit 40, and a storage unit. 41, an update unit 42, and a control unit 43.

The imaging unit 31 images the subject with the camera 11 and the stereo camera 18. The image conversion unit 32 converts the image captured through the fisheye lens 11A into a converted image. The face recognition unit 33 recognizes a human face as a recognition target from the converted image. The determination unit 34 executes various determination processes. The detection unit 35 detects the face, the face size, and the like. The calculation unit 36 performs calculations for obtaining distances and relational expressions and for specifying faces.

座標変換部３７は、座標変換の演算を行う。指示部３８は、距離の変更、顔の大きさの変更などの指示を行う。選択部３９は、関係式を選択する。測定部４０は、距離を測定する。記憶部４１は、距離と顔のサイズを記憶する。更新部４２は、距離と顔のサイズの関係式を更新する。制御部４３は、モータ１２を駆動し、カメラ１１とステレオカメラ１８の向きを制御する。 The coordinate conversion unit 37 performs a coordinate conversion calculation. The instruction unit 38 gives instructions such as changing the distance and changing the size of the face. The selection unit 39 selects a relational expression. The measurement unit 40 measures the distance. The storage unit 41 stores the distance and the face size. The update unit 42 updates the relational expression between the distance and the face size. The control unit 43 drives the motor 12 and controls the orientation of the camera 11 and the stereo camera 18.

[Three-dimensional position estimation process 1]

次に、画像処理装置１の３次元位置推定処理を説明する。図２は、本技術の画像処理装置１の３次元位置推定処理を説明するフローチャートである。この処理は、画像処理装置１の電源をオンしたとき開始される。 Next, the three-dimensional position estimation process of the image processing apparatus 1 will be described. FIG. 2 is a flowchart illustrating the three-dimensional position estimation process of the image processing apparatus 1 according to the present technology. This process is started when the image processing apparatus 1 is turned on.

In step S1, the imaging unit 31 captures an image with the camera 11. The camera 11 photographs the surrounding subjects through the fisheye lens 11A, and the resulting image is acquired as a fisheye image.

FIG. 3 is a diagram explaining the three-dimensional position estimation process of the image processing apparatus 1 according to the present technology. Through the imaging process by the camera 11, the fisheye image P1 shown in the upper left of FIG. 3 is acquired.

ステップＳ２において、画像変換部３２は、画像を変換する。具体的には、ステップＳ１で取得された魚眼画像Ｐ１が、変換画像Ｐ２に変換される。この実施の形態における変換画像Ｐ２は、魚眼画像Ｐ１を円筒面上に展開した展開画像とされている。図３の右上に、この変換画像Ｐ２が示されている。 In step S2, the image conversion unit 32 converts the image. Specifically, the fisheye image P1 acquired in step S1 is converted into a converted image P2. The converted image P2 in this embodiment is a developed image in which the fisheye image P1 is developed on the cylindrical surface. This converted image P2 is shown in the upper right of FIG.

ステップＳ３において顔認識部３３は、顔認識処理を実行する。すなわち、図３の右下の図に示されるように、ステップＳ２で得られた変換画像に含まれる人の顔の画像５１が変換画像Ｐ２から認識される。この認識処理は、例えば肌色、目、口、耳、鼻などの特徴を有する領域を検索することで行うことができる。 In step S3, the face recognition unit 33 performs face recognition processing. That is, as shown in the lower right diagram of FIG. 3, the human face image 51 included in the converted image obtained in step S2 is recognized from the converted image P2. This recognition process can be performed, for example, by searching for a region having characteristics such as skin color, eyes, mouth, ears, and nose.

ステップＳ１の撮像処理で得られた魚眼画像Ｐ１においては、３次元空間における上方向が画像の中心となる。そのため、３次元空間における上方向が画像の上方向であることを前提としている検出器で顔を検出し、そのサイズを迅速かつ正確に検出することが困難である。そこで本実施の形態においては、変換画像Ｐ２が生成される。これにより３次元空間における上方向の検出が容易となり、その結果、顔を検出することが容易となる。つまり、図３の右下に示されているように、変換画像Ｐ２から顔の画像５１が検出される。 In the fisheye image P1 obtained by the imaging process in step S1, the upward direction in the three-dimensional space is the center of the image. Therefore, it is difficult to detect the face with a detector that assumes that the upward direction in the three-dimensional space is the upward direction of the image, and to detect the size quickly and accurately. Therefore, in the present embodiment, a converted image P2 is generated. This facilitates the upward detection in the three-dimensional space, and as a result, the face can be easily detected. That is, as shown in the lower right of FIG. 3, a face image 51 is detected from the converted image P2.
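The conversion in step S2 can be illustrated with a minimal sketch. It assumes the projection model that this description gives later as equations (2) and (3), and it assumes that the columns of the developed image span the azimuth θ over 360 degrees while the rows span the elevation φ from 90 degrees (top) down to 0 degrees (bottom). The function name `unwrap_fisheye`, its parameters, and the nearest-neighbor sampling are hypothetical choices made for illustration, not the conversion actually specified by the patent.

```python
import math

def unwrap_fisheye(fisheye, x_c, y_c, r_c, out_w, out_h):
    """Develop a fisheye image (list of rows) into an out_h x out_w panorama.

    Assumption: column u maps to azimuth theta in [0, 2*pi) and row v maps to
    elevation phi, with the upward direction (phi near pi/2) at the top row.
    """
    height, width = len(fisheye), len(fisheye[0])
    out = [[0] * out_w for _ in range(out_h)]
    for v in range(out_h):
        # Elevation at the centre of row v: large at the top, small at the bottom.
        phi = (math.pi / 2) * (1.0 - (v + 0.5) / out_h)
        for u in range(out_w):
            theta = 2.0 * math.pi * u / out_w
            # Projection model of equations (2) and (3).
            x = x_c - r_c * math.cos(phi) * math.sin(theta)
            y = y_c + r_c * math.cos(phi) * math.cos(theta)
            xi, yi = int(round(x)), int(round(y))
            # Nearest-neighbor sampling; pixels outside the source stay 0.
            if 0 <= xi < width and 0 <= yi < height:
                out[v][u] = fisheye[yi][xi]
    return out
```

In such a developed image the upward direction of the three-dimensional space coincides with the upward direction of the image, which is the property that the face detector described above relies on.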

ステップＳ４において判定部３４は、顔が認識されたかを判定する。顔が認識されない場合、処理はステップＳ１に戻り、顔が認識されるまで、ステップＳ１乃至Ｓ４の処理が繰り返される。 In step S4, the determination unit 34 determines whether a face is recognized. If the face is not recognized, the process returns to step S1, and the processes of steps S1 to S4 are repeated until the face is recognized.

ステップＳ４において顔が認識されたと判定された場合、ステップＳ５において検出部３５は、認識された顔を魚眼画像の座標系から検出する。すなわち、図３の左側の中央に示されるように、魚眼画像Ｐ１上における顔の画像５１が検出される。 If it is determined in step S4 that the face has been recognized, in step S5, the detection unit 35 detects the recognized face from the coordinate system of the fisheye image. That is, as shown in the left center of FIG. 3, a face image 51 on the fisheye image P1 is detected.

ステップＳ６において検出部３５は、顔のサイズ、つまり大きさを検出する。具体的には、図３の左側の中央に示される魚眼画像Ｐ１上における顔の画像５１の大きさが検出される。 In step S6, the detection unit 35 detects the size of the face, that is, the size. Specifically, the size of the face image 51 on the fish-eye image P1 shown at the left center of FIG. 3 is detected.

FIG. 4 is a diagram explaining the relationship between the size of the subject image and the distance. As shown in FIG. 4, assume that, in the XYZ world coordinate system, a face image 51-1 exists at a farther distance R1 from the camera 11 and a face image 51-2 exists at a closer distance R2. In this case, comparing the sizes of the face image 51-1 and the face image 51-2 on the fisheye image P1, the nearer face image 51-2 is larger than the farther face image 51-1.

FIG. 5 is a graph showing the relationship between the size of the subject image and the distance. In FIG. 5, the horizontal axis represents the size of the face image 51 (that is, the face size), and the vertical axis represents the distance R between the camera 11 and the face. As indicated by the curve L in FIG. 5, there is a negative correlation between the face size S and the distance R. That is, as the distance R increases, the face size S decreases. The curve L is convex downward. The curve L is represented by the following equation.

$$ R = f_s(S) \tag{1} $$

ステップＳ７において演算部３６は、関係式に基づき、顔のサイズから距離を求める。ここで関係式とは、図５において曲線Ｌで表される式、すなわち顔のサイズＳと距離Ｒの関係を表す式（１）である。ステップＳ６の処理で検出された顔のサイズＳに対応する距離Ｒが、式（１）から求められる。式（１）はメモリ１７に予め記憶されている。 In step S7, the calculation unit 36 obtains the distance from the face size based on the relational expression. Here, the relational expression is an expression represented by a curve L in FIG. 5, that is, an expression (1) representing a relationship between the face size S and the distance R. A distance R corresponding to the face size S detected in the process of step S6 is obtained from the equation (1). Expression (1) is stored in the memory 17 in advance.
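As a minimal sketch of step S7, the stored relational expression can be evaluated as an ordinary function of the face size. The description only states that the curve L decreases and is convex downward; the specific form R = a / S + b used here, the function name `distance_from_size`, and the coefficients `a` and `b` are assumptions made for illustration.

```python
def distance_from_size(size, a, b):
    """Evaluate a hypothetical relational expression R = a / S + b.

    For a > 0 and S > 0 this curve is decreasing and convex downward,
    which matches the qualitative shape of curve L in FIG. 5.
    """
    if size <= 0:
        raise ValueError("face size must be positive")
    return a / size + b
```

With coefficients stored in advance, a detected face size S maps directly to a distance R, and a larger face size yields a smaller distance.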

ステップＳ８において座標変換部３７は、座標系を世界座標系に変換する。 In step S8, the coordinate conversion unit 37 converts the coordinate system to the world coordinate system.

FIG. 6 is a diagram explaining the coordinate system of a fisheye image captured through the fisheye lens 11A. In the coordinate system xy of the fisheye image in FIG. 6, the coordinates (x_C, y_C) are the coordinates at which the camera 11 is located, and r_C is the field-of-view radius of the fisheye lens 11A. Assume that the face image 51, located at a distance R from the camera 11 in the world coordinate system, is located at coordinates (x, y) in the coordinate system of the fisheye image. Let θ be the clockwise angle, with respect to the y coordinate axis, of the straight line connecting the coordinates (x_C, y_C) and (x, y). Let φ be the elevation angle of the face image 51 as seen from the camera 11, measured from the XY plane of the world coordinate system, that is, from the horizontal plane. Then the following equations hold.

$$ x = x_C - r_C \cos\phi \sin\theta \tag{2} $$
$$ y = y_C + r_C \cos\phi \cos\theta \tag{3} $$

From equations (2) and (3), the following equations are obtained.

$$ r_C \cos\phi = \frac{x_C - x}{\sin\theta} \tag{4} $$
$$ y = y_C + \frac{x_C - x}{\sin\theta} \cos\theta \tag{5} $$
$$ y = y_C + \frac{x_C - x}{\tan\theta} \tag{6} $$
$$ \theta = \tan^{-1}\left(\frac{x_C - x}{y - y_C}\right) \tag{7} $$
$$ r_C \cos\phi \sin\theta = x_C - x \tag{8} $$
$$ \cos\phi = \frac{x_C - x}{r_C \sin\theta} \tag{9} $$
$$ \phi = \cos^{-1}\left(\frac{x_C - x}{r_C \sin\theta}\right) \tag{10} $$

上記式に基づいて、演算部３６は、顔の画像５１の座標（ｘ，ｙ）から、角度θ，φを求めることができる。 Based on the above equation, the calculation unit 36 can obtain the angles θ and φ from the coordinates (x, y) of the face image 51.
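The angle computation can be sketched as follows. This is an illustrative variant, not the patent's literal procedure: it uses `math.atan2` in place of the plain arctangent of equation (7) so that all four quadrants are handled, and it obtains r_C cos φ as the radial distance from the image center, which follows from equations (2) and (3) and avoids the division by sin θ in equation (10) when θ is 0. The function name `angles_from_pixel` is hypothetical.

```python
import math

def angles_from_pixel(x, y, x_c, y_c, r_c):
    """Return (theta, phi) in radians for the fisheye pixel (x, y)."""
    dx = x_c - x  # equals r_C * cos(phi) * sin(theta), from equation (2)
    dy = y - y_c  # equals r_C * cos(phi) * cos(theta), from equation (3)
    theta = math.atan2(dx, dy)  # quadrant-aware form of equation (7)
    rho = math.hypot(dx, dy)    # equals r_C * cos(phi)
    # Clamp to 1.0 so rounding at the rim of the field of view cannot break acos.
    phi = math.acos(min(1.0, rho / r_c))
    return theta, phi
```

A pixel at the image center gives φ = 90 degrees (straight up), and a pixel on the rim of the field of view gives φ = 0 (the horizontal plane).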

FIG. 7 is a diagram explaining the world coordinate system. As shown in FIG. 7 (and in the lower left diagram of FIG. 3), the coordinates (X, Y, Z) of the face image 51 in the world coordinate system are expressed by the following equations.

$$ X = R \cos\phi \cos\theta \tag{11} $$
$$ Y = R \cos\phi \sin\theta \tag{12} $$
$$ Z = R \sin\phi \tag{13} $$

The coordinate conversion unit 37 can calculate the coordinates (X, Y, Z) of the face image 51 from equations (11) to (13), using the distance R obtained in step S7 as a parameter. The angles θ and φ can be calculated from equations (7) and (10).
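Equations (11) to (13) amount to a spherical-to-Cartesian conversion, sketched here in a few lines. The function name `to_world` and its argument order are hypothetical; the angles are taken in radians.

```python
import math

def to_world(distance, theta, phi):
    """Convert (R, theta, phi) to world coordinates (X, Y, Z) per equations (11)-(13)."""
    x = distance * math.cos(phi) * math.cos(theta)
    y = distance * math.cos(phi) * math.sin(theta)
    z = distance * math.sin(phi)
    return x, y, z
```

Because the three components are those of a vector of length R, the norm of the returned point equals the distance obtained from the relational expression.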

ステップＳ９において判定部３４は、推定処理を終了するかを判定する。ユーザから処理の終了が指示されていない場合、処理はステップＳ１に戻り、それ以降の処理が繰り返される。 In step S9, the determination unit 34 determines whether to end the estimation process. If the end of the process is not instructed by the user, the process returns to step S1 and the subsequent processes are repeated.

ステップＳ９において、ユーザから処理の終了が指示されたと判定された場合、推定処理は終了される。 If it is determined in step S9 that the user has instructed the end of the process, the estimation process ends.

以上のようにして、１個の魚眼レンズ１１Ａを有する１台のカメラ１１の画像から、迅速かつ正確に、世界座標を求めることができる。その結果、顔の画像５１の世界座標系における移動軌跡を予測したり、顔の画像５１をより認識し易い方向にカメラ１１を指向させたりすることができる。 As described above, the world coordinates can be obtained quickly and accurately from the image of one camera 11 having one fisheye lens 11A. As a result, the movement trajectory of the face image 51 in the world coordinate system can be predicted, or the camera 11 can be pointed in a direction in which the face image 51 can be more easily recognized.

[Learning process]

上述したように、図２の３次元位置推定処理を実行する場合、距離と顔のサイズの関係が予め判っている必要がある。つまり、関係式を予め求めておく必要がある。そこで次に、距離と顔のサイズの関係式を予め学習し、求めておく方法について、図８と図９を参照して説明する。なお、図８と図９の実施の形態においては、異なる大きさの複数の顔についての複数の関係式を学習するものとする。 As described above, when the three-dimensional position estimation process of FIG. 2 is executed, the relationship between the distance and the face size needs to be known in advance. That is, it is necessary to obtain a relational expression in advance. Therefore, a method for learning in advance and obtaining a relational expression between the distance and the face size will be described with reference to FIGS. In the embodiment shown in FIGS. 8 and 9, a plurality of relational expressions for a plurality of faces having different sizes are learned.

FIG. 8 is a flowchart explaining the learning process. Note that the processes in steps S41, S42, S44, and S45 in FIG. 8 are the same as the processes in steps S1, S2, S3, and S6 in FIG. 2, so they are described only briefly.

In step S41, the imaging unit 31 captures an image with the camera 11. The camera 11 photographs the surrounding subjects through the fisheye lens 11A, and the resulting image is acquired as a fisheye image.

ステップＳ４２において、画像変換部３２は、画像を変換する。すなわち、ステップＳ４１で取得された魚眼画像Ｐ１が変換画像Ｐ２に変換される。 In step S42, the image conversion unit 32 converts the image. That is, the fisheye image P1 acquired in step S41 is converted into the converted image P2.

ステップＳ４３において検出部３５は、所定の大きさの顔が配置された距離を検出する。つまり、ステレオカメラ１８または別途設けられた測距装置により、顔までの距離が測定され、その結果が取得される。 In step S43, the detection unit 35 detects the distance at which a face having a predetermined size is arranged. That is, the distance to the face is measured by the stereo camera 18 or a distance measuring device provided separately, and the result is acquired.

ステップＳ４４において顔認識部３３は、顔認識処理を実行する。すなわち、ステップＳ４２で得られた変換画像に含まれる人の顔の画像が認識される。 In step S44, the face recognition unit 33 executes face recognition processing. That is, the human face image included in the converted image obtained in step S42 is recognized.

ステップＳ４５において検出部３５は、顔のサイズを検出する。 In step S45, the detection unit 35 detects the size of the face.

次にステップＳ４６において記憶部４１は、距離と顔のサイズとの関係を記憶する。つまり、ステップＳ４３で検出された距離と、ステップＳ４５で検出された顔のサイズが、対応づけられてメモリ１７に記憶される。 Next, in step S46, the storage unit 41 stores the relationship between the distance and the face size. That is, the distance detected in step S43 and the face size detected in step S45 are associated with each other and stored in the memory 17.

ステップＳ４７において判定部３４は、必要な範囲を測定したかを判定する。つまり、監視する範囲についての測定が完了したかが判定される。 In step S47, the determination unit 34 determines whether the necessary range has been measured. That is, it is determined whether the measurement for the monitored range is completed.

まだ必要な範囲の測定が完了していない場合、ステップＳ４８において指示部３８は、距離の変更を指示する。例えば出力部１５に距離の変更を指示するメッセージが表示される。ユーザはこの指示に従って、顔の位置を変更する。 If the measurement of the necessary range has not been completed, the instruction unit 38 instructs to change the distance in step S48. For example, a message instructing the change of the distance is displayed on the output unit 15. The user changes the position of the face according to this instruction.

Thereafter, the process returns to step S41, and the same processing is executed for the face placed at the new position.

ステップＳ４７において必要な範囲の測定が完了したと判定された場合、ステップＳ４９において判定部３４は、全ての大きさを測定したかを判定する。まだ全ての大きさの顔を測定していない場合、ステップＳ５０において指示部３８は、顔の大きさの変更を指示する。例えば、出力部１５に顔の大きさの変更を指示するメッセージが表示される。ユーザはこの指示に従って、顔の大きさを変更する。 If it is determined in step S47 that the measurement of the necessary range has been completed, the determination unit 34 determines in step S49 whether all the sizes have been measured. If the faces of all sizes have not been measured yet, the instruction unit 38 instructs to change the size of the face in step S50. For example, a message instructing to change the face size is displayed on the output unit 15. The user changes the size of the face according to this instruction.

その後、処理はステップＳ４１に戻り、新たな大きさの顔について、同様の処理が実行される。 Thereafter, the process returns to step S41, and the same process is executed for the face of a new size.

ステップＳ４９において全ての大きさの顔が測定されたと判定された場合、ステップＳ６１において演算部３６は、各サイズの顔と距離の関係式を演算する。 When it is determined in step S49 that faces of all sizes have been measured, in step S61, the calculation unit 36 calculates a relational expression between the face of each size and the distance.

FIG. 9 is a diagram explaining the expressions representing the relationship between the size of the subject image and the distance. In the manner described above, for example, a curve L1 representing the relationship between the distance and the face size for a large face image 51L is obtained by the least squares method or the like, and the relational expression R = f_L(S), which is the function representing that curve, is obtained.

Similarly, a curve L2 representing the relationship between the distance and the face size for a medium-sized face image 51M is obtained by the least squares method or the like, and the relational expression R = f_M(S) representing it is obtained. Further, a curve L3 representing the relationship between the distance and the face size for a small face image 51S is obtained by the least squares method or the like, and the relational expression R = f_S(S) representing it is obtained.

求められた関係式は、メモリ１７に記憶される。 The obtained relational expression is stored in the memory 17.
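The least squares step can be sketched as follows for one face size class. The description does not specify the functional form of the relational expression, so the model R = a / S + b is an assumption; under it, the fit reduces to simple linear regression of R on 1 / S. The function name `fit_relational_expression` and the sample layout (pairs of distance and face size, as stored in step S46) are hypothetical.

```python
def fit_relational_expression(samples):
    """Fit R = a / S + b to (distance, size) pairs by ordinary least squares.

    Assumption: the inverse model is chosen for illustration only; the source
    states just that a curve is obtained by the least squares method or the like.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    us = [1.0 / size for _, size in samples]  # regress R on u = 1 / S
    rs = [dist for dist, _ in samples]
    n = len(samples)
    mean_u = sum(us) / n
    mean_r = sum(rs) / n
    var_u = sum((u - mean_u) ** 2 for u in us)
    if var_u == 0.0:
        raise ValueError("samples must cover at least two distinct sizes")
    cov = sum((u - mean_u) * (r - mean_r) for u, r in zip(us, rs))
    a = cov / var_u
    b = mean_r - a * mean_u
    return a, b
```

Running the fit once per face size class (large, medium, small) would yield one coefficient pair per relational expression, which could then be stored for later use.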

<2. Second Embodiment>
[Three-dimensional position estimation process 2]

図２の３次元位置推定処理は、関係式が１つである場合の処理であるが、関係式が複数存在する場合の処理について、図１０乃至図１４を参照して説明する。 The three-dimensional position estimation process of FIG. 2 is a process when there is one relational expression, but a process when there are a plurality of relational expressions will be described with reference to FIGS. 10 to 14.

図１０と図１１は、本技術の画像処理装置１の３次元位置推定処理を説明するフローチャートである。なお、図１０と図１１のステップＳ８１乃至Ｓ８６と、ステップＳ８８乃至Ｓ９０の処理は、図２のステップＳ１乃至Ｓ９の処理と同様の処理であり、ステップＳ８７と、ステップＳ９１乃至Ｓ９６の処理が、図２の３次元位置推定処理と異なっている。以下においては、主に図２における場合と異なる処理を中心に説明する。 10 and 11 are flowcharts illustrating the three-dimensional position estimation process of the image processing apparatus 1 according to the present technology. Note that the processes in steps S81 to S86 and steps S88 to S90 in FIGS. 10 and 11 are the same as the processes in steps S1 to S9 in FIG. 2, and the processes in steps S87 and S91 to S96 are as follows. This is different from the three-dimensional position estimation process of FIG. In the following, a description will be given mainly of processing different from the case in FIG.

That is, as in the case of FIG. 2, in steps S81 to S86, the fisheye image obtained by photographing with the camera 11 is converted into a converted image, a face is recognized from the converted image, and the size of the recognized face on the fisheye image is detected.

魚眼画像上の顔のサイズが検出された後、ステップＳ８７において関係式選択処理が実行される。この関係式選択処理の詳細について、図１２のフローチャートを参照して説明する。この実施の形態の場合、魚眼レンズ１１Ａを有するカメラ１１以外に、ステレオカメラ１８が設けられている。 After the face size on the fisheye image is detected, the relational expression selection process is executed in step S87. Details of this relational expression selection processing will be described with reference to the flowchart of FIG. In the case of this embodiment, a stereo camera 18 is provided in addition to the camera 11 having the fisheye lens 11A.

FIG. 12 is a flowchart explaining the relational expression selection process. In step S111, the measurement unit 40 measures the distance using the stereo camera 18. That is, the stereo camera 18 images the face, and the measurement unit 40 calculates the distance to the face from that image.

ステップＳ１１２において選択部３９は、距離と顔のサイズから関係式を選択する。 In step S112, the selection unit 39 selects a relational expression from the distance and the face size.

FIG. 13 is a diagram explaining the principle of obtaining the distance from the size of the subject image. Assume that, as shown in FIG. 13, three relational expressions R = f_L(S), R = f_M(S), and R = f_S(S) are stored. When the face size detected in the process of step S85 is S_1, the distance corresponding to the face size S_1 is R_1 on the relational expression R = f_L(S), R_2 on the relational expression R = f_M(S), and R_3 on the relational expression R = f_S(S).

The selection unit 39 selects, from among the distances R_1 to R_3, the relational expression corresponding to the value closest to the distance R_0 measured in step S111. For example, when the value of the distance R_1 is closest to the value of the distance R_0, the relational expression R = f_L(S) is selected. That is, in this case, the face of the person being photographed is determined to be of the large size.
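The selection rule can be sketched as a nearest-value search over the candidate expressions. The function name `select_relational_expression` and the representation of the stored expressions as a dictionary of callables keyed by size class are hypothetical.

```python
def select_relational_expression(expressions, face_size, measured_distance):
    """Return the key of the expression whose predicted distance is closest
    to the measured distance R_0.

    expressions: dict mapping a label (e.g. "L", "M", "S") to a callable
    that takes a face size S and returns a distance R.
    """
    return min(
        expressions,
        key=lambda name: abs(expressions[name](face_size) - measured_distance),
    )
```

Once a label has been selected for a tracked face, the corresponding callable can be reused for subsequent frames without measuring the distance again.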

以上のようにして、１つの関係式が選択されると、その後、ステップＳ８８乃至Ｓ９６において、図２のステップＳ７乃至Ｓ９，Ｓ１乃至Ｓ６の処理と同様の処理が実行される。すなわち、ステップＳ８８，Ｓ８９において、選択された関係式に基づいて顔のサイズから距離が求められ、さらに魚眼画像の座標系が世界座標系に変換される。 When one relational expression is selected as described above, processing similar to the processing in steps S7 to S9 and S1 to S6 in FIG. 2 is thereafter performed in steps S88 to S96. That is, in steps S88 and S89, the distance is obtained from the face size based on the selected relational expression, and the coordinate system of the fisheye image is converted into the world coordinate system.

ステップＳ９０で、まだ推定処理の終了が指示されていないと判定された場合、ステップＳ９１乃至Ｓ９６で、ステップＳ８１乃至Ｓ８６と同様の処理が実行される。すなわち、カメラ１１で新たに撮影された魚眼画像が変換画像に変換される。そして変換画像から顔が認識されるまで顔認識処理が行われ、顔が認識されると、顔のサイズが検出される。さらに検出された顔のサイズに基づいて、ステップＳ８７で選択された関係式から距離が求められる。以下同様の処理が繰り返される。 If it is determined in step S90 that the end of the estimation process has not yet been instructed, the same processes as in steps S81 to S86 are executed in steps S91 to S96. That is, a fisheye image newly captured by the camera 11 is converted into a converted image. Face recognition processing is then performed until a face is recognized from the converted image, and when a face is recognized, its size is detected. The distance is then obtained from the relational expression selected in step S87, based on the detected face size. Thereafter, the same processing is repeated.

このように、１度関係式が選択されると、以後、その選択された関係式が使用される。複数の顔が存在する場合、検出された顔がトラッキングされ、同じ顔については同じ関係式が使用される。 Thus, once a relational expression is selected, the selected relational expression is used thereafter. If there are multiple faces, each detected face is tracked, and the same relational expression is used for the same face.
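
One way to keep the same relational expression for the same tracked face is a small cache keyed by a track identifier. This is a sketch only: the patent does not describe the tracker or any data structure, and `RelationCache`, the track IDs, and the toy selector below are hypothetical.

```python
from typing import Callable, Dict, Hashable


class RelationCache:
    """Remembers which relational expression was selected for each tracked face."""

    def __init__(self, select: Callable[[float, float], str]) -> None:
        # Hypothetical selector: (face size, measured distance) -> relation name.
        self._select = select
        self._by_track: Dict[Hashable, str] = {}

    def relation_for(self, track_id: Hashable, face_size: float, distance: float) -> str:
        # Select only the first time a track is seen; reuse the choice afterwards.
        if track_id not in self._by_track:
            self._by_track[track_id] = self._select(face_size, distance)
        return self._by_track[track_id]


# Toy selector for illustration only.
cache = RelationCache(lambda size, dist: "L" if size * dist > 220.0 else "M")
first = cache.relation_for("face-1", 100.0, 2.4)  # selector runs for a new track
later = cache.relation_for("face-1", 60.0, 3.0)   # earlier choice is reused
```

A second face with a different track ID would trigger its own selection, independent of the first.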

以上のようにして、この実施の形態の場合、ステレオカメラ１８による測定結果に基づいて、１つの関係式が選択され、選択された関係式に基づいて、距離が求められる。 As described above, in the case of this embodiment, one relational expression is selected based on the measurement result by the stereo camera 18, and the distance is obtained based on the selected relational expression.

ステレオカメラ１８を用いる場合、ステップＳ８８の距離を求める処理でも、ステレオカメラ１８による測定結果をそのまま用いることが可能である。しかし、カメラ１１で撮影できてもステレオカメラ１８で撮影できない範囲が存在することもある。そこで、ステレオカメラ１８による測定結果は関係式の選択にのみ用い、距離は関係式から検出するようにした方が、全体としてのバラツキを少なくし、統一のとれた検出を行うことが可能となる。 When the stereo camera 18 is used, its measurement result could also be used directly in the process of obtaining the distance in step S88. However, there may be a range that the camera 11 can capture but the stereo camera 18 cannot. Therefore, using the measurement result of the stereo camera 18 only for selecting the relational expression, and obtaining the distance from the relational expression, reduces the overall variation and yields consistent detection.

図１２の実施の形態の場合、ステレオカメラ１８による測定結果に基づいて、１つの関係式が選択されたが、他の方法で関係式を選択することも可能である。図１４を参照して、他の方法で関係式を選択する例を説明する。 In the case of the embodiment of FIG. 12, one relational expression is selected based on the measurement result by the stereo camera 18, but it is also possible to select the relational expression by another method. An example of selecting a relational expression by another method will be described with reference to FIG.

図１４は、関係式選択処理を説明するフローチャートである。この処理は、図１０のステップＳ８７の関係式選択処理の他の例を表す。この実施の形態の場合、顔からその人物を特定することができ、その特定された人物の顔のサイズがメモリ１７に予め登録されている。 FIG. 14 is a flowchart for explaining the relational expression selection process. This process represents another example of the relational expression selection process in step S87 of FIG. In the case of this embodiment, the person can be specified from the face, and the size of the face of the specified person is registered in the memory 17 in advance.

上述した場合と同様に、ステップＳ８１乃至Ｓ８６において、カメラ１１で撮影して得られた魚眼画像が変換画像に変換され、変換画像から顔が認識され、認識された顔の魚眼画像上のサイズが検出される。 As in the case described above, in steps S81 to S86, the fisheye image captured by the camera 11 is converted into a converted image, a face is recognized from the converted image, and the size of the recognized face on the fisheye image is detected.

魚眼画像上の顔のサイズが検出された後、ステップＳ８７において、図１４の関係式選択処理が実行される。 After the face size on the fisheye image is detected, the relational expression selection process of FIG. 14 is executed in step S87.

図１４のステップＳ１３１において検出部３５は特定部として機能し、顔を特定する。すなわち、メモリ１７には、監視対象の人物の顔とその顔のサイズが予め登録されており、ステップＳ８３の処理で認識された顔が、予め登録されている顔と比較され、特定される。 In step S131 in FIG. 14, the detection unit 35 functions as a specifying unit and specifies a face. That is, the face of the person to be monitored and the size of the face are registered in the memory 17 in advance, and the face recognized in the process of step S83 is compared with the face registered in advance and specified.

ステップＳ１３２において選択部３９は、特定された顔のサイズに対応する関係式を選択する。例えば各人物の顔のサイズは、図１３のサイズが大きい顔の画像５１Ｌ、サイズが中程度の顔の画像５１Ｍ、およびサイズが小さい顔の画像５１Ｓのいずれかに分類されており、特定された人物の顔のサイズに対応する関係式が選択される。特定された顔のサイズが小さい顔の画像５１Ｓである場合には、関係式Ｒ＝ｆ_Ｓ（Ｓ）が選択される。 In step S132, the selection unit 39 selects the relational expression corresponding to the size of the identified face. For example, the face size of each person is classified in advance as one of the large face image 51L, the medium face image 51M, and the small face image 51S shown in FIG. 13, and the relational expression corresponding to the face size of the identified person is selected. When the identified face is classified as the small face image 51S, the relational expression R = f_S(S) is selected.
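
Selecting a relational expression from a registered identity amounts to a table lookup. The sketch below assumes a registry that maps each person to a size class and, for illustration only, curves of the form R = k / S. The registry contents, the constants, and the handling of an unregistered person are hypothetical and are not described in the patent.

```python
from typing import Dict, Optional

# Hypothetical registry: person ID -> size class registered in advance.
REGISTERED_SIZE_CLASS: Dict[str, str] = {"alice": "L", "bob": "S"}

# Hypothetical constants k for curves of the assumed form R = k / S.
RELATION_CONSTANT: Dict[str, float] = {"L": 240.0, "M": 200.0, "S": 160.0}


def distance_for_person(person_id: str, face_size_px: float) -> Optional[float]:
    """Look up the person's size class and evaluate the matching relation."""
    size_class = REGISTERED_SIZE_CLASS.get(person_id)
    if size_class is None:
        # Unregistered person: this sketch simply reports no distance.
        return None
    return RELATION_CONSTANT[size_class] / face_size_px


bob_distance = distance_for_person("bob", 80.0)
```

No stereo measurement is needed in this variant, because the size class comes from the registry rather than from a measured distance.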

以上のようにして、１つの関係式が選択されると、図１１のステップＳ８８乃至Ｓ９６において、上述した場合と同様の処理が実行される。 When one relational expression is selected as described above, processing similar to that described above is executed in steps S88 to S96 in FIG.

この実施の形態の場合、ステレオカメラ１８が不要なので、ステレオカメラ１８を設ける場合に比べて、構成を簡略化し、低コスト化することができる。また、装置を取り付ける空間も狭くてよい。 In this embodiment, since the stereo camera 18 is unnecessary, the configuration can be simplified and the cost can be reduced as compared with the case where the stereo camera 18 is provided. Further, the space for installing the device may be narrow.

＜３．第３の実施の形態＞
［３次元位置推定処理３］ <3. Third Embodiment>
[Three-dimensional position estimation process 3]

図２の実施の形態、並びに図１０と図１１の実施の形態においては、関係式を予め用意しておく必要があったが、関係式を適宜追加するようにすることも可能である。以下、図１５と図１６を参照して、この場合の実施の形態について説明する。この実施の形態の場合にも、魚眼レンズ１１Ａを有するカメラ１１以外に、ステレオカメラ１８が設けられている。 In the embodiment of FIG. 2 and the embodiments of FIGS. 10 and 11, it is necessary to prepare the relational expression in advance, but it is possible to add the relational expression as appropriate. Hereinafter, an embodiment in this case will be described with reference to FIGS. 15 and 16. Also in this embodiment, a stereo camera 18 is provided in addition to the camera 11 having the fisheye lens 11A.

図１５と図１６は、本技術の画像処理装置１の３次元位置推定処理を説明するフローチャートである。図１５と図１６のステップＳ１６１乃至Ｓ１６６、ステップＳ１７０乃至Ｓ１７２の処理は、図２のステップＳ１乃至Ｓ９の処理と同様の処理である。ステップＳ１６７乃至Ｓ１６９、ステップＳ１７３乃至Ｓ１８１の処理が、図２における場合と異なっている。以下、主に、異なっている処理について説明する。 FIGS. 15 and 16 are flowcharts illustrating the three-dimensional position estimation process of the image processing apparatus 1 of the present technology. The processes in steps S161 to S166 and steps S170 to S172 in FIGS. 15 and 16 are the same as the processes in steps S1 to S9 in FIG. The processes in steps S167 to S169 and steps S173 to S181 are different from those in FIG. Hereinafter, mainly different processes will be described.

ステップＳ１６１乃至Ｓ１６６において、カメラ１１で撮影して得られた魚眼画像が変換画像に変換され、変換画像から顔が認識され、認識された魚眼画像上の顔のサイズが検出される。 In steps S161 to S166, the fisheye image obtained by photographing with the camera 11 is converted into a converted image, the face is recognized from the converted image, and the size of the face on the recognized fisheye image is detected.

魚眼画像上の顔のサイズが検出された後、ステップＳ１６７，Ｓ１６８において、図１２のステップＳ１１１，Ｓ１１２における場合と同様の処理が実行される。すなわち、ステップＳ１６７において測定部４０は、ステレオカメラ１８により距離を測定する。このとき、ステレオカメラ１８が顔を撮像し、測定部４０はその画像から顔までの距離を演算する。 After the face size on the fisheye image is detected, the same processing as in steps S111 and S112 in FIG. 12 is executed in steps S167 and S168. That is, in step S167, the measurement unit 40 measures the distance using the stereo camera 18. At this time, the stereo camera 18 captures an image of the face, and the measurement unit 40 calculates the distance to the face from that image.

ステップＳ１６８において選択部３９は、距離と顔のサイズから関係式を選択する。すなわち図１３を参照して上述したように、選択部３９は、距離Ｒ_１乃至Ｒ_３のうち、ステップＳ１６７で測定された距離Ｒ_０に最も近い値に対応する関係式を選択する。例えば距離Ｒ_１の値が距離Ｒ_０の値に最も近い場合、関係式Ｒ＝ｆ_Ｌ（Ｓ）が選択される。 In step S168, the selection unit 39 selects a relational expression from the distance and the face size. That is, as described above with reference to FIG. 13, the selection unit 39 selects, from among the distances R_1 to R_3, the relational expression corresponding to the value closest to the distance R_0 measured in step S167. For example, when the value of the distance R_1 is closest to the value of the distance R_0, the relational expression R = f_L(S) is selected.

以上のようにして、１つの関係式が選択されると、ステップＳ１６９において判定部３４は、適切な関係式があるかを判定する。つまり、ステップＳ１６８の処理で、適切な関係式が選択されたかが判定される。例えばステップＳ１６６で検出された顔のサイズと各関係式の顔のサイズとの差が、所定の閾値以内であるかが判定され、差が閾値以内である関係式が存在する場合、適切な関係式が存在すると判定される。 When one relational expression has been selected as described above, the determination unit 34 determines in step S169 whether there is an appropriate relational expression. That is, it is determined whether an appropriate relational expression was selected in the process of step S168. For example, it is determined whether the difference between the face size detected in step S166 and the face size of each relational expression is within a predetermined threshold; if a relational expression whose difference is within the threshold exists, it is determined that an appropriate relational expression exists.

ステップＳ１６９において適切な関係式が存在すると判定された場合、ステップＳ１７０乃至Ｓ１７２において、図２のステップＳ７乃至Ｓ９の処理と同様の処理が実行される。すなわち、ステップＳ１７０,Ｓ１７１において、選択された関係式に基づいて顔のサイズから距離が求められ、さらに魚眼画像の座標系が世界座標系に変換される。 When it is determined in step S169 that an appropriate relational expression exists, in steps S170 to S172, processing similar to the processing in steps S7 to S9 in FIG. 2 is executed. That is, in steps S170 and S171, the distance is obtained from the face size based on the selected relational expression, and the coordinate system of the fisheye image is converted into the world coordinate system.

ステップＳ１７２において判定部３４は、推定処理の終了が指示されたかを判定する。ステップＳ１７２で、まだ推定処理の終了が指示されていないと判定された場合、ステップＳ１７３乃至Ｓ１７８、ステップＳ１７０乃至Ｓ１７２の処理が実行される。ステップＳ１７３乃至Ｓ１７８の処理は、ステップＳ１６１乃至Ｓ１６６の処理と同様の処理である。 In step S172, the determination unit 34 determines whether the end of the estimation process has been instructed. If it is determined in step S172 that the end of the estimation process has not yet been instructed, the processes of steps S173 to S178 and steps S170 to S172 are executed. The processes in steps S173 to S178 are the same as the processes in steps S161 to S166.

すなわち、新たな魚眼画像が取得され、変換画像に変換される。変換画像から顔が認識されるまで顔認識処理が行われ、顔が認識されると、魚眼画像上の顔のサイズが検出される。 That is, a new fisheye image is acquired and converted into a converted image. Face recognition processing is performed until a face is recognized from the converted image. When the face is recognized, the size of the face on the fisheye image is detected.

ステップＳ１７０において演算部３６は、ステップＳ１６８の処理で選択されている関係式に基づいて、ステップＳ１７８の処理で検出された顔のサイズに対応する距離を求める。ステップＳ１７１において座標変換部３７は、座標系を世界座標系に変換する。 In step S170, the calculation unit 36 obtains a distance corresponding to the size of the face detected in step S178 based on the relational expression selected in step S168. In step S171, the coordinate conversion unit 37 converts the coordinate system to the world coordinate system.

ステップＳ１７２において推定処理を終了するかが判定され、まだ終了が指示されていない場合、処理はステップＳ１７３に戻り、それ以降の処理が繰り返される。 In step S172, it is determined whether or not the estimation process is to be terminated. If the termination has not been instructed yet, the process returns to step S173, and the subsequent processes are repeated.

ステップＳ１６９において、適切な関係式が存在しないと判定された場合、つまり、ステップＳ１６６で検出された顔のサイズと各関係式の顔のサイズとの差が閾値以内ではない場合、ステップＳ１７９以降の処理が実行される。ステップＳ１７９において記憶部４１は、距離と顔のサイズを記憶する。すなわち、ステップＳ１６６で検出された顔のサイズと、ステップＳ１６７の処理でステレオカメラ１８により測定された距離が、対応づけてメモリ１７に記憶される。 If it is determined in step S169 that an appropriate relational expression does not exist, that is, if the difference between the face size detected in step S166 and the face size of each relational expression is not within the threshold value, step S179 and the subsequent steps are performed. Processing is executed. In step S179, the storage unit 41 stores the distance and the face size. That is, the face size detected in step S166 and the distance measured by the stereo camera 18 in the process of step S167 are stored in the memory 17 in association with each other.

ステップＳ１８０において判定部３４は、データが十分記憶されたかを判定する。距離と顔のサイズのデータが、関係式を演算するのに十分なデータであるかは、例えば次のように判定することができる。例えば、図１３に示される各顔のサイズの関係式を表す曲線Ｌ_１乃至Ｌ_３は、それぞれほぼ相似と考えることができる。この場合、所定の値の距離と顔のサイズで規定される１点の座標が決まれば、そこを通る相似の曲線は一義的に特定することができる。このように考える場合、１つのデータが十分なデータとなる。 In step S180, the determination unit 34 determines whether sufficient data has been stored. Whether the distance and face size data are sufficient to calculate a relational expression can be determined, for example, as follows. The curves L_1 to L_3 representing the relational expressions for the respective face sizes shown in FIG. 13 can be regarded as substantially similar to one another. In this case, once the coordinates of one point defined by a given distance and face size are determined, the similar curve passing through that point can be uniquely identified. Under this assumption, a single data point is sufficient.

また、複数の座標の平均値を演算し、その平均値で表される１つの点を通る相似の曲線を求めることもできる。この場合、平均値を求めるための任意の数のデータが十分なデータとなる。 It is also possible to calculate an average value of a plurality of coordinates and obtain a similar curve passing through one point represented by the average value. In this case, an arbitrary number of data for obtaining the average value is sufficient data.

さらに、複数のデータから最小自乗法により１つの関係式を演算する場合には、最小自乗法により１つの関係式を演算するのに必要な数のデータが、十分なデータとなる。 Further, when one relational expression is calculated from a plurality of data by the least square method, the number of data necessary for calculating one relational expression by the least square method is sufficient data.
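
The three ways of deriving a new relational expression described above (a single point, the mean of several points, and least squares) can be sketched as follows. The patent does not give the shape of the curves, so this sketch assumes the one-parameter family R = k / S as the "similar curves"; all function names are hypothetical.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (face size S, measured distance R)


def k_from_point(size: float, distance: float) -> float:
    """Single point: the curve R = k / S through (S, R) has k = S * R."""
    return size * distance


def k_from_mean(samples: List[Sample]) -> float:
    """Average the coordinates, then fit the curve through the mean point.

    Reasonable when the samples cluster around one point (repeated noisy
    measurements); widely spread samples would bias k for a convex curve.
    """
    mean_s = sum(s for s, _ in samples) / len(samples)
    mean_r = sum(r for _, r in samples) / len(samples)
    return k_from_point(mean_s, mean_r)


def k_least_squares(samples: List[Sample]) -> float:
    """Minimise sum((R_i - k / S_i)^2): k = sum(R_i / S_i) / sum(1 / S_i^2)."""
    numerator = sum(r / s for s, r in samples)
    denominator = sum(1.0 / (s * s) for s, _ in samples)
    return numerator / denominator


clustered = [(100.0, 2.0), (102.0, 1.96), (98.0, 2.04)]
spread = [(100.0, 2.0), (50.0, 4.0), (200.0, 1.0)]
k_single = k_from_point(100.0, 2.0)
k_mean = k_from_mean(clustered)
k_lsq = k_least_squares(spread)
```

With a one-parameter family, one sample already determines the curve, which matches the statement that a single data point can be sufficient; the least-squares variant needs at least as many samples as the family has free parameters.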

ステップＳ１８０においてデータが十分記憶されたと判定された場合、ステップＳ１８１において演算部３６は、所定の距離と顔のサイズの関係式を演算し、追加する。すなわち、新たな関係式がメモリ１７に記憶される。 If it is determined in step S180 that the data has been sufficiently stored, in step S181, the calculator 36 calculates and adds a relational expression between the predetermined distance and the face size. That is, a new relational expression is stored in the memory 17.

ステップＳ１８１の処理の後、ステップＳ１７２において、推定処理の終了が指示されたかが判定され、まだ終了が指示されていない場合、処理はステップＳ１７３に進み、それ以降の処理が実行される。ステップＳ１７２において、推定処理の終了が指示されたと判定された場合、処理は終了される。 After the process of step S181, it is determined in step S172 whether or not the end of the estimation process has been instructed. If the end has not been instructed yet, the process proceeds to step S173, and the subsequent processes are executed. If it is determined in step S172 that the end of the estimation process has been instructed, the process ends.

一方、ステップＳ１８０においてまだデータが十分記憶されていないと判定された場合、ステップＳ１８１の処理をスキップし、ステップＳ１７２の処理に移行してもよい。しかしこの実施の形態においては、ステップＳ１８０でまだデータが十分記憶されていないと判定された場合、ステップＳ１８２において、最も近い関係式を選択する処理が実行される。つまり、いまの場合、ステップＳ１６６で検出された顔のサイズと各関係式の顔のサイズとの差が閾値を超えている（ステップＳ１６９でそのように判定されている）のであるが、その差が最も小さい関係式が選択される。その後、ステップＳ１６９で適切な関係式が存在すると判定された場合と同様に、処理はステップＳ１７０に進み、それ以降の処理が実行される。 On the other hand, when it is determined in step S180 that the data has not been sufficiently stored, the process of step S181 may be skipped and the process may proceed to step S172. However, in this embodiment, when it is determined in step S180 that the data is not yet sufficiently stored, in step S182, processing for selecting the closest relational expression is executed. That is, in this case, the difference between the face size detected in step S166 and the face size of each relational expression exceeds the threshold value (determined as such in step S169). The relational expression with the smallest is selected. Thereafter, similarly to the case where it is determined in step S169 that an appropriate relational expression exists, the processing proceeds to step S170, and the subsequent processing is executed.
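
The branch structure of steps S169 and S179 to S182 can be sketched as one function. Several points here are assumptions made for illustration: curves of the form R = k / S, a "difference in face size" interpreted as the gap between the detected size and the size each curve predicts at the measured distance, a least-squares fit for the new curve, and at least one stored relation. The function and variable names are hypothetical.

```python
from typing import Dict, List, Tuple


def step_s169_onward(face_size: float, distance: float,
                     constants: Dict[str, float],
                     samples: List[Tuple[float, float]],
                     threshold: float, min_samples: int = 1) -> Tuple[str, str]:
    """Return ("use", name) to compute a distance, or ("added", name) after S181."""
    # S169: size each stored curve R = k / S predicts at the measured distance.
    diffs = {name: abs(face_size - k / distance) for name, k in constants.items()}
    nearest = min(diffs, key=diffs.get)
    if diffs[nearest] <= threshold:
        return ("use", nearest)
    # S179: no appropriate relation, so store the (size, distance) pair.
    samples.append((face_size, distance))
    # S180: enough data to compute a new relation?
    if len(samples) >= min_samples:
        # S181: least-squares k for R = k / S, stored as a new relation.
        k_new = (sum(r / s for s, r in samples)
                 / sum(1.0 / (s * s) for s, _ in samples))
        name = f"added_{len(constants)}"
        constants[name] = k_new
        samples.clear()
        return ("added", name)
    # S182: not enough data yet, fall back to the nearest existing relation.
    return ("use", nearest)


constants = {"L": 240.0, "M": 200.0, "S": 160.0}
pending: List[Tuple[float, float]] = []
r_ok = step_s169_onward(103.0, 2.0, constants, pending, threshold=5.0, min_samples=2)
r_fallback = step_s169_onward(140.0, 2.0, constants, pending, threshold=5.0, min_samples=2)
r_added = step_s169_onward(141.0, 2.0, constants, pending, threshold=5.0, min_samples=2)
```

In this run the first face matches the medium curve, the second falls back to the nearest curve while data is still being collected, and the third triggers the addition of a new curve.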

以上のようにして、この実施の形態の場合、新たな関係式が適宜追加される。 As described above, in the case of this embodiment, new relational expressions are added as appropriate.

＜４．第４の実施の形態＞
［３次元位置推定処理４］ <4. Fourth Embodiment>
[Three-dimensional position estimation process 4]

次に、図１７と図１８を参照して、関係式を更新する実施の形態について説明する。この実施の形態の場合にも、魚眼レンズ１１Ａを有するカメラ１１以外に、ステレオカメラ１８が設けられている。また、この実施の形態においては、メモリ１７に顔のサイズが既知の特定の顔が予め記憶されている。 Next, an embodiment for updating the relational expression will be described with reference to FIGS. 17 and 18. Also in this embodiment, a stereo camera 18 is provided in addition to the camera 11 having the fisheye lens 11A. In this embodiment, a specific face whose face size is known is stored in the memory 17 in advance.

図１７と図１８は、本技術の画像処理装置１の３次元位置推定処理を説明するフローチャートである。図１７と図１８のステップＳ２０１乃至Ｓ２０６、ステップＳ２１２乃至Ｓ２１４の処理は、図２のステップＳ１乃至Ｓ９の処理と同様の処理である。ステップＳ２０７乃至Ｓ２１１、ステップＳ２１５乃至Ｓ２２２の処理が、図２における場合と異なっている。以下、主に、異なっている処理について説明する。 FIGS. 17 and 18 are flowcharts illustrating the three-dimensional position estimation process of the image processing apparatus 1 according to the present technology. The processes in steps S201 to S206 and steps S212 to S214 in FIGS. 17 and 18 are the same as the processes in steps S1 to S9 in FIG. The processes in steps S207 to S211 and steps S215 to S222 are different from those in FIG. Hereinafter, mainly different processes will be described.

ステップＳ２０１乃至Ｓ２０６において、カメラ１１で撮影して得られた魚眼画像が変換画像に変換され、変換画像から顔が認識され、認識された魚眼画像上の顔のサイズが検出される。 In steps S201 to S206, the fisheye image obtained by photographing with the camera 11 is converted into a converted image, the face is recognized from the converted image, and the size of the face on the recognized fisheye image is detected.

魚眼画像上の顔のサイズが検出された後、ステップＳ２０７において、図１２のステップＳ１１１における場合と同様の処理が実行される。すなわち、ステップＳ２０７において測定部４０は、ステレオカメラ１８により距離を測定する。このとき、ステレオカメラ１８が顔を撮像し、測定部４０はその画像から顔までの距離を演算する。 After the face size on the fisheye image is detected, in step S207, the same processing as in step S111 in FIG. 12 is executed. That is, in step S207, the measurement unit 40 measures the distance using the stereo camera 18. At this time, the stereo camera 18 images the face, and the measurement unit 40 calculates the distance from the image to the face.

ステップＳ２０８において判定部３４は、ステップＳ２０３の処理で認識された顔が、特定の顔であるかを判定する。ステップＳ２０３の処理で認識された顔が、特定の顔ではない場合、ステップＳ２１１において、図１２のステップＳ１１２における場合と同様の処理が実行される。すなわち、選択部３９は、距離と顔のサイズから関係式を選択する。図１３を参照して上述したように、選択部３９は、距離Ｒ_１乃至Ｒ_３のうち、ステップＳ２０７で測定された距離Ｒ_０に最も近い値に対応する関係式を選択する。例えば距離Ｒ_１の値が距離Ｒ_０の値に最も近い場合、関係式Ｒ＝ｆ_Ｌ（Ｓ）が選択される。 In step S208, the determination unit 34 determines whether the face recognized in the process of step S203 is a specific face. If the face recognized in the process of step S203 is not a specific face, the same process as in step S112 of FIG. 12 is executed in step S211. That is, the selection unit 39 selects a relational expression from the distance and the face size. As described above with reference to FIG. 13, the selection unit 39 selects, from among the distances R_1 to R_3, the relational expression corresponding to the value closest to the distance R_0 measured in step S207. For example, when the value of the distance R_1 is closest to the value of the distance R_0, the relational expression R = f_L(S) is selected.

以上のようにして、１つの関係式が選択されると、ステップＳ２１２乃至Ｓ２１４において、図２のステップＳ７乃至Ｓ９の処理と同様の処理が実行される。すなわち、ステップＳ２１２乃至Ｓ２１４において、選択された関係式に基づいて顔のサイズから距離が求められ、さらに魚眼画像の座標系が世界座標系に変換される。 As described above, when one relational expression is selected, in steps S212 to S214, processing similar to the processing in steps S7 to S9 in FIG. 2 is executed. That is, in steps S212 to S214, the distance is obtained from the face size based on the selected relational expression, and the coordinate system of the fisheye image is converted into the world coordinate system.

ステップＳ２１４において判定部３４は、推定処理の終了が指示されたかを判定する。ステップＳ２１４で、まだ推定処理の終了が指示されていないと判定された場合、ステップＳ２１５乃至Ｓ２２０、ステップＳ２１２乃至Ｓ２１４の処理が実行される。ステップＳ２１５乃至Ｓ２２０の処理は、ステップＳ２０１乃至Ｓ２０６の処理と同様の処理である。 In step S214, the determination unit 34 determines whether the end of the estimation process is instructed. If it is determined in step S214 that the end of the estimation process has not been instructed yet, the processes of steps S215 to S220 and steps S212 to S214 are executed. The processes in steps S215 to S220 are the same as the processes in steps S201 to S206.

すなわち、新たな魚眼画像が取得され、変換画像に変換される。変換画像から顔が認識されるまで顔認識処理が行われ、顔が認識されると、顔のサイズが検出される。 That is, a new fisheye image is acquired and converted into a converted image. Face recognition processing is performed until a face is recognized from the converted image, and when the face is recognized, the size of the face is detected.

ステップＳ２１２において演算部３６は、ステップＳ２１１の処理で選択されている関係式に基づいて、ステップＳ２２０の処理で検出された顔のサイズに対応する距離を求める。ステップＳ２１３において座標変換部３７は、座標系を世界座標系に変換する。 In step S212, the calculation unit 36 obtains a distance corresponding to the face size detected in the process of step S220, based on the relational expression selected in the process of step S211. In step S213, the coordinate conversion unit 37 converts the coordinate system to the world coordinate system.

ステップＳ２１４において推定処理を終了するかが判定され、まだ終了が指示されていない場合、処理はステップＳ２１５に戻り、それ以降の処理が繰り返される。 In step S214, it is determined whether or not the estimation process is to be terminated, and if the termination has not been instructed yet, the process returns to step S215 and the subsequent processes are repeated.

一方ステップＳ２０８において、認識された顔が予め登録されている特定の顔であると判定された場合、ステップＳ２０９において記憶部４１は、距離と顔のサイズを記憶する。つまり、ステップＳ２０６で検出された顔のサイズと、ステップＳ２０７で測定された距離が、対応づけてメモリ１７に記憶される。 On the other hand, when it is determined in step S208 that the recognized face is a specific face registered in advance, in step S209, the storage unit 41 stores the distance and the size of the face. That is, the face size detected in step S206 and the distance measured in step S207 are stored in the memory 17 in association with each other.

ステップＳ２１０において判定部３４は、データが十分記憶されたかを判定する。距離と顔のサイズのデータが、関係式を演算するのに十分なデータであるかは、例えば図１５のステップＳ１８０における場合と同様に判定することができる。 In step S210, the determination unit 34 determines whether data is sufficiently stored. Whether or not the distance and face size data are sufficient to calculate the relational expression can be determined in the same manner as in step S180 of FIG. 15, for example.

すなわち、例えば、図１３に示される各顔のサイズの関係式を表す曲線Ｌ_１乃至Ｌ_３は、それぞれほぼ相似と考える場合、所定の値の距離と顔のサイズで規定される１点の座標が決まれば、そこを通る相似の曲線は一義的に特定することができる。このように考える場合、１つのデータが十分なデータとなる。 That is, for example, if the curves L_1 to L_3 representing the relational expressions for the respective face sizes shown in FIG. 13 are regarded as substantially similar to one another, then once the coordinates of one point defined by a given distance and face size are determined, the similar curve passing through that point can be uniquely identified. Under this assumption, a single data point is sufficient.

ステップＳ２１０においてデータが十分であると判定された場合、ステップＳ２２１において更新部４２は、特定の距離と顔のサイズの関係式を演算し、更新する。特定の顔のサイズの正確な値は、メモリ１７に予め登録されている。そこで、ステップＳ２０６で検出された顔のサイズが、予め登録されている顔のサイズに対応づけられる。この顔のサイズと、ステップＳ２０７においてステレオカメラ１８により測定された距離とに基づいて演算を行うことで、正確な関係式を得ることができる。 If it is determined in step S210 that the data is sufficient, in step S221, the update unit 42 calculates and updates the relational expression between the specific distance and the face size. The exact value of the specific face size is registered in the memory 17 in advance. Therefore, the face size detected in step S206 is associated with a pre-registered face size. An accurate relational expression can be obtained by performing calculations based on the size of the face and the distance measured by the stereo camera 18 in step S207.
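
How a face of known size can sharpen a relational expression may be illustrated with a pinhole-like approximation, in which an object of physical width W at distance R appears with image size S ≈ c * W / R. This approximation is an assumption of the sketch, not something the patent states, and it ignores fisheye distortion; the names `calibrate_constant` and `distance_from_size` and all values are hypothetical.

```python
def calibrate_constant(image_size_px: float, distance_m: float,
                       known_width_m: float) -> float:
    """From one observation of a face of known width: c = S * R / W."""
    return image_size_px * distance_m / known_width_m


def distance_from_size(image_size_px: float, width_m: float, c: float) -> float:
    """Updated relation for a face of width W: R = c * W / S."""
    return c * width_m / image_size_px


# Hypothetical observation: a registered face 0.16 m wide appears 100 px wide
# while the stereo camera reports 2.0 m.
c = calibrate_constant(image_size_px=100.0, distance_m=2.0, known_width_m=0.16)
# The same face later appears 50 px wide.
updated_distance = distance_from_size(image_size_px=50.0, width_m=0.16, c=c)
```

Because the physical size is known exactly, the constant is tied to a real measurement rather than to a coarse size class, which is the sense in which the updated relation can be more accurate.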

関係式が更新されると、ステップＳ２２２において選択部３９は、更新した関係式を選択する。すなわちステップＳ２２１でいま更新された関係式が選択される。 When the relational expression is updated, in step S222, the selection unit 39 selects the updated relational expression. That is, the relational expression just updated in step S221 is selected.

その後、処理はステップＳ２１２に進み、それ以降の処理が実行される。 Thereafter, the process proceeds to step S212, and the subsequent processes are executed.

ステップＳ２１０においてデータがまだ十分記憶されていないと判定された場合、ステップＳ２２１，Ｓ２２２の処理はスキップされる。この場合は、ステップＳ２１１において、距離と顔のサイズから関係式が選択される。つまりこの場合には関係式は更新されず、ステップＳ２０６で検出された顔のサイズと、ステップＳ２０７で測定された距離にもとづき特定される従前の関係式が選択される。その後、処理はステップＳ２１２に進み、それ以降の処理が実行される。 If it is determined in step S210 that the data is not yet sufficiently stored, the processes in steps S221 and S222 are skipped. In this case, in step S211, a relational expression is selected from the distance and the face size. That is, in this case, the relational expression is not updated, and the previous relational expression specified based on the face size detected in step S206 and the distance measured in step S207 is selected. Thereafter, the process proceeds to step S212, and the subsequent processes are executed.

このように、この実施の形態においてはサイズが既知の顔が検出されると、関係式が更新されるので、関係式をより正確なものとし、それに基づいて正確な位置を検出することが可能になる。 Thus, in this embodiment, the relational expression is updated when a face of known size is detected, so that the relational expression becomes more accurate and an accurate position can be detected based on it.

［５．変形例］ [5. Modified example]

魚眼画像の他の変換画像の例について説明する。図１９は、画像の展開を説明する図である。図１９の例では、１枚の魚眼画像が、その中心を通る水平方向の仮想的な線と垂直方向の仮想的な線により、左上、右上、左下、および右下の４つの領域Ａ１乃至Ａ４に区分される。そして１枚の魚眼画像が、左右または上下に隣接する２つの領域を含む処理対象の変換画像に変換される。つまり、領域Ａ３と領域Ａ４の変換画像、領域Ａ４と領域Ａ２の変換画像、領域Ａ２と領域Ａ１の変換画像、および領域Ａ１と領域Ａ３の変換画像に変換される。 Another example of a converted image of the fisheye image will be described. FIG. 19 is a diagram for explaining image development. In the example of FIG. 19, one fisheye image is divided into four regions A1 to A4 (upper left, upper right, lower left, and lower right) by a virtual horizontal line and a virtual vertical line passing through its center. The fisheye image is then converted into converted images to be processed, each containing two regions adjacent to each other horizontally or vertically. That is, it is converted into a converted image of regions A3 and A4, a converted image of regions A4 and A2, a converted image of regions A2 and A1, and a converted image of regions A1 and A3.

このように変換した変換画像から顔を認識することでも、１枚の魚眼画像自体から顔を認識する場合に比べれば、容易に顔を認識することが可能になる。 Even when the face is recognized from the converted image thus converted, the face can be easily recognized as compared with the case where the face is recognized from one fisheye image itself.

ただしこの場合、各変換画像において、顔は最大±４５度に傾斜する。そこで、顔の認識器は、±４５度に傾斜する顔を認識できる性能を有する必要がある。 However, in this case, in each converted image, the face is inclined up to ± 45 degrees. Therefore, the face recognizer needs to have a performance capable of recognizing a face inclined at ± 45 degrees.
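
The ±45 degree bound can be illustrated with a small sketch. It assumes that faces appear oriented radially around the center of the fisheye image, that each two-region converted image is viewed along its central direction, and that the azimuth is measured counter-clockwise from the rightward axis with the vertical axis pointing up. These conventions, and the names `HALVES` and `pick_half`, are assumptions for illustration and are not taken from the patent.

```python
from typing import Dict, Tuple

# Central direction (degrees) of each two-region converted image.
HALVES: Dict[int, Tuple[str, str]] = {
    0: ("A4", "A2"),    # right half
    90: ("A2", "A1"),   # upper half
    180: ("A1", "A3"),  # left half
    270: ("A3", "A4"),  # lower half
}


def pick_half(azimuth_deg: float) -> Tuple[Tuple[str, str], float]:
    """Return the half whose central direction is nearest, and the residual tilt."""
    a = azimuth_deg % 360.0
    centre = int(((a + 45.0) // 90.0) % 4) * 90
    # Signed difference folded into the range [-180, 180).
    tilt = (a - centre + 180.0) % 360.0 - 180.0
    return HALVES[centre], tilt


regions, tilt = pick_half(100.0)
```

Since the four central directions are 90 degrees apart, the nearest one is never more than 45 degrees from the face's azimuth, which is consistent with the requirement that the recognizer handle faces tilted by up to ±45 degrees.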

以上においてはステレオカメラ１８により距離を測定するようにしたが、他の測距手段を用いることも可能である。 In the above, the distance is measured by the stereo camera 18, but other distance measuring means can be used.

また、魚眼画像を平面展開することも考えられる。 It is also conceivable to develop a fisheye image on a plane.

本技術は、例えば所定の空間を監視する監視装置、ロボットにより周囲を視認する装置、その他の画像処理装置に適用することができる。また認識対象は人の顔に限られない。動物、各種の物体などを認識対象とすることができる。 The present technology can be applied to, for example, a monitoring device that monitors a predetermined space, a device that visually recognizes the surroundings by a robot, and other image processing devices. The recognition target is not limited to a human face. Animals, various objects, etc. can be recognized.

[６．本技術のプログラムへの適用] [6. Application of this technology to programs]

上述した一連の処理は、ハードウエアにより実行することもできるし、ソフトウエアにより実行することもできる。一連の処理をソフトウエアにより実行する場合には、そのソフトウエアを構成するプログラムが、コンピュータにインストールされる。ここで、コンピュータには、専用のハードウエアに組み込まれているコンピュータや、各種のプログラムをインストールすることで、各種の機能を実行することが可能な、例えば汎用のパーソナルコンピュータなどが含まれる。 The series of processes described above can be executed by hardware or by software. When the series of processes is executed by software, a program constituting the software is installed in a computer. Here, the computer includes a computer incorporated in dedicated hardware and, for example, a general-purpose personal computer capable of executing various functions when various programs are installed.

図２０は、上述した一連の処理をプログラムにより実行するパーソナルコンピュータの構成を示すブロック図である。 FIG. 20 is a block diagram showing a configuration of a personal computer that executes the above-described series of processing by a program.

コンピュータにおいて、CPU（Central Processing Unit）２２１，ROM（Read Only Memory）２２２，RAM（Random Access Memory）２２３は、バス２２４により相互に接続されている。 In the computer, a CPU (Central Processing Unit) 221, a ROM (Read Only Memory) 222, and a RAM (Random Access Memory) 223 are connected to each other via a bus 224.

バス２２４には、さらに、入出力インタフェース２２５が接続されている。入出力インタフェース２２５には、入力部２２６、出力部２２７、記憶部２２８、通信部２２９、及びドライブ２３０が接続されている。 An input / output interface 225 is further connected to the bus 224. An input unit 226, an output unit 227, a storage unit 228, a communication unit 229, and a drive 230 are connected to the input / output interface 225.

入力部２２６は、キーボード、マウス、マイクロフォンなどよりなる。出力部２２７は、ディスプレイ、スピーカなどよりなる。記憶部２２８は、ハードディスクや不揮発性のメモリなどよりなる。通信部２２９は、ネットワークインタフェースなどよりなる。ドライブ２３０は、磁気ディスク、光ディスク、光磁気ディスク、又は半導体メモリなどのリムーバブルメディア２３１を駆動する。 The input unit 226 includes a keyboard, a mouse, a microphone, and the like. The output unit 227 includes a display, a speaker, and the like. The storage unit 228 includes a hard disk, a nonvolatile memory, and the like. The communication unit 229 includes a network interface. The drive 230 drives a removable medium 231 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory.

以上のように構成されるコンピュータでは、CPU２２１が、例えば、記憶部２２８に記憶されているプログラムを、入出力インタフェース２２５及びバス２２４を介して、RAM２２３にロードして実行することにより、上述した一連の処理が行われる。 In the computer configured as described above, the CPU 221 loads, for example, the program stored in the storage unit 228 into the RAM 223 via the input/output interface 225 and the bus 224 and executes it, whereby the series of processes described above is performed.

コンピュータ（CPU２２１）が実行するプログラムは、例えば、パッケージメディア等としてのリムーバブルメディア２３１に記録して提供することができる。また、プログラムは、ローカルエリアネットワーク、インターネット、デジタル衛星放送といった、有線または無線の伝送媒体を介して提供することができる。 The program executed by the computer (CPU 221) can be provided by being recorded on a removable medium 231 as a package medium, for example. The program can be provided via a wired or wireless transmission medium such as a local area network, the Internet, or digital satellite broadcasting.

コンピュータでは、プログラムは、リムーバブルメディア２３１をドライブ２３０に装着することにより、入出力インタフェース２２５を介して、記憶部２２８にインストールすることができる。また、プログラムは、有線または無線の伝送媒体を介して、通信部２２９で受信し、記憶部２２８にインストールすることができる。その他、プログラムは、ROM２２２や記憶部２２８に、あらかじめインストールしておくことができる。 In the computer, the program can be installed in the storage unit 228 via the input / output interface 225 by attaching the removable medium 231 to the drive 230. The program can be received by the communication unit 229 via a wired or wireless transmission medium and installed in the storage unit 228. In addition, the program can be installed in the ROM 222 or the storage unit 228 in advance.

なお、コンピュータが実行するプログラムは、本明細書で説明する順序に沿って時系列に処理が行われるプログラムであっても良いし、並列に、あるいは呼び出しが行われたとき等の必要なタイミングで処理が行われるプログラムであっても良い。 The program executed by the computer may be a program whose processes are performed in time series in the order described in this specification, or a program whose processes are performed in parallel or at necessary timing, such as when a call is made.

なお、本技術の実施の形態は、上述した実施の形態に限定されるものではなく、本技術の要旨を逸脱しない範囲において種々の変更が可能である。 The embodiments of the present technology are not limited to the above-described embodiments, and various modifications can be made without departing from the gist of the present technology.

［７．その他］ [7. Others]

The present technology can also be configured as follows.
(1)
An image processing apparatus including:
an acquisition unit that acquires a fisheye image, which is an image captured through a fisheye lens;
a conversion unit that converts the fisheye image to generate a converted image; and
a recognition unit that recognizes an image of a recognition target from the converted image.
(2)
The image processing apparatus according to (1), wherein the converted image is a developed image obtained by developing the fisheye image on a cylindrical surface.
(3)
The image processing apparatus according to (1) or (2), further including a detection unit configured to detect a size of the recognition target image on the developed image.
(4)
The image processing apparatus according to (3), further including a calculation unit that calculates a distance to the recognition target from a size of the image of the recognition target.
(5)
The image processing apparatus according to (4), wherein the calculation unit calculates a distance to the recognition target corresponding to a size of the image to be recognized, from a relational expression that defines a relationship between the distance and the size.
(6)
The image processing apparatus according to (5), further including a selection unit that selects one of the plurality of relational expressions prepared for the plurality of recognition targets having different sizes.
(7)
The image processing apparatus according to (6), further including a measurement unit that measures a distance in order to select one of the relational expressions.
(8)
The image processing apparatus according to (6), further including a specifying unit that identifies a specific recognition target in order to select one of the relational expressions.
(9)
The image processing apparatus according to any one of (5) to (8), wherein the calculation unit further calculates and adds a new relational expression.
(10)
The image processing apparatus according to any one of (5) to (9), further including an update unit that updates the relational expression.
(11)
The image processing apparatus according to any one of (5) to (10), further including a coordinate conversion unit that converts the coordinate system of the fisheye image into a world coordinate system, using the distance to the recognition target obtained from the relational expression as a parameter.
(12)
An image processing method including:
an acquisition step of acquiring a fisheye image, which is an image captured through a fisheye lens;
a conversion step of converting the fisheye image to generate a converted image; and
a recognition step of recognizing an image of a recognition target from the converted image.
(13)
A computer-readable recording medium on which is recorded a program that causes a computer to execute:
an acquisition step of acquiring a fisheye image, which is an image captured through a fisheye lens;
a conversion step of converting the fisheye image to generate a converted image; and
a recognition step of recognizing an image of a recognition target from the converted image.
(14)
A program that causes a computer to execute:
an acquisition step of acquiring a fisheye image, which is an image captured through a fisheye lens;
a conversion step of converting the fisheye image to generate a converted image; and
a recognition step of recognizing an image of a recognition target from the converted image.
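
The processing chain in configurations (1) to (11) can be illustrated with a minimal Python sketch. It is not the implementation of the embodiments; everything named here is an assumption made for illustration. In particular, the sketch assumes that the fisheye image is a centered circular image unwrapped by a simple polar mapping (`unwrap_fisheye`), that each relational expression has the inverse-proportional form distance = k / size (`make_inverse_relation`, with hypothetical constants `k`), that the selection in (7) picks the expression whose prediction is closest to a separately measured distance (`select_relation`), and that the distance is a horizontal range from the camera axis (`to_world_xy`).

```python
import numpy as np


def unwrap_fisheye(fisheye, out_w=360, out_h=90):
    """Unwrap a centered circular fisheye image into a panoramic strip.

    Assumed mapping: output column -> azimuth, output row -> radius from the
    image center, sampled with nearest-neighbour lookup.
    """
    h, w = fisheye.shape[:2]
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    r_max = min(cx, cy)
    theta = 2.0 * np.pi * np.arange(out_w) / out_w        # azimuth per column
    radius = r_max * (np.arange(out_h) + 0.5) / out_h     # radius per row
    rr, tt = np.meshgrid(radius, theta, indexing="ij")    # shape (out_h, out_w)
    x = np.clip(np.rint(cx + rr * np.cos(tt)).astype(int), 0, w - 1)
    y = np.clip(np.rint(cy + rr * np.sin(tt)).astype(int), 0, h - 1)
    return fisheye[y, x]


def make_inverse_relation(k):
    """Hypothetical relational expression: distance = k / size_in_pixels."""
    return lambda size_px: k / size_px


def select_relation(relations, size_px, measured_distance):
    """Pick the relation whose predicted distance best matches a measurement."""
    return min(
        relations,
        key=lambda name: abs(relations[name](size_px) - measured_distance),
    )


def to_world_xy(col, out_w, distance):
    """Convert a panorama column and a horizontal range to camera-centered x, y."""
    theta = 2.0 * np.pi * col / out_w
    return float(distance * np.cos(theta)), float(distance * np.sin(theta))


# Hypothetical constants for two recognition targets of different sizes.
RELATIONS = {
    "large": make_inverse_relation(12000.0),
    "small": make_inverse_relation(9000.0),
}
```

In this sketch, a recognized region 100 pixels wide together with a measured distance of 95 selects the "small" expression, because its prediction (90) is closer to the measurement than that of the "large" expression (120). The selected expression can then be reused for later frames without a new measurement.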

DESCRIPTION OF SYMBOLS: 1 image processing apparatus, 11 camera, 11A fisheye lens, 13 microprocessor, 31 imaging unit, 32 image conversion unit, 33 face recognition unit, 34 determination unit, 35 detection unit, 36 calculation unit, 37 coordinate conversion unit, 38 instruction unit, 39 selection unit, 40 measurement unit, 41 storage unit, 42 update unit, 43 control unit

Claims

An image processing apparatus comprising:
an acquisition unit that acquires a fisheye image, which is an image captured through a fisheye lens;
a conversion unit that converts the fisheye image to generate a converted image; and
a recognition unit that recognizes an image of a recognition target from the converted image.

The image processing apparatus according to claim 1, wherein the converted image is a developed image obtained by developing the fisheye image on a cylindrical surface.

The image processing apparatus according to claim 2, further comprising a detection unit that detects a size of the image to be recognized on the developed image.

The image processing apparatus according to claim 3, further comprising a calculation unit that calculates a distance to the recognition target from a size of the image of the recognition target.

The image processing apparatus according to claim 4, wherein the calculation unit calculates a distance to the recognition target corresponding to a size of the image to be recognized, from a relational expression that defines a relationship between the distance and the size.

The image processing apparatus according to claim 5, further comprising a selection unit that selects one of the plurality of relational expressions prepared for the plurality of recognition targets having different sizes.

The image processing apparatus according to claim 6, further comprising a measurement unit that measures a distance in order to select one of the relational expressions.

The image processing apparatus according to claim 6, further comprising a specifying unit that specifies a specific recognition target in order to select one of the relational expressions.

The image processing apparatus according to claim 6, wherein the calculation unit further calculates and adds a new relational expression.

The image processing apparatus according to claim 6, further comprising an update unit that updates the relational expression.

The image processing apparatus according to claim 5, further comprising a coordinate conversion unit that converts a coordinate system of the fisheye image into a world coordinate system using a distance to the recognition target detected from the relational expression as a parameter.

An image processing method comprising:
an acquisition step of acquiring a fisheye image, which is an image captured through a fisheye lens;
a conversion step of converting the fisheye image to generate a converted image; and
a recognition step of recognizing an image of a recognition target from the converted image.

A computer-readable recording medium on which is recorded a program that causes a computer to execute:
an acquisition step of acquiring a fisheye image, which is an image captured through a fisheye lens;
a conversion step of converting the fisheye image to generate a converted image; and
a recognition step of recognizing an image of a recognition target from the converted image.

A program that causes a computer to execute:
an acquisition step of acquiring a fisheye image, which is an image captured through a fisheye lens;
a conversion step of converting the fisheye image to generate a converted image; and
a recognition step of recognizing an image of a recognition target from the converted image.