JP2007241637A5

Info

Publication number: JP2007241637A5
Application number: JP2006062925A
Authority: JP
Filing date: 2006-03-08
Publication date: 2009-04-09
Anticipated expiration: 2026-03-08

Description

3D object recognition method and 3D image processing apparatus

The present invention relates to a three-dimensional object recognition method and an apparatus therefor, and more particularly to a method and apparatus that express image data acquired from a single camera in an eigenspace and compare it with templates expressed in that eigenspace to recognize an object.

Conventionally, pattern matching using a camera has been a main image processing technique for three-dimensional object recognition.
In one pattern matching technique, a learning stage photographs the object with a camera from a plurality of different viewpoints to prepare a large number of template images, which are stored in a computer; in the recognition stage, the camera image is compared with the template images to recognize the object. The eigenspace method is a pattern matching technique that reduces computational cost and offers a degree of robustness against complex backgrounds and changes in ambient light. Specifically, the Karhunen-Loeve (K-L) expansion is applied to the image data obtained by the camera and to the template images, similarity is computed by normalized correlation in the eigenspace best suited to discriminating the image vectors, and the object is recognized by comparing the two. Because the image data is compressed as eigenvectors, the amount of information the computer must store is reduced.
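As a rough illustration of the similarity measure named above, the following sketch computes a normalized correlation between two vectors, for example two sets of eigenspace coefficients. The function name `normalized_correlation`, the plain-list representation, and the handling of zero vectors are assumptions made for this example; the source does not give an implementation.

```python
import math

def normalized_correlation(a, b):
    """Normalized correlation (cosine of the angle) of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: a zero vector is treated as "no similarity"
    return dot / (norm_a * norm_b)
```

A score near 1 means the two vectors point in almost the same direction, regardless of their overall magnitude.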

Many proposals for improving the recognition rate and reducing the computational cost of eigenspace-based object recognition have been described in the literature. For example, a method of recognizing an object from its whole image and characteristic local images has been disclosed (see, for example, Patent Document 1).
The image recognition system described in this prior art comprises whole-image processing means and local-image processing means, which analyze the whole image and local images of the object, respectively. By recognizing the image through analysis of the whole image by the whole-image processing means and analysis of characteristic local images by the local-image processing means, the system can analyze the image information of each local region (each of a plurality of narrow pixel regions) in parallel and in a sequential, exploratory manner, and it recognizes the target from the whole image and the characteristic local images. Even for a recognition target with a complex shape, the system permits hypothetical reasoning without the probabilistic evaluation involving difficult computation that was conventionally required, so the load on the computer performing the image analysis is reduced and the time needed to recognize the target image is shortened.
Patent Document 1: JP-A-10-232938

Object recognition by the eigenspace method requires templates of the object photographed from various directions to be prepared in advance. If the actual image of the object differs in appearance from the prepared patterns, the recognition rate drops markedly. Preparing an enormous number of templates to avoid this does not help either: the amount of computation grows explosively, and the subspace spreads out as templates are added, which again lowers the recognition rate.

The object recognition technique of Patent Document 1 comprises a plurality of local modules that evaluate the consistency between features extracted from local images of the object and the image to be recognized, and it performs this evaluation in a sequential, exploratory manner, which requires a correspondingly elaborate algorithm. Moreover, it does not reduce the template data itself, so no substantial improvement in computation or recognition rate can be expected.

The present invention has been made in view of these problems, and its object is to provide a three-dimensional object recognition method and apparatus that reduce the number of templates and recognize objects at high speed and with a high recognition rate.

To solve the above problems, the invention according to claim 1 is an image processing method that uses a system comprising a single camera at a fixed distance from a horizontal plane and a computer that performs moving-image processing on image data from the camera, in which two-dimensional image data of an object on the horizontal plane, photographed from various directions, is stored in the computer in advance as templates compressed in an eigenspace, and image data acquired from the camera is expressed in the eigenspace and compared with the templates to recognize the object. The method is characterized in that a partial eigenspace is generated for each vertical tilt of the camera, tilt information of the camera is acquired, and image processing is performed using the eigenspace.
The invention according to claim 2 is an image processing method that uses a system comprising a single camera at a fixed distance from a horizontal plane and having pan and tilt functions, and a computer provided with software that performs moving-image processing on image data from the camera and operates the camera, in which two-dimensional image data of an object on the horizontal plane, photographed from various directions, is stored in the computer in advance as templates compressed in an eigenspace, and image data acquired from the camera is expressed in the eigenspace and compared with the templates to recognize the object. The method is characterized in that a partial eigenspace is generated for each vertical tilt of the camera, tilt information of the camera is acquired, image processing is performed using the eigenspace, and the camera is made to follow the object in response to the result of the image processing.
The invention according to claim 3 is characterized in that the eigenspace is a partial eigenspace composed, for each camera tilt, of dictionary data for that tilt.
The invention according to claim 4 is characterized in that the eigenspace is a partial eigenspace provided, for each camera tilt, with dictionary data corresponding to the vertical angle-of-view range of the camera.
The invention according to claim 5 is characterized in that the pan angle and tilt angle of the camera are controlled in response to movement of the object or of the camera so that the camera follows the object.
The invention according to claim 6 is a three-dimensional image processing apparatus comprising a single camera at a fixed distance from a horizontal plane and a computer that performs moving-image processing on image data from the camera, the computer comprising a dictionary data unit that stores, as templates compressed in an eigenspace, two-dimensional image data of an object on the horizontal plane photographed from various directions, an eigenspace representation processing unit that expresses image data acquired from the camera in the eigenspace, and a pattern matching processing unit that compares the dictionary data obtained from the dictionary data unit with the eigenspace-expressed image data obtained from the eigenspace representation processing unit. The apparatus is characterized in that the dictionary data unit is composed of partial eigenspaces made up of dictionary data corresponding to the tilt information of the camera.
The invention according to claim 7 is a three-dimensional image processing apparatus comprising a single camera at a fixed distance from a horizontal plane and having pan and tilt functions, and a computer having camera control software that performs moving-image processing on image data from the camera and operates the camera, the computer comprising a dictionary data unit that stores, as templates compressed in an eigenspace, two-dimensional image data of an object on the horizontal plane photographed from various directions, an eigenspace representation processing unit that expresses image data acquired from the camera in the eigenspace, and a pattern matching processing unit that compares the dictionary data obtained from the dictionary data unit with the eigenspace-expressed image data obtained from the eigenspace representation processing unit. The apparatus is characterized in that the dictionary data unit is composed of partial eigenspaces provided with dictionary data corresponding to the tilt information of the camera and the vertical angle-of-view range of the camera.

According to the invention of claim 1, matching against the object is performed after switching to an eigenspace for a limited direction, so the amount of computation needed for matching against the dictionary data in the recognition stage is reduced. Because superfluous templates are excluded, the recognition rate is also improved.

According to the invention of claim 2, partial eigenspaces provided with dictionary data corresponding to the vertical angle-of-view range of the camera are generated in advance, image processing is performed using the partial eigenspace corresponding to the camera tilt information, and the camera is made to follow the object in response to the result of that image processing. For objects within the vertical angle-of-view range of the camera, the amount of computation needed for matching against the dictionary data is therefore reduced, and the recognition rate improves because superfluous templates are excluded.

According to the invention of claim 3, when an object is photographed by the camera from a specific direction and recognized, setting the center of the object at the center of the camera's vertical angle of view and performing image processing using only the eigenspace corresponding to the tilt direction obtained from the camera tilt information reduces the matching time in the recognition stage, so objects can be recognized at high speed and with a high recognition rate.

Likewise, if the camera tilt information is detected when the object reaches the center of the camera's angle of view, and image processing is performed using only the eigenspace corresponding to that tilt direction, the matching time in the recognition stage is reduced even when the object or the camera moves, and objects can be recognized at high speed and with a high recognition rate.

According to the invention of claim 4, when an object is photographed by the camera from a specific direction and recognized, performing image processing with an eigenspace that contains the data for the camera's angle-of-view range corresponding to the tilt direction obtained from the camera tilt information reduces the matching time in the recognition stage for objects within the angle-of-view range, so objects can be recognized at high speed and with a high recognition rate.

Likewise, if the camera tilt information is detected when the object enters the camera's angle-of-view range, and image processing is performed with an eigenspace that contains the data for the angle-of-view range corresponding to that tilt direction, the matching time in the recognition stage is reduced even when the object or the camera moves, and objects can be recognized at high speed and with a high recognition rate.

According to the invention of claim 5, controlling the pan and tilt angles of the camera in response to movement of the object or of the camera, so that the camera follows the object, allows the invention to be mounted on an autonomous robot system and used to perform robot operations on the object.

According to the invention of claim 6, the dictionary data is composed of partial eigenspaces corresponding to the camera tilt information, so the number of templates is reduced and objects can be recognized at high speed and with a high recognition rate.

According to the invention of claim 7, the dictionary data is composed of partial eigenspaces provided with dictionary data corresponding to the vertical angle-of-view range of the camera. For objects within that range, the amount of computation needed for matching against the dictionary data is therefore reduced, and the recognition rate improves because superfluous templates are excluded.

Embodiments of the present invention are described below with reference to the drawings.

FIG. 1 is a block diagram showing the configuration of a three-dimensional object recognition method according to a first embodiment of the present invention.
In the figure, 101 is a camera and 102 is a computer that acquires image data 100b sent from the camera 101 in real time, for example 680 × 480 pixel image data at 30 frames/s. The computer comprises a dictionary data search engine 103, eigenspace representation processing software 104, pattern matching processing software 105, and dictionary data 106.

Next, the operation is described.
First, the computer 102 acquires camera tilt (θpi) information 100a. Here the camera tilt θpi is a parameter that indicates the vertical tilt of the camera relative to the horizontal plane.
Next, the dictionary data search engine 103 searches the dictionary data 106 stored in advance in the computer 102 for the dictionary data corresponding to the camera tilt (θpi) information 100a, and sends this dictionary data 100c to the pattern matching processing software 105.
Meanwhile, the eigenspace representation processing software 104 expresses the image data 100b captured from the camera 101 in the eigenspace and sends the resulting image data 100d to the pattern matching processing software 105. The pattern matching processing software 105 compares the dictionary data 100c corresponding to the camera tilt (θpi) information with the eigenspace-expressed image data 100d and determines whether the image shows the object.

FIG. 2 is a diagram showing the positional relationship between the object and the camera.
In the figure, the coordinate system (x, y, z) is one whose x-y plane is parallel to the horizontal plane. The symbol h denotes the distance between the horizontal plane and the camera.

θ denotes an angle centered on the object. The subscript "r" denotes an angle about the z-axis centered on the object: θr is the angle between the x-axis and the straight line connecting the center point of the object to the point obtained by projecting the camera perpendicularly onto the x-y plane, with range 0 ≤ θr ≤ 2π. The subscript "p" denotes an angle relative to the x-y plane: θp is the angle between the x-y plane and the straight line connecting the center point of the camera to the object, with range 0 ≤ θp ≤ π when the object lies on the x-y plane. Hereinafter, θp is referred to as the camera tilt angle.
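The two angles defined above can be computed from coordinates as in the following sketch. The function name `camera_angles` and the tuple layout of its arguments are assumptions for illustration. Note that this sketch measures θp from the horizontal distance to the camera, so for a camera above the plane it returns values between 0 and π/2 and leaves the direction entirely to θr; the text itself states a range of up to π.

```python
import math

def camera_angles(camera_xyz, object_xy):
    """Return (theta_r, theta_p) in radians.

    camera_xyz: (x, y, h) position of the camera, h being its height above the x-y plane.
    object_xy:  (x, y) position of the object center on the x-y plane.
    """
    cx, cy, h = camera_xyz
    ox, oy = object_xy
    dx, dy = cx - ox, cy - oy  # camera's projected point, seen from the object
    theta_r = math.atan2(dy, dx) % (2.0 * math.pi)  # angle about the z-axis, in [0, 2*pi)
    theta_p = math.atan2(h, math.hypot(dx, dy))     # elevation of the camera above the plane
    return theta_r, theta_p
```

For example, a camera at height 1 whose projected point lies one unit along the x-axis from the object gives θr = 0 and θp = π/4.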

In creating the dictionary data 106, template images are taken at regular intervals in θr and θp. If M values of θr (θr = θr0, θr1, ..., θrM-1) and N values of θp (θp = θp0, θp1, ..., θpN-1) are used, the number of templates is M × N.

The conventional eigenspace method generates a single eigenspace from these M × N templates, whereas the present invention generates a separate eigenspace for each θp. That is, for each of θp = θp0, θp1, ..., θpN-1, the set of M templates over θr is taken, and N eigenspaces are generated.
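A minimal sketch of building one eigenspace per tilt angle follows. It is not the patent's implementation: the principal axes are found here by power iteration with deflation in plain Python, chosen only to keep the example dependency-free, and all names (`build_eigenspace`, `build_tilt_eigenspaces`, the `templates` dictionary layout) are assumptions for illustration.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def leading_axis(centered, found, rng, iters):
    """Power iteration for the next principal axis, orthogonal to those in `found`."""
    dim = len(centered[0])
    w = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    for _ in range(iters):
        for b in found:  # deflation: remove components along axes already found
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        # Apply the (unnormalized) covariance matrix: sum over x of (x . w) x
        coeffs = [dot(x, w) for x in centered]
        w = [sum(c * x[j] for c, x in zip(coeffs, centered)) for j in range(dim)]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:
            return None  # no variance left in the remaining directions
        w = [wi / norm for wi in w]
    return w

def build_eigenspace(vectors, k, iters=100, seed=0):
    """Return (mean, basis) with up to k principal axes of the given image vectors."""
    n, dim = len(vectors), len(vectors[0])
    mean = [sum(v[j] for v in vectors) / n for j in range(dim)]
    centered = [[v[j] - mean[j] for j in range(dim)] for v in vectors]
    rng = random.Random(seed)
    basis = []
    for _ in range(k):
        axis = leading_axis(centered, basis, rng, iters)
        if axis is None:
            break
        basis.append(axis)
    return mean, basis

def build_tilt_eigenspaces(templates, k):
    """templates: dict mapping a tilt index to its M image vectors (one per theta_r).

    Returns one (mean, basis) eigenspace per tilt index, i.e. N eigenspaces in total.
    """
    return {tilt: build_eigenspace(vectors, k) for tilt, vectors in templates.items()}
```

Each eigenspace is built from only the M views taken at one tilt, so it is smaller than a single eigenspace built from all M × N templates.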

The dictionary data 106 created in this way is prepared in advance. In the object recognition stage, the camera tilt (θpi) information 100a is acquired, the data closest to the tilt θpi is extracted from the N sets of dictionary data, and image processing is performed.
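Selecting the dictionary entry closest to the measured tilt can be sketched as a nearest-value lookup. The function name `nearest_tilt_index` and the list of stored tilt angles are assumptions for this example; the source only says that data "close to" θpi is extracted.

```python
def nearest_tilt_index(theta_pi, tilt_angles):
    """Index of the stored tilt angle closest to the measured camera tilt theta_pi."""
    if not tilt_angles:
        raise ValueError("tilt_angles must not be empty")
    return min(range(len(tilt_angles)), key=lambda i: abs(tilt_angles[i] - theta_pi))
```

The returned index identifies which of the N eigenspaces to use for the current frame.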

FIG. 3 is a flowchart showing the flow of object recognition processing in this embodiment.
In the figure, process 301 takes the camera tilt (θpi) information 100a into the computer 102. For applications in which the camera attitude is fixed, such as a security camera, the tilt θpi is measured and stored in the computer in advance. If the camera has pan and tilt functions and can communicate with the computer, the tilt information is obtained from the camera.

Process 302 takes the image data 100b from the camera 101 into the computer 102. The data can be captured through a capture device (a capture board, USB port, FireWire, or the like, depending on the camera standard).

Process 303 performs eigenspace representation processing on the captured image data 100b, which the eigenspace representation processing software 104 compresses as an eigenvector.

Process 304 searches for the dictionary data group corresponding to the camera tilt: the dictionary data corresponding to the camera tilt (θpi) information 100a is retrieved from the dictionary data 106 stored in advance in the computer 102.

Process 305 performs pattern matching: the eigenspace-expressed image data 100d obtained in process 303 is matched against the dictionary data 100c corresponding to the camera tilt information obtained in process 304, and it is determined whether the image shows the object.
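Processes 301 to 305 can be strung together as in the sketch below. It is an illustration under stated assumptions, not the patent's implementation: the dictionary layout (a list of entries with `tilt`, `mean`, `basis`, and `templates` keys), the threshold of 0.9, and the function names are all hypothetical. The sketch also selects the eigenspace before projecting the image, because the projection needs that eigenspace's mean and basis, whereas the flowchart lists the eigenspace representation (303) before the dictionary search (304).

```python
import math

def project(image, mean, basis):
    """Express an image vector as coefficients in an eigenspace (process 303)."""
    centered = [p - m for p, m in zip(image, mean)]
    return [sum(c * b for c, b in zip(centered, axis)) for axis in basis]

def normalized_correlation(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0.0 and nb > 0.0 else 0.0

def recognize(theta_pi, image, dictionary, threshold=0.9):
    """Return (is_object, best_score) for one frame.

    theta_pi:   measured camera tilt (process 301).
    image:      flattened image vector (process 302).
    dictionary: list of entries {"tilt", "mean", "basis", "templates"}, where
                "templates" holds the eigenspace coefficients of the stored views.
    """
    # Process 304: pick the dictionary entry whose tilt is closest to theta_pi.
    entry = min(dictionary, key=lambda e: abs(e["tilt"] - theta_pi))
    # Process 303: express the captured image in that entry's eigenspace.
    coeffs = project(image, entry["mean"], entry["basis"])
    # Process 305: compare against the stored templates by normalized correlation.
    best = max(normalized_correlation(coeffs, t) for t in entry["templates"])
    return best >= threshold, best
```

Only the templates of the selected tilt are compared, which is where the reduction in matching time comes from.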

In this embodiment, a constraint on the θp direction limits how the object can appear, and image processing uses only the eigenspace corresponding to that tilt direction. The matching time in the recognition stage is therefore reduced, and objects can be recognized at high speed and with a high recognition rate.

Many applications in fact satisfy this constraint. For example, when a security camera is used to identify a person, the appearance of a person photographed by a fixed camera is strongly restricted. The constraint also holds for object recognition on a camera-equipped autonomous mobile robot that travels on the ground, recognizes an object on the ground, and performs a specific action with respect to it (approaching the object, avoiding an obstacle, and so on).

In other words, because the appearance of the object is limited in the direction perpendicular to the ground, the only data needed as templates is the set of images in the θp direction. The resulting eigenspace is far smaller than one built from templates covering all directions, which reduces the matching time in the recognition stage. Because superfluous templates are excluded, the recognition rate is also raised.

FIG. 4 is a block diagram showing the configuration of a three-dimensional object recognition method according to a second embodiment of the present invention.
This embodiment uses a single camera with pan and tilt functions and an image processing computer to search for the object automatically, recognize it by image processing, and, when the object is moving, make the camera follow it in response to the recognition result.

In FIG. 4, 406 is dictionary data. The first embodiment used dictionary data 106 for the camera tilt angle θp, whereas this embodiment uses dictionary data 406 created from both the camera tilt and the angle of view. 407 is camera control software for controlling the camera. The angle of view is a value specific to the camera used; here it refers to the vertical angle of view and is denoted by the parameter θv.

This embodiment differs from the first in that the dictionary data 406 is created from the camera tilt and the angle of view, and in that the computer 102 is provided with the camera control software 407. The camera 101 has pan and tilt functions and can be operated by command signals from the computer 102.

FIG. 5 is a diagram showing the relationship among the object, the camera, and the camera's angle of view.
In the first embodiment the object was assumed to be at the center of the camera's angle of view. When the camera searches for the object automatically, however, the object is not at the center during the search stage; it enters the angle of view as the camera moves vertically and horizontally. If an eigenspace for one specific angle θp were used as in the first embodiment, the recognition rate would therefore be low.

The eigenspace is therefore generated from a wider template set that takes the angle of view θv into account. The templates are the same as in the first embodiment: M × N templates, with M values of θr (θr = θr0, θr1, ..., θrM-1) and N values of θp (θp = θp0, θp1, ..., θpN-1). For each value of θp, the angle-of-view range (θp − θv/2 to θp + θv/2) shown in FIG. 5 is considered, and an eigenspace is generated for the set of all θp values within that range. For example, if the θp values are equally spaced and the range contains L of them, L × M templates are needed to generate one eigenspace. Since an eigenspace is generated for each θp, N eigenspaces are generated in total.
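The template selection for the angle-of-view range can be sketched as follows. The function names and the use of a plain list of tilt angles are assumptions for illustration. Near the ends of the tilt list the range contains fewer stored angles, so the count per eigenspace can fall below the interior value of L × M.

```python
def tilts_in_view(center_index, tilt_angles, theta_v):
    """Indices of stored tilt angles within [theta_p - theta_v/2, theta_p + theta_v/2]."""
    theta_p = tilt_angles[center_index]
    lo, hi = theta_p - theta_v / 2.0, theta_p + theta_v / 2.0
    return [i for i, t in enumerate(tilt_angles) if lo <= t <= hi]

def template_counts(tilt_angles, theta_v, m):
    """Number of templates feeding each of the N eigenspaces (L x M per eigenspace)."""
    return [len(tilts_in_view(i, tilt_angles, theta_v)) * m
            for i in range(len(tilt_angles))]
```

With tilt angles of 0, 10, 20, 30, and 40 degrees, θv = 20 degrees, and M = 4, the eigenspace centered on 20 degrees uses the views at 10, 20, and 30 degrees, that is, L = 3 and 12 templates.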

FIG. 6 is a flowchart showing the flow of object recognition processing in this embodiment.
In the figure, process 601 sends an object search command to the camera: to search for the object, the camera control software 407 first sends a pan/tilt control signal 400f to the camera 101. While the camera is moving in this step, the tilt information and image data of the camera 101 are assumed to be sent to the computer 102 in real time.

Next, the computer 102 takes in the camera tilt information 100a and the image data 100b (processes 301 and 302), and the eigenspace representation processing software 104 expresses the image data 100b in the eigenspace, compressing it as an eigenvector (process 303).

For the dictionary data, the dictionary data search engine 103 uses the camera tilt information 100a to extract from the dictionary data 406 the data corresponding to the camera angle-of-view range (θp − θv/2 to θp + θv/2) (process 304).

The pattern matching processing software 105 matches the eigenspace-expressed image data 100d obtained above against the dictionary data 400e corresponding to the camera angle-of-view range (process 305). If it is determined that the object is not present (decision 602, NO), processing returns to process 601 and the camera 101 continues the search operation.

If decision 602 determines that the object is present (decision 602, YES), the camera 101 is moved so that the object comes to the center of the angle of view. That is, it is determined whether the object is at the center of the angle of view (decision 603). If it is not (decision 603, NO), the camera control software 407 moves the camera so that the object comes to the center (process 604), and the operations from process 301 onward are repeated. Once the movement of the camera 101 has brought the object to the center of the angle of view (decision 603, YES), the object recognition technique described in the first embodiment can be applied (process 300).
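The search and centering loop of FIG. 6 can be sketched as a simple control loop. Everything here is hypothetical scaffolding for illustration: the `detect` and `move` callbacks, the tolerance and sweep step, and the toy `FakeCamera` used to exercise the loop stand in for the recognition step and the camera control software, whose interfaces the source does not specify.

```python
def search_and_center(detect, move, max_steps=100, tolerance=1.0, sweep_step=5.0):
    """Run the search/centering loop until the object sits at the image center.

    detect() returns the object's (dx, dy) offset from the image center, or None
    when no object is recognized (decision 602).
    move(d_pan, d_tilt) sends a relative pan/tilt command (processes 601 and 604).
    Returns True once the object is centered (decision 603), else False.
    """
    for _ in range(max_steps):
        offset = detect()
        if offset is None:
            move(sweep_step, 0.0)  # process 601: keep sweeping to search
            continue
        dx, dy = offset
        if abs(dx) <= tolerance and abs(dy) <= tolerance:
            return True            # decision 603, YES: object is centered
        move(dx, dy)               # process 604: bring the object to the center
    return False

class FakeCamera:
    """Toy pan/tilt camera: the object is visible within 10 units of its position."""
    def __init__(self, target=(30.0, 5.0)):
        self.pan, self.tilt, self.target = 0.0, 0.0, target

    def detect(self):
        dx, dy = self.target[0] - self.pan, self.target[1] - self.tilt
        return (dx, dy) if abs(dx) <= 10.0 and abs(dy) <= 10.0 else None

    def move(self, d_pan, d_tilt):
        self.pan += d_pan
        self.tilt += d_tilt
```

In a real system `detect` would wrap processes 301 to 305 with the angle-of-view dictionary, and the loop would hand over to the first embodiment's recognition once it returns True.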

As described above, this embodiment uses the camera tilt information and the vertical camera angle-of-view information to search automatically for the object in the eigenspace corresponding to the angle-of-view range, and makes the camera follow the object based on the result of the moving-image processing. Therefore, even when the object or the camera moves, image processing can use an eigenspace restricted to a limited range of directions, so object recognition can be performed at high speed and with a high recognition rate.

The present invention can be applied to an autonomous robot system in which a robot performs operations on an object.

Block diagram showing the configuration of the three-dimensional object recognition method according to the first embodiment of the present invention.
Diagram showing the positional relationship between the object and the camera in the first embodiment of the present invention.
Flowchart showing the flow of the object recognition process in the first embodiment of the present invention.
Block diagram showing the configuration of the three-dimensional object recognition method according to the second embodiment of the present invention.
Explanatory diagram showing the relationship among the object, the camera, and the camera's angle of view in the second embodiment of the present invention.
Flowchart showing the flow of the object recognition process in the second embodiment of the present invention.

Description of Symbols
101 Camera
102 Computer
103 Dictionary data search engine
104 Eigenspace representation processing software
105 Pattern matching processing software
106 Dictionary data
100a Camera tilt information
100b Image data
100c Dictionary data corresponding to the camera tilt
100d Eigenspace-represented image data
201 Object
406 Dictionary data
407 Camera control software
400e Dictionary data corresponding to the camera tilt and angle-of-view range
400f Pan/tilt control signal

Claims

A three-dimensional object recognition method performed by a system comprising
one camera whose distance from a horizontal plane is fixed, and
a computer that performs image processing on image data from the camera,
the method being an image processing method in which two-dimensional image data of an object on the horizontal plane, photographed from various directions, is stored in advance in the computer as templates compressed in an eigenspace, and image data acquired from the camera is represented in the eigenspace and compared with the templates to recognize the object, wherein
a partial eigenspace is generated for each tilt of the camera in the vertical direction, and
tilt information of the camera is acquired and image processing is performed using the eigenspace.

A three-dimensional object recognition method performed by a system comprising
one camera that has pan and tilt functions and whose distance from a horizontal plane is fixed, and
a computer provided with software for operating the camera and for performing moving-image processing on image data from the camera,
the method being an image processing method in which two-dimensional image data of an object on the horizontal plane, photographed from various directions, is stored in advance in the computer as templates compressed in an eigenspace, and image data acquired from the camera is represented in the eigenspace and compared with the templates to recognize the object, wherein
a partial eigenspace is generated for each tilt of the camera in the vertical direction, and
tilt information of the camera is acquired, image processing is performed using the eigenspace, and the camera is made to follow the object in response to the result of the image processing.

The three-dimensional object recognition method according to claim 1 or claim 2, wherein the eigenspace for each tilt of the camera is a partial eigenspace composed of the dictionary data for that tilt.

The three-dimensional object recognition method according to claim 1 or claim 2, wherein the eigenspace for each tilt of the camera is a partial eigenspace having dictionary data corresponding to the vertical camera angle-of-view range.

The three-dimensional object recognition method according to claim 2, wherein, in response to movement or relative movement of the camera and the object, the pan angle and the tilt angle of the camera are controlled so that the camera follows the object.

A three-dimensional image processing apparatus comprising
one camera whose distance from a horizontal plane is fixed, and
a computer that performs image processing on image data from the camera,
the computer comprising:
a dictionary data section that stores two-dimensional image data of an object on the horizontal plane, photographed from various directions, as templates compressed in an eigenspace;
an eigenspace representation processing section that performs eigenspace representation processing on image data acquired from the camera; and
a pattern matching processing section that compares the dictionary data obtained from the dictionary data section with the eigenspace-represented image data obtained from the eigenspace representation processing section,
wherein the dictionary data section is configured as a partial eigenspace composed of dictionary data corresponding to camera tilt information.

A three-dimensional image processing apparatus comprising
one camera that has pan and tilt functions and whose distance from a horizontal plane is fixed, and
a computer having camera control software for performing moving-image processing on image data from the camera and for operating the camera,
the computer comprising:
a dictionary data section that stores two-dimensional image data of an object on the horizontal plane, photographed from various directions, as templates compressed in an eigenspace;
an eigenspace representation processing section that performs eigenspace representation processing on image data acquired from the camera; and
a pattern matching processing section that compares the dictionary data obtained from the dictionary data section with the eigenspace-represented image data obtained from the eigenspace representation processing section,
wherein the dictionary data section is configured as a partial eigenspace having dictionary data corresponding to camera tilt information and the vertical camera angle-of-view range.