JP2021003782A

JP2021003782A - Object recognition processing device, object recognition processing method and picking apparatus

Info

Publication number: JP2021003782A
Application number: JP2019119686A
Authority: JP
Inventors: Masaki Hayashi (林 正樹)
Original assignee: Seiko Epson Corp
Current assignee: Seiko Epson Corp
Priority date: 2019-06-27
Filing date: 2019-06-27
Publication date: 2021-01-14

Abstract

To provide an object recognition processing device, an object recognition processing method and a picking apparatus which can suppress the reduction in recognition processing speed while increasing the recognition accuracy. SOLUTION: An object recognition processing device comprises: a storage unit which stores object information of an identification object; an image generation unit which generates a simulation image in which a plurality of the identification objects are arranged; a recognition accuracy calculation unit which performs image recognition by comparing the object information with the identification object in the simulation image generated by the image generation unit and obtains the recognition accuracy of the image recognition for each posture of the identification object; a user interface unit which notifies a user of the recognition accuracy and receives permission/rejection of the recognition accuracy from the user for each posture of the identification object; and a discriminator generation unit which, when the user interface unit receives a rejection, generates a discriminator for each rejected posture of the identification object. SELECTED DRAWING: Figure 2

Description

The present invention relates to an object recognition processing device, an object recognition processing method, and a picking device.

The simulation device described in Patent Document 1 includes a display device that displays a three-dimensional virtual space on a screen; a camera position determination unit that determines the installation position of an imaging camera based on an imaging range specified by an operator, optical characteristic information of the imaging camera to be used, and the required measurement accuracy; and a virtual image generation unit that generates a virtual image to be acquired by the imaging camera based on the position of the imaging camera in the three-dimensional virtual space and its optical characteristic information. With this device, an appropriate position for the imaging camera can be determined and detection parameters can be adjusted easily.

Patent Document 1: JP-A-2008-21092

However, such a simulation device ends up performing recognition processing with excessive accuracy, which makes it difficult to improve the processing speed.

The object recognition processing device of the present invention is characterized by comprising:
a storage unit that stores object information of an identification target;
an image generation unit that generates a simulation image in which a plurality of the identification targets are arranged;
a recognition accuracy calculation unit that performs image recognition by comparing the identification target in the simulation image generated by the image generation unit with the object information, and obtains the recognition accuracy of the image recognition for each posture of the identification target;
a user interface unit that notifies a user of the recognition accuracy and accepts, for each posture of the identification target, whether or not the user allows the recognition accuracy; and
a discriminator generation unit that, when the user interface unit accepts a rejection, generates a discriminator for each rejected posture of the identification target.

FIG. 1: Overall configuration of a picking device according to a preferred embodiment of the present invention.
FIG. 2: Block diagram of the configuration of the object recognition processing device.
FIG. 3: Flowchart of the teaching procedure.
FIG. 4: Example of the simulation image P1.
FIG. 5: Example of a screen displayed on the monitor.
FIG. 6: Example of a screen displayed on the monitor.
FIG. 7: Flowchart of the picking method of the picking device.

Hereinafter, the object recognition processing device, the object recognition processing method, and the picking device of the present invention will be described in detail based on the embodiment shown in the accompanying drawings.

FIG. 1 is a diagram showing the overall configuration of a picking device according to a preferred embodiment of the present invention. FIG. 2 is a block diagram showing the configuration of the object recognition processing device. FIG. 3 is a flowchart showing the teaching procedure. FIG. 4 is a diagram showing an example of the simulation image P1. FIGS. 5 and 6 are each a diagram showing an example of a screen displayed on the monitor. FIG. 7 is a flowchart showing the picking method of the picking device.

The picking device 1 shown in FIG. 1 includes a camera 200 that images a plurality of objects X, which are the identification targets and are arranged irregularly, that is, at random, on a mounting table 170; an object recognition processing device 300 that performs object recognition processing based on the imaging result of the camera 200; and a robot 100 having a manipulator 151 that picks one or more objects X from the mounting table 170 based on the result of the object recognition processing by the object recognition processing device 300.

First, the robot 100 will be briefly described. As shown in FIG. 1, the robot 100 is a horizontal articulated robot, that is, a SCARA robot, and is used for tasks such as holding, transporting, assembling, and inspecting workpieces such as electronic components. The use of the robot 100 is not particularly limited.

The robot 100 has a base 110 and an arm 120 connected to the base 110. The arm 120 has a first arm 121 whose base end is connected to the base 110 and which is rotatable about a first axis J1 with respect to the base 110, and a second arm 122 whose base end is connected to the tip of the first arm 121 and which is rotatable with respect to the first arm 121 about a second axis J2 parallel to the first axis J1. A work head 130 is provided at the tip of the second arm 122.

The base 110 is fixed, for example, to a floor surface (not shown) with bolts or the like. A drive device 141 that rotates the first arm 121 about the first axis J1 with respect to the base 110 is provided in the base 110, and a drive device 142 that rotates the second arm 122 about the second axis J2 with respect to the first arm 121 is provided in the second arm 122. Each of the drive devices 141 and 142 includes a motor M as a drive source, a controller C that controls the driving of the motor M, an encoder E that detects the amount of rotation of the motor M, and the like.

The work head 130 has a spline nut 131 and a ball screw nut 132 arranged coaxially at the tip of the second arm 122, and a spline shaft 133 inserted through the spline nut 131 and the ball screw nut 132. The spline shaft 133 is rotatable with respect to the second arm 122 about a third axis J3, which is its central axis and is parallel to the first and second axes J1 and J2, and can also be raised and lowered along the third axis J3.

Provided in the second arm 122 are a drive device 143 that rotates the spline nut 131 to rotate the spline shaft 133 about the third axis J3, and a drive device 144 that rotates the ball screw nut 132 to raise and lower the spline shaft 133 along the third axis J3. Each of the drive devices 143 and 144 includes a motor M as a drive source, a controller C that controls the driving of the motor M, an encoder E that detects the amount of rotation of the motor M, and the like.

A payload 150 for mounting an end effector is provided at the lower end of the spline shaft 133. The end effector mounted on the payload 150 is not particularly limited; in the present embodiment, a manipulator 151 for picking, that is, gripping, the object X is used. The picking method is not particularly limited either: the object may be clamped between a plurality of claws, or held by suction using an air chuck, an electrostatic chuck, or the like.

A robot control device 160 that controls the driving of the drive devices 141, 142, 143, and 144 based on commands from the object recognition processing device 300 is provided in the base 110. The robot control device 160 is constituted by, for example, a computer, and has a processor (CPU) that processes information, a memory communicably connected to the processor, and an external interface. The memory stores various programs executable by the processor, and the processor can read and execute the programs and other data stored in the memory.

The overall configuration of the robot 100 has been briefly described above. However, the configuration of the robot 100 is not particularly limited. For example, the first arm 121 may be omitted from the arm 120 so that the second arm 122 is connected to the base 110, or at least one additional arm rotatable about an axis parallel to the first and second axes J1 and J2 may be interposed between the first arm 121 and the second arm 122. Further, instead of a horizontal articulated robot, the robot may be an articulated robot such as a six-axis robot, in which the rotation axes of a plurality of arms are skew to one another, or a dual-arm robot.

Next, the camera 200 will be described. The camera 200 has a function of imaging the plurality of objects X arranged on the mounting table 170 to obtain an image containing the objects X. The positional relationship of the camera 200 relative to the mounting table 170 is fixed. This makes it easy to perform object recognition processing of the objects X from the image captured by the camera 200. The camera 200 is not particularly limited; in the present embodiment, an RGB camera is used. Besides an RGB camera, a grayscale camera, an infrared camera, or the like can also be used as the camera 200. Further, instead of the camera 200, a depth sensor capable of acquiring point cloud data of the objects X, for example, may be used.

Next, the object recognition processing device 300 will be described. The object recognition processing device 300 is connected to the robot control device 160, the camera 200, and a keyboard K. The object recognition processing device 300 is constituted by, for example, a computer, and has a processor (CPU) that processes information, a memory communicably connected to the processor, and an external interface. The memory stores various programs executable by the processor, and the processor can read and execute the programs and other data stored in the memory.

As shown in FIG. 2, the object recognition processing device 300 has a simulation unit 310, a discriminator generation unit 320, a user interface unit 330, an image acquisition unit 340, a recognition unit 350, a learning unit 360, a processing time estimation unit 370, a communication unit 380, and a control unit 390 that controls these units. Of these, the simulation unit 310, the discriminator generation unit 320, the user interface unit 330, the learning unit 360, and the processing time estimation unit 370 carry out the teaching work for the recognition processing, and the image acquisition unit 340, the recognition unit 350, and the learning unit 360 carry out the recognition processing of the actual objects X.

First, the teaching work for the recognition processing will be described. As shown in FIG. 3, the teaching work includes step S11 of learning information on the object X to obtain object information D1; step S12 of simulating the recognition processing of the object X using a basic recognition algorithm; step S13 of generating, based on the result of the simulation in step S12, additional discriminators for improving the recognition accuracy; and step S14 of obtaining the recognition processing time required when the discriminators generated in step S13 are added.

[Step S11]
The learning unit 360 creates the object information D1 using CAD data of the object X to be recognized, that is, data created using CAD (computer-aided design), and obtains a learning result to be used in the recognition processing of the object X. Creating the object information D1 from CAD data makes the object information D1 more accurate. An example of the object information D1 is a set of templates of the object X viewed from various angles through 360 degrees. The learning unit 360 has a storage unit 361 that stores information, and stores the learning result in the storage unit 361. The number of templates used for the recognition processing of the object X is not particularly limited; the more templates there are, the higher the recognition accuracy but the lower the recognition speed. The number can therefore be set as appropriate according to the desired balance between recognition accuracy and recognition speed.
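The text does not say how the viewing angles for the templates are chosen. As a minimal sketch of one possibility, the following assumes the viewing directions are spread roughly evenly over a sphere using a Fibonacci lattice; the function name and the count of 200 are hypothetical, and rendering each template from the CAD data is left out. The parameter `n` plays the role of the number of templates, so it is the knob for the accuracy/speed trade-off described above.

```python
import math

def fibonacci_sphere_viewpoints(n):
    """Return n roughly evenly spaced unit vectors (viewing directions) on a sphere.

    Assumption for illustration: the source only says the templates show the
    object from various angles; this particular sampling scheme is not from it.
    """
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    viewpoints = []
    for i in range(n):
        # z steps from just below +1 to just above -1, so neither pole is repeated
        z = 1.0 - (2.0 * i + 1.0) / n
        radius = math.sqrt(max(0.0, 1.0 - z * z))
        theta = golden_angle * i
        viewpoints.append((radius * math.cos(theta), radius * math.sin(theta), z))
    return viewpoints

# Hypothetical template count; more viewpoints mean finer posture resolution
# but more templates to compare at recognition time.
viewpoints = fibonacci_sphere_viewpoints(200)
```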

[Step S12]
The simulation unit 310 simulates the recognition processing of the object X using the object information D1 generated by the learning unit 360. The simulation unit 310 has an image generation unit 311 and a recognition accuracy calculation unit 312.

The image generation unit 311 virtually prepares, on a computer, a mounting table identical to the real-world mounting table 170, and uses the CAD data of the object X to place a plurality of objects X at random on that table. Then, as shown in FIG. 4, the image generation unit 311 generates, as a simulation image P1, an image of the objects X on the table as captured by a virtual camera identical to the real-world camera 200. In this way, an image equivalent to one captured by the camera 200 in the real world can be generated on the computer as the simulation image P1. The image generation unit 311 stores the position and posture of each object X used to generate the simulation image P1.

The virtual camera on the computer preferably has the same specifications as the real-world camera 200, and its positional relationship relative to the mounting table is preferably also the same as in the real world. The "specifications" here include, for example, the focal length of the lens, the F-number, the resolution of the image sensor, the ISO sensitivity, and noise. This makes it possible to generate a simulation image P1 that is closer to the real world.

The image generation unit 311 also preferably reproduces the real-world environment when generating the simulation image P1. That is, the image generation unit 311 preferably generates the simulation image P1 by reproducing information on the environment in which the objects X are placed in the real world, in particular the number, positions, light intensity, and color of the light sources. This allows shadows and shading to appear on the objects X in the simulation image P1, so that a simulation image P1 even closer to the real world can be generated.

The recognition accuracy calculation unit 312 performs recognition processing of the objects X by comparing, using the basic recognition algorithm, the objects X in the simulation image P1 generated by the image generation unit 311 with the object information D1 stored in the storage unit 361, and obtains the recognition accuracy for each posture of the object X. Such a method makes it easier to obtain the recognition accuracy. The recognition method is not particularly limited; for example, the following method can be used.

First, as shown in FIG. 4, the recognition accuracy calculation unit 312 sets a region S whose window size is large enough to contain an object X, and finds an object X by moving this region S over the simulation image P1. Next, the recognition accuracy calculation unit 312 acquires the image in which the object X appears and processes it with the basic recognition algorithm, thereby estimating the position and posture of the object X in that image from the object information D1 stored in the storage unit 361. The estimation method is not particularly limited; for example, the recognition accuracy calculation unit 312 can estimate the position and posture of the object X in the image by template matching using the object information D1. Next, the recognition accuracy calculation unit 312 determines whether the estimated posture of the object X is correct or incorrect. As described above, the posture of each object X in the simulation image P1 is stored in the image generation unit 311, so the estimate can be judged correct or incorrect against this stored posture.
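The window search and template comparison described above can be sketched as follows. This is a toy illustration under stated assumptions, not the method of the source: images are small 2D lists of brightness values, the match score is the sum of absolute differences (SAD), only the single best match is returned, and the names `match_templates`, `templates`, and `ground_truth` are hypothetical.

```python
def sad(window, template):
    """Sum of absolute differences between two equally sized 2D lists."""
    return sum(abs(w - t)
               for w_row, t_row in zip(window, template)
               for w, t in zip(w_row, t_row))

def match_templates(image, templates):
    """Slide a window (region S) over the image and compare it with every template.

    templates maps a posture label to a 2D list; all templates share one size.
    Returns (score, top, left, posture) for the best match found.
    """
    height, width = len(image), len(image[0])
    first = next(iter(templates.values()))
    t_height, t_width = len(first), len(first[0])
    best = None
    for top in range(height - t_height + 1):
        for left in range(width - t_width + 1):
            window = [row[left:left + t_width] for row in image[top:top + t_height]]
            for posture, template in templates.items():
                score = sad(window, template)
                if best is None or score < best[0]:
                    best = (score, top, left, posture)
    return best

# Toy "simulation image" containing one object in posture "up" at (top=1, left=2).
image = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
templates = {"up": [[1, 0], [1, 1]], "down": [[1, 1], [0, 1]]}
ground_truth = {"top": 1, "left": 2, "posture": "up"}  # stored when the image was generated

best = match_templates(image, templates)
# The estimate is judged against the posture stored at generation time.
is_correct = best[3] == ground_truth["posture"]
```

Because the ground truth comes from the image generator itself, the correct/incorrect judgment needs no manual labeling.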

The recognition accuracy calculation unit 312 performs the above procedure in turn for every object X appearing in the simulation image P1. When the recognition accuracy calculation unit 312 has finished the procedure for all the objects X, the image generation unit 311 generates a new simulation image P1 in which the objects X are arranged differently from the previous simulation image P1. The new simulation image P1 can be generated as if the objects X arranged for the previous simulation image P1 had been stirred at random.

For the newly generated simulation image P1 as well, the recognition accuracy calculation unit 312 estimates the postures of all the objects X in the same way as before and determines whether each estimate is correct or incorrect. In this way, the simulation unit 310 repeats the generation of a simulation image P1 by the image generation unit 311 and the estimation of the postures of the objects X by the recognition accuracy calculation unit 312 a sufficient number of times. As a result, a large number of posture estimates for the object X, together with whether each was correct, can be learned. The image generation unit 311 preferably places objects X in a variety of postures evenly across the many simulation images P1 it generates, so that the recognition accuracy calculation unit 312 can judge the correctness of the estimates evenly across the various postures.
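One way to keep the postures balanced across many simulation images is to draw them from a shuffled pool that is refilled only when empty. The sketch below is an assumption for illustration (the source does not say how balance is achieved); `balanced_posture_batches` and its parameters are hypothetical names, and postures are represented by integer IDs.

```python
import random
from collections import Counter

def balanced_posture_batches(posture_ids, objects_per_image, num_images, seed=0):
    """Assign postures to the objects of each simulation image.

    Postures are drawn from a shuffled pool that is refilled only when it runs
    out, so overall counts per posture never differ by more than one.
    """
    rng = random.Random(seed)
    pool = []
    batches = []
    for _ in range(num_images):
        batch = []
        while len(batch) < objects_per_image:
            if not pool:
                pool = list(posture_ids)
                rng.shuffle(pool)
            batch.append(pool.pop())
        batches.append(batch)
    return batches

# Hypothetical sizes: 5 discrete postures, 7 objects per image, 10 images.
batches = balanced_posture_batches(range(5), objects_per_image=7, num_images=10)
counts = Counter(p for batch in batches for p in batch)
```

With these sizes each posture appears the same number of times overall, while the arrangement within each image is still random.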

Next, the recognition accuracy calculation unit 312 obtains the recognition accuracy for each posture of the object X based on the learned information. Specifically, for each learned posture of the object X, it obtains the correct rate and the incorrect rate of the estimates and, for the incorrect estimates, which other postures the posture was mistaken for, together with the probability of each. That is, it obtains information such as: the correct rate is x%, the postures it was mistaken for are posture A and posture B, the probability of misrecognition as posture A is y%, and the probability of misrecognition as posture B is z%.
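The per-posture statistics described above amount to one row of a confusion matrix per posture. A minimal sketch, assuming the simulation results are available as (true posture, estimated posture) pairs; the function name, the dictionary layout, and the sample numbers are hypothetical.

```python
from collections import Counter, defaultdict

def posture_statistics(results):
    """Compute the correct rate and misrecognition probabilities per true posture.

    results is an iterable of (true_posture, estimated_posture) pairs.
    """
    totals = Counter()
    confusions = defaultdict(Counter)
    for true_posture, estimated in results:
        totals[true_posture] += 1
        confusions[true_posture][estimated] += 1
    stats = {}
    for posture, n in totals.items():
        stats[posture] = {
            "correct_rate": confusions[posture][posture] / n,
            "misrecognized_as": {
                other: count / n
                for other, count in confusions[posture].items()
                if other != posture
            },
        }
    return stats

# Hypothetical tallies: posture "P1" was estimated correctly 8 times,
# mistaken for "A" once and for "B" once.
results = [("P1", "P1")] * 8 + [("P1", "A"), ("P1", "B")]
stats = posture_statistics(results)
```

Here `stats["P1"]` corresponds to x = 80%, y = 10%, and z = 10% in the notation of the text.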

The user interface unit 330 notifies the user of those postures, among the postures evaluated by the recognition accuracy calculation unit 312, whose recognition accuracy is lower than a predetermined probability. In the present embodiment, as shown in FIGS. 1 and 5, the user is notified by displaying them on a monitor 400 connected to the object recognition processing device 300. However, the notification method is not particularly limited. When no posture has low recognition accuracy, the user interface unit 330 notifies the user to that effect. The monitor 400 may be built into the object recognition processing device 300.

In the example shown in FIG. 5, two postures whose recognition accuracy was lower than the predetermined probability are displayed. For each posture, the correct rate, the postures it was mistaken for, and the misrecognition rate for each of those postures are also displayed. The user interface unit 330 has a reception unit 331 that accepts from the user whether such misrecognition is to be allowed. In the illustrated configuration, the reception unit 331 is displayed on the monitor 400 and has check boxes with which the user can select whether or not to allow the misrecognition. The user decides for each posture whether to allow the misrecognition and, if so, checks the check box corresponding to that posture using an operating device such as the keyboard K or a mouse. The user interface unit 330 then accepts a signal corresponding to this selection.

For example, posture No. 1 is mistaken for the upside-down posture, and if the robot 100 picks up the object upside down, subsequent work may be hindered. The misrecognition of posture No. 1 therefore cannot be allowed, and its check box is left unchecked. Posture No. 2, on the other hand, is mistaken for a posture rotated about the central axis, and even if the robot 100 picks up the object in that rotated posture, subsequent work is hardly affected. The misrecognition of posture No. 2 can therefore be allowed, and its check box is checked. In this way, whether to allow misrecognition can be decided for each posture in light of individual circumstances, such as whether subsequent work would be hindered.
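The notification and acceptance logic can be sketched as a two-step filter: first select the postures whose correct rate falls below the threshold, then drop those whose misrecognition the user has allowed. Everything concrete here is hypothetical: the function name, the threshold of 0.9, and the correct rates, which are not given in the source.

```python
def postures_needing_discriminator(correct_rates, threshold, allowed):
    """Return (reported, rejected) posture lists.

    reported: postures whose correct rate is below the threshold (shown to the user).
    rejected: reported postures whose misrecognition the user did not allow.
    """
    reported = [p for p, rate in correct_rates.items() if rate < threshold]
    rejected = [p for p in reported if p not in allowed]
    return reported, rejected

# Hypothetical correct rates per posture and a hypothetical threshold.
correct_rates = {"No.1": 0.85, "No.2": 0.88, "No.3": 0.99}
# As in the example above, the user checks the box for No. 2 only.
allowed = {"No.2"}

reported, rejected = postures_needing_discriminator(correct_rates, 0.9, allowed)
```

Only the postures in `rejected` go on to the discriminator generation step.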

[Step S13]
When the result accepted by the user interface unit 330 includes "do not allow misrecognition", the discriminator generation unit 320 individually generates, for each posture for which "do not allow misrecognition" was selected, a discriminator for improving the recognition accuracy. A discriminator consists of an algorithm for improving recognition accuracy, and this algorithm is generated using, for example, a neural network. This makes the discriminator easy to generate. Among neural networks, it is particularly preferable to use a class classifier based on a convolutional neural network (CNN) as the algorithm.

The class classifier can be generated, for example, by training it on a large number of images for which image recognition was correct and images for which it was incorrect, for a posture selected as "do not allow misrecognition". Such a class classifier does not search for the one correct posture among all the postures of the object X contained in the object information D1; instead, it picks the one correct posture from among the few postures that are easily confused, so the time required for the processing can be greatly reduced. The discriminator is not particularly limited; for example, it may be an image recognition algorithm using a support vector machine or an image recognition algorithm using AdaBoost.

As described above, by adding a discriminator that raises the recognition accuracy only for specific postures whose recognition accuracy is low, only the processing time for the postures to which a discriminator has been added becomes longer than with the basic recognition algorithm alone, while the processing time for the other postures, that is, the postures to which no discriminator has been added, is the same as with the basic recognition algorithm. Therefore, the recognition accuracy of the object X can be improved while keeping the drop in processing speed smaller than in, for example, a configuration in which the recognition accuracy must be improved for all postures. The result is an object recognition processing device 300 that can recognize the object X more accurately in a shorter time.
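The selective use of discriminators can be sketched as a two-stage pipeline. The sketch assumes, as an interpretation rather than something the text above states, that a discriminator is invoked whenever the basic algorithm's estimate falls within that discriminator's group of easily confused postures. The functions `base_recognizer` and `no1_discriminator` are stand-ins (a real discriminator would be a trained model such as a CNN), and the posture labels are hypothetical.

```python
def build_refiner_table(discriminators):
    """Map every posture in a confusion group to that group's discriminator."""
    table = {}
    for confusion_group, discriminator in discriminators:
        for posture in confusion_group:
            table[posture] = discriminator
    return table

def recognize(patch, base_recognizer, refiner_table):
    """Run the basic algorithm, then a discriminator only if one is registered.

    Returns (posture, refined) where refined tells whether the second stage ran.
    """
    posture = base_recognizer(patch)
    discriminator = refiner_table.get(posture)
    if discriminator is None:
        return posture, False          # same cost as the basic algorithm alone
    return discriminator(patch), True  # extra cost only for confusable postures

# Stand-ins for illustration only: the "patch" carries the answers directly.
def base_recognizer(patch):
    return patch["base_guess"]

def no1_discriminator(patch):
    return patch["truth"]

# Posture No.1 is easily confused with postures A and B (hypothetical labels).
table = build_refiner_table([({"No.1", "A", "B"}, no1_discriminator)])

result_confusable = recognize({"base_guess": "A", "truth": "No.1"}, base_recognizer, table)
result_other = recognize({"base_guess": "No.3", "truth": "No.3"}, base_recognizer, table)
```

The dictionary lookup keeps the overhead for postures without a discriminator negligible, which matches the point that their processing time is unchanged.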

[Step S14]
The processing time estimation unit 370 obtains two values. The first is the average processing time T1 [sec/object X] per object X when the object X is recognized using only the basic recognition algorithm, without adding the classifier generated by the classifier generation unit 320. The second is the average processing time T2 [sec/object X] per object X when that classifier is added and recognition by the classifier is performed after the basic recognition algorithm. Because recognition by the classifier is added, the average processing time T2 is longer than the average processing time T1.
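The relation between T1 and T2 can be sketched as follows. The specification does not state how the processing time estimation unit 370 computes these values, so this is only one plausible model: it assumes the classifier runs only for objects whose posture was rejected by the user, giving T2 = T1 + (fraction of rejected postures) x (classifier time). The function name `average_times`, the posture frequencies, and the timing figures are hypothetical.

```python
from typing import Dict, Set, Tuple


def average_times(
    posture_frequency: Dict[str, float],
    rejected_postures: Set[str],
    base_time: float,
    classifier_time: float,
) -> Tuple[float, float]:
    """Return (T1, T2) in seconds per object under a simple assumed model.

    T1: basic recognition algorithm only.
    T2: basic algorithm plus the classifier, which is assumed to run only
        for objects whose estimated posture is in `rejected_postures`.
    """
    total = sum(posture_frequency.values())
    if total <= 0:
        raise ValueError("posture_frequency must contain positive counts")
    extra_fraction = sum(
        count for posture, count in posture_frequency.items()
        if posture in rejected_postures
    ) / total
    t1 = base_time
    t2 = base_time + extra_fraction * classifier_time
    return t1, t2


# Hypothetical figures: 1 object in 10 lands in the rejected posture.
t1, t2 = average_times(
    posture_frequency={"front": 6, "back": 3, "side": 1},
    rejected_postures={"side"},
    base_time=0.20,
    classifier_time=0.50,
)
print(f"T1 = {t1:.2f} s/object, T2 = {t2:.2f} s/object")
```

Under this model the added delay shrinks as the rejected postures become rarer, which is consistent with the statement that only the postures with a classifier incur extra processing time.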

The user interface unit 330 notifies the user of the average processing times T1 and T2 obtained by the processing time estimation unit 370. In the present embodiment, as shown in FIG. 6, the user is notified by displaying the average processing times T1 and T2 side by side on the monitor 400. Displaying them side by side makes the delay caused by adding the classifier easy to see. The user can also make the final decision on whether the recognition accuracy should be improved based on the average processing times T1 and T2 displayed on the monitor 400. That is, the user can choose between raising the recognition accuracy at the cost of a longer processing time and keeping the processing time short at the cost of giving up the improvement in recognition accuracy.

As shown in FIG. 6, the user interface unit 330 has a reception unit 332 that accepts whether or not to add the classifier. The reception unit 332 is displayed on the monitor 400 together with the average processing times T1 and T2, and in the illustrated configuration the user can select "add" or "do not add". When "add" is accepted, a high-accuracy recognition algorithm in which the classifier is added to the basic recognition algorithm is used as the actual recognition algorithm. When "do not add" is accepted, the basic recognition algorithm is used as the actual recognition algorithm. When "do not add" is accepted, the user interface unit 330 may also return to the screen shown in FIG. 5 and let the user select again whether or not to allow misrecognition. The user interface unit 330 then receives a signal corresponding to this selection.

The teaching work for image recognition has been described above. Next, the work of picking the object X with the robot 100 will be described with reference to the flowchart shown in FIG. 7.

First, in step S21, the camera 200 images the object X, and the image acquisition unit 340 acquires the resulting image data P2. Next, in step S22, the recognition unit 350 sets a region S whose window size is large enough to contain the object X and places this region S at its initial position on the image data P2.

Next, in step S23, the recognition unit 350 determines whether the object X is present in the region S. If the object X is found in step S23, the recognition unit 350 estimates the position and posture of the object X using the basic recognition algorithm in step S24. Next, in step S25, the recognition unit 350 determines whether the estimated posture is one that requires additional recognition by the classifier described above, that is, a posture for which the user chose "do not allow misrecognition" in the teaching work described above. If additional recognition by the classifier is required, the recognition unit 350 performs it in step S26 and estimates the posture of the object X more accurately.

After step S26 ends, or if the object X was not found in step S23, or if additional recognition by the classifier was not required in step S25, the recognition unit 350 changes the position of the region S on the image data P2 in step S27. Next, in step S28, the recognition unit 350 determines whether the entire area of the image data P2 has been scanned with the region S. If the scan of the entire area of the image data P2 is not complete, the process returns to step S23 and steps S23 to S28 are repeated.

Conversely, if the scan of the entire area of the image data P2 is complete, in step S29 the recognition unit 350 determines, from among the objects X found, one object X to be picked by the robot 100. Then, in step S30, the communication unit 380 transmits to the robot 100 a command for picking the selected object X, and the robot 100 picks the object X. When the robot 100 finishes picking the object X, the process returns to step S21, and steps S21 to S30 are repeated until the picking work is complete.
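The scan in steps S22 to S28 can be sketched as a sliding-window loop. This is a minimal sketch under stated assumptions, not the implementation of the recognition unit 350: the callables `detect`, `estimate_posture`, and `refine_posture` are hypothetical stand-ins for the presence check, the basic recognition algorithm, and the added classifier, and the square window with a fixed stride is an assumption. Selecting one object (S29) and commanding the robot (S30) are omitted because the specification does not give the selection criterion.

```python
from typing import Callable, Iterator, List, Set, Tuple

Region = Tuple[int, int, int, int]  # (x, y, width, height) of region S


def window_positions(image_w: int, image_h: int, win: int, stride: int) -> Iterator[Region]:
    """Yield each position of region S, left to right, then top to bottom."""
    for y in range(0, image_h - win + 1, stride):
        for x in range(0, image_w - win + 1, stride):
            yield (x, y, win, win)


def scan_image(
    image_w: int,
    image_h: int,
    win: int,
    stride: int,
    detect: Callable[[Region], bool],
    estimate_posture: Callable[[Region], str],
    refine_posture: Callable[[Region, str], str],
    rejected_postures: Set[str],
) -> List[Tuple[Region, str]]:
    """Scan the whole image and return (region, posture) for each object found."""
    found: List[Tuple[Region, str]] = []
    for region in window_positions(image_w, image_h, win, stride):  # S22, S27, S28
        if not detect(region):                          # S23: object present?
            continue
        posture = estimate_posture(region)              # S24: basic algorithm
        if posture in rejected_postures:                # S25: extra step needed?
            posture = refine_posture(region, posture)   # S26: added classifier
        found.append((region, posture))
    return found


# Toy stand-ins: two objects at known window positions in an 8x8 image.
objects_at = {(4, 0): "side", (0, 4): "front"}
found = scan_image(
    8, 8, win=4, stride=4,
    detect=lambda r: (r[0], r[1]) in objects_at,
    estimate_posture=lambda r: objects_at[(r[0], r[1])],
    refine_posture=lambda r, p: p + "_refined",
    rejected_postures={"side"},
)
print(found)
```

Only the window whose estimated posture is in `rejected_postures` passes through `refine_posture`; every other window takes the same path as the basic recognition algorithm alone.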

The picking device 1 has been described above. The picking device 1 includes the object recognition processing device 300, which performs image recognition of the object X as the identification target, and the manipulator 151, which picks the object X based on the recognition result of the object recognition processing device 300. The object recognition processing device 300 includes: the storage unit 361, which stores the object information D1 of the object X; the image generation unit 311, which generates the simulation image P1 in which a plurality of objects X are arranged; the recognition accuracy calculation unit 312, which performs image recognition on the objects X in the simulation image P1 generated by the image generation unit 311 by comparing them with the object information D1 and obtains the recognition accuracy of the image recognition for each posture of the object X; the user interface unit 330, which notifies the user of the recognition accuracy and receives from the user, for each posture of the object X, whether or not the recognition accuracy is acceptable; and the classifier generation unit 320, which, when the user interface unit 330 receives a rejection, that is, "recognition accuracy not acceptable", generates a classifier for each rejected posture of the object X.

With such a configuration, a classifier that raises the recognition accuracy can be added only for specific postures with low recognition accuracy. As a result, the processing time becomes longer than with the basic recognition algorithm only for the postures to which the classifier is added, and for the other postures, that is, those to which no classifier is added, the processing time is the same as with the basic recognition algorithm. Therefore, the recognition accuracy for the object X can be improved while keeping the drop in processing speed smaller than in, for example, a configuration that must improve the recognition accuracy for all postures. The picking device 1 can thus recognize the object X more accurately in a shorter time.

Further, as described above, the object recognition processing device 300 includes: the storage unit 361, which stores the object information D1 of the object X as the identification target; the image generation unit 311, which generates the simulation image P1 in which a plurality of objects X are arranged; the recognition accuracy calculation unit 312, which performs image recognition on the objects X in the simulation image P1 generated by the image generation unit 311 by comparing them with the object information D1 and obtains the recognition accuracy of the image recognition for each posture of the object X; the user interface unit 330, which notifies the user of the recognition accuracy and receives from the user, for each posture of the object X, whether or not the recognition accuracy is acceptable; and the classifier generation unit 320, which, when the user interface unit 330 receives a rejection, that is, "recognition accuracy not acceptable", generates a classifier for each rejected posture of the object X.

With such a configuration, a classifier that raises the recognition accuracy can be added only for specific postures with low recognition accuracy. As a result, the processing time becomes longer than with the basic recognition algorithm only for the postures to which the classifier is added, and for the other postures, that is, those to which no classifier is added, the processing time is the same as with the basic recognition algorithm. Therefore, the recognition accuracy for the object X can be improved while keeping the drop in processing speed smaller than in, for example, a configuration that must improve the recognition accuracy for all postures. The object recognition processing device 300 can thus recognize the object X more accurately in a shorter time.

Further, as described above, the object recognition processing device 300 includes the processing time estimation unit 370, which estimates the image processing time when image processing is performed with the classifier generated by the classifier generation unit 320 added. The user interface unit 330 reports the average processing time T2, which is the image processing time estimated by the processing time estimation unit 370. This allows the user to judge whether the recognition accuracy should be improved for a specific posture. That is, the user can choose between raising the recognition accuracy at the cost of a longer processing time and keeping the processing time short at the cost of giving up the improvement in recognition accuracy.

Further, as described above, the object information D1 is generated from CAD data. This makes the object information D1 more accurate.

Further, as described above, the image generation unit 311 generates the simulation image P1 using information on the environment in which the object X exists. This makes it possible to generate a simulation image P1 that is close to the real world.

Further, as described above, the recognition accuracy calculation unit 312 obtains the recognition accuracy by collating the information on the object X acquired from the simulation image P1 with the object information D1 stored in the storage unit 361. This makes it easier to obtain the recognition accuracy.
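A per-posture recognition accuracy of this kind could be tallied as in the following minimal sketch. It assumes that each object placed in the simulation image has a known true posture that can be paired with the posture returned by image recognition; the function `accuracy_per_posture`, the sample data, and the definition of accuracy as the fraction of correct results are assumptions for illustration, not details taken from the specification.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def accuracy_per_posture(samples: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Return the fraction of correct recognitions for each true posture.

    Each sample is (true_posture, recognized_posture); the true posture is
    assumed to be known because the simulation placed the object itself.
    """
    totals: Dict[str, int] = defaultdict(int)
    correct: Dict[str, int] = defaultdict(int)
    for true_posture, recognized_posture in samples:
        totals[true_posture] += 1
        if recognized_posture == true_posture:
            correct[true_posture] += 1
    return {posture: correct[posture] / totals[posture] for posture in totals}


# Hypothetical results from one simulation image.
samples = [
    ("front", "front"),
    ("front", "front"),
    ("side", "front"),  # misrecognized
    ("side", "side"),
]
accuracy = accuracy_per_posture(samples)
print(accuracy)
```

The resulting table is what would be shown to the user, who then decides for each posture whether the accuracy is acceptable.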

Further, as described above, the classifier generation unit 320 generates the classifier using a neural network. This makes it easy to generate the classifier.

Further, as described above, the object recognition processing method includes: a step of storing the object information D1 of the object X as the identification target; a step of generating the simulation image P1 in which a plurality of objects X are arranged; a step of performing image recognition on the objects X in the simulation image P1 generated by the image generation unit 311 by comparing them with the object information D1 and obtaining the recognition accuracy of the image recognition for each posture of the object X; a step of notifying the user of the recognition accuracy and receiving from the user, for each posture of the object X, whether or not the recognition accuracy is acceptable; and a step of generating a classifier for each rejected posture of the object X when the user interface unit 330 receives a rejection, that is, "recognition accuracy not acceptable".

With such a configuration, a classifier that raises the recognition accuracy can be added only for specific postures with low recognition accuracy. As a result, the processing time becomes longer than with the basic recognition algorithm only for the postures to which the classifier is added, and for the other postures, that is, those to which no classifier is added, the processing time is the same as with the basic recognition algorithm. Therefore, the recognition accuracy for the object X can be improved while keeping the drop in processing speed smaller than in, for example, a configuration that must improve the recognition accuracy for all postures. The object recognition processing method can thus recognize the object X more accurately in a shorter time.

The object recognition processing device, the object recognition processing method, and the picking device of the present invention have been described above based on the illustrated embodiment, but the present invention is not limited to this embodiment. The configuration of each part can be replaced with any configuration having the same function, and any other components may be added.

1 ... picking device, 100 ... robot, 110 ... base, 120 ... arm, 121 ... first arm, 122 ... second arm, 130 ... working head, 131 ... spline nut, 132 ... ball screw nut, 133 ... spline shaft, 141 to 144 ... drive device, 150 ... payload, 151 ... manipulator, 160 ... robot control device, 170 ... mounting table, 200 ... camera, 300 ... object recognition processing device, 310 ... simulation unit, 311 ... image generation unit, 312 ... recognition accuracy calculation unit, 320 ... classifier generation unit, 330 ... user interface unit, 331, 332 ... reception unit, 340 ... image acquisition unit, 350 ... recognition unit, 360 ... learning unit, 361 ... storage unit, 370 ... processing time estimation unit, 380 ... communication unit, 390 ... control unit, 400 ... monitor, C ... controller, D1 ... object information, E ... encoder, J1 ... first axis, J2 ... second axis, J3 ... third axis, K ... keyboard, M ... motor, P1 ... simulation image, P2 ... image data, S ... region, S11 to S14, S21 to S30 ... steps, T1 ... average processing time, T2 ... average processing time, X ... object

Claims

An object recognition processing device comprising:
a storage unit that stores object information of an identification target;
an image generation unit that generates a simulation image in which a plurality of the identification targets are arranged;
a recognition accuracy calculation unit that performs image recognition on the identification targets in the simulation image generated by the image generation unit by comparing them with the object information, and obtains a recognition accuracy of the image recognition for each posture of the identification target;
a user interface unit that notifies a user of the recognition accuracy and receives from the user, for each posture of the identification target, whether or not the recognition accuracy is acceptable; and
a classifier generation unit that, when the user interface unit receives a rejection, generates a classifier for each rejected posture of the identification target.

The object recognition processing device according to claim 1, further comprising a processing time estimation unit that estimates an image processing time when image processing is performed with the classifier generated by the classifier generation unit added,
wherein the user interface unit reports the image processing time estimated by the processing time estimation unit.

The object recognition processing device according to claim 1 or 2, wherein the object information is generated from CAD data.

The object recognition processing device according to any one of claims 1 to 3, wherein the image generation unit generates the simulation image by using the information of the environment in which the identification target exists.

The object recognition processing device according to any one of claims 1 to 4, wherein the recognition accuracy calculation unit obtains the recognition accuracy by collating the object information of the identification target acquired from the simulation image with the object information stored in the storage unit.

The object recognition processing device according to any one of claims 1 to 5, wherein the classifier generation unit generates the classifier using a neural network.

An object recognition processing method comprising:
a step of storing object information of an identification target;
a step of generating a simulation image in which a plurality of the identification targets are arranged;
a step of performing image recognition on the identification targets in the simulation image by comparing them with the object information, and obtaining a recognition accuracy of the image recognition for each posture of the identification target;
a step of notifying a user of the recognition accuracy and receiving from the user, for each posture of the identification target, whether or not the recognition accuracy is acceptable; and
a step of generating a classifier for each rejected posture of the identification target when the user interface unit receives a rejection.

A picking device comprising:
an object recognition processing device that performs image recognition of an identification target; and
a manipulator that picks the identification target based on a recognition result of the object recognition processing device,
wherein the object recognition processing device includes:
a storage unit that stores object information of the identification target;
an image generation unit that generates a simulation image in which a plurality of the identification targets are arranged;
a recognition accuracy calculation unit that performs image recognition on the identification targets in the simulation image generated by the image generation unit by comparing them with the object information, and obtains a recognition accuracy of the image recognition for each posture of the identification target;
a user interface unit that notifies a user of the recognition accuracy and receives from the user, for each posture of the identification target, whether or not the recognition accuracy is acceptable; and
a classifier generation unit that, when the user interface unit receives a rejection, generates a classifier for each rejected posture of the identification target.