JPWO2019098044A1

JPWO2019098044A1 - Robot motion adjustment device, motion control system, and robot system

Info

Publication number: JPWO2019098044A1
Application number: JP2019523125A
Authority: JP
Inventors: 浩司白土; 高志南本
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 2017-11-14
Filing date: 2018-11-01
Publication date: 2019-11-21
Anticipated expiration: 2038-11-01
Also published as: JP6696627B2; WO2019098044A1; DE112018005832T5; DE112018005832B4; CN111344120B; CN111344120A

Abstract

The motion of a robot is adjusted so that an excessive load does not act on the work target, and the adjustment itself is made easier. A robot control device (111) sends a motion command value to a robot (120) fitted with an end effector (130) and causes the robot (120) to attempt work on a work target (200). An external sensor (142) detects the force that acts on the end effector (130) as a result of the trial. A motion adjustment device (112) performs learning using the detection result of the external sensor (142), and adjusts and updates the motion command value acquired from the robot control device (111).

Description

The present invention relates to industrial robots, service robots for non-manufacturing industries, and the like. In particular, the present invention relates to a motion adjustment device and a motion control system that adjust the motion of a robot so that an end effector mounted on the robot reaches a target position and orientation, and to a robot system that includes the motion adjustment device and the motion control system.

In many conventional industrial robot systems, the robot and the work target are precisely positioned relative to each other, and the robot repeats its work at high speed and with high accuracy in that positioned environment. In recent years, by contrast, a growing number of robot systems make use of multiple external sensors such as force sensors or vision sensors. Such a robot system is used in an environment where the robot and the work target are not precisely positioned, and it controls the robot's motion according to the detection results of the external sensors.

For example, such a robot system is used in situations where the position and orientation of the object to be worked on, or the surrounding environment, is unknown. As another example, such a robot system is used in situations where the position and orientation of the object to be worked on, or the surrounding environment, changes. Specific cases include bin picking, insertion work that involves a surface-following motion, and fitting of parts such as connectors. In the field of service robots for non-manufacturing industries, work in variously changing environments is assumed, and the robot's motion is likewise controlled using multiple sensors.

In a robot control system that uses these sensors, multiple control parameters must be adjusted in order to adjust the robot's motion. When the control parameters are adjusted appropriately, the robot's motion becomes appropriate and the performance of the robot system is ensured. However, adjusting control parameters is not easy and often requires specialized knowledge. Several automatic adjustment means have therefore been proposed to make the adjustment of control parameters easier. For example, Patent Document 1 discloses a robot system that speeds up the motion of a robot through learning.

Patent Document 1: JP 2017-94438 A

In conventional robot systems, learning does not take into account the magnitude of the load that the robot's motion places on the work target. Consequently, in the robot motion obtained by learning, the load acting on the work target is not necessarily of an appropriate magnitude, and an excessive load may act on the work target. An object of the present invention is to obtain a motion adjustment device, a motion control system, and a robot system that can adjust the motion of a robot so that no excessive load acts on the work target, and that make the adjustment of the robot's motion easier.

The robot motion adjustment device of the present invention is used in a robot system that includes a robot fitted with an end effector and a robot control device that controls the motion of the robot, and in which the robot performs work on a work target. The device includes a command value learning unit that performs learning with the force acting on the end effector, as detected by an external sensor, as input, and adjusts the motion command value that the robot control device transmits to the robot in order to control the robot's motion.

The motion control system of the present invention is used in a robot system in which a robot fitted with an end effector performs work on a work target. The system includes a robot control device that transmits a motion command value to the robot to control the robot's motion, and a command value learning unit that performs learning with the force acting on the end effector, as detected by a sensor, as input and adjusts the motion command value.

The robot system of the present invention includes a robot fitted with an end effector, a robot control device that transmits a motion command value to the robot to control the robot's motion, and a command value learning unit that performs learning with the force acting on the end effector, as detected by a sensor, as input and adjusts the motion command value. In this robot system, the robot performs work on a work target.

According to the motion adjustment device, the motion control system, and the robot system of the present invention, the motion of the robot can be adjusted so that no excessive load acts on the work target, and the adjustment of the robot's motion is made easier.

A block diagram showing an example of the system configuration of a robot system provided with the motion adjustment device according to Embodiment 1 of the present invention.
A diagram showing an example of a specific hardware configuration for implementing the robot control device and the motion adjustment device according to Embodiment 1 of the present invention.
A block diagram showing a configuration example of the motion adjustment device according to Embodiment 1 of the present invention and its peripheral blocks.
A diagram for explaining the operation of the motion adjustment device according to Embodiment 1 of the present invention.
A diagram showing an example of a speed pattern before updating in the robot system according to Embodiment 1 of the present invention.
A flowchart showing an example of the processing flow of the motion control system according to Embodiment 1 of the present invention.
A diagram for explaining the operation of the motion adjustment device according to Embodiment 2 of the present invention.
A diagram showing an example of the initial value of a speed pattern in the robot system according to Embodiment 2 of the present invention.
A diagram showing an example of the detected values of the force sensor in the robot system according to Embodiment 2 of the present invention.
A diagram showing an example of a speed pattern after updating in the robot system according to Embodiment 2 of the present invention.
A diagram showing another example of a speed pattern after updating in the robot system according to Embodiment 2 of the present invention.
A block diagram showing a configuration example of the motion adjustment device according to Embodiment 3 of the present invention and its peripheral blocks.
A block diagram showing a configuration example of the command value learning unit according to Embodiment 3 of the present invention and its peripheral blocks.
A diagram showing an example of the work performed by the robot system according to Embodiment 3 of the present invention.
A flowchart showing an example of the processing flow of the learning processing unit according to Embodiment 3 of the present invention.
A flowchart showing an example of the flow of the preprocessing performed by the learning processing unit according to Embodiment 3 of the present invention.
A flowchart showing an example of the flow of the learning process performed by the learning processing unit according to Embodiment 3 of the present invention.
A diagram showing an example of a speed pattern during a trial in the robot system according to Embodiment 3 of the present invention.
A diagram showing an example of the force information acquired during a trial in the robot system according to Embodiment 3 of the present invention.
A block diagram showing a configuration example of the motion adjustment device according to Embodiment 4 of the present invention and its peripheral blocks.
A flowchart showing an example of the flow of the preprocessing performed by the motion adjustment device according to Embodiment 4 of the present invention.
A flowchart showing an example of the flow of the learning process performed by the motion adjustment device according to Embodiment 4 of the present invention.
A block diagram showing a configuration example of the motion adjustment device according to Embodiment 5 of the present invention and its peripheral blocks.
A block diagram showing a configuration example of the motion learning unit according to Embodiment 5 of the present invention.
A block diagram showing another configuration example of the motion adjustment device according to Embodiment 5 of the present invention and its peripheral blocks.

Embodiment 1.
FIG. 1 is a block diagram showing an example of the system configuration of a robot system 100 provided with the motion adjustment device according to Embodiment 1 of the present invention. As shown in FIG. 1, the robot system 100 includes a motion control system 110, a robot 120, an end effector 130, an internal sensor 141, and an external sensor 142. The motion control system 110 includes a robot control device 111 and a motion adjustment device 112. The robot control device is also called a robot controller.

Based on the detection results of the internal sensor 141 and the external sensor 142, the robot control device 111 transmits to the robot 120 a motion command value for controlling the motion of the robot 120, and thereby controls the motion of the robot 120. An end effector 130 such as a robot hand is attached to the robot 120. The end effector 130 acts directly on the work target 200. An end effector 130 of a type suited to each task performed by the robot system 100 is selected. A surrounding environment 300 exists around the work target 200.

The surrounding environment 300 includes, for example, a part to which the work target 200 is to be assembled, a jig that positions the work target 200, a tool that processes the work target 200 (such as an electric screwdriver), a parts feeder that supplies the work target 200, a safety cover that surrounds the robot 120, and a belt conveyor that conveys the work target 200. The external sensor 142, such as a camera that images the work target, may also be treated as part of the surrounding environment. This is because the robot 120 or the end effector 130 may come into contact with the external sensor 142, for example when the external sensor 142 is fixed at a predetermined position around the robot 120.

The motion command value output from the robot control device 111 is, for example, information representing the target position and target orientation, at each time, of the end effector 130 attached to the robot 120; that is, a position command value. When the motion command value represents the target position of the end effector 130 at each time, it also represents the movement speed of the end effector 130 between those times. The position command value can therefore also be regarded as a speed command value that represents the target motion speed of the robot.
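The relationship between time-stamped position commands and the speed they imply can be illustrated with a finite difference. The following is a minimal sketch, not part of the patent: the function name `speeds_between_commands`, the `(x, y, z)` tuple layout, and the sample values are assumptions made for illustration, and orientation is ignored.

```python
import math
from typing import List, Sequence

Point = Sequence[float]  # hypothetical (x, y, z) target position of the end effector


def speeds_between_commands(times: Sequence[float],
                            positions: Sequence[Point]) -> List[float]:
    """Return the average speed implied between consecutive position commands."""
    if len(times) != len(positions):
        raise ValueError("times and positions must have the same length")
    speeds = []
    for k in range(1, len(times)):
        dt = times[k] - times[k - 1]
        if dt <= 0:
            raise ValueError("command times must be strictly increasing")
        # Straight-line distance between consecutive target positions.
        distance = math.dist(positions[k - 1], positions[k])
        speeds.append(distance / dt)
    return speeds


# Hypothetical commands: a move during the first second, then a hold.
implied = speeds_between_commands(
    [0.0, 1.0, 2.0],
    [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 0.0)],
)
print(implied)
```

With N position commands the sketch yields N - 1 speeds, one per interval, which matches the idea that the speed is defined between command times rather than at them.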

The motion command value output from the robot control device 111 may also be a speed command value that represents the target motion speed of the robot 120 or the target movement speed of the end effector 130. The target motion speed or target movement speed is given as the speed between successive time points of the motion of the robot 120, or as the speed between successive points on the path. Furthermore, the motion command value may be an acceleration command value that represents the target acceleration of the motion of the robot 120 or the target acceleration of the movement of the end effector 130. The motion command value can take various forms as long as it directly controls the motion of the robot 120.

The motion adjustment device 112 adjusts and updates the motion command value generated by the robot control device 111 according to the detection result of the external sensor 142 and an externally supplied constraint condition. That is, the motion adjustment device 112 adjusts the motion of the robot. Put differently, the motion adjustment device 112 adjusts the correspondence between the detection results of the internal sensor 141 and the external sensor 142 and the motion command value output from the robot control device 111, and updates that correspondence to reflect the adjustment result. Adjusting the motion command value can also be described as modifying or correcting the motion command value.
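To make the idea of adjusting a command value from a sensed force and a constraint concrete, the sketch below applies a simple proportional scaling rule. This rule is a stand-in chosen for illustration only; it is not the learning method of the patent, and the names `adjust_speed_commands`, `force_upper_limit`, and `min_scale` are hypothetical.

```python
from typing import List, Sequence


def adjust_speed_commands(speeds: Sequence[float],
                          peak_forces: Sequence[float],
                          force_upper_limit: float,
                          min_scale: float = 0.1) -> List[float]:
    """Scale down each speed command whose trial force exceeded the limit.

    speeds[i] is the speed commanded at path point i, and peak_forces[i] is
    the force magnitude observed there during a trial (both hypothetical).
    """
    if len(speeds) != len(peak_forces):
        raise ValueError("speeds and peak_forces must have the same length")
    if force_upper_limit <= 0:
        raise ValueError("force_upper_limit must be positive")
    adjusted = []
    for speed, force in zip(speeds, peak_forces):
        if force > force_upper_limit:
            # Illustrative rule: reduce speed in proportion to the overshoot,
            # but never below min_scale of the original command.
            scale = max(force_upper_limit / force, min_scale)
            adjusted.append(speed * scale)
        else:
            adjusted.append(speed)
    return adjusted


# Hypothetical trial: only the second point exceeded a 10 N limit.
print(adjust_speed_commands([0.2, 0.2, 0.1], [5.0, 20.0, 8.0], 10.0))
```

Only the command at the point where the limit was exceeded changes; the other commands are left as they were, which mirrors the notion of updating the existing command values rather than regenerating them.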

When an updated motion command value exists, the robot control device 111 outputs the updated motion command value to the robot 120. The motion adjustment device 112 may update the motion command value by referring not only to the detection result of the external sensor 142 but also to the detection result of the internal sensor 141. The constraint condition may be stored in advance inside the motion adjustment device 112 or the robot control device 111.

The robot system 100 of the present embodiment performs two processes: an adjustment process that adjusts and updates the motion command value, and a work process that performs work on the work target 200 using the updated motion command value. In other words, the operation of the robot system 100 has an adjustment phase and a work phase; the adjustment process is the processing of the robot system 100 in the adjustment phase, and the work process is the processing of the robot system 100 in the work phase. In the adjustment process, the motion adjustment device 112 adjusts the motion command value so that it becomes optimal. The adjustment process and the work process do not, however, have to be completely separate. For example, the robot system 100 may be configured so that the motion adjustment device 112 continually calculates an optimal motion command value even while work on the work target 200 is in progress. In this configuration, the robot system 100 updates the motion command value at a predetermined timing as needed, for example when a motion command value more appropriate than the one currently in use has been calculated. The same applies to the subsequent embodiments.

FIG. 2 is a diagram showing an example of a specific hardware configuration for implementing the robot control device 111 and the motion adjustment device 112. The robot control device 111 and the motion adjustment device 112 are implemented by a processor 401 executing a program stored in a memory 402. The processor 401 and the memory 402 are connected by a data bus 403. The memory 402 includes volatile memory and non-volatile memory, and temporary information is stored in the volatile memory. The robot control device 111 and the motion adjustment device 112 may be configured as a single unit or as separate units. For example, the robot control device 111 and the motion adjustment device 112 may be connected via a network or the like. In the subsequent embodiments as well, the robot control device 111 and the motion adjustment device 112 can be implemented with the same hardware configuration.

The robot system 100 forms a control system in which the motion control system 110 outputs a motion command value based on the data acquired by the internal sensor 141 and the external sensor 142, and the robot 120 moves so as to follow the motion command value. Examples of the internal sensor 141 include a sensor that acquires the positions of the robot's joints, a sensor that acquires the motion speed of the joints, and a sensor that acquires the current value of the motors that drive the joints. In the robot system 100, the robot control device 111, the robot 120, and the internal sensor 141 form a position control system that positions the end effector 130. Sensors that acquire the positions of the robot's joints include, for example, encoders that detect the amount of motor rotation, resolvers, and potentiometers. Sensors that acquire the motion speed of the joints include tachometers. Gyro sensors, inertial sensors, and the like may also be used as internal sensors to obtain information about the robot 120 itself.

Through feedback control based on the internal sensor 141, the robot system 100 forms a position-controlled robot system that performs material handling work and the like. Here, material handling work means work that transfers or conveys materials, parts, and the like. This position-controlled robot system is referred to as a feedback control system based on the internal sensor 141. In feedback control based on the internal sensor 141, the control parameters include the position control gain, the speed control gain, the current control gain, and the design parameters of the filters used for feedback control. Filters that may be used for feedback control include moving average filters, low-pass filters, band-pass filters, and high-pass filters. Feedback control based on the internal sensor 141 is the control that makes the robot 120 move according to the motion command value. In other words, it is the control performed in order to realize the motion command value.
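Two of the filter types named above can be sketched in a few lines to show what their design parameters are. This is a generic illustration under assumed names (`MovingAverageFilter`, `low_pass`, `window`, `alpha`); the patent does not specify filter implementations.

```python
from collections import deque
from typing import Iterable, List


class MovingAverageFilter:
    """Average of the most recent `window` samples (window is the design parameter)."""

    def __init__(self, window: int) -> None:
        if window < 1:
            raise ValueError("window must be at least 1")
        self._samples = deque(maxlen=window)  # old samples drop out automatically

    def update(self, sample: float) -> float:
        self._samples.append(sample)
        return sum(self._samples) / len(self._samples)


def low_pass(samples: Iterable[float], alpha: float) -> List[float]:
    """First-order low-pass filter: y[k] = y[k-1] + alpha * (x[k] - y[k-1]).

    alpha in (0, 1] is the design parameter; smaller values smooth more.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    output = []
    y = None
    for x in samples:
        y = x if y is None else y + alpha * (x - y)
        output.append(y)
    return output


# Hypothetical noisy joint-speed readings.
readings = [0.0, 1.0, 1.0]
ma = MovingAverageFilter(window=2)
print([ma.update(r) for r in readings])
print(low_pass(readings, alpha=0.5))
```

In both cases a single number (the window length or the smoothing factor) trades noise rejection against response delay, which is one reason such parameters need adjustment.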

The external sensor 142, on the other hand, may be a force sensor, a vision sensor such as a camera, a tactile sensor, a touch sensor, or the like. The external sensor 142 measures the contact state and positional relationship between the robot 120 and the work target 200 or the surrounding environment 300. In the robot system 100, the robot control device 111, the motion adjustment device 112, the robot 120, and the external sensor 142 form a sensor feedback control system based on the external sensor 142. In some cases, instead of performing sensor feedback control based on the sensor signal output from the external sensor 142, the robot system 100 uses the sensor signal from the external sensor 142 simply as a trigger signal. In that case, the robot system 100 switches the control parameters of the feedback control based on the internal sensor 141, starting from the trigger signal. The sensor feedback control system based on the external sensor 142 is built as an outer loop around the position-controlled robot system.
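Using the external sensor signal purely as a trigger for switching inner-loop parameters can be sketched as a latched threshold. The class and field names (`GainSet`, `TriggeredGainSwitch`, `force_threshold`) and the numeric gains are hypothetical; the patent does not state how the trigger is derived from the signal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GainSet:
    """Hypothetical inner-loop control parameters."""
    position_gain: float
    speed_gain: float


class TriggeredGainSwitch:
    """Switch from one parameter set to another once a trigger has fired."""

    def __init__(self, force_threshold: float, before: GainSet, after: GainSet) -> None:
        self.force_threshold = force_threshold
        self.before = before
        self.after = after
        self.triggered = False

    def update(self, force_magnitude: float) -> GainSet:
        # Assumption: the trigger fires when the sensed force first reaches
        # the threshold, and it stays latched afterwards.
        if not self.triggered and force_magnitude >= self.force_threshold:
            self.triggered = True
        return self.after if self.triggered else self.before


free_motion = GainSet(position_gain=50.0, speed_gain=5.0)
in_contact = GainSet(position_gain=10.0, speed_gain=2.0)
switch = TriggeredGainSwitch(force_threshold=5.0, before=free_motion, after=in_contact)
for force in (0.2, 6.0, 0.5):
    print(force, switch.update(force))
```

The sensor value is never fed into the control law here; it only decides which parameter set the inner loop uses, which is the distinction the paragraph draws from full sensor feedback control.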

The sensor feedback control system based on the external sensor 142 senses the positional relationship, contact behavior, and so on between the robot 120, the robot arm, or the end effector 130 and the work target 200 or the surrounding environment 300, by means of acceleration, speed, position and orientation, distance, force, moment, and the like. Based on the sensing result, the sensor feedback control system based on the external sensor 142 then controls the motion of the robot 120 so as to obtain the desired positional relationship or force response. In other words, it corrects the motion command value so as to obtain the desired positional relationship or force response. In the sensor feedback control system based on the external sensor 142, the control parameters include the force control gain and impedance parameters for force-sense control, the gain and visual impedance parameters for visual servo control, and the setting parameters of the filters used for feedback control.

Control parameters that need to be adjusted when control is performed based on the internal sensor 141 and the external sensor 142 may hereinafter be referred to simply as parameters. Specific examples of sensors used as the internal sensor 141 or the external sensor 142 include a current sensor, a joint position sensor, a joint speed sensor, a temperature distance sensor, a camera, an RGB-D sensor, a proximity sensor, a tactile sensor, and a force sensor. The quantities measured by the internal sensor 141 or the external sensor 142 may include the position and orientation of the robot 120, the position and orientation of the end effector 130, the position and orientation of the workpiece that is the work target 200, and the position and orientation of a human worker.

FIG. 3 is a block diagram showing a configuration example of the motion adjustment device 112 according to Embodiment 1 of the present invention and its peripheral blocks. FIG. 3 shows an extracted part of the configuration of the robot system 100. The motion adjustment device 112 includes a command value learning unit 113. In FIG. 3, the sensor 140 represents the internal sensor 141 and the external sensor 142 together. As described above, many kinds of sensor can serve as the sensor 140. In the robot system 100 of the present embodiment, however, the sensor 140 includes at least a force sensor that detects the external force acting on the end effector 130 as a result of the motion of the robot 120. This force sensor is an external sensor 142. In the subsequent embodiments as well, the sensor 140 includes at least a force sensor.

The force sensor measures the external force acting on the end effector 130 and is used to perform force control or impedance control. Controlling the force that the end effector 130 applies to the work target 200 or the surrounding environment 300 is called force control. Controlling the motion of the robot 120 according to the detection result of the force sensor is called force-sense control. In force control, a target work force is set, and the magnitude of the force applied to the work target 200 or the surrounding environment 300 is controlled.

In impedance control, on the other hand, impedance characteristics (spring, damper, inertia) are defined for the contact force that arises, for example, when the end effector 130 comes into contact with the work target 200, and these characteristics are used for control. A contact force may also arise when the end effector 130 comes into contact with the surrounding environment 300, or when the work target 200 held by the end effector 130 comes into contact with the surrounding environment 300. The impedance characteristics are expressed by impedance parameters.
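The spring, damper, and inertia characteristics can be illustrated with the textbook one-dimensional impedance model m*a + d*v + k*x = f_ext, where x is the displacement from the commanded position. The sketch below integrates that model numerically; the model form, the function names, and the parameter values are assumptions for illustration and are not taken from the patent.

```python
from typing import Tuple


def impedance_step(x: float, v: float, f_ext: float,
                   m: float, d: float, k: float, dt: float) -> Tuple[float, float]:
    """Advance the model m*a + d*v + k*x = f_ext by one time step.

    x is displacement from the commanded position, v its rate of change,
    and (m, d, k) are the inertia, damper, and spring parameters.
    """
    a = (f_ext - d * v - k * x) / m
    v_next = v + a * dt          # semi-implicit Euler: update velocity first,
    x_next = x + v_next * dt     # then use the new velocity for position
    return x_next, v_next


def settle(f_ext: float, m: float, d: float, k: float,
           dt: float = 0.001, steps: int = 5000) -> float:
    """Return the displacement after holding a constant contact force."""
    x, v = 0.0, 0.0
    for _ in range(steps):
        x, v = impedance_step(x, v, f_ext, m, d, k, dt)
    return x


# Hypothetical parameters: a 10 N contact force against a 100 N/m virtual spring.
print(settle(f_ext=10.0, m=1.0, d=20.0, k=100.0))
```

Under a constant force the displacement settles near f_ext / k, so the spring parameter sets how far the end effector yields, while the damper and inertia parameters shape how quickly and how smoothly it gets there.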

In force control, the target value of the force control must be determined. In impedance control, the control characteristics must be determined using the impedance parameters. Furthermore, in both force control and impedance control, gains and other quantities that affect the responsiveness of the control must also be determined, so there are many items to adjust. In conventional robot systems, parameters have mostly been adjusted with the aim of performing the work stably. In that case, the system characteristics, including the responsiveness of the motion of the robot 120 and its mechanical stiffness, are identified, and a single parameter set that responds stably regardless of conditions or state is sought. However, in a motion of the robot 120 that involves contact with the work target 200, the contact state between the work target 200 and the end effector 130 changes as the motion progresses. The parameter set therefore has to be adjusted with the transitions of the contact state taken into account. Such adjustment has had to be done by trial and error and has not been easy.

In the robot system 100 of the present embodiment, the motion adjustment device 112 updates the motion command value so that the motion of the robot 120 becomes appropriate. A constraint condition is input to the motion adjustment device 112. The constraint condition includes an upper limit value or a lower limit value of the force information detected by the force sensor. In the following description, the motion command value output from the motion control system 110 is assumed to be a speed command value. The speed command value is the target moving speed of the end effector 130 at each point on the movement path of the end effector 130. In this case, the time series of speed command values forms a speed pattern over those points. The speed command value may instead be the target operating speed of the robot 120 at each point in time during the work.

In the speed pattern, target speeds Vi (i = 1, 2, 3, ...) and target-speed switching positions Pi (i = 1, 2, 3, ...) are defined. A switching position may instead be specified by a switching time or by another switching parameter; one example of such a parameter is the progress rate of the motion command value measured on a position or time basis. The switching position Pi may be the point at which switching of the target speed starts, or the point at which it is completed. It may also be a point at which the operating speed detected by the internal sensor 141 is guaranteed to fall within a predetermined error range of the target speed.
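One way to picture such a speed pattern is as a piecewise-constant function of position. The following is a minimal sketch under that assumption; the class name `SpeedPattern`, its fields, and the rule that a switching position belongs to the following interval are illustrative choices, not part of this publication.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List


@dataclass
class SpeedPattern:
    """Hypothetical piecewise-constant speed pattern over position."""
    switch_positions: List[float]  # P1, P2, ... in ascending order
    target_speeds: List[float]     # V1, V2, ...; one more entry than switch_positions

    def speed_at(self, position: float) -> float:
        # Count how many switching positions have been reached or passed;
        # that count selects the target speed currently in force.
        index = bisect_right(self.switch_positions, position)
        return self.target_speeds[index]
```

For example, `SpeedPattern([0.1, 0.2], [0.05, 0.02, 0.05]).speed_at(0.15)` selects the second target speed, because only the first switching position has been passed.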

FIG. 4 is a diagram for explaining the operation of the motion adjustment device 112 according to Embodiment 1 of the present invention. As shown in FIG. 4, consider the case where the end effector 130 attached to the robot 120 moves from position P0 to position P3. A force sensor 143 is attached to the robot 120 as the external sensor 142. The force sensor 143 measures the external force acting on the end effector 130.

FIG. 5 is a diagram showing an example of the speed pattern before updating in the robot system 100 according to Embodiment 1 of the present invention. In FIG. 5, the horizontal axis represents the position P of the end effector 130, and the vertical axis represents the target moving speed V of the end effector 130. In the speed pattern of FIG. 5, the target speed changes while the end effector 130 moves from P0 to P3. The motion adjustment device 112 updates the speed pattern based on the detection result of the force sensor 143.

FIG. 6 is a flowchart showing an example of the processing flow of the motion control system 110 according to Embodiment 1 of the present invention. Here, the constraint conditions are assumed to include upper and lower limit values for the force information detected by the force sensor 143 and an upper limit value for the work time. First, in step S10, the robot control device 111 determines the initial value of the speed pattern. Next, in step S11, the robot control device 111 controls the motion of the robot 120 to perform a trial of the work. As noted earlier, part of the normal work of the robot system 100 may be treated as a trial, for example when the adjustment processing and the work processing are not completely separated.

Next, in step S12, the motion adjustment device 112 determines whether the constraint conditions are satisfied. That is, in step S12, the motion adjustment device 112 determines whether the detection value of the force sensor 143 lies between the upper and lower limit values specified by the constraint conditions, and whether the work time constraint is satisfied. When evaluating the detection value of the force sensor 143, for example, the maximum detection value is compared with the upper limit value of the constraint condition, and the minimum detection value is compared with the lower limit value. In step S12, the motion adjustment device 112 may use, instead of the detection value of the force sensor 143 itself, an evaluation value computed from the detection value. One example is an evaluation value computed by an evaluation function that takes the detection value of the force sensor 143 and the takt time as inputs. In step S12, the motion adjustment device 112 may then determine whether this evaluation value is within a limit range.
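The check in step S12 can be sketched as follows. This is a minimal illustration under the constraint conditions assumed in this passage (force upper and lower limits plus a work time upper limit); the function and parameter names are hypothetical.

```python
def constraints_satisfied(force_samples, force_lower, force_upper,
                          work_time, work_time_limit):
    """Return True when one trial meets the force and work-time constraints."""
    # Compare the largest detection value with the upper limit and the
    # smallest detection value with the lower limit.
    force_ok = (max(force_samples) <= force_upper
                and min(force_samples) >= force_lower)
    # The trial must also finish within the allowed work time.
    time_ok = work_time <= work_time_limit
    return force_ok and time_ok
```

A trial whose peak force exceeds the upper limit fails the check even if it finishes well within the work time limit.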

If it is determined in step S12 that the constraint conditions are satisfied, the processing of the motion control system 110 ends for the time being, and subsequent work is performed with the updated speed pattern. If it is determined in step S12 that the constraint conditions are not satisfied, the processing of the motion control system 110 proceeds to step S13. In step S13, the motion adjustment device 112 adjusts and updates the speed pattern. For example, the motion adjustment device 112 calculates a correction coefficient and multiplies the speed pattern used in the trial by that coefficient. When step S13 is completed, the processing of the motion control system 110 returns to step S11.
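The loop formed by steps S10 to S13 can be sketched as below. The publication does not say how the correction coefficient is computed, so the ratio of the force limit to the measured peak force used here is an assumption, as are the function name, the `run_trial` callback, and the `max_trials` guard.

```python
def adjust_speed_pattern(run_trial, initial_speeds, force_upper, max_trials=20):
    """Repeat trial and correction until the peak force is within the limit.

    run_trial is a hypothetical callback that executes one trial with the
    given speeds and returns the peak force detected during that trial.
    """
    speeds = list(initial_speeds)                    # S10: initial speed pattern
    for _ in range(max_trials):
        peak_force = run_trial(speeds)               # S11: perform a trial
        if peak_force <= force_upper:                # S12: constraint check
            return speeds
        # S13 (assumed rule): scale the whole pattern by a coefficient below 1.
        coefficient = force_upper / peak_force
        speeds = [speed * coefficient for speed in speeds]
    raise RuntimeError("constraint not satisfied within max_trials")
```

With a stand-in trial model in which the peak force is proportional to the highest speed, the loop converges after one correction; a real system would replace `run_trial` with an actual trial of the robot.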

The motion control system 110 according to Embodiment 1 of the present invention performs the processing described above. In this way, the motion control system 110 according to Embodiment 1 adjusts the speed pattern through learning, based on data obtained from a plurality of trials. In other words, the motion control system 110 according to Embodiment 1 uses machine learning or an optimization method to adjust the speed pattern, which is the motion command value.

In the above description, the upper limit value of the work time is included in the constraint conditions, but this is not essential, and other conditions may be used. Instead of giving an upper limit value for the work time, the constraint may be that the work time is minimized while the other conditions are satisfied. The above description covers the case where the motion control system 110 updates the motion command value so as to satisfy the given constraint conditions, but a configuration in which the motion control system 110 adjusts and updates the control parameters is also conceivable. Furthermore, although FIG. 1 shows a configuration example in which the robot control device 111 and the motion adjustment device 112 are provided separately, the robot control device 111 may be configured to incorporate the motion adjustment device 112.

The motion adjustment device 112, the motion control system 110, and the robot system 100 of the present embodiment are configured as described above. With them, the motion of the robot 120 is adjusted so that the detection value of the force sensor 143 stays within a predetermined range. The detection value of the force sensor 143 represents the magnitude of the external force acting on the end effector 130. In other words, it is information indicating the magnitude of the force applied to the work target 200 or the surrounding environment 300 as a result of the motion of the robot 120. Therefore, according to the motion adjustment device 112, the motion control system 110, and the robot system 100 of the present embodiment, the motion of the robot 120 can be adjusted so that the force applied to the work target 200 or the surrounding environment 300 has an appropriate magnitude, that is, so that no excessive load acts on the work target 200 or the surrounding environment 300, and the adjustment of the motion of the robot 120 is also made easier.

As described above, by using the force sensor 143 to adjust the motion command value through learning so that the force response falls within a desired range, high-quality robot work that does not damage the item being worked on can be realized. Furthermore, adding the work time to the constraint conditions makes high-speed work possible as well.

Although the motion adjustment device 112, the motion control system 110, and the robot system 100 of the present embodiment use the magnitude of the force detected by the force sensor 143 as the constraint condition, a moment, torque, current value, or the like may be detected instead, and either an upper limit or a lower limit on it may be used as the constraint condition. This makes it possible to place limit values on the state of contact between the robot 120 or the end effector 130 and the outside world, and to search for motion command values within the desired range. As a result, work that does not damage the work target 200 can be realized.

Furthermore, the position and orientation relative to the surrounding environment 300, or the position and orientation of the robot 120, can be added as constraint conditions. Adding either an upper limit or a lower limit on these to the constraint conditions makes it possible to realize robot work that suppresses interference with the surrounding environment 300 while still achieving high-quality work. This in turn yields a notable benefit such as a higher system operating rate. The effects described above are obtained in the other embodiments as well.

Embodiment 2.
The configurations of the motion adjustment device, the motion control system, and the robot system of the present embodiment are the same as those shown in FIG. 1. The motion adjustment device 112, the motion control system 110, and the robot system 100 of the present embodiment divide the motion command given to the robot 120 for a series of work into a plurality of sections and adjust the motion command value for each section. In the following description, the motion command value output from the motion control system is assumed to be a speed command value.

FIG. 7 is a diagram for explaining the operation of the motion adjustment device 112 according to Embodiment 2 of the present invention. As shown in FIG. 7, consider work in which the end effector 130 attached to the robot 120 is moved from position P0 to position P3. Position P0, the initial position, is the start point of the work, and position P3 is the end point of the work. While moving from position P0 to position P3, the end effector 130 passes through positions P1 and P2.

In the robot system 100 of the present embodiment, the path from the start point of the work to the end point of the work is divided into a plurality of sections. In other words, the motion of the robot 120 from the start to the end of one piece of work is divided into a plurality of sections. Here, the interval from position P0 to position P1 is section S1, the interval from position P1 to position P2 is section S2, and the interval from position P2 to position P3 is section S3. The target moving speeds of sections S1, S2, and S3 are V1, V2, and V3, respectively. The robot system 100 of the present embodiment adjusts and updates the motion command value for each of the divided sections. Specifically, the robot system 100 adjusts the target moving speed of section S1, of section S2, and of section S3 individually.

In the robot system 100 of the present embodiment, the positions P1 and P2 that serve as division points between sections are determined in advance according to the content of the work. Positions P1 and P2 are the positions at which the section changes and are sometimes called switching positions. The number of sections is three in this example but is not limited to three. Furthermore, although the sections are defined spatially by position here, the period from the start of the work to the end of the work may instead be divided in time.

It is assumed that an upper limit value Flim for the detection result of the force sensor 143 is given to the motion control system 110 of the present embodiment as the constraint condition. The processing flow of the motion control system 110 of the present embodiment is basically the same as the flowchart shown in FIG. 6, except that the speed pattern is adjusted for each section. First, in step S10 of FIG. 6, the robot control device 111 determines the initial value of the speed pattern. FIG. 8 is a diagram showing an example of the initial value of the speed pattern in the robot system 100 according to Embodiment 2 of the present invention. In FIG. 8, the horizontal axis represents the position P of the end effector 130, and the vertical axis represents the target moving speed V of the end effector 130. In FIG. 8, the initial value of the speed pattern is V1 = V2 = V3 = Vini.

Next, in step S11, the robot control device 111 controls the motion of the robot 120 to perform a trial of the work. FIG. 9 is a diagram showing an example of the detection values of the force sensor 143 in the robot system 100 according to Embodiment 2 of the present invention. In FIG. 9, the horizontal axis represents the position P of the end effector 130, and the vertical axis represents the detection value F of the force sensor 143. FIG. 9 shows the values detected by the force sensor 143 when the robot 120 is operated with the initial speed pattern shown in FIG. 8.

Next, in step S12, the motion adjustment device 112 determines whether the constraint condition is satisfied. That is, in step S12, the motion adjustment device 112 determines whether the detection value of the force sensor 143 in each section is at or below the upper limit value Flim specified by the constraint condition. The detection value used for this determination is, for example, the maximum of the detection values of the force sensor 143 within each section. If the detection value of the force sensor 143 is at or below Flim in every section, the motion adjustment device 112 determines that the constraint condition is satisfied. If there is even one section in which the detection value of the force sensor 143 exceeds the upper limit value Flim, the motion adjustment device 112 determines that the constraint condition is not satisfied.
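The per-section determination can be sketched as follows, assuming the trial is recorded as (position, force) samples and that the interior division points P1, P2, ... are known. The function names and the sample format are illustrative assumptions.

```python
from bisect import bisect_right


def section_peak_forces(samples, division_points):
    """Return the maximum detected force in each section.

    samples: iterable of (position, force) pairs recorded during one trial.
    division_points: interior division points P1, P2, ... in ascending order.
    A section that received no samples keeps the value -inf.
    """
    peaks = [float("-inf")] * (len(division_points) + 1)
    for position, force in samples:
        # The number of division points reached so far identifies the section.
        index = bisect_right(division_points, position)
        peaks[index] = max(peaks[index], force)
    return peaks


def all_sections_within_limit(peaks, force_upper):
    """Constraint holds only if no section peak exceeds the upper limit."""
    return all(peak <= force_upper for peak in peaks)
```

With this arrangement, a single section whose peak exceeds the limit is enough for the overall check to fail, which matches the determination described above.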

If it is determined in step S12 that the constraint condition is satisfied, the processing of the motion control system 110 ends for the time being, and subsequent work is performed with the updated speed pattern. If it is determined in step S12 that the constraint condition is not satisfied, the processing of the motion control system 110 proceeds to step S13. In step S13, the motion adjustment device 112 adjusts the speed pattern so that the target speed is reduced in each section where the detection value of the force sensor 143 exceeded the upper limit value Flim, and updates the speed pattern.

In the example shown in FIG. 9, the detection value Fmax2 of the force sensor 143 in section S2 exceeds the upper limit value Flim. The detection value Fmax1 in section S1 and the detection value Fmax3 in section S3 do not exceed the upper limit value Flim. Accordingly, in step S12, the motion adjustment device 112 determines that the constraint condition is not satisfied. In step S13, the motion adjustment device 112 adjusts the speed pattern so that the target speed V2 in section S2 is reduced. The motion control system 110 according to Embodiment 2 of the present invention performs the processing described above. FIG. 10 is a diagram showing an example of the updated speed pattern in the robot system 100 according to Embodiment 2 of the present invention. In FIG. 10, the horizontal axis represents the position P of the end effector 130, and the vertical axis represents the target moving speed V of the end effector 130.
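The section-wise update of step S13 in this example can be sketched as below. The publication states only that the target speed of a violating section is reduced; the fixed multiplicative `reduction` factor and the function name are assumptions made for illustration.

```python
def slow_down_violating_sections(speeds, section_peaks, force_upper,
                                 reduction=0.8):
    """Reduce the target speed only in sections whose peak force exceeded Flim."""
    return [
        speed * reduction if peak > force_upper else speed
        for speed, peak in zip(speeds, section_peaks)
    ]
```

Applied to peaks corresponding to FIG. 9, where only section S2 exceeds the limit, only V2 changes while V1 and V3 keep their previous values.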

In the above description, the upper limit value Flim for the detection result of the force sensor 143 is given as the constraint condition, but minimizing the work time may be added as a further constraint condition. In that case, since Fmax1 and Fmax3 in FIG. 9 do not exceed the upper limit value Flim, in step S13 the motion adjustment device 112 adjusts the speed pattern so that the target speed V1 in section S1 and the target speed V3 in section S3 are increased. Adjusting the speed pattern in this way makes it possible to shorten the work time further. FIG. 11 is a diagram showing another example of the updated speed pattern in the robot system 100 according to Embodiment 2 of the present invention. In FIG. 11, the horizontal axis represents the position P of the end effector 130, and the vertical axis represents the target moving speed V of the end effector 130.
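When shortening the work time is added as a goal, the update can both slow down violating sections and speed up sections that have margin. The sketch below is one hedged way to do this; the factors `slow` and `fast`, the speed cap `speed_max`, and the choice to leave a section unchanged when its peak equals the limit are all assumptions not stated in the publication.

```python
def rebalance_section_speeds(speeds, section_peaks, force_upper,
                             slow=0.8, fast=1.1, speed_max=1.0):
    """Slow sections above the force limit and speed up sections below it."""
    updated = []
    for speed, peak in zip(speeds, section_peaks):
        if peak > force_upper:
            updated.append(speed * slow)                    # too much force
        elif peak < force_upper:
            updated.append(min(speed * fast, speed_max))    # margin remains
        else:
            updated.append(speed)                           # exactly at limit
    return updated
```

Repeating this update over several trials would tend to push each section toward the highest speed that still respects the force limit, which is the outcome illustrated by FIG. 11.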

When the motion command value is a speed command value, the division points P1 and P2 are the positions at which the target speed is switched, as shown in FIGS. 10 and 11. The division points P1 and P2 may be the points at which switching of the target speed starts, or the points at which it is completed. They may also be points at which the operating speed detected by the internal sensor 141 is guaranteed to fall within a predetermined error range of the target speed.

The motion adjustment device 112, the motion control system 110, and the robot system 100 of the present embodiment are configured as described above. With them, the motion of the robot 120 is adjusted section by section. Because only the sections in which the detection value of the force sensor 143 exceeds a predetermined value are adjusted to be slower, the motion of the robot 120 can be adjusted so that no excessive load acts on the work target 200 or the surrounding environment 300, without needlessly slowing the work as a whole, and the adjustment of the motion of the robot 120 is also made easier. Furthermore, if sections in which the detection value of the force sensor 143 is below the predetermined value are adjusted to be faster, the work as a whole can be made faster still.

As described above, according to the motion adjustment device 112, the motion control system 110, and the robot system 100 of the present embodiment, learning and updating the optimum motion command value for each section makes it possible to design fine-grained motion command values that conventional adjustment could not achieve, and as a result high-speed, high-quality robot work can be realized.

Embodiment 3.
FIG. 12 is a block diagram showing a configuration example of the motion adjustment device 112b according to Embodiment 3 of the present invention together with its peripheral blocks. FIG. 12 shows an extracted part of the configuration of the robot system 100. The motion adjustment device 112b includes a command value learning unit 113b. The configurations of the motion adjustment device, the motion control system, and the robot system of the present embodiment are the same as those shown in FIG. 1, except that the motion adjustment device 112 is replaced with the motion adjustment device 112b. The motion adjustment device 112b of the present embodiment differs from the motion adjustment device 112 of Embodiment 2 in that section information is input to it. The section information includes the initial values of the section positions and the initial values of the motion command value in each section. A section position is the position of a division point Pi at either end of a section, for example a position at which the target value of the operating speed is switched. When the internal sensor 141 or the external sensor 142 detects that the end effector 130 has reached a predetermined position, the target value of the operating speed is switched.

The motion adjustment device, the motion control system, and the robot system of the present embodiment adjust the motion command value for each section, as in Embodiment 2. By adjusting the motion command value section by section, the command value learning unit 113b can serve as a learner that adjusts sections in which a collision or the like occurs to low-speed motion and adjusts the remaining sections to high-speed motion. Such a learner can automatically learn motion command values that realize high-speed work. The command value learning unit 113b automatically learns the motion command value corresponding to each section. For simplicity, the following description assumes that the motion adjustment device 112b adjusts only the motion command value and does not adjust the control parameters.

In the motion adjustment device, the motion control system, and the robot system of the present embodiment, the command value learning unit 113b takes as input the section information, the constraint conditions, the detection values of the sensor 140, and the motion command values before updating, and updates each motion command value. The section information is defined in order to divide the motion command value into N sections. The division points used for dividing into sections are defined as Pi (i = 0, 1, 2, ..., N+1), where N is a natural number. Here, the start point and end point of the motion are also included among the division points, and the start point is P0. The interval between the division point Pi and the division point immediately preceding it is called section Si (i = 0, 1, 2, ..., N).

In the motion adjustment device, the motion control system, and the robot system of the present embodiment, it is assumed that a section is defined each time the work state changes. For example, in a fitting operation using a force sensor, the division points Pi are defined before and after contact occurs between the parts being fitted, and before and after the contact state changes. The division points Pi are defined according to the expected changes in the contact state, and changing the target values of the motion command, such as position, speed, and acceleration, to values suited to each section speeds up the work as a whole. A feature of the motion adjustment device, the motion control system, and the robot system of the present embodiment is that appropriate positions of the division points Pi and the command value pattern of each section Si are defined from past trial information.

The constraint conditions input to the command value learning unit 113b are conditions that define the boundary between success and failure of the work when the work is sped up. In sped-up work motions, there is a risk that the end effector 130 will collide strongly with the work target 200 because of, for example, position control errors of the end effector 130. A strong collision may damage the end effector 130 or the work target 200 and cause the work to fail. By taking such past failures into account and having the user define the constraint conditions at design time, or by defining the constraint conditions from past trial data, motion command values for fast, low-impact work can be generated.

Examples of constraint conditions include a position limit range, an orientation limit range, upper and lower limits on the motion speed, upper and lower limits on the force, and upper and lower limits on the moment. In particular, when the position and orientation of the robot 120 or the work target 200 can be acquired, a limit value defined as either an upper or a lower limit on the position and orientation of the robot 120, or on the relative position and orientation between the robot 120 and the surrounding environment 300, can be input as a constraint condition.

Data acquired by the internal sensor 141 or the external sensor 142 is referred to as sensor information. Where necessary, the sensor information is preprocessed, for example by filtering to remove noise or by extracting only the values that exceed a threshold.
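The two preprocessing steps named above can be sketched as follows. This is only an illustration: the source does not specify the filter type, so a trailing moving average is assumed here, and the function names `moving_average` and `extract_above` are hypothetical.

```python
from typing import List, Tuple


def moving_average(samples: List[float], window: int) -> List[float]:
    """Smooth a signal with a trailing moving average (assumed noise filter)."""
    if window < 1:
        raise ValueError("window must be >= 1")
    smoothed = []
    running_sum = 0.0
    for i, value in enumerate(samples):
        running_sum += value
        if i >= window:
            running_sum -= samples[i - window]  # drop the sample leaving the window
        smoothed.append(running_sum / min(i + 1, window))
    return smoothed


def extract_above(samples: List[float], threshold: float) -> List[Tuple[int, float]]:
    """Return (index, value) pairs for samples that exceed the threshold."""
    return [(i, v) for i, v in enumerate(samples) if v > threshold]


# Illustrative force samples (made-up values, not measured data).
raw_force = [0.0, 0.2, 0.1, 4.0, 4.2, 3.9, 0.3]
filtered = moving_average(raw_force, window=2)
peaks = extract_above(filtered, threshold=2.0)
```

Because the trailing window delays the signal, the extracted indices lag the raw peak slightly; a real implementation would choose the filter according to the sensor's noise characteristics.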

A motion command value is a control command value that can be input to the position control system of the robot system 100. It is sometimes called simply a command value. The motion of the robot 120 is controlled through the motion of the motor of each axis. Motion command values include, for example, position command values, speed command values, and current command values for controlling the motors. The motion adjustment device 112b can also take a speed pattern generated from a profile representing the relationship between time and speed, generate equivalent time-series data of position command values from it, and input that data to the robot control device 111. Motion command values can also be generated inside the robot control device 111.
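The conversion from a speed pattern to an equivalent time series of position command values amounts to numerical integration. A minimal sketch, assuming a uniformly sampled speed pattern and trapezoidal integration (neither is stated in the source; the function name is hypothetical):

```python
from typing import List


def positions_from_speed_pattern(speeds: List[float], dt: float,
                                 start: float = 0.0) -> List[float]:
    """Integrate sampled speeds (trapezoidal rule) into position command values."""
    if dt <= 0:
        raise ValueError("dt must be positive")
    positions = [start]
    for i in range(1, len(speeds)):
        # Average speed over the sampling interval times the interval length.
        positions.append(positions[-1] + 0.5 * (speeds[i - 1] + speeds[i]) * dt)
    return positions


# Accelerate, cruise, decelerate (illustrative values: mm/s sampled every 0.1 s).
position_commands = positions_from_speed_pattern([0.0, 10.0, 10.0, 0.0], dt=0.1)
```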

The motion adjustment device 112b of the present embodiment takes out the command value held inside the robot control device 111, and adjusts and updates the motion command value according to the sensor information obtained as the response when the robot 120 performs the work. The same applies to the other embodiments. In the following description, the motion command value output from the motion control system is assumed to be a speed command value. In an alternative configuration, the motion adjustment device 112b passes to the robot control device 111 not the motion command value itself but the parameters needed to generate it. For example, the motion adjustment device 112b can input only the section positions and the target motion speed for each section to the robot control device 111. In this case, the robot control device generates the motion command value from the input section positions and target motion speeds.

The motion adjustment device 112b includes a command value learning unit 113b, which adjusts and updates the motion command value. The command value learning unit 113b obtains a new motion command value based on the section information, the constraint condition, the motion command value before the update, and the detection value of the sensor 140. When obtaining a new motion command value, the command value learning unit 113b evaluates the speed and the quality of the work with an evaluation function, and is designed to search for a fast motion that is unlikely to damage the work target 200. The motion adjustment device 112b may also be configured to adjust and update the control parameters used in the robot control device 111. This adjustment and update of the control parameters is likewise performed by the command value learning unit 113b.

FIG. 13 is a block diagram showing a configuration example of the command value learning unit 113b according to Embodiment 3 of the present invention, together with the surrounding blocks. FIG. 13 shows an extracted part of the configuration of the robot system 100. The command value learning unit 113b includes a storage unit 114 and a learning processing unit 115. An example of the search method used in the command value learning unit 113b is described with reference to FIG. 13. Here, the number of sections is assumed to be defined in advance as N = 4. The speed target value, which is the target speed defined for each section, is used as the motion command value. The command value learning unit 113b achieves high-speed work by adjusting the speed target value of each section.

FIG. 14 is a diagram showing an example of work performed by the robot system 100 according to Embodiment 3 of the present invention. As shown in FIG. 14, the robot system 100 performs the work of inserting a first part 210 into a second part 310. FIG. 14 illustrates how the relative position of the first part 210 and the second part 310 changes as the work progresses in the order (a), (b), (c), (d). The first part 210 corresponds to the work target 200, and the second part 310 corresponds to the surrounding environment 300.

The first part 210 has a hole 211, and the second part 310 has a protrusion 311. When the first part 210 is inserted into the second part 310, the protrusion 311 is inserted into the hole 211. The first part 210 is made of a first material. The second part 310 has a portion 312 made of the first material and a portion 313 made of a second material. When the first part 210 is inserted into the second part 310, the contact state between the first part 210 and the second part 310 changes.

In the example shown in FIG. 14, the contacting portions of the parts and the contact state change as the work progresses from (b) to (d). The contact state includes, for example, the materials of the parts at the contact portion and the size of the contact area. When the contact state changes, the frictional force generated at the contact portion changes. In FIG. 14(b), a frictional force arises between the outer surfaces of the first part 210 and the second part 310. In FIG. 14(c), contact between the hole 211 and the protrusion 311 is added, so the frictional force increases. As the frictional force between the parts changes, the detection result of the force sensor 143 also changes. In other words, in part fitting work, connector insertion work, and the like, the reaction force between the parts changes as the work progresses, and the force sensor detects this reaction force.

As shown in FIG. 13, the command value learning unit 113b stores the force information detected by the sensor 140 and the speed pattern acquired from the robot control device 111 in the storage unit 114. The robot system 100 is assumed to be able to operate the robot 120 with a specified speed pattern when trying the work in order to adjust the motion command value. Based on the force information, speed pattern, section information, and constraint condition stored in the storage unit 114, the learning processing unit 115 updates the speed pattern and outputs it to the robot control device 111 as offline processing.

Although one speed pattern is stored in the robot control device 111, when the motion command value is adjusted the motion adjustment device 112b prompts the work to be tried with multiple kinds of speed patterns, even for that single reference speed pattern. As a result, the robot system 100 performs trials under various conditions when adjusting the motion command value, and the motion adjustment device 112b adjusts the motion command value based on the data obtained from those trials. For example, the robot system 100 performs Na trials, each with a different motion command value, including motion command values that differ from the one stored in the robot control device 111. The motion adjustment device 112b takes the data obtained from the Na trials as input, performs learning once, and updates the motion command value. With Na trials forming one set, performing Nb sets of trials causes the motion command value to converge in most cases, after which no further improvement occurs. Here, Na and Nb are integers of 1 or more.

As described above, the robot system 100 of the present embodiment performs trials using one or more set motion command values and generates evaluation values based on the force sensor data obtained. The motion adjustment device 112b updates the motion command value based on each evaluation value. In updating the motion command value, the motion adjustment device 112b generates one or more motion command values, and trials are performed again. When there is a single motion command value, the motion adjustment device 112b ends the update once the evaluation value has converged in a graph plotting the evaluation values. When there are multiple motion command values, the motion adjustment device 112b ends the update once the evaluation value has converged in a graph plotting only the smallest of the evaluation values corresponding to those motion command values. In the latter case, the motion adjustment device 112b updates the motion command value to the one that gave the smallest evaluation value.

FIG. 15 is a flowchart showing an example of the processing flow of the learning processing unit 115 according to Embodiment 3 of the present invention. As shown in FIG. 15, the learning processing unit 115 first performs preprocessing as a preparation stage in step S100. Next, in step S200, the learning processing unit 115 performs the learning process.

FIG. 16 is a flowchart showing an example of the flow of the preprocessing performed by the learning processing unit 115 according to Embodiment 3 of the present invention. To explain the operation, FIG. 16 also shows operations performed by blocks other than the learning processing unit 115. First, in step S101, the robot control device 111 sets the control parameters for force control. Next, in step S102, the robot control device 111 operates the robot 120 to try the work. In step S103, the command value learning unit 113b acquires the data obtained in that trial. The data obtained in each trial is called trial data. Trial data includes the force information detected in the trial and the speed pattern used in the trial. The force information is time-series data acquired by the force sensor 143 at predetermined time intervals during the trial, and is also called a force waveform. In step S104, the storage unit 114 stores the data acquired in step S103.

Next, in step S105, the learning processing unit 115 determines whether K or more pieces of trial data have been acquired. Here, K is a natural number set in advance. If K or more pieces of trial data have not yet been acquired, the process returns to step S102. If K or more pieces have been acquired, the process proceeds to step S106. Therefore, when the process reaches step S106, K pieces of trial data D1j (j = 1, 2, 3, ..., K) have been acquired and stored in the storage unit 114.

Next, in step S106, the learning processing unit 115 defines the section positions based on the K pieces of trial data stored in the storage unit 114. A section position is the position of a dividing point at either end of a section. The position of a dividing point corresponds, for example, to a position of the end effector 130. The position of a dividing point is where the speed target value is switched. It may be the point at which switching of the target speed starts or the point at which switching is completed. It may also be a point at which the motion speed detected by the internal sensor 141 is guaranteed to fall within a predetermined error range of the target speed.

The learning processing unit 115 defines the section positions based on, for example, the mean or variance of the K pieces of trial data. By focusing on the rate of change of the force waveform and setting dividing points before and after each large change, the learning processing unit 115 can determine the positions of the dividing points automatically. Alternatively, the user can determine them manually, taking the points at which a state change occurs in the work as dividing points.
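The automatic determination described above can be sketched as follows. This is an illustrative sketch under stated assumptions: the K force waveforms are sampled at the same end effector positions, they are averaged sample by sample, and a fixed threshold on the rate of change (`rate_threshold`) decides what counts as a large change. The function names and the threshold are hypothetical, not taken from the source.

```python
from typing import List


def mean_waveform(waveforms: List[List[float]]) -> List[float]:
    """Average K equally sampled force waveforms sample by sample."""
    return [sum(samples) / len(samples) for samples in zip(*waveforms)]


def find_dividing_points(positions: List[float], forces: List[float],
                         rate_threshold: float) -> List[float]:
    """Place a dividing point just before and just after each large force change."""
    points = []
    in_change = False
    for i in range(1, len(forces)):
        dx = positions[i] - positions[i - 1]
        rate = abs(forces[i] - forces[i - 1]) / dx if dx > 0 else 0.0
        if rate > rate_threshold and not in_change:
            points.append(positions[i - 1])  # last sample before the change
            in_change = True
        elif rate <= rate_threshold and in_change:
            points.append(positions[i - 1])  # first sample after the change
            in_change = False
    return points


# Two illustrative trials in which contact begins between positions 2 and 3.
positions = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
trials = [[0.0, 0.0, 0.0, 5.0, 5.0, 5.0, 5.0],
          [0.0, 0.0, 0.0, 7.0, 7.0, 7.0, 7.0]]
dividing_points = find_dividing_points(positions, mean_waveform(trials), 1.0)
```

A change that is still in progress at the last sample produces no closing point in this sketch; handling that case is left to the caller.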

Next, in step S107, the learning processing unit 115 determines whether the section positions have been defined. If not, the process returns to step S106. If they have been defined, the process proceeds to step S108. In step S108, the learning processing unit 115 defines a speed target value for each section. The speed target value is calculated from the upper limit of the force specified by the user and the target takt time.

Specifically, the learning processing unit 115 sets the standard work speed needed to complete the work within the target takt time as the overall speed target value Vdn. Next, the learning processing unit 115 defines a speed upper limit Vmax based on the upper limit of the force. The relationship between the speed at which the end effector 130 collides with the work target 200 or the surrounding environment 300 and the external force applied to the end effector 130 at that moment can be obtained in advance from, for example, stiffness information of the work target. The learning processing unit 115 can obtain the speed upper limit Vmax by referring to a table or the like that stores this relationship.
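The table lookup for Vmax can be sketched as follows. The table format, the linear interpolation between entries, and the name `speed_upper_limit` are assumptions made for illustration; the source states only that a stored speed-to-force relationship is referenced.

```python
from typing import List, Tuple


def speed_upper_limit(table: List[Tuple[float, float]], force_limit: float) -> float:
    """Return the largest speed whose tabulated collision force stays within force_limit.

    `table` holds (collision speed, resulting external force) pairs, with force
    assumed to increase with speed. Values between entries are interpolated linearly.
    """
    entries = sorted(table)
    if force_limit < entries[0][1]:
        raise ValueError("force limit is below the smallest tabulated force")
    for (v0, f0), (v1, f1) in zip(entries, entries[1:]):
        if f0 <= force_limit < f1:
            return v0 + (v1 - v0) * (force_limit - f0) / (f1 - f0)
    # The limit is at or above the largest tabulated force: do not extrapolate.
    return entries[-1][0]


# Illustrative table (made-up values): speed in mm/s, force in N.
collision_table = [(10.0, 2.0), (20.0, 4.0), (40.0, 10.0)]
vmax = speed_upper_limit(collision_table, force_limit=7.0)
```

Clamping to the largest tabulated speed rather than extrapolating is a conservative choice for this sketch.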

The learning processing unit 115 determines the speed target value Vd using the overall speed target value Vdn and the speed upper limit Vmax. The speed target value Vd is greater than 0 and less than the speed upper limit Vmax, and is set so that it gradually approaches Vdn. For example, under the condition 0 < Vd < Vdn < Vmax, the learning processing unit 115 uses random numbers to define multiple speed target values Vd so that the speed parameters are scattered to some extent. In this way, in step S108, the learning processing unit 115 determines the speed target values Vd so that they are scattered within the determined range. Next, in step S109, the learning processing unit 115 determines whether the speed target values have been defined. If not, the process returns to step S108. If they have been defined, the preprocessing ends. The preprocessing determines the initial values used in the learning process.
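The random initialization in step S108 can be sketched as follows, assuming uniform sampling strictly inside the interval (0, Vdn). The sampling distribution, the `low_fraction` parameter, and the function name are assumptions; the source says only that random numbers are used so that the values are scattered within the range.

```python
import random
from typing import List, Optional


def sample_speed_targets(vdn: float, vmax: float, n_sections: int,
                         n_candidates: int, low_fraction: float = 0.2,
                         seed: Optional[int] = None) -> List[List[float]]:
    """Draw scattered initial speed targets Vd satisfying 0 < Vd < Vdn < Vmax."""
    if not 0.0 < vdn < vmax:
        raise ValueError("require 0 < Vdn < Vmax")
    if not 0.0 < low_fraction < 0.99:
        raise ValueError("low_fraction must lie in (0, 0.99)")
    rng = random.Random(seed)
    # Stay strictly inside (0, Vdn) by sampling between two fractions of Vdn.
    low, high = low_fraction * vdn, 0.99 * vdn
    return [[rng.uniform(low, high) for _ in range(n_sections)]
            for _ in range(n_candidates)]


# Five candidate speed patterns for N = 4 sections (illustrative values).
initial_targets = sample_speed_targets(vdn=50.0, vmax=80.0, n_sections=4,
                                       n_candidates=5, seed=1)
```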

FIG. 17 is a flowchart showing an example of the flow of the learning process performed by the learning processing unit 115 according to Embodiment 3 of the present invention. To explain the operation, FIG. 17 also shows operations performed by blocks other than the learning processing unit 115. First, in step S201, the robot control device 111 operates the robot 120 to try the work. Next, in step S202, the command value learning unit 113b acquires the trial data obtained in that trial. In step S203, the storage unit 114 stores the trial data acquired in step S202.

Next, in step S204, the learning processing unit 115 determines whether M or more pieces of trial data have been acquired. Here, M is a natural number set in advance. If M or more pieces of trial data have not yet been acquired, the process returns to step S201. If M or more pieces have been acquired, the process proceeds to step S205. Therefore, when the process reaches step S205, M pieces of trial data D2j (j = 1, 2, 3, ..., M) have been acquired and stored in the storage unit 114. If M sets of section positions and speed target values have been defined, each trial in step S201 is executed with a different set. Consequently, when the process reaches step S205, M pieces of trial data D2j corresponding to the M sets of section positions and speed target values have been stored.

Next, in step S205, the learning processing unit 115 calculates an evaluation value for each of the M pieces of trial data based on the constraint condition. The calculated evaluation values are stored. In step S206, the learning processing unit 115 finds the section positions and speed target values corresponding to the trial data with the best evaluation value among the M pieces. In step S207, the learning processing unit 115 compares the best of the M newly obtained evaluation values with the evaluation values obtained in the past, and determines whether the result has converged to the best evaluation value. If it has converged, the process proceeds to step S209, where processing to complete the adjustment is performed and the adjustment of the motion command value ends. At that point, the section positions and speed target values that gave the best evaluation value become the result of the adjustment. If it has not yet converged, the process proceeds to step S208.

Next, in step S208, the learning processing unit 115 defines M new sets of section positions and speed target values, thereby updating them. The M sets differ from one another in section positions or speed target values. In other words, in step S208 the learning processing unit 115 sets M new sets of motion command values. Each of the M sets has, as parameters, the section positions and the speed target value corresponding to each section. In each set, the number of section positions is one more than the number of sections, and the number of speed target values equals the number of sections. When step S208 is complete, the process returns to step S201.
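The parameter counts described above (one more section position than there are sections, and one speed target per section) can be captured in a small data structure. The class name `MotionCommandSet`, its fields, and the validation rules below are hypothetical and serve only to illustrate the relationship.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MotionCommandSet:
    """One candidate set: N + 1 section positions and N speed targets."""
    section_positions: List[float]  # dividing points, strictly increasing
    speed_targets: List[float]      # one speed target per section

    def __post_init__(self) -> None:
        if len(self.section_positions) != len(self.speed_targets) + 1:
            raise ValueError("need exactly one more position than speed targets")
        pairs = zip(self.section_positions, self.section_positions[1:])
        if any(b <= a for a, b in pairs):
            raise ValueError("section positions must be strictly increasing")
        if any(v <= 0 for v in self.speed_targets):
            raise ValueError("speed targets must be positive")

    def speed_at(self, position: float) -> float:
        """Return the speed target of the section that contains `position`."""
        last = len(self.speed_targets) - 1
        for i, speed in enumerate(self.speed_targets):
            lo, hi = self.section_positions[i], self.section_positions[i + 1]
            if lo <= position < hi or (i == last and position == hi):
                return speed
        raise ValueError("position lies outside the commanded stroke")


# N = 4 sections, so five positions and four speed targets (illustrative values).
command = MotionCommandSet([0.0, 10.0, 15.0, 30.0, 40.0], [50.0, 5.0, 20.0, 50.0])
```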

As described above, in the learning process the robot system 100 performs M trials based on the set section positions and the speed target value set for each section. Each of the M trials is performed under conditions that differ in section positions or speed target values. Each time M trials are completed, the learning processing unit 115 updates the positions of the dividing points and the speed target value of each section.
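The loop of steps S201 to S209 can be sketched as follows. This is a simplified illustration, not the method of the source: the real trial is replaced by a hypothetical toy model (`simulate_trial`), the evaluation value is an assumed sum of work time and penalized force excess, new candidates are generated by plain random perturbation of the best set so far, and only the speed targets are varied while the section positions stay fixed.

```python
import random
from typing import List, Tuple


def simulate_trial(speeds: List[float], lengths: List[float],
                   stiffness: List[float]) -> Tuple[List[float], float]:
    """Hypothetical stand-in for steps S201 to S203 (a real system runs the robot)."""
    peak_forces = [k * v for k, v in zip(stiffness, speeds)]
    duration = sum(length / v for length, v in zip(lengths, speeds))
    return peak_forces, duration


def evaluate(peak_forces: List[float], duration: float,
             force_limit: float, penalty: float = 100.0) -> float:
    """Assumed evaluation value (smaller is better): time plus penalized excess."""
    excess = sum(max(0.0, f - force_limit) for f in peak_forces)
    return duration + penalty * excess


def learn(initial: List[float], lengths: List[float], stiffness: List[float],
          force_limit: float, m: int = 8, tol: float = 1e-3,
          patience: int = 5, max_sets: int = 200,
          seed: int = 0) -> Tuple[List[float], float]:
    rng = random.Random(seed)
    best = list(initial)
    best_score = evaluate(*simulate_trial(best, lengths, stiffness), force_limit)
    stale_sets = 0
    for _ in range(max_sets):
        # S208: define M new candidate sets around the current best.
        candidates = [[v * rng.uniform(0.8, 1.2) for v in best] for _ in range(m)]
        # S201 to S205: run one trial per candidate and compute its evaluation value.
        scored = [(evaluate(*simulate_trial(c, lengths, stiffness), force_limit), c)
                  for c in candidates]
        # S206: pick the candidate with the best evaluation value in this set.
        set_score, set_best = min(scored, key=lambda item: item[0])
        # S207: compare with the best value from earlier sets.
        if best_score - set_score > tol:
            best, best_score, stale_sets = set_best, set_score, 0
        else:
            stale_sets += 1
            if stale_sets >= patience:
                break  # S209: no further improvement, treat as converged
    return best, best_score


# Four sections; the second one is stiff, so its speed must stay low.
lengths = [10.0, 10.0, 10.0, 10.0]
stiffness = [0.1, 1.0, 0.2, 0.1]
start = [5.0, 5.0, 5.0, 5.0]
start_score = evaluate(*simulate_trial(start, lengths, stiffness), force_limit=10.0)
best, score = learn(start, lengths, stiffness, force_limit=10.0)
```

The `patience` counter is one possible convergence test; the source states only that the new best evaluation value is compared with past values.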

FIG. 18 is a diagram showing an example of a speed pattern used in a trial in the robot system 100 according to Embodiment 3 of the present invention. FIG. 19 is a diagram showing an example of the force information acquired during a trial in the robot system 100 according to Embodiment 3 of the present invention. In FIGS. 18 and 19, P0 to P3 are the positions of dividing points, and S1 to S4 denote the four sections. In FIG. 18, V1 to V4 denote the speed target values of the respective sections. FIG. 19 shows the force information acquired in a trial that used the speed pattern shown in FIG. 18.

In assembly work such as that shown in FIG. 14, the reaction force acting between the end effector 130 holding the first part 210 and the second part 310 may exceed the limit value near the position where the parts come into contact. In this case, the amount of force beyond the limit value can be evaluated as a limit excess amount. In FIG. 19, the magnitude F of the force detected by the force sensor 143 exceeds the limit value L0 in section S2. When the detected force magnitude F exceeds the limit value L0, the limit excess amount DH is obtained as the difference between F and L0. If there is a section in which the limit excess amount DH is larger than a set threshold, the speed target value of that section needs to be adjusted.
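The per-section limit excess amount DH can be computed from a force waveform as sketched below. The function names are hypothetical, and taking the peak excess within each section (rather than, say, an integral) is an assumption of this sketch.

```python
from typing import List


def limit_excess_per_section(positions: List[float], forces: List[float],
                             dividing_points: List[float],
                             limit: float) -> List[float]:
    """Return the peak excess of force over `limit` (DH) in each section."""
    n_sections = len(dividing_points) - 1
    excess = [0.0] * n_sections
    for x, force in zip(positions, forces):
        for i in range(n_sections):
            is_last = i == n_sections - 1
            inside = dividing_points[i] <= x < dividing_points[i + 1]
            if inside or (is_last and x == dividing_points[-1]):
                # Starting from 0.0 keeps DH at zero when the limit is not exceeded.
                excess[i] = max(excess[i], force - limit)
                break
    return excess


def sections_to_adjust(excess: List[float], threshold: float) -> List[int]:
    """Zero-based indices of sections whose DH exceeds the set threshold."""
    return [i for i, dh in enumerate(excess) if dh > threshold]


# Illustrative waveform in which only the second section exceeds L0 = 6.
positions = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
forces = [1.0, 2.0, 8.0, 9.0, 3.0, 2.0, 1.0, 1.0]
dh = limit_excess_per_section(positions, forces, [0.0, 2.0, 4.0, 6.0, 8.0], 6.0)
flagged = sections_to_adjust(dh, threshold=0.5)
```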

In FIG. 19, the force F detected in section S2 is large. Therefore, the learning processing unit 115 adjusts the speed pattern shown in FIG. 18 so that the speed target value V2 of section S2 becomes smaller. The learning processing unit 115 also adjusts the positions of the dividing points P1 and P2 at the two ends of section S2. In the speed pattern shown in FIG. 18, the dividing point P1 is the point at which the speed target value starts to decrease, and the dividing point P2 is the point at which it starts to increase. That is, the learning processing unit 115 also adjusts the positions at which a change in the speed target value begins. These adjustments are made based on the constraint condition.

For example, when a limit value L0 is set on the force magnitude F as a constraint condition, the evaluation function is defined so that the evaluation value relating to the force magnitude F is 0 for trials in which F does not exceed the upper limit L0. While that evaluation value is not 0, the learning processing unit 115 keeps updating the speed target value V2 and the positions of the dividing points P1 and P2 to adjust the motion command value. The evaluation function can also be defined so that, in parallel with this adjustment, the work is performed as fast as possible. In FIG. 19, in sections S1, S3, and S4 the detected force magnitude F has a margin DL relative to the limit value L0. Here, the margin DL is the amount remaining up to the limit value L0, or an index derived from that amount. The amount remaining up to L0 is defined as the difference between the limit value L0 and the detected force magnitude F. When the margin DL is greater than 0, the speed target value can be adjusted and updated upward. Such adjustment makes it possible to perform the work as fast as possible.
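The direction of adjustment described above (lower the speed target where DH exceeds the threshold, raise it where a margin DL remains) can be sketched as a simple rule. The fixed proportional step and the function name are assumptions for illustration; in the source the update is made by the learning processing unit 115 through an evaluation function.

```python
from typing import List


def adjust_speed_targets(speeds: List[float], peak_forces: List[float],
                         limit: float, excess_threshold: float = 0.0,
                         step: float = 0.1) -> List[float]:
    """Lower the speed where the limit is exceeded, raise it where margin remains."""
    adjusted = []
    for speed, peak in zip(speeds, peak_forces):
        excess = max(0.0, peak - limit)  # limit excess amount DH
        margin = max(0.0, limit - peak)  # margin DL
        if excess > excess_threshold:
            adjusted.append(speed * (1.0 - step))
        elif margin > 0.0:
            adjusted.append(speed * (1.0 + step))
        else:
            adjusted.append(speed)
    return adjusted


# Section 1 exceeds L0 = 6, section 2 has margin, section 3 sits exactly at the limit.
new_speeds = adjust_speed_targets([10.0, 10.0, 10.0], [8.0, 3.0, 6.0], limit=6.0)
```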

These adjustments are performed in steps S205 and S206 of FIG. 17. To find the dividing point positions Pi and the speed target values that give the best evaluation value, machine learning or an optimization method using the evaluation function can be applied. Examples include reinforcement learning, Bayesian optimization, and particle swarm optimization. With these methods, the motion command value that gives the best evaluation value can be set. For example, suppose an evaluation function Fq expressed by Equation (1) is defined using the force F(t) detected at each point in time during the work and the work time T. By adjusting the motion command value so that the evaluation value calculated by Fq becomes smaller, the learning processing unit 115 can obtain a motion command value that reduces both the force F(t) and the work time T. As shown in FIG. 17, the adjustment is complete once the evaluation value obtained by the evaluation function has converged.

The motion adjustment device 112, the motion control system 110, and the robot system 100 of the present embodiment are configured as described above. With them, the motion of the robot 120 is adjusted section by section. The motion of the robot 120 can therefore be adjusted so that no excessive load acts on the work target 200 or the surrounding environment 300, without needlessly slowing the work as a whole, and the adjustment itself is made easier. Furthermore, if the configuration is such that the motion is sped up in sections where the detection value of the force sensor 143 is smaller than a predetermined value, the work as a whole can be made even faster.

As described above, the motion adjustment device 112, the motion control system 110, and the robot system 100 of the present embodiment learn and update the optimum motion command value for each section. This makes it possible to design fine-grained motion command values that conventional adjustment could not achieve, and as a result high-speed, high-quality robot work can be realized. Specifically, in work such as fitting parts together or inserting a connector, the work time can be shortened while the reaction force between the mating parts is suppressed.

Embodiment 4.
FIG. 20 is a block diagram showing a configuration example of the motion adjustment device 112c according to Embodiment 4 of the present invention, together with peripheral blocks. In the motion adjustment device, motion control system, and robot system of the present embodiment, the other components are the same as those shown in FIG. 1. FIG. 20 shows an extracted part of the configuration of the robot system 100. The motion adjustment device 112c of the present embodiment includes a command value learning unit 113b and a command value division unit 116.

The command value division unit 116 receives the motion command value before update from the robot control device 111, the sensor information (the detection values of the sensor 140) from the sensor 140, and the constraint conditions from outside. From these inputs, the command value division unit 116 defines division points Pi (i = 0, 1, 2, ..., N+1) that divide the motion command value, using the position of the end effector 130 or the like or the command value progress rate, and outputs them as section information. The command value learning unit 113b is the same as that shown in FIG. 12.

The motion adjustment device 112c of the present embodiment uses feature quantities of the sensor information and the constraint conditions to determine the space to be divided, for example by applying machine learning, and generates the current division points Pi from the class information on the feature space divided in this way. As in the processing shown in FIG. 15, the motion adjustment device 112c performs preprocessing and learning processing. FIG. 21 is a flowchart showing an example of the flow of the preprocessing performed by the motion adjustment device 112c according to Embodiment 4 of the present invention. FIG. 22 is a flowchart showing an example of the flow of the learning processing performed by the motion adjustment device 112c according to Embodiment 4 of the present invention.

The preprocessing shown in FIG. 21 differs from the processing shown in FIG. 16 in that step S106b defines the number of sections in addition to the section positions. For example, sections can be generated automatically from waveform features. As waveform features, for example, the maximum value or the frequency distribution of the data in each fixed time interval Tsmp is taken as input for the position data, velocity data, force data, and force change rate data acquired as time series, and clustering is performed on that input. For the clustering, a method such as k-means, which is a type of machine learning, can be used to define breaks for each characteristic history of the waveform. Suppose that, for example, X types of waveform features are defined on this basis.
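The windowed feature extraction and clustering described here can be sketched as follows. This is a minimal, single-channel illustration in plain Python: it takes only a force trace, uses the per-window maximum as the feature, and runs a simple one-dimensional k-means. The function names, the evenly spread initial centres, and the fixed iteration count are assumptions for the example; the specification also mentions position, velocity, and force change rate data and frequency distributions, which are not covered here.

```python
from typing import List, Sequence


def window_maxima(samples: Sequence[float], window: int) -> List[float]:
    """Maximum absolute value in each consecutive window of `window` samples."""
    return [max(abs(v) for v in samples[i:i + window])
            for i in range(0, len(samples), window)]


def kmeans_1d(values: Sequence[float], k: int, iterations: int = 20) -> List[int]:
    """Plain 1-D k-means; returns the cluster index assigned to each value.

    Centres start evenly spread between min and max (an arbitrary,
    deterministic choice made for this sketch).
    """
    lo, hi = min(values), max(values)
    centres = [lo + (hi - lo) * j / max(k - 1, 1) for j in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: nearest centre for each value.
        labels = [min(range(k), key=lambda j: abs(v - centres[j])) for v in values]
        # Update step: move each centre to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:  # keep the old centre if a cluster is empty
                centres[j] = sum(members) / len(members)
    return labels
```

Here `window` plays the role of Tsmp expressed as a number of samples, and `k` corresponds to the number X of waveform feature types.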

Next, the original data can be labeled based on the clusters obtained. For example, a similarity S(i) (i = 1, 2, 3, ..., X) of the input in question to each of the X clusters is defined, so that how close the input is to the characteristics of each group can be expressed as a percentage. The input can then be labeled with the group that has the largest percentage. Let L(t) be the label at each time, defined with time t as the variable. In step S106b, the number of sections and the section positions can be defined by placing breaks at all or some of the points where the label changes.
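The labeling and break detection described above can be sketched in a few lines of Python. The function names are hypothetical, cluster indices are 0-based here rather than 1 to X, and every label change is treated as a break, although the text also allows using only some of them.

```python
from typing import List, Sequence


def label_from_similarity(similarities: Sequence[float]) -> int:
    """Index of the cluster with the highest similarity S(i) (0-based here)."""
    return max(range(len(similarities)), key=lambda i: similarities[i])


def division_points(labels: Sequence[int]) -> List[int]:
    """Indices t where L(t) differs from L(t-1); each one starts a new section."""
    return [t for t in range(1, len(labels)) if labels[t] != labels[t - 1]]
```

With a label sequence such as `[0, 0, 1, 1, 0, 0]`, the breaks fall at indices 2 and 4, which gives three sections; the number of sections is always the number of breaks plus one.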

The learning processing shown in FIG. 22, on the other hand, differs from the processing shown in FIG. 17 in three steps: S211, S212, and S213. In step S211, a first evaluation value is obtained using an evaluation function for learning the number of sections and the section positions, based on the sensor information, the motion command value, the control parameters, and the constraint conditions. In step S212, the number of sections and the section positions are learned and updated based on the first evaluation value. In step S213, a second evaluation value for learning the motion command value is obtained. The learning processing shown in FIG. 22 therefore learns the motion command value after learning the number of sections and the section positions.

Including the above processing adds a framework for learning the section information automatically, so the section information no longer has to be designed in advance from prior knowledge. This gives the particular effect of shortening design time.

Embodiment 5.
FIG. 23 is a block diagram showing a configuration example of the motion adjustment device 112d according to Embodiment 5 of the present invention, together with peripheral blocks. In the motion adjustment device, motion control system, and robot system of the present embodiment, the other components are the same as those shown in FIG. 1. FIG. 23 shows an extracted part of the configuration of the robot system 100. The motion adjustment device 112d of the present embodiment includes a command value division unit 116 and a motion learning unit 117. The command value division unit 116 is the same as that shown in FIG. 20.

FIG. 24 is a block diagram showing a configuration example of the motion learning unit 117 according to Embodiment 5 of the present invention. The motion learning unit 117 includes a command value learning unit 113b and a parameter learning unit 118. The command value learning unit 113b is the same as that shown in FIG. 20. The motion learning unit 117 receives the motion command value and the control parameters before update from the robot control device 111, the constraint conditions from outside, the sensor information from the sensor 140, and the section information from the command value division unit 116. These input signals are passed to the command value learning unit 113b and the parameter learning unit 118.

The parameter learning unit 118 does not adjust direct robot behavior such as position, speed, or acceleration command values. Instead, it adjusts the gains, impedance parameters, filter design parameters, and so on of the sensor feedback control system based on the external sensor. In other words, the parameter learning unit 118 adjusts the control parameters of the feedback control system. It takes the section information, sensor information, constraint conditions, command values, and control parameters as input, and uses them to update the input control parameters to control parameters that satisfy the constraint conditions. Machine learning can be used for this update. As an example, the parameter learning unit 118 updates the control parameters so that the evaluation value obtained from a predefined evaluation function becomes larger, and repeats the computation until the value converges asymptotically. Depending on how the evaluation function is defined, the parameter learning unit 118 may instead update the control parameters so that the evaluation value becomes smaller.
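One simple way to picture such an update loop is a greedy coordinate search, sketched below in Python. This is a swapped-in technique chosen for brevity; the specification only states that machine learning can be used and does not prescribe this search. The function `tune_parameters`, the callbacks `evaluate` and `satisfies_constraints`, and the values of `step`, `tol`, and `max_iters` are all hypothetical.

```python
from typing import Callable, Dict


def tune_parameters(params: Dict[str, float],
                    evaluate: Callable[[Dict[str, float]], float],
                    satisfies_constraints: Callable[[Dict[str, float]], bool],
                    step: float = 0.1, tol: float = 1e-6,
                    max_iters: int = 100) -> Dict[str, float]:
    """Greedy coordinate search that maximises `evaluate` within the constraints."""
    names = list(params)
    best = dict(params)
    best_value = evaluate(best)
    for _ in range(max_iters):
        improved = False
        for name in names:
            for delta in (step, -step):
                candidate = dict(best)
                candidate[name] += delta
                if not satisfies_constraints(candidate):
                    continue  # never accept parameters that break a constraint
                value = evaluate(candidate)
                if value > best_value + tol:
                    best, best_value, improved = candidate, value, True
        if not improved:  # evaluation value has stopped increasing
            break
    return best
```

For an evaluation function that is to be minimized instead, the same loop applies with the sign of `evaluate` flipped. In a real system each call to `evaluate` would correspond to one or more robot trials, so a more sample-efficient method would normally be preferred.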

In FIG. 24, the parameter learning unit 118 is illustrated as a component independent of the command value learning unit 113b. However, the two units do not necessarily have to perform independent processing. For example, the parameter learning unit 118 and the command value learning unit 113b can perform their processing simultaneously using a single evaluation function. The parameter learning unit 118 adjusts the control parameters for each section. The number of sections used by the command value learning unit 113b and the number of sections used by the parameter learning unit 118 are not necessarily the same. For example, the parameter learning unit 118 may use more sections than the command value learning unit 113b.

The parameter learning unit 118 can also update not only the control parameters of the sensor feedback control system based on the external sensor 142 but also the control parameters of the feedback control system based on the internal sensor 141. As a result, higher-quality and faster robot work can be realized.

FIG. 25 is a block diagram showing another configuration example of the motion adjustment device 112d according to Embodiment 5 of the invention, together with peripheral blocks. FIG. 25 shows a configuration example that does not include the command value learning unit 113b. In this configuration example, the motion adjustment device 112d does not update the motion command value and updates only the control parameters.

Embodiment 6.
When adjusting the speed pattern, the motion adjustment device, motion control system, and robot system of the present embodiment set an upper limit or a lower limit on the speed target value in each section Si, and define the search space in each section based on the rigidity of the work target and constraints on its assembly quality. As a result, motion command values or control parameters that are feasible but would cause assembly-quality problems are no longer searched within the search space. The adjustment can therefore converge on motion command values or control parameters that achieve high-speed assembly within the range of work quality specified by the user. This gives the particular effect that the adjusted robot can maintain a work quality in which the reaction force acting on the work target does not become large and the work target is not damaged.
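A per-section search space of this kind can be pictured as a pair of limits for each section Si, with every candidate speed target clipped to its section's range before it is tried. The Python sketch below assumes that simple representation; the function name and the example limits are hypothetical.

```python
from typing import List, Tuple


def clamp_to_search_space(speed_targets: List[float],
                          bounds: List[Tuple[float, float]]) -> List[float]:
    """Clip each section's speed target to that section's (lower, upper) limits."""
    if len(speed_targets) != len(bounds):
        raise ValueError("one (lower, upper) pair is needed per section")
    return [min(max(v, lo), hi) for v, (lo, hi) in zip(speed_targets, bounds)]
```

A learner that only ever proposes clipped candidates cannot converge on a speed that exceeds the limits derived from the rigidity or assembly-quality constraints.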

DESCRIPTION OF SYMBOLS: 100 robot system, 110 motion control system, 111 robot control device, 112, 112b, 112c, 112d motion adjustment device, 113, 113b command value learning unit, 114 storage unit, 115 learning processing unit, 116 command value division unit, 117 motion learning unit, 120 robot, 130 end effector, 140 sensor, 141 internal sensor, 142 external sensor, 143 force sensor, 200 work target, 210 first part, 211 hole, 300 surrounding environment, 310 second part, 311 protrusion, 401 processor, 402 memory, 403 data bus.

Claims

1. A motion adjustment device for a robot, used in a robot system that includes a robot fitted with an end effector and a robot control device that controls the motion of the robot, the robot performing work on a work target, the motion adjustment device comprising a command value learning unit that performs learning using, as input, the force acting on the end effector as detected by an external sensor provided in the robot system, and thereby adjusts a motion command value transmitted from the robot control device to the robot to control the motion of the robot.

2. The motion adjustment device according to claim 1, wherein the command value learning unit adjusts the motion command value by performing learning using a range of force acting on the end effector as a constraint condition.

3. The motion adjustment device according to claim 2, wherein the command value learning unit adjusts the motion command value by performing learning using an upper limit of the time required for the work as a constraint condition.

4. The motion adjustment device according to claim 1, wherein the motion command value is a speed command value that is a target value of the movement speed of the end effector or a target value of the motion speed of the robot.

5. The motion adjustment device according to 1, wherein the command value learning unit adjusts the motion command value for each of a plurality of sections into which the work is divided from start to end.

6. The motion adjustment device according to claim 5, further comprising a command value division unit that generates the plurality of sections by dividing the work from start to end, wherein the command value learning unit adjusts the motion command value for each of the sections generated by the command value division unit.

7. The motion adjustment device according to claim 6, wherein the command value division unit adjusts the position of a division point that divides the work into the sections.

8. The motion adjustment device according to any one of the above, wherein the command value learning unit adjusts the motion command value by performing learning that uses, as a constraint condition, an upper limit or a lower limit of any of force, moment, torque, or current value.

9. The motion adjustment device according to any one of claims 2 to 8, wherein the command value learning unit adjusts the motion command value by performing learning that uses, as a constraint condition, an upper limit or a lower limit of either the position or orientation of the robot or its position and orientation relative to the surrounding environment.

10. The motion adjustment device according to claim 5 or 6, wherein the command value learning unit performs an evaluation based on an evaluation function every M trials (M being a natural number) and adjusts the motion command value.

11. The motion adjustment device according to claim 5 or 6, further comprising a parameter learning unit that learns a control parameter in at least one of feedback control based on an internal sensor provided in the robot system and feedback control based on the external sensor, wherein the parameter learning unit performs learning based on section information, which is information on the sections, and sensor information obtained from the external sensor over a plurality of trials, and updates the control parameter.

12. A motion adjustment device for a robot, used in a robot system that includes a robot fitted with an end effector and a robot control device that controls the motion of the robot, the robot performing work on a work target, the motion adjustment device comprising a parameter learning unit that performs learning using, as input, the force acting on the end effector as detected by an external sensor provided in the robot system, and thereby learns a control parameter in at least one of feedback control of the motion of the robot based on an internal sensor provided in the robot system and feedback control of the motion of the robot based on the external sensor.

13. The motion adjustment device according to claim 12, wherein the parameter learning unit adjusts the control parameter by performing learning using a range of force acting on the end effector as a constraint condition.

14. The motion adjustment device according to claim 13, wherein the parameter learning unit adjusts the control parameter by performing learning using an upper limit of the time required for the work as a constraint condition.

15. The motion adjustment device according to any one of claims 12 to 14, wherein the parameter learning unit adjusts the control parameter for each of a plurality of sections into which the work is divided from start to end.

16. The motion adjustment device according to claim 15, further comprising a command value division unit that generates the plurality of sections by dividing the work from start to end, wherein the parameter learning unit adjusts the control parameter for each of the sections generated by the command value division unit.

17. The motion adjustment device according to claim 16, wherein the command value division unit adjusts the position of a division point that divides the work into the sections.

18. A motion control system comprising: the motion adjustment device according to any one of claims 1 to 17; and a robot control device that controls the motion of the robot based on the motion command value or the control parameter adjusted by the motion adjustment device.

19. A robot system comprising: the motion control system according to claim 18; and the robot controlled by the motion control system.