JP2020157380A

JP2020157380A - Fitting work device

Info

Publication number: JP2020157380A
Application number: JP2019055936A
Authority: JP
Inventors: 博明大庭; Hiroaki Oba
Original assignee: NTN Corp; NTN Toyo Bearing Co Ltd
Current assignee: NTN Corp
Priority date: 2019-03-25
Filing date: 2019-03-25
Publication date: 2020-10-01
Anticipated expiration: 2039-03-25
Also published as: JP7290439B2

Abstract

PROBLEM TO BE SOLVED: To provide a technique for efficiently performing machine learning with a structure that has no singular point. SOLUTION: A work device 101 includes: a holding section 112 which holds a fitting component; an angle adjustment section 111 which adjusts the direction of the holding section 112; a work head 109 on which the angle adjustment section 111 is mounted; a position adjustment section which moves the work head 109; and an information processor 102 which controls the work device 101. The angle adjustment section 111 has first and second link hubs 32 and 33, a plurality of links arranged in parallel between the first and second link hubs 32 and 33, and a plurality of driving sections, each of which drives one of the links. The information processor 102 uses the torque of each driving section of the angle adjustment section 111 during a fitting operation as a parameter of a machine learning model, determines by the machine learning model each driving signal to be transmitted to each driving section of the position adjustment section and the angle adjustment section 111, drives each driving section on the basis of the driving signals, and thereby adjusts the horizontal and vertical positions of the fitting component and the direction of the fitting component. SELECTED DRAWING: Figure 1

Description

The present disclosure relates to a fitting work device and, more specifically, to machine-learning-based control of a fitting work device that has an angle adjusting function provided by a link mechanism.

When an assembly robot is used to assemble mechanical parts and the like, the robot must be taught the position of each workpiece of the assembly operation (hereinafter referred to as a "work") with high accuracy. In particular, in a fitting operation in which a pair of works are fitted together, the positions of the works must be taught to the assembly robot precisely, and extremely high positioning accuracy is required.

In addition, as machine learning techniques have advanced in recent years, it has become desirable to introduce machine learning into the control of assembly robots so that they can handle a variety of assembly operations.

Regarding the teaching of working positions to an assembly robot, for example, Patent Document 1 (Japanese Unexamined Patent Publication No. 2008-264910) discloses a robot control device which, "in a robot that includes gripping means for gripping a fitting part and force/moment detecting means for detecting the force and moment applied to the fitting part gripped by the gripping means, and that fits the fitting part into a mating part, continues the insertion operation while the parts are determined to be in a galling state partway through the fitting, and applies to the fitting part, via the gripping means, a vibrating force whose magnitude and direction change periodically" (see [Abstract]).

Patent Document 2 (Japanese Unexamined Patent Publication No. 2015-016527) discloses "a robot device capable of automatically setting teaching points of an articulated robot with high accuracy and at low cost, and a method of setting teaching points using an articulated robot" (see [Abstract]).

Patent Document 3 (Japanese Translation of PCT International Application Publication No. 2015-530276) discloses "an automatic alignment process for calibrating and/or aligning a robot arm provided with a gripper unit in a laboratory automation system (LAS), and related technical configurations" (see paragraph [0005]).

Patent Document 1: Japanese Unexamined Patent Publication No. 2008-264910
Patent Document 2: Japanese Unexamined Patent Publication No. 2015-016527
Patent Document 3: Japanese Translation of PCT International Application Publication No. 2015-530276

For example, the techniques disclosed in Patent Documents 1 and 2 both presuppose an articulated robot. An articulated robot generally has postures, called singular points, in which it structurally cannot be controlled. An articulated robot also needs sensors to detect the forces and moments applied to the work, so that when it is combined with machine learning the number of learning parameters increases and learning efficiency deteriorates.

There is therefore a need for a technique that, unlike an articulated robot, has no structural singular point and allows machine learning to be performed efficiently.

The present disclosure has been made in view of the above background, and an object in one aspect is to provide a technique that has no structural singular point and allows machine learning to be performed efficiently.

A work device that performs a fitting operation according to an embodiment includes: a gripping unit that grips a fitting part; an angle adjusting unit to which the gripping unit is attached and which adjusts the orientation of the gripping unit; a work head to which the angle adjusting unit is attached; a position adjusting unit that moves the work head by means of a plurality of drive units; and a control device that controls the work device. The angle adjusting unit includes first and second link hubs, a plurality of links arranged in parallel between the first and second link hubs, and a plurality of drive units that respectively drive the plurality of links. The control device acquires the torque of each drive unit of the angle adjusting unit generated during the fitting operation, uses those torques as parameters of a machine learning model, and determines, by means of the machine learning model, the drive signals to be transmitted to the drive units of the position adjusting unit and the angle adjusting unit. Based on the determined drive signals, the control device drives the drive units of the position adjusting unit to adjust the horizontal and vertical positions of the fitting part, and drives the drive units of the angle adjusting unit to adjust the orientation of the fitting part.

According to an embodiment, it is possible to provide a technique that has no structural singular point and allows machine learning to be performed efficiently.

The above and other objects, features, aspects, and advantages of the invention will become apparent from the following detailed description of the invention, read in conjunction with the accompanying drawings.

FIG. 1 is a diagram showing a configuration example of the fitting work system 100 according to an embodiment.
FIG. 2 is a diagram showing a configuration example of the angle adjusting mechanism 111.
FIG. 3 is a diagram showing a configuration example in which the electric actuators 11 for attitude control are attached to the rotating shafts 42 of the angle adjusting mechanism 111.
FIG. 4 is a diagram showing a configuration example of the gripping mechanism 112.
FIG. 5 is a diagram showing an example of the angle adjusting mechanism 111 with the gripping mechanism 112 attached.
FIG. 6 is a diagram showing an example of the state of the tip of the gripping mechanism 112 when a deviation of the work during the fitting operation is adjusted.
FIG. 7 is a diagram showing a configuration example of the hardware of the information processing device 102.
FIG. 8 is a diagram showing a configuration example of the functional units of the information processing device 102.
FIG. 9 is a diagram showing an example of the operation of the evaluation value function unit 802.
FIG. 10 is a diagram showing an example of the operation pattern table 803.
FIG. 11 is a flowchart showing an example of the processing of the fitting work system 100.
FIG. 12 is a diagram showing an example of an operation image of the processing of FIG. 11.
FIG. 13 is a flowchart showing an example of the learning process for the fitting operation of the fitting work system 100.
FIG. 14 is a flowchart showing an example of the initial learning process for the fitting operation.
FIG. 15 is a flowchart showing an example of the learning process for the fitting operation.
FIG. 16 is a flowchart showing an example of the update process for the evaluation value function F of the evaluation value function unit 802.

Hereinafter, embodiments of the technical concept according to the present disclosure will be described with reference to the drawings. In the following description, identical parts are denoted by identical reference numerals. Their names and functions are also the same, so detailed descriptions of them will not be repeated.

<A. System configuration>
First, the configuration of the fitting work system according to the present embodiment will be described.

FIG. 1 is a diagram showing a configuration example of the fitting work system 100 according to the present embodiment. Referring to FIG. 1, the fitting work system 100 includes a fitting work device 101, an information processing device 102, and a control device 103.

The fitting work device 101 includes a gantry 104, a first linear motion unit 105, a second linear motion unit 106, a third linear motion unit 107, electric actuators 108A, 108B, and 108C (hereinafter collectively referred to as the electric actuators 108), a work head 109, a rotating unit mounting member 110, an angle adjusting mechanism 111, a gripping mechanism 112, and a work setting table 113.

The gantry 104 is a stand on which a position adjusting device is mounted. The position adjusting device consists of the first linear motion unit 105, the second linear motion unit 106, the third linear motion unit 107, the electric actuators 108A, 108B, and 108C that drive the respective linear motion units, and the work head 109.

The first linear motion unit 105, the second linear motion unit 106, and the third linear motion unit 107 move the work head 109 in the mutually orthogonal X-axis, Y-axis, and Z-axis directions, respectively. In one aspect, each linear motion unit may include a frame, a linear shaft, a linear bushing, and a trapezoidal screw and ball screw nut for transmitting power from the electric actuator 108. In another aspect, each linear motion unit may include, instead of the linear shaft, a linear guide or guide rollers that run along the surface of the frame. In yet another aspect, each linear motion unit may include a drive belt instead of the trapezoidal screw. A collision detection sensor may also be provided at the end of each linear motion unit, both for determining the initial position of each electric actuator 108 and as a safety mechanism.

The electric actuators 108 drive the respective linear motion units. In one aspect, each electric actuator 108 is a stepping motor and may transmit power to its linear motion unit via a trapezoidal screw or a drive belt. In another aspect, each electric actuator 108 may be an AC servomotor or a geared motor equipped with an encoder. The information processing device 102 may calculate the current positions of the work head 109 and the gripping mechanism 112 from the step count of the stepping motor or the rotation count of the encoder.
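The position calculation mentioned above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the 200 steps per revolution, and the 2.0 mm screw lead are hypothetical values chosen for the example and do not appear in this disclosure.

```python
from typing import Sequence, Tuple


def axis_position_mm(step_count: int,
                     steps_per_rev: int = 200,      # assumed motor resolution
                     screw_lead_mm: float = 2.0,    # assumed travel per screw revolution
                     origin_mm: float = 0.0) -> float:
    """Convert a cumulative step count of one stepping motor into an axis position."""
    if steps_per_rev <= 0:
        raise ValueError("steps_per_rev must be positive")
    revolutions = step_count / steps_per_rev
    return origin_mm + revolutions * screw_lead_mm


def head_position_mm(steps_xyz: Sequence[int]) -> Tuple[float, ...]:
    """Apply the same conversion to the X, Y, and Z axes of the work head."""
    return tuple(axis_position_mm(s) for s in steps_xyz)
```

With these assumed values, one full motor revolution (200 steps) moves an axis by 2.0 mm. For an encoder-equipped motor, the same conversion would apply with encoder counts per revolution in place of steps per revolution.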

The work head 109 is attached to the third linear motion unit 107 so as to move in the vertical direction (Z-axis direction). The work head 109 also has screw holes and attachments for mounting the parts needed for the operation.

The rotating unit mounting member 110 is attached to the work head 109 and has screw holes and attachments for mounting the angle adjusting mechanism 111. The angle adjusting mechanism 111 finely adjusts the orientation of the work gripped by the gripping mechanism 112. The base of the angle adjusting mechanism 111 is a rotation mechanism that uses an electric actuator. The rotation mechanism may be separate from the angle adjusting mechanism 111. Details of the angle adjusting mechanism 111 will be described later. The gripping mechanism 112 grips a work for the fitting operation, for example, a connector or plug of a connection terminal. Details of the gripping mechanism 112 will be described later. The work setting table 113 is a table on which one of the two works to be fitted, a work 114, is placed.

The information processing device 102 transmits control commands to the fitting work device 101 via the control device 103, and also acquires values such as the motor torque values of the electric actuators 108 and of the electric actuators of the angle adjusting mechanism 111. Details of the information processing device 102 will be described later.

The control device 103 converts data exchanged between the fitting work device 101 and the information processing device 102. In one aspect, the control device 103 is a control board built around a microcomputer; it may receive from the information processing device 102 commands (command torque, rotation amount, rotation speed, and the like) for the electric actuators 108 of the fitting work device 101 and for the electric actuators of the angle adjusting mechanism 111, and transmit a control signal to each electric actuator.

<B. Hardware configuration of system components>
FIG. 2 is a diagram showing a configuration example of the angle adjusting mechanism 111. Referring to FIG. 2, in the angle adjusting mechanism 111 the second link hub 33 on the distal end side is connected to the first link hub 32 on the proximal end side by three sets of link mechanisms 34 such that its attitude can be changed. The gripping mechanism 112 shown in FIG. 1 is attached to the second link hub 33 on the distal end side. Although an angle adjusting mechanism 111 having three sets of link mechanisms 34 is shown here, the number of link mechanisms 34 may be four or more.

Each link mechanism 34 consists of an end link member 35 on the proximal end side, an end link member 36 on the distal end side, and a central link member 37. The link mechanism 34 is a four-bar chain made up of four revolute pairs. The end link members 35 and 36 on the proximal and distal end sides are L-shaped.

One end of the end link member 35 on the proximal end side is rotatably connected to the first link hub 32 on the proximal end side via a rotating shaft 42. One end of the end link member 36 on the distal end side is rotatably connected to the second link hub 33 on the distal end side via a rotating shaft 73. The other ends of the end link members 35 and 36 are rotatably connected to the two ends of the central link member 37 via rotating shafts 55 and 75, respectively.

The angle adjusting mechanism 111 is a parallel link mechanism with a structure that combines two spherical link mechanisms. The central axes of the revolute pairs between the end link members 35, 36 and the central link member 37 may intersect at a certain angle or may be parallel.

The angle adjusting mechanism 111 can adjust the relative angle between the central axes of the link hubs solely through the motion of the links, without the motion of multiple serially connected joints that an articulated robot requires. Consequently, a slight movement of the tip does not cause the constituent members to move greatly, and quick motion is possible. In addition, the angle adjusting mechanism 111 has no singular point, and the force applied to the tip of the gripping mechanism 112 can be detected in any attitude from the motor torque values of the electric actuators that drive the links.

The second link hub 33 changes its attitude on a hemispherical surface as seen from the first link hub 32. Therefore, the target position of the second link hub 33 as seen from the first link hub 32 and the attitude of each link always correspond one to one. Accordingly, unlike a multi-link structure such as a robot arm, the angle adjusting mechanism 111 has no singular point.

FIG. 3 shows a configuration example in which electric actuators 11 for attitude control are attached to the rotating shafts 42 of the angle adjusting mechanism 111. Each electric actuator 11 is a rotary actuator (motor) provided with a speed reduction mechanism 62. The electric actuator 11 is installed on the upper surface of the first link hub 32 on the proximal end side such that the rotation axis of the electric actuator 11 is coaxial with the rotating shaft 42. The electric actuator 11 and the speed reduction mechanism 62 may be provided as a single unit. The speed reduction mechanism 62 is fixed to the first link hub 32 on the proximal end side by a motor fixing member 63.

In the example shown in FIG. 3, an electric actuator 11 is provided for every one of the three sets of link mechanisms 34, but the angle adjusting mechanism 111 according to the present embodiment is not limited to this. As long as electric actuators 11 for attitude control are provided for at least two of the link mechanisms 34, the attitude of the second link hub 33 on the distal end side relative to the first link hub 32 on the proximal end side can be uniquely determined.

FIG. 4 is a diagram showing a configuration example of the gripping mechanism 112. The gripping mechanism 112 clamps an object between two opposing claws. The gripping mechanism 112 according to the present embodiment opens and closes the two claws by means of an air cylinder. State A shows the gripping mechanism 112 in its open state. State B shows the gripping mechanism 112 in its closed state. The gripping mechanism 112 shown in FIG. 4 is one example, and the gripping mechanism 112 according to the present embodiment is not limited to it. In one aspect, the gripping mechanism 112 may be an electrically driven opening/closing mechanism, a mechanism that holds an object by suction, or another clamping mechanism.

FIG. 5 is a diagram showing an example of the angle adjusting mechanism 111 with the gripping mechanism 112 attached. The second link hub 33 on the distal end side of the angle adjusting mechanism 111 may have screw holes for screwing on the gripping mechanism 112, fitting holes, or other attachments. With the configuration shown in FIG. 5, the fitting work device 101 can correct the slight deviation that arises when a work is gripped by the gripping mechanism 112.

FIG. 6 is a diagram showing an example of the state of the tip of the gripping mechanism 112 when a deviation of the work during the fitting operation is adjusted. In a fitting operation, as in state A, a slight deviation of the work P gripped by the gripping mechanism 112 may cause the central axis C of the work H placed on the work setting table 113 and the central axis C' of the work P gripped by the gripping mechanism 112 not to coincide. In such a case, the angle adjusting mechanism 111 adjusts the angle of the work P gripped by the gripping mechanism 112 so that, as in state B, the central axis C of the work H and the central axis C' of the work P coincide.

Further, as in state C, the work setting table 113 or the work H placed on the work setting table 113 may be tilted. Even in such a case, the angle adjusting mechanism 111 finely adjusts the angle of the work P gripped by the gripping mechanism 112 to an inclined orientation so that, as in state D, the central axis C of the work H and the central axis C' of the work P coincide.
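As a small illustration of the angular part of this misalignment, the following Python sketch computes the angle between the central axis C of the work H and the central axis C' of the work P when both are given as direction vectors. The function name and the vector representation are assumptions made for this example; the disclosure does not specify how the axes are represented, and the sketch does not cover a lateral offset between the two axes.

```python
import math
from typing import Sequence


def axis_misalignment_deg(axis_c: Sequence[float],
                          axis_c_prime: Sequence[float]) -> float:
    """Return the angle in degrees between two axis direction vectors."""
    dot = sum(a * b for a, b in zip(axis_c, axis_c_prime))
    norm_c = math.sqrt(sum(a * a for a in axis_c))
    norm_cp = math.sqrt(sum(b * b for b in axis_c_prime))
    if norm_c == 0.0 or norm_cp == 0.0:
        raise ValueError("axis vectors must be non-zero")
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, dot / (norm_c * norm_cp)))
    return math.degrees(math.acos(cos_theta))
```

An angle of zero corresponds to states B and D, in which the two axes are parallel; a non-zero angle corresponds to the tilt that the angle adjusting mechanism 111 would need to remove.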

Although the position adjusting device described here moves the angle adjusting mechanism 111, the fitting work device 101 according to the present embodiment is not limited to this. The position adjusting device need only be able to position the angle adjusting mechanism 111 and the gripping mechanism 112 relative to the work on the work setting table 113; in one aspect, the position adjusting device may include a mechanism that moves the work setting table 113.

<C. Circuit and software configuration>
FIG. 7 is a diagram showing a configuration example of the hardware of the information processing device 102. Referring to FIG. 7, the information processing device 102 includes a CPU (Central Processing Unit) 701, a primary storage device 702, a secondary storage device 703, an external device interface 704, an input interface 705, an output interface 706, and a communication interface 707.

The CPU 701 processes the programs and data that run on the information processing device 102. The primary storage device 702 stores the programs executed by the CPU 701 and the data they reference. In one aspect, DRAM (Dynamic Random Access Memory) may be used as the primary storage device 702.

The secondary storage device 703 stores programs, data, and the like over the long term. Because the secondary storage device 703 is generally slower than the primary storage device 702, data used directly by the CPU 701 is placed in the primary storage device 702 and other data is placed in the secondary storage device 703. In one aspect, a non-volatile storage device such as an HDD (Hard Disk Drive) or an SSD (Solid State Drive) may be used as the secondary storage device 703.

The external device interface 704 is used, for example, to connect an auxiliary device to the information processing device 102. In one aspect, a USB (Universal Serial Bus) interface may be used as the external device interface 704. The input interface 705 is used to connect a keyboard, a mouse, or the like. In one aspect, a USB interface may be used as the input interface 705.

The output interface 706 is used to connect an output device such as a display. In one aspect, HDMI (registered trademark) (High-Definition Multimedia Interface) or DVI (Digital Visual Interface) may be used as the output interface 706.

The communication interface 707 is used to communicate with external communication equipment. In one aspect, a LAN (Local Area Network) port, a Wi-Fi (registered trademark) (Wireless Fidelity) transceiver, or the like may be used as the communication interface 707. In one aspect, the information processing device 102 may be a PC (Personal Computer) or a workstation. The processing of the information processing device 102 according to the present embodiment may be executed as a program on the hardware shown in FIG. 7.

FIG. 8 is a diagram showing a configuration example of the functions that implement the information processing device 102. In one aspect, some of the functions shown in FIG. 8 can be realized by executing a program on the hardware shown in FIG. 7. Referring to FIG. 8, the information processing device 102 includes a signal input unit 801, an evaluation value function unit 802, an operation pattern table 803, an operation determination unit 804, a command generation unit 805, an operation result determination unit 806, and an evaluation value function learning unit 807.

The signal input unit 801 acquires, from the fitting work device 101, the motor torque values of the electric actuators 11 of the angle adjusting mechanism 111. In one aspect, the signal input unit 801 may additionally acquire the motor torque values of the electric actuators 108 of the position adjusting device. The signal input unit 801 may also acquire images of the work and the gripping mechanism 112 captured by a camera (not shown), or the output values of any sensor.

The evaluation value function unit 802 uses an evaluation value function F, described later, to calculate an evaluation value for each operation pattern based on the motor torque values and other inputs received by the signal input unit 801.

The operation pattern table 803 stores a plurality of operation patterns, each of which is associated with at least one of a movement amount, a movement speed, an acceleration, and a command torque value for each electric actuator of the position adjusting device and the angle adjusting mechanism 111. For the angle adjusting mechanism 111, an operation pattern in the operation pattern table 803 may contain the angle of the angle adjusting mechanism 111 or the like rather than command values for the individual actuators.
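One way such a table could be represented in software is sketched below in Python. Every field name, pattern name, and numeric value here is hypothetical and chosen only for illustration; the disclosure states only that each pattern carries at least one of a movement amount, movement speed, acceleration, or command torque value, and that an angle of the angle adjusting mechanism may be used in place of per-actuator commands.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class OperationPattern:
    """One row of a hypothetical operation pattern table."""
    name: str
    # Movement amounts of the position adjusting device along X, Y, Z (assumed unit: mm).
    move_xyz_mm: Tuple[float, float, float] = (0.0, 0.0, 0.0)
    # Angle change of the angle adjusting mechanism, as two assumed tilt components (degrees).
    tilt_deg: Tuple[float, float] = (0.0, 0.0)
    speed_mm_s: Optional[float] = None         # movement speed, if specified
    command_torque_nm: Optional[float] = None  # command torque value, if specified


def build_pattern_table(patterns: Iterable[OperationPattern]) -> Dict[str, OperationPattern]:
    """Index patterns by name, rejecting duplicate names."""
    table: Dict[str, OperationPattern] = {}
    for pattern in patterns:
        if pattern.name in table:
            raise ValueError(f"duplicate pattern name: {pattern.name}")
        table[pattern.name] = pattern
    return table


# Illustrative contents only; these patterns are not taken from the disclosure.
PATTERN_TABLE = build_pattern_table([
    OperationPattern("push_down", move_xyz_mm=(0.0, 0.0, -0.1), speed_mm_s=1.0),
    OperationPattern("shift_x_plus", move_xyz_mm=(0.05, 0.0, 0.0), speed_mm_s=1.0),
    OperationPattern("shift_x_minus", move_xyz_mm=(-0.05, 0.0, 0.0), speed_mm_s=1.0),
    OperationPattern("tilt_plus", tilt_deg=(0.5, 0.0)),
])
```

Keeping the patterns immutable and keyed by name makes it straightforward to attach one evaluation value to each pattern.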

The operation determination unit 804 selects, from the operation patterns in the operation pattern table 803, the operation pattern with the largest evaluation value as the next operation of the fitting work device 101. Based on the operation pattern selected by the operation determination unit 804, the command generation unit 805 generates command values for the electric actuators of the fitting work device 101 and transmits them to the fitting work device 101 via the control device 103.

The operation result determination unit 806 compares the motor torque values of the electric actuators of the angle adjusting mechanism 111 before and after execution of the previously selected operation pattern. If the motor torque value after execution is smaller than the motor torque value before execution, the operation result determination unit 806 outputs a high reward. If the motor torque value after execution is larger than the value before execution, the operation result determination unit 806 outputs a low reward. The reward here is the reward used in machine learning to update the evaluation value function F. In one aspect, the operation result determination unit 806 may also include the motor torque values of the electric actuators 108 of the position adjusting device in the before-and-after comparison.
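The before-and-after torque comparison can be sketched as follows. This is only an illustration under stated assumptions: the function name `torque_reward`, the aggregation of per-actuator torques into a sum of absolute values, the reward constants, and the neutral value returned when the load is unchanged are all hypothetical and are not specified in the description.

```python
from typing import Sequence


def torque_reward(
    before: Sequence[float],
    after: Sequence[float],
    high: float = 1.0,
    low: float = -1.0,
    neutral: float = 0.0,
) -> float:
    """Compare motor torque before and after an operation pattern.

    Assumption: the per-actuator torques are aggregated as a sum of
    absolute values; the description does not fix an aggregation rule.
    """
    load_before = sum(abs(t) for t in before)
    load_after = sum(abs(t) for t in after)
    if load_after < load_before:
        return high      # torque decreased: high reward
    if load_after > load_before:
        return low       # torque increased: low reward
    return neutral       # unchanged: assumed neutral reward
```

A caller would pass the torque values read before and after the command, for example `torque_reward([0.4, 0.3, 0.2], [0.1, 0.2, 0.1])`.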

The evaluation value function learning unit 807 treats the reward output by the operation result determination unit 806 as a teacher signal, and updates the evaluation value function F based on the difference between the teacher signal and the evaluation value at the time the operation pattern was selected. In one aspect, each time the evaluation value function learning unit 807 has updated the evaluation value function F a predetermined number of times, it may bring the evaluation value function F used by the evaluation value function unit 802 up to date.
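An update driven by the difference between the teacher signal and the evaluation value might look like the sketch below. The description does not state the form of the evaluation value function F, so the linear model, the class name `LinearEvaluator`, the learning rate, and the zero-based pattern index are assumptions made purely for illustration.

```python
from typing import List, Sequence


class LinearEvaluator:
    """Hypothetical evaluation value function F: one linear model per pattern."""

    def __init__(self, n_patterns: int, n_inputs: int, lr: float = 0.1) -> None:
        self.w = [[0.0] * n_inputs for _ in range(n_patterns)]
        self.b = [0.0] * n_patterns
        self.lr = lr  # assumed learning rate

    def evaluate(self, torques: Sequence[float]) -> List[float]:
        """Return one evaluation value per operation pattern."""
        return [
            sum(wi * x for wi, x in zip(w_k, torques)) + b_k
            for w_k, b_k in zip(self.w, self.b)
        ]

    def update(self, torques: Sequence[float], k: int, teacher: float) -> float:
        """Move the value of pattern k (zero-based) toward the teacher signal.

        Returns the error (teacher minus evaluation value) before the update.
        """
        error = teacher - self.evaluate(torques)[k]
        self.w[k] = [wi + self.lr * error * x for wi, x in zip(self.w[k], torques)]
        self.b[k] += self.lr * error
        return error
```

Repeated calls to `update` with the same input and teacher signal shrink the error for the selected pattern while leaving the other patterns untouched.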

FIG. 9 is a diagram showing an example of the operation of the evaluation value function unit 802. The evaluation value function unit 802 acquires the motor torque values and other data from the signal input unit 801 and inputs them to the evaluation value function F. An evaluation value is calculated for each operation pattern. In the example shown in FIG. 9, the evaluation value function unit 802 calculates an evaluation value for each of the n operation patterns a_1 to a_n. In one aspect, the evaluation value function unit 802 may be a program that accepts the motor torque values and other data as the input of the evaluation value function F and calculates the evaluation value of each operation pattern.

The n evaluation values output by the evaluation value function F are indices for selecting the operation pattern to execute next. The operation pattern whose evaluation value is the largest is the optimal operation to execute next.

The operation determination unit 804 therefore selects, from the n operation patterns, the operation pattern corresponding to the largest evaluation value as the next operation. In the example shown in FIG. 9, "evaluation value = 0.614" is the largest, so the operation determination unit 804 selects the corresponding operation pattern a_{n-3}.

The operation determination unit 804 passes the selected operation pattern a_{n-3} to the command generation unit 805. The command generation unit 805 refers to the operation pattern table 803, generates the command values corresponding to a_{n-3}, and outputs them to the control device 103.
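The selection rule of the operation determination unit 804 amounts to taking the index of the largest evaluation value. The sketch below illustrates this with n = 6 patterns; the function name `select_pattern` and every value except 0.614 are hypothetical, since the other entries of FIG. 9 are not reproduced in the text.

```python
from typing import Sequence


def select_pattern(values: Sequence[float]) -> int:
    """Return the 1-based index k of the operation pattern a_k with the largest value."""
    if not values:
        raise ValueError("at least one evaluation value is required")
    best = max(range(len(values)), key=lambda i: values[i])
    return best + 1


# Hypothetical evaluation values for a_1 .. a_6; only 0.614 comes from the text.
values = [0.102, 0.245, 0.614, 0.311, 0.087, 0.190]
k = select_pattern(values)
print(f"selected a_{k}")  # a_3, which is a_{n-3} for n = 6
```

When two patterns tie, this sketch picks the one with the smaller index; the description does not say how ties are resolved.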

FIG. 10 is a diagram showing an example of the operation pattern table 803. For each operation pattern, the operation pattern table 803 stores: the movement amount, movement speed, acceleration, and command torque value of the electric actuators 108 of the position adjusting device; the rotation angle, rotation speed, acceleration, and command torque value of the rotation mechanism at the base of the angle adjusting mechanism 111; and the bend angle change amount, turning angle change amount, rotation speed, acceleration, and command torque value of the angle adjusting mechanism 111.

In one aspect, the operation pattern table 803 may store the movement amount, movement speed, acceleration, and command torque value of the individual electric actuators of the angle adjusting mechanism 111. Also, in one aspect, the operation pattern table 803 may be stored in the secondary storage device 703 and read into the primary storage device 702 so that the CPU 701 can refer to it.
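One way to picture the table of FIG. 10 is as a mapping from a pattern index to a record with three groups of fields. The sketch below is an assumption-laden illustration: the class names, field names, the use of three movement axes, and all numeric values are hypothetical, and no units are implied by the description.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class PositionCommand:
    """Electric actuators 108 of the position adjusting device."""
    movement: Tuple[float, float, float]  # assumed: one amount per axis
    speed: float
    acceleration: float
    command_torque: float


@dataclass(frozen=True)
class BaseRotationCommand:
    """Rotation mechanism at the base of the angle adjusting mechanism 111."""
    rotation_angle: float
    rotation_speed: float
    acceleration: float
    command_torque: float


@dataclass(frozen=True)
class LinkCommand:
    """Angle adjusting mechanism 111 itself."""
    bend_angle_change: float
    turning_angle_change: float
    rotation_speed: float
    acceleration: float
    command_torque: float


@dataclass(frozen=True)
class OperationPattern:
    position: PositionCommand
    base: BaseRotationCommand
    link: LinkCommand


# Hypothetical two-entry table keyed by the 1-based pattern index k.
TABLE: Dict[int, OperationPattern] = {
    1: OperationPattern(
        PositionCommand((0.0, 0.0, -0.5), 1.0, 5.0, 0.2),
        BaseRotationCommand(0.0, 0.0, 0.0, 0.0),
        LinkCommand(0.0, 0.0, 0.0, 0.0, 0.0),
    ),
    2: OperationPattern(
        PositionCommand((0.0, 0.0, 0.0), 0.0, 0.0, 0.0),
        BaseRotationCommand(0.0, 0.0, 0.0, 0.0),
        LinkCommand(0.5, 0.0, 2.0, 10.0, 0.1),
    ),
}


def lookup(k: int) -> OperationPattern:
    """Return the stored operation pattern a_k; raises KeyError if k is unknown."""
    return TABLE[k]
```

Frozen dataclasses keep the stored patterns read-only, which matches their role as fixed table entries that the command generation unit only looks up.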

<D. Internal processing of the information processing device 102 in the fitting work>
FIG. 11 is a flowchart showing an example of the processing of the fitting work system 100. In one aspect, a program for executing the processing of FIG. 11 may be stored in the secondary storage device 703, read into the primary storage device 702, and executed by the CPU 701. In the following, the processing is described on the assumption that the information processing device 102 executes each step of FIG. 11.

In step S1105, with the fitting work device 101 gripping the work, the information processing device 102 sends the fitting work device 101 a command to move the work gripped by the gripping mechanism 112 to a predetermined position (the fitting start position) using the position adjusting device. In step S1110, the information processing device 102 assigns 1 to the variable i.

In step S1115, the information processing device 102 determines whether the value of the variable i is less than or equal to the constant N. If so (YES in step S1115), the information processing device 102 moves control to step S1120. Otherwise (NO in step S1115), the information processing device 102 moves control to step S1155. Step S1115 is the check that lets the information processing device 102 repeat steps S1120 to S1150 at most N times.

In step S1120, the information processing device 102 acquires the motor torque value of each electric actuator from the fitting work device 101. In one aspect, in addition to the motor torque values, the information processing device 102 may acquire images and sensor values from a camera and various sensors.

In step S1125, the information processing device 102 calculates the evaluation values using the acquired motor torque values of the electric actuators as the input to the evaluation value function unit 802. In one aspect, the information processing device 102 may use only the motor torque values of the electric actuators 11 of the angle adjusting mechanism 111 as the input. In another aspect, the information processing device 102 may use both the motor torque values of the electric actuators 11 of the angle adjusting mechanism 111 and the motor torque values of the electric actuators 108 of the position adjusting device as the input.

In step S1130, the information processing device 102 finds the largest of the evaluation values calculated for the operation patterns and selects the corresponding operation pattern a_k as the next operation.

In step S1135, the information processing device 102 uses the command generation unit 805 to send the fitting work device 101 a command to execute the selected operation pattern a_k. Based on the received command, the fitting work device 101 drives the electric actuators 108 of the position adjusting device to adjust the horizontal and vertical position of the gripping mechanism 112, and drives the electric actuators 11 of the angle adjusting mechanism 111 to adjust the orientation of the gripping mechanism 112. In step S1140, the information processing device 102 acquires the position information of each electric actuator 108 of the position adjusting device.

In step S1145, the information processing device 102 determines, from the acquired position information of the electric actuators 108 of the position adjusting device, whether the work gripped by the gripping mechanism 112 has reached the target position (that is, whether the fitting work is complete). For example, the work is judged to have reached the target position when the difference between the target position and the current position is less than or equal to a threshold. If the information processing device 102 determines that the work has reached the target position (YES in step S1145), it ends the processing. Otherwise (NO in step S1145), the information processing device 102 moves control to step S1150.

In one aspect, to determine whether the work gripped by the gripping mechanism 112 has reached the target position, the information processing device 102 may refer only to the position information of the vertical electric actuator 108C of the position adjusting device, or it may refer to the position information of all the electric actuators 108. In one aspect, when the electric actuators 108 are stepping motors, the information processing device 102 may calculate the current positions of the work head 109 and the gripping mechanism 112 from the step counts of the electric actuators 108. In another aspect, the information processing device 102 may calculate the current positions of the work head 109 and the gripping mechanism 112 from the rotation counts of the encoders of the electric actuators 108.
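As a rough illustration of the position calculation and the threshold test, the sketch below converts a step count into a linear position and compares it with a target. It assumes a lead-screw axis in which one motor revolution advances the axis by a fixed lead; that drive arrangement, the function names, and the parameters are hypothetical and are not taken from the description.

```python
def position_from_steps(step_count: int, steps_per_rev: int, lead_mm: float) -> float:
    """Convert a stepping-motor step count to a linear position in mm.

    Assumption: one revolution advances the axis by lead_mm.
    """
    if steps_per_rev <= 0:
        raise ValueError("steps_per_rev must be positive")
    return step_count / steps_per_rev * lead_mm


def reached_target(current_mm: float, target_mm: float, threshold_mm: float) -> bool:
    """True when the difference between target and current position is within the threshold."""
    return abs(target_mm - current_mm) <= threshold_mm
```

For example, with 200 steps per revolution and a 5 mm lead, 400 steps correspond to `position_from_steps(400, 200, 5.0)`, which can then be passed to `reached_target` together with the target position for the vertical axis.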

In step S1150, the information processing device 102 increments the variable i (adds 1 to i) and moves control to step S1115. In step S1155, the information processing device 102 determines that the operation patterns have been executed the predetermined number of times (N times) without the fitting work being completed, judges the operation to have failed, and ends the processing. The information processing device 102 executes the processing of FIG. 11 for each work in turn.
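The flow of FIG. 11 can be summarized as a bounded loop. The sketch below is a simplified illustration, not the actual control program: the function `run_fitting`, its callable parameters, the use of a single depth value for the completion check, and the `ToyDevice` stand-in are all hypothetical.

```python
from typing import Callable, List, Sequence


def run_fitting(
    read_torques: Callable[[], Sequence[float]],
    evaluate: Callable[[Sequence[float]], Sequence[float]],
    execute: Callable[[int], None],
    read_depth: Callable[[], float],
    target_depth: float,
    threshold: float,
    max_steps: int,
) -> bool:
    """Return True if the target is reached within max_steps (N), else False."""
    for _ in range(max_steps):                                # S1110, S1115, S1150
        torques = read_torques()                              # S1120
        values = evaluate(torques)                            # S1125
        k = max(range(len(values)), key=lambda i: values[i])  # S1130 (zero-based)
        execute(k)                                            # S1135
        depth = read_depth()                                  # S1140
        if abs(target_depth - depth) <= threshold:            # S1145
            return True
    return False                                              # S1155: operation failed


class ToyDevice:
    """Stand-in for the fitting work device: pattern 0 lowers the work by 1.0."""

    def __init__(self) -> None:
        self.depth = 0.0

    def read_torques(self) -> List[float]:
        return [0.0, 0.0, 0.0]

    def execute(self, k: int) -> None:
        if k == 0:
            self.depth += 1.0

    def read_depth(self) -> float:
        return self.depth


def toy_evaluate(torques: Sequence[float]) -> List[float]:
    """Toy evaluation function that always prefers pattern 0."""
    return [1.0, 0.0]
```

With the toy device, a target depth of 3.0 is reached on the third iteration, so the call succeeds when `max_steps` is at least 3 and reports failure otherwise.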

FIG. 12 is a diagram showing an example of how the processing of FIG. 11 proceeds. First, the fitting work device 101 uses the position adjusting device to move the work gripped by the gripping mechanism 112 to the predetermined fitting position (corresponding to step S1105).

State A is the state immediately after the work gripped by the gripping mechanism 112 has been moved to the predetermined fitting position. In the example shown in state A, the central axis C of the work H placed on 113 and the central axis C' of the work P gripped by the gripping mechanism 112 do not coincide.

Starting from state A, the information processing device 102 repeats steps S1115 to S1145 of FIG. 11 to make the fitting work device 101 perform the fitting work. The information processing device 102 acquires the motor torque value of each electric actuator in state A (corresponding to step S1120). Next, the information processing device 102 calculates the evaluation value of each operation pattern using the motor torque values in state A as the input of the evaluation value function F (corresponding to step S1125). The information processing device 102 then selects the operation pattern a_{n-3}, which has the highest evaluation value (corresponding to step S1130), and sends the command corresponding to a_{n-3} to the fitting work device 101 (corresponding to step S1135).

State B shows the situation immediately after the fitting work device 101 has executed the operation pattern a_{n-3}. The information processing device 102 acquires the current position of each electric actuator in state B (corresponding to step S1140). From the acquired current positions, the information processing device 102 then determines whether the fitting work is complete, that is, whether the work has reached the target position (corresponding to step S1145). Because the fitting work is not complete in state B, the information processing device 102 repeats steps S1115 to S1145.

As in state A, the information processing device 102 executes steps S1120 to S1125. The information processing device 102 then selects the operation pattern a_{n-1}, which has the highest evaluation value (corresponding to step S1130), and sends the command corresponding to a_{n-1} to the fitting work device 101 (corresponding to step S1135).

State C shows the situation immediately after the fitting work device 101 has executed the operation pattern a_{n-1}. In the same way, the information processing device 102 checks whether the fitting work is complete and, for as long as it is not, repeats the calculation of the evaluation values and the execution of an operation pattern by the fitting work device 101.

State F shows the situation immediately after the fitting work device 101, in state E, has executed the operation pattern a_3. In state F, the fitting work is complete. The information processing device 102 detects the completion of the fitting work from the position information of the electric actuator 108C of the position adjusting device and the like. After the fitting work is complete, the information processing device 102 may send the fitting work device 101 a command to perform the fitting work for the next work.

<E. Learning process of the fitting work>
In the example described with reference to FIGS. 11 and 12, the information processing device 102 calculates the evaluation value of each operation pattern based on the current motor torque values of the angle adjusting mechanism 111 and the position adjusting device, and completes the operation by successively executing the operation pattern with the largest evaluation value. For this to work, the evaluation value function F of FIG. 9 must be optimized so that, given the motor torque values and other data, it outputs the largest evaluation value for the optimal operation pattern to execute next.

However, the initial state of the work to be fitted may differ each time it is gripped by the gripping mechanism 112. In addition, the posture of the work gripped by the gripping mechanism 112 may change when the fitting work device 101 executes an operation pattern. Building a rule-based operation program that anticipates all of these states is difficult. The fitting work system 100 according to the present embodiment therefore uses reinforcement learning to optimize the evaluation value function F over repeated trials of the operation.

FIG. 13 is a flowchart showing an example of the learning process of the fitting work in the fitting work system 100. In one aspect, a program for executing the processing of FIG. 13 may be stored in the secondary storage device 703, read into the primary storage device 702, and executed by the CPU 701. In the following, the learning process is described on the assumption that the information processing device 102 executes each step of FIG. 13.

In step S1310, the information processing device 102 assigns 1 to the variable j. In step S1320, the information processing device 102 determines whether the value of the variable j is less than or equal to the constant J1. If so (YES in step S1320), the information processing device 102 moves control to step S1330. Otherwise (NO in step S1320), the information processing device 102 moves control to step S1350. In the initial state, in which the evaluation value function F is untrained, the information processing device 102 repeats the initial learning of the fitting operation until the variable j reaches the constant J1.

In step S1330, the information processing device 102 executes the initial learning process of the fitting work, which is described later. In step S1340, the information processing device 102 increments the variable j. The information processing device 102 then repeats the initial learning process of the fitting work up to the predetermined upper limit of J1 times.

In step S1350, the information processing device 102 determines whether the value of the variable j is less than or equal to the constant J2. If so (YES in step S1350), the information processing device 102 moves control to step S1360. Otherwise (NO in step S1350), the information processing device 102 ends the learning process.

In step S1360, the information processing device 102 executes the learning process of the fitting work, which is described later. In step S1370, the information processing device 102 increments the variable j. The information processing device 102 then repeats the learning process of the fitting work until the variable j exceeds the constant J2, which is predetermined as the upper limit.
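The two-phase schedule of FIG. 13 can be sketched as two consecutive loops that share the counter j. The function name `run_training` and the callback parameters are hypothetical; the callbacks stand in for the processes of steps S1330 and S1360.

```python
from typing import Callable


def run_training(
    j1: int,
    j2: int,
    initial_episode: Callable[[], None],
    learning_episode: Callable[[], None],
) -> None:
    """Run initial learning while j <= J1, then regular learning while j <= J2."""
    j = 1                      # S1310
    while j <= j1:             # S1320
        initial_episode()      # S1330
        j += 1                 # S1340
    while j <= j2:             # S1350
        learning_episode()     # S1360
        j += 1                 # S1370
```

Because j is not reset between the two loops, J1 initial episodes are followed by J2 - J1 regular learning episodes when J2 is larger than J1.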

FIG. 14 is a flowchart showing an example of the initial learning process of the fitting work (corresponding to step S1330 of FIG. 13). In one aspect, a program for executing the processing of FIG. 14 may be stored in the secondary storage device 703, read into the primary storage device 702, and executed by the CPU 701. In the following, the initial learning process is described on the assumption that the information processing device 102 executes each step of FIG. 14.

In step S1405, with the fitting work device 101 gripping the work, the information processing device 102 sends the fitting work device 101 a command to move the work gripped by the gripping mechanism 112 to a predetermined position (the fitting start position) using the position adjusting device.

In step S1410, the information processing device 102 assigns 1 to the variable i. In step S1415, the information processing device 102 acquires the motor torque values T1 of the electric actuators of the fitting work device 101.

In step S1420, the information processing device 102 uses the operation determination unit 804 to select the next operation pattern a_k at random. Specifically, the information processing device 102 determines the index k of the operation pattern from a random number between 1 and n.

In step S1425, the information processing device 102 stores the motor torque values T1 of the electric actuators of the angle adjusting mechanism 111 before the start of the operation in the evaluation value function learning unit 807, and then sends the fitting work device 101 a command to execute the operation pattern a_k. In one aspect, the motor torque values T1 may include the motor torque values of the electric actuators 108 of the position adjusting device.

In step S1430, after the fitting work device 101 has executed the operation pattern a_k, the information processing device 102 acquires the current positions of the electric actuators 108 of the position adjusting device and the motor torque values T2 of the electric actuators of the angle adjusting mechanism 111. In one aspect, the motor torque values T2 may include the motor torque values of the electric actuators 108 of the position adjusting device.

In step S1435, the information processing device 102 determines whether the current positions of the electric actuators 108 of the position adjusting device match the target position. If they match (YES in step S1435), the information processing device 102 moves control to step S1440. Otherwise (NO in step S1435), the information processing device 102 moves control to step S1455.

In step S1440, the information processing device 102 sets the end determination to True (completed) and sets the reward R for the operation pattern a_k to 1. In this embodiment, the reward R is 1 on success, -1 on failure, and 0 otherwise, but the reward R is not limited to these values. It suffices that the rewards for success and for failure differ.

In step S1445, the information processing device 102 saves the following in the evaluation value function learning unit 807: the executed operation pattern a_k, the motor torque values T1 before execution of a_k, the motor torque values T2 after execution of a_k, the reward R (R = 1), and the end determination True (completed). In step S1450, the information processing device 102 executes the update process of the evaluation value function F.

In step S1455, the information processing device 102 determines whether the value of the variable i is greater than the constant N1. The constant N1 is the upper limit on the number of operation patterns that may be executed during one fitting operation. If the value of the variable i is greater than N1 (YES in step S1455), the information processing device 102 judges that the number of executed operation patterns has reached the upper limit and moves control to step S1460. Otherwise (NO in step S1455), the information processing device 102 moves control to step S1465.

In step S1460, the information processing device 102 sets the end determination to True and sets the reward R for the operation pattern a_k to -1. The processing from step S1445 onward is as described above. In step S1465, the information processing device 102 increments the variable i. In step S1470, the information processing device 102 sets the end determination to False and sets the reward R for the executed operation pattern a_k to 0.

In step S1475, the information processing device 102 saves the following in the evaluation value function learning unit 807: the executed operation pattern a_k, the motor torque values T1 before execution of a_k, the motor torque values T2 after execution of a_k, the reward R (R = 0), and the end determination False (not completed). In step S1480, the information processing device 102 executes the update process of the evaluation value function F.
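The reward and end determination assigned in FIG. 14, together with the record saved to the evaluation value function learning unit 807, can be sketched as follows. The names `Transition`, `judge_step`, and `record`, and the use of a plain list as the store, are hypothetical. The reward values 1, -1, and 0 follow the embodiment, which notes that other values are possible.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Transition:
    pattern: int                       # index k of the executed pattern a_k
    torque_before: Tuple[float, ...]   # T1
    torque_after: Tuple[float, ...]    # T2
    reward: int                        # R
    done: bool                         # end determination


def judge_step(reached_target: bool, i: int, n1: int) -> Tuple[int, bool]:
    """Return (reward R, end determination) for one executed operation pattern."""
    if reached_target:   # S1435 YES -> S1440
        return 1, True
    if i > n1:           # S1455 YES -> S1460
        return -1, True
    return 0, False      # S1455 NO -> S1465, S1470


def record(
    buffer: List[Transition],
    pattern: int,
    t1: Sequence[float],
    t2: Sequence[float],
    reached_target: bool,
    i: int,
    n1: int,
) -> Transition:
    """Store one experience (S1445 or S1475) and return it."""
    reward, done = judge_step(reached_target, i, n1)
    transition = Transition(pattern, tuple(t1), tuple(t2), reward, done)
    buffer.append(transition)
    return transition
```

Each stored record carries everything the update process of the evaluation value function F is described as using: the pattern, the torque values before and after, the reward, and the end determination.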

FIG. 15 is a flowchart showing an example of the learning process of the fitting work (corresponding to step S1360 of FIG. 13). In one aspect, a program for executing the processing of FIG. 15 may be stored in the secondary storage device 703, read into the primary storage device 702, and executed by the CPU 701. In the following, the learning process is described on the assumption that the information processing device 102 executes each step of FIG. 15. Steps in FIG. 15 that are identical to those in FIG. 14 carry the same reference signs, and their description is not repeated.

In step S1510, the information processing device 102 uses the evaluation value function unit 802 to calculate the evaluation value of each operation pattern based on the motor torque value T1 of each electric actuator. In step S1520, the information processing device 102 refers to the operation pattern table 803 and selects the operation pattern with the highest evaluation value.

In the initial learning process for fitting in FIG. 14, sufficient learning information is not yet available, so the information processing device 102 selects the next operation pattern at random in step S1420. In the learning process for fitting in FIG. 15, by contrast, at least a certain amount of learning information has been accumulated in the evaluation value function learning unit 807, so the information processing device 102 calculates the evaluation values based on the evaluation value function F in step S1510. In the process of FIG. 15 as well, the information processing device 102 improves the accuracy of the fitting work by updating the evaluation value function F as needed.
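The difference between the two selection rules (random in step S1420, highest evaluation value in steps S1510 and S1520) can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and `toy_evaluate` is a made-up placeholder for the evaluation value function F, not the neural network of the embodiment.

```python
import random
from typing import Callable, Sequence


def select_random_pattern(num_patterns: int, rng: random.Random) -> int:
    """Initial learning (FIG. 14, step S1420): pick a pattern index at random."""
    return rng.randrange(num_patterns)


def select_best_pattern(
    evaluate: Callable[[Sequence[float]], Sequence[float]],
    torque_t1: Sequence[float],
) -> int:
    """Learning (FIG. 15, steps S1510 and S1520): pick the highest evaluation value."""
    values = evaluate(torque_t1)  # one evaluation value per operation pattern
    return max(range(len(values)), key=lambda k: values[k])


def toy_evaluate(torque_t1: Sequence[float]) -> Sequence[float]:
    # Placeholder for F: returns made-up evaluation values for three patterns.
    total = sum(torque_t1)
    return [0.1 * total, 0.9 * total, 0.3 * total]


rng = random.Random(0)
initial_choice = select_random_pattern(3, rng)
learned_choice = select_best_pattern(toy_evaluate, [0.2, 0.1, 0.4])
```

With the placeholder above, `learned_choice` is the index of the second pattern, because its evaluation value is the largest of the three.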

FIG. 16 is a flowchart showing an example of the update process of the evaluation value function F of the evaluation value function unit 802. In one aspect, the program for executing the process of FIG. 16 may be stored in the secondary storage device 703, read into the primary storage device 702, and executed by the CPU 701. In the following, the update process is described on the assumption that the information processing device 102 executes each step of FIG. 16.

In step S1610, the information processing device 102 reads out, for each operation pattern a_k stored in the evaluation value function learning unit 807, the motor torque value T1 before execution of the operation pattern a_k, the motor torque value T2 after execution of the operation pattern a_k, the reward R, and the end determination.

In step S1620, the information processing device 102 updates the internal parameters of the learning evaluation value function F' using the data read out in step S1610. The evaluation value function F' is a learning copy prepared separately from the evaluation value function F that is used to calculate evaluation values. The evaluation value function F' is used by the evaluation value function learning unit 807, whereas the evaluation value function F is used by the evaluation value function unit 802. In step S1630, the information processing device 102 copies the evaluation value function F' to the evaluation value function F each time the learning process has been repeated a predetermined number of times. The information processing device 102 may also execute the process of FIG. 16 at any time during the processes of FIGS. 13 to 15.
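The relationship between F' and F in steps S1620 and S1630 can be sketched as below. This is a minimal illustration under stated assumptions: parameters are represented as a plain dictionary rather than neural network weights, and the class name `EvaluationFunctionPair` and the parameter `sync_interval` (the "predetermined number of times") are hypothetical.

```python
import copy
from typing import Dict, List


class EvaluationFunctionPair:
    """Holds parameters of the learning function F' and the acting function F."""

    def __init__(self, initial_params: Dict[str, List[float]], sync_interval: int) -> None:
        self.learning_params = copy.deepcopy(initial_params)  # F' (learning unit 807)
        self.acting_params = copy.deepcopy(initial_params)    # F (function unit 802)
        self.sync_interval = sync_interval
        self.update_count = 0

    def apply_update(self, new_learning_params: Dict[str, List[float]]) -> bool:
        """Step S1620: update F'. Step S1630: copy F' to F every sync_interval updates.

        Returns True when F was overwritten by this call.
        """
        self.learning_params = copy.deepcopy(new_learning_params)
        self.update_count += 1
        if self.update_count % self.sync_interval == 0:
            self.acting_params = copy.deepcopy(self.learning_params)
            return True
        return False


pair = EvaluationFunctionPair({"w": [0.0, 0.0]}, sync_interval=3)
synced = [pair.apply_update({"w": [float(step), float(step)]}) for step in (1, 2, 3)]
```

Between copies, operation patterns are still selected with the older parameters of F, so the values used for selection do not change on every single update of F'.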

The learning process of the evaluation value function F is described in detail below. Because the evaluation value function F is a neural network, a teacher signal is required for learning. The information processing device 102 determines the teacher signal y according to the end determination as follows.

When the end determination of the fitting process is True, the teacher signal y is as follows.

When the end determination of the fitting process is False, the teacher signal y is as follows.

Here, "s' = T2, a'" denotes the operation pattern a' that maximizes Q(s, a) in the state s' = T2, that is, at the motor torque value T2 after the operation pattern is executed. The information processing device 102 obtains the squared error E between the teacher signal y and the evaluation value function F, and trains the neural network by backpropagation. The evaluation value function F is expressed by equation (3) below.

The information processing device 102 then substitutes equation (3) into equation (4) below to calculate the error.
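The teacher signal and the squared error can be illustrated numerically. The sketch below assumes the standard Q-learning target, in which a discount factor `gamma` weights the best evaluation value of the next state; the discount factor, its default value, the factor 1/2 in the error, and the function names are assumptions for illustration and are not stated in the text.

```python
from typing import Sequence


def teacher_signal(reward: float, done: bool,
                   next_values: Sequence[float], gamma: float = 0.99) -> float:
    """Return the teacher signal y for one stored record.

    next_values holds the evaluation values Q(s' = T2, a') of every operation
    pattern a' for the torque values T2 measured after execution.
    """
    if done:
        # End determination True: no further step follows, so y is the reward.
        return reward
    # End determination False: add the discounted best value of the next state.
    return reward + gamma * max(next_values)


def squared_error(y: float, q_value: float) -> float:
    """Squared error E between the teacher signal y and Q(s = T1, a)."""
    return 0.5 * (y - q_value) ** 2


# Illustrative numbers only.
y_done = teacher_signal(-1.0, True, [0.5, 2.0], gamma=0.9)
y_running = teacher_signal(0.0, False, [0.5, 2.0], gamma=0.9)
error = squared_error(y_running, 1.0)
```

In this example the unfinished step receives a teacher signal of about 1.8 even though its own reward is 0, because the best follow-up pattern is valued at 2.0.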

Backpropagation optimizes the internal parameters of the neural network so that the error E approaches 0. Consequently, as learning progresses, the value of expression (5) below approaches 0.

Likewise, in reinforcement learning, equation (6) below holds once learning has progressed sufficiently, so training the neural network by backpropagation yields the same result as reinforcement learning.
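The expressions referred to in this passage are not reproduced in the text. Assuming the embodiment follows the standard deep Q-learning formulation, they would plausibly take the form below. The discount factor \gamma, the factor 1/2, and the numbering of the two teacher-signal cases as (1) and (2) are assumptions; T_1 and T_2 stand for the motor torque values T1 and T2.

```latex
% Assumed reconstruction, not taken from the source (requires amsmath).
\begin{align}
y &= R && \text{(end determination True)} \tag{1} \\
y &= R + \gamma \max_{a'} Q(s' = T_2,\, a') && \text{(end determination False)} \tag{2} \\
F &= Q(s = T_1,\, a) \tag{3} \\
E &= \tfrac{1}{2}\bigl(y - Q(s = T_1,\, a)\bigr)^2 \tag{4} \\
  &\phantom{=}\; R + \gamma \max_{a'} Q(s',\, a') - Q(s,\, a) \tag{5} \\
Q(s,\, a) &= R + \gamma \max_{a'} Q(s',\, a') \tag{6}
\end{align}
```

Under this reading, driving E in (4) toward 0 drives expression (5) toward 0, which is the same condition as (6).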

As described above, the fitting work device 101 according to the present embodiment does not have a serial articulated structure and instead consists only of linear motion mechanisms and a parallel link mechanism. As a result, the singularity problem of articulated robots does not arise, and the device can work in less space than an articulated robot. In machine learning as well, the fitting work device 101 can use only the motor torque values of the electric actuators attached to the base-end link hub of the parallel link mechanism and of the electric actuators of the position adjustment device as learning data. Compared with an articulated robot, the fitting work device 101 therefore has fewer learning parameters, which makes machine learning easier. This makes it possible to improve accuracy in fitting work that demands high precision, such as the fitting of connectors.

The embodiments disclosed herein should be considered illustrative in all respects and not restrictive. The scope of the present invention is defined by the claims rather than by the above description, and is intended to include all modifications within the meaning and scope equivalent to the claims.

11, 108A, 108B, 108C electric actuator; 32 first link hub; 33 second link hub; 34 link mechanism; 35, 36 end link member; 37 central link member; 42, 55, 73, 75 rotating shaft; 62 speed reduction mechanism; 63 motor fixing member; 100 fitting work system; 101 fitting work device; 102 information processing device; 103 control device; 104 frame; 105 first linear motion unit; 106 second linear motion unit; 107 third linear motion unit; 109 work head; 110 rotation unit mounting member; 111 angle adjustment mechanism; 112 gripping mechanism; 113 workpiece mounting table; 114 workpiece; 702 primary storage device; 703 secondary storage device; 704 external device interface; 705 input interface; 706 output interface; 707 communication interface; 801 signal input unit; 802 evaluation value function unit; 803 operation pattern table; 804 operation determination unit; 805 command generation unit; 806 operation result determination unit; 807 evaluation value function learning unit.

Claims

A work device that performs fitting work, comprising:
a grip unit that grips a fitting component;
an angle adjustment unit to which the grip unit is attached and which adjusts the orientation of the grip unit;
a work head on which the angle adjustment unit is mounted;
a position adjustment unit that moves the work head by means of a plurality of drive units; and
an information processing device that controls the work device, wherein
the angle adjustment unit includes:
first and second link hubs;
a plurality of links arranged in parallel between the first and second link hubs; and
a plurality of drive units that respectively drive the plurality of links, and
the information processing device:
acquires the torque of each drive unit of the angle adjustment unit generated during the fitting work;
uses the torque of each drive unit of the angle adjustment unit as a parameter of a machine learning model, and determines, by the machine learning model, each drive signal to be transmitted to each drive unit of the position adjustment unit and the angle adjustment unit; and
adjusts the horizontal and vertical positions of the fitting component by driving each drive unit of the position adjustment unit based on the determined drive signals, and further adjusts the orientation of the fitting component by driving each drive unit of the angle adjustment unit.

The work device according to claim 1, wherein the information processing device
further acquires the torque of each drive unit of the position adjustment unit, and
includes the torque of each drive unit of the position adjustment unit in the parameters of the machine learning model when determining the drive signal of each drive unit of the position adjustment unit and the angle adjustment unit.

The work device according to claim 1 or 2, wherein the drive signal transmitted to each drive unit of the position adjustment unit and the angle adjustment unit includes information on a command torque, a rotation speed, and a rotation amount of each drive unit.

The work device according to any one of claims 1 to 3, wherein the information processing device determines that the fitting work is completed based on the vertical position of the grip unit having reached a predetermined position.

The work device according to claim 4, wherein each drive unit of the position adjustment unit is a stepping motor, and
the information processing device detects the vertical position of the grip unit based on the number of steps of each drive unit of the position adjustment unit.

The work device according to claim 4, wherein the information processing device detects the vertical position of the grip unit based on the output of an encoder provided in each drive unit of the position adjustment unit.

The work device according to any one of claims 4 to 6, wherein the information processing device repeatedly adjusts the fitting position of the fitting component gripped by the grip unit based on the machine learning model while determining that the fitting work is not completed.

The work device according to any one of claims 4 to 7, wherein the information processing device generates reward data used for learning the machine learning model based on the determination that the fitting work is completed.

The work device according to claim 8, wherein the information processing device includes, in learning parameters, the torque of each drive unit of the angle adjustment unit, the completion determination of the fitting work, and the reward data, and
updates the machine learning model based on the learning parameters.

The work device according to claim 9, wherein the information processing device further includes the torque of each drive unit of the position adjustment unit in the learning parameters.

The work device according to any one of claims 1 to 10, wherein the position adjustment unit includes a three-axis linear motion mechanism.