JP7378309B2

JP7378309B2 - working equipment

Info

Publication number: JP7378309B2
Application number: JP2020020271A
Authority: JP
Inventors: 孝介原
Original assignee: Sumitomo Heavy Industries Ltd
Current assignee: Sumitomo Heavy Industries Ltd
Priority date: 2020-02-10
Filing date: 2020-02-10
Publication date: 2023-11-13
Anticipated expiration: 2040-02-10
Also published as: JP2021122924A

Description

The present invention relates to a working device.

Patent Document 1 discloses a system for automatically operating a robot arm. This system includes a plurality of imitation models that imitate, through machine learning, an operator's operation of the robot arm, and a model selection unit that selects the imitation model to be used based on a classification of data on the surrounding environment.

Patent Document 1: JP2018-206286A

Conventionally, there are automatic operation systems that predict some state using a prediction model and automate operations based on the prediction result. However, prediction in conventional automatic operation systems has been limited to simple motions, such as predicting the motion of a single object. It has therefore been difficult for conventional automatic operation systems to handle, as operation targets, a plurality of objects that interact with one another and change their mutual arrangement.

An object of the present invention is to provide a working device that can automate operations on a plurality of objects.

(1)
A working device according to one aspect of the present invention is a working device that stores a plurality of objects in a container, comprising:
a movable part capable of operating on the plurality of objects;
a state acquisition unit that acquires states of the plurality of objects;
an operation determining unit that predicts a change in the states of the plurality of objects after an operation and determines an operation for the plurality of objects; and
an operation control unit that causes the movable part to perform the operation determined by the operation determining unit,
wherein the operation determining unit includes a machine-learned prediction model that predicts the states of the plurality of objects after an operation by the movable part, including state changes due to interaction among the plurality of objects, and an evaluation processing unit that evaluates a prediction result obtained with the prediction model, and determines the operation based on the prediction using the prediction model and the evaluation by the evaluation processing unit, and
the evaluation processing unit evaluates the prediction result based on gaps between the plurality of objects in the container and the number of objects in the container.
(2)
A working device according to another aspect of the present invention is a working device that transports earth and sand as a plurality of objects, comprising:
a movable part capable of operating on the plurality of objects;
a state acquisition unit that acquires states of the plurality of objects;
an operation determining unit that predicts a change in the states of the plurality of objects after an operation and determines an operation for the plurality of objects; and
an operation control unit that causes the movable part to perform the operation determined by the operation determining unit,
wherein the operation determining unit includes a machine-learned prediction model that predicts the states of the plurality of objects after an operation by the movable part, including state changes due to interaction among the plurality of objects, and an evaluation processing unit that evaluates a prediction result obtained with the prediction model, and determines the operation based on the prediction using the prediction model and the evaluation by the evaluation processing unit, and
the evaluation processing unit evaluates the prediction result based on a comparison between the earth-and-sand shape of the prediction result and a target earth-and-sand shape.
(3)
A working device according to another aspect of the present invention comprises:
a movable part capable of operating on a plurality of objects;
a state acquisition unit that acquires states of the plurality of objects;
an operation determining unit that predicts a change in the states of the plurality of objects after an operation and determines an operation for the plurality of objects;
an operation control unit that causes the movable part to perform the operation determined by the operation determining unit; and
a setting processing unit capable of setting target state data for the plurality of objects,
wherein the operation determining unit includes a machine-learned prediction model that predicts the states of the plurality of objects after an operation by the movable part, including state changes due to interaction among the plurality of objects, and an evaluation processing unit that evaluates a prediction result obtained with the prediction model, and determines the operation based on the prediction using the prediction model and the evaluation by the evaluation processing unit, and
the evaluation processing unit evaluates the prediction result using the target state data.

According to the present invention, it is possible to provide a working device that can automate operations on a plurality of objects.

FIG. 1 is a block diagram showing a working device according to Embodiment 1 of the present invention.
FIG. 2 is a diagram illustrating variables used for evaluation.
FIG. 3 is a flowchart showing the procedure of work processing executed by the control unit.
FIG. 4 is an explanatory diagram showing an example of a first operation and its evaluation.
FIG. 5 is an explanatory diagram showing an example of a second operation and its evaluation.
FIG. 6 is a block diagram showing a working device according to Embodiment 2 of the present invention.
FIG. 7 is a diagram illustrating automatic operation processing of the working device of Embodiment 2.

Embodiments of the present invention will be described in detail below with reference to the drawings.

(Embodiment 1)
FIG. 1 is a block diagram showing a working device according to Embodiment 1 of the present invention. In Embodiment 1, the objects to be operated on are workpieces such as injection-molded products or glasses. The working device 1 of Embodiment 1 automatically stores a plurality of workpieces in a container (box), with the aim of efficiently fitting as many workpieces as possible into the container.

As shown in FIG. 1, the working device 1 includes an imaging unit 3 for acquiring the positions of the plurality of workpieces, a movable part 2 such as a robot hand capable of operating on the workpieces, and a control unit 10 that moves the movable part 2 to carry out the operations automatically. The imaging unit 3 corresponds to an example of the state acquisition unit according to the present invention.

The movable part 2 can perform an operation of adding a workpiece to the container and an operation of moving a workpiece within the container.

The control unit 10 is a computer having a storage unit that stores a control program, a CPU (Central Processing Unit) that executes the control program, and an I/O that receives captured images from the imaging unit 3 and outputs control signals to the movable part 2. In the control unit 10, several functional modules are realized by the CPU executing the control program. The functional modules include an operation determining unit 11 that determines the operation of the movable part 2 and an operation control unit 12 that moves the movable part 2 to execute the operation determined by the operation determining unit 11.

The operation determining unit 11 includes a machine-learned prediction model 111 that predicts changes in the states of the plurality of workpieces when the movable part 2 operates on a workpiece, and an evaluation processing unit 112 that evaluates the states of the plurality of workpieces.

The prediction model 111 predicts the states of a plurality of mutually influencing objects (workpieces, etc.) after a given operation has been performed on them. For the prediction model 111, for example, a neural network that approximates a simulation of a multi-body problem through machine learning can be used.

The evaluation processing unit 112 evaluates the states of the plurality of objects (workpieces, etc.) predicted by the prediction model 111. The evaluation processing unit 112 is designed to output a higher evaluation value the more desirable the states of the plurality of objects are. A desirable state may be, for example, a state that approaches the target state quickly. The evaluation processing unit 112 evaluates the states of the plurality of objects using an evaluation function. The evaluation function may be configured so that the user can set it.

The operation determining unit 11 determines the next operation of the movable part 2 based on the prediction result of the prediction model 111 and the evaluation by the evaluation processing unit 112. For example, the operation determining unit 11 takes a point several operation steps ahead as a prediction horizon and determines the next operation step based on the predicted states of the plurality of objects at the prediction horizon and their evaluation. The operation determining unit 11 selects various combinations of operations, has the prediction model 111 and the evaluation processing unit 112 predict and evaluate the state at the prediction horizon for each combination, and compares the evaluations. It then finds the prediction horizon with the highest evaluation and determines the operation of the first operation step of that prediction horizon as the next operation.

The operation control unit 12 moves the movable part 2 to execute the operation determined by the operation determining unit 11.

Next, a specific example of the prediction model 111 and the evaluation processing unit 112 will be described. The prediction model and the evaluation processing unit according to the present invention are not limited to the following specific example.

<Prediction model>
In the prediction model 111, the state vector of the i-th object (a workpiece, etc.) is written $x_i^k$, and the set of state vectors is written $X^k = \{x_i^k \mid i = 1, \dots, N^k\}$. An operation applied to the objects is written $u^k$. The index $k$ denotes discrete time. The neural network of the prediction model 111 can be expressed as a function $f$ that takes as input the state vector set $X^{k-1}$ at a discrete time $k-1$ and the operation $u^{k-1}$, and outputs the state vector set $X^k$ at the next discrete time $k$, as in the following equation (1).

$$X^k = f(X^{k-1}, u^{k-1}) \tag{1}$$

As a concrete example suited to multi-body problems, the neural network described in Chang, Michael B. et al., "A compositional object-based approach to learning physical dynamics", arXiv preprint arXiv:1612.00341 (2016), ICLR 2017, can be applied to the prediction model 111. This reference presents a neural network that handles the simulation of multi-body problems. To train the prediction model 111, a large number of learning data sets $\{X^k, X^{k-1}, u^{k-1}\}$ are prepared from simulation data in which various patterns of operations are tried on a plurality of objects. The Distinct Element Method or the like can be used for the simulation. $\{X^{k-1}, u^{k-1}\}$ is the training input, and $\{X^k\}$ is the target value. The machine-learned prediction model 111 is obtained by feeding the learning data sets to the neural network and optimizing its parameters by backpropagation or the like.
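As an illustration of how the learning data sets might be assembled from one simulated trajectory, the following minimal Python sketch pairs each state and operation with the state that follows. The names `make_training_pairs`, `states`, and `operations` are hypothetical, and the simulator that produces the trajectory is assumed to exist elsewhere.

```python
from typing import List, Sequence, Tuple

State = List[List[float]]   # X^k: one state vector per object
Operation = List[float]     # u^k: an encoded operation


def make_training_pairs(
    states: Sequence[State], operations: Sequence[Operation]
) -> List[Tuple[Tuple[State, Operation], State]]:
    """Pair the input (X^{k-1}, u^{k-1}) with the target X^k.

    Assumes states[k] results from applying operations[k-1] to
    states[k-1], so T operations go with T + 1 states.
    """
    if len(states) != len(operations) + 1:
        raise ValueError("expected exactly one more state than operations")
    return [
        ((states[k - 1], operations[k - 1]), states[k])
        for k in range(1, len(states))
    ]


# Toy trajectory with a single object described by its 2D position.
toy_states = [[[0.0, 0.0]], [[1.0, 0.0]], [[1.0, 2.0]]]
toy_operations = [[1.0, 0.0], [0.0, 2.0]]
toy_pairs = make_training_pairs(toy_states, toy_operations)
```

Each pair can then be fed to whatever training loop is used for the neural network; the sketch deliberately leaves the network and its optimizer out.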

The state vector $x_i^k$ of an object may have multiple elements, for example: whether the object is a workpiece, whether it is a wall surface, whether it is a robot arm, its two-dimensional position, its two-dimensional velocity, its orientation $\theta$ from a reference point, its angular acceleration $\omega$ about the reference point, and so on. Because the state vector $x_i^k$ has elements indicating the type of object, it can represent the state not only of a workpiece but also of other objects such as the container wall and the movable part 2 (robot arm). The states of these other objects can thus be included in the state vector set $X^k$.

The operation $u^k$ applied to the objects has multiple elements, for example the type of operation $a^k$ by the movable part 2 and the operation amount $v^k$ by the movable part 2. The operation type $a^k$ includes an operation of putting a workpiece into the container, an operation of moving a workpiece within the container, and the like. The operation amount $v^k$ is information on the direction and distance by which the movable part 2 moves the workpiece.
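One possible in-memory encoding of the state vector and the operation is sketched below in Python. The field order, the class names `ObjectState` and `Operation`, and the operation labels `"insert"` and `"push"` are assumptions made for illustration; the text above does not fix a layout.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ObjectState:
    """One state vector x_i^k; the field layout is an assumption."""
    is_work: bool
    is_wall: bool
    is_arm: bool
    x: float
    y: float
    vx: float = 0.0
    vy: float = 0.0
    theta: float = 0.0   # orientation from the reference point
    omega: float = 0.0   # angular quantity about the reference point

    def to_vector(self) -> List[float]:
        # Type flags first, then the kinematic quantities.
        return [
            float(self.is_work), float(self.is_wall), float(self.is_arm),
            self.x, self.y, self.vx, self.vy, self.theta, self.omega,
        ]


@dataclass
class Operation:
    """One operation u^k: a type a^k plus an amount v^k."""
    kind: str          # hypothetical labels: "insert" or "push"
    direction: float   # radians
    length: float      # distance the workpiece is moved


work = ObjectState(is_work=True, is_wall=False, is_arm=False, x=0.2, y=0.5)
push = Operation(kind="push", direction=0.0, length=0.1)
```

Keeping the type flags inside the same vector is what lets walls and the arm share one state vector set with the workpieces.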

<Evaluation processing unit>
The evaluation processing unit 112 has an evaluation function $L$, takes the state vector set $X^k$ as input, and outputs an evaluation value. The evaluation function $L$ is designed to give a high evaluation value when the state vector set $X^k$ is a state that quickly approaches the target state, and a low evaluation value otherwise. In Embodiment 1, the evaluation function $L$ is designed so that a state in which many workpieces can be packed into the container receives a high evaluation value. If there are multiple types of operations applied to the plurality of objects, the evaluation function $L$ may have a term for each type of operation.

<Evaluation function $L_A$ for the operation of pushing a workpiece in the container to open a gap>
The evaluation function $L_A$ for the gap-opening operation may be designed so that a high evaluation value is obtained when a large gap is obtained, because a large gap makes it possible to insert a workpiece. To construct $L_A$, a vector $d$ indicating the distances between an arbitrary point $p$ and each part is first introduced. FIG. 2 is a diagram illustrating the vector $d$.

Here, $p$ is the position vector of an arbitrary point in the container, $D_p$ is a function giving the distance between two points, $D_l$ is a function giving the distance between a point and a straight line, $y_1$ to $y_M$ are the position vectors of the workpieces in the container, and $b_1$ to $b_L$ are quantities that specify the position, angle, and plane length of each wall of the container. Each element of the vector $d$ indicates the distance between the point $p$ and a workpiece, or between the point $p$ and a wall of the container. As shown in FIG. 2, if there are $M = 8$ workpieces W in the container C1 and the container C1 has $L = 4$ walls, the elements of the set $Y$ are $y_1$ to $y_8$, the elements of the set $B$ are $b_1$ to $b_4$, and the vector $d$ has $M + L = 12$ elements.
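The construction of the vector $d$ can be sketched in Python as follows. This is a minimal illustration under stated assumptions: each wall is reduced to an infinite line given by a point and an angle (the plane length is ignored), workpieces are treated as points, and the function names `dist_point`, `dist_line`, and `distance_vector` are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Wall = Tuple[float, float, float]  # (x0, y0, angle): a line through (x0, y0)


def dist_point(p: Point, y: Point) -> float:
    """D_p: Euclidean distance between two points."""
    return math.hypot(p[0] - y[0], p[1] - y[1])


def dist_line(p: Point, b: Wall) -> float:
    """D_l: perpendicular distance from p to the line of wall b."""
    x0, y0, angle = b
    # Magnitude of the cross product of the unit direction and the offset.
    return abs(math.cos(angle) * (p[1] - y0) - math.sin(angle) * (p[0] - x0))


def distance_vector(p: Point, works: List[Point], walls: List[Wall]) -> List[float]:
    """d: M distances to workpieces followed by L distances to walls."""
    return [dist_point(p, y) for y in works] + [dist_line(p, b) for b in walls]


# Unit-square container: bottom, right, top, and left walls.
square_walls = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, math.pi / 2),
    (0.0, 1.0, 0.0),
    (0.0, 0.0, math.pi / 2),
]
d = distance_vector((0.5, 0.5), [(0.5, 0.9), (0.1, 0.5)], square_walls)
```

With two workpieces and four walls, `d` has six elements, matching the $M + L$ count described above.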

If all elements of the vector $d$ are large, there is a large gap around the point $p$. If, on the other hand, the elements of $d$ include both large and small values, a workpiece at the distance indicated by a small value may lie within the span indicated by a large value. In that case, the span indicated by the large value is not a gap and should have little influence on the evaluation of the gap. A vector $\eta$ representing the weight of this influence is therefore introduced.

Here, $[\,]_g$ denotes a vector whose elements are indexed by $g$, where $g = 1, \dots, M, M+1, \dots, M+L$, so the number of elements of the vector $\eta$ matches that of the vector $d$. $d_h$ denotes the $h$-th element of the vector $d$. $\alpha$ and $c$ are adjustment constants that are set as appropriate for the actual workpieces. The expression for the vector $\eta$ represents a weight that discounts the evaluation value for the gap to a distant workpiece or wall.

Further, using the vectors $d$ and $\eta$, a function $\gamma$ is introduced as in the following equation (6). The function $\gamma$ gives an estimate of the size of the gap at an arbitrary point $p$: it multiplies the vector $d$, which indicates the distances to the workpieces and walls, element by element with the vector $\eta$, which indicates the weight of each distance's influence on the gap, and takes the sum. The function $\gamma$ takes as arguments the point $p$, the set $Y$ of position vectors of all workpieces in the container, and the set $B$ of information specifying all walls of the container.

$$\gamma(p, Y, B) = \sum_{g=1}^{M+L} \eta_g \, d_g \tag{6}$$

Using the function $\gamma$, the evaluation function $L_A$ is defined as the maximum of $\gamma$ over arbitrary points $p$, as in the following equation (7).

$$L_A = \max_{p} \gamma(p, Y, B) \tag{7}$$

The optimization of the point $p$ (finding the point $p$ that makes $\gamma$ large) may be performed by randomly searching points in the container to roughly find a point $p'$ that makes $\gamma$ large, and then computing, by the gradient method, a point $p$ in the neighborhood of $p'$ that makes $\gamma$ larger still. The gradient can be obtained by numerical differentiation of the function $\gamma$.
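The two-stage search described above (random sampling followed by gradient ascent with a numerically differentiated $\gamma$) might look like the following Python sketch. The function $\gamma$ is passed in as a callable because the weight vector $\eta$ is not reproduced here; `maximize_gamma`, its parameters, the step-halving rule, and the toy objective are all assumptions made for illustration.

```python
import random
from typing import Callable, Tuple

Point = Tuple[float, float]


def numerical_gradient(gamma: Callable[[Point], float], p: Point,
                       h: float = 1e-4) -> Point:
    """Central-difference estimate of the gradient of gamma at p."""
    gx = (gamma((p[0] + h, p[1])) - gamma((p[0] - h, p[1]))) / (2 * h)
    gy = (gamma((p[0], p[1] + h)) - gamma((p[0], p[1] - h))) / (2 * h)
    return gx, gy


def maximize_gamma(gamma: Callable[[Point], float], width: float, height: float,
                   n_samples: int = 200, n_steps: int = 100,
                   step: float = 0.05, seed: int = 0) -> Tuple[Point, float]:
    """Return a point in the container that makes gamma large, and its value."""
    rng = random.Random(seed)
    # Coarse stage: the best of n_samples random points becomes p'.
    samples = [(rng.uniform(0.0, width), rng.uniform(0.0, height))
               for _ in range(n_samples)]
    p = max(samples, key=gamma)
    # Refinement stage: gradient ascent near p', clamped to the container.
    for _ in range(n_steps):
        gx, gy = numerical_gradient(gamma, p)
        q = (min(max(p[0] + step * gx, 0.0), width),
             min(max(p[1] + step * gy, 0.0), height))
        if gamma(q) > gamma(p):
            p = q
        else:
            step *= 0.5  # no improvement: shrink the step and try again
    return p, gamma(p)


def toy_gamma(p: Point) -> float:
    """Stand-in objective with a single peak at (0.3, 0.7)."""
    return -((p[0] - 0.3) ** 2 + (p[1] - 0.7) ** 2)


best_p, best_value = maximize_gamma(toy_gamma, 1.0, 1.0)
```

The random stage guards against starting the gradient ascent in a poor region, which matters when $\gamma$ has several local maxima, as a container with several separate gaps would.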

<Evaluation function $L_B$ for the operation of putting a workpiece into the container>
The evaluation function $L_B$ for the operation of inserting a workpiece may be designed so that a higher evaluation value is obtained as the number of workpieces increases. The evaluation function $L_B$ can therefore be defined as the number of workpieces (the number of elements of the set $Y$), as in the following equation (8).

$$L_B = |Y| \tag{8}$$

<Overall evaluation function $L$>
The overall evaluation function $L$ may be designed so that it can be used to judge which to choose: the operation of pushing a workpiece in the container to open a gap, or the operation of putting a workpiece into the container. The overall evaluation function $L$ can be defined by weighting and combining the evaluation functions $L_A$ and $L_B$ for the respective operations, as in the following equation (9). $\mu$ is a constant representing a positive weight.

When the evaluation value is to be treated as a cost (a value for which lower is better), the sign of the evaluation function $L$ may simply be inverted.
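A small Python sketch of the weighted combination and the optional sign inversion follows. The exact form of equation (9) is not reproduced in this text, so the additive form $L_A + \mu L_B$ used here is an assumption, as are the function name `overall_evaluation` and the numbers in the example.

```python
def overall_evaluation(l_a: float, n_works: int, mu: float = 1.0,
                       as_cost: bool = False) -> float:
    """Combine the gap term L_A with the count term L_B = n_works.

    Assumed form: L = L_A + mu * L_B, with mu > 0. With as_cost=True the
    sign is inverted so that lower values are better.
    """
    if mu <= 0:
        raise ValueError("mu must be a positive weight")
    value = l_a + mu * n_works
    return -value if as_cost else value


# Hypothetical comparison of two candidate operations from the same state:
# pushing enlarges the gap, inserting adds a workpiece but shrinks the gap.
after_push = overall_evaluation(l_a=0.45, n_works=8, mu=0.1)
after_insert = overall_evaluation(l_a=0.10, n_works=9, mu=0.1)
```

The weight `mu` sets how much gap size one additional workpiece is worth, so tuning it shifts the balance between pushing and inserting.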

<Work processing>
FIG. 3 is a flowchart showing the procedure of work processing executed by the control unit. FIG. 4 is an explanatory diagram showing an example of a first operation and its evaluation. FIG. 5 is an explanatory diagram showing an example of a second operation and its evaluation.

For example, when a start request is received from the user, the control unit 10 starts the work processing. When the work processing starts, the control unit 10 first acquires an image captured by the imaging unit 3 and detects the states of the plurality of objects (step S1). In Embodiment 1, the objects come to rest after each operation step, so the positions of the objects are acquired as the state in step S1.

Next, in the control unit 10, the operation determining unit 11 initializes the set $X^{k-1}$ of state vectors $x_i^{k-1}$ used for prediction, that is, sets it to the values of the state acquired in step S1 (step S2).

Next, the operation determining unit 11 selects an operation that can be applied to the state vector set $X^{k-1}$ (step S3). For example, as shown in FIGS. 4 and 5, for the state vector set $X^{k-1}$ at discrete time $k-1$ (a state in which a plurality of workpieces W are placed in the container C1), the selectable operations are an operation of inserting a new workpiece W into a gap of at least a certain size and an operation of pushing one of the workpieces W in the container C1 by some amount in some direction; one of these operations is selected. The selection may be made randomly, may be spread across the possible operations, or, if the range of ideal operations is known in advance, may be biased toward operations within that range. FIG. 4 shows an example in which an operation of pushing one workpiece W1 by moving the movable part 2 as indicated by arrow A1 is selected. FIG. 5 shows an example in which an operation of inserting a new workpiece W2 is selected.

Next, the operation determining unit 11 predicts the state vector set $X^k$ at the next discrete time $k$ from the state vector set $X^{k-1}$ and the selected operation $u^{k-1}$ (step S4). The prediction is performed using the prediction model 111.

Next, the operation determining unit 11 determines whether the prediction has reached a predetermined maximum prediction step (the prediction horizon) (step S5); if NO, it returns the processing to step S3 and repeats steps S3 to S5. Because the computational load grows as the prediction horizon covers more steps, the prediction horizon should be set to a suitable number of steps, for example about three.

By repeating steps S3 to S5, the state vector set $X^{k+2}$ at the prediction horizon (for example, three operation steps ahead), reached by applying multiple operations to the state initialized in step S2, is estimated. FIGS. 4 and 5 show examples in which the prediction horizon is one operation step ahead.

When the determination in step S5 is YES, the operation determining unit 11 has the evaluation processing unit 112 calculate the evaluation value of the predicted state vector set $X^{k+2}$ (step S6). The evaluation processing unit 112 inputs the state vector set $X^{k+2}$ to the evaluation function $L$ and calculates the evaluation value. In FIGS. 4 and 5, the prediction horizon is one operation step ahead, so the state vector set to be evaluated is $X^k$. In the example of FIG. 4, the value of the evaluation function $L_A$ calculated from the predicted state vector set $X^k$ improves and the overall evaluation also improves, so the selected operation $u^{k-1}$ is judged to be a good operation. In the example of FIG. 5, the value of the evaluation function $L_B$ calculated from the predicted state vector set $X^k$ improves and, taken together with the change in the gap-related evaluation function $L_A$, the overall evaluation improves, so the selected operation $u^{k-1}$ is judged to be a good operation. Depending on which operations are selected, evaluation values ranging from high to low are calculated.

Next, the operation determining unit 11 determines whether the number of evaluations in step S6 has reached a predetermined maximum number of evaluations (step S7); if NO, it returns the processing to step S2 and repeats the processing from step S2. By repeating steps S2 to S7, prediction results for various operations, and evaluation values based on them, are obtained for the maximum number of evaluations.

When the determination in step S7 is YES, the operation determining unit 11 compares the evaluation values obtained by repeating steps S2 to S7 and selects, as the operation to execute next, the operation of the first operation step that was chosen in the prediction horizon with the highest evaluation value (step S8).
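Steps S2 to S8 amount to sampling operation sequences, rolling each one out through the prediction model, and keeping the first operation of the best-scoring sequence. A minimal Python sketch is given below; the callables `candidates`, `predict`, and `evaluate` are hypothetical stand-ins for the operation selection, the prediction model 111, and the evaluation function $L$, and purely random selection is only one of the selection strategies mentioned above.

```python
import random
from typing import Any, Callable, Sequence, Tuple


def choose_next_operation(
    state: Any,
    candidates: Callable[[Any], Sequence[Any]],  # S3: applicable operations
    predict: Callable[[Any, Any], Any],          # S4: stand-in for model 111
    evaluate: Callable[[Any], float],            # S6: stand-in for L
    horizon: int = 3,
    max_evaluations: int = 100,
    seed: int = 0,
) -> Tuple[Any, float]:
    """Return the first operation of the best sampled sequence and its value."""
    if horizon < 1 or max_evaluations < 1:
        raise ValueError("horizon and max_evaluations must be at least 1")
    rng = random.Random(seed)
    best_first, best_value = None, float("-inf")
    for _ in range(max_evaluations):             # S7: repeat until the limit
        x = state                                # S2: reset to measured state
        first = None
        for step in range(horizon):              # S5: up to the horizon
            u = rng.choice(list(candidates(x)))  # S3: random selection
            if step == 0:
                first = u
            x = predict(x, u)                    # S4: predicted next state
        value = evaluate(x)                      # S6: score horizon state
        if value > best_value:
            best_first, best_value = first, value
    return best_first, best_value                # S8: best first operation


# Toy problem: the state is an integer, operations add to it, target is 6.
first_op, value = choose_next_operation(
    0,
    candidates=lambda x: [-1, 1, 2],
    predict=lambda x, u: x + u,
    evaluate=lambda x: -abs(x - 6),
    horizon=3,
    max_evaluations=2000,
)
```

Only the first operation is executed before the state is measured again, so prediction errors at later steps of the horizon are corrected on the next cycle rather than accumulated.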

制御部１０では、操作決定部１１が次の操作を決定したら、操作制御部１２が、可動部２を制御して操作を実行させる（ステップＳ９）。そして、制御部１０は、終了条件に達したか否かを判別し（ステップＳ１０）、達していれば作業処理を終了し、達していなければ、処理をステップＳ１に戻して、ステップＳ１からの処理を繰り返す。終了条件は、例えば、操作の実行後に計測された状態ベクトル集合Ｘ^ｋに基づく条件（例えば、容器内のワークの個数が最大詰込み数に達した等）、あるいは、最大繰り返し回数に達した場合等から適宜定められればよい。 In the control section 10, when the operation determining section 11 determines the next operation, the operation control section 12 controls the movable section 2 to execute the operation (step S9). Then, the control unit 10 determines whether or not the end condition has been reached (step S10). If the end condition has been reached, the work process is ended; if the end condition has not been reached, the process returns to step S1, and the process starts from step S1. Repeat the process. The termination condition may be, for example, a condition based on the state vector set ^Xk measured after the execution of the operation (for example, the number of workpieces in the container has reached the maximum number of fillings), or a case where the maximum number of repetitions has been reached. It may be determined as appropriate from the following.

ステップＳ１～Ｓ９の処理が繰り返されることで、評価関数Ｌの値を高くする操作が選択されかつ実行されていき、作業の目的を達成する自動運転が実現される。 By repeating the processing of steps S1 to S9, an operation that increases the value of the evaluation function L is selected and executed, and automatic operation that achieves the purpose of the work is realized.

以上のように、実施形態１の作業装置１によれば、複数のワークの操作が可能な可動部２と、複数のワークの位置を取得する撮影部３と、操作後の複数のワークの配置を予測して複数のワークに対する可動部２の操作を決定する操作決定部１１と、操作決定部１１が決定した操作を可動部２に行わせる操作制御部１２とを備える。したがって、相互に作用する複数のワークに対して、目標の作業（容器に多くのワークを詰める動作等）を達成する自動運転を実現できる。 As described above, according to the working device 1 of the first embodiment, the movable part 2 that can operate a plurality of workpieces, the imaging part 3 that acquires the positions of the plurality of workpieces, and the arrangement of the plurality of workpieces after the operation. The apparatus includes an operation determining section 11 that predicts the operation of the movable section 2 for a plurality of workpieces and determines the operation of the movable section 2 for a plurality of workpieces, and an operation control section 12 that causes the movable section 2 to perform the operation determined by the operation determining section 11. Therefore, it is possible to realize automatic operation that accomplishes a target task (such as filling a container with many works) for a plurality of works that interact with each other.

Furthermore, the working device 1 of Embodiment 1 includes the machine-learned prediction model 111, which predicts the states of the plurality of workpieces after an operation, including changes in arrangement caused by interactions among the workpieces, and the evaluation processing unit 112, which evaluates the predicted states of the plurality of workpieces. The operation to be executed next is determined based on the prediction and the evaluation. An operation suited to the purpose can therefore be determined with a small computational load.

The working device 1 of Embodiment 1 can thus automate the operation of packing workpieces into a container.

(Embodiment 2)
FIG. 6 is a block diagram showing a working device according to Embodiment 2 of the present invention. The working device 1 of Embodiment 2 is a device that automatically transports earth and sand, and aims to efficiently produce a target earth-and-sand shape. In Embodiment 2, earth and sand serve as both the objects to be operated on and the objects whose state is predicted.

As shown in FIG. 6, in the working device 1 the movable part 2A is a power shovel (shovel, crawler, swing device, etc.), and a setting processing unit 13, through which a user can set target state data, is added to the control unit 10. The setting processing unit 13 has a setting unit 131 that stores the target state setting data. The other components are the same as in Embodiment 1.

In the prediction model 111, the arrangement and density of the earth and sand are used as the states of the plurality of objects, and the model predicts the state of the earth and sand after each operation, such as scooping earth and sand, transporting the scooped earth and sand, and dumping it. A machine-learned neural network can be used as the prediction model 111. Because an operation on earth and sand has little interaction with earth and sand far from the operated location, the prediction model 111 may have a function that reduces the amount of computation by screening out calculations for objects in areas where the interaction is very small. In addition, because treating each individual grain as one object would make the amount of computation enormous, the prediction model 111 may treat a predetermined lump of earth and sand as one object.
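The two computation-saving ideas in the preceding paragraph (treating a lump of earth and sand as one object, and screening out regions far from the operated location) can be illustrated with the sketch below. It is an assumption-laden example rather than the method of the embodiment: the grid-cell grouping, the Chebyshev-distance cutoff, and the names `coarse_grain` and `screen_near` are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Cell = Tuple[int, int]


def coarse_grain(particles: Iterable[Tuple[float, float]], cell_size: float) -> Dict[Cell, int]:
    """Group (x, z) grain positions into square cells and count grains per cell.

    Each occupied cell then stands in for one "lump" of earth and sand.
    """
    cells: Dict[Cell, int] = defaultdict(int)
    for x, z in particles:
        cells[(math.floor(x / cell_size), math.floor(z / cell_size))] += 1
    return dict(cells)


def screen_near(cells: Dict[Cell, int], operation_cell: Cell, radius: int) -> Dict[Cell, int]:
    """Keep only cells within `radius` cells (Chebyshev distance) of the operated cell.

    Cells outside this window are assumed to interact negligibly and are
    excluded from the prediction input.
    """
    ox, oz = operation_cell
    return {
        cell: count
        for cell, count in cells.items()
        if max(abs(cell[0] - ox), abs(cell[1] - oz)) <= radius
    }
```

For example, grains at (0.1, 0.1), (0.2, 0.3), (1.5, 0.2) and (9.0, 0.0) with a cell size of 1.0 collapse into three lumps, and a radius of 2 around cell (0, 0) drops the distant lump at cell (9, 0).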

As in Embodiment 1, the evaluation processing unit 112 calculates an evaluation value for the predicted arrangement of the earth and sand based on an evaluation function designed in advance. The evaluation function includes a function that uses the target state data registered in the setting unit 131, for example a function that yields a higher evaluation value as the difference between the predicted earth-and-sand shape and the target earth-and-sand shape becomes smaller.
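One simple function with the property described above (a smaller difference from the target gives a higher value) is the negated sum of absolute height differences. The sketch below assumes, purely for illustration, that both shapes are sampled as height profiles at the same horizontal positions; the name `shape_score` and this particular distance measure are not taken from the embodiment.

```python
from typing import Sequence


def shape_score(predicted_heights: Sequence[float], target_heights: Sequence[float]) -> float:
    """Return a score that grows as the predicted profile approaches the target.

    The score is the negated sum of absolute height differences, so a perfect
    match scores 0 and every deviation lowers the score.
    """
    if len(predicted_heights) != len(target_heights):
        raise ValueError("profiles must be sampled at the same positions")
    return -sum(abs(p - t) for p, t in zip(predicted_heights, target_heights))
```

Other measures, such as a squared difference or a weighted difference that penalizes excess material more than missing material, would satisfy the same requirement; the choice depends on the work objective.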

FIG. 7 is a diagram illustrating the automatic operation processing of the working device of Embodiment 2. Line L0 in FIG. 7 indicates the target earth-and-sand shape registered in the setting unit 131. In the working device 1 of Embodiment 2, as in Embodiment 1, during the work process of the control unit 10 the operation determining unit 11 determines an operation based on the prediction of the earth-and-sand arrangement using the prediction model 111 and the evaluation value calculated by the evaluation processing unit 112, and the operation control unit 12 causes the movable part 2A to execute that operation. By repeating such operations, transport of earth and sand that matches the target shape is realized through automatic operation.

As described above, the working device 1 of Embodiment 2 includes the setting processing unit 13, through which the user can set target state data, and the evaluation processing unit 112 calculates the evaluation value using the target state data. Sites whose target state (target earth-and-sand shape) differs can therefore each be handled by setting the target state data accordingly.

The embodiments of the present invention have been described above. However, the present invention is not limited to these embodiments. For example, because the above embodiments deal with objects that come to rest after being operated on, they use the position of an object as the object state acquired by the state acquisition unit and as the object state predicted by the prediction model. However, the object to be operated on may be a moving object or an object in which various physical quantities such as temperature, frictional resistance, weight, current, or voltage change. In that case, the object state acquired by the state acquisition unit and the object state predicted by the prediction model may include, in addition to position, velocity, angular velocity, and various other physical quantities. The prediction model may perform prediction with these physical quantities included in the object's state vector, and a device that measures these physical quantities may be used as the state acquisition unit. Other details shown in the embodiments can be changed as appropriate without departing from the spirit of the invention.
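As an illustration of a state vector extended with such physical quantities, the sketch below shows one possible layout. The class name `ObjectState`, its field names, and the fixed ordering of the vector are hypothetical; the description does not prescribe any particular data structure.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class ObjectState:
    """State of one object: position plus optional motion and physical quantities."""

    position: Vec3
    velocity: Vec3 = (0.0, 0.0, 0.0)
    angular_velocity: Vec3 = (0.0, 0.0, 0.0)
    # Further measured quantities, e.g. {"temperature": 20.0}; keys are illustrative.
    extras: Dict[str, float] = field(default_factory=dict)

    def to_vector(self, extra_keys: Sequence[str]) -> List[float]:
        """Flatten into one list: position, velocity, angular velocity, then extras.

        `extra_keys` fixes the order of the extra quantities so that every
        object yields a vector with the same layout; a missing key raises KeyError.
        """
        return [
            *self.position,
            *self.velocity,
            *self.angular_velocity,
            *(self.extras[key] for key in extra_keys),
        ]
```

An object at rest keeps the default zero velocities, so the position-only case of the embodiments is the special case in which the remaining entries are constant.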

1 Working device
2, 2A Movable part
3 Imaging unit
10 Control unit
11 Operation determining unit
12 Operation control unit
111 Prediction model
112 Evaluation processing unit
C1 Container
W, W1, W2 Workpiece

Claims

A working device that stores a plurality of objects in a container, comprising:
a movable part capable of operating on the plurality of objects;
a state acquisition unit that acquires the states of the plurality of objects;
an operation determining unit that predicts a change in the states of the plurality of objects after an operation and determines an operation for the plurality of objects; and
an operation control unit that causes the movable part to perform the operation determined by the operation determining unit,
wherein the operation determining unit has a machine-learned prediction model that predicts the states of the plurality of objects after the operation by the movable part, including state changes due to interactions among the plurality of objects, and an evaluation processing unit that evaluates a prediction result obtained using the prediction model, and determines the operation based on the prediction using the prediction model and the evaluation by the evaluation processing unit, and
the evaluation processing unit evaluates the prediction result based on the gaps between the plurality of objects in the container and the number of objects in the container.

A working device that transports earth and sand as a plurality of objects, comprising:
a movable part capable of operating on the plurality of objects;
a state acquisition unit that acquires the states of the plurality of objects;
an operation determining unit that predicts a change in the states of the plurality of objects after an operation and determines an operation for the plurality of objects; and
an operation control unit that causes the movable part to perform the operation determined by the operation determining unit,
wherein the operation determining unit has a machine-learned prediction model that predicts the states of the plurality of objects after the operation by the movable part, including state changes due to interactions among the plurality of objects, and an evaluation processing unit that evaluates a prediction result obtained using the prediction model, and determines the operation based on the prediction using the prediction model and the evaluation by the evaluation processing unit, and
the evaluation processing unit evaluates the prediction result based on a comparison between the earth-and-sand shape of the prediction result and a target earth-and-sand shape.

A working device comprising:
a movable part capable of operating on a plurality of objects;
a state acquisition unit that acquires the states of the plurality of objects;
an operation determining unit that predicts a change in the states of the plurality of objects after an operation and determines an operation for the plurality of objects;
an operation control unit that causes the movable part to perform the operation determined by the operation determining unit; and
a setting processing unit capable of setting target state data of the plurality of objects,
wherein the operation determining unit has a machine-learned prediction model that predicts the states of the plurality of objects after the operation by the movable part, including state changes due to interactions among the plurality of objects, and an evaluation processing unit that evaluates a prediction result obtained using the prediction model, and determines the operation based on the prediction using the prediction model and the evaluation by the evaluation processing unit, and
the evaluation processing unit evaluates the prediction result using the target state data.

The working device according to any one of claims 1 to 3, wherein
the movable part is a robot hand, and
the plurality of objects are accommodated in a container by operation of the movable part.

The working device according to claim 2 or 3, wherein
the movable part is a shovel,
the plurality of objects are earth and sand, and
the earth and sand are transported by operation of the movable part.

The working device according to any one of claims 1 to 5, wherein
the prediction model is a neural network that handles simulation of a many-body problem.

The working device according to any one of claims 1 to 6, wherein
the operation determining unit determines an operation that changes the arrangement of some of the plurality of objects, an operation that adds an object, or an operation that includes both.