JP2009066692A - Trajectory searching device

Info

Publication number: JP2009066692A
Application number: JP2007236561A
Authority: JP
Inventors: Komei Sugiura; 孔明杉浦; Naoto Iwahashi; 直人岩橋
Original assignee: ATR Advanced Telecommunications Research Institute International; National Institute of Information and Communications Technology
Current assignee: ATR Advanced Telecommunications Research Institute International; National Institute of Information and Communications Technology
Priority date: 2007-09-12
Filing date: 2007-09-12
Publication date: 2009-04-02
Anticipated expiration: 2027-09-12
Also published as: JP5141876B2

Abstract

PROBLEM TO BE SOLVED: A robot that can execute only the motions it has learned has a closed range of application. To provide a robot that adapts flexibly to environmental changes and performs motions more advanced than those it has learned.
SOLUTION: A trajectory search device searches for the most likely trajectory using probability models that represent learned motions and probability models that represent combined motions obtained by combining any of those models.
COPYRIGHT: (C)2009, JPO&INPIT

Description

The present invention relates to technology for having a robot imitate human motion, and more particularly to an improvement in a trajectory search device that searches for the trajectory along which a robot performs a motion.

Imitation learning refers to having a robot learn a motion by having a user demonstrate it. A motion learned by the robot is called a motion primitive, because it is a basic motion that the robot can perform. Motion primitives can be classified into motions that manipulate an object and motions that do not; the former are the more important in practice. The object being manipulated is called the trajector.

In learning a motion primitive, the robot films the user's demonstration with a camera and captures the trajectory of the trajector. Repeating this capture many times allows the general characteristics of the motion primitive to be extracted, which is done by modeling the captured trajectories with a probability model. A hidden Markov model (HMM), for example, is used as the probability model. It expresses how the state of the trajector should change over time, using probability distributions of the time-varying position, velocity, and acceleration together with the transition probabilities between states.

After learning, when the user commands a motion primitive, the robot searches for the most likely trajectory under the probability model that represents the commanded primitive, and thereby obtains the optimal trajectory for the trajector to follow.
The following non-patent documents are cited as related prior art.
Iwahashi, N.: Robots That Learn Language: Developmental Approach to Human-Machine Conversations, Symbol Grounding and Beyond: Proceedings of the Third International Workshop on the Emergence and Evolution of Linguistic Communication (Vogt, P. et al. (eds.)), Springer, pp. 143-167 (2006).
Haoka, T. and Iwahashi, N.: "Learning the Concept of Spatial Movement Depending on Reference Points for Language Acquisition", IEICE Technical Report, PRMU2000-105, pp. 39-46 (2000).
Tokuda, K., Kobayashi, T. and Imai, S.: Speech parameter generation from HMM using dynamic features, Proceedings of International Conference on Acoustics, Speech, and Signal Processing, pp. 660-663 (1995).

The robot can execute a learned motion primitive when commanded to do so. However, a robot that can execute only the motion primitives it has learned is of little practical use.
For example, suppose a robot is to be taught, for the sake of automation, the task of sorting many parts by type in a factory and sending them to the production line of the desired destination. Because such deliveries follow many different patterns, the user must have the robot learn a large number of motion primitives, and demonstrating each task in front of the camera takes many man-hours. This rules out practical use on production floors that face tight daily production schedules, and confines imitation learning to toys and amusement-facility attractions at best. The range of application is consequently closed off.

An object of the present invention is to provide a trajectory search device that, on the basis of a limited set of learned motion primitives, searches for a trajector trajectory that realizes more advanced motions.

To solve the above problem, the trajectory search device according to the present invention is a trajectory search device that searches for a trajectory along which a trajector is moved, comprising: receiving means for receiving from a user a trajector movement instruction that includes motion primitives that a drive device for moving the trajector can perform; and search means for generating a motion sequence made up of the motion primitives included in the instruction received by the receiving means, and for searching for the maximum likelihood trajectory within the scope of the generated motion sequence. Each motion primitive that the drive device can perform is represented by a probability model, which defines how the state of the trajector should change over time using a probability distribution for each time point and the transition probabilities between states. In the trajectory search, the search means combines, for the motion sequence, the probability models representing the motion primitives, and searches for the trajectory that maximizes the likelihood under the combined probability model thus obtained.

The trajectory search device of the present invention searches for the maximum likelihood trajectory of the trajector under a combined probability model formed by combining the probability models that represent learned motion primitives. Because motion primitives can be combined in any number of ways, the robot can be made to perform a wide variety of motions after learning only a small number of primitives. In other words, the robot can be given broad adaptability. Several motions could be performed in succession simply by executing the motion primitives one after another, but the motion would then stop each time a primitive ends, and the acceleration may change abruptly between motions, which is dangerous. The trajectory search device of the present invention instead performs the combination at the level of the probability models that represent the motion primitives, and searches for the trajector trajectory on the combined probability model. The whole motion is thus executed as a single combined motion with no seams between motion primitives, so the robot can move naturally, as a human does when performing motions in succession.

Here, the probability models representing the motion primitives may be combined as follows: for each motion primitive to be executed second or later in the motion sequence, the means of the trajector's position, velocity, and acceleration in each state are calculated from the means of the trajector's position, velocity, and acceleration in that state, the mean of the trajector's position in the initial state of that motion primitive, and the mean of the position in the final state of the immediately preceding motion primitive. Of the means that parameterize a probability model, the position means can thereby be coordinate-transformed so that the initial state of the motion primitive is continuous with the final state of the immediately preceding one, which ensures continuity of the trajector trajectory.
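The continuity condition described above can be illustrated with a minimal sketch. It assumes one plausible reading of the passage: the position means of the later primitive are translated so that its initial-state mean coincides with the final-state mean of the preceding primitive, while the velocity and acceleration means are left unchanged. The function name `connect_means` and the dictionary layout are hypothetical and do not come from the specification.

```python
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]
State = Dict[str, Vec3]  # hypothetical layout: mean vectors under "pos", "vel", "acc"


def connect_means(prev: List[State], nxt: List[State]) -> List[State]:
    """Concatenate two primitives, translating the position means of `nxt`
    so that its initial state starts where `prev` ends (assumed reading)."""
    end = prev[-1]["pos"]    # position mean in the final state of the preceding primitive
    start = nxt[0]["pos"]    # position mean in the initial state of the current primitive
    offset = tuple(e - s for e, s in zip(end, start))
    shifted = []
    for st in nxt:
        pos = tuple(x + d for x, d in zip(st["pos"], offset))
        # Assumption: velocity and acceleration means are not affected by a translation.
        shifted.append({"pos": pos, "vel": st["vel"], "acc": st["acc"]})
    return prev + shifted
```

With this translation the first state of the later primitive has the same position mean as the last state of the earlier one, which is the continuity property the passage asks for.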

Here, the trajectory search device may comprise capture means for capturing the trajectory of the trajector and recognition means for recognizing stationary objects placed in the space in which the drive device operates. In combining the probability models representing the motion primitives, it is determined whether the reference point on which the motion primitive to be combined is based coincides with the position of any recognized stationary object. If it coincides, the variance in each state of a motion primitive to be executed second or later in the motion sequence is set to that primitive's own variance in that state. If it coincides with none of the positions, the variance in each state is set to the sum of the primitive's own variance in that state and the variance in the final state of the immediately preceding motion primitive. Because the variance is enlarged when the reference point of the motion primitive to be combined does not coincide with the position of a stationary object, the range of trajectories available to the combined motion as a whole is widened. When the reference point does coincide with the position of a stationary object, the variance of the original probability model is used unchanged, so the trajector can be moved without drifting from the reference point.
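The variance rule above amounts to a two-way branch, sketched below. The function name `combined_variances`, the per-axis tuple representation of a variance, and the matching tolerance `tol` are assumptions made for illustration; the specification does not say how coincidence between a reference point and an object position is tested.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def combined_variances(prev_final_var: Vec3,
                       next_vars: Sequence[Vec3],
                       reference_point: Vec3,
                       stationary_positions: Sequence[Vec3],
                       tol: float = 1e-6) -> List[Vec3]:
    """Variances for the states of a primitive executed second or later."""
    # Assumption: "coincides" is tested per axis within a small tolerance.
    matches = any(
        all(abs(a - b) <= tol for a, b in zip(reference_point, pos))
        for pos in stationary_positions
    )
    if matches:
        # Reference point sits on a stationary object: keep the original variances.
        return [tuple(v) for v in next_vars]
    # Otherwise add the variance of the preceding primitive's final state.
    return [tuple(a + b for a, b in zip(v, prev_final_var)) for v in next_vars]
```

Keeping the variances unchanged in the matching case reflects the stated aim of moving the trajector without drifting from the reference point, and the summed variances in the other case give the combined motion more room.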

The trajector movement instruction may include an initial position and a target position of the trajector, and the search means may generate a plurality of motion sequences of motion primitives corresponding to paths from the initial position to the target position and combine the probability models for each of those sequences. In searching for a trajectory that connects the specified initial and target positions, several motion sequences made up of learned motion primitives are generated, and a trajectory is searched for on the probability model representing each combined motion, so flexible trajectories not limited to single motion primitives can be found.
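The idea of generating several candidate motion sequences and keeping the one whose combined model yields the most likely trajectory can be sketched as follows. This is only an illustration: the exhaustive enumeration up to `max_len`, the function names, and the `log_likelihood` callback are hypothetical. The specification does not state how candidate sequences are generated, and in a real system the callback would stand for the trajectory search on the combined probability model.

```python
from itertools import product
from typing import Callable, Iterable, Sequence, Tuple

MotionSequence = Tuple[str, ...]


def candidate_sequences(primitives: Sequence[str], max_len: int) -> Iterable[MotionSequence]:
    """Enumerate every sequence of learned primitive names up to max_len (assumed strategy)."""
    for n in range(1, max_len + 1):
        yield from product(primitives, repeat=n)


def best_sequence(primitives: Sequence[str],
                  max_len: int,
                  log_likelihood: Callable[[MotionSequence], float]) -> MotionSequence:
    """Return the candidate whose best trajectory scores highest.

    `log_likelihood` is a placeholder for: combine the models of the sequence,
    search the maximum likelihood trajectory, and return its log-likelihood.
    """
    return max(candidate_sequences(primitives, max_len), key=log_likelihood)
```

Exhaustive enumeration grows exponentially with `max_len`, so a practical search would need to prune candidates; the sketch only shows the selection step.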

The trajector movement instruction may include a plurality of pairs of a motion name of a motion primitive and a landmark, and the motion primitives whose probability models the search means combines may be the motion primitives corresponding to the motion names included in the movement instruction. Because the user can specify in detail which motion primitives are to be performed and in what order when commanding a trajector movement, the motion can be executed along a different trajectory even when the maximum likelihood trajectory found by the search is unsuitable, for example when an obstacle must be avoided.

In the combination, a coordinate transformation may be performed to unify the coordinate systems in which the two probability models to be combined are defined. When the means of the probability models are combined, the coordinate axes of models defined in different coordinate systems are transformed so that the coordinate systems are unified. Because the coordinate systems are unified at combination time, each probability model can be defined in a coordinate system intrinsic to its motion primitive. It is therefore unnecessary to learn multiple motion primitives that are equivalent under a coordinate transformation.

The coordinate transformation may be performed by affine-transforming the coordinate system in which the probability model of the current motion primitive is defined onto the coordinate system in which the probability model of the immediately preceding motion primitive is defined. Of the two motion primitives to be combined, the coordinate system of the later-executed primitive's probability model is affine-transformed to match that of the earlier-executed primitive's probability model, so the transformation proceeds successively until the coordinate systems of the probability models of all motion primitives in the motion sequence coincide.

The probability model representing a motion primitive may be determined on the basis of a motion demonstrated by the user. Because a motion primitive is learned so as to reproduce the motion the user demonstrated, the robot's motion can be made to resemble the smooth motion of a human.
Before the drive device performs the motion represented by the trajectory of maximum likelihood, the user may be asked for confirmation, and the trajectory may be withheld from output if the user does not consent. Having the user confirm the maximum likelihood trajectory found by the trajectory search device before the drive device acts on it prevents unintended dangerous motions in advance.

This embodiment considers a system that manipulates objects by moving a robot arm manipulator such as that shown in FIG. 1. FIG. 1 shows the configuration of a manipulator control system using the trajectory search device 300 of this embodiment. The manipulator control system comprises a manipulator 500 that performs commanded motions, a microphone for entering commands by voice, a speech recognition engine 100 that analyzes the entered commands, a camera that films the surroundings in which a command is executed, an image recognition engine 200 that analyzes the captured video, the trajectory search device 300, which searches for motion trajectories and learns motions from the entered commands and video, and a control device 400 that generates the control parameters for controlling the manipulator 500 so as to realize the trajectory found by the trajectory search device 300.

The trajectory search device 300 of this embodiment has a learning mode for learning motions and a trajectory search mode for searching for a trajectory that performs a combined motion, in which learned motions are combined.
Before the configuration of each mode is described in detail, the terms and concepts used in this specification are explained.
The motions a robot performs include motions that do not manipulate an object, such as "walk", "fly", and "turn", and motions that do, such as (a) "raise the pen", (b) "put the pen on the box", and (c) "make the pen jump over the box" in FIG. 2. When a robot is actually put to work, motions that manipulate objects are far more useful than those that do not, so this specification considers learning motions that manipulate objects. In cognitive linguistics, among the entities brought into focus in the process by which a subject construes the external world, the one perceived as relatively prominent is called the trajector, and an object that situates it as background is called a landmark. In a motion that manipulates an object, the object that is acted on and moves is the trajector, and the other objects are landmarks. In the example of FIG. 2, the "pen" is the trajector and the "box" is the landmark.

Learning a motion means generalizing the motion demonstrated by the user so that a similar motion can be reproduced. The point is that the motion is similar, not identical. If the motion were exactly the same, the robot could easily reproduce it by storing the path of the object as coordinates. In the motion "put the pen on the box", however, the positional relationship between the "pen" and the "box" varies with the situation, and merely replaying a stored path will not put the pen on the box. Learning here means recognizing the positions of the "pen" and the "box" and storing the generalized motion of lifting the "pen", bringing it toward the top of the "box", and lowering it onto the upper surface of the "box".

In the learning mode, the manipulator control system first learns the motion primitives that serve as the basic motions of the manipulator 500. In the trajectory search mode, it then detects that a command has been given, either by the user's voice input from the microphone or by a user gesture filmed by the camera, combines several of the learned motion primitives, and searches for the trajectory that performs the motion best suited to executing the command. The trajectory found is sent to the control device 400, converted into control parameters for the manipulator 500, and used to actually move the manipulator 500. If a single learned motion primitive is itself the best motion for executing the command, no combination is performed and the trajectory is searched for under the probability model representing that motion primitive.

A command here is, for example, "put the doll on the box". When the user says "put the doll on the box" into the microphone, the speech recognition engine 100 extracts the motion name "put on" and recognizes that the object to be acted on, that is, the trajector, is the "doll" and that the target position of the motion is the "box". In this embodiment, the trajector is assumed to be the only moving object. Objects other than the trajector, which are at rest, are called stationary objects. To recognize where the trajector and the stationary objects are, the camera films the scene and the image recognition engine 200 analyzes the captured video.

The configurations of the learning mode and the trajectory search mode are described below.
<Configuration of the learning mode>
This section describes the configuration of the learning mode of the manipulator control system.
A motion is learned by the user issuing a command and demonstrating, many times over, the motion the robot is to perform. To teach the motion "put on", for example, the user shows the robot many instances such as "put the pen on the box", "put the doll on the shelf", and "put the red box on the blue box", and the robot learns by generalizing them. This is done by extracting the motion of the trajector and the positions of the stationary objects from the input video, searching for the coordinate axes and origin that generalize the trajector's motion, and determining a probability model that yields that motion.

FIG. 3 shows examples of input video. FIG. 3(a) is an example in which, with three boxes (red, blue, and green) placed side by side, the motion "put the red box on the blue box" is demonstrated. FIG. 3(b) is an example in which the motion "put the green box on the red box" is demonstrated next. Both demonstrate "put on", but they differ in target position and direction of movement. They share the property that the movement is directed toward the target position.
<Probability model>
This section explains how a motion is represented by a probability model. A motion is the movement of the trajector and can be expressed as the change over time in the trajector's position, velocity, and acceleration. If a motion is pinned down exactly by those time courses, however, its inherent vagueness is lost. When a person performs the motion "raise the pen", for example, the pen follows a path whose vertical coordinate increases with time, but nothing requires it to have risen to a particular position at a particular instant, and a small horizontal deviation does not take the motion outside the category "raise". A probability model expresses the trajector's motion while allowing for this vagueness by giving the probability that, at a given time, the trajector is in a state within some range centered on some position, together with the transition probabilities between states.

FIG. 4 is a conceptual diagram of a motion primitive represented by a probability model λ. In three-dimensional space, it depicts each state s of the trajector under the probability model λ at time t_s, with the mean E of the position x_s(λ) shown as the black dot at the center of an ellipsoid and the variance V shown as the size of the ellipsoid. The diagram indicates that the trajector is likely to be near the origin at time t_0 and increasingly likely to be farther from the origin as time advances through t_1, t_2, .... To specify a motion, means and variances must be considered for velocity and acceleration as well as for position. FIG. 4 shows only the three spatial coordinates for ease of drawing, but the model is actually defined as a probability model on a nine-dimensional space comprising position, velocity, and acceleration.

In this specification, "_" denotes a subscript and "^" denotes a superscript.
FIG. 5 shows the data structure of the probability model λ. The probability model λ is defined by a sequence of parameters giving the probability distribution of being in state s at each time t_s, and by the transition probabilities between states. The parameters at each time t_s consist of a mean E and a variance V for each of the position x_s, velocity x'_s, and acceleration x''_s. The position mean is written (p, q, r), the velocity mean (v_p, v_q, v_r), the acceleration mean (a_p, a_q, a_r), the position variance (V_p, V_q, V_r), the velocity variance (V'_p, V'_q, V'_r), and the acceleration variance (V''_p, V''_q, V''_r). Of these parameters, those for position are as depicted by the ellipsoids in FIG. 4.
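The parameter layout described for FIG. 5 can be mirrored in a small data structure. The sketch below is an assumption about one convenient encoding, not the structure shown in the figure: the class names `StateParams` and `MotionPrimitiveModel`, the field names, and the choice of a square matrix for the transition probabilities are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class StateParams:
    """Means and variances for one state s at time t_s."""
    pos_mean: Vec3  # (p, q, r)
    vel_mean: Vec3  # (v_p, v_q, v_r)
    acc_mean: Vec3  # (a_p, a_q, a_r)
    pos_var: Vec3   # (V_p, V_q, V_r)
    vel_var: Vec3   # (V'_p, V'_q, V'_r)
    acc_var: Vec3   # (V''_p, V''_q, V''_r)

    def mean_vector(self) -> Tuple[float, ...]:
        """Nine-dimensional mean: position, then velocity, then acceleration."""
        return self.pos_mean + self.vel_mean + self.acc_mean


@dataclass
class MotionPrimitiveModel:
    """Probability model lambda for one motion primitive."""
    name: str                       # motion name, e.g. "raise"
    states: List[StateParams]       # one entry per state
    transitions: List[List[float]]  # assumed: transitions[i][j] = P(next state j | state i)

    def num_states(self) -> int:
        return len(self.states)
```

The nine-element mean vector corresponds to the nine-dimensional space of position, velocity, and acceleration on which the model is said to be defined.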

Note that in this specification a prime is used to denote the first and second time derivatives of the position x, as in the velocity x' and the acceleration x'', but a prime attached to the variance V or to the probability model Λ does not denote a time derivative.
<Internal configuration of the trajectory search device 300 in the learning mode>
FIG. 6 shows the internal configuration of the trajectory search device 300 of this embodiment in the learning mode. The trajectory search device 300 comprises an input/output interface 1, a database storage unit 2, a processor 3, a work memory 4, and a program storage unit 5. The database storage unit 2 is a recording device such as a hard disk that stores learning results, the work memory 4 is RAM used for computation, and the program storage unit 5 is ROM in which the learning program and the trajectory search program are recorded.

In learning, the user speaks into the microphone shown in FIG. 1 the name of the motion the robot is to learn, and the motion name is input via the speech recognition engine 100 and the input/output interface 1 of FIG. 6 to the processor 3. The user then demonstrates the motion, which is filmed by the camera shown in FIG. 1; the image recognition engine 200 extracts the trajectory traced by the trajector and the set of positions of the surrounding stationary objects, which are input from the input/output interface 1 of FIG. 6 to the processor 3. Because the user demonstrates the motion many times so that the robot can learn it, a trajector trajectory Y_i and a stationary-object position set O_i are input for each demonstration.

In the learning mode, the trajectory search device 300 causes the processor 3 to execute the learning program stored in the program storage unit 5. From the pairs of trajector trajectories Y_i and stationary-object position sets O_i that correspond to the motion name input through the input/output interface 1, the learning program searches, on the work memory 4, for the coordinate system and origin that generalize the trajector trajectories and for the probability model that represents the motion. The parameters of the probability model found by the search are stored in the database storage unit 2 in association with the motion name.
<Formulation of motion learning using a probability model>
This section describes the formulation of learning.

Let V = {V_1, V_2, ..., V_N} be the set of N input videos that capture the motions demonstrated by the user to teach the robot. Each video V_i is assumed to contain exactly one moving object and several stationary objects in the background. For example, when the motion "place on" is taught, the videos V_i show motions such as (a) placing the red box on the blue box and (b) placing the green box on the red box in FIG. 3.

The trajector and the stationary objects are recognized by extracting the moving object and the stationary objects from each video V_i. Let Y_i be the trajectory of the trajector captured from the video V_i, and O_i the set of positions of the stationary objects. In FIG. 3(a), Y_i is the trajectory of the red box, which is lifted and then lowered while moving from left to right, and O_i consists of the positions of the blue box and the green box. In FIG. 3(b), Y_i is the trajectory of the green box, which is lifted and then lowered while moving from right to left, and O_i consists of the positions of the blue box and the red box.

The trajectory Y_i of the trajector is a time series of the position x_t, velocity x'_t, and acceleration x''_t of the trajector, which are written together as y_t = [x_t, x'_t, x''_t]^T. Here, T denotes the transpose.
To express the trajectory of the trajector numerically, the origin of the coordinate system and the orientation of the coordinate axes must be fixed. The origin of the coordinate system is called the reference point because it serves as the basis for expressing the trajectory numerically. The reference point is selected from the landmarks. The landmarks form the set L_i = {l^(i)_1, l^(i)_2, ..., l^(i)_M_i}, which consists of the stationary-object position set O_i, the initial position x_0 of the trajector, and the center position of the video. Here, M_i is the number of landmarks in the video V_i. The axis orientation k is selected from K candidates prepared in advance. How the coordinate axes are actually chosen is described later.

The trajectory determined by the trajector trajectory Y, the axis orientation k, and the reference point l is written F(Y, k, l). Under a probability model λ of the trajector's motion, let the likelihood P(F(Y_i, k, l^(i)_m_i); λ) be the probability of drawing the trajectory F(Y_i, k, l^(i)_m_i), where Y_i is the trajector trajectory shown in the video V_i, k is the axis orientation, and the m_i-th landmark l^(i)_m_i among the M_i landmarks in the video V_i is taken as the reference point. The motion is learned by determining the probability model λ*, the axis orientation k*, and the set m* of indices that identify the reference point for each input video, such that the sum over all input videos of the log-likelihood for each video V_i is maximized. This is expressed by (Equation 1).

Here, argmax with λ, k, m written beneath it is the function that returns the λ, k, m that maximize its argument, and m = (m_1, m_2, ..., m_N). The symbol "*" denotes an estimated value.
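The image for (Equation 1) is not reproduced in this text. From the surrounding description, it can plausibly be written as follows. The typesetting is a reconstruction from the prose, not a copy of the original equation.

```latex
(\lambda^{*}, k^{*}, \mathbf{m}^{*})
  = \operatorname*{argmax}_{\lambda,\,k,\,\mathbf{m}}
    \sum_{i=1}^{N} \log P\bigl(F(Y_i, k, l^{(i)}_{m_i});\, \lambda\bigr),
\qquad \mathbf{m} = (m_1, m_2, \ldots, m_N)
```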

In short, once a probability model λ is given, the probability that the trajector draws a given trajectory F under that model is determined. (Equation 1) therefore means choosing the coordinate system that is most likely to reproduce all of the trajectories Y_i shown in the input videos of the user's demonstrations.
This problem can be solved, for example, by using an HMM as the probability model. Details of the solution are given in Non-Patent Document 2.

Although this specification is written with the use of an HMM as the probability model in mind, the probability model is not limited to an HMM.
《Learning of motion primitives》
This section gives concrete examples of the results obtained when several motion primitives were learned. Seven motion primitives were learned: "raise" (ageru), "move closer" (chikazukeru), "move away" (hanasu), "rotate" (mawasu), "place on" (noseru), "lower" (sageru), and "jump over" (tobikoesaseru). The following K = 3 coordinate systems were adopted.
k_1: A coordinate system with a landmark as the origin, the y-axis pointing vertically downward, and the x-axis horizontal, with the direction of the x-axis chosen so that the x-coordinate of the trajector's initial position is negative.

k_2: A coordinate system with a landmark as the origin, the x-axis pointing from the origin toward the trajector's initial position, and the y-axis perpendicular to it.
k_3: A coordinate system with the trajector's initial position as the origin, the y-axis pointing vertically downward, and the x-axis horizontal.
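The three coordinate systems can be illustrated with a short sketch of the mapping F(Y, k, l) for positions. This is only an illustration under stated assumptions: the function name `to_local`, the 2-D tuple representation, the use of image coordinates whose y-axis already points downward, and the handedness of the perpendicular axis in k_2 are not given in the source.

```python
import math

def to_local(points, k, landmark, x0):
    """Express 2-D trajectory points in coordinate system k (1, 2 or 3).

    points   : list of (x, y) positions in image coordinates
               (assumption: y already grows downward)
    landmark : (x, y) landmark used as the reference point for k = 1 and k = 2
    x0       : (x, y) initial position of the trajector
    """
    if k == 1:
        ox, oy = landmark
        # Flip the x-axis when needed so the initial x-coordinate is negative.
        sign = -1.0 if x0[0] - ox > 0 else 1.0
        return [(sign * (x - ox), y - oy) for x, y in points]
    if k == 2:
        ox, oy = landmark
        dx, dy = x0[0] - ox, x0[1] - oy
        norm = math.hypot(dx, dy)
        if norm == 0:
            raise ValueError("trajector starts at the landmark; x-axis undefined")
        ux, uy = dx / norm, dy / norm   # unit x-axis: landmark -> initial position
        vx, vy = -uy, ux                # unit y-axis: perpendicular (handedness assumed)
        return [((x - ox) * ux + (y - oy) * uy,
                 (x - ox) * vx + (y - oy) * vy) for x, y in points]
    if k == 3:
        ox, oy = x0                     # origin at the trajector's initial position
        return [(x - ox, y - oy) for x, y in points]
    raise ValueError("k must be 1, 2 or 3")
```

In k_2 the trajector's initial position always lies on the positive x-axis, so a motion such as "move closer" appears as a decrease in x regardless of where the trajector starts relative to the landmark.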
FIG. 7 shows the trajectories of the motions demonstrated by the user when each motion primitive was learned, together with the coordinate system k and the reference point l selected in each case. Panels (a) to (g) of FIG. 7 show the learning results for the seven motion primitives listed above.

First, the meaning of the symbols is explained using FIG. 7(a) as an example. In FIG. 7(a), the squares and circles represent stationary objects, and the filled circle marked T represents the trajector. The figure shows that, with several stationary objects in place, the user lifted the trajector upward from the start point to the end point of the thick arrow and spoke the command "raise" into the microphone, and that this was done four times. The number and positions of the stationary objects and the initial and target positions of the trajector differ from one input motion to the next. Each input motion is captured by the camera and recognized as an input video V_i. In the input video V_i, the moving object is recognized as the trajector, and the initial position of the trajector and the positions of the stationary objects are recognized as landmarks. The figure shows that, as a result of determining the coordinate system and the reference point based on (Equation 1), the coordinate axes were taken in the directions indicated by the thin arrows and their intersection was selected as the reference point R. That is, the coordinate system k_3 was selected for the motion "raise".

The same applies to the motions in FIG. 7(b) to (g). The coordinate system k_1 was selected for "place on" and "jump over", the coordinate system k_2 for "move closer" and "move away", and the coordinate system k_3 for "raise", "rotate", and "lower".
《Configuration in trajectory search mode》
This section describes the internal configuration of the trajectory search device 300 of the present embodiment in the trajectory search mode.

Trajectory search means searching for the trajectory most likely to carry out the motion commanded by the user, under a combined probability model that represents the combined motion produced by joining learned motion primitives.
FIG. 8 shows the internal configuration of the trajectory search device 300 of the present embodiment in the trajectory search mode. It is basically the same as the learning-mode configuration shown in FIG. 6, but because the processor 3 executes a different program, the input and output data and the computation carried out in the work memory 4 differ.

The input/output interface 1 receives a code string that represents the instruction obtained when the speech recognition engine 100 analyzes a command spoken by the user into the microphone shown in FIG. 1. The code string either specifies the initial position x_0 of the trajector and the target position x_n after the movement, as when the user points and says "Move this over there", or specifies motion primitives such as "raise" and "place on" in the order in which they are to be executed, as when the user says "Raise the pen, then place it on the box".

The input/output interface 1 also obtains the initial position x_0 and target position x_n of the trajector and the position set O of the stationary objects, which the image recognition engine 200 extracts from the video captured by the camera shown in FIG. 1. Because the initial position of the trajector is identified by the image recognition engine 200, the user only has to specify the trajector by name or by gesture in the command.

The processor 3 executes the trajectory search program stored in the program storage unit 5. The trajectory search program reads two or more of the probability-model parameters λ that represent the motion primitives stored in the database storage unit 2 by learning, and holds in the work memory 4 the parameters Λ' of the combined probability model obtained by combining them. Two or more sets of parameters are read here in order to illustrate the combination of probability models. When no combination is needed, the parameters of a single probability model are read. Which motion primitives' parameters are read depends on the form of the user's command. When only the initial position and the target position of the trajector are specified, the trajectory search program considers many different combinations of the learned motion primitives. When the command also specifies the order of the motions, the program reads only the parameters of the probability models for the specified motion primitives and generates only that combination.

The trajectory search program searches for the most likely trajectory Y* under the combined probability model and sends the result to the control device 400 through the input/output interface 1. The most likely trajectory here means the trajectory that maximizes the transition probability from the initial position to the target position under the given probability model. It does not mean the trajectory of shortest distance or the trajectory that minimizes the energy required for the motion. Because the probability model is learned from the user's motions, the most likely trajectory is close to a reproduction of natural human motion.
<Combination of probability models>
This section describes the specific processing by which the probability models representing learned motion primitives are combined and a trajectory is searched for under the combined probability model.

Commanding the robot to perform a learned motion primitive makes the robot perform that primitive. However, if the robot could perform only the motion primitives exactly as learned, it could not respond to a wide variety of commands unless every possible motion were taught as a motion primitive. In the present invention, a trajectory is searched for under a combined probability model that represents the composite motion obtained by combining learned motion primitives.

In the present embodiment, the following two trajectory search methods are considered for combined motions.
1. With the initial position x_0 and the target position x_n of the trajector as input, search for the motion sequence A*, which consists of a sequence of <motion primitive, landmark> pairs, and the trajectory Y* that have the highest likelihood.
2. With the trajector and a motion sequence A consisting of a sequence of <motion primitive, landmark> pairs as input, search for the trajectory Y* that has the highest likelihood.
Trajectory search method 1 is used when the user gives a command that specifies the trajector and the target position, such as "Bring the pen here". Once the trajector is specified, its initial position is identified by the image recognition engine 200. In this case, both the motion sequence and the trajectory are searched for.

Trajectory search method 2 is used when the user gives a command that specifies the order of the motion primitives, such as "Raise the pen, then move it closer". In this case the motion sequence is given, so only the trajectory is searched for under that motion sequence.
In learning, a coordinate system is determined separately for each motion primitive, so the coordinate systems must be unified when different motion primitives are combined. For example, consider combining the motion primitive "raise" with the motion primitive "move closer", as in FIG. 9, to generate a motion that brings a stuffed toy to the upper left of a daruma doll. FIG. 9(a) shows the probability model λ_1 that represents the motion primitive "raise". It is defined in a coordinate system whose reference point is the initial position of the trajector, with the x-axis horizontal and the y-axis pointing vertically upward. FIG. 9(b) shows the probability model λ_2 that represents the motion primitive "move closer". It is defined in a coordinate system whose reference point is the position of the landmark that the trajector is to approach, with the x-axis pointing from the reference point toward the initial position of the trajector and the y-axis orthogonal to it. To generate the combined motion "raise, then move closer" from these two motion primitives, the coordinate system in which "move closer" is defined must be transformed, as in FIG. 9(c), so that the end point of "raise" coincides with the start point of "move closer" and the end point of "move closer" coincides with the upper left position of the daruma doll.

The formulation of the coordinate transformation used in combining motion primitives is described below.
Suppose that the motion sequence A is generated by combining n motion primitives, and that each motion primitive is represented by a probability model λ_i (i = 1, 2, ..., n). The sequence of probability models, each defined in the coordinate system of its own motion primitive in the motion sequence A, is written Λ = (λ_1, λ_2, ..., λ_n).

The parameters that characterize a probability model are its mean and variance. The mean is the mean of the position, velocity, and acceleration in the probability model, and the variance is the parameter that gives the spread around the mean. Aligning the end point of one motion with the start point of the next corresponds to aligning the mean positions at the end point and start point of the respective probability models.
<Combining the means>
First, the combination of the means is described.

In the j-th motion primitive, represented by the probability model λ_j, let the mean position in a state s be E_x_s(λ_j), the mean velocity E_x'_s(λ_j), and the mean acceleration E_x''_s(λ_j). That is, the mean in state s of the j-th motion primitive is E_y_s(λ_j) = [E_x_s(λ_j), E_x'_s(λ_j), E_x''_s(λ_j)]^T. Since the position of the j-th motion primitive in the initial state s = 0 is E_x_0(λ_j), translating the coordinates so that the initial position becomes the origin gives the mean position in state s as (E_x_s(λ_j) - E_x_0(λ_j)). This mean is defined in the coordinate system of the j-th motion primitive, so it must be converted into the world coordinate system before the motion primitives can be combined. The world coordinate system is taken to be the coordinate system in which the probability model of the first motion primitive in the motion sequence is defined. Let W_k,l denote the transformation into the world coordinate system. W_k,l is an affine transformation that depends on the axis orientation k and the reference point l.
The coordinate systems are unified by translating the reference point to the origin and then applying W_k,l. Translating the origin further by the mean position E_x_S_j-1(λ_j-1) in the final state S_j-1 of the immediately preceding (j-1)-th motion primitive yields the mean, in the world coordinate system, of the probability model for the j-th motion. Here, S_j is the final state of the j-th motion primitive, and the boundary values are E_x_S_0(λ_0) = x_0 for the initial position of the trajector and E_x_S_n(λ_n) = x_n for the target position. These relations are expressed by (Equation 2) to (Equation 5).
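The images for (Equation 2) to (Equation 5) are not reproduced in this text. The relations below are one plausible reading of the prose, offered as a sketch rather than a copy of the original. The hat notation for world-coordinate quantities is introduced here for illustration. Which relation carries which equation number, and whether the original writes the preceding final-state mean in transformed form, cannot be recovered from this text.

```latex
\begin{aligned}
\hat{E}_{x_s}(\lambda_j)   &= W_{k,l}\bigl(E_{x_s}(\lambda_j) - E_{x_0}(\lambda_j)\bigr)
                              + \hat{E}_{x_{S_{j-1}}}(\lambda_{j-1}) \\
\hat{E}_{x'_s}(\lambda_j)  &= W_{k,l}\, E_{x'_s}(\lambda_j) \\
\hat{E}_{x''_s}(\lambda_j) &= W_{k,l}\, E_{x''_s}(\lambda_j) \\
\hat{E}_{x_{S_0}}(\lambda_0) &= x_0, \qquad \hat{E}_{x_{S_n}}(\lambda_n) = x_n
\end{aligned}
```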

<Combining the variances>
Next, the combination of the variances is described. The variance is the parameter that expresses the spread around the mean, and it gives the fluctuation of the trajectory drawn by the trajector. When several motion primitives are combined, the permissible range of trajectory fluctuation depends on whether or not each motion primitive is a motion relative to a stationary object. For a motion relative to a stationary object, the positional relationship with that object must be kept reasonably accurate, whereas for other motions a slight shift in the position of the trajectory causes no problem. For this reason, when the motion being combined does not depend on a stationary object, the variances are combined so that the variance of the position is enlarged. This makes it possible to generate a smooth trajectory for the combined motion.

In the j-th motion primitive, represented by the probability model λ_j, the variance V_y_s(λ_j) in a state s is transformed as follows. If the reference point l_j of the j-th motion primitive is contained in the stationary-object position set O, the variance of the j-th motion primitive is left unchanged. Otherwise, the position component V_x_S_j-1(λ_j-1) of the variance in the final state S_j-1 of the immediately preceding (j-1)-th motion primitive is added to it. The velocity and acceleration components of the variance are not transformed. This is expressed by (Equation 6) and (Equation 7).
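The two combination rules can be sketched together for the position components only. This is a simplified illustration, not the method of the specification: the function `combine`, the dictionary fields, and the 2-D diagonal-variance layout are hypothetical, the transformation W_k,l is assumed to reduce to a rotation by `angle`, and velocity and acceleration are omitted.

```python
import math

def combine(primitives, x0):
    """Chain primitives into one list of (mean_pos, var_pos) states in world coordinates.

    Each primitive is a dict (hypothetical layout) with:
      "angle"    : rotation standing in for W_k,l (assumption: pure rotation)
      "anchored" : True if its reference point is a stationary object
      "means"    : list of (x, y) mean positions, one per state
      "vars"     : list of (vx, vy) position variances, one per state
    """
    combined = []
    prev_end_mean = x0            # boundary value E_x_S_0(lambda_0) = x_0
    prev_end_var = (0.0, 0.0)     # nothing precedes the first primitive
    for prim in primitives:
        c, s = math.cos(prim["angle"]), math.sin(prim["angle"])
        start = prim["means"][0]
        for mean, var in zip(prim["means"], prim["vars"]):
            # Shift so the primitive starts at the origin, rotate, then
            # move the origin to the end of the preceding primitive.
            dx, dy = mean[0] - start[0], mean[1] - start[1]
            world_mean = (c * dx - s * dy + prev_end_mean[0],
                          s * dx + c * dy + prev_end_mean[1])
            if prim["anchored"]:
                world_var = var   # keep the relation to the stationary object tight
            else:
                world_var = (var[0] + prev_end_var[0], var[1] + prev_end_var[1])
            combined.append((world_mean, world_var))
        prev_end_mean = combined[-1][0]
        prev_end_var = prim["vars"][-1]   # final-state variance as learned
    return combined
```

Adding the preceding primitive's learned (untransformed) final-state variance follows the wording of the text. Accumulating the already widened variance instead would be a different design choice that the text does not describe.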

An affine transformation was applied when combining the means, but none is applied when combining the variances. This is because it is common to use only the diagonal components of the covariance matrix when combining HMMs. The variances may, however, also be computed by applying an affine transformation in the same way as for the means.
<Combination of probability models>
The combination of probability models described above is shown schematically in FIGS. 10 to 12.

FIG. 10 shows the three motion primitives to be combined. FIG. 10(a) is the probability model λ_1 representing a motion primitive that moves away from the reference point, FIG. 10(b) is the probability model λ_2 representing a motion primitive that translates horizontally, and FIG. 10(c) is the probability model λ_3 representing a motion primitive that approaches the reference point. The reference point in FIG. 10(a) is the initial position of the trajector, taken from the landmark set. The reference point in FIG. 10(c) is the position of one of the stationary objects in the landmark set. These three motion primitives, each defined in its own coordinate system, are combined to generate a combined motion in which the trajector first moves away from its initial position, then moves horizontally, and finally approaches a stationary object.

FIG. 11 shows how the probability models λ_1, λ_2, and λ_3 are transformed when the three motion primitives shown in FIG. 10 are combined. In the sequence of probability models Λ = (λ_1, λ_2, λ_3), the probability model λ_1 is the first model and is therefore not transformed. The probability model λ_2 is the second model. In accordance with (Equation 3), its origin is shifted, an affine transformation is applied into the coordinate system in which the probability model λ_1 of the immediately preceding motion primitive is defined, and the model is translated so that its origin coincides with the mean position in the final state of λ_1. For the velocity and acceleration, only the affine transformation W_k,l is applied. As for the variance, the reference point of λ_2 is not the position of a stationary object, so in accordance with (Equation 7) the variance of the probability model λ_1 of the immediately preceding motion primitive is added. As a result, the variance of λ_2 is enlarged from the dotted line to the solid line in FIG. 11, which widens the range of the trajectory search. The probability model λ_3 undergoes the same coordinate transformation as λ_2 in accordance with (Equation 3).
The probability model λ_3 represents a motion primitive that approaches the target position, and its reference point is the position of a stationary object. In accordance with (Equation 7), its variance is therefore left unchanged.

The set of probability models transformed in this way is the combined probability model Λ', shown in FIG. 12. The most likely trajectory is searched for under this combined probability model Λ'.
<Search for the maximum likelihood trajectory>
In the search for the maximum likelihood trajectory, the likelihood P(Y; Λ') is the probability that the trajector draws the trajectory Y under the combined probability model Λ', and the probability model Λ'* and the trajectory Y* are determined so that the log-likelihood is maximized. This is expressed by (Equation 8). The solution can be obtained with the optimization method proposed in Non-Patent Document 3.

The above equation applies when the command specifies the target position of the trajector and trajectory search method 1 is used. Trajectory search method 1 is used when, for example, the robot is commanded to "place the red box on the blue box" and must compute a suitable trajectory itself. In this case, the start point and end point are given, every possible <motion primitive, landmark> pair is generated from the learned motion primitives and the recognized landmarks, and motion sequences A are formed by arranging the generated pairs. If no limit were placed on the combination of motion primitives, there would be infinitely many motion sequences A, so in practice the combination is limited to at most n motion primitives. This n is called the combination depth. With combination depth n, M landmarks, and Z learned motion primitives, the number of generated motion sequences A is at most the sum of MZ^i for i = 1 to n, which grows exponentially with the combination depth n. A very wide variety of motions is therefore possible even when the number Z of motion primitives is small.
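The size of this search space can be checked by enumerating the candidate sequences directly. The sketch below is illustrative only and its function names are hypothetical. It reads the bound as the sum of (MZ)^i, which is the count obtained when every step may pair any primitive with any landmark. The flat notation MZ^i in the text could also be read as M times Z^i, which would give a smaller number.

```python
from itertools import product

def enumerate_sequences(primitives, landmarks, depth):
    """Yield every sequence of <primitive, landmark> pairs of length 1..depth."""
    pairs = list(product(primitives, landmarks))
    for length in range(1, depth + 1):
        yield from product(pairs, repeat=length)

def max_sequence_count(num_landmarks, num_primitives, depth):
    """Upper bound on the number of sequences, read as the sum of (M*Z)**i."""
    return sum((num_landmarks * num_primitives) ** i for i in range(1, depth + 1))
```

Under this reading, seven primitives and three landmarks at depth 3 give 21 + 441 + 9261 = 9723 candidate sequences, which shows why the depth has to be bounded in practice.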

For example, when every motion sequence A consisting of <motion primitive, landmark> pairs is generated up to depth n = 3, combining the probability models that represent the motion primitives determines the probability model Λ′ corresponding to each motion sequence A. Under that probability model Λ′, the probability P that the trajector draws the trajectory Y is determined, so Λ′ and Y are chosen to maximize P. In other words, trajectory search method 1 determines the sequence of <motion primitive, landmark> pairs together with the most likely trajectory for that sequence.

Trajectory search method 2 is used when the robot is commanded by specifying the trajectory, as in "raise the red box, make it jump over the box, then lower it". In this case, the motion sequence A of trajectory search method 1 is given, so no search over sequences of <motion primitive, landmark> pairs is performed. That is, the trajectory Y* is generated according to equation (9).
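Equation (9) is likewise not reproduced in this text. Since the joint probability model Λ′ is fixed by the given motion sequence, the maximization presumably runs over the trajectory alone, as in the following form. The notation is an assumption.

```latex
Y^{*} = \operatorname*{argmax}_{Y} \; \log P(Y; \Lambda') \tag{9}
```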

Neither equation (8) nor equation (9) takes into account the effect of the trajector colliding with a stationary object. Such collisions can be avoided, for example, by adding the condition that trajectories Y that collide with a stationary object are excluded before the maximum likelihood trajectory is selected.
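The exclusion condition can be illustrated with a short Python sketch. The collision test used here (a trajectory point falling within a fixed radius of an obstacle center) and all names are assumptions made for illustration; the text does not specify how collisions are detected.

```python
import math


def collides(trajectory, obstacles, radius):
    """Return True if any trajectory point lies within `radius` of an obstacle center."""
    return any(math.dist(p, o) < radius for p in trajectory for o in obstacles)


def select_max_likelihood(candidates, obstacles, radius):
    """Pick the collision-free candidate with the highest log-likelihood.

    candidates: list of (trajectory, log_likelihood) pairs, where a trajectory
    is a list of (x, y) points. Returns None if every candidate collides.
    """
    feasible = [c for c in candidates if not collides(c[0], obstacles, radius)]
    if not feasible:
        return None
    return max(feasible, key=lambda c: c[1])
```

With this filter, a trajectory that passes through a stationary object is never selected, even if its likelihood is the highest among all candidates.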

FIG. 15 shows how the maximum likelihood trajectory is searched for under the joint probability model Λ′. Under Λ′, there are many trajectories Y that move the trajector from the initial position to the target position, as indicated by the solid and dotted lines in FIG. 15. For each of these trajectories Y, the probability P(Y; Λ′) that the trajector draws Y under Λ′ is calculated. If, for example, the trajectory drawn with the solid line in FIG. 15 has the highest probability, that trajectory is taken as the maximum likelihood trajectory Y*.

When the motion sequence A is given and the joint probability model Λ′ is uniquely determined, as when a motion is instructed under trajectory search method 2, it suffices to search for the maximum likelihood trajectory within that single joint probability model Λ′. When the command is given under trajectory search method 1, however, the search must also determine which motion primitives are combined and in what order. FIG. 14 shows examples in which the motion of placing the trajector at the initial position x_0 onto the stationary object at the target position x_n is realized by combining several motion primitives.

FIG. 14(a) shows the joint probability model Λ′_1 representing the combined motion obtained by combining the three motion primitives "move away", "move horizontally", and "move closer", the same as shown in FIG. 12. FIG. 14(b) shows the joint probability model Λ′_2 representing the combined motion in which "jump over" is performed twice in succession. FIG. 14(c) shows the joint probability model Λ′_3 representing the combined motion obtained by combining the three motion primitives "raise", "move horizontally", and "lower". FIG. 14(d) shows the joint probability model Λ′_4 representing the combined motion obtained by combining the two motion primitives "move away" and "lower".

Other motion sequences of motion primitives that move the trajector from the initial position x_0 to the target position x_n are also conceivable. In the search for the maximum likelihood trajectory, all motion sequences up to the specified depth are generated, and the probability of drawing a trajectory Y is calculated for each motion sequence. The search looks for the trajectory Y* and the joint probability model Λ′* that maximize the probability P(Y; Λ′) computed under each joint probability model Λ′. If, for example, the probability P(Y_4; Λ′_4) of drawing the trajectory Y_4 under the joint probability model Λ′_4 of FIG. 14(d) is found to be the largest, the maximum likelihood trajectory is determined to be Y* = Y_4. This is indicated by the solid line in FIG. 15(d).
《Movement instruction specifying the target position》
FIG. 16 shows several examples of maximum likelihood trajectories found by the trajectory search apparatus of this embodiment when movement of the trajector is instructed by specifying a target position. In this case, the maximum likelihood trajectory and the maximum likelihood motion sequence are searched for according to trajectory search method 1. In FIG. 16, object 1 is the trajector and objects 2 to 5 are stationary objects in the space. The dotted lines show the maximum likelihood trajectories found when the trajector is instructed to move to each of the target positions (a) to (g). FIG. 17 shows the motion sequences that give these maximum likelihood trajectories.
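The search over motion sequences used in trajectory search method 1 can be sketched as a simple loop. This is an illustrative outline only: `combine` and `best_trajectory` are hypothetical callables standing in for the model combination and the per-model trajectory optimization, neither of which is specified at code level in the text.

```python
def search_method_1(motion_sequences, combine, best_trajectory):
    """Return (log_likelihood, sequence, trajectory) for the best motion sequence.

    combine(sequence) -> joint model for that sequence (hypothetical callable)
    best_trajectory(model) -> (trajectory, log_likelihood) under that model
    Returns None if no motion sequence is supplied.
    """
    best = None
    for sequence in motion_sequences:
        model = combine(sequence)                    # build the joint model for this sequence
        trajectory, log_lik = best_trajectory(model)  # best trajectory under that model
        if best is None or log_lik > best[0]:
            best = (log_lik, sequence, trajectory)   # keep the overall maximum
    return best
```

Trajectory search method 2 corresponds to calling `combine` and `best_trajectory` once on the single motion sequence supplied by the user, with no outer loop.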

For example, in case (a), the movement is performed with the motion primitive "move away" from stationary object 2. In case (b), the movement is performed with the motion primitive "place on" stationary object 5. In case (c), the movement is performed by a motion sequence of two motion primitives: the trajector is first "placed on" stationary object 2 and then "moved closer" to stationary object 5. Case (d) follows the same trajectory as (c) until the trajector is "placed on" stationary object 2, after which it is "moved away" from stationary object 2. In case (e), the trajector is made to "jump over" stationary object 2 and is then "moved away" from stationary object 3. In case (f), the trajector is "moved away" from stationary object 2 and then made to "jump over" stationary object 4. In case (g), the trajector is "moved closer" to stationary object 2 and then "moved closer" to stationary object 3. As can be seen from the fact that trajectory (g) passes through stationary object 2, collisions between the trajector and stationary objects are not considered in this example. As described above, however, this is easily avoided, for example by excluding colliding trajectories from the candidates for the maximum likelihood trajectory.
《Movement instruction specifying a motion sequence》
FIG. 18 shows several examples of maximum likelihood trajectories found by the trajectory search apparatus of this embodiment when movement of the trajector is instructed by specifying a motion sequence. In this case, according to trajectory search method 2, the maximum likelihood trajectory is searched for under the joint probability model representing the given motion sequence. FIG. 18(a) shows the maximum likelihood trajectory found for the command "make object 1 jump over object 2, then lower it, and move it closer to object 4". FIG. 18(b) shows the maximum likelihood trajectory found for the command "make object 2 jump over object 1, jump over object 1 again, and place it on object 5".
《Operation of the manipulator control system》
This section describes the operation of the manipulator control system using the trajectory search apparatus of this embodiment in trajectory search mode, with reference to the flowchart of FIG. 19.

First, in trajectory search mode, the system waits for a motion instruction given by the user's voice and gestures (step S101). When an instruction from the user is detected in the audio input from the microphone and the video captured by the camera (step S101: Y), the speech recognition engine 100 recognizes the content of the instruction (step S102). That is, it extracts the trajector, the landmarks, and the motion primitives contained in the instruction from the command issued by the user.

The subsequent processing branches depending on whether the user gave the command by specifying the target position of the trajector or by specifying a motion sequence. This is determined by whether the input command contains the target position x_n of the trajector (step S103). If the target position x_n is included (step S103: Y), the command is judged to specify a target position, and processing proceeds to step S111. If the target position x_n is not included (step S103: N), the command is judged to specify a motion sequence, and processing proceeds to step S104.

In processing a command that specifies a motion sequence, it is first determined whether a motion sequence A consisting of a sequence of <motion primitive, landmark> pairs has been input (step S104). If no motion sequence A has been input (step S104: N), processing returns to the instruction waiting state of step S101. If a motion sequence A has been input (step S104: Y), the probability models representing the motion primitives contained in A are combined (step S105). Next, the maximum likelihood trajectory Y* is searched for under the joint probability model Λ′ (step S106), and the trajectory Y* found is transmitted to the manipulator via the control device 400 (step S107).

In processing a command that specifies a target position, the image recognition engine 200 first analyzes the image captured by the camera and recognizes the initial position x_0 and the target position x_n of the trajector (step S111). Next, multiple motion sequences of motion primitives that move the trajector from the initial position x_0 to the target position x_n are generated (step S112), and for each generated motion sequence, the probability models representing its motion primitives are combined (step S113). The maximum likelihood trajectory is searched for under each of the combined probability models, and the maximum likelihood trajectory Y* over all generated motion sequences is found together with the joint probability model Λ′ representing the corresponding combined motion (step S114).

The trajectory of the combined motion found by the search is not necessarily the motion the user intended. Therefore, before the maximum likelihood trajectory Y* is actually transmitted to the manipulator, the name of the planned motion is announced by voice (step S115). The planned motion name contains the names of the motion primitives in the combined motion, in order of execution. In the example of FIG. 15(d), the announcement would be something like "I will move the trajector away from its initial position to above the box and then lower it." If the user responds to the announcement with an OK (step S116: Y), the maximum likelihood trajectory Y* is transmitted to the manipulator via the control device 400 (step S107). If the user does not give an OK (step S116: N), processing returns to the instruction waiting state of step S101.
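The branching in steps S103 to S116 can be summarized with the following Python sketch. It is a simplified outline under stated assumptions: the `Instruction` type and the four callables are hypothetical stand-ins for the recognition engines, the two search routines, the voice confirmation, and the control device, and the return value merely signals whether a trajectory was sent or the system went back to waiting.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class Instruction:
    # Target position x_n, present only when the user named a goal (hypothetical field).
    target_position: Optional[Tuple[float, float]] = None
    # (motion primitive, landmark) pairs, present when the user named a motion sequence.
    motion_sequence: Optional[Sequence[Tuple[str, str]]] = None


def handle_instruction(instr, search_with_target, search_with_sequence, confirm, send):
    """Return True if a trajectory was sent, False if the system returns to waiting."""
    if instr.target_position is not None:                 # S103: Y
        trajectory, motion_name = search_with_target(instr.target_position)  # S111-S114
        if not confirm(motion_name):                      # S115-S116
            return False                                  # back to S101
    elif instr.motion_sequence:                           # S103: N, S104: Y
        trajectory = search_with_sequence(instr.motion_sequence)  # S105-S106
    else:                                                 # S104: N
        return False                                      # back to S101
    send(trajectory)                                      # S107
    return True
```

Only the target-position branch asks for confirmation, matching the flow described above, since in that branch the system chooses the motion sequence itself.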

The above is the operation of the manipulator control system using the trajectory search apparatus of this embodiment in trajectory search mode.
《Method of combining probability models》
This section describes the method of combining probability models in trajectory search mode of the trajectory search apparatus of this embodiment, with reference to the flowchart of FIG. 20.

Here, consider combining a sequence of probability models Λ = (λ_1, λ_2, ..., λ_n), consisting of probability models λ_j (j = 1, 2, ..., n) that represent n motion primitives, to create the joint probability model Λ′.
First, the initial values of the mean and variance of the position are set to E_x_S_0(λ_0) = x_0 and V_x_S_0(λ_0) = 0 (step S201). Here, x_0 is the position of the trajector in its initial state.

The index j that designates the probability model representing each motion primitive is initialized to j = 1 (step S202), and the loop over motion primitives is entered. To join the probability model continuously from the position in the final state S_j-1 of the immediately preceding motion primitive, represented by the probability model λ_j-1, E′_j is calculated according to equation (3) (step S203). To determine the amount of variance enlargement V′_j, it is determined according to equation (7) whether the reference point of the motion primitive represented by the probability model λ_j belongs to the set O of stationary object positions (step S204). If the reference point belongs to O (step S204: Y), the variance is not enlarged, so V′_j is set to 0 (step S205). If the reference point does not belong to O (step S204: N), the position component of V′_j is set to the position variance V_x_S_j-1 in the final state S_j-1 of the immediately preceding motion primitive represented by λ_j-1 (step S206).

Next, the index indicating the state is initialized to s = 0 (step S207), and the loop over states within the processing for each probability model is entered. For each state s, the mean is first converted according to equation (2) (step S208). The variance is then converted according to equation (6).
If the state index s is not greater than the index S_j of the final state of the motion primitive represented by the probability model λ_j being processed (step S210: N), s is incremented (step S211) and processing returns to step S208. If s is greater than S_j (step S210: Y), the state loop is exited and processing proceeds to step S212.

If the index j that designates the probability model representing a motion primitive is not greater than the number n of motion primitives contained in the sequence of probability models Λ (step S212: N), j is incremented (step S213) and processing returns to step S203. If j is greater than n (step S212: Y), the loop over motion primitives is exited and processing proceeds to step S214.

Finally, to make the end point of the trajectory coincide with the target position x_n, the mean and variance of the position in the final state S_n of the probability model λ_n representing the last motion primitive are set to E_x_S_n(λ_n) = x_n and V_x_S_n(λ_n) = 0 (step S214).
The above is the method of combining probability models in trajectory search mode of the trajectory search apparatus of this embodiment.
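The flow of steps S201 to S214 can be illustrated with a one-dimensional Python sketch that tracks only the position mean and variance of each state. Equations (2), (3), (6), and (7) are not reproduced in this section, so the update rules below are assumptions inferred from the prose and the claims: the mean is shifted so that each primitive starts where the previous one ended, and the variance is enlarged by the previous final-state variance (taken here after conversion, which is a further assumption) unless the reference point is a stationary object position. The `Primitive` type and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Primitive:
    means: List[float]       # position mean of each state, in the primitive's own frame
    variances: List[float]   # position variance of each state
    reference_point: float   # position of the primitive's reference point


def combine(primitives, x0, xn, stationary_positions):
    """Chain primitives into one list of (mean, variance) states (1-D sketch)."""
    if not primitives:
        raise ValueError("at least one motion primitive is required")
    prev_mean, prev_var = x0, 0.0                        # S201: start at x0 with zero variance
    means, variances = [], []
    for prim in primitives:                              # S202, S212-S213: loop over j
        offset = prev_mean - prim.means[0]               # S203: assumed form of E'_j
        if prim.reference_point in stationary_positions:  # S204: exact match, for simplicity
            extra_var = 0.0                              # S205: no enlargement
        else:
            extra_var = prev_var                         # S206: carry over previous variance
        for m, v in zip(prim.means, prim.variances):     # S207-S211: loop over states
            means.append(m + offset)                     # S208: mean conversion
            variances.append(v + extra_var)              # variance conversion
        prev_mean, prev_var = means[-1], variances[-1]
    means[-1], variances[-1] = xn, 0.0                   # S214: pin the end point to xn
    return means, variances
```

A full implementation would also carry velocity and acceleration components and apply the coordinate transformation between primitives, both of which are omitted here.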

By incorporating the trajectory search apparatus of the present invention into an industrial robot, a versatile robot that carries out work while flexibly adapting to a variety of on-site conditions can be realized. In addition, by mounting the apparatus in an automobile, it can be used to search for a route to a destination while responding to the user's requests.

FIG. 1: Diagram showing the configuration of the manipulator control system.
FIG. 2: Diagram showing examples of motions that manipulate an object.
FIG. 3: Diagram showing an example of input video used to teach the robot.
FIG. 4: Diagram schematically showing the probability model λ_j representing a motion primitive.
FIG. 5: Diagram showing the data structure of the parameters of a probability model.
FIG. 6: Diagram showing the internal configuration of the trajectory search apparatus in learning mode.
FIG. 7: Diagram showing examples of learning results for motion primitives.
FIG. 8: Diagram showing the internal configuration of the trajectory search apparatus during trajectory search.
FIG. 9: Example of coordinate transformation in the combination of motion primitives.
FIG. 10: Diagram showing examples of motion primitives before combination.
FIG. 11: Diagram showing the relationship between mean and variance in the combination of motion primitives.
FIG. 12: Diagram showing the parameters of a joint probability model.
FIG. 13: Diagram showing the search for a trajectory under a joint probability model.
FIG. 14: Diagram showing several combination patterns of probability models.
FIG. 15: Diagram showing the maximum likelihood trajectory found among several combination patterns of probability models.
FIG. 16: Diagram showing examples of maximum likelihood trajectories when movement is instructed by specifying a target position.
FIG. 17: Diagram showing the motion sequence found for each target position.
FIG. 18: Diagram showing examples of maximum likelihood trajectories when movement is instructed by specifying a motion sequence.
FIG. 19: Flowchart showing the operation of the manipulator control system.
FIG. 20: Flowchart showing the method of combining probability models.

Explanation of symbols

100: Speech recognition engine
200: Image recognition engine
300: Trajectory search apparatus
400: Control device
500: Manipulator
1: Input/output interface
2: Database storage unit
3: Processor
4: Work memory
5: Program storage unit

Claims

1. A trajectory search apparatus for searching for a trajectory along which a trajector is moved, comprising:
accepting means for accepting, from a user, a trajector movement instruction that includes motion primitives that a drive device for moving the trajector can perform; and
search means for generating a motion sequence composed of the motion primitives included in the instruction accepted by the accepting means, and searching for a maximum likelihood trajectory within the range of the generated motion sequence,
wherein each motion primitive that the drive device can perform is represented by a probability model,
the probability model defines how the state of the trajector should transition over time, using a probability distribution for each time point and transition probabilities between states, and
the search means searches for the trajectory by combining the probability models representing the motion primitives of the motion sequence and searching for the trajectory with the maximum likelihood under the joint probability model obtained by the combination.

2. The trajectory search apparatus according to claim 1, wherein, in the combination of probability models representing motion primitives, the mean of the position, velocity, and acceleration of the trajector in each state of a motion primitive executed second or later in the motion sequence is calculated using the mean of the position, velocity, and acceleration of the trajector in that state, the mean of the position of the trajector in the initial state of that motion primitive, and the mean of the position in the final state of the immediately preceding motion primitive.

3. The trajectory search apparatus according to claim 1, further comprising:
capturing means for capturing the trajectory of the trajector; and
recognizing means for recognizing stationary objects arranged in the space in which the drive device operates,
wherein, in the combination of probability models representing motion primitives, it is determined whether the reference point serving as the reference of a motion primitive to be combined matches any of the recognized positions of the stationary objects,
if the reference point matches any of the recognized positions of the stationary objects, the variance in each state of a motion primitive executed second or later in the motion sequence is set, in the combination, to the variance in each state of that motion primitive, and
if the reference point does not match any of the recognized positions of the stationary objects, the variance in each state of a motion primitive executed second or later in the motion sequence is set, in the combination, to the sum of the variance in each state of that motion primitive and the variance in the final state of the immediately preceding motion primitive.

4. The trajectory search apparatus according to claim 1, wherein the trajector movement instruction includes the initial position and the target position of the trajector, and
the search means generates a plurality of motion sequences of motion primitives corresponding to paths from the initial position of the trajector to the target position, and combines probability models for each of the motion sequences.

5. The trajectory search apparatus according to claim 1, wherein the trajector movement instruction includes a plurality of pairs of a motion name of a motion primitive and a landmark, and
the motion primitives whose probability models are combined by the search means are the motion primitives corresponding to the plurality of motion names included in the movement instruction.

6. The trajectory search apparatus according to claim 2, wherein, in the combination, a coordinate transformation is performed so as to unify the coordinate systems in which the two probability models to be combined are defined.

7. The trajectory search apparatus according to claim 6, wherein the coordinate transformation applies an affine transformation to the coordinate system in which the probability model representing the immediately preceding motion primitive is defined, relative to the coordinate system in which the probability model representing the motion primitive is defined.

8. The trajectory search apparatus according to claim 1, wherein the probability model representing a motion primitive is determined based on motions performed by a user.

9. The trajectory search apparatus according to claim 1, wherein the user is asked for confirmation before the drive device performs the motion represented by the maximum likelihood trajectory, and the trajectory is not output if the user does not agree.