JP2019030949A

JP2019030949A - Robot control device, robot control method, and robot control program

Info

Publication number: JP2019030949A
Application number: JP2017154232A
Authority: JP
Inventors: Mitsuhiro Goto; Ayafumi Nunobiki; Narimune Matsumura; Takahiro Matsumoto; Michita Imai
Original assignee: Nippon Telegraph and Telephone Corp; Keio University
Current assignee: Nippon Telegraph and Telephone Corp; Keio University
Priority date: 2017-08-09
Filing date: 2017-08-09
Publication date: 2019-02-28
Anticipated expiration: 2037-08-09
Also published as: JP6824127B2

Abstract

PROBLEM TO BE SOLVED: To reduce the time and labor required to create an operation scenario for a robot.
SOLUTION: A robot control device includes: first estimation means that estimates, from log information including utterance content of a robot and identification information of non-verbal actions, a dialogue act indicating the intention of the utterance content; calculation means that calculates a co-occurrence probability of a non-verbal action executed together with the dialogue act; second estimation means that estimates the utterance time of new utterance content of the robot and the dialogue act of the new utterance content; determination means that determines, on the basis of the estimated utterance time, the dialogue act of the new utterance content, and the co-occurrence probability, a non-verbal action that is executed together with the dialogue act of the new utterance content and fits the estimated utterance time; and generation means that generates an operation scenario of the robot on the basis of the determined non-verbal action and the new utterance content.
SELECTED DRAWING: Figure 1

Description

Embodiments of the present invention relate to a robot control device, a robot control method, and a robot control program.

Services using robots with a speech function are provided in fields such as reception and customer service. One example is visitor reception at a company, where the robot explains visiting procedures and facility information while conversing with the user (visitor), thereby automating reception.

For a dialogue with a user to proceed smoothly in such robot-based services, the robot must express itself not only through what it says but also through non-verbal actions such as face orientation and hand and body movements. For example, when a robot gives facility guidance at a company's visitor reception, it can guide the user more clearly by pointing in the relevant direction with its hand in addition to speaking.

In addition, for a robot to handle reception and customer service effectively, a scenario creator must appropriately decide, as the robot's expression, non-verbal actions (face orientation and body movement) in addition to the utterance content, and create the operation scenario that the robot executes.

The scenario creator has to consider these two kinds of information, utterance content and non-verbal actions, in parallel and create the scenario while setting actions whose length matches the utterance length. The resulting workload has been a problem.

To simplify the creation of robot operation scenarios, the prior art has used graphical programming with a GUI (graphical user interface) instead of writing scenarios in a programming language. One example is Choregraphe, described in Non-Patent Document 1, a tool for creating operation scenarios for a single robot. Another is the cloud-based integrated development environment described in Non-Patent Document 2, a tool for creating operation scenarios that coordinate multiple robots and IoT (Internet of Things) devices.

Non-Patent Document 1: E. Pot, J. Monceaux, R. Gelin, and B. Maisonnier et al., "Choregraphe: a Graphical Tool for Humanoid Robot Programming", Robot and Human Interactive Communication, 2009, pp. 46-51.
Non-Patent Document 2: Takahiro Matsumoto, Narimune Matsumura, and Takashi Hosobuchi, "Cloud-based Integrated Development Environment for Interaction Robot Services", IEICE Technical Report, pp. 33-36, 2015.

The method of Non-Patent Document 1 allows a person who cannot write in a programming language to control a robot in fine detail with a GUI tool. However, the person still has to use the GUI tool to set non-verbal actions, such as the gaze direction the robot should take, the gestures it should perform, and their timing, and also has to learn how to use the GUI tool itself.

In the method of Non-Patent Document 2, combination patterns of primitive non-verbal actions (raising a hand, directing the gaze, and so on) are defined in advance, and the robot's operation scenario can be written by selecting non-verbal actions on a GUI screen. In this respect it is easier to use than the method of Non-Patent Document 1. However, the gaze direction, the gestures, and their timing still have to be described for the robot to be controlled, so there is room for improvement in the considerable labor that scenario creation requires.

An object of the present invention is to provide a robot control device, a robot control method, and a robot control program that can reduce the labor required to create an operation scenario for a robot.

To achieve the above object, a first aspect of a robot control device according to an embodiment of the present invention provides a device comprising: first estimation means for estimating, from log information including utterance content of a robot and identification information of non-verbal actions of the robot, a dialogue act indicating the intention of the utterance content; calculation means for calculating, for the dialogue act estimated by the first estimation means, a co-occurrence probability of a non-verbal action performed together with that dialogue act; second estimation means for acquiring new utterance content of the robot and estimating an utterance time, which is the time required to speak the acquired utterance content, and a dialogue act indicating the intention of the acquired utterance content; determination means for determining, based on the utterance time and the dialogue act estimated by the second estimation means and on the co-occurrence probability calculated by the calculation means, a non-verbal action that is performed together with the dialogue act estimated by the second estimation means and whose action time, which is the time required for the action, has a length corresponding to the utterance time estimated by the second estimation means; and generation means for generating an operation scenario of the robot based on the non-verbal action determined by the determination means and the new utterance content.

In a second aspect of the robot control device configured as above, according to the first aspect, the calculation means calculates, based on the dialogue act estimated by the first estimation means, the co-occurrence probability of a non-verbal action performed together with that dialogue act for each of a plurality of types of non-verbal actions. The determination means selects the non-verbal action with the highest co-occurrence probability among the plurality of types and, when there is a difference between the action time required for the selected non-verbal action and the utterance time estimated by the second estimation means, repeats the selection of the non-verbal action that has the highest co-occurrence probability among the non-verbal actions other than the one already selected.

In a third aspect of the robot control device configured as above, according to the second aspect, the device further comprises non-verbal action storage means that stores information indicating each non-verbal action in association with attribute information indicating whether the non-verbal action is an action that continues repeatedly until action completion. The determination means reads, from the non-verbal action storage means, the attribute information corresponding to the non-verbal action it has selected. When the attribute information indicates a repeatedly continuing action, the determination means takes the selected non-verbal action, repeated until its action time reaches the utterance time estimated by the second estimation means, and determines it as the non-verbal action that is performed together with the dialogue act estimated by the second estimation means and whose action time has a length corresponding to the estimated utterance time.

In a fourth aspect of the robot control device configured as above, according to the second aspect, when the non-verbal action selected by the determination means includes an action in which the same partial action is repeated, the determination means updates the selected non-verbal action to an action that repeats that repeated partial action.

An aspect of a robot control method according to an embodiment of the present invention provides a robot control method performed by a robot control device, the method comprising: estimating, from log information including utterance content of a robot and identification information of non-verbal actions of the robot, a first dialogue act indicating the intention of the utterance content; calculating, for the estimated dialogue act, a co-occurrence probability of a non-verbal action performed together with that dialogue act; acquiring new utterance content of the robot and estimating an utterance time, which is the time required to speak the acquired utterance content, and a second dialogue act indicating the intention of the acquired utterance content; determining, based on the estimated utterance time, the estimated second dialogue act, and the calculated co-occurrence probability, a non-verbal action that is performed together with the estimated second dialogue act and whose action time, which is the time required for the action, has a length corresponding to the estimated utterance time; and generating an operation scenario of the robot based on the determined non-verbal action and the new utterance content.

An aspect of a robot control program according to an embodiment of the present invention provides a program that causes a processor to function as each of the means of the robot control device according to any one of the first to fourth aspects.

According to the present invention, the labor required to create an operation scenario for a robot can be reduced.

FIG. 1 is a block diagram showing a configuration example of a robot control device according to an embodiment of the present invention.
FIG. 2 is a diagram showing an example of the contents stored in the operation scenario log storage unit of the robot control device according to the embodiment.
FIG. 3 is a diagram showing an example of the contents stored in the co-occurrence probability storage unit of the robot control device according to the embodiment.
FIG. 4 is a diagram showing an example of the contents stored in the non-verbal action storage unit of the robot control device according to the embodiment.
FIG. 5 is a diagram showing an example of the contents stored in the partial-action storage unit of the robot control device according to the embodiment.
FIG. 6 is a diagram showing an example of the contents stored in the assigned-action storage unit of the robot control device according to the embodiment.
FIG. 7 is a flowchart showing an example of the procedure of co-occurrence probability calculation by the robot control device according to the embodiment.
FIG. 8 is a flowchart showing an example of the procedure of utterance information acquisition by the utterance information acquisition unit of the robot control device according to the embodiment.
FIG. 9 is a flowchart showing an example of the procedure of action assignment by the robot control device according to the embodiment.
FIG. 10 is a flowchart showing an example of the procedure of operation scenario generation by the robot control device according to the embodiment.

An embodiment of the present invention is described below with reference to the drawings.
In this embodiment, non-verbal actions are assigned to a robot automatically from the robot's past operation scenario creation logs and the text of the robot's utterance content.

FIG. 1 is a block diagram showing a configuration example of a robot control system according to an embodiment of the present invention.
As shown in FIG. 1, the robot control system according to the embodiment includes a robot control device 10, an utterance content input unit 200, and a robot 301.
The robot control device 10 can be connected to the robot 301 as an external control target.
As an example, the robot control system is realized by implementing the robot control device 10 as a computer device such as a smartphone, a tablet terminal, or a personal computer (PC). The computer device includes, for example, a processor such as a CPU (Central Processing Unit), a memory connected to the processor, and a communication interface for communicating with the robot 301 (for example, wirelessly). The implementation of the robot control system is not limited to this example. In the following, the robot control device 10 is described as being separate from the robot 301, but the robot control device 10 may instead be built into the robot 301.

The robot dialogue control system in this embodiment includes the following components.
(1) A dialogue act estimation unit 101 (first estimation means) that estimates the dialogue act of utterance content from the text of the utterance content in a log.
(2) A co-occurrence probability calculation unit 102 that obtains, for each estimated dialogue act, the co-occurrence probability of a non-verbal action performed together with that dialogue act, for each of a plurality of types of non-verbal actions.
(3) An utterance content input unit 200 through which the text of new utterance content for the robot is input.
(4) An utterance information acquisition unit 201 (second estimation means) that estimates, as utterance information, the utterance time when the robot speaks (the time required to speak the utterance content) and the dialogue act from the input text of the new utterance content.
(5) An assigned-action determination unit 203 that, based on the co-occurrence probabilities and the utterance information (utterance time and dialogue act), determines as the non-verbal action to assign to the new utterance content a non-verbal action that is performed together with the dialogue act in the utterance information and whose action time (the time from the start to the end of the action) has a length corresponding to the utterance time.
(6) An operation scenario generation unit 300 that automatically generates the robot's operation scenario from the assigned non-verbal actions and the input utterance content.
To input utterance content, the utterance content input unit 200 may load a text file prepared in advance, or text may be entered sequentially into the user interface of an application running in a web browser or the like.

The robot 301 has a network (NW) connection function, a speech function (speech synthesis or audio file playback), and a function for performing non-verbal actions, and it executes speech and non-verbal actions according to the scenario generated by the operation scenario generation unit 300. There may be more than one such robot, in which case the operation scenario generation unit 300 generates an operation scenario for each robot.

As described in Non-Patent Document 2, the operation scenario handled in this embodiment can be one in which robot control contents such as utterance content and non-verbal actions are written on the nodes of a state transition diagram, and transition conditions such as "utterance complete" or "fixed time elapsed" are written on the links between nodes.
Furthermore, each non-verbal action handled in this embodiment can be defined in advance as an action pattern that combines several primitive actions, which are the partial actions making up the non-verbal action. A non-verbal action has one of two attributes: "continuous" or "single".
A non-verbal action with the attribute "continuous" keeps repeating the combined primitive actions until an "action complete" command is set. A non-verbal action with the attribute "single" ends once the whole combination of primitive actions has been executed.
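The two attributes can be illustrated with a minimal Python sketch. The class name `NonVerbalAction`, the function `expand`, the primitive names, and the idea of passing a cycle count in place of an "action complete" command are all assumptions made for illustration; the source does not define a data format.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Attribute(Enum):
    CONTINUOUS = "continuous"  # repeats until "action complete" is commanded
    SINGLE = "single"          # ends after one pass over its primitives


@dataclass
class NonVerbalAction:
    action_id: int
    primitives: List[str]      # ordered primitive (partial) actions
    attribute: Attribute


def expand(action: NonVerbalAction, cycles_until_complete: int) -> List[str]:
    """Return the primitive sequence the action would execute.

    `cycles_until_complete` stands in for the moment the "action complete"
    command arrives; it is ignored for single actions.
    """
    if action.attribute is Attribute.SINGLE:
        return list(action.primitives)
    return action.primitives * cycles_until_complete


wave = NonVerbalAction(1, ["raise_hand", "lower_hand"], Attribute.CONTINUOUS)
bow = NonVerbalAction(2, ["lean_forward", "straighten"], Attribute.SINGLE)
```

In this sketch a continuous action is unrolled a given number of times, whereas a single action always runs exactly once regardless of the cycle count.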

(Dialogue act estimation unit 101)
In this embodiment, to assign non-verbal actions to the robot automatically, the dialogue act estimation unit 101 first reads operation scenario logs created in the past. It then acquires the utterance content and the non-verbal actions as the information of each node in the operation scenario, and determines the dialogue act $DA_i$ of each utterance content. A dialogue act is a classification label indicating the intention of an utterance, and takes values such as question, greeting, and filler.

(Co-occurrence probability calculation unit 102)
Next, the co-occurrence probability calculation unit 102 obtains the co-occurrence probability between a dialogue act and a non-verbal action of the robot for every dialogue act and for each of a plurality of types of non-verbal actions. Let the set of robot gestures (non-verbal actions) be $M = \{mo_1, mo_2, \ldots, mo_k\}$, and let $p_{\alpha 1}$ be the co-occurrence probability that non-verbal action $mo_1$ is performed together with an utterance of a dialogue act $\alpha$. The discrete distribution of the co-occurrence probabilities of the non-verbal actions for dialogue act $\alpha$ can then be written as $p_\alpha = \{p_{\alpha 1}, p_{\alpha 2}, \ldots, p_{\alpha k}\}$ (subject to a condition that appears as a formula image in the original and is not reproduced in this text).
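One way to obtain such a distribution from a log is sketched below. The source does not spell out the estimator, so this sketch assumes a simple relative frequency: for each dialogue act, the number of times each non-verbal action appears alongside it, divided by the total number of action occurrences for that act. The function name and the input format (pairs of a dialogue act label and a list of action IDs) are hypothetical.

```python
from collections import Counter, defaultdict


def cooccurrence_probabilities(log):
    """Estimate p_alpha for every dialogue act alpha.

    log: iterable of (dialogue_act, [action_id, ...]) pairs, one per node.
    Returns {dialogue_act: {action_id: probability}}.
    Assumption: probabilities are relative frequencies normalized per act.
    """
    counts = defaultdict(Counter)
    for act, action_ids in log:
        counts[act].update(action_ids)

    probs = {}
    for act, counter in counts.items():
        total = sum(counter.values())
        probs[act] = {a: c / total for a, c in counter.items()}
    return probs


example_log = [
    ("greeting", [1, 2]),
    ("greeting", [1]),
    ("question", [3]),
]
p = cooccurrence_probabilities(example_log)
```

With this normalization each per-act distribution sums to 1; a different definition (for example, the fraction of utterances in which an action appears) would give different values.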

This embodiment proposes a method that uses the co-occurrence probabilities obtained in this way to assign non-verbal actions automatically when new utterance content text is input.

(Utterance information acquisition unit 201)
In this method, the utterance information acquisition unit 201 estimates, as utterance information, the utterance time $t_i$ at speech synthesis and the dialogue act $DA_i$ from the utterance content text $T_i$ input via the utterance content input unit 200.
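The source does not say how the utterance time $t_i$ is estimated from the text $T_i$. As a rough sketch only, one could assume a constant speaking rate plus a fixed pause per punctuation mark; the function name, the rate of 7 characters per second, and the 0.3 second pause below are illustrative assumptions, not values from the source.

```python
PAUSE_MARKS = "、。,.!?！？"


def estimate_utterance_time(text, chars_per_second=7.0, pause_seconds=0.3):
    """Estimate speech duration in seconds from text length.

    Assumption: every non-space, non-punctuation character takes the same
    time to speak, and each punctuation mark adds a fixed pause.
    """
    pauses = sum(1 for ch in text if ch in PAUSE_MARKS)
    spoken = sum(1 for ch in text if not ch.isspace()) - pauses
    return spoken / chars_per_second + pauses * pause_seconds


t_plain = estimate_utterance_time("abcdefg")
t_with_pause = estimate_utterance_time("abcdefg.")
```

A real system would more likely query the speech synthesizer for the duration of the generated audio, which is consistent with the text's reference to the utterance time "at speech synthesis".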

(Assigned-action determination unit 203)
Next, based on the dialogue act $DA_i$ acquired here, the assigned-action determination unit 203 refers to the co-occurrence probabilities between dialogue acts (based on utterance content) and non-verbal actions, which the co-occurrence probability calculation unit 102 calculated in advance, and obtains the corresponding discrete distribution of co-occurrence probabilities, that is, the distribution for the dialogue act $DA_i$ (its symbol appears as a formula image in the original and is not reproduced in this text). From this distribution, the unit automatically selects non-verbal actions $mo_j'$ in descending order of co-occurrence probability and, while comparing the action time $t_{all}$ of the non-verbal actions with the utterance time $t_i$, assigns one or more non-verbal actions $mo_j'$ that run for a time that fills the utterance time $t_i$.

When the attribute of an assigned non-verbal action $mo_j'$ is "continuous", the action is executed repeatedly up to the utterance time $t_i$, and the ongoing action is completed when the utterance completes. When the attribute of the assigned non-verbal action $mo_j'$ is "single", a new non-verbal action $mo_j''$ is generated by repeating, one or more times (up to a preset limit of $n$ times), only the primitive action part that is repeated inside $mo_j'$, and this new action is assigned.
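The selection procedure described above can be sketched as a greedy loop. This is a simplified reading under stated assumptions, not the patent's exact procedure: actions are tried in descending order of co-occurrence probability, a continuous action fills all remaining time, and a single action is stretched by adding up to `max_repeats` extra copies of its repeated primitive part. The dictionary keys `attribute`, `duration`, and `loop_duration`, and the rule of skipping a single action that is longer than the remaining time, are hypothetical.

```python
import math


def assign_actions(utterance_time, distribution, actions, max_repeats=3):
    """Greedily assign actions until the utterance time is filled.

    distribution: {action_id: co-occurrence probability} for one dialogue act.
    actions: {action_id: {"attribute": "continuous" | "single",
                          "duration": seconds for one pass,
                          "loop_duration": seconds of the repeated part}}.
    Returns a list of (action_id, assigned_duration).
    """
    assigned = []
    remaining = utterance_time
    ranked = sorted(distribution, key=distribution.get, reverse=True)
    for action_id in ranked:
        if remaining <= 0:
            break
        info = actions[action_id]
        if info["attribute"] == "continuous":
            # Repeats until the utterance completes, so it fills the rest.
            assigned.append((action_id, remaining))
            remaining = 0.0
            continue
        base = info["duration"]
        if base > remaining:
            continue  # assumption: skip actions that would overrun
        loop = info.get("loop_duration", 0.0)
        extra = 0
        if loop > 0:
            extra = min(max_repeats, math.floor((remaining - base) / loop))
        duration = base + extra * loop
        assigned.append((action_id, duration))
        remaining -= duration
    return assigned


actions = {
    1: {"attribute": "single", "duration": 2.0, "loop_duration": 1.0},
    2: {"attribute": "continuous", "duration": 1.0, "loop_duration": 0.0},
    3: {"attribute": "single", "duration": 1.0, "loop_duration": 0.0},
}
distribution = {1: 0.5, 2: 0.2, 3: 0.05}
result = assign_actions(6.0, distribution, actions, max_repeats=2)
```

Here the most probable action is stretched from 2 to 4 seconds by two extra loop repetitions, and the continuous action covers the remaining 2 seconds of a 6 second utterance.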

(Operation scenario generation unit 300)
Through the above procedure, the operation scenario generation unit 300 automatically generates the robot's operation scenario from the set of non-verbal actions assigned to the utterance content text $T_i$. The robot 301 executes speech and non-verbal actions according to this operation scenario.

Next, an example of the present invention is described. In this example, an operation scenario of a robot, with non-verbal actions assigned in line with the input utterance content, is generated automatically, so that control of a single robot can be realized.

The functions of the robot control system in this example fall into three groups: (1) a co-occurrence probability calculation function that calculates in advance, from the robot's past operation scenarios, the co-occurrence probabilities between utterance content and non-verbal actions; (2) a non-verbal action assignment function that assigns non-verbal actions to the input utterance content based on the co-occurrence probabilities, taking utterance time and action time into account; and (3) a function that generates an operation scenario suited to the robot control platform from the result of assigning non-verbal actions to the utterance content.
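The scenario generation step, function (3), can be pictured as turning a list of utterances with assigned actions into a chain of state-transition nodes. The sketch below assumes a linear scenario in which every node moves to the next one on "utterance complete"; the function name and the dictionary keys are hypothetical, and a real robot control platform would need its own output format.

```python
def generate_scenario(utterances):
    """Build a linear state-transition scenario.

    utterances: list of (text, [action_id, ...]) in speaking order.
    Returns a list of node dictionaries; the last node has no transition.
    """
    nodes = []
    for i, (text, action_ids) in enumerate(utterances, start=1):
        is_last = i == len(utterances)
        nodes.append({
            "node_id": i,
            "utterance": text,
            "action_ids": list(action_ids),
            # Assumption: advance as soon as the utterance finishes.
            "transition": None if is_last else "utterance complete",
            "next_node_id": None if is_last else i + 1,
        })
    return nodes


scenario = generate_scenario([
    ("Hello", [1, 2]),
    ("How can I help you?", [3]),
])
```

Each node carries the same kinds of fields as a row of the operation scenario log, so generated scenarios could in principle be fed back as future log data.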

As shown in FIG. 1, the robot control device 10 of the robot control system has five control modules and six information storage devices.

The control modules are the dialogue act estimation unit 101, the co-occurrence probability calculation unit 102, the utterance information acquisition unit 201, the assigned-action determination unit 203, and the operation scenario generation unit 300.

The information storage devices are realized by, for example, non-volatile memory, and are the operation scenario log storage unit 100, the co-occurrence probability storage unit 103, the utterance information storage unit 202, the non-verbal action storage unit 204, the partial-action storage unit 205, and the assigned-action storage unit 206.

The co-occurrence probability calculation function is described first. This function is realized by the operation scenario log storage unit 100, the dialogue act estimation unit 101, the co-occurrence probability calculation unit 102, and the co-occurrence probability storage unit 103.

The operation scenario log storage unit 100 stores robot operation scenario logs created in the past. The dialogue act estimation unit 101 estimates the robot's dialogue act from the robot's utterance content in an operation scenario.
The co-occurrence probability calculation unit 102 calculates the probability that a dialogue act estimated by the dialogue act estimation unit 101 and a non-verbal action co-occur. The co-occurrence probability storage unit 103 stores the co-occurrence probabilities of the non-verbal actions for each dialogue act, as calculated by the co-occurrence probability calculation unit 102.

FIG. 2 is a diagram showing an example of the contents stored in the operation scenario log storage unit 100 of the robot control device according to an embodiment of the present invention.

As shown in FIG. 2, each row of the operation scenario log stored in the operation scenario log storage unit 100 corresponds to the information contained in one node of the state transition diagram of a robot operation scenario (hereinafter sometimes simply called the state transition diagram) and includes (1) a node ID, (2) the utterance content written in the node, (3) non-verbal action IDs, (4) the transition condition to the next node, and (5) the node ID of the transition destination.
A non-verbal action ID is information that is also stored in the non-verbal action storage unit 204 described later, and is the ID of a non-verbal action set for the robot at the node identified by the paired node ID.

For example, the row for node ID "1" in FIG. 2 describes an operation scenario in which, at node "1" (the node whose node ID is "1"), the robot utters "Hello" while executing non-verbal action "1" (the non-verbal action whose ID is "1") and non-verbal action "2" (the non-verbal action whose ID is "2"). The operation scenario log storage unit 100 may also store operation information of other sensors and devices at the same time.
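The row layout described for FIG. 2 can be pictured as a simple record. The following is a minimal sketch, not the stored format of the embodiment: the class name and field names are illustrative assumptions, and the transition fields of node "1" are left empty because the text does not state them.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ScenarioLogRow:
    """One row of the operation scenario log, i.e. one node of the state transition diagram."""
    node_id: str                         # (1) node ID
    utterance: str                       # (2) utterance content described in the node
    nonverbal_action_ids: List[str]      # (3) non-verbal action IDs executed with the utterance
    transition_condition: Optional[str]  # (4) transition condition to the next node
    next_node_id: Optional[str]          # (5) node ID of the transition destination


# Node "1" of the FIG. 2 example: utter "Hello" while executing actions "1" and "2".
# The transition fields are not given in the text, so they are left as None here.
row = ScenarioLogRow(
    node_id="1",
    utterance="Hello",
    nonverbal_action_ids=["1", "2"],
    transition_condition=None,
    next_node_id=None,
)
```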

FIG. 3 is a diagram showing an example of the contents stored in the co-occurrence probability storage unit 103 of the robot control device according to an embodiment of the present invention. FIG. 4 is a diagram showing an example of the contents stored in the non-verbal action storage unit 204. FIG. 5 is a diagram showing an example of the contents stored in the partial action storage unit 205. FIG. 6 is a diagram showing an example of the contents stored in the assigned action storage unit 206.
As shown in FIG. 3, the co-occurrence probability storage unit 103 stores, for each dialogue act, the co-occurrence probability of each non-verbal action. In the example of FIG. 3, for the dialogue act "greeting", non-verbal action "1" co-occurs with a probability of 0.5 (probabilities lie in the range 0 to 1), non-verbal action "2" with a probability of 0.2, and non-verbal action "3" with a probability of 0.05.
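As an illustration of how the FIG. 3 contents might be held and queried, the sketch below stores the probabilities quoted above in a nested dictionary and ranks the non-verbal actions for one dialogue act. The dictionary layout and the function name `ranked_actions` are assumptions made for this example only.

```python
# Co-occurrence probabilities for the dialogue act "greeting", as quoted from FIG. 3.
COOCCURRENCE = {
    "greeting": {"1": 0.5, "2": 0.2, "3": 0.05},
}


def ranked_actions(dialogue_act: str) -> list:
    """Return non-verbal action IDs for a dialogue act, highest probability first."""
    probs = COOCCURRENCE.get(dialogue_act, {})
    return sorted(probs, key=probs.get, reverse=True)


print(ranked_actions("greeting"))  # ['1', '2', '3']
```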

FIG. 7 is a flowchart showing an example of the procedure for calculating co-occurrence probabilities by the robot control device according to an embodiment of the present invention.
As shown in FIG. 7, the dialogue act estimation unit 101 reads, from the operation scenario log stored in the operation scenario log storage unit 100, the node information (node ID, utterance content, non-verbal action ID, transition condition, transition destination node) of each node in the state transition diagram of the robot operation scenario (S11). The dialogue act estimation unit 101 obtains the utterance text and the non-verbal action IDs from the read node information, and estimates the dialogue act from the utterance text (S12). The co-occurrence probability calculation unit 102 calculates the co-occurrence probability of each estimated dialogue act with each non-verbal action (S13). The co-occurrence probability calculation unit 102 then stores the calculated co-occurrence probabilities for each dialogue act in the co-occurrence probability storage unit 103.
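Steps S11 to S13 can be sketched as follows. The description does not give the formula for the co-occurrence probability, so this example assumes a relative frequency: the number of log nodes in which a dialogue act appears together with a non-verbal action, divided by the number of nodes with that dialogue act. The keyword-based `estimate_dialogue_act` is a hypothetical stand-in for the dialogue act estimation unit 101, and the log rows are invented for illustration.

```python
from collections import Counter, defaultdict


def estimate_dialogue_act(utterance: str) -> str:
    """Hypothetical stand-in for the dialogue act estimation unit 101."""
    greetings = {"hello", "good morning"}
    return "greeting" if utterance.lower() in greetings else "other"


def cooccurrence_probabilities(log_rows):
    """log_rows: iterable of (utterance, [non-verbal action ID, ...]) pairs."""
    act_counts = Counter()
    pair_counts = defaultdict(Counter)
    for utterance, action_ids in log_rows:        # S11: read node information
        act = estimate_dialogue_act(utterance)    # S12: estimate the dialogue act
        act_counts[act] += 1
        for action_id in set(action_ids):         # count each action once per node
            pair_counts[act][action_id] += 1
    # S13: assumed definition, relative frequency of the action given the dialogue act
    return {
        act: {a: n / act_counts[act] for a, n in actions.items()}
        for act, actions in pair_counts.items()
    }


# Invented log rows, for illustration only.
log = [
    ("Hello", ["1", "2"]),
    ("Hello", ["1"]),
    ("Good morning", ["3"]),
    ("This way, please", ["4"]),
]
table = cooccurrence_probabilities(log)
```

Under this assumed definition the probabilities for one dialogue act need not sum to 1, because several non-verbal actions can be attached to the same node; the FIG. 3 values (0.5, 0.2, 0.05) are consistent with that reading but do not confirm it.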

Next, the non-verbal action assignment function is described. This function is implemented by the utterance information acquisition unit 201, the utterance information storage unit 202, the assigned action determination unit 203, the non-verbal action storage unit 204, the partial action storage unit 205, and the assigned action storage unit 206.

The utterance information acquisition unit 201 obtains utterance information by estimating the dialogue act and the utterance time from the text of new utterance content entered by the scenario creator. The utterance information storage unit 202 stores the utterance information (utterance time and dialogue act) obtained by the utterance information acquisition unit 201.
The assigned action determination unit 203 assigns non-verbal actions to the robot in accordance with the co-occurrence probabilities and the utterance time.

The non-verbal action storage unit 204 stores a list of non-verbal actions, each created by combining primitive actions (the partial actions that make up a non-verbal action) that can be set for the robot.
The partial action storage unit 205 stores information on the primitive actions as a partial action list. The assigned action storage unit 206 stores, as an assigned action list, the non-verbal actions that the assigned action determination unit 203 has automatically assigned to the utterance content.

As shown in FIG. 4, each row of the non-verbal action list represents one non-verbal action and includes (1) the non-verbal action ID, (2) the name of the non-verbal action, (3) the attribute of the non-verbal action, and (4) the partial actions that make up the non-verbal action (for example, partial actions "1" to "x").
A partial action is the most primitive action used when controlling the robot. As shown in FIG. 5, the partial action storage unit 205 stores, as a partial action list, the following items in association with one another: (1) a partial action ID that uniquely identifies each partial action, (2) the robot joint to be moved, (3) a parameter indicating the angle to which the specified joint is moved, and (4) an action time indicating the transition time to the specified angle.
In the example of FIG. 5, partial action ID "1" moves the joint of the robot's right hand to 40° with a transition time of 300 ms, and partial action ID "2" moves the joint of the robot's right hand to 0° with a transition time of 400 ms. In the example of FIG. 4, partial actions "1" and "2" are combined to define the non-verbal action named "wave hand".
In the example of FIG. 4, a non-verbal action "end" (the non-verbal action whose ID is "end") can also be set as a special non-verbal action. It corresponds to a command that completes a non-verbal action having the "continuous" attribute; executing the non-verbal action "end" returns all joints of the robot to their initial state.
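The two tables of FIG. 4 and FIG. 5 can be modelled as keyed records, as in the minimal sketch below. The joint angles and transition times are those quoted for partial actions "1" and "2"; the class and field names, the non-verbal action ID "2" given to "wave hand", and its "single-shot" attribute are assumptions, since the text does not state them.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PartialAction:
    joint: str         # robot joint to move
    angle_deg: float   # target angle of the joint
    duration_ms: int   # transition time to the target angle


@dataclass(frozen=True)
class NonverbalAction:
    name: str
    attribute: str                       # "single-shot" or "continuous"
    partial_action_ids: Tuple[str, ...]  # partial actions in execution order


# Partial action list (FIG. 5 values quoted in the description).
PARTIAL_ACTIONS = {
    "1": PartialAction(joint="right_hand", angle_deg=40.0, duration_ms=300),
    "2": PartialAction(joint="right_hand", angle_deg=0.0, duration_ms=400),
}

# Non-verbal action list (FIG. 4). The ID "2" and the attribute are assumed for illustration.
NONVERBAL_ACTIONS = {
    "2": NonverbalAction(name="wave hand", attribute="single-shot",
                         partial_action_ids=("1", "2")),
}
```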

As shown in FIG. 6, the assigned action list describes (1) the text of the new utterance content, (2) the non-verbal action ID of the non-verbal action assigned to that utterance content, and (3) the partial action IDs of the partial actions that make up the assigned non-verbal action. Depending on the utterance time, more than one non-verbal action may be assigned to a single utterance text.
In the example of FIG. 6, the utterance text "Hello" in rows 1 and 2 is assigned the non-verbal action of row 1 (consisting of four partial actions whose IDs are, in execution order, 3, 4, 3, 4) and the non-verbal action of row 2 (consisting of two partial actions whose IDs are, in execution order, 1, 2).

FIG. 8 is a flowchart showing an example of the procedure by which the utterance information acquisition unit 201 of the robot control device according to an embodiment of the present invention acquires utterance information.
While the scenario creator's text input of utterance content has not finished (N in S21), the utterance information acquisition unit 201 estimates the dialogue act Da_i from the text of the utterance content T_i entered through the utterance content input unit 200. The utterance information acquisition unit 201 also synthesizes speech from the text of the utterance content T_i, obtains the utterance time t_i taken to play back the resulting audio file, and stores the dialogue act Da_i and the utterance time t_i in the utterance information storage unit 202 as utterance information (S22). The process then returns to S21, and S22 is repeated until the scenario creator's text input is finished (Y in S21).
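The loop of S21 and S22 can be sketched as below. The embodiment obtains the utterance time from the playback length of synthesized speech; this sketch does not perform speech synthesis, so both the dialogue act estimator and the duration function are passed in as callables, and the stand-ins used in the example (a fixed "greeting" label and 150 ms per character) are purely hypothetical.

```python
from typing import Callable, Dict, Iterable, List


def acquire_utterance_info(
    texts: Iterable[str],
    estimate_act: Callable[[str], str],
    utterance_time_ms: Callable[[str], int],
) -> List[Dict]:
    """Collect (T_i, Da_i, t_i) for every entered utterance text."""
    info = []
    for text in texts:                                     # S21: repeat until input ends
        info.append({                                      # S22: estimate and store
            "text": text,                                  # T_i
            "dialogue_act": estimate_act(text),            # Da_i
            "utterance_time_ms": utterance_time_ms(text),  # t_i
        })
    return info


# Hypothetical stand-ins for dialogue act estimation and speech synthesis timing.
info = acquire_utterance_info(
    ["Hello"],
    estimate_act=lambda text: "greeting",
    utterance_time_ms=lambda text: 150 * len(text),
)
```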

FIG. 9 is a flowchart showing an example of the procedure for assigning actions by the robot control device according to an embodiment of the present invention. The procedure for assigning non-verbal actions to new utterance content is described here.
First, the assigned action determination unit 203 reads, from the utterance information stored in the utterance information storage unit 202, the dialogue act Da_i and the utterance time t_i estimated as described above from the new utterance content T_i (S31).
The assigned action determination unit 203 also reads, from the co-occurrence probability storage unit 103 (see FIG. 3), the co-occurrence probability of each non-verbal action for the dialogue act Da_i read in S31 (S32).
Next, the assigned action determination unit 203 acquires, from the co-occurrence probability storage unit 103 (see FIG. 3), the set M' = {mo_1', mo_2', ..., mo_k'} of non-verbal actions for the dialogue act Da_i read in S31 (S33).

Next, the assigned action determination unit 203 sets to its initial value of 0 the counter for the total action time t_all of the non-verbal actions assigned to the utterance content (S34). The total action time is the time required to execute the non-verbal actions selected in the following steps, and is sometimes called the total execution time.

The assignment of non-verbal actions then proceeds as follows. It is assumed here that the contents of the partial action storage unit 205 (see FIG. 5) are stored in the non-verbal action storage unit 204. Specifically, from the set M' = {mo_1', mo_2', ..., mo_k'} of non-verbal actions acquired in S33, the assigned action determination unit 203 selects the non-verbal action ID of the non-verbal action mo_j' that has not yet been selected and has the highest co-occurrence probability read in S32 (S35).

The assigned action determination unit 203 calculates the action time t_moj, which is the time required to execute the selected non-verbal action mo_j' (S36). Specifically, the assigned action determination unit 203 identifies, in the non-verbal action list (see FIG. 4), the partial action IDs associated with the non-verbal action ID selected in S35. It then looks up, in the partial action list (see FIG. 5) stored in the non-verbal action storage unit 204, the action time associated with each identified partial action ID, and obtains the action time t_moj as the sum of the action times of the partial actions that make up the non-verbal action mo_j' selected in S35.
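The calculation in S36 is a sum over the partial actions of the selected non-verbal action. The sketch below uses the transition times quoted for FIG. 5 (300 ms and 400 ms); the dictionary names and the key "wave_hand" are illustrative assumptions.

```python
# Action time of each partial action, as quoted for FIG. 5.
PARTIAL_DURATION_MS = {"1": 300, "2": 400}

# Partial actions making up each non-verbal action (FIG. 4); the key is an assumed ID.
NONVERBAL_PARTIALS = {"wave_hand": ["1", "2"]}


def action_time_ms(action_id: str) -> int:
    """t_moj: sum of the action times of the partial actions of one non-verbal action."""
    return sum(PARTIAL_DURATION_MS[p] for p in NONVERBAL_PARTIALS[action_id])


print(action_time_ms("wave_hand"))  # 700
```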

The assigned action determination unit 203 adds the action time t_moj calculated in S36 to the current value of the counter for the total action time t_all (S37).

The assigned action determination unit 203 compares the utterance time t_i read in S31 with the current value of the counter for the total action time t_all after the addition in S37 (S38).

If the comparison in S38 shows that the current value of the total action time t_all after the addition in S37 is greater than or equal to the utterance time t_i read in S31 (Y in S38), the assigned action determination unit 203 stores the non-verbal action mo_j' selected in S35 in the assigned action storage unit 206 (S39) and ends the action assignment flow. As shown in FIG. 6, the non-verbal action mo_j' assigned here includes (1) the text of the new utterance content obtained in S31, (2) the non-verbal action ID, in the non-verbal action list (FIG. 4), of the non-verbal action mo_j' selected in S35 (the non-verbal action assigned to the new utterance content), (3) the attribute paired with this non-verbal action ID in the non-verbal action list (FIG. 4), and (4) the partial action IDs of the partial actions paired, in the non-verbal action list (FIG. 4), with the non-verbal action ID identified in (2).

On the other hand, if the current value of the total action time t_all after the addition in S37 is smaller than the utterance time t_i read in S31 (N in S38), the assigned action determination unit 203 extends the current value of the total action time t_all as necessary, according to the attribute of the non-verbal action mo_j' selected in S35.
Specifically, if the attribute of the non-verbal action mo_j' selected in S35 is "continuous" (Y in S40), it is sufficient to continue this non-verbal action mo_j' until the utterance ends (until the accumulated time of the non-verbal action mo_j' reaches the utterance time t_i). So that the non-verbal action mo_j' is completed after the utterance ends, the assigned action determination unit 203 stores the non-verbal action whose ID is "end" in the assigned action storage unit 206 (S41) and ends the action assignment flow.

The non-verbal action stored in the assigned action storage unit 206 at this point includes (1) the text of the utterance content obtained in S31, (2) the non-verbal action ID "end", (3) the attribute "single-shot", and (4) the partial action IDs of the partial actions paired with the non-verbal action ID "end".
In other words, when the attribute of the non-verbal action selected in S35 is "continuous", the action obtained by repeating that non-verbal action until its action time reaches the utterance time is determined as the non-verbal action corresponding to the utterance time (that is, an action whose action time has a length corresponding to the utterance time). For example, if a single execution of the non-verbal action selected in S35 takes one third of the utterance time, the action obtained by repeating it three times is determined as the corresponding non-verbal action.
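The repetition count for a "continuous" non-verbal action can be illustrated with a ceiling division. This is a sketch of one reading of the paragraph above, in which the action is repeated until its accumulated time reaches the utterance time; the function name is an assumption.

```python
import math


def repetitions_to_cover(utterance_ms: int, action_ms: int) -> int:
    """Number of repetitions of a continuous action needed to reach the utterance time."""
    if action_ms <= 0:
        raise ValueError("action_ms must be positive")
    return max(1, math.ceil(utterance_ms / action_ms))


# One execution takes one third of the utterance time, so three repetitions are needed.
print(repetitions_to_cover(utterance_ms=2100, action_ms=700))  # 3
```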

If the attribute of the non-verbal action mo_j' selected in S35 is "single-shot" (N in S40), then, to match the utterance time t_i, either a further non-verbal action is assigned or the number of repetitions of a repeating primitive action within the assigned action is increased.
Specifically, if the non-verbal action mo_j' selected in S35 contains no repeating primitive action (N in S42), the non-verbal action mo_j' selected in S35 is stored in the assigned action storage unit 206 in the same way as in S39 (S43), and the process returns to S35.
On returning to S35, a new non-verbal action mo_j' is selected: among the non-verbal actions obtained in S32, the one that has not yet been selected and has the highest co-occurrence probability. The total action time, now reflecting the action time of this new action, is then compared with the utterance time again. In this way, when there is a difference between the action time required for the selected non-verbal action and the utterance time, the selection of the non-verbal action with the highest co-occurrence probability among those not yet selected (out of the plural types of non-verbal actions obtained in S32) is repeated as necessary.

On the other hand, if the non-verbal action mo_j' selected in S35 contains a repeating primitive action (Y in S42), a non-verbal action mo_j'' is generated in which this repeating primitive action is repeated 1 to n times, where n is the upper limit on the number of repetitions (S44). An example of such a repeating action is the sequence of partial action IDs "3", "4" that repeats for the utterance text "Hello" in row 1 of FIG. 6. For example, if adding x repetitions (x being at most n) of the repeating action to the non-verbal action selected in S35 makes its action time reach the utterance time, the non-verbal action selected in S35 is updated to the action with those x repetitions added.

The repeating primitive action may be a single partial action, corresponding to one partial action ID, that repeats. Alternatively, as in the example above, a set of partial actions corresponding to several partial action IDs may be treated as one unit that repeats.

Setting the maximum number of repetitions (n) for S44 in advance, as described above, prevents the repeating primitive action within the non-verbal action mo_j' selected in S35 from being repeated excessively.
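One way to picture the bounded repetition of S44 is to pick the smallest number of extra repetitions, no more than n, that closes the remaining gap to the utterance time. The sketch below is an interpretation, not the procedure of the embodiment: the description only states that the repetition count lies between 1 and n, and the function and parameter names are assumptions.

```python
def bounded_repeat_count(remaining_ms: int, unit_ms: int, max_repeats: int) -> int:
    """Smallest x in 1..max_repeats with x * unit_ms >= remaining_ms, else max_repeats."""
    if unit_ms <= 0 or max_repeats < 1:
        raise ValueError("unit_ms must be positive and max_repeats at least 1")
    for x in range(1, max_repeats + 1):
        if x * unit_ms >= remaining_ms:
            return x
    return max_repeats  # the upper limit n stops excessive repetition


# 900 ms of the utterance remain; the repeating unit takes 400 ms; at most 5 repetitions.
print(bounded_repeat_count(remaining_ms=900, unit_ms=400, max_repeats=5))  # 3
```

If the upper limit is reached before the gap is closed, the flow of FIG. 9 falls back to selecting a further non-verbal action.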

The assigned action determination unit 203 adds the action time required for the non-verbal action mo_j'' generated in S44 to the current value of the counter for the total action time t_all (S45). The assigned action determination unit 203 stores the non-verbal action mo_j'' generated in S44 in the assigned action storage unit 206 (S46), and compares the utterance time t_i read in S31 with the current value of the total action time t_all after the addition in S45 (S47).
If the comparison in S47 shows that the current value of the total action time t_all after the addition in S45 is greater than or equal to the utterance time t_i read in S31 (Y in S47), the assigned action determination unit 203 ends the action assignment flow.
On the other hand, if the current value of the total action time t_all after the addition in S45 is smaller than the utterance time t_i read in S31 (N in S47), the process returns to S35.
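The flow of S31 to S47 as a whole can be sketched as a single loop. This is a simplified reading under several assumptions that the description leaves open: the `repeat_unit` field marking the repeating primitive actions is invented; in S45 only the time of the added repetitions is added to t_all; in the "continuous" branch the selected action is stored together with "end"; and when no candidates remain the function simply returns what has been assigned. The action times of partial actions "3" and "4" are also invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Action:
    attribute: str                     # "single-shot" or "continuous"
    partial_ids: List[str]             # partial actions in execution order
    repeat_unit: Tuple[str, ...] = ()  # repeating primitive actions, if any (assumed field)


def assign_actions(utterance_ms: int, probs: Dict[str, float],
                   actions: Dict[str, Action], durations_ms: Dict[str, int],
                   max_repeats: int = 3) -> List[Tuple[str, List[str]]]:
    """Return rows of (non-verbal action ID, partial action IDs) for one utterance."""
    assigned = []
    total_ms = 0                                                   # S34
    for action_id in sorted(probs, key=probs.get, reverse=True):   # S35
        action = actions[action_id]
        total_ms += sum(durations_ms[p] for p in action.partial_ids)  # S36, S37
        if total_ms >= utterance_ms:                               # Y in S38
            assigned.append((action_id, list(action.partial_ids)))  # S39
            return assigned
        if action.attribute == "continuous":                       # Y in S40
            assigned.append((action_id, list(action.partial_ids)))  # assumed
            assigned.append(("end", []))                           # S41
            return assigned
        if not action.repeat_unit:                                 # N in S42
            assigned.append((action_id, list(action.partial_ids)))  # S43
            continue
        unit_ms = sum(durations_ms[p] for p in action.repeat_unit)
        repeats = 1                                                # S44: 1..max_repeats
        while repeats < max_repeats and total_ms + repeats * unit_ms < utterance_ms:
            repeats += 1
        total_ms += repeats * unit_ms                              # S45 (assumed reading)
        extended = list(action.partial_ids) + list(action.repeat_unit) * repeats
        assigned.append((action_id, extended))                     # S46
        if total_ms >= utterance_ms:                               # Y in S47
            return assigned
    return assigned  # candidates exhausted (case not covered in the description)


durations = {"1": 300, "2": 400, "3": 200, "4": 200}  # "3" and "4" are invented values
actions = {
    "1": Action("single-shot", ["3", "4"], repeat_unit=("3", "4")),
    "2": Action("single-shot", ["1", "2"]),
}
result = assign_actions(1500, {"1": 0.5, "2": 0.2}, actions, durations, max_repeats=1)
print(result)  # [('1', ['3', '4', '3', '4']), ('2', ['1', '2'])]
```

With these invented values the result has the same shape as the two rows shown for "Hello" in FIG. 6: first the action whose partial actions run 3, 4, 3, 4, then the action whose partial actions run 1, 2.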

Next, the function that generates an operation scenario from the result of non-verbal action assignment is described. This function is implemented by the operation scenario generation unit 300.

FIG. 10 is a flowchart showing an example of the procedure for generating an operation scenario by the robot control device according to an embodiment of the present invention.
First, the operation scenario generation unit 300 selects the first row of the assigned action list (see FIG. 6) stored in the assigned action storage unit 206, reads from the selected row (1) the text of the utterance content and (2) the assigned non-verbal action ID, and describes the read utterance content and non-verbal action ID in a node of the state transition diagram (S51).
The operation scenario generation unit 300 then reads the row following the row selected in S51 in the assigned action list (see FIG. 6) (S52). If this row contains the same utterance text as the row selected in S51 (Y in S53) and the non-verbal action ID described in this row is not "end" (N in S54), the operation scenario generation unit 300 describes this non-verbal action ID in the same node (S55).

On the other hand, if the row read in S52 does not contain the same utterance text as the row selected in S51 (N in S53), the operation scenario generation unit 300 sets the transition condition between nodes to "utterance complete" and sets the utterance text and the non-verbal action ID in a new node that follows the node described in S51 (S56).

To handle the case where a non-verbal action ID with the "continuous" attribute has been assigned, if the non-verbal action ID is "end" in S54 (Y in S54), the operation scenario generation unit 300 likewise sets the transition condition between nodes to "utterance complete" and sets the utterance text and the non-verbal action ID in a new node that follows the node described in S51 (S56).
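The node construction of S51 to S56 can be sketched as a pass over the rows of the assigned action list. The description of S56 is terse, so this example makes explicit assumptions: rows are reduced to (utterance text, non-verbal action ID) pairs, nodes are plain dictionaries with invented keys, and a node created for the "end" action carries no utterance text so that the utterance is not spoken twice.

```python
from typing import Dict, List, Tuple


def build_scenario(rows: List[Tuple[str, str]]) -> List[Dict]:
    """rows: (utterance text, non-verbal action ID) pairs from the assigned action list."""
    nodes: List[Dict] = []
    for text, action_id in rows:
        same_text = bool(nodes) and nodes[-1]["utterance"] == text     # S53
        if same_text and action_id != "end":                           # Y in S53, N in S54
            nodes[-1]["action_ids"].append(action_id)                  # S55: same node
            continue
        if nodes:                                                      # S56: link to a new node
            nodes[-1]["transition"] = "utterance complete"
            nodes[-1]["next"] = len(nodes) + 1
        nodes.append({
            "id": len(nodes) + 1,
            "utterance": "" if action_id == "end" else text,  # assumed: "end" node is silent
            "action_ids": [action_id],
            "transition": None,
            "next": None,
        })
    return nodes


# Invented assigned action list, for illustration only.
rows = [("Hello", "1"), ("Hello", "2"), ("Welcome", "5"), ("Welcome", "end")]
nodes = build_scenario(rows)
```

For these rows the sketch yields three nodes: "Hello" with actions "1" and "2", "Welcome" with action "5", and a final node holding only the "end" action, each reached on the "utterance complete" condition.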

As described above, the robot control device according to an embodiment of the present invention uses the co-occurrence relationship between utterance content and non-verbal actions in robot operation scenarios created in the past to assign non-verbal actions automatically, based on the text of new utterance content entered by the scenario creator. This reduces the effort needed to create an operation scenario. Because the robot can be controlled simply by writing the sentences it should speak, even a person who is not skilled in creating operation scenarios can realize a robot service.

The present invention is not limited to the above embodiment and can be modified in various ways at the implementation stage without departing from its gist. The embodiments may also be combined as appropriate, in which case the combined effects are obtained. Furthermore, the above embodiment includes various inventions, and various inventions can be extracted by combinations selected from the disclosed constituent elements. For example, even if some constituent elements are removed from all the constituent elements shown in the embodiment, a configuration from which those elements are removed can be extracted as an invention, provided that the problem can still be solved and the effect obtained.

The methods described in the embodiments can also be stored, as a program (software means) executable by a computer, on a recording medium such as a magnetic disk (floppy (registered trademark) disk, hard disk, etc.), an optical disc (CD-ROM, DVD, MO, etc.), or a semiconductor memory (ROM, RAM, flash memory, etc.), or can be transmitted and distributed over a communication medium. The program stored on the medium includes a setting program that configures, within the computer, the software means to be executed by the computer (including not only the executable program but also tables and data structures). A computer that implements this device reads the program recorded on the recording medium, in some cases constructs the software means by means of the setting program, and executes the processing described above with its operation controlled by the software means. The recording medium referred to in this specification is not limited to media for distribution; it also includes storage media such as magnetic disks and semiconductor memories provided inside the computer or in devices connected via a network.

DESCRIPTION OF SYMBOLS: 10 ... robot control device, 100 ... operation scenario log storage unit, 101 ... dialogue act estimation unit, 102 ... co-occurrence probability calculation unit, 103 ... co-occurrence probability storage unit, 200 ... utterance content input unit, 201 ... utterance information acquisition unit, 202 ... utterance information storage unit, 203 ... assigned action determination unit, 204 ... non-verbal action storage unit, 205 ... partial action storage unit, 206 ... assigned action storage unit, 300 ... operation scenario generation unit, 301 ... robot.

Claims

A robot control device comprising:
first estimation means for estimating, from log information including utterance content of a robot and identification information of a non-verbal motion of the robot, a dialogue act indicating the intention of the utterance content;
calculation means for calculating, for the dialogue act estimated by the first estimation means, a co-occurrence probability of a non-verbal motion performed together with the dialogue act;
second estimation means for acquiring new utterance content of the robot and estimating both an utterance time, which is the time required to utter the acquired utterance content by voice, and a dialogue act indicating the intention of the acquired utterance content;
determination means for determining, based on the utterance time estimated by the second estimation means, the dialogue act estimated by the second estimation means, and the co-occurrence probability calculated by the calculation means, a non-verbal motion that is performed together with the dialogue act and whose motion time, which is the time required for the motion, has a length corresponding to the utterance time estimated by the second estimation means; and
generation means for generating an operation scenario of the robot based on the non-verbal motion determined by the determination means and the new utterance content.
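
For illustration only, the co-occurrence probability calculation recited above can be sketched as follows. The claim does not fix a formula, so this sketch assumes a simple relative frequency: the number of log entries in which a motion accompanies a dialogue act, divided by the number of log entries for that dialogue act. The function name, the log layout as (dialogue act, motion ID) pairs, and the example labels are hypothetical.

```python
from collections import Counter, defaultdict


def cooccurrence_probabilities(log):
    """Estimate P(motion | dialogue act) by relative frequency.

    `log` is an iterable of (dialogue_act, motion_id) pairs, i.e. log
    entries whose dialogue act has already been estimated (assumed layout).
    """
    pair_counts = defaultdict(Counter)
    for act, motion_id in log:
        pair_counts[act][motion_id] += 1

    probs = {}
    for act, counts in pair_counts.items():
        total = sum(counts.values())
        probs[act] = {motion: n / total for motion, n in counts.items()}
    return probs


# Hypothetical log: "greeting" was accompanied by "bow" twice and "wave" once.
example_log = [
    ("greeting", "bow"),
    ("greeting", "bow"),
    ("greeting", "wave"),
    ("question", "tilt_head"),
]
probs = cooccurrence_probabilities(example_log)
```

Here `probs["greeting"]` maps each motion ID seen with the "greeting" dialogue act to its estimated co-occurrence probability.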

The robot control device according to claim 1, wherein
the calculation means calculates, based on the dialogue act estimated by the first estimation means, the co-occurrence probability of a non-verbal motion performed together with the dialogue act for each of a plurality of types of non-verbal motion, and
the determination means selects the non-verbal motion having the highest co-occurrence probability among the plurality of types of non-verbal motion and,
when the motion time, which is the time required for the selected non-verbal motion, differs from the utterance time estimated by the second estimation means, repeatedly selects, from the plurality of types of non-verbal motion other than the selected non-verbal motion, the non-verbal motion having the highest co-occurrence probability.
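
As a non-limiting sketch, the repeated selection described above can be written as a loop that tries candidates in descending order of co-occurrence probability. The `tolerance` parameter, which decides when a motion time counts as differing from the utterance time, is an assumption of this sketch and is not specified in the claim; the function name and example values are likewise hypothetical.

```python
def select_motion(probs_for_act, motion_times, utterance_time, tolerance=0.5):
    """Pick the most probable motion whose duration fits the utterance.

    probs_for_act: {motion_id: co-occurrence probability} for one dialogue act.
    motion_times:  {motion_id: time required for the motion, in seconds}.
    tolerance:     allowed gap in seconds (assumed; not given in the claim).
    Returns a motion ID, or None if no candidate fits.
    """
    remaining = dict(probs_for_act)
    while remaining:
        # Candidate with the highest co-occurrence probability.
        best = max(remaining, key=remaining.get)
        if abs(motion_times[best] - utterance_time) <= tolerance:
            return best
        # Duration differs: exclude this motion and try the next best one.
        del remaining[best]
    return None


# Hypothetical values for one dialogue act.
probs_for_act = {"bow": 0.6, "wave": 0.3, "nod": 0.1}
motion_times = {"bow": 3.0, "wave": 1.2, "nod": 0.5}
chosen = select_motion(probs_for_act, motion_times, utterance_time=1.0)
```

With these values, "bow" is tried first and rejected because it takes 3.0 seconds against a 1.0 second utterance, so the loop falls back to "wave".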

The robot control device further comprising non-verbal motion storage means for storing information indicating each non-verbal motion in association with attribute information indicating whether the non-verbal motion is a motion that continues to be repeated until it is completed, wherein
the determination means
reads the attribute information corresponding to the selected non-verbal motion from the non-verbal motion storage means and, when the attribute information indicates a motion that continues to be repeated, determines the selected non-verbal motion, repeated until the motion time required for it reaches the utterance time estimated by the second estimation means, as the non-verbal motion that is performed together with the dialogue act estimated by the second estimation means and whose motion time has a length corresponding to the utterance time estimated by the second estimation means.
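
A minimal sketch of the repetition handling described above, assuming the attribute information is a boolean flag and that a repeatable motion is repeated a whole number of times until its total duration reaches the utterance time. Rounding up to whole repetitions is a choice of this sketch, and all names are hypothetical.

```python
import math


def fit_repeatable_motion(motion_id, motion_time, repeatable, utterance_time):
    """Return the sequence of motion IDs to play during the utterance.

    repeatable: attribute flag read from the non-verbal motion storage
                (assumed here to be a boolean).
    A repeatable motion is repeated until its total duration reaches the
    utterance time; a non-repeatable motion is played once.
    """
    if not repeatable or motion_time <= 0:
        return [motion_id]
    repeats = max(1, math.ceil(utterance_time / motion_time))
    return [motion_id] * repeats


# A 0.5-second nod stretched over a 1.8-second utterance.
nods = fit_repeatable_motion("nod", 0.5, True, 1.8)
# A non-repeatable bow is played once regardless of the utterance time.
bows = fit_repeatable_motion("bow", 3.0, False, 9.0)
```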

The robot control device according to claim 2, wherein
the determination means, when the selected non-verbal motion includes a motion that repeats the same partial motion, updates the selected non-verbal motion to a motion that repeats that partial motion.

A robot control method performed by a robot control device, the method comprising:
estimating, from log information including utterance content of a robot and identification information of a non-verbal motion of the robot, a first dialogue act indicating the intention of the utterance content;
calculating, for the estimated first dialogue act, a co-occurrence probability of a non-verbal motion performed together with that dialogue act;
acquiring new utterance content of the robot and estimating both an utterance time, which is the time required to utter the acquired utterance content by voice, and a second dialogue act indicating the intention of the acquired utterance content;
determining, based on the estimated utterance time, the estimated second dialogue act, and the calculated co-occurrence probability, a non-verbal motion that is performed together with the estimated second dialogue act and whose motion time, which is the time required for the motion, has a length corresponding to the estimated utterance time; and
generating an operation scenario of the robot based on the determined non-verbal motion and the new utterance content.

A robot control program for causing a processor to function as each means of the robot control device according to claim 1.