JP2019018336A

JP2019018336A - Device, method, program, and robot

Info

Publication number: JP2019018336A
Application number: JP2018007252A
Authority: JP
Inventors: 亮太宮崎; Ryota Miyazaki
Original assignee: Panasonic Intellectual Property Management Co Ltd
Current assignee: Panasonic Intellectual Property Management Co Ltd
Priority date: 2017-07-18
Filing date: 2018-01-19
Publication date: 2019-02-07
Anticipated expiration: 2038-01-19
Also published as: JP7075168B2

Abstract

PROBLEM TO BE SOLVED: To keep a robot engaged with a target person for as long as possible.
SOLUTION: A processing unit causes a robot to execute one of a first action, a second action, and a third action as an initial action for communicating with a target person, according to an acquired image and acquired sound. If the microphone acquires sound after the current action (which includes the initial action) is executed, the robot executes the action one level above the current action. If the microphone acquires no sound after the current action is executed, the processing unit determines whether the time elapsed since the current action was executed is less than a threshold. The robot continues the current action if the elapsed time is less than the threshold, and executes the action one level below the current action if the elapsed time is equal to or greater than the threshold.
SELECTED DRAWING: Figure 1

Description

The present disclosure relates to a robot or the like that communicates with a person.

Robots whose main purpose is to engage with a person through communication with that person are conventionally known. For such a robot, it is important to keep the user engaged with the robot for as long as possible.

Patent Document 1 discloses a technique that defines an autonomous task state, in which a robot executes a task independent of user input, and an engaged state, in which the robot interacts with a user, and that determines, based on the current situation, when to transition from the autonomous task state to the engaged state and when to transition from the engaged state to the autonomous task state.

Patent Document 1: Japanese Unexamined Patent Application Publication (Translation of PCT Application) No. 2014-502566

However, the conventional technique described above gives no consideration to the characteristics of infants, who lose sight of their surroundings when concentrating and who become bored easily. As a result, the engaged state cannot be maintained, and further improvement is needed.

A device according to one aspect of the present disclosure is a device that communicates with a target person by executing predetermined actions, the device comprising:
a camera that acquires video of the surroundings of the device;
a microphone that acquires sound from the surroundings of the device;
a processing unit;
a speaker; and
a drive unit that moves the device,
wherein the processing unit:
causes the device to execute, according to the acquired video and the acquired sound, one of a first action, a second action, and a third action as an initial action for communicating with the target person, the second action being one level above the third action and the first action being one level above the second action;
causes the device to execute the action one level above the current action if the microphone acquires sound after the current action, which includes the initial action, is executed;
determines, if the microphone acquires no sound after the current action is executed, whether the time elapsed since the current action was executed is less than a threshold;
causes the device to continue the current action if the elapsed time is determined to be less than the threshold;
causes the device to execute the action one level below the current action if the elapsed time is determined to be equal to or greater than the threshold;
causes the device to execute a predetermined task as the first action;
causes the speaker to output speech addressed to the target person as the second action; and
controls the drive unit to cause the device to move in synchronization with the movement of the target person as the third action.
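As a non-limiting illustration, the level-transition rule recited above can be sketched in Python as follows. The names `Action` and `next_action`, the use of seconds as the time unit, and the clamping at the highest and lowest levels are assumptions made for this sketch and do not appear in the disclosure.

```python
from enum import IntEnum


class Action(IntEnum):
    """Actions ordered by level; a larger value means a higher level."""
    THIRD = 1   # move in synchronization with the target person
    SECOND = 2  # talk to the target person
    FIRST = 3   # have the target person perform a predetermined task


def next_action(current: Action, sound_acquired: bool,
                elapsed_s: float, threshold_s: float) -> Action:
    """Return the action to execute after `current`."""
    if sound_acquired:
        # Sound was acquired: move one level up.
        # Assumption: stay at FIRST when already at the top.
        return Action(min(current + 1, Action.FIRST))
    if elapsed_s < threshold_s:
        # No sound yet and the threshold has not been reached: continue.
        return current
    # No sound and the threshold has been reached: move one level down.
    # Assumption: stay at THIRD, the lowest level in this three-level sketch.
    return Action(max(current - 1, Action.THIRD))
```

For example, under these assumptions `next_action(Action.THIRD, True, 0.0, 10.0)` yields `Action.SECOND`, while `next_action(Action.SECOND, False, 10.0, 10.0)` yields `Action.THIRD`.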

The above aspect achieves further improvement.

FIG. 1 is a block diagram showing an example of the overall configuration of a robot according to an embodiment of the present disclosure.
FIG. 2 is a diagram showing an example of an initial participation stage table.
FIG. 3 is a diagram showing an example of a transition table.
FIG. 4 is a flowchart showing an example of the participation stage determination process.
FIG. 5 is a flowchart continued from FIG. 4.
FIG. 6 is a flowchart showing an example of the detailed processing of S205 in FIG. 5.
FIG. 7 is a diagram showing an example of an interrupt prohibition condition table.
FIG. 8 is a diagram showing an example of a dictionary table.
FIG. 9 is a diagram showing an example of the external appearance of the robot.

(Background leading to the invention of one aspect of the present disclosure)
As described above, Patent Document 1 discloses a method for determining the timing of transitions between an engaged state and non-engaged states other than the engaged state (such as an autonomous task state and a rest state). Specifically, Patent Document 1 discloses transitioning from a non-engaged state to the engaged state when the user gazes at the robot and responds to an engagement offer (for example, a question) from the robot. Patent Document 1 also discloses transitioning to a non-engaged state when, in the engaged state, the conversation has broken down, the user has not responded to multiple questions, and the standby state has continued for a certain period.

The present inventor is researching robots that educate and discipline an infant through some kind of task, such as playing quiz games with the infant, encouraging the infant to tidy up, or encouraging the infant to study, and thereby improve the infant's abilities. Infants have the psychological characteristic of strong egocentrism: although they are highly curious, they tend to become bored easily and have difficulty sustaining concentration.

Therefore, when the technique of Patent Document 1 is applied to an infant, the infant cannot be expected to gaze at the robot or to respond reliably to the robot's questions, so there are few opportunities to transition from a non-engaged state to the engaged state. Moreover, even if the transition to the engaged state succeeds, an infant is likely to soon stop reacting to the robot's questions or to move far away from the robot, so the robot quickly returns to a non-engaged state. Thus, Patent Document 1 has the problem that the engaged state cannot be sustained for a long time.

This problem is not mentioned in Patent Document 1, and the inventor understands that it had not previously been identified.

The present inventor therefore found that, when having an infant perform a task, abruptly imposing the task while the infant's awareness is low is counterproductive, and that it is effective to raise the infant's awareness sufficiently before having the infant perform the task. The inventor further found that, to this end, it is effective to raise the infant's interest in the robot gradually while increasing and decreasing the degree of engagement with the infant.

To solve the above problem, the present inventor studied the following improvements based on these findings.

In this aspect, first, second, and third actions are provided in descending order of degree of engagement. First, one of the first to third actions is determined as the initial action from the video and sound around the robot. The first action is an action for having the target person perform a task, the second action is an action of talking to the target person, and the third action is an action of moving in synchronization with the target person. The degree of engagement with the target person therefore decreases in order from the first action to the third action.

If the microphone detects surrounding sound, the initial action is judged to suit the target person and the target person's interest in the device is judged to be high, so the initial action is shifted to the action one level above. Conversely, if the microphone detects no sound before the time elapsed since the initial action was executed reaches the threshold, the initial action is judged not to suit the target person's state of awareness and to have failed to raise the target person's interest, so the initial action is shifted to the action one level below. Thereafter, the current action is shifted step by step in this way while the target person's interest is raised, and the task is imposed on the target person once that interest is sufficiently high.

Because this aspect imposes a task on the target person only after sufficiently raising the target person's interest in the device, the device can remain engaged with the target person for a long time even when the target person has psychological characteristics like those of an infant. As a result, the target person can be kept working on the task for a long time, and the target person's abilities can be improved effectively.

In the above aspect, the action one level below the third action may be a fourth action,
the action one level below the fourth action may be a fifth action, and
the processing unit may:
cause the device to execute the fifth action if the microphone acquires sound after the current action is executed and the target person's speech contained in the acquired sound includes a phrase contained in a dictionary provided in the device;
control the drive unit to cause the device to perform a predetermined movement at its current position as the fourth action; and
cause the device to stop communicating with the target person as the fifth action.

In this aspect, a fourth action, in which the device performs a predetermined movement at its current position, and a fifth action, which has a lower degree of engagement than the fourth action and in which the device stops communicating with the target person, are additionally provided. The fifth action is executed when the target person utters a phrase contained in the dictionary (for example, "go away"). Thus, when the target person is actively refusing to communicate with the device and no increase in interest in the device can be expected, the device is prevented from stimulating the target person unnecessarily and from becoming a nuisance to the target person.
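By way of example only, the five-level variant with the dictionary check might be sketched in Python as follows. The names `LEVELS`, `REFUSAL_PHRASES`, and `next_level` are hypothetical, as are the assumptions that recognized speech is available as a text string and that the levels are clamped at both ends. The single dictionary entry is taken from the example phrase given above.

```python
from typing import Optional

# Levels from lowest to highest degree of engagement.
LEVELS = ["fifth", "fourth", "third", "second", "first"]

# Hypothetical dictionary contents; "go away" is the example phrase in the text.
REFUSAL_PHRASES = {"go away"}


def next_level(current: str, heard_text: Optional[str],
               elapsed_s: float, threshold_s: float) -> str:
    """Return the next level; `heard_text` is None when no sound was acquired."""
    index = LEVELS.index(current)
    if heard_text is not None:
        # A dictionary phrase in the speech sends the device straight to "fifth".
        if any(phrase in heard_text.lower() for phrase in REFUSAL_PHRASES):
            return "fifth"
        # Any other sound: one level up (assumption: clamp at "first").
        return LEVELS[min(index + 1, len(LEVELS) - 1)]
    if elapsed_s < threshold_s:
        return current  # keep waiting at the current level
    # Silence past the threshold: one level down (assumption: clamp at "fifth").
    return LEVELS[max(index - 1, 0)]
```

Checking the dictionary before the ordinary one-level-up rule reflects the text above, in which an explicit refusal takes priority over the mere presence of sound.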

In the above aspect, the processing unit may cause the device to execute the first action as the initial action if it recognizes the target person in the acquired video and recognizes the target person's voice in the acquired sound.

In this aspect, when the target person is, for example, speaking while facing the device, the target person's interest is judged to be sufficiently high and the first action is executed, so the task can be imposed on the target person promptly.

In the above aspect, the processing unit may cause the device to execute the second action as the initial action if it does not recognize the target person in the acquired video but recognizes the target person's voice in the acquired sound.

In this aspect, when the target person is, for example, saying something without facing the device, the second action, in which the device talks to the target person, is executed, so the target person is stimulated appropriately and the target person's interest can be raised.

In the above aspect, the processing unit may cause the device to execute the third action as the initial action if it recognizes the target person in the acquired video but does not recognize the target person's voice in the acquired sound.

In this aspect, when the target person is, for example, neither facing the device nor speaking, the third action, in which the device moves in synchronization with the target person, is executed, so the target person is stimulated appropriately and the target person's interest can be raised.
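The three conditions for selecting the initial action described above can be summarized, purely as an illustrative sketch, by the following Python function. The function name `initial_action`, its boolean parameters, and the `None` return value for the case in which neither the person nor the voice is recognized are assumptions of this sketch, since that case is not addressed in the passages above.

```python
from typing import Optional


def initial_action(person_in_video: bool, voice_in_sound: bool) -> Optional[str]:
    """Map the two recognition results to the initial action."""
    if person_in_video and voice_in_sound:
        return "first"    # impose the task promptly
    if voice_in_sound:
        return "second"   # voice only: talk to the target person
    if person_in_video:
        return "third"    # video only: move in sync with the target person
    return None           # assumption: no initial action is selected
```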

In the above aspect, the processing unit may, as the first action, cause the speaker to output speech proposing to start communication with the target person.

In this aspect, when the first action is performed, the task can be requested through communication with the target person, so the task can be requested of the target person in a natural manner.

In the above aspect, the processing unit may, if it recognizes a tilt of the target person's head in the acquired video, control the drive unit to tilt the upper part of the device in the same direction and at the same angle as the head tilt as the third action.

In this aspect, when the third action is performed, the posture of the device changes in conjunction with changes in the posture of the target person's head, so the target person can easily tell that the device is moving in synchronization with the target person's movement, which raises the target person's interest in the device.

In the above aspect, the processing unit may, if it recognizes in the acquired video a movement of the target person that follows a predetermined rhythm, control the drive unit to move the device in time with that rhythm as the third action.

In this aspect, when the third action is performed, the device moves in time with the rhythm of the target person's movement, so the target person can easily tell that the device is moving in synchronization with the target person's movement, which raises the target person's interest in the device.

In the above aspect, the processing unit may, as the second action, cause the speaker to output speech that includes a name corresponding to the target person.

In this aspect, when the second action is performed, the device talks to the target person using the target person's name, which raises the target person's interest in the device.

In the above aspect, the processing unit may, as the fourth action, cause the device to sway from side to side.

In this aspect, when the fourth action is performed, the device sways from side to side, so the target person's interest in the device can be drawn out with relatively little stimulation when that interest is not very high.

In the above aspect, the processing unit may, as the fourth action, cause the device to turn about an axis in the direction of gravity.

In this aspect, when the fourth action is performed, the device spins in place, so the target person's interest in the device can be drawn out with relatively little stimulation when that interest is not very high.

In the above aspect, the processing unit may, as the fifth action, cause the device to move away from the target person.

In this aspect, when the fifth action is performed, the device moves away from the target person, so unnecessary stimulation of the target person is avoided when no increase in the target person's interest can be expected.

In the above aspect, the processing unit may, as the fifth action, cause the device to turn 180 degrees about an axis in the direction of gravity.

In this aspect, the device turns to face away from the target person, so unnecessary stimulation of the target person is avoided when no increase in the target person's interest can be expected.

In the above aspect, the processing unit may cause the device to execute the fifth action if the microphone acquires no sound after the current action is executed and a predetermined interrupt prohibition condition is set in the device, and
the predetermined interrupt prohibition condition may include a condition on a predetermined time period and a condition on the location of the target person.

A target person may habitually perform, at a predetermined place and time, certain activities (for example, eating and sleeping) during which engagement by the robot would be unwelcome. In such cases, having the robot engage with the target person is undesirable because it disrupts the target person's daily routine. This aspect therefore provides interrupt prohibition conditions and, in the time periods and places for which an interrupt prohibition condition is set, causes the robot to execute the fifth action, in which the device withdraws from engagement with the target person. This prevents the robot from disrupting the target person's daily routine.
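As one possible illustration, an interrupt prohibition condition combining a time period and a location could be checked as sketched below in Python. The class `ProhibitCondition`, the function `interruption_prohibited`, and the representation of a location as a plain string are assumptions of this sketch. For simplicity it also assumes that a time period does not cross midnight.

```python
from dataclasses import dataclass
from datetime import time
from typing import Iterable


@dataclass(frozen=True)
class ProhibitCondition:
    """One hypothetical table row: a time period and a place."""
    start: time  # inclusive start of the period
    end: time    # exclusive end of the period (same day)
    place: str   # location of the target person


def interruption_prohibited(now: time, place: str,
                            conditions: Iterable[ProhibitCondition]) -> bool:
    """True if any condition matches both the current time and the place."""
    return any(c.place == place and c.start <= now < c.end
               for c in conditions)
```

With a hypothetical entry such as `ProhibitCondition(time(12, 0), time(13, 0), "dining room")`, the check returns true at 12:30 in the dining room. Under the rule described above, the device would then execute the fifth action if no sound had been acquired.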

The present disclosure can also be realized as a computer program that causes a computer to execute the characteristic steps performed by such a device. Needless to say, such a computer program can be distributed via a non-transitory computer-readable recording medium such as a CD-ROM, or via a communication network such as the Internet.

Each of the embodiments described below is a specific example of the present disclosure. The numerical values, shapes, components, steps, order of steps, and the like shown in the following embodiments are examples and are not intended to limit the present disclosure. Among the components in the following embodiments, those not recited in the independent claims, which represent the broadest concept, are described as optional components. The contents of all the embodiments may also be combined.

(Embodiment)
(Overall configuration)
An embodiment of the present disclosure is described below, taking as an example a case where the device of the present disclosure is applied to a robot.

FIG. 1 is a block diagram showing an example of the overall configuration of the robot 1 according to the embodiment of the present disclosure. The main purpose of the robot 1 is to support the development of an infant through communication with a user, such as an infant, who has the psychological characteristic of strong egocentrism. Communication here includes not only direct communication, such as the infant and the robot 1 conversing by voice, but also indirect communication, such as the robot 1 and the infant dancing together.

The robot 1 includes a sensor unit 100, an action execution unit 200, and an output unit 300. The sensor unit 100 includes a microphone 101 and a camera 102. The microphone 101 converts sound around the robot 1 into an audio signal, A/D-converts the audio signal at a predetermined sampling rate into digital audio data, and outputs the audio data to the processor 210. The camera 102 captures video of the surroundings of the robot 1 and acquires image data. The camera 102 includes, for example, a CCD or CMOS image sensor, captures images of the surroundings of the robot 1 at a predetermined frame rate (for example, 60 frames per second), and converts them into digital image data. The camera 102 may be an ordinary camera that captures the area in front of the robot 1, or an omnidirectional camera that captures all directions around the robot 1. The image data captured by the camera 102 is input to the processor 210 at the predetermined frame rate. The camera 102 may also be a stereo camera or an infrared camera, in which case the image data captured by the camera 102 includes a distance component indicating the distance to surrounding objects.

The action execution unit 200 includes a processor 210 and a memory 208. The processor 210 is, for example, an ASIC, a DSP, or a CPU, and includes a voice recognition unit 201, an image recognition unit 202, an initial participation stage determination unit 203, a participation execution unit 204, a transition determination unit 205, and a command generation unit 207. Each component of the processor 210 may be realized by the processor executing a program, stored in the memory 208, that causes a computer to function as the action execution unit 200, or may be implemented as a dedicated hardware circuit. All the components of the action execution unit 200 may be implemented in the same terminal. Alternatively, all or some of the components of the action execution unit 200 may be implemented individually on another terminal or a server connected via any network, such as optical fiber, wireless, or a public telephone line. In that case, the action execution unit 200 is realized by communicating with the other terminal or server.

The voice recognition unit 201 performs predetermined speech recognition processing on the audio data input from the microphone 101, recognizes from the audio data what a user near the robot 1 has said, converts the recognized utterance into a character string, and generates utterance data. Known techniques such as hidden Markov models, statistical methods, or dynamic time warping can be used for the predetermined speech recognition processing.

The voice recognition unit 201 also identifies the user who spoke by matching the voiceprint of the audio data input from the microphone 101 against the voiceprints of predetermined users stored in the memory 208. The voice recognition unit 201 then outputs data containing the identification information of the identified user and the utterance data as voice recognition data. The predetermined users include, for example, the infant whose education the robot 1 supports and the members of that infant's family. In the following description, the infant whose education the robot 1 supports is used as an example of the target person.

The image recognition unit 202 recognizes a user located near the robot 1 by applying face recognition processing to the image data input from the camera 102, and outputs the identification information of the recognized user as image recognition data. The image recognition unit 202 also detects the line of sight, face orientation, amount of movement, and the like of the user recognized by the face recognition processing, and outputs the detection results as image recognition data. The image recognition processing includes, for example, processing that extracts facial feature values of a person contained in the image data and processing that compares the extracted feature values with the facial feature values of predetermined users stored in advance in the memory 208.

The initial participation stage determination unit 203 determines the initial participation stage, which is the robot 1's initial stage of participation with respect to the infant, based on the voice recognition data output from the voice recognition unit 201 and the image recognition data output from the image recognition unit 202. FIG. 2 is a diagram showing an example of the initial participation stage table T1 that the initial participation stage determination unit 203 uses to determine the initial participation stage. The initial participation stage table T1 is stored in the memory 208 in advance.

The initial participation stage table T1 is a database that associates a plurality of initial participation stages with the conditions for each of them, and has the fields “recognition item”, “sensor”, “condition”, and “initial participation stage”.

The “initial participation stage” field holds three participation stages, “task execution”, “imitation”, and “question”, and “N/A”, which indicates that no participation stage applies. The degree of the robot 1's involvement with the infant is highest for “task execution”, followed by “question” and then “imitation”. A “participation stage” is an index used to determine how the robot 1 acts when it engages with the infant.

When the participation stage is “task execution”, the robot 1 executes an action that assigns a predetermined task to the infant (an example of the first action). The predetermined task is a job or assignment that the infant should work on, for example, playing riddles, tidying up, studying, doing homework, reading, having the robot 1 read a picture book aloud, or helping out. The task is determined in advance by the processor 210, before the initial participation stage is determined, based on at least one of the voice recognition data output from the voice recognition unit 201 and the image recognition data output from the image recognition unit 202. In other words, the processing that determines the initial participation stage is executed each time a new task is determined.

For example, when the utterance data includes a message in which the mother urges the infant to tidy up, “tidying up” is determined as the task. Specifically, if the voice recognition data includes utterance data from the mother such as “○○-chan, let's tidy up”, “tidying up” is determined as the predetermined task.

The “recognition item” field holds the items that the robot 1 recognizes for each initial participation stage. In the example of FIG. 2, “person recognition” and “voice recognition” are included as recognition items for every initial participation stage. “Person recognition” means that the robot 1 has recognized the infant from video. “Voice recognition” means that the robot 1 has recognized the infant from sound.

The “sensor” field holds the type of sensor used to perform the recognition listed in the recognition item. In the example of FIG. 2, for every initial participation stage, “camera” is registered for “person recognition” and “microphone” is registered for “voice recognition”. That is, “person recognition” is performed using image data captured by the camera 102, and “voice recognition” is performed using voice data collected by the microphone 101.

The “condition” field holds the condition under which each initial participation stage is selected. For example, “task execution” is selected when both “person recognition” and “voice recognition” are “recognized”. “Imitation” is selected when “person recognition” is “recognized” and “voice recognition” is “none” (that is, not recognized). “Question” is selected when “person recognition” is “none” and “voice recognition” is “recognized”. When both “person recognition” and “voice recognition” are “none”, no initial participation stage is determined.

The initial participation stage is determined on the premise that the more interested the infant is in the robot 1, the more strongly the robot 1 should engage with the infant. The infant's interest in the robot 1 is regarded as highest when the infant is facing the robot 1 and talking, next highest when the infant is talking without facing the robot 1, and lowest when the infant is facing the robot 1 but not talking.

Accordingly, in the initial participation stage table T1, the situation in which both “person recognition” and “voice recognition” are “recognized” is regarded as the stage in which the infant's interest in the robot 1 is highest, and is assigned “task execution”, the participation stage in which the robot 1's involvement with the infant is highest. The situation in which “person recognition” is “none” and “voice recognition” is “recognized” is regarded as the stage in which the infant's interest is second highest, and is assigned “question”, the participation stage with the second highest involvement. The situation in which “person recognition” is “recognized” and “voice recognition” is “none” is regarded as the stage in which the infant's interest is third highest, and is assigned “imitation”, the participation stage with the third highest involvement.
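The lookup against table T1 can be sketched as follows. This is a minimal illustration in Python, not the implementation of the embodiment; the names `INITIAL_STAGE_TABLE` and `initial_participation_stage` are hypothetical, and the two recognition results are assumed to have been reduced to booleans.

```python
from typing import Optional

# Rows of table T1 as described in the text:
# (person recognized, voice recognized) -> initial participation stage.
# None stands for "N/A" (no initial participation stage is determined).
INITIAL_STAGE_TABLE = {
    (True, True): "task execution",   # highest interest
    (False, True): "question",        # second highest interest
    (True, False): "imitation",       # third highest interest
    (False, False): None,
}


def initial_participation_stage(person_recognized: bool,
                                voice_recognized: bool) -> Optional[str]:
    """Return the initial participation stage for one pair of recognition results."""
    return INITIAL_STAGE_TABLE[(person_recognized, voice_recognized)]
```

Keeping the conditions in a table rather than in nested conditionals mirrors the way T1 is described as data stored in the memory 208.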

Reference is made to FIG. 1 again. The participation execution unit 204 outputs to the command generation unit 207 a request to output a command that causes the robot 1 to execute the action corresponding to the initial participation stage determined by the initial participation stage determination unit 203. When the transition determination unit 205, described later, determines that the participation stage is to change, the participation execution unit 204 outputs to the command generation unit 207 a request to output a command that causes the robot 1 to execute the action corresponding to the destination participation stage.

As shown in FIG. 3, described later, there are five participation stages: “task execution”, “question”, “imitation”, “waiting”, and “leaving”. The participation execution unit 204 therefore outputs to the command generation unit 207 requests to output commands that cause the robot 1 to execute the five actions corresponding to these five participation stages. The actions corresponding to “task execution”, “question”, “imitation”, “waiting”, and “leaving” are examples of the first action, the second action, the third action, the fourth action, and the fifth action, respectively.

For example, if the determined participation stage is “task execution”, a request to output the “task execution” command is output to the command generation unit 207, and if the determined participation stage is “imitation”, a request to output the “imitation” command is output to the command generation unit 207.

The transition determination unit 205 determines whether to shift the initial participation stage determined by the initial participation stage determination unit 203 to another participation stage. The transition determination unit 205 also determines whether to shift the participation stage reached after leaving the initial participation stage to yet another participation stage. Hereinafter, the participation stage currently set in the robot 1, whether it is the initial participation stage or a stage reached by a later transition, is referred to as the “current participation stage”.

When the voice recognition unit 201 recognizes the infant's voice after the action corresponding to the current participation stage has been executed, the transition determination unit 205 determines that the current participation stage is to be shifted to the next higher participation stage. That is, when the infant responds by speaking as a result of the robot 1 executing the action of the current participation stage, the transition determination unit 205 judges that the infant's interest in the robot 1 is tending to rise, and shifts the current participation stage up by one.

On the other hand, when the voice recognition unit 201 does not recognize the infant's voice after the action corresponding to the current participation stage (the current action) has been executed, the transition determination unit 205 determines whether the time elapsed since the current action was executed is less than a threshold. If the elapsed time is less than the threshold, the transition determination unit 205 determines that the current participation stage is to be continued. If the infant still has not spoken when the elapsed time reaches or exceeds the threshold, the transition determination unit 205 determines that the current participation stage is to be shifted to the next lower participation stage. That is, when the infant gives no spoken response even after the elapsed time has reached the threshold, the transition determination unit 205 judges that the current participation stage does not suit the infant and would instead reduce the infant's interest in the robot 1, and shifts down by one participation stage.

FIG. 3 shows an example of the transition table T2, which defines the order of the participation stages of the robot 1. The transition table T2 shown in FIG. 3 lists the five participation stages “task execution”, “question”, “imitation”, “waiting”, and “leaving” in descending order of the robot 1's involvement with the infant. For example, the transition determination unit 205 sets the participation stage step by step according to the order registered in the transition table T2, raising or lowering the degree of the robot 1's involvement with the infant. The transition table T2 is stored in the memory 208 in advance.
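The stepwise raising and lowering along table T2 can be illustrated with an ordered list. The sketch below is in Python and uses hypothetical names (`TRANSITION_TABLE`, `one_level_higher`, `one_level_lower`). Clamping at both ends of the table is an assumption made for illustration; the text here does not state what happens when a shift is requested beyond the highest or lowest stage.

```python
# Table T2: participation stages in descending order of the robot's involvement.
TRANSITION_TABLE = ["task execution", "question", "imitation", "waiting", "leaving"]


def one_level_higher(stage: str) -> str:
    """Return the stage one step up in T2 (assumed to stay at the top if already there)."""
    i = TRANSITION_TABLE.index(stage)
    return TRANSITION_TABLE[max(i - 1, 0)]


def one_level_lower(stage: str) -> str:
    """Return the stage one step down in T2 (assumed to stay at the bottom if already there)."""
    i = TRANSITION_TABLE.index(stage)
    return TRANSITION_TABLE[min(i + 1, len(TRANSITION_TABLE) - 1)]
```

For instance, `one_level_lower("imitation")` yields `"waiting"`, and `one_level_higher("imitation")` yields `"question"`.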

When the command generation unit 207 receives a command output request from the participation execution unit 204, it outputs the command indicated by the request to the output unit 300.

The output unit 300 is the component that causes the robot 1 to carry out the action corresponding to the current participation stage in accordance with the command from the command generation unit 207, and includes a speaker 301 and a drive unit 302.

In accordance with the command from the command generation unit 207, the speaker 301 converts the voice data needed to execute the action corresponding to the current participation stage into sound and outputs it externally.

The drive unit 302 includes, for example, actuators such as motors and mechanisms operated by the actuators, and causes the robot 1 to carry out the action corresponding to the current participation stage in accordance with the command from the command generation unit 207. The mechanisms include a member that moves the robot 1 forward or backward, a member that changes the posture of the robot 1, a member that changes the orientation of the display unit showing the facial expression of the robot 1, and the like. The actuators include a motor that drives the member that moves the robot 1 forward or backward, a motor that drives the member that changes the posture of the robot 1, a motor that changes the orientation of the display unit, and the like.

When the action execution unit 200 is implemented in a terminal or server separate from the main body of the robot 1, the speaker 301 and the drive unit 302 need only be connected, by wire or wirelessly, to the terminal or server in which the action execution unit 200 is implemented.

(Flowchart)
Next, the processing by which the robot 1 determines the participation stage is described with reference to FIGS. 4 and 5. FIG. 4 is a flowchart showing an example of the participation stage determination processing. FIG. 5 is a flowchart continuing from FIG. 4.

First, when the voice recognition unit is powered on, the processor 210 starts up the robot 1 (S101). Next, the initial participation stage determination unit 203 detects whether there is sensor input, based on the voice recognition data from the voice recognition unit 201 and the image recognition data from the image recognition unit 202 (S102). Here, the initial participation stage determination unit 203 determines that there is no sensor input when the voice recognition unit 201 has output no voice recognition data concerning the infant and the image recognition unit 202 has output no image recognition data concerning the infant, and determines that there is sensor input when at least one of voice recognition data concerning the infant and image recognition data concerning the infant has been output.

If it is determined in S102 that there is sensor input (YES in S102), the processing proceeds to S103; if it is determined that there is no sensor input (NO in S102), the processing returns to S102.

In S103, the initial participation stage determination unit 203 refers to the initial participation stage table T1 stored in the memory 208 and determines the initial participation stage of the robot 1. Here, the initial participation stage determination unit 203 selects, as the initial participation stage of the robot 1, the participation stage whose condition entry is matched.

For example, when the voice recognition unit 201 recognizes an utterance by the infant and the image recognition unit 202 recognizes the infant in the video, “task execution” is determined as the initial participation stage. When the voice recognition unit 201 does not recognize an utterance by the infant and the image recognition unit 202 recognizes the infant in the video, “imitation” is determined as the initial participation stage. When the voice recognition unit 201 recognizes an utterance by the infant and the image recognition unit 202 does not recognize the infant in the video, “question” is determined as the initial participation stage.

Next, referring to FIG. 5, if the current participation stage is not “leaving” in S202 (NO in S202), the participation execution unit 204 outputs to the command generation unit 207 a request to output a command for executing the action corresponding to the current participation stage (S203). Because there are three possible initial participation stages, “task execution”, “question”, and “imitation”, the action corresponding to one of these three participation stages is executed first.

When, for example, riddle play is executed as the task, the processor 210 outputs to the output unit 300 a command that makes the speaker 301 output a voice proposing to start communicating with the infant, such as “Let's play riddles”, and a command that makes the speaker 301 output a voice reading out a riddle. After the riddle has been read out, if the voice recognition unit 201 recognizes a spoken answer from the infant, the processor 210 checks the content of the utterance against the answer to the riddle and, if it is correct, outputs to the output unit 300 a command that makes the speaker 301 output a voice indicating that the answer is correct.

When tidying up is executed as the task, the processor 210 outputs to the output unit 300 a command that makes the speaker 301 output a voice prompt such as “Let's tidy up the room”, and then determines from the recognition result of the image recognition unit 202 whether the infant is tidying up. If it determines that the infant is tidying up, the processor 210 outputs to the output unit 300 a command that makes the speaker 301 output a voice praising the infant's behavior. If it determines that the infant is not tidying up, the processor 210 outputs to the output unit 300 a command that makes the speaker 301 output a voice urging the infant to tidy up.

The action corresponding to the “question” participation stage is one in which the robot 1 speaks to the infant, for example, by calling the infant's name.

The action corresponding to the “imitation” participation stage is one in which the robot 1 moves in step with the infant, that is, one that mimics the infant's movements. When the robot 1 executes the action corresponding to the “imitation” participation stage, and the processor 210 detects from the recognition result of the image recognition unit 202 that, for example, the infant is looking down and concentrating on some activity, the processor 210 outputs to the output unit 300 a command that tilts the upper part of the robot 1 at the same angle and in the same direction as the tilt of the infant's head. Here, “the same direction” assumes that the robot 1 and the infant are facing each other: if the infant tilts the head to the left, the robot 1 tilts its upper part to the right, and if the infant tilts the head to the right, the robot 1 tilts its upper part to the left.
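The left and right mirroring described above can be made concrete with a small sketch. The Python function below is hypothetical (`mirrored_tilt` and its parameters do not appear in the source) and assumes that the infant's head tilt has already been reduced to a side, "left" or "right", and an angle in degrees.

```python
# Because the robot and the infant face each other, the robot tilts to the
# opposite side (from its own point of view) by the same angle.
MIRROR = {"left": "right", "right": "left"}


def mirrored_tilt(infant_side: str, infant_angle_deg: float) -> tuple:
    """Return (robot side, robot angle) that visually matches the infant's head tilt."""
    return MIRROR[infant_side], infant_angle_deg
```

Under these assumptions, an infant head tilt of 15 degrees to the left maps to a robot tilt of 15 degrees to the right.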

When the processor 210 detects from the recognition result of the image recognition unit 202 that the infant is moving to a predetermined rhythm, as in dancing, the processor 210 outputs to the output unit 300 a command that makes the robot 1 dance to that rhythm.

On the other hand, if the current participation stage is “leaving” in S202 (YES in S202), the participation execution unit 204 outputs to the command generation unit 207 a request to output a command that causes the robot 1 to execute a withdrawal action, by which it withdraws from engaging with the infant, causes the robot 1 to execute the withdrawal action (S207), and ends the processing.

Here, the withdrawal action is an action by which the robot 1 withdraws from engaging with the infant. Examples are autonomous actions by which the robot 1 stops communicating with the infant, such as moving the robot 1 to a position a predetermined distance away from the infant, or turning the robot 1 by 180 degrees about an axis along the direction of gravity so that the front of the robot 1 faces away from the infant's face. This prevents the robot 1 from needlessly stimulating the infant, and the infant from coming to dislike the robot 1, when the infant's attention is so low that no increase in interest in the robot 1 can be expected.

In S204, if the current participation stage is “task execution” and the task has finished (YES in S204), the participation execution unit 204 outputs to the command generation unit 207 a request to output a command that causes the robot 1 to execute the withdrawal action, causes the robot 1 to execute the withdrawal action (S207), and ends the processing. This prevents the robot 1 from continuing to hang around the infant after the task has finished, and the infant from coming to dislike the robot 1.

On the other hand, if the current participation stage is “task execution” and the task has not finished (NO in S204), or if the current participation stage is “question”, “imitation”, or “waiting” (NO in S204), the transition determination unit 205 performs the processing that determines whether to shift the participation stage (S205).

A task ends when the end condition set for that task is satisfied. For example, if the task is to answer the infant's questions, the task ends if the infant asks no further question within a fixed time after the robot 1 has answered a question. If the task is riddle play, the task ends when the infant has answered the riddles posed by the robot 1 a predetermined number of times, or when the infant says nothing within a fixed time after a riddle is posed.

Next, the participation execution unit 204 outputs to the command generation unit 207 a request to output a command that causes the robot 1 to execute the action corresponding to the participation stage determined by the processing of S205 (S206). As a result, the robot 1 executes the action corresponding to the participation stage determined by the processing of S205. The participation stage determined by the processing of S205 is temporarily stored in the memory 208.
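One pass through steps S202 to S207 can be sketched as below. This is an illustrative Python outline, not the embodiment's code: `flow_step` is a hypothetical name, the requested commands are represented simply by stage names, and the transition determination of S205 is passed in as a callable. What the flow does after S206 is not stated in this passage, so the sketch stops after a single pass.

```python
from typing import Callable, List, Optional, Tuple


def flow_step(stage: str,
              task_finished: bool,
              decide_transition: Callable[[str], str]
              ) -> Tuple[List[str], Optional[str], bool]:
    """Return (requested commands, next stage or None, whether processing ended)."""
    commands: List[str] = []
    if stage == "leaving":                              # S202: YES
        commands.append("withdrawal action")            # S207
        return commands, None, True
    commands.append(stage)                              # S203: act on the current stage
    if stage == "task execution" and task_finished:     # S204: YES
        commands.append("withdrawal action")            # S207
        return commands, None, True
    next_stage = decide_transition(stage)               # S205: transition determination
    commands.append(next_stage)                         # S206: act on the decided stage
    return commands, next_stage, False
```

For example, calling `flow_step("imitation", False, lambda s: "waiting")` returns the commands for “imitation” and then “waiting”, with “waiting” as the stage to be stored.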

Next, the transition determination processing shown in S205 of FIG. 5 is described. FIG. 6 is a flowchart showing an example of the detailed processing of S205 in FIG. 5.

First, if the transition determination unit 205 determines that there is no voice response from the infant (NO in S301), it checks whether the interrupt-prohibited state applies (S302). FIG. 7 shows an example of the interrupt prohibition condition table T3, in which interrupt prohibition conditions are set.

If the infant habitually does something at a certain time and place during which it would be inconvenient for the robot 1 to engage, then engaging with the infant at that time and place disrupts the infant's daily routine. For example, if the infant habitually eats breakfast in the dining room in the morning, the robot 1 engaging with the infant during that morning time slot gets in the way of breakfast. In the present embodiment, therefore, interrupt prohibition conditions are provided, and in a time slot and place for which an interrupt prohibition condition is set, the robot 1 is made to execute the withdrawal action, which prevents the robot 1 from disrupting the infant's daily routine.

The interrupt prohibition condition table T3 is a two-dimensional table with time slots on the vertical axis and places on the horizontal axis. The “time slot” field holds the time slots into which the day is divided. Here, the day is divided into seven time slots: 7:00 to 9:00, 9:00 to 12:00, ..., and 21:00 to 7:00. The “place” field holds the names of the rooms in the infant's home. Each cell of the interrupt prohibition condition table T3 holds either “ON”, indicating that an interrupt prohibition condition is set, or “OFF”, indicating that none is set.

For example, “ON” is registered in the cell for the dining room in the 7:00 to 9:00 time slot. Accordingly, when the transition determination unit 205 detects that the infant is in the dining room during this time slot, it sets the participation stage to “leaving”. This is because in this household the infant habitually eats breakfast in the dining room between 7:00 and 9:00, for example, and having the robot 1 engage with the infant would get in the way of breakfast.

On the other hand, “OFF” is registered, for example, in the cells for places other than the dining room in the 7:00 to 9:00 time slot. Accordingly, even during this time slot, if the transition determination unit 205 detects that the infant is somewhere other than the dining room, the robot 1 operates normally, with actions other than the withdrawal action permitted.

The interrupt prohibition conditions are set in advance based on, for example, data that a user has entered into a mobile terminal such as a smartphone. The user may set an interrupt prohibition condition by, for example, speaking to the mobile terminal. This prevents the robot 1 from engaging with the infant while the interrupt-prohibited state applies.

Reference is made to FIG. 6 again. In S302, the transition determination unit 205 refers to the interrupt prohibition condition table T3 and, if it determines that the interrupt-prohibited state applies (YES in S302), shifts the participation stage to “leaving” (S305). If it determines that the interrupt-prohibited state does not apply (NO in S302), the processing proceeds to S303. Here, the transition determination unit 205 determines the room the infant is in from the recognition result of the image recognition unit 202. If “ON” is registered in the cell corresponding to that room and to the time slot containing the current time, it determines that the interrupt-prohibited state applies (YES in S302); if “OFF” is registered in that cell, it determines that the state does not apply (NO in S302).
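The S302 check against table T3 can be sketched as a lookup keyed by time slot and room. The Python below is a hypothetical illustration: only the 7:00 to 9:00 dining room entry comes from the text, cells not listed are treated as “OFF”, and the names `PROHIBITED`, `in_slot`, and `interrupt_prohibited` are assumptions. Time is simplified to a whole hour.

```python
# Excerpt of table T3 (assumed representation): ((start hour, end hour), room) -> ON?
# Only the entry below is taken from the text; every unlisted cell counts as OFF.
PROHIBITED = {
    ((7, 9), "dining room"): True,
}


def in_slot(hour: int, slot: tuple) -> bool:
    """Return True if the hour falls in [start, end), allowing slots that wrap past midnight."""
    start, end = slot
    if start < end:
        return start <= hour < end
    return hour >= start or hour < end  # e.g. the 21:00 to 7:00 slot


def interrupt_prohibited(hour: int, room: str, table: dict = PROHIBITED) -> bool:
    """Return True if the cell for this hour and room is ON (S302: YES)."""
    return any(on and r == room and in_slot(hour, slot)
               for (slot, r), on in table.items())
```

With this excerpt, `interrupt_prohibited(8, "dining room")` is true, while the same hour in any other room, or the dining room at 10:00, is not prohibited.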

In S303, the transition determination unit 205 determines whether the elapsed time since the action corresponding to the current participation stage was executed is less than a threshold. If the elapsed time is less than the threshold (YES in S303), the transition determination unit 205 maintains the current participation stage (S307). If the elapsed time is equal to or greater than the threshold (NO in S303), the transition determination unit 205 refers to the transition table T2 and shifts the current participation stage to the participation stage one level lower (S308). The threshold is a predetermined time beyond which the infant's interest is not expected to increase even if the robot 1 continues to execute the action corresponding to the same participation stage; for example, values such as 1 minute, 2 minutes, 3 minutes, 5 minutes, or 10 minutes can be adopted.

In FIG. 3, for example, if the current participation stage is "imitation" and it is determined in this state that the participation stage is to be lowered by one level, the participation stage is set to "standby". The action corresponding to the "standby" participation stage is an action of swinging the robot 1 left and right on the spot, or an action of turning (spinning) the robot 1 on the spot about the direction of gravity. In this way, when the infant's interest is low, the infant's interest in the robot 1 can be drawn out with relatively little stimulation.

If the transition determination unit 205 determines in S301 that there is a voice response from the infant (YES in S301), it determines whether the content of the infant's utterance hits the rejection term dictionary (S304). FIG. 8 is a diagram illustrating an example of a dictionary table T4 in which the rejection term dictionary is registered. Terms that reject the robot 1, such as "go away" (あっちいって), "don't talk to me" (話かけないで), and "be quiet" (うるさい), are registered in the dictionary table T4. If the content of the infant's utterance includes any of the terms registered in the dictionary table T4 (YES in S304), the transition determination unit 205 determines that the infant is actively refusing the engagement of the robot 1 and shifts the current participation stage to "leave" (S305).
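The dictionary check in S304 can be sketched as a substring search over the recognized utterance. This is only an illustration: the three terms are the examples given for dictionary table T4, while the function name `hits_rejection_dictionary` and the use of plain substring matching are assumptions; the description does not specify how a hit is computed.

```python
# Example rejection terms named for dictionary table T4:
# "go away", "don't talk to me", "be quiet".
REJECTION_TERMS = ("あっちいって", "話かけないで", "うるさい")


def hits_rejection_dictionary(utterance: str) -> bool:
    """Return True if the utterance contains any registered rejection term.

    Plain substring matching is an assumption made for this sketch.
    """
    return any(term in utterance for term in REJECTION_TERMS)
```

A hit corresponds to YES in S304, in which case the participation stage is shifted to "leave".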

On the other hand, if the content of the infant's utterance does not include any of the terms registered in the dictionary table T4 (NO in S304), the transition determination unit 205 shifts the current participation stage to the participation stage one level higher (S306). Referring to FIG. 3, for example, if the current participation stage is "imitation" and it is determined in this state that the participation stage is to be raised by one level, the participation stage is shifted to "question".

When the processing of S305, S306, S307, or S308 is completed, the process returns to S206 in FIG. 5. After the participation stage has been shifted, the process proceeds to S202 in FIG. 5, and the processing of S202 to S206 is repeated until the task ends or the withdrawal action is executed.
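Taken together, the branches S301 to S308 of FIG. 6 amount to a small decision function. The sketch below is illustrative only and rests on several assumptions: the list `STAGES` stands in for the transition table T2, the function name `next_stage` and its parameters are hypothetical, the three boolean inputs are assumed to have been computed beforehand, and the stage is clamped at the top and bottom of the list, which the description does not state.

```python
# Participation stages ordered from lowest to highest (stand-in for table T2).
STAGES = ["leave", "standby", "imitation", "question", "task execution"]


def next_stage(stage: str, voice_detected: bool, rejected: bool,
               interrupt_prohibited: bool, elapsed_s: float,
               threshold_s: float) -> str:
    """Return the participation stage chosen by the flow of S301 to S308."""
    index = STAGES.index(stage)
    if voice_detected:                      # S301: YES
        if rejected:                        # S304: YES
            return "leave"                  # S305
        # S306: one level higher (clamped at the top; an assumption)
        return STAGES[min(index + 1, len(STAGES) - 1)]
    if interrupt_prohibited:                # S302: YES
        return "leave"                      # S305
    if elapsed_s < threshold_s:             # S303: YES
        return stage                        # S307: maintain the current stage
    # S308: one level lower (clamped at the bottom; an assumption)
    return STAGES[max(index - 1, 0)]
```

For instance, from "imitation" a non-rejecting voice response leads to "question", whereas silence lasting at least the threshold leads to "standby".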

(Robot)
Next, the mechanism of the robot 1 will be described. FIG. 9 is a diagram illustrating an example of the appearance of the robot 1. The robot 1 includes a spherical-band-shaped main casing 401 and spherical crown portions 402 and 403; together, the main casing 401 and the spherical crown portions 402 and 403 form a sphere. That is, the robot 1 has a spherical shape. The robot 1 also includes a microphone 101, a camera 102, and a speaker 301 in the spherical crown portion 402 (or the spherical crown portion 403), as well as a control circuit (not shown). The action execution unit 200 shown in FIG. 1 is implemented in this control circuit. In the example of FIG. 9, the camera 102 is a stereo camera made up of two cameras, one provided in each of the spherical crown portions 402 and 403, and acquires video and distance data of the surrounding environment.

The center of the spherical crown portion 402 and the center of the spherical crown portion 403 are fixedly connected by a shaft (not shown) provided inside the main casing 401. The main casing 401 is attached so as to rotate freely relative to the shaft. A frame (not shown) and a display unit (not shown) are attached to the shaft. A first motor (not shown) that rotates the main casing 401 is attached to the frame. When the first motor rotates, the main casing 401 rotates relative to the spherical crown portions 402 and 403, and the robot 1 moves forward or backward. While the robot 1 moves forward or backward, the spherical crown portions 402 and 403 remain stationary, so the speaker 301 and the camera 102 stay facing the front of the robot 1. The display unit displays an image showing the eyes and mouth of the robot 1. The display unit is attached so that its angle relative to the shaft can be adjusted by the power of a second motor (not shown). Adjusting the angle of the display unit relative to the shaft therefore adjusts the direction of the eyes and mouth of the robot 1. Because the display unit is attached to the shaft independently of the main casing 401, its angle relative to the shaft does not change even when the main casing 401 rotates. The robot 1 can therefore move forward or backward with the direction of its eyes and mouth fixed.

Furthermore, a weight (not shown) is suspended from the shaft. The weight is attached so that it can swing, under the power of a third motor (not shown), about an axis along the front direction of the robot 1. Swinging the weight therefore makes the robot 1 rock on the spot. In addition, moving the robot 1 forward with the weight tilted to the left or right, as viewed from the rear toward the front, turns the robot 1 to the left or right. For example, by repeating in small increments an operation of moving the robot 1 forward with the weight tilted to the left and an operation of moving the robot 1 backward with the weight tilted to the right, the robot 1 can turn about the direction of gravity (turn on the spot).
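The on-the-spot turn described above can be pictured as a repeating command sequence. The sketch below only generates that sequence; the command names `tilt_weight` and `drive`, the tuple format, and the function name `spot_turn_commands` are hypothetical, and how each command maps to the first and third motors is not specified here.

```python
def spot_turn_commands(cycles: int) -> list[tuple[str, str]]:
    """Build the alternating command sequence for a turn on the spot.

    Each cycle tilts the weight left and drives forward a little, then tilts
    the weight right and drives backward a little.
    """
    commands = []
    for _ in range(cycles):
        commands.append(("tilt_weight", "left"))
        commands.append(("drive", "forward"))
        commands.append(("tilt_weight", "right"))
        commands.append(("drive", "backward"))
    return commands
```

Repeating more cycles with small drive increments corresponds to a larger turning angle about the direction of gravity.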

The main casing 401, the first motor, the second motor, the third motor, and the display unit described above constitute the drive unit 302 in FIG. 1.

The control circuit controls the various operations of the robot 1. Although the robot 1 is spherical as a whole, it is not limited to this shape; it need only have at least a movement mechanism.

As described above, in the present embodiment, the degree of engagement of the robot 1 with the infant is raised step by step, so that the infant's interest in the robot 1 increases step by step, and a task can be assigned to the infant once the infant's interest has risen sufficiently. The present disclosure therefore makes it possible for the robot 1 to stay engaged for a long time with an infant, whose psychological characteristics make sustained concentration difficult. As a result, the infant can be kept working on a task for a long time, and the infant's abilities can be improved effectively.

The present disclosure can adopt the following modifications.

(1) In the above embodiment, an example in which the device according to the present disclosure is applied to the robot 1 has been described. However, the present disclosure is not limited to this and may be implemented in any movable device other than the robot 1.

(2) In the above embodiment, "leave" and "standby" are included as participation stages, but these participation stages may be omitted. In this case, the robot 1 takes an action corresponding to one of the three participation stages "task execution", "question", and "imitation".

(3) The flowchart of FIG. 6 includes a process for determining whether the interrupt prohibition state is in effect (S302) and a process for determining whether there is a hit in the rejection term dictionary (S304). This is only an example, and either or both of these processes may be omitted.

(4) The flowchart of FIG. 5 is started when the power is turned on. This is only an example, and the flowchart may instead be started when a task to be assigned to the infant arises.

According to the present disclosure, the engaged state of a robot can be sustained with a target person who is easily bored and with whom it is difficult to find an opportunity to engage, so the present disclosure is useful for educational robots.

DESCRIPTION OF SYMBOLS
1 Robot
100 Sensor unit
101 Microphone
102 Camera
200 Action execution unit
201 Voice recognition unit
202 Image recognition unit
203 Initial participation stage determination unit
204 Participation execution unit
205 Transition determination unit
207 Command generation unit
208 Memory
210 Processor
300 Output unit
301 Speaker
302 Drive unit
T1 Initial participation stage table
T2 Transition table
T3 Interrupt prohibition condition table
T4 Dictionary table

Claims

A device that communicates with a target person by executing a predetermined action, the device comprising:
a camera that acquires video around the device;
a microphone that acquires sound around the device;
a processing unit;
a speaker; and
a drive unit that moves the device,
wherein the processing unit:
causes the device to execute, in accordance with the acquired video and the acquired sound, any one of a first action, a second action, and a third action as an initial action for communicating with the target person, the second action being one level higher than the third action and the first action being one level higher than the second action;
when sound is acquired by the microphone after a current action, including the initial action, is executed, causes the device to execute an action one level higher than the current action;
when no sound is acquired by the microphone after the current action is executed, determines whether the elapsed time since the current action was executed is less than a threshold;
when the elapsed time is determined to be less than the threshold, causes the device to continue the current action;
when the elapsed time is determined to be equal to or greater than the threshold, causes the device to execute an action one level lower than the current action;
causes the device to execute a predetermined task as the first action;
causes the speaker to output speech addressed to the target person as the second action; and
controls the drive unit to cause the device to move in synchronization with movement of the target person as the third action.

The device of claim 1, wherein
an action one level lower than the third action is a fourth action,
an action one level lower than the fourth action is a fifth action, and
the processing unit:
when sound is acquired by the microphone after the current action is executed and speech of the target person included in the acquired sound contains a phrase included in a dictionary provided in the device, causes the device to execute the fifth action;
controls the drive unit to cause the device to perform a predetermined movement at the current position of the device as the fourth action; and
causes the device to stop communicating with the target person as the fifth action.

The device of claim 1, wherein, when the processing unit recognizes the target person from the acquired video and recognizes speech of the target person from the acquired sound, the processing unit causes the device to execute the first action as the initial action.

The device of claim 1, wherein, when the processing unit does not recognize the target person from the acquired video and recognizes speech of the target person from the acquired sound, the processing unit causes the device to execute the second action as the initial action.

The device of claim 1, wherein, when the processing unit recognizes the target person from the acquired video and does not recognize speech of the target person from the acquired sound, the processing unit causes the device to execute the third action as the initial action.

The device of claim 3, wherein the processing unit causes the speaker to output, as the first action, speech proposing to start communication with the target person.

The device of claim 5, wherein, when the processing unit recognizes a tilt of the head of the target person from the acquired video, the processing unit controls the drive unit to tilt the upper part of the device, as the third action, in the same direction and at the same angle as the tilt of the head.

The device of claim 5, wherein, when the processing unit recognizes from the acquired video a movement of the target person that follows a predetermined rhythm, the processing unit controls the drive unit to move the device in time with the rhythm as the third action.

The device of claim 4, wherein the processing unit causes the speaker to output, as the second action, speech including a name corresponding to the target person.

The device of claim 2, wherein the processing unit causes the device to swing left and right as the fourth action.

The device of claim 2, wherein the processing unit causes the device to turn about the direction of gravity as the fourth action.

The device of claim 2, wherein the processing unit causes the device to move away from the target person as the fifth action.

The device of claim 2, wherein the processing unit causes the device to turn 180 degrees about the direction of gravity as the fifth action.

The device of claim 2, wherein, when no sound is acquired by the microphone after the current action is executed and a predetermined interrupt prohibition condition is set in the device, the processing unit causes the device to execute the fifth action, the predetermined interrupt prohibition condition including a condition on a predetermined time zone and a condition on the location of the target person.

A method in a device that communicates with a target person by executing a predetermined action, the method comprising:
acquiring video around the device with a camera;
acquiring sound around the device with a microphone;
causing the device to execute, in accordance with the acquired video and the acquired sound, any one of a first action, a second action, and a third action as an initial action for communicating with the target person, the second action being one level higher than the third action and the first action being one level higher than the second action;
when sound is acquired by the microphone after a current action, including the initial action, is executed, causing the device to execute an action one level higher than the current action;
when no sound is acquired by the microphone after the current action is executed, determining whether the elapsed time since the current action was executed is less than a threshold;
when the elapsed time is determined to be less than the threshold, causing the device to continue the current action;
when the elapsed time is determined to be equal to or greater than the threshold, causing the device to execute an action one level lower than the current action;
causing the device to execute a predetermined task as the first action;
causing a speaker to output speech addressed to the target person as the second action; and
controlling a drive unit to cause the device to move in synchronization with movement of the target person as the third action.

A program for causing a computer to execute the method according to claim 15.

A robot that communicates with a target person by executing a predetermined action, the robot comprising:
a camera that acquires video around the robot;
a microphone that acquires sound around the robot;
a processing unit;
a speaker; and
a drive unit that moves the robot,
wherein the processing unit:
causes the robot to execute, in accordance with the acquired video and the acquired sound, any one of a first action, a second action, and a third action as an initial action for communicating with the target person, the second action being one level higher than the third action and the first action being one level higher than the second action;
when sound is acquired by the microphone after a current action, including the initial action, is executed, causes the robot to execute an action one level higher than the current action;
when no sound is acquired by the microphone after the current action is executed, determines whether the elapsed time since the current action was executed is less than a threshold;
when the elapsed time is determined to be less than the threshold, causes the robot to continue the current action;
when the elapsed time is determined to be equal to or greater than the threshold, causes the robot to execute an action one level lower than the current action;
causes the robot to execute a predetermined task as the first action;
causes the speaker to output speech addressed to the target person as the second action; and
controls the drive unit to cause the robot to move in synchronization with movement of the target person as the third action.