JP2008254103A

JP2008254103A - Presenter action reproducing robot, and controlling method and controlling program of presenter action reproducing robot

Info

Publication number: JP2008254103A
Application number: JP2007097617A
Authority: JP
Inventors: Hitoshi Morikawa; 仁志森川
Original assignee: SKY Co Ltd
Current assignee: SKY Co Ltd
Priority date: 2007-04-03
Filing date: 2007-04-03
Publication date: 2008-10-23

Abstract

PROBLEM TO BE SOLVED: To obtain a presenter action reproducing robot that can make gestures capable of correctly conveying the intention of a material explainer to a user, and a controlling method and a controlling program for the presenter action reproducing robot.

SOLUTION: An action pattern searching portion 10 is provided that searches the robot action patterns stored in an action pattern memory portion 9 for an action pattern that corresponds both to the action of the presenter analyzed by an action analyzing portion 3 and to a key word detected by a key word detecting portion 8. A body controlling portion 12 controls the body according to the action pattern found by the action pattern searching portion 10.

COPYRIGHT: (C)2009, JPO&INPIT

Description

The present invention relates to a presenter motion reproduction robot that reproduces the motions of a presenter who, for example, explains materials, and to a control method and a control program for the presenter motion reproduction robot.

Patent Document 1 below discloses a technique in which a motion analysis device that detects the motion of an inspection engineer's head is provided and, by referring to the detection result of that device, the head of a robot is controlled so that it performs the same motion as the inspection engineer's head.
In this robot, however, the motion analysis device detects the motion of the inspection engineer's head but does not analyze the inspection engineer's voice.
Consequently, the motion of the robot depends only on the motion of the inspection engineer's head and has no correspondence with the inspection engineer's voice.

JP 2005-118953 A (paragraphs [0014] to [0025], FIG. 1)

Because the conventional robot is configured as described above, its head is controlled to perform the same motion as the inspection engineer's head, but it is controlled independently of the inspection engineer's voice. As a result, there has been the problem that a user watching the robot's motion may not be able to readily grasp the intention behind it.

The present invention has been made to solve the above problem, and its object is to obtain a presenter motion reproduction robot capable of making gestures that accurately convey the intention of a material explainer to a user, together with a control method and a control program for the presenter motion reproduction robot.

The presenter motion reproduction robot according to the invention of claim 1 comprises: motion analysis means for analyzing the motion of a material explainer; voice recognition means for recognizing the voice of the material explainer; robot motion storage means storing robot motions corresponding to motions and voices of the material explainer; robot motion search means for searching the robot motions stored in the robot motion storage means for a robot motion that corresponds to the motion of the material explainer analyzed by the motion analysis means and also to the voice of the material explainer recognized by the voice recognition means; and body control means for controlling the body according to the robot motion found by the robot motion search means.

According to the invention of claim 1, the robot can make gestures that accurately convey the material explainer's intention to the user.

In the presenter motion reproduction robot according to the invention of claim 2, the robot motion storage means stores, as robot motions corresponding to the voice of the material explainer, robot motions corresponding to specific keywords; the voice recognition means detects a specific keyword in the voice of the material explainer; and the robot motion search means searches the robot motions stored in the robot motion storage means for a robot motion that corresponds to the motion of the material explainer and also to the specific keyword detected by the voice recognition means.

According to the invention of claim 2, the motion of the presenter motion reproduction robot can be associated with specific keywords, and the robot can moreover make gestures that accurately convey the material explainer's intention to the user.

In the presenter motion reproduction robot according to the invention of claim 3, when the voice recognition means detects no specific keyword, the robot motion search means searches the robot motions stored in the robot motion storage means for a robot motion corresponding to the motion of the material explainer.

According to the invention of claim 3, the motion of the material explainer can be reproduced even when no specific keyword is detected.

The control method for a presenter motion reproduction robot according to the invention of claim 4 comprises: a motion analysis step in which motion analysis means analyzes the motion of a material explainer; a voice recognition step in which voice recognition means recognizes the voice of the material explainer; a robot motion search step in which robot motion search means searches robot motion storage means, which stores robot motions corresponding to motions and voices of the material explainer, for a robot motion that corresponds to the motion of the material explainer analyzed by the motion analysis means and also to the voice of the material explainer recognized by the voice recognition means; and a body control step in which body control means controls the body according to the robot motion found by the robot motion search means.

According to the invention of claim 4, the robot can make gestures that accurately convey the material explainer's intention to the user.

The control program for a presenter motion reproduction robot according to the invention of claim 5 comprises: a motion analysis procedure for analyzing the motion of a material explainer; a voice recognition procedure for recognizing the voice of the material explainer; a robot motion search procedure for searching robot motion storage means, which stores robot motions corresponding to motions and voices of the material explainer, for a robot motion that corresponds to the motion of the material explainer analyzed by the motion analysis procedure and also to the voice of the material explainer recognized by the voice recognition procedure; and a body control procedure for controlling the body according to the robot motion found by the robot motion search procedure.

According to the invention of claim 5, the robot can make gestures that accurately convey the material explainer's intention to the user.

According to the present invention, robot motion search means is provided that searches the robot motions stored in the robot motion storage means for a robot motion that corresponds to the motion of the material explainer analyzed by the motion analysis means and also to the voice of the material explainer recognized by the voice recognition means, and the body control means controls the body according to the robot motion found by the robot motion search means. The robot can therefore make gestures that accurately convey the material explainer's intention to the user.

Embodiment 1.
FIG. 1 is a block diagram showing a presenter motion reproduction robot according to Embodiment 1 of the present invention. In FIG. 1, a camera 1 photographs a presenter, who is the material explainer, and stores the video of the presenter in a video storage unit 2.
The video storage unit 2 is a memory that stores the video of the presenter output from the camera 1.
The motion analysis unit 3 is composed of, for example, a semiconductor integrated circuit board on which a CPU or the like is mounted, and analyzes the motion of the presenter by capturing changes in the video stored in the video storage unit 2.
The camera 1, the video storage unit 2, and the motion analysis unit 3 constitute the motion analysis means.

The keyword storage unit 4 is a memory that stores the keywords to be detected in the presenter's voice.
The microphone 5 picks up the presenter's voice and outputs the voice signal to the voice storage unit 6.
The voice storage unit 6 is a memory that stores the voice signal output from the microphone 5.

The voice recognition processing unit 7 is composed of, for example, a semiconductor integrated circuit board on which a CPU or the like is mounted, and identifies the presenter's voice by analyzing the voice signal stored in the voice storage unit 6.
The keyword detection unit 8 is composed of, for example, a semiconductor integrated circuit board on which a CPU or the like is mounted, and detects the keywords stored in the keyword storage unit 4 in the voice identified by the voice recognition processing unit 7.
The keyword storage unit 4, the microphone 5, the voice storage unit 6, the voice recognition processing unit 7, and the keyword detection unit 8 constitute the voice recognition means.

The motion pattern storage unit 9 is a memory that stores motion patterns (robot motions) corresponding to motions of the presenter and to specific keywords. The motion pattern storage unit 9 constitutes the robot motion storage means.
The motion pattern search unit 10 is composed of, for example, a semiconductor integrated circuit board on which a CPU or the like is mounted, and searches the motion patterns stored in the motion pattern storage unit 9 for a motion pattern that corresponds to the presenter's motion analyzed by the motion analysis unit 3 and also to the keyword detected by the keyword detection unit 8. The motion pattern search unit 10 constitutes the robot motion search means.
The motion pattern storage unit 11 is a memory that stores the motion pattern retrieved by the motion pattern search unit 10.

The body control unit 12 is composed of, for example, a semiconductor integrated circuit board on which a CPU or the like is mounted, and controls the body according to the motion pattern stored in the motion pattern storage unit 11, in synchronization with the timing signal output from the timing generation unit 14. The body control unit 12 constitutes the body control means.
The voice reproduction unit 13 is composed of, for example, an audio device, and reproduces the presenter's voice according to the voice signal stored in the voice storage unit 6, in synchronization with the timing signal output from the timing generation unit 14.
On receiving an external request to start robot operation, the timing generation unit 14 outputs a timing signal (for example, a pulse signal of a predetermined frequency or a start trigger signal) to the body control unit 12 and the voice reproduction unit 13 in order to establish synchronization between them.
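The role of the timing signal can be illustrated with a minimal sketch. It assumes a pulse-style signal in which every pulse advances both consumers by one step; the class names `TimingGenerator` and `StepPlayer` and the step contents are hypothetical and are not taken from the specification.

```python
class TimingGenerator:
    """Hypothetical stand-in for timing generation unit 14."""

    def __init__(self):
        self.listeners = []

    def attach(self, listener):
        self.listeners.append(listener)

    def start(self, n_ticks):
        # Each tick plays the role of one pulse of the timing signal and is
        # delivered to every attached unit before the next pulse is issued.
        for tick in range(n_ticks):
            for listener in self.listeners:
                listener.on_tick(tick)


class StepPlayer:
    """Plays one item of a sequence per tick and records when it did so."""

    def __init__(self, name, items):
        self.name = name
        self.items = list(items)
        self.log = []

    def on_tick(self, tick):
        if tick < len(self.items):
            self.log.append((tick, self.items[tick]))


timing = TimingGenerator()
# Assumed contents: three motion steps and three voice chunks of equal duration.
body = StepPlayer("body control unit 12",
                  ["raise right hand", "sweep right to left", "point at product"])
voice = StepPlayer("voice reproduction unit 13",
                   ["chunk 0", "chunk 1", "chunk 2"])
timing.attach(body)
timing.attach(voice)
timing.start(3)  # corresponds to the external operation start request
```

Because both units advance only when a pulse arrives, step k of the motion pattern and chunk k of the voice signal are always issued on the same tick.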

FIG. 2 is a block diagram showing the motion analysis unit 3 of the presenter motion reproduction robot according to Embodiment 1 of the present invention. In FIG. 2, the change site detection unit 21 detects the moving part of the presenter by capturing changes in the video stored in the video storage unit 2.
The change site analysis unit 22 analyzes the movement (for example, the direction of movement) of the part detected by the change site detection unit 21.

FIG. 3 is a block diagram showing the voice recognition processing unit 7 of the presenter motion reproduction robot according to Embodiment 1 of the present invention. In FIG. 3, the feature extraction unit 31 performs, for example, LPC analysis on the voice signal stored in the voice storage unit 6 and thereby extracts 34-dimensional feature parameters comprising the logarithmic power, 16th-order cepstrum coefficients, Δ logarithmic power, and Δ 16th-order cepstrum coefficients of the voice signal.
The HMM memory 32 is a memory that stores hidden Markov models. (A hidden Markov model consists of a plurality of states and arcs indicating transitions between the states; each arc holds the transition probability between states and the output probability for an input code (feature parameter).)
The phoneme matching unit 33 performs phoneme matching using the hidden Markov models stored in the HMM memory 32 and thereby generates phoneme data from the feature parameters extracted by the feature extraction unit 31.

The language model storage unit 34 is a memory that stores a statistical language model.
The voice recognition unit 35 refers to the statistical language model stored in the language model storage unit 34 and executes, for example, the One Pass DP algorithm. That is, it processes the phoneme data generated by the phoneme matching unit 33 from left to right without backtracking, and determines the words with the higher occurrence probability as the voice recognition result (the presenter's voice).

FIG. 4 is a front view showing the presenter motion reproduction robot according to Embodiment 1 of the present invention, and FIG. 5 is a side view of the same robot.
FIGS. 4 and 5 show an example of a presenter motion reproduction robot that reproduces the presenter's motion by driving actuators of the upper limbs (left upper arm 41L, right upper arm 41R, left lower arm 42L, right lower arm 42R) and of the neck (neck joint 46N).
In FIGS. 4 and 5, one end of the left upper arm 41L of the presenter motion reproduction robot is movably attached to the left shoulder joint 43L, and one end of the left lower arm 42L is movably attached to the left elbow joint 44L.
The left shoulder joint 43L is a mechanical element composed of, for example, an actuator that rotates the left upper arm 41L in the direction of arrow A and an actuator that swings the left upper arm 41L in the direction of arrow B, under the direction of the body control unit 12.
The left elbow joint 44L is a mechanical element composed of, for example, an actuator that rotates the left lower arm 42L in the direction of arrow C, under the direction of the body control unit 12.

One end of the right upper arm 41R of the presenter motion reproduction robot is movably attached to the right shoulder joint 43R, and one end of the right lower arm 42R is movably attached to the right elbow joint 44R.
The right shoulder joint 43R is a mechanical element composed of, for example, an actuator that rotates the right upper arm 41R in the direction of arrow A and an actuator that swings the right upper arm 41R in the direction of arrow B, under the direction of the body control unit 12.
The right elbow joint 44R is a mechanical element composed of, for example, an actuator that rotates the right lower arm 42R in the direction of arrow C, under the direction of the body control unit 12.

The moving rollers 45L and 45R are a movement mechanism that moves the presenter motion reproduction robot under the direction of the body control unit 12.
The neck joint 46N is a mechanical element composed of, for example, actuators that tilt the neck of the presenter motion reproduction robot up and down and rotate it left and right, under the direction of the body control unit 12.

The example of FIG. 1 assumes that the components of the presenter motion reproduction robot, namely the camera 1, the motion analysis unit 3, the microphone 5, the voice recognition processing unit 7, the keyword detection unit 8, the motion pattern search unit 10, the body control unit 12, the voice reproduction unit 13, and the timing generation unit 14, are each implemented in dedicated hardware. When the presenter motion reproduction robot is implemented by a computer, however, a program describing the processing of the camera 1, the motion analysis unit 3, the microphone 5, the voice recognition processing unit 7, the keyword detection unit 8, the motion pattern search unit 10, the body control unit 12, the voice reproduction unit 13, and the timing generation unit 14 may be stored in the memory of the computer, and the CPU of the computer may execute the program stored in that memory.
FIG. 6 is a flowchart showing the processing of the presenter motion reproduction robot according to Embodiment 1 of the present invention.

Next, the operation will be described.
The camera 1 photographs the presenter, who is the material explainer, at fixed intervals (for example, every 5 seconds) and stores the video of the presenter in the video storage unit 2 (step ST1).
When the video captured by the camera 1 is intermittent, the latest video, captured at time T, is denoted P_T, the video captured the previous time, at time T-1, is denoted P_T-1, and the video captured N times earlier, at time T-N, is denoted P_T-N.
When the video captured by the camera 1 is continuous, the video of the latest frame is denoted P_T, the video one frame earlier P_T-1, and the video N frames earlier P_T-N.

When the camera 1 stores the video of the presenter in the video storage unit 2, the motion analysis unit 3 analyzes the motion of the presenter by capturing changes among the videos P_T, P_T-1, P_T-2, ..., P_T-N stored in the video storage unit 2 (step ST2).
The motion analysis performed by the motion analysis unit 3 is described concretely below.
For simplicity, the description uses an example in which the videos P_T, P_T-1, and P_T-2 are compared, as shown in FIG. 7.

The change site detection unit 21 of the motion analysis unit 3 detects the moving part of the presenter by capturing changes among the videos P_T, P_T-1, and P_T-2 stored in the video storage unit 2.
That is, as shown in FIG. 8, the change site detection unit 21 obtains the difference video S_T between the videos P_T and P_T-1 (a video showing only the portion of P_T that differs from P_T-1) and the difference video S_T-1 between the videos P_T-1 and P_T-2 (a video showing only the portion of P_T-1 that differs from P_T-2), extracts the contours of the difference videos S_T and S_T-1, and analyzes the features of those contours.
For example, when the parts targeted for motion detection are the presenter's "neck" and "hand", the change site detection unit 21 compares the feature quantity of the contour with a preset feature quantity of the presenter's "neck" (or "hand") and, if the difference between the two feature quantities is smaller than a predetermined threshold, determines that the moving part of the presenter is the "neck" (or "hand").
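The difference-video and threshold comparison described above can be sketched as follows. This is only an illustration under stated assumptions: frames are small grayscale grids, the number of changed pixels stands in for the contour feature quantity (the specification does not define that feature), and the reference values and threshold are made up.

```python
def difference_image(current, previous, tol=0):
    """Keep only the pixels of `current` that differ from `previous`.

    Unchanged pixels are replaced by None, so the result shows only the
    portion of the current frame that differs from the previous one.
    """
    return [
        [c if abs(c - p) > tol else None for c, p in zip(row_c, row_p)]
        for row_c, row_p in zip(current, previous)
    ]


def changed_area(diff):
    """Assumed stand-in feature: number of changed pixels in a difference image."""
    return sum(px is not None for row in diff for px in row)


def classify_part(feature, references, threshold):
    """Return the part whose preset feature is closest to `feature`,
    provided the gap is smaller than `threshold`; otherwise None."""
    best_part, best_gap = None, threshold
    for part, reference in references.items():
        gap = abs(feature - reference)
        if gap < best_gap:
            best_part, best_gap = part, gap
    return best_part


# Hypothetical 3x4 frames P_T-1 and P_T.
P_prev = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
P_cur = [[0, 9, 9, 0], [0, 9, 9, 0], [0, 0, 0, 0]]

S_T = difference_image(P_cur, P_prev)
REFERENCES = {"neck": 12, "hand": 4}  # made-up preset feature quantities
part = classify_part(changed_area(S_T), REFERENCES, threshold=3)
```

A practical implementation would replace `changed_area` with a real contour descriptor; the threshold test itself has the same shape.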

The description above shows the change site detection unit 21 detecting the moving part of the presenter by extracting the contour of a difference video and extracting feature quantities, but the invention is not limited to this. For example, a known face recognition algorithm may be used to determine whether the difference video is a face image, and thereby whether the moving part of the presenter is the "neck".
Known face recognition algorithms are disclosed in, for example, "Transactions of the Institute of Electronics, Information and Communication Engineers D-II, vol. J88-D-II, No. 8, pp. 1339-1348, 2005".

When the change site detection unit 21 detects the moving part of the presenter, the change site analysis unit 22 compares the difference videos S_T and S_T-1 and analyzes the movement of that part.
That is, the change site analysis unit 22 analyzes the direction in which the part detected by the change site detection unit 21 is moving.
If the moving part of the presenter is the "neck", it detects upward or downward movement of the neck, or rotation to the right or left.
If the moving part of the presenter is the "hand", it detects movement of the hand to the right or left.
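One simple way to obtain a movement direction from two difference videos is to compare the centroids of their changed regions. The specification does not say how the comparison is done, so the centroid approach below is an assumption, as are the function names. Directions are given in image coordinates (x grows to the right, y grows downward), and each difference video is a grid in which unchanged pixels are None.

```python
def centroid(diff):
    """Mean (x, y) of the changed (non-None) pixels, or None if nothing changed."""
    points = [
        (x, y)
        for y, row in enumerate(diff)
        for x, px in enumerate(row)
        if px is not None
    ]
    if not points:
        return None
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def movement_direction(diff_prev, diff_cur):
    """Dominant direction in which the changed region moved between two
    difference images: 'left', 'right', 'up', 'down', or None."""
    a, b = centroid(diff_prev), centroid(diff_cur)
    if a is None or b is None:
        return None
    dx, dy = b[0] - a[0], b[1] - a[1]
    if dx == 0 and dy == 0:
        return None
    if abs(dx) >= abs(dy):
        return "right" if dx > 0 else "left"
    return "down" if dy > 0 else "up"


# Hypothetical one-row difference videos S_T-1 and S_T: the changed region
# shifts from the right half of the image to the left half.
S_prev = [[None, None, 7, 7]]
S_cur = [[7, 7, None, None]]
direction = movement_direction(S_prev, S_cur)
```

Detecting neck rotation, as opposed to translation, would need more than a centroid shift; this sketch covers only the translational cases.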

In parallel with the photographing by the camera 1, the microphone 5 picks up the presenter's voice and stores the voice signal in the voice storage unit 6 (step ST3).
When the microphone 5 stores the voice signal in the voice storage unit 6, the voice recognition processing unit 7 analyzes the voice signal and identifies the presenter's voice (step ST4).
The voice identification performed by the voice recognition processing unit 7 is described concretely below.

The feature extraction unit 31 of the voice recognition processing unit 7 performs, for example, LPC analysis on the voice signal stored in the voice storage unit 6 and thereby extracts 34-dimensional feature parameters comprising the logarithmic power, 16th-order cepstrum coefficients, Δ logarithmic power, and Δ 16th-order cepstrum coefficients of the voice signal.
When the feature extraction unit 31 extracts the feature parameters, the phoneme matching unit 33 performs phoneme matching using the hidden Markov models stored in the HMM memory 32 and thereby generates phoneme data from the feature parameters extracted by the feature extraction unit 31.
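The 34 dimensions follow from 1 (logarithmic power) + 16 (cepstrum) + 1 (Δ logarithmic power) + 16 (Δ cepstrum). The sketch below only assembles such a vector from per-frame values that are assumed to be already available; it does not perform LPC analysis. Taking Δ as the difference between adjacent frames is an assumption, since the specification does not state how the Δ terms are computed, and the function name is hypothetical.

```python
def build_feature_vector(log_power, cepstrum, prev_log_power, prev_cepstrum):
    """Assemble [log power, 16 cepstra, delta log power, 16 delta cepstra]."""
    if len(cepstrum) != 16 or len(prev_cepstrum) != 16:
        raise ValueError("expected 16 cepstrum coefficients per frame")
    # Assumed delta definition: simple difference from the previous frame.
    delta_log_power = log_power - prev_log_power
    delta_cepstrum = [c - p for c, p in zip(cepstrum, prev_cepstrum)]
    return [log_power] + list(cepstrum) + [delta_log_power] + delta_cepstrum


# Made-up values for two consecutive frames.
prev_cep = [0.0] * 16
cur_cep = [0.5] * 16
vector = build_feature_vector(-1.0, cur_cep, -2.0, prev_cep)
```

With this layout, index 0 holds the logarithmic power, indices 1 to 16 the cepstrum coefficients, index 17 the Δ logarithmic power, and indices 18 to 33 the Δ cepstrum coefficients.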

When the phoneme matching unit 33 generates the phoneme data, the voice recognition unit 35 refers to the statistical language model stored in the language model storage unit 34 and executes, for example, the One Pass DP algorithm.
That is, the voice recognition unit 35 processes the phoneme data from left to right without backtracking, and determines the words with the higher occurrence probability (for example, nouns and verbs) as the voice recognition result (the presenter's voice).

When the voice recognition processing unit 7 identifies the presenter's voice, the keyword detection unit 8 detects the keywords stored in the keyword storage unit 4 in the presenter's voice (step ST5).
For example, when keywords such as "How about it?", "new product", and "Congratulations" are stored in the keyword storage unit 4, the keyword detection unit 8 compares the words (or combinations of words) that make up the presenter's voice with each keyword, such as "How about it?", and detects any word (or combination of words) that matches a keyword.
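The comparison of words or word combinations against stored keywords can be sketched as a search for each keyword as a contiguous run in the recognized word sequence. This assumes the recognizer outputs a list of lowercase word tokens and uses English glosses of the example keywords; the function name is hypothetical.

```python
def detect_keywords(words, keywords):
    """Return the keywords that occur as a contiguous run of recognized words."""
    found = []
    for keyword in keywords:
        target = keyword.split()
        n = len(target)
        # Slide a window of the keyword's length over the recognized words.
        if n and any(words[i:i + n] == target for i in range(len(words) - n + 1)):
            found.append(keyword)
    return found


# English glosses of the example keywords held in keyword storage unit 4.
KEYWORDS = ["how about it", "new product", "congratulations"]

recognized = ["this", "is", "our", "new", "product"]
hits = detect_keywords(recognized, KEYWORDS)
```

A single-word keyword is simply the case where the window length is one.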

When the motion analysis unit 3 has analyzed the presenter's motion and the keyword detection unit 8 has detected a keyword, the motion pattern search unit 10 searches the motion patterns stored in the motion pattern storage unit 9 for a motion pattern that corresponds to the presenter's motion analyzed by the motion analysis unit 3 and also to the keyword detected by the keyword detection unit 8 (step ST6), and stores that motion pattern in the motion pattern storage unit 11.
As shown in FIG. 9, the motion pattern storage unit 9 stores motion patterns for each part of the robot, for example, the robot's right hand, left hand, and neck.

For example, when the analysis result of the motion analysis unit 3 indicates that “the presenter's right hand is moving from right to left” and the keyword detection unit 8 has detected the keyword “new product”, the motion pattern search unit 10 searches the right-hand motion patterns (see FIG. 9(a)) for the motion pattern “move the robot's right hand from right to left, then point the right hand at the product”.
Likewise, when the analysis result of the motion analysis unit 3 indicates that “the presenter's left hand is moving from left to right” and the keyword detection unit 8 has detected the keyword “how about it”, the motion pattern search unit 10 searches the left-hand motion patterns (see FIG. 9(b)) for the motion pattern “point the robot's left hand at the customer, then move the left hand from left to right”.

However, when the keyword detection unit 8 detects no keyword, the motion pattern search unit 10 searches the motion patterns stored in the motion pattern storage unit 9 for a motion pattern that corresponds to the presenter's motion analyzed by the motion analysis unit 3 alone.
For example, when the analysis result of the motion analysis unit 3 indicates that “the presenter's right hand is moving from right to left” but the keyword detection unit 8 detects no keyword, the motion pattern search unit 10 searches the right-hand motion patterns (see FIG. 9(a)) for the motion pattern “move the robot's right hand from right to left”.
Likewise, when the analysis result of the motion analysis unit 3 indicates that “the presenter's left hand is moving from left to right” but the keyword detection unit 8 detects no keyword, the motion pattern search unit 10 searches the left-hand motion patterns (see FIG. 9(b)) for the motion pattern “move the robot's left hand from left to right”.

The above describes the case in which the analysis result of the motion analysis unit 3 indicates a movement of the presenter's right hand or left hand, so that the motion pattern search unit 10 searches for a right-hand or left-hand motion pattern. When the analysis result of the motion analysis unit 3 indicates movements of several parts at once, such as the presenter's right hand, left hand, and neck, the motion pattern search unit 10 searches for the motion patterns of all of those parts at the same time.
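The search in step ST6, including the case where no keyword is detected and the case where several parts move at once, can be sketched as a table lookup. The following Python sketch is illustrative only. The table layout, the names `PATTERNS`, `search_pattern`, and `search_all`, and the motion labels are assumptions and do not reproduce the contents of FIG. 9. Falling back to the motion-only pattern when a keyword is detected but has no table entry is also an assumption, since the specification describes the fallback only for the case where no keyword is detected.

```python
from typing import Dict, Optional, Tuple

# Key: (robot part, analyzed presenter motion, keyword or None).
PatternKey = Tuple[str, str, Optional[str]]

# Hypothetical stand-in for the motion pattern storage unit 9.
PATTERNS: Dict[PatternKey, str] = {
    ("right_hand", "right_to_left", "new product"):
        "move right hand right to left, then point it at the product",
    ("right_hand", "right_to_left", None):
        "move right hand right to left",
    ("left_hand", "left_to_right", "how about it"):
        "point left hand at the customer, then move it left to right",
    ("left_hand", "left_to_right", None):
        "move left hand left to right",
}


def search_pattern(part: str, motion: str,
                   keyword: Optional[str] = None) -> Optional[str]:
    """Prefer the pattern tied to motion and keyword, else motion only."""
    if keyword is not None:
        hit = PATTERNS.get((part, motion, keyword))
        if hit is not None:
            return hit
    # No keyword detected (or, by assumption, no entry for it).
    return PATTERNS.get((part, motion, None))


def search_all(motions: Dict[str, str],
               keyword: Optional[str] = None) -> Dict[str, Optional[str]]:
    """Look up one pattern per moving part, all in a single pass."""
    return {part: search_pattern(part, motion, keyword)
            for part, motion in motions.items()}


print(search_pattern("right_hand", "right_to_left", "new product"))
print(search_pattern("right_hand", "right_to_left"))
print(search_all({"right_hand": "right_to_left",
                  "left_hand": "left_to_right"}))
```

Keying the table on the part as well as the motion mirrors the per-part organization of the stored motion patterns, so adding a neck pattern only means adding rows.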

After the motion pattern search unit 10 has stored the motion pattern in the motion pattern storage unit 11, the timing generation unit 14, on receiving a robot motion start request from outside, outputs a timing signal (for example, a pulse signal of a predetermined frequency or a start trigger signal) to the body control unit 12 and the audio reproduction unit 13 in order to synchronize the robot control performed by the body control unit 12 with the audio reproduction performed by the audio reproduction unit 13.

On receiving the timing signal from the timing generation unit 14, the body control unit 12 controls the body, in synchronization with that timing signal, according to the motion pattern stored in the motion pattern storage unit 11.
That is, the body control unit 12 outputs, to the actuators of the robot, control signals that cause the robot to move as specified by the motion pattern (step ST7).
For example, if the motion pattern is “move the robot's right hand from right to left”, the body control unit 12 outputs control signals to the actuators of the right shoulder joint portion 43R and the right elbow joint portion 44R so that the robot's right hand moves from right to left.

On receiving the timing signal from the timing generation unit 14, the audio reproduction unit 13 reproduces the presenter's speech, in synchronization with that timing signal, from the audio signal stored in the audio storage unit 6.
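As a rough software analogy for the start trigger signal, the sketch below releases a body control worker and an audio reproduction worker from one shared event. It assumes a single start trigger rather than a periodic pulse, and it uses Python threads in place of the actual hardware. The function `run_synchronized` and its log format are hypothetical.

```python
import threading
from typing import List, Tuple


def run_synchronized(motion_pattern: str,
                     audio_clip: str) -> List[Tuple[str, str]]:
    """Start body control and audio reproduction on one shared trigger."""
    start = threading.Event()   # plays the role of the timing signal
    log: List[Tuple[str, str]] = []
    lock = threading.Lock()

    def body_controller() -> None:
        start.wait()            # block until the trigger arrives
        with lock:
            log.append(("body", motion_pattern))

    def audio_player() -> None:
        start.wait()
        with lock:
            log.append(("audio", audio_clip))

    workers = [threading.Thread(target=body_controller),
               threading.Thread(target=audio_player)]
    for worker in workers:
        worker.start()
    start.set()                 # timing generator issues the start trigger
    for worker in workers:
        worker.join()
    return sorted(log)          # sorted because thread order is not fixed


print(run_synchronized("move right hand right to left", "presenter speech"))
```

Both workers block on the same event, so neither can begin before the trigger. The order in which they resume afterwards is left to the scheduler, which is why the result is sorted before it is returned.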

As is apparent from the above, Embodiment 1 provides the motion pattern search unit 10, which searches the robot motion patterns stored in the motion pattern storage unit 9 for a motion pattern that corresponds both to the presenter's motion analyzed by the motion analysis unit 3 and to the keyword detected by the keyword detection unit 8, and the body control unit 12 controls the body according to the motion pattern found by the motion pattern search unit 10. The robot can therefore make gestures that accurately convey the presenter's intention to the user.
That is, because the motion pattern search unit 10 searches for a motion pattern that corresponds not only to the presenter's motion analyzed by the motion analysis unit 3 but also to the keyword detected by the keyword detection unit 8, the robot's motion can be associated with a specific keyword. As a result, gestures that accurately convey the presenter's intention to the user can be realized.

Furthermore, according to Embodiment 1, when the keyword detection unit 8 detects no specific keyword, the robot motion patterns stored in the motion pattern storage unit 9 are searched for a motion pattern that corresponds to the presenter's motion analyzed by the motion analysis unit 3. The presenter's motion can therefore be reproduced even when the keyword detection unit 8 detects no specific keyword.

Embodiment 1 assumes that the motion analysis unit 3 analyzes the presenter's motion at regular intervals, that the speech recognition processing unit 7 recognizes the presenter's speech at regular intervals, and that the keyword detection unit 8 then detects keywords. The invention is not limited to this. For example, each time a processing start request signal is received from outside, the motion analysis unit 3 may analyze the presenter's motion while the speech recognition processing unit 7 recognizes the presenter's speech and the keyword detection unit 8 detects keywords.
Alternatively, for example, each time the presenter utters one sentence, the motion analysis unit 3 may analyze the presenter's motion while the speech recognition processing unit 7 recognizes the presenter's speech and the keyword detection unit 8 detects keywords.

Embodiment 2.
In Embodiment 1 above, after the motion pattern search unit 10 has stored the motion pattern in the motion pattern storage unit 11, the timing generation unit 14 outputs the timing signal to the body control unit 12 and the audio reproduction unit 13 on receiving a robot motion start request from outside (in this case, the robot is normally controlled after the presenter has finished moving, asynchronously with the presenter's motion). Alternatively, the timing generation unit 14 may output the timing signal to the body control unit 12 and the audio reproduction unit 13 as soon as the motion pattern search unit 10 has stored the motion pattern in the motion pattern storage unit 11.
In this case, real-time robot control that follows the presenter's motion can be performed while the presenter is moving.
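The difference between the two triggering policies (waiting for an external start request in Embodiment 1, firing as soon as the pattern is stored in Embodiment 2) can be sketched as a small state machine. The class `TimingGenerator` and its method names are hypothetical. Ignoring a start request that arrives before any pattern has been stored is an assumption of this sketch.

```python
from typing import Callable, List


class TimingGenerator:
    """Sketch of the two policies for issuing the timing signal."""

    def __init__(self, on_timing: Callable[[], None],
                 immediate: bool = False) -> None:
        self._on_timing = on_timing    # notifies body control and audio
        self._immediate = immediate    # True: Embodiment 2 behaviour
        self._pattern_ready = False

    def pattern_stored(self) -> None:
        """Called once the motion pattern has been stored."""
        self._pattern_ready = True
        if self._immediate:
            self._on_timing()          # fire at once (real-time following)

    def start_requested(self) -> None:
        """Called when an external motion start request arrives."""
        if self._pattern_ready and not self._immediate:
            self._on_timing()          # deferred playback (Embodiment 1)


events: List[str] = []

deferred = TimingGenerator(lambda: events.append("deferred"))
deferred.pattern_stored()              # nothing fires yet
deferred.start_requested()             # fires now

realtime = TimingGenerator(lambda: events.append("realtime"), immediate=True)
realtime.pattern_stored()              # fires immediately

print(events)  # ['deferred', 'realtime']
```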

Embodiment 3.
In Embodiment 1 above, the camera 1 stores the presenter's video in the video storage unit 2 and the microphone 5 stores the audio signal in the audio storage unit 6. Alternatively, for example, recorded data of the presenter shot with a video camera may be stored in the video storage unit 2 and the audio storage unit 6.
In this case, the motion analysis unit 3 obtains the presenter's video from the recorded data stored in the video storage unit 2, and the speech recognition processing unit 7 obtains the presenter's speech from the recorded data stored in the audio storage unit 6.

A block diagram showing the presenter motion reproduction robot according to Embodiment 1 of the present invention.
A block diagram showing the motion analysis unit 3 of the presenter motion reproduction robot according to Embodiment 1 of the present invention.
A block diagram showing the speech recognition processing unit 7 of the presenter motion reproduction robot according to Embodiment 1 of the present invention.
A front view showing the presenter motion reproduction robot according to Embodiment 1 of the present invention.
A side view showing the presenter motion reproduction robot according to Embodiment 1 of the present invention.
A flowchart showing the processing performed by the presenter motion reproduction robot according to Embodiment 1 of the present invention.
An explanatory diagram showing video captured by the camera 1.
An explanatory diagram showing a plurality of difference images captured by the camera 1.
An explanatory diagram showing the motion patterns for each part of the robot.

Explanation of symbols

1 Camera (motion analysis means)
2 Video storage unit (motion analysis means)
3 Motion analysis unit (motion analysis means)
4 Keyword storage unit (speech recognition means)
5 Microphone (speech recognition means)
6 Audio storage unit (speech recognition means)
7 Speech recognition processing unit (speech recognition means)
8 Keyword detection unit (speech recognition means)
9 Motion pattern storage unit (robot motion storage means)
10 Motion pattern search unit (robot motion search means)
11 Motion pattern storage unit
12 Body control unit (body control means)
13 Audio reproduction unit
14 Timing generation unit
21 Changed part detection unit
22 Changed part analysis unit
31 Feature extraction unit
32 HMM memory
33 Phoneme matching unit
34 Language model storage unit
35 Speech recognition unit
41L Left upper arm portion
41R Right upper arm portion
42L Left lower arm portion
42R Right lower arm portion
43L Left shoulder joint portion
43R Right shoulder joint portion
44L Left elbow joint portion
44R Right elbow joint portion
45L, 45R Moving rollers
46N Neck joint portion

Claims

A presenter motion reproduction robot comprising:
motion analysis means for analyzing the motion of a material explainer;
speech recognition means for recognizing the speech of the material explainer;
robot motion storage means for storing robot motions corresponding to the motion and speech of the material explainer;
robot motion search means for searching the robot motions stored in the robot motion storage means for a robot motion that corresponds to the motion of the material explainer analyzed by the motion analysis means and to the speech of the material explainer recognized by the speech recognition means; and
body control means for controlling the body according to the robot motion found by the robot motion search means.

The presenter motion reproduction robot according to claim 1, wherein:
the robot motion storage means stores, as the robot motion corresponding to the speech of the material explainer, a robot motion corresponding to a specific keyword;
the speech recognition means detects the specific keyword from the speech of the material explainer; and
the robot motion search means searches the robot motions stored in the robot motion storage means for a robot motion that corresponds to the motion of the material explainer and to the specific keyword detected by the speech recognition means.

The presenter motion reproduction robot according to claim 2, wherein, when the speech recognition means detects no specific keyword, the robot motion search means searches the robot motions stored in the robot motion storage means for a robot motion that corresponds to the motion of the material explainer.

A control method for a presenter motion reproduction robot, comprising:
a motion analysis step in which motion analysis means analyzes the motion of a material explainer;
a speech recognition step in which speech recognition means recognizes the speech of the material explainer;
a robot motion search step in which robot motion search means searches robot motion storage means, which stores robot motions corresponding to the motion and speech of the material explainer, for a robot motion that corresponds to the motion of the material explainer analyzed by the motion analysis means and to the speech of the material explainer recognized by the speech recognition means; and
a body control step in which body control means controls the body according to the robot motion found by the robot motion search means.

A control program for a presenter motion reproduction robot, the program causing a computer to execute:
a motion analysis processing procedure for analyzing the motion of a material explainer;
a speech recognition processing procedure for recognizing the speech of the material explainer;
a robot motion search processing procedure for searching robot motion storage means, which stores robot motions corresponding to the motion and speech of the material explainer, for a robot motion that corresponds to the motion of the material explainer analyzed by the motion analysis processing procedure and to the speech of the material explainer recognized by the speech recognition processing procedure; and
a body control processing procedure for controlling the body according to the robot motion found by the robot motion search processing procedure.