JP2002041411A

JP2002041411A - Text-reading robot, its control method and recording medium recorded with program for controlling text-reading robot

Info

Publication number: JP2002041411A
Application number: JP2000229272A
Authority: JP
Inventors: Ikuo Kitagishi; 郁雄北岸; Satoshi Iwaki; 敏岩城; Takao Kakizaki; 隆夫柿崎; Tamotsu Machino; 保町野
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2000-07-28
Filing date: 2000-07-28
Publication date: 2002-02-08

Abstract

PROBLEM TO BE SOLVED: To provide a text-reading robot whose reading of text becomes an emotional act of communication without increasing data size, a method for controlling the robot, and a recording medium on which a program for controlling the robot is recorded. SOLUTION: The text-reading robot, which reads text stored in the memory of a personal computer (PC) connectable to a computer system, analyzes the content of the sentences by morphological analysis, which divides the text into words, extracts emotion-related information, and generates voice, gestures, or sound effects corresponding to the extracted emotion information.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a text-to-speech robot that automatically reads aloud, by speech synthesis, text such as e-mail and web pages distributed over the Internet, to a control method for the robot, and to a recording medium on which a text-to-speech robot control program is recorded.

[0002]

[Prior Art] Robots that automatically read e-mail aloud by speech synthesis already exist. However, because they merely produce synthesized speech, their emotional expression is poor, they are hard to understand, and listeners tire of them. In methods such as those of JP-A-11-327872 and JP-A-11-65964, a motion program and phoneme information are stored in the e-mail text in advance, the mail is sent, and the corresponding program is executed at the destination. Because the motion program and phoneme information are attached to the e-mail, these methods increase the size of the mail and also require the same mailer to be installed at the e-mail destination.

[0003]

[Problems to Be Solved by the Invention] Conventional e-mail reading robots have problems such as poor emotional expression and increased data size.

[0004] The present invention has been made in view of the above circumstances. Its object is to provide a text-to-speech robot whose reading of text becomes an emotional act of communication accompanied by emotional expression, without increasing data size, together with a control method for the robot and a recording medium on which a text-to-speech robot control program is recorded.

[0005]

[Means for Solving the Problems] To achieve the above object, the present invention is a text-to-speech robot that reads aloud text stored in the memory of a PC (personal computer) connectable to a computer system, characterized by comprising: emotion information extraction means that analyzes the content of the sentences by morphological analysis, which divides the text into words, and extracts emotion-related information; and voice/motion expression means that produces voice, gestures, and sound effects corresponding to the emotion information extracted by the emotion information extraction means.

[0006] The present invention is also characterized in that, in the text-to-speech robot, when the emotion information extraction means analyzes the text, it extracts emoticon information in the text, and the voice/motion expression means produces voice, gestures, and sound effects according to the intent of the emoticon information and its position in the text.

[0007] The present invention is also characterized in that, in the text-to-speech robot, when the voice/motion expression means produces a gesture, the degree to which the robot's mouth opens and closes changes according to the speaker output.

[0008] The present invention is also characterized in that, in the text-to-speech robot, the text is text sent and received as e-mail; sender data from received mail and the number of replies to received mail are stored; and for a specific sender with whom many e-mails are exchanged, the reading voice and spoken lines for the mail change.

[0009] The present invention is also characterized in that, in the text-to-speech robot, the text is text sent and received as e-mail; the content of the sentences is analyzed by morphological analysis, which divides the e-mail text into words, to obtain emotion-related information; the intimacy between the e-mail sender and recipient is analyzed; and the reading voice and spoken lines for the mail change according to the intimacy.

[0010] The present invention is also a method for controlling a text-to-speech robot that reads aloud text stored in the memory of a PC connectable to a computer system, characterized by comprising: an emotion information extraction step of analyzing the content of the sentences by morphological analysis, which divides the text into words, and extracting emotion-related information; and a voice/motion expression step of producing voice, gestures, and sound effects corresponding to the emotion information extracted in the emotion information extraction step.

[0011] The present invention is also a computer-readable recording medium on which is recorded a control program for a text-to-speech robot that reads aloud text stored in the memory of a PC connectable to a computer system, the program causing a computer to execute: an emotion information extraction procedure of analyzing the content of the sentences by morphological analysis, which divides the text into words, and extracting emotion-related information; and a voice/motion expression procedure of producing voice, gestures, and sound effects corresponding to the emotion information extracted in the emotion information extraction procedure.

[0012] The present invention adds to the ordinary reading function a function that extracts emotion-related expressions from the content of the text and, partway through the reading, produces spoken lines and gestures suited to that emotion. In addition, noting that the emoticons generally called smileys express the emotions of the mail sender, the robot is made to act at the moment an emoticon appears, aiming for a more timely effect.

[0013]

[Embodiments of the Invention] Embodiments of the present invention are described in detail below with reference to the drawings.

[0014] FIG. 1 is a functional block diagram of the robot when reading aloud the text of an e-mail according to an embodiment of the present invention. In the figure, 11 is an emotion information extraction device, 12 is a mail sender DB (database), 13 is a dialogue DB, 14 is a speech synthesis device, 15 is a voice control device, 16 is a motion control device, 17 is a gesture DB, 18 is a drive mechanism, and 19 is a speaker.

[0015] The mail text is given as input, and basically that text is spoken from the speaker 19 through the speech synthesis device 14. At the same time, emotional expressions in the mail content, such as joy, anger, sorrow, and pleasure, are extracted. The extracted emotion information is sent to the motion control device 16 and the voice control device 15. In each device, effective speech and gestures suited to the emotion information are selected from the respective DB and passed to the respective output device.

[0016] Gestures are produced by the drive mechanism 18. At the same time, sound effects and words suited to the emotion are output from the speaker 19.

[0017] When a gesture is produced, the degree to which the robot's mouth opens and closes changes according to the speaker output. Specifically, the audio signal is input from the speaker 19 to the motion control device 16, and the loudness of the sound input to the speaker 19, that is, the amplitude of the electrical signal, is obtained. According to that loudness, the pulse width of the PWM signal input to the motor that opens and closes the mouth (a servo motor controllable by a PWM signal) is changed.

[0018] Concretely, the motor is controlled by, for example, the following processing.

[0019]
(1) Vsp: electrical signal value to the speaker (mV)
    αv-max: maximum input signal value to the speaker (mV)
    αv-min: minimum input signal value to the speaker (mV)
    αv-min ≪ αv-low ≪ αv-high ≪ αv-max
(2) Pw: pulse width of the PWM signal (μsec)
    αp-max: effective maximum pulse width (μsec)
    αp-min: effective minimum pulse width (μsec)
    αp-min ≪ αp-low ≪ αp-high ≪ αp-max
(3) Speech at a normal level: when αv-low ≤ Vsp ≤ αv-high, the pulse width of the PWM signal to be output is
    αp-low ≤ Pw ≤ αp-high
(4) Loud speech: when αv-high < Vsp ≤ αv-max, the pulse width of the PWM signal to be output is
    αp-min ≤ Pw ≤ αp-max
(5) Quiet speech: when αv-min ≤ Vsp < αv-low, the pulse width of the PWM signal to be output is
    αv-low ≪ Pw ≪ αv-high

Next, each module in FIG. 1 is described.
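The ranges in (1) to (5) of paragraph [0019] can be illustrated with a minimal sketch. The text gives only ranges for Pw, not a mapping function, so the piecewise-linear interpolation below is an assumption. Reading the quiet band as the span from αp-min to αp-low and the loud band as the span from αp-high to αp-max is also an interpretation for illustration, not something the text states. The names `MouthRanges`, `lerp`, and `pulse_width_us`, and all numeric values, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MouthRanges:
    """Thresholds from paragraph [0019]; field names are hypothetical."""
    v_min: float   # alpha_v-min, mV
    v_low: float   # alpha_v-low, mV
    v_high: float  # alpha_v-high, mV
    v_max: float   # alpha_v-max, mV
    p_min: float   # alpha_p-min, microseconds
    p_low: float   # alpha_p-low, microseconds
    p_high: float  # alpha_p-high, microseconds
    p_max: float   # alpha_p-max, microseconds


def lerp(x: float, x0: float, x1: float, y0: float, y1: float) -> float:
    """Linearly map x from [x0, x1] onto [y0, y1]."""
    if x1 == x0:
        return y0
    return y0 + (x - x0) / (x1 - x0) * (y1 - y0)


def pulse_width_us(v_sp: float, r: MouthRanges) -> float:
    """Return a PWM pulse width for speaker signal value v_sp (mV).

    Assumption: linear interpolation inside each band; the patent text
    only states the range that Pw falls in for each band.
    """
    v = min(max(v_sp, r.v_min), r.v_max)  # clamp to the valid input range
    if v < r.v_low:       # quiet speech
        return lerp(v, r.v_min, r.v_low, r.p_min, r.p_low)
    if v <= r.v_high:     # normal speech level
        return lerp(v, r.v_low, r.v_high, r.p_low, r.p_high)
    return lerp(v, r.v_high, r.v_max, r.p_high, r.p_max)  # loud speech


# Illustrative values only; real thresholds depend on the servo and amplifier.
ranges = MouthRanges(0, 100, 400, 1000, 1000, 1200, 1600, 2000)
print(pulse_width_us(250, ranges))
```

A continuous mapping of this kind keeps the mouth from jumping between positions as the loudness crosses a threshold; a lookup table per band would be an equally valid reading of the text.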

[0020] [Emotion information extraction device 11] The content of the sentences is analyzed by morphological analysis, which divides the e-mail text into words and tags them with parts of speech, and emotion-related tag information is obtained (see the example below). The morphological analysis is described in: 56th National Convention of the Information Processing Society of Japan (IPSJ): "E-mail ranking based on information extraction and the user's action history", Hasegawa, Takagi, pp. 2-259-260, 1998; IPSJ SIG Technical Report on Natural Language Processing: "Schedule information extraction in e-mail communication", Hasegawa, Takagi, 123-10, 1998; "Text classification by Support Vector Machine", Taira, Mukouchi, Haruno, 128-24; 56th National Convention of the IPSJ: "A Japanese morphological analysis system designed for maintainability", Fuchi, Matsuoka, Takagi, 117-9, 1997; "A method for speeding up mid-string partial-match search using morphological analysis", Oku, Noda, Hayashi, 121-9, 1997.

The output information consists of a text file containing the information tags and the text of the words that served as clues for the tags, and the attribute of the mail based on a calculation of the frequency with which the information tags appear.

[0021]
<Output information>
Mail attribute: fun
Tag: @joy
Clue word: {yay}

<Morphological analysis example>
(Input mail text) Excuse me. This is Miho. This is an invitation to a home party at Satchan's house. This time, let's each bring our own specialty dish.

[0022] (After processing) @greeting{Excuse me}.

[0023] This is Miho.

[0024] This is an invitation to a @fun{home party} at Satchan's house.

[0025] This time, @invitation{let's} each bring our own specialty dish.

[0026] From after "@" up to "{": tag information. From "{" to "}": the word that served as the clue for the tag. These clue words are stored in advance in the dialogue DB as audio files given intonation by modulating the prosodic parameters (pitch frequency, power, phoneme duration).
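The tag notation described in paragraph [0026] can be read back out of a processed text with a short parser. The sketch below is only an illustration under stated assumptions: the regular expression, the function names `extract_tags` and `strip_tags`, and the acceptance of both half-width and full-width delimiters are hypothetical choices, not part of the patent.

```python
import re
from typing import List, Tuple

# "@" (or full-width "＠"), then the tag name up to "{", then the clue word
# up to "}". Both half-width and full-width delimiters are accepted
# (an assumption made for this sketch).
TAG_PATTERN = re.compile(r"[@＠]([^{｛]+)[{｛]([^}｝]*)[}｝]")


def extract_tags(text: str) -> List[Tuple[str, str]]:
    """Return (tag, clue word) pairs in order of appearance."""
    return [(m.group(1).strip(), m.group(2)) for m in TAG_PATTERN.finditer(text)]


def strip_tags(text: str) -> str:
    """Remove the tag markup, keeping only the clue words in place."""
    return TAG_PATTERN.sub(lambda m: m.group(2), text)


processed = "@greeting{Excuse me}. This is Miho. An invitation to a @fun{home party}."
print(extract_tags(processed))
print(strip_tags(processed))
```

Keeping the clue word in place when the markup is stripped preserves the position of each tagged word in the sentence, which is what the later timing information depends on.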

[0027]
<Examples of emotion information tags and their clue words>
Tag: joy. Words: success, congratulations, (^^)V, etc.
Tag: anger. Words: complaint, effort (努), (`´), etc.
Tag: sorrow. Words: failure, pain, (T_T), etc.
Tag: fun. Words: laugh, pleasant, p(^^)q, etc.
Tag: apology. Words: sorry, I apologize, m(_)m, etc.
Tag: surprise. Words: eh, ah, (*_*), etc.
Tag: emphasis. Words: !, urgent, o(^o^)o, etc.

<Example of determining the attribute of the mail content>
Tag in the body: number of occurrences
greeting: 1
apology: 4
sorrow: 2
surprise: 1
→ Mail attribute = apology

[Mail sender DB 12] Stores, from the mail header information, the sender list and the frequency of mail received from each sender. It also extracts the relationship between the sender and the recipient based on the mail attribute information from the emotion information extraction device and the number of mails sent and received. It outputs the sender name of the mail, the Subject, and the relationship with the sender. FIG. 2 is an explanatory diagram showing the record table of the mail sender DB 12 according to the embodiment of the present invention.
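The attribute determination example in paragraph [0027] is consistent with choosing the tag that appears most often in the body. The sketch below assumes exactly that rule; the text says only that the attribute is based on a frequency calculation, so the rule and the tie-breaking (first tag encountered wins) are assumptions, and `mail_attribute` is a hypothetical name.

```python
from collections import Counter
from typing import Iterable, Optional


def mail_attribute(tags: Iterable[str]) -> Optional[str]:
    """Return the most frequent tag, or None when the mail has no tags.

    Assumption: the attribute is simply the most frequent tag. Ties go to
    the tag seen first, because Counter.most_common preserves insertion
    order among equal counts.
    """
    counts = Counter(tags)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Counts from the example: greeting 1, apology 4, sorrow 2, surprise 1.
tags = ["greeting"] + ["apology"] * 4 + ["sorrow"] * 2 + ["surprise"]
print(mail_attribute(tags))
```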

[0028] [Dialogue DB 13] Stores audio files in which the prosodic parameters (pitch frequency, power, phoneme duration) of the clue words for the emotion information tags have been altered to give them intonation, and audio files of comment sentences to be added before and after the mail content is spoken. The comment sentences are held as audio files in polite, casual, intoned, and flat variants.

[0029]
<Examples of comment sentences>
"Mail from ○○-chan!" and "××, so they say!"
"A mail has arrived from ○○-san." and "××, the message says."
○○ → sender's name
×× → content of the Subject

[Speech synthesis device 14] Converts the text file containing the information tags and the text of the clue words into an audio file. In doing so, it obtains the audio files of the clue words from the dialogue DB 13 and synthesizes them. It also alters the prosodic parameters (pitch frequency, power, phoneme duration) based on the mail attribute obtained from the emotion information extraction device 11. To generate the comment sentences, it refers to the relationship with the sender, the sender name, and the Subject from the mail sender DB 12, and synthesizes them with the audio files of the spoken lines obtained from the dialogue DB 13. When spoken lines are selected from the dialogue DB, the relationship with the sender is taken into account: the greater the intimacy, the more casual and intoned the sentences that are selected.
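The comment selection described in paragraph [0029] can be sketched as a choice between a casual and a polite template pair. The patent does not say how intimacy is represented, so the score in the range 0 to 1 and the threshold of 0.5 below are assumptions, as are the names `CommentStyle` and `choose_comments` and the English wording of the templates. The sketch produces text only, whereas the dialogue DB in the source holds audio files.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CommentStyle:
    opening: str  # spoken before the mail body
    closing: str  # spoken after the mail body


# Template wording is an illustrative rendering of the examples in [0029].
CASUAL = CommentStyle("Mail from {sender}-chan!", "{subject}, so they say!")
POLITE = CommentStyle("A mail has arrived from {sender}-san.",
                      "{subject}, the message says.")


def choose_comments(sender: str, subject: str, intimacy: float,
                    threshold: float = 0.5) -> Tuple[str, str]:
    """Pick the casual pair for close senders, the polite pair otherwise.

    Assumption: intimacy is a score in [0, 1]; the threshold is hypothetical.
    """
    style = CASUAL if intimacy >= threshold else POLITE
    return (style.opening.format(sender=sender),
            style.closing.format(subject=subject))


print(choose_comments("Miho", "Home party", intimacy=0.9))
print(choose_comments("Tanaka", "Meeting schedule", intimacy=0.1))
```

A two-level choice is the simplest reading of "the greater the intimacy, the more casual"; a graded set of templates selected by score would follow the same pattern.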

[0030] As output, the audio files of the generated mail body and comment sentences are output to the voice control device 15. To the motion control device 16 is output information comprising the information tags, the words that served as clues for the information tags, the playback time of the generated audio file, the time position at which each clue word is embedded, and the playback duration of each such word.

[0031]
<Output information to the motion control device 16>
Total number of tags: 11
Total playback time of the audio file (ms): 3600
1st tag information: @apology
    Clue word: {sorry}
    Word playback start time (ms): 600 (ms from the start of the synthesized speech)
    Word playback end time (ms): 900 (ms from the start of the synthesized speech)
    Word playback duration (ms): 300
    Time to next tag (ms): 500
2nd tag information: @greeting
    Clue word: {good morning}
    Word playback start time (ms): 1400 (ms from the start of the synthesized speech)
    Word playback end time (ms): 2200 (ms from the start of the synthesized speech)
    Word playback duration (ms): 800
    Time to next tag (ms): 400
3rd tag information:
:
11th tag information:

[Gesture DB 17] Stores time-series information of command values (motor number, position, speed, time) for the robot's joint drive motors, associated with the emotion information tags.

[0032] [Motion control device 16] Based on the output information from the speech synthesis device 14, retrieves from the gesture DB 17 the command values for the robot's joint drive motors that correspond to the emotion information tags, and sends them to the drive mechanism 18 as PWM signals. The drive timing follows the output information of the speech synthesis device 14: the playback time of the audio file, the time position at which each clue word is embedded, and the playback duration of each such word.
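The timing-driven gesture dispatch described in paragraphs [0031] and [0032] can be sketched as a lookup followed by a sort. The sketch rests on several assumptions: the time field of each gesture DB command is interpreted as an offset in ms from the start of the clue word, the gesture DB is a plain dictionary, and the motor numbers and positions are invented for illustration. `TagTiming`, `GESTURE_DB`, and `schedule_gestures` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class TagTiming:
    """One tag entry from the speech synthesis output (see [0031])."""
    tag: str
    word: str
    start_ms: int  # from the start of the synthesized speech
    end_ms: int


# (motor number, position, speed, time offset in ms); values are illustrative.
MotorCommand = Tuple[int, float, float, int]

GESTURE_DB: Dict[str, List[MotorCommand]] = {
    "apology": [(1, 30.0, 10.0, 0), (1, 0.0, 10.0, 150)],  # bow, then return
    "greeting": [(2, 45.0, 20.0, 0)],                      # raise an arm
}


def schedule_gestures(timings: List[TagTiming],
                      gesture_db: Dict[str, List[MotorCommand]]
                      ) -> List[Tuple[int, int, float, float]]:
    """Return (absolute time ms, motor, position, speed) events in time order.

    Tags with no entry in the gesture DB are skipped.
    """
    events = []
    for t in timings:
        for motor, position, speed, offset_ms in gesture_db.get(t.tag, []):
            events.append((t.start_ms + offset_ms, motor, position, speed))
    return sorted(events)


# The first two tag entries from the example in [0031].
timings = [
    TagTiming("apology", "sorry", 600, 900),
    TagTiming("greeting", "good morning", 1400, 2200),
]
for event in schedule_gestures(timings, GESTURE_DB):
    print(event)
```

Converting every command to an absolute time measured from the start of the synthesized speech lets a single playback clock drive both the speaker and the motors.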

[0033] [Voice control device 15] Plays back through the speaker 19 the audio files obtained from the speech synthesis device 14.

[0034] The method for controlling the text-to-speech robot according to the present invention is, concretely, executed by a computer such as a personal computer (PC) on the basis of a text-to-speech robot control program recorded in advance on a predetermined computer-readable recording medium.

[0035] That is, the medium is a computer-readable recording medium on which is recorded a control program for a text-to-speech robot that reads aloud text stored in the memory of a PC connectable to a computer system, and the program causes a computer to execute: an emotion information extraction procedure of analyzing the content of the sentences by morphological analysis, which divides the text into words, and extracting emotion-related information; and a voice/motion expression procedure of producing voice, gestures, and sound effects corresponding to the emotion information extracted in the emotion information extraction procedure.

[0036]

[Effects of the Invention] As described above, according to the present invention, the robot helps express, through sound and motion, the emotions such as joy, anger, sorrow, and pleasure that are associated with the content of the mail. As a result, the reading aloud of mail, which tends to be dry, becomes an emotional act of communication accompanied by emotional expression, and a finely nuanced distribution of e-mail information that cannot be conveyed by text and its synthesized speech alone becomes possible.

[0037] In addition, by matching the timing of the robot's actions to the positions of the emoticons in the mail, emotional expression that reads the writer's feelings more deeply becomes possible.

[Brief description of the drawings]

FIG. 1 is a functional block diagram showing the text-to-speech robot when reading aloud the text of an e-mail according to an embodiment of the present invention.

FIG. 2 is an explanatory diagram showing the record table of the mail sender DB according to the embodiment of the present invention.

[Explanation of symbols]

11 Emotion information extraction device
12 Mail sender DB (database)
13 Dialogue DB
14 Speech synthesis device
15 Voice control device
16 Motion control device
17 Gesture DB
18 Drive mechanism
19 Speaker

Continued from the front page

(72) Inventor: Takao Kakizaki, c/o Nippon Telegraph and Telephone Corporation, 2-3-1 Otemachi, Chiyoda-ku, Tokyo
(72) Inventor: Tamotsu Machino, c/o Nippon Telegraph and Telephone Corporation, 2-3-1 Otemachi, Chiyoda-ku, Tokyo
F-term (reference): 5D045 AA07 AB11

Claims

[Claims]

1. A text-to-speech robot that reads aloud text stored in the memory of a PC connectable to a computer system, comprising: emotion information extraction means that analyzes the content of the sentences by morphological analysis, which divides the text into words, and extracts emotion-related information; and voice/motion expression means that produces voice, gestures, and sound effects corresponding to the emotion information extracted by the emotion information extraction means.

2. The text-to-speech robot according to claim 1, wherein, when the emotion information extraction means analyzes the text, it extracts emoticon information in the text, and the voice/motion expression means produces voice, gestures, and sound effects according to the intent of the emoticon information and its position in the text.

3. The text-to-speech robot according to claim 1, wherein, when the voice/motion expression means produces a gesture, the degree to which the robot's mouth opens and closes changes according to the speaker output.

4. The text-to-speech robot according to claim 1, wherein the text is text sent and received as e-mail, sender data from received mail and the number of replies to received mail are stored, and for a specific sender with whom many e-mails are exchanged, the reading voice and spoken lines for the mail change.

5. The text-to-speech robot according to claim 1, wherein the text is text sent and received as e-mail, the content of the sentences is analyzed by morphological analysis, which divides the e-mail text into words, to obtain emotion-related information, the intimacy between the e-mail sender and recipient is analyzed, and the reading voice and spoken lines for the mail change according to the intimacy.

6. A method for controlling a text-to-speech robot that reads aloud text stored in the memory of a PC connectable to a computer system, comprising: an emotion information extraction step of analyzing the content of the sentences by morphological analysis, which divides the text into words, and extracting emotion-related information; and a voice/motion expression step of producing voice, gestures, and sound effects corresponding to the emotion information extracted in the emotion information extraction step.

7. A computer-readable recording medium on which is recorded a control program for a text-to-speech robot that reads aloud text stored in the memory of a PC connectable to a computer system, the program causing a computer to execute: an emotion information extraction procedure of analyzing the content of the sentences by morphological analysis, which divides the text into words, and extracting emotion-related information; and a voice/motion expression procedure of producing voice, gestures, and sound effects corresponding to the emotion information extracted in the emotion information extraction procedure.