JPH0981632A

JPH0981632A - Information publication device

Info

Publication number: JPH0981632A
Application number: JP23580595A
Authority: JP
Inventors: Yasuyo Shibazaki; 靖代芝崎; Miyoshi Fukui; 美佳福井
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1995-09-13
Filing date: 1995-09-13
Publication date: 1997-03-28

Abstract

PROBLEM TO BE SOLVED: To provide an information disclosure device that reduces the user's mental burden by realizing natural, smooth dialogue that takes the user's emotions into account. SOLUTION: The information disclosure device receives input data in multiple forms, including text, speech, images, and pointing positions, extracts the user's intention and emotion information from the input data, creates a response plan, and generates a response to the user. It is provided with a user emotion recognition unit 106 that recognizes the user's emotional state from the transition over time of dialogue situation information, which includes the internal state of a response plan creation unit 105, the user's intention and emotion information, and the type of the created response plan. The response plan creation unit 105 selects or changes a response strategy according to the recognition result of the user emotion recognition unit 106 and creates a response plan that matches that strategy.

Description

Detailed Description of the Invention

[0001]

[Technical Field of the Invention] The present invention relates to an information disclosure device that discloses data owned by an individual to other people, and more particularly to an information disclosure device that recognizes the intention and emotion of the other person, who is the user, and responds appropriately.

[0002]

[Prior Art] In recent years, human interfaces that employ several of the communication channels people use, such as natural language interfaces and multimodal interfaces, have been actively developed.

[0003] Beyond the choice of media, the communication of emotion between speakers plays a large role in making a dialogue proceed efficiently. In e-mail exchanges, for example, it is not uncommon for a dialogue to go nowhere because the speakers differ in how they recognize the context and interpret utterance intentions. This shows the need for a device that supports the extraction, recognition, and transmission of a speaker's psychological state or emotion.

[0004] The recognition, understanding, and synthesis of intention and emotion information in speech have already been reported by Kanazawa et al. (Transactions of the Institute of Electronics, Information and Communication Engineers D-II, Vol. J77-D-II, No. 8, pp. 1512-1521) and by Cahn et al. ("Generating Expression in Synthesized Speech", Technical Report, Massachusetts Institute of Technology, 1990).

[0005] These studies focus on prosodic information, such as the pitch and accent of the speech signal, and recognize, understand, or generate intention and emotion information such as anger, joy, sadness, agreement, admiration, and filler.

[0006] There have also been attempts to recognize emotion from character strings in text that express emotion. Fujimoto et al. have proposed a method that extracts the emotion information contained in a word string using an emotion dictionary in which emotion information is quantified and registered word by word (Japanese Examined Patent Publication No. 6-82376), and a method that additionally has rules for converting that emotion information based on the result of syntactic analysis (Japanese Examined Patent Publication No. 6-82377).

[0007] However, these methods extract the emotion contained in a single sentence or a single utterance; they do not recognize the user's emotion over the course of a dialogue. In practice, the user's emotion often differs with the situation even when the same utterance or linguistic expression is used. The utterances and linguistic expressions that convey emotion also differ from user to user.

[0008] There have also been attempts to recognize the emotions of characters from the description of a story, reported by W. G. Lehnert et al. ("The Role of Affect in Narrative Structure", Cognition and Emotion, 1987, pp. 299-322) and M. G. Dyer ("Emotions and their Computations: Three Computer Models", 1987, Lawrence Erlbaum Associates Limited). These recognize the situation from text written in natural language and infer the emotional states of the characters. However, the linguistic expressions and situations that convey emotion are extracted from the narrative description; emotions and situations are not recognized using only the characters' utterances or the content of their dialogue.

[0009] Dialogue systems that have a discourse structure model and generate an appropriate response to the user's utterance intention have also been studied. For text dialogue, there is the work of Sumita et al. ("A Study on the Naturalness of Responses in Question-Answering Systems", IEICE Technical Report NLC86-16, pp. 25-32, 1986) and Ukita et al. ("An Equipment Operation Guidance System Using Natural Language Input", IEICE Technical Report OS88-18, pp. 13-18, 1988). For speech, there is the work of Araki et al. ("Understanding Utterances Using Dialogue Structure and Word Concepts", 42nd National Convention of the Information Processing Society of Japan, 3, pp. 61-62, 1991). These aim to recognize the user's intention in context from the user's utterance or input text and to generate an appropriate response. They do not, however, recognize the user's emotion.

[0010] Furthermore, in the information disclosure device and multimodal information input/output system described in Japanese Patent Application No. 7-86266, emotion recognition is determined using the emotion information contained in a single sentence or a single utterance, and information about the dialogue situation is not used.

[0011]

[Problems to Be Solved by the Invention] As described above, conventional information disclosure devices have had no means of analyzing the user's emotion based on the dialogue situation. It has therefore been difficult to generate responses that take the user's emotion into account. The dialogue becomes disjointed and the user's intention is not understood correctly, which places an unnecessary mental burden on the user.

[0012] The present invention has been made in view of these circumstances. Its object is to provide an information disclosure device that recognizes the speaker's emotion according to the dialogue situation and realizes natural, smooth dialogue that takes the user's emotion into account, thereby reducing the user's mental burden.

[0013]

[Means for Solving the Problems] The present invention is an information disclosure device having input means for inputting data in multiple forms, including text, speech, images, and pointing positions; extraction means for extracting the user's intention and emotion information from the data input by the input means; response plan creation means for creating a response plan based on the extraction result of the extraction means; and response generation means for generating a response to the user based on the created response plan. The device is characterized in that it comprises emotion recognition means for recognizing the user's emotional state from the transition over time of dialogue situation information, which includes the internal state of the response plan creation means, the extracted intention and emotion information of the user, and the type of the created response plan, and in that the response plan creation means selects or changes a response strategy according to the recognition result of the emotion recognition means and creates a response plan that matches that response strategy.

[0014] According to the present invention, the user's emotional state can be recognized from input in multiple forms, such as speech, text, and images, together with the dialogue situation, and an appropriate response can be made. Specifically, responses that lead the user toward a positive or negative emotion can be generated: when the user is angry, for example, "profuse apology", "mediation", or "answering back in kind"; when the user is impatient, "prompt handling" or "introducing other means". In situations where many variations of a response to the user could be generated, the response, which has conventionally been chosen uniformly or at random, can thus be chosen in light of the user's emotion, and more natural response generation is achieved.

[0015]

[Embodiments of the Invention] Embodiments of the present invention are described below with reference to the drawings.

[0016] (First Embodiment) First, a first embodiment of the present invention is described.

[0017] FIG. 1 is a functional block diagram of the information disclosure device of the first embodiment.

[0018] As shown in FIG. 1, the information disclosure device 10 of this embodiment comprises an input unit 101, a data storage unit 102, a data search management unit 103, a request reception unit 104, a response plan creation unit 105, a user emotion recognition unit 106, and a response generation unit 107.

[0019] The input unit 101 inputs user data such as text, images, and speech. The data storage unit 102 stores a list of the users who may access the information disclosure device 10, each user's relationship with the information provider, the response rules for each user (text, speech, and so on), each user's history, and the like. The data search management unit 103 retrieves user information from the data stored in the data storage unit 102.

[0020] The request reception unit 104 receives the user data, such as text, images, and sound, that the input unit 101 has input via a communication means such as a network, and extracts the user's intention. The user emotion recognition unit 106 extracts the user's emotion from the intention extracted by the request reception unit 104 and from the text, image, speech, and other data received by the request reception unit 104.

[0021] The response plan creation unit 105 creates a response plan from the intention extracted by the request reception unit 104 and the user emotion extracted by the user emotion recognition unit 106. The response generation unit 107 generates a response from the response plan created by the response plan creation unit 105 and outputs it.
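
The data flow among units 104 to 107 can be pictured with a minimal Python sketch. Everything in it is an illustrative assumption rather than the patented implementation: the function names, the `UserInput` and `ResponsePlan` types, the keyword-based stubs, and the response templates are hypothetical and only show which unit's output feeds which unit.

```python
from dataclasses import dataclass


@dataclass
class UserInput:
    user_id: str
    text: str  # only text is modelled here; speech and images are omitted


@dataclass
class ResponsePlan:
    strategy: str   # e.g. "apologize" or "answer" (hypothetical labels)
    intention: str


def extract_intention(data: UserInput) -> str:
    """Request reception unit 104 (stub): classify the utterance intention."""
    return "request" if data.text.rstrip().endswith("?") else "statement"


def recognize_emotion(intention: str, data: UserInput) -> str:
    """User emotion recognition unit 106 (stub): uses the intention and raw input."""
    return "anger" if "idiot" in data.text.lower() else "expectation"


def create_plan(intention: str, emotion: str) -> ResponsePlan:
    """Response plan creation unit 105 (stub): choose a strategy from the emotion."""
    strategy = "apologize" if emotion == "anger" else "answer"
    return ResponsePlan(strategy=strategy, intention=intention)


def generate_response(plan: ResponsePlan) -> str:
    """Response generation unit 107 (stub): render the plan as text."""
    templates = {
        "apologize": "I am very sorry.",
        "answer": "Here is the information you asked for.",
    }
    return templates[plan.strategy]


def respond(data: UserInput) -> str:
    """Chain the units in the order 104 -> 106 -> 105 -> 107."""
    intention = extract_intention(data)
    emotion = recognize_emotion(intention, data)
    return generate_response(create_plan(intention, emotion))
```

Calling `respond(UserInput("u1", "Where is the report?"))` walks through all four stubs in that order.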

[0022] The operation of this embodiment is described below. Here, a person who discloses and provides information is called an information provider, and a person who accesses the information disclosure device to obtain information is called a user. The information disclosure device is called an agent, the purpose for which the user accessed the agent is called the dialogue purpose, and a word that expresses emotion is called an emotion word.

[0023] The operating procedure of the information disclosure device of this embodiment is described with reference to FIG. 2.

[0024] The user accesses the agent with a text sentence through, for example, a window-based interface. The agent checks whether the user is registered and refuses the dialogue if the user is not (step 201 in FIG. 2).

[0025] In step 202, the agent searches for user information. That is, it retrieves the user's personality, social relationship with the information provider, trust relationship, and the like from an interpersonal information database. In step 203, it extracts the user's utterance intention and extracts, by morphological analysis, the emotion words contained in the text sentence or speech. In step 204, it estimates the user's emotion from the emotion words used in the input. Because it is difficult to estimate the user's emotion from the emotion words in a text or spoken sentence alone, this embodiment recognizes the user's emotion from both the emotion extracted from emotion words and the contextual emotion extracted along the development of the dialogue.

[0026] For example, emotion words are registered in a dictionary in advance together with the type of emotion and the likelihood and impact of each word. The user emotion recognition unit 106 holds, in table form, a dialogue-count emotion model that infers emotion from the number of dialogue turns (see FIG. 3). Using this model, the emotion arising from the dialogue, such as (expectation), (anxiety), (composure), (frustration), (gratitude), (conviction), (resignation), or (anger), is estimated, and the meaning of the emotion word is checked against the emotion of the current dialogue stage. In this model the emotion is set according to the number of dialogue turns, but the assignment of turn counts to emotions may depend on the user model. For example, the degree of short temper in the user model may be expressed on a five-level scale, and when the level is 5, each emotion is set at its standard turn count minus 1.

[0027] In this method, when the emotion word is highly ambiguous, for example, the emotion corresponding to the dialogue stage is taken into account, as shown in FIG. 4. When the emotion of the dialogue-count model is (anxiety) or (expectation) and the input contains [anone ("you know"); expectation (5, 1), frustration (5, 8);], corrections such as raising the likelihood of (expectation) are made. Alternatively, when an emotion that does not fit the dialogue stage arrives, the likelihood of the emotion that follows the dialogue-count emotion model is raised so that satisfaction or dissatisfaction is reflected as a tendency. For example, when the dialogue-count emotion model indicates (frustration) and the text sentence contains [baka ("idiot"); anger (9, 8);], the likelihood of anxiety is raised and the likelihood of anger is lowered. After such corrections, the emotion is identified by the value of likelihood × impact. For example, with W1 = 40 and W2 = 60, if likelihood × impact lies between W1 and W2, the emotion is the dialogue-count emotion corresponding to (dialogue count + 1); if it is W2 or more, it is the dialogue-count emotion corresponding to (dialogue count + 2); and so on.
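
The W1/W2 rule above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the text does not say what happens below W1 (an offset of 0 is assumed here), the boundary handling (W1 inclusive, W2 inclusive for the upper band) is assumed, and the `model` list standing in for the FIG. 3 table is hypothetical because the figure is not reproduced in the text.

```python
W1 = 40  # lower threshold given in the text
W2 = 60  # upper threshold given in the text


def turn_offset(likelihood: int, impact: int, w1: int = W1, w2: int = W2) -> int:
    """Return how many dialogue turns ahead to look in the dialogue-count model.

    Assumption: scores below w1 leave the dialogue count unchanged (offset 0).
    """
    score = likelihood * impact
    if score >= w2:
        return 2
    if score >= w1:
        return 1
    return 0


def emotion_for_turn(model: list[str], turn: int, likelihood: int, impact: int) -> str:
    """Look up the emotion at (turn + offset), clamped to the end of the model.

    `model` is a hypothetical stand-in for the FIG. 3 table: one emotion per turn.
    """
    index = min(turn + turn_offset(likelihood, impact), len(model) - 1)
    return model[index]
```

With the dictionary entry anger (9, 8) from the text, the product is 72, which is above W2, so the lookup moves two turns ahead in whatever table is supplied.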

[0028] Alternatively, the distribution of emotions is arranged in a three-dimensional space whose axes are [composure-frustration], [satisfaction-dissatisfaction], and [acceptance-rejection]. Each emotion word is represented by coordinates in this space and registered in the dictionary together with its value on each axis.

[0029] For example, after the degrees of [composure-frustration], [expectation-dissatisfaction], and [acceptance-rejection] of an emotion word have been looked up, the degree on each axis is corrected according to the combination of the agent's and the user's utterance intentions.

[0030] For example, an emotion word is registered in the dictionary as [ussoo ("no way") (composure: -4, expectation: -3, acceptance: -2)]. When the combination of utterance intentions is an apology from the agent and a request from the user, the composure and acceptance degrees of the emotion word are corrected by -2. Alternatively, when the utterance intention is affirmative and an emotion word with a low acceptance degree comes in, the acceptance degree is corrected by +2, and so on. The corrected values are then applied to each condition to identify the emotion. These correction conditions are held as a table in the user emotion recognition unit 106.

[0031] FIG. 5 shows an example of the correction conditions. From the corrected result, the emotion expressed by the emotion word is identified under conditions such as those in FIG. 6.

[0032] For example, when the emotion word [ussoo (composure: -4, satisfaction: -3, acceptance: -2)] appears in the dialogue of FIG. 7, the combination of utterance intentions changes it to [ussoo (composure: -6, satisfaction: -4, acceptance: -3)]. From FIG. 6, the emotion expressed by the emotion word is estimated to be anger.

[0033] In step 205 of FIG. 2, the emotion-word result of step 204 is received and the user's emotion in the dialogue is analyzed.

[0034] From the utterance intention, the user's dialogue purpose can be classified as [request], [confirmation], [command], and so on. The transition of emotion can then be defined according to progress toward achieving that purpose. This is referred to here as the emotion transition model.

[0035] In the case of [request], for example, (expectation) is strong in the user's emotion at the access stage. Ideally, the user who has obtained the requested result ends the dialogue with feelings of (satisfaction) and (gratitude). However, if much information is missing and the agent repeats supplementary questions n times, feelings such as (anxiety) and (distrust) are likely to arise in the user. If the intention is not recognized, or the results the agent presents are off the mark, and this is repeated m times, the user's emotion comes to be dominated by (distrust), (frustration), and (disappointment). If the intended result is still not obtained after a total of k dialogue turns, (confusion) and (anger) become pronounced. If the user's emotion at the end of the dialogue has reached (confusion) or (anger), the agent has lost the user's trust. To avoid this situation, the agent adds response generation plans such as explanation or apology. FIG. 8 shows an example.

[0036] In the case of [confirmation], the user's (expectation) at the access stage is likely to be smaller than for [request]. Ideally, the user who has finished confirming ends the dialogue with feelings such as (conviction), (satisfaction), and (gratitude). As with [request], (frustration) arises when communication of the intention stalls, but it is thought to shift eventually to a self-resolving emotion such as (resignation). FIG. 9 shows an example.

[0037] In the case of [command], the user's expectation at the access stage is even greater than for [request]. The desirable user emotion at the end of the dialogue is (satisfaction) or (trust). Large emotional swings can be anticipated: (frustration) if achieving the purpose takes time, and (anger) if it is not achieved. To avoid this situation, the agent adds appropriate response generation plans such as explanation or apology and tries to keep the emotional swing small. FIG. 10 shows an example.

[0038] The emotion transition models for each dialogue purpose described above are held as tables in the user emotion recognition unit 106. The numerical values in these tables are determined by, for example, the user's personality (such as short temper) registered in the user model. For example, the degree of short temper in the user model is expressed on a five-level scale, and when the level is 5, each emotion is set to its standard value minus 1.

[0039] As an example, the rules of the emotion transition model for the dialogue purpose [request] are shown using a discourse transition model that, as in FIG. 11, consists of three stages: an intention acquisition stage, a missing-information acquisition stage, and an answer presentation stage. Here, the emotion at the time of access is taken to be (expectation).

[0040] In the intention acquisition stage, the agent confirms the intention with the user. A YES answer counts as a success and anything other than YES counts as a failure. Let F1 = (number of successes) + (number of failures). When F1 = -2 the emotion shifts to (anxiety), at -4 to (frustration), and at -6 to (confusion).

[0041] Likewise, let F2 = (number of successes) + (number of failures) in the missing-information acquisition stage. When F2 is -2 the emotion is (frustration), at -4 (confusion), and at -6 (anger).

[0042] Finally, let F3 = (number of successes) + (number of failures) in the answer presentation stage. At F3 = -1 the emotion is (disappointment), at -2 (confusion), at -3 (anger), and so on.
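
The per-stage thresholds for F1, F2, and F3 can be written as a small lookup, sketched here in Python. Two points are assumptions not stated in the text: each success is scored +1 and each failure -1 (which is what makes the sums negative), and a score that falls between two thresholds keeps the emotion of the last threshold reached. The stage keys and function names are hypothetical.

```python
# Thresholds taken from the text; each list is ordered from mildest to most severe.
STAGE_THRESHOLDS = {
    "intention": [(-2, "anxiety"), (-4, "frustration"), (-6, "confusion")],
    "missing_info": [(-2, "frustration"), (-4, "confusion"), (-6, "anger")],
    "answer": [(-1, "disappointment"), (-2, "confusion"), (-3, "anger")],
}


def stage_score(outcomes):
    """Sum a stage's outcomes, assuming success = +1 and failure = -1."""
    return sum(1 if ok else -1 for ok in outcomes)


def emotion_at(stage, f, initial="expectation"):
    """Return the emotion for score f in the given stage.

    The emotion stays at `initial` until the first threshold is reached and
    then follows the most severe threshold that f has reached.
    """
    emotion = initial
    for threshold, name in STAGE_THRESHOLDS[stage]:
        if f <= threshold:
            emotion = name
    return emotion
```

For instance, one success followed by three failures in the intention acquisition stage gives a score of -2 under the assumed scoring, which the table maps to (anxiety).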

[0043] If the intention acquisition stage has progressed as far as (frustration) when the missing-information acquisition stage begins (F1 = -4), F2 is set to -2 and the transition starts from (frustration). Also, if the total failure count over the intention acquisition stage and the missing-information acquisition stage has reached -3, the answer presentation stage starts from (disappointment); if it has reached -5, it starts from (confusion); and so on.

[0044] When the dialogue sentence contains an emotion word, the emotion-word analysis result shown in FIG. 11 is reflected in the emotion transition.

[0045] For example, as shown in FIG. 12, an emotion correspondence table with (anxiety) = -1, (expectation) = +1, (frustration) = -2, (composure) = +2, (confusion) = -3, (conviction) = +3, (anger) = -4, (gratitude) = +4, and so on is provided in the emotion transition plan storage unit.

[0046] Let s be the absolute value of (transition model value) - (emotion word value), and let F denote the number of failures plus the number of successes in each stage of the discourse transition model. Using the table, the emotion of the emotion transition model and the emotion of the emotion word are each expressed as a number. If (transition model value) × (emotion word value) < 0, the expression of the emotion word is judged to differ from the context, and the emotion word is ignored. Alternatively, the agent may confirm the user's emotion at this point.

[0047] If (transition model value) × (emotion word value) >= 0, the emotion word and the context agree. When s = 2, F is set to F + 1 if (transition model value) > 0 and to F - 1 if (transition model value) < 0. When s = 3, F is set to F + 2 if (transition model value) > 0 and to F - 2 if (transition model value) < 0.
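
The rule for folding an emotion word into F can be sketched in Python as below. The table values are the ones listed for the emotion correspondence table; the function name is hypothetical, and leaving F unchanged for values of s other than 2 and 3 is an assumption, since the text only gives those two cases.

```python
# Emotion correspondence table values as listed in the text.
EMOTION_VALUE = {
    "anxiety": -1, "expectation": 1,
    "frustration": -2, "composure": 2,
    "confusion": -3, "conviction": 3,
    "anger": -4, "gratitude": 4,
}


def adjust_f(f: int, model_emotion: str, word_emotion: str) -> int:
    """Adjust the stage score F using an emotion word found in the input.

    Assumption: for s values other than 2 and 3, F is left unchanged.
    """
    m = EMOTION_VALUE[model_emotion]  # transition model value
    w = EMOTION_VALUE[word_emotion]   # emotion word value
    if m * w < 0:
        return f  # word contradicts the context: ignore it
    s = abs(m - w)
    step = {2: 1, 3: 2}.get(s, 0)
    return f + step if m > 0 else f - step
```

For example, if the model is at (anxiety) with F = -2 and the user utters a word registered as (confusion), the signs agree and s = 2, so F moves to -3.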

[0048] As a result, the emotion to which F belongs is identified as the user's emotion.

[0049] Even the same emotion word may differ in the type and meaning of the emotion it conveys depending on whether it appears in speech or in a text sentence. Speech in particular expresses the user's emotion in real time. Emotion words contained in speech can therefore be processed as follows.

[0050] For example, the work of Kanazawa et al. (Transactions of the Institute of Electronics, Information and Communication Engineers D-II, Vol. J77-D-II, No. 8, pp. 1512-1521) concerns a technique that focuses on prosodic information, that is, non-verbal information such as the pitch and accent of the speech signal, to recognize and understand intention and emotion information such as anger, joy, sadness, agreement, admiration, and filler.

[0051] Among the emotion words carrying non-verbal information that users utter while talking with the agent, such as [aah] and [eh], a dictionary is created by registering the degrees of [composure-frustration], [satisfaction-dissatisfaction], and [acceptance-rejection] for those that are used frequently. Speech recognition of emotion words that carry linguistic information, such as [nani sore ("what is that")] and [wakaranai ("I don't understand")], is performed using a conventional technique such as that of Takebayashi et al. (Transactions of the Institute of Electronics, Information and Communication Engineers D-II, Vol. J77-D-II, No. 8, pp. 1417-1428). The method of performing emotion recognition on a single utterance using the results of both linguistic-information speech recognition and non-verbal-information speech recognition is described in detail in the aforementioned Japanese Patent Application No. 7-86266 and is omitted here.

[0052] Information such as the vocabulary of emotion words and the types of emotion may be shared with the contents of the emotion word dictionary for text sentences. An emotion word definition dictionary that defines the user's emotion by the combination of linguistic-information speech recognition and non-verbal-information speech recognition may also be used.

【００５３】また、１対話対のユーザの感情を分析する
と、応答文表示中にはエージェントの表示内容に対して
反射的に生じる感情が生じ、入力時には自身の操作手順
などに対して生じる感情や対話全体への感想などが発せ
られる場合が多い。そこで、１対話対の進行をユーザが
テキスト文を入力しエージェントの応答を待つ入力中と
エージェントの応答文が表示される表示中に分類し、ユ
ーザがどちらのタイミングでどの感情語を発したかの分
析により、１対話対におけるユーザの感情を推定する。An analysis of the user's emotions within one dialogue pair shows that, while a response sentence is being displayed, emotions tend to arise reflexively in reaction to what the agent displays, whereas during input the user tends to express emotions about their own operating procedure or impressions of the dialogue as a whole. Therefore, the progress of one dialogue pair is divided into an input phase, in which the user enters a text sentence and waits for the agent's response, and a display phase, in which the agent's response sentence is displayed, and the user's emotion in the dialogue pair is estimated by analyzing which emotion word the user uttered in which phase.

【００５４】分析方法としては、たとえば入力中と表示
中にそれぞれ発せられた感情語の［余裕−焦燥］、［満
足−不満］および［受容−拒否］の各度合を集計して平
均値を算出する。表示中の感情語の平均値から入力中の
感情語の平均値をマイナスした各度合の値の組合せから
図１３のような感情に設定し、さらにそれぞれの感情に
対し［期待：＋１、不安：−1 、余裕：＋２、納得：＋
２、焦燥：−２、困惑：−２、感謝：＋３、怒り：−
３］などと数値を与える。これにより、対話対における
ユーザの感情を推定して数値化することができる。As an analysis method, for example, the degrees of [composure-impatience], [satisfaction-dissatisfaction], and [acceptance-rejection] of the emotion words uttered during input and during display are each totaled and averaged. The combination of values obtained for each degree by subtracting the input-phase average from the display-phase average is mapped to an emotion as shown in FIG. 13, and each emotion is then given a numerical value such as [expectation: +1, anxiety: -1, composure: +2, consent: +2, impatience: -2, confusion: -2, gratitude: +3, anger: -3]. This makes it possible to estimate and quantify the user's emotion in the dialogue pair.
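
The averaging and subtraction described in this paragraph can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the entries in `EMOTION_WORDS` and their degrees are hypothetical, treating an empty phase as 0 on every axis is an assumption, and the mapping from the per-axis differences to an emotion name follows FIG. 13, which is not reproduced here.

```python
from statistics import mean

# Axes: 余裕−焦燥, 満足−不満, 受容−拒否
AXES = ("composure", "satisfaction", "acceptance")

# Hypothetical dictionary entries (assumption): emotion word -> degree per axis.
EMOTION_WORDS = {
    "aa": {"composure": -1, "satisfaction": -2, "acceptance": -1},
    "ee": {"composure": -2, "satisfaction": -1, "acceptance": -2},
    "oo": {"composure": 1, "satisfaction": 2, "acceptance": 1},
}

# Numeric values assigned to each emotion in the text.
EMOTION_SCORE = {
    "expectation": 1, "anxiety": -1, "composure": 2, "consent": 2,
    "impatience": -2, "confusion": -2, "gratitude": 3, "anger": -3,
}

def phase_average(words):
    """Average each axis over the emotion words uttered in one phase."""
    if not words:
        # Assumption: a phase with no emotion words counts as neutral.
        return {axis: 0.0 for axis in AXES}
    return {axis: mean(EMOTION_WORDS[w][axis] for w in words) for axis in AXES}

def pair_difference(display_words, input_words):
    """Display-phase average minus input-phase average, per axis."""
    shown = phase_average(display_words)
    typed = phase_average(input_words)
    return {axis: shown[axis] - typed[axis] for axis in AXES}

diff = pair_difference(display_words=["oo"], input_words=["aa", "ee"])
```

The resulting per-axis differences would then be looked up in a table corresponding to FIG. 13 to obtain an emotion name, and `EMOTION_SCORE` would give its numerical value.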

【００５５】その他にも、前述の発話タイミングに着目
し、[ 余裕−焦燥］、［満足−不満］および［受容−拒
否］のうち、表示中は［満足−不満］と［受容−拒否］
の値を集計し、入力中は［余裕−焦燥］と［満足−不
満］の値を集計し、各度合の集計値を集計語数で割り平
均値を求める方法があり、この平均値と図１３から対話
対におけるユーザ感情を推定する、などがある。Alternatively, focusing on the utterance timing described above, among [composure-impatience], [satisfaction-dissatisfaction], and [acceptance-rejection], the values of [satisfaction-dissatisfaction] and [acceptance-rejection] may be totaled during display and the values of [composure-impatience] and [satisfaction-dissatisfaction] during input; the total for each degree is divided by the number of words counted to obtain an average, and the user's emotion in the dialogue pair is estimated from these averages and FIG. 13.
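
The timing-dependent tally in this paragraph might look like the sketch below. The data layout (a list of `(phase, degrees)` pairs) is hypothetical, and dividing each axis total by the number of words that contributed to that axis is one reading of "the number of words counted".

```python
# Axes tallied in each phase, as stated in the text.
DISPLAY_AXES = ("satisfaction", "acceptance")
INPUT_AXES = ("composure", "satisfaction")

def timed_average(utterances):
    """utterances: list of (phase, degrees); phase is 'display' or 'input'."""
    totals, counts = {}, {}
    for phase, degrees in utterances:
        axes = DISPLAY_AXES if phase == "display" else INPUT_AXES
        for axis in axes:
            totals[axis] = totals.get(axis, 0) + degrees[axis]
            counts[axis] = counts.get(axis, 0) + 1
    # Assumption: each axis is divided by the number of words tallied for it.
    return {axis: totals[axis] / counts[axis] for axis in totals}

averages = timed_average([
    ("display", {"composure": 1, "satisfaction": 2, "acceptance": 1}),
    ("input", {"composure": -2, "satisfaction": -1, "acceptance": -2}),
])
```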

【００５６】さらに、この音声感情とテキスト文の感情
を整合する。Furthermore, this voice emotion is reconciled with the emotion of the text sentence.

【００５７】たとえば、テキスト文から推定した感情Ｆ
も［期待：＋１、不安：−１、余裕納得:+2 、焦燥：−
２、困惑：−２、感謝：＋３、怒り：−３］で音声感情
同様に数値化し、Ｅ＝（テキスト文感情（ｎ）−テキスト文感情（ｎ−
１））＋（音声感情（ｎ）−音声感情（ｎ−１））ｎ＝現在の対話回数の計算式から、対話の進行における感情の流れを把握す
る。たとえば、図１４のように感情を設定し、Ｅを満足
度とする。対話回数が５回目でそれまでのユーザの感情
が不安である時、Ｅ＜０であれば焦燥、Ｅ＝＝０であれ
ば困惑、Ｅ＞０であれば楽観、などと推定する。For example, the emotion F estimated from the text sentence is also quantified in the same way as the voice emotion, using [expectation: +1, anxiety: -1, composure and consent: +2, impatience: -2, confusion: -2, gratitude: +3, anger: -3], and the flow of emotion as the dialogue progresses is grasped from the formula E = (text emotion(n) - text emotion(n-1)) + (voice emotion(n) - voice emotion(n-1)), where n is the current dialogue count. For example, emotions are laid out as shown in FIG. 14, and E is taken as the degree of satisfaction. When the dialogue count is 5 and the user's emotion up to that point has been anxiety, the emotion is estimated as impatience if E < 0, confusion if E == 0, and optimism if E > 0.
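
As a worked illustration of the formula for E, the sketch below computes E from per-exchange scores and applies the three-way rule given for a user who has been anxious so far. The score lists are hypothetical sample data; the function names are introduced here only for illustration.

```python
def emotion_flow(text_scores, voice_scores, n):
    """E = (text(n) - text(n-1)) + (voice(n) - voice(n-1)); n is 1-based, n >= 2."""
    if n < 2 or n > min(len(text_scores), len(voice_scores)):
        raise ValueError("n must satisfy 2 <= n <= number of recorded exchanges")
    t, v = text_scores, voice_scores
    return (t[n - 1] - t[n - 2]) + (v[n - 1] - v[n - 2])

def next_emotion_from_anxiety(e):
    """Rule from the text for a user whose emotion so far is anxiety."""
    if e < 0:
        return "impatience"
    if e == 0:
        return "confusion"
    return "optimism"

# Hypothetical scores for exchanges 1..5 (index 0 is exchange 1).
text_scores = [1, 1, -1, -1, -1]
voice_scores = [1, -1, -1, -1, -2]
e = emotion_flow(text_scores, voice_scores, n=5)
estimate = next_emotion_from_anxiety(e)
```

With these sample values the text score is unchanged and the voice score drops by 1 at the fifth exchange, so E is negative and the estimate is impatience.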

【００５８】また、テキスト文に比べ音声はユーザのよ
り潜在的な感情を表すと考え、音声感情の優先度を高く
する方法も考えられる。たとえば、（テキスト文感情の
数値×音声感情の数値）が０以下の場合、テキスト文感
情（ｎ）＝テキスト文感情（ｎ）＋音声感情（ｎ）と
し、Ｅを計算する、などがある。It is also possible to give higher priority to the voice emotion, on the view that voice expresses the user's more latent emotions than text sentences do. For example, when (text emotion value × voice emotion value) is 0 or less, text emotion(n) is set to text emotion(n) + voice emotion(n) before E is calculated.
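
The adjustment rule in this paragraph is small enough to state directly in code. The function name is hypothetical; the rule itself is the one given in the text.

```python
def prioritize_voice(text_score, voice_score):
    """Fold the voice score into the text score when the two disagree in sign
    (or either is zero), i.e. when their product is 0 or less."""
    if text_score * voice_score <= 0:
        return text_score + voice_score
    return text_score

# Text suggests expectation (+1) while voice suggests impatience (-2).
adjusted = prioritize_voice(1, -2)
```

The adjusted value would then be used in place of text emotion(n) when E is computed.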

【００５９】以上のような手法で、感情語と感情推移モ
デルを用いて対話の状況に応じたユーザの感情を推定す
ることが可能である。With the method described above, it is possible to estimate the emotion of the user according to the situation of the conversation by using the emotion word and the emotion transition model.

【００６０】（第２実施形態)次に、本発明の第２実施
形態を説明する。(Second Embodiment) Next, a second embodiment of the present invention will be described.

【００６１】同実施形態では、ユーザの意図入力に対し
て複数の応答を生成しうる対話装置において、感情認識
結果を用いて適切な応答を選択、生成する手法を示す。This embodiment presents a method of selecting and generating an appropriate response using the emotion recognition result, in a dialogue device that can generate multiple responses to a user's intention input.

【００６２】図１５は同実施形態の情報公開装置の機能
ブロック図である。FIG. 15 is a functional block diagram of the information disclosure device of the embodiment.

【００６３】入力部２０１は、たとえばネットワークな
どの通信手段を介して、テキスト、静止画像、動画像お
よび音声などの入力を受け付ける。また、ネットワーク
を介さず、直接キーボードやマウス、マイク、カメラ等
の入力デバイスから直接受け付けても良い。The input unit 201 receives input of texts, still images, moving images, voices, and the like via a communication means such as a network. Further, it may be directly accepted from an input device such as a keyboard, a mouse, a microphone, a camera, etc., not via a network.

【００６４】意図感情情報抽出部２０３では、入力に含
まれる意図や感情を表す表現を抽出し、その結果を、た
とえば意味表現に変換するなどして感情認識部２０３と
応答プラン作成部２０４とへ送る。The intention/emotion information extraction unit 202 extracts expressions representing intentions and emotions contained in the input, converts the result into, for example, a semantic representation, and sends it to the emotion recognition unit 203 and the response plan creation unit 204.

【００６５】応答プラン作成部２０４では、意図感情情
報抽出部２０２で抽出したユーザの意図や感情に対し
て、適切な応答を作成するため、予め格納された知識や
ルールなどを用いて計画する。たとえば、対話の状態を
表す対話遷移モデルを用意して、抽出されたユーザの意
図により応答プランを作成する。また、感情認識部２０
３の結果や履歴記憶部の内容により応答プランを変更す
る。The response plan creation unit 204 plans an appropriate response to the user's intention and emotion extracted by the intention/emotion information extraction unit 202, using knowledge and rules stored in advance. For example, a dialogue transition model representing the state of the dialogue is prepared, and a response plan is created according to the extracted user intention. The response plan is also changed according to the result of the emotion recognition unit 203 and the contents of the history storage unit.

【００６６】感情認識部２０３では、意図感情情報抽出
部２０２で抽出された感情情報と、応答プラン作成部２
０４や履歴記憶部内の対話遷移状態により、ユーザの感
情を認識する。The emotion recognition unit 203 recognizes the user's emotion from the emotion information extracted by the intention/emotion information extraction unit 202 and from the dialogue transition state held in the response plan creation unit 204 or the history storage unit.

【００６７】応答生成部２０５では、応答プラン作成部
２０４で決定された応答プランにしたがって、たとえ
ば、テキスト、音声、静止画像および動画像などのデー
タで、またはこれらを組み合わせたものとして、応答を
生成する。The response generation unit 205 generates a response according to the response plan determined by the response plan creation unit 204, for example as text, voice, still image, or moving image data, or as a combination of these.

【００６８】図１６は、図１５で示した構成にユーザ情
報記憶部２０６と履歴記憶部２０７とを加えたものであ
る。FIG. 16 shows a configuration in which a user information storage unit 206 and a history storage unit 207 are added to the configuration shown in FIG. 15.

【００６９】ユーザ情報記憶部２０６は、ユーザの性格
（パーソナリティ）や社会的な役割、当該装置への慣れ
などに関するユーザ情報、および複数のユーザ間の社会
的あるいは個人的な対人関係情報などが登録してある。
感情認識部２０３や応答プラン作成部２０４は、ユーザ
情報記憶部２０６の内容にしたがって、ユーザごとに感
情の認識方法や作成する応答プランを変更する。The user information storage unit 206 holds user information such as the user's personality, social role, and familiarity with the device, as well as social or personal interpersonal relationship information among multiple users. The emotion recognition unit 203 and the response plan creation unit 204 change the emotion recognition method and the response plan to be created for each user according to the contents of the user information storage unit 206.

【００７０】履歴記憶部２０３では、応答プラン作成部
２０４の対話遷移モデルに対応させてユーザの意図や感
情認識結果、システムの生成した応答など、および、そ
れらの意味表現を記憶する。The history storage unit 207 stores the user's intentions and emotion recognition results, the responses generated by the system, and their semantic representations, in correspondence with the dialogue transition model of the response plan creation unit 204.

【００７１】図１７は、入力部２０１および応答生成部
２０５と、感情認識部２０３や応答プラン作成部２０４
とを別のプロセス（２０９ａ、２０９ｂ）にし、データ
通信部２０８ａ〜２０８ｂを介してデータの受け渡しを
行なうものである。In FIG. 17, the input unit 201 and the response generation unit 205 run in a separate process from the emotion recognition unit 203 and the response plan creation unit 204 (209a, 209b), and data is exchanged via the data communication units 208a and 208b.

【００７２】図１８は、入力部２０１と意図感情情報抽
出部２０２とを音声とテキストで分離させたものである
（２０１ａ〜２０１ｂ、２０２ａ〜２０２ｂ）。FIG. 18 shows a configuration in which the input unit 201 and the intention/emotion information extraction unit 202 are each split into separate units for voice and text (201a to 201b, 202a to 202b).

【００７３】図１９に同実施形態の動作手順を示す。ま
た、図２０に同実施形態の応答の例を示す。FIG. 19 shows the operation procedure of the embodiment. Further, FIG. 20 shows an example of the response of the same embodiment.

【００７４】はじめに、対話を希望するユーザがアクセ
ス開始の操作を行なう。たとえば、図２０に示すような
ウィンドウベースのインタフェースで、ユーザがテキス
ト入力可能なウィンドウ内でコマンド列「ｐｉｐｙａ
ｍａｍｏｔｏ」と入力するなどである。First, a user who wishes to have a dialogue performs an operation to start access. For example, in a window-based interface as shown in FIG. 20, the user enters the command string "pip yamamoto" in a window that accepts text input.

【００７５】同実施形態では、コマンドｐｉｐによっ
て、たとえば人間の代わりに情報を対話的に公開する機
能をもつ。In this embodiment, a command pip has a function of interactively disclosing information on behalf of a human being.

【００７６】ここで、情報公開装置とのアクセスが開始
されるとする。たとえば、図２０に示すように、左上に
情報公開エージェントの画像、左下にユーザ自身の画像
が表示され、右のウィンドウでユーザとエージェントの
対話をテキストベースで行なう。情報公開装置の動作に
関しては、特願平７−８６２６６に詳しいので割愛す
る。なお、本発明は、対話装置の主機能をこの情報公開
装置に限定するものではなく、たとえばデータベースサ
ービスなどのような不特定の情報要求者への情報提供サ
ービスにも応用可能である。Here, it is assumed that access to the information disclosure device is started. For example, as shown in FIG. 20, the image of the information disclosure agent is displayed in the upper left and the image of the user himself is displayed in the lower left, and the dialog between the user and the agent is performed on the text basis in the right window. Details of the operation of the information disclosure device are omitted because they are detailed in Japanese Patent Application No. 7-86266. The present invention is not limited to the information disclosure device as the main function of the dialogue device, but can be applied to an information providing service to an unspecified information requester such as a database service.

【００７７】アクセス開始後、システムは最初の応答プ
ランを作成する。応答プラン作成部２０４では、たとえ
ば、要求対応に関して図２１に示すような対話遷移モデ
ルにしたがって処理を行なう。図２１の太枠楕円で表さ
れる状態では応答プランを作成し、破線枠楕円の状態で
は意図を解析する。After access starts, the system creates the first response plan. The response plan creation unit 204 handles requests according to, for example, a dialogue transition model as shown in FIG. 21. In the states drawn as thick-framed ellipses in FIG. 21 a response plan is created, and in the states drawn as dashed-framed ellipses the intention is analyzed.

【００７８】図２１の対話遷移モデル例では、ユーザと
の対話をシステムとユーザとが共有する情報の深さで分
類した４つの段階で表している。レベル０は、ユーザの
要求の種類を獲得するレベルを表す。レベル１は、ユー
ザの要求を実行するために必要であればユーザのもつ情
報を獲得する段階を表す。レベル２では、実行前にシス
テムが行なう操作をユーザに確認する必要がある場合
に、ユーザに確認をとる。レベル３では、実際に操作を
行ない結果をユーザに報告する。In the dialogue transition model example of FIG. 21, the dialogue with the user is represented by four stages classified by the depth of information shared by the system and the user. Level 0 represents a level for acquiring the type of user request. Level 1 represents a step of acquiring information held by the user if necessary to fulfill the user's request. Level 2 asks the user for confirmation when it is necessary to confirm with the user the operation performed by the system before execution. At level 3, the operation is actually performed and the result is reported to the user.

【００７９】ＳＴＡＲＴ状態では、挨拶のための応答生
成プランを、次に要求獲得状態で要求獲得のための応答
プランを作成する。要求獲得状態では、ユーザに要求の
入力を促す応答プランを作成する。要求が入力されない
場合は要求獲得状態に戻る。入力された要求の尤度が低
い場合は、要求確認状態で要求の種類を確認する応答プ
ランを作成する。In the START state, a response generation plan for greeting is created, and then in the request acquisition state, a response plan for request acquisition is created. In the request acquisition state, a response plan that prompts the user to input a request is created. If no request is input, the process returns to the request acquisition state. When the likelihood of the input request is low, a response plan for confirming the request type in the request confirmation state is created.

【００８０】尤度がある程度以上である要求が獲得でき
たら、確認はせずに要求の実行条件をチェックする。実
行条件が要求ごとに異なる場合は、たとえば図２２に示
したような要求の種類ごとに尤度や実行条件を指定する
ためのリスト（実行条件リスト）を用意しておき参照す
る。If a request having a likelihood higher than a certain level can be obtained, the execution condition of the request is checked without confirmation. When the execution conditions are different for each request, a list (execution condition list) for designating the likelihood and the execution condition for each request type as shown in FIG. 22 is prepared and referred to.

【００８１】また、要求ごとに実行条件をチェックする
優先順位を変える場合や、実行条件が満たされない場合
の遷移先を変える場合も、図２２に示すように同様の方
法で実現できる。Likewise, changing the priority order in which execution conditions are checked for each request, or changing the transition destination when an execution condition is not satisfied, can be realized in the same way, as shown in FIG. 22.

【００８２】たとえば、要求がスケジュールの参照など
の場合は、スケジュールの検索条件が獲得できているか
をチェックする。たとえば、図２２では、日付（ｄａｔ
ｅ）がユーザによって指示されているか、あるいは、項
目種類（ａｃｔ）、タイトル、場所、週および月のうち
の２つ以上の条件が指定されていれば、検索を実行す
る。検索条件が不足している場合は、検索を実行する前
に、情報獲得状態に遷移し、不足する情報の入力を促す
ような応答プランを作成する。情報獲得状態は、レベル
０より対話の内容が深まったレベル１に分類する。要求
が他ユーザへの伝言などの場合、まだ伝言内容が獲得さ
れていなければ情報獲得状態で伝言内容を獲得する応答
プランを作成する。For example, if the request is to look up a schedule, it is checked whether the schedule search conditions have been acquired. In FIG. 22, for example, the search is executed if the date (date) has been specified by the user, or if two or more of the conditions item type (act), title, place, week, and month have been specified. If the search conditions are insufficient, the state transitions to the information acquisition state before the search is executed, and a response plan that prompts the user to input the missing information is created. The information acquisition state is classified as level 1, where the content of the dialogue is deeper than at level 0. If the request is, for example, a message to another user and the message content has not yet been acquired, a response plan for acquiring the message content is created in the information acquisition state.
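
The execution condition for a schedule search described above can be expressed as a short check. This is a sketch only: the `slots` dictionary and its key names are hypothetical stand-ins for whatever the execution condition list of FIG. 22 actually holds.

```python
OTHER_CONDITIONS = ("act", "title", "place", "week", "month")

def schedule_search_ready(slots):
    """True if the date is given, or at least two of the other conditions are."""
    if slots.get("date"):
        return True
    filled = sum(1 for key in OTHER_CONDITIONS if slots.get(key))
    return filled >= 2

ready_by_date = schedule_search_ready({"date": "1995-09-13"})
ready_by_pair = schedule_search_ready({"act": "meeting", "week": "next"})
not_ready = schedule_search_ready({"act": "meeting"})
```

When the check fails, the dialogue would move to the information acquisition state and prompt for the missing conditions.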

【００８３】ユーザの要求が直接対話（ユーザ同士で直
接に対話する）の場合、相手のユーザに連絡がとれるか
を調べ、とれない場合は実行条件が満たされないため、
謝罪などの応答プランを作成し要求獲得状態に戻る。When the user's request is a direct dialogue (the users talk to each other directly), it is checked whether the other user can be reached. If not, the execution condition is not satisfied, so a response plan such as an apology is created and the state returns to the request acquisition state.

【００８４】必要な情報が獲得されている場合、事前承
認が必要な要求の場合は事前承認状態に遷移する。たと
えば、伝言の場合は伝言を記録する前にユーザに対して
伝言内容の確認を行なう応答プランを作成する。Once the necessary information has been acquired, the state transitions to the prior approval state if the request requires prior approval. For example, in the case of a message, a response plan that confirms the message content with the user before the message is recorded is created.

【００８５】すべての条件が満たされたら、要求実行状
態に遷移する。たとえば、スケジュールの検索を行な
い、その結果をユーザに呈示する応答プランを作成す
る。あるいは、検索の失敗など、要求が実行できない場
合は、その旨を伝えて謝罪するなどの応答プランを作成
する。この要求実行はレベル３とし、実行後は、対話終
了要求の実行以外は次の要求獲得のために要求獲得状態
に遷移し、レベルは０に戻る。When all the conditions are satisfied, the state transitions to the request execution state. For example, a schedule search is performed and a response plan that presents the result to the user is created. If the request cannot be executed, for example because the search fails, a response plan that says so and apologizes is created. Request execution is level 3; after execution, except when the request executed was to end the dialogue, the state transitions to the request acquisition state to acquire the next request, and the level returns to 0.
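
The state selection walked through in the preceding paragraphs can be summarized as a small decision function. This is a simplified sketch, not the model of FIG. 21: the `Request` fields, the state names, and the likelihood threshold of 0.5 are hypothetical, placing the request confirmation state at level 0 is an assumption, and failure handling (apology and return to request acquisition) is not modeled.

```python
from dataclasses import dataclass
from typing import Optional

# Levels from the text; the level of request_confirmation is an assumption.
LEVEL = {
    "request_acquisition": 0,
    "request_confirmation": 0,
    "information_acquisition": 1,
    "prior_approval": 2,
    "request_execution": 3,
}

@dataclass
class Request:
    kind: Optional[str] = None    # e.g. "schedule" or "message"; None if nothing was input
    likelihood: float = 0.0       # likelihood of the recognized request type
    info_complete: bool = False   # execution conditions satisfied
    needs_approval: bool = False  # request requires prior approval
    approved: bool = False        # user has already confirmed

def next_state(req: Request, threshold: float = 0.5) -> str:
    """Pick the next dialogue state from what is known about the request."""
    if req.kind is None:
        return "request_acquisition"
    if req.likelihood < threshold:
        return "request_confirmation"
    if not req.info_complete:
        return "information_acquisition"
    if req.needs_approval and not req.approved:
        return "prior_approval"
    return "request_execution"

state = next_state(Request(kind="message", likelihood=0.9,
                           info_complete=True, needs_approval=True))
```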

【００８６】このように図２１では、ユーザとの対話の
遷移をユーザとシステムとの情報共有段階で分類した
が、その他にも、段階数を増減したり、ユーザの意図の
種類や要求の種類ごとに分類したり、状態の分類は行な
わないなどの方法もありうる。分類を行なわなければ感
情認識部２０３や応答生成部２０５で、状態名をそのま
ま用いて処理を記述することになるが、その分、細かく
処理を変えることができる。逆に、分類することによ
り、対話履歴記憶や感情認識、応答生成等の処理を分類
項目ごとに記述できるというメリットが考えられるが、
それだけでは処理が大まかになりすぎる嫌いもある。図
２１の分類方法では、情報の共有段階によりシステムの
失敗や成功がユーザに与える心理的影響が異なる場合に
特に効果的である。As described above, FIG. 21 classifies the transitions of the dialogue with the user by the stage of information sharing between the user and the system, but other approaches are possible, such as increasing or decreasing the number of stages, classifying by the type of user intention or the type of request, or not classifying the states at all. Without classification, the emotion recognition unit 203 and the response generation unit 205 must describe their processing using the state names directly, but in return the processing can be varied in fine detail. Conversely, classification has the advantage that processing such as dialogue history storage, emotion recognition, and response generation can be described per class, although on its own it tends to make the processing too coarse. The classification method of FIG. 21 is particularly effective when the psychological effect that a system failure or success has on the user differs depending on the stage of information sharing.

【００８７】応答プラン作成部２０４では、たとえば図
２３に示すような形で図２１の対話遷移モデルのレベル
情報にしたがって履歴記憶部２０７に履歴を記録する。
図２３の矩形ひとつひとつが対話履歴の１単位を表す。
図２３の例では、対話履歴の１単位は図２１の応答プラ
ン作成状態を通るごとに作成されるため、必ず１つ以上
の応答プランが含まれる。また、応答プラン作成状態に
到達する直前に意図解析状態を経過した場合は、ユーザ
の入力した意図情報も含まれる。例えば、意図解析や、
条件判断、要求実行等の各段階で、要求種類、成功、失
敗、遷移先などに応じて得点が与えられ、履歴記憶部２
０７に記憶される。The response plan creation unit 204 records a history in the history storage unit 207 according to the level information of the dialogue transition model of FIG. 21, for example in the form shown in FIG. 23. Each rectangle in FIG. 23 represents one unit of the dialogue history. In the example of FIG. 23, one unit of the dialogue history is created each time a response plan creation state of FIG. 21 is passed, so it always contains at least one response plan. If an intention analysis state was passed immediately before the response plan creation state was reached, the intention information input by the user is also included. For example, at each stage such as intention analysis, condition judgment, and request execution, a score is given according to the request type, success or failure, transition destination, and so on, and is stored in the history storage unit 207.

【００８８】ユーザの発話意図の抽出は、たとえば以下
のような手順で実現される。Extraction of the utterance intention of the user is realized by the following procedure, for example.

【００８９】ユーザは、たとえば、図２０のテキストウ
ィンドウにキーボードなどの入力デバイスを用いて文章
を入力する。意図感情情報抽出部２０２では、入力され
たテキスト文から、まず図２４に示すようなユーザの発
話意図を抽出する。発話意図のうち要求は、たとえば図
２５に示すような種類の要求を受け付けるとする。The user enters a sentence into, for example, the text window of FIG. 20 using an input device such as a keyboard. The intention/emotion information extraction unit 202 first extracts the user's utterance intention, as shown in FIG. 24, from the input text sentence. Among the utterance intentions, requests of the kinds shown in FIG. 25, for example, are assumed to be accepted.

【００９０】まず、ユーザの入力は形態素解析され、単
語ごとに区切られ品詞情報を付加される。要求がスケジ
ュールや文書の参照の場合、人名、地名、数字、固有名
詞などの単語の抽出が不可欠である。First, the user's input is subjected to morphological analysis, divided into words and added with part-of-speech information. When the request is a schedule or a document reference, it is essential to extract words such as a person's name, a place name, a number, and a proper noun.

【００９１】次に、図２６に示すようなキーワード辞書
を用いたマッチングを行なう。キーワード辞書には、１
キーワードあたり１つ以上のカテゴリ候補が記述されて
いる。カテゴリ候補は、たとえば尤度や強度等が指定さ
れている。また、カテゴリ名に加えて属性名（ｔａｉ
ｌ、ｅｎｄなど）や項目が付加されているものもある。
たとえば、「今度の会議の予定わかる？」と入力された
とする。図２６の辞書によれば、「今度」はｔｅｎｓｅ
（時制）、「会議」はａｃｔ（項目種類）のカテゴリに
含まれ、「予定」はスケジュールに含まれることがわか
る。キーワード「会議」には、ａｃｔｔａｉｌというカ
テゴリの候補もある。ａｃｔｔａｉｌは名詞のあとにつ
く場合「○○会議」というように会議名を表すという属
性を示すが、この例文では形態素解析の結果、直前に名
詞がないためａｃｔの方が採用される。Next, matching is performed using a keyword dictionary as shown in FIG. 26. In the keyword dictionary, one or more category candidates are described for each keyword. Each category candidate has, for example, a likelihood and a strength specified. Some candidates also have an attribute name (tail, end, etc.) or an item attached in addition to the category name. For example, suppose the user inputs "Do you know the schedule for the next meeting?" According to the dictionary of FIG. 26, "next" belongs to the category tense, "meeting" belongs to the category act (item type), and "schedule" belongs to the schedule category. The keyword "meeting" also has the category candidate acttail. The attribute acttail indicates that, when the keyword follows a noun, it denotes a meeting name such as "XX meeting"; in this example sentence, however, morphological analysis shows that no noun immediately precedes it, so act is adopted.
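
The keyword matching and the acttail rule in this paragraph can be illustrated with a toy dictionary. Everything below is a sketch under assumptions: the dictionary layout, the `after_noun` flag, and the hand-assigned part-of-speech tags are hypothetical, and no real morphological analyzer is called. Likelihood and strength values are omitted.

```python
# Hypothetical keyword dictionary: keyword -> ordered category candidates.
KEYWORDS = {
    "今度": [{"category": "tense"}],
    "会議": [{"category": "acttail", "after_noun": True}, {"category": "act"}],
    "予定": [{"category": "schedule"}],
}

def categorize(tokens):
    """tokens: list of (surface, part_of_speech) pairs, assumed to come from
    a morphological analyzer. Returns (surface, category) for each keyword."""
    result = []
    for i, (surface, _pos) in enumerate(tokens):
        prev_is_noun = i > 0 and tokens[i - 1][1] == "noun"
        for cand in KEYWORDS.get(surface, []):
            # acttail applies only when the keyword directly follows a noun.
            if cand.get("after_noun") and not prev_is_noun:
                continue
            result.append((surface, cand["category"]))
            break
    return result

# 「今度の会議の予定わかる？」 with hand-assigned tags (assumption).
sentence = [("今度", "noun"), ("の", "particle"), ("会議", "noun"),
            ("の", "particle"), ("予定", "noun"), ("わかる", "verb")]
categories = categorize(sentence)
```

Because 「会議」 is preceded by the particle 「の」 rather than a noun, the acttail candidate is skipped and act is chosen, as in the text.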

【００９２】これらのキーワードの尤度から文の意図が
決定される。この例では意図を表すキーワードはない
が、図２５の要求対象としてスケジュールが含まれるた
め、意図は「要求」とする。発話の意味表現は、たとえ
ば図２７に示すような書式で表現され応答プラン作成部
２０４へ送られる。The likelihood of these keywords determines the intention of the sentence. In this example, there is no keyword indicating the intention, but since the schedule is included as the request target in FIG. 25, the intention is “request”. The semantic expression of the utterance is expressed in a format as shown in FIG. 27 and sent to the response plan creating unit 204.

【００９３】テキストに含まれる感情情報は、第１実施
形態に示すような感情語辞書を用いて以下に示すような
手法でユーザの入力文から抽出し、感情認識部２０３と
応答プラン作成部２０４へ送られる。The emotion information contained in the text is extracted from the user's input sentence by the method described below, using an emotion word dictionary as shown in the first embodiment, and is sent to the emotion recognition unit 203 and the response plan creation unit 204.

【００９４】同実施形態の対話装置は、主にユーザの要
求実行を目的とするものなので、依頼に関する感情認識
手法について述べる。なお、一般的な感情を分類する試
みは他にも多数行われている（福井泰之：感情の心理
学、川島書店）。たとえば、［快−不快］、［強度］お
よび［方向（指向性）］などの３軸に対応させる研究が
見られる。Since the dialogue apparatus of the embodiment is mainly intended to execute the request of the user, the emotion recognition method regarding the request will be described. There are many other attempts to classify general emotions (Yasuyuki Fukui: Psychology of emotions, Kawashima Shoten). For example, there is a study in which three axes such as [comfort-discomfort], [strength], and [direction (directivity)] are associated.

【００９５】たとえば、相手に依頼（命令）する際に伴
なう感情に限定し、［快−不快］に加えて、［余裕−切
迫］と［受容−拒否］を採用するとする。［快−不快］
は、主に、システムの応答内容やユーザの予測、実際の
結果などに対するユーザの評価を表す感情の軸とする。
［余裕−切迫］は、主に時間的な制約などユーザ自身の
状況やパーソナリティから決まる要求達成欲と実際の達
成状況の差分に伴なう感情の軸とする。［受容−拒否］
は、システム自身やその応答をユーザが受け入れるかど
うかを表す感情の軸とする。図２８に以上の３軸の概略
を示す。For example, suppose the emotions considered are limited to those that accompany making a request (command) to another party, and that [composure-urgency] and [acceptance-rejection] are adopted in addition to [pleasure-displeasure]. [Pleasure-displeasure] is mainly the emotional axis representing the user's evaluation of the system's response content, the user's own prediction, the actual result, and so on. [Composure-urgency] is mainly the emotional axis associated with the gap between the user's desire to achieve the request, determined by the user's own situation such as time constraints and by personality, and the actual state of achievement. [Acceptance-rejection] is the emotional axis representing whether the user accepts the system itself and its responses. FIG. 28 outlines these three axes.

【００９６】入力文に含まれるユーザの感情は、前述し
た３軸から構成される空間上の座標値をもつとする。た
とえば、それぞれの軸上の最大値を５、最小値を−５と
して、図２７に示した入力文１の「ちょっと」は、［余
裕−切迫］、［快−不快］および［受容−拒否］の軸上
で、（−２，−１，−１）の座標値をもつと定義する。
また、「お願い」は（−２，１，１）、語尾の「あるん
だけど」は（−１，−１，−１）とし、入力文３の「か
な」は（３，−１，１）、入力文４の「悪いけど」は
（−２，−２，０）とする。これらの値を１文ごとに平
均をとる。すると、入力文１の感情表現情報は（−２，
−１，−１）となる。The user's emotion contained in an input sentence is assumed to have coordinate values in the space formed by the three axes described above. For example, with a maximum of 5 and a minimum of -5 on each axis, "chotto" (a little) in input sentence 1 shown in FIG. 27 is defined as having the coordinates (-2, -1, -1) on the [composure-urgency], [pleasure-displeasure], and [acceptance-rejection] axes. Likewise, "onegai" (please) is (-2, 1, 1), the sentence ending "arun dakedo" is (-1, -1, -1), "kana" in input sentence 3 is (3, -1, 1), and "warui kedo" (sorry, but) in input sentence 4 is (-2, -2, 0). These values are averaged for each sentence. The emotion expression information of input sentence 1 is then (-2, -1, -1).
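
The per-sentence averaging can be sketched as below, using the coordinate values given in the text. The romanized keys and the function name are introduced here for illustration, and which emotion words actually occur in each input sentence of FIG. 27 is not reproduced; the second call combines two words purely as a hypothetical example.

```python
# Coordinates on (余裕−切迫, 快−不快, 受容−拒否), values taken from the text.
WORD_COORDS = {
    "chotto": (-2, -1, -1),
    "onegai": (-2, 1, 1),
    "arun dakedo": (-1, -1, -1),
    "kana": (3, -1, 1),
    "warui kedo": (-2, -2, 0),
}

def sentence_emotion(words):
    """Average the coordinates of the emotion words found in one sentence."""
    hits = [WORD_COORDS[w] for w in words if w in WORD_COORDS]
    if not hits:
        return None  # no emotion word in this sentence
    return tuple(sum(axis) / len(hits) for axis in zip(*hits))

single = sentence_emotion(["chotto"])
combined = sentence_emotion(["kana", "warui kedo"])  # hypothetical sentence
```

A sentence whose only emotion word is "chotto" yields (-2, -1, -1), which matches the value the text reports for input sentence 1.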

【００９７】また同様に、前述した３軸上の領域に、実
際に依頼のタスクでみられる感情名を、たとえば図２９
のように割り当てる。感情領域同士は重なりがあっても
よい。たとえば、図２９の領域が示すように、「諦め」
は不快ではあるが、結果を受容しており、余裕のある感
情領域を表すとする。「納得」は、あまり不快ではなく
結果を受容している点が「諦め」と若干異なる。「焦
燥」は、状況が切迫している。「期待」と「不安」は、
どちらも少し切迫しており、まだ状況を受容も拒否もし
ていないが、予測したシステムの動作に対して、快であ
れば「期待」、不快であれば「不安」としている。Similarly, the names of emotions actually observed in request tasks are assigned to regions in the space of the three axes described above, for example as shown in FIG. 29. Emotion regions may overlap one another. For example, as the regions in FIG. 29 show, "resignation" represents a region of emotion that is unpleasant but accepts the result and has composure. "Consent" differs slightly from "resignation" in that the result is accepted without much displeasure. In "impatience" the situation is urgent. "Expectation" and "anxiety" are both slightly urgent and have not yet accepted or rejected the situation; with respect to the predicted behavior of the system, the emotion is "expectation" if pleasant and "anxiety" if unpleasant.

【００９８】それぞれの領域は、システム管理者などの
ユーザが定義し直せるようにすることも可能である。た
とえば、図２９のようなグラフィックスインタフェース
を用意して、マウスなどのポインティングデバイスで指
示させてもよい。また、図３０に示すように、それぞれ
の領域をテーブルなどで指定してもよい。Each area can be redefined by a user such as a system administrator. For example, a graphics interface as shown in FIG. 29 may be prepared and a pointing device such as a mouse may be used for the instruction. Further, as shown in FIG. 30, each area may be designated by a table or the like.
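
A table-based definition of emotion regions, as suggested here, could be represented as axis-aligned ranges. The bounds below are hypothetical and only loosely follow the verbal description of the regions; the actual values of FIG. 29 and FIG. 30 are not reproduced.

```python
# Each region: (lo, hi) ranges on (余裕−切迫, 快−不快, 受容−拒否).
# All bounds are illustrative assumptions, editable by an administrator.
REGIONS = {
    "resignation": ((0, 5), (-5, 0), (0, 5)),
    "consent": ((-5, 5), (-1, 5), (1, 5)),
    "impatience": ((-5, -2), (-5, 5), (-5, 5)),
}

def emotions_at(point):
    """Return every emotion whose region contains the point (regions may overlap)."""
    return sorted(
        name
        for name, ranges in REGIONS.items()
        if all(lo <= value <= hi for value, (lo, hi) in zip(point, ranges))
    )

overlap = emotions_at((2, -1, 3))
urgent = emotions_at((-4, 0, 0))
```

Returning a list rather than a single name reflects the statement that emotion regions may overlap.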

【００９９】前述した３軸の値から、たとえばユーザの
システムに対する「許容度＝満足度」として、ユーザの
感情状態を定義する。これはプラス（満足）かマイナス
（不満）かの値をもつ。たとえば、快であっても非常に
状況が切迫している場合は、システムに対する満足度は
低い。また、余裕があってもシステムの応答が不快感を
あおれば、満足度が低くなる。逆に、結果が得られない
場合でも、余裕がある場合は満足度はそう低くならな
い。すなわち、たとえば、３軸の値の平均値をとる、ま
たは最低値と最高値を比べて絶対値の大きい方を採用し
他の２軸の平均値を加える、といった処理を行なって決
定する。The user's emotional state is defined from the values on the three axes described above as, for example, the user's "tolerance = satisfaction" toward the system. This takes a positive (satisfied) or negative (dissatisfied) value. For example, even if the user feels pleasure, satisfaction with the system is low when the situation is very urgent. Likewise, even if the user has composure, satisfaction falls if the system's response provokes displeasure. Conversely, even when no result is obtained, satisfaction does not fall much if the user has composure. Concretely, satisfaction is determined by processing such as taking the average of the values on the three axes, or comparing the lowest and highest values, adopting the one with the larger absolute value, and adding the average of the other two axes.
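
The two ways of deriving satisfaction named at the end of this paragraph can be sketched as follows. The function names are hypothetical, and how a tie between the lowest and highest value is broken is not stated in the text; the sketch arbitrarily prefers the highest value in that case.

```python
def satisfaction_mean(coords):
    """Satisfaction as the plain average of the three axis values."""
    return sum(coords) / len(coords)

def satisfaction_dominant(coords):
    """Take whichever of the lowest and highest value has the larger absolute
    value, then add the average of the other two axes."""
    lo, hi = min(coords), max(coords)
    dominant = lo if abs(lo) > abs(hi) else hi  # tie -> hi (assumption)
    rest = list(coords)
    rest.remove(dominant)
    return dominant + sum(rest) / len(rest)

# Very urgent (-4) but pleasant (2) and mildly accepting (1).
point = (-4, 2, 1)
by_mean = satisfaction_mean(point)
by_dominant = satisfaction_dominant(point)
```

Both variants give a negative value for this point, consistent with the statement that satisfaction is low when the situation is very urgent even if the user feels pleasure.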

【０１００】感情認識部２０３では、履歴記憶部２０７
に格納された図２３のような対話履歴とユーザの意図お
よび満足度から、ユーザの感情を認識する。The emotion recognition unit 203 recognizes the user's emotion from the dialogue history stored in the history storage unit 207, as in FIG. 23, together with the user's intention and degree of satisfaction.

【０１０１】満足度を用いると、図１４で示したよう
に、ユーザの感情は、対話の推移につれていくつかのパ
ターンで記述される。「期待」、「不安」および「楽
観」などの各感情は、図１４中の文字の周囲に、ある範
囲をもって定義されるとする。実際は、「期待半分不安
半分」の状態も考えられ、ユーザの感情は図１４中の一
点に定められず、ある範囲をもって流動的に遷移すると
する。When the degree of satisfaction is used, the user's emotion can be described by several patterns as the dialogue progresses, as shown in FIG. 14. Each emotion, such as "expectation," "anxiety," and "optimism," is assumed to be defined with a certain extent around its label in FIG. 14. In practice a state such as "half expectation, half anxiety" is also conceivable, so the user's emotion is not fixed at a single point in FIG. 14 but is assumed to shift fluidly within a certain range.

【０１０２】たとえば、一般的なユーザの場合、対話の
回数が少ないうちに、対話の段階がレベル０、１、２と
進んでいけば、ユーザの満足度はプラスになり、いつま
でも段階が進まず失敗ばかり繰り返していればマイナス
になる。履歴記憶部２０７に記憶されている応答の成
功、失敗等によって定まる得点により満足度は上下する
が、早い時期ならレベル３で成功すれば感謝、失敗して
も納得できるが、時期が遅くなるにつれ諦めや怒りの気
持ちを覚える。For example, for a typical user, if the dialogue advances through levels 0, 1, and 2 while the number of exchanges is still small, the user's satisfaction becomes positive; if the stage never advances and failures keep repeating, it becomes negative. Satisfaction rises and falls with the score determined by the successes and failures of responses stored in the history storage unit 207. Early in the dialogue, the user feels gratitude if level 3 succeeds and can accept the outcome even if it fails, but as the dialogue drags on the user comes to feel resignation or anger.

【０１０３】そこで、図１４の感情推移モデルを用い
て、前記の対話の状態遷移の段階と、抽出された感情語
による満足度の調整を行なう。たとえば、対話の流れに
対して、目安となる対話の回数を決めておいて比較して
もよい。すなわち、対話の回数により図１４における縦
の線が決まり、状態遷移の段階と、前述の得点等により
線上のある範囲に満足度が定まる。さらに、感情語から
求めた満足度の値と照合し、一致しない場合は、たとえ
ば、図１４で求めた満足度の範囲の中心と感情語から求
めた満足度の値の平均をとる、などの方法により決定で
きる。Therefore, the emotion transition model of FIG. 14 is used to reconcile the stage of the dialogue state transition with the degree of satisfaction obtained from the extracted emotion words. For example, a reference number of exchanges may be decided in advance for the flow of the dialogue and used for comparison. That is, the number of exchanges determines a vertical line in FIG. 14, and the state transition stage, the score described above, and so on place the degree of satisfaction within a certain range on that line. This is then checked against the satisfaction value obtained from the emotion words; if they do not agree, the value can be determined, for example, by averaging the center of the satisfaction range obtained from FIG. 14 and the satisfaction value obtained from the emotion words.
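
The reconciliation step at the end of this paragraph can be written as a small function. This is a sketch under assumptions: "agreement" is interpreted as the emotion-word value falling inside the model's range, and falling back to the range center when no emotion word is available is a choice made here, not something the text specifies.

```python
def reconcile(model_range, word_satisfaction=None):
    """Combine the satisfaction range from the transition model with the
    satisfaction value obtained from emotion words."""
    lo, hi = model_range
    center = (lo + hi) / 2
    if word_satisfaction is None:
        return center  # assumption: no emotion word, use the range center
    if lo <= word_satisfaction <= hi:
        return word_satisfaction  # the two estimates agree
    # Disagreement: average the range center and the emotion-word value.
    return (center + word_satisfaction) / 2

agreed = reconcile((0, 2), 1.5)
averaged = reconcile((0, 2), -3)
```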

【０１０４】対話の初期の要求獲得段階では、感情語が
入らないと満足度の範囲を限定できないが、図１４に示
すように、対話の初期には満足度は極端にふれているこ
とはあまりないと考える。たとえば、図１４の感情を表
す文字位置の中心からの距離に反比例して、その感情の
確率が表されるとすれば、対話のはじめには「期待」が
「不安」より確率が高いため、やや「期待」よりの満足
度とする。対話が少し進んだ場面で、まだ要求獲得状態
であれば「不安」の確率が高くなるので、満足度をやや
低めとする。In the initial request acquisition stage of the dialogue, the range of satisfaction cannot be narrowed unless an emotion word is entered, but, as shown in FIG. 14, satisfaction is considered unlikely to swing to an extreme early in the dialogue. For example, if the probability of an emotion is taken to be inversely proportional to the distance from the center of the label representing that emotion in FIG. 14, then at the start of the dialogue "expectation" is more probable than "anxiety," so the degree of satisfaction is set slightly toward "expectation." If the dialogue has progressed a little and is still in the request acquisition state, the probability of "anxiety" becomes higher, so the degree of satisfaction is set slightly lower.
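
The inverse-distance idea in this paragraph can be sketched as follows. The label centers are hypothetical coordinates of the form (number of exchanges, satisfaction); the actual layout of FIG. 14 is not reproduced, and normalizing the weights so that they sum to 1 is an added assumption.

```python
import math

def emotion_probabilities(point, centers):
    """Weight each emotion by 1 / distance from its label center, normalized."""
    weights = {}
    for name, center in centers.items():
        distance = math.dist(point, center)
        if distance == 0:
            # Exactly on a label center: that emotion takes all the probability.
            return {other: float(other == name) for other in centers}
        weights[name] = 1.0 / distance
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}

# Hypothetical label centers: (exchange count, satisfaction).
CENTERS = {"expectation": (1, 1), "anxiety": (3, -1)}
probs = emotion_probabilities((1, 0.5), CENTERS)
```

At the first exchange the point lies much closer to the "expectation" label than to "anxiety", so "expectation" receives the higher probability, as the text describes for the start of a dialogue.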

【０１０５】これらの感情遷移や確率分布を、システム
管理者などのユーザが定義し直せるようにすることも可
能である。たとえば、図２９のようなグラフィックスイ
ンタフェースを用意して、マウスなどのポインティング
デバイスで指示させてもよい。また、感情の種類ごとに
異なる確率分布を定義させることも可能である。It is also possible to allow a user such as a system administrator to redefine these emotional transitions and probability distributions. For example, a graphics interface as shown in FIG. 29 may be prepared and a pointing device such as a mouse may be used for the instruction. It is also possible to define different probability distributions for each emotion type.

【０１０６】図１６に示すように、ユーザ情報記憶部２
０６にユーザのこれまでの操作履歴やパーソナリティ情
報などを蓄積している場合を考える。はじめてのユーザ
でなければ、操作履歴にユーザの感情推移パターンを、
図１４上の軌跡などの形で記録しておき、もっとも多い
パターンや、平均的なパターンなどを用いて推定するこ
とが可能である。また、ユーザがせっかち、怒りっぽ
い、または口は悪いが温厚である、計算機に慣れてい
る、といったパーソナリティ情報を前もって登録してお
くなどにより、はじめてのユーザでも、登録したパター
ンを用いて推定することができる。「怒っている上司」
や「焦っている同僚」など、複数の典型的なパターンを
登録しておけば、ユーザの地位などの情報を共有データ
ベースなどから入手して、その感情を推定することがで
きる。As shown in FIG. 16, consider the case where the user information storage unit 206 stores the user's past operation history, personality information, and the like. If the user is not a first-time user, the user's emotion transition patterns can be recorded in the operation history, for example as trajectories on FIG. 14, and the emotion can be estimated from the most frequent pattern, an average pattern, or the like. In addition, by registering personality information in advance, such as that the user is impatient, is irritable, is sharp-tongued but good-natured, or is accustomed to computers, the emotion of even a first-time user can be estimated from the registered pattern. If a plurality of typical patterns such as "angry boss" and "impatient colleague" are registered, information such as the user's position can be obtained from a shared database or the like and used to estimate the user's emotion.

【０１０７】また、第１実施形態に示したように、表情
認識、音声認識などの手法を用いて複数の入力情報から
感情認識を行なうことも可能である。たとえば、図３１
に示すような、抑揚などの非言語情報を伴った独り言の
うち頻度の高いものを、［快−不快］と、声の大きさや
抑揚の激しさなどの［大−小］との２軸に割り当てれ
ば、それぞれの領域を「喜び」、「驚き」、「納得」、
「怒り」および「失望」などの感情にカテゴライズする
ことができる。また、前述の３軸上の値に割り当てるこ
とにより感情を定義しておいてもよい。これをテキスト
の感情認識と組み合わせることにより、より正確な感情
認識結果を得ることができる。この感情認識部２０３に
よって決定された感情の値は、応答プラン作成部２０４
に渡される。Further, as shown in the first embodiment, emotion recognition can also be performed on a plurality of kinds of input information by using techniques such as facial expression recognition and speech recognition. For example, as shown in FIG. 31, if frequently occurring soliloquies accompanied by non-verbal information such as intonation are placed on two axes, [pleasant-unpleasant] and [large-small] (loudness of the voice, intensity of intonation, and so on), each region can be categorized as an emotion such as "joy", "surprise", "conviction", "anger", or "disappointment". Emotions may also be defined by assigning them to values on the three axes described above. Combining this with emotion recognition from text gives a more accurate emotion recognition result. The emotion value determined by the emotion recognition unit 203 is passed to the response plan creation unit 204.

【０１０８】応答プランは、たとえば、図２１の各応答
プラン作成状態に応じて、図３２に示すようなシステム
の意図種類、および意図内容の組み合わせにより作成さ
れる。たとえばテキスト文を生成する際は、平叙文、疑
問文といった文型や、開示・要求する情報なども付加す
る。また、応答に表情を与えるための態度や親密度など
の表情情報を付加してもよい。The response plan is created, for example, from a combination of the system's intention type and intention content as shown in FIG. 32, according to each response plan creation state of FIG. 21. When a text sentence is to be generated, for example, the sentence type (declarative, interrogative, and so on) and the information to be disclosed or requested are also added. Expression information such as attitude and intimacy may further be added to give the response expressiveness.

【０１０９】表情情報とは顔面における表情に限らず、
テキスト文や音声の応答への表情付けを行なうための情
報である。応答プランを受け取った応答生成部で、図３
２の文例に示すような応答文に変換されて出力される。The expression information is not limited to facial expressions; it is information for adding expressiveness to text or voice responses. The response generation unit that receives the response plan converts it into a response sentence such as the sentence examples of FIG. 32 and outputs it.

【０１１０】まず、感情認識結果によらない簡単な応答
プランと応答生成部の処理について例をあげて説明す
る。応答生成部２０５では、図３３のような書式に則っ
た１文単位の応答プランから、実際にユーザに示す応答
を生成する。たとえば、図３４に示すようなテキスト文
を生成する場合に、たとえばスロット法（長尾真：「人
工知能シリーズ２言語工学」昭晃堂、１９８３）を用い
て、図３５に示すような書式で予め登録されている応答
文辞書から、渡された応答プランにあてはまる文を選択
し、必要な情報を埋め込んで文を生成する。First, a simple response plan that does not depend on the emotion recognition result, and the processing of the response generation unit, are described with an example. The response generation unit 205 generates the response actually presented to the user from a one-sentence response plan that follows the format shown in FIG. 33. For example, to generate text sentences such as those shown in FIG. 34, the slot method (Makoto Nagao: "Artificial Intelligence Series 2, Language Engineering", Shokodo, 1983) can be used: a sentence that fits the received response plan is selected from a response sentence dictionary registered in advance in the format shown in FIG. 35, and the necessary information is embedded in it to generate the sentence.

【０１１１】図３２および図３３の応答プランでは、応
答の意図種類として、たとえばａｃｃｅｐｔ（ユーザの
要求を受け入れる）、ａｎｓｗｅｒ（ユーザの質問に対
して解答する）、ｃｈｉｍｅ（相づち）、ｃｏｎｆｉｒ
ｍ（ユーザの要求を確認する）、ｃｏｎｆｕｓｅ（ユー
ザの意図がわからないことを表明する）、ｇｏｏｄｂｙ
ｅ（対話終了の宣言）、ｇｒｅｅｔｉｎｇ（最初の挨
拶）、ｒｅｊｅｃｔ（ユーザの要求を拒絶する）、ｒｅ
ｑｕｅｓｔ（ユーザに対して情報などを要求する）、ｓ
ｏｒｒｙ（謝罪する）、ｓｕｇｇｅｓｔ（提案する）、
ｔｈａｎｋｓ（感謝する）等がある。意図内容として
は、目的、スケジュール、ユーザ同士の直接対話、ユー
ザの状況、伝言などがある。In the response plans of FIGS. 32 and 33, the intention types of a response include, for example, accept (accept the user's request), answer (answer the user's question), chime (give a back-channel response), confirm (confirm the user's request), confuse (state that the user's intention is not understood), goodbye (declare the end of the dialogue), greeting (initial greeting), reject (reject the user's request), request (request information or the like from the user), sorry (apologize), suggest (make a suggestion), and thanks (express gratitude). The intention contents include purpose, schedule, direct dialogue between users, the user's situation, messages, and the like.

【０１１２】回数とは、同じような状況で同じ応答プラ
ンを作成した回数である。図３４は、図２０の対話例に
対してシステムが生成した応答プランと生成文例とを示
したものである。ここでも、ユーザの要求を獲得するま
では「どのようなご用件でしょうか？」と聞くが、一度
要求が獲得された後は「他にご用件はありませんか？」
と応答を変えるために回数情報を用いている。回数情報
は対話履歴から計数できる。The count is the number of times the same response plan has been created in a similar situation. FIG. 34 shows the response plans and example sentences generated by the system for the dialogue example of FIG. 20. Here too, the system asks "What can I do for you?" until the user's request has been acquired, but once a request has been acquired the count information is used to change the response to "Is there anything else I can do for you?". The count information can be obtained by counting from the dialogue history.

【０１１３】項目には、応答プランで開示する情報、あ
るいはユーザに要求する情報などを記述する。たとえ
ば、図３４の応答プラン３では、項目に「ａｃｔ＝会議
＆ｄａｔｅ＝？」とあり、生成文は「いつ頃の会議です
か? 」となっている。The item field describes the information disclosed by the response plan, the information requested from the user, and the like. For example, in response plan 3 of FIG. 34, the item field is "act=meeting&date=?" and the generated sentence is "When is the meeting?".

【０１１４】これに対して、応答生成部２０５に図３５
のような書式の応答文例を登録しておき、意図種類、意
図内容、文型、項目などが一致する文例を探し出す。項
目ａｃｔは、例文中で変数＄ａｃｔとして用いられてお
り、応答プラン中で指定された値「会議」で置き換えて
文例を完成する。文末には、疑問文の場合は「？」を平
叙文では「。」を付加する。To handle this, example response sentences in a format such as that of FIG. 35 are registered in the response generation unit 205, and an example whose intention type, intention content, sentence type, items, and so on match is searched for. The item act appears in the example sentence as the variable $act, which is replaced with the value "meeting" specified in the response plan to complete the sentence. At the end of the sentence, "?" is appended for an interrogative sentence and "." for a declarative sentence.
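
The lookup and slot filling described above can be sketched in Python as follows. The dictionary entries, field names, and English templates are hypothetical stand-ins for the format of FIG. 35, which is not reproduced here; only the matching on intention type, intention content, sentence type, and items, the substitution of $act, and the final punctuation rule come from the text.

```python
from string import Template

# Hypothetical dictionary entries standing in for the registered example sentences.
RESPONSE_DICTIONARY = [
    {"intent": "request", "content": "schedule", "sentence_type": "interrogative",
     "items": {"act", "date"}, "template": "When is the $act"},
    {"intent": "answer", "content": "schedule", "sentence_type": "declarative",
     "items": {"act", "place"}, "template": "The $act is in $place"},
]

def parse_items(item_field):
    """Turn 'act=meeting&date=?' into {'act': 'meeting', 'date': '?'}."""
    return dict(pair.split("=", 1) for pair in item_field.split("&"))

def generate_sentence(plan):
    """Pick the first matching example sentence and fill its slots; None if nothing matches."""
    items = parse_items(plan["items"])
    for entry in RESPONSE_DICTIONARY:
        if (entry["intent"] == plan["intent"]
                and entry["content"] == plan["content"]
                and entry["sentence_type"] == plan["sentence_type"]
                and entry["items"] == set(items)):
            body = Template(entry["template"]).substitute(items)
            # Append "?" to interrogative sentences and "." to declarative ones.
            return body + ("?" if plan["sentence_type"] == "interrogative" else ".")
    return None

plan = {"intent": "request", "content": "schedule",
        "sentence_type": "interrogative", "items": "act=meeting&date=?"}
sentence = generate_sentence(plan)
```

For the plan shown, which mirrors response plan 3 of FIG. 34, the sketch yields "When is the meeting?".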

【０１１５】図３６は、図３３に応答の表情情報を付加
した例である。たとえば、親密度、態度等といった値を
応答生成部で指定する。親密度は、システムとユーザと
の親しさ、あるいは、情報を公開するユーザと要求する
ユーザとの親しさとし、事前にユーザ情報記憶部２０６
に登録した値を用いる。態度は、丁重、ていねい、普
通、および無礼といった４段階程度の粗さで、たとえ
ば、ユーザの社会的な地位、あるいはユーザ同士の社会
的な関係、あるいはユーザの感情認識結果などにより適
切な値を決定する。FIG. 36 is an example in which expression information for the response is added to FIG. 33. For example, the response generation unit specifies values such as intimacy and attitude. Intimacy is the closeness between the system and the user, or between the user who discloses information and the user who requests it; the value registered in advance in the user information storage unit 206 is used. Attitude has a coarse granularity of about four levels, such as courteous, polite, normal, and rude, and an appropriate value is determined from, for example, the user's social status, the social relationship between the users, or the result of recognizing the user's emotion.

【０１１６】図３７の例は、図３４に態度と親密度とを
加えたものである。ユーザ「佐藤」は情報を公開するユ
ーザ「山本」( エージェントの持ち主) の友達で年齢も
同じとする。よって、たとえば、親密度＝４、態度＝０
といった値を決定し、応答プランに付加する。The example of FIG. 37 is obtained by adding attitude and intimacy to FIG. 34. The user "Sato" is a friend of the user "Yamamoto" (the owner of the agent), who discloses the information, and is the same age. Therefore, values such as intimacy = 4 and attitude = 0 are determined and added to the response plan.

【０１１７】たとえば、感情認識の結果、「佐藤」が怒
っていることが分かれば、親密度を下げて態度の値を上
げるなどの処理を行なう。逆に「佐藤」の機嫌がよいよ
うなら親密度はもっとあげる。逆に、もともと親密度の
低い知り合いの場合は、相手が怒っていても親密度や態
度を変えない。予めこれらの値を決定する応答戦略を図
３８に示すようなルールとして登録しておくことにより
実現できる。また、システム管理者などのユーザによっ
て、ルールを変更させることも可能である。For example, if emotion recognition shows that "Sato" is angry, processing such as lowering the intimacy and raising the attitude value is performed. Conversely, if "Sato" is in a good mood, the intimacy is raised further. On the other hand, for an acquaintance whose intimacy is low to begin with, the intimacy and attitude are left unchanged even if that person is angry. This can be realized by registering in advance the response strategy that determines these values as rules such as those shown in FIG. 38. A user such as a system administrator may also be allowed to change the rules.
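
The three rules in this paragraph can be written as a small Python sketch. The function name, the step size of 1, the value ranges, the emotion labels "anger" and "joy", and the threshold for "low intimacy" are assumptions made for illustration; the actual rule format of FIG. 38 is not reproduced here.

```python
def adjust_expression(intimacy, attitude, emotion,
                      low_intimacy=1, max_intimacy=5, max_attitude=3):
    """Return (intimacy, attitude) adjusted for the recognized emotion.

    Assumed scales: intimacy 0..max_intimacy, attitude 0..max_attitude,
    where a larger attitude value is taken to mean a more polite response.
    """
    if emotion == "anger":
        if intimacy <= low_intimacy:
            # Acquaintance with low intimacy: leave both values unchanged.
            return intimacy, attitude
        # Angry user: lower intimacy and raise the attitude value.
        return max(intimacy - 1, 0), min(attitude + 1, max_attitude)
    if emotion == "joy":
        # User in a good mood: raise intimacy further.
        return min(intimacy + 1, max_intimacy), attitude
    return intimacy, attitude
```

Starting from the values of the FIG. 37 example (intimacy = 4, attitude = 0), an angry "Sato" would be answered with intimacy 3 and attitude 1 under these assumptions.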

【０１１８】親密度や態度を加えた応答文例辞書の例を
図３９に示す。図３９の上部は、スケジュールに関する
ユーザの問い合わせへの解答文の例であり、場所（ｐｌ
ａｃｅ）とスケジュールの項目種類（ａｃｔ）を答える
文を表している。親密度や態度は、０や１などの値か、
「１−３」などのような範囲を持った値を指定してされ
ている。FIG. 39 shows an example of a response sentence example dictionary to which intimacy and attitude have been added. The upper part of FIG. 39 is an example of answer sentences for a user's inquiry about a schedule, and represents sentences that give the place (place) and the schedule item type (act). Intimacy and attitude are specified either as single values such as 0 or 1 or as ranges such as "1-3".

【０１１９】図３９の下部は、挨拶のバリエーション
が、親密度や態度だけでなく、時間帯や、ユーザが以前
にアクセスしたことがあるかどうか（既知／未知）とい
った分類により、記述されている。＄ｐｌａｃｅ、＄ａ
ｃｔおよび＄ｕｓｅｒは変数であり、実際に文を生成す
る際に、応答プランの項目で指定される実際の文字列
や、ｌｏｇｉｎ名などのユーザの名前をあてはめる。時
間帯は、応答プランに含まれていなくても、応答生成部
２０５で時間を調べ、たとえば午前３時以降１１時まで
は「朝」、１１時以降午後５時までは「昼」、以降を
「夜」と求めることができる。In the lower part of FIG. 39, greeting variations are described, classified not only by intimacy and attitude but also by time of day and by whether the user has accessed the system before (known/unknown). $place, $act and $user are variables; when a sentence is actually generated, they are filled in with the actual character strings specified in the item field of the response plan or with the user's name, such as the login name. Even if the time of day is not included in the response plan, the response generation unit 205 can check the time and determine it as, for example, "morning" from 3:00 a.m. to 11:00 a.m., "daytime" from 11:00 a.m. to 5:00 p.m., and "night" thereafter.
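
The time-of-day classification described above can be sketched in Python as follows. The function name and the treatment of the boundaries (each band includes its start time and excludes its end time) are assumptions; the band limits themselves come from the text.

```python
from datetime import time

def time_band(now):
    """Classify a clock time into the greeting bands given in the description."""
    if time(3, 0) <= now < time(11, 0):
        return "morning"   # 3:00 a.m. up to 11:00 a.m.
    if time(11, 0) <= now < time(17, 0):
        return "daytime"   # 11:00 a.m. up to 5:00 p.m.
    return "night"         # 5:00 p.m. up to 3:00 a.m.
```

In use, the response generation step would call something like `time_band(datetime.now().time())` and use the result as one more key when looking up a greeting.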

【０１２０】応答生成部２０５では、前述した意図種
類、意図内容、文型、回数、親密度、態度および項目の
内容などが合致する文例を、図３９のような応答文辞書
から探しだす。また、１度用いた文例に、ｕｓｅｄフラ
グなどを用いて印をつけておき、同じ対話の中では、出
来る限り別の文例を用いるようにしてもよい。合致する
ものがない場合は、応答プランで指定した項目より項目
の指定が少ない文例、態度の丁寧な文例を選択する。The response generation unit 205 searches a response sentence dictionary such as that of FIG. 39 for an example sentence whose intention type, intention content, sentence type, count, intimacy, attitude, item contents, and so on match. An example sentence that has been used once may be marked, for instance with a used flag, so that different example sentences are used as far as possible within the same dialogue. If nothing matches, an example sentence that specifies fewer items than the response plan, or one with a more polite attitude, is selected.
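
The matching of single values and ranges such as "1-3", together with the used flag, can be sketched in Python as follows. The entry layout and function names are hypothetical, and the sketch matches only on intention type, intimacy, and attitude; the fallback to example sentences with fewer items or a more polite attitude is left out for brevity.

```python
def in_range(spec, value):
    """Match a single value such as '0' or an inclusive range such as '1-3'."""
    low, _, high = spec.partition("-")
    return int(low) <= value <= int(high or low)

def select_example(entries, plan):
    """Return the text of a matching example sentence, preferring unused ones."""
    candidates = [
        entry for entry in entries
        if entry["intent"] == plan["intent"]
        and in_range(entry["intimacy"], plan["intimacy"])
        and in_range(entry["attitude"], plan["attitude"])
    ]
    if not candidates:
        return None
    unused = [entry for entry in candidates if not entry.get("used")]
    chosen = (unused or candidates)[0]  # reuse a sentence only when all were used
    chosen["used"] = True               # mark it so the next call prefers another
    return chosen["text"]
```

Calling `select_example` repeatedly with the same plan cycles through the matching sentences before repeating one, which is the effect the used flag is meant to have within a single dialogue.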

【０１２１】また、文の格構造を基に規則合成をする場
合、態度が丁寧な場合は文末のみ「ですます文」、無礼
な場合は「である文」などの形にして応答文を生成して
もよい。利用する語彙や文型を複数用意し、態度や親密
度などの値により変更する。In addition, when sentences are synthesized by rule from the case structure of the sentence, the response may be generated with only the sentence ending changed: the polite "desu/masu" style when the attitude is polite, and the plain "de aru" style when it is rude. A plurality of vocabularies and sentence patterns are prepared and switched according to values such as attitude and intimacy.

【０１２２】応答生成部２０５で生成した応答文は、た
とえば、図２０で示したようなテキストウィンドウに表
示する。また、図２０の左上のエージェントの画像の表
情を、図４０に示すように複数パターン用意しておき、
応答プランに応じて変更してもよい。この例では、応答
プランで指定された発話意図の種類と親密度によって表
示する画像を指定している。静止画だけでなく動画で
も、同様に複数パターン用意して切り替えることによっ
て応答を変更する。写真やビデオなどで実際の人物や人
形などの表情を複数パターン用意してもよい。また、P.
Ekman とW.V.Friesen("Facial Action Coding System",
Consulting Psycologist Presss, 1977)の表情合成規
則を用いて３次元ＣＧなどにより、程度の違う表情を合
成することが可能である。The response sentence generated by the response generation unit 205 is displayed, for example, in a text window such as that shown in FIG. 20. A plurality of expression patterns may also be prepared for the agent image at the upper left of FIG. 20, as shown in FIG. 40, and switched according to the response plan. In this example, the image to be displayed is selected by the type of utterance intention and the intimacy specified in the response plan. The same applies to moving images as well as still images: a plurality of patterns are prepared and switched to change the response. A plurality of expression patterns of a real person, a doll, or the like may be prepared as photographs or video. Expressions of differing intensity can also be synthesized, for example in three-dimensional CG, using the facial expression synthesis rules of P. Ekman and W. V. Friesen ("Facial Action Coding System", Consulting Psychologists Press, 1977).

【０１２３】また、以下の方法によっても応答戦略に基
づく応答生成が可能である。Further, it is also possible to generate a response based on the response strategy by the following method.

【０１２４】たとえば、図１４において、要求獲得段階
の感情は「期待」、「不安」、「困惑」、「焦燥」およ
び「怒り」などが考えられる。図３３乃至図３９の応答
プランの項目に、この感情を付加して、たとえば図４１
のような形式で感情の種類を与えることにより、応答文
や表情などを変更することができる。また、満足度の値
を付加することも可能である。これにより、同じ感情で
も強さの違いなどを表現できる。For example, in FIG. 14, the emotions at the request acquisition stage may be "expectation", "anxiety", "confusion", "frustration", "anger", and so on. By adding this emotion to the items of the response plans of FIGS. 33 to 39 and giving the emotion type in a format such as that of FIG. 41, the response sentence, the expression, and so on can be changed. A satisfaction value can also be added, which makes it possible to express differences in intensity within the same emotion.

【０１２５】また、図１４上でユーザの感情種類を決定
した後、図２９の軸上の値を求め直して、この３軸の値
から応答戦略を決定する方法もある。すなわち、「拒
否」の強い状態では、システムの応答は信用されないた
め、慎重にユーザの入力を理解し実行する必要がある。
ユーザの感情に訴えず、冷淡に見えない程度に事務的な
応答を生成する。謝罪の場合は真摯な態度をとる。逆
に、「受容」の強い状態では、リラックスした応答を行
ってよい。親密な態度、軽口、失敗なども許容される。
また、「切迫」している状態では、システムもあまり余
計な応答を生成しない。ただし、ユーザの要求に答えら
れない場合は、すばやく関連する情報などを提供し誠意
を示す。システムの能力を明確にし、早めに他のユーザ
に助けを求めるよう助言する。ユーザの意図がわかりに
くい場合は、いくつかの選択肢を表示して選ばせてもよ
い。逆に「余裕」のある状態では、できるだけシステム
自身が対応を行なうようにする。There is also a method in which, after the user's emotion type has been determined on FIG. 14, the values on the axes of FIG. 29 are recalculated and the response strategy is determined from the values on these three axes. In a state of strong "rejection", the system's responses are not trusted, so the user's input must be understood and carried out carefully. The system does not appeal to the user's emotions and generates businesslike responses that stop short of appearing cold. When apologizing, it takes a sincere attitude. Conversely, in a state of strong "acceptance", relaxed responses may be given; a familiar attitude, light remarks, and mistakes are tolerated. In an "urgent" state, the system avoids generating superfluous responses. If it cannot satisfy the user's request, however, it promptly provides related information and the like to show good faith. It makes clear what the system can do and advises the user to seek help from other users at an early stage. If the user's intention is hard to understand, several options may be displayed for the user to choose from. Conversely, in an "unhurried" state, the system handles the request itself as far as possible.

【０１２６】「不快」な感情状態では、ユーザはシステ
ムに対して怒りをストレートにぶつけるか、嫌気がさし
てアクセスを終了するかを望むようになる。「不快」が
強くなる前に応答方針を変更する必要がある。ユーザの
意図をシステムが理解できないとすれば、システムがで
きないことをユーザが要求している可能性が高い。シス
テムのサービス内容を呈示し、ユーザに選択させる。シ
ステムができないことをユーザが何度も要求していると
したら、他のユーザやシステムを紹介しアクセスを終了
する。対話の段階が順調に進んでいるのにもかかわらず
「不快」な傾向がある場合は、応答の態度などを変えて
みる。「快」の場合は、応答方針は継続してよい。In an "unpleasant" emotional state, the user comes to want either to vent anger directly at the system or, fed up, to end the access. The response policy must be changed before the "unpleasantness" becomes strong. If the system cannot understand the user's intention, it is likely that the user is requesting something the system cannot do; the system presents its services and lets the user choose. If the user repeatedly requests something the system cannot do, the system refers the user to another user or system and ends the access. If the user tends toward "unpleasant" even though the stages of the dialogue are progressing smoothly, the system tries changing the attitude of its responses or the like. In the "pleasant" case, the response policy may be continued.

【０１２７】以上の応答を実現するために、たとえば、
応答方針を「冗長性」、「同調性」、「正確性」、「優
位性」および「情報公開性」などにより設定する。In order to realize the above response, for example,
The response policy is set by "redundancy", "synchronization", "accuracy", "dominance", "information disclosure", and the like.

【０１２８】冗長性は、余計な応答の割合を設定する。
たとえば、失敗した際に言い訳をするか等を表す。Redundancy sets the proportion of superfluous responses; for example, it expresses whether the system makes excuses when it fails.

【０１２９】同調性は、ユーザやユーザの話に対しての
評価の程度を設定する。同調性が高ければ媚びへつらい
になり、低ければ事務的な応答になる。Synchronism sets the degree of appraisal given to the user and to what the user says. High synchronism results in fawning flattery; low synchronism results in businesslike responses.

【０１３０】正確性は、常に正直に応答するか、ごまか
すか等を設定する。すなわち、ユーザの要求に答えられ
ない際に、正直に言えば「あなたには教えられません」
となる場合も、「ちょっとわかりません」などとごまか
すことが可能になる。Accuracy sets whether the system always responds honestly or evades. That is, when the system cannot satisfy the user's request and an honest answer would be "I cannot tell you that", it becomes possible to evade with something like "I'm not quite sure".

【０１３１】優位性は、システムとユーザとの力関係を
設定する。優位性が高い場合は、ユーザに対して高飛車
な態度となり、低い場合は下手に出る。Superiority sets the power relationship between the system and the user. When superiority is high, the system takes a high-handed attitude toward the user; when it is low, the system takes a humble attitude.

【０１３２】情報公開性は、システムの持つ情報をユー
ザの適正なアクセス権と比較して、多めに公開するか少
な目に公開するかを設定する。情報公開性が高い場合
は、ふだん教えないような情報も譲歩して教えてしまう
が、低い場合はふだんより出し惜しみする。Information disclosure sets whether the system discloses more or less of the information it holds than the user's proper access rights would allow. When information disclosure is high, the system concedes and gives out information it would not normally give; when it is low, the system is more sparing than usual.

【０１３３】これらの値を応答プランに追加し、たとえ
ば図４２のような形式で応答文を生成する。ここでは、
各応答方針を０から５の間で指定している。「拒否」が
強い場合は、「冗長性」および「同調性」を低くする。
「切迫」が強い場合も「冗長性」を低くし、可能であれ
ば「情報公開性」を高くする。「不快」が強まってきた
際は、相手が上司であればシステムの「優位性」を下げ
る。以上のような応答方針の変更規則を複数用意し、３
軸の値がある閾い値を越えたら適用するといった適用条
件とともに記憶しておく。These values are added to the response plan, and a response sentence is generated in a format such as that of FIG. 42. Here, each response policy is specified as a value between 0 and 5. When "rejection" is strong, "redundancy" and "synchronism" are lowered. When "urgency" is strong, "redundancy" is likewise lowered and, if possible, "information disclosure" is raised. When "unpleasantness" grows stronger, the system's "superiority" is lowered if the other party is a superior. A plurality of such rules for changing the response policy are prepared and stored together with application conditions, such as applying a rule when a value on one of the three axes exceeds a certain threshold.

【０１３４】あるいは、３軸の値の組み合わせによっ
て、適用条件を記述してもよい。たとえば、同じ切迫度
でも、「受容」している場合は「冗長性」を高めに設定
してもよいが、「拒否」の場合は低くする、といったル
ールも記述できる。Alternatively, the application conditions may be described by combinations of values on the three axes. For example, a rule can be written so that, at the same degree of urgency, "redundancy" may be set higher when the user is "accepting" but is set lower when the user is "rejecting".
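
The policy-change rules with threshold conditions described in the two preceding paragraphs can be sketched in Python as follows. The key names, the 0-to-1 scale of the axis values, the threshold of 0.5, and the step size of 1 are assumptions made for this sketch; only the 0-to-5 policy range and the direction of each adjustment come from the text.

```python
def clamp(value, low=0, high=5):
    """Keep a response-policy value inside the 0..5 range used in the description."""
    return max(low, min(high, value))

# Each rule pairs an application condition on the emotion axes with policy deltas.
# Axis values are assumed to lie in 0..1; the 0.5 threshold is hypothetical.
RULES = [
    (lambda e: e["rejection"] > 0.5,
     {"redundancy": -1, "synchronism": -1}),
    (lambda e: e["urgency"] > 0.5,
     {"redundancy": -1, "information_disclosure": +1}),
    (lambda e: e["displeasure"] > 0.5 and e.get("is_superior", False),
     {"superiority": -1}),
]

def apply_rules(policy, emotion):
    """Return a new policy with every rule whose condition holds applied in order."""
    updated = dict(policy)
    for condition, deltas in RULES:
        if condition(emotion):
            for name, delta in deltas.items():
                updated[name] = clamp(updated[name] + delta)
    return updated

base = {"redundancy": 3, "synchronism": 3, "accuracy": 3,
        "superiority": 3, "information_disclosure": 3}
result = apply_rules(base, {"rejection": 0.8, "urgency": 0.9, "displeasure": 0.1})
```

Because each condition is an arbitrary function of the emotion values, a rule that combines axes (for example, high urgency together with acceptance) fits the same table without changing `apply_rules`.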

【０１３５】また、図４２のような応答プランにより、
応答時の画像の表情も変更できる。同調性が高ければ愛
敬のある表情にし、低ければ事務的な表情にする。冗長
性が高ければ表情豊かにし、低ければ単純でわかりやす
い数種類の表情に限定する。優位性が高ければ尊大な表
情、低ければおどおどした表情とする。以上のような表
情パターンを複数用意しておき、応答文の生成とともに
表示を変更してもよい。Further, a response plan such as that of FIG. 42 can also be used to change the expression of the image shown with the response. If synchronism is high, an amiable expression is used; if low, a businesslike expression. If redundancy is high, the expressions are made rich; if low, they are limited to a few simple, easily understood expressions. If superiority is high, an arrogant expression is used; if low, a timid expression. A plurality of such expression patterns may be prepared and the display changed as the response sentence is generated.

【０１３６】また、応答戦略は、応答文の表情などを変
更させるだけでなく、対話遷移モデルの変更によっても
実現できる。たとえば、図２１の対話遷移モデル例に新
たな状態を追加し、遷移条件にユーザの感情状態やユー
ザとの親密度や態度などを加えることにより、たとえば
図４３に示すような複雑な応答戦略に基づく応答を実現
できる。The response strategy can be realized not only by changing the expressiveness of the response sentence and the like but also by changing the dialogue transition model. For example, by adding new states to the dialogue transition model example of FIG. 21 and adding the user's emotional state, the intimacy with the user, the attitude, and so on to the transition conditions, responses based on a complex response strategy such as that shown in FIG. 43 can be realized.

【０１３７】この方法では対話遷移モデルが複雑になる
嫌いがあるが、その場合は、対話遷移モデルを複数用意
し、ユーザの感情によって別のモデルに切り替えること
により単純さを保つことができる。また、ユーザの感情
状態を条件にし、動的に遷移条件や状態を追加・変更す
るための規則を持つことによって実現すれば、組み合わ
せ爆発の問題は回避できる。たとえば、ユーザが焦って
いる場合は、図２２に示した実行条件リストの尤度を下
げる、あるいは実行条件を減らすなどにより対話速度を
加速するなどの戦略が実現できる。This method tends to make the dialogue transition model complicated. In that case, simplicity can be preserved by preparing a plurality of dialogue transition models and switching to a different model according to the user's emotion. Alternatively, if the strategy is realized with rules that dynamically add or change transition conditions and states, conditioned on the user's emotional state, the problem of combinatorial explosion can be avoided. For example, when the user is impatient, a strategy such as speeding up the dialogue by lowering the likelihoods in the execution condition list shown in FIG. 22 or by reducing the number of execution conditions can be realized.

【０１３８】[0138]

【発明の効果】以上詳述したように本発明によれば、対
話の状況に応じたユーザの感情を認識することにより、
ユーザの感情を考慮した応答戦略に則った応答を生成す
る事ができる。すなわち、ユーザが明示的に表現する要
求のみに答えるのではなく、明示的でない要求をも考慮
した適切な応答を生成することが可能になり、ユーザの
精神的な負担を軽減し、対話の円滑化、効率化をはかる
ことができる。この際、単一の入力形態ではなく、テキ
スト、音声、画像などの複数の入力情報を用いて、より
正確にユーザの意図と感情を理解できる。また、複数の
応答形態を組み合わせて用い、それに対するユーザの感
情状態の変化を知ることにより、ユーザの好みや状況に
あわせた応答形態を選択することが可能になる。As described above in detail, according to the present invention, by recognizing the user's emotion in accordance with the state of the dialogue, a response that follows a response strategy taking the user's emotion into account can be generated. That is, rather than answering only the requests that the user expresses explicitly, the device can generate appropriate responses that also take implicit requests into account, which reduces the user's mental burden and makes the dialogue smoother and more efficient. In doing so, the user's intention and emotion can be understood more accurately by using a plurality of kinds of input information such as text, voice, and images instead of a single input form. In addition, by using a plurality of response forms in combination and observing how the user's emotional state changes in response, a response form suited to the user's preferences and situation can be selected.

【０１３９】なお、本発明は情報の公開のみならず、同
様な感情を扱う他の用途の対話システムにも適用するこ
とができる。また、ユーザ間の通信にも応用し、各ユー
ザの感情状態を代わりに伝達する機能を追加すれば、ユ
ーザ間の意図に対する見当外れの解釈を少なくし、ネッ
トワーク上での対話や対話目的の達成を効率的に行なう
ことができることとなる。The present invention can be applied not only to the disclosure of information but also to dialogue systems for other purposes that handle emotions in a similar way. Furthermore, if it is applied to communication between users and a function is added that conveys each user's emotional state on the user's behalf, misreadings of intention between users can be reduced, and dialogue over a network and the achievement of its purpose can be carried out efficiently.

[Brief description of drawings]

【図１】本発明の第１実施形態の情報公開装置の機能ブ
ロック図。FIG. 1 is a functional block diagram of an information disclosure device according to a first embodiment of the present invention.

【図２】第１実施形態の情報公開装置の動作手順を説明
するためのフローチャート。FIG. 2 is a flowchart for explaining an operation procedure of the information disclosure device of the first embodiment.

【図３】第１実施形態の対話回数感情モデルを示す図。FIG. 3 is a diagram showing a dialogue count emotion model of the first embodiment.

【図４】第１実施形態の感情を加味した対話回数感情モ
デルを示す図。FIG. 4 is a diagram showing a dialogue count emotion model in which the emotion of the first embodiment is added.

【図５】第１実施形態の修正条件を示す図。FIG. 5 is a diagram showing correction conditions according to the first embodiment.

【図６】第１実施形態の感情語が表現する感情を特定す
るアルゴリズムを示す図。FIG. 6 is a diagram showing an algorithm for identifying an emotion expressed by an emotion word according to the first embodiment.

【図７】第１実施形態の対話の一例を示す図。FIG. 7 is a diagram showing an example of a dialogue according to the first embodiment.

【図８】第１実施形態の感情推移モデルを示す図。FIG. 8 is a diagram showing an emotion transition model of the first embodiment.

【図９】第１実施形態の感情推移モデルを示す図。FIG. 9 is a diagram showing an emotion transition model of the first embodiment.

【図１０】第１実施形態の感情推移モデルを示す図。FIG. 10 is a diagram showing an emotion transition model of the first embodiment.

【図１１】第１実施形態の談話遷移モデルを示す図。FIG. 11 is a diagram showing a discourse transition model of the first embodiment.

【図１２】第１実施形態の感情対応テーブルを示す図。FIG. 12 is a diagram showing an emotion correspondence table according to the first embodiment.

【図１３】第１実施形態のユーザ感情の数値化のアルゴ
リズムを示す図。FIG. 13 is a diagram showing an algorithm for digitizing user emotions according to the first embodiment.

【図１４】第１実施形態の感情推移モデルを示す図。FIG. 14 is a diagram showing an emotion transition model of the first embodiment.

【図１５】本発明の第２実施形態の情報公開装置の機能
ブロック図。FIG. 15 is a functional block diagram of an information disclosure device according to a second embodiment of the present invention.

【図１６】第２実施形態の情報公開装置の機能ブロック
図。FIG. 16 is a functional block diagram of an information disclosure device according to a second embodiment.

【図１７】第２実施形態の情報公開装置の機能ブロック
図。FIG. 17 is a functional block diagram of the information disclosure device according to the second embodiment.

【図１８】第２実施形態の情報公開装置の機能ブロック
図。FIG. 18 is a functional block diagram of an information disclosure device according to a second embodiment.

【図１９】第２実施形態の情報公開装置の動作手順を説
明するためのフローチャート。FIG. 19 is a flowchart for explaining an operation procedure of the information disclosure device of the second embodiment.

【図２０】第２実施形態の応答の一例を示す図。FIG. 20 is a diagram showing an example of a response of the second embodiment.

【図２１】第２実施形態の対話遷移モデルを示す図。FIG. 21 is a diagram showing a dialogue transition model of the second embodiment.

【図２２】第２実施形態の実行条件リストを示す図。FIG. 22 is a diagram showing an execution condition list according to the second embodiment.

【図２３】第２実施形態の対話履歴記憶構造を示す図。FIG. 23 is a diagram showing a dialogue history storage structure of the second embodiment.

【図２４】第２実施形態の発話意図の一例を示す図。FIG. 24 is a diagram showing an example of utterance intention according to the second embodiment.

【図２５】第２実施形態のユーザの要求の一例を示す
図。FIG. 25 is a diagram showing an example of a user request according to the second embodiment.

【図２６】第２実施形態のキーワード辞書を示す図。FIG. 26 is a diagram showing a keyword dictionary of the second embodiment.

【図２７】第２実施形態の発話の意味表現を示す図。FIG. 27 is a diagram showing a semantic representation of speech according to the second embodiment.

【図２８】第２実施形態の感情空間を構成する３軸を示
す図。FIG. 28 is a diagram showing three axes forming the emotional space according to the second embodiment.

【図２９】第２実施形態の感情空間に感情名を割り当て
た状態を示す図。FIG. 29 is a diagram showing a state in which an emotion name is assigned to the emotion space according to the second embodiment.

【図３０】第２実施形態の感情領域テーブルを示す図。FIG. 30 is a diagram showing an emotion area table according to the second embodiment.

【図３１】第２実施形態の感情空間に非言語情報を割り
当てた状態を示す図。FIG. 31 is a diagram showing a state in which non-verbal information is assigned to the emotion space according to the second embodiment.

【図３２】第２実施形態の応答プラン作成テーブルを示
す図。FIG. 32 is a diagram showing a response plan creation table according to the second embodiment.

【図３３】第２実施形態の応答の書式を示す図。FIG. 33 is a diagram showing a format of a response according to the second embodiment.

【図３４】第２実施形態の応答生成例を示す図。FIG. 34 is a diagram showing an example of response generation according to the second embodiment.

【図３５】第２実施形態の応答の書式を示す図。FIG. 35 is a diagram showing a format of a response according to the second embodiment.

【図３６】第２実施形態の表情情報を付加した応答の書
式を示す図。FIG. 36 is a diagram showing a format of a response to which facial expression information of the second embodiment is added.

【図３７】第２実施形態の態度と親密度とを加えた応答
生成例を示す図。FIG. 37 is a diagram showing an example of response generation in which the attitude and intimacy of the second embodiment are added.

【図３８】第２実施形態の応答戦略を示す図。FIG. 38 is a view showing a response strategy of the second embodiment.

【図３９】第２実施形態の応答文例辞書の一例を示す
図。FIG. 39 is a diagram showing an example of a response sentence example dictionary according to the second embodiment.

【図４０】第２実施形態のエージェントの画像の表情の
パターンを示す図。FIG. 40 is a diagram showing a facial expression pattern of an image of an agent according to the second embodiment.

【図４１】第２実施形態の感情認識後の応答生成例を示
す図。FIG. 41 is a diagram showing an example of response generation after emotion recognition according to the second embodiment.

【図４２】第２実施形態の応用プランの一例を示す図。FIG. 42 is a diagram showing an example of an application plan of the second embodiment.

【図４３】第２実施形態の応答戦略を示す図。FIG. 43 is a diagram showing a response strategy of the second embodiment.

[Explanation of symbols]

１０…情報公開装置、１０１…入力部、１０２…データ
記憶部、１０３…データ検索管理部、１０４…要求受付
部、１０５…応答プラン作成部、１０６…ユーザ感情認
識部、１０７…応答生成部、２０…情報公開装置、２０
１…入力部、２０２…意図感情情報抽出部、２０３…感
情認識部、２０４…応答プラン生成部、２０５…応答生
成部、２０６…ユーザ情報記憶部、２０７…履歴記憶
部、２０８ａ〜ｃ…データ通信部、２０９ａ〜ｃ…プロ
セス、２１０…対話管理部、２１１…検索部、２１２…
データ記憶部。10 ... Information disclosure device, 101 ... Input unit, 102 ... Data storage unit, 103 ... Data search management unit, 104 ... Request reception unit, 105 ... Response plan creation unit, 106 ... User emotion recognition unit, 107 ... Response generation unit, 20 ... Information disclosure device, 201 ... Input unit, 202 ... Intention and emotion information extraction unit, 203 ... Emotion recognition unit, 204 ... Response plan generation unit, 205 ... Response generation unit, 206 ... User information storage unit, 207 ... History storage unit, 208a-c ... Data communication units, 209a-c ... Processes, 210 ... Dialogue management unit, 211 ... Search unit, 212 ... Data storage unit.

Claims

1. An information disclosure device comprising: input means for inputting data in a plurality of forms including text, voice, image, and pointing position; extraction means for extracting the user's intention and emotion information from the data input by the input means; response plan creation means for creating a response plan based on the extraction result of the extraction means; and response generation means for generating a response to the user based on the created response plan, the device further comprising emotion recognition means for recognizing the emotional state of the user from the internal state of the response plan creation means, the extracted intention and emotion information of the user, and the transition along the time axis of dialogue situation information including the type of the created response plan, wherein the response plan creation means selects or changes a response strategy according to the recognition result of the emotion recognition means and creates a response plan that matches the response strategy.