JPH11305985A

JPH11305985A - Interactive device

Info

Publication number: JPH11305985A
Application number: JP10112008A
Authority: JP
Inventors: Susumu Seki; 進関
Original assignee: Sharp Corp; Real World Computing Partnership
Current assignee: Sharp Corp; Real World Computing Partnership
Priority date: 1998-04-22
Filing date: 1998-04-22
Publication date: 1999-11-05

Abstract

PROBLEM TO BE SOLVED: To provide a device that generates responses to a user, such as backchannel responses (aizuchi), by which an artificial agent actively controls the flow of a dialogue: the agent responds autonomously and actively so that the dialogue proceeds smoothly, and its responses are applied to the user more smoothly and realistically. SOLUTION: An input means 11 multiplies the logarithm of the user's input voice power by a constant and outputs the result as the user's state. A response generating means 12 receives this output. When the change in the agent's desire to give a backchannel response matches the user's state, the excitement of the two is judged to match; the agent then withholds responses in order to listen attentively, which raises its desire to respond. When that desire has grown large, or when the user's speech weakens, a backchannel response is given through an output means 13. A function that reduces the desire to respond is also provided. Through this operation the artificial agent gives backchannel responses autonomously while also taking the user's speaking condition into account.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to an interactive device and, more specifically, to an interface that allows a system to operate smoothly when an electronic device, such as a computer or a toy, and a human exchange information and together constitute one system.

[0002]

[Description of the Related Art] In recent years, technological development has made multimedia, which outputs multiple kinds of information such as images and sound, widespread. With advances in computing, several multimodal systems have also been proposed that, to make computers easier for many people to use, take as input the modes of information people ordinarily use, such as speech and image information like gestures. In some systems an anthropomorphic artificial agent acts as the point of contact with the machine so that people can approach it more easily. One such technique aims to make dialogue between a human and an agent proceed smoothly by having the agent give backchannel responses (aizuchi) (Japanese Patent Application Laid-Open No. Hei 8-211986). Another makes conversation proceed smoothly by having the agent convey its state of understanding to the other party by nodding (NTT Basic Research Laboratories Forum '97 symposium materials). A human interface device has also been proposed that autonomously generates the emotions and attitudes of an anthropomorphic artificial agent and emphasizes the rhythm and pauses of the dialogue with the user (Japanese Patent Application No. Hei 8-301954, filed by the same applicant as the present application).

[0003]

[Problems to Be Solved by the Invention] In Japanese Patent Application Laid-Open No. Hei 8-211968 and the NTT Basic Research Laboratories Forum '97 symposium materials cited above, backchannel responses are given passively and reflexively in reaction to the utterances of the person (user), so it is difficult for the machine (artificial agent) to actively encourage speech with a rhythm of its own. The human interface device cited above (Japanese Patent Application No. Hei 8-301954) proposes a framework in which the artificial agent advances the dialogue autonomously and actively. The present invention is based on the same idea as that prior application, and further aims to apply responses to the user more smoothly and realistically. Its object is to provide a device that generates responses to the user, such as backchannel responses, by which the artificial agent actively controls the flow of the dialogue: in dialogue situations where the user takes the lead, the agent responds autonomously and actively so that the dialogue proceeds smoothly.

[0004]

[Means for Solving the Problems] The invention of claim 1 is an interactive device using an artificial agent, comprising: input means for inputting, as the user's state, a voice feature obtained from speech uttered by the user; response generation means for obtaining a parameter representing the state of the artificial agent that converses with the user, with the input from the input means as at least one element, and for generating a response to the user according to a value that increases or decreases with changes in the state of that parameter; and output means for expressing the response generated by the response generation means to the user by at least one of voice and image. Let function 1 be a function that takes negative values and whose absolute value increases by a predetermined amount as the value of the parameter increases, and let function 2 be a function that takes a predetermined value according to changes in the difference between the value representing the user's state and the time derivative of the parameter. The response generation means creates a state function expressed by a differential equation in which the sum of function 1 and function 2 equals the second derivative of the parameter, generates the artificial agent's response to the user when the time derivative of the parameter given by the state function is negative, and outputs the response with a magnitude corresponding to the absolute value of that time derivative.

[0005] The invention of claim 2 is the interactive device according to claim 1, wherein the logarithm of voice power is used as the voice feature.

[0006] The invention of claim 3 is the interactive device according to claim 1, wherein voice pitch is used as the voice feature.

[0007] The invention of claim 4 is the interactive device according to any one of claims 1 to 3, wherein the response output to the user by the output means is a backchannel response (aizuchi).

[0008]

[Embodiments of the Invention] FIG. 1 shows the basic configuration of an interactive device according to the present invention. As shown in FIG. 1, the device comprises input means 11, which extracts power and pitch as feature values from the speech of a user conversing through the device; output means 13, which outputs responses to the user through voice and images; and response generation means 12, which, based on the information obtained by the input means 11, directs the output of a response at an appropriate time. An embodiment following the basic configuration of FIG. 1 is described below with reference to the accompanying drawings, for the case where the response is a backchannel response (aizuchi).

[0009] The input means 11 extracts the user's voice power, multiplies its logarithm by an appropriate constant, and outputs the result υ to the response generation means 12 as the user's state. The response generation means 12 holds a desire to give a backchannel response. When the change in that desire exactly matches the user's voice power, the way the user's excitement is rising (heightening of emotion) and the way the artificial agent's excitement is rising are considered to match. The agent is then assumed to be strongly influenced by the user and listening attentively to what the user says; it therefore gives no backchannel response, and its desire to respond rises. As that desire rises, a backchannel generation drive arises that seeks to discharge it. When this drive becomes very large, or when the user's speech weakens and it becomes easier to respond, the agent gives a backchannel response, and doing so reduces the desire to respond. The magnitude of the backchannel response is taken to be the amount by which the desire decreases. In this way the artificial agent does not merely react reflexively to the user's speech; it gives backchannel responses according to its own feelings while also taking the user's speaking condition into account.
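The input step described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the source says the logarithm of the voice power is multiplied by "an appropriate constant" but gives neither the constant nor the way power is measured, so the frame-based mean-square power, the `gain` parameter, and the `floor` that guards against the logarithm of zero are all hypothetical choices.

```python
import math


def log_power_feature(samples, gain=1.0, floor=1e-10):
    """Return gain * log(mean-square power) for one frame of audio samples.

    `gain` stands in for the patent's unspecified "appropriate constant";
    `floor` is an added safeguard (not in the source) so that a silent
    frame does not produce log(0).
    """
    if not samples:
        raise ValueError("frame must contain at least one sample")
    # Mean-square amplitude is used here as the frame's voice power.
    power = sum(s * s for s in samples) / len(samples)
    return gain * math.log(max(power, floor))
```

Calling `log_power_feature(frame)` once per audio frame would yield the sequence of υ values that the response generation means consumes; louder frames give larger values.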

[0010] A method for realizing this kind of backchannel generation is described below using equations. The artificial agent has a state parameter q₁ representing its state; here q₁ is the desire to give a backchannel response. The time derivative of this parameter is denoted q₂, which represents the change in the desire to respond. q₁ and q₂ vary according to differential equations of the form of equations (1) and (2) below.

[0011]

[Equation 1]

$$\frac{dq_1}{dt} = q_2 \tag{1}$$

$$\frac{dq_2}{dt} = f_{in}(q_1) + f_{out}(\upsilon - q_2) \tag{2}$$

[0012] Here the first term of equation (2), f_in(q₁), may be called the backchannel generation drive function. As the desire q₁ to respond increases, its absolute value increases, and it takes a negative value, that is, it acts in the direction that decreases q₁. The second term of equation (2), f_out(υ − q₂), may be called the user influence function and expresses the influence of the user on the artificial agent. The argument (υ − q₂) is the difference between the input υ from the user and the rate of change q₂ of the agent's state parameter q₁. This means that the value of the user influence function depends not only on the user's state but also on how the agent's desire to respond is changing at that moment.

[0013] Suppose the user influence function f_out(υ − q₂) satisfies f_out(+0) > 0 and f′_out(+0) < 0 (where ′ denotes differentiation). Then the value of f_out at υ − q₂ = 0, where the change in the desire to respond coincides with the voice input, exceeds its value at υ − q₂ = ε > 0, where the user's input has grown and deviates slightly from that change: f_out(+0) > f_out(ε) > 0. The effect on the second derivative of the parameter q₁ (dq₂/dt) is therefore large when the two coincide, and consequently so is the effect on q₁.
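The role of the condition f′_out(+0) < 0 can be made explicit with a first-order approximation. The derivation below is not part of the source; it assumes f_out is differentiable just to the right of zero and that υ − q₂ is small and positive.

```latex
% First-order expansion of the user influence function for small x = \upsilon - q_2 > 0
f_{out}(\upsilon - q_2) \approx f_{out}(+0) + f'_{out}(+0)\,(\upsilon - q_2)

% Substituting into equation (2), dq_2/dt = f_{in}(q_1) + f_{out}(\upsilon - q_2):
\frac{dq_2}{dt} \approx f_{in}(q_1) + f_{out}(+0) + f'_{out}(+0)\,\upsilon
                 \;-\; f'_{out}(+0)\, q_2

% Because f'_{out}(+0) < 0, the coefficient of q_2 is positive:
-\,f'_{out}(+0) > 0
```

Under this approximation a positive q₂ feeds back into its own growth, which is the opposite of ordinary damping. The approximation says nothing about the behavior once υ − q₂ becomes large or negative.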

[0014] As an example, let f_in and f_out be

$$f_{in}(x) = -\omega^2 x \tag{3}$$

$$f_{out}(x) = b(x - a)^2 + c, \quad x \ge 0 \tag{4}$$

$$f_{out}(x) = -b(x + a)^2 - c, \quad x < 0 \tag{5}$$

In equations (3) to (5), a, b, c and ω are positive constants. FIG. 2 shows the graphs of equations (4) and (5) for a = b = c = 1. As FIG. 2 shows, the output of equation (4) for the input x = +0 is larger than its output for nearby positive inputs x = ε > 0. Equations (1) to (5) are a form of the equation of an oscillator with negative resistance, and such oscillators are known to exhibit self-excited oscillation: depending on the value of υ, q₁ oscillates with a certain period and its amplitude grows. This corresponds to the artificial agent becoming excited by the user's speech and giving backchannel responses.
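The example functions of equations (3) to (5) translate directly into code. The sketch below is illustrative only; the Python names and the default values a = b = c = 1 (the values used for FIG. 2) and ω = 1 are choices made here, since the source gives no value for ω.

```python
def f_in(q1, omega=1.0):
    """Backchannel generation drive, equation (3): -omega^2 * q1."""
    return -(omega ** 2) * q1


def f_out(x, a=1.0, b=1.0, c=1.0):
    """User influence function, equations (4) and (5).

    x is the difference between the user input and q2.
    """
    if x >= 0:
        return b * (x - a) ** 2 + c   # equation (4), x >= 0
    return -b * (x + a) ** 2 - c      # equation (5), x < 0
```

With the defaults, `f_out(0.0)` evaluates to 2.0 and `f_out(0.5)` to 1.25, which matches the property f_out(+0) > f_out(ε) > 0 described for small positive ε.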

[0015] From equation (2), the change q₂ in the desire q₁ to respond is determined by the backchannel generation drive f_in and the user influence f_out. When the user's voice input υ is greater than the time derivative q₂ of the state parameter q₁, equations (2) and (4) give f_out > 0, which pushes q₂ in the positive direction. Once q₂ is positive, equation (1) makes q₁ change in the positive direction, so that q₁ eventually increases and takes a positive value. When q₁ is positive, the backchannel generation drive function (see equation (3)) gives f_in a negative value whose absolute value grows as q₁ grows; the larger it is, the more it pushes q₂ down, and once q₂ becomes negative, equation (1) makes q₁ decrease. The relationship between f_in and f_out, that is, between the backchannel generation drive and the user's speech (user influence), ultimately determines whether q₁ rises or falls. A backchannel response is given when q₁ is decreasing, that is, when q₂ is negative, and its strength is the absolute value of q₂. The output means 13 follows the backchannel strength sent from the backchannel generation means serving as the response generation means 12, and outputs, through the voice and image of a simulated human, responses such as a large nod, a small nod, or a backchannel response that includes an utterance such as "ee" or "hai" ("yes").

[0016] FIG. 3 is a flowchart of the processing performed by the response generation means in an embodiment of the present invention in which the response generation means is applied to backchannel responses; the flow is described below with reference to FIG. 3. When processing starts, initial values are first set for the state parameter q₁ and its time derivative q₂ (step S301), and the backchannel generation drive f_in is computed from the current values (step S302). The logarithm of the power of the user's speech is then input as the feature value v (step S303). Next, the time derivative q₂ of the state parameter is compared with the input feature value v (step S304). If v is greater than or equal to q₂, the user influence f_out is computed by equation (4) (step S305); otherwise (v < q₂) it is computed by equation (5) (step S306). Then, from the computed f_in and f_out and the time dt taken by the loop, the state parameter q₁ (q₁ = q₁ + q₂dt) and its time derivative q₂ (q₂ = q₂ + (f_in + f_out)dt) for the next loop iteration are computed (step S307). It is then determined whether the time derivative q₂ obtained in step S307 is negative (step S308). If it is negative, a backchannel response is given with a strength according to its absolute value |q₂| (step S309), and processing returns to step S302. Otherwise, processing returns to step S302 without doing anything.
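The loop of FIG. 3 can be sketched as the following Python function, which applies the update rules of step S307 once per input feature value. It is a minimal illustration, not the patented implementation: the function name, the fixed `dt`, the zero initial values, and the default constants are assumptions, and the input is taken as a precomputed list of feature values rather than live audio.

```python
def run_backchannel(features, dt=0.01, omega=1.0, a=1.0, b=1.0, c=1.0,
                    q1=0.0, q2=0.0):
    """Return one backchannel strength per input feature value.

    A strength of 0.0 means no backchannel response on that iteration.
    q1 and q2 are the initial values set in step S301.
    """
    strengths = []
    for v in features:                      # S303: next feature value
        f_in = -(omega ** 2) * q1           # S302: drive, equation (3)
        x = v - q2                          # S304: compare v with q2
        if x >= 0:
            f_out = b * (x - a) ** 2 + c    # S305: equation (4)
        else:
            f_out = -b * (x + a) ** 2 - c   # S306: equation (5)
        # S307: both updates use the values from the start of the iteration.
        q1, q2 = q1 + q2 * dt, q2 + (f_in + f_out) * dt
        # S308/S309: respond with strength |q2| only when q2 is negative.
        strengths.append(abs(q2) if q2 < 0 else 0.0)
    return strengths
```

The returned list could then be mapped by an output stage to nods or utterances of different sizes; how strengths map to concrete outputs is not specified in the source beyond "large nod", "small nod" and spoken responses.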

[0017] FIG. 4 shows an example of the logarithm of the power of speech input by an actual human user, together with the corresponding backchannel output strength |q₂| of the interactive device (the absolute value of the time derivative q₂ of the state parameter). The graph shows that backchannel responses tend to occur when the human's speech input weakens, but that they do not occur at every point where the input weakens, and that their strength varies. In other words, the dialogue is varied rather than mechanically responsive.

[0018]

[Effects of the Invention] Effect of the invention of claim 1: The artificial agent responds autonomously according to a state generated by reflecting the influence of the voice feature of the user's speech in the parameter representing the agent's own state. This enlivens the conversation with the user and allows the dialogue to proceed smoothly.

[0019] Effect of the invention of claim 2: In addition to the effect of the invention of claim 1, because the artificial agent uses voice power as the voice feature, it can respond without interfering with the user's speech.

[0020] Effect of the invention of claim 3: In addition to the effect of the invention of claim 1, because the artificial agent uses voice pitch as the voice feature, it can capture the user's emotional information and respond in keeping with the flow of the dialogue.

[0021] Effect of the invention of claim 4: In addition to the effects of the inventions of claims 1 to 3, because the artificial agent autonomously gives backchannel responses to the user, it can actively encourage the user to speak and enliven the dialogue.

[Brief description of the drawings]

FIG. 1 is a diagram showing the basic configuration of an interactive device according to the present invention.

FIG. 2 is a graph showing an example of the user influence function used in the interactive device according to the present invention.

FIG. 3 is a flowchart showing an example of the operation of the response generation means of the interactive device according to the present invention.

FIG. 4 is a graph showing experimental results obtained with an embodiment of the interactive device according to the present invention, plotting the backchannel output strength against the user's input voice power (logarithm).

[Explanation of symbols]

11: input means; 12: response generation means; 13: output means.

Continued from the front page: (51) Int.Cl.⁶, identification code, FI: G10L 9/00 301; G06F 15/62 321A

Claims

[Claims]

1. An interactive device using an artificial agent, comprising: input means for inputting, as the user's state, a voice feature obtained from speech uttered by a user; response generation means for obtaining a parameter representing the state of the artificial agent that converses with the user, with the input from the input means as at least one element, and for generating a response to the user according to a value that increases or decreases with changes in the state of the parameter; and output means for expressing the response generated by the response generation means to the user by at least one of voice and image,
wherein, with function 1 defined as a function that takes negative values and whose absolute value increases by a predetermined amount as the value of the parameter increases, and function 2 defined as a function that takes a predetermined value according to changes in the difference between the value representing the user's state and the time derivative of the parameter, the response generation means creates a state function expressed by a differential equation in which the sum of function 1 and function 2 equals the second derivative of the parameter, generates the artificial agent's response to the user when the time derivative of the parameter given by the state function is negative, and outputs the response with a magnitude corresponding to the absolute value of the time derivative of the parameter.

2. The interactive device according to claim 1, wherein the logarithm of voice power is used as the voice feature.

3. The interactive device according to claim 1, wherein voice pitch is used as the voice feature.

4. The interactive device according to any one of claims 1 to 3, wherein the response output to the user by the output means is a backchannel response (aizuchi).