JPH10143351A

JPH10143351A - Interface unit

Info

Publication number: JPH10143351A
Application number: JP30195496A
Authority: JP
Inventors: Susumu Seki (関 進); Haruo Hide (日出 晴夫); Kenji Sakamoto (坂本 憲治)
Original assignee: GIJUTSU KENKYU KUMIAI SHINJOHO SHIYORI KAIHATSU KIKO; Sharp Corp
Current assignee: GIJUTSU KENKYU KUMIAI SHINJOHO SHIYORI KAIHATSU KIKO; Sharp Corp
Priority date: 1996-11-13
Filing date: 1996-11-13
Publication date: 1998-05-29

Abstract

PROBLEM TO BE SOLVED: To provide an interface unit having an input/output means that allows multimodal interaction between a computer-side agent and a human to proceed smoothly. SOLUTION: The device comprises an inputting means 11 that inputs the user's gestures, voice pitch, etc., obtained from images and voices produced by the user; a feeling and attitude generating means 12 that generates internal state parameters of the feeling and attitude of a personified artificial agent, with the input from the means 11 as at least one element; and an outputting means 13 that represents and outputs the feeling and attitude of the artificial agent as an image and voice in accordance with the internal state parameters generated by the means 12. Because rhythm, timing, etc., are significant during interaction, the means 12 represents feeling and attitude with a system of internal state parameters that takes temporal change into consideration.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to an interface device, and in particular to an interface device serving as an input/output means that makes the exchange of information between a human and an electronic device, such as a computer or a computerized toy, proceed more smoothly.

[0002]

[Prior Art] In recent years, technological progress has advanced multimedia that outputs multiple kinds of information, such as images and sound, in a multimodal manner. In addition, as computers have advanced, several multimodal systems have been proposed that, in order to make computers easier for many people to use, take as input the modes of information that humans ordinarily use, such as voice information and image information such as gestures. Furthermore, some systems use an anthropomorphic agent as the point of contact with the machine so that humans can approach the machine more easily. Among agents serving as human interfaces, an emotion simulating device has also been proposed that has a pseudo-emotion model and aims to make the agent behave more like a human so that information is conveyed smoothly (see, for example, JP-A-6-12401).

[0003]

[Problems to Be Solved by the Invention] However, in the prior art described above, gestures and the like were handled in the same way as linguistic information, either as a substitute for it or as something accompanying it. Yet some multimodal information, such as gestures, pitch, and timing, is not suited to the symbolic handling used for linguistic information. For such information, it is important to establish a method that handles it as pattern information that is continuous in space and time. In addition, in anthropomorphic agents having emotions, emotion is decomposed into basic emotional elements, treated as a vector of those elements, and assumed to change linearly as time evolves. Such a method can express emotion only in a relatively simple way, and it cannot express timing and similar elements that are effective in an interface. In communication between people, emotions and attitudes change in diverse ways in response to subtle changes in the situation, and people actively advance a dialogue by varying temporal elements such as pauses in speech, rhythm, and timing. It is considered that an anthropomorphic agent for the input and output of information to and from a computer can likewise realize a better interface by actively exploiting such diverse changes in attitude and changes in temporal elements, just as humans do with each other.

[0004] The problem to be solved by the present invention is to provide an interface device equipped with information input/output means suited to multimodal use, so that a computer-side agent capable of such dialogue with a person can carry on a multimodal dialogue smoothly. Specifically, the device increases the expressiveness of the agent's behavior, diversifies the agent's emotions and attitudes, varies behavior according to the history of emotions and attitudes, and generates emotions and attitudes according to a period inherent to the agent itself while at the same time synchronizing with the user's period.

[0005]

[Means for Solving the Problems] The invention of claim 1 is characterized by comprising: input means for inputting the user's gestures, voice pitch, and the like obtained from images, voice, and the like produced by the user; emotion/attitude generating means for generating internal state parameters of the emotion and attitude of an anthropomorphic artificial agent, with the input from the input means as at least one element; and output means for expressing and outputting the emotion and attitude of the artificial agent as images and voice according to the internal state parameters generated by the emotion/attitude generating means. The input means is assumed to take not only linguistic information but also non-verbal information such as gestures, facial movements, and voice pitch. To handle such non-verbal information, it is effective to treat it directly as temporal and spatial patterns. Because rhythm, timing, and the like are important during dialogue, it is considered effective to express emotion and attitude with a system of internal state parameters that takes temporal change into account. That is, emotions and attitudes can be generated by describing them with an evolution equation that handles temporal change, as in equation (1) below.

[0006]

(Equation 1)

[0007] Here, the time variation of the N internal state parameters q_j (j = 1, ..., N), which describe the emotions and attitudes representing the agent's state of readiness for action, is determined by their own current values and by the input control parameters x_i (i = 1, ..., M). These control parameters include the user's gestures, voice pitch, and the like, the results of speech recognition, the time of day, and so on. In particular, if the control parameters and the internal state parameters are separated as in equation (2) below, generating the agent's emotions and attitudes becomes easier.

[0008]

(Equation 2)
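The equation images for (1) and (2) did not survive in this text. Going only by the surrounding description, one form consistent with it might look like the following. The function symbols F_j, f_j, and g_j are placeholders assumed here, and the additive split in (2) is only one way of reading "separated"; neither is taken from the original.

```latex
% (1): each internal state parameter evolves according to the current
% internal state and the control inputs (F_j is an assumed symbol)
\frac{dq_j}{dt} = F_j(q_1, \dots, q_N;\; x_1, \dots, x_M), \qquad j = 1, \dots, N

% (2): the same evolution with the internal-state term and the control
% term separated (f_j and g_j are assumed symbols)
\frac{dq_j}{dt} = f_j(q_1, \dots, q_N) + g_j(x_1, \dots, x_M), \qquad j = 1, \dots, N
```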

[0009] The invention of claim 2 is characterized by comprising: input means for inputting the user's gestures, voice pitch, and the like obtained from images, voice, and the like produced by the user; emotion/attitude generating means for generating internal state parameters of the emotion and attitude of an anthropomorphic artificial agent, with the input from the input means as at least one element; emotion/attitude parameter determining means for determining emotion/attitude parameters of the artificial agent according to the internal state parameters of the emotion/attitude generating means; and output means for expressing and outputting the agent's emotion and attitude as images and voice according to the output of the emotion/attitude parameter determining means. Rather than using the internal state parameters directly as the emotions and attitudes that govern behavior, as expressed by equation (3) below,

[0010]

(Equation 3)

[0011] means are introduced for mapping the internal state parameters q_j to emotion/attitude parameters Y_k (k = 1, ..., L). This allows the generation of the internal state parameters to be set more freely and makes the agent's output behavior more varied.
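The mapping from internal state parameters q_j to emotion/attitude parameters Y_k can be illustrated with a minimal Python sketch. The two mapping functions below (`arousal`, `leading_component`) and the helper name are hypothetical examples chosen for illustration; the original does not specify what the functions O_k look like.

```python
from typing import Callable, List, Sequence

# Hypothetical mapping functions O_k: each turns the internal state q into
# one emotion/attitude parameter Y_k. Both are illustrative assumptions.
def arousal(q: Sequence[float]) -> float:
    return sum(v * v for v in q)

def leading_component(q: Sequence[float]) -> float:
    return q[0]

def emotion_attitude_parameters(
    q: Sequence[float],
    mappings: Sequence[Callable[[Sequence[float]], float]],
) -> List[float]:
    """Return [Y_1, ..., Y_L], where Y_k = O_k(q_1, ..., q_N)."""
    return [o_k(q) for o_k in mappings]

# N = 2 internal state parameters mapped to L = 2 output parameters.
y = emotion_attitude_parameters([0.5, -1.0], [arousal, leading_component])
```

Because the mappings are passed in as a list, the dynamics that produce q and the interpretation of q as behavior can be changed independently, which is the freedom the paragraph above describes.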

[0012] The invention of claim 3 is the interface device according to claim 1 or 2, characterized in that the emotion/attitude generating means generates the internal state parameters based on a nonlinear equation described as a time evolution system. By adopting a system of nonlinear equations as the evolution equation, complex attitudes and emotions can be generated, for example chaos, or a limit cycle in which the system appears to be in a steady state but is in fact still moving internally, so that a variety of states can be taken. By outputting the agent's behavior based on emotions and attitudes generated in this way, a smooth dialogue closer to that between people can be carried out.

[0013] The invention of claim 4 is the interface device according to claim 2 or 3, characterized in that the emotion/attitude parameter determining means makes the value of the emotion/attitude parameter differ depending on the history of the values of the internal state parameters. A multivalued function is used as O_k, and the value of Y_k for a given q_j is varied according to the values that q_j and Y_k have taken so far. The current emotion and attitude thus change depending on what attitudes and emotions have been taken up to that point, enabling more human-like behavior.

[0014] The invention of claim 5 is the interface device according to claim 3 or 4, characterized in that the nonlinear equation takes the form of a nonlinear oscillator equation. By taking this form, the agent itself can, for example, have a rhythm as a limit cycle while also synchronizing with external input, so that a dialogue closer to that between humans can be carried out.

[0015]

[Embodiments of the Invention] FIG. 1 shows the basic configuration of the present invention. As shown in FIG. 1, the interface device includes input means 11, which inputs and processes the voice and images produced by a user who wishes to interact through the device. Based on the output of input means 11, emotion/attitude generating means 12 operates output means 13, which expresses the emotion and attitude of the artificial agent as images and voice. An embodiment of the present invention that follows the basic configuration of FIG. 1 is described below with reference to the drawings. FIG. 2 shows an embodiment in which the interface device according to the present invention is applied to a multimodal human interface system. The user's voice captured by the microphone is converted into language by voice processing means 21 and sent to language processing means 22; at the same time, its pitch and loudness are sent to emotion/attitude generating means 23. The user's image captured by the camera is processed by image processing means 21: the face orientation, head shakes, and nods in the image are sent to language processing means 22, and at the same time the intensity of movement is sent to emotion/attitude generating means 23. In emotion/attitude generating means 23, for example, when an internal parameter r obeys the van der Pol equation, it can be written, using q_1 = r and its time derivative q_2 = dr/dt, as

[0016]

(Equation 4)

[0017] Here γ, α, and ω are constants determined by the agent's personality and internal parameters. In particular, ω is the agent's own inherent rhythm. E(t) is an externally applied stimulus, sent from the voice processing means and the image processing means.
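Since the image of equation (4) is not available in this text, the following Python sketch assumes one common forced van der Pol form, dq_1/dt = q_2 and dq_2/dt = (γ − α q_1²) q_2 − ω² q_1 + E(t). That form, the function names, the initial conditions, and the step size are all assumptions for illustration and may differ from the original equation.

```python
def vdp_step(q1, q2, t, dt, gamma, alpha, omega, stimulus):
    """Advance (q1, q2) by one RK4 step of the assumed forced van der Pol form."""
    def deriv(a, b, s):
        # Assumed form: dq1/dt = q2,
        # dq2/dt = (gamma - alpha*q1^2)*q2 - omega^2*q1 + E(t)
        return b, (gamma - alpha * a * a) * b - omega * omega * a + stimulus(s)

    k1 = deriv(q1, q2, t)
    k2 = deriv(q1 + 0.5 * dt * k1[0], q2 + 0.5 * dt * k1[1], t + 0.5 * dt)
    k3 = deriv(q1 + 0.5 * dt * k2[0], q2 + 0.5 * dt * k2[1], t + 0.5 * dt)
    k4 = deriv(q1 + dt * k3[0], q2 + dt * k3[1], t + dt)
    q1_next = q1 + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
    q2_next = q2 + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return q1_next, q2_next

def simulate(gamma, alpha, omega, stimulus, q1=0.1, q2=0.0, dt=0.001, steps=20000):
    """Return the q1 trace of the oscillator driven by stimulus(t)."""
    t = 0.0
    trace = []
    for _ in range(steps):
        q1, q2 = vdp_step(q1, q2, t, dt, gamma, alpha, omega, stimulus)
        t += dt
        trace.append(q1)
    return trace

# Unforced run (E(t) = 0) with one set of constants quoted later in the text.
trace = simulate(gamma=0.13, alpha=0.27, omega=2.55, stimulus=lambda t: 0.0)
```

In this assumed form, a positive γ makes small oscillations grow and the α q_1² term limits them, so the oscillator settles toward a self-sustained rhythm near ω even without external input; a nonzero `stimulus` stands in for E(t) from the voice and image processing means.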

[0018] In this case, the system expressed by equation (4) can be designed, through the agent's personality parameters γ and α, so that its period entrains to an external period. This should allow the user to carry on the dialogue at a comfortable rhythm. Moreover, if the user is likewise assumed to be influenced by the simulated human, the two can be expected to entrain to each other, which would help actively drive the conversation forward. An example simulating this is shown below.

[0019]

(Equation 5)

[0020] Equations (5A) and (5B) above describe two systems, A and B, respectively. Systems A and B are each expressed by equation (4) with their own α, γ, and ω. As the interaction, each system is assumed to receive a stimulus proportional to the other's q_1.

[0021] FIG. 3 shows the result for α_A = 0.27, γ_A = 0.13, ω_A = 2.55, α_B = 0.23, γ_B = 0.46, ω_B = 3.07 with no interaction between systems A and B (λ = 0); the rhythm patterns of systems A and B are not synchronized. FIG. 4 shows the result when there is an interaction between the two oscillating systems A and B (λ = 1.75); the rhythm patterns of A and B are synchronized. In this case the systems synchronize at a frequency of 2.73, between their individual rhythms, so that each is adapting to the other's pace. FIG. 5 shows the result for the same frequencies as above, ω_A = 2.55 and ω_B = 3.07, but with α_A = 0.02, γ_A = 0.46, α_B = 0.06, γ_B = 0.36 and no interaction. FIG. 6 shows the result under the same conditions as FIG. 5 but with interaction between systems A and B (λ = 1.75). Here, too, the systems synchronize, but at a frequency of 3.49, higher than either system's frequency without interaction. In dialogue between humans, it is common for the conversation to become lively and the tempo to quicken; this result shows that the model is effective in producing such an effect.
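The two-system experiment can be sketched in Python as below. Each system uses the same assumed van der Pol form as a stand-in for equation (4), driven by λ times the other system's q_1. The damping form, the initial state, the step size, and all function names are assumptions. Because equations (5A) and (5B) are not recoverable from this text, the sketch is not expected to reproduce the reported synchronization frequencies of 2.73 and 3.49.

```python
import math

def coupled_rhs(state, params_a, params_b, lam):
    """Right-hand side for two oscillators of the assumed van der Pol form.

    Each system is driven by lam times the other's q1; the damping form
    (gamma - alpha*q1^2)*q2 is an assumption, not taken from the original.
    """
    a1, a2, b1, b2 = state
    al_a, ga_a, om_a = params_a
    al_b, ga_b, om_b = params_b
    da2 = (ga_a - al_a * a1 * a1) * a2 - om_a * om_a * a1 + lam * b1
    db2 = (ga_b - al_b * b1 * b1) * b2 - om_b * om_b * b1 + lam * a1
    return (a2, da2, b2, db2)

def rk4_step(state, dt, rhs):
    """One classical fourth-order Runge-Kutta step for a tuple-valued state."""
    def shifted(k, h):
        return tuple(s + h * d for s, d in zip(state, k))
    k1 = rhs(state)
    k2 = rhs(shifted(k1, dt / 2))
    k3 = rhs(shifted(k2, dt / 2))
    k4 = rhs(shifted(k3, dt))
    return tuple(
        s + dt * (a + 2 * b + 2 * c + d) / 6
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def angular_frequencies(params_a, params_b, lam, t_end=200.0, dt=0.01):
    """Estimate each system's angular frequency from sign changes of q1."""
    state = (0.1, 0.0, 0.0, 0.1)  # arbitrary small initial state (assumed)
    rhs = lambda s: coupled_rhs(s, params_a, params_b, lam)
    crossings = [0, 0]
    for _ in range(int(round(t_end / dt))):
        new = rk4_step(state, dt, rhs)
        for i, idx in enumerate((0, 2)):
            if state[idx] * new[idx] < 0:
                crossings[i] += 1
        state = new
    # Two sign changes per cycle, so omega ~ pi * crossings / elapsed time.
    return tuple(math.pi * c / t_end for c in crossings)

# (alpha, gamma, omega) values quoted for FIG. 3 and FIG. 4
A = (0.27, 0.13, 2.55)
B = (0.23, 0.46, 3.07)
free_a, free_b = angular_frequencies(A, B, lam=0.0)
coupled_a, coupled_b = angular_frequencies(A, B, lam=1.75)
```

Comparing `free_a` and `free_b` with `coupled_a` and `coupled_b` is the same kind of check as comparing FIG. 3 with FIG. 4: without interaction each system should run near its own ω, and with interaction the two estimates can be inspected for convergence.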

[0022] The internal state parameters q_1 and q_2 generated by emotion/attitude generating means 23 to produce the results described above are sent to emotion/attitude parameter determining means 24. There, a rhythm equal to (the number of times q_1 becomes 0 in one second)/2 is sent to action output means 25 as one of the agent's emotion/attitude parameters. The agent's movements and utterances are controlled in time with this rhythm. At the same time, emotion/attitude parameter determining means 24 maps the emotional arousal E = q_1^2 + q_2^2 to the intensity of action S by a multivalued function S = h(E). FIG. 7 shows the relationship between emotional arousal E and intensity of action S. As shown there, starting from state 1, in which the agent's behavior is calm, the intensity of action increases through states 2 and 3 as arousal rises; if arousal rises further from there, the state jumps to 5 and the intensity of action increases sharply. Conversely, when arousal gradually decreases from state 6, where the intensity is high, the state passes through 6, 5, and 4 and then jumps to state 2. The intensity of action S, which takes this history into account, is sent to action output means 25, so that even at the same level of arousal, the behavior can differ according to the prior history of the agent's state. Based on the results sent from the emotion/attitude determining means, the action output means outputs the agent's facial expression, body sway, and so on. In addition, based on the information sent from the language processing means together with the information sent from the emotion/attitude determining means, it produces output corresponding to the user's input while varying the magnitude and speed of voice and gestures.
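The two quantities described in this paragraph can be sketched in Python as follows. The rhythm follows the stated rule of zero crossings of q_1 per second divided by two. The history-dependent map S = h(E) is modeled as a simple two-branch hysteresis whose thresholds and slopes are made-up illustrative numbers, since FIG. 7 is not reproduced here; the class and function names are also assumptions.

```python
def rhythm_per_second(q1_samples, dt):
    """Rhythm as (number of times q1 passes through 0 per second) / 2."""
    crossings = sum(
        1 for a, b in zip(q1_samples, q1_samples[1:]) if a * b < 0
    )
    duration = dt * (len(q1_samples) - 1)
    return crossings / duration / 2

class HysteresisIntensity:
    """Two-branch map S = h(E) whose value depends on the history of E.

    Thresholds and branch shapes are illustrative assumptions only.
    """

    def __init__(self, up_threshold=2.0, down_threshold=1.0):
        self.up = up_threshold      # rising E jumps to the high branch here
        self.down = down_threshold  # falling E drops to the low branch here
        self.high = False           # start on the calm (low) branch

    def __call__(self, q1, q2):
        e = q1 * q1 + q2 * q2       # emotional arousal E = q1^2 + q2^2
        if not self.high and e > self.up:
            self.high = True
        elif self.high and e < self.down:
            self.high = False
        return 2.0 + e if self.high else 0.5 * e

h = HysteresisIntensity()
rising = [h(x, 0.0) for x in (0.5, 1.25, 1.5)]   # arousal increasing
falling = [h(x, 0.0) for x in (1.25, 0.5)]       # arousal decreasing
```

In this run the input q_1 = 1.25 appears once while arousal is rising and once while it is falling, and the returned intensity differs between the two, which is the history dependence the paragraph attributes to the multivalued function.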

[0023]

[Effects of the Invention]

Effect of claim 1: Not only language uttered by the user, or information equivalent to language, but also non-verbal information such as gestures, facial movements, and voice pitch is taken as input. It is processed to generate internal state parameters of emotion and attitude in a time evolution system, and these are reflected in the output of the artificial agent's emotion and attitude. The agent can thereby use multimodal information actively, behave in a more human-like way, and carry a dialogue forward smoothly.

Effect of claim 2: In addition to the effect of claim 1, the emotion/attitude parameters are determined according to the internal state parameters, and the relationship between the two sets of parameters can be varied, so the output can be diversified and expressiveness increased.

Effect of claim 3: In addition to the effects of claims 1 and 2, complex emotions and attitudes can be generated based on a nonlinear equation described as a time evolution system, so a smooth dialogue closer to that between humans can be carried out.

Effect of claim 4: In addition to the effects of claims 2 and 3, human-like behavior becomes possible by, for example, incorporating the human trait that the correlation between emotion and action changes with history.

Effect of claim 5: In addition to the effects of claims 3 and 4, making the nonlinear equation a nonlinear oscillator equation enables a dialogue closer to that between humans.

[Brief description of the drawings]

FIG. 1 is a diagram showing the basic configuration of the interface device of the present invention.

FIG. 2 is a diagram showing an embodiment in which the present invention is applied to a multimodal interface system.

FIG. 3 is a graph showing simulation results for an example of the present invention using the nonlinear oscillator model with no interaction.

FIG. 4 is a graph showing simulation results for an example of the present invention using the nonlinear oscillator model with interaction.

FIG. 5 is a graph showing simulation results for an example of the present invention using another nonlinear oscillator model with no interaction.

FIG. 6 is a graph showing simulation results for an example of the present invention using another nonlinear oscillator model with interaction.

FIG. 7 is an explanatory diagram showing the correlation between emotion and action in the emotion/attitude determining means.

[Explanation of symbols]

11: input means; 12: emotion/attitude generating means; 13: output means; 21: voice processing means, image processing means; 22: language processing means; 23: emotion/attitude generating means; 24: emotion/attitude determining means; 25: action output means.

Continued from the front page: (72) Inventor: Haruo Hide, c/o Sharp Corporation, 22-22 Nagaike-cho, Abeno-ku, Osaka-shi, Osaka. (72) Inventor: Kenji Sakamoto, c/o Sharp Corporation, 22-22 Nagaike-cho, Abeno-ku, Osaka-shi, Osaka.

Claims

[Claims]

1. An interface device comprising: input means for inputting a user's gestures, voice pitch, and the like obtained from images, voice, and the like produced by the user; emotion/attitude generating means for generating internal state parameters of the emotion and attitude of an anthropomorphic artificial agent, with the input from the input means as at least one element; and output means for expressing and outputting the emotion and attitude of the artificial agent as images and voice according to the internal state parameters generated by the emotion/attitude generating means.

2. An interface device comprising: input means for inputting a user's gestures, voice pitch, and the like obtained from images, voice, and the like produced by the user; emotion/attitude generating means for generating internal state parameters of the emotion and attitude of an anthropomorphic artificial agent, with the input from the input means as at least one element; emotion/attitude parameter determining means for determining emotion/attitude parameters of the artificial agent according to the internal state parameters of the emotion/attitude generating means; and output means for expressing and outputting the emotion and attitude of the agent as images and voice according to the output of the emotion/attitude parameter determining means.

3. The interface device according to claim 1 or 2, wherein the emotion/attitude generating means generates the internal state parameters based on a nonlinear equation described as a time evolution system.

4. The interface device according to claim 2 or 3, wherein the emotion/attitude parameter determining means makes the value of the emotion/attitude parameter differ depending on the history of the values of the internal state parameters.

5. The interface device according to claim 3 or 4, wherein the nonlinear equation takes the form of a nonlinear oscillator equation.