JP3751898B2 - Dialogue simulation apparatus and dialogue support method

Info

Publication number: JP3751898B2
Application number: JP2002097654A
Authority: JP
Inventors: 康子中山; 雄二郎成瀬
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 2002-03-29
Filing date: 2002-03-29
Publication date: 2006-03-01
Anticipated expiration: 2022-03-29
Also published as: JP2003295751A

Description

【０００１】
【発明の属する技術分野】
本発明は、対話シミュレーション装置に関する。
【０００２】
【従来の技術】
例えば、会議の定例報告会議、外国語での会議の準備として、参加が予定されている複数の参加者からの発言、質問などを予想し、それらに対処するための発言を用意しておくということがよく行われる。会議などにおいては、その目的とするところがはっきりしており、議論を展開するいるので、予め参加者からの発言内容を予想したり、議論の展開をシナリオ化することは容易である。
【０００３】
しかし、パーティーや、懇談会、会食、その他、様々な場面において、様々な人物と雑談する場合、特に、初対面の人や外国人などと対話しようとする際には、「こんなことを話したらバカにされてしまうのではないか」とか、「こんなこと興味がないのではないか」と話題探しに苦労することが多いのが現実である。仕事や、自分の専門分野については、豊富な話題があるにもかかわらず、雑談となると、話題が乏しいために、対話がすぐにとぎれてしまい、気まずい思いをしたという経験を持つ人は多いはずである。
【０００４】
このような場合、他人と対話する場面を想定して、その場に則した適当な話題を選びながら予め対話のリハーサルを行うことができたら心強いものである。
【０００５】
【発明が解決しようとする課題】
従来は、航空機のパイロット養成用のもの航空機の操縦訓練を目的としたシミュレータや、月面着陸に向けた宇宙飛行士の訓練を目的とするシミュレータなど大規模なものは数多くある。例えば、特開２０００−１２２５２０は、仮想現実シミュレータ及びそのシミュレーション方法に関し、特に、化学プラント、火力発電所のような大規模で複雑な立体構築物の内側で起こる事故のような緊急事態に対応するために繰り返して仮想的に模擬訓練を日常的に行うことができる。
【０００６】
しかしながら、他人と対話する場面を想定し、対話の訓練を目的とした個人レベルで利用する対話シミュレータなるものは存在しなかった。
【０００７】
そこで、本発明は、他人と対話する場面を想定し、対話の訓練を目的とした個人レベルで利用する対話シミュレーション装置および対話支援方法を提供することを目的とする。
【０００８】
【課題を解決するための手段】
本発明のシミュレーション装置は、所望の場面と所望の対話相手を設定するための条件を入力するための入力手段と、この入力手段で入力された条件を基に前記場面と前記対話相手を設定する設定手段と、前記条件に基づき分野別に最新情報を取得する取得手段と、分野別に収集された最新情報のうち、ユーザにより選択された分野の最新情報を表示する表示手段と、前記条件と話題として選択された分野とに適した対話が行えるよう、ユーザにアドバイスを提示する提示手段と、ユーザからの発言内容に呼応した前記対話相手の台詞を生成する生成手段とを具備し、前記生成手段は、ユーザからの発言内容から話題語が抽出されたとき、この話題語に呼応した前記対話相手の台詞を生成することを特徴とする。
【０００９】
本発明によれば、個人レベルで遭遇する様々な場面での様々な人との対話の訓練が容易に行える。
【００１０】
好ましくは、前記提示手段は、前記アドバイスとして、ユーザが前記表示手段で表示された最新情報を話題にした対話が行えるよう、前記表示手段で表示された最新情報に含まれる所望の話題語を挿入することで台詞が完成する台詞の雛形を提示する。
【００１１】
また、前記提示手段は、前記アドバイスとして、ユーザが発することが望ましい台詞を提示する。
【００１２】
本発明の対話支援方法は、入力された所望の場面と所望の対話相手を設定するための条件を基に前記場面と前記対話相手を設定するとともに、前記条件に基づき分野別に最新情報を取得し、分野別に収集された最新情報のうち、ユーザにより選択された分野の最新情報を表示して、前記条件と話題として選択された分野とに適した対話が行えるよう、ユーザにアドバイスを提示し、ユーザからの発言があったときには、その発言内容に呼応した前記対話相手の台詞を生成し、その際、ユーザからの発言内容から話題語が抽出されたときには、当該話題語に呼応した前記対話相手の台詞を生成することを特徴とする。
【００１３】
本発明によれば、個人レベルで遭遇する様々な場面での様々な人との対話の訓練が容易に行える。
【００１４】
好ましくは、表示された最新情報を話題にした対話が行えるよう、前記表示された最新情報に含まれる所望の話題語を挿入することでユーザの台詞が完成する台詞の雛形を前記アドバイスとして提示する。
【００１５】
また、前記アドバイスとして、ユーザが発することが望ましい台詞を提示する。
【００１６】
【発明の実施の形態】
以下、本発明の実施形態について図面を参照して説明する。
【００１７】
図１は、本実施形態に係る対話シミュレーション装置の概略構成例を示した、ものである。
【００１８】
入力部２には、所望の対話相手と対話する場面や所望の対話相手を設定するために必要な条件を入力したり、その他、ユーザがシステムに対し各種指示入力を行うためのものである。
【００１９】
場面を設定するための条件としては、例えば、所望の場面がパーティ会場である場合、場面の種別としてパーティ、当該パーティの開催される日時、開催場所（国名や地名など）、そのパーティの開催目的など、例えば、当該パーティの招待状などから知ることのできるデータや、フォーマルかインフォーマルか、さらには、出席者の年齢層や専門性など、ユーザがわかる範囲で入力することのできるデータでよい。好ましくは、できるだけ詳細な条件が入力されることが望ましい。また、対話相手を設定するための条件としては、例えば、性別、年齢、専門性、趣味、職業、ユーザとの関係などが挙げられる。やはり、好ましくは、できるだけ詳細な条件が入力されることが望ましい。
【００２０】
場面設定部３は、入力部２に入力された場面の条件に基づき、部品記憶部４に予め記憶されているＣＧ（computer graphics）部品を組み合わせて、例えば、パーティ会場やレストランでの食事風景などを３次元的に作成するようになっている。
【００２１】
対話相手設定部５は、入力部２に入力された場面の条件に基づき、部品記憶部４に予め記憶されているＣＧ（computer graphics）部品を組み合わせて、ユーザ所望の対話相手を作成する。
【００２２】
情報取得部６は、入力部２から入力された場面、対話相手を設定するための各種条件を基に、これら条件に適合する最新情報を、例えば、インターネットを用いて収集するようになっている。
【００２３】
情報記憶部７には、情報取得部６にて取得された最新情報を記憶する。そのほか、国毎、場面毎などで異なる常識的な情報も予め記憶されていてもよい。
【００２４】
情報選択部８は、入力部２から入力された場面や対話相手の条件、その他各種ユーザからの指示に基づき、情報記憶部７から適当な情報を選択するようになっている。
【００２５】
シナリオ記憶部９には、様々な場面において遭遇することが想定されるあらゆる分野のあらゆる種類（フォーマル向け、インフォーマル向け、女性向け、男性向けなど）の対話のためのシナリオが複数記憶されている。シナリオとして最も簡単なものは、例えば、ユーザと対話相手との間のダイアログ（対話）そのものである。
【００２６】
例えば、アメリカで開催されるパーティ会場という場面で、日本人のユーザがアメリカ人の初対面の男性を対話する場合には、まず、初対面の挨拶から始まり、自己紹介、話の糸口をつくって、話を展開していくという一連の対話戦略ともいうべきものがあるとすると、アメリカ人の初対面の男性との挨拶のためのダイアログ（これはほとんど決まり文句で構成されることが多い）や、それに続く所望の話題について対話するときのダイアログなどが挙げられる。
【００２７】
例えば、アメリカで開催されるパーティ会場という場面で、日本人のユーザがアメリカ人の男性と政治分野に属する、例えば「選挙」について対話するためのシナリオとして、図２に示すようなものがあってもよい。
【００２８】
なお、図２に示したダイアログにおいて、「ｘ１」や「ｘ２」の部分には、ユーザがあるいはシステム側が所望の語を挿入する。なお、ここでは、「ｘ１」や「ｘ２」の部分に挿入される表示された情報中に含まれる語を話題語と呼ぶ。
【００２９】
本実施形態では、この「ｘ１」や「ｘ２」の部分に、ユーザが例えば最新情報中の所望の語（話題語）を挿入して、ユーザが当該台詞を発することで、最新の話題を用いた対話の訓練が行えるようになっている。また、これと同時に、その場面にあった話題やマナー、常識なども収得することができるのである。
【００３０】
なお、図２では、日本語のダイアログを示しているが、これが英語のダイアログであってもよい。この場合、ユーザの母国語が英語でない場合には、外国語の練習にもなる。
【００３１】
シナリオ記憶部９に記憶されているシナリオは、図２に示したようなシナリオをベースとして、ユーザの発言内容に応じた（呼応した）種種のバリエーションがあってもよい。
【００３２】
シナリオ選択部１０は、入力部２から入力された場面や対話相手の条件、その他各種ユーザからの指示、ユーザの発言内容等に基づき、シナリオ記憶部９から最適なシナリオを選択する。
【００３３】
対話制御部１１は、シナリオ選択部１０で選択されたシナリオに従って、システム側（対話相手）とユーザとの間で対話が進められるように、ユーザに提示するアドバイスを生成したり、システム側で発すべき台詞を選択したり等するためのものである。
【００３４】
音声入力部１２は、ユーザの発言（発した台詞）を入力するためのものである。音声入力部１２に入力したユーザの発言は、音声認識部１３で認識され、さらに解析部１４で、当該発言内容から、例えば、図２に示したシナリオのユーザ側の台詞の雛形（テンプレート）に含まれている、例えば、「ｘ１」に挿入された話題語を抽出する。話題語が抽出されると、その話題語に呼応した対話相手の台詞を生成すべく、対話制御部１１は、次の対話相手の台詞の生成を台詞生成部１５に依頼する。その際、必要に応じて、シナリオ選択部１０でシナリオを選択し直したり、情報選択部８で必要な情報を選択したり、また、情報取得部６で当該抽出された話題語に対応する最新情報を取得するようにしてもよい。
【００３５】
台詞生成部１５では、対話相手の台詞を生成するためのもので、例えば、ユーザの発言内容に（ユーザの発言内容に話題語が含まれている場合には、当該話題語に）呼応するよう、例えば、図２に示したシナリオのシステム側の台詞の雛形に含まれている「ｘ１」に、ユーザの発言内容から抽出された話題語を挿入して台詞を生成する。
【００３６】
対話制御部１１で選択された台詞、台詞生成部１５で生成された台詞は、音声合成出力部１６で、対話相手設定部５で設定された対話相手対応に音声合成されて、当該対話相手の発言として、スピーカなどから出力するようになっている。
【００３７】
表示制御部１７は、場面設定部３で設定された場面や、対話相手設定部５で設定された対話相手をディスプレイ２０に表示等するための制御を行うようになっている。
【００３８】
対話相手制御部１８は、解析部１４で、ユーザから入力された発言内容を解析した結果を基に対話相手の表情や動き、話し方を制御する。すなわち、解析部１４では、ユーザから入力された発言内容に、マナーに反する、タブーとなるような発言が含まれているか否か、対話相手が喜ぶ内容か、気を悪くする内容かなどを判断するようにしてもよい。この判断結果を基に、対話相手制御部１８では、例えば、ユーザがマナー違反のことを聞いてきたときには、表情でその旨を伝えるようにすることができる。
【００３９】
制御部１は、上記各部２〜１８を制御するためのものである。
【００４０】
次に、図４〜図６に示すフローチャートを参照して、図１の対話シミュレーション装置の処理動作について説明する。
【００４１】
まず、図４を参照して、対話を開始するまでの準備段階の処理動作について説明する。ユーザは、入力部２に所望の場面の条件（例えば、種別＝パーティ、目的＝学会終了後の懇親を深めるためのもの、開催日時＝○月×日午後１２時、インフォーマル、等）と、ユーザが所望の対話相手の条件（例えば、性別＝男性、国籍＝アメリカ、年齢＝４０才、専門＝情報工学、趣味＝ゴルフ、職業＝Ｙ大学教授、ユーザとの関係＝初対面など）を入力する（ステップＳ１、ステップＳ２）。
【００４２】
場面設定部３は入力された場面の条件を基に場面を作成し、対話相手設定部５は入力された対話相手の条件を基に対話相手を作成する（ステップＳ３，ステップＳ４）。
【００４３】
情報取得部６は、入力された上記条件をキーワードとして、例えばインターネットから最新の情報を収集する（ステップＳ５）。この場合には、例えば、アメリカにおける時事、事件ん、娯楽、食べ物、スポーツなど、好ましくは、あらゆる分野における最新情報が収集できることが望ましい。しかし、予め定められた分野に対応する最新情報を収集するようにしてもよい。
【００４４】
表示制御部１７は、場面設定部３で作成されたＣＧの場面、対話相手設定部５で作成されたＣＧの対話相手をディスプレイ２０に表示するとともに、情報収集部６で収集された最新情報などの情報の分野を選択するためのボタンを表示する（ステップＳ６，ステップＳ７）。
【００４５】
図３は、対話シミュレーション装置のディスプレイ２０に表示される画面表示例を示したものである。
【００４６】
図３に示すように、この表示画面は、ＣＧの場面や対話相手が表示される表示領域Ｗ１と、情報の分野を選択するための選択ボタンが設けられている領域Ｗ２と、最新情報などの情報が表示される表示領域Ｗ３と、ユーザへのアドバイスが表示される表示領域Ｗ４から構成されている。
【００４７】
なお、図４のステップＳ７の段階では、図３に示した領域のうち、領域Ｗ１と領域Ｗ２が表示されている。
【００４８】
この状態で、ユーザが図３の領域Ｗ２に表示されている複数の選択ボタンのうち、所望の分野に対応するボタンをマウス等のポインティングデバイスを用いて選択すると（図５のステップＳ１１）、シナリオ選択部１０は、シナリオ記憶部４に記憶されたシナリオの中から、この選択された分野と、ステップＳ１，ステップＳ２で入力された条件とに適したシナリオを選択する（ステップＳ１２）。
【００４９】
また、情報選択部８は、ユーザに選択された当該分野に属する情報（最新情報、一般常識等）を情報記憶部７から読み出し、それらは、表示制御部１７にて、図３の領域Ｗ３に表示される（ステップＳ１３，ステップＳ１４）。
【００５０】
なお、図３の領域Ｗ２に表示されている分野は大分類の分野であって、このうちのいずれかを選択した場合に、その小分類がさらに表示され、その中から所望の分野を選択するようにしてもよい。
【００５１】
例えば、上記条件として、場面の条件が、種別＝パーティ、目的＝学会終了後の懇親を深めるためのものであり、所望の対話相手の条件が、性別＝男性、国籍＝アメリカ、年齢＝４０才である場合、ユーザが選択した分野が「政治」に属する小分類「選挙」であった場合、ステップＳ１２で選択されるシナリオ、すなわち、ダイアログは、例えば、図２に示すものであってもよい。なお、図２に示したシナリオは日本語であるが、実際には、英語である。しかし、ここでは、説明の簡単のため、日本語である場合を例にとり説明することにする。
【００５２】
さて、図２に示したようなシナリオが選択された場合、ユーザから対話を開始することが予め定められているとき（例えば、ユーザがそのように設定したとしてもよい）、対話制御部１１は、図２の「Ａ」の役をユーザに、「Ｂ」の役をシステムすなわち対話相手と決定する。そこで、対話制御部１１は、このシナリオに沿った対話が行えるよう、すなわち、ユーザにより入力された条件と、ユーザにより話題として選択された分野に適した対話が行えるよう、まず、ユーザに「Ａ」の最初の台詞を発生させるべく、図３の領域Ｗ４に、アドバイスを表示する（ステップＳ１５）。例えば、アドバイスとして、ユーザが領域Ｗ３に表示された最新情報を話題にした対話が行えるよう、ユーザが発することが望ましい台詞として、これに領域Ｗ４で表示された最新情報に含まれる所望の話題語を挿入することで台詞が完成する台詞の雛形、すなわち、例えば、この場合、「Ａ」の最初の台詞に対応する、「ｘ１（何の選挙）に興味ありますか？」を領域Ｗ４に表示するようにしてもよい。
【００５３】
ユーザは、画面の領域Ｗ３に「選挙」に関し選択された情報（この場合、最新情報）が表示されているので、これと、領域Ｗ４のを見て、例えば、「“アメリカの大統領選”に興味ありますか」と発言したとする。なお、上記表記において“ ”で囲った部分は、ユーザが挿入した話題語である。
【００５４】
このユーザの発言は音声入力部１２から入力し、音声認識部１３で認識されて、さらに、解析部１４で解析される（ステップＳ２１〜ステップＳ２３）。
【００５５】
解析部１４でユーザの発言内容を解析した結果、当該発言内容から話題語が抽出されたとする（ステップＳ２５）。例えば、この場合、「アメリカの大統領選」が抽出されたとする。例えば、ユーザの発言内容とアドバイスとして表示した台詞の雛形との差分を求めることで、挿入された話題語は抽出することができる。
【００５６】
対話制御部１１は、まず、抽出された話題語に対応する情報が既に取得されている（情報記憶部７に格納されている）か否かをチェックする（ステップＳ２８）。このとき、情報記憶部７に格納されていないときには、情報取得部６に当該話題語に対応する最新情報の取得を依頼してもよい。この場合、情報取得部６は、渡された話題語をキーワードとして例えばインターネットから情報を取得するようにしてもよい（ステップＳ２９）。
【００５７】
一方、抽出された話題語に対応する情報が既に取得されている（情報記憶部７に格納されている）場合には、対話制御部１１は、システム側（対話相手）の台詞の準備を行う。すなわち、抽出された話題語に呼応した対話相手対応の台詞を生成する（ステップＳ３０）。ここでは、次に対話相手が発すべき台詞は、例えば、この場合、図２「Ｂ」の最初の台詞である。この台詞「もちろんです。ｘ１（何の選挙）は誰が当選すると思いますか。」には、「ｘ１」に、ユーザの発言内容から抽出した話題語“アメリカの大統領選”を入れる必要がある。そこで、台詞生成部１５では、当該台詞の「ｘ１」の部分に、“アメリカの大統領選”を挿入して、「もちろんです。アメリカの大統領選は誰が当選すると思いますか。」という台詞を生成する。
【００５８】
さらに、ステップＳ２３において解析部１４にて、ユーザから入力された発言内容に、例えば、マナーに反する、タブーとなるような発言が含まれていると判断したときは、その判断結果に対応するような対話相手の表情や動き、話し方決定する（ステップＳ２５）。例えば、対話相手の表情を強ばらせるようにしてもよいし、声の調子を怒ったいるように変化させてもよい。
【００５９】
なお、ステップＳ２４において、ユーザからの発言内容から話題語が抽出されなかったとき、すなわち、例えば、挨拶の決まり文句を発言したときとか、アドバイスとして話題語を挿入するような台詞の雛形を表示していなかったときなどは、ユーザの発言内容からは話題語が抽出されることが期待できないので、対話制御部１１は、単純に、次のシステム側（対話相手）の台詞をシナリオから選択すればよい。
【００６０】
ステップＳ２６では、ステップＳ３０で生成された、あるいは、選択された、システム側（対話相手）の台詞を音声合成して出力する。その後、当該出力された対話相手の発言に応じて、先に選択されたシナリオに沿った対話が行えるよう、アドバイスとして、ユーザが発言することが望ましい台詞、あるいは、台詞の雛形を画面の領域Ｗ４に表示する（ステップＳ２７）。
【００６１】
例えば、領域Ｗ４には、「わからないですね。今のところ、ｘ２（第１の候補者の名前）が優勢と見ますね。」と表示される。その後は、図５のステップＳ１５の説明と同様にして、ユーザは、例えば、領域Ｗ３に表示された情報から話題語として、「例えば、ジョン・Ｗ」を選択し、「わからないですね。今のところ、“ジョン・Ｗ”が優勢と見ますね。」という発言を行う。これ以後の処理動作は、上記同様であって、シナリオが終了するまで、図６に示した処理が繰り返される。
【００６２】
以上説明したように、上記実施形態によれば、ユーザは、所望の場面と所望の対話相手の条件を入力することにより、当該所望の対話相手と所望の話題について話を展開しながら対話が行えるよう適切なアドバイスが表示されるので、事前の対話訓練が容易に行える。特に、上記アドバイスとして、ユーザが発することが望ましい台詞を表示することで、挨拶などの決まり文句などの習得にも役にたつ。さらに、上記アドバイスとして、ユーザが発することが望ましい台詞として、既に表示されている最新情報に含まれる所望の話題語を挿入することで台詞が完成する台詞の雛形を提示することにより、最新情報を話題とした対話訓練が行えるので、実際の対話の場面に則した現実味のある対話訓練が行える。また、上記対話を外国語で行うことにより、外国語での対話訓練も可能となる。さらに、所望の場面と所望の対話相手の条件に基づき収集・選択された、所望の場面、所望の対話相手に最適な最新情報や一般常識などが提示されるので、実際の対話に役に立つ予備知識を得ることができる。
【００６３】
なお、上記実施形態では、シナリオは、図５のステップＳ１２において、システム側が自動的に選択する場合について説明したが、この場合に限らず、システム側は、ステップＳ１２において、ユーザにより選択された分野と入力された条件とから最適なシナリオの候補を複数個選択し、その中からユーザが所望のシナリオを選択するようにしてもよい。
【００６４】
また、本発明の実施の形態に記載した本発明の手法（特に、図４〜図６に示した処理動作に対応する本発明の手法）は、コンピュータに実行させることのできるプログラムとして、磁気ディスク（フロッピーディスク、ハードディスクなど）、光ディスク（ＣＤ−ＲＯＭ、ＤＶＤなど）、半導体メモリなどの記録媒体に格納して頒布することもできる。
【００６５】
なお、本発明は、上記実施形態に限定されるものではなく、実施段階ではその要旨を逸脱しない範囲で種々に変形することが可能である。さらに、上記実施形態には種々の段階の発明は含まれており、開示される複数の構成用件における適宜な組み合わせにより、種々の発明が抽出され得る。例えば、実施形態に示される全構成要件から幾つかの構成要件が削除されても、発明が解決しようとする課題の欄で述べた課題（の少なくとも１つ）が解決でき、発明の効果の欄で述べられている効果（のなくとも１つ）が得られる場合には、この構成要件が削除された構成が発明として抽出され得る。
【００６６】
【発明の効果】
以上説明したように、本発明によれば、個人レベルで遭遇する様々な場面での様々な人との対話の訓練が容易に行える。
【図面の簡単な説明】
【図１】本発明の実施形態に係る対話シミュレーション装置の概略的な構成例を示した図。
【図２】対話のシナリオの一例を示した図。
【図３】図１の対話シミュレーション装置の画面表示例を示した図。
【図４】図１の対話シミュレーション装置の処理動作を説明するためのフローチャート。
【図５】図１の対話シミュレーション装置の処理動作を説明するためのフローチャート。
【図６】図１の対話シミュレーション装置の処理動作を説明するためのフローチャート。
【符号の説明】
１…制御部
２…入力部
３…場面設定部
４…部品記憶部
５…対話相手設定部
６…情報取得部
７…情報記憶部
８…情報選択部
９…シナリオ記憶部
１０…シナリオ選択部
１１…対話制御部
１２…音声入力部
１３…音声認識部
１４…解析部
１５…台詞生成部
１６…音声合成部
１７…表示制御部
１８…対話相手制御部
２０…ディスプレイ

[0001]
[Technical field of the invention]
The present invention relates to a dialogue simulation apparatus.
[0002]
[Prior art]
For example, when preparing for a regular reporting meeting or a meeting held in a foreign language, it is common to anticipate the remarks and questions of the scheduled participants and to prepare responses to them. Because a meeting has a clear purpose around which the discussion develops, it is easy to predict what participants will say and to script how the discussion will unfold.
[0003]
However, when making small talk with various people at parties, social gatherings, dinners, and other occasions, and especially when trying to talk with someone one is meeting for the first time or with a foreigner, people often struggle to find a topic, worrying "Will I be laughed at if I say this?" or "Would they even be interested in this?" Many people have plenty to say about their work or their specialty, yet have had the awkward experience of a conversation dying out quickly because they ran short of topics for small talk.
[0004]
In such cases, it would be reassuring to be able to rehearse a conversation in advance, imagining the scene in which one will talk with others and choosing topics suited to that scene.
[0005]
[Problems to be solved by the invention]
Conventionally, there are many large-scale simulators, such as flight simulators for training aircraft pilots and simulators for training astronauts for lunar landings. For example, Japanese Patent Laid-Open No. 2000-122520 relates to a virtual reality simulator and a simulation method thereof, and in particular allows repeated, routine virtual training for responding to emergencies such as accidents occurring inside large, complex three-dimensional structures such as chemical plants and thermal power plants.
[0006]
However, no dialogue simulator has existed that an individual can use to practice conversation by simulating a scene of talking with another person.
[0007]
Accordingly, an object of the present invention is to provide a dialogue simulation apparatus and a dialogue support method for individual use that simulate a scene of conversation with another person for the purpose of dialogue training.
[0008]
[Means for Solving the Problems]
The simulation apparatus of the present invention comprises: input means for inputting conditions for setting a desired scene and a desired conversation partner; setting means for setting the scene and the conversation partner based on the conditions input by the input means; acquisition means for acquiring the latest information by field based on the conditions; display means for displaying, from the latest information collected by field, the latest information of the field selected by the user; presenting means for presenting advice to the user so that a dialogue suited to the conditions and to the field selected as the topic can be carried out; and generation means for generating a line for the conversation partner in response to the content of the user's utterance, wherein, when a topic word is extracted from the content of the user's utterance, the generation means generates a line for the conversation partner that responds to that topic word.
[0009]
According to the present invention, a user can easily practice dialogue with various people in the various scenes that an individual encounters.
[0010]
Preferably, the presenting means presents, as the advice, a line template that becomes a complete line when a desired topic word contained in the latest information displayed by the display means is inserted into it, so that the user can hold a dialogue about that latest information.
[0011]
The presenting means may also present, as the advice, a line that is desirable for the user to utter.
[0012]
The dialogue support method of the present invention sets the scene and the conversation partner based on input conditions for setting a desired scene and a desired conversation partner; acquires the latest information by field based on the conditions; displays, from the latest information collected by field, the latest information of the field selected by the user; presents advice to the user so that a dialogue suited to the conditions and to the field selected as the topic can be carried out; and, when the user makes an utterance, generates a line for the conversation partner in response to the content of that utterance, wherein, when a topic word is extracted from the content of the user's utterance, a line for the conversation partner that responds to that topic word is generated.
[0013]
According to the present invention, a user can easily practice dialogue with various people in the various scenes that an individual encounters.
[0014]
Preferably, a line template that becomes a complete user line when a desired topic word contained in the displayed latest information is inserted into it is presented as the advice, so that a dialogue about the displayed latest information can be carried out.
[0015]
In addition, a line that is desirable for the user to utter is presented as the advice.
[0016]
[Embodiments of the invention]
Embodiments of the present invention will be described below with reference to the drawings.
[0017]
FIG. 1 shows a schematic configuration example of a dialogue simulation apparatus according to the present embodiment.
[0018]
The input unit 2 is used to input the conditions needed to set the scene in which the user talks with a desired conversation partner and to set that conversation partner, and also to let the user give various instructions to the system.
[0019]
Conditions for setting the scene include, for example when the desired scene is a party venue, the scene type (party), the date and time of the party, the location (country, place name, etc.), and the purpose of the party, that is, data that can be learned from the party invitation, as well as whether the occasion is formal or informal and the age range and specialties of the attendees. Any data the user is able to supply is acceptable, but the conditions should preferably be as detailed as possible. Conditions for setting the conversation partner include, for example, gender, age, specialty, hobbies, occupation, and relationship to the user. Here too, the conditions should preferably be as detailed as possible.
[0020]
Based on the scene conditions input to the input unit 2, the scene setting unit 3 combines CG (computer graphics) components stored in advance in the component storage unit 4 to create a three-dimensional rendering of, for example, a party venue or a meal at a restaurant.
[0021]
Based on the conversation partner conditions input to the input unit 2, the conversation partner setting unit 5 combines CG (computer graphics) components stored in advance in the component storage unit 4 to create the conversation partner the user desires.
[0022]
Based on the various conditions for setting the scene and the conversation partner input from the input unit 2, the information acquisition unit 6 collects the latest information matching these conditions, for example via the Internet.
[0023]
The information storage unit 7 stores the latest information acquired by the information acquisition unit 6. In addition, common-sense information that varies from country to country and from scene to scene may be stored in advance.
[0024]
The information selection unit 8 selects appropriate information from the information storage unit 7 based on the scene and conversation partner conditions input from the input unit 2 and on various other instructions from the user.
[0025]
The scenario storage unit 9 stores multiple dialogue scenarios of every kind (for formal occasions, for informal occasions, for women, for men, etc.) in every field that the user might encounter in various scenes. The simplest form of scenario is, for example, the dialogue itself between the user and the conversation partner.
[0026]
For example, when a Japanese user talks with an American man he or she is meeting for the first time at a party held in the United States, there is a kind of conversational strategy: begin with a first-meeting greeting, introduce oneself, find an opening, and develop the conversation. Corresponding scenarios include a dialogue for greeting an American man on first meeting (which usually consists almost entirely of set phrases) and a subsequent dialogue for discussing a desired topic.
[0027]
For example, for a scene at a party held in the United States, a scenario such as the one shown in FIG. 2 may be provided for a Japanese user to talk with an American man about a topic in the field of politics, for example "elections".
[0028]
In the dialogue shown in FIG. 2, the user or the system inserts a desired word into the "x1" and "x2" slots. Here, a word contained in the displayed information that is inserted into the "x1" or "x2" slot is called a topic word.
[0029]
In this embodiment, the user inserts a desired word (topic word) from, for example, the latest information into the "x1" or "x2" slot and speaks the resulting line, and can thereby practice dialogue using current topics. At the same time, the user can pick up the topics, manners, and common knowledge appropriate to the scene.
[0030]
Although FIG. 2 shows a Japanese dialogue, it may instead be an English dialogue. In that case, if the user's native language is not English, the dialogue also serves as foreign-language practice.
[0031]
The scenarios stored in the scenario storage unit 9 may include, with a scenario such as that of FIG. 2 as a base, various variations that respond to the content of the user's utterances.
[0032]
The scenario selection unit 10 selects the most suitable scenario from the scenario storage unit 9 based on the scene and conversation partner conditions input from the input unit 2, various other instructions from the user, the content of the user's utterances, and the like.
[0033]
The dialogue control unit 11 generates the advice to present to the user, selects the lines that the system should speak, and so on, so that the dialogue between the system (conversation partner) and the user proceeds according to the scenario selected by the scenario selection unit 10.
[0034]
The voice input unit 12 receives the user's utterances (spoken lines). An utterance input to the voice input unit 12 is recognized by the voice recognition unit 13, and the analysis unit 14 then extracts from the utterance the topic word inserted at, for example, "x1" in the user-side line template of the scenario shown in FIG. 2. When a topic word is extracted, the dialogue control unit 11 requests the line generation unit 15 to generate the conversation partner's next line so that it responds to that topic word. At that time, as necessary, the scenario selection unit 10 may reselect the scenario, the information selection unit 8 may select the necessary information, and the information acquisition unit 6 may acquire the latest information corresponding to the extracted topic word.
[0035]
The line generation unit 15 generates the conversation partner's lines. For example, so that the line responds to the content of the user's utterance (or, if the utterance contains a topic word, to that topic word), it generates a line by inserting the topic word extracted from the user's utterance into "x1" in the system-side line template of the scenario shown in FIG. 2.
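The slot insertion performed by unit 15 can be illustrated with a minimal Python sketch. The function name `fill_template`, the regular expression, and the treatment of the parenthesized hint (for example "(which election)") as unspoken guidance are assumptions made for illustration; the specification does not prescribe an implementation.

```python
import re
from typing import Dict

# A slot marker such as x1, optionally followed by a parenthesized hint
# like "(which election)". Treating the hint as guidance that is dropped
# from the spoken line is an assumption of this sketch.
SLOT_WITH_HINT = re.compile(r"\b(x\d+)\b(?:\s*\([^)]*\))?")


def fill_template(template: str, topic_words: Dict[str, str]) -> str:
    """Replace each slot with its topic word; unknown slots stay as written."""
    def substitute(match):
        return topic_words.get(match.group(1), match.group(0))
    return SLOT_WITH_HINT.sub(substitute, template)


line = fill_template(
    "Of course. Who do you think will win x1 (which election)?",
    {"x1": "the American presidential election"},
)
print(line)
```

Leaving unknown slots untouched keeps a partially filled line visible rather than silently producing an empty phrase.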
[0036]
A line selected by the dialogue control unit 11 or generated by the line generation unit 15 is synthesized by the speech synthesis output unit 16 into a voice matching the conversation partner set by the conversation partner setting unit 5, and is output from a speaker or the like as the conversation partner's utterance.
[0037]
The display control unit 17 performs control for displaying on the display 20 the scene set by the scene setting unit 3 and the conversation partner set by the conversation partner setting unit 5.
[0038]
The conversation partner control unit 18 controls the facial expression, movement, and manner of speaking of the conversation partner based on the analysis unit 14's analysis of the user's utterance. That is, the analysis unit 14 may judge whether the utterance contains remarks that breach etiquette or touch on a taboo, and whether its content would please or offend the conversation partner. Based on this judgment, the conversation partner control unit 18 can, for example, have the partner's facial expression signal the breach when the user asks something ill-mannered.
[0039]
The control unit 1 controls the units 2 to 18 described above.
[0040]
Next, the processing operations of the dialogue simulation apparatus of FIG. 1 will be described with reference to the flowcharts shown in FIGS. 4 to 6.
[0041]
First, the processing in the preparation stage before the dialogue starts will be described with reference to FIG. 4. The user inputs to the input unit 2 the conditions of the desired scene (for example, type = party, purpose = socializing after an academic conference, date and time = 12:00 p.m. on day × of month ○, informal, etc.) and the conditions of the desired conversation partner (for example, gender = male, nationality = American, age = 40, specialty = information engineering, hobby = golf, occupation = professor at Y University, relationship to the user = first meeting, etc.) (steps S1 and S2).
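As a minimal sketch of how the conditions entered in steps S1 and S2 might be represented, the Python dataclasses below hold the example values from this paragraph. The class names, field names, and the `to_keywords` helper are illustrative assumptions and do not appear in the specification.

```python
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass
class SceneConditions:
    """Conditions for setting the scene (step S1); field names are hypothetical."""
    kind: str
    purpose: Optional[str] = None
    country: Optional[str] = None
    formal: Optional[bool] = None


@dataclass
class PartnerConditions:
    """Conditions for setting the conversation partner (step S2)."""
    gender: Optional[str] = None
    nationality: Optional[str] = None
    age: Optional[int] = None
    specialty: Optional[str] = None
    hobby: Optional[str] = None
    occupation: Optional[str] = None
    relationship: Optional[str] = None


def to_keywords(scene: SceneConditions, partner: PartnerConditions) -> List[str]:
    """Flatten the string-valued conditions that were supplied into keywords."""
    values = list(asdict(scene).values()) + list(asdict(partner).values())
    return [v for v in values if isinstance(v, str)]


scene = SceneConditions(kind="party", purpose="post-conference social", formal=False)
partner = PartnerConditions(gender="male", nationality="American", age=40,
                            specialty="information engineering", hobby="golf",
                            occupation="professor at Y University",
                            relationship="first meeting")
keywords = to_keywords(scene, partner)
```

Every field except the scene type is optional, which mirrors the statement that the user supplies whatever data is available.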
[0042]
The scene setting unit 3 creates the scene based on the input scene conditions, and the conversation partner setting unit 5 creates the conversation partner based on the input conversation partner conditions (steps S3 and S4).
[0043]
The information acquisition unit 6 collects the latest information, for example from the Internet, using the input conditions as keywords (step S5). In this example, it is desirable to collect the latest information in as many fields as possible, such as current affairs, incidents, entertainment, food, and sports in the United States. Alternatively, only the latest information for predetermined fields may be collected.
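The per-field collection of step S5 could be organized as in the following sketch. The `search` callable is a hypothetical stand-in for whatever retrieval mechanism is used (the specification only says "for example from the Internet"), and the way keywords and field names are joined into a query is likewise an assumption.

```python
from typing import Callable, Dict, Iterable, List


def collect_latest(keywords: Iterable[str],
                   fields: Iterable[str],
                   search: Callable[[str], List[str]]) -> Dict[str, List[str]]:
    """Collect items for each field, keyed by field name (step S5).

    `search` is supplied by the caller; this sketch does not perform any
    network access itself.
    """
    base = list(keywords)
    store: Dict[str, List[str]] = {}
    for field_name in fields:
        query = " ".join(base + [field_name])
        store[field_name] = search(query)
    return store


def fake_search(query: str) -> List[str]:
    # Placeholder used only to exercise the sketch.
    return ["result for: " + query]


store = collect_latest(["American"], ["politics", "sports"], fake_search)
```

Keying the store by field makes the later per-field lookup by the information selection unit 8 a simple dictionary access.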
[0044]
The display control unit 17 displays on the display 20 the CG scene created by the scene setting unit 3 and the CG conversation partner created by the conversation partner setting unit 5, together with buttons for selecting a field of the information, such as the latest information, collected by the information acquisition unit 6 (steps S6 and S7).
[0045]
FIG. 3 shows a screen display example displayed on the display 20 of the dialogue simulation apparatus.
[0046]
As shown in FIG. 3, the display screen consists of a display area W1 showing the CG scene and conversation partner, an area W2 containing selection buttons for choosing a field of information, a display area W3 showing information such as the latest information, and a display area W4 showing advice to the user.
[0047]
At the stage of step S7 in FIG. 4, areas W1 and W2 of the areas shown in FIG. 3 are displayed.
[0048]
In this state, when the user uses a pointing device such as a mouse to select the button for a desired field from the selection buttons displayed in area W2 of FIG. 3 (step S11 in FIG. 5), the scenario selection unit 10 selects, from the scenarios stored in the scenario storage unit 9, a scenario suited to the selected field and to the conditions input in steps S1 and S2 (step S12).
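The specification does not state how "suited" is decided in step S12. As one possible reading, the sketch below filters scenarios by the selected field and then prefers the one whose tags agree with the most input conditions. The `Scenario` class, its `tags` attribute, and the counting rule are all assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Scenario:
    name: str
    topic_field: str                                     # e.g. "elections"
    tags: Dict[str, str] = field(default_factory=dict)   # conditions it targets


def select_scenario(scenarios: List[Scenario], topic_field: str,
                    conditions: Dict[str, str]) -> Optional[Scenario]:
    """Pick the scenario for the field whose tags match the most conditions."""
    candidates = [s for s in scenarios if s.topic_field == topic_field]
    if not candidates:
        return None

    def score(s: Scenario) -> int:
        return sum(1 for k, v in s.tags.items() if conditions.get(k) == v)

    # On a tie, max() keeps the first candidate in storage order.
    return max(candidates, key=score)


stored = [
    Scenario("formal-election", "elections", {"formality": "formal"}),
    Scenario("casual-election", "elections",
             {"formality": "informal", "nationality": "American"}),
    Scenario("casual-golf", "golf", {"formality": "informal"}),
]
chosen = select_scenario(stored, "elections",
                         {"formality": "informal", "nationality": "American"})
```

Sorting the candidates by the same score instead of taking the maximum would yield a ranked list from which the user could choose.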
[0049]
The information selection unit 8 also reads from the information storage unit 7 the information (latest information, general knowledge, etc.) belonging to the field selected by the user, and the display control unit 17 displays it in area W3 of FIG. 3 (steps S13 and S14).
[0050]
The fields displayed in area W2 of FIG. 3 may be major categories; when one of them is selected, its subcategories are then displayed, and the user selects the desired field from among them.
[0051]
For example, suppose the scene conditions are type = party and purpose = socializing after an academic conference, the conversation partner conditions are gender = male, nationality = American, and age = 40, and the field selected by the user is the subcategory "elections" under "politics". The scenario, that is, the dialogue, selected in step S12 may then be, for example, the one shown in FIG. 2. Although the scenario in FIG. 2 is shown in Japanese, in practice it would be in English; for simplicity, the description here uses the Japanese version.
[0052]
When a scenario such as that of FIG. 2 is selected and it has been determined in advance that the user starts the dialogue (for example, by a user setting), the dialogue control unit 11 assigns role "A" in FIG. 2 to the user and role "B" to the system, that is, the conversation partner. Then, so that a dialogue following this scenario, that is, a dialogue suited to the conditions input by the user and to the field the user selected as the topic, can be carried out, the dialogue control unit 11 first displays advice in area W4 of FIG. 3 to prompt the user to speak the first line of "A" (step S15). For example, so that the user can hold a dialogue about the latest information displayed in area W3, the advice may be a line template that becomes a line the user should say once a desired topic word contained in the latest information displayed in area W3 is inserted; in this case, the template corresponding to the first line of "A", "Are you interested in x1 (which election)?", may be displayed in area W4.
[0053]
Because the information selected for "elections" (here, the latest information) is displayed in area W3 of the screen, suppose the user looks at it and at the advice in area W4 and says, for example, "Are you interested in 'the American presidential election'?" In this notation, the part enclosed in single quotation marks is the topic word inserted by the user.
[0054]
The user's speech is input from the voice input unit 12, is recognized by the voice recognition unit 13, and is further analyzed by the analysis unit 14 (steps S21 to S23).
[0055]
As a result of the analysis unit 14 analyzing the user's utterance, suppose a topic word is extracted from it (step S24); in this case, "the American presidential election". The inserted topic word can be extracted, for example, by taking the difference between the user's utterance and the line template displayed as advice.
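The difference-based extraction described here can be sketched as follows, assuming the recognized utterance reproduces the template text exactly apart from the slot. The function name, the slot pattern, and the exact-match requirement are illustrative assumptions; real speech recognition output would likely need more tolerant matching.

```python
import re
from typing import Optional

# Slot marker (x1, x2, ...) with an optional parenthesized hint.
SLOT_WITH_HINT = re.compile(r"\bx\d+\b(?:\s*\([^)]*\))?")


def extract_topic_word(template: str, utterance: str) -> Optional[str]:
    """Return the text spoken in place of the template's single slot, else None."""
    parts = SLOT_WITH_HINT.split(template)
    if len(parts) != 2:
        return None  # no slot, or several slots: outside this sketch
    before, after = parts
    pattern = re.escape(before) + r"(.+)" + re.escape(after)
    match = re.fullmatch(pattern, utterance.strip(), flags=re.IGNORECASE)
    return match.group(1).strip() if match else None


topic = extract_topic_word(
    "Are you interested in x1 (which election)?",
    "Are you interested in the American presidential election?",
)
```

A template without a slot, such as a set greeting, yields `None`, which corresponds to the case where no topic word is extracted.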
[0056]
The dialogue control unit 11 first checks whether information corresponding to the extracted topic word has already been acquired (stored in the information storage unit 7) (step S28). At this time, when the information is not stored in the information storage unit 7, the information acquisition unit 6 may be requested to acquire the latest information corresponding to the topic word. In this case, the information acquisition unit 6 may acquire information from the Internet, for example, using the passed topic word as a keyword (step S29).
[0057]
On the other hand, when information corresponding to the extracted topic word has already been acquired (is stored in the information storage unit 7), the dialogue control unit 11 prepares the system-side (conversation partner's) line; that is, a line for the conversation partner that responds to the extracted topic word is generated (step S30). Here, the next line the conversation partner should speak is the first line of "B" in FIG. 2: "Of course. Who do you think will win x1 (which election)?" The topic word "the American presidential election" extracted from the user's utterance must be placed in "x1". The line generation unit 15 therefore inserts "the American presidential election" into the "x1" slot and generates the line "Of course. Who do you think will win the American presidential election?"
[0058]
Furthermore, when the analysis unit 14 judges in step S23 that the user's utterance contains, for example, a remark that breaches etiquette or touches on a taboo, the facial expression, movement, and manner of speaking of the conversation partner are determined according to that judgment (step S25). For example, the conversation partner's expression may stiffen, or the tone of voice may change to sound angry.
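How the analysis unit 14 makes this judgment is not specified. Purely as an illustration, the sketch below assumes a caller-supplied list of taboo phrases and a simple substring check, and maps the result to an expression and a voice tone. The `PartnerReaction` class and its values are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class PartnerReaction:
    expression: str
    voice_tone: str


def decide_reaction(utterance: str, taboo_phrases: Iterable[str]) -> PartnerReaction:
    """Map the analysis result to the partner's expression and tone (step S25)."""
    lowered = utterance.lower()
    if any(phrase.lower() in lowered for phrase in taboo_phrases):
        # Breach of etiquette detected: stiffen the expression, sound angry.
        return PartnerReaction(expression="stiff", voice_tone="angry")
    return PartnerReaction(expression="neutral", voice_tone="normal")


taboo = ["how much do you earn"]  # hypothetical example entry
reaction = decide_reaction("By the way, how much do you earn?", taboo)
```

Because the phrase list is a parameter, different lists could be supplied per country or per scene, in line with the common-sense information held in the information storage unit 7.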
[0059]
When no topic word is extracted from the user's utterance in step S24, for example when the user has spoken a set greeting phrase or when no line template with a topic-word slot was displayed as advice, no topic word can be expected in the utterance, so the dialogue control unit 11 simply selects the next system-side (conversation partner's) line from the scenario.
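The branching across steps S24 and S28 to S30 can be summarized in a short sketch. It assumes that, after information is acquired in step S29, processing continues to line generation in step S30, which the text implies but does not state. The function name and the `fetch_latest` callable are hypothetical.

```python
from typing import Callable, Dict, List, Optional


def next_partner_line(topic_word: Optional[str],
                      partner_template: str,
                      info_store: Dict[str, List[str]],
                      fetch_latest: Callable[[str], List[str]]) -> str:
    """Choose the partner's next line for one turn."""
    if topic_word is None:
        # Step S24, no topic word (e.g. a set greeting): use the line as written.
        return partner_template
    if topic_word not in info_store:
        # Steps S28 and S29: nothing stored for this word yet, so acquire it.
        info_store[topic_word] = fetch_latest(topic_word)
    # Step S30: insert the topic word into the partner's template.
    return partner_template.replace("x1", topic_word)


fetched: List[str] = []


def fake_fetch(word: str) -> List[str]:
    # Placeholder retrieval that records what was requested.
    fetched.append(word)
    return ["item about " + word]


store: Dict[str, List[str]] = {}
template = "Of course. Who do you think will win x1?"
first = next_partner_line("the American presidential election", template, store, fake_fetch)
second = next_partner_line("the American presidential election", template, store, fake_fetch)
greeting = next_partner_line(None, "Nice to meet you, too.", store, fake_fetch)
```

The second call finds the topic word already stored and therefore does not fetch again.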
[0060]
In step S26, the system-side (conversation partner's) line that was generated in step S30 or selected from the scenario is converted to speech by speech synthesis and output. Then, in response to the conversation partner's line that has been output, the line that the user should preferably utter, or a template for that line, is displayed as advice in area W4 of the screen so that the dialogue proceeds in accordance with the previously selected scenario (step S27).
[0061]
For example, area W4 displays “I don't know. For now, x2 (the name of the first candidate) is considered dominant”. Then, as in the description of step S15 in FIG. 5, the user selects, for example, “John W” as a topic word from the information displayed in area W3 and says, “I think John W is dominant.” The subsequent processing operations are the same as described above, and the processing shown in FIG. 6 is repeated until the scenario ends.
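The claims describe extracting the topic word as the difference between the user's utterance and the displayed dialogue template. A minimal sketch of that idea follows, assuming the recognized utterance matches the template word for word outside the slots; real speech recognition output would need more tolerant matching. The slot syntax and the name `extract_topic_words` are hypothetical.

```python
import re
from typing import Dict, Optional

# A slot is "x<number>", optionally followed by a parenthesized hint.
SLOT = re.compile(r"\bx(\d+)\b(?:\s*\([^)]*\))?")

def extract_topic_words(template: str, utterance: str) -> Optional[Dict[str, str]]:
    """Return the words the user put in each slot, or None if the utterance does not fit."""
    pattern = ""
    pos = 0
    for match in SLOT.finditer(template):
        pattern += re.escape(template[pos:match.start()])  # fixed text before the slot
        pattern += f"(?P<x{match.group(1)}>.+?)"           # whatever the user said in the slot
        pos = match.end()
    pattern += re.escape(template[pos:])                   # fixed text after the last slot
    found = re.fullmatch(pattern, utterance.strip(), flags=re.IGNORECASE)
    return found.groupdict() if found else None

template = "For now, x2 (the name of the first candidate) is considered dominant."
words = extract_topic_words(template, "For now, John W is considered dominant.")
no_match = extract_topic_words(template, "Nice weather today.")
```

Returning `None` for an utterance that does not fit the template corresponds to the case in which no topic word is extracted and the next line is simply taken from the scenario.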
[0062]
As described above, according to this embodiment, simply by inputting the conditions for a desired scene and a desired conversation partner, the user can hold a conversation with the desired conversation partner on a desired topic while the conversation develops along a story, and appropriate advice is displayed throughout, so that dialogue training can easily be carried out in advance. In particular, displaying as advice the line that the user should preferably utter is useful for mastering set phrases such as greetings. Furthermore, presenting as advice a dialogue template, which becomes a complete line when a desired topic word from the latest information already displayed is inserted, allows the user to practice conversations about the latest information, so that realistic training in line with actual conversational situations is possible. Conducting the dialogue in a foreign language also makes foreign-language dialogue training possible. In addition, because the latest information, general common knowledge, and the like that are collected and selected on the basis of the conditions for the desired scene and the desired conversation partner, and that are best suited to that conversation partner, are presented, the user can acquire background knowledge useful for actual conversations.
[0063]
In the embodiment described above, the system side automatically selects the scenario in step S12 of FIG. 5. However, the present invention is not limited to this. In step S12, the system side may instead select a plurality of optimal scenario candidates on the basis of the field selected by the user and the input conditions, and the user may then select the desired scenario from among those candidates.
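The variation in which several scenario candidates are offered can be sketched as a filter by field followed by a ranking against the input conditions. Everything in this sketch is an assumption made for illustration: the `Scenario` fields, the scoring rule, and the limit of three candidates do not come from the embodiment.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Scenario:
    title: str
    field: str        # field the scenario belongs to, e.g. "politics"
    scene_type: str   # kind of scene the scenario is written for
    formal: bool      # whether the scenario suits a formal scene

def candidate_scenarios(scenarios: List[Scenario], field: str,
                        scene_type: str, formal: bool, limit: int = 3) -> List[Scenario]:
    """Return up to `limit` scenarios in the chosen field, best match to the conditions first."""
    def score(s: Scenario) -> int:
        return int(s.scene_type == scene_type) + int(s.formal == formal)
    in_field = [s for s in scenarios if s.field == field]
    return sorted(in_field, key=score, reverse=True)[:limit]

scenarios = [
    Scenario("Election small talk", "politics", "party", False),
    Scenario("Policy briefing", "politics", "meeting", True),
    Scenario("Match recap", "sports", "party", False),
]
picked = candidate_scenarios(scenarios, field="politics", scene_type="party", formal=False)
titles = [s.title for s in picked]
```

The user would then choose one of the returned candidates instead of the system committing to a single scenario.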
[0064]
The method of the present invention described in the embodiment (in particular, the method corresponding to the processing operations shown in FIGS. 4 to 6) can also be stored, as a program executable by a computer, on a recording medium such as a magnetic disk (floppy disk, hard disk, etc.), an optical disk (CD-ROM, DVD, etc.), or a semiconductor memory, and distributed.
[0065]
The present invention is not limited to the above embodiment and can be modified in various ways at the implementation stage without departing from its gist. Furthermore, the above embodiment includes inventions at various stages, and various inventions can be extracted by appropriately combining the plurality of disclosed constituent elements. For example, even if some constituent elements are deleted from all the constituent elements shown in the embodiment, as long as at least one of the problems described in the section on the problems to be solved by the invention can be solved and at least one of the effects described in the section on the effects of the invention is obtained, the configuration from which those constituent elements are deleted can be extracted as an invention.
[0066]
[Effects of the invention]
As described above, according to the present invention, dialogue with various people in the various scenes encountered at the individual level can be practiced easily.
[Brief description of the drawings]
FIG. 1 is a diagram showing a schematic configuration example of a dialogue simulation apparatus according to an embodiment of the present invention.
FIG. 2 is a diagram showing an example of a dialogue scenario.
FIG. 3 is a diagram showing a screen display example of the dialogue simulation apparatus in FIG. 1.
FIG. 4 is a flowchart for explaining a processing operation of the dialogue simulation apparatus in FIG. 1.
FIG. 5 is a flowchart for explaining a processing operation of the dialogue simulation apparatus in FIG. 1.
FIG. 6 is a flowchart for explaining a processing operation of the dialogue simulation apparatus in FIG. 1.
[Explanation of symbols]
1 ... Control unit
2 ... Input unit
3 ... Scene setting unit
4 ... Parts storage unit
5 ... Dialogue partner setting unit
6 ... Information acquisition unit
7 ... Information storage unit
8 ... Information selection unit
9 ... Scenario storage unit
10 ... Scenario selection unit
11 ... Dialogue control unit
12 ... Voice input unit
13 ... Speech recognition unit
14 ... Analysis unit
15 ... Dialogue generation unit
16 ... Speech synthesis unit
17 ... Display control unit
18 ... Dialogue partner control unit
20 ... Display

Claims

A dialogue simulation apparatus comprising:
Input means for inputting conditions for setting a desired scene, including at least one of the type of the scene, the date and time, the place, the purpose, and whether the scene is formal or informal, and conditions for setting a desired conversation partner, including at least one of the conversation partner's gender, age, field of expertise, hobbies, occupation, and relationship with the user;
Scenario storage means for storing a plurality of dialogue scenarios, each including dialogue templates for the lines exchanged between the user and the conversation partner, in which insertion positions for topic words are predetermined;
Information collection means for collecting, for each of a plurality of predetermined fields, the topic words via a network, using as keywords the conditions for setting the scene and the conversation partner input by the input means;
Information storage means for storing the topic words collected by the information collection means;
Scenario selection means for selecting, from among the scenarios stored in the scenario storage means, a scenario that belongs to the field selected by the user from among the plurality of fields and that is for a scene satisfying the conditions input by the input means;
Information selection means for selecting, from among the topic words stored in the information storage means, a topic word belonging to the field selected by the user from among the plurality of fields;
Display means for displaying at least the topic word selected by the information selection means and the user's dialogue template in the scenario selected by the scenario selection means;
Voice input means for inputting the user's speech;
Extraction means for extracting a topic word in the user's speech by obtaining the difference between the user's speech input by the voice input means and the dialogue template displayed by the display means;
Generation means for generating a line for the conversation partner by inserting the topic word extracted by the extraction means at the predetermined insertion position of the conversation partner's dialogue template in the selected scenario; and
Voice output means for speech-synthesizing and outputting the line generated by the generation means.

The dialogue simulation apparatus according to claim 1, further comprising:
Parts storage means for storing CG parts for creating CG (computer graphics) images of the scene and the conversation partner;
Scene setting means for creating a CG image of the scene using the CG parts stored in the parts storage means, based on the conditions for setting the scene input by the input means; and
Conversation partner setting means for creating a CG image of the conversation partner using the CG parts stored in the parts storage means, based on the conditions for setting the conversation partner input by the input means,
wherein the display means further displays the CG images of the scene and the conversation partner.

A dialogue support method in a dialogue simulation apparatus that comprises:
Input means for inputting conditions for setting a desired scene, including at least one of the type of the scene, the date and time, the place, the purpose, and whether the scene is formal or informal, and conditions for setting a desired conversation partner, including at least one of the conversation partner's gender, age, field of expertise, hobbies, occupation, and relationship with the user;
Scenario storage means for storing a plurality of dialogue scenarios, each including dialogue templates for the lines exchanged between the user and the conversation partner, in which insertion positions for topic words are predetermined;
Information collection means for collecting the topic words via a network;
Information storage means for storing the topic words collected by the information collection means;
Display means;
Voice input means for inputting the user's speech;
Voice output means; and
Control means,
the dialogue support method including:
A first step in which the conditions for setting the scene and the conditions for setting the conversation partner are input through the input means;
A second step in which the information collection means collects, for each of a plurality of predetermined fields, the topic words via a network, using as keywords the conditions for setting the scene and the conversation partner input in the first step;
A third step in which the control means selects, from among the scenarios stored in the scenario storage means, a scenario that belongs to a field selected by the user from among the plurality of fields and that is for a scene satisfying the conditions input in the first step;
A fourth step in which the control means selects, from among the topic words stored in the information storage means, a topic word belonging to the field selected by the user from among the plurality of fields;
A fifth step in which the display means displays at least the topic word selected in the fourth step and the user's dialogue template in the scenario selected in the third step;
A sixth step in which the user's speech is input through the voice input means;
A seventh step in which the control means extracts a topic word in the user's speech by obtaining the difference between the user's speech input in the sixth step and the dialogue template displayed by the display means in the fifth step;
An eighth step in which the control means generates a line for the conversation partner by inserting the topic word extracted in the seventh step at the predetermined insertion position of the conversation partner's dialogue template in the scenario selected in the third step; and
A ninth step in which the voice output means speech-synthesizes and outputs the line generated in the eighth step.

The dialogue support method according to claim 3, wherein the dialogue simulation apparatus further comprises parts storage means for storing CG parts for creating CG (computer graphics) images of the scene and the conversation partner,
the dialogue support method further including:
A step in which the control means creates a CG image of the scene using the CG parts stored in the parts storage means, based on the conditions for setting the scene input in the first step; and
A step in which the control means creates a CG image of the conversation partner using the CG parts stored in the parts storage means, based on the conditions for setting the conversation partner input in the first step,
wherein, in the fifth step, the display means further displays the CG images of the scene and the conversation partner.