JP2004309631A - Apparatus, method, and program for assisting interaction practice

Info

Publication number: JP2004309631A
Application number: JP2003100296A
Authority: JP
Inventors: Eriko Sano; 恵利子佐野; Yoshihiko Hirakawa; 義彦平川; Akio Kameda; 明男亀田; Shinichiro Takagi; 伸一郎高木
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2003-04-03
Filing date: 2003-04-03
Publication date: 2004-11-04

Abstract

PROBLEM TO BE SOLVED: To provide an apparatus, a method, and a program for assisting interaction practice that are interactive, using audio and video as in an actual simulated interaction practice, and that evaluate the content of a user's answer to a question, the answer time, and the like.

SOLUTION: The apparatus for assisting interaction practice holds scenario information that branches on the terms contained in the answer information given in response to question information, and on the silent time, to select the question information to be sent next. The voice information in the answer information is converted into text information; when a specified term is included, that part is marked and stored, and when there is a silent time, that part is marked and stored. When an evaluation record request is received from a terminal, evaluation record information, consisting of information indicating that the specified term is used or that a silent time occurred at the marked part, is transmitted to the terminal together with the question information and the answer information.

COPYRIGHT: (C)2005,JPO&NCIPI

Description

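[0001]
[Technical field of the invention]
The present invention relates to a dialogue practice support device, method, and program for practicing dialogue in a simulated manner.
[0002]
[Prior art]
Computer-based conversation manner education systems already exist (see, for example, Patent Document 1). Such a system takes a user's free utterance as input, converts it into text information, and evaluates it by comparing the text of the user's utterance with the text of a correct utterance. The overall evaluation is then output to a display as image information. In this way the user can learn grammatical knowledge of honorifics, knowledge of how honorifics are used, conversational situation information, and the like.
[0003]
[Patent Document 1]
Japanese Patent No. 2673831
[0004]
[Problems to be solved by the invention]
However, the conventional system evaluates a one-sided utterance by the user and is not interactive in the way an actual dialogue is. It therefore could not select the next question according to the content of the response to a question, nor evaluate the silent time when the user was stuck for a response.
[0005]
Accordingly, an object of the present invention is to provide a dialogue practice support device, method, and program that are interactive, conducted by voice and video like an actual simulated dialogue practice, and that evaluate the content of the user's response to a question, the response time, and the like.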
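[0006]
[Means for solving the problems]
The dialogue practice support device of the present invention is characterized by comprising:
question information storage means containing scenario information that, for question information consisting of audiovisual information of an interviewer, branches according to the terms contained in response information consisting of audiovisual information of the user and the silent time until that response information is received, and selects the question information to be transmitted next;
means for transmitting question information to a terminal on the basis of the scenario information;
audiovisual information storage means for storing the response information received from the terminal;
term storage means storing specific terms;
voice recognition processing means for converting the voice information in the user's audiovisual information into text information and marking the relevant part of the text information to indicate that a specific term is contained or that a silent time occurred;
dialogue information storage means for storing the text information marked by the voice recognition processing means;
dialogue evaluation means for extracting the marked parts from the dialogue information storage means when an evaluation record request is received from the terminal; and
evaluation record generating means for transmitting to the terminal evaluation record information consisting of the question information extracted from the question information storage means, the response information extracted from the audiovisual information storage means, and information indicating that a specific term is used, or that a silent time occurred, at the marked parts.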
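[0007]
According to another embodiment of the dialogue practice support device of the present invention, it is also preferable that
the scenario information stored in the question information storage means includes evaluation point information for each branch,
the dialogue evaluation means adds up the evaluation point information according to the branches followed through the scenario information, and
the evaluation record generating means further transmits the evaluation point information to the terminal.
[0008]
According to yet another embodiment of the dialogue practice support device of the present invention, it is also preferable that
the voice recognition processing means collects passage statistics for the branches in the scenario information and stores the passage statistics in the question information storage means.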
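[0009]
According to the dialogue practice support method of the present invention,
the device has question information storage means containing scenario information that, for question information consisting of audiovisual information of an interviewer, branches according to the terms contained in response information consisting of audiovisual information of the user and the silent time until that response information is received, and selects the question information to be transmitted next, and term storage means storing specific terms, and the method is characterized by comprising:
a first step in which the device transmits question information to the terminal on the basis of the scenario information;
a second step in which the terminal transmits response information for the question information to the device;
a third step in which the device stores the response information in audiovisual information storage means;
a fourth step in which the device performs voice recognition processing, converting the voice information in the user's audiovisual information into text information and marking the relevant part of the text information to indicate that a specific term is contained or that a silent time occurred; and
a fifth step in which the device stores the text information marked by the voice recognition processing means in dialogue information storage means;
the first to fifth steps being repeated; and further comprising:
a sixth step in which the terminal transmits an evaluation record request to the device;
a seventh step in which the device extracts the marked parts from the dialogue information storage means; and
an eighth step in which the device transmits to the terminal evaluation record information consisting of the question information extracted from the question information storage means, the response information extracted from the audiovisual information storage means, and information indicating that a specific term is used, or that a silent time occurred, at the marked parts.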
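[0010]
Further, according to the dialogue practice support method of the present invention,
the device has question information storage means containing scenario information that, for question information consisting of audiovisual information of an interviewer, branches according to the terms contained in response information consisting of audiovisual information of the user and the silent time until that response information is received, and selects the question information to be transmitted next, and term storage means storing specific terms,
the terminal also has question information storage means, and the method is characterized by comprising:
a first step in which the terminal collects a series of response information by repeatedly outputting question information to the user on the basis of the scenario information and accepting input of response information for that question information;
a second step in which the terminal transmits the response information to the device;
a third step in which the device stores the response information in audiovisual information storage means;
a fourth step in which the device performs voice recognition processing, converting the voice information in the user's audiovisual information into text information and marking the relevant part of the text information to indicate that a specific term is contained or that a silent time occurred;
a fifth step in which the device stores the text information marked by the voice recognition processing means in dialogue information storage means;
a sixth step in which the terminal transmits an evaluation record request to the device;
a seventh step in which the device extracts the marked parts from the dialogue information storage means; and
an eighth step in which the device transmits to the terminal evaluation record information consisting of the question information extracted from the question information storage means, the response information extracted from the audiovisual information storage means, and information indicating that a specific term is used, or that a silent time occurred, at the marked parts.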
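[0011]
According to another embodiment of the dialogue practice support method of the present invention, it is also preferable that
the scenario information stored in the question information storage means includes evaluation point information for each branch,
the fifth step adds up the evaluation point information according to the branches followed through the scenario information, and
the sixth step further transmits the evaluation point information to the terminal.
[0012]
According to yet another embodiment of the dialogue practice support method of the present invention, it is also preferable that
the voice recognition processing means collects passage statistics for the branches in the scenario information and stores the passage statistics in the question information storage means.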
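[0013]
The dialogue practice support program of the present invention is characterized by causing a computer to function as:
question information storage means containing scenario information that, for question information consisting of audiovisual information of an interviewer, branches according to the terms contained in response information consisting of audiovisual information of the user and the silent time until that response information is received, and selects the question information to be transmitted next;
means for transmitting question information to a terminal on the basis of the scenario information;
audiovisual information storage means for storing the response information received from the terminal;
term storage means storing specific terms;
voice recognition processing means for converting the voice information in the user's audiovisual information into text information and marking the relevant part of the text information to indicate that a specific term is contained or that a silent time occurred;
dialogue information storage means for storing the text information marked by the voice recognition processing means;
dialogue evaluation means for extracting the marked parts from the dialogue information storage means when an evaluation record request is received from the terminal; and
evaluation record generating means for transmitting to the terminal evaluation record information consisting of the question information extracted from the question information storage means, the response information extracted from the audiovisual information storage means, and information indicating that a specific term is used, or that a silent time occurred, at the marked parts.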
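[0014]
According to another embodiment of the dialogue practice support program of the present invention, it is also preferable that
the scenario information stored in the question information storage means includes evaluation point information for each branch,
the dialogue evaluation means adds up the evaluation point information according to the branches followed through the scenario information, and
the program causes the computer to function so that the evaluation record generating means further transmits the evaluation point information to the terminal.
[0015]
According to yet another embodiment of the dialogue practice support program of the present invention, it is also preferable that
the program causes the computer to function so that the voice recognition processing means collects passage statistics for the branches in the scenario information and stores the passage statistics in the question information storage means.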
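[0016]
[Embodiments of the invention]
Embodiments of the present invention are described below with reference to the drawings.
[0017]
FIG. 1 is a system configuration diagram of the present invention.
[0018]
As shown in FIG. 1, the dialogue evaluation device 1 of the present invention and the user's terminal 2 are connected via the Internet 3. The terminal 2 is equipped with a microphone 21 that acquires voice information, a video camera 22 that acquires video information, and a speaker 23 that outputs the interviewer's voice.
[0019]
The terminal 2 receives the question information transmitted from the dialogue evaluation device 1. The question information consists of audiovisual information; the terminal 2 displays the video information in a browser and outputs the voice through the speaker 23. The microphone 21 captures the user's utterances and the video camera 22 captures video of the user. The terminal 2 transmits this audiovisual information to the dialogue evaluation device 1.
[0020]
The audiovisual information that the terminal 2 transmits to the dialogue evaluation device 1 may be sent individually in real time for each question, or sent in a batch for a plurality of questions. In the latter case, the terminal 2 needs to receive the question information from the dialogue evaluation device 1 and store it in advance.
[0021]
In addition, the terminal 2 can request a dialogue evaluation record from the dialogue evaluation device 1. For the received evaluation record information, the video information is displayed in the browser and the voice information is output from the speaker 23.
[0022]
FIG. 2 is a functional configuration diagram of the dialogue evaluation device 1 of the present invention.
[0023]
As shown in FIG. 2, the dialogue evaluation device 1 has an audiovisual information database 11, a question information (scenario) database 12, a time stamp unit 13, a voice recognition processing unit 14, a dialogue information database 15, a dialogue evaluation unit 16, an evaluation record generation unit 17, a specific term database 18, and a communication interface 19.
[0024]
The device is connected to the Internet 3 via the communication interface 19.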
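[0025]
The question information database 12 stores the question information to be transmitted to the user's terminal 2. The question information consists of scenario information for conducting the simulated dialogue practice and the audiovisual information of the interviewer. For example, for a simulated job interview, the question information is organized into patterns such as information-related companies and biotechnology-related companies.
[0026]
The scenario information is a flowchart whose branch conditions are the terms contained in the response to a question. A branch condition based on the scenario information also arises when a silent time of a certain length occurs between the end of the question and the start of the response, that is, when the user is stuck for a response. Conditions may include, for example, multiple terms combined with operators (AND, OR, and so on), handling of rephrasing, and detection of sentence endings or verbal habits. In particular, the term uttered last can be selected. For example, if the user says "not the sales department, but the accounting department", "accounting department" can be taken as the key term for the scenario branch. These scenarios are implemented as programs and are easy to change and revise.
[0027]
The flowchart of the scenario information also awards evaluation points, which can be used as the final dialogue evaluation score. For example, points can be deducted according to the silent time, with longer silences deducting more.
[0028]
It is also preferable to store, as scenario information, ideal responses assumed in advance. By transmitting the ideal response information for the question information to the terminal 2, the user can not only see objectively how he or she responded but also understand, with reasons, what went badly and what went well, and learn the ideal pattern, which improves convenience for the user.
[0029]
The time stamp unit 13 attaches a time to the audiovisual information received by the communication interface 19.
[0030]
The received audiovisual information is stored per user in the audiovisual information database 11, with the time attached by the time stamp. The video information is not limited to that captured by the video camera 22 of the terminal 2; it may be material with a visual effect, such as document data or projection data that the user uses in a presentation.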
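[0031]
The voice information stored in the audiovisual information database 11 is then converted into text information by the voice recognition processing unit 14. In doing so, the voice recognition processing unit 14 uses the specific term database 18 and the question information database 12.
[0032]
The specific term database 18 stores words that are best avoided during a dialogue (terms that serve as branch keywords). The converted text information is searched for the specific terms stored in the specific term database 18. A specific term is, for example, a word that is best avoided during a dialogue. The text information, with the parts containing specific terms marked, is stored in the dialogue information database 15.
[0033]
The voice recognition processing unit 14 also calculates the silent time between the end of the question and the start of the response and, when it determines on the basis of the scenario information that a predetermined silent time has occurred, marks that point in the text information. Such a silent time means, for example, that the user could not think of a response and took time to reply. The text information is then stored in the dialogue information database 15.
[0034]
For the silent time, the start and end of an utterance are determined by comparing the input voice level with a threshold. The start of an utterance means, for example, that the level stays at or above the threshold for at least a predetermined length of time (for example, 2 seconds). The end of an utterance means, for example, that the level stays at or below the threshold for at least a predetermined length of time (for example, 2 seconds).
[0035]
Furthermore, the voice recognition processing unit 14 collects passage statistics for the branches in the scenario information and stores them in the question information storage means. This can be done by incrementing a passage count at each branch of the scenario flowchart. When the same scenario is run with many users, statistics can be obtained showing, for example, that most users branch at a particular branch point.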
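[0036]
After the simulated dialogue ends, the terminal 2 can request an evaluation record from the dialogue evaluation device 1 by transmitting an evaluation record request message. The request need not cover the whole dialogue; an evaluation record can also be requested for only part of the time, as in a one-point lesson.
[0037]
The dialogue evaluation unit 16 obtains the evaluation record request message received by the communication interface 19. It then extracts the question information and/or statistical information from the question information database 12 and extracts the marked parts from the dialogue information database 15.
[0038]
The evaluation record generation unit 17 searches the audiovisual information database 11 using the mark positions and time information of the text information retrieved by the dialogue evaluation unit 16, and obtains the audiovisual information corresponding to the time information attached to that text information. It then generates an AV (Audio and Visual) evaluation record in HTML format consisting of that audiovisual information, the question information and/or statistical information, advice information corresponding to the mark positions, and the evaluation point information, and returns the evaluation record information to the terminal 2 that sent the evaluation record request message.
[0039]
The terminal that sent the evaluation record request message can play back the evaluation record information. The browser then displays the interviewer's video, the user's video, advice arising from inappropriate responses, silences, and the like, and the evaluation point information.
[0040]
The advice is displayed, for example, as follows.
(1) When a specific term was detected: "During the dialogue you used words that are best avoided ○ times. Try saying △△△ instead."
(2) When a silent time was detected: "During the dialogue there were ○ places where you were stuck for a response. You were stuck on the question about ○○○. Prepare a response such as ××× in advance."
[0041]
These can easily be realized by having the voice recognition processing unit 14 count the number of times inappropriate terms and the like are detected.
[0042]
As another embodiment of FIG. 2, the scenario information stored in the question information database 12 may be transmitted to the terminal 2 in advance. The terminal 2 runs the series of dialogues on the basis of that scenario information and stores the user's audiovisual information. After the series of dialogues ends, the terminal 2 transmits the audiovisual information to the dialogue evaluation device 1. The subsequent processing in the dialogue evaluation device 1 is the same as described above.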
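[0043]
FIG. 3 is a sequence diagram between the dialogue evaluation device and the terminal according to the present invention.
[0044]
(S1) The terminal 2 transmits a dialogue evaluation start request message to the dialogue evaluation device 1. The message includes a lesson number, a user ID, and the like.
(S2) The dialogue evaluation device 1 transmits question information, in the form of audiovisual information, from the question information database 12 to the terminal 2. The terminal 2 displays the interviewer's video in the browser and outputs the question audio through the speaker 23. The user can thus carry out the simulated dialogue while feeling as though the interviewer on the screen were a real one. It is also preferable to display the user's own video, the user's résumé document, or the like at this time.
(S3) The terminal 2 transmits to the dialogue evaluation device 1 the voice information of the user's response captured by the microphone 21 and the video information of the user captured by the video camera 22.
By repeating the sequence of (S2) and (S3), dialogue information accumulates in the dialogue evaluation device 1.
[0045]
(S4) After the simulated dialogue practice ends, the user uses the terminal 2 to transmit a dialogue evaluation request message to the dialogue evaluation device 1.
(S5) The dialogue evaluation device 1 transmits dialogue evaluation record information to the terminal 2. The dialogue evaluation record information includes the interviewer's audiovisual information, the user's audiovisual information, advice information, and dialogue evaluation point information. The terminal 2 displays the video information, advice information, and point information in the browser and outputs the voice information through the speaker. The user can thus watch his or her own dialogue as if from the interviewer's position and see objectively how he or she responded.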
図４は、本発明における対話評価装置１が、端末２からの音声映像情報を受信した際のフローチャートである。
【００４７】
図４によれば、以下のシーケンスで進行する。
（Ｓ４１）端末２から、質問に対する返答である音声映像情報を受信する。
（Ｓ４２）その音声映像情報に、現在の時刻をスタンプする。端末において時刻がファイルにスタンプされていれば、ここで時刻をスタンプする必要はない。
（Ｓ４３）音声映像情報は、音声映像情報データベース１１に、利用者毎に蓄積される。
（Ｓ４４）音声映像情報における音声情報は、音声認識処理によって、テキスト情報に変換される。このとき、特定用語データベース１８に蓄積された特定用語を検索し、質問情報データベース１２に蓄積されたシナリオ情報に基づいてマーク付けする。また、シナリオ情報のフローチャートの分岐部分に通過統計情報を加算することもできる。
（Ｓ４５）そのテキスト情報は、対話情報データベース１５に、利用者毎に蓄積される。
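The receiving flow (S41) to (S45) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the in-memory dictionaries `av_db` and `dialogue_db` stand in for databases 11 and 15, `handle_reply` is a hypothetical name, and `recognize` is a hypothetical callable standing in for whatever speech recognizer is used.

```python
import time

av_db = {}        # user_id -> [(timestamp, media)], stands in for database 11
dialogue_db = {}  # user_id -> [(timestamp, text, marks)], stands in for database 15


def handle_reply(user_id, media, recognize, specific_terms, timestamp=None):
    """Process one reply following S41-S45 and return the time stamp used."""
    # S42: stamp the current time unless the terminal already supplied one.
    ts = time.time() if timestamp is None else timestamp
    # S43: store the raw audiovisual data per user.
    av_db.setdefault(user_id, []).append((ts, media))
    # S44: convert speech to text, then mark the specific terms found in it.
    text = recognize(media)
    marks = [term for term in specific_terms if term in text]
    # S45: store the marked text per user.
    dialogue_db.setdefault(user_id, []).append((ts, text, marks))
    return ts
```

Keeping the same time stamp on both records is what later lets the marked text be matched back to the corresponding audio and video.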
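[0048]
FIG. 5 is a flowchart of the processing performed when the dialogue evaluation device 1 of the present invention receives an evaluation record request message.
[0049]
As shown in FIG. 5, processing proceeds in the following sequence.
(S51) An evaluation record request message is received from the terminal.
(S52) Text information is retrieved from the dialogue information database 15.
(S53) The time information of the retrieved text information is identified.
(S54) The audiovisual information corresponding to the identified time information is obtained from the audiovisual information database 11.
(S55) An evaluation record made up of the obtained audiovisual information is generated. The evaluation record is in HTML format, which makes it possible to provide a multimedia evaluation record. The evaluation record also includes advice information for the evaluation items marked in the text information.
(S56) The generated evaluation record is transmitted to the terminal 2 that sent the evaluation record request message.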
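Step (S55) could look roughly like the sketch below, which assembles an HTML evaluation record from entries that have already been looked up by time. The function name `build_record_html`, the entry keys (`question`, `media_url`, `advice`), and the page layout are all assumptions made for illustration; the source only states that the record is in HTML format.

```python
from html import escape


def build_record_html(entries, total_points):
    """Assemble a simple HTML evaluation record from pre-fetched entries."""
    sections = []
    for entry in entries:
        sections.append(
            "<section>"
            f"<p>Q: {escape(entry['question'])}</p>"
            # The user's reply video, located via the time stamp (S53-S54).
            f"<video src='{escape(entry['media_url'], quote=True)}' controls></video>"
            f"<p>Advice: {escape(entry['advice'])}</p>"
            "</section>"
        )
    body = "".join(sections) + f"<p>Points: {total_points}</p>"
    return "<html><body>" + body + "</body></html>"
```

Escaping the text fields matters here because the question and advice strings come from stored data and recognized speech rather than from trusted markup.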
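[0050]
FIG. 6 is a flowchart of the scenario information in the question information database 12.
[0051]
Interviews, for example, include a guided type and a pressure type. In a guided interview, the scenario consists of the items the interviewer wants to confirm with the applicant. In a pressure interview, by contrast, the scenario consists of questions deliberately chosen to put the applicant on the spot, in order to confirm the applicant's intentions. The following describes guided-interview scenario information (with scenario branching by term extraction) and pressure-interview scenario information (with scenario branching when the user is stuck for a response).
[0052]
First, the guided-interview scenario information. The example scenario is for a case where there are many research applicants in a field the company wants and the researchers must be screened; it sorts them according to the skills required of a basic researcher and their future aspirations.
(S1) The question information "Please give an overview of your master's thesis" is transmitted to the terminal 2.
(S2) A response is received from the terminal 2.
(S3) It is determined whether the response contains "quantum computing" or "DNA", research themes the company is working on. If YES, +2 points are awarded.
(S4) If YES, the question information "What has sustained you in pursuing your research?" is transmitted to the terminal 2 to check the applicant's skills as a basic researcher.
(S5) A response is received from the terminal 2.
(S6) It is determined whether the response contains the term "self", suggesting the perseverance to carry work through by believing in oneself. If YES, +2 points are awarded.
(S7) If YES, the question information "Do you want to continue doing basic research in the future?" is transmitted to the terminal 2 to ask about future aspirations.
(S8) If NO, it is determined whether the response contains "member", as in being able to carry work through by trusting team members. For a researcher who needs individual drive more than cooperation with team members, +2 points are awarded if NO.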
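The guided scenario (S1) to (S8) can be expressed as a small branching function. The sketch below is illustrative only: `run_guided` is a hypothetical name, the English keywords stand in for the Japanese terms, plain substring matching replaces real recognition output, and the NO branch of (S3), which the text does not describe, simply ends the scenario.

```python
Q_MOTIVATION = "What has sustained you in pursuing your research?"
Q_FUTURE = "Do you want to continue doing basic research in the future?"


def run_guided(thesis_reply, motivation_reply=None):
    """Return (points, next_question) for the replies received so far."""
    points = 0
    # S3: does the thesis overview mention one of the company's themes?
    if not any(t in thesis_reply for t in ("quantum computing", "DNA")):
        return points, None  # the NO branch of S3 is not described in the text
    points += 2
    if motivation_reply is None:
        return points, Q_MOTIVATION          # S4
    if "self" in motivation_reply:           # S6
        return points + 2, Q_FUTURE          # S7
    if "member" not in motivation_reply:     # S8: NO earns the points
        points += 2
    return points, None
```

For example, `run_guided("DNA", "I believed in myself")` follows S3 YES and S6 YES, giving 4 points and the follow-up question of (S7).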
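[0053]
Next, the pressure-interview scenario information. The example scenario is for a case where the applicant wishes to work in sales, and deliberately puts to the applicant a question that he or she is expected to struggle to answer.
(S9) To test the real intentions of an applicant who wants the accounting department and shows promise as an accounting staff member, the question information "What would you do if you were assigned to the sales department?", deliberately contrary to that wish, is transmitted to the terminal 2.
(S10) It is determined whether there was a silence of 5 seconds or more. If YES, the applicant was stuck on this question. If NO, the applicant responded without being stuck, and +1 point is awarded.
(S11) If NO, the response is received.
(S12) It is determined whether the response contains the terms "your company" and "loss", as in saying that assignment to the sales department would be a loss to your company. If YES, the applicant is considered promising and +2 points are awarded.
(S13) If the result of (S10) above was YES, the response is received.
(S14) In this stuck case, it is determined whether the response contains the terms "another company" or "another department", as in not minding another department or choosing another company.
(S15) If NO, the question information "What can you contribute to the company as an individual?" is transmitted to the terminal 2 to press the applicant further.
(S16) It is determined whether there was a silence of 5 seconds or more. If YES, the applicant was stuck on this question. If NO, the applicant responded without being stuck, and +1 point is awarded.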
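The pressure scenario (S9) to (S16) branches on silence first and on terms second. The following sketch makes two assumptions that the text leaves open: (S12) is read as requiring both terms (AND), and (S14) as requiring either term (OR). The name `score_pressure` and the English keywords are hypothetical.

```python
SILENCE_LIMIT_SEC = 5.0  # "silence of 5 seconds or more" in S10 and S16
Q_CONTRIBUTION = "What can you contribute to the company as an individual?"


def score_pressure(silence1, reply1, silence2=None):
    """Return (points, follow_up_question) for the pressure scenario."""
    points = 0
    follow_up = None
    if silence1 < SILENCE_LIMIT_SEC:                       # S10: NO
        points += 1
        # S12: assumed AND of both terms.
        if "your company" in reply1 and "loss" in reply1:
            points += 2
    # S14: assumed OR of the two terms; NO leads to the follow-up question.
    elif not any(t in reply1 for t in ("another company", "another department")):
        follow_up = Q_CONTRIBUTION                         # S15
        if silence2 is not None and silence2 < SILENCE_LIMIT_SEC:  # S16: NO
            points += 1
    return points, follow_up
```

A prompt, confident answer such as "That would be a loss for your company" after 2 seconds scores 3 points, while a 7-second pause leads to the follow-up question instead.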
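[0054]
Each step of the dialogue practice support method of the present invention can be executed using a recording medium built into a computer and control means such as a CPU. A computer-readable program can also be installed from a recording medium such as a CD, or via a communication line, and executed by the computer. These programs may be realized mainly as programs installed in a device, as one function of a device on the Internet. Of course, these functions can also be realized by a program installed in the terminal and used in a peer-to-peer configuration.
[0055]
[Effects of the invention]
As described in detail above, the dialogue practice support method, device, and program of the present invention are interactive, conducted by voice and video like an actual simulated dialogue practice, and can evaluate the content of the user's response to a question, the response time, and the like.
[0056]
In addition, users can review their own question-and-answer session as if they were the other party, and by pointing out words considered inappropriate in a dialogue and setting up situations where the user is stuck for a response, the invention enables an experience much like the real thing. In particular, users can practice responding smoothly and choosing their words for school admission, job hunting, and the like.
[0057]
Today lifetime employment is no longer guaranteed, and many books on overcoming difficulties with dialogue are published as a skill for promoting oneself in a short time. The present invention lets users have an experience close to an actual dialogue. It also makes it possible to incorporate the advice found in dialogue practice books and the knowledge of people expert in dialogue, and it gives an opportunity to practice to people who are capable in their technical field or desired line of work but are not good at speaking in front of others, so that a service well matched to the dialogue-related market can be built. As a further application, the system can be applied to sales presentation training and business manners training, and the market is immeasurable.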
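[Brief description of the drawings]
FIG. 1 is a system configuration diagram of the present invention.
FIG. 2 is a functional configuration diagram of the dialogue evaluation device 1 of the present invention.
FIG. 3 is a sequence diagram between the dialogue evaluation device and the terminal according to the present invention.
FIG. 4 is a flowchart of the processing performed when the dialogue evaluation device 1 of the present invention receives audiovisual information from the terminal 2.
FIG. 5 is a flowchart of the processing performed when the dialogue evaluation device 1 of the present invention receives an evaluation record request message.
FIG. 6 is a flowchart of the scenario information in the question information database 12.
[Description of reference signs]
1 Dialogue evaluation device
11 Audiovisual information database, audiovisual information storage means
12 Question information database, question information storage means
13 Time stamp unit
14 Voice recognition processing unit, voice recognition processing means
15 Dialogue information database, dialogue information storage means
16 Dialogue evaluation unit
17 Evaluation record generation unit, evaluation record generating means
18 Specific term database, specific term storage means
19 Communication interface unit
2 Terminal
21 Microphone
22 Video camera
23 Speaker
3 Internet
[0001]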
[Technical field of the invention]
The present invention relates to a dialogue practice support device, method, and program for practicing dialogue in a simulated manner.
[0002]
[Prior art]
Computer-based conversation manner education systems already exist (see, for example, Patent Document 1). Such a system takes a user's free utterance as input, converts it into text information, and evaluates it by comparing the text of the user's utterance with the text of a correct utterance. The overall evaluation is then output to a display as image information. In this way the user can learn grammatical knowledge of honorifics, knowledge of how honorifics are used, conversational situation information, and the like.
[0003]
[Patent Document 1]
Japanese Patent No. 2673831
[0004]
[Problems to be solved by the invention]
However, the conventional system evaluates a one-sided utterance by the user and is not interactive in the way an actual dialogue is. It therefore could not select the next question according to the content of the response to a question, nor evaluate the silent time when the user was stuck for a response.
[0005]
Accordingly, an object of the present invention is to provide a dialogue practice support device, method, and program that are interactive, conducted by voice and video like an actual simulated dialogue practice, and that evaluate the content of the user's response to a question, the response time, and the like.
[0006]
[Means for Solving the Problems]
The dialogue practice support device of the present invention is characterized by comprising:
question information storage means containing scenario information that, for question information consisting of audiovisual information of an interviewer, branches according to the terms contained in response information consisting of audiovisual information of the user and the silent time until that response information is received, and selects the question information to be transmitted next;
means for transmitting question information to a terminal on the basis of the scenario information;
audiovisual information storage means for storing the response information received from the terminal;
term storage means storing specific terms;
voice recognition processing means for converting the voice information in the user's audiovisual information into text information and marking the relevant part of the text information to indicate that a specific term is contained or that a silent time occurred;
dialogue information storage means for storing the text information marked by the voice recognition processing means;
dialogue evaluation means for extracting the marked parts from the dialogue information storage means when an evaluation record request is received from the terminal; and
evaluation record generating means for transmitting to the terminal evaluation record information consisting of the question information extracted from the question information storage means, the response information extracted from the audiovisual information storage means, and information indicating that a specific term is used, or that a silent time occurred, at the marked parts.
[0007]
According to another embodiment of the dialogue practice support device of the present invention,
The scenario information stored in the question information storage means includes evaluation point information according to the branch,
The dialogue evaluation means adds evaluation point information according to a branch following the scenario information,
It is preferable that the evaluation record generating means further transmits the evaluation point information to the terminal.
[0008]
According to another embodiment of the dialogue practice support device of the present invention,
It is also preferable that the voice recognition processing means collects passage statistics for the branches in the scenario information and stores the passage statistics in the question information storage means.
[0009]
According to the dialogue practice support method of the present invention,
the device has question information storage means containing scenario information that, for question information consisting of audiovisual information of an interviewer, branches according to the terms contained in response information consisting of audiovisual information of the user and the silent time until that response information is received, and selects the question information to be transmitted next, and term storage means storing specific terms, and the method is characterized by comprising:
a first step in which the device transmits question information to the terminal on the basis of the scenario information;
a second step in which the terminal transmits response information for the question information to the device;
a third step in which the device stores the response information in audiovisual information storage means;
a fourth step in which the device performs voice recognition processing, converting the voice information in the user's audiovisual information into text information and marking the relevant part of the text information to indicate that a specific term is contained or that a silent time occurred; and
a fifth step in which the device stores the text information marked by the voice recognition processing means in dialogue information storage means;
the first to fifth steps being repeated; and further comprising:
a sixth step in which the terminal transmits an evaluation record request to the device;
a seventh step in which the device extracts the marked parts from the dialogue information storage means; and
an eighth step in which the device transmits to the terminal evaluation record information consisting of the question information extracted from the question information storage means, the response information extracted from the audiovisual information storage means, and information indicating that a specific term is used, or that a silent time occurred, at the marked parts.
[0010]
Further, according to the dialogue practice support method of the present invention,
the device has question information storage means containing scenario information that, for question information consisting of audiovisual information of an interviewer, branches according to the terms contained in response information consisting of audiovisual information of the user and the silent time until that response information is received, and selects the question information to be transmitted next, and term storage means storing specific terms,
the terminal also has question information storage means, and the method is characterized by comprising:
a first step in which the terminal collects a series of response information by repeatedly outputting question information to the user on the basis of the scenario information and accepting input of response information for that question information;
a second step in which the terminal transmits the response information to the device;
a third step in which the device stores the response information in audiovisual information storage means;
a fourth step in which the device performs voice recognition processing, converting the voice information in the user's audiovisual information into text information and marking the relevant part of the text information to indicate that a specific term is contained or that a silent time occurred;
a fifth step in which the device stores the text information marked by the voice recognition processing means in dialogue information storage means;
a sixth step in which the terminal transmits an evaluation record request to the device;
a seventh step in which the device extracts the marked parts from the dialogue information storage means; and
an eighth step in which the device transmits to the terminal evaluation record information consisting of the question information extracted from the question information storage means, the response information extracted from the audiovisual information storage means, and information indicating that a specific term is used, or that a silent time occurred, at the marked parts.
[0011]
According to another embodiment of the dialogue practice support method of the present invention,
The scenario information stored in the question information storage means includes evaluation point information according to the branch,
The fifth step is to add the evaluation point information according to the branch following the scenario information,
In the sixth step, it is preferable that the evaluation point information is further transmitted to the terminal.
[0012]
According to another embodiment of the dialogue practice support method of the present invention,
It is also preferable that the voice recognition processing means collects passage statistics for the branches in the scenario information and stores the passage statistics in the question information storage means.
[0013]
The dialogue practice support program of the present invention is characterized by causing a computer to function as:
question information storage means containing scenario information that, for question information consisting of audiovisual information of an interviewer, branches according to the terms contained in response information consisting of audiovisual information of the user and the silent time until that response information is received, and selects the question information to be transmitted next;
means for transmitting question information to a terminal on the basis of the scenario information;
audiovisual information storage means for storing the response information received from the terminal;
term storage means storing specific terms;
voice recognition processing means for converting the voice information in the user's audiovisual information into text information and marking the relevant part of the text information to indicate that a specific term is contained or that a silent time occurred;
dialogue information storage means for storing the text information marked by the voice recognition processing means;
dialogue evaluation means for extracting the marked parts from the dialogue information storage means when an evaluation record request is received from the terminal; and
evaluation record generating means for transmitting to the terminal evaluation record information consisting of the question information extracted from the question information storage means, the response information extracted from the audiovisual information storage means, and information indicating that a specific term is used, or that a silent time occurred, at the marked parts.
[0014]
According to another embodiment of the dialogue practice support program of the present invention,
The scenario information stored in the question information storage means includes evaluation point information according to the branch,
The dialogue evaluation means adds evaluation point information according to a branch following the scenario information,
It is also preferable that the evaluation record generating means causes the computer to further transmit the evaluation point information to the terminal.
[0015]
According to another embodiment of the dialogue practice support program of the present invention,
It is also preferable that the voice recognition processing means collects the passage statistical information of the branch in the scenario information and causes the computer to function so as to store the passage statistical information in the question information storing means.
[0016]
[Embodiments of the invention]
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
[0017]
FIG. 1 is a system configuration diagram according to the present invention.
[0018]
According to FIG. 1, a dialogue evaluation device 1 of the present invention and a user terminal 2 are connected via the Internet 3. The terminal 2 is provided with a microphone 21 for acquiring audio information, a video camera 22 for acquiring video information, and a speaker 23 for outputting the voice of the interviewer.
[0019]
The terminal 2 receives the question information transmitted from the dialogue evaluation device 1. The question information includes audio-video information, and the terminal 2 displays the video information by the browser and outputs the audio by the speaker 23. Further, the utterance of the user is acquired by the microphone 21, and the video information of the user is acquired by the video camera 22. The terminal 2 transmits the audio-video information to the dialogue evaluation device 1.
[0020]
Here, the audiovisual information transmitted by the terminal 2 to the dialogue evaluation device 1 includes information transmitted individually and in real time for each question, and information transmitted collectively for a plurality of questions. In the latter case, the terminal 2 needs to receive and accumulate the question information from the dialogue evaluation device 1 in advance.
[0021]
Further, the terminal 2 can request the dialogue evaluation device 1 for a dialogue evaluation record. Regarding the received evaluation record information, the video information is displayed by the browser, and the audio information is output from the speaker 23.
[0022]
FIG. 2 is a functional configuration diagram of the dialog evaluation device 1 according to the present invention.
[0023]
As shown in FIG. 2, the dialogue evaluation device 1 has an audiovisual information database 11, a question information (scenario) database 12, a time stamp unit 13, a voice recognition processing unit 14, a dialogue information database 15, a dialogue evaluation unit 16, an evaluation record generation unit 17, a specific term database 18, and a communication interface 19.
[0024]
The device is connected to the Internet 3 via the communication interface 19.
[0025]
The question information database 12 stores the question information to be transmitted to the user's terminal 2. The question information consists of scenario information for conducting the simulated dialogue practice and the audiovisual information of the interviewer. For example, for a simulated job interview, the question information is organized into patterns such as information-related companies and biotechnology-related companies.
[0026]
The scenario information is a flowchart whose branch conditions are the terms contained in the response to a question. A branch condition based on the scenario information also arises when a silent time of a certain length occurs between the end of the question and the start of the response, that is, when the user is stuck for a response. Conditions may include, for example, multiple terms combined with operators (AND, OR, and so on), handling of rephrasing, and detection of sentence endings or verbal habits. In particular, the term uttered last can be selected. For example, if the user says "not the sales department, but the accounting department", "accounting department" can be taken as the key term for the scenario branch. These scenarios are implemented as programs and are easy to change and revise.
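The branch conditions described here (terms combined with AND or OR, and taking the last-uttered term as the key after a rephrasing) might be sketched as follows. The helper names `matches_all`, `matches_any`, and `last_mentioned` are hypothetical, and plain substring matching is an assumption standing in for whatever the recognizer actually outputs.

```python
def matches_all(text, terms):
    """AND condition: every term appears in the response text."""
    return all(term in text for term in terms)


def matches_any(text, terms):
    """OR condition: at least one term appears in the response text."""
    return any(term in text for term in terms)


def last_mentioned(text, candidates):
    """Return the candidate whose final occurrence comes latest, or None."""
    best, best_pos = None, -1
    for term in candidates:
        pos = text.rfind(term)  # -1 when the term is absent
        if pos > best_pos:
            best, best_pos = term, pos
    return best


reply = "not the sales department but the accounting department"
key = last_mentioned(reply, ["sales department", "accounting department"])
```

Here `key` is the term mentioned last, which matches the rephrasing example in the text where the accounting department becomes the branch key.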
[0027]
The flowchart of the scenario information also awards evaluation points, which can be used as the final dialogue evaluation score. For example, points can be deducted according to the silent time, with longer silences deducting more.
[0028]
It is also preferable to store, as scenario information, ideal responses assumed in advance. By transmitting the ideal response information for the question information to the terminal 2, the user can not only see objectively how he or she responded but also understand, with reasons, what went badly and what went well, and learn the ideal pattern, which improves convenience for the user.
[0029]
The time stamp unit 13 attaches a time to the audiovisual information received by the communication interface 19.
[0030]
The received audiovisual information is stored per user in the audiovisual information database 11, with the time attached by the time stamp. The video information is not limited to that captured by the video camera 22 of the terminal 2; it may be material with a visual effect, such as document data or projection data that the user uses in a presentation.
[0031]
The voice information stored in the audiovisual information database 11 is then converted into text information by the voice recognition processing unit 14. In doing so, the voice recognition processing unit 14 uses the specific term database 18 and the question information database 12.
[0032]
The specific term database 18 stores words that are best avoided during a dialogue (terms that serve as branch keywords). The converted text information is searched for the specific terms stored in the specific term database 18. A specific term is, for example, a word that is best avoided during a dialogue. The text information, with the parts containing specific terms marked, is stored in the dialogue information database 15.
[0033]
The voice recognition processing unit 14 also calculates the silent time between the end of the question and the start of the response and, when it determines on the basis of the scenario information that a predetermined silent time has occurred, marks that point in the text information. Such a silent time means, for example, that the user could not think of a response and took time to reply. The text information is then stored in the dialogue information database 15.
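The marking of specific terms and silent time can be illustrated with a small data structure. This is only a sketch under assumptions: `MarkedReply` and `mark_reply` are hypothetical names, the mark tuples `(kind, detail, position)` are an invented layout, and the silence limit would in practice come from the scenario information.

```python
from dataclasses import dataclass, field


@dataclass
class MarkedReply:
    text: str
    received_at: float
    marks: list = field(default_factory=list)  # (kind, detail, position)


def mark_reply(text, received_at, silent_sec, specific_terms, silence_limit_sec):
    """Mark every occurrence of a specific term, plus the silence if it is long."""
    reply = MarkedReply(text, received_at)
    for term in specific_terms:
        if not term:
            continue  # an empty term would match everywhere
        start = text.find(term)
        while start != -1:
            reply.marks.append(("term", term, start))
            start = text.find(term, start + len(term))
    if silent_sec >= silence_limit_sec:
        # The silence precedes the reply, so it is marked at position 0.
        reply.marks.append(("silence", silent_sec, 0))
    return reply
```

Storing the position of each mark alongside the receive time is what allows the marked part to be found again when an evaluation record is requested.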
[0034]
For the silent time, the start and end of an utterance are determined by comparing the input voice level with a threshold. The start of an utterance means, for example, that the level stays at or above the threshold for at least a predetermined length of time (for example, 2 seconds). The end of an utterance means, for example, that the level stays at or below the threshold for at least a predetermined length of time (for example, 2 seconds).
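The threshold rule for the start of an utterance can be sketched as below. The function name `find_utterance_start`, the fixed frame length, and the assumption that `levels` begins at the moment the question ends (so the returned value is the silent time) are all illustrative choices, not details given in the source.

```python
def find_utterance_start(levels, threshold, frame_sec, min_sec=2.0):
    """Return the time in seconds at which a run of frames at or above
    `threshold`, lasting at least `min_sec`, begins; None if there is none."""
    need = max(1, int(round(min_sec / frame_sec)))  # frames required in a row
    run = 0
    for i, level in enumerate(levels):
        run = run + 1 if level >= threshold else 0
        if run >= need:
            return (i - need + 1) * frame_sec
    return None
```

Requiring the level to persist for `min_sec` keeps a brief cough or click from being treated as the start of the reply. The end of an utterance can be found the same way with the comparison reversed.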
[0035]
Furthermore, the voice recognition processing unit 14 collects passage statistics for the branches in the scenario information and stores them in the question information storage means. This can be done by incrementing a passage count at each branch of the scenario flowchart. When the same scenario is run with many users, statistics can be obtained showing, for example, that most users branch at a particular branch point.
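Counting branch passages could be as simple as the following sketch. The names `branch_counts`, `record_branch`, and `most_common_outcome`, and the key layout `(scenario_id, step, outcome)`, are assumptions made for illustration.

```python
from collections import Counter

branch_counts = Counter()  # (scenario_id, step, outcome) -> number of passages


def record_branch(scenario_id, step, outcome):
    """Increment the passage count for one branch outcome."""
    branch_counts[(scenario_id, step, outcome)] += 1


def most_common_outcome(scenario_id, step):
    """Return the outcome most users took at this branch, or None if unseen."""
    tallies = {
        outcome: n
        for (sid, st, outcome), n in branch_counts.items()
        if sid == scenario_id and st == step
    }
    return max(tallies, key=tallies.get) if tallies else None
```

After many users have run the same scenario, a query such as `most_common_outcome("guided", "S3")` shows which way most of them branched.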
[0036]
After the end of the simulated dialogue, the terminal 2 can request the dialogue evaluation device 1 for an evaluation record. At this time, the terminal 2 transmits an evaluation record request message to the dialog evaluation device 1. Here, regardless of the entire dialogue, it is also possible to request an evaluation record for only a part of the time, for example, as in a one-point lesson.
[0037]
The dialogue evaluation unit 16 acquires the evaluation record request message received by the communication interface 19. It then extracts the question information and/or the statistical information from the question information database 12, and extracts the marked portions from the dialogue information database 15.
[0038]
The evaluation record generation unit 17 searches the audio-video information database 11 using the mark positions and time information of the text information retrieved by the dialogue evaluation unit 16, and acquires the audio-video information corresponding to that time information. It then generates an evaluation record in HTML format that includes the audio-video information, the question information and/or the statistical information, the advice information corresponding to the mark positions, and the evaluation point information. The evaluation record information is returned to the terminal 2 that transmitted the evaluation record request message.
[0039]
The terminal 2 that transmitted the evaluation record request message can reproduce the evaluation record information. At this time, the browser displays the video of the interviewer, the video of the user, advice items concerning inappropriate responses and occurrences of silence, and the evaluation point information.
[0040]
For example, the following items are displayed as the advice items.
(1) When a specific term was used: "During the dialogue, you used words that should not be used ○ times. Say '△△△' instead."
(2) When a silence time occurred: "During the dialogue, you were stuck for a response in ○ places. You were stuck on the question ○○○. Prepare a response such as ×××."
[0041]
These can be easily realized by counting the number of times an inappropriate term or the like is detected by the voice recognition processing unit 14.
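The counting described above can be sketched as follows. The mark layout (dictionaries with a `kind` key, and a `question` key on silence marks), the function name, and the exact English wording of the messages are assumptions; the placeholder defaults mirror the ○/△/× placeholders in the advice items.

```python
def build_advice(marks, preferred_phrase="△△△", sample_response="×××"):
    """Build advice items from the marks recorded during one dialogue.

    marks: list of dicts with a 'kind' key ('term' or 'silence'); silence
    marks also carry the 'question' at which the user was stuck.
    """
    advice = []
    term_count = sum(1 for m in marks if m["kind"] == "term")
    stuck = [m["question"] for m in marks if m["kind"] == "silence"]
    if term_count:
        advice.append(
            f"During the dialogue, you used words that should not be used "
            f"{term_count} time(s). Say \"{preferred_phrase}\" instead."
        )
    if stuck:
        advice.append(
            f"During the dialogue, you were stuck for a response in "
            f"{len(stuck)} place(s), on: {'; '.join(stuck)}. "
            f"Prepare a response such as \"{sample_response}\"."
        )
    return advice
```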
[0042]
As another embodiment of FIG. 2, there is a case where the scenario information stored in the question information database 12 is transmitted to the terminal 2 in advance. The terminal 2 advances a series of dialogues based on the scenario information, and accumulates audio-video information of the user. After the end of the series of dialogues, the terminal 2 transmits the audiovisual information to the dialogue evaluation device 1. Subsequent processing in the dialogue evaluation device 1 is the same as that described above.
[0043]
FIG. 3 is a sequence diagram between the dialogue evaluation device and the terminal according to the present invention.
[0044]
(S1) The terminal 2 transmits a dialogue evaluation start request message to the dialogue evaluation device 1. At this time, the message includes a lesson number, a user ID, and the like.
(S2) The dialogue evaluation device 1 transmits question information, in the form of audio-video information, from the question information database 12 to the terminal 2. The terminal 2 displays the video of the interviewer in a browser and outputs the question audio through the speaker 23. The user can thus carry out the simulated dialogue while treating the interviewer shown on the screen as an actual interviewer. It is also preferable to display the user's own image or the user's resume at this time.
(S3) The terminal 2 transmits the voice information of the user's response obtained by the microphone 21 and the video information of the user obtained by the video camera 22 to the dialogue evaluation device 1.
By repeating the above-described sequence of (S2) and (S3), the dialog information is accumulated in the dialog evaluation device 1.
[0045]
(S4) After the end of the dialogue simulation practice, the user uses the terminal 2 to transmit a dialogue evaluation request message to the dialogue evaluation device 1.
(S5) The dialogue evaluation device 1 transmits the dialogue evaluation record information to the terminal 2. The dialogue evaluation record information includes the audio-video information of the interviewer, the audio-video information of the user, the advice information, and the dialogue evaluation point information. The terminal 2 displays the video information, the advice information, and the point information in a browser, and outputs the audio information through the speaker. As a result, the user can watch his or her own dialogue and objectively assess his or her responses as if from the interviewer's position.
[0046]
FIG. 4 is a flowchart when the dialogue evaluation device 1 according to the present invention receives audiovisual information from the terminal 2.
[0047]
According to FIG. 4, the process proceeds in the following sequence.
(S41) Audio-video information is received from the terminal 2 as a response to the question.
(S42) The current time is stamped on the audio / video information. If the time is stamped on the file at the terminal, there is no need to stamp the time here.
(S43) The audiovisual information is stored in the audiovisual information database 11 for each user.
(S44) The audio information in the audio-video information is converted into text information by voice recognition processing. At this time, based on the scenario information stored in the question information database 12, the text is searched for the specific terms stored in the specific term database 18, and any occurrences are marked. In addition, the passage statistical information can be added to the branch parts of the flowchart of the scenario information.
(S45) The text information is stored in the dialogue information database 15 for each user.
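Steps (S41) to (S45) can be sketched as a single handler. This is only an outline under stated assumptions: the databases are plain dictionaries keyed by user, `recognize` is a stand-in callable for the voice recognition processing (no real recognizer is used), and marking is reduced to a substring test.

```python
import time


def handle_response(user_id, av_data, recognize, specific_terms,
                    av_db, text_db, timestamp=None):
    """Process one response received from the terminal (S41)."""
    # S42: stamp the current time unless the terminal already stamped it
    stamped = timestamp if timestamp is not None else time.time()
    # S43: store the audio-video information per user
    av_db.setdefault(user_id, []).append((stamped, av_data))
    # S44: convert audio to text and mark the specific terms that occur
    text = recognize(av_data)
    found = [term for term in specific_terms if term in text]
    # S45: store the marked text information per user
    text_db.setdefault(user_id, []).append(
        {"time": stamped, "text": text, "marks": found}
    )
    return found
```

Keeping the same time stamp on both records is what later allows the audio-video segment for a marked portion of text to be looked up.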
[0048]
FIG. 5 is a flowchart when the dialogue evaluation device 1 according to the present invention receives an evaluation record request message.
[0049]
According to FIG. 5, the process proceeds in the following sequence.
(S51) An evaluation record request message is received from a terminal.
(S52) The dialogue information database 15 is searched for the text information.
(S53) The time information of the searched text information is specified.
(S54) Audio-video information corresponding to the specified time information is acquired from the audio-video information database 11.
(S55) An evaluation record including the acquired audiovisual information is generated. Here, the evaluation record is in the HTML format. Thereby, a multimedia evaluation record can be provided. The evaluation record also includes the advice information of the evaluation content marked on the text information.
(S56) The generated evaluation record is transmitted to the terminal 2 that has transmitted the evaluation record request message.
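Step (S55) can be sketched as follows. The function name, the clip paths, and the use of the HTML `<video>` element are assumptions for illustration; the source says only that the evaluation record is in HTML format and includes the audio-video information and the advice information.

```python
from html import escape


def build_evaluation_record(question_clip, answer_clip, advice_items, points):
    """Return an HTML evaluation record as a string."""
    # Escape advice text so that recognized speech cannot inject markup
    advice_html = "\n".join(
        f"    <li>{escape(item)}</li>" for item in advice_items
    )
    return (
        "<html><body>\n"
        f'  <video src="{escape(question_clip)}" controls></video>\n'
        f'  <video src="{escape(answer_clip)}" controls></video>\n'
        f"  <ul>\n{advice_html}\n  </ul>\n"
        f"  <p>Evaluation points: {int(points)}</p>\n"
        "</body></html>"
    )
```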
[0050]
FIG. 6 is a flowchart of the scenario information in the question information database 12.
[0051]
For example, interviews include a guided interview type and a pressure interview type. In the guided interview type, the scenario steers the applicant toward the points the interviewer wants to confirm. In the pressure interview type, on the other hand, the scenario deliberately asks the applicant questions that are difficult to answer in order to confirm the applicant's real intentions. In the following, guided interview type scenario information (in which the scenario branches on extracted terms) and pressure interview type scenario information (in which the scenario branches when the applicant is stuck for a response) will be described.
[0052]
The guided interview type scenario information will be described first. The scenario described as an example is for applicants to a research post in a field the company is pursuing; in selecting researchers, applicants are sorted according to the skills required of a basic researcher and their hopes for the future.
(S1) The question information "Please give an outline of your master's thesis." is transmitted to the terminal 2.
(S2) A response is received from the terminal 2.
(S3) It is determined whether the response includes "quantum computing" or "DNA", research themes that the company works on. If YES, +2 points are added to the point information.
(S4) If YES, the question information "What sustained you in your research?" is transmitted to the terminal 2 to confirm the applicant's qualities as a basic researcher.
(S5) A response is received from the terminal 2.
(S6) It is determined whether the response includes the term "self", as in having the perseverance to believe in oneself and carry the work through. If YES, +2 points are added to the point information.
(S7) If YES, the question information "Do you want to continue basic research in the future?" is transmitted to the terminal 2.
(S8) If NO, it is determined whether the response includes the term "member", as in having carried the work through by trusting the other members. Because a researcher needs the ability to pursue a subject individually more than coordination with members, +2 points are added to the point information if NO.
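Steps (S1) to (S8) can be sketched as a small branching function. The `ask` callable (which sends a question and returns the recognized reply text), the function name, and the decision to end the scenario when (S3) is NO are assumptions; the source does not describe the NO branch at (S3).

```python
def run_guided_scenario(ask):
    """Run the guided interview scenario and return the point total."""
    points = 0
    reply = ask("Please give an outline of your master's thesis.")   # S1, S2
    if not any(t in reply for t in ("quantum computing", "DNA")):     # S3
        return points  # assumption: NO branch at S3 is not in the source
    points += 2
    reply = ask("What sustained you in your research?")               # S4, S5
    if "self" in reply:                                               # S6
        points += 2
        ask("Do you want to continue basic research in the future?")  # S7
    elif "member" not in reply:                                       # S8
        points += 2
    return points
```

A scripted user can be simulated with an iterator, for example `replies = iter([...])` and `run_guided_scenario(lambda q: next(replies))`.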
[0053]
Next, the pressure interview type scenario information will be described. In the scenario described as an example, when the applicant hopes for a particular kind of work, a question that the applicant is expected to find difficult to answer is deliberately asked.
(S9) To confirm the intentions of an applicant who wishes to join the accounting department and is a promising candidate for it, the question information "What if you are assigned to the sales department?", which runs contrary to that wish, is transmitted to the terminal 2.
(S10) It is determined whether the silence lasts 5 seconds or longer. If YES, the applicant is at a loss for an answer to this question. If NO, the applicant has replied without getting stuck, and +1 point is added to the point information.
(S11) If NO, a response is received.
(S12) It is determined whether the response includes the terms "your company" and "loss", as in "it would be a loss for your company if I were assigned to the sales department". If YES, the applicant is judged to be a promising candidate, and +2 points are added to the point information.
(S13) If YES at (S10), a response is received.
(S14) Because the applicant got stuck, it is determined whether the response includes terms such as "other company" or "other department", suggesting that the applicant might accept another department or choose another company.
(S15) If NO, the question information "What contribution can you make to the company as an individual?" is transmitted to the terminal 2.
(S16) It is determined whether the silence lasts 5 seconds or longer. If YES, the applicant is at a loss for an answer to this question. If NO, the applicant has replied without getting stuck, and +1 point is added to the point information.
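Steps (S9) to (S16) can be sketched in the same style, with the silence time driving the branch. Here the hypothetical `ask` callable returns a pair `(silence_sec, reply_text)`; ending the scenario when (S14) is YES is an assumption, since the source does not describe that branch.

```python
def run_pressure_scenario(ask, limit_sec=5.0):
    """Run the pressure interview scenario and return the point total."""
    points = 0
    silence, reply = ask("What if you are assigned to the sales department?")  # S9
    if silence < limit_sec:                                  # S10: NO
        points += 1
        if "your company" in reply and "loss" in reply:      # S11, S12
            points += 2
        return points
    # S10: YES, the applicant was stuck (S13 receives the reply)
    if any(t in reply for t in ("other company", "other department")):  # S14
        return points  # assumption: YES branch at S14 is not in the source
    silence, _ = ask(
        "What contribution can you make to the company as an individual?"
    )                                                        # S15
    if silence < limit_sec:                                  # S16: NO
        points += 1
    return points
```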
[0054]
Each step of the dialogue practice support method of the present invention can be executed by control means such as a CPU using a recording medium built into a computer. A computer-readable program can also be installed via a recording medium such as a CD or via a communication line and executed by the computer. The program may be implemented mainly as a program installed on a device on the Internet and provided as a function of that device. These functions can of course also be realized by a program installed on the terminal and used in a peer-to-peer configuration.
[0055]
[Effect of the Invention]
As described above in detail, according to the dialogue practice support method, apparatus, and program of the present invention, dialogue practice is conducted interactively by audio and video, as in actual simulated dialogue practice, and the content of the user's responses to questions, the response time, and the like can be evaluated.
[0056]
In addition, the user can review his or her own questions and answers from the other party's position, have words considered inappropriate in a dialogue pointed out, and practice with cases set up in which the user is at a loss for a response, so that an experience close to the real thing becomes possible. In particular, the user can practice answering smoothly and choosing appropriate words for dialogues such as those for school admission and employment.
[0057]
Today, lifetime employment is no longer guaranteed, and many books on how to get through such dialogues are on sale, treating the dialogue as a technique for promoting oneself in a short time. According to the present invention, the user can have a simulated experience close to an actual dialogue. In addition, the know-how described in dialogue practice books and the knowledge of dialogue experts can be incorporated. This gives an opportunity for dialogue practice to people who have the skills and abilities for the work they want but are not good at speaking in front of others, so services well suited to the dialogue-related market can be built. Further, as an application, the present system can be used for sales presentation training and business manner training, so the potential market is large.
[Brief description of the drawings]
FIG. 1 is a system configuration diagram according to the present invention.
FIG. 2 is a functional configuration diagram of the dialogue evaluation device 1 according to the present invention.
FIG. 3 is a sequence diagram between a dialogue evaluation device and a terminal according to the present invention.
FIG. 4 is a flowchart when the dialogue evaluation device 1 of the present invention receives audiovisual information from the terminal 2.
FIG. 5 is a flowchart when the dialogue evaluation device 1 according to the present invention receives an evaluation record request message.
FIG. 6 is a flowchart of scenario information in a question information database 12.
[Explanation of symbols]
1 Dialogue evaluation device
11 Audio / video information database, audio / video information storage means
12 Question information database, question information storage means
13 Time stamp unit
14 Voice recognition processing unit, voice recognition processing means
15 Dialogue information database, dialogue information storage means
16 Dialogue evaluation unit, dialogue evaluation means
17 Evaluation record generation unit, evaluation record generation means
18 Specific term database, specific term storage means
19 Communication interface
2 Terminal
21 Microphone
22 Video camera
23 Speaker
3 Internet

Claims

1. A dialogue practice support device connected to a terminal via the Internet, comprising:
question information storage means for storing question information consisting of audio-video information of an interviewer, and scenario information that branches according to a term included in response information consisting of audio-video information of a user and according to a silence time until the response information is received, and thereby selects the question information to be transmitted next;
means for transmitting the question information to the terminal based on the scenario information;
audio-video information storage means for storing the response information received from the terminal;
term storage means for storing specific terms;
voice recognition processing means for converting the audio information in the audio-video information of the user into text information, and marking the corresponding portion of the text information to indicate that the specific term is included or that the silence time occurred;
dialogue information storage means for storing the text information marked by the voice recognition processing means;
dialogue evaluation means for extracting the marked portion from the dialogue information storage means when an evaluation record request is received from the terminal; and
evaluation record generation means for transmitting to the terminal evaluation record information consisting of the question information extracted from the question information storage means, the response information extracted from the audio-video information storage means, and information indicating that the specific term was used or that the silence time occurred at the marked portion.

2. The dialogue practice support device according to claim 1, wherein
the scenario information stored in the question information storage means includes evaluation point information corresponding to each branch,
the dialogue evaluation means adds up the evaluation point information according to the branches followed in the scenario information, and
the evaluation record generation means further transmits the evaluation point information to the terminal.

3. The dialogue practice support device according to claim 1, wherein the voice recognition processing means collects passage statistical information on the branches in the scenario information and stores the passage statistical information in the question information storage means.

4. A dialogue practice support method for a terminal and a device connected to the terminal via the Internet, wherein
the device has question information storage means for storing question information consisting of audio-video information of an interviewer, and scenario information that branches according to a term included in response information consisting of audio-video information of a user and according to a silence time until the response information is received, and thereby selects the question information to be transmitted next, and term storage means for storing specific terms, the method comprising:
a first step in which the device transmits the question information to the terminal based on the scenario information;
a second step in which the terminal transmits response information to the question information to the device;
a third step in which the device stores the response information in audio-video information storage means;
a fourth step in which the device performs voice recognition processing that converts the audio information in the audio-video information of the user into text information and marks the corresponding portion of the text information to indicate that the specific term is included or that the silence time occurred;
a fifth step of storing the text information marked by the voice recognition processing in dialogue information storage means, the first to fifth steps being repeated;
a sixth step in which the terminal transmits an evaluation record request to the device;
a seventh step in which the device extracts the marked portion from the dialogue information storage means; and
an eighth step in which the device transmits to the terminal evaluation record information consisting of the question information extracted from the question information storage means, the response information extracted from the audio-video information storage means, and information indicating that the specific term was used or that the silence time occurred at the marked portion.

5. A dialogue practice support method for a terminal and a device connected to the terminal via the Internet, wherein
the device has question information storage means for storing question information consisting of audio-video information of an interviewer, and scenario information that branches according to a term included in response information consisting of audio-video information of a user and according to a silence time until the response information is received, and thereby selects the question information to be transmitted next, and term storage means for storing specific terms, and
the terminal also has the question information storage means, the method comprising:
a first step in which the terminal outputs the question information to the user based on the scenario information and repeatedly accepts response information to the question information, thereby collecting a series of response information;
a second step in which the terminal transmits the response information to the device;
a third step in which the device stores the response information in audio-video information storage means;
a fourth step in which the device performs voice recognition processing that converts the audio information in the audio-video information of the user into text information and marks the corresponding portion of the text information to indicate that the specific term is included or that the silence time occurred;
a fifth step of storing the text information marked by the voice recognition processing in dialogue information storage means;
a sixth step in which the terminal transmits an evaluation record request to the device;
a seventh step in which the device extracts the marked portion from the dialogue information storage means; and
an eighth step in which the device transmits to the terminal evaluation record information consisting of the question information extracted from the question information storage means, the response information extracted from the audio-video information storage means, and information indicating that the specific term was used or that the silence time occurred at the marked portion.

6. The dialogue practice support method according to claim 4, wherein
the scenario information stored in the question information storage means includes evaluation point information corresponding to each branch,
the fifth step adds up the evaluation point information according to the branches followed in the scenario information, and
the sixth step further transmits the evaluation point information to the terminal.

7. The dialogue practice support method according to claim 1, wherein the voice recognition processing means collects passage statistical information on the branches in the scenario information and stores the passage statistical information in the question information storage means.

8. A dialogue practice support program for a device connected to a terminal via the Internet, the program causing a computer to function as:
question information storage means for storing question information consisting of audio-video information of an interviewer, and scenario information that branches according to a term included in response information consisting of audio-video information of a user and according to a silence time until the response information is received, and thereby selects the question information to be transmitted next;
means for transmitting the question information to the terminal based on the scenario information;
audio-video information storage means for storing the response information received from the terminal;
term storage means for storing specific terms;
voice recognition processing means for converting the audio information in the audio-video information of the user into text information, and marking the corresponding portion of the text information to indicate that the specific term is included or that the silence time occurred;
dialogue information storage means for storing the text information marked by the voice recognition processing means;
dialogue evaluation means for extracting the marked portion from the dialogue information storage means when an evaluation record request is received from the terminal; and
evaluation record generation means for transmitting to the terminal evaluation record information consisting of the question information extracted from the question information storage means, the response information extracted from the audio-video information storage means, and information indicating that the specific term was used or that the silence time occurred at the marked portion.

9. The dialogue practice support program according to claim 8, wherein
the scenario information stored in the question information storage means includes evaluation point information corresponding to each branch,
the dialogue evaluation means adds up the evaluation point information according to the branches followed in the scenario information, and
the program causes the computer to function so that the evaluation record generation means further transmits the evaluation point information to the terminal.

10. The dialogue practice support program according to claim 8, wherein the program causes the computer to function so that the voice recognition processing means collects passage statistical information on the branches in the scenario information and stores the passage statistical information in the question information storage means.