JP3846161B2 - Information provision system by voice and its malfunction cause notification method

Info

Publication number: JP3846161B2
Application number: JP2000185167A
Authority: JP
Inventors: 浩長谷川
Original assignee: Seiko Epson Corp
Current assignee: Seiko Epson Corp
Priority date: 2000-06-20
Filing date: 2000-06-20
Publication date: 2006-11-15
Anticipated expiration: 2020-06-20
Also published as: JP2002007165A

Description

【０００１】
【発明の属する技術分野】
本発明はユーザからの音声コマンドを受け取って、その音声コマンドに対応した情報を音声によって当該ユーザに提供可能とした音声による情報提供システムおよびその誤動作原因報知方法に関する。
【０００２】
【従来の技術】
近年、音声認識技術や音声合成技術など音声に関する技術の進歩により、様々な分野で音声を用いた情報処理システムが提案されてきている。たとえば、システムへのコマンド入力を音声によって行うことができたり、システムからの応答も音声で可能なものもすでに開発されている。このような音声による情報処理システムは、特に、視覚障害を持つユーザや、手足の不自由なユーザにとってきわめて便利なものとなり、様々な分野での普及が期待されるところである。
【０００３】
その一つの例として、音声応答による視覚障害者への生活支援システムが提案されている（社団法人電子情報通信学会第２種研究会資料３３頁〜３９頁「音声応答による視覚障害者への生活支援システムの試作」静岡県立大学経営情報学部松浦健一湯瀬裕昭）。
【０００４】
このシステムは、概略的には図１に示すような構成となっていて、ユーザＵはヘッドホン３２とマイクロホン１１を装着し、無線通信などの通信手段を用いてシステム側との間で音声情報をやりとりすることができ、様々な情報を音声によって取得できるようになっている。その対話例を次に示す。なお、ユーザをＵ、システムをＳｙで表す。また、ここでいうシステムＳｙは、図１において、ユーザ側の端末ＴやホストコンピュータＨなどを総称したものを指している。
【０００５】
Ｕ：時計。
Ｓｙ：時計・カレンダ機能を利用できます。
Ｕ：いま何時ですか。
Ｓｙ：現在の時刻は午後４時３０分です。
Ｕ：来週の木曜日は何日ですか。
Ｓｙ：来週の木曜日は５月１０日です。
Ｕ：電卓で計算
Ｓｙ：電卓機能を利用できます。
Ｕ：９８０足す１９８では
Ｓｙ：９８０足す１９８は１１７８です。
Ｕ：辞書ひき
Ｓｙ：辞書機能を利用できます。
Ｕ：「積極的」
Ｓｙ：調べる単語は「せっきょくてき」ですか。
Ｕ：はいそうです。
【０００６】
このように、ユーザＵは情報を取得するためのコマンドを音声で発し、システムＳｙ側ではその音声コマンドを認識し、その認識結果に基づいた回答を音声によってユーザＵに返す。さらには、天気予報、ニュース、鉄道などの時刻表、テレビ番組などの生活情報を問い合わせることによって、それらの情報を得ることも可能となる。この場合、予め特定のwebページを指定しておいて、ユーザＵからの要求に応じて、ホストコンピュータＨが図１に示すサーバＳに接続してサーバＳがそのwebページから情報を取得し、それをユーザＵに知らせる。
【０００７】
【発明が解決しようとする課題】
上述したような音声による情報提供システムは、視覚障害を持つユーザは勿論、健常者にとっても有用なものとして期待されると考えられるが、まだ解決すべき問題点も多い。
【０００８】
たとえば、ユーザＵの入力した音声コマンドに対してシステムＳｙ側から何の応答もない場合や、全く見当違いの応答がなされる場合も多い。これは、システムＳｙ側における音声認識性能やアプリケーション上の問題ばかりでなく、ユーザの発話の仕方やマイクロホン１１などに問題がある場合もあり、また、通信環境に問題があることも考えられる。このように、システムＳｙ側から何の応答もなかったり、見当違いの応答がなされるなどの誤動作（以下では応答異常という）は、様々な原因によって引き起こされる。
【０００９】
ユーザは応答異常を起こした原因がわかれば、それに対応することは可能である。たとえば、発話した音声の音量が小さすぎる、早口で発話しすぎる、さらには、マイクロホンの電源スイッチが入っていないなど、原因がわかれば、それに対処することは可能である。
【００１０】
しかし、従来のこの種のシステムでは、その原因をユーザに知らせる手段は持っていないのが普通であり、ユーザは何が原因で応答異常となったのか、その原因を特定するのは大変難しく、結局は、同じ発話を繰り返したり、システムの状態を調べたりといった試行錯誤的な原因診断を行う必要がある。健常者であれば、色々な方法を用いて時間をかけて調べれば、その原因を特定できる場合もあるが、視覚障害などを持つユーザが原因を調べるのは困難であり、近くにいる人やこの種のシステムの専門家に原因を調べてもらうといったことが必要となってくる。
【００１１】
そこで本発明は、音声コマンドに対しシステムが誤動作した場合、何が原因なのかを自己診断し、その自己診断結果をユーザにわかりやすく知らせることを可能とし、特に自己診断結果を音声によってユーザに提示することにより、健常者のみならず、特に、視覚障害を持つユーザにとって使い勝手のよいシステムを提供することを目的としている。
【００１２】
【課題を解決するための手段】
上述した目的を達成するために本発明の音声による情報提供システムは、ユーザが所持する端末から当該ユーザが音声コマンドを入力して、その音声コマンドをホストコンピュータ側で受信し、当該ホストコンピュータでは受信した音声コマンドを認識し、その認識結果に対する応答を前記端末に送信する手段を有するとともに、前記音声コマンドに対するホストコンピュータ側からの応答に応答異常がある場合、応答異常となった原因を前記ユーザに対して報知する手段を有する音声による情報提供システムであって、前記音声コマンドに対するホストコンピュータ側からの応答に応答異常がある場合、応答異常となった原因を前記ユーザに対して報知する手段は、前記ホストコンピュータ側の応答異常をユーザが察知したときに当該ユーザによってなされる動作を検出することで自己診断開始信号を得る自己診断開始信号検出手段と、この自己診断開始信号検出手段からの自己診断開始指示信号を受けて自己診断を開始する自己診断手段と、この自己診断手段によって得られた自己診断結果をユーザに報知する報知手段とを有する構成としている。
【００１３】
このような音声による情報提供システムにおいて、前記音声コマンドに対して前記ホストコンピュータ側からの応答および前記自己診断結果のユーザへの報知は、ともに音声によって行うようにしている。
【００１４】
また、前記自己診断開始信号を得るに必要な前記ユーザによってなされる動作とは、前記端末に存在するマイクロホンに対し、ユーザがマイクロホンの動作状態を確認する動作である。
【００１５】
ここで、前記ユーザがマイクロホンの動作状態を確認する動作は、ユーザがマイクロホンに息を吹きかける動作、ユーザがマイクロホンに所定の音声を入力する動作、ユーザがマイクロホンを軽く叩く動作のうち、少なくとも１つの動作を用いるようにすることが考えられる。
【００１６】
そして、前記自己診断開始信号検出手段、前記自己診断手段、前記報知手段は、主として前記端末側に持たせるようにしている。
【００１７】
また、前記音声コマンドに対するホストコンピュータ側からの応答に応答異常がある場合、ユーザの入力した当該音声コマンドをユーザにフィードバックすることも可能である。
【００１８】
また、本発明の音声による情報提供システムの誤動作原因報知方法は、ユーザが所持する端末から当該ユーザが音声コマンドを入力して、その音声コマンドをホストコンピュータ側で受信し、当該ホストコンピュータでは受信した音声コマンドを認識し、その認識結果に対する応答を前記端末に送信する手段を有するとともに、前記音声コマンドに対するホストコンピュータ側からの応答に応答異常がある場合、応答異常となった原因を前記ユーザに対して報知する手段を有する音声による情報提供システムの誤動作原因報知方法であって、前記音声コマンドに対するホストコンピュータ側からの応答に応答異常がある場合、応答異常となった原因を前記ユーザに対して報知する手段は、前記ホストコンピュータ側の応答異常をユーザが察知したときに当該ユーザによってなされる動作を自己診開始動信号として検出することで自己診断開始動信号を得て、これによって自己診断機能を起動し、前記応答異常となった原因を調べるための自己診断を行い、その自己診断結果をユーザに報知するようにしている。
【００１９】
このような音声による情報提供システムの誤動作原因報知方法において、前記音声コマンドに対して前記ホストコンピュータ側からの応答および前記自己診断結果のユーザへの報知は、ともに音声によって行うようにしている。
【００２０】
また、前記自己診断開始信号を得るに必要な前記ユーザによってなされる動作とは、前記端末に存在するマイクロホンに対し、ユーザがマイクロホンの動作状態を確認する動作である。
【００２１】
ここで、前記ユーザがマイクロホンの動作状態を確認する動作は、ユーザがマイクロホンに息を吹きかける動作、ユーザがマイクロホンに所定の音声を入力する動作、ユーザがマイクロホンを軽く叩く動作のうち、少なくとも１つの動作を用いるようにすることが考えられる。
【００２２】
そして、前記自己診断開始信号を得る機能、前記自己診断を行う機能、自己診断結果をユーザに報知する機能は、主として前記端末側に持たせるようにしている。
【００２３】
また、この音声による情報提供システムの誤動作原因報知方法においても、前記音声コマンドに対するホストコンピュータ側からの応答に応答異常がある場合、ユーザの入力した当該音声コマンドをユーザにフィードバックすることも可能である。
【００２４】
このように本発明は、ユーザの入力した音声コマンドに対してホストコンピュータ（以下では単にホストという）側が応答異常を起こした場合、応答異常のあることをユーザが察知したときにユーザによってなされる動作を検知し、それによって、端末側で自己診断機能を起動し、その自己診断機能によって、異常となった原因を調べ、その結果をユーザに提示するようにしている。
【００２５】
つまり、ユーザの発した音声コマンドに対するホストコンピュータ側からの応答が明らかに応答異常（この応答異常というのは、前述したように、ユーザの発した音声コマンドに対して、何の応答もなかったり、見当違いの応答がなされるなどの誤動作を指す）であるとユーザが判断すると、ユーザは自然に何らかの動作を行うのが普通である。システムでは、そのユーザの動作を検知することによって、どこが悪くて応答異常となったかを自己診断し、その自己診断結果をユーザに対して報知する。
【００２６】
このように、応答異常に対してユーザが何らかの動作をとることによって、応答異常となった原因を自己診断して調べ、その自己診断結果をユーザに知らせることができるので、ユーザ側で応答異常となった原因を１つ１つ調べて行く必要がなくなる。また、このとき、自己診断結果を音声によって報知するようにすれば、視覚障害を有するユーザにとって便利なシステムとなることは勿論、健常者であっても診断結果をその場に居ながら耳から得ることができるので、他の仕事をしながら本情報提供システムを利用するような場合に便利なものとなる。
【００２７】
また、応答異常があることをユーザが知ったときにユーザによってなされる動作は、ユーザ側に存在するマイクロホンに対し、ユーザがマイクロホンの動作状態を確認する動作であって、具体的には、ユーザがマイクロホンに「フッ、フッ」というように息を吹きかける動作、「あー」というような音声を入力する動作、マイクロホンを軽く叩く動作などであり、これらの動作は、マイクロホンに向かって発話しようとするユーザが、何らかの異常を感じたときにごく自然に行う動作である。
【００２８】
このように、応答異常に対するユーザが行う自然な動作を自己診断機能を開始させるための信号として用いるので、ユーザが特別な操作を行うことなく、自動的に自己診断機能を働かせることができる。
【００２９】
なお、前記自己診断機能を開始させるための信号を得る機能、自己診断を行う機能、自己診断結果をユーザに報知する機能は、主に端末側に持たせるようにしている。これによって、前述の従来の情報提供システムの一例として説明した「音声応答による視覚障害者への生活支援システム」などの既存のシステムに本発明を適用しようとする場合、ホスト側をそれほど大幅に変更せずに本発明を実現することができるので、この種の既存の情報提供システムを有効活用することができる。
【００３０】
また、応答異常が起こって自己診断を行う際、ユーザの入力した音声コマンドをそのままユーザにフィードバックすることも可能であり、自分の発話した音声コマンドがそのままフィードバックされることによって、自分の入力した音声コマンドの状態を知ることができる。たとえば、フィードバックされた音声コマンドが、声が大きすぎて音が割れているようであれば、ユーザは自分の発話した声が大きすぎるために、ホスト側で適正な認識が行えなかったということがわかり、その点を注意して発話すればよいということを知る。これを何回か繰り返すことによってユーザはどのように発話すれば適正に音声認識されるかを学習することができる。
【００３１】
【発明の実施の形態】
以下、本発明の実施の形態について説明する。
【００３２】
図１は音声による情報提供システムの概略的な構成を示す図であり、大きくわけると、ユーザ側の端末Ｔとホストコンピュータ（前述したように、単にホストという）Ｈが存在する。なお、音声コマンドの内容によっては、外部の情報提供手段として、たとえば、ネットワークＮに接続されるサーバＳからも情報を取得する場合もある。
【００３３】
ユーザ側の端末Ｔは、図２に示すように、音声入力処理部１、自己診断処理部２、音声出力処理部３、無線通信部４などから構成される。なお、これら各構成要素については後に詳細に説明する。
【００３４】
ホストＨは、端末Ｔとの間で無線通信を可能とする無線通信部６１、端末Ｔから無線通信部６１を介して送られて来た音声を認識する音声認識部６２、少なくとも本発明の機能を実行するに必要な幾つかのアプリケーションからなるアプリケーション部６３、端末Ｔに対して音声により応答を行うための音声信号を生成する音声合成部６４、サーバＳから情報を取得するためのネットワーク通信部６５などを有した構成となっている。
【００３５】
サーバＳはネットワーク通信部７１、各種データが蓄積されているデータベース７２などを有している。
【００３６】
図３は端末Ｔの各構成要素をさらに説明するための図であり、音声入力処理部１は、マイクロホン１１、音声入力部１２、音声一時記憶部１３、音声区間検出部１４などから構成され、自己診断処理部２は、自己診断開始信号検出部２１、自己診断部２２、エラーメッセージ記憶部２３などから構成され、音声出力処理部３は、音声出力部３１、ヘッドホン３２などにより構成されている。また、無線通信部４は信号送信部４１、信号受信部４２により構成されている。そして、ユーザＵはマイクロホン１１とヘッドホン３２をたとえば図１のように装着して使用する。
【００３７】
なお、マイクロホン１１には音声以外の音も入力される場合もあるが、この音声入力処理部１は、音声以外の音に対しても処理可能であることは勿論である。
【００３８】
マイクロホン１１から出力される音声信号（ここでは音声信号以外の音に対する信号も含んで音声信号と呼ぶことにする）は、音声入力部１２でＡ／Ｄ変換処理や増幅処理など一般的な音声信号処理がなされたのち、ある一定区間ごとに順次、音声一時記憶部１３に一時的に記憶される。
【００３９】
そして、その音声一時記憶部１３で記憶されたマイクロホン１１からの音声信号（入力音声信号という）は、自己診断開始信号検出部２１により読み出され、自己診断開始信号であるか否かの判定がなされる。自己診断開始信号検出部２１は、前述したように、応答異常をユーザが察知したときに当該ユーザによってなされる動作、たとえば、ユーザがマイクロホン１１の動作状態を確認する動作によって得られる信号を検出するもので、具体的には、ユーザがマイクロホン１１に「フッ、フッ」というように息を吹きかける動作、「あー」というような音声を入力する動作、マイクロホンを軽く叩く動作のうち、少なくとも１つの動作がユーザによってなされることにより得られる信号を検出する。
【００４０】
この自己診断開始信号検出部２１が行う自己診断開始信号検出処理についてを図４のフローチャートを参照しながら説明する。
【００４１】
図４において、まず、音声一時記憶部１３に一時記憶されている入力音声信号を信号分析し、当該入力音声信号の信号パターンを得る（ステップｓ１）。そして、自己診断開始信号として予め用意されている何種類かの自己診断開始信号パターンを自己診断開始信号パターン記憶部２１１（この自己診断開始信号パターン記憶部２１１は自己診断開始信号検出部２１内に存在する）から読み出して、信号分析して得られた入力音声信号の信号パターンとパターンマッチングを行う（ステップｓ２）。
【００４２】
なお、自己診断開始信号パターン記憶部２１１に登録されている自己診断開始信号パターンは、この場合、ユーザがマイクロホン１１に「フッ、フッ」というような息を吹きかける動作を行ったときに得られる信号パターン（第１の診断開始信号パターンＰ１という）、ユーザがマイクロホン１１に向かって「あー」というような音声を入力する動作を行ったときに得られる信号パターン（第２の診断開始信号パターンＰ２という）、ユーザがマイクロホンを指先で軽く叩く動作を行ったときに得られる信号パターン（第３の診断開始信号パターンＰ３という）などであるとする。
【００４３】
このような各種の診断開始信号パターンＰ１，Ｐ２，Ｐ３と入力音声信号パターンとを個々にパターンマッチングし、入力音声信号パターンが診断開始信号パターンＰ１，Ｐ２，Ｐ３のいずれかに該当しているか、つまり、入力音声信号が自己診断開始信号であるか否かを判定する（ステップｓ３）。この判定において、もし、入力音声信号が自己診断開始信号であると判定された場合には、自己診断処理に入り（ステップｓ４）、入力音声信号が自己診断開始信号でないと判定された場合には、現在行っている自己診断開始検出処理を終了する。
【００４４】
そして、入力音声信号が自己診断開始信号であると判定された場合の自己診断処理は自己診断部２２によって行われ、まず、端末Ｔの自己診断を行い、次にホストＨの自己診断を行い、さらに、ネットワークＮ、サーバＳへと順次、自己診断を行って行く。このとき、ホストＨ以降の自己診断を行う場合は、自己診断部２２からの自己診断指示信号が無線通信部４の信号送信部４１によってホストＨ側に送られる。
【００４５】
ホストＨ側では端末Ｔ側から送られてきた自己診断指示信号によってアプリケーション部６３に存在する自己診断用のアプリケーションを起動させて自己診断を行う。なお、この自己診断の具体例などについては後述する。
【００４６】
この自己診断を行った結果、応答異常となった原因が特定できれば、自己診断部２２がエラーメッセージ記憶部２３から応答異常となった原因に対応するエラーメッセージを読み出して音声出力部３１に送る。音声出力部３１では読み出されたエラーメッセージを音声信号としてヘッドホン３２に与える。なお、エラーメッセージ記憶部２３には、応答異常となる各種の原因に対応したエラーメッセージが用意されている。
【００４７】
たとえば、通信環境が悪いことが原因である場合に対応したエラーメッセージとして、「電波が届きにくい状況にあります」というようなエラーメッセージが用意され、また、入力された声が小さすぎるがために音声認識が正常になされない場合に対応したエラーメッセージとして、「あなたの声が小さすぎます」というようなエラーメッセージが用意されている。このように、このエラーメッセージは様々な原因に対応して各種用意されている。
【００４８】
一方、自己診断開始信号検出部２１が入力音声信号に対し、自己診断開始信号であるか否かの判定を行った結果、自己診断開始信号ではないと判定した場合には、音声コマンドであるとみなし、音声区間検出部１４に処理を渡す。これにより、音声区間検出部１４は、音声一時記憶部１３に記憶されている入力音声信号から音声区間を検出し、その音声区間に対する音声信号を無線通信部４に送る。無線通信部４はその音声信号を信号送信部４１からホストＨ側に発信する。
【００４９】
そして、ホストＨ側では、端末Ｔ側から送られてきた音声信号を無線通信部６１で受け取って、音声認識部６２で音声認識処理し、その認識結果に基づいてアプリケーション部６３の中から対応するアプリケーションを選んで起動し、入力音声信号（音声コマンド）に対応した処理を行う。
【００５０】
このときの音声コマンドが、ホストＨにて処理できる内容（音声コマンドが、たとえば、時計やカレンダ機能を用いることで対応できる内容、辞書機能を用いることで対応できる内容など）であれば、その音声コマンドに対応した応答を音声合成部６４で生成して無線通信部６１によって端末Ｔ側に送信する。
【００５１】
また、音声コマンドがサーバＳから情報を取得する必要のある内容（音声コマンドが、たとえば、天気予報やその日のニュースなど特定のサーバから情報の取得が必要な内容）であれば、ネットワーク通信部６５によって所定のサーバＳにアクセスしてそのサーバＳから所望とする情報を取得し、ユーザからの音声コマンドに対応した応答内容を音声合成部で生成して無線通信部６１によって端末Ｔ側に送信する。端末Ｔ側ではホストＨ側から送られてきた音声信号を無線通信部４の信号受信部４２で受け取って、音声出力部３１で処理したのちヘッドホン３２から音声として出力する。
【００５２】
次に本発明の音声による情報提供システムの具体的な動作について説明する。前述したように、この種の音声による情報提供システムは、ユーザが音声によって問い合わせを行うことにより、システム側がユーザの所望とする情報を音声で提供してくれるものであるが、本発明は、このような音声による情報提供システムのどこかに問題があって、システムが誤動作した場合、つまり、ホストＨからの応答が応答異常となった場合（この応答異常というのは、前述したように、ユーザが音声コマンドを与えたにも係わらず、ホストＨ側から何の応答もなかったり、見当違いの応答がなされるなどの誤動作を指している）、その応答異常となった原因を当該ユーザにわかりやすく音声にて伝えるものである。
【００５３】
ここで、応答異常となった原因としては様々存在する。すなわち、応答異常を起こした原因が端末Ｔ側にある場合もあり、また、ホストＨ側にある場合もあり、さらには、ネットワークＮやサーバＳ側にある場合も考えられる。
【００５４】
端末Ｔ側の原因としては、たとえば、電源部（電池やスイッチなど）の問題、通信に関する問題、入出力系（マイクロホン１１やヘッドホン３２）の問題、ユーザが行う音声コマンドの入力の仕方などの問題などが考えられる。ユーザが行う音声コマンドの入力の仕方に問題がある例としては、たとえば、ユーザの発話する声の大きさが大きすぎたり小さすぎたりした場合、あるいは、発話する際マイクロホン１１から遠すぎたり近すぎたりした場合、認識可能単語以外の単語を入力した場合などがある。
【００５５】
また、ホストＨ側の原因としては、コンピュータそのものの問題、音声認識部６２の認識性能に関する問題、アプリケーション６３の問題、通信に関する問題、さらには、ネットワークＮから情報を取得する場合は、ネットワークＮとの接続上の問題や、回線上のトラブル、ネットワークＮやサーバＳそのものに問題がある場合も考えられる。
【００５６】
上述したように、ホストＨが応答異常を起こす原因は種々存在する。上述した様々な原因によって応答異常となっても、ユーザは何が原因でそのような応答異常となったのかはわからない場合が多い。
【００５７】
たとえば、ユーザが「いま、何時ですか」とホストＨに聞いた場合、ホストＨ側から全く応答がなかったり、全く見当違いの応答がなされたとしても、ユーザにはその原因は特定できない場合が多い。このようなとき、ユーザは、マイクロホン１１の動作状態を確認する動作を自然に行うのが一般的である。具体的には、マイクロホンに「フッ、フッ」と息を吹きかけたり、「あー」といった音声を入力したり、マイクロホン１１を軽く叩いてみたりする。
【００５８】
このように、不具合があることをユーザが察知したときにユーザが自然に行うマイクロホン１１の動作状態を確認するための幾つかの動作のうち、少なくとも１つの動作が行われると、端末Ｔに設けられた自己診断開始信号検出部２１がそれを検知して、自己診断部２２を起動する。この自己診断開始信号検出部２１が行う自己診断開始信号検出処理は、図４によってすでに説明したのでここでは省略する。
【００５９】
この自己診断開始信号検出部２１によって入力音声が自己診断開始信号であると判定されると、自己診断部２２は、まず、端末Ｔを自己診断し、続いて、ホストＨを自己診断し、続いて、ネットワークＮやサーバＳを自己診断して行くというように、端末Ｔ、ホストＨ、ネットワークＮやサーバＳといった順番で自己診断して行く。
【００６０】
たとえば、端末Ｔ側には特に問題がない場合には、自己診断部２２は自己診断指示信号をホストＨ側に無線通信部４により送信し、この自己診断指示信号をホストＨが受け取ると、ホストＨ側ではアプリケーション部６３の中の自己診断用のアプリケーションを起動して自己診断を行う。
【００６１】
ここで、ホストＨ側の自己診断を行った結果、音声認識部６２にて音声認識処理を行う際、ユーザの入力音声が小さすぎて適正な音声認識処理ができないことがわかったとすれば、「あなたの声が小さすぎます」というような診断結果を音声によってユーザに知らせる。また、サーバＳから情報を取得する必要のある場合、もし、応答異常が起こったとすれば、ネットワークＮやサーバＳに対しても診断を行い、たとえば、ネットワークＮへの接続に問題があることがわかれば、「ネットワークとの接続に問題があります」といった診断結果をユーザに音声によって通知する。
【００６２】
このように、入力音声が自己診断開始信号であると判定された場合には、自己診断部２２が起動され、端末Ｔ、ホストＨ、ネットワークＮやサーバＳというような順番で自己診断を行い、その結果、応答異常を引き起こす原因がみつかれば、その原因を具体的に音声にてユーザに知らせる。
【００６３】
これによって、ユーザは、自分の発した音声コマンドに対してシステム側が応答異常を起こしたとしても、どこがどのような問題で異常が生じたのかを具体的に知ることができる。このように、応答異常となった原因がわかれば、異常となった原因の修正を行って再度音声コマンドを入力することが可能となり、原因がわからぬまま音声コマンドを繰り返し入力するといった無駄を省くことができる。
【００６４】
なお、本発明は以上説明した実施の形態に限定されるものではなく、本発明の要旨を逸脱しない範囲で種々変形実施可能となるものである。
【００６５】
たとえば、ユーザの音声コマンドに対し応答異常が起こってシステムが自己診断を行う際、ユーザの入力した音声コマンドをそのままユーザにフィードバックするようにしてもよい。一例として、ユーザが現在時刻を問い合わせようとして、ホスト側に「いま何時ですか」という音声コマンドを与え、その音声コマンドに対して応答がなく、ユーザがマイクロホン１１に「フッフッ」と息を吹きかけるような動作を行ったとする。ユーザがこのようなマイクロホン１１の動作状態を確認する動作を行うと、自己診断信号検出部２１がそれを検出すると、自己診断部２２が自己診断動作に入るが、このとき、音声一時記憶部１３に記憶されているユーザの音声コマンド部分（この場合「いま何時ですか」）を音声区間検出部１４で検出して、その音声区間を音声出力部３１で処理してヘッドホン３２から出力する。
【００６６】
これにより、ヘッドホン３２からはユーザの発話した音声コマンド（「いま何時ですか」）がそのまま出力される。ユーザは、それを聞くことによって、自分の入力した音声コマンドの状態を知ることができる。たとえば、ヘッドホン３２からフィードバックされた自分の発話した「いま何時ですか」が、声が大きすぎて音が割れているようであれば、ユーザは自分の発話した声が大きすぎるためホスト側で適正な認識が行えなかったということがわかり、その点を注意して発話すればよいということを知る。
【００６７】
また、前述の実施の形態では、自己診断開始信号検出部２１は、マイクロホン１１に息を吹きかける動作、「あー」といった音声を入力する動作、マイクロホンを軽く叩く動作を、マイクロホン１１の動作状態を試験する際にユーザが一般的に行う動作として検出して自己診断開始信号を得るようにしたが、このマイクロホン１１の動作状態を試験する際にユーザが一般的に行う動作は、上述の３つの例に限定されるものではなく、その他にも種々考えられる。
【００６８】
また、自己診断開始信号は、上述のマイクロホン１１の動作状態を試験する際にユーザが一般的に行う動作を検出して得ることに限られるものではなく、たとえば、自己診断開始ボタンを用意し、システムに応答異常が起こったとき、その自己診断開始ボタンが押されることによって自己診断開始信号を得るようにしてもよい。
【００６９】
また、何らかのキーワードを用意しておいて、システムが応答異常を起こしたら、ユーザがそのキーワードを発話するようにしてもよい。この場合、そのキーワードが発せられたことを自己診断開始信号検出部２１が検出し、それによって自己診断部２２を起動させるようにすればよい。
【００７０】
さらに、ユーザに提示されるエラーメッセージは、音声だけではなく、ユーザ側の端末Ｔ上に文字表示によって行うことも可能であり、音声と文字表示を併用するようにしてもよい。
【００７１】
また、システム側からユーザに対して応答する祭、端末Ｔからユーザに応答するメッセージ（前述の実施の形態の場合、エラーメッセージなど）とホストＨからユーザに応答するメッセージ（音声コマンドに対して取得された情報など）とでそれぞれ声質を異ならせるようにすることもできる。たとえば、エラーメッセージは男性の声とし、取得された情報は女性の声とするというようにそれぞれの声質を異ならせることによって、ユーザは、応答を聞くだけでその応答が端末ＴからのものかホストＨからのものかが即座にわかり、特に、視聴覚障害者にとっては使い勝手のよいものとなる。
【００７２】
また、以上説明した本発明の処理の手順は、フロッピィディスク、光ディスク、ハードディスクなどの記録媒体に記録させておくことができる。そして、本発明はその記録媒体をも含むものである。また、その処理手順はネットワークを介して得るようにしてもよい。
【００７３】
【発明の効果】
以上説明したように本発明によれば、ユーザの入力した音声コマンドに対してホストコンピュータ側が応答異常を起こした場合、応答異常のあることをユーザが察知したときにユーザによってなされる動作を検知し、それによって、端末側で自己診断機能を起動し、その自己診断機能によって、異常となった原因を調べ、その結果をユーザに提示するようにしている。このように、応答異常に対してユーザが何らかの動作ををとることによって、応答異常となった原因を自己診断して調べ、その自己診断結果をユーザに知らせることができるので、ユーザ側で応答異常となった原因を１つ１つ調べて行く必要がなくなる。また、このとき、自己診断結果を音声によって報知するようにすれば、視覚障害を有するユーザにとって便利なシステムとなることは勿論、健常者であっても診断結果をその場に居ながら耳から得ることができるので、他の仕事をしながら本情報提供システムを利用するような場合、仕事をしている手や目を休めることなく、どこが悪くて応答異常が起こったのかを知ることができる。
【００７４】
また、応答異常があることをユーザが知ったときにユーザによってなされる動作は、ユーザ側に存在するマイクロホンに対し、ユーザがマイクロホンの動作状態を確認する動作であって、具体的には、ユーザがマイクロホンに「フッ、フッ」というように息を吹きかける動作、「あー」というような音声を入力する動作、マイクロホンを軽く叩く動作などであり、これらの動作は、マイクロホンに向かって発話しようとするユーザが、何らかの異常を感じたときにごく自然に行う動作である。
【００７５】
このように、応答異常に対するユーザが行う自然な動作を自己診断機能を開始させるための信号として用いるので、ユーザが特別な操作を行うことなく、自動的に自己診断機能を働かせることができる。
【００７６】
また、本発明は、応答異常が起こって自己診断を行う際、ユーザの入力した音声コマンドをそのままユーザにフィードバックすることも可能であり、これによって、ユーザは自分の入力した音声コマンドの状態を知ることができ、これを何回か繰り返すことによってユーザはどのように発話すれば適正に音声認識されるかを学習することができる。
【００７７】
このように、本発明はユーザからの音声コマンドを受け取って、その音声コマンドに対応した情報を主に音声によって提供可能とした音声による情報提供システムにおいて、音声コマンドに対しシステムが誤動作した場合、何が原因なのかを自己診断し、その自己診断結果をユーザにわかりやすく知らせることを可能とし、特に自己診断結果を音声によってユーザに提示するようにしているので、前述したような音声応答による視覚障害者への生活支援システムなどに本発明を適用することによって、視覚障害者にとってより一層使いやすいシステムとすることができる。また、視覚障害者のみらず、健常者であっても、この種のシステムの取り扱いに不慣れなユーザや、他の仕事をしながらこのような情報提供システムを利用する機会の多いユーザにとっては使い勝手のよいものとなる。
【図面の簡単な説明】
【図１】本発明の音声による情報提供システムの実施の形態を説明する全体的なシステム構成を示す図である。
【図２】図１で示したシステム構成図における端末、ホスト、サーバのそれぞれの概略構成を説明する図である。
【図３】図２で示した端末における各構成要素をさらに詳細に説明する図である。
【図４】図３で示した自己診断開始信号検出部における自己診断開始信号検出処理を説明するフローチャートである。
【符号の説明】
Ｔ端末
Ｈホスト（ホストコンピュータ）
Ｎネットワーク
Ｓサーバ
１音声入力処理部
２自己診断処理部
３音声出力処理部
４無線通信部（端末Ｔ側）
１１マイクロホン
１２音声入力部
１３音声一時記憶部
１４音声区間検出部
２１自己診断開始信号検出部
２２自己診断部
２３エラーメッセージ記憶部
３１音声出力部
３２ヘッドホン
４１信号送信部
４２信号受信部
６１無線通信部（ホストコンピュータＨ側）
６２音声認識部
６３アプリケーション部
６４音声合成部
６５ネットワーク通信部（ホストコンピュータＨ側）
７１ネットワーク通信部（サーバＳ側）
７２データベース

[0001]
[Technical Field of the Invention]
The present invention relates to a voice-based information providing system that receives a voice command from a user and provides information corresponding to that command to the user by voice, and to a method of notifying the cause of a malfunction in such a system.
[0002]
[Prior art]
In recent years, advances in speech technologies such as speech recognition and speech synthesis have led to proposals for speech-based information processing systems in various fields. For example, systems that accept command input by voice, and systems that can also respond by voice, have already been developed. Such voice-based information processing systems are especially convenient for users with visual impairments and users with limited use of their hands or feet, and they are expected to spread to various fields.
[0003]
As one example, a life support system for visually impaired persons using voice response has been proposed (Kenichi Matsuura and Hiroaki Yuse, Faculty of Business and Information Sciences, Shizuoka Prefectural University, "Prototype of a life support system for visually impaired persons using voice response," materials of the Type 2 Study Group of the Institute of Electronics, Information and Communication Engineers, pp. 33-39).
[0004]
This system is configured roughly as shown in FIG. 1. The user U wears headphones 32 and a microphone 11, exchanges voice information with the system side over a communication means such as wireless communication, and can thereby obtain various kinds of information by voice. An example dialogue is shown below, where the user is denoted U and the system Sy. The system Sy here is a collective term for the user-side terminal T, the host computer H, and the like in FIG. 1.
[0005]
U: Clock.
Sy: You can use the clock / calendar function.
U: What time is it now?
Sy: The current time is 4:30 pm.
U: What day is next Thursday?
Sy: Next Thursday is May 10th.
U: Calculate with the calculator.
Sy: You can use the calculator function.
U: What is 980 plus 198?
Sy: 980 plus 198 is 1178.
U: Dictionary lookup.
Sy: You can use the dictionary function.
U: "Sekkyokuteki" (active).
Sy: Is the word you want to look up "sekkyokuteki"?
U: Yes, that is right.
[0006]
In this way, the user U issues a command for obtaining information by voice, and the system Sy recognizes the voice command and returns an answer based on the recognition result to the user U by voice. The user can also obtain everyday information such as weather forecasts, news, railway and other timetables, and television programs by asking for it. In that case, specific web pages are designated in advance, and in response to a request from the user U, the host computer H connects to the server S shown in FIG. 1, the server S obtains the information from the web page, and the information is reported to the user U.
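The dialogue in paragraph [0005] can be pictured as a lookup from the recognized text to a function of the system. The following is only a minimal sketch under stated assumptions: the names `FUNCTIONS`, `calculator`, and `respond`, the English keywords, and the tiny "a plus b" grammar are all hypothetical and are not described in the cited prior-art system.

```python
import re

# Hypothetical table: recognized keyword -> announcement spoken to the user.
FUNCTIONS = {
    "clock": "You can use the clock / calendar function.",
    "calculator": "You can use the calculator function.",
    "dictionary": "You can use the dictionary function.",
}

def calculator(text):
    """Answer utterances of the assumed form '<a> plus <b>', else None."""
    m = re.fullmatch(r"\s*(\d+)\s+plus\s+(\d+)\s*", text)
    if m is None:
        return None
    a, b = int(m.group(1)), int(m.group(2))
    return f"{a} plus {b} is {a + b}."

def respond(recognized_text):
    """Return the spoken answer, or None when no usable response exists."""
    text = recognized_text.strip().lower().rstrip(".?")
    if text in FUNCTIONS:
        return FUNCTIONS[text]
    return calculator(text)
```

A return value of `None` here stands for the situation the invention addresses: the user has spoken, but the system has nothing sensible to say.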
[0007]
[Problems to be solved by the invention]
The voice-based information providing system described above is expected to be useful not only to visually impaired users but also to users without disabilities; however, many problems remain to be solved.
[0008]
For example, the system Sy often gives no response at all to a voice command input by the user U, or gives a completely irrelevant response. The cause may lie not only in the speech recognition performance or the applications on the system Sy side, but also in the way the user speaks, in the microphone 11, or in the communication environment. Malfunctions such as no response or an irrelevant response from the system Sy side (hereinafter called response abnormalities) thus arise from various causes.
[0009]
If the user knows what caused a response abnormality, the user can deal with it. For example, once it is known that the spoken voice was too quiet, that the user spoke too fast, or that the microphone's power switch was off, the problem can be corrected.
[0010]
Conventional systems of this type, however, usually have no means of informing the user of the cause. It is very difficult for the user to identify what caused the response abnormality, and in the end the user has to diagnose the cause by trial and error, for example by repeating the same utterance or checking the state of the system. A user without disabilities can sometimes identify the cause by taking time and trying various methods, but for a user with a visual or other impairment it is difficult to investigate the cause, and it becomes necessary to ask a person nearby or a specialist in this type of system to investigate.
[0011]
An object of the present invention is therefore to provide a system that, when it malfunctions in response to a voice command, diagnoses the cause by itself and informs the user of the self-diagnosis result in an easily understood way, and that, in particular by presenting the self-diagnosis result by voice, is easy to use not only for users without disabilities but especially for users with visual impairments.
[0012]
[Means for Solving the Problems]
To achieve the above object, the voice-based information providing system of the present invention is a system in which a user inputs a voice command from a terminal carried by the user, a host computer receives the voice command, and the host computer recognizes the received voice command and has means for transmitting a response to the recognition result to the terminal, the system further having means for notifying the user of the cause of a response abnormality when the response from the host computer to the voice command is abnormal. The means for notifying the user of the cause comprises: self-diagnosis start signal detection means that obtains a self-diagnosis start signal by detecting an action performed by the user when the user notices the response abnormality on the host computer side; self-diagnosis means that starts self-diagnosis upon receiving a self-diagnosis start instruction signal from the self-diagnosis start signal detection means; and notification means that notifies the user of the self-diagnosis result obtained by the self-diagnosis means.
[0013]
In this voice-based information providing system, both the response from the host computer to the voice command and the notification of the self-diagnosis result to the user are given by voice.
[0014]
The action performed by the user that is needed to obtain the self-diagnosis start signal is an action by which the user checks the operating state of the microphone provided on the terminal.
[0015]
The action by which the user checks the operating state of the microphone may be at least one of the following: the user blowing on the microphone, the user inputting a predetermined sound into the microphone, and the user tapping the microphone lightly.
[0016]
The self-diagnosis start signal detection means, the self-diagnosis means, and the notification means are mainly provided on the terminal side.
[0017]
In addition, when there is a response abnormality in the response from the host computer side to the voice command, the voice command input by the user can be fed back to the user.
[0018]
The malfunction cause notification method of the present invention is a method for a voice-based information providing system in which a user inputs a voice command from a terminal carried by the user, a host computer receives the voice command, and the host computer recognizes the received voice command and has means for transmitting a response to the recognition result to the terminal, the system further having means for notifying the user of the cause of a response abnormality when the response from the host computer to the voice command is abnormal. In the method, an action performed by the user when the user notices the response abnormality on the host computer side is detected as a self-diagnosis start signal; the self-diagnosis function is thereby activated; self-diagnosis is performed to investigate the cause of the response abnormality; and the self-diagnosis result is notified to the user.
[0019]
In this malfunction cause notification method, both the response from the host computer to the voice command and the notification of the self-diagnosis result to the user are given by voice.
[0020]
The action performed by the user that is needed to obtain the self-diagnosis start signal is an action by which the user checks the operating state of the microphone provided on the terminal.
[0021]
The action by which the user checks the operating state of the microphone may be at least one of the following: the user blowing on the microphone, the user inputting a predetermined sound into the microphone, and the user tapping the microphone lightly.
[0022]
Then, the function of obtaining the self-diagnosis start signal, the function of performing the self-diagnosis, and the function of notifying the user of the self-diagnosis result are mainly provided on the terminal side.
[0023]
In this malfunction cause notification method as well, when the response from the host computer to the voice command is abnormal, the voice command input by the user can be fed back to the user.
[0024]
As described above, in the present invention, when the host computer (hereinafter simply called the host) responds abnormally to a voice command input by the user, the action that the user performs on noticing the abnormality is detected, the self-diagnosis function is thereby activated on the terminal side, the cause of the abnormality is investigated by the self-diagnosis function, and the result is presented to the user.
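The chain of detection, self-diagnosis, and notification summarized in this paragraph can be sketched as three pluggable stages. This is an illustrative sketch only: the function name `on_user_action` and the callables `detect_start`, `diagnose`, and `notify` are hypothetical stand-ins for the detection means, self-diagnosis means, and notification means, whose internals the specification leaves open at this point.

```python
def on_user_action(signal, detect_start, diagnose, notify):
    """Chain the three means: detection -> self-diagnosis -> notification.

    detect_start(signal) -> bool : is this a self-diagnosis start signal?
    diagnose() -> str or None    : message for the cause found, if any
    notify(message)              : present the result to the user
    """
    if not detect_start(signal):
        return False          # not a start signal; nothing is diagnosed
    cause = diagnose()        # investigate the cause of the abnormality
    if cause is not None:
        notify(cause)         # report only when a cause was identified
    return True
```

For example, passing `spoken.append` as `notify` collects the messages that a real terminal would hand to its voice output.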
[0025]
In other words, when the user judges that the response from the host computer to the user's voice command is clearly abnormal (as described above, a response abnormality means a malfunction such as no response at all or an irrelevant response to the voice command), the user usually performs some action spontaneously. By detecting that action, the system diagnoses by itself what went wrong and notifies the user of the self-diagnosis result.
[0026]
Because the cause of a response abnormality is self-diagnosed when the user takes some action in reaction to it, and the result is reported to the user, the user no longer needs to check possible causes one by one. If the self-diagnosis result is reported by voice, the system is convenient for users with visual impairments, and users without disabilities can also hear the diagnosis result where they are, which is convenient when the information providing system is used while doing other work.
[0027]
The action performed by the user on learning of a response abnormality is an action by which the user checks the operating state of the microphone on the user side: specifically, blowing on the microphone with short puffs of breath ("fu, fu"), inputting a sound such as "ah," or tapping the microphone lightly. These are actions that a user who is about to speak into a microphone performs quite naturally on sensing that something is wrong.
[0028]
Because the user's natural reaction to a response abnormality is used as the signal that starts the self-diagnosis function, the self-diagnosis function runs automatically without any special operation by the user.
[0029]
The function of obtaining the signal that starts the self-diagnosis function, the function of performing self-diagnosis, and the function of notifying the user of the self-diagnosis result are provided mainly on the terminal side. Consequently, when the present invention is applied to an existing system such as the "life support system for visually impaired persons using voice response" described above as an example of a conventional information providing system, it can be realized without major changes to the host side, so existing information providing systems of this type can be put to effective use.
[0030]
When a response abnormality occurs and self-diagnosis is performed, the voice command input by the user can also be fed back to the user as it is. Hearing the command played back unchanged lets the user know the condition of the voice command that was input. For example, if the played-back command is so loud that the sound is distorted, the user understands that the host could not recognize it properly because the voice was too loud, and learns to take care on that point when speaking. By repeating this several times, the user can learn how to speak so as to be recognized properly.
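In the specification the user judges the played-back command by ear. Purely as an illustration of what "too loud and distorted" means for a digitized signal, the sketch below measures how many samples sit at the edge of the numeric range. Everything in it is an assumption and not part of the described system: 16-bit samples, the names `clipping_ratio` and `sounds_distorted`, and the 99 % and 5 % thresholds.

```python
def clipping_ratio(samples, full_scale=32767, margin=0.99):
    """Fraction of samples at or beyond `margin` of full scale."""
    if not samples:
        return 0.0
    limit = full_scale * margin
    clipped = sum(1 for s in samples if abs(s) >= limit)
    return clipped / len(samples)

def sounds_distorted(samples, threshold=0.05):
    """True when the clipped fraction reaches the assumed threshold."""
    return clipping_ratio(samples) >= threshold
```

A recording made by shouting close to the microphone would typically give a high ratio, while ordinary speech stays well inside the range.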
[0031]
[Embodiments of the Invention]
Embodiments of the present invention will be described below.
[0032]
FIG. 1 shows the schematic configuration of the voice-based information providing system. Broadly, it consists of a terminal T on the user side and a host computer H (simply called the host, as noted above). Depending on the content of the voice command, information may also be obtained from an external information providing means, for example a server S connected to a network N.
[0033]
As shown in FIG. 2, the user-side terminal T includes a voice input processing unit 1, a self-diagnosis processing unit 2, a voice output processing unit 3, a wireless communication unit 4, and the like. Each of these components will be described in detail later.
[0034]
The host H includes a wireless communication unit 61 that enables wireless communication with the terminal T, a voice recognition unit 62 that recognizes the voice sent from the terminal T via the wireless communication unit 61, an application unit 63 made up of several applications needed at least to execute the functions of the present invention, a voice synthesis unit 64 that generates a voice signal for responding to the terminal T by voice, a network communication unit 65 for obtaining information from the server S, and the like.
[0035]
The server S includes a network communication unit 71, a database 72 in which various data are stored, and the like.
[0036]
FIG. 3 explains the components of the terminal T in more detail. The voice input processing unit 1 includes a microphone 11, a voice input unit 12, a voice temporary storage unit 13, a voice section detection unit 14, and the like. The self-diagnosis processing unit 2 includes a self-diagnosis start signal detection unit 21, a self-diagnosis unit 22, an error message storage unit 23, and the like. The voice output processing unit 3 includes a voice output unit 31, headphones 32, and the like. The wireless communication unit 4 includes a signal transmission unit 41 and a signal reception unit 42. The user U wears the microphone 11 and the headphones 32, for example as shown in FIG. 1.
[0037]
Sounds other than speech may also be input to the microphone 11, and the voice input processing unit 1 can of course process such non-speech sounds as well.
[0038]
The voice signal output from the microphone 11 (here the term voice signal also covers signals for sounds other than speech) undergoes general audio signal processing such as A/D conversion and amplification in the voice input unit 12, and is then stored temporarily in the voice temporary storage unit 13, one fixed-length interval at a time.
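Storing the digitized signal one fixed-length interval at a time can be sketched as a bounded buffer of frames. This is a minimal sketch under assumptions: the class name `TemporaryVoiceStore`, the frame length, and the policy of discarding the oldest frame when the buffer is full are all hypothetical; the specification states only that storage is temporary and proceeds interval by interval.

```python
from collections import deque

class TemporaryVoiceStore:
    """Keeps the most recent fixed-length frames of digitized samples."""

    def __init__(self, frame_len=160, max_frames=100):
        self.frame_len = frame_len
        self.frames = deque(maxlen=max_frames)  # oldest frame drops first
        self._pending = []                      # samples not yet a full frame

    def push(self, samples):
        """Append samples; move every completed frame into the buffer."""
        self._pending.extend(samples)
        while len(self._pending) >= self.frame_len:
            self.frames.append(self._pending[:self.frame_len])
            del self._pending[:self.frame_len]

    def read_all(self):
        """Return the stored frames as one flat list of samples."""
        return [s for frame in self.frames for s in frame]
```

A detector or a voice section detector could then call `read_all()` to examine whatever the user produced most recently.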
[0039]
The voice signal from the microphone 11 stored in the voice temporary storage unit 13 (called the input voice signal) is read out by the self-diagnosis start signal detection unit 21, which determines whether it is a self-diagnosis start signal. As described above, the self-diagnosis start signal detection unit 21 detects the signal produced by an action that the user performs on noticing a response abnormality, for example an action by which the user checks the operating state of the microphone 11. Specifically, it detects the signal produced when the user performs at least one of the following: blowing on the microphone 11 with short puffs of breath ("fu, fu"), inputting a sound such as "ah," and tapping the microphone lightly.
[0040]
The self-diagnosis start signal detection process performed by the self-diagnosis start signal detector 21 will be described with reference to the flowchart of FIG.
[0041]
In FIG. 4, the input voice signal temporarily stored in the voice temporary storage unit 13 is first subjected to signal analysis to obtain its signal pattern (step s1). Several types of self-diagnosis start signal patterns are prepared in advance and stored in the self-diagnosis start signal pattern storage unit 211, which is provided in the self-diagnosis start signal detection unit 21. Pattern matching is then performed between these stored patterns and the signal pattern of the input voice signal obtained by the signal analysis (step s2).
[0042]
In this case, the self-diagnosis start signal patterns registered in the self-diagnosis start signal pattern storage unit 211 are: a signal pattern obtained when the user blows on the microphone 11 with a “huff” (referred to as the first diagnosis start signal pattern P1); a signal pattern obtained when the user inputs a sound such as “Ah” into the microphone 11 (referred to as the second diagnosis start signal pattern P2); and a signal pattern obtained when the user taps the microphone with a fingertip (referred to as the third diagnosis start signal pattern P3).
[0043]
The input voice signal pattern is matched individually against each of the diagnosis start signal patterns P1, P2, and P3 to determine whether it corresponds to any of them, that is, whether or not the input voice signal is a self-diagnosis start signal (step s3). If the input voice signal is determined to be a self-diagnosis start signal, the self-diagnosis process is started (step s4); if it is determined not to be a self-diagnosis start signal, the current self-diagnosis start signal detection process is terminated.
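The flow of steps s1 to s3 can be illustrated with a minimal Python sketch. The description does not specify the signal analysis or the matching criterion, so everything concrete here is an assumption made for illustration: the coarse amplitude envelope used as the "signal pattern", the cosine similarity measure, the threshold value, and the three reference envelopes standing in for P1, P2, and P3.

```python
import math

# Hypothetical reference envelopes standing in for P1 (breath), P2 ("Ah"),
# and P3 (tap). The actual stored patterns are not given in the description.
START_PATTERNS = {
    "P1": [0.2, 0.6, 0.8, 0.6, 0.2],
    "P2": [0.5, 0.5, 0.5, 0.5, 0.5],
    "P3": [1.0, 0.3, 0.1, 0.0, 0.0],
}
MATCH_THRESHOLD = 0.95  # assumed similarity threshold


def analyze(samples, bins=5):
    """Step s1 (assumed form): reduce the buffered signal to a coarse envelope."""
    size = max(1, len(samples) // bins)
    return [
        sum(abs(x) for x in samples[i * size:(i + 1) * size]) / size
        for i in range(bins)
    ]


def similarity(a, b):
    """Cosine similarity between two envelopes (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def detect_start_signal(samples):
    """Steps s2 and s3: return the best-matching pattern name, or None."""
    envelope = analyze(samples)
    best_name, best_score = None, 0.0
    for name, pattern in START_PATTERNS.items():
        score = similarity(envelope, pattern)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= MATCH_THRESHOLD else None
```

A caller would start the self-diagnosis process (step s4) when `detect_start_signal` returns a pattern name, and otherwise hand the buffered signal on as an ordinary input. Cosine similarity ignores overall loudness, which may or may not be desirable in a real detector.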
[0044]
When the input voice signal is determined to be a self-diagnosis start signal, the self-diagnosis unit 22 carries out the self-diagnosis process: first the terminal T is diagnosed, then the host H, and then the network N and the server S in turn. For the diagnosis of the host H and the stages after it, a self-diagnosis instruction signal from the self-diagnosis unit 22 is sent to the host H side by the signal transmission unit 41 of the wireless communication unit 4.
[0045]
On the host H side, the self-diagnosis instruction signal sent from the terminal T side starts a self-diagnosis application in the application unit 63, which performs the self-diagnosis. A specific example of this self-diagnosis will be described later.
[0046]
If the cause of the response abnormality can be identified as a result of the self-diagnosis, the self-diagnosis unit 22 reads out an error message corresponding to the cause of the response abnormality from the error message storage unit 23 and sends it to the voice output unit 31. The audio output unit 31 gives the read error message to the headphones 32 as an audio signal. The error message storage unit 23 is prepared with error messages corresponding to various causes that cause a response abnormality.
[0047]
For example, an error message such as “There is a situation where radio waves are difficult to reach” is prepared for the case where the communication environment is poor, and an error message such as “Your voice is too low” is prepared for the case where the input voice is too quiet to be recognized normally. In this way, error messages are prepared for each of the various causes.
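As a rough illustration of the error message storage unit 23, the lookup from a diagnosed cause to its stored message could be sketched as follows. The two message texts are taken from the description; the cause identifiers and the `speak` callback (standing in for the audio output unit 31) are hypothetical names introduced only for this sketch.

```python
# Hypothetical cause identifiers; only the two message texts come from the
# description.
ERROR_MESSAGES = {
    "poor_radio_reception": "There is a situation where radio waves are difficult to reach",
    "input_voice_too_low": "Your voice is too low",
}


def notify(cause, speak):
    """Look up the stored message for a cause and pass it to the output.

    `speak` is an assumed callback representing the audio output path.
    Returns the message, or None when no message is stored for the cause.
    """
    message = ERROR_MESSAGES.get(cause)
    if message is not None:
        speak(message)
    return message
```

Keeping the messages in a table separate from the diagnosis logic means new causes can be added without changing the code that reports them.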
[0048]
On the other hand, when the self-diagnosis start signal detection unit 21 determines that the input voice signal is not a self-diagnosis start signal, processing is passed to the voice section detection unit 14 regardless of whether or not the signal is a voice command. The voice section detection unit 14 then detects a voice section in the input voice signal stored in the voice temporary storage unit 13 and sends the voice signal corresponding to that section to the wireless communication unit 4. The wireless communication unit 4 transmits the voice signal from the signal transmission unit 41 to the host H side.
[0049]
On the host H side, the voice signal sent from the terminal T side is received by the wireless communication unit 61, and the voice recognition unit 62 performs voice recognition processing on it. Based on the recognition result, the application unit 63 selects and launches an application, which performs the processing corresponding to the input voice signal (voice command).
[0050]
If the voice command can be processed within the host H (for example, a command that can be handled with the clock or calendar function, or with the dictionary function), a response corresponding to the voice command is generated by the speech synthesis unit 64 and transmitted to the terminal T side by the wireless communication unit 61.
[0051]
If the voice command requires information from the server S (for example, a command that needs information from a specific server, such as the weather forecast or the day's news), the desired information is obtained from the server S via the network communication unit 65, a response corresponding to the user's voice command is generated by the speech synthesis unit 64, and the response is transmitted to the terminal T side by the wireless communication unit 61. On the terminal T side, the voice signal transmitted from the host H side is received by the signal reception unit 42 of the wireless communication unit 4, processed by the audio output unit 31, and then output as sound from the headphones 32.
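The choice between answering within the host H and fetching from the server S can be sketched as a simple dispatch. This is only an illustration under stated assumptions: the command names, the placeholder response strings, and the `fetch_from_server` callback are hypothetical, and the description does not say how the application unit 63 actually selects an application.

```python
# Hypothetical command names. The description names clock, calendar, and
# dictionary functions on the host, and weather and news as server content.
HOST_FUNCTIONS = {
    "clock": lambda: "host: clock response",
    "dictionary": lambda: "host: dictionary response",
}
SERVER_TOPICS = {"weather", "news"}


def build_response(command, fetch_from_server):
    """Return response text for a recognized command, or None if unsupported.

    `fetch_from_server` is an assumed callback representing retrieval through
    the network communication unit 65.
    """
    if command in HOST_FUNCTIONS:
        return HOST_FUNCTIONS[command]()
    if command in SERVER_TOPICS:
        return fetch_from_server(command)
    return None
```

In the system described, the returned text would then go to the speech synthesis unit 64 before transmission to the terminal T.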
[0052]
Next, a specific operation of the voice information providing system of the present invention will be described. As described above, this type of system provides the information the user wants by voice when the user makes an inquiry by voice. In the present invention, if there is a problem somewhere in the system and the system malfunctions, that is, if the response from the host H becomes abnormal (a response abnormality here means a malfunction such as no response, or a wrong response, from the host H even though a voice command was given), the cause of the response abnormality is conveyed to the user by voice in an easy-to-understand manner.
[0053]
Here, there are various causes for the response abnormality. That is, the cause of the response abnormality may be on the terminal T side, may be on the host H side, and may be on the network N or server S side.
[0054]
Causes on the terminal T side include, for example, a problem with the power supply (battery, switch, etc.), a problem with communication, a problem with the input/output system (microphone 11 or headphones 32), and a problem with the way the user inputs voice commands. Examples of the last include the user's voice being too loud or too quiet, the user speaking too far from or too close to the microphone 11, and the user inputting a word that is not among the recognizable words.
[0055]
Causes on the host H side include a problem with the computer itself, a problem with the recognition performance of the voice recognition unit 62, a problem with the application unit 63, and a problem with communication. When information is acquired via the network N, there may also be a problem with the connection, a problem on the line, or a problem in the network N or the server S itself.
[0056]
As described above, there are various causes of an abnormal response from the host H, and when a response abnormality occurs, the user often does not know which of them was responsible.
[0057]
For example, when the user asks the host H “What time is it now?” and the host H gives no response or a completely irrelevant one, the user often cannot determine the cause. In such a case, the user generally, and quite naturally, performs an operation to confirm the operating state of the microphone 11. Specifically, the user blows on the microphone with a “huff”, inputs a sound such as “Ah”, or taps the microphone 11 lightly.
[0058]
Thus, when the user senses that something is wrong and performs at least one of these natural operations for confirming the operating state of the microphone 11, the self-diagnosis start signal detection unit 21 provided in the terminal T detects it and activates the self-diagnosis unit 22. The self-diagnosis start signal detection process performed by the self-diagnosis start signal detection unit 21 has already been described with reference to FIG.
[0059]
When the self-diagnosis start signal detection unit 21 determines that the input voice is a self-diagnosis start signal, the self-diagnosis unit 22 first diagnoses the terminal T, then the host H, and then the network N and the server S; that is, self-diagnosis proceeds in the order terminal T, host H, network N, server S.
[0060]
For example, when there is no particular problem on the terminal T side, the self-diagnosis unit 22 transmits a self-diagnosis instruction signal to the host H side via the wireless communication unit 4. On receiving this signal, the host H activates the self-diagnosis application in the application unit 63 and performs self-diagnosis.
[0061]
Suppose the self-diagnosis on the host H side finds that, when the voice recognition unit 62 performed voice recognition processing, the user's input voice was too quiet for proper recognition. In that case, a diagnosis result such as “Your voice is too low” is notified to the user by voice. If the response abnormality occurred when information had to be obtained from the server S, the network N and the server S are also diagnosed. If, for example, a problem with the connection to the network N is found, a diagnosis result such as “There is a problem with the network connection” is notified to the user by voice.
[0062]
Thus, when it is determined that the input voice is a self-diagnosis start signal, the self-diagnosis unit 22 is activated and performs self-diagnosis in the order of the terminal T, the host H, the network N, and the server S, As a result, if a cause of response abnormality is found, the cause is specifically notified to the user by voice.
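The staged order of diagnosis can be sketched as a loop over the four stages. This is a minimal sketch under assumptions: each stage is represented by a hypothetical check function that returns a cause string or None, and the loop stops at the first cause found, which is one reading of the description (the host H is diagnosed when the terminal T shows no particular problem) rather than something it states outright.

```python
# Stage order as given in the description.
STAGES = ("terminal T", "host H", "network N", "server S")


def run_self_diagnosis(checks):
    """Run stage checks in order and return (stage, cause) for the first hit.

    `checks` maps a stage name to a callable returning a cause string or
    None. The callables are hypothetical stand-ins for the real diagnostics.
    Returns None when no stage reports a cause.
    """
    for stage in STAGES:
        check = checks.get(stage)
        cause = check() if check else None
        if cause is not None:
            return stage, cause
    return None
```

For example, with a terminal check that finds nothing and a host check that reports a quiet input voice, the function returns the host stage and that cause without consulting the network or server checks.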
[0063]
As a result, even if the system responds abnormally to a voice command issued by the user, the user can learn specifically where the problem lies and what it is. Once the cause of the response abnormality is known, the user can remove the cause and input the voice command again, which eliminates the waste of repeating the voice command without knowing what went wrong.
[0064]
The present invention is not limited to the embodiment described above, and various modifications can be made without departing from the gist of the present invention.
[0065]
For example, when a response abnormality occurs for a user's voice command and the system performs self-diagnosis, the voice command input by the user may be fed back to the user as it is. As an example, suppose the user, wanting to know the current time, gives the voice command “What time is it now” to the host side, receives no response, and then blows on the microphone 11 with a “huff”. When the self-diagnosis start signal detection unit 21 detects this operation of confirming the operating state of the microphone 11, the self-diagnosis unit 22 begins the self-diagnosis operation. At that point, the voice section detection unit 14 detects the voice command portion (in this case, “What time is it now”) of the stored input voice signal, and that voice section is processed by the audio output unit 31 and output from the headphones 32.
[0066]
As a result, the voice command spoken by the user (“What time is it now”) is output from the headphones 32 exactly as it was input, and by listening to it the user can learn the state of the voice command that was input. For example, if the “What time is it now” heard from the headphones 32 is so loud that the sound is distorted, the user can tell that the command was not recognized properly because it was spoken too loudly, and that more care is needed when speaking.
[0067]
In the embodiment described above, the self-diagnosis start signal detection unit 21 obtains the self-diagnosis start signal by detecting operations that users generally perform when testing the operating state of the microphone 11: blowing on the microphone 11, inputting a sound such as “Ah”, and tapping the microphone. These three operations are only examples; the operations that users perform when testing the microphone 11 are not limited to them, and various others are conceivable.
[0068]
Further, the self-diagnosis start signal need not be obtained by detecting an operation that users generally perform when testing the operating state of the microphone 11. For example, a self-diagnosis start button may be provided so that, when a response abnormality occurs in the system, the self-diagnosis start signal is obtained when the user presses the button.
[0069]
Alternatively, a keyword may be prepared for the user to speak when the system causes a response abnormality. In this case, the self-diagnosis start signal detection unit 21 detects that the keyword has been spoken, and the self-diagnosis unit 22 is activated accordingly.
[0070]
Furthermore, the error message may be presented to the user not only by voice but also as text displayed on the user's terminal T, and voice and text display may be used together.
[0071]
In addition, among the messages with which the system responds to the user, those that come from the terminal T (in the embodiment described above, error messages and the like) and those that come from the host H (such as the information obtained in response to a voice command) may be given different voice qualities. For example, if error messages are spoken in a male voice and the obtained information in a female voice, the user can tell immediately, just by listening, whether a response came from the terminal T or from the host H, which is particularly convenient for visually impaired users.
[0072]
The processing procedure of the present invention described above can be recorded on a recording medium such as a floppy disk, an optical disk, or a hard disk. The present invention includes the recording medium. Further, the processing procedure may be obtained via a network.
[0073]
[Effect of the invention]
As described above, according to the present invention, when the host computer side causes a response abnormality to a voice command input by the user, an operation that the user performs upon sensing the response abnormality is detected. This activates a self-diagnosis function on the terminal side, the self-diagnosis function investigates the cause of the abnormality, and the result is presented to the user. Because the cause is investigated by self-diagnosis and the result is notified to the user whenever the user takes some action in response to the abnormality, the user does not need to check each possible cause personally. Moreover, if the self-diagnosis result is notified by voice, the system is convenient for visually impaired users, and sighted users can likewise hear the diagnosis result on the spot. When the information providing system is used while doing other work, the user can therefore learn what caused the response abnormality without interrupting the work of the hands and eyes.
[0074]
The operation performed by the user upon sensing a response abnormality is an operation by which the user confirms the operating state of the microphone on the user side. Specifically, it is an operation such as blowing on the microphone with a “huff”, inputting a sound such as “Ah”, or tapping the microphone. These are operations that a user about to speak into a microphone performs naturally on feeling that something is wrong.
[0075]
As described above, since the natural operation performed by the user for the response abnormality is used as a signal for starting the self-diagnosis function, the user can automatically activate the self-diagnosis function without performing a special operation.
[0076]
Further, according to the present invention, when a response abnormality occurs and self-diagnosis is performed, the voice command input by the user can be fed back to the user as it is, so that the user can learn the state of the voice command that was input. By repeating this several times, the user can learn how to speak so that the voice is recognized properly.
[0077]
In this way, in a system that receives a voice command from a user and provides information corresponding to the voice command mainly by voice, the present invention makes it possible to self-diagnose the cause of a response abnormality and to inform the user of the self-diagnosis result in an easy-to-understand manner. In particular, because the self-diagnosis result is presented to the user by voice, applying the present invention to a life support system for persons with disabilities makes the system even easier for visually impaired persons to use. The system is also easy to use for sighted users who are unfamiliar with this type of system or who often use such an information providing system while doing other work.
[Brief description of the drawings]
FIG. 1 is a diagram showing an overall system configuration for explaining an embodiment of a voice information providing system according to the present invention.
FIG. 2 is a diagram illustrating a schematic configuration of each of a terminal, a host, and a server in the system configuration diagram illustrated in FIG.
FIG. 3 is a diagram illustrating each component in the terminal shown in FIG. 2 in more detail.
FIG. 4 is a flowchart illustrating the self-diagnosis start signal detection processing in the self-diagnosis start signal detection unit shown in FIG. 3.
[Explanation of symbols]
T terminal
H Host (host computer)
N network
S server
1 Voice input processing unit
2 Self-diagnosis processing unit
3 Audio output processing unit
4 Wireless communication unit (terminal T side)
11 Microphone
12 Voice input unit
13 Voice temporary storage unit
14 Voice section detection unit
21 Self-diagnosis start signal detection unit
22 Self-diagnosis unit
23 Error message storage unit
31 Audio output unit
32 Headphones
41 Signal transmission unit
42 Signal reception unit
61 Wireless communication unit (host computer H side)
62 Voice recognition unit
63 Application unit
64 Speech synthesis unit
65 Network communication unit (host computer H side)
71 Network communication unit (server S side)
72 Database

Claims

1. A voice information providing system in which a user inputs a voice command from a terminal possessed by the user, the voice command input from the terminal is recognized by a host computer, and a response to the recognition result of the voice command is transmitted to the terminal, the system having means for notifying the user of the cause of a response abnormality when there is a response abnormality in the response, wherein the means for notifying the user of the cause of the response abnormality includes: self-diagnosis start signal detection means for obtaining a self-diagnosis start signal by detecting an operation performed by the user when the user senses the response abnormality; self-diagnosis means for starting self-diagnosis in response to the self-diagnosis start signal; and notification means for notifying the user of the self-diagnosis result obtained by the self-diagnosis means, and wherein the operation performed by the user to obtain the self-diagnosis start signal is an operation by which the user confirms the operating state of a microphone present in the terminal.

2. The voice information providing system according to claim 1, wherein both a response from the host computer side to the voice command and a notification of the self-diagnosis result to the user are performed by voice.

3. The voice information providing system according to claim 1, wherein the operation for confirming the operating state of the microphone is at least one of an operation in which the user blows into the microphone, an operation in which the user inputs a predetermined sound into the microphone, and an operation in which the user taps the microphone.

4. The voice information providing system according to claim 1, wherein the self-diagnosis start signal detection unit, the self-diagnosis unit, and the notification unit are provided mainly on the terminal side.

5. The voice information providing system according to claim 1, wherein, when there is a response abnormality in a response from the host computer to the voice command, the voice command input by the user is fed back to the user.

6. A malfunction cause notification method for a voice information providing system in which a user inputs a voice command from a terminal possessed by the user, the voice command input from the terminal is recognized by a host computer, and a response to the recognition result of the voice command is transmitted to the terminal, the system having means for notifying the user of the cause of a response abnormality when there is a response abnormality in the response, wherein, to notify the user of the cause of the response abnormality, a self-diagnosis start signal is obtained by detecting an operation performed by the user when the user senses the response abnormality on the host computer side, a self-diagnosis function is activated by the self-diagnosis start signal, self-diagnosis is performed to investigate the cause of the response abnormality, and the self-diagnosis result is notified to the user, and wherein the operation performed by the user to obtain the self-diagnosis start signal is an operation by which the user confirms the operating state of a microphone present in the terminal.

7. The method of notifying a cause of malfunction of an information providing system by voice according to claim 6, wherein a response from the host computer to the voice command and a notification of the self-diagnosis result to the user are both performed by voice.

8. The malfunction cause notification method for a voice information providing system according to claim 6, wherein the operation for confirming the operating state of the microphone is at least one of an operation in which the user blows into the microphone, an operation in which the user inputs a predetermined sound into the microphone, and an operation in which the user taps the microphone.

9. The malfunction cause notification method for a voice information providing system, wherein the function of obtaining the self-diagnosis start signal, the function of performing the self-diagnosis, and the function of notifying the user of a self-diagnosis result are provided mainly on the terminal side.

10. The malfunction cause notification method for a voice information providing system according to claim 6, wherein, when a response from the host computer to the voice command is abnormal, the voice command input by the user is fed back to the user.