JP3760420B2 - Voice response service equipment

Info

Publication number: JP3760420B2
Application number: JP32204495A
Authority: JP
Inventors: 敦子佐藤
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1995-01-11
Filing date: 1995-12-11
Publication date: 2006-03-29
Anticipated expiration: 2015-12-11
Also published as: JPH08251307A

Description

【０００１】
【発明の属する技術分野】
本発明は音声応答サービス装置に関する。本発明は、特に、履歴情報の再生を迅速に行い、テレフォンサービスの利用者の性別や年齢区分などの属性や相手の操作環境に合わせて、音質を変更する音声応答サービス装置に関する。
【０００２】
【従来の技術】
コンピュータにより自動応答する音声応答サービス装置として、テレホンサービスが知られている。
【０００３】
このテレホンサービスは、予め設定された音質（速度、音量、性別あるいはバックグランドミュージックＢＧＭ）により音声を流す。この種の音声応答サービス装置は、テレホンサービスが提供される時に、例えば、特定の年齢層に合わせた標準的な音質に設定する。
【０００４】
このような音声サービス装置は、例えば、特開昭５９ー１８１７６７号公報または特開平３ー１６０８６８号公報に記載される。これらの公報に記載された音声応答サービス装置は、音声サービス装置内の音声認識装置を使用することにより、預金残高照会などのサービスを提供する。
【０００５】
また、特開昭６１ー２３５９４０号公報の装置は、電話機での音声出力レベルを常に一定値になるように出力レベルを補正する。これにより、利用者は音声情報を容易に聴取することができる。
【０００６】
しかし、特開昭６１ー２３５９４０号公報の装置において、出力レベルを補正するために利用者側から音量、性別の指定が必要であった。すなわち、手動により補正が行われるため、作業が大変であった。
【０００７】
また、残りの公報の装置は利用者に合わせた音質によりサービスを提供していなかった。この種の音声応答サービス装置では、通話をする相手にしたがって声を大きくしたり、ゆっくり話すなどによって相手が理解しやすいように微調整を行う必要があった。このため、利用者の年齢、性別により自動的に音質を変更する音声応答サービス装置が要求されていた。
【０００８】
また、前記音声応答サービス装置は、テレホンサービスを利用することにより商品の通信販売を行うことができる。この音声応答サービス装置は、通常では、利用者情報や利用者からの商品入力情報を履歴情報として収集する。
【０００９】
前記音声応答サービス装置において、利用者の発注した商品、数量、金額などの発注トラブルなどによって利用者から問い合わせがあった場合に、過去の履歴情報を追うことにより利用者の操作を確認する作業が必要であった。
【００１０】
さらに、前記音声応答サービス装置は、障害などが発生したとき、履歴情報に基づき人手によって利用者の操作を再生させていた。
【００１１】
【発明が解決しようとする課題】
しかしながら、以上に説明した音声応答サービス装置では、たとえば発注トラブルが生じた場合に、人手によって履歴情報を再生しなければならなかった。このため、多くの時間がかかるだけでなく、人件費も大きくなっていた。また、誤操作箇所を自動的に追跡する必要もあった。
【００１２】
さらに、この種の音声応答サービス装置では、手動によって発注トラブルの原因および音声応答サービスをスムースに行なうためのテストを行っていたので、開発時・保守時のテスト工数がかなりかかっていた。
【００１３】
そこで、本発明の目的は、テレフォンサービスの利用者の性別や年齢区分などの属性や相手の操作環境に合わせて、音質を変更することにより快適性の高いサービスを提供することにある。
【００１４】
また、本発明は、履歴情報から過去のサービスを再生し、容易にそのサービスの内容を確認することにある。
さらに、本発明の目的は、開発時・保守時のテスト工数や人件費を削減することにある。
【００１５】
【課題を解決するための手段】
本発明の音声応答サービス装置は、前記課題を解決するため、以下の手段を採用した。
【００１６】
本発明は、音声応答サービスの処理手順に基づき利用者から入力されるプッシュボタン信号に応じた音声応答サービスを処理する音声応答サービス装置であって、利用者単位に、前記音声応答サービスの処理手順に応じて該利用者から入力されるプッシュボタン信号情報を含む該音声応答サービスのログ情報を時系列に記憶するログ情報記憶手段と、前記ログ情報記憶手段に記憶されたログ情報の中から特定の前記プッシュボタン信号情報を元にログ情報を抽出するログ情報抽出手段と、前記音声応答サービスの処理手順に基づき前記ログ情報抽出手段が抽出したログ情報のプッシュボタン信号情報に応じて音声応答サービスを再生する音声応答サービス再生手段とを備えることを特徴とする音声応答サービス装置についてのものである。
【００１７】
また、本発明は、前記ログ情報記憶手段に記憶するログ情報には、電話回線の接続時間又は電話回線の切断時間の少なくとも一方の時間情報を含み、前記ログ情報抽出手段が、前記ログ情報記憶手段に記憶された時間情報を元に再生開始位置を判定する。
【００１８】
また、本発明は、前記ログ情報記憶手段には、前記プッシュボタン信号情報を文字情報として記憶し、前記プッシュボタン信号の文字情報を音声に合成する音声合成処理手段を更に備え、前記音声応答サービス再生手段が、前記ログ情報抽出手段が抽出したログ情報のプッシュボタン信号情報について前記音声合成処理手段で合成した音声を含む音声応答サービスを再生する。
また、本発明は、前記ログ情報記憶手段に記憶するログ情報のプッシュボタン信号情報には、前記音声応答サービスの処理手順に応じて利用者から入力される該利用者の利用者番号を含み、前記ログ情報抽出手段が、前記ログ情報記憶手段に記憶されたログ情報の中から、該当する利用者番号を含むログ情報を特定して再生したいログ情報を抽出する。
【００１９】
音声合成処理手段は、たとえば文字列より音声を合成する音声規則合成方式により音声合成を行う。この音声合成方式には、この他に音声信号の特徴を利用して音声波形を符号化する波形符号化方式、音声の生成モデルにしたがって音声信号を符号化する分析合成方式などがある。
【００２３】
本発明において、前記装置は、電話回線を制御するとともに利用者からのプッシュボタン信号及び回線接続通知情報を前記音声応答サービス再生手段に出力する回線制御部を含んでもよい。
【００２４】
本発明において、前記回線制御部からのプッシュボタン信号及び回線接続通知情報を前記音声応答サービス再生手段に出力する場合には、前記音声応答サービス再生手段を前記回線制御部に接続し、前記ログ情報抽出手段により抽出されたログ情報を再生する場合には、前記音声応答サービス再生手段を前記ログ情報抽出手段に接続する切替え部を前記装置に備えてもよい。
【００２５】
本発明において、前記音声応答サービス再生手段にナレーション情報を送出するナレーション情報送出部を備えてもよい。本発明において、前記音声応答サービス再生手段は、前記ナレーション情報送出部からのナレーション情報に従ってサービスを実行し、前記プッシュボタン信号の値に従ったサービスを利用者に提供する。
【００２６】
本発明において、前記音声応答サービス再生手段が音声応答サービスを実行するとき、前記ナレーション情報送出部からのナレーション情報をログ情報として前記ログ情報記憶手段に書き込むログ情報書き込み部を備えるようにしてもよい。
【００２７】
本発明において、前記音声応答サービス再生手段が音声応答サービスを実行するとき、前記ナレーション情報送出部からのナレーション情報を文字情報に変換し変換された文字情報をファイル形式のテキストとして書き込むテキストファイル部を備えてもよい。
【００２８】
本発明において、前記音声合成処理手段は、前記テキストファイル部に書き込まれた文字情報を音声情報に変換してもよい。本発明において、前記音声合成処理手段は、前記ナレーション情報送出部からのナレーション情報を音声情報に変換してもよい。前記音声合成処理手段により変換された音声情報を記憶する蓄積音ファイル部を含んでもよい。
【００２９】
本発明において、前記ログ情報記憶手段にログ情報を書き込むログ情報書き込み部を備える。前記音声応答サービス再生手段は前記ログ情報書き込み部により前記ログ情報記憶手段に書き込まれたログ情報を再生してもよい。
【００３０】
本発明において、前記音声合成処理手段は、疑似回線を制御する疑似回線制御部に接続され、前記疑似回線制御部を通して前記変換された音声情報をスピーカに送出してもよい。
【００３４】
なお、前記以上説明された各々発明を適宜組み合わせてもよい。本発明によれば、ログ情報記憶手段は音声応答により利用者との対話形式で行われる複数の処理内容を時間的な推移の記録を示すログ情報として記憶するとともに前記複数の処理内容の各々を識別するための識別情報を前記音声応答の内容の一部分として記憶する。音声応答サービス再生手段は、識別情報に対応するログ情報の対話形式の処理内容を前記ログ情報記憶部から読み出し読み出された処理内容を音声に再生する。
【００３５】
これにより、自動的にログ情報から過去のサービスを再生し、容易にそのサービスの内容を確認することができる。
また、ログ情報に含まれる電話回線の接続、切断の時間情報により処理内容を特定することができる。
【００３６】
また、電話回線を介して利用者から受信するプッシュボタン信号が文字情報としてログ情報記憶手段に記憶される。したがって、記憶されたプッシュボタン信号によりログ情報を再生できる。
【００３７】
また、音声合成処理手段は、前記ログ情報記憶手段に記憶された文字情報から応答すべき音声を合成する。
【００３９】
また、ログ情報抽出手段が、前記ログ情報記憶手段に記憶されたログ情報の中から特定の前記プッシュボタン信号情報を元にログ情報を抽出すると、音声応答サービス再生手段は前記ログ情報抽出手段により抽出されたログ情報のプッシュボタン信号情報に応じて音声応答サービスを再生する。前記音声合成処理手段は前記音声応答サービス再生手段から送られてくるログ情報から音声を合成する。これにより、ログ情報から特定の処理内容を再生することができる。
【００４０】
また、回線制御部は電話回線を制御するとともに利用者からのプッシュボタン信号及び回線接続通知情報を前記音声応答サービス再生手段に出力する。利用者と音声応答サービス装置との間の対話がスムースに行われる。
【００４１】
また、切換部は前記音声応答サービス再生手段を前記回線制御部に接続するので、通常のサービスが提供される。切換部は前記音声応答サービス再生手段を前記ログ情報抽出手段に接続するので、ログ情報から過去のサービスを再生することができる。
【００４２】
また、音声応答サービス再生手段は、ナレーション情報送出部からのナレーション情報に従って、サービスを実行することができる。また、前記音声応答サービス再生手段は、前記ナレーション情報送出部からのナレーション情報に従ってサービスを実行し、前記プッシュボタン信号の値に従ったサービスを利用者に提供する。これにより、利用者と音声応答サービス装置との間の対話をナレーション情報に従って行えるので、利用者の負担が軽減できる。
【００４３】
また、前記音声応答サービス再生手段が音声応答サービスを実行するとき、ログ情報書き込み部は前記ナレーション情報送出部からのナレーション情報をログ情報として前記ログ情報記憶手段に書き込む。これにより、たとえば発注情報に誤りがある場合には、ログ情報をログ情報記憶手段から読み出して誤操作箇所の追跡が行える。
【００４４】
また、テキストファイル部はナレーション情報送出部からのナレーション情報を文字情報に変換してファイル形式のテキストとして書き込むので、たとえば音声規則合成により文字情報を音声情報に合成することができる。
【００４５】
また、音声合成処理手段はナレーション情報送出部からのナレーション情報を音声情報に変換し、蓄積音ファイル部は変換された音声情報を書き込むことができる。
【００４６】
また、音声応答サービス再生手段はログ情報書き込み部によりログ情報記憶手段に書き込まれたログ情報を再生する。これにより、障害が発生した場合に行う再現テストのテスト時間を短縮することができる。
【００４７】
また、音声合成処理手段は疑似回線制御部を介して音声情報をスピーカに送出する。その結果、スピーカの出力により利用者にログ情報の誤りが音声で認識される。
【００５２】
【発明の実施の形態】
以下、本発明の音声応答サービス装置の実施の形態を図面を参照して詳細に説明する。
＜実施の形態１＞
図１は本発明の音声応答サービス装置を含む音声応答サービスシステムの実施の形態１を示すブロック図である。
【００５３】
実施の形態１のシステムは、利用者とコンピュータとの対話処理を履歴情報格納部から自動再生して商品の受注業務、予約業務、資料請求業務および銀行振込業務などの時間外業務に適用することにより、人件費を削減する。
【００５４】
図１において、音声応答サービスシステムは、音声応答サービス装置として動作するコンピュータＣと、このコンピュータＣに接続され利用者が利用する電話機側Ｕから構成される。音声応答サービスシステムは、テレホンサービス業務に用いられる。
【００５５】
前記音声応答サービスシステムは、たとえば、家庭、オフィスなどの電話機側ＵによりコンピュータＣにプッシュボタン信号を出力し、コンピュータＣと電話機側Ｕとの間で対話処理を行うことにより前記各種の業務を行う。
【００５６】
前記コンピュータＣは点線で囲まれる音声応答サービス部１０と、履歴情報抽出部１２、疑似回線制御部１９、スピーカ３３を備える。
前記音声応答サービス部１０は、通常時にテレフォンサービスを行うときに動作する。前記音声応答サービス部１０は、ナレーションストーリ出力部１５、履歴情報格納部１１、切換部２０、回線制御部１４、サービス実行部１７、音声合成処理部１８を備える。
【００５７】
図５はサービス内容を記述した図である。ここでは、サービス内容は例えば、商品の受注業務である。このサービス内容はナレーションストーリと呼ばれる。前記ナレーションストーリ出力部１５はナレーションストーリを出力する。
【００５８】
図２は履歴情報格納部１１に格納される履歴情報を示す図である。図５のナレーションストーリの通りに音声応答サービスが行われたとき、図２に記録される履歴情報の一例が示される。履歴情報格納部１１は、時間とともにサービス内容を記録する。
【００５９】
すなわち、履歴情報格納部１１は、時間Ａに利用者が商品受注システムに適用した電話回線の接続から電話回線の切断までに提供するサービスの開始からサービスの終了までのサービスの内容の時間的推移を表すログ情報（履歴情報）を格納する。履歴情報格納部１１は、電話回線の接続時間、切断時間、コンピュータがナレーションを読み始めた時間、利用者がプッシュボタンを入力した時間などの時間情報を記録する。
【００６０】
図３と図４は利用者側（Ｕ）とコンピュータ側（Ｃ）との動作シーケンスを示す図である。図３は図５のナレーションストーリ通りにサービスが実行されるシーケンスを示す図である。図４はサービスの実行途中で障害が発生した例を示す図である。
【００６１】
前記履歴情報抽出部１２は、前記履歴情報格納部１１に接続され、サービスの再生を行うときに動作する。前記履歴情報抽出部１２は、履歴情報格納部１１に格納された履歴情報の中から、利用者が入力したプッシュホンデータを抽出する。
履歴情報抽出部１２には切換部２０が接続される。
【００６２】
切換部２０は接点からなるもので、回線制御部１４をサービス実行部１７に接続するとき、すなわち、通常のサービスが提供されるとき、利用者が入力するプッシュボタン信号を電話機３０からサービス実行部１７に読み込み動作する。
【００６３】
切換部２０は、履歴情報抽出部１２をサービス実行部１７に接続するとき、すなわち、過去のサービスを再生するとき、利用者が入力したプッシュホンデータを履歴情報格納部１１から履歴情報抽出部１２を介してサービス実行部１７に送る。サービス実行部１７は履歴情報抽出部１２からの過去のサービスを実施する。音声合成処理部１８はサービス実行部１７からの過去に行ったサービスを音声で再生する。
【００６４】
前記音声合成処理部１８は、図６に示される音声合成方式の中で、たとえば文字列より音声を合成する規則合成方式によって発音を表す文字列より音声を合成する。音声応答サービスシステムで使用する音声合成方式には図６に示されるように、波形符号化方式、分析合成化方式、規則合成方式などがある。
【００６５】
規則合成方式は、図７および図８において説明される。規則合成方式は、かな漢字混じりの文字列を文章解析し、その解析された文字列からなる信号波形を合成することにより、音声に変換する方式である。この規則合成方式は、他の方式に比べて情報量が少なくて済むので、出力語数を無限にすることができ、音声応答サービスシステムに好適な方式である。
【００６６】
波形符号化方式とは、読み上げたい単語をあらかじめ録音しておき、言葉同士をつなげることにより、合成する方式である。分析合成化方式とは、音声の生成モデルに基づいて音声信号を符号化する音声合成方式であり、これらの方式でも音声を合成することが可能である。
【００６７】
テレフォンサービスでは、音声合成方式による発音のほかに、蓄積音の再生による音声も使用している。蓄積音声は、一般的に規則合成音よりも品質が良いが、内容の変更には録音作業が必要であるため、運用性が悪くなる。この蓄積音声は、不変的な内容の読み上げに使用するのに良い。
【００６８】
一方、規則合成音は、一過性の文章、不定型文章、一部可変文書などの不定型の音声を再生する際に好適である。
図７は音声合成方式を採用した音声合成処理部の具体的な構成例を示す図である。図７において、前記音声合成処理部１８は、言語処理部３１、音響処理部３２とから構成される。言語処理部３１には、文章解析部３４、単語辞書３５、読み・韻律記号付与部３６が設けられている。
【００６９】
文章解析部３４は入力した文章を解析単位に分割し、分割された文章を単語辞書３５に記憶された単語と照合することにより単語分割を行い、単語毎の読み、アクセント型、文法情報を設定する。読み・韻律記号付与部３６は文章解析部３４から得られた情報に基づき読みや韻律記号を付加する。韻律記号は例えばポーズ、文節のアクセント、イントネーションなどである。
【００７０】
この言語処理部３１では、読み・韻律記号付与部３６が読みや韻律記号を付与すると、音響処理部３２は抑揚を付加して波形合成などにより音響を出力する。音響処理部３２は、抑揚生成部３７、波形合成部３８および音素データベースである音素片部３９から構成されている。抑揚生成部３７は前記読み、韻律記号に基づき音素毎の時間長と、声の高さを表す抑揚パターンを生成する。波形合成部３８は音素片部３９に格納された音素片データを読み出し、前記音素時間長、抑揚パターンに従って音素片データをなめらかに接続して音声波形を合成する。
【００７１】
この音響処理部３２は、波形合成部３８から波形合成された音声をスピーカ３３に出力し、スピーカ３３はたとえば「コレワ、オンセーゴーセーデス」を出力する。
【００７２】
図８は図７に示す音声合成方式によって音声合成を行う具体的例を示す図である。たとえば、図８に示す「これは、音声合成です」という文章を例として、音声合成を説明する。先ず、文章解析部３４は前記文章を入力し単語分割を行う。文章解析部３４は文章を「これ／は／、音声／合成／で／す。」のように単語毎に分割する。
【００７３】
次に、文章解析部３４は文節の設定を行う。文章解析部３４は「これは／、音声合成です。」のように文節を設定する。
読み・韻律記号付与部３６は、文章解析部３４の結果に基づき、読みを付与し、ポーズ、文節のアクセント、イントネーションを表す韻律符号を付与する。読みは「コレ／ワ／、オ^,ンセー／ゴーセー／デス／。」となる。韻律符号は、「コレワ、オンセーゴ^,ーセーデス。」となる。
【００７４】
次に、音響処理部３２において、抑揚生成部３７は音の高さを示す抑揚パターンを生成し、波形合成部３８は音素片部３９から音素片データを読み出し音素時間長情報と抑揚パターン情報に従って音素片データを接続して音声波形を合成する。
【００７５】
以上の処理を行うことにより、音声データが音響処理部３２から出力される。
（通常サービス時の動作説明）
次に、このように構成された音声応答サービスシステムの実施の形態１の通常サービス時における動作を説明する。
【００７６】
まず、切換部２０が回線制御部１４をサービス実行部１７に接続する。ナレーションストーリ出力部１５が、ナレーションストーリをナレーションストーリ解析部１６に出力すると、ナレーションストーリ解析部１６は前記ナレーションストーリを解析する。
【００７７】
サービス実行部１７は、解析されたナレーションストーリにしたがって、音声合成処理部１８への音声出力指示、履歴情報書き込み部２７への指示、回線制御部１４へのプッシュボタン読み込み指示、読み込んだ結果での処理の分岐、データベースアクセス部２２へのアクセス指示などを行う。利用者が入力するプッシュボタン信号は電話機３０からサービス実行部１７に読み込まれる。
【００７８】
音声の出力については、ナレーションストーリにしたがって、蓄積音ファイル２６、テキストファイル２５等を用いてサービスを実行する。
データベース２３は、利用者に関する情報が書き込まれている利用者データベース、商品情報が書き込まれている商品データベースあるいは受注情報が書き込まれている受注データベースからなる。
【００７９】
データベースアクセス部２２は、データベース２３の書き込み読み出し処理を行う。テキストファイル２５は予めテキスト情報を格納し、蓄積音声ファイル２６は蓄積音声情報を格納する。履歴情報書き込み部２７は、サービス実行部１７で指示されたサービス内容を履歴情報格納部１１に書き込む。音声合成処理部１８は、既に図６から図８において説明された音声合成方式によって音声を合成する。
【００８０】
なお、切換部５０により音声合成処理部１８からの音声データ出力は、回線制御部１４を介して電話機３０に出力することもできる。このとき、電話回線に接続せずに音声応答サービスの内容をモデル化し、それらのモデルを解析することができる。
【００８１】
また、音声合成処理部１８からの音声データは、疑似回線制御部１９を制御してスピーカ３３を鳴動させることもできる。作業者はサービス内容を音声データとして聞くことができる。
＜ナレーションストーリ出力部＞
次に、ナレーションストーリ出力部１５を、図５に示す手順番号１から手順番号２３までの手順にしたがって詳細に説明する。
【００８２】
このナレーションストーリ出力部１５では、ナレーションの内容が手順番号１から手順番号２３までの手順にしたがって決められている。
すなわち、手順番号１では、サービスの概要に蓄積音声ファイル「open.pcm」が指定されているので、図１の蓄積音ファイル２６から「open.pcm」を読み出し、「open.pcm」が音声応答サービス部１０より読み上げられる。
【００８３】
手順番号２では、ナレーションストーリ出力部１５の利用者番号の入力ナレーションである、たとえば「利用者番号を入力してください」が音声応答サービス部１０より読み上げられる。
【００８４】
手順番号３では、利用者側で利用者番号を電話機３０のプッシュボタンより入力すると、そのプッシュボタンの内容が読み込まれ、利用者番号が音声応答サービス部１０に入力される。
【００８５】
手順番号４では、音声応答サービス部１０は利用者の照会があると、データベース２３の利用者データベースをアクセスさせる。
手順番号５では、手順番号４でのデータベース２３を照会した結果を、ナレーションストーリ出力部１５の利用者番号および利用者番号の確認である「××番、〇〇様ですね？」というナレーションとして、音声応答サービス部１０より読み上げる。
【００８６】
手順番号６では、利用者からの確認結果に基づいて音声応答サービス部１０は処理を分岐する。この手順番号６では、たとえば「１」を利用者が電話機３０のプッシュボタンで選択したとき、処理は手順番号７の処理に進む。また、たとえば「９」を利用者が電話機３０のプッシュボタンで選択したとき、手順番号２の処理を再び実行する。
【００８７】
手順番号７では、ナレーションストーリ出力部１５の商品番号の入力案内ナレーションである「商品番号を入力してください」というナレーションを音声応答サービス部１０より読み上げる。
【００８８】
手順番号８では、たとえばプッシュボタンにより商品番号を音声応答サービス部１０に入力すると、音声応答サービス部１０は前記商品番号を読み込む。
手順番号９では、音声応答サービス部１０は商品の照会があると、データベース２３の商品データベースをアクセスさせる。
【００８９】
手順番号１０では、手順番号９でのデータベースを照会した結果を、ナレーションストーリ出力部１５の商品番号確認ナレーションである「××番、〇〇様ですね？」というナレーションとして、音声応答サービス部１０より読み上げる。
【００９０】
手順番号１１では、利用者からの確認結果に基づいて音声応答サービス部１０は処理を分岐する。
この手順番号１１では、たとえば「１」を利用者が電話機３０のプッシュボタンで選択したとき、処理は手順番号１２の処理に進む。
【００９１】
またたとえば「９」を利用者が電話機３０のプッシュボタンで選択したとき、手順番号７の処理を再び実行する。
手順番号１２では、ナレーションストーリ出力部１５の個数入力ナレーションである、「個数を入力してください」というナレーションを音声応答サービス部１０より読み上げる。
【００９２】
手順番号１３では、たとえばプッシュボタンにより個数を音声応答サービス部１０に入力すると、音声応答サービス部１０は個数を読み込む。
手順番号１４では、ナレーションストーリ出力部１５で入力完了確認ナレーションである、「以上でよろしいですか」というナレーションを音声応答サービス部１０より読み上げる。
【００９３】
手順番号１５では、利用者からの確認結果に基づいて音声応答サービス部１０は処理を分岐する。
この手順番号１５では、たとえば「１」を利用者が電話機３０のプッシュボタンで選択したとき、処理は手順番号１６の処理に進む。
【００９４】
またたとえば「９」を利用者が電話機３０のプッシュボタンで選択したとき、手順番号７の処理を再び実行する。
手順番号１６では、ナレーションストーリ出力部１５の入力確認ナレーションである「以上でよろしいですか」を音声応答サービス部１０から読み上げる。
【００９５】
手順番号１７では、利用者からの確認結果に基づいて音声応答サービス部１０は処理を分岐する。
たとえば、「１」を利用者が電話機３０のプッシュボタンで選択したとき、処理は手順番号１８の処理に進む。
【００９６】
またたとえば「９」を利用者が電話機３０のプッシュボタンで選択したとき、手順番号７の処理を行う。
手順番号１８では、ナレーションストーリ出力部１５で発注内容繰り返しナレーションにテキストファイル「order.txt」が指定されているので、テキストファイル２５から読み出したナレーションを音声応答サービス部１０より読み上げる。
【００９７】
手順番号１９では、ナレーションストーリ出力部１５の発注確認ナレーションである「以上でよろしいですか」というナレーションを音声応答サービス部１０より読み上げる。
【００９８】
このとき、手順番号２０では、利用者からの確認結果に基づいて音声応答サービス部１０は処理を分岐する。
この手順番号２０では、たとえば「１」を利用者が電話機３０のプッシュボタンで選択したとき、手順番号２１の処理を行う。
【００９９】
またたとえば「９」を利用者が電話機３０のプッシュボタンで選択したとき、手順番号２３の処理を実行する。
手順番号２１では、ナレーションストーリ出力部１５の発注番号ナレーションである「発注番号は〇〇番ですね？」というナレーションを音声応答サービス部１０より読み上げる。
【０１００】
手順番号２２では、発注処理を行う。受注した内容をデータベース２３の発注データベースに書き込む。
手順番号２３では、ナレーションストーリ出力部１５のサービスの終了ナレーションに、蓄積音声である「close.pcm」が指定されているので、蓄積音ファイル２６から読み出し音声応答サービス部１０より読み上げられる。
【０１０１】
以上に説明した図５の音声応答サービス装置によれば、利用者側Ｕの電話機３０とコンピュータＣの音声応答サービス部１０との処理を対話形式で確実に行うことができる。
（図３のシーケンスの説明）
以上に説明した図１のナレーションストーリ出力部１５を用いて図３の処理を説明する。
【０１０２】
たとえば、利用者Ｕ側では、利用者番号「６５１１２３」が入力１０３として電話機３０のプッシュボタンにより入力されると、コンピュータＣ側に利用者番号「６５１１２３」が送信される。
コンピュータＣ側に送信された利用者番号「６５１１２３」である送信情報はコンピュータＣ側の回線制御部１４によって確認された後に、ナレーションストーリ出力部１５からナレーションストーリ１０４が出力される。そして、利用者番号・利用者名の確認案内１０４が利用者Ｕ側の電話機３０に送信される。
【０１０３】
一方、コンピュータＣ側では、図３に示すように利用者Ｕ側から送られた利用者番号・利用者名の確認ｏｋを回線制御部１４が確認した後に、コンピュータＣ側から利用者Ｕ側の電話機３０に商品番号の入力案内指示を出力するためにナレーションストーリ出力部１５からナレーションストーリ１０６が出力される。
【０１０４】
ナレーションストーリ出力部１５は、ナレーションストーリをナレーションストーリ解析部１６に出力し、ナレーションストーリ解析部１６はナレーションストーリを解析する。
【０１０５】
サービス実行部１７は、ナレーションストーリ解析部１６で解析されたナレーションストーリにしたがってデータベース２３あるいはファイル２５、２６を用いてサービスを実行する。
（発注時の処理）
また、利用者Ｕ側の電話機３０は、コンピュータＣ側のナレーションストーリ出力部１５から送られた商品番号入力指示の案内１０６にしたがって、利用者が商品番号「３２１」を電話機３０のプッシュボタンより入力する。この商品番号「３２１」は、特定の商品を表すコード情報である。
【０１０６】
利用者ＵがコンピュータＣ側に対して商品番号「３２１」が電話機３０より入力１０７として入力したときには、コンピュータＣ側の回線制御部１４はその商品番号「３２１」を確認し、さらに、商品番号「３２１」をデータベース２３の商品データベースにあることを確認する。そして、商品番号「３２１」の確認指示案内がコンピュータＣ側から利用者Ｕ側の電話機３０に送信１０８として送信される。
【０１０７】
利用者Ｕ側の電話機３０は、商品番号「３２１」の確認指示案内をコンピュータＣ側から受信したとき、自己が発注した商品が、商品番号「３２１」で表された商品に相違ないことを示す確認情報「１」が、電話機３０のプッシュボタンより入力される。
【０１０８】
次に、商品の個数入力案内１１０がコンピュータＣ側から利用者Ｕ側の電話機３０に送信される。
個数入力案内１１０を受けた利用者Ｕ側の電話機３０は、発注すべき商品の個数、たとえば「３」を電話機３０のプッシュボタンより入力１１１として入力する。
【０１０９】
利用者Ｕ側からコンピュータＣ側に発注すべき商品の個数情報が送信されたときには、コンピュータＣ側はデータベース２３の商品データベースに商品の在庫があることを確認して、その商品の個数確認情報１１２がコンピュータＣ側から利用者Ｕ側の電話機３０に送信される。
【０１１０】
このとき、利用者Ｕ側の電話機３０は、個数確認済み情報「１」を電話機３０のプッシュボタンより入力１１４として入力する。
コンピュータＣ側から入力終了確認案内１１４が利用者Ｕ側の電話機３０に送信されると、入力完了の意志、たとえば「１」なら終了指示が電話機３０のプッシュボタンより入力１１５として入力され、その終了指示が利用者Ｕ側からコンピュータＣ側に送信される。
【０１１１】
以上の応答指示がコンピュータＣ側と利用者Ｕ側の電話機３０間で行われると、発注の意志を確認するため、ナレーションストーリ出力部１５の注文内容の確認案内１１６、発注確認案内１１７がコンピュータＣ側から利用者Ｕ側の電話機３０に送信される。
【０１１２】
これらの案内に対して利用者Ｕ側の電話機３０において、発注処理「１」を利用者が電話機３０のプッシュボタンより入力１１８として入力する。
発注処理情報「１」が利用者Ｕ側の電話機３０からコンピュータＣ側に送信されたことをコンピュータＣ側が確認したときには、発注番号案内１１９、サービス終了案内１２０はコンピュータＣ側から利用者Ｕ側に送信される。そして、コンピュータＣ側は利用者Ｕ側の電話機３０の電話回線から切断１２１として切断される。
【０１１３】
以上により、実施の形態１では、利用者Ｕ側とコンピュータＣ側との間の対話形式によって、たとえば商品発注業務を行うとき、履歴情報書き込み部２７が図２に示すログ情報を図１に示した履歴情報格納部１１の特定の書き込み領域に書き込むことができる。また、履歴情報抽出部１２は、その書き込まれたログ情報を適宜読み出す。これにより、発注情報に誤りがある場合などに発注業務を円滑に行うことができる。
（コンピュータ側のトラブル時の処理）
次に、コンピュータＣ側にトラブルが発生したときの処理を説明する。図４のシーケンス図は発注トラブル時における処理を示す図である。図３と同じサービスの概要説明１０１から利用者の確認指示１０５までの処理は、詳しい説明を省略する。
【０１１４】
また図４の説明においては、図１から図３までの符号を参照して説明する。実施の形態１では、利用者が確認ＯＫ１０５の「１」を電話機３０のプッシュボタンから入力した後に、コンピュータＣ側のソフトウェアやデータベース２３、テキストファイル２５、蓄積音声ファイル２６などに障害が発生したとする。
【０１１５】
この場合には、障害発生１３０がコンピュータＣ側にあったことを利用者に知らせるために、コンピュータＣは障害発生信号を利用者の電話機３０に送信すると、電話回線が切断１２１として切断される。
【０１１６】
このように、サービス中に障害が発生した場合に、再現試験、あるいは障害修正後における確認テストが行われる。この場合には、図１の履歴情報格納部１１に格納された履歴情報が再生される。これにより、コンピュータＣ側に障害原因があることを利用者に認識させることができる。また、自動的にテスティングが行えるので、テスト時間が短縮される。
（音声応答サービスの再現）
次に、音声応答サービスの再現を説明する。前述したように、通常のサービスが提供されるときには、利用者Ｕが電話機３０のプッシュボタンから入力した信号が回線制御部１４により認識され、そのプッシュボタン信号はサービス実行部１７に読み込まれる。
【０１１７】
一方、音声応答サービスを再生する場合には、切換部２０によりサービス実行部１７が履歴情報抽出部１２に接続される。履歴情報抽出部１２は履歴情報格納部１１から利用者が過去に利用したサービスを抽出（特定）し、履歴情報格納部１１に記録されたプッシュボタンから入力された数値をプッシュボタンの読み込み値として、サービス実行部１７に送る。
【０１１８】
音声合成処理部１８はサービス実行部１７を介して履歴情報抽出部１２により抽出された履歴情報、すなわち、過去の音声応答サービスを再生する。このようにして実施の形態１では、疑似的に音声応答サービスを実行することにより、音声応答サービスを再生することができる。
【０１１９】
このときの動作を図１、図２および図５の説明を参照して説明する。例として、１９９４年９月６日にサービスを利用した利用者番号６５１１２３のサービスの内容を再生する。
【０１２０】
このときには、音声合成処理部１８によって再生された音声が回線制御部１４により、利用者Ｕの電話機３０に送られる。利用者が過去のサービスの内容を聞くことができる。
【０１２１】
履歴情報抽出部１２は、履歴情報格納部１１により格納された図２の履歴情報の中から、再生したい利用者の履歴情報を特定する。履歴情報の特定方法については、図１０を用いて（利用者履歴の特定をするための動作）の欄にて詳細に説明する。ここでは、図２に示される履歴情報の中、例えば、履歴情報Ａが特定される。
【０１２２】
また、切換部２０は、サービス実行部１７に履歴情報抽出部１２を接続し、履歴情報抽出部１２を起動させることにより、図５のサービスストーリにしたがって再生が開始される。
【０１２３】
ナレーションストーリ出力部１５の動作については先に＜ナレーションストーリ出力部＞にて詳細に説明してあるので、ここでは説明を省略する。
図５の手順番号１および手順番号２では、通常の通りに音声応答サービス部１０からサービスの概要説明、利用者番号の入力が読み上げられる。
【０１２４】
手順番号３の利用者番号の入力では、サービス実行部１７は履歴情報抽出部１２にプッシュボタンの読み込み指示を行う。履歴情報抽出部１２は図２に示した履歴情報の利用者番号「６５１１２３」１０３を読み込み、音声合成処理部１８は「ロクゴイチイチニサン」を合成することにより音声データを得る。音声データは回線制御部１４により利用者Ｕの電話機３０に送られ、利用者に利用者番号が伝えられる。
【０１２５】
手順番号５は、通常の通りの動作となる。手順番号６では、利用者からの結果に基づいて処理が分岐されるため、サービス実行部１７は履歴情報抽出部１２にプッシュボタンの読み込み指示を行う。履歴情報抽出部１２は利用者確認「１」１０５を読み込み、音声合成処理部１８は「イチ」を合成することにより音声データを得る。音声データは回線制御部１４により、利用者Ｕの電話機３０に送られ、利用者に確認結果が伝えられ、処理は手順番号７へ分岐される。
【０１２６】
以降の処理の説明を省略するが、回線切断１２１までのサービスを行うことにより、過去のサービスを再現することができる。
なお、切換部５０により音声合成処理部１８が疑似回線制御部１９に接続された場合には、音声合成処理部１８からの音声データは疑似回線制御部１９を介してスピーカ３３に出力してもよい。
【０１２７】
また、この例では、音声合成処理部１８はプッシュボタンの番号を音声に変換してその番号を利用者に伝えた。例えば、必要に応じてプッシュボタンの番号を音声合成処理部１８に送らないことにより、その番号を利用者に伝えないようにすることも可能である。
（利用者履歴を特定するための動作）
次に、利用者履歴を特定するための動作を詳細に説明する。図９は履歴ファイルから利用者の再生したい記録を特定するための処理を示すフローチャートである。利用者履歴を特定するための動作は、履歴情報抽出部１２の履歴情報の抽出処理によって行われる。
【０１２８】
先ず、履歴情報格納部１１内の利用年月日の履歴ファイルをオープンする（ＳＴ４０）。ここでは、図２の履歴ファイルを例示して説明する。この履歴ファイルの終了まで以下の処理を行う。
【０１２９】
先ず、履歴ファイルから１行を読み込む（ＳＴ４２）。たとえば、ＳＴ４２において読み込んだ行に、「回線接続」が記録されている場合には、再生開始位置の可能性がある。このため、行番号１を履歴情報格納部１１に記録する（ＳＴ４３，ＳＴ４４）。
【０１３０】
ＳＴ４２において読み込んだ行に、「回線切断」が記録されている場合（ＳＴ４５）で、利用者履歴が見つかっている場合には（ＳＴ４６）、行番号１を再生開始位置と断定する（ＳＴ４７）。
【０１３１】
ＳＴ４２において読み込んだ行に、「利用者番号入力ナレーション」が記録されている場合（ＳＴ４８）には、次の行からのプッシュボタン入力数値を利用者番号とする（ＳＴ４９）。
【０１３２】
ここで、再生したい利用者番号とＳＴ４９の利用者番号が一致するとき（ＳＴ５０）には、該当する利用者履歴が見つかったと判断する（ＳＴ５１）。
上記処理を履歴ファイルの終了するまで行ったあと、利用年月日の履歴ファイルをクローズする。これにより、利用者履歴の特定を行うことができる。
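The scan of the history file described in steps ST40 to ST51 can be sketched roughly as below. This is only an illustration under stated assumptions: the line layout (`time EVENT value`) and the event names `LINE_CONNECT`, `LINE_DISCONNECT`, `NARRATION_USER_NUMBER` and `PUSH_BUTTON` are hypothetical, since the actual format of the history file in Fig. 2 is not reproduced in the text. The sketch also assumes that a line-connection record marks the candidate start position and that a later disconnection record confirms it once the wanted user number has been seen.

```python
from typing import Iterable, Optional

# Hypothetical event names; the real history file layout (Fig. 2) is not given in the text.
CONNECT = "LINE_CONNECT"
DISCONNECT = "LINE_DISCONNECT"
USER_PROMPT = "NARRATION_USER_NUMBER"
PUSH_BUTTON = "PUSH_BUTTON"


def find_replay_start(lines: Iterable[str], wanted_user: str) -> Optional[int]:
    """Return the 0-based index of the line where the wanted user's session starts."""
    candidate = None            # last line-connection record seen (ST43/ST44)
    found = False               # set once the wanted user number is seen (ST51)
    expect_user_number = False  # set after the user-number narration (ST48)
    for index, line in enumerate(lines):              # ST42: read one line
        parts = line.split(maxsplit=2)
        event = parts[1] if len(parts) > 1 else ""
        value = parts[2].strip() if len(parts) > 2 else ""
        if event == CONNECT:
            candidate, found, expect_user_number = index, False, False
        elif event == DISCONNECT:                     # ST45
            if found:                                 # ST46
                return candidate                      # ST47: start position fixed
        elif event == USER_PROMPT:                    # ST48
            expect_user_number = True
        elif event == PUSH_BUTTON and expect_user_number:
            expect_user_number = False                # ST49: this value is the user number
            if value == wanted_user:                  # ST50
                found = True                          # ST51
    return None


history = [
    "09:00:00 LINE_CONNECT",
    "09:00:05 NARRATION_USER_NUMBER",
    "09:00:12 PUSH_BUTTON 111111",
    "09:01:00 LINE_DISCONNECT",
    "10:15:00 LINE_CONNECT",
    "10:15:04 NARRATION_USER_NUMBER",
    "10:15:11 PUSH_BUTTON 651123",
    "10:15:15 PUSH_BUTTON 1",
    "10:16:40 LINE_DISCONNECT",
]
print(find_replay_start(history, "651123"))
```

Resetting the flags on every connection record keeps one caller's user number from being attributed to the next call, a detail the flowchart description leaves implicit.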
【０１３３】
このように、実施の形態１では、利用者Ｕ側とコンピュータＣ側での対話形式で、たとえば商品発注業務を行うとき、図２に示すログ情報を履歴情報格納部１１の特定の書き込み領域に書き込みできる。その書き込まれたログ情報を適宜読み出すことにより、発注情報に誤りがある場合などに発注業務を円滑に行うことができる。
【０１３４】
また、テレホンサービスなどの音声応答サービスにおける利用者Ｕ側とコンピュータＣ側とのやり取りをログ情報として再生しそのログ情報を利用者に聞かせることにより、利用者のプッシュボタン入力時におけるプッシュボタンの押し間違えなどによる誤操作箇所を追跡できる。
【０１３５】
例えば、履歴情報格納部１１に電話回線の接続、切断の時間、利用者の利用者情報を読み始めた時間および利用者がプッシュボタンから入力を行った時間などの時系列的な情報などが書き込める。このため、利用者の入力ミスなどがある場合に前記情報を再生することにより、誤操作箇所を利用者が容易に認識できる。
【０１３６】
さらに、自動テスティングが行われることによりテスト工数を低減できる。
＜実施の形態２＞
図１０は本発明の音声応答サービス装置を含む音声応答サービスシステムの実施の形態２を示すブロック図である。人間によってサービスが行われる場合には、相手の様子に従い、声を大きくしたり、ゆっくり話したりする。すなわち、相手が理解し易いように音量が調整される。
【０１３７】
実施の形態２では、コンピュータが相手の年齢、性別、聞き易さ聞き難さなどの発音状態などの情報により自動的に音質を変更し、動的に利用者に合わせた音質によってサービスを実施する。
【０１３８】
ここでは、図１０に示される構成において、実施の形態１の図１に示される構成とは異なる構成のみを説明する。その他の構成については同一の符号を付し、ここでは詳しい説明を省略する。
【０１３９】
図１０において、サービス実行部１７には、属性テーブル４０と、状態ルールテーブル４１とが接続される。これらのテーブルは図示しないデータベースに設けられる。属性テーブル４０は利用者の性別および利用者の年齢区分に応じた音量レベルを設定する。状態ルールテーブル４１は利用者の入力動作の状態に応じて音量または話す速度を設定する。
【０１４０】
図１１は図１０に示す実施の形態２の主要部を示す構成ブロック図である。図１１において、属性テーブル４０は、利用者の性別により男性音または女性音が格納された性別ルールテーブル４０ａと、利用者の年齢により音量レベルが変化した音量テーブル４０ｂとから構成される。
【０１４１】
性別ルールテーブル４０ａにおいて、利用者が男性である場合、声の柔らかさを出すために音声合成された女性音が選択される。利用者が女性である場合には、声のめりはりを出すために音声合成された男性音が選択される。
【０１４２】
音量テーブル４０ｂには、利用者の年齢区分にあった音量レベルがおおまかに設定できる。
たとえば、利用者の年齢が０〜５９才では音量レベルが４に設定され、利用者の年齢が６０〜６９才では音量レベルが５に設定される。利用者の年齢が７０〜７９才では音量レベルが６に設定され、利用者の年齢が８０〜８９才では音量レベルが７に設定される。
【０１４３】
すなわち、たとえば音量レベル１から順に音量レベルを高くし、数値が大きい方が音量が大きくなるように設定される。利用者の性別、年齢区分などにしたがっておおまかな音量が設定される。
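The two attribute tables described above (gender rule table 40a and volume table 40b) amount to a simple lookup, sketched below. The age bands and volume levels are the ones listed in the text; the names `VOLUME_BY_AGE`, `VOICE_FOR_USER`, `VoiceAttributes` and `initial_attributes` are hypothetical and chosen only for this illustration. The text gives no level for ages above 89, so the sketch raises an error there rather than guessing.

```python
from dataclasses import dataclass

# (lowest age, highest age, volume level), as listed for volume table 40b.
VOLUME_BY_AGE = [(0, 59, 4), (60, 69, 5), (70, 79, 6), (80, 89, 7)]

# Gender rule table 40a: a male user hears a female voice, a female user a male voice.
VOICE_FOR_USER = {"male": "female", "female": "male"}


@dataclass(frozen=True)
class VoiceAttributes:
    voice: str   # gender of the synthesized voice
    volume: int  # a larger number means a louder voice


def initial_attributes(user_gender: str, age: int) -> VoiceAttributes:
    """Look up the rough starting attributes for one user."""
    for low, high, level in VOLUME_BY_AGE:
        if low <= age <= high:
            return VoiceAttributes(VOICE_FOR_USER[user_gender], level)
    # No level is listed for ages outside 0-89, so this sketch refuses to guess.
    raise ValueError(f"no volume level defined for age {age}")


print(initial_attributes("male", 61))
```

For a 61-year-old male user this yields a female voice at volume level 5, which matches the worked example given later in the description of the second embodiment.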
【０１４４】
性別ルールテーブル４０ａ及び音量テーブル４０ｂはおおまかな利用者にあった音量レベルを設定する。状態ルールテーブル４１は、音量または声の速度を利用者に合わせて調整する。
【０１４５】
性別ルールテーブル４０ａ、音量テーブル４０ｂ及び状態ルールテーブル４１はサービス実行部１７ｄに接続される。サービス実行部１７ｄには音声属性設定イベント発生部４２、音声合成処理部１８が接続される。
【０１４６】
例えば、音量が小さく速度が遅いことを表す指示が、プッシュボタンから音声属性設定イベント発生部４２に送信された場合には、音声属性設定イベント発生部４２は利用者の音声属性を設定するためのイベントを発生する。
【０１４７】
前記サービス実行部１７ｄは、属性テーブル４０のルールをチェックするルールチェック処理部１７ａ、このルールチェック処理部１７ａの処理出力にしたがって音声属性を設定する音声属性設定部１７ｂ、この音声属性設定部１７ｂの音声属性出力にしたがってナレーション文をナレーションストーリ出力部１５より出力するナレーション文出力部１７ｃとから成る。
【０１４８】
状態ルールテーブル４１は、音量を１レベルだけ上げるとともに、声の速度を１レベルだけ下げるように設定することができる。
音声合成処理部１８は、音声属性設定部１７ｂの出力とナレーション文出力部１７ｃの出力から音声合成処理を行う。
（実施の形態２の動作）
次に、実施の形態２の動作を説明する。たとえば利用者の属性がデータベースに存在する場合に、利用者番号などから利用者属性を得たり、任意の利用者が自分の属性を電話機３０のプッシュボタンより入力すると、音声属性設定イベント発生部４２は、音声属性の設定イベントを発生する。
【０１４９】
ルールチェック処理部１７ａは、音声属性設定イベント発生部４２からの出力によって性別ルールテーブル４０ａ、音量テーブル４０ｂを参照して、参照したテーブル情報を音声属性設定部１７ｂに出力する。音声属性設定部１７ｂは利用者の音声の属性を設定する。
【０１５０】
たとえば、男性、６１才がこの音声応答サービスを利用する場合には、性別ルールテーブル４０ａ、音量テーブル４０ｂが参照され、音声性別を柔らかにするため、女性音が設定され、音量レベルが５に設定される。
【０１５１】
また、音声属性設定イベント発生部４２において、利用者の入力操作が発生した場合には、ルールチェック処理部１７ａは状態ルールテーブル４１を参照する。状態ルールテーブル４１のテーブル情報から音声属性設定部１７ｂは音声属性を設定する。
【０１５２】
例えば、利用者が入力誤りをしたときには、音声速度を１レベルだけ遅くし、音量を１レベルだけ大きくする。
すなわち、ナレーションが速く、しかも音量が小さく、音声を聞き取ることができないと考えられる。この場合には、ゆっくりしかも大きく話すように音質を変更する。
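The state rule described here (after an input error, speak one level slower and one level louder) can be sketched as a small pure function. The names `VoiceSetting` and `on_input_error` and the level bounds `MIN_LEVEL` and `MAX_LEVEL` are assumptions made for this illustration; the text does not state the range of levels.

```python
from dataclasses import dataclass, replace

# Hypothetical bounds; the description does not give the range of levels.
MIN_LEVEL, MAX_LEVEL = 1, 10


@dataclass(frozen=True)
class VoiceSetting:
    volume: int  # a larger number means louder
    speed: int   # a larger number means faster


def on_input_error(setting: VoiceSetting) -> VoiceSetting:
    """Apply the state rule: one level louder and one level slower, within bounds."""
    return replace(
        setting,
        volume=min(setting.volume + 1, MAX_LEVEL),
        speed=max(setting.speed - 1, MIN_LEVEL),
    )


print(on_input_error(VoiceSetting(volume=5, speed=3)))
```

Returning a new value instead of mutating the current one keeps the table-derived starting attributes available if the service later needs to reset them.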
【０１５３】
一方、利用者に適した音声の属性設定コマンドが、音声属性設定部１７ｂから出力され、ナレーション文出力部１７ｃからナレーションが出力されると、音声合成処理部１８は利用者に適した音質に合成する。
【０１５４】
このように、実施の形態２では、利用者の年齢区分、性別、聞き易さ聞き難さなどの発音状態などの情報により自動的に音質を変更することにより、どのような利用者に対しても聞き易い音質で音声応答サービスが行える。
【０１５５】
これにより、聞き間違いなどによる誤操作をなくして利用者サービスの向上を図ることができる。
また、実施の形態２を商品の受注業務、予約業務、資料請求業務に適用した場合でも、利用者の音声による誤認識が少なくなり、これらの業務をスムースに行うことができる。
【０１５７】
【発明の効果】
本発明によれば、自動的にログ情報から過去のサービスを再生し、容易にそのサービスの内容を確認することができる。
【図面の簡単な説明】
【図１】本発明の音声応答サービス装置を含む音声応答サービスシステムの実施の形態１を示すブロック図である。
【図２】本発明の実施の形態１の履歴情報を示す図である。
【図３】実施の形態１のナレーションストーリ通りにサービスが実行されるシーケンスを示す図である。
【図４】実施の形態１のサービスの実行途中で障害が発生した例を示す図である。
【図５】実施の形態１のサービス内容を記述した図である。
【図６】実施の形態１の音声合成方式を例示した図である。
【図７】音声合成方式を採用した音声合成処理部の具体的な構成例を示す図である。
【図８】図７に示す音声合成方式によって音声合成を行う具体的例を示す図である。
【図９】履歴ファイルから利用者の再生したい記録を特定するための処理を示すフローチャートである。
【図１０】本発明の音声応答サービス装置を含む音声応答サービスシステムの実施の形態２を示すブロック図である。
【図１１】図１０に示す実施の形態２の主要部を示す構成ブロック図である。
【符号の説明】
１０音声応答サービス部
１１履歴情報格納部
１２履歴情報抽出部
１４回線制御部
１５ナレーションストーリ出力部
１６ナレーションストーリ解析部
１７サービス実行部
１７ａルールチェック処理部
１７ｂ音声属性設定部
１７ｃナレーション文出力部
１８音声合成処理部
１９疑似回線制御部
２０切換部
２２データベースアクセス部
２３データベース
２５テキストファイル
２６蓄積音声ファイル
２７履歴情報書き込み部
３０電話機
３１言語処理部
３２音響処理部
３３スピーカ
３４文章解析部
３５単語辞書
３６読み・韻律記号付与部
３７抑揚生成部
３８波形合成部
３９音素片部
４０属性テーブル
４１状態ルールテーブル
４２音声属性設定イベント発生部
[0001]
[Technical field to which the invention belongs]
The present invention relates to a voice response service apparatus. In particular, the present invention relates to a voice response service apparatus that quickly reproduces history information and changes the sound quality in accordance with attributes such as the gender and age classification of a telephone service user and the operation environment of the other party.
[0002]
[Prior art]
A telephone service is known as a voice response service device that automatically responds by a computer.
[0003]
In this telephone service, sound is played according to a preset sound quality (speed, volume, sex, or background music BGM). This type of voice response service device sets, for example, a standard sound quality suitable for a specific age group when a telephone service is provided.
[0004]
Such a voice service device is described in, for example, Japanese Patent Application Laid-Open No. 59-181767 or Japanese Patent Application Laid-Open No. 3-160868. The voice response service device described in these publications provides services such as deposit balance inquiry by using a voice recognition device in the voice service device.
[0005]
The device disclosed in Japanese Patent Application Laid-Open No. 61-235940 corrects the output level so that the audio output level at the telephone is always constant. This makes it easy for the user to listen to the audio information.
[0006]
However, in the apparatus disclosed in Japanese Patent Application Laid-Open No. 61-235940, the user had to specify the volume and gender in order to correct the output level. In other words, because the correction was performed manually, the work was laborious.
[0007]
In addition, the apparatuses of the remaining publications did not provide services with a sound quality tailored to the user. In this type of voice response service device, fine adjustments were needed so that the other party could understand easily, such as speaking louder or more slowly depending on who the other party was. For this reason, there has been a demand for a voice response service device that automatically changes the sound quality according to the age and gender of the user.
[0008]
Further, the voice response service device can carry out mail order sales of products by using a telephone service. This voice response service apparatus normally collects user information and product input information from the user as history information.
[0009]
In the voice response service device, when a user made an inquiry because of an order trouble concerning the ordered product, quantity, amount, or the like, it was necessary to check the user's operations by tracing the past history information.
[0010]
Further, when a failure or the like occurred in the voice response service device, the user's operations were reproduced manually on the basis of the history information.
[0011]
[Problems to be solved by the invention]
However, in the voice response service device described above, for example, when an order trouble occurs, the history information has to be reproduced manually. This not only took a lot of time, but also increased labor costs. In addition, it was necessary to automatically track the erroneous operation location.
[0012]
Furthermore, in this type of voice response service device, the investigation of the cause of order troubles and the tests for making the voice response service run smoothly were performed manually, so a considerable number of test man-hours were required during development and maintenance.
[0013]
Accordingly, an object of the present invention is to provide a highly comfortable service by changing sound quality in accordance with attributes such as sex and age classification of a telephone service user and the operation environment of the other party.
[0014]
Another object of the present invention is to reproduce a past service from history information so that the contents of the service can be checked easily.
Furthermore, an object of the present invention is to reduce test man-hours and labor costs during development and maintenance.
[0015]
[Means for Solving the Problems]
The voice response service apparatus according to the present invention employs the following means in order to solve the above-described problems.
[0016]
The present invention relates to a voice response service apparatus that processes a voice response service according to push button signals input by a user, based on a processing procedure of the voice response service, the apparatus comprising: log information storage means for storing, in time series and for each user, log information of the voice response service including the push button signal information input by the user in accordance with the processing procedure of the voice response service; log information extraction means for extracting log information from the log information stored in the log information storage means on the basis of specific push button signal information; and voice response service playback means for playing back the voice response service, based on the processing procedure of the voice response service, in accordance with the push button signal information of the log information extracted by the log information extraction means.
[0017]
In the present invention, the log information stored in the log information storage means includes time information of at least one of the telephone line connection time and the telephone line disconnection time, and the log information extraction means determines the playback start position on the basis of the time information stored in the log information storage means.
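One possible reading of this paragraph is that the connection and disconnection times delimit a session, so that a playback start position can be found from a point in time. The sketch below illustrates that reading only; the `(timestamp, event)` tuples, the event names `connect` and `disconnect`, and the function name `start_position_at` are assumptions for this example and do not come from the patent text.

```python
from datetime import datetime
from typing import List, Optional, Tuple

Entry = Tuple[datetime, str]  # (timestamp, event name); hypothetical record shape


def start_position_at(entries: List[Entry], moment: datetime) -> Optional[int]:
    """Return the index of the connect record of the session that covers `moment`."""
    start = None
    for index, (timestamp, event) in enumerate(entries):
        if event == "connect":
            start = index
        elif event == "disconnect" and start is not None:
            if entries[start][0] <= moment <= timestamp:
                return start  # the session from connect to disconnect covers the moment
            start = None
    return None


log = [
    (datetime(1994, 9, 6, 9, 0, 0), "connect"),
    (datetime(1994, 9, 6, 9, 0, 12), "push_button"),
    (datetime(1994, 9, 6, 9, 1, 0), "disconnect"),
    (datetime(1994, 9, 6, 10, 15, 0), "connect"),
    (datetime(1994, 9, 6, 10, 15, 11), "push_button"),
    (datetime(1994, 9, 6, 10, 16, 40), "disconnect"),
]
print(start_position_at(log, datetime(1994, 9, 6, 10, 16, 0)))
```

A session whose disconnection was never logged is ignored in this sketch; whether such a session should still be playable is left open by the text.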
[0018]
In the present invention, the log information storage means stores the push button signal information as character information; the apparatus further comprises speech synthesis processing means for synthesizing speech from the character information of the push button signals; and the voice response service playback means plays back the voice response service including the speech synthesized by the speech synthesis processing means for the push button signal information of the log information extracted by the log information extraction means.
In the present invention, the push button signal information of the log information stored in the log information storage means includes the user number that the user inputs in accordance with the processing procedure of the voice response service, and the log information extraction means extracts the log information to be played back by identifying, among the log information stored in the log information storage means, the log information that includes the corresponding user number.
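Because the push button signals are stored as character information, turning a logged input into speech starts with mapping each digit to a reading that a synthesizer can pronounce. The sketch below shows that first step only. The readings for 1, 2, 3, 5 and 6 follow the example in the embodiment, where user number 651123 is read out digit by digit; the readings for the other digits and the names `DIGIT_READINGS` and `digits_to_reading` are assumptions made for this illustration.

```python
# Readings for 1, 2, 3, 5 and 6 follow the embodiment's example for 651123;
# the readings for 0, 4, 7, 8 and 9 are assumed common readings.
DIGIT_READINGS = {
    "0": "ゼロ", "1": "イチ", "2": "ニ", "3": "サン", "4": "ヨン",
    "5": "ゴ", "6": "ロク", "7": "ナナ", "8": "ハチ", "9": "キュー",
}


def digits_to_reading(push_buttons: str) -> str:
    """Convert logged push-button digits into a katakana reading string."""
    readings = []
    for digit in push_buttons:
        if digit not in DIGIT_READINGS:
            raise ValueError(f"unsupported push button: {digit!r}")
        readings.append(DIGIT_READINGS[digit])
    return "".join(readings)


print(digits_to_reading("651123"))
```

The resulting string would then be handed to the speech synthesis processing means; that synthesis step is outside this sketch.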
[0019]
The speech synthesis processing means performs speech synthesis by, for example, a rule-based speech synthesis method that synthesizes speech from a character string. Other speech synthesis methods include a waveform coding method, which encodes the speech waveform using the characteristics of the speech signal, and an analysis-synthesis method, which encodes the speech signal according to a speech production model.
[0023]
In the present invention, the apparatus may include a line control unit that controls the telephone line and outputs the push button signals and line connection notification information from the user to the voice response service playback means.
[0024]
In the present invention, the apparatus may include a switching unit that connects the voice response service playback means to the line control unit when the push button signals and line connection notification information from the line control unit are output to the voice response service playback means, and connects the voice response service playback means to the log information extraction means when the log information extracted by the log information extraction means is played back.
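The role of the switching unit can be pictured as choosing where the push-button input comes from while the same dialogue logic runs unchanged. The sketch below is an illustration under assumptions: the names `from_log`, `select_source` and `confirm_user` and the prompt wording are hypothetical, and the live line is represented by an arbitrary callable.

```python
from typing import Callable, Iterable

ReadButton = Callable[[], str]


def from_log(logged_values: Iterable[str]) -> ReadButton:
    """Replay mode: hand back the push-button values recorded in the log, in order."""
    values = iter(logged_values)
    return lambda: next(values)


def select_source(replay: bool, live: ReadButton, logged: ReadButton) -> ReadButton:
    """Stand-in for the switching unit: pick the source of push-button input."""
    return logged if replay else live


def confirm_user(read_button: ReadButton, say: Callable[[str], None]) -> str:
    """A two-step fragment of a dialogue; the wording is illustrative only."""
    say("Please enter your user number.")
    user_number = read_button()
    say(f"User number {user_number}. Press 1 to confirm or 9 to re-enter.")
    return user_number if read_button() == "1" else ""


spoken = []
source = select_source(True, lambda: "0", from_log(["651123", "1"]))
print(confirm_user(source, spoken.append), spoken)
```

Since the dialogue function only sees a callable, a past session is replayed by running exactly the code path used for a live call, which is the point of feeding logged values in as if they were fresh input.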
[0025]
In the present invention, a narration information sending unit that sends narration information to the voice response service playback means may be provided. In the present invention, the voice response service playback means executes a service according to the narration information from the narration information sending unit and provides the user with a service according to the value of the push button signal.
[0026]
In the present invention, a log information writing unit may be provided that, when the voice response service playback means executes the voice response service, writes the narration information from the narration information sending unit to the log information storage means as log information.
[0027]
In the present invention, a text file unit may be provided that, when the voice response service playback means executes the voice response service, converts the narration information from the narration information sending unit into character information and writes the converted character information as text in file form.
[0028]
In the present invention, the speech synthesis processing means may convert the character information written in the text file unit into voice information. In the present invention, the speech synthesis processing means may convert the narration information from the narration information sending unit into voice information. A stored sound file unit that stores the voice information converted by the speech synthesis processing means may also be included.
[0029]
In the present invention, a log information writing unit that writes log information to the log information storage means is provided. The voice response service playback means may play back the log information written to the log information storage means by the log information writing unit.
[0030]
In the present invention, the speech synthesis processing means may be connected to a pseudo line control unit that controls a pseudo line, and may send the converted voice information to a speaker through the pseudo line control unit.
[0034]
The inventions described above may be combined as appropriate. According to the present invention, the log information storage means stores a plurality of processing contents performed interactively with the user by voice response as log information recording their transition over time, and also stores identification information for identifying each of the processing contents as part of the content of the voice response. The voice response service playback means reads the interactive processing content of the log information corresponding to the identification information from the log information storage unit and plays back the read processing content as voice.
[0035]
Thereby, the past service can be automatically reproduced from the log information, and the contents of the service can be easily confirmed.
Further, the processing content can be specified by the time information of connection / disconnection of the telephone line included in the log information.
[0036]
  In addition, push button signals received from users via the telephone line are stored as log information in the log information storage means. Therefore, the log information can be reproduced using the stored push button signals.
[0037]
  Also, the speech synthesis processing means synthesizes the voice to be used in the response from the character information stored in the log information storage means.
[0039]
  Also, when the log information extraction means extracts log information from the log information stored in the log information storage means on the basis of specific push button signal information, the voice response service playback means plays back the voice response service according to the push button signal information of the log information extracted by the log information extraction means. The speech synthesis processing means synthesizes voice from the log information sent from the voice response service playback means. Thereby, specific processing content can be reproduced from the log information.
[0040]
  The line control unit controls the telephone line and outputs the push button signal and the line connection notification information from the user to the voice response service playback means. Dialogue between the user and the voice response service device is thereby performed smoothly.
[0041]
  When the switching unit connects the voice response service playback means to the line control unit, the normal service is provided. When the switching unit connects the voice response service playback means to the log information extraction means, past services can be replayed from the log information.
[0042]
  Also, the voice response service playback means can execute the service according to the narration information from the narration information sending unit. In doing so, the voice response service playback means provides the user with a service according to the value of the push button signal. Thereby, since the dialogue between the user and the voice response service device can be performed according to the narration information, the burden on the user can be reduced.
[0043]
  Also, when the voice response service playback means executes the voice response service, the log information writing unit writes the narration information from the narration information sending unit into the log information storage means as log information. For example, if there is an error in the order information, the location of the erroneous operation can be tracked by reading the log information from the log information storage means.
[0044]
In addition, since the text file unit converts the narration information from the narration information sending unit into character information and writes it as file-format text, the character information can be synthesized into voice information by, for example, rule-based speech synthesis.
[0045]
  Also, the speech synthesis processing means can convert the narration information from the narration information sending unit into voice information, and the converted voice information can be written to the stored voice file unit.
[0046]
  Also, the voice response service playback means plays back the log information written in the log information storage means by the log information writing unit. As a result, the time required for the reproduction test performed when a failure occurs can be reduced.
[0047]
  Also, the speech synthesis processing means sends voice information to the speaker via the pseudo-line control unit. As a result, the user can recognize an error in the log information by ear from the speaker output.
[0052]
DETAILED DESCRIPTION OF THE INVENTION
Embodiments of a voice response service device according to the present invention will be described below in detail with reference to the drawings.
<Embodiment 1>
FIG. 1 is a block diagram showing Embodiment 1 of a voice response service system including a voice response service device of the present invention.
[0053]
The system according to the first embodiment automatically reproduces the dialog processing between the user and the computer from the history information storage unit. It is applied to overtime work such as product ordering, reservation, data request, and bank transfer operations in order to reduce labor costs.
[0054]
In FIG. 1, the voice response service system includes a computer C operating as a voice response service device and a telephone side U connected to the computer C and used by a user. The voice response service system is used for telephone service business.
[0055]
For example, the voice response service system carries out a variety of tasks through interactive processing between the computer C and the telephone side U, in which push button signals are sent to the computer C from the telephone side U at a home or office.
[0056]
The computer C includes a voice response service unit 10 surrounded by a dotted line, a history information extraction unit 12, a pseudo line control unit 19, and a speaker 33.
The voice response service unit 10 operates when performing a telephone service during normal times. The voice response service unit 10 includes a narration story output unit 15, a history information storage unit 11, a switching unit 20, a line control unit 14, a service execution unit 17, and a voice synthesis processing unit 18.
[0057]
FIG. 5 is a diagram describing service contents. Here, the service content is, for example, a product ordering operation. This service content is called a narration story. The narration story output unit 15 outputs a narration story.
[0058]
FIG. 2 is a diagram showing history information stored in the history information storage unit 11. It shows an example of the history information recorded when the voice response service is performed according to the narration story of FIG. 5. The history information storage unit 11 records the service contents together with the time.
[0059]
That is, the history information storage unit 11 stores log information (history information) that records how the service contents progress over time from the start to the end of a service, that is, from the connection of the telephone line to the product order receiving system by a user at a time A until the disconnection of the telephone line. The history information storage unit 11 records time information such as the telephone line connection time, the disconnection time, the time at which the computer starts reading a narration, and the time at which the user inputs a push button.
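The time-stamped history described above can be pictured with a minimal Python sketch. The record layout, the event names ("connect", "narration", "push_button", "disconnect"), and the time format are assumptions made for illustration; the actual layout of FIG. 2 is not reproduced here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LogEntry:
    time: str        # time of the event (hypothetical "HH:MM:SS" format)
    event: str       # "connect", "narration", "push_button" or "disconnect"
    value: str = ""  # narration name or push button digits, if any


def split_sessions(entries: List[LogEntry]) -> List[List[LogEntry]]:
    """Group entries into services, each running from connect to disconnect."""
    sessions: List[List[LogEntry]] = []
    current: Optional[List[LogEntry]] = None
    for entry in entries:
        if entry.event == "connect":
            current = [entry]            # a new service starts here
        elif current is not None:
            current.append(entry)
            if entry.event == "disconnect":
                sessions.append(current)  # the service is complete
                current = None
    return sessions


log = [
    LogEntry("10:00:00", "connect"),
    LogEntry("10:00:02", "narration", "open.pcm"),
    LogEntry("10:00:10", "push_button", "651123"),
    LogEntry("10:01:30", "disconnect"),
    LogEntry("10:05:00", "connect"),
    LogEntry("10:05:09", "push_button", "700001"),
    LogEntry("10:06:00", "disconnect"),
]
sessions = split_sessions(log)
```

Grouping by connection and disconnection mirrors the statement that one service spans the interval from line connection to line disconnection.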
[0060]
FIGS. 3 and 4 are diagrams showing an operation sequence between the user side (U) and the computer side (C). FIG. 3 is a diagram showing a sequence in which a service is executed according to the narration story of FIG. 5. FIG. 4 is a diagram illustrating an example in which a failure occurs during service execution.
[0061]
The history information extraction unit 12 is connected to the history information storage unit 11 and operates when a service is reproduced. The history information extraction unit 12 extracts touch phone data input by the user from the history information stored in the history information storage unit 11.
A switching unit 20 is connected to the history information extraction unit 12.
[0062]
The switching unit 20 is composed of contacts. When the line control unit 14 is connected to the service execution unit 17, that is, when a normal service is provided, a push button signal input by the user is read from the telephone 30 into the service execution unit 17.
[0063]
When the history information extraction unit 12 is connected to the service execution unit 17, that is, when a past service is reproduced, the switching unit 20 passes the touch-phone data input by the user from the history information storage unit 11, via the history information extraction unit 12, to the service execution unit 17. The service execution unit 17 executes the past service on the basis of the data from the history information extraction unit 12. The voice synthesis processing unit 18 reproduces by voice the past service executed by the service execution unit 17.
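The role of the switching unit 20 can be sketched as selecting which source the service execution step reads push button values from. The following Python sketch is an illustration under assumptions: the class and function names and the prompt labels are hypothetical and do not come from the specification.

```python
from typing import Callable, Iterator, List, Tuple


def make_replay_reader(recorded: List[str]) -> Callable[[], str]:
    """Return a reader that yields recorded push button values in order."""
    values: Iterator[str] = iter(recorded)

    def read() -> str:
        return next(values)

    return read


def select_source(mode: str,
                  live_reader: Callable[[], str],
                  replay_reader: Callable[[], str]) -> Callable[[], str]:
    """Play the part of the switching unit: pick the input source."""
    if mode == "normal":
        return live_reader    # line control unit side
    if mode == "replay":
        return replay_reader  # history information extraction side
    raise ValueError(f"unknown mode: {mode}")


class ServiceExecution:
    """Runs the same dialog regardless of where the input comes from."""

    def __init__(self, read_push_button: Callable[[], str]) -> None:
        self.read_push_button = read_push_button

    def run(self, prompts: List[str]) -> List[Tuple[str, str]]:
        return [(prompt, self.read_push_button()) for prompt in prompts]


def live_reader() -> str:
    # Placeholder for reading a real line; not used in replay mode.
    raise RuntimeError("no telephone line in this sketch")


replay = make_replay_reader(["651123", "1", "321"])
service = ServiceExecution(select_source("replay", live_reader, replay))
result = service.run(["user number", "confirm", "product number"])
```

Because the service execution step only sees a reader function, the same dialog logic serves both the normal service and the reproduction of a past service.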
[0064]
The speech synthesis processing unit 18 synthesizes speech from a character string representing pronunciation by, for example, the rule synthesis method, which is one of the speech synthesis methods shown in FIG. 6. As shown in FIG. 6, the speech synthesis methods used in the voice response service system include a waveform coding method, an analysis synthesis method, a rule synthesis method, and the like.
[0065]
The rule synthesis method is illustrated in FIG. 7 and FIG. 8. The rule synthesis method converts a character string of mixed kana and kanji into speech by analyzing the sentence and synthesizing a signal waveform from the analyzed character string. Because the rule synthesis method requires less information than the other methods and places no limit on the number of words that can be output, it is suitable for a voice response service system.
[0066]
The waveform coding method synthesizes speech by recording in advance the words to be read out and concatenating them. The analysis synthesis method is a speech synthesis method that encodes a speech signal based on a speech generation model. Speech can also be synthesized by these methods.
[0067]
In the telephone service, in addition to voice generated by a speech synthesis method, playback of stored voice is also used. Stored voice generally has better quality than rule-synthesized voice, but it is less convenient because a new recording is required whenever the content changes. Stored voice is well suited to reading out fixed content.
[0068]
On the other hand, rule-synthesized voice is suitable for reading out non-fixed content such as ad hoc sentences, free-form sentences, and partially variable documents.
FIG. 7 is a diagram illustrating a specific configuration example of a speech synthesis processing unit employing a speech synthesis method. In FIG. 7, the speech synthesis processing unit 18 includes a language processing unit 31 and an acoustic processing unit 32. The language processing unit 31 includes a sentence analysis unit 34, a word dictionary 35, and a reading / prosodic symbol assignment unit 36.
[0069]
The sentence analysis unit 34 divides the input sentence into analysis units, performs word division by collating the divided sentence with the words stored in the word dictionary 35, and sets the reading, accent type, and grammar information for each word. The reading / prosodic symbol assigning unit 36 adds readings and prosodic symbols based on the information obtained from the sentence analysis unit 34. The prosodic symbols represent, for example, pauses, phrase accents, and intonation.
[0070]
When the reading / prosodic symbol assigning unit 36 of the language processing unit 31 has assigned readings and prosodic symbols, the acoustic processing unit 32 adds intonation and outputs sound by waveform synthesis or the like. The acoustic processing unit 32 includes an intonation generation unit 37, a waveform synthesis unit 38, and a phoneme piece unit 39, which is a phoneme database. The intonation generation unit 37 generates, based on the reading and prosodic symbols, an intonation pattern representing the time length of each phoneme and the pitch of the voice. The waveform synthesis unit 38 reads the phoneme piece data stored in the phoneme piece unit 39, and synthesizes the speech waveform by smoothly connecting the phoneme piece data according to the phoneme time length and the intonation pattern.
[0071]
The acoustic processing unit 32 outputs the sound whose waveform has been synthesized by the waveform synthesis unit 38 to the speaker 33, and the speaker 33 outputs, for example, “Kore wa, onsei gosei desu” (“This is speech synthesis”).
[0072]
FIG. 8 is a diagram showing a specific example in which speech synthesis is performed by the speech synthesis method shown in FIG. 7. Speech synthesis will be described by taking the sentence “Kore wa, onsei gosei desu” (“This is speech synthesis”) shown in FIG. 8 as an example. First, the sentence analysis unit 34 receives the sentence and performs word division. The sentence analysis unit 34 divides the sentence into words such as “kore / wa /, onsei / gosei / desu /”.
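Word division by collation with a word dictionary can be sketched as follows. This Python sketch assumes a greedy longest-match strategy, romanized input without spaces, and a toy dictionary with hypothetical grammar labels; the specification states only that the sentence is collated with the word dictionary 35 and does not name a matching strategy.

```python
from typing import Dict, List, Tuple

# Toy stand-in for the word dictionary 35 (entries and labels are assumptions).
WORD_DICTIONARY: Dict[str, str] = {
    "kore": "pronoun",
    "wa": "particle",
    "onsei": "noun",
    "gosei": "noun",
    "desu": "copula",
}


def segment(text: str, dictionary: Dict[str, str]) -> List[Tuple[str, str]]:
    """Split text into (word, grammar label) pairs by longest dictionary match."""
    words: List[Tuple[str, str]] = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shorter ones.
        for j in range(len(text), i, -1):
            candidate = text[i:j]
            if candidate in dictionary:
                words.append((candidate, dictionary[candidate]))
                i = j
                break
        else:
            # No dictionary word starts here: emit one unknown character.
            words.append((text[i], "unknown"))
            i += 1
    return words


result = segment("korewaonseigoseidesu", WORD_DICTIONARY)
print(" / ".join(word for word, _ in result))  # kore / wa / onsei / gosei / desu
```

A production analyzer would also resolve ambiguous divisions and attach readings and accent types, which this sketch omits.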
[0073]
Next, the sentence analysis unit 34 sets a phrase. The sentence analysis unit 34 sets a phrase such as “This is speech synthesis.”
The reading / prosodic symbol assigning unit 36 assigns a reading based on the result of the sentence analysis unit 34, and assigns a prosodic code representing pauses, phrase accents, and intonation. The reading is “kore / wa /, o^nsei / gosei / desu /.” The prosodic code is “korewa, onseigo^-seidesu.”
[0074]
Next, in the acoustic processing unit 32, the intonation generation unit 37 generates an intonation pattern indicating the pitch of the voice. The waveform synthesis unit 38 reads out the phoneme piece data from the phoneme piece unit 39 and synthesizes a speech waveform by connecting the phoneme piece data according to the phoneme time length information and the intonation pattern information.
[0075]
Audio data is output from the acoustic processing unit 32 by performing the above processing.
(Description of normal service operation)
Next, the operation during normal service of Embodiment 1 of the voice response service system configured as described above will be described.
[0076]
First, the switching unit 20 connects the line control unit 14 to the service execution unit 17. When the narration story output unit 15 outputs the narration story to the narration story analysis unit 16, the narration story analysis unit 16 analyzes the narration story.
[0077]
In accordance with the analyzed narration story, the service execution unit 17 instructs the voice synthesis processing unit 18 to output voice, instructs the history information writing unit 27 to write history information, instructs the line control unit 14 to read the push button, branches the processing according to the value read, instructs the database access unit 22 to access the database, and so on. The push button signal input by the user is read from the telephone 30 into the service execution unit 17.
[0078]
For voice output, the service is executed using the stored voice file 26, the text file 25, and so on, according to the narration story.
The database 23 includes a user database in which information about users is written, a product database in which product information is written, and an order database in which order information is written.
[0079]
The database access unit 22 performs write and read processing on the database 23. The text file 25 stores text information in advance, and the stored voice file 26 stores stored voice information. The history information writing unit 27 writes the service content instructed by the service execution unit 17 in the history information storage unit 11. The speech synthesis processing unit 18 synthesizes speech by the speech synthesis method already described.
[0080]
Note that the voice data output from the voice synthesis processing unit 18 by the switching unit 50 can also be output to the telephone 30 via the line control unit 14. At this time, the contents of the voice response service can be modeled without connecting to the telephone line, and those models can be analyzed.
[0081]
The voice data from the voice synthesis processing unit 18 can also be output from the speaker 33 under the control of the pseudo-line control unit 19. The operator can thus listen to the service content as voice.
<Narration story output section>
Next, the narration story output unit 15 will be described in detail according to the procedures from procedure number 1 to procedure number 23 shown in FIG.
[0082]
In the narration story output unit 15, the content of the narration is determined according to the procedure from procedure number 1 to procedure number 23.
That is, in procedure number 1, since the stored voice file “open.pcm” is specified for the service outline, “open.pcm” is read from the stored voice file 26 in FIG. 1 and read out by the voice response service unit 10.
[0083]
In procedure number 2, the voice response service unit 10 reads out the user number input narration of the narration story output unit 15, for example, “Please enter the user number”.
[0084]
In procedure number 3, when the user number is input from the push button of the telephone 30 on the user side, the contents of the push button are read and the user number is input to the voice response service unit 10.
[0085]
In procedure number 4, the voice response service unit 10 accesses the user database of the database 23 when there is a user inquiry.
In procedure number 5, the result of querying the database 23 in procedure number 4 is read out by the voice response service unit 10 as the narration “Number XX, OO-sama, is that correct?”, which is the user number and user name confirmation narration of the narration story output unit 15.
[0086]
In procedure number 6, the voice response service unit 10 branches the process based on the confirmation result from the user. In this procedure number 6, for example, when the user selects “1” with the push button of the telephone 30, the process proceeds to the process of the procedure number 7. For example, when the user selects “9” with the push button of the telephone 30, the process of the procedure number 2 is executed again.
[0087]
In procedure number 7, the voice response service unit 10 reads out the narration “Please input the product number”, which is the product number input guidance narration of the narration story output unit 15.
[0088]
In procedure number 8, for example, when a product number is input to the voice response service unit 10 by a push button, the voice response service unit 10 reads the product number.
In procedure number 9, the voice response service unit 10 accesses the product database in the database 23 when there is a product inquiry.
[0089]
In procedure number 10, the result of querying the database in procedure number 9 is read out by the voice response service unit 10 as the narration “Number XX, OO, is that correct?”, which is the product number confirmation narration of the narration story output unit 15.
[0090]
In procedure number 11, the voice response service unit 10 branches the process based on the confirmation result from the user.
In this procedure number 11, for example, when the user selects “1” with the push button of the telephone 30, the process proceeds to the process of the procedure number 12.
[0091]
For example, when the user selects “9” with the push button of the telephone 30, the process of the procedure number 7 is executed again.
In procedure number 12, the voice response service unit 10 reads out the narration “Please input the quantity”, which is the quantity input narration of the narration story output unit 15.
[0092]
In procedure number 13, for example, when the quantity is input to the voice response service unit 10 by a push button, the voice response service unit 10 reads the quantity.
In procedure number 14, the voice response service unit 10 reads out the narration “Are you sure?”, which is the input completion confirmation narration of the narration story output unit 15.
[0093]
In procedure number 15, the voice response service unit 10 branches the process based on the confirmation result from the user.
In this procedure number 15, for example, when the user selects “1” with the push button of the telephone 30, the process proceeds to the process of procedure number 16.
[0094]
For example, when the user selects “9” with the push button of the telephone 30, the process of the procedure number 7 is executed again.
In procedure number 16, the voice response service unit 10 reads out the narration “Are you sure?”, which is the input confirmation narration of the narration story output unit 15.
[0095]
In procedure number 17, the voice response service unit 10 branches the process based on the confirmation result from the user.
For example, when the user selects “1” with the push button of the telephone 30, the process proceeds to the process of the procedure number 18.
[0096]
Further, for example, when the user selects “9” with the push button of the telephone 30, the process of the procedure number 7 is performed.
In procedure number 18, since the text file “order.txt” is designated as the order content read-back narration in the narration story output unit 15, the voice response service unit 10 reads out the narration read from the text file 25.
[0097]
In the procedure number 19, the voice response service unit 10 reads out the narration “Are you sure?” Which is the order confirmation narration of the narration story output unit 15.
[0098]
At this time, in the procedure number 20, the voice response service unit 10 branches the process based on the confirmation result from the user.
In this procedure number 20, for example, when the user selects “1” with the push button of the telephone 30, the process of procedure number 21 is performed.
[0099]
Further, for example, when the user selects “9” with the push button of the telephone 30, the process of the procedure number 23 is executed.
In the procedure number 21, the voice response service unit 10 reads out the narration “Order number is OO?” Which is the order number narration of the narration story output unit 15.
[0100]
In procedure number 22, order processing is performed. The contents of the received order are written into the order database in the database 23.
In procedure number 23, “close.pcm”, which is the stored voice, is designated as the service end narration of the narration story output unit 15, and is read from the stored sound file 26 and read out by the voice response service unit 10.
[0101]
According to the narration story of FIG. 5 described above, the processing between the telephone 30 on the user U side and the voice response service unit 10 of the computer C can be reliably performed in an interactive manner.
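The step-by-step structure of the narration story, with “1” advancing and “9” returning to an earlier procedure, can be sketched as a small interpreter. The Python sketch below uses an abbreviated, hypothetical story table covering only a few of the 23 procedures; the step kinds ("say", "read", "branch") and field names are assumptions for illustration and are not the format of FIG. 5.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Abbreviated story keyed by procedure number (contents are illustrative).
STORY: Dict[int, Tuple[str, object]] = {
    1: ("say", "open.pcm"),
    2: ("say", "Please enter the user number"),
    3: ("read", "user_number"),
    5: ("say", "confirm user"),
    6: ("branch", {"1": 7, "9": 2}),   # "1" proceeds, "9" re-enters
    7: ("say", "Please input the product number"),
    8: ("read", "product_number"),
    23: ("say", "close.pcm"),
}


def run_story(story: Dict[int, Tuple[str, object]],
              inputs: Iterable[str]) -> Tuple[List[str], Dict[str, str]]:
    """Execute the story, consuming one push button input per read or branch."""
    order = sorted(story)
    pending = iter(inputs)
    spoken: List[str] = []
    fields: Dict[str, str] = {}
    step: Optional[int] = order[0]
    while step is not None:
        kind, arg = story[step]
        index = order.index(step) + 1
        next_step = order[index] if index < len(order) else None
        if kind == "say":
            spoken.append(str(arg))
        elif kind == "read":
            fields[str(arg)] = next(pending)
        elif kind == "branch":
            next_step = arg[next(pending)]  # jump chosen by the user
        step = next_step
    return spoken, fields


# The user mistypes the number, answers "9" to re-enter, then confirms with "1".
spoken, fields = run_story(STORY, ["651124", "9", "651123", "1", "321"])
```

An input that is neither “1” nor “9” raises a KeyError in this sketch; how the actual device handles such input is not stated in this part of the description.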
(Description of sequence in FIG. 3)
The processing of FIG. 3 will be described using the narration story output unit 15 of FIG. 1 described above.
[0102]
For example, on the user U side, when the user number “651123” is input as the input 103 by the push button of the telephone 30, the user number “651123” is transmitted to the computer C side.
The transmission information having the user number “651123” transmitted to the computer C side is confirmed by the line control unit 14 on the computer C side, and then the narration story output unit 15 outputs the narration story 104. Then, a user number / user name confirmation guide 104 is transmitted to the telephone 30 on the user U side.
[0103]
On the other hand, on the computer C side, as shown in FIG. 3, after the line control unit 14 receives the confirmation OK for the user number and user name sent from the user U side, the narration story output unit 15 outputs a narration story 106 in order to send product number input guidance to the telephone 30 on the user U side.
[0104]
The narration story output unit 15 outputs the narration story to the narration story analysis unit 16, and the narration story analysis unit 16 analyzes the narration story.
[0105]
The service execution unit 17 executes the service using the database 23 or the files 25 and 26 according to the narration story analyzed by the narration story analysis unit 16.
(Process when ordering)
Further, on the user U side, the product number “321” is input from the push button of the telephone 30 in accordance with the product number input guidance 106 sent from the narration story output unit 15 on the computer C side. This product number “321” is code information representing a specific product.
[0106]
When the user U inputs the product number “321” from the telephone 30 as the input 107 to the computer C side, the line control unit 14 on the computer C side receives the product number “321”, and it is confirmed that the product number “321” exists in the product database of the database 23. Then, confirmation guidance for the product number “321” is transmitted as transmission 108 from the computer C side to the telephone 30 on the user U side.
[0107]
When the telephone 30 on the user U side receives the confirmation guidance for the product number “321” from the computer C side, the user inputs confirmation information “1” from the push button of the telephone 30 to indicate that the product represented by the product number “321” is indeed the product being ordered.
[0108]
Next, a product quantity input guide 110 is transmitted from the computer C side to the telephone 30 on the user U side.
The telephone 30 on the user U side receiving the quantity input guide 110 inputs the number of products to be ordered, for example, “3” as an input 111 from the push button of the telephone 30.
[0109]
When the quantity of products to be ordered is transmitted from the user U side to the computer C side, the computer C side confirms from the product database of the database 23 that the products are in stock, and the quantity confirmation information 112 for the products is transmitted from the computer C side to the telephone 30 on the user U side.
[0110]
At this time, the telephone set 30 on the user U side inputs the number-confirmed information “1” as an input 114 from the push button of the telephone set 30.
When the input completion confirmation guide 114 is transmitted from the computer C side to the telephone 30 on the user U side, an end instruction indicating the intention to complete the input, for example “1”, is input as an input 115 from the push button of the telephone 30, and the end instruction is transmitted from the user U side to the computer C side.
[0111]
When the above exchange has been carried out between the computer C side and the telephone 30 on the user U side, the order content confirmation guide 116 and the order confirmation guide 117 of the narration story output unit 15 are transmitted from the computer C side to the telephone 30 on the user U side in order to confirm the intention to order.
[0112]
In response to these guidance, on the telephone 30 on the user U side, the user inputs the ordering process “1” as an input 118 from the push button of the telephone 30.
When the computer C confirms that the order processing information “1” is transmitted from the telephone 30 on the user U side to the computer C side, the order number guide 119 and the service end guide 120 are sent from the computer C side to the user U side. Sent. Then, the computer C side is disconnected as a disconnection 121 from the telephone line of the telephone 30 on the user U side.
[0113]
As described above, in the first embodiment, when, for example, a product ordering operation is performed interactively between the user U side and the computer C side, the history information writing unit 27 can write the log information shown in FIG. 2 to a specific writing area of the history information storage unit 11 shown in FIG. 1. Further, the history information extracting unit 12 reads the written log information as appropriate. Thereby, when there is an error in the ordering information, the ordering work can be performed smoothly.
(Troubleshooting on the computer side)
Next, processing when trouble occurs on the computer C side will be described. The sequence diagram of FIG. 4 is a diagram showing processing at the time of ordering trouble. The detailed description of the processes from the service outline description 101 to the user confirmation instruction 105 in FIG. 3 is omitted.
[0114]
FIG. 4 will be described with reference to the reference numerals in the figures. In the first embodiment, suppose that after the user inputs “1” as confirmation OK 105 from the push button of the telephone 30, a failure occurs in the software on the computer C side, the database 23, the text file 25, the stored voice file 26, or the like.
[0115]
In this case, the computer C transmits a failure occurrence signal to the user's telephone 30 in order to notify the user that a failure 130 has occurred on the computer C side, and the telephone line is then disconnected as a disconnection 121.
[0116]
As described above, when a failure occurs during the service, a reproduction test or a confirmation test after correcting the failure is performed. In this case, the history information stored in the history information storage unit 11 in FIG. 1 is reproduced. This allows the user to recognize that there is a failure cause on the computer C side. Also, since testing can be performed automatically, the test time is shortened.
(Reproduction of voice response service)
Next, reproduction of the voice response service will be described. As described above, when a normal service is provided, a signal input from the push button of the telephone 30 by the user U is recognized by the line control unit 14, and the push button signal is read into the service execution unit 17.
[0117]
On the other hand, when the voice response service is reproduced, the service execution unit 17 is connected to the history information extraction unit 12 by the switching unit 20. The history information extraction unit 12 extracts (identifies) services that the user has used in the past from the history information storage unit 11, and uses the numerical value input from the push button recorded in the history information storage unit 11 as the read value of the push button. To the service execution unit 17.
[0118]
The voice synthesis processing unit 18 reproduces the history information extracted by the history information extraction unit 12 via the service execution unit 17, that is, the past voice response service. As described above, in the first embodiment, the voice response service can be reproduced by executing the voice response service in a pseudo manner.
[0119]
The operation at this time will be described with reference to FIG. 1, FIG. 2, and FIG. As an example, the content of the service of the user number 651123 that used the service on September 6, 1994 is reproduced.
[0120]
At this time, the voice reproduced by the voice synthesis processing unit 18 is sent to the telephone 30 of the user U by the line control unit 14. Users can listen to the contents of past services.
[0121]
The history information extraction unit 12 specifies, from the history information of FIG. 2 stored in the history information storage unit 11, the history information of the user whose service is to be reproduced. The method for specifying history information will be described in detail in the section (Operation to specify user history) with reference to FIG. 9. Here, for example, history information A is specified from the history information shown in FIG. 2.
[0122]
The switching unit 20 connects the history information extraction unit 12 to the service execution unit 17 and starts the history information extraction unit 12, so that reproduction begins according to the narration story of FIG. 5.
[0123]
Since the operation of the narration story output unit 15 has been described in detail in <Narration Story Output Unit> above, description thereof is omitted here.
In the procedure number 1 and the procedure number 2 in FIG. 5, the voice response service unit 10 reads out the outline of the service and the input of the user number as usual.
[0124]
When the user number of procedure number 3 is to be input, the service execution unit 17 instructs the history information extraction unit 12 to read a push button. The history information extraction unit 12 reads the user number “651123” 103 of the history information shown in FIG. 2, and the voice synthesis processing unit 18 synthesizes “roku go ichi ichi ni san” to obtain voice data. The voice data is sent to the telephone 30 of the user U by the line control unit 14, and the user number is conveyed to the user.
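Turning a recorded push button number into a digit-by-digit reading before synthesis can be sketched as below. The reading table is an assumption using common Japanese digit readings in romanized form; the specification does not list the readings the device uses.

```python
# Common digit readings (an assumed table; variants such as "shi" or
# "shichi" also exist and are not handled here).
DIGIT_READINGS = {
    "0": "zero", "1": "ichi", "2": "ni", "3": "san", "4": "yon",
    "5": "go", "6": "roku", "7": "nana", "8": "hachi", "9": "kyu",
}


def digits_to_reading(digits: str) -> str:
    """Read a push button number one digit at a time."""
    return " ".join(DIGIT_READINGS[d] for d in digits)


reading = digits_to_reading("651123")
print(reading)  # roku go ichi ichi ni san
```

The resulting string plays the role of the pronunciation character string handed to the rule synthesis step.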
[0125]
Procedure number 5 is the normal operation. In procedure number 6, since the process branches based on the result from the user, the service execution unit 17 instructs the history information extraction unit 12 to read a push button. The history information extraction unit 12 reads the user confirmation “1” 105, and the speech synthesis processing unit 18 synthesizes “1” to obtain speech data. The voice data is sent to the telephone 30 of the user U by the line control unit 14, the confirmation result is transmitted to the user, and the process branches to the procedure number 7.
[0126]
Although description of the subsequent processing is omitted, by performing the service up to the line disconnection 121, the past service can be reproduced.
When the voice synthesis processing unit 18 is connected to the pseudo-line control unit 19 by the switching unit 50, the voice data from the voice synthesis processing unit 18 may be output to the speaker 33 via the pseudo-line control unit 19.
[0127]
Further, in this example, the speech synthesis processing unit 18 converts the push button number into speech and conveys the number to the user. If necessary, the number can be withheld from the user by not sending the push button number to the speech synthesis processing unit 18.
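The reproduction described above can be sketched as follows. This is only a minimal illustration, not the implementation of the embodiment: the procedure layout (a dictionary keyed by procedure number with hypothetical `kind`, `text`, `next`, and `branch` fields), the narration wording, and the function name `replay_service` are all assumptions. It shows the one idea taken from the description: at each input step, the push button value is taken from the recorded history instead of from the telephone, read out digit by digit, and used to select the branch.

```python
from typing import Dict, Iterable, List, Optional


def replay_service(procedure: Dict[int, dict],
                   recorded_inputs: Iterable[str],
                   start: int = 1) -> List[str]:
    """Replay a past service; return the utterances in the order spoken.

    `procedure` is a hypothetical representation of the service story.
    """
    inputs = iter(recorded_inputs)
    spoken: List[str] = []
    step_no: Optional[int] = start
    while step_no is not None:
        step = procedure[step_no]
        if step["kind"] == "narration":
            # Normal operation: the narration is read out as usual.
            spoken.append(step["text"])
            step_no = step.get("next")
        else:
            # Input step: read the push button value from the history
            # instead of waiting for the telephone.
            value = next(inputs)
            spoken.append(" ".join(value))  # digits are read one by one
            # Branch on the recorded value, falling back to "next".
            step_no = step.get("branch", {}).get(value, step.get("next"))
    return spoken


# Hypothetical procedure, loosely modelled on the steps described above.
procedure = {
    1: {"kind": "narration", "text": "Please enter your user number.", "next": 2},
    2: {"kind": "input", "next": 3},
    3: {"kind": "narration", "text": "Press 1 to confirm or 2 to re-enter.", "next": 4},
    4: {"kind": "input", "branch": {"1": 5, "2": 1}},
    5: {"kind": "narration", "text": "Thank you.", "next": None},
}
replayed = replay_service(procedure, ["651123", "1"])
```

Withholding the push button number from the user, as mentioned above, would correspond to skipping the `spoken.append` call in the input branch.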
(Operation to specify user history)
Next, the operation for specifying the user history will be described in detail. FIG. 9 is a flowchart showing a process for specifying a record the user wants to reproduce from the history file. The operation for specifying the user history is performed by the history information extraction process of the history information extraction unit 12.
[0128]
First, a history file of the use date in the history information storage unit 11 is opened (ST40). Here, the history file in FIG. 2 will be described as an example. The following processing is performed until the end of the history file.
[0129]
First, one line is read from the history file (ST42). For example, if “line disconnection” is recorded in the line read in ST42, that position may be the playback start position. For this reason, line number 1 is recorded in the history information storage unit 11 (ST43, ST44).
[0130]
If “line disconnection” is recorded in the line read in ST42 (ST45), and a user history is found (ST46), line number 1 is determined as the reproduction start position (ST47).
[0131]
When “user number input narration” is recorded in the line read in ST42 (ST48), the push button input numerical value from the next line is set as the user number (ST49).
[0132]
Here, when the user number to be reproduced matches the user number of ST49 (ST50), it is determined that the corresponding user history has been found (ST51).
After performing the above processing until the end of the history file, close the history file of the usage date. Thereby, the user history can be specified.
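The scan of steps ST40 to ST51 can be sketched as follows. This is a sketch under stated assumptions, not the flowchart of FIG. 9 itself: the record strings (`"user number input narration"`, `"push button <digits>"`, `"line disconnection"`) are a hypothetical text form of the history file, a session is assumed to begin on line 1 or on the line after the previous "line disconnection" record, and the function returns at the first match rather than reading to the end of the file.

```python
from typing import Iterable, Optional, Tuple

PB_PREFIX = "push button "  # hypothetical prefix of a push button record


def find_user_session(lines: Iterable[str],
                      target_user: str) -> Optional[Tuple[int, int]]:
    """Return (start_line, end_line) of the first session of target_user."""
    candidate_start = 1          # possible playback start position (ST44)
    expect_user_number = False   # True right after the narration (ST48)
    found = False                # matching user history found (ST51)
    for line_no, line in enumerate(lines, start=1):   # read one line (ST42)
        record = line.strip()
        if record == "user number input narration":   # ST48
            expect_user_number = True
        elif expect_user_number and record.startswith(PB_PREFIX):
            expect_user_number = False
            user_number = record[len(PB_PREFIX):]      # ST49
            if user_number == target_user:             # ST50
                found = True
        elif record == "line disconnection":           # ST45
            if found:                                  # ST46
                return candidate_start, line_no        # ST47
            # Not the wanted user: the next session may start here.
            candidate_start = line_no + 1
            expect_user_number = False
    return None


history = [
    "line connection",
    "user number input narration",
    "push button 123456",
    "line disconnection",
    "line connection",
    "user number input narration",
    "push button 651123",
    "push button 1",
    "line disconnection",
]
session = find_user_session(history, "651123")
```

Only the push button value that directly follows the user number narration is compared, so a later input such as the confirmation “1” is never mistaken for a user number.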
[0133]
As described above, in the first embodiment, the log information shown in FIG. 2 can be written to a specific writing area of the history information storage unit 11 when, for example, a product ordering operation is performed interactively between the user U side and the computer C side. By reading the written log information as appropriate, the ordering operation can be handled smoothly when there is an error in the ordering information.
[0134]
In addition, erroneous operations caused by the user pressing the wrong push button can be tracked.
[0135]
For example, the history information storage unit 11 can write time-series information such as the telephone line connection and disconnection times, the user's user information start time, and the user's input from a push button. For this reason, when the user makes an input error, the erroneous operation location can easily be identified by reproducing the information.
[0136]
Furthermore, test man-hours can be reduced by performing automatic testing.
<Embodiment 2>
FIG. 10 is a block diagram showing Embodiment 2 of the voice response service system including the voice response service device of the present invention. When a service is provided by a human, the person speaks louder or more slowly according to the other party's situation. That is, the delivery is adjusted so that the other party can easily understand.
[0137]
In the second embodiment, the computer automatically changes the sound quality according to information such as the other party's age, gender, and the state of pronunciation such as ease of hearing, and dynamically provides the service with sound quality adapted to the user.
[0138]
Here, only the configuration different from the configuration shown in FIG. 1 of the first embodiment in the configuration shown in FIG. 10 will be described. The other components are denoted by the same reference numerals, and detailed description thereof is omitted here.
[0139]
In FIG. 10, an attribute table 40 and a state rule table 41 are connected to the service execution unit 17. These tables are provided in a database (not shown). The attribute table 40 sets a volume level according to the sex of the user and the age category of the user. The state rule table 41 sets the volume or the speaking speed according to the state of the user's input operation.
[0140]
FIG. 11 is a block diagram showing the main part of the second embodiment shown in FIG. In FIG. 11, the attribute table 40 is composed of a gender rule table 40a storing male sounds or female sounds depending on the user's gender, and a volume table 40b whose volume level changes depending on the age of the user.
[0141]
In the gender rule table 40a, when the user is a man, a synthesized female voice is selected in order to produce a soft voice. When the user is a woman, a synthesized male voice is selected.
[0142]
In the volume table 40b, volume levels suitable for the user's age classification can be roughly set.
For example, the volume level is set to 4 when the user's age is 0 to 59, and the volume level is set to 5 when the user's age is 60 to 69. The volume level is set to 6 when the user's age is 70 to 79, and the volume level is set to 7 when the user's age is 80 to 89.
[0143]
That is, for example, the volume level is increased in order from volume level 1, and the volume is set to be larger as the numerical value is larger. Rough volume is set according to the user's gender, age group, etc.
[0144]
The sex rule table 40a and the volume table 40b set the volume level suitable for the general user. The state rule table 41 adjusts the volume or voice speed according to the user.
[0145]
The sex rule table 40a, the volume table 40b, and the state rule table 41 are connected to the service execution unit 17d. The service execution unit 17d is connected to the voice attribute setting event generation unit 42 and the voice synthesis processing unit 18.
[0146]
For example, when an instruction indicating that the volume is low and the speed is low is transmitted from the push button to the voice attribute setting event generation unit 42, the voice attribute setting event generation unit 42 generates a voice attribute setting event for the user.
[0147]
The service execution unit 17d comprises a rule check processing unit 17a that checks rules in the attribute table 40, a voice attribute setting unit 17b that sets voice attributes according to the processing output of the rule check processing unit 17a, and a narration sentence output unit 17c that outputs the narration sentence from the narration story output unit 15 according to the voice attribute output of the voice attribute setting unit 17b.
[0148]
The state rule table 41 can be set to increase the volume by one level and decrease the voice speed by one level.
The speech synthesis processing unit 18 performs speech synthesis processing from the output of the speech attribute setting unit 17b and the output of the narration sentence output unit 17c.
(Operation of Embodiment 2)
Next, the operation of the second embodiment will be described. For example, when the user attribute is present in the database and is obtained from the user number, or when an arbitrary user inputs his or her attributes from the push button of the telephone 30, the voice attribute setting event generation unit 42 generates a voice attribute setting event.
[0149]
The rule check processing unit 17a refers to the gender rule table 40a and the volume table 40b in response to the output from the voice attribute setting event generation unit 42, and outputs the referenced table information to the voice attribute setting unit 17b. The voice attribute setting unit 17b sets the user's voice attribute.
[0150]
For example, when a 61-year-old man uses this voice response service, the gender rule table 40a and the volume table 40b are referred to, so that a female voice is set in order to soften the voice, and the volume level is set to 5.
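The table lookup in this example can be sketched as follows. The table contents are those given for the gender rule table 40a and the volume table 40b; the Python names, the dictionary layout, and the handling of ages outside the listed ranges (an error here, since the description gives no value for them) are assumptions made only for illustration.

```python
# Gender rule table 40a: the synthesized voice is of the opposite gender.
GENDER_RULE_TABLE = {"male": "female", "female": "male"}

# Volume table 40b: (lowest age, highest age, volume level).
VOLUME_TABLE = [
    (0, 59, 4),
    (60, 69, 5),
    (70, 79, 6),
    (80, 89, 7),
]


def lookup_voice_attributes(gender: str, age: int) -> dict:
    """Return the rough voice attributes for a user's gender and age."""
    voice = GENDER_RULE_TABLE[gender]
    for low, high, level in VOLUME_TABLE:
        if low <= age <= high:
            return {"voice": voice, "volume": level}
    # No level is given for other ages, so none is invented here.
    raise ValueError(f"no volume level defined for age {age}")


attributes = lookup_voice_attributes("male", 61)
```

For the 61-year-old man of the example, the lookup yields a female voice at volume level 5.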
[0151]
In addition, when a user input operation is reported to the voice attribute setting event generation unit 42, the rule check processing unit 17a refers to the state rule table 41. The voice attribute setting unit 17b sets a voice attribute from the table information of the state rule table 41.
[0152]
For example, when the user makes an input error, the voice speed is decreased by one level and the volume is increased by one level.
That is, the input error is taken to mean that the narration was too fast and too quiet to be heard properly. In this case, the sound quality is changed so that the voice speaks slowly and loudly.
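The state rule above can be sketched as follows. The rule itself (volume up one level, speed down one level on an input error) is taken from the description; the state name `"input_error"`, the class and function names, and the level bounds `min_level` and `max_level` are hypothetical and appear only so that the adjustment stays in a valid range.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class VoiceAttributes:
    volume: int  # larger value means louder
    speed: int   # larger value means faster


# State rule table 41: adjustment per observed user state.
STATE_RULE_TABLE = {
    "input_error": {"volume": +1, "speed": -1},
}


def apply_state_rule(attrs: VoiceAttributes, state: str,
                     min_level: int = 1,
                     max_level: int = 10) -> VoiceAttributes:
    """Return the attributes adjusted for the given user state."""
    rule = STATE_RULE_TABLE.get(state)
    if rule is None:
        return attrs  # no rule for this state: leave attributes unchanged

    def clamp(value: int) -> int:
        # Keep the level inside the assumed bounds.
        return max(min_level, min(max_level, value))

    return replace(attrs,
                   volume=clamp(attrs.volume + rule["volume"]),
                   speed=clamp(attrs.speed + rule["speed"]))


adjusted = apply_state_rule(VoiceAttributes(volume=5, speed=3), "input_error")
```

Keeping the rough setting from the attribute table separate from this per-user adjustment mirrors the division between the attribute table 40 and the state rule table 41.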
[0153]
On the other hand, when a voice attribute setting command suitable for the user is output from the voice attribute setting unit 17b and a narration is output from the narration sentence output unit 17c, the speech synthesis processing unit 18 synthesizes speech with the sound quality suitable for the user.
[0154]
As described above, in the second embodiment, the sound quality is automatically changed for any user according to information such as the user's age classification, gender, and the state of pronunciation such as ease of hearing, so that the voice response service is provided with sound quality that is easy to listen to.
[0155]
As a result, it is possible to improve user service by eliminating erroneous operations due to mistakes in listening.
Further, when the second embodiment is applied to a product ordering business, a reservation business, or a data requesting business, the user's misrecognition of the voice is reduced, and these operations can be performed smoothly.
[0157]
[Effect of the invention]
According to the present invention, a past service can be automatically reproduced from the log information, and the contents of the service can be easily checked.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a first embodiment of a voice response service system including a voice response service device according to the present invention.
FIG. 2 is a diagram showing history information according to the first embodiment of the present invention.
FIG. 3 is a diagram showing a sequence in which a service is executed according to the narration story of the first embodiment.
FIG. 4 is a diagram illustrating an example in which a failure occurs during the execution of the service according to the first embodiment.
FIG. 5 is a diagram describing service contents according to the first embodiment.
FIG. 6 is a diagram illustrating a speech synthesis method according to the first embodiment.
FIG. 7 is a diagram illustrating a specific configuration example of a speech synthesis processing unit employing a speech synthesis method.
FIG. 8 is a diagram showing a specific example in which speech synthesis is performed by the speech synthesis method shown in FIG.
FIG. 9 is a flowchart showing processing for specifying a record that a user wants to reproduce from a history file.
FIG. 10 is a block diagram showing a second embodiment of a voice response service system including a voice response service device according to the present invention.
FIG. 11 is a block diagram showing the main part of the second embodiment shown in FIG.
[Explanation of symbols]
10 Voice response service unit
11 History information storage unit
12 History information extraction unit
14 Line control unit
15 Narration story output unit
16 Narration story analysis unit
17 Service execution unit
17a Rule check processing unit
17b Voice attribute setting unit
17c Narration sentence output unit
18 Voice synthesis processing unit
19 Pseudo-line control unit
20 Switching unit
22 Database access unit
23 Database
25 Text file
26 Accumulated audio files
27 History information writing unit
30 Telephone
31 Language processor
32 Sound processor
33 Speaker
34 Text analysis unit
35 Word dictionary
36 Reading / prosodic symbol assignment unit
37 Intonation generator
38 Waveform synthesis unit
39 Phoneme fragment
40 Attribute table
41 State rule table
42 Voice attribute setting event generation unit

Claims

A voice response service device that processes a voice response service according to a push button signal input from a user based on a processing procedure of the voice response service, the device comprising:
log information storage means for storing, in time series, log information of the voice response service that includes, as character information, push button signal information input from the user in accordance with the processing procedure of the voice response service;
speech synthesis processing means for synthesizing the character information of the push button signal into speech;
log information extraction means for extracting log information, based on specific push button signal information, from the log information stored in the log information storage means; and
voice response service reproduction means for reproducing, based on the processing procedure of the voice response service and in response to the push button signal information of the log information extracted by the log information extraction means, the voice response service including the speech synthesized from the push button signal information by the speech synthesis processing means.

A voice response service device that processes a voice response service according to a push button signal input from a user based on a processing procedure of the voice response service, the device comprising:
log information storage means for storing, in time series, log information of the voice response service including push button signal information that includes the user number of the user and is input from the user according to the processing procedure of the voice response service;
log information extraction means for identifying log information including a corresponding user number from the log information stored in the log information storage means and extracting the log information to be reproduced; and
voice response service reproduction means for reproducing the voice response service, based on the processing procedure of the voice response service, according to the push button signal information of the log information extracted by the log information extraction means.