JP3608449B2

JP3608449B2 - Voice response method and apparatus, and storage medium storing voice response program

Info

Publication number: JP3608449B2
Application number: JP25564299A
Authority: JP
Inventors: 佳織楢原; 弘行松井; 亮造布川
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1999-09-09
Filing date: 1999-09-09
Publication date: 2005-01-12
Anticipated expiration: 2019-09-09
Also published as: JP2001086243A

Description

【０００１】
【発明の属する技術分野】
本発明は、音声応答方法及び装置及び音声応答プログラムを格納した記憶媒体に係り、特に、通信網に接続され、発信者に対して応答するための音声応答方法及び装置及び音声応答プログラムを格納した記憶媒体に関する。
【０００２】
【従来の技術】
図１０は、従来の応答装置の構成を示す。
従来の応答装置は、回線インタフェース部１、着信検出部２、通話部４、応答部５、及び制御部６から構成される。
同図に示す応答装置では、回線インタフェース部１に着信があると、着信検出部２は、着信を検出し、制御部６に通知し、着信表示し、オペレータは通話部４で応答する。
【０００３】
あるいは、無人化のため夜間などはオペレータの代わりに、応答部５を設け、音声ガイダンスと押しボタンダイヤル信号認識により応答する。
また、電子交換機システムにおいて、ＩＳＤＮ網が提供する発ＩＤにより応答メッセージを変化させる方法（特開平６−２３７２９６）や、発ＩＤにより利用者を検索し、サービス内容を決定することが可能なファクシミリ装置（特開平９−６５０８８）等がある。
【０００４】
【発明が解決しようとする課題】
しかしながら、従来の応答装置では、無人対応時の着信側の応答形式が限定され、利用者の望む応答形式で応答できないという問題がある。また、ＰＢによる選択分岐が多く、利用者の求めるサービスに辿り着くまでに時間がかかるという問題がある。
【０００５】
また、全ての着信呼が夜間受け付けへ転送される場合は、夜間もオペレータの稼働が大きいという問題がある。
さらに、従来の発ＩＤにより応答メッセージを変化させる方法（特開平６−２３７２９６）や、発ＩＤにより利用者を検索し、サービス内容を決定することが可能なファクシミリ装置（特開平９−６５０８８）は、多数話者への音声認識の適用には、話者を特定するＩＤ番号を利用者に入力させる必要がある。また、話者の発話の一部を用いて、蓄積されている多数の話者毎の認識パターンとの照合処理により話者の特定を行う必要がある。
【０００６】
本発明は、上記の点に鑑みなされたもので、音声認識の認識率を向上させ、発話者別に応答形式、応答方法を可変とし、音声認識に向いていない話者と判断される場合には、音声認識を試みないことにより操作時間を短縮することが可能な音声応答方法及び装置及び音声応答プログラムを格納した記憶媒体を提供すること目的とする。
【０００７】
【課題を解決するための手段】
図１は、本発明の原理を説明するための図である。
本発明（請求項１）は、通信網に接続され、発信者に対して音声で応答するための音声応答方法において、
発信者毎に過去の音声認識の成功／失敗の結果、応答回数、応答時における応答形式を含む応答履歴を保持しておき（ステップ１）、
発信者から着信呼があった場合に（ステップ２）、該発信者に対応する応答履歴を検索し（ステップ３）、
検索された発信者に対応する応答履歴において、連続した一定回数以上の音声認識の成功履歴がある場合には（ステップ４）、音声認識による応答を行う（ステップ５）。
【０００８】
本発明（請求項２）は、応答履歴を発信者の発ＩＤ毎に保持しておき、
発信者から着信呼があった場合に、該発信者の発ＩＤを取得し、
発ＩＤを発信者特定のキーとして、応答履歴を検索し、
検索された応答履歴に基づいて応答形式を決定する。
本発明（請求項３）は、応答形式として、
音声認識と音声ガイダンスの組み合わせ、押しボタンダイヤル信号認識と音声ガイダンスの組み合わせ、音声ガイダンスと通話録音の組み合わせ、音声ガイダンス、着信呼を転送し、転送先における応答の何れかを用いる。
【０００９】
本発明（請求項４）は、応答履歴に発ＩＤが登録されていない場合には、応答形式を発信者に指定させ、
発ＩＤのある着信呼については、応答履歴から該発ＩＤに対応する応答履歴情報を取得し、応答回数の履歴により応答形式を決定する。
本発明（請求項５）は、応答時に所定の時間無音である場合に、応答形式を変更する、または、音声認識の成功／失敗により応答形式を変更する、または、発信者の操作により応答形式を変更する制御方法のうちのいずれか、または、複数の制御を行う。
【００１０】
図２は、本発明の原理構成図である。
本発明（請求項６）は、通信網に接続され、発信者に対して音声で応答するための音声応答装置であって、
発信者毎に過去の音声認識の成功／失敗の結果、応答回数、応答時における応答形式を含む応答履歴を保持する応答形式履歴蓄積手段１１と、
発信者からの着信呼を検出する着信検出手段２と、
着信検出手段２において、着信呼があった場合に、該発信者に対応する応答履歴を応答形式履歴蓄積手段１１より検索する履歴検索手段６と、
履歴検索手段６により検索された発信者に対応する応答履歴において、連続した一定回数以上の音声認識の成功履歴がある場合には、音声認識による応答を行う応答手段５とを有する。
【００１１】
本発明（請求項７）は、応答形式履歴蓄積手段１１において、
応答履歴を発信者の発ＩＤ毎に保持しておき、
履歴検索手段６において、
発信者から着信呼があった場合に、該発信者の発ＩＤを取得し、発ＩＤを発信者特定のキーとして、応答履歴を検索する手段を含み、
応答手段５において、
履歴検索手段６により検索された応答履歴に基づいて応答形式を決定する手段を含む。
【００１２】
本発明（請求項８）は、応答履歴の応答形式として、
音声認識と音声ガイダンスの組み合わせ、押しボタンダイヤル信号認識と音声ガイダンスの組み合わせ、音声ガイダンスと通話録音の組み合わせ、音声ガイダンス、着信呼を転送し、転送先における応答の何れかを保持する。
本発明（請求項９）は、応答履歴に発ＩＤが登録されていない場合には、応答形式を発信者に指定させる応答形式指定指示手段を更に有し、
応答手段５において、
発ＩＤのある着信呼については、履歴検索手段６により検索された応答回数の履歴により応答形式を決定する手段を含む。
【００１３】
本発明（請求項１０）は、発信者に対する応答時に所定の時間無音であることを検出する無音時間検出手段を更に有し、
応答手段５において、
無音時間検出手段において所定の時間無音である場合に、応答形式を変更する手段、または、音声認識の成功／失敗により応答形式を変更する手段、または、発信者の操作により応答形式を変更する手段のうちのいずれか、または、複数の手段を実行する。
【００１４】
本発明（請求項１１）は、通信網に接続され、発信者に対して音声で応答するための音声応答装置に搭載される音声応答プログラムを格納した記憶媒体であって、
発信者毎に過去の音声認識の成功／失敗の結果、応答回数、応答時における応答形式を含む応答履歴を記憶手段に蓄積させる応答形式履歴格納プロセスと、
発信者からの着信呼を検出する着信検出プロセスと、
着信検出プロセスにおいて、着信呼があった場合に、該発信者に対応する応答履歴を記憶手段より検索する履歴検索プロセスと、
履歴検索プロセスにより検索された発信者に対応する応答履歴において、連続した一定回数以上の音声認識の成功履歴がある場合には、音声認識による応答を行う応答プロセスとを有する。
【００１５】
本発明（請求項１２）は、応答形式履歴格納プロセスにおいて、
応答履歴を発信者の発ＩＤ毎に記憶手段に格納し、
履歴検索プロセスにおいて、
発信者から着信呼があった場合に、該発信者の発ＩＤを取得し、発ＩＤを発信者特定のキーとして、応答履歴を検索するプロセスを含み、
応答プロセスにおいて、
履歴検索プロセスにより検索された応答履歴に基づいて応答形式を決定するプロセスを含む。
【００１６】
本発明（請求項１３）は、応答履歴の応答形式として、
音声認識と音声ガイダンスの組み合わせ、押しボタンダイヤル信号認識と音声ガイダンスの組み合わせ、音声ガイダンスと通話録音の組み合わせ、音声ガイダンス、着信呼を転送し、転送先における応答の何れかを用いる。
本発明（請求項１４）は、記憶手段に応答履歴として発ＩＤが登録されていない場合には、応答形式を発信者に指定させる応答形式指定指示プロセスを更に有し、
応答プロセスにおいて、
発ＩＤのある着信呼については、履歴検索プロセスにより検索された応答回数の履歴により応答形式を決定するプロセスを含む。
【００１７】
本発明（請求項１５）は、発信者に対する応答時に所定の時間無音であることを検出する無音時間検出プロセスを更に有し、
応答プロセスは、
無音時間検出プロセスにおいて所定の時間無音である場合に、応答形式を変更するプロセス、または、音声認識の成功／失敗により応答形式を変更するプロセス、または、発信者の操作により応答形式を変更するプロセスのうちのいずれか、または、複数のプロセスを実行する。
【００１８】
上述のように、本発明によれば、発話者の過去の音声認識の成功・不成功の履歴を判定して、連続した一定回数以上の音声認識成功履歴がある発話者の場合には、音声認識による応答を起動することにより、音声認識の認識率を向上させることが可能となる。
また、発話者別に応答形式・応答方法を変更することが可能となる。
【００１９】
さらに、発話者の履歴を検索する際に、電話回線を経由して送出される発ＩＤ（発信電話番号）を発話者特定のキーとする検索手段を用いることにより、音声認識利用者の操作性を向上させることが可能となる。
【００２０】
【発明の実施の形態】
図３は、本発明の応答装置の構成を示す。
同図に示す応答装置は、通信網と接続する回線インタフェース部１、通信網からの着信を検出する着信検出部２、通信網から送られてくる発ＩＤを検出する発ＩＤ検出部３、利用者との通話を行う通話部４、着信に応答できる複数の応答形式を有する応答部５、当該装置を制御する制御部６、及び応答部５における応答履歴を発ＩＤ別、応答形式別に蓄積する発ＩＤ別応答形式履歴蓄積部１１から構成される。
【００２１】
発ＩＤ別応答形式履歴蓄積部１１は、応答履歴として、発ＩＤ、発信者毎の過去の音声認識の成功／失敗の結果、応答回数、応答時における応答形式を蓄積する。
制御部６は、着信時に発ＩＤ検出部３により検出した発ＩＤに基づいて、発ＩＤ別応答形式履歴蓄積部１１から応答履歴を検索し、得られた応答履歴から応答部５において応答する形式を指定する。
【００２２】
次に、上記の構成における動作を説明する。
図４は、本発明の応答装置の動作のシーケンスチャートである。
ステップ１０１）回線インタフェース部１を介して通信網から着信検出部２が着信を検出し、制御部６に通知する。
ステップ１０２）発ＩＤ検出部３において、通信網から送られてくる発ＩＤを検出し、制御部６に通知する。
【００２３】
ステップ１０３）制御部６は、発ＩＤ検出部３から取得した発ＩＤに基づいて発ＩＤ別応答形式履歴蓄積部１１を検索し、当該発ＩＤに対応する応答履歴を取得する。
ステップ１０４）制御部６は、取得した応答履歴を応答部５に渡し、応答部５は、当該応答部５が有する複数の応答形式１〜ｎにおいて応答履歴に対応する応答形式を決定する。
【００２４】
ステップ１０５）応答部５は、決定された応答形式を通話部４に転送し、通話部４から、通信網を介して利用者に応答する。
これにより、発ＩＤ別応答形式履歴蓄積部１１に蓄積されている発ＩＤ別応答形式に基づいて発信者別に応答形式を可変して提供することが可能となる。
【００２５】
【実施例】
以下、図面と共に本発明の実施例を説明する。
［第１の実施例］
図５は、本発明の第１の実施例の応答装置の構成を示す。同図において図３の構成と同一部分については、同一符号を付し、その説明を省略する。
【００２６】
図５に示す応答装置は、図３の構成に、転送先電話帳蓄積部１５、発信部１６、通話路スイッチ１７、応答検出部１８を付加し、応答部５に、押しボタンダイヤル信号・ダイヤルパルス信号（ＰＢ・ＤＰ）認識部８、音声認識部９、通話録音部１０及び音声ガイダンス部１９を付加した構成である。
制御部６において、発ＩＤ検出部３により検出された発ＩＤに基づいて、発ＩＤ別応答形式履歴蓄積部１１を参照し、応答形式を取得し、当該応答形式に応じて、応答部５の種々の機能に応答形式を通話部４に出力するよう指示する。
【００２７】
本実施例における発ＩＤ別応答形式履歴蓄積部１１には、応答形式として、音声ガイダンスと音声認識を組み合わせた応答形式、音声ガイダンスとＰＢ・ＤＰ認識を組み合わせた応答形式、音声ガイダンスと通話録音を組み合わせた応答形式、音声ガイダンスの応答形式、または、着信呼を転送先に転送し、当該転送先から応答する応答形式等が各発ＩＤ毎に蓄積されているものとする。
【００２８】
また、応答部５における応答形式として、音声ガイダンスと音声認識を組み合わせた応答形式、音声ガイダンスとＰＢ・ＤＰ認識を組み合わせた応答形式、音声ガイダンスと通話録音を組み合わせた応答形式、音声ガイダンスの応答形式、または、着信呼を転送先に転送し、当該転送先から応答する応答形式を指定する。
【００２９】
応答部５の応答認識部９は、例えば、特開平１０−１９０８４２や特開平７−２３０２９５に開示されているような、応答形式に基づいて、通話部４から取得した音声データを音声認識する。
着信呼の転送において、制御部６は、転送先電話帳蓄積部１５から転送先を読み出し、発信部１６から転送先に発信する。さらに、転送応答時に、応答検出部１８にて利用者からの応答を検出し、制御部６へ通知する。制御部６は、通話路スイッチ１７を制御し、着信呼と転送先間で通話路を形成し、着信呼を転送する制御を行う。
【００３０】
図６は、本発明の第１の実施例の応答装置の動作を示すシーケンスチャートである。
ステップ２０１）回線インタフェース部１を介して通信網から着信検出部２が着信を検出し、制御部６に通知する。
ステップ２０２）発ＩＤ検出部３において、通信網から送られてくる発ＩＤを検出し、制御部６に通知する。
【００３１】
ステップ２０３）制御部６は、発ＩＤ検出部３から取得した発ＩＤに基づいて発ＩＤ別応答形式履歴蓄積部１１を検索し、当該発ＩＤに対応する応答履歴を取得する。
ステップ２０４）制御部６は、応答履歴から応答部５において当該発ＩＤに対応する応答形式を参照し、当該応答形式に応じて、応答部５の各機能（ＰＢ・ＤＰ認識部８、音声認識部９、通話録音部１０、音声ガイダンス部１９のいずれか、または、複数組み合わせた機能）から上述した方法により選択する。
【００３２】
ステップ２０５）着信呼以外の場合には、ステップ２０４により選択された応答形式に基づいて、通話部４から通信網を介して利用者に応答する。
ステップ２０６）着信呼の場合には、制御部６において転送先への応答形式を決定する。
ステップ２０７）さらに、制御部６は、着信呼から転送先ＩＤを抽出し、該転送先ＩＤに基づいて転送先電話帳蓄積部１５から転送先を読み出す。
【００３３】
ステップ２０８）発信部１６は、読み出された転送先にステップ２０６で決定された応答形式に対応する応答を行う。
ステップ２０９）応答検出部１８において転送先からの応答を検出すると、制御部６は、着信呼が転送可能となるように通話路スイッチ１７を制御する。
ステップ２１０）着信呼と転送先との間で通話路を形成し、発信部１６より着信呼を転送先に転送する。
【００３４】
［第２の実施例］
本実施例では、ある一定時間以上無音時間が継続した場合の処理、及び、応答履歴の音声認識が成功している応答回数を抽出し、所定の回数連続している発ＩＤを有する場合に、音声認識による応答を行う処理について説明する。
図７は、本発明の第２の実施例の応答装置の構成を示し、図５と同一構成部分には同一符号を付し、その説明を省略する。
【００３５】
同図に示す構成は、図５の構成に、無音時間検出部１４を付加した構成である。無音時間検出部１４は、応答した着信呼をモニタし、無音時間が一定時間より長ければ、無音と判定する。
図８、図９は、本発明の第２の実施例の応答装置の動作を示すフローチャートである。
【００３６】
まず、応答装置は、無人で対応する（ステップ３００）。ここで、発ＩＤ検出部３において発ＩＤを検出できない場合には、応答部５から通話部４を介して音声ガイダンスを流す（ステップ３０１）。また、発ＩＤが検出できた場合には、制御部６は、発ＩＤ別応答形式履歴蓄積部１１より発ＩＤ別の応答履歴と応答回数を読み出し、応答回数が１回目の場合には応答部５に転送し、通話部４から有人応対を行う。このとき、オペレータは、ユーザのデータを発ＩＤ別応答形式履歴蓄積部１１に投入し、応対した内容を履歴として蓄積する（ステップ３０２）。応答回数が２回以上の場合、制御部６は、発ＩＤ別応答履歴蓄積部１１から発ＩＤ別応答履歴を読み出す（ステップ３０３）。
【００３７】
ここで、音声認識が連続２回上成功しており、音声認識が選択されている場合には、応答部５の音声認識部９より音声認識で応答する（ステップ３０５）。また、ステップ３０４において、ＰＢ・ＤＰ認識が選択されている場合には、応答部５のＰＢ・ＤＰ認識部８で応答する（ステップ３０６）。ステップ３０５における音声認識で、成功し、かつ無音でない場合には、応答履歴を更新する（ステップ３０７）。その後サービスを提供する（ステップ３０８）。なお、音声認識の成功／失敗の判断方法としては、応答の際に、ユーザが音声を入力し、認識した後に音声ガイダンスによりユーザのＰＢ等による確認手段により判定する方法を用いるものとする。
【００３８】
音声認識で応答に失敗、または、無音判定した場合には、応答部５のＰＢ・ＤＰ認識部８により応答する（ステップ３０６）。
ステップ３０６のＰＢ・ＤＰ認識において成功した場合には、応答履歴を更新し（ステップ３０９）。応答履歴の更新は、応答形式・注文内容をデータベースに書き込むことで達成されるものとする。その後サービスを提供する（ステップ３１０）。
【００３９】
ＰＢ・ＤＰ認識部８によるＰＢ・ＤＰ認識に失敗、または、無音判定で無音時間が一定時間よりも長い場合は、有人に転送するか、通話録音するかをガイダンスで質問し（ステップ３１１）、ユーザが転送を希望するならば、ユーザが転送操作して、発ＩＤ別応答形式履歴蓄積部１１の応答履歴を更新し（ステップ３１４）、転送、有人応対を行う（ステップ３１５）。ユーザが転送を希望しない場合は、発ＩＤ別応答形式履歴蓄積部１１の応答履歴を更新し（ステップ３１２）、通話録音をする（ステップ３１３）。
【００４０】
また、音声認識での応答で、連続した一定回数以上の成功履歴がある場合のユーザに対しては、始めから音声認識による応答を行い、それ以外のユーザに対しては、まず、始めに音声認識での応答を希望するかどうかを質問し、前回ＰＢ・ＤＰ認識と音声ガイダンスの組み合わせの形式で応答しているようなユーザは、応答の始めに音声認識での応答を望むかどうかを問い合わせ、ユーザが希望するなら音声認識での応答を行うものとする。
【００４１】
なお、上記の動作において、音声認識を用いた応答に２回以上連続で成功したか否かの判断においては、２回に限定されることなく、過去に所定の回数以上連続して音声認識に成功した履歴があるか否かで判断されるようにしてもよい。
また、上記の実施例は、図３、図５、図７の構成に基づいて説明しているが、これらの応答装置をプログラムとして構築し、応答装置として利用されるコンピュータに接続されるディスク装置、フロッピーディスク、ＣＤ−ＲＯＭ等の可搬記憶媒体に格納しておき、本発明を実施する際にインストールすることにより、容易に本発明を実現することが可能である。
【００４２】
なお、本発明は、上記の実施例に限定されることなく、特許請求の範囲内において種々変更・応用が可能である。
【００４３】
【発明の効果】
上述のように、本発明によれば、発信者の過去の音声認識の成功／失敗の履歴を判定して、連続した一定回数以上の音声認識成功履歴がある発信者の場合には、音声認識による応答を起動することにより、音声認識の認識率を向上させることができる。
【００４４】
また、本発明は、従来のように認識辞書を変更することではなく、発信者別に応答形式・応答方法を可変とすることにより、音声認識に向いていない話者と判断される場合には、以降の処理では、音声認識を試みないことにより、操作時間が短縮される。
また、通信網から送られてくる発信者識別情報を検出し、当該発信者識別情報に基づいて応答形式を選択して応答し、応答履歴を発信者識別情報別に蓄積することにより、それぞれの発信者にあった応答形式を提供することが可能となる。
【００４５】
また、発信者の応答履歴を検索する際に、従来のように、発信者に発ＩＤを入力させることなく、電話回線を経由し送出される発ＩＤを検索のための発信者の特定のキーとして用いることにより、音声認識利用者の操作性を向上させることができる。
また、応答する際に、音声認識と音声ガイダンスの組み合わせ、押しボタンダイヤル信号認識と音声ガイダンスの組み合わせ、音声ガイダンスと通話録音の組み合わせ、音声ガイダンス、等の応答形式を用いて着信呼を転送し、転送先にて応答することにより、発信者への負担を軽減させることが可能となる。
【００４６】
また、発信者識別情報のない着信呼に対しては応答形式を指定し、発信者識別情報のある着信呼に対しては、蓄積されている応答履歴を求め、応答回数の履歴により応答形式を指定することが可能となる。また、応答時の無音検出の場合や、発信者の操作により、応答形式を変更することが可能となる。
【図面の簡単な説明】
【図１】本発明の原理を説明するための図である。
【図２】本発明の原理構成図である。
【図３】本発明の応答装置の構成図である。
【図４】本発明の応答装置の動作のシーケンスチャートである。
【図５】本発明の第１の実施例の応答装置の構成図である。
【図６】本発明の第１の実施例の応答装置の動作を示すシーケンスチャートである。
【図７】本発明の第２の実施例の応答装置の構成図である。
【図８】本発明の第２の実施例の応答装置の動作を示すフローチャート（その１）である。
【図９】本発明の第２の実施例の応答装置の動作を示すフローチャート（その２）である。
【図１０】従来の応答装置の構成図である。
【符号の説明】
１回線インタフェース部
２着信検出手段、着信検出部
３発ＩＤ検出部
４通話部
５応答手段、応答部
６履歴検索手段、制御部
８ＰＢ・ＤＰ認識部
９音声認識部
１０通話録音部
１１発ＩＤ別応答形式履歴蓄積手段、発ＩＤ別応答形式履歴蓄積部
１５転送先電話帳蓄積部
１６発信部
１７通話路スイッチ
１８応答検出部
[0001]
BACKGROUND OF THE INVENTION
The present invention relates to a voice response method and apparatus, and a storage medium storing a voice response program, and more particularly to a voice response method and apparatus that are connected to a communication network and respond to a caller, and a storage medium storing a voice response program for that purpose.
[0002]
[Prior art]
FIG. 10 shows a configuration of a conventional response device.
The conventional response device includes a line interface unit 1, an incoming call detection unit 2, a call unit 4, a response unit 5, and a control unit 6.
In the response device shown in the figure, when an incoming call is received at the line interface unit 1, the incoming call detection unit 2 detects the incoming call, notifies the control unit 6, displays the incoming call, and the operator responds at the call unit 4.
[0003]
Alternatively, the response unit 5 is provided instead of the operator at night for unmanned operation and responds by voice guidance and push button dial signal recognition.
There are also, for electronic exchange systems, a method of changing the response message according to the calling ID provided by the ISDN network (Japanese Patent Laid-Open No. 6-237296), and a facsimile apparatus capable of searching for the user by the calling ID and determining the service contents (Japanese Patent Laid-Open No. 9-65088).
[0004]
[Problems to be solved by the invention]
However, in the conventional response device, there is a problem that the response format on the receiving side at the time of unattended response is limited, and it is impossible to respond in the response format desired by the user. Moreover, there are many selection branches by PB, and there is a problem that it takes time to reach the service requested by the user.
[0005]
Further, when all incoming calls are transferred to the night reception desk, there is a problem that the operators' workload remains high even at night.
Further, with the conventional method of changing a response message based on the calling ID (Japanese Patent Laid-Open No. 6-237296) and the facsimile apparatus capable of searching for a user based on the calling ID and determining service contents (Japanese Patent Laid-Open No. 9-65088), applying voice recognition to a large number of speakers requires the user to input an ID number that identifies the speaker. In addition, the speaker must be identified by collating part of the speaker's utterance against the stored recognition patterns of a large number of speakers.
[0006]
The present invention has been made in view of the above points. Its object is to provide a voice response method and apparatus, and a storage medium storing a voice response program, that improve the recognition rate of voice recognition, make the response format and response method variable for each speaker, and shorten the operation time by not attempting voice recognition when the speaker is judged to be unsuited to voice recognition.
[0007]
[Means for Solving the Problems]
FIG. 1 is a diagram for explaining the principle of the present invention.
The present invention (Claim 1) is a voice response method, used in connection with a communication network, for responding to a caller by voice, in which:
a response history including the past voice recognition success/failure results, the number of responses, and the response format used at each response is retained for each caller (step 1);
when there is an incoming call from a caller (step 2), the response history corresponding to that caller is searched (step 3); and
when the retrieved response history contains at least a certain number of consecutive voice recognition successes (step 4), the response is made by voice recognition (step 5).
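The decision in steps 4 and 5 can be sketched as follows. This is only an illustration under stated assumptions, not the patented implementation: the function names, the list-of-booleans representation of the history, and the default threshold of 2 (borrowed from the second embodiment) are all hypothetical. It also assumes the streak is counted over the most recent calls; the text could equally be read as accepting any past streak.

```python
from typing import Sequence


def trailing_successes(results: Sequence[bool]) -> int:
    """Count consecutive voice recognition successes at the end of the history."""
    count = 0
    for ok in reversed(results):
        if not ok:
            break
        count += 1
    return count


def should_use_voice_recognition(results: Sequence[bool], threshold: int = 2) -> bool:
    """Step 4: True when the caller has at least `threshold` consecutive successes."""
    return trailing_successes(results) >= threshold
```

Under this reading, a caller whose last call failed is not answered by voice recognition until a new streak has been built up.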
[0008]
The present invention (Claim 2) retains the response history for each caller's calling ID,
acquires the calling ID of the caller when there is an incoming call from the caller,
searches the response history using the calling ID as the key that identifies the caller, and
determines the response format based on the retrieved response history.
The present invention (Claim 3) uses, as the response format, any one of:
a combination of voice recognition and voice guidance; a combination of push-button dial signal recognition and voice guidance; a combination of voice guidance and call recording; voice guidance alone; or transfer of the incoming call with the response given at the transfer destination.
[0009]
The present invention (Claim 4) has the caller specify the response format when the calling ID is not registered in the response history, and,
for an incoming call with a calling ID, acquires the response history information corresponding to that calling ID from the response history and determines the response format from the history of the number of responses.
The present invention (Claim 5) performs one or more of the following controls: changing the response format when there is silence for a predetermined time during the response; changing the response format according to the success or failure of voice recognition; and changing the response format in response to an operation by the caller.
[0010]
FIG. 2 is a principle configuration diagram of the present invention.
The present invention (Claim 6) is a voice response device connected to a communication network for responding to a caller by voice, comprising:
response format history storage means 11 for holding, for each caller, a response history including the past voice recognition success/failure results, the number of responses, and the response format used at each response;
incoming call detection means 2 for detecting an incoming call from a caller;
history search means 6 for searching the response format history storage means 11 for the response history corresponding to the caller when the incoming call detection means 2 detects an incoming call; and
response means 5 for responding by voice recognition when the response history retrieved by the history search means 6 for the caller contains at least a certain number of consecutive voice recognition successes.
[0011]
In the present invention (Claim 7), the response format history storage means 11
retains the response history for each caller's calling ID,
the history search means 6
includes means for acquiring the calling ID of the caller when there is an incoming call from the caller and searching the response history using the calling ID as the key that identifies the caller, and
the response means 5
includes means for determining the response format based on the response history retrieved by the history search means 6.
[0012]
In the present invention (Claim 8), the response history holds, as the response format, any one of:
a combination of voice recognition and voice guidance; a combination of push-button dial signal recognition and voice guidance; a combination of voice guidance and call recording; voice guidance alone; or transfer of the incoming call with the response given at the transfer destination.
The present invention (Claim 9) further includes response format designation instruction means for having the caller designate a response format when the calling ID is not registered in the response history, and
the response means 5
includes means for determining, for an incoming call with a calling ID, the response format from the history of the number of responses retrieved by the history search means 6.
[0013]
The present invention (Claim 10) further includes silent time detection means for detecting that there has been silence for a predetermined time while responding to the caller, and
the response means 5
executes one or more of: means for changing the response format when the silent time detection means detects silence for the predetermined time; means for changing the response format according to the success or failure of voice recognition; and means for changing the response format in response to an operation by the caller.
[0014]
The present invention (Claim 11) is a storage medium storing a voice response program to be installed in a voice response device that is connected to a communication network and responds to a caller by voice, the program comprising:
a response format history storage process for accumulating in storage means, for each caller, a response history including the past voice recognition success/failure results, the number of responses, and the response format used at each response;
an incoming call detection process for detecting an incoming call from a caller;
a history search process for searching the storage means for the response history corresponding to the caller when the incoming call detection process detects an incoming call; and
a response process for responding by voice recognition when the response history retrieved by the history search process for the caller contains at least a certain number of consecutive voice recognition successes.
[0015]
In the present invention (Claim 12), the response format history storage process
stores the response history in the storage means for each caller's calling ID,
the history search process
includes a process of acquiring the calling ID of the caller when there is an incoming call from the caller and searching the response history using the calling ID as the key that identifies the caller, and
the response process
includes a process of determining the response format based on the response history retrieved by the history search process.
[0016]
The present invention (Claim 13) uses, as the response format of the response history, any one of:
a combination of voice recognition and voice guidance; a combination of push-button dial signal recognition and voice guidance; a combination of voice guidance and call recording; voice guidance alone; or transfer of the incoming call with the response given at the transfer destination.
The present invention (Claim 14) further includes a response format designation instruction process for having the caller designate a response format when the calling ID is not registered as a response history in the storage means, and
the response process
includes a process of determining, for an incoming call with a calling ID, the response format from the history of the number of responses retrieved by the history search process.
[0017]
The present invention (Claim 15) further includes a silent time detection process for detecting that there has been silence for a predetermined time while responding to the caller, and
the response process
executes one or more of: a process of changing the response format when the silent time detection process detects silence for the predetermined time; a process of changing the response format according to the success or failure of voice recognition; and a process of changing the response format in response to an operation by the caller.
[0018]
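As described above, according to the present invention, the history of the speaker's past voice recognition successes and failures is examined, and for a speaker whose history contains at least a certain number of consecutive voice recognition successes, the response by voice recognition is activated, which makes it possible to improve the recognition rate of voice recognition.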
In addition, the response format and response method can be changed for each speaker.
[0019]
Further, when searching for the speaker's history, using search means that take the calling ID (calling telephone number) sent over the telephone line as the key identifying the speaker makes it possible to improve operability for users of voice recognition.
[0020]
DETAILED DESCRIPTION OF THE INVENTION
FIG. 3 shows the configuration of the response device of the present invention.
The response device shown in the figure includes a line interface unit 1 connected to a communication network, an incoming call detection unit 2 that detects an incoming call from the communication network, a calling ID detection unit 3 that detects the calling ID sent from the communication network, a call unit 4 for conversing with the user, a response unit 5 having a plurality of response formats with which it can answer an incoming call, a control unit 6 that controls the device, and a per-calling-ID response format history storage unit 11 that stores the response history of the response unit 5 by calling ID and by response format.
[0021]
The per-calling-ID response format history storage unit 11 stores, as the response history, the calling ID, the past voice recognition success/failure results for each caller, the number of responses, and the response format used at each response.
At the time of an incoming call, the control unit 6 searches the per-calling-ID response format history storage unit 11 for the response history based on the calling ID detected by the calling ID detection unit 3, and from the retrieved response history specifies the format in which the response unit 5 is to respond.
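One way to picture the contents of storage unit 11 is a record per calling ID holding the fields listed above. The sketch below is a hypothetical in-memory stand-in: the class names, field names, and the `record` method are assumptions made for illustration, and the patent does not specify a storage layout.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class ResponseFormat(Enum):
    """The five response formats listed in the text."""
    VOICE_RECOGNITION = "voice recognition + voice guidance"
    PB_RECOGNITION = "push-button dial signal recognition + voice guidance"
    CALL_RECORDING = "voice guidance + call recording"
    GUIDANCE_ONLY = "voice guidance"
    TRANSFER = "transfer and respond at the transfer destination"


@dataclass
class ResponseHistory:
    """One record per calling ID (field names are hypothetical)."""
    calling_id: str
    recognition_results: List[bool] = field(default_factory=list)  # True = success
    response_count: int = 0
    last_format: Optional[ResponseFormat] = None


class HistoryStore:
    """In-memory stand-in for storage unit 11, keyed by calling ID."""

    def __init__(self) -> None:
        self._records: Dict[str, ResponseHistory] = {}

    def lookup(self, calling_id: str) -> Optional[ResponseHistory]:
        """Return the record for this calling ID, or None if unregistered."""
        return self._records.get(calling_id)

    def record(self, calling_id: str, fmt: ResponseFormat,
               recognition_ok: Optional[bool] = None) -> ResponseHistory:
        """Log one response; recognition_ok stays None when recognition was not used."""
        rec = self._records.setdefault(calling_id, ResponseHistory(calling_id))
        rec.response_count += 1
        rec.last_format = fmt
        if recognition_ok is not None:
            rec.recognition_results.append(recognition_ok)
        return rec
```

Keying the store on the calling ID means the caller never has to enter an identifier, which is the operability gain the text describes.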
[0022]
Next, the operation in the above configuration will be described.
FIG. 4 is a sequence chart of the operation of the response device according to the present invention.
Step 101) The incoming call detection unit 2 detects an incoming call from the communication network via the line interface unit 1, and notifies the control unit 6 of the incoming call.
Step 102) The calling ID detection unit 3 detects the calling ID sent from the communication network and notifies the control unit 6 of it.
[0023]
Step 103) The control unit 6 searches the per-calling-ID response format history storage unit 11 based on the calling ID acquired from the calling ID detection unit 3, and acquires the response history corresponding to that calling ID.
Step 104) The control unit 6 passes the acquired response history to the response unit 5, and the response unit 5 determines a response format corresponding to the response history in the plurality of response formats 1 to n included in the response unit 5.
[0024]
Step 105) The response unit 5 transfers the determined response format to the call unit 4, and responds to the user from the call unit 4 via the communication network.
This makes it possible to vary the response format for each caller, based on the per-calling-ID response formats stored in the per-calling-ID response format history storage unit 11.
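Steps 101 to 105 amount to a lookup followed by a policy decision. The following sketch assumes the call and its calling ID have already been detected (steps 101 and 102) and represents the history store as a plain dictionary; `handle_incoming_call`, `example_policy`, and the `"format"` key are hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, Optional

History = Dict[str, object]


def handle_incoming_call(
    calling_id: Optional[str],
    history_store: Dict[str, History],
    decide_format: Callable[[Optional[History]], str],
) -> str:
    """Return the response format to use for one incoming call."""
    # Step 103: search the stored history using the calling ID as the key.
    history = history_store.get(calling_id) if calling_id is not None else None
    # Step 104: the response unit decides the format from the history.
    response_format = decide_format(history)
    # Step 105: the chosen format is handed on to the call unit.
    return response_format


def example_policy(history: Optional[History]) -> str:
    """Assumed policy: fall back to plain voice guidance without a history."""
    if history is None:
        return "voice guidance"
    return str(history.get("format", "voice guidance"))
```

The policy is passed in as a function so that the lookup stays separate from the rule that maps a history to a format.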
[0025]
[Examples]
Embodiments of the present invention will be described below with reference to the drawings.
[First embodiment]
FIG. 5 shows the configuration of the response device according to the first embodiment of the present invention. In the figure, the same parts as those in FIG. 3 are denoted by the same reference numerals, and their description is omitted.
[0026]
The response device shown in FIG. 5 adds a transfer destination telephone directory storage unit 15, a transmission unit 16, a speech path switch 17, and a response detection unit 18 to the configuration of FIG. 3, and adds a push-button dial signal / dial pulse signal (PB/DP) recognition unit 8, a voice recognition unit 9, a call recording unit 10, and a voice guidance unit 19 to the response unit 5.
The control unit 6 refers to the per-calling-ID response format history storage unit 11 based on the calling ID detected by the calling ID detection unit 3, acquires the response format, and, according to that response format, instructs the relevant functions of the response unit 5 to output the response in that format to the call unit 4.
[0027]
In this embodiment, the per-calling-ID response format history storage unit 11 is assumed to store, for each calling ID, one of the following response formats: a response format combining voice guidance and voice recognition; a response format combining voice guidance and PB/DP recognition; a response format combining voice guidance and call recording; a voice guidance response format; or a response format in which the incoming call is transferred to a transfer destination and answered from there.
[0028]
Likewise, the response format designated for the response unit 5 is one of: a response format combining voice guidance and voice recognition; a response format combining voice guidance and PB/DP recognition; a response format combining voice guidance and call recording; a voice guidance response format; or a response format in which the incoming call is transferred to a transfer destination and answered from there.
[0029]
The voice recognition unit 9 of the response unit 5 performs voice recognition on the voice data acquired from the call unit 4, based on the response format, using techniques such as those disclosed in JP-A-10-190842 and JP-A-7-230295.
When transferring an incoming call, the control unit 6 reads the transfer destination from the transfer destination telephone directory storage unit 15 and places a call to it from the transmission unit 16. When the transfer is answered, the response detection unit 18 detects the response and notifies the control unit 6. The control unit 6 then controls the speech path switch 17 to form a speech path between the incoming call and the transfer destination, and transfers the incoming call.
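The transfer sequence just described (directory lookup, outbound call, answer detection, path connection) can be sketched with the hardware units replaced by injected callables. Everything named here is hypothetical, and returning `None` for an unknown or unanswered destination is an assumption, since the text does not say what happens in those cases.

```python
from typing import Callable, Dict, Optional


def transfer_call(
    destination_key: str,
    phone_book: Dict[str, str],
    dial: Callable[[str], bool],
    connect: Callable[[str], None],
) -> Optional[str]:
    """Transfer an incoming call; return the connected number, or None."""
    # Read the transfer destination (stands in for storage unit 15).
    number = phone_book.get(destination_key)
    if number is None:
        return None
    # Place the outbound call (transmission unit 16); the boolean stands in
    # for the answer reported by the response detection unit 18.
    if not dial(number):
        return None
    # Form the speech path between the caller and the destination
    # (speech path switch 17).
    connect(number)
    return number
```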
[0030]
FIG. 6 is a sequence chart showing the operation of the response device according to the first exemplary embodiment of the present invention.
Step 201) The incoming call detection unit 2 detects an incoming call from the communication network via the line interface unit 1, and notifies the control unit 6 of the incoming call.
Step 202) The calling ID detection unit 3 detects the calling ID sent from the communication network, and notifies the control unit 6 of it.
[0031]
Step 203) The control unit 6 searches the per-calling-ID response format history storage unit 11 based on the calling ID acquired from the calling ID detection unit 3, and acquires the response history corresponding to that calling ID.
Step 204) The control unit 6 refers to the response format that corresponds to the calling ID in the response history and, according to that response format, selects from the functions of the response unit 5 (any one of the PB/DP recognition unit 8, the voice recognition unit 9, the call recording unit 10, and the voice guidance unit 19, or a combination of them) by the method described above.
[0032]
Step 205) If the call is not an incoming call, the call unit 4 responds to the user via the communication network based on the response format selected in Step 204.
Step 206) In the case of an incoming call, the control unit 6 determines a response format to the transfer destination.
Step 207) Further, the control unit 6 extracts the transfer destination ID from the incoming call, and reads the transfer destination from the transfer destination telephone directory storage unit 15 based on the transfer destination ID.
[0033]
Step 208) The transmission unit 16 makes a response corresponding to the response format determined in step 206 to the transfer destination that was read out.
Step 209) When the response detection unit 18 detects a response from the transfer destination, the control unit 6 controls the speech path switch 17 so that the incoming call can be transferred.
Step 210) A communication path is formed between the incoming call and the transfer destination, and the incoming call is transferred from the transmission unit 16 to the transfer destination.
[0034]
[Second Embodiment]
This embodiment describes the processing performed when silence continues for a certain time or longer, and the processing that extracts from the response history the number of responses in which voice recognition succeeded and responds by voice recognition when the calling ID has a predetermined number of consecutive successes.
FIG. 7 shows the configuration of the response device according to the second embodiment of the present invention. The same components as those in FIG. 5 are denoted by the same reference numerals, and their description is omitted.
[0035]
The configuration shown in the figure is a configuration in which a silent time detection unit 14 is added to the configuration of FIG. 5. The silent time detecting unit 14 monitors the incoming call that has been answered, and determines that there is no sound if the silent time is longer than a certain time.
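The text does not say how the silent time detection unit 14 measures silence. As one possible reading, the sketch below assumes the audio arrives as per-frame energy values and treats a frame as silent when its energy falls below a threshold; the function name, parameters, and the energy-based criterion are all assumptions.

```python
from typing import Iterable


def is_silent(
    frame_energies: Iterable[float],
    frame_seconds: float,
    energy_threshold: float,
    max_silence_seconds: float,
) -> bool:
    """Return True once a continuous silent run exceeds max_silence_seconds."""
    silent_run = 0.0
    for energy in frame_energies:
        if energy < energy_threshold:
            silent_run += frame_seconds
            # "Longer than a certain time" is read as a strict comparison.
            if silent_run > max_silence_seconds:
                return True
        else:
            # Any sound resets the running silence counter.
            silent_run = 0.0
    return False
```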
FIGS. 8 and 9 are flowcharts showing the operation of the response device according to the second embodiment of the present invention.
[0036]
First, the response device responds unattended (step 300). Here, when the calling ID detection unit 3 cannot detect a calling ID, voice guidance is played from the response unit 5 through the call unit 4 (step 301). When a calling ID is detected, the control unit 6 reads the response history and the number of responses for that calling ID from the per-calling-ID response format history storage unit 11; if this is the first response, the call is passed to the response unit 5 and a human operator answers from the call unit 4. At this time, the operator enters the user's data into the per-calling-ID response format history storage unit 11 and stores the content of the conversation as history (step 302). If the number of responses is two or more, the control unit 6 reads the response history for the calling ID from the per-calling-ID response format history storage unit 11 (step 303).
[0037]
Here, when voice recognition has succeeded two or more times in succession and voice recognition is selected, the voice recognition unit 9 of the response unit 5 responds by voice recognition (step 305). When PB/DP recognition is selected in step 304, the PB/DP recognition unit 8 of the response unit 5 responds (step 306). If the voice recognition in step 305 succeeds and the call is not silent, the response history is updated (step 307), and the service is then provided (step 308). Success or failure of voice recognition is judged as follows: the user speaks during the response, and after the speech is recognized, voice guidance prompts the user to confirm the result by a means such as PB input.
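The confirmation step that decides success or failure could look like the sketch below. The prompt wording and the choice of key "1" for "correct" are assumptions; the text only says that the user confirms by PB input or similar after hearing voice guidance.

```python
from typing import Callable


def confirm_recognition(
    recognized: str,
    play_guidance: Callable[[str], None],
    read_key: Callable[[], str],
) -> bool:
    """Read the recognized result back and let the caller confirm it by key press."""
    play_guidance(
        f"You said: {recognized}. Press 1 if this is correct, or 2 if it is not."
    )
    # Only an explicit "1" counts as a voice recognition success.
    return read_key() == "1"
```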
[0038]
If voice recognition fails or silence is determined, the PB / DP recognition unit 8 of the response unit 5 responds (step 306).
If the PB / DP recognition in step 306 is successful, the response history is updated (step 309). The response history is updated by writing the response format / order contents in the database. Thereafter, a service is provided (step 310).
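The history update described here can be pictured as appending one record per call to a table keyed by calling ID. The following is only a sketch under stated assumptions: the class and field names (`ResponseRecord`, `HistoryStore`, `order_contents`, and so on) are hypothetical, and an in-memory dictionary stands in for the database of the response format history storage unit 11.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ResponseRecord:
    response_format: str                   # e.g. "voice_recognition" or "pb_dp"
    recognition_succeeded: Optional[bool]  # None when voice recognition was not attempted
    order_contents: str = ""               # free-form contents of the response


@dataclass
class ResponseHistory:
    records: List[ResponseRecord] = field(default_factory=list)

    @property
    def response_count(self) -> int:
        # The number of responses is simply the number of stored records.
        return len(self.records)


class HistoryStore:
    """In-memory stand-in for the per-calling-ID history storage."""

    def __init__(self) -> None:
        self._by_calling_id: Dict[str, ResponseHistory] = {}

    def lookup(self, calling_id: str) -> ResponseHistory:
        # An unknown calling ID yields an empty history (response count 0).
        return self._by_calling_id.setdefault(calling_id, ResponseHistory())

    def update(self, calling_id: str, record: ResponseRecord) -> None:
        self.lookup(calling_id).records.append(record)
```

For example, `store.update("0312345678", ResponseRecord("pb_dp", None, "order A"))` would record one PB / DP response for that calling ID; the number shown is made up for illustration.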
[0039]
If the PB / DP recognition by the PB / DP recognition unit 8 fails, or if the silence determination finds that the silent time is longer than a certain time, voice guidance asks whether the user wishes to be transferred to an operator or to have the call recorded (step 311). If the user wishes to be transferred, the user performs a transfer operation, the response history in the response format history storage unit 11 for each calling ID is updated (step 314), and the call is transferred for a manned response (step 315). If the user does not wish to be transferred, the response history in the response format history storage unit 11 for each calling ID is updated (step 312) and the call is recorded (step 313).
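The fallback order in steps 304 to 315 (voice recognition, then PB / DP recognition, then transfer or call recording) can be sketched as a small function. This is an illustration only: the function name and the callback parameters are hypothetical, each callback stands in for a whole interaction with the caller, and the history updates of steps 307, 309, 312, and 314 are omitted.

```python
from typing import Callable, List


def second_embodiment_flow(
    selects_voice_recognition: bool,
    voice_ok: Callable[[], bool],
    pb_dp_ok: Callable[[], bool],
    wants_transfer: Callable[[], bool],
) -> List[str]:
    """Return the ordered list of response formats tried for one call.

    Covers only callers with two or more past responses (step 303 onward).
    """
    tried: List[str] = []
    if selects_voice_recognition:
        tried.append("voice_recognition")     # step 305
        if voice_ok():                        # recognized and not silent
            return tried                      # steps 307-308
    tried.append("pb_dp_recognition")         # step 306
    if pb_dp_ok():
        return tried                          # steps 309-310
    # Step 311: guidance asks whether to transfer or to record the call.
    if wants_transfer():
        tried.append("transfer_to_operator")  # steps 314-315
    else:
        tried.append("call_recording")        # steps 312-313
    return tried
```

A caller whose voice recognition and PB / DP recognition both fail and who asks for a transfer would pass through all three formats in that order.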
[0040]
In addition, for users whose history shows a certain number of consecutive successful voice recognition responses, a voice recognition response is made from the beginning. A user who was answered last time with the combination of PB / DP recognition and voice guidance is asked at the beginning of the response whether he or she wants a response by voice recognition, and if so, a response by voice recognition is performed.
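The choice of the opening response format described in this paragraph can be sketched as follows. The code makes several assumptions that the patent does not settle: the consecutive successes are counted from the most recent response backward, any response that is not a voice recognition success breaks the run, and PB / DP recognition is the default when neither rule applies. All names are hypothetical.

```python
from typing import List, Optional


def trailing_successes(results: List[Optional[bool]]) -> int:
    """Count consecutive voice recognition successes at the end of the history.

    Each entry is True (success), False (failure), or None (voice
    recognition not attempted in that response).
    """
    count = 0
    for result in reversed(results):
        if result is True:
            count += 1
        else:
            break
    return count


def initial_format(results: List[Optional[bool]], last_format: str,
                   required: int = 2) -> str:
    """Pick how to open the response for a caller with a known history."""
    if trailing_successes(results) >= required:
        return "voice_recognition"                # start with voice recognition
    if last_format == "pb_dp":
        return "ask_if_voice_recognition_wanted"  # offer voice recognition first
    return "pb_dp"                                # assumed default
```

The `required` parameter defaults to two, matching the embodiment, and can be raised to any predetermined count.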
[0041]
In the above operation, the criterion that the response using voice recognition has succeeded two or more times in succession is not limited to two; the determination may instead be based on whether the history shows that voice recognition succeeded consecutively a predetermined number of times or more in the past.
Moreover, although the above embodiments have been described based on the configurations of FIGS. 3, 5, and 7, the operations can also be constructed as a program and stored in a disk device connected to the computer used as the response device, or in a portable storage medium such as a floppy disk or a CD-ROM, and installed when implementing the present invention, so that the present invention can be realized easily.
[0042]
The present invention is not limited to the above-described embodiments, and various modifications and applications can be made within the scope of the claims.
[0043]
[Effect of the invention]
As described above, according to the present invention, the caller's past voice recognition success / failure history is examined and the response is started in accordance with it, so that the recognition rate of voice recognition can be improved.
[0044]
Further, unlike the prior art, the present invention does not change the recognition dictionary but changes the response format and response method for each caller. When a speaker is determined to be unsuited to voice recognition, voice recognition is not attempted in subsequent processing, which shortens the operation time.
Also, by detecting the caller identification information sent from the communication network, selecting the response format based on that information, and storing the response history for each piece of caller identification information, a response format suited to the caller can be provided.
[0045]
In addition, when searching for a caller's response history, the calling ID sent via the telephone line is used as the caller-specific search key, without requiring the caller to input an ID as in the prior art. As a result, operability for users of voice recognition can be improved.
Also, when answering, one of the following response formats is used: a combination of voice recognition and voice guidance, a combination of push button dial signal recognition and voice guidance, a combination of voice guidance and call recording, voice guidance, or transferring the incoming call and responding at the transfer destination. This reduces the burden on the caller.
[0046]
In addition, a response format can be specified for an incoming call without caller identification information, and for an incoming call with caller identification information the accumulated response history is obtained so that the response format can be specified from it. It is also possible to change the response format when silence is detected during the response or by an operation of the caller.
[Brief description of the drawings]
FIG. 1 is a diagram for explaining the principle of the present invention.
FIG. 2 is a principle configuration diagram of the present invention.
FIG. 3 is a configuration diagram of a response device according to the present invention.
FIG. 4 is a sequence chart of the operation of the response device according to the present invention.
FIG. 5 is a configuration diagram of a response device according to the first exemplary embodiment of the present invention.
FIG. 6 is a sequence chart showing the operation of the response device according to the first exemplary embodiment of the present invention.
FIG. 7 is a configuration diagram of a response device according to a second embodiment of the present invention.
FIG. 8 is a flowchart (part 1) showing the operation of the response device according to the second exemplary embodiment of the present invention.
FIG. 9 is a flowchart (part 2) illustrating the operation of the response device according to the second exemplary embodiment of the present invention.
FIG. 10 is a configuration diagram of a conventional response device.
[Explanation of symbols]
1 Line interface unit
2 Incoming call detection means, incoming call detection unit
3 Calling ID detection unit
4 Call unit
5 Response means, response unit
6 History search means, control unit
8 PB / DP recognition unit
9 Voice recognition unit
10 Call recording unit
11 Response format history storage means for each calling ID, response format history storage unit for each calling ID
15 Forwarding destination telephone directory storage unit
16 Transmission unit
17 Communication path switch
18 Response detection unit

Claims

A voice response method for responding to a caller by voice while connected to a communication network, comprising:
holding, for each caller, a response history including past voice recognition success / failure results, the number of responses, and the response format at the time of each response;
when there is an incoming call from a caller, searching for the response history corresponding to that caller; and
performing a response by voice recognition when the response history found for the caller contains a history of voice recognition succeeding consecutively a predetermined number of times or more.

The voice response method according to claim 1, wherein
the response history is held for each calling ID,
when there is an incoming call from a caller, the calling ID of the caller is acquired,
the response history is searched using the calling ID as the caller-specific key, and
a response format is determined based on the retrieved response history.

The voice response method wherein, as the response format, any one of the following is used: a combination of voice recognition and voice guidance, a combination of push button dial signal recognition and voice guidance, a combination of voice guidance and call recording, voice guidance, or transferring the incoming call and responding at the transfer destination.

The voice response method according to claim 1, wherein
if the calling ID is not registered in the response history, the caller is made to specify the response format, and
for an incoming call having a calling ID, response history information corresponding to the calling ID is acquired from the response history, and the response format is determined based on the history of the number of responses.

The voice response method according to claim 1, wherein any one or more of the following controls are performed: changing the response format when there is silence for a predetermined time during the response, changing the response format according to the success / failure of voice recognition, and changing the response format by an operation of the caller.

A voice response device connected to a communication network for responding to a caller by voice, comprising:
response format history storage means for holding, for each caller, a response history including past voice recognition success / failure results, the number of responses, and the response format at the time of each response;
incoming call detection means for detecting an incoming call from a caller;
history search means for searching the response format history storage means for the response history corresponding to the caller when the incoming call detection means detects an incoming call; and
response means for performing a response by voice recognition when the response history found by the history search means for the caller contains a history of voice recognition succeeding consecutively a predetermined number of times or more.

The voice response device according to claim 6, wherein
the response format history storage means holds the response history for each calling ID,
the history search means, when there is an incoming call from a caller, acquires the calling ID of the caller and searches the response history using the calling ID as the caller-specific key, and
the response means includes means for selecting a response format based on the response history retrieved by the history search means.

The voice response device wherein, as the response format, any one of the following is used: a combination of voice recognition and voice guidance, a combination of push button dial signal recognition and voice guidance, a combination of voice guidance and call recording, voice guidance, or transferring the incoming call and responding at the transfer destination.

The voice response device according to claim 6, further comprising response format designation instruction means for causing the caller to specify a response format when the calling ID is not registered in the response history,
wherein the response means includes means for determining, for an incoming call having a calling ID, a response format based on the history of the number of responses retrieved by the history search means using the calling ID.

The voice response device according to claim 6, further comprising silent time detection means for detecting silence for a predetermined time when responding to the caller,
wherein the response means executes any one or more of the following: changing the response format when the silent time detection means detects silence for a predetermined time, changing the response format according to the success / failure of voice recognition, and changing the response format by an operation of the caller.

A storage medium storing a voice response program installed in a voice response device that is connected to a communication network and responds to a caller by voice, the program comprising:
a response format history storage process for accumulating in storage means, for each caller, a response history including past voice recognition success / failure results, the number of responses, and the response format at the time of each response;
an incoming call detection process for detecting an incoming call from a caller;
a history search process for searching the storage means for the response history corresponding to the caller when the incoming call detection process detects an incoming call; and
a response process for performing a response by voice recognition when the response history found by the history search process for the caller contains a history of voice recognition succeeding consecutively a certain number of times or more.

The storage medium storing a voice response program according to claim 11, wherein
the response format history storage process stores the response history in the storage means for each calling ID,
the history search process, when an incoming call is received from a caller, acquires the calling ID of the caller and searches the response history using the calling ID as the caller-specific key, and
the response process includes means for selecting a response format based on the response history retrieved by the history search process.

The storage medium storing a voice response program wherein, as the response format, any one of the following is used: a combination of voice recognition and voice guidance, a combination of push button dial signal recognition and voice guidance, a combination of voice guidance and call recording, voice guidance, or transferring the incoming call and responding at the transfer destination.

The storage medium storing a voice response program according to claim 11, further comprising a response format designation instruction process for causing the caller to specify a response format when the calling ID is not registered as a response history in the storage means,
wherein the response process includes a process of determining, for an incoming call having a calling ID, a response format based on the history of the number of responses retrieved by the history search process.

The storage medium storing a voice response program according to claim 11, further comprising a silent time detection process for detecting silence for a predetermined time when responding to the caller,
wherein the response process executes any one or more of the following: changing the response format when the silent time detection process detects silence for a predetermined time, changing the response format according to the success / failure of voice recognition, and changing the response format by an operation of the caller.