JP3964724B2

JP3964724B2 - Voice input device and method, and voice input program

Info

Publication number: JP3964724B2
Application number: JP2002119409A
Authority: JP
Inventors: 勝則松下; 英治河野; 宣男石嶋
Original assignee: Toshiba TEC Corp
Current assignee: Toshiba TEC Corp
Priority date: 2002-04-22
Filing date: 2002-04-22
Publication date: 2007-08-22
Anticipated expiration: 2022-04-22
Also published as: JP2003316388A

Abstract

PROBLEM TO BE SOLVED: To realize correct input processing of ordered items without being bound by the order of voice input.
SOLUTION: Phrases registered beforehand are extracted one by one from a sentence input by voice. For each extracted phrase, the item to which the meaning of the phrase belongs is determined. It is then judged whether the item combination pattern of the extracted phrases satisfies a prescribed confirmation condition. When the pattern is judged to satisfy the confirmation condition, each phrase is confirmed as voice input data.
COPYRIGHT: (C)2004,JPO

Description

【０００１】
【発明の属する技術分野】
本発明は、注文内容を受付けて処理する情報処理装置、例えばガソリンスタンド給油システムの伝票発行装置、航空券，乗車券などの券発行装置、レストラン，ファーストフード店でのオーダ登録装置等に対する注文内容の入力デバイスとして用いられる音声入力装置及びその方法並びに音声入力プログラムに関する。
【０００２】
【従来の技術】
例えばガソリンスタンドにおいて、店員は、客が来店すると、先ず給油するガソリンの種類（レギュラーガソリン，ハイオクガソリン，軽油等）と、給油量（量単位又は金額単位）と、決済方法（現金払い，クレジット払い等）を客に尋ねる。これに対して、例えば客が「レギュラーを満タン、現金で。」と答えた場合には、店員は、給油するガソリンの種類が「レギュラーガソリン」であること、給油量は「満タン」であること、決済方法は「現金払い」であることをそれぞれ伝票発行装置に入力した後、給油を開始する。こうすることにより、給油が終了すると、伝票発行装置から自動的に実給油量に応じた決済伝票が印字発行されるので、店員は、決済伝票にしたがって客から給油代金の支払いを受けることになる。
【０００３】
このようなガソリンスタンドの給油システムにおいて、従来、客の注文内容を伝票発行装置に入力する方法としては、キーボードやタッチパネル等の入力デバイスを用いた方法が一般的であった。しかし、この方法では、店員が客から聞いた内容を手入力しなければならなかったので、二度手間であった上、入力ミスも発生し易かった。また、客は店員が近づいて来るまで注文内容を告げるのを待たなければならず、また、注文を聞いた店員は、伝票発行装置が置かれている場所に行って客の注文内容を入力した後、車が停車している場所に戻って給油を始めなければならず、作業効率が悪かった。
【０００４】
そこで近年、音声入力装置を入力デバイスとして伝票発行装置に接続し、給油に来た客が車内から直接音声で伝票発行装置に注文内容を入力することが考えられていた。こうすることにより、店員の手間を軽減でき、入力ミスもなくなる上、店員が車と伝票発行装置との間を往復して客の注文内容を手入力する無駄を省略できるので、作業効率の向上が期待できる。
【０００５】
ところが従来の音声入力装置は、マイクロホンを通じて入力された音声データから認識された語句を１つずつ入力順に取り込み、その順番でコンピュータ等の情報処理装置に転送する形式であった。一方、注文内容を受付けて処理する伝票発行装置等の情報処理装置は、注文内容項目の受付け順序が予め決まっており、転送されてきた語句をその順番に該当項目の注文内容を示すデータとして処理することになる。
【０００６】
このため、例えば、前記給油システムの伝票発行装置が、最初にガソリンの種類を受付け、次に給油量を受付け、最後に決済方法を受付けるようにプログラムされていたとした場合、伝票発行装置は、音声入力装置から最初に転送されてきた語句をガソリンの種類を示すデータとして処理し、次に転送されてきた語句を給油量を示すデータとして処理し、最後に転送されてきたデータを決済方法を示すデータとして処理することになる。したがって、音声入力を行なう者は、先ずガソリンの種類を注文し、次に給油量を注文し、最後に決済方法を注文するというように、予め決められた順番に注文内容を発声する必要があった。
【０００７】
【発明が解決しようとする課題】
このように、注文内容の入力デバイスとして音声入力装置を用いた場合、従来は注文内容を入力する順番が固定化されていたので、例えばガソリンスタンドにおいて、客自身が音声入力装置を利用して給油の注文内容を伝票発行装置に入力するようにした給油システムを構築しても、不特定多数の客が利用するガソリンスタンドで全ての客に注文内容を告げる順番を徹底させるのは困難であり、伝票発行装置が注文内容を正しく認識できず、正常に機能しないおそれが高かった。
【０００８】
本発明はこのような事情に基づいてなされたもので、その目的とするところは、音声入力する順番に捕われずに注文内容を正しく入力処理できる音声入力装置及びその方法並びに音声入力プログラムを提供しようとするものである。
【０００９】
【課題を解決するための手段】
本発明の音声入力装置は、音声入力された注文内容から予め登録されている語句を一つずつ抽出する語句抽出手段と、この語句抽出手段により抽出された語句毎にその語句の意味が含まれる項目を判定する語句項目判定手段と、この語句項目判定手段により項目が判定された各語句を項目別に記憶する語句記憶手段と、この語句記憶手段により記憶された各語句の項目組合せパターンが注文品の種類，量および決済方法の項目からなる注文確定条件を満足するパターンか否かを判断する確定条件判断手段と、この確定条件判断手段により各語句の項目組合せパターンが注文確定条件を満足するパターンであると判断されると、語句記憶手段により記憶された各語句を音声入力注文データとして確定させる注文確定手段と、この注文確定手段により確定された音声入力注文データを、決済伝票を発行する伝票発行装置に転送する手段と、を備えたものである。
【００１０】
したがって、音声入力された注文内容から抽出された各語句の意味がそれぞれ含まれる項目の組合せパターンが注文品の種類，量および決済方法の項目からなる注文確定条件を満足するパターンと一致したときには、その各語句が音声入力注文データとして確定されるので、注文内容を構成する語句の順番が異なっていてもよい。
【００１１】
【発明の実施の形態】
以下、本発明の一実施の形態を図面を用いて説明する。
なお、この実施の形態は、ガソリンスタンド給油システムの伝票発行装置に対する注文内容入力デバイスとして用いられる音声入力装置に、本発明を適用した場合である。
【００１２】
図１は本実施の形態における音声入力装置１の構成を示すブロック図である。この音声入力装置１は、音声入力部としてのマイクロホン１１と、音声出力部としてのスピーカ１２と、音声エンジン部１３と、音声入力確定部１４と、音声入力スイッチ１５とから構成されている。音声入力確定部１４には、注文内容を受付けて処理する情報処理装置、つまりこの実施の形態では、ガソリンスタンド給油システムの伝票発行装置２が接続されている。
【００１３】
音声入力スイッチ１５は、ユーザがマイクロホン１１から音声入力する際にオンするスイッチで、その近傍にマイクロホン１１とスピーカ１２が設けられている。この実施の形態では、ガソリンスタンドの給油場所に停車した客が車内から操作できる位置に音声入力スイッチ１５とマイクロホン１１とスピーカ１２が設けられている。
【００１４】
音声エンジン部１３は、マイクロホン１１から入力されたアナログ音声信号をディジタル音声信号に変換するＡ／Ｄ（アナログ／ディジタル）コンバータ１６、該ディジタル音声信号から１音節ずつ音声を認識して音声データを作成する音声認識部１７、音声入力確定部１４から供給されるガイダンスデータからディジタル音声信号を生成する音声生成部１８、該ディジタル音声データをアナログ音声データに変換してスピーカ１２に出力するＤ／Ａ（ディジタル／アナログ）コンバータ１９によって構成されている。
【００１５】
音声入力確定部１４は、音声入力スイッチ１５がオンされている期間中、前記音声認識部１７にて生成される音声データを取込み、この音声データから伝票発行装置２で処理すべき音声入力データを確定して伝票発行装置２に転送する機能を有するものである。
【００１６】
なお、この実施の形態において、伝票発行装置２は、ガソリンの種類と給油量と決済方法の注文をこの順番に受付け、これらの注文内容が入力された後、給油が終了すると、実給油量に応じた決済伝票を印字発行するように構成されている。したがって、音声入力確定部１４は、音声データからガソリンの種類を表わす語句と、給油量を表わす語句と、決済方法を表わす語句とがそれぞれ得られると、これらの語句から音声入力データを確定して、伝票発行装置２に転送する。これに対し、いずれかの注文内容を表わす語句が不足しているときには、その注文内容を表わす語句の音声再入力を促すガイダンスデータを音声生成部１８に出力して、そのガイダンスを音声出力させる。また、語句の中に、例えば物の総称を表わすために注文内容を確定できない語句が含まれるときには、その語句を確定させる音声の再入力を促すガイダンスデータを音声生成部１８に出力して、そのガイダンスを音声出力させる。
【００１７】
以下、音声入力確定部１４の具体的構成について説明する。
【００１８】
図２は前記音声入力確定部１４の構成を示すブロック図である。この音声入力装置１は、マイクロプロセッサとしてのＣＰＵ（Central Processing Unit）１４１、このＣＰＵ１４１が実行するプログラム等の固定的データが予め格納されたＲＯＭ（Read Only Memory）１４２、このＣＰＵ１４１がデータの一時格納領域として使用する各種のメモリエリアが形成されるＲＡＭ（Random Access Memory）１４３、前記音声エンジン部１３の音声認識部１７から音声データが入力される一方、音声生成部１８に対してガイダンスデータが出力されるＩ／Ｏ（Input／Output）ポート１４４、種々のデータファイル等を保存するＨＤＤ（Hard Disk Drive）装置１４５、前記伝票発行装置２に接続され、音声入力データを転送する通信インタフェース１４６、前記音声入力スイッチ１５からの信号が入力される入力ポート１４７等で構成され、ＣＰＵ１４１と、ＲＯＭ１４２，ＲＡＭ１４３，Ｉ／Ｏポート１４４，ＨＤＤ装置１４５，通信インタフェース１４６，入力ポート１４７等とは、システムバス１４８で接続されている。ここに、音声入力確定部１４は、ＣＰＵ１４１，ＲＯＭ１４２及びＲＡＭ１４３からなるマイクロコンピュータを主体に構成されている。
【００１９】
前記ＨＤＤ装置１４５には、特に図３に示すように、語彙ファイル３１、辞書ファイル３２、ガイダンスファイル３３及び確定条件テーブル３４が保存されている。
【００２０】
前記語彙ファイル３１は、図４（ａ）に示すように、所定の語句を発生したときの音声の特徴量を数値化した語句音声データと、その語句に対して予め設定された語句毎に異なる語句コードとからなるレコードを記憶するもので、特に、伝票発行装置２に注文内容を音声で入力するときに使用されると考えられる多数の語句が選出され、各語句のレコードがそれぞれ記憶されている。例えば、ガソリンの種類を注文するときに使用される語句として「レギュラー」，「レギュラーガソリン」，「ハイオク」，「ハイオクガソリン」，「軽油」，「ガソリン」等が選出され、これらの語句の語句音声データと語句コードとが語彙ファイル３１に記憶されている。また、給油量を注文するときに使用される語句として「満タン」，「１０リットル」，「２０リットル」，「１０００円」，「２０００円」等が選出され、これらの語句の語句音声データと語句コードとが語彙ファイル３１に記憶されている。また、決済方法を注文するときに使用される語句として「現金」，「現金払い」，「カード」，「クレジット」，「デビット」，「クレジットカード」，「デビットカード」等が選出され、これらの語句の語句音声データと語句コードとが語彙ファイル３１に記憶されている。
【００２１】
前記辞書ファイル３２は、図４（ｂ）に示すように、語句コード，項目コード，不確定フラグ，ガイダンス番号及び複数のリンク語句コードからなるレコードを記憶するもので、前記語彙ファイル３１に記憶されている各語句の語句コードにそれぞれリンクする複数のレコードが記憶されている。
【００２２】
この辞書ファイル３２のレコード項目において、項目コードは、各種語句の意味別に設定された項目を識別するコードであり、この実施の形態では、ガソリンの種類を表わす意味の項目「アイテム」（項目コード＝１）と、給油量を表わす意味の項目「数量」（項目コード＝２）と、給油量相当の金額を表わす意味の項目「金額」（項目コード＝３）と、決済方法を表わす意味の項目「決済方法」（項目コード＝４）とを設定している。
【００２３】
また、不確定フラグは、対応する語句が注文内容を確定できる語句か否かを識別するフラグであって、確定できる語句に対しては“０”がセットされ、物の総称などのように不確定な語句に対しては“１”がセットされる。そして、不確定語句に対してのみ、ガイダンス番号とリンク語句コードとが設定される。ガイダンス番号は、その不確定語句を確定させる音声の再入力を促すガイダンスデータに対して設定された番号である。リンク語句コードは、その不確定語句が分類された項目に含まれる語句で注文内容を確定できる一部又は全ての語句の語句コードである。
【００２４】
例えば、ガソリンの種類を注文する際に音声入力される語句として、「レギュラー」，「ハイオク」，「ガソリン」等があり、前の２つはガソリンの種類を特定できるが、「ガソリン」だけではレギュラーガソリンなのかハイオクガソリンなのかを特定できない。そこで、語句「ガソリン」の語句コードに対して不確定フラグを“１”とし、ガイダンス番号としてガソリンの種類を確定させる音声再入力を促すガイダンスデータ（例えば「ガソリンは……ですか……ですか」）の番号とし、リンク語句コードとして語句「レギュラー」及び「ハイオク」の語句コードをそれぞれ設定する。同様に、決済方法を注文する際に音声入力される語句として、「クレジット」，「デビット」，「カード」等があり、前の２つは決済方法を特定できるが、「カード」だけではクレジットカード決済なのかデビットカード決済なのかを特定できない。そこで、語句「カード」の語句コードに対して不確定フラグを“１”とし、ガイダンス番号としてカードの種類を確定させる音声再入力を促すガイダンスデータ（例えば「カードは……ですか……ですか」）の番号とし、リンク語句コードとして語句「クレジット」及び「デビット」の語句コードをそれぞれ設定する。
【００２５】
ガイダンスファイル３３は、図４（ｃ）に示すように、ガイダンス番号とガイダンスデータとからなるレコードを記憶するもので、音声入力データを確定するまでの間に音声出力が必要となる各種のガイダンスデータがそれぞれ異なるガイダンス番号とともに記憶されている。
【００２６】
確定条件テーブル３４は、図５に示すように、各種語句の意味別に設定された項目（この実施の形態では「アイテム」，「数量」，「金額」，「決済方法」の４種）の組合せパターンを識別する条件ナンバー別に、当該項目組合せパターンが対応する項目を含むパターンか否かを示すデータ（１：含む、０：含まない）と、当該項目組合せパターンが確定条件を満足するパターンでないとき、確定条件を満足するのに不足している項目の意味を有する語句の音声再入力を促すガイダンスデータのガイダンス番号とを記憶するものである。
【００２７】
つまり、条件ナンバーＰ１で識別される項目組合せパターンは、データ“１”の項目「アイテム」，「数量」及び「決済方法」を含むパターンであり、条件ナンバーＰ２で識別される項目組合せパターンは、データ“１”の項目「アイテム」，「金額」及び「決済方法」を含むパターンであって、いずれも伝票発行装置２に対する音声入力データとしての確定条件を満足するので、ガイダンス番号は未設定（＝０）となっている。
【００２８】
一方、条件ナンバーＰ３〜Ｐ１２でそれぞれ識別される項目組合せパターンは、いずれも伝票発行装置２に対する音声入力データとしての確定条件を満足しないので、それぞれ該当するガイダンスデータのガイダンス番号が設定されている。例えば、条件ナンバーＰ３及びＰ４で識別されるパターンは、いずれも項目「決済方法」の意味を有する語句が未入力であるため、決済方法の音声再入力を促すガイダンスデータ（例えば「決済方法は何ですか」）のガイダンス番号［１００］が設定されている。また、条件ナンバーＰ５で識別されるパターンは、項目「数量」または「金額」の意味を有する語句が未入力であるため、給油量の音声再入力を促すガイダンスデータのガイダンス番号［１０１］が設定されている。以下、同様に、条件ナンバーＰ６及びＰ７で識別されるパターンは、項目「アイテム」の意味を有する語句が未入力であるため、ガソリン種類の音声再入力を促すガイダンスデータのガイダンス番号［１０２］が設定され、条件ナンバーＰ８で識別されるパターンは、項目「数量」または「金額」と「決済方法」の意味をそれぞれ有する語句がいずれも未入力であるため、給油量と決済方法の音声再入力を促すガイダンスデータのガイダンス番号［１０３］が設定され、条件ナンバーＰ９及びＰ１０で識別されるパターンは、項目「アイテム」と「決済方法」の意味をそれぞれ有する語句がいずれも未入力であるため、ガソリン種類と決済方法の音声再入力を促すガイダンスデータのガイダンス番号［１０４］が設定され、条件ナンバーＰ１１で識別されるパターンは、項目「アイテム」と「数量」または「金額」の意味をそれぞれ有する語句がいずれも未入力であるため、ガソリン種類と給油量の音声再入力を促すガイダンスデータのガイダンス番号［１０５］が設定されている。
【００２９】
さて、ＣＰＵ１４１は、ＲＯＭ１４２に格納された音声入力プログラムにより、ＲＡＭ１４３に図６に示すメモリエリア６１，６２を形成し、図７の流れ図に示す処理を繰返し実行するようになっている。
【００３０】
メモリエリア６１は、語彙ファイル３１から抽出した語句音声データと語句コードとを、抽出順に記憶する領域で、以後、語句メモリ６１と称する。メモリエリア６２は、各種語句の意味別に設定された項目（この実施の形態では「アイテム」，「数量」，「金額」，「決済方法」の４種）別に、その項目の意味を有する語句の語句音声データ，語句コード及び不確定フラグを１つずつ記憶する領域（語句記憶手段）で、以後、項目別メモリ６２と称する。
【００３１】
ＲＡＭ１４３に語句メモリ６１及び項目別メモリ６２を形成した状態で、ＣＰＵ１４１は、ＳＴ（ステップ）１として音声入力スイッチ１５がオン操作されるのを待機している。なお、初期状態として、語句メモリ６１の語句音声データエリア及び語句コードエリアと、項目別メモリ６２の語句音声データエリア，語句コードエリア及び不確定フラグエリアはいずれもクリアされている。
【００３２】
入力ポート１４７に入力される信号により、音声入力スイッチ１５がオン操作されたと判断すると、ＣＰＵ１４１は、ＳＴ２として音声認識部１７にて認識された音声データを取り込む。音声データは、ＳＴ３として音声入力スイッチ１５がオフ操作されるまで継続して取り込む。
【００３３】
入力ポート１４７に入力される信号により、音声入力スイッチ１５がオフ操作されたと判断すると、ＣＰＵ１４１は、ＳＴ４として語句抽出処理を実行する。すなわち、音声認識部１７から取り込んだ音声データを入力順に語彙ファイル３１に登録されている各種語句の語句音声データと比較照合して、当該語彙ファイル３１に登録されている語句音声データと一致する語句を１つずつ抽出する。そして語句を抽出する毎に、その語句の語句音声データと対応する語句コードとを、語句メモリ６１に抽出順に格納する（語句抽出手段）。なお、音声データから語句音声データと一致する語句を１つも抽出できないときにはエラーとし、音声入力スイッチ１５が再度オン操作されるのを待機する。
【００３４】
音声データから語句音声データを抽出し終えると、ＣＰＵ１４１は、語句項目判定処理を実行する。すなわち、語句メモリ６１に格納された語句音声データとその語句コードとをメモリナンバーの小さい順、つまり音声データから抽出された語句の順番に読み出す。そして、語句コードで辞書ファイル３２を検索して、対応する項目コード及び不確定フラグを取得し、項目別メモリ６２の該当項目エリアに、当該語句音声データ及び語句コードと不確定フラグとを格納する。このとき、該当項目エリアに既にデータが格納されていた場合には、その既存のデータに上書きして格納する。したがって、語句メモリ６１の中に同一項目の意味を有する語句の語句音声データ及び語句コードが複数格納されていた場合には、最も後から抽出された語句，つまり最も後から音声入力された語句を有効とし、その語句の語句音声データ，語句コード及び不確定フラグを項目別メモリ６２で記憶する（語句項目判定手段）。
【００３５】
語句メモリ６１に格納された各語句音声データの項目を判定し終えると、ＣＰＵ１４１は、ＳＴ６として不確定語句の有無を判断する。すなわち、項目別メモリ６２に格納された不確定フラグをチェックし、不確定フラグ“１”が格納されているか否かを判断する。そして、不確定フラグ“１”が格納されている場合には不確定語句有りと判断し、不確定フラグ“１”が格納されていない場合には不確定語句無しと判断する。
【００３６】
不確定語句有りと判断した場合には、ＣＰＵ１４１は、ＳＴ７としてこの不確定フラグ“１”に対応する語句コードで辞書ファイル３２を再度検索し、対応するガイダンス番号とリンク語句コードとを読み出す。また、ガイダンス番号に対応するガイダンスデータをガイダンスファイル３３から読み出すとともに、各リンク語句コードにそれぞれ対応する語句音声データを語彙ファイル３１から読み出す。そして、ガイダンスデータと各語句音声データとから、不確定語句を確定させる音声の再入力を促すガイダンスデータ，いわゆる確定語句問合せガイダンスデータを作成する。しかる後、ＳＴ８としてこの確定語句問合せガイダンスデータを音声生成部１８に出力して、スピーカ１２より音声出力させる（確定語句催促手段）。その後、ＣＰＵ１４１は、ＳＴ１に戻って、音声入力スイッチ１５が再度オン操作されるのを待機する。
【００３７】
不確定語句無しと判定した場合には、ＣＰＵ１４１は、ＳＴ９として確定条件判断処理を実行する。すなわち、項目別メモリ６２をチェックして語句音声データ，語句コード及び不確定フラグ“０”の各データが格納されている項目と格納されていない項目とを区別する。そして、各データが格納されている項目の組合せパターン，つまり音声データから抽出された各語句の項目組合せパターンで確定条件テーブル３４を検索して、同一パターンのガイダンス番号を取得する。
【００３８】
ここで、同一パターンのガイダンス番号が“０”以外の場合、つまりガイダンス番号が設定されている場合には、ＣＰＵ１４１は、ＳＴ１０として音声データから抽出された各語句の項目組合せパターンが確定条件を満足するパターンでないので、入力を確定させない。これに対し、同一パターンのガイダンス番号が“０”の場合には、音声データから抽出された各語句の項目組合せパターンが確定条件を満足するパターンであるので、入力を確定させる（確定条件判断手段）。
【００３９】
入力を確定させない場合には、ＳＴ１１としてそのガイダンス番号に対応するガイダンスデータをガイダンスファイル３３から読み出し、確定条件を満足するのに不足している項目の意味を有する語句の音声再入力を促すガイダンスデータ、いわゆる不足項目語句問合せガイダンスデータを作成する。そして、ＳＴ１２としてこの不足項目語句問合せガイダンスデータを音声生成部１８に出力して、スピーカ１２より音声出力させる（不足語句催促手段）。その後、ＳＴ１に戻って、音声入力スイッチ１５が再度オン操作されるのを待機する。
【００４０】
入力を確定させる場合には、ＳＴ１３として項目別メモリ６２に格納されている各項目の語句コードを、項目「アイテム」，「数量」又は「金額」，「決済方法」の順に読み出す。そして、これらの語句コードを順に組み合わせて音声入力確定データを作成したならば、この音声入力確定データを伝票発行装置２にインタフェース１４６を介して転送する（入力確定手段）。
【００４１】
その後、ＣＰＵ１４１は、ＳＴ１４として語句メモリ６１及び項目別メモリ６２をクリアして初期状態に戻したならば、今回の処理を終了する。そして、音声入力スイッチ１５が次にオン操作されるのを待機し、オン操作されたならば、ＳＴ１からの処理を繰り返すものとなっている。
【００４２】
このように本実施の形態においては、ガソリン給油システムの伝票発行装置２に対する注文内容の入力デバイスとして、音声入力装置１が使用されている。そして、ガソリンの給油に来た客は、所定位置に停車後、車内から音声入力スイッチ１５をオンし、マイクロホン１１に向かって注文内容、つまりガソリンの種類と給油量と決済方法を発声する。このとき、注文内容を告げる順番は特に制限されない。
【００４３】
今、例えば客が注文内容を「ガソリンを現金で２０００円分」と発声したとする。そうすると、図８（ａ）に示すように、「ガソリンを現金で２０００円分」という音声データが音声認識部１７で認識され、音声入力確定部１４に入力される。
【００４４】
音声入力確定部１４では、この音声データ「ガソリンを現金で２０００円分」に対して、語句抽出処理が実行される。その結果、この音声データからガソリンの種類を表わす語句「ガソリン」と、支払方法を表わす語句「現金」と、給油量相当の金額を表わす語句「２０００円」が抽出される。そして、図８（ｂ）に示すように、語句メモリ６１に各語句の語句音声データと語句コードとが抽出された順に格納される。
【００４５】
次に、音声入力確定部１４では、語句項目判定処理が実行される。これにより、項目別メモリ６２には、図８（ｃ）に示すように、項目「アイテム」に対して語句「ガソリン」の語句音声データ，語句コード及び不確定フラグがそれぞれ格納され、項目「金額」に対して語句「２０００円」の語句音声データ，語句コード及び不確定フラグがそれぞれ格納され、項目「決済方法」に対して語句「現金」の語句音声データ，語句コード及び不確定フラグがそれぞれ格納される。
【００４６】
この場合、語句「ガソリン」は不確定語句（不確定フラグ＝１）なので、ガソリンの種類を確定させる音声の再入力を促すガイダンスデータ「ガソリンは……ですか……ですか」と、語句「ガソリン」のリンク語句コードに対応する語句「レギュラー」，「ハイオク」とから、図８（ｄ）に示す確定語句問合せガイダンスデータ「ガソリンはレギュラーですかハイオクですか」が作成される。そして、この確定語句問合せガイダンスデータが音声生成部１８に出力され、音声に変換されて、スピーカ１２から出力される。
【００４７】
このガイダンスを聞いた客が、同様にして例えば「レギュラー」と発声したとする。そうすると、図８（ｅ）に示すように、「レギュラー」という音声データが音声認識部１７で認識され、音声入力確定部１４に入力される。音声入力確定部１４では、この音声データ「レギュラー」に対して、語句抽出処理が実行される。その結果、この音声データからガソリンの種類を表わす語句「レギュラー」が抽出される。そして、図８（ｆ）に示すように、語句「レギュラー」の語句音声データと語句コードとが語句メモリ６１に追加される。
【００４８】
次いで、音声入力確定部１４では、語句項目判定処理が再度実行される。これにより、項目別メモリ６２には、図８（ｇ）に示すように、項目「アイテム」に対して語句「レギュラー」の語句音声データ，語句コード及び不確定フラグがそれぞれ上書きされる。
【００４９】
その結果、不確定語句は存在しなくなったので、音声入力確定部１４では、確定条件判断処理が実行される。この場合、項目別メモリ６２には、項目「アイテム」，「金額」及び「決済方法」にそれぞれ属する語句のデータが格納されているので、条件ナンバーＰ２に一致するパターンであると認識される。このパターンは、ガイダンス番号が未設定、つまり確定条件を満足するパターンなので、語句「レギュラー」，「２０００円」，「現金」の各語句コードから音声入力確定データが作成され、伝票発行装置２に転送される。
【００５０】
伝票発行装置２においては、音声入力確定データの最初の語句コードをガソリンの種類を示すデータとして認識し、次の語句コードを給油量を示すデータとして認識し、最後の語句コードを決済方法を示すデータとして認識して処理する。したがって、正常に伝票発行が処理される。
【００５１】
また、例えば別の客が注文内容を「レギュラーガソリンをカードで」と発声したとする。そうすると、図９（ａ）に示すように、「レギュラーガソリンをカードで」という音声データが音声認識部１７で認識され、音声入力確定部１４に入力される。
【００５２】
音声入力確定部１４では、この音声データ「レギュラーガソリンをカードで」に対して、語句抽出処理が実行される。その結果、この音声データからガソリンの種類を表わす語句「レギュラーガソリン」と、支払方法を表わす語句「カード」が抽出される。そして、図９（ｂ）に示すように、語句メモリ６１に各語句の語句音声データと語句コードとが抽出された順に格納される。
【００５３】
次に、音声入力確定部１４では、語句項目判定処理が実行される。これにより、項目別メモリ６２には、図９（ｃ）に示すように、項目「アイテム」に対して語句「レギュラーガソリン」の語句音声データ，語句コード及び不確定フラグがそれぞれ格納され、項目「決済方法」に対して語句「カード」の語句音声データ，語句コード及び不確定フラグがそれぞれ格納される。
【００５４】
この場合、語句「カード」は不確定語句（不確定フラグ＝１）なので、カードの種類を確定させる音声の再入力を促すガイダンスデータ「カードは……ですか……ですか」と、語句「カード」のリンク語句コードに対応する語句「クレジット」，「デビット」とから、図９（ｄ）に示す確定語句問合せガイダンスデータ「カードはクレジットですかデビットですか」が作成される。そして、この確定語句問合せガイダンスデータが音声生成部１８に出力され、音声に変換されて、スピーカ１２から出力される。
【００５５】
このガイダンスを聞いた客が、同様にして例えば「クレジットカードです」と発声したとする。そうすると、図９（ｅ）に示すように、「クレジットカードです」という音声データが音声認識部１７で認識され、音声入力確定部１４に入力される。音声入力確定部１４では、この音声データ「クレジットカードです」に対して、語句抽出処理が実行される。その結果、この音声データから決済方法を表わす語句「クレジットカード」が抽出される。そして、図９（ｆ）に示すように、語句「クレジットカード」の語句音声データと語句コードとが語句メモリ６１に追加される。
【００５６】
次いで、音声入力確定部１４では、語句項目判定処理が再度実行される。これにより、項目別メモリ６２には、図９（ｇ）に示すように、項目「決済方法」に対して語句「クレジットカード」の語句音声データ，語句コード及び不確定フラグがそれぞれ上書きされる。
【００５７】
その結果、不確定語句は存在しなくなったので、音声入力確定部１４では、確定条件判断処理が実行される。この場合、項目別メモリ６２には、項目「アイテム」と「決済方法」にそれぞれ属する語句のデータが格納されているので、条件ナンバーＰ５に一致するパターンであると認識される。このパターンは、ガイダンス番号［１０１］が設定されているパターン、つまり確定条件を満足しないパターンなので、図９（ｈ）に示すガイダンス番号［１０１］のガイダンスデータ、つまり給油量の音声再入力を促す不足項目語句問合せガイダンスデータ「給油予定量または予定金額を入力してください」が作成される。そして、この不足項目語句問合せガイダンスデータが音声生成部１８に出力され、音声に変換されて、スピーカ１２から出力される。
【００５８】
このガイダンスを聞いた客が、同様にして例えば「満タン」と発声したとする。そうすると、図９（ｉ）に示すように、「満タン」という音声データが音声認識部１７で認識され、音声入力確定部１４に入力される。音声入力確定部１４では、この音声データ「満タン」に対して、語句抽出処理が実行される。その結果、この音声データから給油量を表わす語句「満タン」が抽出される。そして、図９（ｊ）に示すように、語句「満タン」の語句音声データと語句コードとが語句メモリ６１に追加される。
【００５９】
次いで、音声入力確定部１４では、語句項目判定処理が再度実行される。これにより、項目別メモリ６２には、図９（ｋ）に示すように、項目「数量」に対して語句「満タン」の語句音声データ，語句コード及び不確定フラグがそれぞれ格納される。次いで、不確定語句は存在しないので、音声入力確定部１４では、確定条件判断処理が実行される。この場合、項目別メモリ６２には、項目「アイテム」，「数量」及び「決済方法」にそれぞれ属する語句のデータが格納されているので、条件ナンバーＰ１に一致するパターンであると認識される。このパターンは、ガイダンス番号が未設定、つまり確定条件を満足するので、語句「レギュラーガソリン」，「満タン」，「クレジットカード」の各語句コードから音声入力確定データが作成され、順次伝票発行装置２に転送される。
【００６０】
この場合も、伝票発行装置においては、音声入力確定データの最初の語句コードをガソリンの種類を示すデータとして認識し、次の語句コードを給油量を示すデータとして認識し、最後の語句コードを決済方法を示すデータとして認識して処理するので、正常に伝票発行が処理される。
【００６１】
このように本実施の形態によれば、伝票発行装置２に対して給油の注文内容を音声入力する際の入力項目の順番が制限されないので、使い勝手がよく、不特定多数の客が自身で注文内容を伝票発行装置２に音声入力できるようになる。その結果、店員の手間を軽減でき、入力ミスもなくなる上、店員が車と伝票発行装置との間を往復して客の注文内容を手入力する無駄を省略できるので、作業効率を向上できる。
【００６２】
また、本実施の形態では、最初に音声入力された語句だけでは、注文内容の確定条件が満足されないときには、当該確定条件を満足するのに不足している項目の意味を有する語句の音声再入力を促すガイダンスを自動的に生成し、ユーザに対して音声出力する。そして、その後に音声入力された語句から確定条件が満足されたならば、音声入力データを確定させるようにしている。したがって、注文内容を音声入力する際に入力漏れの項目があっても、その項目の注文内容のみを入力し直せばよいので、注文内容を誰もが容易にかつ正しく音声で入力できる効果を奏する。
【００６３】
また、音声入力された語句の中に注文内容を確定できない不確定語句が含まれる場合には、その語句を確定させる音声の再入力を促すガイダンスを自動的に生成し、ユーザに対して音声出力する。そして、その後に音声入力された語句から不確定語句が確定されたならば、その後から入力された語句を有効にして音声入力データを確定させるようにしている。したがって、客が例えばガソリンの種類に関して「ガソリン」としか発声しなかったためにガソリンがレギュラーガソリンかハイオクガソリンかを特定できなかった場合でも、その後の音声入力によってガソリンの種類を確定できるので、やはり、注文内容を誰もが容易にかつ正しく音声で入力することができる。
【００６４】
ところで、この実施の形態では、音声データから抽出されて語句メモリ６１に格納された語句の中に同一項目の意味を有する語句が複数ある場合には、最も後から音声入力された語句を有効にして、項目別メモリ６２に記憶するようにしている。したがって、例えばガソリンの種類を言い間違えて後から訂正した場合には、後から訂正した種類の語句が有効になるので、この点からも使い勝手のよいものである。
【００６５】
なお、この実施の形態では、ガソリンスタンド給油システムの伝票発行装置に対する注文内容入力デバイスとして本発明を適用したが、本発明を適用できる情報処理装置はこれに限定されるものではなく、航空券，乗車券などの券発行装置や、レストラン，ファーストフード店でのオーダ登録装置等に対する注文内容入力デバイスとしても用いることができる。例えば航空券の券発行装置に対する入力デバイスとして用いた場合には、語句の意味が含まれる項目として出発日時，出発地，行先，航空会社，座席（禁煙席，窓側，通路側など）等が設定される。
【００６６】
また、ファーストフード店でのオーダ登録装置に対する入力デバイスとして用いた場合には、語句の意味が含まれる項目として品名，数量，決済方法等が設定される。
【００６７】
また、この実施の形態では、確定語句問合せガイダンスデータや不足項目語句問合せガイダンスデータを音声出力したが、音声入力確定部１４に表示部を接続して、ガイダンスデータを画面に表示出力するように構成しても、同様な作用効果を奏し得る。
【００６８】
また、この実施の形態では、ＲＯＭ１４２に格納された音声入力プログラムをＣＰＵ１４１が実行することにより、音声入力装置としての機能を実現させたが、インタフェース１４６を介して外部機器からＨＤＤ装置１４５に音声入力プログラムをダウンロードし、このプログラムをＣＰＵ１４１が実行することによって、音声入力装置としての機能を実現させてもよい。
【００６９】
【発明の効果】
以上詳述したように本発明の音声入力装置及び音声入力方法並びに音声入力プログラムであれば、音声入力する順番に捕われずに注文内容を正しく入力処理することができるので、音声入力装置の使い勝手を向上できる上、注文内容を誰もが容易にかつ正しく音声で入力できるようになる。
【図面の簡単な説明】
【図１】本発明の一実施の形態である音声入力装置の構成を示すブロック図。
【図２】図１における音声入力確定部の構成を示すブロック図。
【図３】同音声入力確定部のＨＤＤ装置に記憶される主要なデータファイル及びデータテーブルを示す図。
【図４】図３に示す各データファイルのレコード構成を示す図。
【図５】図３に示すデータテーブルのデータ構成を示す図。
【図６】同音声入力確定部のＲＡＭに形成される主要なメモリエリアを示す図。
【図７】同音声入力確定部のＣＰＵが実行する主要なプログラム処理の手順を示す流れ図。
【図８】同音声入力確定部の動作の一例におけるデータ例を示す図。
【図９】同音声入力確定部の動作の他の例におけるデータ例を示す図。
【符号の説明】
１…音声入力装置
２…伝票発行装置
１１…マイクロホン
１２…スピーカ
１３…音声エンジン部
１４…音声入力確定部
１５…音声入力スイッチ
３１…語彙ファイル
３２…辞書ファイル
３３…ガイダンスファイル
３４…確定条件テーブル
[0001]
BACKGROUND OF THE INVENTION
The present invention relates to a voice input device, a method therefor, and a voice input program, used as an input device for order contents for an information processing device that accepts and processes order contents, such as a slip issuing device of a gas station refueling system, a ticket issuing device for airline tickets, train tickets and the like, or an order registration device at a restaurant or fast food outlet.
[0002]
[Prior art]
For example, at a gas station, when a customer arrives, the clerk first asks the customer for the type of fuel to be dispensed (regular gasoline, high-octane gasoline, light oil (diesel), etc.), the refueling amount (by quantity or by monetary amount), and the settlement method (cash payment, credit payment, etc.). If the customer answers, for example, “Fill it up with regular, cash,” the clerk enters into the slip issuing device that the fuel type is “regular gasoline”, that the refueling amount is “full tank”, and that the settlement method is “cash payment”, and then starts refueling. When refueling is completed, a settlement slip corresponding to the actual refueling amount is automatically printed and issued from the slip issuing device, and the clerk receives payment for the fuel from the customer according to that slip.
[0003]
In such a gas station refueling system, the customer's order contents have conventionally been entered into the slip issuing device with an input device such as a keyboard or a touch panel. With this method, however, the clerk had to manually enter what the customer had said, which duplicated effort and made input errors likely. In addition, the customer had to wait for the clerk to come over before stating the order, and the clerk, after hearing the order, had to go to where the slip issuing device was installed, enter the order contents, and then return to where the car was parked to begin refueling, which was inefficient.
[0004]
Therefore, in recent years it has been proposed to connect a voice input device to the slip issuing device as an input device, so that a customer who has come to refuel can input the order contents to the slip issuing device by voice directly from inside the vehicle. This reduces the clerk's workload, eliminates input errors, and removes the wasted effort of the clerk going back and forth between the car and the slip issuing device to enter the order manually, so an improvement in work efficiency can be expected.
[0005]
However, a conventional voice input device takes in the words recognized from the voice data input through the microphone one at a time in input order, and transfers them in that order to an information processing device such as a computer. Meanwhile, an information processing device that accepts and processes order contents, such as a slip issuing device, has a predetermined order in which it accepts the order items, and it processes the transferred words, in that order, as data indicating the order contents of the corresponding items.
[0006]
For this reason, if, for example, the slip issuing device of the refueling system is programmed to accept the gasoline type first, the refueling amount next, and the settlement method last, the slip issuing device processes the first word transferred from the voice input device as data indicating the gasoline type, the next word as data indicating the refueling amount, and the last word as data indicating the settlement method. Therefore, the person performing voice input had to utter the order contents in the predetermined order: first the gasoline type, then the refueling amount, and finally the settlement method.
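The order dependence described in this paragraph can be illustrated with a minimal sketch. The field names and the helper `assign_by_position` are hypothetical and are not taken from the patent; the sketch only assumes that the receiving device pairs each transferred word with a field by arrival order.

```python
# Hypothetical illustration of fixed-order processing (names are assumptions).
FIELD_ORDER = ("fuel_type", "amount", "payment")  # order assumed by the receiving device


def assign_by_position(words):
    """Pair each transferred word with a field purely by arrival order."""
    return dict(zip(FIELD_ORDER, words))


# Spoken in the expected order: every word lands in the right field.
in_order = assign_by_position(["regular", "full tank", "cash"])

# Spoken in a different order: the same words are assigned to the wrong fields.
out_of_order = assign_by_position(["cash", "regular", "full tank"])

print(in_order)
print(out_of_order)
```

In the second call the word for the settlement method ends up in the fuel type field, which is the failure mode the paragraph describes.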
[0007]
[Problems to be solved by the invention]
As described above, when a voice input device is used as the input device for order contents, the order in which the order contents are input has conventionally been fixed. Consequently, even if a refueling system is built in which, for example, customers at a gas station use the voice input device themselves to input their refueling order into the slip issuing device, it is difficult to make every customer state the order contents in the prescribed sequence at a gas station used by an unspecified number of customers, and there was a high risk that the slip issuing device would fail to recognize the order contents correctly and would not function properly.
[0008]
The present invention has been made in view of these circumstances, and its object is to provide a voice input device, a method therefor, and a voice input program capable of correctly inputting and processing order contents regardless of the order of voice input.
[0009]
[Means for Solving the Problems]
The voice input device of the present invention comprises: phrase extraction means for extracting pre-registered phrases one by one from order contents input by voice; phrase item determination means for determining, for each phrase extracted by the phrase extraction means, the item to which the meaning of the phrase belongs; phrase storage means for storing, item by item, each phrase whose item has been determined by the phrase item determination means; confirmation condition determination means for determining whether the item combination pattern of the phrases stored by the phrase storage means is a pattern that satisfies an order confirmation condition consisting of the items of the type of ordered product, the quantity, and the settlement method; order confirmation means for confirming each phrase stored by the phrase storage means as voice input order data when the confirmation condition determination means determines that the item combination pattern of the phrases satisfies the order confirmation condition; and means for transferring the voice input order data confirmed by the order confirmation means to a slip issuing device that issues a settlement slip.
[0010]
Therefore, when the combination pattern of the items to which the meanings of the phrases extracted from the voice-input order contents belong matches a pattern that satisfies the order confirmation condition consisting of the items of the type of ordered product, the quantity, and the settlement method, those phrases are confirmed as voice input order data, so the phrases making up the order contents may be spoken in any order.
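A minimal sketch of this idea follows. It files each recognized phrase under the item its meaning belongs to and confirms the order once every required item is present. The mapping `ITEM_OF`, the item names, and the function `confirm_order` are hypothetical stand-ins, not the patent's data structures.

```python
# Hypothetical phrase-to-item mapping; the phrases and item names are illustrative only.
ITEM_OF = {
    "regular": "type",
    "high-octane": "type",
    "full tank": "quantity",
    "20 liters": "quantity",
    "cash": "payment",
    "credit": "payment",
}
REQUIRED = ("type", "quantity", "payment")  # order confirmation condition (assumed form)


def confirm_order(phrases):
    """Return the phrases in the fixed item order, or None if an item is missing."""
    slots = {}
    for phrase in phrases:
        item = ITEM_OF.get(phrase)
        if item is not None:
            slots[item] = phrase  # store by item, regardless of spoken position
    if all(item in slots for item in REQUIRED):
        return [slots[item] for item in REQUIRED]
    return None


print(confirm_order(["cash", "full tank", "regular"]))
print(confirm_order(["regular", "cash"]))
```

Because phrases are stored by item rather than by position, the confirmed data can always be emitted in the sequence the receiving device expects.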
[0011]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, an embodiment of the present invention will be described with reference to the drawings.
In this embodiment, the present invention is applied to a voice input device used as an order content input device for a slip issuing device of a gas station refueling system.
[0012]
FIG. 1 is a block diagram showing a configuration of a voice input device 1 according to the present embodiment. The voice input device 1 includes a microphone 11 as a voice input unit, a speaker 12 as a voice output unit, a voice engine unit 13, a voice input confirmation unit 14, and a voice input switch 15. The voice input confirmation unit 14 is connected to an information processing apparatus that accepts and processes order contents, that is, in this embodiment, a slip issuing apparatus 2 of a gas station refueling system.
[0013]
The voice input switch 15 is a switch that is turned on when a user inputs a voice from the microphone 11, and the microphone 11 and the speaker 12 are provided in the vicinity thereof. In this embodiment, the voice input switch 15, the microphone 11, and the speaker 12 are provided at a position where a customer who stops at the fueling place of the gas station can operate from the inside of the vehicle.
[0014]
The voice engine unit 13 consists of an A/D (analog/digital) converter 16 that converts the analog voice signal input from the microphone 11 into a digital voice signal, a voice recognition unit 17 that recognizes speech one syllable at a time from the digital voice signal and creates voice data, a voice generation unit 18 that generates a digital voice signal from guidance data supplied from the voice input confirmation unit 14, and a D/A (digital/analog) converter 19 that converts the digital voice data into analog voice data and outputs it to the speaker 12.
[0015]
The voice input confirmation unit 14 has the function of taking in the voice data generated by the voice recognition unit 17 while the voice input switch 15 is on, confirming from this voice data the voice input data to be processed by the slip issuing device 2, and transferring it to the slip issuing device 2.
[0016]
In this embodiment, the slip issuing device 2 is configured to accept orders for the gasoline type, the refueling amount, and the settlement method in this order and, once these order contents have been input and refueling has finished, to print and issue a settlement slip corresponding to the actual refueling amount. Accordingly, when a phrase representing the gasoline type, a phrase representing the refueling amount, and a phrase representing the settlement method have each been obtained from the voice data, the voice input confirmation unit 14 confirms the voice input data from these phrases and transfers it to the slip issuing device 2. When a phrase representing any part of the order contents is missing, on the other hand, it outputs to the voice generation unit 18 guidance data prompting voice re-input of the phrase representing that part, and has the guidance output as voice. Further, when the phrases include one that cannot confirm the order contents, for example because it is a generic name, it outputs to the voice generation unit 18 guidance data prompting re-input of a voice that confirms that phrase, and has the guidance output as voice.
[0017]
Hereinafter, a specific configuration of the voice input confirmation unit 14 will be described.
[0018]
FIG. 2 is a block diagram showing the configuration of the voice input confirmation unit 14. The voice input device 1 comprises: a CPU (Central Processing Unit) 141 serving as a microprocessor; a ROM (Read Only Memory) 142 in which fixed data such as the programs executed by the CPU 141 are stored in advance; a RAM (Random Access Memory) 143 in which the various memory areas used by the CPU 141 for temporary data storage are formed; an I/O (Input/Output) port 144 through which voice data is input from the voice recognition unit 17 of the voice engine unit 13 and guidance data is output to the voice generation unit 18; an HDD (Hard Disk Drive) device 145 that stores various data files and the like; a communication interface 146 that is connected to the slip issuing device 2 and transfers voice input data; an input port 147 to which the signal from the voice input switch 15 is input; and so on. The CPU 141 is connected to the ROM 142, RAM 143, I/O port 144, HDD device 145, communication interface 146, input port 147, and so on by a system bus 148. Here, the voice input confirmation unit 14 is built mainly around a microcomputer consisting of the CPU 141, ROM 142, and RAM 143.
[0019]
As shown in FIG. 3 in particular, the HDD device 145 stores a vocabulary file 31, a dictionary file 32, a guidance file 33, and a confirmation condition table 34.
[0020]
As shown in FIG. 4A, the vocabulary file 31 stores records each consisting of phrase voice data, in which the voice features obtained when a given phrase is uttered are expressed numerically, and a phrase code that is preset for that phrase and differs from phrase to phrase. In particular, a large number of phrases expected to be used when inputting order contents by voice to the slip issuing device 2 have been selected, and a record is stored for each. For example, “regular”, “regular gasoline”, “high-octane”, “high-octane gasoline”, “light oil”, “gasoline”, and so on are selected as phrases used when ordering the gasoline type, and the phrase voice data and phrase codes of these phrases are stored in the vocabulary file 31. Likewise, “full tank”, “10 liters”, “20 liters”, “1000 yen”, “2000 yen”, and so on are selected as phrases used when ordering the refueling amount, and “cash”, “cash payment”, “card”, “credit”, “debit”, “credit card”, “debit card”, and so on are selected as phrases used when ordering the settlement method; their phrase voice data and phrase codes are also stored in the vocabulary file 31.
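As a rough illustration of how registered phrases might be picked out of an utterance, the sketch below uses plain text strings as a stand-in for the phrase voice data. The phrase codes, the `VOCABULARY` mapping, and the longest-match-first rule in `extract_phrases` are all assumptions made for this example; the description itself only states that phrase voice data and phrase codes are stored as records.

```python
# Text strings stand in for phrase voice data; the codes are hypothetical.
VOCABULARY = {
    "regular gasoline": 2,
    "regular": 1,
    "gasoline": 5,
    "full tank": 20,
    "cash": 30,
    "card": 32,
    "credit card": 35,
}


def extract_phrases(utterance):
    """Scan left to right, taking the longest registered phrase at each position."""
    by_length = sorted(VOCABULARY, key=len, reverse=True)
    found = []
    i = 0
    while i < len(utterance):
        for phrase in by_length:
            if utterance.startswith(phrase, i):
                found.append((phrase, VOCABULARY[phrase]))
                i += len(phrase)
                break
        else:
            i += 1  # no registered phrase starts here; skip one character
    return found


print(extract_phrases("regular gasoline by card"))
```

Trying longer phrases first keeps “regular gasoline” from being split into “regular” and the generic “gasoline”; whether the actual device resolves overlaps this way is not stated in the text.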
[0021]
As shown in FIG. 4B, the dictionary file 32 stores records each consisting of a phrase code, an item code, an indeterminate flag, a guidance number, and a plurality of link phrase codes; it holds a plurality of records, each linked to the phrase code of one of the phrases stored in the vocabulary file 31.
[0022]
Among the record fields of the dictionary file 32, the item code is a code identifying an item set according to the meaning of the phrases. In this embodiment, four items are set: the item “item” (item code = 1), meaning the gasoline type; the item “quantity” (item code = 2), meaning the refueling amount; the item “amount” (item code = 3), meaning a monetary amount equivalent to the refueling amount; and the item “settlement method” (item code = 4), meaning the settlement method.
[0023]
The indeterminate flag identifies whether or not the corresponding phrase is one that can confirm the order contents: “0” is set for a phrase that can confirm them, and “1” is set for an uncertain phrase such as a generic name. A guidance number and link phrase codes are set only for uncertain phrases. The guidance number is the number assigned to the guidance data that prompts re-input of a voice for confirming the uncertain phrase. The link phrase codes are the phrase codes of some or all of the phrases that belong to the same item as the uncertain phrase and that can confirm the order contents.
[0024]
For example, the phrases input by voice when ordering the gasoline type include “regular”, “high-octane”, and “gasoline”. The first two identify the gasoline type, but “gasoline” alone does not indicate whether regular or high-octane gasoline is meant. Therefore, for the phrase code of the phrase “gasoline”, the indeterminate flag is set to “1”, the guidance number is set to the number of the guidance data prompting voice re-input to confirm the gasoline type (for example, “Is the gasoline ... or ...?”), and the phrase codes of the phrases “regular” and “high-octane” are set as the link phrase codes. Similarly, the phrases input by voice when ordering the settlement method include “credit”, “debit”, and “card”. The first two identify the settlement method, but “card” alone does not indicate whether credit card or debit card settlement is meant. Therefore, for the phrase code of the phrase “card”, the indeterminate flag is set to “1”, the guidance number is set to the number of the guidance data prompting voice re-input to confirm the card type (for example, “Is the card ... or ...?”), and the phrase codes of the phrases “credit” and “debit” are set as the link phrase codes.
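The dictionary record and the way an uncertain phrase leads to a clarification question can be sketched as follows. The record fields follow the description (phrase code, item code, indeterminate flag, guidance number, link phrase codes), but the concrete phrase codes, the guidance numbers 200 and 201, the `{}` placeholder templates, and the function `clarification_prompt` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DictionaryRecord:
    phrase_code: int
    item_code: int                    # 1=item, 2=quantity, 3=amount, 4=settlement method
    uncertain: int = 0                # indeterminate flag: 0 = confirmable, 1 = uncertain
    guidance_number: int = 0          # set only for uncertain phrases
    link_codes: Tuple[int, ...] = ()  # set only for uncertain phrases


# Hypothetical phrase codes and guidance numbers.
PHRASES = {1: "regular", 2: "high-octane", 3: "gasoline",
           10: "credit", 11: "debit", 12: "card"}
DICTIONARY = {
    1: DictionaryRecord(1, 1),
    2: DictionaryRecord(2, 1),
    3: DictionaryRecord(3, 1, 1, 200, (1, 2)),
    10: DictionaryRecord(10, 4),
    11: DictionaryRecord(11, 4),
    12: DictionaryRecord(12, 4, 1, 201, (10, 11)),
}
GUIDANCE = {200: "Is the gasoline {} or {}?", 201: "Is the card {} or {}?"}


def clarification_prompt(phrase_code):
    """Build the re-input prompt for an uncertain phrase; None if the phrase is confirmable."""
    record = DICTIONARY[phrase_code]
    if record.uncertain != 1:
        return None
    options = [PHRASES[code] for code in record.link_codes]
    return GUIDANCE[record.guidance_number].format(*options)


print(clarification_prompt(3))
print(clarification_prompt(12))
```

Keeping the candidate answers as link phrase codes rather than hard-coding them in the guidance text means the same template can be reused if the set of confirmable phrases changes.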
[0025]
As shown in FIG. 4C, the guidance file 33 stores records each consisting of a guidance number and guidance data. The various types of guidance data that must be output by voice before the voice input data is confirmed are stored under different guidance numbers.
[0026]
As shown in FIG. 5, the confirmation condition table 34 stores, for each condition number identifying a combination pattern of the items set according to the meanings of the phrases (in this embodiment, “item”, “quantity”, “amount”, and “settlement method”), data indicating whether or not the pattern includes each item (1: included, 0: not included) and, when the pattern does not satisfy the confirmation condition, the guidance number of the guidance data that prompts voice re-input of phrases having the meanings of the items that are missing.
[0027]
That is, the item combination pattern identified by the condition number P1 includes the items “item”, “quantity”, and “settlement method” (data “1”), and the item combination pattern identified by the condition number P2 includes the items “item”, “amount”, and “settlement method” (data “1”). Both satisfy the confirmation condition for voice input data to the slip issuing device 2, so no guidance number is set (= 0).
[0028]
On the other hand, the item combination patterns identified by the condition numbers P3 to P12 do not satisfy the confirmation condition for voice input data to the slip issuing device 2, so the guidance number of the corresponding guidance data is set for each. For example, in the patterns identified by the condition numbers P3 and P4, no phrase having the meaning of the item “settlement method” has been input, so the guidance number [100] of guidance data prompting voice re-input of the settlement method (for example, “What is the settlement method?”) is set. In the pattern identified by the condition number P5, no phrase having the meaning of the item “quantity” or “amount” has been input, so the guidance number [101] of guidance data prompting voice re-input of the refueling amount is set. Similarly, in the patterns identified by the condition numbers P6 and P7, no phrase having the meaning of the item “item” has been input, so the guidance number [102] of guidance data prompting voice re-input of the gasoline type is set. In the pattern identified by the condition number P8, no phrase having the meaning of the item “quantity” or “amount” and none having the meaning of the item “settlement method” has been input, so the guidance number [103] of guidance data prompting voice re-input of the refueling amount and the settlement method is set. In the patterns identified by the condition numbers P9 and P10, no phrase having the meaning of the item “item” and none having the meaning of the item “settlement method” has been input, so the guidance number [104] of guidance data prompting voice re-input of the gasoline type and the settlement method is set. Finally, where no phrase having the meaning of the item “item” and none having the meaning of the item “quantity” or “amount” has been input, the guidance number [105] of guidance data prompting voice re-input of the gasoline type and the refueling amount is set.
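The table lookup described above can be sketched as follows. FIG. 5 itself is not reproduced in the text, so the rows are partly assumed: only the patterns for P1, P2, and P5 are stated explicitly, while the split between “quantity” and “amount” within the pairs P3/P4, P6/P7, and P9/P10 is a guess consistent with the description. Only P1 to P10 are shown, and the names `CONFIRMATION_CONDITION_TABLE` and `judge` are hypothetical.

```python
# Columns: (item, quantity, amount, settlement method); 1 = included, 0 = not.
# Guidance number 0 means the pattern satisfies the confirmation condition.
CONFIRMATION_CONDITION_TABLE = [
    ("P1", (1, 1, 0, 1), 0),
    ("P2", (1, 0, 1, 1), 0),
    ("P3", (1, 1, 0, 0), 100),   # settlement method missing (column split assumed)
    ("P4", (1, 0, 1, 0), 100),
    ("P5", (1, 0, 0, 1), 101),   # quantity or amount missing
    ("P6", (0, 1, 0, 1), 102),   # item missing (column split assumed)
    ("P7", (0, 0, 1, 1), 102),
    ("P8", (1, 0, 0, 0), 103),   # quantity or amount, and settlement method missing
    ("P9", (0, 1, 0, 0), 104),   # item and settlement method missing (split assumed)
    ("P10", (0, 0, 1, 0), 104),
]


def judge(pattern):
    """Return (condition number, guidance number, confirmed?) for an item pattern."""
    for condition, row, guidance in CONFIRMATION_CONDITION_TABLE:
        if row == tuple(pattern):
            return condition, guidance, guidance == 0
    raise LookupError(f"no table row for pattern {tuple(pattern)}")
```

For instance, `judge((1, 0, 0, 1))` yields condition number P5 with guidance number 101, which corresponds to the case where the refueling amount still has to be asked for.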
[0029]
Under the voice input program stored in the ROM 142, the CPU 141 forms the memory areas 61 and 62 shown in FIG. 6 in the RAM 143 and repeatedly executes the processing shown in the flowchart of FIG. 7.
[0030]
The memory area 61 stores, in the order of extraction, the phrase voice data and phrase codes of the phrases extracted by matching against the vocabulary file 31, and is hereinafter referred to as the phrase memory 61. The memory area 62 (phrase storage means) stores, for each item set according to the meanings of the phrases (the four items “item”, “quantity”, “amount”, and “settlement method” in this embodiment), the phrase voice data, phrase code, and indeterminate flag of one phrase having the meaning of that item, and is hereinafter referred to as the item-specific memory 62.
[0031]
With the word / phrase memory 61 and the item-specific memory 62 formed in the RAM 143, the CPU 141 waits for the voice input switch 15 to be turned on as ST (step) 1. As an initial state, the phrase audio data area and the phrase code area of the phrase memory 61 and the phrase audio data area, the phrase code area, and the indeterminate flag area of the item-specific memory 62 are all cleared.
[0032]
When determining that the voice input switch 15 is turned on by a signal input to the input port 147, the CPU 141 captures the voice data recognized by the voice recognition unit 17 as ST2. Audio data is continuously captured until the audio input switch 15 is turned off in ST3.
[0033]
If it determines from the signal input to the input port 147 that the voice input switch 15 has been turned off, the CPU 141 executes a phrase extraction process as ST4. That is, the voice data captured from the voice recognition unit 17 is compared, in the order of input, with the phrase voice data of the various phrases registered in the vocabulary file 31, and the phrases that match registered phrase voice data are extracted one by one. Each time a phrase is extracted, its phrase voice data and the corresponding phrase code are stored in the phrase memory 61 in the order of extraction (phrase extraction means). If no phrase matching registered phrase voice data can be extracted from the voice data, an error results and the process waits for the voice input switch 15 to be turned on again.
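The extraction step can be illustrated with a simplified sketch that works on recognized text rather than on voice data, which is a deliberate substitution: the device compares voice data against the phrase voice data in the vocabulary file 31, whereas the code below scans a string from left to right and takes the longest registered phrase at each position. The vocabulary contents and the function name `extract_phrases` are assumptions.

```python
# Assumed vocabulary; the real vocabulary file 31 holds phrase voice data.
VOCABULARY = [
    "regular gasoline", "regular", "high-octane", "gasoline",
    "full tank", "2000 yen", "cash", "credit card", "credit", "debit", "card",
]


def extract_phrases(utterance, vocabulary=VOCABULARY):
    """Return registered phrases found in the utterance, in order of occurrence."""
    text = utterance.lower()
    # Try longer phrases first so "regular gasoline" wins over "regular".
    candidates = sorted(vocabulary, key=len, reverse=True)
    found = []
    position = 0
    while position < len(text):
        for phrase in candidates:
            if text.startswith(phrase, position):
                found.append(phrase)
                position += len(phrase)
                break
        else:
            position += 1  # nothing registered starts here; skip one character
    if not found:
        # Corresponds to the error case: no registered phrase in the input.
        raise ValueError("no registered phrase found in the input")
    return found
```

Words that are not registered, such as “for” or “with a”, are simply skipped, which is what allows the order details to be spoken in any order and in a free sentence.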
[0034]
When the phrases have been extracted from the voice data, the CPU 141 executes a phrase item determination process as ST5. That is, the phrase voice data and phrase codes stored in the phrase memory 61 are read in ascending order of memory number, that is, in the order in which the phrases were extracted from the voice data. The dictionary file 32 is then searched with each phrase code to obtain the corresponding item code and indeterminate flag, and the phrase voice data, phrase code, and indeterminate flag are stored in the corresponding item area of the item-specific memory 62. If data is already stored in that item area, it is overwritten. Therefore, when the phrase memory 61 holds the phrase voice data and phrase codes of several phrases having the meaning of the same item, the phrase extracted last, that is, the phrase most recently input by voice, is selected, and its phrase voice data, phrase code, and indeterminate flag are stored in the item-specific memory 62 (phrase item determination means).
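The overwrite behavior of the item determination step can be sketched as below. The mapping `ITEM_OF`, the set `INDETERMINATE`, and the function name `determine_items` are hypothetical stand-ins for the dictionary file 32, and phrase text is used in place of phrase voice data and phrase codes.

```python
# Assumed stand-in for the dictionary file 32: phrase -> item name.
ITEM_OF = {
    "regular": "item", "high-octane": "item", "gasoline": "item",
    "full tank": "quantity", "2000 yen": "amount",
    "cash": "settlement method", "credit": "settlement method",
    "debit": "settlement method", "card": "settlement method",
}
INDETERMINATE = {"gasoline", "card"}  # phrases whose indeterminate flag is "1"


def determine_items(phrase_memory, item_memory=None):
    """Store each phrase under its item; a later phrase overwrites an earlier one."""
    item_memory = dict(item_memory or {})
    for phrase in phrase_memory:          # read in order of extraction
        item = ITEM_OF[phrase]
        item_memory[item] = (phrase, phrase in INDETERMINATE)
    return item_memory
```

With the phrase memory holding “gasoline”, “cash”, “2000 yen”, and then “regular”, the item “item” ends up holding “regular” with the indeterminate flag cleared, because the most recent phrase for an item replaces the earlier one.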
[0035]
When the determination of the items of each phrase sound data stored in the phrase memory 61 is completed, the CPU 141 determines the presence / absence of an indeterminate phrase in ST6. That is, the indeterminate flag stored in the item-specific memory 62 is checked to determine whether or not the indeterminate flag “1” is stored. If the indeterminate flag “1” is stored, it is determined that there is an indeterminate phrase, and if the indeterminate flag “1” is not stored, it is determined that there is no indeterminate phrase.
[0036]
If it determines that there is an indeterminate phrase, the CPU 141, as ST7, searches the dictionary file 32 again with the phrase code whose indeterminate flag is “1” and reads the corresponding guidance number and link phrase codes. It then reads the guidance data corresponding to the guidance number from the guidance file 33 and the phrase voice data corresponding to each link phrase code from the vocabulary file 31. From the guidance data and the phrase voice data it creates guidance data prompting voice re-input for confirming the indeterminate phrase, the so-called fixed phrase query guidance data. Then, as ST8, the fixed phrase query guidance data is output to the voice generation unit 18 and output as voice from the speaker 12 (fixed phrase prompting means). The CPU 141 then returns to ST1 and waits for the voice input switch 15 to be turned on again.
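How a query is assembled from guidance data and link phrases can be sketched in text form. The template syntax with an `{options}` slot, the guidance numbers 200 and 201, and the function name `build_fixed_phrase_query` are assumptions made for illustration; the device itself combines guidance data with phrase voice data for speech output.

```python
# Assumed guidance templates; the numbers 200 and 201 are hypothetical.
GUIDANCE_FILE = {
    200: "Is the gasoline {options}?",
    201: "Is the card {options}?",
}


def build_fixed_phrase_query(template, link_phrases):
    """Fill a guidance template with the phrases that can settle an indeterminate phrase."""
    if not link_phrases:
        raise ValueError("an indeterminate phrase needs at least one link phrase")
    return template.format(options=" or ".join(link_phrases))


query = build_fixed_phrase_query(GUIDANCE_FILE[200], ["regular", "high-octane"])
```

Here `query` becomes “Is the gasoline regular or high-octane?”, the same wording as the query used in the operation example for the phrase “gasoline”.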
[0037]
If it determines that there is no indeterminate phrase, the CPU 141 executes a confirmation condition determination process as ST9. That is, it checks the item-specific memory 62 and distinguishes the items for which phrase voice data, a phrase code, and the indeterminate flag “0” are stored from the items for which nothing is stored. It then searches the confirmation condition table 34 for the combination pattern of the items that hold data, that is, the item combination pattern of the phrases extracted from the voice data, and obtains the guidance number of the matching pattern.
[0038]
If the guidance number of the matching pattern is other than “0”, that is, if a guidance number is set, the CPU 141 determines as ST10 that the item combination pattern of the phrases extracted from the voice data does not satisfy the confirmation condition, and does not confirm the input. If the guidance number of the matching pattern is “0”, the item combination pattern of the phrases extracted from the voice data satisfies the confirmation condition, so the CPU 141 confirms the input (confirmation condition judging means).
[0039]
When the input is not confirmed, the CPU 141, as ST11, reads the guidance data corresponding to the guidance number from the guidance file 33 and creates guidance data prompting voice re-input of phrases having the meanings of the items that are missing for the confirmation condition, the so-called missing item phrase query guidance data. Then, as ST12, the missing item phrase query guidance data is output to the voice generation unit 18 and output as voice from the speaker 12 (missing phrase prompting means). The process then returns to ST1 and waits for the voice input switch 15 to be turned on again.
[0040]
When the input is confirmed, the CPU 141, as ST13, reads the phrase codes stored for each item in the item-specific memory 62 in the order of the item “item”, the item “quantity” or “amount”, and the item “settlement method”. It combines these phrase codes in that order to create voice input confirmation data and transfers the data to the slip issuing device 2 via the interface 146 (input confirmation means).
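The fixed output order is what makes the spoken order irrelevant to the slip issuing device 2, and it can be sketched in a few lines. The function name `build_confirmation_data` and the dictionary representation of the item-specific memory 62 are assumptions; phrase text again stands in for phrase codes.

```python
def build_confirmation_data(item_memory):
    """Return phrase codes in the order: item, quantity or amount, settlement method."""
    # Quantity and amount are alternatives. Preferring quantity when both are
    # present is an assumption; the description does not cover that case.
    if "quantity" in item_memory:
        volume = item_memory["quantity"]
    else:
        volume = item_memory["amount"]
    return [item_memory["item"], volume, item_memory["settlement method"]]


# The storage order of the items does not affect the result.
data = build_confirmation_data(
    {"settlement method": "cash", "amount": "2000 yen", "item": "regular"}
)
```

Because the receiving side interprets the first, second, and third phrase codes positionally, normalizing the order at this point is the only place where ordering has to be handled.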
[0041]
Thereafter, as ST14, the CPU 141 clears the phrase memory 61 and the item-specific memory 62 to restore the initial state, and the current processing ends. The CPU 141 then waits for the voice input switch 15 to be turned on again, and when it is, repeats the processing from ST1.
[0042]
As described above, in the present embodiment, the voice input device 1 is used as an input device for order contents to the slip issuing device 2 of the gasoline refueling system. Then, the customer who has come to refuel gasoline turns on the voice input switch 15 from the inside of the vehicle after stopping at a predetermined position, and speaks out the order contents, that is, the type of gasoline, the amount of refueling, and the settlement method toward the microphone 11. At this time, the order in which the order details are reported is not particularly limited.
[0043]
Now, suppose for example that a customer utters the order “Gasoline for 2000 yen in cash”. Then, as shown in FIG. 8A, the voice data “Gasoline for 2000 yen in cash” is recognized by the voice recognition unit 17 and input to the voice input confirmation unit 14.
[0044]
In the voice input confirmation unit 14, a phrase extraction process is executed for the voice data “Gasoline for 2000 yen in cash”. As a result, the phrase “gasoline” representing the type of gasoline, the phrase “cash” representing the payment method, and the phrase “2000 yen” representing the amount corresponding to the amount of fueling are extracted from the voice data. Then, as shown in FIG. 8B, the phrase audio data and the phrase code of each phrase are stored in the phrase memory 61 in the order of extraction.
[0045]
Next, the voice input confirmation unit 14 executes the phrase item determination process. As a result, as shown in FIG. 8C, the item-specific memory 62 stores the phrase voice data, phrase code, and indeterminate flag of the phrase “gasoline” for the item “item”, those of the phrase “2000 yen” for the item “amount”, and those of the phrase “cash” for the item “settlement method”.
[0046]
In this case, the phrase “gasoline” is an indeterminate phrase (indeterminate flag = 1). Therefore, the fixed phrase query guidance data “Is the gasoline regular or high-octane?” shown in FIG. 8D is created from the guidance data “Is the gasoline ... or ...?”, which prompts voice re-input for confirming the type of gasoline, and from the phrases “regular” and “high-octane” corresponding to the link phrase codes of “gasoline”. This fixed phrase query guidance data is output to the voice generation unit 18, converted into voice, and output from the speaker 12.
[0047]
Suppose the customer who hears this guidance utters, for example, “regular”. Then, as shown in FIG. 8(e), the voice data “regular” is recognized by the voice recognition unit 17 and input to the voice input confirmation unit 14, which executes the phrase extraction process on it. As a result, the phrase “regular” representing the type of gasoline is extracted from the voice data. The phrase voice data and phrase code of the phrase “regular” are then added to the phrase memory 61, as shown in FIG. 8.
[0048]
Next, the voice input confirmation unit 14 executes the phrase item determination process again. As a result, as shown in FIG. 8G, the phrase voice data, phrase code, and indeterminate flag of the phrase “regular” overwrite the data stored for the item “item” in the item-specific memory 62.
[0049]
As a result, no indeterminate phrase remains, so the voice input confirmation unit 14 executes the confirmation condition determination process. In this case, the item-specific memory 62 holds data of phrases belonging to the items “item”, “amount”, and “settlement method”, so the pattern is recognized as matching the condition number P2. Since no guidance number is set for this pattern, that is, the pattern satisfies the confirmation condition, voice input confirmation data is created from the phrase codes of “regular”, “2000 yen”, and “cash” and transferred to the slip issuing device 2.
[0050]
The slip issuing device 2 recognizes and processes the first phrase code of the voice input confirmation data as data indicating the type of gasoline, the next phrase code as data indicating the amount of fuel to be supplied, and the last phrase code as data indicating the settlement method. Slip issuance is therefore processed normally.
[0051]
Further, for example, it is assumed that another customer utters the content of the order as “Regular gasoline with a card”. Then, as shown in FIG. 9A, the voice data “Regular gasoline with a card” is recognized by the voice recognition unit 17 and input to the voice input confirmation unit 14.
[0052]
In the voice input confirmation unit 14, a phrase extraction process is executed for the voice data “Regular gasoline with a card”. As a result, the phrase “regular gasoline” representing the type of gasoline and the phrase “card” representing the payment method are extracted from the voice data. Then, as shown in FIG. 9B, the phrase audio data and the phrase code of each phrase are stored in the phrase memory 61 in the order of extraction.
[0053]
Next, the voice input confirmation unit 14 executes the phrase item determination process. As a result, as shown in FIG. 9C, the item-specific memory 62 stores the phrase voice data, phrase code, and indeterminate flag of the phrase “regular gasoline” for the item “item”, and those of the phrase “card” for the item “settlement method”.
[0054]
In this case, the phrase “card” is an indeterminate phrase (indeterminate flag = 1). Therefore, the fixed phrase query guidance data “Is the card credit or debit?” shown in FIG. 9D is created from the guidance data “Is the card ... or ...?” and from the phrases “credit” and “debit” corresponding to the link phrase codes of “card”. This fixed phrase query guidance data is output to the voice generation unit 18, converted into voice, and output from the speaker 12.
[0055]
Suppose the customer who hears this guidance utters, for example, “It is a credit card”. Then, as shown in FIG. 9(e), the voice data “It is a credit card” is recognized by the voice recognition unit 17 and input to the voice input confirmation unit 14, which executes the phrase extraction process on it. As a result, the phrase “credit card” representing the settlement method is extracted from the voice data. The phrase voice data and phrase code of the phrase “credit card” are then added to the phrase memory 61, as shown in FIG. 9.
[0056]
Next, the voice input confirmation unit 14 executes the phrase item determination process again. As a result, the phrase voice data, phrase code, and indeterminate flag of the phrase “credit card” overwrite the data stored for the item “settlement method” in the item-specific memory 62, as shown in FIG. 9.
[0057]
As a result, no indeterminate phrase remains, so the voice input confirmation unit 14 executes the confirmation condition determination process. In this case, the item-specific memory 62 holds data of phrases belonging to the items “item” and “settlement method”, so the pattern is recognized as matching the condition number P5. Since the guidance number [101] is set for this pattern, that is, the pattern does not satisfy the confirmation condition, the missing item phrase query guidance data “Please enter the planned refueling quantity or amount” is created from the guidance data of the guidance number [101], as shown in FIG. 9. This missing item phrase query guidance data is output to the voice generation unit 18, converted into voice, and output from the speaker 12.
[0058]
Suppose the customer who hears this guidance utters, for example, “full tank”. Then, as shown in FIG. 9(i), the voice data “full tank” is recognized by the voice recognition unit 17 and input to the voice input confirmation unit 14, which executes the phrase extraction process on it. As a result, the phrase “full tank” representing the amount of fuel to be supplied is extracted from the voice data. Then, as shown in FIG. 9(j), the phrase voice data and phrase code of the phrase “full tank” are added to the phrase memory 61.
[0059]
Next, the voice input confirmation unit 14 executes the phrase item determination process again. As a result, as shown in FIG. 9K, the item-specific memory 62 stores the phrase voice data, phrase code, and indeterminate flag of the phrase “full tank” for the item “quantity”. Since there is no indeterminate phrase, the voice input confirmation unit 14 then executes the confirmation condition determination process. In this case, the item-specific memory 62 holds data of phrases belonging to the items “item”, “quantity”, and “settlement method”, so the pattern is recognized as matching the condition number P1. Since no guidance number is set for this pattern, that is, the pattern satisfies the confirmation condition, voice input confirmation data is created from the phrase codes of “regular gasoline”, “full tank”, and “credit card” in that order and transferred to the slip issuing device 2.
[0060]
In this case as well, the slip issuing device 2 recognizes and processes the first phrase code of the voice input confirmation data as data indicating the type of gasoline, the next phrase code as data indicating the amount of fuel to be supplied, and the last phrase code as data indicating the settlement method, so slip issuance is processed normally.
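The second operation example can be replayed end to end with a compact text-based simulation. This is a sketch under several assumptions: recognized text replaces voice data, phrase text replaces phrase codes, the class `VoiceInputSession` and its tables are hypothetical, only the guidance rows needed here are included, and the fallback prompt “Please repeat your order.” is invented for patterns the sketch does not cover.

```python
ITEM_OF = {
    "regular gasoline": "item", "regular": "item", "high-octane": "item",
    "gasoline": "item", "full tank": "quantity", "2000 yen": "amount",
    "cash": "settlement method", "credit card": "settlement method",
    "card": "settlement method",
}
# Indeterminate phrases and the query built from their link phrases.
QUERIES = {
    "gasoline": "Is the gasoline regular or high-octane?",
    "card": "Is the card credit or debit?",
}
# Item sets that satisfy the confirmation condition (P1 and P2).
CONFIRMED = {
    frozenset({"item", "quantity", "settlement method"}),
    frozenset({"item", "amount", "settlement method"}),
}
# Missing-item guidance keyed by the items already filled (rows for 100 and 101 only).
MISSING = {
    frozenset({"item", "quantity"}): "What is the settlement method?",
    frozenset({"item", "amount"}): "What is the settlement method?",
    frozenset({"item", "settlement method"}):
        "Please enter the planned refueling quantity or amount.",
}


class VoiceInputSession:
    def __init__(self):
        self.item_memory = {}  # stands in for the item-specific memory 62

    def hear(self, utterance):
        """Process one utterance; return ("ask", text) or ("confirmed", data)."""
        text = utterance.lower()
        phrases = sorted(ITEM_OF, key=len, reverse=True)
        position = 0
        while position < len(text):
            for phrase in phrases:
                if text.startswith(phrase, position):
                    self.item_memory[ITEM_OF[phrase]] = phrase  # last input wins
                    position += len(phrase)
                    break
            else:
                position += 1
        for phrase in self.item_memory.values():
            if phrase in QUERIES:                 # an indeterminate phrase remains
                return ("ask", QUERIES[phrase])
        filled = frozenset(self.item_memory)
        if filled in CONFIRMED:
            memory = self.item_memory
            volume = memory.get("quantity") or memory.get("amount")
            data = [memory["item"], volume, memory["settlement method"]]
            self.item_memory = {}                 # back to the initial state
            return ("confirmed", data)
        # Hypothetical fallback for patterns not listed in MISSING.
        return ("ask", MISSING.get(filled, "Please repeat your order."))


session = VoiceInputSession()
first = session.hear("Regular gasoline with a card")
second = session.hear("It is a credit card")
third = session.hear("Full tank")
```

The three calls follow the dialogue above: the first returns the card query, the second returns the prompt for the refueling quantity or amount, and the third returns the confirmed data in the order gasoline type, quantity, settlement method.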
[0061]
As described above, according to the present embodiment, the order in which items are input is not restricted when the refueling order is input to the slip issuing device 2. The device is therefore easy to use, and an unspecified number of customers can input their orders to the slip issuing device 2 by voice themselves. As a result, the clerk's workload is reduced, input errors are eliminated, and the clerk no longer has to go back and forth between the car and the slip issuing device to enter the customer's order by hand, so work efficiency improves.
[0062]
In the present embodiment, when the confirmation condition is not satisfied by the phrases input first, guidance prompting voice re-input of phrases having the meanings of the missing items is generated automatically and output to the user by voice. If the confirmation condition is then satisfied by the phrases input subsequently, the voice input data is confirmed. Therefore, even if an item is omitted when the order is input by voice, only that item needs to be input again, so anyone can input an order easily and correctly by voice.
[0063]
In addition, if the order contains an indeterminate phrase that cannot confirm the order contents, guidance prompting voice re-input for confirming that phrase is generated automatically and output to the user by voice. If the indeterminate phrase is settled by a phrase input subsequently, the subsequently input phrase is validated and the voice input data is confirmed. Therefore, even if the type of gasoline cannot be specified because, for example, the customer said only “gasoline”, the type can be confirmed by the subsequent voice input, so anyone can input an order easily and correctly by voice.
[0064]
Furthermore, in this embodiment, when the phrases extracted from the voice data and stored in the phrase memory 61 include several phrases having the meaning of the same item, the phrase input last by voice is validated and stored in the item-specific memory 62. Therefore, when, for example, a customer states the wrong type of gasoline and then corrects it, the corrected phrase takes effect, which is also convenient.
[0065]
Although in this embodiment the present invention is applied as an order input device for the slip issuing device of a gas station refueling system, the information processing apparatuses to which it can be applied are not limited to this. It can also be used as an order input device for a ticket issuing device for air tickets, passenger tickets, and the like, or for an order registration device in a restaurant or fast food restaurant. For example, when it is used as an input device for an air ticket issuing device, departure date, departure place, destination, airline, seat (non-smoking seat, window side, aisle side, etc.), and the like are set as the items that include the meanings of phrases.
[0066]
In addition, when used as an input device for an order registration apparatus at a fast food restaurant, the item name, quantity, settlement method, and the like are set as items including the meaning of the phrase.
[0067]
In this embodiment, the fixed phrase query guidance data and the missing item phrase query guidance data are output as voice. However, similar effects are obtained if a display unit is connected to the voice input confirmation unit 14 and the guidance data is displayed on its screen.
[0068]
In this embodiment, the function of the voice input device is realized by the CPU 141 executing the voice input program stored in the ROM 142. However, the function may instead be realized by downloading the voice input program from an external device to the HDD device 145 via the interface 146 and having the CPU 141 execute it.
[0069]
[Effect of the Invention]
As described above in detail, the voice input device, voice input method, and voice input program of the present invention allow order contents to be input and processed correctly without being bound by the order of voice input, and allow anyone to input order contents easily and correctly.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a configuration of a voice input device according to an embodiment of the present invention.
FIG. 2 is a block diagram showing the configuration of the voice input confirmation unit in FIG. 1.
FIG. 3 is a view showing main data files and a data table stored in the HDD device of the voice input confirmation unit.
FIG. 4 is a view showing the record configuration of each data file shown in FIG. 3.
FIG. 5 is a diagram showing a data configuration of a data table shown in FIG. 3;
FIG. 6 is a view showing main memory areas formed in the RAM of the voice input confirmation unit.
FIG. 7 is a flowchart showing a procedure of main program processing executed by the CPU of the voice input confirmation unit.
FIG. 8 is a diagram showing an example of data in an example of the operation of the voice input confirmation unit.
FIG. 9 is a diagram showing an example of data in another example of the operation of the voice input confirmation unit.
[Explanation of symbols]
1 ... Voice input device
2 ... Slip issuing device
11 ... Microphone
12 ... Speaker
13 ... Speech engine
14 ... Voice input confirmation unit
15 ... Voice input switch
31 ... Vocabulary file
32 ... Dictionary file
33 ... Guidance file
34 ... Confirmation condition table

Claims

A voice input device comprising:
phrase extraction means for extracting, one by one, phrases registered in advance from the contents of an order input by voice;
phrase item determination means for determining, for each phrase extracted by the phrase extraction means, the item that includes the meaning of the phrase;
phrase storage means for storing, for each item, each phrase whose item has been determined by the phrase item determination means;
confirmation condition judging means for judging whether or not the item combination pattern of the phrases stored by the phrase storage means is a pattern satisfying an order confirmation condition consisting of the items of type, quantity, and settlement method of the ordered goods;
order confirmation means for confirming each phrase stored by the phrase storage means as voice input order data when the confirmation condition judging means judges that the item combination pattern of the phrases is a pattern satisfying the order confirmation condition; and
means for transferring the voice input order data confirmed by the order confirmation means to a slip issuing device that issues a settlement slip.

2. The voice input device according to claim 1, wherein the confirmation condition judging means extracts a matching pattern from a plurality of preset item combination patterns based on the items of the phrases stored in the phrase storage means, and judges whether or not the extracted pattern is a pattern satisfying the order confirmation condition consisting of the items of type, quantity, and settlement method of the ordered goods.

3. The voice input device according to claim 1, further comprising missing phrase prompting means for prompting voice re-input of phrases having the meanings of the items that are missing for the order confirmation condition when the confirmation condition judging means judges that the item combination pattern of the phrases is not a pattern satisfying the order confirmation condition,
wherein, when the item including the meaning of a phrase extracted by the phrase extraction means after the missing phrase prompting means has prompted voice re-input is an item that was missing for the order confirmation condition, the order confirmation means confirms that phrase and the phrases stored by the phrase storage means as voice input order data.

The voice input device according to any one of the above, further comprising confirmed phrase prompting means for prompting voice re-input for confirming an indeterminate phrase when the phrases extracted by the phrase extraction means include such a phrase,
wherein the phrase storage means stores, in place of the indeterminate phrase, the phrase extracted by the phrase extraction means after the confirmed phrase prompting means has prompted voice re-input.

5. The voice input device according to any one of the above, wherein, when the phrases extracted by the phrase extraction means include a plurality of phrases having the meaning of the same item, the phrase storage means stores the phrase most recently input by voice.

A voice input method comprising: a step of extracting, one by one, phrases registered in advance from the contents of an order input by voice; a step of determining, for each extracted phrase, the item that includes the meaning of the phrase; a step of judging whether or not the item combination pattern of the extracted phrases is a pattern satisfying an order confirmation condition consisting of the items of type, quantity, and settlement method of the ordered goods; a step of confirming each phrase as voice input order data when the item combination pattern of the extracted phrases is judged to be a pattern satisfying the order confirmation condition; and a step of transferring the confirmed voice input order data to a slip issuing device that issues a settlement slip.

The voice input method according to claim 6, wherein, in the judging step, a matching pattern is extracted from a plurality of preset item combination patterns based on the items of the extracted phrases, and it is judged whether or not the extracted pattern is a pattern satisfying the order confirmation condition consisting of the items of type, quantity, and settlement method of the ordered goods.

The voice input method according to claim 6 or 7, further comprising a step of prompting voice re-input of phrases having the meanings of the items that are missing for the order confirmation condition when the item combination pattern of the extracted phrases is judged not to be a pattern satisfying the order confirmation condition.

The voice input method according to any one of claims 6 to 8, further comprising a step of prompting voice re-input for confirming an indeterminate phrase when the extracted phrases include such a phrase.

A voice input program for causing a computer that has a voice input function and is connected to a slip issuing device that issues a settlement slip to realize: a function of extracting, one by one, phrases registered in advance from the contents of an order input by voice; a function of determining, for each extracted phrase, the item that includes the meaning of the phrase; a function of judging whether or not the item combination pattern of the extracted phrases is a pattern satisfying an order confirmation condition consisting of the items of type, quantity, and settlement method of the ordered goods; a function of confirming each phrase as voice input order data when the item combination pattern of the extracted phrases is judged to be a pattern satisfying the order confirmation condition; and a function of transferring the confirmed voice input order data to the slip issuing device.

The voice input program according to claim 10, wherein the judging function extracts a matching pattern from a plurality of preset item combination patterns based on the items of the extracted phrases, and judges whether or not the extracted pattern is a pattern satisfying the order confirmation condition consisting of the items of type, quantity, and settlement method of the ordered goods.

The voice input program according to claim 10 or 11, further causing the computer to realize a function of prompting voice re-input of phrases having the meanings of the items that are missing for the order confirmation condition when the item combination pattern of the extracted phrases is judged not to be a pattern satisfying the order confirmation condition.

The voice input program according to any one of claims 10 to 12, further causing the computer to realize a function of prompting voice re-input for confirming an indeterminate phrase when the extracted phrases include such a phrase.