JP2004061645A - Data input device and program

Info

Publication number: JP2004061645A
Application number: JP2002216921A
Authority: JP
Inventors: Tsutomu Sasaki; 佐々木　勉
Original assignee: Casio Computer Co Ltd
Current assignee: Casio Computer Co Ltd
Priority date: 2002-07-25
Filing date: 2002-07-25
Publication date: 2004-02-26

Abstract

PROBLEM TO BE SOLVED: When a candidate arbitrarily selected from an input candidate group is to be input as data, to allow the desired candidate to be selected and input not only by key operation or the like but also by recognizing the operator's speech.

SOLUTION: When speech is input from a speech input device 7 while a candidate selection type input item, whose data is a candidate arbitrarily selected and designated from the input candidate group in an input candidate table, is designated as the input target, a CPU 1 limits the recognition targets for the input speech to the input candidate group corresponding to that input item and recognizes the speech by referring to the group of speech recognition patterns associated with that input candidate group. The CPU 1 then inputs the closest candidate in the input candidate group as the data for the input item.

COPYRIGHT: (C)2004,JPO

Description

【０００１】
【発明の属する技術分野】
この発明は、予め設定されている入力候補群の中から任意に選択された候補を入力データとする候補選択型の入力領域に対してデータ入力を行うデータ入力装置およびプログラムに関する。
【０００２】
【従来の技術】
従来、例えば、企業間取引用の伝票あるいは企業内通達用の社内伝票等を作成する伝票作成業務を行うデータ処理装置において、この伝票作成用の入力画面には、伝票の種類に応じて複数の入力項目が配置されているが、各入力項目は、そのデータ入力形式にしたがって候補選択型の入力項目と任意入力型の入力項目とに分類されている。候補選択型の入力項目は、例えば、リストボックス、コンボボックスと呼ばれる項目領域で、予め設定されている入力候補群の中から任意に選択された候補を入力データとする入力項目であるのに対し、任意入力型の入力項目は、例えば、テキストボックスと呼ばれる項目領域で、任意のデータを入力する入力項目である。
【０００３】
この候補選択型の入力項目に対してデータを入力する為にその入力項目が入力対象として指定されると、その入力項目に対応して予め用意されている索引テーブル、例えば、得意先テーブル、商品テーブル等をアクセスして、得意先候補群、商品候補群等を一覧表示させ、このリストの中から希望する候補をキーあるいはマウス等のポインティングデバイスを使用して選択指定すると、選択指定された候補が対応項目のデータとして入力される。また、任意入力型の入力項目に対してデータを入力する場合には、任意の文字列をキーボードから直接入力することによって行われる。
このような候補選択型の入力項目においては、任意入力型の入力項目に比べて、目的候補を選択するだけでデータ入力が可能となる為に、入力操作が容易であり、希望するデータを簡単かつ確実に入力することが可能となる。
【０００４】
【発明が解決しようとする課題】
しかしながら、この候補選択型の入力項目に対してデータを入力する場合でも、例えば、その入力候補群が多い場合には、リスト表示の内容をスクロールしながら目的候補を探し出す作業を必要とする為に目的候補を探し出すまでに時間がかかり、また、類似する候補が多数存在している場合には、候補選択時に誤認混同を招くおそれがある等、必ずしも効率の良い入力形式とは言えず、かえって入力に手間を要する場合もあった。
このことは、伝票作成用の入力画面にデータを入力する場合に限らず、ＧＵＩ等のユーザ・インターフェイス用の入力画面、例えば、プルダウンメニュー画面あるいはダイアログ・ボックス画面等と呼ばれている入力画面において、その入力画面内に一覧表示されている入力候補群（アプリケーション群／コマンド群）の中から目的候補を選択する場合においても同様であった。
【０００５】
この課題は、予め設定されている入力候補群の中から任意に選択された候補を入力データとする場合に、入力候補群の中から希望する候補をキーあるいはマウス等のポインティングデバイスによって選択指定して入力する入力形式の他に、オペレータからの音声を認識することによっても所望する候補を入力候補群の中から任意に選択して入力できるようにすることである。
【０００６】
【課題を解決するための手段】
請求項１記載の発明は、予め設定されている入力候補群の中から任意に選択指定された候補を入力データとする候補選択型の入力領域に対してデータ入力を行うデータ入力装置において、前記候補選択型の入力領域に対応して設定されている前記入力候補群を音声認識する為の音声認識用データ群を記憶する音声データ記憶手段と、入力対象として指定された前記候補選択型の入力領域に対して音声入力された場合に、その入力音声を認識する対象を当該入力領域対応の入力候補群に限定し、この入力候補群に対応付けられている前記音声認識用データ群を参照して入力音声を認識する音声認識手段と、この音声認識手段によって認識された認識結果に基づいて前記入力候補群のうち最も近似している候補を当該入力領域に対するデータとして入力する入力音声処理手段とを具備するものである。
更に、コンピュータに対して、上述した請求項１記載の発明に示した主要機能を実現させるためのプログラムを提供する（請求項６記載の発明）。
【０００７】
したがって、請求項１、６記載の発明は、予め設定されている入力候補群の中から任意に選択指定された候補を入力データとする候補選択型の入力領域が入力対象として指定され、かつ候補選択型の入力領域に対して音声入力された場合に、その入力音声を認識する対象を当該入力領域対応の入力候補群に限定し、この入力候補群に対応付けられている音声認識用データ群を参照して入力音声を認識し、入力候補群のうち最も近似している候補を当該入力領域に対するデータとして入力するようにしたから、候補選択型の入力領域に対するデータ入力を行う場合に、入力候補群の中から希望する候補をキーあるいはマウス等のポインティングデバイスによって指定して選択する方法の他に、オペレータからの音声によっても希望する候補を選択入力することができ、例えば、その入力候補群が多い場合や類似する候補が多数存在している場合には、音声入力で対処することができ、使い勝手が良いデータ入力方式を選ぶことで、入力効率の向上を期待することが可能となると共に、入力音声を認識する対象をその入力領域対応の入力候補群に限定するようにしている為に、認識対象の絞込みが可能となり、認識率が良くなり、入力効率を一層向上させることが可能となる。
【０００８】
なお、請求項１記載の発明は次のようなものであってもよい。
予め用意されている音声認識用辞書を参照することによって、前記候補選択型の入力領域に対応して設定されている入力候補群を音声認識用データ群に逆変換する変換手段を設け、前記音声データ記憶手段は、前記変換手段によって得られた音声認識用データ群を記憶する（請求項２記載の発明）。
【０００９】
したがって、請求項２記載の発明によれば、請求項１記載の発明と同様の効果を有する他に、予め用意されている音声認識用辞書を参照することによって、候補選択型の入力領域に対応して設定されている入力候補群を音声認識用データ群に逆変換して記憶するようにしたから、候補選択型の入力領域に対応する入力候補群を作成して設定しておくだけで、この入力候補群対応の音声認識用データ群（音声認識用辞書）を自動作成することが可能となり、入力環境の設定作業等が容易となる。なお、日常業務の変更等に伴って入力候補の追加、削除、修正が頻繁に行われる場合があるが、入力候補群が更新される毎に、更新後の入力候補群に基づいて音声認識用辞書の内容も更新すればよい。
【００１０】
予め設定されている入力候補群の中から任意に選択された候補を入力データとする候補選択型の入力項目と、任意のデータを入力する任意入力型の入力項目のうち、入力対象として指定された項目の種類を判別する判別手段を設け、この判別手段によって任意入力型の入力項目が入力対象として指定されたことが判別され、かつ、この任意入力型の入力項目に対して音声入力された場合に、前記音声認識手段は、予め用意されている音声認識用辞書を参照することによって入力音声を認識する（請求項３記載の発明）。
【００１１】
したがって、請求項３記載の発明によれば、請求項１記載の発明と同様の効果を有する他に、候補選択型の入力項目と、任意入力型の入力項目のうち、任意入力型の入力項目が入力対象として指定されると共に、この任意入力型の入力項目に対して音声入力された場合には、予め用意されている音声認識用辞書を参照することによって音声認識するようにしたから、例えば、伝票入力画面のように、候補選択型の入力項目と任意入力型の入力項目とが混在している場合、入力形式を変更することなく、任意入力型の入力項目に対しても音声入力によって任意のデータを入力することができると共に、入力対象項目の種類に応じた音声認識が可能となる。
この場合、単語認識に限らず、前後の文脈、構文解析等の言語処理も合わせて行うようにすれば、認識率を向上させ、任意の入力音声を確実に所望するデータに変換することが可能となる。つまり、候補選択型の入力項目に対する音声認識の場合には、単語認識で十分対応可能であるが、任意入力型の入力項目に対する音声認識の場合には、言語処理も含めたより高度な認識方式に自動切り替えを行って対応することもできる。
【００１２】
任意のデータを入力する任意入力型の入力項目に対応して設定されている入力属性に応じて予め用意されている入力属性別の音声認識用辞書を記憶する辞書記憶手段と、前記任意入力型の入力項目に対して音声入力された場合に、その入力項目に対応して設定されている入力属性を判別する判別手段とを設け、前記音声認識手段は、前記属性別の音声認識用辞書の中から前記判別手段によって判別された入力属性に対応する音声認識用辞書を選択指定すると共に、指定した音声認識用辞書を参照することによって入力音声を認識する（請求項４記載の発明）。
【００１３】
したがって、請求項４記載の発明によれば、請求項１記載の発明と同様の効果を有する他に、任意入力型の入力項目に対して音声入力された場合に、この任意入力型の入力項目に対応して設定されている入力属性（例えば、数値、文字、日付、記号等のうち、何れのデータ種を入力するかを示す為の入力属性）を判別し、予め入力属性別に用意されている音声認識用辞書を参照して入力音声を認識するようにしたから、比較的単純な単語認識方式を採用したとしても、専用辞書を使用した音声認識によってその認識率を大幅に向上させることができる。
【００１４】
前記音声認識手段が入力音声を認識した結果、入力音声を認識することができなかった場合には、音声入力以外の入力デバイスによって前記入力候補群の中から任意に選択指定された候補を入力データとする入力形式に移行する（請求項５記載の発明）。
したがって、請求項５記載の発明によれば、請求項１記載の発明と同様の効果を有する他に、入力音声を認識することができなかった場合には、音声以外の入力デバイスを使用する入力方式に直ちに切り換えることができるので、その切り換えをオペレータが行う必要はなく、データの入力作業を効率良く行うことが可能となる。
【００１５】
【発明の実施の形態】
以下、図１〜図８を参照してこの発明の一実施形態を説明する。
図１は、この実施形態におけるデータ入力装置の全体構成を示したブロック図である。
このデータ入力装置は、例えば、企業間取引伝票あるいは企業内通達用の社内伝票等を作成する伝票作成業務を行うパーソナルコンピュータ等であり、伝票データを入力するキーボードやマウス等のポインティングデバイスの他に、音声入力用のマイクロフォーンを有し、このマイクロフォーンから入力された入力音声を認識することによって伝票データを構成する各項目データを入力する音声認識機能を備えている。
【００１６】
すなわち、このデータ入力装置は、伝票を構成する各入力項目に対してデータを入力する場合に、キーあるいはマウス等のポインティングデバイスを使用してデータを入力する通常の入力形式の他に、この実施形態においては、オペレータからの入力音声を認識することによっても伝票データを構成する各項目データの入力を可能としたものである。
なお、この実施形態の特徴部分を詳述する前に、この実施形態のハードウェア上の構成について以下、説明しておく。
【００１７】
ＣＰＵ１は、記憶装置２内のオペレーティングシステムや各種アプリケーションソフトにしたがってこのデータ処理装置の全体動作を制御する中央演算処理装置である。記憶装置２は、プログラム記憶領域とデータ記憶領域とを有し、このプログラム記憶領域内には、オペレーティングシステムの他に、特に、伝票データを作成する伝票作成用のアプリケーションと共に、伝票作成を音声認識によって行う音声認識用のアプリケーション等、各種処理プログラムが格納され、また、データ記憶領域には、後述する各種の音声認識用辞書等が格納され、磁気的、光学的、半導体メモリ等やその駆動系によって構成されている。
【００１８】
この記憶装置２はハードディスク等の固定的なメモリの他、ＣＤ−ＲＯＭ、ＤＶＤ等の着脱自在な記憶媒体を装着可能な構成であってもよい。この記憶装置２内のプログラムやデータは、必要に応じてＲＡＭ（例えば、スタティックＲＡＭ）３にロードされたり、ＲＡＭ３内のデータが記憶装置２にセーブされる。なお、ＲＡＭ３内には、プログラム実行領域とワーク域とを有している。
更に、ＣＰＵ１は、通信装置４を介して他の電子機器側のプログラム／データを直接アクセスして使用したり、通信装置４を介してダウンロード受信することもできる。通信装置４は、例えば、通信モデムや赤外線モジュールあるいはアンテナ等を含む有線／無線の通信インターフェイスである。
【００１９】
一方、ＣＰＵ１にはその入出力周辺デバイスである入力装置５、表示装置６、音声入力装置７がバスラインを介して接続されており、入出力プログラムにしたがってＣＰＵ１はそれらの動作を制御する。入力装置５はキーボードやタッチパネルあるいはマウスやタッチ入力ペン等のポインティングデバイスを構成する操作部であり、文字列データや各種コマンドを入力する。表示装置６は、フルカラー表示を行う液晶やＣＲＴあるいはプラズマ表示装置などである。
音声入力装置７は、マイクロフォーン、Ａ／Ｄ変換器等からなり、入力された音声波形をＡ／Ｄ（アナログ／デジタル）変換すると共に、その変換結果を解析することによってその特徴情報を抽出して入力音声をデータ化し、ＣＰＵ１に与える。ＣＰＵ１は、この入力音声データを音声認識し、その認識結果を伝票の項目データとして対応項目に入力する音声入力処理を行う。
【００２０】
図２は、社内伝票を作成する伝票入力画面の構成を例示した図である。
この伝票入力画面には、部門コードを入力する項目Ａと、部門名称を入力する項目Ｂと、コメントを入力する項目Ｃと、担当者名を入力する項目Ｄ等の各種の入力項目を有する構成となっており、各入力項目内に該当データを入力することによって伝票作成が行われる。この場合、伝票入力画面内の各入力項目は、予め決められている順序にしたがって入力対象として１項目毎に順次指定されると共に、指定項目に対して入力されたデータは、当該項目内に配置表示される。
なお、伝票入力画面を構成する部門コード入力項目Ａ、部門名称入力項目Ｂ、コメント入力項目Ｃ、担当者名入力項目Ｄのうち、部門コード入力項目Ａおよびコメント入力項目Ｃと、部門名称入力項目Ｂおよび担当者名入力項目Ｄとでは、その入力形式がそれぞれ相違している。
【００２１】
すなわち、部門コード入力項目Ａおよびコメント入力項目Ｃは、任意のデータを入力する任意入力型の入力項目（テキストボックスと呼ばれる項目領域）であるのに対し、部門名称入力項目Ｂおよび担当者名入力項目Ｄは、予め設定されている入力候補群の中から任意に選択された候補を入力データとする候補選択型の入力項目（リストボックス、コンボボックスと呼ばれる項目領域）であり、この部門名称入力項目Ｂ、担当者名入力項目Ｄが入力対象として指定された際には、伝票入力画面内に当該入力項目に対応してプルダウンメニューをオープンして、入力候補群を一覧表示させる。
【００２２】
図３（Ａ）は、伝票入力画面内の各入力項目に対応して、その入力形式等を定義する入力項目テーブル１０の内容を示した図である。
この入力項目テーブル１０は、伝票入力画面内の各入力項目に対応して、その項目定義情報が設定されるテーブルで、入力項目毎に、その定義情報として、「型」、「属性」、「リンク先」が設定されている。この場合、各入力項目は、予め決められた並び順に記憶されており、ＣＰＵ１は、入力項目テーブル１０に設定されている各入力項目をその並び順にしたがって１項目ずつ入力対象として順次指定するようにしている。この入力項目テーブル１０内の「型」は、入力項目の入力形式を示すもので、“任意”は、任意入力型の入力項目を示し、“候補”は、候補選択型の入力項目であることを示している。
【００２３】
入力項目テーブル１０内の「属性」は、任意入力型の入力項目において、入力すべきデータの種別を示すもので、部門コード入力項目Ａ対応の「属性」には、“数値”が設定されており、部門コードを数値列によって入力すべきことを示し、また、コメント入力項目Ｃ対応の「属性」には、“文字”が設定されており、コメントを文字列によって入力すべきことを示している。
「リンク先」は、候補選択型の入力項目において、入力候補群が設定されている索引テーブルをアクセスする為のテーブル名であり、部門名称入力項目Ｂ対応の「リンク先」には、入力候補テーブル（部門テーブル）１１のテーブル名“Ｆ１”が設定され、また、担当者名入力項目Ｄ対応の「リンク先」には、入力候補テーブル（担当者テーブル）１２のテーブル名“Ｆ２”が設定されている。
【００２４】
入力候補テーブル（部門テーブル）１１は、部門名称入力項目Ｂに対する入力候補群として予め任意に設定された各種の「部門名称」を記憶する索引テーブルであり、また、入力候補テーブル（担当者テーブル）１２は、担当者名入力項目Ｄに対する入力候補群として予め任意に設定された各種の「担当者名」を記憶する索引テーブルである。なお、各入力候補テーブル１１、１２は、各種の「入力候補（部門名称／担当者名）」に対応して、その「入力候補番号」を記憶する構成となっている。
図３（Ｂ）は、上述した項目定義情報にしたがって入力された１レコード分（１伝票分）の伝票データを示した図で、この１伝票分の入力レコードがＣＰＵ１に取り込まれると、ＣＰＵ１は、この入力レコード（伝票レコード）を伝票毎に伝票ファイル（図示せず）内に記憶管理すると共に、この伝票ファイルに基づいて伝票発行処理や伝票集計処理等を行う。
【００２５】
図４（Ａ）は、候補選択型の入力項目である部門名称入力項目Ｂ／担当者名入力項目Ｄに対して、その項目データが音声入力された場合に、その入力音声と比較して音声認識する為の音声データ（音声認識用パターン）が記憶される比較用メモリ１３の内容を示した図である。
この比較用メモリ１３は、候補選択型の入力項目に対する音声認識用の辞書メモリであり、部門名称入力項目Ｂ／担当者名入力項目Ｄが入力対象として選択指定される毎に作成記憶されたものである。
【００２６】
この場合、ＣＰＵ１は、部門名称入力項目Ｂ／担当者名入力項目Ｄが入力対象として選択指定された際に、その入力項目にリンクされている入力候補テーブル（部門テーブル／担当者テーブル）１１、１２をアクセスし、その入力候補群を音声認識用パターン群に変換することによって比較用メモリ１３が作成されて、ＲＡＭ３内のワーク域内にセットされる。その際、後で詳述するが、入力候補を構成する各文字コードに基づいて構文解析および意味解析を行い、その解析結果に基づいて音声認識用辞書を逆変換することによって各入力候補対応の音声データを得るようにしている。
この比較用メモリ１３は、入力候補毎に、その「音声データ（標準パターン）」と、その「入力候補番号」とを記憶する構成となっており、この「入力候補番号」を介して比較用メモリ１３と各入力候補テーブル１１、１２とが対応付けられている。
【００２７】
図４（Ｂ）は、ＲＡＭ３内のワーク域に一時記憶される音声入力フラグ１４、属性フラグ１５のセット状態を示した図である。
この音声入力フラグ１４は、伝票入力時において、その各入力項目に対して音声入力も可能な状態であることを示し、オペレータ操作によって任意にセット／リセットされる入力状態フラグである。
属性フラグ１５は、任意入力型の入力項目である部門コード入力項目Ａ／コメント入力項目Ｃが入力対象として選択指定された際に、その項目の「属性」に応じて自動的にセットされる文字／数値の指定フラグであり、「属性」が文字の場合には、“１”がセットされ、数値の場合には、“０”がセットされる。
【００２８】
図５（Ａ）は、音声入力フラグ１４がセットされている状態において、任意入力型の入力項目である部門コード入力項目Ａ／コメント入力項目Ｃが入力対象として選択指定された際に、その項目の「属性」に応じて選択指定される一般音声認識用辞書１６、数値音声認識用辞書１７を示した図である。
すなわち、属性フラグ１５がフラグ“１”（文字）の場合には、一般音声認識用辞書１６を用いた文字認識を行い、また、フラグ“０”（数値）の場合には、数値音声認識用辞書１７を用いた文字認識を行う為に、属性毎に異なる認識用辞書を指定するようにしている。なお、この例では、属性として、文字、数値のみを示したが、他の属性があれば、当該他の属性に対応付けられている認識用辞書が指定される。
【００２９】
一般音声認識用辞書１６は、伝票入力として一般的に使用される可能性がある単語を含めた各種の音声データ（標準パターン）と、そのコード情報とを対応付けて記憶する構成となっており、伝票入力用としても使用可能な一般音声認識用の辞書であるのに対し、数値音声認識用辞書１７は、数値に対応してその音声データとそのコード情報とを対応付けた数値専用の音声認識辞書である。
なお、一般音声認識用辞書１６には、伝票入力用として一般的に使用される可能性がある特殊な「用語」、「人名」、「法人名」、「住所」等の音声データも記憶されている。更に、一般音声認識用辞書１６あるいは数値音声認識用辞書１７を使用した音声認識には、対象者を特定しない不特定話者対応の音声認識方式を可能としているが、勿論、対象者を特定する特定話者対応の音声認識方式を採用するようにしてもよい。
【００３０】
図５（Ｂ）は、一般音声認識用辞書１６を用いて文字認識を行う場合の処理系を概念的に示した図である。
この場合、任意入力型の入力項目である部門コード入力項目Ａ／コメント入力項目Ｃが入力対象として選択指定された際に、その項目の「属性」が“文字”であれば、連続音声入力の認識精度を向上させる為に、一般音声認識用辞書１６を用いて入力音声を単語毎に文字認識する一般音声認識処理を実行した後に、構文・意味情報辞書１８を用いて構文解析および意味解析を行う言語処理を実行して認識結果を得るようにしている。
【００３１】
次に、この実施形態における音声認識機能付きデータ入力装置の動作アルゴリズムを図６〜図８に示すフローチャートを参照して説明する。ここで、これらのフローチャートに記述されている各機能は、読み取り可能なプログラムコードの形態で格納されており、このプログラムコードにしたがった動作を逐次実行する。また、伝送媒体を介して伝送されてきた上述のプログラムコードにしたがった動作を逐次実行することもできる。すなわち、記録媒体の他、伝送媒体を介して外部供給されたプログラム／データを利用してこの実施形態特有の動作を実行することもできる。
【００３２】
図６〜図８は、音声認識機能付きデータ入力装置の動作（伝票入力処理）を示したフローチャートである。
先ず、ＣＰＵ１は、伝票作成処理が指定されると、伝票入力用のフォーム情報を読み出して、図２に示した入力画面を表示出力させると共に（ステップＡ１）、伝票データの入力方法として、音声入力をも可能とするかを問い合わせる為のメッセージを入力画面内に案内表示させる（ステップＡ２）。すなわち、キーあるいはマウス等のポインティングデバイスを使用してデータを入力する入力方法の他に、オペレータからの入力音声によってもデータ入力を可能とするかの要否を問い合わせる。この場合、オペレータは、この問合せメッセージにしたがって「ＹＥＳキー」あるいは「ＮＯキー」を操作する。いま、音声入力も可能とする為に「ＹＥＳキー」が操作された場合には（ステップＡ３）、音声入力フラグ１４をセットし（ステップＡ４）、「ＮＯキー」が操作された場合には（ステップＡ３）、音声入力フラグ１４をリセットする（ステップＡ５）。
【００３３】
そして、ＣＰＵ１は、入力画面内の各入力項目のうち、その先頭項目を入力対象として指定する（ステップＡ６）。この場合、入力項目テーブル１０内に設定されている各入力項目の並び順を参照して、その第１項目を入力対象として指定する。この例では、先ず、その第１項目である「部門コード」が入力対象として指定される。その後、この指定項目に対応して入力項目テーブル１０内に設定されている「型」を参照し、指定項目は、“任意入力型の入力項目”あるいは“候補選択型の入力項目”かの判別を行う（ステップＡ７）。
【００３４】
なお、この場合、第１項目である部門コード入力項目Ａは、任意入力型の入力項目である為に、ステップＡ７でそのことが検出されて、図８のフローチャートに移るが、以下、説明の便宜上、つまり、図６〜図８の各フローチャートをその順次にしたがって説明する為に、任意入力型の入力項目が入力対象となった場合の動作説明を行う前に、候補選択型の入力項目が入力対象となった場合の動作を説明しておく。
【００３５】
ここで、候補選択型の入力項目として、例えば、第２の入力項目である部門名称入力項目Ｂが入力対象として指定されたものとすると、ＣＰＵ１は、入力項目テーブル１０を参照し、その指定項目の「リンク先」に基づいて該当する入力候補テーブル１１として、部門テーブルを読み出す（ステップＡ８）。そして、音声入力フラグ１４がセットされているかを調べる（ステップＡ９）。
いま、音声入力フラグ１４がリセットされている場合には、ステップＡ１０に移り、カーソルが指定項目の位置まで移動したか（カーソル指示が行われたか）をチェックし、指定項目がカーソル指示された場合には、当該指定項目に対応してプルダウンメニューをオープンし、入力候補テーブル（部門テーブル）１１の内容（部門名）をリスト表示させる（ステップＡ１１）。
【００３６】
このように入力候補群がリスト表示されている状態において、このリストの中から任意の候補がカーソル指示によって選択された場合には（ステップＡ１２）、カーソル指示の候補を入力候補テーブル（部門テーブル）１１から選択し（ステップＡ１３）、この選択候補を当該入力項目（部門名称入力項目Ｂ）対応のデータとして入力し、この入力項目Ｂ内に選択候補「部門名称」を入力データとして表示させる（ステップＡ１４）。
この状態において、候補確定を指示する所定キー（リターンキー）が操作されたかをチェックし（ステップＡ１５）、候補確定が指示されなければ、ステップＡ１０に戻り、再度の候補選択操作が可能となるが、リターンキーが操作されて候補確定が指示された場合には、この入力データを当該入力項目対応の伝票データとして確定する（ステップＡ１６）。
【００３７】
これによって１項目分の入力が終了すると、現在、着目している指定項目は、当該伝票を構成する最終の項目か、つまり、１伝票分における入力項目の終了かを調べる（ステップＡ１７）。この場合、第２の入力項目に対する入力が終った段階であり、最終項目ではないので、次のステップＡ１８に移り、次の入力項目（第３のコメント入力項目Ｃ）を指定するが、いま、第２の入力項目である部門名称入力項目Ｂ（候補選択型の入力項目）が入力対象として指定されている場合において、上述のステップＡ９で音声入力フラグ１４がセットされていることが検出された場合には、図７のステップＡ２０に移る。
【００３８】
この場合、図７のステップＡ２０〜Ａ２３においては、入力候補テーブル（部門テーブル）１１から読み出した各入力候補に基づいて部門テーブル対応の比較用メモリ１３の内容を作成する処理が行われる。つまり、入力候補テーブル（部門テーブル）１１から読み出した各入力候補を構成する各文字列に基づいて構文・意味情報辞書１８および一般音声認識用辞書１６を参照することによって、各入力候補を音声認識用の音声データに逆変換し、部門テーブル対応の比較用メモリ１３に一時記憶する処理が行われる。
【００３９】
先ず、入力候補を構成する各文字コードに基づいて構文・意味情報辞書１８を参照し、言語処理によって構文解析および意味解析を行い、入力候補に含まれている各単語を特定する（ステップＡ２０）。そして、この単語に基づいて一般音声認識用辞書１６をアクセスし、単語に対応付けられている音声データ（音声標準パターン）を取得して、単語を音声データに逆変換する（ステップＡ２１）。その際、入力候補（部門名称）が複数の単語から構成されている場合には、単語毎に逆変換してその音声データを得るようにしている。
【００４０】
更に、各入力候補に基づいて入力候補テーブル（部門テーブル）１１をアクセスし、この入力候補に該当する各入力候補番号を取得する（ステップＡ２２）。そして、上述のようにして入力候補毎に変換した各音声データと、その入力候補テーブル（部門テーブル）１１内に設定されている各入力候補番号とを対応付けた比較用メモリ１３の内容を作成し、ＲＡＭ３内のワーク域内にこの比較用メモリ１３の内容を一時記憶させる（ステップＡ２３）。
【００４１】
このように部門テーブル対応の比較用メモリ１３の内容を作成した状態において、音声入力の有無をチェックする（ステップＡ２４）。ここで、オペレータからの音声入力を受け付けた場合、音声入力装置７によって入力音声がＡ／Ｄ変換された後に、その特徴情報が抽出されて入力音声がデータ化されると（ステップＡ２５）、ＣＰＵ１は、この入力音声データを取り込み、この入力音声データに基づいて比較用メモリ１３をアクセスして各音声データと比較し（ステップＡ２６）、類似した音声データが比較用メモリ１３内に存在しているかをチェックする（ステップＡ２７）。
【００４２】
その結果、類似する音声データが存在していなければ、音声認識不良として判別し、キーあるいはマウス等のポインティングデバイスを使用した候補選択を可能とする為に、ステップＡ３３に移り、入力候補テーブル（部門テーブル）１１内の入力候補をリスト表示するが、音声認識の結果、類似音声が存在している場合には（ステップＡ２７）、その中から最も近似した音声データを特定すると共に（ステップＡ２８）、この音声データに対応付けられている「入力候補番号」を比較用メモリ１３から読み出し、この「入力候補番号」対応の入力候補を入力候補テーブル（部門テーブル）１１から選択取得する（ステップＡ２９）。そして、この選択候補を当該入力項目対応のデータとして入力し、その項目内に配置表示させる（ステップＡ３０）。
【００４３】
なお、音声入力フラグ１４がセットされている状態であっても、音声入力がなければ（ステップＡ２４でＮＯ）、入力項目へカーソル移動されてカーソル指示があったかを調べ（ステップＡ３２）、指定項目がカーソル指示された場合には、入力候補テーブル（部門テーブル）１１の内容をリスト表示させる（ステップＡ３３）。そして、このリスト表示の中から任意の候補がカーソル指示によって選択された場合には（ステップＡ３４）、カーソル指示の候補を入力候補テーブル（部門テーブル）１１から選択し（ステップＡ３５）、この選択候補を当該入力項目対応のデータとして入力し、その項目内に配置表示させる（ステップＡ３０）。
【００４４】
そして、リターンキーが操作されたかをチェックし（ステップＡ３１）、リターンキーが操作されなければ、ステップＡ２４に戻るが、リターンキーが操作されて候補確定が指示された場合には、図６のステップＡ１６に移り、この入力データを当該入力項目対応の伝票データとして確定する。
これによって１項目分の入力が終了すると、１伝票分における入力項目の終了かを調べるが（ステップＡ１７）、いま、第２の入力項目に対する入力が終了した段階であるから、次の入力項目（コメント入力項目Ｃ）を指定した後に（ステップＡ１８）、この指定項目対応の「型」を判別する（ステップＡ７）。この場合、コメント入力項目Ｃは、任意入力型の入力項目である為、図８のフローチャートに移る。
【００４５】
先ず、指定項目に対応して入力項目テーブル１０内に設定されている「属性」を読み出し（ステップＡ４０）、“文字”あるいは“数値”かを判別する（ステップＡ４１）。この結果、「属性」が“文字”であれば、属性フラグ１５に“１”をセットするが（ステップＡ４２）、“数値”であれば、“０”にセットする（ステップＡ４３）。そして、音声入力フラグ１４がセットされているかを判別し（ステップＡ４４／Ａ４５）、音声入力フラグ１４がセットされている場合に、属性フラグ１５が“１”であれば、一般音声認識用辞書１６を選択するが（ステップＡ４６）、“０”であれば、数値音声認識用辞書１７を選択する（ステップＡ４７）。
一方、音声入力フラグ１４がリセットされている場合には（ステップＡ４４あるいはＡ４５でＮＯ）、この指定項目に対してデータがキー入力されるまで待機状態となる（ステップＡ４８）。
【００４６】
いま、音声入力フラグ１４がセットされている状態において、指定項目に対して音声入力が行われると（ステップＡ４９）、選択辞書（一般音声認識用辞書１６あるいは数値音声認識用辞書１７）を参照して入力音声を認識する（ステップＡ５０）。この場合、指定項目は、コメント入力項目Ｃであり、その「属性」は“文字”であるから一般音声認識用辞書１６を参照して入力音声を認識した後、構文・意味情報辞書１８を用いて構文解析および意味解析を行う言語処理を実行し、その認識結果として「コメント文」を得る。なお、指定項目が部門コード入力項目Ａであれば、数値音声認識用辞書１７を参照することによって入力音声を認識し、その認識結果として「部門コード」を得る。
これによって認識された認識結果は、指定項目対応のデータとして入力され、当該項目内に配置表示される（ステップＡ５１）。
【００４７】
一方、音声入力フラグ１４がセットされている状態であっても、音声入力がなければ（ステップＡ４９でＮＯ）、キー操作によるデータ入力を可能とする為に、ステップＡ５２に移り、キー入力有無をチェックし、音声入力あるいはキー入力があるまで待機状態となる。ここで、指定項目に対してキー操作によってデータが入力された場合には（ステップＡ５２）、キー入力されたデータをその「属性」にしたがったデータ（文字／数値）に変換すると共に（ステップＡ５３）、このデータを指定項目対応のデータとして入力し、その項目内に配置表示させる（ステップＡ５４）。そして、リターンキーが操作されたかをチェックし（ステップＡ５５）、リターンキーが操作されなければ、ステップＡ４９に戻るが、リターンキーが操作されて候補確定が指示された場合には、この入力データを当該入力項目対応の伝票データとして確定する（ステップＡ５６）。
【００４８】
これによって１項目分の入力が終了すると、１伝票分における入力項目の終了かを調べるが（ステップＡ５７）、いま、第３の入力項目に対する入力が終了した場合であるから、次の入力項目（担当者名入力項目Ｄ）を指定した後に（ステップＡ５８）、図６のステップＡ７に戻り、指定項目の「型」を判別する。この場合、担当者名入力項目Ｄは、選択候補型の入力項目である為に、以下、上述と同様に音声入力フラグ１４のセット／リセットに応じて選択候補型の入力項目に対する動作が行われる。その際、入力候補テーブル（担当者テーブル）１２内の入力候補群が読み出され、音声入力フラグ１４がリセットされている場合には、上述の場合と同様に、この入力候補群がリスト表示されるが、音声入力フラグ１４がセットされている場合には、上述の場合と同様に、この入力候補群が音声データに変換されて担当者テーブル対応の比較用メモリ１３の内容が作成されて一時記憶される。
【００４９】
以上のように、この実施形態においてＣＰＵ１は、入力候補テーブル１１、１２内の入力候補群の中から任意に選択指定された候補を入力データとする候補選択型の入力項目が入力対象として指定されている状態において、音声入力装置７から音声入力された場合に、その入力音声を認識する対象を当該入力項目対応の入力候補群に限定し、この入力候補群に対応付けられている音声認識用パターン群（比較メモリ１３の内容）を参照して入力音声を認識し、入力候補群のうち最も近似している候補を当該入力項目に対するデータとして入力するようにしたから、候補選択型の入力項目に対するデータ入力を行う場合に、入力候補群の中から希望する候補をキーあるいはマウス等のポインティングデバイスによって指定して選択する方法の他に、オペレータからの入力音声を認識することによっても希望する候補を選択入力することができ、例えば、その入力候補群が多い場合や類似する候補が多数存在している場合には、音声入力で対処することができ、使い勝手が良いデータ入力方式を選ぶことで、入力効率の向上を期待することが可能となると共に、入力音声を認識する対象をその入力項目対応の入力候補群に限定するようにしている為、認識対象の絞込みが可能となり、認識率が良くなり、入力効率を一層向上させることが可能となる。
【００５０】
この場合、予め用意されている一般音声認識用辞書１６を参照することによって、候補選択型の入力項目に対応して設定されている入力候補群を音声認識用パターン群に逆変換して比較用メモリ１３に記憶するようにしたから、候補選択型の入力項目に対応する入力候補群を作成して設定しておくだけで、比較用メモリ１３の内容（音声認識用辞書）を自動作成することが可能となり、入力環境の設定作業等が容易となる。
なお、日常業務の変更等に伴って入力候補の追加、削除、修正が頻繁に行われる場合があるが、入力候補群が更新される毎に、更新後の入力候補群に基づいて音声認識用辞書の内容も更新すればよい。
【００５１】
伝票入力画面において、候補選択型の入力項目と、任意入力型の入力項目のうち、任意入力型の入力項目が入力対象として指定されると共に、この任意入力型の入力項目に対して音声入力された場合にも音声認識を行うようにしたから、候補選択型の入力項目と任意入力型の入力項目とが混在している伝票入力画面において、入力形式を変更することなく、候補選択型の入力項目の他に、任意入力型の入力項目に対しても、オペレータからの音声入力によって任意のデータを入力することができると共に、入力対象項目の種類に応じた音声認識が可能となる。
【００５２】
この場合、単語認識に限らず、構文・意味情報辞書１８を参照し、言語処理によって構文解析および意味解析を行うようにしたから、認識率を向上させ、任意の入力音声を確実に所望するデータに変換することが可能となる。つまり、候補選択型の入力項目に対する音声認識の場合には、単語認識で十分対応可能であるが、任意入力型の入力項目に対する音声認識の場合には、言語処理も含めたより高度な認識方式に自動切り替えを行って対応することができる。
【００５３】
また、任意入力型の入力項目に対して音声入力された場合に、この任意入力型の入力項目に対応して設定されている属性（数値／文字）を判別し、予め属性別に用意されている一般音声認識用辞書１６／数値音声認識用辞書１７を参照して入力音声を認識するようにしたから、単語認識方式を採用したとしても、専用辞書を使用した音声認識によってその認識率を大幅に向上させることができる。
一方、入力音声を認識することができなかった場合には、音声以外の入力デバイス（マウス、タッチペン等）を使用する入力方式に直ちに切り換えることができるので、その切り換えをオペレータが行う必要はなく、入力作業を更に効率良く行うことが可能となる。
【００５４】
なお、上述した実施形態においては、候補選択型の入力項目である部門名称入力項目Ｂあるいは担当者名入力項目Ｄが入力対象として選択指定される毎に、入力候補テーブル（部門テーブル／担当者テーブル）１１、１２内の入力候補群を音声データに変換し、部門テーブル／担当者テーブル対応の比較用メモリ１３に一時記憶するようにしたが、部門テーブル／担当者テーブル対応の比較用メモリ１３に相当する音声認識用の部門辞書および担当者辞書を事前に作成して登録しておいてもよい。つまり、候補選択型の入力項目が入力対象として選択される毎に、その入力候補群を音声データに変換するのではなく、予め作成されている音声認識用の部門辞書および担当者辞書を参照して音声認識を行うようにしてもよい。
【００５５】
この場合、部門テーブル、担当者テーブルの内容が更新（削除、追加、修正）される毎に、この更新に連動して部門テーブル、担当者テーブル対応の部門辞書および担当者辞書の内容も更新するようにすればよい。
また、部門テーブル、担当者テーブルと比較用メモリ１３とを別個に作成せず、部門情報／担当者情報が格納されているファイル内において、各入力候補に対応付けて、その音声データを記憶する音声データ記憶領域を設けるようにしてもよい。これによって、比較用メモリ１３の内容を作成したり、部門テーブル、担当者テーブルとは別個に、音声認識用の部門辞書および担当者辞書を設ける必要がなくなり、部門情報ファイル、担当者情報ファイルをアクセスするだけで音声認識が可能となる。
【００５６】
上述した実施形態においては、伝票入力時において、オペレータ操作によって音声入力フラグ１４をセットすることによって、各入力項目に対して音声入力を可能とする状態に設定したが、オペレータ操作によって音声入力可能状態に設定しなくても、音声入力を可能とするようにしてもよい。
更に、上述した実施形態においては、伝票入力画面内の全ての入力項目に対して音声入力を可能としたが、伝票を構成する各入力項目のうち、予め決められている項目だけを音声入力対応としてもよい。つまり、音声認識用辞書の規模、その認識率、コスト等を考慮して所定項目のみを音声入力対応としてもよい。例えば、任意入力型の項目のうち、特に、コメント入力項目Ｃについては、音声入力対象から外すようにすれば、一般音声認識用辞書１６および構文・意味情報辞書１８が不要となる。
【００５７】
上述した実施形態においては、入力項目の属性として、“文字”、“数値”を例示したが、“日付”、“アルファベット”、“記号”等も項目属性に含め、それに応じた属性別の音声認識用辞書を用意してもよい。
また、音声認識率を更に向上させる為に、音声認識機能に学習機能を追加し、音声認識によって以前に入力確定されたデータや入力頻度の高いデータを学習記憶しておき、以降、音声認識時にそれらのデータを優先候補として出現させるようにしてもよい。
【００５８】
候補選択型の入力項目は、入力候補群が多い場合にその入力候補群を分類別に仕分け、この分類項目を一覧表示するメインリストの中から所望の分類項目が選択された場合に、そのサブリストを表示するようにした階層候補選択型の項目であってもよい。
このような階層候補選択型の入力項目に対して音声入力を適用することによって候補選択を確実かつ容易に行うことが可能となる。すなわち、従来においては、下位階層のサブリスト内に所望する候補が存在しているにも拘らず、オペレータが上位階層のメインリストのみを確認しただけで該当項目が存在していないと判断してしまうことがあるが、このような不具合は、音声入力を適用することによって確実に防止することが可能となり、特に、階層数が多くなればなる程、有効であり、階層毎に下位階層のサブリストをオープンして、その都度、候補を選択する面倒な操作も不要となる。
【００５９】
上述した実施形態においては、音声入力は、伝票を構成する指定項目に対して行うようにしたが、更に、アプリケーションを操作する「操作コマンド」を音声入力することによって、入力コマンドを音声認識するようにしてもよい。
この場合、指定項目に対する音声入力か、操作コマンドの音声入力かは、入力画面上において、入力項目が指定されているか否かに基づいて区別するようにすれば、伝票指定項目に関する音声認識と操作コマンドに関する音声認識とを混同することなく、音声認識することが可能となる。
【００６０】
上述した実施形態においては、伝票入力を行う場合を例示したが、入力画面は伝票入力画面に限らず、任意であり、例えば、ＧＵＩ等のユーザ・インターフェイス用の入力画面に適用してもよい。
すなわち、プルダウンメニュー画面あるいはダイアログ・ボックス画面等と呼ばれる入力画面において、その入力画面内に一覧表示されている入力候補群（アプリケーション群／コマンド群）の中から任意の候補を選択指定する場合に、キーあるいはマウス等のポインティングデバイスによって選択する入力方式の他に、入力音声を認識することによって所望のアプリケーション／コマンドを選択指定する音声入力方式を採用するようにしてもよい。なお、この場合においても基本的には、上述した各フローチャートにしたがって容易に実現可能となる。これによって、入力候補であるアプリケーション群／コマンド群が多い場合や類似する候補が多数存在している場合には、音声入力で対処することができる為にユーザ・インターフェイスの使い勝手が良くなり、効率の良い入力環境を実現することができるようになる。
【００６１】
また、上述した音声認識機能付きデータ入力装置は、スタンド・アローン・タイプに限らず、その各構成要素が２以上の筐体に物理的に分離され、通信回線やケーブル等の有線伝送路あるいは電波、マイクロウエーブ、赤外線等の無線伝送路を介してデータを送受信する分散型のコンピュータシステムを構成するものであってもよい。
更に、広域ネットワークあるいは構内ネットワークを介して複数のデータ入力装置とを常時接続した構成の通信システム（例えば、クライアント・サーバシステム）にも適用可能である。この場合、各クライアント側でデータを音声入力する際に、入力音声をサーバとして機能するデータ入力装置へ送信し、サーバ側では、この入力音声を音声認識する処理を一括して行い、その認識結果を要求元のクライアントへ転送するようにしてもよい。
【００６２】
一方、コンピュータに対して、上述した各手段を実行させるためのプログラムコードをそれぞれ記録した記録媒体（例えば、ＣＤ−ＲＯＭ、フロッピィディスク、ＲＡＭカード等）を提供するようにしてもよい。
すなわち、コンピュータが読み取り可能なプログラムコードを有する記録媒体であって、予め設定されている入力候補群の中から任意に選択指定された候補を入力データとする候補選択型の入力領域に対して音声入力された場合に、その入力音声を認識する対象を当該入力領域対応の入力候補群に限定し、この入力候補群に対応付けられている音声認識用データ群を参照して入力音声を認識する機能と、この認識結果に基づいて前記入力候補群のうち最も近似している候補を当該入力領域に対するデータとして入力する機能とを実現させるためのプログラムを記録したコンピュータが読み取り可能な記録媒体を提供するようにしてもよい。
【００６３】
【発明の効果】
この発明（請求項１記載の発明）によれば、予め設定されている入力候補群の中から任意に選択指定された候補を入力データとする場合に、入力音声を認識する対象を入力候補群に限定し、この入力候補群に対応付けられている音声認識用データ群を参照して入力音声を認識し、入力候補群のうち最も近似している候補を当該入力領域に対するデータとして入力するようにしたから、候補選択型の入力領域に対するデータ入力を行う場合に、入力候補群の中から希望する候補をキーあるいはマウス等のポインティングデバイスによって指定して選択する方法の他に、オペレータからの音声によっても希望する候補を選択入力することができ、例えば、その入力候補群が多い場合や類似する候補が多数存在している場合には、音声入力で対処することができ、使い勝手が良いデータ入力方式を選ぶことで、入力効率の向上を期待することが可能となると共に、入力音声を認識する対象をその入力領域対応の入力候補群に限定するようにしている為、認識対象の絞込みが可能となり、認識率が良くなり、入力効率を一層向上させることが可能となる。
【図面の簡単な説明】
【図１】音声認識機能付きデータ入力装置の全体構成を示したブロック図。
【図２】社内伝票を作成する伝票入力画面の構成を示した図。
【図３】（Ａ）は、伝票入力画面内の各入力項目に対応して、その入力形式等を定義する入力項目テーブル１０の内容を示した図、（Ｂ）は、入力項目テーブル１０内の項目定義にしたがって入力された１レコード分（１伝票分）の伝票データを示した図。
【図４】（Ａ）は、候補選択型の入力項目に対して、その項目データが音声入力された場合に、その入力音声と比較して音声認識する為の音声データ（標準パターン）が記憶される比較用メモリ１３の内容を示した図、（Ｂ）は、ＲＡＭ３内のワーク域に一時記憶される音声入力フラグ１４、属性フラグ１５のセット状態を示した図。
【図５】（Ａ）は、音声入力フラグ１４がセットされている状態において、任意入力型の入力項目が入力対象として選択指定された際に、その項目の「属性」に応じて選択指定される一般音声認識用辞書１６、数値音声認識用辞書１７を示した図、（Ｂ）は、一般音声認識用辞書１６を用いて文字認識を行う場合の処理系を概念的に示した図。
【図６】音声認識機能付きデータ入力装置の動作（伝票入力処理）を示したフローチャート。
【図７】図６に続く、音声認識機能付きデータ入力装置における伝票入力処理を示したフローチャート。
【図８】図６および図７に続く、音声認識機能付きデータ入力装置における伝票入力処理を示したフローチャート。
【符号の説明】
１　ＣＰＵ
２　記憶装置
４　通信装置
５　入力装置
６　表示装置
７　音声入力装置
１０　入力項目テーブル
１１、１２　入力候補テーブル
１３　比較用メモリ
１４　音声入力フラグ
１５　属性フラグ
１６　一般音声認識用辞書
１７　数値音声認識用辞書
１８　構文・意味情報辞書
Ａ　部門コード入力項目
Ｂ　部門名称入力項目
Ｃ　コメント入力項目
Ｄ　担当者名入力項目

[0001]
[Technical field of the invention]
The present invention relates to a data input device and a program for inputting data to a candidate selection type input area, that is, an input area whose input data is a candidate arbitrarily selected from a preset input candidate group.
[0002]
[Prior art]
Conventionally, in a data processing device used for slip creation work, such as creating slips for inter-company transactions or in-house slips for internal notices, the slip creation input screen contains a plurality of input items arranged according to the type of slip. Each input item is classified, according to its data input format, as either a candidate selection type input item or an arbitrary input type input item. A candidate selection type input item is, for example, an item area called a list box or combo box, whose input data is a candidate arbitrarily selected from a preset input candidate group. An arbitrary input type input item is, for example, an item area called a text box, into which arbitrary data is entered.
[0003]
When a candidate selection type input item is designated as the input target so that data can be entered into it, an index table prepared in advance for that input item, for example a customer table or a product table, is accessed, and the customer candidate group, product candidate group, or the like is displayed as a list. When the desired candidate is selected from this list with a key or a pointing device such as a mouse, the selected candidate is input as the data for the corresponding item. Data is entered into an arbitrary input type input item by typing an arbitrary character string directly from the keyboard.
Compared with an arbitrary input type input item, a candidate selection type input item allows data to be entered simply by selecting the target candidate, so the input operation is easy and the desired data can be entered simply and reliably.
[0004]
[Problems to be solved by the invention]
However, entering data into a candidate selection type input item is not always efficient and can sometimes take more effort. For example, when the input candidate group is large, the operator must scroll through the list display to find the target candidate, which takes time. When many similar candidates exist, the operator may confuse them and select the wrong one.
This applies not only when data is entered on a slip creation input screen but also when a target candidate is selected from an input candidate group (a group of applications or commands) listed on a user-interface input screen such as a GUI, for example a screen called a pull-down menu screen or a dialog box screen.
[0005]
The object of the present invention is, when a candidate arbitrarily selected from a preset input candidate group is used as input data, to allow a desired candidate to be selected from the input candidate group and input by recognizing the operator's voice, in addition to the input format in which the desired candidate is selected from the input candidate group with a key or a pointing device such as a mouse.
[0006]
[Means for Solving the Problems]
The invention according to claim 1 is a data input device that inputs data to a candidate selection type input area whose input data is a candidate arbitrarily selected and designated from a preset input candidate group. The device comprises: voice data storage means for storing a voice recognition data group used to recognize, by voice, the input candidate group set for the candidate selection type input area; voice recognition means for, when voice is input for the candidate selection type input area designated as the input target, limiting the recognition targets for the input voice to the input candidate group corresponding to that input area and recognizing the input voice by referring to the voice recognition data group associated with that input candidate group; and input voice processing means for inputting, based on the recognition result obtained by the voice recognition means, the closest candidate in the input candidate group as the data for that input area.
Further, a program is provided for causing a computer to realize the main functions set out in the invention according to claim 1 (the invention according to claim 6).
[0007]
Therefore, according to the inventions of claims 1 and 6, when a candidate selection type input area, whose input data is a candidate arbitrarily selected and designated from a preset input candidate group, is designated as the input target and voice is input for that area, the recognition targets for the input voice are limited to the input candidate group corresponding to that input area, the input voice is recognized by referring to the voice recognition data group associated with that input candidate group, and the closest candidate in the input candidate group is input as the data for that input area. Consequently, when entering data into a candidate selection type input area, the operator can select and input the desired candidate by voice as well as by designating it with a key or a pointing device such as a mouse. For example, when the input candidate group is large or contains many similar candidates, voice input can be used instead, and choosing whichever input method is more convenient can be expected to improve input efficiency. In addition, because the recognition targets are limited to the input candidate group corresponding to the input area, the recognition targets are narrowed down, which improves the recognition rate and further improves input efficiency.
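The narrowing described in this paragraph can be sketched as follows. This is only an illustration under stated assumptions, not the patent's implementation: it assumes the utterance has already been turned into a text-like string, and it uses Python's `difflib.SequenceMatcher` as a stand-in for comparing acoustic patterns. The function name `closest_candidate`, the parameter `min_score`, and the 0.6 threshold are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Optional, Sequence


def closest_candidate(utterance: str,
                      candidates: Sequence[str],
                      min_score: float = 0.6) -> Optional[str]:
    """Return the candidate most similar to the utterance, or None.

    Only the candidates of the designated input area are compared, so
    nothing outside that group can ever be returned.
    """
    best, best_score = None, 0.0
    for candidate in candidates:
        # Stand-in similarity measure (assumption): string ratio in [0, 1].
        score = SequenceMatcher(None, utterance, candidate).ratio()
        if score > best_score:
            best, best_score = candidate, score
    # Below the threshold the utterance counts as "not recognized".
    return best if best_score >= min_score else None


departments = ["accounting", "sales", "general affairs"]
print(closest_candidate("acounting", departments))
print(closest_candidate("zzz", departments))
```

Because the search space is the candidate group alone, a slightly distorted input can still resolve to the intended entry, which is the recognition-rate benefit the paragraph describes.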
[0008]
The invention according to claim 1 may also take the following forms.
Conversion means is provided for inversely converting the input candidate group set for the candidate selection type input area into a voice recognition data group by referring to a voice recognition dictionary prepared in advance, and the voice data storage means stores the voice recognition data group obtained by the conversion means (the invention according to claim 2).
[0009]
Therefore, according to the invention of claim 2, in addition to the effects of the invention of claim 1, the input candidate group set for the candidate selection type input area is inversely converted into a voice recognition data group by referring to a voice recognition dictionary prepared in advance, and the result is stored. Merely creating and setting the input candidate group for a candidate selection type input area therefore makes it possible to create the voice recognition data group (voice recognition dictionary) for that input candidate group automatically, which simplifies tasks such as setting up the input environment. Input candidates may be added, deleted, or modified frequently as day-to-day operations change; each time the input candidate group is updated, the contents of the voice recognition dictionary need only be updated based on the updated input candidate group.
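A minimal sketch of this inverse conversion, assuming a word-level dictionary, might look like the following. The dictionary contents, the pattern strings such as `P-SALES`, and the function name `build_recognition_data` are hypothetical placeholders; real entries would be acoustic reference patterns. Skipping candidates that contain a word missing from the dictionary is also an assumption of this sketch.

```python
from typing import Dict, List, Tuple

# Hypothetical word -> reference-pattern dictionary (placeholder strings).
WORD_PATTERNS: Dict[str, str] = {
    "sales": "P-SALES",
    "accounting": "P-ACCOUNTING",
    "department": "P-DEPARTMENT",
}


def build_recognition_data(candidate_table: Dict[int, str],
                           word_patterns: Dict[str, str]
                           ) -> List[Tuple[int, List[str]]]:
    """Convert each candidate into patterns, keyed by its candidate number."""
    data = []
    for number, name in candidate_table.items():
        words = name.lower().split()
        # Assumption: a candidate is convertible only if every word is known.
        if all(word in word_patterns for word in words):
            data.append((number, [word_patterns[w] for w in words]))
    return data


department_table = {1: "Sales Department", 2: "Accounting Department"}
recognition_data = build_recognition_data(department_table, WORD_PATTERNS)
print(recognition_data)

# When the candidate group changes, the data is simply rebuilt from the table.
department_table[3] = "Sales"
recognition_data = build_recognition_data(department_table, WORD_PATTERNS)
print(len(recognition_data))
```

Rebuilding from the candidate table keeps the table as the single source of truth, so no separate dictionary has to be maintained by hand.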
[0010]
Determination means is provided for determining which type of item has been designated as the input target: a candidate selection type input item, whose input data is a candidate arbitrarily selected from a preset input candidate group, or an arbitrary input type input item, into which arbitrary data is entered. When the determination means determines that an arbitrary input type input item has been designated as the input target and voice is input for that item, the voice recognition means recognizes the input voice by referring to a voice recognition dictionary prepared in advance (the invention according to claim 3).
[0011]
Therefore, according to the invention of claim 3, in addition to the effects of the invention of claim 1, when an arbitrary input type input item, rather than a candidate selection type input item, is designated as the input target and voice is input for it, the voice is recognized by referring to a voice recognition dictionary prepared in advance. Thus, where candidate selection type and arbitrary input type input items are mixed, as on a slip input screen, arbitrary data can also be entered into arbitrary input type input items by voice without changing the input format, and voice recognition suited to the type of the input target item becomes possible.
In this case, if language processing such as analysis of the surrounding context and syntax analysis is performed in addition to word recognition, the recognition rate improves and arbitrary input voice can be converted reliably into the desired data. In other words, word recognition is sufficient for voice recognition on a candidate selection type input item, whereas for an arbitrary input type input item the device can switch automatically to a more advanced recognition method that includes language processing.
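The type-dependent switching described above can be illustrated with a small dispatch function. This is a sketch under assumptions: the `InputItem` class, its field names, the `"candidate"` and `"arbitrary"` labels, and the returned mode strings are hypothetical and chosen only to mirror the two item types named in the text.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class InputItem:
    name: str
    kind: str                      # "candidate" or "arbitrary" (assumed labels)
    candidates: Sequence[str] = ()


def recognition_plan(item: InputItem,
                     general_dictionary: Sequence[str]
                     ) -> Tuple[str, List[str]]:
    """Pick the recognition mode and vocabulary for the designated item."""
    if item.kind == "candidate":
        # Word recognition restricted to the item's own candidate group.
        return "word", list(item.candidates)
    if item.kind == "arbitrary":
        # General dictionary, followed by language processing.
        return "word+language", list(general_dictionary)
    raise ValueError(f"unknown item kind: {item.kind}")


general = ["urgent", "order", "delivery"]
department = InputItem("department name", "candidate", ["sales", "accounting"])
comment = InputItem("comment", "arbitrary")
print(recognition_plan(department, general))
print(recognition_plan(comment, general))
```

Keeping the decision in one place means the operator never has to choose a recognition mode; it follows from whichever item is currently designated.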
[0012]
The device further includes dictionary storage means for storing voice recognition dictionaries prepared in advance for each input attribute, according to the input attribute set for an arbitrary input type input item for inputting arbitrary data, and determination means for determining, when voice is input for such an input item, the input attribute set for that input item. The voice recognition means selects, from among the voice recognition dictionaries for the respective input attributes, the one corresponding to the input attribute determined by the determination means, and recognizes the input voice by referring to the selected dictionary (the invention according to claim 4).
[0013]
Therefore, according to the fourth aspect of the present invention, in addition to the same effect as the first aspect, when voice is input for an arbitrary input type input item, the input attribute set for that item (for example, an attribute indicating which type of data is to be input, such as numerical values, characters, dates, or symbols) is determined, and the input voice is recognized by referring to the voice recognition dictionary prepared in advance for that input attribute. Because a dedicated dictionary is used, the recognition rate can be improved significantly even if a relatively simple word recognition method is adopted.
[0014]
If, as a result of recognition by the voice recognition means, the input voice cannot be recognized, a candidate arbitrarily selected and designated from the input candidate group by an input device other than voice input is used as input data (the invention according to claim 5).
Therefore, according to the fifth aspect of the present invention, in addition to the same effect as the first aspect, when the input voice cannot be recognized, the device switches immediately to input using an input device other than voice. The operator does not need to perform the switching, and the data input operation can be performed efficiently.
[0015]
BEST MODE FOR CARRYING OUT THE INVENTION
Hereinafter, an embodiment of the present invention will be described with reference to FIGS.
FIG. 1 is a block diagram showing the overall configuration of a data input device according to this embodiment.
This data input device is, for example, a personal computer that performs slip creation work for creating slips for inter-company transactions or in-house slips for intra-company notification. In addition to a keyboard and a pointing device such as a mouse for inputting slip data, it has a microphone for voice input and a voice recognition function for inputting each item of the slip data by recognizing the voice input from the microphone.
[0016]
That is, when inputting data for each input item constituting a slip, this data input device supports, in addition to the normal input method using keys or a pointing device such as a mouse, input of each item of the slip data by recognizing voice input from the operator.
Before describing the features of this embodiment in detail, the hardware configuration of this embodiment will be described below.
[0017]
The CPU 1 is a central processing unit that controls the overall operation of this data processing device according to the operating system and various application software in the storage device 2. The storage device 2 has a program storage area and a data storage area. The program storage area stores, in addition to the operating system, various processing programs, in particular an application for creating slips and an application for voice recognition that allows slips to be created by voice recognition. The data storage area stores various voice recognition dictionaries and the like, which will be described later.
[0018]
The storage device 2 may have a configuration in which a removable storage medium such as a CD-ROM or a DVD can be mounted in addition to a fixed memory such as a hard disk. The programs and data in the storage device 2 are loaded into a RAM (for example, a static RAM) 3 as needed, and the data in the RAM 3 is saved to the storage device 2. Note that the RAM 3 has a program execution area and a work area.
Further, the CPU 1 can directly access and use programs and data of another electronic device via the communication device 4, or download and receive them via the communication device 4. The communication device 4 is, for example, a wired or wireless communication interface including a communication modem, an infrared module, an antenna, and the like.
[0019]
On the other hand, an input device 5, a display device 6, and a voice input device 7, which are input/output peripheral devices, are connected to the CPU 1 via a bus line, and the CPU 1 controls their operation according to an input/output program. The input device 5 is an operation unit comprising a keyboard, a touch panel, or a pointing device such as a mouse or a touch input pen, and is used to input character string data and various commands. The display device 6 is a liquid crystal, CRT, or plasma display device that performs full-color display.
The voice input device 7 includes a microphone, an A/D converter, and the like. It converts the input voice waveform from analog to digital (A/D), extracts characteristic information by analyzing the conversion result, and supplies the input voice to the CPU 1 as data. The CPU 1 performs voice input processing in which it performs voice recognition on the input voice data and inputs the recognition result to the corresponding item as item data of the slip.
[0020]
FIG. 2 is a diagram exemplifying a configuration of a slip input screen for creating an in-house slip.
The slip input screen has various input items such as an item A for inputting a department code, an item B for inputting a department name, an item C for inputting a comment, and an item D for inputting the name of a person in charge. The slip is created by inputting the corresponding data in each input item. In this case, the input items in the slip input screen are designated as input targets one at a time in a predetermined order, and the data input for the designated item is arranged and displayed within that item.
Of the input items constituting the slip input screen, the department code input item A and the comment input item C differ in input format from the department name input item B and the person-in-charge name input item D.
[0021]
That is, the department code input item A and the comment input item C are arbitrary input type input items (item areas called text boxes) for inputting arbitrary data, whereas the department name input item B and the person-in-charge name input item D are candidate selection type input items (item areas called list boxes or combo boxes) in which a candidate arbitrarily selected from a preset input candidate group is used as input data. When the department name input item B or the person-in-charge name input item D is designated as an input target, a pull-down menu is opened in the slip input screen for that input item, and a list of input candidates is displayed.
[0022]
FIG. 3A is a diagram showing the contents of an input item table 10 that defines the input format and the like corresponding to each input item in the slip input screen.
The input item table 10 is a table in which item definition information is set for each input item in the slip input screen. For each input item, "type", "attribute", and "link destination" are set. The input items are stored in a predetermined arrangement order, and the CPU 1 designates each input item set in the input item table 10 as an input target one by one according to that order. "Type" in the input item table 10 indicates the input format of the input item: "arbitrary" indicates an arbitrary input type input item, and "candidate" indicates a candidate selection type input item.
[0023]
The "attribute" in the input item table 10 indicates the type of data to be input in an arbitrary input type input item. "Numeric value" is set in the "attribute" of the department code input item A, indicating that the department code is to be input as a numeric string, and "character" is set in the "attribute" of the comment input item C, indicating that the comment is to be input as a character string.
The "link destination" is a table name for accessing the index table in which the input candidate group of a candidate selection type input item is set. The table name "F1" of the input candidate table (department table) 11 is set in the "link destination" of the department name input item B, and the table name "F2" of the input candidate table (person in charge table) 12 is set in the "link destination" of the person-in-charge name input item D.
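The item definition information described above can be pictured as a small ordered table. The following is a minimal sketch in Python, not the implementation of the embodiment: the class name `InputItem`, the field names, and the helper `describe` are assumptions introduced here for illustration, while the item order, the "type" and "attribute" values, and the table names "F1" and "F2" follow the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InputItem:
    """One row of a hypothetical model of input item table 10."""
    name: str
    type: str                  # "arbitrary" or "candidate"
    attribute: Optional[str]   # "numeric" / "character"; arbitrary items only
    link: Optional[str]        # input candidate table name; candidate items only


# Rows are kept in the arrangement order used to designate input targets.
INPUT_ITEM_TABLE = [
    InputItem("department code", "arbitrary", "numeric", None),    # item A
    InputItem("department name", "candidate", None, "F1"),         # item B
    InputItem("comment", "arbitrary", "character", None),          # item C
    InputItem("person in charge", "candidate", None, "F2"),        # item D
]


def describe(item: InputItem) -> str:
    """Summarize how data is entered for an item, based on its type."""
    if item.type == "candidate":
        return f"{item.name}: choose from table {item.link}"
    return f"{item.name}: free {item.attribute} input"


for item in INPUT_ITEM_TABLE:
    print(describe(item))
```

Keeping the rows in a list preserves the arrangement order, so designating the next input target amounts to advancing an index.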
[0024]
The input candidate table (department table) 11 is an index table that stores various "department names" arbitrarily set in advance as the input candidate group for the department name input item B, and the input candidate table (person in charge table) 12 is an index table that stores various "person-in-charge names" arbitrarily set in advance as the input candidate group for the person-in-charge name input item D. The input candidate tables 11 and 12 store an "input candidate number" for each "input candidate (department name / person-in-charge name)".
FIG. 3B is a diagram showing slip data of one record (one slip) input according to the above-described item definition information. When the input record of one slip is taken into the CPU 1, the CPU 1 stores and manages the input record (slip record) in a slip file (not shown) for each slip, and a slip issuing process, a slip counting process, and the like are performed based on the slip file.
[0025]
FIG. 4A is a diagram showing the contents of a comparison memory 13, which stores the voice data for recognition (voice recognition patterns) to be compared with the input voice when item data is input by voice to the department name input item B or the person-in-charge name input item D, which are candidate selection type input items.
The comparison memory 13 is a dictionary memory for voice recognition of candidate selection type input items, and is created and stored each time the department name input item B or the person-in-charge name input item D is selected and designated as an input target.
[0026]
In this case, when the department name input item B or the person-in-charge name input item D is selected and designated as an input target, the CPU 1 accesses the input candidate table (department table / person in charge table) 11 or 12 linked to that input item and converts the input candidate group into a voice recognition pattern group, thereby creating the comparison memory 13 in the work area in the RAM 3. At this time, as will be described in detail later, syntax analysis and semantic analysis are performed on the character codes constituting each input candidate, and the voice recognition dictionary is looked up in reverse based on the analysis result to obtain the voice data corresponding to each input candidate.
The comparison memory 13 is configured to store the “voice data (standard pattern)” and the “input candidate number” for each input candidate. The memory 13 is associated with each of the input candidate tables 11 and 12.
[0027]
FIG. 4B is a diagram showing a set state of the voice input flag 14 and the attribute flag 15 temporarily stored in the work area in the RAM 3.
The voice input flag 14 indicates that voice input is possible for each input item when a slip is input, and is an input status flag arbitrarily set / reset by an operator operation.
The attribute flag 15 is a character/numeric designation flag that is set automatically according to the "attribute" of the department code input item A or the comment input item C, which are arbitrary input type input items, when that item is selected and designated as an input target. "1" is set when the "attribute" is character, and "0" is set when the "attribute" is a numerical value.
[0028]
FIG. 5 (A) is a diagram showing a dictionary 16 for general voice recognition and a dictionary 17 for numerical voice recognition, one of which is selected according to the "attribute" of the item when the department code input item A or the comment input item C, which are arbitrary input type input items, is selected and designated as an input target while the voice input flag 14 is set.
That is, when the attribute flag 15 is "1" (character), character recognition using the general voice recognition dictionary 16 is performed, and when it is "0" (numerical value), recognition using the numerical voice recognition dictionary 17 is performed, so that a different recognition dictionary is designated for each attribute. In this example, only characters and numerical values are shown as attributes. However, if there are other attributes, a recognition dictionary associated with each of those attributes is designated.
[0029]
The general voice recognition dictionary 16 stores various voice data (standard patterns), including words that may commonly be used in slip input, in association with their code information. Whereas it is a general-purpose voice recognition dictionary that can also be used for slip input, the numerical voice recognition dictionary 17 is a dictionary dedicated to numerical values, in which voice data for numerical values is associated with its code information.
The general voice recognition dictionary 16 also stores voice data for special "terms", "person names", "corporate names", "addresses", and the like that may commonly be used in slip input. Voice recognition using the general voice recognition dictionary 16 or the numerical voice recognition dictionary 17 can use a speaker-independent method that does not specify the target speaker. Of course, a speaker-dependent method that specifies the target speaker may be adopted instead.
[0030]
FIG. 5B is a diagram conceptually showing a processing system when character recognition is performed using the general speech recognition dictionary 16.
In this case, when the department code input item A or the comment input item C, which are arbitrary input type input items, is selected and designated as an input target and the "attribute" of the item is "character", the following is done to improve the recognition accuracy for continuous voice input. First, general speech recognition processing that recognizes the input speech word by word using the general speech recognition dictionary 16 is performed. Then, language processing that performs syntax analysis and semantic analysis using the syntax / semantic information dictionary 18 is executed to obtain the recognition result.
[0031]
Next, an operation algorithm of the data input device with a voice recognition function in this embodiment will be described with reference to flowcharts shown in FIGS. Here, each function described in these flowcharts is stored in the form of readable program code, and operations are executed sequentially according to that program code. Operations according to program code transmitted via a transmission medium can also be executed sequentially. That is, an operation unique to this embodiment can be executed using a program or data supplied externally via a transmission medium as well as from a recording medium.
[0032]
FIGS. 6 to 8 are flowcharts showing the operation (slip input processing) of the data input device with the voice recognition function.
First, when the slip creation processing is designated, the CPU 1 reads out the form information for slip input, displays the input screen shown in FIG. 2 (step A1), and displays on the input screen a guide message inquiring whether to enable voice input as a slip data input method (step A2). That is, the operator is asked whether data may be input by voice in addition to the input method using keys or a pointing device such as a mouse. The operator operates the "YES key" or the "NO key" in response to the inquiry message. If the "YES key" is operated to enable voice input (YES in step A3), the voice input flag 14 is set (step A4). If the "NO key" is operated (NO in step A3), the voice input flag 14 is reset (step A5).
[0033]
Then, the CPU 1 designates the first one of the input items in the input screen as an input target (step A6). In this case, the first item is designated as an input target by referring to the arrangement order of each input item set in the input item table 10. In this example, first, the first item “department code” is specified as an input target. Then, referring to the “type” set in the input item table 10 corresponding to the specified item, it is determined whether the specified item is “input item of arbitrary input type” or “input item of candidate selection type”. (Step A7).
[0034]
Note that, in this case, since the department code input item A, which is the first item, is an arbitrary input type input item, this is detected in step A7, and the process proceeds to the flowchart of FIG. For convenience, however, so that the flowcharts of FIGS. 6 to 8 can be explained in order, the operation when a candidate selection type input item is the input target is described first, before the operation when an arbitrary input type input item is the input target.
[0035]
Here, assume for example that the department name input item B, which is the second input item and a candidate selection type input item, is designated as an input target. The CPU 1 refers to the input item table 10 and, based on the "link destination" of the designated item, reads out the department table as the corresponding input candidate table 11 (step A8). Then, it is checked whether the voice input flag 14 is set (step A9).
If the voice input flag 14 has been reset, the process proceeds to step A10, where it is checked whether the cursor has been moved to the position of the designated item (whether the item has been designated by the cursor). If so, a pull-down menu is opened for the designated item, and the contents (department names) of the input candidate table (department table) 11 are displayed as a list (step A11).
[0036]
With the input candidate group displayed as a list, when an arbitrary candidate is selected from the list by the cursor (step A12), the candidate indicated by the cursor is selected from the input candidate table (department table) 11 (step A13). The selected candidate is input as data for the input item (department name input item B), and the selected "department name" is displayed as input data in input item B (step A14).
In this state, it is checked whether or not a predetermined key (return key) for instructing the candidate confirmation has been operated (step A15). If the candidate confirmation has not been instructed, the process returns to step A10, and the candidate selection operation can be performed again. When the return key is operated to instruct the candidate confirmation, the input data is confirmed as the slip data corresponding to the input item (step A16).
[0037]
When the input of one item is completed in this way, it is checked whether the designated item is the last item constituting the slip, that is, whether the input of all items in one slip is completed (step A17). In this case, the input for the second input item has been completed, and it is not the last item. Therefore, the process proceeds to the next step A18, where the next input item (the third item, comment input item C) is designated. If, when the department name input item B (candidate selection type input item), which is the second input item, is designated as an input target, it is detected in step A9 that the voice input flag 14 is set, the process proceeds to step A20 in FIG.
[0038]
In this case, in steps A20 to A23 of FIG. 7, the contents of the comparison memory 13 corresponding to the department table are created based on each input candidate read from the input candidate table (department table) 11. That is, by referring to the syntax / semantic information dictionary 18 and the general speech recognition dictionary 16 based on the character strings constituting each input candidate read from the input candidate table (department table) 11, each input candidate is inversely converted into voice data for voice recognition and temporarily stored in the comparison memory 13 corresponding to the department table.
[0039]
First, the syntax / semantic information dictionary 18 is referred to based on the character codes constituting the input candidate, syntax analysis and semantic analysis are performed by language processing, and each word included in the input candidate is identified (step A20). Then, the general speech recognition dictionary 16 is accessed based on the word, the voice data (voice standard pattern) associated with the word is obtained, and the word is thereby inversely converted into voice data (step A21). When the input candidate (department name) is composed of a plurality of words, the voice data is obtained by performing the inverse conversion for each word.
[0040]
Further, the input candidate table (department table) 11 is accessed based on each input candidate, and each input candidate number corresponding to the input candidate is obtained (step A22). Then, the contents of the comparison memory 13 in which each voice data converted for each input candidate as described above is associated with each input candidate number set in the input candidate table (department table) 11 are created. Then, the contents of the comparison memory 13 are temporarily stored in the work area in the RAM 3 (step A23).
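Steps A20 to A23 can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not the processing of the embodiment: voice data is modeled as a tuple of phoneme-like symbols, a plain whitespace split stands in for the syntax and semantic analysis of step A20, and the names `GENERAL_DICTIONARY`, `DEPARTMENT_TABLE`, `split_words`, `to_voice_data`, and `build_comparison_memory`, along with the sample entries, are hypothetical.

```python
# Hypothetical stand-in for the general voice recognition dictionary 16:
# each word maps to its "voice data", modeled as phoneme-like symbols.
GENERAL_DICTIONARY = {
    "sales": ("s", "ey", "l", "z"),
    "general": ("jh", "eh", "n", "r", "ah", "l"),
    "affairs": ("ah", "f", "eh", "r", "z"),
    "accounting": ("ah", "k", "aw", "n", "t", "ih", "ng"),
}

# Hypothetical department table 11: input candidate number -> department name.
DEPARTMENT_TABLE = {1: "sales", 2: "general affairs", 3: "accounting"}


def split_words(candidate):
    """Step A20 (simplified): identify the words in an input candidate.
    A whitespace split stands in for syntax / semantic analysis."""
    return candidate.split()


def to_voice_data(candidate, dictionary):
    """Step A21: inversely convert each word to voice data and join them.
    Raises KeyError if a word is missing from the dictionary."""
    pattern = []
    for word in split_words(candidate):
        pattern.extend(dictionary[word])
    return tuple(pattern)


def build_comparison_memory(candidate_table, dictionary):
    """Steps A22 and A23: pair each voice pattern with its candidate number."""
    return [
        (to_voice_data(name, dictionary), number)
        for number, name in candidate_table.items()
    ]


comparison_memory = build_comparison_memory(DEPARTMENT_TABLE, GENERAL_DICTIONARY)
```

Because the comparison memory holds only candidate numbers alongside the patterns, the displayed text of a recognized candidate is always taken from the input candidate table itself.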
[0041]
With the contents of the comparison memory 13 corresponding to the department table created, it is checked whether or not there is a voice input (step A24). When voice input from the operator is received, the voice input device 7 A/D converts the input voice, extracts its characteristic information, and converts the input voice into data (step A25). The CPU 1 fetches the input voice data, accesses the comparison memory 13, compares the input voice data with each item of voice data stored there (step A26), and checks whether similar voice data exists in the comparison memory 13 (step A27).
[0042]
If no similar voice data exists, it is determined that voice recognition has failed, and the process proceeds to step A33, where the input candidates in the input candidate table 11 are displayed as a list so that a candidate can be selected using a key or a pointing device such as a mouse. If similar voice data exists as a result of voice recognition (step A27), the most similar voice data is identified from among the similar voice data (step A28). The "input candidate number" associated with that voice data is read from the comparison memory 13, and the input candidate corresponding to the "input candidate number" is selected and acquired from the input candidate table (department table) 11 (step A29). Then, the selected candidate is input as data for the input item, and is arranged and displayed in the item (step A30).
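The comparison in steps A26 to A29, including the fallback when nothing is similar enough, might look like the sketch below. It is an illustration only: the embodiment does not specify a similarity measure, so `difflib.SequenceMatcher` from the Python standard library is swapped in as a stand-in, and the function name `recognize`, the `threshold` value of 0.6, and the sample patterns are assumptions.

```python
from difflib import SequenceMatcher


def recognize(input_voice, comparison_memory, threshold=0.6):
    """Return the input candidate number of the most similar stored pattern,
    or None when no pattern reaches the threshold (recognition failure)."""
    best_number, best_score = None, 0.0
    for voice_data, number in comparison_memory:
        # Step A26: compare the input with each stored voice pattern.
        score = SequenceMatcher(None, input_voice, voice_data).ratio()
        if score > best_score:
            best_number, best_score = number, score
    # Steps A27 and A28: accept only a sufficiently similar best match.
    return best_number if best_score >= threshold else None


# Hypothetical comparison memory 13 and department table 11.
comparison_memory = [
    (("s", "ey", "l", "z"), 1),
    (("ah", "k", "aw", "n", "t", "ih", "ng"), 3),
]
department_table = {1: "sales", 3: "accounting"}

number = recognize(("s", "ey", "l", "s"), comparison_memory)
if number is None:
    print("recognition failed: list candidates for key or mouse selection")
else:
    print("input:", department_table[number])   # step A29
```

Since only the patterns in the comparison memory are compared, the recognition target is limited to the input candidate group of the designated item, which is the point of the arrangement described above.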
[0043]
Even if the voice input flag 14 is set, if there is no voice input (NO in step A24), the cursor is moved to the input item and it is checked whether or not there is a cursor instruction (step A32). When the cursor is pointed, the contents of the input candidate table (department table) 11 are displayed as a list (step A33). When an arbitrary candidate is selected from the list display by the cursor instruction (step A34), a candidate for the cursor instruction is selected from the input candidate table (department table) 11 (step A35), and the selected candidate is selected. Is input as data corresponding to the input item, and is arranged and displayed in the item (step A30).
[0044]
Then, it is checked whether or not the return key has been operated (step A31). If the return key has not been operated, the process returns to step A24. If the return key has been operated, the process proceeds to step A16, and the input data is confirmed as slip data for the input item.
As a result, when the input for one item is completed, it is checked whether the input item for one slip is completed (step A17). Since the input for the second input item has now been completed, the next input item ( After designating the comment input item C) (step A18), the "type" corresponding to the designated item is determined (step A7). In this case, since the comment input item C is an arbitrary input type input item, the process proceeds to the flowchart of FIG.
[0045]
First, the "attribute" set in the input item table 10 corresponding to the designated item is read out (step A40), and it is determined whether it is "character" or "numerical value" (step A41). As a result, if the "attribute" is "character", "1" is set to the attribute flag 15 (step A42), but if it is "numerical", it is set to "0" (step A43). Then, it is determined whether the voice input flag 14 is set (steps A44 / A45). If the attribute flag 15 is "1" when the voice input flag 14 is set, the general voice recognition dictionary 16 Is selected (step A46), but if it is "0", the dictionary 17 for numerical voice recognition is selected (step A47).
On the other hand, if the voice input flag 14 has been reset (NO in step A44 or A45), a standby state is set until data is key-inputted for this designated item (step A48).
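The branching in steps A40 to A48 reduces to a small selection function. The following Python sketch is illustrative only: the function names `attribute_flag` and `select_dictionary` and the string labels returned for the dictionaries are assumptions, while the flag values "1" (character) and "0" (numerical value) follow the text.

```python
GENERAL_DICTIONARY_16 = "general voice recognition dictionary 16"
NUMERIC_DICTIONARY_17 = "numerical voice recognition dictionary 17"


def attribute_flag(attribute):
    """Steps A41 to A43: 1 for a character attribute, 0 for a numeric one."""
    return 1 if attribute == "character" else 0


def select_dictionary(attribute, voice_input_flag):
    """Steps A44 to A48: choose the recognition dictionary for an arbitrary
    input type item, or return None when voice input is disabled and the
    device simply waits for key input."""
    if not voice_input_flag:
        return None
    if attribute_flag(attribute) == 1:
        return GENERAL_DICTIONARY_16
    return NUMERIC_DICTIONARY_17


print(select_dictionary("character", True))
print(select_dictionary("numeric", True))
print(select_dictionary("character", False))
```

If further attributes such as dates or symbols were added, a mapping from attribute to dictionary would scale better than a single binary flag.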
[0046]
When the voice input is performed on the designated item while the voice input flag 14 is set (step A49), the selected dictionary (the general voice recognition dictionary 16 or the numerical voice recognition dictionary 17) is referred to. To recognize the input voice (step A50). In this case, the designated item is the comment input item C, whose “attribute” is “character”, so that the input speech is recognized with reference to the general speech recognition dictionary 16 and then the syntax / semantic information dictionary 18 is used. To perform language processing for syntactic analysis and semantic analysis, and obtain a "comment sentence" as a recognition result. If the designated item is a department code input item A, the input voice is recognized by referring to the dictionary 17 for numerical voice recognition, and a "department code" is obtained as a result of the recognition.
The recognition result thus recognized is input as data corresponding to the specified item, and is arranged and displayed in the item (step A51).
[0047]
On the other hand, even if the voice input flag 14 is set, if there is no voice input (NO in step A49), the process proceeds to step A52 to enable data input by key operation, checks for key input, and waits until there is either a voice input or a key input. When data is input by a key operation on the designated item (step A52), the keyed data is converted into data (character / numerical value) according to the "attribute" (step A53), and this data is input as data for the designated item and is arranged and displayed within the item (step A54). Then, it is checked whether or not the return key has been operated (step A55). If the return key has not been operated, the process returns to step A49. If the return key has been operated, the input data is confirmed as slip data for the input item (step A56).
[0048]
When the input for one item is completed in this way, it is checked whether or not the input of all items for one slip is completed (step A57). Since the input for the third input item has now been completed, the next input item (person-in-charge name input item D) is designated (step A58), and the process returns to step A7 in FIG. 6 to determine the "type" of the designated item. In this case, since the person-in-charge name input item D is a candidate selection type input item, the operation for a candidate selection type input item is performed according to whether the voice input flag 14 is set or reset, as described above. At this time, the input candidate group in the input candidate table (person in charge table) 12 is read out. If the voice input flag 14 is reset, the input candidate group is displayed as a list as in the case described above. If the voice input flag 14 is set, the input candidate group is converted into voice data, and the contents of the comparison memory 13 corresponding to the person in charge table are created and temporarily stored, as in the case described above.
[0049]
As described above, in this embodiment, when voice is input from the voice input device 7 while a candidate selection type input item, in which a candidate arbitrarily selected and designated from the input candidate group in the input candidate table 11 or 12 is used as input data, is designated as an input target, the CPU 1 limits the recognition target of the input voice to the input candidate group corresponding to that input item. The CPU 1 recognizes the input voice with reference to the voice recognition pattern group associated with the input candidate group (the contents of the comparison memory 13), and inputs the closest candidate in the input candidate group as data for the input item. Therefore, when inputting data to a candidate selection type input item, a desired candidate can be selected and input by recognizing voice input from the operator, in addition to the method of selecting a desired candidate from the input candidate group with a key or a pointing device such as a mouse. For example, when the input candidate group is large or contains many similar candidates, choosing whichever data input method is easier to use, such as voice input, can be expected to improve input efficiency. Moreover, because the recognition target of the input voice is limited to the input candidate group corresponding to the input item, the recognition target is narrowed down, the recognition rate is improved, and the input efficiency is further improved.
[0050]
In this case, by referring to the general speech recognition dictionary 16 prepared in advance, the input candidate group set corresponding to the candidate selection type input item is inversely converted into a voice recognition pattern group and compared. Since the information is stored in the memory 13, the contents of the comparison memory 13 (speech recognition dictionary) can be automatically created simply by creating and setting an input candidate group corresponding to the candidate selection type input item. This makes it easy to set the input environment.
In addition, input candidates may be frequently added, deleted, or corrected in accordance with changes in daily work, etc., but each time the input candidate group is updated, the input candidate for voice recognition is updated based on the updated input candidate group. The content of the dictionary may be updated.
[0051]
Voice recognition is also performed when, of the candidate selection type and arbitrary input type input items on the slip input screen, an arbitrary input type input item is designated as an input target and voice is input for it. Therefore, on a slip input screen where candidate selection type input items and arbitrary input type input items are mixed, arbitrary data can be input by voice from the operator to an arbitrary input type input item as well as to a candidate selection type input item, without changing the input format, and voice recognition suited to the type of the input target item can be performed.
[0052]
In this case, not only word recognition but also syntactic and semantic analysis is performed by language processing with reference to the syntax / semantic information dictionary 18, so that the recognition rate can be improved and arbitrary input speech can be reliably converted into the desired data. In other words, word recognition is sufficient for speech recognition for a candidate selection type input item, whereas speech recognition for an arbitrary input type input item is handled by automatically switching to a more sophisticated recognition method that includes language processing.
[0053]
When a voice is input for an arbitrary input type input item, the attribute (numerical value / character) set corresponding to that input item is determined, and the input voice is recognized with reference to the dictionary prepared in advance for that attribute, namely the general speech recognition dictionary 16 or the numerical speech recognition dictionary 17. Therefore, even if a word recognition method is adopted, speech recognition using a dedicated dictionary can significantly improve the recognition rate.
On the other hand, if the input voice cannot be recognized, the device can immediately switch to an input method that uses an input device other than voice (such as a mouse or a touch pen), so that the input operation can be performed more efficiently.
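The attribute-based dictionary selection and the fallback to another input device can be sketched as follows. This is an illustration only: the dictionary entries, the function names, and the use of exact-match lookup in place of real speech recognition are assumptions.

```python
from typing import Callable, Optional

# Hypothetical dictionaries keyed by input attribute; exact-match lookup
# stands in for real speech recognition.
GENERAL_DICT = {"shikyuu": "urgent", "kakunin": "confirm"}  # dictionary 16
NUMERIC_DICT = {"ichi": "1", "ni": "2", "san": "3"}         # dictionary 17
DICTIONARIES = {"character": GENERAL_DICT, "numeric": NUMERIC_DICT}


def recognize_free_input(attribute: str, input_pattern: str) -> Optional[str]:
    """Recognize against the dictionary chosen by the item's attribute."""
    return DICTIONARIES[attribute].get(input_pattern)


def input_item(attribute: str, input_pattern: str,
               manual_input: Callable[[], str]) -> str:
    """Use the recognition result, or fall back to manual input on failure."""
    result = recognize_free_input(attribute, input_pattern)
    if result is None:
        return manual_input()  # e.g. keyboard, mouse, or touch pen
    return result
```

Passing the manual input routine as a callable keeps the sketch independent of any particular input device.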
[0054]
In the above-described embodiment, each time the department name input item B or the person in charge name input item D, which are candidate selection type input items, is selected and designated as the input target, the input candidate group in the corresponding input candidate table 11 or 12 (department table / person in charge table) is converted into voice data and temporarily stored in the comparison memory 13. Alternatively, a department dictionary and a person in charge dictionary for speech recognition corresponding to the department table and the person in charge table may be created and registered in advance. That is, instead of converting the input candidate group into voice data every time a candidate selection type input item is selected as the input target, voice recognition may be performed with reference to the previously created department dictionary and person in charge dictionary for speech recognition.
[0055]
In this case, every time the contents of the department table or the person in charge table are updated (deleted, added, or modified), the contents of the department dictionary and the person in charge dictionary corresponding to those tables may be updated in conjunction with the update.
Also, instead of creating the department table, the person in charge table, and the comparison memory 13 separately, a voice data storage area may be provided in the file that stores the department information / person in charge information, so that voice data is stored in association with each input candidate. This eliminates the need to create the contents of the comparison memory 13 or to provide a department dictionary and a person in charge dictionary for voice recognition separately from the department table and the person in charge table, and speech recognition becomes possible simply by accessing that file.
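One way to picture a file that carries the voice data alongside each candidate is a record with an extra field. The sketch below is hypothetical: the record layout, the field names, and the sample values are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DepartmentRecord:
    """One row of a hypothetical department file."""
    code: str
    name: str
    voice_pattern: str  # voice data storage area kept in the same record


DEPARTMENT_FILE: List[DepartmentRecord] = [
    DepartmentRecord("010", "General Affairs", "soumu"),
    DepartmentRecord("020", "Sales", "eigyou"),
]


def patterns_for_recognition(records: List[DepartmentRecord]) -> Dict[str, str]:
    """Read candidates and their patterns directly from the one file."""
    return {r.name: r.voice_pattern for r in records}
```

With this layout, adding or deleting a department updates the recognition data in the same operation, so no separate dictionary has to be kept in sync.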
[0056]
In the above-described embodiment, when a slip is input, voice input is enabled for each input item by setting the voice input flag 14 through an operator's operation. However, voice input may also be enabled without setting the voice input flag 14.
Further, in the above-described embodiment, voice input is enabled for all input items on the slip input screen. However, only predetermined items among the input items constituting the slip may be made compatible with voice input, in consideration of the scale of the voice recognition dictionary, its recognition rate, cost, and the like. For example, if the comment input item C among the arbitrary input type items is excluded from voice input, the general speech recognition dictionary 16 and the syntax / semantic information dictionary 18 become unnecessary.
[0057]
In the above-described embodiment, “character” and “numeric value” are exemplified as attributes of the input item. However, the attributes are not limited to these, and a speech recognition dictionary may be prepared for each further attribute.
Also, in order to further improve the speech recognition rate, a learning function may be added to the speech recognition function so that data that was previously confirmed by speech recognition and is input frequently is learned and stored, and such data is presented as priority candidates.
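A learning function of this kind could be as simple as counting confirmed inputs and ordering candidates by that count. The following sketch assumes that policy; the class name `InputHistory` and its methods are hypothetical, and the source does not specify how learned data is prioritized.

```python
from collections import Counter


class InputHistory:
    """Counts confirmed inputs so frequent candidates can be offered first."""

    def __init__(self):
        self.counts = Counter()

    def confirm(self, candidate):
        """Record that a recognition result was confirmed by the operator."""
        self.counts[candidate] += 1

    def prioritize(self, candidates):
        """Order candidates by descending frequency; ties keep their order."""
        return sorted(candidates, key=lambda c: -self.counts[c])
```

Since `sorted` is stable, candidates that have never been confirmed stay in their original list order behind the learned ones.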
[0058]
When the input candidate group is large, the item may be a hierarchical candidate selection type item in which the input candidates are organized by classification and, when a desired classification item is selected from a main list displaying the classification items, the corresponding sub-list is displayed.
Applying voice input to such a hierarchical candidate selection type input item makes candidate selection reliable and easy. Conventionally, even though the desired candidate exists in a sub-list in a lower layer, an operator who checks only the main list in the upper layer may conclude that the candidate does not exist; applying voice input reliably prevents this problem. The more layers there are, the greater the effect, and the troublesome operation of opening a list and selecting a candidate at each layer is no longer required.
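The benefit for hierarchical lists can be sketched by matching the input against every sub-list entry at once, so the operator never has to open the layers manually. The two-layer structure, the sample entries, and the similarity measure (`difflib.SequenceMatcher`) are assumptions for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical two-layer candidate list: classification -> sub-list of
# (candidate, reference pattern). Romanized readings stand in for voice patterns.
HIERARCHY = {
    "Head Office": [("General Affairs", "soumu"), ("Accounting", "keiri")],
    "Branches": [("East Sales", "higashieigyou"), ("West Sales", "nishieigyou")],
}


def flatten(hierarchy):
    """Yield (classification, candidate, pattern) for every sub-list entry."""
    for classification, sub_list in hierarchy.items():
        for candidate, pattern in sub_list:
            yield classification, candidate, pattern


def recognize_in_hierarchy(hierarchy, input_pattern):
    """Return (classification, candidate) closest to the input across all layers."""
    best = max(
        flatten(hierarchy),
        key=lambda entry: SequenceMatcher(None, input_pattern, entry[2]).ratio(),
    )
    return best[0], best[1]
```

The returned classification shows which sub-list the candidate came from, which a display routine could use to open the right layer automatically.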
[0059]
In the above-described embodiment, voice input is performed for the designated items constituting the slip. However, an “operation command” for operating the application may also be input by voice and recognized.
In this case, if voice input for a designated item and voice input of an operation command are distinguished according to whether or not an input item is designated on the input screen, both can be recognized without confusing voice recognition for a designated slip item with voice recognition of an operation command.
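The distinction between item input and operation commands can be sketched as a routing step that chooses the vocabulary from the designation state. The command names, item vocabulary, and exact-match lookup below are hypothetical stand-ins for real recognition.

```python
from typing import Optional, Tuple

# Hypothetical vocabularies; exact-match lookup stands in for recognition.
COMMANDS = {"hozon": "save", "insatsu": "print"}
ITEM_CANDIDATES = {
    "department_name": {"eigyou": "Sales", "keiri": "Accounting"},
}


def route_voice_input(designated_item: Optional[str],
                      input_pattern: str) -> Tuple[str, Optional[str]]:
    """Pick the recognition vocabulary from whether an item is designated."""
    if designated_item is None:
        # No input item designated: treat the voice as an operation command.
        return "command", COMMANDS.get(input_pattern)
    # An input item is designated: treat the voice as data for that item.
    return "item", ITEM_CANDIDATES[designated_item].get(input_pattern)
```

Because each state consults only one vocabulary, a command word spoken while an item is designated is simply not recognized rather than being executed by mistake.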
[0060]
In the above-described embodiment, the case where a slip is input has been described as an example. However, the input screen is not limited to a slip input screen and may be any screen; for example, the invention may be applied to an input screen of a user interface such as a GUI.
That is, on an input screen such as a pull-down menu screen or a dialog box screen, when an arbitrary candidate is selected and designated from an input candidate group (application group / command group) listed on the screen, a voice input method that selects and designates the desired application / command by recognizing an input voice may be adopted in addition to selection with a key or a pointing device such as a mouse. This can basically be realized easily in accordance with the above-described flowcharts. As a result, when there are many applications / commands among the input candidates or many similar candidates, they can be handled by voice input, which improves the usability of the user interface and provides an efficient input environment.
[0061]
Further, the above-described data input device with a voice recognition function is not limited to a stand-alone type. It may be a distributed computer system in which the components are physically separated into two or more housings and exchange data via a wired transmission path such as a communication line or cable, or a wireless transmission path such as radio waves, microwaves, or infrared rays.
Further, the present invention is also applicable to a communication system (for example, a client-server system) in which a plurality of data input devices are constantly connected via a wide area network or a private network. In this case, when a client inputs data by voice, the input voice may be transmitted to the data input device functioning as the server, the server may perform the voice recognition processing collectively, and the recognition result may be transferred to the requesting client.
[0062]
On the other hand, a recording medium (for example, a CD-ROM, a floppy disk, a RAM card, or the like) on which program code for executing each of the above-described means is recorded may be provided to the computer.
That is, a computer-readable recording medium may be provided that stores a program for realizing: a function of, when a voice is input for a candidate selection type input area in which a candidate arbitrarily selected and designated from a preset input candidate group is used as input data, limiting the recognition target of the input voice to the input candidate group corresponding to the input area and recognizing the input voice with reference to the voice recognition data group associated with that input candidate group; and a function of inputting, as data for the input area, the closest candidate in the input candidate group based on the recognition result.
[0063]
[Effect of the invention]
According to this invention (the invention of claim 1), when a voice is input for a candidate selection type input area in which a candidate arbitrarily selected and designated from a preset input candidate group is used as input data, the recognition target of the input voice is limited to the input candidate group corresponding to the input area, the input voice is recognized with reference to the voice recognition data group associated with that input candidate group, and the closest candidate in the input candidate group is input as data for the input area. Therefore, when data is input to a candidate selection type input area, a desired candidate can be selected and input not only by selecting it from the input candidate group with a key or a pointing device such as a mouse, but also by recognizing a voice from the operator. For example, when the input candidate group is large or contains many similar candidates, the operator can use voice input, and choosing the data input method that is easiest to use can be expected to improve input efficiency. In addition, because the recognition target of the input voice is limited to the input candidate group corresponding to the input area, the recognition target is narrowed down, the recognition rate is improved, and input efficiency is improved further.
[Brief description of the drawings]
FIG. 1 is a block diagram showing an overall configuration of a data input device with a voice recognition function.
FIG. 2 is a diagram showing a configuration of a slip input screen for creating an in-house slip.
FIG. 3A is a diagram showing the contents of the input item table 10, which defines the input format and the like for each input item on the slip input screen, and FIG. 3B is a diagram showing slip data of one record (one slip) input according to these item definitions.
FIG. 4A is a diagram showing the contents of the comparison memory 13, which stores the voice data (standard patterns) compared with the input voice to recognize it when item data is input by voice for a candidate selection type input item, and FIG. 4B is a diagram showing the set state of the voice input flag 14 and the attribute flag 15 temporarily stored in a work area in the RAM 3.
FIG. 5A is a diagram showing the general speech recognition dictionary 16 and the numerical speech recognition dictionary 17, one of which is selected according to the “attribute” of the item when an arbitrary input type input item is selected and designated as the input target while the voice input flag 14 is set, and FIG. 5B is a diagram conceptually showing the processing system when character recognition is performed using the general speech recognition dictionary 16.
FIG. 6 is a flowchart showing an operation (a slip input process) of the data input device with a voice recognition function.
FIG. 7 is a flowchart showing the slip input process in the data input device with a voice recognition function, following FIG. 6.
FIG. 8 is a flowchart showing the slip input process in the data input device with a voice recognition function, following FIGS. 6 and 7.
[Explanation of symbols]
1 CPU
2 Storage device
4 Communication equipment
5 Input device
6 Display device
7 Voice input device
10 Input item table
11, 12 Input candidate table
13 Comparison memory
14 Voice input flag
15 Attribute flag
16 General speech recognition dictionary
17 Numerical speech recognition dictionary
18 Syntax / Semantic Information Dictionary
A Department code input item
B Department name input item
C Comment input item
D Person in charge name input item

Claims

A data input device that performs data input to a candidate selection type input area in which a candidate arbitrarily selected and designated from a preset input candidate group is used as input data, the data input device comprising:
voice data storage means for storing a voice recognition data group for voice recognition of the input candidate group set corresponding to the candidate selection type input area;
voice recognition means for, when a voice is input for the candidate selection type input area designated as an input target, limiting the recognition target of the input voice to the input candidate group corresponding to the input area and recognizing the input voice by referring to the voice recognition data group associated with that input candidate group; and
input voice processing means for inputting, as data for the input area, the closest candidate in the input candidate group based on the recognition result obtained by the voice recognition means.

The data input device according to claim 1, further comprising conversion means for inversely converting the input candidate group set corresponding to the candidate selection type input area into a voice recognition data group by referring to a speech recognition dictionary prepared in advance,
wherein the voice data storage means stores the voice recognition data group obtained by the conversion means.

The data input device according to claim 1, further comprising discrimination means for discriminating whether the item designated as the input target is a candidate selection type input item, in which a candidate arbitrarily selected from a preset input candidate group is used as input data, or an arbitrary input type input item for inputting arbitrary data,
wherein, when the discrimination means determines that an arbitrary input type input item has been designated as the input target and a voice is input for that input item, the voice recognition means recognizes the input voice by referring to a speech recognition dictionary prepared in advance.

The data input device according to claim 1, further comprising:
dictionary storage means for storing speech recognition dictionaries prepared in advance for each input attribute, according to the input attribute set corresponding to an arbitrary input type input item for inputting arbitrary data; and
determination means for determining, when a voice is input for the arbitrary input type input item, the input attribute set corresponding to that input item,
wherein the voice recognition means selects, from the speech recognition dictionaries for each attribute, the speech recognition dictionary corresponding to the input attribute determined by the determination means, and recognizes the input voice by referring to the selected speech recognition dictionary.

The data input device according to claim 1, wherein, when the input voice cannot be recognized as a result of recognition by the voice recognition means, the device shifts to an input format in which a candidate arbitrarily selected and designated from the input candidate group by an input device other than voice input is used as input data.

A program for causing a computer to realize:
a function of, when a voice is input for a candidate selection type input area in which a candidate arbitrarily selected and designated from a preset input candidate group is used as input data, limiting the recognition target of the input voice to the input candidate group corresponding to the input area and recognizing the input voice by referring to the voice recognition data group associated with that input candidate group; and
a function of inputting, as data for the input item, the closest candidate in the input candidate group based on the recognition result.