JP2004509385A

JP2004509385A - An input device for voice recognition and intelligibility using key input data.

Info

Publication number: JP2004509385A
Application number: JP2002505609A
Authority: JP
Inventors: ディーン、ラリー・ライス・ジュニア
Original assignee: Minebea Co Ltd
Current assignee: Minebea Co Ltd
Priority date: 2000-06-23
Filing date: 2001-06-22
Publication date: 2004-03-25
Also published as: CN1237509C; CA2407930A1; WO2002001551A1; WO2002001551A9; EP1312077A1; EP1312077A4; HK1059332A1; AU2001270088A1; CN1451156A; TW514824B

Abstract

キー入力（１３）と音声データ（１４）の両者をコンピュータへ送るインターフェースコントローラ（１２）を持つ入力デバイスが開示されている。コンピュータは分割した処理を行うためにデータを分けることができ、音声または言語認識のための音声明瞭度処理を含んでいる。入力デバイスは音声認識キーボードを持つことができ、音声認識処理はキーボードとローカルに置かれており、そのキーボードはマルチメディアの電子デバイスを遠隔で制御できる。入力デバイスへの口語コマンドはインターネットアクセスを始めることができる。本発明はさらに音声または言語認識用の単一のインターフェースによってコンピュータシステムにキー入力及び音声入力（１７）を提供する方法、口語をテキストに変換する方法、インターネットにつながれたコンピュータからインターネットへのアクセスを提供する方法、または少なくともひとつの電子デバイスを口語コマンドにより遠隔で制御する方法、も熟考している。

An input device having an interface controller (12) for sending both key input (13) and audio data (14) to a computer is disclosed. The computer can separate the data for separate processing, including speech intelligibility processing for voice or language recognition. The input device can be a speech recognition keyboard in which the speech recognition processing is located locally in the keyboard, and the keyboard can remotely control multimedia electronic devices. Spoken commands to the input device can initiate Internet access. The invention further contemplates a method of providing key input and audio input (17) to a computer system through a single interface for voice or language recognition, a method of converting spoken words to text, a method of providing access to the Internet from an Internet-connected computer, and a method of remotely controlling at least one electronic device by spoken commands.

Description

【０００１】
【発明の属する技術分野】
本発明は電子デバイスとインターフェースするための入力デバイスに関連するものである。本発明はさらに特にコンピュータキーボード、音声・言語認識システム、そして電子装置の制御システムに関連している。
【０００２】
【従来の技術】
従来技術として音声検出機能はキーボードに統合されることが知られている。例えばＷｈｅｌｐｌｅｙＪｒ．は米国特許第５６５９６６５号（以下Ｗｈｅｌｐｌｅｙ６６５パテントと呼ぶ）で、キーボードとコンピュータシステムとの間のデータケーブルに外部デバイスを挿入することを開示している。この外部デバイスは音声信号をコンピュータのキーボードポートに送られるキー入力データへ変換することと、一方で、キーボード自身からの通常の信号は外部デバイスを通過していつものとおりにコンピュータへ行くことを許可することにより、コンピュータに音声認識機能を付加している。Ｗｈｅｌｐｌｅｙ６６５パテントで開示されている音声認識システムは音声入力がその中に組み込まれており、音声入力信号の処理用として音声認識のハードウェアは外部デバイスに閉じ込められている。音声コマンドはキーボードデータと等しいとき以外はコンピュータには決して送られず、音声認識機能を実行する際にコンピュータ処理時間は消費されない。
【０００３】
Ｗｈｅｌｐｌｅｙ６６５パテントではまた、キーボードハウジング内に音声認識デバイスが含まれている実施例を開示している。外面的な実施例と同様に、そのデバイスは音声信号をキー入力データに変換したり、キーボードから来るデータストリームにデータを挿入したりする働きをする。Ｗｈｅｌｐｌｅｙ６６５パテントで開示された組み合わせは、音声認識機能をコンピュータに対して透明となるように事実上偽装する。従って音声信号は決してキーボードケーブルによって転送されない。
【０００４】
【発明が解決しようとする課題】
Ｗｈｅｌｐｌｅｙ６６５パテントで開示されているようなキーボード操作を模擬するだけのどんな従来技術のデバイスでも、キーボード入力やキー入力の組み合わせで実行できるコマンドに対しては機能的に事実上の限界がある。コンピュータに直接そして音声認識システムとは無関係につながれた余計なマイクロフォンやスピーカの組み合わせなしでは音声信号自身の操作を実行することがコンピュータにとってできないので、このことは大きな欠点となる。
【０００５】
音声認識は発展していく技術であることがまた知られている。
現在実行されているものは、特定のユーザの音声を正確に認識するために設計された音声依存型のものと、いかなる音声に対しても正確に機能するように設計された音声非依存型のものがある。この分野での進歩は、言語をキー入力に変換する際のさまざまな発音、話し方、アクセントの変化による不正確さを減らしていることである。音声認識のいかなる組み込み手段もハードウェアの限界となりつつあり、技術の進歩とともにアップグレードすることが高価になりそうである。
【０００６】
さらに基本的なテキスト入力のような機能は、特定ユーザの音声を正確に見分けるために現在の音声認識システムをトレーニングする必要により、非常に複雑である。Ｇａｌｖｉｎによる米国特許第５８７４９３９号（Ｇａｌｖｉｎ９３９パテント）ではトレーニングを容易にするディスプレイ付の音声認識キーボードを開示している。システムによって口語をテキストに変換する際には、ユーザはそれらをディスプレイ上で正確にチェックでき、この方法ではシステムオーバータイムの正確さを改善できる。
【０００７】
Ｗｈｅｌｐｌｅｙ６６５パテントと同じように、Ｇａｌｖｉｎ９３９パテントで開示されている音声認識システムは組み込みのものであり、音声データをコンピュータに転送はしない。結果としてトレーニング手順は必然的にキーボードとのインターフェースで局部的な制限が必要となり、開示されたように貧弱なまたは高価なものとなり、間違いなくコンピュータに接続されたディスプレイと同じようにキーボードディスプレイを準備する問題に関連する。
【０００８】
Ｗｈｅｌｐｌｅｙ６６５パテントやＧａｌｖｉｎ９３９パテントで開示されているような組み込みのアプローチは、コンピュータシステムとは分離された外部プロセッサに言語処理を制限することにより、システム効率の改善を図っており、そしてより古いコンピュータシステムに適しているであろうとはいえ、現行のコンピュータシステムや利用可能なハイビットレートのインターフェースでの実用的な使用に対しては十分な柔軟性がなく、他の音声関連技術の使用には適さない。
【０００９】
同じインターフェースによって音声信号とキーボード信号をコンピュータに同時に転送できる低価格の装置に対する要望がある。また、ディジタル技術のような他の音声処理機能をサポートするのと同じようにキーボード信号を模擬できる音声明瞭度認識システムに対する要望や、遠隔デバイス制御に対する要望もある。キーボードと近くない環境でのこれらの機能に対する要望がある。また、キーボード内で、あるいはアプリケーションに依存したコンピュータシステムとともに機能することができる音声明瞭度認識システムに対する要望もある。
【００１０】
【課題を解決するための手段】
本発明は、コンピュータキーボードや音声データを持ったマイクロフォンやテレフォンハンドセットのようなその他の周辺インターフェースから音声明瞭度認識システムにテキスト入力としてキー入力データを統合し、電子装置を制御し、そしてインターネットへアクセスすることに向けられている。
【００１１】
本発明のひとつの実施例としては通常のコンピュータキーボードのような入力デバイスを提供することであり、そのデバイスは音声信号入力とインターフェースコントローラを共有するために、そしてキー入力データを運ぶものと同じケーブルによってコンピュータシステムへ信号を送ることを可能にするために適合される。
【００１２】
コンピュータへ１本のケーブルによってキーボード信号と音声信号の両方を伝送することは、ユニバーサルシリアルバス（ＵＳＢ）コントローラのような十分高いビットレートを持つキーボードインターフェースコントローラと、たとえばマイクロフォンからのアナログ音声信号をディジタルに変換するコントローラに接続された音声プロセッサを使うことにより達成される。この方法ではＵＳＢキーボードはコンピュータとの間の音声信号のコンジットとなる。そして音声処理を行うためにコンピュータシステムで現在使われている音声やサウンドのカードのような現存する周辺デバイスの追加や置き換えそれ自体が可能となる。
【００１３】
音声プロセッサへのスピーカのような音声出力の接続は、テレコミュニケーション分野と互換性のある多重音声転送を可能とする。音声プロセッサにメモリを増設することにより、入力デバイスに対して音声メッセージ化、音声プリント認識、そして音声認識の機能を拡張できる。
【００１４】
本発明の目的は、コンピュータシステム内で使うための通信信号を供給するインターフェースコントローラと、インターフェースコントローラに接続されたキー入力手段と、インターフェースコントローラに接続された音声入力手段とを備えるコンピュータシステムに対して入力デバイスを提供することにより達成され、キー入力手段と音声入力手段はコンピュータにキーボードデータや音声データを送るためのインターフェースコントローラを共有している。
【００１５】
本発明のさらなる実施例は、音声とキーボードの情報を受けるために単一の入力デバイスを用いたコンピュータが、テキストとしての出力用に音声認識技術を使って、または、対応する操作命令を実行することにより、音声入力を処理する、音声認識システムである。上述のように入力デバイスはＵＳＢキーボードのようなものであり、音声プロセッサはアナログ音声入力に適応するためにその間に置かれている。音声プリント認識を容易にするためにメモリを入力デバイスやコンピュータに増設できる。入力デバイスとコンピュータとの間の信号の符号化はインターフェースを詮索することを防ぐために使うことができる。また音声認識は、コンピュータと並列な入力デバイスにより、システムの計算負荷を分散したり、または音声認識機能の精度を高めたりすることを達成できる。
【００１６】
さらにここでは、入力処理手段と出力手段につながれた音声認識手段を持つコンピュータシステムと、コンピュータの入力処理手段につながれた少なくともひとつのインターフェースコントローラに結合され共有しているキー入力手段と音声入力手段を持つ音声信号とキー入力信号を転送するための入力デバイスと、を備える音声認識システムを提供しており、音声認識手段は音声信号を処理し、出力手段に対して出力を生成する。
【００１７】
さらなる実施例では、ひとつ以上の外部デバイスへの入力と同じように、コンピュータにデータを転送できる信号化手段をもつキーボード内に音声認識が組み込まれている。この実施例に従った代表的な処理は、コンピュータとのインタ−フェースにＵＳＢ技術を使ったものや、テレビを制御するために使われるような赤外線（ＩＲ）、またはコードレスフォンのような適用での無線周波数（ＲＦ）転送などである。言語認識を行うために必要なコンポーネントは、キーボードインターフェースによってコンピュータへ言語をテキスト転送へ変換するために、そしてまた互換性のある電子デバイスに対して遠隔制御された音声を提供するために、キーボード内に含まれている。
【００１８】
そしてその上、キー入力や少なくともひとつの電子デバイスを制御するための操作命令を生成するためのコンピュータシステム用の音声認識キーボードを提供しており、コンピュータシステムへの転送に適した第１のタイプと、少なくともひとつの電子デバイスへの転送に適した第２のタイプの２タイプの信号転送が可能な信号化手段と、転送用に信号化手段へのキー入力を提供するために信号化手段に接続された複数のキーと、音声入力手段と、信号手段による転送用に口語を操作命令に変換するために音声入力手段に接続された音声認識手段と、を備えている。
【００１９】
信号処理や音声認識を扱う音声処理回路と同じように、キーボードに音声入力を提供するためには少なくともひとつの音声入力デバイスが必要となる。前記の実施例のように、メモリ、符号化回路、そしてキーボードへの音声出力手段を加えることは、音声プリント比較や、ディジタル電話方式や音声データエントリを確実にするのに適した双方向音声伝達を可能としている。
【００２０】
次の実施例では、インターネットアクセスのために形づくられたコンピュータにつながれる時にキーボードが提供されるように、入力デバイスは特定の口語コマンドを受信するとインターネットへのアクセスを始めるであろう。
【００２１】
インターネットに接続されるために形づくられたコンピュータからインターネットへ音声アクセスを提供するための音声認識入力デバイスを提供しており、音声入力手段と、インターネットにアクセスを始めるために定義された少なくともひとつの口語コマンドを認識するための音声認識手段と、コマンドの実行トリガのための信号手段と、を備えている。
【００２２】
本発明はさらに、単一のインターフェースによってキー入力と音声入力をコンピュータシステムに提供する方法、口語をテキストに変換する方法、インターネットにつながれたコンピュータからインターネットへアクセスを提供する方法、そして少なくともひとつの電子デバイスを口語コマンドで遠隔制御する方法についても熟考している。
【００２３】
【発明の実施の形態】
上述の内容や、本発明の他の特徴、外観、利点は、添付図と共に読まれる以下の記述より明らかとなるであろう。図中の参照番号は同一要素を示している。
【００２４】
さて図１を参照すると、数字１０で一般に示されている本発明による言語入力デバイスの好ましい実施例の電気的なブロック図が示されている。言語入力デバイス１０に対してケース１１が図で示されている。コンピュータシステムは数字２０で一般に表されている。コンピュータ２０は入力プロセッサ２１を持っており、言語入力デバイス１０のインターフェースコントローラ１２と双方向通信でつながれている。インターフェースコントローラ１２はキー入力１３と音声入力１７からの信号を受信するために接続されている。
【００２５】
音声プロセッサ１４は音声入力１７とインターフェースコントローラ１２との間に接続されている。音声プロセッサ１４とインターフェースコントローラ１２との間は双方向接続が存在する。インターフェースコントローラ１２と音声プロセッサ１４はシングルチップ内に結合することができる。機能上は同等である。音声出力１８とメモリ１６は音声プロセッサ１４につながれている。音声入力１７と音声出力１８は図１のケース１１内に示されているとはいえ、本発明の機能や精神に影響を与えることなく、それらはケースの外側に存在しうることが理解できる。
【００２６】
動作中には、音声入力１７から受信した信号は音声プロセッサ１４でディジタル化され、音声データはインターフェースコントローラ１２に送られる。インターフェースコントローラ１２はキー入力１３から受信したキー入力データとこの音声データを結合して、ひとつのデータ信号としてコンピュータシステム２０内の入力プロセッサ２１にそれらを転送する。ＵＳＢに関連した近年のデバイスでは、キーボードとマイクロフォンは複合したＵＳＢデバイスとしてふたつの異なるＵＳＢエンドポイントを使うので、キーボードデータと音声データは同時である必要はない。音声データは同時でも非同時でもよい。もしデータが同時であれば、ＵＳＢバス上で他のすべてのデータタイプのものよりも優先権を与えられるであろう。オーディオデータなどはまた非同時であり、他のＵＳＢデバイスをエミュレートする。どのインターフェース方法が使われるかにかかわらず、音声データとキー入力データは同時にまたはどちらか一方が一度に送られる。音声データとキー入力データは単一のケーブルによってコンピュータシステムに転送される。
【００２７】
同様に、コンピュータ２０からインターフェースコントローラ１２で受信された音声データを含む信号は、音声出力１８に適したアナログ信号への変換のために音声プロセッサに送られる。もっとも単純なケースとしては、音声プロセッサは音声データ入力用のＡＤコンバータとして、そして音声データ出力用のＤＡコンバータとして機能する。
【００２８】
音声入力１７から受信した信号は、人間の言語をテキストに変換する言語認識や、人間の言語をコマンドに変換したり確認を実行したりする音声認識のような音声明瞭度認識のいかなるタイプに対しても処理される。
【００２９】
メモリ１６は音声プリントサンプルまたは音声認識データを含んでおり、音声プロセッサ１４が音声入力１７からくる音声信号の音声プリント比較を行ったり、その結果音声認識機能を実行したりすることを可能にしている。
【００３０】
図２は図１と同様に本発明の電気的なブロック図を示しており、ここでは言語入力デバイス１０はＵＳＢキーボードをベースとしている。ＵＳＢキーボードの標準的な構成要素は代表的な形に並べられており、キー入力１３とＬＥＤ信号１５がつながれそしてＵＳＢ出力１９を持ったインターフェースコントローラ１２に含まれている。
【００３１】
種々のクラスのＵＳＢデバイスが存在する。ＵＳＢキーボードそれ自身はＨＩＤ（ヒューマンインターフェースデバイス）のクラスになる。他のクラスはオーディオデバイス、コミュニケーションデバイス、ディスプレイデバイス、そしてマスストレージデバイスのクラスが含まれる。オーディオデバイスクラスはＵＳＢマイクロフォン用のデバイスディスクリプタと定義される。ＵＳＢオーディオデバイスクラスのディスクリプタを使うことはＵＳＢマイクロフォンをサポートする標準的な方法であり、一括転送をサポートするフルスピード１２メガビット／秒のＵＳＢインターフェースチップを必要とする。このようなチップは低速のチップよりも高価となるが、マイクロフォン用のより広いレンジのＵＳＢシステムドライバとの互換性を保証している。しかしながら本発明では同時、非同時両方でのデータ転送を熟考しており、どのようなインターフェースでもその基本的な機能性を変えることなくこれらのＵＳＢデバイスクラスを取り替えることができる。
【００３２】
本発明はＵＳＢキーボードに対して、インターフェースコントローラ１２につながれた音声プロセッサ１４を付加しており、各々マイクロフォンとスピーカとして表された音声入力１７と音声出力１８がつながれている。音声入力１７はオンボードマイクロフォン（すなわちキーボードに統合されているもの）またはジャックコネクタにプラグを差し込むマイクロフォンとなりうる。またメモリ１６は音声プロセッサ１４に接続されている。
【００３３】
言語入力デバイス１０の動作は図１で記述したものと同等であり、キー入力１３から受信したデータと音声入力１７から受信した音声データをインターフェースコントローラ１２で結合し、音声プロセッサ１４によってディジタル化する。ディジタル信号はインターフェースコントローラ１２に入り、信号をパケット化するＵＳＢプロセッサとして働く。最終的な信号は、同様のＵＳＢインターフェースを持ついかなるコンピュータシステムとも互換性のある標準的なＵＳＢ技術に従って音声パケットとしてＵＳＢケーブル１９によって転送される。パケット化の可能なフォーマットのひとつとしては、１６ビット８ＫｈｚＰＣＭのＵＳＢマイクロフォンオーディオデータフォーマットがある。しかしながら他に多くの可能なオーディオフォーマットがある。ＵＳＢケーブル１９を通って入ってくる音声信号は、一定の順路に従って音声出力１８を通って出力されるために、インターフェースコントローラ１２によって音声プロセッサ１４に送られる。音声プロセッサ１４とインターフェースプロセッサ１２は機能上の影響なく、シングルチップに結合することができる。ＬＥＤ１５は一般的にＮｕｍＬｏｃｋやＣａｐｓＬｏｃｋのようにキーボードの表示ライトとして使われるが、音声入力デバイス１０の他の状況の機能を表示するためにも同様に使うことができる。
【００３４】
この構成では、音声入力デバイス１０はまさに標準のＵＳＢキーボードとして機能し、またＵＳＢに準拠したコンピュータに追加でインターフェースを付加することなく音声信号に対して２ウェイコンジットとして機能する。さらに、音声入力１７と音声出力１８はテレフォンハンドセットとして使われるレシーバユニットに結合されている。音声プロセッサ１４にメモリ１６を追加することは、システムが入ってくる信号とメモリ１６に格納されている信号とを比較することによってユーザの同一性をチェックしたり、特定の認定されたユーザに音声入力デバイス１０へのアクセスを制限したりすることを可能としている。この構成ではまた、単一の入力デバイスの簡潔さを保ったまま、音声プロセッサ１４にいかなる音声明瞭度認識の機能性をも付加することができる。
【００３５】
次に図３を参照すると、入力デバイス１０を使った音声明瞭度認識システムの好ましい実施例がブロック図で示されており、コンピュータシステム２０は出力手段２３がつながっている音声サブシステム手段２２に接続された入力プロセッサ２１を持っている。入力デバイス１０は基本的には図１と同様の基本構成要素を持っており、入力プロセッサ２１への単一のインターフェースによってキー入力や音声データをコンピュータシステム２０に同様に提供する働きをする。音声データは音声サブシステム手段２２に転送するために入力プロセッサ２１によって分離される。この実施例によると、音声サブシステム手段は命令のセットの形をとってソフトを実行するコンピュータシステムのネイティブ処理手段を備えているか、もしくは特別なハードまたは両者の組み合わせからなっている。
【００３６】
出力手段２３は、例えば口語をモニタ上に表示するテキストに変換して音声サブシステム手段２２の出力を直接提供したり、あるいはもっと間接的にはワードプロセッサに対する音声書き取りのようになり、プリンタへ出力したりすることができる。先の実施例のように入力デバイス１０内のインターフェースコントローラ１２はＵＳＢ技術に従って信号を生成するために適応されている。同様に音声プロセッサ１４はアナログからディジタルへの変換を提供するために、音声入力１７とインターフェースコントローラ１２との間につながれている。音声プロセッサ１４がインターフェースコントローラ１２から受信したディジタル音声信号をスピーカへの出力に適したアナログ信号に変換するケースでは音声出力１８が提供される。
【００３７】
先の実施例のように、音声プロセッサ１４の機能は信号モード変換に制限を持たせる必要がない。メモリ１６の追加は、入力デバイス１０でローカルに音声プリント比較や言語認識のような音声を処理することを可能にする。音声プリント比較は入ってくる音声信号をメモリ１６内に記録されている音声プリントと比較するために使うことができる。音声プロセッサ１４によるローカルな音声明瞭度認識は音声サブシステム２２とともに用いることができ、この場合には入力デバイス１０はコンピュータシステム２０のコプロセッサとして機能する。
【００３８】
出力手段２３上で表示されるユーザインターフェースはウィンドウズベースとすることができ、音声コマンドが表示されたメニューをプルダウンすることによって動かすことができる。従ってユーザはシステムで認識されるコマンドを記憶する必要はない。この方法は音声明瞭度認識システムをよりユーザフレンドリーとすることができる。もちろんいかなるディスプレイインターフェースも使用可能であり、またインターフェースは本発明の機能性を変えることなく特殊な機能を実行するためにカスタマイズすることができる。
【００３９】
図４は図３の音声明瞭度認識システムの変更実施例を示しており、図示されたケース１１を持つ入力デバイス１０は、キー入力１３からのキー入力データを受信するためにそして前記音声プロセッサからの音声入力を送受信するために接続されたインターフェースコントローラ１２を持つ。インターフェースコントローラ１２はディジタル音声信号をコンピュータシステム２０へケーブルによって送信するためにパケット化する。音声入力１７と音声出力１８は、コンピュータシステム２０と同一のケーブルによってインターフェースコントローラ１２を通ってくる音声プロセッサ１４からのアナログ音声データを各々送受信している。先の実施例のように、メモリ１６は音声プリント比較や音声認識用のデータを蓄えておくために音声プロセッサ１４によって使われる。
【００４０】
この実施例では、信号符号化手段１９は入力デバイス１０とコンピュータシステム２０との間を転送されるデータの符号化／復号化を行うために入力デバイス１０に追加されている。同様に符号化手段２５がコンピュータシステム２０内に同様の理由で付加されている。符号化手段１９や２５はソフト制御された符号化回路またはプロセッサのいかなる既知のタイプのものとなり得る。符号化手段１９はインターフェースコントローラ１２内に統合することができ、それによりインターフェースコントローラ１２を使ってデータのパケット化と符号化の両方ができる。
【００４１】
コンピュータシステム２０内では、入力プロセッサは入力デバイス１０から音声データとキー入力データを受信し、音声信号を音声認識手段２２に渡す。メモリ２４は音声サブシステム手段２２につながれ、それによって記憶回路として仕えていることが示されている。出力手段２３は音声サブシステム手段２２の出力を表示するために使われるコンピュータシステム２０の表示手段である。この実施例では、Ｉ／Ｏコントローラ２６はコンピュータシステム２０と、インターネット接続、テレフォンサービス、そして周辺機器のようなコンピュータシステム２０の外部のデータソースとの間のゲートウェイインターフェースとして示されている。
【００４２】
この構成では、音声入力１７と音声出力１８はオンライン電話方式を使用するためにレシーバユニット内に結合することができる。これに関係して、インターネットプロバイダによる音声（ＶＯＩＰ）が得られる。さらにＩ／Ｏコントローラ２６の機能はローカルエリアネットワーク（ＬＡＮ）カードのようなインターフェースカードを使うことにより達成できる。
【００４３】
この実施例では、音声明瞭度認識システムは入力デバイス１０からコンピュータシステム２０への音声やキー入力の通信に対して安定した手段を提供するために機能している。インターフェースコントローラ１２と入力プロセッサ２１との間のインターフェースを詮索するいかなる試みも符号化手段１９，２５の動作に従う。
【００４４】
次に図５を参照すると、入力デバイスが音声認識キーボード１０である場合の本発明の実施例のブロック図が示されている。このような音声認識キーボード１０の場合は図的には１１として表されており、キー入力１３と音声プロセッサ１４からのデータを受信するインターフェースコントローラ１２を内部に持っている。音声プロセッサ１４は音声入力１７からの音声信号を受信し、そしてメモリ１６とつながれている。
【００４５】
音声プロセッサ１４は音声入力１７から音声信号として入ってくる言語を受信し、インターフェースコントローラ１２を通じてコンピュータシステム２０や電子デバイス３０への転送に適したコマンドへ口語を変換する。インターフェースコントローラ１２はキー入力１３と音声プロセッサ１４からのデータを結合し、それらを選択的にコンピュータシステム２０や電子デバイス３０に転送する。
【００４６】
この実施例によると、音声認識キーボード１０は通常のキーボードとして使うことができ、コンピュータシステム２０へキー入力を提供するためのＵＳＢのようなインターフェースを持っている。しかしさらに、キー入力からの入力はコントロールデータとなることができ、さらにまた、ＤＶＤやＣＤプレーヤー、ステレオ、ＶＣＲ、セットボックスまたはテレビのような電子デバイスに対して指示することができ、無線周波数（ＲＦ）または赤外線（ＩＲ）またはＵＳＢケーブルのようなインターフェースを装備したインターフェースコントローラ１２は、制御されるデバイス３０との互換性を有している。
【００４７】
さらに、音声入力１７を通じて受信された口語コマンドはそれらのキー入力と同等なものに処理され、インターフェースコントローラ１２に送られる。インターフェースコントローラは音声プロセッサ１４から中継された入力をあたかもキー入力１３からのデータと同じように処理し、ＵＳＢインターフェース，ＲＦまたはＩＲリンクのようなコンピュータインターフェースを経由して、必要であれば任意のその他のふさわしいインターフェースを通じてコンピュータシステム２０または電子デバイス３０に転送する。従って口語コマンドはＤＶＤやＣＤプレーヤー、ステレオ、ＶＣＲ、セットボックスまたはテレビのようなマルチメディア電子デバイスのいかなるタイプのものも遠隔制御できる。
【００４８】
利用可能なキーボードコマンドや入力のセットは利用可能な音声コマンドや入力すべてのサブセットである。キーボードは常に既定の制約されたキーやキーの組み合わせの数を持つが、音声コマンドや入力は物理的には制限を持たない。キーボード入力が一般的には数百のコードで制限されるのに対して、音声コマンドは数千となることができる。
【００４９】
本発明のキーとなる特徴は、音声パケットがＵＳＢケーブルによって転送されるところであり、音声コマンドをキーボードコードに変換し、それからケーブルによってキーボードコードを送るのとは対照的である。音声コマンドセットは、従来はキーボードコードには制限があったのだが、キーボードコードに制限されない。サポートされる音声コマンドの数はシステムメモリや事前に記録されたコマンドの数によって制限されるのみである。事前に記録された音声コマンドの数はインターネット経由で新しいコードをダウンロードすることにより更新できるため、音声コマンドの数は制限なしに増やすことができる。音声コマンドの別個のセットは特別なｗｅｂサイトでサポートされるｗｅｂサイト音声コマンドにユーザがアクセスする時に自動的にダウンロードできる。
【００５０】
図６はインターネットへの音声アクセスを提供することに対して示された本発明の代替の実施例を示している。この実施例では、入力デバイス１０はメモリ１６と音声入力１７がつながれた音声プロセッサ１４と接続されたインターフェースコントローラ１２を持ち、ケース１１の中に図示されている。この実施例では、口語コマンドは音声プロセッサ１４により対応するデータに変換され、インターフェース経由でコンピュータシステム２０に送られる。コンピュータシステム２０はインターネットにつながる構成をしている。従ってインターネットアクセスが音声で起動できる。入力デバイス１０はキーボード上に新しいキーを置き換えたり増やしたりするインターネットアクセスの音声起動を持つキーボードとなる。この実施例は、音声コマンドを、インターネットを操縦する際の使用のためにコンピュータシステム２０によって認識されるコマンドに変換することを可能にしている。本発明のこの実施例を用いて、ワールドワイドネットワークのコンピュータはアクセス可能となる。
【００５１】
本発明は、さらに音声とキー入力データに対して単一のインターフェースを用いる方法を熟考している。コンピュータは別々の処理を行うために、結合された信号をディジタル音声信号のコンポーネントとキー入力信号のコンポーネントに分ける。音声プリント比較がなされる。本発明は音声コマンドによりインターネットアクセスを起動する方法も含んでいる。このような音声コマンドはキーボードへの入力となることができる。方法はまた、口語コマンドを用いてキーボードから電子デバイスを遠隔制御するために熟考されている。
【００５２】
本発明は、好ましい実施例に関して述べられているとはいえ、本発明の精神と領域内で応用例や変更例を熟考していることは、技術に熟練した人には明らかであろう。好ましい実施例による図や記述は発明の領域に制限をかけるよりもむしろ一例としてなされており、発明の精神や領域内で類似の変化・変形はすべてカバーしたつもりである。
【図面の簡単な説明】
【図１】本発明に従った入力デバイスのブロック図である。
【図２】ＵＳＢキーボード内に本発明による言語入力デバイスを統合した実施例のブロック図である。
【図３】音声認識手段を持つコンピュータシステムに接続された本発明に従った入力デバイスの実施例のブロック図である。
【図４】さらに外部のサービスに接続されたコンピュータシステムに接続された本発明に従った入力デバイスの実施例のブロック図である。
【図５】統合された入力デバイスがリモートコントロール機能を有した音声認識キーボードである本発明の実施例のブロック図である。
【図６】インターネットへのアクセスを伴うコンピュータに接続された本発明の実施例のブロック図である。
[0001]
[Technical field of the invention]
The present invention relates to an input device for interfacing with an electronic device. The invention more particularly relates to computer keyboards, speech and language recognition systems, and control systems for electronic devices.
[0002]
[Prior art]
It is known in the prior art to integrate a voice detection function into a keyboard. For example, U.S. Pat. No. 5,659,665 to Whelpley Jr. (hereinafter the Whelpley 665 patent) discloses inserting an external device into the data cable between a keyboard and a computer system. This external device adds a speech recognition function to the computer by converting audio signals into keystroke data that is sent to the computer's keyboard port, while allowing normal signals from the keyboard itself to pass through the external device to the computer as usual. The speech recognition system disclosed in the Whelpley 665 patent has the audio input built into it, and the speech recognition hardware that processes the audio input signal is confined to the external device. A voice command is never sent to the computer except as its keyboard-data equivalent, and no computer processing time is consumed in performing the speech recognition function.
[0003]
The Whelpley 665 patent also discloses an embodiment in which the speech recognition device is contained within the keyboard housing. As in the external embodiment, the device converts audio signals into keystroke data and inserts that data into the data stream coming from the keyboard. The combination disclosed in the Whelpley 665 patent effectively disguises the speech recognition function so that it is transparent to the computer. The audio signal itself is therefore never transmitted over the keyboard cable.
[0004]
[Problems to be solved by the invention]
Any prior art device that merely simulates keyboard operation, such as that disclosed in the Whelpley 665 patent, is in practice functionally limited to commands that can be executed by keyboard input or by combinations of keystrokes. This is a major drawback, because the computer cannot operate on the audio signal itself without an additional microphone and speaker combination connected directly to the computer, independently of the speech recognition system.
[0005]
Speech recognition is also known to be an evolving technology. Current implementations are either voice-dependent, designed to accurately recognize the voice of a particular user, or voice-independent, designed to work accurately for any voice. Advances in this field are reducing the inaccuracies caused by variations in pronunciation, speaking style, and accent when speech is converted to keystrokes. Any built-in implementation of speech recognition is becoming limited by its hardware and is likely to be expensive to upgrade as the technology advances.
[0006]
Further, even basic functions such as text input are made considerably more complex by the need to train current speech recognition systems to accurately identify a particular user's voice. U.S. Pat. No. 5,874,939 to Galvin (the Galvin 939 patent) discloses a speech recognition keyboard with a display to facilitate training. As the system converts spoken words to text, the user can check them for accuracy on the display, and in this way the accuracy of the system can be improved over time.
[0007]
Like the Whelpley 665 patent, the speech recognition system disclosed in the Galvin 939 patent is built-in and does not transfer audio data to a computer. As a result, the training procedure is necessarily confined to interfacing locally with the keyboard, which makes it either rudimentary or expensive as disclosed, and it certainly raises the problem of having to provide a keyboard display in addition to the display already connected to the computer.
[0008]
Built-in approaches, such as those disclosed in the Whelpley 665 and Galvin 939 patents, seek to improve system efficiency by confining speech processing to an external processor separate from the computer system. Although they may be suitable for older computer systems, they are not flexible enough for practical use with current computer systems and the high-bit-rate interfaces now available, and they are not suited to use with other audio-related technologies.
[0009]
There is a need for a low-cost device that can transfer audio signals and keyboard signals to a computer simultaneously over the same interface. There is also a need for a speech intelligibility recognition system that can simulate keyboard signals as well as support other audio processing functions such as digital technology, and a need for remote device control. There is a need for these functions in environments that are not close to a keyboard. There is also a need for a speech intelligibility recognition system that can function within a keyboard or together with the computer system, depending on the application.
[0010]
[Means for Solving the Problems]
The present invention is directed to integrating key input data from a computer keyboard, as text input, with audio data from other peripheral interfaces such as a microphone or telephone handset into a speech intelligibility recognition system, to controlling electronic equipment, and to accessing the Internet.
[0011]
One embodiment of the present invention provides an input device, such as a conventional computer keyboard, that is adapted to share its interface controller with an audio signal input and to send the audio signals to the computer system over the same cable that carries the key input data.
[0012]
Transmitting both keyboard and audio signals to a computer over a single cable is achieved by using a keyboard interface controller with a sufficiently high bit rate, such as a Universal Serial Bus (USB) controller, together with an audio processor connected to the controller that converts analog audio signals, for example from a microphone, to digital form. In this way the USB keyboard becomes a conduit for audio signals to and from the computer. It can therefore supplement or replace existing peripheral devices, such as the voice and sound cards currently used in computer systems for audio processing.
[0013]
Connecting an audio output, such as a loudspeaker, to the audio processor allows for multiple audio transfers compatible with the telecommunications field. Adding memory to the audio processor extends the input device with voice messaging, voice print recognition, and speech recognition functions.
[0014]
The object of the present invention is achieved by providing an input device for a computer system, comprising an interface controller that supplies communication signals for use in the computer system, key input means connected to the interface controller, and audio input means connected to the interface controller, wherein the key input means and the audio input means share the interface controller for sending keyboard data and audio data to the computer.
[0015]
A further embodiment of the present invention is a speech recognition system in which a computer, using a single input device to receive audio and keyboard information, processes the audio input with speech recognition technology, either for output as text or by executing corresponding operating instructions. As described above, the input device may be a USB keyboard with an audio processor interposed to accommodate analog audio input. Memory can be added to the input device or to the computer to facilitate voice print recognition. Encoding of the signals between the input device and the computer can be used to prevent snooping on the interface. Speech recognition can also be performed by the input device in parallel with the computer, either to distribute the computational load of the system or to improve the accuracy of the speech recognition function.
[0016]
There is further provided a speech recognition system comprising a computer system having speech recognition means connected to input processing means and to output means, and an input device for transferring audio signals and key input signals, the input device having key input means and audio input means that are coupled to, and share, at least one interface controller connected to the input processing means of the computer. The speech recognition means processes the audio signals and generates an output to the output means.
[0017]
In a further embodiment, speech recognition is built into a keyboard whose signaling means can transfer data to a computer as well as provide input to one or more external devices. Typical implementations of this embodiment use USB technology for the interface with the computer, infrared (IR) as used to control a television, or radio frequency (RF) transmission as used in applications such as cordless phones. The components required to perform language recognition are contained within the keyboard, both to convert speech to text for transfer to the computer through the keyboard interface and to provide voice-operated remote control of compatible electronic devices.
[0018]
There is further provided a speech recognition keyboard for a computer system, for generating key input and operating instructions for controlling at least one electronic device, comprising: signaling means capable of two types of signal transfer, a first type suitable for transfer to the computer system and a second type suitable for transfer to the at least one electronic device; a plurality of keys connected to the signaling means to provide key input to the signaling means for transfer; audio input means; and speech recognition means connected to the audio input means for converting spoken words into operating instructions for transfer by the signaling means.
[0019]
At least one audio input device is required to provide audio input to the keyboard, as is an audio processing circuit that handles signal processing and speech recognition. As in the previous embodiment, adding memory, an encoding circuit, and audio output means to the keyboard enables voice print comparison as well as two-way audio communication suitable for digital telephony and for securing voice data entry.
[0020]
In a further embodiment, the input device, such as a keyboard, initiates access to the Internet upon receiving a specific spoken command when it is connected to a computer configured for Internet access.
[0021]
There is also provided a speech recognition input device for providing voice access to the Internet from a computer configured to be connected to the Internet, comprising audio input means, speech recognition means for recognizing at least one spoken command defined to initiate access to the Internet, and signaling means for triggering execution of the command.
[0022]
The invention further contemplates a method of providing key input and audio input to a computer system through a single interface, a method of converting spoken words to text, a method of providing access to the Internet from a computer connected to the Internet, and a method of remotely controlling at least one electronic device by spoken commands.
[0023]
[Best mode for carrying out the invention]
The foregoing and other features, aspects, and advantages of the present invention will become apparent from the following description read in conjunction with the accompanying drawings. Reference numbers in the figures indicate the same elements.
[0024]
Referring now to FIG. 1, there is shown an electrical block diagram of a preferred embodiment of a language input device, generally designated by the numeral 10, according to the present invention. A case 11 is shown in the figure for the language input device 10. The computer system is generally designated by the numeral 20. The computer 20 has an input processor 21 and is connected to the interface controller 12 of the language input device 10 by two-way communication. Interface controller 12 is connected to receive signals from key input 13 and audio input 17.
[0025]
The audio processor 14 is connected between the audio input 17 and the interface controller 12. There is a bidirectional connection between the audio processor 14 and the interface controller 12. The interface controller 12 and the audio processor 14 can be combined in a single chip, which is functionally equivalent. The audio output 18 and the memory 16 are connected to the audio processor 14. Although audio input 17 and audio output 18 are shown inside case 11 in FIG. 1, it will be understood that they can be located outside the case without affecting the function or spirit of the present invention.
[0026]
In operation, signals received from audio input 17 are digitized by audio processor 14, and the audio data is sent to interface controller 12. The interface controller 12 combines the key input data received from key input 13 with this audio data and transfers them to the input processor 21 in the computer system 20 as one data signal. In recent USB devices, the keyboard data and the audio data need not be simultaneous, because the keyboard and the microphone use two different USB endpoints as a composite USB device. The audio data may be transferred isochronously or non-isochronously. If the data is isochronous, it is given priority over all other data types on the USB bus. Audio data can also be non-isochronous, emulating another kind of USB device. Regardless of which interface method is used, the audio data and the key input data are sent either at the same time or one at a time, and both are transferred to the computer system over a single cable.
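The combining and later separation of key input data and audio data described in this paragraph can be illustrated with a minimal Python sketch. It is not the patent's implementation: the `Packet` type, the `KEY` and `AUDIO` tags, and the simple interleaving order are all hypothetical, chosen only to show two data sources sharing one stream and being split again on the computer side.

```python
from dataclasses import dataclass

KEY = 0x01    # hypothetical tag for key input data
AUDIO = 0x02  # hypothetical tag for digitized audio data


@dataclass(frozen=True)
class Packet:
    kind: int       # KEY or AUDIO
    payload: bytes  # raw key code bytes or audio sample bytes


def multiplex(key_events, audio_chunks):
    """Device side: interleave key and audio packets into one stream."""
    keys = list(key_events)
    chunks = list(audio_chunks)
    stream = []
    for i in range(max(len(keys), len(chunks))):
        if i < len(keys):
            stream.append(Packet(KEY, keys[i]))
        if i < len(chunks):
            stream.append(Packet(AUDIO, chunks[i]))
    return stream


def demultiplex(stream):
    """Computer side: split the single stream back into its two components."""
    keys = [p.payload for p in stream if p.kind == KEY]
    audio = b"".join(p.payload for p in stream if p.kind == AUDIO)
    return keys, audio
```

Tagging each packet is one simple way to keep the two data types separable; a real USB composite device would instead rely on separate endpoints, as the paragraph notes.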
[0027]
Similarly, signals containing audio data received at interface controller 12 from computer 20 are sent to the audio processor 14 for conversion to analog signals suitable for audio output 18. In the simplest case, the audio processor functions as an A/D converter for audio data input and as a D/A converter for audio data output.
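As a rough illustration of the simplest case, the following Python sketch models the two conversions in software. It assumes a normalized analog level in the range -1.0 to 1.0 and signed 16-bit codes; the function names `to_pcm16` and `from_pcm16` are hypothetical, and a real converter is of course a hardware component.

```python
PCM16_MAX = 32767  # largest positive signed 16-bit code


def to_pcm16(sample: float) -> int:
    """A/D direction: clip a normalized level and quantize it to 16 bits."""
    clipped = max(-1.0, min(1.0, sample))
    return int(round(clipped * PCM16_MAX))


def from_pcm16(code: int) -> float:
    """D/A direction: map a 16-bit code back to a normalized level."""
    return code / PCM16_MAX
```

The round trip is lossy only by the quantization step, which is at most about 1/32767 of full scale under these assumptions.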
[0028]
The signal received from the audio input 17 can be processed for any type of speech intelligibility recognition, such as language recognition, which converts human speech to text, or voice recognition, which converts human speech to commands or performs verification.
[0029]
Memory 16 contains voice print samples or voice recognition data, allowing audio processor 14 to perform voice print comparisons on the audio signals coming from audio input 17 and thereby carry out voice recognition functions.
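The patent does not say how a voice print comparison is computed. As one possible sketch, assuming that each voice print is stored as a numeric feature vector (an assumption, as are the function names and the 0.95 threshold), a comparison against the stored samples could use cosine similarity:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def matches_voice_print(features, stored_prints, threshold=0.95):
    """Return True if the incoming features resemble any stored voice print."""
    return any(cosine_similarity(features, p) >= threshold for p in stored_prints)
```

Here `stored_prints` plays the role of the samples held in memory, and the threshold trades false accepts against false rejects.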
[0030]
FIG. 2, like FIG. 1, shows an electrical block diagram of the present invention, in which the language input device 10 is based on a USB keyboard. The standard components of a USB keyboard are arranged in typical form and comprise an interface controller 12 to which key input 13 and LED signals 15 are connected and which has a USB output 19.
[0031]
There are various classes of USB devices. The USB keyboard itself belongs to the HID (Human Interface Device) class. Other classes include the audio device, communication device, display device, and mass storage device classes. The audio device class defines the device descriptors for a USB microphone. Using USB audio device class descriptors is the standard way to support a USB microphone and requires a full-speed 12 Mbit/s USB interface chip that supports bulk transfers. Such chips are more expensive than low-speed chips, but they ensure compatibility with a wider range of USB system drivers for microphones. However, the present invention contemplates both isochronous and non-isochronous data transfer, and any of these USB device classes may be substituted in the interface without changing its basic functionality.
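To make the class distinction concrete, the sketch below models a keyboard with a microphone as a device exposing one HID interface and one audio interface. The class codes 0x03 (HID) and 0x01 (Audio) are the standard USB-IF values; the `Interface` type, the `role` label, and the two-interface layout are simplifying assumptions, since a real audio function typically exposes separate control and streaming interfaces.

```python
from dataclasses import dataclass

USB_CLASS_AUDIO = 0x01  # USB-IF base class code for audio devices
USB_CLASS_HID = 0x03    # USB-IF base class code for human interface devices


@dataclass(frozen=True)
class Interface:
    number: int      # interface number within the device
    class_code: int  # USB base class of this interface
    role: str        # illustrative label only, not a USB descriptor field


# Hypothetical layout: one keyboard interface, one microphone interface.
KEYBOARD_WITH_MICROPHONE = [
    Interface(0, USB_CLASS_HID, "keyboard"),
    Interface(1, USB_CLASS_AUDIO, "microphone"),
]


def interfaces_of_class(interfaces, class_code):
    """Select the interfaces a given class driver on the host would claim."""
    return [i for i in interfaces if i.class_code == class_code]
```

Because each interface declares its own class, the host can bind its standard keyboard driver and its standard audio driver to the same physical device.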
[0032]
The present invention adds to a USB keyboard an audio processor 14 connected to the interface controller 12, with an audio input 17 and an audio output 18, represented as a microphone and a speaker respectively, connected to it. Audio input 17 may be an on-board microphone (that is, one integrated into the keyboard) or a microphone that plugs into a jack connector. The memory 16 is also connected to the audio processor 14.
[0033]
The operation of the language input device 10 is equivalent to that described with reference to FIG. 1: the interface controller 12 combines the data received from key input 13 with the audio data received from audio input 17 and digitized by audio processor 14. The digital signal enters interface controller 12, which acts as a USB processor and packetizes the signal. The final signal is transferred over USB cable 19 as audio packets according to standard USB technology, compatible with any computer system that has a similar USB interface. One possible packetization format is the 16-bit, 8 kHz PCM USB microphone audio data format, although many other audio formats are possible. Audio signals coming in through the USB cable 19 are routed by the interface controller 12 to the audio processor 14 for output through the audio output 18. The audio processor 14 and the interface controller 12 can be combined into a single chip with no functional impact. The LED 15 is generally used as a keyboard indicator light, such as Num Lock or Caps Lock, but it can equally be used to indicate other status functions of the audio input device 10.
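For the 16-bit, 8 kHz PCM format mentioned above, the data rate is 8000 samples/s × 2 bytes = 16,000 bytes/s (128 kbit/s) for a single channel. The Python sketch below shows the arithmetic and a simple packetizer. It assumes mono audio, little-endian samples, and one packet per 1 ms full-speed USB frame; the function names are hypothetical.

```python
import struct

SAMPLE_RATE_HZ = 8000  # 8 kHz sampling, as in the format named above
BYTES_PER_SAMPLE = 2   # 16-bit PCM
FRAME_MS = 1           # assumption: one packet per 1 ms full-speed USB frame


def bytes_per_frame() -> int:
    """Payload size of one packet for mono 16-bit audio at 8 kHz."""
    return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * FRAME_MS // 1000


def packetize(samples):
    """Pack signed 16-bit samples (little-endian) and split them into packets."""
    raw = struct.pack("<%dh" % len(samples), *samples)
    size = bytes_per_frame()
    return [raw[i:i + size] for i in range(0, len(raw), size)]
```

Under these assumptions each packet carries 8 samples in 16 bytes, which is a small fraction of the capacity of a 12 Mbit/s link.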
[0034]
In this configuration, the voice input device 10 functions exactly as a standard USB keyboard and also functions as a two-way conduit for audio signals, without adding an additional interface to a USB-compliant computer. Further, the audio input 17 and audio output 18 can be combined in a receiver unit used as a telephone handset. Adding memory 16 to the audio processor 14 allows the system to verify the identity of the user by comparing the incoming signal with a signal stored in memory 16, or to restrict voice access to the input device 10 to particular authorized users. This configuration also allows the audio processor 14 to add any speech intelligibility recognition functionality while retaining the simplicity of a single input device.
[0035]
Referring now to FIG. 3, a preferred embodiment of a speech intelligibility recognition system using the input device 10 is shown in a block diagram, wherein the computer system 20 has an input processor 21 connected to a speech subsystem means 22, to which an output means 23 is connected. The input device 10 has basically the same components as in FIG. 1 and serves to provide key input and voice data to the computer system 20 through a single interface to the input processor 21. The audio data is separated by the input processor 21 for transfer to the audio subsystem means 22. According to this embodiment, the audio subsystem means 22 comprises the native processing means of the computer system executing software in the form of a set of instructions, special hardware, or a combination of both.
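The separation performed by the input processor 21 can be sketched as a simple demultiplexer. The sketch assumes, purely for illustration, that each packet arriving over the single interface is tagged as either key input or audio; the tags and the function name `split_streams` are hypothetical.

```python
KEY = "key"      # hypothetical tag for key input packets
AUDIO = "audio"  # hypothetical tag for digitized audio packets


def split_streams(packets):
    """Separate a combined stream into key events and audio bytes.

    `packets` is an iterable of (kind, payload) pairs in arrival order.
    Key payloads are kept as discrete events; audio payloads are
    concatenated so they can be handed to a recognition subsystem.
    """
    key_events = []
    audio = bytearray()
    for kind, payload in packets:
        if kind == KEY:
            key_events.append(payload)
        elif kind == AUDIO:
            audio.extend(payload)
        else:
            raise ValueError("unknown packet kind: %r" % (kind,))
    return key_events, bytes(audio)
```

For example, a stream that interleaves two key events with two audio packets is returned as a list of the two key events and a single byte string holding the audio in its original order.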
[0036]
The output means 23 can provide the output of the audio subsystem means 22 directly, for example by converting spoken language into text displayed on a monitor, or more indirectly, for example by passing voice dictation to a word processor and then outputting the data to a printer. As in the previous embodiment, the interface controller 12 in the input device 10 is adapted to generate signals according to USB technology. Similarly, the audio processor 14 is coupled between the audio input 17 and the interface controller 12 to provide analog-to-digital conversion. An audio output 18 is provided for the case where the audio processor 14 converts a digital audio signal received from the interface controller 12 into an analog signal suitable for output to a speaker.
[0037]
As in the previous embodiment, the function of the audio processor 14 need not be limited to signal mode conversion. The addition of the memory 16 allows the input device 10 to perform speech processing, such as voice print comparison and language recognition, locally. Voice print comparison can be used to compare incoming voice signals with voice prints stored in memory 16. Local speech intelligibility recognition by the audio processor 14 can be used together with the audio subsystem means 22, in which case the input device 10 functions as a co-processor of the computer system 20.
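The disclosure does not state how a voice print is represented or compared. As one hedged illustration, the comparison can be sketched by treating both the stored print and the incoming signal as numeric feature vectors and accepting the speaker when their cosine similarity exceeds a threshold. The feature representation, the threshold of 0.95, and the names `cosine_similarity` and `matches_voice_print` are all assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a silent (all-zero) vector matches nothing
    return dot / (norm_a * norm_b)


def matches_voice_print(features, stored_print, threshold=0.95):
    """Return True when the incoming features resemble the stored print."""
    return cosine_similarity(features, stored_print) >= threshold
```

In such a scheme the stored print would reside in memory 16, and the result of the comparison would decide whether the voice input is passed on or rejected.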
[0038]
The user interface displayed on the output means 23 can be Windows-based and can be navigated through pull-down menus in which the available voice commands are displayed. The user therefore does not need to memorize the commands recognized by the system, which makes the speech intelligibility recognition system more user friendly. Of course, any display interface can be used, and the interface can be customized to perform special functions without changing the functionality of the present invention.
[0039]
FIG. 4 shows a modified embodiment of the speech intelligibility recognition system of FIG. 3, wherein an input device 10, shown with a case 11, has an interface controller 12 connected to receive key input data from a key input 13 and to transmit and receive voice data to and from the speech processor 14. The interface controller 12 packetizes the digital audio signal for transmission to the computer system 20 by cable. Audio input 17 and audio output 18 respectively send analog audio data to, and receive analog audio data from, the audio processor 14, and this data passes through the interface controller 12 over the same cable to the computer system 20. As in the previous embodiment, the memory 16 is used by the voice processor 14 to store data for voice print comparison and voice recognition.
[0040]
In this embodiment, a signal encoding means 19 is added to the input device 10 for encoding and decoding data transferred between the input device 10 and the computer system 20. An encoding means 25 is likewise added in the computer system 20 for the same purpose. The encoding means 19 and 25 can be any known type of software-controlled encoding circuit or processor. The encoding means 19 can be integrated in the interface controller 12 so that both packetization and encoding of data can be carried out by the interface controller 12.
[0041]
Within the computer system 20, the input processor 21 receives voice data and key input data from the input device 10 and passes the voice signals to the audio subsystem means 22. The memory 24 is shown connected to the audio subsystem means 22 and serves as its storage circuit. The output means 23 is a display means of the computer system 20 used to display the output of the audio subsystem means 22. In this embodiment, I/O controller 26 is shown as a gateway interface between the computer system 20 and data sources external to the computer system 20, such as an Internet connection, telephone service, and peripherals.
[0042]
In this configuration, audio input 17 and audio output 18 can be combined in a receiver unit for use with an online telephone system. Such a connection makes voice over Internet Protocol (VoIP) service through an Internet provider available. Further, the functions of the I/O controller 26 can be achieved by using an interface card such as a local area network (LAN) card.
[0043]
In this embodiment, the speech intelligibility recognition system functions to provide a secure means for voice and key input communication from the input device 10 to the computer system 20. Any attempt to snoop on the interface between the interface controller 12 and the input processor 21 is defeated by the operation of the encoding means 19, 25.
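The disclosure does not specify the scheme used by the encoding means 19 and 25. Purely to illustrate paired encoding and decoding at the two ends of the interface, the following toy sketch XORs the data with a keystream derived from a shared key using SHA-256 in a counter arrangement. The scheme, the shared key, and the names `keystream`, `encode`, and `decode` are assumptions; the sketch is not a secure cipher (for example, it reuses the same keystream for every message) and should not be used as one.

```python
import hashlib


def keystream(key, length):
    """Derive `length` pseudo-random bytes from a shared key (toy scheme)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        out.extend(block)
        counter += 1
    return bytes(out[:length])


def encode(key, data):
    """XOR the data with the keystream; run on the input device side."""
    stream = keystream(key, len(data))
    return bytes(d ^ k for d, k in zip(data, stream))


def decode(key, data):
    """XOR is its own inverse, so decoding repeats the same operation."""
    return encode(key, data)
```

Data encoded by the input device with the shared key is unreadable on the cable without that key, and the computer system recovers it by applying `decode` with the same key.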
[0044]
Referring now to FIG. 5, a block diagram of an embodiment of the present invention in which the input device is a voice recognition keyboard 10 is shown. The voice recognition keyboard 10 is shown schematically with a case 11 and has within it an interface controller 12 for receiving key input 13 and data from a voice processor 14. The voice processor 14 receives audio signals from audio input 17 and is in communication with memory 16.
[0045]
Speech processor 14 receives the incoming language as a speech signal from speech input 17 and translates the spoken language into commands suitable for transfer to computer system 20 or electronic device 30 through interface controller 12. Interface controller 12 combines key inputs 13 and data from audio processor 14 and selectively transfers them to computer system 20 or electronic device 30.
[0046]
According to this embodiment, the voice recognition keyboard 10 can be used as a normal keyboard and has an interface, such as USB, for providing key input to the computer system 20. In addition, the input from the key input 13 can be control data directed to an electronic device 30 such as a DVD or CD player, a stereo, a VCR, a set-top box, or a television, provided that the interface controller 12 is equipped with an interface, such as a radio frequency (RF) link, an infrared (IR) link, or a USB cable, that is compatible with the device 30 to be controlled.
[0047]
Further, spoken commands received through the voice input 17 are processed by the audio processor 14 into the equivalent of those key inputs and sent to the interface controller 12. The interface controller 12 processes the input relayed from the audio processor 14 as if it were data from the key input 13 and transfers it to the computer system 20 or the electronic device 30 through a computer interface such as a USB interface, an RF or IR link, or any other appropriate interface. Thus, spoken commands can remotely control any type of multimedia electronic device such as a DVD or CD player, stereo, VCR, set-top box, or television.
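The treatment of a recognized spoken command as the equivalent of a key input, followed by selective transfer to the computer system 20 or the electronic device 30, can be sketched as a table lookup. The phrases, the code names, and the function `route` are hypothetical examples and are not taken from the disclosure.

```python
COMPUTER = "computer system 20"
DEVICE = "electronic device 30"

# Hypothetical table: recognized phrase -> (destination, code to send).
COMMAND_TABLE = {
    "page down": (COMPUTER, "KEY_PAGEDOWN"),
    "caps lock": (COMPUTER, "KEY_CAPSLOCK"),
    "play": (DEVICE, "IR_PLAY"),
    "volume up": (DEVICE, "IR_VOLUME_UP"),
}


def route(phrase):
    """Map a recognized phrase to its destination and code.

    The phrase is normalized (case and spacing) before lookup.
    Returns None when the phrase is not a known command.
    """
    normalized = " ".join(phrase.lower().split())
    return COMMAND_TABLE.get(normalized)
```

A key press could be routed through the same table, which reflects the statement above that relayed voice input is handled as if it were data from the key input 13.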
[0048]
The set of available keyboard commands and inputs is a subset of all available voice commands and inputs. A keyboard always has a fixed, predefined number of keys and key combinations, whereas voice commands and inputs have no such physical limitation. Voice commands can number in the thousands, while keyboard input is typically limited to a few hundred codes.
[0049]
A key feature of the present invention is that voice packets are transmitted over a USB cable, as opposed to converting voice commands to keyboard codes and then sending the keyboard codes over the cable. Voice command sets are therefore no longer restricted to keyboard codes, as they were previously. The number of voice commands supported is limited only by the system memory and the number of pre-recorded commands. Since the set of pre-recorded voice commands can be updated by downloading new code via the Internet, the number of voice commands can be increased without limit. A separate set of voice commands supported by a particular website can be downloaded automatically when a user accesses that website.
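The idea of a command set that grows by download and is bounded by available storage can be sketched as follows. The class `CommandStore`, its capacity measured in number of commands, and the example command names are assumptions made for illustration only.

```python
class CommandStore:
    """Holds voice command sets keyed by their source (e.g. a website)."""

    def __init__(self, capacity):
        self.capacity = capacity  # hypothetical limit on stored commands
        self._sets = {}

    def count(self):
        """Total number of commands currently stored."""
        return sum(len(commands) for commands in self._sets.values())

    def install(self, source, commands):
        """Add or replace the command set downloaded from `source`."""
        replaced = len(self._sets.get(source, {}))
        if self.count() - replaced + len(commands) > self.capacity:
            raise ValueError("command set does not fit in storage")
        self._sets[source] = dict(commands)

    def lookup(self, phrase):
        """Return the command for a phrase, or None if it is unknown."""
        for commands in self._sets.values():
            if phrase in commands:
                return commands[phrase]
        return None
```

In this sketch a base set is installed once and further sets are installed as they are downloaded; reinstalling a set from the same source replaces the old one rather than adding to it.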
[0050]
FIG. 6 illustrates an alternative embodiment of the present invention for providing voice access to the Internet. In this embodiment, the input device 10, shown in a case 11, has an interface controller 12 connected to an audio processor 14, to which a memory 16 and an audio input 17 are connected. Spoken commands are converted to corresponding data by the audio processor 14 and sent to the computer system 20 via the interface. The computer system 20 is configured to connect to the Internet, so Internet access can be activated by voice. The input device 10 can be a keyboard in which voice activation of Internet access replaces or supplements dedicated keys on the keyboard. This embodiment allows voice commands to be converted into commands recognized by the computer system 20 for use in navigating the Internet. With this embodiment of the invention, computers on the world wide network become accessible by voice.
[0051]
The present invention further contemplates a method of using a single interface for voice and key input data, in which the computer separates the combined signal into a digital audio signal component and a key input signal component for separate processing. A voice print comparison can also be made. The invention also includes a method for activating Internet access by a voice command; such voice commands can be input to a keyboard. Methods are also contemplated for remotely controlling an electronic device from a keyboard using spoken commands.
[0052]
Although the present invention has been described with reference to preferred embodiments, it will be apparent to those skilled in the art that variations and modifications within the spirit and scope of the invention are contemplated. The drawings and descriptions of the preferred embodiments are provided as examples, rather than limiting the scope of the invention, and all similar variations and modifications within the spirit and scope of the invention are intended to be covered.
[Brief description of the drawings]
FIG. 1 is a block diagram of an input device according to the present invention.
FIG. 2 is a block diagram of an embodiment in which a language input device according to the present invention is integrated in a USB keyboard.
FIG. 3 is a block diagram of an embodiment of an input device according to the present invention connected to a computer system having voice recognition means.
FIG. 4 is a block diagram of an embodiment of an input device according to the present invention connected to a computer system further connected to an external service.
FIG. 5 is a block diagram of an embodiment of the present invention in which the integrated input device is a voice recognition keyboard having a remote control function.
FIG. 6 is a block diagram of an embodiment of the present invention connected to a computer with access to the Internet.

Claims

An input device for a computer system, comprising:
an interface controller for providing communication signals used in the computer system;
key input means connected to the interface controller; and
voice input means connected to the interface controller,
wherein the key input means and the voice input means share the interface controller for sending keyboard data and voice data to a computer.

The input device according to claim 1, further comprising an audio output unit connected to the interface controller and sharing the controller with an audio input unit and a keyboard.

The input device of claim 1, wherein the interface controller provides signals for use in the computer system according to USB technology.

The input device according to claim 1, further comprising an audio processor disposed between the audio input means and the interface controller, for digitizing an analog input from the audio input to the interface controller.

The input device of claim 4, wherein the speech processor further converts spoken language into key input data.

The input device according to claim 5, further comprising a memory means connected to the audio processor.

The input device according to claim 6, wherein said memory means includes a voice print of at least one specific user for comparison with voice data from said voice input means.

The input device according to claim 2, wherein said input means and output means are combined in a single receiver unit suitable for telephony use.

A speech intelligibility recognition system, comprising:
a computer system having speech intelligibility recognition means coupled to input processing means and output means; and
an input device having at least one interface controller connected to the input processing means of the computer, and key input means and voice input means coupled to the interface controller, the input device transferring a key input signal and a speech signal,
wherein the recognition means processes the speech signal and generates an output to the output means.

The speech intelligibility recognition system according to claim 9, wherein the interface controller and the input processing unit communicate using a signal generated according to USB technology.

The speech intelligibility recognition system of claim 10, further comprising a speech processor interposed between said speech input means and said interface controller for digitizing an analog signal from said speech input means to said interface controller.

The speech intelligibility recognition system of claim 11, further comprising memory means coupled to the speech processor, wherein the memory means includes a voice print of at least one particular user for comparison with voice data from the input means.

The speech intelligibility recognition system of claim 9, further comprising memory means coupled to said speech intelligibility recognition means, said memory means including a voice print of at least one specific user for comparison with speech data from said input means.

The speech intelligibility recognition system according to claim 9, further comprising first encoding means coupled to said interface controller and second encoding means coupled to said input processor to provide an encoded transfer of said audio signal and key input signal.

The speech intelligibility recognition system according to claim 9, further comprising a speech processor for performing speech recognition.

A voice recognition keyboard for a computer system, configured to generate key input or operation commands for controlling at least one electronic device, comprising:
signaling means capable of two types of signal transfer, a first type suitable for transfer to the computer system and a second type suitable for transfer to at least one electronic device;
a plurality of keys connected to the signaling means to provide a key input to the signaling means for transfer therethrough;
voice input means; and
speech recognition means connected to the voice input means for converting spoken language into an operation command for transfer by the signaling means.

The voice recognition keyboard according to claim 16, wherein said signals of said type suitable for transfer to at least one electronic device are transferred using infrared signals.

The voice recognition keyboard of claim 16, wherein the signal of the type suitable for transfer to at least one electronic device is transferred using a radio frequency (RF) signal.

The speech recognition keyboard according to claim 16, wherein said signal of said type suitable for transfer to said computer is in accordance with USB technology.

The voice recognition keyboard according to claim 16, further comprising a memory means connected to said voice recognition means.

The voice recognition keyboard of claim 20, wherein said memory means includes a voice print of at least one specific user for comparison with voice data from said voice input means.

The voice recognition keyboard according to claim 16, further comprising voice output means connected to said signal means for receiving voice signals transferred from said computer system or at least one electronic device to said signal means.

A speech recognition keyboard according to claim 22, wherein said speech input and output means are combined in a single unit suitable for use as a receiver for telephony.

The voice recognition keyboard according to claim 22, wherein the at least one operation command is applied to transmit voice by an IP telephone system.

The speech recognition keyboard according to claim 16, wherein said speech recognition means converts spoken words into corresponding key inputs for transmission by signal means.

A voice recognition input device for providing voice access to the Internet from a computer connected to the Internet, comprising:
voice input means;
voice recognition means for recognizing spoken commands as at least one command defined to initiate Internet access; and
signal means for triggering the execution of the command.

The voice recognition input device according to claim 26, wherein the input device is a keyboard.

The voice recognition input device according to claim 26, wherein the at least one command is defined to convey voice by IP telephony.

A voice recognition input device for providing voice access to the Internet from a computer connected to a computer on a world wide network, comprising:
voice input means;
voice recognition means for recognizing spoken commands as at least one of the commands defined to convey Internet access; and
signal means for executing the command.

The voice recognition input device of claim 29, wherein at least one of the commands is defined to allow Internet navigation.

The voice recognition input device of claim 29, wherein at least one of the commands is defined to convey speech by IP telephony.

The voice recognition input device according to claim 29, wherein the input device is a keyboard.

Inputting an audio signal;
Converting the audio signal into a digital audio signal;
Inputting a key input signal;
Simultaneously processing the digital audio signal and the key input signal by a single interface controller to generate a combined signal;
Transferring the combined signal to the computer system to provide key and voice input to the computer system through a single interface.

The method of claim 33, further comprising the computer dividing the combined signal into a separate digital audio signal component and a separate key input signal component for separate processing by the computer system.

Storing a sample voice print;
Inputting an audio signal;
Inputting a key input signal;
Converting the audio signal into a digital audio signal;
Comparing the digital audio signal with the sample audio print;
Simultaneously processing the digital voice signal and the key input signal into a signal combined by a single interface controller if the digital voice signal and the sample voice print match;
Transferring the combined signal to the computer system to provide key and voice input to the computer system through a single interface.

Inputting an audio signal including a spoken language;
Converting the audio signal into a digital audio signal;
Inputting a key input signal;
Simultaneously processing the digital voice signal and the keyboard signal into signals combined by a single interface controller;
Transferring the combined signal to a computer system over a single interface;
Splitting the combined signal into a digital audio signal component and a key input signal component at the computer;
Processing the digital audio signal component to convert the spoken language to the corresponding text.

The method of claim 36, further comprising converting the text to a second keystroke signal.

Inputting a speech signal including a spoken language;
Converting the audio signal to a digital audio signal;
Confirming the spoken language in the digital audio signal;
Translating the spoken language into a command, at least one of the commands defined to initiate Internet access;
Transferring the command to the computer, thereby providing access to the Internet from a computer connected to the Internet.

The method of claim 38, wherein Internet access connects the computer to a computer on a world wide network.

Inputting spoken commands as voice signals to the keyboard;
Inputting key input data using the keyboard;
Converting the voice signal into an operation command;
Transferring the operation command to at least one of the electronic devices;
Transferring said key-in data to a computer system, wherein the exact order of steps is not important, wherein the at least one electronic device is remotely controlled by spoken commands.