JP2004206116A

JP2004206116A - Improved speech recognition user interface for ultrasonic wave system

Info

Publication number: JP2004206116A
Application number: JP2003419786A
Authority: JP
Inventors: J Hill Steven; ジェイヒルスティーヴン; Cedric Chenal; シュナルセドリック; Dickerson Ken; ディッカーソンケン; Rock Joe; ロックジョー
Original assignee: Koninklijke Philips Electronics NV
Current assignee: Koninklijke Philips NV
Priority date: 2002-12-20
Filing date: 2003-12-17
Publication date: 2004-07-22

Abstract

PROBLEM TO BE SOLVED: To provide an improved speech-controlled ultrasound system.
SOLUTION: In one aspect, the efficiency and usability of a speech recognition system are improved by using targeted speech recognition, speech-to-text conversion capability, keyword activation, and user profiles for the individual ultrasound system operators. In another aspect, the mechanical user interface (between the user and the speech recognition system) is streamlined and simplified by using a wireless communication link and a user interface having a headset, a trackball device, and a select input button.
COPYRIGHT: (C)2004, JPO&NCIPI

Description

The present invention relates to ultrasound imaging systems, and more particularly to an ultrasound imaging system having a speech control system.

Ultrasound systems available today are controlled manually by a keyboard, mouse, trackball, knobs, button controls, and/or a touch panel. A voice control system can be added to these controls or can replace them. As shown in FIG. 1, such a voice control system in its simplest form includes a microphone 110, a speech recognition system 150, and an ultrasound imaging system 180. The user speaks into the microphone 110, which converts the sound into an electrical signal and sends it to the speech recognition system 150. The speech recognition system 150 interprets the electrical signal, determines which command was given, and transmits that command to the ultrasound imaging system 180. A specific example of a voice-activated (and/or manual) control system is disclosed in Patent Document 1 (U.S. Pat. No. 5,544,654), which is incorporated herein by reference. In Patent Document 1, the speech recognition system 150 is incorporated into the ultrasound system 180.

While such voice-controlled ultrasound systems make the user's work easier, they also have problems of their own. First, the user may have to "train" the speech recognition system so that it can recognize how the user pronounces particular words and syllables. In many speech recognition systems, this is accomplished by a "voice enrollment training session" in which the user initializes the system by repeating particular terms, expressions, and syllables. In a hospital or clinic environment, however, a user may use several different ultrasound imaging systems, and/or several different users may use the same ultrasound system. In the former case, the user must perform multiple training sessions (one for each system); in the latter case, the ultrasound system may not have the capacity to store more than one training file.

Second, the microphone is typically left on while the system is in use and even afterward, so the speech recognition system receives a large amount of irrelevant data, including the operator's conversation (for example, talking to the patient). This can cause the speech recognition system to misinterpret what it hears, so that unintended commands are transmitted to the ultrasound system.

Patent Document 1: U.S. Pat. No. 5,544,654

Therefore, there is a need for an improved ultrasound imaging system in which misinterpretation by the speech recognition system is reduced and the efficiency of the user interface is increased.

The present invention provides improvements to speech-controlled ultrasound systems. In one aspect of the invention, the efficiency and ease of use of the speech recognition system are improved by using training profiles for individual users, targeted speech recognition, speech-to-text conversion capability, and keyword activation. A user profile is created for each operator, and each user profile has at least the operator's training profile, a modifiable speech-to-text dictionary, and a modifiable list of keywords. In another aspect, the mechanical user interface (between the user and the speech recognition system) is streamlined and simplified by using a wireless communication link and a user interface having a headset, a trackball device, and a select input button. Automatic user identification means, such as voice recognition technology and/or a wireless identification transponder, may also be used to identify the user so that the appropriate user profile can be retrieved.

Other objects and features of the present invention will become apparent from the following detailed description considered in conjunction with the accompanying drawings. It is to be understood, however, that the drawings are designed for purposes of illustration and do not define the limits of the invention, for which reference should be made to the appended claims. It is further to be understood that the drawings are not necessarily drawn to scale and that, unless otherwise indicated, they are intended only to illustrate conceptually the structures and procedures described herein.

In general, the present invention aims to personalize a voice-controlled ultrasound system and to make the ultrasound system more efficient. In one aspect of the invention, the efficiency and ease of use of the speech recognition system are enhanced by several improvements. In another aspect, the mechanical user interface (between the user and the speech recognition system) is streamlined and simplified.

The present invention has six essential improvements to the speech recognition user interface for ultrasound systems, namely:
・User profiles
・Targeted speech recognition
・Speech-to-text conversion capability
・Keyword activation
・Wireless communication link
・Trackball with a "select" button
User Profiles - As mentioned above, a typical speech recognition system is initialized through a voice enrollment session in which the user repeats expressions, words, and/or syllables to train the speech recognition system to understand what the user is saying. As also noted above, this is problematic in a hospital or clinic environment where a user uses several different ultrasound systems and/or different users use the same ultrasound system. In an ultrasound system according to an embodiment of the present invention, a training profile is stored for each individual user, thereby enabling consistent speech recognition performance on every ultrasound system the user operates.

According to a preferred embodiment of the present invention, the user first performs a voice enrollment training session, which produces a speech training profile. As shown in FIG. 2, the speech training profile 210 is saved in the user profile 200 for that user. This user profile 200 is accessed and used each time that particular user activates the speech recognition system of an ultrasound system.

For the system to use the correct user profile, the speech recognition system must first recognize the particular user. This recognition can be accomplished by a variety of means. In one embodiment, the user keys his or her identification directly into the speech recognition system, for example with a keyboard. The system then finds the correct user profile 200 based on the user identification 205. In another embodiment, the speech recognition system is capable of voice recognition (i.e., recognizing and identifying the individual voice of a particular user). In this embodiment, the user simply starts speaking, and the speech recognition system automatically identifies the user and retrieves that user's profile. Such an embodiment also needs to store unique parameters that characterize the user's voice, such as the voice identification parameters 215 shown in FIG. 2. In other embodiments, other forms of user identification can be used, such as the magnetic strip of a credit-card-type card, a passive electrical element embedded in a credit-card-type card that transmits a signal when pinged by an active electrical element, an identification number, or a MAC address transmitted by a Bluetooth or other type of wireless device.

The remaining components shown in the user profile 200 are described below.

User profiles can be stored in various ways. As shown in the preferred embodiment of FIG. 3, the user profiles 310 of users A, B, and C are stored in a portion of the memory of the speech recognition system 150. In this embodiment, the speech recognition system 150 is incorporated into the ultrasound system. Thus, in the embodiment shown in FIG. 3, a single ultrasound system must have enough memory to hold all of the user profiles. Note that in all embodiments the speech recognition system may be either separate from the ultrasound system or integrated with it.

As shown in the preferred embodiment of FIG. 4, each ultrasound system 401, 402, and 403 has a communication link 405 to a centralized user profile database 410. An example setting for this storage arrangement is a hospital in which each ultrasound system is connected to a local area network (LAN). A server holding the user profile database is also connected to the LAN. Upon recognizing the user, the speech recognition system requests the appropriate user profile from the user profile database and then receives it. In a further preferred embodiment shown in FIG. 5, each user profile 501, 502, and 503 is stored in a small portable memory means that can transmit the stored user profile to a speech recognition system. As an example, a user may carry a small Sony (registered trademark) "Memory Stick" holding his or her user profile. The Memory Stick can be inserted into, and read by, reading means 511, 512, and 513 connected to the various speech recognition systems. Any storage means (CD, floppy (registered trademark) disk, embedded semiconductor electronics, etc.) can be used, and any wired or wireless means of reading the storage can be used.

As can be seen from the examples described herein, the ways of storing user profiles range from centralized (e.g., all user profiles stored on one server, as in FIG. 4) to distributed (e.g., each user carrying his or her own profile, as in FIG. 5), and include embodiments between these two extremes (e.g., each speech recognition system holding several user profiles, as in FIG. 3).

A combination of centralized and distributed systems can also be used. For example, user profiles may be stored on the ultrasound systems, in a networked centralized database, and also on a portable storage unit carried by each user. In such a system, the ultrasound system first establishes the user's identity and then determines whether that user's profile is already stored on it. If it is not, the ultrasound system may download the profile either directly from the user's storage unit or over the network from the centralized database; the choice depends on the communication and storage capabilities of the ultrasound system. Such a redundant system is useful when, for example, a copy of the user profile has been lost. Furthermore, if the centralized database is connected to the Internet, the system becomes very useful for ultrasound technicians who travel between facilities.
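The lookup order described above can be sketched as follows. This is only an illustrative sketch under assumptions: the function name `fetch_profile`, the dictionary-based stand-ins for the portable storage unit and the centralized database, and the choice to cache a downloaded profile locally are all hypothetical and are not taken from the specification.

```python
from typing import Callable, Dict, Iterable, Optional

# A "source" is any callable that returns a profile for a user ID, or None.
ProfileSource = Callable[[str], Optional[dict]]


def fetch_profile(user_id: str,
                  local_cache: Dict[str, dict],
                  fallbacks: Iterable[ProfileSource]) -> Optional[dict]:
    """Return the user's profile, trying local storage before other sources."""
    # Step 1: is the profile already stored on this ultrasound system?
    if user_id in local_cache:
        return local_cache[user_id]
    # Step 2: otherwise try each fallback in order (e.g. the user's portable
    # storage unit, then the networked centralized database).
    for source in fallbacks:
        profile = source(user_id)
        if profile is not None:
            local_cache[user_id] = profile  # keeping a local copy is an assumption
            return profile
    return None  # no copy of the profile was found anywhere


# Hypothetical stand-ins for a memory-stick reader and a networked database.
portable_unit = {"user_a": {"keywords": ["zephyr"]}}
central_db = {"user_a": {"keywords": ["zephyr"]}, "user_b": {"keywords": []}}

cache: Dict[str, dict] = {}
profile_b = fetch_profile("user_b", cache, [portable_unit.get, central_db.get])
```

Here `user_b` has no portable copy, so the profile comes from the centralized database; the order of the fallback list stands in for the system-dependent choice mentioned in the text.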

It should further be understood that the user profile 200 shown in FIG. 2 is a schematic illustration of the possible components of a user profile. In some embodiments, the user profile may have only the user identification 205 and the user's speech training profile 210. Moreover, the components of the user profile 200 may be stored on different devices. For example, in a system such as that shown in FIG. 4, the voice identification parameters 215 may be stored on the individual ultrasound systems while the user profiles are stored on the server 410. In such an embodiment, each speech recognition system first identifies the user using the locally stored voice identification parameters and then retrieves the appropriate user profile from the server 410.

Targeted Speech Recognition - The recognition accuracy for particular words may be considerably lower than the overall recognition accuracy of the speech recognition system. In other words, there may be particular words or expressions that are misinterpreted far more often than others. In a typical speech recognition system, however, the only way to solve such a problem is to repeat the entire initial voice enrollment training session, which is very inefficient because the problem may arise for only a few words or expressions. In one embodiment of the present invention, the user can make an immediate "targeted" correction when the speech recognition system misinterprets or fails to understand a spoken word. Using this feature, shown as targeted training 211 in the user profile 200 of FIG. 2, the user performs voice enrollment training directed only at the problem word or expression, without repeating the entire initial voice enrollment session.
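The idea of re-enrolling only the problem words can be illustrated with a small sketch. The representation of a training profile as a mapping from each word to a list of recorded samples, and the function name `targeted_training`, are assumptions made purely for illustration; the specification does not describe how the training data is stored.

```python
from typing import Dict, List

# Hypothetical training profile: each word maps to the user's recorded samples.
TrainingProfile = Dict[str, List[str]]


def targeted_training(profile: TrainingProfile,
                      new_samples: Dict[str, List[str]]) -> TrainingProfile:
    """Return a copy of the profile with only the problem words re-enrolled."""
    updated = {word: list(samples) for word, samples in profile.items()}
    for word, samples in new_samples.items():
        updated[word] = list(samples)  # replace the samples for this word only
    return updated


profile = {"freeze": ["freeze_take1"], "doppler": ["doppler_take1"]}
# The user re-records just the word the recognizer keeps getting wrong.
retrained = targeted_training(
    profile, {"doppler": ["doppler_take2", "doppler_take3"]}
)
```

Entries for words that were not retrained, such as `freeze` here, are carried over unchanged, which is the point of avoiding a full enrollment session.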

Speech-to-Text Conversion Capability - Users of ultrasound systems spend at least 5% of a patient's examination time annotating the images that are produced. These annotations are typically used to identify the images properly and include information such as the scan plane and the anatomy. According to one embodiment of the present invention, the ultrasound system has a speech-to-text conversion capability that lets the user annotate images without interrupting the examination and without entering the annotation by hand. The speech-to-text conversion module may be implemented in hardware, software, or a combination of the two, and may be incorporated into the speech recognition system and/or the ultrasound system or be a separate module. To keep the speech-to-text capability from interfering with the speech command function, the speech-to-text capability may be activated by hardware or by a specific speech command. For example, when the user says "start annotation", the speech-to-text capability is activated. To stop annotating, the system then waits for a spoken stop expression, a timeout, or a user interaction with some physical interface.
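The switching between command mode and annotation mode described above can be sketched as a small state machine. This is a minimal sketch under assumptions: the class name `AnnotationRouter`, the stop phrase "stop annotation", and the 10-second timeout are hypothetical, and timestamps are passed in explicitly so the example stays deterministic. Only the "start annotation" phrase comes from the text.

```python
from typing import List


class AnnotationRouter:
    """Routes recognized utterances to either commands or annotation text."""

    def __init__(self, start_phrase: str = "start annotation",
                 stop_phrase: str = "stop annotation",  # assumed stop expression
                 timeout_s: float = 10.0) -> None:      # assumed timeout value
        self.start_phrase = start_phrase
        self.stop_phrase = stop_phrase
        self.timeout_s = timeout_s
        self.annotating = False
        self.last_heard = 0.0
        self.commands: List[str] = []
        self.annotation: List[str] = []

    def hear(self, utterance: str, now: float) -> None:
        text = utterance.strip().lower()
        # A long silence ends annotation mode before this utterance is handled.
        if self.annotating and now - self.last_heard > self.timeout_s:
            self.annotating = False
        self.last_heard = now
        if not self.annotating:
            if text == self.start_phrase:
                self.annotating = True       # switch to speech-to-text mode
            else:
                self.commands.append(text)   # treat as a system command
        elif text == self.stop_phrase:
            self.annotating = False          # spoken stop expression
        else:
            self.annotation.append(utterance.strip())


router = AnnotationRouter()
for t, said in [(0.0, "freeze"), (1.0, "start annotation"),
                (2.0, "Long axis view"), (3.0, "stop annotation"),
                (4.0, "print")]:
    router.hear(said, now=t)
```

After this sequence, `router.commands` holds the two commands spoken outside annotation mode and `router.annotation` holds the dictated text. A physical-interface stop, also mentioned in the text, could be modeled by setting `annotating` to `False` from a button handler.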

In addition, the speech-to-text conversion feature of the present invention allows the "dictionary" of words that the speech-to-text conversion module uses to recognize speech to be added to, deleted from, and edited. For example, the user may wish to delete words that he or she never uses (to avoid misinterpretation), or to change how a particular word appears as text in an annotation, for example as an abbreviation. As shown by the speech-to-text dictionary 220 in the user profile 200 of FIG. 2, this module can be continually updated and modified each time the user accesses his or her profile.
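A per-user dictionary with add, delete, and edit operations might look like the following sketch. The class `SpeechTextDictionary`, its method names, and the rule that words missing from the dictionary are simply dropped from the output are illustrative assumptions, not details given in the specification.

```python
from typing import Dict, List, Optional


class SpeechTextDictionary:
    """Maps each recognized spoken word to the text written in an annotation."""

    def __init__(self, entries: Dict[str, str]) -> None:
        self.entries = {spoken.lower(): text for spoken, text in entries.items()}

    def add(self, spoken: str, written: Optional[str] = None) -> None:
        # Adding an existing word edits how it is written (e.g. an abbreviation).
        self.entries[spoken.lower()] = spoken if written is None else written

    def remove(self, spoken: str) -> None:
        # Removing a never-used word means it can no longer be produced.
        self.entries.pop(spoken.lower(), None)

    def transcribe(self, words: List[str]) -> str:
        # Assumption: words not in the dictionary are skipped.
        known = [self.entries[w.lower()] for w in words if w.lower() in self.entries]
        return " ".join(known)


dictionary = SpeechTextDictionary(
    {"left": "left", "ventricle": "ventricle", "kidney": "kidney"}
)
dictionary.add("left", "L")    # show "left" as an abbreviation
dictionary.remove("kidney")    # this user never annotates with "kidney"
text = dictionary.transcribe(["Left", "ventricle", "kidney"])
```

Because the dictionary lives in the user profile, edits like these would follow the user from one ultrasound system to another.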

Keyword Activation - As mentioned above, ambient sounds, background conversation, and the like adversely affect the performance of the speech recognition system and can potentially cause the equipment to operate inadvertently. A speech recognition system according to one embodiment of the present invention minimizes this problem by requiring a "keyword" before each speech command. The keyword is typically a nonsense word that is unlikely to be spoken in everyday conversation. Keyword activation is a better solution than requiring a keyboard key press or an additional confirmation step after each spoken command. In some environments, keywords may be neither necessary nor desired, and in such environments the keyword capability is not used.
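Keyword gating can be sketched in a few lines: an utterance is treated as a command only if it begins with one of the user's keywords. The function name `extract_command` and the example keyword "zorba" are hypothetical; the sketch also assumes the recognizer has already turned the audio into a text string.

```python
from typing import Iterable, Optional


def extract_command(utterance: str, keywords: Iterable[str]) -> Optional[str]:
    """Return the command that follows a keyword, or None if there is none."""
    words = utterance.strip().lower().split()
    if len(words) < 2:
        return None  # a keyword must be followed by at least one command word
    if words[0] not in {k.lower() for k in keywords}:
        return None  # ordinary conversation: ignore it
    return " ".join(words[1:])


user_keywords = ["zorba"]  # hypothetical nonsense keyword from the user profile
accepted = extract_command("Zorba freeze image", user_keywords)
ignored = extract_command("please freeze while I talk", user_keywords)
```

In an environment where keywords are not wanted, the caller would bypass this check and pass every utterance straight to the command interpreter.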

FIG. 2 shows one or more keywords 230 stored in the user profile 200. As with the other modules in FIG. 2, this feature can be continually updated and modified each time the user accesses his or her profile.
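The components of the user profile of FIG. 2 can be summarized as a simple record type. The field names and Python types below are assumptions chosen for illustration; only the reference numerals in the comments come from the description.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class UserProfile:
    """Illustrative container for the components shown in FIG. 2."""
    user_id: str                                          # user identification 205
    speech_training: Dict[str, List[str]] = field(default_factory=dict)    # 210
    targeted_training: Dict[str, List[str]] = field(default_factory=dict)  # 211
    voice_id_params: Optional[List[float]] = None         # voice identification 215
    speech_text_dictionary: Dict[str, str] = field(default_factory=dict)   # 220
    keywords: List[str] = field(default_factory=list)     # keywords 230


# A minimal profile: only the identification and the speech training profile.
minimal = UserProfile("user_a", {"freeze": ["freeze_take1"]})

# A fuller profile with a dictionary entry and a keyword.
full = UserProfile(
    "user_b",
    speech_training={"freeze": ["freeze_take1"]},
    voice_id_params=[0.12, 0.87],
    speech_text_dictionary={"left": "L"},
    keywords=["zorba"],
)
```

Making every component except the identification optional mirrors the statement that some embodiments carry only the user identification and the speech training profile.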

Wireless Communication - In one preferred embodiment of the present invention, the ultrasound system is implemented with wireless rather than wired connections between its components. The wireless connection may use any technology, but a low-power, short-range wireless technology is preferred. Examples of low-power, short-range wireless technologies include Bluetooth, IEEE 802.11a or 802.11b, HiperLAN, and HomeRF. Magnetic induction, infrared, or diffuse infrared technology may also be used. As shown in FIG. 6, a headset 610 has a wireless connection to an ultrasound system 680 (in this embodiment, the speech recognition system is incorporated into the ultrasound system 680). Wireless connectivity likewise allows the ultrasound system 680 to have wireless communication links with peripheral devices such as a printer 662 or a PDA 664, or with one or more hospital networks. With this added wireless capability, the ultrasound system 680 can be moved from place to place, and only its power plug needs to be reconnected to a wall socket. Moreover, because the peripheral components can easily reconnect to whatever ultrasound system is in the room, they can easily be moved from one room to another.

In one preferred embodiment, Bluetooth is used for the wireless communication link. The Bluetooth protocol requires a handshake procedure at the start of any communication session between two Bluetooth devices. Bluetooth can therefore ensure that only one voice input device is registered with the ultrasound system at a time. Bluetooth devices are also short-range, lightweight, and low in power consumption, which makes them ideal as headsets for a speech recognition ultrasound system. Standard RF systems have too long a range and consume too much power, and infrared systems require a line-of-sight optical path. In addition, Bluetooth transmitters are less likely to interfere with other equipment that uses RF in a hospital or clinic environment, such as medical telemetry devices.

In one preferred embodiment, rather than using voice identification to identify individual users, the headset 610 has a unique identifier (for example, a passive electronic element that wirelessly transmits a numeric code when pinged by an active electronic element), by which the user is identified to the speech recognition system. Thus, when the user is in the vicinity of the ultrasound system, the ultrasound system can automatically identify the user and retrieve the correct user profile. This may be used in wireless versions of the embodiments of FIG. 3 or FIG. 4. In another embodiment, the user profile may be built into, or connected to, the headset 610. In the built-in embodiment, a semiconductor memory chip holding the user profile is built into the headset and can be accessed and downloaded by the ultrasound system when the user comes within range of it. This may be a wireless version of the embodiment of FIG. 5. In the connected embodiment, a memory module holding the user profile may be connected to the headset 610 by wire.

Trackball Device - A further improvement over conventional ultrasound systems is a simplified mechanical user interface that is integrated with the speech recognition system. The conventional mechanical user interface of an ultrasound system has a control panel with multiple push buttons, toggle switches, rocker switches, rotary knobs, and a keyboard with a built-in trackball. However, the conventional user interface is relatively large and is positioned away from the user while the user performs an examination. Frequently reaching back and forth to the conventional user interface slows the examination and can lead to occupational stress injuries such as carpal tunnel syndrome. Furthermore, in an ultrasound system that uses speech recognition technology for control, the mechanical user interface should be integrated with the speech recognition system and should assist it in controlling the ultrasound system.

In one preferred embodiment of the present invention, shown in FIG. 7, the simplified mechanical user interface 700 has a trackball 710, a single "select" button 720, and a reconfigurable touch panel 730. In other embodiments, the simplified mechanical user interface may have fewer or more components. For example, in another embodiment, the simplified mechanical user interface may have only a trackball with a built-in select function (i.e., one activated by pressing down on the trackball). In yet another embodiment, the simplified mechanical user interface may have all of the components shown in FIG. 7 plus additional mechanical controls such as toggle switches or rotary knobs. In yet another embodiment, the reconfigurable touch panel 730 is replaced by standard buttons, switches, and/or other mechanical interfaces. The important points about the simplified mechanical user interface according to the various embodiments of the present invention are that it is integrated with the speech recognition user interface of the ultrasound system and that its size and/or complexity is reduced compared with a conventional mechanical user interface.

In FIG. 7, the trackball 710 and the select button 720 work together with the headset (i.e., the speech recognition system) to form the primary user interface to the ultrasound system. A system control can be activated by a speech command and then "sent" using the select button 720. The trackball 710 provides a simple way to reproduce the function of the paddle switches and rotary switches of a conventional user interface by scrolling quickly through control settings. In some situations, a speech command may bring up a menu of possible choices on the screen. The user selects one of the choices by moving the trackball 710 to indicate the desired choice and pressing the select button 720. In other situations, moving the trackball 710 and/or pressing the select button 720 causes the speech recognition system to wait for a particular type of speech command, or causes the speech recognition system to change what is displayed on the reconfigurable touch panel 730. The input sequences and/or combinations of inputs made through any of the three input components and through the input means shown on the reconfigurable touch panel 730 can be varied virtually without limit.

Command functions that are not suited to speech commands can be performed with the trackball 710 and/or the reconfigurable touch panel 730, working together with the headset. For example, operations that require fine physical movement, such as positioning a measurement cursor, may be performed with the trackball 710. In a preferred embodiment, the reconfigurable touch panel 730 changes what it displays ("buttons" and other images of mechanical interface elements) based on the current context supplied by the speech recognition system. The "buttons" and other controls on the reconfigurable touch panel 730 may therefore be completely different depending on what the user is currently doing. Embodiments of the present invention may also let the user define what is displayed on the reconfigurable touch panel 730. Furthermore, when the speech recognition system cannot determine which command the user spoke, the trackball 710, the select button 720, and/or the reconfigurable touch panel 730 provide the user with various means of selecting among two or more options generated by the speech recognition system.
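Two behaviors from this paragraph, the context-dependent touch panel and the manual choice among candidate commands, can be sketched as follows. The context names, button labels, and function names are all hypothetical placeholders; the specification does not list concrete contexts or layouts.

```python
from typing import Dict, List

# Hypothetical button layouts keyed by the context reported by the recognizer.
PANEL_LAYOUTS: Dict[str, List[str]] = {
    "measurement": ["Caliper", "Trace", "Clear"],
    "annotation": ["Body marker", "Arrow", "Erase"],
}
DEFAULT_LAYOUT: List[str] = ["Freeze", "Print", "Menu"]


def panel_buttons(context: str) -> List[str]:
    """Return the buttons the touch panel should show for the current context."""
    return PANEL_LAYOUTS.get(context, DEFAULT_LAYOUT)


def resolve_command(candidates: List[str], selected_index: int) -> str:
    """Pick one of the recognizer's candidate commands via a manual selection."""
    if not 0 <= selected_index < len(candidates):
        raise IndexError("selection is outside the list of candidates")
    return candidates[selected_index]


buttons = panel_buttons("measurement")
# The recognizer could not decide between two commands, so it offers both and
# the user highlights the second with the trackball and presses select.
chosen = resolve_command(["color doppler", "power doppler"], selected_index=1)
```

A user-defined layout, as contemplated in the text, could be modeled by merging entries from the user profile into `PANEL_LAYOUTS` before the lookup.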

As shown in FIG. 8, one preferred embodiment of the invention has the simplified mechanical user interface 700 mounted on an articulated support 810, which is in turn attached to an ultrasound system 880. In the simplified mechanical user interface 700 of this embodiment, the select button function is built into the trackball: tapping the trackball or pressing it downward is equivalent to pressing the select button. The articulated support 810 allows the user to move the simplified mechanical user interface 700 to a convenient position. In some embodiments, a plurality of simplified mechanical user interfaces 700 are placed at suitable locations (for example, next to the ultrasound system, on the user's seat, on the right side of the bed, or on the left side of the bed), from which the user can choose the one best suited to a particular operation. In other embodiments, the simplified mechanical user interface 700 has a wired or wireless communication link to the ultrasound system, allowing greater freedom of movement.

Thus, while the fundamental novel features of the invention have been shown, described, and pointed out as applied to preferred embodiments thereof, it is to be understood that various omissions, substitutions, and changes in the form and details of the devices illustrated, and in their operation, may be made by those skilled in the art without departing from the spirit of the invention. For example, it is expressly intended that all combinations of elements and/or method steps that perform substantially the same function in substantially the same way to achieve the same results are within the scope of the invention. Moreover, structures and/or elements and/or method steps shown and/or described in connection with any disclosed form or embodiment of the invention may be incorporated in any other disclosed, described, or suggested form or embodiment as a general matter of design choice. The invention is therefore to be limited only by the scope of the appended claims.

FIG. 1 is a block diagram illustrating a conventional ultrasound system using a speech recognition system.
FIG. 2 is a schematic diagram illustrating a user profile according to a preferred embodiment of the present invention.
FIG. 3 is a diagram illustrating a method of storing various user profiles in one ultrasound system according to one preferred embodiment of the present invention.
FIG. 4 is a diagram illustrating a method of storing various user profiles in one ultrasound system according to another preferred embodiment of the present invention.
FIG. 5 is a diagram illustrating a method of storing various user profiles in one ultrasound system according to a further preferred embodiment of the present invention.
FIG. 6 is a block diagram illustrating an ultrasound system using a wireless communication link according to one preferred embodiment of the present invention.
FIG. 7 is a block diagram illustrating a simplified mechanical user interface according to one preferred embodiment of the present invention.
FIG. 8 is a block diagram illustrating an ultrasound system using a simplified mechanical user interface according to one preferred embodiment of the present invention.

Explanation of reference numerals

110 Microphone
150 Speech recognition system
180 Ultrasound imaging system
200 User profile
205 User identification information
210 Speech training profile
211 Targeted training
215 Voice identification information
220 Speech-text conversion dictionary
230 Keywords
410 User profile database
401, 402, 403 Ultrasound systems
501, 502, 503 User profiles
511, 512, 513 Reading means
610 Headset
662 Printer
664 PDA
680 Ultrasound system
700 Simplified mechanical user interface
710 Trackball
720 "Select" button
730 Reconfigurable touch panel
810 Articulated support
880 Ultrasound system

Claims

1. An ultrasound imaging system comprising:
a speech recognition system for controlling the ultrasound imaging system; and
at least one user profile,
wherein the at least one user profile comprises a training profile of the respective user for the speech recognition system of the ultrasound imaging system.

2. The ultrasound imaging system according to claim 1, further comprising means for recognizing each user,
wherein, after the means for recognizing each user has recognized a user, the user profile of the recognized user is retrieved.

3. The ultrasound imaging system according to claim 2, wherein the means for recognizing each user comprises at least one of:
a tactile input device that allows a user to input data identifying the user;
a voice recognition system, wherein voice identification parameters of each user are stored in the at least one user profile; and
identification means carried by a user and supplying data identifying the user.

4. The ultrasound imaging system according to claim 3, wherein the identification means carried by the user comprises at least one of:
memory means connected to a transmitter; and
semiconductor memory means readable by a reading device connected to the ultrasound imaging system,
and wherein the ultrasound imaging system has a corresponding receiver for receiving the data identifying the user.

5. The ultrasound imaging system according to claim 4, wherein the transmitter is a wireless transmitter and the receiver is a wireless receiver.

6. The ultrasound imaging system according to claim 5, wherein the wireless communication link between the wireless transmitter and the wireless receiver is one of a Bluetooth link, an IEEE 802.11 link, a HiperLAN link, a HomeRF link, an infrared technology link, and a magnetic induction technology link.

7. The ultrasound imaging system according to claim 1, wherein the speech recognition system includes means for performing targeted correction of the speech training profile.

8. The ultrasound imaging system according to claim 1, further comprising means for converting spoken speech to text.

9. The ultrasound imaging system according to claim 8, further comprising means for annotating an image,
wherein the user annotates by speaking, using the means for converting speech to text.

10. The ultrasound imaging system according to claim 8, wherein the at least one user profile comprises a user-modifiable speech-text dictionary.

11. The ultrasound imaging system according to claim 1, wherein the at least one user profile has at least one keyword, and the at least one keyword is used by the respective user before speaking a command to the ultrasound imaging system.

12. The ultrasound imaging system according to claim 1, wherein the ultrasound imaging system is one of a plurality of ultrasound imaging systems, and the at least one user profile is located in at least one of:
memory means in at least one of the plurality of ultrasound imaging systems;
a centralized database that communicates with two or more of the plurality of ultrasound imaging systems; and
memory means carried by each user and readable by at least one of the plurality of ultrasound imaging systems.

13. The ultrasound imaging system according to claim 1, further comprising a user headset having a microphone,
wherein the headset is in wireless communication with at least the speech recognition system.

14. The ultrasound imaging system according to claim 1, wherein the ultrasound imaging system is implemented using wireless communication links between its components, the components including a printer, a personal digital assistant (PDA), a user headset, and a user trackball device.

15. The ultrasound imaging system according to claim 12, wherein the wireless communication link is one of a Bluetooth link, an IEEE 802.11 link, a HiperLAN link, a HomeRF link, an infrared technology link, and a magnetic induction technology link.

16. The ultrasound imaging system according to claim 1, further comprising a simplified mechanical user interface incorporated into the speech recognition system, the simplified mechanical user interface comprising:
a trackball device with which an operator manipulates items displayed on a display;
a selection input with which the operator selects an item displayed on the display; and
a touch panel capable of displaying items and receiving input.

17. An ultrasound imaging system comprising:
a speech recognition system for controlling the ultrasound imaging system; and
at least one of:
a speech recognition training file for each operator;
a voice recognition system used, as one of the data entry inputs, to identify the operator of the ultrasound imaging system and to find the corresponding portable speech recognition training file;
a speech-to-text converter with which the operator annotates ultrasound images;
a speech-text dictionary for each operator, modifiable by that operator;
means for the operator to perform targeted speech recognition training on specific words and expressions; and
a keyword activation feature in which a keyword is used by the operator to indicate to the speech recognition system that a spoken command will follow.

18. The ultrasound imaging system according to claim 17, wherein the ultrasound imaging system is one of a plurality of ultrasound imaging systems, and at least one of the speech recognition training file, the speech-text dictionary, and the stored keywords can be carried to any of the plurality of ultrasound imaging systems.

19. The ultrasound imaging system according to claim 17, further comprising a user headset having a microphone,
wherein the headset is in wireless communication with at least the speech recognition system.

20. The ultrasound imaging system according to claim 17, further comprising:
a display; and
a simplified mechanical user interface incorporated into the speech recognition system, the simplified mechanical user interface comprising:
a trackball device with which an operator manipulates items displayed on the display;
a selection input with which the operator selects an item displayed on the display; and
a touch panel capable of displaying items and receiving input.