JP2000260170A

JP2000260170A - Speech input device and speech recognition system

Info

Publication number: JP2000260170A
Application number: JP11061490A
Authority: JP
Inventors: Yuichi Tomii; 雄一冨井
Original assignee: Olympus Optical Co Ltd
Current assignee: Olympus Corp
Priority date: 1999-03-09
Filing date: 1999-03-09
Publication date: 2000-09-22

Abstract

PROBLEM TO BE SOLVED: To let the operator easily recognize that the device is in the enrollment mode by showing, on a display means, a predetermined indication that enrollment is in progress whenever speech is input so that an external device can register data for improving the accuracy of speech recognition processing. SOLUTION: When the SEL button is pressed twice and then pressed once more, PC-LINK processing is performed. In the enrollment mode, which is one of the PC-LINK modes, the operator inputs speech via a microphone and the speech data is transmitted to a PC. At the same time, to let the operator know that speech is being input, a counter is run and the count value in a display area of a display part 14 is incremented. Because 'ENROLL' is displayed and the counter is then run to increment the count value, the operator can easily recognize that speech is being input in the enrollment mode.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of the Invention] The present invention relates to a voice input device and a voice recognition system.

[0002]

[Description of the Related Art] Voice recognition systems have long been known in which voice is input using a voice input device such as a digital recorder and transferred to an external device such as a personal computer, which then performs enrollment processing (processing that registers data on the characteristics of the operator's voice before voice recognition, in order to improve the accuracy of the voice recognition processing) or voice recognition processing by voice recognition software.

[0003]

[Problems to be Solved by the Invention] In conventional voice recognition systems, however, the voice input device has no dedicated display function for indicating that an enrollment operation is in progress, so the operator could not easily recognize that the device was in the enrollment operation state.

[0004] The present invention has been made in view of this problem, and its object is to provide a voice input device and a voice recognition system that make it easy to recognize that the device is in the enrollment mode.

[0005]

[Means for Solving the Problems] To achieve the above object, a voice input device according to a first invention is a voice input device that can record on its own and can perform data communication with an external device, and comprises display means and control means that causes the display means to show a predetermined indication that an enrollment operation is in progress when voice is input in order to have the external device register data for improving the accuracy of voice recognition processing.

[0006] A voice recognition system according to a second invention comprises a voice input device that has display means, can record on its own, and can perform data communication with an external device, and a voice recognition processing device that receives voice data transmitted from the voice input device and subjects the received voice data to voice recognition processing. The voice recognition processing device has an enrollment function for registering data for improving the accuracy of voice recognition processing. While voice is input to the voice input device, the input voice information is transmitted to the voice recognition processing device, and the enrollment operation is performed in the voice recognition processing device, the display means of the voice input device is made to show a predetermined indication that the enrollment operation is in progress.

[0007] In a voice input device according to a third invention, which is the voice input device according to the first invention, the indication that the enrollment operation is in progress is given by running a counter.

[0008] That is, in the first invention, a voice input device that can record on its own and can perform data communication with an external device is used, and when voice is input in order to have the external device register data for improving the accuracy of voice recognition processing, the display means is made to show a predetermined indication that the enrollment operation is in progress.

[0009] In the second invention, a voice recognition system is made up of a voice input device that can record on its own, can perform data communication with an external device, and has display means, and a voice recognition processing device that subjects voice data received from the voice input device to voice recognition processing. While the voice recognition processing device is performing the enrollment operation, which registers data for improving the accuracy of voice recognition processing of the received voice information, the display means of the voice input device is made to show a predetermined indication that the enrollment operation is in progress.

[0010] In the voice input device according to the third invention, which is the voice input device according to the first invention, the indication that the enrollment operation is in progress is given by running a counter.

[0011]

[Embodiments of the Invention] Embodiments of the present invention are described in detail below with reference to the drawings. FIG. 1 shows the configuration of a digital recorder to which the voice input device of the present invention is applied. In FIG. 1, a microphone 1 is connected to a low-pass filter (LPF) 3 via an amplifier (AMP) 2. The low-pass filter 3 is connected to an audio output terminal 18 via an amplifier 17, and is also connected to the T1 terminal of a digital signal processing unit (DSP) 5 via an analog-to-digital converter (A/D) 4. A speaker 12 is connected to the T2 terminal of the DSP 5 via a digital-to-analog converter (D/A) 9, a low-pass filter (LPF) 10, and an amplifier (AMP) 11.

[0012] The T3 terminal of the DSP 5 is connected to the T4 terminal of a system control unit (CPU) 6. A PC connection terminal 16 is connected to the T5 terminal of the CPU 6, a display unit (display means) 14 is connected to the T6 terminal via a drive circuit 13, a power control unit 15 is connected to the T7 terminal, and a built-in recording medium 7 is connected to the T8 terminal. The recording medium 7 may be removable.

[0013] An operation input unit 8 having various operation buttons (a record button REC, a play button PLAY, a stop button STOP, a skip button SKIP, and a select button SEL) is also connected to the CPU 6.

[0014] The digital recorder of this embodiment is connected to a personal computer (hereinafter, PC) serving as the external device or voice recognition processing device to form a voice recognition system, and has a PC-LINK mode in which the PC is made to perform various kinds of processing such as enrollment processing and voice recognition processing. Here, it is assumed that the enrollment work is done using the digital recorder as a remote-control microphone. First, the audio output terminal 18 is connected to the audio input terminal of the PC by a cable or the like, and then the enrollment mode is set. Next, the operator reads aloud the enrollment text displayed on the PC screen, so that voice is input from the microphone 1, amplified by the amplifier 2, and stripped of unwanted components by the low-pass filter 3. The signal is then amplified by the amplifier 17 to a level suitable for transmission and sent to the PC from the audio output terminal 18, still as an analog signal. A wireless method such as infrared may also be used.

[0015] Instead of being sent to the PC as an analog signal, the voice input from the microphone 1 may be converted to a digital signal by the analog-to-digital converter 4, compressed by the digital signal processing unit 5, and sent to the PC as encoded voice data. This assumes that the PC has a function for decoding the encoded voice data it receives.

[0016] The general recording and playback operation of the digital recorder is described below. When the operator presses the REC button, the recorder enters the recording mode. Voice input from the microphone 1 is converted to an electric signal, amplified by the amplifier 2, and stripped of unwanted components by the LPF 3. It is then converted to a digital signal by the analog-to-digital converter 4 and input to the DSP 5. The DSP 5 compresses the digital voice signal, which is then recorded on the recording medium 7 as voice data under the control of the CPU 6.

[0017] When the operator presses the PLAY button, the recorder enters the playback mode. Voice data is read from the recording medium 7 under the control of the CPU 6 and then decompressed by the DSP 5. The decompressed voice signal is converted to an analog signal by the digital-to-analog converter 9, stripped of unwanted components by the low-pass filter 10, amplified by the amplifier 11, and output as sound from the speaker 12.

[0018] If the STOP button is pressed during recording or playback, the operation in progress is stopped. The display unit 14 shows various kinds of information depending on the mode. The power control unit 15 controls the power supplied to each part of the device and controls power saving.

[0019] FIG. 2 shows the external appearance of the digital recorder. In addition to the REC, PLAY, STOP, SKIP, and SEL buttons that make up the operation input unit 8, the microphone 1, the display unit 14, the speaker 12, the PC connection terminal 16, and the audio output terminal 18 are arranged at predetermined positions.

[0020] FIG. 3 is a flowchart detailing the processing performed by the CPU in response to operation input from the operator. The flow starts when a battery is loaded. First, in step S1, initial settings for operation, such as a memory check, are made. The device then enters the recorder mode, and information such as that shown in FIG. 5(A) is displayed on the display unit 14 (step S2). "No.1" in display area A of the display unit 14 is the file number, and "0001" in display area B is the count value of the counter. Counter values from 0 to n are provided for each file. "REC" in display area C is the mode indication and shows that the device is currently in the recording mode. "PLAY" is displayed in the playback mode, and nothing is displayed in the stop mode.
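As an illustration only, the three display areas described in the paragraph above can be modeled as a small record. The sketch below is in Python. The class name RecorderDisplay, its field names, and the four-digit zero padding are assumptions inferred from the "No.1" and "0001" examples, not details given in the specification.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class RecorderDisplay:
    """Hypothetical model of display unit 14 (all names are illustrative)."""
    file_no: int = 1   # display area A: file number
    count: int = 1     # display area B: counter value
    mode: str = ""     # display area C: "REC", "PLAY", ... ("" in stop mode)

    def render(self) -> Tuple[str, str, str]:
        """Return the text shown in display areas A, B, and C."""
        area_a = f"No.{self.file_no}"
        area_b = f"{self.count:04d}"  # assumed 4-digit padding, as in "0001"
        return area_a, area_b, self.mode
```

In the stop mode the third element is an empty string, matching the statement that nothing is displayed in area C.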

[0021] Next, it is determined whether the SEL button has been pressed (step S3). If NO, the process proceeds to step S4, where it is determined whether the REC button has been pressed. If YES, the process moves to step S8, performs recording processing (REC processing), and returns to step S3. If the determination in step S4 is NO, the process proceeds to step S5, where it is determined whether the PLAY button has been pressed. If YES, the process moves to step S9, performs playback processing (PLAY processing), and returns to step S3.

[0022] If the determination in step S5 is NO, the process proceeds to step S6, where it is determined whether the SKIP button has been pressed. If YES, the process moves to step S10, performs SKIP processing, and returns to step S3. SKIP processing changes the file number: each time the SKIP button is pressed, the displayed file number changes.

[0023] If the determination in step S6 is NO, the process proceeds to step S7, where it is determined whether there has been no operation input for 5 minutes or more. If NO, the process returns to step S3. If YES, the process moves to step S20 and sets the power-saving mode. The power-saving mode is then maintained while checking whether any operation input has occurred (step S21). When an operation input occurs, the determination in step S21 becomes YES and the process returns to step S1.
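The chain of button checks in steps S3 to S7 can be sketched as a single decision function. This is a minimal Python sketch under stated assumptions: the function name next_step, the string step labels it returns, and the idea of passing the idle time in seconds are all illustrative choices, not part of the specification. Only the order of the checks and the 5-minute limit come from the text.

```python
from typing import Optional

IDLE_LIMIT_S = 5 * 60  # "5 minutes or more" checked in step S7

# Buttons checked in steps S4 to S6 and the processing step each leads to.
HANDLERS = {"REC": "S8", "PLAY": "S9", "SKIP": "S10"}


def next_step(button: Optional[str], idle_s: float) -> str:
    """Return the label of the step the flow moves to from step S3.

    button is the pressed button name, or None if nothing was pressed;
    idle_s is the time in seconds since the last operation input.
    """
    if button == "SEL":           # S3 YES: go to ERASE mode display
        return "S11"
    if button in HANDLERS:        # S4, S5, S6 YES: REC, PLAY, or SKIP processing
        return HANDLERS[button]
    if idle_s >= IDLE_LIMIT_S:    # S7 YES: set the power-saving mode
        return "S20"
    return "S3"                   # S7 NO: keep polling
```

After steps S8, S9, and S10 the flow returns to step S3, so a caller would invoke this function again on the next polling cycle.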

[0024] If the determination in step S3 is YES, the process moves to step S11 and displays the erase mode (ERASE mode). In the ERASE mode, as shown in FIG. 5(B), the file number is displayed in display area A and "ERASE" is displayed in display area C. Next, it is determined whether the SEL button has been pressed (step S12). If NO, the process proceeds to step S13, where it is determined whether there has been no operation input for 5 minutes or more. If NO, the process proceeds to step S14, where the file to be erased is selected. The file is selected by pressing the SKIP button repeatedly. The file number blinks while not selected and stays lit once selected. Next, after confirming whether the selected file should be erased, based on whether the REC button is pressed (step S15), the selected file is erased (step S16). During erasure, "ERASE" in display area C blinks.

[0025] If the determination in step S13 is YES, the process moves to step S20 and sets the power-saving mode, which is then maintained while checking whether any operation input has occurred (step S21). When an operation input occurs, the determination in step S21 becomes YES and the process returns to step S1.

[0026] If the determination in step S12 is YES, the process moves to step S17 and displays the PC-LINK mode. In the PC-LINK mode, as shown in FIG. 5(C), "PC-LINK" is displayed in display area C to indicate the PC-LINK mode.

[0027] Next, the process proceeds to step S18, where it is determined whether the SEL button has been pressed. If NO, the process returns to step S2. If YES, it proceeds to step S19, performs PC-LINK processing, and returns to step S17.

[0028] As can be seen from the above, in this embodiment the PC-LINK mode is displayed when the SEL button has been pressed twice, and PC-LINK processing is performed when the SEL button is pressed once more.
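The way successive SEL presses step through the modes can be sketched as a small state machine. This is a hedged Python illustration: the Mode enum, the function press_sel, and the boolean "processing" flag are hypothetical names, and the sketch covers only the SEL-pressed branches (steps S3, S12, and S18), not the timeouts or the return to step S2.

```python
from enum import Enum
from typing import Tuple


class Mode(Enum):
    RECORDER = "recorder"  # normal recorder mode (step S2)
    ERASE = "ERASE"        # erase mode display (step S11)
    PC_LINK = "PC-LINK"    # PC-LINK mode display (step S17)


def press_sel(mode: Mode) -> Tuple[Mode, bool]:
    """Return (mode displayed next, whether PC-LINK processing runs)."""
    if mode is Mode.RECORDER:   # S3 YES -> S11
        return Mode.ERASE, False
    if mode is Mode.ERASE:      # S12 YES -> S17
        return Mode.PC_LINK, False
    # S18 YES -> S19 (PC-LINK processing), then back to the S17 display
    return Mode.PC_LINK, True
```

Starting from the recorder mode, two presses reach the PC-LINK display and a third press triggers PC-LINK processing, which matches the summary in the preceding paragraph.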

[0029] FIG. 4 is a flowchart detailing how the characteristics of the user's voice are registered in advance in the enrollment mode, which is one of the PC-LINK modes. When the enrollment mode is set, "ENROLL" is displayed in display area C, as shown in FIG. 5(D). In step S50, the connection between the PC and the recorder is checked, and it is determined whether the connection has been made properly (step S51). If NO, the process waits until the connection is normal. Once the connection is confirmed to be normal, data communication between the digital recorder and the PC becomes possible. In that case the determination in step S51 is YES, and the process proceeds to step S52, where the enrollment text is displayed on the PC. Here, as shown in FIG. 6, the message "Please read this sentence aloud." is displayed as the first of the 300 prepared sentences, together with a "Start" button for starting the enrollment processing and a "Stop" button for stopping it.

[0030] Next, it is determined whether the operator has given an instruction to start input (step S53). This is determined by whether the REC button has been pressed. When the REC button is pressed, a control signal indicating the start of input is sent to the PC. Next, in step S54, the operator reads aloud the sentence displayed on the PC screen, whereupon voice input through the microphone 1 starts and the input voice data is sent to the PC via the audio output terminal 18. At the same time, to let the operator know that voice input is in progress, the counter is run and the count value in display area B of the display unit 14 is incremented (step S55). Instead of running the counter, "ENROLL" in display area C may be made to blink. Next, it is determined whether an error has occurred during the voice input (step S56). Here, an error is raised when a sound far removed from the operator's voice, such as background noise, is input during voice input. An input error is also raised when the operator presses the STOP button to interrupt voice input. In the case of an input error, the counting operation of the counter is stopped (step S57), an error indication such as that shown in FIG. 5(E) is shown on the display unit 14 (step S58), and the process returns to step S52, where voice input is redone by reading the same sentence aloud.

[0031] If the determination in step S56 is NO, it is determined in the next step S59 whether voice input of one sentence has finished. If NO, the process returns to step S55. The counting operation continues as long as a sentence is being input without an input error. When voice input of the sentence finishes, the determination in step S59 becomes YES, and the counting operation of the counter is stopped in step S60.

[0032] Next, it is determined whether the enrollment operation is finished (step S61). If it is, the process returns. If sentences remain to be enrolled, the next sentence is displayed on the PC screen (step S62), and the process returns to step S54 to continue voice input. When the predetermined number of sentences (300 in this example) have been read aloud in this way, the determination in step S61 becomes YES and the process leaves the enrollment mode.
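The per-sentence loop of FIG. 4 (run the counter while a sentence is being input, stop it on completion or error, and repeat the same sentence after an error) can be sketched as follows. This is a simplified Python illustration, not the recorder's firmware: the function run_enrollment, the capture callback, and the log strings are hypothetical, and the connection check (S50, S51) and the REC-button wait (S53) are omitted.

```python
from typing import Callable, List


def run_enrollment(sentences: List[str],
                   capture: Callable[[str], bool]) -> List[str]:
    """Walk through the sentences and return a log of display events.

    capture(sentence) stands in for the voice input of one sentence.
    It returns True when the sentence was input without error and
    False on an input error (noise or STOP button, step S56 YES).
    Note: a capture that always fails would make this loop forever.
    """
    log: List[str] = []
    for sentence in sentences:            # S52 for the first, S62 afterwards
        while True:
            log.append(f"show:{sentence}")
            log.append("counter:run")     # S55: count value is incremented
            ok = capture(sentence)
            log.append("counter:stop")    # S57 on error, S60 on completion
            if ok:
                break                     # S59 YES: move to the next sentence
            log.append("display:ERROR")   # S58, then retry the same sentence
    return log                            # S61 YES: leave the enrollment mode
```

With a capture function that fails once on the first sentence, the log shows that sentence twice with one error indication in between, followed by the remaining sentences in order.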

[0033] According to the embodiment described above, in the enrollment mode, "ENROLL" is displayed as shown in FIG. 5(D) and the counter is then run to increment the count value, so the operator can easily recognize that voice input is in progress in the enrollment mode. A counter is normally provided in a digital recorder to show the progress of recording, playback, and the like. In this embodiment it doubles as an indicator that voice input is in progress, so no dedicated display space is needed for that purpose.

[0034] In addition, enrollment processing is carried out using the operation buttons (here, the REC and STOP buttons) on the operation unit of the digital recorder, which is a voice input device that can record on its own, so enrollment processing can be carried out without clicking the start and stop buttons displayed on the PC screen with a mouse or the like. Furthermore, operation buttons such as the REC and STOP buttons, which are used to set modes when the digital recorder is used on its own, double as operation buttons for voice recognition, so no new operation buttons dedicated to voice recognition are needed.

[0035] The specific embodiment described above includes inventions having the following configurations. (1) The voice recognition system according to claim 1, being a voice recognition system comprising a voice input device that has display means, can record on its own, and can perform data communication with an external device, and a voice recognition processing device that receives voice data transmitted from the voice input device and subjects the received voice data to voice recognition processing, wherein the voice recognition processing device has an enrollment function for registering data for improving the accuracy of voice recognition processing, and while voice is input to the voice input device, the input voice information is transmitted to the voice recognition processing device, and the enrollment operation is performed in the voice recognition processing device, the display means of the voice input device is made to show a predetermined indication that the enrollment operation is in progress.

[0036] (2) The voice recognition system according to (1), wherein the indication that the enrollment operation is in progress is given by running a counter.

[0037] According to the invention described in (1) above, the enrollment mode can be easily recognized.

[0038] According to the invention described in (2), in addition to the effect of the invention described in (1), there is no need to provide a dedicated space for indicating that voice input is in progress.

[0039]

[Effects of the Invention] According to the invention of claim 1 or 2, the enrollment mode can be easily recognized.

[0040] According to the invention of claim 3, in addition to the effect of the invention of claim 1, there is no need to provide a dedicated space for indicating that voice input is in progress.

[Brief description of the drawings]

FIG. 1 is a diagram showing the configuration of a digital recorder to which the voice input device of the present invention is applied.

FIG. 2 is a diagram showing the external appearance of the digital recorder.

FIG. 3 is a flowchart detailing the processing performed by the CPU in response to operation input from the operator.

FIG. 4 is a flowchart detailing how the user's voice is registered in advance in the enrollment mode, which is one of the PC-LINK modes.

FIG. 5 is a diagram showing what the display unit shows in each mode.

FIG. 6 is a diagram showing the PC screen display during enrollment.

[Explanation of symbols]

1: microphone, 2: amplifier (AMP), 3: low-pass filter (LPF), 4: analog-to-digital converter (A/D), 5: DSP, 6: control unit (CPU), 7: recording medium, 8: operation input unit, 9: digital-to-analog converter (D/A), 10: low-pass filter (LPF), 11: amplifier (AMP), 12: speaker, 13: drive circuit, 14: display unit, 15: power control unit, 16: PC connection terminal, 17: amplifier (AMP), 18: audio output terminal.

Claims

[Claims]

1. A voice input device that can record on its own and can perform data communication with an external device, comprising: display means; and control means that causes the display means to show a predetermined indication that an enrollment operation is in progress when voice is input in order to have the external device register data for improving the accuracy of voice recognition processing.

2. A voice recognition system comprising: a voice input device that has display means, can record on its own, and can perform data communication with an external device; and a voice recognition processing device that receives voice data transmitted from the voice input device and subjects the received voice data to voice recognition processing, wherein the voice recognition processing device has an enrollment function for registering data for improving the accuracy of voice recognition processing, and while voice is input to the voice input device, the input voice information is transmitted to the voice recognition processing device, and the enrollment operation is performed in the voice recognition processing device, the display means of the voice input device is made to show a predetermined indication that the enrollment operation is in progress.

3. The voice input device according to claim 1, wherein the indication that the enrollment operation is in progress is given by running a counter.