JP2000020092A

JP2000020092A - Dictation device and recording medium recording dictation program

Info

Publication number: JP2000020092A
Application number: JP10199525A
Authority: JP
Inventors: Masato Yajima; 真人矢島; Noriko Koyama; 紀子小山
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1998-06-30
Filing date: 1998-06-30
Publication date: 2000-01-21

Abstract

PROBLEM TO BE SOLVED: To eliminate the possibility that speech not intended by a user is input, by entering a state in which speech input other than specific command speech is not accepted when the user interrupts utterance for a specified period of time. SOLUTION: Speech introduced from an input section 11 is subjected to recognition processing and converted into character strings in a speech recognition section 14. These character strings are processed by a command forming section 15 or a text forming section 16. Meanwhile, the input intervals of the speech are monitored by a speech input judgment section 17. If no speech is input even after the specified time has elapsed, the speech input mode is turned off so that character strings input during this time are made ineffective. As a result, the possibility that unnecessary speech, such as external sounds or speech not intended by the user, is input while the user has interrupted speech input is eliminated.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a dictation device that is used in, for example, a personal computer and that recognizes words uttered by a user and inputs them as text. In particular, it relates to a dictation device having a function of controlling the acceptance of unnecessary voice input while the user has interrupted voice input, and to a recording medium on which a dictation program is recorded.

[0002]

[Prior Art] With advances in speech recognition technology, dictation systems that accept text input by voice have been developed. In a dictation system, text that used to be entered from a keyboard or the like is entered by voice.

[0003] Normally, control codes other than text (line feed, delete, and so on) are entered either from the keyboard or by uttering a specific command voice registered in advance (for example, "line feed symbol" or "delete symbol").

[0004] In conventional dictation systems, the voice input mode is often switched on and off from a keyboard, mouse, or the like, and once the voice input mode is turned on, all voice picked up by the microphone is recognized. Consequently, in addition to voice the user utters intentionally, voice coming from outside or voice the user utters with no intention of input (muttering, groaning, and the like) may be input.

[0005] Conventionally, when the user interrupts voice input for a long period, for example by leaving the seat or stopping input work, the voice input mode has been turned off through the keyboard or the like to avoid the risk of unnecessary voice being input.

[0006]

[Problem to Be Solved by the Invention] As described above, to avoid unnecessary voice input, the voice input mode can be turned off when voice input is interrupted for a long period. However, when the interruption is not long but temporary and short, for example while the user thinks about the next input, the user must issue a voice-input-off instruction from the keyboard or the like each time, which is very troublesome.

[0007] Moreover, such short interruptions occur frequently during voice input, and they are also the intervals in which voice the user does not intend to input (such as muttering) is most likely to be uttered. The conventional method therefore has the drawback that it cannot fully rule out noise being input during voice input.

[0008] The present invention has been made in view of the above. Its object is to provide a dictation device, and a recording medium on which a dictation program is recorded, that eliminate the possibility of unintended voice being input by entering a state in which voice input other than specific command voices is not accepted when the user has stopped speaking for a certain time.

[0009]

[Means for Solving the Problem] The present invention is characterized in that, when text is created by recognizing voice uttered by a user, the voice input interval is constantly monitored, and when voice is interrupted for a certain time or longer, the voice input mode is turned off so that character strings input during that time are invalidated, or so that input is not accepted except for specific character strings such as a command for activating the system.

[0010] With this configuration, when voice input stops for a certain time in the dictation system, the voice input mode is automatically turned off and unnecessary voice input is no longer accepted. This eliminates the possibility that unnecessary voice, such as external sounds or voice the user does not intend, is input while the user has interrupted voice input.

[0011]

[Embodiments of the Invention] Embodiments of the present invention are described below with reference to the drawings.

[0012] (First Embodiment) FIG. 1 is a block diagram showing the configuration of a dictation device according to a first embodiment of the present invention. The device recognizes words uttered by a user and inputs them as text. It is realized by a computer that reads a program recorded on a recording medium such as a magnetic disk and whose operation is controlled by that program.

[0013] As shown in FIG. 1, the device comprises an input unit 11, a control unit 12, a display unit 13, a voice recognition unit 14, a command generation unit 15, a text generation unit 16, a voice input determination unit 17, a command dictionary 18, and a text storage unit 19.

[0014] The input unit 11 is for inputting voice uttered by the user and consists of an input device such as a microphone.

[0015] The control unit 12 governs the entire device. It is, for example, a central processing unit (CPU) and controls the processing units that carry out document creation (the voice recognition unit 14, command generation unit 15, text generation unit 16, and voice input determination unit 17). The control unit 12 also has a mode storage unit 12a that stores the on/off state of the voice input mode. While the voice input mode is on, information input by voice is treated as valid; while it is off, information input by voice is treated as invalid.

[0016] The display unit 13 displays, among other things, the text character strings produced by voice recognition. It consists of, for example, a CRT display device or a liquid crystal display device (or another flat panel display device).

[0017] The voice recognition unit 14, command generation unit 15, text generation unit 16, voice input determination unit 17, command dictionary 18, and text storage unit 19 are the functional elements needed for document creation. Each is realized by its own processing routine (and the CPU that executes that routine).

[0018] The voice recognition unit 14 performs recognition processing on the voice input from the input unit 11 with reference to a voice recognition dictionary 14a, and outputs the character string obtained as the recognition result. The voice recognition processing (recognition engine) of the voice recognition unit 14 is always running, regardless of whether the voice input mode is on or off.

[0019] The command generation unit 15 searches the command dictionary 18 for the character string produced by the voice recognition unit 14. If the string is registered, it outputs the control code of the corresponding command; if not, it outputs the character string unchanged.

[0020] When the voice input mode is on, the text generation unit 16 stores the character string passed from the command generation unit 15 in the text storage unit 19. When the voice input mode is off, it does not store the string in the text storage unit 19.

[0021] The voice input determination unit 17 monitors the interval between voice inputs. If the voice input mode is on and no voice is input even after a certain time has elapsed, it turns off the voice input mode set in the mode storage unit 12a of the control unit 12.

[0022] The command dictionary 18 holds combinations of a character string representing a command, the corresponding control code, and the level at which the command is executed. FIG. 4 shows an example of the command dictionary 18.

[0023] In the example of FIG. 4, the character string 「おんせいにゅうりょくもーどおん」 (onsei nyūryoku mōdo on, "voice input mode on") is associated with the command "turn on the voice input mode", and its level is set to "1". Similarly, the character string 「おんせいにゅうりょくもーどおふ」 (onsei nyūryoku mōdo ofu, "voice input mode off") is associated with the command "turn off the voice input mode", and its level is set to "1". The character string 「かいぎょう」 (kaigyō, "line feed") is associated with the command "insert a line feed", and its level is set to "0". Similarly, the character string 「かいぺいじ」 (kaipeiji, "page break") is associated with the command "insert a page break", and its level is set to "0".

[0024] The first two commands relate directly to control of the system itself and are registered as level "1". The last two commands relate to document editing and do not directly concern control of the system itself, so commands of this kind are registered as level "0".
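The dictionary described in paragraphs [0022] to [0024] can be pictured as a simple lookup table. The following is only a minimal sketch: the recognized strings and levels come from the FIG. 4 example, while `CommandEntry`, `COMMAND_DICTIONARY`, `lookup_command`, and the control-code names are hypothetical, since the source does not give the actual control codes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CommandEntry:
    control_code: str  # hypothetical name; the source does not list actual codes
    level: int         # 1 = system control, 0 = document editing


# Recognized strings and levels follow the FIG. 4 example.
COMMAND_DICTIONARY = {
    "おんせいにゅうりょくもーどおん": CommandEntry("VOICE_INPUT_ON", 1),
    "おんせいにゅうりょくもーどおふ": CommandEntry("VOICE_INPUT_OFF", 1),
    "かいぎょう": CommandEntry("LINE_FEED", 0),
    "かいぺいじ": CommandEntry("PAGE_BREAK", 0),
}


def lookup_command(recognized: str) -> Optional[CommandEntry]:
    """Return the entry for a recognized string, or None if it is plain text."""
    return COMMAND_DICTIONARY.get(recognized)
```

A recognized string that is not a key, such as 「はじめに」, yields `None` and would be handled as ordinary text.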

[0025] The text storage unit 19 stores the text character strings produced by voice recognition.

[0026] Next, the operation of the first embodiment is described.

[0027] First, voice input from the input unit 11 is sent to the voice recognition unit 14 and the voice input determination unit 17. The voice recognition unit 14 recognizes the input voice, converts it into a word character string, and sends the string to the command generation unit 15.

[0028] The command generation unit 15 uses the word character string as a key to search the command character strings registered in advance in the command dictionary 18. If a matching command character string is found, it checks whether the level of the corresponding command matches the preset level. If it does, the command processing corresponding to the string is executed. Otherwise, the word character string is sent unchanged to the text generation unit 16.

[0029] When the voice input mode is on, the text generation unit 16 stores the received word character string in the text storage unit 19; when the voice input mode is off, it does not store it. The text stored in the text storage unit 19 is continuously displayed on the display unit 13.

[0030] The voice input mode is turned on through the flow from the input unit 11 to the command generation unit 15 described above. When a command voice for turning on the voice input mode is input from the input unit 11, the command generation unit 15 confirms that the command character string exists in the command dictionary 18 and outputs the control code of the corresponding command.

[0031] The voice input determination unit 17 monitors the interval between voice inputs. Once the voice input mode is on, it turns the mode off if the interval between successive voices sent from the input unit 11 exceeds a certain time.

[0032] Next, a detailed description is given with reference to the flowchart of FIG. 2.

[0033] FIG. 2 is a flowchart showing the voice processing operation in the first embodiment. When voice is input from the input unit 11, the control unit 12 passes it to the voice recognition unit 14 (step A11). The voice recognition unit 14 uses the voice recognition dictionary 14a to recognize the input voice, converts it into a word character string, and passes the string to the command generation unit 15 (step A12).

[0034] The command generation unit 15 checks whether the word character string obtained by the voice recognition unit 14 matches a command character string registered in the command dictionary 18 (step A13). If it does not match, that is, if a character string other than a command has been input (No in step A13), the command generation unit 15 checks the current state of the voice input mode (step A14).

[0035] If the voice input mode is on (Yes in step A14), the command generation unit 15 passes the word character string to the text generation unit 16, and the string is stored in the text storage unit 19 (step A15). If the voice input mode is off (No in step A14), the string is not stored in the text storage unit 19 and the device waits for the next voice input.

[0036] As a concrete example, suppose that the voice 「はじめに」 (hajimeni, "Introduction") is input from the input unit 11 while the voice input mode is on. This input voice is passed to the voice recognition unit 14 and converted into the word character string 「はじめに」. The command generation unit 15 searches a command dictionary 18 such as that shown in FIG. 4 for a matching command character string. Because there is no match in this case, the string is sent unchanged to the text generation unit 16. The word character string 「はじめに」 passed from the command generation unit 15 is then stored in the text storage unit 19.

[0037] If, on the other hand, the word character string matches a command character string registered in the command dictionary 18 in step A13, the command generation unit 15 checks whether the voice input mode is on (step A16). If the voice input mode is off (No in step A16), the command generation unit 15 checks whether the level of the matched command character string corresponds to the preset level (step A17).

[0038] If the level does not correspond to the preset level, that is, if the string is a low-level command character string of the kind used for document editing (No in step A17), the input of the string is invalidated and the corresponding command processing is not executed. If the level does correspond, that is, if the string is a high-level command character string concerning control of the system itself (Yes in step A17), the input of the string is treated as valid and the corresponding command processing is executed (step A18).

[0039] If the voice input mode is on in step A16, the input of the command character string is treated as valid regardless of the command level, and the corresponding command processing is executed (step A18).
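Steps A13 to A18 can be summarized in code. The following is a minimal sketch under stated assumptions, not the patent's implementation: `COMMANDS`, `DictationState`, `process`, and the command names are hypothetical, the recognized strings are taken from the FIG. 4 example, and the preset level is assumed to be "accept only level 1 commands".

```python
from dataclasses import dataclass, field

# Recognized string -> (hypothetical command name, level), after the FIG. 4 example.
COMMANDS = {
    "おんせいにゅうりょくもーどおん": ("voice_input_on", 1),
    "おんせいにゅうりょくもーどおふ": ("voice_input_off", 1),
    "かいぎょう": ("line_feed", 0),
}


@dataclass
class DictationState:
    mode_on: bool = True
    accepted_level: int = 1                        # level accepted while the mode is off
    text: list = field(default_factory=list)       # stands in for text storage unit 19
    executed: list = field(default_factory=list)   # log of executed commands


def process(state: DictationState, recognized: str) -> None:
    entry = COMMANDS.get(recognized)               # step A13: is it a command?
    if entry is None:
        if state.mode_on:                          # step A14
            state.text.append(recognized)          # step A15
        return                                     # mode off: discard and wait
    name, level = entry
    if not state.mode_on and level != state.accepted_level:  # steps A16, A17
        return                                     # command ignored while off
    state.executed.append(name)                    # step A18
    if name == "voice_input_on":
        state.mode_on = True
    elif name == "voice_input_off":
        state.mode_on = False
```

With the mode off, 「かいぎょう」 (level 0) is ignored, while 「おんせいにゅうりょくもーどおん」 (level 1) is executed and turns the mode back on.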

[0040] For example, consider the case where, after the voice 「はじめに」 (hajimeni, "Introduction") is input, no further voice is input for a certain time or longer and the voice input mode is turned off.

[0041] Suppose that the voice "line feed" is input in this state. The input voice 「かいぎょう」 (kaigyō) is converted by the voice recognition unit 14 into the word character string 「かいぎょう」 and compared with the command character strings registered in the command dictionary 18. In this case, the word character string 「かいぎょう」 matches a command character string registered in the command dictionary 18, as shown in FIG. 4.

[0042] Because the voice input mode is found to be off in step A16, the command dictionary 18 is consulted to check whether the level of 「かいぎょう」 (kaigyō, "line feed") corresponds to the preset command level. Assume here that the setting is "accept only level 1 commands". Since the level of 「かいぎょう」 is "0" as shown in FIG. 4, the corresponding command processing is not executed.

[0043] Next, suppose that the voice "voice input on" is input. This input voice is converted by the voice recognition unit 14 into the word character string 「おんせいにゅうりょくもーどおん」 (onsei nyūryoku mōdo on) and compared with the command character strings registered in the command dictionary 18. Because the voice input mode is off, the level of 「おんせいにゅうりょくもーどおん」 is checked against the command dictionary 18 in step A17. The level of this command is "1", which corresponds to the preset command level. Therefore, in step A18, the processing of the corresponding command "turn on the voice input mode" is executed.

[0044] The preset command level may be set in any way, for example "accept only level 0 commands" or "accept commands of all levels".
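One way to picture such a configurable setting is as a set of accepted levels. This is only an illustrative sketch; `accepts_while_off` and the three constants are hypothetical names, and the source does not say how the setting is stored.

```python
# Each setting is modeled as the set of command levels accepted while the
# voice input mode is off. All names here are hypothetical.
LEVEL_1_ONLY = frozenset({1})   # "accept only level 1 commands"
LEVEL_0_ONLY = frozenset({0})   # "accept only level 0 commands"
ALL_LEVELS = frozenset({0, 1})  # "accept commands of all levels"


def accepts_while_off(level: int, accepted_levels: frozenset) -> bool:
    """Return True if a command of this level runs while the mode is off."""
    return level in accepted_levels
```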

[0045] Next, the mechanism by which the voice input mode is turned off is described with reference to the flowchart of FIG. 3.

[0046] FIG. 3 is a flowchart showing the voice input determination processing in the first embodiment. The voice input determination unit 17 constantly monitors the state of the voice input mode (step B11). When the voice input mode is on (Yes in step B11), the voice input determination unit 17 checks the input interval of the voice passed from the input unit 11 (step B12). If no voice has been input for longer than a certain time (Yes in step B12), the voice input determination unit 17 turns off the voice input mode set in the mode storage unit 12a of the control unit 12 (step B13).
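The monitoring loop of steps B11 to B13 might be sketched as follows. This is an assumption-laden illustration: `VoiceInputMonitor`, its method names, and the idea of passing the current time in as `now` (in seconds) are all hypothetical, and the source does not specify the length of the "certain time".

```python
class VoiceInputMonitor:
    """Turns the voice input mode off after a silent gap longer than timeout_s."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s
        self.mode_on = False
        self.last_voice_at = 0.0

    def turn_on(self, now: float) -> None:
        self.mode_on = True
        self.last_voice_at = now

    def check(self, now: float) -> None:
        if not self.mode_on:                            # step B11: only watch while on
            return
        if now - self.last_voice_at > self.timeout_s:   # step B12: gap too long?
            self.mode_on = False                        # step B13: turn the mode off

    def on_voice(self, now: float) -> bool:
        """Record a voice arrival; return whether the mode is still on."""
        self.check(now)
        self.last_voice_at = now
        return self.mode_on
```

In a real program, `now` could be supplied by a monotonic clock such as `time.monotonic()`; passing it explicitly keeps the sketch deterministic.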

[0047] For example, suppose that the voice 「えーと」 (ēto, "um") is input a certain time after the voice 「はじめに」 (hajimeni, "Introduction") is input. In this case, the voice input determination unit 17, having determined that the voice input mode is on, monitors the interval between voice inputs, confirms that no voice was input for the certain time after 「はじめに」, and turns the voice input mode off.

[0048] Consequently, when the voice 「えーと」 (ēto, "um") input from the input unit 11 is converted into the word character string 「えーと」 by the voice recognition unit 14, the text generation unit 16 determines that the voice input mode is off, and the string is not stored in the text storage unit 19. In other words, only the character string 「はじめに」 (hajimeni, "Introduction"), which was input before the voice input mode was turned off, is stored in the text storage unit 19. As a result, as shown in FIG. 5, the display unit 13 displays the text 「はじめに」 stored in the text storage unit 19.
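The 「はじめに」 and 「えーと」 example can be traced with a small simulation. This is a simplified sketch that ignores commands entirely; the function name `dictate`, the timestamps, and the 5-second timeout are assumptions chosen only to illustrate the behavior.

```python
def dictate(utterances, timeout_s):
    """Return the strings stored as text.

    utterances: list of (time_s, recognized_string), in time order.
    The voice input mode is assumed to be on at time 0.
    """
    mode_on = True
    last_voice_at = 0.0
    stored = []
    for time_s, recognized in utterances:
        # Silent gap longer than the timeout: the mode has been turned off.
        if mode_on and time_s - last_voice_at > timeout_s:
            mode_on = False
        last_voice_at = time_s
        if mode_on:
            stored.append(recognized)
    return stored


# 「はじめに」 arrives promptly; 「えーと」 arrives after a long pause.
result = dictate([(1.0, "はじめに"), (9.0, "えーと")], timeout_s=5.0)
```

Here `result` contains only 「はじめに」, matching the display described for FIG. 5.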

[0049] In this way, turning off the voice input mode when no voice has been input for a certain time prevents unnecessary voice, such as external sounds (noise) produced between voice inputs or voice the user utters unintentionally, from being entered as text. In addition, when a specific command is input by voice during this period, the command is processed, so that only unnecessary voice input is excluded while voice input that directly concerns control of the system itself is still handled as usual.

[0050] (Second Embodiment) Next, a second embodiment of the present invention is described.

[0051] FIG. 6 is a block diagram showing the configuration of a dictation device according to the second embodiment of the present invention. Like the device of FIG. 1, this device is realized by a computer that reads a program recorded on a recording medium such as a magnetic disk and whose operation is controlled by that program. Reference numerals 11 to 19 in the figure denote the same elements as in FIG. 1.

[0052] The configuration of FIG. 6 differs from that of FIG. 1 in that a voice input mode warning unit 20 is provided. In accordance with the determination result of the voice input determination unit 17, the voice input mode warning unit 20 displays on the display unit 13 a message that makes explicit whether the voice input mode is on or off.

[0053] Next, the operation of the second embodiment is described.

[0054] The operation of the configuration of FIG. 6 is exactly the same as that of the configuration of FIG. 1 in the first embodiment, and the details of the voice processing according to the flowchart of FIG. 2 are also unchanged. The difference is that processing by the voice input mode warning unit 20 is added to the mechanism by which the voice input mode is turned off. This is described with reference to the flowchart of FIG. 7.

[0055] FIG. 7 is a flowchart showing the voice input determination processing in the second embodiment. The voice input determination unit 17 constantly monitors the state of the voice input mode (step C11). When the voice input mode is on (Yes in step C11), the voice input determination unit 17 checks the input interval of the voice passed from the input unit 11 (step C12). If no voice has been input for longer than a certain time (Yes in step C12), the voice input determination unit 17 turns off the voice input mode set in the mode storage unit 12a of the control unit 12 (step C13).

[0056] When the voice input mode is turned off, the voice input determination unit 17 activates the voice input mode warning unit 20 and informs it that the voice input mode has been turned off. In response, the voice input mode warning unit 20 displays on the display unit 13 that the voice input mode is off (step C14).
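Steps C11 to C14 add a notification to the timeout check. A minimal sketch, assuming the warning unit can be modeled as a callback: `MonitorWithWarning`, the `notify` parameter, and the message text are hypothetical and not taken from FIG. 8.

```python
from typing import Callable


class MonitorWithWarning:
    """Timeout monitor that reports the automatic mode-off to a callback."""

    def __init__(self, timeout_s: float, notify: Callable[[str], None]) -> None:
        self.timeout_s = timeout_s
        self.notify = notify        # stands in for voice input mode warning unit 20
        self.mode_on = False
        self.last_voice_at = 0.0

    def turn_on(self, now: float) -> None:
        self.mode_on = True
        self.last_voice_at = now

    def check(self, now: float) -> None:
        if not self.mode_on:                            # step C11
            return
        if now - self.last_voice_at > self.timeout_s:   # step C12
            self.mode_on = False                        # step C13
            self.notify("Voice input mode is off.")     # step C14
```

Passing `print` as `notify` would show the message on the console; a lamp, a tone, or a spoken message could be wired in the same way.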

[0057] Consider the case where 「はじめに」 (hajimeni, "Introduction") is stored in the text storage unit 19 and the voice input mode is on. Because the voice input mode is on, the voice input determination unit 17 checks the interval between voices. If no voice following 「はじめに」 is input even after the certain time has elapsed, the voice input determination unit 17 turns off the voice input mode and informs the voice input mode warning unit 20. The voice input mode warning unit 20 notifies the user that the voice input mode has been turned off by displaying a message 31 such as that shown in FIG. 8 on the display unit 13.

[0058] Because the user is notified when the voice input mode is automatically turned off, the user is kept from speaking in vain without noticing that the mode is off.

[0059] In this embodiment, the user is notified that the voice input mode has been turned off by the display of a message 31 such as that shown in FIG. 8. Other possible notification methods include lighting a warning lamp, sounding a warning tone, or outputting a spoken message such as "The voice input mode has been turned off."

[0060] (Third Embodiment) Next, a third embodiment of the present invention is described.

[0061] FIG. 9 is a block diagram showing the configuration of a dictation device according to the third embodiment of the present invention. Like the device of FIG. 1, this device is realized by a computer that reads a program recorded on a recording medium such as a magnetic disk and whose operation is controlled by that program. Reference numerals 11 to 19 in the figure denote the same elements as in FIG. 1.

[0062] The configuration of FIG. 9 differs from that of FIG. 1 in that a voice input mode switching unit 21 is provided. The voice input mode switching unit 21 switches the voice input mode on when a command character string is input while the voice input mode is off.

[0063] Next, the operation of the third embodiment is described.

[0064] First, voice input from the input unit 11 is sent to the voice recognition unit 14 and the voice input determination unit 17. The voice recognition unit 14 recognizes the input voice, converts it into a word character string, and sends the string to the command generation unit 15.

[0065] The command generation unit 15 searches the command character strings registered in advance in the command dictionary 18 for the word character string. If a matching command character string is found, it checks whether the level of the corresponding command matches the preset level. If it does, the corresponding command processing is executed, and at that point the voice input mode switching unit 21 is activated. Otherwise, the word character string is sent as is to the text generation unit 16.

[0066] The voice input mode switching unit 21 checks whether the voice input mode is on and, if it is off, turns it on.

[0067] When the voice input mode is on, the text generation unit 16 stores the word character string sent to it in the text storage unit 19. When the voice input mode is off, nothing is stored in the text storage unit 19. The text stored in the text storage unit 19 is continuously displayed on the display unit 13.

[0068] The voice input mode is turned on through the flow from the input unit 11 to the command generation unit 15 described above. When a command voice for turning on the voice input mode is input from the input unit 11, the command generation unit 15 confirms that the command character string exists in the command dictionary 18 and outputs the control code of the command corresponding to that character string.

[0069] The voice input determination unit 17 monitors the interval between voice inputs. Once the voice input mode is on, it turns the voice input mode off if the interval between successive voice inputs from the input unit 11 exceeds a predetermined time.
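
The interval monitoring performed by the voice input determination unit 17 can be illustrated with a minimal Python sketch. The class name, method names, and the 5-second default are assumptions made for illustration only; the description specifies only "a predetermined time" and does not prescribe an implementation.

```python
import time


class VoiceInputMonitor:
    """Illustrative stand-in for the voice input determination unit 17."""

    def __init__(self, timeout_s=5.0, clock=time.monotonic):
        self.timeout_s = timeout_s   # assumed value for the "predetermined time"
        self.clock = clock           # injectable clock so the logic can be tested
        self.mode_on = False         # voice input mode
        self.last_voice_at = None    # time of the most recent voice input

    def turn_on(self):
        """Called when a mode-on command has been accepted."""
        self.mode_on = True
        self.last_voice_at = self.clock()

    def check(self):
        """Turn the mode off if the silence has exceeded the timeout."""
        if self.mode_on and self.clock() - self.last_voice_at > self.timeout_s:
            self.mode_on = False
        return self.mode_on

    def on_voice(self):
        """Called whenever the input unit delivers voice."""
        self.check()                       # a late utterance arrives with the mode already off
        self.last_voice_at = self.clock()  # restart the interval measurement
```

In this sketch a caller would invoke `check()` periodically and `on_voice()` on each utterance; an utterance that arrives after the timeout does not by itself turn the mode back on.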

[0070] Next, the operation will be described in detail with reference to the flowchart of FIG. 10.

[0071] FIG. 10 is a flowchart showing the voice processing operation in the third embodiment. When voice is input from the input unit 11, the control unit 12 passes it to the voice recognition unit 14 (step D11). The voice recognition unit 14 uses the voice recognition dictionary 14a to recognize the input voice, converts it into a word character string, and passes the string to the command generation unit 15 (step D12).

[0072] The command generation unit 15 checks whether the word character string obtained by the voice recognition unit 14 matches a command character string registered in the command dictionary 18 (step D13). If it does not match any command character string, that is, if a character string other than a command has been input (No in step D13), the command generation unit 15 checks the current state of the voice input mode (step D14).

[0073] If the voice input mode is on (Yes in step D14), the command generation unit 15 passes the word character string to the text generation unit 16, which stores it in the text storage unit 19 (step D15). If the voice input mode is off (No in step D14), nothing is stored in the text storage unit 19 and the device waits for the next voice input.

[0074] On the other hand, if the word character string matches a command character string registered in the command dictionary 18 in step D13, the command generation unit 15 checks whether the voice input mode is on (step D16). If the voice input mode is off (No in step D16), the command generation unit 15 checks whether the level of the matching command character string corresponds to the preset level (step D17).

[0075] If the level of the command character string does not correspond to the set level (No in step D17), the input of the character string is invalidated and the corresponding command processing is not executed. If it corresponds to the set level (Yes in step D17), the input of the character string is validated and the corresponding command processing is executed (step D18).

[0076] At this point, the command generation unit 15 activates the voice input mode switching unit 21. The voice input mode switching unit 21 checks whether the voice input mode is on or off and, if it is off, switches it on (step D19).

[0077] If the voice input mode is on (Yes in step D16), the command character string is treated as valid regardless of its command level, and the corresponding processing is executed (step D20).
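
The branching in steps D13 to D20 can be summarized in a short Python sketch. This is only one reading of the flowchart description: the data layout (a dictionary mapping a word string to a command name and level), the level numbers, and all identifiers are assumptions for illustration, not part of the described device.

```python
from dataclasses import dataclass, field


@dataclass
class DictationState:
    mode_on: bool = False                                           # voice input mode
    accepted_levels: set = field(default_factory=lambda: {0, 1})    # assumed "preset level"
    text: list = field(default_factory=list)       # stands in for text storage unit 19
    executed: list = field(default_factory=list)   # record of executed commands


def process_word(state, word, command_dictionary):
    """Handle one recognized word string (roughly steps D13 to D20)."""
    if word not in command_dictionary:       # D13: not a command string
        if state.mode_on:                    # D14: check the voice input mode
            state.text.append(word)          # D15: store as text
        return                               # mode off: discard and wait for input
    command, level = command_dictionary[word]
    if state.mode_on:                        # D16: mode is on
        state.executed.append(command)       # D20: execute regardless of level
        return
    if level in state.accepted_levels:       # D17: level check while the mode is off
        state.executed.append(command)       # D18: execute the command
        state.mode_on = True                 # D19: switching unit 21 turns the mode on
    # otherwise the input is invalidated and nothing is executed
```

For example, with `accepted_levels={1}` a level-0 editing command spoken while the mode is off is ignored, whereas a level-1 command is executed and also turns the voice input mode back on.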

[0078] For example, suppose that the spoken word "page break" (kaipēji) is input while the voice input mode is off. The voice recognition unit 14 converts this voice into the word character string "kaipēji". When the command generation unit 15 searches the command dictionary 18, a matching command character string is found.

[0079] When the voice input mode is checked in step D16, it is found to be off, so the device checks whether the level of the command character string "kaipēji" in the command dictionary 18 corresponds to the preset command level.

[0080] If the setting is "accept commands of all levels", the command character string "kaipēji" counts as a command of the set level, and the corresponding command, "insert a page break", is executed in step D19.

[0081] In addition, the command generation unit 15 activates the voice input mode switching unit 21. The voice input mode switching unit 21 checks whether the voice input mode is on. In this case the voice input mode is off, so the unit switches it on.

[0082] The preset command level may also be set in other ways, such as "accept only level 0 commands" or "accept only level 1 commands".
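
One way to represent such a level setting is as a predicate over command levels, sketched below in Python. The function name and the encoding of the setting (the string "all" or a collection of level numbers) are hypothetical and chosen only for illustration.

```python
def make_level_filter(setting):
    """Return a predicate that says whether a command level is accepted
    while the voice input mode is off.

    `setting` is either the string "all" or an iterable of accepted
    level numbers (an assumed encoding, not taken from the description).
    """
    if setting == "all":
        return lambda level: True        # "accept commands of all levels"
    accepted = frozenset(setting)
    return lambda level: level in accepted


accept_level_0_only = make_level_filter([0])   # "accept only level 0 commands"
accept_everything = make_level_filter("all")
```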

[0083] In this way, when the input voice is a command voice, the device judges that the user has resumed the interrupted voice input and switches to a state in which voice input is accepted. As a result, when the user resumes voice input while voice input is not being accepted (the voice input mode is off), the user can begin voice input immediately after speaking the command, without having to issue a state-changing command from a keyboard or the like.

[0084] (Fourth Embodiment) Next, a fourth embodiment of the present invention will be described.

[0085] FIG. 11 is a block diagram showing the configuration of a dictation device according to the fourth embodiment of the present invention. As in FIG. 1, this device is realized by a computer that reads a program recorded on a recording medium such as a magnetic disk and whose operation is controlled by that program. Components 11 to 19 in the figure are the same as those in FIG. 1.

[0086] The configuration of FIG. 11 differs from that of FIG. 1 in that a recognition rate determination unit 22 is provided. The recognition rate determination unit 22 determines whether the recognition rate of the word character string recognized by the voice recognition unit 14 is higher than a certain value.

[0087] Next, the operation of the fourth embodiment will be described.

[0088] First, when voice is input from the input unit 11, it is sent to the voice recognition unit 14 and the voice input determination unit 17. The voice recognition unit 14 recognizes the input voice, converts it into a word character string, and sends the string to the command generation unit 15.

[0089] The command generation unit 15 searches the command character strings registered in advance in the command dictionary 18 for the word character string. If a matching command character string is found, it checks whether the level of the corresponding command matches the preset level and, if so, executes the corresponding command processing. Otherwise, it queries the recognition rate determination unit 22.

[0090] The recognition rate determination unit 22 checks whether the recognition rate of the word character string recognized by the voice recognition unit 14 exceeds a preset value. If it does, the unit turns the voice input mode on and returns control to the command generation unit 15. The command generation unit 15 then sends the word character string to the text generation unit 16.

[0091] When the voice input mode is on, the text generation unit 16 stores the word character string sent to it in the text storage unit 19. When the voice input mode is off, nothing is stored in the text storage unit 19. The text stored in the text storage unit 19 is continuously displayed on the display unit 13.

[0092] The voice input mode is turned on through the flow from the input unit 11 to the command generation unit 15 described above. When a command voice for turning on the voice input mode is input from the input unit 11, the command generation unit 15 confirms that the command character string exists in the command dictionary 18 and outputs the control code of the command corresponding to that character string.

[0093] The voice input determination unit 17 monitors the interval between voice inputs. Once the voice input mode is on, it turns the voice input mode off if the interval between successive voice inputs from the input unit 11 exceeds a predetermined time.

[0094] Next, the operation will be described in detail with reference to the flowchart of FIG. 12.

[0095] FIG. 12 is a flowchart showing the voice processing operation in the fourth embodiment. When voice is input from the input unit 11, the control unit 12 passes it to the voice recognition unit 14 (step E11). The voice recognition unit 14 uses the voice recognition dictionary 14a to recognize the input voice, converts it into a word character string, and passes the string to the command generation unit 15 (step E12).

[0096] The command generation unit 15 checks whether the word character string obtained by the voice recognition unit 14 matches a command character string registered in the command dictionary 18 (step E13). If it matches, the command generation unit 15 checks whether the voice input mode is on (step E14).

[0097] If the voice input mode is off (No in step E14), the command generation unit 15 checks whether the level of the matching command character string corresponds to the preset level (step E15).

[0098] If the level does not correspond to the set level, that is, if the string is a low-level command character string such as one used for document editing (No in step E15), the input of the character string is invalidated and the corresponding command processing is not executed. If the level corresponds to the set level, that is, if the string is a high-level command character string relating to control of the system itself (Yes in step E15), the input of the character string is validated and the corresponding command processing is executed (step E16).

[0099] On the other hand, if the word character string does not match any command character string registered in the command dictionary 18 in step E13, the recognition rate determination unit 22 is queried.

[0100] For example, suppose that the spoken phrase "owarini" ("in conclusion") is input from the input unit 11 while the voice input mode is off. The voice is passed to the voice recognition unit 14 and converted into the word character string "owarini". The command generation unit 15 searches the command dictionary 18, finds no matching command character string, and therefore queries the recognition rate determination unit 22.

[0101] The recognition rate determination unit 22 determines whether the recognition rate of the word character string obtained by the voice recognition unit 14 is larger than a preset value (step E17). If the recognition rate exceeds the set value (Yes in step E17), the unit turns the voice input mode on (step E18) and returns control to the command generation unit 15. The command generation unit 15 passes the word character string as is to the text generation unit 16.

[0102] In this case, information for recognizing words related to text creation is registered in advance in the voice recognition dictionary 14a provided in the voice recognition unit 14. Therefore, when a word related to text creation is spoken, it can be recognized with a high recognition rate and distinguished from noise.

[0103] Since the word character string "owarini" is a valid word rather than noise, assume that its recognition rate is high. The recognition rate determination unit 22 then determines that the recognition rate is larger than the set value and turns the voice input mode on.

[0104] The text generation unit 16 checks whether the voice input mode is on (step E19). If it is on (Yes in step E19), the word character string passed from the command generation unit 15 is stored in the text storage unit 19 (step E20). If it is off (No in step E19), nothing is stored in the text storage unit 19 and the process returns to step E11.
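
The flow of steps E13 to E20 can be sketched in Python as follows. This is one possible reading of the description, under several assumptions: the recognition rate is treated as a number between 0 and 1, the threshold of 0.8 is arbitrary, a matching command is executed whenever the mode is on, and a low recognition rate simply falls through to the mode check of step E19. All identifiers are hypothetical.

```python
def process_recognized_word(state, word, recognition_rate, commands, threshold=0.8):
    """Handle one recognized word string (roughly steps E13 to E20).

    state:    dict with keys "mode_on" (bool), "accepted_levels" (set),
              "text" (list) and "executed" (list).
    commands: maps a command string to a (command_name, level) pair.
    threshold: assumed stand-in for the "preset value" of the recognition rate.
    """
    if word in commands:                                    # E13: matches a command string
        name, level = commands[word]
        if state["mode_on"] or level in state["accepted_levels"]:   # E14, E15
            state["executed"].append(name)                  # E16: execute the command
        return                                              # otherwise the input is invalid
    if recognition_rate > threshold:                        # E17: likely intended speech
        state["mode_on"] = True                             # E18: turn the mode on
    if state["mode_on"]:                                    # E19: check the mode
        state["text"].append(word)                          # E20: store as text
```

With the mode off, a poorly recognized sound is discarded, while a well-recognized word both re-enables the voice input mode and is itself stored as text.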

[0105] In the above example, the word character string "owarini" passed from the command generation unit 15 is stored in the text storage unit 19 and displayed on the display unit 13, because the voice input mode has been turned on by the determination of the recognition rate determination unit 22.

[0106] In this way, whether the input voice is noise is distinguished by the recognition rate of the voice recognition, and if the recognition rate is high, the input can be judged to be voice input intended by the user. Therefore, when the user resumes voice input while voice input is not being accepted (the voice input mode is off), the user can begin voice input immediately, without having to issue a state-changing command from a keyboard or the like.

[0107] The present invention is not limited to the embodiments described above.

[0108] For example, in the embodiments above, the voice input determination unit 17 that turns the voice input mode off, the recognition rate determination unit 22 that turns the voice input mode on when the recognition rate is larger than a certain value, and the voice input mode switching unit 21 that turns the voice input mode on when a command voice is input each perform their processing in separate configurations, but these may also be used together.

[0109] In short, the present invention can be implemented with various modifications without departing from its gist.

[0110] The methods described in the above embodiments can also be written, as a program executable by a computer, to a recording medium such as a magnetic disk (floppy disk, hard disk, etc.), an optical disk (CD-ROM, DVD, etc.), or a semiconductor memory and applied to various devices, or transmitted over a communication medium and applied to various devices. A computer that realizes this device reads the program recorded on the recording medium and executes the processing described above with its operation controlled by that program.

[0111]

[Effects of the Invention] As described above, according to the present invention, when text is created by recognizing the voice uttered by a user, the voice input interval is constantly monitored. When the voice is interrupted for a predetermined time or longer, the voice input mode is turned off and character strings input during that time are invalidated, or input is refused except for specific character strings such as a command for starting the system. This eliminates the possibility that unnecessary sound, such as external noise or speech not intended by the user, is input while the user has interrupted voice input.

[0112] In addition, by notifying the user when voice input is not being accepted (the voice input mode is off), the user can be prevented from speaking to no effect without noticing that the mode is off.

[0113] In addition, when a command voice is input, the device judges that the user has resumed the interrupted voice input and switches to a state in which voice input is accepted. As a result, when the user resumes voice input while voice input is not being accepted (the voice input mode is off), the user can begin voice input immediately after speaking the command, without having to issue a state-changing command from a keyboard or the like. Furthermore, whether the input voice is noise is distinguished by the recognition rate of the voice recognition, and if the recognition rate is high, the input is judged to be voice input intended by the user. As a result, when the user resumes voice input while voice input is not being accepted (the voice input mode is off), the user can begin voice input immediately, without having to issue a state-changing command from a keyboard or the like.

[Brief Description of the Drawings]

FIG. 1 is a block diagram showing the configuration of a dictation device according to a first embodiment of the present invention.

FIG. 2 is a flowchart for explaining the voice processing operation in the first embodiment.

FIG. 3 is a flowchart for explaining the voice input determination processing operation in the first embodiment.

FIG. 4 is a diagram showing the structure of the command dictionary in the first embodiment.

FIG. 5 is a diagram showing a display screen in the first embodiment.

FIG. 6 is a block diagram showing the configuration of a dictation device according to a second embodiment of the present invention.

FIG. 7 is a flowchart for explaining the voice input determination processing operation in the second embodiment.

FIG. 8 is a diagram showing a display screen in the second embodiment.

FIG. 9 is a block diagram showing the configuration of a dictation device according to a third embodiment of the present invention.

FIG. 10 is a flowchart for explaining the voice processing operation in the third embodiment.

FIG. 11 is a block diagram showing the configuration of a dictation device according to a fourth embodiment of the present invention.

FIG. 12 is a flowchart for explaining the voice processing operation in the fourth embodiment.

[Explanation of symbols]

11: input unit
12: control unit
12a: mode storage unit
13: display unit
14: voice recognition unit
14a: voice recognition dictionary
15: command generation unit
16: text generation unit
17: voice input determination unit
18: command dictionary
19: text storage unit
20: voice input mode warning unit
21: voice input mode switching unit
22: recognition rate determination unit

Claims

[Claims]

1. A dictation device comprising: voice input means for inputting voice; voice recognition means for recognizing the voice input by the voice input means and converting it into a character string; processing means for processing the character string obtained as a recognition result by the voice recognition means; and input control means for monitoring the voice input interval and, when no voice is input from the voice input means even after a certain period of time has elapsed, turning off the voice input mode and invalidating character strings input during that time.

2. A dictation device comprising: voice input means for inputting voice; voice recognition means for recognizing the voice input by the voice input means and converting it into a character string; processing means for processing the character string obtained as a recognition result by the voice recognition means; storage means for storing specific character strings whose input is to be accepted; and input control means for monitoring the voice input interval and, when no voice is input from the voice input means even after a certain period of time has elapsed, turning off the voice input mode and invalidating character strings input during that time other than the specific character strings stored in the storage means.

3. The dictation device according to claim 1, further comprising a notification unit for notifying that the voice input mode is in an off state.

4. The dictation device according to claim 1, wherein the input control means switches the voice input mode to an on state when a character string relating to a command is input.

5. The dictation device according to claim 1, wherein the input control means switches the voice input mode to an on state when a character string whose recognition rate by the voice recognition means is higher than a preset value is input.

6. A computer-readable recording medium on which is recorded a dictation program for recognizing voice and inputting it as text, the program causing a computer to execute: a procedure for inputting voice; a procedure for recognizing the input voice and converting it into a character string; a procedure for processing the character string obtained as a recognition result; and a procedure for monitoring the voice input interval and, when no voice is input even after a certain period of time has elapsed, turning off the voice input mode and invalidating character strings input during that time.

7. A computer-readable recording medium on which is recorded a dictation program for recognizing voice and inputting it as text, the program causing a computer to execute: a procedure for inputting voice; a procedure for recognizing the input voice and converting it into a character string; a procedure for processing the character string obtained as a recognition result; and a procedure for monitoring the voice input interval and, when no voice is input even after a certain period of time has elapsed, turning off the voice input mode and invalidating character strings input during that time other than specific character strings.