JPS63155229A

JPS63155229A - Conversion system for word processor

Info

Publication number: JPS63155229A
Application number: JP61300124A
Authority: JP
Inventors: Masahiko Uchiyama; 昌彦内山
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1986-12-18
Filing date: 1986-12-18
Publication date: 1988-06-28

Abstract

(57) [Abstract] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

[Detailed Description of the Invention]

[Summary]

The present invention is a conversion method for a voice input word processor in which the time length of the input voice is identified, so that character conversion, line conversion, and the like can be performed by voice rather than from the keyboard.

[Industrial application field]

The present invention relates to a word processor, and more particularly to a conversion method for a word processor that accepts voice input.

[Prior art and problems to be solved by the invention]

In recent years, with the development of speech recognition technology, voice has come to be used for input in place of key input on a word processor.

In this case, a typical speech recognition device stores linguistic information at various levels, such as phonemes, words, and syntax, and recognizes phonemes by checking the similarity between the input voice and standard phoneme patterns. The device is provided with a so-called word dictionary (template), in which character strings in katakana, hiragana, kanji, and the like are stored and referenced for word recognition.
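The similarity check against standard patterns can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: the source does not say which features or which similarity measure the device uses, so the three-dimensional feature vectors, the cosine measure, and the names `cosine_similarity`, `recognize`, and `TEMPLATES` are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def recognize(features, templates):
    """Return the label of the stored pattern most similar to the input features."""
    return max(templates, key=lambda label: cosine_similarity(features, templates[label]))

# Toy standard patterns (made-up values, not taken from the source).
TEMPLATES = {
    "o": [1.0, 0.1, 0.0],
    "n": [0.0, 1.0, 0.2],
    "se": [0.1, 0.0, 1.0],
}
```

A call such as `recognize([0.9, 0.2, 0.1], TEMPLATES)` picks the label whose stored pattern lies closest in direction to the input vector.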

FIG. 4 shows an example of the main part of the conventional device described above.

In FIG. 4, 1 is a microphone for voice input, 2 is a keyboard for inputting data, commands, and the like, 3 is a speech recognition unit, 4 is a template switching unit, 5 is a monosyllable template for text data, 6 is a word template for commands, 7 is a text processing unit, and 8 is a display device (CRT).

In this configuration, to input the word 「音声」 (onsei, "voice"), the user utters the monosyllables "o", "n", "se", "i" (オ, ン, セ, イ) one at a time, as shown in FIG. 5, and they are input to the speech recognition unit 3 through the microphone 1. The speech recognition unit 3 recognizes "o", "n", "se", "i" by matching against the monosyllable template 5 and passes the result to the text processing unit 7. Then, as shown in FIG. 5, when the command key on the keyboard 2 is pressed (turned ON), the template switching unit 4 switches to the command word template 6. The input voice "henkan" (ヘンカン, "convert") is matched against it, and once it is recognized as an instruction for character conversion, a conversion instruction is sent to the text processing unit 7. As a result, the input data 「音声」 can be displayed on the display device 8.

Thus, conventionally, converting voice input required pressing the command key. Even though the data portion could be input by voice, control operations such as character conversion instructions were still performed with the command key, which made manual operation cumbersome.

[Means and operation for solving the problems]

The present invention provides a conversion method for a word processor that solves the above problems. Focusing on the difference between the utterance time of a monosyllable of voice data and the utterance time of a word used as a command, it switches between voice data and conversion commands by voice alone, without keyboard operation. To this end, the method is characterized by comprising a speech recognition unit that detects the time length of the input voice and compares it with a reference value. When the time length of the input voice exceeds the reference value, the input voice is recognized as a conversion command and a conversion such as character conversion or line conversion is performed; when the time length does not exceed the reference value, the input voice is recognized as text data and displayed.
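The comparison against the reference value can be sketched in a few lines. This is a minimal illustration, not the patented implementation: the source only speaks of "a reference value", so the 0.6-second default, the `Utterance` type, and the `classify` function are assumptions introduced here.

```python
from dataclasses import dataclass

# Assumed reference value in seconds; the source does not give a number here.
REFERENCE_SEC = 0.6

@dataclass
class Utterance:
    label: str           # hypothetical tag for the sound that was uttered
    duration_sec: float  # detected time length of the input voice

def classify(utt: Utterance, reference_sec: float = REFERENCE_SEC) -> str:
    """Return "command" when the time length exceeds the reference value, else "text"."""
    return "command" if utt.duration_sec > reference_sec else "text"
```

Under these assumptions a short utterance such as `Utterance("o", 0.23)` is treated as text data and a long one such as `Utterance("henkan", 1.0)` as a conversion command. An utterance exactly at the reference value counts as text data, following the wording "does not exceed".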

[Example]

FIG. 1 is a basic configuration diagram of the word processor conversion method according to the present invention. As is clear from the figure, the basic device configuration is the same as the conventional one, but in the present invention a voice time length signal S is sent from the speech recognition unit 3 to the template switching unit 4. The difference between the utterance time of a monosyllable and that of a word is detected to determine whether the input is voice data or a conversion command, and the template is switched accordingly. The keyboard 2 of FIG. 4 is still needed for other functions but is omitted from the figure.

A more detailed explanation follows with reference to FIG. 2. In a voice input word processor, to input the word 「音声」 (onsei), for example, the user utters "o", "n", "se", "i" separated into monosyllables. The speech recognition unit 3 matches them against the monosyllable template 5 and recognizes o, n, se, i. The utterance times in this case, shown as t1 to t4, can be regarded as almost the same for each monosyllable. In FIG. 2, the vertical axis is loudness and the horizontal axis is time.

Next, when "henkan" is uttered without breaks, its utterance time t is longer than that of a monosyllable. The speech recognition unit 3 detects this difference and sends the voice time length signal S to the template switching unit 4, which switches to the command word template 6. After matching recognizes the utterance as "conversion processing", a conversion command C is sent to notify the text processing unit 7.

In general, a monosyllable utterance lasts about 230 msec, while a word utterance lasts about 1 sec. However, speakers differ, and where a fixed value cannot be set, one conceivable approach is to treat as a command any voice that is uttered for a momentarily longer time, judged from the distribution of input voice time lengths. The reference can also be determined from the state of that distribution.

FIG. 3 is a control flowchart of the conversion method according to the present invention. In FIG. 3, voice is input to the speech recognition unit 3 through the microphone 1 (step 1). In the speech recognition unit 3, comparison means (not shown) for comparing the time length of the input voice determines whether the input is a monosyllable or a word. This determination is made, for example, by comparison with a reference time length: the input voice is judged to be a monosyllable when it is shorter than the reference time length and a word when it is longer, and the voice time length signal S is output (step 2). In the template switching unit 4, the monosyllable template 5 is selected in the case of a monosyllable (step 3). The speech recognition unit 3 then identifies the word with reference to the word dictionary (step 4), and the result passes through the text processing unit 7 (step 5) and is displayed (step 6).

In the case of a word, on the other hand, the template switching unit 4 selects the word template 6 (step 7). The speech recognition unit 3 recognizes the input as a control command with reference to the control command word dictionary (step 8), a control command generation unit (not shown) creates the conversion command C (step 9), and the text processing unit 7 performs character conversion (step 5), after which the result is displayed (step 6). Line conversion, not only character conversion, can likewise be triggered by discriminating the time length.
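The flow of steps 1 to 9 can be traced with a toy model. This is a sketch under stated assumptions rather than the device itself: the dictionary contents, the 600 msec reference, the `TextProcessor` class, and the `handle_utterance` function are all invented for illustration, and recognition is reduced to a dictionary lookup.

```python
MONOSYLLABLE_TEMPLATE = {"o": "オ", "n": "ン", "se": "セ", "i": "イ"}  # template 5 (toy contents)
COMMAND_TEMPLATE = {"henkan": "CONVERT"}                                # template 6 (toy contents)
KANA_TO_KANJI = {"オンセイ": "音声"}  # stand-in for the conversion dictionary

class TextProcessor:
    """Stand-in for the text processing unit 7 together with the display device 8."""

    def __init__(self):
        self.pending = ""  # kana entered since the last conversion
        self.display = ""  # text currently shown

    def add_text(self, kana):
        self.pending += kana
        self.display += kana

    def run_command(self, command):
        if command == "CONVERT" and self.pending:
            converted = KANA_TO_KANJI.get(self.pending, self.pending)
            self.display = self.display[: -len(self.pending)] + converted
            self.pending = ""

def handle_utterance(sound, duration_ms, processor, reference_ms=600):
    """Route one utterance by its time length and return the displayed text."""
    if duration_ms > reference_ms:                 # step 2: longer than reference, a word
        command = COMMAND_TEMPLATE.get(sound)      # steps 7 and 8: command template lookup
        if command is not None:
            processor.run_command(command)         # step 9, then step 5
    else:                                          # step 2: a monosyllable
        kana = MONOSYLLABLE_TEMPLATE.get(sound)    # steps 3 and 4: monosyllable lookup
        if kana is not None:
            processor.add_text(kana)               # step 5
    return processor.display                       # step 6
```

Feeding "o", "n", "se", "i" at 230 msec each builds up オンセイ on the display, and a following "henkan" at 1000 msec replaces it with 音声, with no key press involved.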

[Effect of the invention]

As explained above, according to the present invention, character conversion or line conversion in a voice input word processor can be performed by the time length of the voice rather than by a command key, which eliminates the cumbersomeness of manual operation.

[Brief explanation of the drawing]

FIG. 1 is a configuration diagram of the word processor conversion method according to the present invention, FIG. 2 is a diagram explaining character conversion, FIG. 3 is a flowchart showing the character conversion method according to the present invention, FIG. 4 is a configuration diagram of the conventional method, and FIG. 5 is a diagram explaining conventional character conversion.

(Explanation of reference numerals) 1: microphone; 2: keyboard; 3: speech recognition unit; 4: template switching unit; 5: monosyllable template; 6: word template; 7: text processing unit; 8: display device.

FIG. 1: Configuration diagram of the character conversion method according to the present invention. FIG. 2: Diagram explaining the present invention in detail. FIG. 3: Flowchart of the method according to the present invention.

Claims

[Claims]

1. A conversion method for a voice input word processor in which text data is input by voice as the text is read aloud, characterized in that, when the time length of the input voice exceeds a reference value, the input voice is recognized as a conversion command and a conversion such as character conversion or line conversion is performed, and when the time length does not exceed the reference value, the input voice is recognized as text data and displayed.