JPS6048099A

JPS6048099A - Voice recognition equipment

Info

Publication number: JPS6048099A
Application number: JP58156599A
Authority: JP
Inventors: 相良　良二; 楠原　久代; 裕一谷口
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1983-08-26
Filing date: 1983-08-26
Publication date: 1985-03-15

Abstract

(57) [Abstract] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

Detailed Description of the Invention

Field of Industrial Application

The present invention relates to a speech recognition device that recognizes input speech using pre-registered standard patterns of word and monosyllable speech.

Configuration of the Prior Art and Its Problems

In recent years, speech has attracted attention as an input means for human-machine systems, and various speech recognition devices have been commercialized. Some of these devices perform recognition in units of words and others in units of monosyllables. A device that recognizes monosyllables in particular places no limit on the number of words, so arbitrary text can be input. If frequently used words such as digits can also be recognized in the same way as monosyllables, inputting text or data becomes easier still.

For this reason, speech recognition devices that recognize both words and monosyllables have also been commercialized.

A conventional speech recognition device that recognizes words and monosyllables is described below with reference to FIG. 1. In the figure, 1 is an input section consisting of a microphone or the like that converts speech into an electrical signal; 2 is a feature extraction section that extracts speech features from the electrical signal from the input section 1; 3 is an A/D conversion section that digitizes the speech features extracted by the feature extraction section 2; 4 is an input pattern storage section that temporarily stores the speech features digitized by the A/D conversion section 3 as an input pattern; 5 is a word standard pattern storage section in which the speech features of a plurality of words to be recognized are stored in advance as standard patterns; 6 is a monosyllable standard pattern storage section in which the speech features of a plurality of monosyllables to be recognized are stored in advance as standard patterns; and 7 is a mode changeover switch that connects the A/D conversion section 3 to the standard pattern storage section 5 or 6 or to the input pattern storage section 4. The switch 7 selects between a registration mode, in which speech features are stored in advance in the standard pattern storage sections 5 and 6, and a recognition mode, in which the features of the input speech are stored in the input pattern storage section 4.

Further, 8 is a speech length detection section that detects the length of the input speech from the input pattern and decides whether the input pattern is to be compared with the word standard patterns or with the monosyllable standard patterns; 9 is a gate controlled by the speech length detection section 8; 10 is a word recognition processing section that compares the input pattern stored in the input pattern storage section 4 with the plurality of word standard patterns stored in the word standard pattern storage section 5 and recognizes the input speech as a specific word; 11 is a monosyllable recognition processing section that compares the input pattern with the plurality of monosyllable standard patterns stored in the monosyllable standard pattern storage section 6 and recognizes the input speech as a specific monosyllable; and 12 is an output section that outputs a signal corresponding to the word or monosyllable recognized by the recognition processing sections 10 and 11.

The operation of the speech recognition device configured as above is described concretely below. First, in the word registration mode, the mode changeover switch 7 is connected to the A side, and words are registered by uttering the word corresponding to each of the various output signals output from the output section 12 and inputting it to the input section 1. That is, the input speech is converted into an electrical signal by the input section 1, the speech features are extracted from this electrical signal by the feature extraction section 2, these features are digitized by the A/D conversion section 3, and the digitized speech features are registered in the word standard pattern storage section 5.

In the monosyllable registration mode, the switch 7 is connected to the C side, and monosyllables are registered in the same way as words by uttering the monosyllable corresponding to each of the various output signals output from the output section 12 and inputting it to the input section 1.

Next, in the recognition mode, the mode changeover switch 7 is connected to the B side. When a desired word or monosyllable from among those registered in the two registration modes is input to the input section 1, the output section 12 outputs a signal corresponding to the standard pattern that one of the recognition processing sections 10 and 11 has judged most similar to the input pattern. That is, the speech input to the input section 1 is converted into a digitized feature pattern by the feature extraction section 2 and the A/D conversion section 3 and is stored temporarily in the input pattern storage section 4, after which the speech length detection section 8 checks whether the input speech length exceeds a certain threshold. If the input speech is longer than the threshold, the input pattern is sent through the gate 9, which is controlled by the speech length detection section 8, to the word recognition processing section 10, where it is compared with the plurality of standard patterns stored in the word standard pattern storage section 5, and a signal corresponding to the most similar word is output. If the input speech is shorter than the threshold, the input pattern is sent through the gate 9 to the monosyllable recognition processing section 11, where it is compared with the plurality of standard patterns stored in the monosyllable standard pattern storage section 6, and a signal corresponding to the most similar monosyllable is output.
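The length-gated dispatch described above can be sketched as follows. This is a minimal illustration rather than the patent's circuitry: the feature vectors, the frame-count threshold, and the names `Template`, `nearest`, and `recognize_by_length` are all assumptions introduced here, and squared Euclidean distance stands in for whatever similarity measure the recognition processing sections actually use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    """A registered standard pattern (hypothetical representation)."""
    label: str
    features: tuple


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(pattern, templates):
    """Return the label of the standard pattern most similar to the input."""
    return min(templates, key=lambda t: sq_dist(pattern, t.features)).label


def recognize_by_length(pattern, n_frames, words, syllables, threshold_frames):
    """Route the input by utterance length, as detector 8 and gate 9 do."""
    if n_frames > threshold_frames:
        return "word", nearest(pattern, words)          # section 10
    return "monosyllable", nearest(pattern, syllables)  # section 11


# Made-up templates: the digit 2 ("ni") is a one-syllable word.
WORDS = [Template("2", (1.0, 0.0)), Template("3", (0.0, 1.0))]
SYLLABLES = [Template("ni", (1.0, 0.1)), Template("sa", (0.1, 1.0))]

# The same pattern is routed differently depending only on its length.
short = recognize_by_length((1.0, 0.0), 8, WORDS, SYLLABLES, threshold_frames=10)
long_ = recognize_by_length((1.0, 0.0), 20, WORDS, SYLLABLES, threshold_frames=10)
```

With these made-up values, the short utterance is labeled a monosyllable and the long one a word, even though the feature pattern is identical.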

However, a speech recognition device configured as above must provide a word recognition processing section for recognizing words and a separate monosyllable recognition processing section for recognizing monosyllables, and must process words and monosyllables separately. It therefore has the drawback that a larger device and more complex processing cannot be avoided. Moreover, because the device decides between monosyllable and word by the length of the speech, it is prone to errors such as judging a one-syllable word, for example the digit 2 ("ni") or 5 ("go"), to be a monosyllable, or judging a monosyllable uttered at length to be a word.

Object of the Invention

In view of the above drawbacks, an object of the present invention is to provide a speech recognition device that recognizes words characterized by their beginnings with the same processing as monosyllables, thereby making the device smaller, simplifying the processing, and preventing misrecognition.

Configuration of the Invention

The present invention is a speech recognition device comprising: a word standard pattern storage section that stores, as standard patterns, the features of the beginnings of words having mutually different initial syllables; a switch means that selects whether the standard patterns to be compared with the input pattern are received from the word standard pattern storage section or from the monosyllable standard pattern storage section; and a recognition processing means that performs recognition by comparing the input pattern with each standard pattern sent through the switch means. When speech is input, the switch means is used to select whether the recognition target is a word or a monosyllable. Words and monosyllables can thus be recognized by the same processing, and confusion between words and monosyllables is avoided at the same time.

Description of the Embodiments

Embodiments of the present invention are described below with reference to the drawings.

FIG. 2 is a block diagram of a speech recognition device according to one embodiment of the present invention; the device can recognize 10 digits and 68 syllables.

In the figure, 1 is an input section consisting of a microphone or the like that converts speech into an electrical signal; 2 is a feature extraction section that extracts speech features from the electrical signal from the input section 1; 3 is an A/D conversion section that digitizes the speech features extracted by the feature extraction section; 4 is an input pattern storage section that temporarily stores the speech features digitized by the A/D conversion section 3 as an input pattern; 5 is a word standard pattern storage section in which the speech features of the 10 digits 1, 2, 3, 4, 5, 6, 7, 8, 9, and 0 are stored as standard patterns; 6 is a monosyllable standard pattern storage section in which the speech features of 68 monosyllables are stored as standard patterns; 7 is a mode changeover switch that connects the A/D conversion section 3 to the standard pattern storage section 5 or 6 or to the input pattern storage section 4; 11 is a monosyllable recognition processing section that performs recognition by comparing the input pattern with either the word standard patterns or the monosyllable standard patterns; 12 is an output section that outputs a signal corresponding to the monosyllable or digit recognized by the monosyllable recognition processing section 11; and 13 is a switch that selects whether the input pattern is compared with the word standard patterns or with the monosyllable standard patterns.

The operation of the speech recognition device configured as above is described below with reference to FIG. 2.

First, in the word registration mode, the mode changeover switch 7 is connected to the A side, and the digits are registered by uttering the digit corresponding to each signal output from the output section 12 and inputting the digits to the input section 1 one after another. That is, the input speech is converted into an electrical signal by the input section 1; the features of the first syllable of the digit, for example "i" for 1 and "ni" for 2, are extracted from this electrical signal by the feature extraction section 2; these syllable features are digitized by the A/D conversion section 3; and the features of the digit's first syllable are registered in the word standard pattern storage section 5.
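The registration steps above can be sketched as follows. The text does not say how the first syllable is isolated or how many bits the A/D conversion uses, so the fixed count of leading frames (`head_frames`), the 8-bit quantization, and the function names are assumptions made purely for illustration.

```python
def quantize(value, levels=256, lo=0.0, hi=1.0):
    """Map an analog feature value in [lo, hi] to an integer code (assumed 8-bit)."""
    clipped = min(max(value, lo), hi)
    return round((clipped - lo) / (hi - lo) * (levels - 1))


def register(store, label, frames, head_frames=2):
    """Store the digitized features of the leading frames under `label`.

    `frames` is a list of per-frame feature vectors for one utterance; only
    the first `head_frames` frames (standing in for the first syllable) are kept.
    """
    head = frames[:head_frames]
    store[label] = [[quantize(v) for v in frame] for frame in head]
    return store


# Hypothetical utterance of the digit 1 ("ichi"): three frames, two features each.
word_store = {}
register(word_store, "1", [[0.0, 1.0], [0.2, 0.8], [0.9, 0.1]])
```

Only the leading frames reach the store, so the registered template has the same shape as a monosyllable template regardless of how long the spoken digit is.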

In the monosyllable registration mode, the mode changeover switch 7 is connected to the C side, and the monosyllables are registered by uttering the monosyllable corresponding to each signal output from the output section 12 and inputting the monosyllables to the input section 1 one after another. That is, the input speech is converted into an electrical signal by the input section 1, the features of the monosyllable are extracted by the feature extraction section 2, these features are digitized by the A/D conversion section 3, and the features of the monosyllable are registered in the monosyllable standard pattern storage section 6.

Next, in the recognition mode, the mode changeover switch 7 is connected to the D side. With the switch 13 connected to the D side, when a desired digit from among the 10 digits registered in the word registration mode is input to the input section 1, the output section 12 outputs a signal corresponding to the digit that the monosyllable recognition processing section 11 has judged most similar to the input speech. With the switch 13 connected to the E side, when a desired monosyllable from among the 68 syllables registered in the monosyllable registration mode is input to the input section 1, the output section 12 outputs a signal corresponding to the monosyllable that the monosyllable recognition processing section 11 has judged most similar to the input speech.

That is, when a digit is to be input, the switch 13 is connected to the D side. The speech input to the input section 1 is converted into a digitized feature pattern by the feature extraction section 2 and the A/D conversion section 3 and is stored temporarily in the input pattern storage section 4. The monosyllable recognition processing section 11 then compares this input pattern with the standard patterns, extracted from the initial syllables of the digits, that are stored in the word standard pattern storage section 5. The initial syllables of the digits are the ten syllables "i" (1), "ni" (2), "sa" (3), "yo" (4), "go" (5), "ro" (6), "na" (7), "ha" (8), "ki" (9), and "ze" (0). These can be recognized adequately by the same recognition processing as monosyllables, and a signal corresponding to the digit recognized by the monosyllable recognition processing section 11 is output from the output section 12. When a monosyllable is to be input, the switch 13 is connected to the E side, the monosyllable recognition processing section 11 compares the input pattern with each standard pattern stored in the monosyllable standard pattern storage section 6, and a signal corresponding to the monosyllable most similar to the input is output.
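The recognition flow above, with a single recognition section fed by whichever store the switch 13 selects, can be sketched like this. The two-dimensional feature vectors are invented, squared Euclidean distance is an assumed stand-in for the unspecified similarity measure, and only three of the ten digit templates are shown.

```python
from enum import Enum


class Switch13(Enum):
    D = "digits"         # compare with word standard pattern store 5
    E = "monosyllables"  # compare with monosyllable standard pattern store 6


# Digit store: each template holds the features of the digit's first syllable.
DIGIT_STORE = {"1": (0.9, 0.1), "2": (0.1, 0.9), "3": (0.5, 0.5)}  # i, ni, sa
SYLLABLE_STORE = {"i": (0.9, 0.1), "ni": (0.1, 0.9), "sa": (0.5, 0.5),
                  "ka": (0.0, 0.0)}


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def recognize(pattern, switch):
    """Single recognition routine (section 11); the switch only picks the store."""
    store = DIGIT_STORE if switch is Switch13.D else SYLLABLE_STORE
    return min(store, key=lambda label: sq_dist(pattern, store[label]))


pattern = (0.15, 0.85)                        # hypothetical input pattern near "ni"
as_digit = recognize(pattern, Switch13.D)     # "2"
as_syllable = recognize(pattern, Switch13.E)  # "ni"
```

The comparison code is identical in both cases; only the candidate set changes, which is the point of replacing the length detector with an explicit switch.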

As described above, according to this embodiment, the switch 13 is provided so that the standard patterns extracted from the initial syllables of the 10 digits and the standard patterns of the 68 syllables are sent separately to the monosyllable recognition processing section 11. The 10 digits, which have mutually different initial syllables, and the 68 syllables can thus be recognized by the same monosyllable recognition processing section, and the device can be simplified.

A second embodiment of the present invention is described below with reference to the drawings.

FIG. 3 is a block diagram of a speaker-independent speech recognition device according to the second embodiment of the present invention.

In the figure, 1 is an input section, 2 is a feature extraction section, 3 is an A/D conversion section, 4 is an input pattern storage section, 11 is a monosyllable recognition processing section, and 12 is an output section; these are the same as in the configuration of FIG. 2. The configuration differs from that of FIG. 2 in the following respects. A digit standard pattern storage section 14 is provided, in which speaker-independent standard patterns extracted from the first syllables of the 10 digits are stored in advance. A command standard pattern storage section 15 is provided, in which speaker-independent standard patterns extracted from the first syllables of commands used in a text input device, such as "tensou", "henkan", "sakujo", "masshou", "torikeshi", "kaigyou", and "supeesu", are stored in advance. A monosyllable standard pattern storage section 16 is provided, in which speaker-independent standard patterns extracted from the 68 syllables are stored in advance. Finally, a switch 17 is provided to select whether the input pattern is compared with the digit, command, or monosyllable standard patterns.

The operation of the speech recognition device of the second embodiment configured as above is described below.

First, when a digit is to be input, the switch 17 is connected to the D side, and the desired one of the 10 digits is uttered and input to the input section 1. This speech is converted into a digitized feature pattern by the feature extraction section 2 and the A/D conversion section 3 and is stored temporarily in the input pattern storage section 4. The monosyllable recognition processing section 11 then compares it with each standard pattern in the digit standard pattern storage section 14 ("i", "ni", "sa", "yo", "go", "ro", "na", "ha", "ki", "ze"), and a signal corresponding to the most similar digit is output by the output section 12.

Similarly, when a command used in the text input device is to be input, the switch 17 is connected to the E side, and the desired command is uttered and input to the input section 1. The monosyllable recognition processing section 11 compares the input pattern with each standard pattern in the command standard pattern storage section 15, and the recognition result is output by the output section 12. That is, the features of the first monosyllable at the beginning of the input command are extracted and compared with the standard patterns extracted from the beginning of each command, and the output section 12 outputs a signal corresponding to the command whose initial syllable is most similar to that of the input.

Next, when a monosyllable is to be input, the switch 17 is connected to the F side, and the desired one of the monosyllables stored in advance in the monosyllable standard pattern storage section 16 is uttered and input to the input section 1. In exactly the same way as in the first embodiment, the output section 12 then outputs a signal corresponding to the monosyllable most similar to the input pattern.

As described above, according to this embodiment, the switch 17 is provided so that the standard patterns extracted from the initial syllables of the 10 digits, the standard patterns extracted from the initial syllables of the commands used in the text input device, and the patterns of the 68 syllables are sent separately to the monosyllable recognition processing section 11. The 10 digits with mutually different initial syllables, the commands with mutually different initial syllables, and the 68 syllables can thus all be recognized by the same monosyllable recognition processing section. The three storage sections can also be given the same structure, and the device can be simplified.
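One way to picture the uniform store structure mentioned above: each of the three stores maps an initial syllable to the output it stands for, so one lookup routine serves all of them. The sketch below works on syllable labels rather than acoustic patterns, which is a simplification. The romanized command readings follow the examples in the text, while the English glosses in the comments and the names `STORES` and `resolve` are assumptions, and the 68-syllable store is abbreviated.

```python
# Each store maps the first syllable to the item that is output.
DIGITS = {"i": "1", "ni": "2", "sa": "3", "yo": "4", "go": "5",
          "ro": "6", "na": "7", "ha": "8", "ki": "9", "ze": "0"}
COMMANDS = {
    "te": "tensou",     # assumed gloss: transfer
    "he": "henkan",     # assumed gloss: convert
    "sa": "sakujo",     # assumed gloss: delete
    "ma": "masshou",    # assumed gloss: erase
    "to": "torikeshi",  # assumed gloss: cancel
    "ka": "kaigyou",    # assumed gloss: new line
    "su": "supeesu",    # assumed gloss: space
}
SYLLABLES = {s: s for s in ("i", "ni", "sa", "te", "ka")}  # abbreviated

# Switch 17 positions D, E, F select the store; the lookup itself is shared.
STORES = {"D": DIGITS, "E": COMMANDS, "F": SYLLABLES}


def resolve(first_syllable, switch_position):
    """Return the output for a recognized first syllable, or None if absent."""
    return STORES[switch_position].get(first_syllable)


outputs = [resolve("sa", pos) for pos in ("D", "E", "F")]
```

Here the syllable "sa" yields the digit 3, the command "sakujo", or the plain syllable depending only on the switch position.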

In the first and second embodiments, digits and commands were given as recognition targets, but any group of words whose members have mutually different initial syllables may be used.
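The condition stated above, that every word in a group starts with a different syllable, is easy to check when assembling a vocabulary. The sketch below assumes each word is given as a list of romanized syllables split by kana; the function name and the data layout are illustrative only.

```python
from collections import defaultdict


def initial_conflicts(vocabulary):
    """Return {initial syllable: [words]} for initials shared by 2+ words.

    `vocabulary` maps each word to its list of syllables. An empty result
    means the group satisfies the distinct-initial-syllable condition.
    """
    by_initial = defaultdict(list)
    for word, syllables in vocabulary.items():
        by_initial[syllables[0]].append(word)
    return {s: ws for s, ws in by_initial.items() if len(ws) > 1}


# Digit readings split by kana, following the text's "ki" for 9.
digits = {
    "ichi": ["i", "chi"], "ni": ["ni"], "san": ["sa", "n"], "yon": ["yo", "n"],
    "go": ["go"], "roku": ["ro", "ku"], "nana": ["na", "na"],
    "hachi": ["ha", "chi"], "kyuu": ["ki", "yu", "u"], "zero": ["ze", "ro"],
}

ok = initial_conflicts(digits)
mixed = initial_conflicts({**digits, "sakujo": ["sa", "ku", "jo"]})
```

Mixing digits and commands in a single store would make "san" and "sakujo" collide on "sa", which is consistent with the second embodiment keeping them in separate stores.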

Also, although the A/D conversion section 3 is placed after the feature extraction section 2, it goes without saying that it may instead be placed before the feature extraction section 2, so that the electrical signal is digitized first and the features are extracted from the digitized signal.

Effects of the Invention

The speech recognition device of the present invention is provided with a switch that selects a standard pattern storage section, and narrows down the recognition candidates by dividing them into groups of words with mutually different initial syllables. It can therefore recognize both monosyllables and words with a single monosyllable recognition process. The structure of the word and monosyllable dictionaries can also be made uniform, which greatly simplifies both the device and the processing.

[Brief explanation of drawings]

FIG. 1 is a block diagram of a conventional speech recognition device, FIG. 2 is a block diagram of the speech recognition device according to the first embodiment of the present invention, and FIG. 3 is a block diagram of the speech recognition device according to the second embodiment of the present invention.

13: switch; 14: digit standard pattern storage section; 15: command standard pattern storage section; 16: monosyllable standard pattern storage section; 17: switch.

Claims

[Claims]

A speech recognition device comprising: an input means for converting input speech into an electrical signal; a feature extraction means for extracting the features of the monosyllable at the beginning of a word from the electrical signal; an input pattern storage means for temporarily storing the speech features extracted by the feature extraction means as an input pattern; at least one word standard pattern storage means for storing standard patterns of words; a monosyllable standard pattern storage means for storing, as standard patterns, the features extracted by the feature extraction means from monosyllable speech input in advance; a switch means for selecting one of the standard pattern storage means; a recognition processing means for comparing the standard patterns stored in the standard pattern storage means selected by the switch means with the input pattern and recognizing the input pattern as a specific standard pattern; and an output means for outputting a signal corresponding to the specific standard pattern recognized by the recognition processing means.