JPS6329278B2 - - Google Patents

Info

Publication number
JPS6329278B2
Authority
JP
Japan
Prior art keywords
speaker
feature
voice
feature pattern
stored
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired
Application number
JP56079579A
Other languages
Japanese (ja)
Other versions
JPS57195297A (en)
Inventor
Kyoshi Tajima
Hiroki Oonishi
Masanori Myatake
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sanyo Electric Co Ltd
Original Assignee
Sanyo Electric Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sanyo Electric Co Ltd
Priority to JP56079579A
Publication of JPS57195297A
Publication of JPS6329278B2
Granted

Description

DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to a voice device, and more particularly to a voice device that registers speech in a registration mode and recognizes that registered speech in a subsequent recognition mode.

Voice devices that extract the features of speech and perform recognition by comparing the extracted features with the features of subsequently input speech have appeared and are coming into practical use.
Such devices are of two kinds: those that recognize the voice of a specific person, and those that recognize the voices of unspecified speakers. In terms of recognition time, recognition rate, and device configuration, at present almost all of them are of the former kind, intended for a specific speaker.

The present invention aims to improve the usability of such a voice device, and is described in detail below with reference to the drawing.

Reference numeral 1 denotes a microphone that converts speech into an electrical signal, and 2 denotes a feature extraction circuit that extracts the features of the speech electrical signal obtained from the microphone; it consists of a zero-cross detection circuit, a speech spectrum extraction circuit, a speech interval detection circuit, and the like. 3 is a mode changeover switch that switches between the speech registration mode and the speech recognition mode. 4 is a first feature pattern memory connected to the registration-mode side of the mode changeover switch 3 via a first input gate group 5; it stores the speech feature patterns extracted by the feature extraction circuit 2. 6 is a second feature pattern memory, likewise connected to the registration-mode side of the mode changeover switch 3 via a second input gate group 7; it also stores speech feature patterns extracted by the feature extraction circuit 2. Each of the feature pattern memories 4 and 6 has a capacity to store the speech feature patterns of, for example, about eight words, and the timings at which feature patterns are written to and read from the memories 4 and 6 are not the same for each but are offset from each other, for example T1 to T8 and T9 to T16. 8 is a buffer memory connected to the recognition-mode side of the mode changeover switch 3; it temporarily stores the feature pattern of unknown speech from the feature extraction circuit 2. 9 and 10 are first and second output gate groups provided on the output sides of the first and second feature pattern memories 4 and 6. 11 is a recognition circuit that compares the registered patterns obtained through the output gate groups 9 and 10 with the unknown speech pattern from the buffer memory 8 and performs recognition. 12 is a speaker selection switch that supplies switching signals to the input and output gate groups 5, 7, 9, and 10; it selects among three speakers: a first speaker A, a second speaker B, and a specific speaker Z.

The internal configuration of the input and output gate groups 5, 7, 9, and 10 is as follows. The first input and output gate groups 5 and 9 consist of OR gates 20 and 22, which take the logical OR of the first speaker A and specific speaker Z selections, and AND gates 21 and 23, which take the logical AND of that OR output and the speech feature pattern. The second input and output gate groups 7 and 10 consist of OR gates 24 and 26, which take the logical OR of the second speaker B and specific speaker Z selections, and AND gates 25 and 27, which take the logical AND of that OR output and the speech feature pattern.
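The gate logic described above can be sketched in software as follows. This is only an illustrative model, not the circuit itself: the function names `gate_enables` and `and_gate`, and the representation of a feature pattern as a list of bits, are assumptions made for the example.

```python
def gate_enables(selected: str) -> dict:
    """Model the OR gates: derive gate-group enables from the speaker selection.

    `selected` is "A", "B", or "Z", mirroring speaker selection switch 12.
    """
    a = selected == "A"
    b = selected == "B"
    z = selected == "Z"
    return {
        "first": a or z,   # OR gates 20 and 22 (gate groups 5 and 9)
        "second": b or z,  # OR gates 24 and 26 (gate groups 7 and 10)
    }


def and_gate(enable: bool, pattern_bits: list) -> list:
    """Model the AND gates: pass the pattern bits only when the group is enabled."""
    return [bit & int(enable) for bit in pattern_bits]
```

With this model, selecting Z enables both gate groups, while selecting A or B enables only one of them, so a disabled group outputs all zeros.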

First, the speech registration operation is described. When the mode changeover switch 3 is set to the registration-mode side and the first speaker A is selected as the registering speaker, the feature patterns of the first speaker's speech obtained via the microphone 1 and the feature extraction circuit 2 pass through the first input gate group 5 and are stored in the first feature pattern memory 4, up to a maximum of eight words. When the second speaker B is selected, the second input gate group 7 opens and the second speaker's speech feature patterns are stored in the second feature pattern memory 6, up to a maximum of eight words. When the specific speaker Z is selected with the speaker selection switch 12, both input gate groups 5 and 7 open and the specific speaker's feature patterns are fed to the first and second feature pattern memories 4 and 6. Because, as noted above, the timings at which feature patterns are written to the memories 4 and 6 are offset, 8 + 8, that is, a maximum of 16 words of feature patterns are stored across the two memories 4 and 6.
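The registration behavior can be modeled roughly as below. This is a sketch under stated assumptions: the class name `FeatureStore`, the word labels, and the list-based patterns are hypothetical, and the offset write timings (T1 to T8 and T9 to T16) are approximated by filling memory 4 first and memory 6 afterwards when Z is selected.

```python
WORDS_PER_MEMORY = 8  # "about eight words" per memory in the description


class FeatureStore:
    """Software stand-in for feature pattern memories 4 and 6."""

    def __init__(self):
        self.memory_4 = []  # first feature pattern memory
        self.memory_6 = []  # second feature pattern memory

    def register(self, speaker, label, pattern):
        """Store one word's feature pattern for speaker "A", "B", or "Z"."""
        if speaker == "A":
            target = self.memory_4
        elif speaker == "B":
            target = self.memory_6
        elif speaker == "Z":
            # Assumption: offset write timing is modeled as "memory 4 first,
            # then memory 6", giving Z up to 8 + 8 = 16 words in total.
            if len(self.memory_4) < WORDS_PER_MEMORY:
                target = self.memory_4
            else:
                target = self.memory_6
        else:
            raise ValueError("speaker must be 'A', 'B', or 'Z'")
        if len(target) >= WORDS_PER_MEMORY:
            raise OverflowError("feature pattern memory is full")
        target.append((speaker, label, pattern))
```

In this model a single speaker A or B is limited to eight words, whereas Z can register sixteen before the store reports that it is full.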

Next, the recognition operation using speech registered in this way is described. The mode changeover switch 3 is set to the recognition-mode side, and to perform recognition for the first speaker, the first speaker A is designated with the speaker selection switch 12. When the first speaker then utters unknown speech toward the microphone 1, its features are extracted by the feature extraction circuit 2 and temporarily stored in the buffer memory 8. The temporarily stored feature pattern of the unknown speech is then compared by the comparison recognition circuit 11 with the feature patterns of the registered speech held in the first feature pattern memory 4, and the unknown feature pattern is identified as one of the registered patterns. When the second speaker B is selected, the feature pattern of the second speaker B's unknown speech is compared with the patterns registered in the second feature pattern memory 6 and recognized by the comparison recognition circuit 11.
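The comparison step can be illustrated with a simple nearest-template match. The description does not state which similarity measure the comparison recognition circuit 11 uses, so the squared Euclidean distance here is an assumption, as are the function name `recognize` and the example word labels.

```python
def recognize(unknown, registered):
    """Return the label of the registered pattern closest to `unknown`.

    `registered` is a list of (label, pattern) pairs, standing in for the
    contents of one feature pattern memory; `unknown` stands in for the
    pattern held in buffer memory 8.
    """
    def distance(p, q):
        # Assumed similarity measure: squared Euclidean distance.
        return sum((x - y) ** 2 for x, y in zip(p, q))

    best_label, _ = min(registered, key=lambda item: distance(unknown, item[1]))
    return best_label


# Hypothetical registered words for speaker A (memory 4).
memory_4 = [("start", [1.0, 0.0]), ("stop", [0.0, 1.0])]
result = recognize([0.9, 0.2], memory_4)
```

Here the unknown pattern is matched to whichever registered word lies nearest, which corresponds to identifying it as one of the registered patterns.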

Next, consider the case in which the specific speaker Z is selected with the selection switch 12, which is the most characteristic feature of the present invention. If, in the speech registration mode, the speech of the first and second speakers has been registered in the first and second feature pattern memories 4 and 6 respectively, and the specific speaker Z is then selected, the unknown speech temporarily stored in the buffer memory 8 is compared by the comparison recognition circuit 11 with the registered patterns of both the first and second speakers, and the most similar speech pattern among those registered patterns is selected to recognize the unknown speech. That is, in this state the speech of two speakers is recognized.

On the other hand, if, in the registration mode, the specific speaker's (8 + 8) words have been registered in both feature pattern memories 4 and 6 and the specific speaker is then selected with the selection switch 12, the content of the buffer memory 8 is compared by the comparison recognition circuit 11 with the specific speaker's 16 words held in both feature pattern memories 4 and 6, and the unknown speech is recognized by selecting from among those 16 words.
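Both uses of the Z selection reduce to searching the two memories together. The following sketch assumes list-based memories whose entries are hypothetical (owner, label, pattern) tuples, and again assumes a squared Euclidean distance, which the description does not specify.

```python
def select_memories(selected, memory_4, memory_6):
    """Return the candidate patterns passed by the output gate groups 9 and 10."""
    if selected == "A":
        return list(memory_4)
    if selected == "B":
        return list(memory_6)
    if selected == "Z":
        # Both output gate groups open: search both memories together.
        return list(memory_4) + list(memory_6)
    raise ValueError("selected must be 'A', 'B', or 'Z'")


def recognize(unknown, candidates):
    """Return (owner, label) of the candidate pattern closest to `unknown`."""
    def distance(p, q):
        # Assumed similarity measure: squared Euclidean distance.
        return sum((x - y) ** 2 for x, y in zip(p, q))

    owner, label, _ = min(candidates, key=lambda c: distance(unknown, c[2]))
    return owner, label


# Hypothetical contents: one word registered by A, one by B.
memory_4 = [("A", "on", [1.0, 0.0])]
memory_6 = [("B", "off", [0.0, 1.0])]
both = recognize([0.1, 0.8], select_memories("Z", memory_4, memory_6))
only_a = recognize([0.1, 0.8], select_memories("A", memory_4, memory_6))
```

The same lookup serves either case: if the memories hold words from A and B, selecting Z recognizes either speaker; if they hold 16 words registered by Z, selecting Z searches all 16.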

As is clear from the above description, the present invention performs recognition separately for the speech of each of the first and second speakers. When the specific speaker is selected for recognition with the speech of both speakers registered, the unknown speech is recognized against the speech of both speakers. When the specific speaker's speech has been registered and that specific speaker is selected for recognition, twice the usual number of utterances can be recognized. The device can therefore handle many forms of speech recognition and its versatility as a speech recognition device increases, while the added hardware burden is slight, so that a highly practical voice device is obtained.

BRIEF DESCRIPTION OF THE DRAWING

The figure is a block diagram showing the configuration of a voice device that implements the speech recognition processing method of the present invention, in which 2 is the feature extraction circuit, 4 and 6 are the feature pattern memories, 8 is the buffer memory, 11 is the comparison recognition circuit, and 12 is the speaker selection switch.

Claims (1)

[Claims]
1. A speech recognition processing method in which speech is registered in a registration mode and the registered speech is recognized in a subsequent recognition mode, the method using:
a microphone that converts speech into an electrical signal;
a feature extraction circuit that extracts the features of the speech electrical signal from the microphone;
first and second feature pattern memories, each capable of storing the feature patterns of n words (n being an integer of 1 or more) extracted by the feature extraction circuit;
a buffer memory that temporarily stores the feature pattern of input speech obtained from the feature extraction circuit;
a recognition circuit that compares the content of the buffer memory with the content of the first and second feature pattern memories and performs recognition;
mode switching means for switching between the speech registration mode and the speech recognition mode; and
speaker selection means for selecting a first speaker, a second speaker, or a specific speaker,
wherein, in the registration mode, the feature patterns of n words of the first speaker's speech are stored in the first feature pattern memory, the feature patterns of n words of the second speaker's speech are stored in the second feature pattern memory, and the feature patterns of 2n words of the specific speaker are stored in both the first and second feature pattern memories;
in the recognition mode, when the first speaker is selected, the recognition circuit compares the content introduced into the buffer memory with the content stored in the first feature pattern memory and performs recognition, and when the second speaker is selected, the recognition circuit compares the content introduced into the buffer memory with the content stored in the second feature pattern memory and performs recognition;
when the specific speaker is selected in the recognition mode with the speech feature patterns of both the first and second speakers stored in the first and second feature pattern memories, the content of the buffer memory is compared with the content of both feature pattern memories to recognize the speech of the two speakers; and
when the specific speaker is selected in the recognition mode with feature patterns of 2n words stored in both feature memories under the specific speaker selection, the content of the buffer memory is compared with the content of both feature memories to perform speech recognition of 2n words.
JP56079579A 1981-05-25 1981-05-25 Voice unit Granted JPS57195297A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP56079579A JPS57195297A (en) 1981-05-25 1981-05-25 Voice unit

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP56079579A JPS57195297A (en) 1981-05-25 1981-05-25 Voice unit

Publications (2)

Publication Number Publication Date
JPS57195297A (en) 1982-11-30
JPS6329278B2 (en) 1988-06-13

Family

ID=13693888

Family Applications (1)

Application Number Title Priority Date Filing Date
JP56079579A Granted JPS57195297A (en) 1981-05-25 1981-05-25 Voice unit

Country Status (1)

Country Link
JP (1) JPS57195297A (en)

Also Published As

Publication number Publication date
JPS57195297A (en) 1982-11-30

Similar Documents

Publication Publication Date Title
JPH0361959B2 (en)
JPS6329278B2 (en)
GB981154A (en) Improved phonetic typewriter system
JPS6312312B2 (en)
JPS5855520B2 (en) Renzokuonseininshikisouchi
JPS6135494A (en) Voice recognition processor
JPS63205698A (en) Pattern identifier
JPS6332596A (en) Voice recognition equipment
JPH0262879B2 (en)
JPS61281298A (en) Voice recognition equipment
JPH06309443A (en) Individual recognizing system combining finger print and voice
US20030046084A1 (en) Method and apparatus for providing location-specific responses in an automated voice response system
JPH01158499A (en) Standing noise eliminaton system
KR200234902Y1 (en) System of Voice Recognition
JPS58179899A (en) Pattern matching apparatus
JPS5934597A (en) Voice recognition processor
JPS63125998A (en) Voice input/output unit
JPS61121090A (en) Voice recognition equipment
JPS62159200A (en) Word voice recognition equipment for specified speaker
JPS6231900A (en) Voice recognition equipment
JPH0535440A (en) Automatic document preparing device
JPS6070497A (en) Voice recognition equipment
JPH04307599A (en) Word voice recognizer
JPH0535441A (en) Automatic document preparing device
KR20020073825A (en) System of Voice Recognition