JPS58189694A - Voice recognition system - Google Patents
Voice recognition system
Info
- Publication number
- JPS58189694A
- Authority
- JP
- Japan
- Prior art keywords
- recognition
- standard
- state
- speech
- continuous
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
(57) [Abstract] This publication contains application data from before electronic filing was introduced, so no abstract data is recorded.
Description
DETAILED DESCRIPTION OF THE INVENTION
The present invention relates to a speech recognition method, and in particular to an improvement in a method for recognizing continuous word speech uttered by unspecified speakers.
Conventionally, the following methods have been proposed for recognizing speech uttered by unspecified speakers: a learning method that examines the features of the input speech and transforms the standard patterns to match those features; a normalization method that, conversely, transforms the input speech to match the standard patterns; a multi-standard method that estimates in advance the range over which speech varies from speaker to speaker and places a large number of standard patterns within that range; and a discriminant function method combined with suitable preprocessing. Of these, the methods that currently achieve a practical level of recognition performance are the multi-standard method and the discriminant function method. Furthermore, when extension to continuous word recognition is considered, the multi-standard method can be said to be almost the only realistic approach.
In continuous word recognition, however, the number of possible word-sequence combinations means that the processing required for recognition increases greatly, even when techniques such as the two-stage DP method or the continuous DP method are used. Consequently, if speaker-independent continuous word recognition is performed with the multi-standard method unchanged, the pattern matching unit and related components become very large, making the approach economically unrealistic.
The object of the present invention is to remedy these problems.
The invention takes note of the fact that, within any one usage session, the speaker providing input to a speech recognition device cannot change during use, for example from male to female or from adult to child.
Specifically, in the present invention the states of the recognition device are divided into a state for recognizing discretely uttered speech, which requires relatively little recognition processing (state 1), and a state for recognizing continuously uttered speech, which requires a large amount of processing (state 2). The input speech is first recognized in state 1 to narrow down the characteristics of the speaker, and recognition in state 2 is then performed using only the set of standard patterns that share those characteristics, thereby reducing the amount of processing in state 2.
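The two-state idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not the device described in the patent: the template layout, the function names, the toy two-element feature vectors, and the squared-distance measure (a stand-in for DP matching) are all hypothetical, and continuous speech is assumed to be already segmented into words.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]
# Hypothetical layout: TEMPLATES[group][word] -> list of standard patterns.
Templates = Dict[str, Dict[str, List[Vector]]]

# Toy data (assumption): one pattern per word and only two speaker groups.
TEMPLATES: Templates = {
    "male": {"yes": [[1.0, 1.0]], "no": [[1.0, 5.0]],
             "1": [[1.0, 2.0]], "2": [[1.0, 3.0]]},
    "female": {"yes": [[4.0, 1.0]], "no": [[4.0, 5.0]],
               "1": [[4.0, 2.0]], "2": [[4.0, 3.0]]},
}


def distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance; a simple stand-in for DP matching."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def recognize_state1(x: Vector, templates: Templates,
                     vocab: Sequence[str]) -> Tuple[str, str]:
    """State 1: compare against every group's patterns; return (word, group)."""
    best = min(
        (distance(x, ref), word, group)
        for group, words in templates.items()
        for word in vocab
        for ref in words[word]
    )
    return best[1], best[2]


def recognize_state2(segments: Sequence[Vector], templates: Templates,
                     vocab: Sequence[str], group: str) -> List[str]:
    """State 2: compare each segment against the selected group's patterns only."""
    result = []
    for x in segments:
        best = min(
            (distance(x, ref), word)
            for word in vocab
            for ref in templates[group][word]
        )
        result.append(best[1])
    return result


# State 1 fixes the speaker group from a discrete "yes"/"no" answer.
word, group = recognize_state1([1.1, 1.0], TEMPLATES, ["yes", "no"])
# State 2 then searches only that group's digit patterns.
digits = recognize_state2([[1.0, 2.1], [0.9, 3.0]], TEMPLATES, ["1", "2"], group)
```

The point of the sketch is that `recognize_state2` never touches the patterns of groups other than the one chosen in state 1, which is where the saving in processing comes from.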
The present invention is described below on the basis of an embodiment.
Fig. 1 shows an example of the configuration of a telephone information service system to which the present invention is applied. It consists of a system control unit 1, a speech recognition unit 2 according to the present invention, a voice response unit 3, a hybrid coil 4, and a subscriber telephone 5; only the parts of the telephone information service system needed to explain the present invention are shown.
Fig. 2 shows a configuration example of a speech recognition device used to explain the present invention. In Fig. 2, the control unit 21 exchanges commands and results 27 with the system control unit 1 of Fig. 1 and also controls the speech recognition unit 2. For input speech analyzed by the analysis unit 22, the similarity calculation unit 23 calculates the similarity to the standard pattern data in the standard pattern memory 24, and the continuous pattern matching unit 25 calculates the optimal matching value against each standard pattern. The result is evaluated by the decision unit 26, and the decision result is sent to the control unit 21. The configuration of a recognition device that performs continuous pattern matching is already known (see Japanese Unexamined Patent Publication No. Sho 55-2205), so its description is omitted. In this example device, the input pattern is always compared against the designated standard patterns. If the input is known in advance to be a discrete utterance, the decision unit 26 simply treats the output of the matching unit as a discretely uttered word; for continuous word input, it evaluates the output as a sequence of continuous words. The operation of the continuous pattern matching unit 25 is the same in both cases. The following description applies equally to a device in which the operation of the matching unit 25 is switched between discrete and continuous utterances (see, for example, Japanese Patent Application No. Sho 55-158296).
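The patent leaves the details of the matching computation to the cited publications. As a rough illustration of what a DP-based matching value looks like, the following is a generic dynamic time warping sketch. It is an assumption made for illustration, not the matching unit 25 itself: the function name is hypothetical, features are reduced to single numbers per frame, and the local cost is a plain absolute difference.

```python
from typing import Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Generic dynamic time warping cost between two scalar sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Allow a step in a, a step in b, or a step in both.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]
```

A lower value means a better match. Because the alignment may stretch either sequence, an input spoken more slowly than the standard pattern (for example with a repeated frame) can still match at zero cost.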
Suppose that the registered vocabulary consists of the words "yes" and "no" and the digits 0 to 9. Suppose also that, to allow for differences between speakers, five standard patterns are registered for each of the groups male, female, and child, that is, 3 × 5 = 15 patterns for each word and digit. Taking a bank balance inquiry as an example and returning to Fig. 1: when a call from a user comes into the system, the voice response unit 3 first asks the user "Is this a balance inquiry?", while the speech recognition unit 2, on a command from the system control unit 1, waits for input in the mode (state 1) in which the two words "yes" and "no" are recognized as discrete input. When the user answers "yes" or "no", the recognition unit 2 only has to compare the input against 15 standard patterns for each of the two words, 30 in total. If the best-matching standard pattern belongs to the male group (or the female or child group), the control unit 21 issues a control command so that from then on, in state 2 (the continuous word recognition state), only the digit standard patterns belonging to the male (or female, or child) group are used. In the next step, the voice response unit 3 outputs the spoken prompt "Please say your PIN" to the user, and the recognition unit 2 enters state 2, in which continuous digit recognition is possible.
When the user speaks a PIN such as "1234", the recognition unit 2 only has to compare the input against the 10 × 5 = 50 digit standard patterns belonging to the male (or female, or child) group. The matching capacity of the recognition unit 2 therefore needs to cover at most 50 standard patterns. By contrast, if recognition were performed without fixing the group in state 1, comparison against 10 × 15 = 150 standard patterns would be required.
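The pattern counts in this example can be checked with a few lines of arithmetic. The constant and function names below are hypothetical; the numbers (three speaker groups, five patterns per group, two yes/no words, ten digits) are taken from the example in the text.

```python
GROUPS = ("male", "female", "child")  # speaker groups from the example
PATTERNS_PER_GROUP = 5                # standard patterns per word per group
YES_NO_WORDS = 2                      # "yes", "no"
DIGIT_WORDS = 10                      # digits 0 to 9


def patterns_to_match(num_words: int, num_groups: int) -> int:
    """Number of standard patterns the matching unit must compare against."""
    return num_words * num_groups * PATTERNS_PER_GROUP


state1 = patterns_to_match(YES_NO_WORDS, len(GROUPS))              # 2 x 15
state2_restricted = patterns_to_match(DIGIT_WORDS, 1)              # 10 x 5
state2_unrestricted = patterns_to_match(DIGIT_WORDS, len(GROUPS))  # 10 x 15
```

With these figures, restricting state 2 to one group cuts the digit patterns to be matched to one third of the unrestricted case, at the cost of 30 comparisons in the lighter discrete-word state.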
As described above, the present invention makes it possible to realize economically a system that recognizes speech uttered continuously by unspecified speakers, which is a significant benefit.
Fig. 1 shows an example of the configuration of a telephone information service system to which the present invention is applied, and Fig. 2 shows the block configuration of a speech recognition device according to the present invention.
Claims (1)
A speech recognition method having a first recognition state, in which discretely uttered speech is recognized, and a second recognition state, in which continuously uttered speech is recognized, using multiple sets of standard patterns prepared, for speech patterns having the same meaning, for each of multiple groups of speakers with differing characteristics, characterized in that the characteristics of the input speech are recognized in the first recognition state, and the set of standard patterns to be used in the second recognition state is restricted on the basis of the recognized characteristics.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP57071225A JPS58189694A (en) | 1982-04-30 | 1982-04-30 | Voice recognition system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP57071225A JPS58189694A (en) | 1982-04-30 | 1982-04-30 | Voice recognition system |
Publications (2)
Publication Number | Publication Date |
---|---|
JPS58189694A (en) | 1983-11-05 |
JPH0421880B2 (en) | 1992-04-14 |
Family
ID=13454518
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
JP57071225A Granted JPS58189694A (en) | 1982-04-30 | 1982-04-30 | Voice recognition system |
Country Status (1)
Country | Link |
---|---|
JP (1) | JPS58189694A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH01290000A (en) * | 1988-05-17 | 1989-11-21 | Sharp Corp | Voice recognition device |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS493507A (en) * | 1972-04-19 | 1974-01-12 | ||
JPS56119199A (en) * | 1980-02-26 | 1981-09-18 | Sanyo Electric Co | Voice identifying device |
-
1982
- 1982-04-30 JP JP57071225A patent/JPS58189694A/en active Granted
Also Published As
Publication number | Publication date |
---|---|
JPH0421880B2 (en) | 1992-04-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP0647344B1 (en) | Method for recognizing alphanumeric strings spoken over a telephone network | |
US5895448A (en) | Methods and apparatus for generating and using speaker independent garbage models for speaker dependent speech recognition purpose | |
US5842165A (en) | Methods and apparatus for generating and using garbage models for speaker dependent speech recognition purposes | |
JP3968133B2 (en) | Speech recognition dialogue processing method and speech recognition dialogue apparatus | |
US6076054A (en) | Methods and apparatus for generating and using out of vocabulary word models for speaker dependent speech recognition | |
US5127043A (en) | Simultaneous speaker-independent voice recognition and verification over a telephone network | |
US5125022A (en) | Method for recognizing alphanumeric strings spoken over a telephone network | |
US5517558A (en) | Voice-controlled account access over a telephone network | |
US5365574A (en) | Telephone network voice recognition and verification using selectively-adjustable signal thresholds | |
CA2189011C (en) | Method for reducing database requirements for speech recognition systems | |
US20010056345A1 (en) | Method and system for speech recognition of the alphabet | |
CA1239478A (en) | Method and apparatus for use in interactive dialogue | |
JPS58189694A (en) | Voice recognition system | |
JP3919314B2 (en) | Speaker recognition apparatus and method | |
JP2980382B2 (en) | Speaker adaptive speech recognition method and apparatus | |
JPH10116093A (en) | Voice recognition device | |
JPH01179198A (en) | Rejecting system | |
KR20200134868A (en) | Speech synthesis device and speech synthesis method | |
JPS5934596A (en) | Voice recognition processing system | |
JPS6348599A (en) | Voice recognition response system | |
JPH01197795A (en) | Voice recognizing device | |
JPH04199199A (en) | Speech recognition device | |
JPS60172099A (en) | Speaker recognition equipment | |
JPS60241097A (en) | Voice recognition applying equipment | |
JPH053596B2 (en) |