JPH06118989A - Continuous speech recognizing method - Google Patents

Continuous speech recognizing method

Info

Publication number
JPH06118989A
Authority
JP
Japan
Prior art keywords
sentence
unregistered word
standard pattern
word
unregistered
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP4287115A
Other languages
Japanese (ja)
Inventor
Kazuya Takeda
一哉 武田
Shingo Kuroiwa
眞吾 黒岩
Seiichi Yamamoto
誠一 山本
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
KDDI Corp
Original Assignee
Kokusai Denshin Denwa KK
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kokusai Denshin Denwa KK
Priority to JP4287115A
Publication of JPH06118989A
Legal status: Pending

Abstract

PURPOSE: To recognize continuous utterances that include unregistered words by using unregistered word patterns created with speech other than registered words as training data. CONSTITUTION: A sentence-head cutout processing unit 8 divides an uttered sentence input into three equal parts and cuts out the first part as the sentence head; a mid-sentence cutout processing unit 9 cuts out the section other than the sentence head and the sentence end; and a sentence-end cutout processing unit 10 divides the input into three equal parts and cuts out the final part as the sentence end. The result of each cutout is sent, in the same way as for registered words, to a standard pattern creation processing unit 3, which creates three kinds of unregistered word pattern and sends them to an unregistered word standard pattern storage unit 11. The unregistered word standard pattern storage unit 11 stores the three unregistered word standard patterns for the sentence head, the sentence middle, and the sentence end. Positional differences, such as hesitations being likely at the beginning of a sentence, can therefore be reflected.

Description

Detailed Description of the Invention

[0001]

Field of Industrial Application: The present invention relates to a method for recognizing continuously uttered speech in a spoken-language man-machine interface.

[0002] Spoken dialogue systems that do more than enable voice input and output, and that exchange complex information efficiently through human-like dialogue, are approaching practical use, supported for example by advanced speech modeling techniques made feasible by faster computers.

[0003] Developing such a system requires a speech input method that extracts sufficient information from the user's utterances.

[0004]

Prior Art: FIG. 1 is a block diagram of a hardware implementation of a method that uses Markov models as standard patterns, uses the probability that the Markov model outputs the input speech as the matching score, and uses a regular grammar (finite state automaton) as the syntactic constraint (Seiichi Nakagawa, "Speech Recognition by Probabilistic Models").

[0005] This method matches the input speech against the standard patterns for every word sequence that satisfies the given grammatical constraints, so that speech recognition and syntactic analysis proceed simultaneously.

[0006] To this end, grammatical constraints that describe, in the form of a finite state automaton, the positions at which each registered word may appear in a sentence are first stored in the grammar constraint storage unit 2.
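A grammar constraint of this kind can be sketched as a list of word-labelled arcs between numbered states. The example below is only an illustration: the vocabulary, the state numbers, and the names `ARCS` and `accepts` are assumptions and do not come from the patent figures.

```python
# Hypothetical grammar: each arc is (source_state, word, target_state).
# Vocabulary and state numbers are illustrative assumptions.
ARCS = [
    (0, "right", 1),
    (0, "left", 1),
    (1, "90", 2),
    (1, "45", 2),
    (2, "degrees", 3),
    (3, "turn", 4),
]
START = 0
FINAL = {4}


def accepts(words):
    """Return True if the word sequence traces a path from START to a final state."""
    states = {START}
    for word in words:
        # Follow every arc that leaves a currently reachable state with this word.
        states = {dst for (src, label, dst) in ARCS if src in states and label == word}
        if not states:
            return False
    return bool(states & FINAL)
```

With this toy grammar, `accepts(["right", "90", "degrees", "turn"])` holds, while any sequence containing a word that has no arc is rejected, which is the limitation the invention addresses.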

[0007] Next, to create a standard pattern for each registered word, the input speech corresponding to each registered word is analyzed as training data by the acoustic analysis processing unit 1, and the result 1a (training) is sent to the standard pattern creation processing unit 3. The standard patterns 3a created in sequence are stored in the word standard pattern storage unit 5.

[0008] The word standard pattern concatenation processing unit 4 concatenates the word standard patterns stored in the word standard pattern storage unit 5 according to the grammatical constraints stored in the grammar constraint storage unit 2, creates sentence standard patterns, and passes them to the sentence standard pattern storage unit 6 for storage.
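The concatenation step can be sketched as enumerating every word sequence the grammar allows and joining the per-word models in order. This is a minimal sketch assuming an acyclic grammar and word models represented as plain lists of state names; the grammar, the models, and the function names are hypothetical.

```python
def enumerate_sentences(arcs, start, finals):
    """Depth-first enumeration of every word sequence accepted by an acyclic grammar."""
    sentences = []

    def walk(state, prefix):
        if state in finals:
            sentences.append(list(prefix))
        for src, word, dst in arcs:
            if src == state:
                walk(dst, prefix + [word])

    walk(start, [])
    return sentences


def concatenate(word_models, sentence):
    """Join per-word state lists into one sentence-level state list."""
    states = []
    for word in sentence:
        states.extend(f"{word}/{s}" for s in word_models[word])
    return states


# Hypothetical toy grammar and word models (state names only, no probabilities).
TOY_ARCS = [(0, "right", 1), (0, "left", 1), (1, "turn", 2)]
TOY_MODELS = {"right": ["s0", "s1"], "left": ["s0"], "turn": ["s0", "s1", "s2"]}

SENTENCES = enumerate_sentences(TOY_ARCS, start=0, finals={2})
SENTENCE_PATTERNS = {" ".join(s): concatenate(TOY_MODELS, s) for s in SENTENCES}
```

A practical recognizer would usually search the grammar network directly rather than enumerate every sentence, since the number of sentences grows quickly with vocabulary size; the enumeration is used here only because it mirrors the description.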

[0009] At recognition time, when the user utters a sentence to be recognized, the acoustic analysis result 1c (recognition) is sent to the pattern matching processing unit 7, where it is matched against all the sentence standard patterns stored in the sentence standard pattern storage unit 6.

[0010] The matching distance used here is the probability that the Markov model serving as the standard pattern outputs the acoustic analysis result. The matching distance is a numerical value expressing the similarity between the input speech and a standard pattern.

[0011] The word sequence corresponding to the sentence standard pattern that gives the best matching result is therefore output as the recognition result. This makes possible a system that recognizes a large vocabulary of several thousand words in continuous speech from unspecified speakers.
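The matching step described above can be sketched as scoring the observation sequence under each sentence model and returning the best-scoring label. The text does not say how the output probability is computed; the sketch below assumes discrete observation symbols and the standard forward algorithm, and the two toy models are invented for illustration.

```python
import math


def forward_log_prob(obs, init, trans, emit):
    """Log probability that a discrete HMM outputs the observation sequence."""
    n = len(init)
    alpha = [init[i] * emit[i][obs[0]] for i in range(n)]
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][symbol]
            for j in range(n)
        ]
    total = sum(alpha)
    return math.log(total) if total > 0 else float("-inf")


def recognize(obs, sentence_models):
    """Return the label of the sentence model with the highest output probability."""
    return max(
        sentence_models,
        key=lambda name: forward_log_prob(obs, *sentence_models[name]),
    )


# Two hypothetical left-to-right models over the symbols "a" and "b".
TRANS = [[0.5, 0.5], [0.0, 1.0]]
MODELS = {
    "AB": ([1.0, 0.0], TRANS, [{"a": 0.9, "b": 0.1}, {"a": 0.1, "b": 0.9}]),
    "BA": ([1.0, 0.0], TRANS, [{"a": 0.1, "b": 0.9}, {"a": 0.9, "b": 0.1}]),
}
```

For long utterances a real implementation would work in the log domain or rescale `alpha` at each step to avoid numerical underflow.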

[0012]

Problems to Be Solved by the Invention: Because such a conventional method performs recognition by matching the input against the standard patterns of registered words, the user must speak using only registered words.

[0013] Moreover, when hesitations or misstatements occur during speech input, no standard pattern exists that matches them correctly, so a correct recognition result is difficult to obtain.

[0014] The malfunction of continuous speech recognition devices on speech input that contains such unregistered words or misstatements is the problem the present invention seeks to solve.

[0015]

Means for Solving the Problem: The present invention is a continuous speech recognition method that recognizes continuously uttered speech by continuously matching word standard patterns represented by Markov models against the input speech according to grammatical constraints, characterized in that the positions at which unregistered word patterns may appear in continuous speech are defined within the grammatical constraints, and continuous speech containing unregistered words is recognized using unregistered word patterns created with speech other than the registered words as training data.

[0016] The recognition method is further characterized in that three kinds of unregistered word pattern, for the beginning, the middle, and the end of a sentence, are created and matched against unregistered words.

[0017]

Embodiment: FIG. 2 shows an embodiment of the present invention in which unregistered word patterns, created using speech other than the registered words as training data, are used in the matching process against the input speech so that speech other than the registered words is recognized as unregistered words.

[0018] The grammar constraint storage unit 2, the standard pattern creation processing unit 3, the word standard pattern concatenation processing unit 4, the word standard pattern storage unit 5, the sentence standard pattern storage unit 6, and the pattern matching processing unit 7 perform the same processing as in the conventional method.

[0019] The sentence-head cutout processing unit 8, the mid-sentence cutout processing unit 9, the sentence-end cutout processing unit 10, and the unregistered word standard pattern storage unit 11 are newly added.

[0020] From the analysis result 1a of the speech input for creating word standard patterns, a word standard pattern is created for each registered word, as in the conventional method, and stored in the word standard pattern storage unit.

[0021] Next, when the unregistered word patterns are trained, the analysis result 1b of the speech input is divided into three sections, the sentence head, the sentence middle, and the sentence end, by the sentence-head cutout processing unit 8, the mid-sentence cutout processing unit 9, and the sentence-end cutout processing unit 10.

[0022] Specifically, the sentence-head cutout processing unit 8 divides the uttered sentence input into three equal parts and cuts out the first third as the sentence head. The mid-sentence cutout processing unit 9 cuts out the section of the uttered sentence other than the sentence head and the sentence end. The sentence-end cutout processing unit 10 divides the uttered sentence input into three equal parts and cuts out the last third as the sentence end.
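The three cutout operations amount to splitting the sequence of analysis frames at one third and two thirds of its length. A minimal sketch follows; the text does not specify how a frame count that is not divisible by three is handled, so the integer rounding used here is an assumption, as is the function name.

```python
def split_into_thirds(frames):
    """Split a frame sequence into sentence-head, mid-sentence, and sentence-end parts."""
    n = len(frames)
    head_end = n // 3          # boundary after the first third (rounded down)
    tail_start = (2 * n) // 3  # boundary before the last third (rounded down)
    return frames[:head_end], frames[head_end:tail_start], frames[tail_start:]


# Example with a hypothetical utterance of ten analysis frames.
HEAD, MIDDLE, TAIL = split_into_thirds(list(range(10)))
```

Each of the three parts would then be passed to the same pattern creation step that is used for registered words.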

[0023] The result of each cutout is sent to the standard pattern creation processing unit 3, as in the case of registered words; three kinds of unregistered word pattern are created and sent to the unregistered word standard pattern storage unit 11.

[0024] The unregistered word standard pattern storage unit 11 stores the three unregistered word standard patterns for the sentence head, the sentence middle, and the sentence end.

[0025] Dividing the training speech for the unregistered word patterns into sentence head, sentence middle, and sentence end, and creating a standard pattern for each, makes it possible to reflect positional differences, for example that hesitations such as 「あのー」 ("uh") and 「えーっと」 ("um") tend to appear at the beginning of a sentence.

[0026] The grammatical constraints are used to concatenate word standard patterns as in the conventional example, but in practicing the present invention the positions at which unregistered words may appear are indicated within the grammatical constraints, as shown in FIG. 3.

[0027] Next, the difference between this embodiment and the conventional method is described, taking as an example the input utterance 「えーっと、右に、90度回れよ。」 ("Um, turn 90 degrees to the right.").

[0028] In the conventional method, no standard patterns are prepared for 「えーっと」 ("um") or for the sentence-final 「よ」 in 「回れよ」, so an incorrect recognition result is likely to be output.

[0029] In this embodiment, the 「えーっと」 portion is matched with high probability against the sentence-head unregistered word pattern in state 0.

[0030] Likewise, the 「よ」 of 「回れよ」 is matched with high probability against the sentence-end unregistered word pattern in state 9.
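The effect of placing unregistered word slots in the grammar can be sketched at the symbol level. Only the use of state 0 for the sentence-head slot and state 9 for the sentence-end slot follows the text; every other state, the English stand-in vocabulary, and the names below are hypothetical, and the actual method matches acoustic patterns rather than word tokens.

```python
UNK_HEAD, UNK_TAIL = "<unk-head>", "<unk-tail>"

# Hypothetical grammar with unregistered word slots as self-loops at states 0 and 9.
ARCS = [
    (0, UNK_HEAD, 0),
    (0, "right", 1),
    (0, "left", 1),
    (1, "90", 2),
    (2, "degrees", 3),
    (3, "turn", 9),
    (9, UNK_TAIL, 9),
]
VOCAB = {label for _, label, _ in ARCS} - {UNK_HEAD, UNK_TAIL}


def parse(tokens, start=0, finals=frozenset({9})):
    """Return the arc labels along an accepting path, or None if no path exists."""
    paths = [(start, [])]
    for token in tokens:
        next_paths = []
        for state, labels in paths:
            for src, label, dst in ARCS:
                if src != state:
                    continue
                is_unk_arc = label in (UNK_HEAD, UNK_TAIL)
                # A registered word matches its own arc; anything else needs a slot.
                if label == token or (is_unk_arc and token not in VOCAB):
                    next_paths.append((dst, labels + [label]))
        paths = next_paths
    for state, labels in paths:
        if state in finals:
            return labels
    return None
```

Here `parse(["um", "right", "90", "degrees", "turn", "yo"])` absorbs "um" in the head slot and "yo" in the tail slot, while an unknown token in a position without a slot still causes rejection.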

[0031] As this embodiment shows, the present invention makes it possible to correctly recognize continuous speech into which unregistered words such as hesitations have been inserted.

[0032]

Effects of the Invention: The present invention makes it possible to realize a speech recognition device that recognizes not only speech input consisting solely of registered words but also natural, human-like continuous speech that contains unregistered words.

[0033] In addition, the recognition rate can be improved, fewer restrictions are placed on the user's speech so that free utterances can be accepted, and the range of applications can be expanded.

Brief Description of the Drawings

FIG. 1 is a block diagram showing a hardware implementation of the conventional method.

FIG. 2 is a block diagram showing a hardware implementation of an embodiment of the present invention.

FIG. 3 is a diagram showing the grammatical constraints, given as a finite state automaton, that are used in the embodiment of the present invention.

Explanation of Symbols

1 Acoustic analysis processing unit
2 Grammar constraint storage unit
3 Standard pattern creation processing unit
4 Word standard pattern concatenation processing unit
5 Word standard pattern storage unit
6 Sentence standard pattern storage unit
7 Pattern matching processing unit
8 Sentence-head cutout processing unit
9 Mid-sentence cutout processing unit
10 Sentence-end cutout processing unit
11 Unregistered word standard pattern storage unit

Claims (2)

1. A continuous speech recognition method that recognizes continuously uttered speech by continuously matching word standard patterns represented by Markov models against input speech according to grammatical constraints, characterized in that the positions at which unregistered word patterns may appear in continuous speech are defined within the grammatical constraints, and continuous speech containing unregistered words is recognized using the unregistered word patterns.
2. The continuous speech recognition method according to claim 1, characterized in that three kinds of pattern, for the beginning, the middle, and the end of a sentence, are created as the unregistered word patterns and matched against unregistered words.
JP4287115A 1992-10-02 1992-10-02 Continuous speech recognizing method Pending JPH06118989A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP4287115A JPH06118989A (en) 1992-10-02 1992-10-02 Continuous speech recognizing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP4287115A JPH06118989A (en) 1992-10-02 1992-10-02 Continuous speech recognizing method

Publications (1)

Publication Number Publication Date
JPH06118989A true JPH06118989A (en) 1994-04-28

Family

ID=17713259

Family Applications (1)

Application Number Title Priority Date Filing Date
JP4287115A Pending JPH06118989A (en) 1992-10-02 1992-10-02 Continuous speech recognizing method

Country Status (1)

Country Link
JP (1) JPH06118989A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008242059A (en) * 2007-03-27 2008-10-09 Mitsubishi Electric Corp Device for creating speech recognition dictionary, and speech recognition apparatus

Similar Documents

Publication Publication Date Title
US10074363B2 (en) Method and apparatus for keyword speech recognition
CN109410914B (en) Method for identifying Jiangxi dialect speech and dialect point
Scagliola Language models and search algorithms for real-time speech recognition
US5712957A (en) Locating and correcting erroneously recognized portions of utterances by rescoring based on two n-best lists
EP0965979B1 (en) Position manipulation in speech recognition
US7974843B2 (en) Operating method for an automated language recognizer intended for the speaker-independent language recognition of words in different languages and automated language recognizer
JPH05197389A (en) Voice recognition device
CA2680304A1 (en) Decoding-time prediction of non-verbalized tokens
US20020152068A1 (en) New language context dependent data labeling
CN108806691B (en) Voice recognition method and system
Chow et al. Speech understanding using a unification grammar
Moore Integration of speech with natural language understanding.
JP3058125B2 (en) Voice recognition device
JPH06118989A (en) Continuous speech recognizing method
JPH0340177A (en) Voice recognizing device
JP3277579B2 (en) Voice recognition method and apparatus
JPH0283593A (en) Noise adaptive speech recognizing device
JP3766111B2 (en) Voice recognition device
Downey et al. A decision tree approach to task-independent speech recognition
JP2000330588A (en) Method and system for processing speech dialogue and storage medium where program is stored
JP3039453B2 (en) Voice recognition device
JP2001188556A (en) Method and device for voice recognition
Chen et al. Application of allophonic and lexical constraints in continuous digit recognition
JPH08314490A (en) Word spotting type method and device for recognizing voice
JPS6229796B2 (en)

Legal Events

Date Code Title Description
A02 Decision of refusal

Free format text: JAPANESE INTERMEDIATE CODE: A02

Effective date: 19980922