KR940015968A

KR940015968A - Speech duration modeling method of speech recognizer

Info

Publication number: KR940015968A
Application number: KR1019920023405A
Authority: KR
Inventors: 김민성
Original assignee: 이헌조; 주식회사 금성사
Priority date: 1992-12-05
Filing date: 1992-12-05
Publication date: 1994-07-22
Also published as: KR950010020B1

Abstract

The present invention relates to a speaker-independent speech recognizer, and in particular to a speech duration modeling method for such a recognizer. To reduce recognition errors, the method models speech durations that do not depend on variation between speakers and uses the modeled duration information to improve the recognition rate.

The invention models speech duration probabilistically over n consecutive states. First, it decides whether two state sequences should be combined, and in this way finds the states over which speech duration can be modeled. Then, for the states selected in this way, it computes the duration probability from their probability distributions, given the state sequence of the input speech.
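The second stage described above (scoring the durations observed in a state sequence) can be sketched as follows. This is only an illustration under stated assumptions: the abstract gives neither the criterion for combining states nor the form of the duration distribution, so the `groups` mapping is supplied by hand, the distribution is a simple histogram of run lengths, and all function names and the `floor` value are hypothetical.

```python
import math
from itertools import groupby


def state_durations(state_seq):
    """Collapse a frame-level sequence into (label, number_of_frames) runs."""
    return [(s, sum(1 for _ in run)) for s, run in groupby(state_seq)]


def train_duration_model(training_seqs, groups):
    """Count how often each state group lasted d frames in training alignments.

    groups maps state -> group id; states that share a group id are treated
    as combined (assumption: the grouping is given, not derived here).
    """
    counts = {}
    for seq in training_seqs:
        merged = [groups[s] for s in seq]
        for g, d in state_durations(merged):
            hist = counts.setdefault(g, {})
            hist[d] = hist.get(d, 0) + 1
    return counts


def duration_log_prob(state_seq, groups, counts, floor=1e-3):
    """Log duration probability of a state sequence under the histograms.

    Unseen durations are floored (assumption) so the logarithm stays finite.
    """
    merged = [groups[s] for s in state_seq]
    logp = 0.0
    for g, d in state_durations(merged):
        hist = counts.get(g, {})
        total = sum(hist.values())
        p = hist.get(d, 0) / total if total else 0.0
        logp += math.log(max(p, floor))
    return logp
```

For example, with `groups = {0: 0, 1: 0, 2: 1}` states 0 and 1 are scored as one combined unit, so only their joint duration matters, not how a particular speaker splits the time between them.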

Description

Speech duration modeling method of speech recognizer

This publication discloses only the essential parts, so the full text is not included.

FIG. 1 is a state block diagram of the probability distributions, showing the structure of a speech recognition system that uses a general hidden Markov model (HMM). FIG. 2 is an explanatory diagram showing, for speech uttered by different speakers, the time spent in each state and the corresponding states. FIG. 3 is a control flowchart showing the speech duration modeling method of a speech recognizer according to the present invention.

Claims

A speech duration modeling method of a speech recognizer, comprising the steps of:
extracting a feature vector (O = O₁ O₂ ... O) when speech is input (STEP1);
obtaining, from the extracted feature vector (O) and the distributions of each model, the probability for the feature vector and the state sequence i^r corresponding to that probability (STEP2);
obtaining the probability (P_d) of the durations between the states from the state sequence (i^r) (STEP3);
obtaining, from the duration probability (P_d) and the probability for the feature vector (O), the total probability value P_r of model r for the input (STEP4);
if the total probability value P_r is greater than the maximum value P_max (P_max < P_r), replacing the maximum value with the total probability value (P_max ← P_r) and replacing the model R with the model r (R ← r) (STEP5);
after STEP5, or when the total probability value P_r is smaller than the maximum value P_max (P_max > P_r), incrementing the model index (r ← r + 1) and returning to STEP2 to obtain the probability and state sequence for the feature vector, until the model r reaches the mutual information value M (STEP6); and
when the process has been completed for all models, outputting the model (R) as the recognized word (STEP7).
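The loop in STEP2 to STEP7 can be sketched as below. This is a minimal illustration, not the patented implementation. It assumes that STEP1 has already turned the input into a sequence of discrete symbols, that the state sequence of STEP2 comes from Viterbi decoding, and that STEP4 combines the two scores by adding log-probabilities. The claim only states that the total is obtained from both values, so the sum and the `weight` parameter are assumptions, as are all function and parameter names.

```python
def viterbi(obs, log_init, log_trans, log_emit):
    """Return (best log-probability, best state sequence) for a discrete HMM."""
    n = len(log_init)
    score = [log_init[s] + log_emit[s][obs[0]] for s in range(n)]
    back = []
    for o in obs[1:]:
        prev = score
        # Best predecessor of every state j at this frame.
        ptr = [max(range(n), key=lambda i, j=j: prev[i] + log_trans[i][j])
               for j in range(n)]
        score = [prev[ptr[j]] + log_trans[ptr[j]][j] + log_emit[j][o]
                 for j in range(n)]
        back.append(ptr)
    last = max(range(n), key=lambda s: score[s])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return score[last], path


def recognize(obs, models, duration_log_prob, weight=1.0):
    """Pick the model with the highest combined score.

    models: word -> (log_init, log_trans, log_emit).
    duration_log_prob(word, states): caller-supplied duration score (STEP3).
    """
    best_word, p_max = None, float("-inf")
    for word, (log_init, log_trans, log_emit) in models.items():  # STEP6 loop
        p_obs, states = viterbi(obs, log_init, log_trans, log_emit)  # STEP2
        p_d = duration_log_prob(word, states)                       # STEP3
        p_r = p_obs + weight * p_d                                  # STEP4
        if p_r > p_max:                                             # STEP5
            p_max, best_word = p_r, word
    return best_word, p_max                                         # STEP7
```

Passing the duration scorer as a callable keeps the acoustic decoding separate from the duration model, so the same loop runs with or without duration information (for example with `lambda word, states: 0.0`).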

※ Note: The disclosure is based on the initial application.