KR940015968A - Speech duration modeling method of speech recognizer - Google Patents

Speech duration modeling method of speech recognizer

Info

Publication number
KR940015968A
KR940015968A KR1019920023405A KR920023405A KR940015968A KR 940015968 A KR940015968 A KR 940015968A KR 1019920023405 A KR1019920023405 A KR 1019920023405A KR 920023405 A KR920023405 A KR 920023405A KR 940015968 A KR940015968 A KR 940015968A
Authority
KR
South Korea
Prior art keywords
speech
probability
model
max
duration
Prior art date
Application number
KR1019920023405A
Other languages
Korean (ko)
Other versions
KR950010020B1 (en)
Inventor
김민성
Original Assignee
이헌조
주식회사 금성사
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 이헌조, 주식회사 금성사
Priority to KR1019920023405A
Publication of KR940015968A
Application granted
Publication of KR950010020B1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/14 Speech classification or search using statistical models, e.g. Hidden Markov Models [HMMs]
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Probability & Statistics with Applications (AREA)
  • Artificial Intelligence (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The present invention relates to a speaker-independent speech recognizer. In particular, it relates to a speech duration modeling method for such a recognizer: to reduce recognition errors, speech duration is modeled in a way that is independent of variation between speakers, and the modeled duration information is then used to improve the recognition rate.

The invention models speech duration probabilistically over n consecutive states. First, it decides whether two state sequences should be combined, in order to find the states over which speech duration can be modeled. Then, for the states selected in this way, it computes the duration probability from the probability distribution, given the state sequence of the input speech.
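The second stage described above (scoring the durations found in a state sequence) can be sketched roughly as follows. The abstract does not say what form the duration distribution takes or how states are combined, so the Gaussian duration model, the parameter values, and the function names `state_durations` and `log_duration_prob` are all assumptions made purely for illustration.

```python
import math
from itertools import groupby


def state_durations(state_seq):
    """Run-length encode a state sequence into (state, frame_count) pairs."""
    return [(state, sum(1 for _ in run)) for state, run in groupby(state_seq)]


def log_duration_prob(state_seq, duration_params):
    """Sum of log Gaussian densities over the occupancy of each state run.

    duration_params maps state -> (mean_frames, std_frames).
    The Gaussian form is an assumption, not taken from the patent.
    """
    total = 0.0
    for state, frames in state_durations(state_seq):
        mean, std = duration_params[state]
        z = (frames - mean) / std
        total += -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))
    return total


# Hypothetical state sequence and duration parameters.
seq = [0, 0, 0, 1, 1, 2, 2, 2, 2]
params = {0: (3.0, 1.0), 1: (2.0, 1.0), 2: (4.0, 1.5)}
print(state_durations(seq))
print(log_duration_prob(seq, params))
```

A state sequence whose run lengths sit close to the modeled means scores higher than one with atypical run lengths, which is the property a recognizer would exploit when ranking candidate models.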

Description

Speech duration modeling method of speech recognizer

This publication discloses only the essential parts of the application, so the full text is not included.

FIG. 1 is a state block diagram of the probability distributions, showing the structure of a speech recognition system that uses a general hidden Markov model (HMM). FIG. 2 is an explanatory diagram showing, for speech pronounced by different speakers, the time spent in each state and the corresponding states. FIG. 3 is a control flowchart showing the speech duration modeling method of the speech recognizer according to the present invention.

Claims (1)

A speech duration modeling method of a speech recognizer, characterized by being configured to perform:
    a step (STEP1) of extracting a feature vector (O = O_1 O_2 … O) when speech is input;
    a step (STEP2) of obtaining, from the extracted feature vector (O) and the distribution of each model, the probability of the feature vector (O) and the state sequence (i_r) corresponding to that probability;
    a step (STEP3) of then obtaining, from the sequence (i_r), the probability (P_d) of the durations between the states;
    a step (STEP4) of obtaining the total probability value (P_r) of the model (r) for the input from the duration probability (P_d) and the probability of the feature vector (O);
    a step (STEP5) of, if the total probability value (P_r) is greater than the maximum value (P_max) (P_max < P_r), replacing the maximum value (P_max) with the total probability value (P_r) (P_max ← P_r) and replacing the model (R) with the model (r) (R ← r);
    a step (STEP6) of, after STEP5 has been performed or when the total probability value (P_r) is smaller than the maximum value (P_max) (P_max > P_r), incrementing the model (r) (r ← r+1) and returning to the step of obtaining the probability and state sequence for the feature vector (STEP2), repeating this until the model (r) reaches the mutual information value (M); and
    a step (STEP7) of outputting the model (R) as the recognized word once the above process has finished for all models.

※ Note: This disclosure is based on the application as originally filed.
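The loop in STEP2 through STEP7 can be sketched as below. This is only a minimal illustration under several assumptions that the claim does not state: Viterbi decoding is used to obtain the state sequence, the observations are already discrete symbols (so STEP1 is skipped), the total score is the sum of the two log probabilities, and the duration score is a placeholder penalty. The names `viterbi`, `duration_score`, `recognize`, `make_model`, and `MODELS` are hypothetical.

```python
import math
from itertools import groupby


def viterbi(obs, log_start, log_trans, log_emit):
    """Return (best log-probability, best state sequence) for a discrete HMM."""
    n = len(log_start)
    score = [log_start[s] + log_emit[s][obs[0]] for s in range(n)]
    back = []
    for o in obs[1:]:
        prev, score, ptr = score, [], []
        for s in range(n):
            best = max(range(n), key=lambda p: prev[p] + log_trans[p][s])
            ptr.append(best)
            score.append(prev[best] + log_trans[best][s] + log_emit[s][o])
        back.append(ptr)
    last = max(range(n), key=lambda s: score[s])
    path = [last]
    for ptr in reversed(back):          # follow back-pointers to time 0
        path.append(ptr[path[-1]])
    return score[last], path[::-1]


def duration_score(state_seq, expected_frames):
    """Placeholder log-score: penalize each run by its distance from the expected length."""
    return sum(-abs(sum(1 for _ in run) - expected_frames[state])
               for state, run in groupby(state_seq))


def recognize(obs, models):
    """Pick the model R with the largest combined score P_r."""
    best_name, p_max = None, -math.inf
    for name, m in models.items():                      # STEP6: move to next model r
        p_obs, seq = viterbi(obs, m["start"], m["trans"], m["emit"])  # STEP2
        p_d = duration_score(seq, m["dur"])              # STEP3
        p_r = p_obs + p_d                                # STEP4 (assumed: sum of logs)
        if p_r > p_max:                                  # STEP5: P_max <- P_r, R <- r
            p_max, best_name = p_r, name
    return best_name, p_max                              # STEP7


def make_model(start, trans, emit, dur):
    """Convert probability tables to log domain (all entries must be > 0)."""
    logs = lambda rows: [[math.log(p) for p in row] for row in rows]
    return {"start": [math.log(p) for p in start],
            "trans": logs(trans), "emit": logs(emit), "dur": dur}


# Two hypothetical two-state word models over the symbol alphabet {0, 1}.
TRANS = [[0.7, 0.3], [0.1, 0.9]]
MODELS = {
    "A": make_model([0.9, 0.1], TRANS, [[0.9, 0.1], [0.1, 0.9]], {0: 3, 1: 3}),
    "B": make_model([0.9, 0.1], TRANS, [[0.1, 0.9], [0.9, 0.1]], {0: 3, 1: 3}),
}
print(recognize([0, 0, 0, 1, 1, 1], MODELS)[0])
```

Keeping the running maximum inside the loop mirrors the claim's P_max and R bookkeeping; an equivalent formulation would score every model first and take the argmax afterwards.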

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
KR1019920023405A | 1992-12-05 | 1992-12-05 | Voice continuating time modeling method of voice recognizer (KR950010020B1, en)

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
KR1019920023405A | 1992-12-05 | 1992-12-05 | Voice continuating time modeling method of voice recognizer (KR950010020B1, en)

Publications (2)

Publication Number | Publication Date
KR940015968A (en) | 1994-07-22
KR950010020B1 (en) | 1995-09-04

Family

ID=19344812

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
KR1019920023405A | Voice continuating time modeling method of voice recognizer (KR950010020B1, en) | 1992-12-05 | 1992-12-05

Country Status (1)

Country | Link
KR (1) | KR950010020B1 (en)

Also Published As

Publication number | Publication date
KR950010020B1 (en) | 1995-09-04

Similar Documents

Publication Publication Date Title
US5268990A (en) Method for recognizing speech using linguistically-motivated hidden Markov models
Bahl et al. A maximum likelihood approach to continuous speech recognition
JP3049259B2 (en) Voice recognition method
US5241619A (en) Word dependent N-best search method
JP2964507B2 (en) HMM device
CN107680597A (en) Audio recognition method, device, equipment and computer-readable recording medium
CN104978963A (en) Speech recognition apparatus, method and electronic equipment
JP2012037619A (en) Speaker-adaptation device, speaker-adaptation method and program for speaker-adaptation
CN103337241B (en) Voice recognition method and device
EP0903730B1 (en) Search and rescoring method for a speech recognition system
US20040019483A1 (en) Method of speech recognition using time-dependent interpolation and hidden dynamic value classes
JP2002358097A (en) Voice recognition device
JPS5852696A (en) Voice recognition unit
CN111554270A (en) Training sample screening method and electronic equipment
JPH0296800A (en) Continuous voice recognizing device
JP2013182261A (en) Adaptation device, voice recognition device and program
KR940015968A (en) Speech duration modeling method of speech recognizer
JP2002215184A (en) Speech recognition device and program for the same
CN117456999B (en) Audio identification method, audio identification device, vehicle, computer device, and medium
JP3316352B2 (en) Voice recognition method
JP3532248B2 (en) Speech recognition device using learning speech pattern model
JP3144341B2 (en) Voice recognition device
JP3368989B2 (en) Voice recognition method
JPH08314490A (en) Word spotting type method and device for recognizing voice
JP2003022091A (en) Method, device, and program for voice recognition

Legal Events

Date Code Title Description
A201 Request for examination
G160 Decision to publish patent application
E701 Decision to grant or registration of patent right
GRNT Written decision to grant
LAPS Lapse due to unpaid annual fee