KR970002856A - Speech Recognition Using Linear Predictive Analysis Synthesis - Google Patents


Info

Publication number
KR970002856A
Authority
KR
South Korea
Prior art keywords
speech
recognition
characteristic
speech recognition
recognition method
Prior art date
Application number
KR1019950018111A
Other languages
Korean (ko)
Other versions
KR100322693B1
Inventor
공병구
김상룡
Original Assignee
김광호
삼성전자 주식회사
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 김광호, 삼성전자 주식회사
Priority to KR1019950018111A
Publication of KR970002856A
Application granted
Publication of KR100322693B1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/065 Adaptation
    • G10L15/07 Adaptation to the speaker
    • G10L15/075 Adaptation to the speaker supervised, i.e. under machine guidance
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/08 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
    • G10L19/12 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Telephonic Communication Services (AREA)
  • Image Analysis (AREA)

Abstract

An improved speech recognition method is disclosed in which the features extracted from an input speech signal are mapped onto standard characteristics prepared in advance, and speech recognition is performed on the speech synthesized from the mapped characteristics.

The speech recognition method according to the present invention comprises: removing speaker-dependent characteristics from the input speech signal and mapping them onto the characteristics of a standard speaker; generating synthesized speech based on the mapped characteristics and re-extracting recognition features from the generated synthesized speech; and recognizing the speech by comparing the recognition features extracted from the synthesized speech with the recognition features of a standard pattern.

The speech recognition method according to the present invention improves the recognition rate by removing utterance-to-utterance variation within a speaker and characteristic deviations between speakers.
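The abstract's core idea, replacing each speaker's own characteristics with those of the nearest pre-prepared standard speaker, can be sketched as below. The representative pitch values and function names are illustrative assumptions, not taken from the patent.

```python
# Illustrative sketch of mapping an utterance's pitch onto a standard
# (representative) speaker, in the spirit of the abstract. The representative
# pitch values below are assumed for illustration only.

REPRESENTATIVE_PITCH_HZ = {"man": 120.0, "woman": 210.0, "child": 300.0}

def map_to_standard_pitch(frame_pitches_hz):
    """Map per-frame pitch values onto the nearest representative speaker.

    The utterance's mean pitch selects the representative speaker; every
    voiced frame is then scaled by a single ratio, so the relative
    intonation contour of the input is preserved.
    """
    voiced = [p for p in frame_pitches_hz if p > 0]  # 0.0 marks unvoiced frames
    mean_pitch = sum(voiced) / len(voiced)
    # Representative speaker whose pitch differs least from the mean pitch.
    speaker = min(REPRESENTATIVE_PITCH_HZ,
                  key=lambda s: abs(REPRESENTATIVE_PITCH_HZ[s] - mean_pitch))
    ratio = REPRESENTATIVE_PITCH_HZ[speaker] / mean_pitch
    mapped = [p * ratio if p > 0 else 0.0 for p in frame_pitches_hz]
    return speaker, mapped
```

A uniform scaling of this kind removes the speaker's absolute pitch while leaving the pitch movement of the utterance intact for recognition.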

Description

Speech Recognition Method Using Linear Predictive Analysis-Synthesis

Since this publication discloses only the essential parts of the application, the full specification is not included.

FIG. 2 is a diagram showing the speech recognition process according to the present invention. FIG. 3 is a block diagram showing an embodiment of an apparatus for performing the speech synthesis process shown in FIG. 2.
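The recognition flow of FIG. 2 — analyze, map to a standard speaker, resynthesize, re-extract, compare — can be outlined as a skeleton. The processing stages are passed in as functions because they are hypothetical stand-ins for the patent's components, not real APIs.

```python
# Skeleton of the recognition flow of FIG. 2. Only the order of operations
# follows the document; the stage implementations are supplied by the caller.

def recognize(speech_signal, standard_patterns, extract_features,
              map_to_standard_speaker, synthesize, distance):
    features = extract_features(speech_signal)    # pitch, gain, vocal-tract coefficients
    mapped = map_to_standard_speaker(features)    # remove speaker-dependent traits
    synthetic = synthesize(mapped)                # analysis-synthesis resynthesis
    recog_features = extract_features(synthetic)  # re-extract from the synthesized speech
    # Recognize as the standard pattern whose features are nearest.
    return min(standard_patterns,
               key=lambda word: distance(recog_features, standard_patterns[word]))
```

The distinctive step is the middle one: recognition features are taken not from the raw input but from speech resynthesized after speaker normalization.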

Claims (7)

1. A method of recognizing speech by extracting recognition characteristics from an input speech signal and comparing them with standard recognition characteristics, the method comprising: removing speaker-dependent characteristics from the input speech signal and mapping them onto the characteristics of a standard speaker; generating synthesized speech based on the mapped characteristics and re-extracting recognition features from the generated synthesized speech; and recognizing the speech by comparing the recognition features extracted from the synthesized speech with the recognition features of a standard pattern.

2. The speech recognition method of claim 1, wherein the mapping uses the pitch value, magnitude, and vocal tract coefficients extracted by first-order linear predictive analysis.

3. The speech recognition method of claim 2, wherein the pitch values of the input speech signal are uniformly mapped to the pitch value of the representative speaker that differs least from the average pitch value extracted from the input signal.

4. The speech recognition method of claim 2, wherein the vocal tract coefficients of the input speech signal are mapped by selecting the standard speaker whose per-order distribution range of vocal tract coefficients is closest to that of the input signal, and then scaling the per-order distribution range of the input signal's vocal tract coefficients onto the corresponding per-order distribution range of the selected representative speaker's coefficients.

5. The speech recognition method of claim 1, wherein the characteristics of the standard speakers are obtained by analyzing the speech signal of one representative speaker each for a man, a woman, and a child.

6. The speech recognition method of claim 1, wherein the resynthesis uses, as the excitation signal of the vocal tract model, a pulse train of a predetermined frequency for voiced sounds and random noise for unvoiced sounds.

7. The speech recognition method of claim 1, wherein the frame boundaries of the synthesized speech generated by the resynthesis coincide with the frame boundaries used when extracting features from the input speech signal.

※ Note: This is published on the basis of the application as originally filed.
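Claim 6's source-filter resynthesis — a pulse train driving the vocal-tract model for voiced frames, random noise for unvoiced frames — can be sketched as follows. Frame lengths, pitch periods, and filter coefficients here are illustrative assumptions, not values from the patent.

```python
# Sketch of the excitation scheme of claim 6 together with a direct-form
# all-pole (LPC) synthesis filter. Values are illustrative only.
import random

def make_excitation(voiced, n_samples, pitch_period):
    """Pulse train at the pitch period for voiced frames, noise otherwise."""
    if voiced:
        return [1.0 if i % pitch_period == 0 else 0.0 for i in range(n_samples)]
    return [random.uniform(-1.0, 1.0) for _ in range(n_samples)]

def lpc_synthesize(excitation, lpc_coeffs):
    """All-pole synthesis: s[n] = e[n] + sum_k a[k] * s[n-k]."""
    out = []
    for n, e in enumerate(excitation):
        s = e
        for k, a in enumerate(lpc_coeffs, start=1):
            if n - k >= 0:
                s += a * out[n - k]
        out.append(s)
    return out
```

In a full analysis-synthesis system the coefficients would come from LPC analysis of the (speaker-normalized) input frame, and per claim 7 the synthesis frames would reuse the analysis frame boundaries.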
KR1019950018111A 1995-06-29 1995-06-29 Voice recognition method using linear prediction analysis synthesis KR100322693B1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1019950018111A KR100322693B1 (en) 1995-06-29 1995-06-29 Voice recognition method using linear prediction analysis synthesis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
KR1019950018111A KR100322693B1 (en) 1995-06-29 1995-06-29 Voice recognition method using linear prediction analysis synthesis

Publications (2)

Publication Number Publication Date
KR970002856A 1997-01-28
KR100322693B1 2002-05-13

Family

ID=37460732

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1019950018111A KR100322693B1 (en) 1995-06-29 1995-06-29 Voice recognition method using linear prediction analysis synthesis

Country Status (1)

Country Link
KR (1) KR100322693B1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100667522B1 (en) * 1998-12-18 2007-05-17 주식회사 현대오토넷 Speech Recognition Method of Mobile Communication Terminal Using LPC Coefficient
KR100762588B1 (en) * 2001-06-26 2007-10-01 엘지전자 주식회사 voice recognition method for joing the speaker adaptation and the rejection of error input
KR100427243B1 (en) * 2002-06-10 2004-04-14 휴먼씽크(주) Method and apparatus for analysing a pitch, method and system for discriminating a corporal punishment, and computer readable medium storing a program thereof
CN112102833A (en) * 2020-09-22 2020-12-18 北京百度网讯科技有限公司 Voice recognition method, device, equipment and storage medium
CN112102833B (en) * 2020-09-22 2023-12-12 阿波罗智联(北京)科技有限公司 Speech recognition method, device, equipment and storage medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20240081311A (en) 2022-11-30 2024-06-07 주식회사 아큐리스 Voice recognition program and method for reducing occupied space


Also Published As

Publication number Publication date
KR100322693B1 (en) 2002-05-13

Similar Documents

Publication Publication Date Title
Vergin et al. Generalized mel frequency cepstral coefficients for large-vocabulary speaker-independent continuous-speech recognition
EP1850328A1 (en) Enhancement and extraction of formants of voice signals
JP3006677B2 (en) Voice recognition device
JP2006171750A (en) Feature vector extracting method for speech recognition
CA2483607C (en) Syllabic nuclei extracting apparatus and program product thereof
Gussenhoven et al. On the speaker-dependence of the perceived prominence of F0peaks
KR100738332B1 (en) Apparatus for vocal-cord signal recognition and its method
Cherif et al. Pitch detection and formant analysis of Arabic speech processing
KR970002856A (en) Speech Recognition Using Linear Predictive Analysis Synthesis
Deiv et al. Automatic gender identification for hindi speech recognition
Hasija et al. Recognition of children Punjabi speech using tonal non-tonal classifier
KR101560833B1 (en) Apparatus and method for recognizing emotion using a voice signal
JP2904279B2 (en) Voice synthesis method and apparatus
Jung et al. Pitch alteration technique in speech synthesis system
JPH0580791A (en) Device and method for speech rule synthesis
Adam et al. Analysis of Momentous Fragmentary Formants in Talaqi-like Neoteric Assessment of Quran Recitation using MFCC Miniature Features of Quranic Syllables
Kongkachandra et al. Thai intonation analysis in harmonic-frequency domain
CN113409762B (en) Emotion voice synthesis method, emotion voice synthesis device, emotion voice synthesis equipment and storage medium
JP2011158515A (en) Device and method for recognizing speech
KR100608643B1 (en) Pitch modelling apparatus and method for voice synthesizing system
Arısoy et al. Duration of Turkish vowels revisited
Sangwan Feature Extraction for Speaker Recognition: A Systematic Study
JP2658426B2 (en) Voice recognition method
Rouf et al. Madurese Speech Synthesis using HMM
Reddy et al. Neutral to joyous happy emotion conversion

Legal Events

Date Code Title Description
A201 Request for examination
E701 Decision to grant or registration of patent right
GRNT Written decision to grant
FPAY Annual fee payment

Payment date: 20101230

Year of fee payment: 10

LAPS Lapse due to unpaid annual fee