KR970002856A - Speech Recognition Using Linear Predictive Analysis Synthesis - Google Patents


Info

Publication number
KR970002856A
Authority
KR
South Korea
Prior art keywords
speech
recognition
characteristic
speech recognition
recognition method
Prior art date
Application number
KR1019950018111A
Other languages
Korean (ko)
Other versions
KR100322693B1
Inventor
공병구
김상룡
Original Assignee
김광호
삼성전자 주식회사
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 김광호, 삼성전자 주식회사
Priority to KR1019950018111A
Publication of KR970002856A
Application granted
Publication of KR100322693B1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/065 Adaptation
    • G10L15/07 Adaptation to the speaker
    • G10L15/075 Adaptation to the speaker supervised, i.e. under machine guidance
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/08 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
    • G10L19/12 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Telephonic Communication Services (AREA)
  • Image Analysis (AREA)

Abstract

An improved speech recognition method is disclosed in which the features extracted from an input speech signal are mapped onto standard characteristics prepared in advance, and speech recognition is performed on the speech synthesized from the mapped characteristics.

The speech recognition method according to the present invention comprises: removing speaker-dependent characteristics from the input speech signal and mapping them onto the characteristics of a standard speaker; generating synthesized speech based on the mapped characteristics and re-extracting recognition features from the generated synthesized speech; and recognizing the speech by comparing the recognition features extracted from the synthesized speech with the recognition features of a standard pattern.

The speech recognition method according to the present invention improves the recognition rate by removing utterance-to-utterance variation within a speaker and characteristic deviations between speakers.
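The abstract's core idea, replacing each speaker's own characteristics with those of the nearest pre-prepared standard speaker, can be sketched as below. The representative pitch values and function names are illustrative assumptions, not taken from the patent.

```python
# Illustrative sketch of mapping an utterance's pitch onto a standard
# (representative) speaker, in the spirit of the abstract. The representative
# pitch values below are assumed for illustration only.

REPRESENTATIVE_PITCH_HZ = {"man": 120.0, "woman": 210.0, "child": 300.0}

def map_to_standard_pitch(frame_pitches_hz):
    """Map per-frame pitch values onto the nearest representative speaker.

    The utterance's mean pitch selects the representative speaker; every
    voiced frame is then scaled by a single ratio, so the relative
    intonation contour of the input is preserved.
    """
    voiced = [p for p in frame_pitches_hz if p > 0]  # 0.0 marks unvoiced frames
    mean_pitch = sum(voiced) / len(voiced)
    # Representative speaker whose pitch differs least from the mean pitch.
    speaker = min(REPRESENTATIVE_PITCH_HZ,
                  key=lambda s: abs(REPRESENTATIVE_PITCH_HZ[s] - mean_pitch))
    ratio = REPRESENTATIVE_PITCH_HZ[speaker] / mean_pitch
    mapped = [p * ratio if p > 0 else 0.0 for p in frame_pitches_hz]
    return speaker, mapped
```

A uniform scaling of this kind removes the speaker's absolute pitch while leaving the pitch movement of the utterance intact for recognition.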

Description

Speech Recognition Method Using Linear Predictive Analysis-Synthesis

Since this publication discloses only the essential parts of the application, the full specification is not included.

FIG. 2 is a diagram showing the speech recognition process according to the present invention. FIG. 3 is a block diagram showing an embodiment of an apparatus for performing the speech synthesis process shown in FIG. 2.
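The recognition flow of FIG. 2 — analyze, map to a standard speaker, resynthesize, re-extract, compare — can be outlined as a skeleton. The processing stages are passed in as functions because they are hypothetical stand-ins for the patent's components, not real APIs.

```python
# Skeleton of the recognition flow of FIG. 2. Only the order of operations
# follows the document; the stage implementations are supplied by the caller.

def recognize(speech_signal, standard_patterns, extract_features,
              map_to_standard_speaker, synthesize, distance):
    features = extract_features(speech_signal)    # pitch, gain, vocal-tract coefficients
    mapped = map_to_standard_speaker(features)    # remove speaker-dependent traits
    synthetic = synthesize(mapped)                # analysis-synthesis resynthesis
    recog_features = extract_features(synthetic)  # re-extract from the synthesized speech
    # Recognize as the standard pattern whose features are nearest.
    return min(standard_patterns,
               key=lambda word: distance(recog_features, standard_patterns[word]))
```

The distinctive step is the middle one: recognition features are taken not from the raw input but from speech resynthesized after speaker normalization.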

Claims (7)

1. A method of recognizing speech by extracting recognition characteristics from an input speech signal and comparing them with standard recognition characteristics, the method comprising: removing speaker-dependent characteristics from the input speech signal and mapping them onto the characteristics of a standard speaker; generating synthesized speech based on the mapped characteristics and re-extracting recognition features from the generated synthesized speech; and recognizing the speech by comparing the recognition features extracted from the synthesized speech with the recognition features of a standard pattern.

2. The speech recognition method of claim 1, wherein the mapping uses the pitch value, magnitude, and vocal tract coefficients extracted by first-order linear predictive analysis.

3. The speech recognition method of claim 2, wherein the pitch values of the input speech signal are uniformly mapped to the pitch value of the representative speaker that differs least from the average pitch value extracted from the input signal.

4. The speech recognition method of claim 2, wherein the vocal tract coefficients of the input speech signal are mapped by selecting the standard speaker whose per-order distribution range of vocal tract coefficients is closest to that of the input signal, and then scaling the per-order distribution range of the input signal's vocal tract coefficients onto the corresponding per-order distribution range of the selected representative speaker's coefficients.

5. The speech recognition method of claim 1, wherein the characteristics of the standard speakers are obtained by analyzing the speech signal of one representative speaker each for a man, a woman, and a child.

6. The speech recognition method of claim 1, wherein the resynthesis uses, as the excitation signal of the vocal tract model, a pulse train of a predetermined frequency for voiced sounds and random noise for unvoiced sounds.

7. The speech recognition method of claim 1, wherein the frame boundaries of the synthesized speech generated by the resynthesis coincide with the frame boundaries used when extracting features from the input speech signal.

※ Note: This is published on the basis of the application as originally filed.
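Claim 6's source-filter resynthesis — a pulse train driving the vocal-tract model for voiced frames, random noise for unvoiced frames — can be sketched as follows. Frame lengths, pitch periods, and filter coefficients here are illustrative assumptions, not values from the patent.

```python
# Sketch of the excitation scheme of claim 6 together with a direct-form
# all-pole (LPC) synthesis filter. Values are illustrative only.
import random

def make_excitation(voiced, n_samples, pitch_period):
    """Pulse train at the pitch period for voiced frames, noise otherwise."""
    if voiced:
        return [1.0 if i % pitch_period == 0 else 0.0 for i in range(n_samples)]
    return [random.uniform(-1.0, 1.0) for _ in range(n_samples)]

def lpc_synthesize(excitation, lpc_coeffs):
    """All-pole synthesis: s[n] = e[n] + sum_k a[k] * s[n-k]."""
    out = []
    for n, e in enumerate(excitation):
        s = e
        for k, a in enumerate(lpc_coeffs, start=1):
            if n - k >= 0:
                s += a * out[n - k]
        out.append(s)
    return out
```

In a full analysis-synthesis system the coefficients would come from LPC analysis of the (speaker-normalized) input frame, and per claim 7 the synthesis frames would reuse the analysis frame boundaries.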
KR1019950018111A 1995-06-29 1995-06-29 Voice recognition method using linear prediction analysis synthesis KR100322693B1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1019950018111A KR100322693B1 (en) 1995-06-29 1995-06-29 Voice recognition method using linear prediction analysis synthesis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
KR1019950018111A KR100322693B1 (en) 1995-06-29 1995-06-29 Voice recognition method using linear prediction analysis synthesis

Publications (2)

Publication Number Publication Date
KR970002856A 1997-01-28
KR100322693B1 2002-05-13

Family

ID=37460732

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1019950018111A KR100322693B1 (en) 1995-06-29 1995-06-29 Voice recognition method using linear prediction analysis synthesis

Country Status (1)

Country Link
KR (1) KR100322693B1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100667522B1 (en) * 1998-12-18 2007-05-17 주식회사 현대오토넷 Speech Recognition Method of Mobile Communication Terminal Using LPC Coefficient
KR100762588B1 (en) * 2001-06-26 2007-10-01 엘지전자 주식회사 voice recognition method for joing the speaker adaptation and the rejection of error input
KR100427243B1 (en) * 2002-06-10 2004-04-14 휴먼씽크(주) Method and apparatus for analysing a pitch, method and system for discriminating a corporal punishment, and computer readable medium storing a program thereof
CN112102833A (en) * 2020-09-22 2020-12-18 北京百度网讯科技有限公司 Voice recognition method, device, equipment and storage medium
CN112102833B (en) * 2020-09-22 2023-12-12 阿波罗智联(北京)科技有限公司 Speech recognition method, device, equipment and storage medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20240081311A (en) 2022-11-30 2024-06-07 주식회사 아큐리스 Voice recognition program and method for reducing occupied space


Also Published As

Publication number Publication date
KR100322693B1 (en) 2002-05-13

Similar Documents

Publication Publication Date Title
Vergin et al. Generalized mel frequency cepstral coefficients for large-vocabulary speaker-independent continuous-speech recognition
EP1850328A1 (en) Enhancement and extraction of formants of voice signals
JP3006677B2 (en) Voice recognition device
JP2006171750A (en) Feature vector extracting method for speech recognition
CA2483607C (en) Syllabic nuclei extracting apparatus and program product thereof
Gussenhoven et al. On the speaker-dependence of the perceived prominence of F0peaks
KR100738332B1 (en) Apparatus for vocal-cord signal recognition and its method
Cherif et al. Pitch detection and formant analysis of Arabic speech processing
KR970002856A (en) Speech Recognition Using Linear Predictive Analysis Synthesis
Deiv et al. Automatic gender identification for hindi speech recognition
Hasija et al. Recognition of children Punjabi speech using tonal non-tonal classifier
KR101560833B1 (en) Apparatus and method for recognizing emotion using a voice signal
JP2904279B2 (en) Voice synthesis method and apparatus
Jung et al. Pitch alteration technique in speech synthesis system
JPH0580791A (en) Device and method for speech rule synthesis
Adam et al. Analysis of Momentous Fragmentary Formants in Talaqi-like Neoteric Assessment of Quran Recitation using MFCC Miniature Features of Quranic Syllables
Kongkachandra et al. Thai intonation analysis in harmonic-frequency domain
CN113409762B (en) Emotion voice synthesis method, emotion voice synthesis device, emotion voice synthesis equipment and storage medium
JP2011158515A (en) Device and method for recognizing speech
KR100608643B1 (en) Pitch modelling apparatus and method for voice synthesizing system
Arısoy et al. Duration of Turkish vowels revisited
Sangwan Feature Extraction for Speaker Recognition: A Systematic Study
JP2658426B2 (en) Voice recognition method
Rouf et al. Madurese Speech Synthesis using HMM
Reddy et al. Neutral to joyous happy emotion conversion

Legal Events

Date Code Title Description
A201 Request for examination
E701 Decision to grant or registration of patent right
GRNT Written decision to grant
FPAY Annual fee payment

Payment date: 20101230

Year of fee payment: 10

LAPS Lapse due to unpaid annual fee