KR20000025292A - Method for extracting voice characteristic suitable for core word detection in noise circumstance - Google Patents
- Publication number
- KR20000025292A (application KR1019980042317A)
- Authority
- KR
- South Korea
- Prior art keywords
- cepstrum
- noise
- signal
- voice
- local
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0272—Voice signal separating
- G10L21/0308—Voice signal separating characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/24—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being the cepstrum
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
- G10L25/84—Detection of presence or absence of voice signals for discriminating voice from noise
Abstract
Description
The present invention relates to speech recognition, and more particularly to a noise-compensation method suited to keyword detection in speech recognition, intended for use in keyword-detection systems that must operate in noisy environments.
In general, speech recognition is the technology of recognizing human speech. Within it, keyword detection (keyword spotting) is the technique of detecting a predefined set of keywords in continuous speech that is pronounced naturally and without vocabulary restrictions. Keyword detection avoids both the user inconvenience of isolated-word recognition and the poor accuracy of full continuous-speech recognition, and it can be applied effectively in the many fields where detecting only the keywords in the input speech is enough to convey the meaning, such as telephone exchange and directory-assistance services or information-retrieval services.
Speech recognition over the telephone network is one of the most promising applications of the technology. It raises many problems, however, that never arose when recognizing high-quality speech recorded in a laboratory. In the real environments where keyword detection and other forms of speech recognition are deployed, the signal is corrupted by distortion from the limited channel bandwidth, distortion from the handset microphone characteristics, and ambient background noise. Effective removal of this ambient noise is therefore essential before any speech-recognition system can be used in practice, and improving recognition performance over the telephone network requires effective compensation for these various distortions and for background noise.
If the various distortions, such as those caused by the limited channel bandwidth and by the handset microphone characteristics, are treated together as a single channel distortion, then speech received over the telephone network can be modeled as the input speech with channel distortion and additive noise applied. Assuming the distortion characteristics do not change while the user is speaking, the distortion can be modeled by a linear time-invariant filter alone.
Expressed in the time domain and in the frequency domain, this distortion is given by Equations 1 and 2, respectively.
Here x(n) is the input speech, h(n) is the spectral-envelope function of the channel, and z(n) is the observed speech signal, expressed as the convolution of the input speech with the spectral-envelope function. Equation 2 restates the time-domain relation of Equation 1 in the frequency domain.
Expressing Equations 1 and 2 in the log-frequency domain and in its inverse Fourier-transform domain, the cepstrum (a word coined by reversing the first syllable of "spectrum"), gives Equations 3 and 4, respectively.
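The equation images from the original filing are not reproduced in this text; a standard formulation consistent with the definitions of x(n), h(n), and z(n) given here would be:

```latex
% Eq. 1 (time domain): observed signal = input speech convolved
% with the channel impulse response
z(n) = x(n) * h(n)

% Eq. 2 (frequency domain): convolution becomes multiplication
Z(\omega) = X(\omega)\,H(\omega)

% Eq. 3 (log-frequency domain): multiplication becomes addition
\log Z(\omega) = \log X(\omega) + \log H(\omega)

% Eq. 4 (cepstral domain): the channel appears as an additive bias
C_z(n) = C_x(n) + C_h(n)
```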
Among conventional distortion-compensation methods, the goal of the CMS (Cepstral Mean Subtraction) method is to estimate the unknown bias term appearing in Equation 4 and remove it. The bias is estimated as the mean of the observed cepstrum vectors, where T denotes the length of the observation-vector sequence, and the channel-compensated vector is then obtained by subtracting this mean from each observation vector.
A vector compensated for channel distortion in this way is unaffected by the channel bias.
This method has a drawback, however: because the mean vector used in both training and recognition must be computed over the entire input utterance, including the keyword portion, real-time processing is difficult.
Furthermore, when the method is used for keyword detection, the mean is taken over the entire input utterance, so the non-keyword speech influences the modeling of the keyword models and degrades their precision.
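The CMS compensation described above can be sketched in a few lines. This is an illustrative NumPy sketch of the standard technique, not the patent's own implementation; the function name and array layout are assumptions.

```python
import numpy as np

def cms(cepstra: np.ndarray) -> np.ndarray:
    """Cepstral Mean Subtraction (CMS).

    cepstra: (T, D) array of D-dimensional cepstrum vectors for one
    utterance of T frames.  The channel bias of Equation 4 is estimated
    as the mean over all T frames and subtracted from every frame.
    """
    mu = cepstra.mean(axis=0)   # utterance-level mean: the bias estimate
    return cepstra - mu         # channel-compensated vectors
```

Note that the whole utterance must be available before `mu` can be computed, which is exactly the real-time limitation the text points out.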
The present invention is proposed to resolve these conventional problems. Its object is to provide a speech-feature extraction method suited to keyword detection in noisy environments, one that removes ambient noise during speech recognition so that keywords can be detected.
To achieve this object, the speech-feature extraction method for keyword detection in noisy environments according to the present invention is characterized by: a speech-feature extraction step of extracting the local mean of the cepstrum through cepstrum analysis of the observed speech when speech is input; and a distortion-compensation step, performed after the feature-extraction step, of compensating for noise-induced distortion using the local mean of the cepstrum.
Fig. 1 is a block diagram of a speech-recognition system to which the present invention is applied.
Fig. 2 is a flowchart of the speech-feature extraction method for keyword detection in noisy environments according to the present invention.
<Reference numerals for the main parts of the drawings>
10: speech-feature extraction unit, 20: distortion-compensation unit
30: HMM model unit, 40: pattern-recognition unit
An embodiment of the speech-feature extraction method for keyword detection in noisy environments according to the technical idea of the present invention is described in detail below with reference to the accompanying drawings.
As a distortion-compensation method matched to the characteristics of keyword detection, the present invention proposes the LCMS (Local Cepstral Mean Subtraction) method. Like CMS, LCMS estimates the bias term appearing in Equation 4, but it estimates it with a local moving average of the cepstrum vectors centered on the current frame (the local mean of Equation 7), rather than with a mean taken over the entire utterance, and subtracts that local mean from each observation vector.
Because LCMS takes a moving average, it can remove the channel distortion to a useful degree even when that distortion is not fixed but drifts slowly over time. Like CMS, LCMS removes the mean of the input speech feature vectors along with the channel bias, but it has the advantage of permitting real-time processing. When used for keyword detection, the moving average serves as the mean vector, so the non-keyword portions of the utterance have almost no influence on the modeling of the keyword models. Compared with conventional distortion-compensation methods, this makes LCMS better suited to keyword detection, where the precision of the keyword models matters most.
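A minimal NumPy sketch of the LCMS idea follows. The window half-width is an assumed parameter; the patent text does not fix a specific window length.

```python
import numpy as np

def lcms(cepstra: np.ndarray, half_window: int = 15) -> np.ndarray:
    """Local Cepstral Mean Subtraction (LCMS) sketch.

    Instead of the utterance-level mean used by CMS, each frame t is
    compensated with a moving average over the symmetric local window
    [t - half_window, t + half_window], clipped at the utterance
    boundaries.  Only frames near t are needed, so the compensation
    can keep up with the input in near real time.
    """
    T = len(cepstra)
    out = np.empty_like(cepstra)
    for t in range(T):
        lo = max(0, t - half_window)
        hi = min(T, t + half_window + 1)
        local_mean = cepstra[lo:hi].mean(axis=0)  # local bias estimate
        out[t] = cepstra[t] - local_mean          # compensated frame
    return out
```

A slowly drifting channel bias is tracked by the local mean, which is why LCMS tolerates time-varying distortion better than utterance-level CMS.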
Fig. 1 is a block diagram of a speech-recognition system to which the present invention is applied.
As shown there, the system comprises: a speech-feature extraction unit 10 that extracts recognition-relevant features from the input speech; a distortion-compensation unit 20 that compensates for noise-induced distortion in the output of the speech-feature extraction unit 10; an HMM (Hidden Markov Model) model unit 30 that probabilistically models the time-varying statistical characteristics of the input speech; and a pattern-recognition unit 40 that, according to the models of the HMM model unit 30, recognizes patterns in the noise-compensated feature vectors output by the distortion-compensation unit 20 and outputs the recognized words.
Fig. 2 is a flowchart of the speech-feature extraction method for keyword detection in noisy environments according to the present invention.
As shown there, when speech is input, a speech-feature extraction step (ST1-ST5) extracts the local mean of the cepstrum through cepstrum analysis of the observation, and a distortion-compensation step (ST6, ST7) then compensates for noise-induced distortion using that local mean.
When speech is input, pre-emphasis is first applied: flattening the spectral tilt suppresses the dynamic range of the signal and raises the signal-to-noise ratio (SNR) (ST1, ST2). The pre-emphasized signal is then multiplied by a window function (windowing) (ST3). Next, LPC (Linear Prediction Coefficient) analysis is performed, which models speech as the output of an articulatory filter, represented by linear prediction, driven by an excitation signal (ST4).
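The pre-emphasis and windowing steps (ST1-ST3) can be sketched as below. The filter coefficient 0.97, the frame length, and the hop size are common illustrative choices, not values fixed by the patent.

```python
import numpy as np

def preemphasize(x: np.ndarray, alpha: float = 0.97) -> np.ndarray:
    """First-order pre-emphasis y[n] = x[n] - alpha * x[n-1] (ST1/ST2).

    The high-pass filtering flattens the spectral tilt of voiced
    speech, which suppresses the signal's dynamic range.
    """
    y = x.astype(float).copy()
    y[1:] = x[1:] - alpha * x[:-1]
    return y

def frame_and_window(x: np.ndarray, frame_len: int = 240,
                     hop: int = 80) -> np.ndarray:
    """Split the signal into overlapping frames and multiply each by a
    Hamming window (ST3).  Returns an (n_frames, frame_len) array.
    """
    n_frames = 1 + (len(x) - frame_len) // hop
    win = np.hamming(frame_len)
    return np.stack([x[i * hop : i * hop + frame_len] * win
                     for i in range(n_frames)])
```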
Cepstrum analysis is then performed to obtain the local mean, which is given by Equation 7 above (ST5).
The local mean of the cepstrum is then computed and subtracted from the input cepstrum vector to yield the channel-compensated cepstrum vector (ST6, ST7).
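The full flowchart ST1-ST7 can be tied together as one sketch. For brevity this version uses an FFT-based real cepstrum where the patent uses LPC-derived cepstra, and all sizes are illustrative assumptions.

```python
import numpy as np

def feature_pipeline(signal: np.ndarray, frame_len: int = 240,
                     hop: int = 80, alpha: float = 0.97,
                     n_ceps: int = 12, half_window: int = 15) -> np.ndarray:
    """Sketch of ST1-ST7: pre-emphasis, windowing, cepstrum analysis,
    local-mean estimation, and local-mean subtraction."""
    # ST1-ST2: pre-emphasis to flatten the spectral tilt
    x = signal.astype(float).copy()
    x[1:] -= alpha * x[:-1]
    # ST3: framing plus Hamming windowing
    n_frames = 1 + (len(x) - frame_len) // hop
    win = np.hamming(frame_len)
    frames = np.stack([x[i * hop : i * hop + frame_len] * win
                       for i in range(n_frames)])
    # ST4-ST5: cepstrum analysis (real cepstrum standing in for the
    # LPC-derived cepstrum used in the patent)
    spec = np.abs(np.fft.rfft(frames, axis=1)) + 1e-10
    ceps = np.fft.irfft(np.log(spec), axis=1)[:, 1 : n_ceps + 1]
    # ST5-ST7: local (moving-average) mean, subtracted per frame
    out = np.empty_like(ceps)
    for t in range(n_frames):
        lo, hi = max(0, t - half_window), min(n_frames, t + half_window + 1)
        out[t] = ceps[t] - ceps[lo:hi].mean(axis=0)
    return out
```

The result is a sequence of channel-compensated cepstrum vectors ready to be scored against the HMM keyword models.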
In this way, the present invention removes ambient noise during speech recognition and thereby detects the keywords.
Although a preferred embodiment of the present invention has been described above, the invention admits various changes, modifications, and equivalents, and it is clear that the embodiment can be suitably modified and applied in the same way. The foregoing description therefore does not limit the scope of the invention, which is defined by the limits of the following claims.
As described above, the speech-feature extraction method for keyword detection in noisy environments according to the present invention uses the LCMS method, a distortion-compensation method matched to the characteristics of keyword detection, to obtain channel-compensated cepstrum vectors, with the effect that keywords can be detected with the ambient noise removed.
Claims (4)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1019980042317A KR20000025292A (en) | 1998-10-09 | 1998-10-09 | Method for extracting voice characteristic suitable for core word detection in noise circumstance |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1019980042317A KR20000025292A (en) | 1998-10-09 | 1998-10-09 | Method for extracting voice characteristic suitable for core word detection in noise circumstance |
Publications (1)
Publication Number | Publication Date |
---|---|
KR20000025292A true KR20000025292A (en) | 2000-05-06 |
Family
ID=19553540
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR1019980042317A KR20000025292A (en) | 1998-10-09 | 1998-10-09 | Method for extracting voice characteristic suitable for core word detection in noise circumstance |
Country Status (1)
Country | Link |
---|---|
KR (1) | KR20000025292A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR100421013B1 (en) * | 2001-08-10 | 2004-03-04 | 삼성전자주식회사 | Speech enhancement system and method thereof |
US7613611B2 (en) | 2004-11-04 | 2009-11-03 | Electronics And Telecommunications Research Institute | Method and apparatus for vocal-cord signal recognition |
KR20150060300A (en) * | 2013-11-26 | 2015-06-03 | 현대모비스 주식회사 | System for command operation using speech recognition and method thereof |
- 1998-10-09: Application KR1019980042317A filed in KR; status: not active (application discontinued)
Legal Events
Date | Code | Title | Description |
---|---|---|---|
WITN | Withdrawal due to no request for examination |