JP2002311989A - Speech recognition method compensated for both channel distortion and background noise
- Publication number
- JP2002311989A (application JP2002067939A)
- Authority
- JP
- Japan
- Prior art keywords
- vector
- model
- noise
- speech
- average
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
Concept ID | Concept | Type | Score | Sections | Count |
---|---|---|---|---|---|
238000000034 | method | Methods | 0.000 | title, claims, abstract, description | 35 |
239000013598 | vector | Substances | 0.000 | claims, abstract, description | 99 |
238000010606 | normalization | Methods | 0.000 | claims, abstract, description | 12 |
230000003044 | adaptive effect | Effects | 0.000 | claims, description | 5 |
238000012937 | correction | Methods | 0.000 | abstract, description | 10 |
230000006978 | adaptation | Effects | 0.000 | description | 9 |
239000000654 | additive | Substances | 0.000 | description | 9 |
230000000996 | additive effect | Effects | 0.000 | description | 9 |
230000006870 | function | Effects | 0.000 | description | 8 |
238000012545 | processing | Methods | 0.000 | description | 8 |
239000000203 | mixture | Substances | 0.000 | description | 7 |
230000003595 | spectral effect | Effects | 0.000 | description | 6 |
238000012360 | testing method | Methods | 0.000 | description | 4 |
238000007476 | Maximum Likelihood | Methods | 0.000 | description | 3 |
238000012549 | training | Methods | 0.000 | description | 3 |
238000012935 | Averaging | Methods | 0.000 | description | 2 |
230000005534 | acoustic noise | Effects | 0.000 | description | 2 |
238000013459 | approach | Methods | 0.000 | description | 2 |
238000011161 | development | Methods | 0.000 | description | 2 |
230000018109 | developmental process | Effects | 0.000 | description | 2 |
238000012417 | linear regression | Methods | 0.000 | description | 2 |
230000001629 | suppression | Effects | 0.000 | description | 2 |
238000012795 | verification | Methods | 0.000 | description | 2 |
238000004364 | calculation method | Methods | 0.000 | description | 1 |
150000001768 | cations | Chemical class | 0.000 | description | 1 |
238000000354 | decomposition reaction | Methods | 0.000 | description | 1 |
230000006866 | deterioration | Effects | 0.000 | description | 1 |
230000007613 | environmental effect | Effects | 0.000 | description | 1 |
238000002474 | experimental method | Methods | 0.000 | description | 1 |
230000010354 | integration | Effects | 0.000 | description | 1 |
239000011159 | matrix material | Substances | 0.000 | description | 1 |
MYWUZJCMWCOHBA-VIFPVBQESA-N | methamphetamine (CN[C@@H](C)CC1=CC=CC=C1) | Chemical compound | 0.000 | description | 1 |
238000001228 | spectrum | Methods | 0.000 | description | 1 |
230000003068 | static effect | Effects | 0.000 | description | 1 |
238000002945 | steepest descent method | Methods | 0.000 | description | 1 |
230000009466 | transformation | Effects | 0.000 | description | 1 |
238000000844 | transformation | Methods | 0.000 | description | 1 |
Classifications
- G—PHYSICS
  - G10—MUSICAL INSTRUMENTS; ACOUSTICS
    - G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
      - G10L15/00—Speech recognition
        - G10L15/20—Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
- G—PHYSICS
  - G10—MUSICAL INSTRUMENTS; ACOUSTICS
    - G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
      - G10L15/00—Speech recognition
        - G10L15/08—Speech classification or search
          - G10L15/14—Speech classification or search using statistical models, e.g. Hidden Markov Models [HMMs]
            - G10L15/142—Hidden Markov Models [HMMs]
- G—PHYSICS
  - G10—MUSICAL INSTRUMENTS; ACOUSTICS
    - G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
      - G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
        - G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
          - G10L21/0208—Noise filtering
            - G10L21/0216—Noise filtering characterised by the method used for estimating noise
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Human Computer Interaction (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Probability & Statistics with Applications (AREA)
- Circuit For Audible Band Transducer (AREA)
- Cable Transmission Systems, Equalization Of Radio And Reduction Of Echo (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Time-Division Multiplex Systems (AREA)
- Complex Calculations (AREA)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US275487 | 1981-06-19 | ||
US27548701P | 2001-03-14 | 2001-03-14 |
Publications (1)
Publication Number | Publication Date |
---|---|
JP2002311989A (ja) | 2002-10-25 |
Family
ID=23052506
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
JP2002067939A | Pending | JP2002311989A (ja) | 2001-03-14 | 2002-03-13 | Speech recognition method compensated for both channel distortion and background noise |
Country Status (4)
Country | Link |
---|---|
US (1) | US7062433B2 (fr) |
EP (1) | EP1241662B1 (fr) |
JP (1) | JP2002311989A (fr) |
DE (1) | DE60212477T2 (fr) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2006030551A1 (fr) * | 2004-09-15 | 2006-03-23 | The University Of Tokyo | Model adaptation method for speech recognition in noise by polynomial approximation |
Families Citing this family (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6985858B2 (en) * | 2001-03-20 | 2006-01-10 | Microsoft Corporation | Method and apparatus for removing noise from feature vectors |
US20040148160A1 (en) * | 2003-01-23 | 2004-07-29 | Tenkasi Ramabadran | Method and apparatus for noise suppression within a distributed speech recognition system |
JP4357867B2 (ja) * | 2003-04-25 | 2009-11-04 | Pioneer Corporation | Speech recognition apparatus, speech recognition method, speech recognition program, and recording medium storing the program |
WO2005070130A2 (fr) * | 2004-01-12 | 2005-08-04 | Voice Signal Technologies, Inc. | Automatic speech recognition channel normalization |
US7236930B2 (en) * | 2004-04-12 | 2007-06-26 | Texas Instruments Incorporated | Method to extend operating range of joint additive and convolutive compensating algorithms |
US20070033027A1 (en) * | 2005-08-03 | 2007-02-08 | Texas Instruments, Incorporated | Systems and methods employing stochastic bias compensation and bayesian joint additive/convolutive compensation in automatic speech recognition |
US7877255B2 (en) * | 2006-03-31 | 2011-01-25 | Voice Signal Technologies, Inc. | Speech recognition using channel verification |
CN1897109B (zh) * | 2006-06-01 | 2010-05-12 | University of Electronic Science and Technology of China | Single audio signal recognition method based on MFCC |
US7664643B2 (en) * | 2006-08-25 | 2010-02-16 | International Business Machines Corporation | System and method for speech separation and multi-talker speech recognition |
CN101030369B (zh) * | 2007-03-30 | 2011-06-29 | Tsinghua University | Embedded speech recognition method based on sub-word hidden Markov models |
US8180637B2 (en) * | 2007-12-03 | 2012-05-15 | Microsoft Corporation | High performance HMM adaptation with joint compensation of additive and convolutive distortions |
WO2009078093A1 (fr) | 2007-12-18 | 2009-06-25 | Fujitsu Limited | Non-speech section detection method and non-speech section detection device |
US8306817B2 (en) * | 2008-01-08 | 2012-11-06 | Microsoft Corporation | Speech recognition with non-linear noise reduction on Mel-frequency cepstra |
US8145488B2 (en) * | 2008-09-16 | 2012-03-27 | Microsoft Corporation | Parameter clustering and sharing for variable-parameter hidden markov models |
US8214215B2 (en) * | 2008-09-24 | 2012-07-03 | Microsoft Corporation | Phase sensitive model adaptation for noisy speech recognition |
EP2182512A1 (fr) * | 2008-10-29 | 2010-05-05 | BRITISH TELECOMMUNICATIONS public limited company | Speaker verification |
US8639502B1 (en) | 2009-02-16 | 2014-01-28 | Arrowhead Center, Inc. | Speaker model-based speech enhancement system |
CN103811008A (zh) * | 2012-11-08 | 2014-05-21 | China Mobile Group Shanghai Co., Ltd. | Audio content recognition method and apparatus |
US9489965B2 (en) * | 2013-03-15 | 2016-11-08 | Sri International | Method and apparatus for acoustic signal characterization |
CN106057195A (zh) * | 2016-05-25 | 2016-10-26 | Donghua University | Unmanned aerial vehicle detection system based on embedded audio recognition |
US10720165B2 (en) * | 2017-01-23 | 2020-07-21 | Qualcomm Incorporated | Keyword voice authentication |
US20210201928A1 (en) * | 2019-12-31 | 2021-07-01 | Knowles Electronics, Llc | Integrated speech enhancement for voice trigger application |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5450522A (en) * | 1991-08-19 | 1995-09-12 | U S West Advanced Technologies, Inc. | Auditory model for parametrization of speech |
US5924065A (en) * | 1997-06-16 | 1999-07-13 | Digital Equipment Corporation | Environmently compensated speech processing |
US6529872B1 (en) * | 2000-04-18 | 2003-03-04 | Matsushita Electric Industrial Co., Ltd. | Method for noise adaptation in automatic speech recognition using transformed matrices |
US6912497B2 (en) * | 2001-03-28 | 2005-06-28 | Texas Instruments Incorporated | Calibration of speech data acquisition path |
- 2002
  - 2002-01-18: US application US10/051,640, patent US7062433B2 (en), not active (Expired - Lifetime)
  - 2002-03-13: JP application JP2002067939A, publication JP2002311989A (ja), active (Pending)
  - 2002-03-14: DE application DE60212477T, patent DE60212477T2 (de), not active (Expired - Lifetime)
  - 2002-03-14: EP application EP02100251A, patent EP1241662B1 (fr), not active (Expired - Lifetime)
Also Published As
Publication number | Publication date |
---|---|
DE60212477D1 (de) | 2006-08-03 |
US7062433B2 (en) | 2006-06-13 |
EP1241662A2 (fr) | 2002-09-18 |
EP1241662A3 (fr) | 2004-02-18 |
US20020173959A1 (en) | 2002-11-21 |
EP1241662B1 (fr) | 2006-06-21 |
DE60212477T2 (de) | 2007-07-05 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US7165028B2 (en) | Method of speech recognition resistant to convolutive distortion and additive distortion | |
EP1241662B1 (fr) | Speech recognition with compensation for both convolutive distortion and additive noise | |
US7269555B2 (en) | Unsupervised incremental adaptation using maximum likelihood spectral transformation | |
EP0831461B1 (fr) | Scheme for model adaptation in pattern recognition based on Taylor expansion | |
EP1262953B1 (fr) | Speaker adaptation for speech recognition | |
JP4750271B2 (ja) | Noise-compensated speech recognition system and method | |
Wang et al. | Speaker and noise factorization for robust speech recognition | |
Liao et al. | Joint uncertainty decoding for noise robust speech recognition. | |
US7571095B2 (en) | Method and apparatus for recognizing speech in a noisy environment | |
US20080208578A1 (en) | Robust Speaker-Dependent Speech Recognition System | |
US20020165712A1 (en) | Method and apparatus for feature domain joint channel and additive noise compensation | |
US20110015925A1 (en) | Speech recognition system and method | |
JP5242782B2 (ja) | Speech recognition method | |
Buera et al. | Cepstral vector normalization based on stereo data for robust speech recognition | |
US7120580B2 (en) | Method and apparatus for recognizing speech in a noisy environment | |
US7236930B2 (en) | Method to extend operating range of joint additive and convolutive compensating algorithms | |
US20020013697A1 (en) | Log-spectral compensation of gaussian mean vectors for noisy speech recognition | |
Nisa et al. | The speech signal enhancement approach with multiple sub-frames analysis for complex magnitude and phase spectrum recompense | |
JPH10149191A (ja) | Model adaptation method, apparatus, and storage medium therefor | |
Hansen et al. | Robust speech recognition in noise: an evaluation using the spine corpus | |
JP4058521B2 (ja) | Background noise distortion compensation method and speech recognition system using the same | |
Kim et al. | Advanced parallel combined Gaussian mixture model based feature compensation integrated with iterative channel estimation | |
Torre et al. | On the comparison of front-ends for robust speech recognition in car environments | |
Chien et al. | Bayesian affine transformation of HMM parameters for instantaneous and supervised adaptation in telephone speech recognition. | |
Bernard et al. | Can back-ends be more robust than front-ends? Investigation over the Aurora-2 database |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
2005-03-03 | A621 | Written request for application examination | Free format text: JAPANESE INTERMEDIATE CODE: A621 |
2007-06-08 | A131 | Notification of reasons for refusal | Free format text: JAPANESE INTERMEDIATE CODE: A131 |
2007-09-10 | A601 | Written request for extension of time | Free format text: JAPANESE INTERMEDIATE CODE: A601 |
2007-09-13 | A602 | Written permission of extension of time | Free format text: JAPANESE INTERMEDIATE CODE: A602 |
2007-10-09 | A601 | Written request for extension of time | Free format text: JAPANESE INTERMEDIATE CODE: A601 |
2007-10-12 | A602 | Written permission of extension of time | Free format text: JAPANESE INTERMEDIATE CODE: A602 |
2007-11-08 | A601 | Written request for extension of time | Free format text: JAPANESE INTERMEDIATE CODE: A601 |
2007-11-13 | A602 | Written permission of extension of time | Free format text: JAPANESE INTERMEDIATE CODE: A602 |
2008-04-01 | A02 | Decision of refusal | Free format text: JAPANESE INTERMEDIATE CODE: A02 |