ATE363712T1 - On-line parametric histogram normalization for noise robust speech recognition
On-line parametric histogram normalization for noise robust speech recognition
- Publication number
- ATE363712T1
- Authority
- AT
- Austria
- Prior art keywords
- speech recognition
- parametric
- noise
- robust speech
- histogram normalization
- Prior art date
Classifications
- G—PHYSICS
  - G10—MUSICAL INSTRUMENTS; ACOUSTICS
    - G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
      - G10L15/00—Speech recognition
        - G10L15/20—Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
        - G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
      - G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
        - G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
          - G10L25/18—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
Landscapes
- Engineering & Computer Science (AREA)
- Acoustics & Sound (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Multimedia (AREA)
- Telephonic Communication Services (AREA)
- Complex Calculations (AREA)
- Noise Elimination (AREA)
- Image Processing (AREA)
- Machine Translation (AREA)
- Image Analysis (AREA)
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/136,039 US7197456B2 (en) | 2002-04-30 | 2002-04-30 | On-line parametric histogram normalization for noise robust speech recognition |
Publications (1)
Publication Number | Publication Date |
---|---|
ATE363712T1 (de) | 2007-06-15 |
Family
ID=29249598
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AT03718984T ATE363712T1 (de) | 2002-04-30 | 2003-04-28 | On-line parametric histogram normalization for noise robust speech recognition |
Country Status (7)
Country | Link |
---|---|
US (1) | US7197456B2 (de) |
EP (1) | EP1500087B1 (de) |
CN (1) | CN1650349A (de) |
AT (1) | ATE363712T1 (de) |
AU (1) | AU2003223017A1 (de) |
DE (1) | DE60314128T2 (de) |
WO (1) | WO2003094154A1 (de) |
Families Citing this family (29)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6826513B1 (en) * | 2002-11-29 | 2004-11-30 | Council Of Scientific & Industrial Research | Method and apparatus for online identification of safe operation and advance detection of unsafe operation of a system or process |
TWI223791B (en) * | 2003-04-14 | 2004-11-11 | Ind Tech Res Inst | Method and system for utterance verification |
EP1774516B1 (de) * | 2004-01-12 | 2011-03-16 | Voice Signal Technologies Inc. | Normalization of cepstral features for speech recognition |
US7707029B2 (en) * | 2005-02-08 | 2010-04-27 | Microsoft Corporation | Training wideband acoustic models in the cepstral domain using mixed-bandwidth training data for speech recognition |
US7729909B2 (en) * | 2005-03-04 | 2010-06-01 | Panasonic Corporation | Block-diagonal covariance joint subspace tying and model compensation for noise robust automatic speech recognition |
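KR101127184B1 (ko) | 2006-02-06 | 2012-03-21 | 삼성전자주식회사 | Method and apparatus for normalizing speech feature vectors using a delta histogram |
KR100717385B1 (ko) * | 2006-02-09 | 2007-05-11 | 삼성전자주식회사 | Method and system for measuring recognition confidence using the lexical distance of recognition candidates |
KR100717401B1 (ko) * | 2006-03-02 | 2007-05-11 | 삼성전자주식회사 | Method and apparatus for normalizing speech feature vectors using a backward cumulative histogram |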
US8355913B2 (en) * | 2006-11-03 | 2013-01-15 | Nokia Corporation | Speech recognition with adjustable timeout period |
KR100919223B1 (ko) * | 2007-09-19 | 2009-09-28 | 한국전자통신연구원 | Method and apparatus for speech recognition in noisy environments using sub-band uncertainty information |
US8180718B2 (en) * | 2008-01-14 | 2012-05-15 | Hewlett-Packard Development Company, L.P. | Engine for performing root cause and effect analysis |
US8374854B2 (en) * | 2008-03-28 | 2013-02-12 | Southern Methodist University | Spatio-temporal speech enhancement technique based on generalized eigenvalue decomposition |
JP5573627B2 (ja) * | 2010-11-22 | 2014-08-20 | 富士通株式会社 | Optical digital coherent receiver |
CN102290047B (zh) * | 2011-09-22 | 2012-12-12 | 哈尔滨工业大学 | Robust speech feature extraction method based on sparse decomposition and reconstruction |
US20130080165A1 (en) * | 2011-09-24 | 2013-03-28 | Microsoft Corporation | Model Based Online Normalization of Feature Distribution for Noise Robust Speech Recognition |
US8768695B2 (en) * | 2012-06-13 | 2014-07-01 | Nuance Communications, Inc. | Channel normalization using recognition feedback |
US9984676B2 (en) * | 2012-07-24 | 2018-05-29 | Nuance Communications, Inc. | Feature normalization inputs to front end processing for automatic speech recognition |
CN105139855A (zh) * | 2014-05-29 | 2015-12-09 | 哈尔滨理工大学 | Speaker recognition method and apparatus using two-stage sparse decomposition |
US9886948B1 (en) * | 2015-01-05 | 2018-02-06 | Amazon Technologies, Inc. | Neural network processing of multiple feature streams using max pooling and restricted connectivity |
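CN105068515B (zh) * | 2015-07-16 | 2017-08-25 | 华南理工大学 | Voice control method for smart home devices based on a self-learning algorithm |
KR102413692B1 (ko) | 2015-07-24 | 2022-06-27 | 삼성전자주식회사 | Apparatus and method for calculating acoustic scores for speech recognition, speech recognition apparatus and method, and electronic device |
KR102192678B1 (ko) * | 2015-10-16 | 2020-12-17 | 삼성전자주식회사 | Apparatus and method for normalizing acoustic model input data, and speech recognition apparatus |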
WO2017149542A1 (en) * | 2016-03-01 | 2017-09-08 | Sentimetrix, Inc | Neuropsychological evaluation screening system |
US10593349B2 (en) * | 2016-06-16 | 2020-03-17 | The George Washington University | Emotional interaction apparatus |
US10540990B2 (en) * | 2017-11-01 | 2020-01-21 | International Business Machines Corporation | Processing of speech signals |
US11694708B2 (en) * | 2018-09-23 | 2023-07-04 | Plantronics, Inc. | Audio device and method of audio processing with improved talker discrimination |
US11264014B1 (en) * | 2018-09-23 | 2022-03-01 | Plantronics, Inc. | Audio device and method of audio processing with improved talker discrimination |
JP7564117B2 (ja) * | 2019-03-10 | 2024-10-08 | カードーム テクノロジー リミテッド | Speech enhancement using clustering of cues |
US11545172B1 (en) * | 2021-03-09 | 2023-01-03 | Amazon Technologies, Inc. | Sound source localization using reflection classification |
Family Cites Families (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5148489A (en) * | 1990-02-28 | 1992-09-15 | Sri International | Method for spectral estimation to improve noise robustness for speech recognition |
FR2677828B1 (fr) | 1991-06-14 | 1993-08-20 | Sextant Avionique | Method for detecting a noisy useful signal |
US5323337A (en) | 1992-08-04 | 1994-06-21 | Loral Aerospace Corp. | Signal detector employing mean energy and variance of energy content comparison for noise detection |
GB9419388D0 (en) | 1994-09-26 | 1994-11-09 | Canon Kk | Speech analysis |
US6038528A (en) | 1996-07-17 | 2000-03-14 | T-Netix, Inc. | Robust speech processing with affine transform replicated data |
US6173258B1 (en) | 1998-09-09 | 2001-01-09 | Sony Corporation | Method for reducing noise distortions in a speech recognition system |
US6289309B1 (en) | 1998-12-16 | 2001-09-11 | Sarnoff Corporation | Noise spectrum tracking for speech enhancement |
FI19992350A (fi) * | 1999-10-29 | 2001-04-30 | Nokia Mobile Phones Ltd | Improved speech recognition |
GB2355834A (en) * | 1999-10-29 | 2001-05-02 | Nokia Mobile Phones Ltd | Speech recognition |
GB2364814A (en) | 2000-07-12 | 2002-02-06 | Canon Kk | Speech recognition |
US20030004720A1 (en) | 2001-01-30 | 2003-01-02 | Harinath Garudadri | System and method for computing and transmitting parameters in a distributed voice recognition system |
US7035797B2 (en) * | 2001-12-14 | 2006-04-25 | Nokia Corporation | Data-driven filtering of cepstral time trajectories for robust speech recognition |
2002
- 2002-04-30: US application US10/136,039, patent US7197456B2 (en), not active: Expired - Fee Related
2003
- 2003-04-28: AT application AT03718984T, patent ATE363712T1 (de), not active: IP Right Cessation
- 2003-04-28: DE application DE60314128T, patent DE60314128T2 (de), not active: Expired - Lifetime
- 2003-04-28: AU application AU2003223017A, patent AU2003223017A1 (en), not active: Abandoned
- 2003-04-28: WO application PCT/IB2003/001621, patent WO2003094154A1 (en), active: IP Right Grant
- 2003-04-28: CN application CN03809428.2A, patent CN1650349A (zh), active: Pending
- 2003-04-28: EP application EP03718984A, patent EP1500087B1 (de), not active: Expired - Lifetime
Also Published As
Publication number | Publication date |
---|---|
EP1500087B1 (de) | 2007-05-30 |
WO2003094154A1 (en) | 2003-11-13 |
EP1500087A4 (de) | 2005-05-18 |
DE60314128D1 (de) | 2007-07-12 |
US20030204398A1 (en) | 2003-10-30 |
AU2003223017A1 (en) | 2003-11-17 |
EP1500087A1 (de) | 2005-01-26 |
CN1650349A (zh) | 2005-08-03 |
US7197456B2 (en) | 2007-03-27 |
DE60314128T2 (de) | 2008-01-24 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
ATE363712T1 (de) | On-line parametric histogram normalization for noise robust speech recognition | |
WO2007015869A3 (en) | Spoken language proficiency assessment by computer | |
ATE403212T1 (de) | Extraction and matching of characteristic fingerprints from audio signals | |
DE60309142D1 (de) | Apparatus for determining parameters of a Gaussian mixture model (GMM) or a GMM-based hidden Markov model | |
CN106782521A (zh) | A speech recognition system | |
CN110931022B (zh) | Voiceprint recognition method based on high- and low-frequency dynamic and static features | |
CN106205623A (zh) | A voice conversion method and apparatus | |
CN1268732A (zh) | Speaker-dependent speech recognition and speech playback method based on a dedicated speech recognition chip | |
Hermansky et al. | Perceptual properties of current speech recognition technology | |
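CN108091340B (zh) | Voiceprint recognition method, voiceprint recognition system, and computer-readable storage medium | |
CN1877697A (zh) | A speaker verification method based on a distributed structure | |
CN106297769B (zh) | A discriminative feature extraction method for language identification | |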
Li et al. | The Hokkien isolated word recognition system based on FPGA | |
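CN101419796A (zh) | Apparatus and method for automatically segmenting single-word speech signals | |
CN1697018A (zh) | A method for improving speech recognition accuracy using improved spectral subtraction | |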
Cheng et al. | A study on emotional feature analysis and recognition in speech signal | |
Kim et al. | Speech recognition using hidden markov models in embedded platform | |
Haeb-Umbach et al. | An investigation of cepstral parameterisations for large vocabulary speech recognition | |
CN108877833A (zh) | A speaker-independent speech recognition method based on an embedded microprocessor unit | |
Murtazin et al. | The speech synthesis detection algorithm based on cepstral coefficients and convolutional neural network | |
Missaoui et al. | Physiologically motivated feature extraction for robust automatic speech recognition | |
Wu et al. | Speech endpoint detection in noisy environment using Spectrogram Boundary Factor | |
TW201411577A (zh) | Speech processing method for a point-and-read device | |
Vijay et al. | Personality Traits from Speech Signal Using Cross-Corpus Technique | |
Yamasaki et al. | Accuracy improvement of speaker authentication in noisy environments using bone-conducted speech |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | RER | Ceased as to paragraph 5 lit. 3 law introducing patent treaties | |