ATE363712T1 - On-line parametric histogram normalization for noise robust speech recognition

On-line parametric histogram normalization for noise robust speech recognition

Info

Publication number
ATE363712T1
Authority
AT
Austria
Prior art keywords
speech recognition
parametric
noise
robust speech
histogram normalization
Application number
AT03718984T
Other languages
English (en)
Inventor
Hemmo Haverinen
Imre Kiss
Original Assignee
Nokia Corp
Priority date
2002-04-30
Filing date
2003-04-28
Publication date
2007-06-15
Application filed by Nokia Corp

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/20 Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/18 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band

Landscapes

  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Multimedia (AREA)
  • Telephonic Communication Services (AREA)
  • Complex Calculations (AREA)
  • Noise Elimination (AREA)
  • Image Processing (AREA)
  • Machine Translation (AREA)
  • Image Analysis (AREA)
AT03718984T 2002-04-30 2003-04-28 On-line parametric histogram normalization for noise robust speech recognition ATE363712T1 (de)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US10/136,039 US7197456B2 (en) 2002-04-30 2002-04-30 On-line parametric histogram normalization for noise robust speech recognition

Publications (1)

Publication Number Publication Date
ATE363712T1 (de) 2007-06-15

Family

ID=29249598

Family Applications (1)

Application Number Title Priority Date Filing Date
AT03718984T ATE363712T1 (de) 2002-04-30 2003-04-28 On-line parametric histogram normalization for noise robust speech recognition

Country Status (7)

Country Link
US (1) US7197456B2 (de)
EP (1) EP1500087B1 (de)
CN (1) CN1650349A (de)
AT (1) ATE363712T1 (de)
AU (1) AU2003223017A1 (de)
DE (1) DE60314128T2 (de)
WO (1) WO2003094154A1 (de)

Families Citing this family (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6826513B1 (en) * 2002-11-29 2004-11-30 Council Of Scientific & Industrial Research Method and apparatus for online identification of safe operation and advance detection of unsafe operation of a system or process
TWI223791B (en) * 2003-04-14 2004-11-11 Ind Tech Res Inst Method and system for utterance verification
EP1774516B1 (de) * 2004-01-12 2011-03-16 Voice Signal Technologies Inc. Normalization of cepstral features for speech recognition
US7707029B2 (en) * 2005-02-08 2010-04-27 Microsoft Corporation Training wideband acoustic models in the cepstral domain using mixed-bandwidth training data for speech recognition
US7729909B2 (en) * 2005-03-04 2010-06-01 Panasonic Corporation Block-diagonal covariance joint subspace tying and model compensation for noise robust automatic speech recognition
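KR101127184B1 (ko) 2006-02-06 2012-03-21 Samsung Electronics Co., Ltd. Method and apparatus for normalizing speech feature vectors using a delta histogram
KR100717385B1 (ko) * 2006-02-09 2007-05-11 Samsung Electronics Co., Ltd. Method and system for measuring recognition confidence using the lexical distance between recognition candidates
KR100717401B1 (ko) * 2006-03-02 2007-05-11 Samsung Electronics Co., Ltd. Method and apparatus for normalizing speech feature vectors using a backward cumulative histogram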
US8355913B2 (en) * 2006-11-03 2013-01-15 Nokia Corporation Speech recognition with adjustable timeout period
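KR100919223B1 (ko) * 2007-09-19 2009-09-28 Electronics and Telecommunications Research Institute Method and apparatus for speech recognition in noisy environments using subband uncertainty information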
US8180718B2 (en) * 2008-01-14 2012-05-15 Hewlett-Packard Development Company, L.P. Engine for performing root cause and effect analysis
US8374854B2 (en) * 2008-03-28 2013-02-12 Southern Methodist University Spatio-temporal speech enhancement technique based on generalized eigenvalue decomposition
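JP5573627B2 (ja) * 2010-11-22 2014-08-20 Fujitsu Limited Optical digital coherent receiver
CN102290047B (zh) * 2011-09-22 2012-12-12 Harbin Institute of Technology Robust speech feature extraction method based on sparse decomposition and reconstruction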
US20130080165A1 (en) * 2011-09-24 2013-03-28 Microsoft Corporation Model Based Online Normalization of Feature Distribution for Noise Robust Speech Recognition
US8768695B2 (en) * 2012-06-13 2014-07-01 Nuance Communications, Inc. Channel normalization using recognition feedback
US9984676B2 (en) * 2012-07-24 2018-05-29 Nuance Communications, Inc. Feature normalization inputs to front end processing for automatic speech recognition
CN105139855A (zh) * 2014-05-29 2015-12-09 Harbin University of Science and Technology Speaker recognition method and apparatus using two-stage sparse decomposition
US9886948B1 (en) * 2015-01-05 2018-02-06 Amazon Technologies, Inc. Neural network processing of multiple feature streams using max pooling and restricted connectivity
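CN105068515B (zh) * 2015-07-16 2017-08-25 South China University of Technology Voice control method for smart home devices based on a self-learning algorithm
KR102413692B1 (ko) 2015-07-24 2022-06-27 Samsung Electronics Co., Ltd. Apparatus and method for calculating acoustic scores for speech recognition, speech recognition apparatus and method, and electronic device
KR102192678B1 (ko) * 2015-10-16 2020-12-17 Samsung Electronics Co., Ltd. Apparatus and method for normalizing acoustic model input data, and speech recognition apparatus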
WO2017149542A1 (en) * 2016-03-01 2017-09-08 Sentimetrix, Inc Neuropsychological evaluation screening system
US10593349B2 (en) * 2016-06-16 2020-03-17 The George Washington University Emotional interaction apparatus
US10540990B2 (en) * 2017-11-01 2020-01-21 International Business Machines Corporation Processing of speech signals
US11694708B2 (en) * 2018-09-23 2023-07-04 Plantronics, Inc. Audio device and method of audio processing with improved talker discrimination
US11264014B1 (en) * 2018-09-23 2022-03-01 Plantronics, Inc. Audio device and method of audio processing with improved talker discrimination
JP7564117B2 (ja) * 2019-03-10 2024-10-08 Kardome Technology Ltd. Speech enhancement using clustering of cues
US11545172B1 (en) * 2021-03-09 2023-01-03 Amazon Technologies, Inc. Sound source localization using reflection classification

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5148489A (en) * 1990-02-28 1992-09-15 Sri International Method for spectral estimation to improve noise robustness for speech recognition
FR2677828B1 (fr) 1991-06-14 1993-08-20 Sextant Avionique Method for detecting a noisy useful signal
US5323337A (en) 1992-08-04 1994-06-21 Loral Aerospace Corp. Signal detector employing mean energy and variance of energy content comparison for noise detection
GB9419388D0 (en) 1994-09-26 1994-11-09 Canon Kk Speech analysis
US6038528A (en) 1996-07-17 2000-03-14 T-Netix, Inc. Robust speech processing with affine transform replicated data
US6173258B1 (en) 1998-09-09 2001-01-09 Sony Corporation Method for reducing noise distortions in a speech recognition system
US6289309B1 (en) 1998-12-16 2001-09-11 Sarnoff Corporation Noise spectrum tracking for speech enhancement
FI19992350A (fi) * 1999-10-29 2001-04-30 Nokia Mobile Phones Ltd Improved speech recognition
GB2355834A (en) * 1999-10-29 2001-05-02 Nokia Mobile Phones Ltd Speech recognition
GB2364814A (en) 2000-07-12 2002-02-06 Canon Kk Speech recognition
US20030004720A1 (en) 2001-01-30 2003-01-02 Harinath Garudadri System and method for computing and transmitting parameters in a distributed voice recognition system
US7035797B2 (en) * 2001-12-14 2006-04-25 Nokia Corporation Data-driven filtering of cepstral time trajectories for robust speech recognition

Also Published As

Publication number Publication date
EP1500087B1 (de) 2007-05-30
WO2003094154A1 (en) 2003-11-13
EP1500087A4 (de) 2005-05-18
DE60314128D1 (de) 2007-07-12
US20030204398A1 (en) 2003-10-30
AU2003223017A1 (en) 2003-11-17
EP1500087A1 (de) 2005-01-26
CN1650349A (zh) 2005-08-03
US7197456B2 (en) 2007-03-27
DE60314128T2 (de) 2008-01-24

Similar Documents

Publication Publication Date Title
ATE363712T1 (de) On-line parametric histogram normalization for noise robust speech recognition
WO2007015869A3 (en) Spoken language proficiency assessment by computer
ATE403212T1 (de) Extraction and matching of characteristic fingerprints from audio signals
DE60309142D1 (de) Device for determining parameters of a Gaussian mixture model (GMM) or a GMM-based hidden Markov model
CN106782521A (zh) A speech recognition system
CN110931022B (zh) Voiceprint recognition method based on high- and low-frequency dynamic and static features
CN106205623A (zh) A voice conversion method and device
CN1268732A (zh) Speaker-dependent speech recognition and speech playback method based on a dedicated speech recognition chip
Hermansky et al. Perceptual properties of current speech recognition technology
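CN108091340B (zh) Voiceprint recognition method, voiceprint recognition system and computer-readable storage medium
CN1877697A (zh) A speaker verification method based on a distributed structure
CN106297769B (zh) A discriminative feature extraction method applied to language identification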
Li et al. The Hokkien isolated word recognition system based on FPGA
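CN101419796A (zh) Apparatus and method for automatically segmenting single-word speech signals
CN1697018A (zh) A method for improving speech recognition accuracy using improved spectral subtraction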
Cheng et al. A study on emotional feature analysis and recognition in speech signal
Kim et al. Speech recognition using hidden markov models in embedded platform
Haeb-Umbach et al. An investigation of cepstral parameterisations for large vocabulary speech recognition
CN108877833A (zh) A speaker-independent speech recognition method based on an embedded microprocessor unit
Murtazin et al. The speech synthesis detection algorithm based on cepstral coefficients and convolutional neural network
Missaoui et al. Physiologically motivated feature extraction for robust automatic speech recognition
Wu et al. Speech endpoint detection in noisy environment using Spectrogram Boundary Factor
TW201411577A (zh) Speech processing method for a point-reading device
Vijay et al. Personality Traits from Speech Signal Using Cross-Corpus Technique
Yamasaki et al. Accuracy improvement of speaker authentication in noisy environments using bone-conducted speech

Legal Events

Date Code Title Description
RER Ceased as to paragraph 5 lit. 3 law introducing patent treaties