DE60207784D1 - Speaker adaptation for speech recognition - Google Patents

Speaker adaptation for speech recognition

Info

Publication number
DE60207784D1
Authority
DE
Germany
Prior art keywords
speaker adaptation
speech recognition
domain
adaptation
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
DE60207784T
Other languages
English (en)
Other versions
DE60207784T2 (de)
DE60207784T9 (de)
Inventor
Luca Rigazio
Patrick Nguyen
David Kryze
Jean-Claude Junqua
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Panasonic Holdings Corp
Original Assignee
Matsushita Electric Industrial Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Matsushita Electric Industrial Co Ltd
Application granted
Publication of DE60207784D1
Publication of DE60207784T2
Publication of DE60207784T9
Current legal status: Expired - Fee Related

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/065 Adaptation
    • G10L15/07 Adaptation to the speaker
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering

Landscapes

  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Artificial Intelligence (AREA)
  • Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)
  • Noise Elimination (AREA)
  • Soundproofing, Sound Blocking, And Sound Damping (AREA)
  • Fittings On The Vehicle Exterior For Carrying Loads, And Devices For Holding Or Mounting Articles (AREA)
  • Complex Calculations (AREA)
DE60207784T 2001-05-24 2002-05-23 Speaker adaptation for speech recognition Expired - Fee Related DE60207784T9 (de)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US864838 2001-05-24
US09/864,838 US6915259B2 (en) 2001-05-24 2001-05-24 Speaker and environment adaptation based on linear separation of variability sources

Publications (3)

Publication Number Publication Date
DE60207784D1 (de) 2006-01-12
DE60207784T2 (de) 2006-07-06
DE60207784T9 (de) 2006-12-14

Family

ID=25344185

Family Applications (1)

Application Number Status Publication Priority Date Filing Date Title
DE60207784T Expired - Fee Related DE60207784T9 (de) 2001-05-24 2002-05-23 Speaker adaptation for speech recognition

Country Status (4)

Country Link
US (1) US6915259B2 (de)
EP (1) EP1262953B1 (de)
AT (1) ATE312398T1 (de)
DE (1) DE60207784T9 (de)

Families Citing this family (37)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
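JP2002366187A (ja) * 2001-06-08 2002-12-20 Sony Corp Speech recognition device, speech recognition method, program, and recording medium
CN1453767A (zh) * 2002-04-26 2003-11-05 Pioneer Corporation (Japan) Speech recognition device and speech recognition method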
US7103540B2 (en) 2002-05-20 2006-09-05 Microsoft Corporation Method of pattern recognition using noise reduction uncertainty
US7107210B2 (en) * 2002-05-20 2006-09-12 Microsoft Corporation Method of noise reduction based on dynamic aspects of speech
US7174292B2 (en) * 2002-05-20 2007-02-06 Microsoft Corporation Method of determining uncertainty associated with acoustic distortion-based noise reduction
US7340396B2 (en) * 2003-02-18 2008-03-04 Motorola, Inc. Method and apparatus for providing a speaker adapted speech recognition model set
US7729909B2 (en) * 2005-03-04 2010-06-01 Panasonic Corporation Block-diagonal covariance joint subspace tying and model compensation for noise robust automatic speech recognition
US7729908B2 (en) * 2005-03-04 2010-06-01 Panasonic Corporation Joint signal and model based noise matching noise robustness method for automatic speech recognition
US9571652B1 (en) 2005-04-21 2017-02-14 Verint Americas Inc. Enhanced diarization systems, media and methods of use
US7877255B2 (en) * 2006-03-31 2011-01-25 Voice Signal Technologies, Inc. Speech recognition using channel verification
EP2022042B1 (de) * 2006-05-16 2010-12-08 Loquendo S.p.A. Compensation of inter-session variability for automatic extraction of information from speech
US8180637B2 (en) * 2007-12-03 2012-05-15 Microsoft Corporation High performance HMM adaptation with joint compensation of additive and convolutive distortions
US8798994B2 (en) * 2008-02-06 2014-08-05 International Business Machines Corporation Resource conservative transformation based unsupervised speaker adaptation
JP5423670B2 (ja) * 2008-04-30 2014-02-19 NEC Corporation Acoustic model learning device and speech recognition device
US9798653B1 (en) * 2010-05-05 2017-10-24 Nuance Communications, Inc. Methods, apparatus and data structure for cross-language speech adaptation
KR20120054845A (ko) * 2010-11-22 2012-05-31 Samsung Electronics Co., Ltd. Speech recognition method for a robot
GB2493413B (en) 2011-07-25 2013-12-25 Ibm Maintaining and supplying speech models
US8543398B1 (en) 2012-02-29 2013-09-24 Google Inc. Training an automatic speech recognition system using compressed word frequencies
US9984678B2 (en) * 2012-03-23 2018-05-29 Microsoft Technology Licensing, Llc Factored transforms for separable adaptation of acoustic models
US8374865B1 (en) 2012-04-26 2013-02-12 Google Inc. Sampling training data for an automatic speech recognition system based on a benchmark classification distribution
US8805684B1 (en) * 2012-05-31 2014-08-12 Google Inc. Distributed speaker adaptation
US8571859B1 (en) 2012-05-31 2013-10-29 Google Inc. Multi-stage speaker adaptation
US8880398B1 (en) 2012-07-13 2014-11-04 Google Inc. Localized speech recognition with offload
US9368116B2 (en) 2012-09-07 2016-06-14 Verint Systems Ltd. Speaker separation in diarization
US9123333B2 (en) 2012-09-12 2015-09-01 Google Inc. Minimum bayesian risk methods for automatic speech recognition
US10134401B2 (en) 2012-11-21 2018-11-20 Verint Systems Ltd. Diarization using linguistic labeling
JP6000094B2 (ja) * 2012-12-03 2016-09-28 Nippon Telegraph And Telephone Corporation Speaker adaptation device, speaker adaptation method, and program
US9275638B2 (en) 2013-03-12 2016-03-01 Google Technology Holdings LLC Method and apparatus for training a voice recognition model database
US9460722B2 (en) 2013-07-17 2016-10-04 Verint Systems Ltd. Blind diarization of recorded calls with arbitrary number of speakers
US9984706B2 (en) 2013-08-01 2018-05-29 Verint Systems Ltd. Voice activity detection using a soft decision mechanism
US9875742B2 (en) 2015-01-26 2018-01-23 Verint Systems Ltd. Word-level blind diarization of recorded calls with arbitrary number of speakers
US9865256B2 (en) 2015-02-27 2018-01-09 Storz Endoskop Produktions Gmbh System and method for calibrating a speech recognition system to an operating environment
US11538128B2 (en) 2018-05-14 2022-12-27 Verint Americas Inc. User interface for fraud alert management
US10887452B2 (en) 2018-10-25 2021-01-05 Verint Americas Inc. System architecture for fraud detection
IL303147B1 (en) 2019-06-20 2024-05-01 Verint Americas Inc Systems and methods for verification and fraud detection
US11868453B2 (en) 2019-11-07 2024-01-09 Verint Americas Inc. Systems and methods for customer authentication based on audio-of-interest
EP3857544B1 (de) 2019-12-04 2022-06-29 Google LLC Speaker awareness using speaker-dependent speech models

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5131043A (en) * 1983-09-05 1992-07-14 Matsushita Electric Industrial Co., Ltd. Method of and apparatus for speech recognition wherein decisions are made based on phonemes
US5345536A (en) * 1990-12-21 1994-09-06 Matsushita Electric Industrial Co., Ltd. Method of speech recognition
JP2870224B2 (ja) * 1991-06-19 1999-03-17 Matsushita Electric Industrial Co., Ltd. Speech recognition method
NO179421C (no) * 1993-03-26 1996-10-02 Statoil As Apparatus for distributing a stream of injection fluid into separate zones in a ground formation
US5664059A (en) * 1993-04-29 1997-09-02 Panasonic Technologies, Inc. Self-learning speaker adaptation based on spectral variation source decomposition
JP3114468B2 (ja) * 1993-11-25 2000-12-04 Matsushita Electric Industrial Co., Ltd. Speech recognition method
US5684925A (en) * 1995-09-08 1997-11-04 Matsushita Electric Industrial Co., Ltd. Speech representation by feature-based word prototypes comprising phoneme targets having reliable high similarity
US5822728A (en) * 1995-09-08 1998-10-13 Matsushita Electric Industrial Co., Ltd. Multistage word recognizer based on reliably detected phoneme similarity regions
JP3001037B2 (ja) 1995-12-13 2000-01-17 NEC Corporation Speech recognition device
US6026359A (en) * 1996-09-20 2000-02-15 Nippon Telegraph And Telephone Corporation Scheme for model adaptation in pattern recognition based on Taylor expansion

Also Published As

Publication number Publication date
ATE312398T1 (de) 2005-12-15
US20030050780A1 (en) 2003-03-13
EP1262953A2 (de) 2002-12-04
EP1262953B1 (de) 2005-12-07
US6915259B2 (en) 2005-07-05
DE60207784T2 (de) 2006-07-06
DE60207784T9 (de) 2006-12-14
EP1262953A3 (de) 2004-04-07

Similar Documents

Publication Publication Date Title
DE60207784D1 (de) Speaker adaptation for speech recognition
ATE443316T1 (de) Speech recognition system using implicit speaker adaptation
DE60125542D1 (de) System and method for speech recognition with a plurality of speech recognition devices
ATE297588T1 (de) Adaptation of the phonetic context to improve speech recognition
Govindan et al. Adaptive wavelet shrinkage for noise robust speaker recognition
ATE246835T1 (de) Speaker recognition
DE50209455D1 (de) Method for training or adapting a speech recognizer
ATE347162T1 (de) Noise suppression for robust speech recognition
ATE541287T1 (de) Computationally efficient background noise suppressor for speech coding and speech recognition
ATE362632T1 (de) Message transmission device
IL146985A0 (en) Automatic dynamic speech recognition vocabulary based on external sources of information
DE60229095D1 (de) Pronunciations in multiple languages for speech recognition
WO2006023631A3 (en) Document transcription system training
BR0113725A (pt) Combination of DTW and HMM in speaker-dependent and speaker-independent speech recognition modes
EP1103951A3 (de) Adaptive wavelet extraction for speech recognition
ATE465485T1 (de) Improving speech recognition on mobile devices
EP0865032A3 (de) Speech recognizer with noise adaptation
DE502004009294D1 (de) Method for automatic gain adjustment in a hearing aid, and hearing aid
DE502006004136D1 (de) Method and device for noise suppression
DE60002584D1 (de) Use of reference data for speech recognition
ATE363120T1 (de) Audio dialogue system and voice-controlled browsing method
DE60113787D1 (de) Method and device for text input by speech recognition
ATE331279T1 (de) Device for speech enhancement
ATE441918T1 (de) Spoken dialogue method and system
DE60303278D1 (de) Device for improving speech recognition

Legal Events

Date Code Title Description
8364 No opposition during term of opposition
8339 Ceased/non-payment of the annual fee