DE60207784D1 - Speaker adaptation for speech recognition - Google Patents
Speaker adaptation for speech recognition
Info
- Publication number
- DE60207784D1
- Authority
- DE
- Germany
- Prior art keywords
- speaker adaptation
- speech recognition
- domain
- adaptation
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Concepts
Concept | Type | Query match | Sections | Count |
---|---|---|---|---|
adaptation | Effects | 0.000 | title, abstract | 4 |
extraction | Methods | 0.000 | abstract | 1 |
method | Methods | 0.000 | abstract | 1 |
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
Landscapes
- Engineering & Computer Science (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Computational Linguistics (AREA)
- Quality & Reliability (AREA)
- Signal Processing (AREA)
- Artificial Intelligence (AREA)
- Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)
- Noise Elimination (AREA)
- Soundproofing, Sound Blocking, And Sound Damping (AREA)
- Fittings On The Vehicle Exterior For Carrying Loads, And Devices For Holding Or Mounting Articles (AREA)
- Complex Calculations (AREA)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US864838 | 2001-05-24 | | |
US09/864,838, US6915259B2 (en) | 2001-05-24 | 2001-05-24 | Speaker and environment adaptation based on linear separation of variability sources |
Publications (3)
Publication Number | Publication Date |
---|---|
DE60207784D1 (de) | 2006-01-12 |
DE60207784T2 (de) | 2006-07-06 |
DE60207784T9 (de) | 2006-12-14 |
Family
ID=25344185
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE60207784T (Expired - Fee Related), DE60207784T9 (de) | 2001-05-24 | 2002-05-23 | Speaker adaptation for speech recognition |
Country Status (4)
Country | Link |
---|---|
US (1) | US6915259B2 (de) |
EP (1) | EP1262953B1 (de) |
AT (1) | ATE312398T1 (de) |
DE (1) | DE60207784T9 (de) |
Families Citing this family (37)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
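JP2002366187A (ja) * | 2001-06-08 | 2002-12-20 | Sony Corp | Speech recognition apparatus and speech recognition method, and program and recording medium |
CN1453767A (zh) * | 2002-04-26 | 2003-11-05 | 日本先锋公司 | Speech recognition apparatus and speech recognition method |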
US7103540B2 (en) | 2002-05-20 | 2006-09-05 | Microsoft Corporation | Method of pattern recognition using noise reduction uncertainty |
US7107210B2 (en) * | 2002-05-20 | 2006-09-12 | Microsoft Corporation | Method of noise reduction based on dynamic aspects of speech |
US7174292B2 (en) * | 2002-05-20 | 2007-02-06 | Microsoft Corporation | Method of determining uncertainty associated with acoustic distortion-based noise reduction |
US7340396B2 (en) * | 2003-02-18 | 2008-03-04 | Motorola, Inc. | Method and apparatus for providing a speaker adapted speech recognition model set |
US7729909B2 (en) * | 2005-03-04 | 2010-06-01 | Panasonic Corporation | Block-diagonal covariance joint subspace tying and model compensation for noise robust automatic speech recognition |
US7729908B2 (en) * | 2005-03-04 | 2010-06-01 | Panasonic Corporation | Joint signal and model based noise matching noise robustness method for automatic speech recognition |
US9571652B1 (en) | 2005-04-21 | 2017-02-14 | Verint Americas Inc. | Enhanced diarization systems, media and methods of use |
US7877255B2 (en) * | 2006-03-31 | 2011-01-25 | Voice Signal Technologies, Inc. | Speech recognition using channel verification |
EP2022042B1 (de) * | 2006-05-16 | 2010-12-08 | Loquendo S.p.A. | Compensation of inter-session variability for automatic extraction of information from speech |
US8180637B2 (en) * | 2007-12-03 | 2012-05-15 | Microsoft Corporation | High performance HMM adaptation with joint compensation of additive and convolutive distortions |
US8798994B2 (en) * | 2008-02-06 | 2014-08-05 | International Business Machines Corporation | Resource conservative transformation based unsupervised speaker adaptation |
JP5423670B2 (ja) * | 2008-04-30 | 2014-02-19 | 日本電気株式会社 | Acoustic model learning device and speech recognition device |
US9798653B1 (en) * | 2010-05-05 | 2017-10-24 | Nuance Communications, Inc. | Methods, apparatus and data structure for cross-language speech adaptation |
KR20120054845A (ko) * | 2010-11-22 | 2012-05-31 | 삼성전자주식회사 | Speech recognition method for a robot |
GB2493413B (en) | 2011-07-25 | 2013-12-25 | Ibm | Maintaining and supplying speech models |
US8543398B1 (en) | 2012-02-29 | 2013-09-24 | Google Inc. | Training an automatic speech recognition system using compressed word frequencies |
US9984678B2 (en) * | 2012-03-23 | 2018-05-29 | Microsoft Technology Licensing, Llc | Factored transforms for separable adaptation of acoustic models |
US8374865B1 (en) | 2012-04-26 | 2013-02-12 | Google Inc. | Sampling training data for an automatic speech recognition system based on a benchmark classification distribution |
US8805684B1 (en) * | 2012-05-31 | 2014-08-12 | Google Inc. | Distributed speaker adaptation |
US8571859B1 (en) | 2012-05-31 | 2013-10-29 | Google Inc. | Multi-stage speaker adaptation |
US8880398B1 (en) | 2012-07-13 | 2014-11-04 | Google Inc. | Localized speech recognition with offload |
US9368116B2 (en) | 2012-09-07 | 2016-06-14 | Verint Systems Ltd. | Speaker separation in diarization |
US9123333B2 (en) | 2012-09-12 | 2015-09-01 | Google Inc. | Minimum bayesian risk methods for automatic speech recognition |
US10134401B2 (en) | 2012-11-21 | 2018-11-20 | Verint Systems Ltd. | Diarization using linguistic labeling |
JP6000094B2 (ja) * | 2012-12-03 | 2016-09-28 | 日本電信電話株式会社 | Speaker adaptation device, speaker adaptation method, and program |
US9275638B2 (en) | 2013-03-12 | 2016-03-01 | Google Technology Holdings LLC | Method and apparatus for training a voice recognition model database |
US9460722B2 (en) | 2013-07-17 | 2016-10-04 | Verint Systems Ltd. | Blind diarization of recorded calls with arbitrary number of speakers |
US9984706B2 (en) | 2013-08-01 | 2018-05-29 | Verint Systems Ltd. | Voice activity detection using a soft decision mechanism |
US9875742B2 (en) | 2015-01-26 | 2018-01-23 | Verint Systems Ltd. | Word-level blind diarization of recorded calls with arbitrary number of speakers |
US9865256B2 (en) | 2015-02-27 | 2018-01-09 | Storz Endoskop Produktions Gmbh | System and method for calibrating a speech recognition system to an operating environment |
US11538128B2 (en) | 2018-05-14 | 2022-12-27 | Verint Americas Inc. | User interface for fraud alert management |
US10887452B2 (en) | 2018-10-25 | 2021-01-05 | Verint Americas Inc. | System architecture for fraud detection |
IL303147B1 (en) | 2019-06-20 | 2024-05-01 | Verint Americas Inc | Systems and methods for verification and fraud detection |
US11868453B2 (en) | 2019-11-07 | 2024-01-09 | Verint Americas Inc. | Systems and methods for customer authentication based on audio-of-interest |
EP3857544B1 (de) | 2019-12-04 | 2022-06-29 | Google LLC | Speaker awareness using speaker-dependent speech models |
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5131043A (en) * | 1983-09-05 | 1992-07-14 | Matsushita Electric Industrial Co., Ltd. | Method of and apparatus for speech recognition wherein decisions are made based on phonemes |
US5345536A (en) * | 1990-12-21 | 1994-09-06 | Matsushita Electric Industrial Co., Ltd. | Method of speech recognition |
JP2870224B2 (ja) * | 1991-06-19 | 1999-03-17 | 松下電器産業株式会社 | Speech recognition method |
NO179421C (no) * | 1993-03-26 | 1996-10-02 | Statoil As | Apparatus for distributing a stream of injection fluid into separate zones in a ground formation |
US5664059A (en) * | 1993-04-29 | 1997-09-02 | Panasonic Technologies, Inc. | Self-learning speaker adaptation based on spectral variation source decomposition |
JP3114468B2 (ja) * | 1993-11-25 | 2000-12-04 | 松下電器産業株式会社 | Speech recognition method |
US5684925A (en) * | 1995-09-08 | 1997-11-04 | Matsushita Electric Industrial Co., Ltd. | Speech representation by feature-based word prototypes comprising phoneme targets having reliable high similarity |
US5822728A (en) * | 1995-09-08 | 1998-10-13 | Matsushita Electric Industrial Co., Ltd. | Multistage word recognizer based on reliably detected phoneme similarity regions |
JP3001037B2 (ja) | 1995-12-13 | 2000-01-17 | 日本電気株式会社 | Speech recognition device |
US6026359A (en) * | 1996-09-20 | 2000-02-15 | Nippon Telegraph And Telephone Corporation | Scheme for model adaptation in pattern recognition based on Taylor expansion |
2001
- 2001-05-24: US application US09/864,838, patent US6915259B2 (en), not active, Expired - Lifetime
2002
- 2002-05-23: AT application AT02253651T, patent ATE312398T1 (de), not active, IP Right Cessation
- 2002-05-23: EP application EP02253651A, patent EP1262953B1 (de), not active, Expired - Lifetime
- 2002-05-23: DE application DE60207784T, patent DE60207784T9 (de), not active, Expired - Fee Related
Also Published As
Publication number | Publication date |
---|---|
ATE312398T1 (de) | 2005-12-15 |
US20030050780A1 (en) | 2003-03-13 |
EP1262953A2 (de) | 2002-12-04 |
EP1262953B1 (de) | 2005-12-07 |
US6915259B2 (en) | 2005-07-05 |
DE60207784T2 (de) | 2006-07-06 |
DE60207784T9 (de) | 2006-12-14 |
EP1262953A3 (de) | 2004-04-07 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
DE60207784D1 (de) | Speaker adaptation for speech recognition | |
ATE443316T1 (de) | Speech recognition system using implicit speaker adaptation | |
DE60125542D1 (de) | System and method for speech recognition with a plurality of speech recognition devices | |
ATE297588T1 (de) | Adaptation of the phonetic context to improve speech recognition | |
Govindan et al. | Adaptive wavelet shrinkage for noise robust speaker recognition | |
ATE246835T1 (de) | Speaker recognition | |
DE50209455D1 (de) | Method for training or adapting a speech recognizer | |
ATE347162T1 (de) | Noise suppression for robust speech recognition | |
ATE541287T1 (de) | Computationally efficient background noise suppressor for speech coding and speech recognition | |
ATE362632T1 (de) | Message transmission device | |
IL146985A0 (en) | Automatic dynamic speech recognition vocabulary based on external sources of information | |
DE60229095D1 (de) | Pronunciations in multiple languages for speech recognition | |
WO2006023631A3 (en) | Document transcription system training | |
BR0113725A (pt) | Combining DTW and HMM in speaker-dependent and speaker-independent speech recognition modes | |
EP1103951A3 (de) | Adaptive wavelet extraction for speech recognition | |
ATE465485T1 (de) | Improving speech recognition of mobile devices | |
EP0865032A3 (de) | Speech recognizer with noise adaptation | |
DE502004009294D1 (de) | Method for automatic gain adjustment in a hearing aid, and hearing aid | |
DE502006004136D1 (de) | Method and device for noise suppression | |
DE60002584D1 (de) | Use of reference data for speech recognition | |
ATE363120T1 (de) | Audio dialogue system and voice-controlled browsing method | |
DE60113787D1 (de) | Method and device for text input by speech recognition | |
ATE331279T1 (de) | Device for speech enhancement | |
ATE441918T1 (de) | Voice dialogue method and system | |
DE60303278D1 (de) | Device for improving speech recognition | |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 8364 | No opposition during term of opposition | |
| | 8339 | Ceased/non-payment of the annual fee | |