WO2014182453A3 - Method and apparatus for training a voice recognition model database
- Publication number
- WO2014182453A3 (application PCT/US2014/035117)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- recognition model
- model database
- noise
- voice recognition
- voice input
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/20—Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Artificial Intelligence (AREA)
- User Interface Of Digital Computer (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
An electronic device (102) digitally combines a single voice input with each of a series of noise samples. Each noise sample is taken from a different audio environment (e.g., street noise, babble, interior car noise). The voice input / noise sample combinations are used to train a voice recognition model database (308) without the user (104) having to repeat the voice input in each of the different environments. In one variation, the electronic device (102) transmits the user's voice input to a server (301) that maintains and trains the voice recognition model database (308).
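As a rough illustration of the combining step described in the abstract, the following Python sketch mixes one voice recording with several noise samples, yielding one training example per audio environment. The function names (`mix`, `build_training_set`), the `noise_gain` parameter, the looping of short noise clips, and the synthetic signals are all assumptions made for illustration; the publication does not specify how the signals are combined or scaled.

```python
import math
import random


def mix(voice, noise, noise_gain=0.5):
    """Add a noise sample to a voice signal, sample by sample.

    The noise is looped if it is shorter than the voice signal
    (an assumption; the source does not say how lengths are matched).
    """
    if not noise:
        raise ValueError("noise sample must not be empty")
    return [v + noise_gain * noise[i % len(noise)] for i, v in enumerate(voice)]


def build_training_set(voice, noise_samples, noise_gain=0.5):
    """Return one (environment, mixed_signal) pair per noise environment."""
    return [
        (env, mix(voice, noise, noise_gain))
        for env, noise in noise_samples.items()
    ]


# Synthetic stand-ins: a 220 Hz tone for the single voice input and
# random noise for each environment (hypothetical data, not real recordings).
rng = random.Random(0)
voice = [math.sin(2 * math.pi * 220 * t / 8000) for t in range(8000)]
noise_samples = {
    "street": [rng.uniform(-1, 1) for _ in range(4000)],
    "babble": [rng.uniform(-1, 1) for _ in range(8000)],
    "car_interior": [rng.uniform(-1, 1) for _ in range(12000)],
}

# One utterance becomes three training examples, one per environment,
# without the user repeating the utterance in each place.
training_set = build_training_set(voice, noise_samples)
```

Each entry in `training_set` could then be passed to whatever routine trains the model database. A real system would more likely scale the noise to a target signal-to-noise ratio rather than use a fixed gain.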
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201480025758.9A CN105580071B (en) | 2013-05-06 | 2014-04-23 | Method and apparatus for training a voice recognition model database |
EP14725344.7A EP2994907A2 (en) | 2013-05-06 | 2014-04-23 | Method and apparatus for training a voice recognition model database |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201361819985P | 2013-05-06 | 2013-05-06 | |
US61/819,985 | 2013-05-06 | ||
US14/094,875 US9275638B2 (en) | 2013-03-12 | 2013-12-03 | Method and apparatus for training a voice recognition model database |
US14/094,875 | 2013-12-03 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2014182453A2 (en) | 2014-11-13 |
WO2014182453A3 (en) | 2014-12-31 |
Family
ID=51867838
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2014/035117 WO2014182453A2 (en) | 2013-05-06 | 2014-04-23 | Method and apparatus for training a voice recognition model database |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP2994907A2 (en) |
CN (1) | CN105580071B (en) |
WO (1) | WO2014182453A2 (en) |
Families Citing this family (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110232909A (en) * | 2018-03-02 | 2019-09-13 | 北京搜狗科技发展有限公司 | Audio processing method, apparatus, device, and readable storage medium |
CN109192216A (en) * | 2018-08-08 | 2019-01-11 | 联智科技(天津)有限责任公司 | Simulated acquisition method and acquisition device for a voiceprint recognition training dataset |
KR20200033707A (en) * | 2018-09-20 | 2020-03-30 | 삼성전자주식회사 | Electronic device, and Method of providing or obtaining data for training thereof |
CN109545196B (en) * | 2018-12-29 | 2022-11-29 | 深圳市科迈爱康科技有限公司 | Speech recognition method, device and computer readable storage medium |
CN109545195B (en) * | 2018-12-29 | 2023-02-21 | 深圳市科迈爱康科技有限公司 | Accompanying robot and control method thereof |
CN110544469B (en) * | 2019-09-04 | 2022-04-19 | 秒针信息技术有限公司 | Training method and device of voice recognition model, storage medium and electronic device |
CN110808030B (en) * | 2019-11-22 | 2021-01-22 | 珠海格力电器股份有限公司 | Voice awakening method, system, storage medium and electronic equipment |
CN111128141B (en) * | 2019-12-31 | 2022-04-19 | 思必驰科技股份有限公司 | Audio identification decoding method and device |
CN111369979B (en) * | 2020-02-26 | 2023-12-19 | 广州市百果园信息技术有限公司 | Training sample acquisition method, device, equipment and computer storage medium |
CN113099353A (en) * | 2021-04-21 | 2021-07-09 | 浙江吉利控股集团有限公司 | Integrated microphone, safety belt, steering wheel and vehicle for vehicle |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1199708A2 (en) * | 2000-10-16 | 2002-04-24 | Microsoft Corporation | Noise robust pattern recognition |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP4590692B2 (en) * | 2000-06-28 | 2010-12-01 | パナソニック株式会社 | Acoustic model creation apparatus and method |
US6556971B1 (en) * | 2000-09-01 | 2003-04-29 | Snap-On Technologies, Inc. | Computer-implemented speech recognition system training |
US6889189B2 (en) * | 2003-09-26 | 2005-05-03 | Matsushita Electric Industrial Co., Ltd. | Speech recognizer performance in car and home applications utilizing novel multiple microphone configurations |
US20060149693A1 (en) * | 2005-01-04 | 2006-07-06 | Isao Otsuka | Enhanced classification using training data refinement and classifier updating |
US8762143B2 (en) * | 2007-05-29 | 2014-06-24 | At&T Intellectual Property Ii, L.P. | Method and apparatus for identifying acoustic background environments based on time and speed to enhance automatic speech recognition |
US8234111B2 (en) * | 2010-06-14 | 2012-07-31 | Google Inc. | Speech and noise models for speech recognition |
TWI442384B (en) * | 2011-07-26 | 2014-06-21 | Ind Tech Res Inst | Microphone-array-based speech recognition system and method |
CN102426837B (en) * | 2011-12-30 | 2013-10-16 | 中国农业科学院农业信息研究所 | Robustness method used for voice recognition on mobile equipment during agricultural field data acquisition |
2014
- 2014-04-23 WO PCT/US2014/035117 patent/WO2014182453A2/en active Application Filing
- 2014-04-23 EP EP14725344.7A patent/EP2994907A2/en not_active Withdrawn
- 2014-04-23 CN CN201480025758.9A patent/CN105580071B/en active Active
Non-Patent Citations (3)
Title |
---|
AKIRA SASOU ET AL: "Noise Robust Speech Recognition Applied to Voice-Driven Wheelchair", EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, vol. 20, no. 3, 1 January 2009 (2009-01-01), pages 1 - 9, XP055132340, ISSN: 1687-6180, DOI: 10.1016/j.specom.2006.03.002 * |
JI MING ET AL: "Robust Speaker Recognition in Noisy Conditions", IEEE TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING, IEEE SERVICE CENTER, NEW YORK, NY, USA, vol. 15, no. 5, 1 July 2007 (2007-07-01), pages 1711 - 1723, XP011185748, ISSN: 1558-7916, DOI: 10.1109/TASL.2007.899278 * |
PEI DING ET AL: "Robust mandarin speech recognition in car environments for embedded navigation system", IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, IEEE SERVICE CENTER, NEW YORK, NY, US, vol. 54, no. 2, 1 May 2008 (2008-05-01), pages 584 - 590, XP011229939, ISSN: 0098-3063, DOI: 10.1109/TCE.2008.4560134 * |
Also Published As
Publication number | Publication date |
---|---|
WO2014182453A2 (en) | 2014-11-13 |
CN105580071B (en) | 2020-08-21 |
EP2994907A2 (en) | 2016-03-16 |
CN105580071A (en) | 2016-05-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2014182453A3 (en) | Method and apparatus for training a voice recognition model database | |
EP2781883A3 (en) | Method and apparatus for optimizing timing of audio commands based on recognized audio patterns | |
WO2014140816A3 (en) | Apparatus and method for performing actions based on captured image data | |
EP3751561A3 (en) | Hotword recognition | |
WO2014105357A3 (en) | Systems and methods for data entry in a non-destructive testing system | |
EP2806425A3 (en) | System and method for speaker verification | |
WO2014022659A3 (en) | Isolating vowel sounds for assessment | |
WO2009111721A3 (en) | Voice recognition grammar selection based on context | |
EP3968179A4 (en) | Place recognition method and apparatus, model training method and apparatus for place recognition, and electronic device | |
EP2787449A3 (en) | Text data processing method and corresponding electronic device | |
EP2846226A3 (en) | Method and system for providing haptic effects based on information complementary to multimedia content | |
WO2014105359A3 (en) | Voice inspection guidance | |
WO2014172781A8 (en) | Electronic dental charting | |
EP2963643A3 (en) | Entity name recognition | |
EP2339576A3 (en) | Multi-modal input on an electronic device | |
WO2011011413A8 (en) | Method and apparatus for evaluation of a subject's emotional, physiological and/or physical state with the subject's physiological and/or acoustic data | |
SG196783A1 (en) | Systems and methods for analyzing learner's roles and performance and for intelligently adapting the delivery of education | |
EP2385520A3 (en) | Method and device for generating text from spoken word | |
ATE506890T1 (en) | DEVICE AND METHOD FOR PREDICTING LOSS OF CONTROL OVER A MUSCLE | |
EP3414758A4 (en) | Method and electronic device for performing voice based actions | |
WO2012057588A3 (en) | Apparatus and method for diagnosing learning ability | |
MX2019003927A (en) | Analysis system and method for testing a sample. | |
WO2014052326A3 (en) | Apparatus and methods for managing resources for a system using voice recognition | |
EP3534155A3 (en) | Methods and apparatus to separate biological entities | |
UA113173C2 (en) | SYSTEM AND METHOD OF RECOGNITION OF THE CONTENT OF THE SPEECH PROGRAM |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
WWE | WIPO information: entry into national phase | Ref document number: 201480025758.9; Country of ref document: CN |
121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 14725344; Country of ref document: EP; Kind code of ref document: A2 |
WWE | WIPO information: entry into national phase | Ref document number: 2014725344; Country of ref document: EP |