WO2014182453A3 - Method and apparatus for training a voice recognition model database - Google Patents

Publication number
WO2014182453A3
WO2014182453A3 PCT/US2014/035117 US2014035117W WO2014182453A3 WO 2014182453 A3 WO2014182453 A3 WO 2014182453A3 US 2014035117 W US2014035117 W US 2014035117W WO 2014182453 A3 WO2014182453 A3 WO 2014182453A3
Authority
WO
WIPO (PCT)
Prior art keywords
recognition model
model database
noise
voice recognition
voice input
Prior art date
Application number
PCT/US2014/035117
Other languages
French (fr)
Other versions
WO2014182453A2 (en)
Inventor
John R. Meloney
Joel A. Clark
Joseph C. Dwyer
Adrian Schuster
Snehitha Singaraju
Robert A. Zurek
Original Assignee
Motorola Mobility LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US14/094,875 (US9275638B2)
Application filed by Motorola Mobility LLC
Priority to CN201480025758.9A (CN105580071B)
Priority to EP14725344.7A (EP2994907A2)
Publication of WO2014182453A2
Publication of WO2014182453A3

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/20: Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/06: Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063: Training

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • User Interface Of Digital Computer (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An electronic device (102) digitally combines a single voice input with each of a series of noise samples. Each noise sample is taken from a different audio environment (e.g., street noise, babble, interior car noise). The voice input / noise sample combinations are used to train a voice recognition model database (308) without the user (104) having to repeat the voice input in each of the different environments. In one variation, the electronic device (102) transmits the user's voice input to a server (301) that maintains and trains the voice recognition model database (308).
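
The combining step described in the abstract can be illustrated with a short sketch. The abstract does not say how the voice input and noise samples are mixed, so the approach here (looping each noise sample to the length of the utterance and scaling it to a target signal-to-noise ratio before adding it) is an assumption, as are the names `rms`, `mix_at_snr`, `build_training_set`, and the `snr_db` parameter. Signals are plain lists of float samples rather than real recordings.

```python
import math
from typing import Dict, List, Sequence


def rms(signal: Sequence[float]) -> float:
    """Root-mean-square level of a signal (0.0 for an empty signal)."""
    if not signal:
        return 0.0
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def mix_at_snr(voice: Sequence[float], noise: Sequence[float],
               snr_db: float) -> List[float]:
    """Add noise to voice, scaling the noise to the requested SNR in dB.

    Assumption: additive mixing at a fixed SNR; the patent abstract only
    says the signals are "digitally combined".
    """
    if not voice or not noise:
        raise ValueError("voice and noise must be non-empty")
    # Loop (or truncate) the noise so it covers the whole utterance.
    tiled = [noise[i % len(noise)] for i in range(len(voice))]
    noise_rms = rms(tiled)
    if noise_rms == 0.0:
        return list(voice)  # silent noise sample: nothing to add
    # Target noise RMS is voice RMS divided by 10^(SNR/20).
    gain = rms(voice) / (noise_rms * 10 ** (snr_db / 20.0))
    return [v + gain * n for v, n in zip(voice, tiled)]


def build_training_set(voice: Sequence[float],
                       noise_by_env: Dict[str, Sequence[float]],
                       snr_db: float = 10.0) -> Dict[str, List[float]]:
    """Combine one voice input with the noise sample of each environment."""
    return {env: mix_at_snr(voice, noise, snr_db)
            for env, noise in noise_by_env.items()}


# Usage with synthetic stand-ins for a recorded utterance and noise samples.
voice = [math.sin(2 * math.pi * 5 * t / 100) for t in range(100)]
noises = {
    "street": [(-1) ** t * 0.5 for t in range(40)],
    "car": [math.sin(2 * math.pi * 3 * t / 50) for t in range(50)],
}
training = build_training_set(voice, noises, snr_db=10.0)
```

Each entry of `training` is one voice input / noise sample combination that could then be passed to whatever model-training routine maintains the database; that routine is outside the scope of this sketch.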

Application PCT/US2014/035117 (priority date 2013-05-06, filing date 2014-04-23): Method and apparatus for training a voice recognition model database, WO2014182453A2 (en)

Priority Applications (2)

Application Number | Priority Date | Filing Date | Title
CN201480025758.9A, CN105580071B (en) | 2013-05-06 | 2014-04-23 | Method and apparatus for training a voice recognition model database
EP14725344.7A, EP2994907A2 (en) | 2013-05-06 | 2014-04-23 | Method and apparatus for training a voice recognition model database

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US201361819985P 2013-05-06 2013-05-06
US61/819,985 2013-05-06
US14/094,875 US9275638B2 (en) 2013-03-12 2013-12-03 Method and apparatus for training a voice recognition model database
US14/094,875 2013-12-03

Publications (2)

Publication Number | Publication Date
WO2014182453A2 (en) | 2014-11-13
WO2014182453A3 (en) | 2014-12-31

Family

ID=51867838

Family Applications (1)

Application Number | Priority Date | Filing Date | Title
PCT/US2014/035117, WO2014182453A2 (en) | 2013-05-06 | 2014-04-23 | Method and apparatus for training a voice recognition model database

Country Status (3)

Country | Link
EP (1) | EP2994907A2 (en)
CN (1) | CN105580071B (en)
WO (1) | WO2014182453A2 (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN110232909A (en) * | 2018-03-02 | 2019-09-13 | 北京搜狗科技发展有限公司 | Audio processing method, apparatus, device, and readable storage medium
CN109192216A (en) * | 2018-08-08 | 2019-01-11 | 联智科技(天津)有限责任公司 | Voiceprint recognition training dataset simulation acquisition method and acquisition device
KR20200033707A (en) * | 2018-09-20 | 2020-03-30 | 삼성전자주식회사 | Electronic device, and method of providing or obtaining data for training thereof
CN109545196B (en) * | 2018-12-29 | 2022-11-29 | 深圳市科迈爱康科技有限公司 | Speech recognition method, device and computer readable storage medium
CN109545195B (en) * | 2018-12-29 | 2023-02-21 | 深圳市科迈爱康科技有限公司 | Accompanying robot and control method thereof
CN110544469B (en) * | 2019-09-04 | 2022-04-19 | 秒针信息技术有限公司 | Training method and device of voice recognition model, storage medium and electronic device
CN110808030B (en) * | 2019-11-22 | 2021-01-22 | 珠海格力电器股份有限公司 | Voice awakening method, system, storage medium and electronic equipment
CN111128141B (en) * | 2019-12-31 | 2022-04-19 | 思必驰科技股份有限公司 | Audio identification decoding method and device
CN111369979B (en) * | 2020-02-26 | 2023-12-19 | 广州市百果园信息技术有限公司 | Training sample acquisition method, device, equipment and computer storage medium
CN113099353A (en) * | 2021-04-21 | 2021-07-09 | 浙江吉利控股集团有限公司 | Integrated microphone, safety belt, steering wheel and vehicle for vehicle

Citations (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
EP1199708A2 (en) * | 2000-10-16 | 2002-04-24 | Microsoft Corporation | Noise robust pattern recognition

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
JP4590692B2 (en) * | 2000-06-28 | 2010-12-01 | パナソニック株式会社 | Acoustic model creation apparatus and method
US6556971B1 (en) * | 2000-09-01 | 2003-04-29 | Snap-On Technologies, Inc. | Computer-implemented speech recognition system training
US6889189B2 (en) * | 2003-09-26 | 2005-05-03 | Matsushita Electric Industrial Co., Ltd. | Speech recognizer performance in car and home applications utilizing novel multiple microphone configurations
US20060149693A1 (en) * | 2005-01-04 | 2006-07-06 | Isao Otsuka | Enhanced classification using training data refinement and classifier updating
US8762143B2 (en) * | 2007-05-29 | 2014-06-24 | At&T Intellectual Property Ii, L.P. | Method and apparatus for identifying acoustic background environments based on time and speed to enhance automatic speech recognition
US8234111B2 (en) * | 2010-06-14 | 2012-07-31 | Google Inc. | Speech and noise models for speech recognition
TWI442384B (en) * | 2011-07-26 | 2014-06-21 | Ind Tech Res Inst | Microphone-array-based speech recognition system and method
CN102426837B (en) * | 2011-12-30 | 2013-10-16 | 中国农业科学院农业信息研究所 | Robustness method used for voice recognition on mobile equipment during agricultural field data acquisition

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
AKIRA SASOU ET AL: "Noise Robust Speech Recognition Applied to Voice-Driven Wheelchair", EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, vol. 20, no. 3, 1 January 2009 (2009-01-01), pages 1 - 9, XP055132340, ISSN: 1687-6180, DOI: 10.1016/j.specom.2006.03.002 *
JI MING ET AL: "Robust Speaker Recognition in Noisy Conditions", IEEE TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING, IEEE SERVICE CENTER, NEW YORK, NY, USA, vol. 15, no. 5, 1 July 2007 (2007-07-01), pages 1711 - 1723, XP011185748, ISSN: 1558-7916, DOI: 10.1109/TASL.2007.899278 *
PEI DING ET AL: "Robust mandarin speech recognition in car environments for embedded navigation system", IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, IEEE SERVICE CENTER, NEW YORK, NY, US, vol. 54, no. 2, 1 May 2008 (2008-05-01), pages 584 - 590, XP011229939, ISSN: 0098-3063, DOI: 10.1109/TCE.2008.4560134 *

Also Published As

Publication number | Publication date
WO2014182453A2 (en) | 2014-11-13
CN105580071B (en) | 2020-08-21
EP2994907A2 (en) | 2016-03-16
CN105580071A (en) | 2016-05-11

Similar Documents

Publication Publication Date Title
WO2014182453A3 (en) Method and apparatus for training a voice recognition model database
EP2781883A3 (en) Method and apparatus for optimizing timing of audio commands based on recognized audio patterns
WO2014140816A3 (en) Apparatus and method for performing actions based on captured image data
EP3751561A3 (en) Hotword recognition
WO2014105357A3 (en) Systems and methods for data entry in a non-destructive testing system
EP2806425A3 (en) System and method for speaker verification
WO2014022659A3 (en) Isolating vowel sounds for assessment
WO2009111721A3 (en) Voice recognition grammar selection based on context
EP3968179A4 (en) Place recognition method and apparatus, model training method and apparatus for place recognition, and electronic device
EP2787449A3 (en) Text data processing method and corresponding electronic device
EP2846226A3 (en) Method and system for providing haptic effects based on information complementary to multimedia content
WO2014105359A3 (en) Voice inspection guidance
WO2014172781A8 (en) Electronic dental charting
EP2963643A3 (en) Entity name recognition
EP2339576A3 (en) Multi-modal input on an electronic device
WO2011011413A8 (en) Method and apparatus for evaluation of a subject's emotional, physiological and/or physical state with the subject's physiological and/or acoustic data
SG196783A1 (en) Systems and methods for analyzing learner's roles and performance and for intelligently adapting the delivery of education
EP2385520A3 (en) Method and device for generating text from spoken word
ATE506890T1 (en) DEVICE AND METHOD FOR PREDICTING LOSS OF CONTROL OVER A MUSCLE
EP3414758A4 (en) Method and electronic device for performing voice based actions
WO2012057588A3 (en) Apparatus and method for diagnosing learning ability
MX2019003927A (en) Analysis system and method for testing a sample.
WO2014052326A3 (en) Apparatus and methods for managing resources for a system using voice recognition
EP3534155A3 (en) Methods and apparatus to separate biological entities
UA113173C2 (en) SYSTEM AND METHOD OF RECOGNITION OF THE CONTENT OF THE SPEECH PROGRAM

Legal Events

Date | Code | Title | Description
(not listed) | WWE | WIPO information: entry into national phase | Ref document number: 201480025758.9; Country of ref document: CN
(not listed) | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 14725344; Country of ref document: EP; Kind code of ref document: A2
(not listed) | WWE | WIPO information: entry into national phase | Ref document number: 2014725344; Country of ref document: EP