MX346294B - Method and system for recognizing speech commands. - Google Patents

Method and system for recognizing speech commands.

Info

Publication number
MX346294B
Authority
MX
Mexico
Prior art keywords
sound sample
acoustic model
foreground
sound
sample
Prior art date
Application number
MX2015009812A
Other languages
English (en)
Other versions
MX2015009812A (es)
Inventor
Jian Liu
Xiang Zhang
Li Lu
Shuai Yue
Haibo Liu
Bo Chen
Dadong Xie
Original Assignee
Tencent Tech Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Tech Shenzhen Co Ltd
Publication of MX2015009812A
Publication of MX346294B

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/14 Speech classification or search using statistical models, e.g. Hidden Markov Models [HMMs]
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/083 Recognition networks
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/32 Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L2015/088 Word spotting
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Probability & Statistics with Applications (AREA)
  • User Interface Of Digital Computer (AREA)
  • Selective Calling Equipment (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

A method of recognizing speech commands includes generating a background acoustic model for a sound using a first sound sample, the background acoustic model being characterized by a first precision metric. A foreground acoustic model is generated for the sound using a second sound sample, the foreground acoustic model being characterized by a second precision metric. A third sound sample is received and decoded by assigning it a weight corresponding to the probability that the sound sample originated in the foreground, using the foreground acoustic model and the background acoustic model. The method further includes determining whether the weight meets predefined criteria for assigning the third sound sample to the foreground and, when the weight meets the predefined criteria, interpreting the third sound sample as a portion of a speech command. Otherwise, the third sound sample is not recognized as a portion of a speech command.
MX2015009812A 2013-01-30 2013-11-21 Method and system for recognizing speech commands. MX346294B (es)
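
The decision rule summarized in the abstract can be illustrated with a small sketch. This is not the patented implementation: the abstract does not say what form the acoustic models take, so the sketch assumes each model is a single diagonal-covariance Gaussian over feature vectors, assumes the weight is the posterior probability of the foreground under a fixed prior, and assumes the predefined criteria reduce to one threshold. The names `DiagonalGaussian`, `foreground_weight`, `is_command_portion`, the prior of 0.5, the threshold of 0.8, and the sample data are all hypothetical. The two precision metrics mentioned in the abstract are not modeled.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class DiagonalGaussian:
    """Toy stand-in for an acoustic model (assumption, not from the patent)."""
    mean: List[float]
    var: List[float]

    @classmethod
    def fit(cls, samples: Sequence[Sequence[float]],
            var_floor: float = 1e-3) -> "DiagonalGaussian":
        # Maximum-likelihood mean and variance per dimension, with a floor
        # so that a variance of zero cannot cause a division by zero later.
        n = len(samples)
        dim = len(samples[0])
        mean = [sum(s[d] for s in samples) / n for d in range(dim)]
        var = [
            max(sum((s[d] - mean[d]) ** 2 for s in samples) / n, var_floor)
            for d in range(dim)
        ]
        return cls(mean, var)

    def log_likelihood(self, x: Sequence[float]) -> float:
        # Sum of per-dimension Gaussian log densities.
        return sum(
            -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
            for xi, m, v in zip(x, self.mean, self.var)
        )


def foreground_weight(x: Sequence[float], foreground: DiagonalGaussian,
                      background: DiagonalGaussian,
                      prior_fg: float = 0.5) -> float:
    """Posterior probability that x originated in the foreground."""
    log_odds = (
        foreground.log_likelihood(x) + math.log(prior_fg)
        - background.log_likelihood(x) - math.log(1.0 - prior_fg)
    )
    # Numerically stable logistic function of the log-odds.
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)


def is_command_portion(x: Sequence[float], foreground: DiagonalGaussian,
                       background: DiagonalGaussian,
                       threshold: float = 0.8) -> bool:
    # Hypothetical criterion: accept when the weight reaches the threshold.
    return foreground_weight(x, foreground, background) >= threshold


# Made-up two-dimensional feature vectors standing in for the first
# (background) and second (foreground) sound samples.
background = DiagonalGaussian.fit(
    [[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0], [0.1, 0.2]])
foreground = DiagonalGaussian.fit(
    [[3.0, 2.9], [3.2, 3.1], [2.8, 3.0], [3.1, 3.2]])

for third_sample in ([3.0, 3.0], [0.0, 0.0]):
    accepted = is_command_portion(third_sample, foreground, background)
    print(third_sample, "command portion" if accepted else "ignored")
```

In a real recognizer the models would more plausibly be Gaussian mixtures or HMM states, as the G10L15/14 classification suggests, and the weight would be accumulated over a decoding network rather than computed per vector.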

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201310035979.1A CN103971685B (zh) 2013-01-30 Speech command recognition method and system
PCT/CN2013/085738 WO2014117544A1 (en) 2013-01-30 2013-11-21 Method and system for recognizing speech commands

Publications (2)

Publication Number Publication Date
MX2015009812A MX2015009812A (es) 2015-10-29
MX346294B MX346294B (es) 2017-03-13

Family

ID=51241103

Family Applications (1)

Application Number Title Priority Date Filing Date
MX2015009812A MX346294B (es) 2013-01-30 2013-11-21 Method and system for recognizing speech commands.

Country Status (7)

Country Link
US (1) US9805715B2 (es)
CN (1) CN103971685B (es)
AR (1) AR094604A1 (es)
CA (1) CA2897365C (es)
MX (1) MX346294B (es)
SG (1) SG11201505403SA (es)
WO (1) WO2014117544A1 (es)

Families Citing this family (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104143326B (zh) * 2013-12-03 2016-11-02 腾讯科技(深圳)有限公司 一种语音命令识别方法和装置
CN105374352B (zh) * 2014-08-22 2019-06-18 中国科学院声学研究所 一种语音激活方法及系统
US9318107B1 (en) * 2014-10-09 2016-04-19 Google Inc. Hotword detection on multiple devices
CN105843811B (zh) * 2015-01-13 2019-12-06 华为技术有限公司 转换文本的方法和设备
JP6227209B2 (ja) * 2015-09-09 2017-11-08 三菱電機株式会社 車載用音声認識装置および車載機器
CN106940998B (zh) * 2015-12-31 2021-04-16 阿里巴巴集团控股有限公司 一种设定操作的执行方法及装置
CN105869624B (zh) * 2016-03-29 2019-05-10 腾讯科技(深圳)有限公司 数字语音识别中语音解码网络的构建方法及装置
JP2018013590A (ja) * 2016-07-20 2018-01-25 株式会社東芝 生成装置、認識システム、有限状態トランスデューサの生成方法、および、データ
CN106409294B (zh) * 2016-10-18 2019-07-16 广州视源电子科技股份有限公司 防止语音命令误识别的方法和装置
TWI643123B (zh) * 2017-05-02 2018-12-01 瑞昱半導體股份有限公司 具有語音喚醒功能的電子裝置及其操作方法
CN112802459B (zh) * 2017-05-23 2024-06-18 创新先进技术有限公司 一种基于语音识别的咨询业务处理方法及装置
CN107680582B (zh) * 2017-07-28 2021-03-26 平安科技(深圳)有限公司 声学模型训练方法、语音识别方法、装置、设备及介质
US10204624B1 (en) * 2017-08-14 2019-02-12 Lenovo (Singapore) Pte. Ltd. False positive wake word
CN107644638B (zh) * 2017-10-17 2019-01-04 北京智能管家科技有限公司 语音识别方法、装置、终端和计算机可读存储介质
CN107919130B (zh) * 2017-11-06 2021-12-17 百度在线网络技术(北京)有限公司 基于云端的语音处理方法和装置
CN108257596B (zh) * 2017-12-22 2021-07-23 北京小蓦机器人技术有限公司 一种用于提供目标呈现信息的方法与设备
CN110782898B (zh) * 2018-07-12 2024-01-09 北京搜狗科技发展有限公司 端到端语音唤醒方法、装置及计算机设备
US11308939B1 (en) * 2018-09-25 2022-04-19 Amazon Technologies, Inc. Wakeword detection using multi-word model
CN109273007B (zh) * 2018-10-11 2022-05-17 西安讯飞超脑信息科技有限公司 语音唤醒方法及装置
CN110148403B (zh) * 2019-05-21 2021-04-13 腾讯科技(深圳)有限公司 解码网络生成方法、语音识别方法、装置、设备及介质
CN111179974B (zh) * 2019-12-30 2022-08-09 思必驰科技股份有限公司 一种命令词识别方法和装置
CN113192493B (zh) * 2020-04-29 2022-06-14 浙江大学 一种结合GMM Token配比与聚类的核心训练语音选择方法
US11521643B2 (en) * 2020-05-08 2022-12-06 Bose Corporation Wearable audio device with user own-voice recording
CN114944155B (zh) * 2021-02-14 2024-06-04 成都启英泰伦科技有限公司 一种终端硬件和算法软件处理相结合的离线语音识别方法
US11538461B1 (en) * 2021-03-18 2022-12-27 Amazon Technologies, Inc. Language agnostic missing subtitle detection
CN117198271A (zh) * 2023-10-10 2023-12-08 美的集团(上海)有限公司 语音解析方法及装置、智能设备、介质和计算机程序产品

Family Cites Families (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6098040A (en) * 1997-11-07 2000-08-01 Nortel Networks Corporation Method and apparatus for providing an improved feature set in speech recognition by performing noise cancellation and background masking
US6539353B1 (en) * 1999-10-12 2003-03-25 Microsoft Corporation Confidence measures using sub-word-dependent weighting of sub-word confidence scores for robust speech recognition
JP4590692B2 (ja) * 2000-06-28 2010-12-01 パナソニック株式会社 音響モデル作成装置及びその方法
US20020087306A1 (en) * 2000-12-29 2002-07-04 Lee Victor Wai Leung Computer-implemented noise normalization method and system
US20030004720A1 (en) * 2001-01-30 2003-01-02 Harinath Garudadri System and method for computing and transmitting parameters in a distributed voice recognition system
US6985862B2 (en) * 2001-03-22 2006-01-10 Tellme Networks, Inc. Histogram grammar weighting and error corrective training of grammar weights
US7587321B2 (en) * 2001-05-08 2009-09-08 Intel Corporation Method, apparatus, and system for building context dependent models for a large vocabulary continuous speech recognition (LVCSR) system
US7236929B2 (en) * 2001-05-09 2007-06-26 Plantronics, Inc. Echo suppression and speech detection techniques for telephony applications
JP2003308091A (ja) * 2002-04-17 2003-10-31 Pioneer Electronic Corp 音声認識装置、音声認識方法および音声認識プログラム
DE60212725T2 (de) * 2002-08-01 2007-06-28 Telefonaktiebolaget Lm Ericsson (Publ) Verfahren zur automatischen spracherkennung
TWI223792B (en) * 2003-04-04 2004-11-11 Penpower Technology Ltd Speech model training method applied in speech recognition
US7480615B2 (en) * 2004-01-20 2009-01-20 Microsoft Corporation Method of speech recognition using multimodal variational inference with switching state space models
JP4541781B2 (ja) * 2004-06-29 2010-09-08 キヤノン株式会社 音声認識装置および方法
CN1300763C (zh) * 2004-09-29 2007-02-14 上海交通大学 嵌入式语音识别系统的自动语音识别处理方法
KR100745976B1 (ko) * 2005-01-12 2007-08-06 삼성전자주식회사 음향 모델을 이용한 음성과 비음성의 구분 방법 및 장치
JP4667082B2 (ja) * 2005-03-09 2011-04-06 キヤノン株式会社 音声認識方法
US20060241937A1 (en) * 2005-04-21 2006-10-26 Ma Changxue C Method and apparatus for automatically discriminating information bearing audio segments and background noise audio segments
US20060247927A1 (en) * 2005-04-29 2006-11-02 Robbins Kenneth L Controlling an output while receiving a user input
US8014536B2 (en) * 2005-12-02 2011-09-06 Golden Metallic, Inc. Audio source separation based on flexible pre-trained probabilistic source models
KR100679051B1 (ko) * 2005-12-14 2007-02-05 삼성전자주식회사 복수의 신뢰도 측정 알고리즘을 이용한 음성 인식 장치 및방법
JP4949687B2 (ja) * 2006-01-25 2012-06-13 ソニー株式会社 ビート抽出装置及びビート抽出方法
US7890325B2 (en) * 2006-03-16 2011-02-15 Microsoft Corporation Subword unit posterior probability for measuring confidence
ATE508452T1 (de) * 2007-11-12 2011-05-15 Harman Becker Automotive Sys Unterscheidung zwischen vordergrundsprache und hintergrundgeräuschen
EP2148325B1 (en) * 2008-07-22 2014-10-01 Nuance Communications, Inc. Method for determining the presence of a wanted signal component
US8600073B2 (en) * 2009-11-04 2013-12-03 Cambridge Silicon Radio Limited Wind noise suppression
US9443511B2 (en) * 2011-03-04 2016-09-13 Qualcomm Incorporated System and method for recognizing environmental sound

Also Published As

Publication number Publication date
CN103971685A (zh) 2014-08-06
SG11201505403SA (en) 2015-08-28
CA2897365A1 (en) 2014-08-07
CN103971685B (zh) 2015-06-10
AR094604A1 (es) 2015-08-12
MX2015009812A (es) 2015-10-29
CA2897365C (en) 2018-10-02
US20140214416A1 (en) 2014-07-31
US9805715B2 (en) 2017-10-31
WO2014117544A1 (en) 2014-08-07

Similar Documents

Publication Publication Date Title
MX2015009812A (es) Method and system for recognizing speech commands.
AU2019268131A1 (en) Speech recognition method, speech wakeup apparatus, speech recognition apparatus, and terminal
MY179900A (en) Speech recognition method and speech recognition apparatus
GB2566215A (en) Voice user interface
WO2015009586A3 (en) Performing an operation relative to tabular data based upon voice input
GB2551917A (en) Privacy-preserving training corpus selection
GB2536836A (en) Voice command triggered speech enhancement
EP4239628A3 (en) Determining hotword suitability
MX2016013015A (es) Métodos y sistemas de administrar un dialogo con un robot.
GB201212783D0 (en) A speech processing system
EP3767622A3 (en) Automatically determining language for speech recognition of spoken utterance received via an automated assistant interface
WO2018038385A3 (ko) 음성 인식 방법 및 이를 수행하는 전자 장치
NZ725145A (en) Methods and systems for managing dialogs of a robot
MX2017001121A (es) Reconocimiento del habla en base a acustica y a dominio para vehiculos.
EP3751561A3 (en) Hotword recognition
WO2014115115A3 (en) Determining apnea-hypopnia index ahi from speech
EP3384488A4 (en) SYSTEM AND METHOD FOR IMPLEMENTING A VOICE USER INTERFACE BY COMBINING A SPEECH-TEXT SYSTEM AND A SPEECH-INTENTION SYSTEM
CL2015002362A1 (es) Método y sistema para el control de un dispositivo de recepción de usuario por el uso de comando de voz
SG10201900178WA (en) Speech transaction processing
WO2014124332A3 (en) Voice trigger for a digital assistant
MX2014010795A (es) Dispositivo para extraer informacion a partir de un dialogo.
MX2017003754A (es) Mirada para entendimiento de lenguaje por voz en interacciones de conversacion multimodal.
IN2014DN09942A (es)
EP3349125A4 (en) Language model generation device, language model generation method and program therefor, voice recognition device, and voice recognition method and program therefor
WO2013134641A3 (en) Recognizing speech in multiple languages

Legal Events

Date Code Title Description
HH Correction or change in general
FG Grant or registration