MX346294B - Method and system for recognizing speech commands - Google Patents
Method and system for recognizing speech commands
- Publication number
- MX346294B (application MX2015009812A)
- Authority
- MX
- Mexico
- Prior art keywords
- sound sample
- acoustic model
- foreground
- sound
- sample
- Prior art date
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/14—Speech classification or search using statistical models, e.g. Hidden Markov Models [HMMs]
- G10L15/083—Recognition networks
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
- G10L15/28—Constructional details of speech recognition systems
- G10L15/32—Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems
- G10L2015/088—Word spotting
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Probability & Statistics with Applications (AREA)
- User Interface Of Digital Computer (AREA)
- Selective Calling Equipment (AREA)
- Electrically Operated Instructional Devices (AREA)
Abstract
A method of recognizing speech commands includes generating a background acoustic model for a sound using a first sound sample, the background acoustic model being characterized by a first precision metric. A foreground acoustic model is generated for the sound using a second sound sample, the foreground acoustic model being characterized by a second precision metric. A third sound sample is received and decoded by assigning to it a weight corresponding to a probability that the sound sample originated in a foreground, using the foreground acoustic model and the background acoustic model. The method further includes determining whether the weight meets predefined criteria for assigning the third sound sample to the foreground and, when the weight meets the predefined criteria, interpreting the third sound sample as a portion of a speech command. Otherwise, recognition of the third sound sample as a portion of a speech command is forgone.
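The weighting and threshold step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say what form the acoustic models take, so the sketch assumes each model is a single diagonal-covariance Gaussian over feature vectors, assumes equal priors, and takes the weight to be the posterior probability of the foreground. The names `DiagonalGaussian`, `fit`, `foreground_weight`, `is_command_portion` and the 0.5 threshold are all hypothetical, and the precision metrics mentioned in the abstract are not modeled.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class DiagonalGaussian:
    """Hypothetical stand-in for an acoustic model (assumption, not from the patent)."""
    mean: List[float]
    var: List[float]

    def log_likelihood(self, x: Sequence[float]) -> float:
        # Sum of per-dimension Gaussian log-densities.
        return sum(
            -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
            for xi, m, v in zip(x, self.mean, self.var)
        )


def fit(samples: Sequence[Sequence[float]], var_floor: float = 1e-3) -> DiagonalGaussian:
    """Estimate mean and (floored) variance per dimension from feature vectors."""
    n = len(samples)
    dim = len(samples[0])
    mean = [sum(s[d] for s in samples) / n for d in range(dim)]
    var = [
        max(sum((s[d] - mean[d]) ** 2 for s in samples) / n, var_floor)
        for d in range(dim)
    ]
    return DiagonalGaussian(mean, var)


def foreground_weight(x: Sequence[float], fg: DiagonalGaussian,
                      bg: DiagonalGaussian, prior_fg: float = 0.5) -> float:
    """Posterior probability that x came from the foreground model."""
    log_fg = fg.log_likelihood(x) + math.log(prior_fg)
    log_bg = bg.log_likelihood(x) + math.log(1.0 - prior_fg)
    diff = log_bg - log_fg
    if diff > 700.0:  # math.exp would overflow; the weight is effectively zero
        return 0.0
    return 1.0 / (1.0 + math.exp(diff))


def is_command_portion(x: Sequence[float], fg: DiagonalGaussian,
                       bg: DiagonalGaussian, threshold: float = 0.5) -> bool:
    """Apply an assumed 'predefined criterion': weight at or above a threshold."""
    return foreground_weight(x, fg, bg) >= threshold


if __name__ == "__main__":
    # Toy one-dimensional features: background near 0, foreground near 5.
    background = fit([[0.0], [0.5], [-0.5]])
    foreground = fit([[5.0], [5.5], [4.5]])
    for sample in ([5.1], [0.1]):
        print(sample, is_command_portion(sample, foreground, background))
```

A sample whose weight falls below the threshold is simply not treated as part of a command, which mirrors the "otherwise" branch of the abstract. A real system would more plausibly use richer models over per-frame acoustic features, but the decision structure would stay the same.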
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310035979.1A CN103971685B (zh) | 2013-01-30 | 2013-01-30 | Speech command recognition method and system |
PCT/CN2013/085738 WO2014117544A1 (en) | 2013-01-30 | 2013-11-21 | Method and system for recognizing speech commands |
Publications (2)
Publication Number | Publication Date |
---|---|
MX2015009812A (es) | 2015-10-29 |
MX346294B (es) | 2017-03-13 |
Family
ID=51241103
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
MX2015009812A MX346294B (es) | 2013-01-30 | 2013-11-21 | Method and system for recognizing speech commands |
Country Status (7)
Country | Link |
---|---|
US (1) | US9805715B2 (es) |
CN (1) | CN103971685B (es) |
AR (1) | AR094604A1 (es) |
CA (1) | CA2897365C (es) |
MX (1) | MX346294B (es) |
SG (1) | SG11201505403SA (es) |
WO (1) | WO2014117544A1 (es) |
Families Citing this family (26)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104143326B (zh) * | 2013-12-03 | 2016-11-02 | 腾讯科技(深圳)有限公司 | Voice command recognition method and device |
CN105374352B (zh) * | 2014-08-22 | 2019-06-18 | 中国科学院声学研究所 | Voice activation method and system |
US9318107B1 (en) * | 2014-10-09 | 2016-04-19 | Google Inc. | Hotword detection on multiple devices |
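CN105843811B (zh) * | 2015-01-13 | 2019-12-06 | 华为技术有限公司 | Method and device for converting text |
JP6227209B2 (ja) * | 2015-09-09 | 2017-11-08 | 三菱電機株式会社 | In-vehicle speech recognition device and in-vehicle equipment |
CN106940998B (zh) * | 2015-12-31 | 2021-04-16 | 阿里巴巴集团控股有限公司 | Method and device for executing a set operation |
CN105869624B (zh) * | 2016-03-29 | 2019-05-10 | 腾讯科技(深圳)有限公司 | Method and device for constructing a speech decoding network in digit speech recognition |
JP2018013590A (ja) * | 2016-07-20 | 2018-01-25 | 株式会社東芝 | Generation device, recognition system, method for generating a finite state transducer, and data |
CN106409294B (zh) * | 2016-10-18 | 2019-07-16 | 广州视源电子科技股份有限公司 | Method and device for preventing misrecognition of voice commands |
TWI643123B (zh) * | 2017-05-02 | 2018-12-01 | 瑞昱半導體股份有限公司 | Electronic device with voice wake-up function and operating method thereof |
CN112802459B (zh) * | 2017-05-23 | 2024-06-18 | 创新先进技术有限公司 | Consultation service processing method and device based on speech recognition |
CN107680582B (zh) * | 2017-07-28 | 2021-03-26 | 平安科技(深圳)有限公司 | Acoustic model training method, speech recognition method, apparatus, device, and medium |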
US10204624B1 (en) * | 2017-08-14 | 2019-02-12 | Lenovo (Singapore) Pte. Ltd. | False positive wake word |
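CN107644638B (zh) * | 2017-10-17 | 2019-01-04 | 北京智能管家科技有限公司 | Speech recognition method, apparatus, terminal, and computer-readable storage medium |
CN107919130B (zh) * | 2017-11-06 | 2021-12-17 | 百度在线网络技术(北京)有限公司 | Cloud-based speech processing method and device |
CN108257596B (zh) * | 2017-12-22 | 2021-07-23 | 北京小蓦机器人技术有限公司 | Method and device for providing target presentation information |
CN110782898B (zh) * | 2018-07-12 | 2024-01-09 | 北京搜狗科技发展有限公司 | End-to-end voice wake-up method, apparatus, and computer device |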
US11308939B1 (en) * | 2018-09-25 | 2022-04-19 | Amazon Technologies, Inc. | Wakeword detection using multi-word model |
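CN109273007B (zh) * | 2018-10-11 | 2022-05-17 | 西安讯飞超脑信息科技有限公司 | Voice wake-up method and device |
CN110148403B (zh) * | 2019-05-21 | 2021-04-13 | 腾讯科技(深圳)有限公司 | Decoding network generation method, speech recognition method, apparatus, device, and medium |
CN111179974B (zh) * | 2019-12-30 | 2022-08-09 | 思必驰科技股份有限公司 | Command word recognition method and device |
CN113192493B (zh) * | 2020-04-29 | 2022-06-14 | 浙江大学 | Core training speech selection method combining GMM token ratio and clustering |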
US11521643B2 (en) * | 2020-05-08 | 2022-12-06 | Bose Corporation | Wearable audio device with user own-voice recording |
CN114944155B (zh) * | 2021-02-14 | 2024-06-04 | 成都启英泰伦科技有限公司 | Offline speech recognition method combining terminal hardware and algorithm software processing |
US11538461B1 (en) * | 2021-03-18 | 2022-12-27 | Amazon Technologies, Inc. | Language agnostic missing subtitle detection |
CN117198271A (zh) * | 2023-10-10 | 2023-12-08 | 美的集团(上海)有限公司 | Speech parsing method and apparatus, smart device, medium, and computer program product |
Family Cites Families (26)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6098040A (en) * | 1997-11-07 | 2000-08-01 | Nortel Networks Corporation | Method and apparatus for providing an improved feature set in speech recognition by performing noise cancellation and background masking |
US6539353B1 (en) * | 1999-10-12 | 2003-03-25 | Microsoft Corporation | Confidence measures using sub-word-dependent weighting of sub-word confidence scores for robust speech recognition |
JP4590692B2 (ja) * | 2000-06-28 | 2010-12-01 | パナソニック株式会社 | Acoustic model creation device and method thereof |
US20020087306A1 (en) * | 2000-12-29 | 2002-07-04 | Lee Victor Wai Leung | Computer-implemented noise normalization method and system |
US20030004720A1 (en) * | 2001-01-30 | 2003-01-02 | Harinath Garudadri | System and method for computing and transmitting parameters in a distributed voice recognition system |
US6985862B2 (en) * | 2001-03-22 | 2006-01-10 | Tellme Networks, Inc. | Histogram grammar weighting and error corrective training of grammar weights |
US7587321B2 (en) * | 2001-05-08 | 2009-09-08 | Intel Corporation | Method, apparatus, and system for building context dependent models for a large vocabulary continuous speech recognition (LVCSR) system |
US7236929B2 (en) * | 2001-05-09 | 2007-06-26 | Plantronics, Inc. | Echo suppression and speech detection techniques for telephony applications |
JP2003308091A (ja) * | 2002-04-17 | 2003-10-31 | Pioneer Electronic Corp | Speech recognition device, speech recognition method, and speech recognition program |
DE60212725T2 (de) * | 2002-08-01 | 2007-06-28 | Telefonaktiebolaget Lm Ericsson (Publ) | Method for automatic speech recognition |
TWI223792B (en) * | 2003-04-04 | 2004-11-11 | Penpower Technology Ltd | Speech model training method applied in speech recognition |
US7480615B2 (en) * | 2004-01-20 | 2009-01-20 | Microsoft Corporation | Method of speech recognition using multimodal variational inference with switching state space models |
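JP4541781B2 (ja) * | 2004-06-29 | 2010-09-08 | キヤノン株式会社 | Speech recognition device and method |
CN1300763C (zh) * | 2004-09-29 | 2007-02-14 | 上海交通大学 | Automatic speech recognition processing method for an embedded speech recognition system |
KR100745976B1 (ko) * | 2005-01-12 | 2007-08-06 | 삼성전자주식회사 | Method and apparatus for distinguishing speech from non-speech using an acoustic model |
JP4667082B2 (ja) * | 2005-03-09 | 2011-04-06 | キヤノン株式会社 | Speech recognition method |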
US20060241937A1 (en) * | 2005-04-21 | 2006-10-26 | Ma Changxue C | Method and apparatus for automatically discriminating information bearing audio segments and background noise audio segments |
US20060247927A1 (en) * | 2005-04-29 | 2006-11-02 | Robbins Kenneth L | Controlling an output while receiving a user input |
US8014536B2 (en) * | 2005-12-02 | 2011-09-06 | Golden Metallic, Inc. | Audio source separation based on flexible pre-trained probabilistic source models |
KR100679051B1 (ko) * | 2005-12-14 | 2007-02-05 | 삼성전자주식회사 | Speech recognition apparatus and method using a plurality of confidence measure algorithms |
JP4949687B2 (ja) * | 2006-01-25 | 2012-06-13 | ソニー株式会社 | Beat extraction device and beat extraction method |
US7890325B2 (en) * | 2006-03-16 | 2011-02-15 | Microsoft Corporation | Subword unit posterior probability for measuring confidence |
ATE508452T1 (de) * | 2007-11-12 | 2011-05-15 | Harman Becker Automotive Sys | Discrimination between foreground speech and background noise |
EP2148325B1 (en) * | 2008-07-22 | 2014-10-01 | Nuance Communications, Inc. | Method for determining the presence of a wanted signal component |
US8600073B2 (en) * | 2009-11-04 | 2013-12-03 | Cambridge Silicon Radio Limited | Wind noise suppression |
US9443511B2 (en) * | 2011-03-04 | 2016-09-13 | Qualcomm Incorporated | System and method for recognizing environmental sound |
2013
- 2013-01-30: CN, application CN201310035979.1A, patent CN103971685B (zh), status: Active
- 2013-11-21: WO, application PCT/CN2013/085738, patent WO2014117544A1 (en), status: Application Filing
- 2013-11-21: CA, application CA2897365A, patent CA2897365C (en), status: Active
- 2013-11-21: SG, application SG11201505403SA, patent SG11201505403SA (en), status: unknown
- 2013-11-21: MX, application MX2015009812A, patent MX346294B (es), status: IP Right Grant
- 2013-12-13: US, application US14/106,634, patent US9805715B2 (en), status: Active
2014
- 2014-01-28: AR, application ARP140100256A, patent AR094604A1 (es), status: IP Right Grant
Also Published As
Publication number | Publication date |
---|---|
CN103971685A (zh) | 2014-08-06 |
SG11201505403SA (en) | 2015-08-28 |
CA2897365A1 (en) | 2014-08-07 |
CN103971685B (zh) | 2015-06-10 |
AR094604A1 (es) | 2015-08-12 |
MX2015009812A (es) | 2015-10-29 |
CA2897365C (en) | 2018-10-02 |
US20140214416A1 (en) | 2014-07-31 |
US9805715B2 (en) | 2017-10-31 |
WO2014117544A1 (en) | 2014-08-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
MX2015009812A (es) | Method and system for recognizing speech commands | |
AU2019268131A1 (en) | Speech recognition method, speech wakeup apparatus, speech recognition apparatus, and terminal | |
MY179900A (en) | Speech recognition method and speech recognition apparatus | |
GB2566215A (en) | Voice user interface | |
WO2015009586A3 (en) | Performing an operation relative to tabular data based upon voice input | |
GB2551917A (en) | Privacy-preserving training corpus selection | |
GB2536836A (en) | Voice command triggered speech enhancement | |
EP4239628A3 (en) | Determining hotword suitability | |
MX2016013015A (es) | Methods and systems for managing a dialog with a robot | |
GB201212783D0 (en) | A speech processing system | |
EP3767622A3 (en) | Automatically determining language for speech recognition of spoken utterance received via an automated assistant interface | |
WO2018038385A3 (ko) | Speech recognition method and electronic device performing the same | |
NZ725145A (en) | Methods and systems for managing dialogs of a robot | |
MX2017001121A (es) | Acoustic and domain based speech recognition for vehicles | |
EP3751561A3 (en) | Hotword recognition | |
WO2014115115A3 (en) | Determining apnea-hypopnia index ahi from speech | |
EP3384488A4 (en) | SYSTEM AND METHOD FOR IMPLEMENTING A VOICE USER INTERFACE BY COMBINING A SPEECH-TEXT SYSTEM AND A SPEECH-INTENTION SYSTEM | |
CL2015002362A1 (es) | Method and system for controlling a user receiving device using voice commands | |
SG10201900178WA (en) | Speech transaction processing | |
WO2014124332A3 (en) | Voice trigger for a digital assistant | |
MX2014010795A (es) | Device for extracting information from a dialog | |
MX2017003754A (es) | Eye gaze for spoken language understanding in multimodal conversational interactions | |
IN2014DN09942A (es) | ||
EP3349125A4 (en) | Language model generation device, language model generation method and program therefor, voice recognition device, and voice recognition method and program therefor | |
WO2013134641A3 (en) | Recognizing speech in multiple languages |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
HH | Correction or change in general | ||
FG | Grant or registration |