SG11201808360SA - Acoustic model training method, speech recognition method, apparatus, device and medium - Google Patents

Acoustic model training method, speech recognition method, apparatus, device and medium

Info

Publication number
SG11201808360SA
Authority
SG
Singapore
Prior art keywords
training
acoustic model
model
medium
model training
Prior art date
Application number
SG11201808360SA
Other languages
English (en)
Inventor
Hao Liang
Jianzong Wang
Ning Cheng
Jing Xiao
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2017-07-28
Filing date
2017-08-31
Publication date
2019-02-27
Application filed by Ping An Technology Shenzhen Co Ltd

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/14 Speech classification or search using statistical models, e.g. Hidden Markov Models [HMMs]
    • G10L15/142 Hidden Markov Models [HMMs]
    • G10L15/144 Training of HMMs
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/16 Speech classification or search using artificial neural networks
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/16 Hidden Markov models [HMM]
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/18 Artificial neural networks; Connectionist approaches
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/14 Speech classification or search using statistical models, e.g. Hidden Markov Models [HMMs]
    • G10L15/142 Hidden Markov Models [HMMs]
    • G10L15/148 Duration modelling in HMMs, e.g. semi HMM, segmental models or transition probabilities
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G10L2015/022 Demisyllables, biphones or triphones being the recognition units
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G10L2015/025 Phonemes, fenemes or fenones being the recognition units
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G10L2015/0631 Creating reference templates; Clustering

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Electrically Operated Instructional Devices (AREA)
  • Character Discrimination (AREA)

Applications Claiming Priority (2)

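Application Number | Priority Date | Filing Date | Title
CN201710627480.8A CN107680582B (zh) | 2017-07-28 | 2017-07-28 | Acoustic model training method, speech recognition method, apparatus, device and medium
PCT/CN2017/099825 WO2019019252A1 (zh) | 2017-07-28 | 2017-08-31 | Acoustic model training method, speech recognition method, apparatus, device and medium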

Publications (1)

Publication Number | Publication Date
SG11201808360SA (en) | 2019-02-27

Family

ID=61133210

Family Applications (1)

Application Number | Priority Date | Filing Date | Title
SG11201808360SA SG11201808360SA (en) | 2017-07-28 | 2017-08-31 | Acoustic model training method, speech recognition method, apparatus, device and medium

Country Status (4)

Country | Link
US (1) | US11030998B2 (zh)
CN (1) | CN107680582B (zh)
SG (1) | SG11201808360SA (zh)
WO (1) | WO2019019252A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN110634476A (zh) * | 2019-10-09 | 2019-12-31 | 深圳大学 | Method and system for rapidly building a robust acoustic model

Families Citing this family (45)

* Cited by examiner, † Cited by third party
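Publication number | Priority date | Publication date | Assignee | Title
KR102535411B1 (ko) * | 2017-11-16 | 2023-05-23 | 삼성전자주식회사 | Apparatus and method related to metric-learning-based data classification
CN108447475A (zh) * | 2018-03-02 | 2018-08-24 | 国家电网公司华中分部 | Method for establishing a speech recognition model based on a power dispatching system
CN108564940B (zh) * | 2018-03-20 | 2020-04-28 | 平安科技(深圳)有限公司 | Speech recognition method, server and computer-readable storage medium
CN108806696B (zh) * | 2018-05-08 | 2020-06-05 | 平安科技(深圳)有限公司 | Method, apparatus, computer device and storage medium for establishing a voiceprint model
CN108831463B (zh) * | 2018-06-28 | 2021-11-12 | 广州方硅信息技术有限公司 | Lip-language synthesis method, apparatus, electronic device and storage medium
CN108989341B (zh) * | 2018-08-21 | 2023-01-13 | 平安科技(深圳)有限公司 | Autonomous voice registration method, apparatus, computer device and storage medium
CN108986835B (zh) * | 2018-08-28 | 2019-11-26 | 百度在线网络技术(北京)有限公司 | Speech denoising method, apparatus, device and medium based on an improved GAN network
CN109167880B (zh) * | 2018-08-30 | 2021-05-21 | 努比亚技术有限公司 | Double-sided screen terminal control method, double-sided screen terminal and computer-readable storage medium
CN109036379B (zh) * | 2018-09-06 | 2021-06-11 | 百度时代网络技术(北京)有限公司 | Speech recognition method, device and storage medium
CN110164452B (zh) * | 2018-10-10 | 2023-03-10 | 腾讯科技(深圳)有限公司 | Voiceprint recognition method, model training method and server
CN111048062B (zh) | 2018-10-10 | 2022-10-04 | 华为技术有限公司 | Speech synthesis method and device
CN109559735B (zh) * | 2018-10-11 | 2023-10-27 | 平安科技(深圳)有限公司 | Neural-network-based speech recognition method, terminal device and medium
CN109524011A (zh) * | 2018-10-22 | 2019-03-26 | 四川虹美智能科技有限公司 | Refrigerator wake-up method and apparatus based on voiceprint recognition
CN109243429B (zh) * | 2018-11-21 | 2021-12-10 | 苏州奇梦者网络科技有限公司 | Speech modeling method and apparatus
US11170761B2 (en) * | 2018-12-04 | 2021-11-09 | Sorenson Ip Holdings, Llc | Training of speech recognition systems
CN109326277B (zh) * | 2018-12-05 | 2022-02-08 | 四川长虹电器股份有限公司 | Method and system for establishing a semi-supervised phoneme forced-alignment model
CN109243465A (zh) * | 2018-12-06 | 2019-01-18 | 平安科技(深圳)有限公司 | Voiceprint authentication method, apparatus, computer device and storage medium
CN109830277B (zh) * | 2018-12-12 | 2024-03-15 | 平安科技(深圳)有限公司 | Rope-skipping monitoring method, electronic apparatus and storage medium
CN109817191B (zh) * | 2019-01-04 | 2023-06-06 | 平安科技(深圳)有限公司 | Vibrato modeling method, apparatus, computer device and storage medium
CN109616103B (zh) * | 2019-01-09 | 2022-03-22 | 百度在线网络技术(北京)有限公司 | Acoustic model training method, apparatus and storage medium
CN109887484B (zh) * | 2019-02-22 | 2023-08-04 | 平安科技(深圳)有限公司 | Speech recognition and speech synthesis method and apparatus based on dual learning
CN111798857A (zh) * | 2019-04-08 | 2020-10-20 | 北京嘀嘀无限科技发展有限公司 | Information recognition method, apparatus, electronic device and storage medium
CN111833847B (zh) * | 2019-04-15 | 2023-07-25 | 北京百度网讯科技有限公司 | Speech processing model training method and apparatus
CN110415685A (zh) * | 2019-08-20 | 2019-11-05 | 河海大学 | Speech recognition method
EP4078918B1 (en) * | 2019-12-20 | 2023-11-08 | Eduworks Corporation | Real-time voice phishing detection
US11586964B2 (en) * | 2020-01-30 | 2023-02-21 | Dell Products L.P. | Device component management using deep learning techniques
CN111489739B (zh) * | 2020-04-17 | 2023-06-16 | 嘉楠明芯(北京)科技有限公司 | Phoneme recognition method, apparatus and computer-readable storage medium
CN111696525A (zh) * | 2020-05-08 | 2020-09-22 | 天津大学 | Kaldi-based method for constructing an acoustic model for Chinese speech recognition
CN111666469B (zh) * | 2020-05-13 | 2023-06-16 | 广州国音智能科技有限公司 | Sentence library construction method, apparatus, device and storage medium
CN111798841B (zh) * | 2020-05-13 | 2023-01-03 | 厦门快商通科技股份有限公司 | Acoustic model training method, system, mobile terminal and storage medium
CN111833852B (zh) * | 2020-06-30 | 2022-04-15 | 思必驰科技股份有限公司 | Acoustic model training method, apparatus and computer-readable storage medium
CN111816171B (zh) * | 2020-08-31 | 2020-12-11 | 北京世纪好未来教育科技有限公司 | Speech recognition model training method, speech recognition method and apparatus
CN111933121B (zh) * | 2020-08-31 | 2024-03-12 | 广州市百果园信息技术有限公司 | Acoustic model training method and apparatus
CN112331219B (zh) * | 2020-11-05 | 2024-05-03 | 北京晴数智慧科技有限公司 | Speech processing method and apparatus
CN112489662A (zh) * | 2020-11-13 | 2021-03-12 | 北京沃东天骏信息技术有限公司 | Method and apparatus for training a speech processing model
CN113035247B (zh) * | 2021-03-17 | 2022-12-23 | 广州虎牙科技有限公司 | Audio-text alignment method, apparatus, electronic device and storage medium
CN113223504B (zh) * | 2021-04-30 | 2023-12-26 | 平安科技(深圳)有限公司 | Acoustic model training method, apparatus, device and storage medium
TWI780738B (zh) * | 2021-05-28 | 2022-10-11 | 宇康生科股份有限公司 | Method and system for augmenting an articulation-disorder speech corpus, speech recognition platform, and articulation-disorder assistive device
CN113345418A (zh) * | 2021-06-09 | 2021-09-03 | 中国科学技术大学 | Multilingual model training method based on cross-lingual self-training
CN113450803B (zh) * | 2021-06-09 | 2024-03-19 | 上海明略人工智能(集团)有限公司 | Conference recording transcription method, system, computer device and readable storage medium
CN113449626B (zh) * | 2021-06-23 | 2023-11-07 | 中国科学院上海高等研究院 | Hidden Markov model vibration signal analysis method and apparatus, storage medium and terminal
CN113689867B (zh) * | 2021-08-18 | 2022-06-28 | 北京百度网讯科技有限公司 | Voice conversion model training method, apparatus, electronic device and medium
CN113723546B (zh) * | 2021-09-03 | 2023-12-22 | 江苏理工学院 | Bearing fault detection method and system based on a discrete hidden Markov model
CN114360517B (zh) * | 2021-12-17 | 2023-04-18 | 天翼爱音乐文化科技有限公司 | Audio processing method, apparatus and storage medium for complex environments
CN116364063B (zh) * | 2023-06-01 | 2023-09-05 | 蔚来汽车科技(安徽)有限公司 | Phoneme alignment method, device, driving device and medium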

Family Cites Families (18)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US7089178B2 (en) * | 2002-04-30 | 2006-08-08 | Qualcomm Inc. | Multistream network feature processing for a distributed speech recognition system
US8972253B2 (en) | 2010-09-15 | 2015-03-03 | Microsoft Technology Licensing, Llc | Deep belief network for large vocabulary continuous speech recognition
US8442821B1 (en) | 2012-07-27 | 2013-05-14 | Google Inc. | Multi-frame prediction for hybrid neural network/hidden Markov models
US9972306B2 (en) * | 2012-08-07 | 2018-05-15 | Interactive Intelligence Group, Inc. | Method and system for acoustic data selection for training the parameters of an acoustic model
NZ730641A (en) * | 2012-08-24 | 2018-08-31 | Interactive Intelligence Inc | Method and system for selectively biased linear discriminant analysis in automatic speech recognition systems
CN103117060B (zh) * | 2013-01-18 | 2015-10-28 | 中国科学院声学研究所 | Modeling method and modeling system for an acoustic model for speech recognition
CN103971678B (zh) | 2013-01-29 | 2015-08-12 | 腾讯科技(深圳)有限公司 | Keyword detection method and apparatus
CN103971685B (zh) * | 2013-01-30 | 2015-06-10 | 腾讯科技(深圳)有限公司 | Voice command recognition method and system
CN104575504A (zh) * | 2014-12-24 | 2015-04-29 | 上海师范大学 | Method for personalized television voice wake-up using voiceprint and speech recognition
KR101988222B1 (ko) | 2015-02-12 | 2019-06-13 | 한국전자통신연구원 | Large-vocabulary continuous speech recognition apparatus and method
WO2016165120A1 (en) * | 2015-04-17 | 2016-10-20 | Microsoft Technology Licensing, Llc | Deep neural support vector machines
KR102494139B1 (ko) * | 2015-11-06 | 2023-01-31 | 삼성전자주식회사 | Neural network training apparatus and method, and speech recognition apparatus and method
JP6679898B2 (ja) * | 2015-11-24 | 2020-04-15 | 富士通株式会社 | Keyword detection device, keyword detection method and computer program for keyword detection
CN105702250B (zh) * | 2016-01-06 | 2020-05-19 | 福建天晴数码有限公司 | Speech recognition method and apparatus
CN105869624B (zh) * | 2016-03-29 | 2019-05-10 | 腾讯科技(深圳)有限公司 | Method and apparatus for constructing a speech decoding network in digit speech recognition
CN105976812B (zh) * | 2016-04-28 | 2019-04-26 | 腾讯科技(深圳)有限公司 | Speech recognition method and device thereof
CN105957518B (zh) * | 2016-06-16 | 2019-05-31 | 内蒙古大学 | Method for Mongolian large-vocabulary continuous speech recognition
CN106409289B (zh) * | 2016-09-23 | 2019-06-28 | 合肥美的智能科技有限公司 | Environment-adaptive method for speech recognition, speech recognition apparatus and household appliance

Also Published As

Publication number | Publication date
CN107680582B (zh) | 2021-03-26
US11030998B2 (en) | 2021-06-08
US20210125603A1 (en) | 2021-04-29
CN107680582A (zh) | 2018-02-09
WO2019019252A1 (zh) | 2019-01-31

Similar Documents

Publication Publication Date Title
SG11201808360SA (en) Acoustic model training method, speech recognition method, apparatus, device and medium
EP3751561A3 (en) Hotword recognition
EP3767619A4 (en) PROCESS AND APPARATUS FOR SPEECH RECOGNITION AND SPEECH RECOGNITION MODEL LEARNING
AU2019268131A1 (en) Speech recognition method, speech wakeup apparatus, speech recognition apparatus, and terminal
EP4235369A3 (en) Modality learning on mobile devices
EP3154054A3 (en) Method and apparatus for training language model and recognizing speech
PH12019501674A1 (en) Speech wakeup method, apparatus, and electronic device
EP3770905A4 (en) VOICE RECOGNITION METHOD, DEVICE AND DEVICE AND STORAGE MEDIUM
EP3742436A4 (en) SPEECH SYNTHESIS METHOD, MODEL TRAINING METHOD, DEVICE AND COMPUTER DEVICE
EP3742332A4 (en) METHOD AND APPARATUS FOR TRAINING A MODEL FOR DETECTING KEY POINTS OF THE HAND AND METHOD AND DEVICE FOR DETECTING KEY POINTS OF THE HAND
EP3001662A3 (en) Conference proceed apparatus and method for advancing conference
EP4064284A4 (en) SPEECH DETECTION METHODS, TRAINING METHODS FOR PREDICTIVE MODELS, DEVICE, DEVICE AND MEDIUM
WO2014025682A3 (en) Acoustic data selection for training the parameters of an acoustic model
GB2551917A (en) Privacy-preserving training corpus selection
EP3648099A4 (en) VOICE RECOGNITION METHOD, DEVICE, DEVICE AND STORAGE MEDIUM
MY179900A (en) Speech recognition method and speech recognition apparatus
EP3479376A4 (en) METHOD AND APPARATUS FOR VOICE RECOGNITION BASED ON RECOGNITION OF SPEAKER
WO2018038385A3 (ko) Speech recognition method and electronic device for performing the same
EP3584790A4 (en) VOICEPRINT RECOGNITION METHOD, DEVICE, STORAGE MEDIUM AND BACKGROUND SERVER
EP3046053A3 (en) Method and apparatus for training language model, and method and apparatus for recognizing language
EP3193328A4 (en) Method and device for performing voice recognition using grammar model
WO2016044027A8 (en) Method and apparatus for performing speaker recognition
EP4280210A3 (en) Hotword detection on multiple devices
EP3353766A4 (en) METHODS FOR AUTOMATED GENERATION OF VOICE SAMPLE ASSET PRODUCTION NOTES FOR USERS OF DISTRIBUTED LANGUAGE LEARNING SYSTEM, AUTOMATED RECOGNITION AND QUANTIFICATION OF ACCENT AND ENHANCED SPEECH RECOGNITION
EP2963643A3 (en) Entity name recognition