CN106782520B - A Speech Feature Mapping Method in Complex Environments - Google Patents
A Speech Feature Mapping Method in Complex Environments
- Publication number
- CN106782520B (application number CN201710151497.0A)
- Authority
- CN
- China
- Prior art keywords
- feature
- under
- complex environment
- environment
- voice signal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
Concept | Type | Sections | Count |
---|---|---|---|
mapping | Methods | title, claims, abstract, description | 39 |
method | Methods | title, claims, abstract, description | 24 |
extraction | Methods | claims, abstract, description | 10 |
function | Effects | claims, description | 14 |
matrix | Substances | claims, description | 12 |
framing | Methods | claims, description | 7 |
training | Methods | claims, description | 6 |
statistical model | Methods | claims, description | 4 |
Maximum Likelihood | Methods | claims, description | 2 |
effects | Effects | abstract, description | 3 |
interaction | Effects | description | 6 |
pattern recognition | Methods | description | 3 |
benefit | Effects | description | 2 |
interactive effect | Effects | description | 2 |
beneficial effect | Effects | description | 1 |
development | Methods | description | 1 |
immersion | Methods | description | 1 |
processing | Methods | description | 1 |
sound signal | Effects | description | 1 |
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/20—Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
- G10L15/08—Speech classification or search
- G10L15/14—Speech classification or search using statistical models, e.g. Hidden Markov Models [HMMs]
- G10L17/00—Speaker identification or verification techniques
- G10L17/02—Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
- G10L17/06—Decision making techniques; Pattern matching strategies
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Health & Medical Sciences (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Computational Linguistics (AREA)
- Probability & Statistics with Applications (AREA)
- Business, Economics & Management (AREA)
- Game Theory and Decision Science (AREA)
- Machine Translation (AREA)
Abstract
Description
Claims (3)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710151497.0A, CN106782520B (zh) | 2017-03-14 | 2017-03-14 | A Speech Feature Mapping Method in Complex Environments |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710151497.0A, CN106782520B (zh) | 2017-03-14 | 2017-03-14 | A Speech Feature Mapping Method in Complex Environments |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106782520A (zh) | 2017-05-31 |
CN106782520B (zh) | 2019-11-26 |
Family
ID=58962777
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710151497.0A (Active), CN106782520B (zh) | A Speech Feature Mapping Method in Complex Environments | 2017-03-14 | 2017-03-14 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106782520B (zh) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
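CN108766430B (zh) * | 2018-06-06 | 2020-08-04 | Central China Normal University | A speech feature mapping method and system based on Bhattacharyya distance |
CN111816187A (zh) * | 2020-07-03 | 2020-10-23 | Air Force Early Warning Academy of the Chinese People's Liberation Army | Speech feature mapping method based on deep neural networks in complex environments |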
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
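CN103413548A (zh) * | 2013-08-16 | 2013-11-27 | University of Science and Technology of China | A voice conversion method using joint spectrum modeling based on restricted Boltzmann machines |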
US9373324B2 (en) * | 2013-12-06 | 2016-06-21 | International Business Machines Corporation | Applying speaker adaption techniques to correlated features |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100262423A1 (en) * | 2009-04-13 | 2010-10-14 | Microsoft Corporation | Feature compensation approach to robust speech recognition |
US8515758B2 (en) * | 2010-04-14 | 2013-08-20 | Microsoft Corporation | Speech recognition including removal of irrelevant information |
US9466292B1 (en) * | 2013-05-03 | 2016-10-11 | Google Inc. | Online incremental adaptation of deep neural networks using auxiliary Gaussian mixture models in speech recognition |
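CN104392719B (zh) * | 2014-11-26 | 2017-09-19 | Hohai University | A center sub-band model adaptation method for speech recognition systems |
CN104900232A (zh) * | 2015-04-20 | 2015-09-09 | Southeast University | An isolated word recognition method based on a two-layer GMM structure and VTS feature compensation |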
Non-Patent Citations (2)
Title |
---|
"Feature Adaptation Using Linear Spectro-Temporal Transform for Robust Speech Recognition"; Duc Hoang Ha Nguyen et al.; IEEE/ACM Transactions on Audio, Speech, and Language Processing; 20160630; vol. 24, no. 6; pp. 1006-1009 * |
"INCREMENTAL ON-LINE FEATURE SPACE MLLR ADAPTATION FOR TELEPHONY SPEECH RECOGNITION"; Yongxin Li et al.; ISCA Archive; 20020920; entire document * |
Also Published As
Publication number | Publication date |
---|---|
CN106782520A (zh) | 2017-05-31 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
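CN110531860B (zh) | An artificial intelligence-based animated character driving method and apparatus | |
WO2022052481A1 (zh) | Artificial intelligence-based VR interaction method and apparatus, computer device, and medium | |
CN110838289A (zh) | Artificial intelligence-based wake word detection method, apparatus, device, and medium | |
CN110288077A (zh) | An artificial intelligence-based method for synthesizing speaking expressions and related apparatus | |
WO2016150001A1 (zh) | Speech recognition method, apparatus, and computer storage medium | |
CN110428808A (zh) | A speech recognition method and apparatus | |
CN106710590A (zh) | Voice interaction system and method with emotional function based on a virtual reality environment | |
CN110265040A (zh) | Voiceprint model training method, apparatus, storage medium, and electronic device | |
US20220392224A1 (en) | Data processing method and apparatus, device, and readable storage medium | |
CN105895105A (zh) | Speech processing method and apparatus | |
CN109887484A (zh) | A dual learning-based speech recognition and speech synthesis method and apparatus | |
CN110148399A (zh) | A control method, apparatus, device, and medium for a smart device | |
CN107589828A (zh) | Knowledge graph-based human-computer interaction method and system | |
CN106782520B (zh) | A speech feature mapping method in complex environments | |
CN110970018A (zh) | Speech recognition method and apparatus | |
CN113077537A (zh) | A video generation method, storage medium, and device | |
CN109343695A (zh) | Interaction method and system based on virtual human behavior standards | |
CN108717732A (zh) | A facial expression tracking method based on the MobileNets model | |
CN110111769A (zh) | A cochlear implant control method and apparatus, readable storage medium, and cochlear implant | |
CN108052250A (zh) | Virtual idol performance data processing method and system based on multimodal interaction | |
CN110501673A (zh) | A binaural sound source spatial direction estimation method and system based on a multi-task time-frequency convolutional neural network | |
CN113873297B (zh) | A digital character video generation method and related apparatus | |
Chakraborty et al. | Front-End Feature Compensation and Denoising for Noise Robust Speech Emotion Recognition. | |
CN110085236A (zh) | A speaker recognition method based on adaptive speech frame weighting | |
CN112420063A (zh) | A speech enhancement method and apparatus |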
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
EE01 | Entry into force of recordation of patent licensing contract | Application publication date: 20170531; Assignee: Hubei ZHENGBO Xusheng Technology Co.,Ltd.; Assignor: CENTRAL CHINA NORMAL University; Contract record no.: X2024980001275; Denomination of invention: A Speech Feature Mapping Method in Complex Environments; Granted publication date: 20191126; License type: Common License; Record date: 20240124 |
EE01 | Entry into force of recordation of patent licensing contract | Application publication date: 20170531; Assignee: Hubei Rongzhi Youan Technology Co.,Ltd.; Assignor: CENTRAL CHINA NORMAL University; Contract record no.: X2024980001548; Denomination of invention: A Speech Feature Mapping Method in Complex Environments; Granted publication date: 20191126; License type: Common License; Record date: 20240126 |