CN104732978B - Text-dependent speaker recognition method based on joint deep learning - Google Patents
Text-dependent speaker recognition method based on joint deep learning
- Publication number
- CN104732978B
- Authority
- CN
- China
- Prior art keywords
- speaker
- audio
- lda
- feature
- threshold value
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Image Analysis (AREA)
- Complex Calculations (AREA)
Claims (9)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510107647.9A CN104732978B (zh) | 2015-03-12 | 2015-03-12 | Text-dependent speaker recognition method based on joint deep learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510107647.9A CN104732978B (zh) | 2015-03-12 | 2015-03-12 | Text-dependent speaker recognition method based on joint deep learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104732978A (zh) | 2015-06-24 |
CN104732978B (zh) | 2018-05-08 |
Family
ID=53456817
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510107647.9A Active CN104732978B (zh) | Text-dependent speaker recognition method based on joint deep learning | 2015-03-12 | 2015-03-12 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104732978B (zh) |
Families Citing this family (32)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
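CN105160229A (zh) * | 2015-09-01 | 2015-12-16 | Wuhan Tongxun Intelligent Technology Co., Ltd. | Individual-soldier system with dual voice and fingerprint authentication |
CN106601238A (zh) * | 2015-10-14 | 2017-04-26 | Alibaba Group Holding Ltd. | Method and device for processing application operations |
CN106683661B (zh) * | 2015-11-05 | 2021-02-05 | Alibaba Group Holding Ltd. | Voice-based role separation method and device |
CN105575394A (zh) * | 2016-01-04 | 2016-05-11 | Beijing Times Ruilang Technology Co., Ltd. | Voiceprint recognition method based on hybrid modeling of total variability space and deep learning |
US10373612B2 (en) * | 2016-03-21 | 2019-08-06 | Amazon Technologies, Inc. | Anchored speech detection and speech recognition |
CN106024011A (zh) * | 2016-05-19 | 2016-10-12 | Zhongkai University of Agriculture and Engineering | Deep feature extraction method based on MOAS |
CN105869644A (zh) * | 2016-05-25 | 2016-08-17 | Baidu Online Network Technology (Beijing) Co., Ltd. | Voiceprint authentication method and device based on deep learning |
CN106019230B (zh) * | 2016-05-27 | 2019-01-08 | Nanjing University of Posts and Telecommunications | Sound source localization method based on i-vector speaker recognition |
CN107492382B (zh) * | 2016-06-13 | 2020-12-18 | Alibaba Group Holding Ltd. | Neural-network-based voiceprint information extraction method and device |
CN106098059B (zh) * | 2016-06-23 | 2019-06-18 | Shanghai Jiao Tong University | Customizable voice wake-up method and system |
CN106095733B (zh) * | 2016-06-23 | 2019-01-25 | Minjiang University | Improved deep-learning-based method for accurate extraction of natural language features |
US20180018973A1 (en) * | 2016-07-15 | 2018-01-18 | Google Inc. | Speaker verification |
CN106683680B (zh) * | 2017-03-10 | 2022-03-25 | Baidu Online Network Technology (Beijing) Co., Ltd. | Speaker recognition method and device, computer equipment, and computer-readable medium |
CN106960185B (zh) * | 2017-03-10 | 2019-10-25 | Shaanxi Normal University | Multi-pose face recognition method using a linear discriminant deep belief network |
CN107146624B (zh) * | 2017-04-01 | 2019-11-22 | Tsinghua University | Speaker verification method and device |
CN107452403B (zh) * | 2017-09-12 | 2020-07-07 | Tsinghua University | Speaker labeling method |
CN108417217B (zh) * | 2018-01-11 | 2021-07-13 | Sipic Technology Co., Ltd. | Speaker recognition network model training method, speaker recognition method, and system |
CN109545227B (zh) * | 2018-04-28 | 2023-05-09 | Central China Normal University | Method and system for automatic speaker gender recognition based on a deep autoencoder network |
CN110598840B (zh) * | 2018-06-13 | 2023-04-18 | Fujitsu Ltd. | Knowledge transfer method, information processing device, and storage medium |
CN110164452B (zh) | 2018-10-10 | 2023-03-10 | Tencent Technology (Shenzhen) Co., Ltd. | Voiceprint recognition method, model training method, and server |
CN109377984B (zh) * | 2018-11-22 | 2022-05-03 | Beijing Zhongke Zhijia Technology Co., Ltd. | ArcFace-based speech recognition method and device |
CN110033757A (zh) * | 2019-04-04 | 2019-07-19 | Xingzhi Technology Co., Ltd. | Human voice recognition algorithm |
CN109903774A (zh) * | 2019-04-12 | 2019-06-18 | Nanjing University | Voiceprint recognition method based on an angular margin loss function |
CN110047468B (zh) * | 2019-05-20 | 2022-01-25 | Beijing Dajia Internet Information Technology Co., Ltd. | Speech recognition method, device, and storage medium |
CN110719158B (zh) * | 2019-09-11 | 2021-11-23 | Nanjing University of Aeronautics and Astronautics | Edge computing privacy protection system and protection method based on joint learning |
CN111081255B (zh) * | 2019-12-31 | 2022-06-03 | Sipic Technology Co., Ltd. | Speaker verification method and device |
CN111462762B (zh) * | 2020-03-25 | 2023-02-24 | Tsinghua University | Speaker vector regularization method, device, electronic equipment, and storage medium |
CN111667836B (zh) * | 2020-06-19 | 2023-05-05 | Nanjing University | Text-independent multi-label speaker recognition method based on deep learning |
CN112071301B (zh) * | 2020-09-17 | 2022-04-08 | Beijing Didi Infinity Technology and Development Co., Ltd. | Speech synthesis processing method, device, equipment, and storage medium |
CN111933155B (zh) * | 2020-09-18 | 2020-12-25 | Beijing Aishu Wisdom Technology Co., Ltd. | Voiceprint recognition model training method, device, and computer system |
CN113241081B (zh) * | 2021-04-25 | 2023-06-16 | South China University of Technology | Far-field speaker authentication method and system based on a gradient reversal layer |
CN113705671B (zh) * | 2021-08-27 | 2023-08-29 | Xiamen University | Speaker recognition method and system based on text-related information perception |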
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0369485A2 (en) * | 1988-11-17 | 1990-05-23 | Sekisui Kagaku Kogyo Kabushiki Kaisha | Speaker recognition system |
CN103971690A (zh) * | 2013-01-28 | 2014-08-06 | Tencent Technology (Shenzhen) Co., Ltd. | Voiceprint recognition method and device |
CN104008751A (zh) * | 2014-06-18 | 2014-08-27 | Zhou Tingting | Speaker recognition method based on a BP neural network |
CN104143327A (zh) * | 2013-07-10 | 2014-11-12 | Tencent Technology (Shenzhen) Co., Ltd. | Acoustic model training method and device |
US9530417B2 (en) * | 2013-01-04 | 2016-12-27 | Stmicroelectronics Asia Pacific Pte Ltd. | Methods, systems, and circuits for text independent speaker recognition with automatic learning features |
Non-Patent Citations (1)
Title |
---|
"Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups"; Geoffrey Hinton et al.; IEEE Signal Processing Magazine; 2012-10-18; Vol. 29, No. 6; full text * |
Also Published As
Publication number | Publication date |
---|---|
CN104732978A (zh) | 2015-06-24 |
Similar Documents
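Publication | Title | Publication Date |
---|---|---|
CN104732978B (zh) | Text-dependent speaker recognition method based on joint deep learning | |
Cummins et al. | An image-based deep spectrum feature representation for the recognition of emotional speech | |
CN105869630B (zh) | Method and system for detecting speaker voice spoofing attacks based on deep learning | |
CN105632501B (zh) | Automatic accent classification method and device based on deep learning | |
CN110310647B (zh) | Voice identity feature extractor, classifier training method, and related equipment | |
TWI473080B (zh) | The use of phonological emotions or excitement to assist in resolving the gender or age of speech signals | |
CN108231067A (zh) | Acoustic scene recognition method based on convolutional neural networks and random forest classification | |
CN106952649A (zh) | Speaker recognition method based on convolutional neural networks and spectrograms | |
US20170154640A1 (en) | Method and electronic device for voice recognition based on dynamic voice model selection | |
CN106448684A (zh) | Channel-robust voiceprint recognition system based on deep belief network feature vectors | |
CN110428842A (zh) | Speech model training method, device, equipment, and computer-readable storage medium | |
CN105938716A (zh) | Automatic detection method for sample-copied speech based on multi-precision fitting | |
CN103400577A (zh) | Method and device for building an acoustic model for multilingual speech recognition | |
Huang et al. | Speech emotion recognition under white noise | |
CN102664010B (zh) | Robust speaker identification method based on multi-factor frequency-shift-invariant features | |
CN106898355B (zh) | Speaker recognition method based on secondary modeling | |
CN103985381A (zh) | Audio indexing method based on parameter-fusion optimized decision-making | |
CN109346084A (zh) | Speaker recognition method based on a deep stacked autoencoder network | |
CN107784215B (zh) | User authentication method and system performing lip reading with the audio device of a smart terminal | |
CN108962229A (zh) | Single-channel, unsupervised target speaker speech extraction method | |
CN108986798A (zh) | Speech data processing method, device, and equipment | |
Lei et al. | Speaker Recognition Using Wavelet Cepstral Coefficient, I‐Vector, and Cosine Distance Scoring and Its Application for Forensics | |
Sekkate et al. | Speaker identification for OFDM-based aeronautical communication system | |
Murugaiya et al. | Probability enhanced entropy (PEE) novel feature for improved bird sound classification | |
CN105845143A (zh) | Speaker verification method and system based on support vector machines | |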
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | |
PB01 | Publication | |
C10 | Entry into substantive examination | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
TR01 | Transfer of patent right | Effective date of registration: 2020-06-24; Address after: Room 105G, 199 GuoShoujing Road, Pudong New Area, Shanghai, 200120; Co-patentee after: AI SPEECH Co.,Ltd.; Patentee after: Shanghai Jiaotong University Intellectual Property Management Co.,Ltd.; Address before: 200240 Dongchuan Road, Shanghai, No. 800, No.; Co-patentee before: AI SPEECH Co.,Ltd.; Patentee before: SHANGHAI JIAO TONG University |
TR01 | Transfer of patent right | Effective date of registration: 2020-11-02; Address after: 215123 14 Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou, Jiangsu; Patentee after: AI SPEECH Co.,Ltd.; Address before: Room 105G, 199 GuoShoujing Road, Pudong New Area, Shanghai, 200120; Patentee before: Shanghai Jiaotong University Intellectual Property Management Co.,Ltd.; Patentee before: AI SPEECH Co.,Ltd. |
CP01 | Change in the name or title of a patent holder | Address after: 215123 14 Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou, Jiangsu; Patentee after: Sipic Technology Co.,Ltd.; Address before: 215123 14 Tengfei Innovation Park, 388 Xinping street, Suzhou Industrial Park, Suzhou, Jiangsu; Patentee before: AI SPEECH Co.,Ltd. |
PE01 | Entry into force of the registration of the contract for pledge of patent right | Denomination of invention: Text dependent speaker recognition method based on joint deep learning; Effective date of registration: 2023-07-26; Granted publication date: 2018-05-08; Pledgee: CITIC Bank Limited by Share Ltd. Suzhou branch; Pledgor: Sipic Technology Co.,Ltd.; Registration number: Y2023980049433 |