CN111883177B - Voice key information separation method based on deep learning - Google Patents
Voice key information separation method based on deep learning
- Publication number
- CN111883177B (application number CN202010681349.1A)
- Authority
- CN
- China
- Prior art keywords
- voice
- information
- voice information
- vector
- layer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- separation method (Methods): title, claims, abstract, description; count 29
- deep learning (Methods): title, claims, abstract, description; count 15
- method (Methods): claims, abstract, description; count 28
- training (Methods): claims, abstract, description; count 11
- process (Effects): claims, abstract, description; count 5
- vector (Substances): claims, description; count 61
- convolutional neural network (Methods): claims, description; count 27
- pooling (Methods): claims, description; count 22
- activation (Effects): claims, description; count 21
- algorithm (Methods): claims, description; count 17
- classification model (Methods): claims, description; count 17
- calculation method (Methods): claims, description; count 14
- mapping (Methods): claims, description; count 6
- transition (Effects): claims, description; count 6
- Protein Sorting Signals (Proteins): claims, description; count 3
- extraction (Methods): claims, description; count 3
- statistical method (Methods): claims, description; count 3
- effects (Effects): abstract, description; count 8
- analysis method (Methods): abstract, description; count 6
- processing (Methods): abstract, description; count 4
- artificial intelligence (Methods): abstract, description; count 2
- pre-processing (Methods): abstract, description; count 2
- accumulation (Methods): description; count 2
- adverse (Effects): description; count 2
- design (Methods): description; count 2
- diagram (Methods): description; count 2
- disappearance (Effects): description; count 2
- Averaging (Methods): description; count 1
- defect (Effects): description; count 1
- detection method (Methods): description; count 1
- engineering process (Methods): description; count 1
- filtration (Methods): description; count 1
- function (Effects): description; count 1
- interaction (Effects): description; count 1
- modification (Effects): description; count 1
- modification (Methods): description; count 1
- normalization (Methods): description; count 1
- reduction (Effects): description; count 1
- research (Methods): description; count 1
- response (Effects): description; count 1
- synthesizing (Effects): description; count 1
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/54—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for retrieval
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/27—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
- G10L25/30—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Telephonic Communication Services (AREA)
Abstract
Description
Claims (2)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010681349.1A CN111883177B (zh) | 2020-07-15 | 2020-07-15 | Voice key information separation method based on deep learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010681349.1A CN111883177B (zh) | 2020-07-15 | 2020-07-15 | Voice key information separation method based on deep learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111883177A (zh) | 2020-11-03 |
CN111883177B (zh) | 2023-08-04 |
Family
ID=73154487
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010681349.1A (Active) CN111883177B (zh) | Voice key information separation method based on deep learning | 2020-07-15 | 2020-07-15 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111883177B (zh) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
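CN113241092A (zh) * | 2021-06-15 | 2021-08-10 | Xinjiang University | Sound source separation method based on a dual attention mechanism and a multi-stage hybrid convolutional network |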
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
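CN104575544A (zh) * | 2008-10-24 | 2015-04-29 | The Nielsen Company (US), LLC | Methods and apparatus to extract identification information from media |
CN109858482A (zh) * | 2019-01-16 | 2019-06-07 | AInnovation (Chongqing) Technology Co., Ltd. | Image key region detection method, system, and terminal device |
CN109859770A (zh) * | 2019-01-04 | 2019-06-07 | Ping An Technology (Shenzhen) Co., Ltd. | Music separation method, device, and computer-readable storage medium |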
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11373672B2 (en) * | 2016-06-14 | 2022-06-28 | The Trustees Of Columbia University In The City Of New York | Systems and methods for speech separation and neural decoding of attentional selection in multi-speaker environments |
US10642846B2 (en) * | 2017-10-13 | 2020-05-05 | Microsoft Technology Licensing, Llc | Using a generative adversarial network for query-keyword matching |
- 2020-07-15: CN application CN202010681349.1A, patent CN111883177B (zh), status Active
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
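CN104575544A (zh) * | 2008-10-24 | 2015-04-29 | The Nielsen Company (US), LLC | Methods and apparatus to extract identification information from media |
CN109859770A (zh) * | 2019-01-04 | 2019-06-07 | Ping An Technology (Shenzhen) Co., Ltd. | Music separation method, device, and computer-readable storage medium |
CN109858482A (zh) * | 2019-01-16 | 2019-06-07 | AInnovation (Chongqing) Technology Co., Ltd. | Image key region detection method, system, and terminal device |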
Also Published As
Publication number | Publication date |
---|---|
CN111883177A (zh) | 2020-11-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
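CN111680706B (zh) | | Dual-channel output contour detection method based on an encoder-decoder structure |
CN113220919B (zh) | | Cross-modal image-text retrieval method and model for dam defects |
CN106228980B (zh) | | Data processing method and device |
CN110188047B (zh) | | Duplicate defect report detection method based on a dual-channel convolutional neural network |
CN109934269B (zh) | | Open-set recognition method and device for electromagnetic signals |
CN107562938A (zh) | | Intelligent court trial method |
CN110097096B (zh) | | Text classification method based on a TF-IDF matrix and capsule network |
CN116167010B (zh) | | Rapid identification method for power system abnormal events with intelligent transfer learning capability |
CN111128128B (zh) | | Voice keyword detection method based on complementary model score fusion |
CN104978569B (zh) | | Incremental face recognition method based on sparse representation |
CN111461025A (zh) | | Signal recognition method using self-evolving zero-shot learning |
CN106528527A (zh) | | Method and system for recognizing out-of-vocabulary words |
CN109800309A (zh) | | Classroom discourse type classification method and device |
CN111984790B (zh) | | Entity relation extraction method |
CN112949288B (zh) | | Text error detection method based on character sequences |
CN111883177B (zh) | | Voice key information separation method based on deep learning |
CN115064154A (zh) | | Method and device for generating a mixed-language speech recognition model |
CN114357206A (zh) | | Method and system for generating color subtitles for educational videos based on semantic analysis |
CN116738332A (zh) | | Aircraft multi-scale signal classification, recognition, and fault detection method incorporating an attention mechanism |
CN110289004B (zh) | | Deep-learning-based system and method for detecting artificially synthesized voiceprints |
CN115588112A (zh) | | Object detection method based on RFEF-YOLO |
CN117235137B (zh) | | Occupational information query method and device based on a vector database |
CN111833856B (zh) | | Voice key information calibration method based on deep learning |
CN112489689B (zh) | | Cross-database speech emotion recognition method and device based on multi-scale discrepancy adversarial learning |
CN116977834B (zh) | | Method for recognizing in-distribution and out-of-distribution images under open conditions |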
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | TA01 | Transfer of patent application right | Effective date of registration: 2023-07-11. Address after: Room 621, South Building, Torch Plaza, No. 56-58, Torch Garden, Torch Hi-Tech Zone, Xiamen City, Fujian Province, 361000. Applicants after: XIAMEN HEROCHEER ELECTRONIC TECHNOLOGY CO.,LTD.; Shanghai Xizhong Technology Co.,Ltd.; Xiamen Xiquan Sports Technology Co.,Ltd. Address before: Room 621, South Building, Torch Plaza, No. 56-58, Torch Garden, Torch Hi-Tech Zone, Xiamen City, Fujian Province, 361000. Applicant before: XIAMEN HEROCHEER ELECTRONIC TECHNOLOGY CO.,LTD. |
| | GR01 | Patent grant | |