CN111899756A - Single-channel speech separation method and device - Google Patents
Single-channel speech separation method and device
- Publication number
- CN111899756A (application CN202011057720.3A)
- Authority
- CN
- China
- Prior art keywords
- target
- voice
- phase
- spectrum
- amplitude
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Concepts
Concept | Type | Sections | Count |
---|---|---|---|
separation method | Methods | title, claims, abstract, description | 32 |
spectrum | Methods | claims, abstract, description | 85 |
spectral effect | Effects | claims, abstract, description | 31 |
neural network model | Methods | claims, abstract, description | 28 |
vectors | Substances | claims, abstract, description | 15 |
masking effect | Effects | claims, description | 18 |
method | Methods | claims, description | 6 |
framing | Methods | claims, description | 5 |
transforming effect | Effects | claims, description | 4 |
extraction | Methods | claims, description | 3 |
normalization | Methods | claims, description | 2 |
sampling | Methods | claims, description | 2 |
function | Effects | description | 6 |
engineering process | Methods | description | 3 |
calculation method | Methods | description | 2 |
effects | Effects | description | 2 |
analytical method | Methods | description | 1 |
artificial neural network | Methods | description | 1 |
diagram | Methods | description | 1 |
environmental effect | Effects | description | 1 |
frameshift | Effects | description | 1 |
gradient descent method | Methods | description | 1 |
neuron | Anatomy | description | 1 |
transition | Effects | description | 1 |
visual effect | Effects | description | 1 |
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0272—Voice signal separating
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S5/00—Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations
- G01S5/18—Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations using ultrasonic, sonic, or infrasonic waves
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0272—Voice signal separating
- G10L21/028—Voice signal separating using properties of sound source
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/27—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
- G10L25/30—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Multimedia (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Acoustics & Sound (AREA)
- Computational Linguistics (AREA)
- Human Computer Interaction (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Quality & Reliability (AREA)
- Radar, Positioning & Navigation (AREA)
- Remote Sensing (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Circuit For Audible Band Transducer (AREA)
Claims (7)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011057720.3A (CN111899756B, zh) | 2020-09-29 | 2020-09-29 | Single-channel speech separation method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011057720.3A (CN111899756B, zh) | 2020-09-29 | 2020-09-29 | Single-channel speech separation method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111899756A (zh) | 2020-11-06 |
CN111899756B (zh) | 2021-04-09 |
Family
ID=73224084
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011057720.3A (CN111899756B, zh; status: Active) | Single-channel speech separation method and device | 2020-09-29 | 2020-09-29 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111899756B (zh) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
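CN112382306A (zh) * | 2020-12-02 | 2021-02-19 | Suzhou AISpeech Information Technology Co., Ltd. | Method and device for separating speaker audio |
CN113539293A (zh) * | 2021-08-10 | 2021-10-22 | Nanjing University of Posts and Telecommunications | Single-channel speech separation method based on convolutional neural network and joint optimization |
CN113921022A (zh) * | 2021-12-13 | 2022-01-11 | Beijing Century TAL Education Technology Co., Ltd. | Audio signal separation method and device, storage medium, and electronic equipment |
CN114446316A (zh) * | 2022-01-27 | 2022-05-06 | Tencent Technology (Shenzhen) Co., Ltd. | Audio separation method, and training method, device, and equipment for an audio separation model |
CN114678037A (zh) * | 2022-04-13 | 2022-06-28 | Beijing Yuanjian Information Technology Co., Ltd. | Overlapping speech detection method and device, electronic equipment, and storage medium |
CN115862669A (zh) * | 2022-11-29 | 2023-03-28 | Nanjing Lingxing Technology Co., Ltd. | Method and device for ensuring ride safety, electronic equipment, and storage medium |
CN117727312A (zh) * | 2023-12-12 | 2024-03-19 | Guangzhou Fuxi Intelligent Technology Co., Ltd. | Target noise separation method, system, and terminal equipment |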
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP2023343A1 (en) * | 2007-08-09 | 2009-02-11 | HONDA MOTOR CO., Ltd. | Sound-source separation system |
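CN103170068A (zh) * | 2013-04-15 | 2013-06-26 | Nanjing University | Quantitative determination method for the nonlinear sound field of a phased array |
CN103811020A (zh) * | 2014-03-05 | 2014-05-21 | Northeastern University | Intelligent speech processing method |
CN109887494A (zh) * | 2017-12-01 | 2019-06-14 | Tencent Technology (Shenzhen) Co., Ltd. | Method and device for reconstructing a speech signal |
CN110544482A (zh) * | 2019-09-09 | 2019-12-06 | Jixianyuan (Hangzhou) Intelligent Technology Co., Ltd. | Single-channel speech separation system |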
2020
- 2020-09-29: CN application CN202011057720.3A (patent CN111899756B, zh), status: Active
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
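CN112382306A (zh) * | 2020-12-02 | 2021-02-19 | Suzhou AISpeech Information Technology Co., Ltd. | Method and device for separating speaker audio |
CN112382306B (zh) * | 2020-12-02 | 2022-05-10 | AISpeech Co., Ltd. | Method and device for separating speaker audio |
CN113539293A (zh) * | 2021-08-10 | 2021-10-22 | Nanjing University of Posts and Telecommunications | Single-channel speech separation method based on convolutional neural network and joint optimization |
CN113539293B (zh) * | 2021-08-10 | 2023-12-26 | Nanjing University of Posts and Telecommunications | Single-channel speech separation method based on convolutional neural network and joint optimization |
CN113921022A (zh) * | 2021-12-13 | 2022-01-11 | Beijing Century TAL Education Technology Co., Ltd. | Audio signal separation method and device, storage medium, and electronic equipment |
CN114446316A (zh) * | 2022-01-27 | 2022-05-06 | Tencent Technology (Shenzhen) Co., Ltd. | Audio separation method, and training method, device, and equipment for an audio separation model |
CN114446316B (zh) * | 2022-01-27 | 2024-03-12 | Tencent Technology (Shenzhen) Co., Ltd. | Audio separation method, and training method, device, and equipment for an audio separation model |
CN114678037A (zh) * | 2022-04-13 | 2022-06-28 | Beijing Yuanjian Information Technology Co., Ltd. | Overlapping speech detection method and device, electronic equipment, and storage medium |
CN114678037B (zh) * | 2022-04-13 | 2022-10-25 | Beijing Yuanjian Information Technology Co., Ltd. | Overlapping speech detection method and device, electronic equipment, and storage medium |
CN115862669A (zh) * | 2022-11-29 | 2023-03-28 | Nanjing Lingxing Technology Co., Ltd. | Method and device for ensuring ride safety, electronic equipment, and storage medium |
CN117727312A (zh) * | 2023-12-12 | 2024-03-19 | Guangzhou Fuxi Intelligent Technology Co., Ltd. | Target noise separation method, system, and terminal equipment |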
Also Published As
Publication number | Publication date |
---|---|
CN111899756B (zh) | 2021-04-09 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN111899756B (zh) | Single-channel speech separation method and device | |
CN112331218B (zh) | Single-channel speech separation method and device for multiple speakers | |
Yegnanarayana et al. | Processing of reverberant speech for time-delay estimation | |
Nesta et al. | Convolutive underdetermined source separation through weighted interleaved ICA and spatio-temporal source correlation | |
CN106847301A (zh) | Binaural speech separation method based on compressed sensing and spatial orientation information | |
Kumar et al. | Non-negative matrix based optimization scheme for blind source separation in automatic speech recognition system | |
Haridas et al. | A novel approach to improve the speech intelligibility using fractional delta-amplitude modulation spectrogram | |
Do et al. | Speech Separation in the Frequency Domain with Autoencoder. | |
Xiao et al. | Beamforming networks using spatial covariance features for far-field speech recognition | |
Gul et al. | Integration of deep learning with expectation maximization for spatial cue-based speech separation in reverberant conditions | |
KR101802444B1 (ko) | Robust speech recognition apparatus and method using Bayesian feature enhancement with independent vector analysis and reverberation filter re-estimation | |
Zhang et al. | Multi-Target Ensemble Learning for Monaural Speech Separation. | |
KR100969138B1 (ko) | Noise mask estimation method using a hidden Markov model and apparatus for performing the same | |
Girin et al. | Audio source separation into the wild | |
Cobos et al. | Two-microphone separation of speech mixtures based on interclass variance maximization | |
Yoshioka et al. | Dereverberation by using time-variant nature of speech production system | |
Marti et al. | Automatic speech recognition in cocktail-party situations: A specific training for separated speech | |
Jafari et al. | Underdetermined blind source separation with fuzzy clustering for arbitrarily arranged sensors | |
Meutzner et al. | A generative-discriminative hybrid approach to multi-channel noise reduction for robust automatic speech recognition | |
Shareef et al. | Comparison between features extraction techniques for impairments arabic speech | |
He et al. | Mask-based blind source separation and MVDR beamforming in ASR | |
Jahanirad et al. | Blind source computer device identification from recorded VoIP calls for forensic investigation | |
Al-Ali et al. | Enhanced forensic speaker verification performance using the ICA-EBM algorithm under noisy and reverberant environments | |
Adiloğlu et al. | A general variational Bayesian framework for robust feature extraction in multisource recordings | |
KR20100056859A (ko) | Speech recognition apparatus and method |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
TR01 | Transfer of patent right | Effective date of registration: 2022-12-08. Patentee before: Beijing Qingwei Intelligent Technology Co., Ltd. (201, 2nd Floor, Building 26, Yard 1, Baosheng South Road, Haidian District, Beijing, 100192). Patentee after: Shanghai Qingwei Intelligent Technology Co., Ltd. (Room 3068, Floor 3, Building 2, No. 602, Tongpu Road, Putuo District, Shanghai, 200062). |
CB03 | Change of inventor or designer information | Inventors before: Shi Huiyu, OuYang Peng, Yin Shouyi. Inventors after: Shi Huiyu, OuYang Peng. |