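CN106796803A - Method and apparatus for separating speech data from background data in audio communication - Google Patents
Method and apparatus for separating speech data from background data in audio communication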
- Publication number
- CN106796803A (application CN201580055548.9A)
- Authority
- CN
- China
- Prior art keywords
- voice communication
- speech
- caller
- model
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
- 238000004891 communication Methods 0.000 title claims abstract description 102
- 238000000034 method Methods 0.000 title claims abstract description 43
- 238000004590 computer program Methods 0.000 claims description 7
- 238000001228 spectrum Methods 0.000 description 11
- 238000004422 calculation algorithm Methods 0.000 description 8
- 230000003595 spectral effect Effects 0.000 description 6
- 238000000926 separation method Methods 0.000 description 5
- 238000001514 detection method Methods 0.000 description 4
- 239000000203 mixture Substances 0.000 description 4
- 230000003044 adaptive effect Effects 0.000 description 2
- 238000004458 analytical method Methods 0.000 description 2
- 239000004568 cement Substances 0.000 description 2
- 230000002349 favourable effect Effects 0.000 description 2
- 230000006870 function Effects 0.000 description 2
- 230000013016 learning Effects 0.000 description 2
- 238000012545 processing Methods 0.000 description 2
- 238000012549 training Methods 0.000 description 2
- 206010011469 Crying Diseases 0.000 description 1
- 241001269238 Data Species 0.000 description 1
- 201000007902 Primary cutaneous amyloidosis Diseases 0.000 description 1
- 241000220317 Rosa Species 0.000 description 1
- 230000018199 S phase Effects 0.000 description 1
- 230000005534 acoustic noise Effects 0.000 description 1
- 230000006978 adaptation Effects 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 238000004364 calculation method Methods 0.000 description 1
- 238000013500 data storage Methods 0.000 description 1
- 238000010586 diagram Methods 0.000 description 1
- 201000010099 disease Diseases 0.000 description 1
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 1
- 230000005611 electricity Effects 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 230000002708 enhancing effect Effects 0.000 description 1
- 239000011159 matrix material Substances 0.000 description 1
- 230000005039 memory span Effects 0.000 description 1
- 238000007639 printing Methods 0.000 description 1
- 230000011218 segmentation Effects 0.000 description 1
- 238000005728 strengthening Methods 0.000 description 1
- 238000012360 testing method Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0272—Voice signal separating
- G10L21/028—Voice signal separating using properties of sound source
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
Landscapes
- Engineering & Computer Science (AREA)
- Human Computer Interaction (AREA)
- Quality & Reliability (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Telephonic Communication Services (AREA)
- Telephone Function (AREA)
- Time-Division Multiplex Systems (AREA)
Abstract
Description
Claims (14)
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP14306623.1 | 2014-10-14 | ||
EP14306623.1A EP3010017A1 (en) | 2014-10-14 | 2014-10-14 | Method and apparatus for separating speech data from background data in audio communication |
PCT/EP2015/073526 WO2016058974A1 (en) | 2014-10-14 | 2015-10-12 | Method and apparatus for separating speech data from background data in audio communication |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106796803A (zh) | 2017-05-31 |
CN106796803B (zh) | 2023-09-19 |
Family
ID=51844642
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201580055548.9A Active CN106796803B (zh) | 2014-10-14 | 2015-10-12 | Method and apparatus for separating speech data from background data in audio communication |
Country Status (7)
Country | Link |
---|---|
US (1) | US9990936B2 (zh) |
EP (2) | EP3010017A1 (zh) |
JP (1) | JP6967966B2 (zh) |
KR (2) | KR20170069221A (zh) |
CN (1) | CN106796803B (zh) |
TW (1) | TWI669708B (zh) |
WO (1) | WO2016058974A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI801085B (zh) * | 2022-01-07 | 2023-05-01 | 矽響先創科技股份有限公司 | Noise reduction method for intelligent network communication |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10621990B2 (en) | 2018-04-30 | 2020-04-14 | International Business Machines Corporation | Cognitive print speaker modeler |
US10811007B2 (en) * | 2018-06-08 | 2020-10-20 | International Business Machines Corporation | Filtering audio-based interference from voice commands using natural language processing |
CN112562726B (zh) * | 2020-10-27 | 2022-05-27 | Kunming University of Science and Technology | Speech and music separation method based on an MFCC similarity matrix |
US11462219B2 (en) * | 2020-10-30 | 2022-10-04 | Google Llc | Voice filtering other speakers from calls and audio messages |
KR20230158462A (ko) | 2021-03-23 | 2023-11-20 | Toray Engineering Co., Ltd. | Laminate manufacturing apparatus and method for forming a self-assembled monolayer |
Citations (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
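CN1265217A (zh) * | 1997-07-02 | 2000-08-30 | Simoco International Ltd. | Method and apparatus for speech enhancement in a speech communication system |
CN1313983A (zh) * | 1999-06-15 | 2001-09-19 | Matsushita Electric Industrial Co., Ltd. | Noise signal encoding apparatus and speech signal encoding apparatus |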
US20030216911A1 (en) * | 2002-05-20 | 2003-11-20 | Li Deng | Method of noise reduction based on dynamic aspects of speech |
US6766295B1 (en) * | 1999-05-10 | 2004-07-20 | Nuance Communications | Adaptation of a speech recognition system across multiple remote sessions with a speaker |
US20070021958A1 (en) * | 2005-07-22 | 2007-01-25 | Erik Visser | Robust separation of speech signals in a noisy environment |
CN101166017A (zh) * | 2006-10-20 | 2008-04-23 | Matsushita Electric Industrial Co., Ltd. | Automatic noise compensation method and apparatus for a sound-producing device |
US20100131086A1 (en) * | 2007-04-13 | 2010-05-27 | Kyoto University | Sound source separation system, sound source separation method, and computer program for sound source separation |
US20100332237A1 (en) * | 2009-06-30 | 2010-12-30 | Kabushiki Kaisha Toshiba | Sound quality correction apparatus, sound quality correction method and sound quality correction program |
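JP2011191337A (ja) * | 2010-03-11 | 2011-09-29 | Nara Institute Of Science & Technology | Noise suppression device, method, and program |
CN102903360A (zh) * | 2011-07-26 | 2013-01-30 | Industrial Technology Research Institute | Microphone-array-based speech recognition system and method |
CN102903368A (zh) * | 2011-07-29 | 2013-01-30 | Dolby Laboratories Licensing Corporation | Method and apparatus for convolutive blind source separation |
JP2013114151A (ja) * | 2011-11-30 | 2013-06-10 | Nippon Telegr & Teleph Corp <Ntt> | Noise suppression device, method, and program |
CN103238181A (zh) * | 2010-12-07 | 2013-08-07 | Mitsubishi Electric Corporation | Method for restoring spectral components attenuated in a test denoised speech signal as a result of denoising a test speech signal |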
US20130332165A1 (en) * | 2012-06-06 | 2013-12-12 | Qualcomm Incorporated | Method and systems having improved speech recognition |
CN103617798A (zh) * | 2013-12-04 | 2014-03-05 | General Hospital of Chengdu Military Region of the PLA | Speech extraction method under strong background noise |
Family Cites Families (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5946654A (en) | 1997-02-21 | 1999-08-31 | Dragon Systems, Inc. | Speaker identification using unsupervised speech models |
JP2002330193A (ja) * | 2001-05-07 | 2002-11-15 | Sony Corp | Telephone call apparatus and method, recording medium, and program |
US7072834B2 (en) * | 2002-04-05 | 2006-07-04 | Intel Corporation | Adapting to adverse acoustic environment in speech processing using playback training data |
US20040122672A1 (en) * | 2002-12-18 | 2004-06-24 | Jean-Francois Bonastre | Gaussian model-based dynamic time warping system and method for speech processing |
US7231019B2 (en) | 2004-02-12 | 2007-06-12 | Microsoft Corporation | Automatic identification of telephone callers based on voice characteristics |
JP2006201496A (ja) * | 2005-01-20 | 2006-08-03 | Matsushita Electric Ind Co Ltd | Filtering device |
KR100766061B1 (ko) * | 2005-12-09 | 2007-10-11 | Electronics and Telecommunications Research Institute | Speaker adaptation method and apparatus |
JP2007184820A (ja) * | 2006-01-10 | 2007-07-19 | Kenwood Corp | Receiver and method for correcting a received audio signal |
ATE536611T1 (de) * | 2006-02-14 | 2011-12-15 | Intellectual Ventures Fund 21 Llc | Communication device with speaker-independent speech recognition |
US8121837B2 (en) * | 2008-04-24 | 2012-02-21 | Nuance Communications, Inc. | Adjusting a speech engine for a mobile computing device based on background noise |
US8077836B2 (en) * | 2008-07-30 | 2011-12-13 | At&T Intellectual Property, I, L.P. | Transparent voice registration and verification method and system |
BR112012031656A2 (pt) * | 2010-08-25 | 2016-11-08 | Asahi Chemical Ind | Sound source separation device and method, and program |
US8886526B2 (en) * | 2012-05-04 | 2014-11-11 | Sony Computer Entertainment Inc. | Source separation using independent component analysis with mixed multi-variate probability density function |
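CN102915742B (zh) * | 2012-10-30 | 2014-07-30 | PLA University of Science and Technology | Single-channel unsupervised speech-noise separation method based on low-rank and sparse matrix decomposition |
CN103871423A (zh) * | 2012-12-13 | 2014-06-18 | 上海八方视界网络科技有限公司 | Audio separation method based on non-negative matrix factorization (NMF) |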
US9886968B2 (en) * | 2013-03-04 | 2018-02-06 | Synaptics Incorporated | Robust speech boundary detection system and method |
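CN103559888B (zh) * | 2013-11-07 | 2016-10-05 | 航空电子系统综合技术重点实验室 | Speech enhancement method based on the principle of non-negative low-rank and sparse matrix decomposition |
CN103903632A (zh) * | 2014-04-02 | 2014-07-02 | Chongqing University of Posts and Telecommunications | Speech separation method based on the central auditory system in a multi-source environment |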
2014
- 2014-10-14 EP EP14306623.1A patent/EP3010017A1/en not_active Withdrawn
2015
- 2015-10-02 TW TW104132463A patent/TWI669708B/zh active
- 2015-10-12 US US15/517,953 patent/US9990936B2/en active Active
- 2015-10-12 WO PCT/EP2015/073526 patent/WO2016058974A1/en active Application Filing
- 2015-10-12 KR KR1020177009838A patent/KR20170069221A/ko active Application Filing
- 2015-10-12 JP JP2017518295A patent/JP6967966B2/ja active Active
- 2015-10-12 EP EP15778666.6A patent/EP3207543B1/en active Active
- 2015-10-12 KR KR1020237001962A patent/KR102702715B1/ko active IP Right Grant
- 2015-10-12 CN CN201580055548.9A patent/CN106796803B/zh active Active
Non-Patent Citations (2)
Title |
---|
Zhiyao Duan et al.: "Online PLCA for Real-Time Semi-supervised Source Separation", in LVA/ICA 2012: Latent Variable Analysis and Signal Separation * |
Zhiyao Duan et al.: "Online PLCA for Real-Time Semi-supervised Source Separation", in LVA/ICA 2012: Latent Variable Analysis and Signal Separation, 31 December 2012 (2012-12-31), pages 34-41, XP019172729, DOI: 10.1007/978-3-642-28551-6_5 * |
Also Published As
Publication number | Publication date |
---|---|
JP2017532601A (ja) | 2017-11-02 |
KR102702715B1 (ko) | 2024-09-05 |
CN106796803B (zh) | 2023-09-19 |
TW201614642A (en) | 2016-04-16 |
US9990936B2 (en) | 2018-06-05 |
TWI669708B (zh) | 2019-08-21 |
KR20170069221A (ko) | 2017-06-20 |
JP6967966B2 (ja) | 2021-11-17 |
WO2016058974A1 (en) | 2016-04-21 |
KR20230015515A (ko) | 2023-01-31 |
EP3010017A1 (en) | 2016-04-20 |
US20170309291A1 (en) | 2017-10-26 |
EP3207543A1 (en) | 2017-08-23 |
EP3207543B1 (en) | 2024-03-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Xu et al. | Dynamic noise aware training for speech enhancement based on deep neural networks. | |
CN106796803A (zh) | Method and apparatus for separating speech data from background data in audio communication | |
Xu et al. | Listening to sounds of silence for speech denoising | |
CN108899047B (zh) | Masking threshold estimation method, apparatus, and storage medium for audio signals | |
WO2022107393A1 (en) | A neural-network-based approach for speech denoising | |
US20110218803A1 (en) | Method and system for assessing intelligibility of speech represented by a speech signal | |
KR102152197B1 (ko) | 음성 검출기를 구비한 보청기 및 그 방법 | |
He et al. | Multiplicative update of auto-regressive gains for codebook-based speech enhancement | |
KR20180125385A (ko) | 잡음 환경 분류 및 제거 기능을 갖는 보청기 및 그 방법 | |
Gupta et al. | Speech feature extraction and recognition using genetic algorithm | |
Martín-Doñas et al. | Dual-channel DNN-based speech enhancement for smartphones | |
Hou et al. | Domain adversarial training for speech enhancement | |
Balasubramanian et al. | Ideal ratio mask estimation based on cochleagram for audio-visual monaural speech enhancement | |
Samui et al. | Tensor-train long short-term memory for monaural speech enhancement | |
Leutnant et al. | Bayesian feature enhancement for reverberation and noise robust speech recognition | |
Abel et al. | Cognitively inspired audiovisual speech filtering: towards an intelligent, fuzzy based, multimodal, two-stage speech enhancement system | |
Balasubramanian et al. | Estimation of ideal binary mask for audio-visual monaural speech enhancement | |
CN106971735A (zh) | Method and system for voiceprint recognition with periodic updating of training sentences in a cache | |
Yoshida et al. | Audio-visual voice activity detection based on an utterance state transition model | |
Soni et al. | Comparing front-end enhancement techniques and multiconditioned training for robust automatic speech recognition | |
Mital | Speech enhancement for automatic analysis of child-centered audio recordings | |
Ming et al. | Wide matching—An approach to improving noise robustness for speech enhancement | |
Vasuki et al. | Emotion recognition using ensemble of cepstral, perceptual and temporal features | |
Samanta et al. | RETRACTED ARTICLE: An energy-efficient voice activity detector using reconfigurable Gaussian base normalization deep neural network | |
TN et al. | An Improved Method for Speech Enhancement Using Convolutional Neural Network Approach |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
TA01 | Transfer of patent application right | Effective date of registration: 2019-06-10. Address after: Paris, France. Applicant after: InterDigital CE Patent Holdings. Address before: Issy-les-Moulineaux, France. Applicant before: THOMSON LICENSING |
TA01 | Transfer of patent application right | Effective date of registration: 2021-01-27. Address after: Paris, France. Applicant after: InterDigital Madison Patent Holdings. Address before: Paris, France. Applicant before: InterDigital CE Patent Holdings |
GR01 | Patent grant | |