CN1737906A - Isolating speech signals utilizing neural networks - Google Patents

Isolating speech signals utilizing neural networks

Info

Publication number
CN1737906A
Authority
CN
China
Prior art keywords
signal
estimate
sound signal
voice signal
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CNA2005100677770A
Other languages
English (en)
Chinese (zh)
Inventor
P. Hetherington
P. Zakarauskas
S. Parveen
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harman Becker Automotive Systems - Wavemakers, Inc.
Harman Becker Automotive Systems GmbH
Original Assignee
Harman Becker Automotive Systems - Wavemakers, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harman Becker Automotive Systems - Wavemakers, Inc.
Publication of CN1737906A

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0272 Voice signal separating
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/27 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
    • G10L25/30 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Circuit For Audible Band Transducer (AREA)
  • Noise Elimination (AREA)
  • Soundproofing, Sound Blocking, And Sound Damping (AREA)
  • Cable Transmission Systems, Equalization Of Radio And Reduction Of Echo (AREA)
CNA2005100677770A 2004-03-23 2005-03-22 Isolating speech signals utilizing neural networks Pending CN1737906A (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US55558204P 2004-03-23 2004-03-23
US60/555,582 2004-03-23

Publications (1)

Publication Number Publication Date
CN1737906A (zh) 2006-02-22

Family

ID=34860539

Family Applications (1)

Application Number Title Priority Date Filing Date
CNA2005100677770A Pending CN1737906A (zh) 2004-03-23 2005-03-22 Isolating speech signals utilizing neural networks

Country Status (7)

Country Link
US (1) US7620546B2
EP (1) EP1580730B1
JP (1) JP2005275410A
KR (1) KR20060044629A
CN (1) CN1737906A
CA (1) CA2501989C
DE (1) DE602005009419D1

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
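CN104867495A (zh) * 2013-08-28 2015-08-26 Texas Instruments Incorporated Context-aware sound signature detection
CN105359209A (zh) * 2013-06-21 2016-02-24 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and method for improved signal fade out in different domains during error concealment
CN105741848A (zh) * 2010-04-14 2016-07-06 Google Inc. Geotagged environmental audio for enhanced speech recognition accuracy
CN106683663A (zh) * 2015-11-06 2017-05-17 Samsung Electronics Co., Ltd. Neural network training apparatus and method, and speech recognition apparatus and method
CN107481728A (zh) * 2017-09-29 2017-12-15 Baidu Online Network Technology (Beijing) Co., Ltd. Background sound elimination method, apparatus, and terminal device
CN108470476A (zh) * 2018-05-15 2018-08-31 Huanghuai University English pronunciation matching and correction system
CN110797021A (zh) * 2018-05-24 2020-02-14 Tencent Technology (Shenzhen) Company Limited Mixed speech recognition network training method, mixed speech recognition method, apparatus, and storage medium
CN112562710A (zh) * 2020-11-27 2021-03-26 Tianjin University Stepwise speech enhancement method based on deep learning
CN114187914A (zh) * 2021-12-17 2022-03-15 Guangdong Power Grid Co., Ltd. Speech recognition method and system
WO2024055751A1 (zh) * 2022-09-13 2024-03-21 Tencent Technology (Shenzhen) Company Limited Audio data processing method, apparatus, device, storage medium, and program product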
US20250391420A1 (en) * 2024-06-21 2025-12-25 Bank Of America Corporation System and method for adaptive audio segmentation for contextual speech signal processing

Families Citing this family (36)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101615262B1 (ko) * 2009-08-12 2016-04-26 Samsung Electronics Co., Ltd. Method and apparatus for encoding and decoding multi-channel audio using semantic information
US8768406B2 (en) * 2010-08-11 2014-07-01 Bone Tone Communications Ltd. Background sound removal for privacy and personalization use
US8239196B1 (en) * 2011-07-28 2012-08-07 Google Inc. System and method for multi-channel multi-feature speech/noise classification for noise suppression
US9390712B2 (en) * 2014-03-24 2016-07-12 Microsoft Technology Licensing, Llc. Mixed speech recognition
US10832138B2 (en) 2014-11-27 2020-11-10 Samsung Electronics Co., Ltd. Method and apparatus for extending neural network
JP6348427B2 (ja) * 2015-02-05 2018-06-27 Nippon Telegraph and Telephone Corporation Noise removal device and noise removal program
US10741195B2 (en) * 2016-02-15 2020-08-11 Mitsubishi Electric Corporation Sound signal enhancement device
DE112017001830B4 (de) * 2016-05-06 2024-02-22 Robert Bosch Gmbh Speech enhancement and audio event detection for an environment with non-stationary noise
US9875747B1 (en) * 2016-07-15 2018-01-23 Google Llc Device specific multi-channel data compression
US10276187B2 (en) * 2016-10-19 2019-04-30 Ford Global Technologies, Llc Vehicle ambient audio classification via neural network machine learning
US10714118B2 (en) * 2016-12-30 2020-07-14 Facebook, Inc. Audio compression using an artificial neural network
JP6673861B2 (ja) * 2017-03-02 2020-03-25 Nippon Telegraph and Telephone Corporation Signal processing device, signal processing method, and signal processing program
US12106214B2 (en) 2017-05-17 2024-10-01 Samsung Electronics Co., Ltd. Sensor transformation attention network (STAN) model
US11501154B2 (en) 2017-05-17 2022-11-15 Samsung Electronics Co., Ltd. Sensor transformation attention network (STAN) model
US10170137B2 (en) 2017-05-18 2019-01-01 International Business Machines Corporation Voice signal component forecaster
US11321604B2 (en) * 2017-06-21 2022-05-03 Arm Ltd. Systems and devices for compressing neural network parameters
US11270198B2 (en) 2017-07-31 2022-03-08 Syntiant Microcontroller interface for audio signal processing
US11545162B2 (en) * 2017-10-24 2023-01-03 Samsung Electronics Co., Ltd. Audio reconstruction method and device which use machine learning
US10283140B1 (en) * 2018-01-12 2019-05-07 Alibaba Group Holding Limited Enhancing audio signals using sub-band deep neural networks
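CN108648527B (zh) * 2018-05-15 2020-07-24 Huanghuai University English pronunciation matching and correction method
CN110503967B (zh) * 2018-05-17 2021-11-19 China Mobile Communication Co., Ltd. Research Institute Speech enhancement method, apparatus, medium, and device
CN108806707B (zh) * 2018-06-11 2020-05-12 Baidu Online Network Technology (Beijing) Co., Ltd. Speech processing method, apparatus, device, and storage medium
EP3644565A1 (en) * 2018-10-25 2020-04-29 Nokia Solutions and Networks Oy Reconstructing a channel frequency response curve
CN109545228A (zh) * 2018-12-14 2019-03-29 Xiamen Kuaishangtong Information Technology Co., Ltd. End-to-end speaker segmentation method and system
JP7242903B2 (ja) 2019-05-14 2023-03-20 Dolby Laboratories Licensing Corporation Method and apparatus for speech source separation based on a convolutional neural network
KR20200132645A (ko) 2019-05-16 2020-11-25 Samsung Electronics Co., Ltd. Apparatus and method for providing a speech recognition service
WO2020255242A1 (ja) * 2019-06-18 2020-12-24 Nippon Telegraph and Telephone Corporation Restoration device, restoration method, and program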
US11514928B2 (en) * 2019-09-09 2022-11-29 Apple Inc. Spatially informed audio signal processing for user speech
US11257510B2 (en) 2019-12-02 2022-02-22 International Business Machines Corporation Participant-tuned filtering using deep neural network dynamic spectral masking for conversation isolation and security in noisy environments
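CN111951819B (zh) * 2020-08-20 2024-04-09 Beijing ByteDance Network Technology Co., Ltd. Echo cancellation method, apparatus, and storage medium
CN112735460B (zh) * 2020-12-24 2021-10-29 PLA Strategic Support Force Information Engineering University Beamforming method and system based on time-frequency mask estimation
US11887583B1 (en) * 2021-06-09 2024-01-30 Amazon Technologies, Inc. Updating models with trained model update objects
CN115512714B (zh) * 2022-03-22 2025-09-12 DingTalk (China) Information Technology Co., Ltd. Speech enhancement method, apparatus, and device
GB2620747B (en) * 2022-07-19 2024-10-02 Samsung Electronics Co Ltd Method and apparatus for speech enhancement
CN115862618A (zh) * 2022-11-24 2023-03-28 Shenzhen Zhengyang Intelligent Co., Ltd. Central integrated management system for smart buildings
KR20250065958A (ko) * 2023-11-06 2025-05-13 Korea Electronics Technology Institute Method for constructing a training dataset for speech synthesis by merging language, speaker, and emotion within an utterance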

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH02253298A (ja) * 1989-03-28 1990-10-12 Sharp Corp Voice pass filter
JPH0566795A (ja) 1991-09-06 1993-03-19 Gijutsu Kenkyu Kumiai Iryo Fukushi Kiki Kenkyusho Noise suppression device and its adjustment device
US5749066A (en) * 1995-04-24 1998-05-05 Ericsson Messaging Systems Inc. Method and apparatus for developing a neural network for phoneme recognition
US5960391A (en) * 1995-12-13 1999-09-28 Denso Corporation Signal extraction system, system and method for speech restoration, learning method for neural network model, constructing method of neural network model, and signal processing system
GB9611138D0 (en) * 1996-05-29 1996-07-31 Domain Dynamics Ltd Signal processing arrangements
JP2000047697A (ja) * 1998-07-30 2000-02-18 Nec Eng Ltd Noise canceller
US6347297B1 (en) * 1998-10-05 2002-02-12 Legerity, Inc. Matrix quantization with vector quantization error compensation and neural network postprocessing for robust speech recognition
US6910011B1 (en) * 1999-08-16 2005-06-21 Harman Becker Automotive Systems - Wavemakers, Inc. Noisy acoustic signal enhancement
EP1152399A1 (fr) * 2000-05-04 2001-11-07 Faculte Polytechnique de Mons Sub-band processing of speech signals by neural networks
US7203643B2 (en) * 2001-06-14 2007-04-10 Qualcomm Incorporated Method and apparatus for transmitting speech activity in distributed voice recognition systems

Cited By (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
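CN105741848A (zh) * 2010-04-14 2016-07-06 Google Inc. Geotagged environmental audio for enhanced speech recognition accuracy
CN105741848B (zh) * 2010-04-14 2019-07-23 Google LLC System and method for geotagged environmental audio for enhanced speech recognition accuracy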
US10854208B2 (en) 2013-06-21 2020-12-01 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and method realizing improved concepts for TCX LTP
US11776551B2 (en) 2013-06-21 2023-10-03 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and method for improved signal fade out in different domains during error concealment
US12125491B2 (en) 2013-06-21 2024-10-22 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and method realizing improved concepts for TCX LTP
US11869514B2 (en) 2013-06-21 2024-01-09 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and method for improved signal fade out for switched audio coding systems during error concealment
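CN105359209B (zh) * 2013-06-21 2019-06-14 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and method for improved signal fade out in different domains during error concealment
CN105359209A (zh) * 2013-06-21 2016-02-24 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and method for improved signal fade out in different domains during error concealment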
US11501783B2 (en) 2013-06-21 2022-11-15 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and method realizing a fading of an MDCT spectrum to white noise prior to FDNS application
US10607614B2 (en) 2013-06-21 2020-03-31 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and method realizing a fading of an MDCT spectrum to white noise prior to FDNS application
US10672404B2 (en) 2013-06-21 2020-06-02 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and method for generating an adaptive spectral shape of comfort noise
US10679632B2 (en) 2013-06-21 2020-06-09 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and method for improved signal fade out for switched audio coding systems during error concealment
US11462221B2 (en) 2013-06-21 2022-10-04 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and method for generating an adaptive spectral shape of comfort noise
US10867613B2 (en) 2013-06-21 2020-12-15 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and method for improved signal fade out in different domains during error concealment
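CN104867495A (zh) * 2013-08-28 2015-08-26 Texas Instruments Incorporated Context-aware sound signature detection
CN106683663B (zh) * 2015-11-06 2022-01-25 Samsung Electronics Co., Ltd. Neural network training apparatus and method, and speech recognition apparatus and method
CN106683663A (zh) * 2015-11-06 2017-05-17 Samsung Electronics Co., Ltd. Neural network training apparatus and method, and speech recognition apparatus and method
CN107481728A (zh) * 2017-09-29 2017-12-15 Baidu Online Network Technology (Beijing) Co., Ltd. Background sound elimination method, apparatus, and terminal device
CN108470476A (zh) * 2018-05-15 2018-08-31 Huanghuai University English pronunciation matching and correction system
CN110797021B (zh) * 2018-05-24 2022-06-07 Tencent Technology (Shenzhen) Company Limited Mixed speech recognition network training method, mixed speech recognition method, apparatus, and storage medium
US11996091B2 (en) 2018-05-24 2024-05-28 Tencent Technology (Shenzhen) Company Limited Mixed speech recognition method and apparatus, and computer-readable storage medium
CN110797021A (zh) * 2018-05-24 2020-02-14 Tencent Technology (Shenzhen) Company Limited Mixed speech recognition network training method, mixed speech recognition method, apparatus, and storage medium
CN112562710B (zh) * 2020-11-27 2022-09-30 Tianjin University Stepwise speech enhancement method based on deep learning
CN112562710A (zh) * 2020-11-27 2021-03-26 Tianjin University Stepwise speech enhancement method based on deep learning
CN114187914A (zh) * 2021-12-17 2022-03-15 Guangdong Power Grid Co., Ltd. Speech recognition method and system
WO2024055751A1 (zh) * 2022-09-13 2024-03-21 Tencent Technology (Shenzhen) Company Limited Audio data processing method, apparatus, device, storage medium, and program product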
US20250391420A1 (en) * 2024-06-21 2025-12-25 Bank Of America Corporation System and method for adaptive audio segmentation for contextual speech signal processing

Also Published As

Publication number Publication date
CA2501989A1 (en) 2005-09-23
CA2501989C (en) 2011-07-26
US20060031066A1 (en) 2006-02-09
EP1580730A3 (en) 2006-04-12
EP1580730B1 (en) 2008-09-03
KR20060044629A (ko) 2006-05-16
JP2005275410A (ja) 2005-10-06
DE602005009419D1 (de) 2008-10-16
US7620546B2 (en) 2009-11-17
EP1580730A2 (en) 2005-09-28

Similar Documents

Publication Publication Date Title
CN1737906A (zh) Isolating speech signals utilizing neural networks
Alim et al. Some commonly used speech feature extraction algorithms
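CN112820315B (zh) Audio signal processing method, apparatus, computer device, and storage medium
CN108711436B (zh) Replay attack detection method for speaker verification systems based on high-frequency and bottleneck features
CN108447495B (zh) Deep learning speech enhancement method based on a comprehensive feature set
US20210193149A1 (en) Method, apparatus and device for voiceprint recognition, and medium
CN108198545B (zh) Speech recognition method based on wavelet transform
CN112786059A (zh) Voiceprint feature extraction method and apparatus based on artificial intelligence
CN102592607A (zh) Voice conversion system and method using blind speech separation
CN104183245A (zh) Method and apparatus for recommending singers with similar vocal timbre
CN101527141A (zh) Method for converting whispered speech to normal speech based on a radial basis function neural network
CN109036470A (zh) Speech discrimination method, apparatus, computer device, and storage medium
CN116438599A (zh) Vocal track removal by convolutional neural network embedded voice fingerprinting on a standard ARM embedded platform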
Saeki et al. DRSpeech: Degradation-robust text-to-speech synthesis with frame-level and utterance-level acoustic representation learning
CN103559893B (zh) Gammachirp cepstral coefficient auditory feature extraction method for underwater targets
Zouhir et al. A bio-inspired feature extraction for robust speech recognition
CN120148484B (zh) Microcomputer-based speech recognition method and apparatus
TWI746138B (zh) Dysarthric speech clarification device and method thereof
Permana et al. Improved feature extraction for sound recognition using combined constant-q transform (cqt) and mel spectrogram for cnn input
CN116913296A (zh) Audio processing method and apparatus
Cai et al. Dual-channel drum separation for low-cost drum recording using non-negative matrix factorization
Gupta et al. Speech analysis of Chhattisgarhi dialects using wavelet transformation and mel frequency cepstral coefficient
Kumar et al. Performance evaluation of MLP for speech recognition in noisy environments using MFCC & wavelets
CN114512141B (zh) Audio separation method, apparatus, device, storage medium, and program product
Bae et al. A Study on Enhancement of Speech using Non-uniform Sampling

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Open date: 20060222