JP2005275410A - Isolating speech signals utilizing neural networks - Google Patents

Isolating speech signals utilizing neural networks

Info

Publication number
JP2005275410A
Authority
JP
Japan
Prior art keywords
signal
audio signal
speech signal
speech
estimate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP2005085040A
Other languages
English (en)
Japanese (ja)
Other versions
JP2005275410A5
Inventor
Phillip Hetherington
ヘザーリントン フィリップ
Pierre Zakarauskas
ザカラウスカス ピアー
Shahla Parveen
パービーン シャーラ
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
QNX Software Systems Wavemakers Inc
Harman Becker Automotive Systems GmbH
Original Assignee
Harman Becker Automotive Systems Wavemakers Inc
Harman Becker Automotive Systems GmbH
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harman Becker Automotive Systems Wavemakers Inc, Harman Becker Automotive Systems GmbH
Publication of JP2005275410A
Publication of JP2005275410A5

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0272 Voice signal separating
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/27 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
    • G10L25/30 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Circuit For Audible Band Transducer (AREA)
  • Noise Elimination (AREA)
  • Soundproofing, Sound Blocking, And Sound Damping (AREA)
  • Cable Transmission Systems, Equalization Of Radio And Reduction Of Echo (AREA)
JP2005085040A 2004-03-23 2005-03-23 Isolating speech signals utilizing neural networks Pending JP2005275410A (ja)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US55558204P 2004-03-23 2004-03-23

Publications (2)

Publication Number Publication Date
JP2005275410A (ja) 2005-10-06
JP2005275410A5 2008-04-24

Family

ID=34860539

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2005085040A Pending JP2005275410A (ja) 2004-03-23 2005-03-23 Isolating speech signals utilizing neural networks

Country Status (7)

Country Link
US (1) US7620546B2
EP (1) EP1580730B1
JP (1) JP2005275410A
KR (1) KR20060044629A
CN (1) CN1737906A
CA (1) CA2501989C
DE (1) DE602005009419D1

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2016143042A (ja) * 2015-02-05 2016-08-08 日本電信電話株式会社 Noise removal device and noise removal program
JP2017515140A (ja) * 2014-03-24 2017-06-08 マイクロソフト テクノロジー ライセンシング,エルエルシー Mixed speech recognition
JP2018146683A (ja) * 2017-03-02 2018-09-20 日本電信電話株式会社 Signal processing device, signal processing method, and signal processing program
WO2020255242A1 (ja) * 2019-06-18 2020-12-24 日本電信電話株式会社 Restoration device, restoration method, and program

Families Citing this family (43)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101615262B1 (ko) * 2009-08-12 2016-04-26 삼성전자주식회사 Method and apparatus for multi-channel audio encoding and decoding using semantic information
US8265928B2 (en) * 2010-04-14 2012-09-11 Google Inc. Geotagged environmental audio for enhanced speech recognition accuracy
US8768406B2 (en) * 2010-08-11 2014-07-01 Bone Tone Communications Ltd. Background sound removal for privacy and personalization use
US8239196B1 (en) * 2011-07-28 2012-08-07 Google Inc. System and method for multi-channel multi-feature speech/noise classification for noise suppression
AU2014283198B2 (en) 2013-06-21 2016-10-20 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and method realizing a fading of an MDCT spectrum to white noise prior to FDNS application
US9412373B2 (en) * 2013-08-28 2016-08-09 Texas Instruments Incorporated Adaptive environmental context sample and update for comparing speech recognition
US10832138B2 (en) 2014-11-27 2020-11-10 Samsung Electronics Co., Ltd. Method and apparatus for extending neural network
KR102494139B1 (ko) * 2015-11-06 2023-01-31 삼성전자주식회사 Apparatus and method for training a neural network, and apparatus and method for speech recognition
US10741195B2 (en) * 2016-02-15 2020-08-11 Mitsubishi Electric Corporation Sound signal enhancement device
DE112017001830B4 (de) * 2016-05-06 2024-02-22 Robert Bosch Gmbh Speech enhancement and audio event detection for an environment with non-stationary noise
US9875747B1 (en) * 2016-07-15 2018-01-23 Google Llc Device specific multi-channel data compression
US10276187B2 (en) * 2016-10-19 2019-04-30 Ford Global Technologies, Llc Vehicle ambient audio classification via neural network machine learning
US10714118B2 (en) * 2016-12-30 2020-07-14 Facebook, Inc. Audio compression using an artificial neural network
US12106214B2 (en) 2017-05-17 2024-10-01 Samsung Electronics Co., Ltd. Sensor transformation attention network (STAN) model
US11501154B2 (en) 2017-05-17 2022-11-15 Samsung Electronics Co., Ltd. Sensor transformation attention network (STAN) model
US10170137B2 (en) 2017-05-18 2019-01-01 International Business Machines Corporation Voice signal component forecaster
US11321604B2 (en) * 2017-06-21 2022-05-03 Arm Ltd. Systems and devices for compressing neural network parameters
US11270198B2 (en) 2017-07-31 2022-03-08 Syntiant Microcontroller interface for audio signal processing
CN107481728B (zh) * 2017-09-29 2020-12-11 百度在线网络技术(北京)有限公司 Background sound removal method, apparatus, and terminal device
US11545162B2 (en) * 2017-10-24 2023-01-03 Samsung Electronics Co., Ltd. Audio reconstruction method and device which use machine learning
US10283140B1 (en) * 2018-01-12 2019-05-07 Alibaba Group Holding Limited Enhancing audio signals using sub-band deep neural networks
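CN108470476B (zh) * 2018-05-15 2020-06-30 黄淮学院 English pronunciation matching and correction system
CN108648527B (zh) * 2018-05-15 2020-07-24 黄淮学院 English pronunciation matching and correction method
CN110503967B (zh) * 2018-05-17 2021-11-19 中国移动通信有限公司研究院 Speech enhancement method, apparatus, medium, and device
CN111445905B (zh) 2018-05-24 2023-08-08 腾讯科技(深圳)有限公司 Mixed speech recognition network training method, mixed speech recognition method, apparatus, and storage medium
CN108806707B (zh) * 2018-06-11 2020-05-12 百度在线网络技术(北京)有限公司 Speech processing method, apparatus, device, and storage medium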
EP3644565A1 (en) * 2018-10-25 2020-04-29 Nokia Solutions and Networks Oy Reconstructing a channel frequency response curve
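CN109545228A (zh) * 2018-12-14 2019-03-29 厦门快商通信息技术有限公司 End-to-end speaker segmentation method and system
JP7242903B2 (ja) 2019-05-14 2023-03-20 ドルビー ラボラトリーズ ライセンシング コーポレイション Method and apparatus for speech source separation based on a convolutional neural network
KR20200132645A (ko) 2019-05-16 2020-11-25 삼성전자주식회사 Apparatus and method for providing a speech recognition service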
US11514928B2 (en) * 2019-09-09 2022-11-29 Apple Inc. Spatially informed audio signal processing for user speech
US11257510B2 (en) 2019-12-02 2022-02-22 International Business Machines Corporation Participant-tuned filtering using deep neural network dynamic spectral masking for conversation isolation and security in noisy environments
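CN111951819B (zh) * 2020-08-20 2024-04-09 北京字节跳动网络技术有限公司 Echo cancellation method, apparatus, and storage medium
CN112562710B (zh) * 2020-11-27 2022-09-30 天津大学 Stepwise speech enhancement method based on deep learning
CN112735460B (zh) * 2020-12-24 2021-10-29 中国人民解放军战略支援部队信息工程大学 Beamforming method and system based on time-frequency mask estimation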
US11887583B1 (en) * 2021-06-09 2024-01-30 Amazon Technologies, Inc. Updating models with trained model update objects
CN114187914A (zh) * 2021-12-17 2022-03-15 广东电网有限责任公司 Speech recognition method and system
CN115512714B (zh) * 2022-03-22 2025-09-12 钉钉(中国)信息技术有限公司 Speech enhancement method, apparatus, and device
GB2620747B (en) * 2022-07-19 2024-10-02 Samsung Electronics Co Ltd Method and apparatus for speech enhancement
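CN117746874A (zh) * 2022-09-13 2024-03-22 腾讯科技(北京)有限公司 Audio data processing method, apparatus, and readable storage medium
CN115862618A (zh) * 2022-11-24 2023-03-28 深圳正扬智能有限公司 Central integrated management system for smart buildings
KR20250065958A (ko) * 2023-11-06 2025-05-13 한국전자기술연구원 Method for constructing a training dataset for speech synthesis by merging language, speaker, and emotion within an utterance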
US20250391420A1 (en) * 2024-06-21 2025-12-25 Bank Of America Corporation System and method for adaptive audio segmentation for contextual speech signal processing

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH02253298A (ja) * 1989-03-28 1990-10-12 Sharp Corp Voice pass filter
JP2000047697A (ja) * 1998-07-30 2000-02-18 Nec Eng Ltd Noise canceller

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0566795A (ja) 1991-09-06 1993-03-19 Gijutsu Kenkyu Kumiai Iryo Fukushi Kiki Kenkyusho Noise suppression device and its adjustment device
US5749066A (en) * 1995-04-24 1998-05-05 Ericsson Messaging Systems Inc. Method and apparatus for developing a neural network for phoneme recognition
US5960391A (en) * 1995-12-13 1999-09-28 Denso Corporation Signal extraction system, system and method for speech restoration, learning method for neural network model, constructing method of neural network model, and signal processing system
GB9611138D0 (en) * 1996-05-29 1996-07-31 Domain Dynamics Ltd Signal processing arrangements
US6347297B1 (en) * 1998-10-05 2002-02-12 Legerity, Inc. Matrix quantization with vector quantization error compensation and neural network postprocessing for robust speech recognition
US6910011B1 (en) * 1999-08-16 2005-06-21 Harman Becker Automotive Systems - Wavemakers, Inc. Noisy acoustic signal enhancement
EP1152399A1 (fr) * 2000-05-04 2001-11-07 Faculte Polytechniquede Mons Sub-band speech signal processing using neural networks
US7203643B2 (en) * 2001-06-14 2007-04-10 Qualcomm Incorporated Method and apparatus for transmitting speech activity in distributed voice recognition systems

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2017515140A (ja) * 2014-03-24 2017-06-08 マイクロソフト テクノロジー ライセンシング,エルエルシー Mixed speech recognition
JP2016143042A (ja) * 2015-02-05 2016-08-08 日本電信電話株式会社 Noise removal device and noise removal program
JP2018146683A (ja) * 2017-03-02 2018-09-20 日本電信電話株式会社 Signal processing device, signal processing method, and signal processing program
WO2020255242A1 (ja) * 2019-06-18 2020-12-24 日本電信電話株式会社 Restoration device, restoration method, and program
JPWO2020255242A1 * 2019-06-18 2020-12-24
JP7188589B2 (ja) 2019-06-18 2022-12-13 日本電信電話株式会社 Restoration device, restoration method, and program

Also Published As

Publication number Publication date
CA2501989A1 (en) 2005-09-23
CN1737906A (zh) 2006-02-22
CA2501989C (en) 2011-07-26
US20060031066A1 (en) 2006-02-09
EP1580730A3 (en) 2006-04-12
EP1580730B1 (en) 2008-09-03
KR20060044629A (ko) 2006-05-16
DE602005009419D1 (de) 2008-10-16
US7620546B2 (en) 2009-11-17
EP1580730A2 (en) 2005-09-28

Similar Documents

Publication Publication Date Title
JP2005275410A (ja) Isolating speech signals utilizing neural networks
US10504539B2 (en) Voice activity detection systems and methods
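CN111161752B (zh) Echo cancellation method and apparatus
JP6903611B2 (ja) Signal generation device, signal generation system, signal generation method, and program
KR101045627B1 (ko) Wind noise suppression system, wind noise detection system, wind buffet removal method, and signal recording medium having software for noise detection control
RU2373584C2 (ru) Method and apparatus for improving speech intelligibility using multiple sensors
JP5666444B2 (ja) Apparatus and method for processing an audio signal for speech enhancement using feature extraction
JP5127754B2 (ja) Signal processing device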
EP2643981B1 (en) A device comprising a plurality of audio sensors and a method of operating the same
Shivakumar et al. Perception optimized deep denoising autoencoders for speech enhancement.
Chaki Pattern analysis based acoustic signal processing: a survey of the state-of-art
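CN114333874B (zh) Method for processing an audio signal
JP2002537585A (ja) System and method for characterizing voiced excitations of speech and acoustic signals, removing acoustic noise from speech, and synthesizing speech
JP2010055000A (ja) Signal band extension device
CN115223584B (zh) Audio data processing method, apparatus, device, and storage medium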
Singh et al. Usefulness of linear prediction residual for replay attack detection
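CN119052696A (zh) Earphone control method for wind noise reduction based on voiceprint recognition and inverse-wave cancellation
JP5443547B2 (ja) Signal processing device
JP2003510665A (ja) Apparatus and method for a de-esser using an adaptive filtering algorithm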
Tchorz et al. Estimation of the signal-to-noise ratio with amplitude modulation spectrograms
Uhle et al. Speech enhancement of movie sound
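CN113593604A (zh) Method, apparatus, and storage medium for detecting audio quality
CN116758930A (zh) Speech enhancement method, apparatus, electronic device, and storage medium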
He et al. Time-frequency feature extraction from spectrograms and wavelet packets with application to automatic stress and emotion classification in speech
KR20150131588A (ko) Electronic device and pitch generation method

Legal Events

Date Code Title Description
A521 Written amendment

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20080310

A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20080310

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20100930

A02 Decision of refusal

Free format text: JAPANESE INTERMEDIATE CODE: A02

Effective date: 20110301