JP2005275410A - Isolating speech signals utilizing neural networks - Google Patents
Isolating speech signals utilizing neural networks
- Publication number
- JP2005275410A
- Authority
- JP
- Japan
- Prior art keywords
- signal
- audio signal
- speech signal
- speech
- estimate
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000000926 separation method Methods 0.000 title claims abstract description 35
- 230000007935 neutral effect Effects 0.000 title abstract 3
- 230000005236 sound signal Effects 0.000 claims description 113
- 238000013528 artificial neural network Methods 0.000 claims description 96
- 238000000034 method Methods 0.000 claims description 58
- 230000015572 biosynthetic process Effects 0.000 claims description 13
- 238000003786 synthesis reaction Methods 0.000 claims description 13
- 238000007906 compression Methods 0.000 claims description 11
- 230000006835 compression Effects 0.000 claims description 10
- 238000006243 chemical reaction Methods 0.000 claims description 5
- 239000000284 extract Substances 0.000 claims description 4
- 238000012549 training Methods 0.000 claims description 3
- 230000002708 enhancing effect Effects 0.000 claims description 2
- 230000002194 synthesizing effect Effects 0.000 claims description 2
- 210000002569 neuron Anatomy 0.000 description 27
- 210000002364 input neuron Anatomy 0.000 description 20
- 238000010586 diagram Methods 0.000 description 17
- 238000012545 processing Methods 0.000 description 14
- 230000008569 process Effects 0.000 description 13
- 210000004205 output neuron Anatomy 0.000 description 12
- 230000006870 function Effects 0.000 description 11
- 238000001228 spectrum Methods 0.000 description 7
- 238000011156 evaluation Methods 0.000 description 6
- 230000004913 activation Effects 0.000 description 4
- 230000000875 corresponding effect Effects 0.000 description 4
- 230000008901 benefit Effects 0.000 description 3
- 239000000470 constituent Substances 0.000 description 3
- 238000001514 detection method Methods 0.000 description 3
- 230000000694 effects Effects 0.000 description 3
- 230000009466 transformation Effects 0.000 description 3
- 210000004556 brain Anatomy 0.000 description 2
- 230000008859 change Effects 0.000 description 2
- 230000002596 correlated effect Effects 0.000 description 2
- 230000007423 decrease Effects 0.000 description 2
- 210000004704 glottis Anatomy 0.000 description 2
- 238000009499 grossing Methods 0.000 description 2
- 210000000867 larynx Anatomy 0.000 description 2
- 238000012886 linear function Methods 0.000 description 2
- 239000000203 mixture Substances 0.000 description 2
- 210000001260 vocal cord Anatomy 0.000 description 2
- 238000012935 Averaging Methods 0.000 description 1
- 241001465754 Metazoa Species 0.000 description 1
- 238000004458 analytical method Methods 0.000 description 1
- 238000013459 approach Methods 0.000 description 1
- 238000004364 calculation method Methods 0.000 description 1
- 239000013013 elastic material Substances 0.000 description 1
- 239000007789 gas Substances 0.000 description 1
- 230000002452 interceptive effect Effects 0.000 description 1
- 239000007788 liquid Substances 0.000 description 1
- 238000005259 measurement Methods 0.000 description 1
- 230000007246 mechanism Effects 0.000 description 1
- 230000003278 mimic effect Effects 0.000 description 1
- 230000000737 periodic effect Effects 0.000 description 1
- 238000003672 processing method Methods 0.000 description 1
- 230000000717 retained effect Effects 0.000 description 1
- 238000012552 review Methods 0.000 description 1
- 238000005070 sampling Methods 0.000 description 1
- 239000007787 solid Substances 0.000 description 1
- 230000002123 temporal effect Effects 0.000 description 1
- 230000001755 vocal effect Effects 0.000 description 1
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0272—Voice signal separating
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/27—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
- G10L25/30—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Quality & Reliability (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Circuit For Audible Band Transducer (AREA)
- Noise Elimination (AREA)
- Soundproofing, Sound Blocking, And Sound Damping (AREA)
- Cable Transmission Systems, Equalization Of Radio And Reduction Of Echo (AREA)
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US55558204P | 2004-03-23 | 2004-03-23 | |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| JP2005275410A (ja) | 2005-10-06 |
| JP2005275410A5 | 2008-04-24 |
Family
ID=34860539
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| JP2005085040A (Pending; JP2005275410A (ja)) | Isolating speech signals utilizing neural networks | 2004-03-23 | 2005-03-23 |
Country Status (7)
| Country | Link |
|---|---|
| US (1) | US7620546B2 |
| EP (1) | EP1580730B1 |
| JP (1) | JP2005275410A |
| KR (1) | KR20060044629A |
| CN (1) | CN1737906A |
| CA (1) | CA2501989C |
| DE (1) | DE602005009419D1 |
Cited By (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2016143042A (ja) * | 2015-02-05 | 2016-08-08 | 日本電信電話株式会社 | Noise removal device and noise removal program |
| JP2017515140A (ja) * | 2014-03-24 | 2017-06-08 | マイクロソフト テクノロジー ライセンシング,エルエルシー | Mixed speech recognition |
| JP2018146683A (ja) * | 2017-03-02 | 2018-09-20 | 日本電信電話株式会社 | Signal processing device, signal processing method, and signal processing program |
| WO2020255242A1 (ja) * | 2019-06-18 | 2020-12-24 | 日本電信電話株式会社 | Restoration device, restoration method, and program |
Families Citing this family (43)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| KR101615262B1 (ko) * | 2009-08-12 | 2016-04-26 | 삼성전자주식회사 | Method and apparatus for multi-channel audio encoding and decoding using semantic information |
| US8265928B2 (en) * | 2010-04-14 | 2012-09-11 | Google Inc. | Geotagged environmental audio for enhanced speech recognition accuracy |
| US8768406B2 (en) * | 2010-08-11 | 2014-07-01 | Bone Tone Communications Ltd. | Background sound removal for privacy and personalization use |
| US8239196B1 (en) * | 2011-07-28 | 2012-08-07 | Google Inc. | System and method for multi-channel multi-feature speech/noise classification for noise suppression |
| AU2014283198B2 (en) | 2013-06-21 | 2016-10-20 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Apparatus and method realizing a fading of an MDCT spectrum to white noise prior to FDNS application |
| US9412373B2 (en) * | 2013-08-28 | 2016-08-09 | Texas Instruments Incorporated | Adaptive environmental context sample and update for comparing speech recognition |
| US10832138B2 (en) | 2014-11-27 | 2020-11-10 | Samsung Electronics Co., Ltd. | Method and apparatus for extending neural network |
| KR102494139B1 (ko) * | 2015-11-06 | 2023-01-31 | 삼성전자주식회사 | Neural network training apparatus and method, and speech recognition apparatus and method |
| US10741195B2 (en) * | 2016-02-15 | 2020-08-11 | Mitsubishi Electric Corporation | Sound signal enhancement device |
| DE112017001830B4 (de) * | 2016-05-06 | 2024-02-22 | Robert Bosch Gmbh | Speech enhancement and audio event detection for an environment with non-stationary noise |
| US9875747B1 (en) * | 2016-07-15 | 2018-01-23 | Google Llc | Device specific multi-channel data compression |
| US10276187B2 (en) * | 2016-10-19 | 2019-04-30 | Ford Global Technologies, Llc | Vehicle ambient audio classification via neural network machine learning |
| US10714118B2 (en) * | 2016-12-30 | 2020-07-14 | Facebook, Inc. | Audio compression using an artificial neural network |
| US12106214B2 (en) | 2017-05-17 | 2024-10-01 | Samsung Electronics Co., Ltd. | Sensor transformation attention network (STAN) model |
| US11501154B2 (en) | 2017-05-17 | 2022-11-15 | Samsung Electronics Co., Ltd. | Sensor transformation attention network (STAN) model |
| US10170137B2 (en) | 2017-05-18 | 2019-01-01 | International Business Machines Corporation | Voice signal component forecaster |
| US11321604B2 (en) * | 2017-06-21 | 2022-05-03 | Arm Ltd. | Systems and devices for compressing neural network parameters |
| US11270198B2 (en) | 2017-07-31 | 2022-03-08 | Syntiant | Microcontroller interface for audio signal processing |
| CN107481728B (zh) * | 2017-09-29 | 2020-12-11 | 百度在线网络技术(北京)有限公司 | Background sound cancellation method, apparatus, and terminal device |
| US11545162B2 (en) * | 2017-10-24 | 2023-01-03 | Samsung Electronics Co., Ltd. | Audio reconstruction method and device which use machine learning |
| US10283140B1 (en) * | 2018-01-12 | 2019-05-07 | Alibaba Group Holding Limited | Enhancing audio signals using sub-band deep neural networks |
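| CN108470476B (zh) * | 2018-05-15 | 2020-06-30 | 黄淮学院 | English pronunciation matching and correction system |
| CN108648527B (zh) * | 2018-05-15 | 2020-07-24 | 黄淮学院 | English pronunciation matching and correction method |
| CN110503967B (zh) * | 2018-05-17 | 2021-11-19 | 中国移动通信有限公司研究院 | Speech enhancement method, apparatus, medium, and device |
| CN111445905B (zh) | 2018-05-24 | 2023-08-08 | 腾讯科技(深圳)有限公司 | Mixed speech recognition network training method, mixed speech recognition method, apparatus, and storage medium |
| CN108806707B (zh) * | 2018-06-11 | 2020-05-12 | 百度在线网络技术(北京)有限公司 | Speech processing method, apparatus, device, and storage medium |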
| EP3644565A1 (en) * | 2018-10-25 | 2020-04-29 | Nokia Solutions and Networks Oy | Reconstructing a channel frequency response curve |
| CN109545228A (zh) * | 2018-12-14 | 2019-03-29 | 厦门快商通信息技术有限公司 | End-to-end speaker segmentation method and system |
| JP7242903B2 (ja) | 2019-05-14 | 2023-03-20 | ドルビー ラボラトリーズ ライセンシング コーポレイション | Method and apparatus for speech source separation based on a convolutional neural network |
| KR20200132645A (ko) | 2019-05-16 | 2020-11-25 | 삼성전자주식회사 | Apparatus and method for providing a speech recognition service |
| US11514928B2 (en) * | 2019-09-09 | 2022-11-29 | Apple Inc. | Spatially informed audio signal processing for user speech |
| US11257510B2 (en) | 2019-12-02 | 2022-02-22 | International Business Machines Corporation | Participant-tuned filtering using deep neural network dynamic spectral masking for conversation isolation and security in noisy environments |
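| CN111951819B (zh) * | 2020-08-20 | 2024-04-09 | 北京字节跳动网络技术有限公司 | Echo cancellation method, apparatus, and storage medium |
| CN112562710B (zh) * | 2020-11-27 | 2022-09-30 | 天津大学 | Stepwise speech enhancement method based on deep learning |
| CN112735460B (zh) * | 2020-12-24 | 2021-10-29 | 中国人民解放军战略支援部队信息工程大学 | Beamforming method and system based on time-frequency mask value estimation |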
| US11887583B1 (en) * | 2021-06-09 | 2024-01-30 | Amazon Technologies, Inc. | Updating models with trained model update objects |
| CN114187914A (zh) * | 2021-12-17 | 2022-03-15 | 广东电网有限责任公司 | Speech recognition method and system |
| CN115512714B (zh) * | 2022-03-22 | 2025-09-12 | 钉钉(中国)信息技术有限公司 | Speech enhancement method, apparatus, and device |
| GB2620747B (en) * | 2022-07-19 | 2024-10-02 | Samsung Electronics Co Ltd | Method and apparatus for speech enhancement |
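| CN117746874A (zh) * | 2022-09-13 | 2024-03-22 | 腾讯科技(北京)有限公司 | Audio data processing method, apparatus, and readable storage medium |
| CN115862618A (zh) * | 2022-11-24 | 2023-03-28 | 深圳正扬智能有限公司 | Central integrated management system for smart buildings |
| KR20250065958A (ko) * | 2023-11-06 | 2025-05-13 | 한국전자기술연구원 | Method for constructing a training dataset for speech synthesis by merging language, speaker, and emotion within an utterance |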
| US20250391420A1 (en) * | 2024-06-21 | 2025-12-25 | Bank Of America Corporation | System and method for adaptive audio segmentation for contextual speech signal processing |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JPH02253298A (ja) * | 1989-03-28 | 1990-10-12 | Sharp Corp | Voice pass filter |
| JP2000047697A (ja) * | 1998-07-30 | 2000-02-18 | Nec Eng Ltd | Noise canceller |
Family Cites Families (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JPH0566795A (ja) | 1991-09-06 | 1993-03-19 | Gijutsu Kenkyu Kumiai Iryo Fukushi Kiki Kenkyusho | Noise suppression device and adjustment device therefor |
| US5749066A (en) * | 1995-04-24 | 1998-05-05 | Ericsson Messaging Systems Inc. | Method and apparatus for developing a neural network for phoneme recognition |
| US5960391A (en) * | 1995-12-13 | 1999-09-28 | Denso Corporation | Signal extraction system, system and method for speech restoration, learning method for neural network model, constructing method of neural network model, and signal processing system |
| GB9611138D0 (en) * | 1996-05-29 | 1996-07-31 | Domain Dynamics Ltd | Signal processing arrangements |
| US6347297B1 (en) * | 1998-10-05 | 2002-02-12 | Legerity, Inc. | Matrix quantization with vector quantization error compensation and neural network postprocessing for robust speech recognition |
| US6910011B1 (en) * | 1999-08-16 | 2005-06-21 | Harman Becker Automotive Systems - Wavemakers, Inc. | Noisy acoustic signal enhancement |
| EP1152399A1 (fr) * | 2000-05-04 | 2001-11-07 | Faculte Polytechnique de Mons | Subband speech signal processing using neural networks |
| US7203643B2 (en) * | 2001-06-14 | 2007-04-10 | Qualcomm Incorporated | Method and apparatus for transmitting speech activity in distributed voice recognition systems |
-
2005
- 2005-03-21 US US11/085,825 patent/US7620546B2/en active Active
- 2005-03-22 CN CNA2005100677770A patent/CN1737906A/zh active Pending
- 2005-03-22 CA CA2501989A patent/CA2501989C/en not_active Expired - Lifetime
- 2005-03-23 DE DE602005009419T patent/DE602005009419D1/de not_active Expired - Lifetime
- 2005-03-23 EP EP05006440A patent/EP1580730B1/en not_active Expired - Lifetime
- 2005-03-23 JP JP2005085040A patent/JP2005275410A/ja active Pending
- 2005-03-23 KR KR1020050024110A patent/KR20060044629A/ko not_active Withdrawn
Cited By (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2017515140A (ja) * | 2014-03-24 | 2017-06-08 | マイクロソフト テクノロジー ライセンシング,エルエルシー | Mixed speech recognition |
| JP2016143042A (ja) * | 2015-02-05 | 2016-08-08 | 日本電信電話株式会社 | Noise removal device and noise removal program |
| JP2018146683A (ja) * | 2017-03-02 | 2018-09-20 | 日本電信電話株式会社 | Signal processing device, signal processing method, and signal processing program |
| WO2020255242A1 (ja) * | 2019-06-18 | 2020-12-24 | 日本電信電話株式会社 | Restoration device, restoration method, and program |
| JPWO2020255242A1 * | 2019-06-18 | 2020-12-24 | | |
| JP7188589B2 (ja) | 2019-06-18 | 2022-12-13 | 日本電信電話株式会社 | Restoration device, restoration method, and program |
Also Published As
| Publication number | Publication date |
|---|---|
| CA2501989A1 (en) | 2005-09-23 |
| CN1737906A (zh) | 2006-02-22 |
| CA2501989C (en) | 2011-07-26 |
| US20060031066A1 (en) | 2006-02-09 |
| EP1580730A3 (en) | 2006-04-12 |
| EP1580730B1 (en) | 2008-09-03 |
| KR20060044629A (ko) | 2006-05-16 |
| DE602005009419D1 (de) | 2008-10-16 |
| US7620546B2 (en) | 2009-11-17 |
| EP1580730A2 (en) | 2005-09-28 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| JP2005275410A (ja) | | Isolating speech signals utilizing neural networks |
| US10504539B2 (en) | | Voice activity detection systems and methods |
| CN111161752B (zh) | | Echo cancellation method and device |
| JP6903611B2 (ja) | | Signal generation device, signal generation system, signal generation method, and program |
| KR101045627B1 (ko) | | Wind noise suppression system, wind noise detection system, wind buffet removal method, and signal recording medium having software for noise detection control |
| RU2373584C2 (ru) | | Method and device for improving speech intelligibility using multiple sensors |
| JP5666444B2 (ja) | | Apparatus and method for processing an audio signal for speech enhancement using feature extraction |
| JP5127754B2 (ja) | | Signal processing device |
| EP2643981B1 (en) | | A device comprising a plurality of audio sensors and a method of operating the same |
| Shivakumar et al. | | Perception optimized deep denoising autoencoders for speech enhancement. |
| Chaki | | Pattern analysis based acoustic signal processing: a survey of the state-of-art |
| CN114333874B (zh) | | Method for processing an audio signal |
| JP2002537585A (ja) | | System and method for characterizing voiced excitations of speech and acoustic signals, removing acoustic noise from speech, and synthesizing speech |
| JP2010055000A (ja) | | Signal bandwidth extension device |
| CN115223584B (zh) | | Audio data processing method, apparatus, device, and storage medium |
| Singh et al. | | Usefulness of linear prediction residual for replay attack detection |
| CN119052696A (zh) | | Earphone control method for reducing wind noise based on voiceprint recognition and inverse-wave cancellation |
| JP5443547B2 (ja) | | Signal processing device |
| JP2003510665A (ja) | | Apparatus and method for a de-esser using an adaptive filtering algorithm |
| Tchorz et al. | | Estimation of the signal-to-noise ratio with amplitude modulation spectrograms |
| Uhle et al. | | Speech enhancement of movie sound |
| CN113593604A (zh) | | Method, apparatus, and storage medium for detecting audio quality |
| CN116758930A (zh) | | Speech enhancement method, apparatus, electronic device, and storage medium |
| He et al. | | Time-frequency feature extraction from spectrograms and wavelet packets with application to automatic stress and emotion classification in speech |
| KR20150131588A (ko) | | Electronic device and pitch generation method |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | A521 | Written amendment | Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 20080310 |
| | A621 | Written request for application examination | Free format text: JAPANESE INTERMEDIATE CODE: A621; Effective date: 20080310 |
| | A131 | Notification of reasons for refusal | Free format text: JAPANESE INTERMEDIATE CODE: A131; Effective date: 20100930 |
| | A02 | Decision of refusal | Free format text: JAPANESE INTERMEDIATE CODE: A02; Effective date: 20110301 |