JP2023024295A - Method and system for dynamic voice enhancement - Google Patents
Method and system for dynamic voice enhancement
- Publication number
- JP2023024295A (application JP2022110199A)
- Authority
- JP
- Japan
- Prior art keywords
- source input
- gain control
- channel
- control parameter
- signal processing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0316—Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude
- G10L21/0364—Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude for improving intelligibility
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0316—Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude
- G10L21/0324—Details of processing therefor
- G10L21/034—Automatic adjustment
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03G—CONTROL OF AMPLIFICATION
- H03G3/00—Gain control in amplifiers or frequency changers
- H03G3/20—Automatic control
- H03G3/30—Automatic control in amplifiers having semiconductor devices
- H03G3/3005—Automatic control in amplifiers having semiconductor devices in amplifiers suitable for low-frequencies, e.g. audio amplifiers
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S1/00—Two-channel systems
- H04S1/007—Two-channel systems in which the audio signals are in digital form
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S3/00—Systems employing more than two channels, e.g. quadraphonic
- H04S3/008—Systems employing more than two channels, e.g. quadraphonic in which the audio signals are in digital form, i.e. employing more than two discrete digital channels
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; ELECTRIC HEARING AIDS; PUBLIC ADDRESS SYSTEMS
- H04R2430/00—Signal processing covered by H04R, not provided for in its groups
- H04R2430/01—Aspects of volume control, not necessarily automatic, in sound systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2400/00—Details of stereophonic systems covered by H04S but not provided for in its groups
- H04S2400/01—Multi-channel, i.e. more than two input channels, sound reproduction with two speakers wherein the multi-channel information is substantially preserved
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2400/00—Details of stereophonic systems covered by H04S but not provided for in its groups
- H04S2400/05—Generation or adaptation of centre channel in multi-channel audio systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2400/00—Details of stereophonic systems covered by H04S but not provided for in its groups
- H04S2400/13—Aspects of volume control, not necessarily automatic, in stereophonic sound systems
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Signal Processing (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Quality & Reliability (AREA)
- Tone Control, Compression And Expansion, Limiting Amplitude (AREA)
- Circuit For Audible Band Transducer (AREA)
- Stereophonic System (AREA)
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202110895493.XA CN115881146A (zh) | 2021-08-05 | 2021-08-05 | Method and system for dynamic voice enhancement |
| CN202110895493.X | 2021-08-05 | | |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| JP2023024295A (ja) | 2023-02-16 |
| JP2023024295A5 | 2025-07-14 |
Family
ID=82608415
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2022110199A (Pending), published as JP2023024295A (ja) | 2021-08-05 | 2022-07-08 | Method and system for dynamic voice enhancement |
Country Status (5)
| Country | Link |
|---|---|
| US (1) | US20230040743A1 |
| EP (1) | EP4131265B1 |
| JP (1) | JP2023024295A |
| KR (1) | KR20230021580A |
| CN (1) | CN115881146A |
Families Citing this family (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
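| CN116701921B (zh) * | 2023-08-08 | 2023-10-20 | University of Electronic Science and Technology of China | Adaptive noise suppression circuit for multi-channel time-series signals |
| CN119889331A (zh) * | 2023-10-24 | 2025-04-25 | Harman International Industries, Incorporated | Method and system of intelligent dynamic voice enhancement |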
Citations (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2001237920A (ja) * | 2000-02-23 | 2001-08-31 | Hitachi Kokusai Electric Inc | Input level adjustment circuit |
| JP2009163118A (ja) * | 2008-01-09 | 2009-07-23 | Alpine Electronics Inc | Audio reproduction method and multi-process system |
| JP2010539792A (ja) * | 2007-09-12 | 2010-12-16 | Dolby Laboratories Licensing Corporation | Speech enhancement |
| JP2011518520A (ja) * | 2008-04-18 | 2011-06-23 | Dolby Laboratories Licensing Corporation | Method and apparatus for maintaining speech audibility in multi-channel audio with minimal impact on surround experience |
| JP2012120052A (ja) * | 2010-12-02 | 2012-06-21 | Fujitsu Ten Ltd | Correlation reduction method, audio signal conversion device, and sound reproduction device |
| WO2013038451A1 (ja) * | 2011-09-15 | 2013-03-21 | Mitsubishi Electric Corporation | Dynamic range control device |
| WO2013118192A1 (ja) * | 2012-02-10 | 2013-08-15 | Mitsubishi Electric Corporation | Noise suppression device |
| US9324337B2 (en) * | 2009-11-17 | 2016-04-26 | Dolby Laboratories Licensing Corporation | Method and system for dialog enhancement |
Family Cites Families (14)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| FI20045315L (fi) * | 2004-08-30 | 2006-03-01 | Nokia Corp | Detection of voice activity in an audio signal |
| US7464029B2 (en) * | 2005-07-22 | 2008-12-09 | Qualcomm Incorporated | Robust separation of speech signals in a noisy environment |
| US8856049B2 (en) * | 2008-03-26 | 2014-10-07 | Nokia Corporation | Audio signal classification by shape parameter estimation for a plurality of audio signal samples |
| EP2107553B1 (en) * | 2008-03-31 | 2011-05-18 | Harman Becker Automotive Systems GmbH | Method for determining barge-in |
| US8831936B2 (en) * | 2008-05-29 | 2014-09-09 | Qualcomm Incorporated | Systems, methods, apparatus, and computer program products for speech signal processing using spectral contrast enhancement |
| US8503694B2 (en) * | 2008-06-24 | 2013-08-06 | Microsoft Corporation | Sound capture system for devices with two microphones |
| US20110058676A1 (en) * | 2009-09-07 | 2011-03-10 | Qualcomm Incorporated | Systems, methods, apparatus, and computer-readable media for dereverberation of multichannel signal |
| TWI459828B (zh) * | 2010-03-08 | 2014-11-01 | Dolby Lab Licensing Corp | Method and system for determining the volume reduction ratio of speech-related channels in multi-channel audio |
| US8989403B2 (en) * | 2010-03-09 | 2015-03-24 | Mitsubishi Electric Corporation | Noise suppression device |
| US8744091B2 (en) * | 2010-11-12 | 2014-06-03 | Apple Inc. | Intelligibility control using ambient noise detection |
| WO2013184520A1 (en) * | 2012-06-04 | 2013-12-12 | Stone Troy Christopher | Methods and systems for identifying content types |
| WO2014043024A1 (en) * | 2012-09-17 | 2014-03-20 | Dolby Laboratories Licensing Corporation | Long term monitoring of transmission and voice activity patterns for regulating gain control |
| US10546593B2 (en) * | 2017-12-04 | 2020-01-28 | Apple Inc. | Deep learning driven multi-channel filtering for speech enhancement |
| US11164592B1 (en) * | 2019-05-09 | 2021-11-02 | Amazon Technologies, Inc. | Responsive automatic gain control |
- 2021-08-05: CN application CN202110895493.XA, patent CN115881146A (zh), status: Pending
- 2022-07-08: JP application JP2022110199A, patent JP2023024295A (ja), status: Pending
- 2022-07-14: EP application EP22184919.3A, patent EP4131265B1 (en), status: Active
- 2022-07-18: KR application KR1020220088509A, patent KR20230021580A (ko), status: Pending
- 2022-08-02: US application US17/879,561, patent US20230040743A1 (en), status: Pending
Also Published As
| Publication number | Publication date |
|---|---|
| KR20230021580A (ko) | 2023-02-14 |
| EP4131265A3 (en) | 2023-04-19 |
| EP4131265A2 (en) | 2023-02-08 |
| US20230040743A1 (en) | 2023-02-09 |
| EP4131265B1 (en) | 2025-06-11 |
| CN115881146A (zh) | 2023-03-31 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US10531198B2 (en) | | Apparatus and method for decomposing an input signal using a downmixer |
| US9424852B2 (en) | | Determining the inter-channel time difference of a multi-channel audio signal |
| US9311923B2 (en) | | Adaptive audio processing based on forensic detection of media processing history |
| CN105284133B (zh) | | Apparatus and method for center signal scaling and stereophonic enhancement based on a signal-to-downmix ratio |
| JP7818660B2 (ja) | | Spatial audio representation and rendering |
| US10798511B1 (en) | | Processing of audio signals for spatial audio |
| CN109841223B (zh) | | Audio signal processing method, intelligent terminal, and storage medium |
| JP2023024295A (ja) | | Method and system for dynamic voice enhancement |
| US20250365552A1 (en) | | Binaural signal post-processing |
| GB2574667A (en) | | Spatial audio capture, transmission and reproduction |
| US12058511B2 (en) | | Sound field related rendering |
| CN120660137A (zh) | | Dialogue intelligibility enhancement method and system |
| US20250131939A1 (en) | | Method and System of Intelligent Dynamic Voice Enhancement |
| CN118942477B (zh) | | Signal processing method for enhancing human voice, electronic device, and storage medium |
| Uhle | | Center signal scaling using signal-to-downmix ratios |
| US20240274137A1 (en) | | Parametric spatial audio rendering |
| JP2018029306A (ja) | | Channel number conversion device and program therefor |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 2025-07-04 |
| | A621 | Written request for application examination | Free format text: JAPANESE INTERMEDIATE CODE: A621; Effective date: 2025-07-04 |
| | A977 | Report on retrieval | Free format text: JAPANESE INTERMEDIATE CODE: A971007; Effective date: 2025-12-23 |
| | A131 | Notification of reasons for refusal | Free format text: JAPANESE INTERMEDIATE CODE: A131; Effective date: 2025-12-26 |
| | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 2026-03-10 |