JP2023024295A - Method and system for dynamic voice enhancement - Google Patents

Method and system for dynamic voice enhancement

Info

Publication number
JP2023024295A
Authority
JP
Japan
Prior art keywords
source input
gain control
channel
control parameter
signal processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP2022110199A
Other languages
English (en)
Japanese (ja)
Other versions
JP2023024295A5
Inventor
Shao-Fu Shih
Jianwen Zheng
Yi Xiao
Jiao Evin
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harman International Industries Inc
Original Assignee
Harman International Industries Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-08-05
Filing date
2022-07-08
Publication date
2023-02-16
Application filed by Harman International Industries Inc
Publication of JP2023024295A
Publication of JP2023024295A5
Current legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0316 Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude
    • G10L21/0364 Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude for improving intelligibility
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0316 Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude
    • G10L21/0324 Details of processing therefor
    • G10L21/034 Automatic adjustment
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03G CONTROL OF AMPLIFICATION
    • H03G3/00 Gain control in amplifiers or frequency changers
    • H03G3/20 Automatic control
    • H03G3/30 Automatic control in amplifiers having semiconductor devices
    • H03G3/3005 Automatic control in amplifiers having semiconductor devices in amplifiers suitable for low-frequencies, e.g. audio amplifiers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S1/00 Two-channel systems
    • H04S1/007 Two-channel systems in which the audio signals are in digital form
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S3/00 Systems employing more than two channels, e.g. quadraphonic
    • H04S3/008 Systems employing more than two channels, e.g. quadraphonic in which the audio signals are in digital form, i.e. employing more than two discrete digital channels
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; ELECTRIC HEARING AIDS; PUBLIC ADDRESS SYSTEMS
    • H04R2430/00 Signal processing covered by H04R, not provided for in its groups
    • H04R2430/01 Aspects of volume control, not necessarily automatic, in sound systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S2400/00 Details of stereophonic systems covered by H04S but not provided for in its groups
    • H04S2400/01 Multi-channel, i.e. more than two input channels, sound reproduction with two speakers wherein the multi-channel information is substantially preserved
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S2400/00 Details of stereophonic systems covered by H04S but not provided for in its groups
    • H04S2400/05 Generation or adaptation of centre channel in multi-channel audio systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S2400/00 Details of stereophonic systems covered by H04S but not provided for in its groups
    • H04S2400/13 Aspects of volume control, not necessarily automatic, in stereophonic sound systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Quality & Reliability (AREA)
  • Tone Control, Compression And Expansion, Limiting Amplitude (AREA)
  • Circuit For Audible Band Transducer (AREA)
  • Stereophonic System (AREA)
JP2022110199A 2021-08-05 2022-07-08 Method and system for dynamic voice enhancement Pending JP2023024295A (ja)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110895493.XA CN115881146A (zh) 2021-08-05 2021-08-05 Method and system for dynamic voice enhancement
CN202110895493.X 2021-08-05

Publications (2)

Publication Number Publication Date
JP2023024295A (ja) 2023-02-16
JP2023024295A5 2025-07-14

Family

ID=82608415

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2022110199A Pending JP2023024295A (ja) 2021-08-05 2022-07-08 Method and system for dynamic voice enhancement

Country Status (5)

Country Link
US (1) US20230040743A1
EP (1) EP4131265B1
JP (1) JP2023024295A
KR (1) KR20230021580A
CN (1) CN115881146A

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
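CN116701921B (zh) * 2023-08-08 2023-10-20 University of Electronic Science and Technology of China Adaptive noise suppression circuit for multi-channel time-series signals
CN119889331A (zh) * 2023-10-24 2025-04-25 Harman International Industries, Incorporated Method and system of intelligent dynamic voice enhancement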

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
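JP2001237920A (ja) * 2000-02-23 2001-08-31 Hitachi Kokusai Electric Inc Input level adjustment circuit
JP2009163118A (ja) * 2008-01-09 2009-07-23 Alpine Electronics Inc Audio reproduction method and multi-process system
JP2010539792A (ja) * 2007-09-12 2010-12-16 Dolby Laboratories Licensing Corporation Speech enhancement
JP2011518520A (ja) * 2008-04-18 2011-06-23 Dolby Laboratories Licensing Corporation Method and apparatus for maintaining speech audibility in multi-channel audio with minimal impact on the surround experience
JP2012120052A (ja) * 2010-12-02 2012-06-21 Fujitsu Ten Ltd Correlation reduction method, audio signal conversion device, and sound reproduction device
WO2013038451A1 (ja) * 2011-09-15 2013-03-21 Mitsubishi Electric Corporation Dynamic range control device
WO2013118192A1 (ja) * 2012-02-10 2013-08-15 Mitsubishi Electric Corporation Noise suppression device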
US9324337B2 (en) * 2009-11-17 2016-04-26 Dolby Laboratories Licensing Corporation Method and system for dialog enhancement

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FI20045315L (fi) * 2004-08-30 2006-03-01 Nokia Corp Detection of voice activity in an audio signal
US7464029B2 (en) * 2005-07-22 2008-12-09 Qualcomm Incorporated Robust separation of speech signals in a noisy environment
US8856049B2 (en) * 2008-03-26 2014-10-07 Nokia Corporation Audio signal classification by shape parameter estimation for a plurality of audio signal samples
EP2107553B1 (en) * 2008-03-31 2011-05-18 Harman Becker Automotive Systems GmbH Method for determining barge-in
US8831936B2 (en) * 2008-05-29 2014-09-09 Qualcomm Incorporated Systems, methods, apparatus, and computer program products for speech signal processing using spectral contrast enhancement
US8503694B2 (en) * 2008-06-24 2013-08-06 Microsoft Corporation Sound capture system for devices with two microphones
US20110058676A1 (en) * 2009-09-07 2011-03-10 Qualcomm Incorporated Systems, methods, apparatus, and computer-readable media for dereverberation of multichannel signal
TWI459828B (zh) * 2010-03-08 2014-11-01 Dolby Lab Licensing Corp Method and system for determining the volume reduction ratio of speech-related channels in multi-channel audio
US8989403B2 (en) * 2010-03-09 2015-03-24 Mitsubishi Electric Corporation Noise suppression device
US8744091B2 (en) * 2010-11-12 2014-06-03 Apple Inc. Intelligibility control using ambient noise detection
WO2013184520A1 (en) * 2012-06-04 2013-12-12 Stone Troy Christopher Methods and systems for identifying content types
WO2014043024A1 (en) * 2012-09-17 2014-03-20 Dolby Laboratories Licensing Corporation Long term monitoring of transmission and voice activity patterns for regulating gain control
US10546593B2 (en) * 2017-12-04 2020-01-28 Apple Inc. Deep learning driven multi-channel filtering for speech enhancement
US11164592B1 (en) * 2019-05-09 2021-11-02 Amazon Technologies, Inc. Responsive automatic gain control

Also Published As

Publication number Publication date
KR20230021580A (ko) 2023-02-14
EP4131265A3 (en) 2023-04-19
EP4131265A2 (en) 2023-02-08
US20230040743A1 (en) 2023-02-09
EP4131265B1 (en) 2025-06-11
CN115881146A (zh) 2023-03-31

Similar Documents

Publication Publication Date Title
US10531198B2 (en) Apparatus and method for decomposing an input signal using a downmixer
US9424852B2 (en) Determining the inter-channel time difference of a multi-channel audio signal
US9311923B2 (en) Adaptive audio processing based on forensic detection of media processing history
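CN105284133B (zh) Apparatus and method for center signal scaling and stereophonic enhancement based on a signal-to-downmix ratio
JP7818660B2 (ja) Spatial audio representation and rendering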
US10798511B1 (en) Processing of audio signals for spatial audio
CN109841223B (zh) Audio signal processing method, intelligent terminal, and storage medium
JP2023024295A (ja) Method and system for dynamic voice enhancement
US20250365552A1 (en) Binaural signal post-processing
GB2574667A (en) Spatial audio capture, transmission and reproduction
US12058511B2 (en) Sound field related rendering
CN120660137A (zh) Dialog intelligibility enhancement method and system
US20250131939A1 (en) Method and System of Intelligent Dynamic Voice Enhancement
CN118942477B (zh) Signal processing method for enhancing the human voice, electronic device, and storage medium
Uhle Center signal scaling using signal-to-downmix ratios
US20240274137A1 (en) Parametric spatial audio rendering
JP2018029306A (ja) Channel number conversion device and program therefor

Legal Events

Date Code Title Description
A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20250704

A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20250704

A977 Report on retrieval

Free format text: JAPANESE INTERMEDIATE CODE: A971007

Effective date: 20251223

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20251226

A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20260310