CN116438811A - Method and device for classification of uncorrelated stereo content, cross-talk detection, and stereo mode selection in a sound codec - Google Patents

Method and device for classification of uncorrelated stereo content, cross-talk detection, and stereo mode selection in a sound codec

Info

Publication number
CN116438811A
Authority
CN
China
Prior art keywords
stereo
stereo mode
channel
sound signal
crosstalk
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202180071762.9A
Other languages
English (en)
Chinese (zh)
Inventor
V·马列诺夫斯基
T·维兰考特
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
VoiceAge Corp
Original Assignee
VoiceAge Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by VoiceAge Corp
Publication of CN116438811A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008 Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16 Vocoder architecture
    • G10L19/18 Vocoders using multiple modes
    • G10L19/22 Mode decision, i.e. based on audio signal content versus external parameters
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; ELECTRIC HEARING AIDS; PUBLIC ADDRESS SYSTEMS
    • H04R27/00 Public address systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S1/00 Two-channel systems
    • H04S1/007 Two-channel systems in which the audio signals are in digital form
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S2420/00 Techniques used stereophonic systems covered by H04S but not provided for in its groups
    • H04S2420/03 Application of parametric coding in stereophonic audio systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Mathematical Physics (AREA)
  • Stereophonic System (AREA)
CN202180071762.9A 2020-09-09 2021-09-08 Method and device for classification of uncorrelated stereo content, cross-talk detection, and stereo mode selection in a sound codec Pending CN116438811A (zh)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US202063075984P 2020-09-09 2020-09-09
US63/075,984 2020-09-09
PCT/CA2021/051238 WO2022051846A1 (en) 2020-09-09 2021-09-08 Method and device for classification of uncorrelated stereo content, cross-talk detection, and stereo mode selection in a sound codec

Publications (1)

Publication Number Publication Date
CN116438811A (zh) 2023-07-14

Family

ID=80629696

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202180071762.9A Pending CN116438811A (zh) 2021-09-08 Method and device for classification of uncorrelated stereo content, cross-talk detection, and stereo mode selection in a sound codec

Country Status (9)

Country Link
US (1) US12494210B2
EP (1) EP4211683B1
JP (1) JP7808095B2
KR (1) KR20230066056A
CN (1) CN116438811A
BR (1) BR112023003311A2
CA (1) CA3192085A1
MX (1) MX2023002825A
WO (1) WO2022051846A1

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US12341621B1 (en) * 2022-01-31 2025-06-24 Zoom Communications, Inc. Audio capture device selection for in-person conference participants

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004509366A (ja) * 2000-09-15 2004-03-25 Telefonaktiebolaget LM Ericsson Encoding and decoding of multi-channel signals
CN101548315A (zh) * 2006-11-30 2009-09-30 Nokia Corporation Method, apparatus and computer program product for stereo coding
US20100189290A1 (en) * 2009-01-29 2010-07-29 Samsung Electronics Co. Ltd Method and apparatus to evaluate quality of audio signal
US20150049872A1 (en) * 2012-04-05 2015-02-19 Huawei Technologies Co., Ltd. Multi-channel audio encoder and method for encoding a multi-channel audio signal
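CN107430863A (zh) * 2015-03-09 2017-12-01 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung Audio encoder for encoding a multi-channel signal and audio decoder for decoding an encoded audio signal
CN108352162A (zh) * 2015-09-25 2018-07-31 VoiceAge Corporation Method and system for encoding a stereo sound signal using coding parameters of a primary channel to encode a secondary channel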

Family Cites Families (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
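JP3207281B2 (ja) 1993-02-12 2001-09-10 Toshiba Corporation Stereo speech encoding/decoding system, stereo speech decoding device, and single-talk/multiple-simultaneous-talk discrimination device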
AU5663296A (en) * 1995-04-10 1996-10-30 Corporate Computer Systems, Inc. System for compression and decompression of audio signals fo r digital transmission
US6456964B2 (en) 1998-12-21 2002-09-24 Qualcomm, Incorporated Encoding of periodic speech using prototype waveforms
US6151571A (en) * 1999-08-31 2000-11-21 Andersen Consulting System, method and article of manufacture for detecting emotion in voice signals through analysis of a plurality of voice signal parameters
KR20070065401A (ko) * 2004-09-23 2007-06-22 Koninklijke Philips Electronics N.V. System and method for processing audio data, program element, and computer-readable medium
US7599840B2 (en) * 2005-07-15 2009-10-06 Microsoft Corporation Selectively using multiple entropy models in adaptive coding and decoding
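KR20070077652A (ko) 2006-01-24 2007-07-27 Samsung Electronics Co., Ltd. Apparatus for determining an adaptive time/frequency-based encoding mode and method of determining the encoding mode therefor
KR20100006492A (ko) 2008-07-09 2010-01-19 Samsung Electronics Co., Ltd. Method and apparatus for determining an encoding scheme
CN101615910B (zh) * 2009-05-31 2010-12-22 Huawei Technologies Co., Ltd. Compression encoding method, apparatus and device, and compression decoding method
PT2633521T (pt) * 2010-10-25 2018-11-13 Voiceage Corp Coding of generic audio signals at low bit rates and low delay
JP6061121B2 (ja) 2011-07-01 2017-01-18 Sony Corporation Audio encoding device, audio encoding method, and program
TWI612518B (zh) * 2012-11-13 2018-01-21 Samsung Electronics Co., Ltd. Encoding mode determination method, audio encoding method, and audio decoding method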
US9886963B2 (en) 2015-04-05 2018-02-06 Qualcomm Incorporated Encoder selection
WO2016184958A1 (en) 2015-05-20 2016-11-24 Telefonaktiebolaget Lm Ericsson (Publ) Coding of multi-channel audio signals
US9888318B2 (en) * 2015-11-25 2018-02-06 Mediatek, Inc. Method, system and circuits for headset crosstalk reduction
US11145316B2 (en) 2017-06-01 2021-10-12 Panasonic Intellectual Property Corporation Of America Encoder and encoding method for selecting coding mode for audio channels based on interchannel correlation
US11270710B2 (en) * 2017-09-25 2022-03-08 Panasonic Intellectual Property Corporation Of America Encoder and encoding method

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004509366A (ja) * 2000-09-15 2004-03-25 Telefonaktiebolaget LM Ericsson Encoding and decoding of multi-channel signals
CN101548315A (zh) * 2006-11-30 2009-09-30 Nokia Corporation Method, apparatus and computer program product for stereo coding
US20100189290A1 (en) * 2009-01-29 2010-07-29 Samsung Electronics Co. Ltd Method and apparatus to evaluate quality of audio signal
US20150049872A1 (en) * 2012-04-05 2015-02-19 Huawei Technologies Co., Ltd. Multi-channel audio encoder and method for encoding a multi-channel audio signal
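CN107430863A (zh) * 2015-03-09 2017-12-01 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung Audio encoder for encoding a multi-channel signal and audio decoder for decoding an encoded audio signal
CN108352162A (zh) * 2015-09-25 2018-07-31 VoiceAge Corporation Method and system for encoding a stereo sound signal using coding parameters of a primary channel to encode a secondary channel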
US20180233154A1 (en) * 2015-09-25 2018-08-16 Voiceage Corporation Method and system for encoding left and right channels of a stereo sound signal selecting between two and four sub-frames models depending on the bit budget

Also Published As

Publication number Publication date
MX2023002825A (es) 2023-05-30
JP7808095B2 (ja) 2026-01-28
EP4211683A1 (en) 2023-07-19
KR20230066056A (ko) 2023-05-12
WO2022051846A1 (en) 2022-03-17
EP4211683A4 (en) 2024-08-07
CA3192085A1 (en) 2022-03-17
JP2023540377A (ja) 2023-09-22
US12494210B2 (en) 2025-12-09
US20240021208A1 (en) 2024-01-18
BR112023003311A2 (pt) 2023-03-21
EP4211683B1 (en) 2026-04-01

Similar Documents

Publication Publication Date Title
US12198705B2 (en) Apparatus, method or computer program for estimating an inter-channel time difference
Tan et al. Real-time speech enhancement using an efficient convolutional recurrent network for dual-microphone mobile phones in close-talk scenarios
Zheng Soundfield navigation: Separation, compression and transmission
CN115428068B (zh) Method and device for speech/music classification and core encoder selection in a sound codec
US11463833B2 (en) Method and apparatus for voice or sound activity detection for spatial audio
US12494210B2 (en) Method and device for classification of uncorrelated stereo content, cross-talk detection, and stereo mode selection in a sound codec
Lee et al. Speech/audio signal classification using spectral flux pattern recognition
RU2648632C2 (ru) Multi-channel audio signal classifier
HK40090246A (zh) Method and device for classification of uncorrelated stereo content, cross-talk detection, and stereo mode selection in a sound codec
Lewis et al. Cochannel speaker count labelling based on the use of cepstral and pitch prediction derived features
Liu et al. Deep Clustering in Complex Domain for Single-Channel Speech Separation
Yang et al. Multi-channel speech separation using deep embedding model with multilayer bootstrap networks
CN118020101A (zh) Array-geometry-agnostic multi-channel personalized speech enhancement
House 2007 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics
Ma Identification and Elimination of Crosstalk in Audio Recordings
Cantzos Psychoacoustically-Driven Multichannel Audio Coding
Sadjadi Robust front-end processing for speech applications under acoustic mismatch conditions
House 2009 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics
HK1125216A (en) Neural network classifier for separating audio sources from a monophonic audio signal

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code
Ref country code: HK
Ref legal event code: DE
Ref document number: 40090246
Country of ref document: HK