WO2006098583A1 - Multichannel audio compression and decompression method using virtual source location information - Google Patents

Info

Publication number
WO2006098583A1
Authority
WO
WIPO (PCT)
Prior art keywords
channel
vector
angle
location information
audio signal
Prior art date
Application number
PCT/KR2006/000916
Inventor
Jeong Il Seo
Seung Kwon Beack
In Seon Jang
Kyeong Ok Kang
Jin Woo Hong
Min Soo Hahn
Original Assignee
Electronics And Telecommunications Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Electronics And Telecommunications Research Institute
Priority to EP06716366.7A (EP1859439B1)
Priority to CN2006800081055A (CN101138021B)
Priority to US11/817,808 (US20080187144A1)
Priority claimed from KR1020060023545A (KR100714980B1)
Publication of WO2006098583A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008 Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/0204 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using subband decomposition

Definitions

  • the present invention relates to compression and decompression of a
  • multi-channel audio signal and more particularly, to a method for
  • VSLI virtual source location information
  • ICLD inter-channel level difference
  • the ICLD is
  • quantization process assigns a limited number of bits, resolution is limited.
  • the present invention is directed to a method for representing,
  • the present invention is also directed to a method for compressing a
  • One aspect of the present invention provides a method for estimating
  • VSLI virtual source location information
  • angle of the global vector is greater than zero and in a second set when the
  • Another aspect of the present invention provides a method for
  • Yet another aspect of the present invention provides a method for
  • the method comprising the steps of: (i) predicting inverse panning angle information from the VSLI using a constant
  • spatial cue information is represented using virtual sound location
  • FIG. 1 schematically illustrates the configuration of a multi-channel
  • FIG. 2 is a flowchart illustrating a process of estimating virtual sound
  • FIG. 3 illustrates an example in which respective channels of a multi-channel audio signal are virtually assigned on a semicircular plane structure according to an exemplary embodiment of the present invention.
  • FIG. 4 illustrates an example of local vectors estimated in respective
  • FIG. 5 is a flowchart illustrating a process of decoding a multi-channel
  • multi-channel audio encoder includes a down mixer 110 for down-mixing an
  • AAC advanced audio coding
  • a quantizing unit 140 for quantizing the VSLI
  • a multiplexing unit 150 for multiplexing the down-mixed audio signal encoded
  • the virtual source location information (VSLI)
  • ICLD inter-channel level difference
  • sound location vectors include a global vector Gv_b, left and right half-plane
  • Ga_b, LHa_b, RHa_b, LSa_b and RSa_b, respectively.
  • the channels of the multi-channel audio signal are identical to the channels of the multi-channel audio signal
  • FIG. 2 is a flowchart illustrating a process of estimating VSLI of a
  • step 210 respective channels of an input multi-channel audio signal
  • FIG. 3 shows
  • step 220 the multi-channel audio signal is converted into a signal in
  • step 230 the signal in the frequency domain is
  • S_{ch,n} denotes a frequency coefficient of the ch-th channel.
  • ch denotes one of a center channel (C)
  • B_b and B_{b+1}-1 denote frequency indexes corresponding to the lower and upper boundaries of the sub-band b, respectively.
  • step 240 a global vector represented on the semicircular plane
  • assigned the channels is estimated from the signal magnitude of each channel
  • a global vector Gv_b is estimated using
  • A_j denotes virtual location information of each channel signal assigned
  • the virtual location information may be defined as
  • step 250 it is determined whether the angle Ga_b of the global vector
  • step 260 if the angle of the global vector
  • a first set of local vectors is estimated.
  • the second set of local vectors includes LHv_b, LSv_b, and RSv_b
  • Equations 3. An embodiment thereof is shown in FIG. 4.
  • step 280 the angle of the global vector and the angles of the local
  • vectors estimated in step 260 or 270 are transmitted as the VSLI to the decoder.
  • RSa_b, LSa_b} is transmitted, and otherwise, {Ga_b, LHa_b, LSa_b, RSa_b} is
  • the spatial cue information for N multi-channel audio signals can be
  • decoder estimates vector information of original sound from virtual source
  • the sound vector is represented by its magnitude and angle.
  • the vector angle can be obtained from the received VSLI, and the vector
  • an inverse panning angle is predicted
  • the inverse panning angle is predicted using
  • step 520 an estimated power component for each channel in the
  • sub-band is obtained from the predicted inverse panning angle.
  • estimated power component for each channel is obtained using the following
  • each channel signal in each sub-band can be finally
  • S'_k denotes a frequency component coefficient of the received down-mixed signal
  • U_{ch,k} denotes the decompressed audio signal
  • the present invention described above may be provided as one or more
  • the media may include a floppy disc, a hard disc, a CD-ROM, a flash memory card, a programmable read only memory (PROM), a random access memory (RAM), a read only memory (ROM), or a magnetic tape.
  • PROM programmable read only memory
  • RAM random access memory
  • ROM read only memory
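
The estimation and decoding steps listed above survive only as fragments, and the equations themselves are not reproduced on this page. The following is a minimal Python sketch of the general idea, not the patent's actual formulas. It assumes (hypothetically) five channels placed at fixed angles on a semicircle with the center channel at 0 degrees, a global vector formed as the magnitude-weighted sum of unit vectors at those angles, and a constant-power (sine/cosine) panning law for splitting power between two channels at the decoder. The names `CHANNEL_ANGLES_DEG`, `subband_magnitude`, `global_vector`, `global_angle_deg` and `constant_power_split` are illustrative only.

```python
import cmath
import math

# Hypothetical channel angles in degrees on the semicircle (center at 0).
# These values are assumptions for illustration, not taken from the text.
CHANNEL_ANGLES_DEG = {"LS": -90.0, "L": -45.0, "C": 0.0, "R": 45.0, "RS": 90.0}


def subband_magnitude(coeffs, lower, upper):
    """Magnitude of one channel in one sub-band: square root of the summed
    power of the frequency coefficients with indexes lower..upper inclusive."""
    return math.sqrt(sum(abs(c) ** 2 for c in coeffs[lower:upper + 1]))


def global_vector(magnitudes):
    """Sum of the per-channel magnitudes, each placed at its assigned angle."""
    return sum(
        mag * cmath.exp(1j * math.radians(CHANNEL_ANGLES_DEG[ch]))
        for ch, mag in magnitudes.items()
    )


def global_angle_deg(magnitudes):
    """Angle of the global vector in degrees (sign follows the assumed layout)."""
    return math.degrees(cmath.phase(global_vector(magnitudes)))


def constant_power_split(power, angle_deg, left_deg, right_deg):
    """Split a power between two channels located at left_deg and right_deg
    using a constant-power panning law, so that the two parts sum to power."""
    theta = (angle_deg - left_deg) / (right_deg - left_deg) * (math.pi / 2)
    return power * math.cos(theta) ** 2, power * math.sin(theta) ** 2
```

Under these assumptions, the sign of `global_angle_deg(...)` would select which set of local vectors to estimate, and `constant_power_split` illustrates how a decoder could turn a received angle back into per-channel power shares. The angle layout and the panning law are stand-ins chosen for the sketch.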

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Stereophonic System (AREA)

Abstract

The invention relates to a method for compressing and decompressing a multi-channel signal using virtual source location information (VSLI) on a semicircular plane. VSLI is used instead of an inter-channel level difference (ICLD) as spatial cue information, which limits the loss caused by quantization of the spatial cue information, improves the sound quality of the decompressed audio signal, and reproduces an excellent audio signal by reducing distortion when an original signal is decompressed in the spectrum at the decoder.
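
The abstract's point about quantization can be illustrated with a small sketch. An angle on a semicircular plane lies in a bounded range, so a fixed number of bits gives a known worst-case error. The code below is a generic uniform quantizer, not the quantizing unit described in the patent; the range of -90 to 90 degrees and the bit count used in the usage note are assumptions, and the function names are hypothetical.

```python
def quantize_angle(angle_deg, bits, lo=-90.0, hi=90.0):
    """Map an angle to an integer index using 2**bits uniformly spaced levels
    over [lo, hi]. Angles outside the range are clipped. Requires bits >= 1."""
    step = (hi - lo) / (2 ** bits - 1)
    clipped = min(max(angle_deg, lo), hi)
    return round((clipped - lo) / step)


def dequantize_angle(index, bits, lo=-90.0, hi=90.0):
    """Return the reconstruction angle for an index from quantize_angle."""
    step = (hi - lo) / (2 ** bits - 1)
    return lo + index * step
```

With 5 bits, for example, `dequantize_angle(quantize_angle(a, 5), 5)` differs from any in-range angle `a` by at most half a step, where a step is 180/31 degrees.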
PCT/KR2006/000916 2005-03-14 2006-03-14 Multichannel audio compression and decompression method using virtual source location information WO2006098583A1 (fr)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP06716366.7A EP1859439B1 (fr) 2005-03-14 2006-03-14 Multichannel audio compression and decompression method using virtual source location information
CN2006800081055A CN101138021B (zh) 2005-03-14 2006-03-14 Multichannel audio compression and decompression method using virtual source location information
US11/817,808 US20080187144A1 (en) 2005-03-14 2006-03-14 Multichannel Audio Compression and Decompression Method Using Virtual Source Location Information

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
KR20050021104 2005-03-14
KR10-2005-0021104 2005-03-14
KR10-2006-0023545 2006-03-14
KR1020060023545A KR100714980B1 (ko) 2005-03-14 2006-03-14 Method for compressing and decompressing a multi-channel audio signal using virtual source location information

Publications (1)

Publication Number Publication Date
WO2006098583A1 2006-09-21

Family

ID=36991912

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2006/000916 WO2006098583A1 (fr) Multichannel audio compression and decompression method using virtual source location information

Country Status (2)

Country Link
EP (1) EP1859439B1 (fr)
WO (1) WO2006098583A1 (fr)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010000313A1 (fr) * 2008-07-01 2010-01-07 Nokia Corporation Apparatus and method for adjusting spatial cue information of a multichannel audio signal
KR101086347B1 (ko) * 2006-12-27 2011-11-23 한국전자통신연구원 Apparatus and method for coding and decoding multi-object audio signal with various channel including information bitstream conversion

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030035553A1 (en) * 2001-08-10 2003-02-20 Frank Baumgarte Backwards-compatible perceptual coding of spatial cues
US20030236583A1 (en) * 2002-06-24 2003-12-25 Frank Baumgarte Hybrid multi-channel/cue coding/decoding of audio signals

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
FALLER C. AND BAUMGARTE F.: "Binaural Cue Coding: A Novel and Efficient Representation of Spatial Audio", PROCEEDINGS OF IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, vol. 2, 2002, pages 1841 - 1844, XP010804253 *
See also references of EP1859439A4 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101086347B1 (ko) * 2006-12-27 2011-11-23 한국전자통신연구원 Apparatus and method for coding and decoding multi-object audio signal with various channel including information bitstream conversion
US8370164B2 (en) 2006-12-27 2013-02-05 Electronics And Telecommunications Research Institute Apparatus and method for coding and decoding multi-object audio signal with various channel including information bitstream conversion
KR101531239B1 (ko) * 2006-12-27 2015-07-06 한국전자통신연구원 Apparatus for encoding a multi-object audio signal
KR101546744B1 (ko) 2006-12-27 2015-08-24 한국전자통신연구원 Apparatus for transcoding a multi-object audio signal composed of various channels
US9257127B2 (en) 2006-12-27 2016-02-09 Electronics And Telecommunications Research Institute Apparatus and method for coding and decoding multi-object audio signal with various channel including information bitstream conversion
WO2010000313A1 (fr) * 2008-07-01 2010-01-07 Nokia Corporation Apparatus and method for adjusting spatial cue information of a multichannel audio signal
CN102084418A (zh) * 2008-07-01 2011-06-01 诺基亚公司 Apparatus and method for adjusting spatial cue information of a multichannel audio signal
CN102084418B (zh) * 2008-07-01 2013-03-06 诺基亚公司 Apparatus and method for adjusting spatial cue information of a multichannel audio signal
US9025775B2 (en) 2008-07-01 2015-05-05 Nokia Corporation Apparatus and method for adjusting spatial cue information of a multichannel audio signal

Also Published As

Publication number Publication date
EP1859439A4 (fr) 2010-12-22
EP1859439B1 (fr) 2013-10-30
EP1859439A1 (fr) 2007-11-28

Similar Documents

Publication Publication Date Title
US10381013B2 (en) Method and device for metadata for multi-channel or sound-field audio signals
EP3874492B1 (fr) Determining spatial audio parameter encoding and associated decoding
US9659569B2 (en) Audio signal encoder
US10199044B2 (en) Audio signal encoder comprising a multi-channel parameter selector
US20080187144A1 (en) Multichannel Audio Compression and Decompression Method Using Virtual Source Location Information
US20240185869A1 (en) Combining spatial audio streams
WO2006006809A1 (fr) Method and apparatus for encoding and decoding a multi-channel audio signal using virtual source location information
CN114945982A (zh) Spatial audio parameter encoding and associated decoding
US20110137661A1 (en) Quantizing device, encoding device, quantizing method, and encoding method
US20160111100A1 (en) Audio signal encoder
JP5949270B2 (ja) Audio decoding device, audio decoding method, and computer program for audio decoding
WO2019106221A1 (fr) Processing of spatial audio parameters
EP1859439A1 (fr) Multichannel audio compression and decompression method using virtual source location information
CN111179951B (zh) Method and apparatus for decoding a bitstream including an encoded HOA representation, and medium
CN116762127A (zh) Quantizing spatial audio parameters
JP6051621B2 (ja) Audio encoding device, audio encoding method, computer program for audio encoding, and audio decoding device
JP5990954B2 (ja) Audio encoding device, audio encoding method, computer program for audio encoding, audio decoding device, audio decoding method, and computer program for audio decoding
EP3861548B1 (fr) Selection of quantization schemes for spatial audio parameter encoding
CA3208666A1 (fr) Transformation of spatial audio parameters
CN116508098A (zh) Quantizing spatial audio parameters
CN116982108A (zh) Determination of spatial audio parameter encoding and associated decoding

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase (Ref document number: 200680008105.5; Country of ref document: CN)
121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase (Ref document number: 2006716366; Country of ref document: EP)
WWE Wipo information: entry into national phase (Ref document number: 11817808; Country of ref document: US)
NENP Non-entry into the national phase (Ref country code: DE)
NENP Non-entry into the national phase (Ref country code: RU)
WWP Wipo information: published in national office (Ref document number: 2006716366; Country of ref document: EP)