WO2006098583A1 - Multichannel audio compression and decompression method using virtual source location information
- Publication number
- WO2006098583A1 (application PCT/KR2006/000916)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- channel
- vector
- angle
- location information
- audio signal
- Prior art date
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/0204—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using subband decomposition
Definitions
- the present invention relates to compression and decompression of a
- multi-channel audio signal and more particularly, to a method for
- VSLI virtual source location information
- ICLD inter-channel level difference
- the ICLD is
- quantization process assigns a limited number of bits, resolution is limited.
- the present invention is directed to a method for representing,
- the present invention is also directed to a method for compressing a
- One aspect of the present invention provides a method for estimating
- VSLI virtual source location information
- angle of the global vector is greater than zero and in a second set when the
- Another aspect of the present invention provides a method for
- Yet another aspect of the present invention provides a method for
- the method comprising the steps of: (i) predicting inverse panning angle information from the VSLI using a constant
- spatial cue information is represented using virtual sound location
- FIG. 1 schematically illustrates the configuration of a multi-channel
- FIG. 2 is a flowchart illustrating a process of estimating virtual sound
- FIG. 3 illustrates an example in which respective channels of a multi-channel audio signal are virtually assigned on a semicircular plane structure according to an exemplary embodiment of the present invention.
- FIG. 4 illustrates an example of local vectors estimated in respective
- FIG. 5 is a flowchart illustrating a process of decoding a multi-channel
- multi-channel audio encoder includes a down mixer 110 for down-mixing an
- AAC advanced audio coding
- VSLI virtual source location information
- a quantizing unit 140 for quantizing the VSLI
- a multiplexing unit 150 for multiplexing the down-mixed audio signal encoded
- the virtual source location information (VSLI)
- ICLD inter-channel level difference
- sound location vectors include a global vector Gv_b, left and right half-plane
- Ga_b, LHa_b, RHa_b, LSa_b and RSa_b, respectively.
- the channels of the multi-channel audio signal are identical to the channels of the multi-channel audio signal
- FIG. 2 is a flowchart illustrating a process of estimating VSLI of a
- step 210 respective channels of an input multi-channel audio signal
- FIG. 3 shows
- step 220 the multi-channel audio signal is converted into a signal in
- step 230 the signal in the frequency domain is
- S_{ch,n} denotes a frequency coefficient of the ch-th channel.
- ch denotes one of a center channel (C)
- B_b and B_{b+1}-1 denote frequency indexes corresponding to the lower and upper boundaries of sub-band b, respectively.
- step 240 a global vector represented on the semicircular plane to which the channels are assigned is estimated from the signal magnitude of each channel
- a global vector Gv_b is estimated using
- Aj denotes virtual location information of each channel signal assigned
- the virtual location information may be defined as
- step 250 it is determined whether the angle Ga_b of the global vector
- step 260 if the angle of the global vector
- step 1 a first set of local vectors is estimated.
- the second set of local vectors includes LHv_b, LSv_b, and RSv_b
- Equations 3 An embodiment thereof is shown in FIG. 4.
- step 280 the angle of the global vector and the angles of the local vectors estimated in step 260 or 270 are transmitted as the VSLI to the decoder.
- RSa_b, LSa_b} is transmitted, and otherwise, {Ga_b, LHa_b, LSa_b, RSa_b} is
- the spatial cue information for N multi-channel audio signals can be
- decoder estimates vector information of original sound from virtual source
- the sound vector is represented by its magnitude and angle.
- the vector angle can be obtained from the received VSLI, and the vector
- an inverse panning angle is predicted
- the inverse panning angle is predicted using
- step 520 an estimated power component for each channel in the sub-band is obtained from the predicted inverse panning angle.
- estimated power component for each channel is obtained using the following
- each channel signal in each sub-band can be finally
- S'_k denotes a frequency component coefficient of the received down-mixed signal
- U c i 1;k denotes the decompressed audio signal
- the present invention described above may be provided as one or more
- the media may include a floppy disc, a hard disc, a CD-ROM, a flash memory card, a programmable read only memory (PROM), a random access memory (RAM).
- PROM programmable read only memory
- RAM random access memory
- ROM read only memory
- magnetic tape
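The encoder steps listed above (sub-band magnitude, global vector, choice of local-vector set) can be sketched roughly as follows. This is a minimal illustration, not the patent's equations, which did not survive in this record. The channel angles in `CHANNEL_ANGLES_DEG`, the magnitude measure, the vector-sum rule, and all function names are assumptions made for the example; only the "first set when the global angle is greater than zero, second set otherwise" rule comes from the text.

```python
import math

# Hypothetical placement of five channels on a semicircle (degrees from the
# center channel, positive to the right). Illustrative values only; they are
# not taken from the patent text.
CHANNEL_ANGLES_DEG = {"LS": -90.0, "L": -45.0, "C": 0.0, "R": 45.0, "RS": 90.0}


def subband_magnitude(coeffs):
    """Assumed magnitude measure: root of the summed squared coefficients
    of one channel inside one sub-band."""
    return math.sqrt(sum(abs(c) ** 2 for c in coeffs))


def global_vector(magnitudes):
    """Assumed estimation rule: add each channel's magnitude as a vector
    pointing at that channel's angle, then return (magnitude, angle_deg)."""
    x = 0.0  # component toward the right
    y = 0.0  # component toward the center channel
    for ch, m in magnitudes.items():
        a = math.radians(CHANNEL_ANGLES_DEG[ch])
        x += m * math.sin(a)
        y += m * math.cos(a)
    return math.hypot(x, y), math.degrees(math.atan2(x, y))


def select_local_set(global_angle_deg):
    """Pick the local-vector set: first set if the global angle is greater
    than zero, second set otherwise."""
    return "first" if global_angle_deg > 0 else "second"


if __name__ == "__main__":
    mags = {"C": 1.0, "R": 1.0}
    magnitude, angle = global_vector(mags)
    print(round(angle, 3), select_local_set(angle))
```

With equal magnitudes on the assumed C and R positions, the global vector falls halfway between them, so the first set of local vectors would be selected.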
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Stereophonic System (AREA)
Abstract
The invention relates to a method for compressing and decompressing a multi-channel signal using virtual source location information (VSLI) on a semicircular plane. VSLI is used instead of inter-channel level difference (ICLD) as spatial cue information, which limits the loss caused by quantizing the spatial cue information, improves the sound quality of the decompressed audio signal, and reproduces an excellent audio signal by reducing spectral distortion when the original signal is decompressed at the decoder.
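The quantization argument in the abstract can be illustrated with a uniform quantizer over a bounded angle. This is a minimal sketch under stated assumptions: angles on the semicircular plane are taken to lie in [-90, 90] degrees and to be quantized uniformly. The range, the bit count, and the function names are hypothetical and are not taken from the patent.

```python
def quantize_angle(angle_deg, bits, lo=-90.0, hi=90.0):
    """Map an angle to an integer index in [0, 2**bits - 1] using uniform
    steps; out-of-range input is clamped (assumed behaviour)."""
    levels = (1 << bits) - 1  # number of steps between lo and hi
    clamped = min(max(angle_deg, lo), hi)
    return round((clamped - lo) / (hi - lo) * levels)


def dequantize_angle(index, bits, lo=-90.0, hi=90.0):
    """Reconstruct the angle represented by a quantization index."""
    levels = (1 << bits) - 1
    return lo + index * (hi - lo) / levels


def max_error_deg(bits, lo=-90.0, hi=90.0):
    """Worst-case reconstruction error: half of one quantization step."""
    return (hi - lo) / ((1 << bits) - 1) / 2.0


if __name__ == "__main__":
    idx = quantize_angle(12.5, bits=5)
    print(idx, dequantize_angle(idx, bits=5), max_error_deg(5))
```

Because the angle range is bounded in this sketch, the worst-case error depends only on the number of bits. A level difference expressed in decibels has no such fixed bound, which is one plausible reading of why the abstract prefers an angle-based cue.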
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP06716366.7A EP1859439B1 (en) | 2005-03-14 | 2006-03-14 | Multichannel audio compression and decompression method using virtual source location information |
CN2006800081055A CN101138021B (en) | 2005-03-14 | 2006-03-14 | Multichannel audio compression and decompression method using virtual source location information |
US11/817,808 US20080187144A1 (en) | 2005-03-14 | 2006-03-14 | Multichannel Audio Compression and Decompression Method Using Virtual Source Location Information |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR20050021104 | 2005-03-14 | ||
KR10-2005-0021104 | 2005-03-14 | ||
KR10-2006-0023545 | 2006-03-14 | ||
KR1020060023545A KR100714980B1 (en) | 2005-03-14 | 2006-03-14 | Method for compressing and decompressing a multi-channel audio signal using virtual sound source location information |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2006098583A1 | 2006-09-21 |
Family
ID=36991912
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/KR2006/000916 WO2006098583A1 (en) | Multichannel audio compression and decompression method using virtual source location information | 2005-03-14 | 2006-03-14 |
Country Status (2)
Country | Link |
---|---|
EP (1) | EP1859439B1 (fr) |
WO (1) | WO2006098583A1 (fr) |
2006
- 2006-03-14 WO PCT/KR2006/000916 patent/WO2006098583A1/fr active Application Filing
- 2006-03-14 EP EP06716366.7A patent/EP1859439B1/fr not_active Not-in-force
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030035553A1 (en) * | 2001-08-10 | 2003-02-20 | Frank Baumgarte | Backwards-compatible perceptual coding of spatial cues |
US20030236583A1 (en) * | 2002-06-24 | 2003-12-25 | Frank Baumgarte | Hybrid multi-channel/cue coding/decoding of audio signals |
Non-Patent Citations (2)
Title |
---|
FALLER C. AND BAUMGARTE F.: "Binaural Cue Coding: A Novel and Efficient Representation of Spatial Audio", PROCEEDINGS OF IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, vol. 2, 2002, pages 1841 - 1844, XP010804253 * |
See also references of EP1859439A4 * |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101086347B1 (en) * | 2006-12-27 | 2011-11-23 | Electronics and Telecommunications Research Institute | Apparatus and method for coding and decoding multi-object audio signal with various channel including information bitstream conversion |
US8370164B2 (en) | 2006-12-27 | 2013-02-05 | Electronics And Telecommunications Research Institute | Apparatus and method for coding and decoding multi-object audio signal with various channel including information bitstream conversion |
KR101531239B1 (en) * | 2006-12-27 | 2015-07-06 | Electronics and Telecommunications Research Institute | Apparatus for encoding multi-object audio signal |
KR101546744B1 (en) | 2006-12-27 | 2015-08-24 | Electronics and Telecommunications Research Institute | Apparatus for transcoding multi-object audio signal composed of various channels |
US9257127B2 (en) | 2006-12-27 | 2016-02-09 | Electronics And Telecommunications Research Institute | Apparatus and method for coding and decoding multi-object audio signal with various channel including information bitstream conversion |
WO2010000313A1 (en) * | 2008-07-01 | 2010-01-07 | Nokia Corporation | Apparatus and method for adjusting spatial cue information of a multichannel audio signal |
CN102084418A (en) * | 2008-07-01 | 2011-06-01 | Nokia Corporation | Apparatus and method for adjusting spatial cue information of a multichannel audio signal |
CN102084418B (en) * | 2008-07-01 | 2013-03-06 | Nokia Corporation | Apparatus and method for adjusting spatial cue information of a multichannel audio signal |
US9025775B2 (en) | 2008-07-01 | 2015-05-05 | Nokia Corporation | Apparatus and method for adjusting spatial cue information of a multichannel audio signal |
Also Published As
Publication number | Publication date |
---|---|
EP1859439A4 (fr) | 2010-12-22 |
EP1859439B1 (fr) | 2013-10-30 |
EP1859439A1 (fr) | 2007-11-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10381013B2 (en) | Method and device for metadata for multi-channel or sound-field audio signals | |
EP3874492B1 (en) | Determination of spatial audio parameter encoding and associated decoding | |
US9659569B2 (en) | Audio signal encoder | |
US10199044B2 (en) | Audio signal encoder comprising a multi-channel parameter selector | |
US20080187144A1 (en) | Multichannel Audio Compression and Decompression Method Using Virtual Source Location Information | |
US20240185869A1 (en) | Combining spatial audio streams | |
WO2006006809A1 (en) | Method and apparatus for encoding and decoding a multi-channel audio signal using virtual source location information | |
CN114945982A (en) | Spatial audio parameter encoding and associated decoding | |
US20110137661A1 (en) | Quantizing device, encoding device, quantizing method, and encoding method | |
US20160111100A1 (en) | Audio signal encoder | |
JP5949270B2 (en) | Audio decoding device, audio decoding method, and computer program for audio decoding | |
WO2019106221A1 (en) | Processing of spatial audio parameters | |
EP1859439A1 (en) | Multichannel audio compression and decompression method using virtual source location information | |
CN111179951B (en) | Method and apparatus for decoding a bitstream including an encoded HOA representation, and medium | |
CN116762127A (en) | Quantizing spatial audio parameters | |
JP6051621B2 (en) | Audio encoding device, audio encoding method, computer program for audio encoding, and audio decoding device | |
JP5990954B2 (en) | Audio encoding device, audio encoding method, computer program for audio encoding, audio decoding device, audio decoding method, and computer program for audio decoding | |
EP3861548B1 (en) | Selection of quantization schemes for spatial audio parameter encoding | |
CA3208666A1 (en) | Transforming spatial audio parameters | |
CN116508098A (en) | Quantizing spatial audio parameters | |
CN116982108A (en) | Determination of spatial audio parameter encoding and associated decoding |
Legal Events
Code | Title | Description |
---|---|---|
WWE | Wipo information: entry into national phase | Ref document number: 200680008105.5; Country of ref document: CN |
121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
WWE | Wipo information: entry into national phase | Ref document number: 2006716366; Country of ref document: EP |
WWE | Wipo information: entry into national phase | Ref document number: 11817808; Country of ref document: US |
NENP | Non-entry into the national phase | Ref country code: DE |
NENP | Non-entry into the national phase | Ref country code: RU |
WWP | Wipo information: published in national office | Ref document number: 2006716366; Country of ref document: EP |