EP1779385B1 - Method and apparatus for encoding and decoding a multi-channel audio signal using virtual source location information - Google Patents

Method and apparatus for encoding and decoding a multi-channel audio signal using virtual source location information

Info

Publication number
EP1779385B1
Authority
EP
European Patent Office
Prior art keywords
audio signal
vector
channel
vsli
signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Not-in-force
Application number
EP05774399A
Other languages
German (de)
English (en)
Other versions
EP1779385A1 (fr)
EP1779385A4 (fr)
Inventor
Jeong Il Seo
Han Gil Moon
Seung Kwon Beack
Kyeong Ok Kang
In Seon Jang
Koeng Mo Sung
Min Soo Hahn
Jin Woo Hong
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Electronics and Telecommunications Research Institute ETRI
Seoul National University Industry Foundation
Original Assignee
Electronics and Telecommunications Research Institute ETRI
Seoul National University Industry Foundation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from KR1020050061425A external-priority patent/KR100663729B1/ko
Application filed by Electronics and Telecommunications Research Institute ETRI, Seoul National University Industry Foundation filed Critical Electronics and Telecommunications Research Institute ETRI
Publication of EP1779385A1 (fr)
Publication of EP1779385A4 (fr)
Application granted
Publication of EP1779385B1 (fr)
Not-in-force legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S 3/00 Systems employing more than two channels, e.g. quadraphonic
    • H04S 3/008 Systems employing more than two channels, e.g. quadraphonic in which the audio signals are in digital form, i.e. employing more than two discrete digital channels
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L 19/008 Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S 2420/00 Techniques used in stereophonic systems covered by H04S but not provided for in its groups
    • H04S 2420/03 Application of parametric coding in stereophonic audio systems

Definitions

  • the present invention relates to a method and apparatus for encoding/decoding a multi-channel audio signal, and more particularly, to a method and apparatus for effectively encoding/decoding a multi-channel audio signal using Virtual Source Location Information (VSLI).
  • the Moving Picture Experts Group (MPEG) has performed research on compressing multi-channel audio signals. Owing to the remarkable increase in multi-channel content, the growing demand for such content, and the increasing need for multi-channel audio services in broadcasting and communications environments, research on multi-channel audio compression technology has been stepped up.
  • examples of multi-channel audio compression technology include MPEG-2 Backward Compatibility (BC), MPEG-2 Advanced Audio Coding (AAC), and MPEG-4 AAC.
  • Binaural Cue Coding (BCC) is a technology for effectively compressing a multi-channel audio signal; it was developed on the basis of the fact that people can acoustically perceive space due to the binaural effect. BCC relies on the fact that a pair of ears perceives the location of a specific sound source using interaural level differences and/or interaural time differences.
  • in BCC, a multi-channel audio signal is downmixed to a monophonic or stereophonic signal, and the channel information is represented by binaural cue parameters such as the Inter-channel Level Difference (ICLD) and the Inter-channel Time Difference (ICTD).
  • Document WO 99/52326 discloses a spatial audio coding system, including an encoder and a decoder, that operates at very low bit-rates and is useful for delivering audio via the Internet.
  • the listener or listeners preferably are located within a predictable listening area, for example, users of a personal computer or television viewers.
  • An encoder produces a composite audio-information signal representing the soundfield to be reproduced and a directional vector or "steering control signal".
  • the composite audio-information signal has its frequency spectrum broken into a number of subbands, preferably commensurate with the critical bands of the human ear.
  • the steering control signal has a component relating to the dominant direction of the soundfield in each of the subbands. Because the system is based on the premise that only sound from a single direction is heard at any instant, the decoder need not apply a signal to more than two sound transducers at any instant.
  • Document WO 03/090208 discloses a psycho-acoustically motivated, parametric representation of the spatial attributes of multichannel audio signals.
  • the parametric representation allows strong bitrate reductions in audio coders, since only one monaural signal has to be transmitted, combined with (quantized) parameters which describe the spatial properties of the signal.
  • the decoder can form the original amount of audio channels by applying the spatial parameters.
  • the present invention, as defined by the independent claims 1, 10, 13, 22, 26 and 27, is directed to reproduction of a realistic audio signal by encoding/decoding a multi-channel audio signal using only a downmixed audio signal and a small amount of additional information.
  • the present invention is also directed to maximizing transmission efficiency by analyzing a per-channel sound source of a multi-channel audio signal, extracting a small amount of virtual source location information, and transmitting the extracted virtual source location information together with a downmixed audio signal.
  • FIG. 1 is a block diagram of an apparatus for encoding a multi-channel audio signal according to an exemplary embodiment of the present invention.
  • the multi-channel audio signal encoding apparatus includes a frame converter 100, a downmixer 110, an Advanced Audio Coding (AAC) encoder 120, a multiplexer 130, a quantizer 140, and a Virtual Source Location Information (VSLI) analyzer 150.
  • the frame converter 100 frames the multi-channel audio signal, using a window function such as a sine window, so that the signal can be processed block by block.
  • the downmixer 110 receives the framed multi-channel audio signal from the frame converter 100 and downmixes it into a monophonic signal or a stereophonic signal.
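A minimal sketch (in Python, not taken from the patent) of the framing and downmixing steps just described: a sine window is applied to overlapping blocks of the multi-channel input, and each windowed block is downmixed to mono. The frame length, hop size, and equal-weight downmix gains are illustrative assumptions.

```python
import numpy as np

def frame_signal(x, frame_len=1024, hop=512):
    """Split a (channels, samples) array into overlapping sine-windowed frames."""
    win = np.sin(np.pi * (np.arange(frame_len) + 0.5) / frame_len)  # sine window
    n_frames = 1 + (x.shape[1] - frame_len) // hop
    frames = np.stack([x[:, i * hop:i * hop + frame_len] * win
                       for i in range(n_frames)])
    return frames  # shape: (n_frames, channels, frame_len)

def downmix_mono(frames):
    """Downmix each framed multi-channel block to a monophonic block."""
    return frames.mean(axis=1)  # equal-weight sum; the actual downmix gains may differ
```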
  • the AAC encoder 120 compresses the downmixed audio signal received from the downmixer 110, to generate an AAC encoded signal. It then transmits the AAC encoded signal to the multiplexer 130.
  • the VSLI analyzer 150 extracts Virtual Source Location Information (VSLI) from the framed audio signal.
  • the VSLI analyzer 150 may include a time-to-frequency converter 151, an Equivalent Rectangular Bandwidth (ERB) filter bank 152, an energy vector detector 153, and a location estimator 154.
  • the time-to-frequency converter 151 performs a plurality of Fast Fourier Transforms (FFTs) to convert the framed audio signal into a frequency domain signal.
  • the ERB filter bank 152 divides the converted frequency domain signal (spectrum) into per-band spectrums (for example, 20 bands).
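The following sketch illustrates the time-to-frequency conversion and the per-band split. The ERB band edges are derived from the Glasberg-Moore ERB-rate formula as an assumption; the patent only indicates that the spectrum is divided into roughly 20 perceptually motivated bands.

```python
import numpy as np

def erb_band_edges(fs, n_bands=20):
    """Band edges (Hz) spaced uniformly on the ERB-rate scale up to fs/2 (assumed rule)."""
    erb = lambda f: 21.4 * np.log10(4.37e-3 * f + 1.0)        # Hz -> ERB rate
    erb_inv = lambda e: (10.0 ** (e / 21.4) - 1.0) / 4.37e-3  # ERB rate -> Hz
    return erb_inv(np.linspace(0.0, erb(fs / 2.0), n_bands + 1))

def band_spectra(frame, fs, n_bands=20):
    """FFT one windowed frame (channels, samples) and group bins into ERB-like bands."""
    spec = np.fft.rfft(frame, axis=-1)                        # per-channel spectrum
    freqs = np.fft.rfftfreq(frame.shape[-1], 1.0 / fs)
    edges = erb_band_edges(fs, n_bands)
    return [spec[:, (freqs >= lo) & (freqs < hi)]
            for lo, hi in zip(edges[:-1], edges[1:])]
```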
  • FIG. 2 is a conceptual diagram of a time-to-frequency lattice using the ERB filter bank 152.
  • the energy vector detector 153 detects per-channel energy vectors from the corresponding per-band spectrum.
  • the location estimator 154 estimates the virtual source location information (VSLI) using the per-channel energy vectors detected by the energy vector detector 153.
  • the VSLI may be represented using azimuth angles between the source location vectors and a center channel.
  • the VSLI estimated by the location estimator 154 can vary depending on whether the downmixed audio signal is monophonic or stereophonic.
  • FIG. 3 is a conceptual diagram illustrating the source location vectors estimated according to the present invention, in the case where the downmixed audio signal is monophonic.
  • the source location vectors estimated from the downmixed monophonic signal include a Left Half-plane Vector (LHV), a Right Half-plane Vector (RHV), a Left Subsequent Vector (LSV), a Right Subsequent Vector (RSV), and a Global Vector (GV).
  • FIG. 4 is a conceptual diagram illustrating the source location vectors estimated according to the present invention, in the case where the downmixed multi-channel audio signal is stereophonic.
  • the source location vectors estimated from the downmixed stereophonic signal include the LHV, the RHV, the LSV, and the RSV, but not the GV.
  • the quantizer 140 quantizes the VSLI (azimuth angles) received from the VSLI analyzer 150 and transmits the quantized VSLI signal to the multiplexer 130.
  • the multiplexer 130 receives the AAC encoded signal from the AAC encoder 120 and the quantized VSLI signal from the quantizer 140 and multiplexes them to generate an encoded multi-channel audio signal (i.e., the AAC encoded signal + the VSLI signal).
  • FIG. 5 is a conceptual diagram illustrating a process of estimating the VSLI according to an exemplary embodiment of the present invention.
  • the input multi-channel audio signal comprises five channels: center (C), front left (L), front right (R), left subsequent (LS), and right subsequent (RS).
  • the input signal is converted into a frequency axis signal through the plurality of FFTs and divided into N frequency bands (BAND 1, BAND 2, ..., BAND N) in the ERB filter bank 152.
  • the per-channel energy vectors may be detected from the power of each of the five channels for each band (for example, C1 PWR, L1 PWR, R1 PWR, LS1 PWR, and RS1 PWR).
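A short sketch of how the per-band, per-channel powers (the magnitudes of the energy vectors) could be computed from the band-limited spectra produced above; the channel ordering is an assumption.

```python
import numpy as np

def channel_band_power(band_spec):
    """band_spec: complex array of shape (5, bins) for one ERB band.
    Returns the per-channel power in that band, ordered C, L, R, LS, RS (assumed)."""
    return np.sum(np.abs(band_spec) ** 2, axis=-1)
```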
  • the source location vectors may be estimated from the detected per-channel energy vectors, for example on the basis of a Constant Power Panning (CPP) law, and the azimuth angles between the source location vectors and the center channel, which represent the VSLI, may be estimated.
  • FIGS. 6 to 9 illustrate detailed processes of estimating the VSLI according to the present invention.
  • the per-channel energy vectors detected by the energy vector detector 153 are a center channel energy vector (C), a front left channel energy vector (L), a left subsequent channel energy vector (LS), a front right channel energy vector (R), and a right subsequent channel energy vector (RS).
  • the LHV is estimated using the front left channel energy vector (L) and the left subsequent channel energy vector (LS) (Refer to FIG. 6).
  • the RHV is estimated using the front right channel energy vector (R) and the right subsequent channel energy vector (RS) (Refer to FIG. 7).
  • the LSV and RSV may be estimated using the LHV, the RHV, and the center channel energy vector (C) (Refer to FIG. 8).
  • the gain of each channel can be calculated using only the LHV, RHV, LSV, and RSV.
  • the GV can be calculated using the LSV and RSV (Refer to FIG. 9).
  • the magnitude of the GV is set to the magnitude of the downmixed audio signal.
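The sketch below illustrates the hierarchical combination just described (L and LS into the LHV, R and RS into the RHV, these plus C into the LSV and RSV, and finally the GV), ending with azimuth angles relative to the center channel. The constant-power-panning style combination rule and the ITU 5.1 loudspeaker angles (±30° and ±110°) are assumptions for illustration, not the patent's exact formulas.

```python
import numpy as np

def combine(p_a, p_b, aperture):
    """Combine the powers of two 'loudspeakers' A and B into one virtual source.
    Returns (power, angle), the angle measured from A toward B within the aperture."""
    power = p_a + p_b
    angle = aperture * (2.0 / np.pi) * np.arctan2(np.sqrt(p_b), np.sqrt(p_a))
    return power, angle

def estimate_vsli(c, l, r, ls, rs):
    """Per-band powers of C, L, R, LS, RS -> azimuth angles (LHa, RHa, LSa, RSa, Ga),
    all measured relative to the center channel (positive toward the left)."""
    lhv_p, lh_in = combine(l, ls, aperture=80.0)        # L at +30 deg .. LS at +110 deg
    rhv_p, rh_in = combine(r, rs, aperture=80.0)        # R at -30 deg .. RS at -110 deg
    lha, rha = 30.0 + lh_in, -(30.0 + rh_in)
    lsv_p, lsa = combine(c, lhv_p, aperture=abs(lha))   # C (0 deg) .. LHV (at LHa)
    rsv_p, rs_in = combine(c, rhv_p, aperture=abs(rha)) # C (0 deg) .. RHV (at RHa)
    rsa = -rs_in
    gv_p, g_in = combine(rsv_p, lsv_p, aperture=lsa - rsa)  # RSV .. LSV
    ga = rsa + g_in
    return lha, rha, lsa, rsa, ga
```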
  • the source location vectors extracted using the above method may be expressed using the azimuth angles between themselves and the center channel.
  • FIG. 10 illustrates the azimuth angles of the source location vectors extracted by the processes shown in FIGS. 6 to 9.
  • the VSLI may be expressed using azimuth angles, which include a Left Half-plane vector angle (LHa), a Right Half-plane vector angle (RHa), a Left Subsequent vector angle (LSa), and a Right Subsequent vector angle (RSa), and which further include a Global vector angle (Ga), for a total of five angles, in the case where the downmixed audio signal is monophonic. Since each value has a limited dynamic range, quantization can be performed using fewer bits than for the Inter-Channel Level Difference (ICLD).
  • a linear quantization method in which quantization is performed in uniform intervals or a nonlinear quantization method in which quantization is performed in non-uniform intervals may be used.
  • θi,max represents the maximal variance level of each angle.
  • θ1,max equals 180°, θ2,max and θ3,max equal 15°, and θ4,max and θ5,max equal 55°.
  • a maximal variance interval of each angle magnitude is limited, and therefore more effective and higher resolution quantization can be provided.
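A minimal sketch of the linear (uniform) quantization of one azimuth angle whose magnitude is bounded by θmax; the bit allocation per angle is an assumption.

```python
import numpy as np

def quantize_angle_uniform(theta, theta_max, bits=5):
    """Uniformly quantize an angle in [-theta_max, theta_max] to a signed index."""
    levels = 2 ** bits
    step = 2.0 * theta_max / levels
    return int(np.clip(np.round(theta / step), -(levels // 2), levels // 2 - 1))

def dequantize_angle_uniform(index, theta_max, bits=5):
    """Reconstruct the angle from its index."""
    step = 2.0 * theta_max / (2 ** bits)
    return index * step
```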
  • the Ga has an occurrence distribution that is roughly symmetrical about the center speaker.
  • this distribution has an expected (average) value of 0°. Accordingly, for the Ga, a more effective quantization level can be obtained when quantization is performed using the nonlinear quantization method.
  • the nonlinear quantization method is performed in a general μ-law scheme, and the μ value can be determined depending on the resolution of the quantization level. For example, when the resolution is low, a relatively large μ value (15 ≤ μ ≤ 255) may be used, and when the resolution is high, a smaller μ value (0 ≤ μ ≤ 5) may be used to perform the nonlinear quantization.
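A sketch of the μ-law style non-uniform quantization suggested for the Ga; the specific μ and bit width below are illustrative.

```python
import numpy as np

def mu_law_quantize(theta, theta_max, mu=255, bits=4):
    """Compress the angle with a mu-law curve, then quantize uniformly."""
    x = np.clip(theta / theta_max, -1.0, 1.0)
    y = np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)   # mu-law compression
    levels = 2 ** bits
    return int(np.clip(np.round((y + 1.0) / 2.0 * (levels - 1)), 0, levels - 1))

def mu_law_dequantize(index, theta_max, mu=255, bits=4):
    """Invert the uniform step, then expand the mu-law curve."""
    y = index / (2 ** bits - 1) * 2.0 - 1.0
    x = np.sign(y) * ((1.0 + mu) ** np.abs(y) - 1.0) / mu      # mu-law expansion
    return x * theta_max
```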
  • FIG. 11 is a block diagram illustrating an apparatus for decoding an encoded multi-channel audio signal according to an exemplary embodiment of the present invention.
  • the multi-channel audio signal decoding apparatus includes a signal distributor 1110, an AAC decoder 1120, a time-to-frequency converter 1130, an inverse quantizer 1140, a per-band channel gain distributor 1150, a multi-channel spectrum synthesizer 1160, and a frequency-to-time converter 1170.
  • the signal distributor 1110 separates the encoded multi-channel audio signal back into the AAC encoded signal and the VSLI encoded signal.
  • the AAC decoder 1120 converts the AAC encoded signal back into the downmixed audio signal (monophonic or stereophonic signal).
  • the converted downmixed audio signal can be used to produce monophonic or stereophonic sound.
  • the time-to-frequency converter 1130 converts the downmixed audio signal into a frequency axis signal and transmits it to the multi-channel spectrum synthesizer 1160.
  • the inverse quantizer 1140 receives the separated VSLI encoded signal from the signal distributor 1110 and produces per-band source location vector information from the received VSLI encoded signal.
  • the VSLI includes azimuth angle information (for example, LHa, RHa, LSa, RSa, and Ga in the case where the downmixed audio signal is monophonic), each of which represents the corresponding per-band source location vector.
  • the source location vector is produced from the VSLI.
  • the per-band channel gain distributor 1150 calculates the gain per channel using the per-band VSLI signal converted by the inverse quantizer 1140, and transmits the calculated gain to the multi-channel spectrum synthesizer 1160.
  • the multi-channel spectrum synthesizer 1160 receives a spectrum of the downmixed audio signal from the time-to-frequency converter 1130, separates the received spectrum into per-band spectrums using the ERB filter bank, and restores the spectrum of the multi-channel signal using per-band channel gains output from the per-band channel gain distributor 1150.
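A minimal sketch of this per-band synthesis step: each ERB band of the decoded downmix spectrum is scaled by that band's per-channel gains. The array shapes are assumptions for illustration.

```python
import numpy as np

def synthesize_multichannel(downmix_band_spectra, band_gains):
    """downmix_band_spectra: list of complex arrays (bins,), one per ERB band (mono downmix).
    band_gains: list of arrays (5,), one gain per output channel for that band.
    Returns a list of per-band arrays of shape (5, bins)."""
    return [g[:, None] * s[None, :]
            for s, g in zip(downmix_band_spectra, band_gains)]
```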
  • the frequency-to-time converter 1170 (for example, an IFFT) converts the spectrum of the restored multi-channel signal into a time axis signal to generate the multi-channel audio signal.
  • FIG. 12 is a block diagram illustrating a process of calculating the per-channel gain of the downmixed audio signal using the VSLI according to an exemplary embodiment of the present invention.
  • the case in which the downmixed audio signal is monophonic is illustrated.
  • when the downmixed audio signal is stereophonic, block 1210 is omitted.
  • the magnitudes of the LSV and the RSV are calculated using the magnitude of the downmixed monophonic signal, which is the magnitude of the GV, and the angle (Ga) of the GV (block 1210).
  • the magnitude of the LHV and the first gain of the center channel (C) are calculated using the magnitude and angle (LSa) of the LSV (block 1220), and the magnitude of the RHV and the second gain of the center channel (C) are calculated using the magnitude and angle (RSa) of the RSV (block 1230).
  • the gain of the center channel (C) is obtained by summing the first gain and the second gain calculated in the above processes (block 1240).
  • gains of the front left channel (L) and the left subsequent channel (LS) are calculated using the magnitude of the LHV and the corresponding angle (LHa) (block 1250), and gains of the front right channel (R) and the right subsequent channel (RS) are calculated using the magnitude of the RHV and the corresponding angle (RHa) (block 1260). According to the above processes, the gains of all channels can be calculated.
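The sketch below mirrors the block order of FIG. 12 for a monophonic downmix: the GV magnitude and Ga yield the LSV and RSV magnitudes, which in turn yield the center-channel gains and the LHV/RHV magnitudes, from which the remaining channel gains follow. The split rule is the constant-power-panning style inverse of the combination assumed in the encoder sketch above, and the speaker angles are again illustrative assumptions.

```python
import numpy as np

def split(mag, angle_in, aperture):
    """Split one vector magnitude into gains toward endpoints A and B of an aperture."""
    t = (angle_in / aperture) * (np.pi / 2.0)
    return mag * np.cos(t), mag * np.sin(t)   # gain toward A, gain toward B

def channel_gains(downmix_mag, lha, rha, lsa, rsa, ga):
    """Recover per-channel gains (C, L, R, LS, RS) for one band from the VSLI angles."""
    # block 1210: GV magnitude (= downmix magnitude) and Ga -> LSV and RSV magnitudes
    rsv_mag, lsv_mag = split(downmix_mag, ga - rsa, lsa - rsa)
    # block 1220: LSV magnitude and LSa -> first center gain and LHV magnitude
    c1, lhv_mag = split(lsv_mag, lsa, abs(lha))
    # block 1230 (assumed numbering): RSV magnitude and RSa -> second center gain, RHV magnitude
    c2, rhv_mag = split(rsv_mag, abs(rsa), abs(rha))
    g_c = c1 + c2                                        # block 1240
    g_l, g_ls = split(lhv_mag, lha - 30.0, 80.0)         # block 1250
    g_r, g_rs = split(rhv_mag, abs(rha) - 30.0, 80.0)    # block 1260
    return g_c, g_l, g_r, g_ls, g_rs
```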
  • a multi-channel audio signal can be more effectively encoded/decoded using virtual source location information, and more realistic audio signal reproduction in a multi-channel environment can be realized.

Abstract

The invention relates to a method and apparatus for encoding/decoding a multi-channel audio signal. The apparatus for encoding a multi-channel audio signal comprises a frame converter for converting the multi-channel audio signal into a framed audio signal, a unit for downmixing the framed audio signal, a unit for encoding the downmixed audio signal, a source location information estimator for estimating source location information from the framed multi-channel audio signal, a unit for quantizing the estimated source location information, and a unit for multiplexing the encoded audio signal and the quantized source location information to generate an encoded multi-channel audio signal.

Claims (27)

  1. An apparatus for encoding a multi-channel audio signal, the apparatus comprising:
    a frame converter (100) for converting the multi-channel audio signal into a framed audio signal;
    means for downmixing (110) the framed audio signal;
    means for encoding (120) the downmixed audio signal;
    a source location information estimator (150) for estimating virtual source location information (VSLI) from the framed audio signal;
    means for quantizing (140) the estimated VSLI; and
    means for multiplexing (130) the encoded audio signal and the quantized VSLI to generate an encoded multi-channel audio signal;
    wherein:
    the VSLI comprises a left half-plane vector angle (LHa), a right half-plane vector angle (RHa), a left subsequent vector angle (LSa), and a right subsequent vector angle (RSa).
  2. The apparatus of claim 1, wherein said downmixing means (110) is adapted to downmix the framed audio signal into any one of a monophonic signal and a stereophonic signal.
  3. The apparatus of claim 2, wherein, when the downmixed audio signal is the monophonic signal, the source location information estimator (150) estimates a Left Half-plane Vector (LHV), a Right Half-plane Vector (RHV), a Left Subsequent Vector (LSV), a Right Subsequent Vector (RSV), and a Global Vector (GV).
  4. The apparatus of claim 2, wherein, when the downmixed audio signal is the stereophonic signal, the source location information estimator (150) estimates a Left Half-plane Vector (LHV), a Right Half-plane Vector (RHV), a Left Subsequent Vector (LSV), and a Right Subsequent Vector (RSV).
  5. The apparatus of claim 1, wherein said source location information estimator (150) comprises:
    a time-to-frequency converter (151) for converting the framed audio signal into a spectrum;
    a divider (152) for dividing the spectrum into per-band spectrums;
    an energy vector detector (153) for detecting per-channel energy vectors from the corresponding per-band spectrum; and
    a VSLI estimator (154) for estimating the VSLI using the per-channel energy vectors detected by the energy vector detector (153).
  6. The apparatus of claim 5, wherein said time-to-frequency converter (151) is adapted to convert the framed audio signal into the spectrum using a plurality of Fast Fourier Transforms (FFTs).
  7. The apparatus of claim 5, wherein the divider (152) is adapted to divide the spectrum using an Equivalent Rectangular Bandwidth (ERB) filter bank.
  8. The apparatus of claim 5, wherein the detected per-channel energy vectors comprise a center channel energy vector (C), a front left channel energy vector (L), a left subsequent channel energy vector (LS), a front right channel energy vector (R), and a right subsequent channel energy vector (RS).
  9. The apparatus of claim 1, wherein, when the downmixed audio signal is the monophonic signal, the VSLI further comprises a global vector angle (Ga).
  10. An apparatus for decoding a multi-channel audio signal, the apparatus comprising:
    means for receiving the multi-channel audio signal;
    a signal distributor (1110) for separating the received multi-channel audio signal into an encoded downmixed audio signal and a quantized virtual source location information (VSLI) signal;
    means for decoding (1120) the encoded downmixed audio signal;
    means for converting (1130) the decoded downmixed audio signal into a frequency axis signal;
    a VSLI extractor (1140) for extracting per-band VSLI from the quantized VSLI signal;
    a channel gain calculator (1150) for calculating per-band channel gains using the extracted per-band VSLI;
    means for synthesizing (1160) a multi-channel audio signal spectrum using the converted frequency axis signal and the calculated per-band channel gains; and
    means for generating (1170) a multi-channel audio signal from the synthesized multi-channel spectrum,
    wherein the per-band VSLI comprises a left half-plane vector angle (LHa), a right half-plane vector angle (RHa), a left subsequent vector angle (LSa), and a right subsequent vector angle (RSa) for each band.
  11. The apparatus of claim 10, wherein the VSLI extractor (1140) is adapted to produce a Left Half-plane Vector (LHV), a Right Half-plane Vector (RHV), a Left Subsequent Vector (LSV), and a Right Subsequent Vector (RSV).
  12. The apparatus of claim 10, wherein, when the encoded downmixed audio signal is monophonic, the VSLI further comprises a global vector angle (Ga), and a Global Vector (GV) is produced from the Ga.
  13. A method of encoding a multi-channel audio signal, comprising the steps of:
    converting the multi-channel audio signal into a framed audio signal;
    downmixing the framed audio signal;
    encoding the downmixed audio signal;
    estimating VSLI from the framed audio signal;
    quantizing the estimated VSLI; and
    multiplexing the encoded downmixed audio signal and the quantized VSLI to generate an encoded multi-channel audio signal;
    the VSLI comprising a left half-plane vector angle (LHa), a right half-plane vector angle (RHa), a left subsequent vector angle (LSa), and a right subsequent vector angle (RSa).
  14. The method of claim 13, wherein the framed audio signal is downmixed into any one of a monophonic signal and a stereophonic signal.
  15. The method of claim 14, wherein, when the downmixed audio signal is the monophonic signal, the VSLI further comprises a global vector angle (Ga), and a Left Half-plane Vector (LHV), a Right Half-plane Vector (RHV), a Left Subsequent Vector (LSV), a Right Subsequent Vector (RSV), and a Global Vector (GV) are produced from the VSLI.
  16. The method of claim 14, wherein, when the downmixed audio signal is the stereophonic signal, a Left Half-plane Vector (LHV), a Right Half-plane Vector (RHV), a Left Subsequent Vector (LSV), and a Right Subsequent Vector (RSV) are produced from the VSLI.
  17. The method of claim 13, wherein the step of estimating the VSLI comprises the steps of:
    converting the framed audio signal into a spectrum;
    dividing the spectrum into per-band spectrums;
    detecting per-channel energy vectors from the per-band spectrums; and
    estimating the VSLI using the detected per-channel energy vectors.
  18. The method of claim 17, wherein the detected per-channel energy vectors comprise a center channel energy vector (C), a front left channel energy vector (L), a left subsequent channel energy vector (LS), a front right channel energy vector (R), and a right subsequent channel energy vector (RS).
  19. The method of claim 17, wherein the step of estimating the VSLI comprises the steps of:
    estimating an LHV using the front left channel energy vector (L) and the left subsequent channel energy vector (LS);
    estimating an RHV using the front right channel energy vector (R) and the right subsequent channel energy vector (RS);
    estimating an LSV using the estimated LHV and the center channel energy vector (C); and
    estimating an RSV using the estimated RHV and the center channel energy vector (C).
  20. The method of claim 19, wherein, when the downmixed audio signal is the monophonic signal, the estimated VSLI further comprises a GV, and estimating the VSLI further comprises the step of estimating the GV using the estimated LSV and RSV.
  21. The method of claim 17, wherein, when the downmixed audio signal is the monophonic signal, the azimuth angle information further comprises a Ga.
  22. A method of decoding a multi-channel audio signal, comprising the steps of:
    receiving the multi-channel audio signal;
    separating the received multi-channel audio signal into an encoded downmixed audio signal and a quantized VSLI signal;
    decoding the encoded downmixed audio signal;
    converting the decoded downmixed audio signal into a frequency axis signal;
    analyzing the quantized VSLI signal and extracting per-band VSLI therefrom;
    calculating per-band channel gains from the extracted per-band VSLI;
    synthesizing a multi-channel audio signal spectrum using the converted frequency axis signal and the calculated per-band channel gains; and
    producing a multi-channel audio signal from the synthesized multi-channel spectrum,
    wherein the per-band VSLI comprises a left half-plane vector angle (LHa), a right half-plane vector angle (RHa), a left subsequent vector angle (LSa), and a right subsequent vector angle (RSa) for each band.
  23. The method of claim 22, wherein a Left Half-plane Vector (LHV), a Right Half-plane Vector (RHV), a Left Subsequent Vector (LSV), and a Right Subsequent Vector (RSV) are produced from the VSLI.
  24. The method of claim 22, wherein, when the encoded downmixed audio signal is monophonic, the VSLI further comprises a global vector angle (Ga), and a Global Vector (GV) is produced from the Ga.
  25. The method of claim 23, wherein said step of calculating the channel gains comprises, for each band, the steps of:
    calculating magnitudes of the LSV and the RSV using a magnitude of the downmixed audio signal;
    calculating a first gain of a center channel (C) and a magnitude of the LHV using the magnitude of the LSV and the angle LSa;
    calculating a second gain of the center channel (C) and a magnitude of the RHV using the magnitude of the RSV and the angle RSa;
    summing the first and second gains of the center channel (C) to produce a gain of the center channel (C);
    calculating gains of a front left channel (L) and a left subsequent channel (LS) using the magnitude of the LHV and the angle LHa; and
    calculating gains of a front right channel (R) and a right subsequent channel (RS) using the magnitude of the RHV and the angle RHa.
  26. A computer-readable recording medium storing a computer program for performing the method of encoding a multi-channel audio signal according to any one of claims 13 to 21.
  27. A computer-readable recording medium storing a computer program for performing the method of decoding a multi-channel audio signal according to any one of claims 22 to 25.
EP05774399A 2004-07-09 2005-07-08 Method and apparatus for encoding and decoding a multi-channel audio signal using virtual source location information Not-in-force EP1779385B1 (fr)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
KR20040053665 2004-07-09
KR20040081303 2004-10-12
KR1020050061425A KR100663729B1 (ko) 2004-07-09 2005-07-07 Method and apparatus for encoding and decoding a multi-channel audio signal using virtual sound source location information
PCT/KR2005/002213 WO2006006809A1 (fr) 2004-07-09 2005-07-08 Method and apparatus for encoding and decoding a multi-channel audio signal using virtual source location information

Publications (3)

Publication Number Publication Date
EP1779385A1 EP1779385A1 (fr) 2007-05-02
EP1779385A4 EP1779385A4 (fr) 2007-07-25
EP1779385B1 true EP1779385B1 (fr) 2010-09-22

Family

ID=35784122

Family Applications (1)

Application Number Title Priority Date Filing Date
EP05774399A Not-in-force EP1779385B1 (fr) 2004-07-09 2005-07-08 Procede et dispositif destines a coder et decoder un signal audio multicanal au moyen d'informations d'emplacement de source virtuelle

Country Status (2)

Country Link
EP (1) EP1779385B1 (fr)
WO (1) WO2006006809A1 (fr)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8180067B2 (en) 2006-04-28 2012-05-15 Harman International Industries, Incorporated System for selectively extracting components of an audio input signal
US7876904B2 (en) 2006-07-08 2011-01-25 Nokia Corporation Dynamic decoding of binaural audio signals
JP5270566B2 (ja) 2006-12-07 2013-08-21 エルジー エレクトロニクス インコーポレイティド オーディオ処理方法及び装置
WO2011044064A1 (fr) * 2009-10-05 2011-04-14 Harman International Industries, Incorporated Système pour l'extraction spatiale de signaux audio
EP2541547A1 (fr) 2011-06-30 2013-01-02 Thomson Licensing Procédé et appareil pour modifier les positions relatives d'objets de son contenu dans une représentation ambisonique d'ordre supérieur
EP3547312A1 (fr) 2012-05-18 2019-10-02 Dolby Laboratories Licensing Corp. Système et procédé de commande de plage dynamique d'un signal audio
US10844689B1 (en) 2019-12-19 2020-11-24 Saudi Arabian Oil Company Downhole ultrasonic actuator system for mitigating lost circulation
KR102340151B1 (ko) * 2014-01-07 2021-12-17 하만인터내셔날인더스트리스인코포레이티드 신호 품질-기반 압축 오디오 신호 향상 및 보상
PL3338462T3 (pl) 2016-03-15 2020-03-31 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Urządzenie, sposób lub program komputerowy do generowania opisu pola dźwięku
EP3297298B1 (fr) * 2016-09-19 2020-05-06 A-Volute Procédé de reproduction de sons répartis dans l'espace

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6128597A (en) * 1996-05-03 2000-10-03 Lsi Logic Corporation Audio decoder with a reconfigurable downmixing/windowing pipeline and method therefor
US5946352A (en) * 1997-05-02 1999-08-31 Texas Instruments Incorporated Method and apparatus for downmixing decoded data streams in the frequency domain prior to conversion to the time domain
US6016473A (en) * 1998-04-07 2000-01-18 Dolby; Ray M. Low bit-rate spatial coding method and system
US7292901B2 (en) * 2002-06-24 2007-11-06 Agere Systems Inc. Hybrid multi-channel/cue coding/decoding of audio signals
US20030014243A1 (en) * 2001-07-09 2003-01-16 Lapicque Olivier D. System and method for virtual localization of audio signals
ES2300567T3 (es) * 2002-04-22 2008-06-16 Koninklijke Philips Electronics N.V. Representacion parametrica de audio espacial.
SE0400997D0 (sv) * 2004-04-16 2004-04-16 Cooding Technologies Sweden Ab Efficient coding of multi-channel audio

Also Published As

Publication number Publication date
EP1779385A1 (fr) 2007-05-02
EP1779385A4 (fr) 2007-07-25
WO2006006809A1 (fr) 2006-01-19

Similar Documents

Publication Publication Date Title
US7783495B2 (en) Method and apparatus for encoding and decoding multi-channel audio signal using virtual source location information
EP1779385B1 (fr) Procede et dispositif destines a coder et decoder un signal audio multicanal au moyen d'informations d'emplacement de source virtuelle
EP2612322B1 (fr) Procédé et appareil de décodage d'un signal audio multicanal
KR101117336B1 (ko) 오디오 신호 부호화 장치 및 오디오 신호 복호화 장치
EP1881486B1 (fr) Dispositif de décodage avec unité de décorrelation
US8332229B2 (en) Low complexity MPEG encoding for surround sound recordings
TWI404429B (zh) 用於將多頻道音訊信號編碼/解碼之方法與裝置
EP1393303B1 (fr) Suppression de la redondance de signaux intercanaux dans le codage audio perceptuel
US20070269063A1 (en) Spatial audio coding based on universal spatial cues
US8706508B2 (en) Audio decoding apparatus and audio decoding method performing weighted addition on signals
CN112567765B (zh) 空间音频捕获、传输和再现
EP4082010A1 (fr) Combinaison de paramètres audio spatiaux
JP6686015B2 (ja) オーディオ信号のパラメトリック混合
Cheng et al. A spatial squeezing approach to ambisonic audio compression
US11096002B2 (en) Energy-ratio signalling and synthesis
JP2023500631A (ja) 方向メタデータを使用するマルチチャネルオーディオ符号化及び復号化
EP4315324A1 (fr) Combinaison de flux audio spatiaux
WO2006011367A1 (fr) Codeur et décodeur de signal audio
CA3208666A1 (fr) Transformation de parametres audio spatiaux
CN116547749A (zh) 音频参数的量化
EP3948861A1 (fr) Détermination de l'importance des paramètres audio spatiaux et codage associé

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20070205

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC NL PL PT RO SE SI SK TR

A4 Supplementary search report drawn up and despatched

Effective date: 20070621

RIC1 Information provided on ipc code assigned before grant

Ipc: G11B 20/10 20060101AFI20060216BHEP

Ipc: G10L 19/00 20060101ALI20070615BHEP

Ipc: H04S 3/00 20060101ALI20070615BHEP

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: SEOUL NATIONAL UNIVERSITY INDUSTRY FOUNDATION

Owner name: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTIT

DAX Request for extension of the european patent (deleted)
17Q First examination report despatched

Effective date: 20080125

GRAP Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOSNIGR1

GRAS Grant fee paid

Free format text: ORIGINAL CODE: EPIDOSNIGR3

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC NL PL PT RO SE SI SK TR

REG Reference to a national code

Ref country code: GB

Ref legal event code: FG4D

REG Reference to a national code

Ref country code: CH

Ref legal event code: EP

REG Reference to a national code

Ref country code: IE

Ref legal event code: FG4D

REF Corresponds to:

Ref document number: 602005023738

Country of ref document: DE

Date of ref document: 20101104

Kind code of ref document: P

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: LT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100922

Ref country code: FI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100922

Ref country code: AT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100922

REG Reference to a national code

Ref country code: NL

Ref legal event code: VDEP

Effective date: 20100922

LTIE Lt: invalidation of european patent or patent extension

Effective date: 20100922

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: SI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100922

Ref country code: PL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100922

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: LV

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100922

Ref country code: SE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100922

Ref country code: GR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20101223

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: CZ

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100922

Ref country code: RO

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100922

Ref country code: IT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100922

Ref country code: SK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100922

Ref country code: PT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20110124

Ref country code: EE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100922

Ref country code: IS

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20110122

Ref country code: NL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100922

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: BE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100922

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: ES

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20110102

PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

26N No opposition filed

Effective date: 20110623

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: DK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100922

REG Reference to a national code

Ref country code: DE

Ref legal event code: R097

Ref document number: 602005023738

Country of ref document: DE

Effective date: 20110623

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: MC

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20110731

REG Reference to a national code

Ref country code: CH

Ref legal event code: PL

GBPC Gb: european patent ceased through non-payment of renewal fee

Effective date: 20110708

REG Reference to a national code

Ref country code: FR

Ref legal event code: ST

Effective date: 20120330

REG Reference to a national code

Ref country code: IE

Ref legal event code: MM4A

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: FR

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20110801

Ref country code: CH

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20110731

Ref country code: LI

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20110731

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: GB

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20110708

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: IE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20110708

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: LU

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20110708

Ref country code: CY

Free format text: LAPSE BECAUSE OF EXPIRATION OF PROTECTION

Effective date: 20100922

REG Reference to a national code

Ref country code: DE

Ref legal event code: R082

Ref document number: 602005023738

Country of ref document: DE

Representative's name: PATENTANWAELTE BETTEN & RESCH, DE

REG Reference to a national code

Ref country code: DE

Ref legal event code: R088

Ref document number: 602005023738

Country of ref document: DE

REG Reference to a national code

Ref country code: DE

Ref legal event code: R082

Ref document number: 602005023738

Country of ref document: DE

Representative's name: BETTEN & RESCH PATENT- UND RECHTSANWAELTE PART, DE

Effective date: 20130717

Ref country code: DE

Ref legal event code: R081

Ref document number: 602005023738

Country of ref document: DE

Owner name: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH IN, KR

Free format text: FORMER OWNERS: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE, DAEJEON, KR; SEOUL NATIONAL UNIVERSITY INDUSTRY FOUNDATION, SEOUL/SOUL, KR

Effective date: 20130717

Ref country code: DE

Ref legal event code: R081

Ref document number: 602005023738

Country of ref document: DE

Owner name: SNU R&DB FOUNDATION, KR

Free format text: FORMER OWNERS: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE, DAEJEON, KR; SEOUL NATIONAL UNIVERSITY INDUSTRY FOUNDATION, SEOUL/SOUL, KR

Effective date: 20130717

Ref country code: DE

Ref legal event code: R081

Ref document number: 602005023738

Country of ref document: DE

Owner name: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH IN, KR

Free format text: FORMER OWNER: ELECTRONICS AND TELECOMMUNICATI, SEOUL NATIONAL UNIVERSITY INDUS, , KR

Effective date: 20130717

Ref country code: DE

Ref legal event code: R082

Ref document number: 602005023738

Country of ref document: DE

Representative's name: PATENTANWAELTE BETTEN & RESCH, DE

Effective date: 20130717

Ref country code: DE

Ref legal event code: R081

Ref document number: 602005023738

Country of ref document: DE

Owner name: SNU R&DB FOUNDATION, KR

Free format text: FORMER OWNER: ELECTRONICS AND TELECOMMUNICATI, SEOUL NATIONAL UNIVERSITY INDUS, , KR

Effective date: 20130717

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: TR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100922

Ref country code: BG

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20101222

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: HU

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100922

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: DE

Payment date: 20160614

Year of fee payment: 12

REG Reference to a national code

Ref country code: DE

Ref legal event code: R119

Ref document number: 602005023738

Country of ref document: DE

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: DE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20180201