WO2007078254A2 - Personalized decoding of multi-channel surround sound - Google Patents
- Publication number
- WO2007078254A2 (PCT/SE2007/000006)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- parameters
- spatial
- spatial parameters
- modifying
- bitstream
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R2420/00—Details of connection covered by H04R, not provided for in its groups
- H04R2420/07—Applications of wireless loudspeakers or wireless microphones
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2420/00—Techniques used in stereophonic systems covered by H04S but not provided for in its groups
- H04S2420/03—Application of parametric coding in stereophonic audio systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S3/00—Systems employing more than two channels, e.g. quadraphonic
- H04S3/002—Non-adaptive circuits, e.g. manually adjustable or static, for enhancing the sound image or the spatial distribution
- H04S3/004—For headphones
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S3/00—Systems employing more than two channels, e.g. quadraphonic
- H04S3/008—Systems employing more than two channels, e.g. quadraphonic in which the audio signals are in digital form, i.e. employing more than two discrete digital channels
Definitions
- the present invention is related to decoding a multi-channel surround audio bitstream.
- the next field where this audio technology will be used includes mobile wireless units or terminals, in particular small units such as cellular telephones and PDAs.
- the immersive nature of the surround sound is even more important because of the small sizes of the displays.
- the available bit-rate is in many cases low in wireless mobile channels.
- the processing power of mobile terminals is often limited.
- Small mobile terminals generally have only two micro speakers and earplugs or headphones.
- a surround sound solution for a mobile terminal has to use a much lower bit rate than the 384 kbit/s used in the Dolby Digital 5.1 system. Due to the limited processing power, the decoders of mobile terminals must be computationally optimized, and due to the speaker configuration of the mobile terminal, the surround sound must be delivered through earplugs or headphones.
- a standard way of delivering multi-channel surround sound through headphones or earplugs is to perform a 3D audio or binaural rendering of each of the speaker signals.
- each incoming monophonic signal is filtered through a set of filters that model the transformations created by the human head, torso and ears.
- These filters are called head related filters (HRFs) having head related transfer functions (HRTFs) and if appropriately designed, they give a good 3D audio scene perception.
- the diagram of Fig. 1 illustrates a method of complete 3D audio rendering of an audio signal according to the Dolby Digital 5.1 system.
- the six multi-channel signals according to the Dolby Digital 5.1 system are: surround right (SR), right (R), center (C), low frequency effects (LFE), left (L) and surround left (SL).
- the center and low frequency signals are combined into one signal.
- five different filters, here denoted H_F^i, H_F^c, H_C, H_S^i and H_S^c (front ipsilateral, front contralateral, center, surround ipsilateral and surround contralateral, assuming left/right symmetry), are needed in order to implement this method of head related filtering.
- the SR signal is input to filters H_S^i and H_S^c.
- the R signal is input to filters H_F^i and H_F^c.
- the C and LFE signals are jointly input to filter H_C.
- the L signal is input to filters H_F^i and H_F^c.
- the SL signal is input to filters H_S^i and H_S^c.
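The filtering scheme above can be sketched as follows. This is an illustrative reconstruction, not code from the patent: the five filters are toy two-tap FIRs, and the ipsilateral/contralateral naming is an assumption made for this sketch.

```python
import numpy as np

def binaural_render_51(sr, r, c_lfe, l, sl, h_fi, h_fc, h_si, h_sc, h_c):
    """Binaural rendering of 5.1 content with five FIR head-related filters,
    assuming left/right symmetry: h_fi/h_fc are the front ipsilateral and
    contralateral filters, h_si/h_sc the surround ones, h_c the center one.
    The naming is an illustrative assumption, not taken from the patent."""
    conv = np.convolve
    left = (conv(l, h_fi) + conv(r, h_fc) + conv(sl, h_si)
            + conv(sr, h_sc) + conv(c_lfe, h_c))
    right = (conv(r, h_fi) + conv(l, h_fc) + conv(sr, h_si)
             + conv(sl, h_sc) + conv(c_lfe, h_c))
    return left, right

# Toy two-tap placeholder filters; real HRFs would be measured or modeled.
rng = np.random.default_rng(0)
sig = [rng.standard_normal(64) for _ in range(5)]       # SR, R, C+LFE, L, SL
taps = [np.array([1.0, 0.5]), np.array([0.3, 0.2]),
        np.array([0.8, 0.4]), np.array([0.2, 0.1]), np.array([0.6, 0.0])]
left_ear, right_ear = binaural_render_51(*sig, *taps)
```

By symmetry, swapping the left and right input channels should simply swap the two ear signals, which makes a convenient sanity check for any chosen filter set.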
- the quality in terms of 3D perception of such rendering depends on how closely the HRFs model or represent the listener's own head related filtering when she/he is listening. Hence, it may be advantageous if the HRFs can be adapted and personalized for each listener if a good or very good quality is desired.
- This adaptation and personalization step may include modeling, measurement and, in general, a user-dependent tuning in order to refine the quality of the perceived 3D audio scene.
- the encoder 3 then forms, in down-mixing unit 5, a composite down-mixed signal comprising the individual down-mixed signals z_1(n) to z_M(n).
- the number M of down-mixed channels (M < N) depends on the required or allowable maximum bit-rate, the required quality and the availability of an M-channel audio encoder 7.
- One key aspect of the encoding process is that the down-mixed composite signal, typically a stereo signal but possibly also a mono signal, is derived from the multi-channel input signal, and it is this down-mixed composite signal that is compressed in the audio encoder 7 for transmission over the wireless channel 9, rather than the original multi-channel signal.
- the parametric encoder 3, and in particular its down-mixing unit 5, may be capable of performing a down-mixing process such that it creates a more or less true equivalent of the multi-channel signal in the mono or stereo down-mix.
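The text leaves the exact down-mixing rule open. One common static choice, shown here purely as an illustrative assumption, is an ITU-R BS.775-style stereo downmix:

```python
import numpy as np

def downmix_51_to_stereo(l, r, c, lfe, sl, sr, g=0.7071):
    """Static ITU-R BS.775-style stereo downmix of a 5.1 signal. This is one
    common (illustrative) choice; the actual encoder may use any rule, e.g.
    an Artistic Downmix. LFE handling varies between systems and is simply
    omitted here."""
    z_left = l + g * c + g * sl
    z_right = r + g * c + g * sr
    return z_left, z_right
```

A centered source (fed only to C) ends up identical in both downmix channels, which is the intended behaviour of such a static matrix.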
- the parametric surround encoder also comprises a spatial parameter estimation unit 9 that, from the input signals x_1(n) to x_N(n), computes the cues or spatial parameters that in some way describe the down-mixing process or the assumptions made therein.
- the compressed audio signal output from the M-channel audio encoder, which constitutes the main signal, is transmitted together with the spatial parameters, which constitute side information, over an interface 11, such as a wireless interface, to the receiving side, which in the case considered here is typically a mobile terminal.
- the down-mixing could be supplied by some external unit, such as from a unit employing Artistic Downmix.
- a complementary parametric surround decoder 13 includes an audio decoder 15 and should be constructed to be capable of creating the best possible multi-channel decoding based on knowledge of the down-mixing algorithm used on the transmitting side and the encoded spatial parameters or cues that are received in parallel to the compressed multichannel signal.
- the audio decoder 15 produces signals ẑ_1(n) to ẑ_M(n) that should be as similar as possible to the signals z_1(n) to z_M(n) on the transmitting side. These are, together with the spatial parameters, input to a spatial synthesis unit 17 that produces output signals x̂_1(n) to x̂_N(n) that should be as similar as possible to the original input signals x_1(n) to x_N(n) on the transmitting side.
- the output signals x̂_1(n) to x̂_N(n) can be input to a binaural rendering system such as that shown in Fig. 1.
- the encoding process can use any of a number of high-performance compression algorithms such as AMR-WB+, MPEG-1 Layer III, MPEG-4 AAC or MPEG-4 High Efficiency AAC, and it could even use PCM.
- the above operations are done in a transformed signal domain, such as the Fourier transform or MDCT domain. This is especially beneficial if the spatial parameter estimation and synthesis in the units 9 and 17 use the same type of transform as that used in the audio encoder 7, also called the core codec.
- Fig. 3 is a detailed block diagram of an efficient parametric audio encoder.
- the N-channel discrete time input signal, denoted in vector form as x_N(n), is first transformed to the frequency domain, or in general to a transform domain, in a transform unit 21, giving a signal x_N(k,m).
- the index k is the index of the transform coefficients, or sub-bands if a frequency domain transform is chosen.
- the index m represents the decimated time domain index that is also related to the input signal possibly through overlapped frames.
- the signal is thereafter down-mixed in a down-mixing unit 5 to generate the M-channel downmix signal z_M(k,m), where M < N.
- a sequence of spatial model parameter vectors p_N(k,m) is estimated in an estimation unit 9. This can be done in either an open-loop or a closed-loop fashion.
- Spatial parameters consist of psycho-acoustical cues that are representative of the surround sound sensation. For instance, in the MPEG Surround encoder these parameters consist of inter-channel differences in level, phase and coherence, equivalent to the ILD, ITD and IC cues, to capture the spatial image of a multi-channel audio signal relative to a transmitted down-mixed signal z_M(k,m) (or, in closed loop, the decoded signal ẑ_M(k,m)).
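Two of the cues named above, level difference and coherence, could be estimated per band roughly as sketched below. This is an illustrative analogue only; MPEG Surround defines its CLD/ICC parameters over its own parameter bands and channel pairs.

```python
import numpy as np

def level_and_coherence(x1, x2, eps=1e-12):
    """Per-band inter-channel level difference (dB) and coherence for two
    complex subband signals of shape (bands, time_slots). Simplified
    analogue of the ILD/IC cues; not the normative MPEG Surround math."""
    p1 = np.sum(np.abs(x1) ** 2, axis=1)          # per-band power, channel 1
    p2 = np.sum(np.abs(x2) ** 2, axis=1)          # per-band power, channel 2
    ild_db = 10.0 * np.log10((p1 + eps) / (p2 + eps))
    coh = np.abs(np.sum(x1 * np.conj(x2), axis=1)) / np.sqrt((p1 + eps) * (p2 + eps))
    return ild_db, coh
```

Identical channels give 0 dB level difference and coherence 1; doubling one channel's amplitude shifts the level difference by about 6 dB, as expected.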
- the cues p_N(k,m) can be encoded in a very compact form, such as in a spatial parameter quantization unit 23 producing the signal p̂_N(k,m), followed by a spatial parameter encoder 25.
- the M-channel audio encoder 7 produces the main bitstream, which in a multiplexer 27 is multiplexed with the spatial side information produced by the parameter encoder. From the multiplexer, the multiplexed signal is transmitted to a demultiplexer 29 on the receiving side, in which the side information and the main bitstream are recovered, as seen in the block diagram of Fig. 4. On the receiving side, the main bitstream is decoded to synthesize a high quality multi-channel representation using the received spatial parameters. The main bitstream is first decoded in an M-channel audio decoder 31, from which the decoded signals ẑ_M(k,m) are input to the spatial synthesis unit 17.
- the spatial side information holding the spatial parameters is extracted by the demultiplexer 29 and provided to a spatial parameter decoder 33 that produces the decoded parameters p̂_N(k,m) and transmits them to the synthesis unit 17.
- the spatial synthesis unit produces the signal x̂_N(k,m), which is provided to the frequency-to-time (F/T) transform unit 35 that transforms it into the time domain to produce the signal x̂_N(n), i.e. the multi-channel decoded signal.
- a 3D audio rendering of a multi-channel surround sound can be delivered to a mobile terminal user by using an efficient parametric surround decoder to first obtain the multiple surround sound channels, using for instance the multi-channel decoder described above with reference to Fig. 4. Thereupon, the system illustrated in Fig. 1 can be used to perform the binaural rendering; this prior-art combination is shown in Fig. 5.
- the applications of such 3D audio rendering are multiple and include gaming, mobile TV shows using standards such as 3GPP MBMS or DVB-H, listening to music concerts, watching movies and, in general, multimedia services that contain a multi-channel audio component.
- the second disadvantage is the temporary memory needed to store the intermediate decoded channels, which must in fact be buffered since they are needed in the second stage of 3D rendering.
- post-processing steps that usually are part of speech and audio codecs may affect the quality of such 3D audio rendering.
- These post-processing steps are beneficial for listening in a loudspeaker environment. However, they may introduce severe nonlinear phase distortion that is unequally distributed over the multiple channels and may impact the 3D audio rendering quality.
- the spatial parameters received by a parametric multi-channel decoder may be transformed into a new set of spatial parameters that are used in order to obtain a different decoding of multi-channel surround sound.
- the transformed parameters may also be personalized spatial parameters and can then be obtained by combining both the received spatial parameters and a representation of user head related filters.
- the personalized spatial parameters may also be obtained by combining the received spatial parameters and a representation of the user head related filters and a set of additional rendering parameters determined by the user.
- a subset of the set of additional rendering parameters may be interactive parameters that are set in response to user choices that may be changed during the listening process.
- the set of additional rendering parameters may be time dependent parameters.
- the method as described herein may allow a simple and efficient way to render, on mobile devices, surround sound that is encoded by parametric encoders.
- the major advantage is reduced complexity and increased interactivity when listening through headphones using a mobile device. Additional objects and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objects and advantages of the invention may be realized and obtained by means of the methods, processes, instrumentalities and combinations particularly pointed out in the appended claims.
- Fig. 1 is a block diagram illustrating a possible 3D audio or binaural rendering of a 5.1 audio signal,
- Fig. 2 is a high-level description of the principles of a parametric multi-channel coding and decoding system,
- Fig. 3 is a detailed description of the parametric multi-channel audio encoder,
- Fig. 4 is a detailed description of the parametric multi-channel audio decoder,
- Fig. 5 is a 3D-audio rendering of a decoded multi-channel signal (prior art),
- Fig. 6 is a personalized binaural decoding of multi-channel surround sound,
- Fig. 7 is a generalized diagram of the spatial audio processing in the MPEG Surround decoder,
- Fig. 8 is an embodiment of the invention for personalized binaural decoding,
- Fig. 9 is a schematic illustrating the combining of parameters,
- Fig. 10 is a diagram illustrating the results of a listening test.
- the block diagram of Fig. 6 illustrates the main steps in a method of decoding a parametric multi-channel surround audio bitstream as performed in a parametric sound decoder 13.
- in the demultiplexer 29, the main bitstream and the spatial side information are recovered.
- the main bitstream is first decoded in an M-channel audio decoder 31, from which the decoded signals ẑ_M(k,m) are input to the personalized spatial synthesis unit 17'.
- the spatial side information holding the spatial parameters is provided from the demultiplexer 29 to a spatial parameter decoder 33 that produces the decoded parameters p̂_N(k,m).
- the decoded spatial parameters are input to a parameter combining unit 37 that may also receive other parameter information, in particular personalized parameters and HRF information.
- the combining unit produces new parameters that in particular may be personalized spatial parameters and are input to the synthesis unit 17'.
- the spatial synthesis unit produces the signal x̂_2(k,m), which is provided to the frequency-to-time (F/T) transform unit 35 that transforms it back into the time domain.
- the time domain signal is provided to, e.g., the earphones 39 of a mobile terminal 41 in which the parametric surround decoder runs.
- the additional information and parameters received by the combining unit 37 can be obtained from a parameter unit 43 that e.g. may be constructed to receive user input interactively during a listening session such as from depressing some suitable key of the mobile terminal or unit 41.
- the processing in the MPEG surround decoder can be defined by two matrix multiplications as illustrated in the diagram of Fig. 7, the multiplications shown as including matrix units Ml and M2, also called the predecorrelator matrix unit and the mix matrix unit, respectively, to which the respective signals are input.
- the first matrix multiplication forms the input signals to decorrelation units or decorrelators D 1 , D 2 , ..., and the second matrix multiplication forms the output signals based on the down-mix input and the output from the decorrelators.
- the above operations are done for each hybrid subband, indexed by the hybrid subband index k.
- the index n is used for the number of a time slot,
- k is used to index a hybrid subband, and
- l is used to index the parameter set.
- M" 1 * is a two-dimensional matrix mapping a certain number of input channels to a certain number of channels going into the decorrelators, and is defined for every time-slot n and every hybrid subband Jc
- M2^{n,k} is a two-dimensional matrix mapping a certain number of pre-processed channels to a certain number of output channels, and is defined for every time slot n and every hybrid subband k.
- the matrix M2^{n,k} comes in two versions, depending on whether time-domain temporal shaping (TP) or temporal envelope shaping (TES) of the decorrelated signal is used; the two versions are denoted M2^{n,k}_wet and M2^{n,k}_dry.
- the input vector x^{n,k} to the first matrix unit M1 corresponds to the decoded signals ẑ_M(k,m) of Fig. 6 obtained from the audio decoder 31.
- the vector w^{n,k} that is input to the mix matrix unit M2 is a combination of the outputs d_1, d_2, ... from the decorrelators D_1, D_2, ..., the output from the first matrix multiplication, i.e. from the predecorrelator matrix unit M1, and residual signals res_1, res_2, ..., and is defined for every time slot n and every hybrid subband k.
- the output vector y"' k has components I f , l s , r f , r s , cf and lfe that basically correspond to the signals L, SL, R, SR, C and LFE as described above.
- the components cannot be used directly: they must be transformed to the time domain and in some way rendered before being provided to the earphones used.
- a method for 3D audio rendering, and in particular personalized decoding, uses a decoder that includes a "Reconstruct from Model" block that takes extra input, such as a representation of the personal 3D audio filters in the hybrid filter-bank domain, and uses it to transform derivatives of the model parameters into other model parameters that allow generating the two binaural signals directly in the transform domain, so that only the binaural 2-channel signal has to be transformed into the discrete time domain; compare the transform unit 35 in Fig. 6.
- a third matrix M3^{n,k}, symbolically shown as the parameter modification matrix M3, is in this example a linear mapping from six channels to the two channels that are used as input to the user headphones 39 through the transform unit 35.
- the matrix multiplication can be written as x̂_2^{n,k} = M3^{n,k} M2^{n,k} w^{n,k}, the left-hand side being the binaural 2-channel signal in the transform domain.
- Additional binaural post-processing may also be done and is outside the scope of the method as described herein. This may include further post-processing of the left and right channels.
- the new mix matrix M3^{n,k}·M2^{n,k} has parameters that depend both on the bitstream parameters and on the user's predefined head related filters (HRFs), as well as on other dynamic rendering parameters if desired.
- the matrix M3^{n,k} can be written as

  M3^{n,k} = | H_F^i(k)  H_S^i(k)  H_F^c(k)  H_S^c(k)  H_C(k)  H_C(k) |
             | H_F^c(k)  H_S^c(k)  H_F^i(k)  H_S^i(k)  H_C(k)  H_C(k) |

- the matrix elements being the five distinct filters used to implement the head related filtering: front ipsilateral H_F^i, front contralateral H_F^c, center H_C, surround ipsilateral H_S^i and surround contralateral H_S^c; under left/right symmetry the second row mirrors the first.
- the filters are represented in the hybrid domain.
- Such operations, representing time-domain filters in the frequency or transform domain, are well known in the signal processing literature.
- the filters that form the matrix M3^k are functions of the hybrid subband index k and are similar to those illustrated in Fig. 1.
- the matrix M3^k is independent of the time slot index n. Head related filters might, however, be changed dynamically if the user wants another virtual loudspeaker configuration to be experienced through the headphones 39.
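As a sketch of how a per-subband 2x6 matrix could be assembled from five head-related filter responses and folded into the mix matrix, so that only two channels ever need the inverse transform. The ipsilateral/contralateral filter naming and the channel order (lf, ls, rf, rs, cf, lfe) are assumptions of this illustration:

```python
import numpy as np

def build_m3(h_fi, h_fc, h_si, h_sc, h_c):
    """2x6 parameter modification matrix for one hybrid subband k, mapping
    the channels (lf, ls, rf, rs, cf, lfe) to (left ear, right ear).
    Left/right symmetry: the second row mirrors the first. The filter
    naming is illustrative, not taken verbatim from the patent."""
    return np.array([
        [h_fi, h_si, h_fc, h_sc, h_c, h_c],   # left ear
        [h_fc, h_sc, h_fi, h_si, h_c, h_c],   # right ear
    ])

m2 = np.full((6, 4), 0.25)                    # placeholder mix matrix
m3 = build_m3(1.0, 0.4, 0.8, 0.3, 0.6)        # toy per-subband responses
m_binaural = m3 @ m2                          # combined 2x4 binaural mix
```

Because matrix multiplication is associative, applying the precomputed 2x4 product to the vector w gives the same binaural result as applying M2 and then M3 in sequence, while never materializing the six intermediate channels.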
- the user may want to interactively change his spatial position.
- the user may want to experience how it is to be close to the concert scene if for instance a live concert is played, or farther away.
- This could easily be implemented by adding delay lines to the parameter modification matrix M3^{n,k}.
- the user action may be dynamic, and in that case the matrix M3^{n,k} is dependent on the time slot index n.
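A delay line added to M3 can be approximated in a complex modulated filterbank by a per-subband phase factor. The center-frequency convention below is the usual complex-QMF one and is only an approximation of the actual hybrid bank's phase behaviour:

```python
import numpy as np

def delay_m3(m3_k, tau, k, num_bands=64):
    """Approximate a broadband delay of `tau` samples in hybrid subband k
    by multiplying that subband's M3 entries with exp(-j*omega_k*tau),
    using the common complex-QMF center frequency pi*(k + 0.5)/num_bands.
    Illustrative only; the exact phase of the real filterbank differs."""
    omega_k = np.pi * (k + 0.5) / num_bands
    return np.exp(-1j * omega_k * tau) * m3_k
```

The phase factor leaves the magnitude response of the matrix untouched, so the level-related spatial cues are preserved while the perceived distance changes.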
- the user may want to experience different spatial sensations.
- reverberation and other sound effects can be efficiently introduced in the matrix M3^{n,k}.
- the parameter modification matrix M3^{n,k} can contain additional rendering parameters that are interactive and are changed in response to user input.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP07701092A EP1969901A2 (en) | 2006-01-05 | 2007-01-05 | Personalized decoding of multi-channel surround sound |
BRPI0706285-0A BRPI0706285A2 (en) | 2006-01-05 | 2007-01-05 | methods for decoding a parametric multichannel surround audio bitstream and for transmitting digital data representing sound to a mobile unit, parametric surround decoder for decoding a parametric multichannel surround audio bitstream, and, mobile terminal |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US74309606P | 2006-01-05 | 2006-01-05 | |
US60/743,096 | 2006-01-05 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2007078254A2 true WO2007078254A2 (en) | 2007-07-12 |
WO2007078254A3 WO2007078254A3 (en) | 2007-08-30 |
Family
ID=38228634
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/SE2007/000006 WO2007078254A2 (en) | 2006-01-05 | 2007-01-05 | Personalized decoding of multi-channel surround sound |
Country Status (5)
Country | Link |
---|---|
EP (1) | EP1969901A2 (en) |
CN (1) | CN101433099A (en) |
BR (1) | BRPI0706285A2 (en) |
RU (1) | RU2008132156A (en) |
WO (1) | WO2007078254A2 (en) |
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2436736A (en) * | 2006-03-31 | 2007-10-03 | Sony Corp | Sound field correction system for performing correction of frequency-amplitude characteristics |
US20090144063A1 (en) * | 2006-02-03 | 2009-06-04 | Seung-Kwon Beack | Method and apparatus for control of randering multiobject or multichannel audio signal using spatial cue |
EP2154911A1 (en) * | 2008-08-13 | 2010-02-17 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | An apparatus for determining a spatial output multi-channel audio signal |
EP2175670A1 (en) * | 2008-10-07 | 2010-04-14 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Binaural rendering of a multi-channel audio signal |
US7746964B2 (en) | 2005-12-13 | 2010-06-29 | Sony Corporation | Signal processing apparatus and signal processing method |
US8199932B2 (en) | 2006-11-29 | 2012-06-12 | Sony Corporation | Multi-channel, multi-band audio equalization |
US8280075B2 (en) | 2007-02-05 | 2012-10-02 | Sony Corporation | Apparatus, method and program for processing signal and method for generating signal |
WO2015080994A1 (en) * | 2013-11-27 | 2015-06-04 | Dolby Laboratories Licensing Corporation | Audio signal processing |
WO2016003206A1 (en) * | 2014-07-01 | 2016-01-07 | 한국전자통신연구원 | Multichannel audio signal processing method and device |
US9805727B2 (en) | 2013-04-03 | 2017-10-31 | Dolby Laboratories Licensing Corporation | Methods and systems for generating and interactively rendering object based audio |
US9848272B2 (en) | 2013-10-21 | 2017-12-19 | Dolby International Ab | Decorrelator structure for parametric reconstruction of audio signals |
US9883308B2 (en) | 2014-07-01 | 2018-01-30 | Electronics And Telecommunications Research Institute | Multichannel audio signal processing method and device |
US9900692B2 (en) | 2014-07-09 | 2018-02-20 | Sony Corporation | System and method for playback in a speaker system |
US9933989B2 (en) | 2013-10-31 | 2018-04-03 | Dolby Laboratories Licensing Corporation | Binaural rendering for headphones using metadata processing |
WO2018154175A1 (en) * | 2017-02-17 | 2018-08-30 | Nokia Technologies Oy | Two stage audio focus for spatial audio processing |
US10410644B2 (en) | 2011-03-28 | 2019-09-10 | Dolby Laboratories Licensing Corporation | Reduced complexity transform for a low-frequency-effects channel |
CN110797037A (en) * | 2013-07-31 | 2020-02-14 | 杜比实验室特许公司 | Method and apparatus for processing audio data, medium, and device |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101556799B (en) * | 2009-05-14 | 2013-08-28 | 华为技术有限公司 | Audio decoding method and audio decoder |
US9584912B2 (en) | 2012-01-19 | 2017-02-28 | Koninklijke Philips N.V. | Spatial audio rendering and encoding |
WO2016089133A1 (en) * | 2014-12-04 | 2016-06-09 | 가우디오디오랩 주식회사 | Binaural audio signal processing method and apparatus reflecting personal characteristics |
EP3220668A1 (en) * | 2016-03-15 | 2017-09-20 | Thomson Licensing | Method for configuring an audio rendering and/or acquiring device, and corresponding audio rendering and/or acquiring device, system, computer readable program product and computer readable storage medium |
CN106373582B (en) * | 2016-08-26 | 2020-08-04 | 腾讯科技(深圳)有限公司 | Method and device for processing multi-channel audio |
EP4138396A4 (en) * | 2020-05-21 | 2023-07-05 | Huawei Technologies Co., Ltd. | Audio data transmission method, and related device |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030035553A1 (en) * | 2001-08-10 | 2003-02-20 | Frank Baumgarte | Backwards-compatible perceptual coding of spatial cues |
WO2005041447A1 (en) * | 2003-10-22 | 2005-05-06 | Unwired Technology Llc | Multiple channel wireless communication system |
WO2007004830A1 (en) * | 2005-06-30 | 2007-01-11 | Lg Electronics Inc. | Apparatus for encoding and decoding audio signal and method thereof |
WO2007031896A1 (en) * | 2005-09-13 | 2007-03-22 | Koninklijke Philips Electronics N.V. | Audio coding |
US20070094037A1 (en) * | 2005-08-30 | 2007-04-26 | Pang Hee S | Slot position coding for non-guided spatial audio coding |
-
2007
- 2007-01-05 WO PCT/SE2007/000006 patent/WO2007078254A2/en active Application Filing
- 2007-01-05 EP EP07701092A patent/EP1969901A2/en not_active Withdrawn
- 2007-01-05 RU RU2008132156/09A patent/RU2008132156A/en not_active Application Discontinuation
- 2007-01-05 BR BRPI0706285-0A patent/BRPI0706285A2/en not_active Application Discontinuation
- 2007-01-05 CN CN 200780001908 patent/CN101433099A/en active Pending
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030035553A1 (en) * | 2001-08-10 | 2003-02-20 | Frank Baumgarte | Backwards-compatible perceptual coding of spatial cues |
WO2005041447A1 (en) * | 2003-10-22 | 2005-05-06 | Unwired Technology Llc | Multiple channel wireless communication system |
WO2007004830A1 (en) * | 2005-06-30 | 2007-01-11 | Lg Electronics Inc. | Apparatus for encoding and decoding audio signal and method thereof |
US20070094037A1 (en) * | 2005-08-30 | 2007-04-26 | Pang Hee S | Slot position coding for non-guided spatial audio coding |
WO2007031896A1 (en) * | 2005-09-13 | 2007-03-22 | Koninklijke Philips Electronics N.V. | Audio coding |
Non-Patent Citations (2)
Title |
---|
ANONYMOUS: 'Model-based HRTF parameter interpolation' IP.COM JOURNAL, IP.COM INC., WEST HENRIETTA, NY, US 05 September 2006, pages 1 - 5, XP003012896 * |
See also references of EP1969901A2 * |
Cited By (58)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7746964B2 (en) | 2005-12-13 | 2010-06-29 | Sony Corporation | Signal processing apparatus and signal processing method |
US10277999B2 (en) | 2006-02-03 | 2019-04-30 | Electronics And Telecommunications Research Institute | Method and apparatus for control of rendering multiobject or multichannel audio signal using spatial cue |
US20120294449A1 (en) * | 2006-02-03 | 2012-11-22 | Electronics And Telecommunications Research Institute | Method and apparatus for control of rendering multiobject or multichannel audio signal using spatial cue |
US9426596B2 (en) * | 2006-02-03 | 2016-08-23 | Electronics And Telecommunications Research Institute | Method and apparatus for control of rendering multiobject or multichannel audio signal using spatial cue |
US20090144063A1 (en) * | 2006-02-03 | 2009-06-04 | Seung-Kwon Beack | Method and apparatus for control of rendering multiobject or multichannel audio signal using spatial cue |
GB2436736A (en) * | 2006-03-31 | 2007-10-03 | Sony Corp | Sound field correction system for performing correction of frequency-amplitude characteristics |
GB2436736B (en) * | 2006-03-31 | 2008-03-12 | Sony Corp | Signal processing apparatus, signal processing method, and sound field correction system |
US8150069B2 (en) | 2006-03-31 | 2012-04-03 | Sony Corporation | Signal processing apparatus, signal processing method, and sound field correction system |
US8199932B2 (en) | 2006-11-29 | 2012-06-12 | Sony Corporation | Multi-channel, multi-band audio equalization |
US8280075B2 (en) | 2007-02-05 | 2012-10-02 | Sony Corporation | Apparatus, method and program for processing signal and method for generating signal |
CN102165797A (en) * | 2008-08-13 | 2011-08-24 | 弗朗霍夫应用科学研究促进协会 | An apparatus for determining a spatial output multi-channel audio signal |
US8855320B2 (en) | 2008-08-13 | 2014-10-07 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Apparatus for determining a spatial output multi-channel audio signal |
EP2154911A1 (en) * | 2008-08-13 | 2010-02-17 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | An apparatus for determining a spatial output multi-channel audio signal |
US8879742B2 (en) | 2008-08-13 | 2014-11-04 | Fraunhofer-Gesellschaft Zur Forderung Der Angewandten Forschung E.V. | Apparatus for determining a spatial output multi-channel audio signal |
US8824689B2 (en) | 2008-08-13 | 2014-09-02 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Apparatus for determining a spatial output multi-channel audio signal |
EP2175670A1 (en) * | 2008-10-07 | 2010-04-14 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Binaural rendering of a multi-channel audio signal |
WO2010040456A1 (en) * | 2008-10-07 | 2010-04-15 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Binaural rendering of a multi-channel audio signal |
AU2009301467B2 (en) * | 2008-10-07 | 2013-08-01 | Dolby International Ab | Binaural rendering of a multi-channel audio signal |
US8325929B2 (en) | 2008-10-07 | 2012-12-04 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Binaural rendering of a multi-channel audio signal |
US10410644B2 (en) | 2011-03-28 | 2019-09-10 | Dolby Laboratories Licensing Corporation | Reduced complexity transform for a low-frequency-effects channel |
US11081118B2 (en) | 2013-04-03 | 2021-08-03 | Dolby Laboratories Licensing Corporation | Methods and systems for interactive rendering of object based audio |
US11270713B2 (en) | 2013-04-03 | 2022-03-08 | Dolby Laboratories Licensing Corporation | Methods and systems for rendering object based audio |
US11769514B2 (en) | 2013-04-03 | 2023-09-26 | Dolby Laboratories Licensing Corporation | Methods and systems for rendering object based audio |
US11948586B2 (en) | 2013-04-03 | 2024-04-02 | Dolby Laboratories Licensing Corporation | Methods and systems for generating and rendering object based audio with conditional rendering metadata |
US10553225B2 (en) | 2013-04-03 | 2020-02-04 | Dolby Laboratories Licensing Corporation | Methods and systems for rendering object based audio |
US11727945B2 (en) | 2013-04-03 | 2023-08-15 | Dolby Laboratories Licensing Corporation | Methods and systems for interactive rendering of object based audio |
US10515644B2 (en) | 2013-04-03 | 2019-12-24 | Dolby Laboratories Licensing Corporation | Methods and systems for interactive rendering of object based audio |
US9997164B2 (en) | 2013-04-03 | 2018-06-12 | Dolby Laboratories Licensing Corporation | Methods and systems for interactive rendering of object based audio |
US10748547B2 (en) | 2013-04-03 | 2020-08-18 | Dolby Laboratories Licensing Corporation | Methods and systems for generating and rendering object based audio with conditional rendering metadata |
US10832690B2 (en) | 2013-04-03 | 2020-11-10 | Dolby Laboratories Licensing Corporation | Methods and systems for rendering object based audio |
US9881622B2 (en) | 2013-04-03 | 2018-01-30 | Dolby Laboratories Licensing Corporation | Methods and systems for generating and rendering object based audio with conditional rendering metadata |
US11568881B2 (en) | 2013-04-03 | 2023-01-31 | Dolby Laboratories Licensing Corporation | Methods and systems for generating and rendering object based audio with conditional rendering metadata |
US10276172B2 (en) | 2013-04-03 | 2019-04-30 | Dolby Laboratories Licensing Corporation | Methods and systems for generating and interactively rendering object based audio |
US9805727B2 (en) | 2013-04-03 | 2017-10-31 | Dolby Laboratories Licensing Corporation | Methods and systems for generating and interactively rendering object based audio |
US10388291B2 (en) | 2013-04-03 | 2019-08-20 | Dolby Laboratories Licensing Corporation | Methods and systems for generating and rendering object based audio with conditional rendering metadata |
CN110797037A (en) * | 2013-07-31 | 2020-02-14 | 杜比实验室特许公司 | Method and apparatus for processing audio data, medium, and device |
US11736890B2 (en) | 2013-07-31 | 2023-08-22 | Dolby Laboratories Licensing Corporation | Method, apparatus or systems for processing audio objects |
US9848272B2 (en) | 2013-10-21 | 2017-12-19 | Dolby International Ab | Decorrelator structure for parametric reconstruction of audio signals |
US10255027B2 (en) | 2013-10-31 | 2019-04-09 | Dolby Laboratories Licensing Corporation | Binaural rendering for headphones using metadata processing |
US10503461B2 (en) | 2013-10-31 | 2019-12-10 | Dolby Laboratories Licensing Corporation | Binaural rendering for headphones using metadata processing |
US11269586B2 (en) | 2013-10-31 | 2022-03-08 | Dolby Laboratories Licensing Corporation | Binaural rendering for headphones using metadata processing |
US10838684B2 (en) | 2013-10-31 | 2020-11-17 | Dolby Laboratories Licensing Corporation | Binaural rendering for headphones using metadata processing |
US11681490B2 (en) | 2013-10-31 | 2023-06-20 | Dolby Laboratories Licensing Corporation | Binaural rendering for headphones using metadata processing |
US9933989B2 (en) | 2013-10-31 | 2018-04-03 | Dolby Laboratories Licensing Corporation | Binaural rendering for headphones using metadata processing |
US12061835B2 (en) | 2013-10-31 | 2024-08-13 | Dolby Laboratories Licensing Corporation | Binaural rendering for headphones using metadata processing |
US10142763B2 (en) | 2013-11-27 | 2018-11-27 | Dolby Laboratories Licensing Corporation | Audio signal processing |
US20170026771A1 (en) * | 2013-11-27 | 2017-01-26 | Dolby Laboratories Licensing Corporation | Audio Signal Processing |
WO2015080994A1 (en) * | 2013-11-27 | 2015-06-04 | Dolby Laboratories Licensing Corporation | Audio signal processing |
DE112015003108B4 (en) * | 2014-07-01 | 2021-03-04 | Electronics And Telecommunications Research Institute | Method and device for processing a multi-channel audio signal |
US10645515B2 (en) | 2014-07-01 | 2020-05-05 | Electronics And Telecommunications Research Institute | Multichannel audio signal processing method and device |
US10264381B2 (en) | 2014-07-01 | 2019-04-16 | Electronics And Telecommunications Research Institute | Multichannel audio signal processing method and device |
US9883308B2 (en) | 2014-07-01 | 2018-01-30 | Electronics And Telecommunications Research Institute | Multichannel audio signal processing method and device |
WO2016003206A1 (en) * | 2014-07-01 | 2016-01-07 | 한국전자통신연구원 | Multichannel audio signal processing method and device |
US9900692B2 (en) | 2014-07-09 | 2018-02-20 | Sony Corporation | System and method for playback in a speaker system |
KR102214205B1 (en) * | 2017-02-17 | 2021-02-10 | 노키아 테크놀로지스 오와이 | 2-stage audio focus for spatial audio processing |
US10785589B2 (en) | 2017-02-17 | 2020-09-22 | Nokia Technologies Oy | Two stage audio focus for spatial audio processing |
KR20190125987A (en) * | 2017-02-17 | 2019-11-07 | 노키아 테크놀로지스 오와이 | Two-stage audio focus for spatial audio processing |
WO2018154175A1 (en) * | 2017-02-17 | 2018-08-30 | Nokia Technologies Oy | Two stage audio focus for spatial audio processing |
Also Published As
Publication number | Publication date |
---|---|
CN101433099A (en) | 2009-05-13 |
WO2007078254A3 (en) | 2007-08-30 |
EP1969901A2 (en) | 2008-09-17 |
RU2008132156A (en) | 2010-02-10 |
BRPI0706285A2 (en) | 2011-03-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP2000001B1 (en) | Method and arrangement for a decoder for multi-channel surround sound | |
WO2007078254A2 (en) | Personalized decoding of multi-channel surround sound | |
JP7564295B2 (en) | Apparatus, method, and computer program for encoding, decoding, scene processing, and other procedures for DirAC-based spatial audio coding | |
Herre et al. | MPEG-H 3D audio—The new standard for coding of immersive spatial audio | |
US8266195B2 (en) | Filter adaptive frequency resolution | |
Engdegard et al. | Spatial audio object coding (SAOC)—the upcoming MPEG standard on parametric object based audio coding | |
CN103489449B (en) | Audio signal decoder, method for providing upmix signal representation state | |
KR101358700B1 (en) | Audio encoding and decoding | |
US8880413B2 (en) | Binaural spatialization of compression-encoded sound data utilizing phase shift and delay applied to each subband | |
US9219972B2 (en) | Efficient audio coding having reduced bit rate for ambient signals and decoding using same | |
JP6134867B2 (en) | Renderer controlled space upmix | |
CN111970629B (en) | Audio decoder and decoding method | |
Breebaart et al. | Spatial audio object coding (SAOC)-the upcoming MPEG standard on parametric object based audio coding | |
JP2009543142A (en) | Concept for synthesizing multiple parametrically encoded sound sources | |
US10013993B2 (en) | Apparatus and method for surround audio signal processing | |
CN112218229A (en) | Method and apparatus for binaural dialog enhancement | |
Breebaart et al. | Binaural rendering in MPEG Surround | |
Quackenbush et al. | MPEG surround | |
WO2008084436A1 (en) | An object-oriented audio decoder | |
Peters et al. | Scene-based audio implemented with higher order ambisonics | |
Herre | Efficient Representation of Sound Images: Recent Developments in Parametric Coding of Spatial Audio | |
Plogsties et al. | MPEG Surround binaural rendering: Surround sound for mobile devices (Binaurale Wiedergabe mit MPEG Surround: Surround-Sound für mobile Geräte) | |
Meng | Virtual sound source positioning for un-fixed speaker set up | |
Breebaart et al. | 19th International Congress on Acoustics, Madrid, 2-7 September 2007 | |
EA047653B1 | Audio encoding and decoding using representation transformation parameters | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
DPE1 | Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101) | ||
WWE | Wipo information: entry into national phase | Ref document number: 2007701092; Country of ref document: EP |
WWE | Wipo information: entry into national phase | Ref document number: 200780001908.2; Country of ref document: CN |
NENP | Non-entry into the national phase in: | Ref country code: DE |
WWE | Wipo information: entry into national phase | Ref document number: 6071/DELNP/2008; Country of ref document: IN |
ENP | Entry into the national phase in: | Ref document number: 2008132156; Country of ref document: RU; Kind code of ref document: A |
ENP | Entry into the national phase in: | Ref document number: PI0706285; Country of ref document: BR; Kind code of ref document: A2; Effective date: 20080701 |