WO2008035275A2 - Encoding and decoding of audio objects - Google Patents
- Publication number: WO2008035275A2
- Application number: PCT/IB2007/053748
- Authority: WO, WIPO (PCT)
- Prior art keywords: audio, encoding, audio objects, data, encoder
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/16—Vocoder architecture
- G10L19/18—Vocoders using multiple modes
- G10L19/20—Vocoders using multiple modes using sound class specific coding, hybrid encoders or object based coding
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S3/00—Systems employing more than two channels, e.g. quadraphonic
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
Definitions
- the invention relates to encoding and decoding of audio objects and in particular, but not exclusively to manipulation of audio objects of a down-mix spatial signal.
- MPEG: Moving Picture Experts Group
- a multi-channel signal is down-mixed into a stereo signal and the additional signals are encoded by parametric data in the ancillary data portion allowing an MPEG Surround multi-channel decoder to generate a representation of the multi-channel signal.
- a legacy mono or stereo decoder will disregard the ancillary data and thus only decode the mono or stereo down-mix.
- parameters are extracted from the original audio signal so as to produce an audio signal having a reduced number of channels, for example only a single channel, plus a set of parameters describing the spatial properties of the original audio signal.
- the spatial properties described by the transmitted spatial parameters are used to recreate the original spatial multi-channel signal.
- a work item is started on object-based spatial audio coding.
- the aim of this work item is to explore new technology and reuse of current MPEG Surround components and technologies for the bit rate efficient coding of multiple sound sources or objects into a number of down-mix channels and corresponding spatial parameters.
- the intention is to use similar techniques as used for down-mixing of spatial (surround) channels to fewer channels to down-mix independent audio objects into a smaller number of channels.
- an improved system for audio object encoding/decoding would be advantageous and in particular a system allowing increased flexibility, improved quality, facilitated implementation and/or improved performance would be advantageous.
- This may provide a highly efficient and/or high quality control of the relative volume of an audio object by a listener while reducing or eliminating the effect on other audio objects.
- a high performance individual audio object volume control may be achieved.
- the encoding modification data may be embedded in a speech, music or other audio signal.
- the encoding modification data may specifically be embedded in ancillary or user data fields of an encoded audio signal received from the remote unit, such as e.g. an MPEG-4 bitstream. This may allow an efficient, backward compatible and low complexity communication of control data and may in particular be useful in systems employing two-way communications between an apparatus comprising the encoder and the remote unit.
- the encoder is arranged to receive encoding modification data from a plurality of remote units and to generate different parametric data for the different remote units in response to receiving different encoding modification data from the different remote units.
- the encoding means may furthermore be arranged to generate different audio signals for the different remote units.
- the approach may allow e.g. a centralized audio object encoder to customize the transmitted data to the requirements and preferences of the individual users of the remote units.
- a decoder for decoding audio objects, the decoder comprising: a receiver for receiving from an encoder a number of audio signals being a down-mix of a plurality of audio objects and parametric data representing the plurality of audio objects relative to the number of audio signals, the parametric data comprising a set of object parameters for at least one of the different audio objects; decoding means for decoding the audio objects from the number of audio signals in response to the parametric data; rendering means for generating a spatial multi-channel output signal from the audio objects; means for generating encoding modification data for the object encoder; and means for transmitting the encoding modification data to the object encoder.
- a transmitter for transmitting audio signals comprising: means for receiving a plurality of audio objects; encoding means for encoding the plurality of audio objects in a number of audio signals and parametric data representing the plurality of audio objects relative to the number of audio signals, the parametric data comprising a set of object parameters for at least one of the different audio objects; means for receiving encoding modification data from a remote unit; and parameter means for determining the parametric data in response to the modification data.
- a communication system for communicating audio signals comprising: a transmitter comprising: means for receiving a plurality of audio objects, encoding means for encoding the plurality of audio objects in a number of audio signals and parametric data representing the plurality of audio objects relative to the number of audio signals, the parametric data comprising a set of object parameters for at least one of the different audio objects, and means for transmitting the number of audio signals and the parametric data to a receiver; and the receiver comprising: a receiver element for receiving from the transmitter the number of audio signals and the parametric data, decoding means for decoding the audio objects from the number of audio signals in response to the parametric data, rendering means for generating a spatial multi-channel output signal from the audio objects, means for generating encoding modification data for the encoding means, and means for transmitting the encoding modification data to the transmitter; and wherein the transmitter comprises means for receiving the encoding modification data from the receiver; parameter means for determining the parametric data in response to the
- Fig. 1 is an illustration of an audio system in accordance with the prior art
- Fig. 7 illustrates an example of a method of decoding audio objects in accordance with some embodiments of the invention.
- the transmitter 201 is part of a teleconferencing hub.
- the speech signals of several far-end talkers are mixed in a teleconferencing hub. Then for each person in the teleconference, a mix of all signals except his/her own is transmitted to all receivers.
- the transmitter 201 can receive speech signals from a plurality of remote communication units taking part in the teleconference and can generate and distribute speech signals to the remote communication units.
- the receiver 203 is a signal player device which can generate a speech output to a participant of the conference call.
- the receiver 203 is part of a remote communication unit such as a telephone.
- the receiver 207 is coupled to the encoder 209 of Fig. 2 which is fed the individual speech audio objects and which encodes the audio objects in accordance with an encoding algorithm.
- the encoder 209 is coupled to a network transmitter 211 which receives the encoded signal and interfaces to the Internet 205.
- the network transmitter may transmit the encoded signal to the receiver 203 through the Internet 205.
- the audio objects of the present example are individual and isolated sound sources.
- the decoder 215 furthermore comprises a rendering unit 305 which is arranged to generate an output signal based on the audio inputs.
- the rendering unit 305 can freely manipulate and mix the received audio objects to generate a desired output signal.
- the rendering unit 305 can generate a five channel surround sound signal and can freely position each individual audio object in the generated sound image.
- the rendering unit 305 may generate a binaural stereo signal which can provide a spatial experience through e.g. a set of headphones.
- the functionality of the decoding unit 303 and the rendering unit 305 is combined into a single processing step.
- the operation of the decoding unit 303 typically corresponds to a matrix multiplication by an up-mix matrix and the operation of the rendering unit 305 similarly corresponds to a matrix multiplication performed on the output of the up-mix matrix multiplication.
- the cascaded matrix multiplication can be combined into a single matrix multiplication.
- the rendering unit 305 can place each individual speaker of the conference call at a different location in the sound image with the specific location for each speaker being freely selectable for example by a user controlling the rendering unit 305.
- the audio objects correspond to different musical instruments from a piece of music.
- the user can freely mix, equalize, etc. the individual instruments as well as freely position them in the sound image.
- the described approach allows the individual user a high degree of freedom to manipulate the different audio objects to generate a customized audio output which can be independent of the audio output generated for other users and recipients of the encoded signal from the encoder 209.
- an inverse matrix does not exist for a down-mix matrix (where N>M) and therefore parameter data can only be generated which allows a non-ideal regeneration of the original speech objects.
- the parameter unit 405 generates parameters which represent characteristics of the individual speech objects relative to the down-mix signal.
- the parameter unit first transforms the speech object into the frequency domain in time blocks (e.g. by use of an FFT) and then performs the down-mix matrix multiplication for each time frequency block (or time frequency tile). Furthermore, for the time frequency blocks, the relative amplitude of each speech object relative to the down-mix result is determined.
- the parameter unit 405 generates relative level information described in separate time/frequency tiles for the various speech objects. Thereby, a level vector is generated for the time/frequency tiles with each element of the vector representing the amount of energy in the time/frequency tile of the object of that element.
- the speech objects are fed to the rendering unit 305 which can proceed to generate an output signal for the user.
- the user may be able to adjust various rendering parameters and characteristics including for example changing a position of one or more of the speech objects in the generated sound image.
- the encoder 209 comprises a control data receiver 409 which receives the encoding modification data.
- the control data receiver 409 is coupled to the encoding unit 403 and the parameter unit 405 which are arranged to modify the encoding and generation of parameter data depending on the received encoding modification data.
- the user thereof can also control the encoding operation of the object oriented encoding performed at the encoder side.
- the decoder user may request that the volume of a specific speech object is increased substantially. If this is performed by amplifying the corresponding speech object at the decoder, the amplification will also amplify the cross interference components from other speech objects which may not only result in a higher volume of these but also in distortion of these objects and possibly in a shift in the position of these objects.
- the decoder 215 does not change the scaling of the generated speech object replicas but rather generates encoding modification data which will cause the encoder to modify the down-mix weights for the desired speech objects.
- the encoder 209 may be arranged to modify at least one of the audio objects prior to the down-mixing being performed.
- the encoding unit 403 can scale the received audio objects before performing the down-mix matrix multiplication. Thus, if encoding modification data is received which indicates that a specific speech object should be louder, the received signal samples for this object may be multiplied by a factor larger than one.
- the resulting signal can then be used in the down-mix matrix multiplication to generate the down-mix signal.
- This approach may allow a fixed down-mix matrix to be used and may specifically allow suitable, easy-to-multiply coefficients to be used (for example, the down-mix matrix could contain only unity coefficients, thereby effectively reducing the down-mix multiplication to a number of simple additions).
- the object parameters may be determined based on the modified signals.
- the scaled speech objects can also be fed to the parameter unit 405 which can determine the relative levels of the time/frequency tiles for the modified signals.
- This approach will result in the up-mixing process by the decoder generating a speech object having the desired volume level.
- the modification of the parametric data depending on the encoding modification data is indirect in the sense that the encoding modification data is first used to modify the speech objects and the parameter data is then generated on the basis of the modified speech objects.
- the object parameters may be changed such that the decoder will generate the required speech objects by applying the modified object parameters.
- it may be necessary to not only change the object parameter for that speech object but also for other speech objects.
- the encoder 209 is arranged to render the speech objects as a spatial output signal wherein each speech object is rendered at a specific location with a specific volume level and frequency characteristic etc.
- the output of the encoder 209 may be a stereo signal, a surround sound multi-channel signal and/or a binaural spatial surround signal e.g. generated using Head Related Transfer Functions.
- the encoding modification data received from the decoder 215 can comprise spatial rendering parameters which affect the rendering of the speech objects in the spatial signal.
- the spatial rendering parameters can for example indicate that the position of one or more of the audio objects should be changed in the spatial output mix.
- equalization data may be provided which can be applied to an individual audio object.
- the perceived distance of each audio object may be remotely controlled from the decoder end. For example, if encoding modification data is received which indicates that an audio object should be moved further away in a spatial down-mix, the rendering of this audio object may be changed such that the volume level is reduced and the correlation between front and back channels is increased.
- the remote user may control the spatial rendering mode of the encoder. For example, for a two-channel output signal, the user can select whether the rendering should be optimized for loudspeakers or headphones. Specifically, the remote user can select whether the output should be generated as a traditional stereo signal or as a binaural spatial surround signal for use with headphones.
- Fig. 6 illustrates an example of a method of encoding audio signals in accordance with some embodiments of the invention.
- Step 703 is followed by step 705 wherein a spatial multi-channel output signal is generated from the audio objects.
- Step 707 is followed by step 709 wherein the encoding modification data is transmitted to the object encoder.
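The conventional teleconferencing hub described above sends each participant a mix of all signals except his or her own. A minimal sketch of that mixing rule, assuming equal-length sample lists per participant; the function name `mix_minus` and the sample values are hypothetical and not taken from the patent:

```python
def mix_minus(signals):
    """For each participant, return the sum of all other participants' samples."""
    # Sum across participants once, sample by sample.
    total = [sum(column) for column in zip(*signals)]
    # Subtract each participant's own contribution from the total.
    return [[t - s for t, s in zip(total, own)] for own in signals]


# Three hypothetical participants, two samples each.
signals = [[1.0, 2.0], [10.0, 20.0], [100.0, 200.0]]
mixes = mix_minus(signals)
```

Because every listener receives a fixed pre-mixed signal, the receiving end cannot reposition or rescale individual talkers, which is the limitation the object-based approach is meant to address.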
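The statement that the up-mix multiplication of the decoding unit 303 and the rendering multiplication of the rendering unit 305 can be combined into a single matrix multiplication follows from the associativity of matrix products. The sketch below illustrates this with plain Python lists; the matrix sizes and coefficient values are hypothetical and chosen only for illustration:

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


# Hypothetical sizes: one down-mix channel, two objects, two output channels.
upmix = [[0.5], [0.5]]                # objects x down-mix channels
render = [[1.0, 0.0], [0.25, 0.75]]   # output channels x objects
downmix_block = [[2.0, 4.0, 6.0]]     # down-mix channels x samples

# Cascaded processing: decode the objects first, then render them.
two_step = matmul(render, matmul(upmix, downmix_block))

# Combined processing: pre-multiply the matrices, then apply them once.
combined = matmul(render, upmix)
one_step = matmul(combined, downmix_block)
```

Pre-multiplying the two matrices means the per-sample work is a single multiplication by the combined matrix rather than two successive ones.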
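The parameter generation described for the parameter unit 405 (transform each object into the frequency domain in time blocks, then derive a level vector per time/frequency tile) can be sketched roughly as follows. This is an illustration under stated assumptions, not the patent's algorithm: a naive DFT stands in for the FFT, each level is expressed as the object's share of the summed object energy in the tile, and the names `dft` and `tile_levels` are hypothetical.

```python
import cmath


def dft(block):
    """Naive discrete Fourier transform of one time block."""
    n = len(block)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(block))
        for k in range(n)
    ]


def tile_levels(objects, block_size):
    """Return levels[block][bin][object]: each object's energy share per tile."""
    n_blocks = len(objects[0]) // block_size
    levels = []
    for b in range(n_blocks):
        start = b * block_size
        spectra = [dft(obj[start:start + block_size]) for obj in objects]
        block_levels = []
        for k in range(block_size):
            energies = [abs(spectrum[k]) ** 2 for spectrum in spectra]
            total = sum(energies)
            # Assumption: normalise by the summed object energy in the tile.
            block_levels.append(
                [e / total if total > 0 else 0.0 for e in energies]
            )
        levels.append(block_levels)
    return levels


# Two hypothetical objects: a constant signal and an alternating one.
objects = [[1.0, 1.0, 1.0, 1.0], [1.0, -1.0, 1.0, -1.0]]
levels = tile_levels(objects, block_size=4)
```

In this toy input the constant object dominates the lowest frequency bin and the alternating object dominates the highest representable frequency, so the level vectors for those tiles are close to one for the dominant object.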
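The encoder-side volume control described above, where the encoding unit 403 scales the audio objects before a down-mix with unity coefficients, reduces to a per-object gain followed by simple additions. A minimal sketch, assuming a mono down-mix and one gain per object derived from the encoding modification data; the function name `scale_and_downmix` and the values are hypothetical:

```python
def scale_and_downmix(objects, gains):
    """Scale each object by its gain, then sum the objects sample by sample."""
    scaled = [[g * x for x in obj] for obj, g in zip(objects, gains)]
    # Unity down-mix coefficients: the down-mix is a plain sum.
    downmix = [sum(column) for column in zip(*scaled)]
    return scaled, downmix


# Two hypothetical speech objects; the remote user asked for object 0 to be louder.
objects = [[1.0, 2.0], [4.0, 8.0]]
gains = [2.0, 1.0]
scaled, downmix = scale_and_downmix(objects, gains)
```

The scaled objects, rather than the originals, would then be passed on for parameter estimation so that the decoder's up-mix reproduces the requested level.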
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Signal Processing (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Human Computer Interaction (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Mathematical Physics (AREA)
- Theoretical Computer Science (AREA)
- Stereophonic System (AREA)
- Reduction Or Emphasis Of Bandwidth Of Signals (AREA)
- Telephonic Communication Services (AREA)
- Telephone Function (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Circuit For Audible Band Transducer (AREA)
Priority Applications (10)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP07826410A EP2067138B1 (en) | 2006-09-18 | 2007-09-17 | Encoding and decoding of audio objects |
KR1020097007892A KR101396140B1 (en) | 2006-09-18 | 2007-09-17 | Encoding and decoding of audio objects |
DE602007012730T DE602007012730D1 (en) | 2006-09-18 | 2007-09-17 | CODING AND DECODING AUDIO OBJECTS |
US12/441,538 US8271290B2 (en) | 2006-09-18 | 2007-09-17 | Encoding and decoding of audio objects |
AT07826410T ATE499677T1 (en) | 2006-09-18 | 2007-09-17 | ENCODING AND DECODING AUDIO OBJECTS |
CN2007800345382A CN101517637B (en) | 2006-09-18 | 2007-09-17 | Encoder and decoder of audio frequency, encoding and decoding method, hub, transreciver, transmitting and receiving method, communication system and playing device |
PL07826410T PL2067138T3 (en) | 2006-09-18 | 2007-09-17 | Encoding and decoding of audio objects |
JP2009527954A JP5281575B2 (en) | 2006-09-18 | 2007-09-17 | Audio object encoding and decoding |
MX2009002795A MX2009002795A (en) | 2006-09-18 | 2007-09-17 | Encoding and decoding of audio objects. |
BRPI0716854-3A BRPI0716854B1 (en) | 2006-09-18 | 2007-09-17 | ENCODER FOR ENCODING AUDIO OBJECTS, DECODER FOR DECODING AUDIO OBJECTS, TELECONFERENCE DISTRIBUTOR CENTER, AND METHOD FOR DECODING AUDIO SIGNALS |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP06120819.5 | 2006-09-18 | ||
EP06120819 | 2006-09-18 | ||
EP06123799 | 2006-11-10 | ||
EP06123799.6 | 2006-11-10 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2008035275A2 (en) | 2008-03-27 |
WO2008035275A3 (en) | 2008-05-29 |
Family
ID=39079648
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/IB2007/053748 WO2008035275A2 (en) | 2006-09-18 | 2007-09-17 | Encoding and decoding of audio objects |
Country Status (12)
Country | Link |
---|---|
US (1) | US8271290B2 (en) |
EP (1) | EP2067138B1 (en) |
JP (1) | JP5281575B2 (en) |
KR (1) | KR101396140B1 (en) |
CN (1) | CN101517637B (en) |
AT (1) | ATE499677T1 (en) |
BR (1) | BRPI0716854B1 (en) |
DE (1) | DE602007012730D1 (en) |
MX (1) | MX2009002795A (en) |
PL (1) | PL2067138T3 (en) |
RU (1) | RU2460155C2 (en) |
WO (1) | WO2008035275A2 (en) |
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2010005264A2 (en) * | 2008-07-10 | 2010-01-14 | 한국전자통신연구원 | Method and apparatus for editing audio object in spatial information-based multi-object audio coding apparatus |
US7672744B2 (en) | 2006-11-15 | 2010-03-02 | Lg Electronics Inc. | Method and an apparatus for decoding an audio signal |
US7715569B2 (en) | 2006-12-07 | 2010-05-11 | Lg Electronics Inc. | Method and an apparatus for decoding an audio signal |
WO2010125104A1 (en) * | 2009-04-28 | 2010-11-04 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Apparatus for providing one or more adjusted parameters for a provision of an upmix signal representation on the basis of a downmix signal representation, audio signal decoder, audio signal transcoder, audio signal encoder, audio bitstream, method and computer program using an object-related parametric information |
US20110112843A1 (en) * | 2008-07-11 | 2011-05-12 | Nec Corporation | Signal analyzing device, signal control device, and method and program therefor |
US8027477B2 (en) | 2005-09-13 | 2011-09-27 | Srs Labs, Inc. | Systems and methods for audio processing |
US8265941B2 (en) | 2006-12-07 | 2012-09-11 | Lg Electronics Inc. | Method and an apparatus for decoding an audio signal |
KR101230691B1 (en) | 2008-07-10 | 2013-02-07 | 한국전자통신연구원 | Method and apparatus for editing audio object in multi object audio coding based spatial information |
US8396575B2 (en) | 2009-08-14 | 2013-03-12 | Dts Llc | Object-oriented audio streaming system |
EP2590360A1 (en) * | 2010-06-29 | 2013-05-08 | ZTE Corporation | Multi-point sound mixing and long distance view showing method, device and system |
US8831254B2 (en) | 2006-04-03 | 2014-09-09 | Dts Llc | Audio signal processing |
US9026450B2 (en) | 2011-03-09 | 2015-05-05 | Dts Llc | System for dynamically creating and rendering audio objects |
EP2879131A1 (en) * | 2013-11-27 | 2015-06-03 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Decoder, encoder and method for informed loudness estimation in object-based audio coding systems |
US9373335B2 (en) | 2012-08-31 | 2016-06-21 | Dolby Laboratories Licensing Corporation | Processing audio objects in principal and supplementary encoded audio signals |
US9558785B2 (en) | 2013-04-05 | 2017-01-31 | Dts, Inc. | Layered audio coding and transmission |
US9786286B2 (en) | 2013-03-29 | 2017-10-10 | Dolby Laboratories Licensing Corporation | Methods and apparatuses for generating and using low-resolution preview tracks with high-quality encoded object and multichannel audio signals |
US9818412B2 (en) | 2013-05-24 | 2017-11-14 | Dolby International Ab | Methods for audio encoding and decoding, corresponding computer-readable media and corresponding audio encoder and decoder |
Families Citing this family (33)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8959016B2 (en) | 2002-09-27 | 2015-02-17 | The Nielsen Company (Us), Llc | Activating functions in processing devices using start codes embedded in audio |
US9711153B2 (en) | 2002-09-27 | 2017-07-18 | The Nielsen Company (Us), Llc | Activating functions in processing devices using encoded audio and detecting audio signatures |
WO2008039041A1 (en) * | 2006-09-29 | 2008-04-03 | Lg Electronics Inc. | Methods and apparatuses for encoding and decoding object-based audio signals |
US8121830B2 (en) * | 2008-10-24 | 2012-02-21 | The Nielsen Company (Us), Llc | Methods and apparatus to extract data encoded in media content |
US8359205B2 (en) | 2008-10-24 | 2013-01-22 | The Nielsen Company (Us), Llc | Methods and apparatus to perform audio watermarking and watermark detection and extraction |
US9667365B2 (en) | 2008-10-24 | 2017-05-30 | The Nielsen Company (Us), Llc | Methods and apparatus to perform audio watermarking and watermark detection and extraction |
US8508357B2 (en) | 2008-11-26 | 2013-08-13 | The Nielsen Company (Us), Llc | Methods and apparatus to encode and decode audio for shopper location and advertisement presentation tracking |
JP5274359B2 (en) | 2009-04-27 | 2013-08-28 | 三菱電機株式会社 | 3D video and audio recording method, 3D video and audio playback method, 3D video and audio recording device, 3D video and audio playback device, 3D video and audio recording medium |
WO2010127268A1 (en) | 2009-05-01 | 2010-11-04 | The Nielsen Company (Us), Llc | Methods, apparatus and articles of manufacture to provide secondary content in association with primary broadcast media content |
US20100324915A1 (en) * | 2009-06-23 | 2010-12-23 | Electronic And Telecommunications Research Institute | Encoding and decoding apparatuses for high quality multi-channel audio codec |
EP2323130A1 (en) * | 2009-11-12 | 2011-05-18 | Koninklijke Philips Electronics N.V. | Parametric encoding and decoding |
TWI759223B (en) * | 2010-12-03 | 2022-03-21 | 美商杜比實驗室特許公司 | Audio decoding device, audio decoding method, and audio encoding method |
CN103050124B (en) | 2011-10-13 | 2016-03-30 | 华为终端有限公司 | Sound mixing method, Apparatus and system |
CN103152500B (en) * | 2013-02-21 | 2015-06-24 | 黄文明 | Method for eliminating echo from multi-party call |
US9559651B2 (en) * | 2013-03-29 | 2017-01-31 | Apple Inc. | Metadata for loudness and dynamic range control |
EP2973551B1 (en) | 2013-05-24 | 2017-05-03 | Dolby International AB | Reconstruction of audio scenes from a downmix |
RU2630754C2 (en) * | 2013-05-24 | 2017-09-12 | Долби Интернешнл Аб | Effective coding of sound scenes containing sound objects |
EP3005353B1 (en) | 2013-05-24 | 2017-08-16 | Dolby International AB | Efficient coding of audio scenes comprising audio objects |
CA3211308A1 (en) | 2013-05-24 | 2014-11-27 | Dolby International Ab | Coding of audio scenes |
US10049683B2 (en) | 2013-10-21 | 2018-08-14 | Dolby International Ab | Audio encoder and decoder |
CN104882145B (en) * | 2014-02-28 | 2019-10-29 | 杜比实验室特许公司 | It is clustered using the audio object of the time change of audio object |
CN105336339B (en) | 2014-06-03 | 2019-05-03 | 华为技术有限公司 | A kind for the treatment of method and apparatus of voice frequency signal |
US10037202B2 (en) | 2014-06-03 | 2018-07-31 | Microsoft Technology Licensing, Llc | Techniques to isolating a portion of an online computing service |
US9510125B2 (en) * | 2014-06-20 | 2016-11-29 | Microsoft Technology Licensing, Llc | Parametric wave field coding for real-time sound propagation for dynamic sources |
CN105989845B (en) * | 2015-02-25 | 2020-12-08 | 杜比实验室特许公司 | Video content assisted audio object extraction |
CN107358959B (en) * | 2016-05-10 | 2021-10-26 | 华为技术有限公司 | Coding method and coder for multi-channel signal |
CN109479178B (en) | 2016-07-20 | 2021-02-26 | 杜比实验室特许公司 | Audio object aggregation based on renderer awareness perception differences |
EP3605531B1 (en) * | 2017-03-28 | 2024-08-21 | Sony Group Corporation | Information processing device, information processing method, and program |
US10602296B2 (en) * | 2017-06-09 | 2020-03-24 | Nokia Technologies Oy | Audio object adjustment for phase compensation in 6 degrees of freedom audio |
US10602298B2 (en) | 2018-05-15 | 2020-03-24 | Microsoft Technology Licensing, Llc | Directional propagation |
US10932081B1 (en) | 2019-08-22 | 2021-02-23 | Microsoft Technology Licensing, Llc | Bidirectional propagation of sound |
CN111462767B (en) * | 2020-04-10 | 2024-01-09 | 全景声科技南京有限公司 | Incremental coding method and device for audio signal |
US11662975B2 (en) | 2020-10-06 | 2023-05-30 | Tencent America LLC | Method and apparatus for teleconference |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030026441A1 (en) * | 2001-05-04 | 2003-02-06 | Christof Faller | Perceptual synthesis of auditory scenes |
US20030035553A1 (en) * | 2001-08-10 | 2003-02-20 | Frank Baumgarte | Backwards-compatible perceptual coding of spatial cues |
Family Cites Families (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6782360B1 (en) * | 1999-09-22 | 2004-08-24 | Mindspeed Technologies, Inc. | Gain quantization for a CELP speech coder |
SE0202159D0 (en) * | 2001-07-10 | 2002-07-09 | Coding Technologies Sweden Ab | Efficient and scalable parametric stereo coding for low bitrate applications |
JP2003188731A (en) * | 2001-12-18 | 2003-07-04 | Yrp Mobile Telecommunications Key Tech Res Lab Co Ltd | Variable rate encoding method, encoder and decoder |
WO2004036955A1 (en) * | 2002-10-15 | 2004-04-29 | Electronics And Telecommunications Research Institute | Method for generating and consuming 3d audio scene with extended spatiality of sound source |
KR101049751B1 (en) * | 2003-02-11 | 2011-07-19 | 코닌클리케 필립스 일렉트로닉스 엔.브이. | Audio coding |
DE10344638A1 (en) * | 2003-08-04 | 2005-03-10 | Fraunhofer Ges Forschung | Generation, storage or processing device and method for representation of audio scene involves use of audio signal processing circuit and display device and may use film soundtrack |
JP2005352396A (en) * | 2004-06-14 | 2005-12-22 | Matsushita Electric Ind Co Ltd | Sound signal encoding device and sound signal decoding device |
JP4892184B2 (en) * | 2004-10-14 | 2012-03-07 | パナソニック株式会社 | Acoustic signal encoding apparatus and acoustic signal decoding apparatus |
DE102005008369A1 (en) * | 2005-02-23 | 2006-09-07 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Apparatus and method for simulating a wave field synthesis system |
US7974422B1 (en) * | 2005-08-25 | 2011-07-05 | Tp Lab, Inc. | System and method of adjusting the sound of multiple audio objects directed toward an audio output device |
KR20080093422A (en) * | 2006-02-09 | 2008-10-21 | 엘지전자 주식회사 | Method for encoding and decoding object-based audio signal and apparatus thereof |
EP1853092B1 (en) * | 2006-05-04 | 2011-10-05 | LG Electronics, Inc. | Enhancing stereo audio with remix capability |
US8295494B2 (en) * | 2007-08-13 | 2012-10-23 | Lg Electronics Inc. | Enhancing audio with remixing capability |
2007
- 2007-09-17 CN CN2007800345382A patent/CN101517637B/en active Active
- 2007-09-17 BR BRPI0716854-3A patent/BRPI0716854B1/en active IP Right Grant
- 2007-09-17 WO PCT/IB2007/053748 patent/WO2008035275A2/en active Application Filing
- 2007-09-17 PL PL07826410T patent/PL2067138T3/en unknown
- 2007-09-17 KR KR1020097007892A patent/KR101396140B1/en active IP Right Grant
- 2007-09-17 AT AT07826410T patent/ATE499677T1/en not_active IP Right Cessation
- 2007-09-17 RU RU2009114741/08A patent/RU2460155C2/en active
- 2007-09-17 US US12/441,538 patent/US8271290B2/en active Active
- 2007-09-17 JP JP2009527954A patent/JP5281575B2/en active Active
- 2007-09-17 MX MX2009002795A patent/MX2009002795A/en active IP Right Grant
- 2007-09-17 DE DE602007012730T patent/DE602007012730D1/en active Active
- 2007-09-17 EP EP07826410A patent/EP2067138B1/en active Active
Non-Patent Citations (1)
Title |
---|
BREEBAART J ET AL: "MPEG Spatial Audio Coding / MPEG Surround: Overview and Current Status" AUDIO ENGINEERING SOCIETY CONVENTION PAPER, NEW YORK, NY, US, 7 October 2005 (2005-10-07), pages 1-17, XP002379094 * |
Cited By (55)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8027477B2 (en) | 2005-09-13 | 2011-09-27 | Srs Labs, Inc. | Systems and methods for audio processing |
US9232319B2 (en) | 2005-09-13 | 2016-01-05 | Dts Llc | Systems and methods for audio processing |
US8831254B2 (en) | 2006-04-03 | 2014-09-09 | Dts Llc | Audio signal processing |
US7672744B2 (en) | 2006-11-15 | 2010-03-02 | Lg Electronics Inc. | Method and an apparatus for decoding an audio signal |
US8265941B2 (en) | 2006-12-07 | 2012-09-11 | Lg Electronics Inc. | Method and an apparatus for decoding an audio signal |
US8488797B2 (en) | 2006-12-07 | 2013-07-16 | Lg Electronics Inc. | Method and an apparatus for decoding an audio signal |
US7783050B2 (en) | 2006-12-07 | 2010-08-24 | Lg Electronics Inc. | Method and an apparatus for decoding an audio signal |
US7783048B2 (en) | 2006-12-07 | 2010-08-24 | Lg Electronics Inc. | Method and an apparatus for decoding an audio signal |
US7715569B2 (en) | 2006-12-07 | 2010-05-11 | Lg Electronics Inc. | Method and an apparatus for decoding an audio signal |
US7783049B2 (en) | 2006-12-07 | 2010-08-24 | Lg Electronics Inc. | Method and an apparatus for decoding an audio signal |
US7986788B2 (en) | 2006-12-07 | 2011-07-26 | Lg Electronics Inc. | Method and an apparatus for decoding an audio signal |
US8005229B2 (en) | 2006-12-07 | 2011-08-23 | Lg Electronics Inc. | Method and an apparatus for decoding an audio signal |
US7783051B2 (en) | 2006-12-07 | 2010-08-24 | Lg Electronics Inc. | Method and an apparatus for decoding an audio signal |
US8428267B2 (en) | 2006-12-07 | 2013-04-23 | Lg Electronics Inc. | Method and an apparatus for decoding an audio signal |
US8340325B2 (en) | 2006-12-07 | 2012-12-25 | Lg Electronics Inc. | Method and an apparatus for decoding an audio signal |
US8311227B2 (en) | 2006-12-07 | 2012-11-13 | Lg Electronics Inc. | Method and an apparatus for decoding an audio signal |
KR101230691B1 (en) | 2008-07-10 | 2013-02-07 | 한국전자통신연구원 | Method and apparatus for editing audio object in multi object audio coding based spatial information |
WO2010005264A3 (en) * | 2008-07-10 | 2010-04-22 | 한국전자통신연구원 | Method and apparatus for editing audio object in spatial information-based multi-object audio coding apparatus |
WO2010005264A2 (en) * | 2008-07-10 | 2010-01-14 | 한국전자통신연구원 | Method and apparatus for editing audio object in spatial information-based multi-object audio coding apparatus |
US20110112843A1 (en) * | 2008-07-11 | 2011-05-12 | Nec Corporation | Signal analyzing device, signal control device, and method and program therefor |
US9786285B2 (en) | 2009-04-28 | 2017-10-10 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Apparatus for providing one or more adjusted parameters for a provision of an upmix signal representation on the basis of a downmix signal representation, audio signal decoder, audio signal transcoder, audio signal encoder, audio bitstream, method and computer program using an object-related parametric information |
CN102576532A (en) * | 2009-04-28 | 2012-07-11 | 弗兰霍菲尔运输应用研究公司 | Apparatus for providing one or more adjusted parameters for a provision of an upmix signal representation on the basis of a downmix signal representation, audio signal decoder, audio signal transcoder, audio signal encoder, audio bitstream, method and computer program using an object-related parametric information |
US8731950B2 (en) | 2009-04-28 | 2014-05-20 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Apparatus for providing one or more adjusted parameters for a provision of an upmix signal representation on the basis of a downmix signal representation, audio signal decoder, audio signal transcoder, audio signal encoder, audio bitstream, method and computer program using an object-related parametric information |
WO2010125104A1 (en) * | 2009-04-28 | 2010-11-04 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Apparatus for providing one or more adjusted parameters for a provision of an upmix signal representation on the basis of a downmix signal representation, audio signal decoder, audio signal transcoder, audio signal encoder, audio bitstream, method and computer program using an object-related parametric information |
RU2573738C2 (en) * | 2009-04-28 | 2016-01-27 | Фраунхофер-Гезелльшафт цур Фёрдерунг дер ангевандтен Форшунг Е.Ф. | Device for optimising one or more upmixing signal presentation parameters based on downmixing signal presentation, audio signal decoder, audio signal transcoder, audio signal encoder, audio bitstream, method and computer program using object-oriented parametric information |
US9167346B2 (en) | 2009-08-14 | 2015-10-20 | Dts Llc | Object-oriented audio streaming system |
US8396576B2 (en) | 2009-08-14 | 2013-03-12 | Dts Llc | System for adaptively streaming audio objects |
US8396577B2 (en) | 2009-08-14 | 2013-03-12 | Dts Llc | System for creating audio objects for streaming |
US8396575B2 (en) | 2009-08-14 | 2013-03-12 | Dts Llc | Object-oriented audio streaming system |
EP2590360A4 (en) * | 2010-06-29 | 2014-06-18 | Zte Corp | Multi-point sound mixing and long distance view showing method, device and system |
EP2590360A1 (en) * | 2010-06-29 | 2013-05-08 | ZTE Corporation | Multi-point sound mixing and long distance view showing method, device and system |
US9165558B2 (en) | 2011-03-09 | 2015-10-20 | Dts Llc | System for dynamically creating and rendering audio objects |
US9026450B2 (en) | 2011-03-09 | 2015-05-05 | Dts Llc | System for dynamically creating and rendering audio objects |
US9721575B2 (en) | 2011-03-09 | 2017-08-01 | Dts Llc | System for dynamically creating and rendering audio objects |
US9373335B2 (en) | 2012-08-31 | 2016-06-21 | Dolby Laboratories Licensing Corporation | Processing audio objects in principal and supplementary encoded audio signals |
US9786286B2 (en) | 2013-03-29 | 2017-10-10 | Dolby Laboratories Licensing Corporation | Methods and apparatuses for generating and using low-resolution preview tracks with high-quality encoded object and multichannel audio signals |
US9613660B2 (en) | 2013-04-05 | 2017-04-04 | Dts, Inc. | Layered audio reconstruction system |
US9837123B2 (en) | 2013-04-05 | 2017-12-05 | Dts, Inc. | Layered audio reconstruction system |
US9558785B2 (en) | 2013-04-05 | 2017-01-31 | Dts, Inc. | Layered audio coding and transmission |
US9818412B2 (en) | 2013-05-24 | 2017-11-14 | Dolby International Ab | Methods for audio encoding and decoding, corresponding computer-readable media and corresponding audio encoder and decoder |
KR20160075756A (en) * | 2013-11-27 | 2016-06-29 | 프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝 에. 베. | Decoder, encoder and method for informed loudness estimation in object-based audio coding systems |
US9947325B2 (en) | 2013-11-27 | 2018-04-17 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Decoder, encoder and method for informed loudness estimation employing by-pass audio object signals in object-based audio coding systems |
AU2014356467B2 (en) * | 2013-11-27 | 2016-12-15 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Decoder, encoder and method for informed loudness estimation in object-based audio coding systems |
CN105874532A (en) * | 2013-11-27 | 2016-08-17 | 弗劳恩霍夫应用研究促进协会 | Decoder, encoder and method for informed loudness estimation in object-based audio coding systems |
WO2015078956A1 (en) * | 2013-11-27 | 2015-06-04 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Decoder, encoder and method for informed loudness estimation in object-based audio coding systems |
EP2879131A1 (en) * | 2013-11-27 | 2015-06-03 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Decoder, encoder and method for informed loudness estimation in object-based audio coding systems |
WO2015078964A1 (en) | 2013-11-27 | 2015-06-04 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Decoder, encoder and method for informed loudness estimation employing by-pass audio object signals in object-based audio coding systems |
TWI569259B (en) * | 2013-11-27 | 2017-02-01 | 弗勞恩霍夫爾協會 | Decoder, encoder and method for informed loudness estimation in object-based audio coding systems |
KR101852950B1 (en) * | 2013-11-27 | 2018-06-07 | 프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝 에. 베. | Decoder, encoder and method for informed loudness estimation in object-based audio coding systems |
US10497376B2 (en) | 2013-11-27 | 2019-12-03 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Decoder, encoder, and method for informed loudness estimation in object-based audio coding systems |
US10699722B2 (en) | 2013-11-27 | 2020-06-30 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Decoder, encoder and method for informed loudness estimation employing by-pass audio object signals in object-based audio coding systems |
US10891963B2 (en) | 2013-11-27 | 2021-01-12 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Decoder, encoder, and method for informed loudness estimation in object-based audio coding systems |
US11423914B2 (en) | 2013-11-27 | 2022-08-23 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Decoder, encoder and method for informed loudness estimation employing by-pass audio object signals in object-based audio coding systems |
US11688407B2 (en) | 2013-11-27 | 2023-06-27 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Decoder, encoder, and method for informed loudness estimation in object-based audio coding systems |
US11875804B2 (en) | 2013-11-27 | 2024-01-16 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Decoder, encoder and method for informed loudness estimation employing by-pass audio object signals in object-based audio coding systems |
Also Published As
Publication number | Publication date |
---|---|
BRPI0716854A2 (en) | 2013-10-01 |
US20090326960A1 (en) | 2009-12-31 |
KR101396140B1 (en) | 2014-05-20 |
WO2008035275A3 (en) | 2008-05-29 |
PL2067138T3 (en) | 2011-07-29 |
CN101517637B (en) | 2012-08-15 |
ATE499677T1 (en) | 2011-03-15 |
CN101517637A (en) | 2009-08-26 |
RU2460155C2 (en) | 2012-08-27 |
JP5281575B2 (en) | 2013-09-04 |
DE602007012730D1 (en) | 2011-04-07 |
EP2067138B1 (en) | 2011-02-23 |
BRPI0716854A8 (en) | 2019-01-15 |
JP2010503887A (en) | 2010-02-04 |
KR20090080945A (en) | 2009-07-27 |
BRPI0716854B1 (en) | 2020-09-15 |
MX2009002795A (en) | 2009-04-01 |
EP2067138A2 (en) | 2009-06-10 |
RU2009114741A (en) | 2010-10-27 |
US8271290B2 (en) | 2012-09-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8271290B2 (en) | | Encoding and decoding of audio objects |
US9460729B2 (en) | | Layered approach to spatial audio coding |
Herre et al. | | MPEG spatial audio object coding—the ISO/MPEG standard for efficient coding of interactive audio scenes |
JP4838361B2 (en) | | Audio signal decoding method and apparatus |
KR101358700B1 (en) | | Audio encoding and decoding |
Engdegard et al. | | Spatial audio object coding (SAOC)—the upcoming MPEG standard on parametric object based audio coding |
JP5645951B2 (en) | | An apparatus for providing an upmix signal based on a downmix signal representation, an apparatus for providing a bitstream representing a multichannel audio signal, a method, a computer program, and a multi-channel audio signal using linear combination parameters Bitstream |
JP5455647B2 (en) | | Audio decoder |
JP5719372B2 (en) | | Apparatus and method for generating upmix signal representation, apparatus and method for generating bitstream, and computer program |
CN117395593A (en) | | Apparatus, method and computer program for encoding, decoding, scene processing and other processes related to DirAC-based spatial audio coding |
EP3809709A1 (en) | | Apparatus and method for audio encoding |
KR101062353B1 (en) | | Method for decoding audio signal and apparatus therefor |
TW202230336A (en) | | Apparatus and method for encoding a plurality of audio objects or apparatus and method for decoding using two or more relevant audio objects |
TW202223880A (en) | | Apparatus and method for encoding a plurality of audio objects using direction information during a downmixing or apparatus and method for decoding using an optimized covariance synthesis |
Engdegård et al. | | MPEG spatial audio object coding—the ISO/MPEG standard for efficient coding of interactive audio scenes |
ES2360740T3 (en) | | Coding and decoding of audio objects |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | WWE | WIPO information: entry into national phase | Ref document number: 200780034538.2; Country of ref document: CN |
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 07826410; Country of ref document: EP; Kind code of ref document: A2 |
| | WWE | WIPO information: entry into national phase | Ref document number: 2007826410; Country of ref document: EP |
| | ENP | Entry into the national phase | Ref document number: 2009527954; Country of ref document: JP; Kind code of ref document: A |
| | WWE | WIPO information: entry into national phase | Ref document number: MX/A/2009/002795; Country of ref document: MX |
| | WWE | WIPO information: entry into national phase | Ref document number: 12441538; Country of ref document: US |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | WWE | WIPO information: entry into national phase | Ref document number: 1970/CHENP/2009; Country of ref document: IN |
| | WWE | WIPO information: entry into national phase | Ref document number: 1020097007892; Country of ref document: KR |
| | ENP | Entry into the national phase | Ref document number: 2009114741; Country of ref document: RU; Kind code of ref document: A |
| | ENP | Entry into the national phase | Ref document number: PI0716854; Country of ref document: BR; Kind code of ref document: A2; Effective date: 20090316 |