EP2451196A1 - Method and apparatus for generating and for decoding sound field data including Ambisonics sound field data of an order higher than three - Google Patents

Method and apparatus for generating and for decoding sound field data including Ambisonics sound field data of an order higher than three

Info

Publication number
EP2451196A1
Authority
EP
European Patent Office
Prior art keywords
data
ambisonics
order
value
coefficients
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP10306212A
Other languages
German (de)
English (en)
Inventor
Holger Kropp
Florian Keiler
Johann-Markus Batke
Stefan Abeling
Johannes Boehm
Sven Kordon
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Thomson Licensing SAS
Original Assignee
Thomson Licensing SAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Thomson Licensing SAS
Priority to EP10306212A
Publication of EP2451196A1
Legal status: Withdrawn

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S7/00 Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30 Control circuits for electronic adaptation of the sound field
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S2420/00 Techniques used in stereophonic systems covered by H04S but not provided for in its groups
    • H04S2420/03 Application of parametric coding in stereophonic audio systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S2420/00 Techniques used in stereophonic systems covered by H04S but not provided for in its groups
    • H04S2420/11 Application of ambisonics in stereophonic audio systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S3/00 Systems employing more than two channels, e.g. quadraphonic
    • H04S3/008 Systems employing more than two channels, e.g. quadraphonic, in which the audio signals are in digital form, i.e. employing more than two discrete digital channels

Definitions

  • the invention relates to a method and to an apparatus for generating and for decoding sound field data including Ambisonics sound field data of an order higher than three, wherein for encoding and for decoding different processing paths can be used.
  • 2D presentations include formats like stereo or surround sound, and are based on audio container formats like WAV and BWF (Broadcast Wave Format).
  • the wave format WAV is described in Microsoft, "Multiple Channel Audio Data and WAVE Files", updated March 7, 2007, http://www.microsoft.com/whdc/device/audio/multichaud.mspx, and in http://www-mmsp.ece.mcgill.ca/Documents/AudioFormats/WAVE/WAVE.html, last update 19 June 2006.
  • WFS combines a high number of spherical sound sources for emulating plane waves from different directions. Therefore, a lot of loudspeakers or audio channels are required.
  • a description contains a number of source signals as well as their specific positions.
  • Ambisonics uses specific coefficients based on spherical harmonics for providing a sound field description that is independent from any specific loudspeaker set-up. This leads to a description which does not require information about loudspeaker positions during sound field recording or generation of synthetic scenes.
  • the reproduction accuracy in an Ambisonics system can be modified by its order N .
  • the 'higher-order Ambisonics' (HOA) description considers an order of more than one, and the focus in this application is on HOA.
  • the number of required audio information channels can be determined for a 2D or a 3D system, because this depends on the number of spherical harmonic bases.
  • 'mixed orders' have different orders in 2D (x-y plane only) and 3D (additionally z axis).
  • the first-order B-Format uses three channels for 2D and four channels for 3D.
  • the first-order B-Format is extended to the higher-order B-format. Depending on O a horizontal (2D), a full-sphere (3D), or a mixture sound field type description can be generated. By ignoring appropriate channels, this B-format is backward compatible, i.e. a 2D Ambisonics receiver is able to decode the 2D components from a 3D Ambisonics sound field.
  • the extended B-format for HOA considers only orders up to three, which corresponds to 16 channels maximum.
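
The channel counts quoted above (three or four channels for first order, 16 channels for order three) agree with the usual relation between the Ambisonics order N and the number of harmonic basis functions: 2N + 1 for a horizontal (2D) description and (N + 1)^2 for a full-sphere (3D) description. The following minimal sketch only illustrates that relation; the function name `hoa_channel_count` is hypothetical and does not appear in the patent text.

```python
def hoa_channel_count(order: int, three_d: bool = True) -> int:
    """Return the number of Ambisonics channels for a given order N.

    Assumes the standard relations (not quoted from the patent text):
    2D (horizontal only): 2N + 1 channels
    3D (full sphere):     (N + 1)^2 channels
    """
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2 if three_d else 2 * order + 1


if __name__ == "__main__":
    # Print the 2D and 3D channel counts for a few orders.
    for n in (1, 3, 4):
        print(n, hoa_channel_count(n, three_d=False), hoa_channel_count(n))
```

An order above three therefore needs more than 16 channels in 3D, which is the case the extended B-format does not cover.
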
  • the older UHJ-format was introduced to enable mono and stereo compatibility.
  • the G-format was introduced to reproduce sound scenarios in 5.1 environments.
  • Wave FORMAT_EXTENSIBLE format is an extension of the above-mentioned WAV format.
  • One application is the use of Ambisonics B-format in the WAVEX description: "Wave Format Extensible and the .amb suffix or WAVEX and Ambisonics", http://mchapman.com/amb/wavex .
  • Wave-based audio format descriptions are used in different applications.
  • An environment that is very important today, and will become even more important in the future, is that of internet applications based on Ethernet transmission protocols.
  • a data structure for Ambisonics transmission that is able to use the above-mentioned B-format as well as additional features like the Ambisonics order and the bit lengths of the coefficients in an efficient manner is not yet known to the applicant.
  • RTP Real-Time Protocol
  • This payload header extends the RTP header of Fig. 1 by a 2-octet extended sequence number and a 2-octet extended time stamp. Furthermore, one octet for flags and a reserved field, followed by a 3-octet SMPTE time stamp and a 4-octet offset value is proposed therein.
  • the 32-bit aligned payload data follows the header data.
  • a problem to be solved by the invention is to provide a data structure (i.e. a protocol layer) for 3D higher-order Ambisonics sound field description formats, which can be used for real-time transmission over Ethernet.
  • This problem is solved by the encoding method disclosed in claim 1 and the decoding method disclosed in claim 3. Apparatuses which utilise these methods are disclosed in claims 2 and 4, respectively.
  • the data structures described below facilitate real-time transmission of 3D sound field descriptions over Ethernet. From the content of additional metadata the transmitted 3D sound field can be adapted at receiver side to the available headphones or the number and positions of loudspeakers, for regular as well as for irregular set-ups. No regular loudspeaker set-ups including a large number of loudspeakers are required like in WFS.
  • the sound quality level can be adapted to the available sound reproduction system, e.g. by mapping a 3D Ambisonics sound field description onto a 2D loudspeaker set-up.
  • The inventive data structure considers single microphones or microphone arrays as well as virtual acoustical sources with different accuracies and sample rates.
  • Moving sources, i.e. sources with time-dependent spatial positions, are considered inherently in Ambisonics descriptions.
  • the Ambisonics header information level is adaptable between a simple and an encoder related mode.
  • the latter one enables fast decoder modifications. This is useful especially for real-time applications.
  • the proposed data structure is extendable for classical audio scene descriptions, i.e. sound sources and their positions.
  • the inventive Ambisonics processing is based on linear operators, i.e. the Ambisonics channels data can be packed and transmitted singly or in an assembled manner as a matrix.
  • the inventive encoding method is suited for generating sound field data including Ambisonics sound field data of an order higher than three, said method including the steps:
  • the inventive encoder apparatus is suited for generating sound field data including Ambisonics sound field data of an order higher than three, said apparatus including:
  • the inventive decoding method is suited for decoding sound field data that were encoded according to the above encoding method using one or two or more of said paths, said method including the steps:
  • the inventive decoder apparatus is suited for decoding sound field data that were encoded according to the above encoding method using one or two or more of said paths, said apparatus including:
  • In a first step or multiplier 33, all S source signals x ( k ) at each sample time kT , i.e. virtual single sources as well as microphone array sources, are multiplied with a matrix ⁇ defined in Eq.(1).
  • Fig. 3 shows a block diagram of an Ambisonics encoder for these four cases at production side. The required functions are represented by corresponding steps or stages in front of the transmission. All processing steps are clocked by a frequency that is made in stage 38 synchronous with the sample frequency 1/T.
  • a controller 37 receives a mode selection signal and the value of order N , and controls an optional multiplexer 36 that receives the filter responses and the output signal of multiplier 33, and outputs the inventive data structure frames 39.
  • Multiplier 33 represents a directional encoder providing corresponding coefficients and outputs the unfiltered vector data d ( k ), the order N value, and parameter Norm .
  • An array response filter 42 ('Filter 1' in Fig. 4 ) only for the microphone sources data can be arranged at decoder side.
  • the unfiltered vector data d ( k ), the order N value, and parameter Norm are assembled in a combiner 340 with radii data R S ( t ), and are fed to an optional multiplexer 36.
  • Radii data R S ( t ) represent the distances of the audio sources of the S input signals x ( k ), and refer to microphones as well as to artificially generated virtual sound sources.
  • the coefficients vector data d ( k ) pass through an array response filter 341 for the microphone sources (filter 2).
  • the filtering compensates the microphone-array response and is based on Bessel or Hankel functions. Basically, the signals from the output vectors d ( k ) are filtered.
  • the other inputs serve as parameters for the filter, e.g. parameter R is used for the term k * r .
  • the filtering is relevant only for microphones that have the individual radius R m . Such radii are taken into consideration in the term k * r of the Bessel or Hankel functions. Normally, the amplitude response of the filter starts with a lowpass characteristic but increases for higher frequencies.
  • the filtering is performed in dependency from the Ambisonics order N , the order n and the radii R m values, so as to compensate for non-linear frequency dependency.
  • a subsequent normalisation step or stage 351 for spherical waves data provides filtered coefficients A ( k ). It is assumed that there is also a corresponding filter at reproduction side (filter 431 in Fig. 4 ).
  • the filtered and normalised coefficients A ( k ), parameter Norm and the order N value are fed to multiplexer 36.
  • the coefficients vector data d ( k ) pass through an array response filter 342 for the microphone sources (filter 3).
  • the filtering is performed in dependency from said Ambisonics order N , said order n , the radii R m values and a radius R ref value representing the average radius R ref of the loudspeakers at decoder side as described in the below section "Radius R ref (RREF)", so as to compensate for non-linear frequency dependency.
  • a filter for spherical waves data is also arranged at reproduction side. Then the average radius R ref of the loudspeakers has to be considered already in filter 342.
  • a subsequent normalisation step or stage 352 for spherical waves data provides filtered coefficients A ( k ).
  • Step/stage 352 can include a distance coding like that described in connection with Fig. 4 .
  • the filtered coefficients A ( k ) from step/stage 352, parameter Norm , the order N value and radius value R ref are fed to multiplexer 36.
  • the coefficients vector data d ( k ) pass through an array response filter 343 for the microphone sources (filter 4).
  • the filtering is performed in dependency from the Ambisonics order N , the radii R m values and a Plane Wave parameter.
  • a subsequent normalisation step or stage 353 for plane waves data provides parameter Norm , the order N value and a flag for Plane Wave to multiplexer 36.
  • the Ambisonics encoder can code the output signals 361 in any one of these paths, in any two of these paths, or in more than two of these paths.
  • the normalisation steps or stages 351 to 353 can use a normalisation or scaling as described below in section "Ambisonics Normalisation/Scaling Format (ANSF)".
  • the Ambisonics decoder depicted in Fig. 4 parses the incoming data structures in a parser 41 in order to detect the case type and to provide the data for performing the appropriate functions.
  • An example for such parser is disclosed in WO 2009/106637 A1 .
  • Unfiltered vector data d ( k ), order value N , parameter Norm and each radii data R S ( t ) are parsed. These values pass through an array response filter 42 (Filter 1) for filtering (a filtering as described in Fig. 3 ) the received d ( k ) data under consideration of all radii R S ( t ).
  • the resulting filtered coefficients A ( k ) are distance coded (DC) in a distance coding step or stage 431 for all loudspeaker radii R LS and order N , and pass thereafter together with loudspeaker direction values ⁇ l ( representing the directions of the LS loudspeakers 46 ), value N and parameter Norm through an optional multiplexer 44 to a panning or pseudo inverse step or stage 45.
  • Distance coding means taking into account Bessel or Hankel functions with radii parameter in term k * r for plane or spherical waves.
  • Filtered coefficients A ( k ), parameter Norm and order value N are parsed.
  • the filtered coefficients A ( k ) are distance coded (DC) in a distance coding step or stage 432 for all loudspeaker radii R LS and order N , and pass thereafter together with loudspeaker direction values ⁇ l , value N and parameter Norm through multiplexer 44 to the panning or pseudo inverse step or stage 45.
  • Spherical waves on AE and AD sides are assumed.
  • Filtered coefficients A ( k ), order value N , parameter Norm and radius value R ref are parsed.
  • the filtered coefficients A ( k ) are distance coded (DC) in a distance coding step or stage 432 for all loudspeaker radii R LS and order N under consideration of radius R ref , and pass thereafter together with loudspeaker direction values ⁇ l , value N and parameter Norm through multiplexer 44 to the panning or pseudo inverse step or stage 45.
  • Spherical waves on AE and AD sides are assumed.
  • Filtered coefficients A ( k ), order value N , parameter Norm and a flag for Plane Waves are parsed.
  • the filtered coefficients A ( k ) together with loudspeaker direction values ⁇ l , value N and parameter Norm pass through multiplexer 44 to the panning or pseudo inverse step or stage 45. Plane waves on AE and AD sides are assumed.
  • a mode selector 47 selects in multiplexer 44 the corresponding path or paths a) to d) which was or were used at encoder side.
  • Decoder 45 which represents a panning or a mode matching operation including pseudo inverse, inverts the matrix ⁇ operation in the Ambisonics encoder in Fig. 3 , and applies this operation to the filtered coefficients A ( k ) or the filtered and distance coded coefficients A '( k ), respectively, in dependency from the parameter Norm , order value N and the loudspeaker direction values ⁇ l , and provides the l loudspeaker signals for a loudspeaker array 46.
  • Parser 41 also provides synchronisation information that is used for re-synchronisation of a clock 48.
  • the invention specifies a packet-based streaming format for encapsulating spatial sound field descriptions based on Ambisonics into an extended real-time transport protocol, in particular RTP, for real-time streaming of spatial audio scenes.
  • the focus is on a standalone spatial (2D/3D) audio real-time application, e.g. a transmission of a live concert or a live sport event via IP. This requires a specific spatial audio layer including time stamps and possibly synchronisation information.
  • the Ambisonics real-time stream can be used together with an RTP layer.
  • alternative RTP layers with or without extended headers are described below.
  • EASF Extended Ambisonics streaming format
  • Ethernet transmissions are performed in data packets with a typical packet length called 'path MTU' with up to 1500 or 9000 bytes.
  • The packets are grouped into 'frames'. Such a frame represents a dedicated time interval within which a typical number of packets is transmitted.
  • For example, in video applications in 1080p video mode, a frame contains 1080 data packets, each of which describes one line of a complete video frame.
  • a transmission should be frame based.
  • Case 1 requires a transmission of each time-dependent radii R S ( t ). This is an option if filter processing is to be performed in the decoder. However, in the following section the focus is on Cases 2-4 in which the filtered coefficients A ( k ) are transmitted. This allows a higher bandwidth because the transmission remains independent from all source positions, i.e. this is suited more for Ambisonics.
  • For standalone audio transmission, the protocol contains the following header data structure.
  • Payload Type - 7 bits
  • the payload type is defined for an Audio standalone transmission as EASF.
  • the film format is chosen, e.g. DPX.
  • Sequence Number - 16 bits: The LSB bits for the sequence number. It increments by one for each RTP data packet sent, and may be used by the receiver for detecting packet loss and for restoring the packet sequence. The initial value of the sequence number is random (i.e. unpredictable) in order to make known-plaintext attacks on encryption more difficult.
  • Timestamp - 32 bits
  • the timestamp denotes the sampling instant of the frame to which the RTP packet belongs. Packets belonging to the same frame must have the same timestamp.
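
Because the sequence number is only 16 bits wide and starts at a random value, a receiver has to allow for wrap-around when it checks for lost packets. The following is a minimal sketch of such a check, assuming packets arrive in order; the function name `lost_packets` is hypothetical and not part of the described protocol.

```python
SEQ_MOD = 1 << 16  # the sequence number field is 16 bits wide


def lost_packets(prev_seq: int, curr_seq: int) -> int:
    """Count packets missing between two consecutively received packets.

    Assumption: packets arrive in order, so any gap larger than one
    (computed modulo 2^16) is interpreted as packet loss. A repeated
    sequence number (gap of zero) is treated as a duplicate, not a loss.
    """
    gap = (curr_seq - prev_seq) % SEQ_MOD
    return gap - 1 if gap > 0 else 0


if __name__ == "__main__":
    print(lost_packets(10, 11))     # consecutive packets
    print(lost_packets(65535, 0))   # wrap-around without loss
    print(lost_packets(65534, 1))   # wrap-around with missing packets
```

A receiver that also has to restore the order of reordered packets would need a window-based comparison instead of this simple difference.
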
  • RTP payload header extension: According to the invention, the fields of the known RTP header keep their usual meaning, but that header is amended as follows.
  • RTP Payload Frame Status (PLFS) - 2 bit: The frame status describes which type of data will follow the extended RTP header in the payload block:

      PLFS code   Payload type
      00          Ambisonics coefficients
      01          Frame end (+ Ambisonics coefficients)
      10          Frame begin (+ Metadata)
      11          Metadata

  • I.e., in the first packet of a frame, instead of audio data, additional metadata can be transmitted. In case of Ambisonics transmission, the metadata contains source and Ambisonics encoder related information (production side information) required for the decoding process.
  • Time Code/Sync Frequency (TCSF) - 30 bit unsigned integer
  • the following SMPTE time code or the synchronisation information is based on a specific clock frequency, the Time Code/Sync Frequency TCSF.
  • the TCSF is defined as a 30 bit unsigned integer field. The value is represented in Hz and leads to a frequency range from 0 to 1073.741823 MHz (2^30 - 1 Hz), wherein a value of 0 Hz signals that no time code is available.
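
The 2 bit PLFS field and the 30 bit TCSF field together fill one 32 bit word. The sketch below shows how such a word could be packed and parsed. The bit layout (PLFS in the two most significant bits, TCSF in the remaining 30 bits, network byte order) is an assumption made for illustration, since the exact layout is defined in a figure that is not reproduced here; the function names are hypothetical.

```python
import struct

# PLFS codes as listed in the payload frame status table.
PLFS_TYPES = {
    0b00: "Ambisonics coefficients",
    0b01: "Frame end (+ Ambisonics coefficients)",
    0b10: "Frame begin (+ Metadata)",
    0b11: "Metadata",
}

TCSF_MAX = (1 << 30) - 1  # largest value of a 30 bit unsigned integer


def pack_plfs_tcsf(plfs: int, tcsf_hz: int) -> bytes:
    """Pack PLFS and TCSF into 4 bytes (assumed layout: PLFS in the 2 MSBs)."""
    if plfs not in PLFS_TYPES:
        raise ValueError("PLFS must be a 2 bit value")
    if not 0 <= tcsf_hz <= TCSF_MAX:
        raise ValueError("TCSF must fit into 30 bits")
    return struct.pack(">I", (plfs << 30) | tcsf_hz)


def unpack_plfs_tcsf(word: bytes):
    """Return (plfs, tcsf_hz); tcsf_hz is None when no time code is available."""
    (value,) = struct.unpack(">I", word)
    plfs = value >> 30
    tcsf_hz = value & TCSF_MAX
    return plfs, (tcsf_hz if tcsf_hz != 0 else None)


if __name__ == "__main__":
    word = pack_plfs_tcsf(0b10, 48000)  # frame begin, 48 kHz sync frequency
    plfs, tcsf = unpack_plfs_tcsf(word)
    print(PLFS_TYPES[plfs], tcsf)
```

Mapping a TCSF of 0 Hz to `None` mirrors the rule that this value signals the absence of a time code.
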
  • the selection in data field AST facilitates not only a separation within Ambisonics (cf. the example provided below in connection with Fig. 9 ) but also the parallel transmission of differently encoded audio source signals (Ambisonics and/or PCM data + position data), i.e. the inventive protocol can be complemented e.g. for PCM data.
  • the below-described SMPTE Time Code/Clock Sync Info (STCSI) facilitates the temporally correct assignment of the audio signal sources.
  • the general Ambisonics header is transmitted only in the first data packet of a frame and the individual Ambisonics header is transmitted in all other data packets.
  • the general Ambisonics header shall also be available in every data packet in front of the individual Ambisonics header. This mode enables a modification of the parameters in each data packet, i.e. in real-time. It can be useful for real-time applications where no or only small buffers are available. However, this mode decreases the available bandwidth.
  • Different sources can generate audio signals at the same time.
  • Known protocols are based on a separate transmission of the sound sources, i.e. every data frame refers to a single temporal section in which, depending on the sampling frequency, several samples can be contained. Therefore, in known protocols, different source signals occurring at the same time instant will use the same time stamp and the same frame number. This poses no problem for offline processing, i.e. processing that is not performed in real time.
  • the transmitted data are buffered and assembled later on. However, this does not work for real-time processing in which a small latency is demanded.
  • the data field XAH facilitates a continued entrainment of the header, and the parser 41 in Fig. 4 can switch back and forth block-by-block (or Ethernet packet-by-packet or frame-by-frame) between different audio sources types.
  • Distinguishing between general header and individual header facilitates a real-time adaptation.
  • the value in the 24 bit field STCSI (see below) represents the SMPTE time code. If STS is set, field STCSI contains user-specific synchronisation information.
  • the packet offset describes the distance in bytes between the first payload octet of the first data packet in a frame relative to the first payload octet in the current data packet.
  • PAO(HIGH) represents the 32 MSBs and PAO(LOW) represents the 32 LSBs.
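
Since the packet offset is carried as two 32 bit halves, a sender has to split a 64 bit byte offset and a receiver has to join the halves again. A minimal sketch of that arithmetic follows; the helper names `split_pao` and `join_pao` are hypothetical.

```python
from typing import Tuple

MASK32 = 0xFFFFFFFF


def split_pao(offset_bytes: int) -> Tuple[int, int]:
    """Split a 64 bit packet offset into (PAO(HIGH), PAO(LOW))."""
    if not 0 <= offset_bytes < (1 << 64):
        raise ValueError("packet offset must fit into 64 bits")
    return (offset_bytes >> 32) & MASK32, offset_bytes & MASK32


def join_pao(high: int, low: int) -> int:
    """Rebuild the 64 bit packet offset from its 32 MSBs and 32 LSBs."""
    return ((high & MASK32) << 32) | (low & MASK32)


if __name__ == "__main__":
    high, low = split_pao(0x100000005)
    print(high, low, hex(join_pao(high, low)))
```
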
  • Ambisonics payload data and Ambisonics header data shall be fragmented such that the resulting RTP data packet is smaller than the 'path MTU' mentioned above.
  • the path MTU is a 'jumbo frame' of e.g. 9000 bytes.
  • a small individual Ambisonics header is sent in front of each data packet.
  • a general header contains source and encoder related information that can be useful for the Ambisonics decoder. It contains information that is valid for all the data packets within a frame, and for small frames and/or data packets it can be sent once at the beginning of a frame. Especially for real-time applications where the packet information is changing frequently, it can be advantageous to send the general header with each data packet.
  • Table 1

      AFT code   Format
      00         B-Format order
      01         numerical upward
      10         numerical downward
      11         Reserved

      Degree n   Order m   Channel
      0           0        W
      1           1        X
      1          -1        Y
      1           0        Z
      2           0        R
      2           1        S
      2          -1        T
      2           2        U
      2          -2        V
      3           0        K
      3           1        L
      3          -1        M
      3           2        N
      3          -2        O
      3           3        P
      3          -3        Q
  • the sequence of each matrix column in Eq.(1) from top to bottom represents a numerical upward order type.
  • a degree value always starts with 0 and runs up to Ambisonics Order N .
  • the sequence starts with lowest order - N and runs up to order + N .
  • the downward type uses for each degree the reversed order.
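
The three channel sequences can be written down compactly. The sketch below lists the B-Format sequence exactly as it appears in Table 1 and generates the numerical upward and downward sequences. It assumes that, for each degree n, the order m runs from -n to +n in the upward type and from +n to -n in the downward type; this per-degree reading of the text, and the names used, are assumptions for illustration.

```python
from typing import List, Tuple

# B-Format sequence as (degree n, order m, channel letter), copied from Table 1.
B_FORMAT_ORDER = [
    (0, 0, "W"),
    (1, 1, "X"), (1, -1, "Y"), (1, 0, "Z"),
    (2, 0, "R"), (2, 1, "S"), (2, -1, "T"), (2, 2, "U"), (2, -2, "V"),
    (3, 0, "K"), (3, 1, "L"), (3, -1, "M"), (3, 2, "N"), (3, -2, "O"),
    (3, 3, "P"), (3, -3, "Q"),
]


def numerical_order(max_order: int, upward: bool = True) -> List[Tuple[int, int]]:
    """List (n, m) index pairs for the numerical upward or downward type.

    Degrees run from 0 to max_order. Within each degree n, m runs from
    -n to +n (upward) or from +n to -n (downward).
    """
    sequence = []
    for n in range(max_order + 1):
        orders = range(-n, n + 1) if upward else range(n, -n - 1, -1)
        sequence.extend((n, m) for m in orders)
    return sequence


if __name__ == "__main__":
    print(numerical_order(1))
    print(numerical_order(1, upward=False))
    print(len(numerical_order(4)))  # number of coefficients for order four
```

Unlike the B-Format list, the numerical types are not limited to order three, which is what makes them usable for higher orders.
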
  • the Ambisonics order describes the quality of the Ambisonics en- and decoding via ⁇ .
  • An order up to 255 should be sufficient.
  • the order is distinguished in horizontal and vertical direction. In case of 2D, only AHO has a value greater than '0'.
  • a mixed order can have different AHO and AVO values.
  • Ambisonics Normalisation/Scaling Format (ANSF) - 3 bit Identifies different normalisation formats, typically used for Ambisonics.
  • the normalisation corresponds to the orthogonality relationship between $Y_n^m$ and $Y_{n'}^{m'*}$.
  • additional normalisation principles e.g. Furse-Malham.
  • the Furse-Malham formulation facilitates a normalisation of the coefficients to get maximum values of ±1, which yields an optimal dynamic range.
  • the scaling factors are fixed over one frame. The scaling factors will be transmitted only once in front of the Ambisonics coefficients.
  • The following normalisation formats are defined:

      ANSF code   Format
      000         Orthonormal
      001         Schmidt semi-normalised
      010         4π normalised
      011         Unnormalised
      100         Furse-Malham
      101         Dedicated scaling
      11x         Reserved
  • the reference radius R ref value of the loudspeakers in mm is required in case of spherical waves.
  • f audible frequencies
  • speed of sound c 340 m/s.
  • This code defines the word length as well as the format (integer/floating point) of the transmitted Ambisonics coefficients A ( k ).
  • the sample format enables an adaptation to different value ranges.
  • nine sample formats are predefined:

      ASF code    Format
      0000        Unsigned integer 8 bit
      0001        Signed integer 8 bit
      0010        Signed integer 16 bit
      0011        Signed integer 24 bit
      0100        Signed integer 32 bit
      0101        Signed integer 64 bit
      0110        Float 32 bit (binary single prec.)
      0111        Float 64 bit (binary double prec.)
      1000        Float 128 bit (binary quad prec.)
      1001-1111   Reserved
  • If ASF is specified as an integer format, the number AIB of invalid bits can mask the lowest bits within the ASF integer. AIB is coded as a 5 bit unsigned integer value, so that up to 31 bits can be marked as invalid. Valid bits start at the MSB. Note that the number of invalid bits AIB is less than the ASF integer word length.
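
The effect of AIB on an integer sample can be illustrated with a simple bit mask: the lowest AIB bits are cleared and the remaining bits, counted from the MSB, stay valid. The sketch below works on the raw (two's complement) word of a sample; the function name `mask_invalid_bits` is hypothetical, and whether a decoder clears or merely ignores the invalid bits is not specified in the text.

```python
def mask_invalid_bits(raw_word: int, word_length: int, aib: int) -> int:
    """Clear the lowest `aib` bits of a raw integer sample word.

    raw_word    -- the sample as an unsigned bit pattern of `word_length` bits
    word_length -- ASF integer word length in bits (e.g. 8, 16, 24, 32, 64)
    aib         -- number of invalid bits, a 5 bit value below the word length
    """
    if not 0 <= aib <= 31:
        raise ValueError("AIB is a 5 bit unsigned integer")
    if aib >= word_length:
        raise ValueError("AIB must be less than the word length")
    full_mask = (1 << word_length) - 1
    valid_mask = full_mask & ~((1 << aib) - 1)
    return raw_word & valid_mask


if __name__ == "__main__":
    # A 24 bit word that carries only 16 valid bits.
    print(hex(mask_invalid_bits(0xABCDEF, 24, 8)))
```
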
  • the rate at which the input data x i ( k ) are sampled is coded as an unsigned integer.
  • If FSM is cleared, the following 31 bits for FS represent the frame size in bytes. If FSM is set, FS represents the total number of data packets in the current frame.
  • the frame size number FS is to be interpreted in view of the FSM flag's value. Depending on the application, the frame size can vary from frame to frame.
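
A decoder can read the FSM flag and the FS value from a single 32 bit word. The following minimal sketch assumes that FSM is the most significant bit, that the 31 bits of FS follow it directly, and that the word is transmitted in network byte order; the function name `parse_frame_size` is hypothetical.

```python
import struct


def parse_frame_size(word: bytes):
    """Return (fs, unit) from a 4 byte FSM/FS word.

    Assumed layout: bit 31 is the FSM flag, bits 30..0 hold FS.
    unit is "bytes" when FSM is cleared and "packets" when FSM is set.
    """
    (value,) = struct.unpack(">I", word)
    fsm = value >> 31
    fs = value & 0x7FFFFFFF
    return fs, ("packets" if fsm else "bytes")


if __name__ == "__main__":
    print(parse_frame_size(b"\x00\x00\x10\x00"))  # FSM cleared
    print(parse_frame_size(b"\x80\x00\x00\x0c"))  # FSM set
```
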
  • a 'frame' can contain several equal-length packets, wherein the last packet can have a different length that is described in the individual Ambisonics header. Every packet may use such a header to describe a length value that differs from the prior packet lengths.
  • bits in front of APL are reserved. This enables an extension of the individual header, e.g. by packet related flags, and a 32 bit alignment for the following Ambisonics coefficients.
  • the maximum length is 65535.
  • the payload data type is defined in the data field PLFS (RTP Payload Frame Status), cf. Fig. 5 .
  • 'pure' Ambisonics data or 'pure' metadata can be arranged.
  • the transmission processing operates in a sequential manner, i.e. at each transmission clock step (which is totally different from the sampling rate) only 32 or 64 bits of a data packet can be dealt with.
  • the number of considered Ambisonics samples in one data packet is related to one concatenated sample time or to a group of concatenated sample times.
  • the following examples of payload data show different dimensions, orders, and Ambisonics coefficients based on the encoder/decoder cases 2 to 4 of Fig. 3 .
  • the first index x of A( x , y ) describes the sequence number for a specific order, whereas the second index y stands for the sample time k in a data packet.
  • SMPTE MXF and XML are pre-defined.
  • The following metadata types are defined:

      AMT code    Format
      0x00        SMPTE MXF
      0x80        XML
      0x01-0x7F   Reserved
      0x81-0xFF   Reserved
  • This data field is followed by specific metadata. If possible the metadata descriptions should be kept simple in order to get only one metadata packet in the 'begin packet' of a frame. However, the packet length in bytes is the same as for Ambisonics coefficients. If the amount of metadata will exceed this packet length, the metadata has to be fragmented into several packets which shall be inserted between packets with Ambisonics coefficients. If the metadata amount in bytes in one packet is less than the regular packet length, the remaining packet bytes are to be padded with '0' or stuffing bits.
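
The fragmentation and padding rule for metadata can be sketched as follows. The example splits a metadata block into packets of the regular packet length and pads the last one with zero bytes. It does not model how the metadata packets are interleaved with coefficient packets, and the function name `fragment_metadata` is hypothetical.

```python
from typing import List


def fragment_metadata(metadata: bytes, packet_length: int) -> List[bytes]:
    """Split metadata into equal-length payloads, zero-padding the last one."""
    if packet_length <= 0:
        raise ValueError("packet_length must be positive")
    packets = [
        metadata[i:i + packet_length]
        for i in range(0, len(metadata), packet_length)
    ]
    if packets:
        # Pad the final fragment with '0' bytes up to the regular length.
        packets[-1] = packets[-1].ljust(packet_length, b"\x00")
    return packets


if __name__ == "__main__":
    for payload in fragment_metadata(b"<meta>example</meta>", 8):
        print(payload)
```

Keeping the metadata short enough to fit into one packet avoids the fragmentation path altogether, as the text recommends.
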
  • the encapsulated CRC word at the end of each Ethernet packet should be used.
  • the content addressable memories CAM detect all protocol data which will lead to a decision about how the received data are to be processed in the following steps or stages, and the registers REG store information about the length or the payload data.
  • the parser evaluates the header data in a hierarchical manner and can be implemented in hardware or software, according to any real-time requirements.
  • spherical waves SPW or plane waves PW e.g. the worldwide live broadcast of a concert in 3D format, wherein all receiving units are arranged in cinemas.
  • the individual signals are to be transmitted separately so that a correct presentation can be facilitated.
  • the parser can distinguish this and supply two separate 'distance coding' units with the corresponding data items.
  • the inventive Ambisonics decoder depicted in Fig. 4 can process all these signals, whereas in the prior art several decoders would be required. I.e., considering the Ambisonics wave type facilitates the advantages described above.

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Stereophonic System (AREA)
EP10306212A 2010-11-05 2010-11-05 Method and apparatus for generating and for decoding sound field data including Ambisonics sound field data of an order higher than three Withdrawn EP2451196A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP10306212A EP2451196A1 (fr) 2010-11-05 2010-11-05 Method and apparatus for generating and for decoding sound field data including Ambisonics sound field data of an order higher than three

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
EP10306212A EP2451196A1 (fr) 2010-11-05 2010-11-05 Method and apparatus for generating and for decoding sound field data including Ambisonics sound field data of an order higher than three

Publications (1)

Publication Number Publication Date
EP2451196A1 (fr) 2012-05-09

Family

ID=43585582

Family Applications (1)

Application Number Title Priority Date Filing Date
EP10306212A Withdrawn EP2451196A1 (fr) 2010-11-05 2010-11-05 Method and apparatus for generating and for decoding sound field data including Ambisonics sound field data of an order higher than three

Country Status (1)

Country Link
EP (1) EP2451196A1 (fr)

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2013545391A (ja) * 2010-11-05 2013-12-19 Thomson Licensing Data structure for higher-order Ambisonics audio data
WO2014012945A1 (fr) * 2012-07-16 2014-01-23 Thomson Licensing Method and device for rendering an audio sound field representation for audio playback
EP2733963A1 (fr) 2012-11-14 2014-05-21 Thomson Licensing Method and apparatus for facilitating listening to a sound signal for matrixed sound signals
WO2014124261A1 (fr) * 2013-02-08 2014-08-14 Qualcomm Incorporated Signalling audio rendering information in a bitstream
DE102013223201B3 (de) * 2013-11-14 2015-05-13 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Method and device for compressing and decompressing sound field data of an area
WO2015104166A1 (fr) * 2014-01-08 2015-07-16 Thomson Licensing Method and apparatus for improving the coding of side information required for coding a higher-order Ambisonics representation of a sound field
WO2015130765A1 (fr) * 2014-02-25 2015-09-03 Qualcomm Incorporated Order format signalling for higher-order ambisonic audio data
EP3002960A1 (fr) * 2014-10-04 2016-04-06 Patents Factory Ltd. Sp. z o.o. System and method for generating surround sound
US9483228B2 (en) 2013-08-26 2016-11-01 Dolby Laboratories Licensing Corporation Live engine
US9609452B2 (en) 2013-02-08 2017-03-28 Qualcomm Incorporated Obtaining sparseness information for higher order ambisonic audio renderers
CN106796794A (zh) * 2014-10-07 2017-05-31 Qualcomm Incorporated Normalization of ambient higher order ambisonic audio data
US9883310B2 (en) 2013-02-08 2018-01-30 Qualcomm Incorporated Obtaining symmetry information for higher order ambisonic audio renderers
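CN109448742A (zh) * 2012-12-12 2019-03-08 Dolby International AB Method and apparatus for compressing and decompressing a higher order ambisonics representation of a sound field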
US10334387B2 (en) 2015-06-25 2019-06-25 Dolby Laboratories Licensing Corporation Audio panning transformation system and method
US10356484B2 (en) 2013-03-15 2019-07-16 Samsung Electronics Co., Ltd. Data transmitting apparatus, data receiving apparatus, data transceiving system, method for transmitting data, and method for receiving data
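TWI666931B (zh) * 2013-03-15 2019-07-21 Samsung Electronics Co., Ltd. Data transmitting apparatus, data receiving apparatus, and data transceiving system
CN111460883A (zh) * 2020-01-22 2020-07-28 University of Electronic Science and Technology of China Method for automatic description of video behavior based on deep reinforcement learning
CN112216292A (zh) * 2014-06-27 2021-01-12 Dolby International AB Method and apparatus for decoding a compressed HOA sound representation of a sound or sound field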
US11234091B2 (en) 2012-05-14 2022-01-25 Dolby Laboratories Licensing Corporation Method and apparatus for compressing and decompressing a Higher Order Ambisonics signal representation
US11962990B2 (en) 2013-05-29 2024-04-16 Qualcomm Incorporated Reordering of foreground audio objects in the ambisonics domain

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004047485A1 (fr) 2002-11-21 2004-06-03 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audio playback system and method for playing back an audio signal
EP1936908A1 (fr) 2006-12-19 2008-06-25 Deutsche Thomson OHG Method, apparatus and data container for transferring high resolution audio/video data in a high speed IP network
WO2009106637A1 (fr) 2008-02-28 2009-09-03 Thomson Licensing Hardware-based parser for packet-oriented protocols

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
"Spherical harmonics", 28 June 2011 (2011-06-28), XP002646194, Retrieved from the Internet <URL:http://en.wikipedia.org/wiki/Spherical_harmonics> [retrieved on 20110628] *
J.DANIEL: "Spatial Sound Encoding Including Near Field Effect: Introducing Distance Coding Filters and a Viable, New Ambisonic Format", AES 23RD INTERNATIONAL CONFERENCE, vol. 23, 23 May 2003 (2003-05-23), XP002647040 *
J.DANIEL: "Spatial Sound Encoding Including Near Field Effect: Introducing Distance Coding Filters and a Viable, New Ambisonic Format", AES 23RD INTL. CONF., vol. 23, 25 May 2003 (2003-05-25)
M.A.POLETTI: "Three-Dimensional Surround Sound Systems Based on Spherical Harmonics", J. AUDIO ENG. SOC., vol. 53, no. 11, November 2005 (2005-11-01)
MICROSOFT, MULTIPLE CHANNEL AUDIO DATA AND WAVE FILES, 7 March 2007 (2007-03-07)

Cited By (74)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2013545391A (ja) * 2010-11-05 2013-12-19 Thomson Licensing Data structure for higher order ambisonics audio data
US9241216B2 (en) 2010-11-05 2016-01-19 Thomson Licensing Data structure for higher order ambisonics audio data
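TWI823073B (zh) * 2012-05-14 2023-11-21 Dolby International AB Method and apparatus for compressing and method and apparatus for decompressing a higher order ambisonics signal representation, and non-transitory computer-readable medium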
US11234091B2 (en) 2012-05-14 2022-01-25 Dolby Laboratories Licensing Corporation Method and apparatus for compressing and decompressing a Higher Order Ambisonics signal representation
US11792591B2 (en) 2012-05-14 2023-10-17 Dolby Laboratories Licensing Corporation Method and apparatus for compressing and decompressing a higher order Ambisonics signal representation
CN107071685A (zh) * 2012-07-16 2017-08-18 Dolby International AB Method and device for rendering an audio soundfield representation for audio playback
CN106658343A (zh) * 2012-07-16 2017-05-10 Dolby International AB Method and device for rendering an audio soundfield representation for audio playback
EP4284026A3 (fr) * 2012-07-16 2024-02-21 Dolby International AB Method and device for rendering an audio soundfield representation
WO2014012945A1 (fr) * 2012-07-16 2014-01-23 Thomson Licensing Method and device for rendering an audio soundfield representation for audio playback
US11743669B2 (en) 2012-07-16 2023-08-29 Dolby Laboratories Licensing Corporation Method and device for decoding a higher-order ambisonics (HOA) representation of an audio soundfield
US11451920B2 (en) 2012-07-16 2022-09-20 Dolby Laboratories Licensing Corporation Method and device for decoding a higher-order ambisonics (HOA) representation of an audio soundfield
CN106658343B (zh) * 2012-07-16 2018-10-19 Dolby International AB Method and device for rendering an audio soundfield representation for audio playback
EP4013072A1 (fr) * 2012-07-16 2022-06-15 Dolby International AB Method and device for rendering an audio soundfield representation
US10075799B2 (en) 2012-07-16 2018-09-11 Dolby Laboratories Licensing Corporation Method and device for rendering an audio soundfield representation
US10939220B2 (en) 2012-07-16 2021-03-02 Dolby Laboratories Licensing Corporation Method and device for decoding a higher-order ambisonics (HOA) representation of an audio soundfield
US10595145B2 (en) 2012-07-16 2020-03-17 Dolby Laboratories Licensing Corporation Method and device for decoding a higher-order ambisonics (HOA) representation of an audio soundfield
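CN104584588B (zh) * 2012-07-16 2017-03-29 Dolby International AB Method and device for rendering an audio soundfield representation for audio playback
CN106658342A (zh) * 2012-07-16 2017-05-10 Dolby International AB Method and device for rendering an audio soundfield representation for audio playback
CN104584588A (zh) * 2012-07-16 2015-04-29 Thomson Licensing Method and device for rendering an audio soundfield representation for audio playback
CN107071686B (zh) * 2012-07-16 2020-02-14 Dolby International AB Method and device for rendering an audio soundfield representation for audio playback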
US9712938B2 (en) 2012-07-16 2017-07-18 Dolby Laboratories Licensing Corporation Method and device rendering an audio soundfield representation for audio playback
US10306393B2 (en) 2012-07-16 2019-05-28 Dolby Laboratories Licensing Corporation Method and device for rendering an audio soundfield representation
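CN107071687A (zh) * 2012-07-16 2017-08-18 Dolby International AB Method and device for rendering an audio soundfield representation for audio playback
CN107071686A (zh) * 2012-07-16 2017-08-18 Dolby International AB Method and device for rendering an audio soundfield representation for audio playback
CN107071687B (zh) * 2012-07-16 2020-02-14 Dolby International AB Method and device for rendering an audio soundfield representation for audio playback
CN107071685B (zh) * 2012-07-16 2020-02-14 Dolby International AB Method and device for rendering an audio soundfield representation for audio playback
CN106658342B (zh) * 2012-07-16 2020-02-14 Dolby International AB Method and device for rendering an audio soundfield representation for audio playback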
US9961470B2 (en) 2012-07-16 2018-05-01 Dolby Laboratories Licensing Corporation Method and device for rendering an audio soundfield representation
US9723424B2 (en) 2012-11-14 2017-08-01 Dolby Laboratories Licensing Corporation Making available a sound signal for higher order ambisonics signals
EP2733963A1 (fr) 2012-11-14 2014-05-21 Thomson Licensing Method and apparatus for facilitating listening to a sound signal for matrixed sound signals
WO2014075934A1 (fr) 2012-11-14 2014-05-22 Thomson Licensing Making available a sound signal for higher order ambisonics signals
CN109448742A (zh) * 2012-12-12 2019-03-08 Dolby International AB Method and apparatus for compressing and decompressing a higher order ambisonics representation of a sound field
CN109448742B (zh) * 2012-12-12 2023-09-01 Dolby International AB Method and apparatus for compressing and decompressing a higher order ambisonics representation of a sound field
CN104981869B (zh) * 2013-02-08 2019-04-26 Qualcomm Incorporated Signaling audio rendering information in a bitstream
US9883310B2 (en) 2013-02-08 2018-01-30 Qualcomm Incorporated Obtaining symmetry information for higher order ambisonic audio renderers
US9609452B2 (en) 2013-02-08 2017-03-28 Qualcomm Incorporated Obtaining sparseness information for higher order ambisonic audio renderers
RU2661775C2 (ru) * 2013-02-08 2018-07-19 Qualcomm Incorporated Signaling audio rendering information in a bitstream
US9870778B2 (en) 2013-02-08 2018-01-16 Qualcomm Incorporated Obtaining sparseness information for higher order ambisonic audio renderers
WO2014124261A1 (fr) * 2013-02-08 2014-08-14 Qualcomm Incorporated Signaling audio rendering information in a bitstream
CN104981869A (zh) * 2013-02-08 2015-10-14 Qualcomm Incorporated Signaling audio rendering information in a bitstream
US10178489B2 (en) 2013-02-08 2019-01-08 Qualcomm Incorporated Signaling audio rendering information in a bitstream
TWI666931B (zh) * 2013-03-15 2019-07-21 Samsung Electronics Co., Ltd. Data transmitting apparatus, data receiving apparatus, and data transceiving system
US10356484B2 (en) 2013-03-15 2019-07-16 Samsung Electronics Co., Ltd. Data transmitting apparatus, data receiving apparatus, data transceiving system, method for transmitting data, and method for receiving data
US11962990B2 (en) 2013-05-29 2024-04-16 Qualcomm Incorporated Reordering of foreground audio objects in the ambisonics domain
US9483228B2 (en) 2013-08-26 2016-11-01 Dolby Laboratories Licensing Corporation Live engine
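DE102013223201B3 (de) * 2013-11-14 2015-05-13 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Method and device for compressing and decompressing sound field data of an area
WO2015071148A1 (fr) 2013-11-14 2015-05-21 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Method and device for compressing and decompressing sound field data of an area
CN111179955A (zh) * 2014-01-08 2020-05-19 Dolby International AB Method and apparatus for decoding a bitstream including an encoded HOA representation, and medium
CN111182443B (zh) * 2014-01-08 2021-10-22 Dolby International AB Method and apparatus for decoding a bitstream including an encoded HOA representation
CN111179951A (zh) * 2014-01-08 2020-05-19 Dolby International AB Method and apparatus for decoding a bitstream including an encoded HOA representation, and medium
CN111028849A (zh) * 2014-01-08 2020-04-17 Dolby International AB Method and apparatus for decoding a bitstream including an encoded HOA representation, and medium
CN111182443A (zh) * 2014-01-08 2020-05-19 Dolby International AB Method and apparatus for decoding a bitstream including an encoded HOA representation, and medium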
US10714112B2 (en) 2014-01-08 2020-07-14 Dolby Laboratories Licensing Corporation Method and apparatus for decoding a bitstream including encoded higher order Ambisonics representations
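CN111179951B (zh) * 2014-01-08 2024-03-01 Dolby International AB Method and apparatus for decoding a bitstream including an encoded HOA representation, and medium
CN111028849B (zh) * 2014-01-08 2024-03-01 Dolby International AB Method and apparatus for decoding a bitstream including an encoded HOA representation, and medium
CN111179955B (zh) * 2014-01-08 2024-04-09 Dolby International AB Method and apparatus for decoding a bitstream including an encoded HOA representation, and medium
EP3648102A1 (fr) * 2014-01-08 2020-05-06 Dolby International AB Method and apparatus for improving the coding of side information required for coding a higher order ambisonics representation of a sound field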
US11211078B2 (en) 2014-01-08 2021-12-28 Dolby Laboratories Licensing Corporation Method and apparatus for decoding a bitstream including encoded higher order ambisonics representations
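CN105981100A (zh) * 2014-01-08 2016-09-28 Dolby International AB Method and apparatus for improving the coding of side information required for coding a higher order ambisonics representation of a sound field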
US9990934B2 (en) 2014-01-08 2018-06-05 Dolby Laboratories Licensing Corporation Method and apparatus for improving the coding of side information required for coding a Higher Order Ambisonics representation of a sound field
US11869523B2 (en) 2014-01-08 2024-01-09 Dolby Laboratories Licensing Corporation Method and apparatus for decoding a bitstream including encoded higher order ambisonics representations
US10553233B2 (en) 2014-01-08 2020-02-04 Dolby Laboratories Licensing Corporation Method and apparatus for decoding a bitstream including encoded higher order ambisonics representations
US11488614B2 (en) 2014-01-08 2022-11-01 Dolby Laboratories Licensing Corporation Method and apparatus for decoding a bitstream including encoded Higher Order Ambisonics representations
EP4089675A1 (fr) * 2014-01-08 2022-11-16 Dolby International AB Method and apparatus for improving the coding of side information required for coding a higher order ambisonics representation of a sound field
US10424312B2 (en) 2014-01-08 2019-09-24 Dolby Laboratories Licensing Corporation Method and apparatus for decoding a bitstream including encoded higher order ambisonics representations
WO2015104166A1 (fr) * 2014-01-08 2015-07-16 Thomson Licensing Method and apparatus for improving the coding of side information required for coding a higher order ambisonics representation of a sound field
US10147437B2 (en) 2014-01-08 2018-12-04 Dolby Laboratories Licensing Corporation Method and apparatus for decoding a bitstream including encoding higher order ambisonics representations
WO2015130765A1 (fr) * 2014-02-25 2015-09-03 Qualcomm Incorporated Order format signaling for higher-order ambisonic audio data
CN112216292A (zh) * 2014-06-27 2021-01-12 Dolby International AB Method and apparatus for decoding a compressed HOA sound representation of a sound or sound field
EP3002960A1 (fr) * 2014-10-04 2016-04-06 Patents Factory Ltd. Sp. z o.o. System and method for generating surround sound
CN106796794A (zh) * 2014-10-07 2017-05-31 Qualcomm Incorporated Normalization of ambient higher order ambisonic audio data
US10334387B2 (en) 2015-06-25 2019-06-25 Dolby Laboratories Licensing Corporation Audio panning transformation system and method
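CN111460883B (zh) * 2020-01-22 2022-05-03 University of Electronic Science and Technology of China Method for automatic description of video behavior based on deep reinforcement learning
CN111460883A (zh) * 2020-01-22 2020-07-28 University of Electronic Science and Technology of China Method for automatic description of video behavior based on deep reinforcement learning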

Similar Documents

Publication Publication Date Title
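EP2451196A1 (fr) Method and apparatus for generating and for decoding sound field data including ambisonics sound field data of an order higher than three
EP3175446B1 (fr) Audio processing systems and methods
TWI476761B (zh) Audio encoding method and system for generating a unified bitstream decodable by decoders implementing different decoding protocols
EP3800898B1 (fr) Data processor and transport of user control data to audio decoders and renderers
JP4787442B2 (ja) System and method for providing interactive audio in a multi-channel audio environment
EP1949693B1 (fr) Method and apparatus for processing/transmitting a bitstream, and method and apparatus for receiving/processing a bitstream
EP1624448A2 (fr) Packet multiplexing of multi-channel audio data
CN111837182A (zh) Method and apparatus for generating or decoding a bitstream comprising immersive audio signals
JP6908168B2 (ja) Receiving device, receiving method, transmitting device, and transmitting method
CN107533846B (zh) Transmitting device, transmitting method, receiving device, and receiving method
WO2020152394A1 (fr) Audio representation and associated rendering
JP2021105735A (ja) Receiving device and receiving method
CN106375778B (zh) Method for transmitting a three-dimensional audio program bitstream conforming to digital cinema specifications
CN108206984B (zh) Codec for transmitting three-dimensional sound signals using multiple channels, and encoding/decoding method thereof
JP6699564B2 (ja) Transmitting device, transmitting method, receiving device, and receiving method
KR101531510B1 (ko) Receiving system and method of processing audio data
CN108206983B (zh) Encoder for three-dimensional sound signals compatible with existing audio/video systems, and method thereof
CN114448955B (zh) Digital audio network transmission method, apparatus, device, and storage medium
WO2021255327A1 (fr) Managing network jitter for multiple audio streams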

Legal Events

Date Code Title Description
PUAI Public reference made under Article 153(3) EPC to a published international application that has entered the European phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

AX Request for extension of the european patent

Extension state: BA ME

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20121110