US12327569B2 - Spatial parameter signalling - Google Patents
- Publication number
- US12327569B2 (application US17/270,354)
- Authority
- US
- United States
- Prior art keywords
- frequency bands
- parameter
- frequency band
- audio signal
- energy
- Prior art date
- Legal status
- Active, expires
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/0204—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using subband decomposition
Definitions
- The present application relates to apparatus and methods for spatial parameter signalling, but not exclusively for spatial parameter signalling within and between spatial audio encoders and decoders.
- Parametric spatial audio processing is a field of audio signal processing where the spatial aspect of the sound is described using a set of parameters.
- Typical parameters include the directions of the sound in frequency bands, and the ratios between the directional and non-directional parts of the captured sound in frequency bands.
- These parameters are known to describe well the perceptual spatial properties of the captured sound at the position of the microphone array.
- These parameters can accordingly be utilized in the synthesis of spatial sound, binaurally for headphones, for loudspeakers, or for other formats such as Ambisonics.
- an apparatus comprising means for: obtaining at least one audio signal; obtaining at least one parameter respectively for each of at least two frequency bands associated with the at least one audio signal; and selecting a frequency band of the at least two frequency bands based on comparing at least one further respective parameter for each of the at least two frequency bands wherein the at least one further respective parameter is determined from each of the at least two frequency bands; generating an output comprising a selection of the at least one parameter associated with the selected frequency band of the at least two frequency bands, such that the selection of the at least one parameter associated with the selected frequency band is configured to reduce a bitrate or size of the output and wherein the at least one parameter of the selected frequency band is configured to represent respective parameters of the at least two frequency bands.
- the means for obtaining at least one parameter respectively for each of at least two frequency bands associated with the at least one audio signal may be further for obtaining a direction and energy respectively for each of the at least two frequency bands associated with the at least one audio signal, and wherein the means for selecting a frequency band of the at least two frequency bands based on comparing at least one further respective parameter for each of the at least two frequency bands may be further for: determining a directional energy weight factor for each of the at least two frequency bands based on the direction and energy for each of the at least two frequency bands, wherein the directional energy weight factor is the at least one further respective parameter for each of the at least two frequency bands; determining a weight limit factor based on an averaged energy; comparing the directional energy weight factor for each of the at least two frequency bands to the weight limit factor; and selecting a highest frequency band where the directional energy weight factor is greater than the weight limit factor.
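The band-selection rule described above (a directional energy weight factor per band, compared against a weight limit derived from an averaged energy, then taking the highest qualifying band) can be sketched as follows. The cosine-based weighting, the `limit_scale` factor, and the fall-back band are hypothetical stand-ins for illustration, not the patent's actual formulas:

```python
import numpy as np

def select_frequency_band(azimuths_rad, energies, limit_scale=0.5):
    """Select the highest band whose directional energy weight exceeds
    a limit derived from the averaged energy (hypothetical sketch).

    azimuths_rad -- per-band direction estimates, in radians
    energies     -- per-band (normalized) energies
    """
    # Directional energy weight factor per band: here, the band energy
    # scaled by how front-facing its direction estimate is. This cosine
    # weighting is an invented stand-in for the patent's factor.
    weights = energies * np.abs(np.cos(azimuths_rad))
    # Weight limit factor based on an averaged energy.
    limit = limit_scale * np.mean(energies)
    # Highest band index whose weight factor exceeds the limit.
    candidates = np.nonzero(weights > limit)[0]
    # Fall back to the lowest band if no band qualifies (unspecified
    # in the source; chosen here for the sketch).
    return int(candidates[-1]) if candidates.size else 0
```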
- the energy may be a normalized energy.
- the means for selecting a frequency band of the at least two frequency bands based on comparing at least one further respective parameter for each of the at least two frequency bands may be further for selecting the highest frequency band of the at least two frequency bands.
- the means for obtaining respectively at least one parameter for at least two frequency bands associated with the at least one audio signal may be further for obtaining at least one of: a directional parameter; a distance parameter; an energy parameter; and an energy ratio parameter.
- the means for selecting a frequency band of the at least two frequency bands based on comparing at least one further respective parameter for each of the at least two frequency bands may be further for: saving the at least one parameter for one of the at least two frequency bands; and discarding any other of the at least one parameter for the at least two frequency bands, wherein the means for generating an output comprising a selection of the at least one parameter associated with the selected frequency band of the at least two frequency bands may be further for generating an output comprising the saved at least one parameter for one of the at least two frequency bands and not the discarded other of the at least one parameter for the at least two frequency bands.
- the means for selecting a frequency band of the at least two frequency bands based on comparing at least one further respective parameter for each of the at least two frequency bands may be further for: saving the at least one parameter for one of the at least two frequency bands; and determining a difference between any other of the at least one parameter for the at least two frequency bands and the at least one parameter for one of the at least two frequency bands, wherein the means for generating an output comprising a selection of the at least one parameter associated with the selected frequency band of the at least two frequency bands may be further for generating an output further comprising the difference between any other of the at least one parameter for the at least two frequency bands and the at least one parameter for one of the at least two frequency bands.
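The two output strategies above (keep only the selected band's parameter and discard the rest, or keep it plus per-band differences) amount to a simple delta-coding round trip. The sketch below uses hypothetical helper names and a dictionary payload purely for illustration:

```python
def encode_parameters(band_params, selected):
    """Keep the selected band's parameter; represent every other band
    as a difference from it (hypothetical delta coding)."""
    base = band_params[selected]
    deltas = [p - base for i, p in enumerate(band_params) if i != selected]
    return {"selected": selected, "base": base, "deltas": deltas}

def decode_parameters(payload, n_bands):
    """Rebuild all per-band parameters from the base value plus the
    signalled differences."""
    base, it = payload["base"], iter(payload["deltas"])
    return [base if i == payload["selected"] else base + next(it)
            for i in range(n_bands)]
```

Dropping the `deltas` list from the payload gives the discard variant, where every band is later replicated from the base value alone.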
- the means are further for generating at least one transport signal based on the at least one audio signal and wherein the means for generating an output comprising a selection of the at least one parameter associated with the selected frequency band of the at least two frequency bands may be further for generating a datastream for storing/transmission based on a combination of the at least one parameter and the at least one transport signal.
- the means for generating a datastream for storing/transmission based on a combination of the at least one parameter and the at least one transport signal may be further for: encoding the at least one transport signal; encoding the at least one parameter associated with the selected frequency band of the at least two frequency bands; and combining the encoded transport signal and the encoded at least one parameter associated with the selected frequency band of the at least two frequency bands.
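The datastream combination step might look like the following sketch. The byte layout (transport length, band index, one float32 parameter, then the transport payload) is an invented illustration, not the patent's actual bitstream format:

```python
import struct

def pack_datastream(transport_bytes, band_index, parameter):
    """Combine an encoded transport signal with the selected band's
    parameter into one datastream (hypothetical layout)."""
    header = struct.pack("<HBf", len(transport_bytes), band_index, parameter)
    return header + transport_bytes

def unpack_datastream(stream):
    """Split a packed datastream back into its parts."""
    n, band_index, parameter = struct.unpack_from("<HBf", stream)
    offset = struct.calcsize("<HBf")
    return stream[offset:offset + n], band_index, parameter
```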
- the means for generating at least one transport signal based on the at least one audio signal may be further for at least one of: downmixing the at least one audio signal; selecting at least one audio signal from the at least one audio signal, when the at least one audio signal comprises two or more audio signals; generating directional signals directed to different directions, when the at least one audio signal comprises first order ambisonic audio signals; generating cardioid signals directed to different directions, when the at least one audio signal comprises first order ambisonic audio signals; generating cardioid signals directed at opposite directions, when the at least one audio signal comprises first order ambisonic audio signals; and passing at least one transport audio signal, when the at least one audio signal comprises at least one transport audio signal.
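One of the listed options, deriving opposite-facing cardioid transport signals from a first-order ambisonic input, follows from the standard virtual-microphone relation cardioid = 0.5 * (W + X). The sketch below assumes the W (omni) and X (figure-of-eight) components share the same gain convention, which varies between ambisonic formats:

```python
import numpy as np

def foa_to_opposing_cardioids(w, x):
    """Derive two cardioid transport signals pointing in opposite
    directions along the X axis from first-order ambisonic components
    (assumes matching W/X gain normalization)."""
    left = 0.5 * (w + x)   # cardioid facing +X
    right = 0.5 * (w - x)  # cardioid facing -X
    return left, right
```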
- an apparatus comprising means for: obtaining at least one signal, the at least one signal comprising at least one parameter associated with a selected frequency band from at least two frequency bands and at least one transport signal; replicating, based on the at least one parameter for one of the at least two frequency bands and a transport signal, at least one parameter for at least one other of the at least two frequency bands; and synthesising at least two audio signals based on the at least one parameter associated with the selected frequency band from at least two frequency bands and at least one replicated parameter for the at least one other of the at least two frequency bands and the transport signal, wherein the at least two audio signals are configured to provide spatial audio reproduction.
- the means for obtaining at least one signal, the at least one signal comprising at least one parameter associated with a selected frequency band from at least two frequency bands may be further for obtaining at least one of: a directional parameter; a distance parameter; an energy parameter; and an energy ratio parameter.
- the means for replicating, based on the at least one parameter for one of the at least two frequency bands, at least one parameter for at least one other of the at least two frequency bands may be further for copying the at least one parameter for one of the at least two frequency bands as the at least one other of the at least two frequency bands.
- the at least one signal may further comprise at least one parameter associated with a difference between at least one other of the at least two frequency bands and the at least one parameter for one of the at least two frequency bands, wherein the means for replicating, based on the at least one parameter for one of the at least two frequency bands, at least one parameter for at least one other of the at least two frequency bands may be further for replicating the at least one parameter for at least one other of the at least two frequency bands based on a combination of the at least one parameter for one of the at least two frequency bands and the at least one parameter associated with a difference between at least one other of the at least two frequency bands and the at least one parameter for one of the at least two frequency bands.
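On the decoder side, the replication described above, copying the selected band's parameter to every other band and optionally adjusting each by a signalled difference, might look like this sketch (function and argument names are hypothetical):

```python
def replicate_parameters(base_param, selected, n_bands, diffs=None):
    """Rebuild per-band parameters at the decoder: copy the selected
    band's parameter to all bands, optionally combined with signalled
    per-band differences (hypothetical sketch)."""
    if diffs is None:
        # Plain replication: every band reuses the selected band's value.
        return [base_param] * n_bands
    it = iter(diffs)
    return [base_param if i == selected else base_param + next(it)
            for i in range(n_bands)]
```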
- a method comprising: obtaining at least one audio signal; obtaining at least one parameter respectively for each of at least two frequency bands associated with the at least one audio signal; and selecting a frequency band of the at least two frequency bands based on comparing at least one further respective parameter for each of the at least two frequency bands wherein the at least one further respective parameter is determined from each of the at least two frequency bands; generating an output comprising a selection of the at least one parameter associated with the selected frequency band of the at least two frequency bands, such that the selection of the at least one parameter associated with the selected frequency band is configured to reduce a bitrate or size of the output and wherein the at least one parameter of the selected frequency band is configured to represent respective parameters of the at least two frequency bands.
- the energy may be a normalized energy.
- Obtaining respectively at least one parameter for at least two frequency bands associated with the at least one audio signal may comprise obtaining at least one of: a directional parameter; a distance parameter; an energy parameter; and an energy ratio parameter.
- Selecting a frequency band of the at least two frequency bands based on comparing at least one further respective parameter for each of the at least two frequency bands may comprise: saving the at least one parameter for one of the at least two frequency bands; and discarding any other of the at least one parameter for the at least two frequency bands, wherein generating an output comprising a selection of the at least one parameter associated with the selected frequency band of the at least two frequency bands may comprise generating an output comprising the saved at least one parameter for one of the at least two frequency bands and not the discarded other of the at least one parameter for the at least two frequency bands.
- Selecting a frequency band of the at least two frequency bands based on comparing at least one further respective parameter for each of the at least two frequency bands may comprise: saving the at least one parameter for one of the at least two frequency bands; and determining a difference between any other of the at least one parameter for the at least two frequency bands and the at least one parameter for one of the at least two frequency bands, wherein generating an output comprising a selection of the at least one parameter associated with the selected frequency band of the at least two frequency bands may comprise generating an output further comprising the difference between any other of the at least one parameter for the at least two frequency bands and the at least one parameter for one of the at least two frequency bands.
- the method may further comprise generating at least one transport signal based on the at least one audio signal and wherein generating an output comprising a selection of the at least one parameter associated with the selected frequency band of the at least two frequency bands may comprise generating a datastream for storing/transmission based on a combination of the at least one parameter and the at least one transport signal.
- Generating a datastream for storing/transmission based on a combination of the at least one parameter and the at least one transport signal may further comprise: encoding the at least one transport signal; encoding the at least one parameter associated with the selected frequency band of the at least two frequency bands; and combining the encoded transport signal and the encoded at least one parameter associated with the selected frequency band of the at least two frequency bands.
- Generating at least one transport signal based on the at least one audio signal may further comprise at least one of: downmixing the at least one audio signal; selecting at least one audio signal from the at least one audio signal, when the at least one audio signal comprises two or more audio signals; generating directional signals directed to different directions, when the at least one audio signal comprises first order ambisonic audio signals; generating cardioid signals directed to different directions, when the at least one audio signal comprises first order ambisonic audio signals; generating cardioid signals directed at opposite directions, when the at least one audio signal comprises first order ambisonic audio signals; and passing at least one transport audio signal, when the at least one audio signal comprises at least one transport audio signal.
- a method comprising: obtaining at least one signal, the at least one signal comprising at least one parameter associated with a selected frequency band from at least two frequency bands and at least one transport signal; replicating, based on the at least one parameter for one of the at least two frequency bands and a transport signal, at least one parameter for at least one other of the at least two frequency bands; and synthesising at least two audio signals based on the at least one parameter associated with the selected frequency band from at least two frequency bands and at least one replicated parameter for the at least one other of the at least two frequency bands and the transport signal, wherein the at least two audio signals are configured to provide spatial audio reproduction.
- Obtaining at least one signal, the at least one signal comprising at least one parameter associated with a selected frequency band from at least two frequency bands may further comprise obtaining at least one of: a directional parameter; a distance parameter; an energy parameter; and an energy ratio parameter.
- Replicating, based on the at least one parameter for one of the at least two frequency bands, at least one parameter for at least one other of the at least two frequency bands may further comprise copying the at least one parameter for one of the at least two frequency bands as the at least one other of the at least two frequency bands.
- the at least one signal may further comprise at least one parameter associated with a difference between at least one other of the at least two frequency bands and the at least one parameter for one of the at least two frequency bands, wherein replicating, based on the at least one parameter for one of the at least two frequency bands, at least one parameter for at least one other of the at least two frequency bands may further comprise replicating the at least one parameter for at least one other of the at least two frequency bands based on a combination of the at least one parameter for one of the at least two frequency bands and the at least one parameter associated with a difference between at least one other of the at least two frequency bands and the at least one parameter for one of the at least two frequency bands.
- an apparatus comprising at least one processor and at least one memory including a computer program code, the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus at least to: obtain at least one audio signal; obtain at least one parameter respectively for each of at least two frequency bands associated with the at least one audio signal; and select a frequency band of the at least two frequency bands based on comparing at least one further respective parameter for each of the at least two frequency bands wherein the at least one further respective parameter is determined from each of the at least two frequency bands; generate an output comprising a selection of the at least one parameter associated with the selected frequency band of the at least two frequency bands, such that the selection of the at least one parameter associated with the selected frequency band is configured to reduce a bitrate or size of the output and wherein the at least one parameter of the selected frequency band is configured to represent respective parameters of the at least two frequency bands.
- the apparatus caused to obtain at least one parameter respectively for each of at least two frequency bands associated with the at least one audio signal may further be caused to obtain a direction and energy respectively for each of the at least two frequency bands associated with the at least one audio signal, and wherein the apparatus caused to select at least one frequency band of the at least two frequency bands based on comparing at least one further respective parameter for each of the at least two frequency bands may further be caused to: determine a directional energy weight factor for each of the at least two frequency bands based on the direction and energy for each of the at least two frequency bands, wherein the directional energy weight factor is the at least one further respective parameter for each of the at least two frequency bands; determine a weight limit factor based on an averaged energy; compare the directional energy weight factor for each of the at least two frequency bands to the weight limit factor; and select a highest frequency band where the directional energy weight factor is greater than the weight limit factor.
- the energy may be a normalized energy.
- the apparatus caused to select at least one frequency band of the at least two frequency bands based on comparing at least one further respective parameter for each of the at least two frequency bands may be further caused to select the highest frequency band of the at least two frequency bands.
- the apparatus caused to obtain respectively at least one parameter for at least two frequency bands associated with the at least one audio signal may be further caused to obtain at least one of: a directional parameter; a distance parameter; an energy parameter; and an energy ratio parameter.
- the apparatus caused to select at least one frequency band of the at least two frequency bands based on comparing at least one further respective parameter for each of the at least two frequency bands may be further caused to: save the at least one parameter for one of the at least two frequency bands; and discard any other of the at least one parameter for the at least two frequency bands, wherein the apparatus caused to generate an output comprising a selection of the at least one parameter associated with the selected frequency band of the at least two frequency bands may be further caused to generate an output comprising the saved at least one parameter for one of the at least two frequency bands and not the discarded other of the at least one parameter for the at least two frequency bands.
- the apparatus caused to select at least one frequency band of the at least two frequency bands based on comparing at least one further respective parameter for each of the at least two frequency bands may be further caused to: save the at least one parameter for one of the at least two frequency bands; and determine a difference between any other of the at least one parameter for the at least two frequency bands and the at least one parameter for one of the at least two frequency bands, wherein the apparatus caused to generate an output comprising a selection of the at least one parameter associated with the selected frequency band of the at least two frequency bands may be further caused to generate an output further comprising the difference between any other of the at least one parameter for the at least two frequency bands and the at least one parameter for one of the at least two frequency bands.
- the apparatus may be further caused to generate at least one transport signal based on the at least one audio signal and wherein the apparatus caused to generate an output comprising a selection of the at least one parameter associated with the selected frequency band of the at least two frequency bands may be further caused to generate a datastream for storing/transmission based on a combination of the at least one parameter and the at least one transport signal.
- the apparatus caused to generate a datastream for storing/transmission based on a combination of the at least one parameter and the at least one transport signal may be further caused to: encode the at least one transport signal; encode the at least one parameter associated with the selected frequency band of the at least two frequency bands; and combine the encoded transport signal and the encoded at least one parameter associated with the selected frequency band of the at least two frequency bands.
- the apparatus caused to generate at least one transport signal based on the at least one audio signal may be further caused to perform at least one of: downmix the at least one audio signal; select at least one audio signal from the at least one audio signal, when the at least one audio signal comprises two or more audio signals; generate directional signals directed to different directions, when the at least one audio signal comprises first order ambisonic audio signals; generate cardioid signals directed to different directions, when the at least one audio signal comprises first order ambisonic audio signals; generate cardioid signals directed at opposite directions, when the at least one audio signal comprises first order ambisonic audio signals; and pass at least one transport audio signal, when the at least one audio signal comprises at least one transport audio signal.
- an apparatus comprising at least one processor and at least one memory including a computer program code, the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus at least to: obtain at least one signal, the at least one signal comprising at least one parameter associated with a selected frequency band from at least two frequency bands and at least one transport signal; replicate, based on the at least one parameter for one of the at least two frequency bands and a transport signal, at least one parameter for at least one other of the at least two frequency bands; and synthesise at least two audio signals based on the at least one parameter associated with the selected frequency band from at least two frequency bands and at least one replicated parameter for the at least one other of the at least two frequency bands and the transport signal, wherein the at least two audio signals are configured to provide spatial audio reproduction.
- the apparatus caused to obtain at least one signal, the at least one signal comprising at least one parameter associated with a selected frequency band from at least two frequency bands may be further caused to obtain at least one of: a directional parameter; a distance parameter; an energy parameter; and an energy ratio parameter.
- the apparatus caused to replicate, based on the at least one parameter for one of the at least two frequency bands, at least one parameter for at least one other of the at least two frequency bands may be further caused to copy the at least one parameter for one of the at least two frequency bands as the at least one other of the at least two frequency bands.
- the at least one signal may further comprise at least one parameter associated with a difference between at least one other of the at least two frequency bands and the at least one parameter for one of the at least two frequency bands, wherein the apparatus caused to replicate, based on the at least one parameter for one of the at least two frequency bands, at least one parameter for at least one other of the at least two frequency bands may be further caused to replicate the at least one parameter for at least one other of the at least two frequency bands based on a combination of the at least one parameter for one of the at least two frequency bands and the at least one parameter associated with a difference between at least one other of the at least two frequency bands and the at least one parameter for one of the at least two frequency bands.
- a computer program comprising instructions [or a computer readable medium comprising program instructions] for causing an apparatus to perform at least the following: obtaining at least one signal, the at least one signal comprising at least one parameter associated with a selected frequency band from at least two frequency bands and at least one transport signal; replicating, based on the at least one parameter for one of the at least two frequency bands and a transport signal, at least one parameter for at least one other of the at least two frequency bands; and synthesising at least two audio signals based on the at least one parameter associated with the selected frequency band from at least two frequency bands and at least one replicated parameter for the at least one other of the at least two frequency bands and the transport signal, wherein the at least two audio signals are configured to provide spatial audio reproduction.
- a non-transitory computer readable medium comprising program instructions for causing an apparatus to perform at least the following: obtaining at least one audio signal; obtaining at least one parameter respectively for each of at least two frequency bands associated with the at least one audio signal; and selecting a frequency band of the at least two frequency bands based on comparing at least one further respective parameter for each of the at least two frequency bands wherein the at least one further respective parameter is determined from each of the at least two frequency bands; generating an output comprising a selection of the at least one parameter associated with the selected frequency band of the at least two frequency bands, such that the selection of the at least one parameter associated with the selected frequency band is configured to reduce a bitrate or size of the output and wherein the at least one parameter of the selected frequency band is configured to represent respective parameters of the at least two frequency bands.
- a non-transitory computer readable medium comprising program instructions for causing an apparatus to perform at least the following: obtaining at least one signal, the at least one signal comprising at least one parameter associated with a selected frequency band from at least two frequency bands and at least one transport signal; replicating, based on the at least one parameter for one of the at least two frequency bands and a transport signal, at least one parameter for at least one other of the at least two frequency bands; and synthesising at least two audio signals based on the at least one parameter associated with the selected frequency band from at least two frequency bands and at least one replicated parameter for the at least one other of the at least two frequency bands and the transport signal, wherein the at least two audio signals are configured to provide spatial audio reproduction.
- a computer readable medium comprising program instructions for causing an apparatus to perform at least the following: obtaining at least one audio signal; obtaining at least one parameter respectively for each of at least two frequency bands associated with the at least one audio signal; and selecting a frequency band of the at least two frequency bands based on comparing at least one further respective parameter for each of the at least two frequency bands wherein the at least one further respective parameter is determined from each of the at least two frequency bands; generating an output comprising a selection of the at least one parameter associated with the selected frequency band of the at least two frequency bands, such that the selection of the at least one parameter associated with the selected frequency band is configured to reduce a bitrate or size of the output and wherein the at least one parameter of the selected frequency band is configured to represent respective parameters of the at least two frequency bands.
- a computer readable medium comprising program instructions for causing an apparatus to perform at least the following: obtaining at least one signal, the at least one signal comprising at least one parameter associated with a selected frequency band from at least two frequency bands and at least one transport signal; replicating, based on the at least one parameter for one of the at least two frequency bands and a transport signal, at least one parameter for at least one other of the at least two frequency bands; and synthesising at least two audio signals based on the at least one parameter associated with the selected frequency band from at least two frequency bands and at least one replicated parameter for the at least one other of the at least two frequency bands and the transport signal, wherein the at least two audio signals are configured to provide spatial audio reproduction.
- In a fourteenth aspect there is provided a computer readable medium comprising program instructions for causing an apparatus to perform at least the following: obtaining at least one signal, the at least one signal comprising at least one parameter associated with a selected frequency band from at least two frequency bands and at least one transport signal; replicating, based on the at least one parameter for one of the at least two frequency bands and a transport signal, at least one parameter for at least one other of the at least two frequency bands; and synthesising at least two audio signals based on the at least one parameter associated with the selected frequency band from at least two frequency bands and at least one replicated parameter for the at least one other of the at least two frequency bands and the transport signal, wherein the at least two audio signals are configured to provide spatial audio reproduction.
- an apparatus comprising: audio signal obtaining circuitry configured to obtain at least one audio signal; parameter obtaining circuitry configured to obtain at least one parameter respectively for each of at least two frequency bands associated with the at least one audio signal; and selecting circuitry configured to select a frequency band of the at least two frequency bands based on comparing at least one further respective parameter for each of the at least two frequency bands wherein the at least one further respective parameter is determined from each of the at least two frequency bands; output generating circuitry configured to generate an output comprising a selection of the at least one parameter associated with the selected frequency band of the at least two frequency bands, such that the selection of the at least one parameter associated with the selected frequency band is configured to reduce a bitrate or size of the output and wherein the at least one parameter of the selected frequency band is configured to represent respective parameters of the at least two frequency bands.
- an apparatus comprising: signal obtaining circuitry configured to obtain at least one signal, the at least one signal comprising at least one parameter associated with a selected frequency band from at least two frequency bands and at least one transport signal; replicating circuitry configured to replicate, based on the at least one parameter for one of the at least two frequency bands and a transport signal, at least one parameter for at least one other of the at least two frequency bands; and synthesising circuitry configured to synthesise at least two audio signals based on the at least one parameter associated with the selected frequency band from at least two frequency bands and at least one replicated parameter for the at least one other of the at least two frequency bands and the transport signal, wherein the at least two audio signals are configured to provide spatial audio reproduction.
- An apparatus comprising means for performing the actions of the method as described above.
- An apparatus configured to perform the actions of the method as described above.
- a computer program comprising program instructions for causing a computer to perform the method as described above.
- a computer program product stored on a medium may cause an apparatus to perform the method as described herein.
- An electronic device may comprise apparatus as described herein.
- a chipset may comprise apparatus as described herein.
- Embodiments of the present application aim to address problems associated with the state of the art.
- FIG. 1 shows schematically a system of apparatus suitable for implementing some embodiments
- FIG. 2 shows a flow diagram of the operation of the system as shown in FIG. 1 according to some embodiments
- FIG. 3 shows schematically capture/encoding apparatus
- FIG. 4 shows a flow diagram of the operation of capture/encoding apparatus as shown in FIG. 3 ;
- FIG. 5 shows schematically capture/encoding apparatus according to some embodiments
- FIG. 6 shows a flow diagram of the operation of capture/encoding apparatus as shown in FIG. 5 according to some embodiments
- FIG. 7 shows a flow diagram of the operation of encoding apparatus encoding obtained transport signals and metadata according to some embodiments
- FIG. 8 shows a flow diagram of the band selection operation of capture/encoding apparatus as shown in FIG. 5 according to some embodiments.
- FIG. 9 shows schematically an example device suitable for implementing the apparatus shown.
- Apparatus has been designed to transmit a spatial audio modelling of a sound field using Q (which is typically 2) transport audio signals and spatial metadata.
- the transport audio signals are typically compressed with a suitable audio encoding scheme (for example advanced audio coding—AAC or enhanced voice services—EVS codecs).
- the spatial metadata may contain parameters such as Direction (for example azimuth, elevation) in time-frequency domain.
- Among the parameters which may be determined and signalled to a renderer or receiver are one or more direct-to-total energy ratios (in the time-frequency domain), which represent the distribution of energy between each specific direction and the total audio energy.
- Another parameter may be one (or more where practical) diffuse-to-total energy ratio (in the time-frequency domain), which represents the distribution of energy between the ambient or diffuse signal (i.e., a non-directional signal such as reverberation) and the total energy.
- the parametric spatial audio signals may be represented as Q channels+metadata. This format can be compressed in encoding to efficiently store it for later retrieval or transmit it over a suitable transmission channel. Various methods can be used depending on how the channels are configured and what the metadata contains.
- a common procedure is to define a constant bitrate budget for the whole bitstream that contains audio channels and the metadata. This bitrate budget can then be divided statically or adaptively (dynamically) between audio channels and metadata.
- a bitrate budget of 64 kb/s for 2-channels+metadata could be used in various ways.
- Using the full 64 kb/s for the 2 audio channels would offer very good quality for encoding the stereo signal (for example using an EVS codec), but in this example the metadata would not be transmitted.
- Using 56 kb/s for the audio and 8 kb/s for the metadata would usually provide a higher overall quality, as the difference in audio coding quality is not large but the signalled metadata can provide full 3D surround reproduction.
- Optimizing between these example modes may require listening experiments. However, previous experiments have shown that with such low bitrates offering more bitrate to the raw audio quality over multiple channels tends to offer better perceived quality.
- In terms of metadata bitrate budgeting, reducing the metadata bitrate such that the audio signal receives at least 90% of the total bitrate budget is believed to be a good target.
- the amount of metadata generated and therefore the amount of data defining spatial parameters is frequency band related.
- where B is the number of frequency bands (e.g., 5, 10, 20, or 30) and K is the number of bits per parameter, there may be, for example, 5 kb/s of metadata generated.
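As a rough illustration of how the raw metadata volume scales with B and K, the following sketch can be used; the frame rate and the number of parameters per band are assumptions for illustration, not values from the text:

```python
# Hedged sketch: raw spatial-metadata bitrate before entropy coding.
# params_per_band and frames_per_s are illustrative assumptions
# (e.g., azimuth, elevation and an energy ratio at a 20 ms frame rate).
def metadata_bitrate_bps(B, K, params_per_band=3, frames_per_s=50):
    """B frequency bands, K bits per parameter."""
    return B * K * params_per_band * frames_per_s

# Dropping from B bands to a single band divides the metadata rate by B.
full = metadata_bitrate_bps(B=30, K=8)
single = metadata_bitrate_bps(B=1, K=8)
print(full, single, full // single)  # the ratio equals B
```

This makes concrete why reducing the number of signalled bands is the dominant lever at low total bitrates.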
- the total target bitrate with audio can be as low as 14 kb/s, so the metadata would take a big portion of the bitrate budget even after entropy coding (which may reduce the bitrate to half of the generated total).
- attempts to reduce the generated metadata include reducing the bit accuracy per parameter or even removing less important parameters when the bitrate budget is low.
- Another approach is to reduce the number of frequency bands for metadata, for example generating just one parameter per timeframe and thus producing a reduction of generated metadata by a factor of B.
- One method for achieving this is to perform a wideband analysis (in other words assume only one frequency band for the full audible frequency range) and encode this wideband group.
- the concept as discussed in further detail in the embodiments herein implements an analysis system with multiple bands and then selects the best frequency band to represent the current time frame.
- the embodiments discussed herein therefore attempt to reduce the bitrate by selecting one frequency band from the analysed metadata to represent all frequency bands. This reduces bitrate usage by a factor of B (where B is the original number of frequency bands).
- the selection process in some embodiments may thus relate to audio encoding and decoding using a sound-field related parametrization (e.g., direction(s) and direct-to-total energy ratio(s) in frequency bands) where a solution is provided for automatically reducing the bitrate of the direction parameters by transmitting only one direction value for all frequency bands and where the transmitted one direction value is determined by:
- the directions and the direct-to-total energy ratios can be estimated using any suitable method (e.g., SPAC), and depends on the type of the audio signals (e.g., microphone-array, Ambisonics, multichannel audio signals).
- the normalized energy can be estimated as discussed in the embodiments herein in a suitable manner, for example by computing the sum of squares of the frequency-domain samples and dividing by the largest energy.
- the threshold value may in some embodiments be determined for example by multiplying the average normalized energy by a factor.
- all other parameters may be encoded using the same scheme. In other words transmitting only one parameter value for all frequency bands.
- the value to be transmitted can be selected using the same procedure.
- the decoding can be performed using any suitable method for example by using the same parameter value at all frequency bands.
- the selected frequency band, in encoding, can be used as a reference band, and a very low bitrate difference coding relative to it determined for the other bands.
- the system 171 is shown with an ‘analysis’ part 121 and a ‘synthesis’ part 131 .
- the ‘analysis’ part 121 is the part from receiving the input (multichannel loudspeaker, microphone array, ambisonics, or mobile device capture) audio signals 100 up to an encoding of the metadata and transport signal 102 which may be transmitted or stored 104 .
- the ‘synthesis’ part 131 may be the part from a decoding of the encoded metadata and transport signal 104 to the presentation of the synthesized signal (for example in multi-channel loudspeaker form 106 via loudspeakers 107 or binaural or ambisonic formats).
- the input to the system 171 and the ‘analysis’ part 121 is therefore audio signals 100 .
- These may be suitable input multichannel loudspeaker audio signals, microphone array audio signals, ambisonic audio signals, or mobile captured audio signals.
- the analysis processor is configured to pass the received input audio signals 100 unprocessed to an encoder in the same manner as the transport signals.
- the analysis processor 101 is configured to select one or more of the microphone audio signals and output the selection as the transport signals 104 .
- the analysis processor 101 is configured to apply any suitable encoding or quantization to the transport audio signals.
- the analysis processor 101 is also configured to analyse the input audio signals 100 to produce metadata associated with the input audio signals (and thus associated with the transport signals).
- the analysis processor 101 can, for example, be a computer (running suitable software stored on memory and on at least one processor), mobile device, or alternatively a specific device utilizing, for example, FPGAs or ASICs.
- the metadata may comprise, for each time-frequency analysis interval, at least one direction parameter and at least one energy ratio parameter.
- the at least one direction parameter and the at least one energy ratio parameter may in some embodiments be considered to be spatial audio parameters.
- the spatial audio parameters comprise parameters which aim to characterize the sound-field of the input audio signals.
- the parameters generated may differ from frequency band to frequency band and may be dependent on the transmission bit rate.
- in band X all of the parameters are generated and transmitted, whereas in band Y only one of the parameters is generated and transmitted, and furthermore in band Z any other number of parameters may be generated or transmitted.
- a practical example of this may be that for some frequency bands such as the highest band some of the parameters are not required for perceptual reasons.
- the received or retrieved data (stream) may be input to a synthesis processor 105 .
- the synthesis processor 105 may be configured to demultiplex the data (stream) to coded transport and metadata.
- the synthesis processor 105 may then decode any encoded streams in order to obtain the transport signals and the metadata.
- the desired perceptual properties of a sound field can be reproduced over headphones using the binaural reproduction methods as described herein.
- the perceptual properties of a sound field could be reproduced as an Ambisonic output signal, and these Ambisonic signals can be reproduced with Ambisonic decoding methods to provide for example a binaural output with the desired perceptual properties.
- the synthesis processor 105 can in some embodiments be a computer (running suitable software stored on memory and on at least one processor), mobile device, or alternatively a specific device utilizing, for example, FPGAs or ASICs.
- With respect to FIG. 2, an example flow diagram of the overview shown in FIG. 1 is shown.
- First the system (analysis part) is configured to receive input audio signals or suitable multichannel input as shown in FIG. 2 by step 201 .
- the system (analysis part) is configured to generate transport signal channels or transport signals (for example downmix/selection/beamforming based on the multichannel input audio signals) as shown in FIG. 2 by step 203.
- system (analysis part) is configured to analyse the audio signals to generate metadata: Directions; Energy ratios as shown in FIG. 2 by step 205 .
- the system may retrieve/receive the transport signals and metadata as shown in FIG. 2 by step 211 .
- the system is configured to extract the transport signals and metadata as shown in FIG. 2 by step 213.
- the system (synthesis part) is configured to synthesize output spatial audio signals (which, as discussed earlier, may be any suitable output format such as binaural, multi-channel loudspeaker or Ambisonics signals, depending on the use case) based on the extracted audio signals and metadata as shown in FIG. 2 by step 215.
- an example analysis processor 101 is shown where the input audio signal is provided from an audio source 301 which in this example is a spatial capture device configured to generate multichannel audio signals from multiple microphones.
- the multichannel audio signals in this example are passed to a transport (audio) signal generator 311 .
- the transport signal generator 311 is configured to generate the transport audio signals according to any of the options described previously.
- the transport signals may be downmixed from the input signals.
- the number of the transport audio signals may be any number and may be 2 or more or fewer than 2.
- the frequency band processor 305 is configured to generate spatial metadata outputs such as shown as the directions, direct-to-total energy ratios, and in some embodiments other types of energy ratios such as diffuse-to-total energy ratio(s) and remainder-to-total energy ratio(s).
- the implementation of the analysis may be any suitable implementation that produces the described metadata outputs.
- the frequency band processor 305 comprises a direction analyser 307 configured to generate the direction metadata and an energy ratio analyser 309 configured to generate the energy ratio metadata.
- the direction and energy ratio metadata for all of the analysed frequency bands may then be passed to a transmission/storage encoder 313 .
- the transmission/storage encoder 313 may be configured to combine and encode the transport signals, the directions, and the energy ratios to generate the data stream 102 .
- the transmission/storage encoder 313 may comprise a suitable transport signal compressor/encoder configured to compress the audio signals using a suitable codec (e.g., AAC or EVS).
- With respect to FIG. 4 is shown a flow diagram of the operation of the analysis processor.
- the first operation is one of receiving the (multichannel loudspeaker or other) audio signals as shown in FIG. 4 by step 401 .
- the audio signals are processed in some form to generate the transport audio signals as shown in FIG. 4 by step 403 .
- the following operation may be one of spatially analysing the (multichannel loudspeaker) signals in order to determine direction metadata as shown in FIG. 4 by step 405 .
- the energy ratios (for example the direct, diffuse and remainder energy ratios) are determined as shown in FIG. 4 by step 407 .
- the metadata and transport audio signals are processed (compressed/encoded). For example the number of the directions and ratios are furthermore controlled (and may be selected and/or combined).
- the processing of the metadata/transport audio signals is shown in FIG. 4 by step 409 .
- the processed transport audio signals and the metadata may then be furthermore be combined to generate a suitable data stream as shown in FIG. 4 by step 411 .
- With respect to FIG. 5 there is shown an example analysis processor 101 suitable for implementing some embodiments with additions over the example provided in FIG. 3.
- the example analysis processor 101 is shown again with the input audio signal provided from an audio source 301 which also in this example is a spatial capture device configured to generate multichannel audio signals from multiple microphones.
- capturing a spatial audio signal can be performed with any known capture device.
- For example, an Eigenmike or a Nokia 8 mobile phone is suitable.
- the multichannel (spatial) audio signal may be any format such as mixed content (e.g., a multichannel audio format such as 5.1) and Ambisonics content that may produce the relevant spatial audio parameters.
- the multichannel audio signals in this example are passed to a transport (audio) signal generator 311 .
- the transport signal generator 311 similar to the example in FIG. 3 is configured to generate the transport audio signals according to any of the options described previously.
- the transport signals may be downmixed from the input signals.
- the number of the transport audio signals may be any number and may be 2 or more or fewer than 2.
- the multichannel audio signals are also input to a time frequency transform 303 .
- the time frequency transform 303 may be configured to generate suitable time-frequency representations of the multichannel audio signals and pass these to a frequency band processor 505 .
- the frequency band processor 505 is configured to generate spatial metadata outputs such as shown as the directions, direct-to-total energy ratios, and in some embodiments other types of energy ratios such as diffuse-to-total energy ratio(s) and remainder-to-total energy ratio(s).
- the implementation of the analysis may be any suitable implementation that produces the described metadata outputs.
- the frequency band processor 505 comprises a direction analyser 307 configured to generate the direction metadata and an energy ratio analyser 309 configured to generate the energy ratio metadata.
- These may be determined by performing spatial analysis on the time-frequency transformed multichannel audio signal.
- An example of spatial analysis may be for example DirAC (Directional Audio Coding) spatial analysis.
- DirAC may estimate the directions and diffuseness ratios (equivalent information to a direct-to-total ratio parameter) from a first-order Ambisonic (FOA) signal, or its variant the B-format signal.
- $\mathrm{FOA}_i(t) = \begin{bmatrix} w_i(t) \\ x_i(t) \\ y_i(t) \\ z_i(t) \end{bmatrix}$
- DirAC estimates the intensity vector by
- $\mathbf{I}(k,n) = \mathrm{Re}\left\{ w(k,n)^{*} \begin{bmatrix} x(k,n) \\ y(k,n) \\ z(k,n) \end{bmatrix} \right\}$,
- the direction parameter is the opposite of the direction of the real part of the intensity vector.
- the intensity vector may be averaged over several time and/or frequency indices prior to the determination of the direction parameter.
- ⁇ ⁇ ( k , n ) 1 -
- Diffuseness is a ratio value that is 1 when the sound is fully ambient, and 0 when the sound is fully directional. Again, all parameters in the equation are typically averaged over time and/or frequency. The expectation operator E[ ] can be replaced with an average operator in practical systems.
- the diffuseness (and direction) parameters typically are determined in frequency bands combining several frequency bins k, for example, approximating the Bark frequency resolution.
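The estimation described above can be sketched in code as follows; this is an illustrative sketch, not the patented implementation, and the FOA channel conventions, averaging window, and angle conventions are assumptions:

```python
import numpy as np

# Hedged sketch of a DirAC-style estimate over one averaging window.
# w, x, y, z: complex STFT coefficients of an FOA/B-format signal with
# shape (K, N) -- K frequency bins in the band, N time frames.
def dirac_analysis(w, x, y, z):
    # Intensity vector per bin/frame: Re{ w(k,n)* [x, y, z]^T }
    I = np.real(np.conj(w)[None, :, :] * np.stack([x, y, z]))  # (3, K, N)
    I_mean = I.mean(axis=(1, 2))          # E[.] replaced by an average
    # Direction of arrival is the opposite of the mean intensity vector.
    azimuth = np.degrees(np.arctan2(-I_mean[1], -I_mean[0]))
    elevation = np.degrees(np.arctan2(-I_mean[2],
                                      np.hypot(I_mean[0], I_mean[1])))
    # Diffuseness: 1 - ||E[I]|| / E[||I||]; 1 = fully ambient, 0 = directional.
    mean_norm = np.linalg.norm(I, axis=0).mean()
    psi = 1.0 - np.linalg.norm(I_mean) / max(mean_norm, 1e-12)
    return azimuth, elevation, psi
```

With a fully coherent directional field the per-tile intensity vectors align and the diffuseness approaches 0; with mutually independent channels it approaches 1, matching the ratio interpretation above.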
- DirAC is only one of the options to determine the directional and ratio metadata, and clearly one may utilize other methods to determine the metadata, for example, using a spatial audio capture (SPAC) algorithm with microphone-array signals (real or simulated).
- SPAC spatial audio capture
- There are many variants of DirAC analysis in the literature. For example, where the input content is not FOA, a suitable modification can be done to convert the signal into FOA format to perform the analysis. Other analysis methods are also applicable as long as they produce the directional and energy ratio metadata.
- the direction and energy ratio metadata for all of the analysed frequency bands may then be passed to a metadata selector 521 .
- the output of the energy ratio analyser 309 is output to a weight factor determiner 517 .
- the frequency band processor 505 comprises a normalised energy determiner 515 configured to generate a normalised energy determination and pass this to a weight factor determiner 517 and to a weight limit determiner 519 .
- the normalised energy determination may be performed as a two step operation.
- a first step being to calculate the average energy for each frequency band in this time instant for example with the following equation:
- N is the number of time samples in this time frame
- K b and K t are the current frequency band bottom and top frequency bins
- I is the number of input channels in the signal.
- S(i,k,n) is the time-frequency domain representation of the transport signal.
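From the symbol definitions above, the band-energy computation may plausibly take the following form (a reconstruction, not the document's exact display; any constant averaging factor is immaterial, since the subsequent step normalises by the largest band energy):

```latex
E(b) = \frac{1}{N}\sum_{i=1}^{I}\ \sum_{k=K_b}^{K_t}\ \sum_{n=1}^{N} \left| S(i,k,n) \right|^{2}
```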
- the second step may be to normalize the average energies of each frequency band: the largest energy of any frequency band is found and all energies are divided by it, so that the largest normalised band energy is (always) 1 and the other frequency bands have less energy, or represented as an equation: $E_{norm}(b) = E(b) / \max_{b'} E(b')$.
- any suitable alternative normalization method may be employed (e.g., normalizing with the total energy instead of the largest energy), provided the limit parameter (as discussed hereafter) is appropriately tuned.
- unnormalized energy may be employed but the limit parameter requires even more careful tuning.
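The two-step computation can be sketched as follows; this is a minimal sketch assuming the transport signal is held as a NumPy array indexed as S(i, k, n), with inclusive per-band bin limits:

```python
import numpy as np

# Hedged sketch of the two-step normalised band energy.
# S: complex array of shape (I, K, N) -- channels x frequency bins x
# time samples.  band_edges[b] = (K_b, K_t) gives inclusive bin limits.
def normalised_band_energies(S, band_edges):
    E = np.array([
        np.sum(np.abs(S[:, kb:kt + 1, :]) ** 2)  # sum over i, k, n
        for (kb, kt) in band_edges
    ])
    # Divide by the largest band energy (assumes a non-silent frame),
    # so the largest normalised band energy is exactly 1.
    return E / E.max()
```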
- the frequency band processor 505 in some embodiments further comprises a weight factor determiner 517 configured to receive the normalised energy and the energy ratios and determine at least one weighting factor which is output to the metadata selector 521 .
- the weight factor may be determined based on the product of the energy ratio and the normalized energy in the frequency band, i.e., $w(b) = r(b)\,E_{norm}(b)$.
- This weight factor is a number between 0 and 1. It will be a very high value when there is a directional impulsive onset present in the scene as both energy ratio and normalized energy will be high. Likewise, if there is no onset present, these values tend to be lower for higher frequencies.
- the use of the product ensures that, for example, high normalized energy but low energy ratio (i.e., loud reverberation) does not produce high weight values as the direction and the metadata in this case is not the best representative.
- this weight factor can be any other suitable weight factor such as only the energy ratio parameter r.
- the analysis processor 101 in some embodiments comprises a weight limit determiner 519 configured to receive the normalised energy determination and output a weight limit value to the metadata selector 521 .
- the weight limit can be a constant value (e.g., 0.5) or it can be based on the average normalized energy of all frequency bands in the time frame (e.g., average normalized energy multiplied with a constant like 0.5).
- the latter option is preferred and is formed as $w_{thr} = \frac{c}{B} \sum_{b=1}^{B} E_{norm}(b)$,
- where c is a tuned threshold constant such as 0.5 and B is the total number of frequency bands.
- this weight limit can be any other suitable value.
- the analysis processor 101 in some embodiments comprises a metadata selector 521 configured to receive the output of the direction analyser 307 (direction metadata for each band), energy ratio analyser 309 (energy ratio metadata for each band), weight factor determiner 517 (weight factors) and weight limit determiner 519 .
- the metadata selector 521 is then configured to select one of the directions and energy ratios based on the weight factor and weight factor limit and pass the selected metadata to a transmission/storage encoder 513 .
- the metadata selector may be configured to choose or select the highest frequency band that has a weight factor over the weight limit. If for some reason no band has weight over the limit, the metadata selector in some embodiments is configured to select the lowest frequency band.
- once the metadata selector determines the selected frequency band, it may be configured to discard metadata associated with the other bands.
- the metadata selector is configured to prioritize and only discard part of the metadata. For example, in some embodiments the direction information for the other bands are discarded but the energy ratio parameters are kept for all frequency bands.
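The baseline selection rule (the highest band whose weight exceeds the limit, with the lowest band as the fallback) can be sketched as follows; the function name and 0-based indexing are illustrative assumptions:

```python
import numpy as np

# Hedged sketch of the band selection: weight = energy ratio x normalised
# energy per band; limit = c x average normalised energy; pick the highest
# band whose weight exceeds the limit, else fall back to the lowest band.
def select_band(energy_ratios, norm_energies, c=0.5):
    w = np.asarray(energy_ratios) * np.asarray(norm_energies)
    w_thr = c * np.mean(norm_energies)
    over = np.flatnonzero(w > w_thr)
    return int(over[-1]) if over.size else 0  # 0-based band index
```

Only the parameters of the returned band would then be encoded, giving the factor-of-B metadata reduction discussed earlier.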
- two or more frequency bands are selected to represent the other frequency bands.
- two frequency bands can be selected such that two (or N where N is less than the total number of frequency bands) highest frequency bands with weights over the threshold (or weight limit) are selected.
- the parameters associated with the selected higher frequency band are then used to represent parameters for frequency bands above it, the parameters associated with the lower frequency band are used to represent parameters for frequency bands below it, and both are used to represent frequency bands between them.
- the ‘best’ frequency band is selected but a difference coding technique is employed to represent the other frequency bands.
- a few bits are used to signal which frequency band is the reference band for the difference coding. Using this method still significantly reduces the bitrate but offers more accurate representation.
- the highest frequency band is selected and the metadata associated with the highest frequency band is used to ‘represent’ all frequency bands. This is less optimal in quality but is computationally more efficient to implement.
- the analysis processor 101 may further comprise a transmission/storage encoder 513 .
- the transmission/storage encoder 513 may be configured to combine and encode the transport signals, the selected direction, and the energy ratio to generate the data stream 102 .
- the transmission/storage encoder 513 may comprise a suitable transport signal compressor/encoder configured to compress the audio signals using a suitable codec (e.g., AAC or EVS) and encoding metadata using entropy coding methods (e.g., codebook coding).
- With respect to FIG. 6 is shown a flow diagram of the operation of the analysis processor shown in FIG. 5 (and additionally the synthesis processor shown in FIG. 1).
- the first operation is one of obtaining the (multichannel loudspeaker or other) audio signals as shown in FIG. 6 by step 601 .
- the audio signals may be processed by the application of a time-frequency transform as shown in FIG. 6 by step 603 .
- the time-frequency domain audio signals are processed in some form to generate the transport signals as shown in FIG. 6 by step 617 .
- time-frequency domain audio signals are processed and spatial analysis performed to determine parameters such as direction(s) (and/or distance) and energy ratio(s) for each band as shown in FIG. 6 by step 607 .
- time-frequency domain audio signals are processed and a normalised energy per band calculated as shown in FIG. 6 by step 605 .
- the weight factor per band is formed or determined as shown in FIG. 6 by step 609 .
- the weight factor limit is formed or determined as shown in FIG. 6 by step 611 .
- a highest band with a weight over the limit is chosen as shown in FIG. 6 by step 613 .
- the other metadata is then discarded and the chosen band metadata saved as shown in FIG. 6 by step 615 .
- the selected metadata and transport signals are then compressed/encoded (and combined) before being stored and/or transmitted as shown in FIG. 6 by step 619 .
- the transmitted/retrieved signal is decoded and metadata replicated for all frequency bands as shown in FIG. 6 by step 621 .
- a suitable spatial synthesis is performed as shown in FIG. 6 by step 623.
- the audio signal input format may be any suitable format.
- With respect to FIG. 7 is shown a flow diagram of the operation of an encoder suitable for encoding an obtained transport audio signal and metadata.
- the frequency band processor may comprise only the normalised energy determiner and weight factor determiner as the direction and energy ratios have been determined.
- the first operation is one of obtaining the transport audio signals and metadata as shown in FIG. 7 by step 701 .
- the parameters such as direction(s) (and/or distance) and energy ratio(s) for each band have been obtained and a normalised energy per band calculated as shown in FIG. 7 by step 705 .
- the weight factor per band is formed or determined as shown in FIG. 7 by step 709 .
- the weight factor limit is formed or determined as shown in FIG. 7 by step 711 .
- a highest band with a weight over the limit is chosen as shown in FIG. 7 by step 713 .
- the other metadata is then discarded and the chosen band metadata saved as shown in FIG. 7 by step 715 .
- the selected metadata and transport signals are then compressed/encoded (and combined) before being stored and/or transmitted as shown in FIG. 7 by step 719 .
- the transmitted/retrieved signal is decoded and metadata replicated for all frequency bands as shown in FIG. 7 by step 721 .
- a suitable spatial synthesis is performed as shown in FIG. 7 by step 723.
- the first operation is to start and receive the inputs such as weight factors, weight limits, and parameters as shown in FIG. 8 by step 801 .
- the next operation is testing the indexed weight factor w_i against the weight limit w_thr as shown in FIG. 8 by step 803.
- the next operation is determining that i is the selected frequency band as shown in FIG. 8 by step 809 and then ending the operation as shown in FIG. 8 by step 813.
- frequency band indexing starts from 1.
- the above can be modified to accommodate any other indexing system (such as starting from 0).
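The band search of FIG. 8 can be sketched as follows (a minimal illustration in Python, not taken from the patent; 1-based indexing is used as described above, and the fallback when no band exceeds the limit is an assumption):

```python
def select_band(weights, w_thr):
    """Return the highest 1-based frequency band index whose weight
    factor exceeds the limit, scanning from the highest band down
    (in the manner of steps 803-813 of FIG. 8)."""
    for i in range(len(weights), 0, -1):
        if weights[i - 1] > w_thr:
            return i
    return 1  # assumption: fall back to the lowest band if none qualifies
```

The loop decrements the index exactly as the flow chart tests each band in turn, so the first hit is guaranteed to be the highest qualifying band.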
- the single band metadata values may be obtained and then replicated for all frequency bands. This results in a normal full set of metadata that can be used in further synthesis.
- the synthesis operation may then use the transport signals and replicated metadata to generate a suitable rendering of the audio signals.
- This procedure can be performed using any suitable means, for example, with methods such as DirAC based spatial audio signal synthesis.
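The decoder-side replication described above can be sketched as follows (illustrative Python; the metadata field names are assumptions, not values given in the text):

```python
def replicate_metadata(band_meta, num_bands):
    """Copy the single transmitted band's spatial metadata to every
    frequency band, yielding a normal full metadata set for synthesis.
    The field names (e.g. azimuth, ratio) are illustrative."""
    return [dict(band_meta) for _ in range(num_bands)]
```

Each band receives its own copy, so later synthesis stages may modify one band's metadata without affecting the others.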
- An example procedure for synthesising audio signals for loudspeakers is that the directional sound is synthesized to specific directions using 3D panning techniques such as vector-base amplitude panning (VBAP) multiplied with √r, and the non-directional ambient signal is decorrelated with a phase-scrambling filter and reproduced to all directions multiplied with √((1−r)/C), where r is the energy ratio parameter and C is the number of loudspeaker channels.
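The two scaling factors in this example can be computed as follows (a sketch assuming the ambient factor √((1−r)/C), which is consistent with the r and C definitions given; the function name is not from the patent):

```python
import math

def synthesis_gains(r, num_channels):
    """Scaling factors for the example loudspeaker synthesis: the
    VBAP-panned directional part is scaled by sqrt(r); the decorrelated
    ambient part fed to all C channels is scaled by sqrt((1 - r) / C),
    so the direct and ambient energies sum to the band energy."""
    direct = math.sqrt(r)
    ambient = math.sqrt((1.0 - r) / num_channels)
    return direct, ambient
```

With this choice, direct² + C·ambient² = r + (1 − r) = 1, i.e. the total reproduced energy matches the analysed band energy.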
- Such embodiments may be able to produce a signal which is at least as good as, or better than, one produced using a single wideband parametric analysis.
- the implementation is a computationally efficient method of reducing the bitrate, as it only requires determining the energies (often already part of the analysis) and the weight factors, and then discarding data.
- spatial sound transmission and storage can be achieved even at very low bitrates.
- a teleconference system may use parametric spatial audio, e.g., DirAC, as the main analysis and synthesis method.
- Spatial capture may be obtained with an Eigenmike that produces first-order Ambisonics for this use.
- the spatial audio is analysed in the time-frequency domain (20 ms frames and 30 frequency bands), producing direction parameters as azimuth and elevation, and an energy ratio parameter in the form of diffuseness.
- the application of some embodiments may result in a bitrate of just 1.2 kb/s for the metadata (before other compression). This leaves more bits to use for the coding of the audio signal which directly results in better perceived audio quality.
- a further example, using a time-frequency resolution of 10 ms time frames and 12 frequency bands, would result in the following comparison bitrates: 24 kb/s for full metadata compared to 2.4 kb/s according to some embodiments.
- bitrate budget is very low.
- 24 kb/s is usually in the domain of mono downmix or very compressed stereo if only raw audio encoding is used.
- if spatial metadata is introduced using, for example, the second time-frequency resolution above, the full spatial metadata would be hard to fit into the bitrate budget even after an expected 50% entropy coding gain (the metadata would take 12 kb/s of the 24 kb/s available).
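The bitrate comparison above can be reproduced with a simple calculation (a sketch; the 20-bits-per-band allocation and the 4-bit band index are assumptions chosen to match the quoted figures, not values stated in the text):

```python
def metadata_bitrate_kbps(bands, bits_per_band, frame_ms, index_bits=0):
    """Metadata bitrate in kb/s: one parameter set per band per frame,
    plus an optional band-index field per frame."""
    frames_per_second = 1000.0 / frame_ms
    return (bands * bits_per_band + index_bits) * frames_per_second / 1000.0

# Full metadata: 12 bands at an assumed 20 bits/band, 10 ms frames
full = metadata_bitrate_kbps(12, 20, 10)                 # 24.0 kb/s
# One selected band plus an assumed 4-bit band index
single = metadata_bitrate_kbps(1, 20, 10, index_bits=4)  # 2.4 kb/s
```

The tenfold reduction follows directly from sending one band's parameters per frame instead of twelve.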
- the device may be any suitable electronics device or apparatus.
- the device 1900 is a mobile device, user equipment, tablet computer, computer, audio playback apparatus, etc.
- the device 1900 comprises at least one processor or central processing unit 1907 .
- the processor 1907 can be configured to execute various program codes such as the methods described herein.
- the device 1900 comprises a memory 1911 .
- the at least one processor 1907 is coupled to the memory 1911 .
- the memory 1911 can be any suitable storage means.
- the memory 1911 comprises a program code section for storing program codes implementable upon the processor 1907 .
- the memory 1911 can further comprise a stored data section for storing data, for example data that has been processed or to be processed in accordance with the embodiments as described herein. The implemented program code stored within the program code section and the data stored within the stored data section can be retrieved by the processor 1907 whenever needed via the memory-processor coupling.
- the device 1900 comprises a user interface 1905 .
- the user interface 1905 can be coupled in some embodiments to the processor 1907 .
- the processor 1907 can control the operation of the user interface 1905 and receive inputs from the user interface 1905 .
- the user interface 1905 can enable a user to input commands to the device 1900 , for example via a keypad.
- the user interface 1905 can enable the user to obtain information from the device 1900 .
- the user interface 1905 may comprise a display configured to display information from the device 1900 to the user.
- the user interface 1905 can in some embodiments comprise a touch screen or touch interface capable of both enabling information to be entered to the device 1900 and further displaying information to the user of the device 1900 .
- the device 1900 comprises an input/output port 1909 .
- the input/output port 1909 in some embodiments comprises a transceiver.
- the transceiver in such embodiments can be coupled to the processor 1907 and configured to enable a communication with other apparatus or electronic devices, for example via a wireless communications network.
- the transceiver or any suitable transceiver or transmitter and/or receiver means can in some embodiments be configured to communicate with other electronic devices or apparatus via a wire or wired coupling.
- the transceiver can communicate with further apparatus by any suitable known communications protocol.
- the transceiver or transceiver means can use a suitable universal mobile telecommunications system (UMTS) protocol, a wireless local area network (WLAN) protocol such as for example IEEE 802.X, a suitable short-range radio frequency communication protocol such as Bluetooth, or infrared data communication pathway (IRDA).
- the transceiver input/output port 1909 may be configured to receive the loudspeaker signals (or other input format audio signals) and in some embodiments determine the parameters as described herein by using the processor 1907 executing suitable code. Furthermore the device may generate a suitable transport signal and parameter output to be transmitted to the synthesis device.
- the device 1900 may be employed as at least part of the synthesis device.
- the input/output port 1909 may be configured to receive the transport signals and in some embodiments the parameters determined at the capture device or processing device as described herein, and generate a suitable audio signal format output by using the processor 1907 executing suitable code.
- the input/output port 1909 may be coupled to any suitable audio output for example to a multichannel speaker system and/or headphones or similar.
- the various embodiments of the invention may be implemented in hardware or special purpose circuits, software, logic or any combination thereof.
- some aspects may be implemented in hardware, while other aspects may be implemented in firmware or software which may be executed by a controller, microprocessor or other computing device, although the invention is not limited thereto.
- While various aspects of the invention may be illustrated and described as block diagrams, flow charts, or using some other pictorial representation, it is well understood that these blocks, apparatus, systems, techniques or methods described herein may be implemented in, as non-limiting examples, hardware, software, firmware, special purpose circuits or logic, general purpose hardware or controller or other computing devices, or some combination thereof.
- the embodiments of this invention may be implemented by computer software executable by a data processor of the mobile device, such as in the processor entity, or by hardware, or by a combination of software and hardware.
- any blocks of the logic flow as in the Figures may represent program steps, or interconnected logic circuits, blocks and functions, or a combination of program steps and logic circuits, blocks and functions.
- the software may be stored on such physical media as memory chips, or memory blocks implemented within the processor, magnetic media such as hard disks or floppy disks, and optical media such as, for example, DVD, the data variants thereof, and CD.
- the memory may be of any type suitable to the local technical environment and may be implemented using any suitable data storage technology, such as semiconductor-based memory devices, magnetic memory devices and systems, optical memory devices and systems, fixed memory and removable memory.
- the data processors may be of any type suitable to the local technical environment, and may include one or more of general purpose computers, special purpose computers, microprocessors, digital signal processors (DSPs), application specific integrated circuits (ASIC), gate level circuits and processors based on multi-core processor architecture, as non-limiting examples.
- Embodiments of the inventions may be practiced in various components such as integrated circuit modules.
- the design of integrated circuits is by and large a highly automated process.
- Complex and powerful software tools are available for converting a logic level design into a semiconductor circuit design ready to be etched and formed on a semiconductor substrate.
- Programs such as those provided by Synopsys, Inc. of Mountain View, California and Cadence Design, of San Jose, California automatically route conductors and locate components on a semiconductor chip using well established rules of design as well as libraries of pre-stored design modules.
- the resultant design in a standardized electronic format (e.g., Opus, GDSII, or the like) may be transmitted to a semiconductor fabrication facility or “fab” for fabrication.
Abstract
Description
- One channel audio 16 kb/s
- One channel audio 15 kb/s+1 kb/s metadata
- One channel audio 11 kb/s+5 kb/s metadata
- Two channel audio 16 kb/s
- Two channel audio 15 kb/s+1 kb/s metadata
- Two channel audio 11 kb/s+5 kb/s metadata
- require only a single analysis system for different bitrates (rather than one-band analysis for low bitrates and multiple-band analysis for high bitrates); and
- improve the sound scene time-frequency resolution in a practical manner suitable for the human hearing range.
- obtaining audio signals;
- determining (spatial parameters) directions and direct-to-total energy ratios in frequency bands;
- determining normalized energy in frequency bands;
- determining directional energy weight factor (e.g., energy multiplied by direct-to-total energy ratio);
- determining the highest frequency band with directional energy weight factor above a threshold;
- encoding/storing/transmitting only the direction of the determined band.
- Obtain multichannel audio signals (for example Capture spatial audio signals);
- Apply time-frequency transform to the multichannel audio signals;
- Perform spatial analysis for the transformed signal;
- Calculate normalized energy for each frequency band for the transformed signal;
- Calculate frequency band weight factor for each band (energy multiplied with energy ratio) for the transformed signal;
- Choose or select a highest band that has a weight factor over defined limit (e.g., 0.5);
- Discard other metadata and save only the metadata for the chosen frequency band;
- Create transport signals;
- Encode and transmit/store transport signals and metadata.
- With respect to the synthesis apparatus it is then configured to:
- Obtain (receive/retrieve) the transmitted/stored transport signals and metadata; replicate the selected/chosen metadata to all frequency bands; and
- Synthesize output using transport signals and replicated metadata.
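The encoder-side steps above can be sketched end to end as follows (illustrative Python, not the patent's implementation; 0-based band indexing is used here, the field names are assumptions, and the 0.5 limit is the example value given above):

```python
def encode_selection(energies, ratios, directions, w_limit=0.5):
    """Normalise the band energies, form weight factors w = r * E_norm,
    select the highest band whose weight exceeds the limit, and keep
    only that band's metadata, discarding the rest."""
    total = sum(energies) or 1.0
    e_norm = [e / total for e in energies]
    weights = [r * e for r, e in zip(ratios, e_norm)]
    chosen = 0  # assumption: default to the lowest band if none qualifies
    for i in range(len(weights) - 1, -1, -1):
        if weights[i] > w_limit:
            chosen = i
            break
    return {"band": chosen,
            "direction": directions[chosen],
            "ratio": ratios[chosen]}
```

The returned single-band metadata is what would be compressed and transmitted alongside the transport signals.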
w = r · E_norm
- Direction may be coded separately for azimuth and elevation
- Azimuth has 2 bits and represents offsets of 0°, 90°, 180°, or 270° from the chosen band azimuth
- Elevation has 2 bits and represents offsets of 0°, 45°, and −45° (one value not used)
- Each ratio parameter has 2 bits and represents offsets of 0, 0.25, −0.25, −0.5
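One plausible reading of the 2-bit offset coding above is a nearest-offset quantiser relative to the chosen band's values (a sketch; the actual mapping from parameter values to codes is not specified in the text):

```python
AZIMUTH_OFFSETS = (0.0, 90.0, 180.0, 270.0)  # 2-bit azimuth offsets
RATIO_OFFSETS = (0.0, 0.25, -0.25, -0.5)     # 2-bit energy-ratio offsets

def quantize_offset(value, reference, offsets, wrap=None):
    """Return the 2-bit code whose offset from the chosen band's
    reference value best matches this band's value; `wrap` handles
    circular quantities such as azimuth (360 degrees)."""
    def dist(a, b):
        d = abs(a - b)
        if wrap:
            d %= wrap
            d = min(d, wrap - d)
        return d
    return min(range(len(offsets)),
               key=lambda k: dist(value, reference + offsets[k]))
```

For azimuth the circular distance matters: a band at 350° is closest to a 0° offset from a 10° reference, not a 270° one.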
where r is the energy ratio parameter and C is the number of loudspeaker channels.
Claims (20)
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| GB1814227.3A GB2576769A (en) | 2018-08-31 | 2018-08-31 | Spatial parameter signalling |
| EP1814227.3 | 2018-08-31 | ||
| GB1814227 | 2018-08-31 | ||
| PCT/FI2019/050581 WO2020043935A1 (en) | 2018-08-31 | 2019-08-08 | Spatial parameter signalling |
Related Parent Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/FI2019/050581 A-371-Of-International WO2020043935A1 (en) | 2018-08-31 | 2019-08-08 | Spatial parameter signalling |
Related Child Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US19/196,641 Continuation US20250259636A1 (en) | 2018-08-31 | 2025-05-01 | Spatial parameter signalling |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| US20210319799A1 US20210319799A1 (en) | 2021-10-14 |
| US12327569B2 true US12327569B2 (en) | 2025-06-10 |
Family
ID=63920928
Family Applications (2)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US17/270,354 Active 2041-02-22 US12327569B2 (en) | 2018-08-31 | 2019-08-08 | Spatial parameter signalling |
| US19/196,641 Pending US20250259636A1 (en) | 2018-08-31 | 2025-05-01 | Spatial parameter signalling |
Family Applications After (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US19/196,641 Pending US20250259636A1 (en) | 2018-08-31 | 2025-05-01 | Spatial parameter signalling |
Country Status (7)
| Country | Link |
|---|---|
| US (2) | US12327569B2 (en) |
| EP (2) | EP4598061A3 (en) |
| CN (2) | CN119252267A (en) |
| ES (1) | ES3037973T3 (en) |
| GB (1) | GB2576769A (en) |
| PL (1) | PL3844748T3 (en) |
| WO (1) | WO2020043935A1 (en) |
Families Citing this family (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| KR102599744B1 (en) | 2018-12-07 | 2023-11-08 | 프라운호퍼-게젤샤프트 추르 푀르데룽 데어 안제반텐 포르슝 에 파우 | Apparatus, methods, and computer programs for encoding, decoding, scene processing, and other procedures related to DirAC-based spatial audio coding using directional component compensation. |
| SG11202107802VA (en) * | 2019-01-21 | 2021-08-30 | Fraunhofer Ges Forschung | Apparatus and method for encoding a spatial audio representation or apparatus and method for decoding an encoded audio signal using transport metadata and related computer programs |
| US12073842B2 (en) * | 2019-06-24 | 2024-08-27 | Qualcomm Incorporated | Psychoacoustic audio coding of ambisonic audio data |
| GB2598932A (en) | 2020-09-18 | 2022-03-23 | Nokia Technologies Oy | Spatial audio parameter encoding and associated decoding |
| CA3202283A1 (en) * | 2020-12-15 | 2022-06-23 | Adriana Vasilache | Quantizing spatial audio parameters |
Citations (24)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20050053242A1 (en) | 2001-07-10 | 2005-03-10 | Fredrik Henn | Efficient and scalable parametric stereo coding for low bitrate applications |
| US20060235679A1 (en) | 2005-04-13 | 2006-10-19 | Fraunhofer-Gesellschaft Zur Forderung Der Angewandten Forschung E.V. | Adaptive grouping of parameters for enhanced coding efficiency |
| US20060233380A1 (en) * | 2005-04-15 | 2006-10-19 | FRAUNHOFER- GESELLSCHAFT ZUR FORDERUNG DER ANGEWANDTEN FORSCHUNG e.V. | Multi-channel hierarchical audio coding with compact side information |
| CN1926610A (en) | 2004-03-12 | 2007-03-07 | 诺基亚公司 | Synthesizing a mono audio signal based on an encoded multi-channel audio signal |
| CN1932877A (en) | 2006-09-30 | 2007-03-21 | 中山大学 | Data decomposition and reconfiguration method with parameter |
| US20070297519A1 (en) | 2004-10-28 | 2007-12-27 | Jeffrey Thompson | Audio Spatial Environment Engine |
| US20090006103A1 (en) | 2007-06-29 | 2009-01-01 | Microsoft Corporation | Bitstream syntax for multi-process audio decoding |
| US20110002393A1 (en) | 2009-07-03 | 2011-01-06 | Fujitsu Limited | Audio encoding device, audio encoding method, and video transmission device |
| US8069052B2 (en) * | 2002-09-04 | 2011-11-29 | Microsoft Corporation | Quantization and inverse quantization for audio |
| WO2012058805A1 (en) | 2010-11-03 | 2012-05-10 | Huawei Technologies Co., Ltd. | Parametric encoder for encoding a multi-channel audio signal |
| US20140112482A1 (en) * | 2012-04-05 | 2014-04-24 | Huawei Technologies Co., Ltd. | Method for Parametric Spatial Audio Coding and Decoding, Parametric Spatial Audio Coder and Parametric Spatial Audio Decoder |
| CN103824557A (en) | 2014-02-19 | 2014-05-28 | 清华大学 | Audio detecting and classifying method with customization function |
| WO2014191793A1 (en) | 2013-05-28 | 2014-12-04 | Nokia Corporation | Audio signal encoder |
| US20160005413A1 (en) * | 2013-02-14 | 2016-01-07 | Dolby Laboratories Licensing Corporation | Audio Signal Enhancement Using Estimated Spatial Parameters |
| US20160005407A1 (en) * | 2013-02-21 | 2016-01-07 | Dolby International Ab | Methods for Parametric Multi-Channel Encoding |
| US9241216B2 (en) | 2010-11-05 | 2016-01-19 | Thomson Licensing | Data structure for higher order ambisonics audio data |
| US20160035369A1 (en) | 2006-06-21 | 2016-02-04 | Samsung Electronics Co., Ltd. | Method and apparatus for adaptively encoding and decoding high frequency band |
| US20160148618A1 (en) * | 2013-07-05 | 2016-05-26 | Dolby Laboratories Licensing Corporation | Packet Loss Concealment Apparatus and Method, and Audio Processing System |
| US20160293174A1 (en) | 2015-04-05 | 2016-10-06 | Qualcomm Incorporated | Audio bandwidth selection |
| CN106023999A (en) | 2016-07-11 | 2016-10-12 | 武汉大学 | Encoding and decoding method and system for improving three-dimensional audio spatial parameter compression ratio |
| US20170069329A1 (en) | 2012-10-26 | 2017-03-09 | Huawei Technologies Co., Ltd. | Method and Apparatus for Allocating Bits of Audio Signal |
| CN103928030B (en) | 2014-04-30 | 2017-03-15 | 武汉大学 | Based on the scalable audio coding system and method that subband spatial concern is estimated |
| US9756448B2 (en) * | 2014-04-01 | 2017-09-05 | Dolby International Ab | Efficient coding of audio scenes comprising audio objects |
| US20190066701A1 (en) * | 2016-03-10 | 2019-02-28 | Orange | Optimized coding and decoding of spatialization information for the parametric coding and decoding of a multichannel audio signal |
Family Cites Families (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US7831434B2 (en) * | 2006-01-20 | 2010-11-09 | Microsoft Corporation | Complex-transform channel coding with extended-band frequency coding |
| US8942989B2 (en) * | 2009-12-28 | 2015-01-27 | Panasonic Intellectual Property Corporation Of America | Speech coding of principal-component channels for deleting redundant inter-channel parameters |
| US9570081B2 (en) * | 2012-04-26 | 2017-02-14 | Nokia Technologies Oy | Backwards compatible audio representation |
| US9659569B2 (en) * | 2013-04-26 | 2017-05-23 | Nokia Technologies Oy | Audio signal encoder |
| US10163447B2 (en) * | 2013-12-16 | 2018-12-25 | Qualcomm Incorporated | High-band signal modeling |
-
2018
- 2018-08-31 GB GB1814227.3A patent/GB2576769A/en not_active Withdrawn
-
2019
- 2019-08-08 EP EP25182792.9A patent/EP4598061A3/en active Pending
- 2019-08-08 WO PCT/FI2019/050581 patent/WO2020043935A1/en not_active Ceased
- 2019-08-08 EP EP19855639.1A patent/EP3844748B1/en active Active
- 2019-08-08 US US17/270,354 patent/US12327569B2/en active Active
- 2019-08-08 CN CN202411391576.5A patent/CN119252267A/en active Pending
- 2019-08-08 ES ES19855639T patent/ES3037973T3/en active Active
- 2019-08-08 CN CN201980070712.1A patent/CN112970062B/en active Active
- 2019-08-08 PL PL19855639.1T patent/PL3844748T3/en unknown
-
2025
- 2025-05-01 US US19/196,641 patent/US20250259636A1/en active Pending
Patent Citations (29)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20050053242A1 (en) | 2001-07-10 | 2005-03-10 | Fredrik Henn | Efficient and scalable parametric stereo coding for low bitrate applications |
| US8069052B2 (en) * | 2002-09-04 | 2011-11-29 | Microsoft Corporation | Quantization and inverse quantization for audio |
| CN1926610A (en) | 2004-03-12 | 2007-03-07 | 诺基亚公司 | Synthesizing a mono audio signal based on an encoded multi-channel audio signal |
| US20070208565A1 (en) | 2004-03-12 | 2007-09-06 | Ari Lakaniemi | Synthesizing a Mono Audio Signal |
| US20070297519A1 (en) | 2004-10-28 | 2007-12-27 | Jeffrey Thompson | Audio Spatial Environment Engine |
| US20060235679A1 (en) | 2005-04-13 | 2006-10-19 | Fraunhofer-Gesellschaft Zur Forderung Der Angewandten Forschung E.V. | Adaptive grouping of parameters for enhanced coding efficiency |
| US20060233380A1 (en) * | 2005-04-15 | 2006-10-19 | FRAUNHOFER- GESELLSCHAFT ZUR FORDERUNG DER ANGEWANDTEN FORSCHUNG e.V. | Multi-channel hierarchical audio coding with compact side information |
| US20160035369A1 (en) | 2006-06-21 | 2016-02-04 | Samsung Electronics Co., Ltd. | Method and apparatus for adaptively encoding and decoding high frequency band |
| CN1932877A (en) | 2006-09-30 | 2007-03-21 | 中山大学 | Data decomposition and reconfiguration method with parameter |
| US20090006103A1 (en) | 2007-06-29 | 2009-01-01 | Microsoft Corporation | Bitstream syntax for multi-process audio decoding |
| US20110002393A1 (en) | 2009-07-03 | 2011-01-06 | Fujitsu Limited | Audio encoding device, audio encoding method, and video transmission device |
| CN102844808A (en) | 2010-11-03 | 2012-12-26 | 华为技术有限公司 | Parametric encoder for encoding multi-channel audio signal |
| WO2012058805A1 (en) | 2010-11-03 | 2012-05-10 | Huawei Technologies Co., Ltd. | Parametric encoder for encoding a multi-channel audio signal |
| US9241216B2 (en) | 2010-11-05 | 2016-01-19 | Thomson Licensing | Data structure for higher order ambisonics audio data |
| US20140112482A1 (en) * | 2012-04-05 | 2014-04-24 | Huawei Technologies Co., Ltd. | Method for Parametric Spatial Audio Coding and Decoding, Parametric Spatial Audio Coder and Parametric Spatial Audio Decoder |
| US20170069329A1 (en) | 2012-10-26 | 2017-03-09 | Huawei Technologies Co., Ltd. | Method and Apparatus for Allocating Bits of Audio Signal |
| US20160005413A1 (en) * | 2013-02-14 | 2016-01-07 | Dolby Laboratories Licensing Corporation | Audio Signal Enhancement Using Estimated Spatial Parameters |
| US20160005407A1 (en) * | 2013-02-21 | 2016-01-07 | Dolby International Ab | Methods for Parametric Multi-Channel Encoding |
| WO2014191793A1 (en) | 2013-05-28 | 2014-12-04 | Nokia Corporation | Audio signal encoder |
| CN105474308A (en) | 2013-05-28 | 2016-04-06 | 诺基亚技术有限公司 | Audio signal encoder |
| US20160111100A1 (en) | 2013-05-28 | 2016-04-21 | Nokia Technologies Oy | Audio signal encoder |
| US20160148618A1 (en) * | 2013-07-05 | 2016-05-26 | Dolby Laboratories Licensing Corporation | Packet Loss Concealment Apparatus and Method, and Audio Processing System |
| CN103824557A (en) | 2014-02-19 | 2014-05-28 | 清华大学 | Audio detecting and classifying method with customization function |
| US9756448B2 (en) * | 2014-04-01 | 2017-09-05 | Dolby International Ab | Efficient coding of audio scenes comprising audio objects |
| CN103928030B (en) | 2014-04-30 | 2017-03-15 | 武汉大学 | Based on the scalable audio coding system and method that subband spatial concern is estimated |
| US20160293174A1 (en) | 2015-04-05 | 2016-10-06 | Qualcomm Incorporated | Audio bandwidth selection |
| CN107408392A (en) | 2015-04-05 | 2017-11-28 | 高通股份有限公司 | Audio bandwidth selects |
| US20190066701A1 (en) * | 2016-03-10 | 2019-02-28 | Orange | Optimized coding and decoding of spatialization information for the parametric coding and decoding of a multichannel audio signal |
| CN106023999A (en) | 2016-07-11 | 2016-10-12 | 武汉大学 | Encoding and decoding method and system for improving three-dimensional audio spatial parameter compression ratio |
Non-Patent Citations (11)
| Title |
|---|
| Extended European Search Report for European Application No. 19855639.1 dated May 3, 2022, 9 pages. |
| Hirvonen et al., "Perceptual Compression Methods for Metadata in Directional Audio Coding Applied to Audiovisual Teleconference", Audio Engineering Society Convention Paper 7706, 126th Convention (May 7-10, 2009), 8 pages. |
| Intention to Grant for European Application No. 19855639.1 dated Dec. 19, 2023, 7 pages. |
| Intention to Grant for European Application No. 19855639.1 dated Feb. 13, 2025, 48 pages. |
| Intention to Grant for European Application No. 19855639.1 dated May 8, 2024, 49 pages. |
| Intention to Grant for European Application No. 19855639.1 dated Sep. 17, 2024, 48 pages. |
| International Search Report and Written Opinion for Patent Cooperation Treaty Application No. PCT/FI2019/050581 dated Nov. 28, 2019, 16 pages. |
| Notice of Allowance for Chinese Application No. 201980070712.1 dated Aug. 9, 2024, 10 pages. |
| Office Action for Chinese Application No. 201980070712.1 dated Mar. 26, 2024, 24 pages. |
| Peters et al., "Scene-Based Audio Implemented with Higher Order Ambisonics", SMPTE Motion Imaging Journal, vol. 125, No. 9 (Nov.-Dec. 2016), pp. 16-24. |
| Yang et al., "Multi-Channel Object-Based Spatial Parameter Compression Approach for 3D Audio", Advances in Multimedia Information Processing—PCM 2015, proceedings of the 16th Pacific-Rim Conference on Multimedia (Sep. 16-18, 2015), 11 pages. |
Also Published As
| Publication number | Publication date |
|---|---|
| PL3844748T3 (en) | 2025-09-15 |
| EP3844748B1 (en) | 2025-07-23 |
| US20250259636A1 (en) | 2025-08-14 |
| US20210319799A1 (en) | 2021-10-14 |
| EP3844748A4 (en) | 2022-06-01 |
| CN112970062A (en) | 2021-06-15 |
| GB201814227D0 (en) | 2018-10-17 |
| EP4598061A2 (en) | 2025-08-06 |
| EP3844748A1 (en) | 2021-07-07 |
| ES3037973T3 (en) | 2025-10-08 |
| GB2576769A (en) | 2020-03-04 |
| CN112970062B (en) | 2024-10-18 |
| EP4598061A3 (en) | 2025-09-03 |
| CN119252267A (en) | 2025-01-03 |
| WO2020043935A1 (en) | 2020-03-05 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US12114146B2 (en) | Determination of targeted spatial audio parameters and associated spatial audio playback | |
| US20250259636A1 (en) | Spatial parameter signalling | |
| US20240363127A1 (en) | Determination of the significance of spatial audio parameters and associated encoding | |
| US11096002B2 (en) | Energy-ratio signalling and synthesis | |
| CN112219236A (en) | Spatial audio parameters and associated spatial audio playback | |
| US12451147B2 (en) | Spatial audio parameter encoding and associated decoding | |
| US20210250717A1 (en) | Spatial audio Capture, Transmission and Reproduction | |
| US20240357304A1 (en) | Sound Field Related Rendering | |
| US20250157475A1 (en) | Parametric spatial audio rendering | |
| US20250349303A1 (en) | Spatial audio parameter encoding and associated decoding | |
| CN116547749B (en) | Quantization of audio parameters | |
| CA3237983A1 (en) | Spatial audio parameter decoding | |
| US20250210049A1 (en) | Parametric spatial audio encoding |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| FEPP | Fee payment procedure |
Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
| STPP | Information on status: patent application and granting procedure in general |
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
|
| AS | Assignment |
Owner name: NOKIA TECHNOLOGIES OY, FINLAND Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:PIHLAJAKUJA, TAPANI JOHANNES;LAITINEN, MIKKO-VILLE;REEL/FRAME:063220/0311 Effective date: 20190805 Owner name: NOKIA TECHNOLOGIES OY, FINLAND Free format text: ASSIGNMENT OF ASSIGNOR'S INTEREST;ASSIGNORS:PIHLAJAKUJA, TAPANI JOHANNES;LAITINEN, MIKKO-VILLE;REEL/FRAME:063220/0311 Effective date: 20190805 |
|
| FEPP | Fee payment procedure |
Free format text: PETITION RELATED TO MAINTENANCE FEES GRANTED (ORIGINAL EVENT CODE: PTGR); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
| STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
|
| STPP | Information on status: patent application and granting procedure in general |
Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
|
| STPP | Information on status: patent application and granting procedure in general |
Free format text: FINAL REJECTION MAILED |
|
| STPP | Information on status: patent application and granting procedure in general |
Free format text: RESPONSE AFTER FINAL ACTION FORWARDED TO EXAMINER |
|
| STPP | Information on status: patent application and granting procedure in general |
Free format text: ADVISORY ACTION MAILED |
|
| STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
|
| STPP | Information on status: patent application and granting procedure in general |
Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
|
| STPP | Information on status: patent application and granting procedure in general |
Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
|
| STCF | Information on status: patent grant |
Free format text: PATENTED CASE |
|
| CC | Certificate of correction |