US20140358565A1 - Compression of decomposed representations of a sound field - Google Patents

Compression of decomposed representations of a sound field Download PDF

Info

Publication number
US20140358565A1
US20140358565A1 US14/289,588 US201414289588A US2014358565A1 US 20140358565 A1 US20140358565 A1 US 20140358565A1 US 201414289588 A US201414289588 A US 201414289588A US 2014358565 A1 US2014358565 A1 US 2014358565A1
Authority
US
United States
Prior art keywords
vectors
spherical harmonic
harmonic coefficients
matrix
audio
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US14/289,588
Inventor
Nils Günther Peters
Dipanjan Sen
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qualcomm Inc
Original Assignee
Qualcomm Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to US201361828445P priority Critical
Priority to US201361828615P priority
Priority to US201361829155P priority
Priority to US201361829182P priority
Priority to US201361829174P priority
Priority to US201361829846P priority
Priority to US201361829791P priority
Priority to US201361886617P priority
Priority to US201361886605P priority
Priority to US201361899034P priority
Priority to US201361899041P priority
Priority to US201461925126P priority
Priority to US201461925158P priority
Priority to US201461925074P priority
Priority to US201461925112P priority
Priority to US201461933721P priority
Priority to US201461933706P priority
Priority to US201462003515P priority
Application filed by Qualcomm Inc filed Critical Qualcomm Inc
Priority to US14/289,588 priority patent/US20140358565A1/en
Assigned to QUALCOMM INCORPORATED reassignment QUALCOMM INCORPORATED ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: PETERS, NILS GÜNTHER, SEN, DIPANJAN
Publication of US20140358565A1 publication Critical patent/US20140358565A1/en
Application status is Abandoned legal-status Critical

Links

Images

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04SSTEREOPHONIC SYSTEMS 
    • H04S5/00Pseudo-stereo systems, e.g. in which additional channel signals are derived from monophonic signals by means of phase shifting, time delay or reverberation 
    • H04S5/005Pseudo-stereo systems, e.g. in which additional channel signals are derived from monophonic signals by means of phase shifting, time delay or reverberation  of the pseudo five- or more-channel type, e.g. virtual surround
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/16Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/002Dynamic bit allocation
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008Multichannel audio signal coding or decoding, i.e. using interchannel correlation to reduce redundancies, e.g. joint-stereo, intensity-coding, matrixing
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/0204Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using subband decomposition
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/032Quantisation or dequantisation of spectral components
    • G10L19/038Vector quantisation, e.g. TwinVQ audio
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/06Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16Vocoder architecture
    • G10L19/167Audio streaming, i.e. formatting and decoding of an encoded audio signal representation into a data stream for transmission or storage purposes
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16Vocoder architecture
    • G10L19/18Vocoders using multiple modes
    • G10L19/20Vocoders using multiple modes using sound class specific coding, hybrid encoders or object based coding
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
    • G10L25/03Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
    • G10L25/18Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R5/00Stereophonic arrangements
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04SSTEREOPHONIC SYSTEMS 
    • H04S7/00Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30Control circuits for electronic adaptation of the sound field
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04SSTEREOPHONIC SYSTEMS 
    • H04S7/00Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30Control circuits for electronic adaptation of the sound field
    • H04S7/302Electronic adaptation of stereophonic sound system to listener position or orientation
    • H04S7/303Tracking of listener position or orientation
    • H04S7/304For headphones
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04SSTEREOPHONIC SYSTEMS 
    • H04S7/00Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/40Visual indication of stereophonic sound image
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L2019/0001Codebooks
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L2019/0001Codebooks
    • G10L2019/0004Design or structure of the codebook
    • G10L2019/0005Multi-stage vector quantisation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R2205/00Details of stereophonic arrangements covered by H04R5/00 but not provided for in any of its subgroups
    • H04R2205/021Aspects relating to docking-station type assemblies to obtain an acoustical effect, e.g. the type of connection to external loudspeakers or housings, frequency improvement
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04SSTEREOPHONIC SYSTEMS 
    • H04S2400/00Details of stereophonic systems covered by H04S but not provided for in its groups
    • H04S2400/01Multi-channel, i.e. more than two input channels, sound reproduction with two speakers wherein the multi-channel information is substantially preserved
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04SSTEREOPHONIC SYSTEMS 
    • H04S2400/00Details of stereophonic systems covered by H04S but not provided for in its groups
    • H04S2400/15Aspects of sound capture and related signal processing for recording or reproduction
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04SSTEREOPHONIC SYSTEMS 
    • H04S2420/00Techniques used stereophonic systems covered by H04S but not provided for in its groups
    • H04S2420/01Enhancing the perception of the sound image or of the spatial distribution using head related transfer functions [HRTF's] or equivalents thereof, e.g. interaural time difference [ITD] or interaural level difference [ILD]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04SSTEREOPHONIC SYSTEMS 
    • H04S2420/00Techniques used stereophonic systems covered by H04S but not provided for in its groups
    • H04S2420/03Application of parametric coding in stereophonic audio systems
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04SSTEREOPHONIC SYSTEMS 
    • H04S2420/00Techniques used stereophonic systems covered by H04S but not provided for in its groups
    • H04S2420/11Application of ambisonics in stereophonic audio systems

Abstract

In general, techniques are described for obtaining decomposed versions of spherical harmonic coefficients. A device comprising one or more processors may be configured to perform the techniques, whereby the processors may be configured to obtain, from a bitstream, at least one of one or more vectors decomposed from spherical harmonic coefficients that were recombined with background spherical harmonic coefficients, wherein the spherical harmonic coefficients describe a sound field, and wherein the background spherical harmonic coefficients described one or more background components of the same sound field.

Description

  • This application claims the benefit of U.S. Provisional Application No. 61/828,445 filed 29 May 2013, U.S. Provisional Application No. 61/829,791 filed 31 May 2013, U.S. Provisional Application No. 61/899,034 filed 1 Nov. 2013, U.S. Provisional Application No. 61/899,041 filed 1 Nov. 2013, U.S. Provisional Application No. 61/829,182 filed 30 May 2013, U.S. Provisional Application No. 61/829,174 filed 30 May 2013, U.S. Provisional Application No. 61/829,155 filed 30 May 2013, U.S. Provisional Application No. 61/933,706 filed 30 Jan. 2014, U.S. Provisional Application No. 61/829,846 filed 31 May 2013, U.S. Provisional Application No. 61/886,605 filed 3 Oct. 2013, U.S. Provisional Application No. 61/886,617 filed 3 Oct. 2013, U.S. Provisional Application No. 61/925,158 filed 8 Jan. 2014, U.S. Provisional Application No. 61/933,721 filed 30 Jan. 2014, U.S. Provisional Application No. 61/925,074 filed 8 Jan. 2014, U.S. Provisional Application No. 61/925,112 filed 8 Jan. 2014, U.S. Provisional Application No. 61/925,126 filed 8 Jan. 2014, and U.S. Provisional Application No. 62/003,515 filed 27 May 2014, and U.S. Provisional Application No. 61/828,615 filed 29 May 2013 the entire content of each which are incorporated herein by reference.
  • TECHNICAL FIELD
  • This disclosure relate to audio data and, more specifically, compression of audio data.
  • BACKGROUND
  • A higher order ambisonics (HOA) signal (often represented by a plurality of spherical harmonic coefficients (SHC) or other hierarchical elements) is a three-dimensional representation of a soundfield. This HOA or SHC representation may represent this soundfield in a manner that is independent of the local speaker geometry used to playback a multi-channel audio signal rendered from this SHC signal. This SHC signal may also facilitate backwards compatibility as this SHC signal may be rendered to well-known and highly adopted multi-channel formats, such as a 5.1 audio channel format or a 7.1 audio channel format. The SHC representation may therefore enable a better representation of a soundfield that also accommodates backward compatibility.
  • SUMMARY
  • In general, techniques are described for compression and decompression of higher order ambisonic audio data.
  • In one aspect, a method comprises obtaining one or more first vectors describing distinct components of the soundfield and one or more second vectors describing background components of the soundfield, both the one or more first vectors and the one or more second vectors generated at least by performing a transformation with respect to the plurality of spherical harmonic coefficients.
  • In another aspect, a device comprises one or more processors configured to determine one or more first vectors describing distinct components of the soundfield and one or more second vectors describing background components of the soundfield, both the one or more first vectors and the one or more second vectors generated at least by performing a transformation with respect to the plurality of spherical harmonic coefficients.
  • In another aspect, a device comprises means for obtaining one or more first vectors describing distinct components of the soundfield and one or more second vectors describing background components of the soundfield, both the one or more first vectors and the one or more second vectors generated at least by performing a transformation with respect to the plurality of spherical harmonic coefficients, and means for storing the one or more first vectors.
  • In another aspect, a non-transitory computer-readable storage medium has stored thereon instructions that, when executed, cause one or more processors to obtain one or more first vectors describing distinct components of the soundfield and one or more second vectors describing background components of the soundfield, both the one or more first vectors and the one or more second vectors generated at least by performing a transformation with respect to the plurality of spherical harmonic coefficients.
  • In another aspect, a method comprises selecting one of a plurality of decompression schemes based on the indication of whether an compressed version of spherical harmonic coefficients representative of a sound field are generated from a synthetic audio object, and decompressing the compressed version of the spherical harmonic coefficients using the selected one of the plurality of decompression schemes.
  • In another aspect, a device comprises one or more processors configured to select one of a plurality of decompression schemes based on the indication of whether an compressed version of spherical harmonic coefficients representative of a sound field are generated from a synthetic audio object, and decompress the compressed version of the spherical harmonic coefficients using the selected one of the plurality of decompression schemes.
  • In another aspect, a device comprises means for selecting one of a plurality of decompression schemes based on the indication of whether an compressed version of spherical harmonic coefficients representative of a sound field are generated from a synthetic audio object, and means for decompressing the compressed version of the spherical harmonic coefficients using the selected one of the plurality of decompression schemes.
  • In another aspect, a non-transitory computer-readable storage medium has stored thereon instructions that, when executed, cause one or more processors of an integrated decoding device to select one of a plurality of decompression schemes based on the indication of whether an compressed version of spherical harmonic coefficients representative of a sound field are generated from a synthetic audio object, and decompress the compressed version of the spherical harmonic coefficients using the selected one of the plurality of decompression schemes.
  • In another aspect, a method comprises obtaining an indication of whether spherical harmonic coefficients representative of a sound field are generated from a synthetic audio object.
  • In another aspect, a device comprises one or more processors configured to obtain an indication of whether spherical harmonic coefficients representative of a sound field are generated from a synthetic audio object.
  • In another aspect, a device comprises means for storing spherical harmonic coefficients representative of a sound field, and means for obtaining an indication of whether the spherical harmonic coefficients are generated from a synthetic audio object.
  • In another aspect, anon-transitory computer-readable storage medium has stored thereon instructions that, when executed, cause one or more processors to obtain an indication of whether spherical harmonic coefficients representative of a sound field are generated from a synthetic audio object.
  • In another aspect, a method comprises quantizing one or more first vectors representative of one or more components of a sound field, and compensating for error introduced due to the quantization of the one or more first vectors in one or more second vectors that are also representative of the same one or more components of the sound field.
  • In another aspect, a device comprises one or more processors configured to quantize one or more first vectors representative of one or more components of a sound field, and compensate for error introduced due to the quantization of the one or more first vectors in one or more second vectors that are also representative of the same one or more components of the sound field.
  • In another aspect, a device comprises means for quantizing one or more first vectors representative of one or more components of a sound field, and means for compensating for error introduced due to the quantization of the one or more first vectors in one or more second vectors that are also representative of the same one or more components of the sound field.
  • In another aspect, a non-transitory computer-readable storage medium has stored thereon instructions that, when executed, cause one or more processors to quantize one or more first vectors representative of one or more components of a sound field, and compensate for error introduced due to the quantization of the one or more first vectors in one or more second vectors that are also representative of the same one or more components of the sound field.
  • In another aspect, a method comprises performing, based on a target bitrate, order reduction with respect to a plurality of spherical harmonic coefficients or decompositions thereof to generate reduced spherical harmonic coefficients or the reduced decompositions thereof, wherein the plurality of spherical harmonic coefficients represent a sound field.
  • In another aspect, a device comprises one or more processors configured to perform, based on a target bitrate, order reduction with respect to a plurality of spherical harmonic coefficients or decompositions thereof to generate reduced spherical harmonic coefficients or the reduced decompositions thereof, wherein the plurality of spherical harmonic coefficients represent a sound field.
  • In another aspect, a device comprises means for storing a plurality of spherical harmonic coefficients or decompositions thereof, and means for performing, based on a target bitrate, order reduction with respect to the plurality of spherical harmonic coefficients or decompositions thereof to generate reduced spherical harmonic coefficients or the reduced decompositions thereof, wherein the plurality of spherical harmonic coefficients represent a sound field.
  • In another aspect, a non-transitory computer-readable storage medium has stored thereon instructions that, when executed, cause one or more processors to perform, based on a target bitrate, order reduction with respect to a plurality of spherical harmonic coefficients or decompositions thereof to generate reduced spherical harmonic coefficients or the reduced decompositions thereof, wherein the plurality of spherical harmonic coefficients represent a sound field.
  • In another aspect, a method comprises obtaining a first non-zero set of coefficients of a vector that represent a distinct component of the sound field, the vector having been decomposed from a plurality of spherical harmonic coefficients that describe a sound field.
  • In another aspect, a device comprises one or more processors configured to obtain a first non-zero set of coefficients of a vector that represent a distinct component of a sound field, the vector having been decomposed from a plurality of spherical harmonic coefficients that describe the sound field.
  • In another aspect, a device comprises means for obtaining a first non-zero set of coefficients of a vector that represent a distinct component of a sound field, the vector having been decomposed from a plurality of spherical harmonic coefficients that describe the sound field, and means for storing the first non-zero set of coefficients.
  • In another aspect, a non-transitory computer-readable storage medium has stored thereon instructions that, when executed, cause one or more processors to determine a first non-zero set of coefficients of a vector that represent a distinct component of a sound field, the vector having been decomposed from a plurality of spherical harmonic coefficients that describe the sound field.
  • In another aspect, a method comprises obtaining, from a bitstream, at least one of one or more vectors decomposed from spherical harmonic coefficients that were recombined with background spherical harmonic coefficients, wherein the spherical harmonic coefficients describe a sound field, and wherein the background spherical harmonic coefficients described one or more background components of the same sound field.
  • In another aspect, a device comprises one or more processors configured to determine, from a bitstream, at least one of one or more vectors decomposed from spherical harmonic coefficients that were recombined with background spherical harmonic coefficients, wherein the spherical harmonic coefficients describe a sound field, and wherein the background spherical harmonic coefficients described one or more background components of the same sound field.
  • In another aspect, a device comprises means for obtaining, from a bitstream, at least one of one or more vectors decomposed from spherical harmonic coefficients that were recombined with background spherical harmonic coefficients, wherein the spherical harmonic coefficients describe a sound field, and wherein the background spherical harmonic coefficients described one or more background components of the same sound field.
  • In another aspect, a non-transitory computer-readable storage medium has stored thereon instructions that, when executed, cause one or more processors to obtain, from a bitstream, at least one of one or more vectors decomposed from spherical harmonic coefficients that were recombined with background spherical harmonic coefficients, wherein the spherical harmonic coefficients describe a sound field, and wherein the background spherical harmonic coefficients described one or more background components of the same sound field.
  • In another aspect, a method comprises identifying one or more distinct audio objects from one or more spherical harmonic coefficients (SHC) associated with the audio objects based on a directionality determined for one or more of the audio objects.
  • In another aspect, a device comprises one or more processors configured to identify one or more distinct audio objects from one or more spherical harmonic coefficients (SHC) associated with the audio objects based on a directionality determined for one or more of the audio objects.
  • In another aspect, a device comprises means for storing one or more spherical harmonic coefficients (SHC), and means for identifying one or more distinct audio objects from the one or more spherical harmonic coefficients (SHC) associated with the audio objects based on a directionality determined for one or more of the audio objects.
  • In another aspect, a non-transitory computer-readable storage medium has stored thereon instructions that, when executed, cause one or more processors to identify one or more distinct audio objects from one or more spherical harmonic coefficients (SHC) associated with the audio objects based on a directionality determined for one or more of the audio objects.
  • In another aspect, a method comprises performing a vector-based synthesis with respect to a plurality of spherical harmonic coefficients to generate decomposed representations of the plurality of spherical harmonic coefficients representative of one or more audio objects and corresponding directional information, wherein the spherical harmonic coefficients are associated with an order and describe a sound field, determining distinct and background directional information from the directional information, reducing an order of the directional information associated with the background audio objects to generate transformed background directional information, applying compensation to increase values of the transformed directional information to preserve an overall energy of the sound field.
  • In another aspect, a device comprises one or more processors configured to perform a vector-based synthesis with respect to a plurality of spherical harmonic coefficients to generate decomposed representations of the plurality of spherical harmonic coefficients representative of one or more audio objects and corresponding directional information, wherein the spherical harmonic coefficients are associated with an order and describe a sound field, determine distinct and background directional information from the directional information, reduce an order of the directional information associated with the background audio objects to generate transformed background directional information, apply compensation to increase values of the transformed directional information to preserve an overall energy of the sound field.
  • In another aspect, a device comprises means for performing a vector-based synthesis with respect to a plurality of spherical harmonic coefficients to generate decomposed representations of the plurality of spherical harmonic coefficients representative of one or more audio objects and corresponding directional information, wherein the spherical harmonic coefficients are associated with an order and describe a sound field, means for determining distinct and background directional information from the directional information, means for reducing an order of the directional information associated with the background audio objects to generate transformed background directional information, and means for applying compensation to increase values of the transformed directional information to preserve an overall energy of the sound field.
  • In another aspect, a non-transitory computer-readable storage medium has stored thereon instructions that, when executed, cause one or more processors to perform a vector-based synthesis with respect to a plurality of spherical harmonic coefficients to generate decomposed representations of the plurality of spherical harmonic coefficients representative of one or more audio objects and corresponding directional information, wherein the spherical harmonic coefficients are associated with an order and describe a sound field, determine distinct and background directional information from the directional information, reduce an order of the directional information associated with the background audio objects to generate transformed background directional information, and apply compensation to increase values of the transformed directional information to preserve an overall energy of the sound field.
  • In another aspect, a method comprises obtaining decomposed interpolated spherical harmonic coefficients for a time segment by, at least in part, performing an interpolation with respect to a first decomposition of a first plurality of spherical harmonic coefficients and a second decomposition of a second plurality of spherical harmonic coefficients.
  • In another aspect, a device comprises one or more processors configured to obtain decomposed interpolated spherical harmonic coefficients for a time segment by, at least in part, performing an interpolation with respect to a first decomposition of a first plurality of spherical harmonic coefficients and a second decomposition of a second plurality of spherical harmonic coefficients.
  • In another aspect, a device comprises means for storing a first plurality of spherical harmonic coefficients and a second plurality of spherical harmonic coefficients, and means for obtain decomposed interpolated spherical harmonic coefficients for a time segment by, at least in part, performing an interpolation with respect to a first decomposition of the first plurality of spherical harmonic coefficients and the second decomposition of a second plurality of spherical harmonic coefficients.
  • In another aspect, a non-transitory computer-readable storage medium has stored thereon instructions that, when executed, cause one or more processors to obtain decomposed interpolated spherical harmonic coefficients for a time segment by, at least in part, performing an interpolation with respect to a first decomposition of a first plurality of spherical harmonic coefficients and a second decomposition of a second plurality of spherical harmonic coefficients.
  • In another aspect, a method comprises obtaining a bitstream comprising a compressed version of a spatial component of a sound field, the spatial component generated by performing a vector based synthesis with respect to a plurality of spherical harmonic coefficients.
  • In another aspect, a device comprises one or more processors configured to obtain a bitstream comprising a compressed version of a spatial component of a sound field, the spatial component generated by performing a vector based synthesis with respect to a plurality of spherical harmonic coefficients.
  • In another aspect, a device comprises means for obtaining a bitstream comprising a compressed version of a spatial component of a sound field, the spatial component generated by performing a vector based synthesis with respect to a plurality of spherical harmonic coefficients, and means for storing the bitstream.
  • In another aspect, a non-transitory computer-readable storage medium has stored thereon instructions that when executed cause one or more processors to obtain a bitstream comprising a compressed version of a spatial component of a sound field, the spatial component generated by performing a vector based synthesis with respect to a plurality of spherical harmonic coefficients.
  • In another aspect, a method comprises generating a bitstream comprising a compressed version of a spatial component of a sound field, the spatial component generated by performing a vector based synthesis with respect to a plurality of spherical harmonic coefficients.
  • In another aspect, a device comprises one or more processors configured to generate a bitstream comprising a compressed version of a spatial component of a sound field, the spatial component generated by performing a vector based synthesis with respect to a plurality of spherical harmonic coefficients.
  • In another aspect, a device comprises means for generating a bitstream comprising a compressed version of a spatial component of a sound field, the spatial component generated by performing a vector based synthesis with respect to a plurality of spherical harmonic coefficients, and means for storing the bitstream.
  • In another aspect, a non-transitory computer-readable storage medium has instructions that when executed cause one or more processors to generate a bitstream comprising a compressed version of a spatial component of a sound field, the spatial component generated by performing a vector based synthesis with respect to a plurality of spherical harmonic coefficients.
  • In another aspect, a method comprises identifying a Huffman codebook to use when decompressing a compressed version of a spatial component of a plurality of compressed spatial components based on an order of the compressed version of the spatial component relative to remaining ones of the plurality of compressed spatial components, the spatial component generated by performing a vector based synthesis with respect to a plurality of spherical harmonic coefficients.
  • In another aspect, a device comprises one or more processors configured to identify a Huffman codebook to use when decompressing a compressed version of a spatial component of a plurality of compressed spatial components based on an order of the compressed version of the spatial component relative to remaining ones of the plurality of compressed spatial components, the spatial component generated by performing a vector based synthesis with respect to a plurality of spherical harmonic coefficients.
  • In another aspect, a device comprises means for identifying a Huffman codebook to use when decompressing a compressed version of a spatial component of a plurality of compressed spatial components based on an order of the compressed version of the spatial component relative to remaining ones of the plurality of compressed spatial components, the spatial component generated by performing a vector based synthesis with respect to a plurality of spherical harmonic coefficients, and means for string the plurality of compressed spatial components.
  • In another aspect, a non-transitory computer-readable storage medium has stored thereon instructions that when executed cause one or more processors to identify a Huffman codebook to use when decompressing a spatial component of a plurality of spatial components based on an order of the spatial component relative to remaining ones of the plurality of spatial components, the spatial component generated by performing a vector based synthesis with respect to a plurality of spherical harmonic coefficients.
  • In another aspect, a method comprises identifying a Huffman codebook to use when compressing a spatial component of a plurality of spatial components based on an order of the spatial component relative to remaining ones of the plurality of spatial components, the spatial component generated by performing a vector based synthesis with respect to a plurality of spherical harmonic coefficients.
  • In another aspect, a device comprises one or more processors configured to identify a Huffman codebook to use when compressing a spatial component of a plurality of spatial components based on an order of the spatial component relative to remaining ones of the plurality of spatial components, the spatial component generated by performing a vector based synthesis with respect to a plurality of spherical harmonic coefficients.
  • In another aspect, a device comprises means for storing a Huffman codebook, and means for identifying the Huffman codebook to use when compressing a spatial component of a plurality of spatial components based on an order of the spatial component relative to remaining ones of the plurality of spatial components, the spatial component generated by performing a vector based synthesis with respect to a plurality of spherical harmonic coefficients.
  • In another aspect, a non-transitory computer-readable storage medium has stored thereon instructions that, when executed, cause one or more processors to identify a Huffman codebook to use when compressing a spatial component of a plurality of spatial components based on an order of the spatial component relative to remaining ones of the plurality of spatial components, the spatial component generated by performing a vector based synthesis with respect to a plurality of spherical harmonic coefficients.
  • In another aspect, a method comprises determining a quantization step size to be used when compressing a spatial component of a sound field, the spatial component generated by performing a vector based synthesis with respect to a plurality of spherical harmonic coefficients.
  • In another aspect, a device comprises one or more processors configured to determine a quantization step size to be used when compressing a spatial component of a sound field, the spatial component generated by performing a vector based synthesis with respect to a plurality of spherical harmonic coefficients.
  • In another aspect, a device comprises means for determining a quantization step size to be used when compressing a spatial component of a sound field, the spatial component generated by performing a vector based synthesis with respect to a plurality of spherical harmonic coefficients, and means for storing the quantization step size.
  • In another aspect, a non-transitory computer-readable storage medium has stored thereon instructions that when executed cause one or more processors to determine a quantization step size to be used when compressing a spatial component of a sound field, the spatial component generated by performing a vector based synthesis with respect to a plurality of spherical harmonic coefficients.
  • The details of one or more aspects of the techniques are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of these techniques will be apparent from the description and drawings, and from the claims.
  • BRIEF DESCRIPTION OF DRAWINGS
  • FIGS. 1 and 2 are diagrams illustrating spherical harmonic basis functions of various orders and sub-orders.
  • FIG. 3 is a diagram illustrating a system that may perform various aspects of the techniques described in this disclosure.
  • FIG. 4 is a block diagram illustrating, in more detail, one example of the audio encoding device shown in the example of FIG. 3 that may perform various aspects of the techniques described in this disclosure.
  • FIG. 5 is a block diagram illustrating the audio decoding device of FIG. 3 in more detail.
  • FIG. 6 is a flowchart illustrating exemplary operation of a content analysis unit of an audio encoding device in performing various aspects of the techniques described in this disclosure.
  • FIG. 7 is a flowchart illustrating exemplary operation of an audio encoding device in performing various aspects of the vector-based synthesis techniques described in this disclosure.
  • FIG. 8 is a flow chart illustrating exemplary operation of an audio decoding device in performing various aspects of the techniques described in this disclosure.
  • FIGS. 9A-9L are block diagrams illustrating various aspects of the audio encoding device of the example of FIG. 4 in more detail.
  • FIGS. 10A-10O(ii) are diagrams illustrating a portion of the bitstream or side channel information that may specify the compressed spatial components in more detail.
  • FIGS. 11A-11G are block diagrams illustrating, in more detail, various units of the audio decoding device shown in the example of FIG. 5.
  • FIG. 12 is a diagram illustrating an example audio ecosystem that may perform various aspects of the techniques described in this disclosure.
  • FIG. 13 is a diagram illustrating one example of the audio ecosystem of FIG. 12 in more detail.
  • FIG. 14 is a diagram illustrating one example of the audio ecosystem of FIG. 12 in more detail.
  • FIGS. 15A and 15B are diagrams illustrating other examples of the audio ecosystem of FIG. 12 in more detail.
  • FIG. 16 is a diagram illustrating an example audio encoding device that may perform various aspects of the techniques described in this disclosure.
  • FIG. 17 is a diagram illustrating one example of the audio encoding device of FIG. 16 in more detail.
  • FIG. 18 is a diagram illustrating an example audio decoding device that may perform various aspects of the techniques described in this disclosure.
  • FIG. 19 is a diagram illustrating one example of the audio decoding device of FIG. 18 in more detail.
  • FIGS. 20A-20G are diagrams illustrating example audio acquisition devices that may perform various aspects of the techniques described in this disclosure.
  • FIGS. 21A-21E are diagrams illustrating example audio playback devices that may perform various aspects of the techniques described in this disclosure.
  • FIGS. 22A-22H are diagrams illustrating example audio playback environments in accordance with one or more techniques described in this disclosure.
  • FIG. 23 is a diagram illustrating an example use case where a user may experience a 3D soundfield of a sports game while wearing headphones in accordance with one or more techniques described in this disclosure.
  • FIG. 24 is a diagram illustrating a sports stadium at which a 3D soundfield may be recorded in accordance with one or more techniques described in this disclosure.
  • FIG. 25 is a flow diagram illustrating a technique for rendering a 3D soundfield based on a local audio landscape in accordance with one or more techniques described in this disclosure.
  • FIG. 26 is a diagram illustrating an example game studio in accordance with one or more techniques described in this disclosure.
  • FIG. 27 is a diagram illustrating a plurality game systems which include rendering engines in accordance with one or more techniques described in this disclosure.
  • FIG. 28 is a diagram illustrating a speaker configuration that may be simulated by headphones in accordance with one or more techniques described in this disclosure.
  • FIG. 29 is a diagram illustrating a plurality of mobile devices which may be used to acquire and/or edit a 3D soundfield in accordance with one or more techniques described in this disclosure.
  • FIG. 30 is a diagram illustrating a video frame associated with a 3D soundfield which may be processed in accordance with one or more techniques described in this disclosure.