CN106023999A - Encoding and decoding method and system for improving three-dimensional audio spatial parameter compression ratio - Google Patents

Encoding and decoding method and system for improving three-dimensional audio spatial parameter compression ratio

Info

Publication number
CN106023999A
CN106023999A CN201610541939.8A CN201610541939A CN106023999A CN 106023999 A CN106023999 A CN 106023999A CN 201610541939 A CN201610541939 A CN 201610541939A CN 106023999 A CN106023999 A CN 106023999A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610541939.8A
Other languages
Chinese (zh)
Other versions
CN106023999B (en)
Inventor
胡瑞敏
杨乘
王晓晨
杜鹏慧
苏柳月
武庭照
陈玮
杨玉红
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Application filed by Wuhan University WHU
Priority to CN201610541939.8A
Publication of CN106023999A
Application granted
Publication of CN106023999B
Status: Active

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008 Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention provides an encoding and decoding method and system for improving the compression ratio of three-dimensional audio spatial parameters. The input to the encoder comprises the audio signal of the three-dimensional audio, its spatial side information, and the numbers of the audio objects to which the spatial parameters belong. During encoding, clustering, quantization, intra-frame coding and inter-frame differential coding are applied to the spatial parameters in sequence; during decoding, inter-frame differential decoding, intra-frame decoding, inverse quantization and spatial parameter mapping are applied in sequence. Because the spatial parameters of different sub-bands of the same sound source within the same frame are similar, the invention clusters them and thereby achieves a high compression ratio for the three-dimensional audio spatial parameters.

Description

Encoding and decoding method and system for improving three-dimensional audio spatial parameter compression ratio
Technical field
The invention relates to the field of digital audio and, in response to the need for a higher compression ratio of three-dimensional audio spatial parameters, in particular to an encoding and decoding method and system for improving the compression ratio of three-dimensional audio spatial parameters.
Background
At the end of 2009, the three-dimensional film "Avatar" topped the box office in more than 30 countries, and by early September 2010 its cumulative worldwide box office had exceeded 2.7 billion US dollars. "Avatar" achieved this result because the new three-dimensional special-effects production technology it adopted had a striking effect on the senses of the audience.
To give listeners a more immersive experience and a more realistic sound field in 3D space, Spatial Audio Object Coding (SAOC), Directional Audio Coding (DirAC) and Spatially Squeezed Surround Audio Coding (S3AC) have been proposed. As 3D spatial resolution rises and the number of channels or objects grows, the bit rate of the spatial parameters increases sharply. For example, in the spatial localization quantization point (SLQP) method of S3AC, the bit rate of the spatial parameters is 18 kbps per object, so 16 sound objects require 288 kbps for the spatial parameters alone. Reducing the bit rate of the spatial parameters in 3D audio coding is therefore urgent.
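The bit-rate figure above is a simple product, and a minimal sketch makes the scaling explicit. The function name is hypothetical; only the values 18 kbps per object and 16 objects come from the text.

```python
def spatial_parameter_bitrate_kbps(per_object_kbps, num_objects):
    """Total spatial parameter bit rate when every object carries its own parameters."""
    return per_object_kbps * num_objects


# Figures quoted in the background: 18 kbps per object, 16 sound objects.
print(spatial_parameter_bitrate_kbps(18, 16))  # 288
```

The cost grows linearly with the number of objects, which is why the side information becomes significant for scenes with many objects.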
Existing spatial parameter compression methods, such as BCC, MPEG Surround and S3AC, exploit the similarity between adjacent frames and reduce the spatial parameter bit rate by differential coding. These methods remove the inter-frame redundancy of the spatial parameters in the same frequency band of adjacent frames, but the intra-frame redundancy between the spatial parameters of different frequency bands of the same sound source within one frame remains. If this intra-frame redundancy can be removed, the spatial parameter bit rate can be compressed further.
Summary of the invention
In view of the above shortcomings of the prior art in compressing 3D audio spatial parameters, the invention aims to provide a new object-based spatial parameter compression method for 3D audio. Based on the characteristic that different frequency bands of the same sound source within the same frame have the same spatial parameters, the method removes, at a high ratio, the intra-frame redundancy of the spatial parameters that existing spatial parameter compression methods do not consider, and thereby further compresses the spatial parameter bit rate.
The technical solution of the invention provides an encoding and decoding method for improving the compression ratio of three-dimensional audio spatial parameters, comprising an encoding process and a decoding process. The encoding process comprises the following steps:
Step C1: the input comprises a three-dimensional audio signal containing n objects, the three-dimensional audio spatial parameters, and the numbers of the audio objects to which the spatial parameters belong; the three-dimensional audio time-domain signal is transformed to the frequency domain, specifically as follows.
Let the time-domain signal of the three-dimensional audio be s(t), comprising s_1(t), s_2(t), ..., s_k(t), ..., s_K(t); let the spatial parameters of the three-dimensional audio comprise a triple (θ_k(n, f), φ_k(n, f), r_k(n, f)) for each directional audio signal; and let the number of the audio object to which a spatial parameter belongs be Index(n, f). The time-domain signal s(t) is transformed to the frequency domain to obtain the frequency-domain signal S(n, f) of the three-dimensional audio, comprising S_1(n, f), S_2(n, f), ..., S_k(n, f), ..., S_K(n, f). Here s_k(t) is the time-domain representation of the k-th directional audio signal and t denotes time; S_k(n, f) is the frequency-domain representation of the k-th directional audio signal; (θ_k(n, f), φ_k(n, f), r_k(n, f)) is the spatial parameter corresponding to the k-th directional audio signal, where θ is the horizontal angle, φ is the elevation angle and r is the distance side information; k takes the values 1, 2, ..., K, where K is the total number of original directional audio signals; the value of Index(n, f) is the number of the audio object to which the spatial parameter belongs; n denotes the frame index and f denotes the frequency index.
Step C2: intra-frame coding is performed on the input spatial parameters, as follows: the spatial parameters of different frequency bands belonging to the same audio object within the same frame are clustered; the clustered spatial parameters are quantized; and the quantized spatial parameters are intra-frame coded.
Step C3: inter-frame coding is performed on the spatial parameters to generate the three-dimensional audio coded bit stream; the coding method is differential coding.
The decoding process comprises the following steps:
Step D1: inter-frame decoding is performed on the spatial parameters; the decoding method is differential decoding.
Step D2: intra-frame decoding is performed on the spatial parameters, as follows: the spatial parameters are intra-frame decoded; the intra-frame decoded spatial parameters are inverse quantized; and the original spatial parameters are restored.
Step D3: the frequency-domain representation S'(n, f) of the audio signal is transformed to the time domain to obtain the time-domain representation s'(t) of the audio signal, where S'(n, f) is S(n, f) after encoding and decoding and s'(t) is s(t) after encoding and decoding. The time-domain representation s'(t) of the audio signal containing n objects, the spatial parameters obtained in step D2, and the original numbers Index(n, f) of the audio objects to which the spatial parameters belong together constitute the decoded three-dimensional audio: its audio signal, its spatial parameters, and the audio object numbers of the spatial parameters.
Further, in step C2, the spatial parameters of different frequency bands belonging to the same audio object within the same frame are clustered; that is, the spatial parameters that share the same n and the same value of Index(n, f) but differ in f are clustered to generate the clustered spatial parameters.
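The clustering rule can be sketched as a grouping of the bands of one frame by object number. The text does not say how the single representative of each cluster is computed, so this sketch assumes a circular mean for the horizontal angle and an arithmetic mean for the elevation angle and distance; `cluster_frame` and `circular_mean_deg` are hypothetical names.

```python
import math
from collections import defaultdict


def circular_mean_deg(angles):
    """Mean of angles in degrees that respects the 0/360 wrap-around."""
    s = sum(math.sin(math.radians(a)) for a in angles)
    c = sum(math.cos(math.radians(a)) for a in angles)
    return math.degrees(math.atan2(s, c)) % 360.0


def cluster_frame(params, index):
    """Cluster the spatial parameters of one frame.

    params: one (theta, phi, r) triple per frequency band f.
    index:  Index(n, f) for the same bands, i.e. the object number of each band.
    Returns a dict mapping object number -> one representative triple.
    """
    groups = defaultdict(list)
    for triple, obj in zip(params, index):
        groups[obj].append(triple)
    clustered = {}
    for obj, members in groups.items():
        theta = circular_mean_deg([m[0] for m in members])  # assumed representative
        phi = sum(m[1] for m in members) / len(members)
        r = sum(m[2] for m in members) / len(members)
        clustered[obj] = (theta, phi, r)
    return clustered


params = [(30.0, 10.0, 100.0), (30.0, 10.0, 100.0), (120.0, 0.0, 160.0), (30.0, 10.0, 100.0)]
index = [1, 1, 2, 1]
clustered = cluster_frame(params, index)
print(len(params), "band triples ->", len(clustered), "clustered triples")
```

In this toy frame, four per-band triples collapse to two, one per audio object; the saving grows with the number of bands that share an object.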
Further, in step D2, the clustered spatial parameters of the different frequency bands of the same audio object within the same frame are mapped back to their corresponding frequency bands and thereby restored to the original spatial parameters.
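A minimal sketch of this mapping, assuming the decoder has the original Index(n, f) available as the method states; `map_to_bands` is a hypothetical name and the values are illustrative only.

```python
def map_to_bands(clustered, index):
    """Assign to every frequency band the parameter of the object it belongs to.

    clustered: dict mapping object number -> (theta, phi, r).
    index:     Index(n, f) for one frame, one object number per band.
    """
    return [clustered[obj] for obj in index]


clustered = {1: (30.0, 10.0, 100.0), 2: (120.0, 0.0, 160.0)}
index = [1, 1, 2, 1]
for f, triple in enumerate(map_to_bands(clustered, index), start=1):
    print(f, triple)
```

The restored per-band parameters equal the originals only to the extent that all bands of one object really shared the same parameter, which is the characteristic the method relies on.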
Further, in step C2, the clustered spatial parameters are quantized, where the quantization is perceptual quantization or direct quantization; the quantized spatial parameters are then intra-frame coded, where the coding is perceptual coding or direct coding.
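As an illustration of the direct-quantization option, the sketch below applies a uniform scalar quantizer to one clustered triple. The step sizes (5 degrees, 10 degrees, 10 cm) are hypothetical and are not given by the text; a perceptual quantizer would instead use non-uniform steps derived from a psychoacoustic model.

```python
# Hypothetical step sizes: horizontal angle (degrees), elevation angle (degrees), distance (cm).
STEPS = (5.0, 10.0, 10.0)


def quantize(triple, steps=STEPS):
    """Uniform (direct) quantization: map each value to the nearest integer code."""
    # Note: Python's round() resolves exact .5 ties to the nearest even integer.
    return tuple(round(v / s) for v, s in zip(triple, steps))


def dequantize(codes, steps=STEPS):
    """Inverse quantization: map integer codes back to reconstruction values."""
    return tuple(c * s for c, s in zip(codes, steps))


codes = quantize((32.0, 8.0, 123.0))
print(codes)              # (6, 1, 12)
print(dequantize(codes))  # (30.0, 10.0, 120.0)
```

The reconstruction error of each component is at most half a step, so the step sizes set the trade-off between bit rate and spatial accuracy.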
Further, in step D2, the spatial parameters are intra-frame decoded, where the decoding is perceptual decoding or direct decoding; the intra-frame decoded spatial parameters are then inverse quantized, where the inverse quantization is the inverse of perceptual quantization or the inverse of direct quantization.
An encoding and decoding system for improving the compression ratio of three-dimensional audio spatial parameters comprises an encoder and a decoder.
The encoder comprises the following modules:
A time-frequency transform module, which receives as input a three-dimensional audio signal containing n objects, the three-dimensional audio spatial parameters, and the numbers of the audio objects to which the spatial parameters belong, and transforms the three-dimensional audio time-domain signal to the frequency domain. Specifically, let the time-domain signal of the three-dimensional audio be s(t), comprising s_1(t), s_2(t), ..., s_k(t), ..., s_K(t); let the spatial parameters of the three-dimensional audio comprise a triple (θ_k(n, f), φ_k(n, f), r_k(n, f)) for each directional audio signal; and let the number of the audio object to which a spatial parameter belongs be Index(n, f). The time-domain signal s(t) is transformed to the frequency domain to obtain the frequency-domain signal S(n, f) of the three-dimensional audio, comprising S_1(n, f), S_2(n, f), ..., S_k(n, f), ..., S_K(n, f). Here s_k(t) is the time-domain representation of the k-th directional audio signal and t denotes time; S_k(n, f) is the frequency-domain representation of the k-th directional audio signal; (θ_k(n, f), φ_k(n, f), r_k(n, f)) is the spatial parameter corresponding to the k-th directional audio signal, where θ is the horizontal angle, φ is the elevation angle and r is the distance side information; k takes the values 1, 2, ..., K, where K is the total number of original directional audio signals; the value of Index(n, f) is the number of the audio object to which the spatial parameter belongs; n denotes the frame index and f denotes the frequency index.
An intra-frame coding module, which performs intra-frame coding on the input spatial parameters: it clusters the spatial parameters of different frequency bands belonging to the same audio object within the same frame, quantizes the clustered spatial parameters, and intra-frame codes the quantized spatial parameters.
An inter-frame coding module, which performs inter-frame coding on the spatial parameters to generate the three-dimensional audio coded bit stream; the coding method is differential coding.
The decoder comprises the following modules:
An inter-frame decoding module, which performs inter-frame decoding on the spatial parameters; the decoding method is differential decoding.
An intra-frame decoding module, which performs intra-frame decoding on the spatial parameters: it intra-frame decodes the spatial parameters, inverse quantizes the intra-frame decoded spatial parameters, and restores the original spatial parameters.
An inverse time-frequency transform module, which transforms the frequency-domain representation S'(n, f) of the audio signal to the time domain to obtain the time-domain representation s'(t) of the audio signal, where S'(n, f) is S(n, f) after encoding and decoding and s'(t) is s(t) after encoding and decoding. The time-domain representation s'(t) of the audio signal containing n objects, the spatial parameters obtained by the intra-frame decoding module, and the original numbers Index(n, f) of the audio objects to which the spatial parameters belong together constitute the decoded three-dimensional audio: its audio signal, its spatial parameters, and the audio object numbers of the spatial parameters.
Further, the intra-frame coding module comprises a clustering module, which clusters the spatial parameters of different frequency bands belonging to the same audio object within the same frame; that is, the spatial parameters that share the same n and the same value of Index(n, f) but differ in f are clustered to generate the clustered spatial parameters.
Further, the intra-frame decoding module comprises a restoration module, which maps the clustered spatial parameters of the different frequency bands of the same audio object within the same frame back to their corresponding frequency bands and thereby restores the original spatial parameters.
Further, the intra-frame coding module comprises a quantization module, which quantizes the clustered spatial parameters, where the quantization is perceptual quantization or direct quantization; the quantized spatial parameters are then intra-frame coded, where the coding is perceptual coding or direct coding.
Further, the intra-frame decoding module comprises an inverse quantization module. The spatial parameters are intra-frame decoded, where the decoding is perceptual decoding or direct decoding; the inverse quantization module then inverse quantizes the intra-frame decoded spatial parameters, where the inverse quantization is the inverse of perceptual quantization or the inverse of direct quantization.
The beneficial effects of the invention are as follows. Based on the characteristic that different frequency bands of the same sound source within the same frame have the same spatial parameters, the encoding side performs spatial parameter clustering, spatial parameter quantization and spatial parameter intra-frame coding, and then performs inter-frame differential coding of the spatial parameters, which further compresses the bit rate of the three-dimensional audio spatial parameters and improves their compression ratio. The decoding side decodes the three-dimensional audio bit stream: it performs inter-frame differential decoding and intra-frame decoding of the spatial parameters, inverse quantizes the intra-frame decoded spatial parameters, and maps the clustered spatial parameters back to their frequency bands, obtaining the audio signal of the three-dimensional audio, the spatial parameters, and the numbers of the audio objects to which the spatial parameters belong. By adding intra-frame encoding and decoding, the invention therefore overcomes the failure of existing spatial parameter compression methods to consider the intra-frame redundancy of the spatial parameters, further compresses the bit rate of the three-dimensional audio spatial parameters, and improves their compression ratio.
Brief description of the drawings
Fig. 1 is a flow chart of the encoding side of an embodiment of the invention;
Fig. 2 is a flow chart of the decoding side of an embodiment of the invention.
Detailed description
The technical solution of the invention is described in detail below with reference to the drawings and an embodiment (steps C1 to C3 form the encoding process, and steps D1 to D3 form the decoding process).
Referring to Fig. 1, the encoding side of the embodiment performs the following procedure:
Step C1: the time-domain signal s(t) of the three-dimensional audio is transformed to the frequency domain to obtain the frequency-domain signal S(n, f) of the three-dimensional audio.
The input of the encoding side is a three-dimensional audio signal containing n objects, the three-dimensional audio spatial parameters, and the numbers of the audio objects to which the spatial parameters belong. The time-domain representation of the audio signal of the three-dimensional audio is s(t), which consists of s_1(t), s_2(t), ..., s_K(t), where t denotes time. The spatial parameters of the three-dimensional audio, that is, the spatial parameters corresponding to each time-frequency point, consist of the triples (θ_k(n, f), φ_k(n, f), r_k(n, f)). The number of the audio object to which a spatial parameter belongs is expressed by Index(n, f). Here s_k(t) is the time-domain representation of the k-th directional audio signal, and (θ_k(n, f), φ_k(n, f), r_k(n, f)) is the spatial parameter corresponding to the k-th directional audio signal; a spatial parameter consists of the direction parameters (horizontal angle θ and elevation angle φ) and the distance parameter r. k takes the values 1, 2, ..., K, where K is the total number of original directional audio signals.
To transform the time-domain signal of the three-dimensional audio to the frequency domain, the short-time Fourier transform (STFT) can be applied to s(t) to obtain the frequency-domain signal S(n, f) of the three-dimensional audio, which consists of S_1(n, f), S_2(n, f), ..., S_K(n, f). Here S_k(n, f) is the frequency-domain representation of the k-th directional audio signal, n denotes the frame index and f denotes the frequency index. In a specific implementation, other transforms such as the MDCT or the Hilbert-Huang transform may also be used.
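The time-frequency transform of step C1 can be sketched with a plain short-time Fourier transform. The frame length, hop size and Hann window below are assumptions chosen to keep the example small; the text fixes none of them, and the frequency index here is a DFT bin rather than one of the embodiment's frequency bands.

```python
import cmath
import math


def stft(signal, frame_len=8, hop=4):
    """Return S[n][f] for frame index n and frequency bin f in 0..frame_len // 2."""
    # Periodic Hann window (an assumed choice).
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame_len) for i in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        seg = [signal[start + i] * window[i] for i in range(frame_len)]
        # Direct DFT of the windowed segment, non-negative frequencies only.
        spectrum = [
            sum(seg[i] * cmath.exp(-2j * math.pi * f * i / frame_len) for i in range(frame_len))
            for f in range(frame_len // 2 + 1)
        ]
        frames.append(spectrum)
    return frames


# A cosine with exactly two cycles per 8-sample frame should peak in bin f = 2.
s_k = [math.cos(2 * math.pi * 2 * t / 8) for t in range(32)]
S_k = stft(s_k)
peak_bin = max(range(len(S_k[0])), key=lambda f: abs(S_k[0][f]))
print(len(S_k), len(S_k[0]), peak_bin)
```

A production encoder would use an FFT and typically group bins into perceptual bands before attaching one spatial parameter per band.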
In the embodiment, K = 8 and f = 1, 2, ..., 40. The frequency-domain signals of the 8 directional audio signals s_1(t), s_2(t), ..., s_8(t) are (S_1(n, f), S_2(n, f), ..., S_8(n, f)); their corresponding spatial parameters are ((θ_1(n, f), φ_1(n, f), r_1(n, f)), ..., (θ_8(n, f), φ_8(n, f), r_8(n, f))), and the number of the object to which each of these spatial parameters belongs is Index(n, f).
Step C2: intra-frame coding is performed on the spatial parameters. When the embodiment carries out step C2, the following sub-steps are performed:
C21: the spatial parameters of different frequency bands belonging to the same audio object within the same frame are clustered; that is, the spatial parameters that share the same n and the same value of Index(n, f) but differ in f are clustered to generate the clustered spatial parameters;
C22: the clustered spatial parameters are quantized, using either perceptual quantization or direct quantization;
C23: the quantized spatial parameters are intra-frame coded, using either perceptual coding or direct coding.
Step C3: inter-frame coding is performed on the spatial parameters to generate the three-dimensional audio coded bit stream. When the embodiment carries out step C3, the coding method is differential coding.
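Step C3 can be illustrated with a minimal inter-frame differential coder operating on the quantized codes of one object. This is a sketch under assumptions: the first frame is sent as absolute codes, every later frame as the difference from the previous frame, and entropy coding of the residuals is omitted. The function names are hypothetical.

```python
def diff_encode(frames):
    """frames: one tuple of integer quantizer codes per frame, for a single object."""
    residuals = []
    prev = None
    for cur in frames:
        if prev is None:
            residuals.append(cur)  # first frame is transmitted as-is
        else:
            residuals.append(tuple(c - p for c, p in zip(cur, prev)))
        prev = cur
    return residuals


def diff_decode(residuals):
    """Invert diff_encode by accumulating the residuals frame by frame."""
    frames = []
    prev = None
    for res in residuals:
        cur = res if prev is None else tuple(p + d for p, d in zip(prev, res))
        frames.append(cur)
        prev = cur
    return frames


codes = [(6, 1, 12), (6, 1, 12), (7, 1, 12)]
residuals = diff_encode(codes)
print(residuals)  # [(6, 1, 12), (0, 0, 0), (1, 0, 0)]
assert diff_decode(residuals) == codes
```

When an object moves slowly, most residuals are zero or small, and these are cheaper to entropy code than the absolute values.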
Referring to Fig. 2, the decoding side of the embodiment performs the following procedure:
Step D1: inter-frame decoding is performed on the spatial parameters. When the embodiment carries out step D1, the decoding method is differential decoding.
Step D2: intra-frame decoding is performed on the spatial parameters. When the embodiment carries out step D2, the following sub-steps are performed:
D21: the spatial parameters are intra-frame decoded, using either perceptual decoding or direct decoding;
D22: the intra-frame decoded spatial parameters are inverse quantized, using either the inverse of perceptual quantization or the inverse of direct quantization;
D23: the clustered spatial parameters of the different frequency bands of the same audio object within the same frame are mapped back to their corresponding frequency bands and thereby restored to the original spatial parameters.
Step D3: the frequency-domain representation S'(n, f) of the audio signal is transformed to the time domain to obtain the time-domain representation s'(t) of the audio signal, where S'(n, f) is S(n, f) after encoding and decoding and s'(t) is s(t) after encoding and decoding. The time-domain representation s'(t) of the audio signal containing n objects, the spatial parameters obtained in step D2, and the original numbers Index(n, f) of the audio objects to which the spatial parameters belong together constitute the decoded three-dimensional audio: its audio signal, its spatial parameters, and the audio object numbers of the spatial parameters. In a specific implementation, loudspeakers in different configurations or headphones can then be used to reconstruct the three-dimensional sound field and thus restore the original three-dimensional audio.
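The spatial parameter part of the decoding chain (D1, D22 and D23) can be sketched end to end as follows. The step sizes, the dictionary layout and the rule that an object's first appearance is sent as absolute codes are all assumptions made for illustration; D21 is taken to be direct decoding, so it needs no separate code here.

```python
# Hypothetical step sizes: horizontal angle (degrees), elevation angle (degrees), distance (cm).
STEPS = (5.0, 10.0, 10.0)


def decode_spatial_parameters(residual_frames, index_frames, steps=STEPS):
    """Decode per-band spatial parameters for a sequence of frames.

    residual_frames[n]: dict mapping object number -> residual code triple.
    index_frames[n]:    Index(n, f), one object number per frequency band.
    Returns one list of (theta, phi, r) triples per frame, one triple per band.
    """
    prev_codes = {}
    decoded = []
    for residuals, index in zip(residual_frames, index_frames):
        # D1: inter-frame differential decoding, per object.
        codes = {}
        for obj, res in residuals.items():
            base = prev_codes.get(obj, (0, 0, 0))  # new object: residual is absolute
            codes[obj] = tuple(b + d for b, d in zip(base, res))
        prev_codes = codes
        # D22: inverse (direct) quantization.
        values = {obj: tuple(c * s for c, s in zip(code, steps)) for obj, code in codes.items()}
        # D23: map each clustered parameter back to the bands of its object.
        decoded.append([values[obj] for obj in index])
    return decoded


residual_frames = [{1: (6, 1, 12), 2: (24, 0, 16)}, {1: (0, 0, 0), 2: (1, 0, 0)}]
index_frames = [[1, 1, 2], [1, 2, 2]]
for n, bands in enumerate(decode_spatial_parameters(residual_frames, index_frames)):
    print(n, bands)
```

In the second frame only object 2 has moved (by one azimuth step), yet both of its bands receive the updated parameter, because the mapping is driven entirely by Index(n, f).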
The embodiment transforms the 8 directional audio signals after encoding and decoding, (S'_1(n, f), S'_2(n, f), ..., S'_8(n, f)), to the time domain to obtain 8 directional audio signals s'_1(t), s'_2(t), ..., s'_8(t). Together with the decoded spatial parameters and the original numbers Index(n, f) of the audio objects to which the spatial parameters belong, they constitute the decoded three-dimensional audio containing n objects: its audio signal, its spatial parameters, and the audio object numbers of the spatial parameters. The embodiment uses headphones to play back the three-dimensional audio signal with distance side information. Headphone reproduction of three-dimensional audio requires a head-related transfer function (HRTF) database. The PKU&IOA HRTF database was measured in both the far field and the near field: the distance r ranges from 20 cm to 160 cm, and the resolutions of the horizontal angle and the elevation angle are 5° and 10° respectively. The PKU&IOA HRTF database is therefore selected to reconstruct the three-dimensional audio that has undergone intra-frame and inter-frame compression.
Experimental comparison shows that the three-dimensional audio compression method with added intra-frame coding compresses better than the original method with inter-frame coding only: the compression ratio is higher while the quality of the reconstructed audio is maintained. Because the added intra-frame coding removes intra-frame redundancy, the method improves the compression ratio of the three-dimensional spatial parameters and reduces their bit rate while preserving the quality of the reconstructed three-dimensional audio.
The method provided by the invention can be implemented as an automatically running software process, or as a corresponding modular system. The encoding and decoding system provided by the invention for improving the compression ratio of three-dimensional audio spatial parameters comprises an encoder and a decoder. The encoder comprises the following modules:
A time-frequency transform module, which receives as input a three-dimensional audio signal containing n objects, the three-dimensional audio spatial parameters, and the numbers of the audio objects to which the spatial parameters belong, and transforms the three-dimensional audio time-domain signal to the frequency domain. Specifically, let the time-domain signal of the three-dimensional audio be s(t), comprising s_1(t), s_2(t), ..., s_k(t), ..., s_K(t); let the spatial parameters of the three-dimensional audio comprise a triple (θ_k(n, f), φ_k(n, f), r_k(n, f)) for each directional audio signal; and let the number of the audio object to which a spatial parameter belongs be Index(n, f). The time-domain signal s(t) is transformed to the frequency domain to obtain the frequency-domain signal S(n, f) of the three-dimensional audio, comprising S_1(n, f), S_2(n, f), ..., S_k(n, f), ..., S_K(n, f). Here s_k(t) is the time-domain representation of the k-th directional audio signal and t denotes time; S_k(n, f) is the frequency-domain representation of the k-th directional audio signal; (θ_k(n, f), φ_k(n, f), r_k(n, f)) is the spatial parameter corresponding to the k-th directional audio signal, where θ is the horizontal angle, φ is the elevation angle and r is the distance side information; k takes the values 1, 2, ..., K, where K is the total number of original directional audio signals; the value of Index(n, f) is the number of the audio object to which the spatial parameter belongs; n denotes the frame index and f denotes the frequency index.
An intra-frame coding module, which performs intra-frame coding on the input spatial parameters: it clusters the spatial parameters of different frequency bands belonging to the same audio object within the same frame, quantizes the clustered spatial parameters, and intra-frame codes the quantized spatial parameters.
An inter-frame coding module, which performs inter-frame coding on the spatial parameters to generate the three-dimensional audio coded bit stream; the coding method is differential coding.
The decoder comprises the following modules:
An inter-frame decoding module, which performs inter-frame decoding on the spatial parameters; the decoding method is differential decoding.
An intra-frame decoding module, which performs intra-frame decoding on the spatial parameters: it intra-frame decodes the spatial parameters, inverse quantizes the intra-frame decoded spatial parameters, and restores the original spatial parameters.
An inverse time-frequency transform module, which transforms the frequency-domain representation S'(n, f) of the audio signal to the time domain to obtain the time-domain representation s'(t) of the audio signal, where S'(n, f) is S(n, f) after encoding and decoding and s'(t) is s(t) after encoding and decoding. The time-domain representation s'(t) of the audio signal containing n objects, the spatial parameters obtained by the intra-frame decoding module, and the original numbers Index(n, f) of the audio objects to which the spatial parameters belong together constitute the decoded three-dimensional audio: its audio signal, its spatial parameters, and the audio object numbers of the spatial parameters.
The intra-frame coding module comprises a clustering module, which clusters the spatial parameters of different frequency bands belonging to the same audio object within the same frame; that is, the spatial parameters that share the same n and the same value of Index(n, f) but differ in f are clustered to generate the clustered spatial parameters.
The intra-frame decoding module comprises a restoration module, which maps the clustered spatial parameters of the different frequency bands of the same audio object within the same frame back to their corresponding frequency bands and thereby restores the original spatial parameters.
The intra-frame coding module comprises a quantization module, which quantizes the clustered spatial parameters, where the quantization is perceptual quantization or direct quantization; the quantized spatial parameters are then intra-frame coded, where the coding is perceptual coding or direct coding.
The intra-frame decoding module comprises an inverse quantization module. The spatial parameters are intra-frame decoded, where the decoding is perceptual decoding or direct decoding; the inverse quantization module then inverse quantizes the intra-frame decoded spatial parameters, where the inverse quantization is the inverse of perceptual quantization or the inverse of direct quantization.
The implementation of each module corresponds to the respective method step and is not repeated here.
The specific embodiment described herein merely illustrates the spirit of the invention. Those skilled in the art to which the invention belongs may make various modifications or additions to the described embodiment, or substitute it in similar ways, without departing from the spirit of the invention or exceeding the scope defined by the appended claims.

Claims (10)

1. the decoding method being used for improving three-dimensional audio spatial parameter compression ratio, it is characterised in that include cataloged procedure With decoding process, described cataloged procedure comprises the following steps:
Step C1, inputs belonging to three-dimensional sound signal, three-dimensional audio spatial parameter and the spatial parameter including comprising n object The numbering of audio object, transforms to frequency domain by three-dimensional audio time-domain signal, specific as follows,
If the time-domain signal of three-dimensional audio is s (t), described s (t) includes s1(t)、s2(t)、sk(t)…、sK(t), three-dimensional audio Spatial parameterDescribedIncluding The numbered Index of audio object belonging to spatial parameter (n, f);Time-domain signal s (t) of three-dimensional audio is become Changing to frequency domain, (n, f), (n f) includes S to described S to obtain the frequency-region signal S of three-dimensional audio1(n,f)、S2(n,f)、Sk(n, f)…、SK(n,f);Wherein, skT () is that the time domain of kth aeoplotropism audio signal is expressed, t express time;Sk(n f) is kth The frequency domain presentation of individual aeoplotropism audio signal;Represent the spatial parameter that kth aeoplotropism audio signal is corresponding, θ For horizontal angle,For elevation angle, r is distance side information;The value of k is 1,2 ..., K, K are original aeoplotropism audio signal Sum;(n, value f) is the numbering of audio object belonging to spatial parameter to Index;N represents frame index, and f represents frequency indices;
Step C2, carries out intraframe coding to the spatial parameter of input, it is achieved as follows, to belonging to same audio object in same frame The spatial parameter of different frequency bands clusters;To the spatial parameter after clusterQuantify;To the space after quantifying Parameter carries out intraframe coding;
Step C3, carries out interframe encode to spatial parameter, generates three-dimensional audio encoding code stream, and coded method is differential coding;
Described decoding process comprises the following steps;
Step D1, carries out decoding inter frames to spatial parameter, and coding/decoding method is differential decoding;
Step D2, carries out intraframe decoder to spatial parameter, it is achieved as follows, and spatial parameter is carried out intraframe decoder;To intraframe decoder After spatial parameter carry out inverse quantization;Reduce original spatial parameter
Step D3: transform the frequency-domain representation $S'(n, f)$ of the audio signal to the time domain to obtain the time-domain representation $s'(t)$ of the audio signal, where $S'(n, f)$ is the signal $S(n, f)$ after encoding and decoding and $s'(t)$ is the signal $s(t)$ after encoding and decoding; the time-domain representation $s'(t)$ of the audio signal comprising n objects, the spatial parameters obtained in step D2, and the numbers $\mathrm{Index}(n, f)$ of the audio objects to which the original spatial parameters belong together constitute the decoded three-dimensional audio signal comprising n objects, its spatial parameters, and the numbers of the audio objects to which the spatial parameters belong.
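Steps C3 and D1 name differential coding as the inter-frame method but do not fix its details. The following is a minimal sketch under stated assumptions, not the patented implementation: each frame of one audio object is assumed to be already reduced to a tuple of integer quantization codes, the first frame is sent as-is, and every later frame is sent as the component-wise difference from the previous frame. The function names `diff_encode` and `diff_decode` are hypothetical.

```python
def diff_encode(frames):
    """Inter-frame differential coding of one object's quantized parameters.

    frames: list of integer tuples, one tuple per frame (assumed layout).
    The first frame is kept unchanged; later frames become differences.
    """
    encoded = []
    prev = None
    for cur in frames:
        if prev is None:
            encoded.append(tuple(cur))
        else:
            encoded.append(tuple(c - p for c, p in zip(cur, prev)))
        prev = cur
    return encoded


def diff_decode(encoded):
    """Inverse of diff_encode: accumulate the differences frame by frame."""
    decoded = []
    prev = None
    for item in encoded:
        if prev is None:
            cur = tuple(item)
        else:
            cur = tuple(d + p for d, p in zip(item, prev))
        decoded.append(cur)
        prev = cur
    return decoded


# Usage: a source that barely moves yields mostly zero differences.
frames = [(6, 2, 6), (6, 2, 6), (7, 2, 6)]
residuals = diff_encode(frames)
restored = diff_decode(residuals)
```

When an object moves slowly, most differences are zero or small, which is what makes them cheaper to code than the raw values.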
2. The encoding and decoding method for improving the three-dimensional audio spatial parameter compression ratio according to claim 1, characterized in that:
In step C2, the spatial parameters of the different frequency bands that belong to the same audio object in the same frame are clustered; that is, the spatial parameters for which $n$ is the same and the value of $\mathrm{Index}(n, f)$ is the same, but $f$ differs, are clustered to generate the clustered spatial parameters.
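The clustering rule in this claim groups the bands of one frame by object number. The following is a minimal sketch of that grouping, assuming that each group is represented by the component-wise mean of its members; the claim does not state how the representative is formed, so the mean is an assumption, and angle wrap-around is ignored. The name `cluster_frame` and the tuple layout `(theta, phi, r)` are hypothetical, with `phi` standing in for the elevation angle.

```python
from collections import defaultdict


def cluster_frame(params, index):
    """Cluster one frame's per-band spatial parameters by audio object.

    params: list of (theta, phi, r) tuples, one per frequency band f.
    index:  list of object numbers Index(n, f), one per frequency band f.
    Returns {object_number: (theta, phi, r)}, one representative per object.
    """
    groups = defaultdict(list)
    for f, obj in enumerate(index):
        groups[obj].append(params[f])

    clustered = {}
    for obj, members in groups.items():
        count = len(members)
        # Assumed representative: component-wise mean of the group.
        clustered[obj] = tuple(
            sum(m[i] for m in members) / count for i in range(3)
        )
    return clustered


# Usage: four bands, three of which belong to object 1.
frame_params = [(30.0, 10.0, 2.0), (30.0, 10.0, 2.0),
                (-45.0, 0.0, 1.5), (30.0, 10.0, 2.0)]
frame_index = [1, 1, 2, 1]
clustered = cluster_frame(frame_params, frame_index)
```

Here four per-band parameters collapse to two, one per object, so only two parameter sets need to be quantized and coded for the frame.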
3. The encoding and decoding method for improving the three-dimensional audio spatial parameter compression ratio according to claim 1, characterized in that:
In step D2, the clustered spatial parameters of the different frequency bands of the same audio object in the same frame are mapped back to their corresponding frequency bands and thereby restored to the original spatial parameters.
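The restoration in this claim can be sketched as a lookup: every frequency band receives the clustered parameter of the object it belongs to. The sketch below assumes the clustered parameters are held as a dictionary keyed by object number and that the per-band object numbers $\mathrm{Index}(n, f)$ are available at the decoder, as the claim implies; the name `restore_frame` is hypothetical.

```python
def restore_frame(clustered, index):
    """Map each object's clustered parameter back to its frequency bands.

    clustered: {object_number: (theta, phi, r)} for one frame (assumed layout).
    index:     list of object numbers Index(n, f), one per frequency band f.
    Returns a list with one (theta, phi, r) tuple per frequency band.
    """
    return [clustered[obj] for obj in index]


# Usage: two clustered parameters expand back to four bands.
clustered_params = {1: (30.0, 10.0, 2.0), 2: (-45.0, 0.0, 1.5)}
band_index = [1, 1, 2, 1]
per_band = restore_frame(clustered_params, band_index)
```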
4. The encoding and decoding method for improving the three-dimensional audio spatial parameter compression ratio according to claim 1, characterized in that:
In step C2, the clustered spatial parameters are quantized, the quantization being perceptual quantization or direct quantization; the quantized spatial parameters are intra-frame coded, the coding being perceptual coding or direct coding.
5. The encoding and decoding method for improving the three-dimensional audio spatial parameter compression ratio according to claim 1, characterized in that:
In step D2, the spatial parameters are intra-frame decoded, the decoding being perceptual decoding or direct decoding; the intra-frame-decoded spatial parameters are inverse-quantized, the inverse quantization being the inverse of perceptual quantization or the inverse of direct quantization.
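The claims allow either perceptual or direct quantization without specifying the quantizer. As an illustration of the direct case only, the sketch below assumes a uniform scalar quantizer applied separately to the horizontal angle, the elevation angle, and the distance; the step sizes (5 degrees, 5 degrees, 0.25) and the names `quantize` and `dequantize` are hypothetical and are not taken from the patent.

```python
# Assumed step sizes for (horizontal angle, elevation angle, distance).
DEFAULT_STEPS = (5.0, 5.0, 0.25)


def quantize(param, steps=DEFAULT_STEPS):
    """Uniform scalar quantization: map each component to an integer code."""
    # Python's round() sends exact ties to the nearest even integer.
    return tuple(round(v / s) for v, s in zip(param, steps))


def dequantize(codes, steps=DEFAULT_STEPS):
    """Inverse quantization: map integer codes back to reconstruction values."""
    return tuple(c * s for c, s in zip(codes, steps))


# Usage: the reconstruction lies within half a step of the input.
codes = quantize((32.0, 9.0, 1.6))
reconstructed = dequantize(codes)
```

A perceptual quantizer would instead vary the step size with direction or distance according to a hearing model; that variant is not sketched here.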
6. An encoding and decoding system for improving the three-dimensional audio spatial parameter compression ratio, characterized in that it comprises an encoder and a decoder, the encoder comprising the following modules:
A time-frequency transform module, for inputting the three-dimensional audio signal comprising n objects, the three-dimensional audio spatial parameters, and the numbers of the audio objects to which the spatial parameters belong, and for transforming the three-dimensional audio time-domain signal to the frequency domain. Specifically, let the time-domain signal of the three-dimensional audio be $s(t)$, where $s(t)$ comprises $s_1(t), s_2(t), \ldots, s_k(t), \ldots, s_K(t)$; the spatial parameters of the three-dimensional audio comprise one spatial parameter for each of these signals, and the number of the audio object to which a spatial parameter belongs is $\mathrm{Index}(n, f)$. The time-domain signal $s(t)$ of the three-dimensional audio is transformed to the frequency domain to obtain the frequency-domain signal $S(n, f)$, where $S(n, f)$ comprises $S_1(n, f), S_2(n, f), \ldots, S_k(n, f), \ldots, S_K(n, f)$. Here $s_k(t)$ is the time-domain representation of the k-th anisotropic audio signal and $t$ denotes time; $S_k(n, f)$ is the frequency-domain representation of the k-th anisotropic audio signal; the spatial parameter corresponding to the k-th anisotropic audio signal consists of the horizontal angle $\theta$, the elevation angle, and the distance side information $r$; $k$ takes the values $1, 2, \ldots, K$, where $K$ is the total number of original anisotropic audio signals; the value of $\mathrm{Index}(n, f)$ is the number of the audio object to which the spatial parameter belongs; $n$ denotes the frame index and $f$ denotes the frequency index;
An intra-frame coding module, for performing intra-frame coding on the input spatial parameters, including clustering the spatial parameters of the different frequency bands that belong to the same audio object in the same frame; quantizing the clustered spatial parameters; and performing intra-frame coding on the quantized spatial parameters;
An inter-frame coding module, for performing inter-frame coding on the spatial parameters to generate the three-dimensional audio coded bitstream, the coding method being differential coding;
The decoder comprises the following modules:
An inter-frame decoding module, for performing inter-frame decoding on the spatial parameters, the decoding method being differential decoding;
An intra-frame decoding module, for performing intra-frame decoding on the spatial parameters, including performing intra-frame decoding on the spatial parameters; inverse-quantizing the intra-frame-decoded spatial parameters; and restoring the original spatial parameters;
An inverse time-frequency transform module, for transforming the frequency-domain representation $S'(n, f)$ of the audio signal to the time domain to obtain the time-domain representation $s'(t)$ of the audio signal, where $S'(n, f)$ is the signal $S(n, f)$ after encoding and decoding and $s'(t)$ is the signal $s(t)$ after encoding and decoding; the time-domain representation $s'(t)$ of the audio signal comprising n objects, the spatial parameters obtained by the intra-frame decoding module, and the numbers $\mathrm{Index}(n, f)$ of the audio objects to which the original spatial parameters belong together constitute the decoded three-dimensional audio signal comprising n objects, its spatial parameters, and the numbers of the audio objects to which the spatial parameters belong.
7. The encoding and decoding system for improving the three-dimensional audio spatial parameter compression ratio according to claim 6, characterized in that: the intra-frame coding module includes a clustering module, the clustering module being used to cluster the spatial parameters of the different frequency bands that belong to the same audio object in the same frame; that is, the spatial parameters for which $n$ is the same and the value of $\mathrm{Index}(n, f)$ is the same, but $f$ differs, are clustered to generate the clustered spatial parameters.
8. The encoding and decoding system for improving the three-dimensional audio spatial parameter compression ratio according to claim 6, characterized in that: the intra-frame decoding module includes a restoration module, the restoration module being used to map the clustered spatial parameters of the different frequency bands of the same audio object in the same frame back to their corresponding frequency bands, thereby restoring the original spatial parameters.
9. The encoding and decoding system for improving the three-dimensional audio spatial parameter compression ratio according to claim 6, characterized in that: the intra-frame coding module includes a quantization module, the quantization module being used to quantize the clustered spatial parameters, the quantization being perceptual quantization or direct quantization; the quantized spatial parameters are intra-frame coded, the coding being perceptual coding or direct coding.
10. The encoding and decoding system for improving the three-dimensional audio spatial parameter compression ratio according to claim 6, characterized in that: the intra-frame decoding module includes an inverse quantization module, the inverse quantization module being used to perform intra-frame decoding on the spatial parameters, the decoding being perceptual decoding or direct decoding; the intra-frame-decoded spatial parameters are inverse-quantized, the inverse quantization being the inverse of perceptual quantization or the inverse of direct quantization.
CN201610541939.8A 2016-07-11 2016-07-11 Encoding and decoding method and system for improving three-dimensional audio spatial parameter compression ratio Active CN106023999B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610541939.8A CN106023999B (en) 2016-07-11 2016-07-11 For improving the decoding method and system of three-dimensional audio spatial parameter compression ratio

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610541939.8A CN106023999B (en) 2016-07-11 2016-07-11 For improving the decoding method and system of three-dimensional audio spatial parameter compression ratio

Publications (2)

Publication Number Publication Date
CN106023999A true CN106023999A (en) 2016-10-12
CN106023999B CN106023999B (en) 2019-06-11

Family

ID=57108555

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610541939.8A Active CN106023999B (en) 2016-07-11 2016-07-11 Encoding and decoding method and system for improving three-dimensional audio spatial parameter compression ratio

Country Status (1)

Country Link
CN (1) CN106023999B (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020043935A1 (en) 2018-08-31 2020-03-05 Nokia Technologies Oy Spatial parameter signalling
WO2020089523A1 (en) * 2018-11-01 2020-05-07 Nokia Technologies Oy Apparatus, methods and computer programs for encoding spatial metadata
CN108206022B (en) * 2016-12-16 2020-12-18 南京青衿信息科技有限公司 Codec for transmitting three-dimensional acoustic signals by using AES/EBU channel and coding and decoding method thereof
WO2021032909A1 (en) * 2019-08-16 2021-02-25 Nokia Technologies Oy Quantization of spatial audio direction parameters
RU2763155C2 (en) * 2017-11-17 2021-12-27 Фраунхофер-Гезелльшафт Цур Фердерунг Дер Ангевандтен Форшунг Е.Ф. Apparatus and method for encoding or decoding the directional audio encoding parameters using quantisation and entropy encoding
WO2022129672A1 (en) * 2020-12-15 2022-06-23 Nokia Technologies Oy Quantizing spatial audio parameters
CN115662448A (en) * 2022-10-17 2023-01-31 深圳市超时代软件有限公司 Method and device for converting audio data coding format
US12020713B2 (en) 2019-08-16 2024-06-25 Nokia Technologies Oy Quantization of spatial audio direction parameters

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20070025907A (en) * 2005-08-30 2007-03-08 엘지전자 주식회사 Method of effective bitstream composition for the parameter band number of channel conversion module in multi-channel audio coding
CN101521013A (en) * 2009-04-08 2009-09-02 武汉大学 Spatial audio parameter bidirectional interframe predictive coding and decoding devices
CN101609674A (en) * 2008-06-20 2009-12-23 华为技术有限公司 Decoding method, device and system
US7974287B2 (en) * 2006-02-23 2011-07-05 Lg Electronics Inc. Method and apparatus for processing an audio signal
CN102177542A (en) * 2008-10-10 2011-09-07 艾利森电话股份有限公司 Energy conservative multi-channel audio coding
CN103165134A (en) * 2013-04-02 2013-06-19 武汉大学 Coding and decoding device of audio signal high frequency parameter
CN103400582A (en) * 2013-08-13 2013-11-20 武汉大学 Encoding and decoding method and system for multi-channel three-dimensional voice frequency
CN103928030A (en) * 2014-04-30 2014-07-16 武汉大学 Gradable audio coding system and method based on sub-band space attention measure
CN104064194A (en) * 2014-06-30 2014-09-24 武汉大学 Parameter coding/decoding method and parameter coding/decoding system used for improving sense of space and sense of distance of three-dimensional audio frequency

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20070025907A (en) * 2005-08-30 2007-03-08 엘지전자 주식회사 Method of effective bitstream composition for the parameter band number of channel conversion module in multi-channel audio coding
US7974287B2 (en) * 2006-02-23 2011-07-05 Lg Electronics Inc. Method and apparatus for processing an audio signal
CN101609674A (en) * 2008-06-20 2009-12-23 华为技术有限公司 Decoding method, device and system
CN102177542A (en) * 2008-10-10 2011-09-07 艾利森电话股份有限公司 Energy conservative multi-channel audio coding
CN101521013A (en) * 2009-04-08 2009-09-02 武汉大学 Spatial audio parameter bidirectional interframe predictive coding and decoding devices
CN103165134A (en) * 2013-04-02 2013-06-19 武汉大学 Coding and decoding device of audio signal high frequency parameter
CN103400582A (en) * 2013-08-13 2013-11-20 武汉大学 Encoding and decoding method and system for multi-channel three-dimensional voice frequency
CN103928030A (en) * 2014-04-30 2014-07-16 武汉大学 Gradable audio coding system and method based on sub-band space attention measure
CN104064194A (en) * 2014-06-30 2014-09-24 武汉大学 Parameter coding/decoding method and parameter coding/decoding system used for improving sense of space and sense of distance of three-dimensional audio frequency

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
胡瑞敏等 (Hu Ruimin et al.): "AVS-P10移动音频编解码标准与关键技术" (AVS-P10 mobile audio codec standard and key technologies), 《电视技术》 *

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108206022B (en) * 2016-12-16 2020-12-18 南京青衿信息科技有限公司 Codec for transmitting three-dimensional acoustic signals by using AES/EBU channel and coding and decoding method thereof
RU2763155C2 (en) * 2017-11-17 2021-12-27 Фраунхофер-Гезелльшафт Цур Фердерунг Дер Ангевандтен Форшунг Е.Ф. Apparatus and method for encoding or decoding the directional audio encoding parameters using quantisation and entropy encoding
RU2763313C2 (en) * 2017-11-17 2021-12-28 Фраунхофер-Гезелльшафт Цур Фердерунг Дер Ангевандтен Форшунг Е.Ф. Apparatus and method for encoding or decoding the directional audio encoding parameters using various time and frequency resolutions
US11367454B2 (en) 2017-11-17 2022-06-21 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus and method for encoding or decoding directional audio coding parameters using quantization and entropy coding
US11783843B2 (en) 2017-11-17 2023-10-10 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus and method for encoding or decoding directional audio coding parameters using different time/frequency resolutions
WO2020043935A1 (en) 2018-08-31 2020-03-05 Nokia Technologies Oy Spatial parameter signalling
CN112970062A (en) * 2018-08-31 2021-06-15 诺基亚技术有限公司 Spatial parameter signaling
JP7208385B2 (en) 2018-11-01 2023-01-18 ノキア テクノロジーズ オーユー Apparatus, method and computer program for encoding spatial metadata
WO2020089523A1 (en) * 2018-11-01 2020-05-07 Nokia Technologies Oy Apparatus, methods and computer programs for encoding spatial metadata
JP2022506581A (en) * 2018-11-01 2022-01-17 ノキア テクノロジーズ オーユー Devices, methods and computer programs for encoding spatial metadata
WO2021032909A1 (en) * 2019-08-16 2021-02-25 Nokia Technologies Oy Quantization of spatial audio direction parameters
US12020713B2 (en) 2019-08-16 2024-06-25 Nokia Technologies Oy Quantization of spatial audio direction parameters
WO2022129672A1 (en) * 2020-12-15 2022-06-23 Nokia Technologies Oy Quantizing spatial audio parameters
CN115662448A (en) * 2022-10-17 2023-01-31 深圳市超时代软件有限公司 Method and device for converting audio data coding format
CN115662448B (en) * 2022-10-17 2023-10-20 深圳市超时代软件有限公司 Method and device for converting audio data coding format

Also Published As

Publication number Publication date
CN106023999B (en) 2019-06-11

Similar Documents

Publication Publication Date Title
CN106023999B (en) Encoding and decoding method and system for improving three-dimensional audio spatial parameter compression ratio
ES2899286T3 (en) Temporal Envelope Configuration for Audio Spatial Encoding Using Frequency Domain Wiener Filtering
CN111226442B (en) Method of configuring transforms for video compression and computer-readable storage medium
CN101120615B (en) Multi-channel encoder/decoder and related encoding and decoding method
CN106415714A (en) Coding independent frames of ambient higher-order ambisonic coefficients
CN106463121A (en) Higher order ambisonics signal compression
HRP20140400T1 (en) Decoding of multichannel aufio encoded bit streams using adaptive hybrid transformation
CN103108187B (en) The coded method of a kind of 3 D video, coding/decoding method, encoder
CN104064194A (en) Parameter coding/decoding method and parameter coding/decoding system used for improving sense of space and sense of distance of three-dimensional audio frequency
JP2013513330A5 (en)
TW200935403A (en) Technique for encoding/decoding of codebook indices for quantized MDCT spectrum in scalable speech and audio codecs
CN101371447A (en) Complex-transform channel coding with extended-band frequency coding
US11776552B2 (en) Methods and apparatus for decoding encoded audio signal(s)
TWI702594B (en) Backward-compatible integration of high frequency reconstruction techniques for audio signals
CN109448741A (en) A kind of 3D audio coding, coding/decoding method and device
CN109887517A (en) Method, decoder and the computer-readable medium that audio scene is decoded
TW201503113A (en) Encoding device and method, decoding device and method, and program
JP2020074052A (en) Backward compatible integration of harmonic converter for high frequency reconstruction of audio signal
WO2015096789A1 (en) Method and device for use in vector quantization encoding/decoding of audio signal
CN103065634A (en) Three-dimensional audio space parameter quantification method based on perception characteristic
JP6094322B2 (en) Orthogonal transformation device, orthogonal transformation method, computer program for orthogonal transformation, and audio decoding device
CN104347077B (en) A kind of stereo coding/decoding method
CN112365896A (en) Object-oriented encoding method based on stack type sparse self-encoder
KR102546098B1 (en) Apparatus and method for encoding / decoding audio based on block
CN105336334B (en) Multi-channel sound signal coding method, decoding method and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant