CN106023999A

CN106023999A - Encoding and decoding method and system for improving three-dimensional audio spatial parameter compression ratio

Info

Publication number: CN106023999A
Application number: CN201610541939.8A
Authority: CN
Inventors: 胡瑞敏; 杨乘; 王晓晨; 杜鹏慧; 苏柳月; 武庭照; 陈玮; 杨玉红
Original assignee: Wuhan University WHU
Current assignee: Wuhan University WHU
Priority date: 2016-07-11
Filing date: 2016-07-11
Publication date: 2016-10-12
Anticipated expiration: 2036-07-11
Also published as: CN106023999B

Abstract

The invention provides an encoding and decoding method and system for improving the compression ratio of three-dimensional audio spatial parameters. The encoder takes as input the audio signal of the three-dimensional audio, its spatial side information (the spatial parameters), and the index of the audio object to which each spatial parameter belongs. During encoding, the spatial parameters undergo clustering, quantization, intra-frame encoding and inter-frame differential encoding in sequence; during decoding, they undergo inter-frame differential decoding, intra-frame decoding, inverse quantization and spatial parameter mapping in sequence. Because the spatial parameters of different sub-bands of the same sound source within the same frame are similar, clustering them removes intra-frame redundancy and yields a high compression ratio for the three-dimensional audio spatial parameters.

Description

Encoding and decoding method and system for improving the compression ratio of three-dimensional audio spatial parameters

Technical field

The present invention relates to the field of digital audio and, addressing the need to improve the compression ratio of three-dimensional audio spatial parameters, in particular to an encoding and decoding method and system for improving that compression ratio.

Background technology

At the end of 2009, the three-dimensional film Avatar topped the box office in more than 30 countries worldwide; by early September 2010 its cumulative worldwide box office had exceeded 2.7 billion US dollars. Avatar achieved such a remarkable box-office result because the brand-new three-dimensional special-effects technology it used delivered a striking sensory impact to audiences.

To give listeners a more immersive experience and a more realistic sound field in 3D space, spatial audio object coding (SAOC), directional audio coding (DirAC) and spatially squeezed surround audio coding (S3AC) have been proposed. As 3D spatial resolution increases and the number of channels or objects grows, the bit rate of the spatial parameters rises sharply. For example, in the spatial localization quantization point (SLQP) method of S3AC, the spatial parameter bit rate is 18 kbps per object, so 16 sound objects require 16 × 18 = 288 kbps for the spatial parameters alone. Reducing the bit rate of the spatial parameters in 3D audio coding is therefore an urgent need.
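The bit-rate figure quoted above is a simple product of the per-object rate and the object count. The following minimal sketch only restates that arithmetic; the function name and its default argument are illustrative and do not come from the patent.

```python
def spatial_parameter_bitrate_kbps(num_objects: int, kbps_per_object: float = 18.0) -> float:
    """Total spatial-parameter bit rate when every object costs the same rate.

    The 18 kbps default is the per-object SLQP figure quoted in the text.
    """
    return num_objects * kbps_per_object


# 16 sound objects at 18 kbps each
total = spatial_parameter_bitrate_kbps(16)  # 288.0
```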

Existing spatial parameter compression methods, namely BCC, MPEG Surround and S3AC, exploit the correlation between adjacent frames and reduce the spatial parameter bit rate by differential coding. These methods remove the inter-frame redundancy of the spatial parameters in the same frequency band of adjacent frames, but the intra-frame redundancy among the spatial parameters of different frequency bands of the same sound source within one frame remains. If this intra-frame redundancy can be removed, the spatial parameter bit rate can be compressed further.

Summary of the invention

The object of the present invention is to address the shortcomings of the above prior art in compressing 3D audio spatial parameters by providing a new object-based spatial parameter compression method for 3D audio recordings. The method relies on the property that the different frequency bands of the same sound source within the same frame share the same spatial parameters; it can therefore remove, to a large extent, the intra-frame redundancy of spatial parameters that existing spatial parameter compression methods ignore, and thus further reduce the spatial parameter bit rate.

The technical solution of the present invention provides an encoding and decoding method for improving the compression ratio of three-dimensional audio spatial parameters, comprising an encoding process and a decoding process. The encoding process comprises the following steps:

Step C1: the input comprises a three-dimensional audio signal containing n objects, the three-dimensional audio spatial parameters, and the index of the audio object to which each spatial parameter belongs. The three-dimensional audio time-domain signal is transformed to the frequency domain, as follows.

Let the time-domain signal of the three-dimensional audio be s(t), where s(t) comprises s₁(t), s₂(t), …, s_k(t), …, s_K(t); let the spatial parameters of the three-dimensional audio be (θ, φ, r), comprising (θ₁, φ₁, r₁), (θ₂, φ₂, r₂), …, (θ_K, φ_K, r_K); and let the index of the audio object to which a spatial parameter belongs be Index(n,f). The time-domain signal s(t) is transformed to the frequency domain to obtain the frequency-domain signal S(n,f) of the three-dimensional audio, where S(n,f) comprises S₁(n,f), S₂(n,f), …, S_k(n,f), …, S_K(n,f). Here s_k(t) is the time-domain representation of the k-th directional audio signal and t denotes time; S_k(n,f) is the frequency-domain representation of the k-th directional audio signal; (θ_k, φ_k, r_k) is the spatial parameter corresponding to the k-th directional audio signal, where θ is the horizontal angle, φ is the elevation angle and r is the distance side information; k takes the values 1, 2, …, K, where K is the total number of original directional audio signals; the value of Index(n,f) is the index of the audio object to which the spatial parameter belongs; n denotes the frame index and f denotes the frequency index;

Step C2: perform intra-frame encoding on the input spatial parameters, implemented as follows: cluster the spatial parameters of the different frequency bands that belong to the same audio object within the same frame; quantize the clustered spatial parameters; and perform intra-frame encoding on the quantized spatial parameters;

Step C3: perform inter-frame encoding on the spatial parameters to generate the three-dimensional audio coded bitstream; the encoding method is differential encoding;

The decoding process comprises the following steps:

Step D1: perform inter-frame decoding on the spatial parameters; the decoding method is differential decoding;

Step D2: perform intra-frame decoding on the spatial parameters, implemented as follows: perform intra-frame decoding on the spatial parameters; inverse-quantize the intra-frame-decoded spatial parameters; and restore the original spatial parameters;

Step D3: transform the frequency-domain representation S'(n,f) of the audio signal to the time domain to obtain the time-domain representation s'(t) of the audio signal, where S'(n,f) is S(n,f) after encoding and decoding and s'(t) is s(t) after encoding and decoding. The time-domain representation s'(t) of the audio signal containing n objects, the spatial parameters obtained in step D2, and the index Index(n,f) of the audio object to which each original spatial parameter belongs together constitute the decoded output: the audio signal of the three-dimensional audio containing n objects, the spatial parameters, and the audio object index of each spatial parameter.

Further, in step C2, the spatial parameters of the different frequency bands that belong to the same audio object within the same frame are clustered: spatial parameters that have the same n and the same value of Index(n,f) but different f are clustered together, generating the clustered spatial parameters.
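The clustering described above can be sketched as grouping the per-band parameters of one frame by their Index(n,f) value. This is only an illustrative sketch: the function name `cluster_frame`, the tuple layout, and the choice of the component-wise mean as the cluster representative are assumptions, since the text does not specify how the representative is formed.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Param = Tuple[float, float, float]  # (theta, phi, r): horizontal angle, elevation angle, distance


def cluster_frame(params: List[Param], index: List[int]) -> Dict[int, Param]:
    """Group the per-band parameters of ONE frame by audio-object index.

    params[f] is the spatial parameter of band f and index[f] is Index(n, f).
    Assumption: each cluster is represented by the component-wise mean, which
    is only sensible when the angles of a cluster do not straddle the
    0/360 degree wrap-around.
    """
    groups: Dict[int, List[Param]] = defaultdict(list)
    for param, obj in zip(params, index):
        groups[obj].append(param)

    clustered: Dict[int, Param] = {}
    for obj, members in groups.items():
        count = len(members)
        clustered[obj] = (
            sum(m[0] for m in members) / count,
            sum(m[1] for m in members) / count,
            sum(m[2] for m in members) / count,
        )
    return clustered


# Three bands, two objects: bands 0 and 1 belong to object 1, band 2 to object 2.
frame_params = [(30.0, 10.0, 1.0), (30.0, 10.0, 1.0), (90.0, 0.0, 2.0)]
frame_index = [1, 1, 2]
clusters = cluster_frame(frame_params, frame_index)
```

With this grouping, a frame carries at most one parameter triple per audio object instead of one per frequency band, which is where the intra-frame saving comes from.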

Further, in step D2, the clustered spatial parameters of the different frequency bands of the same audio object within the same frame are mapped back to their corresponding frequency bands, restoring the original spatial parameters.
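The mapping step can be sketched as a lookup: every band receives the clustered parameter of the object named by Index(n,f). The function name `restore_frame` and the dictionary layout are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

Param = Tuple[float, float, float]  # (theta, phi, r)


def restore_frame(clustered: Dict[int, Param], index: List[int]) -> List[Param]:
    """Map clustered parameters back to the bands of ONE frame.

    index[f] is Index(n, f); band f is assigned the parameter of its object.
    """
    return [clustered[obj] for obj in index]


# Four bands: bands 0, 1 and 3 belong to object 1, band 2 to object 2.
restored = restore_frame({1: (30.0, 10.0, 1.0), 2: (90.0, 0.0, 2.0)}, [1, 1, 2, 1])
```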

Further, in step C2, the clustered spatial parameters are quantized, the quantization being either perceptual quantization or direct quantization; the quantized spatial parameters are then intra-frame encoded, the encoding being either perceptual encoding or direct encoding.
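As an illustration of the direct-quantization option and its inverse, the sketch below uses a uniform scalar quantizer per component. The step sizes in `STEPS` are hypothetical values chosen for the example; the text does not state any step size, and perceptual quantization is not covered here.

```python
from typing import Tuple

# Hypothetical step sizes for (theta in degrees, phi in degrees, r in metres).
STEPS: Tuple[float, float, float] = (5.0, 10.0, 0.25)


def quantize(param: Tuple[float, float, float],
             steps: Tuple[float, float, float] = STEPS) -> Tuple[int, ...]:
    """Uniform ("direct") quantization: round each component to the nearest step."""
    return tuple(int(round(value / step)) for value, step in zip(param, steps))


def dequantize(indices: Tuple[int, ...],
               steps: Tuple[float, float, float] = STEPS) -> Tuple[float, ...]:
    """Inverse quantization: scale each integer index back by its step size."""
    return tuple(q * step for q, step in zip(indices, steps))


indices = quantize((32.0, -14.0, 1.3))   # (6, -1, 5)
approx = dequantize(indices)             # (30.0, -10.0, 1.25)
```

The reconstruction error of each component is bounded by half its step size, so the step sizes trade bit rate against localization accuracy.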

Further, in step D2, the spatial parameters are intra-frame decoded, the decoding being either perceptual decoding or direct decoding; the intra-frame-decoded spatial parameters are then inverse-quantized, the inverse quantization being the inverse of either perceptual quantization or direct quantization.

A coding/decoding system for improving the compression ratio of three-dimensional audio spatial parameters, comprising an encoder and a decoder;

The encoder comprises the following modules:

Time-frequency transform module, configured to take as input a three-dimensional audio signal containing n objects, the three-dimensional audio spatial parameters, and the index of the audio object to which each spatial parameter belongs, and to transform the three-dimensional audio time-domain signal to the frequency domain. Specifically, let the time-domain signal of the three-dimensional audio be s(t), where s(t) comprises s₁(t), s₂(t), …, s_k(t), …, s_K(t); let the spatial parameters of the three-dimensional audio be (θ, φ, r), comprising (θ₁, φ₁, r₁), (θ₂, φ₂, r₂), …, (θ_K, φ_K, r_K); and let the index of the audio object to which a spatial parameter belongs be Index(n,f). The time-domain signal s(t) is transformed to the frequency domain to obtain the frequency-domain signal S(n,f) of the three-dimensional audio, where S(n,f) comprises S₁(n,f), S₂(n,f), …, S_k(n,f), …, S_K(n,f). Here s_k(t) is the time-domain representation of the k-th directional audio signal and t denotes time; S_k(n,f) is the frequency-domain representation of the k-th directional audio signal; (θ_k, φ_k, r_k) is the spatial parameter corresponding to the k-th directional audio signal, where θ is the horizontal angle, φ is the elevation angle and r is the distance side information; k takes the values 1, 2, …, K, where K is the total number of original directional audio signals; the value of Index(n,f) is the index of the audio object to which the spatial parameter belongs; n denotes the frame index and f denotes the frequency index;

Intra-frame encoding module, configured to perform intra-frame encoding on the input spatial parameters, including clustering the spatial parameters of the different frequency bands that belong to the same audio object within the same frame; quantizing the clustered spatial parameters; and performing intra-frame encoding on the quantized spatial parameters;

Inter-frame encoding module, configured to perform inter-frame encoding on the spatial parameters to generate the three-dimensional audio coded bitstream, the encoding method being differential encoding;

The decoder comprises the following modules:

Inter-frame decoding module, configured to perform inter-frame decoding on the spatial parameters, the decoding method being differential decoding;

Intra-frame decoding module, configured to perform intra-frame decoding on the spatial parameters, including performing intra-frame decoding on the spatial parameters; inverse-quantizing the intra-frame-decoded spatial parameters; and restoring the original spatial parameters;

Inverse time-frequency transform module, configured to transform the frequency-domain representation S'(n,f) of the audio signal to the time domain to obtain the time-domain representation s'(t) of the audio signal, where S'(n,f) is S(n,f) after encoding and decoding and s'(t) is s(t) after encoding and decoding. The time-domain representation s'(t) of the audio signal containing n objects, the spatial parameters obtained by the intra-frame decoding module, and the index Index(n,f) of the audio object to which each original spatial parameter belongs together constitute the decoded output: the audio signal of the three-dimensional audio containing n objects, the spatial parameters, and the audio object index of each spatial parameter.

Further, the intra-frame encoding module includes a clustering module, configured to cluster the spatial parameters of the different frequency bands that belong to the same audio object within the same frame: spatial parameters that have the same n and the same value of Index(n,f) but different f are clustered together, generating the clustered spatial parameters.

Further, the intra-frame decoding module includes a restoration module, configured to map the clustered spatial parameters of the different frequency bands of the same audio object within the same frame back to their corresponding frequency bands, restoring the original spatial parameters.

Further, the intra-frame encoding module includes a quantization module, configured to quantize the clustered spatial parameters, the quantization being either perceptual quantization or direct quantization; the quantized spatial parameters are then intra-frame encoded, the encoding being either perceptual encoding or direct encoding.

Further, the intra-frame decoding module includes an inverse quantization module, configured to perform intra-frame decoding on the spatial parameters, the decoding being either perceptual decoding or direct decoding; the intra-frame-decoded spatial parameters are then inverse-quantized, the inverse quantization being the inverse of either perceptual quantization or direct quantization.

The beneficial effects of the present invention are as follows. Based on the property that the different frequency bands of the same sound source within the same frame share the same spatial parameters, the encoder side performs spatial parameter clustering, spatial parameter quantization and spatial parameter intra-frame encoding, followed by inter-frame differential encoding of the spatial parameters, which further compresses the three-dimensional audio spatial parameter bit rate and improves the spatial parameter compression ratio. The decoder side decodes the three-dimensional audio bitstream: it performs inter-frame differential decoding and intra-frame decoding on the spatial parameters, inverse-quantizes the intra-frame-decoded spatial parameters, and maps the clustered spatial parameters back to their frequency bands, obtaining the audio signal of the three-dimensional audio, the spatial parameters, and the audio object index of each spatial parameter. By adding intra-frame encoding and decoding, the present invention thus overcomes the failure of existing spatial parameter compression methods to consider the intra-frame redundancy of spatial parameters, further compresses the three-dimensional audio spatial parameter bit rate, and improves the spatial parameter compression ratio.

Brief description of the drawings

Fig. 1 is a flowchart of the encoder side of an embodiment of the present invention;

Fig. 2 is a flowchart of the decoder side of an embodiment of the present invention.

Detailed description of the invention

The technical solution of the present invention is described in detail below with reference to the drawings and an embodiment (steps C1 to C3 form the encoding process, and steps D1 to D3 form the decoding process).

Referring to Fig. 1, the encoder side of the embodiment performs the following procedure:

Step C1: transform the time-domain signal s(t) of the three-dimensional audio to the frequency domain to obtain the frequency-domain signal S(n,f) of the three-dimensional audio.

The inputs to the encoder side are: a three-dimensional audio signal containing n objects, the three-dimensional audio spatial parameters, and the index of the audio object to which each spatial parameter belongs. The time-domain representation of the three-dimensional audio signal is s(t), which consists of s₁(t), s₂(t), …, s_K(t), where t denotes time. The spatial parameters of the three-dimensional audio, that is, the spatial parameters (θ, φ, r) corresponding to each time-frequency point, consist of (θ₁, φ₁, r₁), (θ₂, φ₂, r₂), …, (θ_K, φ_K, r_K). The index of the audio object to which a spatial parameter belongs is expressed as Index(n,f). Here s_k(t) is the time-domain representation of the k-th directional audio signal and (θ_k, φ_k, r_k) is the spatial parameter corresponding to the k-th directional audio signal; a spatial parameter consists of the direction parameters (horizontal angle θ, elevation angle φ) and the distance parameter r. k takes the values 1, 2, …, K, where K is the total number of original directional audio signals.

To transform the time-domain signal of the three-dimensional audio to the frequency domain, the short-time Fourier transform (STFT) may be applied to s(t), giving the frequency-domain signal S(n,f) of the three-dimensional audio, which consists of S₁(n,f), S₂(n,f), …, S_K(n,f). Here S_k(n,f) is the frequency-domain representation of the k-th directional audio signal, n denotes the frame index and f denotes the frequency index. In a specific implementation, other transforms such as the MDCT or the Hilbert-Huang transform may also be used.
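A minimal STFT sketch for a single directional signal s_k(t) follows, to make the frame index n and frequency index f concrete. The frame length, hop size and Hann window are assumptions (the text does not specify them), the indices are 0-based here, and the direct DFT is written for clarity rather than speed.

```python
import cmath
import math
from typing import List


def stft(signal: List[float], frame_len: int, hop: int) -> List[List[complex]]:
    """Short-time Fourier transform; result[n][f] plays the role of S_k(n, f).

    Assumptions: periodic Hann window, frames that do not fit entirely inside
    the signal are dropped, and only bins 0..frame_len//2 are kept.
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame_len) for i in range(frame_len)]
    spectrum: List[List[complex]] = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        bins: List[complex] = []
        for f in range(frame_len // 2 + 1):
            acc = 0j
            for i, sample in enumerate(frame):
                acc += sample * cmath.exp(-2j * math.pi * f * i / frame_len)
            bins.append(acc)
        spectrum.append(bins)
    return spectrum


# A cosine with exactly 4 cycles per 32-sample frame concentrates its energy around bin 4.
tone = [math.cos(2 * math.pi * 4 * t / 32) for t in range(64)]
S = stft(tone, frame_len=32, hop=16)
```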

In the embodiment, K = 8 and f = 1, 2, …, 40. The frequency-domain signals of the 8 directional audio signals s₁(t), s₂(t), …, s₈(t) are (S₁(n,f), S₂(n,f), …, S₈(n,f)), their corresponding spatial parameters are ((θ₁, φ₁, r₁), (θ₂, φ₂, r₂), …, (θ₈, φ₈, r₈)), and the index of the object to which these spatial parameters belong is Index(n,f).

Step C2: perform intra-frame encoding on the spatial parameters. In carrying out step C2, the embodiment performs the following sub-steps:

C21: cluster the spatial parameters of the different frequency bands that belong to the same audio object within the same frame; that is, spatial parameters that have the same n and the same value of Index(n,f) but different f are clustered together, generating the clustered spatial parameters;

C22: quantize the clustered spatial parameters; the quantization may be perceptual quantization or direct quantization;

C23: perform intra-frame encoding on the quantized spatial parameters; the encoding may be perceptual encoding or direct encoding;

Step C3: perform inter-frame encoding on the spatial parameters to generate the three-dimensional audio coded bitstream. In carrying out step C3, the embodiment uses differential encoding.
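Inter-frame differential encoding can be sketched as sending the first frame as is and every later frame as its difference from the previous frame. The sketch assumes that each frame is a flat list of quantized integer indices with the same length and ordering from frame to frame; the function name and data layout are illustrative, and any entropy coding of the residuals is omitted.

```python
from typing import List, Optional


def differential_encode(frames: List[List[int]]) -> List[List[int]]:
    """Replace each frame (after the first) by its element-wise difference
    from the previous frame. Slowly varying parameters give small residuals."""
    encoded: List[List[int]] = []
    previous: Optional[List[int]] = None
    for frame in frames:
        if previous is None:
            encoded.append(list(frame))  # first frame is sent unchanged
        else:
            encoded.append([cur - prev for cur, prev in zip(frame, previous)])
        previous = frame
    return encoded


# Three frames of quantized (theta, phi, r) indices for one object.
residuals = differential_encode([[6, 1, 5], [6, 1, 5], [7, 1, 5]])
# residuals == [[6, 1, 5], [0, 0, 0], [1, 0, 0]]
```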

Referring to Fig. 2, the decoder side of the embodiment performs the following procedure:

Step D1: perform inter-frame decoding on the spatial parameters. In carrying out step D1, the embodiment uses differential decoding.
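Differential decoding reverses the differencing by accumulating residuals frame by frame. As with the encoder-side sketch, the flat list-of-integers layout and the function name are assumptions for illustration.

```python
from typing import List, Optional


def differential_decode(encoded: List[List[int]]) -> List[List[int]]:
    """Rebuild each frame by adding its residual to the previously decoded frame."""
    decoded: List[List[int]] = []
    previous: Optional[List[int]] = None
    for residual in encoded:
        if previous is None:
            current = list(residual)  # first frame was sent unchanged
        else:
            current = [prev + diff for prev, diff in zip(previous, residual)]
        decoded.append(current)
        previous = current
    return decoded


frames = differential_decode([[6, 1, 5], [0, 0, 0], [1, 0, 0]])
# frames == [[6, 1, 5], [6, 1, 5], [7, 1, 5]]
```

Because the residuals are integers, this step is lossless; the only loss in the parameter path comes from clustering and quantization.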

Step D2: perform intra-frame decoding on the spatial parameters. In carrying out step D2, the embodiment performs the following sub-steps:

D21: perform intra-frame decoding on the spatial parameters; the decoding may be perceptual decoding or direct decoding;

D22: inverse-quantize the intra-frame-decoded spatial parameters; the inverse quantization may be the inverse of perceptual quantization or the inverse of direct quantization;

D23: map the clustered spatial parameters of the different frequency bands of the same audio object within the same frame back to their corresponding frequency bands, restoring the original spatial parameters.

Step D3: transform the frequency-domain representation S'(n,f) of the audio signal to the time domain to obtain the time-domain representation s'(t) of the audio signal, where S'(n,f) is S(n,f) after encoding and decoding and s'(t) is s(t) after encoding and decoding. The time-domain representation s'(t) of the audio signal containing n objects, the spatial parameters obtained in step D2, and the index Index(n,f) of the audio object to which each original spatial parameter belongs together constitute the decoded output: the audio signal of the three-dimensional audio containing n objects, the spatial parameters, and the audio object index of each spatial parameter. In a specific implementation, loudspeakers in various configurations or headphones can then be used to reconstruct the three-dimensional sound field and thereby restore the original three-dimensional audio.

In the embodiment, the 8 directional audio signals after encoding and decoding, (S'₁(n,f), S'₂(n,f), …, S'₈(n,f)), are transformed to the time domain to obtain the 8 directional audio signals s'₁(t), s'₂(t), …, s'₈(t). Together with the decoded spatial parameters and the index Index(n,f) of the audio object to which each original spatial parameter belongs, they constitute the decoded output: the audio signal of the three-dimensional audio containing n objects, the spatial parameters, and the audio object index of each spatial parameter. This embodiment uses headphones to play back the three-dimensional audio signal with distance side information. Headphone reproduction of three-dimensional audio requires a head-related transfer function (HRTF) database. The PKU&IOA HRTF database contains measurements for both the far field and the near field: the distance r ranges from 20 cm to 160 cm, and the resolutions of the horizontal angle and the elevation angle are 5° and 10° respectively. The PKU&IOA HRTF database is therefore chosen to reconstruct the three-dimensional audio that has undergone intra-frame and inter-frame compression.
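For headphone playback, a decoded spatial parameter has to be matched to a measured HRTF position. The sketch below only snaps the angles to the 5° and 10° resolutions and clamps the distance to the 20 cm to 160 cm range quoted above. The function name, the 0° to 360° horizontal-angle convention and the absence of a distance grid or elevation limits are assumptions; the actual layout of the PKU&IOA database should be checked before use.

```python
from typing import Tuple


def snap_to_hrtf_grid(theta_deg: float, phi_deg: float, r_cm: float) -> Tuple[int, int, float]:
    """Snap a decoded (theta, phi, r) to the nearest angular grid point.

    Horizontal angle: 5-degree grid, wrapped into [0, 360).
    Elevation angle: 10-degree grid (no range limit applied here).
    Distance: clamped to [20, 160] cm, not snapped to any measured distance.
    """
    azimuth = (int(round(theta_deg / 5.0)) * 5) % 360
    elevation = int(round(phi_deg / 10.0)) * 10
    distance = max(20.0, min(160.0, r_cm))
    return azimuth, elevation, distance


position = snap_to_hrtf_grid(32.0, 14.0, 200.0)  # (30, 10, 160.0)
```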

Experimental comparison shows that the three-dimensional audio compression method with added intra-frame coding compresses better than the original method that uses only inter-frame coding: the compression ratio is higher while the quality of the reconstructed audio is maintained. Because intra-frame coding removes intra-frame redundancy, the method raises the compression ratio of the three-dimensional spatial parameters and lowers the spatial parameter bit rate while preserving the quality of the reconstructed three-dimensional audio.

The method provided by the present invention can be implemented in software to run automatically, or as a corresponding modular system. The coding/decoding system for improving the compression ratio of three-dimensional audio spatial parameters provided by the present invention comprises an encoder and a decoder, the encoder comprising the following modules:

Time-frequency transform module, configured to take as input a three-dimensional audio signal containing n objects, the three-dimensional audio spatial parameters, and the index of the audio object to which each spatial parameter belongs, and to transform the three-dimensional audio time-domain signal to the frequency domain. Specifically, let the time-domain signal of the three-dimensional audio be s(t), where s(t) comprises s₁(t), s₂(t), …, s_k(t), …, s_K(t); let the spatial parameters of the three-dimensional audio be (θ, φ, r), comprising (θ₁, φ₁, r₁), (θ₂, φ₂, r₂), …, (θ_K, φ_K, r_K); and let the index of the audio object to which a spatial parameter belongs be Index(n,f). The time-domain signal s(t) is transformed to the frequency domain to obtain the frequency-domain signal S(n,f) of the three-dimensional audio, where S(n,f) comprises S₁(n,f), S₂(n,f), …, S_k(n,f), …, S_K(n,f). Here s_k(t) is the time-domain representation of the k-th directional audio signal and t denotes time; S_k(n,f) is the frequency-domain representation of the k-th directional audio signal; (θ_k, φ_k, r_k) is the spatial parameter corresponding to the k-th directional audio signal, where θ is the horizontal angle, φ is the elevation angle and r is the distance side information; k takes the values 1, 2, …, K, where K is the total number of original directional audio signals; the value of Index(n,f) is the index of the audio object to which the spatial parameter belongs; n denotes the frame index and f denotes the frequency index;

The decoder comprises the following modules:

The intra-frame encoding module includes a clustering module, configured to cluster the spatial parameters of the different frequency bands that belong to the same audio object within the same frame: spatial parameters that have the same n and the same value of Index(n,f) but different f are clustered together, generating the clustered spatial parameters.

The intra-frame decoding module includes a restoration module, configured to map the clustered spatial parameters of the different frequency bands of the same audio object within the same frame back to their corresponding frequency bands, restoring the original spatial parameters.

The intra-frame encoding module includes a quantization module, configured to quantize the clustered spatial parameters, the quantization being either perceptual quantization or direct quantization; the quantized spatial parameters are then intra-frame encoded, the encoding being either perceptual encoding or direct encoding.

The intra-frame decoding module includes an inverse quantization module, configured to perform intra-frame decoding on the spatial parameters, the decoding being either perceptual decoding or direct decoding; the intra-frame-decoded spatial parameters are then inverse-quantized, the inverse quantization being the inverse of either perceptual quantization or direct quantization.

The specific implementation of each module corresponds to the respective method step and is not repeated here.

The specific embodiments described herein merely illustrate the spirit of the present invention. Those skilled in the art to which the invention pertains may make various modifications or additions to the described embodiments, or substitute similar approaches, without departing from the spirit of the invention or exceeding the scope defined by the appended claims.

Claims

1. An encoding and decoding method for improving the compression ratio of three-dimensional audio spatial parameters, characterized by comprising an encoding process and a decoding process, the encoding process comprising the following steps:

Step C1: the input comprises a three-dimensional audio signal containing n objects, the three-dimensional audio spatial parameters, and the index of the audio object to which each spatial parameter belongs; the three-dimensional audio time-domain signal is transformed to the frequency domain, as follows,

Let the time-domain signal of the three-dimensional audio be s(t), where s(t) comprises s₁(t), s₂(t), …, s_k(t), …, s_K(t); let the spatial parameters of the three-dimensional audio be (θ, φ, r), comprising (θ₁, φ₁, r₁), (θ₂, φ₂, r₂), …, (θ_K, φ_K, r_K); and let the index of the audio object to which a spatial parameter belongs be Index(n,f); the time-domain signal s(t) is transformed to the frequency domain to obtain the frequency-domain signal S(n,f) of the three-dimensional audio, where S(n,f) comprises S₁(n,f), S₂(n,f), …, S_k(n,f), …, S_K(n,f); wherein s_k(t) is the time-domain representation of the k-th directional audio signal and t denotes time; S_k(n,f) is the frequency-domain representation of the k-th directional audio signal; (θ_k, φ_k, r_k) is the spatial parameter corresponding to the k-th directional audio signal, where θ is the horizontal angle, φ is the elevation angle and r is the distance side information; k takes the values 1, 2, …, K, where K is the total number of original directional audio signals; the value of Index(n,f) is the index of the audio object to which the spatial parameter belongs; n denotes the frame index and f denotes the frequency index;

Step C2: perform intra-frame encoding on the input spatial parameters, implemented as follows: cluster the spatial parameters of the different frequency bands that belong to the same audio object within the same frame; quantize the clustered spatial parameters; and perform intra-frame encoding on the quantized spatial parameters;

Step C3: perform inter-frame encoding on the spatial parameters to generate the three-dimensional audio coded bitstream, the encoding method being differential encoding;

The decoding process comprises the following steps:

Step D2: perform intra-frame decoding on the spatial parameters, implemented as follows: perform intra-frame decoding on the spatial parameters; inverse-quantize the intra-frame-decoded spatial parameters; and restore the original spatial parameters;

Step D3: transform the frequency-domain representation S'(n,f) of the audio signal to the time domain to obtain the time-domain representation s'(t) of the audio signal, where S'(n,f) is S(n,f) after encoding and decoding and s'(t) is s(t) after encoding and decoding; the time-domain representation s'(t) of the audio signal containing n objects, the spatial parameters obtained in step D2, and the index Index(n,f) of the audio object to which each original spatial parameter belongs together constitute the decoded output: the audio signal of the three-dimensional audio containing n objects, the spatial parameters, and the audio object index of each spatial parameter.

2. The encoding and decoding method for improving the three-dimensional audio parameter compression ratio according to claim 1, characterized in that:

In step C2, the spatial parameters of the different frequency bands that belong to the same audio object within the same frame are clustered: spatial parameters that have the same n and the same value of Index(n,f) but different f are clustered together, generating the clustered spatial parameters.

In step D2, the clustered spatial parameters of the different frequency bands of the same audio object within the same frame are mapped back to their corresponding frequency bands, restoring the original spatial parameters.

In step C2, the clustered spatial parameters are quantized, the quantization being either perceptual quantization or direct quantization; the quantized spatial parameters are then intra-frame encoded, the encoding being either perceptual encoding or direct encoding.

In step D2, the spatial parameters are intra-frame decoded, the decoding being either perceptual decoding or direct decoding; the intra-frame-decoded spatial parameters are then inverse-quantized, the inverse quantization being the inverse of either perceptual quantization or direct quantization.

6. A coding/decoding system for improving the compression ratio of three-dimensional audio spatial parameters, characterized by comprising an encoder and a decoder, the encoder comprising the following modules,

Time-frequency transform module, configured to take as input a three-dimensional audio signal containing n objects, the three-dimensional audio spatial parameters, and the index of the audio object to which each spatial parameter belongs, and to transform the three-dimensional audio time-domain signal to the frequency domain; specifically, let the time-domain signal of the three-dimensional audio be s(t), where s(t) comprises s₁(t), s₂(t), …, s_k(t), …, s_K(t); let the spatial parameters of the three-dimensional audio be (θ, φ, r), comprising (θ₁, φ₁, r₁), (θ₂, φ₂, r₂), …, (θ_K, φ_K, r_K); and let the index of the audio object to which a spatial parameter belongs be Index(n,f); the time-domain signal s(t) is transformed to the frequency domain to obtain the frequency-domain signal S(n,f) of the three-dimensional audio, where S(n,f) comprises S₁(n,f), S₂(n,f), …, S_k(n,f), …, S_K(n,f); wherein s_k(t) is the time-domain representation of the k-th directional audio signal and t denotes time; S_k(n,f) is the frequency-domain representation of the k-th directional audio signal; (θ_k, φ_k, r_k) is the spatial parameter corresponding to the k-th directional audio signal, where θ is the horizontal angle, φ is the elevation angle and r is the distance side information; k takes the values 1, 2, …, K, where K is the total number of original directional audio signals; the value of Index(n,f) is the index of the audio object to which the spatial parameter belongs; n denotes the frame index and f denotes the frequency index;

Intra-frame encoding module, configured to perform intra-frame encoding on the input spatial parameters, including clustering the spatial parameters of the different frequency bands that belong to the same audio object within the same frame; quantizing the clustered spatial parameters; and performing intra-frame encoding on the quantized spatial parameters;

Inter-frame encoding module, configured to perform inter-frame encoding on the spatial parameters to generate the three-dimensional audio coded bitstream, the encoding method being differential encoding;

The decoder comprises the following modules:

Intra-frame decoding module, configured to perform intra-frame decoding on the spatial parameters, including performing intra-frame decoding on the spatial parameters; inverse-quantizing the intra-frame-decoded spatial parameters; and restoring the original spatial parameters;

Inverse time-frequency transform module, configured to transform the frequency-domain representation S'(n,f) of the audio signal to the time domain to obtain the time-domain representation s'(t) of the audio signal, where S'(n,f) is S(n,f) after encoding and decoding and s'(t) is s(t) after encoding and decoding; the time-domain representation s'(t) of the audio signal containing n objects, the spatial parameters obtained by the intra-frame decoding module, and the index Index(n,f) of the audio object to which each original spatial parameter belongs together constitute the decoded output: the audio signal of the three-dimensional audio containing n objects, the spatial parameters, and the audio object index of each spatial parameter.

7. The coding/decoding system for improving the three-dimensional audio parameter compression ratio according to claim 6, characterized in that: the intra-frame encoding module includes a clustering module, configured to cluster the spatial parameters of the different frequency bands that belong to the same audio object within the same frame, that is, spatial parameters that have the same n and the same value of Index(n,f) but different f are clustered together, generating the clustered spatial parameters.

8. The coding/decoding system for improving the three-dimensional audio parameter compression ratio according to claim 6, characterized in that: the intra-frame decoding module includes a restoration module, configured to map the clustered spatial parameters of the different frequency bands of the same audio object within the same frame back to their corresponding frequency bands, restoring the original spatial parameters.

9. The coding/decoding system for improving the three-dimensional audio parameter compression ratio according to claim 6, characterized in that: the intra-frame encoding module includes a quantization module, configured to quantize the clustered spatial parameters, the quantization being either perceptual quantization or direct quantization; the quantized spatial parameters are then intra-frame encoded, the encoding being either perceptual encoding or direct encoding.

10. The coding/decoding system for improving the three-dimensional audio parameter compression ratio according to claim 6, characterized in that: the intra-frame decoding module includes an inverse quantization module, configured to perform intra-frame decoding on the spatial parameters, the decoding being either perceptual decoding or direct decoding; the intra-frame-decoded spatial parameters are then inverse-quantized, the inverse quantization being the inverse of either perceptual quantization or direct quantization.