CN106023999B - Encoding and decoding method and system for improving the compression ratio of three-dimensional audio spatial parameters
- Publication number
- CN106023999B (application CN201610541939.8A)
- Authority
- CN
- China
- Prior art keywords
- spatial parameter
- audio
- coding
- decoding
- dimensional
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
The present invention provides an encoding and decoding method and system for improving the compression ratio of three-dimensional audio spatial parameters. The encoder takes as input the audio signal, the spatial side information of the three-dimensional audio, and the index of the audio object to which each spatial parameter belongs. During encoding, the spatial parameters are successively clustered, quantized, intra-frame coded, and inter-frame differentially coded. During decoding, inter-frame differential decoding, intra-frame decoding, inverse quantization, and spatial parameter mapping are performed in turn. Because the spatial parameters of different sub-bands of the same sound source within the same frame are similar, clustering the spatial parameters yields a higher compression ratio for three-dimensional audio spatial parameters.
Description
Technical field
The present invention relates to the field of digital audio and, to meet the demand for a higher compression ratio of three-dimensional audio spatial parameters, more particularly to an encoding and decoding method and system for improving the compression ratio of three-dimensional audio spatial parameters.
Background art
At the end of 2009, the three-dimensional film Avatar topped the box office in more than 30 countries; by early September 2010 its cumulative worldwide box office had exceeded 2.7 billion US dollars. Avatar achieved this result because its new three-dimensional visual-effects production technology delivered a striking sensory impact.
To give listeners a more immersive experience and a more realistic sound field in 3D space, spatial audio object coding (SAOC), directional audio coding (DirAC), and spatially squeezed surround audio coding (S3AC) have been proposed. As 3D spatial resolution increases and the number of channels or objects grows, the bit rate of the spatial parameters also rises sharply. For example, in the spatial localization quantization point (SLQP) method of S3AC, the spatial parameters take 18 kbps per object, so 16 sound objects require 288 kbps for the spatial parameters alone. Reducing the bit rate of the spatial parameters in 3D audio coding is therefore an urgent need.
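The arithmetic behind the 288 kbps figure can be checked with a short sketch. The function name and its default argument are illustrative only; the 18 kbps per object value is the SLQP figure quoted above.

```python
def spatial_parameter_bitrate_kbps(num_objects: int, kbps_per_object: float = 18.0) -> float:
    """Total spatial-parameter bit rate when every object costs the same.

    The 18 kbps default is the per-object SLQP figure quoted in the text;
    the linear scaling with the number of objects is what makes the total
    grow quickly as objects are added.
    """
    return num_objects * kbps_per_object


# 16 objects at 18 kbps each
total = spatial_parameter_bitrate_kbps(16)
```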
Existing spatial parameter compression methods such as BCC, MPEG Surround, and S3AC exploit the correlation between adjacent frames and reduce the spatial parameter bit rate by differential coding. These methods remove the inter-frame redundancy of the spatial parameters between adjacent frames in the same frequency band, but the intra-frame redundancy between different frequency bands of the same sound source within one frame remains. If this intra-frame redundancy could be removed, the spatial parameter bit rate could be compressed further.
Summary of the invention
In view of the shortcomings of the above prior art in compressing 3D audio spatial parameters, the object of the present invention is to provide a new object-based spatial parameter compression method for 3D audio. The method relies on the property that the spatial parameters of different frequency bands of the same sound source within the same frame are identical, so it can remove a large share of the intra-frame redundancy that existing spatial parameter compression methods ignore, and thereby further reduce the spatial parameter bit rate.
The technical solution of the present invention provides an encoding and decoding method for improving the compression ratio of three-dimensional audio spatial parameters, comprising an encoding process and a decoding process. The encoding process comprises the following steps:
Step C1: the input comprises a three-dimensional audio signal containing n objects, the three-dimensional audio spatial parameters, and the index of the audio object to which each spatial parameter belongs; the time-domain three-dimensional audio signal is transformed to the frequency domain, as follows.
Let the time-domain signal of the three-dimensional audio be s(t), comprising s_1(t), s_2(t), ..., s_k(t), ..., s_K(t); let the spatial parameters of the three-dimensional audio be (θ(n, f), φ(n, f), r(n, f)), comprising (θ_1(n, f), φ_1(n, f), r_1(n, f)), ..., (θ_K(n, f), φ_K(n, f), r_K(n, f)); and let the index of the audio object to which a spatial parameter belongs be Index(n, f). The time-domain signal s(t) is transformed to the frequency domain to obtain the frequency-domain signal S(n, f), comprising S_1(n, f), S_2(n, f), ..., S_k(n, f), ..., S_K(n, f). Here s_k(t) is the time-domain representation of the k-th directional audio signal and t denotes time; S_k(n, f) is the frequency-domain representation of the k-th directional audio signal; (θ_k(n, f), φ_k(n, f), r_k(n, f)) is the spatial parameter corresponding to the k-th directional audio signal, where θ is the horizontal angle, φ is the elevation angle, and r is the distance side information; k takes the values 1, 2, ..., K, where K is the total number of original directional audio signals; the value of Index(n, f) is the index of the audio object to which the spatial parameter belongs; n denotes the frame index and f the frequency index.
Step C2: intra-frame coding is performed on the input spatial parameters, as follows: the spatial parameters of different frequency bands belonging to the same audio object within the same frame are clustered; the clustered spatial parameters are quantized; and the quantized spatial parameters are intra-frame coded.
Step C3: inter-frame coding is performed on the spatial parameters to generate the three-dimensional audio coded bitstream; the coding method is differential coding.
The decoding process comprises the following steps:
Step D1: inter-frame decoding is performed on the spatial parameters; the decoding method is differential decoding.
Step D2: intra-frame decoding is performed on the spatial parameters, as follows: the spatial parameters are intra-frame decoded; the intra-frame decoded spatial parameters are inverse quantized; and the original spatial parameters are restored.
Step D3: the frequency-domain representation S'(n, f) of the audio signal is transformed to the time domain to obtain the time-domain representation s'(t), where S'(n, f) is the signal S(n, f) after encoding and decoding and s'(t) is the signal s(t) after encoding and decoding. The time-domain representation s'(t) of the audio signal containing n objects, the spatial parameters obtained in step D2, and the original index Index(n, f) of the audio object to which each spatial parameter belongs together constitute the decoded three-dimensional audio: the audio signal containing n objects, the spatial parameters, and the audio object indices of the spatial parameters.
Further, in step C2, the spatial parameters of different frequency bands belonging to the same audio object within the same frame are clustered; that is, the spatial parameters with the same n and the same value of Index(n, f) but different f are clustered to generate the clustered spatial parameters.
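The clustering rule can be illustrated with a minimal sketch for a single frame. The function name `cluster_frame`, the tuple layout (θ, φ, r), and the choice of the per-component mean as the cluster representative are all assumptions made for illustration; the text does not state how a cluster is summarized.

```python
from collections import defaultdict
from statistics import fmean


def cluster_frame(params, index):
    """Cluster one frame's per-band spatial parameters by audio object.

    params: list of (theta, phi, r) tuples, one entry per frequency band f.
    index:  list of object numbers; index[f] plays the role of Index(n, f).
    Returns {object_number: (theta, phi, r)}, one representative per object.

    ASSUMPTION: the representative is the per-component arithmetic mean.
    The patent text does not specify this, and a plain mean ignores
    angle wrap-around, so treat it as a placeholder.
    """
    groups = defaultdict(list)
    for band_params, obj in zip(params, index):
        groups[obj].append(band_params)
    return {
        obj: tuple(fmean(component) for component in zip(*members))
        for obj, members in groups.items()
    }


# Three bands in one frame: bands 0 and 1 belong to object 1, band 2 to object 2.
frame_params = [(30.0, 10.0, 1.0), (30.0, 10.0, 1.0), (-45.0, 0.0, 2.0)]
frame_index = [1, 1, 2]
clustered = cluster_frame(frame_params, frame_index)
```

When the bands of an object really do share the same parameters, as the method assumes, the representative equals the shared value and only one tuple per object has to be quantized and coded instead of one per band.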
Further, in step D2, the clustered spatial parameters of different frequency bands belonging to the same audio object in the same frame are mapped back to their corresponding frequency bands, restoring the original spatial parameters.
Further, in step C2, the clustered spatial parameters are quantized, the quantization being perceptual quantization or direct quantization; the quantized spatial parameters are intra-frame coded, the coding being perceptual coding or direct coding.
Further, in step D2, the spatial parameters are intra-frame decoded, the decoding being perceptual decoding or direct decoding; the intra-frame decoded spatial parameters are inverse quantized, the inverse quantization corresponding to either perceptual quantization or direct quantization.
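As a rough illustration of the quantization and inverse quantization pair, the sketch below treats "direct quantization" as uniform scalar quantization with a fixed step. That reading, the function names, and the 5-degree step are assumptions; the text does not define the quantizer, and perceptual quantization is not modeled here.

```python
def quantize(value: float, step: float) -> int:
    """Uniform scalar quantization: index of the nearest multiple of `step`."""
    return round(value / step)


def dequantize(level: int, step: float) -> float:
    """Inverse quantization: map the integer level back to a parameter value."""
    return level * step


# Hypothetical 5-degree step for a horizontal angle of 32 degrees.
level = quantize(32.0, 5.0)
restored = dequantize(level, 5.0)
```

The reconstruction error of such a quantizer is bounded by half the step size, which is the trade-off between bit rate and spatial accuracy that the choice of step controls.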
A coding and decoding system for improving the compression ratio of three-dimensional audio spatial parameters comprises an encoder and a decoder.
The encoder comprises the following modules:
A time-frequency transform module, whose input comprises a three-dimensional audio signal containing n objects, the three-dimensional audio spatial parameters, and the index of the audio object to which each spatial parameter belongs, and which transforms the time-domain three-dimensional audio signal to the frequency domain. Specifically, let the time-domain signal of the three-dimensional audio be s(t), comprising s_1(t), s_2(t), ..., s_k(t), ..., s_K(t); let the spatial parameters of the three-dimensional audio be (θ(n, f), φ(n, f), r(n, f)), comprising (θ_1(n, f), φ_1(n, f), r_1(n, f)), ..., (θ_K(n, f), φ_K(n, f), r_K(n, f)); and let the index of the audio object to which a spatial parameter belongs be Index(n, f). The time-domain signal s(t) is transformed to the frequency domain to obtain the frequency-domain signal S(n, f), comprising S_1(n, f), S_2(n, f), ..., S_k(n, f), ..., S_K(n, f). Here s_k(t) is the time-domain representation of the k-th directional audio signal and t denotes time; S_k(n, f) is the frequency-domain representation of the k-th directional audio signal; (θ_k(n, f), φ_k(n, f), r_k(n, f)) is the spatial parameter corresponding to the k-th directional audio signal, where θ is the horizontal angle, φ is the elevation angle, and r is the distance side information; k takes the values 1, 2, ..., K, where K is the total number of original directional audio signals; the value of Index(n, f) is the index of the audio object to which the spatial parameter belongs; n denotes the frame index and f the frequency index.
An intra-frame coding module, for performing intra-frame coding on the input spatial parameters, including clustering the spatial parameters of different frequency bands belonging to the same audio object within the same frame, quantizing the clustered spatial parameters, and intra-frame coding the quantized spatial parameters.
An inter-frame coding module, for performing inter-frame coding on the spatial parameters to generate the three-dimensional audio coded bitstream; the coding method is differential coding.
The decoder comprises the following modules:
An inter-frame decoding module, for performing inter-frame decoding on the spatial parameters; the decoding method is differential decoding.
An intra-frame decoding module, for performing intra-frame decoding on the spatial parameters, including intra-frame decoding the spatial parameters, inverse quantizing the intra-frame decoded spatial parameters, and restoring the original spatial parameters.
An inverse time-frequency transform module, for transforming the frequency-domain representation S'(n, f) of the audio signal to the time domain to obtain the time-domain representation s'(t), where S'(n, f) is the signal S(n, f) after encoding and decoding and s'(t) is the signal s(t) after encoding and decoding. The time-domain representation s'(t) of the audio signal containing n objects, the spatial parameters obtained in step D2, and the original index Index(n, f) of the audio object to which each spatial parameter belongs together constitute the decoded three-dimensional audio: the audio signal containing n objects, the spatial parameters, and the audio object indices of the spatial parameters.
Further, the intra-frame coding module includes a clustering module, which clusters the spatial parameters of different frequency bands belonging to the same audio object within the same frame; that is, the spatial parameters with the same n and the same value of Index(n, f) but different f are clustered to generate the clustered spatial parameters.
Further, the intra-frame decoding module includes a recovery module, which maps the clustered spatial parameters of different frequency bands belonging to the same audio object in the same frame back to their corresponding frequency bands, restoring the original spatial parameters.
Further, the intra-frame coding module includes a quantization module, which quantizes the clustered spatial parameters, the quantization being perceptual quantization or direct quantization; the quantized spatial parameters are intra-frame coded, the coding being perceptual coding or direct coding.
Further, the intra-frame decoding module includes an inverse quantization module; the spatial parameters are intra-frame decoded, the decoding being perceptual decoding or direct decoding, and the intra-frame decoded spatial parameters are inverse quantized, the inverse quantization corresponding to either perceptual quantization or direct quantization.
The beneficial effects of the present invention are as follows. Based on the property that different frequency bands of the same sound source within the same frame have identical spatial parameters, the encoder performs spatial parameter clustering, spatial parameter quantization, and spatial parameter intra-frame coding, and then inter-frame differential coding of the spatial parameters, which further reduces the bit rate of the three-dimensional audio spatial parameters and improves their compression ratio. The decoder decodes the three-dimensional audio bitstream: it performs inter-frame differential decoding and intra-frame decoding of the spatial parameters, inverse quantizes the intra-frame decoded spatial parameters, and maps the clustered spatial parameters back to their bands, obtaining the audio signal, the spatial parameters of the three-dimensional audio, and the audio object index of each spatial parameter. By adding intra-frame encoding and decoding, the invention therefore overcomes the defect that existing spatial parameter compression methods ignore the intra-frame redundancy of the spatial parameters, further reducing the bit rate of three-dimensional audio spatial parameters and improving their compression ratio.
Brief description of the drawings
Fig. 1 is a flowchart of the encoder of an embodiment of the present invention;
Fig. 2 is a flowchart of the decoder of an embodiment of the present invention.
Detailed description of the embodiments
The technical solution of the present invention is described in detail below with reference to the drawings and an embodiment (steps C1 to C3 are the encoding process; steps D1 to D3 are the decoding process).
Referring to Fig. 1, the encoder of the embodiment executes the following flow:
Step C1: the time-domain signal s(t) of the three-dimensional audio is transformed to the frequency domain to obtain the frequency-domain signal S(n, f) of the three-dimensional audio.
The input of the encoder is the three-dimensional audio signal containing n objects, the three-dimensional audio spatial parameters, and the index of the audio object to which each spatial parameter belongs. The time-domain representation of the audio signal is s(t), composed of s_1(t), s_2(t), ..., s_K(t), where t denotes time. The spatial parameters of the three-dimensional audio, that is, the spatial parameter of each time-frequency point, (θ(n, f), φ(n, f), r(n, f)), are composed of (θ_1(n, f), φ_1(n, f), r_1(n, f)), ..., (θ_K(n, f), φ_K(n, f), r_K(n, f)). The index of the audio object to which a spatial parameter belongs is denoted Index(n, f). Here s_k(t) is the time-domain representation of the k-th directional audio signal, and (θ_k(n, f), φ_k(n, f), r_k(n, f)) is the corresponding spatial parameter, consisting of the direction parameters (horizontal angle θ and elevation angle φ) and the distance parameter r. k takes the values 1, 2, ..., K, where K is the total number of original directional audio signals.
To transform the time-domain signal of the three-dimensional audio to the frequency domain, the short-time Fourier transform (STFT) can be applied to s(t), giving the frequency-domain signal S(n, f), composed of S_1(n, f), S_2(n, f), ..., S_K(n, f). Here S_k(n, f) is the frequency-domain representation of the k-th directional audio signal, n denotes the frame index, and f the frequency index. In a specific implementation, other transforms such as the MDCT or the Hilbert-Huang transform can also be used.
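The STFT indexing by frame n and frequency bin f can be made concrete with a small, unoptimized sketch. The frame length, hop size, and Hann window are illustrative assumptions (the text does not give these values), and a practical implementation would use an FFT library rather than the direct DFT sum shown here.

```python
import cmath
import math


def stft(signal, frame_len=8, hop=4):
    """Return S[n][f]: the DFT of Hann-windowed frame n at frequency bin f.

    ASSUMPTIONS: frame_len, hop and the Hann window are placeholders,
    not values from the patent. Only bins 0..frame_len/2 are kept,
    which is sufficient for a real-valued input signal.
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame_len) for i in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        bins = [
            sum(x * cmath.exp(-2j * math.pi * f * i / frame_len) for i, x in enumerate(frame))
            for f in range(frame_len // 2 + 1)
        ]
        frames.append(bins)
    return frames


# A cosine placed exactly on bin 2 of an 8-point frame.
test_signal = [math.cos(2 * math.pi * 2 * i / 8) for i in range(16)]
S = stft(test_signal)
```

Each directional signal s_k(t) would be transformed this way on its own, so that S_k(n, f) and the spatial parameter of the same time-frequency point share the indices n and f.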
In the embodiment, K = 8 and f = 1, 2, ..., 40. The frequency-domain signals of the 8 directional audio signals s_1(t), s_2(t), ..., s_8(t) are (S_1(n, f), S_2(n, f), ..., S_8(n, f)), their corresponding spatial parameters are (θ_1(n, f), φ_1(n, f), r_1(n, f)), ..., (θ_8(n, f), φ_8(n, f), r_8(n, f)), and the index of the object to which these spatial parameters belong is Index(n, f).
Step C2: intra-frame coding is performed on the spatial parameters. When carrying out step C2, the embodiment performs the following sub-steps:
C21: the spatial parameters of different frequency bands belonging to the same audio object within the same frame are clustered; that is, the spatial parameters with the same n and the same value of Index(n, f) but different f are clustered to generate the clustered spatial parameters;
C22: the clustered spatial parameters are quantized, using either perceptual quantization or direct quantization;
C23: the quantized spatial parameters are intra-frame coded, using either perceptual coding or direct coding.
Step C3: inter-frame coding is performed on the spatial parameters to generate the three-dimensional audio coded bitstream; in the embodiment, the coding method used in step C3 is differential coding.
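The inter-frame differential coding of step C3, together with the matching differential decoding, can be sketched as follows for the quantized levels of one parameter tracked over successive frames. The function names and the convention of predicting the first frame from zero are assumptions for illustration; the text only states that the coding is differential.

```python
def differential_encode(levels_per_frame):
    """Send the first frame's level as is, then only frame-to-frame differences."""
    residuals = []
    previous = 0  # ASSUMPTION: the first frame is predicted from zero
    for level in levels_per_frame:
        residuals.append(level - previous)
        previous = level
    return residuals


def differential_decode(residuals):
    """Rebuild the levels by accumulating the received differences."""
    levels = []
    previous = 0
    for residual in residuals:
        previous += residual
        levels.append(previous)
    return levels


# Quantized horizontal-angle levels of one object over five frames.
levels = [6, 6, 7, 7, 5]
residuals = differential_encode(levels)
recovered = differential_decode(residuals)
```

Because an object usually moves little between adjacent frames, most residuals are small or zero, which is what an entropy coder applied afterwards can exploit.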
Referring to Fig. 2, the decoder of the embodiment executes the following flow:
Step D1: inter-frame decoding is performed on the spatial parameters; in the embodiment, the decoding method used in step D1 is differential decoding.
Step D2: intra-frame decoding is performed on the spatial parameters. When carrying out step D2, the embodiment performs the following sub-steps:
D21: the spatial parameters are intra-frame decoded, using either perceptual decoding or direct decoding;
D22: the intra-frame decoded spatial parameters are inverse quantized, the inverse quantization corresponding to either perceptual quantization or direct quantization;
D23: the clustered spatial parameters of different frequency bands belonging to the same audio object in the same frame are mapped back to their corresponding frequency bands, restoring the original spatial parameters.
Step D3: the frequency-domain representation S'(n, f) of the audio signal is transformed to the time domain to obtain the time-domain representation s'(t), where S'(n, f) is the signal S(n, f) after encoding and decoding and s'(t) is the signal s(t) after encoding and decoding. The time-domain representation s'(t) of the audio signal containing n objects, the spatial parameters obtained in step D2, and the original index Index(n, f) of the audio object to which each spatial parameter belongs together constitute the decoded three-dimensional audio: the audio signal containing n objects, the spatial parameters, and the audio object indices of the spatial parameters. On this basis, a specific implementation can rebuild the three-dimensional sound field with loudspeakers in different configurations or with headphones, restoring the original three-dimensional audio.
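The mapping in sub-step D23 amounts to a lookup from each band's object index to that object's decoded parameters. The sketch below assumes the clustered parameters are held in a dictionary keyed by object number and that Index(n, f) for one frame is available as a list; both the function name and these data layouts are hypothetical.

```python
def map_to_bands(clustered, index):
    """Expand per-object parameters back to one (theta, phi, r) tuple per band.

    clustered: {object_number: (theta, phi, r)} after inverse quantization.
    index:     list of object numbers; index[f] plays the role of Index(n, f).
    """
    return [clustered[obj] for obj in index]


# Two decoded objects and the per-band object indices of one frame.
decoded = {1: (30.0, 10.0, 1.0), 2: (-45.0, 0.0, 2.0)}
band_index = [1, 1, 2, 2, 1]
per_band = map_to_bands(decoded, band_index)
```

This step needs Index(n, f) at the decoder, which is consistent with the text listing the original object index as part of the decoded output.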
The embodiment transforms the 8 directional audio signals after encoding and decoding, (S'_1(n, f), S'_2(n, f), ..., S'_8(n, f)), to the time domain, obtaining the 8 directional audio signals s'_1(t), s'_2(t), ..., s'_8(t). Together with the decoded spatial parameters and the original index Index(n, f) of the audio object to which each spatial parameter belongs, these constitute the decoded three-dimensional audio: the audio signal containing n objects, the spatial parameters, and the audio object indices of the spatial parameters. The embodiment uses headphones to play back the three-dimensional audio signal with distance side information. Headphone reproduction of three-dimensional audio requires a head-related transfer function (HRTF) database. The PKU&IOA HRTF database contains measurements in both the far field and the near field, with the distance r ranging from 20 cm to 160 cm and with resolutions of 5° and 10° for the horizontal angle and the elevation angle, respectively. We chose the PKU&IOA HRTF database to reconstruct the three-dimensional audio that had undergone intra-frame and inter-frame compression.
Experimental comparison shows that the three-dimensional audio compression method with added intra-frame coding compresses better than the original method with inter-frame coding only: the compression ratio is higher while the quality of the reconstructed audio is maintained. Because the added intra-frame coding removes intra-frame redundancy, the method raises the compression ratio of the three-dimensional spatial parameters and lowers their bit rate without degrading the quality of the reconstructed three-dimensional audio.
The method provided by the present invention can be run automatically using software, or implemented as a corresponding modular system. The present invention provides a coding and decoding system for improving the compression ratio of three-dimensional audio spatial parameters, comprising an encoder and a decoder. The encoder comprises the following modules:
A time-frequency transform module, whose input comprises a three-dimensional audio signal containing n objects, the three-dimensional audio spatial parameters, and the index of the audio object to which each spatial parameter belongs, and which transforms the time-domain three-dimensional audio signal to the frequency domain. Specifically, let the time-domain signal of the three-dimensional audio be s(t), comprising s_1(t), s_2(t), ..., s_k(t), ..., s_K(t); let the spatial parameters of the three-dimensional audio be (θ(n, f), φ(n, f), r(n, f)), comprising (θ_1(n, f), φ_1(n, f), r_1(n, f)), ..., (θ_K(n, f), φ_K(n, f), r_K(n, f)); and let the index of the audio object to which a spatial parameter belongs be Index(n, f). The time-domain signal s(t) is transformed to the frequency domain to obtain the frequency-domain signal S(n, f), comprising S_1(n, f), S_2(n, f), ..., S_k(n, f), ..., S_K(n, f). Here s_k(t) is the time-domain representation of the k-th directional audio signal and t denotes time; S_k(n, f) is the frequency-domain representation of the k-th directional audio signal; (θ_k(n, f), φ_k(n, f), r_k(n, f)) is the spatial parameter corresponding to the k-th directional audio signal, where θ is the horizontal angle, φ is the elevation angle, and r is the distance side information; k takes the values 1, 2, ..., K, where K is the total number of original directional audio signals; the value of Index(n, f) is the index of the audio object to which the spatial parameter belongs; n denotes the frame index and f the frequency index.
An intra-frame coding module, for performing intra-frame coding on the input spatial parameters, including clustering the spatial parameters of different frequency bands belonging to the same audio object within the same frame, quantizing the clustered spatial parameters, and intra-frame coding the quantized spatial parameters.
An inter-frame coding module, for performing inter-frame coding on the spatial parameters to generate the three-dimensional audio coded bitstream; the coding method is differential coding.
The decoder comprises the following modules:
An inter-frame decoding module, for performing inter-frame decoding on the spatial parameters; the decoding method is differential decoding.
An intra-frame decoding module, for performing intra-frame decoding on the spatial parameters, including intra-frame decoding the spatial parameters, inverse quantizing the intra-frame decoded spatial parameters, and restoring the original spatial parameters.
An inverse time-frequency transform module, for transforming the frequency-domain representation S'(n, f) of the audio signal to the time domain to obtain the time-domain representation s'(t), where S'(n, f) is the signal S(n, f) after encoding and decoding and s'(t) is the signal s(t) after encoding and decoding. The time-domain representation s'(t) of the audio signal containing n objects, the spatial parameters obtained in step D2, and the original index Index(n, f) of the audio object to which each spatial parameter belongs together constitute the decoded three-dimensional audio: the audio signal containing n objects, the spatial parameters, and the audio object indices of the spatial parameters.
The intra-frame coding module includes a clustering module, which clusters the spatial parameters of different frequency bands belonging to the same audio object within the same frame; that is, the spatial parameters with the same n and the same value of Index(n, f) but different f are clustered to generate the clustered spatial parameters.
The intra-frame decoding module includes a recovery module, which maps the clustered spatial parameters of different frequency bands belonging to the same audio object in the same frame back to their corresponding frequency bands, restoring the original spatial parameters.
The intra-frame coding module includes a quantization module, which quantizes the clustered spatial parameters, the quantization being perceptual quantization or direct quantization; the quantized spatial parameters are intra-frame coded, the coding being perceptual coding or direct coding.
The intra-frame decoding module includes an inverse quantization module; the spatial parameters are intra-frame decoded, the decoding being perceptual decoding or direct decoding, and the intra-frame decoded spatial parameters are inverse quantized, the inverse quantization corresponding to either perceptual quantization or direct quantization.
The implementation of each module corresponds to the respective method step and is not described again here.
The specific embodiments described herein merely illustrate the present invention by example. Those skilled in the art to which the invention belongs can make various modifications or additions to the described embodiments, or substitute them in similar ways, without departing from the substance of the invention or exceeding the scope defined by the appended claims.
Claims (10)
1. a kind of for improving the decoding method of three-dimensional audio spatial parameter compression ratio, which is characterized in that including cataloged procedure
And decoding process, the cataloged procedure the following steps are included:
Step C1, input include belonging to the three-dimensional sound signal comprising n object, three-dimensional audio spatial parameter and spatial parameter
Three-dimensional audio time-domain signal is transformed to frequency domain by the number of audio object, specific as follows,
If the time-domain signal of three-dimensional audio is s (t), the s (t) includes s1(t)、s2(t)、sk(t)…、sK(t), three-dimensional audio
Spatial parameterDescribedIncluding The number of the affiliated audio object of spatial parameter is Index (n, f);The time-domain signal s (t) of three-dimensional audio is become
Frequency domain is changed to, obtains the frequency-region signal S (n, f) of three-dimensional audio, the S (n, f) includes S1(n,f)、S2(n,f)、Sk(n,
f)…、SK(n,f);Wherein, skIt (t) is the time domain expression of k-th of aeoplotropism audio signal, t indicates the time;Sk(n, f) is kth
The frequency domain presentation of a aeoplotropism audio signal;Indicate the corresponding spatial parameter of k-th of aeoplotropism audio signal, θ
For horizontal angle,For elevation angle, r is apart from side information;The value of k is 1,2 ..., and K, K are original aeoplotropism audio signal
Sum;The value of Index (n, f) is the number of the affiliated audio object of spatial parameter;N represents frame index, and f represents frequency indices;
Step C2: perform intra-frame coding on the input spatial parameters, implemented as follows: cluster the spatial parameters of the
different frequency bands that belong to the same audio object within the same frame; quantize the clustered spatial parameters; and
perform intra-frame coding on the quantized spatial parameters;
Step C3: perform inter-frame coding on the spatial parameters to generate the three-dimensional audio coded bitstream, the coding method being differential coding;
the decoding process comprises the following steps:
Step D1: perform inter-frame decoding on the spatial parameters, the decoding method being differential decoding;
Step D2: perform intra-frame decoding on the spatial parameters, implemented as follows: intra-frame decode the spatial parameters;
inverse-quantize the intra-frame decoded spatial parameters; and restore the original spatial parameters;
Step D3: transform the frequency-domain representation S'(n, f) of the audio signal to the time domain to obtain the time-domain representation s'(t) of the audio signal,
where S'(n, f) is the signal S(n, f) after encoding and decoding, and s'(t) is the signal s(t) after encoding and decoding;
the time-domain representation s'(t) of the audio signal comprising n objects, the spatial parameters obtained in step D2, and the original
numbers Index(n, f) of the audio objects to which the spatial parameters belong together constitute the decoded output: the audio signal of the
three-dimensional audio comprising n objects, its spatial parameters, and the numbers of the audio objects to which the spatial parameters belong.
2. The encoding and decoding method for improving the compression ratio of three-dimensional audio spatial parameters according to claim 1, characterized in that:
in step C2, the spatial parameters of the different frequency bands belonging to the same audio object within the same frame are clustered,
that is, spatial parameters with the same n and the same value of Index(n, f) but different f are clustered to generate the
clustered spatial parameters.
3. The encoding and decoding method for improving the compression ratio of three-dimensional audio spatial parameters according to claim 1, characterized in that:
in step D2, the clustered spatial parameters of the different frequency bands belonging to the same audio object in the same frame are mapped to their corresponding frequency bands and thereby restored to the original spatial parameters.
4. The encoding and decoding method for improving the compression ratio of three-dimensional audio spatial parameters according to claim 1, characterized in that:
in step C2, the clustered spatial parameters are quantized, the quantization being perceptual quantization or
direct quantization; and the quantized spatial parameters are intra-frame coded, the coding being perceptual coding or direct coding.
5. The encoding and decoding method for improving the compression ratio of three-dimensional audio spatial parameters according to claim 1, characterized in that:
in step D2, the spatial parameters are intra-frame decoded, the decoding being perceptual decoding or direct decoding; and the
intra-frame decoded spatial parameters are inverse-quantized, the inverse quantization being the inverse of perceptual quantization or the inverse of
direct quantization.
6. An encoding and decoding system for improving the compression ratio of three-dimensional audio spatial parameters, characterized in that it comprises an encoder and
a decoder,
the encoder comprising the following modules:
a time-frequency transform module, for inputting a three-dimensional audio signal comprising n objects, the three-dimensional audio spatial parameters, and the
numbers of the audio objects to which the spatial parameters belong, and for transforming the three-dimensional audio time-domain signal to the frequency domain. Specifically, let the time-domain
signal of the three-dimensional audio be s(t), where s(t) comprises s_1(t), s_2(t), ..., s_k(t), ..., s_K(t); let the spatial parameters of the three-dimensional audio comprise the horizontal angle θ, the elevation angle, and the distance side information r; and let
the number of the audio object to which a spatial parameter belongs be Index(n, f). The time-domain signal s(t) of the three-dimensional audio is transformed to the frequency domain to
obtain the frequency-domain signal S(n, f), where S(n, f) comprises S_1(n, f), S_2(n, f), ..., S_k(n, f), ..., S_K(n, f).
Here s_k(t) is the time-domain representation of the k-th directional audio signal and t denotes time; S_k(n, f) is the frequency-domain representation of the k-th directional audio
signal; each directional audio signal has its own corresponding spatial parameter;
k takes the values 1, 2, ..., K, where K is the total number of original directional audio signals; the value of Index(n, f)
is the number of the audio object to which the spatial parameter belongs; n is the frame index and f is the frequency index;
an intra-frame coding module, for performing intra-frame coding on the input spatial parameters, including clustering the spatial parameters of the
different frequency bands that belong to the same audio object within the same frame; quantizing the clustered spatial parameters; and
performing intra-frame coding on the quantized spatial parameters;
an inter-frame coding module, for performing inter-frame coding on the spatial parameters to generate the three-dimensional audio coded bitstream, the coding method being differential
coding;
The decoder comprises the following modules:
an inter-frame decoding module, for performing inter-frame decoding on the spatial parameters, the decoding method being differential decoding;
an intra-frame decoding module, for performing intra-frame decoding on the spatial parameters, including intra-frame decoding the spatial parameters;
inverse-quantizing the intra-frame decoded spatial parameters; and restoring the original spatial parameters;
a time-frequency inverse transform module, for transforming the frequency-domain representation S'(n, f) of the audio signal to the time domain to obtain the time-domain
representation s'(t) of the audio signal, where S'(n, f) is the signal S(n, f) after encoding and decoding, and s'(t) is the signal s(t) after encoding and
decoding; the time-domain representation s'(t) of the audio signal comprising n objects, the spatial parameters obtained by the intra-frame decoding module, and the original numbers Index(n, f) of the audio objects to which the spatial parameters belong together constitute the decoded output:
the audio signal of the three-dimensional audio comprising n objects, its spatial parameters, and the numbers of the audio objects to which the spatial parameters belong.
7. The encoding and decoding system for improving the compression ratio of three-dimensional audio spatial parameters according to claim 6, characterized in that:
the intra-frame coding module includes a clustering module, the clustering module being used to cluster the spatial parameters of the
different frequency bands belonging to the same audio object within the same frame, that is, to cluster spatial parameters with the same n and the same value of Index(n, f) but different f, generating the clustered spatial parameters.
8. The encoding and decoding system for improving the compression ratio of three-dimensional audio spatial parameters according to claim 6, characterized in that:
the intra-frame decoding module includes a recovery module, the recovery module being used to map the clustered spatial parameters of the
different frequency bands belonging to the same audio object in the same frame to their corresponding frequency bands, thereby restoring them to the
original spatial parameters.
9. The encoding and decoding system for improving the compression ratio of three-dimensional audio spatial parameters according to claim 6, characterized in that:
the intra-frame coding module includes a quantization module, the quantization module being used to quantize the clustered spatial parameters,
the quantization being perceptual quantization or direct quantization, and to perform intra-frame coding on the quantized spatial parameters, the
coding being perceptual coding or direct coding.
10. The encoding and decoding system for improving the compression ratio of three-dimensional audio spatial parameters according to claim 6, characterized
in that: the intra-frame decoding module includes an inverse quantization module, the inverse quantization module being used to intra-frame decode the spatial parameters,
the decoding being perceptual decoding or direct decoding, and to inverse-quantize the intra-frame decoded spatial parameters, the inverse
quantization being the inverse of perceptual quantization or the inverse of direct quantization.
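The spatial-parameter path described in the claims (cluster, quantize, differentially encode, then invert each step) can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names, the per-frame data layout (a list of (horizontal angle, elevation angle, distance) triples plus a parallel list of Index values), the component-wise mean as cluster representative, the uniform step sizes in STEP standing in for "direct quantization", and the all-zero reference for the first frame are all assumptions made for illustration. The intra-frame coding stage and the time-frequency transform are omitted.

```python
from collections import defaultdict

# Hypothetical layout (not fixed by the claims):
#   params[f] = (horizontal angle, elevation angle, distance) for band f
#   index[f]  = number of the audio object that band f belongs to
STEP = (5.0, 5.0, 0.25)  # assumed uniform step sizes per component


def cluster(params, index):
    """Step C2 (clustering): one representative triple per object per frame."""
    groups = defaultdict(list)
    for f, obj in enumerate(index):
        groups[obj].append(params[f])
    # Assumption: the representative is the component-wise mean of the bands.
    return {obj: tuple(sum(c) / len(c) for c in zip(*members))
            for obj, members in groups.items()}


def quantize(clustered):
    """Step C2 (quantization): uniform quantization to integer levels."""
    return {obj: tuple(round(v / s) for v, s in zip(p, STEP))
            for obj, p in clustered.items()}


def dequantize(levels):
    """Step D2 (inverse quantization): integer levels back to values."""
    return {obj: tuple(q * s for q, s in zip(p, STEP))
            for obj, p in levels.items()}


def diff_encode(frames):
    """Step C3: code each frame as the difference from the previous frame."""
    prev, deltas = {}, []
    for frame in frames:
        deltas.append({obj: tuple(a - b for a, b in
                                  zip(q, prev.get(obj, (0, 0, 0))))
                       for obj, q in frame.items()})
        prev = frame
    return deltas


def diff_decode(deltas):
    """Step D1: accumulate differences to recover the quantized levels."""
    prev, frames = {}, []
    for delta in deltas:
        frame = {obj: tuple(a + b for a, b in
                            zip(d, prev.get(obj, (0, 0, 0))))
                 for obj, d in delta.items()}
        frames.append(frame)
        prev = frame
    return frames


def restore(decoded, index):
    """Step D2 (restoration): map each object's triple back to its bands."""
    return [decoded[obj] for obj in index]
```

Under these assumptions a frame carries one triple per audio object rather than one per frequency band, and an object that does not move produces all-zero differences between frames. The round trip reproduces the input exactly only when the input values lie on the quantization grid and all bands of an object share the same parameters; otherwise it returns the quantized cluster representative.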
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610541939.8A CN106023999B (en) | 2016-07-11 | 2016-07-11 | For improving the decoding method and system of three-dimensional audio spatial parameter compression ratio |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610541939.8A CN106023999B (en) | 2016-07-11 | 2016-07-11 | For improving the decoding method and system of three-dimensional audio spatial parameter compression ratio |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106023999A (en) | 2016-10-12 |
CN106023999B (en) | 2019-06-11 |
Family
ID=57108555
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN201610541939.8A | Active | CN106023999B (en) | 2016-07-11 | 2016-07-11 | For improving the decoding method and system of three-dimensional audio spatial parameter compression ratio |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106023999B (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108206022B (en) * | 2016-12-16 | 2020-12-18 | 南京青衿信息科技有限公司 | Codec for transmitting three-dimensional acoustic signals by using AES/EBU channel and coding and decoding method thereof |
CN111656441B (en) | 2017-11-17 | 2023-10-03 | 弗劳恩霍夫应用研究促进协会 | Apparatus and method for encoding or decoding directional audio coding parameters |
GB2576769A (en) | 2018-08-31 | 2020-03-04 | Nokia Technologies Oy | Spatial parameter signalling |
GB2578625A (en) | 2018-11-01 | 2020-05-20 | Nokia Technologies Oy | Apparatus, methods and computer programs for encoding spatial metadata |
GB2586586A (en) * | 2019-08-16 | 2021-03-03 | Nokia Technologies Oy | Quantization of spatial audio direction parameters |
CA3202283A1 (en) * | 2020-12-15 | 2022-06-23 | Adriana Vasilache | Quantizing spatial audio parameters |
CN115662448B (en) * | 2022-10-17 | 2023-10-20 | 深圳市超时代软件有限公司 | Method and device for converting audio data coding format |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101521013A (en) * | 2009-04-08 | 2009-09-02 | 武汉大学 | Spatial audio parameter bidirectional interframe predictive coding and decoding devices |
CN101609674A (en) * | 2008-06-20 | 2009-12-23 | 华为技术有限公司 | Decoding method, device and system |
US7974287B2 (en) * | 2006-02-23 | 2011-07-05 | Lg Electronics Inc. | Method and apparatus for processing an audio signal |
CN102177542A (en) * | 2008-10-10 | 2011-09-07 | 艾利森电话股份有限公司 | Energy conservative multi-channel audio coding |
CN103165134A (en) * | 2013-04-02 | 2013-06-19 | 武汉大学 | Coding and decoding device of audio signal high frequency parameter |
CN103400582A (en) * | 2013-08-13 | 2013-11-20 | 武汉大学 | Encoding and decoding method and system for multi-channel three-dimensional voice frequency |
CN103928030A (en) * | 2014-04-30 | 2014-07-16 | 武汉大学 | Gradable audio coding system and method based on sub-band space attention measure |
CN104064194A (en) * | 2014-06-30 | 2014-09-24 | 武汉大学 | Parameter coding/decoding method and parameter coding/decoding system used for improving sense of space and sense of distance of three-dimensional audio frequency |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20070025907A (en) * | 2005-08-30 | 2007-03-08 | 엘지전자 주식회사 | Method of effective bitstream composition for the parameter band number of channel conversion module in multi-channel audio coding |
Non-Patent Citations (1)
Title |
---|
AVS-P10 mobile audio codec standard and key technologies; Hu Ruimin et al.; 《电视技术》; 2010-10-31; Vol. 34, No. 10; pp. 4-8 |
Also Published As
Publication number | Publication date |
---|---|
CN106023999A (en) | 2016-10-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106023999B (en) | For improving the decoding method and system of three-dimensional audio spatial parameter compression ratio | |
KR101221918B1 (en) | A method and an apparatus for processing a signal | |
JP6346278B2 (en) | Audio encoder, audio decoder, method, and computer program using joint encoded residual signal | |
KR20200100061A (en) | Apparatus and method for encoding or decoding directional audio coding parameters using different time/frequency resolutions | |
CN102342105B (en) | For carrying out the Apparatus and method for of Code And Decode to multi-layer video | |
US20220180881A1 (en) | Speech signal encoding and decoding methods and apparatuses, electronic device, and storage medium | |
CN106415714A (en) | Coding independent frames of ambient higher-order ambisonic coefficients | |
CN111226442A (en) | Method and apparatus for configuring transforms for video compression | |
TW200935403A (en) | Technique for encoding/decoding of codebook indices for quantized MDCT spectrum in scalable speech and audio codecs | |
CN106463121A (en) | Higher order ambisonics signal compression | |
TWI702594B (en) | Backward-compatible integration of high frequency reconstruction techniques for audio signals | |
TWI820123B (en) | Integration of high frequency reconstruction techniques with reduced post-processing delay | |
CN109448741A (en) | A kind of 3D audio coding, coding/decoding method and device | |
BRPI0612218A2 (en) | adaptive residual audio coding | |
JP7413334B2 (en) | Backward-compatible integration of harmonic converters for high-frequency reconstruction of audio signals | |
CN104064194A (en) | Parameter coding/decoding method and parameter coding/decoding system used for improving sense of space and sense of distance of three-dimensional audio frequency | |
TWI463483B (en) | Method and device of bitrate distribution/truncation for scalable audio coding | |
CN109996073A (en) | A kind of method for compressing image, system, readable storage medium storing program for executing and computer equipment | |
WO2015096789A1 (en) | Method and device for use in vector quantization encoding/decoding of audio signal | |
CN110660401A (en) | Audio object coding and decoding method based on high-low frequency domain resolution switching | |
JP6094322B2 (en) | Orthogonal transformation device, orthogonal transformation method, computer program for orthogonal transformation, and audio decoding device | |
CN104347077B (en) | A kind of stereo coding/decoding method | |
KR20230035373A (en) | Audio encoding method, audio decoding method, related device, and computer readable storage medium | |
CN112365896A (en) | Object-oriented encoding method based on stack type sparse self-encoder | |
CN102768834B (en) | A kind of realization decoded method of audio frame |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||