CN102682779B - Double-channel encoding and decoding method for 3D audio frequency and codec - Google Patents


Info

Publication number
CN102682779B
Authority
CN
China
Prior art keywords
signal
subband signal
frequency
channel
coding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2012101839630A
Other languages
Chinese (zh)
Other versions
CN102682779A (en
Inventor
胡瑞敏
董石
郑翔
涂卫平
杨玉红
王晓晨
高戈
刘梦颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University WHU filed Critical Wuhan University WHU
Priority to CN2012101839630A priority Critical patent/CN102682779B/en
Publication of CN102682779A publication Critical patent/CN102682779A/en
Application granted granted Critical
Publication of CN102682779B publication Critical patent/CN102682779B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention provides a two-channel encoding and decoding method for 3D audio and a corresponding codec. Building on two-channel techniques for 3D audio and on the characteristics of human hearing, the invention devotes more coding energy to the principal component and encodes different audio signals with different coding methods. The method reduces encoding and decoding noise, gives the reconstructed signal a higher signal-to-noise ratio, and simulates 3D audio signals better.

Description

Two-channel encoding and decoding method and codec for 3D audio
Technical field
The present invention relates to the field of audio compression, and in particular to a two-channel encoding and decoding method and codec for 3D audio.
Background
With the rapid development of information technology in the new century, audio compression has come into widespread use. Today's 3D audio technologies, such as 5.1-channel, 7.1-channel, and formats that render audio over even more channels, are becoming increasingly popular. Multichannel audio provides a more realistic, immersive listening experience. However, as the number of audio channels grows, the bit rate produced by coding grows linearly with it, which demands more storage space and more real-time transmission bandwidth. Many efficient coding techniques, such as downmix-based parametric stereo coding, have emerged as a result, along with a number of stereo audio codecs built on them, such as PS, EAAC+, MPEG-Surround, and PCA-based stereo audio codecs. With multiple sound sources in multiple directions, however, conventional audio codecs do not deliver good subjective or objective sound quality.
Summary of the invention
To further improve audio coding quality, reduce encoding and decoding noise, and improve subjective and objective sound quality, the present invention proposes a two-channel encoding and decoding method and codec for 3D audio.
To solve the above technical problem, the present invention adopts the following technical solutions:
One. A two-channel encoding method for 3D audio, comprising the steps:
S1.1: apply a time-frequency transform to each channel of the input two-channel signal, converting the time-domain two-channel signal into a frequency-domain two-channel signal;
S1.2: divide each channel of the frequency-domain two-channel signal into subbands to obtain two-channel subband signals;
S1.3: encode the two-channel subband signals one by one with both the parametric coding method based on frequency-domain principal components and the parametric coding method based on polar-coordinate principal components, to obtain the coding noise energy that each two-channel subband signal produces under the two methods;
The coding noise energy obtained by encoding the two-channel subband signals with the parametric coding method based on polar-coordinate principal components is
$$\varepsilon_{2,k} = \sum_{j=1}^{n}\left[\rho_k(j) - \frac{1}{n}\sum_{j=1}^{n}\rho_k(j)\right]^2$$
where $\varepsilon_{2,k}$ is the coding noise energy of the k-th two-channel subband signal and $\rho_k(j)$ is the signal amplitude of the j-th frequency bin in the k-th two-channel subband signal,
$$\rho_k(j) = \sqrt{L_k^2(j) + R_k^2(j)}$$
$L_k(j)$ and $R_k(j)$ are the signals of the j-th frequency bin in the k-th left-channel and right-channel subband signals, respectively, and $n$ is the number of frequency bins in the k-th two-channel subband signal;
S1.4: for each two-channel subband signal, select the parametric coding method with the smaller coding noise energy and use it to further encode that subband signal; if the two noise energies are equal, select the parametric coding method based on frequency-domain principal components. If the frequency-domain method is used, output the coded principal component sequence, the deflection angle, and the noise energy ratio of the two-channel subband signal; if the polar-coordinate method is used, output the coded principal component sequence, the rotation radius, and the noise energy ratio of the two-channel subband signal;
The coded principal component sequence obtained with the parametric coding method based on polar-coordinate principal components is:
$$PC_k = \{PC_k(j) \mid j = 1, 2, \ldots, n\}$$
where $PC_k$ is the principal component sequence of the k-th two-channel subband signal and $PC_k(j)$ is the principal component of the j-th frequency bin in the k-th two-channel subband signal,
[equation image: expression for $PC_k(j)$; not extracted]
[equation image: expression for the deflection angle of the j-th frequency bin in the k-th two-channel subband signal; not extracted]
$L_k(j)$ and $R_k(j)$ are the signals of the j-th frequency bin in the k-th left-channel and right-channel subband signals, respectively, and $n$ is the number of frequency bins in subband k;
The rotation radius obtained with the parametric coding method based on polar-coordinate principal components is:
$$\bar{\rho}_k = \frac{1}{n}\sum_{j=1}^{n}\sqrt{L_k^2(j) + R_k^2(j)}$$
where $\bar{\rho}_k$ is the rotation radius of the k-th two-channel subband signal, $L_k(j)$ and $R_k(j)$ are the signals of the j-th frequency bin in the k-th left-channel and right-channel subband signals, respectively, and $n$ is the number of frequency bins in the k-th two-channel subband signal;
The noise energy ratio obtained with the parametric coding method based on polar-coordinate principal components is:
$$\mathrm{PAR} = \frac{\pi^2}{48\sum_{j=1}^{n}\left[\rho_k(j) - \frac{1}{n}\sum_{j=1}^{n}\rho_k(j)\right]^2}$$
where $\rho_k(j)$ is the signal amplitude of the j-th frequency bin in the k-th two-channel subband signal,
$$\rho_k(j) = \sqrt{L_k^2(j) + R_k^2(j)}$$
$L_k(j)$ and $R_k(j)$ are the signals of the j-th frequency bin in the k-th left-channel and right-channel subband signals, respectively, and $n$ is the number of frequency bins in the k-th two-channel subband signal;
S1.5: downmix the coded principal component sequences to obtain a downmix signal;
S1.6: encode the downmix signal with a core encoder to obtain the coded bitstream, and write the deflection angle or the rotation radius, together with the noise energy ratio, into the coded bitstream.
Two. A two-channel encoder for 3D audio, comprising:
A time-frequency transform module, configured to apply a time-frequency transform to each channel of the input two-channel signal, converting the time-domain two-channel signal into a frequency-domain two-channel signal;
A subband division module, configured to divide each channel of the frequency-domain two-channel signal into subbands to obtain two-channel subband signals;
A coding noise energy computation module, configured to encode the two-channel subband signals one by one with both the parametric coding method based on frequency-domain principal components and the parametric coding method based on polar-coordinate principal components, to obtain the coding noise energy that each two-channel subband signal produces under the two methods. The coding noise energy obtained by encoding the two-channel subband signals with the parametric coding method based on polar-coordinate principal components is
$$\varepsilon_{2,k} = \sum_{j=1}^{n}\left[\rho_k(j) - \frac{1}{n}\sum_{j=1}^{n}\rho_k(j)\right]^2$$
where $\varepsilon_{2,k}$ is the coding noise energy of the k-th two-channel subband signal and $\rho_k(j)$ is the signal amplitude of the j-th frequency bin in the k-th two-channel subband signal,
$$\rho_k(j) = \sqrt{L_k^2(j) + R_k^2(j)}$$
$L_k(j)$ and $R_k(j)$ are the signals of the j-th frequency bin in the k-th left-channel and right-channel subband signals, respectively, and $n$ is the number of frequency bins in the k-th two-channel subband signal;
A parametric coding module, configured to select, for each two-channel subband signal, the parametric coding method with the smaller coding noise energy and use it to further encode that subband signal; if the two noise energies are equal, the parametric coding method based on frequency-domain principal components is selected. If the frequency-domain method is used, the module outputs the coded principal component sequence, the deflection angle, and the noise energy ratio of the two-channel subband signal; if the polar-coordinate method is used, it outputs the coded principal component sequence, the rotation radius, and the noise energy ratio of the two-channel subband signal;
The coded principal component sequence obtained with the parametric coding method based on polar-coordinate principal components is:
$$PC_k = \{PC_k(j) \mid j = 1, 2, \ldots, n\}$$
where $PC_k$ is the principal component sequence of the k-th two-channel subband signal and $PC_k(j)$ is the principal component of the j-th frequency bin in the k-th two-channel subband signal,
[equation image: expression for $PC_k(j)$; not extracted]
[equation image: expression for the deflection angle of the j-th frequency bin in the k-th two-channel subband signal; not extracted]
$L_k(j)$ and $R_k(j)$ are the signals of the j-th frequency bin in the k-th left-channel and right-channel subband signals, respectively, and $n$ is the number of frequency bins in subband k;
The rotation radius obtained with the parametric coding method based on polar-coordinate principal components is:
$$\bar{\rho}_k = \frac{1}{n}\sum_{j=1}^{n}\sqrt{L_k^2(j) + R_k^2(j)}$$
where $\bar{\rho}_k$ is the rotation radius of the k-th two-channel subband signal, $L_k(j)$ and $R_k(j)$ are the signals of the j-th frequency bin in the k-th left-channel and right-channel subband signals, respectively, and $n$ is the number of frequency bins in the k-th two-channel subband signal;
The noise energy ratio obtained with the parametric coding method based on polar-coordinate principal components is:
$$\mathrm{PAR} = \frac{\pi^2}{48\sum_{j=1}^{n}\left[\rho_k(j) - \frac{1}{n}\sum_{j=1}^{n}\rho_k(j)\right]^2}$$
where $\rho_k(j)$ is the signal amplitude of the j-th frequency bin in the k-th two-channel subband signal,
$$\rho_k(j) = \sqrt{L_k^2(j) + R_k^2(j)}$$
$L_k(j)$ and $R_k(j)$ are the signals of the j-th frequency bin in the k-th left-channel and right-channel subband signals, respectively, and $n$ is the number of frequency bins in the k-th two-channel subband signal;
A downmix module, configured to downmix the coded principal component sequences to obtain a downmix signal;
A core encoder, configured to encode the downmix signal to obtain the coded bitstream, and to write the deflection angle or the rotation radius, together with the noise energy ratio, into the coded bitstream.
Three. A two-channel decoding method for 3D audio, comprising the steps:
S2.1: decode the coded bitstream with a core decoder to obtain a decoded signal;
S2.2: divide the decoded signal into subbands to obtain decoded subband signals;
S2.3: decode the decoded subband signals with the parametric decoding method that corresponds to the parametric coding method used at encoding, together with the deflection angle or rotation radius and the noise energy ratio in the coded bitstream, to obtain reconstructed frequency-domain subband signals;
S2.4: merge the reconstructed frequency-domain subband signals into a reconstructed frequency-domain signal;
S2.5: apply the inverse time-frequency transform to the frequency-domain signal, converting it into a time-domain signal and recovering the reconstructed audio signal.
The parametric decoding method is either the parametric decoding method based on frequency-domain principal components or the parametric decoding method based on polar-coordinate principal components.
Decoding the decoded subband signals with the parametric decoding method based on frequency-domain principal components, to obtain the reconstructed frequency-domain subband signals, is specifically: according to the noise energy ratio in the coded bitstream, generate white noise having the same energy as the original signal, and, combining the principal component sequence with the deflection angle in the coded bitstream, restore the decoded subband signals to obtain the reconstructed frequency-domain subband signals.
Decoding the decoded subband signals with the parametric decoding method based on polar-coordinate principal components, to obtain the reconstructed frequency-domain subband signals, is specifically: according to the noise energy ratio in the coded bitstream, generate white noise having the same energy as the original signal, and, combining the principal component sequence with the rotation radius in the coded bitstream, restore the decoded subband signals to obtain the reconstructed frequency-domain subband signals.
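The frequency-domain branch of the decoder can be sketched as follows. This is only an illustration under assumptions the text does not state: the coefficients are treated as real-valued, the encoder is assumed to rotate each (L, R) pair by the deflection angle as in equation (12) of the embodiment, PAR is taken as principal energy divided by secondary energy with energy measured as a sum of squares, and the function name `pca_decode_subband` is hypothetical.

```python
import numpy as np

def pca_decode_subband(pc, theta, par, rng=None):
    """Rebuild (L_k, R_k) for one subband from its decoded principal components.

    Assumptions (not from the patent text): real-valued coefficients, and
    PAR = sum(pc**2) / sum(secondary**2).
    """
    rng = np.random.default_rng() if rng is None else rng
    pc = np.asarray(pc, dtype=float)
    # White noise stands in for the secondary (ambient) component.
    noise = rng.standard_normal(pc.shape)
    target_energy = np.sum(pc ** 2) / par
    noise *= np.sqrt(target_energy / np.sum(noise ** 2))
    # Undo the encoder rotation: multiply by the transpose of the rotation matrix.
    c, s = np.cos(theta), np.sin(theta)
    left = c * pc - s * noise
    right = s * pc + c * noise
    return left, right

# Hypothetical demo values
rng = np.random.default_rng(0)
pc_demo = rng.standard_normal(32)
left_demo, right_demo = pca_decode_subband(pc_demo, theta=0.3, par=10.0, rng=rng)
```

Rotating the output forward again returns the principal components exactly, while the secondary component is only matched in energy, which is the trade-off the noise-substitution step accepts.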
Four. A two-channel decoder for 3D audio, comprising:
A core decoder, configured to decode the coded bitstream to obtain a decoded signal;
A subband division module, configured to divide the decoded signal into subbands to obtain decoded subband signals;
A parametric decoding module, configured to decode the decoded subband signals with the parametric decoding method that corresponds to the parametric coding method used at encoding, together with the deflection angle or rotation radius and the noise energy ratio in the coded bitstream, to obtain reconstructed frequency-domain subband signals;
A subband merging module, configured to merge the reconstructed frequency-domain subband signals into a reconstructed frequency-domain signal;
An inverse time-frequency transform module, configured to apply the inverse time-frequency transform to the frequency-domain signal, converting it into a time-domain signal and recovering the reconstructed audio signal.
The parametric decoding module further comprises a parametric decoding module based on frequency-domain principal components and a parametric decoding module based on polar-coordinate principal components.
The parametric decoding module based on frequency-domain principal components is configured to generate, according to the noise energy ratio in the coded bitstream, white noise having the same energy as the original signal and, combining the principal component sequence with the deflection angle in the coded bitstream, to restore the decoded subband signals to obtain the reconstructed frequency-domain subband signals.
The parametric decoding module based on polar-coordinate principal components is configured to generate, according to the noise energy ratio in the coded bitstream, white noise having the same energy as the original signal and, combining the principal component sequence with the rotation radius in the coded bitstream, to restore the decoded subband signals to obtain the reconstructed frequency-domain subband signals.
Building on two-channel techniques for 3D audio and on the characteristics of human hearing, the present invention devotes more coding energy to the principal component and encodes different audio signals with different coding methods, and thereby proposes a two-channel encoding and decoding method for 3D audio and a corresponding codec. The method reduces encoding and decoding noise, gives the reconstructed signal a higher signal-to-noise ratio (SNR), and simulates 3D audio signals better.
Description of drawings
Fig. 1 is a flowchart of the encoding method of the present invention;
Fig. 2 is a flowchart of the decoding method of the present invention;
Fig. 3 is a flowchart of subband division in the encoding method of the present invention;
Fig. 4 is a flowchart of coding-method selection in the encoding method of the present invention;
Fig. 5 is a schematic diagram of the parametric coding method based on polar-coordinate principal components of the present invention;
Fig. 6 is a flowchart of decoding-method selection in the decoding method of the present invention;
Fig. 7 is a flowchart of parametric decoding in the decoding method of the present invention.
Embodiment
The present invention proposes a two-channel encoding method for 3D audio and a corresponding two-channel decoding method. In practice, those skilled in the art can implement automatic audio encoding and decoding in computer software according to the technical solution provided. Because codec software methods are also often embodied in hardware as encoding and decoding devices, the present invention also provides a corresponding two-channel encoder and decoder for 3D audio.
Specific embodiments of the present invention are described in detail below with reference to the accompanying drawings, to make its technical solution and beneficial effects clearer.
The parametric coding method based on frequency-domain principal components analyzes the spatial audio signal and merges the two channels into a single channel using the minimum mean square error (MMSE) criterion; only this channel is then coded by the core encoder. At decoding, the signal is reconstructed from the deflection angle and the ambient-noise energy ratio of the principal and secondary components (PAR), where the ambient noise is simulated by generating white noise with approximately the original energy. For 3D multichannel signals, however, some subbands produced by subband division are merged from small uniform subbands, many of which differ in their left/right channel energy ratio. Because such subbands are better modeled as sound sources in several different directions, transmitting the downmix channel with only one deflection angle and one PAR, as the frequency-domain principal component scheme does, is not reasonable. To address this problem, the present invention proposes a parametric coding method based on polar coordinates: the principal and secondary components are parametrically coded in polar coordinates, and the signal is reconstructed from the rotation radius and PAR, which simulates the 3D audio signal better and yields a higher signal-to-noise ratio.
The two-channel encoding method for 3D audio of the present invention, whose flowchart is shown in Fig. 1, comprises the following steps:
Step 1.1: apply a time-frequency transform to each channel of the input two-channel signal, converting the time-domain two-channel signal into a frequency-domain two-channel signal.
The two-channel signal consists of a left-channel signal l and a right-channel signal r. This step is implemented as follows: a fast Fourier transform (FFT) converts the time-domain left-channel signal l and right-channel signal r into the frequency-domain left-channel signal L and right-channel signal R, respectively.
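As a minimal sketch of step 1.1, the transform of one frame might look like the following. The frame length, sample rate, and test tone are hypothetical, the function name `to_frequency_domain` is illustrative, and the use of a real-input FFT without windowing or overlap is an assumption; the text only says that an FFT is used.

```python
import numpy as np

def to_frequency_domain(l, r):
    """Transform time-domain channels l and r into spectra L and R."""
    l = np.asarray(l, dtype=float)
    r = np.asarray(r, dtype=float)
    if l.shape != r.shape:
        raise ValueError("left and right frames must have the same length")
    # rfft keeps only the non-negative frequencies of a real-valued frame.
    return np.fft.rfft(l), np.fft.rfft(r)

# Hypothetical 1024-sample frame: a 440 Hz tone, half as loud on the right
fs = 48000
t = np.arange(1024) / fs
l_demo = np.sin(2 * np.pi * 440 * t)
r_demo = 0.5 * l_demo
L_demo, R_demo = to_frequency_domain(l_demo, r_demo)
```

A 1024-sample real frame yields 513 complex bins per channel, and `np.fft.irfft(L_demo, n=1024)` recovers the original frame, which is what the decoder's inverse transform relies on.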
Step 1.2: divide the frequency-domain left-channel signal L and right-channel signal R into subbands to obtain the left and right channel subband signals. Fig. 3 is a flowchart of one specific implementation of this step.
This step is implemented as follows:
A division method based on the equivalent rectangular bandwidth (ERB) divides each of the frequency-domain signals L and R into 64 subbands. Then, according to the characteristics of human hearing and the needs of the encoder, the subbands of L and R are merged, further subdivided, or both, to obtain the final left-channel and right-channel subband signals.
Because the human ear is more sensitive to low-frequency sound and perceives high-frequency sound less well, the 64 subbands of L and R can be processed further: low-frequency subbands can be subdivided, high-frequency subbands can be merged, or both. In this embodiment, the 3 low-frequency subbands among the 64 are subdivided into 16 subbands and the 61 high-frequency subbands are merged into 4 subbands, giving 20 subband signals in total; all subsequent operations are performed on these 20 subband signals. The low-frequency and high-frequency ranges are set as needed in a given implementation.
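The initial 64-band split can be sketched as below. The text does not say how the ERB-based boundaries are computed, so this example assumes bands of equal width on the Glasberg and Moore ERB-rate scale; the helper names are hypothetical. The later regrouping into 20 bands is not shown, because the boundaries between "low" and "high" frequencies are left to the implementer.

```python
import numpy as np

def erb_rate(f_hz):
    """ERB-rate scale (Glasberg and Moore), frequency in Hz."""
    return 21.4 * np.log10(1.0 + 0.00437 * np.asarray(f_hz, dtype=float))

def inverse_erb_rate(e):
    """Frequency in Hz for a given ERB-rate value."""
    return (10.0 ** (np.asarray(e, dtype=float) / 21.4) - 1.0) / 0.00437

def erb_band_edges(n_bins, fs, n_bands=64):
    """Return n_bands + 1 bin indices delimiting bands of equal ERB-rate width."""
    nyquist = fs / 2.0
    edges_hz = inverse_erb_rate(np.linspace(0.0, erb_rate(nyquist), n_bands + 1))
    edges = np.round(edges_hz / nyquist * (n_bins - 1)).astype(int)
    edges[-1] = n_bins  # make the last band include the Nyquist bin
    return edges

def split_subbands(spectrum, edges):
    """Slice a one-sided spectrum into consecutive subbands."""
    return [spectrum[a:b] for a, b in zip(edges[:-1], edges[1:])]
```

With a one-sided spectrum of 4097 bins at 48 kHz every band is non-empty; with much shorter frames the lowest bands can collapse to zero width, so a real implementation would need a minimum-width rule.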
Step 1.3: encode the left-channel and right-channel subband signals obtained in step 1.2 with the parametric coding method based on frequency-domain principal components (PCA) and with the parametric coding method based on polar-coordinate principal components (PC-PCA), obtaining the coding noise energy of each method.
This step is implemented as follows:
1) Encode the left-channel and right-channel subband signals with the parametric coding method based on frequency-domain principal components, to find the coding noise energy that this method produces.
Suppose step 1.2 yields N left-channel subband signals and N right-channel subband signals, with the k-th ones denoted $L_k$ and $R_k$, $k = 1, 2, \ldots, N$, and suppose each of them contains n frequency bins. Then $L_k$ and $R_k$ can be regarded as sequences of the signals at n frequency bins, $L_k = \{L_k(j) \mid j = 1, 2, \ldots, n\}$ and $R_k = \{R_k(j) \mid j = 1, 2, \ldots, n\}$, where $L_k(j)$ and $R_k(j)$ are the signals of the j-th frequency bin in $L_k$ and $R_k$. This step obtains, for each pair $L_k$, $R_k$ in turn, the coding noise energy produced by the parametric coding method based on frequency-domain principal components.
Taking subband signals $L_k$ and $R_k$ as an example, the coding noise energy produced by the parametric coding method based on frequency-domain principal components is obtained as follows:
a) Compute the covariance matrix $\mathbf{R}_k$ formed by the sequences $L_k$ and $R_k$:
$$\mathbf{R}_k = \begin{bmatrix} r_{ll} & r_{lr} \\ r_{rl} & r_{rr} \end{bmatrix} \tag{1}$$
where $r_{ll} = \operatorname{cov}[L_k, L_k]$, $r_{lr} = r_{rl} = \operatorname{cov}[L_k, R_k]$, and $r_{rr} = \operatorname{cov}[R_k, R_k]$;
b) Find the eigenvalues $\lambda_1$ and $\lambda_2$ of the covariance matrix $\mathbf{R}_k$:
$$\lambda_1 = \frac{1}{2}\left[r_{ll} + r_{rr} + \sqrt{(r_{ll} - r_{rr})^2 + (2r_{lr})^2}\right] \tag{2}$$
$$\lambda_2 = \frac{1}{2}\left[r_{ll} + r_{rr} - \sqrt{(r_{ll} - r_{rr})^2 + (2r_{lr})^2}\right] \tag{3}$$
c) From the eigenvalues $\lambda_1$ and $\lambda_2$, obtain the principal component energy $E_p$ and the secondary component energy $E_s$ of the parametric coding method based on frequency-domain principal components (PCA):
$$E_p = \max(\lambda_1, \lambda_2) \tag{4}$$
$$E_s = \min(\lambda_1, \lambda_2) \tag{5}$$
The coding noise energy produced by the parametric coding method based on frequency-domain principal components is then $\varepsilon_1 = E_s = \min(\lambda_1, \lambda_2)$.
2) Encode the left-channel and right-channel subband signals with the parametric coding method based on polar-coordinate principal components, to find the coding noise energy that this method produces.
The parametric coding scheme based on polar-coordinate principal components is built on the scheme based on frequency-domain principal components. The two share the same coding principle but use different coordinate systems: the frequency-domain scheme uses a rectangular (Cartesian) coordinate system, whereas the polar-coordinate scheme uses a polar coordinate system.
Suppose step 1.2 yields N left-channel subband signals and N right-channel subband signals, with the k-th ones denoted $L_k$ and $R_k$, $k = 1, 2, \ldots, N$, and suppose each of them contains n frequency bins. Then $L_k$ and $R_k$ can be regarded as sequences of the signals at n frequency bins, $L_k = \{L_k(j) \mid j = 1, 2, \ldots, n\}$ and $R_k = \{R_k(j) \mid j = 1, 2, \ldots, n\}$, where $L_k(j)$ and $R_k(j)$ are the signals of the j-th frequency bin in $L_k$ and $R_k$. This step obtains, for each pair $L_k$, $R_k$ in turn, the coding noise energy produced by the parametric coding method based on polar-coordinate principal components.
Taking subband signals $L_k$ and $R_k$ as an example, this step is described further below:
a) To perform principal component parametric coding in a polar coordinate system, the signals $L_k(j)$ and $R_k(j)$ of each frequency bin in subband signals $L_k$ and $R_k$ are mapped one by one into the polar coordinate system, forming two new random variables, the amplitude $\rho_k(j)$ and the deflection angle of the bin, as shown in Fig. 5. Here $j = 1, 2, \ldots, n$; $L_k(j)$ and $R_k(j)$ denote the signals of the j-th frequency bin in $L_k$ and $R_k$; and $\rho_k(j)$ denotes the amplitude of the signal of the j-th frequency bin:
$$\rho_k(j) = \sqrt{L_k^2(j) + R_k^2(j)}$$
[equation (6), the definition of the deflection angle of the j-th frequency bin, is an image that was not extracted]
The signal amplitudes of all frequency bins in $L_k$ and $R_k$ form the sequence $\rho_k$, and the corresponding deflection angles form a second sequence:
$$\rho_k = \{\rho_k(j) \mid j = 1, 2, \ldots, n\} \tag{7}$$
[equation (8), the sequence of per-bin deflection angles, is an image that was not extracted]
b) Compute the covariance matrix $\mathbf{R}_k$ formed by the sequence $\rho_k$ and the deflection-angle sequence:
[equation (9), the covariance matrix and the definitions of its entries, is an image that was not extracted]
c) Find the eigenvalues $\lambda_1$ and $\lambda_2$ of the covariance matrix $\mathbf{R}_k$ of equation (9) and, from them, obtain the principal component energy and the secondary component energy $E_\rho$ of the parametric coding method based on polar-coordinate principal components (PC-PCA):
[equation (10), the principal component energy, is an image that was not extracted]
$$E_\rho = \lambda_1 = \sum_{j=1}^{n}\left[\rho_k(j) - \frac{\sum_{j=1}^{n}\rho_k(j)}{n}\right]^2 \tag{11}$$
The coding noise energy of the parametric coding scheme based on polar-coordinate principal components is then $\varepsilon_2 = E_\rho$.
The two parametric coding methods above, based on frequency-domain principal components and on polar-coordinate principal components, are applied in turn to each of the N subband signal pairs $L_k$ and $R_k$, finally yielding N pairs of coding noise energies.
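The two noise energies of step 1.3 can be computed for one subband as sketched below. Several points are assumptions rather than statements of the text: the coefficients are treated as real-valued sequences, the covariance in equation (1) is normalized by n (`bias=True`), and the function names are hypothetical. Note that equation (11) is an unnormalized sum, so the two quantities are compared here exactly as the equations are written.

```python
import numpy as np

def pca_noise_energy(Lk, Rk):
    """epsilon_1: smaller eigenvalue of the 2x2 covariance matrix, eqs. (1)-(5)."""
    c = np.cov(np.asarray(Lk, dtype=float), np.asarray(Rk, dtype=float), bias=True)
    r_ll, r_lr, r_rr = c[0, 0], c[0, 1], c[1, 1]
    root = np.sqrt((r_ll - r_rr) ** 2 + (2.0 * r_lr) ** 2)
    lam1 = 0.5 * (r_ll + r_rr + root)  # eq. (2)
    lam2 = 0.5 * (r_ll + r_rr - root)  # eq. (3)
    return min(lam1, lam2)             # eq. (5): E_s

def polar_noise_energy(Lk, Rk):
    """epsilon_2: summed squared deviation of the per-bin amplitudes, eq. (11)."""
    rho = np.hypot(Lk, Rk)             # amplitude sqrt(L^2 + R^2) per bin
    return float(np.sum((rho - rho.mean()) ** 2))

# Hypothetical subband: the right channel is a scaled copy of the left
Lk_demo = np.array([1.0, 2.0, 3.0, 4.0])
Rk_demo = 2.0 * Lk_demo
eps1_demo = pca_noise_energy(Lk_demo, Rk_demo)
eps2_demo = polar_noise_energy(Lk_demo, Rk_demo)
```

For this perfectly correlated pair the secondary eigenvalue is zero up to rounding, so the frequency-domain method would be preferred; for bins that lie on a circle in the (L, R) plane the polar noise energy vanishes instead.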
Step 1.4: select the better parametric coding method according to the coding noise energies produced by the two methods, and use the selected method to further encode the left and right channel subband signals ($L_k$ and $R_k$).
Selection of the optimal parametric coding method in this step is implemented as follows:
Select the parametric coding method with the smaller coding noise energy, output the mode value mode corresponding to that method, and then use the selected method to further encode the left and right channel subband signals obtained in step 1.2.
Let $\varepsilon_1$ and $\varepsilon_2$ denote the coding noise energies produced by encoding subband signals $L_k$ and $R_k$ with the method based on frequency-domain principal components and the method based on polar-coordinate principal components, respectively. The implementation of this step is again described using $L_k$ and $R_k$ as an example:
1) If $\varepsilon_1 \le \varepsilon_2$, output mode = 0 and further encode subband signals $L_k$ and $R_k$ with the parametric coding method based on frequency-domain principal components:
From the covariance matrix $\mathbf{R}_k$ of equation (1), obtain the deflection angle $\theta_k$ of subband signals $L_k$ and $R_k$:
[equation image: expression for $\theta_k$; not extracted]
Further encoding subband signals $L_k$ and $R_k$ with the parametric coding method based on frequency-domain principal components yields the coded principal component sequence $PC_k$ and secondary component sequence $A_k$, with $PC_k = \{PC_k(j) \mid j = 1, 2, \ldots, n\}$ and $A_k = \{A_k(j) \mid j = 1, 2, \ldots, n\}$, where $PC_k(j)$ and $A_k(j)$ are the principal and secondary components of the j-th frequency bin in $L_k$ and $R_k$:
$$\begin{bmatrix} \cos\theta_k & \sin\theta_k \\ -\sin\theta_k & \cos\theta_k \end{bmatrix} \begin{bmatrix} L_k(j) \\ R_k(j) \end{bmatrix} = \begin{bmatrix} PC_k(j) \\ A_k(j) \end{bmatrix} \tag{12}$$
$L_k(j)$ and $R_k(j)$ are the signals of the j-th frequency bin in subband signals $L_k$ and $R_k$, $\theta_k$ is the deflection angle of subband signals $L_k$ and $R_k$, $k = 1, 2, \ldots, N$, and $j = 1, 2, \ldots, n$.
All subbands are encoded one by one in this way, and for each subband the principal component sequence $PC_k$, the deflection angle $\theta_k$, and the noise energy ratio PAR (the ratio of $E_p$ to $E_s$) are output.
2) If $\varepsilon_1 > \varepsilon_2$, output mode = 1 and further encode subband signals $L_k$ and $R_k$ with the parametric coding method based on polar-coordinate principal components:
Further encoding subband signals $L_k$ and $R_k$ with the parametric coding method based on polar-coordinate principal components yields the coded principal component sequence $PC_k$ and secondary component sequence $A_k$, with $PC_k = \{PC_k(j) \mid j = 1, 2, \ldots, n\}$ and $A_k = \{A_k(j) \mid j = 1, 2, \ldots, n\}$, where $PC_k(j)$ and $A_k(j)$ are the principal and secondary components of the j-th frequency bin in $L_k$ and $R_k$.
Here $L_k(j)$ and $R_k(j)$ are the signals of the j-th frequency bin in subband signals $L_k$ and $R_k$, and the deflection angle of the j-th frequency bin takes the value given by equation (6), with $k = 1, 2, \ldots, N$ and $j = 1, 2, \ldots, n$.
Solve for the rotation radius $\bar{\rho}_k$ of the subband signals $L_k$ and $R_k$. The rotation radius $\bar{\rho}_k$ is the mean of the signal amplitudes over all frequency bins of the subband signals $L_k$ and $R_k$, that is:
$$\bar{\rho}_k = \frac{\sum_{j=1}^{n} \sqrt{L_k^2(j) + R_k^2(j)}}{n} \tag{14}$$
All subbands are encoded one by one with the above method, and for each subband the principal component sequence $PC_k$, the rotation radius $\bar{\rho}_k$, and PAR (the ratio of $E_\rho$ to [Figure BDA00001733175900117]) are output.
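The polar-coordinate parameters can be illustrated with a short Python sketch. It assumes NumPy, takes the per-bin amplitude to be the Euclidean norm of the channel pair (an assumption that matches the description of the rotation radius as a mean amplitude), and evaluates PAR with the expression stated in the claims. The function name `polar_parameters` is hypothetical.

```python
import numpy as np

def polar_parameters(L, R):
    """Return (rho, rho_bar, par) for one subband.

    rho     : per-bin amplitude, assumed to be sqrt(L^2 + R^2)
    rho_bar : rotation radius, the mean amplitude as in formula (14)
    par     : (pi^2 / 48) * sum_j (rho(j) - mean(rho))^2, as in the claims
    """
    L = np.asarray(L, dtype=float)
    R = np.asarray(R, dtype=float)
    rho = np.sqrt(L**2 + R**2)
    rho_bar = rho.mean()
    par = (np.pi**2 / 48.0) * np.sum((rho - rho_bar) ** 2)
    return rho, rho_bar, par

# Illustrative two-bin subband.
rho, rho_bar, par = polar_parameters([3.0, 6.0], [4.0, 8.0])
```

For this input the amplitudes are 5 and 10, so the rotation radius is 7.5 and each bin deviates from it by 2.5.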
Steps 1.3 and 1.4 both operate on individual subband signals. For each subband signal, a coding noise energy $\varepsilon_1$ is computed for the parametric coding method based on the frequency-domain principal component and a coding noise energy $\varepsilon_2$ for the parametric coding method based on the polar-coordinate principal component. The two values are compared for every subband signal, and the parametric coding method with the smaller coding noise energy is selected to further encode that subband. The process of steps 1.3 and 1.4 is shown in Fig. 3.
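The per-subband selection rule can be written down directly. The following Python sketch uses hypothetical function names and made-up noise energies; only the comparison itself (ties resolved in favor of mode 0) comes from the text.

```python
def select_mode(eps1, eps2):
    """Return 0 (frequency-domain principal component) when eps1 <= eps2,
    otherwise 1 (polar-coordinate principal component)."""
    return 0 if eps1 <= eps2 else 1

def select_modes(noise_pairs):
    """Apply the rule independently to each subband's (eps1, eps2) pair."""
    return [select_mode(e1, e2) for e1, e2 in noise_pairs]

# Three subbands with illustrative noise energies.
modes = select_modes([(0.2, 0.5), (0.7, 0.1), (0.3, 0.3)])
```

Because the decision is made per subband, one frame can mix both coding methods, which is why the mode value has to travel with each subband.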
Step 1.5: downmix all principal component sequences $PC_k$ ($k = 1, 2, \ldots, N$) produced in step 1.4 to obtain the downmixed signal m.
Step 1.6: feed the downmixed signal m obtained in step 1.5 into the core encoder to obtain the encoded bitstream. If the parametric coding method based on the polar-coordinate principal component was used, write the rotation radius $\bar{\rho}_k$, PAR, and the mode value into the encoded bitstream; if the parametric coding method based on the frequency-domain principal component was used, write the deflection angle $\theta_k$, PAR, and the mode value into the encoded bitstream.
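One way to picture the side information of step 1.6 is a small record per subband. The sketch below is an assumption about how such a record could be organized in Python; the class and field names are hypothetical, and the actual bitstream syntax is not specified in this text.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SubbandSideInfo:
    mode: int                        # 0: frequency-domain PC, 1: polar PC
    par: float                       # noise energy ratio PAR
    theta: Optional[float] = None    # deflection angle, only for mode 0
    rho_bar: Optional[float] = None  # rotation radius, only for mode 1

def make_side_info(mode, par, theta=None, rho_bar=None):
    """Keep only the parameter that belongs to the selected mode."""
    if mode == 0:
        if theta is None:
            raise ValueError("mode 0 requires the deflection angle")
        return SubbandSideInfo(mode=0, par=par, theta=theta)
    if mode == 1:
        if rho_bar is None:
            raise ValueError("mode 1 requires the rotation radius")
        return SubbandSideInfo(mode=1, par=par, rho_bar=rho_bar)
    raise ValueError("mode must be 0 or 1")

info_fd = make_side_info(0, par=2.0, theta=0.5)
info_polar = make_side_info(1, par=0.8, rho_bar=7.5)
```

Storing exactly one of the two parameters mirrors the text: the deflection angle accompanies mode 0 and the rotation radius accompanies mode 1.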
The present invention also provides a two-channel decoding method for 3D audio. Referring to the flowchart in Fig. 2, it comprises the following steps:
Step 2.1: decode the encoded bitstream produced at the encoder side to obtain the decoded signal m.
In a specific implementation, the encoded bitstream is fed into the core decoder, which decodes it to obtain the decoded signal m.
Step 2.2: perform subband division on the decoded signal m obtained in step 2.1 to obtain the decoded subband signals.
In a specific implementation, the decoded signal m output by the core decoder is divided into the subband sequence P(N), where N is the number of subbands and equals the value of N used in the encoding method.
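As a rough illustration of the subband division, the following Python sketch splits an array of spectral coefficients into N contiguous subbands. The band edges are not given in this text, so near-uniform splitting with NumPy's `array_split` is purely an assumption, and the function names are hypothetical.

```python
import numpy as np

def split_subbands(spectrum, num_subbands):
    """Divide the coefficients into num_subbands contiguous groups.

    Near-uniform band edges are an assumption made for illustration only.
    """
    return np.array_split(np.asarray(spectrum), num_subbands)

def merge_subbands(subbands):
    """Concatenate subbands back into one full-band array."""
    return np.concatenate(subbands)

m = np.arange(10, dtype=float)   # stand-in for the decoded signal m
P = split_subbands(m, 3)         # subband sequence P(N) with N = 3
restored = merge_subbands(P)
```

Whatever band layout is used, the decoder must apply the same N and the same edges as the encoder so that each subband meets its own side information.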
Step 2.3: select the corresponding decoding mode according to the mode value in the encoded bitstream, and decode using the deflection angle or rotation radius together with the noise energy ratio in the encoded bitstream, obtaining the reconstructed frequency-domain subband signals, as shown in Fig. 6 and Fig. 7.
This step is implemented as follows:
1) If mode = 0, select the parametric decoding method based on the frequency-domain principal component:
According to the noise energy ratio PAR in the encoded bitstream, generate white noise having the same energy as the original signal. Then, using the principal component sequence and the deflection angle in the encoded bitstream, restore the subband sequence P(N) obtained in step 2.2 with the parametric decoding method based on the frequency-domain principal component. This yields the decoded subband signals, i.e. the reconstructed frequency-domain subband signals $\hat{R}_1, \hat{R}_2, \ldots, \hat{R}_N$ and $\hat{L}_1, \hat{L}_2, \ldots, \hat{L}_N$.
2) If mode = 1, select the parametric decoding method based on the polar-coordinate principal component:
According to the noise energy ratio PAR in the encoded bitstream, generate white noise having the same energy as the original signal. Then, using the principal component sequence and the rotation radius in the encoded bitstream, restore the subband sequence P(N) obtained in step 2.2 with the parametric decoding method based on the polar-coordinate principal component. This yields the decoded subband signals, i.e. the reconstructed frequency-domain subband signals $\hat{R}_1, \hat{R}_2, \ldots, \hat{R}_N$ and $\hat{L}_1, \hat{L}_2, \ldots, \hat{L}_N$.
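A minimal Python sketch of the mode 0 reconstruction is given below. It rests on several assumptions that go beyond this text: PAR is read as the ratio of principal-component energy to secondary-component energy, the discarded secondary component is replaced by Gaussian white noise scaled to that energy, and the rotation of formula (12) is undone with its transpose. The function name `decode_frequency_domain` is hypothetical.

```python
import numpy as np

def decode_frequency_domain(pc, theta, par, rng=None):
    """Rebuild (L_hat, R_hat) for one subband from PC_k, theta_k and PAR."""
    rng = np.random.default_rng() if rng is None else rng
    pc = np.asarray(pc, dtype=float)
    e_p = np.sum(pc**2)              # energy of the principal component
    e_s = e_p / par                  # assumed: PAR = E_p / E_s
    noise = rng.standard_normal(pc.shape)
    noise *= np.sqrt(e_s / np.sum(noise**2))   # scale noise energy to E_s
    c, s = np.cos(theta), np.sin(theta)
    # Inverse (transpose) of the rotation matrix in formula (12).
    L_hat = c * pc - s * noise
    R_hat = s * pc + c * noise
    return L_hat, R_hat

pc_k = np.array([1.0, 2.0, 3.0, 4.0])
L_hat, R_hat = decode_frequency_domain(
    pc_k, theta=0.6, par=10.0, rng=np.random.default_rng(0))
```

Rotating the reconstructed pair forward again returns the transmitted principal component exactly; only the secondary component is synthetic.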
Step 2.4: merge the reconstructed frequency-domain subband signals obtained in step 2.3 to obtain the reconstructed frequency-domain signals [Figure BDA00001733175900125] and [Figure BDA00001733175900126].
Step 2.5: apply the inverse time-frequency transform to the reconstructed frequency-domain channel signals obtained in step 2.4 to recover the reconstructed time-domain signals [Figure BDA00001733175900131] and [symbol not reproduced]. In a specific implementation, an existing technique such as a transform based on the FFT (fast Fourier transform) can be used; it is not described further here.
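As a small illustration of the time-frequency transform pair, the following Python sketch uses NumPy's real FFT and its inverse on a single frame. Framing, windowing, and overlap-add, which a practical codec would need, are left out, and the helper names are hypothetical.

```python
import numpy as np

def to_frequency(x):
    """Forward transform of one real-valued frame (real FFT)."""
    return np.fft.rfft(x)

def to_time(X, n):
    """Inverse transform back to n real time-domain samples."""
    return np.fft.irfft(X, n=n)

x = np.array([0.0, 1.0, 0.0, -1.0, 0.5, 0.25, -0.5, 0.0])
x_rec = to_time(to_frequency(x), n=len(x))
```

Passing the frame length `n` explicitly keeps the round trip exact for odd-length frames as well.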
The specific embodiments described herein merely illustrate the spirit of the invention. Those skilled in the art to which the invention pertains may make various modifications or additions to the described embodiments, or substitute them in similar ways, without departing from the spirit of the invention or exceeding the scope defined by the appended claims.

Claims (10)

1. A two-channel encoding method for 3D audio, characterized by comprising the steps of:
S1.1, applying a time-frequency transform to each channel of the input two-channel signal, converting the time-domain two-channel signal into a frequency-domain two-channel signal;
S1.2, performing subband division on each channel of the frequency-domain two-channel signal to obtain two-channel subband signals;
S1.3, encoding the two-channel subband signals one by one with the parametric coding method based on the frequency-domain principal component and with the parametric coding method based on the polar-coordinate principal component, respectively, to obtain the coding noise energy produced for each two-channel subband signal under the two coding methods;
the coding noise energy obtained by encoding the two-channel subband signals with the parametric coding method based on the polar-coordinate principal component is [formula not reproduced in this text], where $\varepsilon_{2,k}$ is the coding noise energy of the $k$-th two-channel subband signal, $\rho_k(j)$ is the signal amplitude of the $j$-th frequency bin in the $k$-th two-channel subband signal,
[Figure FDA00003108165100012]
$L_k(j)$ and $R_k(j)$ are respectively the signals of the $j$-th frequency bin in the $k$-th left-channel subband signal and right-channel subband signal, and $n$ is the number of frequency bins in the $k$-th two-channel subband signal;
S1.4, for each two-channel subband signal, selecting the parametric coding method with the smaller coding noise energy to further encode that two-channel subband signal, and, if the noise energies are equal, selecting the parametric coding method based on the frequency-domain principal component to further encode that two-channel subband signal; if the parametric coding method based on the frequency-domain principal component is used for the further encoding, outputting the coded principal component sequence, the deflection angle, and the noise energy ratio of the two-channel subband signal; if the parametric coding method based on the polar-coordinate principal component is used for the further encoding, outputting the coded principal component sequence, the rotation radius, and the noise energy ratio of the two-channel subband signal;
the coded principal component sequence obtained with the parametric coding method based on the polar-coordinate principal component is:
$PC_k = \{PC_k(j) \mid j = 1, 2, \ldots, n\}$
where $PC_k$ is the principal component sequence of the $k$-th two-channel subband signal, $PC_k(j)$ is the principal component of the $j$-th frequency bin in the $k$-th two-channel subband signal,
[Figure FDA00003108165100014]
[Figure FDA00003108165100015]
denotes the deflection angle of the $j$-th frequency bin in the $k$-th two-channel subband signal,
[Figure FDA00003108165100013]
$L_k(j)$ and $R_k(j)$ are respectively the signals of the $j$-th frequency bin in the $k$-th left-channel subband signal and right-channel subband signal, and $n$ is the number of frequency bins in the subband numbered $k$;
the rotation radius obtained with the parametric coding method based on the polar-coordinate principal component is:
$$\bar{\rho}_k = \frac{\sum_{j=1}^{n} \sqrt{L_k^2(j) + R_k^2(j)}}{n}$$
where $\bar{\rho}_k$ is the rotation radius of the $k$-th two-channel subband signal, $L_k(j)$ and $R_k(j)$ are respectively the signals of the $j$-th frequency bin in the $k$-th left-channel subband signal and right-channel subband signal, and $n$ is the number of frequency bins in the $k$-th two-channel subband signal;
the noise energy ratio obtained with the parametric coding method based on the polar-coordinate principal component is:
$$\mathrm{PAR} = \frac{\pi^2}{48} \sum_{j=1}^{n} \left[ \rho_k(j) - \frac{1}{n} \sum_{j=1}^{n} \rho_k(j) \right]^2$$
where $\rho_k(j)$ is the signal amplitude of the $j$-th frequency bin in the $k$-th two-channel subband signal,
[Figure FDA00003108165100023]
$L_k(j)$ and $R_k(j)$ are respectively the signals of the $j$-th frequency bin in the $k$-th left-channel subband signal and right-channel subband signal, and $n$ is the number of frequency bins in the $k$-th two-channel subband signal;
S1.5, downmixing the coded principal component sequences to obtain a downmixed signal;
S1.6, encoding the downmixed signal with a core encoder to obtain an encoded bitstream, and writing the deflection angle or the rotation radius, together with the noise energy ratio, into the encoded bitstream.
2. A two-channel encoder for 3D audio, characterized by comprising:
a time-frequency transform module, configured to apply a time-frequency transform to each channel of the input two-channel signal, converting the time-domain two-channel signal into a frequency-domain two-channel signal;
a subband division module, configured to perform subband division on each channel of the frequency-domain two-channel signal to obtain two-channel subband signals;
a coding noise energy computation module, configured to encode the two-channel subband signals one by one with the parametric coding method based on the frequency-domain principal component and with the parametric coding method based on the polar-coordinate principal component, respectively, to obtain the coding noise energy produced for each two-channel subband signal under the two coding methods; the coding noise energy obtained by encoding the two-channel subband signals with the parametric coding method based on the polar-coordinate principal component is
[Figure FDA00003108165100024]
where $\varepsilon_{2,k}$ is the coding noise energy of the $k$-th two-channel subband signal, $\rho_k(j)$ is the signal amplitude of the $j$-th frequency bin in the $k$-th two-channel subband signal,
[Figure FDA00003108165100031]
$L_k(j)$ and $R_k(j)$ are respectively the signals of the $j$-th frequency bin in the $k$-th left-channel subband signal and right-channel subband signal, and $n$ is the number of frequency bins in the $k$-th two-channel subband signal;
a parametric coding module, configured to select, for each two-channel subband signal, the parametric coding method with the smaller coding noise energy to further encode that two-channel subband signal, and, if the noise energies are equal, to select the parametric coding method based on the frequency-domain principal component to further encode that two-channel subband signal; if the parametric coding method based on the frequency-domain principal component is used for the further encoding, the coded principal component sequence, the deflection angle, and the noise energy ratio of the two-channel subband signal are output; if the parametric coding method based on the polar-coordinate principal component is used for the further encoding, the coded principal component sequence, the rotation radius, and the noise energy ratio of the two-channel subband signal are output;
the coded principal component sequence obtained with the parametric coding method based on the polar-coordinate principal component is:
$PC_k = \{PC_k(j) \mid j = 1, 2, \ldots, n\}$
where $PC_k$ is the principal component sequence of the $k$-th two-channel subband signal, $PC_k(j)$ is the principal component of the $j$-th frequency bin in the $k$-th two-channel subband signal,
[Figure FDA00003108165100032]
[Figure FDA00003108165100033]
denotes the deflection angle of the $j$-th frequency bin in the $k$-th two-channel subband signal,
[Figure FDA00003108165100034]
$L_k(j)$ and $R_k(j)$ are respectively the signals of the $j$-th frequency bin in the $k$-th left-channel subband signal and right-channel subband signal, and $n$ is the number of frequency bins in the subband numbered $k$;
the rotation radius obtained with the parametric coding method based on the polar-coordinate principal component is:
$$\bar{\rho}_k = \frac{\sum_{j=1}^{n} \sqrt{L_k^2(j) + R_k^2(j)}}{n}$$
where $\bar{\rho}_k$ is the rotation radius of the $k$-th two-channel subband signal, $L_k(j)$ and $R_k(j)$ are respectively the signals of the $j$-th frequency bin in the $k$-th left-channel subband signal and right-channel subband signal, and $n$ is the number of frequency bins in the $k$-th two-channel subband signal;
the noise energy ratio obtained with the parametric coding method based on the polar-coordinate principal component is:
$$\mathrm{PAR} = \frac{\pi^2}{48} \sum_{j=1}^{n} \left[ \rho_k(j) - \frac{1}{n} \sum_{j=1}^{n} \rho_k(j) \right]^2$$
where $\rho_k(j)$ is the signal amplitude of the $j$-th frequency bin in the $k$-th two-channel subband signal,
[Figure FDA00003108165100038]
$L_k(j)$ and $R_k(j)$ are respectively the signals of the $j$-th frequency bin in the $k$-th left-channel subband signal and right-channel subband signal, and $n$ is the number of frequency bins in the $k$-th two-channel subband signal;
a downmix module, configured to downmix the coded principal component sequences to obtain a downmixed signal;
a core encoder, configured to encode the downmixed signal to obtain an encoded bitstream, and to write the deflection angle or the rotation radius, together with the noise energy ratio, into the encoded bitstream.
3. A two-channel decoding method for 3D audio, characterized by comprising the steps of:
S2.1, decoding, with a core decoder, the encoded bitstream obtained with the encoding method according to claim 1, to obtain a decoded signal;
S2.2, performing subband division on the decoded signal to obtain decoded subband signals;
S2.3, decoding the decoded subband signals with the parametric decoding method corresponding to the parametric coding method used in encoding, in combination with the deflection angle or rotation radius and the noise energy ratio in the encoded bitstream, to obtain reconstructed frequency-domain subband signals;
S2.4, merging the reconstructed frequency-domain subband signals to obtain a reconstructed frequency-domain signal;
S2.5, applying an inverse time-frequency transform to the frequency-domain signal, converting it into a time-domain signal and recovering the reconstructed audio signal.
4. The two-channel decoding method for 3D audio according to claim 3, characterized in that:
the parametric decoding method in step S2.3 is the parametric decoding method based on the frequency-domain principal component or the parametric decoding method based on the polar-coordinate principal component.
5. The two-channel decoding method for 3D audio according to claim 4, characterized in that:
decoding the decoded subband signals with the parametric decoding method based on the frequency-domain principal component to obtain the reconstructed frequency-domain subband signals specifically comprises:
generating, according to the noise energy ratio in the encoded bitstream, white noise having the same energy as the original signal, and restoring the decoded subband signals using the principal component sequence and the deflection angle in the encoded bitstream, to obtain the reconstructed frequency-domain subband signals.
6. The two-channel decoding method for 3D audio according to claim 4, characterized in that:
decoding the decoded subband signals with the parametric decoding method based on the polar-coordinate principal component to obtain the reconstructed frequency-domain subband signals specifically comprises:
generating, according to the noise energy ratio in the encoded bitstream, white noise having the same energy as the original signal, and restoring the decoded subband signals using the principal component sequence and the rotation radius in the encoded bitstream, to obtain the reconstructed frequency-domain subband signals.
7. A two-channel decoder for 3D audio, characterized by comprising:
a core decoder, configured to decode the encoded bitstream obtained with the encoding method according to claim 1, to obtain a decoded signal;
a subband division module, configured to perform subband division on the decoded signal to obtain decoded subband signals;
a parametric decoding module, configured to decode the decoded subband signals with the parametric decoding method corresponding to the parametric coding method used in encoding, in combination with the deflection angle or rotation radius and the noise energy ratio in the encoded bitstream, to obtain reconstructed frequency-domain subband signals;
a subband merging module, configured to merge the reconstructed frequency-domain subband signals to obtain a reconstructed frequency-domain signal;
an inverse time-frequency transform module, configured to apply an inverse time-frequency transform to the frequency-domain signal, converting it into a time-domain signal and recovering the reconstructed audio signal.
8. The two-channel decoder for 3D audio according to claim 7, characterized in that:
the parametric decoding module further comprises a parametric decoding module based on the frequency-domain principal component and a parametric decoding module based on the polar-coordinate principal component.
9. The two-channel decoder for 3D audio according to claim 8, characterized in that:
the parametric decoding module based on the frequency-domain principal component is configured to generate, according to the noise energy ratio in the encoded bitstream, white noise having the same energy as the original signal, and to restore the decoded subband signals using the principal component sequence and the deflection angle in the encoded bitstream, to obtain the reconstructed frequency-domain subband signals.
10. The two-channel decoder for 3D audio according to claim 8, characterized in that:
the parametric decoding module based on the polar-coordinate principal component is configured to generate, according to the noise energy ratio in the encoded bitstream, white noise having the same energy as the original signal, and to restore the decoded subband signals using the principal component sequence and the rotation radius in the encoded bitstream, to obtain the reconstructed frequency-domain subband signals.
CN2012101839630A 2012-06-06 2012-06-06 Double-channel encoding and decoding method for 3D audio frequency and codec Expired - Fee Related CN102682779B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2012101839630A CN102682779B (en) 2012-06-06 2012-06-06 Double-channel encoding and decoding method for 3D audio frequency and codec

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2012101839630A CN102682779B (en) 2012-06-06 2012-06-06 Double-channel encoding and decoding method for 3D audio frequency and codec

Publications (2)

Publication Number Publication Date
CN102682779A CN102682779A (en) 2012-09-19
CN102682779B true CN102682779B (en) 2013-07-24

Family

ID=46814589

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2012101839630A Expired - Fee Related CN102682779B (en) 2012-06-06 2012-06-06 Double-channel encoding and decoding method for 3D audio frequency and codec

Country Status (1)

Country Link
CN (1) CN102682779B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103400582B (en) * 2013-08-13 2015-09-16 武汉大学 Towards decoding method and the system of multisound path three dimensional audio frequency
CN105336333B (en) * 2014-08-12 2019-07-05 北京天籁传音数字技术有限公司 Multi-channel sound signal coding method, coding/decoding method and device
CN104240712B (en) * 2014-09-30 2018-02-02 武汉大学深圳研究院 A kind of three-dimensional audio multichannel grouping and clustering coding method and system
CN105632505B (en) * 2014-11-28 2019-12-20 北京天籁传音数字技术有限公司 Encoding and decoding method and device for Principal Component Analysis (PCA) mapping model
WO2016204579A1 (en) * 2015-06-17 2016-12-22 삼성전자 주식회사 Method and device for processing internal channels for low complexity format conversion

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101162904A (en) * 2007-11-06 2008-04-16 武汉大学 Space parameter stereo coding/decoding method and device thereof
CN101401152A (en) * 2006-03-15 2009-04-01 法国电信公司 Device and method for encoding by principal component analysis a multichannel audio signal

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2124486A1 (en) * 2008-05-13 2009-11-25 Clemens Par Angle-dependent operating device or method for generating a pseudo-stereophonic audio signal
US8452587B2 (en) * 2008-05-30 2013-05-28 Panasonic Corporation Encoder, decoder, and the methods therefor

Also Published As

Publication number Publication date
CN102682779A (en) 2012-09-19

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20130724

Termination date: 20190606
