Embodiment
To obtain a larger coding bandwidth and higher coding quality at a lower bit rate on the basis of existing audio coding, an embodiment of the invention provides an audio coding method. The method extracts a temporal envelope parameter, a frequency-domain envelope parameter, a pitch parameter, and a harmonic interval parameter that characterize the audio signal; these parameters are encoded and transmitted to the decoding end.
Further, when the harmonic interval of the audio signal differs from the first harmonic offset, a first harmonic offset parameter of the audio signal is also extracted, encoded, and transmitted to the decoding end.
Fig. 1 is a schematic flowchart of the audio coding method of an embodiment of the invention; the method is described below with reference to Fig. 1. As shown in Fig. 1, the method may comprise:
11: Extract the temporal envelope parameter of the audio signal to be encoded. Specifically, the temporal envelope may be obtained by computing the subframe energies of the audio signal; alternatively, the signal may be transformed to the frequency domain (or a transform domain) and autoregressive (AR) model parameters extracted there to characterize its temporal envelope.
12: Extract the frequency-domain envelope parameter of the audio signal. Specifically, the frequency-domain envelope may be obtained by computing subband energies in the frequency domain (or a transform domain); alternatively, autoregressive model parameters extracted in the time domain may be used to characterize the frequency-domain envelope.
13: Extract the pitch parameter of the audio signal. The pitch parameter characterizes the ratio between the harmonic signal and the noise signal in the audio signal. It can be expressed in several ways, for example as the ratio of the maximum to the minimum of the autocorrelation function.
14: Extract the harmonic interval (PG, Pitch Grid) parameter of the audio signal. The harmonic interval parameter characterizes the spacing between successive harmonics of the signal and may be estimated by peak picking.
15: Extract the first harmonic offset parameter (P0, Pitch Offset). The first harmonic offset parameter, which characterizes the position of the first harmonic of the audio signal, may be estimated from the harmonic interval parameter and then encoded and transmitted. If the first harmonic offset equals the harmonic interval, this step can be omitted; that is, the first harmonic offset parameter is extracted only when the harmonic interval and the first harmonic offset differ.
The temporal envelope parameter, frequency-domain envelope parameter, pitch parameter, harmonic interval parameter, and first harmonic offset parameter are then encoded (optionally after quantization) and output.
It should be noted that the pitch parameter, harmonic interval parameter, and first harmonic offset parameter may be, but need not be, computed in the frequency domain (or a transform domain); they can also be computed in the time domain, for example. Nor is the order in which the parameters are obtained fixed: any order is acceptable as long as the temporal envelope parameter, frequency-domain envelope parameter, pitch parameter, harmonic interval parameter, and first harmonic offset parameter of the audio signal are all obtained.
The above describes the audio coding method of the embodiment. With this method, the audio signal is characterized by a single set of parameters: either the temporal envelope parameter, frequency-domain envelope parameter, pitch parameter, harmonic interval parameter, and first harmonic offset parameter, or the temporal envelope parameter, frequency-domain envelope parameter, pitch parameter, and harmonic interval parameter. Compared with prior-art parametric audio coding based on a particular signal model, this set reduces both the number of parameters needed for coding and the number of bits needed to encode them, which addresses the high bit count of the traditional RIRAC coding method. Because the parameter set can be encoded with fewer bits than existing parametric audio coding algorithms require, the bit rate of the signal is reduced further. For a fixed channel capacity, the lower bit count allows a signal of higher bandwidth to be encoded, so that a larger coding bandwidth and higher coding quality are obtained at a lower bit rate.
An embodiment of the invention also provides an audio decoding method, which may comprise: decoding the received data to obtain the temporal envelope parameter, frequency-domain envelope parameter, pitch parameter, and harmonic interval parameter that characterize the audio signal; and synthesizing the audio signal from these parameters.
Further, the method may also comprise: decoding received data that contains a first harmonic offset parameter, to obtain the first harmonic offset parameter characterizing the audio signal.
The step of synthesizing the audio signal comprises:
obtaining a harmonic signal from the harmonic interval parameter or, when the harmonic interval and the first harmonic offset of the audio signal differ, from the harmonic interval parameter and the first harmonic offset parameter;
adjusting the ratio between the harmonic signal and a noise signal according to the pitch parameter, and obtaining a reconstructed spectrum signal from the adjusted harmonic signal and noise signal;
processing the reconstructed spectrum signal according to the frequency-domain envelope parameter and the temporal envelope parameter to obtain the synthesized audio signal.
Fig. 2 is a schematic flowchart of the audio decoding method provided by an embodiment of the invention; the method is described below with reference to Fig. 2. As shown in Fig. 2, the method may comprise:
21: Decode the received data to obtain the temporal envelope parameter, frequency-domain envelope parameter, pitch parameter, and harmonic interval parameter that characterize the audio signal. When the harmonic interval of the audio signal differs from the first harmonic offset, the decoded parameters also include the first harmonic offset parameter.
22: Obtain the harmonic signal from the harmonic interval parameter. When the harmonic interval and the first harmonic offset of the audio signal differ, the harmonic signal is obtained from both the harmonic interval parameter and the first harmonic offset parameter; otherwise the first harmonic offset is taken to equal the harmonic interval. The harmonic structure can be represented by harmonics with random phase, where the first harmonic offset parameter determines the position of the first harmonic and the harmonic interval parameter determines the spacing between harmonics. This harmonic structure is the harmonic signal.
23: Generate a noise signal; for example, the noise signal can be produced by a random number generator.
24: Adjust the ratio between the harmonic signal and the noise signal according to the value of the pitch parameter, and obtain the reconstructed spectrum signal from the adjusted harmonic signal and noise signal.
25: Perform frequency-domain shaping on the reconstructed spectrum signal according to the frequency-domain envelope parameter to obtain the frequency-shaped signal. For example, the reconstructed spectrum signal may be de-normalized according to the decoded subband energy envelope.
26: Perform time-domain shaping on the frequency-shaped signal according to the temporal envelope parameter to obtain the final synthesized audio signal. For example, the frequency-shaped signal may be transformed to the time domain and then de-normalized according to the decoded subframe energy envelope.
It should be noted that the order of frequency-domain shaping and time-domain shaping is not fixed. The reconstructed spectrum signal may instead be shaped first in the time domain according to the temporal envelope parameter, and the time-shaped signal then shaped in the frequency domain according to the frequency-domain envelope parameter, to obtain the final synthesized audio signal.
The above describes the audio decoding method of the embodiment. With the set of parameters provided by the embodiment to characterize the audio signal (temporal envelope parameter, frequency-domain envelope parameter, pitch parameter, harmonic interval parameter, and first harmonic offset parameter), the audio signal can be synthesized from fewer bits while keeping its quality high; the decoded audio quality is better still when the harmonic structure of the audio signal is pronounced.
For ease of understanding, specific implementations of the coding and decoding of the embodiments of the invention are described in detail below.
Embodiment one
In this embodiment, the encoding end extracts the temporal envelope parameter, frequency-domain envelope parameter, pitch parameter, and harmonic interval parameter of the audio signal. Because the harmonic interval of the audio signal equals the first harmonic offset in this embodiment, the step of extracting the first harmonic offset parameter is omitted. After receiving the parameters, the decoding end decodes them and synthesizes the audio signal from them.
The processing at the encoding end may comprise:
(1): Extract the temporal envelope parameter of the signal. For example, compute the subframe energy envelope {temp_env(0), temp_env(1), ..., temp_env(N-1)} of the audio signal, where N is the number of subframes; with a frame length of 15 ms and a subframe length of 3 ms, N = 5. Quantizing this subframe energy envelope yields the temporal envelope parameter, which can then be encoded. The quantized temporal envelope can also be used to normalize the signal in the time domain.
In practice, the signal may instead be transformed to the frequency domain (or a transform domain) and autoregressive (AR) model parameters extracted there to characterize the temporal envelope.
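The subframe envelope of step (1) can be sketched as follows. This is only an illustrative sketch under stated assumptions: the 16 kHz sampling rate, the use of RMS as the energy measure, and the function names `subframe_energy_envelope` and `normalize_time_domain` are hypothetical, and quantization of the envelope is left out.

```python
import math

def subframe_energy_envelope(frame, num_subframes):
    """Return one RMS value per equal-length subframe (assumed energy measure)."""
    sub_len = len(frame) // num_subframes
    envelope = []
    for i in range(num_subframes):
        sub = frame[i * sub_len:(i + 1) * sub_len]
        envelope.append(math.sqrt(sum(x * x for x in sub) / sub_len))
    return envelope

def normalize_time_domain(frame, envelope, floor=1e-9):
    """Divide every sample by the envelope value of the subframe it belongs to."""
    sub_len = len(frame) // len(envelope)
    last = len(envelope) - 1
    return [x / max(envelope[min(n // sub_len, last)], floor)
            for n, x in enumerate(frame)]

# A 15 ms frame at an assumed 16 kHz sampling rate has 240 samples;
# with 3 ms subframes this gives N = 5 subframes of 48 samples each.
frame = [math.sin(0.3 * n) * (1.0 + n / 240.0) for n in range(240)]
temp_env = subframe_energy_envelope(frame, 5)
normalized = normalize_time_domain(frame, temp_env)
```

In a real encoder the normalization would use the quantized envelope, so that the decoder can undo it exactly from the transmitted values.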
(2): Extract the frequency-domain envelope parameter of the audio signal. For example, when autoregressive model parameters extracted in the time domain are used to characterize the frequency-domain envelope, compute the autoregressive model parameters $\{\alpha_0, \alpha_1, \ldots, \alpha_{M-1}\}$ of the signal in the time domain, where $M$ is the order of the autoregressive model; these parameters can then be quantized, encoded, and transmitted. The signal is also filtered with the quantized autoregressive model parameters to obtain the residual signal err(n).
In a specific application, the frequency-domain envelope parameter may instead be obtained by computing subband energies in the frequency domain (or a transform domain).
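One common way to obtain autoregressive model parameters and the residual of step (2) is the autocorrelation method with the Levinson-Durbin recursion. The text does not say which estimation method is used, so the sketch below is an assumption; the function names are hypothetical and quantization of the coefficients is omitted.

```python
def autocorrelation(x, max_lag):
    """Biased autocorrelation r[0..max_lag] of the sequence x."""
    return [sum(x[n] * x[n + lag] for n in range(len(x) - lag))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Return coefficients a[1..order] of the predictor x[n] ~ sum_i a[i] * x[n-i]."""
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate input: stop early, remaining coefficients stay 0
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient of stage i
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k  # prediction error energy after stage i
    return a[1:]

def residual(x, coeffs):
    """Filter x with the prediction-error filter to obtain err(n)."""
    out = []
    for n in range(len(x)):
        pred = sum(c * x[n - i] for i, c in enumerate(coeffs, start=1) if n - i >= 0)
        out.append(x[n] - pred)
    return out

# A decaying exponential is predicted almost perfectly by a first-order model.
signal = [0.5 ** n for n in range(50)]
alpha = levinson_durbin(autocorrelation(signal, 2), order=2)
err = residual(signal, alpha)
```

For this test signal the first coefficient comes out close to 0.5 and the second close to 0, and the residual carries less energy than the input.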
(3): Extract the pitch parameter of the audio signal. The pitch parameter characterizes the ratio between the harmonic signal and the noise signal in the audio signal. It can be expressed in several ways, for example as the ratio of the maximum to the minimum of the autocorrelation function, $T = \max(\mathrm{ACF}(k_0)) / \min(\mathrm{ACF}(k_0))$, or in any other form that characterizes the proportion between harmonics and noise. The autocorrelation function $\mathrm{ACF}(k_0)$ can be computed with the FFT: for example, the residual signal err(n) from (2) is transformed to obtain the frequency-domain signal $S(k) = \mathrm{FFT}(err(n))$, and the autocorrelation function is then obtained with an inverse FFT, $\mathrm{ACF}(k_0) = \mathrm{IFFT}(|\mathrm{FFT}(S(k))|^2)$. It can also be computed directly [formula missing in the source], where $L$ is the number of frequency-domain transform coefficients within the coded bandwidth. In addition, the average magnitude difference function (AMDF) can be used to correct the autocorrelation function.
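The ratio in step (3) can be illustrated with a direct autocorrelation over spectral magnitudes. The direct formula is not available in this text, so the per-lag normalization, the lag search range, the floor that guards against division by zero, and the function names below are all assumptions made for the sketch.

```python
def spectral_acf(magnitudes, min_lag, max_lag):
    """Direct autocorrelation of a magnitude spectrum for lags min_lag..max_lag.

    Each lag is divided by its number of terms (an assumed normalization).
    """
    L = len(magnitudes)
    return [sum(magnitudes[k] * magnitudes[k + lag] for k in range(L - lag)) / (L - lag)
            for lag in range(min_lag, max_lag + 1)]

def pitch_parameter(acf_values, floor=1e-12):
    """T = max(ACF) / min(ACF); the floor avoids dividing by zero."""
    return max(acf_values) / max(min(acf_values), floor)

# Spectrum with a peak every 8 bins versus a flat, noise-like spectrum.
harmonic = [1.0 if k % 8 == 0 else 0.1 for k in range(64)]
flat = [0.5] * 64
t_harmonic = pitch_parameter(spectral_acf(harmonic, 2, 20))
t_flat = pitch_parameter(spectral_acf(flat, 2, 20))
```

With these inputs the flat spectrum gives a ratio of 1, while the spectrum with regularly spaced peaks gives a clearly larger value, which is the behaviour the parameter is meant to capture.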
(4): Extract the harmonic interval (PG, Pitch Grid) parameter of the audio signal. The harmonic interval parameter characterizes the spacing between successive harmonics of the signal. The integer part of the harmonic interval may be estimated by peak picking, for example $PG = \arg\max_{k_0} \mathrm{ACF}(k_0)$. The fractional part can be obtained by interpolating the autocorrelation function $\mathrm{ACF}(k_0)$ and then peak picking; in particular, the interpolation need only be computed in the neighborhood of the integer harmonic interval already found, and the fractional value is then searched for in the interpolated autocorrelation function. For better performance, the harmonic interval parameter may be corrected before it is encoded and transmitted, to suppress frequency-multiplication and frequency-division errors. For example, the harmonic interval PG found for the current frame is compared with the harmonic interval old_PG of the previous frame; if the ratio between the two is less than a threshold (for example 0.1) and ACF(old_PG) > 0.95·ACF(PG), the previous frame's harmonic interval replaces the one found for the current frame, that is, PG = old_PG.
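The integer and fractional estimation in step (4) can be sketched as peak picking followed by a local refinement. The text only says that the autocorrelation function is interpolated near the integer peak; the three-point parabolic interpolation used here is a swapped-in stand-in for that interpolation, and the function name `harmonic_interval` is hypothetical. The frame-to-frame correction is not included.

```python
def harmonic_interval(acf, min_lag, max_lag):
    """Estimate PG from acf (a list indexed by lag), searching min_lag..max_lag."""
    # Integer part: lag with the largest autocorrelation value.
    pg_int = max(range(min_lag, max_lag + 1), key=lambda lag: acf[lag])
    if pg_int in (min_lag, max_lag):
        return float(pg_int)  # no neighbours on both sides inside the range
    # Fractional part: vertex of the parabola through the peak and its neighbours.
    y0, y1, y2 = acf[pg_int - 1], acf[pg_int], acf[pg_int + 1]
    denom = y0 - 2.0 * y1 + y2
    if denom == 0.0:
        return float(pg_int)
    return pg_int + 0.5 * (y0 - y2) / denom

# Synthetic autocorrelation whose true maximum lies between lags 10 and 11.
acf = [-(lag - 10.3) ** 2 for lag in range(21)]
pg = harmonic_interval(acf, 2, 20)
```

Because the synthetic curve is itself a parabola, the refinement recovers the true peak position at 10.3; on real data it is only an approximation.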
(5): Because the first harmonic offset equals the harmonic interval in this embodiment, this step can be omitted. When the first harmonic offset does not equal the harmonic interval, the first harmonic offset parameter, which characterizes the position of the first harmonic of the audio signal, may be estimated from the harmonic interval parameter and then encoded and transmitted. That is, the first harmonic offset parameter is extracted only when the harmonic interval and the first harmonic offset of the audio signal differ.
The temporal envelope parameter, frequency-domain envelope parameter, pitch parameter, and harmonic interval parameter are then encoded (optionally after quantization) and output. If step (5) is not omitted, the first harmonic offset parameter is also encoded and transmitted.
It should be noted that the pitch parameter, harmonic interval parameter, and first harmonic offset parameter may be, but need not be, computed in the frequency domain (or a transform domain); they can also be computed in the time domain, for example. Nor is the order in which the parameters are obtained fixed: any order is acceptable as long as the temporal envelope parameter, frequency-domain envelope parameter, pitch parameter, harmonic interval parameter, and first harmonic offset parameter of the audio signal are all obtained.
Correspondingly, the decoding end decodes the received data to obtain the temporal envelope parameter, frequency-domain envelope parameter, pitch parameter, and harmonic interval parameter that characterize the audio signal, and then synthesizes the audio signal. If step (5) was not omitted at the encoding end, the parameters decoded at the decoding end also include the first harmonic offset parameter.
The decoding process at the decoding end may comprise:
(6): Decode the received data to obtain the temporal envelope parameter, frequency-domain envelope parameter, pitch parameter, and harmonic interval parameter that characterize the audio signal. If the harmonic interval and the first harmonic offset of the audio signal differed at the encoding end, the first harmonic offset parameter is also obtained.
(7): Obtain the harmonic signal from the harmonic interval parameter. The harmonic structure can be represented by harmonics with random phase, where the position of the first harmonic equals the harmonic interval and the spacing between harmonics is likewise determined by the harmonic interval parameter; this harmonic structure is the harmonic signal. For example, starting from the initial frequency, harmonics with random phase are placed as pulses at the corresponding frequency bins within the signal bandwidth, spaced by the harmonic interval indicated by the harmonic interval parameter (PG), producing the harmonic signal buf_pulses(k) [formula missing in the source], where h(k) denotes a harmonic with random phase.
It should be noted that if the decoding end has also received the first harmonic offset parameter, it can obtain the harmonic signal from both the harmonic interval parameter and the first harmonic offset parameter. The harmonic structure can be represented by harmonics with random phase, where the first harmonic offset parameter determines the position of the first harmonic and the harmonic interval parameter determines the spacing between harmonics; this harmonic structure is the harmonic signal. For example, the first harmonic offset parameter (P0) gives the position of the first pulse; starting from that position, harmonics with random phase are placed as pulses at the corresponding frequency bins within the signal bandwidth, spaced by the harmonic interval indicated by the harmonic interval parameter (PG), producing the harmonic signal buf_pulses(k) [formula missing in the source], where h(k) denotes a harmonic with random phase.
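The pulse placement of step (7) can be sketched as follows. Since the formula for buf_pulses(k) is not available here, this is only one plausible reading: unit-magnitude complex pulses with uniformly random phase, positions rounded to the nearest bin, and a first pulse at PG when no P0 is given. The function name `harmonic_pulses` and the seeded generator are hypothetical.

```python
import cmath
import math
import random

def harmonic_pulses(num_bins, pg, p0=None, rng=None):
    """Place random-phase pulses at p0, p0 + pg, p0 + 2*pg, ... below num_bins.

    When p0 is None the first pulse sits at pg, matching the case in which the
    first harmonic offset equals the harmonic interval.
    """
    if pg <= 0:
        raise ValueError("harmonic interval must be positive")
    rng = rng or random.Random(0)  # fixed seed only to keep the sketch repeatable
    buf = [0j] * num_bins
    pos = pg if p0 is None else p0
    while round(pos) < num_bins:
        phase = rng.uniform(0.0, 2.0 * math.pi)
        buf[round(pos)] = cmath.exp(1j * phase)  # unit magnitude, random phase
        pos += pg
    return buf

pulses = harmonic_pulses(32, 8.0)            # first pulse at bin 8
shifted = harmonic_pulses(32, 8.0, p0=3.0)   # first pulse at bin 3
```

Keeping `pos` as a float lets a fractional PG accumulate without drift before each position is rounded to a bin.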
(8): Generate a noise signal; for example, the noise signal buf_noise(k) can be produced by a random number generator.
(9): Adjust the ratio between the harmonic signal and the noise signal according to the value of the pitch parameter, and obtain the reconstructed spectrum signal from the adjusted harmonic signal and noise signal. The adjustment can be done in several ways. For example, first compute the energies of the harmonic signal and the noise signal, denoted enerP and enerN respectively; then compute the adjustment factors, $\beta_1 = 1 - T$ and a second adjustment factor [formula missing in the source], where $T$ is the pitch parameter; and obtain the corrected reconstructed spectrum signal [formula missing in the source]. The reconstructed spectrum signal is then transformed to the time domain by an inverse FFT.
(10): Perform frequency-domain shaping on the reconstructed spectrum signal according to the frequency-domain envelope parameter to obtain the frequency-shaped signal. For example, the signal is inverse-filtered with the decoded autoregressive model parameters to obtain the frequency-shaped signal.
(11): Perform time-domain shaping on the frequency-shaped signal according to the temporal envelope parameter to obtain the final synthesized audio signal. For example, the frequency-shaped signal may be de-normalized according to the decoded subframe energy envelope to obtain the final synthesized audio signal.
Compared with prior-art parametric audio coding based on a particular signal model, the set of parameters adopted by the embodiment reduces both the number of parameters needed for coding and the number of bits needed to encode them, which addresses the high bit count of existing coding methods. Because the parameter set can be encoded with fewer bits than existing parametric audio coding algorithms require, the bit rate of the signal is reduced further. For a fixed channel capacity, the lower bit count allows a signal of higher bandwidth to be encoded, so that a larger coding bandwidth and higher coding quality are obtained at a lower bit rate. At the decoding end, the audio signal can likewise be synthesized from fewer bits while keeping its quality high; the decoded audio quality is better still when the harmonic structure of the audio signal is pronounced.
An embodiment of the invention also provides a coding processing method, which may comprise the following. When the audio signal is coded band by band: if the spectrum signal of the audio signal in the current band is similar to that in the previous band, a temporal envelope parameter and a frequency-domain envelope parameter characterizing the audio signal are extracted, encoded, and transmitted, together with information indicating that the spectrum signal of the current band is similar to that of the previous band. If the spectrum signal of the audio signal in the current band is not similar to that in the previous band, a temporal envelope parameter, frequency-domain envelope parameter, pitch parameter, and harmonic interval parameter characterizing the audio signal are extracted, encoded, and transmitted, together with information indicating that the spectrum signal of the current band is not similar to that of the previous band.
Specifically, the information indicating whether the spectrum signal of the current band is similar to that of the previous band can be represented by a coding mode parameter. When the two spectrum signals are similar, the coding mode parameter instructs the decoding end to decode the audio signal of the current band from the temporal envelope parameter and frequency-domain envelope parameter; when they are not similar, it instructs the decoding end to decode the audio signal of the current band from the temporal envelope parameter, frequency-domain envelope parameter, pitch parameter, and harmonic interval parameter.
Further, if the spectrum signal of the current band is not similar to that of the previous band and the harmonic interval of the audio signal differs from the first harmonic offset, the first harmonic offset parameter of the audio signal is extracted and transmitted to the decoding end. In addition, if the spectrum signal of the current band is similar to that of the previous band, the pitch parameter of the audio signal may also be extracted and transmitted to the decoding end.
Accordingly, an embodiment of the invention also provides a decoding processing method, which may comprise: receiving the data sent by the encoding end; if information is received indicating that the spectrum signal of the audio signal in the current band is similar to that in the previous band, synthesizing the audio signal from the temporal envelope parameter and frequency-domain envelope parameter that characterize the audio signal, both of which are decoded from the received data; if information is received indicating that the spectrum signal of the current band is not similar to that of the previous band, synthesizing the audio signal from the temporal envelope parameter, frequency-domain envelope parameter, pitch parameter, and harmonic interval parameter that characterize the audio signal, all of which are decoded from the received data.
Specifically, the received coding mode parameter determines whether the spectrum signal of the current band is similar to that of the previous band. If it is similar, the audio signal is synthesized from the received temporal envelope parameter and frequency-domain envelope parameter; if it is not similar, the audio signal is synthesized from the received temporal envelope parameter, frequency-domain envelope parameter, pitch parameter, and harmonic interval parameter.
If the spectrum signal of the current band is not similar to that of the previous band, the received parameters may, in addition to the temporal envelope parameter, frequency-domain envelope parameter, pitch parameter, and harmonic interval parameter, also include the first harmonic offset parameter of the audio signal. If the spectrum signal of the current band is similar to that of the previous band, the received parameters may, in addition to the temporal envelope parameter and frequency-domain envelope parameter, also include the pitch parameter characterizing the audio signal.
Fig. 3 is a schematic flowchart of the coding processing method of an embodiment of the invention; the method is described below with reference to Fig. 3. As shown in Fig. 3, the method may comprise:
31: When the audio signal is coded band by band, determine whether the spectrum signal of the audio signal in the current band is similar to that in the previous band. The result can be represented by a coding mode parameter CM. For example, the cross-correlation between the signal spectrum of the current band and that of the previous band is computed first, to determine how similar the harmonic structures of the two bands are. When the cross-correlation exceeds a threshold, the harmonic structures of the current band and the previous band are judged similar and CM is set to 1; otherwise CM is set to 0. When the two spectra are similar, the pitch parameter, harmonic interval parameter, and first harmonic offset parameter described below need not be extracted.
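The similarity decision in step 31 can be sketched with a normalized cross-correlation at zero lag between the two band spectra. The text does not specify the form of the cross-correlation or the threshold, so the zero-lag normalized form, the value 0.8, and the function name `coding_mode` are assumptions.

```python
import math

def coding_mode(current_band, previous_band, threshold=0.8):
    """Return CM = 1 if the two band spectra are judged similar, else CM = 0."""
    num = sum(c * p for c, p in zip(current_band, previous_band))
    den = math.sqrt(sum(c * c for c in current_band) *
                    sum(p * p for p in previous_band))
    corr = num / den if den > 0.0 else 0.0  # silent bands count as dissimilar
    return 1 if corr > threshold else 0

previous_band = [1.0, 0.1, 0.1, 1.0, 0.1, 0.1, 1.0, 0.1]
same_structure = [0.9, 0.1, 0.2, 1.1, 0.1, 0.1, 0.8, 0.1]
other_structure = [0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0]

cm_similar = coding_mode(same_structure, previous_band)
cm_dissimilar = coding_mode(other_structure, previous_band)
```

When CM is 1 the encoder only needs the envelope parameters for the current band, which is where the bit saving of this mode comes from.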
32: If the spectra are similar, extract the temporal envelope parameter and frequency-domain envelope parameter that characterize the audio signal; if they are not similar, extract the temporal envelope parameter, frequency-domain envelope parameter, pitch parameter, and harmonic interval parameter that characterize the audio signal. That is, when the signal spectrum of the current band is similar to that of the previous band, the pitch parameter, harmonic interval parameter, and first harmonic offset parameter of the audio signal need not be extracted. The parameters can be extracted as follows:
Extract the temporal envelope parameter. For example, compute the subframe energy envelope and the global gain factor gain of the current band signal, and use these two sets of values to decide whether the signal is steady-state or transient. For a steady-state signal, quantize the global gain factor gain and use the quantized value as the temporal envelope parameter; for a transient signal, quantize the subframe energy envelope and use the quantized values as the temporal envelope parameter. Then normalize the current band signal in the time domain according to the temporal envelope parameter to obtain the time-normalized signal.
Extract the frequency-domain envelope parameter. For example, apply the MDCT (Modified Discrete Cosine Transform) to the time-normalized signal to obtain a set of MDCT coefficients, which is the frequency-domain signal corresponding to this band after time-domain normalization. Divide this frequency-domain signal into N subbands, then extract and quantize the energy of each subband; the resulting set of quantized frequency-domain envelope values is the frequency-domain envelope parameter. Normalize the frequency-domain signal according to the frequency-domain envelope parameter to obtain the frequency-normalized signal.
Extract the pitch parameter; specifically, the parameter may be extracted directly in the MDCT domain; to further improve encoder performance, it is also possible not to extract the parameter directly in the MDCT domain, but to calculate a pseudo-spectrum signal from the original frequency-domain signal and calculate the pitch parameter from this pseudo-spectrum signal; the pitch parameter can be expressed as the ratio between the maximum and the minimum of the auto-correlation function, where the maximum and minimum are taken over a desired range or over a range useful for calculating the harmonic interval parameter;
Extract the harmonic interval parameter PG; the harmonic interval parameter of a high-band signal is usually extracted in the frequency domain (or transform domain); the integer value of the harmonic interval can be estimated from the auto-correlation function by peak picking, and the fractional value can be estimated by peak picking on the interpolated auto-correlation function; alternatively, the auto-correlation function may be interpolated only in the neighbourhood of the integer harmonic interval already found, and the fractional value of the harmonic interval then obtained by peak picking;
Extract the first harmonic offset parameter; for example, estimate the first harmonic offset parameter P0 from the harmonic interval; specifically, within the harmonic interval range, i.e. the range [0, PG], the first harmonic component is placed at each candidate offset position in turn, the remaining harmonics are placed successively at the harmonic interval, and the correlation between the resulting spectrum and the pseudo-spectrum is calculated; the offset position with the largest correlation is the first harmonic offset sought; the first harmonic offset parameter can also be used to further refine the estimate of the harmonic interval parameter, giving a better parameter extraction result; it should be pointed out that this step can be omitted if the value of the first harmonic offset always equals the harmonic interval;
33: Send information indicating whether the spectrum signal of the audio signal of the current band is similar or dissimilar to the spectrum signal of the audio signal of the previous band, for example by encoding and sending the coding mode parameter CM, and encode and send the extracted parameters; specifically, when CM equals 1, a set of parameters comprising the coding mode parameter, temporal envelope parameter and frequency-domain envelope parameter is quantized or encoded and transmitted to the decoding end; when CM equals 0, a set of parameters comprising the coding mode parameter, temporal envelope parameter, frequency-domain envelope parameter, pitch parameter and harmonic interval parameter is quantized or encoded and transmitted to the decoding end;
It should be noted that when CM equals 1, the parameters transmitted to the decoding end may also include the pitch parameter; when CM equals 0, the first harmonic offset parameter is also transmitted if the value of the first harmonic offset is not equal to the harmonic interval.
Correspondingly, the decoding end synthesizes the audio signal from the received set of parameters comprising the coding mode parameter, temporal envelope parameter and frequency-domain envelope parameter, or from the received set of parameters comprising the coding mode parameter, temporal envelope parameter, frequency-domain envelope parameter, pitch parameter and harmonic interval parameter.
It should be pointed out that if the encoding end also transmits the pitch parameter when CM equals 1, the decoding end correspondingly also receives the pitch parameter; if the encoding end also transmits the first harmonic offset parameter when CM equals 0, the decoding end correspondingly also receives the first harmonic offset parameter.
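The choice of parameter set described in step 33 can be sketched as follows. This is only an illustration: the function name `parameters_to_transmit`, the string labels and the flag `send_pitch_when_similar` are hypothetical and do not appear in the embodiment.

```python
def parameters_to_transmit(cm, first_harmonic_offset=None, harmonic_interval=None,
                           send_pitch_when_similar=False):
    """Return the names of the parameters sent for one band, given coding mode CM.

    All names are illustrative labels, not identifiers from the embodiment.
    """
    if cm not in (0, 1):
        raise ValueError("CM must be 0 or 1")
    names = ["coding_mode", "temporal_envelope", "frequency_envelope"]
    if cm == 1:
        # Similar spectra: the envelopes suffice; the pitch parameter is optional.
        if send_pitch_when_similar:
            names.append("pitch")
        return names
    # CM == 0: dissimilar spectra, so the harmonic description is also needed.
    names += ["pitch", "harmonic_interval"]
    # The offset is sent only when it differs from the harmonic interval.
    if first_harmonic_offset != harmonic_interval:
        names.append("first_harmonic_offset")
    return names
```

The decoding end can apply the same rule in reverse: when no first harmonic offset is present for CM equal to 0, it takes the offset to be equal to the harmonic interval.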
Fig. 4 is a schematic flowchart of the decoding method of the embodiment of the invention; as shown in Fig. 4, the decoding process may specifically comprise:
41: Receive information indicating whether the spectrum signal of the audio signal of the current band is similar or dissimilar to the spectrum signal of the audio signal of the previous band; for example, decode the coding mode parameter CM from the received data and determine from CM whether the two are similar;
42: When the spectrum signal of the audio signal of the current band is similar to that of the previous band, synthesize the audio signal from the temporal envelope parameter and frequency-domain envelope parameter obtained by decoding the received data; when they are dissimilar, synthesize the audio signal from the temporal envelope parameter, frequency-domain envelope parameter, pitch parameter and harmonic interval parameter obtained by decoding the received data;
Specifically, when reconstructing the spectrum signal:
If the spectrum of the current band signal is similar to the spectrum of the previous band signal, for example CM equals 1, the spectrum signal of the previous band may be used, by spectrum copying, as the reconstructed spectrum signal of the current band; a method other than spectrum copying may of course also be used to reconstruct the spectrum signal; if the encoding end also transmits the pitch parameter when CM equals 1, the pitch parameter may be decoded from the bitstream and the spectrum signal of the current band reconstructed from the spectrum of the previous band by spectrum copying; specifically, the spectrum signal of the previous band may be shaped according to the pitch parameter, and the shaped spectrum signal used as the reconstructed spectrum signal of the current band;
If the spectrum of the current band signal is dissimilar to the spectrum of the previous band signal, for example CM equals 0, decode the pitch parameter, harmonic interval parameter and first harmonic offset parameter from the bitstream; obtain the harmonic signal from the harmonic interval parameter, or from the harmonic interval parameter and the first harmonic offset parameter; adjust the ratio between the harmonic signal and the noise signal according to the pitch parameter; and obtain the reconstructed spectrum signal from the adjusted harmonic signal and noise signal; that is, the spectrum signal of the high band is reconstructed by an artificial reconstruction method based on the pitch parameter, harmonic interval parameter and first harmonic offset parameter; it should be noted that when the first harmonic offset parameter is not transmitted in the encoded bitstream, the decoding end takes the first harmonic offset parameter to be equal to the harmonic interval parameter.
Perform frequency-domain shaping on the reconstructed spectrum signal according to the decoded frequency-domain envelope, for example frequency-domain de-normalization, and transform the shaped spectrum signal to the time domain; this may be done with the inverse MDCT, or with the inverse FFT, but the transform must correspond to the one used at the encoding end;
Perform time-domain shaping according to the decoded temporal envelope parameter, for example time-domain de-normalization, to obtain the high-frequency signal produced by parametric audio decoding, and thus the synthesized audio signal.
It should be noted that the order of the above frequency-domain shaping and time-domain shaping is not unique; time-domain shaping may also be applied to the reconstructed spectrum signal first, followed by frequency-domain shaping. For example: frequency-domain shaping is performed on the reconstructed spectrum signal according to the frequency-domain envelope parameter to obtain the frequency-domain-shaped signal, and time-domain shaping is then performed on that signal according to the temporal envelope parameter to obtain the synthesized audio signal; or, time-domain shaping is performed on the reconstructed spectrum signal according to the temporal envelope parameter to obtain the time-domain-shaped signal, and frequency-domain shaping is then performed on that signal according to the frequency-domain envelope parameter to obtain the synthesized audio signal.
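The frequency-domain normalization at the encoding end and the matching de-normalization (frequency-domain shaping) at the decoding end can be sketched as a round trip. The sketch assumes equal-width subbands, a coefficient count divisible by the number of subbands, and an envelope defined as the RMS of each subband; the embodiment does not fix these details, and quantization is left out. The function names are hypothetical.

```python
import math

def subband_rms(coeffs, n_subbands):
    """RMS of each equal-width subband (an assumed envelope definition)."""
    width = len(coeffs) // n_subbands
    env = []
    for b in range(n_subbands):
        band = coeffs[b * width:(b + 1) * width]
        env.append(math.sqrt(sum(c * c for c in band) / width))
    return env

def _gain(envelope, index, width):
    # Envelope value of the subband containing this bin, floored to avoid division by zero.
    return max(envelope[min(index // width, len(envelope) - 1)], 1e-12)

def normalize_subbands(coeffs, envelope):
    """Encoder side: divide every coefficient by the envelope of its subband."""
    width = len(coeffs) // len(envelope)
    return [c / _gain(envelope, i, width) for i, c in enumerate(coeffs)]

def denormalize_subbands(coeffs, envelope):
    """Decoder side: multiply every coefficient by the envelope of its subband."""
    width = len(coeffs) // len(envelope)
    return [c * _gain(envelope, i, width) for i, c in enumerate(coeffs)]
```

With an unquantized envelope, `denormalize_subbands(normalize_subbands(x, env), env)` returns `x` up to rounding error; in a real codec the envelope is quantized, so the round trip is only approximate.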
The above describes how, when the audio signal is encoded band by band, it is determined whether the spectrum signal of the audio signal of the current band is similar to that of the previous band; when they are dissimilar, a set of parameters comprising the temporal envelope parameter, frequency-domain envelope parameter, pitch parameter, harmonic interval parameter and first harmonic offset parameter is extracted; when they are similar, only a set of parameters comprising the temporal envelope parameter, frequency-domain envelope parameter and pitch parameter is extracted, or only a set comprising the temporal envelope parameter and frequency-domain envelope parameter. The embodiment of the invention thereby reduces the number of parameters needed for encoding, and with it the number of bits needed to encode those parameters; it also exploits the similarity between the spectra of different bands of the signal to reduce the coding rate further and obtain a larger coded bandwidth. When decoding the audio signal band by band, the decoding end can, according to the above parameters, apply different spectrum reconstruction methods to signals with different characteristics; it therefore adapts better to the signal characteristics and achieves equally high synthesis quality for different signals.
To facilitate understanding of the embodiments of the invention, specific implementations of the encoding and decoding are described in detail below.
Embodiment two
In this embodiment, the input audio signal is divided at the encoding end into a high-band signal and a low-band signal, and the high-band signal and the low-band signal are encoded separately.
Fig. 5 is a schematic diagram of the processing at the encoding end in Embodiment two; as shown in Fig. 5, the encoding process comprises:
51: Perform analysis filtering on the input audio signal; assume the sampling rate of the input audio signal is 32 kHz and the processing frame length is 20 ms; after band splitting and down-sampling of the input signal, the signal in the 0 ~ 8 kHz band has 320 samples and the signal in the 8 ~ 16 kHz band has 320 samples;
52: The signal in the 0 ~ 8 kHz band is encoded by the core encoder; in a specific application, the core encoding may be performed by a G.729.1 codec or by another wideband signal codec; that is, any coding scheme may be used as long as it can encode the signal in the 0 ~ 8 kHz band; the bit stream of the low-frequency signal is output;
53: For the signal in the 8 ~ 16 kHz band, for example the time-domain signal {y_hi(0), y_hi(1), ..., y_hi(319)}, perform parametric audio coding with the encoding method provided by the embodiment of the invention; here the high band is the current band referred to in the encoding method, and the low band is the previous band; when the spectrum of the high-frequency signal is not similar to the spectrum of the low-frequency signal, extract a set of parameters comprising the temporal envelope parameter, frequency-domain envelope parameter, pitch parameter, harmonic interval parameter, first harmonic offset parameter and coding mode parameter; when they are similar, extract only a set comprising the temporal envelope parameter, frequency-domain envelope parameter, pitch parameter and coding mode parameter, or only a set comprising the temporal envelope parameter, frequency-domain envelope parameter and coding mode parameter; the specific processing may comprise:
(1) Determine the coding mode parameter CM; specifically, the cross-correlation between the low-band signal spectrum and the high-band signal spectrum may first be calculated to determine the similarity between the low-band harmonic structure and the high-band harmonic structure; when the cross-correlation is greater than a threshold, the low-band and high-band harmonic structures are judged to be similar, CM is set to 1, and the spectrum signal of the high band is reconstructed from the spectrum signal of the low band by spectrum copying with shaping, or by a method other than spectrum copying; when the cross-correlation is less than or equal to the threshold, the low-band and high-band harmonic structures are judged to be dissimilar, CM is set to 0, and the spectrum signal of the high band is reconstructed artificially from the parameters; in practical applications a simpler coding mode decision may also be used, namely setting CM to 1 when the harmonic interval PG is less than a threshold, and to 0 otherwise;
(2) Calculate the subframe energy envelope {temp_env(0), temp_env(1), ..., temp_env(N-1)} and the global gain factor gain of the signal, with N=8 in this embodiment; determine from these two sets of values whether the signal is a steady-state signal or a transient signal; for a steady-state signal, quantize the global gain factor gain, use the quantized value as the temporal envelope parameter, and encode it into the bitstream; for a transient signal, quantize the subframe energy envelope, use the quantized values as the temporal envelope parameter, and encode them into the bitstream; then perform time-domain normalization on the 8 ~ 16 kHz band signal according to the temporal envelope parameter to obtain the time-domain-normalized signal;
(3) Apply the MDCT (Modified Discrete Cosine Transform, for example 640 points) to the time-domain-normalized signal to obtain a set of MDCT coefficients, i.e. the frequency-domain signal of this band, {y_swb(0), y_swb(1), ..., y_swb(319)}; because the super-wideband encoder is only required to handle the signal in the 8 ~ 14 kHz band, only the part {y_swb(0), y_swb(1), ..., y_swb(239)} of the frequency-domain signal is processed; in the processing, this frequency-domain signal is divided into N subbands, and the subband energy of each subband is extracted and quantized to obtain a set of quantized frequency-domain envelope values {spec_env(0), spec_env(1), ..., spec_env(N-1)}, which is the frequency-domain envelope parameter for the 8 ~ 14 kHz band;
For the wideband core encoder G.729.1, the 7 ~ 8 kHz part of the signal lies outside its processing range; to ensure continuity of the decoded signal spectrum at the decoding end, characteristic parameters of the 7 ~ 8 kHz part of the signal must also be extracted; the G.729.1 encoder applies an MDCT (for example 320 points) to the 4 ~ 8 kHz signal, giving the frequency-domain signal {y_wb(0), y_wb(1), ..., y_wb(159)}, of which the part corresponding to 7 ~ 8 kHz is {y_wb(120), y_wb(121), ..., y_wb(159)}; this part is divided into M subbands, and the frequency-domain envelope of each subband is extracted and quantized to obtain a set of quantized frequency-domain envelope values for the 7 ~ 8 kHz band, {spec_env_extra(0), spec_env_extra(1), ..., spec_env_extra(M-1)}, which together with the frequency-domain envelope parameter for the 8 ~ 14 kHz band forms the complete frequency-domain envelope parameter; this set of envelope values can be encoded and transmitted to the decoding end; in this embodiment N=15 and M=3;
(4) Extract the pitch parameter; specifically, the parameter may be extracted directly in the MDCT domain; to further improve encoder performance, it is also possible not to extract the parameter directly in the MDCT domain, but to calculate a pseudo-spectrum signal from the original frequency-domain signal {y_swb(0), y_swb(1), ..., y_swb(239)} and calculate the pitch parameter from this pseudo-spectrum signal;
Specifically, the pseudo-spectrum signal S(k) = {S(0), S(1), ..., S(239)} can be calculated according to the following formula:
The pseudo-spectrum signal can of course also be obtained by other methods, for example by taking the absolute value of the original frequency-domain signal directly, {|y_swb(0)|, |y_swb(1)|, ..., |y_swb(239)|}; the auto-correlation function $\mathrm{ACF}(k')$ is then calculated; it can be obtained from the pseudo-spectrum signal by frequency-domain calculation, for example $\mathrm{ACF}(k') = \mathrm{IFFT}\left(\left|\mathrm{FFT}(S(k))\right|^{2}\right)$, where FFT is the fast Fourier transform and IFFT is its inverse; it can also be calculated directly in the lag domain;
In addition, the average magnitude difference function (AMDF) can be used to enhance the auto-correlation function;
The pitch parameter can be expressed as the ratio between the maximum and the minimum of the auto-correlation function, for example $T = \max(\mathrm{ACF}(k')) / \min(\mathrm{ACF}(k'))$, where the maximum and minimum are taken over a desired range or over a range useful for calculating the harmonic interval parameter;
(5) Estimate the harmonic interval parameter PG from $\mathrm{ACF}(k')$; the harmonic interval parameter of a high-band signal is usually extracted in the frequency domain (or transform domain); the integer value of the harmonic interval can be estimated from the auto-correlation function by peak picking, for example as $PG = \arg\max_{k'} \mathrm{ACF}(k')$, where the search for the maximum may be restricted to a desired range or a range of interest; the fractional value of the harmonic interval can be obtained by suitably interpolating the auto-correlation function $\mathrm{ACF}(k')$ and then peak picking; alternatively, the auto-correlation function may be interpolated only in the neighbourhood of the integer harmonic interval already found, and the fractional value of the harmonic interval then obtained by peak picking;
(6) The estimated harmonic interval parameter value may also be corrected, to suppress frequency-doubling and frequency-halving errors; for example, the harmonic interval PG obtained for the current frame is compared with the harmonic interval old_PG of the previous frame; if the ratio between the harmonic interval of the current frame and that of the previous frame is less than a certain threshold (such as 0.1) and ACF(old_PG) > 0.95 ACF(PG), the harmonic interval obtained for this frame is replaced by that of the previous frame, PG = old_PG;
(7) Estimate the first harmonic offset parameter P0 from the harmonic interval; for example, within the harmonic interval range, i.e. the range [0, PG], the first harmonic component is placed at each candidate offset position in turn, the remaining harmonics are placed successively at the harmonic interval, and the correlation between the resulting spectrum and the pseudo-spectrum is calculated; the offset position with the largest correlation is the first harmonic offset sought (in the corresponding formula, ⌊·⌋ denotes rounding down); it should be pointed out that the harmonic interval parameter and the first harmonic offset parameter are in fact correlated to some extent, so the first harmonic offset parameter of the high-band signal can be estimated from the harmonic interval parameter; the first harmonic offset parameter can also be used to further refine the estimate of the harmonic interval parameter, giving a better parameter extraction result;
(8) When CM equals 1, a set of parameters comprising the coding mode parameter, temporal envelope parameter, frequency-domain envelope parameter and pitch parameter is quantized or encoded and transmitted to the decoding end (i.e. the high-frequency parameter bit stream is output); when CM equals 0, a set of parameters comprising the coding mode parameter, temporal envelope parameter, frequency-domain envelope parameter, pitch parameter, harmonic interval parameter and first harmonic offset parameter is quantized or encoded and transmitted to the decoding end (i.e. the high-frequency parameter bit stream is output);
It should be pointed out that when CM equals 1, it is also possible to quantize or encode only a set of parameters comprising the coding mode parameter, temporal envelope parameter and frequency-domain envelope parameter, and transmit it to the decoding end;
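The coding mode decision of step (1) can be sketched as below. The embodiment only specifies a cross-correlation compared against a threshold; the zero-lag normalized correlation of magnitude spectra and the default threshold of 0.5 used here are assumptions, and the function name `coding_mode` is hypothetical.

```python
import math

def coding_mode(low_spec, high_spec, threshold=0.5):
    """Return CM = 1 if the low-band and high-band spectra are similar, else 0.

    Similarity is measured (by assumption) as the zero-lag normalized
    cross-correlation of the magnitude spectra.
    """
    lo = [abs(x) for x in low_spec]
    hi = [abs(x) for x in high_spec]
    num = sum(a * b for a, b in zip(lo, hi))
    den = math.sqrt(sum(a * a for a in lo) * sum(b * b for b in hi))
    corr = num / den if den > 0.0 else 0.0
    # Greater than the threshold: similar (CM = 1); otherwise dissimilar (CM = 0).
    return 1 if corr > threshold else 0
```

Because magnitudes are non-negative, this measure is biased towards high values for dense spectra; a practical encoder would probably tune the threshold or remove the spectral mean first.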
54: After the parametric audio coding of the high-band signal is completed, whether to enhance the parametrically coded high-frequency signal with the optional RIRAC audio coding may be chosen according to the number of remaining coding bits; the enhancement used in this embodiment is transform coding of the high-band signal in the MDCT domain; other ways of enhancing the parametrically coded high-frequency signal may of course also be chosen, such as transform coding of the residual signal between the original high-band signal and the coded high-band audio signal; the high-frequency enhancement bit stream is output.
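Steps (4) and (5) of 53 above, computing the pitch parameter T and the integer harmonic interval PG from the auto-correlation of the pseudo-spectrum, can be sketched as follows. The sketch uses the direct lag-domain auto-correlation; the lag search range `min_lag`..`max_lag` and the guard against a zero minimum are assumptions, and the function names are hypothetical. Fractional refinement by interpolation is not shown.

```python
def autocorrelation(spec, max_lag):
    """Direct (lag-domain) auto-correlation of a pseudo-spectrum."""
    n = len(spec)
    return [sum(spec[k] * spec[k + lag] for k in range(n - lag))
            for lag in range(max_lag + 1)]

def pitch_and_harmonic_interval(spec, min_lag=2, max_lag=40):
    """Return (T, PG): max/min ratio of the ACF and the lag of its peak."""
    acf = autocorrelation(spec, max_lag)
    window = acf[min_lag:max_lag + 1]      # assumed search range
    peak = max(window)
    floor = max(min(window), 1e-12)        # guard against division by zero
    t = peak / floor
    pg = min_lag + window.index(peak)      # integer harmonic interval
    return t, pg
```

For a pseudo-spectrum with a strong peak every 8 bins over a weak floor, the auto-correlation peaks at lag 8, so the function returns PG = 8 and a ratio T well above 1.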
Correspondingly, after receiving the above low-frequency bit stream, high-frequency parameter bit stream and high-frequency enhancement bit stream, the decoding end decodes them and synthesizes the audio signal; Fig. 6 is a schematic diagram of the processing at the decoding end in Embodiment two; as shown in Fig. 6, the decoding process may specifically comprise:
61: The signal in the 0 ~ 8 kHz band is synthesized by the core decoder;
62: The signal in the 8 ~ 16 kHz band is synthesized by parametric audio decoding; the specific processing comprises: (1) decode the coding mode parameter CM from the received data;
(2) Decode the temporal envelope parameter and the frequency-domain envelope parameter from the data;
(3) If CM equals 1, the pitch parameter may be decoded from the received data, and the spectrum signal of the high band reconstructed from the spectrum of the low band by spectrum copying with shaping, or by a method other than spectrum copying; for example, the spectrum signal of the low-band signal obtained by core decoding may be shaped according to the pitch parameter, and the shaped spectrum signal used as the reconstructed high-band spectrum signal;
It should be pointed out that when the encoding end does not transmit the pitch parameter for CM equal to 1, the decoding end uses the low-band spectrum signal obtained by core decoding directly as the reconstructed high-band spectrum signal;
If CM equals 0, the pitch parameter, harmonic interval parameter and first harmonic offset parameter may be decoded from the received data, and the spectrum signal of the high band reconstructed by an artificial reconstruction method based on the pitch parameter, harmonic interval parameter and first harmonic offset parameter; the spectrum signal is reconstructed as a harmonic signal plus a noise signal; specifically, harmonics with random phase are placed as pulses at certain frequency bins within the frequency range to reconstruct the harmonic signal, where the spacing of the pulses is determined by the harmonic interval parameter and the position of the first pulse can be obtained from the first harmonic offset; the noise signal can be obtained from a random number generator; the ratio between the harmonic signal and the noise signal is adjusted according to the value of the pitch parameter T; and the adjusted harmonic signal and noise signal are added to obtain the reconstructed spectrum signal; the adjustment can be done in several ways, for example: first calculate the energies of the harmonic signal and the noise signal, denoted enerP and enerN respectively, then calculate the adjustment factor $\beta_1 = 1 - T$ together with a second adjustment factor, and from them obtain the reconstructed spectrum signal;
(4) Perform frequency-domain shaping on the reconstructed spectrum signal according to the decoded frequency-domain envelope, for example frequency-domain de-normalization, and transform the shaped spectrum signal to the time domain; for example, the shaped spectrum signal can be transformed to the time domain with the inverse MDCT, or with the inverse FFT;
(5) Perform time-domain shaping according to the decoded temporal envelope parameter, for example time-domain de-normalization, to obtain the decoded high-frequency signal;
It should be noted that in the time-domain and frequency-domain de-normalization, an optional smoothing filter may be applied to the temporal envelope and the frequency-domain envelope. If the spectrum signal of the high band was reconstructed artificially, then once a harmonic is placed in the wrong subband, de-normalization will use the wrong envelope factor. Even a slight deviation in the harmonic position introduces some distortion, and smoothing can alleviate this distortion. Specifically, if there is a strong tonal component near a subband boundary, frequency-domain de-normalization can be performed with the interpolated subband energy envelope factors; the resulting signal is then transformed to the time domain, where a time-domain gain function is interpolated from the adaptive subframe energy envelope (ATE); finally, this time-domain gain function is used to de-normalize the time-domain signal;
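The interpolation of subband envelope factors mentioned above can be illustrated with a short sketch. Linear interpolation between subband centres is an assumption, since the embodiment does not state which interpolation is used; the function name `interpolate_envelope` is hypothetical.

```python
def interpolate_envelope(envelope, width):
    """Per-bin gains from per-subband envelope values.

    Assumes equal-width subbands of `width` bins and linear interpolation
    between subband centres; bins outside the first and last centres keep
    the value of the nearest subband.
    """
    last = len(envelope) - 1
    if last == 0:
        return [envelope[0]] * width
    gains = []
    for k in range(len(envelope) * width):
        # Position of bin k in units of subbands, measured from the first centre.
        pos = (k + 0.5) / width - 0.5
        pos = min(max(pos, 0.0), float(last))
        b = min(int(pos), last - 1)
        frac = pos - b
        gains.append((1.0 - frac) * envelope[b] + frac * envelope[b + 1])
    return gains
```

Compared with a piecewise-constant envelope, the interpolated gains change gradually across a subband boundary, so a harmonic that lands one bin on the wrong side of the boundary is scaled by nearly the same factor.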
63: After the decoding of the high-band signal in 62 is completed, whether to enhance the decoded high-frequency signal may be chosen according to the number of bits remaining in the received data; the specific method corresponds to the enhancement used at the encoding end and is not repeated here;
64: The synthesized signal of the 0 ~ 8 kHz band and the synthesized signal of the 8 ~ 16 kHz band are passed through QMF synthesis filtering to obtain the final synthesized audio signal at a 32 kHz sampling rate.
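The artificial reconstruction used in step (3) of 62 when CM equals 0 (harmonic pulses plus noise) can be sketched as follows. The embodiment's adjustment factors are not fully reproduced above, so this sketch substitutes a hypothetical `harmonic_weight` in [0, 1] for the mixing rule; representing random phase as a random sign, rounding fractional pulse positions to the nearest bin, and normalizing both components to unit energy are likewise assumptions.

```python
import math
import random

def reconstruct_spectrum(n_bins, pg, p0, harmonic_weight, seed=0):
    """Harmonic-plus-noise spectrum: pulses at p0, p0 + pg, p0 + 2*pg, ...

    harmonic_weight is a hypothetical stand-in for the embodiment's
    adjustment factors: 1.0 gives pulses only, 0.0 gives noise only.
    """
    if pg <= 0 or p0 < 0:
        raise ValueError("pg must be positive and p0 non-negative")
    rng = random.Random(seed)
    harmonic = [0.0] * n_bins
    n = 0
    while True:
        k = int(round(p0 + n * pg))              # nearest bin for a fractional position
        if k >= n_bins:
            break
        harmonic[k] = rng.choice((-1.0, 1.0))    # random phase modelled as a random sign
        n += 1
    noise = [rng.gauss(0.0, 1.0) for _ in range(n_bins)]

    def unit_energy(x):
        e = sum(v * v for v in x)
        return [v / math.sqrt(e) for v in x] if e > 0.0 else x

    gh = math.sqrt(harmonic_weight)
    gn = math.sqrt(1.0 - harmonic_weight)
    return [gh * h + gn * z for h, z in zip(unit_energy(harmonic), unit_energy(noise))]
```

The result would then be shaped by the decoded frequency-domain and temporal envelopes, as in steps (4) and (5) of 62.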
In Embodiment two, with the audio signal divided into a low-band signal and a high-band signal, the high-band signal is parametrically encoded and decoded; that is, as indicated by the coding mode parameter, encoding and decoding use either a set of parameters characterizing the signal that comprises the temporal envelope, frequency-domain envelope, pitch, harmonic interval and first harmonic offset, or a set of parameters characterizing the signal that comprises the temporal envelope, frequency-domain envelope and pitch. The set of parameters used by the embodiment of the invention reduces the number of parameters needed for encoding, and with it the number of bits needed to encode those parameters, which addresses the high bit count of existing coding methods. Compared with existing parametric audio coding algorithms, this set of parameters can be encoded with fewer bits, which further reduces the coding rate of the signal; and for a given channel transmission capacity, because the number of coding bits of the present invention is lower, a signal with a higher bandwidth can be encoded, so that a larger coded bandwidth and higher coding quality are obtained at a lower coding rate. The decoding end can likewise synthesize the audio signal from fewer bits with higher audio quality, and the decoded audio quality is better still when the harmonic structure of the audio signal is pronounced.
Embodiment three
Whereas Embodiment two extracts the temporal envelope parameter first and the frequency-domain envelope parameter afterwards, Embodiment three performs the encoding by extracting the frequency-domain envelope parameter first (as an example, the audio signal and the band-splitting method in Embodiment three are taken to be the same as in Embodiment two).
In this embodiment, the processing of the high-band signal at the encoding end may specifically comprise:
(1): Determine the coding mode parameter CM by the method in (1) of 53 at the encoding end in Embodiment two;
(2): Apply the MDCT to the time-domain signal in the 8 ~ 16 kHz band to obtain a set of MDCT coefficients; because the super-wideband part only handles the signal in the 8 ~ 14 kHz band, only the part {y_swb(0), y_swb(1), ..., y_swb(239)} of the frequency-domain signal is processed; for the core encoder, the 7 ~ 8 kHz part of the signal lies outside its processing range, so to ensure continuity of the decoded signal spectrum at the decoding end, the MDCT-domain signal of the 7 ~ 8 kHz part, {y_wb(120), y_wb(121), ..., y_wb(159)}, must be extracted at the encoding end;
(3): Divide the MDCT coefficients in the 7 ~ 14 kHz band into subbands and calculate the energy of each subband as the frequency-domain envelope parameter, which is quantized, encoded and transmitted;
(4): Perform frequency-domain normalization on the MDCT coefficients in the 7 ~ 14 kHz band, and extract linear prediction coefficients from the frequency-domain-normalized MDCT coefficients as the temporal envelope parameter; this set of linear prediction coefficients is quantized, encoded and transmitted;
(5): Apply linear prediction filtering to the frequency-domain-normalized MDCT coefficients to obtain the linear prediction residual in the MDCT domain;
(6): Extract the pitch parameter, harmonic interval parameter and first harmonic offset parameter of the high-frequency signal by the method in (4) ~ (8) of 53 at the encoding end in Embodiment two; when the coding mode is 1, only the coding mode parameter, temporal envelope parameter, frequency-domain envelope parameter and pitch parameter are transmitted to the decoding end; when the coding mode is 0, the coding mode parameter, temporal envelope parameter, frequency-domain envelope parameter, pitch parameter, harmonic interval parameter and first harmonic offset parameter are all transmitted to the decoding end;
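Steps (4) and (5) above, linear prediction over the frequency-domain-normalized MDCT coefficients, can be sketched as follows. The embodiment does not say how the linear prediction coefficients are computed; this sketch uses the autocorrelation method with the Levinson-Durbin recursion as one common choice, and the prediction order and function names are assumptions.

```python
def lpc(x, order):
    """Predictor coefficients a[0..order-1] with x[n] ~ sum_i a[i] * x[n-1-i].

    Autocorrelation method solved by the Levinson-Durbin recursion
    (a common choice, assumed here; the embodiment does not name a method).
    """
    n = len(x)
    r = [sum(x[k] * x[k + lag] for k in range(n - lag)) for lag in range(order + 1)]
    a = [0.0] * (order + 1)          # a[0] is unused; a[j] multiplies x[n - j]
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break                    # degenerate input: stop with the coefficients so far
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]

def prediction_residual(x, a):
    """Residual e[n] = x[n] - sum_i a[i] * x[n-1-i], with x[m] = 0 for m < 0."""
    return [x[n] - sum(a[i] * x[n - 1 - i] for i in range(len(a)) if n - 1 - i >= 0)
            for n in range(len(x))]
```

Here `x` stands for the normalized MDCT coefficients indexed by frequency bin, so the prediction runs along the frequency axis and the resulting coefficients describe the temporal envelope.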
Correspondingly, the processing of the high-band signal at the decoding end may specifically comprise:
(7): Decode the coding mode parameter CM from the received code stream, and decode the temporal envelope parameter and frequency domain envelope parameter from the code stream. Specifically, the quantized linear prediction coefficients, i.e. the temporal envelope parameter, can be obtained by codebook lookup, so that time domain shaping can subsequently be performed according to them; the quantized sub-band energies, i.e. the frequency domain envelope parameter, can likewise be obtained by codebook lookup, so that frequency domain shaping can subsequently be performed according to them;
(8): Reconstruct the spectrum signal of the high band according to the method in step (3) of the decoding end 62 in embodiment two;
(9): Pass the reconstructed high-band spectrum signal through the linear prediction inverse filter, which is equivalent to performing time domain shaping on the reconstructed high-band spectrum signal;
(10): Perform frequency domain shaping on the reconstructed high-band spectrum signal according to the quantized sub-band energies;
(11): Transform the shaped high-band spectrum signal to the time domain through the inverse MDCT to obtain the final synthesized high-band signal.
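The two shaping operations in steps (9) and (10) can be illustrated with the short sketch below. It is written under assumptions the text does not fix: the inverse filter is taken to be the all-pole filter 1/A(z) for coefficients a[0..p] with a[0] = 1, the sub-bands have a uniform hypothetical width `band_size`, and the decoded sub-band energies are interpreted as mean energies per coefficient. Codebook lookup and the inverse MDCT of step (11) are left out.

```python
import math

def lp_synthesis(spectrum, a):
    """Step (9): all-pole inverse filter, y[n] = x[n] - sum_{k>=1} a[k] * y[n-k]."""
    y = []
    for n, x in enumerate(spectrum):
        acc = x
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y.append(acc)
    return y

def shape_subbands(spectrum, target_energies, band_size, eps=1e-12):
    """Step (10): rescale each sub-band so its mean energy matches the decoded value."""
    out = []
    for b, target in enumerate(target_energies):
        band = spectrum[b * band_size:(b + 1) * band_size]
        current = sum(c * c for c in band) / max(len(band), 1)
        gain = math.sqrt(target / (current + eps))
        out.extend(c * gain for c in band)
    return out

# Usage with made-up values: a flat reconstructed spectrum, a first-order
# predictor and two decoded sub-band energies.
rebuilt = [1.0, 0.0, 0.0, 0.0]
time_shaped = lp_synthesis(rebuilt, [1.0, -0.9])
shaped = shape_subbands(time_shaped, [4.0, 1.0], band_size=2)
```

Applying the energy scaling after the inverse filter means the transmitted sub-band energies are met exactly at the output, whichever predictor was decoded.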
As can be seen from the above description, embodiment three performs encoding by extracting the frequency domain envelope parameter first. The order in which the above parameters are obtained is not unique: whatever the order, it suffices to obtain the coding mode parameter, temporal envelope parameter, frequency domain envelope parameter, pitch parameter, harmonic interval parameter and first harmonic offset parameter of the audio signal. The group of parameters adopted by the embodiment of the invention reduces the number of parameters needed for encoding and, with it, the number of bits needed to encode them, thereby solving the problem that existing coding methods require a high number of bits. Compared with existing parametric audio coding algorithms, this group of parameters can be encoded with fewer bits, which further reduces the coding rate of the signal. When the transmission capacity of the channel is fixed, the lower number of coded bits allows a signal with a higher bandwidth to be encoded, so that a larger coding bandwidth and a higher coding quality are obtained at a lower coding rate. At the same time, the decoding end can synthesize a high-quality audio signal from fewer bits, and the decoded audio quality is better still when the harmonic structure of the audio signal is pronounced.
The embodiment of the invention also provides a corresponding audio coding apparatus, whose structure is shown in Fig. 7. Its specific implementation structure may comprise:
Parameter extraction unit 71, configured to extract the temporal envelope parameter, frequency domain envelope parameter, pitch parameter and harmonic interval parameter used to characterize the audio signal; when the harmonic interval and the first harmonic offset of the audio signal differ in value, it is also configured to extract the first harmonic offset parameter used to characterize the audio signal; the extracted parameters are sent to the transmitting unit;
Transmitting unit 72, configured to encode the temporal envelope parameter, frequency domain envelope parameter, pitch parameter and harmonic interval parameter and transmit them to the decoding end; or to encode the temporal envelope parameter, frequency domain envelope parameter, pitch parameter, harmonic interval parameter and first harmonic offset parameter and transmit them to the decoding end.
The embodiment of the invention also provides a corresponding audio decoding apparatus, whose structure is shown in Fig. 8. Its specific implementation structure may comprise:
Decoding unit 81, configured to decode the received data to obtain the temporal envelope parameter, frequency domain envelope parameter, pitch parameter and harmonic interval parameter used to characterize the audio signal; it is also configured to decode received data that contains the first harmonic offset parameter, to obtain the first harmonic offset parameter used to characterize the audio signal;
Synthesis unit 82, configured to synthesize the audio signal according to the temporal envelope parameter, frequency domain envelope parameter, pitch parameter and harmonic interval parameter, or according to the temporal envelope parameter, frequency domain envelope parameter, pitch parameter, harmonic interval parameter and first harmonic offset parameter. It may specifically comprise:
Harmonic reconstruction subunit 821, configured to obtain a harmonic signal according to the harmonic interval parameter; or, when the harmonic interval and the first harmonic offset used to characterize the audio signal differ, to obtain the harmonic signal according to the harmonic interval parameter and the first harmonic offset parameter;
Spectrum signal reconstruction subunit 822, configured to adjust, according to the pitch parameter, the ratio between the harmonic signal obtained by the harmonic reconstruction subunit 821 and a noise signal, and to obtain the reconstructed spectrum signal from the adjusted harmonic signal and noise signal;
Shaping subunit 823, configured to process the spectrum signal reconstructed by the spectrum signal reconstruction subunit 822 according to the frequency domain envelope parameter and the temporal envelope parameter, to obtain the synthesized audio signal. For example: frequency domain shaping is performed on the reconstructed spectrum signal according to the frequency domain envelope parameter to obtain a frequency-shaped signal, and time domain shaping is then performed on the frequency-shaped signal according to the temporal envelope parameter to obtain the synthesized audio signal; or, time domain shaping is performed on the reconstructed spectrum signal according to the temporal envelope parameter to obtain a time-shaped signal, and frequency domain shaping is then performed on the time-shaped signal according to the frequency domain envelope parameter to obtain the synthesized audio signal.
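The work of subunits 821 and 822 can be sketched as below. This is an illustration under explicit assumptions rather than the method of the embodiment: the harmonic interval and first harmonic offset are taken as integer numbers of spectral bins, harmonics are modelled as unit impulses, the noise is uniform random, and the pitch parameter is interpreted as a target harmonic-to-noise energy ratio. All function names are hypothetical.

```python
import math
import random

def harmonic_positions(num_bins, interval, first_offset=None):
    """Bins holding harmonics. If no offset was transmitted, the first harmonic
    offset is taken to equal the harmonic interval."""
    if interval <= 0:
        raise ValueError("harmonic interval must be positive")
    if first_offset is None:
        first_offset = interval
    return list(range(first_offset, num_bins, interval))

def build_harmonic(num_bins, interval, first_offset=None):
    """Subunit 821 (sketch): unit impulses at the harmonic positions."""
    harmonic = [0.0] * num_bins
    for p in harmonic_positions(num_bins, interval, first_offset):
        harmonic[p] = 1.0
    return harmonic

def mix_harmonic_and_noise(harmonic, noise, ratio):
    """Subunit 822 (sketch): scale the harmonic part so that its energy divided
    by the noise energy equals `ratio`, then add the two components."""
    e_h = sum(h * h for h in harmonic)
    e_n = sum(n * n for n in noise)
    if e_h == 0.0 or e_n == 0.0:
        return [h + n for h, n in zip(harmonic, noise)]
    gain = math.sqrt(ratio * e_n / e_h)
    return [gain * h + n for h, n in zip(harmonic, noise)]

# Usage with made-up parameters: 40 bins, interval 8, first harmonic at bin 3.
rng = random.Random(0)
noise = [rng.uniform(-1.0, 1.0) for _ in range(40)]
harmonic = build_harmonic(40, interval=8, first_offset=3)
spectrum = mix_harmonic_and_noise(harmonic, noise, ratio=4.0)
```

The reconstructed `spectrum` would then be passed to the shaping subunit 823, which applies the frequency domain and temporal envelopes in either order.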
The embodiment of the invention also provides a corresponding audio coding and decoding system, whose structure is shown in Fig. 9. Its specific implementation structure may comprise:
Encoding device 91, configured to extract the temporal envelope parameter, frequency domain envelope parameter, pitch parameter and harmonic interval parameter used to characterize the audio signal, and, after encoding these parameters, to send them to the decoding device. It may specifically comprise:
Parameter extraction unit 911, configured to extract the temporal envelope parameter, frequency domain envelope parameter, pitch parameter and harmonic interval parameter of the audio signal; when the harmonic interval and the first harmonic offset of the audio signal differ in value, it is also configured to extract the first harmonic offset parameter of the audio signal;
Transmitting unit 912, configured to encode the temporal envelope parameter, frequency domain envelope parameter, pitch parameter and harmonic interval parameter, or the temporal envelope parameter, frequency domain envelope parameter, pitch parameter, harmonic interval parameter and first harmonic offset parameter, and to transmit them to the decoding device;
Decoding device 92, configured to decode the data sent by the encoding device to obtain the temporal envelope parameter, frequency domain envelope parameter, pitch parameter and harmonic interval parameter, and to synthesize the audio signal according to these parameters. It may specifically comprise:
Decoding unit 921, configured to decode the received data to obtain the temporal envelope parameter, frequency domain envelope parameter, pitch parameter and harmonic interval parameter, or the temporal envelope parameter, frequency domain envelope parameter, pitch parameter, harmonic interval parameter and first harmonic offset parameter;
Synthesis unit 922, configured to synthesize the audio signal according to the temporal envelope parameter, frequency domain envelope parameter, pitch parameter and harmonic interval parameter, or according to the temporal envelope parameter, frequency domain envelope parameter, pitch parameter, harmonic interval parameter and first harmonic offset parameter.
The embodiment of the invention also provides a corresponding coding processing device, whose structure is shown in Fig. 10. Its specific implementation structure may comprise:
Judging unit 101, configured to judge whether the spectrum signal of the audio signal of the current band is similar to the spectrum signal of the audio signal of the previous band. Specifically, the value of the coding mode parameter can be used to indicate whether they are similar;
Coding unit 102, configured to act on the judgment result obtained by the judging unit 101. When the spectrum signal of the audio signal of the current band is similar to that of the previous band, it extracts the temporal envelope parameter and frequency domain envelope parameter used to characterize the audio signal, and may also extract the pitch parameter. When the two spectrum signals are dissimilar, it extracts the temporal envelope parameter, frequency domain envelope parameter, pitch parameter and harmonic interval parameter used to characterize the audio signal; when, in addition, the harmonic interval and the first harmonic offset of the audio signal differ in value, it also extracts the first harmonic offset parameter of the audio signal;
Transmission unit 103, configured to send the information, obtained by the judging unit 101, that the spectrum signal of the audio signal of the current band is similar to that of the previous band (for example by encoding and sending the coding mode parameter), and to encode and send the temporal envelope parameter and frequency domain envelope parameter (and optionally the pitch parameter) of the audio signal extracted by the coding unit; or to send the information, obtained by the judging unit, that the two spectrum signals are dissimilar, and to encode and send the temporal envelope parameter, frequency domain envelope parameter, pitch parameter and harmonic interval parameter (and optionally the first harmonic offset parameter) of the audio signal extracted by the coding unit.
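The choice of which parameters the transmission unit 103 sends can be summarized by the small sketch below. It is a hypothetical illustration: the parameter names are labels invented here, the pitch parameter is included in the similar case following the behaviour described for coding mode 1 in embodiment three, and the mapping of "similar" to a coding mode value of 1 is likewise taken from that embodiment.

```python
def parameters_to_transmit(similar, offset_differs_from_interval=False):
    """Return the names of the parameters sent for one band.

    similar: result of the judging unit (current band resembles previous band).
    offset_differs_from_interval: True when the first harmonic offset and the
    harmonic interval differ in value, so the offset must be sent as well.
    """
    params = ["coding_mode", "temporal_envelope", "frequency_envelope", "pitch"]
    if not similar:
        params.append("harmonic_interval")
        if offset_differs_from_interval:
            params.append("first_harmonic_offset")
    return params

# Usage: a band that resembles the previous one needs the shortest parameter list.
short_list = parameters_to_transmit(similar=True)
full_list = parameters_to_transmit(similar=False, offset_differs_from_interval=True)
```

The bit saving described in the text comes from the first branch: for similar bands the harmonic interval and first harmonic offset are simply never encoded.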
The embodiment of the invention also provides a corresponding decoding processing device, whose structure is shown in Fig. 11. Its specific implementation structure may comprise:
Receiving information unit 111, configured to receive information indicating that the spectrum signal of the audio signal of the current band is similar to that of the previous band, and to decode the received data to obtain the temporal envelope parameter and frequency domain envelope parameter used to characterize the audio signal; or to receive information indicating that the two spectrum signals are dissimilar, and to decode the received data to obtain the temporal envelope parameter, frequency domain envelope parameter, pitch parameter and harmonic interval parameter used to characterize the audio signal. It is also configured to decode data containing the first harmonic offset parameter to obtain the first harmonic offset parameter used to characterize the audio signal. Specifically, the receiving information unit 111 can determine from the received coding mode parameter whether the spectrum signal of the audio signal of the current band is similar or dissimilar to that of the previous band;
Decoding unit 112, configured to synthesize the audio signal according to the similarity information received by the receiving information unit 111 and the temporal envelope parameter and frequency domain envelope parameter used to characterize the audio signal; or according to the dissimilarity information and the temporal envelope parameter, frequency domain envelope parameter, pitch parameter and harmonic interval parameter used to characterize the audio signal. Specifically:
When information is received indicating that the spectrum signal of the audio signal of the current band is similar to that of the previous band, the decoding unit 112, as shown in Fig. 12, may comprise:
Reconstruction subunit 121, configured to reconstruct the spectrum signal, obtaining the reconstructed spectrum signal;
Second shaping subunit 122, configured to shape the spectrum signal reconstructed by the reconstruction subunit 121 according to the pitch parameter, obtaining the shaped reconstructed spectrum signal;
First shaping subunit 123, configured to process the reconstructed spectrum signal (or the spectrum signal shaped by the second shaping subunit 122) according to the frequency domain envelope parameter and the temporal envelope parameter, to obtain the synthesized audio signal. For example: frequency domain shaping is performed on the reconstructed spectrum signal according to the frequency domain envelope parameter to obtain a frequency-shaped signal, and time domain shaping is then performed on the frequency-shaped signal according to the temporal envelope parameter to obtain the synthesized audio signal; or, time domain shaping is performed on the reconstructed spectrum signal according to the temporal envelope parameter to obtain a time-shaped signal, and frequency domain shaping is then performed on the time-shaped signal according to the frequency domain envelope parameter to obtain the synthesized audio signal;
When information is received indicating that the spectrum signal of the audio signal of the current band is dissimilar to that of the previous band, the decoding unit 112, as shown in Fig. 12, may comprise:
Harmonic reconstruction subunit 124, configured to obtain a harmonic signal according to the harmonic interval parameter, or according to the harmonic interval parameter and the first harmonic offset parameter;
Spectrum signal reconstruction subunit 125, configured to adjust the ratio between the harmonic signal and a noise signal according to the pitch parameter, and to obtain the reconstructed spectrum signal from the adjusted harmonic signal and noise signal;
Third shaping subunit 126, configured to process the reconstructed spectrum signal according to the frequency domain envelope parameter and the temporal envelope parameter, to obtain the synthesized audio signal. For example: frequency domain shaping is performed on the reconstructed spectrum signal according to the frequency domain envelope parameter to obtain a frequency-shaped signal, and time domain shaping is then performed on the frequency-shaped signal according to the temporal envelope parameter to obtain the synthesized audio signal; or, time domain shaping is performed on the reconstructed spectrum signal according to the temporal envelope parameter to obtain a time-shaped signal, and frequency domain shaping is then performed on the time-shaped signal according to the frequency domain envelope parameter to obtain the synthesized audio signal.
The above embodiments of the invention may be applied in, but are not limited to, audio encoding and decoding apparatus.
In summary, compared with the prior art, the embodiments of the invention characterize the audio signal with a group of parameters comprising the temporal envelope parameter, frequency domain envelope parameter, pitch parameter and harmonic interval parameter (and optionally the first harmonic offset parameter). When the audio signal is encoded, this reduces the number of bits needed to encode the parameters relative to existing schemes, so that the signal can be encoded with fewer bits and its coding rate further reduced; a larger coding bandwidth and a higher coding quality are thus obtained at a lower coding rate. For signals with a pronounced harmonic structure in particular, the embodiments can obtain good coding quality. In addition, in the encoding and decoding processing schemes provided by the embodiments, when the audio signal is encoded band by band, it is judged whether the spectrum signal of the audio signal of the current band is similar to that of the previous band. When they are dissimilar, a group of parameters comprising the temporal envelope parameter, frequency domain envelope parameter, pitch parameter and harmonic interval parameter (and optionally the first harmonic offset parameter) is extracted; when they are similar, only a group of parameters comprising the temporal envelope parameter and frequency domain envelope parameter (and optionally the pitch parameter) is extracted. This effectively exploits the similarity between the spectra of different bands of the signal to further reduce the coding rate and obtain a larger coding bandwidth. When decoding the audio signal band by band, the decoding end can, according to these parameters, apply different spectrum reconstruction methods to signals with different characteristics; it therefore adapts better to the signal characteristics and can obtain equally high synthesis quality for different signals. In other words, when the transmission capacity of the channel is fixed, the lower number of coded bits allows a signal with a higher bandwidth to be encoded. Since, perceptually, a larger signal bandwidth gives a better listening experience, the method provided by the invention can obtain a higher coding bandwidth and higher synthesis quality for a fixed channel capacity. Likewise, the technical scheme of sub-band encoding and decoding of audio signals provided by the embodiments obtains a larger coding bandwidth and a higher coding quality at a lower coding rate in the course of encoding and decoding the audio signal band by band.
The above are merely preferred embodiments of the invention, but the protection scope of the invention is not limited thereto. Any variation or replacement readily conceivable by a person skilled in the art within the technical scope disclosed by the invention shall fall within the protection scope of the invention. Therefore, the protection scope of the invention shall be subject to the protection scope of the claims.