CN102194457B

CN102194457B - Audio encoding and decoding method, system and noise level estimation method

Info

Publication number: CN102194457B
Application number: CN2010191850619A
Authority: CN
Inventors: 江东平; 袁浩; 彭科; 陈国明; 黎家力
Original assignee: ZTE Corp
Current assignee: ZTE Corp
Priority date: 2010-03-02
Filing date: 2010-03-02
Publication date: 2013-02-27
Anticipated expiration: 2030-03-02
Also published as: CN102194457A

Abstract

The invention relates to an audio encoding and decoding method, a system and a noise level estimation method, and the noise level estimation method comprises the following steps: estimating a power spectrum of audio signals to be encoded according to a frequency domain coefficient of the audio signals to be encoded; and estimating noise level of the audio signals of a zero-bit encoding sub-band according to the calculated power spectrum, wherein the noise level is used for controlling the ratio of energy for noise filling to the energy for frequency band replication during decoding, and the zero-bit encoding sub-band refers to the encoding sub-band of which the distributed number of bits is zero. By adopting the method in the invention, the frequency domain coefficient which is not encoded can be well re-constructed.

Description

Audio encoding and decoding method, system and noise level estimation method

Technical field

The present invention relates to audio encoding and decoding technology, and in particular to an audio encoding and decoding method, a system, and a noise level estimation method that reconstruct the spectrum of coding subbands that were not encoded.

Background technology

Audio coding technology is at the core of multimedia applications such as digital audio broadcasting, music distribution over the Internet, and voice communication, and these applications benefit greatly from improvements in the compression performance of audio encoders. Perceptual audio encoders, a class of lossy transform-domain coders, are the mainstream of modern audio encoders. Because the coding bit rate is limited, some frequency-domain coefficients or frequency components usually cannot be encoded. To better recover the spectral components of the uncoded subbands, existing audio codecs generally reconstruct them by noise filling or by spectral band replication. G.722.1C uses noise filling, HE-AAC-V1 uses spectral band replication, and G.719 combines noise filling with a simple form of spectral band replication. Noise filling alone cannot recover well the spectral envelope of an uncoded subband or the tonal and noise components inside it. The spectral band replication method of HE-AAC-V1 must perform spectral analysis on the audio signal before encoding, estimate the tonality and noise of the high-frequency components, extract parameters, and encode the down-sampled audio signal with an AAC encoder. Its computational complexity is high, it must transmit more parameter information to the decoder, which consumes more coded bits, and it also increases coding delay. The replication scheme of G.719, in turn, is too simple to recover well the spectral envelope of an uncoded subband or the tonal and noise components inside it.

Summary of the invention

The technical problem to be solved by the present invention is to provide an audio encoding and decoding method, a system, and a noise level estimation method that reconstruct the uncoded frequency-domain coefficients well.

To solve the above technical problem, the invention provides a noise level estimation method, the method comprising:

Estimating the power spectrum of the audio signal to be encoded from the frequency-domain coefficients of the audio signal to be encoded;

Estimating, from the calculated power spectrum, the noise level of the audio signal in the zero-bit coding subbands, this noise level being used during decoding to control the ratio between the energy of noise filling and the energy of spectral band replication; wherein a zero-bit coding subband is a coding subband to which zero bits have been allocated.

Further, the noise level of the audio signal in a zero-bit coding subband is the ratio of the noise component power estimated in the zero-bit coding subband to the tonal component power estimated in the zero-bit coding subband.

Further,

The power spectrum of the audio signal to be encoded is estimated from its MDCT frequency-domain coefficients, and the power at frequency bin k of frame i is calculated as follows:

$$P_i(k) = \lambda P_{i-1}(k) + (1 - \lambda) X_i(k)^2$$

where $P_{i-1}(k) = 0$ when i equals 0; $P_i(k)$ is the estimated power at the k-th frequency bin of frame i; $X_i(k)$ is the MDCT coefficient at the k-th frequency bin of frame i; and λ is the filter coefficient of a one-pole smoothing filter.
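The recursion above can be illustrated with a minimal Python sketch. The function name `update_power_spectrum`, the use of plain lists, and the example value of λ are assumptions made for illustration; the source does not prescribe an implementation or a value for λ.

```python
def update_power_spectrum(prev_power, mdct_coeffs, lam):
    """Apply P_i(k) = lam * P_{i-1}(k) + (1 - lam) * X_i(k)^2 to every bin k.

    prev_power is the list P_{i-1}(k), or None for frame 0, in which case
    P_{i-1}(k) is taken as 0, as stated in the text.
    """
    if prev_power is None:
        prev_power = [0.0] * len(mdct_coeffs)
    if len(prev_power) != len(mdct_coeffs):
        raise ValueError("power spectrum and MDCT frame must have equal length")
    return [lam * p + (1.0 - lam) * x * x
            for p, x in zip(prev_power, mdct_coeffs)]


# Hypothetical two-frame example with lam = 0.5 (an illustrative value only).
p0 = update_power_spectrum(None, [2.0, 0.0], 0.5)
p1 = update_power_spectrum(p0, [0.0, 2.0], 0.5)
```

A larger λ gives more weight to the history and therefore a smoother, slower-reacting power estimate.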

Further,

The frequency-domain coefficients of the audio signal to be encoded are divided into one or several noise filling subbands, and the noise level of a given effective noise filling subband j is calculated from the estimated power spectrum of the audio signal to be encoded as follows:

Calculate the mean of the powers of all frequency-domain coefficients in all or part of the zero-bit coding subbands within this effective noise filling subband, obtaining the average power P_aveg(j);

Calculate the mean of the powers of all frequency-domain coefficients in all or part of the zero-bit coding subbands within this effective noise filling subband whose power $P_i(k)$ is greater than the average power P_aveg(j), obtaining the average tonal component power P_signal_aveg(j) of the zero-bit coding subbands in this effective noise filling subband;

Calculate the mean of the powers $P_i(k)$ of all frequency-domain coefficients in all or part of the zero-bit coding subbands within this effective noise filling subband whose power $P_i(k)$ is less than or equal to the average power P_aveg(j), obtaining the average noise component power P_noise_aveg(j) of the zero-bit coding subbands in this effective noise filling subband;

Calculate the ratio P_noise_rate(j) of the average noise component power P_noise_aveg(j) to the average tonal component power P_signal_aveg(j), obtaining the noise level of this effective noise filling subband.

Here, an effective noise filling subband is a noise filling subband that contains a zero-bit coding subband.
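The four calculation steps above can be sketched in Python as follows. This is only an illustration: the function name `noise_level` is hypothetical, the input is assumed to be the list of smoothed powers $P_i(k)$ of the zero-bit coefficients in one effective noise filling subband, and the handling of a perfectly flat spectrum (no bin above the mean) is an assumption, since the source does not address that case.

```python
def noise_level(powers):
    """Return P_noise_rate(j) for the powers P_i(k) of the zero-bit bins."""
    if not powers:
        raise ValueError("no zero-bit coefficients in this noise filling subband")
    p_aveg = sum(powers) / len(powers)
    # Bins above the mean are treated as tonal, the rest as noise.
    tonal = [p for p in powers if p > p_aveg]
    noise = [p for p in powers if p <= p_aveg]
    if not tonal:
        # Assumption (not in the source): a perfectly flat power spectrum has
        # no bin above the mean, so it is treated as pure noise.
        return 1.0
    p_signal_aveg = sum(tonal) / len(tonal)
    p_noise_aveg = sum(noise) / len(noise)
    return p_noise_aveg / p_signal_aveg


# One strong bin among weak ones gives a low noise level (1/9 here).
example = noise_level([1.0, 1.0, 1.0, 9.0])
```

Because the noise bins lie at or below the mean and the tonal bins lie above it, the ratio computed this way never exceeds 1.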

To solve the above technical problem, the invention also provides an audio encoding method, the method comprising:

A. dividing the MDCT frequency-domain coefficients of the audio signal to be encoded into several coding subbands, and quantizing and encoding the amplitude envelope value of each coding subband to obtain amplitude envelope coded bits;

B. allocating bits to each coding subband, and quantizing and encoding the non-zero-bit coding subbands to obtain MDCT frequency-domain coefficient coded bits;

C. estimating the power spectrum of the audio signal to be encoded from its MDCT frequency-domain coefficients, then estimating the noise level of the audio signal in the zero-bit coding subbands, and quantizing and encoding it to obtain noise level coded bits; wherein this noise level is used during decoding to control the ratio between the energy of noise filling and the energy of spectral band replication, and a zero-bit coding subband is a coding subband to which zero bits have been allocated;

D. multiplexing and packing the amplitude envelope coded bits of each coding subband, the frequency-domain coefficient coded bits, and the noise level coded bits, and sending them to the decoder.

Further, in step C, the noise level of the audio signal in a zero-bit coding subband is the ratio of the noise component power estimated in the zero-bit coding subband to the tonal component power estimated in the zero-bit coding subband.

Further,

The power spectrum of the audio signal to be encoded is estimated from its MDCT frequency-domain coefficients, and the power at frequency bin k of frame i is estimated as follows:

$$P_i(k) = \lambda P_{i-1}(k) + (1 - \lambda) X_i(k)^2$$

where $P_{i-1}(k) = 0$ when i equals 0; $P_i(k)$ is the estimated power at the k-th frequency bin of frame i; $X_i(k)$ is the MDCT coefficient at the k-th frequency bin of frame i; and λ is the filter coefficient of a one-pole smoothing filter.

Further, in step B, the frequency-domain coefficients of the audio signal to be encoded are divided into one or several noise filling subbands, and after bits have been allocated to each coding subband, bits are allocated to the effective noise filling subbands; in step C, the noise level of a given effective noise filling subband is calculated from the estimated power spectrum of the audio signal to be encoded as follows:

Calculate the mean of the powers of all frequency-domain coefficients in all or part of the zero-bit coding subbands within this effective noise filling subband, obtaining the average power P_aveg(j);

Further, the noise filling subbands are divided either uniformly or non-uniformly according to the characteristics of human hearing, and one noise filling subband comprises one or more coding subbands.

Further, in step B, bits are allocated either to all effective noise filling subbands or, skipping one or several low-frequency effective noise filling subbands, to the subsequent higher-frequency effective noise filling subbands; in step C, the noise level is calculated for the effective noise filling subbands that have been allocated bits; in step D, the allocated bits are used to multiplex and pack the noise level coded bits.

Further, each effective noise filling subband is allocated the same number of bits, or different numbers of bits are allocated according to auditory characteristics.
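As a rough illustration of how effective noise filling subbands might be identified and given bits, consider the Python sketch below. The function names, the data layout (a list of bit counts per coding subband and a list of coding subband indices per noise filling subband, ordered from low to high frequency), and the default of 2 bits per subband are all assumptions for illustration; the source specifies none of them.

```python
def effective_noise_filling_subbands(coding_subband_bits, noise_filling_subbands):
    """Return indices j of noise filling subbands containing a zero-bit coding subband.

    coding_subband_bits[r] is the number of bits allocated to coding subband r;
    noise_filling_subbands[j] lists the coding subband indices in subband j.
    """
    return [j for j, members in enumerate(noise_filling_subbands)
            if any(coding_subband_bits[r] == 0 for r in members)]


def allocate_noise_level_bits(effective, skip_low=0, bits_each=2):
    """Give bits_each bits to every effective subband after skipping skip_low of them."""
    return {j: bits_each for j in effective[skip_low:]}


# Hypothetical frame: six coding subbands grouped into three noise filling subbands.
bits = [5, 3, 0, 2, 0, 0]
groups = [[0, 1], [2, 3], [4, 5]]
effective = effective_noise_filling_subbands(bits, groups)
allocation = allocate_noise_level_bits(effective, skip_low=1)
```

Here the lowest effective subband is skipped, so it would carry no noise level in the bitstream.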

To solve the above technical problem, the invention also provides an audio decoding method, the method comprising:

A2. decoding and inverse-quantizing each amplitude envelope coded bit in the bitstream to be decoded, obtaining the amplitude envelope of each coding subband;

B2. allocating bits to each coding subband, decoding and inverse-quantizing the noise level coded bits to obtain the noise level of the zero-bit coding subbands, and decoding and inverse-quantizing the frequency-domain coefficient coded bits to obtain the frequency-domain coefficients of the non-zero-bit coding subbands;

C2. performing spectral band replication on the zero-bit coding subbands, controlling the overall energy fill level of each such coding subband according to its amplitude envelope, and controlling the ratio between the energy of noise filling and the energy of spectral band replication according to the noise level of the zero-bit coding subband, obtaining the reconstructed frequency-domain coefficients of the zero-bit coding subbands;

D2. applying the inverse modified discrete cosine transform (IMDCT) to the frequency-domain coefficients of the non-zero-bit coding subbands together with the reconstructed frequency-domain coefficients of the zero-bit coding subbands, obtaining the final audio signal.

Further, in step C2, during spectral band replication, the position of a tone of the audio signal is searched for in the MDCT frequency-domain coefficients; the bandwidth from frequency bin 0 to the bin at the tone position is taken as the spectral band replication period; and the band running from bin 0 shifted upward by copyband_offset bins to the tone position bin shifted upward by the same copyband_offset bins is taken as the source band, from which spectral band replication is performed on the zero-bit coding subband. If the highest frequency inside a zero-bit coding subband is lower than the frequency of the tone found, that zero-bit coding subband is reconstructed by noise filling only.

Further, among the step C2,

The absolute values or the squares of the frequency-domain coefficients of a first band are taken and smoothed by filtering;

Based on the smoothing result, the position of the maximum extreme value of the filter output over the first band is searched for, and that position is taken as the position of the tone.

Further, the absolute values of the frequency-domain coefficients of the first band are smoothed according to the following formula:

$$X\_amp_i(k) = \mu \, X\_amp_{i-1}(k) + (1 - \mu) \, |\bar{X}_i(k)|$$

or the squares of the frequency-domain coefficients of the first band are smoothed according to the following formula:

$$X\_amp_i(k) = \mu \, X\_amp_{i-1}(k) + (1 - \mu) \, \bar{X}_i(k)^2$$

where μ is the smoothing filter coefficient, $X\_amp_i(k)$ is the filter output at the k-th frequency bin of frame i, $\bar{X}_i(k)$ is the decoded MDCT coefficient at the k-th frequency bin of frame i, and $X\_amp_{i-1}(k) = 0$ when i = 0.
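The smoothing step can be sketched in Python as below. The function name `smooth_amplitude` and the `use_square` flag are hypothetical, the smoothing is assumed to run per bin across frames (consistent with the stated initial condition for frame 0), and the value of μ in the example is illustrative only.

```python
def smooth_amplitude(prev_amp, decoded_coeffs, mu, use_square=False):
    """Smooth |X_i(k)| (or X_i(k)^2 when use_square is True) across frames.

    prev_amp is the previous filter output X_amp_{i-1}(k), or None for
    frame 0, in which case it is taken as 0 for every bin.
    """
    if prev_amp is None:
        prev_amp = [0.0] * len(decoded_coeffs)
    out = []
    for a, x in zip(prev_amp, decoded_coeffs):
        level = x * x if use_square else abs(x)
        out.append(mu * a + (1.0 - mu) * level)
    return out


# Frame 0 of a hypothetical first band, with mu = 0.5.
amp_abs = smooth_amplitude(None, [-2.0, 1.0], 0.5)
amp_sq = smooth_amplitude(None, [-2.0, 1.0], 0.5, use_square=True)
```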

Further, the first band is a low-frequency band in which energy is relatively concentrated, determined from the statistical properties of the spectrum, where low frequency refers to spectral components below one half of the total signal bandwidth.

Further, the maximum extreme value of the filter output is determined as follows: the initial maximum is searched for directly among the filter outputs of the frequency-domain coefficients of the first band, and this maximum is taken as the maximum extreme value of the filter output of the first band.

Further, the maximum extreme value of the filter output is determined as follows:

A section of the first band is taken as a second band, the initial maximum is searched for among the filter outputs of the frequency-domain coefficients of the second band, and processing differs according to the position of the frequency-domain coefficient corresponding to this initial maximum:

a. If the initial maximum is the filter output of the lowest-frequency coefficient of the second band, that filter output is compared with the filter output of the preceding, lower-frequency coefficient in the first band, and the comparison proceeds toward lower frequencies in turn, until the filter output of the current coefficient is greater than that of the preceding coefficient, in which case the filter output of the current coefficient is the final maximum extreme value; or until the comparison shows that the filter output of the lowest-frequency coefficient of the first band is greater than that of the following coefficient, in which case the filter output of the lowest-frequency coefficient of the first band is the final maximum extreme value;

b. If the initial maximum is the filter output of the highest-frequency coefficient of the second band, that filter output is compared with the filter output of the following, higher-frequency coefficient in the first band, and the comparison proceeds toward higher frequencies in turn, until the filter output of the current coefficient is greater than that of the following coefficient, in which case the filter output of the current coefficient is the final maximum extreme value; or until the comparison shows that the filter output of the highest-frequency coefficient of the first band is greater than that of the preceding coefficient, in which case the filter output of the highest-frequency coefficient of the first band is the final maximum extreme value;

c. If the initial maximum is the filter output of a coefficient between the lowest and highest frequencies of the second band, the coefficient corresponding to the initial maximum marks the position of the tone, that is, the initial maximum is the final maximum extreme value.
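Cases a to c can be read as a bounded hill climb out of the second band. The Python sketch below is one possible reading: the function name `find_tone_position`, the inclusive index bounds, and the treatment of ties (equal neighbours keep the search moving, and the first maximum is taken when several bins share it) are assumptions not stated in the source.

```python
def find_tone_position(amp, first_lo, first_hi, second_lo, second_hi):
    """Return the bin index of the maximum extreme value (all bounds inclusive)."""
    if not (first_lo <= second_lo <= second_hi <= first_hi):
        raise ValueError("second band must lie inside the first band")
    # Initial maximum inside the second band (first occurrence on ties).
    k = max(range(second_lo, second_hi + 1), key=lambda n: amp[n])
    if k == second_lo:
        # Case a: move toward lower frequencies until the current bin exceeds
        # its lower neighbour or the bottom of the first band is reached.
        while k > first_lo and amp[k] <= amp[k - 1]:
            k -= 1
    elif k == second_hi:
        # Case b: move toward higher frequencies until the current bin exceeds
        # its upper neighbour or the top of the first band is reached.
        while k < first_hi and amp[k] <= amp[k + 1]:
            k += 1
    # Case c: an interior maximum is already the tone position.
    return k


# Hypothetical smoothed amplitudes for a first band covering bins 0..6.
amp = [1.0, 5.0, 3.0, 2.0, 4.0, 6.0, 2.0]
pos_b = find_tone_position(amp, 0, 6, 2, 4)   # maximum at the upper edge
pos_a = find_tone_position(amp, 0, 6, 2, 3)   # maximum at the lower edge
pos_c = find_tone_position(amp, 0, 6, 3, 6)   # interior maximum
```

Restricting the initial search to a second band narrows where the tone is first looked for, while the edge cases let the result move to a true local peak that lies just outside that band.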

Further, in step C2, when spectral band replication is performed on a zero-bit coding subband, the source band copy start index for that zero-bit coding subband is first calculated from the source band and the start index of the zero-bit coding subband to be replicated; then, with the spectral band replication period as the period, the frequency-domain coefficients of the source band are copied periodically into the zero-bit coding subband, starting from the source band copy start index.

Further, in step C2, the source band copy start index of the zero-bit coding subband is calculated as follows:

The index of the first MDCT frequency bin of the zero-bit coding subband whose frequency-domain coefficients are to be reconstructed is obtained and denoted fillband_start_freq; the index of the frequency bin corresponding to the tone is denoted Tonal_pos; Tonal_pos plus 1 gives the replication period copy_period; the spectral band replication offset is denoted copyband_offset; copy_period is subtracted repeatedly from the value of fillband_start_freq until the value falls within the index range of the source band; this value is the source band copy start index, denoted copy_pos_mod.

Further, in step C2, with the spectral band replication period as the period, the frequency-domain coefficients of the source band are copied periodically into the zero-bit coding subband, starting from the source band copy start index, as follows:

Starting at the source band copy start index, frequency-domain coefficients are copied successively toward higher frequencies into the zero-bit coding subband, beginning at position fillband_start_freq; once the source bin being copied reaches bin Tonal_pos + copyband_offset, copying continues into the zero-bit coding subband from the coefficient at bin copyband_offset; and so on, until spectral band replication of all frequency-domain coefficients of the current zero-bit coding subband is complete.
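The start index calculation and the periodic copy can be sketched in Python as follows. The function names and the `fillband_len` parameter (the number of bins in the zero-bit coding subband) are hypothetical; the source band is taken as bins copyband_offset through Tonal_pos + copyband_offset inclusive, which is how the text above is read here.

```python
def source_copy_start(fillband_start_freq, tonal_pos, copyband_offset):
    """Compute copy_pos_mod by repeatedly subtracting copy_period."""
    if fillband_start_freq < copyband_offset:
        raise ValueError("subband starts below the source band")
    copy_period = tonal_pos + 1
    pos = fillband_start_freq
    # Subtract the period until the index falls inside the source band.
    while pos > tonal_pos + copyband_offset:
        pos -= copy_period
    return pos


def replicate_band(coeffs, fillband_start_freq, fillband_len,
                   tonal_pos, copyband_offset):
    """Return the coefficients copied into one zero-bit coding subband."""
    src = source_copy_start(fillband_start_freq, tonal_pos, copyband_offset)
    out = []
    for _ in range(fillband_len):
        out.append(coeffs[src])
        src += 1
        # After the last source bin, wrap around to the start of the source band.
        if src > tonal_pos + copyband_offset:
            src = copyband_offset
    return out


# Hypothetical decoded spectrum whose value equals its bin index.
coeffs = [float(n) for n in range(20)]
start = source_copy_start(12, 3, 2)          # source band is bins 2..5
copied = replicate_band(coeffs, 12, 6, 3, 2)
```

Deriving the start index from the target position, rather than always starting at the bottom of the source band, keeps the copied pattern phase-aligned with the replication period across neighbouring zero-bit coding subbands.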

Further, in step C2, the energy of the frequency-domain coefficients obtained by replication into a zero-bit coding subband is adjusted as follows:

The amplitude envelope of the frequency-domain coefficients obtained after spectral band replication of the zero-bit coding subband is calculated and denoted sbr_rms(r);

The energy of the replicated frequency-domain coefficients is adjusted according to the formula:

$$\overline{X\_sbr}(r) = X\_sbr(r) \cdot sbr\_lev\_scale(r) \cdot rms(r) / sbr\_rms(r)$$

where $\overline{X\_sbr}(r)$ denotes the frequency-domain coefficients of zero-bit coding subband r after energy adjustment; X_sbr(r) denotes the frequency-domain coefficients of zero-bit coding subband r obtained by replication; sbr_rms(r) is the amplitude envelope of the replicated frequency-domain coefficients X_sbr(r) of zero-bit coding subband r; rms(r) is the amplitude envelope of the frequency-domain coefficients of zero-bit coding subband r before encoding, obtained by inverse quantization of the amplitude envelope quantization index; and sbr_lev_scale(r) is the energy control scale factor for spectral band replication in zero-bit coding subband r. Its value is determined by the noise level of the noise filling subband containing zero-bit coding subband r and is calculated as follows:

$$sbr\_lev\_scale(r) = \sqrt{(1 - \overline{P\_noise\_rate}(j)) \cdot fill\_energy\_saclefactor}$$

where fill_energy_saclefactor is the fill energy scale factor, used to adjust the overall gain of the fill energy, with a value in the range (0, 1); $\overline{P\_noise\_rate}(j)$ is the noise level of noise filling subband j obtained by decoding and inverse quantization; and j is the index of the noise filling subband containing zero-bit coding subband r.

Further, in step C2, noise filling is applied to the energy-adjusted frequency-domain coefficients according to the following formula:

$$\overline{X}(r) = \overline{X\_sbr}(r) + rms(r) \cdot noise\_lev\_scale(r) \cdot random()$$

where $\overline{X}(r)$ denotes the reconstructed frequency-domain coefficients of zero-bit coding subband r; $\overline{X\_sbr}(r)$ denotes the frequency-domain coefficients of zero-bit coding subband r after energy adjustment; rms(r) is the amplitude envelope of the frequency-domain coefficients of zero-bit coding subband r before encoding, obtained by inverse quantization of the amplitude envelope quantization index; random() is a random phase generator that produces a random phase value, returning +1 or -1; and noise_lev_scale(r) is the noise level control scale factor of zero-bit coding subband r. Its value is determined by the noise level of the noise filling subband containing zero-bit coding subband r and is calculated as follows:

$$noise\_lev\_scale(r) = \sqrt{\overline{P\_noise\_rate}(j) \cdot fill\_energy\_saclefactor}$$

where fill_energy_saclefactor is the fill energy scale factor, used to adjust the overall gain of the fill energy, with a value in the range (0, 1); $\overline{P\_noise\_rate}(j)$ is the noise level of noise filling subband j obtained by decoding and inverse quantization; and j is the index of the noise filling subband containing zero-bit coding subband r.
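The energy adjustment and noise filling formulas can be combined in a short Python sketch. Several points are assumptions rather than statements from the source: the function names are hypothetical, the amplitude envelope sbr_rms(r) is computed here as a root-mean-square value, `random.Random.choice` stands in for the random phase generator, and an all-zero replicated band falls back to noise only.

```python
import math
import random


def amplitude_envelope(coeffs):
    """Root-mean-square amplitude; an assumed definition of the envelope."""
    return math.sqrt(sum(c * c for c in coeffs) / len(coeffs))


def reconstruct_zero_bit_subband(x_sbr, rms, p_noise_rate,
                                 fill_energy_saclefactor, rng=None):
    """Scale the replicated coefficients and add sign-randomised noise."""
    rng = rng or random.Random()
    sbr_rms = amplitude_envelope(x_sbr)
    sbr_lev_scale = math.sqrt((1.0 - p_noise_rate) * fill_energy_saclefactor)
    noise_lev_scale = math.sqrt(p_noise_rate * fill_energy_saclefactor)
    # Assumption: with an all-zero replicated band, only noise is filled in.
    gain = sbr_lev_scale * rms / sbr_rms if sbr_rms > 0.0 else 0.0
    return [x * gain + rms * noise_lev_scale * rng.choice((1.0, -1.0))
            for x in x_sbr]


# Noise level 0: the output is the replicated band rescaled to the envelope.
tonal_only = reconstruct_zero_bit_subband([3.0, -3.0], 2.0, 0.0, 0.25)
# Noise level 1: the output is pure noise of amplitude rms * sqrt(0.25).
noise_only = reconstruct_zero_bit_subband([3.0, -3.0], 2.0, 1.0, 0.25,
                                          rng=random.Random(0))
```

Because the two scale factors are square roots of complementary fractions of fill_energy_saclefactor, the replicated part and the noise part together carry an expected energy of about rms(r)² · fill_energy_saclefactor per coefficient, assuming the two parts are uncorrelated.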

Further, in step B2, after bits have been allocated to each coding subband, the coding subbands are divided into several noise filling subbands, and bits are allocated to the effective noise filling subbands; in step C2, for the zero-bit coding subbands in effective noise filling subbands that have been allocated bits, spectral band replication is performed and the energy level of the replicated frequency-domain coefficients and the energy level of the noise filling are controlled, while the zero-bit coding subbands in effective noise filling subbands that have not been allocated bits are reconstructed by noise filling; wherein an effective noise filling subband is a noise filling subband that contains a zero-bit coding subband.

To solve the above technical problem, the invention also provides an audio encoding system comprising a modified discrete cosine transform (MDCT) unit, an amplitude envelope calculation unit, an amplitude envelope quantization and encoding unit, a bit allocation unit, a frequency-domain coefficient encoding unit, and a bitstream multiplexer (MUX); the system further comprises a noise level estimation unit, wherein:

The MDCT unit is configured to apply the modified discrete cosine transform to the audio signal to generate frequency-domain coefficients;

The amplitude envelope calculation unit is connected to the MDCT unit and is configured to divide the frequency-domain coefficients generated by the MDCT unit into several coding subbands and to calculate the amplitude envelope value of each coding subband;

The amplitude envelope quantization and encoding unit is connected to the amplitude envelope calculation unit and is configured to quantize and encode the amplitude envelope value of each coding subband, generating the coded bits of the amplitude envelope of each coding subband;

The bit allocation unit is connected to the amplitude envelope quantization and encoding unit and is configured to allocate bits to each coding subband;

The frequency-domain coefficient quantization and encoding unit is connected to the MDCT unit, the bit allocation unit, and the amplitude envelope quantization and encoding unit, and is configured to normalize, quantize, and encode all frequency-domain coefficients of each coding subband, generating the frequency-domain coefficient coded bits;

The noise level estimation unit is connected to the MDCT unit and the bit allocation unit and is configured to estimate the power spectrum of the audio signal to be encoded from its MDCT frequency-domain coefficients, then to estimate the noise level of the audio signal in the zero-bit coding subbands, and to quantize and encode it to obtain noise level coded bits; wherein this noise level is used during decoding to control the ratio between the energy of noise filling and the energy of spectral band replication;

The bitstream multiplexer (MUX) is connected to the amplitude envelope quantization and encoding unit, the frequency-domain coefficient encoding unit, and the noise level estimation unit, and is configured to multiplex the coded bits of each coding subband and the coded bits of the frequency-domain coefficients and send them to the decoder.

Further, the noise level estimation unit specifically comprises:

A power spectrum estimation module, configured to estimate the power spectrum of the audio signal to be encoded from its MDCT frequency-domain coefficients;

A noise level calculation module, connected to the power spectrum estimation module and configured to estimate the noise level of the audio signal in the zero-bit coding subbands from the power spectrum estimated by the power spectrum estimation module;

A noise level encoding module, connected to the noise level calculation module and configured to quantize and encode the noise level calculated by the noise level calculation module, obtaining the noise level coded bits.

Further, the power spectrum estimation module estimates the power at frequency bin k of frame i according to the following formula:

$$P_i(k) = \lambda P_{i-1}(k) + (1 - \lambda) X_i(k)^2$$

where $P_{i-1}(k) = 0$ when i equals 0; $P_i(k)$ is the estimated power at the k-th frequency bin of frame i; $X_i(k)$ is the MDCT coefficient at the k-th frequency bin of frame i; and λ is the filter coefficient of a one-pole smoothing filter.

Further,

The frequency-domain coefficients of the audio signal to be encoded are divided into one or several noise filling subbands, and the noise level calculation module is specifically configured to: calculate the mean of the powers of all frequency-domain coefficients in all or part of the zero-bit coding subbands within an effective noise filling subband, obtaining the average power P_aveg(j); calculate the mean of the powers of all frequency-domain coefficients in all or part of the zero-bit coding subbands within this effective noise filling subband whose power $P_i(k)$ is greater than the average power P_aveg(j), obtaining the average tonal component power P_signal_aveg(j) of the zero-bit coding subbands in this effective noise filling subband; calculate the mean of the powers $P_i(k)$ of all frequency-domain coefficients in all or part of the zero-bit coding subbands within this effective noise filling subband whose power $P_i(k)$ is less than or equal to the average power P_aveg(j), obtaining the average noise component power P_noise_aveg(j) of the zero-bit coding subbands in this effective noise filling subband; and calculate the ratio of the average noise component power P_noise_aveg(j) to the average tonal component power P_signal_aveg(j), obtaining the noise level of this effective noise filling subband.

Further, the noise level estimation unit also comprises a bit allocation module connected to the noise level calculation module and the noise level encoding module, configured to allocate bits to all effective noise filling subbands, or to skip one or several low-frequency effective noise filling subbands and allocate bits to the subsequent higher-frequency effective noise filling subbands, and to notify the noise level calculation module and the noise level encoding module; the noise level calculation module calculates the noise level only for noise filling subbands that have been allocated bits; and the noise level encoding module uses the bits allocated by the bit allocation module to quantize and encode the noise level.

To solve the above technical problem, the invention also provides an audio decoding system comprising a bitstream demultiplexer (DeMUX), a coding subband amplitude envelope decoding unit, a bit allocation unit, a frequency-domain coefficient decoding unit, a spectrum reconstruction unit, and an inverse modified discrete cosine transform (IMDCT) unit, wherein:

The DeMUX is configured to separate the amplitude envelope coded bits, the frequency-domain coefficient coded bits, and the noise level coded bits from the bitstream to be decoded;

The amplitude envelope decoding unit is connected to the DeMUX and is configured to decode the amplitude envelope coded bits output by the bitstream demultiplexer, obtaining the amplitude envelope quantization index of each coding subband;

The bit allocation unit is connected to the amplitude envelope decoding unit and is configured to perform bit allocation, obtaining the number of coded bits allocated to each frequency-domain coefficient in each coding subband;

The frequency-domain coefficient decoding unit is connected to the amplitude envelope decoding unit and the bit allocation unit and is configured to decode, inverse-quantize, and denormalize the coding subbands to obtain the frequency-domain coefficients;

The noise level decoding unit is connected to the bitstream demultiplexer and the bit allocation unit and is configured to decode and inverse-quantize the noise level coded bits to obtain the noise level;

The spectrum reconstruction unit is connected to the noise level decoding unit, the frequency-domain coefficient decoding unit, the amplitude envelope decoding unit, and the bit allocation unit, and is configured to perform spectral band replication on the zero-bit coding subbands, to control the overall energy fill level of each such coding subband according to the amplitude envelope output by the amplitude envelope decoding unit, and to control the ratio between the energy of noise filling and the energy of spectral band replication according to the noise level output by the noise level decoding unit, obtaining the reconstructed frequency-domain coefficients of the zero-bit coding subbands;

The IMDCT unit is connected to the spectrum reconstruction unit and is configured to apply the IMDCT to the frequency-domain coefficients after spectrum reconstruction of the zero-bit coding subbands is complete, obtaining the audio signal.

Further, the spectrum reconstruction unit comprises a spectral band replication subunit, an energy adjustment subunit and a noise filling subunit connected in sequence, wherein:

The spectral band replication subunit is used to perform spectral band replication on the zero-bit coding subbands;

The energy adjustment subunit is used to calculate the amplitude envelope of the frequency coefficients obtained after spectral band replication of a zero-bit coding subband, denoted sbr_rms(r), and to adjust the energy of the replicated frequency coefficients according to the noise level output by the noise level decoding unit. The energy adjustment formula is:

$$\overline{X_{sbr}}(r) = X_{sbr}(r) \cdot \mathrm{sbr\_lev\_scale}(r) \cdot \mathrm{rms}(r) / \mathrm{sbr\_rms}(r)$$

where $\overline{X_{sbr}}(r)$ denotes the frequency coefficients of zero-bit coding subband r after energy adjustment; $X_{sbr}(r)$ denotes the frequency coefficients obtained by replication for zero-bit coding subband r; sbr_rms(r) is the amplitude envelope of $X_{sbr}(r)$; rms(r) is the amplitude envelope of the frequency coefficients of zero-bit coding subband r before encoding, obtained by inverse-quantizing the amplitude envelope quantization index; and sbr_lev_scale(r) is the energy control scale factor for spectral band replication in zero-bit coding subband r. Its value is determined by the noise level of the noise filling subband that contains zero-bit coding subband r, and is calculated as follows:

$$\mathrm{sbr\_lev\_scale}(r) = \sqrt{\left(1 - \overline{\mathrm{P\_noise\_rate}}(j)\right) \cdot \mathrm{fill\_energy\_saclefactor}}$$

where fill_energy_saclefactor is the fill energy scale factor, used to adjust the overall gain of the fill energy, with values in the range (0, 1), and $\overline{\mathrm{P\_noise\_rate}}(j)$ is the noise level of noise filling subband j obtained by decoding and inverse quantization, j being the index of the noise filling subband that contains zero-bit coding subband r;

The noise filling subunit is used to perform noise filling on the energy-adjusted frequency coefficients according to the noise level output by the noise level decoding unit. The noise filling formula is:

$$\overline{X}(r) = \overline{X_{sbr}}(r) + \mathrm{rms}(r) \cdot \mathrm{noise\_lev\_scale}(r) \cdot \mathrm{random}()$$

where $\overline{X}(r)$ denotes the frequency coefficients of zero-bit coding subband r after noise filling; $\overline{X_{sbr}}(r)$ denotes the replicated frequency coefficients of zero-bit coding subband r after energy adjustment; rms(r) is the amplitude envelope of the frequency coefficients of zero-bit coding subband r before encoding, obtained by inverse-quantizing the amplitude envelope quantization index; random() is a random phase generator whose return value is +1 or -1; and noise_lev_scale(r) is the noise level control scale factor of zero-bit coding subband r. Its value is determined by the noise level of the noise filling subband that contains zero-bit coding subband r, and is calculated as follows:

$$\mathrm{noise\_lev\_scale}(r) = \sqrt{\overline{\mathrm{P\_noise\_rate}}(j) \cdot \mathrm{fill\_energy\_saclefactor}}$$

where fill_energy_saclefactor is the fill energy scale factor, used to adjust the overall gain of the fill energy, with values in the range (0, 1), and $\overline{\mathrm{P\_noise\_rate}}(j)$ is the noise level of noise filling subband j obtained by decoding and inverse quantization, j being the index of the noise filling subband that contains zero-bit coding subband r.
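The energy adjustment and noise filling formulas above can be sketched in Python as follows. This is an illustrative sketch only, not the patented implementation: the function names, the default value 0.5 for fill_energy_saclefactor, the guard for an all-zero replicated band, and the use of Python's random module for the +1/-1 generator are assumptions introduced here.

```python
import math
import random


def rms_of(coeffs):
    """Root-mean-square amplitude envelope of a list of coefficients."""
    return math.sqrt(sum(c * c for c in coeffs) / len(coeffs))


def reconstruct_zero_bit_subband(x_sbr, rms, p_noise_rate,
                                 fill_energy_saclefactor=0.5, rng=None):
    """Energy-adjust the replicated coefficients, then add sign-random noise.

    x_sbr        : coefficients copied into zero-bit subband r
    rms          : decoded amplitude envelope rms(r)
    p_noise_rate : decoded noise level of the enclosing noise filling subband
    The default scale factor 0.5 is a hypothetical value inside (0, 1).
    """
    rng = rng or random.Random()
    sbr_rms = rms_of(x_sbr)
    sbr_lev_scale = math.sqrt((1.0 - p_noise_rate) * fill_energy_saclefactor)
    noise_lev_scale = math.sqrt(p_noise_rate * fill_energy_saclefactor)
    # Assumption: if the copied band is all zeros, skip the replication term.
    gain = sbr_lev_scale * rms / sbr_rms if sbr_rms > 0 else 0.0
    out = []
    for c in x_sbr:
        sign = 1.0 if rng.random() < 0.5 else -1.0  # random() returns +1 or -1
        out.append(c * gain + rms * noise_lev_scale * sign)
    return out
```

With a noise level of 0 the output is the replicated band rescaled to the decoded envelope times the square root of the scale factor; with a noise level of 1 the output is pure sign-random noise of constant magnitude.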

Further, the spectral band replication subunit comprises a tone position search module, a period and source band calculation module, a source band copy start index calculation module and a spectral band replication module connected in sequence, wherein:

The tone position search module is used to search the MDCT frequency coefficients for the position of a tone of the audio signal;

The period and source band calculation module is used to determine, from the tone position, the spectral band replication period and the source band used for replication. The replication period is the bandwidth from frequency bin 0 to the frequency bin of the tone position. The source band is the band running from frequency bin 0 shifted toward higher frequency by the replication offset copyband_offset to the frequency bin of the tone position shifted by the same copyband_offset;

The source band copy start index calculation module is used to calculate the source band copy start index of a zero-bit coding subband from the source band and from the start index of the zero-bit coding subband that requires spectral band replication;

The spectral band replication module is used to copy the frequency coefficients of the source band periodically into the zero-bit coding subband, starting from the source band copy start index and using the spectral band replication period as the period. If the highest frequency inside a zero-bit coding subband is lower than the frequency of the tone found by the search, the frequency bins of that subband are reconstructed using noise filling only.

Further, the tone position search module searches for the tone position as follows: it takes the absolute value or the square of the MDCT frequency coefficients of a first band and applies smoothing filtering; then, from the smoothing result, it searches for the position of the largest extremum of the filter output in the first band. That position is the tone position.

Further, the smoothing filter that the tone position search module applies to the absolute values of the MDCT frequency coefficients of the first band is:

$$X\_amp_i(k) = \mu\, X\_amp_{i-1}(k) + (1 - \mu)\, \left|\overline{X}_i(k)\right|$$

Or, the smoothing filter applied to the squared frequency coefficients of the first band is:

$$X\_amp_i(k) = \mu\, X\_amp_{i-1}(k) + (1 - \mu)\, \overline{X}_i(k)^2$$

Further, the tone position search module searches directly for the initial maximum among the filter outputs of the frequency coefficients of the first band, and takes that maximum as the largest extremum of the first-band filter output.
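The smoothing and direct maximum search described above might look like the following Python sketch. It is a simplified illustration under assumptions: the function name, the band_start parameter, and the value mu = 0.9 are hypothetical (the text gives no value for the smoothing coefficient here), and only the direct-maximum variant is shown.

```python
def smooth_and_find_tone(x_amp_prev, x_cur, mu=0.9, use_square=False,
                         band_start=0):
    """Smooth |X| (or X^2) across frames and return the peak position.

    x_amp_prev : smoothed values of the previous frame for the first band
    x_cur      : MDCT coefficients of the current frame for the first band
    mu         : smoothing coefficient (0.9 is an assumed value)
    band_start : index of the first bin of the first band (assumed parameter)
    """
    x_amp = []
    for prev, x in zip(x_amp_prev, x_cur):
        mag = x * x if use_square else abs(x)
        x_amp.append(mu * prev + (1.0 - mu) * mag)
    # Direct search: the bin with the largest smoothed value is the tone.
    peak = max(range(len(x_amp)), key=lambda k: x_amp[k])
    return x_amp, band_start + peak
```

The returned smoothed values would be fed back as x_amp_prev for the next frame.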

Further, when the tone position search module determines the largest extremum of the filter output, it takes one section of the first band as a second band, first searches for the initial maximum among the filter outputs of the frequency coefficients of the second band, and then applies different processing depending on the position of the frequency coefficient corresponding to that initial maximum.

Further, the source band copy start index calculation module calculates the source band copy start index of a zero-bit coding subband that requires spectral band replication as follows: obtain the index of the starting frequency bin of the current zero-bit coding subband whose frequency coefficients are to be reconstructed, denoted fillband_start_freq; denote the index of the frequency bin of the tone as Tonal_pos, and add 1 to Tonal_pos to obtain the copy period copy_period; denote the source band start index as copyband_offset; repeatedly subtract copy_period from fillband_start_freq until the value falls within the index range of the source band. That value is the source band copy start index, denoted copy_pos_mod.

Further, when the spectral band replication module performs replication, it copies the frequency coefficients beginning at the source band copy start index, in order of increasing index, into the zero-bit coding subband beginning at position fillband_start_freq. When the source position passes frequency bin Tonal_pos + copyband_offset, copying continues from frequency bin copyband_offset, and so on, until all frequency coefficients of the current zero-bit coding subband have been filled.
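The start index calculation and the periodic copy can be sketched as follows. This is a minimal sketch, assuming that the source band covers bins copyband_offset through Tonal_pos + copyband_offset inclusive and that the subband to be filled lies entirely above the source band; the function name and the fillband_end_freq parameter are introduced here for illustration.

```python
def replicate_band(coeffs, fillband_start_freq, fillband_end_freq,
                   tonal_pos, copyband_offset):
    """Fill bins fillband_start_freq..fillband_end_freq by periodic copying.

    Assumes the source band is [copyband_offset, tonal_pos + copyband_offset]
    and that fillband_start_freq lies above it.
    """
    copy_period = tonal_pos + 1
    src_lo = copyband_offset
    src_hi = tonal_pos + copyband_offset
    assert fillband_start_freq > src_hi, "subband must lie above source band"

    # Subtract copy_period until the index falls inside the source band.
    copy_pos_mod = fillband_start_freq
    while copy_pos_mod > src_hi:
        copy_pos_mod -= copy_period

    out = list(coeffs)
    pos = copy_pos_mod
    for k in range(fillband_start_freq, fillband_end_freq + 1):
        out[k] = coeffs[pos]
        pos += 1
        if pos > src_hi:      # wrap around to the start of the source band
            pos = src_lo
    return out, copy_pos_mod
```

For example, with Tonal_pos = 3 and copyband_offset = 2 the source band is bins 2 to 5 and the period is 4, so a subband starting at bin 10 begins copying from bin 2.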

Further, the bit allocation unit is also used to allocate bits to all effective noise filling subbands, or to skip one or several low-frequency effective noise filling subbands and allocate bits to the subsequent higher-frequency effective noise filling subbands; the energy adjustment subunit adjusts the energy of the frequency coefficients obtained by spectral band replication; the noise filling subunit performs noise filling on the energy-adjusted frequency coefficients and on the zero-bit coding subbands in noise filling subbands to which no bits were allocated.

At the encoder, the invention estimates the power spectrum of the audio signal to be encoded from its MDCT frequency coefficients, estimates the noise level of the audio signal in the zero-bit coding subbands from the estimated power spectrum, and encodes the noise level information and sends it to the decoder, where it controls the ratio of noise filling energy to spectral band replication energy. After the decoder has decoded the encoded MDCT frequency coefficients, it reconstructs the frequency coefficients of the uncoded coding subbands by spectral band replication and noise filling, with the ratio of noise filling energy to spectral band replication energy controlled by the noise level coded bits sent by the encoder. The method recovers the spectral envelope and the internal tonal and noise components of the uncoded coding subbands well, and gives good subjective listening quality.

Description of drawings

Fig. 1 is a schematic diagram of the audio encoding method of the invention.

Fig. 2 is a flow chart of how the invention obtains the noise level coded bits of the zero-bit coding subbands inside a noise filling subband.

Fig. 3 is a flow chart of the noise level calculation of the invention.

Fig. 4 is a schematic diagram of the audio decoding method of the invention.

Fig. 5 is a flow chart of the spectrum reconstruction of the invention.

Fig. 6 is a structural diagram of the audio encoding system of the invention.

Fig. 7 is a module structure diagram of the noise level estimation unit of the invention.

Fig. 8 is a structural diagram of the audio decoding system of the invention.

Fig. 9 is a module structure diagram of the spectrum reconstruction unit of the invention.

Fig. 10 is a schematic diagram of the bit stream structure of an embodiment of the invention.

Embodiment

The core idea of the invention is as follows. At the encoder, the power spectrum of the audio signal to be encoded is estimated from its MDCT frequency coefficients, the noise level of the audio signal in the zero-bit coding subbands is estimated from the estimated power spectrum, and the noise level information is encoded and sent to the decoder, where it controls the ratio of noise filling energy to spectral band replication energy. After the decoder has decoded the encoded MDCT frequency coefficients, it reconstructs the frequency coefficients of the uncoded coding subbands by spectral band replication and noise filling, with the ratio of noise filling energy to spectral band replication energy controlled by the noise level coded bits sent by the encoder. The method recovers the spectral envelope and the internal tonal and noise components of the uncoded coding subbands well, and gives good subjective listening quality.

In this invention, the term frequency coefficient always refers to an MDCT frequency coefficient.

The invention is described in detail below in four parts: encoding method, decoding method, encoding system and decoding system.

I. Encoding method

The audio encoding method of the invention comprises the following steps:

A. Divide the MDCT frequency coefficients of the audio signal to be encoded into several coding subbands, and quantize and encode the amplitude envelope value of each coding subband to obtain the amplitude envelope coded bits;

When dividing the coding subbands, the frequency coefficients after the MDCT are divided into several equally spaced coding subbands, or into several non-uniform coding subbands according to auditory perception characteristics.

B. Allocate bits to each coding subband, and quantize and encode the non-zero-bit coding subbands to obtain the coded bits of the MDCT frequency coefficients;

After bits have been allocated to each coding subband, a coding subband that was allocated zero bits is not quantized or encoded. Such a coding subband is referred to here as a zero-bit coding subband or an uncoded coding subband; the other coding subbands are referred to as non-zero-bit coding subbands.

The method used to normalize, quantize and encode each coding subband is outside the scope of this invention.

C. Estimate the power spectrum of the audio signal to be encoded from its MDCT frequency coefficients, then estimate the noise level of the audio signal in the zero-bit coding subbands, and quantize and encode it to obtain the noise level coded bits. These noise level coded bits are used during decoding to control the ratio of noise filling energy to spectral band replication energy;

The noise level of the audio signal in the zero-bit coding subbands is the ratio of the estimated noise component power to the estimated tonal component power in the zero-bit coding subbands.

D. Multiplex and pack the amplitude envelope coded bits of each coding subband, the frequency coefficient coded bits and the noise level coded bits, and send them to the decoder.

The audio encoding method of the invention is described in detail below with reference to the drawings:

Embodiment 1-coding method

Fig. 1 is a schematic diagram of an audio encoding method according to an embodiment of the invention. This embodiment describes the audio encoding method using an audio stream with a frame length of 20 ms and a sampling rate of 32 kHz as an example. The method applies equally under other frame lengths and sampling rates. As shown in Fig. 1, the method comprises:

101: Apply the MDCT (Modified Discrete Cosine Transform) to the audio stream to be encoded to obtain the frequency coefficients at N frequency-domain sample points;

This step can be implemented as follows:

The N-point time-domain sample signal $x(n)$ of the current frame and the N-point time-domain sample signal $x_{old}(n)$ of the previous frame form a 2N-point time-domain sample signal $\overline{x}(n)$, which can be expressed as:

$$\overline{x}(n) = \begin{cases} x_{old}(n), & n = 0, 1, \ldots, N-1 \\ x(n-N), & n = N, N+1, \ldots, 2N-1 \end{cases} \tag{1}$$

Applying the MDCT to $\overline{x}(n)$ gives the following frequency coefficients:

$$X(k) = \sum_{n=0}^{2N-1} \overline{x}(n)\, w(n) \cos\left[\frac{\pi}{N}\left(n + \frac{1}{2} + \frac{N}{2}\right)\left(k + \frac{1}{2}\right)\right], \quad k = 0, \ldots, N-1 \tag{2}$$

where $w(n)$ is the sine window function:

$$w(n) = \sin\left[\frac{\pi}{2N}\left(n + \frac{1}{2}\right)\right], \quad n = 0, \ldots, 2N-1 \tag{3}$$

With a frame length of 20 ms and a sampling rate of 32 kHz, 640 frequency coefficients are obtained (32,000 samples per second × 0.020 s = 640). The number of frequency coefficients N for other frame lengths and sampling rates is calculated in the same way.
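As an illustration of formulas (1) to (3), the following Python sketch evaluates the windowed MDCT directly from its definition. It is a plain O(N²) evaluation meant for checking small cases, not an efficient transform; the function names are introduced here and are not part of the patent.

```python
import math


def coeffs_per_frame(sample_rate_hz, frame_ms):
    """Number of MDCT coefficients N for a given frame length."""
    return sample_rate_hz * frame_ms // 1000


def mdct_frame(x_old, x_cur):
    """Direct evaluation of formulas (1)-(3) for one frame.

    x_old : N samples of the previous frame
    x_cur : N samples of the current frame
    Returns the N MDCT coefficients X(0..N-1).
    """
    n_coef = len(x_cur)
    assert len(x_old) == n_coef
    xbar = list(x_old) + list(x_cur)                     # formula (1)
    window = [math.sin(math.pi / (2 * n_coef) * (n + 0.5))
              for n in range(2 * n_coef)]                # formula (3)
    coeffs = []
    for k in range(n_coef):                              # formula (2)
        acc = 0.0
        for n in range(2 * n_coef):
            phase = math.pi / n_coef * (n + 0.5 + n_coef / 2) * (k + 0.5)
            acc += xbar[n] * window[n] * math.cos(phase)
        coeffs.append(acc)
    return coeffs
```

Calling coeffs_per_frame(32000, 20) gives the 640 coefficients of the embodiment; a production encoder would normally replace the double loop with an FFT-based fast MDCT.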

102: Divide the N frequency coefficients into several coding subbands and calculate the amplitude envelope of each coding subband;

This embodiment uses non-uniform subband division and calculates the frequency-domain amplitude envelope (amplitude envelope for short) of each subband.

This step can be implemented with the following substeps:

102a: Divide the frequency coefficients in the band to be processed into L subbands (referred to as coding subbands);

In this embodiment the band to be processed is 0 to 13.6 kHz, and non-uniform subband division can be carried out according to auditory perception characteristics. Table 1 gives one specific division.

In Table 1, the frequency coefficients in the 0 to 13.6 kHz band are divided into 28 coding subbands, i.e. L = 28, and the frequency coefficients above 13.6 kHz are set to 0.

102b: Calculate the amplitude envelope of each coding subband according to the following formula:

$$Th(j) = \sqrt{\frac{1}{HIndex(j) - LIndex(j) + 1} \sum_{k = LIndex(j)}^{HIndex(j)} X(k)\, X(k)}, \quad j = 0, 1, \ldots, L-1 \tag{4}$$

where LIndex(j) and HIndex(j) are the start frequency bin and end frequency bin of the j-th coding subband, with the specific values given in Table 1.

Table 1. Example of non-uniform subband division in the frequency domain

Subband index	Start frequency bin (LIndex)	End frequency bin (HIndex)	Subband width (BandWidth)
0	0	7	8
1	8	15	8
2	16	23	8
3	24	31	8
4	32	47	16
5	48	63	16
6	64	79	16
7	80	95	16
8	96	111	16
9	112	127	16
10	128	143	16
11	144	159	16
12	160	183	24
13	184	207	24
14	208	231	24
15	232	255	24
16	256	279	24
17	280	303	24
18	304	327	24
19	328	351	24
20	352	375	24
21	376	399	24
22	400	423	24
23	424	447	24
24	448	471	24
25	472	495	24
26	496	519	24
27	520	543	24
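The subband layout of Table 1 and the amplitude envelope of formula (4) can be sketched in Python as follows. The function names are illustrative; the widths are taken directly from Table 1 (four subbands of width 8, eight of width 16 and sixteen of width 24).

```python
import math


def build_table1_bands():
    """Return [(LIndex, HIndex), ...] for the 28 coding subbands of Table 1."""
    widths = [8] * 4 + [16] * 8 + [24] * 16
    bands, start = [], 0
    for width in widths:
        bands.append((start, start + width - 1))
        start += width
    return bands


def amplitude_envelope(coeffs, bands):
    """Formula (4): RMS of the MDCT coefficients inside each coding subband."""
    envelope = []
    for lo, hi in bands:
        energy = sum(coeffs[k] * coeffs[k] for k in range(lo, hi + 1))
        envelope.append(math.sqrt(energy / (hi - lo + 1)))
    return envelope
```

The 28 subbands cover bins 0 to 543; at 25 Hz per bin (16 kHz over 640 bins) this corresponds to the 0 to 13.6 kHz band of the embodiment.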

103: Quantize and encode the amplitude envelope of each coding subband to obtain the amplitude envelope quantization indices and their coded bits (i.e. the amplitude envelope coded bits);

The amplitude envelope of each coding subband calculated with formula (4) is quantized using formula (5) to obtain the amplitude envelope quantization index of each coding subband, where the rounding operator in formula (5) denotes rounding down and $Th_q(0)$ is the amplitude envelope quantization index of the first coding subband. Its range is limited to [-5, 34]: when $Th_q(0) < -5$, set $Th_q(0) = -5$; when $Th_q(0) > 34$, set $Th_q(0) = 34$.

The quantized amplitude envelope is reconstructed from the quantization index.

The amplitude envelope quantization index of the first coding subband is encoded with 6 bits, i.e. it consumes 6 bits.

The differences between the amplitude envelope quantization indices of adjacent coding subbands are calculated as:

$$\Delta Th_q(j) = Th_q(j+1) - Th_q(j), \quad j = 0, \ldots, L-2 \tag{6}$$

The amplitude envelope can be corrected as follows to ensure that $\Delta Th_q(j)$ lies within [-15, 16]:

If $\Delta Th_q(j) < -15$, set $\Delta Th_q(j) = -15$ and $Th_q(j) = Th_q(j+1) + 15$, for j = L-2, ..., 0;

If $\Delta Th_q(j) > 16$, set $\Delta Th_q(j) = 16$ and $Th_q(j+1) = Th_q(j) + 16$, for j = 0, ..., L-2;

Apply Huffman coding to $\Delta Th_q(j)$, j = 0, ..., L-2, and calculate the number of bits consumed (the Huffman coded bits). If the number of Huffman coded bits is greater than or equal to the fixed number of allocated bits (in this embodiment, greater than (L-1) × 5), encode $\Delta Th_q(j)$, j = 0, ..., L-2, with natural coding instead and set the amplitude envelope Huffman coding flag Flag_huff_rms = 0; otherwise encode $\Delta Th_q(j)$, j = 0, ..., L-2, with Huffman coding and set Flag_huff_rms = 1. The coded bits of the amplitude envelope quantization indices (i.e. the coded bits of the amplitude envelope differences) and the amplitude envelope Huffman coding flag must be sent to the MUX.
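The two correction passes for the envelope differences can be sketched as below. This is an illustrative reading of the procedure: differences are recomputed from the corrected indices, the backward pass enforces the lower bound and the forward pass the upper bound. The function name is introduced here; Huffman coding itself is not shown.

```python
def clamp_envelope_diffs(th_q):
    """Correct quantization indices so every difference lies in [-15, 16].

    th_q : amplitude envelope quantization indices Th_q(0..L-1)
    Returns (corrected indices, differences of formula (6)).
    """
    th = list(th_q)
    count = len(th)
    # Backward pass (j = L-2 .. 0): lift differences below -15.
    for j in range(count - 2, -1, -1):
        if th[j + 1] - th[j] < -15:
            th[j] = th[j + 1] + 15
    # Forward pass (j = 0 .. L-2): cap differences above 16.
    for j in range(count - 1):
        if th[j + 1] - th[j] > 16:
            th[j + 1] = th[j] + 16
    diffs = [th[j + 1] - th[j] for j in range(count - 1)]
    return th, diffs
```

Because the corrected range [-15, 16] contains 32 values, each difference fits in 5 bits under natural coding, which matches the (L-1) × 5 bit budget mentioned above.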

104: Allocate bits to each coding subband according to its importance;

First calculate the initial importance value of each coding subband from rate-distortion theory and the coding subband amplitude envelope information, then allocate bits to each subband according to its importance. This step can be implemented with the following substeps:

104a: Calculate the average bit consumption of a single frequency coefficient;

From the total number of bits available for a 20 ms frame, bits_available, subtract the number of bits consumed by side information, bit_sides, the bits reserved for the noise level information of the noise filling subbands, bits_noiseband, and the number of bits consumed by the coding subband amplitude envelopes, bits_Th, to obtain the number of remaining bits available for frequency coefficient coding, bits_left:

$$\mathrm{bits\_left} = \mathrm{bits\_available} - \mathrm{bit\_sides} - \mathrm{bits\_Th} - \mathrm{bits\_noiseband} \tag{7}$$

The noise level information reserved bits bits_noiseband are the bits reserved for the noise level coded bits of the noise filling subbands. If bits remain after bit allocation for the noise filling subbands is complete, the remaining reserved bits bits_noiseband are used for bit allocation correction.

Side information comprises the bits of the amplitude envelope Huffman coding flag Flag_huff_rms, the frequency coefficient Huffman coding flag Flag_huff_plvq and the iteration count count. Flag_huff_rms indicates whether Huffman coding was used for the subband amplitude envelopes; Flag_huff_plvq indicates whether Huffman coding was used when vector-quantizing and encoding the frequency coefficients; count indicates the number of iterations performed during bit allocation correction (see the description of the subsequent steps).

104b: Calculate the initial importance value of each coding subband for bit allocation:

rk(j) denotes the importance of the j-th coding subband for bit allocation.

104c: Allocate bits to each coding subband according to its importance;

The procedure is as follows:

First find the coding subband with the largest value among all rk(j) and denote its index $j_k$. Then increase the number of coded bits for each frequency coefficient in that coding subband and reduce the importance of that coding subband, and at the same time calculate the total number of bits consumed by coding that subband, bit_band_used($j_k$). Finally calculate the sum of the bits consumed by all coding subbands, sum(bit_band_used(j)), j = 0, ..., L-1. Repeat this process until the sum of consumed bits reaches the maximum that the bit limit allows.

The bit allocation number is the number of bits allocated to a single frequency coefficient in a coding subband. The number of bits consumed by a coding subband is the number of bits allocated to a single frequency coefficient in that subband multiplied by the number of frequency coefficients the subband contains.

In this embodiment, the step size for allocating bits to a coding subband whose bit allocation number is 0 is 1 bit, and its importance is then reduced by 1. For a coding subband whose bit allocation number is greater than 0 and less than the threshold 5, the step size for additional allocation is 0.5 bit, and its importance is then reduced by 0.5. For a coding subband whose bit allocation number is greater than or equal to the threshold 5, the step size for additional allocation is 1 bit, and its importance is then reduced by 1.

The bit allocation method in this step can be represented by the following pseudocode:

Set region_bit(j) = 0, j = 0, 1, ..., L-1;

For coding subbands 0, 1, ..., L-1:

{

Find j_k = argmax over j = 0, ..., L-1 of rk(j);

If region_bit(j_k) < 5

{

If region_bit(j_k) = 0

Set region_bit(j_k) = region_bit(j_k) + 1;

Calculate bit_band_used(j_k) = region_bit(j_k) * BandWidth(j_k);

Set rk(j_k) = rk(j_k) - 1;

Else if region_bit(j_k) >= 1

Set region_bit(j_k) = region_bit(j_k) + 0.5;

Calculate bit_band_used(j_k) = region_bit(j_k) * BandWidth(j_k) * 0.5;

Set rk(j_k) = rk(j_k) - 0.5;

}

Else if region_bit(j_k) >= 5

{

Set region_bit(j_k) = region_bit(j_k) + 1;

Set rk(j_k) = rk(j_k) - 1 if region_bit(j_k) < MaxBit, otherwise set rk(j_k) = -100;

Calculate bit_band_used(j_k) = region_bit(j_k) * BandWidth(j_k);

}

Calculate bit_used_all = sum(bit_band_used(j)), j = 0, 1, ..., L-1;

If bit_used_all < bits_left - 24, return to the search for j_k among the coding subbands and continue the bit allocation loop, where 24 is the maximum coding subband width;

Otherwise end the loop and output the bit allocation values at this point.

}

Finally, the remaining bits (fewer than 24) are distributed, according to the importance of the coding subbands, to coding subbands that meet the requirements, on the following principle: preferentially allocate 0.5 bit to each frequency coefficient in coding subbands whose bit allocation is 1, reducing the importance of that coding subband by 0.5; otherwise allocate 1 bit to each frequency coefficient in subbands whose bit allocation is 0, reducing the importance of that coding subband by 1. Bit allocation ends when bits_left - bit_used_all < 4.

Here MaxBit is the maximum number of coded bits that can be allocated to a single frequency coefficient in a coding subband. This embodiment uses MaxBit = 9; the value can be adjusted according to the coding bit rate of the codec. region_bit(j) is the number of bits allocated to a single frequency coefficient in the j-th coding subband.
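The main allocation loop can be sketched in Python as follows. This is a sketch under stated assumptions, not the patent's reference code: the importance values rk are taken as input because their formula is not reproduced here; consumed bits are computed everywhere as region_bit × BandWidth, following the definition of consumed bits given in the text rather than the extra 0.5 factor that appears in one pseudocode branch; the final distribution of the last few remaining bits is omitted.

```python
def allocate_bits(rk, band_width, bits_left, max_bit=9):
    """Greedy importance-driven bit allocation (main loop only).

    rk         : initial importance of each coding subband (input, assumed)
    band_width : number of frequency coefficients in each coding subband
    bits_left  : bits available for frequency coefficient coding
    Returns (bits per coefficient for each subband, total bits used).
    """
    rk = list(rk)
    count = len(rk)
    region_bit = [0.0] * count
    max_width = max(band_width)      # 24 for the layout of Table 1
    while True:
        jk = max(range(count), key=lambda j: rk[j])
        if region_bit[jk] < 5:
            if region_bit[jk] == 0:
                region_bit[jk] += 1.0
                rk[jk] -= 1.0
            else:
                region_bit[jk] += 0.5
                rk[jk] -= 0.5
        else:
            region_bit[jk] += 1.0
            rk[jk] = rk[jk] - 1.0 if region_bit[jk] < max_bit else -100.0
        # Assumption: consumed bits = bits per coefficient * subband width.
        used = sum(region_bit[j] * band_width[j] for j in range(count))
        if used >= bits_left - max_width:
            return region_bit, used
```

Stopping one maximum subband width short of bits_left means that a single further step can never overrun the budget, which is why the threshold in the pseudocode is bits_left - 24.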

105: According to the bit allocation result of step 104, allocate bits to the effective noise filling subbands, i.e. those that contain zero-bit coding subbands; estimate the power spectrum of the audio signal from the MDCT frequency coefficients, and estimate the noise level of the effective noise filling subbands from the estimated power spectrum; quantize and encode this noise level information to obtain the noise level coded bits of the noise filling subbands;

The N MDCT frequency coefficients can be treated as a single noise filling subband, or divided uniformly or according to human auditory characteristics into several noise filling subbands. A noise filling subband comprises one or more coding subbands.

In this invention, a noise filling subband that contains a zero-bit coding subband is called an effective noise filling subband.

When allocating bits to noise filling subbands, bits can be allocated to all effective noise filling subbands, or one or several low-frequency effective noise filling subbands can be skipped and bits allocated to the subsequent higher-frequency effective noise filling subbands. In the latter case, during decoding, the zero-bit coding subbands in the low-frequency effective noise filling subbands that received no bits are reconstructed by white noise filling.

Each effective noise filling subband is allocated the same number of bits, or a different number of bits according to the auditory characteristics of the human ear. After the noise level coded bits of the effective noise filling subbands have been obtained, these bits are multiplexed and packed.

106: Vector-quantize and encode the non-zero-bit coding subbands to obtain the frequency coefficient coded bits;

107: Construct the coded bit stream

Fig. 10 is a schematic diagram of the bit stream structure of an embodiment of the invention. First the side information is written into the bit stream multiplexer MUX in the order Flag_huff_rms, Flag_huff_plvq, count; then the coding subband amplitude envelope coded bits are written into the MUX, followed by the noise level coded bits and then the frequency coefficient coded bits. Finally, the bit stream written in this order is sent to the decoder.

Step 105 is described in detail below with reference to the drawings, taking as an example the case in which the N MDCT frequency coefficients are divided into multiple noise filling subbands and bits are allocated starting from the second effective noise filling subband.

As shown in Fig. 2, the process of obtaining the noise level coded bits of the zero-bit coding subbands inside a noise filling subband comprises:

201: Group the coding subbands into several noise filling subbands and, according to the coding subband bit allocation result, allocate bits to the effective noise filling subbands;

The frequency coefficients in the band to be processed are divided non-uniformly, according to human auditory characteristics, into several subbands called noise filling subbands. A noise filling subband comprises one or more coding subbands;

An example of a specific division is given in Table 2:

Table 2. Example of non-uniform division into noise filling subbands

Noise filling subband index	Start coding subband index (NLIndex)	End coding subband index (NHIndex)	Number of coding subbands (SubBandNum)
0	0	11	12
1	12	13	2
2	14	16	3
3	17	20	4
4	21	28	8

In Table 2, the noise filling subbands are arranged in order of coding subband frequency, from low to high.

Suppose that the bits reserved for noise level information amount to two bits for each noise filling subband except the one with index 0, so that the total number of reserved bits equals (number of noise filling subbands - 1) × 2.

During bit allocation, no bits are allocated to the noise filling subband with index 0, i.e. it consumes no coded bits. Correspondingly, during decoding, if the noise filling subband with index 0 contains zero-bit coding subbands, the frequency coefficients of those zero-bit coding subbands are reconstructed by white noise filling; see step 504 for details. Starting from the noise filling subband with index 1, each noise filling subband is checked for zero-bit coding subbands. If a noise filling subband contains a zero-bit coding subband, 2 bits are allocated to it to represent the noise level information of the zero-bit coding subbands inside it, and the noise level information reserved bits bits_noiseband are reduced by 2. After bit allocation for all noise filling subbands is complete, the remaining reserved bits bits_noiseband are used for bit allocation correction.

The bit allocation method for noise filling subbands in this step can be represented by the following pseudocode:

Nregion_bitflag(j-1) is the bit allocation flag of noise filling subband j: 1 indicates that bits have been allocated; 0 indicates that no bits have been allocated.

Set Nregion_bitflag(j-1) = 0, j = 1, 2, ..., L_noise-1;

Set the noise filling subband remaining bits noiseband_remain_bits = 0;

For noise filling subband j = 1, 2, ..., L_noise-1

{

Let region = NLIndex(j), NLIndex(j)+1, ..., NHIndex(j);

For all region

{

If region_bit(region) equals 0

{

Set Nregion_bitflag(j-1) = 1;

bits_noiseband = bits_noiseband - 2;

Break out of the current loop;

}

}

}

noiseband_remain_bits = bits_noiseband;

The bits allocated to the noise filling subbands, arranged in order, form the noise level coded bits referred to below.

The above is the process of allocating bits to the noise filling subbands; alternatively, a fixed number of bits (for example 2 bits) can simply be reserved for each noise filling subband.
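The allocation described in this step can be sketched in Python as follows. This is only an illustration under stated assumptions: the band boundaries are taken from Table 2, region_bit is assumed to be a list holding the bit allocation of each coding subband, and the function name allocate_noise_band_bits is hypothetical.

```python
# (NLIndex, NHIndex) of noise filling subbands 0..4, taken from Table 2.
NOISE_BANDS = [(0, 11), (12, 13), (14, 16), (17, 20), (21, 28)]
BITS_PER_BAND = 2  # bits reserved per noise filling subband (index 0 excluded)


def allocate_noise_band_bits(region_bit, noise_bands=NOISE_BANDS):
    """Return (flags, remaining_bits).

    flags[j - 1] is 1 when noise filling subband j (j >= 1) contains at
    least one zero-bit coding subband and therefore receives 2 bits.
    """
    bits_noiseband = (len(noise_bands) - 1) * BITS_PER_BAND
    flags = [0] * (len(noise_bands) - 1)
    for j in range(1, len(noise_bands)):
        lo, hi = noise_bands[j]
        # One zero-bit coding subband is enough; stop scanning this band.
        if any(region_bit[r] == 0 for r in range(lo, hi + 1)):
            flags[j - 1] = 1
            bits_noiseband -= BITS_PER_BAND
    return flags, bits_noiseband


# Example: coding subbands 13 and 25 received no bits.
region_bit = [3] * 29
region_bit[13] = 0
region_bit[25] = 0
flags, remain = allocate_noise_band_bits(region_bit)
```

In this example noise filling subbands 1 and 4 each take 2 bits, and the 4 unused reserved bits remain available for bit allocation correction.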

202: Based on the noise filling subband division of Table 2, estimate from the MDCT coefficients the power spectrum of the signal in the four noise filling subbands with indices 1, 2, 3 and 4;

The power of frequency bin k in frame i is estimated according to formula (13):

P_i(k) = \lambda P_{i-1}(k) + (1 - \lambda) X_i(k)^2 \qquad (13)

where P_{i-1}(k) = 0 when i equals 0; P_i(k) denotes the estimated power of the k-th frequency bin of frame i; X_i(k) denotes the MDCT coefficient of the k-th frequency bin of frame i; and λ is the coefficient of the one-pole smoothing filter, for example λ = 0.875;
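A minimal Python sketch of the recursion in formula (13) is given below. The function name update_power_spectrum and the three-bin example frames are illustrative assumptions; only λ = 0.875 and the zero initial state come from the text.

```python
LAMBDA = 0.875  # example smoothing coefficient given for formula (13)


def update_power_spectrum(prev_power, mdct_coeffs, lam=LAMBDA):
    """One step of formula (13): P_i(k) = lam * P_{i-1}(k) + (1 - lam) * X_i(k)^2."""
    return [lam * p + (1.0 - lam) * x * x
            for p, x in zip(prev_power, mdct_coeffs)]


# P_{-1}(k) = 0 for the first frame, then two identical frames are processed.
power = [0.0, 0.0, 0.0]
for frame in ([1.0, 2.0, 0.0], [1.0, 2.0, 0.0]):
    power = update_power_spectrum(power, frame)
```

Only the previous estimate has to be stored per frequency bin, which is why the recursion adds no look-ahead delay.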

The principle of estimating the power spectrum from the MDCT is derived as follows:

The following formula gives the discrete-time Fourier transform (DTFT), at angular frequency ω, of a signal x of length 2M:

X_{DTFT}(\omega) = \sum_{n=0}^{2M-1} x(n) e^{-j\omega n} \qquad (14)

Sampling the DTFT at 2M uniformly spaced frequencies between 0 and 2π gives the discrete Fourier transform (DFT). The DFT at frequency bin k is:

X_{DFT}(k) = X_{DTFT}(2\pi k / 2M) = \sum_{n=0}^{2M-1} x(n) e^{-j \frac{2\pi k n}{2M}} \qquad (15)

Sampling the DTFT with an offset of half a frequency bin gives the shifted discrete Fourier transform (SDFT):

X_{SDFT}(k) = X_{DTFT}(2\pi (k + 1/2) / 2M) = \sum_{n=0}^{2M-1} x(n) e^{-j \frac{2\pi (k + 1/2) n}{2M}} \qquad (16)

The SDFT of the real signal x(n) after windowing is:

X_{SDFT}(k) = \sum_{n=0}^{2M-1} w(n) x(n) e^{-j \frac{2\pi (k + 1/2) n}{2M}} \qquad (16)

Denote the MDCT frequency coefficient X(k) of formula (2) by X_{MDCT}(k) and let M = N; formula (2) can then be rewritten as:

X_{MDCT}(k) = \sum_{n=0}^{2M-1} \bar{x}(n) w(n) \cos\left(\frac{\pi}{M}\left(n + \frac{1}{2} + \frac{M}{2}\right)\left(k + \frac{1}{2}\right)\right), \quad k = 0, \ldots, M-1 \qquad (17)

The SDFT and the MDCT use the same window. Let

x(n) = \bar{x}(n);

The relation between the MDCT and the SDFT of the real signal x(n) can then be expressed as:

X_{MDCT}(k) = |X_{SDFT}(k)| \cos\left(\angle X_{SDFT}(k) - \frac{\pi}{M}\left(\frac{1}{2} + \frac{M}{2}\right)\left(k + \frac{1}{2}\right)\right) \qquad (18)

That is, the MDCT can be expressed as the magnitude of the SDFT modulated by a cosine whose argument is a function of the phase angle of the SDFT.
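Relation (18) can be checked numerically. The sketch below evaluates the windowed SDFT and the MDCT of formula (17) directly from their definitions and compares them. The sine window, the block size M = 8 and the random test signal are assumptions made only for this check.

```python
import cmath
import math
import random


def sdft(y, k):
    """Shifted DFT of an already windowed block y of length 2M."""
    n_total = len(y)
    return sum(y[n] * cmath.exp(-2j * math.pi * (k + 0.5) * n / n_total)
               for n in range(n_total))


def mdct(y, k):
    """MDCT coefficient k of the windowed block y, as in formula (17)."""
    m = len(y) // 2
    return sum(y[n] * math.cos(math.pi / m * (n + 0.5 + m / 2) * (k + 0.5))
               for n in range(2 * m))


def mdct_from_sdft(y, k):
    """Right-hand side of relation (18)."""
    m = len(y) // 2
    s = sdft(y, k)
    shift = math.pi / m * (0.5 + m / 2) * (k + 0.5)
    return abs(s) * math.cos(cmath.phase(s) - shift)


M = 8
rng = random.Random(0)
window = [math.sin(math.pi * (n + 0.5) / (2 * M)) for n in range(2 * M)]  # assumed window
y = [window[n] * rng.uniform(-1.0, 1.0) for n in range(2 * M)]
max_err = max(abs(mdct(y, k) - mdct_from_sdft(y, k)) for k in range(M))
```

For a real input the two sides agree to rounding error, because the MDCT is the real part of the SDFT rotated by the fixed phase term in (18).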

The power spectrum of the signal is estimated from the SDFT of consecutive overlapping windowed blocks of the audio signal. Assuming the transform length of the signal x is 2M, the short-time shifted discrete Fourier transform (STSDFT) at frequency bin k and block t is:

X_{STSDFT}(k, t) = \sum_{n=0}^{2M-1} w(n) x(n + Ht) e^{-j \frac{2\pi (k + 1/2) n}{2M}} \qquad (19)

H is the hop size between blocks. If H = M, the STSDFT has the same hop size as the MDCT.

With the STSDFT, the power spectrum of the signal is estimated by averaging the squared magnitude of X_{STSDFT}(k, t) over a number of blocks t. The following formula computes a moving average over T blocks, giving a time-varying estimate of the power spectrum:

P_{STSDFT}(k, t) = \frac{1}{T} \sum_{\eta=0}^{T-1} |X_{STSDFT}(k, t - \eta)|^2 \qquad (20)

According to the relation between the MDCT and the SDFT, P_{STSDFT}(k, t) can, under certain assumptions, be approximated from X_{MDCT}(k, t). Define:

P_{MDCT}(k, t) = \frac{1}{T} \sum_{\eta=0}^{T-1} |X_{MDCT}(k, t - \eta)|^2 \qquad (21)

From formula (18) it follows that:

P_{MDCT}(k, t) = \frac{1}{T} \sum_{\eta=0}^{T-1} |X_{STSDFT}(k, t - \eta)|^2 \cos^2\left(\angle X_{STSDFT}(k, t - \eta) - \frac{\pi}{M}\left(\frac{1}{2} + \frac{M}{2}\right)\left(k + \frac{1}{2}\right)\right) \qquad (22)

If it is assumed that, across blocks, |X_{STSDFT}(k, t - \eta)| and \angle X_{STSDFT}(k, t - \eta) vary relatively independently of each other (an assumption that holds for most audio signals), then:

P_{MDCT}(k, t) \cong \left(\frac{1}{T} \sum_{\eta=0}^{T-1} |X_{STSDFT}(k, t - \eta)|^2\right) \left(\frac{1}{T} \sum_{\eta=0}^{T-1} \cos^2\left(\angle X_{STSDFT}(k, t - \eta) - \frac{\pi}{M}\left(\frac{1}{2} + \frac{M}{2}\right)\left(k + \frac{1}{2}\right)\right)\right) \qquad (23)

If it is further assumed that \angle X_{STSDFT}(k, t - \eta) is uniformly distributed between 0 and 2π over the T blocks and that T is relatively large, then, because the expected value of the squared cosine of a uniformly distributed phase angle is one half:

P_{MDCT}(k, t) \cong \frac{1}{2} \left(\frac{1}{T} \sum_{\eta=0}^{T-1} |X_{STSDFT}(k, t - \eta)|^2\right) = \frac{1}{2} P_{STSDFT}(k, t) \qquad (24)

It can therefore be seen that the power spectrum estimated from the MDCT is approximately half of the power spectrum estimated from the STSDFT.
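The factor of one half used in the last step can be verified directly. Writing θ for the phase angle and c for the fixed term \frac{\pi}{M}(\frac{1}{2} + \frac{M}{2})(k + \frac{1}{2}) (both symbols are shorthand introduced here only for this worked step), and assuming θ is uniform on [0, 2π):

```latex
\mathbb{E}\left[\cos^{2}(\theta - c)\right]
  = \frac{1}{2\pi}\int_{0}^{2\pi} \cos^{2}(\theta - c)\, d\theta
  = \frac{1}{2\pi}\int_{0}^{2\pi} \frac{1 + \cos(2\theta - 2c)}{2}\, d\theta
  = \frac{1}{2}
```

The cosine term integrates to zero over a full period regardless of c, so the result does not depend on the frequency bin k.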

Because the encoder requires low delay, a one-pole smoothing filter is chosen for the power spectrum estimation. The block index t in P_{MDCT}(k, t) is denoted by i and written as a subscript, so that P_{MDCT}(k, t) becomes P_i(k). The block length is set to the length of one frame of the audio signal, so i is the frame number. This yields the final estimation algorithm of formula (13), which is the algorithm used for power spectrum estimation in the present invention.

203: From the power spectrum estimated by formula (13), calculate the noise level of the zero-bit coding subbands in each noise filling subband to which bits have been allocated.

As shown in Fig. 3, the noise level is calculated as follows:

Step 301: Calculate the mean power of all frequency coefficients in all or some of the zero-bit coding subbands of this noise filling subband, giving the average power P_aveg(j);

Step 302: Frequency coefficients in all or some of the zero-bit coding subbands of this noise filling subband whose power is greater than the average power P_aveg(j) are regarded as tonal components of this noise filling subband. Calculate the mean power of all such frequency coefficients, giving the tonal component average power P_signal_aveg(j) of the zero-bit coding subbands in this effective noise filling subband;

Step 303: Frequency coefficients in all or some of the zero-bit coding subbands of this noise filling subband whose power is less than or equal to the average power P_aveg(j) are regarded as noise components of this noise filling subband. Calculate the mean power of all such frequency coefficients, giving the noise component average power P_noise_aveg(j) of the zero-bit coding subbands in this effective noise filling subband;

Step 304: Calculate the ratio P_noise_rate(j) of the noise component average power P_noise_aveg(j) to the tonal component average power P_signal_aveg(j); this value is the noise level of this effective noise filling subband.
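Steps 301 to 304 can be sketched in Python as follows. The function name noise_level is hypothetical, the input is assumed to be the non-empty list of estimated powers P_i(k) of the frequency coefficients in the zero-bit coding subbands of one noise filling subband, and the handling of a completely flat spectrum (no coefficient above the average) is an assumption, since the text does not cover that case.

```python
def noise_level(powers):
    """Return P_noise_rate for one noise filling subband (steps 301-304)."""
    # Step 301: average power over all coefficients considered.
    p_aveg = sum(powers) / len(powers)
    # Step 302: coefficients above the average are treated as tonal.
    tonal = [p for p in powers if p > p_aveg]
    # Step 303: the remaining coefficients are treated as noise.
    noise = [p for p in powers if p <= p_aveg]
    if not tonal:
        # Assumption: a flat spectrum is treated as pure noise.
        return 1.0
    p_signal_aveg = sum(tonal) / len(tonal)
    p_noise_aveg = sum(noise) / len(noise)
    # Step 304: ratio of noise power to tonal power.
    return p_noise_aveg / p_signal_aveg


# One strong peak over a flat floor: average is 4, tonal mean 16, noise mean 1.
rate = noise_level([1.0, 1.0, 1.0, 1.0, 16.0])
```

Because the noise coefficients lie at or below the average and the tonal coefficients lie above it, the ratio is below 1 whenever a tonal component exists.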

The noise level is quantized and encoded to obtain the noise level coded bits;

P_noise_rate(j) is quantized and encoded to give P_noise_rate_bits(j). After the noise level quantization and encoding, the noise level coded bits of the noise filling subbands to which bits have been allocated are arranged in order of increasing subband index, giving the noise level coded bits of all effective noise filling subbands.

An example using non-uniform quantization is shown in Table 3:

Table 3 Example of non-uniform quantization of the noise-to-signal ratio

P_noise_rate(j)	P_noise_rate_bits(j)
[0, 0.04)	00
[0.04, 0.08)	01
[0.08, 0.16)	10
[0.16, 1)	11
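The quantizer of Table 3 can be sketched as a simple threshold lookup. The function name quantize_noise_level is hypothetical, and mapping values of 1 or more to the last code is an assumption, since the table ends at 1.

```python
def quantize_noise_level(p_noise_rate):
    """Map P_noise_rate(j) to its 2-bit code according to Table 3."""
    if p_noise_rate < 0.04:
        return "00"
    if p_noise_rate < 0.08:
        return "01"
    if p_noise_rate < 0.16:
        return "10"
    # [0.16, 1); values >= 1 are also mapped here (assumption).
    return "11"


codes = [quantize_noise_level(v) for v in (0.0, 0.04, 0.0625, 0.1, 0.5)]
```

The intervals are narrower at low noise levels, so strongly tonal subbands are resolved more finely than noisy ones.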

The noise level of this effective noise filling subband is also the noise level of the zero-bit coding subbands within it. Besides P_noise_rate(j), the noise level can also be expressed as the ratio of the tonal component average power P_signal_aveg(j) to the noise component average power P_noise_aveg(j).

Two, decoding method

The audio decoding method of the present invention is the inverse of the encoding method and comprises:

A. Decode each amplitude envelope coded bit in the bitstream to be decoded to obtain the amplitude envelope quantization index of each coding subband;

B. Allocate bits to each coding subband, decode and inverse-quantize the noise level coded bits to obtain the noise level of the zero-bit coding subbands, and decode and inverse-quantize the frequency coefficient coded bits to obtain the frequency coefficients of the non-zero-bit coding subbands;

C. Perform spectral band replication on the zero-bit coding subbands, control the overall energy fill level of each such coding subband according to its amplitude envelope in the bitstream to be decoded, and control the ratio of noise filling energy to spectral band replication energy according to its noise level, to obtain the reconstructed frequency coefficients of the zero-bit coding subbands;

D. Apply the inverse modified discrete cosine transform (IMDCT) to the frequency coefficients of the non-zero-bit coding subbands and the reconstructed frequency coefficients of the zero-bit coding subbands to obtain the final audio signal.

Fig. 4 is a schematic diagram of an audio decoding method according to an embodiment of the present invention. As shown in Fig. 4, the method comprises:

401: Decode each amplitude envelope coded bit to obtain the amplitude envelope quantization index of each coding subband;

A frame of coded bits is extracted from the coded bitstream sent by the encoder (that is, from the bitstream demultiplexer DeMUX). After the coded bits have been extracted, the side information is decoded first. Then, according to the value of the amplitude envelope Huffman coding flag Flag_huff_rms, each amplitude envelope coded bit in the frame is either Huffman decoded or decoded directly, giving the amplitude envelope quantization index Th_q(j), j = 0, ..., L-1, of each coding subband.

402: Allocate bits to each coding subband, and allocate bits to the effective noise filling subbands;

An initial importance value is calculated for each coding subband from its amplitude envelope quantization index, and bits are allocated to each coding subband according to coding subband importance, giving the number of bits allocated to each coding subband. The bit allocation method at the decoder is identical to that at the encoder. During bit allocation, the bit allocation step size and the step size by which the coding subband importance is reduced after allocation are variable.

After the above bit allocation has been completed, bit allocation correction is applied to the coding subbands a further count times, according to the bit allocation correction iteration count value from the encoder and the importance of each coding subband, which completes the overall bit allocation.

In bit allocation and correction, the step sizes are as follows: when bits are allocated to a coding subband whose current allocation is 0, the bit allocation step and the correction step are 1 bit, and the importance is reduced by 1 after allocation or correction; when additional bits are allocated to a coding subband whose allocation is greater than 0 and less than a certain threshold, the bit allocation step and the correction step are 0.5 bit, and the importance is reduced by 0.5; when additional bits are allocated to a coding subband whose allocation is greater than or equal to that threshold, the bit allocation step and the correction step are 1 bit, and the importance is reduced by 1;
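The three step-size cases can be summarized as a small lookup. In the sketch below the function name allocation_step is hypothetical, and the threshold is passed as a parameter because its value is not stated in this passage; the value 5 in the example is a placeholder only.

```python
def allocation_step(bits, threshold):
    """Return (bit_step, importance_step) for a coding subband.

    bits      -- number of bits currently allocated to the subband
    threshold -- the unspecified threshold from the text (placeholder)
    """
    if bits == 0:
        return (1.0, 1.0)   # first allocation to an empty subband
    if bits < threshold:
        return (0.5, 0.5)   # fine steps below the threshold
    return (1.0, 1.0)       # coarse steps at or above the threshold


steps = [allocation_step(b, threshold=5) for b in (0, 2, 5)]
```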

The sub-band division of will encoding is several noise filling subbands, according to coding subband bit allocation result, for effective noise is filled the allocation of subbands bit; The Bit distribution method of the division methods of noise filling subband and noise filling subband is identical with coding method, does not repeat them here.

403: Decode and inverse-quantize the noise level coded bits to obtain the noise level of the zero-bit coding subbands, and decode and inverse-quantize the frequency coefficient coded bits to obtain the MDCT frequency coefficients;

404: Perform spectral band replication on the zero-bit coding subbands, control the overall fill energy level of each such coding subband according to its amplitude envelope, and control the ratio of spectral band replication energy to noise filling energy in each zero-bit coding subband according to the noise level of the noise filling subband containing it, to obtain the reconstructed frequency coefficients of the zero-bit coding subbands;

The detailed process of this step is described below with reference to Fig. 5.

For zero-bit coding subbands in effective noise filling subbands to which bits have been allocated, spectral band replication is performed, and the energy level of the replicated frequency coefficients and the energy level of the noise filling are controlled; for zero-bit coding subbands in noise filling subbands to which no bits have been allocated, noise filling is performed.

405: Apply the IMDCT (Inverse Modified Discrete Cosine Transform) to the frequency coefficients after spectrum reconstruction to obtain the final audio output signal.

Step 404 is described in detail below with reference to Fig. 5:

As shown in Fig. 5, step 404 specifically comprises:

Step 501: Perform spectral band replication on the zero-bit coding subbands of the effective noise filling subbands;

The position of a certain tone of the audio signal is searched for in the MDCT frequency coefficients. The bandwidth from frequency bin 0 to the bin of the tone position is taken as the spectral band replication period, and the band extending from bin 0 shifted backward by copyband_offset bins to the tone position shifted backward by the same copyband_offset bins is taken as the source band for spectral band replication of the zero-bit coding subbands. If the highest frequency inside a zero-bit coding subband that requires spectral band replication is lower than the frequency of the tone found, that subband is reconstructed by noise filling only.

The frequency coefficients are arranged in order of increasing frequency; shifting backward means shifting toward higher frequencies.

This spectral band replication method is described in detail below:

a. Search the MDCT frequency coefficients for the position of a certain tone of the audio signal;

The preferred method of searching for the tone position in the present invention is to smooth the MDCT frequency coefficients: take the absolute value or the square of the MDCT frequency coefficients in a specific low-frequency band and apply smoothing filtering; then, from the filtering result, search for the position of the maximum extremum of the filter output and take that position as the tone position;

A tone of the audio signal, as used in the present invention, refers to the fundamental of the audio signal or one of its harmonics.

The specific band referred to here can be a band in which energy is relatively concentrated, determined from the spectral characteristics; it is called the first band. Low frequency here refers to spectral components below one half of the total signal bandwidth.

The frequency coefficients here are the MDCT frequency coefficients decoded in step 403, arranged from low to high frequency.

The smoothing filter applied to the absolute values of the frequency coefficients of the first band is:

X\_amp_i(k) = \mu X\_amp_{i-1}(k) + (1 - \mu) |\bar{X}_i(k)|

or, applied to the squared values:

X\_amp_i(k) = \mu X\_amp_{i-1}(k) + (1 - \mu) \bar{X}_i(k)^2

where μ is the smoothing filter coefficient, with range (0, 1); it may be set to 0.125. X_amp_i(k) denotes the filter output for the k-th frequency bin of frame i, \bar{X}_i(k) is the decoded MDCT coefficient of the k-th frequency bin of frame i, and X_amp_{i-1}(k) = 0 when i = 0.

The position of the maximum extremum of the first-band filter output can be searched for in either of the following two ways:

(1) Search directly for the initial maximum among the filter outputs of the frequency coefficients of the first band, take this maximum as the maximum extremum of the first-band filter output, and take the index of the corresponding frequency bin as the position of the maximum extremum (that is, of the tone);

(2) Take a section of the first band as the second band, search for the initial maximum among the filter outputs of the frequency coefficients of the second band, take this initial maximum as the maximum extremum of the first-band filter output, and take the index of the corresponding frequency bin as the position of the maximum extremum (that is, of the tone).

The start of the second band is higher than the start of the first band, and the end of the second band is lower than the end of the first band. Preferably, the first band and the second band each contain no fewer than 8 frequency coefficients.

To guard against the case where the frequency coefficient corresponding to the initial maximum is not at the tone position of the audio signal, the tone position search first finds the initial maximum among the filter outputs of the second band and then proceeds according to the position of the corresponding frequency coefficient:

(a) If the initial maximum is the filter output of the lowest-frequency coefficient of the second band, compare that filter output with the filter output of the preceding, lower-frequency coefficient in the first band, and continue comparing toward lower frequencies. When the filter output of the current coefficient is larger than that of the preceding coefficient, the current coefficient is taken as the tone position, that is, its filter output is the final maximum extremum. Alternatively, if the comparison reaches the lowest-frequency coefficient of the first band and its filter output is greater than that of the following coefficient, the lowest-frequency coefficient of the first band is taken as the tone position, that is, its filter output is the final maximum extremum;

(b) If the initial maximum is the filter output of the highest-frequency coefficient of the second band, compare that filter output with the filter output of the following, higher-frequency coefficient in the first band, and continue comparing toward higher frequencies. When the filter output of the current coefficient is larger than that of the following coefficient, the current coefficient is taken as the tone position, that is, its filter output is the final maximum extremum. Alternatively, if the comparison reaches the highest-frequency coefficient of the first band and its filter output is greater than that of the preceding coefficient, the highest-frequency coefficient of the first band is taken as the tone position, that is, its filter output is the final maximum extremum;

(c) If the initial maximum is the filter output of a frequency coefficient between the lowest and highest frequencies of the second band, the frequency coefficient corresponding to the initial maximum is the tone position, that is, the initial maximum is the final maximum extremum.

The method of determining the tone position is described below, taking as an example a first band consisting of the 24th to 64th MDCT frequency coefficients and a second band consisting of the 33rd to 56th MDCT frequency coefficients:

Search for the maximum among the filter outputs of the 33rd to 56th MDCT frequency coefficients. If the maximum corresponds to the 33rd coefficient, check whether the filter output of the 32nd coefficient is larger than that of the 33rd; if so, continue toward lower frequencies and check whether the filter output of the 31st coefficient is larger than that of the 32nd, and so on, until the filter output of the current coefficient is larger than that of the preceding one, or until the filter output of the 24th coefficient is found to be greater than that of the 25th. The current coefficient, or the 24th coefficient, is then the tone position;

If the maximum corresponds to the 56th coefficient, search toward higher frequencies in the same way until the filter output of the current coefficient is larger than that of the following one, in which case the current coefficient is the tone position, or until the filter output of the 64th coefficient is found to be greater than that of the 63rd, in which case the 64th coefficient is the tone position;

If the maximum lies between the 33rd and 56th coefficients, the frequency coefficient corresponding to the maximum is the tone position.

The index of the frequency bin corresponding to the maximum extremum is denoted Tonal_pos.
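Search method (2), with the boundary handling of cases (a) to (c), can be sketched as follows. The function name find_tone_position is hypothetical; the band limits are the ones from the example and are used directly as inclusive list indices, which is an assumption about how the coefficients are numbered. Where two neighbouring filter outputs are equal the walk stops, a choice the text leaves open.

```python
def find_tone_position(amp, first=(24, 64), second=(33, 56)):
    """Return Tonal_pos given the smoothed filter outputs amp[k]."""
    f_lo, f_hi = first
    s_lo, s_hi = second
    # Initial maximum inside the second band.
    pos = max(range(s_lo, s_hi + 1), key=lambda k: amp[k])
    if pos == s_lo:
        # Case (a): walk toward lower frequencies while the lower
        # neighbour is larger, stopping at the edge of the first band.
        while pos > f_lo and amp[pos - 1] > amp[pos]:
            pos -= 1
    elif pos == s_hi:
        # Case (b): walk toward higher frequencies in the same way.
        while pos < f_hi and amp[pos + 1] > amp[pos]:
            pos += 1
    # Case (c): an interior maximum is used as found.
    return pos


# The real peak sits at bin 31, just below the second band.
amp = [0.0] * 80
amp[30], amp[31], amp[32], amp[33] = 3.0, 7.0, 6.0, 5.0
tonal_pos = find_tone_position(amp)
```

Restricting the initial search to the second band and then extending it only when the maximum falls on a band edge keeps the search cheap while still finding a peak that lies just outside the second band.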

b. Taking the bandwidth from frequency bin 0 to the bin of the tone position as the period, and taking as the source band the band extending from bin 0 shifted backward by copyband_offset bins to the tone position shifted backward by the same copyband_offset bins, perform spectral band replication on the zero-bit coding subbands;

That is, the source band starts at frequency bin index copyband_offset and ends at index copyband_offset + Tonal_pos.

In the present invention, the spectral band replication offset (denoted copyband_offset) is preset, with copyband_offset ≥ 0. When copyband_offset is preset to 0, the source band extends from frequency bin 0 to the bin of the tone position. To reduce spectral discontinuities in the replicated band, copyband_offset can be set greater than zero; the source band then extends from bin 0 shifted backward by a small range (copyband_offset bins) to the bin of the maximum extremum shifted backward by the same small range. Above a certain frequency, the spectrum of the zero-bit coding subbands inside the effective noise filling subbands (for example those with indices 1, 2, 3 and 4) is filled entirely by copying from the source band.

Corresponding to the flow of Fig. 2, the zero-bit coding subbands of the first noise filling subband are reconstructed by random noise filling, and the zero-bit coding subbands of the noise filling subbands with indices 1, 2, 3 and 4 are reconstructed by frequency coefficient replication combined with noise filling;

For spectral band replication, the source band copy start index of a zero-bit coding subband is first calculated from the source band and the start index of that zero-bit coding subband; then, with the spectral band replication period as the period, the frequency coefficients of the source band are copied periodically into the zero-bit coding subband, starting from the source band copy start index.

The source band copy start index is determined as follows:

First, starting from the first zero-bit coding subband that needs replication, obtain the frequency bin index of the first MDCT frequency coefficient of the zero-bit coding subband whose frequency coefficients are to be reconstructed, denoted fillband_start_freq. The index of the frequency bin corresponding to the tone is denoted Tonal_pos, and the replication period is denoted copy_period, where copy_period equals Tonal_pos plus 1. If the highest frequency inside a zero-bit coding subband that requires spectral band replication is lower than the frequency of the tone found, that subband is reconstructed by noise filling only, without spectral band replication. With the spectral band replication offset denoted copyband_offset, copy_period is repeatedly subtracted from the value of fillband_start_freq until the result falls within the index range of the source band; this value is the source band copy start index, denoted copy_pos_mod.

The source band copy start index copy_pos_mod can be obtained by the following pseudocode:

Let copy_pos_mod = fillband_start_freq;

While copy_pos_mod is greater than (Tonal_pos + copyband_offset)

{

copy_pos_mod = copy_pos_mod - copy_period;

}

After the computation, copy_pos_mod is the source band copy start index.
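The same loop in Python, as a minimal sketch. The function name source_copy_start and the example values are illustrative assumptions; the variable names follow the pseudocode.

```python
def source_copy_start(fillband_start_freq, tonal_pos, copyband_offset):
    """Return copy_pos_mod, the source band copy start index."""
    copy_period = tonal_pos + 1
    copy_pos_mod = fillband_start_freq
    # Step back one period at a time until the index is inside the
    # source band [copyband_offset, copyband_offset + tonal_pos].
    while copy_pos_mod > tonal_pos + copyband_offset:
        copy_pos_mod -= copy_period
    return copy_pos_mod


# Source band is bins 10..50 (period 41); 200 -> 159 -> 118 -> 77 -> 36.
start = source_copy_start(fillband_start_freq=200, tonal_pos=40, copyband_offset=10)
```

Because the source band is exactly one period long, the loop ends inside the source band whenever fillband_start_freq is not below copyband_offset.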

During copying, the frequency coefficients starting at the source band copy start index are copied in order of increasing frequency into the zero-bit coding subband starting at fillband_start_freq. Once the source position reaches frequency bin Tonal_pos + copyband_offset, copying continues from the frequency coefficient at bin copyband_offset, and so on, until all frequency coefficients of the current zero-bit coding subband have been replicated.

For example, with the spectral band replication offset copyband_offset set to 10, the band starting at copy_pos_mod is copied in order of increasing frequency into the zero-bit coding subband starting at fillband_start_freq; after frequency bin Tonal_pos + 10 has been copied, copying restarts from the 10th frequency coefficient, and so on. All coefficients of the zero-bit coding subband are thus copied from frequency coefficients 10 to Tonal_pos + 10, which form the source band of the spectral band replication.
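The periodic copy can be sketched as follows. The function name replicate_band and the parameter fillband_len (the number of coefficients in the zero-bit coding subband) are hypothetical, and copy_pos_mod is assumed to have been computed beforehand as described above. The noise-only case for subbands below the tone frequency is not handled here.

```python
def replicate_band(coeffs, copy_pos_mod, fillband_len, tonal_pos, copyband_offset):
    """Return fillband_len coefficients copied cyclically from the source band."""
    src_lo = copyband_offset              # first bin of the source band
    src_hi = copyband_offset + tonal_pos  # last bin of the source band
    src = copy_pos_mod
    out = []
    for _ in range(fillband_len):
        out.append(coeffs[src])
        # Wrap back to the start of the source band after its last bin.
        src = src_lo if src == src_hi else src + 1
    return out


# Coefficient values equal their bin index, so the copy order is visible.
coeffs = [float(k) for k in range(100)]
filled = replicate_band(coeffs, copy_pos_mod=36, fillband_len=20,
                        tonal_pos=40, copyband_offset=10)
```

In this example the copy runs through bins 36 to 50 and then wraps to bins 10 to 14.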

The above method is used to replicate the spectrum of all zero-bit coding subbands of the noise filling subbands with indices 1, 2, 3 and 4.

Besides the spectral band replication method above, other spectral band replication methods are equally applicable to the present invention and do not affect its implementation.

Step 502: According to the decoded noise level, adjust the energy of the frequency coefficients obtained by replication in the zero-bit coding subbands of each noise filling subband;

Calculate the amplitude envelope of the frequency coefficients obtained by replication in the zero-bit coding subband, denoted sbr_rms(r). The energy-adjusted frequency coefficients are calculated as:

\overline{X\_sbr}(r) = X\_sbr(r) \cdot sbr\_lev\_scale(r) \cdot rms(r) / sbr\_rms(r);

where \overline{X\_sbr}(r) denotes the energy-adjusted frequency coefficients of zero-bit coding subband r; X_sbr(r) denotes the frequency coefficients obtained by replication in zero-bit coding subband r; sbr_rms(r) is the amplitude envelope (that is, the root mean square) of X_sbr(r); rms(r) is the amplitude envelope of the frequency coefficients of zero-bit coding subband r before encoding, obtained by inverse quantization of the amplitude envelope quantization index; and sbr_lev_scale(r) is the energy control scale factor for the spectral band replication of zero-bit coding subband r, whose value is determined by the noise level of the noise filling subband containing zero-bit coding subband r and is calculated as follows:

sbr\_lev\_scale(r) = \sqrt{(1 - \overline{P\_noise\_rate}(j)) \cdot fill\_energy\_saclefactor}

fill_energy_saclefactor is the fill energy scale factor, used to adjust the gain of the overall fill energy; its range is (0, 1), and its value in this example is 0.2.

\overline{P\_noise\_rate}(j) is the noise level of noise filling subband j obtained by decoding and inverse quantization; its inverse-quantized value can be obtained from the quantization ranges of Table 3 according to the noise level coded bits. One implementation example is given in Table 10:

Table 10 Inverse-quantized noise level values

where j is the index of the noise filling subband containing zero-bit coding subband r.

Step 503: add white noise to the energy-adjusted frequency-domain coefficients to form the final reconstructed frequency-domain coefficients.

After the energy adjustment of the replicated frequency-domain coefficients is complete, white noise is added to the energy-adjusted coefficients to form the final reconstructed frequency-domain coefficients:

\overline{X}(r) = \overline{X\_sbr}(r) + rms(r) \cdot noise\_lev\_scale(r) \cdot random()

where \overline{X}(r) denotes the reconstructed frequency-domain coefficients of zero-bit coding subband r; \overline{X\_sbr}(r) denotes the energy-adjusted frequency-domain coefficients of zero-bit coding subband r; rms(r) is the amplitude envelope of the frequency-domain coefficients of zero-bit coding subband r before encoding, obtained by inverse quantization of the amplitude envelope quantization index; random() is a random phase generator that produces a random phase value and returns +1 or -1; and noise_lev_scale(r) is the noise level control scale factor of zero-bit coding subband r, whose value is determined by the noise level of the noise filling subband containing zero-bit coding subband r. The specific formula is:

noise\_lev\_scale(r) = \sqrt{\overline{P\_noise\_rate}(j) \cdot fill\_energy\_saclefactor}

where fill_energy_saclefactor is the fill-energy scale factor, used to adjust the overall gain of the filled energy; its value range is (0, 1), and it takes the value 0.2 in this example. \overline{P\_noise\_rate}(j) is the noise level of noise filling subband j obtained by decoding and inverse quantization, where j is the index of the noise filling subband containing zero-bit coding subband r.
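Steps 502 and 503 can be sketched together in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `reconstruct_zero_bit_subband`, its argument names and the `rng` parameter are hypothetical, sbr_rms(r) is computed as the root mean square of the replicated coefficients as described above, and an all-zero replicated band is simply left without a replicated component.

```python
import math
import random


def reconstruct_zero_bit_subband(x_sbr, rms, p_noise_rate,
                                 fill_energy_saclefactor=0.2, rng=None):
    """Energy-adjust replicated coefficients and add +/-1 white noise.

    x_sbr        : replicated coefficients X_sbr(r) of one zero-bit subband
    rms          : decoded amplitude envelope rms(r) of that subband
    p_noise_rate : dequantized noise level of the enclosing noise filling subband
    rng          : optional random.Random instance (hypothetical parameter)
    """
    rng = rng or random.Random()
    # Amplitude envelope (root mean square) of the replicated coefficients.
    sbr_rms = math.sqrt(sum(v * v for v in x_sbr) / len(x_sbr))
    sbr_lev_scale = math.sqrt((1.0 - p_noise_rate) * fill_energy_saclefactor)
    noise_lev_scale = math.sqrt(p_noise_rate * fill_energy_saclefactor)
    # Assumption: if the replicated band is all zeros, skip the replicated part.
    gain = sbr_lev_scale * rms / sbr_rms if sbr_rms > 0.0 else 0.0
    # random() in the text returns +1 or -1; rng.choice plays that role here.
    return [v * gain + rms * noise_lev_scale * rng.choice((-1.0, 1.0))
            for v in x_sbr]
```

Because sbr_lev_scale(r)^2 + noise_lev_scale(r)^2 equals fill_energy_saclefactor, the noise level only shifts energy between the replicated part and the noise part, while rms(r) and fill_energy_saclefactor set the overall fill level.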

Of course, for effective noise filling subbands that have not been allocated bits (such as the noise filling subband with index 0), the zero-bit coding subbands within them should also be filled with white noise to reconstruct the frequency-domain coefficients; the details are not repeated here.

The present invention also provides a noise level estimation method, the method comprising:

estimating the noise level of the audio signal of the zero-bit coding subbands from the estimated power spectrum, the noise level being used to control the ratio of noise filling energy to spectral band replication energy during decoding; wherein a zero-bit coding subband is a coding subband to which zero bits have been allocated. A noise level may be calculated for each zero-bit coding subband, or one shared noise level may be calculated for several noise subbands.

Further, the noise level of the audio signal of a zero-bit coding subband refers to the ratio of the noise component power estimated within the zero-bit coding subband to the tonal component power estimated within the zero-bit coding subband.

Further, the power spectrum of the audio signal to be encoded is estimated from its MDCT frequency-domain coefficients; the power of frequency bin k of frame i is estimated by the following formula:

Further, the frequency-domain coefficients of the audio signal to be encoded are divided into one or several noise filling subbands, and the process of calculating the noise level of an effective noise filling subband from the estimated power spectrum of the audio signal to be encoded specifically comprises:

Three, encoding system

To implement the above encoding method, the present invention also provides an audio encoding system. As shown in Figure 6, the system comprises a modified discrete cosine transform (MDCT) unit, an amplitude envelope calculation unit, an amplitude envelope quantization and coding unit, a bit allocation unit, a frequency-domain coefficient coding unit, a noise level estimation unit and a bit stream multiplexer (MUX), wherein:

The amplitude envelope calculation unit is connected to the MDCT unit and is used to divide the frequency-domain coefficients generated by the MDCT unit into several coding subbands and to calculate the amplitude envelope of each coding subband;

When dividing the coding subbands, the amplitude envelope calculation unit divides the MDCT frequency-domain coefficients into several equally spaced coding subbands, or into several non-uniform coding subbands according to auditory perception characteristics.

The bit allocation unit is connected to the amplitude envelope quantization and coding unit and is used to perform bit allocation, obtaining the number of coding bits allocated to each frequency-domain coefficient in each coding subband;

Specifically, the bit allocation unit comprises an importance calculation module and a bit allocation module connected to each other, wherein:

The importance calculation module is used to calculate the initial importance value of each coding subband from the coding subband amplitude envelope values;

The bit allocation module is used to allocate bits to each frequency-domain coefficient of each coding subband according to the importance of that subband; during bit allocation, both the bit allocation step size and the step size by which the importance is reduced after allocation are variable.

The initial importance value is calculated from the optimal bit value under the maximum quantization signal-to-noise-ratio gain condition and a scale factor that matches auditory perception characteristics, or is the quantization index Th_q(j) of the amplitude envelope of each coding subband, or

where μ > 0, and μ and v are real numbers.

When calculating the initial importance value, the importance calculation module first calculates the average bit consumption of a single frequency-domain coefficient; it then calculates the optimal bit value under the maximum quantization signal-to-noise-ratio gain condition according to rate-distortion theory; and it finally calculates the initial importance value of each coding subband for bit allocation from the average bit consumption and the optimal bit value;

The bit allocation module allocates bits to each coding subband according to its importance: it increases the number of coding bits of each frequency-domain coefficient in the coding subband with the greatest importance and reduces the importance of that subband, until the total number of bits consumed by all coding subbands reaches the maximum value that can be provided under the bit limit.

When the bit allocation module performs bit allocation, the bit allocation step size and the post-allocation importance reduction step size of low-bit coding subbands are smaller than those of zero-bit coding subbands and high-bit coding subbands. For example, the bit allocation step size and the post-allocation importance reduction step size of low-bit coding subbands are both 0.5, while those of zero-bit coding subbands and high-bit coding subbands are both 1.
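The variable-step allocation described above can be illustrated with a short Python sketch. It is only a sketch under assumptions: the function name `allocate_bits`, the `low_bit_threshold` parameter (the boundary between low-bit and high-bit coding subbands is not defined in this passage, so the default of 5 bits per coefficient is purely illustrative) and the stopping rule (a subband is dropped from consideration once its next step no longer fits the budget) are all hypothetical choices, not taken from the text.

```python
def allocate_bits(importance, subband_sizes, bit_budget, low_bit_threshold=5.0):
    """Greedy bit allocation with a variable step size.

    importance    : initial importance value of each coding subband
    subband_sizes : number of frequency-domain coefficients in each subband
    bit_budget    : total number of bits available for the coefficients
    Returns the bits per coefficient allocated to each subband.
    """
    imp = list(importance)
    bits = [0.0] * len(imp)
    used = 0.0
    active = list(range(len(imp)))
    while active:
        # Pick the most important subband (lowest index wins ties).
        j = max(active, key=lambda i: imp[i])
        # Step 0.5 for low-bit subbands, 1 for zero-bit and high-bit subbands.
        step = 0.5 if 0.0 < bits[j] < low_bit_threshold else 1.0
        cost = step * subband_sizes[j]
        if used + cost > bit_budget:
            # Assumption: stop considering a subband whose next step does not fit.
            active.remove(j)
            continue
        bits[j] += step
        imp[j] -= step  # importance drops by the same step size
        used += cost
    return bits
```

For instance, `allocate_bits([3.0, 1.0], [4, 4], 12)` first gives subband 0 a full bit per coefficient, then refines it in half-bit steps until the 12-bit budget is exhausted.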

The noise level estimation unit is connected to the MDCT unit and the bit allocation unit and is used to estimate the power spectrum of the audio signal to be encoded from its MDCT frequency-domain coefficients, to estimate from it the noise level of the audio signal of the zero-bit coding subbands, and to quantize and encode that noise level to obtain the noise level coded bits; this noise level is used to control the ratio of noise filling energy to spectral band replication energy during decoding; see Figure 7 for details;

The bit stream multiplexer (MUX) is connected to the amplitude envelope quantization and coding unit, the frequency-domain coefficient coding unit and the noise level estimation unit, and is used to multiplex the coded bits of each coding subband, the coded bits of the frequency-domain coefficients and the noise level coded bits and to send them to the decoder.

The bit stream multiplexer multiplexes and packs the coded bits in the following order: the amplitude envelope Huffman coding flag, the frequency-domain coefficient Huffman coding flag, the bit allocation correction iteration count, the amplitude envelope coded bits, the frequency-domain coefficient coded bits and the noise level coded bits.

As shown in Figure 7, the noise level estimation unit specifically comprises:

The power spectrum estimation module uses the following formula to estimate the power of frequency bin k of frame i:

The noise level calculation module is connected to the power spectrum estimation module and is used to estimate, from the power spectrum estimated by that module, the noise level of the audio signal of the noise filling subbands that have been allocated bits;

Further, the frequency-domain coefficients of the audio signal to be encoded are divided into one or several noise filling subbands, and the noise level calculation module specifically: calculates the mean power of all frequency-domain coefficients in all or some of the zero-bit coding subbands of an effective noise filling subband, obtaining the average power P_aveg(j); calculates the mean power of all frequency-domain coefficients in those zero-bit coding subbands whose power P_i(k) is greater than the average power P_aveg(j), obtaining the tonal component average power P_signal_aveg(j) of the zero-bit coding subbands in this effective noise filling subband; calculates the mean of the powers P_i(k) of all frequency-domain coefficients in those zero-bit coding subbands whose power P_i(k) is less than or equal to the average power P_aveg(j), obtaining the noise component average power P_noise_aveg(j) of the zero-bit coding subbands in this effective noise filling subband; and calculates the ratio of the noise component average power P_noise_aveg(j) to the tonal component average power P_signal_aveg(j), obtaining the noise level of this effective noise filling subband;
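The four calculations performed by the noise level calculation module can be sketched as follows. This is an illustrative Python sketch only: the function name `noise_level` is hypothetical, the input is assumed to be the already estimated powers P_i(k) of the zero-bit coding subband coefficients in one effective noise filling subband, and the handling of a perfectly flat spectrum (no coefficient above the average, so no tonal component) is an assumption that the text does not address.

```python
def noise_level(powers):
    """Ratio of noise component power to tonal component power.

    powers : estimated powers P_i(k) of the coefficients of the zero-bit
             coding subbands inside one effective noise filling subband j
    """
    if not powers:
        raise ValueError("at least one coefficient power is required")
    # Average power P_aveg(j) over all considered coefficients.
    p_aveg = sum(powers) / len(powers)
    # Coefficients above the average are treated as tonal components,
    # the remaining ones as noise components.
    tonal = [p for p in powers if p > p_aveg]
    noise = [p for p in powers if p <= p_aveg]
    if not tonal:
        # Assumption: a flat spectrum is treated as pure noise (level 1.0).
        return 1.0
    p_signal_aveg = sum(tonal) / len(tonal)
    p_noise_aveg = sum(noise) / len(noise)
    return p_noise_aveg / p_signal_aveg
```

Since every noise coefficient lies at or below the average and every tonal coefficient lies above it, the resulting ratio always falls in the range [0, 1], which is consistent with its use under the square root in the decoder scale factors.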

The noise level estimation unit further comprises a bit allocation module connected to the noise level calculation module and the noise level coding module, used to allocate bits to all effective noise filling subbands, or to skip one or several low-frequency effective noise filling subbands and allocate bits to the subsequent higher-frequency effective noise filling subbands, and to notify the noise level calculation module and the noise level coding module; the noise level calculation module calculates the noise level only for the noise filling subbands that have been allocated bits; the noise level coding module uses the bits allocated by the bit allocation module to quantize and encode the noise level.

Four, decoding system

To implement the above decoding method, the present invention also provides an audio decoding system. As shown in Figure 8, the system comprises a bit stream demultiplexer (DeMUX), a coding subband amplitude envelope decoding unit, a bit allocation unit, a frequency-domain coefficient decoding unit, a spectrum reconstruction unit and an inverse modified discrete cosine transform (IMDCT) unit, wherein:

The bit stream demultiplexer (DeMUX) is used to separate the amplitude envelope coded bits, the frequency-domain coefficient coded bits and the noise level coded bits from the bit stream to be decoded;

The amplitude envelope decoding unit is connected to the bit stream demultiplexer and is used to decode the amplitude envelope coded bits output by the demultiplexer to obtain the amplitude envelope quantization index of each coding subband;

The bit allocation unit is connected to the amplitude envelope decoding unit and is used to allocate bits to each coding subband and to the noise filling subbands that contain zero-bit coding subbands;

The bit allocation unit comprises an importance calculation module, a bit allocation module and a bit allocation correction module, wherein:

The bit allocation module is used to allocate bits to each frequency-domain coefficient of each coding subband according to the subband's initial importance value; during bit allocation, both the bit allocation step size and the step size by which the importance is reduced after allocation are variable;

The bit allocation correction module is used, after bit allocation has been performed, to apply count further rounds of bit allocation correction to the coding subbands according to the encoder's bit allocation correction iteration count value count and the importance of each coding subband.

When the bit allocation module performs bit allocation, the bit allocation step size and the post-allocation importance reduction step size of low-bit coding subbands are smaller than those of zero-bit coding subbands and high-bit coding subbands.

When the bit allocation correction module performs bit correction, the bit correction step size and the post-correction importance reduction step size of low-bit coding subbands are smaller than those of zero-bit coding subbands and high-bit coding subbands.

When the bit allocation unit allocates bits to the noise filling subbands, it follows the encoder's allocation method: it either allocates bits to all effective noise filling subbands, or skips one or several low-frequency effective noise filling subbands and allocates bits to the subsequent higher-frequency effective noise filling subbands.

The spectrum reconstruction unit is connected to the noise level decoding unit, the frequency-domain coefficient decoding unit, the amplitude envelope decoding unit and the bit allocation unit, and is used to perform spectral band replication on the zero-bit coding subbands, to control the overall fill energy level of each such coding subband according to the amplitude envelope output by the amplitude envelope decoding unit, and to control the ratio of noise filling energy to spectral band replication energy according to the noise level output by the noise level decoding unit, thereby obtaining the reconstructed frequency-domain coefficients of the zero-bit coding subbands;

The inverse modified discrete cosine transform (IMDCT) unit is connected to the spectrum reconstruction unit and is used to apply the IMDCT to the frequency-domain coefficients after spectrum reconstruction of the zero-bit coding subbands has been completed, obtaining the audio signal.

As shown in Figure 9, the spectrum reconstruction unit specifically comprises a spectral band replication subunit, an energy adjustment subunit and a noise filling subunit connected in sequence, wherein:

The energy adjustment subunit is used to calculate the amplitude envelope of the frequency-domain coefficients obtained by spectral band replication for a zero-bit coding subband, denoted sbr_rms(r), and to perform energy adjustment on the replicated frequency-domain coefficients according to the noise level output by the noise level decoding unit; the energy-adjusted frequency-domain coefficients are:

\overline{X\_sbr}(r) = X\_sbr(r) \cdot sbr\_lev\_scale(r) \cdot rms(r) / sbr\_rms(r)

where \overline{X\_sbr}(r) denotes the energy-adjusted frequency-domain coefficients of zero-bit coding subband r; X_sbr(r) denotes the frequency-domain coefficients obtained by replication for zero-bit coding subband r; sbr_rms(r) is the amplitude envelope of the replicated coefficients X_sbr(r); rms(r) is the amplitude envelope of the frequency-domain coefficients of zero-bit coding subband r before encoding, obtained by inverse quantization of the amplitude envelope quantization index; and sbr_lev_scale(r) is the energy control scale factor for spectral band replication in zero-bit coding subband r, whose value is determined by the noise level of the noise filling subband containing zero-bit coding subband r. The specific formula is:

sbr\_lev\_scale(r) = \sqrt{(1 - \overline{P\_noise\_rate}(j)) \cdot fill\_energy\_saclefactor}

\overline{X}(r) = \overline{X\_sbr}(r) + rms(r) \cdot noise\_lev\_scale(r) \cdot random()

where

noise\_lev\_scale(r) = \sqrt{\overline{P\_noise\_rate}(j) \cdot fill\_energy\_saclefactor}

where fill_energy_saclefactor is the fill-energy scale factor, used to adjust the overall gain of the filled energy; its value range is (0, 1), and it takes the value 0.2 in this example.

The spectral band replication subunit, according to the bit allocation result of the bit allocation unit, performs spectral band replication on the zero-bit coding subbands in the noise filling subbands that have been allocated bits; the energy adjustment subunit performs energy adjustment on the frequency-domain coefficients obtained by spectral band replication; the noise filling subunit performs noise filling on the energy-adjusted frequency-domain coefficients and on the zero-bit coding subbands in the noise filling subbands that have not been allocated bits.

Further, as shown in Figure 9, the spectral band replication subunit comprises a tone position search module, a period and source band calculation module, a source band replication start index calculation module and a spectral band replication module connected in sequence, wherein:

The tone position search module is used to search the MDCT frequency-domain coefficients for the position of a tone of the audio signal, specifically: taking the absolute value or the square of the MDCT frequency-domain coefficients of a first band and applying smoothing filtering; and, from the result of the smoothing filtering, searching for the position of the maximum extremum of the filter output values of the first band, that position being the position of the tone;

If the index of the frequency bin at the tone position is denoted Tonal_pos and the preset spectral band replication offset is denoted copyband_offset, then the start index of the frequency-domain coefficients of the source band is copyband_offset and the end index is copyband_offset + Tonal_pos.

The source band replication start index calculation module is used to calculate the source band replication start index of a zero-bit coding subband from the source band and the start index of the zero-bit coding subband that requires spectral band replication.

The spectral band replication module is used to copy the frequency-domain coefficients of the source band periodically to the zero-bit coding subband, with the spectral band replication period as the period and starting from the source band replication start index;

If the highest frequency within a zero-bit coding subband that requires spectral band replication is lower than the frequency of the tone found by the search, that subband is reconstructed using noise filling only, without spectral band replication.

The tone position search module searches for the tone position as follows: it takes the absolute value or the square of the MDCT frequency-domain coefficients of the first band and applies smoothing filtering; from the result of the smoothing filtering, it searches for the position of the maximum extremum of the filter output values of the first band, that position being the position of the tone;

Further, the formula used by the tone position search module to smooth the absolute values of the MDCT frequency-domain coefficients of the first band is:

X\_amp_{i}(k) = \mu \cdot X\_amp_{i-1}(k) + (1 - \mu) \cdot |\overline{X}_{i}(k)|

or, when the square value is used:

X\_amp_{i}(k) = \mu \cdot X\_amp_{i-1}(k - 1) + (1 - \mu) \cdot \overline{X}_{i}(k)^{2}

where μ is the smoothing filter coefficient, taken as 0.125 in this embodiment, and X_amp_i(k) denotes the filter output value of the k-th frequency bin of frame i.

Further, the tone position search module searches directly for the initial maximum value among the filter output values of the frequency-domain coefficients corresponding to the first band, and takes this maximum value as the maximum extremum of the filter output values of the first band.
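The smoothing and the direct maximum search can be sketched in Python as follows. The function names `smooth_spectrum` and `find_tone_position`, the `use_square` flag and the representation of the first band as a half-open index range are hypothetical; the sketch applies the same-bin recursion of the absolute-value formula to both variants, which is an assumption for the squared variant.

```python
def smooth_spectrum(prev_amp, coeffs, mu=0.125, use_square=False):
    """One step of the inter-frame smoothing filter.

    prev_amp : filter outputs X_amp of the previous frame, one per bin
    coeffs   : MDCT coefficients of the current frame, same length
    mu       : smoothing filter coefficient (0.125 in the embodiment)
    """
    out = []
    for p, c in zip(prev_amp, coeffs):
        magnitude = c * c if use_square else abs(c)
        out.append(mu * p + (1.0 - mu) * magnitude)
    return out


def find_tone_position(x_amp, first_band):
    """Index of the largest filter output inside the first band.

    first_band : (start, stop) bin indices, stop exclusive (assumed layout)
    """
    start, stop = first_band
    # Direct search for the maximum; the lowest index wins ties.
    return max(range(start, stop), key=lambda k: x_amp[k])
```

For a first frame with `prev_amp` set to zeros, `smooth_spectrum([0, 0, 0, 0], [1, -8, 2, 3])` yields `[0.875, 7.0, 1.75, 2.625]`, and the tone position found over bins 0 to 3 is bin 1.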

Further, the source band replication start index calculation module calculates the source band replication start index of a zero-bit coding subband requiring spectral band replication as follows: obtain the index of the starting frequency bin of the zero-bit coding subband whose frequency-domain coefficients currently need to be reconstructed, denoted fillband_start_freq; denote the index of the frequency bin corresponding to the tone as Tonal_pos, and add 1 to Tonal_pos to obtain the replication period copy_period; denote the source band start index as copyband_offset; repeatedly subtract copy_period from the value of fillband_start_freq until the value falls within the index range of the source band; this value, denoted copy_pos_mod, is the source band replication start index.

Further, when the spectral band replication module performs spectral band replication, the procedure specifically comprises:

Starting from the source band replication start index, copy the frequency-domain coefficients one after another onto the zero-bit coding subband beginning at fillband_start_freq; once the source band bin being copied reaches bin Tonal_pos + copyband_offset, continue copying onto this zero-bit coding subband from the frequency-domain coefficient at bin copyband_offset; and so on, until all frequency-domain coefficients of the current zero-bit coding subband have been copied.
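The start index calculation and the periodic copy can be sketched together in Python. This is a minimal illustration with assumptions: the function name `replicate_subband` and the `fillband_len` argument (the number of coefficients in the zero-bit coding subband) are hypothetical, and a subband that starts below the source band is rejected here, since such subbands are handled by noise filling alone.

```python
def replicate_subband(coeffs, tonal_pos, copyband_offset,
                      fillband_start_freq, fillband_len):
    """Periodically copy source band coefficients into a zero-bit subband.

    coeffs              : decoded frequency-domain coefficients of the frame
    tonal_pos           : bin index of the tone found by the search
    copyband_offset     : preset spectral band replication offset
    fillband_start_freq : first bin of the zero-bit subband to fill
    fillband_len        : number of bins to fill (hypothetical argument)
    """
    copy_period = tonal_pos + 1
    src_start = copyband_offset
    src_end = copyband_offset + tonal_pos  # inclusive end of the source band
    # Subtract the period until the index falls inside the source band.
    copy_pos_mod = fillband_start_freq
    while copy_pos_mod > src_end:
        copy_pos_mod -= copy_period
    if copy_pos_mod < src_start:
        raise ValueError("subband starts below the source band")
    filled = []
    pos = copy_pos_mod
    for _ in range(fillband_len):
        filled.append(coeffs[pos])
        pos += 1
        if pos > src_end:
            pos = src_start  # wrap around to the start of the source band
    return filled
```

With `tonal_pos = 3` and `copyband_offset = 2` the source band covers bins 2 to 5; a subband starting at bin 11 gives copy_pos_mod = 3, so six bins are filled from bins 3, 4, 5, 2, 3, 4 in that order.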

Claims

1. A noise level estimation method, characterized in that the method comprises:

estimating the noise level of the audio signal of the zero-bit coding subbands from the calculated power spectrum, the noise level being used to control the ratio of noise filling energy to spectral band replication energy during decoding; wherein a zero-bit coding subband is a coding subband to which zero bits have been allocated;

2. The method of claim 1, characterized in that:

3. The method of claim 1, characterized in that:

calculating the ratio P_noise_rate(j) of the noise component average power P_noise_aveg(j) to the tonal component average power P_signal_aveg(j) to obtain the noise level of this effective noise filling subband;

4. An audio encoding method, characterized in that the method comprises:

D. after multiplexing and packing the amplitude envelope coded bits, the frequency-domain coefficient coded bits and the noise level coded bits of each coding subband, transmitting them to the decoder;

in step C, the noise level of the audio signal of a zero-bit coding subband refers to the ratio of the noise component power estimated within the zero-bit coding subband to the tonal component power estimated within the zero-bit coding subband.

5. The method of claim 4, characterized in that:

6. The method of claim 4, characterized in that, in step B, the frequency-domain coefficients of the audio signal to be encoded are divided into one or several noise filling subbands, and after bits have been allocated to each coding subband, bits are allocated to the effective noise filling subbands; in step C, the process of calculating the noise level of an effective noise filling subband from the estimated power spectrum of the audio signal to be encoded specifically comprises:

7. The method of claim 6, characterized in that:

when the noise filling subbands are divided, the division is uniform or is non-uniform according to the characteristics of human hearing, and one noise filling subband comprises one or more coding subbands.

8. The method of claim 6, characterized in that: in step B, bits are allocated to all effective noise filling subbands, or one or several low-frequency effective noise filling subbands are skipped and bits are allocated to the subsequent higher-frequency effective noise filling subbands; in step C, the noise level is calculated for the effective noise filling subbands that have been allocated bits; in step D, the allocated bits are used for multiplexing and packing the noise level coded bits.

9. The method of claim 6, characterized in that: each effective noise filling subband is allocated the same number of bits, or different numbers of bits are allocated according to auditory characteristics.

10. An audio decoding method, characterized in that the method comprises:

B2. performing bit allocation for each coding subband, decoding and inverse-quantizing the noise level coded bits to obtain the noise level of the audio signal of the zero-bit coding subbands, and decoding and inverse-quantizing the frequency-domain coefficient coded bits to obtain the frequency-domain coefficients of the non-zero-bit coding subbands;

the noise level of the audio signal of a zero-bit coding subband refers to the ratio of the noise component power estimated within the zero-bit coding subband to the tonal component power estimated within the zero-bit coding subband;

C2. performing spectral band replication on the zero-bit coding subbands, controlling the overall fill energy level of each such coding subband according to the amplitude envelope of that zero-bit coding subband, and controlling the ratio of noise filling energy to spectral band replication energy according to the noise level of the audio signal of that zero-bit coding subband, thereby obtaining the reconstructed frequency-domain coefficients of the zero-bit coding subbands;

D2. applying the inverse modified discrete cosine transform (IMDCT) to the frequency-domain coefficients of the non-zero-bit coding subbands and the reconstructed frequency-domain coefficients of the zero-bit coding subbands to obtain the final audio signal.

11. The method of claim 10, characterized in that, in step C2, during spectral band replication, the position of a tone of the audio signal is searched for in the MDCT frequency-domain coefficients; the bandwidth from frequency bin 0 to the frequency bin at the tone position is taken as the spectral band replication period, and the band from bin 0 offset by copyband_offset bins to the tone-position bin offset by the same copyband_offset bins is taken as the source band for performing spectral band replication on the zero-bit coding subband; if the highest frequency within the zero-bit coding subband is lower than the frequency of the tone found by the search, the zero-bit coding subband is reconstructed using noise filling only.

12. The method of claim 11, characterized in that, in step C2,

13. The method of claim 12, characterized in that:

14. The method of claim 12, characterized in that the first band is a low-frequency band in which energy is relatively concentrated, determined from the statistical properties of the spectrum, where low frequency refers to spectral components below one half of the total signal bandwidth.

15. The method of claim 12, characterized in that the maximum extremum of the filter output values is determined as follows: the initial maximum value is searched for directly among the filter output values of the frequency-domain coefficients corresponding to the first band, and this maximum value is taken as the maximum extremum of the filter output values of the first band.

16. The method of claim 12, characterized in that the maximum extremum of the filter output values is determined as follows:

17. The method of claim 11, characterized in that, in step C2, when spectral band replication is performed on a zero-bit coding subband, the source band replication start index of the zero-bit coding subband is first calculated from the source band and the start index of the zero-bit coding subband that requires spectral band replication; then, with the spectral band replication period as the period and starting from the source band replication start index, the frequency-domain coefficients of the source band are copied periodically to the zero-bit coding subband.

18. The method of claim 17, characterized in that the source band replication start index of the zero-bit coding subband is calculated in step C2 as follows:

19. The method of claim 18, characterized in that, in step C2, the frequency-domain coefficients of the source band are copied periodically to the zero-bit coding subband, with the spectral band replication period as the period and starting from the source band replication start index, as follows:

20. The method of claim 10, characterized in that, in step C2, energy adjustment is performed on the frequency-domain coefficients obtained by replication for the zero-bit coding subband as follows:

where

21. The method of claim 10, characterized in that, in step C2, noise filling is performed on the energy-adjusted frequency-domain coefficients according to the following formula:

where \overline{X}(r) denotes the reconstructed frequency-domain coefficients of zero-bit coding subband r,

22. The method of claim 11, characterized in that: in step B2, after bit allocation has been performed for each coding subband, the coding subbands are divided into several noise filling subbands and bit allocation is performed for the effective noise filling subbands; in step C2, for the zero-bit coding subbands in the effective noise filling subbands that have been allocated bits, spectral band replication is performed and the energy level of the replicated frequency-domain coefficients and the energy level of the noise filling are controlled, and for the zero-bit coding subbands in the effective noise filling subbands that have not been allocated bits, noise filling is performed; wherein an effective noise filling subband is a noise filling subband that contains zero-bit coding subbands.

23. An audio encoding system, comprising a modified discrete cosine transform (MDCT) unit, an amplitude envelope calculation unit, an amplitude envelope quantization and encoding unit, a bit allocation unit, a frequency-domain coefficient encoding unit, and a bitstream multiplexer (MUX), characterized in that the system further comprises a noise level estimation unit, wherein:

the bitstream multiplexer (MUX) is connected to the amplitude envelope quantization and encoding unit, the frequency-domain coefficient encoding unit, and the noise level estimation unit, and is configured to multiplex the coded bits of each encoding subband and the coded bits of the frequency-domain coefficients and to send them to the decoding end;

24. The system as claimed in claim 23, characterized in that the noise level estimation unit specifically comprises:

25. The system as claimed in claim 24, characterized in that the power spectrum estimation module estimates the power at frequency bin k of frame i using the following formula:

26. The system as claimed in claim 24, characterized in that:

27. The system as claimed in claim 24, characterized in that: the noise level estimation unit further comprises a bit allocation module connected to the noise level calculation module and the noise level encoding module, configured to allocate bits to all effective noise-filling subbands, or to skip one or several low-frequency effective noise-filling subbands and allocate bits to the subsequent higher-frequency effective noise-filling subbands, and to notify the noise level calculation module and the noise level encoding module; the noise level calculation module calculates the noise level only for the noise-filling subbands that have been allocated bits; the noise level encoding module quantizes and encodes the noise level using the bits allocated by the bit allocation module.

28. An audio decoding system, comprising a bitstream demultiplexer (DeMUX), an encoding-subband amplitude envelope decoding unit, a bit allocation unit, a frequency-domain coefficient decoding unit, a spectrum reconstruction unit, and an inverse modified discrete cosine transform (IMDCT) unit, characterized in that:

the spectrum reconstruction unit is connected to the noise level decoding unit, the frequency-domain coefficient decoding unit, the amplitude envelope decoding unit, and the bit allocation unit, and is configured to perform spectral band replication on a zero-bit encoding subband, to control the overall energy fill level of that encoding subband according to the amplitude envelope output by the amplitude envelope decoding unit, and to control the ratio of the noise-filling energy to the spectral-band-replication energy according to the noise level of the zero-bit encoding subband audio signal output by the noise level decoding unit, thereby obtaining the reconstructed frequency-domain coefficients of the zero-bit encoding subband;

the IMDCT unit is connected to the spectrum reconstruction unit and is configured to perform the IMDCT on the frequency-domain coefficients after spectrum reconstruction of the zero-bit encoding subbands has been completed, to obtain the audio signal;

29. The system as claimed in claim 28, characterized in that:

the spectrum reconstruction unit comprises a spectral band replication subunit, an energy adjustment subunit, and a noise filling subunit connected in sequence, wherein:

Wherein,

Wherein,

30. The system as claimed in claim 28, characterized in that: the spectral band replication subunit comprises a tone position search module, a period and source band calculation module, a source-band replication start index calculation module, and a spectral band replication module connected in sequence, wherein:

31. The system as claimed in claim 30, characterized in that: the tone position search module searches for the tone position as follows: it takes the absolute value or the square of the MDCT frequency-domain coefficients of the first band and applies smoothing filtering; based on the result of the smoothing filtering, it searches for the position of the maximum extremum of the filter output values of the first band; the position of this maximum extremum is the position of the tone.

32. The system as claimed in claim 31, characterized in that:

Wherein, μ is the smoothing filter coefficient, X_amp_i(k) denotes the filter output value at frequency bin k of frame i, [...] is the decoded MDCT coefficient at frequency bin k of frame i, and when i = 0, X_amp_{i-1}(k) = 0.
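The tone search described in claims 31 and 32 can be sketched as follows. This is a minimal illustration, not the claimed formula: the smoothing formula itself is not reproduced in the text above, so the sketch assumes a first-order recursive smoother across frames, X_amp_i(k) = μ·X_amp_{i-1}(k) + (1 − μ)·|x_i(k)|, which is consistent with the stated initial condition X_amp_{-1}(k) = 0. The function names, the default value of `mu`, and the convention that the returned position is an absolute bin index are all assumptions made for illustration.

```python
def smooth_amplitudes(mdct_coeffs, prev_output=None, mu=0.875):
    """Smooth the magnitudes of one frame of decoded MDCT coefficients.

    Assumed filter (not taken from the source text):
        out[k] = mu * prev_output[k] + (1 - mu) * abs(mdct_coeffs[k])
    For the first frame (i = 0) the previous output is taken as zero.
    """
    if prev_output is None:
        prev_output = [0.0] * len(mdct_coeffs)
    return [mu * p + (1.0 - mu) * abs(x)
            for p, x in zip(prev_output, mdct_coeffs)]


def find_tone_position(filtered, band_start, band_end):
    """Return the bin index of the largest filter output in
    [band_start, band_end] (inclusive), i.e. a direct maximum search
    over the first band."""
    best = band_start
    for k in range(band_start, band_end + 1):
        if filtered[k] > filtered[best]:
            best = k
    return best


# Usage: first frame, first band assumed to cover bins 0..4.
frame = [0.1, -0.2, 3.0, -0.5, 0.4, 5.0]
filtered = smooth_amplitudes(frame, mu=0.5)
tone_pos = find_tone_position(filtered, 0, 4)
```

Restricting the search to the first band keeps the tone estimate inside the low-frequency region where the energy is concentrated; the larger value at bin 5 in the usage example is ignored because it lies outside the assumed band.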

33. The system as claimed in claim 30, characterized in that the first band is a low-frequency band in which the energy is relatively concentrated, determined according to the statistical properties of the spectrum, where low frequency refers to the spectral components below one half of the total signal bandwidth.

34. The system as claimed in claim 30, characterized in that: the tone position search module searches directly for the initial maximum among the filter output values of the frequency-domain coefficients corresponding to the first band, and takes this maximum as the maximum extremum of the filter output values of the first band.

35. The system as claimed in claim 30, characterized in that: when determining the maximum extremum of the filter output values, the tone position search module takes one section of the first band as a second band, first searches for the initial maximum among the filter output values of the frequency-domain coefficients corresponding to the second band, and then performs different processing depending on the position of the frequency-domain coefficient corresponding to this initial maximum:

36. The system as claimed in claim 30, characterized in that:

the process by which the source-band replication start index calculation module calculates the source-band replication start index of a zero-bit encoding subband requiring spectral band replication comprises: obtaining the index of the starting frequency bin of the current zero-bit encoding subband whose frequency-domain coefficients are to be reconstructed, denoted fillband_start_freq; denoting the index of the frequency bin corresponding to the tone as Tonal_pos; adding 1 to Tonal_pos to obtain the replication period copy_period; taking the spectral band replication offset copyband_offset as the source-band start index; and repeatedly subtracting copy_period from the value of fillband_start_freq until the value falls within the index range of the source band, this value being the source-band replication start index, denoted copy_pos_mod.

37. The system as claimed in claim 36, characterized in that: when the spectral band replication module performs spectral band replication, the frequency-domain coefficients starting from the source-band replication start index are copied in order onto the zero-bit encoding subband, starting at position fillband_start_freq, until the source-band frequency bin being copied reaches bin Tonal_pos + copyband_offset; copying then continues onto the zero-bit encoding subband starting again from the frequency-domain coefficient at bin copyband_offset; and so on, until all frequency-domain coefficients of the current zero-bit encoding subband have been copied.
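The start-index calculation of claim 36 and the periodic copy of claim 37 can be sketched together as below. This is an illustrative reading, assuming that the source band spans bins copyband_offset through copyband_offset + Tonal_pos inclusive (so that its length equals copy_period) and that the zero-bit subband lies above the source band. The function names and the `fillband_len` parameter are hypothetical; the variable names fillband_start_freq, Tonal_pos, copyband_offset, copy_period and copy_pos_mod follow the claims.

```python
def source_copy_start(fillband_start_freq, tonal_pos, copyband_offset):
    """Compute (copy_pos_mod, copy_period) as described in claim 36.

    Assumes the source band is [copyband_offset, copyband_offset + tonal_pos]
    (inclusive), an interpretation inferred from claim 37.
    """
    if fillband_start_freq < copyband_offset:
        raise ValueError("zero-bit subband must not start below the source band")
    copy_period = tonal_pos + 1
    source_end = copyband_offset + tonal_pos
    copy_pos_mod = fillband_start_freq
    # Subtract the period repeatedly until the index lands in the source band.
    while copy_pos_mod > source_end:
        copy_pos_mod -= copy_period
    return copy_pos_mod, copy_period


def replicate_band(coeffs, fillband_start_freq, fillband_len,
                   tonal_pos, copyband_offset):
    """Return a copy of coeffs with one zero-bit subband filled by
    periodic replication of the source band (claim 37)."""
    out = list(coeffs)
    src, _ = source_copy_start(fillband_start_freq, tonal_pos, copyband_offset)
    source_end = copyband_offset + tonal_pos
    for k in range(fillband_start_freq, fillband_start_freq + fillband_len):
        out[k] = coeffs[src]
        # After copying the last source bin, wrap back to the source start.
        src = copyband_offset if src == source_end else src + 1
    return out


# Usage: source band is bins 2..5 (period 4); fill six bins starting at bin 12.
spectrum = list(range(20))
filled = replicate_band(spectrum, 12, 6, tonal_pos=3, copyband_offset=2)
```

Starting the copy at copy_pos_mod rather than at copyband_offset keeps the replicated pattern phase-aligned with the source band, so adjacent zero-bit subbands filled independently still form one continuous periodic extension.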

38. The system as claimed in claim 28, characterized in that:

the bit allocation unit is further configured to allocate bits to all effective noise-filling subbands, or to skip one or several low-frequency effective noise-filling subbands and allocate bits to the subsequent higher-frequency effective noise-filling subbands; the energy adjustment subunit performs energy adjustment on the frequency-domain coefficients obtained after spectral band replication; the noise filling subunit performs noise filling on the energy-adjusted frequency-domain coefficients and on the zero-bit encoding subbands in the noise-filling subbands that have not been allocated bits.
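How a decoder might decide, per zero-bit encoding subband, between replication with noise filling and noise filling alone can be sketched as follows. This is only a rough illustration of the selection logic in claims 22 and 38: the grouping of encoding subbands into noise-filling subbands, the function names, the `skipped_low` parameter, and the mode labels are all hypothetical, and the actual number of bits allocated to each noise level is not modeled.

```python
def effective_noise_fill_subbands(bits_per_subband, groups):
    """Return indices of noise-filling subbands that contain at least one
    zero-bit encoding subband.

    groups[g] = (first, last): inclusive range of encoding-subband indices
    belonging to noise-filling subband g (an assumed representation).
    """
    return [g for g, (first, last) in enumerate(groups)
            if any(bits_per_subband[j] == 0 for j in range(first, last + 1))]


def plan_reconstruction(bits_per_subband, groups, skipped_low=0):
    """Map each zero-bit encoding subband to a reconstruction mode.

    The lowest `skipped_low` effective noise-filling subbands receive no
    noise-level bits, so their zero-bit subbands get noise filling only.
    """
    effective = effective_noise_fill_subbands(bits_per_subband, groups)
    with_bits = set(effective[skipped_low:])
    plan = {}
    for g in effective:
        first, last = groups[g]
        mode = "replicate+noise" if g in with_bits else "noise_only"
        for j in range(first, last + 1):
            if bits_per_subband[j] == 0:
                plan[j] = mode
    return plan


# Usage: eight encoding subbands grouped into four noise-filling subbands.
bits = [3, 0, 2, 2, 0, 0, 1, 0]
groups = [(0, 1), (2, 3), (4, 5), (6, 7)]
plan = plan_reconstruction(bits, groups, skipped_low=1)
```

In the usage example, noise-filling subband 1 is not effective because it contains no zero-bit encoding subband, and the lowest effective one (subband 0) is skipped, so encoding subband 1 is reconstructed by noise filling alone.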