Subband encoding and decoding method and device based on the SILK codec
Technical field
The present invention relates to the field of audio encoding and decoding, and in particular to a subband encoding and decoding method and device based on the SILK codec.
Background technology
With the development of the Internet, the demand for voice communication keeps growing. VoIP (Voice over Internet Protocol) technology, which is based on voice packet switching, is increasingly favored by users for its low cost, easy extensibility, and excellent speech quality.
A mainstream coding scheme in VoIP technology is SILK coding, which works as follows: the encoder models the speech signal and uses the speech model to decompose the signal into a set of system parameters; these parameters are transmitted over the channel to the decoder; the decoder recovers the parameters and then reconstructs the speech signal using the same speech model.
In the course of implementing the present invention, the inventors found that the prior art has at least the following problem:
In human speech, the high-frequency signal is usually less rich than the low-frequency signal, yet the SILK encoder processes the low- and high-frequency signals according to preset bit resources. Bit allocation is therefore not efficient enough when coding wideband speech, which reduces the efficiency of low-frequency signal processing.
Summary of the invention
To solve the problem of the prior art, embodiments of the present invention provide a subband encoding and decoding method and device based on the SILK codec. The technical solution is as follows:
In one aspect, a subband encoding method based on the SILK codec is provided, the method including:
obtaining the full-band time-domain signal corresponding to the current audio frame, and decomposing the full-band time-domain signal into a low-frequency time-domain signal and a high-frequency time-domain signal;
performing SILK encoding on the low-frequency time-domain signal to generate the low-frequency parameters corresponding to the low-frequency time-domain signal, and encoding the high-frequency time-domain signal according to the low-frequency parameters to generate the high-frequency parameters corresponding to the high-frequency time-domain signal;
quantizing and compressing the low-frequency parameters and the high-frequency parameters to generate the bit stream corresponding to the current audio frame.
Preferably, encoding the high-frequency time-domain signal according to the low-frequency parameters to generate the high-frequency parameters corresponding to the high-frequency time-domain signal includes:
when the current audio frame is a voiced frame, converting the full-band time-domain signal into a full-band frequency-domain signal, obtaining the pitch period produced when SILK encoding is performed on the low-frequency time-domain signal, and inputting the pitch period and the full-band frequency-domain signal into a harmonic structure analyzer to calculate the cut-off frequency of the harmonic structure;
determining, according to the cut-off frequency of the harmonic structure, whether a harmonic structure exists in the high-frequency time-domain signal;
when a harmonic structure exists in the high-frequency time-domain signal, calculating a simulated high-frequency excitation from the low-frequency complete excitation and the low-frequency unvoiced excitation obtained when SILK encoding is performed on the low-frequency time-domain signal, and from the modulation frequency calculated from the cut-off frequency of the harmonic structure;
inputting the high-frequency time-domain signal into a linear prediction coefficient (LPC) analyzer to calculate the real high-frequency excitation and the high-frequency line spectral pair (LSP) coefficients, and calculating a gain adjustment ratio from the simulated high-frequency excitation and the real high-frequency excitation;
determining the modulation frequency, the gain adjustment ratio, and the high-frequency LSP coefficients as the high-frequency parameters corresponding to the high-frequency time-domain signal.
Preferably, calculating the simulated high-frequency excitation from the low-frequency complete excitation and the low-frequency unvoiced excitation obtained when SILK encoding is performed on the low-frequency time-domain signal, and from the modulation frequency calculated from the cut-off frequency of the harmonic structure, includes:
performing spectrum translation on the low-frequency complete excitation according to the modulation frequency to obtain a full-band excitation, and high-pass filtering the full-band excitation to obtain a first high-frequency excitation, the first high-frequency excitation carrying a harmonic structure;
performing spectrum folding and delay alignment on the low-frequency unvoiced excitation to obtain a second high-frequency excitation, the second high-frequency excitation carrying no harmonic structure;
performing a weighted mix of the first high-frequency excitation and the second high-frequency excitation, according to the preset mixing coefficient corresponding to the first high-frequency excitation and the preset mixing coefficient corresponding to the second high-frequency excitation, to obtain the simulated high-frequency excitation.
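The weighted mixing described above can be sketched as follows. This is a minimal illustration only: the function name `mix_excitations` and the weights 0.7 and 0.3 are assumptions made for the example, since the text states only that the mixing coefficients are preset.

```python
def mix_excitations(u_hb_v, u_hb_uv, w_v=0.7, w_uv=0.3):
    """Weighted sum of the harmonic (first) and noise-like (second) excitations.

    w_v and w_uv stand in for the preset mixing coefficients; the values
    used here are illustrative assumptions, not values from the source.
    """
    if len(u_hb_v) != len(u_hb_uv):
        raise ValueError("both excitations must have the same length")
    return [w_v * a + w_uv * b for a, b in zip(u_hb_v, u_hb_uv)]


# Hypothetical three-sample excitations, just to exercise the function.
simulated = mix_excitations([1.0, -1.0, 0.5], [0.2, 0.4, -0.6])
```

Because both excitations must be sample-aligned before they are summed, the delay alignment applied to the second excitation matters: a mismatch in length or delay would mix unrelated samples.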
Preferably, after determining whether a harmonic structure exists in the high-frequency time-domain signal, the method further includes:
when no harmonic structure exists in the high-frequency time-domain signal, or the current audio frame is an unvoiced frame, performing spectrum folding and delay alignment on the low-frequency unvoiced excitation obtained when SILK encoding is performed on the low-frequency time-domain signal to obtain a third high-frequency excitation, and determining the third high-frequency excitation as the simulated high-frequency excitation.
In another aspect, a subband decoding method based on the SILK codec is provided, the method including:
obtaining the bit stream corresponding to the current audio frame, and decoding the bit stream with a parameter decoder to obtain low-frequency parameters and high-frequency parameters;
decoding the low-frequency parameters with a SILK decoder to obtain a low-frequency time-domain signal, and decoding the high-frequency parameters according to the intermediate parameters generated when the SILK decoder decodes the low-frequency parameters to obtain a high-frequency time-domain signal;
synthesizing the low-frequency time-domain signal and the high-frequency time-domain signal into a full-band time-domain signal with a QMF synthesizer, the full-band time-domain signal being the decoded audio data of the current audio frame.
Preferably, decoding the high-frequency parameters according to the intermediate parameters generated when the SILK decoder decodes the low-frequency parameters to obtain the high-frequency time-domain signal includes:
determining, according to the modulation frequency in the high-frequency parameters, whether a harmonic structure exists in the current audio frame;
when a harmonic structure exists in the audio data, obtaining the low-frequency complete excitation and the low-frequency unvoiced excitation generated when the SILK decoder decodes the low-frequency parameters, and calculating a simulated high-frequency excitation from the low-frequency complete excitation, the low-frequency unvoiced excitation, and the modulation frequency;
inputting the high-frequency LPC coefficients and the gain adjustment ratio in the high-frequency parameters, together with the simulated high-frequency excitation, into an LPC synthesizer, and outputting the synthesized high-frequency time-domain signal.
Preferably, calculating the simulated high-frequency excitation from the low-frequency complete excitation, the low-frequency unvoiced excitation, and the modulation frequency includes:
performing spectrum translation on the low-frequency complete excitation according to the modulation frequency to obtain a full-band excitation, and high-pass filtering the full-band excitation to obtain a fourth high-frequency excitation, the fourth high-frequency excitation carrying a harmonic structure;
performing spectrum folding and delay alignment on the low-frequency unvoiced excitation to obtain a fifth high-frequency excitation, the fifth high-frequency excitation carrying no harmonic structure;
performing a weighted mix of the fourth high-frequency excitation and the fifth high-frequency excitation, according to the preset mixing coefficient corresponding to the fourth high-frequency excitation and the preset mixing coefficient corresponding to the fifth high-frequency excitation, to obtain the simulated high-frequency excitation.
Preferably, after determining whether a harmonic structure exists in the current audio frame, the method further includes:
when no harmonic structure exists in the audio data, obtaining the low-frequency unvoiced excitation generated when the SILK decoder decodes the low-frequency parameters, performing spectrum folding and delay alignment on the low-frequency unvoiced excitation to obtain a sixth high-frequency excitation, and determining the sixth high-frequency excitation as the simulated high-frequency excitation.
In another aspect, a subband encoding device based on the SILK codec is provided, the device including:
a first acquisition module, configured to obtain the full-band time-domain signal corresponding to the current audio frame, and to decompose the full-band time-domain signal into a low-frequency time-domain signal and a high-frequency time-domain signal;
an encoding module, configured to perform SILK encoding on the low-frequency time-domain signal to generate the low-frequency parameters corresponding to the low-frequency time-domain signal, and to encode the high-frequency time-domain signal according to the low-frequency parameters to generate the high-frequency parameters corresponding to the high-frequency time-domain signal;
a generation module, configured to quantize and compress the low-frequency parameters and the high-frequency parameters to generate the bit stream corresponding to the current audio frame.
Preferably, the encoding module includes:
a first computing unit, configured to, when the current audio frame is a voiced frame, convert the full-band time-domain signal into a full-band frequency-domain signal, obtain the pitch period produced when SILK encoding is performed on the low-frequency time-domain signal, and input the pitch period and the full-band frequency-domain signal into a harmonic structure analyzer to calculate the cut-off frequency of the harmonic structure;
a first judging unit, configured to determine, according to the cut-off frequency of the harmonic structure, whether a harmonic structure exists in the high-frequency time-domain signal;
a second computing unit, configured to, when a harmonic structure exists in the high-frequency time-domain signal, calculate a simulated high-frequency excitation from the low-frequency complete excitation and the low-frequency unvoiced excitation obtained when SILK encoding is performed on the low-frequency time-domain signal, and from the modulation frequency calculated from the cut-off frequency of the harmonic structure;
a third computing unit, configured to input the high-frequency time-domain signal into a linear prediction coefficient (LPC) analyzer to calculate the real high-frequency excitation and the high-frequency line spectral pair (LSP) coefficients, and to calculate a gain adjustment ratio from the simulated high-frequency excitation and the real high-frequency excitation;
a determining unit, configured to determine the modulation frequency, the gain adjustment ratio, and the high-frequency LSP coefficients as the high-frequency parameters corresponding to the high-frequency time-domain signal.
Preferably, the second computing unit includes:
a first processing subunit, configured to perform spectrum translation on the low-frequency complete excitation according to the modulation frequency to obtain a full-band excitation, and to high-pass filter the full-band excitation to obtain a first high-frequency excitation, the first high-frequency excitation carrying a harmonic structure;
a second processing subunit, configured to perform spectrum folding and delay alignment on the low-frequency unvoiced excitation to obtain a second high-frequency excitation, the second high-frequency excitation carrying no harmonic structure;
a first computing subunit, configured to perform a weighted mix of the first high-frequency excitation and the second high-frequency excitation, according to the preset mixing coefficient corresponding to the first high-frequency excitation and the preset mixing coefficient corresponding to the second high-frequency excitation, to obtain the simulated high-frequency excitation.
Preferably, the encoding module further includes:
a fourth computing unit, configured to, when no harmonic structure exists in the high-frequency time-domain signal or the current audio frame is an unvoiced frame, perform spectrum folding and delay alignment on the low-frequency unvoiced excitation obtained when SILK encoding is performed on the low-frequency time-domain signal to obtain a third high-frequency excitation, and to determine the third high-frequency excitation as the simulated high-frequency excitation.
In another aspect, a subband decoding device based on the SILK codec is provided, the device including:
a second acquisition module, configured to obtain the bit stream corresponding to the current audio frame, and to decode the bit stream with a parameter decoder to obtain low-frequency parameters and high-frequency parameters;
a decoding module, configured to decode the low-frequency parameters with a SILK decoder to obtain a low-frequency time-domain signal, and to decode the high-frequency parameters according to the intermediate parameters generated when the SILK decoder decodes the low-frequency parameters to obtain a high-frequency time-domain signal;
a synthesis module, configured to synthesize the low-frequency time-domain signal and the high-frequency time-domain signal into a full-band time-domain signal with a QMF synthesizer, the full-band time-domain signal being the decoded audio data of the current audio frame.
Preferably, the decoding module includes:
a second judging unit, configured to determine, according to the modulation frequency in the high-frequency parameters, whether a harmonic structure exists in the current audio frame;
a fifth computing unit, configured to, when a harmonic structure exists in the audio data, obtain the low-frequency complete excitation and the low-frequency unvoiced excitation generated when the SILK decoder decodes the low-frequency parameters, and to calculate a simulated high-frequency excitation from the low-frequency complete excitation, the low-frequency unvoiced excitation, and the modulation frequency;
a synthesis unit, configured to input the high-frequency LPC coefficients and the gain adjustment ratio in the high-frequency parameters, together with the simulated high-frequency excitation, into an LPC synthesizer, and to output the synthesized high-frequency time-domain signal.
Preferably, the fifth computing unit includes:
a third processing subunit, configured to perform spectrum translation on the low-frequency complete excitation according to the modulation frequency to obtain a full-band excitation, and to high-pass filter the full-band excitation to obtain a fourth high-frequency excitation, the fourth high-frequency excitation carrying a harmonic structure;
a fourth processing subunit, configured to perform spectrum folding and delay alignment on the low-frequency unvoiced excitation to obtain a fifth high-frequency excitation, the fifth high-frequency excitation carrying no harmonic structure;
a second computing subunit, configured to perform a weighted mix of the fourth high-frequency excitation and the fifth high-frequency excitation, according to the preset mixing coefficient corresponding to the fourth high-frequency excitation and the preset mixing coefficient corresponding to the fifth high-frequency excitation, to obtain the simulated high-frequency excitation.
Preferably, the decoding module further includes:
a sixth computing unit, configured to, when no harmonic structure exists in the audio data, obtain the low-frequency unvoiced excitation generated when the SILK decoder decodes the low-frequency parameters, perform spectrum folding and delay alignment on the low-frequency unvoiced excitation to obtain a sixth high-frequency excitation, and determine the sixth high-frequency excitation as the simulated high-frequency excitation.
The technical solution provided by the embodiments of the present invention has the following beneficial effects:
The low-frequency signal is encoded by the SILK encoder and the high-frequency signal is encoded separately, so that more bit resources are allocated to the low-frequency signal while the high-frequency signal is encoded with relatively few bit resources, which achieves a more reasonable bit allocation. Coding efficiency is effectively improved and the harmonic structure in the high-frequency signal is preserved, so that better perceptual quality is obtained at the same bit rate setting.
Brief description of the drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow chart of the subband encoding method based on the SILK codec provided by Embodiment one of the present invention;
Fig. 2 is a flow chart of the subband decoding method based on the SILK codec provided by Embodiment two of the present invention;
Fig. 3 is a flow chart of the subband encoding method based on the SILK codec provided by Embodiment three of the present invention;
Fig. 4 is a structural diagram of the encoder in the subband encoding and decoding method based on the SILK codec provided by Embodiment three of the present invention;
Fig. 5 is a flow chart of the subband encoding method based on the SILK codec provided by Embodiment four of the present invention;
Fig. 6 is a structural diagram of the decoder in the subband encoding and decoding method based on the SILK codec provided by Embodiment four of the present invention;
Fig. 7 is a structural schematic diagram of the subband encoding device based on the SILK codec provided by Embodiment five of the present invention;
Fig. 8 is a structural schematic diagram of the subband decoding device based on the SILK codec provided by Embodiment six of the present invention.
Detailed description of the invention
To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the drawings.
Embodiment one
This embodiment of the present invention provides a subband encoding method based on the SILK codec. Referring to Fig. 1, the method flow includes:
101: Obtain the full-band time-domain signal corresponding to the current audio frame, and decompose the full-band time-domain signal into a low-frequency time-domain signal and a high-frequency time-domain signal;
102: Perform SILK encoding on the low-frequency time-domain signal to generate the low-frequency parameters corresponding to the low-frequency time-domain signal, and encode the high-frequency time-domain signal according to the low-frequency parameters to generate the high-frequency parameters corresponding to the high-frequency time-domain signal;
103: Quantize and compress the low-frequency parameters and the high-frequency parameters to generate the bit stream corresponding to the current audio frame.
In this embodiment, the low-frequency signal is encoded by the SILK encoder and the high-frequency signal is encoded separately, so that more bit resources are allocated to the low-frequency signal while the high-frequency signal is encoded with relatively few bit resources, which achieves a more reasonable bit allocation. Coding efficiency is effectively improved and the harmonic structure in the high-frequency signal is preserved, so that better perceptual quality is obtained at the same bit rate setting.
Embodiment two
This embodiment of the present invention provides a subband decoding method based on the SILK codec. Referring to Fig. 2, the method flow includes:
201: Obtain the bit stream corresponding to the current audio frame, and decode the bit stream with a parameter decoder to obtain low-frequency parameters and high-frequency parameters;
202: Decode the low-frequency parameters with a SILK decoder to obtain a low-frequency time-domain signal, and decode the high-frequency parameters according to the intermediate parameters generated when the SILK decoder decodes the low-frequency parameters to obtain a high-frequency time-domain signal;
203: Synthesize the low-frequency time-domain signal and the high-frequency time-domain signal into a full-band time-domain signal with a QMF synthesizer; the full-band time-domain signal is the decoded audio data of the current audio frame.
In this embodiment, audio data whose low- and high-frequency signals were encoded separately is decoded by handling the low- and high-frequency signals separately. The low-frequency parameters are decoded on their own by the SILK decoder, more bit resources are allocated to the low-frequency signal, and the harmonic structure in the high-frequency parameters is preserved, so that better perceptual quality is obtained at the same bit rate setting.
Embodiment three
This embodiment of the present invention provides a subband encoding method based on the SILK codec; see Fig. 3. The structure of the audio encoder is shown in Fig. 4.
The method flow includes:
301: Obtain the original sampled digital signal through the analog-to-digital converter of the digital communication device, and frame and window it at preset time intervals to obtain full-band time-domain signals; obtain the full-band time-domain signal corresponding to the current frame data, and decompose this full-band time-domain signal into a low-frequency time-domain signal and a high-frequency time-domain signal.
Here, the original sampled digital signal is audio data of a certain duration; after framing, the full-band time-domain signal corresponding to each frame of data is obtained.
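Framing and windowing can be sketched as below. The text specifies neither the frame length nor the window shape, so the 20 ms frame (320 samples at 16 kHz), the non-overlapping hop, and the Hann window are all assumptions chosen for illustration, as is the function name `frame_and_window`.

```python
import math


def frame_and_window(samples, frame_len, hop):
    """Cut `samples` into frames of `frame_len` taken every `hop` samples
    and multiply each frame by a Hann window (window choice is an assumption)."""
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    # Only complete frames are kept; a trailing partial frame is dropped.
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append([samples[start + n] * window[n] for n in range(frame_len)])
    return frames


# 100 ms of a constant signal at an assumed 16 kHz sampling rate.
signal = [1.0] * 1600
frames = frame_and_window(signal, frame_len=320, hop=320)
```

Each element of `frames` then plays the role of the full-band time-domain signal of one frame.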
In this embodiment, the full-band time-domain signal is duplicated into two paths. One path is sent to the QMF (quadrature mirror filter) decomposer unit 401, which decomposes the full-band time-domain signal into a low-frequency time-domain signal and a high-frequency time-domain signal; the other path is sent to the FFT (Fast Fourier Transform) unit 402 in the encoder, which converts the full-band time-domain signal into a full-band frequency-domain signal by fast Fourier transform.
Taking a wideband signal with a sampling rate of 16 kHz as an example, the full-band time-domain signal s(n) first enters the QMF decomposer unit 401.
The QMF analysis filter bank that decomposes the full-band time-domain signal consists of two symmetric 64th-order FIR (finite impulse response) filters, one low-pass and one high-pass, whose impulse responses are related as follows:
The QMF decomposer 401 decomposes the original signal s(n) into the low-frequency time-domain signal y_lb(n) covering 0-4 kHz and the high-frequency time-domain signal y_hb(n) covering 4-8 kHz.
The low-frequency time-domain signal y_lb(n) and the high-frequency time-domain signal y_hb(n) are calculated as follows:
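The impulse response relation and the two band formulas are not reproduced in this text. A minimal sketch of a two-band QMF analysis is given below under the conventional assumption that the high-pass filter is the mirror image of the low-pass prototype, h1(n) = (-1)^n h0(n), and that each band is filtered and then decimated by 2. The two-tap prototype is a toy stand-in for the 64th-order filter, and the name `qmf_analysis` is hypothetical.

```python
def qmf_analysis(s, h0):
    """Split `s` into a low band and a high band, each at half the input rate.

    h0 is the low-pass prototype; the high-pass filter is derived as
    h1(n) = (-1)^n * h0(n), the conventional QMF relation (an assumption
    here, since the source equation is not reproduced).
    """
    h1 = [((-1) ** n) * c for n, c in enumerate(h0)]

    def filter_and_decimate(h):
        out = []
        for n in range(0, len(s), 2):          # keep every second output sample
            acc = 0.0
            for k, c in enumerate(h):
                if n - k >= 0:                 # zero initial conditions
                    acc += c * s[n - k]
            out.append(acc)
        return out

    return filter_and_decimate(h0), filter_and_decimate(h1)


h0 = [0.5, 0.5]                                # toy 2-tap prototype
y_lb, y_hb = qmf_analysis([1.0] * 8, h0)       # constant (pure low-frequency) input
y_lb_alt, y_hb_alt = qmf_analysis([1.0, -1.0] * 4, h0)  # Nyquist-rate input
```

After the one-sample start-up transient, the constant input appears only in the low band and the alternating input only in the high band, which is the behavior expected of the 0-4 kHz / 4-8 kHz split for a 16 kHz signal.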
Further, the low-frequency time-domain signal y_lb(n) enters the SILK encoder unit 403, which supports 8 kHz sampling; all low-frequency parameters are extracted using the original SILK coding scheme and are quantized, compressed, and loaded into the bit stream payload. The encoding and reconstruction of the high-frequency time-domain signal y_hb(n) use the classical source-filter model, in which the high-frequency time-domain signal is obtained by feeding a high-frequency excitation into an LPC (linear prediction coefficient) synthesizer. Under this model, high-frequency coding needs three basic elements: the excitation signal of the high-frequency part, the high-frequency LSP (line spectral pair) coefficients, and the high-frequency gain, where the high-frequency gain is obtained by multiplying the low-frequency gain by the gain adjustment ratio.
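The source-filter reconstruction described above can be illustrated with a small all-pole synthesis filter. This is a sketch under assumptions: the recursion y[n] = g*e[n] + sum_k a_k*y[n-k] uses one common sign convention for the LPC coefficients, the conversion from LSP to LPC coefficients is omitted, and the function name and all numeric values are hypothetical.

```python
def lpc_synthesize(excitation, lpc, gain):
    """All-pole synthesis: y[n] = gain * e[n] + sum_k lpc[k-1] * y[n-k].

    The sign convention of `lpc` is an assumption; codecs differ on it.
    """
    y = []
    for n, e in enumerate(excitation):
        acc = gain * e
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:                     # zero initial filter state
                acc += a * y[n - k]
        y.append(acc)
    return y


low_gain = 2.0                                 # hypothetical low-frequency gain
gain_ratio = 0.5                               # hypothetical gain adjustment ratio
high_gain = low_gain * gain_ratio              # high-frequency gain, as in the text

# A unit impulse through a first-order filter gives a decaying response.
y_hb = lpc_synthesize([1.0, 0.0, 0.0, 0.0], lpc=[0.5], gain=high_gain)
```

Only the ratio needs to be transmitted for the gain, because the low-frequency gain is already carried among the SILK low-frequency parameters.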
302: Perform SILK encoding on the low-frequency time-domain signal to generate the low-frequency parameters corresponding to the low-frequency time-domain signal, and encode the high-frequency time-domain signal according to the low-frequency parameters to generate the high-frequency parameters corresponding to the high-frequency time-domain signal.
The low-frequency time-domain signal is encoded as follows:
3021: Perform SILK encoding on the low-frequency time-domain signal to generate the low-frequency parameters corresponding to the low-frequency time-domain signal.
The low-frequency time-domain signal y_lb(n) is encoded in the SILK encoder unit 403, producing parameters that include but are not limited to: irregular pulses, the pitch period and LTP (long-term prediction) coefficients, the low-frequency LPC coefficients, the voiced/unvoiced decision parameter, and the low-frequency gain factor. These serve as the low-frequency parameters.
The high-frequency time-domain signal y_hb(n) may specifically be encoded as follows:
3022: When the current audio frame is a voiced frame, convert the full-band time-domain signal into a full-band frequency-domain signal, obtain the pitch period produced when SILK encoding is performed on the low-frequency time-domain signal, and input the pitch period and the full-band frequency-domain signal into the harmonic structure analyzer to calculate the cut-off frequency of the harmonic structure.
Meanwhile, the full-band time-domain signal is sent to the FFT (Fast Fourier Transform) unit 402 in the encoder, which converts it into a full-band frequency-domain signal by fast Fourier transform. Whether the frame is an unvoiced signal or a voiced signal is determined by the voiced/unvoiced decision parameter of the SILK encoder.
The full-band frequency-domain signal and the pitch period are then input into the harmonic structure analyzer module 404, which analyzes them to obtain the cut-off frequency of the harmonic structure.
The principle is as follows: the pitch period determines the positions of the harmonics on the frequency axis. The harmonic structure analyzer 404 checks, from high frequency to low frequency, the harmonic amplitude |Y[m*F_0]| at the integer multiples of the fundamental frequency F_0, and determines the cut-off frequency of the harmonic structure using the preset thresholds δ_1 and δ_2:
|Y[m*F_0]|^2 - |Y[(m+1)*F_0]|^2 > δ_1
|Y[(m+1)*F_0]|^2 < δ_2
Satisfying the first inequality means that a transition point where the harmonics decay markedly has been found; satisfying the second confirms that the subsequent amplitude is too small to be an effective harmonic. Starting from low frequency, the first frequency position that satisfies both inequalities is the cut-off frequency of the harmonic structure.
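The search for the cut-off frequency can be sketched as follows. The sketch assumes that the first condition compares harmonic m with harmonic m+1 (the index in the source text is garbled), that the harmonic powers |Y[m*F_0]|^2 have already been read from the FFT output, and that the scan starts at the lowest harmonic. The function name, thresholds, and power values are hypothetical.

```python
def harmonic_cutoff(harmonic_power, f0, delta1, delta2):
    """Return the cut-off frequency in Hz, or None if no transition is found.

    harmonic_power[m - 1] holds |Y[m * F0]|^2 for m = 1, 2, ...
    A harmonic m is the cut-off when the power drops by more than delta1
    to harmonic m + 1 and harmonic m + 1 is itself weaker than delta2.
    """
    for m in range(1, len(harmonic_power)):
        p_m = harmonic_power[m - 1]
        p_next = harmonic_power[m]
        if p_m - p_next > delta1 and p_next < delta2:
            return m * f0
    return None


# Hypothetical powers: four strong harmonics of a 200 Hz pitch, then noise floor.
cutoff = harmonic_cutoff([10.0, 9.0, 8.0, 7.0, 0.5, 0.2],
                         f0=200.0, delta1=3.0, delta2=1.0)
no_cutoff = harmonic_cutoff([10.0, 9.5, 9.0], f0=200.0, delta1=3.0, delta2=1.0)
```

Using two thresholds avoids declaring a cut-off at a harmonic that merely dips: the drop must be large and the following harmonic must also be weak in absolute terms.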
3023: Determine, according to the cut-off frequency of the harmonic structure, whether a harmonic structure exists in the high-frequency time-domain signal.
If the cut-off frequency lies in 0-4 kHz, the high-frequency part indeed has no harmonic structure, so the high-frequency excitation is obtained by modulating the low-frequency unvoiced excitation. If the cut-off frequency lies in 4-8 kHz, the high-frequency part still contains some harmonic structure. In that case half of the cut-off frequency is defined as the modulation frequency and passed to the simulated high-frequency excitation generator unit 413 for further processing. Step 3024 is performed when a harmonic structure exists in the high-frequency time-domain signal; step 3025 is performed when no harmonic structure exists in the high-frequency time-domain signal or the current frame is an unvoiced frame.
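The branch between steps 3024 and 3025 can be written as a small decision function. This is an illustrative sketch: the function name is hypothetical, and treating a cut-off of exactly 4 kHz as "no harmonic structure" is an assumption, since the text does not say which side the boundary belongs to.

```python
def select_high_band_path(is_voiced, cutoff_hz, band_split_hz=4000.0):
    """Return (step, modulation_frequency_hz) for the current frame.

    Step "3024" (harmonic path) is chosen only for a voiced frame whose
    harmonic cut-off lies above the band split; the modulation frequency
    is then half the cut-off. Otherwise step "3025" is chosen.
    """
    if is_voiced and cutoff_hz is not None and cutoff_hz > band_split_hz:
        return "3024", cutoff_hz / 2.0
    return "3025", None


harmonic_case = select_high_band_path(True, 6000.0)
low_cutoff_case = select_high_band_path(True, 3000.0)
unvoiced_case = select_high_band_path(False, 6000.0)
```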
3024: When a harmonic structure exists in the high-frequency time-domain signal, calculate the simulated high-frequency excitation from the low-frequency complete excitation and the low-frequency unvoiced excitation obtained when SILK encoding is performed on the low-frequency time-domain signal, and from the modulation frequency calculated from the cut-off frequency of the harmonic structure.
This step is performed when the cut-off frequency lies in 4-8 kHz. In this embodiment, the low-frequency unvoiced excitation and the low-frequency complete excitation are mixed to obtain a high-frequency excitation that has a harmonic structure, i.e. the simulated high-frequency excitation.
Half of the cut-off frequency is passed, as the calculated modulation frequency, to the spectrum translation unit 415 in the simulated high-frequency excitation generator unit 413.
While encoding the low-frequency time-domain signal, the SILK encoder unit 403 also produces the low-frequency complete excitation and the low-frequency unvoiced excitation; these two signals are received from the SILK encoder unit 403 and passed to the simulated high-frequency excitation generator unit 413.
Wherein, in step 3024 process of calculating simulation high frequency pumping can particularly as follows:
30241: According to the modulation frequency, perform spectrum translation on the low-frequency complete excitation to obtain a full-band excitation, and high-pass filter the full-band excitation to obtain a first high-frequency excitation, the first high-frequency excitation carrying a harmonic structure.
In this step, the modulation frequency and the low-frequency complete excitation are both input to the spectrum translation unit 415 in the simulated high-frequency excitation generator unit 413, and the complete excitation in the $0$ to $\Omega_M$ band is translated and copied to the $\Omega_M$ to $2\Omega_M$ band, according to the following formula:
$u_{fb}(k) = u_{lb}(k)\,\bigl(1 + \zeta\cos(\Omega_M k)\bigr)$
where the scaling factor $\zeta \in (1, 2)$ keeps the signal energy accurate and $\Omega_M$ is the modulation frequency. The full-band excitation $u_{fb}(k)$ obtained above enters the high-pass filter unit 406 to yield the first high-frequency excitation $u_{hb\_v}(k)$; because the spectrum translation is performed according to the modulation frequency, the first high-frequency excitation carries a harmonic structure.
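A minimal sketch of the spectrum translation formula is given below. It assumes that $\Omega_M$ is expressed in radians per sample, i.e. $\Omega_M = 2\pi f_M / f_s$; the sample rate argument, the default value of `zeta` and the function name are illustrative assumptions, and the subsequent high-pass filtering (unit 406) is not shown.

```python
import math

def spectrum_translate(u_lb, modulation_hz, sample_rate_hz, zeta=1.5):
    """Apply u_fb(k) = u_lb(k) * (1 + zeta * cos(Omega_M * k)).

    Assumption: Omega_M is in radians per sample, derived from a modulation
    frequency in Hz and a sample rate in Hz. zeta must lie in (1, 2).
    """
    if not 1.0 < zeta < 2.0:
        raise ValueError("zeta must lie in the open interval (1, 2)")
    omega_m = 2.0 * math.pi * modulation_hz / sample_rate_hz
    # Each sample is scaled by the cosine-modulated weight for its index k.
    return [u * (1.0 + zeta * math.cos(omega_m * k)) for k, u in enumerate(u_lb)]
```

The multiplication by the cosine term shifts a copy of the excitation spectrum by the modulation frequency, which is what places harmonics into the band above $\Omega_M$.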
30242: Perform spectrum folding and delay alignment on the low-frequency unvoiced excitation to obtain a second high-frequency excitation, the second high-frequency excitation carrying no harmonic structure.
The low-frequency unvoiced excitation is input to the spectrum folding unit 407 and the delay alignment unit 408 in the simulated high-frequency excitation generator unit 413 to obtain the second high-frequency excitation $u_{hb\_uv}(k)$. Delay alignment compensates for the delay introduced by the high-pass filter.
In theory, spectrum folding obtains the second high-frequency excitation as follows: the low-frequency unvoiced excitation $u_{lb}(k)$ is upsampled and converted to a full-band excitation $u_{fb}(k)$ by the following formula, and the high-frequency excitation $u_{hb}(k)$ is then obtained by high-pass filtering.
$u_{fb}(k) = u_{lb}(k)\,\bigl(1 + (-1)^k\bigr)$
Owing to the special nature of spectrum folding, the high-frequency excitation obtained by the above steps is equivalent to directly negating the low-frequency signal:
$u_{hb}(k) = -u_{lb}(k)$
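The shortcut stated above, followed by delay alignment, can be sketched as below. The helper names and the `delay_samples` parameter are hypothetical; the text does not give the delay of the high-pass filter, so it is left as an input, and delayed samples are padded with zeros as an assumption.

```python
def fold_weight(k):
    """Weight 1 + (-1)^k from the folding formula: 2 for even k, 0 for odd k."""
    return 1 + (-1) ** k

def fold_and_align(u_lb_uv, delay_samples):
    """Negate the low-band unvoiced excitation, then delay it.

    Assumption: the delay is realised by shifting the frame and padding the
    start with zeros; state from the previous frame is not carried over.
    """
    if delay_samples < 0:
        raise ValueError("delay_samples must be non-negative")
    negated = [-u for u in u_lb_uv]            # u_hb(k) = -u_lb(k)
    shifted = [0.0] * delay_samples + negated  # delay alignment
    return shifted[:len(negated)]              # keep the frame length
```

A real implementation would normally keep the tail of the previous frame instead of zero padding, so that the delayed signal stays continuous across frame boundaries.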
30243: According to a preset mixing coefficient for the first high-frequency excitation and a preset mixing coefficient for the second high-frequency excitation, calculate the simulated high-frequency excitation as a weighted mix of the first high-frequency excitation and the second high-frequency excitation.
The final simulated high-frequency excitation $u_{hb}(k)$ is mixed with a mixing coefficient $\alpha \in (0, 1)$ according to the formula below; that is, the first and second high-frequency excitations are each given a mixing coefficient, and the two coefficients sum to 1.
$u_{hb}(k) = \alpha\,u_{hb\_v}(k) + (1 - \alpha)\,u_{hb\_uv}(k)$
There are two reasons for not using the first high-frequency excitation $u_{hb\_v}(k)$ directly as the final high-frequency excitation:
1. The harmonic structure obtained by spectrum translation covers only the $0$ to $2\Omega_M$ band, so some unvoiced excitation has to be mixed in for the band from $2\Omega_M$ to 8 kHz;
2. If different excitation generation methods were used for voiced and unvoiced frames and for different cut-off frequencies, consecutive frames could become discontinuous, which would degrade the listening experience.
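The weighted mix of step 30243 is a per-sample convex combination. A minimal sketch, with a hypothetical function name and with equal frame lengths assumed for both excitations:

```python
def mix_excitations(u_hb_v, u_hb_uv, alpha):
    """Return alpha * u_hb_v(k) + (1 - alpha) * u_hb_uv(k) for every sample k.

    alpha is the preset mixing coefficient in (0, 1); the two weights sum to 1.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in the open interval (0, 1)")
    if len(u_hb_v) != len(u_hb_uv):
        raise ValueError("both excitations must have the same length")
    return [alpha * v + (1.0 - alpha) * uv for v, uv in zip(u_hb_v, u_hb_uv)]
```

Because the weights sum to 1, the mix never exceeds the larger of the two input magnitudes at any sample.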
3025: When the high-frequency time-domain signal has no harmonic structure or the current audio frame is an unvoiced frame, perform spectrum folding and delay alignment on the low-frequency unvoiced excitation obtained during SILK encoding of the low-frequency time-domain signal to obtain a third high-frequency excitation, and take the third high-frequency excitation as the simulated high-frequency excitation.
3026: Input the high-frequency time-domain signal into a linear prediction coefficient (LPC) analyzer to calculate the true high-frequency excitation and the high-frequency line spectral pair (LSP) coefficients, and calculate the gain adjustment ratio from the simulated high-frequency excitation and the true high-frequency excitation.
The LSP coefficients of the high-frequency part are calculated directly by inputting the high-frequency time-domain signal into the LPC analyzer unit 409. The calculation is as follows:
First, according to the linear prediction model, the current sample x(n) can be formed as a linear superposition of the P past samples x(n-i) with different weights a_i, as expressed by formula (1), where e(n) is the prediction error, i.e. the residual signal output by the LPC analyzer unit 409, which is the true high-frequency excitation. The prediction coefficients {a_1, a_2, …, a_P}, i.e. the high-frequency LPC coefficients from which the high-frequency LSP coefficients are derived, can be obtained by solving the normal equations of formula (2), in which the autocorrelation coefficients r(i) are calculated by formula (3).
The above process of calculating the true high-frequency excitation and the high-frequency LSP coefficients is therefore: first calculate the autocorrelation coefficients r(i) by formula (3), then calculate the prediction coefficients {a_1, a_2, …, a_P}, i.e. the high-frequency LPC coefficients, by formula (2), and finally calculate e(n), the true high-frequency excitation, by formula (1).
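For reference, under the standard autocorrelation formulation of linear prediction, formulas (1) to (3) can be written as below. This is a reconstruction from the surrounding definitions rather than a quotation: the frame length $N$, the summation limits and the absence of a window function are assumptions.

```latex
\begin{align}
x(n) &= \sum_{i=1}^{P} a_i\, x(n-i) + e(n) \tag{1} \\
\sum_{i=1}^{P} a_i\, r(|i-j|) &= r(j), \qquad j = 1, \dots, P \tag{2} \\
r(i) &= \sum_{n=i}^{N-1} x(n)\, x(n-i), \qquad i = 0, \dots, P \tag{3}
\end{align}
```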
In practice, the linear prediction coefficients can be solved efficiently by the Levinson-Durbin recursion. In addition, because they are more robust, the LSP coefficients corresponding to each set of LPC coefficients are what is usually quantized and transmitted.
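The analysis chain described above (autocorrelation, Levinson-Durbin recursion, residual) can be sketched in plain Python as follows. The function names are hypothetical, no windowing or bandwidth expansion is applied, and the conversion from LPC to LSP coefficients is not shown; the sign convention is x(n) = Σ a_i x(n-i) + e(n), as in the text.

```python
def autocorrelation(x, order):
    """r(i) = sum over n of x(n) * x(n - i), for i = 0..order."""
    n = len(x)
    return [sum(x[t] * x[t - i] for t in range(i, n)) for i in range(order + 1)]

def levinson_durbin(r, order):
    """Solve the normal equations for a_1..a_P; return (coeffs, error energy)."""
    a = [0.0] * (order + 1)  # a[0] is unused; a[i] holds a_i
    err = r[0]
    for m in range(1, order + 1):
        if err <= 0.0:       # degenerate (e.g. all-zero) input: stop early
            break
        acc = r[m] - sum(a[i] * r[m - i] for i in range(1, m))
        k = acc / err        # reflection coefficient of order m
        new_a = a[:]
        new_a[m] = k
        for i in range(1, m):
            new_a[i] = a[i] - k * a[m - i]
        a = new_a
        err *= 1.0 - k * k   # prediction error energy shrinks each order
    return a[1:], err

def lpc_residual(x, a):
    """e(n) = x(n) - sum_i a_i * x(n - i); samples before the frame are zero."""
    out = []
    for n in range(len(x)):
        pred = sum(a[i - 1] * x[n - i] for i in range(1, len(a) + 1) if n - i >= 0)
        out.append(x[n] - pred)
    return out
```

The recursion needs on the order of P² operations instead of the P³ of a general linear solve, which is why it is the usual choice for per-frame LPC analysis.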
The gain adjustment ratio is calculated as follows:
The gain adjustment ratio of the high-frequency part mainly compensates for the energy difference between the high-frequency excitation generated by the system model and the true high-frequency excitation. In the embodiment of the present invention, the simulated high-frequency excitation produced by the simulated high-frequency excitation generator unit 413 enters the root mean square calculator unit 410, which calculates its root mean square.
Similarly, the high-frequency time-domain signal enters the LPC analyzer unit 409 to obtain the residual signal, that is, the true high-frequency excitation signal that enters the LPC synthesizer in the inverse operation at the decoding end. This signal enters the root mean square calculator unit 412. The root mean squares of the simulated and true high-frequency excitations then enter the gain ratio calculator unit 411, where the value for the true high-frequency excitation is divided by the value for the simulated high-frequency excitation and the result is limited by a threshold, giving the gain adjustment ratio to be transmitted to the decoding end. This gain adjustment ratio is applied to all high-frequency excitation samples at the decoding end so that their energy matches that of the true high-frequency signal.
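The gain adjustment ratio can be sketched as the ratio of two root mean square values followed by clamping. The threshold values are not given in the text, so `min_ratio` and `max_ratio` below are hypothetical placeholders, as are the function names and the handling of a silent simulated excitation.

```python
import math

def rms(x):
    """Root mean square of a frame; an empty frame yields 0.0."""
    return math.sqrt(sum(v * v for v in x) / len(x)) if x else 0.0

def gain_adjust_ratio(true_exc, simulated_exc, min_ratio=0.0, max_ratio=8.0):
    """RMS(true) / RMS(simulated), limited to [min_ratio, max_ratio].

    Assumption: the limits are illustrative; the source only states that a
    threshold restriction is applied.
    """
    denom = rms(simulated_exc)
    if denom == 0.0:
        # No simulated energy: fall back to a limit instead of dividing by zero.
        return min_ratio if rms(true_exc) == 0.0 else max_ratio
    return min(max(rms(true_exc) / denom, min_ratio), max_ratio)
```

A ratio of 2.0, for instance, tells the decoder to double every simulated high-frequency excitation sample before synthesis.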
3027: Determine the modulation frequency, the gain adjustment ratio and the high-frequency LSP coefficients as the high-frequency parameters corresponding to the high-frequency time-domain signal.
In addition to the original low-frequency parameters, the encoding end of the embodiment of the present invention has three further parameters: the modulation frequency, the gain adjustment ratio and the high-frequency LSP coefficients.
303: Quantize and compress the low-frequency parameters and the high-frequency parameters to generate the bit stream corresponding to the current audio frame.
The embodiment of the present invention encodes the low-frequency signal with the SILK encoder and encodes the high-frequency signal separately, allocating more bit resources to the low-frequency signal and relatively few bit resources to the high-frequency signal, thereby achieving a more reasonable allocation of bit resources. This effectively improves coding efficiency and preserves the harmonic structure of the high-frequency signal, so that better listening quality is obtained at the same bit rate setting.
Embodiment four
The embodiment of the present invention provides a subband decoding method based on the SILK codec; see Fig. 5. The structure of the audio decoder is shown in Fig. 6.
The method flow includes:
501: Obtain the bit stream corresponding to the current audio frame, and decode the bit stream with a parameter decoder to obtain the low-frequency parameters and the high-frequency parameters.
The audio decoding end receives the voice packet bit stream 601 and inputs it to the parameter decoder unit 602 in the audio decoder, which outputs the decoded parameters needed by the different modules.
The low-frequency parameters include, but are not limited to: irregular pulses, pitch period, LTP coefficients, low-frequency LPC coefficients, the voiced/unvoiced decision parameter and low-frequency gain factors.
502: Decode the low-frequency parameters with the SILK decoder to obtain the low-frequency time-domain signal; and decode the high-frequency parameters, according to the intermediate parameters generated by the SILK decoder while decoding the low-frequency parameters, to obtain the high-frequency time-domain signal.
The process of decoding the low-frequency parameters to obtain the low-frequency time-domain signal is:
5021: Decode the low-frequency parameters with the SILK decoder to obtain the low-frequency time-domain signal.
First, the parameter decoder unit 602 decodes the quantization indices of the low-frequency unvoiced excitation part, which are used to calculate the irregular pulse signal in SILK. The unvoiced excitation generator unit 603 then generates the low-frequency unvoiced excitation. Next, the voiced/unvoiced state determines whether the LTP synthesizer unit 604 is entered.
If the frame is a voiced signal, the parameter decoder unit 602 decodes the pitch period and the LTP coefficients and inputs them to the periodic-signal LTP synthesizer unit 604 to generate the low-frequency voiced excitation; the low-frequency unvoiced excitation and the low-frequency voiced excitation are added to obtain the complete low-frequency excitation. The low-frequency complete excitation finally enters the LPC synthesizer unit 605 to obtain the final low-frequency time-domain signal.
If the frame is an unvoiced signal, the periodic-signal synthesizer unit 604 is skipped and the excitation goes directly to the LPC synthesizer unit 605 to generate the low-frequency time-domain signal. Whether the frame is an unvoiced or a voiced signal is determined by the voiced/unvoiced decision parameter of the SILK encoder.
In the embodiment of the present invention, the SILK low-frequency decoder unit 612 works on the same principle as the SILK decoder.
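The way the low-frequency complete excitation is assembled before LPC synthesis can be sketched as below. The function name is hypothetical, and the generation of the two input excitations (units 603 and 604) is outside the sketch.

```python
def low_band_excitation(unvoiced_exc, voiced_exc, is_voiced):
    """Return the low-frequency complete excitation for one frame.

    Voiced frame: unvoiced excitation plus LTP (voiced) excitation.
    Unvoiced frame: the LTP synthesizer is skipped, so only the unvoiced
    excitation is passed on to LPC synthesis.
    """
    if not is_voiced:
        return list(unvoiced_exc)
    if len(unvoiced_exc) != len(voiced_exc):
        raise ValueError("both excitations must have the same length")
    return [u + v for u, v in zip(unvoiced_exc, voiced_exc)]
```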
The process of decoding the high-frequency parameters to obtain the high-frequency time-domain signal is:
5022: According to the modulation frequency in the high-frequency parameters, judge whether the current audio frame has a harmonic structure.
Whether the audio data has a harmonic structure is judged from the modulation frequency parameter among the high-frequency parameters obtained by the parameter decoder unit 602. When the modulation frequency is between 0 and 2 kHz, it is determined that no harmonic structure exists and step 5024 is performed; when the modulation frequency is between 2 and 4 kHz, it is determined that a harmonic structure exists and step 5023 is performed.
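The decoder-side decision mirrors the encoder rule, since the modulation frequency is half of the cut-off frequency. A minimal sketch, in which the function name is hypothetical and the handling of exactly 2 kHz is an assumption:

```python
def choose_decoder_step(modulation_hz):
    """Return 5023 (harmonic structure) or 5024 (no harmonic structure).

    Assumptions: modulation_hz lies in [0, 4000]; exactly 2000 Hz is
    treated as "harmonic structure present".
    """
    if not 0.0 <= modulation_hz <= 4000.0:
        raise ValueError("modulation frequency must lie between 0 and 4 kHz")
    return 5023 if modulation_hz >= 2000.0 else 5024
```

No separate voiced/unvoiced flag is needed here if unvoiced frames are encoded with a modulation frequency below 2 kHz, which is an assumption about the encoder rather than something the text states.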
5023: When the current audio frame has a harmonic structure, obtain the low-frequency complete excitation and the low-frequency unvoiced excitation generated by the SILK decoder while decoding the low-frequency parameters, and calculate the simulated high-frequency excitation from the low-frequency complete excitation, the low-frequency unvoiced excitation and the modulation frequency.
Specifically, the process of step 5023 may be:
50231: According to the modulation frequency, perform spectrum translation on the low-frequency complete excitation to obtain a full-band excitation, and high-pass filter the full-band excitation to obtain a fourth high-frequency excitation, the fourth high-frequency excitation carrying a harmonic structure.
The low-frequency complete excitation output by the LTP synthesizer unit 604 is input to the spectrum translation unit 608 in the high-frequency decoder unit 613. The modulation frequency parameter among the high-frequency parameters obtained by the parameter decoder unit 602 is also input to the spectrum translation unit 608 in the high-frequency decoder unit 613. The full-band excitation obtained in the spectrum translation unit 608 enters the high-pass filter unit 609 to obtain the fourth high-frequency excitation. The calculation in this step is the same as in step 30241 of Embodiment 2 and is not repeated here.
50232: Perform spectrum folding and delay alignment on the low-frequency unvoiced excitation to obtain a fifth high-frequency excitation, the fifth high-frequency excitation carrying no harmonic structure.
The low-frequency unvoiced excitation output by the unvoiced excitation generator unit 603 is input to the spectrum folding unit 606 and the delay alignment unit 607 in the high-frequency decoder unit 613. The calculation is the same as in step 30242 of Embodiment 2 and is not repeated here.
50233: According to a preset mixing coefficient for the fourth high-frequency excitation and a preset mixing coefficient for the fifth high-frequency excitation, calculate the simulated high-frequency excitation as a weighted mix of the fourth high-frequency excitation and the fifth high-frequency excitation.
The calculation in this step is the same as in step 30243 of Embodiment 2 and is not repeated here.
5024: When the current audio frame has no harmonic structure, obtain the low-frequency unvoiced excitation generated by the SILK decoder while decoding the low-frequency parameters, perform spectrum folding and delay alignment on the low-frequency unvoiced excitation to obtain a sixth high-frequency excitation, and take the sixth high-frequency excitation as the simulated high-frequency excitation.
The calculation is the same as in step 3025 of Embodiment 2 and is not repeated here.
5025: Input the high-frequency LPC coefficients and the gain adjustment ratio in the high-frequency parameters, together with the simulated high-frequency excitation, into an LPC synthesizer, and output the synthesized high-frequency time-domain signal.
The high-frequency LPC coefficients and the gain adjustment ratio among the high-frequency parameters obtained by the parameter decoder unit 602, together with the simulated high-frequency excitation calculated in step 5023 or step 5024, are input to the LPC synthesizer unit 610 to synthesize the high-frequency time-domain signal.
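High-band synthesis can be sketched as an all-pole filter driven by the gain-scaled simulated excitation. The function name is hypothetical; applying the gain to the excitation before filtering follows the statement that the ratio is applied to the excitation samples, while starting each frame from zero filter state and receiving the coefficients already in LPC form are simplifying assumptions.

```python
def lpc_synthesize(excitation, lpc, gain):
    """s(n) = gain * u(n) + sum_i a_i * s(n - i), with zero initial state.

    excitation: simulated high-frequency excitation u(n)
    lpc:        prediction coefficients [a_1, ..., a_P]
    gain:       gain adjustment ratio decoded from the bit stream
    """
    out = []
    for n, u in enumerate(excitation):
        # Feedback from previously synthesized samples of this frame.
        pred = sum(lpc[i - 1] * out[n - i]
                   for i in range(1, len(lpc) + 1) if n - i >= 0)
        out.append(gain * u + pred)
    return out
```

This is the inverse of the residual computation at the encoder: filtering the true excitation with the same coefficients and unit gain reproduces the analyzed signal.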
503: Synthesize the low-frequency time-domain signal and the high-frequency time-domain signal into a full-band time-domain signal with a QMF synthesizer, the full-band time-domain signal being the decoded audio data of the current audio frame.
The embodiment of the present invention decodes audio data in which the low-frequency and high-frequency signals were encoded separately by decoding the low-frequency and high-frequency signals separately. The low-frequency parameters are decoded on their own by the SILK decoder, more bit resources are allocated to the low-frequency signal, and the harmonic structure in the high-frequency parameters is preserved, so that better listening quality is obtained at the same bit rate setting.
Embodiment five
The embodiment of the present invention provides a subband encoding device based on the SILK codec; see Fig. 7. The device includes:
a first acquisition module 701, configured to obtain the full-band time-domain signal corresponding to the current audio frame, and to decompose the full-band time-domain signal into a low-frequency time-domain signal and a high-frequency time-domain signal;
an encoding module 702, configured to perform SILK encoding on the low-frequency time-domain signal to generate the low-frequency parameters corresponding to the low-frequency time-domain signal, and to encode the high-frequency time-domain signal according to the low-frequency parameters to generate the high-frequency parameters corresponding to the high-frequency time-domain signal;
a generation module 703, configured to quantize and compress the low-frequency parameters and the high-frequency parameters to generate the bit stream corresponding to the current audio frame.
The encoding module 702 includes:
a first computing unit, configured to, when the current audio frame is a voiced frame, convert the full-band time-domain signal into a full-band frequency-domain signal, and input the pitch period obtained during SILK encoding of the low-frequency time-domain signal, together with the full-band frequency-domain signal, into a harmonic structure analyzer to calculate the cut-off frequency of the harmonic structure;
a first judging unit, configured to judge, according to the cut-off frequency of the harmonic structure, whether the high-frequency time-domain signal has a harmonic structure;
a second computing unit, configured to, when the high-frequency time-domain signal has a harmonic structure, calculate the simulated high-frequency excitation from the low-frequency complete excitation and the low-frequency unvoiced excitation obtained during SILK encoding of the low-frequency time-domain signal, and from the modulation frequency calculated from the cut-off frequency of the harmonic structure;
a third computing unit, configured to input the high-frequency time-domain signal into a linear prediction coefficient (LPC) analyzer to calculate the true high-frequency excitation and the high-frequency line spectral pair (LSP) coefficients, and to calculate the gain adjustment ratio from the simulated high-frequency excitation and the true high-frequency excitation;
a determining unit, configured to determine the modulation frequency, the gain adjustment ratio and the high-frequency LSP coefficients as the high-frequency parameters corresponding to the high-frequency time-domain signal.
The second computing unit includes:
a first processing subunit, configured to perform spectrum translation on the low-frequency complete excitation according to the modulation frequency to obtain a full-band excitation, and to high-pass filter the full-band excitation to obtain a first high-frequency excitation, the first high-frequency excitation carrying a harmonic structure;
a second processing subunit, configured to perform spectrum folding and delay alignment on the low-frequency unvoiced excitation to obtain a second high-frequency excitation, the second high-frequency excitation carrying no harmonic structure;
a first computation subunit, configured to calculate the simulated high-frequency excitation as a weighted mix of the first high-frequency excitation and the second high-frequency excitation, according to a preset mixing coefficient for the first high-frequency excitation and a preset mixing coefficient for the second high-frequency excitation.
The encoding module 702 further includes:
a fourth computing unit, configured to, when the high-frequency time-domain signal has no harmonic structure or the current audio frame is an unvoiced frame, perform spectrum folding and delay alignment on the low-frequency unvoiced excitation obtained during SILK encoding of the low-frequency time-domain signal to obtain a third high-frequency excitation, and to take the third high-frequency excitation as the simulated high-frequency excitation.
The embodiment of the present invention encodes the low-frequency signal with the SILK encoder and encodes the high-frequency signal separately, allocating more bit resources to the low-frequency signal and relatively few bit resources to the high-frequency signal, thereby achieving a more reasonable allocation of bit resources. This effectively improves coding efficiency and preserves the harmonic structure of the high-frequency signal, so that better listening quality is obtained at the same bit rate setting.
Embodiment six
The embodiment of the present invention provides a subband decoding device based on the SILK codec; see Fig. 8. The device includes:
a second acquisition module 801, configured to obtain the bit stream corresponding to the current audio frame, and to decode the bit stream with a parameter decoder to obtain the low-frequency parameters and the high-frequency parameters;
a decoding module 802, configured to decode the low-frequency parameters with the SILK decoder to obtain the low-frequency time-domain signal, and to decode the high-frequency parameters, according to the intermediate parameters generated by the SILK decoder while decoding the low-frequency parameters, to obtain the high-frequency time-domain signal;
a synthesis module 803, configured to synthesize the low-frequency time-domain signal and the high-frequency time-domain signal into a full-band time-domain signal with a QMF synthesizer, the full-band time-domain signal being the decoded audio data of the current audio frame.
The decoding module 802 includes:
a second judging unit, configured to judge, according to the modulation frequency in the high-frequency parameters, whether the current audio frame has a harmonic structure;
a fifth computing unit, configured to, when the current audio frame has a harmonic structure, obtain the low-frequency complete excitation and the low-frequency unvoiced excitation generated by the SILK decoder while decoding the low-frequency parameters, and to calculate the simulated high-frequency excitation from the low-frequency complete excitation, the low-frequency unvoiced excitation and the modulation frequency;
a synthesis unit, configured to input the high-frequency LPC coefficients and the gain adjustment ratio in the high-frequency parameters, together with the simulated high-frequency excitation, into an LPC synthesizer, and to output the synthesized high-frequency time-domain signal.
The fifth computing unit includes:
a third processing subunit, configured to perform spectrum translation on the low-frequency complete excitation according to the modulation frequency to obtain a full-band excitation, and to high-pass filter the full-band excitation to obtain a fourth high-frequency excitation, the fourth high-frequency excitation carrying a harmonic structure;
a fourth processing subunit, configured to perform spectrum folding and delay alignment on the low-frequency unvoiced excitation to obtain a fifth high-frequency excitation, the fifth high-frequency excitation carrying no harmonic structure;
a second computation subunit, configured to calculate the simulated high-frequency excitation as a weighted mix of the fourth high-frequency excitation and the fifth high-frequency excitation, according to a preset mixing coefficient for the fourth high-frequency excitation and a preset mixing coefficient for the fifth high-frequency excitation.
The decoding module 802 further includes:
a sixth computing unit, configured to, when the current audio frame has no harmonic structure, obtain the low-frequency unvoiced excitation generated by the SILK decoder while decoding the low-frequency parameters, perform spectrum folding and delay alignment on the low-frequency unvoiced excitation to obtain a sixth high-frequency excitation, and take the sixth high-frequency excitation as the simulated high-frequency excitation.
The embodiment of the present invention decodes audio data in which the low-frequency and high-frequency signals were encoded separately by decoding the low-frequency and high-frequency signals separately. The low-frequency parameters are decoded on their own by the SILK decoder, more bit resources are allocated to the low-frequency signal, and the harmonic structure in the high-frequency parameters is preserved, so that better listening quality is obtained at the same bit rate setting.
The serial numbers of the above embodiments of the present invention are for description only and do not represent the relative merits of the embodiments.
A person of ordinary skill in the art will understand that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware; the program may be stored in a computer-readable storage medium, and the storage medium may be a read-only memory, a magnetic disk, an optical disc or the like.
The foregoing are merely preferred embodiments of the present invention and are not intended to limit the present invention. Any modification, equivalent replacement or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.