CN103714822B - Sub-band coding and decoding method and device based on SILK coder decoder - Google Patents

Info

Publication number
CN103714822B
Authority
CN
China
Prior art keywords
frequency
high frequency
time
low
domain signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310740505.7A
Other languages
Chinese (zh)
Other versions
CN103714822A (en)
Inventor
陈若非
高泽华
邢世义
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Cubesili Information Technology Co Ltd
Original Assignee
Guangzhou Huaduo Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Huaduo Network Technology Co Ltd
Priority to CN201310740505.7A
Publication of CN103714822A
Application granted
Publication of CN103714822B
Legal status: Active
Anticipated expiration

Landscapes

  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention discloses a sub-band coding and decoding method and device based on the SILK codec, and belongs to the field of audio coding and decoding. The method comprises: obtaining the full-band time-domain signal corresponding to a current audio frame; dividing the full-band time-domain signal into a low-frequency time-domain signal and a high-frequency time-domain signal; applying SILK coding to the low-frequency time-domain signal to generate the corresponding low-frequency parameters; coding the high-frequency time-domain signal according to the low-frequency parameters to generate the corresponding high-frequency parameters; and quantizing and compressing the low-frequency and high-frequency parameters to generate the bit stream corresponding to the current audio frame. More bits are allocated to the low-frequency signal while the high-frequency signal is coded with relatively few bits, which yields a more reasonable bit allocation. Coding efficiency is effectively improved, the harmonic structure in the high-frequency signal is preserved, and a better listening experience is achieved at the same bit rate.

Description

Sub-band coding and decoding method and device based on the SILK codec
Technical field
The present invention relates to the field of audio coding and decoding, and in particular to a sub-band coding and decoding method and device based on the SILK codec.
Background art
With the development of the Internet, the demand for voice communication keeps growing. VoIP (Voice over Internet Protocol) technology, which is based on voice packet switching, is increasingly favored by users for its low cost, easy extensibility, and excellent speech quality.
The mainstream coding scheme in VoIP technology is SILK coding. It works as follows: the encoder models the speech signal and, using the speech model, decomposes the signal into different system parameters; these parameters are transmitted over the channel to the decoder; the decoder recovers the parameters and then reconstructs the speech signal according to the same speech model.
In the course of implementing the present invention, the inventors found that the prior art has at least the following problem:
In human speech, the high-frequency signal is usually less rich than the low-frequency signal, yet the SILK encoder processes the low-frequency and high-frequency signals according to a preset bit allocation. The bit allocation is therefore not efficient enough when coding wideband speech, which reduces the efficiency of low-frequency signal processing.
Summary of the invention
To solve the problem in the prior art, embodiments of the present invention provide a sub-band coding and decoding method and device based on the SILK codec. The technical solutions are as follows:
In one aspect, a sub-band coding method based on the SILK codec is provided, the method comprising:
Obtaining the full-band time-domain signal corresponding to a current audio frame, and decomposing the full-band time-domain signal into a low-frequency time-domain signal and a high-frequency time-domain signal;
Applying SILK coding to the low-frequency time-domain signal to generate the low-frequency parameters corresponding to the low-frequency time-domain signal, and coding the high-frequency time-domain signal according to the low-frequency parameters to generate the high-frequency parameters corresponding to the high-frequency time-domain signal;
Quantizing and compressing the low-frequency parameters and the high-frequency parameters to generate the bit stream corresponding to the current audio frame.
Preferably, coding the high-frequency time-domain signal according to the full-band time-domain signal and the high-frequency time-domain signal to generate the high-frequency parameters corresponding to the high-frequency time-domain signal comprises:
When the current audio frame is a voiced frame, converting the full-band time-domain signal into a full-band frequency-domain signal, taking the pitch period obtained when the low-frequency time-domain signal is SILK coded, and inputting the pitch period and the full-band frequency-domain signal into a harmonic structure analyzer to calculate the cut-off frequency of the harmonic structure;
Determining, according to the cut-off frequency of the harmonic structure, whether a harmonic structure exists in the high-frequency time-domain signal;
When a harmonic structure exists in the high-frequency time-domain signal, calculating a simulated high-frequency excitation from the full low-frequency excitation and the low-frequency unvoiced excitation obtained when the low-frequency time-domain signal is SILK coded, and from the modulation frequency calculated from the cut-off frequency of the harmonic structure;
Inputting the high-frequency time-domain signal into a linear prediction coefficient (LPC) analyzer to calculate the true high-frequency excitation and the high-frequency line spectral pair (LSP) coefficients, and calculating a gain adjustment ratio from the simulated high-frequency excitation and the true high-frequency excitation;
Determining the modulation frequency, the gain adjustment ratio, and the high-frequency LSP coefficients as the high-frequency parameters corresponding to the high-frequency time-domain signal.
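The text states that the gain adjustment ratio is calculated from the simulated and the true high-frequency excitation, but gives no formula. The sketch below shows one plausible reading, an RMS energy ratio; the function name `gain_adjustment_ratio`, the `eps` guard, and the RMS definition itself are assumptions for illustration and are not taken from the patent.

```python
import math


def gain_adjustment_ratio(simulated_exc, true_exc, eps=1e-12):
    """Return sqrt(E_true / E_sim) for one frame.

    Assumption: the ratio is an RMS energy ratio. The patent only says it
    is derived from the simulated and the true high-frequency excitation.
    """
    e_sim = sum(x * x for x in simulated_exc)
    e_true = sum(x * x for x in true_exc)
    # eps avoids division by zero for an all-zero simulated excitation.
    return math.sqrt(e_true / (e_sim + eps))
```

Under this reading, scaling the simulated excitation by the returned ratio would bring its energy close to that of the true high-frequency excitation.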
Preferably, calculating the simulated high-frequency excitation from the full low-frequency excitation and the low-frequency unvoiced excitation obtained when the low-frequency time-domain signal is SILK coded, and from the modulation frequency calculated from the cut-off frequency of the harmonic structure, comprises:
Performing spectral translation on the full low-frequency excitation according to the modulation frequency to obtain a full-band excitation, and high-pass filtering the full-band excitation to obtain a first high-frequency excitation, the first high-frequency excitation carrying a harmonic structure;
Performing spectral folding and delay alignment on the low-frequency unvoiced excitation to obtain a second high-frequency excitation, the second high-frequency excitation carrying no harmonic structure;
Performing a weighted mix of the first high-frequency excitation and the second high-frequency excitation, according to the preset mixing coefficient corresponding to the first high-frequency excitation and the preset mixing coefficient corresponding to the second high-frequency excitation, to calculate the simulated high-frequency excitation.
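The three operations above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patented implementation: spectral translation is modeled as cosine modulation, spectral folding as multiplication by (-1)^n, and the high-pass filter and delay alignment are omitted. All function names and the mixing weights are hypothetical.

```python
import math


def spectral_shift(x, f_mod_hz, fs_hz):
    """Translate the spectrum of x by cosine modulation (assumed technique)."""
    return [v * math.cos(2.0 * math.pi * f_mod_hz * n / fs_hz)
            for n, v in enumerate(x)]


def spectral_fold(x):
    """Mirror the spectrum around fs/4 by multiplying sample n by (-1)^n."""
    return [v if n % 2 == 0 else -v for n, v in enumerate(x)]


def mix_excitations(first_exc, second_exc, w_first, w_second):
    """Weighted sum of the harmonic and the noise-like excitation."""
    if len(first_exc) != len(second_exc):
        raise ValueError("excitations must have the same length")
    return [w_first * a + w_second * b
            for a, b in zip(first_exc, second_exc)]
```

A caller would pass the shifted (and, in a full implementation, high-pass filtered) full low-frequency excitation as `first_exc` and the folded unvoiced excitation as `second_exc`.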
Preferably, after determining whether a harmonic structure exists, the method further comprises:
When no harmonic structure exists in the high-frequency time-domain signal, or the current audio frame is an unvoiced frame, performing spectral folding and delay alignment on the low-frequency unvoiced excitation obtained when the low-frequency time-domain signal is SILK coded to obtain a third high-frequency excitation, and determining the third high-frequency excitation as the simulated high-frequency excitation.
In another aspect, a sub-band decoding method based on the SILK codec is provided, the method comprising:
Obtaining the bit stream corresponding to a current audio frame, and decoding the bit stream with a parameter decoder to obtain low-frequency parameters and high-frequency parameters;
Decoding the low-frequency parameters with a SILK decoder to obtain a low-frequency time-domain signal, and decoding the high-frequency parameters according to the intermediate parameters generated when the SILK decoder decodes the low-frequency parameters, to obtain a high-frequency time-domain signal;
Synthesizing the low-frequency time-domain signal and the high-frequency time-domain signal into a full-band time-domain signal with a QMF synthesizer, the full-band time-domain signal being the decoded audio data of the current audio frame.
Preferably, decoding the high-frequency parameters according to the intermediate parameters generated when the SILK decoder decodes the low-frequency parameters, to obtain the high-frequency time-domain signal, comprises:
Determining, according to the modulation frequency in the high-frequency parameters, whether a harmonic structure exists in the current audio frame;
When a harmonic structure exists in the audio data, obtaining the full low-frequency excitation and the low-frequency unvoiced excitation generated when the SILK decoder decodes the low-frequency parameters, and calculating a simulated high-frequency excitation from the full low-frequency excitation, the low-frequency unvoiced excitation, and the modulation frequency;
Inputting the high-frequency LPC coefficients and the gain adjustment ratio in the high-frequency parameters, together with the simulated high-frequency excitation, into an LPC synthesizer, and outputting the synthesized high-frequency time-domain signal.
Preferably, calculating the simulated high-frequency excitation from the full low-frequency excitation, the low-frequency unvoiced excitation, and the modulation frequency comprises:
Performing spectral translation on the full low-frequency excitation according to the modulation frequency to obtain a full-band excitation, and high-pass filtering the full-band excitation to obtain a fourth high-frequency excitation, the fourth high-frequency excitation carrying a harmonic structure;
Performing spectral folding and delay alignment on the low-frequency unvoiced excitation to obtain a fifth high-frequency excitation, the fifth high-frequency excitation carrying no harmonic structure;
Performing a weighted mix of the fourth high-frequency excitation and the fifth high-frequency excitation, according to the preset mixing coefficient corresponding to the fourth high-frequency excitation and the preset mixing coefficient corresponding to the fifth high-frequency excitation, to calculate the simulated high-frequency excitation.
Preferably, after determining whether a harmonic structure exists in the current audio frame, the method further comprises:
When no harmonic structure exists in the audio data, obtaining the low-frequency unvoiced excitation generated when the SILK decoder decodes the low-frequency parameters, performing spectral folding and delay alignment on it to obtain a sixth high-frequency excitation, and determining the sixth high-frequency excitation as the simulated high-frequency excitation.
In another aspect, a sub-band coding device based on the SILK codec is provided, the device comprising:
A first acquisition module, configured to obtain the full-band time-domain signal corresponding to a current audio frame, and to decompose the full-band time-domain signal into a low-frequency time-domain signal and a high-frequency time-domain signal;
A coding module, configured to apply SILK coding to the low-frequency time-domain signal to generate the low-frequency parameters corresponding to the low-frequency time-domain signal, and to code the high-frequency time-domain signal according to the low-frequency parameters to generate the high-frequency parameters corresponding to the high-frequency time-domain signal;
A generation module, configured to quantize and compress the low-frequency parameters and the high-frequency parameters to generate the bit stream corresponding to the current audio frame.
Preferably, the coding module comprises:
A first computing unit, configured to, when the current audio frame is a voiced frame, convert the full-band time-domain signal into a full-band frequency-domain signal, take the pitch period obtained when the low-frequency time-domain signal is SILK coded, and input the pitch period and the full-band frequency-domain signal into a harmonic structure analyzer to calculate the cut-off frequency of the harmonic structure;
A first judging unit, configured to determine, according to the cut-off frequency of the harmonic structure, whether a harmonic structure exists in the high-frequency time-domain signal;
A second computing unit, configured to, when a harmonic structure exists in the high-frequency time-domain signal, calculate a simulated high-frequency excitation from the full low-frequency excitation and the low-frequency unvoiced excitation obtained when the low-frequency time-domain signal is SILK coded, and from the modulation frequency calculated from the cut-off frequency of the harmonic structure;
A third computing unit, configured to input the high-frequency time-domain signal into a linear prediction coefficient (LPC) analyzer to calculate the true high-frequency excitation and the high-frequency line spectral pair (LSP) coefficients, and to calculate a gain adjustment ratio from the simulated high-frequency excitation and the true high-frequency excitation;
A determining unit, configured to determine the modulation frequency, the gain adjustment ratio, and the high-frequency LSP coefficients as the high-frequency parameters corresponding to the high-frequency time-domain signal.
Preferably, the second computing unit comprises:
A first processing subunit, configured to perform spectral translation on the full low-frequency excitation according to the modulation frequency to obtain a full-band excitation, and to high-pass filter the full-band excitation to obtain a first high-frequency excitation, the first high-frequency excitation carrying a harmonic structure;
A second processing subunit, configured to perform spectral folding and delay alignment on the low-frequency unvoiced excitation to obtain a second high-frequency excitation, the second high-frequency excitation carrying no harmonic structure;
A first computation subunit, configured to perform a weighted mix of the first high-frequency excitation and the second high-frequency excitation, according to the preset mixing coefficient corresponding to the first high-frequency excitation and the preset mixing coefficient corresponding to the second high-frequency excitation, to calculate the simulated high-frequency excitation.
Preferably, the coding module further comprises:
A fourth computing unit, configured to, when no harmonic structure exists in the high-frequency time-domain signal or the current audio frame is an unvoiced frame, perform spectral folding and delay alignment on the low-frequency unvoiced excitation obtained when the low-frequency time-domain signal is SILK coded to obtain a third high-frequency excitation, and to determine the third high-frequency excitation as the simulated high-frequency excitation.
In another aspect, a sub-band decoding device based on the SILK codec is provided, the device comprising:
A second acquisition module, configured to obtain the bit stream corresponding to a current audio frame, and to decode the bit stream with a parameter decoder to obtain low-frequency parameters and high-frequency parameters;
A decoding module, configured to decode the low-frequency parameters with a SILK decoder to obtain a low-frequency time-domain signal, and to decode the high-frequency parameters according to the intermediate parameters generated when the SILK decoder decodes the low-frequency parameters, to obtain a high-frequency time-domain signal;
A synthesis module, configured to synthesize the low-frequency time-domain signal and the high-frequency time-domain signal into a full-band time-domain signal with a QMF synthesizer, the full-band time-domain signal being the decoded audio data of the current audio frame.
Preferably, the decoding module comprises:
A second judging unit, configured to determine, according to the modulation frequency in the high-frequency parameters, whether a harmonic structure exists in the current audio frame;
A fifth computing unit, configured to, when a harmonic structure exists in the audio data, obtain the full low-frequency excitation and the low-frequency unvoiced excitation generated when the SILK decoder decodes the low-frequency parameters, and to calculate a simulated high-frequency excitation from the full low-frequency excitation, the low-frequency unvoiced excitation, and the modulation frequency;
A synthesis unit, configured to input the high-frequency LPC coefficients and the gain adjustment ratio in the high-frequency parameters, together with the simulated high-frequency excitation, into an LPC synthesizer, and to output the synthesized high-frequency time-domain signal.
Preferably, the fifth computing unit comprises:
A third processing subunit, configured to perform spectral translation on the full low-frequency excitation according to the modulation frequency to obtain a full-band excitation, and to high-pass filter the full-band excitation to obtain a fourth high-frequency excitation, the fourth high-frequency excitation carrying a harmonic structure;
A fourth processing subunit, configured to perform spectral folding and delay alignment on the low-frequency unvoiced excitation to obtain a fifth high-frequency excitation, the fifth high-frequency excitation carrying no harmonic structure;
A second computation subunit, configured to perform a weighted mix of the fourth high-frequency excitation and the fifth high-frequency excitation, according to the preset mixing coefficient corresponding to the fourth high-frequency excitation and the preset mixing coefficient corresponding to the fifth high-frequency excitation, to calculate the simulated high-frequency excitation.
Preferably, the decoding module further comprises:
A sixth computing unit, configured to, when no harmonic structure exists in the audio data, obtain the low-frequency unvoiced excitation generated when the SILK decoder decodes the low-frequency parameters, perform spectral folding and delay alignment on it to obtain a sixth high-frequency excitation, and determine the sixth high-frequency excitation as the simulated high-frequency excitation.
The technical solutions provided by the embodiments of the present invention have the following beneficial effects:
The low-frequency signal is coded by the SILK encoder and the high-frequency signal is coded separately, so that more bits are allocated to the low-frequency signal and relatively few bits are used for the high-frequency signal, which yields a more reasonable bit allocation. Coding efficiency is effectively improved and the harmonic structure in the high-frequency signal is preserved, so a better listening experience is achieved at the same bit rate.
Brief description of the drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flow chart of the sub-band coding method based on the SILK codec provided by embodiment one of the present invention;
Fig. 2 is a flow chart of the sub-band decoding method based on the SILK codec provided by embodiment two of the present invention;
Fig. 3 is a flow chart of the sub-band coding method based on the SILK codec provided by embodiment three of the present invention;
Fig. 4 is a structure diagram of the encoder in the sub-band coding method based on the SILK codec provided by embodiment three of the present invention;
Fig. 5 is a flow chart of the sub-band coding method based on the SILK codec provided by embodiment four of the present invention;
Fig. 6 is a structure diagram of the decoder in the sub-band decoding method based on the SILK codec provided by embodiment four of the present invention;
Fig. 7 is a schematic structural diagram of the sub-band coding device based on the SILK codec provided by embodiment five of the present invention;
Fig. 8 is a schematic structural diagram of the sub-band decoding device based on the SILK codec provided by embodiment six of the present invention.
Detailed description of the embodiments
To make the objects, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the drawings.
Embodiment one
An embodiment of the present invention provides a sub-band coding method based on the SILK codec. Referring to Fig. 1, the method flow comprises:
101: Obtain the full-band time-domain signal corresponding to a current audio frame, and decompose the full-band time-domain signal into a low-frequency time-domain signal and a high-frequency time-domain signal;
102: Apply SILK coding to the low-frequency time-domain signal to generate the low-frequency parameters corresponding to the low-frequency time-domain signal, and code the high-frequency time-domain signal according to the low-frequency parameters to generate the high-frequency parameters corresponding to the high-frequency time-domain signal;
103: Quantize and compress the low-frequency parameters and the high-frequency parameters to generate the bit stream corresponding to the current audio frame.
In this embodiment, the low-frequency signal is coded by the SILK encoder and the high-frequency signal is coded separately, so that more bits are allocated to the low-frequency signal and relatively few bits are used for the high-frequency signal, which yields a more reasonable bit allocation. Coding efficiency is effectively improved and the harmonic structure in the high-frequency signal is preserved, so a better listening experience is achieved at the same bit rate.
Embodiment two
An embodiment of the present invention provides a sub-band decoding method based on the SILK codec. Referring to Fig. 2, the method flow comprises:
201: Obtain the bit stream corresponding to a current audio frame, and decode the bit stream with a parameter decoder to obtain low-frequency parameters and high-frequency parameters;
202: Decode the low-frequency parameters with a SILK decoder to obtain a low-frequency time-domain signal, and decode the high-frequency parameters according to the intermediate parameters generated when the SILK decoder decodes the low-frequency parameters, to obtain a high-frequency time-domain signal;
203: Synthesize the low-frequency time-domain signal and the high-frequency time-domain signal into a full-band time-domain signal with a QMF synthesizer, the full-band time-domain signal being the decoded audio data of the current audio frame.
In this embodiment, audio data whose low-frequency and high-frequency signals were coded separately is decoded in correspondingly separate low-frequency and high-frequency paths. The SILK decoder decodes the low-frequency parameters on their own; more bits are allocated to the low-frequency signal, and the harmonic structure carried in the high-frequency parameters is preserved, so a better listening experience is achieved at the same bit rate.
Embodiment three
An embodiment of the present invention provides a sub-band coding method based on the SILK codec; see Fig. 3. The structure of the audio encoder is shown in Fig. 4.
The method flow comprises:
301: Obtain the raw sampled digital signal through the analog-to-digital converter of the digital communication device, and frame and window it at preset time intervals to obtain full-band time-domain signals; obtain the full-band time-domain signal corresponding to the current frame of data, and decompose it into a low-frequency time-domain signal and a high-frequency time-domain signal.
The raw sampled digital signal is audio data of a certain duration; after framing, the full-band time-domain signal corresponding to each frame of data is obtained.
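The framing and windowing in step 301 can be sketched as follows. The patent does not name a window shape or a frame length, so the Hann window, the function name `frame_and_window`, and the example sizes are assumptions chosen only for illustration.

```python
import math


def frame_and_window(samples, frame_len, hop_len):
    """Split samples into frames of frame_len and apply a Hann window.

    Assumptions: Hann window, frame_len >= 2, trailing samples that do
    not fill a whole frame are dropped.
    """
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop_len):
        frames.append([samples[start + n] * window[n]
                       for n in range(frame_len)])
    return frames
```

For a 16 kHz signal, `frame_len=320` would correspond to 20 ms frames; each returned frame plays the role of one full-band time-domain signal.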
In this embodiment, the full-band time-domain signal is duplicated into two paths. One path is sent to the QMF (quadrature mirror filter) decomposer unit 401, which decomposes the full-band time-domain signal into a low-frequency time-domain signal and a high-frequency time-domain signal; the other path is sent to the FFT (fast Fourier transform) unit 402 in the encoder, which converts the full-band time-domain signal into a full-band frequency-domain signal.
Taking a wideband signal with a sampling rate of 16 kHz as an example, the full-band time-domain signal s(n) first enters the QMF decomposer unit 401.
The QMF analysis filter bank that decomposes the full-band time-domain signal consists of two symmetric 64th-order FIR (finite impulse response) filters, one low-pass and one high-pass, whose impulse responses are related as follows:
$$h_{lp}(n) = (-1)^{n} \, h_{hp}(n)$$
The QMF decomposer 401 decomposes the original signal s(n) into the low-frequency time-domain signal y_lb(n) covering 0-4 kHz and the high-frequency time-domain signal y_hb(n) covering 4-8 kHz.
The low-frequency time-domain signal y_lb(n) and the high-frequency time-domain signal y_hb(n) are calculated as follows:
$$y_{lb}(n) = \sum_{i=0}^{31} h_{lp}(i)\,\bigl[\,s(n+1+i) + s(n-i)\,\bigr]$$
$$y_{hb}(n) = \sum_{i=0}^{31} h_{hp}(i)\,\bigl[\,s(n+1+i) + s(n-i)\,\bigr]$$
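The two decomposition formulas above can be evaluated directly. The sketch below follows them as written, with no downsampling, treats samples outside the frame as zero, and derives the high-pass taps from the low-pass taps through the stated relation between the two impulse responses. The function name `qmf_analysis` and the zero-padding at the frame edges are assumptions; the actual filter taps are not given in the text, so the caller supplies them.

```python
def qmf_analysis(s, h_lp):
    """Evaluate y_lb(n) and y_hb(n) from the formulas above.

    h_lp holds the first half of the symmetric low-pass filter
    (32 taps for the 64th-order filter in the text).
    """
    # h_lp(n) = (-1)^n * h_hp(n) implies h_hp(n) = (-1)^n * h_lp(n).
    h_hp = [((-1) ** i) * c for i, c in enumerate(h_lp)]

    def sample(k):
        # Assumption: samples outside the frame are zero.
        return s[k] if 0 <= k < len(s) else 0.0

    y_lb, y_hb = [], []
    for n in range(len(s)):
        low = high = 0.0
        for i in range(len(h_lp)):
            pair = sample(n + 1 + i) + sample(n - i)
            low += h_lp[i] * pair
            high += h_hp[i] * pair
        y_lb.append(low)
        y_hb.append(high)
    return y_lb, y_hb
```

Because the filter is symmetric, each tap multiplies a pair of input samples, which halves the number of multiplications compared with a direct 64-tap convolution.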
Further, the low-frequency time-domain signal y_lb(n) enters the SILK encoder unit 403, which supports 8 kHz sampling; all low-frequency parameters are extracted using the original SILK coding scheme, then quantized, compressed, and loaded into the bit stream payload. The coding and reconstruction of the high-frequency time-domain signal y_hb(n) use the more classical source-filter model, in which the high-frequency time-domain signal is obtained by feeding a high-frequency excitation into an LPC (linear prediction coefficient) synthesizer. Under this model, high-frequency coding needs three essential elements: the high-frequency excitation signal, the high-frequency LSP (line spectral pair) coefficients, and the high-frequency gain, where the high-frequency gain is obtained by multiplying the low-frequency gain by the gain adjustment ratio.
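The source-filter reconstruction described above can be illustrated with a direct-form all-pole filter. This is a sketch under stated assumptions: the predictor is written as A(z) = 1 + a_1 z^-1 + ... + a_p z^-p, the LSP-to-LPC conversion is omitted, and the names `lpc_synthesize` and `high_band_gain` are hypothetical.

```python
def high_band_gain(low_band_gain, gain_ratio):
    """High-frequency gain = low-frequency gain * gain adjustment ratio."""
    return low_band_gain * gain_ratio


def lpc_synthesize(excitation, lpc_coeffs, gain):
    """All-pole synthesis: y(n) = gain * e(n) - sum_k a_k * y(n - k).

    Assumption: lpc_coeffs holds a_1..a_p of A(z) = 1 + sum a_k z^-k,
    and the filter state before the frame is zero.
    """
    out = []
    for n, e in enumerate(excitation):
        acc = gain * e
        for k, a in enumerate(lpc_coeffs, start=1):
            if n - k >= 0:
                acc -= a * out[n - k]
        out.append(acc)
    return out
```

In a decoder built along these lines, the excitation argument would be the simulated high-frequency excitation and the gain would come from `high_band_gain`.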
302: Apply SILK coding to the low-frequency time-domain signal to generate the low-frequency parameters corresponding to the low-frequency time-domain signal, and code the high-frequency time-domain signal according to the low-frequency parameters to generate the high-frequency parameters corresponding to the high-frequency time-domain signal.
The low-frequency time-domain signal is coded as follows:
3021: Apply SILK coding to the low-frequency time-domain signal to generate the low-frequency parameters corresponding to the low-frequency time-domain signal.
The low-frequency time-domain signal y_lb(n) is coded in the SILK encoder unit 403, generating low-frequency parameters that include, but are not limited to: irregular pulses, the pitch period and LTP (long-term prediction) coefficients, the low-frequency LPC coefficients, the voiced/unvoiced decision parameter, and the low-frequency gain parameters.
The high-frequency time-domain signal y_hb(n) may specifically be coded as follows:
3022: When the current audio frame is a voiced frame, convert the full-band time-domain signal into a full-band frequency-domain signal, take the pitch period obtained when the low-frequency time-domain signal is SILK coded, and input the pitch period and the full-band frequency-domain signal into the harmonic structure analyzer to calculate the cut-off frequency of the harmonic structure.
Meanwhile, the full-band time-domain signal is sent to the FFT (fast Fourier transform) unit 402 in the encoder, which converts the full-band time-domain signal into a full-band frequency-domain signal. Whether the frame is an unvoiced signal or a voiced signal is determined by the voiced/unvoiced decision parameter of the SILK encoder.
The full-band frequency-domain signal and the pitch period are then input to the harmonic structure analyzer module 404, which analyzes them to obtain the cut-off frequency of the harmonic structure.
The principle is as follows: the pitch period determines the positions of the harmonics on the frequency axis, and harmonic structure analyzer 404 checks, from high frequency to low frequency, the harmonic amplitude $|Y[m F_0]|$ at integer multiples of the fundamental frequency $F_0$. The cut-off frequency of the harmonic structure is determined using the preset thresholds $\delta_1$ and $\delta_2$:
$$|Y[m F_0]|^2 - |Y[(m+1) F_0]|^2 > \delta_1$$
$$|Y[(m+1) F_0]|^2 < \delta_2$$
Satisfying the first inequality indicates that a transition point has been found at which the harmonics decay markedly; satisfying the second confirms that the subsequent amplitude is too small to form a valid harmonic. Starting from low frequency, the first frequency position that satisfies both inequalities is the cut-off frequency of the harmonic structure.
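The cut-off search described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: it assumes the power spectrum is given per FFT bin, that the fundamental falls on an integer bin (`f0_bin`), that the first condition compares the harmonic at $m F_0$ with the next one at $(m+1) F_0$, and that the search runs upward from low frequency. The function name and the threshold values in the usage note are hypothetical.

```python
def harmonic_cutoff(power, f0_bin, delta1, delta2):
    """Return the FFT bin of the harmonic cut-off, or None if none is found.

    power[i] holds |Y[i]|^2 for bin i; f0_bin is the fundamental in bins
    (assumed to be an integer here purely for simplicity).
    """
    m = 1
    while (m + 1) * f0_bin < len(power):
        here = power[m * f0_bin]
        nxt = power[(m + 1) * f0_bin]
        # Condition 1: marked decay between consecutive harmonics.
        # Condition 2: the next harmonic is too weak to count as a harmonic.
        if here - nxt > delta1 and nxt < delta2:
            return m * f0_bin
        m += 1
    return None


# Toy spectrum: strong harmonics at bins 10, 20, 30, weak ones afterwards.
spectrum = [0.0] * 64
for b in (10, 20, 30):
    spectrum[b] = 100.0
for b in (40, 50, 60):
    spectrum[b] = 1.0
cutoff_bin = harmonic_cutoff(spectrum, 10, delta1=50.0, delta2=5.0)
```

In this toy spectrum the last strong harmonic sits at bin 30, so that bin is reported as the cut-off; a spectrum whose harmonics never decay yields `None`.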
3023: According to the cut-off frequency of the harmonic structure, judge whether a harmonic structure exists in the high-frequency time-domain signal.
If the cut-off frequency lies between 0 and 4 kHz, the high-frequency part has no harmonic structure, so the high-frequency excitation is obtained by modulating the low-frequency unvoiced excitation. If the cut-off frequency lies between 4 and 8 kHz, the high-frequency part still contains some harmonic structure; in this case, half of the cut-off frequency is defined as the modulation frequency and is passed to simulated high-frequency excitation generator unit 413 for further processing. Step 3024 is performed when a harmonic structure exists in the high-frequency time-domain signal; step 3025 is performed when no harmonic structure exists in the high-frequency time-domain signal or the current frame is an unvoiced frame.
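The branch between steps 3024 and 3025 can be illustrated with the following Python sketch. It is a hedged illustration only: the function name is hypothetical, a cut-off of exactly 4 kHz is assumed to count as "harmonic structure present" (the text does not say how the boundary is treated), and returning `None` as the modulation frequency on the 3025 path is likewise an assumption.

```python
def choose_high_band_path(cutoff_hz, voiced):
    """Return (step, modulation_hz) for the current frame.

    cutoff_hz is the harmonic cut-off frequency (None if the analyzer was
    not run); voiced is the voiced/unvoiced decision for the frame.
    """
    if voiced and cutoff_hz is not None and 4000.0 <= cutoff_hz <= 8000.0:
        # Harmonic structure reaches into the high band:
        # the modulation frequency is half of the cut-off frequency.
        return "3024", cutoff_hz / 2.0
    # No harmonic structure in the high band, or an unvoiced frame.
    return "3025", None
```

For example, a voiced frame with a 6 kHz cut-off takes step 3024 with a 3 kHz modulation frequency, while an unvoiced frame always takes step 3025.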
3024: When a harmonic structure exists in the high-frequency time-domain signal, calculate the simulated high-frequency excitation from the low-frequency complete excitation and the low-frequency unvoiced excitation obtained while SILK-encoding the low-frequency time-domain signal, and from the modulation frequency calculated from the cut-off frequency of the harmonic structure.
This step is performed when the cut-off frequency lies between 4 and 8 kHz. In this embodiment, the low-frequency unvoiced excitation and the low-frequency complete excitation are mixed to obtain a high-frequency excitation that has a harmonic structure, i.e., the simulated high-frequency excitation.
Half of the cut-off frequency, as the calculated modulation frequency, is passed to spectral translation unit 415 in simulated high-frequency excitation generator unit 413.
While encoding the low-frequency time-domain signal, SILK encoder unit 403 also produces the low-frequency complete excitation and the low-frequency unvoiced excitation; both signals are passed from SILK encoder unit 403 into simulated high-frequency excitation generator unit 413.
The calculation of the simulated high-frequency excitation in step 3024 may proceed as follows:
30241: According to the modulation frequency, perform spectral translation on the low-frequency complete excitation to obtain a full-band excitation, and high-pass filter the full-band excitation to obtain the first high-frequency excitation, which carries a harmonic structure.
In this step, the modulation frequency and the low-frequency complete excitation are both input into spectral translation unit 415 of simulated high-frequency excitation generator unit 413, which copies the complete excitation in the band from $0$ to $\Omega_M$ into the band from $\Omega_M$ to $2\Omega_M$ according to the following formula:
$$u_{fb}(k) = u_{lb}(k)\,\bigl(1 + \zeta \cos(\Omega_M k)\bigr)$$
where the scaling factor $\zeta \in (1, 2)$ ensures that the signal energy is accurate and $\Omega_M$ is the modulation frequency. The resulting full-band excitation $u_{fb}(k)$ is passed through high-pass filter unit 406 to obtain the first high-frequency excitation $u_{hb\_v}(k)$. Because the spectral translation is performed according to the modulation frequency, the first high-frequency excitation carries a harmonic structure.
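The spectral translation formula can be illustrated directly in Python. The sketch below is not taken from the patent: the function names are hypothetical, $\Omega_M$ is assumed to be expressed in radians per sample, and the 16 kHz sampling rate in the usage note is an assumption inferred from the 0 to 8 kHz full band. The subsequent high-pass filtering (unit 406) is not shown.

```python
import math


def hz_to_rad_per_sample(freq_hz, sample_rate_hz):
    """Convert a frequency in Hz into radians per sample."""
    return 2.0 * math.pi * freq_hz / sample_rate_hz


def spectral_translate(u_lb, omega_m, zeta=1.5):
    """Apply u_fb(k) = u_lb(k) * (1 + zeta * cos(omega_m * k)).

    u_lb is the low-frequency complete excitation, omega_m the modulation
    frequency in radians per sample, zeta the scaling factor in (1, 2).
    """
    if not 1.0 < zeta < 2.0:
        raise ValueError("zeta is expected to lie in (1, 2)")
    return [u * (1.0 + zeta * math.cos(omega_m * k))
            for k, u in enumerate(u_lb)]


# Example: a 3 kHz modulation frequency at an assumed 16 kHz sampling rate.
omega = hz_to_rad_per_sample(3000.0, 16000.0)
u_fb = spectral_translate([1.0, 0.5, -0.5, -1.0], omega)
```

Multiplying by the cosine shifts a copy of the excitation spectrum by the modulation frequency, which is why harmonics spaced at the pitch frequency reappear in the translated band.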
30242: Perform spectral folding and delay alignment on the low-frequency unvoiced excitation to obtain the second high-frequency excitation, which does not carry a harmonic structure.
The low-frequency unvoiced excitation is input into spectral folding unit 407 and delay alignment unit 408 of simulated high-frequency excitation generator unit 413 to obtain the second high-frequency excitation $u_{hb\_uv}(k)$. Delay alignment compensates for the delay introduced by the high-pass filter.
In theory, spectral folding obtains the second high-frequency excitation as follows: the low-frequency unvoiced excitation $u_{lb}(k)$ is upsampled and converted into the full-band excitation $u_{fb}(k)$ by the following formula, and the high-frequency excitation $u_{hb}(k)$ is then obtained by high-pass filtering.
$$u_{fb}(k) = u_{lb}(k)\,\bigl(1 + (-1)^k\bigr)$$
Owing to the special nature of spectral folding, the high-frequency excitation obtained by the above steps is equivalent to directly negating the low-frequency signal:
$$u_{hb}(k) = -u_{lb}(k)$$
30243: According to the preset mixing coefficient of the first high-frequency excitation and the preset mixing coefficient of the second high-frequency excitation, compute the weighted mix of the first and second high-frequency excitations to obtain the simulated high-frequency excitation.
The final simulated high-frequency excitation $u_{hb}(k)$ is mixed with the mixing coefficient $\alpha \in (0, 1)$ according to the following formula; that is, the first and second high-frequency excitations are each assigned a mixing coefficient, and the two coefficients sum to 1.
$$u_{hb}(k) = \alpha\, u_{hb\_v}(k) + (1 - \alpha)\, u_{hb\_uv}(k)$$
There are two reasons for not using the first high-frequency excitation $u_{hb\_v}(k)$ directly as the final high-frequency excitation:
1. The harmonic structure obtained by spectral translation covers only the band from $0$ to $2\Omega_M$, so some unvoiced excitation must be mixed into the band from $2\Omega_M$ to 8 kHz;
2. If different excitation generation methods were used for voiced and unvoiced frames and for different cut-off frequencies, discontinuities between consecutive frames could arise and impair the listening quality.
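The weighted mix of step 30243 is a per-sample convex combination, as the following Python sketch shows. The function name and the default value of `alpha` are hypothetical; the patent only states that the coefficient is preset and lies in (0, 1).

```python
def mix_excitations(u_hb_v, u_hb_uv, alpha=0.5):
    """Return alpha * u_hb_v(k) + (1 - alpha) * u_hb_uv(k) for every sample.

    u_hb_v is the first (harmonic) high-frequency excitation and u_hb_uv
    the second (non-harmonic) one; both must have the same length.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha is expected to lie in (0, 1)")
    if len(u_hb_v) != len(u_hb_uv):
        raise ValueError("excitations must have the same length")
    return [alpha * v + (1.0 - alpha) * uv
            for v, uv in zip(u_hb_v, u_hb_uv)]
```

Because the two weights sum to 1, mixing two excitations of similar energy keeps the result in the same range, and the same code path serves every frame, which is in line with the continuity argument given in reason 2.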
3025: When no harmonic structure exists in the high-frequency time-domain signal or the current audio frame is an unvoiced frame, perform spectral folding and delay alignment on the low-frequency unvoiced excitation obtained while SILK-encoding the low-frequency time-domain signal to obtain the third high-frequency excitation, and determine the third high-frequency excitation as the simulated high-frequency excitation.
3026: Input the high-frequency time-domain signal into the linear prediction coefficient (LPC) analyzer to calculate the true high-frequency excitation and the high-frequency line spectral pair (LSP) coefficients, and calculate the gain adjustment ratio from the simulated high-frequency excitation and the true high-frequency excitation.
The LSP coefficients of the high-frequency part are calculated directly by inputting the high-frequency time-domain signal into LPC analyzer unit 409, as follows:
First, according to the linear prediction model, the current sample $x(n)$ can be formed as a linear combination of the $P$ past samples $x(n-i)$ with weights $a_i$:
$$x(n) = \sum_{i=1}^{P} a_i\, x(n-i) + e(n) \qquad (1)$$
where $e(n)$ is the prediction error, i.e., the residual signal output by LPC analyzer unit 409, which is the true high-frequency excitation.
The prediction coefficients $\{a_1, a_2, \ldots, a_P\}$, i.e., the high-frequency LPC coefficients, can be obtained by solving the normal equations:
$$\sum_{j=1}^{P} a_j\, r(|i-j|) = r(i), \qquad i = 1, \ldots, P \qquad (2)$$
where the autocorrelation coefficients $r(i)$ are calculated as
$$r(i) = \sum_{n=0}^{N-1-i} x(n)\, x(n+i) \qquad (3)$$
The true high-frequency excitation and the high-frequency coefficients are thus calculated by first computing the autocorrelation coefficients $r(i)$ with formula (3), then computing the prediction coefficients $\{a_1, a_2, \ldots, a_P\}$, i.e., the high-frequency LPC coefficients, with formula (2), and finally computing $e(n)$, the true high-frequency excitation, with formula (1).
In practice, the linear prediction coefficients can be solved efficiently with the Levinson-Durbin recursion. In addition, because they are more robust, it is the LSP coefficients corresponding to each set of LPC coefficients that are normally quantized and transmitted.
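The analysis chain described above (autocorrelation, Levinson-Durbin recursion, residual) can be sketched in pure Python as follows. This is a minimal textbook sketch under stated assumptions, not the analyzer used in the patent: the function names are hypothetical, no windowing or bandwidth expansion is applied, and the conversion from LPC to LSP coefficients is omitted.

```python
def autocorr(x, order):
    """r(i) = sum_{n=0}^{N-1-i} x(n) * x(n+i) for i = 0..order."""
    n = len(x)
    return [sum(x[t] * x[t + i] for t in range(n - i))
            for i in range(order + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations for a_1..a_P; return (coeffs, error)."""
    a = [0.0] * (order + 1)          # a[0] is unused, a[1..order] are a_i
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:               # degenerate (e.g. all-zero) input
            break
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


def residual(x, a):
    """e(n) = x(n) - sum_{i=1}^{P} a_i * x(n-i), with x(n) = 0 for n < 0."""
    p = len(a)
    return [x[n] - sum(a[i] * x[n - 1 - i] for i in range(min(p, n)))
            for n in range(len(x))]
```

The recursion exploits the Toeplitz structure of the normal equations, so the cost grows with the square of the prediction order rather than with its cube as in a general linear solve.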
The gain adjustment ratio is calculated as follows:
The gain adjustment ratio of the high-frequency part mainly compensates for the energy difference between the high-frequency excitation generated by the system model and the true high-frequency excitation. In this embodiment, the simulated high-frequency excitation produced by simulated high-frequency excitation generator unit 413 enters root-mean-square calculator unit 410, which computes:
$$u_{rms} = \sqrt{\frac{\sum_{k=0}^{N} u_{hb}(k)^2}{N}}$$
Similarly, the high-frequency time-domain signal enters LPC analyzer unit 409 to obtain the residual signal, which is the true high-frequency excitation signal that, in the inverse operation at the decoding end, is fed to the LPC synthesizer. This signal enters root-mean-square calculator unit 412. The root-mean-square values of the simulated and true high-frequency excitations are then passed to gain ratio calculator unit 411, where the value for the true high-frequency excitation is divided by the value for the simulated high-frequency excitation and the result is limited by thresholds to obtain the gain adjustment ratio to be transmitted to the decoding end. This gain adjustment ratio is applied to all high-frequency excitation samples at the decoding end in order to match the energy of the true high-frequency signal.
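A minimal Python sketch of the gain computation follows. The function names, the clamping limits `lo` and `hi`, and the handling of an all-zero simulated excitation are assumptions made for illustration; the patent only says that the ratio is limited by thresholds. The sketch also divides by the number of samples actually present in the frame.

```python
import math


def rms(u):
    """Root mean square of the samples in u."""
    return math.sqrt(sum(s * s for s in u) / len(u))


def gain_adjustment_ratio(true_exc, sim_exc, lo=0.0, hi=8.0):
    """RMS of the true excitation over RMS of the simulated one, clamped."""
    sim = rms(sim_exc)
    if sim == 0.0:
        # Assumed fallback: avoid dividing by zero for a silent excitation.
        return hi
    return min(max(rms(true_exc) / sim, lo), hi)
```

At the decoding end, multiplying every simulated high-frequency excitation sample by this ratio brings its RMS to that of the true excitation, provided the ratio was not clipped by the limits.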
3027: Determine the modulation frequency, the gain adjustment ratio and the high-frequency LSP coefficients as the high-frequency parameters corresponding to the high-frequency time-domain signal.
In addition to the original low-frequency parameters, the encoding end of this embodiment has three further parameters: the modulation frequency, the gain adjustment ratio and the high-frequency LSP coefficients.
303: Quantize and compress the low-frequency parameters and the high-frequency parameters to generate the bit stream corresponding to the current audio frame.
In this embodiment, the low-frequency signal is encoded by the SILK encoder and the high-frequency signal is encoded separately, so that more bit resources are allocated to the low-frequency signal while the high-frequency signal is encoded with relatively few bit resources, achieving a more reasonable allocation of bit resources. This effectively improves coding efficiency and preserves the harmonic structure in the high-frequency signal, so that better listening quality is obtained at the same bit rate setting.
Embodiment four
An embodiment of the present invention provides a subband decoding method based on the SILK codec; see Fig. 5. The structure of the audio decoder is shown in Fig. 6.
The method flow includes:
501: Obtain the bit stream corresponding to the current audio frame, and decode the bit stream with the parameter decoder to obtain the low-frequency parameters and the high-frequency parameters.
The audio decoding end receives voice packet bit stream 601 and inputs it into parameter decoder unit 602 of the audio decoder, which outputs the decoding parameters required by the different modules.
The low-frequency parameters include but are not limited to: the irregular pulses, the pitch period, the LTP coefficients, the low-frequency LPC coefficients, the voiced/unvoiced decision parameter, and the low-frequency gain factors.
502: Decode the low-frequency parameters with the SILK decoder to obtain the low-frequency time-domain signal; and decode the high-frequency parameters, according to the intermediate parameters generated when the SILK decoder decodes the low-frequency parameters, to obtain the high-frequency time-domain signal.
The low-frequency parameters are decoded into the low-frequency time-domain signal as follows:
5021: Decode the low-frequency parameters with the SILK decoder to obtain the low-frequency time-domain signal.
Parameter decoder unit 602 first decodes the quantization indices of the low-frequency unvoiced excitation part, which are used to calculate the irregular pulse signal in SILK. Unvoiced excitation generator unit 603 then generates the unvoiced excitation of the low-frequency part. Next, the voiced/unvoiced decision determines whether LTP synthesizer unit 604 is entered.
If the frame is a voiced signal, parameter decoder unit 602 decodes the pitch period and the LTP coefficients and inputs them into periodic-signal LTP synthesizer unit 604 to generate the low-frequency voiced excitation; the low-frequency unvoiced excitation and the low-frequency voiced excitation are added to obtain the complete low-frequency excitation. The low-frequency complete excitation finally enters the LPC synthesizer to obtain the final low-frequency time-domain signal.
If the frame is an unvoiced signal, periodic-signal synthesizer unit 604 is skipped and the excitation goes directly into LPC synthesizer unit 605 to generate the low-frequency time-domain signal. Whether the frame is an unvoiced or a voiced signal is determined by the voiced/unvoiced decision parameter of the SILK encoder.
In this embodiment, SILK low-frequency decoder unit 612 works on the same principle as a SILK decoder.
The high-frequency parameters are decoded into the high-frequency time-domain signal as follows:
5022: According to the modulation frequency in the high-frequency parameters, judge whether a harmonic structure exists in the current audio frame.
Whether a harmonic structure exists in the audio data is judged according to the modulation frequency among the high-frequency parameters obtained from parameter decoder unit 602. When the modulation frequency lies between 0 and 2 kHz, it is determined that no harmonic structure exists, and step 5024 is performed; when the modulation frequency lies between 2 and 4 kHz, it is determined that a harmonic structure exists, and step 5023 is performed.
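The decoder-side decision of step 5022 amounts to a range check on the received modulation frequency, sketched below in Python. The function name is hypothetical, and treating exactly 2 kHz as "harmonic structure present" is an assumption, since the text does not specify the boundary.

```python
def has_harmonic_structure(modulation_hz):
    """Decide from the modulation frequency whether the frame is harmonic.

    0 to 2 kHz  -> no harmonic structure in the high band;
    2 to 4 kHz  -> harmonic structure present (boundary handling assumed).
    """
    if not 0.0 <= modulation_hz <= 4000.0:
        raise ValueError("modulation frequency outside the 0-4 kHz range")
    return modulation_hz >= 2000.0
```

The 2 to 4 kHz range corresponds to half of the 4 to 8 kHz cut-off range used at the encoding end, so no separate harmonic flag needs to be transmitted.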
5023: When a harmonic structure exists in the current audio frame, obtain the low-frequency complete excitation and the low-frequency unvoiced excitation generated when the SILK decoder decodes the low-frequency parameters, and calculate the simulated high-frequency excitation from the low-frequency complete excitation, the low-frequency unvoiced excitation and the modulation frequency.
Specifically, step 5023 may proceed as follows:
50231: According to the modulation frequency, perform spectral translation on the low-frequency complete excitation to obtain a full-band excitation, and high-pass filter the full-band excitation to obtain the fourth high-frequency excitation, which carries a harmonic structure.
The low-frequency complete excitation output by LTP synthesizer unit 604 is input into spectral translation unit 608 of high-frequency decoder unit 613, and the modulation frequency among the high-frequency parameters obtained from parameter decoder unit 602 is likewise input into spectral translation unit 608 of high-frequency decoder unit 613. The full-band excitation obtained in spectral translation unit 608 is passed through high-pass filter unit 609 to obtain the fourth high-frequency excitation. The calculation in this step is the same as in step 30241 of Embodiment 2 and is not repeated here.
50232: Perform spectral folding and delay alignment on the low-frequency unvoiced excitation to obtain the fifth high-frequency excitation, which does not carry a harmonic structure.
The low-frequency unvoiced excitation output by unvoiced excitation generator unit 603 is input into spectral folding unit 606 and delay alignment unit 607 of high-frequency decoder unit 613. The calculation is the same as in step 30242 of Embodiment 2 and is not repeated here.
50233: According to the preset mixing coefficient of the fourth high-frequency excitation and the preset mixing coefficient of the fifth high-frequency excitation, compute the weighted mix of the fourth and fifth high-frequency excitations to obtain the simulated high-frequency excitation.
The calculation in this step is the same as in step 30243 of Embodiment 2 and is not repeated here.
5024: When no harmonic structure exists in the current audio frame, obtain the low-frequency unvoiced excitation generated when the SILK decoder decodes the low-frequency parameters, perform spectral folding and delay alignment on the low-frequency unvoiced excitation to obtain the sixth high-frequency excitation, and determine the sixth high-frequency excitation as the simulated high-frequency excitation.
The calculation is the same as in step 3025 of Embodiment 2 and is not repeated here.
5025: Input the high-frequency LPC coefficients and the gain adjustment ratio in the high-frequency parameters, together with the simulated high-frequency excitation, into the LPC synthesizer, and output the synthesized high-frequency time-domain signal.
The high-frequency LPC coefficients and the gain adjustment ratio among the high-frequency parameters obtained from parameter decoder unit 602, together with the simulated high-frequency excitation calculated in step 5023, are input into LPC synthesizer unit 610 to synthesize the high-frequency time-domain signal.
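The synthesis step can be sketched as the inverse of the linear prediction model, with the gain adjustment ratio applied to each excitation sample. This Python sketch is illustrative only: the function name is hypothetical, the filter state is assumed to start at zero for each call (a real decoder would carry it across frames), and the coefficients are taken to be LPC coefficients already recovered from the transmitted parameters.

```python
def lpc_synthesize(excitation, a, gain=1.0):
    """y(n) = gain * u(n) + sum_{i=1}^{P} a_i * y(n-i), with y(n) = 0 for n < 0.

    excitation is the simulated high-frequency excitation u(n), a holds the
    high-frequency LPC coefficients a_1..a_P, gain is the gain adjustment ratio.
    """
    y = []
    for n, u in enumerate(excitation):
        past = sum(a[i] * y[n - 1 - i] for i in range(min(len(a), n)))
        y.append(gain * u + past)
    return y
```

For instance, a unit impulse excitation with a single coefficient of 0.5 and a gain of 2 produces the decaying sequence 2, 1, 0.5, which is the impulse response of the one-pole synthesis filter scaled by the gain.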
503: Synthesize the low-frequency time-domain signal and the high-frequency time-domain signal into a full-band time-domain signal with the QMF synthesizer; the full-band time-domain signal is the decoded audio data of the current audio frame.
In this embodiment, audio data in which the low-frequency and high-frequency signals were encoded separately is decoded by decoding the low-frequency and high-frequency signals separately. The low-frequency parameters are decoded on their own by the SILK decoder, more bit resources are allocated to the low-frequency signal, and the harmonic structure in the high-frequency parameters is preserved, so that better listening quality is obtained at the same bit rate setting.
Embodiment five
An embodiment of the present invention provides a subband encoding device based on the SILK codec; see Fig. 7. The device includes:
a first acquisition module 701, configured to obtain the full-band time-domain signal corresponding to the current audio frame, and to decompose the full-band time-domain signal into a low-frequency time-domain signal and a high-frequency time-domain signal;
an encoding module 702, configured to perform SILK encoding on the low-frequency time-domain signal to generate the low-frequency parameters corresponding to the low-frequency time-domain signal, and, according to the low-frequency parameters, to encode the high-frequency time-domain signal to generate the high-frequency parameters corresponding to the high-frequency time-domain signal;
a generation module 703, configured to quantize and compress the low-frequency parameters and the high-frequency parameters to generate the bit stream corresponding to the current audio frame.
The encoding module 702 includes:
a first computing unit, configured to, when the current audio frame is a voiced frame, convert the full-band time-domain signal into a full-band frequency-domain signal, take the pitch period obtained while SILK-encoding the low-frequency time-domain signal, and input the pitch period and the full-band frequency-domain signal into the harmonic structure analyzer to calculate the cut-off frequency of the harmonic structure;
a first judging unit, configured to judge, according to the cut-off frequency of the harmonic structure, whether a harmonic structure exists in the high-frequency time-domain signal;
a second computing unit, configured to, when a harmonic structure exists in the high-frequency time-domain signal, calculate the simulated high-frequency excitation from the low-frequency complete excitation and the low-frequency unvoiced excitation obtained while SILK-encoding the low-frequency time-domain signal, and from the modulation frequency calculated from the cut-off frequency of the harmonic structure;
a third computing unit, configured to input the high-frequency time-domain signal into the linear prediction coefficient (LPC) analyzer to calculate the true high-frequency excitation and the high-frequency line spectral pair (LSP) coefficients, and to calculate the gain adjustment ratio from the simulated high-frequency excitation and the true high-frequency excitation;
a determining unit, configured to determine the modulation frequency, the gain adjustment ratio and the high-frequency LSP coefficients as the high-frequency parameters corresponding to the high-frequency time-domain signal.
The second computing unit includes:
a first processing subunit, configured to perform spectral translation on the low-frequency complete excitation according to the modulation frequency to obtain a full-band excitation, and to high-pass filter the full-band excitation to obtain the first high-frequency excitation, which carries a harmonic structure;
a second processing subunit, configured to perform spectral folding and delay alignment on the low-frequency unvoiced excitation to obtain the second high-frequency excitation, which does not carry a harmonic structure;
a first computation subunit, configured to compute, according to the preset mixing coefficient of the first high-frequency excitation and the preset mixing coefficient of the second high-frequency excitation, the weighted mix of the first and second high-frequency excitations to obtain the simulated high-frequency excitation.
The encoding module 702 further includes:
a fourth computing unit, configured to, when no harmonic structure exists in the high-frequency time-domain signal or the current audio frame is an unvoiced frame, perform spectral folding and delay alignment on the low-frequency unvoiced excitation obtained while SILK-encoding the low-frequency time-domain signal to obtain the third high-frequency excitation, and to determine the third high-frequency excitation as the simulated high-frequency excitation.
In this embodiment, the low-frequency signal is encoded by the SILK encoder and the high-frequency signal is encoded separately, so that more bit resources are allocated to the low-frequency signal while the high-frequency signal is encoded with relatively few bit resources, achieving a more reasonable allocation of bit resources. This effectively improves coding efficiency and preserves the harmonic structure in the high-frequency signal, so that better listening quality is obtained at the same bit rate setting.
Embodiment six
An embodiment of the present invention provides a subband decoding device based on the SILK codec; see Fig. 8. The device includes:
a second acquisition module 801, configured to obtain the bit stream corresponding to the current audio frame, and to decode the bit stream with the parameter decoder to obtain the low-frequency parameters and the high-frequency parameters;
a decoding module 802, configured to decode the low-frequency parameters with the SILK decoder to obtain the low-frequency time-domain signal, and to decode the high-frequency parameters, according to the intermediate parameters generated when the SILK decoder decodes the low-frequency parameters, to obtain the high-frequency time-domain signal;
a synthesis module 803, configured to synthesize the low-frequency time-domain signal and the high-frequency time-domain signal into a full-band time-domain signal with the QMF synthesizer, the full-band time-domain signal being the decoded audio data of the current audio frame.
The decoding module 802 includes:
a second judging unit, configured to judge, according to the modulation frequency in the high-frequency parameters, whether a harmonic structure exists in the current audio frame;
a fifth computing unit, configured to, when a harmonic structure exists in the current audio frame, obtain the low-frequency complete excitation and the low-frequency unvoiced excitation generated when the SILK decoder decodes the low-frequency parameters, and to calculate the simulated high-frequency excitation from the low-frequency complete excitation, the low-frequency unvoiced excitation and the modulation frequency;
a synthesis unit, configured to input the high-frequency LPC coefficients and the gain adjustment ratio in the high-frequency parameters, together with the simulated high-frequency excitation, into the LPC synthesizer, and to output the synthesized high-frequency time-domain signal.
The fifth computing unit includes:
a third processing subunit, configured to perform spectral translation on the low-frequency complete excitation according to the modulation frequency to obtain a full-band excitation, and to high-pass filter the full-band excitation to obtain the fourth high-frequency excitation, which carries a harmonic structure;
a fourth processing subunit, configured to perform spectral folding and delay alignment on the low-frequency unvoiced excitation to obtain the fifth high-frequency excitation, which does not carry a harmonic structure;
a second computation subunit, configured to compute, according to the preset mixing coefficient of the fourth high-frequency excitation and the preset mixing coefficient of the fifth high-frequency excitation, the weighted mix of the fourth and fifth high-frequency excitations to obtain the simulated high-frequency excitation.
The decoding module 802 further includes:
a sixth computing unit, configured to, when no harmonic structure exists in the current audio frame, obtain the low-frequency unvoiced excitation generated when the SILK decoder decodes the low-frequency parameters, perform spectral folding and delay alignment on the low-frequency unvoiced excitation to obtain the sixth high-frequency excitation, and determine the sixth high-frequency excitation as the simulated high-frequency excitation.
In this embodiment, audio data in which the low-frequency and high-frequency signals were encoded separately is decoded by decoding the low-frequency and high-frequency signals separately. The low-frequency parameters are decoded on their own by the SILK decoder, more bit resources are allocated to the low-frequency signal, and the harmonic structure in the high-frequency parameters is preserved, so that better listening quality is obtained at the same bit rate setting.
The sequence numbers of the above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.
A person of ordinary skill in the art will understand that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, which may be a read-only memory, a magnetic disk, an optical disc, or the like.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the present invention. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (12)

1. a method of coding subband based on SILK codec, it is characterised in that described method includes:
Obtain the full range time-domain signal that current audio frame is corresponding;And described full range time-domain signal is decomposed into low-frequency time-domain signal With high frequency time-domain signal;
performing SILK encoding on the low-frequency time-domain signal to generate a low-frequency parameter corresponding to the low-frequency time-domain signal; and encoding the high-frequency time-domain signal according to the low-frequency parameter to generate a high-frequency parameter corresponding to the high-frequency time-domain signal;
quantizing and compressing the low-frequency parameter and the high-frequency parameter to generate a bit stream corresponding to the current audio frame;
wherein encoding the high-frequency time-domain signal according to the low-frequency parameter to generate the high-frequency parameter corresponding to the high-frequency time-domain signal comprises:
when the current audio frame is a voiced frame, converting the full-band time-domain signal into a full-band frequency-domain signal, obtaining the pitch period produced when SILK encoding is performed on the low-frequency time-domain signal, and inputting the pitch period and the full-band frequency-domain signal into a harmonic structure analyzer to calculate a cutoff frequency of the harmonic structure;
determining, according to the cutoff frequency of the harmonic structure, whether a harmonic structure exists in the high-frequency time-domain signal;
when a harmonic structure exists in the high-frequency time-domain signal, calculating a simulated high-frequency excitation according to the low-frequency full excitation and the low-frequency unvoiced excitation obtained when SILK encoding is performed on the low-frequency time-domain signal, and a modulation frequency calculated from the cutoff frequency of the harmonic structure;
inputting the high-frequency time-domain signal into a linear prediction coefficient (LPC) analyzer to calculate a real high-frequency excitation and high-frequency line spectral pair (LSP) coefficients, and calculating a gain adjustment ratio according to the simulated high-frequency excitation and the real high-frequency excitation;
determining the modulation frequency, the gain adjustment ratio and the high-frequency LSP coefficients as the high-frequency parameter corresponding to the high-frequency time-domain signal.
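The LPC analysis and gain step that closes claim 1 can be illustrated with a minimal sketch. The claim does not say how the gain adjustment ratio is computed, so the RMS ratio used here is an assumption, as are the function names `lpc_coefficients`, `lpc_residual` and `gain_adjustment_ratio`, the autocorrelation (Levinson-Durbin) analysis and the prediction order. Conversion of the LPC coefficients to LSP form is omitted.

```python
import math


def lpc_coefficients(x, order):
    """Autocorrelation-method LPC via Levinson-Durbin.

    Returns a = [1, a1, ..., ap] for the analysis filter
    A(z) = 1 + a1*z^-1 + ... + ap*z^-p.
    """
    r = [sum(x[n] * x[n - lag] for n in range(lag, len(x)))
         for lag in range(order + 1)]
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:          # silent or degenerate frame: stop early
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err          # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a


def lpc_residual(x, a):
    """Filter x through A(z); the output stands in for the real excitation."""
    return [sum(a[k] * x[n - k] for k in range(len(a)) if n - k >= 0)
            for n in range(len(x))]


def gain_adjustment_ratio(real_exc, simulated_exc):
    """Assumed definition: RMS of the real excitation over RMS of the simulated one."""
    e_real = sum(v * v for v in real_exc)
    e_sim = sum(v * v for v in simulated_exc)
    return math.sqrt(e_real / e_sim) if e_sim > 0.0 else 0.0
```

Under this assumed definition, scaling the simulated excitation by the returned ratio gives it the same frame energy as the real excitation, which is the role a gain parameter usually plays in bandwidth-extension schemes.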
2. The method according to claim 1, characterized in that calculating the simulated high-frequency excitation according to the low-frequency full excitation and the low-frequency unvoiced excitation obtained when SILK encoding is performed on the low-frequency time-domain signal, and the modulation frequency calculated from the cutoff frequency of the harmonic structure, comprises:
spectrally shifting the low-frequency full excitation according to the modulation frequency to obtain a full-band excitation, and high-pass filtering the full-band excitation to obtain a first high-frequency excitation, the first high-frequency excitation carrying a harmonic structure;
performing spectral folding and delay alignment on the low-frequency unvoiced excitation to obtain a second high-frequency excitation, the second high-frequency excitation carrying no harmonic structure;
weighting and mixing the first high-frequency excitation and the second high-frequency excitation, according to a preset mixing coefficient corresponding to the first high-frequency excitation and a preset mixing coefficient corresponding to the second high-frequency excitation, to obtain the simulated high-frequency excitation.
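The harmonic branch of claim 2 (spectral shift, high-pass filtering, weighted mixing) might be sketched as follows. This is only an illustration under stated assumptions: cosine modulation is used as the spectral shift, a one-pole filter stands in for the high-pass filter, both excitations are taken to be at the same sample rate already, and the default weights 0.7 and 0.3 are placeholders rather than the preset coefficients of the patent. All function names are hypothetical.

```python
import math


def shift_spectrum(x, mod_freq_hz, sample_rate_hz):
    """Shift the spectrum by cosine modulation (images appear at f +/- mod_freq)."""
    w = 2.0 * math.pi * mod_freq_hz / sample_rate_hz
    return [v * math.cos(w * n) for n, v in enumerate(x)]


def highpass(x, alpha=0.5):
    """Crude one-pole high-pass, a stand-in for the real high-pass filter."""
    y, prev_x, prev_y = [], 0.0, 0.0
    for v in x:
        prev_y = alpha * (prev_y + v - prev_x)
        prev_x = v
        y.append(prev_y)
    return y


def first_hf_excitation(full_exc, mod_freq_hz, sample_rate_hz):
    """Harmonic-carrying excitation: spectral shift followed by high-pass filtering."""
    return highpass(shift_spectrum(full_exc, mod_freq_hz, sample_rate_hz))


def mix_excitations(harmonic_exc, noise_exc, w_harmonic=0.7, w_noise=0.3):
    """Weighted sum of the harmonic and the noise-like excitation (assumed weights)."""
    return [w_harmonic * h + w_noise * u
            for h, u in zip(harmonic_exc, noise_exc)]
```

The second (noise-like) excitation is passed in ready-made here, so the mixing step stays independent of how the folding and alignment are implemented.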
3. The method according to claim 1, characterized in that, after determining whether a harmonic structure exists in the full-band time-domain signal, the method further comprises:
when no harmonic structure exists in the high-frequency time-domain signal or the current audio frame is an unvoiced frame, performing spectral folding and delay alignment on the low-frequency unvoiced excitation obtained when SILK encoding is performed on the low-frequency time-domain signal to obtain a third high-frequency excitation, and determining the third high-frequency excitation as the simulated high-frequency excitation.
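The fallback path of claim 3 uses only spectral folding and delay alignment. A minimal sketch, assuming that folding is done by multiplying the samples by (-1)^n (which mirrors the spectrum about a quarter of the sample rate) and that the delay is a known whole number of samples; the function names and the `delay` parameter are hypothetical:

```python
def fold_spectrum(x):
    """Multiply by (-1)^n, mirroring the spectrum about a quarter of the sample rate."""
    return [v if n % 2 == 0 else -v for n, v in enumerate(x)]


def delay_align(x, delay):
    """Delay x by `delay` samples, zero-padding the start and keeping the length."""
    if delay <= 0:
        return list(x)
    return ([0.0] * delay + list(x))[:len(x)]


def noise_like_hf_excitation(unvoiced_exc, delay):
    """Fold the unvoiced excitation, then align it with the other signal path."""
    return delay_align(fold_spectrum(unvoiced_exc), delay)
```

In a real codec the delay would be chosen to match the latency of the parallel filtering path, and samples pushed past the frame boundary would be carried into the next frame instead of being dropped.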
4. A sub-band decoding method based on a SILK codec, characterized in that the method comprises:
obtaining a bit stream corresponding to a current audio frame, and decoding the bit stream with a parameter decoder to obtain a low-frequency parameter and a high-frequency parameter;
decoding the low-frequency parameter with a SILK decoder to obtain a low-frequency time-domain signal; and decoding the high-frequency parameter, according to intermediate parameters generated when the SILK decoder decodes the low-frequency parameter, to obtain a high-frequency time-domain signal;
synthesizing the low-frequency time-domain signal and the high-frequency time-domain signal into a full-band time-domain signal with a QMF synthesizer, the full-band time-domain signal being the decoded audio data of the current audio frame;
wherein decoding the high-frequency parameter, according to the intermediate parameters generated when the SILK decoder decodes the low-frequency parameter, to obtain the high-frequency time-domain signal comprises:
determining, according to the modulation frequency in the high-frequency parameter, whether a harmonic structure exists in the current audio frame;
when a harmonic structure exists in the current audio frame, obtaining the low-frequency full excitation and the low-frequency unvoiced excitation generated when the SILK decoder decodes the low-frequency parameter, and calculating a simulated high-frequency excitation according to the low-frequency full excitation, the low-frequency unvoiced excitation and the modulation frequency;
inputting the high-frequency LPC coefficients and the gain adjustment ratio in the high-frequency parameter, together with the simulated high-frequency excitation, into an LPC synthesizer, and outputting the synthesized high-frequency time-domain signal.
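The final synthesis step of claim 4 can be sketched as an all-pole filter driven by the gain-scaled simulated excitation. The coefficient convention, the function name `lpc_synthesize` and the way the gain is applied (a plain multiplication of the excitation) are assumptions; filter memory across frames is left out for brevity.

```python
def lpc_synthesize(excitation, a, gain):
    """All-pole synthesis 1/A(z) driven by the gain-scaled excitation.

    a = [1, a1, ..., ap] follows A(z) = 1 + a1*z^-1 + ... + ap*z^-p.
    """
    y = []
    for n, e in enumerate(excitation):
        acc = gain * e
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]   # feedback from past output samples
        y.append(acc)
    return y
```

For example, `lpc_synthesize([1.0, 0.0, 0.0], [1.0, -0.9], 2.0)` yields a decaying sequence that starts at 2.0 and shrinks by a factor of 0.9 per sample.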
5. The method according to claim 4, characterized in that calculating the simulated high-frequency excitation according to the low-frequency full excitation, the low-frequency unvoiced excitation and the modulation frequency comprises:
spectrally shifting the low-frequency full excitation according to the modulation frequency to obtain a full-band excitation, and high-pass filtering the full-band excitation to obtain a fourth high-frequency excitation, the fourth high-frequency excitation carrying a harmonic structure;
performing spectral folding and delay alignment on the low-frequency unvoiced excitation to obtain a fifth high-frequency excitation, the fifth high-frequency excitation carrying no harmonic structure;
weighting and mixing the fourth high-frequency excitation and the fifth high-frequency excitation, according to a preset mixing coefficient corresponding to the fourth high-frequency excitation and a preset mixing coefficient corresponding to the fifth high-frequency excitation, to obtain the simulated high-frequency excitation.
6. The method according to claim 4, characterized in that, after determining whether a harmonic structure exists in the current audio frame, the method further comprises:
when no harmonic structure exists in the current audio frame, obtaining the low-frequency unvoiced excitation generated when the SILK decoder decodes the low-frequency parameter, performing spectral folding and delay alignment on the low-frequency unvoiced excitation to obtain a sixth high-frequency excitation, and determining the sixth high-frequency excitation as the simulated high-frequency excitation.
7. A sub-band encoding device based on a SILK codec, characterized in that the device comprises:
a first acquisition module, configured to obtain a full-band time-domain signal corresponding to a current audio frame, and to decompose the full-band time-domain signal into a low-frequency time-domain signal and a high-frequency time-domain signal;
an encoding module, configured to perform SILK encoding on the low-frequency time-domain signal to generate a low-frequency parameter corresponding to the low-frequency time-domain signal, and to encode the high-frequency time-domain signal according to the low-frequency parameter to generate a high-frequency parameter corresponding to the high-frequency time-domain signal;
a generation module, configured to quantize and compress the low-frequency parameter and the high-frequency parameter to generate a bit stream corresponding to the current audio frame;
wherein the encoding module comprises:
a first calculation unit, configured to, when the current audio frame is a voiced frame, convert the full-band time-domain signal into a full-band frequency-domain signal, obtain the pitch period produced when SILK encoding is performed on the low-frequency time-domain signal, and input the pitch period and the full-band frequency-domain signal into a harmonic structure analyzer to calculate a cutoff frequency of the harmonic structure;
a first judging unit, configured to determine, according to the cutoff frequency of the harmonic structure, whether a harmonic structure exists in the high-frequency time-domain signal;
a second calculation unit, configured to, when a harmonic structure exists in the high-frequency time-domain signal, calculate a simulated high-frequency excitation according to the low-frequency full excitation and the low-frequency unvoiced excitation obtained when SILK encoding is performed on the low-frequency time-domain signal, and a modulation frequency calculated from the cutoff frequency of the harmonic structure;
a third calculation unit, configured to input the high-frequency time-domain signal into a linear prediction coefficient (LPC) analyzer to calculate a real high-frequency excitation and high-frequency line spectral pair (LSP) coefficients, and to calculate a gain adjustment ratio according to the simulated high-frequency excitation and the real high-frequency excitation;
a determining unit, configured to determine the modulation frequency, the gain adjustment ratio and the high-frequency LSP coefficients as the high-frequency parameter corresponding to the high-frequency time-domain signal.
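The claims do not describe the internals of the harmonic structure analyzer used by the first calculation unit. One plausible reading, offered purely as an assumption, is to walk up the multiples of the pitch frequency and stop at the first harmonic that no longer stands out from the spectrum halfway to the next one. In the sketch below the spectrum is evaluated directly from time-domain samples, and the names `magnitude_at`, `harmonic_cutoff`, `high_band_has_harmonics`, the `ratio` and `floor` thresholds and the band-split frequency are all hypothetical.

```python
import cmath
import math


def magnitude_at(x, freq_hz, sample_rate_hz):
    """Magnitude of the spectrum of x at one frequency (direct sum, no FFT)."""
    w = -2.0 * math.pi * freq_hz / sample_rate_hz
    return abs(sum(v * cmath.exp(1j * w * n) for n, v in enumerate(x)))


def harmonic_cutoff(x, pitch_lag, sample_rate_hz, ratio=2.0, floor=1e-6):
    """Highest pitch harmonic that still stands out from its neighbourhood.

    Harmonic k*f0 counts as present when its magnitude is at least `ratio`
    times the magnitude halfway to the next harmonic (assumed criterion).
    """
    f0 = sample_rate_hz / pitch_lag
    cutoff = 0.0
    k = 1
    while (k + 0.5) * f0 < sample_rate_hz / 2.0:
        peak = magnitude_at(x, k * f0, sample_rate_hz)
        valley = magnitude_at(x, (k + 0.5) * f0, sample_rate_hz)
        if peak < ratio * max(valley, floor):
            break
        cutoff = k * f0
        k += 1
    return cutoff


def high_band_has_harmonics(cutoff_hz, split_hz):
    """The high band carries harmonics when the cutoff lies above the band split."""
    return cutoff_hz > split_hz
```

The absolute `floor` is scale dependent and would need tuning to the signal level in practice. How the modulation frequency is then derived from the cutoff is not shown, since the claims give no formula for it.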
8. The device according to claim 7, characterized in that the second calculation unit comprises:
a first processing subunit, configured to spectrally shift the low-frequency full excitation according to the modulation frequency to obtain a full-band excitation, and to high-pass filter the full-band excitation to obtain a first high-frequency excitation, the first high-frequency excitation carrying a harmonic structure;
a second processing subunit, configured to perform spectral folding and delay alignment on the low-frequency unvoiced excitation to obtain a second high-frequency excitation, the second high-frequency excitation carrying no harmonic structure;
a first calculation subunit, configured to weight and mix the first high-frequency excitation and the second high-frequency excitation, according to a preset mixing coefficient corresponding to the first high-frequency excitation and a preset mixing coefficient corresponding to the second high-frequency excitation, to obtain the simulated high-frequency excitation.
9. The device according to claim 7, characterized in that the encoding module further comprises:
a fourth calculation unit, configured to, when no harmonic structure exists in the high-frequency time-domain signal or the current audio frame is an unvoiced frame, perform spectral folding and delay alignment on the low-frequency unvoiced excitation obtained when SILK encoding is performed on the low-frequency time-domain signal to obtain a third high-frequency excitation, and to determine the third high-frequency excitation as the simulated high-frequency excitation.
10. A sub-band decoding device based on a SILK codec, characterized in that the device comprises:
a second acquisition module, configured to obtain a bit stream corresponding to a current audio frame, and to decode the bit stream with a parameter decoder to obtain a low-frequency parameter and a high-frequency parameter;
a decoding module, configured to decode the low-frequency parameter with a SILK decoder to obtain a low-frequency time-domain signal, and to decode the high-frequency parameter, according to intermediate parameters generated when the SILK decoder decodes the low-frequency parameter, to obtain a high-frequency time-domain signal;
a synthesis module, configured to synthesize the low-frequency time-domain signal and the high-frequency time-domain signal into a full-band time-domain signal with a QMF synthesizer, the full-band time-domain signal being the decoded audio data of the current audio frame;
wherein the decoding module comprises:
a second judging unit, configured to determine, according to the modulation frequency in the high-frequency parameter, whether a harmonic structure exists in the current audio frame;
a fifth calculation unit, configured to, when a harmonic structure exists in the current audio frame, obtain the low-frequency full excitation and the low-frequency unvoiced excitation generated when the SILK decoder decodes the low-frequency parameter, and to calculate a simulated high-frequency excitation according to the low-frequency full excitation, the low-frequency unvoiced excitation and the modulation frequency;
a synthesis unit, configured to input the high-frequency LPC coefficients and the gain adjustment ratio in the high-frequency parameter, together with the simulated high-frequency excitation, into an LPC synthesizer, and to output the synthesized high-frequency time-domain signal.
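The band recombination performed by the QMF synthesizer of the synthesis module can be illustrated with the shortest possible two-band filter pair. The sketch below uses the 2-tap Haar pair, chosen only because it reconstructs the input exactly in a few lines; it is an assumption and not the filter bank of the patent, which would use much longer filters for better band separation. The names `qmf_analysis` and `qmf_synthesis` are hypothetical.

```python
import math

_S = math.sqrt(0.5)


def qmf_analysis(x):
    """Split an even-length signal into half-rate low and high bands (Haar pair)."""
    low = [(x[i] + x[i + 1]) * _S for i in range(0, len(x) - 1, 2)]
    high = [(x[i] - x[i + 1]) * _S for i in range(0, len(x) - 1, 2)]
    return low, high


def qmf_synthesis(low, high):
    """Merge the two half-rate bands back into one full-rate signal."""
    out = []
    for lo, hi in zip(low, high):
        out.append((lo + hi) * _S)
        out.append((lo - hi) * _S)
    return out
```

Feeding the output of `qmf_analysis` straight into `qmf_synthesis` returns the original samples up to rounding error, which is the perfect-reconstruction property a codec relies on when the two bands are coded separately.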
11. The device according to claim 10, characterized in that the fifth calculation unit comprises:
a third processing subunit, configured to spectrally shift the low-frequency full excitation according to the modulation frequency to obtain a full-band excitation, and to high-pass filter the full-band excitation to obtain a fourth high-frequency excitation, the fourth high-frequency excitation carrying a harmonic structure;
a fourth processing subunit, configured to perform spectral folding and delay alignment on the low-frequency unvoiced excitation to obtain a fifth high-frequency excitation, the fifth high-frequency excitation carrying no harmonic structure;
a second calculation subunit, configured to weight and mix the fourth high-frequency excitation and the fifth high-frequency excitation, according to a preset mixing coefficient corresponding to the fourth high-frequency excitation and a preset mixing coefficient corresponding to the fifth high-frequency excitation, to obtain the simulated high-frequency excitation.
12. The device according to claim 10, characterized in that the decoding module further comprises:
a sixth calculation unit, configured to, when no harmonic structure exists in the current audio frame, obtain the low-frequency unvoiced excitation generated when the SILK decoder decodes the low-frequency parameter, perform spectral folding and delay alignment on the low-frequency unvoiced excitation to obtain a sixth high-frequency excitation, and determine the sixth high-frequency excitation as the simulated high-frequency excitation.
CN201310740505.7A 2013-12-27 2013-12-27 Sub-band coding and decoding method and device based on SILK coder decoder Active CN103714822B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310740505.7A CN103714822B (en) 2013-12-27 2013-12-27 Sub-band coding and decoding method and device based on SILK coder decoder

Publications (2)

Publication Number Publication Date
CN103714822A CN103714822A (en) 2014-04-09
CN103714822B true CN103714822B (en) 2017-01-11

Family

ID=50407727

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310740505.7A Active CN103714822B (en) 2013-12-27 2013-12-27 Sub-band coding and decoding method and device based on SILK coder decoder

Country Status (1)

Country Link
CN (1) CN103714822B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105047201A (en) * 2015-06-15 2015-11-11 广东顺德中山大学卡内基梅隆大学国际联合研究院 Broadband excitation signal synthesis method based on segmented expansion
CN105808651A (en) * 2016-02-29 2016-07-27 四川秘无痕信息安全技术有限责任公司 Android WeChat based silk_v3 voice file format decoding method
CN108231083A (en) * 2018-01-16 2018-06-29 重庆邮电大学 Method for improving the coding efficiency of a SILK-based speech coder
CN110085242B (en) * 2019-04-28 2021-04-16 武汉大学 SILK-based sound range self-adaptive steganography method based on minimum distortion cost
CN112767954B (en) * 2020-06-24 2024-06-14 腾讯科技(深圳)有限公司 Audio encoding and decoding method, device, medium and electronic equipment
CN113096670B (en) * 2021-03-30 2024-05-14 北京字节跳动网络技术有限公司 Audio data processing method, device, equipment and storage medium
CN114598886B (en) * 2022-05-09 2022-09-13 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Image coding method, decoding method and related devices

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5687157A (en) * 1994-07-20 1997-11-11 Sony Corporation Method of recording and reproducing digital audio signal and apparatus thereof
CN1222997A (en) * 1996-07-01 1999-07-14 松下电器产业株式会社 Audio signal coding and decoding method and audio signal coder and decoder
WO2006107840A1 (en) * 2005-04-01 2006-10-12 Qualcomm Incorporated Systems, methods, and apparatus for wideband speech coding
CN101185124A (en) * 2005-04-01 2008-05-21 高通股份有限公司 Method and apparatus for divided frequency band coding of voice signal
CN101903945A (en) * 2007-12-21 2010-12-01 松下电器产业株式会社 Encoder, decoder, and encoding method
CN101964189A (en) * 2010-04-28 2011-02-02 华为技术有限公司 Audio signal switching method and device
CN101276587B (en) * 2007-03-27 2012-02-01 北京天籁传音数字技术有限公司 Audio encoding apparatus and method thereof, audio decoding device and method thereof
CN102436820A (en) * 2010-09-29 2012-05-02 华为技术有限公司 High frequency band signal coding and decoding methods and devices
CN102473414A (en) * 2009-06-29 2012-05-23 弗兰霍菲尔运输应用研究公司 Bandwidth extension encoder, bandwidth extension decoder and phase vocoder
CN103165134A (en) * 2013-04-02 2013-06-19 武汉大学 Coding and decoding device of audio signal high frequency parameter

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8103516B2 (en) * 2005-11-30 2012-01-24 Panasonic Corporation Subband coding apparatus and method of coding subband
CN102800317B (en) * 2011-05-25 2014-09-17 华为技术有限公司 Signal classification method and equipment, and encoding and decoding methods and equipment

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
"SILK Speech Codec draft-vos-silk-02"; K. Vos, S. Jensen; 《Network Working Group》; 20100909; full text *
"基于GPGPPU的SILK语音Codec优化"; 韩怡; 《中国优秀硕士学位论文全文数据库信息科技辑》; 20111215 (No. 12); see pp. 24-38, Sections 2.5.2-2.5.3 *
"宽带语音编码技术专题讲座(四)一种适用于VOIP的开宽带语音编码算法:SILK"; 郑国宏等; 《军事通信技术》; 20120331; Vol. 33 (No. 1); full text *

Also Published As

Publication number Publication date
CN103714822A (en) 2014-04-09

Similar Documents

Publication Publication Date Title
CN103714822B (en) Sub-band coding and decoding method and device based on SILK coder decoder
CN101578657B (en) Method and apparatus for getting attenuation factor
US11721349B2 (en) Methods, encoder and decoder for linear predictive encoding and decoding of sound signals upon transition between frames having different sampling rates
CN100550712C (en) Signal processing method and processing unit
CN101542599B (en) Method, apparatus, and system for encoding and decoding broadband voice signal
CN104025189B (en) Method of encoding a speech signal, method of decoding a speech signal, and device using the same
CN102741921B (en) Improved subband block based harmonic transposition
US11594236B2 (en) Audio encoding/decoding based on an efficient representation of auto-regressive coefficients
CN104969290A (en) Method and apparatus for controlling audio frame loss concealment
KR20160087827A (en) Selective phase compensation in high band coding
JPH10307599A (en) Waveform interpolating voice coding using spline
CN110634503B (en) Method and apparatus for signal processing
CN102664003A (en) Residual excitation signal synthesis and voice conversion method based on harmonic plus noise model (HNM)
CN102201240B (en) Harmonic noise excitation model vocoder based on inverse filtering
CN105830153A (en) High-band signal modeling
CN103155034A (en) Audio signal bandwidth extension in CELP-based speech coder
JPH10319996A (en) Efficient decomposition of noise and periodic signal waveform in waveform interpolation
US20150149157A1 (en) Frequency domain gain shape estimation
CN103093757B (en) Conversion method for conversion from narrow-band code stream to wide-band code stream
Bhatt Simulation and overall comparative evaluation of performance between different techniques for high band feature extraction based on artificial bandwidth extension of speech over proposed global system for mobile full rate narrow band coder
US20060149534A1 (en) Speech coding apparatus and method therefor
CN103155035A (en) Audio signal bandwidth extension in celp-based speech coder
Alku et al. Linear predictive method for improved spectral modeling of lower frequencies of speech with small prediction orders
Huo et al. A Novel Push-To-Talk Service over Beidou-3 Satellite Navigation System
Xiao et al. Multi-mode neural speech coding based on deep generative networks

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 511446 Guangzhou City, Guangdong Province, Panyu District, South Village, Huambo Business District Wanda Plaza, block B1, floor 28

Applicant after: Guangzhou Huaduo Network Technology Co., Ltd.

Address before: 510655, Guangzhou, Whampoa Avenue, No. 2, creative industrial park, building 3-08,

Applicant before: Guangzhou Huaduo Network Technology Co., Ltd.

COR Change of bibliographic data
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20210118

Address after: 511442 3108, 79 Wanbo 2nd Road, Nancun Town, Panyu District, Guangzhou City, Guangdong Province

Patentee after: GUANGZHOU CUBESILI INFORMATION TECHNOLOGY Co.,Ltd.

Address before: 511446 28th floor, block B1, Wanda Plaza, Wanbo business district, Nancun Town, Panyu District, Guangzhou City, Guangdong Province

Patentee before: GUANGZHOU HUADUO NETWORK TECHNOLOGY Co.,Ltd.

EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20140409

Assignee: GUANGZHOU HUADUO NETWORK TECHNOLOGY Co.,Ltd.

Assignor: GUANGZHOU CUBESILI INFORMATION TECHNOLOGY Co.,Ltd.

Contract record no.: X2021440000053

Denomination of invention: Sub-band coding and decoding method and device based on SILK codec

Granted publication date: 20170111

License type: Common License

Record date: 20210208