CN101458930B - Excitation signal generation in bandwidth spreading and signal reconstruction method and apparatus - Google Patents
- Publication number
- CN101458930B (application number CN200710198774XA)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/08—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/038—Speech enhancement, e.g. noise reduction or echo cancellation using band spreading techniques
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Quality & Reliability (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
The invention discloses a method for generating an excitation signal in bandwidth extension: a narrowband low-frequency signal is spectrally folded and then synthesized to produce the required high-frequency excitation signal. The invention further provides a method and a device for reconstructing the high-frequency signal in bandwidth extension. Because the scheme derives the high-frequency signal from the low-frequency signal and relies on the harmonic relationship between the low- and high-frequency spectra, it extends both speech and music signals effectively, and the spectral folding keeps the signal spectrum continuous where the low and high bands join. Tests show that the scheme is suitable for extending 7 to 14 kHz ultra-wideband signals.
Description
Technical field
The present invention relates to the field of bandwidth extension, and specifically to a method for generating an excitation signal in bandwidth extension, a method for reconstructing the high-frequency signal, and the corresponding devices. The invention is particularly suitable for ultra-wideband bandwidth extension.
Background art
Bandwidth extension (BWE: BandWidth Extension) is a technique that, by choosing a suitable parametric model, extends a signal with a narrower frequency range to a wider frequency range and thereby improves the perceived audio quality of the signal.
When the encoder bit rate is limited, for example in mobile and network environments, most of the available bits are generally allocated to the low-frequency signal to obtain better coding quality, because the human ear is more sensitive to low frequencies. High-frequency components nevertheless still play an important role in the subjective impression of sound quality, so it remains desirable to reconstruct the high-frequency signal as well as possible at the decoder. A currently used method of extending narrowband speech (0 to 3.4 kHz) to wideband speech (0 to 7 kHz), from ITU-T G.729.1, is described below. It uses time domain bandwidth extension (TDBWE: Time Domain BandWidth Extension), and the scheme comprises:
I. Encoder side
1. Preprocessing
The high-band signal obtained from the input sampled at 16 kHz is spectrally folded; that is, the 4 to 8 kHz band of the input is folded down to 0 to 4 kHz. This is equivalent to multiplying each of the 160 time-domain samples of the high-band part by $(-1)^n$. The folded signal is then passed through a 3/4 low-pass filter that removes its 3 to 4 kHz band, which corresponds to 7 to 8 kHz in the original band. The preprocessed signal is $S'(n)$, $n = 0, \ldots, 159$.
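The effect of multiplying by $(-1)^n$ can be checked numerically. The sketch below is only an illustration, not part of the described codec: the sampling rate, frame length, and test tone are assumptions chosen so that a 7 kHz tone lands exactly on an FFT bin, and `spectral_fold` is a hypothetical helper name.

```python
import numpy as np

def spectral_fold(x):
    """Multiply sample n by (-1)**n, mirroring the spectrum about fs/4."""
    n = np.arange(len(x))
    return x * (-1.0) ** n

fs = 16000        # assumed sampling rate, Hz
N = 320           # assumed frame length (20 ms at 16 kHz)
f_tone = 7000     # assumed test tone inside the 4 to 8 kHz band, Hz

x = np.cos(2 * np.pi * f_tone * np.arange(N) / fs)
y = spectral_fold(x)

# Locate the spectral peak before and after folding.
freqs = np.fft.rfftfreq(N, d=1 / fs)
peak_in = freqs[np.argmax(np.abs(np.fft.rfft(x)))]
peak_out = freqs[np.argmax(np.abs(np.fft.rfft(y)))]
```

A component at frequency f moves to fs/2 - f, so the 7 kHz tone appears at 1 kHz after folding, which is how the 4 to 8 kHz band ends up in 0 to 4 kHz.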
2. Time-domain envelope parameter extraction
The 20 ms frame $S'(n)$ is divided into 16 segments of 1.25 ms, each containing 10 samples. One time-domain envelope parameter is computed for every 10 samples by the following formula:
[formula not reproduced in this text]
This yields 16 time-domain envelope parameters $T_{env}(i)$ in total.
3. Frequency-domain envelope parameter extraction
In G.729.1 the encoder extracts frequency-domain parameters only for the second 10 ms subframe (80 samples) of each 20 ms frame; the decoder obtains the parameters of the first 10 ms subframe by interpolation. To compute the frequency-domain parameters, a 128-point Hanning window $w_F$ is applied to $S'(n)$ around the second 10 ms subframe. The window consists of the first 72 points of a 144-point Hanning window (the rising part) followed by the last 56 points of a 112-point Hanning window (the falling part), joined at the 72nd sample. The window looks back 32 samples and ahead 16 samples, which together with the 80 samples of the current subframe gives 128 points. The windowed signal is:
$$S_w(n) = S'(n) \cdot w_F(n+31), \quad n = -31, \ldots, 96.$$
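As a rough illustration of the asymmetric window described above, the sketch below assembles it from two NumPy Hanning windows. The helper name `tdbwe_window` and the random stand-in for $S'(n)$ are assumptions made for this example only.

```python
import numpy as np

def tdbwe_window():
    rising = np.hanning(144)[:72]     # first 72 points of a 144-point Hanning window
    falling = np.hanning(112)[-56:]   # last 56 points of a 112-point Hanning window
    return np.concatenate([rising, falling])

w_F = tdbwe_window()

# Window 128 samples around a subframe: S'(n) for n = -31..96 times w_F(n + 31).
rng = np.random.default_rng(0)
s_prime = rng.standard_normal(128)    # stand-in for S'(-31), ..., S'(96)
s_w = s_prime * w_F
```

The two halves meet near their maxima, so the window rises over 72 samples and falls over the remaining 56.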
$S_w(n)$ is transformed from the time domain to the frequency domain with a fast Fourier transform (FFT: Fast Fourier Transform) of length 64, giving $S_{fft}(n)$, $n = 0, \ldots, 64$. Because the preprocessing applied a 3/4 low-pass filter, only the first 3/4 of the spectrum is valid after the transform; and because the FFT is symmetric, the first 24 of the first 32 frequency bins are enough to represent the 0 to 3 kHz band. The frequency-domain envelope parameters are computed from the first 20 frequency bins as:
[formula not reproduced in this text]
where $W_F(n)$ is a weighting function with $W_F(0) = W_F(2) = 0.5$ and $W_F(1) = 1$.
4. Parameter quantization
The 16 $T_{env}(i)$ and 12 $F_{env}(j)$ are quantized by mean-removed split vector quantization. First the mean $M_T$ of $T_{env}(i)$ is computed and scalar-quantized in the log domain with 5 bits. The residuals of $T_{env}(i)$ and $F_{env}(j)$ relative to the quantized scalar are then computed. The 16 time-domain residuals are split into two 8-dimensional vectors, each quantized with 7 bits using the same codebook. The 12 frequency-domain residuals are split into three 4-dimensional vectors using different codebooks: the first two 4-dimensional vectors are quantized with 5 bits each and the last with 4 bits.
II. Decoder side
1. Excitation generation
The excitation signal (Excitation Signal) for bandwidth extension is reconstructed from core-layer decoding parameters. The following core-layer parameters are used: the integer pitch delay T0 and the fractional pitch delay frac; the energy Ec of the fixed-codebook contribution and the energy Ep of the adaptive-codebook contribution; the base-layer fixed-codebook excitation $c(n)$ in the core layer and its gain $g_c$; the adaptive-codebook excitation $v(n)$ and its gain $g_p$; and the enhancement-layer excitation $c'(n)$ in the core layer and its gain $g_{enh}$.
The adaptive-codebook and fixed-codebook (including the enhancement-layer codebook) excitations of each frame are used to estimate the ratio of the unvoiced and voiced gain contributions. The unvoiced and voiced excitations are then multiplied by their respective gains to form a preliminary excitation signal, which is post-processed according to parameters such as the pitch delay to obtain the final excitation signal $exc(n)$. $exc(n)$ must also pass through a 3/4 low-pass filter that limits its band to 0 to 3 kHz.
2. Parameter decoding
The 16 time-domain envelope parameters $T_{env}(i)$ and the 12 frequency-domain envelope parameters $F_{env}(j)$ are decoded from the bitstream; decoding is the inverse of the encoder's quantization process.
3. Time-domain envelope shaping
Time-domain shaping mainly adjusts the energy of the excitation signal. The time-domain envelope parameters of the excitation signal $exc(n)$ are computed in the same way as $T_{env}(i)$ at the encoder, giving 16 values $T'_{env}(i)$. Subtracting $T'_{env}(i)$ from $T_{env}(i)$ gives the energy difference between the two, from which the amplitude gain to apply is obtained:
$$\mathrm{gain} = 2^{[T_{env}(i) - T'_{env}(i)]}$$
The 160 samples of the excitation signal $exc(n)$ are then multiplied by the corresponding gains to recover the time-domain-adjusted signal $S_T(n)$.
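A minimal sketch of this shaping step follows. The envelope formula is not reproduced in this text, so the sketch assumes the envelope is half the base-2 log of each segment's energy; this form is an assumption, chosen because it makes $2^{T - T'}$ an amplitude gain. The function names and the target values are hypothetical.

```python
import numpy as np

def envelope(x, seg_len=10):
    # Assumed envelope: half the log2 energy of each 10-sample segment.
    segs = x.reshape(-1, seg_len)
    return 0.5 * np.log2(np.sum(segs ** 2, axis=1) + 1e-12)

def shape_time_envelope(exc, t_env, seg_len=10):
    gain = 2.0 ** (t_env - envelope(exc, seg_len))   # one gain per segment
    return exc * np.repeat(gain, seg_len)            # hold the gain over the segment

rng = np.random.default_rng(0)
exc = rng.standard_normal(160)          # stand-in excitation frame
t_env = np.linspace(0.0, 3.0, 16)       # hypothetical decoded targets
s_t = shape_time_envelope(exc, t_env)
```

Under that assumption, recomputing the envelope of the shaped signal returns the decoded targets, which is the purpose of the gain.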
4. Frequency-domain envelope shaping
The decoded frequency-domain parameters $F_{env}(j)$ describe the second 10 ms of the 20 ms frame; the parameters of the first 10 ms can be obtained by interpolating between the frequency-domain parameters of the current frame and of the previous 20 ms frame. The frequency-domain parameters of the first and second 10 ms subframes of the current frame are denoted $F_{env,1}(j)$ and $F_{env,2}(j)$.
Then, as in the time domain, frequency-domain parameters are extracted from $S_T(n)$ using the encoder's method, once every 10 ms, giving two sets denoted $F'_{env,1}(j)$ and $F'_{env,2}(j)$. The differences between $F_{env,1}(j)$, $F_{env,2}(j)$ and $F'_{env,1}(j)$, $F'_{env,2}(j)$ give the adjustment magnitudes $G_{F,1}(j)$ and $G_{F,2}(j)$ for the two subframes. Because the frequency-domain computation is carried out per band, a filter bank is used to adjust the spectral envelope of the band corresponding to each frequency-domain parameter; there are therefore 12 filters. $G_{F,1}(j)$ and $G_{F,2}(j)$ weight the filter-bank coefficients, the first and second 10 ms subframes are filtered respectively, and the output after frequency-domain shaping is $S_{HB}(n)$.
5. BWE post-processing
Because the two successive adjustments in the time and frequency domains may produce spikes, an adaptive amplitude compression function is applied as post-processing to reduce deviations from the envelope. The post-processing operates on blocks of 80 samples, each divided into three sections: a leading section of 6 samples, a middle section of 70 samples, and a trailing section of 4 samples. The envelope-adjusted output after post-processing is (the rows are, in order, the leading, middle, and trailing sections):
[formula not reproduced in this text]
where $T_{env}(i)$ is the time-domain envelope parameter corresponding to the sample being adjusted.
Because the encoder folded the 4 to 8 kHz high-band signal down to 0 to 4 kHz, the decoder must fold the spectrum again when restoring it. The folding is similar to that at the encoder. Because the reconstructed output signal covers 0 to 3 kHz, the 3 to 4 kHz frequency coefficients are zero-filled before folding, which gives the 4 to 7 kHz high-frequency reconstructed signal.
In arriving at the present invention, the inventors found that the decoder-side excitation generation of the above bandwidth extension technique resembles the binary (voiced/unvoiced) excitation of a speech production model. It suits speech coding but performs poorly on music-like signals. Experiments also show that when this technique, with this excitation generation, is applied to ultra-wideband extension of 7 to 14 kHz, the noise is high and the coding quality poor, which indicates that the technique is not suitable for ultra-wideband extension.
The present invention provides a method for generating an excitation signal in bandwidth extension, together with a corresponding method and device for reconstructing the high-frequency signal, suitable for high-frequency reconstruction of audio signals such as speech and music in wideband and ultra-wideband extension.
A method for generating an excitation signal in bandwidth extension comprises: generating a first excitation signal $exc(n)$, $n = 0, \ldots, N-1$, with frequency range 0 to $B_0$; spectrally folding $exc(n)$ to generate a second excitation signal $exc_{fold}(n)$ with frequency range $B_0$ to $2B_0$; and applying synthesis filtering to $exc(n)$ and $exc_{fold}(n)$ to output a third excitation signal $exc_{HB}(m)$, $m = 0, \ldots, 2N-1$, with frequency range 0 to $2B_0$, the third excitation signal $exc_{HB}(m)$ being used as the high-frequency excitation signal for reconstructing the high-frequency signal.
A method for reconstructing a high-frequency signal in bandwidth extension comprises: generating an excitation signal $exc_{HB}(m)$, $m = 0, \ldots, 2N-1$, by the above excitation generation method; decoding time-domain envelope parameters $T_{env}(i)$ and frequency-domain envelope parameters $F_{env}(j)$, where $i = 0, \ldots, I-1$ and $j = 0, \ldots, J-1$; adjusting the time-domain envelope of $exc_{HB}(m)$ according to $T_{env}(i)$, each $T_{env}(i)$ adjusting a segment of $A$ time-domain samples of $exc_{HB}(m)$, $A \le 2N/I$, to generate the time-domain-adjusted signal $S_T(m)$; adjusting the frequency-domain envelope of $S_T(m)$ according to $F_{env}(j)$, each $F_{env}(j)$ adjusting a subband of bandwidth $B_1$ of $S_T(m)$ in the frequency domain, $B_1 \le B_2/J$, where $B_2$ is the bandwidth of $S_T(m)$, to generate the frequency-domain-adjusted reconstructed signal $S_F(m)$; and spectrally folding $S_F(m)$ to generate the high-frequency reconstructed signal $S_{HB}(m)$ with frequency range $2B_0$ to $2B_0 + B_2$.
A device for generating an excitation signal in bandwidth extension comprises: a core decoding module for outputting a first excitation signal $exc(n)$, $n = 0, \ldots, N-1$, with frequency range 0 to $B_0$; a spectrum folding module for spectrally folding $exc(n)$ and outputting a second excitation signal $exc_{fold}(n)$ with frequency range $B_0$ to $2B_0$; and a synthesis filtering module for applying synthesis filtering to $exc(n)$ and $exc_{fold}(n)$ and outputting a third excitation signal $exc_{HB}(m)$, $m = 0, \ldots, 2N-1$, with frequency range 0 to $2B_0$, the third excitation signal $exc_{HB}(m)$ being used as the high-frequency excitation signal for reconstructing the high-frequency signal.
A device for reconstructing a high-frequency signal in bandwidth extension comprises: an excitation signal generation unit, having the logical structure of the excitation signal generating device of any one of claims 15 to 17, for generating an excitation signal $exc_{HB}(m)$, $m = 0, \ldots, 2N-1$; a decoding unit for decoding and outputting time-domain envelope parameters $T_{env}(i)$ and frequency-domain envelope parameters $F_{env}(j)$, where $i = 0, \ldots, I-1$ and $j = 0, \ldots, J-1$; a time-domain shaping unit for adjusting the time-domain envelope of $exc_{HB}(m)$ according to $T_{env}(i)$, each $T_{env}(i)$ adjusting a segment of $A$ time-domain samples of $exc_{HB}(m)$, $A \le 2N/I$, and outputting the time-domain-adjusted signal $S_T(m)$; a frequency-domain shaping unit for adjusting the frequency-domain envelope of $S_T(m)$ according to $F_{env}(j)$, each $F_{env}(j)$ adjusting a subband of bandwidth $B_1$ of $S_T(m)$ in the frequency domain, $B_1 \le B_2/J$, where $B_2$ is the bandwidth of $S_T(m)$, and outputting the frequency-domain-adjusted reconstructed signal $S_F(m)$; and a spectrum folding unit for spectrally folding the input $S_F(m)$ to generate the high-frequency reconstructed signal $S_{HB}(m)$ with frequency range $2B_0$ to $2B_0 + B_2$.
The above technical solution generates the required high-frequency excitation signal from a narrowband low-frequency signal by spectral folding followed by synthesis. Because the high-frequency signal is produced from the low-frequency signal and relies on the harmonic relationship between the low- and high-frequency spectra, both speech and music signals are extended well, and the spectral folding used also keeps the signal spectrum continuous where the low and high bands join. Experiments show that the solution is suitable for bandwidth extension not only of 4 to 7 kHz signals but also of 7 to 14 kHz ultra-wideband signals.
Brief description of the drawings
Fig. 1 is a flow diagram of the excitation signal generation method of an embodiment of the invention;
Fig. 2 is a diagram of the logical structure of the excitation signal generating device of an embodiment of the invention;
Fig. 3 is a flow diagram of the high-frequency signal reconstruction method of an embodiment of the invention;
Fig. 4 is a diagram of the logical structure of the high-frequency signal reconstructing device of an embodiment of the invention.
Detailed description of the embodiments
An embodiment of the invention provides a method for generating an excitation signal in bandwidth extension, in which a narrowband low-frequency signal is spectrally folded and then synthesized to generate the required high-frequency excitation signal. Embodiments also provide the corresponding method for reconstructing the high-frequency signal in bandwidth extension, as well as the excitation signal generating device and the high-frequency signal reconstructing device. Each is described in detail below.
With reference to Fig. 1, the excitation signal generation method of the embodiment mainly comprises the steps:
A1. Generate a first excitation signal with frequency range 0 to $B_0$; this first excitation signal is generally a narrowband excitation signal.
In this embodiment, the narrowband excitation signal $exc(n)$, $n = 0, \ldots, N-1$, serving as the first excitation signal is reconstructed from parameters obtained by decoding the core-layer bitstream. Depending on the encoder's core-layer coding scheme, $exc(n)$ can be reconstructed using code excited linear prediction (CELP: Code Excited Linear Prediction), for example in the manner described in the background above.
To simplify processing and reduce computational complexity, this embodiment provides a simple and effective CELP-based way of generating $exc(n)$, comprising:
1. Decode the core bitstream to obtain the fixed-codebook excitation, the adaptive-codebook excitation, and their gains.
Depending on the encoder's core-layer coding scheme, the fixed-codebook excitation may consist of two parts, the base-layer fixed-codebook excitation $c(n)$ and the enhancement-layer excitation $c'(n)$, with gains $g_c$ and $g_{enh}$ respectively.
2. Obtain $exc(n)$ as the gain-weighted sum of the fixed-codebook and adaptive-codebook excitations.
When the fixed-codebook excitation has two parts, $exc(n)$ is computed as:
$$exc(n) = g_p \cdot v(n) + g_c \cdot c(n) + g_{enh} \cdot c'(n)$$
where $v(n)$ is the adaptive-codebook excitation and $g_p$ is its gain.
Typically $exc(n)$ covers 0 to 4 kHz and a frame consists of 160 time-domain samples spanning 20 ms, that is, $B_0 = 4$ kHz and $N = 160$.
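The weighted sum can be written directly in code. The sketch below uses made-up codebook vectors and gains purely for illustration; in a real decoder these would come from the core-layer bitstream, and `build_excitation` is a hypothetical helper name.

```python
import numpy as np

def build_excitation(v, c, c_enh, g_p, g_c, g_enh):
    """exc(n) = g_p*v(n) + g_c*c(n) + g_enh*c'(n)."""
    return g_p * v + g_c * c + g_enh * c_enh

N = 160
rng = np.random.default_rng(1)
v = rng.standard_normal(N)          # stand-in adaptive-codebook excitation
c = np.zeros(N)
c[::40] = 1.0                       # stand-in sparse fixed-codebook pulses
c_enh = np.zeros(N)
c_enh[20::40] = -1.0                # stand-in enhancement-layer pulses
exc = build_excitation(v, c, c_enh, g_p=0.8, g_c=1.5, g_enh=0.5)
```

No voiced/unvoiced decision or pitch post-processing is involved, which is where the reduction in complexity relative to the background scheme comes from.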
A2. Spectrally fold $exc(n)$ to generate a second excitation signal with frequency range $B_0$ to $2B_0$. Relative to the narrowband, low-frequency $exc(n)$, this second excitation signal can be regarded as a narrowband high-frequency signal $exc_{fold}(n)$.
This is equivalent to multiplying each of the $N$ time-domain samples of $exc(n)$ by $(-1)^n$.
A3. Apply synthesis filtering to $exc(n)$ and $exc_{fold}(n)$ to output a third excitation signal with frequency range 0 to $2B_0$; this third excitation signal is the high-frequency excitation signal for bandwidth extension.
Synthesis filtering here means merging the spectra of $exc(n)$ and $exc_{fold}(n)$ to obtain the high-frequency excitation signal $exc_{HB}(m)$, $m = 0, \ldots, 2N-1$, whose bandwidth extends to 0 to $2B_0$. One optional synthesis method is:
apply quadrature mirror filter (QMF: Quadrature Mirror Filter) synthesis filtering to $exc(n)$ and $exc_{fold}(n)$.
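To make the two-band merge concrete, here is a minimal sketch that uses the 2-tap Haar pair as the QMF prototype. The patent does not specify the prototype filter, so the Haar choice is an assumption made for brevity; a practical codec would use a much longer prototype with better band separation. The analysis function is included only to check the round trip.

```python
import numpy as np

def haar_qmf_synthesis(low, high):
    """Merge two N-sample subband signals into one 2N-sample signal."""
    out = np.empty(2 * len(low))
    out[0::2] = (low + high) / np.sqrt(2.0)
    out[1::2] = (low - high) / np.sqrt(2.0)
    return out

def haar_qmf_analysis(x):
    """Inverse of the synthesis above, used here only as a check."""
    low = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    high = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return low, high

N = 160
rng = np.random.default_rng(2)
exc = rng.standard_normal(N)                  # stand-in for exc(n)
exc_fold = exc * (-1.0) ** np.arange(N)       # folded copy, as in step A2
exc_hb = haar_qmf_synthesis(exc, exc_fold)    # 2N samples at twice the rate
```

The output has $2N$ samples, matching the index range $m = 0, \ldots, 2N-1$ of $exc_{HB}(m)$, and splitting it again returns the two inputs.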
In addition, depending on the application, $exc_{HB}(m)$ with frequency range 0 to $2B_0$ may further be low-pass, high-pass, or band-pass filtered to output $exc_{HB}(m)$ with the required frequency range. Under current audio coding requirements, a wideband signal is generally required to cover 0 to 7 kHz, comprising a low-frequency part of 0 to 4 kHz and a high-frequency part of 4 to 7 kHz; an ultra-wideband signal is required to cover 0 to 14 kHz, comprising a low-frequency part of 0 to 8 kHz and a high-frequency part of 8 to 14 kHz. The coded high-frequency bandwidth is thus generally 3/4 of the low-frequency bandwidth, so in this case the high-frequency excitation generated from the low-frequency excitation needs further processing, namely:
A4. Apply 3/4 low-pass filtering to $exc_{HB}(m)$ with frequency range 0 to $2B_0$, and output $exc_{HB}(m)$ with frequency range 0 to $3B_0/2$.
This high-frequency excitation signal with frequency range 0 to $3B_0/2$ can then be used to reconstruct a wideband or ultra-wideband high-frequency signal with frequency range $2B_0$ to $3.5B_0$.
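With $B_0 = 4$ kHz the cutoff $3B_0/2$ is 6 kHz, and the reconstructed band $2B_0$ to $3.5B_0$ is 8 to 14 kHz. A 3/4 low-pass filter could be sketched as a windowed-sinc FIR filter, as below. The filter design, tap count, and test tones are assumptions for illustration; the patent does not specify how the filter is realized.

```python
import numpy as np

def lowpass_3_4(x, taps=65):
    """Windowed-sinc FIR low-pass with cutoff at 3/4 of the Nyquist frequency."""
    k = np.arange(taps) - (taps - 1) / 2
    fc = 0.75 / 2                       # cutoff in cycles/sample (Nyquist = 0.5)
    h = 2 * fc * np.sinc(2 * fc * k) * np.hamming(taps)
    h /= np.sum(h)                      # unity gain at DC
    return np.convolve(x, h, mode="same")

fs = 16000                              # rate of the 0 to 8 kHz excitation, Hz
n = np.arange(320)
passed = lowpass_3_4(np.cos(2 * np.pi * 3000 * n / fs))    # below 6 kHz
blocked = lowpass_3_4(np.cos(2 * np.pi * 7500 * n / fs))   # above 6 kHz
```

A tone at 3 kHz passes essentially unchanged, while a tone at 7.5 kHz is strongly attenuated.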
The device of the embodiment for generating the bandwidth-extension excitation signal, which carries out the above excitation generation method, is described below. With reference to Fig. 2, its basic logical structure comprises the core decoding module, the spectrum folding module, and the synthesis filtering module, which perform steps A1 to A3 respectively.
In addition, in line with the frequency-range requirement on the excitation signal described for the generation method above, the excitation signal generating device of this embodiment may also comprise:
a 3/4 low-pass filter 104, which receives $exc_{HB}(m)$ with frequency range 0 to $2B_0$, applies 3/4 low-pass filtering to it, and outputs $exc_{HB}(m)$ with frequency range 0 to $3B_0/2$.
To make the above embodiment easier to follow, the excitation generation process is described below for an ultra-wideband extension application. First, one frame of the excitation signal $exc(n)$ (160 samples) covering 0 to 4 kHz is extracted from the CELP-based core layer. It is then spectrally folded into the 4 to 8 kHz band, producing the excitation signal $exc_{fold}(n)$ (160 samples) for that band. A QMF synthesis filter then combines $exc(n)$ and $exc_{fold}(n)$ into the required full-band excitation $exc_{qmf}(m)$ (320 samples), whose bandwidth is now 0 to 8 kHz. Finally, the full-band excitation $exc_{qmf}(m)$ is passed through the 3/4 low-pass filter to obtain the 0 to 6 kHz excitation signal $exc_{HB}(m)$ (320 samples).
In the above embodiments of the excitation generation method and device, the required high-frequency excitation signal is generated from a narrowband low-frequency signal by spectral folding followed by synthesis. Because the high-frequency signal is produced from the low-frequency signal and relies on the harmonic relationship between the low- and high-frequency spectra, both speech and music signals are extended well. This solves the problem that the binary excitation generation, modeled on speech production, used in existing time-domain bandwidth extension codes music-like signals poorly. In addition, the spectral folding used keeps the signal spectrum continuous where the low and high bands join. Experiments show that the above excitation generation scheme is suitable for bandwidth extension not only of 4 to 7 kHz signals but also of 7 to 14 kHz ultra-wideband signals.
The method of the embodiment for reconstructing the high-frequency signal in bandwidth extension, based on the above excitation generation method, is described below. With reference to Fig. 3, it mainly comprises the steps:
B1. Generate the high-frequency excitation signal.
The high-frequency excitation signal $exc_{HB}(m)$, $m = 0, \ldots, 2N-1$, is generated as in the preceding embodiment. Its bandwidth is $B_2$, with $B_2 = 2B_0$ or $3B_0/2$; the latter is usually used.
B2. Decode the time-domain and frequency-domain envelope parameters.
The time-domain envelope parameters $T_{env}(i)$, $i = 0, \ldots, I-1$, and the frequency-domain envelope parameters $F_{env}(j)$, $j = 0, \ldots, J-1$, are decoded from the bitstream using the decoding scheme that matches the encoder's coding scheme; this embodiment does not restrict the specific coding and decoding scheme. Note that this decoding step has no strict ordering requirement within the reconstruction process: it may run in parallel or in sequence with the other steps, and $T_{env}(i)$ and $F_{env}(j)$ need not be decoded at the same time, provided each parameter has been decoded before it is used in the reconstruction.
B3. Adjust the time-domain envelope of $exc_{HB}(m)$ according to $T_{env}(i)$ to generate the time-domain-adjusted signal $S_T(m)$.
The time-domain envelope adjustment mirrors the encoder's extraction of the time-domain envelope parameters. Each $T_{env}(i)$ adjusts a segment of $A$ time-domain samples of $exc_{HB}(m)$, $A \le 2N/I$; that is, the adjusted samples may be all or part of the $2N$ samples. The correspondence between each $T_{env}(i)$ and the samples it adjusts is the same as in the encoder's extraction process. The specific adjustment may, for example, follow the time-domain envelope adjustment described in the background above.
To provide a better adjustment, this embodiment gives a time-domain envelope adjustment method comprising:
1. Compute the time-domain envelope parameters $T'_{env}(i)$ of $exc_{HB}(m)$ in the same way the encoder computes $T_{env}(i)$.
The encoder's way of computing $T_{env}(i)$ refers to the process by which the encoder extracts $T_{env}(i)$ from the high-frequency signal $S_{hb}(m)$ to be coded. $S_{hb}(m)$ is usually obtained by preprocessing the high-frequency part of the signal to be coded at the encoder: the high-frequency signal obtained by band splitting after sampling is first folded down to the low band and then low-pass filtered according to the band required for coding. An example of computing $T'_{env}(i)$ is:
Divide the $2N$ samples of $exc_{HB}(m)$ into $I$ segments of $A$ samples each and compute the log-domain energy of each segment:
[formula not reproduced in this text]
Typically 10 samples form one segment, that is, $A = 10$, in which case the number of $T'_{env}(i)$ is $I = N/5$.
2. Compute the preliminary time-domain gain factor $g_T(i)$ from the energy difference between $T_{env}(i)$ and $T'_{env}(i)$.
An example of computing $g_T(i)$ is:
$$g_T(i) = 2^{[T_{env}(i) - T'_{env}(i)]}$$
Each $g_T(i)$ corresponds to a segment of $A$ time-domain samples of $exc_{HB}(m)$; the correspondence is the same as that between $T'_{env}(i)$ and the samples of $exc_{HB}(m)$.
3. Interpolate each $g_T(i)$ to obtain $A$ gain factors.
Any suitable interpolation method may be used to expand each $g_T(i)$ into $A$ gain factors $g_{T,i}(a)$, $a = 0, \dots, A-1$; for example, each $g_{T,i}(a)$ can simply be set equal to $g_T(i)$. To obtain a better time-domain adjustment, this embodiment provides a smoothing interpolation algorithm for the case $A = 10$:
$$g_{T,i}(a) = w_T(a)\, g_T(i) + [1 - w_T(a)]\, g^{last}_{T,i}(a)$$
where $w_T(a)$ is a window function and $g^{last}_{T,i}(a)$ is the gain factor of the corresponding sample of $exc_{HB}(m)$ in the previous frame. Specifically, $w_T(a)$ is:
$$w_T(a) = \begin{cases} \tfrac{1}{2}\{1 - \cos[(a+1)\pi/6]\}, & a = 0, \dots, 4 \\ 1, & a = 5, \dots, 9 \end{cases}$$
This interpolation can be understood as follows: the first 5 values $g_{T,i}(a)$ are smoothed with the corresponding $g^{last}_{T,i}(a)$ obtained by smoothing interpolation in the previous frame, while the last 5 values $g_{T,i}(a)$ simply take the value $g_T(i)$.
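The smoothing interpolation can be illustrated with a short Python sketch. It uses the raised-cosine window stated in claim 8, $w_T(a) = \tfrac{1}{2}\{1 - \cos[(a+1)\pi/6]\}$ for $a = 0, \dots, 4$ and $w_T(a) = 1$ for $a = 5, \dots, 9$. The function names and the way the previous-frame gains are passed in are assumptions made for illustration.

```python
import math

A = 10  # samples per segment

def smoothing_window(a):
    """w_T(a): raised-cosine ramp for a = 0..4, then 1 for a = 5..9."""
    if a < 5:
        return 0.5 * (1.0 - math.cos((a + 1) * math.pi / 6.0))
    return 1.0

def interpolate_gain(g_t, g_last):
    """Expand one preliminary gain g_T(i) into A per-sample gains.

    g_last holds the A gains of the corresponding samples in the
    previous frame (g_last_{T,i}(a) in the text).
    """
    return [smoothing_window(a) * g_t + (1.0 - smoothing_window(a)) * g_last[a]
            for a in range(A)]

# Previous gains were 1.0 and the new preliminary gain is 2.0.
per_sample = interpolate_gain(2.0, [1.0] * A)
```

The first five gains ramp from the previous value toward the new one, and the last five equal the new gain exactly, which is the behaviour the paragraph above describes.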
4. Adjust the gain of the $A \times I$ samples of $exc_{HB}(m)$ according to $g_{T,i}(a)$ to obtain $S_T(m)$.
The time-domain envelope shaping of $exc_{HB}(m)$ is obtained by simply multiplying each sample to be adjusted by its corresponding gain factor $g_{T,i}(a)$:
$$S_T(m) = g_{T,i}(a) \cdot exc_{HB}(m)$$
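Steps 2 and 4 can be combined in a minimal Python sketch. It uses the simplest interpolation mentioned in the text, where every $g_{T,i}(a)$ equals $g_T(i)$, and it assumes that segment $i$ covers the contiguous samples $m = iA + a$ and that any samples beyond the last segment pass through unchanged. The function name `shape_time_envelope` is hypothetical.

```python
def shape_time_envelope(exc_hb, t_env, t_env_local, seg_len=10):
    """Return S_T(m) = g_{T,i}(a) * exc_HB(m) with g_{T,i}(a) = g_T(i).

    Assumptions: segment i covers samples i*seg_len .. (i+1)*seg_len - 1,
    and samples beyond the last segment are left unchanged.
    """
    shaped = list(exc_hb)
    for i, (t, t_local) in enumerate(zip(t_env, t_env_local)):
        gain = 2.0 ** (t - t_local)  # preliminary gain g_T(i)
        for a in range(seg_len):
            m = i * seg_len + a
            shaped[m] = gain * exc_hb[m]
    return shaped

# Two segments: the first is raised by a factor of 2, the second halved.
s_t = shape_time_envelope([1.0] * 20, [1.0, -1.0], [0.0, 0.0])
```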
B4. Adjust the frequency-domain envelope of $S_T(m)$ according to $F_{env}(j)$ to generate the frequency-domain adjusted reconstructed signal $S_F(m)$.
Like the time-domain envelope adjustment, the frequency-domain envelope adjustment mirrors the encoder-side extraction of the frequency-domain envelope parameters. Each $F_{env}(j)$ adjusts one subband of bandwidth $B_1$ in the frequency domain of $S_T(m)$, where $B_1 \le B_2/J$ and $B_2$ is the bandwidth of $S_T(m)$, which is also the bandwidth of $exc_{HB}(m)$. The mapping between each $F_{env}(j)$ and the band it adjusts is the same as the mapping used during extraction at the encoder. The adjustment itself may use, for example, the frequency-domain envelope adjustment described in the background section above.
To reduce computational complexity and improve the adjustment, this embodiment provides a frequency-domain envelope adjustment method comprising the following steps:
1. In the same way that the encoder computes $F_{env}(j)$, apply a time-frequency transform to $S_T(m)$ to generate the frequency-domain signal $S_{F1}(m)$, and compute the frequency-domain envelope parameters $F'_{env}(j)$ of $S_{F1}(m)$.
Here, the way the encoder computes $F_{env}(j)$ refers to the process by which the encoder extracts $F_{env}(j)$ from the high-frequency signal $S_{hb}(m)$ to be encoded. One example of computing $F'_{env}(j)$ is as follows:
Apply the window $w_{TDAC}(k)$ to $S_T(m)$ and the previous frame $S_{T,last}(m)$ to obtain the windowed signal $S_w(k)$, $k = 0, \dots, 4N-1$, where
$$S_w(k) = w_{TDAC}(k) \cdot S_{T,last}(k), \quad k = 0, \dots, 2N-1,$$
$$S_w(k) = w_{TDAC}(k) \cdot S_T(k - 2N), \quad k = 2N, \dots, 4N-1;$$
Apply a discrete cosine transform (DCT) to $S_w(k)$ to generate $S_{F1}(m)$; specifically, the modified discrete cosine transform (MDCT) may be used.
Extract the first $D \times J$ samples of $S_{F1}(m)$ to compute $F'_{env}(j)$.
Because $exc_{HB}(m)$ may have been subjected to band-limiting 3/4 low-pass filtering during its generation, only the data in the $0 \sim 3B_0/2$ band are valid in that case. Therefore, after the time-frequency transform, only the first $3N/2$ of the $2N$ frequency-domain samples need to be extracted to compute $F'_{env}(j)$, so that $D \times J = 3N/2$.
Typically 16 samples form one subband, i.e. $D = 16$, in which case the number of $F'_{env}(j)$ values is $J = 3N/32$. In addition, the window function $w_{TDAC}(k)$ may be chosen as the following sine window:
$$w_{TDAC}(k) = \sin\left[\frac{(k + 0.5)\pi}{4N}\right]$$
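The windowing of the previous and current frames can be sketched in Python as follows. The function names are illustrative. The final lines check a property of the sine window that follows directly from its definition: the two halves are power complementary, $w_{TDAC}(k)^2 + w_{TDAC}(k+2N)^2 = 1$.

```python
import math

def sine_window(length):
    """w_TDAC(k) = sin((k + 0.5) * pi / length), with length = 4N."""
    return [math.sin((k + 0.5) * math.pi / length) for k in range(length)]

def window_frames(prev_frame, cur_frame):
    """Build S_w(k): previous frame for k = 0..2N-1, current for k = 2N..4N-1."""
    if len(prev_frame) != len(cur_frame):
        raise ValueError("frames must have equal length")
    window = sine_window(2 * len(cur_frame))
    block = list(prev_frame) + list(cur_frame)
    return [w * x for w, x in zip(window, block)]

N = 160
s_w = window_frames([1.0] * (2 * N), [1.0] * (2 * N))
window = sine_window(4 * N)
power_sum = [window[k] ** 2 + window[k + 2 * N] ** 2 for k in range(2 * N)]
```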
2. Compute the preliminary frequency-domain gain factor $g_F(j)$ from the energy difference between $F_{env}(j)$ and $F'_{env}(j)$. Each $g_F(j)$ corresponds to one segment of $D$ frequency-domain samples of $S_{F1}(m)$, with $D \times J \le 2N$.
One example of computing $g_F(j)$ is as follows:
$$g_F(j) = 2^{\,F_{env}(j) - F'_{env}(j)}$$
The mapping between each $g_F(j)$ and the subbands of $S_{F1}(m)$ is the same as the mapping between $F'_{env}(j)$ and the subbands of $S_{F1}(m)$.
3. Interpolate each $g_F(j)$ to obtain $D$ gain factors $g_{F,j}(d)$, $d = 0, \dots, D-1$.
The interpolation may follow the method described above for the time-domain gain factors, or another interpolation method may be used; the details are not repeated here.
4. Adjust the gain of the $D \times J$ samples of $S_{F1}(m)$ according to $g_{F,j}(d)$ to generate the adjusted frequency-domain signal $S_{F2}(m)$. As in the time-domain envelope adjustment, each frequency-domain sample is simply multiplied by its corresponding gain factor $g_{F,j}(d)$:
$$S_{F2}(m) = g_{F,j}(d) \cdot S_{F1}(m)$$
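Steps 2 to 4 can be sketched in Python under two stated assumptions: each gain is held constant across its $D$ bins, i.e. $g_{F,j}(d) = g_F(j)$, and bins above $D \times J$ are left unchanged. Neither choice is prescribed by the text, and the function name `shape_spectrum` is hypothetical.

```python
def shape_spectrum(s_f1, f_env, f_env_local, band_size=16):
    """Scale the first D*J bins of S_F1 by g_F(j) = 2 ** (F_env(j) - F'_env(j)).

    Assumptions: each gain is held constant across its D bins, and bins
    above D*J are copied through unchanged.
    """
    s_f2 = list(s_f1)
    for j, (f, f_local) in enumerate(zip(f_env, f_env_local)):
        gain = 2.0 ** (f - f_local)
        for d in range(band_size):
            s_f2[j * band_size + d] = gain * s_f1[j * band_size + d]
    return s_f2

# 320 bins, J = 15 subbands of D = 16 bins: only the first 240 bins change.
spectrum = [1.0] * 320
adjusted = shape_spectrum(spectrum, [1.0] * 15, [0.0] * 15)
```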
5. Apply the inverse of the time-frequency transform to $S_{F2}(m)$ to obtain $S_F(m)$.
For example, if the MDCT was used to transform to the frequency domain before the frequency-domain adjustment, the inverse MDCT (IMDCT) is used here to transform back to the time domain.
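The MDCT and IMDCT pair can be illustrated with a direct, unoptimized Python sketch. The text does not state how the IMDCT output is windowed and overlap-added, so the sketch assumes the standard arrangement: the sine window is applied before the MDCT and again after the IMDCT, and the overlapping halves of consecutive blocks are added, which cancels the time-domain aliasing. The $2/M$ scaling in `imdct` and the helper `reconstruct_middle_frame` are choices made for this sketch.

```python
import math

def sine_window(length):
    return [math.sin((k + 0.5) * math.pi / length) for k in range(length)]

def mdct(block):
    """Direct O(M^2) MDCT: 2M windowed samples in, M coefficients out."""
    m = len(block) // 2
    return [sum(block[n] * math.cos(math.pi / m * (n + 0.5 + m / 2) * (k + 0.5))
                for n in range(2 * m))
            for k in range(m)]

def imdct(coeffs):
    """Inverse MDCT: M coefficients in, 2M time-aliased samples out."""
    m = len(coeffs)
    return [(2.0 / m) * sum(coeffs[k] * math.cos(math.pi / m * (n + 0.5 + m / 2) * (k + 0.5))
                            for k in range(m))
            for n in range(2 * m)]

def reconstruct_middle_frame(frame0, frame1, frame2):
    """Window, transform, invert, window again and overlap-add to recover frame1."""
    m = len(frame1)
    window = sine_window(2 * m)
    outputs = []
    for first, second in ((frame0, frame1), (frame1, frame2)):
        block = [w * x for w, x in zip(window, first + second)]
        restored = imdct(mdct(block))
        outputs.append([w * y for w, y in zip(window, restored)])
    return [outputs[0][m + n] + outputs[1][n] for n in range(m)]

# Three short test frames of 8 samples each (a real frame would have 2N = 320).
frames = [[float((3 * n + f) % 7) for n in range(8)] for f in range(3)]
recovered = reconstruct_middle_frame(*frames)
```

When no gain adjustment is applied between `mdct` and `imdct`, the overlap-added output reproduces the middle frame, which is a convenient sanity check before inserting the frequency-domain shaping.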
B5. Perform spectral folding on $S_F(m)$ to generate the high-frequency reconstructed signal $S_{HB}(m)$ with a frequency range of $2B_0 \sim 2B_0 + B_2$.
Because the encoder folds the high-frequency signal into the low-frequency range, spectral folding must be performed again when the signal is restored at the decoder. The folding method is similar to the spectral folding used by the encoder when preprocessing the high-frequency signal. If the excitation signal was low-pass filtered during reconstruction according to the band required for coding, the frequency-domain coefficients of the filtered-out high-frequency part can be zero-filled before folding to obtain the final high-frequency reconstructed signal.
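This passage does not spell out the folding operation itself. One common realization, assumed in the sketch below, is to multiply the signal by $(-1)^m$, which mirrors the spectrum about a quarter of the sampling rate. The sampling rate and tone frequency are illustrative values, and placing the folded band at $2B_0 \sim 2B_0 + B_2$ additionally depends on how it is combined with the low band (for example by QMF synthesis), which is not shown.

```python
import math

def spectral_fold(signal):
    """Mirror the spectrum about fs/4 by flipping the sign of odd samples."""
    return [x if n % 2 == 0 else -x for n, x in enumerate(signal)]

fs = 16000.0      # illustrative sampling rate in Hz
tone_hz = 1000.0  # illustrative test tone
tone = [math.cos(2 * math.pi * tone_hz * n / fs) for n in range(64)]
# A tone at f is expected to move to fs/2 - f after folding.
mirrored = [math.cos(2 * math.pi * (fs / 2 - tone_hz) * n / fs) for n in range(64)]
folded = spectral_fold(tone)
```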
Furthermore, because the signal passes through both a time-domain and a frequency-domain adjustment during reconstruction, the reconstructed signal is likely to contain glitches. To eliminate them, the time-frequency adjusted signal $S_F(m)$ can be post-processed before spectral folding; that is, the following step is added before step B5:
B51. Adjust the envelope of $S_F(m)$ using the envelope adjustment thresholds $limit_1(i)$ and $limit_2(i)$. The adjusted $S_F(m)$ is:
In the part $m = m_1 \sim m_2$, if $|S_{F,old}(m)| < limit_1(i)$, then $S_F(m) = S_{F,old}(m)$;
In the part $m = m_2+1 \sim m_3$, if $limit_1(i) \le |S_{F,old}(m)| \le limit_2(i)$, then $S_F(m) = [S_{F,old}(m) - limit_1(i)]/2 + limit_1(i)$;
In the part $m = m_3+1 \sim m_4$, if $|S_{F,old}(m)| > limit_2(i)$, then $S_F(m) = [S_{F,old}(m) - limit_2(i)]/16 + limit_2(i)$;
where $S_{F,old}(m)$ is $S_F(m)$ before the envelope adjustment, and the mapping between $limit_1(i)$, $limit_2(i)$ and the time-domain samples of $S_F(m)$ is the same as the mapping between $T_{env}(i)$ and the time-domain samples of $S_F(m)$.
In the above post-processing, a preferred way to set the thresholds $limit_1(i)$ and $limit_2(i)$ is:
$$limit_1(i) = 2^{T_{env}(i)}$$
$$limit_2(i) = 2.5 \times 2^{T_{env}(i)}$$
In addition, the post-processing can be performed once per 80 samples, with each group of 80 samples divided into three parts: the first 6 samples (the $m_1 \sim m_2$ part), the middle 70 samples (the $m_2+1 \sim m_3$ part), and the last 4 samples (the $m_3+1 \sim m_4$ part). For example, if $N = 160$, the time-frequency adjusted signal has 320 samples and can be post-processed in 4 passes. The $m_1 \sim m_2$ parts are samples 0~5, 80~85, 160~165 and 240~245; the $m_2+1 \sim m_3$ parts are samples 6~75, 86~155, 166~235 and 246~315; and the $m_3+1 \sim m_4$ parts are samples 76~79, 156~159, 236~239 and 316~319.
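The segmentation and the three threshold rules can be sketched in Python. The sketch applies the rules literally to the signed sample value, since the text does not say how negative samples are treated, and it leaves a sample unchanged whenever its rule's condition is not met. It also assumes $A = 10$, so that threshold index $i$ is `m // 10`. All function names are illustrative.

```python
def segment_bounds(frame_len=320, group=80, head=6, middle=70):
    """Return inclusive (first, last) index ranges for the three parts of each group."""
    parts = {"head": [], "middle": [], "tail": []}
    for start in range(0, frame_len, group):
        parts["head"].append((start, start + head - 1))
        parts["middle"].append((start + head, start + head + middle - 1))
        parts["tail"].append((start + head + middle, start + group - 1))
    return parts

def post_process(s_f_old, t_env, seg_len=10, group=80, head=6, middle=70):
    """Apply the three threshold rules literally to the signed sample values."""
    s_f = list(s_f_old)
    for m, x in enumerate(s_f_old):
        limit1 = 2.0 ** t_env[m // seg_len]
        limit2 = 2.5 * limit1
        pos = m % group
        if pos < head:
            continue  # first part: the rule keeps the sample as it is
        if pos < head + middle:
            if limit1 <= abs(x) <= limit2:
                s_f[m] = (x - limit1) / 2.0 + limit1
        elif abs(x) > limit2:
            s_f[m] = (x - limit2) / 16.0 + limit2
    return s_f

bounds = segment_bounds()
signal = [0.0] * 320
signal[10] = 2.0   # middle part, between limit1 = 1 and limit2 = 2.5
signal[78] = 6.0   # last part, above limit2
processed = post_process(signal, [0.0] * 32)
```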
The following describes the apparatus for reconstructing the high-frequency signal in bandwidth extension according to an embodiment of the invention, which carries out the above high-frequency signal reconstruction method. Referring to Fig. 4, its basic logical structure comprises:
Excitation signal generation unit 201, which adopts the logical structure of the excitation signal generating apparatus of the foregoing embodiment and is configured to generate the excitation signal $exc_{HB}(m)$, $m = 0, \dots, 2N-1$;
Decoding unit 202, configured to decode and output the time-domain envelope parameters $T_{env}(i)$ and the frequency-domain envelope parameters $F_{env}(j)$, where $i = 0, \dots, I-1$ and $j = 0, \dots, J-1$;
Time-domain shaping unit 203, configured to adjust, according to the $T_{env}(i)$ output by decoding unit 202, the time-domain envelope of the $exc_{HB}(m)$ output by excitation signal generation unit 201, each $T_{env}(i)$ adjusting one segment of $A$ time-domain samples of $exc_{HB}(m)$, $A \le 2N/I$, and to output the time-domain adjusted signal $S_T(m)$;
Frequency-domain shaping unit 204, configured to adjust, according to the $F_{env}(j)$ output by decoding unit 202, the frequency-domain envelope of the $S_T(m)$ output by time-domain shaping unit 203, each $F_{env}(j)$ adjusting one subband of bandwidth $B_1$ in the frequency domain of $S_T(m)$, where $B_1 \le B_2/J$ and $B_2$ is the bandwidth of $S_T(m)$, and to output the frequency-domain adjusted reconstructed signal $S_F(m)$;
Spectral folding unit 205, configured to perform spectral folding on the input $S_F(m)$ to generate the high-frequency reconstructed signal $S_{HB}(m)$ with a frequency range of $2B_0 \sim 2B_0 + B_2$.
In addition, corresponding to the post-processing used in the foregoing reconstruction method to eliminate signal glitches, the high-frequency signal reconstruction apparatus of this embodiment may further comprise:
Post-processing unit 206, configured to adjust the envelope of the $S_F(m)$ output by frequency-domain shaping unit 204 using the envelope adjustment thresholds $limit_1(i)$ and $limit_2(i)$, the adjusted $S_F(m)$ being: in the part $m = m_1 \sim m_2$, if $|S_{F,old}(m)| < limit_1(i)$, then $S_F(m) = S_{F,old}(m)$; in the part $m = m_2+1 \sim m_3$, if $limit_1(i) \le |S_{F,old}(m)| \le limit_2(i)$, then $S_F(m) = [S_{F,old}(m) - limit_1(i)]/2 + limit_1(i)$; in the part $m = m_3+1 \sim m_4$, if $|S_{F,old}(m)| > limit_2(i)$, then $S_F(m) = [S_{F,old}(m) - limit_2(i)]/16 + limit_2(i)$; where $S_{F,old}(m)$ is $S_F(m)$ before the envelope adjustment, and the mapping between $limit_1(i)$, $limit_2(i)$ and the time-domain samples of $S_F(m)$ is the same as the mapping between $T_{env}(i)$ and the time-domain samples of $S_F(m)$; and to output the adjusted $S_F(m)$ to spectral folding unit 205.
The smoothing interpolation of the time-domain gain factors further provided in the above method and apparatus embodiments yields a better time-domain adjustment; the specific frequency-domain envelope adjustment further provided avoids using a filter bank at the decoder to filter the signal band by band, which simplifies processing and reduces computational complexity; and the post-shaping processing further provided better eliminates the glitches introduced by shaping.
For a better understanding of the above embodiments, the high-frequency signal reconstruction process is described below using an application in super-wideband bandwidth extension as an example:
1. Generate the 0~6 kHz high-frequency excitation signal $exc_{HB}(m)$, with 320 time-domain samples per frame. That is, $2N = 320$, $B_0 = 4$ kHz and $B_2 = 3B_0/2 = 6$ kHz.
2. Decode 32 time-domain envelope parameters $T_{env}(i)$, $i = 0, \dots, 31$, from the bitstream, each corresponding to 10 time-domain samples, i.e. $I = 32$, $A = 10$.
3. Divide $exc_{HB}(m)$ into 32 equal segments of 10 samples each and compute the corresponding $T'_{env}(i)$.
Then compute the time-domain gain $g_T(i) = 2^{\,T_{env}(i) - T'_{env}(i)}$ and interpolate each $g_T(i)$ with the smoothing interpolation algorithm:
$$g_{T,i}(a) = w_T(a) \cdot g_T(i) + [1 - w_T(a)] \cdot g^{last}_{T,i}(a), \quad a = 0, \dots, 4,$$
$$g_{T,i}(a) = g_T(i), \quad a = 5, \dots, 9,$$
where $w_T(a)$ = {0.0669872981f, 0.2500000000f, 0.5000000000f, 0.7500000000f, 0.9330127019f} for $a$ = 0 to 4 in order, and the suffix f denotes a floating-point number. Then compute the time-domain shaped signal:
$$S_T(m) = g_{T,i}(a) \cdot exc_{HB}(m)$$
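The five listed window values agree with the raised-cosine expression for $w_T(a)$ stated in claim 8, $w_T(a) = \tfrac{1}{2}\{1 - \cos[(a+1)\pi/6]\}$ for $a = 0, \dots, 4$. A short Python check, with illustrative variable names, makes the comparison explicit:

```python
import math

# Values as listed in the text for a = 0..4.
listed = [0.0669872981, 0.2500000000, 0.5000000000, 0.7500000000, 0.9330127019]

# Values from the closed-form window in claim 8.
computed = [0.5 * (1.0 - math.cos((a + 1) * math.pi / 6.0)) for a in range(5)]

max_error = max(abs(lv - cv) for lv, cv in zip(listed, computed))
```

The largest difference is below the ten-digit precision of the listed constants.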
4. Decode 15 frequency-domain envelope parameters $F_{env}(j)$, $j = 0, \dots, 14$, from the bitstream, each corresponding to a subband of 0.4 kHz bandwidth, i.e. $J = 15$.
5. Apply the sine window $w_{TDAC}(k)$ to $S_T(m)$ and the previous frame $S_{T,last}(m)$, where
$$w_{TDAC}(k) = \sin[(k + 0.5)\pi/640], \quad k = 0, \dots, 639,$$
to obtain the windowed signal $S_w(k)$:
$$S_w(k) = w_{TDAC}(k) \cdot S_{T,last}(k), \quad k = 0, \dots, 319,$$
$$S_w(k) = w_{TDAC}(k) \cdot S_T(k - 2N), \quad k = 320, \dots, 639;$$
Then apply a 640-point MDCT to the windowed sequence $S_w(k)$ to generate the frequency-domain signal $S_{F1}(m)$.
Because 3/4 low-pass filtering was applied while generating $exc_{HB}(m)$, removing the 6~8 kHz band, only the data in the 0~6 kHz band are valid. Therefore the first 240 points of $S_{F1}(m)$ are extracted to compute 15 values $F'_{env}(j)$, with 16 points per group, i.e. $D = 16$.
Then compute the frequency-domain gain $g_F(j) = 2^{\,F_{env}(j) - F'_{env}(j)}$ and obtain the frequency-domain shaped signal $S_{F2}(m) = g_F(j) \cdot S_{F1}(m)$. Finally, apply the IMDCT to $S_{F2}(m)$ to obtain $S_F(m)$.
6. Process the 320 samples of $S_F(m)$ once per 80 samples, each time dividing them into three parts (the first 6 samples, the middle 70 samples and the last 4 samples), and adjust the envelope using $limit_1(i) = 2^{T_{env}(i)}$ and $limit_2(i) = 2.5 \times 2^{T_{env}(i)}$.
7. Then perform spectral folding on the envelope-adjusted 0~6 kHz signal to obtain the 8~14 kHz high-frequency reconstructed signal $S_{HB}(m)$.
Merging $S_{HB}(m)$ with the low-frequency signal (0~8 kHz) obtained by decoding the core bitstream, for example by QMF synthesis, yields the complete super-wideband reconstructed signal (0~14 kHz).
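The numbers in this worked example follow from $N = 160$, $B_0 = 4$ kHz, $A = 10$ and $D = 16$. The following Python lines recompute them as a consistency check; the variable names are chosen for this sketch only.

```python
N = 160                           # core-layer samples per frame
frame_len = 2 * N                 # samples of exc_HB(m) per frame
A, D = 10, 16                     # samples per time segment, bins per subband

I = frame_len // A                # number of time-domain envelope parameters
valid_bins = 3 * frame_len // 4   # bins kept after the 3/4 low-pass filter
J = valid_bins // D               # number of frequency-domain envelope parameters

B0_khz = 4.0
B2_khz = 3 * B0_khz / 2           # bandwidth of exc_HB(m) after filtering
subband_khz = B2_khz / J          # bandwidth covered by each F_env(j)
high_band_khz = (2 * B0_khz, 2 * B0_khz + B2_khz)
```

The results match the text: 32 time-domain parameters, 240 valid bins, 15 subbands of 0.4 kHz each, and a reconstructed high band of 8 to 14 kHz.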
A person of ordinary skill in the art will understand that all or some of the steps of the methods in the above embodiments can be carried out by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, which may include a ROM, a RAM, a magnetic disk, an optical disc, and the like.
The method for generating an excitation signal in bandwidth extension and the corresponding method and apparatus for reconstructing a high-frequency signal provided by the invention have been described in detail above. Specific examples are used herein to explain the principle and embodiments of the invention, and the description of the above embodiments is intended only to help in understanding the method of the invention and its core idea. A person of ordinary skill in the art may, following the idea of the invention, make changes to the specific embodiments and the scope of application. In summary, this description should not be construed as limiting the invention.
Claims (19)
1. A method for generating an excitation signal in bandwidth extension, characterized by comprising:
generating a first excitation signal $exc(n)$, $n = 0, \dots, N-1$, with a frequency range of $0 \sim B_0$;
performing spectral folding on $exc(n)$ to generate a second excitation signal $exc_{fold}(n)$ with a frequency range of $B_0 \sim 2B_0$;
performing synthesis filtering on $exc(n)$ and $exc_{fold}(n)$ to output a third excitation signal $exc_{HB}(m)$, $m = 0, \dots, 2N-1$, with a frequency range of $0 \sim 2B_0$, the third excitation signal $exc_{HB}(m)$ being used as the high-frequency excitation signal for reconstructing a high-frequency signal.
2. The method for generating an excitation signal according to claim 1, characterized in that the step of performing synthesis filtering on $exc(n)$ and $exc_{fold}(n)$ is specifically: performing quadrature mirror filter (QMF) synthesis filtering on $exc(n)$ and $exc_{fold}(n)$.
3. The method for generating an excitation signal according to claim 1 or 2, characterized by further comprising:
performing 3/4 low-pass filtering on the $exc_{HB}(m)$ with a frequency range of $0 \sim 2B_0$ to output $exc_{HB}(m)$ with a frequency range of $0 \sim 3B_0/2$.
4. The method for generating an excitation signal according to claim 3, characterized in that the step of generating $exc(n)$ is specifically:
decoding a core bitstream to obtain a fixed codebook excitation and an adaptive codebook excitation together with their respective gains;
obtaining $exc(n)$ as the weighted superposition of the fixed codebook excitation and the adaptive codebook excitation according to their respective gains.
5. The method for generating an excitation signal according to claim 4, characterized in that:
the fixed codebook excitation comprises a base-layer fixed codebook excitation $c(n)$ and an enhancement-layer enhancement excitation $c'(n)$, with corresponding gains $g_c$ and $g_{enh}$ respectively;
$exc(n)$ is computed according to the following formula:
$$exc(n) = g_p\, v(n) + g_c\, c(n) + g_{enh}\, c'(n)$$
where $v(n)$ is the adaptive codebook excitation, $g_p$ is the gain of $v(n)$, $N = 160$ and $B_0 = 4$ kHz.
6. A method for reconstructing a high-frequency signal in bandwidth extension, characterized by comprising:
generating an excitation signal $exc_{HB}(m)$, $m = 0, \dots, 2N-1$, according to the method of any one of claims 1 to 5;
decoding to obtain time-domain envelope parameters $T_{env}(i)$ and frequency-domain envelope parameters $F_{env}(j)$, where $i = 0, \dots, I-1$ and $j = 0, \dots, J-1$;
adjusting the time-domain envelope of $exc_{HB}(m)$ according to $T_{env}(i)$, each $T_{env}(i)$ adjusting one segment of $A$ time-domain samples of $exc_{HB}(m)$, $A \le 2N/I$, to generate the time-domain adjusted signal $S_T(m)$;
adjusting the frequency-domain envelope of $S_T(m)$ according to $F_{env}(j)$, each $F_{env}(j)$ adjusting one subband of bandwidth $B_1$ in the frequency domain of $S_T(m)$, where $B_1 \le B_2/J$ and $B_2$ is the bandwidth of $S_T(m)$, to generate the frequency-domain adjusted reconstructed signal $S_F(m)$;
performing spectral folding on $S_F(m)$ to generate the high-frequency reconstructed signal $S_{HB}(m)$ with a frequency range of $2B_0 \sim 2B_0 + B_2$.
7. The method for reconstructing a high-frequency signal according to claim 6, characterized in that the step of adjusting the time-domain envelope of $exc_{HB}(m)$ according to $T_{env}(i)$ comprises:
computing the time-domain envelope parameters $T'_{env}(i)$ of $exc_{HB}(m)$ in the same way that the encoder computes $T_{env}(i)$;
computing the preliminary time-domain gain factor $g_T(i)$ from the energy difference between $T_{env}(i)$ and $T'_{env}(i)$, each $g_T(i)$ corresponding to one segment of $A$ time-domain samples of $exc_{HB}(m)$;
interpolating each $g_T(i)$ to obtain $A$ gain factors $g_{T,i}(a)$, $a = 0, \dots, A-1$;
adjusting the gain of the $A \times I$ samples of $exc_{HB}(m)$ according to $g_{T,i}(a)$ to obtain $S_T(m)$.
8. The method for reconstructing a high-frequency signal according to claim 7, characterized in that $A = 10$, $I = N/5$, and the step of computing $g_T(i)$ from the energy difference between $T_{env}(i)$ and $T'_{env}(i)$ is specifically:
$$g_T(i) = 2^{\,T_{env}(i) - T'_{env}(i)};$$
the step of interpolating each $g_T(i)$ to obtain $A$ values $g_{T,i}(a)$ is specifically:
$$g_{T,i}(a) = w_T(a)\, g_T(i) + [1 - w_T(a)]\, g^{last}_{T,i}(a);$$
where $w_T(a)$ is a window function, with $w_T(a) = \tfrac{1}{2}\{1 - \cos[(a+1)\pi/6]\}$ when $a = 0, \dots, 4$ and $w_T(a) = 1$ when $a = 5, \dots, 9$; and $g^{last}_{T,i}(a)$ is the gain factor of the corresponding sample of $exc_{HB}(m)$ in the previous frame.
9. The method for reconstructing a high-frequency signal according to any one of claims 6 to 8, characterized in that the step of adjusting the frequency-domain envelope of $S_T(m)$ according to $F_{env}(j)$ comprises:
in the same way that the encoder computes $F_{env}(j)$, applying a time-frequency transform to $S_T(m)$ to generate the frequency-domain signal $S_{F1}(m)$ and computing the frequency-domain envelope parameters $F'_{env}(j)$ of $S_{F1}(m)$;
computing the preliminary frequency-domain gain factor $g_F(j)$ from the energy difference between $F_{env}(j)$ and $F'_{env}(j)$, each $g_F(j)$ corresponding to one segment of $D$ frequency-domain samples of $S_{F1}(m)$, $D \times J \le 2N$;
interpolating each $g_F(j)$ to obtain $D$ gain factors $g_{F,j}(d)$, $d = 0, \dots, D-1$;
adjusting the gain of the $D \times J$ samples of $S_{F1}(m)$ according to $g_{F,j}(d)$ to generate the adjusted frequency-domain signal $S_{F2}(m)$;
applying the inverse of the time-frequency transform to $S_{F2}(m)$ to obtain $S_F(m)$.
10. The method for reconstructing a high-frequency signal according to claim 9, characterized in that the step of generating $S_{F1}(m)$ and computing $F'_{env}(j)$ in the same way that the encoder computes $F_{env}(j)$ comprises:
applying the window $w_{TDAC}(k)$ to $S_T(m)$ and the previous frame $S_{T,last}(m)$ to obtain the windowed signal $S_w(k)$, $k = 0, \dots, 4N-1$, where
$$S_w(k) = w_{TDAC}(k) \cdot S_{T,last}(k), \quad k = 0, \dots, 2N-1,$$
$$S_w(k) = w_{TDAC}(k) \cdot S_T(k - 2N), \quad k = 2N, \dots, 4N-1;$$
applying a discrete cosine transform to $S_w(k)$ to generate $S_{F1}(m)$;
extracting the first $D \times J$ samples of $S_{F1}(m)$ to compute $F'_{env}(j)$.
11. The method for reconstructing a high-frequency signal according to claim 10, characterized in that $D = 16$, $J = 3N/32$, and the window function $w_{TDAC}(k)$ is:
$$w_{TDAC}(k) = \sin\left[\frac{(k + 0.5)\pi}{4N}\right]$$
12. The method for reconstructing a high-frequency signal according to any one of claims 6 to 11, characterized by further comprising, before performing spectral folding on $S_F(m)$:
adjusting the envelope of $S_F(m)$ using envelope adjustment thresholds $limit_1(i)$ and $limit_2(i)$, the adjusted $S_F(m)$ being:
in the part $m = m_1 \sim m_2$, if $|S_{F,old}(m)| < limit_1(i)$, then $S_F(m) = S_{F,old}(m)$;
in the part $m = m_2+1 \sim m_3$, if $limit_1(i) \le |S_{F,old}(m)| \le limit_2(i)$, then $S_F(m) = [S_{F,old}(m) - limit_1(i)]/2 + limit_1(i)$;
in the part $m = m_3+1 \sim m_4$, if $|S_{F,old}(m)| > limit_2(i)$, then $S_F(m) = [S_{F,old}(m) - limit_2(i)]/16 + limit_2(i)$;
where $S_{F,old}(m)$ is $S_F(m)$ before the envelope adjustment, and the mapping between $limit_1(i)$, $limit_2(i)$ and the time-domain samples of $S_F(m)$ is the same as the mapping between $T_{env}(i)$ and the time-domain samples of $S_F(m)$.
13. The method for reconstructing a high-frequency signal according to claim 12, characterized in that $limit_1(i)$ and $limit_2(i)$ are:
$$limit_1(i) = 2^{T_{env}(i)}, \quad limit_2(i) = 2.5 \times 2^{T_{env}(i)}$$
14. The method for reconstructing a high-frequency signal according to claim 13, characterized in that $N = 160$; the $m_1 \sim m_2$ parts are samples 0~5, 80~85, 160~165 and 240~245; the $m_2+1 \sim m_3$ parts are samples 6~75, 86~155, 166~235 and 246~315; and the $m_3+1 \sim m_4$ parts are samples 76~79, 156~159, 236~239 and 316~319.
15. An apparatus for generating an excitation signal in bandwidth extension, characterized by comprising:
a core codec module, configured to output a first excitation signal $exc(n)$, $n = 0, \dots, N-1$, with a frequency range of $0 \sim B_0$;
a spectral folding module, configured to perform spectral folding on $exc(n)$ and output a second excitation signal $exc_{fold}(n)$ with a frequency range of $B_0 \sim 2B_0$;
a synthesis filtering module, configured to perform synthesis filtering on $exc(n)$ and $exc_{fold}(n)$ and output a third excitation signal $exc_{HB}(m)$, $m = 0, \dots, 2N-1$, with a frequency range of $0 \sim 2B_0$, the third excitation signal $exc_{HB}(m)$ being used as the high-frequency excitation signal for reconstructing a high-frequency signal.
16. The apparatus for generating an excitation signal according to claim 15, characterized in that the synthesis filtering module is a quadrature mirror filter (QMF) synthesis filter.
17. The apparatus for generating an excitation signal according to claim 15 or 16, characterized by further comprising:
a 3/4 low-pass filter, configured to receive the $exc_{HB}(m)$ with a frequency range of $0 \sim 2B_0$, apply 3/4 low-pass filtering to it, and output $exc_{HB}(m)$ with a frequency range of $0 \sim 3B_0/2$.
18. An apparatus for reconstructing a high-frequency signal in bandwidth extension, characterized by comprising:
an excitation signal generation unit, which adopts the logical structure of the apparatus for generating an excitation signal according to any one of claims 15 to 17 and is configured to generate the excitation signal $exc_{HB}(m)$, $m = 0, \dots, 2N-1$;
a decoding unit, configured to decode and output time-domain envelope parameters $T_{env}(i)$ and frequency-domain envelope parameters $F_{env}(j)$, where $i = 0, \dots, I-1$ and $j = 0, \dots, J-1$;
a time-domain shaping unit, configured to adjust the time-domain envelope of $exc_{HB}(m)$ according to $T_{env}(i)$, each $T_{env}(i)$ adjusting one segment of $A$ time-domain samples of $exc_{HB}(m)$, $A \le 2N/I$, and to output the time-domain adjusted signal $S_T(m)$;
a frequency-domain shaping unit, configured to adjust the frequency-domain envelope of $S_T(m)$ according to $F_{env}(j)$, each $F_{env}(j)$ adjusting one subband of bandwidth $B_1$ in the frequency domain of $S_T(m)$, where $B_1 \le B_2/J$ and $B_2$ is the bandwidth of $S_T(m)$, and to output the frequency-domain adjusted reconstructed signal $S_F(m)$;
a spectral folding unit, configured to perform spectral folding on the input $S_F(m)$ to generate the high-frequency reconstructed signal $S_{HB}(m)$ with a frequency range of $2B_0 \sim 2B_0 + B_2$.
19. The apparatus for reconstructing a high-frequency signal according to claim 18, characterized by further comprising:
a post-processing unit, configured to adjust the envelope of the $S_F(m)$ output by the frequency-domain shaping unit using envelope adjustment thresholds $limit_1(i)$ and $limit_2(i)$, the adjusted $S_F(m)$ being: in the part $m = m_1 \sim m_2$, if $|S_{F,old}(m)| < limit_1(i)$, then $S_F(m) = S_{F,old}(m)$; in the part $m = m_2+1 \sim m_3$, if $limit_1(i) \le |S_{F,old}(m)| \le limit_2(i)$, then $S_F(m) = [S_{F,old}(m) - limit_1(i)]/2 + limit_1(i)$; in the part $m = m_3+1 \sim m_4$, if $|S_{F,old}(m)| > limit_2(i)$, then $S_F(m) = [S_{F,old}(m) - limit_2(i)]/16 + limit_2(i)$; where $S_{F,old}(m)$ is $S_F(m)$ before the envelope adjustment, and the mapping between $limit_1(i)$, $limit_2(i)$ and the time-domain samples of $S_F(m)$ is the same as the mapping between $T_{env}(i)$ and the time-domain samples of $S_F(m)$; and to output the adjusted $S_F(m)$ to the spectral folding unit.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN200710198774XA CN101458930B (en) | 2007-12-12 | 2007-12-12 | Excitation signal generation in bandwidth spreading and signal reconstruction method and apparatus |
PCT/CN2008/073368 WO2009076871A1 (en) | 2007-12-12 | 2008-12-08 | Method and apparatus for generating excitation signal and regenerating signal in bandwidth extension |
Publications (2)
Publication Number | Publication Date |
---|---|
CN101458930A (en) | 2009-06-17 |
CN101458930B (en) | 2011-09-14 |
Family
ID=40769743
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN200710198774XA Expired - Fee Related CN101458930B (en) | 2007-12-12 | 2007-12-12 | Excitation signal generation in bandwidth spreading and signal reconstruction method and apparatus |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN101458930B (en) |
WO (1) | WO2009076871A1 (en) |
Also Published As
Publication number | Publication date |
---|---|
WO2009076871A1 (en) | 2009-06-25 |
CN101458930A (en) | 2009-06-17 |
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | |
PB01 | Publication | |
C10 | Entry into substantive examination | |
SE01 | Entry into force of request for substantive examination | |
C14 | Grant of patent or utility model | |
GR01 | Patent grant | |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20110914; Termination date: 20171212 |