CN106935243A - Low-bit-rate digital speech vector quantization method and system based on MELP - Google Patents
Low-bit-rate digital speech vector quantization method and system based on MELP
- Publication number
- CN106935243A CN106935243A CN201511005800.3A CN201511005800A CN106935243A CN 106935243 A CN106935243 A CN 106935243A CN 201511005800 A CN201511005800 A CN 201511005800A CN 106935243 A CN106935243 A CN 106935243A
- Authority
- CN
- China
- Prior art keywords
- vector quantization
- melp
- lsf
- lsf parameters
- digital speech
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/032—Quantisation or dequantisation of spectral components
- G10L19/038—Vector quantisation, e.g. TwinVQ audio
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/08—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
- G10L19/087—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters using mixed excitation models, e.g. MELP, MBE, split band LPC or HVXC
Abstract
An embodiment of the invention provides a low-bit-rate digital speech vector quantization method and system based on MELP. The invention uses the mixed excitation linear prediction (MELP) algorithm to perform linear predictor coefficient vector quantization on the adjusted pitch signal, including: applying two-stage split vector quantization to the LSF parameters, first obtaining the first-stage vector-quantized LSF parameters and then obtaining the second-stage vector-quantized LSF parameters from the first-stage result; and performing digital speech vector quantization with the second-stage vector-quantized LSF parameters. On the basis of the MELP algorithm, the invention adopts a two-stage LSF vector quantization scheme that lowers the bit rate and reduces codebook storage and computational complexity.
Description
Technical field
The present invention relates to the field of signal processing technology, and in particular to a low-bit-rate digital speech vector quantization method based on MELP.
Background technology
At present, research on low-bit-rate digital speech compression algorithms is increasingly mature. Among low-bit-rate digital speech algorithms, the mixed excitation linear prediction (MELP, Mixed Excitation Linear Prediction) algorithm has distinctive advantages: the 2.4 kbps MELP coder builds on LPC (linear predictive coding) and combines the strengths of coding methods such as mixed excitation, multiband excitation, and prototype waveform interpolation, synthesizing speech with a new production model that better matches human articulation. MELP is characterized by multiband mixed excitation, aperiodic pulses, residual harmonic processing, adaptive spectral enhancement, and pulse shaping filtering.
For the above problem, the prior art generally proposes recognition-synthesis vocoders, which encode the speech signal using speech recognition and synthesis techniques with speech primitives as the coding unit; this can bring the coding rate below 1 kb/s. In addition, on the basis of 2.4 kb/s linear predictive coding (LPC, Linear Predictive Coding), vector quantization and the inter-frame correlation of speech have been used to compress speech data further. Vector quantization treats a group of scalar values as one vector and quantizes it as a whole in the vector space, compressing the data without losing much information; the efficiency of vector quantization determines the efficiency of the encoder. In low-rate parameter quantization, quantizing the LSP (Line Spectrum Pair) parameters takes a relatively large number of bits, so improving the LSP quantization method can significantly reduce the coding rate. Consecutive frames of a speech signal, especially in steady segments, are strongly correlated; if speech parameters were encoded and transmitted only every other frame, the coding rate would drop substantially. Some researchers have therefore proposed using inter-frame correlation to further reduce the number of quantization bits: several consecutive frames are treated as one superframe whose parameters are vector-quantized jointly to compress inter-frame redundancy. Other researchers have proposed a variable-length segment quantization method that regards the input speech as a sequence of variable-length segments, each composed of one or a few frames and represented by parameters such as per-frame gain, pitch, and spectrum. Although more complex to implement, this can markedly lower the coding rate, shorten the coding delay, and produce higher-quality synthesized speech.
Summary of the invention
Embodiments of the present invention provide a low-bit-rate digital speech vector quantization method and system based on MELP. The invention provides the following scheme:
performing linear predictor coefficient vector quantization on the adjusted pitch signal using the mixed excitation linear prediction (MELP) algorithm, including: applying two-stage split vector quantization to the LSF parameters, first obtaining the first-stage vector-quantized LSF parameters and then obtaining the second-stage vector-quantized LSF parameters from the first-stage result;
performing digital speech vector quantization using the second-stage vector-quantized LSF parameters.
According to another aspect of the present invention, a low-bit-rate digital speech vector quantization system based on MELP is also provided, including:
a coefficient acquisition module, configured to perform linear predictor coefficient vector quantization on the adjusted pitch signal using the mixed excitation linear prediction (MELP) algorithm, including: applying two-stage split vector quantization to the LSF parameters, first obtaining the first-stage vector-quantized LSF parameters and then obtaining the second-stage vector-quantized LSF parameters from the first-stage result; and
a quantization module, configured to perform digital speech vector quantization using the second-stage vector-quantized LSF parameters.
As can be seen from the technical scheme provided by the above embodiments, the embodiments of the invention provide a low-bit-rate digital speech vector quantization method and system based on MELP. The invention uses the mixed excitation linear prediction (MELP) algorithm to perform linear predictor coefficient vector quantization on the adjusted pitch signal, including: applying two-stage split vector quantization to the LSF parameters, first obtaining the first-stage vector-quantized LSF parameters and then obtaining the second-stage vector-quantized LSF parameters from the first-stage result; and performing digital speech vector quantization with the second-stage vector-quantized LSF parameters. This MELP-based low-bit-rate digital speech algorithm design starts from existing design methods and their defects, and proposes a new MELP-based low-bit-rate digital speech construction method. On the basis of the MELP algorithm, the quantization of the method is analyzed, with emphasis on the quantization of the pitch period and of the linear predictor coefficients, and an improved method is proposed for the quantization of the linear predictor coefficients: a two-stage LSF vector quantization scheme that lowers the bit rate and reduces codebook storage and computational complexity, and is more advantageous than the original scheme.
Brief description of the drawings
To illustrate the technical schemes of the embodiments of the present invention more clearly, the accompanying drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a processing flowchart of a low-bit-rate digital speech vector quantization method based on MELP provided by embodiment one of the present invention;
Fig. 2 is a module diagram of a low-bit-rate digital speech vector quantization system based on MELP provided by embodiment two of the present invention.
Specific embodiment
To facilitate understanding of the embodiments of the present invention, several specific embodiments are further explained below with reference to the accompanying drawings; these embodiments do not limit the present invention.
Embodiment one
In an embodiment of the invention, the pitch signal must first be obtained. In this embodiment, obtaining the pitch signal specifically includes:
passing the sampled digital speech signal through a high-pass filter to obtain a filtered signal;
performing a voicing decision on the filtered signal using multiband mixed excitation, and calculating the gain of the filtered signal, to obtain the pitch signal.
Specifically, the filtered signal is divided into several subbands, a voicing decision is made for each, and the voicing strength of each subband is marked as voiced or unvoiced.
The voicing strength of each subband is denoted by the parameter Vbpi (i = 1, 2, ..., n), where Vbpi represents the voicing strength of the i-th subband; a value of 1 denotes voiced and 0 denotes unvoiced.
In the present embodiment, each 22.5 ms of speech is taken as one analysis frame, corresponding to 180 samples at an 8 kHz sampling rate (8000 samples/s); after processing, 54 bits are output per frame for transmission, so the rate is 2.4 kbps.
Taking the division of the filtered signal into 5 subbands as an example, with subband parameters Vbpi (i = 1, 2, ..., 5):
preferably, the input signal is passed through five 6th-order Butterworth bandpass filters, dividing it into five subbands: 0 Hz-500 Hz, 500 Hz-1000 Hz, 1000 Hz-2000 Hz, 2000 Hz-3000 Hz, and 3000 Hz-4000 Hz. The output of the 0 Hz-500 Hz bandpass filter is used to estimate the fractional pitch, yielding the fractional pitch period P2 and the corresponding autocorrelation value r(P2); the value of r(P2) determines the lowest-band and overall unvoiced/voiced decision. A first strength threshold is set on the autocorrelation value r(P2) of the fractional pitch period P2; in this embodiment its value is 0.6.
When the voicing strength parameter Vbp1 of the first subband is not greater than the first strength threshold, the current frame is an unvoiced frame, and the voicing strengths Vbpi (i = 1, 2, 3, 4, 5) of all bands are coded with unvoiced-frame quantization;
when the voicing strength parameter Vbp1 of the first subband is greater than the first strength threshold, the current frame is a voiced frame, and the voicing strengths Vbpi (i = 1, 2, 3, 4, 5) of the bands are coded with voiced-frame quantization.
That is, when Vbp1 <= 0.6, the current frame is unvoiced and the voicing strengths Vbpi (i = 1, 2, 3, 4, 5) of all bands are quantization-coded as 0; when Vbp1 > 0.6, the current frame is voiced, Vbp1 is coded as 1, and the remaining bands (i = 2, 3, 4, 5) are coded accordingly.
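The per-band voicing decision just described can be sketched as follows; the function name and the individual thresholding of bands 2-5 are assumptions, since the description only states explicitly that Vbp1 is coded as 1 for voiced frames:

```python
def encode_band_voicing(vbp, threshold=0.6):
    """Code per-band voicing strengths Vbp1..Vbp5 as 0/1 bits.

    If the lowest band's strength is at or below the threshold the frame
    is unvoiced and every band is coded 0.  Otherwise the frame is voiced,
    Vbp1 is coded 1, and (an assumption here) the remaining bands are
    thresholded individually against the same value.
    """
    if vbp[0] <= threshold:
        return [0] * len(vbp)
    return [1] + [1 if v > threshold else 0 for v in vbp[1:]]
```

For example, a frame with strengths [0.5, 0.9, 0.9, 0.9, 0.9] is coded fully unvoiced because its lowest band falls below 0.6, matching the rule above.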
To calculate the gain of the filtered signal, the sampled digital speech signal must first be windowed with an adjusted window length. Specifically,
the window length applied to the sampled digital speech signal is adjusted according to the voicing strength parameter of the first subband: when the voicing strength of the first subband is greater than the first strength threshold, and the smallest multiple of the fractional pitch period P2 exceeding the base window length is not greater than the long-window threshold, the window length is adjusted to that smallest multiple of P2;
when the voicing strength of the first subband is greater than the first strength threshold, and the smallest multiple of the fractional pitch period P2 is greater than the long-window threshold, the window length is adjusted to half of that smallest multiple;
when the voicing strength of the first subband is not greater than the first strength threshold, the window length is set to the base length.
For example, with 0.6 as the first strength threshold in this embodiment, when Vbp1 > 0.6 the window length is the smallest multiple of the fractional pitch period P2 greater than 120 samples. In this embodiment, each 22.5 ms of speech is one analysis frame, corresponding to 180 samples at an 8 kHz sampling rate (8000 samples/s), with 54 bits output per frame for transmission (hence a rate of 2.4 kbps); the adjusted window length must therefore exceed 120 samples.
Taking a long-window threshold of 320 samples as an example, if the window length calculated above exceeds 320 samples, it is divided by 2.
When Vbp1 <= 0.6, the window length is set to 120 samples.
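A minimal sketch of this window-length rule follows; the function name, and the reading of the rule as "smallest integer multiple of P2 above the base length", are assumptions drawn from the example values (120 and 320 samples) in the text:

```python
def analysis_window_length(vbp1, p2, base=120, max_len=320, threshold=0.6):
    """Window-length adjustment rule sketched from the description.

    Unvoiced frames (Vbp1 <= threshold) use the base window of 120
    samples.  Voiced frames use the smallest integer multiple of the
    pitch period P2 that exceeds the base length; if that multiple
    exceeds max_len samples it is halved.
    """
    if vbp1 <= threshold:
        return base
    mult = p2
    while mult <= base:
        mult += p2          # smallest multiple of P2 above the base length
    if mult > max_len:
        mult //= 2
    return mult
```

With P2 = 50 samples and a voiced frame this yields a 150-sample window (50, 100, 150), while a long period of 330 samples is halved to 165.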
Next, linear predictive coding is applied to the pitch signal to obtain the residual signal; the pitch period of the pitch signal is calculated, and the pitch signal is adjusted according to the pitch period and the residual signal to obtain the adjusted pitch signal. Specifically, this includes:
Step A: performing LPC (Linear Predictive Coding) on the sampled digital speech signal of the pitch signal.
In this embodiment, a 200-sample Hamming window covering 25 ms of speech is applied to the sampled digital speech signal, followed by 10th-order linear predictive coding; the center of the window is the reference point of the current frame.
Step B: after linear predictive coding, the residual signal is obtained. The residual signal contains no vocal-tract response information but contains the complete excitation information; its use reduces the influence of vocal-tract characteristics and improves the pitch period estimate.
To obtain the residual signal, the sampled digital speech signal is passed through the linear prediction error filter, whose transfer function is:
A(z) = 1 - Σ_{i=1}^{10} a_i z^{-i}
where a_i are the linear predictor coefficients. The residual signal is:
e(n) = s(n) - Σ_{i=1}^{10} a_i s(n - i), n = 1, 2, ..., N
where N is the window length of the residual analysis and s(n) is the windowed speech signal. The linear prediction error filter is an FIR filter whose output is the residual signal.
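The prediction-error filtering of Step B can be sketched directly; this is a minimal illustration whose coefficient convention e(n) = s(n) - Σ a_i s(n - i) follows the transfer function above:

```python
def lpc_residual(s, a):
    """Apply the FIR prediction-error filter A(z) = 1 - sum_i a_i z^-i.

    Returns e(n) = s(n) - sum_{i=1..p} a[i-1] * s(n - i), treating
    samples before the start of the buffer as zero.
    """
    p = len(a)
    return [s[n] - sum(a[i] * s[n - 1 - i] for i in range(min(p, n)))
            for n in range(len(s))]
```

Driving the filter with a signal that obeys the predictor exactly leaves only the leading samples nonzero, which is one way to sanity-check the sign convention of the coefficients.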
Step C: calculating the pitch period of the pitch signal, and adjusting the pitch signal according to the pitch period and the residual signal to obtain the adjusted pitch signal.
Step C1: for the integer pitch period calculation, the sampled digital speech signal first passes through a 6th-order Butterworth low-pass filter with a 1 kHz cutoff frequency, eliminating the interference of high-frequency speech components with the pitch period estimate in the parameter analysis. The normalized autocorrelation function r(τ) is defined as:
r(τ) = c(0, τ) / sqrt(c(0, 0) · c(τ, τ))
where
c(m, n) = Σ_k s(k + m) · s(k + n), the sum running over the analysis window.
The integer pitch period is the value of τ at which the normalized autocorrelation function r(τ) reaches its maximum; from the formula above, the lag achieving max(r(τ)) is the integer pitch period P1.
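A compact sketch of this autocorrelation-based search (the function names and the lag range are illustrative; the patent does not give them):

```python
import math

def norm_autocorr(s, tau, window):
    """r(tau) = c(0, tau) / sqrt(c(0, 0) * c(tau, tau)) over a window."""
    def c(m, n):
        return sum(s[k + m] * s[k + n] for k in range(window))
    denom = math.sqrt(c(0, 0) * c(tau, tau))
    return c(0, tau) / denom if denom > 0 else 0.0

def integer_pitch(s, tau_min, tau_max, window):
    """Integer pitch period P1: the lag maximizing r(tau)."""
    return max(range(tau_min, tau_max + 1),
               key=lambda t: norm_autocorr(s, t, window))
```

On a pure 400 Hz tone sampled at 8 kHz (period 20 samples), a search over lags 15-30 returns 20.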
Step C2: the output signal of the first subband bandpass filter (0-500 Hz) is Sb1(n); the main function of Sb1(n) is the search for the fractional pitch period. Because the first sub-band filter removes harmonics above roughly four times the pitch frequency from Sb1(n), the influence of higher harmonics on the pitch search is eliminated. After the above operations, and combined with the roughly estimated integer pitch period P1, a more accurate pitch period can be estimated. Using the integer pitch periods roughly estimated from the current and previous frames, a fine search for the integer pitch is performed in the range (P1 - 5, P1 + 5) to obtain P2, and P2 is then used to calculate the fractional pitch period. Calculating the fractional pitch period significantly improves the accuracy of the pitch period estimate. The real pitch period may lie in (P2 - 1, P2) or in (P2, P2 + 1); which side is typically determined by comparing the sizes of C_{P2}(0, P2 - 1) and C_{P2}(0, P2 + 1) using the correlation formula c(m, n). Once the range [P, P + 1] containing the pitch period is determined, the fractional pitch period can be found by interpolation.
In the present embodiment, for the fractional pitch period calculation, the fractional pitch period is extracted from the output signal of the first band (0-500 Hz) of the bandpass analysis; the two candidate values are the integer pitch periods of the current frame and of the previous frame. Let Δ be the offset of the real pitch period from the integer pitch period T, with 0 < Δ < 1. The formula for calculating Δ is:
Δ = [c(0, T + 1) · c(T, T) - c(0, T) · c(T, T + 1)] / {c(0, T + 1) · [c(T, T) - c(T, T + 1)] + c(0, T) · [c(T + 1, T + 1) - c(T, T + 1)]}
The normalized autocorrelation value of the fractional pitch period is:
r(T + Δ) = [(1 - Δ) · c(0, T) + Δ · c(0, T + 1)] / sqrt{c(0, 0) · [(1 - Δ)² · c(T, T) + 2Δ(1 - Δ) · c(T, T + 1) + Δ² · c(T + 1, T + 1)]}
Setting A = c(0, 0), B = c(0, T), C = c(0, T + 1), D = c(T, T), E = c(T, T + 1), F = c(T + 1, T + 1) and substituting into the two formulas above yields the fractional pitch period.
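Written with the A..F shorthand above, the offset computation reduces to a few lines; the clamping of Δ to [0, 1] is an assumption covering out-of-range cases the description does not discuss:

```python
import math

def fractional_pitch_offset(B, C, D, E, F):
    """Delta = (C*D - B*E) / (C*(D - E) + B*(F - E)), clamped to [0, 1]."""
    denom = C * (D - E) + B * (F - E)
    if denom == 0.0:
        return 0.0
    return min(max((C * D - B * E) / denom, 0.0), 1.0)

def fractional_pitch_corr(A, B, C, D, E, F, delta):
    """r(T + Delta) computed from the same correlation terms."""
    num = (1 - delta) * B + delta * C
    den = math.sqrt(A * ((1 - delta) ** 2 * D
                         + 2 * delta * (1 - delta) * E
                         + delta ** 2 * F))
    return num / den if den > 0 else 0.0
```

For instance, with B = 0.9, C = 0.95, D = 1.0, E = 0.9, F = 1.0 the offset evaluates to about 0.757, placing the pitch period roughly three quarters of a sample past T.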
Step C3: the final pitch period is calculated on the basis of the pitch period candidates from steps C1 and C2. Let P3 be the final pitch period estimate, with corresponding normalized autocorrelation value r(P3). When the autocorrelation value is large (r(P3) >= 0.6), the pitch period estimate is reliable; finally, a pitch-multiple check is performed on the low-pass-filtered residual signal to obtain the final pitch period estimate. The pitch period affects the recognition rate of speech recognition and the accuracy of speech compression coding.
When the autocorrelation value is small (r(P3) < 0.6), the pitch signal in the LPC residual signal may be corrupted by noise, or the frame signal may be unstable; in that case the sampled digital speech signal is used instead of the LPC residual signal to search for the fractional pitch period nearby, yielding new values of P3 and r(P3).
An embodiment of the invention provides a low-bit-rate digital speech vector quantization method based on MELP whose processing flow, shown in Fig. 1, includes the following steps:
Step 11: performing linear predictor coefficient vector quantization on the adjusted pitch signal using the mixed excitation linear prediction (MELP, Mixed Excitation Linear Prediction) algorithm, including: applying two-stage split vector quantization to the LSF (Line Spectrum Frequency) parameters, first obtaining the first-stage vector-quantized LSF parameters and then obtaining the second-stage vector-quantized LSF parameters from the first-stage result.
Specifically, in the present embodiment, a 5-bit first-stage vector quantization is applied to the LSF parameters, yielding the 10-dimensional LSF parameters;
the 10-dimensional LSF parameters are split into the first 5 dimensions and the last 5 dimensions: the first 5 dimensions undergo second-stage vector quantization with a 7-bit codebook, and the last 5 dimensions with a 5-bit codebook.
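The two-stage split scheme can be sketched as follows. The codebooks are passed in as plain lists (in the scheme above the first-stage book would hold 2^5 = 32 ten-dimensional vectors and the second-stage books 2^7 = 128 and 2^5 = 32 five-dimensional vectors), and quantizing the stage-1 residual is an assumption about how the two stages combine:

```python
def quantize_lsf_two_stage(lsf, stage1_cb, stage2_lo_cb, stage2_hi_cb):
    """Two-stage split VQ sketch for a 10-dimensional LSF vector.

    Stage 1 picks the nearest full-dimension codevector; stage 2 splits
    the stage-1 residual into two 5-dim halves and quantizes each half
    with its own codebook.  Returns the reconstructed LSF vector.
    """
    def nearest(v, cb):
        return min(cb, key=lambda c: sum((x - y) ** 2 for x, y in zip(v, c)))
    s1 = nearest(lsf, stage1_cb)
    resid = [x - y for x, y in zip(lsf, s1)]
    lo = nearest(resid[:5], stage2_lo_cb)
    hi = nearest(resid[5:], stage2_hi_cb)
    return [a + b for a, b in zip(s1, lo + hi)]
```

The scheme spends 5 + 7 + 5 = 17 bits per vector, matching the 17-bit figure quoted below for the 2nd and 4th subframes.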
Specifically, in the present embodiment, the LSF parameters of the 2nd and 4th subframes of the adjusted pitch signal are vector-quantized with 17 bits;
the LSF parameters of the 1st and 3rd subframes of the adjusted pitch signal are calculated with the following formulas:
l̂1(j) = a1(j) · l̂0(j) + [1 - a1(j)] · l̂2(j)
l̂3(j) = a2(j) · l̂2(j) + [1 - a2(j)] · l̂4(j)
j = 1, 2, ..., 9
where l̂1(j), l̂3(j) are the interpolated LSF parameters of the 1st and 3rd subframes, l̂0(j) is the quantized LSF value of the last subframe of the previous frame, l̂2(j), l̂4(j) are the LSF quantized values of the 2nd and 4th subframes, and a1(j), a2(j) are the LSF interpolation coefficients, which are vector-quantized with a 4-bit codebook.
Vector-quantizing a1(j), a2(j) with the 4-bit codebook includes:
establishing the following objective function for the vector quantization:
E = Σ_j { w1(j) · [l1(j) - l̂1(j)]² + w3(j) · [l3(j) - l̂3(j)]² }
where w1(j), w3(j) are weight coefficients and l1(j), l3(j) are the unquantized LSF parameters of the 1st and 3rd subframes.
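A sketch of this interpolation and its weighted-error objective; the function names are illustrative, and the weights w1, w3 would come from whatever LSF weighting the coder already uses:

```python
def interpolate_lsf(prev_q, next_q, a):
    """l_hat(j) = a(j) * prev_q(j) + (1 - a(j)) * next_q(j)."""
    return [ai * p + (1 - ai) * n for ai, p, n in zip(a, prev_q, next_q)]

def interp_error(l_true, l_interp, w):
    """Weighted squared error, one term of the VQ objective E above."""
    return sum(wi * (t - i) ** 2 for wi, t, i in zip(w, l_true, l_interp))
```

The codebook search would evaluate interp_error for each of the 2^4 = 16 candidate coefficient vectors a1, a2 and keep the pair minimizing E.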
Step 12: performing digital speech vector quantization using the second-stage vector-quantized LSF parameters.
Specifically, the method also includes comparing the LSF parameters after the second-stage vector quantization with the original LSF quantized values using a spectral distortion index N:
N = sqrt( (1/L) · Σ_{l=1}^{L} [20 · lg(A_ml / A_mrl)]² )
where L is the number of pitch harmonics in the subframe, A_ml is the original spectral magnitude, and A_mrl is the spectral magnitude reconstructed from the LSF parameters after the second-stage vector quantization.
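Under the common log-spectral reading of such an index (an assumption; the patent text does not reproduce its exact formula), the comparison can be computed as:

```python
import math

def spectral_distortion(orig_mag, recon_mag):
    """RMS log-spectral distortion in dB over pitch-harmonic magnitudes."""
    terms = [(20.0 * math.log10(a / b)) ** 2
             for a, b in zip(orig_mag, recon_mag)]
    return math.sqrt(sum(terms) / len(terms))
```

Identical magnitude sets give 0 dB, and a uniform factor-of-ten magnitude error gives 20 dB, so the index grows with the damage the quantizer does to the harmonic envelope.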
Embodiment two
This embodiment provides a low-bit-rate digital speech vector quantization system based on MELP, whose specific implementation structure, shown in Fig. 2, may include the following modules:
a coefficient acquisition module 21, configured to perform linear predictor coefficient vector quantization on the adjusted pitch signal using the mixed excitation linear prediction (MELP) algorithm, including: applying two-stage split vector quantization to the LSF parameters, first obtaining the first-stage vector-quantized LSF parameters and then obtaining the second-stage vector-quantized LSF parameters from the first-stage result; and
a quantization module 23, configured to perform digital speech vector quantization using the second-stage vector-quantized LSF parameters.
The detailed process of performing digital speech vector quantization with the system of this embodiment is similar to the preceding method embodiment and is not repeated here.
In summary, the embodiment of the present invention obtains the adjusted pitch signal and performs linear predictor coefficient vector quantization on it using the mixed excitation linear prediction (MELP) algorithm, including: converting the LPC parameters into line-spectrum-pair vector (LSF) parameters, where the LSF parameters undergo two-stage split vector quantization: a 5-bit first-stage vector quantization of the LSF parameters yields the 10-dimensional LSF parameters; the 10-dimensional LSF parameters are split into the first 5 and the last 5 dimensions, the first 5 dimensions undergoing second-stage vector quantization with a 7-bit codebook and the last 5 dimensions with a 5-bit codebook. This MELP-based low-bit-rate digital speech algorithm design starts from existing design methods and their defects and proposes a new MELP-based low-bit-rate digital speech construction method. On the basis of the MELP algorithm, the quantization of the algorithm is analyzed, with emphasis on the quantization of the pitch period and of the linear predictor coefficients, and an improved method is proposed for the quantization of the linear predictor coefficients: a two-stage LSF vector quantization scheme that lowers the bit rate, reduces codebook storage and computational complexity, and is more advantageous than the original scheme.
Those of ordinary skill in the art will appreciate that the accompanying drawings are schematic diagrams of one embodiment; the modules or flows in the drawings are not necessarily required to implement the present invention.
From the above description of the embodiments, those skilled in the art can clearly understand that the present invention can be implemented by software plus the necessary general hardware platform. Based on this understanding, the technical scheme of the present invention, or the part of it contributing to the prior art, can be embodied in the form of a software product. The computer software product can be stored in a storage medium, such as ROM/RAM, magnetic disk, or optical disk, and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments, or in certain parts of the embodiments, of the present invention.
The embodiments in this specification are described progressively; for identical or similar parts the embodiments can be referred to one another, and each embodiment emphasizes its differences from the others. In particular, the device and system embodiments are described relatively simply because they are substantially similar to the method embodiments; for relevant details, refer to the description of the method embodiments. The device and system embodiments described above are merely schematic: units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units, i.e., they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the embodiment. Those of ordinary skill in the art can understand and implement this without creative effort.
The above is only a preferred specific embodiment of the present invention, but the protection scope of the present invention is not limited to it; any change or replacement readily conceivable by a person skilled in the art within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be defined by the scope of the claims.
Claims (10)
1. A low-bit-rate digital speech vector quantization method based on MELP, characterized by including:
performing linear predictor coefficient vector quantization on the adjusted pitch signal using the mixed excitation linear prediction (MELP) algorithm, including: applying two-stage split vector quantization to the LSF parameters, first obtaining the first-stage vector-quantized LSF parameters and then obtaining the second-stage vector-quantized LSF parameters from the first-stage result;
performing digital speech vector quantization using the second-stage vector-quantized LSF parameters.
2. The MELP-based low-bit digital speech vector quantization method according to claim 1, characterized in that first obtaining the first-stage vector-quantized LSF parameters and then obtaining the second-stage vector-quantized LSF parameters based on the first-stage vector-quantized LSF parameters comprises:
performing first-stage vector quantization on the LSF parameters using 5 bits to obtain 10-dimensional LSF parameters; splitting the 10-dimensional LSF parameters into the first 5 dimensions and the last 5 dimensions; and performing second-stage vector quantization on the first 5 dimensions using a 7-bit codebook and on the last 5 dimensions using a 5-bit codebook, thereby obtaining the second-stage vector-quantized LSF parameters.
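The two-stage split vector quantization recited in this claim can be sketched as follows. This is an illustrative sketch only: the codebook contents, function names, and the use of a plain squared-Euclidean distance are assumptions, not taken from the patent.

```python
def split_vq_two_stage(lsf, stage1_cb, low_cb, high_cb):
    """Illustrative two-stage split VQ of a 10-dim LSF vector.

    stage1_cb: 32-entry (5-bit) codebook of 10-dim vectors.
    low_cb:   128-entry (7-bit) codebook for the first 5 residual dims.
    high_cb:   32-entry (5-bit) codebook for the last 5 residual dims.
    """
    def nearest(cb, vec):
        # Index and entry of the codebook vector with minimum
        # squared Euclidean distance to vec.
        best = min(range(len(cb)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(cb[i], vec)))
        return best, cb[best]

    # Stage 1: 5-bit quantization of the whole 10-dim vector.
    i1, q1 = nearest(stage1_cb, lsf)
    residual = [a - b for a, b in zip(lsf, q1)]

    # Stage 2: split the residual into two 5-dim halves and
    # quantize each half with its own codebook.
    i_low, q_low = nearest(low_cb, residual[:5])
    i_high, q_high = nearest(high_cb, residual[5:])

    stage2 = q_low + q_high  # concatenate the two 5-dim halves
    quantized = [q1[j] + stage2[j] for j in range(10)]
    return (i1, i_low, i_high), quantized
```

With full-size codebooks this transmits 5 + 7 + 5 = 17 bits per LSF vector, matching the bit allocation given in claim 3.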
3. The MELP-based low-bit digital speech vector quantization method according to claim 1, characterized in that performing linear prediction coefficient vector quantization on the adjusted pitch signal using the mixed excitation linear prediction (MELP) algorithm comprises:
vector-quantizing the LSF parameters of the 2nd and 4th subframes of the adjusted pitch signal using 17 bits; and
computing the LSF parameters of the 1st and 3rd subframes of the adjusted pitch signal using the following formulas:
l1(j) = a1(j)l0(j) + [1 - a1(j)]l2(j)
l3(j) = a2(j)l2(j) + [1 - a2(j)]l4(j)
j = 1, 2, ..., 9
wherein l1(j) and l3(j) are the interpolated LSF parameters of the 1st and 3rd subframes, l0(j) is the quantized LSF parameter of the last subframe of the preceding joint frame, l2(j) and l4(j) are the quantized LSF parameters of the 2nd and 4th subframes, and a1(j), a2(j) are LSF interpolation coefficients that are vector-quantized using a 4-bit codebook.
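The interpolation formulas in this claim can be written out directly. A minimal sketch, assuming the subframe LSF vectors are passed as plain Python lists; only the two formulas stated in the claim are implemented.

```python
def interpolate_lsf(l0, l2, l4, a1, a2):
    """Interpolate the 1st- and 3rd-subframe LSF parameters from the
    quantized endpoints, per the claimed formulas:
        l1(j) = a1(j)*l0(j) + (1 - a1(j))*l2(j)
        l3(j) = a2(j)*l2(j) + (1 - a2(j))*l4(j)
    l0: last subframe of the preceding joint frame (quantized),
    l2, l4: quantized 2nd- and 4th-subframe LSF parameters,
    a1, a2: interpolation coefficient vectors."""
    l1 = [a1[j] * l0[j] + (1.0 - a1[j]) * l2[j] for j in range(len(l0))]
    l3 = [a2[j] * l2[j] + (1.0 - a2[j]) * l4[j] for j in range(len(l0))]
    return l1, l3
```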
4. The MELP-based low-bit digital speech vector quantization method according to claim 3, characterized in that vector-quantizing a1(j), a2(j) using the 4-bit codebook comprises:
establishing the following vector quantization objective function:
E = Σj { w1(j)[l1(j) - l̂1(j)]² + w3(j)[l3(j) - l̂3(j)]² }
wherein w1(j), w3(j) are weighting coefficients, l1(j), l3(j) are the unquantized LSF parameters of the 1st and 3rd subframes, and l̂1(j), l̂3(j) are the corresponding interpolated values.
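The codebook search implied by this objective function can be sketched as an exhaustive scan of the 4-bit (a1, a2) codebook, picking the entry with minimum weighted squared interpolation error. All names and the codebook layout are illustrative assumptions; the exact weighted-error form of the objective is reconstructed, not quoted from the patent.

```python
def quantize_interp_coeffs(l0, l2, l4, l1_true, l3_true, w1, w3, codebook):
    """Return the index of the (a1, a2) codebook entry minimizing
    E = sum_j w1(j)*(l1 - l1_hat)^2 + w3(j)*(l3 - l3_hat)^2,
    where l1_hat, l3_hat follow the interpolation formulas of claim 3.
    codebook: list of (a1_vector, a2_vector) pairs (up to 16 entries)."""
    def err(a1, a2):
        e = 0.0
        for j in range(len(l0)):
            l1_hat = a1[j] * l0[j] + (1.0 - a1[j]) * l2[j]
            l3_hat = a2[j] * l2[j] + (1.0 - a2[j]) * l4[j]
            e += w1[j] * (l1_true[j] - l1_hat) ** 2
            e += w3[j] * (l3_true[j] - l3_hat) ** 2
        return e
    return min(range(len(codebook)), key=lambda i: err(*codebook[i]))
```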
5. The MELP-based low-bit digital speech vector quantization method according to claim 4, characterized in that:
the second-stage vector-quantized LSF parameters are compared with the original LSF values using a spectral distortion index:
SD = sqrt( (1/L) Σl=1..L [20 lg(Aml / Amrl)]² )
wherein L is the number of pitch harmonics in a subframe, Aml is the original spectral amplitude, and Amrl is the spectral amplitude reconstructed from the second-stage vector-quantized LSF parameters.
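The spectral distortion index can be computed as below. Note the root-mean-square log-ratio form used here is a common definition of spectral distortion over pitch harmonics, assumed because the formula image is absent from this copy of the claim.

```python
import math

def spectral_distortion_db(A_orig, A_rec):
    """Per-subframe spectral distortion in dB over L pitch harmonics:
        SD = sqrt( (1/L) * sum_l (20*log10(A_ml / A_mrl))^2 )
    A_orig: original harmonic spectral amplitudes A_ml,
    A_rec:  amplitudes A_mrl reconstructed from the second-stage
            vector-quantized LSF parameters."""
    L = len(A_orig)
    s = sum((20.0 * math.log10(a / r)) ** 2 for a, r in zip(A_orig, A_rec))
    return math.sqrt(s / L)
```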
6. The MELP-based low-bit digital speech vector quantization method according to claim 1, characterized in that the adjusted pitch signal is obtained by:
passing the sampled digital speech signal through a high-pass filter to obtain a filtered signal;
performing a voicing decision on the filtered signal using multi-band mixed excitation, and computing the gain of the filtered signal, to obtain the pitch signal; and
performing linear predictive coding on the pitch signal to obtain a residual signal, computing the pitch period of the pitch signal, and adjusting the pitch signal according to the pitch period and the residual signal, to obtain the adjusted pitch signal.
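The front-end high-pass filtering step of this claim can be sketched as a first-order high-pass filter. The claim does not specify the filter design, so the first-order structure and the coefficient value are purely illustrative assumptions.

```python
def highpass(x, alpha=0.98):
    """First-order high-pass filter sketch: y[n] = alpha*(y[n-1] + x[n] - x[n-1]).
    Removes DC and low-frequency drift from the sampled speech signal;
    alpha is an assumed coefficient, not specified in the claim."""
    y, prev_x, prev_y = [], 0.0, 0.0
    for s in x:
        prev_y = alpha * (prev_y + s - prev_x)
        prev_x = s
        y.append(prev_y)
    return y
```

Feeding a constant (DC) signal through this filter produces an output that decays toward zero, which is the desired high-pass behavior.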
7. The MELP-based low-bit digital speech vector quantization method according to claim 6, characterized in that performing the voicing decision on the filtered signal using multi-band mixed excitation comprises:
dividing the filtered signal into several subbands and performing a voicing decision on each, marking the voicing strength of each subband as voiced or unvoiced;
representing the voicing strength of each subband by the parameter Vbpi (i = 1, 2, ..., n);
setting a first intensity threshold according to the autocorrelation function value r(P2) corresponding to the fractional pitch period P2;
when the voicing strength parameter Vbp1 of the first subband is not greater than the first intensity threshold, the current frame is an unvoiced frame, and the band-pass voicing strengths Vbpi (i = 1, 2, 3, 4, 5) all use unvoiced-frame quantization coding; and
when the voicing strength parameter Vbp1 of the first subband is greater than the first intensity threshold, the current frame is a voiced frame, and the band-pass voicing strengths Vbpi (i = 1, 2, 3, 4, 5) all use voiced-frame quantization coding.
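The frame classification in this claim can be sketched as follows. The claim only states that the first intensity threshold is derived from the autocorrelation value r(P2); the scale factor and the per-band 0/1 coding used below are assumptions for illustration.

```python
def classify_frame(vbp, r_p2, scale=0.6):
    """Sketch of the claimed multi-band voicing decision.
    vbp:  band-pass voicing strengths Vbp1..Vbp5,
    r_p2: autocorrelation value r(P2) at the fractional pitch lag,
    scale: assumed factor mapping r(P2) to the first intensity threshold."""
    threshold = scale * r_p2  # assumed threshold derivation
    if vbp[0] > threshold:
        # Voiced frame: each band strength is quantized individually.
        mode = "voiced"
        coded = [1 if v > threshold else 0 for v in vbp]
    else:
        # Unvoiced frame: all band strengths use the unvoiced coding.
        mode = "unvoiced"
        coded = [0] * len(vbp)
    return mode, coded
```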
8. The MELP-based low-bit digital speech vector quantization method according to claim 6, characterized in that computing the gain of the filtered signal comprises:
when the voicing strength of the first subband is greater than the first intensity threshold and the minimum factor product of the fractional pitch period P2 is not greater than a long-window threshold, setting the window length to the minimum factor product of the fractional pitch period P2;
when the voicing strength of the first subband is greater than the first intensity threshold and the minimum factor product of the fractional pitch period P2 is greater than the long-window threshold, setting the window length to half of the minimum factor product of the fractional pitch period; and
when the voicing strength of the first subband is less than or equal to the first intensity threshold, setting the window length equal to the minimum factor product of the fractional pitch period.
9. The MELP-based low-bit digital speech vector quantization method according to claim 6, characterized in that performing linear predictive coding on the pitch signal to obtain the residual signal comprises:
passing the sampled digital speech signal through a linear prediction error filter whose transfer function is:
A(z) = 1 - Σi=1..p ai z^(-i)
wherein ai are the linear prediction coefficients; the residual signal is:
e(n) = s(n) - Σi=1..p ai s(n - i)
wherein n ranges over the residual analysis window; the linear prediction error filter is an FIR filter whose output is the residual signal.
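The FIR prediction-error filtering in this claim can be sketched as below; samples before the start of the window are treated as zero, which is one common convention and an assumption here.

```python
def lpc_residual(signal, a):
    """Apply the prediction-error filter A(z) = 1 - sum_i a_i z^-i
    to the sampled signal; the output e(n) = s(n) - sum_i a_i s(n-i)
    is the residual signal. Samples before n = 0 are taken as zero."""
    p = len(a)
    res = []
    for n in range(len(signal)):
        # Linear prediction from the p previous samples.
        pred = sum(a[i] * signal[n - 1 - i]
                   for i in range(p) if n - 1 - i >= 0)
        res.append(signal[n] - pred)
    return res
```

For a constant signal and a single coefficient a1 = 1, the predictor is exact after the first sample, so the residual collapses to zero there.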
10. A low-bit digital speech vector quantization system based on MELP, characterized by comprising:
a coefficient acquisition module, configured to perform linear prediction coefficient vector quantization on an adjusted pitch signal using a mixed excitation linear prediction (MELP) algorithm, comprising: applying two-stage split vector quantization to LSF parameters, first obtaining first-stage vector-quantized LSF parameters, and then obtaining second-stage vector-quantized LSF parameters based on the first-stage vector-quantized LSF parameters; and
a quantization module, configured to perform digital speech vector quantization using the second-stage vector-quantized LSF parameters.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201511005800.3A CN106935243A (en) | 2015-12-29 | 2015-12-29 | A kind of low bit digital speech vector quantization method and system based on MELP |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106935243A true CN106935243A (en) | 2017-07-07 |
Family
ID=59458182
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201511005800.3A Pending CN106935243A (en) | 2015-12-29 | 2015-12-29 | A kind of low bit digital speech vector quantization method and system based on MELP |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106935243A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040153317A1 (en) * | 2003-01-31 | 2004-08-05 | Chamberlain Mark W. | 600 Bps mixed excitation linear prediction transcoding |
CN101114450A (en) * | 2007-07-20 | 2008-01-30 | 华中科技大学 | Speech encoding selectivity encipher method |
CN101281750A (en) * | 2008-05-29 | 2008-10-08 | 上海交通大学 | Expanding encoding and decoding system based on vector quantization high-order code book of variable splitting table |
CN101937680A (en) * | 2010-08-27 | 2011-01-05 | 太原理工大学 | Vector quantization method for sorting and rearranging code book and vector quantizer thereof |
CN103050122A (en) * | 2012-12-18 | 2013-04-17 | 北京航空航天大学 | MELP-based (Mixed Excitation Linear Prediction-based) multi-frame joint quantization low-rate speech coding and decoding method |
Non-Patent Citations (2)
Title |
---|
WANG Guowen et al.: "Proceedings of the 16th National Youth Communication Academic Conference (Vol. 1)", 31 December 2011 *
WANG Guowen: "Research on Improved Speech Compression Algorithms for Speech Cipher Machines", China Master's Theses Full-text Database, Information Science and Technology *
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109256143A (en) * | 2018-09-21 | 2019-01-22 | 西安蜂语信息科技有限公司 | Speech parameter quantization method, device, computer equipment and storage medium |
CN111818519A (en) * | 2020-07-16 | 2020-10-23 | 郑州信大捷安信息技术股份有限公司 | End-to-end voice encryption and decryption method and system |
CN111818519B (en) * | 2020-07-16 | 2022-02-11 | 郑州信大捷安信息技术股份有限公司 | End-to-end voice encryption and decryption method and system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102169692B (en) | Signal processing method and device | |
CN107945811B (en) | Frequency band expansion-oriented generation type confrontation network training method and audio encoding and decoding method | |
JP3241959B2 (en) | Audio signal encoding method | |
EP1899962B1 (en) | Audio codec post-filter | |
JP3475446B2 (en) | Encoding method | |
DE60006271T2 (en) | CELP VOICE ENCODING WITH VARIABLE BITRATE BY MEANS OF PHONETIC CLASSIFICATION | |
CN103503061B (en) | In order to process the device and method of decoded audio signal in a spectrum domain | |
US5018200A (en) | Communication system capable of improving a speech quality by classifying speech signals | |
US6182030B1 (en) | Enhanced coding to improve coded communication signals | |
US20060064301A1 (en) | Parametric speech codec for representing synthetic speech in the presence of background noise | |
EP1995723A1 (en) | Neuroevolution training system | |
JP2002516420A (en) | Voice coder | |
JP4040126B2 (en) | Speech decoding method and apparatus | |
CA2697604A1 (en) | Method and device for efficient quantization of transform information in an embedded speech and audio codec | |
CN105960675A (en) | Improved frequency band extension in an audio signal decoder | |
US20040083097A1 (en) | Optimized windows and interpolation factors, and methods for optimizing windows, interpolation factors and linear prediction analysis in the ITU-T G.729 speech coding standard | |
CN106104682A (en) | Weighting function for quantifying linear forecast coding coefficient determines apparatus and method | |
CN106935243A (en) | A kind of low bit digital speech vector quantization method and system based on MELP | |
Qian et al. | Wideband speech recovery from narrowband speech using classified codebook mapping | |
Srivastava | Fundamentals of linear prediction | |
Wong | On understanding the quality problems of LPC speech | |
JP3321933B2 (en) | Pitch detection method | |
CN112233686B (en) | Voice data processing method of NVOCPLUS high-speed broadband vocoder | |
KR100557113B1 (en) | Device and method for deciding of voice signal using a plural bands in voioce codec | |
JPH0756599A (en) | Wide band voice signal reconstruction method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20170707 |