CN106935243A - Low-bit-rate digital speech vector quantization method and system based on MELP - Google Patents

Info

Publication number: CN106935243A
Application number: CN201511005800.3A
Authority: CN (China)
Other languages: Chinese (zh)
Legal status: Pending (the status shown by Google Patents is an assumption, not a legal conclusion)
Inventors: 王国文, 罗世新, 何丽, 张盼
Current assignee: Aisino Corp (the listed assignee may be inaccurate)
Original assignee: Aisino Corp
Prior art keywords: vector quantization, MELP, LSF, LSF parameters, digital speech

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/032: Quantisation or dequantisation of spectral components
    • G10L19/038: Vector quantisation, e.g. TwinVQ audio
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/08: Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
    • G10L19/087: Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters using mixed excitation models, e.g. MELP, MBE, split band LPC or HVXC

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

Embodiments of the invention provide a low-bit-rate digital speech vector quantization method and system based on MELP. The invention uses the mixed excitation linear prediction (MELP) algorithm to vector-quantize the linear prediction coefficients of the adjusted pitch signal, which includes: applying two-stage split vector quantization to the LSF parameters, first obtaining the first-stage vector-quantized LSF parameters and then obtaining the second-stage vector-quantized LSF parameters from them; and performing digital speech vector quantization using the second-stage vector-quantized LSF parameters. Building on the MELP algorithm, the two-stage LSF vector quantization scheme lowers the bit rate and reduces the storage and computational complexity of the codebook.

Description

Low-bit-rate digital speech vector quantization method and system based on MELP
Technical field
The present invention relates to the field of signal processing, and more particularly to a low-bit-rate digital speech vector quantization method based on MELP.
Background
Research on low-bit-rate digital speech compression algorithms is increasingly mature. Among low-bit-rate digital speech algorithms, the mixed excitation linear prediction (MELP, Mixed Excitation Linear Prediction) algorithm has distinctive advantages: 2.4 kbps MELP builds on LPC (linear predictive coding) and combines the strengths of coding methods such as mixed excitation, multiband excitation, and prototype waveform interpolation, synthesizing speech with a new speech production model that better matches human articulation. The MELP algorithm is characterized by multiband mixed excitation, aperiodic pulses, residual harmonic processing, adaptive spectral enhancement, and pulse shaping filtering.
To address this, the prior art commonly proposes recognition-synthesis vocoders, which encode the speech signal using speech recognition and synthesis techniques with the speech primitive as the coding unit, bringing the coding rate below 1 kb/s. In addition, on the basis of 2.4 kb/s linear predictive coding (LPC, Linear Predictive Coding), vector quantization and the inter-frame correlation of speech have been used to compress speech data further. Vector quantization treats a group of scalar values as one vector and quantizes it as a whole in vector space, which compresses the data while losing little information. The efficiency of the vector quantizer determines the efficiency of the encoder. In parameter quantization for low-rate coding, quantizing the LSP (Line Spectrum Pair) parameters takes a large share of the bits, so improving the LSP quantization method can reduce the coding rate significantly. Adjacent frames of a speech signal are highly correlated, especially in stationary segments, so coding and transmitting the speech parameters only every other frame would lower the coding rate substantially. Some researchers have therefore proposed exploiting inter-frame correlation to reduce the number of bits spent on parameter quantization: several consecutive frames are coded together as one superframe, and the parameters of the superframe are vector-quantized as a whole to remove inter-frame redundancy.
Other researchers have proposed a variable-length segment quantization method that treats the input speech as a sequence of variable-length segments, each consisting of one or several frames and represented by per-frame parameters such as gain, pitch, and spectrum. Although more complex to implement, this approach can substantially lower the coding rate, shorten the coding delay, and still yield synthesized speech of good quality.
Summary of the invention
Embodiments of the invention provide a low-bit-rate digital speech vector quantization method and system based on MELP. The invention provides the following scheme:
using the mixed excitation linear prediction (MELP) algorithm to vector-quantize the linear prediction coefficients of the adjusted pitch signal, including: applying two-stage split vector quantization to the LSF parameters, first obtaining the first-stage vector-quantized LSF parameters, and obtaining the second-stage vector-quantized LSF parameters based on the first-stage vector-quantized LSF parameters;
performing digital speech vector quantization using the second-stage vector-quantized LSF parameters.
According to another aspect of the invention, a low-bit-rate digital speech vector quantization system based on MELP is also provided, including:
a coefficient acquisition module, configured to use the mixed excitation linear prediction (MELP) algorithm to vector-quantize the linear prediction coefficients of the adjusted pitch signal, including: applying two-stage split vector quantization to the LSF parameters, first obtaining the first-stage vector-quantized LSF parameters, and obtaining the second-stage vector-quantized LSF parameters based on the first-stage vector-quantized LSF parameters;
a quantization module, configured to perform digital speech vector quantization using the second-stage vector-quantized LSF parameters.
As the technical scheme above shows, embodiments of the invention provide a low-bit-rate digital speech vector quantization method and system based on MELP. The invention uses the mixed excitation linear prediction (MELP) algorithm to vector-quantize the linear prediction coefficients of the adjusted pitch signal, including: applying two-stage split vector quantization to the LSF parameters, first obtaining the first-stage vector-quantized LSF parameters, and obtaining the second-stage vector-quantized LSF parameters based on the first-stage vector-quantized LSF parameters; and performing digital speech vector quantization using the second-stage vector-quantized LSF parameters. The MELP-based low-bit-rate digital speech algorithm design of the invention starts from existing design methods and, addressing their shortcomings, proposes a new MELP-based low-bit-rate digital speech construction method. Building on the MELP algorithm, it analyzes the quantization steps of the algorithm, with emphasis on the quantization of the pitch period and of the linear prediction coefficients, and proposes an improved method for quantizing the linear prediction coefficients: a two-stage LSF vector quantization scheme that lowers the bit rate and reduces the storage and computational complexity of the codebook, giving it an advantage over the original scheme.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the invention more clearly, the drawings needed in describing the embodiments are briefly introduced below. The drawings described below are only some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a processing flowchart of the low-bit-rate digital speech vector quantization method based on MELP provided by Embodiment One of the invention;
Fig. 2 is a module diagram of the low-bit-rate digital speech vector quantization system based on MELP provided by Embodiment Two of the invention.
Detailed description
To facilitate understanding of the embodiments of the invention, several specific embodiments are further explained below with reference to the drawings; none of these embodiments limits the embodiments of the invention.
Embodiment one
In this embodiment of the invention, the pitch signal must first be obtained. Obtaining the pitch signal specifically includes:
passing the sampled digital speech signal through a high-pass filter to obtain a filtered signal;
making voiced/unvoiced decisions on the filtered signal using multiband mixed excitation, and calculating the gain of the filtered signal, to obtain the pitch signal.
Specifically, the filtered signal is divided into several subbands, a voiced/unvoiced decision is made for each subband, and the voicing strength of each subband is marked as voiced or unvoiced.
The voicing strength of the subbands is represented by the parameters Vbpi (i = 1, 2, ..., n), where Vbpi is the voicing strength of the i-th subband; a value of 1 denotes voiced and a value of 0 denotes unvoiced.
In this embodiment, each 22.5 ms of speech forms one analysis frame, corresponding to 180 samples at an 8 kHz sampling rate (8000 samples/s); after processing, 54 bits are output and transmitted per frame, so the rate is 2.4 kbps.
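The frame figures quoted above can be checked with a few lines of arithmetic. This is only an illustrative sketch; the variable names are not from the patent.

```python
# Frame parameters quoted in the text; the variable names are illustrative.
SAMPLE_RATE_HZ = 8000
FRAME_MS = 22.5
BITS_PER_FRAME = 54

# Samples covered by one analysis frame at the given sampling rate.
samples_per_frame = SAMPLE_RATE_HZ * FRAME_MS / 1000

# Bits per second when every frame carries BITS_PER_FRAME bits.
bit_rate_bps = BITS_PER_FRAME * 1000 / FRAME_MS

print(samples_per_frame, bit_rate_bps)
```

The two results correspond to the 180 samples per frame and the 2.4 kbps rate stated in the embodiment.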
Taking a division of the filtered signal into 5 subbands as an example, the subband parameters are Vbpi (i = 1, 2, ..., 5).
Preferably, the input signal is passed through five 6th-order Butterworth band-pass filters that split it into five subbands: 0 Hz to 500 Hz, 500 Hz to 1000 Hz, 1000 Hz to 2000 Hz, 2000 Hz to 3000 Hz, and 3000 Hz to 4000 Hz. The output of the 0 Hz to 500 Hz band-pass filter is used for a fractional pitch estimate, which yields the fractional pitch period P2 and the corresponding autocorrelation value r(P2); the value of r(P2) determines the voiced/unvoiced decision for the lowest band and for the frame as a whole. A first strength threshold is set for the autocorrelation value r(P2) of the fractional pitch period P2; in this embodiment its value is 0.6.
When the voicing strength parameter Vbp1 of the first subband is not greater than the first strength threshold, the current frame is an unvoiced frame, and the band-pass voicing strengths Vbpi (i = 1, 2, 3, 4, 5) are all quantized and coded as for an unvoiced frame;
when the voicing strength parameter Vbp1 of the first subband is greater than the first strength threshold, the current frame is a voiced frame, and the band-pass voicing strengths Vbpi (i = 1, 2, 3, 4, 5) are all quantized and coded as for a voiced frame.
For example, when Vbp1 ≤ 0.6, the current frame is unvoiced, and the band-pass voicing strengths Vbpi (i = 1, 2, 3, 4, 5) are all quantized and coded as 0;
when Vbp1 > 0.6, the current frame is voiced and Vbp1 is coded as 1, with the bands i = 2, 3, 4, 5 coded as for a voiced frame.
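The band-pass voicing rule can be sketched as follows. The function name is hypothetical, and the text does not spell out how bands 2 to 5 are coded in a voiced frame; thresholding each of them against the same 0.6 value is an assumption made here for illustration.

```python
def code_bandpass_voicing(vbp, threshold=0.6):
    """Return one voicing bit per subband (1 = voiced, 0 = unvoiced).

    vbp holds the voicing strengths Vbp1..Vbp5 of the five subbands.
    """
    if vbp[0] <= threshold:
        # Unvoiced frame: every band is coded as 0.
        return [0] * len(vbp)
    # Voiced frame: band 1 is coded as 1. Thresholding the remaining
    # bands individually is an assumption, not stated in the text.
    return [1] + [1 if v > threshold else 0 for v in vbp[1:]]


print(code_bandpass_voicing([0.5, 0.9, 0.9, 0.9, 0.9]))
print(code_bandpass_voicing([0.8, 0.7, 0.3, 0.65, 0.1]))
```

The first call illustrates that a weak lowest band forces the whole frame to be unvoiced, whatever the strengths of the higher bands.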
Before the gain of the filtered signal is calculated, the sampled digital speech signal must be windowed with an adjusted window length. Specifically,
the window length applied to the sampled digital speech signal is adjusted according to the voicing strength parameter of the first subband. When the voicing strength of the first subband is greater than the first strength threshold and the smallest multiple of the fractional pitch period P2 that exceeds the minimum window length is not greater than the window-length threshold, the window length is set to that multiple of P2;
when the voicing strength of the first subband is greater than the first strength threshold and that multiple of the fractional pitch period P2 is greater than the window-length threshold, the window length is set to half of that multiple;
when the voicing strength of the first subband is less than or equal to the first strength threshold, the window length is set to the minimum window length.
For example, with a first strength threshold of 0.6 as in this embodiment, when Vbp1 > 0.6 the window length is the smallest multiple of the fractional pitch period P2 that exceeds the minimum window length. With 22.5 ms analysis frames (180 samples at an 8 kHz sampling rate, 54 bits per frame, 2.4 kbps), the window length is then adjusted to be greater than 120 samples.
Taking a window-length threshold of 320 samples as an example, if the window length calculated above is greater than 320 samples, it is divided by 2.
When Vbp1 ≤ 0.6, the window length is set to 120 samples.
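A minimal sketch of the window-length rule, reading it as "the smallest multiple of P2 longer than 120 samples, halved if it exceeds 320 samples". That reading, the function name, and the parameter names are assumptions for illustration.

```python
import math


def gain_window_length(vbp1, p2, threshold=0.6, min_len=120, max_len=320):
    """Window length (in samples) used for the gain calculation."""
    if vbp1 <= threshold:
        # Unvoiced case: fixed window of min_len samples.
        return float(min_len)
    # Smallest integer k such that k * p2 is strictly greater than min_len.
    multiples = math.floor(min_len / p2) + 1
    length = multiples * p2
    if length > max_len:
        # Over the window-length threshold: halve the window.
        length /= 2
    return length


print(gain_window_length(0.8, 50.0))   # three pitch periods
print(gain_window_length(0.4, 50.0))   # unvoiced: fixed length
```

Tying the window to whole pitch periods keeps the measured gain from fluctuating with the position of the window relative to the pitch pulses.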
Next, linear predictive coding is applied to the pitch signal to obtain a residual signal, the pitch period of the pitch signal is calculated, and the pitch signal is adjusted according to the pitch period and the residual signal to obtain the adjusted pitch signal. Specifically:
Step A: perform LPC (Linear Predictive Coding) on the sampled digital speech signal of the pitch signal.
In this embodiment, the sampled digital speech signal is weighted with a 200-sample Hamming window spanning 25 ms of speech, and 10th-order linear prediction is then performed; the center of the window is the reference point of the current frame.
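Step A can be sketched as Hamming windowing followed by the autocorrelation method with the Levinson-Durbin recursion. The text does not say which LPC solver is used, so the recursion, the function names, and the sign convention (prediction as a weighted sum of past samples) are assumptions.

```python
import math


def hamming(n):
    """Hamming window of n samples (200 samples in the embodiment)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def autocorrelation(x, max_lag):
    """Autocorrelation values r[0..max_lag] of a windowed frame."""
    return [sum(x[i] * x[i + lag] for i in range(len(x) - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve for predictor coefficients a[1..order] and the final error.

    The predictor is s_hat(n) = sum_i a_i * s(n - i). No guard is
    included for a zero prediction error (e.g. an all-zero frame).
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                      # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:], err


def lpc_analysis(frame, order=10):
    """Window one frame and return (coefficients, prediction error)."""
    window = hamming(len(frame))
    windowed = [s * w for s, w in zip(frame, window)]
    return levinson_durbin(autocorrelation(windowed, order), order)
```

For a 200-sample frame, `lpc_analysis(frame)` returns ten predictor coefficients and the residual energy of the windowed frame.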
Step B: after linear predictive coding, obtain the residual signal. The residual signal contains no vocal-tract response information but retains the complete excitation information, which reduces the influence of vocal-tract characteristics and improves pitch period estimation.
To obtain the residual signal, the sampled digital speech signal is passed through a linear prediction error filter, whose transfer function is:
where a_i are the linear prediction coefficients, and the residual signal is:
where n is the window length for residual analysis. The linear prediction error filter is an FIR filter, and its output is the residual signal.
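A minimal sketch of the prediction error filter, assuming the conventional form A(z) = 1 - sum_{i=1..10} a_i z^{-i}, so that the residual is e(n) = s(n) - sum_i a_i s(n - i). The sign convention and the zero initial state are assumptions, since the text does not state them.

```python
def lpc_residual(signal, coeffs):
    """FIR prediction error filter: e(n) = s(n) - sum_i a_i * s(n - i).

    coeffs[0] is a_1, coeffs[1] is a_2, and so on. Samples before the
    start of the signal are treated as zero (an assumption).
    """
    residual = []
    for n in range(len(signal)):
        prediction = sum(
            coeffs[i] * signal[n - 1 - i]
            for i in range(len(coeffs))
            if n - 1 - i >= 0
        )
        residual.append(signal[n] - prediction)
    return residual


# A geometric sequence is predicted exactly by a first-order predictor,
# so only the first residual sample is non-zero.
print(lpc_residual([1.0, 0.5, 0.25, 0.125], [0.5]))
```

Because the filter has only feed-forward taps, it is always stable, which matches the statement that the prediction error filter is an FIR filter.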
Step C: calculate the pitch period of the pitch signal, and adjust the pitch signal according to the pitch period and the residual signal to obtain the adjusted pitch signal.
Step C1: to calculate the integer pitch period, the sampled digital speech signal is first passed through a 6th-order Butterworth low-pass filter with a cutoff frequency of 1 kHz, which removes the interference of high-frequency speech components with pitch period estimation during parameter analysis. The normalized autocorrelation function r(τ) is defined as:
where
The integer pitch period equals the lag τ at which the normalized autocorrelation function r(τ) reaches its maximum; finding max(r(τ)) from the formula above gives the integer pitch period P1.
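The integer pitch search can be sketched as below. It assumes the form commonly given in MELP descriptions, r(τ) = c(0, τ) / sqrt(c(0, 0) · c(τ, τ)), with c(m, n) summed over a 160-sample window centered on the frame reference point. The window size, the lag range, and all names are assumptions, not values taken from this text.

```python
import math


def cross_term(s, center, lag, m, n, half=80):
    """c(m, n): sum of s[k + m] * s[k + n] over a window of 2 * half samples."""
    start = center - lag // 2 - half
    return sum(s[k + m] * s[k + n] for k in range(start, start + 2 * half))


def normalized_autocorr(s, center, lag):
    """r(lag) = c(0, lag) / sqrt(c(0, 0) * c(lag, lag))."""
    c0t = cross_term(s, center, lag, 0, lag)
    c00 = cross_term(s, center, lag, 0, 0)
    ctt = cross_term(s, center, lag, lag, lag)
    denom = math.sqrt(c00 * ctt)
    return c0t / denom if denom > 0 else 0.0


def integer_pitch(s, center, min_lag=40, max_lag=160):
    """Lag in [min_lag, max_lag] that maximizes r(lag)."""
    return max(range(min_lag, max_lag + 1),
               key=lambda lag: normalized_autocorr(s, center, lag))


# Synthetic test tone with a period of 50 samples.
tone = [math.sin(2 * math.pi * n / 50) for n in range(400)]
print(integer_pitch(tone, center=200, min_lag=40, max_lag=80))
```

The caller must leave enough samples on both sides of `center` for the largest lag searched; with a maximum lag of 160, `center` needs at least 160 samples before it and 160 after it.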
Step C2: the output of the first subband band-pass filter (0 to 500 Hz) is Sb1(n), whose main role is the search for the fractional pitch period. Passing the signal through the first subband filter removes the harmonics above the fourth harmonic of the pitch, which eliminates the influence of higher harmonics on the pitch search; combined with the roughly estimated integer pitch period P1, this allows the pitch period to be estimated more accurately. Using the integer pitch periods roughly estimated for the current frame and the previous frame, a fine integer pitch search is carried out in the range (P1-5, P1+5) to obtain P2, and P2 is then used to calculate the fractional pitch period. Calculating the fractional pitch period greatly improves the accuracy of pitch period estimation. The true pitch period may lie in (P2-1, P2) or in (P2, P2+1), so the interval is usually chosen by using the formula for C_τ(m, n) to compare C_P2(0, P2-1) with C_P2(0, P2+1). Once the pitch period is known to lie in an interval [P, P+1], the fractional pitch period can be determined by interpolation.
In this embodiment, the fractional pitch period is extracted from the output of the first band (0 to 500 Hz) of the band-pass analysis, and the two candidates are the integer pitch periods of the current frame and the previous frame. Let △ denote the offset of the true pitch period from the integer pitch period, with 0 < △ < 1; △ is calculated as follows:
The normalized autocorrelation value of the fractional pitch period is:
Setting A = C_T(0, 0), B = C_T(0, T), C = C_T(0, T+1), D = C_T(T, T), E = C_T(T, T+1), and F = C_T(T+1, T+1), and substituting into the two formulas above yields the fractional pitch period.
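In terms of the six quantities A to F defined above, the fractional offset and its normalized autocorrelation can be sketched as follows. The two formulas used are the interpolation formulas commonly given in MELP descriptions, △ = (C·D - B·E) / (C·(D - E) + B·(F - E)) and r(T + △) = ((1 - △)·B + △·C) / sqrt(A·((1 - △)²·D + 2△(1 - △)·E + △²·F)); treating them as the formulas intended here is an assumption.

```python
import math


def fractional_offset(b, c, d, e, f):
    """Offset of the true pitch period from the integer period T.

    Assumed interpolation formula; no clamping of the result and no
    guard against a zero denominator are included.
    """
    return (c * d - b * e) / (c * (d - e) + b * (f - e))


def fractional_autocorr(a, b, c, d, e, f, delta):
    """Normalized autocorrelation at the fractional lag T + delta."""
    num = (1 - delta) * b + delta * c
    den = math.sqrt(a * ((1 - delta) ** 2 * d
                         + 2 * delta * (1 - delta) * e
                         + delta ** 2 * f))
    return num / den


# Symmetric example: equal correlation at T and T + 1 puts the peak midway.
delta = fractional_offset(b=0.9, c=0.9, d=1.0, e=0.95, f=1.0)
r_frac = fractional_autocorr(1.0, 0.9, 0.9, 1.0, 0.95, 1.0, delta)
print(delta, r_frac)
```

The fractional pitch period is then T + △, and `r_frac` plays the role of r(P2) in the voicing decision.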
Step C3: based on the pitch period candidates from steps C1 and C2, the final pitch period is calculated. P3 is the final pitch period estimate, and the corresponding normalized autocorrelation value is r(P3). When the autocorrelation value is large (r(P3) ≥ 0.6), the pitch period estimate is fairly accurate; a pitch-multiple check is finally carried out on the low-pass-filtered residual signal to obtain the final pitch period estimate. The pitch period affects the recognition rate in speech recognition and the accuracy of speech compression coding.
When the autocorrelation value is small (r(P3) < 0.6), the pitch signal in the LPC residual may have been corrupted by noise, or the frame may be non-stationary; the fractional pitch search is then carried out nearby using the sampled digital speech signal in place of the LPC residual signal, giving a new P3 and r(P3).
The processing flow of the low-bit-rate digital speech vector quantization method based on MELP provided by this embodiment is shown in Fig. 1 and includes the following steps:
Step 11: use the mixed excitation linear prediction (MELP, Mixed Excitation Linear Prediction) algorithm to vector-quantize the linear prediction coefficients of the adjusted pitch signal, including: applying two-stage split vector quantization to the LSF (Line Spectrum Frequency) parameters, first obtaining the first-stage vector-quantized LSF parameters, and obtaining the second-stage vector-quantized LSF parameters based on the first-stage vector-quantized LSF parameters.
Specifically, in this embodiment, first-stage vector quantization of the LSF parameters uses 5 bits and yields 10-dimensional LSF parameters;
the 10-dimensional LSF parameters are split into the first 5 dimensions and the last 5 dimensions; the first 5 dimensions undergo second-stage vector quantization with a 7-bit codebook, and the last 5 dimensions undergo second-stage vector quantization with a 5-bit codebook.
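The two-stage split structure (5 + 7 + 5 = 17 bits) can be sketched as below. The codebooks are random placeholders rather than trained LSF codebooks, and having the second stage quantize the residual left by the first stage is an assumption about how the second stage is "based on" the first.

```python
import random


def nearest(codebook, vec):
    """Index of the codebook entry with the smallest squared error."""
    return min(range(len(codebook)),
               key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vec)))


def make_codebook(rng, bits, dim, scale):
    """Random placeholder codebook with 2**bits entries of dimension dim."""
    return [[rng.uniform(-scale, scale) for _ in range(dim)]
            for _ in range(2 ** bits)]


rng = random.Random(0)
STAGE1 = make_codebook(rng, bits=5, dim=10, scale=1.0)      # 32 entries
STAGE2_LOW = make_codebook(rng, bits=7, dim=5, scale=0.1)   # 128 entries
STAGE2_HIGH = make_codebook(rng, bits=5, dim=5, scale=0.1)  # 32 entries


def encode(lsf):
    """Return the three codebook indices (17 bits in total)."""
    i1 = nearest(STAGE1, lsf)
    # Assumption: stage 2 quantizes what stage 1 left over, split 5 + 5.
    resid = [x - c for x, c in zip(lsf, STAGE1[i1])]
    return i1, nearest(STAGE2_LOW, resid[:5]), nearest(STAGE2_HIGH, resid[5:])


def decode(i1, i2_low, i2_high):
    """Rebuild the 10-dimensional vector from the three indices."""
    refinement = STAGE2_LOW[i2_low] + STAGE2_HIGH[i2_high]
    return [c + r for c, r in zip(STAGE1[i1], refinement)]
```

Splitting the second stage keeps the search and storage small: 32 + 128 + 32 codebook entries are searched instead of the 2^17 entries a single 17-bit codebook would need.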
Specifically, in this embodiment, the LSF parameters of the 2nd and 4th subframes of the adjusted pitch signal are vector-quantized using 17 bits;
the LSF parameters of the 1st and 3rd subframes of the adjusted pitch signal are calculated with the following formulas:
$$\hat{l}_1(j) = a_1(j)\,\hat{l}_0(j) + [1 - a_1(j)]\,\hat{l}_2(j)$$
$$\hat{l}_3(j) = a_2(j)\,\hat{l}_2(j) + [1 - a_2(j)]\,\hat{l}_4(j)$$
$$j = 1, 2, \ldots, 9$$
where $\hat{l}_1(j)$ and $\hat{l}_3(j)$ are the interpolated LSF parameters of the 1st and 3rd subframes, $\hat{l}_0(j)$ is the quantized LSF value of the last subframe of the previous superframe, $\hat{l}_2(j)$ and $\hat{l}_4(j)$ are the quantized LSF values of the 2nd and 4th subframes, and $a_1(j)$, $a_2(j)$ are the LSF interpolation coefficients, which are vector-quantized with a 4-bit codebook.
Vector-quantizing $a_1(j)$, $a_2(j)$ with the 4-bit codebook includes:
establishing the following objective function for the vector quantization:
$$E = \sum_{j=0}^{9} w_1(j)\,\bigl|l_1(j) - \hat{l}_1(j)\bigr|^2 + \sum_{j=0}^{9} w_3(j)\,\bigl|l_3(j) - \hat{l}_3(j)\bigr|^2$$
where $w_1(j)$, $w_3(j)$ are weighting coefficients and $l_1(j)$, $l_3(j)$ are the unquantized LSF parameters of the 1st and 3rd subframes.
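The interpolation and the search of the interpolation-coefficient codebook can be sketched together. The layout of a codebook entry as a pair of per-dimension coefficient lists (a1, a2), and all function names, are assumptions for illustration; a 4-bit codebook would hold 16 such entries.

```python
def interpolate(prev_q, next_q, coeff):
    """l_hat(j) = a(j) * prev_q(j) + (1 - a(j)) * next_q(j)."""
    return [a * p + (1 - a) * n for a, p, n in zip(coeff, prev_q, next_q)]


def weighted_error(target, estimate, weights):
    """Weighted squared error between unquantized and interpolated LSFs."""
    return sum(w * (t - e) ** 2 for w, t, e in zip(weights, target, estimate))


def search_interp_codebook(codebook, l0q, l2q, l4q, l1, l3, w1, w3):
    """Pick the (a1, a2) entry that minimizes the objective E."""
    best_index, best_err = None, float("inf")
    for index, (a1, a2) in enumerate(codebook):
        err = (weighted_error(l1, interpolate(l0q, l2q, a1), w1)
               + weighted_error(l3, interpolate(l2q, l4q, a2), w3))
        if err < best_err:
            best_index, best_err = index, err
    return best_index, best_err


# Toy 2-dimensional example with a 16-entry (4-bit) placeholder codebook.
codebook = [([k / 15] * 2, [k / 15] * 2) for k in range(16)]
l0q, l2q, l4q = [0.0, 0.0], [1.0, 1.0], [2.0, 2.0]
l1 = interpolate(l0q, l2q, [5 / 15] * 2)
l3 = interpolate(l2q, l4q, [5 / 15] * 2)
print(search_interp_codebook(codebook, l0q, l2q, l4q, l1, l3, [1.0, 1.0], [1.0, 1.0]))
```

Only the 4-bit index is transmitted for the 1st and 3rd subframes; the decoder repeats the interpolation from the quantized LSFs of the neighboring subframes.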
Step 12: perform digital speech vector quantization using the second-stage vector-quantized LSF parameters.
Specifically, this also includes comparing the second-stage vector-quantized LSF parameters with the original LSF quantized values using a spectral distortion index, denoted N:
$$N = \frac{1}{L}\sum_{l=1}^{L}\left[10 \lg \left|A_{ml} / A_{mrl}\right|^{2}\right]^{2}$$
where L is the number of pitch harmonics in the subframe, $A_{ml}$ is the original spectral amplitude, and $A_{mrl}$ is the spectral amplitude reconstructed from the second-stage vector-quantized LSF parameters.
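The spectral distortion index can be computed directly from the two sets of harmonic amplitudes. This is a minimal sketch; the function and argument names are illustrative.

```python
import math


def spectral_distortion(original, reconstructed):
    """N = (1/L) * sum over harmonics of [10 * lg|A_ml / A_mrl|^2]^2."""
    terms = [
        (10 * math.log10(abs(a / b) ** 2)) ** 2
        for a, b in zip(original, reconstructed)
    ]
    return sum(terms) / len(terms)


# Identical spectra give zero distortion; a 2:1 amplitude ratio on every
# harmonic gives the square of about 6.02 dB.
print(spectral_distortion([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
print(spectral_distortion([2.0, 2.0], [1.0, 1.0]))
```

All amplitudes must be non-zero, since the index takes the logarithm of their ratio.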
Embodiment two
This embodiment provides a low-bit-rate digital speech vector quantization system based on MELP. Its specific structure is shown in Fig. 2 and may include the following modules:
Coefficient acquisition module 21: configured to use the mixed excitation linear prediction (MELP) algorithm to vector-quantize the linear prediction coefficients of the adjusted pitch signal, including: applying two-stage split vector quantization to the LSF parameters, first obtaining the first-stage vector-quantized LSF parameters, and obtaining the second-stage vector-quantized LSF parameters based on the first-stage vector-quantized LSF parameters;
Quantization module 23: configured to perform digital speech vector quantization using the second-stage vector-quantized LSF parameters.
The detailed process by which the system of this embodiment performs digital speech vector quantization is similar to the foregoing method embodiment and is not repeated here.
In summary, the embodiments of the invention obtain the adjusted pitch signal and use the mixed excitation linear prediction (MELP) algorithm to vector-quantize the linear prediction coefficients of the adjusted pitch signal, including: converting the LPC parameters into line spectrum frequency (LSF) parameters, where the LSF parameters undergo two-stage split vector quantization, including: performing first-stage vector quantization of the LSF parameters with 5 bits to obtain 10-dimensional LSF parameters; splitting the 10-dimensional LSF parameters into the first 5 dimensions and the last 5 dimensions; and performing second-stage vector quantization of the first 5 dimensions with a 7-bit codebook and of the last 5 dimensions with a 5-bit codebook. The MELP-based low-bit-rate digital speech algorithm design of the invention starts from existing design methods and, addressing their shortcomings, proposes a new MELP-based low-bit-rate digital speech construction method. Building on the MELP algorithm, it analyzes the quantization steps of the algorithm, with emphasis on the quantization of the pitch period and of the linear prediction coefficients, and proposes an improved method for quantizing the linear prediction coefficients: a two-stage LSF vector quantization scheme that lowers the bit rate and reduces the storage and computational complexity of the codebook, giving it an advantage over the original scheme.
A person of ordinary skill in the art will appreciate that the drawings are schematic diagrams of one embodiment, and that the modules or flows in the drawings are not necessarily required to implement the invention.
As the description of the embodiments above shows, a person skilled in the art can clearly understand that the invention can be implemented by software together with the necessary general-purpose hardware platform. On this understanding, the part of the technical solution of the invention that in essence contributes over the prior art can be embodied as a software product. The computer software product can be stored in a storage medium, such as a ROM/RAM, magnetic disk, or optical disc, and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments of the invention or in parts of the embodiments.
The embodiments in this specification are described progressively: identical or similar parts of the embodiments can be understood by reference to one another, and each embodiment focuses on its differences from the others. In particular, because the device or system embodiments are substantially similar to the method embodiments, they are described briefly, and the relevant parts of the method embodiments can be consulted. The device and system embodiments described above are merely illustrative: units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units, that is, they may be located in one place or distributed over multiple network elements. Some or all of the modules can be selected according to actual needs to achieve the purpose of the embodiment. A person of ordinary skill in the art can understand and implement this without creative effort.
The above is only a preferred specific embodiment of the invention, and the scope of protection of the invention is not limited to it. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the invention shall fall within the scope of protection of the invention. The scope of protection of the invention shall therefore be determined by the scope of the claims.

Claims (10)

1. A low-bit-rate digital speech vector quantization method based on MELP, characterized by comprising:
using the mixed excitation linear prediction (MELP) algorithm to vector-quantize the linear prediction coefficients of the adjusted pitch signal, including: applying two-stage split vector quantization to the LSF parameters, first obtaining the first-stage vector-quantized LSF parameters, and obtaining the second-stage vector-quantized LSF parameters based on the first-stage vector-quantized LSF parameters;
performing digital speech vector quantization using the second-stage vector-quantized LSF parameters.
2. The low-bit-rate digital speech vector quantization method based on MELP according to claim 1, characterized in that first obtaining the first-stage vector-quantized LSF parameters and obtaining the second-stage vector-quantized LSF parameters based on the first-stage vector-quantized LSF parameters includes:
performing first-stage vector quantization of the LSF parameters with 5 bits to obtain 10-dimensional LSF parameters; splitting the 10-dimensional LSF parameters into the first 5 dimensions and the last 5 dimensions; and performing second-stage vector quantization of the first 5 dimensions with a 7-bit codebook and of the last 5 dimensions with a 5-bit codebook, to obtain the second-stage vector-quantized LSF parameters.
3. The low-bit digital speech vector quantization method based on MELP according to claim 1, characterized in that performing linear prediction coefficient vector quantization on the adjusted pitch signal using the mixed excitation linear prediction (MELP) algorithm includes:
Vector-quantizing the LSF parameters of the 2nd subframe and the 4th subframe of the adjusted pitch signal using 17 bits;
Calculating the LSF parameters of the 1st subframe and the 3rd subframe of the adjusted pitch signal using the following formulas:
$$\hat{l}_1(j) = a_1(j)\,\hat{l}_0(j) + [1 - a_1(j)]\,\hat{l}_2(j)$$
$$\hat{l}_3(j) = a_2(j)\,\hat{l}_2(j) + [1 - a_2(j)]\,\hat{l}_4(j)$$
$$j = 1, 2, \ldots, 9$$
Where $\hat{l}_1(j)$ and $\hat{l}_3(j)$ are the interpolated LSF parameters of the 1st subframe and the 3rd subframe, $\hat{l}_0(j)$ is the quantized value of the LSF parameters of the last subframe of the previous joint frame, $\hat{l}_2(j)$ and $\hat{l}_4(j)$ are the quantized LSF values of the 2nd subframe and the 4th subframe, and $a_1(j)$, $a_2(j)$ are the LSF interpolation coefficients, which are vector-quantized using a 4-bit codebook.
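The interpolation in claim 3 can be sketched as follows. The function and argument names are hypothetical; the code assumes all vectors have the same length and simply applies the two formulas element by element.

```python
def interpolate_lsf(l0_hat, l2_hat, l4_hat, a1, a2):
    """Interpolate the LSF vectors of subframes 1 and 3.

    l0_hat: quantized LSFs of the last subframe of the previous joint frame
    l2_hat, l4_hat: quantized LSFs of subframes 2 and 4
    a1, a2: per-dimension interpolation coefficients
    """
    # Subframe 1 lies between the previous frame's last subframe and subframe 2.
    l1_hat = [a * p + (1.0 - a) * q for a, p, q in zip(a1, l0_hat, l2_hat)]
    # Subframe 3 lies between subframes 2 and 4.
    l3_hat = [a * p + (1.0 - a) * q for a, p, q in zip(a2, l2_hat, l4_hat)]
    return l1_hat, l3_hat

# Two-dimensional toy vectors, only to show the call shape.
l1_hat, l3_hat = interpolate_lsf(
    l0_hat=[0.0, 1.0], l2_hat=[1.0, 2.0], l4_hat=[2.0, 4.0],
    a1=[0.5, 0.25], a2=[0.5, 0.0],
)
```

Because only the interpolation coefficients are transmitted for subframes 1 and 3, their LSF vectors cost 4 bits rather than a further 17 bits each.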
4. The low-bit digital speech vector quantization method based on MELP according to claim 3, characterized in that vector-quantizing $a_1(j)$, $a_2(j)$ using a 4-bit codebook includes:
Establishing the following vector quantization objective function:
$$E = \sum_{j=0}^{9} w_1(j)\,\left|l_1(j) - \hat{l}_1(j)\right|^2 + \sum_{j=0}^{9} w_3(j)\,\left|l_3(j) - \hat{l}_3(j)\right|$$
Where $w_1(j)$, $w_3(j)$ are weighting coefficients, and $l_1(j)$, $l_3(j)$ are the unquantized LSF parameters of the 1st subframe and the 3rd subframe.
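A codebook search that minimizes an objective of this kind might look like the sketch below. It treats both terms as weighted squared errors, which is an assumption: the formula as printed shows the exponent only on the first term. The codebook layout (each entry a pair of coefficient vectors) and all names are hypothetical, and a 4-bit codebook would hold 2**4 = 16 such entries.

```python
def weighted_sq_error(target, estimate, weights):
    """Weighted squared error between two equal-length vectors."""
    return sum(w * (t - e) ** 2 for w, t, e in zip(weights, target, estimate))

def search_interp_codebook(codebook, l0_hat, l2_hat, l4_hat, l1, l3, w1, w3):
    """Return (best_index, best_error) over all (a1, a2) entries in the codebook."""
    best_index, best_error = -1, float("inf")
    for index, (a1, a2) in enumerate(codebook):
        # Interpolate subframes 1 and 3 with this candidate entry.
        l1_hat = [a * p + (1.0 - a) * q for a, p, q in zip(a1, l0_hat, l2_hat)]
        l3_hat = [a * p + (1.0 - a) * q for a, p, q in zip(a2, l2_hat, l4_hat)]
        # Assumption: squared error on both subframes.
        error = (weighted_sq_error(l1, l1_hat, w1)
                 + weighted_sq_error(l3, l3_hat, w3))
        if error < best_error:
            best_index, best_error = index, error
    return best_index, best_error

# Toy 16-entry codebook: entry k uses the same coefficient k/15 in every dimension.
dims = 10
codebook = [([k / 15.0] * dims, [k / 15.0] * dims) for k in range(16)]
l0_hat, l2_hat, l4_hat = [0.0] * dims, [2.0] * dims, [4.0] * dims
l1, l3 = [0.5] * dims, [2.5] * dims   # unquantized subframes 1 and 3
w1 = w3 = [1.0] * dims
best_index, best_error = search_interp_codebook(
    codebook, l0_hat, l2_hat, l4_hat, l1, l3, w1, w3)
```

An exhaustive search is cheap here because only 16 candidates have to be evaluated per joint frame.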
5. The low-bit digital speech vector quantization method based on MELP according to claim 4, characterized in that
The second-stage vector-quantized LSF parameters are compared with the original LSF quantized values using a spectral distortion measure:
$$N = \frac{1}{L}\sum_{l=1}^{L}\left[10\lg\left|A_{ml}/A_{mrl}\right|^{2}\right]^{2}$$
Where $L$ is the number of pitch harmonics in the subframe, $A_{ml}$ is the original spectral amplitude, and $A_{mrl}$ is the spectral amplitude reconstructed from the second-stage vector-quantized LSF parameters.
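The spectral distortion measure can be computed directly from two lists of harmonic amplitudes. The sketch below follows the formula as written (mean of the squared log-ratio in dB, without a square root); the function name and the requirement that all amplitudes are positive are assumptions.

```python
import math

def spectral_distortion(a_orig, a_recon):
    """Mean squared log-spectral ratio (in dB squared) over the pitch harmonics."""
    if not a_orig or len(a_orig) != len(a_recon):
        raise ValueError("amplitude lists must be non-empty and of equal length")
    total = 0.0
    for orig, recon in zip(a_orig, a_recon):
        ratio_db = 10.0 * math.log10((orig / recon) ** 2)  # 10 lg |A/A_r|^2
        total += ratio_db ** 2
    return total / len(a_orig)

# Identical spectra give zero distortion.
no_distortion = spectral_distortion([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
```

A lower value indicates that the spectrum rebuilt from the quantized LSF parameters is closer to the original spectrum.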
6. The low-bit digital speech vector quantization method based on MELP according to claim 1, characterized in that obtaining the adjusted pitch signal includes:
Passing the sampled digital speech signal through a high-pass filter to obtain a filtered signal;
Performing a voiced/unvoiced decision on the filtered signal using multi-band mixed excitation, and calculating the gain of the filtered signal, to obtain the pitch signal;
Performing linear predictive coding on the pitch signal to obtain a residual signal, calculating the pitch period of the pitch signal, and adjusting the pitch signal according to the pitch period and the residual signal, to obtain the adjusted pitch signal.
7. The low-bit digital speech vector quantization method based on MELP according to claim 6, characterized in that performing a voiced/unvoiced decision on the filtered signal using multi-band mixed excitation includes:
Dividing the filtered signal into several subbands, performing a voiced/unvoiced decision on each subband, and marking each subband as voiced or unvoiced according to its voicing strength,
Representing the voicing strength of the subbands by the parameters Vbpi (i = 1, 2, ..., n), where Vbpi is the voicing strength of the i-th subband,
Setting a first strength threshold according to the autocorrelation value r(P2) corresponding to the fractional pitch period P2; when the voicing strength parameter Vbp1 of the first subband is not greater than the first strength threshold, the current frame is an unvoiced frame, and the remaining bandpass voicing strengths Vbpi (i = 1, 2, 3, 4, 5) are all quantized and coded as for an unvoiced frame;
When the voicing strength parameter Vbp1 of the first subband is greater than the first strength threshold, the current frame is a voiced frame, and the remaining bandpass voicing strengths Vbpi (i = 1, 2, 3, 4, 5) are all quantized and coded as for a voiced frame.
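The frame-level decision in claim 7 reduces to comparing the first subband's voicing strength with the threshold. The sketch below shows only that comparison; the function name and the threshold value 0.6 are placeholders, since the claim says only that the threshold is set from r(P2).

```python
def classify_frame(vbp, first_threshold):
    """Return 'voiced' or 'unvoiced' from the subband voicing strengths.

    vbp: voicing strengths Vbp1..Vbpn, with vbp[0] holding Vbp1
    first_threshold: the first strength threshold (placeholder value in the demo)
    """
    # The frame is unvoiced when Vbp1 is not greater than the threshold.
    return "voiced" if vbp[0] > first_threshold else "unvoiced"

FIRST_THRESHOLD = 0.6  # hypothetical value, not taken from the claims
frame_type = classify_frame([0.8, 0.7, 0.4, 0.2, 0.1], FIRST_THRESHOLD)
```

The returned label would then select which quantization coding is applied to the bandpass voicing strengths of that frame.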
8. The low-bit digital speech vector quantization method based on MELP according to claim 6, characterized in that calculating the gain of the filtered signal includes:
When the voicing strength of the first subband is greater than the first strength threshold and the smallest multiple of the fractional pitch period P2 is not greater than the window length threshold, setting the window length to the smallest multiple greater than the fractional pitch period P2;
When the voicing strength of the first subband is greater than the first strength threshold and the smallest multiple of the fractional pitch period P2 is greater than the window length threshold, setting the window length to half of the smallest multiple of the fractional pitch period;
When the voicing strength of the first subband is less than or equal to the first strength threshold, setting the window length equal to the smallest multiple of the fractional pitch period.
9. The low-bit digital speech vector quantization method based on MELP according to claim 6, characterized in that performing linear predictive coding on the pitch signal to obtain a residual signal includes: passing the sampled digital speech signal through a linear prediction error filter whose transfer function is:
$$H(z) = 1 - \sum_{i=1}^{10} a_i\, z^{-i}$$
Where $a_i$ are the linear prediction coefficients, and the residual signal is:
$$r_n = s(n) - \sum_{i=1}^{10} a_i\, s(n-i)$$
Where n is the residual analysis window length, and the linear prediction error filter is an FIR filter whose output is the residual signal.
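The prediction error filter in claim 9 is a direct-form FIR filter, which can be sketched in a few lines. The function name is hypothetical, and samples before the start of the input are assumed to be zero, which the claim does not specify.

```python
def lpc_residual(s, a):
    """Compute r(n) = s(n) - sum_{i=1..p} a[i-1] * s(n-i) for every sample of s.

    s: input speech samples
    a: linear prediction coefficients a_1..a_p (p = 10 in the claim)
    """
    order = len(a)
    residual = []
    for n in range(len(s)):
        # Assumption: samples before the start of s are treated as zero.
        prediction = sum(a[i - 1] * s[n - i]
                         for i in range(1, order + 1) if n - i >= 0)
        residual.append(s[n] - prediction)
    return residual

# A first-order predictor that exactly models a doubling sequence.
residual = lpc_residual([1.0, 2.0, 4.0, 8.0], [2.0])
```

When the predictor matches the signal well, the residual carries little energy after the first few samples.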
10. A low-bit digital speech vector quantization system based on MELP, characterized by including:
A coefficient acquisition module, configured to perform linear prediction coefficient vector quantization on the adjusted pitch signal using the mixed excitation linear prediction (MELP) algorithm, including: applying two-stage split vector quantization to the LSF parameters, first obtaining the first-stage vector-quantized LSF parameters, and then obtaining the second-stage vector-quantized LSF parameters based on the first-stage vector-quantized LSF parameters;
A quantization module, configured to perform digital speech vector quantization using the second-stage vector-quantized LSF parameters.
CN201511005800.3A 2015-12-29 2015-12-29 A kind of low bit digital speech vector quantization method and system based on MELP Pending CN106935243A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201511005800.3A CN106935243A (en) 2015-12-29 2015-12-29 A kind of low bit digital speech vector quantization method and system based on MELP

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201511005800.3A CN106935243A (en) 2015-12-29 2015-12-29 A kind of low bit digital speech vector quantization method and system based on MELP

Publications (1)

Publication Number Publication Date
CN106935243A true CN106935243A (en) 2017-07-07

Family

ID=59458182

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201511005800.3A Pending CN106935243A (en) 2015-12-29 2015-12-29 A kind of low bit digital speech vector quantization method and system based on MELP

Country Status (1)

Country Link
CN (1) CN106935243A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040153317A1 (en) * 2003-01-31 2004-08-05 Chamberlain Mark W. 600 Bps mixed excitation linear prediction transcoding
CN101114450A (en) * 2007-07-20 2008-01-30 华中科技大学 Speech encoding selectivity encipher method
CN101281750A (en) * 2008-05-29 2008-10-08 上海交通大学 Expanding encoding and decoding system based on vector quantization high-order code book of variable splitting table
CN101937680A (en) * 2010-08-27 2011-01-05 太原理工大学 Vector quantization method for sorting and rearranging code book and vector quantizer thereof
CN103050122A (en) * 2012-12-18 2013-04-17 北京航空航天大学 MELP-based (Mixed Excitation Linear Prediction-based) multi-frame joint quantization low-rate speech coding and decoding method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
王国文 等: "《第十六届全国青年通信学术会议论文集(上)》", 31 December 2011 *
王国文: "语音密码机中的语音压缩改进算法研究", 《中国优秀硕士学位论文全文数据库,信息科技辑》 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109256143A (en) * 2018-09-21 2019-01-22 西安蜂语信息科技有限公司 Speech parameter quantization method, device, computer equipment and storage medium
CN111818519A (en) * 2020-07-16 2020-10-23 郑州信大捷安信息技术股份有限公司 End-to-end voice encryption and decryption method and system
CN111818519B (en) * 2020-07-16 2022-02-11 郑州信大捷安信息技术股份有限公司 End-to-end voice encryption and decryption method and system

Similar Documents

Publication Publication Date Title
CN102169692B (en) Signal processing method and device
CN107945811B (en) Frequency band expansion-oriented generation type confrontation network training method and audio encoding and decoding method
JP3241959B2 (en) Audio signal encoding method
EP1899962B1 (en) Audio codec post-filter
JP3475446B2 (en) Encoding method
DE60006271T2 (en) CELP VOICE ENCODING WITH VARIABLE BITRATE BY MEANS OF PHONETIC CLASSIFICATION
CN103503061B (en) In order to process the device and method of decoded audio signal in a spectrum domain
US5018200A (en) Communication system capable of improving a speech quality by classifying speech signals
US6182030B1 (en) Enhanced coding to improve coded communication signals
US20060064301A1 (en) Parametric speech codec for representing synthetic speech in the presence of background noise
EP1995723A1 (en) Neuroevolution training system
JP2002516420A (en) Voice coder
JP4040126B2 (en) Speech decoding method and apparatus
CA2697604A1 (en) Method and device for efficient quantization of transform information in an embedded speech and audio codec
CN105960675A (en) Improved frequency band extension in an audio signal decoder
US20040083097A1 (en) Optimized windows and interpolation factors, and methods for optimizing windows, interpolation factors and linear prediction analysis in the ITU-T G.729 speech coding standard
CN106104682A (en) Weighting function for quantifying linear forecast coding coefficient determines apparatus and method
CN106935243A (en) A kind of low bit digital speech vector quantization method and system based on MELP
Qian et al. Wideband speech recovery from narrowband speech using classified codebook mapping
Srivastava Fundamentals of linear prediction
Wong On understanding the quality problems of LPC speech
JP3321933B2 (en) Pitch detection method
CN112233686B (en) Voice data processing method of NVOCPLUS high-speed broadband vocoder
KR100557113B1 (en) Device and method for deciding of voice signal using a plural bands in voioce codec
JPH0756599A (en) Wide band voice signal reconstruction method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170707