CN1190773A - Method estimating wave shape gain for phoneme coding - Google Patents

Method estimating wave shape gain for phoneme coding

Info

Publication number
CN1190773A
CN1190773A CN97100716A
Authority
CN
China
Prior art keywords
signal
voice
frame
consonant
gain
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN97100716A
Other languages
Chinese (zh)
Inventor
林进灯
林信安
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
HETAI SEMICONDUCTOR CO Ltd
Original Assignee
HETAI SEMICONDUCTOR CO Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by HETAI SEMICONDUCTOR CO Ltd filed Critical HETAI SEMICONDUCTOR CO Ltd
Priority to CN97100716A priority Critical patent/CN1190773A/en
Publication of CN1190773A publication Critical patent/CN1190773A/en
Pending legal-status Critical Current

Abstract

The waveform gain estimation method for speech coding comprises the following steps: providing decoded envelope data, consisting of an envelope shape index value and a quantized gain value; passing a periodic speech pulse train through an oscillator to produce an aperiodic pulse signal that is fed into a voiced/unvoiced decision unit, while a noise signal is fed into the same unit through a separate path; dividing each input speech frame into several subframes and classifying each subframe as voiced or unvoiced; supplying a modified LPC parameter simultaneously to a synthesis filter and a post-filter; using an amplitude calculation unit to obtain the LPC parameter from the synthesis filter together with the decoded envelope data; computing a gain value and feeding it to a gain unit that controls the level of the synthesized speech; and finally outputting a smooth speech signal through the post-filter.

Description

A waveform gain estimation method for speech coding
The present invention relates to speech coding technology, and in particular to a waveform gain estimation method for speech coding.
In speech synthesis, the linear predictive coding vocoder (Linear Predictive Coding Vocoder, LPC) is in general use. Among linear predictive speech coding methods, the LPC-10 vocoder is widely applied in low-bit-rate speech compression.
Fig. 1 shows the block diagram of this conventional speech coding technique. The diagram includes a speech pulse generator 11 (Impulse Train Generator), a random noise signal generator 12 (Random Noise Generator), a voiced/unvoiced switch 13 (Voiced/Unvoiced Switch), a gain unit 14 (Gain Unit), a synthesis filter 15 (LPC Filter), and a synthesis-filter control parameter setting unit 16; the gain unit 14 additionally contains a gain setting unit.
The periodic speech pulse train (Periodic Impulse Train) produced by the speech pulse generator 11, or the white noise signal (White Noise) produced by the random noise signal generator 12, is selected by the voiced/unvoiced switch 13 according to the type of the input signal. The selected signal first passes through the gain unit 14, which scales it to the required signal level according to a preset gain value, and is then filtered by the synthesis filter 15 according to the preset LPC parameters (LPC Parameters). Finally, the speech signal S(n) is produced at the output of the synthesis filter 15.
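As a rough illustration of the all-pole synthesis step described above, the following sketch applies an LPC synthesis filter directly in the time domain. The function name and direct-form recursion are illustrative assumptions, not the patent's implementation:

```python
import numpy as np

def lpc_synthesize(excitation, lpc_coeffs, gain):
    """Direct-form all-pole LPC synthesis:
    s[n] = gain * e[n] + sum_k a_k * s[n - k]."""
    out = np.zeros(len(excitation))
    for n in range(len(excitation)):
        acc = gain * excitation[n]
        # Feed back previously synthesized samples through the predictor taps.
        for k, a in enumerate(lpc_coeffs, start=1):
            if n - k >= 0:
                acc += a * out[n - k]
        out[n] = acc
    return out
```

For example, a single pulse through a one-tap filter with coefficient 0.5 decays geometrically: `lpc_synthesize([1, 0, 0], [0.5], 1.0)` yields `[1.0, 0.5, 0.25]`.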
In practical speech coding applications, the output gain of the synthesized speech must be set or controlled so that the output signal matches the level of the input speech. The prior art achieves this gain setting and control mainly with two techniques. The first determines the gain value from the energy of the linearly predicted samples (Linear Predicted Samples) of the speech signal. The second computes the gain value from the root-mean-square (RMS) value. In this prior art, the gain of an unvoiced frame (Unvoiced Frame) of a noise signal is estimated purely from the RMS value; for a voiced frame (Voiced Frame) the same RMS estimation is used, but a rectangular window spanning several pitch periods may additionally be applied to obtain a more precise gain value. The gain value obtained by the former prior-art technique is then quantized as a 7-bit logarithmic value.
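The RMS gain estimate of the prior art amounts to a one-line computation; a minimal sketch (the function name is an assumption) would be:

```python
import numpy as np

def rms_gain(frame):
    """Root-mean-square value of one speech frame,
    used by the prior art as a simple gain estimate."""
    frame = np.asarray(frame, dtype=float)
    return np.sqrt(np.mean(frame ** 2))
```

For a frame of constant magnitude, the RMS equals that magnitude, e.g. `rms_gain([3, -3, 3, -3])` is `3.0`.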
However, no matter which of these conventional gain estimation techniques is adopted, a single gain estimation method cannot accurately estimate the correct gain value, because the traditional LPC coder is an open-loop system.
The purpose of the present invention is to overcome the shortcomings of the aforementioned prior art and to provide a gain estimation method for speech coding that yields a smooth synthesized speech signal.
Another object of the present invention is to provide a speech-coding gain estimation method that estimates the gain value according to the shape of the signal envelope of the speech waveform.
To achieve the above objects, the present invention adopts the following scheme.
The waveform gain estimation method for speech coding of the present invention comprises the following steps:
a. providing decoded envelope data, obtained by analyzing typical speech signals;
b. having a voiced/unvoiced decision unit select either the aperiodic pulse signal produced by passing the periodic speech pulse train through an oscillator, or a noise signal;
c. dividing each input speech frame into several subframes, and having the decision unit classify each input subframe as voiced or unvoiced;
d. providing a modified LPC parameter that is delivered simultaneously to a synthesis filter and a post-filter;
e. obtaining, by an amplitude calculation unit, the LPC parameter from the synthesis filter together with the decoded envelope data, computing a gain value, and delivering that gain value to a gain unit to control the level of the output synthesized speech;
f. outputting a speech signal through the post-filter.
The invention is described in detail below in conjunction with the drawings and an embodiment.
Brief description of the drawings:
Fig. 1 is the basic block diagram of a conventional speech synthesis circuit;
Fig. 2 is a schematic diagram of the speech synthesis steps of the present invention;
Fig. 3 is the coding table of the preferred embodiment of the present invention, in which 16 different envelope shapes are represented by 4-bit codes.
Fig. 2 is a schematic diagram of the speech synthesis of the present invention. It mainly includes an oscillator 21, a voiced/unvoiced decision unit 22 (Voiced/Unvoiced Decision), a synthesis filter 24 (Synthesis Filter), a modified LPC parameter unit 23 that interpolates LPC coefficients in the LSP domain (Interpolate LPC Coefficient in LSP Domain), an amplitude calculation unit 25 (Amplitude Calculation Unit), a decoded signal envelope information unit 26 (Decoded Envelope), a gain unit 27 (Gain Unit), and a post-filter 28 (Post Filter). The synthesis filter 24 consists of an all-pole filter (All-Pole Filter) and a de-emphasis filter (De-emphasis Filter).
The periodic speech pulse train (Periodic Impulse Train) passes through the oscillator 21, which sends an aperiodic pulse signal (Aperiodic Pulse) to the voiced/unvoiced decision unit 22; the white noise signal (White Noise) is delivered to the voiced/unvoiced decision unit 22 through another path.
The discrimination method adopted by the voiced/unvoiced decision unit 22 divides each input speech frame into four subframes (Subframe) and then classifies each subframe. In this method, each frame of the input speech signal is first divided into four subframes; each subframe is then classified comprehensively according to its correlation parameters. These parameters include NC, the energy, the line spectrum pair (Line Spectrum Pair, LSP), and the low-to-high band energy ratio (Low to High Band Energy Ratio Value, LOH). The same applicant has filed another patent application on this voiced/unvoiced decision technique.
For a slowly varying speech input signal, updating each frame one by one achieves the required output quality. However, under certain transient conditions, transient distortion may occur when frames change. To reduce this transient distortion, when the LPC parameters are sent to the synthesis filter 24, the modified LPC parameter unit 23 of the present invention revises the LSP parameters (in the explanation above, the LSP parameters are the LPC parameters before modification). The method evaluates intermediate parameters between frames so that the frame junctions become smoother without increasing the coding capacity. To reduce the number of LPC modification calculations, in the preferred embodiment of the present invention each speech frame is divided into four subframes, and the LSP parameters of each subframe are obtained by interpolating between the LSP parameter values of the current frame and the previous frame. The LSP parameters are then converted into LPC parameters, and the modified LPC parameters are finally delivered simultaneously to the synthesis filter 24 and the post-filter 28.
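Per-subframe LSP interpolation of the kind described above can be sketched as simple linear interpolation between the previous and current frame's LSP vectors. The weighting scheme below (weight m/4 for subframe m) is an illustrative assumption; the patent does not specify the exact interpolation weights:

```python
def interpolate_lsp(prev_lsp, curr_lsp, n_subframes=4):
    """Linearly interpolate LSP vectors between the previous and the
    current frame, producing one interpolated vector per subframe so
    that frame junctions change gradually."""
    out = []
    for m in range(1, n_subframes + 1):
        w = m / n_subframes  # weight of the current frame grows subframe by subframe
        out.append([(1 - w) * p + w * c for p, c in zip(prev_lsp, curr_lsp)])
    return out
```

With `prev_lsp=[0.0]` and `curr_lsp=[1.0]`, the four subframes receive `[0.25], [0.5], [0.75], [1.0]`, so the last subframe lands exactly on the current frame's parameters.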
The amplitude calculation unit 25 obtains the LPC parameters from the synthesis filter 24 together with the related data sent by the decoded envelope information unit 26, then outputs a gain control signal to the gain unit 27; finally, the required speech signal is output by the post-filter 28.
The signals input to the envelope information unit 26 comprise an envelope shape index value (Shape Index) and a quantized gain value (Quantized Gain). These two parameters are obtained by analyzing frames of typical speech signals. In an embodiment of the present invention, 16 different envelope shapes are encoded with 4-bit codes; the corresponding table is shown in Fig. 3. According to this envelope shape coding table, during envelope encoding, once the envelope shape index value that best matches the shape of an input speech frame has been found in the coding table, the gain is quantized, with a known logarithmic quantizer, into for example a 7-bit value. The quantized gain value and the envelope shape index value obtained in this way are fed into the envelope information unit 26 of Fig. 2.
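A 7-bit logarithmic gain quantizer of the kind mentioned above can be sketched as uniform quantization of log-gain. The range limits `g_min`/`g_max` and the function name are illustrative assumptions, not values given in the patent:

```python
import math

def quantize_gain_log(gain, bits=7, g_min=1.0, g_max=1024.0):
    """Uniformly quantize a gain on a logarithmic scale to `bits` bits.
    Returns the quantizer index and the reconstructed gain value."""
    levels = (1 << bits) - 1
    lo, hi = math.log(g_min), math.log(g_max)
    # Clamp to the quantizer range, then map log-gain to an integer index.
    lg = min(max(math.log(gain), lo), hi)
    idx = round((lg - lo) / (hi - lo) * levels)
    rec = math.exp(lo + idx / levels * (hi - lo))
    return idx, rec
```

The extremes of the range map to the extreme codes: a gain of `g_min` gives index 0, and a gain of `g_max` gives index 127 with a reconstruction equal (up to floating-point error) to `g_max`.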
The gain calculation of the present invention computes the gain value such that the maximum amplitude of the synthesized speech just reaches the decoded envelope. In the gain calculation method of the present invention, voiced frames and noise subframes are analyzed and calculated separately.
1. Voiced frames:
For a voiced subframe, the excitation takes the form of aperiodic pulses. When calculating the gain, the unit impulse response of the synthesis filter at the pulse position is computed first. The gain of the pulse can be calculated by the following formula:
α_k = min( |Env_k,i / imp_res_k,i| ),  p_o ≤ i ≤ p_o + r

where α_k represents the gain of the k-th pulse;
Env_k,i represents the decoded envelope of the k-th pulse at position i;
imp_res_k,i represents the impulse response;
p_o represents the position of the pulse;
r represents the search length (a typical value is 10).

After the gain of the pulse has been calculated, the pulse is sent to the synthesis filter; on receiving the signal, the synthesis filter multiplies it by the calculated α_k, producing a synthesized speech signal (Synthesized Speech) at the output of the synthesis filter 24. After this calculation step is finished, the above steps are repeated to calculate the gain of the next pulse.
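The α_k formula picks the largest gain for which the scaled impulse response still stays under the decoded envelope over the search window. A minimal sketch (function name and zero-division handling are assumptions):

```python
import numpy as np

def pulse_gain(envelope, impulse_response, p0, r=10):
    """Gain of the k-th pulse: alpha_k = min |Env_i / imp_res_i|
    over the search window p0 <= i <= p0 + r, so the scaled impulse
    response never exceeds the decoded envelope there."""
    env = np.asarray(envelope, dtype=float)
    imp = np.asarray(impulse_response, dtype=float)
    i = np.arange(p0, min(p0 + r + 1, len(env)))
    nz = imp[i] != 0  # skip samples where the response is zero
    return np.min(np.abs(env[i][nz] / imp[i][nz]))
```

With a flat impulse response `[1, 1, 1]` and a decaying envelope `[4, 2, 1]`, the binding constraint is the last sample, so `pulse_gain([4, 2, 1], [1, 1, 1], 0, 2)` gives `1.0`.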
2. Noise subframes:
For a noise subframe, the excitation takes the form of noise (White Noise). The position of the noise response of the synthesis filter within the whole subframe is calculated first; the purpose is to prevent the amplitude of the synthesized signal from exceeding the decoded envelope within the subframe. The gain of the noise signal for the whole subframe can be calculated by the following formula:
β_j = min( |Env_j,i / noise_res_j,i| ),  w_o ≤ i ≤ sub_leng

where β_j represents the gain of the noise signal of the whole j-th subframe;
Env_j,i represents the decoded envelope of the noise signal at position i;
noise_res_j,i represents the noise signal response;
w_o represents the starting position of each subframe;
sub_leng represents the length of the subframe.

After the gain of the noise signal has been calculated, the noise signal is sent to the synthesis filter; on receiving the signal, the synthesis filter multiplies it by the calculated β_j, so that an unvoiced synthesized speech signal (Unvoiced Synthesized Speech) for the whole j-th subframe is produced at the output of the synthesis filter 24.
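The β_j calculation mirrors the voiced case, but the minimum is taken over the whole subframe rather than a short search window around a pulse. A minimal sketch under the same illustrative assumptions:

```python
import numpy as np

def noise_gain(envelope, noise_response, w0, sub_len):
    """Gain beta_j of an unvoiced subframe: beta_j = min |Env_i / noise_res_i|
    over w0 <= i <= sub_len, so the scaled noise response stays under the
    decoded envelope everywhere in the subframe."""
    env = np.asarray(envelope, dtype=float)
    res = np.asarray(noise_response, dtype=float)
    i = np.arange(w0, min(sub_len + 1, len(env)))
    nz = res[i] != 0  # skip samples where the response is zero
    return np.min(np.abs(env[i][nz] / res[i][nz]))
```

With envelope `[6, 4, 2]` and a flat noise response `[2, 2, 2]`, the tightest ratio is at the last sample, so `noise_gain([6, 4, 2], [2, 2, 2], 0, 2)` gives `1.0`.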
In summary, the effect of the present invention is as follows:
Because the present invention estimates the gain value according to the waveform shape of the speech signal envelope, the data of each frame can be updated one by one in a slowly varying speech input signal and the transient distortion of the signal can be reduced; a more faithful and smoother synthesized speech signal is therefore obtained.

Claims (8)

1. A waveform gain estimation method for speech coding, comprising the following steps:
a. providing decoded envelope data, obtained by analyzing typical speech signals;
b. having a voiced/unvoiced decision unit select either the aperiodic pulse signal produced by passing the periodic speech pulse train through an oscillator, or a noise signal;
c. dividing each input speech frame into several subframes, and having the decision unit classify each input subframe as voiced or unvoiced;
d. providing a modified LPC parameter that is delivered simultaneously to a synthesis filter and a post-filter;
e. obtaining, by an amplitude calculation unit, the LPC parameter from the synthesis filter together with the decoded envelope data, computing a gain value, and delivering that gain value to a gain unit to control the level of the output synthesized speech;
f. outputting a speech signal through the post-filter.
2. The waveform gain estimation method for speech coding according to claim 1, characterized in that the envelope data in step a includes an envelope shape index value and a quantized gain value of the speech signal.
3. The waveform gain estimation method for speech coding according to claim 2, characterized in that the envelope shape index value and the quantized gain value are obtained by analyzing frames of the speech signal; according to the analysis result, 16 different envelope shapes are encoded with 4-bit codes, and a corresponding table is obtained.
4. The waveform gain estimation method for speech coding according to claim 1, characterized in that the modified LPC parameter delivered to the synthesis filter in step d is obtained with the following step:
according to the decoded LSP parameters, interpolating the LSP parameters in the LSP domain by the modified LPC parameter unit; the method evaluates intermediate parameters between frames so that, without increasing the coding capacity, the interpolation makes the frame junctions smoother and reduces the transient distortion.
5. The waveform gain estimation method for speech coding according to claim 4, characterized in that, in the step of interpolating the LPC parameters in the LSP domain, each speech frame is divided into four subframes, the LSP parameters of each subframe are obtained by interpolating between the LSP parameter values of the current frame and the previous frame, and the LSP parameters are then converted into LPC parameters.
6. The waveform gain estimation method for speech coding according to claim 1, characterized in that the gain calculation in step e computes a suitable gain value such that the maximum amplitude of the synthesized speech just reaches the decoded envelope, and the voiced and unvoiced subframes of the input speech signal are analyzed and calculated separately to obtain the gain values of the voiced frames and noise subframes respectively.
7. The waveform gain estimation method for speech coding according to claim 6, characterized in that the calculation of the gain value of a voiced frame comprises the following steps:
a. calculating the unit impulse response of the synthesis filter at the pulse position;
b. calculating the gain of the pulse with the following formula:
α_k = min( |Env_k,i / imp_res_k,i| ),  p_o ≤ i ≤ p_o + r
where α_k represents the gain of the k-th pulse;
Env_k,i represents the decoded envelope of the k-th pulse at position i;
imp_res_k,i represents the impulse response;
p_o represents the position of the pulse;
r represents the search length;
c. after the gain of the pulse has been calculated, sending the pulse to the synthesis filter;
d. after receiving the signal, multiplying it in the synthesis filter by the calculated α_k, producing a synthesized speech signal at the output of the synthesis filter;
e. after finishing the above calculation steps, repeating them to calculate the gain of the next pulse.
8. The waveform gain estimation method for speech coding according to claim 6, characterized in that the calculation of the gain value of a noise subframe comprises:
a. first calculating the position of the noise signal response of the synthesis filter within the whole subframe;
b. calculating the gain of the whole subframe with the following formula:
β_j = min( |Env_j,i / noise_res_j,i| ),  w_o ≤ i ≤ sub_leng
where β_j represents the noise signal gain of the whole j-th subframe;
Env_j,i represents the decoded envelope of the noise signal at position i;
noise_res_j,i represents the noise signal response;
w_o represents the starting position of each subframe;
sub_leng represents the length of the subframe;
c. after the gain of the noise signal has been calculated, sending the noise signal to the synthesis filter;
d. after receiving the signal, multiplying it in the synthesis filter by the calculated β_j, so that a noise-excited synthesized speech signal for the whole j-th subframe is produced at the output of the synthesis filter.
CN97100716A 1997-02-13 1997-02-13 Method estimating wave shape gain for phoneme coding Pending CN1190773A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN97100716A CN1190773A (en) 1997-02-13 1997-02-13 Method estimating wave shape gain for phoneme coding


Publications (1)

Publication Number Publication Date
CN1190773A true CN1190773A (en) 1998-08-19

Family

ID=5165266

Family Applications (1)

Application Number Title Priority Date Filing Date
CN97100716A Pending CN1190773A (en) 1997-02-13 1997-02-13 Method estimating wave shape gain for phoneme coding

Country Status (1)

Country Link
CN (1) CN1190773A (en)


Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100587807C (en) * 1999-01-27 2010-02-03 编码技术股份公司 Device for enhancing information source decoder and method for enhancing information source decoding method
CN101625866B (en) * 1999-01-27 2012-12-26 杜比国际公司 Methods and an apparatus for enhancement of source decoder
US7519530B2 (en) 2003-01-09 2009-04-14 Nokia Corporation Audio signal processing
CN101199233B (en) * 2005-05-18 2012-01-18 松下电器产业株式会社 Howling control apparatus and acoustic apparatus
CN103001598A (en) * 2011-07-19 2013-03-27 联发科技股份有限公司 Audio processing device and audio systems using same
CN103001598B (en) * 2011-07-19 2015-10-28 联发科技股份有限公司 Apparatus for processing audio and use the audio system of this apparatus for processing audio
US9252730B2 (en) 2011-07-19 2016-02-02 Mediatek Inc. Audio processing device and audio systems using the same
CN105355197A (en) * 2015-10-30 2016-02-24 百度在线网络技术(北京)有限公司 Gain processing method and device for speech recognition system
CN105355197B (en) * 2015-10-30 2020-01-07 百度在线网络技术(北京)有限公司 Gain processing method and device for voice recognition system

Similar Documents

Publication Publication Date Title
US8620647B2 (en) Selection of scalar quantixation (SQ) and vector quantization (VQ) for speech coding
JP3483891B2 (en) Speech coder
TW448417B (en) Speech encoder adaptively applying pitch preprocessing with continuous warping
KR100908219B1 (en) Method and apparatus for robust speech classification
US5018200A (en) Communication system capable of improving a speech quality by classifying speech signals
US20060064301A1 (en) Parametric speech codec for representing synthetic speech in the presence of background noise
KR20020052191A (en) Variable bit-rate celp coding of speech with phonetic classification
KR19990006262A (en) Speech coding method based on digital speech compression algorithm
US6985857B2 (en) Method and apparatus for speech coding using training and quantizing
EP1420391B1 (en) Generalized analysis-by-synthesis speech coding method, and coder implementing such method
CN101359978A (en) Method for control rate variant multi-mode wideband encoding rate
AU2014317525A1 (en) Unvoiced/voiced decision for speech processing
CN1190773A (en) Method estimating wave shape gain for phoneme coding
US20100153099A1 (en) Speech encoding apparatus and speech encoding method
Wang et al. Phonetic segmentation for low rate speech coding
WO2003001172A1 (en) Method and device for coding speech in analysis-by-synthesis speech coders
JP3232701B2 (en) Audio coding method
Wong On understanding the quality problems of LPC speech
CN1189664A (en) Sub-voice discrimination method of voice coding
Mcaulay et al. Sinusoidal transform coding
WO2001009880A1 (en) Multimode vselp speech coder
Mao et al. A 2000 bps LPC vocoder based on multiband excitation
HEIKKINEN et al. On Improving the Performance of an ACELP Speech Coder
Ould-cheikh WIDE BAND SPEECH CODER AT 13 K bit/s
Zhang et al. A 2400 bps improved MBELP vocoder

Legal Events

Date Code Title Description
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C06 Publication
PB01 Publication
C53 Correction of patent for invention or patent application
CB02 Change of applicant information

Applicant after: Shengqun Semiconductor Co., Ltd.

Applicant before: Hetai Semiconductor Co., Ltd.

COR Change of bibliographic data

Free format text: CORRECT: APPLICANT; FROM: HETAI SEMICONDUCTOR CO., LTD. TO: SHENGQUN SEMICONDUCTOR CO., LTD.

C01 Deemed withdrawal of patent application (patent law 1993)
WD01 Invention patent application deemed withdrawn after publication