CN1737904A

CN1737904A - Voice coding apparatus and method using PLP in a mobile communications terminal

Info

Publication number: CN1737904A
Application number: CNA2005101098544A
Authority: CN
Inventors: 金灿佑
Original assignee: LG Electronics Inc
Current assignee: LG Electronics Inc
Priority date: 2004-07-23
Filing date: 2005-07-25
Publication date: 2006-02-22
Also published as: KR100619893B1; ATE480852T1; KR20060008078A; EP1619665A1; EP1619665B1; JP2006039559A; DE602005023385D1

Abstract

A voice coding apparatus and method for a mobile communications terminal achieve a higher compression ratio and ensure high sound quality, compared with the case of using Linear Prediction (LP) coefficients, by performing Linear Predictive Coding (LPC) using Perceptual Linear Prediction (PLP) coefficients.

Description

Speech coding apparatus and method using perceptual linear prediction in a mobile communication terminal

Technical field

The present invention relates to coding in a mobile communication terminal and, more particularly, to a speech coding apparatus and method using perceptual linear prediction (PLP).

Background technology

With the development of mobile communication technology, mobile communication terminals now provide, in addition to voice communication, data communication using numerals, characters, symbols and the like, and multimedia communication including various image signals. Multiple terminal users are each allocated a radio channel by the system and use radio resources to send and receive the required data. However, because the radio channel has a limited bandwidth that must be shared by many users simultaneously, each user's data bit rate is naturally restricted.

Accordingly, coding techniques have been proposed for transmitting a larger amount of data at such limited data bit rates. Various prior-art speech coding techniques exist, each of which has advantages at a particular bit rate.

For example, general audio coding, pulse code modulation (PCM) and adaptive differential pulse code modulation (ADPCM) are used effectively for speech coding at high bit rates above 16 Kbps, while code-excited linear prediction (CELP) and its many variants are used effectively at intermediate bit rates in the range of 2.4 Kbps to 16 Kbps. In particular, coding methods using LD-CELP, CS-ACELP, VSELP and MELP, as well as wideband speech coding, are used at intermediate bit rates. In addition, linear predictive coding (LPC), residual-excited linear prediction (RELP), formant vocoders and cepstral vocoders have many advantages at low bit rates in the range of 75 bps to 2.4 Kbps.

Accordingly, the prior art and the present invention are explained below with respect to a method for improving LPC, which is one of the coding methods used at low bit rates.

Fig. 1 illustrates the structure of a prior-art LPC encoder.

As illustrated in the figure, the prior-art LPC encoder comprises: a correlator 10 for calculating the autocorrelation value r_x[n] of an input signal x[n]; an LP coefficient calculator 11 for calculating LP coefficients a_n and a gain G by processing the autocorrelation value r_x[n]; a V/UV determining unit 12 for determining whether the input signal x[n] is a voiced (V) signal or an unvoiced (UV) signal; a pitch calculator 13 for calculating the pitch P of the signal when the input signal x[n] is a voiced (V) signal; and a parameter encoding unit 14 for outputting a bit stream by encoding, according to the V/UV indication bit output from the V/UV determining unit 12, the LP coefficients a_n, gain G and pitch P received from the LP coefficient calculator 11 and the pitch calculator 13.

The operation of the prior-art LPC encoder having the above structure will now be explained.

First, the correlator 10 autocorrelates the input signal x[n]. The LP coefficient calculator 11 processes the autocorrelation value r_x[n] calculated by the correlator 10 to calculate the LP coefficients a_n and the gain G. Meanwhile, the V/UV determining unit 12 determines whether the input signal x[n] is a voiced (V) signal or an unvoiced (UV) signal, outputs a V/UV indication bit, and then outputs only the voiced (V) signal. The pitch calculator 13 calculates the pitch P of the voiced (V) signal output from the V/UV determining unit 12.
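The correlator and LP coefficient calculator stages can be sketched as follows. The text does not say how the LP coefficients are derived from r_x[n]; this sketch assumes the widely used Levinson-Durbin recursion, and the function names `autocorrelation` and `levinson_durbin` are hypothetical. The sign convention assumes a[0] = 1, so that the prediction error is the sum of a[k] * x[n - k].

```python
def autocorrelation(x, max_lag):
    """r_x[k] = sum over n of x[n] * x[n - k], for k = 0..max_lag."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LP normal equations from autocorrelation values r[0..order].

    Assumed convention: a[0] = 1 and e[n] = sum_k a[k] * x[n - k].
    Returns (a, err), where err is the residual energy; a gain can be
    taken as the square root of err.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a, err
```

For a decaying sequence such as x[n] = 0.5^n, a second-order analysis recovers a first-order predictor: a[1] is close to -0.5 and a[2] is close to 0.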

Accordingly, when the V/UV indication bit indicates a voiced (V) signal, the parameter encoding unit 14 outputs a bit stream by encoding (at a low bit rate) the LP coefficients a_n, gain G and pitch P received from the LP coefficient calculator 11 and the pitch calculator 13. A controller (not shown) then processes the bit stream and outputs it to a radio (wireless) unit (not shown). The radio unit converts the signal output from the controller into a radio (wireless) signal and transmits the converted radio signal.

Thus, in the prior art, the mobile communication terminal performs LPC coding to transmit an audio signal at a low bit rate. However, prior-art LPC coding generally uses linear prediction coefficients, which do not take the characteristics of human auditory perception into account. Consequently, for prior-art LPC coding operating at low bit rates (that is, 1200 bps to 2400 bps), the compression efficiency is not very high and good sound quality cannot be obtained.

Summary of the invention

Therefore, an object of the present invention is to provide a speech coding apparatus and method for a mobile communication terminal that can improve compression efficiency and sound quality by performing LPC coding using PLP coefficients.

To achieve these and other advantages, and in accordance with the purpose of the present invention, as embodied and broadly described herein, there is provided a linear predictive coding (LPC) encoder of a mobile communication terminal, comprising: a perceptual linear prediction (PLP) coefficient calculator for calculating PLP coefficients and a gain by processing an input signal; a V/UV determining unit for determining whether the input signal is a voiced signal or an unvoiced signal and, when the input signal is a voiced signal, outputting a determination signal and the voiced signal; a pitch calculator for calculating the pitch of the input signal output from the V/UV determining unit; and a parameter encoding unit for performing low bit rate coding using the PLP coefficients, the gain and the pitch based on the determination signal.

To achieve these and other advantages, and in accordance with the purpose of the present invention, as embodied and broadly described herein, there is also provided a low bit rate speech coding method for a mobile communication terminal, comprising: calculating perceptual linear prediction (PLP) coefficients and a gain by processing an input signal; determining whether the input signal is a voiced signal or an unvoiced signal and, when the input signal is determined to be a voiced signal, outputting a determination bit value and the voiced signal; calculating the pitch of the input signal output from the V/UV determining unit; and performing low bit rate coding using the PLP coefficients, the gain and the pitch based on the determination bit value.

Preferably, the voiced signal is a speech signal.

Preferably, the PLP coefficients are of approximately the 7th order for an 8 kHz sampling rate.

The foregoing and other objects, features, aspects and advantages of the present invention will become more apparent from the following detailed description taken in conjunction with the accompanying drawings.

Description of drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and, together with the description, serve to explain the principles of the invention.

In the accompanying drawings:

Fig. 1 illustrates the structure of a prior-art LPC encoder using LP coefficients;

Fig. 2 illustrates an LPC encoder using PLP coefficients according to the present invention; and

Fig. 3 illustrates in detail the sequential steps for calculating the PLP coefficients in Fig. 2.

Detailed description of the embodiments

Reference will now be made in detail to the preferred embodiments of the present invention, examples of which are illustrated in the accompanying drawings.

The present invention provides low bit rate speech coding using perceptual linear prediction (PLP), which allows linear predictive coding (LPC) to be performed at a lower order, so that speech coding with a high compression ratio can be carried out.

First, the difference between PLP and LP will be explained.

LP is conventionally well known, so a detailed derivation is not given here. LP basically involves obtaining LP coefficients a_k such that the mean squared error (MSE) of the prediction error e[n] given by formula (1) is minimized.

e[n] = x[n] - \hat{x}[n] = \sum_{k=0}^{N_{pred}} a_{k} \, x[n-k]

Formula (1)
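Formula (1) can be evaluated directly once the coefficients are known. The following is a minimal sketch, assuming a[0] = 1 (so the remaining coefficients carry the negated predictor weights) and assuming samples before the start of the signal are zero; the names `prediction_error` and `mean_squared_error` are hypothetical.

```python
def prediction_error(x, a):
    """e[n] = sum_{k=0}^{N_pred} a[k] * x[n - k], taking x[m] = 0 for m < 0."""
    return [sum(a[k] * x[n - k] for k in range(len(a)) if n - k >= 0)
            for n in range(len(x))]


def mean_squared_error(e):
    """Average of e[n]^2 over the frame: the quantity LP minimizes."""
    return sum(v * v for v in e) / len(e)


# A signal that halves at every sample is predicted exactly by a = [1, -0.5];
# only the first sample, which has no history, is left in the residual.
x = [1.0, 0.5, 0.25, 0.125]
e = prediction_error(x, [1.0, -0.5])
mse = mean_squared_error(e)
```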

The LP coefficients a_k thus obtained are of approximately the 8th to 12th order for an 8 kHz sampling rate. The LP coefficients a_k are used in various coding methods based on linear prediction (LP) (for example, LPC, CELP, MELP, RELP and the like), which are disclosed in more detail in Speech Coding and Synthesis, Amsterdam, the Netherlands: Elsevier, 1995.

PLP was first introduced in a 1990 paper by Hermansky. Like the existing Mel-frequency cepstral coefficients (MFCC), PLP makes use of the characteristics of human auditory perception. Therefore, when performing LPC at a low bit rate, the present invention carries out low bit rate speech coding using PLP coefficients instead of LP coefficients.

That is, the present invention uses PLP coefficients to obtain the spectrum. The PLP coefficients reflect human auditory effects. In terms of MSE, therefore, a spectrum obtained with PLP coefficients may show a larger error than one obtained with LP coefficients. When auditory effects are taken into account, however, the spectrum obtained with PLP coefficients can have a smaller error. Furthermore, regarding coefficient transmission at a typical 8 kHz sampling rate, LPC transmits coefficients of approximately the 10th order, whereas PLP transmits coefficients of approximately the 7th order, so the bit rate can be reduced.
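The effect of the lower order on the bit rate can be illustrated with simple arithmetic. Only the orders 10 and 7 come from the text; the 5 bits per coefficient and 50 frames per second used here are hypothetical values chosen purely for illustration, as is the function name.

```python
def coefficient_bits_per_second(order, bits_per_coeff, frames_per_second):
    """Bits per second spent on spectral coefficients alone."""
    return order * bits_per_coeff * frames_per_second


# Hypothetical quantization: 5 bits per coefficient, 50 frames per second.
lp_rate = coefficient_bits_per_second(10, 5, 50)   # 10th-order LP
plp_rate = coefficient_bits_per_second(7, 5, 50)   # 7th-order PLP
saving = 1.0 - plp_rate / lp_rate                  # fraction of coefficient bits saved
```

Under these assumed values the coefficient payload drops from 2500 to 1750 bits per second. The relative saving of 30 percent depends only on the two orders, as long as both schemes use the same bits per coefficient and frame rate.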

Fig. 2 illustrates the structure of an LPC encoder using PLP coefficients according to the present invention.

Referring to Fig. 2, the LPC encoder using PLP coefficients is configured identically to the prior-art LPC encoder of Fig. 1, except that it does not include the correlator 10 and the LP coefficient calculator 11 is replaced with a PLP coefficient calculator 20.

The PLP coefficient calculator 20 processes a speech signal S[n] to calculate PLP coefficients a_p and a gain G that take auditory effects into account.

The operation of the LPC encoder using PLP coefficients according to the present invention, having the above structure, will now be explained with reference to the accompanying drawings.

First, the PLP coefficient calculator 20 receives the speech signal S[n] and calculates the PLP coefficients a_p and the gain G by sequentially performing the operations shown in Fig. 3.

That is, the PLP coefficient calculator 20 performs a fast Fourier transform (FFT) on the input signal, namely the speech signal S[n]. Critical-band integration and resampling are then performed on the Fourier-transformed speech signal to remove noise components from the speech signal S[n] on a per-frequency basis.

Once the noise components have been removed, the PLP coefficient calculator 20 performs equal-loudness pre-emphasis on the Fourier-transformed speech signal so that it becomes a sound component with an amplitude suited to human auditory perception, and then matches the speech signal to an output power at which humans can listen.

When the power matching is complete, the PLP coefficient calculator 20 performs an inverse discrete Fourier transform on the resulting speech signal and then obtains a set of linear equations from it. The PLP coefficient calculator 20 then performs cepstral recursion on this set of linear equations, thereby outputting the cepstral coefficients of the PLP model, that is, the PLP coefficients a_p. In other words, the PLP coefficient calculator 20 outputs the low-order PLP coefficients a_p, which reflect the characteristics of human auditory perception, and the gain G to the parameter encoding unit 23 as parameter values.
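The sequence of operations described for the PLP coefficient calculator 20 can be sketched roughly as follows. This is not the patented implementation: the Bark mapping, the masking curve, the equal-loudness formula, the cube-root power matching, the edge-band handling and every function name are assumptions, loosely modeled on commonly published descriptions of PLP analysis. A naive DFT stands in for the FFT, and windowing is omitted for brevity.

```python
import cmath
import math

def power_spectrum(frame):
    # Naive DFT for bins 0..N/2; stands in for the FFT step.
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) ** 2 for k in range(n // 2 + 1)]

def hz_to_bark(f):
    return 6.0 * math.asinh(f / 600.0)

def critical_band_weight(z):
    # Assumed piecewise masking curve over Bark distance z.
    if z < -1.3 or z > 2.5:
        return 0.0
    if z <= -0.5:
        return 10.0 ** (2.5 * (z + 0.5))
    if z < 0.5:
        return 1.0
    return 10.0 ** (-(z - 0.5))

def equal_loudness(f):
    # Assumed approximation of the ear's sensitivity at frequency f (Hz).
    w2 = (2.0 * math.pi * f) ** 2
    return ((w2 + 56.8e6) * w2 * w2) / ((w2 + 6.3e6) ** 2 * (w2 + 0.38e9))

def auditory_spectrum(frame, fs):
    p, n = power_spectrum(frame), len(frame)
    nyq = hz_to_bark(fs / 2.0)
    nbands = int(math.ceil(nyq)) + 1
    step = nyq / (nbands - 1)
    out = []
    for b in range(nbands):
        center = b * step
        # Critical-band integration, resampled at roughly 1 Bark.
        energy = sum(p[k] * critical_band_weight(hz_to_bark(k * fs / n) - center)
                     for k in range(len(p)))
        fc = 600.0 * math.sinh(center / 6.0)
        # Equal-loudness weighting, then cube-root power matching.
        out.append((equal_loudness(fc) * energy) ** (1.0 / 3.0))
    out[0], out[-1] = out[1], out[-2]   # assumed edge handling
    return out

def autocorr_from_spectrum(phi, order):
    # Inverse DFT of the even-symmetric spectrum, written in cosine form.
    m = len(phi) - 1
    return [(phi[0] + phi[m] * math.cos(math.pi * k)
             + 2.0 * sum(phi[i] * math.cos(math.pi * k * i / m)
                         for i in range(1, m))) / (2.0 * m)
            for k in range(order + 1)]

def levinson_durbin(r, order):
    # Solves the set of linear equations for the all-pole model (a[0] = 1).
    a, err = [1.0] + [0.0] * order, r[0]
    for i in range(1, order + 1):
        k = -(r[i] + sum(a[j] * r[i - j] for j in range(1, i))) / err
        a = [a[j] + k * a[i - j] if 0 < j < i else a[j] for j in range(order + 1)]
        a[i] = k
        err *= 1.0 - k * k
    return a, err

def lpc_to_cepstrum(a, n_ceps):
    # Cepstral recursion for the model 1 / A(z).
    order, c = len(a) - 1, [0.0] * (n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = sum(k * c[k] * a[n - k] for k in range(max(1, n - order), n))
        c[n] = -(a[n] if n <= order else 0.0) - acc / n
    return c[1:]

def plp(frame, fs=8000, order=7):
    r = autocorr_from_spectrum(auditory_spectrum(frame, fs), order)
    a, err = levinson_durbin(r, order)
    return a, math.sqrt(err), lpc_to_cepstrum(a, order)
```

Calling `plp(frame)` on a non-silent frame returns the 7th-order all-pole coefficients, a gain and seven cepstral coefficients. An all-zero frame is not handled in this sketch and would cause a division by zero in the recursion.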

Meanwhile, the V/UV determining unit 21 outputs a V/UV indication bit and passes the speech signal S[n] to the pitch calculator 22. The pitch calculator 22 calculates the pitch P of the speech signal S[n].
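The text does not state how the V/UV decision or the pitch is computed. As a minimal sketch, one common approach is assumed here: frame energy and zero-crossing rate for the V/UV decision, and the autocorrelation peak for the pitch. The thresholds, the 50 Hz to 400 Hz search range and the function names are all hypothetical.

```python
def is_voiced(frame, energy_threshold=0.01, zcr_threshold=0.25):
    """Assumed V/UV rule: voiced frames are energetic and cross zero rarely."""
    energy = sum(v * v for v in frame) / len(frame)
    crossings = sum(1 for p, q in zip(frame, frame[1:]) if (p >= 0) != (q >= 0))
    zcr = crossings / (len(frame) - 1)
    return energy > energy_threshold and zcr < zcr_threshold


def estimate_pitch(frame, fs=8000, f_min=50.0, f_max=400.0):
    """Assumed pitch rule: the lag with the largest autocorrelation, in Hz."""
    lo = int(fs / f_max)
    hi = min(int(fs / f_min), len(frame) - 1)

    def corr(lag):
        return sum(frame[n] * frame[n - lag] for n in range(lag, len(frame)))

    best_lag = max(range(lo, hi + 1), key=corr)
    return fs / best_lag
```

For a 400-sample frame containing a 100 Hz sine sampled at 8 kHz, `is_voiced` returns True and `estimate_pitch` selects a lag of 80 samples, which corresponds to 100 Hz.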

Accordingly, the parameter encoding unit 23 outputs a bit stream by encoding (at a low bit rate), according to the V/UV indication bit value, the PLP coefficients a_p, gain G and pitch P received from the PLP coefficient calculator 20 and the pitch calculator 22. Preferably, the transmitted PLP coefficients a_p are of approximately the 7th order for an 8 kHz sampling rate. A controller (not shown) then processes the bit stream and outputs the processed bit stream to a radio (wireless) unit (not shown). The radio unit converts the signal output from the controller into a radio signal and transmits it.
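The parameter encoding step can be pictured as packing the V/UV bit, the quantized coefficients, the gain and the pitch into one frame. The text gives no bit allocation or quantizer, so the uniform quantizer, the value ranges, the field widths (1 + 7 x 5 + 5 + 7 = 48 bits) and the function names below are all hypothetical. Practical coders usually quantize a transformed representation of the coefficients rather than the coefficients themselves.

```python
def quantize(value, lo, hi, bits):
    """Uniform scalar quantizer: clip to [lo, hi] and map to an integer index."""
    levels = (1 << bits) - 1
    clipped = min(max(value, lo), hi)
    return round((clipped - lo) / (hi - lo) * levels)


def pack_frame(voiced, coeffs, gain, pitch_hz):
    """Return one frame as a string of '0'/'1' characters.

    Assumed layout: 1 V/UV bit, 5 bits per coefficient, 5 gain bits,
    7 pitch bits (all zero when the frame is unvoiced).
    """
    fields = [(1 if voiced else 0, 1)]
    fields += [(quantize(c, -4.0, 4.0, 5), 5) for c in coeffs]
    fields.append((quantize(gain, 0.0, 1.0, 5), 5))
    fields.append((quantize(pitch_hz, 50.0, 400.0, 7) if voiced else 0, 7))
    return "".join(format(value, "0{}b".format(width)) for value, width in fields)
```

With seven coefficients, `pack_frame` always yields 48 bits per frame under this assumed layout, and the leading bit carries the V/UV decision.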

As described above, in the present invention LPC is performed using PLP coefficients, so the compression ratio can be improved and speech signals can be transmitted more efficiently at a low bit rate.

Furthermore, by using PLP coefficients rather than the existing LP coefficients as parameters, the present invention can achieve a higher compression ratio, and a signal with high sound quality can be expected.

Therefore, the speech coding apparatus and method according to the present invention can be used to encode and decode speech at a low bit rate, or in equipment occupying a very small area, and speech synthesis can be performed using the PLP parameters.

In addition, the speech coding apparatus and method according to the present invention can be used for speech coding in applications where the sound quality itself is not very important and it is sufficient that the speech be intelligible. Moreover, effective voice conversation can be carried out over the Internet, or in an embedded system with limited memory, where data must be stored at a high compression ratio or a low bit rate is required.

As the present invention may be embodied in several forms without departing from its spirit or essential characteristics, it should also be understood that the above-described embodiments are not limited by any of the details of the foregoing description, unless otherwise specified, but rather should be construed broadly within the spirit and scope defined in the appended claims. Therefore, all changes and modifications that fall within the scope of the claims, or equivalents of such scope, are intended to be embraced by the appended claims.

Claims

1. A speech coding apparatus in a mobile communication terminal, comprising:

a perceptual linear prediction (PLP) coefficient calculator for calculating PLP coefficients and a gain by processing an input signal;

a V/UV determining unit for determining whether the input signal is a voiced signal or an unvoiced signal and, when the input signal is a voiced signal, outputting a determination result and the voiced signal;

a pitch calculator for calculating the pitch of the input signal output from the V/UV determining unit; and

a parameter encoding unit for performing low bit rate coding using the PLP coefficients, the gain and the pitch based on the determination result.

2. The apparatus according to claim 1, wherein the voiced signal is a speech signal.

3. The apparatus according to claim 1, wherein the determination result is a bit value indicating whether the input signal is a voiced signal or an unvoiced signal.

4. The apparatus according to claim 1, wherein the order of the PLP coefficients is approximately the 7th order for an 8 kHz sampling rate.

5. A speech coding method for a mobile communication terminal, comprising:

calculating perceptual linear prediction (PLP) coefficients and a gain by processing an input signal;

determining whether the input signal is a voiced signal or an unvoiced signal and, when the input signal is determined to be a voiced signal, outputting a determination signal and the voiced signal;

calculating the pitch of the input signal output from the V/UV determining unit; and

performing low bit rate coding using the PLP coefficients, the gain and the pitch based on the determination signal.

6. The method according to claim 5, wherein the voiced signal is a speech signal.

7. The method according to claim 5, wherein the step of calculating the PLP coefficients and the gain comprises:

performing a fast Fourier transform (FFT) on the input signal;

performing critical-band integration and resampling on the Fourier-transformed speech signal, thereby removing noise components on a per-frequency basis;

performing equal-loudness pre-emphasis on the Fourier-transformed speech signal so that it becomes a sound component with an amplitude suited to human auditory perception, and then matching the speech signal to a suitable output power;

performing an inverse discrete Fourier transform on the speech signal matched to the output power, thereby obtaining a set of linear equations; and

performing cepstral recursion on the set of linear equations, thereby obtaining the PLP coefficients and the gain.

8. The method according to claim 5, wherein the order of the PLP coefficients is approximately the 7th order for an 8 kHz sampling rate.