JP2006039559A

JP2006039559A - Device and method of audio coding using plp of transfer communication terminal

Info

Publication number: JP2006039559A
Application number: JP2005213527A
Authority: JP
Inventors: Chan-Woo Kim; チャン−ウキム
Original assignee: LG Electronics Inc
Current assignee: LG Electronics Inc
Priority date: 2004-07-23
Filing date: 2005-07-22
Publication date: 2006-02-09
Also published as: KR100619893B1; EP1619665A1; ATE480852T1; CN1737904A; KR20060008078A; DE602005023385D1; EP1619665B1

Abstract

PROBLEM TO BE SOLVED: To provide a speech coding method for a mobile communication terminal that achieves a higher compression ratio than coding with conventional LP coefficients, and ensures higher sound quality, by performing LPC (Linear Predictive Coding) with PLP (Perceptual Linear Prediction) coefficients.

SOLUTION: A speech coding device of the mobile communication terminal includes a PLP coefficient calculation unit 20 that processes an input signal to calculate PLP coefficients and a gain; a V/UV determination unit 21 that determines whether the input signal is a voiced signal or an unvoiced signal and, when the input signal is voiced, outputs a determination signal and the voiced signal; a pitch calculation unit 22 that calculates the pitch of the input signal output from the V/UV determination unit 21; and a parameter coding unit 23 that performs low bit rate coding using the PLP coefficients, gain and pitch based on the determination signal.

COPYRIGHT: (C)2006, JPO&NCIPI

Description

The present invention relates to coding in a mobile communication terminal, and more particularly to a speech coding apparatus and method using PLP (Perceptual Linear Prediction).

With the development of mobile communication technology, current mobile communication terminals provide not only voice communication but also data communication using numbers, characters and symbols, and multimedia communication including various video signals. After the system allocates a radio channel, each terminal user transmits and receives the necessary data using radio resources. However, the bandwidth of the radio channel is limited so that many users can use it simultaneously, and the bit rate available to each user is therefore limited.

Accordingly, coding techniques have been proposed as a way to transmit more data at the limited bit rate. A wide variety of conventional speech coding techniques exist, and each speech coding method has particular advantages at a particular bit rate.

For example, speech coding based on generic audio coding, PCM (Pulse Code Modulation) and ADPCM (Adaptive Delta Pulse Code Modulation) are effective at high bit rates of 16 Kbps or more, while CELP (Code Excited Linear Prediction) and its variants are effective at medium bit rates in the range of 2.4 Kbps to 16 Kbps. In particular, at medium bit rates, coding methods using LD-CELP (Low Delay Code Excited Linear Prediction), CS-ACELP (Conjugate Structure Algebraic Code Excited Linear Prediction), VSELP (Vector Sum Excited Linear Prediction) and MELP (Mixed Excitation Linear Prediction), as well as wideband speech coding, are used. LPC (Linear Predictive Coding), RELP (Residual Excited Linear Prediction), formant vocoders and cepstral vocoders offer greater advantages at low bit rates of 75 bps to 2.4 Kbps.

The present invention proposes an improvement to linear predictive coding (hereinafter, LPC), one of the coding methods used at low bit rates.

FIG. 3 is a configuration diagram of a conventional LPC encoder.

As shown in FIG. 3, the conventional LPC encoder comprises: a correlator 10 that calculates the autocorrelation values r_x[n] of an input signal x[n]; an LP coefficient calculation unit 11 that processes the autocorrelation values r_x[n] to calculate linear prediction coefficients (hereinafter, LP coefficients) a_n and a gain G; a V/UV determination unit 12 that determines whether the input signal x[n] is a voiced (V) signal or an unvoiced (UV) signal; a pitch calculation unit 13 that calculates the pitch P of the signal when the input signal x[n] is a voiced (V) signal; and a parameter coding unit 14 that, according to the V/UV indication bit output from the V/UV determination unit 12, codes the LP coefficients a_n, gain G and pitch P received from the LP coefficient calculation unit 11 and the pitch calculation unit 13 and outputs a bit stream.

The operation of the conventional LPC encoder configured in this way is described below.

First, the correlator 10 computes the autocorrelation of the input signal x[n], and the LP coefficient calculation unit 11 processes the autocorrelation values r_x[n] calculated by the correlator 10 to calculate the LP coefficients a_n and the gain G. Meanwhile, the V/UV determination unit 12 determines whether the input signal x[n] is a voiced (V) signal or an unvoiced (UV) signal and outputs the V/UV indication bit and the voiced (V) signal, and the pitch calculation unit 13 calculates the pitch P of the voiced (V) signal output from the V/UV determination unit 12.
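The correlator and LP coefficient calculation described above can be sketched as follows. The source does not say how unit 11 solves for the coefficients; this sketch assumes the Levinson-Durbin recursion, a common choice, and takes the gain as the square root of the residual prediction error. The function names are illustrative and do not come from the patent.

```python
import math


def autocorrelation(x, max_lag):
    """Return r[0..max_lag], where r[k] = sum over n of x[n] * x[n + k]."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LP normal equations for the predictor x_hat[n] = sum_k a[k] * x[n - k].

    Returns (a, err): a holds a_1..a_order and err is the residual prediction error.
    """
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for this stage
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


def lp_coefficients_and_gain(x, order):
    """Assumed behaviour of units 10 and 11: autocorrelate, then solve for a_n and G."""
    r = autocorrelation(x, order)
    a, err = levinson_durbin(r, order)
    return a, math.sqrt(err)
```

For an autocorrelation sequence of the form r[k] = 0.9^k, the recursion returns a first coefficient of 0.9 and a second coefficient of zero, which is a quick sanity check on the sign convention.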

Accordingly, when the V/UV indication bit indicates a voiced (V) signal, the parameter coding unit 14 parameter-codes (encodes at a low bit rate) the LP coefficients a_n, gain G and pitch P received from the LP coefficient calculation unit 11 and the pitch calculation unit 13, and outputs a bit stream. A control unit (not shown) then processes the bit stream and outputs it to a radio unit (not shown), and the radio unit converts the signal output from the control unit into a radio signal and transmits it.

In this way, the prior art performs LPC coding in order to transmit speech signals at a low bit rate in a mobile communication terminal. However, because conventional LPC coding uses ordinary LP coefficients, it does not take human auditory characteristics into account. Conventional LPC coding operating at a low bit rate therefore has the disadvantages of low compression efficiency (1200 bps to 2400 bps) and poor sound quality.

An object of the present invention is to provide a speech coding apparatus and method for a mobile communication terminal that improve compression efficiency and sound quality by performing LPC coding using PLP coefficients.

To achieve this object, an LPC encoder of a mobile communication terminal according to the present invention includes: a PLP coefficient calculation unit that processes an input signal to calculate PLP coefficients and a gain; a V/UV determination unit that determines whether the input signal is a voiced signal or an unvoiced signal and, when the input signal is a voiced signal, outputs a determination signal and the voiced signal; a pitch calculation unit that calculates the pitch of the input signal output from the V/UV determination unit; and a parameter coding unit that performs low bit rate coding using the PLP coefficients, gain and pitch based on the determination signal.

To achieve this object, a low bit rate speech coding method for a mobile communication terminal according to the present invention includes the steps of: processing an input signal to calculate PLP coefficients and a gain; determining whether the input signal is a voiced signal or an unvoiced signal and, when the input signal is a voiced signal, outputting a determination signal and the voiced signal; calculating the pitch of the input signal output from the V/UV determination unit; and performing low bit rate coding using the PLP coefficients, gain and pitch based on the determination signal.

Preferably, the voiced signal is a speech signal.

Preferably, the order of the PLP coefficients is about 7 at an 8 kHz sampling rate.

In order to achieve the above object, the present invention provides, for example, the following means.
(Item 1)
A speech coding apparatus in a mobile communication terminal, comprising:
a PLP coefficient calculation unit that processes an input signal to calculate PLP (Perceptual Linear Prediction) coefficients and a gain;
a V/UV determination unit that determines whether the input signal is a voiced signal or an unvoiced signal and, when the input signal is a voiced signal, outputs a determination signal and the voiced signal;
a pitch calculation unit that calculates the pitch of the input signal output from the V/UV determination unit; and
a parameter coding unit that performs low bit rate coding using the PLP coefficients, gain and pitch based on the determination signal.
(Item 2)
The speech coding apparatus according to item 1, wherein the voiced signal is a speech signal.
(Item 3)
The speech coding apparatus according to item 1, wherein the determination signal is a bit value indicating whether the input signal is a voiced signal or an unvoiced signal.
(Item 4)
The speech coding apparatus according to item 1, wherein the order of the PLP coefficients is about 7 at an 8 kHz sampling rate.
(Item 5)
A speech coding method for a mobile communication terminal, comprising the steps of:
processing an input signal to calculate PLP coefficients and a gain;
determining whether the input signal is a voiced signal or an unvoiced signal and, when the input signal is a voiced signal, outputting a determination signal and the voiced signal;
calculating the pitch of the input signal output from the V/UV determination unit; and
performing low bit rate coding using the PLP coefficients, gain and pitch based on the determination signal.
(Item 6)
The speech coding method for a mobile communication terminal according to item 5, wherein the voiced signal is a speech signal.
(Item 7)
The speech coding method for a mobile communication terminal according to item 5, wherein the step of calculating the PLP coefficients and gain comprises:
performing a fast Fourier transform on the input signal;
removing noise components per frequency by performing integration and resampling on the fast Fourier transformed speech signal;
applying equalizing and loudness correction to the speech signal from which the noise components have been removed, so that its sound components have a level suited to human hearing, and then matching it to an appropriate power;
performing an inverse discrete Fourier transform on the power-matched speech signal to obtain a set of linear equations; and
obtaining the PLP coefficients and gain by applying cepstral recursion to the set of linear equations.
(Item 8)
The speech coding method for a mobile communication terminal according to item 5, wherein the order of the PLP coefficients is about 7 at an 8 kHz sampling rate.

By performing LPC using PLP coefficients, the present invention improves the compression ratio and allows speech signals to be transmitted at a more efficient, lower bit rate.

In addition, by using PLP coefficients as parameters, the present invention achieves a higher compression ratio than is possible with existing LP coefficients, and higher signal quality can be expected.

The present invention can therefore be used for low bit rate speech coding and decoding, or in devices that perform speech synthesis in a small memory footprint using PLP parameters.

The present invention can also be used for speech coding in applications that do not require high sound quality but need speech to be sufficiently intelligible. It is further effective for storing data at a high compression ratio in embedded systems with limited memory, and for voice calls over the Internet and similar networks that require a low bit rate.

Preferred embodiments of the present invention are described below with reference to the drawings.

To perform speech coding with a high compression ratio, the present invention provides low bit rate speech coding using PLP, which allows coding at a lower order than LPC.

First, the differences between PLP and LP are as follows.

Since LP is well known, the formula for deriving it is omitted here. LP basically determines the LP coefficients a_k so that the MSE (mean squared error), that is, the value of e[n] in the following Equation 1, is minimized.
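Equation 1 itself is not reproduced in this text. Assuming it is the standard linear-prediction error, with x[n] the input signal and p the predictor order (a symbol introduced here for illustration), it would take the following form:

```latex
e[n] = x[n] - \sum_{k=1}^{p} a_k \, x[n-k],
\qquad
\{a_k\} = \arg\min_{a_1,\dots,a_p} \sum_{n} e[n]^2
```

Under this reading, each a_k weights the sample k steps in the past, and the coefficients are chosen to minimize the summed squared prediction error over the analysis frame.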

The order of the LP coefficients a_k obtained in this way is about 8 to 12 at an 8 kHz sampling rate. These LP coefficients a_k are used in the various coding schemes based on LP (LPC, CELP, MELP, RELP). The details are given in W. B. Kleijn and K. K. Paliwal, Speech Coding and Synthesis, Amsterdam, the Netherlands: Elsevier, 1995.

PLP first appeared in a 1990 paper by Hermansky and, like the existing MFCC (Mel-Frequency Cepstral Coefficient), makes use of human auditory characteristics. Accordingly, when performing LPC for a low bit rate, the present invention performs low bit rate speech coding (LPC) using PLP coefficients instead of LP coefficients.

That is, the present invention obtains the spectrum using PLP coefficients. Because the PLP coefficients reflect human auditory characteristics, the spectrum obtained from them has a larger error than LP in the MSE sense, but a smaller error when auditory characteristics are taken into account. In addition, LP coefficients are generally transmitted at about order 10 at an 8 kHz sampling rate, whereas PLP coefficients are transmitted at about order 7, so the bit rate can be lowered.

FIG. 1 is a configuration diagram of an LPC encoder using PLP coefficients according to the present invention.

Referring to FIG. 1, the LPC encoder using PLP coefficients is the same as the conventional LPC encoder of FIG. 3, except that the correlator 10 is removed and the LP coefficient calculation unit 11 is replaced with a PLP coefficient calculation unit 20.

The PLP coefficient calculation unit 20 processes the speech signal S[n] to calculate the PLP coefficients a_P, which take auditory characteristics into account, and the gain G.

The operation of the LPC encoder using PLP coefficients according to the present invention, configured as described above, is described below with reference to the drawings.

First, the PLP coefficient calculation unit 20 receives the speech signal S[n] and performs the operations shown in FIG. 2 in sequence to calculate the PLP coefficients a_P and the gain G.

That is, the PLP coefficient calculation unit 20 first applies a fast Fourier transform (FFT) to the input signal, which is the speech signal S[n], and then performs integration and resampling on the transformed speech signal, thereby removing noise components from the speech signal S[n] per frequency.

Once the noise components have been removed, the PLP coefficient calculation unit 20 applies equalizing and loudness correction to the Fourier-transformed speech signal so that its sound components have a level suited to human hearing, and then matches the signal to a power suitable for human listening.

When the power matching is complete, the PLP coefficient calculation unit 20 performs an inverse discrete Fourier transform on the speech signal and derives a set of linear equations from it. The PLP coefficient calculation unit 20 then applies cepstral recursion to the set of linear equations to output the cepstral coefficients of the PLP model, that is, the PLP coefficients a_P. In other words, the PLP coefficient calculation unit 20 outputs the low-order PLP coefficients a_P, which reflect human auditory characteristics, and the gain G to the parameter coding unit 23 as parameter values.
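The sequence of steps performed by the PLP coefficient calculation unit 20 can be sketched roughly as below. This is a simplified illustration, not the patent's implementation: it assumes a Bark-scale warping and an equal-loudness weighting in the form usually attributed to Hermansky's PLP analysis, a triangular critical-band window one Bark wide on each side, cube-root loudness compression, and a Levinson-Durbin solve of the resulting linear equations. A naive DFT stands in for the FFT, and all function names, the band count and the default parameters are assumptions.

```python
import cmath
import math


def power_spectrum(frame):
    """Power spectrum for bins 0..N/2 via a naive DFT (an FFT is used in practice)."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) ** 2
            for k in range(n // 2 + 1)]


def hz_to_bark(f):
    return 6.0 * math.asinh(f / 600.0)


def bark_to_hz(z):
    return 600.0 * math.sinh(z / 6.0)


def equal_loudness(f):
    """Assumed equal-loudness weighting as a function of frequency in Hz."""
    w2 = (2.0 * math.pi * f) ** 2
    return ((w2 + 56.8e6) * w2 * w2) / ((w2 + 6.3e6) ** 2 * (w2 + 0.38e9))


def auditory_spectrum(spec, sample_rate, n_bands):
    """Integrate over critical bands, resample on the Bark axis, equalize, compress."""
    nyquist = sample_rate / 2.0
    top = hz_to_bark(nyquist)
    n_bins = len(spec)
    bands = []
    for b in range(n_bands):
        centre = top * b / (n_bands - 1)
        energy = 0.0
        for k in range(n_bins):
            z = hz_to_bark(nyquist * k / (n_bins - 1))
            energy += max(0.0, 1.0 - abs(z - centre)) * spec[k]  # triangular window
        bands.append((equal_loudness(bark_to_hz(centre)) * energy) ** (1.0 / 3.0))
    bands[0], bands[-1] = bands[1], bands[-2]  # copy neighbours into the end bands
    return bands


def autocorrelation_from_spectrum(bands, max_lag):
    """Inverse DFT of the even-symmetric auditory spectrum (cosine transform)."""
    m = len(bands) - 1
    r = []
    for k in range(max_lag + 1):
        acc = bands[0] + bands[m] * math.cos(math.pi * k)
        acc += 2.0 * sum(bands[i] * math.cos(math.pi * i * k / m) for i in range(1, m))
        r.append(acc / (2.0 * m))
    return r


def levinson_durbin(r, order):
    """Solve the set of linear equations for the all-pole model coefficients."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a, err = new_a, err * (1.0 - k * k)
    return a[1:], err


def cepstral_recursion(a):
    """Cepstral coefficients of the model G / (1 - sum_k a_k z^-k)."""
    c = []
    for n in range(1, len(a) + 1):
        acc = a[n - 1]
        for k in range(1, n):
            acc += (k / n) * c[k - 1] * a[n - k - 1]
        c.append(acc)
    return c


def plp_analysis(frame, sample_rate=8000, order=7, n_bands=15):
    """Return (all-pole coefficients, gain, cepstral coefficients) for one frame."""
    bands = auditory_spectrum(power_spectrum(frame), sample_rate, n_bands)
    a, err = levinson_durbin(autocorrelation_from_spectrum(bands, order), order)
    return a, math.sqrt(err), cepstral_recursion(a)
```

With the default order of 7, the function returns seven coefficients per frame, matching the order the text gives for an 8 kHz sampling rate.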

Meanwhile, the V/UV determination unit 21 outputs the V/UV indication bit and passes the speech signal S[n] to the pitch calculation unit 22, and the pitch calculation unit 22 calculates the pitch P of the speech signal S[n].
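The source does not specify how the V/UV determination unit 21 or the pitch calculation unit 22 work internally. As one possible illustration, the sketch below assumes a decision based on frame energy and zero-crossing rate, and a pitch estimate taken from the peak of the autocorrelation over a plausible lag range. The thresholds, lag limits and function names are all assumptions.

```python
import math


def is_voiced(frame, energy_threshold=0.01, zcr_threshold=0.25):
    """Assumed V/UV rule: voiced frames have high energy and few zero crossings."""
    n = len(frame)
    energy = sum(s * s for s in frame) / n
    crossings = sum(1 for i in range(1, n) if frame[i - 1] * frame[i] < 0)
    return energy > energy_threshold and crossings / n < zcr_threshold


def pitch_lag(frame, min_lag=20, max_lag=147):
    """Return the lag (in samples) that maximizes the autocorrelation."""
    n = len(frame)
    best_lag, best_r = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        r = sum(frame[i] * frame[i + lag] for i in range(n - lag))
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag


def analyse_frame(frame, sample_rate=8000):
    """Return (V/UV indication bit, pitch in Hz or None for unvoiced frames)."""
    if not is_voiced(frame):
        return 0, None
    return 1, sample_rate / pitch_lag(frame)
```

For example, a 320-sample frame of a 200 Hz sine sampled at 8 kHz is classified as voiced, and its autocorrelation peaks at a lag of 40 samples, which corresponds to 200 Hz.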

Accordingly, the parameter coding unit 23 codes (encodes at a low bit rate) the V/UV indication bit value together with the PLP coefficients a_P, gain G and pitch P received from the PLP coefficient calculation unit 20 and the pitch calculation unit 22, and outputs a bit stream. Preferably, the order of the transmitted PLP coefficients a_P is about 7 at an 8 kHz sampling rate. A control unit (not shown) then processes the bit stream and outputs it to a radio unit (not shown), and the radio unit converts the signal output from the control unit into a radio signal and transmits it.
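To make the effect of the lower coefficient order concrete, the sketch below packs one frame of parameters into a bit string. The bit allocation (1 bit for V/UV, 7 bits for pitch lag, 5 bits for gain, 5 bits per coefficient), the value ranges, the uniform quantizer and the rate of 50 frames per second are all hypothetical; the source gives no bit allocation or frame length. Practical coders also usually quantize a transformed representation of the coefficients rather than the raw values.

```python
def quantize(value, lo, hi, bits):
    """Uniformly quantize value, clamped to [lo, hi], to an integer code."""
    levels = (1 << bits) - 1
    clamped = min(max(value, lo), hi)
    return round((clamped - lo) / (hi - lo) * levels)


def pack_frame(voiced, lag, gain, coeffs, coeff_bits=5):
    """Pack V/UV bit, pitch lag, gain and coefficients into a string of '0'/'1'."""
    fields = [
        (1 if voiced else 0, 1),                          # V/UV indication bit
        (quantize(lag, 20, 147, 7) if voiced else 0, 7),  # pitch lag, assumed 20..147
        (quantize(gain, 0.0, 1.0, 5), 5),                 # gain, assumed in 0..1
    ]
    # Coefficients, assumed to lie in -2..2
    fields += [(quantize(c, -2.0, 2.0, coeff_bits), coeff_bits) for c in coeffs]
    return "".join(format(code, "0{}b".format(width)) for code, width in fields)


def bit_rate(bits_per_frame, frames_per_second=50):
    """Bit rate in bits per second for an assumed frame rate."""
    return bits_per_frame * frames_per_second
```

Under these assumed allocations, a frame with 7 coefficients takes 48 bits and a frame with 10 coefficients takes 63 bits, so at 50 frames per second the order-7 frame needs 2400 bps against 3150 bps for order 10.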

Although the present invention has been illustrated above using its preferred embodiment, the invention should not be construed as limited to this embodiment. It is understood that the scope of the present invention should be construed only by the claims. It is understood that those skilled in the art can, from the description of the specific preferred embodiment, implement an equivalent scope based on the description of the present invention and common general technical knowledge.

To provide a speech coding method for a mobile communication terminal that, by performing LPC (Linear Predictive Coding) using PLP (Perceptual Linear Prediction) coefficients, achieves a higher compression ratio than is possible with existing LP coefficients and ensures high sound quality.

A speech coding apparatus of a mobile communication terminal includes: a PLP coefficient calculation unit 20 that processes an input signal to calculate PLP coefficients and a gain; a V/UV determination unit 21 that determines whether the input signal is a voiced signal or an unvoiced signal and, when the input signal is a voiced signal, outputs a determination signal and the voiced signal; a pitch calculation unit 22 that calculates the pitch of the input signal output from the V/UV determination unit 21; and a parameter coding unit 23 that performs low bit rate coding using the PLP coefficients, gain and pitch based on the determination signal.

FIG. 1 is a diagram showing the structure of an LPC encoder using PLP coefficients according to the present invention.
FIG. 2 is a diagram explaining in detail the step of calculating the PLP coefficients in FIG. 1.
FIG. 3 is a diagram showing the structure of a conventional LPC encoder using LP coefficients.

Explanation of symbols

20: PLP coefficient calculation unit
21: V/UV determination unit
22: Pitch calculation unit
23: Parameter coding unit

Claims

A speech coding apparatus in a mobile communication terminal, comprising:
a PLP coefficient calculation unit that processes an input signal to calculate PLP (Perceptual Linear Prediction) coefficients and a gain;
a V/UV determination unit that determines whether the input signal is a voiced signal or an unvoiced signal and, when the input signal is a voiced signal, outputs a determination signal and the voiced signal;
a pitch calculation unit that calculates the pitch of the input signal output from the V/UV determination unit; and
a parameter coding unit that performs low bit rate coding using the PLP coefficients, gain and pitch based on the determination signal.

The speech coding apparatus according to claim 1, wherein the voiced signal is a speech signal.

The speech coding apparatus according to claim 1, wherein the determination signal is a bit value indicating whether the input signal is a voiced signal or an unvoiced signal.

The speech coding apparatus according to claim 1, wherein the order of the PLP coefficients is about 7 at an 8 kHz sampling rate.

A speech coding method for a mobile communication terminal, comprising the steps of:
processing an input signal to calculate PLP coefficients and a gain;
determining whether the input signal is a voiced signal or an unvoiced signal and, when the input signal is a voiced signal, outputting a determination signal and the voiced signal;
calculating the pitch of the input signal output from the V/UV determination unit; and
performing low bit rate coding using the PLP coefficients, gain and pitch based on the determination signal.

The speech coding method for a mobile communication terminal according to claim 5, wherein the voiced signal is a speech signal.

The speech coding method for a mobile communication terminal according to claim 5, wherein the step of calculating the PLP coefficients and gain comprises:
performing a fast Fourier transform on the input signal;
removing noise components per frequency by performing integration and resampling on the fast Fourier transformed speech signal;
applying equalizing and loudness correction to the speech signal from which the noise components have been removed, so that its sound components have a level suited to human hearing, and then matching it to an appropriate power;
performing an inverse discrete Fourier transform on the power-matched speech signal to obtain a set of linear equations; and
obtaining the PLP coefficients and gain by applying cepstral recursion to the set of linear equations.

The speech coding method for a mobile communication terminal according to claim 5, wherein the order of the PLP coefficients is about 7 at an 8 kHz sampling rate.