JPH08129400A

JPH08129400A - Voice coding system

Info

Publication number: JPH08129400A
Application number: JP6266508A
Authority: JP
Inventors: Yoshiaki Tanaka; 良紀田中
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1994-10-31
Filing date: 1994-10-31
Publication date: 1996-05-21

Abstract

PURPOSE: To provide a voice coding system that improves the quality of the reproduced voice. CONSTITUTION: In a coder 10, a linear prediction residual detecting means 11 performs linear prediction analysis on the input voice signal for every frame to obtain prediction coefficients, then inverse-filters the signal with these coefficients to obtain a prediction residual signal R. A pitch analysis means 12 obtains a pitch period N corresponding to the maximum-amplitude pitch of the voice signal. A pitch waveform extracting means 13 extracts residual pitch waveforms Pi from the signal R in accordance with the period N. A two-dimensional Fourier transformation means 14 performs a pitch-synchronous two-dimensional Fourier transformation of the waveforms Pi to obtain transformation coefficients. A quantizing means 15 quantizes the transformation coefficients to obtain quantized coefficients as sound source information. A two-dimensional inverse Fourier transformation section 41 of a decoder 40 performs a two-dimensional inverse Fourier transformation to obtain the sound source signal waveform.

Description

Detailed Description of the Invention

[0001]

[Industrial Field of Application] The present invention relates to a speech coding system that compresses speech signals with high efficiency. In recent years, there has been demand for highly efficient compression of speech signals in in-house corporate communication systems, digital mobile radio systems, speech storage systems, and the like.

[0002]

[Prior Art] An all-pole model obtained by linear prediction analysis is known to be a good model of the spectral envelope of a speech signal. In high-efficiency coding of speech signals, a scheme that transmits the linear prediction coefficients obtained by linear prediction analysis together with parameters describing the sound source is widely used.

[0003] FIG. 6 is a block diagram of a speech encoder that uses linear prediction analysis, and FIG. 7 is a block diagram of the corresponding speech decoder; both are described below. The speech encoder shown in FIG. 6 comprises a linear prediction analysis unit 1, a prediction coefficient quantization unit 2, an inverse filter 3, a residual signal quantization unit 4, and a multiplexing unit 5. The speech decoder shown in FIG. 7 comprises a separation unit 7 and a synthesis filter 8.

[0004] In the encoder shown in FIG. 6, the linear prediction analysis unit 1 performs linear prediction analysis on each analysis frame of the input speech signal to obtain prediction coefficients, and the prediction coefficient quantization unit 2 quantizes the obtained coefficients.

[0005] The input speech signal is passed through the inverse filter 3, whose filter coefficients are the quantized prediction coefficients, to obtain a prediction residual signal. The residual signal quantization unit 4 quantizes this residual signal, and the multiplexing unit 5 multiplexes the quantized prediction residual signal and sends it to the transmission path.

[0006] In the decoder shown in FIG. 7, the separation unit 7 separates the quantized prediction residual signal transmitted from the encoder into the prediction residual signal and the prediction coefficients. The residual signal is then passed through the synthesis filter 8, whose filter coefficients are the prediction coefficients, to reproduce the coded speech signal.
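The inverse filter and synthesis filter described for FIGS. 6 and 7 can be sketched as a matched pair. This is a minimal illustration only: the sign convention, the prediction order, the coefficient values, and the function names are assumptions, and no quantization is applied, so the round trip is exact.

```python
def inverse_filter(signal, coeffs):
    """Prediction residual e[n] = s[n] - sum_k a[k] * s[n-1-k] (assumed convention)."""
    residual = []
    for n, s in enumerate(signal):
        pred = sum(a * signal[n - 1 - k]
                   for k, a in enumerate(coeffs) if n - 1 - k >= 0)
        residual.append(s - pred)
    return residual


def synthesis_filter(residual, coeffs):
    """All-pole reconstruction s[n] = e[n] + sum_k a[k] * s[n-1-k]."""
    out = []
    for n, e in enumerate(residual):
        pred = sum(a * out[n - 1 - k]
                   for k, a in enumerate(coeffs) if n - 1 - k >= 0)
        out.append(e + pred)
    return out


speech = [0.0, 1.0, 0.5, -0.25, -0.6, 0.1, 0.4]   # toy input samples
a = [0.9, -0.2]                                    # hypothetical prediction coefficients
residual = inverse_filter(speech, a)
rebuilt = synthesis_filter(residual, a)
```

In a real coder the residual would be quantized before transmission, so the decoder output would only approximate the input.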

[0007] Several schemes are known for transmitting the prediction residual signal efficiently. These include code-excited linear prediction coding (CELP), in which a fixed-length segment of the prediction residual signal is vector-quantized and its index is transmitted, and multi-pulse excited coding (MPC), in which a fixed-length segment of the prediction residual signal is modeled by a finite number of pulses and the optimum pulse positions and pulse amplitudes are transmitted.

[0008] Prototype waveform interpolation (PWI) is a scheme that achieves high quality at low bit rates of about 4 kbps or less. In this scheme, one representative pitch waveform is extracted from the residual signal in a speech signal frame, quantized, and transmitted. The rest of the residual signal in the frame is obtained by interpolating between the representative residual pitch waveform of the previous frame and that of the current frame.

[0009]

[Problems to be Solved by the Invention] In CELP and MPC described above, the entire residual signal of a frame is coded. Consequently, when the frame is lengthened or the number of quantization bits is reduced in order to lower the bit rate, speech quality deteriorates.

[0010] PWI, on the other hand, codes and transmits only one pitch waveform per frame, so a low bit rate is possible. However, the decoder obtains the remaining, untransmitted residual signal by interpolating representative residual pitch waveforms. If the successive residual pitch waveforms within a frame contain components that vary at or above the frame frequency, the interpolation causes aliasing distortion. The error relative to the actual residual signal then becomes large and speech quality deteriorates.

[0011] The present invention has been made in view of these points, and its object is to provide a speech coding system capable of improving speech quality.

[0012]

[Means for Solving the Problems] FIG. 1 shows the principle of the speech coding system of the present invention. In the figure, reference numeral 10 denotes an encoder. In the encoder 10, reference numeral 11 denotes linear prediction residual calculation means, which obtains prediction coefficients by performing linear prediction analysis on the speech signal frame by frame and obtains a prediction residual signal R by inverse-filtering the speech signal with those coefficients.

[0013] Reference numeral 12 denotes pitch analysis means, which obtains the pitch period N of the speech signal. Reference numeral 13 denotes pitch waveform extraction means, which extracts residual pitch waveforms Pi from the prediction residual signal R according to the pitch period N.

[0014] Reference numeral 14 denotes two-dimensional Fourier transform means, which obtains transform coefficients by performing a pitch-synchronous two-dimensional Fourier transform on the residual pitch waveforms Pi. Reference numeral 15 denotes quantization means, which quantizes the transform coefficients to obtain quantized transform coefficients as sound source information.

[0015] Reference numeral 40 denotes a decoder, which comprises two-dimensional inverse Fourier transform means 41 for obtaining the sound source signal waveform by performing a two-dimensional inverse Fourier transform.

[0016]

[Operation] In the present invention described above, the prediction residual signal R, written as a vector R, can be regarded as a sequence of individual residual pitch waveforms Pi and can therefore be expressed as follows.

[0017] $R = (P_0, P_1, \ldots, P_{M-1})$, where $P_i = [\,p(i,0),\ p(i,1),\ \ldots,\ p(i,N_i)\,]$ for $0 \le i \le M-1$.

[0018] Here, M is the number of residual pitch waveforms Pi contained in the frame. Each residual pitch waveform Pi can be extracted, for example, as shown in FIG. 2. In this case, the oldest residual pitch waveform in time, P₀, includes waveform samples from the previous frame.

[0019] Next, each residual pitch waveform Pi is cyclically shifted so that p(i,0) is its maximum value, which aligns the phases of the residual pitch waveforms. Denoting the shifted waveforms again by Pi, the prediction residual signal vector R can be expressed as the following N × M matrix.
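The extraction and phase-alignment steps can be sketched as follows. This is a simplified illustration, assuming a constant pitch period within the frame and ignoring the overlap of P₀ with the previous frame; the function names and sample values are hypothetical.

```python
def extract_pitch_waveforms(residual, period):
    """Cut the residual into consecutive segments of one pitch period each."""
    count = len(residual) // period
    return [residual[i * period:(i + 1) * period] for i in range(count)]


def align_to_peak(waveform):
    """Cyclically shift a waveform so that its largest sample sits at index 0."""
    peak = max(range(len(waveform)), key=lambda n: waveform[n])
    return waveform[peak:] + waveform[:peak]


# Toy residual with pitch period N = 4, giving M = 3 pitch waveforms.
residual = [0.1, 0.9, -0.3, 0.0, 0.2, 1.0, -0.4, 0.1, 0.0, 0.8, -0.2, 0.1]
waveforms = [align_to_peak(p) for p in extract_pitch_waveforms(residual, 4)]
```

After alignment, every waveform starts with its peak, so corresponding samples of successive waveforms line up when they are stacked into a matrix.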

[0020]

[Equation 1]
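The image for Equation 1 is not reproduced in this text. One layout consistent with the N × M description in paragraph [0019], assuming each aligned waveform Pi is brought to N samples and forms one column, would be the following; the exact arrangement in the original drawing may differ.

```latex
R =
\begin{pmatrix}
p(0,0)   & p(1,0)   & \cdots & p(M-1,0)   \\
p(0,1)   & p(1,1)   & \cdots & p(M-1,1)   \\
\vdots   & \vdots   & \ddots & \vdots     \\
p(0,N-1) & p(1,N-1) & \cdots & p(M-1,N-1)
\end{pmatrix}
```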

[0021] Next, this matrix is subjected to a two-dimensional Fourier transform; that is, Fourier transforms are applied in turn along the column direction and the row direction. The resulting complex two-dimensional Fourier transform coefficients c(i,j) can be expressed as the following N/2 × M matrix.

[0022]

[Equation 2]
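The image for Equation 2 is not reproduced in this text. Assuming a standard discrete Fourier transform (the source does not state the kernel or scaling), with i indexing frequency within a pitch waveform and j indexing frequency across successive pitch waveforms, the coefficients and their N/2 × M arrangement could be written as below. Only N/2 rows are needed because the column transform of a real-valued signal is conjugate-symmetric.

```latex
c(i,j) = \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} p(m,n)\,
         e^{-2\pi\sqrt{-1}\,in/N}\, e^{-2\pi\sqrt{-1}\,jm/M},
\qquad 0 \le i \le \tfrac{N}{2}-1,\quad 0 \le j \le M-1

C =
\begin{pmatrix}
c(0,0)             & c(0,1)             & \cdots & c(0,M-1)             \\
c(1,0)             & c(1,1)             & \cdots & c(1,M-1)             \\
\vdots             & \vdots             & \ddots & \vdots               \\
c(\tfrac{N}{2}-1,0) & c(\tfrac{N}{2}-1,1) & \cdots & c(\tfrac{N}{2}-1,M-1)
\end{pmatrix}
```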

[0023] In voiced segments there is high correlation between successive pitch waveforms. Consequently, as can be seen in the example of the two-dimensional Fourier transform coefficients c(i,j) indicated by reference numeral 16 in FIG. 3, the coefficients with small j have large values. FIG. 3 is a spectrum diagram showing the shape of c(i,j) and its evolution in three dimensions; box 17 in the figure shows the speech signal waveform from which c(i,j) was obtained.
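The concentration of energy at j = 0 for highly correlated pitch waveforms can be checked with a small pure-Python sketch. The DFT kernel, the toy waveform, and the function names are assumptions made for illustration; the source does not specify them.

```python
import cmath


def dft(seq):
    """Plain O(n^2) discrete Fourier transform of a sequence."""
    n = len(seq)
    return [sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(seq))
            for k in range(n)]


def dft2_pitch_sync(waveforms):
    """waveforms[m][n] is sample n of aligned pitch waveform m.

    Returns c[i][j], where i is the frequency index within a waveform
    (first N/2 bins only) and j is the frequency index across waveforms.
    """
    within = [dft(w) for w in waveforms]               # within[m][i]
    n_half = len(waveforms[0]) // 2
    count = len(waveforms)
    return [dft([within[m][i] for m in range(count)]) for i in range(n_half)]


base = [1.0, 0.4, -0.2, -0.5, -0.1, 0.3, 0.2, -0.1]
# Four nearly identical pitch waveforms whose amplitude grows slowly.
waveforms = [[(1.0 + 0.05 * m) * x for x in base] for m in range(4)]
c = dft2_pitch_sync(waveforms)
energy_by_j = [sum(abs(c[i][j]) ** 2 for i in range(len(c))) for j in range(4)]
```

For this toy input almost all of the energy falls in the j = 0 column, which is the property the quantizer exploits.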

[0024] With respect to j, the low-order components are those that change slowly from one pitch waveform to the next, and the high-order components are those that change rapidly. Therefore, by raising the quantization precision for the large-amplitude portion indicated by reference numeral 18, that is, the coefficients with small j, and lowering it for the small-amplitude portion indicated by reference numeral 19, that is, the coefficients with large j, the quality of the reproduced signal can be made higher than with uniform quantization using the same number of quantization bits. Coefficients whose j exceeds a certain value may also be omitted from transmission entirely.
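The idea of spending more bits on small j and dropping large j can be sketched with a simple uniform scalar quantizer. Note that this swaps in scalar quantization for illustration only; the bit allocation table, the cutoff, and the value range are hypothetical and do not come from the source.

```python
import math


def bits_for_column(j, cutoff=3):
    """Hypothetical allocation: more bits for small j, none at or above the cutoff."""
    return max(0, 6 - 2 * j) if j < cutoff else 0


def quantize(value, bits, max_abs=1.0):
    """Uniform mid-rise quantizer on [-max_abs, max_abs]; returns the decoded value."""
    if bits == 0:
        return 0.0                      # coefficient is not transmitted
    levels = 2 ** bits
    step = 2 * max_abs / levels
    index = int(math.floor((value + max_abs) / step))
    index = max(0, min(levels - 1, index))
    return -max_abs + (index + 0.5) * step


def quantize_coefficients(c):
    """Quantize real and imaginary parts of c[i][j] with a precision that depends on j."""
    return [[complex(quantize(z.real, bits_for_column(j)),
                     quantize(z.imag, bits_for_column(j)))
             for j, z in enumerate(row)]
            for row in c]


row = [0.3 + 0.1j] * 4
quantized = quantize_coefficients([row])
```

With this allocation the same input value is reproduced most accurately at j = 0, more coarsely at j = 2, and not at all at j = 3.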

[0025] The values of N and M change from frame to frame. The two are inversely related: as the pitch period N increases, M decreases, and as N decreases, M increases.
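As a rough illustration of this inverse relation, assume a fixed frame length of L samples (the symbol L and the value 160 are assumptions, not taken from the source). The number of pitch waveforms per frame is then approximately:

```latex
M \approx \left\lfloor \frac{L}{N} \right\rfloor, \qquad
L = 160:\quad N = 40 \Rightarrow M = 4, \qquad N = 80 \Rightarrow M = 2
```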

[0026] When the pitch period N is large, the individual pitch waveforms are long, but the variation between successive pitch waveforms is small. When the pitch period N is small, the individual pitch waveforms are short, but the variation between successive pitch waveforms is large.

[0027] In the system of the present invention, the transform coefficient matrix shown in Equation 2 above is matrix-quantized and transmitted. Matrix quantization is performed with one codebook prepared for each range of values of N. If all codebooks are given the same size, the allocation of quantization precision between the phase information of the individual pitch waveforms and the information on the variation between successive pitch waveforms is optimized automatically.
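Selecting a codebook by pitch period range and then searching it for the closest matrix might look like the sketch below. The ranges, the tiny two-entry codebook, and the squared-error distance are all hypothetical; the source does not describe the codebook contents or the search criterion.

```python
# Hypothetical pitch period ranges, one codebook per range.
CODEBOOK_RANGES = [(20, 39), (40, 79), (80, 147)]


def codebook_index(period):
    """Return which codebook covers the given pitch period N."""
    for idx, (low, high) in enumerate(CODEBOOK_RANGES):
        if low <= period <= high:
            return idx
    raise ValueError("pitch period outside all codebook ranges")


def nearest_codeword(matrix, codebook):
    """Return the index of the codebook matrix with the smallest squared error."""
    def distance(a, b):
        return sum(abs(x - y) ** 2
                   for row_a, row_b in zip(a, b)
                   for x, y in zip(row_a, row_b))
    return min(range(len(codebook)), key=lambda k: distance(matrix, codebook[k]))


toy_codebook = [
    [[1 + 0j, 0j], [0j, 0j]],   # energy in the j = 0 column
    [[0j, 1 + 0j], [0j, 0j]],   # energy in the j = 1 column
]
coefficients = [[0.9 + 0.1j, 0.1 + 0j], [0j, 0j]]
chosen = nearest_codeword(coefficients, toy_codebook)
```

Only the codeword index needs to be sent; the decoder looks up the same matrix in its copy of the codebook selected by N.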

[0028] In the decoder 40 shown in FIG. 1, the transmitted quantized coefficients (quantized two-dimensional Fourier transform coefficients) are subjected to a two-dimensional inverse Fourier transform by the two-dimensional inverse Fourier transform means 41, which restores the sound source signal waveform. Passing this sound source signal waveform through a prediction synthesis filter yields the reproduced speech signal.
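The forward and inverse two-dimensional transforms can be sketched as a round trip. For simplicity this illustration keeps all N frequency bins rather than N/2 and applies no quantization, so the waveforms are recovered exactly; the function names and the 1/n scaling on the inverse are assumptions.

```python
import cmath


def dft(seq, inverse=False):
    """Discrete Fourier transform; the inverse applies the 1/n scaling."""
    n = len(seq)
    sign = 1 if inverse else -1
    out = [sum(x * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t, x in enumerate(seq))
           for k in range(n)]
    return [v / n for v in out] if inverse else out


def transform_2d(matrix, inverse=False):
    """Apply the 1-D transform along each row, then along each column.

    The result has the same shape as the input (one row per pitch waveform).
    """
    rows = [dft(r, inverse) for r in matrix]
    cols = [dft([rows[m][i] for m in range(len(rows))], inverse)
            for i in range(len(rows[0]))]
    # cols is indexed [column][row]; transpose back to the input layout.
    return [[cols[i][j] for i in range(len(cols))] for j in range(len(cols[0]))]


original = [[1.0, 0.4, -0.2, -0.5],
            [0.9, 0.5, -0.1, -0.6],
            [1.1, 0.3, -0.3, -0.4]]      # M = 3 waveforms of N = 4 samples
coeffs = transform_2d(original)           # encoder side
restored = transform_2d(coeffs, inverse=True)   # decoder side
```

With quantized coefficients the restored waveforms would only approximate the originals, with the error governed by the quantization precision.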

[0029] Thus, the system of the present invention exploits the uneven distribution of the pitch-synchronous two-dimensional Fourier transform coefficients of the residual signal and allocates quantization precision optimally according to that distribution, thereby improving coded speech quality.

[0030]

[Embodiment] An embodiment of the present invention is described below with reference to the drawings. FIG. 4 is a block diagram of a speech encoder according to the speech coding system of one embodiment of the present invention, and FIG. 5 is a block diagram of the speech decoder. In these figures, parts corresponding to those in the principle diagram of FIG. 1 are given the same reference numerals.

[0031] The speech encoder 10 shown in FIG. 4 comprises a voiced/unvoiced determination unit 21; a linear prediction residual calculation unit 11 consisting of a linear prediction analysis unit 22 and an inverse filter 23; a pitch analysis unit 12; a pitch waveform extraction unit 13; a cyclic shift unit 24; a two-dimensional Fourier transform unit 14 consisting of a first Fourier transform unit 25, an amplitude normalization unit 26, and a second Fourier transform unit 27; and a matrix quantization unit 15.

[0032] The speech decoder 40 shown in FIG. 5 comprises a two-dimensional inverse Fourier transform unit 41, a noise generation unit 42, a changeover switch 43, an amplifier 44, and a prediction synthesis filter 45.

[0033] In the encoder 10 shown in FIG. 4, for each frame of the input speech signal, the voiced/unvoiced determination unit 21 determines whether the frame is voiced or unvoiced. The linear prediction analysis unit 22 performs linear prediction analysis to obtain prediction coefficients, and the inverse filter 23 inverse-filters the input speech signal with those coefficients to obtain the prediction residual signal R. In addition, the pitch analysis unit 12 performs pitch analysis to obtain the pitch period signal N corresponding to the speech signal.

[0034] Next, the pitch waveform extraction unit 13 extracts the residual pitch waveform signals Pi from the prediction residual signal R according to the pitch period signal N. A residual pitch waveform signal Pi is output for each pitch period.

[0035] Next, the cyclic shift unit 24 moves the peak position of each residual pitch waveform signal Pi to the head of the waveform, which aligns the phases of the residual pitch waveform signals Pi. The two-dimensional Fourier transform unit 14 then performs a two-dimensional Fourier transform on the phase-aligned residual pitch waveform signals Pi to obtain the two-dimensional Fourier transform coefficients.

[0036] The two-dimensional Fourier transform proceeds as follows. First, the first Fourier transform unit 25 Fourier-transforms the vector representation of the residual pitch waveform signals Pi (C in Equation 2) in the column direction. The amplitude normalization unit 26 then normalizes the amplitude of each transform coefficient to 1, and the second Fourier transform unit 27 applies a Fourier transform in the row direction to the normalized coefficients. This processing improves the quantization efficiency of the transform coefficients.
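The three stages (first transform, amplitude normalization, second transform) can be sketched as below. This follows one literal reading of the text, in which each coefficient from the first transform is divided by its magnitude so that only its phase remains; that reading, the assumption that each column holds one pitch waveform, and all names are illustrative rather than confirmed by the source.

```python
import cmath


def dft(seq):
    """Plain O(n^2) discrete Fourier transform of a sequence."""
    n = len(seq)
    return [sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(seq))
            for k in range(n)]


def normalize_amplitude(z, eps=1e-12):
    """Scale a complex coefficient to unit magnitude (zero stays zero)."""
    magnitude = abs(z)
    return z / magnitude if magnitude > eps else 0j


def two_stage_transform(waveforms):
    """Return (unit, c): normalized first-stage coefficients and final c[i][j]."""
    first = [dft(w) for w in waveforms]                       # within each waveform
    unit = [[normalize_amplitude(z) for z in row] for row in first]
    n_half = len(waveforms[0]) // 2
    count = len(unit)
    c = [dft([unit[m][i] for m in range(count)]) for i in range(n_half)]  # across waveforms
    return unit, c


base = [1.0, 0.4, -0.2, -0.5, -0.1, 0.3, 0.2, -0.1]
identical = [list(base) for _ in range(4)]      # four identical pitch waveforms
unit, c = two_stage_transform(identical)
```

For identical waveforms every normalized coefficient repeats across the four waveforms, so each row of c has magnitude 4 at j = 0 and is essentially zero elsewhere.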

[0037] The transform coefficients obtained in this way are matrix-quantized by the matrix quantization unit 15 to obtain the quantized coefficient signal. Quantization uses the codebook corresponding to the pitch period value N. These codebooks are prepared in advance.

[0038] The quantized coefficient signal is transmitted to the decoder 40 shown in FIG. 5 together with the prediction coefficients, the pitch period signal N, the voiced/unvoiced determination result signal, and the frame power signal. In the decoder 40 shown in FIG. 5, the changeover switch 43 switches to the V side when the voiced/unvoiced determination result signal indicates voiced sound, and to the UV side when it indicates unvoiced sound.

[0039] That is, for voiced sound, the received quantized coefficient signal is subjected to a two-dimensional inverse Fourier transform in the two-dimensional inverse Fourier transform unit 41 to obtain the prediction residual signal, that is, the residual pitch waveform signal. This signal passes through the changeover switch 43, is amplified by the amplifier 44, and is then converted into a speech signal by passing through the prediction synthesis filter 45.

[0040] For unvoiced sound, on the other hand, white noise generated by the noise generation unit 42 is used as the sound source signal, and the reproduced speech signal is obtained by passing this sound source signal through the prediction synthesis filter 45.
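The voiced/unvoiced switching in the decoder can be sketched as follows. Several details are assumptions made for illustration: the gain stands in for the amplifier 44 and is applied on both paths, Gaussian noise stands in for the white noise source, and the filter convention and all names are hypothetical.

```python
import random


def synthesis_filter(excitation, coeffs):
    """All-pole filter s[n] = e[n] + sum_k a[k] * s[n-1-k] (assumed convention)."""
    out = []
    for n, e in enumerate(excitation):
        pred = sum(a * out[n - 1 - k]
                   for k, a in enumerate(coeffs) if n - 1 - k >= 0)
        out.append(e + pred)
    return out


def decode_frame(voiced, decoded_residual, gain, coeffs, length, rng=None):
    """Pick the excitation (switch 43), scale it, and run the synthesis filter."""
    rng = rng or random.Random(0)
    if voiced:
        excitation = list(decoded_residual)                        # V side
    else:
        excitation = [rng.gauss(0.0, 1.0) for _ in range(length)]  # UV side
    scaled = [gain * x for x in excitation]
    return synthesis_filter(scaled, coeffs)


voiced_out = decode_frame(True, [1.0, 0.0, 0.0, 0.0], 2.0, [0.5], 4)
unvoiced_out = decode_frame(False, [], 1.0, [0.5], 4, random.Random(1))
```

For the voiced example the output is the scaled impulse response of the one-tap filter; the unvoiced output depends on the random seed.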

[0041]

[Effects of the Invention] As described above, according to the present invention, quantization precision is allocated optimally according to the uneven distribution of the pitch-synchronous two-dimensional Fourier transform coefficients of the residual signal, which has the effect of improving coded speech quality.

[Brief description of drawings]

FIG. 1 is a diagram showing the principle of the present invention.

FIG. 2 is a residual pitch waveform diagram for explaining the principle of the present invention.

FIG. 3 is a two-dimensional Fourier spectrum diagram of a speech signal waveform for explaining the principle of the present invention.

FIG. 4 is a block diagram of a speech encoder according to the speech coding system of one embodiment of the present invention.

FIG. 5 is a block diagram of a speech decoder according to the speech coding system of one embodiment of the present invention.

FIG. 6 is a block diagram of a speech encoder according to a conventional speech coding system.

FIG. 7 is a block diagram of a speech decoder according to a conventional speech coding system.

[Explanation of symbols]

10 Speech encoder
11 Linear prediction residual calculation means
12 Pitch analysis means
13 Pitch waveform extraction means
14 Two-dimensional Fourier transform means
15 Quantization means
40 Speech decoder
41 Two-dimensional inverse Fourier transform means
R Prediction residual signal
Pi Residual pitch waveform
N Pitch period

Claims

[Claims]

1. A speech coding system characterized by having an encoder provided with: linear prediction residual detection means for obtaining prediction coefficients by performing linear prediction analysis on a speech signal for each frame and obtaining a prediction residual signal R by inverse-filtering the speech signal with the obtained prediction coefficients; pitch analysis means for obtaining the pitch period N of the speech signal; pitch waveform extraction means for extracting residual pitch waveforms Pi from the prediction residual signal R according to the pitch period N; two-dimensional Fourier transform means for obtaining transform coefficients by performing a pitch-synchronous two-dimensional Fourier transform on the residual pitch waveforms Pi; and quantization means for obtaining quantized coefficients by quantizing the transform coefficients.

2. The speech coding system according to claim 1, wherein, when quantizing the transform coefficients, the quantization means varies the quantization bit allocation according to the amplitude distribution of the transform coefficients.

3. The speech coding system according to claim 1, wherein the quantization means matrix-quantizes the transform coefficients and transmits them.

4. The speech coding system according to claim 3, wherein, when performing the matrix quantization, the quantization means performs weighted quantization according to the amplitude distribution of the transform coefficients.

5. The speech coding system according to claim 1, wherein the two-dimensional Fourier transform means normalizes the amplitude of the residual pitch waveforms Pi when performing the two-dimensional Fourier transform.

6. The speech coding system according to claim 1, further comprising a decoder provided with two-dimensional inverse Fourier transform means for receiving the transform coefficients and obtaining the sound source signal waveform by performing a two-dimensional inverse Fourier transform.