JPH08129400A - Voice coding system - Google Patents

Voice coding system

Info

Publication number
JPH08129400A
Authority
JP
Japan
Prior art keywords
pitch
signal
residual
coefficient
speech
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
JP6266508A
Other languages
Japanese (ja)
Inventor
Yoshiaki Tanaka
良紀 田中
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujitsu Ltd
Priority to JP6266508A
Publication of JPH08129400A
Current legal status: Withdrawn

Landscapes

  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

PURPOSE: To provide a speech coding system that improves the quality of the reproduced speech. CONSTITUTION: In an encoder 10, a linear prediction residual detecting means 11 performs linear prediction analysis on the speech signal for every frame to obtain prediction coefficients, and inverse-filters the speech signal with these coefficients to obtain a prediction residual signal R. A pitch analysis means 12 obtains the pitch period N of the speech signal. A pitch waveform extracting means 13 extracts residual pitch waveforms Pi from the signal R in accordance with the period N. A two-dimensional Fourier transform means 14 performs a pitch-synchronous two-dimensional Fourier transform of the waveforms Pi to obtain transform coefficients. A quantizing means 15 quantizes the transform coefficients to obtain quantized coefficients as excitation (sound source) information. A two-dimensional inverse Fourier transform means 41 of a decoder 40 performs a two-dimensional inverse Fourier transform to obtain the excitation signal waveform.

Description

[Detailed Description of the Invention]

[0001]

[Industrial Field of Application] The present invention relates to a speech coding system that compresses speech signals with high efficiency. In recent years, highly efficient compression of speech signals has been in demand for in-house corporate communication systems, digital mobile radio systems, speech storage systems, and the like.

[0002]

[Prior Art] An all-pole model obtained by linear prediction analysis is known to be a good model of the spectral envelope of a speech signal. For high-efficiency coding of speech signals, systems that transmit the linear prediction coefficients obtained by linear prediction analysis together with parameters describing the excitation (sound source) are therefore widely used.

[0003] FIG. 6 is a block diagram of a speech encoder that uses linear prediction analysis, and FIG. 7 is a block diagram of the corresponding speech decoder. The speech encoder shown in FIG. 6 comprises a linear prediction analysis unit 1, a prediction coefficient quantization unit 2, an inverse filter 3, a residual signal quantization unit 4, and a multiplexing unit 5. The speech decoder shown in FIG. 7 comprises a demultiplexing unit 7 and a synthesis filter 8.

[0004] In the encoder shown in FIG. 6, the linear prediction analysis unit 1 performs linear prediction analysis on each analysis frame of the input speech signal to obtain prediction coefficients, and the prediction coefficient quantization unit 2 quantizes them.

[0005] The input speech signal is passed through the inverse filter 3, whose filter coefficients are the quantized prediction coefficients, to obtain a prediction residual signal. The residual signal quantization unit 4 quantizes this prediction residual signal, and the multiplexing unit 5 multiplexes the quantized prediction residual signal and sends it to the transmission path.

[0006] In the decoder shown in FIG. 7, the demultiplexing unit 7 separates the signal received from the encoder into the prediction residual signal and the prediction coefficients. The residual signal is then passed through the synthesis filter 8, whose filter coefficients are the prediction coefficients, to reproduce the coded speech signal.
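The conventional analysis and synthesis chain of FIGS. 6 and 7 can be sketched as follows. This is only an illustrative sketch: the patent does not say how the prediction coefficients are computed, so the autocorrelation method with the Levinson-Durbin recursion, the function names, and the zero initial filter state are all assumptions, and quantization is omitted.

```python
def autocorr(frame, order):
    """Autocorrelation lags r[0..order] of one analysis frame."""
    return [sum(frame[n] * frame[n - k] for n in range(k, len(frame)))
            for k in range(order + 1)]


def levinson_durbin(r, order):
    """Predictor coefficients a[1..order] for x_hat[n] = sum_k a[k] * x[n-k]."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate frame: stop refining the predictor
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]


def inverse_filter(x, a):
    """Prediction residual e[n] = x[n] - sum_k a[k] * x[n-k] (zero initial state)."""
    out = []
    for n in range(len(x)):
        pred = sum(a[k] * x[n - 1 - k] for k in range(len(a)) if n - 1 - k >= 0)
        out.append(x[n] - pred)
    return out


def synthesis_filter(e, a):
    """All-pole synthesis y[n] = e[n] + sum_k a[k] * y[n-k] (zero initial state)."""
    y = []
    for n in range(len(e)):
        pred = sum(a[k] * y[n - 1 - k] for k in range(len(a)) if n - 1 - k >= 0)
        y.append(e[n] + pred)
    return y
```

Without quantization, the synthesis filter exactly undoes the inverse filter, so any coding loss in such a system comes from how the residual and the coefficients are quantized.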

[0007] Several methods are known for transmitting the prediction residual signal efficiently. They include code-excited linear prediction (CELP) coding, which vector-quantizes a fixed-length segment of the prediction residual signal and transmits the codebook index, and multipulse excitation coding (MPC), which models a fixed-length segment of the prediction residual signal as a finite number of pulses and transmits the optimum pulse positions and amplitudes.

[0008] Prototype Waveform Interpolation (PWI) is a method that achieves high quality at low bit rates of about 4 kbps or less. In this method, a representative one-pitch waveform is extracted from the residual signal of each speech frame, quantized, and transmitted. The remaining residual signal in the frame is obtained by interpolating between the representative residual pitch waveform of the previous frame and that of the current frame.
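As a rough illustration of the interpolation step in PWI, the sketch below cross-fades linearly between the previous and the current representative waveform. The linear weighting, the equal-length requirement, and the function name are assumptions made for this example; the text does not specify the interpolation rule.

```python
def interpolate_prototypes(prev_proto, curr_proto, num_cycles):
    """Cross-fade from prev_proto to curr_proto over num_cycles pitch cycles.

    Cycle m (1-based) uses weight w = m / num_cycles on the current prototype,
    so the last generated cycle equals curr_proto.
    """
    if len(prev_proto) != len(curr_proto):
        raise ValueError("this sketch assumes equal-length prototypes")
    cycles = []
    for m in range(1, num_cycles + 1):
        w = m / num_cycles
        cycles.append([(1.0 - w) * p + w * c
                       for p, c in zip(prev_proto, curr_proto)])
    return cycles
```

Because only one waveform per frame is known, any change between pitch cycles that is faster than the frame rate cannot be represented by such a cross-fade.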

[0009]

[Problems to Be Solved by the Invention] In CELP and MPC described above, the entire residual signal of a frame is coded. Lengthening the frame or reducing the number of quantization bits in order to lower the bit rate therefore degrades speech quality.

[0010] In PWI, on the other hand, only one pitch waveform per frame is coded and transmitted, so a low bit rate is possible. On the decoding side, however, the rest of the residual signal, which was not transmitted, is obtained by interpolating the representative residual pitch waveforms. If the successive residual pitch waveforms within a frame contain variation components at or above the frame frequency, the interpolation produces aliasing distortion. The error relative to the actual residual signal then becomes large, and speech quality is degraded.

[0011] The present invention has been made in view of these points, and its object is to provide a speech coding system capable of improving speech quality.

[0012]

[Means for Solving the Problems] FIG. 1 shows the principle of the speech coding system of the present invention. In the figure, 10 is an encoder. In the encoder 10, 11 is a linear prediction residual calculating means, which obtains prediction coefficients by performing linear prediction analysis on the speech signal for each frame and obtains a prediction residual signal R by inverse filtering the speech signal with the obtained prediction coefficients.

[0013] 12 is a pitch analysis means, which obtains the pitch period N of the speech signal. 13 is a pitch waveform extracting means, which extracts residual pitch waveforms Pi from the prediction residual signal R in accordance with the pitch period N.

[0014] 14 is a two-dimensional Fourier transform means, which performs a pitch-synchronous two-dimensional Fourier transform on the residual pitch waveforms Pi to obtain transform coefficients. 15 is a quantizing means, which quantizes the transform coefficients to obtain quantized transform coefficients as excitation (sound source) information.

[0015] 40 is a decoder, which comprises a two-dimensional inverse Fourier transform means 41 that obtains the excitation signal waveform by performing a two-dimensional inverse Fourier transform.

[0016]

[Operation] In the present invention described above, the vector of the prediction residual signal R (denoted R) can be regarded as a succession of individual residual pitch waveforms Pi, and can therefore be expressed as follows.

[0017]

$$R = (P_0, P_1, \ldots, P_{M-1})$$

where

$$P_i = [\,p(i,0),\ p(i,1),\ \ldots,\ p(i,N_i)\,] \quad (0 \le i \le M-1).$$

[0018] Here, M is the number of residual pitch waveforms Pi contained in the frame. Each residual pitch waveform Pi can be extracted, for example, as shown in FIG. 2. In this case, the oldest residual pitch waveform in time, P0, includes part of the waveform of the previous frame.
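One way to picture this extraction is the sketch below, which cuts the residual into consecutive segments of one pitch period and carries unused samples over to the next frame, so that the first segment of a frame may start in the previous frame. FIG. 2 is not reproduced here, so the segmentation rule, the fixed period within a frame, and the names `extract_pitch_waveforms` and `carry` are assumptions for illustration.

```python
def extract_pitch_waveforms(residual, pitch_period, carry=()):
    """Split carry + residual into consecutive pitch-period segments.

    Returns (waveforms, leftover): waveforms is a list of M segments of
    length pitch_period; leftover holds the trailing samples that do not
    fill a whole period and would be carried into the next frame.
    """
    buf = list(carry) + list(residual)
    count = len(buf) // pitch_period
    waveforms = [buf[i * pitch_period:(i + 1) * pitch_period]
                 for i in range(count)]
    leftover = buf[count * pitch_period:]
    return waveforms, leftover
```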

[0019] Next, each residual pitch waveform Pi is cyclically shifted so that p(i,0) is its maximum value, which aligns the phases of the residual pitch waveforms. Denoting the shifted waveforms again by Pi, the prediction residual signal vector R can be expressed as the following N×M matrix.
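The cyclic shift can be sketched in a few lines. The function name is hypothetical, and taking the first occurrence of the maximum when there are ties is an assumption of this sketch.

```python
def align_to_peak(waveform):
    """Rotate one pitch waveform so that its maximum sample comes first."""
    peak = max(range(len(waveform)), key=lambda n: waveform[n])
    return waveform[peak:] + waveform[:peak]
```

For example, `align_to_peak([1, 5, 2, 0])` gives `[5, 2, 0, 1]`: the samples are only rotated, so the waveform content is unchanged.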

[0020]

[Equation 1]
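The image for Equation 1 is not reproduced in this text. A reconstruction consistent with the surrounding definitions, assuming that every phase-aligned waveform has the common length N and forms one column of the matrix, would be:

```latex
R =
\begin{pmatrix}
p(0,0)   & p(1,0)   & \cdots & p(M-1,0)   \\
p(0,1)   & p(1,1)   & \cdots & p(M-1,1)   \\
\vdots   & \vdots   & \ddots & \vdots     \\
p(0,N-1) & p(1,N-1) & \cdots & p(M-1,N-1)
\end{pmatrix}
```

Here the row index is the sample position within a pitch waveform and the column index i is the position of the waveform within the frame.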

[0021] Next, this matrix is subjected to a two-dimensional Fourier transform; that is, Fourier transforms are performed in turn along the column direction and the row direction. The resulting complex two-dimensional Fourier transform coefficients c(i,j) can be expressed as the following N/2×M matrix.
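A minimal sketch of this column-then-row transform is given below, using a direct DFT for clarity rather than an FFT. The matrix layout (`matrix[n][i]` is sample n of waveform i), the choice of keeping the first N/2 frequency bins, and the function names are assumptions of this sketch.

```python
import cmath


def dft(x):
    """Direct discrete Fourier transform of a sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def dft2_pitch_sync(matrix):
    """Two-dimensional DFT of an N x M matrix of aligned pitch waveforms.

    Returns c[k][j] with k in 0..N//2-1 (frequency within a pitch cycle)
    and j in 0..M-1 (rate of change from one pitch cycle to the next).
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    # Column direction: transform each pitch waveform over its samples.
    cols = [dft([matrix[n][i] for n in range(n_rows)]) for i in range(n_cols)]
    # Row direction: for each kept bin, transform across the pitch waveforms.
    return [dft([cols[i][k] for i in range(n_cols)])
            for k in range(n_rows // 2)]
```

When all pitch waveforms in the frame are identical, every coefficient with j other than 0 vanishes, which is the extreme case of the energy concentration exploited by the quantizer.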

[0022]

[Equation 2]
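The image for Equation 2 is likewise missing. Assuming that the first index i runs over the N/2 frequency bins within a pitch waveform and the second index j over the M components across pitch waveforms, a reconstruction consistent with the text would be:

```latex
C =
\begin{pmatrix}
c(0,0)     & c(0,1)     & \cdots & c(0,M-1)     \\
c(1,0)     & c(1,1)     & \cdots & c(1,M-1)     \\
\vdots     & \vdots     & \ddots & \vdots       \\
c(N/2-1,0) & c(N/2-1,1) & \cdots & c(N/2-1,M-1)
\end{pmatrix}
```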

[0023] In voiced segments, successive pitch waveforms are highly correlated. Consequently, as seen in the example of two-dimensional Fourier transform coefficients c(i,j) indicated by reference numeral 16 in FIG. 3, the coefficients with small values of j have large magnitudes. FIG. 3 is a spectrum diagram showing the shape of c(i,j) and its transitions in three dimensions; box 17 in the figure shows the speech signal waveform from which these c(i,j) were obtained.

[0024] The low-order components in j are components that change slowly from one pitch waveform to the next, and the high-order components are components that change quickly. Therefore, by raising the quantization precision for the large-amplitude part indicated by reference numeral 18, that is, the coefficients with small j, and lowering it for the small-amplitude part indicated by reference numeral 19, that is, the coefficients with large j, the quality of the reproduced signal can be made higher than with uniform quantization using the same total number of quantization bits. Coefficients whose j exceeds a certain value may also be omitted from transmission altogether.
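The idea of spending more bits on small j can be illustrated with a simple uniform scalar quantizer whose resolution depends on j. This is not the matrix quantizer of the invention; the bit table, the fixed range `max_abs`, and the use of real values (for example, applied separately to real and imaginary parts) are assumptions made to keep the sketch short.

```python
def quantize_by_evolution_index(coeffs, bits_per_j, max_abs):
    """Quantize coeffs[i][j] with bits_per_j[j] bits over [-max_abs, max_abs].

    A bit count of 0 means the coefficient is not transmitted and is
    reconstructed as 0.0. Returns the reconstructed values.
    """
    out = []
    for row in coeffs:
        q_row = []
        for j, c in enumerate(row):
            bits = bits_per_j[j]
            if bits == 0:
                q_row.append(0.0)
                continue
            levels = 2 ** bits
            step = 2.0 * max_abs / levels
            idx = min(levels - 1, max(0, int((c + max_abs) / step)))
            q_row.append(-max_abs + (idx + 0.5) * step)  # cell midpoint
        out.append(q_row)
    return out
```

With a table such as `[8, 2, 0]`, the j = 0 column is reproduced finely, the j = 1 column coarsely, and the j = 2 column is dropped.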

[0025] The values of N and M change from frame to frame. The two are inversely related: as the pitch period N increases, M decreases, and as N decreases, M increases.
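The inverse relation follows from the fixed frame length: roughly M whole pitch periods of N samples fit in one frame. The frame length of 160 samples in the usage note is only an assumed figure (20 ms at 8 kHz sampling); the text does not state one.

```python
def waveforms_per_frame(frame_len, pitch_period):
    """Number of whole pitch periods that fit in a frame of frame_len samples."""
    return frame_len // pitch_period
```

With an assumed frame of 160 samples, N = 40 gives M = 4 and N = 80 gives M = 2.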

[0026] When the pitch period N is large, the individual pitch waveforms are long, but the change between successive pitch waveforms is small. When the pitch period N is small, the individual pitch waveforms are short, but the change between successive pitch waveforms is large.

[0027] In the system of the present invention, the transform coefficient matrix shown in Equation 2 above is matrix-quantized and transmitted. One codebook is prepared for each range of values of N, and matrix quantization is performed with the corresponding codebook. If all codebooks are given the same size, the allocation of quantization precision between the phase information of the individual pitch waveforms and the information on the change between successive pitch waveforms is optimized automatically.
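Codebook selection by pitch range followed by a nearest-entry search might look like the sketch below. The pitch ranges, the squared-error distance, and the requirement that every entry has the same shape as the coefficient matrix are assumptions; the text does not describe how the codebooks are designed or searched.

```python
def select_codebook(codebooks, pitch_period):
    """Pick the codebook whose pitch range [n_min, n_max] contains pitch_period.

    codebooks is a list of (n_min, n_max, entries) tuples.
    """
    for n_min, n_max, entries in codebooks:
        if n_min <= pitch_period <= n_max:
            return entries
    raise ValueError("no codebook covers this pitch period")


def matrix_quantize(coeffs, entries):
    """Return the index of the codebook matrix closest to coeffs.

    Distance is the sum of squared magnitudes of the element-wise
    differences, so real or complex coefficients both work.
    """
    def dist(a, b):
        return sum(abs(x - y) ** 2
                   for row_a, row_b in zip(a, b)
                   for x, y in zip(row_a, row_b))

    return min(range(len(entries)), key=lambda k: dist(coeffs, entries[k]))
```

Only the index is transmitted. Since the pitch period N is sent as well, the decoder can select the same codebook and look up the same matrix.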

[0028] In the decoder 40 shown in FIG. 1, the transmitted quantized coefficients (quantized two-dimensional Fourier transform coefficients) are subjected to a two-dimensional inverse Fourier transform by the two-dimensional inverse Fourier transform means 41, which restores the excitation signal waveform. Passing this excitation signal waveform through a prediction synthesis filter yields the reproduced speech signal.
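The decoder-side inverse transform can be sketched as below. For simplicity the sketch takes a full N×M complex coefficient matrix; how the decoder restores the bins omitted from an N/2×M representation (for example, by conjugate symmetry) and how the cyclic shift is undone are not covered, and the function names are hypothetical.

```python
import cmath


def idft(spectrum):
    """Direct inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]


def idft2(coeffs):
    """Inverse 2-D DFT: coeffs[k][j] -> real matrix[n][i] of pitch waveforms."""
    n_rows, n_cols = len(coeffs), len(coeffs[0])
    # Undo the row-direction transform (across pitch waveforms).
    rows = [idft(coeffs[k]) for k in range(n_rows)]
    # Undo the column-direction transform (within each pitch waveform).
    cols = [idft([rows[k][i] for k in range(n_rows)]) for i in range(n_cols)]
    return [[cols[i][n].real for i in range(n_cols)] for n in range(n_rows)]


def to_excitation(matrix):
    """Concatenate the columns (pitch waveforms) in time order."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    return [matrix[n][i] for i in range(n_cols) for n in range(n_rows)]
```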

[0029] The system of the present invention thus exploits the uneven distribution of the pitch-synchronous two-dimensional Fourier transform coefficients of the residual signal and allocates quantization precision optimally according to it, thereby improving coded speech quality.

[0030]

[Embodiment] An embodiment of the present invention is described below with reference to the drawings. FIG. 4 is a block diagram of a speech encoder based on the speech coding system of one embodiment of the present invention, and FIG. 5 is a block diagram of the speech decoder. In these figures, parts corresponding to those in the principle diagram of FIG. 1 are given the same reference numerals.

[0031] The speech encoder 10 shown in FIG. 4 comprises a voiced/unvoiced decision unit 21; a linear prediction residual calculation unit 11 consisting of a linear prediction analysis unit 22 and an inverse filter 23; a pitch analysis unit 12; a pitch waveform extraction unit 13; a cyclic shift unit 24; a two-dimensional Fourier transform unit 14 consisting of a first Fourier transform unit 25, an amplitude normalization unit 26, and a second Fourier transform unit 27; and a matrix quantization unit 15.

[0032] The speech decoder 40 shown in FIG. 5 comprises a two-dimensional inverse Fourier transform unit 41, a noise generation unit 42, a changeover switch 43, an amplifier 44, and a prediction synthesis filter 45.

[0033] In the encoder 10 shown in FIG. 4, the voiced/unvoiced decision unit 21 decides for each frame whether the input speech signal is voiced or unvoiced. The linear prediction analysis unit 22 performs linear prediction analysis to obtain prediction coefficients, and the inverse filter 23 inverse-filters the input speech signal with these prediction coefficients to obtain the prediction residual signal R. In addition, the pitch analysis unit 12 performs pitch analysis to obtain a pitch period signal N corresponding to the speech signal.

[0034] Next, the pitch waveform extraction unit 13 extracts residual pitch waveform signals Pi from the prediction residual signal R in accordance with the pitch period signal N. A residual pitch waveform signal Pi is output for each pitch period.

[0035] Next, the cyclic shift unit 24 moves the peak position of each residual pitch waveform signal Pi to the head of the waveform, which aligns the phases of the residual pitch waveform signals Pi. The two-dimensional Fourier transform unit 14 then applies a two-dimensional Fourier transform to the phase-aligned residual pitch waveform signals Pi to obtain two-dimensional Fourier transform coefficients.

[0036] In this two-dimensional Fourier transform, the first Fourier transform unit 25 first Fourier-transforms the vector representation of the residual pitch waveform signals Pi (C in Equation 2) in the column direction. The amplitude normalization unit 26 then normalizes the amplitude of each transform coefficient to 1, and the second Fourier transform unit 27 performs the Fourier transform in the row direction on the normalized transform coefficients. This processing improves the quantization efficiency of the transform coefficients.
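The three stages of units 25, 26, and 27 can be sketched as follows. Reading "normalizes the amplitude of each transform coefficient to 1" as dividing each complex coefficient by its magnitude, so that only phase remains, is an interpretation made for this sketch; the function names and the handling of zero-magnitude bins are also assumptions.

```python
import cmath


def dft(x):
    """Direct discrete Fourier transform of a sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def normalize_amplitude(column_spectra):
    """Scale every nonzero coefficient to unit magnitude, keeping its phase."""
    return [[c / abs(c) if abs(c) > 0.0 else 0j for c in spec]
            for spec in column_spectra]


def two_stage_transform(waveforms):
    """waveforms[i][n] is sample n of the phase-aligned pitch waveform i.

    Returns c[k][j]: bin k within a pitch cycle, component j across cycles.
    """
    stage1 = [dft(w) for w in waveforms]    # column direction (unit 25)
    unit = normalize_amplitude(stage1)      # amplitude normalization (unit 26)
    n_bins = len(unit[0])
    return [dft([unit[i][k] for i in range(len(unit))])  # row direction (unit 27)
            for k in range(n_bins)]
```

Under this reading, two pitch waveforms that differ only in gain give identical normalized spectra, so all of their energy lands in the j = 0 column.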

[0037] The transform coefficients obtained in this way are matrix-quantized by the matrix quantization unit 15 to obtain a quantized coefficient signal. Quantization uses the codebook corresponding to the pitch period value N. These codebooks are prepared in advance.

[0038] The quantized coefficient signal is transmitted to the decoder 40 shown in FIG. 5 together with the prediction coefficients, the pitch period signal N, the voiced/unvoiced decision result signal, and a frame power signal. In the decoder 40 shown in FIG. 5, the changeover switch 43 switches to the V side when the voiced/unvoiced decision result signal indicates voiced speech, and to the UV side when it indicates unvoiced speech.

[0039] That is, for voiced speech, the received quantized coefficient signal is subjected to a two-dimensional inverse Fourier transform by the two-dimensional inverse Fourier transform unit 41 to obtain the prediction residual signal, that is, the residual pitch waveform signal. This signal is amplified by the amplifier 44 via the changeover switch 43 and then converted into a speech signal by passing through the prediction synthesis filter 45.

[0040] For unvoiced speech, on the other hand, white noise generated by the noise generation unit 42 is used as the excitation signal, and the reproduced speech signal is obtained by passing this excitation signal through the prediction synthesis filter 45.
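The excitation selection in the decoder can be sketched as below. The Gaussian noise source, the fixed seed, and deriving the amplifier gain from the transmitted frame power are assumptions of this sketch; the text only states that a frame power signal is transmitted and that an amplifier 44 is present.

```python
import random


def select_excitation(voiced, decoded_residual, length, rng=None):
    """Switch 43: decoded residual for voiced frames, white noise otherwise."""
    if voiced:
        return list(decoded_residual[:length])
    rng = rng or random.Random(0)  # fixed seed only to make the sketch repeatable
    return [rng.gauss(0.0, 1.0) for _ in range(length)]


def apply_gain(excitation, frame_power):
    """Scale the excitation so that its mean square equals frame_power."""
    energy = sum(s * s for s in excitation) / len(excitation)
    if energy == 0.0:
        return list(excitation)
    gain = (frame_power / energy) ** 0.5
    return [gain * s for s in excitation]
```

The scaled excitation would then be passed through the prediction synthesis filter 45 to produce the output speech.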

[0041]

[Effects of the Invention] As described above, the present invention allocates quantization precision optimally according to the uneven distribution of the pitch-synchronous two-dimensional Fourier transform coefficients of the residual signal, and thereby improves coded speech quality.

[Brief Description of the Drawings]

FIG. 1 is a diagram showing the principle of the present invention.

FIG. 2 is a residual pitch waveform diagram for explaining the principle of the present invention.

FIG. 3 is a two-dimensional Fourier spectrum diagram of a speech signal waveform for explaining the principle of the present invention.

FIG. 4 is a block diagram of a speech encoder based on the speech coding system of one embodiment of the present invention.

FIG. 5 is a block diagram of a speech decoder based on the speech coding system of one embodiment of the present invention.

FIG. 6 is a block diagram of a speech encoder based on a conventional speech coding system.

FIG. 7 is a block diagram of a speech decoder based on a conventional speech coding system.

[Explanation of Reference Numerals]

10 speech encoder
11 linear prediction residual calculating means
12 pitch analysis means
13 pitch waveform extracting means
14 two-dimensional Fourier transform means
15 quantizing means
40 speech decoder
41 two-dimensional inverse Fourier transform means
R prediction residual signal
Pi residual pitch waveform
N pitch period

Claims (6)

[Claims]

1. A speech coding system characterized by having a decoder comprising: linear prediction residual detecting means for obtaining prediction coefficients by performing linear prediction analysis on a speech signal for each frame, and for obtaining a prediction residual signal R by inverse filtering the speech signal with the obtained prediction coefficients; pitch analysis means for obtaining a pitch period N of the speech signal; pitch waveform extracting means for extracting residual pitch waveforms Pi from the prediction residual signal R in accordance with the pitch period N; two-dimensional Fourier transform means for performing a pitch-synchronous two-dimensional Fourier transform on the residual pitch waveforms Pi to obtain transform coefficients; and quantizing means for quantizing the transform coefficients to obtain quantized coefficients.

2. The speech coding system according to claim 1, wherein, when the quantizing means quantizes the transform coefficients, the quantization bit allocation is varied according to the amplitude distribution of the transform coefficients.

3. The speech coding system according to claim 1, wherein the quantizing means matrix-quantizes the transform coefficients for transmission.

4. The speech coding system according to claim 3, wherein, when the quantizing means performs the matrix quantization, weighted quantization is performed according to the amplitude distribution of the transform coefficients.

5. The speech coding system according to claim 1, wherein the two-dimensional Fourier transform means normalizes the amplitude of the residual pitch waveforms Pi when performing the two-dimensional Fourier transform.

6. The speech coding system according to claim 1, further having a decoder comprising two-dimensional inverse Fourier transform means for receiving the transform coefficients and obtaining an excitation signal waveform by performing a two-dimensional inverse Fourier transform.

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP6266508A JPH08129400A (en) 1994-10-31 1994-10-31 Voice coding system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP6266508A JPH08129400A (en) 1994-10-31 1994-10-31 Voice coding system

Publications (1)

Publication Number Publication Date
JPH08129400A true JPH08129400A (en) 1996-05-21

Family

ID=17431890

Family Applications (1)

Application Number Title Priority Date Filing Date
JP6266508A Withdrawn JPH08129400A (en) 1994-10-31 1994-10-31 Voice coding system

Country Status (1)

Country Link
JP (1) JPH08129400A (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6553343B1 (en) 1995-12-04 2003-04-22 Kabushiki Kaisha Toshiba Speech synthesis method
US7184958B2 (en) 1995-12-04 2007-02-27 Kabushiki Kaisha Toshiba Speech synthesis method
WO2006001159A1 (en) * 2004-06-28 2006-01-05 Sony Corporation Signal encoding device and method, and signal decoding device and method
JP2006011170A (en) * 2004-06-28 2006-01-12 Sony Corp Signal-coding device and method, and signal-decoding device and method
JP4734859B2 (en) * 2004-06-28 2011-07-27 ソニー株式会社 Signal encoding apparatus and method, and signal decoding apparatus and method
US8015001B2 (en) 2004-06-28 2011-09-06 Sony Corporation Signal encoding apparatus and method thereof, and signal decoding apparatus and method thereof
JP2009501909A (en) * 2005-07-18 2009-01-22 トグノラ,ディエゴ,ジュセッペ Signal processing method and system
JP2012234206A (en) * 2007-09-12 2012-11-29 Kawai Musical Instr Mfg Co Ltd Information compression method of musical sound waveform, information decompression method, computer program for information compression, information compression device, information decompression device, and data structure
JP2013125187A (en) * 2011-12-15 2013-06-24 Fujitsu Ltd Decoder, encoder, encoding decoding system, decoding method, encoding method, decoding program and encoding program

Similar Documents

Publication Publication Date Title
KR100472585B1 (en) Method and apparatus for reproducing voice signal and transmission method thereof
KR100873836B1 (en) Celp transcoding
KR101000345B1 (en) Audio encoding device, audio decoding device, audio encoding method, and audio decoding method
US6721700B1 (en) Audio coding method and apparatus
KR100427753B1 (en) Method and apparatus for reproducing voice signal, method and apparatus for voice decoding, method and apparatus for voice synthesis and portable wireless terminal apparatus
US4868867A (en) Vector excitation speech or audio coder for transmission or storage
CA2429832C (en) Lpc vector quantization apparatus
KR100304682B1 (en) Fast Excitation Coding for Speech Coders
US5451951A (en) Method of, and system for, coding analogue signals
US4945565A (en) Low bit-rate pattern encoding and decoding with a reduced number of excitation pulses
JPH11510274A (en) Method and apparatus for generating and encoding line spectral square root
JP3590071B2 (en) Predictive partition matrix quantization of spectral parameters for efficient speech coding
US6269332B1 (en) Method of encoding a speech signal
JPH08129400A (en) Voice coding system
JPH08234795A (en) Voice encoding device
JP2004348120A (en) Voice encoding device and voice decoding device, and method thereof
JP3063087B2 (en) Audio encoding / decoding device, audio encoding device, and audio decoding device
JPH07177031A (en) Voice coding control system
JP3715417B2 (en) Audio compression encoding apparatus, audio compression encoding method, and computer-readable recording medium storing a program for causing a computer to execute each step of the method
JPH11305798A (en) Voice compressing and encoding device
JPH11133999A (en) Voice coding and decoding equipment
JPH0632034B2 (en) Speech coding method
JPS61154287A (en) System and device for coding of picture signal

Legal Events

Date Code Title Description
A300 Application deemed to be withdrawn because no request for examination was validly filed

Free format text: JAPANESE INTERMEDIATE CODE: A300

Effective date: 20020115