JPH07177031A - Voice coding control system - Google Patents

Voice coding control system

Info

Publication number
JPH07177031A
JPH07177031A (application JP5318862A)
Authority
JP
Japan
Prior art keywords
pitch waveform
residual pitch
representative
unit
waveform
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
JP5318862A
Other languages
Japanese (ja)
Inventor
Yoshiaki Tanaka
良紀 田中
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujitsu Ltd filed Critical Fujitsu Ltd
Priority to JP5318862A priority Critical patent/JPH07177031A/en
Publication of JPH07177031A publication Critical patent/JPH07177031A/en
Withdrawn legal-status Critical Current

Landscapes

  • Analogue/Digital Conversion (AREA)

Abstract

PURPOSE: To improve reproduced speech quality by suppressing the variation of the residual pitch waveforms within a frame and extracting the representative residual pitch waveform from the smoothed residual pitch waveforms, thereby preventing aliasing distortion.

CONSTITUTION: A shift unit 16 aligns the phases of the residual pitch waveforms, and a Fourier transform unit 17 applies a Fourier transform to the residual pitch waveforms. The transform unit 17 and a Fourier transform unit 19 at the next stage together perform a two-dimensional Fourier transform of the residual pitch waveforms. Since the high-order components of the two-dimensional Fourier transform represent large changes between the residual pitch waveforms, smoothing is performed by suppressing these strongly varying components in a coefficient processing unit 20. A two-dimensional inverse Fourier transform unit 21 then applies the inverse transform to obtain the smoothed residual pitch waveforms. A representative residual pitch waveform extraction unit 22 extracts the latest residual pitch waveform as the representative residual pitch waveform of the frame, and a quantization unit 23 quantizes it before it is sent to the receiving side.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a speech coding control system for high-efficiency coding of speech signals. In recent years, enterprise communication systems, digital mobile radio communication systems, voice storage systems, and the like have encoded speech signals with high efficiency to improve line utilization and reduce storage capacity. In such high-efficiency coding of speech signals, improvement of the reproduced speech quality is desired.

[0002]

[Prior Art] Various systems have already been proposed for high-efficiency coding of speech signals. For example, systems that transmit linear prediction coefficients obtained by linear prediction analysis together with parameters describing the excitation source are widely adopted.

[0003]

FIG. 5 is an explanatory diagram of a linear predictive coding unit, in which 51 is a linear prediction analysis unit, 52 a prediction coefficient quantization unit, 53 an inverse filter unit, 54 a residual signal quantization unit, and 55 a multiplexing unit.

[0004]

The input speech signal is applied to the linear prediction analysis unit 51 and the inverse filter unit 53. The linear prediction analysis unit 51 obtains prediction coefficients by linear prediction analysis for each analysis frame, and the prediction coefficient quantization unit 52 quantizes them. Using these prediction coefficients as filter coefficients, the inverse filter unit 53 outputs a prediction residual signal from the input speech signal, which the residual signal quantization unit 54 quantizes. The multiplexing unit 55 multiplexes the quantized prediction residual signal and the prediction coefficients and sends them to the receiving side over the transmission path.
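The analysis-plus-inverse-filter front end can be sketched in a few lines. This is a hypothetical illustration, not the patent's implementation: the function names, the plain autocorrelation method, and the toy AR(2) "speech" frame are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def lpc_coefficients(frame, order):
    # Autocorrelation method: solve R a = r for the predictor taps a[k]
    # in x[n] ~ a[0] x[n-1] + a[1] x[n-2] + ... (real coders typically
    # use Levinson-Durbin; a plain solve() is enough for a sketch).
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:order + 1])

def inverse_filter(frame, a):
    # Prediction residual e[n] = x[n] - sum_k a[k] x[n-1-k], with zero
    # history assumed before the start of the frame.
    order = len(a)
    padded = np.concatenate([np.zeros(order), frame])
    pred = np.array([padded[n:n + order][::-1] @ a for n in range(len(frame))])
    return frame - pred

# Toy "speech" frame: an AR(2) process driven by weak noise, so an
# order-2 predictor removes most of the signal energy.
e_true = 0.05 * rng.standard_normal(400)
x = np.zeros(400)
for n in range(400):
    x[n] = e_true[n]
    if n >= 1:
        x[n] += 1.8 * x[n - 1]
    if n >= 2:
        x[n] -= 0.9 * x[n - 2]

a = lpc_coefficients(x, order=2)
res = inverse_filter(x, a)
print(np.sum(res ** 2) < 0.2 * np.sum(x ** 2))  # True: residual energy is small
```

The residual carries far less energy than the signal itself, which is what makes quantizing the residual (rather than the waveform) attractive.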

[0005]

The decoding unit on the receiving side separates the multiplexed prediction residual signal and prediction coefficients, and applies the prediction residual signal to a synthesis filter whose coefficients are the prediction coefficients, thereby reproducing the speech signal.

[0006]

For efficient transmission of the prediction residual signal, various schemes are known, such as code-excited linear prediction (CELP), which vector-quantizes a fixed-length prediction residual signal and transmits its index, and multi-pulse coding (MPC), which models a fixed-length prediction residual signal with a finite number of pulses and transmits the optimum pulse positions and amplitudes.

[0007]

The prototype waveform interpolation (PWI) method is known as a scheme whose reproduced speech quality degrades little even at low bit rates of 4 kbps and below. In this method, one representative pitch waveform is extracted from the residual signal in an analysis frame of the input speech signal and quantized. On the decoding side, the residual signal for the pitch waveforms in a frame other than the representative residual pitch waveform is obtained by interpolating between the representative residual pitch waveform of the previous frame and that of the current frame.

[0008]

FIG. 6 is an explanatory diagram of the PWI method. The representative residual pitch waveform of the current frame of L samples is denoted Z_n with pitch period N_n, and that of the previous frame Z_{n-1} with pitch period N_{n-1}. One of the plural residual pitch waveforms in a frame is quantized as the representative. On the decoding side, each remaining residual pitch waveform P_n of the current frame, other than the representative residual pitch waveform Z_n, is obtained by cyclically shifting the previous frame's representative residual pitch waveform Z_{n-1} into phase with the current frame's representative residual pitch waveform Z_n and then interpolating. That is, it can be obtained by the linear interpolation

P_n(k) = (1 − α(k)) Z_{n-1}(k) + α(k) Z_n(k)   …(1)

where 0 ≤ α(k) ≤ 1, and

P_n = (P_n(0), P_n(1), …, P_n(N−1))
Z_n = (Z_n(0), Z_n(1), …, Z_n(N−1))
Z_{n-1} = (Z_{n-1}(0), Z_{n-1}(1), …, Z_{n-1}(N−1))
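Equation (1) amounts to a per-sample cross-fade between the two prototype waveforms. The sketch below is a simplified illustration under the assumptions of equal pitch periods and a weight α held constant within each interpolated waveform; the function name is invented for the example.

```python
import numpy as np

def interpolate_pitch_waveforms(z_prev, z_curr, m):
    # Linearly cross-fade from the previous prototype to the current one:
    # waveform i gets weight alpha = i/m on z_curr (eq. (1) with alpha
    # held constant within each waveform).
    return [(1.0 - i / m) * z_prev + (i / m) * z_curr for i in range(1, m + 1)]

z_prev = np.array([0.0, 1.0, 0.0, -1.0])   # Z_{n-1}, already phase-aligned
z_curr = np.array([0.0, 2.0, 0.0, -2.0])   # Z_n
frames = interpolate_pitch_waveforms(z_prev, z_curr, m=4)
print(np.allclose(frames[-1], z_curr))                # True: last waveform is Z_n
print(np.allclose(frames[1], (z_prev + z_curr) / 2))  # True: halfway is the average
```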

[0009]

[Problems to Be Solved by the Invention] In the conventional PWI method described above, the residual signal is obtained by interpolating the representative residual pitch waveforms. However, if the successive residual pitch waveforms within a frame contain variation components at or above the frame frequency, aliasing distortion arises in the interpolation that generates the residual pitch waveforms other than the representative, and the error with respect to the actual residual signal becomes large. The reproduced speech quality therefore deteriorates. It is an object of the present invention to prevent aliasing distortion and thereby improve reproduced speech quality.

[0010]

[Means for Solving the Problems] The speech coding control system of the present invention is described with reference to FIG. 1. An analysis unit 1 obtains a linear prediction residual signal from the input speech signal for each frame; a representative residual pitch waveform extraction unit 2 extracts the representative residual pitch waveform of the frame and sends it as one item of coding information; and an interpolation decoding unit 3 that receives this representative residual pitch waveform obtains the residual pitch waveforms within the current frame by interpolation using the representative residual pitch waveforms of the current and previous frames. In this speech coding control system, a smoothing unit 4 smooths the variation of the residual pitch waveforms within the frame before the representative residual pitch waveform extraction unit 2 extracts the representative residual pitch waveform.

[0011]

The smoothing unit 4 may be configured to apply a two-dimensional Fourier transform to the residual pitch waveforms, smooth the transform coefficients obtained by the two-dimensional Fourier transform, apply a two-dimensional inverse Fourier transform, and supply the result to the representative residual pitch waveform extraction unit 2.

[0012]

The smoothing unit 4 may also be configured to normalize the transform coefficients in the course of the two-dimensional Fourier transform.

[0013]

[Operation] The analysis unit 1 performs linear prediction analysis on the input speech signal for each predetermined frame and outputs residual pitch waveforms, and the smoothing unit 4 smooths them so that the variation between the residual pitch waveforms within a frame does not become large. The representative residual pitch waveform extraction unit 2 then selects one of the residual pitch waveforms in the frame as the representative residual pitch waveform and sends it as one item of the coding information of the input speech signal. Consequently, when the interpolation decoding unit 3 interpolates between the representative residual pitch waveforms of the current and previous frames to obtain the other residual pitch waveforms of the current frame and then reproduces the speech signal, aliasing distortion is suppressed and the quality of the reproduced speech is improved.

[0014]

The smoothing unit 4 applies a two-dimensional Fourier transform to the residual pitch waveforms. Since the high-order components of the transform coefficients correspond to the parts where the residual pitch waveforms change most, these high-order components are set to zero, for example, and a two-dimensional inverse Fourier transform restores the residual pitch waveforms. The residual pitch waveforms are thereby smoothed.
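A minimal sketch of this smooth-in-the-transform-domain idea, assuming numpy and treating the aligned residual pitch waveforms as the columns of a matrix; the function name and the `keep` parameter (standing in for the threshold described later) are assumptions of the example.

```python
import numpy as np

def smooth_residual_matrix(P, keep):
    # 2-D DFT of the aligned residual pitch waveforms (columns = successive
    # waveforms), zero every spectral column whose across-waveform frequency
    # index |j| exceeds `keep`, then inverse-transform back.
    C = np.fft.fft2(P)
    M = P.shape[1]
    j = np.minimum(np.arange(M), M - np.arange(M))  # two-sided index |j|
    C[:, j > keep] = 0.0
    return np.real(np.fft.ifft2(C))

# Columns that alternate in sign are the fastest possible variation across
# waveforms; with keep = 0 only the across-waveform mean survives, which
# here is zero.
P = np.array([[1.0, -1.0, 1.0, -1.0],
              [2.0, -2.0, 2.0, -2.0]])
S = smooth_residual_matrix(P, keep=0)
print(np.allclose(S, 0.0))  # True
```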

[0015]

In the transform process for the residual pitch waveforms, after the Fourier transform in the column direction, the amplitudes of the transform coefficients are normalized, for example to 1, and the Fourier transform in the row direction is applied to the normalized transform coefficients. That is, the transform coefficients can be normalized within the two-dimensional Fourier transform process.

[0016]

[Embodiment] FIG. 2 is an explanatory diagram of an embodiment of the present invention, in which 10 is an encoding unit and 30 a decoding unit; 11 is a voiced/unvoiced decision unit (V/UV), 12 an inverse filter unit, 13 a linear prediction analysis unit, 14 a pitch analysis unit, 15 a residual pitch waveform forming unit, 16 a shift unit, 17 a Fourier transform unit (DFT), 18 an amplitude normalization unit (NOM), 19 a Fourier transform unit (DFT), 20 a coefficient processing unit (MOD), 21 a two-dimensional inverse Fourier transform unit (2D-IDFT), 22 a representative residual pitch waveform extraction unit, 23 a quantization unit, 31 a pulse generation unit, 32 a waveform shaping unit, 33 a voiced/unvoiced switching unit, 34 a noise generation unit, 35 an amplifier, 36 a prediction synthesis filter unit, and 37 an interpolation processing unit.

[0017]

In the encoding unit 10 on the transmitting side, the voiced/unvoiced decision unit 11 decides for each analysis frame whether the input speech signal is voiced or unvoiced, and the decision signal is applied to a multiplexing unit (not shown) and sent to the receiving side as one item of the coding information of the speech signal.

[0018]

The linear prediction analysis unit 13 obtains prediction coefficients from the input speech signal, and the pitch analysis unit 14 obtains the pitch period N. The inverse filter unit 12 applies inverse filtering to the input speech signal using the prediction coefficients from the linear prediction analysis unit 13 to obtain the prediction residual signal R, and the residual pitch waveform forming unit 15 cuts out residual pitch waveforms based on the pitch period N obtained by the pitch analysis unit 14. This processing can be realized with the same configuration as that used to obtain residual pitch waveforms in the known PWI method.

[0019]

The shift unit 16 cyclically shifts the residual pitch waveforms to align their phases, and the Fourier transform unit 17 applies a Fourier transform to the residual pitch waveforms. This Fourier transform unit 17 and the following Fourier transform unit 19 together perform the two-dimensional Fourier transform of the residual pitch waveforms, and in the course of this process the amplitude normalization unit 18 performs normalization.

[0020]

Since the high-order components resulting from the two-dimensional Fourier transform represent components that change greatly between the residual pitch waveforms, smoothing is performed by suppressing these strongly varying components; the coefficient processing unit 20 carries out this processing. The two-dimensional inverse Fourier transform unit 21 then applies the inverse Fourier transform to obtain the smoothed residual pitch waveforms. The representative residual pitch waveform extraction unit 22 extracts the temporally latest residual pitch waveform as the representative residual pitch waveform of the frame, and the quantization unit 23 quantizes it and sends it to the receiving side. The representative residual pitch waveform, the pitch period N, and the voiced/unvoiced decision signal are multiplexed and transmitted to the receiving side.

[0021]

In the decoding unit 30 on the receiving side, the pulse generation unit 31 generates pulses according to the pitch period N, and the interpolation processing unit 37 performs interpolation using the representative residual pitch waveforms. That is, using the representative residual pitch waveform of the previous frame and that of the current frame, the other residual pitch waveforms of the current frame are obtained by interpolation and applied to the waveform shaping unit 32 to generate the voiced-sound waveform.

[0022]

The voiced/unvoiced switching unit 33 is switched by the voiced/unvoiced decision signal supplied over a path not shown: for unvoiced sound it selects the noise generation unit 34, which generates white noise, and for voiced sound it selects the waveform shaping unit 32. The gain of the amplifier 35 is set according to frame power information supplied over a path not shown. The amplified signal is applied to the prediction synthesis filter unit 36 to reproduce the speech signal.

[0023]

Since the prediction residual signal vector R from the inverse filter unit 12 can be regarded as a succession of the residual pitch waveforms P_i, it can be expressed as

R = (P_0, P_1, P_2, …, P_{M−1})   …(2)
P_i = (p(i,0), p(i,1), p(i,2), …, p(i,N_i)),  0 ≤ i ≤ (M−1)

where N is the pitch period and M is the number of residual pitch waveforms contained in the frame.

[0024]

FIG. 3 illustrates the extraction of the residual pitch waveforms. The pitch analysis unit 14 obtains the pitch period N of the current frame from the input speech signal, and residual pitch waveforms P_i are cut out of the prediction residual signal vector R at intervals of the pitch period N, working backwards from the latest time of the current frame. The figure shows the case where, relative to the temporally newest residual pitch waveform P_2, the oldest residual pitch waveform P_0 includes part of the waveform of the previous frame.
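The backward cut-out of FIG. 3 can be illustrated as follows. The function name and the fixed integer pitch period are assumptions of this sketch; since P_0 may reach into the previous frame, the buffer passed in must include those older samples.

```python
import numpy as np

def cut_pitch_waveforms(residual, pitch_n, m):
    # Slice M pitch-period-long segments, working backwards from the
    # newest sample; P_0 may therefore reach into the previous frame,
    # so `residual` should include those older samples.
    waveforms = []
    end = len(residual)
    for _ in range(m):
        waveforms.append(residual[end - pitch_n:end])
        end -= pitch_n
    return waveforms[::-1]  # oldest first: P_0, ..., P_{M-1}

r = np.arange(10.0)  # toy residual buffer, newest sample last
ws = cut_pitch_waveforms(r, pitch_n=3, m=3)
print([w.tolist() for w in ws])  # [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
```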

[0025]

Next, the shift unit 16 cyclically shifts each residual pitch waveform P_i so that p(i,0) becomes the maximum value, aligning the phases of the residual pitch waveforms P_i. Denoting the shifted waveforms again by P_i, the prediction residual signal vector R can be expressed in the form of an N × M matrix, shown as R = [ ] in equation (3) below.
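The phase alignment by cyclic shift can be sketched with `numpy.roll`; the function name is invented, and the peak-at-index-0 convention follows the description above.

```python
import numpy as np

def align_by_cyclic_shift(waveforms):
    # Cyclically shift each waveform so that its peak sample lands at
    # index 0, i.e. p(i, 0) becomes the maximum value.
    return [np.roll(w, -int(np.argmax(w))) for w in waveforms]

ws = [np.array([0.0, 3.0, 1.0]), np.array([2.0, 0.0, 5.0])]
aligned = align_by_cyclic_shift(ws)
print([w.tolist() for w in aligned])  # [[3.0, 1.0, 0.0], [5.0, 2.0, 0.0]]
```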

[0026]

Equation (3) is subjected to a two-dimensional Fourier transform: the Fourier transform units 17 and 19 apply the Fourier transform in the column direction and then in the row direction. In this case, after the Fourier transform unit 17 transforms in the column direction, the amplitude normalization unit 18 normalizes the amplitude of each transform coefficient to 1, and the next Fourier transform unit 19 applies the Fourier transform in the row direction to the normalized transform coefficients. This normalization improves the quantization efficiency of the representative residual pitch waveform.
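The column-transform, normalize, row-transform order can be sketched as below, assuming numpy and invented function names, and glossing over the fact that for real signals only N/2 rows of coefficients need to be kept.

```python
import numpy as np

def two_stage_dft(P):
    # Stage 1: DFT along each column (within a pitch waveform).
    C1 = np.fft.fft(P, axis=0)
    # Normalise every coefficient to unit magnitude (the role attributed
    # to the amplitude normalisation unit 18); guard against zeros.
    mag = np.abs(C1)
    C1n = np.where(mag > 0, C1, 0.0) / np.where(mag > 0, mag, 1.0)
    # Stage 2: DFT along each row (across successive waveforms).
    return C1n, np.fft.fft(C1n, axis=1)

P = np.array([[1.0, 2.0],
              [3.0, 4.0]])
C1n, C = two_stage_dft(P)
print(np.allclose(np.abs(C1n), 1.0))  # True: column-stage coefficients normalised
```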

[0027]

The transform coefficients c(i,j) (where i = 0, 1, 2, …, N/2 and j = 0, 1, 2, …, M−1) obtained by the two-dimensional Fourier transform of the Fourier transform units 17 and 19 can be expressed in the form of an (N/2) × M matrix, shown as C = [ ] in equation (4) below.

[0028]

In voiced sections there is high correlation between successive residual pitch waveforms. For example, applying the two-dimensional Fourier transform of the residual pitch waveforms to the speech signal waveform of the frame shown in FIG. 4(a) gave the result shown in FIG. 4(b): the smaller the value of j in the transform coefficient c(i,j), the larger the coefficient. The components of low order in j are those that change slowly between the residual pitch waveforms, and the high-order components are those that change quickly.

[0029]

Accordingly, smoothing is performed by setting to zero those two-dimensional Fourier transform coefficients c(i,j) whose value of j exceeds the threshold K(M, N, f_cut) given by

K(M, N, f_cut) = M N f_cut / f_s   …(5)
f_cut = f_F / 2

where f_cut is the cutoff frequency, f_s the sampling frequency, and f_F the frame frequency. That is, the coefficient processing unit 20 performs this processing on the transform coefficients c(i,j).
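Equation (5) itself is simple arithmetic; a sketch with hypothetical parameter values (8 kHz sampling, 20 ms frames), which are assumptions of this example and not taken from the patent:

```python
def smoothing_threshold(m, n, f_s, f_frame):
    # K(M, N, f_cut) = M * N * f_cut / f_s, with f_cut = f_F / 2:
    # coefficients c(i, j) with j > K are set to zero.
    f_cut = f_frame / 2.0
    return m * n * f_cut / f_s

# Hypothetical numbers: 8 kHz sampling, 20 ms frames (f_F = 50 Hz),
# M = 4 waveforms of N = 40 samples each.
K = smoothing_threshold(m=4, n=40, f_s=8000.0, f_frame=50.0)
print(K)  # 0.5 -> only the j = 0 column of coefficients survives here
```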

[0030]

The cutoff frequency f_cut is set to half the frame frequency f_F because, in the interpolation processing, variation components above f_F/2 produce aliasing distortion and therefore degrade the reproduced speech quality. Smoothing as described above avoids aliasing distortion in the interpolation processing, so the reproduced speech quality can be improved.

[0031]

The two-dimensional inverse Fourier transform unit 21 applies the inverse Fourier transform to the two-dimensional Fourier transform coefficients smoothed by the coefficient processing unit 20, yielding the smoothed residual pitch waveforms. The representative residual pitch waveform extraction unit 22 extracts the temporally newest residual pitch waveform P_{M−1} as the representative residual pitch waveform of the frame, and the quantization unit 23 quantizes it. As described above, the pitch period N, the voiced/unvoiced decision signal, and so on are each quantized, multiplexed, and sent to the receiving side.

[0032]

In the decoding unit 30 on the receiving side described above, the interpolation processing unit 37 interpolates using the representative residual pitch waveforms, reproducing the residual pitch waveforms within the frame and thus the voiced sound.

[0033]

[Effects of the Invention] As described above, in a speech coding control system in which a linear prediction residual signal is obtained for each analysis frame of the input speech signal, a representative residual pitch waveform is extracted and transmitted, and on reception the residual pitch waveforms within the current frame are reproduced by interpolation between the representative residual pitch waveforms of the previous and current frames, the present invention smooths the variation of the residual pitch waveforms within the frame in the smoothing unit 4. This has the advantage of preventing aliasing distortion in the interpolation processing and improving the reproduced speech quality.

[Brief Description of the Drawings]

FIG. 1 is a diagram illustrating the principle of the present invention.

FIG. 2 is an explanatory diagram of an embodiment of the present invention.

FIG. 3 is an explanatory diagram of the extraction of residual pitch waveforms.

FIG. 4 is an explanatory diagram of two-dimensional Fourier transform coefficients.

FIG. 5 is an explanatory diagram of a linear predictive coding unit.

FIG. 6 is an explanatory diagram of the PWI method.

[Explanation of Symbols]

1 analysis unit
2 representative residual pitch waveform extraction unit
3 interpolation processing decoding unit
4 smoothing unit

Claims (3)

[Claims]

1. A speech coding control system in which an analysis unit (1) obtains a linear prediction residual signal from an input speech signal for each frame, a representative residual pitch waveform extraction unit (2) extracts the representative residual pitch waveform within the frame and sends it as one item of coding information, and an interpolation decoding unit (3) that receives the representative residual pitch waveform obtains the residual pitch waveforms within the current frame by interpolation using the representative residual pitch waveforms of the current and previous frames, the system being characterized in that a smoothing unit (4) smooths the variation of the residual pitch waveforms within the frame before the representative residual pitch waveform extraction unit (2) extracts the representative residual pitch waveform.
2. The speech coding control system according to claim 1, characterized in that the smoothing unit (4) is configured to apply a two-dimensional Fourier transform to the residual pitch waveforms, smooth the transform coefficients obtained by the two-dimensional Fourier transform, apply a two-dimensional inverse Fourier transform, and supply the result to the representative residual pitch waveform extraction unit (2).
3. The speech coding control system according to claim 2, characterized in that the smoothing unit (4) has a configuration for normalizing the transform coefficients in the course of the two-dimensional Fourier transform.
JP5318862A 1993-12-20 1993-12-20 Voice coding control system Withdrawn JPH07177031A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP5318862A JPH07177031A (en) 1993-12-20 1993-12-20 Voice coding control system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP5318862A JPH07177031A (en) 1993-12-20 1993-12-20 Voice coding control system

Publications (1)

Publication Number Publication Date
JPH07177031A true JPH07177031A (en) 1995-07-14

Family

ID=18103792

Family Applications (1)

Application Number Title Priority Date Filing Date
JP5318862A Withdrawn JPH07177031A (en) 1993-12-20 1993-12-20 Voice coding control system

Country Status (1)

Country Link
JP (1) JPH07177031A (en)


Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6553343B1 (en) 1995-12-04 2003-04-22 Kabushiki Kaisha Toshiba Speech synthesis method
US7184958B2 (en) 1995-12-04 2007-02-27 Kabushiki Kaisha Toshiba Speech synthesis method
USRE39336E1 (en) 1998-11-25 2006-10-10 Matsushita Electric Industrial Co., Ltd. Formant-based speech synthesizer employing demi-syllable concatenation with independent cross fade in the filter parameter and source domains
JP2003522965A (en) * 1998-12-21 2003-07-29 クゥアルコム・インコーポレイテッド Periodic speech coding
JP4824167B2 (en) * 1998-12-21 2011-11-30 クゥアルコム・インコーポレイテッド Periodic speech coding

Similar Documents

Publication Publication Date Title
KR100427753B1 (en) Method and apparatus for reproducing voice signal, method and apparatus for voice decoding, method and apparatus for voice synthesis and portable wireless terminal apparatus
KR101000345B1 (en) Audio encoding device, audio decoding device, audio encoding method, and audio decoding method
US6721700B1 (en) Audio coding method and apparatus
CN101518083B (en) Method, medium, and system encoding and/or decoding audio signals by using bandwidth extension and stereo coding
EP0939394A1 (en) Apparatus for encoding and apparatus for decoding speech and musical signals
US20090192789A1 (en) Method and apparatus for encoding/decoding audio signals
US4945565A (en) Low bit-rate pattern encoding and decoding with a reduced number of excitation pulses
JP2002372996A (en) Method and device for encoding acoustic signal, and method and device for decoding acoustic signal, and recording medium
JP2007504503A (en) Low bit rate audio encoding
JP3087814B2 (en) Acoustic signal conversion encoding device and decoding device
JP3472279B2 (en) Speech coding parameter coding method and apparatus
JP3888097B2 (en) Pitch cycle search range setting device, pitch cycle search device, decoding adaptive excitation vector generation device, speech coding device, speech decoding device, speech signal transmission device, speech signal reception device, mobile station device, and base station device
JP3531780B2 (en) Voice encoding method and decoding method
JPH07177031A (en) Voice coding control system
US6535847B1 (en) Audio signal processing
US6208962B1 (en) Signal coding system
JP4699117B2 (en) A signal encoding device, a signal decoding device, a signal encoding method, and a signal decoding method.
US5588089A (en) Bark amplitude component coder for a sampled analog signal and decoder for the coded signal
EP3248190B1 (en) Method of encoding, method of decoding, encoder, and decoder of an audio signal
JP2958726B2 (en) Apparatus for coding and decoding a sampled analog signal with repeatability
JPH08129400A (en) Voice coding system
JP2002049397A (en) Digital signal processing method, learning method, and their apparatus, and program storage media therefor
JP3417362B2 (en) Audio signal decoding method and audio signal encoding / decoding method
JP3453116B2 (en) Audio encoding method and apparatus
JP2004348120A (en) Voice encoding device and voice decoding device, and method thereof

Legal Events

Date Code Title Description
A300 Application deemed to be withdrawn because no request for examination was validly filed

Free format text: JAPANESE INTERMEDIATE CODE: A300

Effective date: 20010306