JPH1078797A

JPH1078797A - Acoustic signal processing method

Info

Publication number: JPH1078797A
Application number: JP8233799A
Authority: JP
Inventors: Naoki Iwagami; 直樹岩上; Kazunaga Ikeda; 和永池田; Takehiro Moriya; 健弘守谷; Akio Jin; 明夫神
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1996-09-04
Filing date: 1996-09-04
Publication date: 1998-03-24
Anticipated expiration: 2016-09-04
Also published as: JP3384523B2

Abstract

PROBLEM TO BE SOLVED: To reduce the amount of computation and improve the computational accuracy compared with a post filter that operates in the time domain. SOLUTION: For example, the reproduced frequency-domain coefficients and the reproduced spectral envelope of a transform coding/decoding method are input to terminals 51 and 52. An emphasis section 54 makes the large values of the spectral envelope larger still and the small values smaller still. An inverse flattening section 55 inverse-flattens the frequency-domain coefficients with this emphasized spectral envelope, and a conversion section 56 converts the result into a time-domain signal.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a signal processing method for reducing the perceived noise contained in audio signals, in particular in speech signals that have been encoded and decoded.

[0002]

[Prior Art] Although the present invention can be applied to acoustic signals other than decoded acoustic signals, a conventional transform encoding/decoding method for acoustic signals is described here with reference to FIG. 3, in order to show where the invention is applied in the case of a decoded acoustic signal. In the encoder 10, an input acoustic signal from an input terminal 11 is converted into frequency-domain coefficients by a time-frequency conversion section 12. The MDCT (Modified Discrete Cosine Transformation), the DCT (Discrete Cosine Transformation), the DFT (Discrete Fourier Transformation), or the like can be used for this conversion. As preprocessing, the time-frequency conversion section 12 must divide the input sample sequence into frames and apply a window. For the MDCT, each time N input samples arrive, the past 2N samples including those N samples are taken as one frame. For the DCT and the DFT, each time N input samples arrive, the past N + α samples including those N samples are taken as one frame. Windowing is performed by a conventional technique, and with any of these transforms N frequency-domain coefficients are obtained.
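The MDCT frame division described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `split_frames_mdct` is hypothetical, and the sketch assumes that framing starts only once 2N samples are available (the text does not say how the very first frame is handled).

```python
def split_frames_mdct(samples, n):
    """Return overlapping frames of 2*n samples, advancing n samples per frame.

    Assumption: the first frame is emitted once 2*n samples exist;
    no zero padding is applied at the start of the signal.
    """
    frames = []
    # Each time n new samples have arrived, take the past 2*n samples
    # (the n new ones plus the n before them) as one frame.
    for end in range(2 * n, len(samples) + 1, n):
        frames.append(samples[end - 2 * n:end])
    return frames


frames = split_frames_mdct(list(range(8)), 2)
```

Consecutive frames overlap by N samples, which is why windowing is needed before the transform and frame joining after the inverse transform.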

[0003] An outline calculation section 13 extracts the outline of the frequency-domain coefficients. Methods that can be used for this extraction include linear prediction analysis of the preprocessed acoustic signal, calculation of scale factors from the frequency-domain coefficients, and liftering of the frequency-domain coefficients. In the linear prediction method, the input signal is subjected to linear prediction analysis to obtain linear prediction coefficients, and the reciprocal of the spectral amplitude of these coefficients is used as the outline of the frequency characteristics. A linear prediction order of about 20 is effective.
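As an illustration of the linear prediction approach, the outline can be computed as the reciprocal of the amplitude spectrum of the prediction polynomial. The sketch below is written under stated assumptions: the function name `lpc_envelope` is hypothetical, the polynomial is taken as A(z) = 1 + a_1 z^-1 + ... + a_p z^-p (sign conventions vary between texts), and the envelope is sampled at evenly spaced frequencies in [0, π).

```python
import cmath
import math


def lpc_envelope(lpc, num_points):
    """Return 1/|A(e^{jw})| at num_points frequencies in [0, pi).

    lpc holds [a_1, ..., a_p] of the assumed polynomial
    A(z) = 1 + sum_k a_k z^-k.
    """
    envelope = []
    for i in range(num_points):
        omega = math.pi * i / num_points
        # Evaluate the prediction polynomial on the unit circle.
        a = 1.0 + sum(ak * cmath.exp(-1j * omega * (k + 1))
                      for k, ak in enumerate(lpc))
        # The outline is the reciprocal of the spectral amplitude.
        envelope.append(1.0 / abs(a))
    return envelope


env = lpc_envelope([-0.5], 2)
```

A direct evaluation like this costs on the order of p operations per frequency point; a practical implementation would more likely zero-pad the coefficients and use an FFT.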

[0004] In the scale factor method, the frequency-domain coefficients are divided into a plurality of sub-bands, a scale factor is calculated for each sub-band, and the scale factors are used as the outline of the frequency characteristics. The sub-bands may be equally spaced on the frequency scale or equally spaced on the Bark scale (that is, perceptually equally spaced). Setting the number of sub-bands to about 30 is effective. The scale factor may be either the mean or the maximum of the amplitudes of the samples in the sub-band.
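The scale factor calculation can be sketched as below. This is only an illustration: `scale_factors` is a hypothetical name, the sub-bands are equally spaced on the linear frequency scale (Bark spacing would need a band-edge table that the text does not give), and the number of coefficients is assumed to be divisible by the number of sub-bands.

```python
def scale_factors(coeffs, num_bands, use_max=False):
    """Return one scale factor per equal-width sub-band of coeffs.

    Assumption: len(coeffs) is an exact multiple of num_bands.
    """
    band_size = len(coeffs) // num_bands
    factors = []
    for b in range(num_bands):
        band = [abs(c) for c in coeffs[b * band_size:(b + 1) * band_size]]
        # Either the maximum or the mean amplitude represents the band.
        factors.append(max(band) if use_max else sum(band) / len(band))
    return factors


mean_factors = scale_factors([1.0, -3.0, 2.0, 2.0], 2)
max_factors = scale_factors([1.0, -3.0, 2.0, 2.0], 2, use_max=True)
```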

[0005] In the liftering method, the frequency-domain coefficients are subjected to cepstrum analysis, and the spectral amplitude of only the low-order part of the cepstrum coefficients is used as the outline of the frequency characteristics. The outline of the frequency-domain coefficients may also be obtained by combining the above methods. For example, when linear prediction analysis and scale factors are used together, the linear prediction spectrum is first determined by linear prediction analysis, and the scale factors are then determined so that their product with the linear prediction spectrum is as close as possible to the actual frequency characteristics.

[0006] The outline of the frequency characteristics is quantized by an outline quantization section 14 to obtain its index In₁. When the outline has been obtained by linear prediction analysis, it is efficient to convert the linear prediction coefficients into line spectrum pairs (LSP) and quantize those. When scale factors are quantized, each scale factor may be scalar-quantized, or several scale factors may be grouped and vector-quantized. Vector quantization can be performed efficiently by using the interleave vector quantization technique. When cepstrum coefficients are quantized, they may be either scalar-quantized or vector-quantized.

[0007] With any of these methods, predictive quantization yields still higher efficiency. AR prediction, MA prediction, or the like can be used as the prediction method. When the outline of the frequency characteristics has been obtained by several methods, quantization is performed for every method used. The quantized outline is decoded by an outline reproduction section 15 to reproduce the outline of the frequency characteristics. When line spectrum pairs have been quantized, the reproduced line spectrum pairs obtained by decoding are converted into reproduced linear prediction coefficients, and the reciprocal of the spectral amplitude of the reproduced linear prediction coefficients is used as the reproduced outline. When scale factors have been quantized, the decoded (reproduced) scale factors are used as the reproduced outline. When cepstrum coefficients have been quantized, the spectral amplitude of the decoded (reproduced) cepstrum coefficients is used as the reproduced outline.

[0008] A flattening section 16 flattens the frequency-domain coefficients with the reproduced outline of the frequency characteristics. Here, each frequency-domain coefficient is divided by the corresponding value of the outline to obtain flattened frequency-domain coefficients (residual frequency coefficients). A residual quantization section 17 vector-quantizes the flattened coefficients to obtain an index In₂. Quantization methods for this purpose include transform coding with weighted vector quantization (TC-WVQ) and transform-domain weighted interleave vector quantization (TwinVQ). These techniques are described in T. Moriya and H. Suda, "An 8 kbit/s transform coder for noisy channels," Proc. ICASSP '89, pp. 196-199, and in Iwagami, Moriya, and Miki, "Audio coding by frequency-domain weighted interleave vector quantization (TwinVQ)," Proceedings of the Acoustical Society of Japan, October-November 1994, pp. 339-340.

[0009] In the decoder 20, the index In₂ of the quantized flattened frequency-domain coefficients is decoded by a reproduction section 21. The index In₁ of the quantized outline of the frequency characteristics is decoded by a reproduction section 22 to reproduce the outline. When line spectrum pairs have been quantized, the reproduced line spectrum pairs obtained by decoding are converted into reproduced linear prediction coefficients, and the reciprocal of the spectral amplitude of the reproduced linear prediction coefficients is used as the reproduced outline. When scale factors have been quantized, the decoded (reproduced) scale factors are used as the reproduced outline. When cepstrum coefficients have been quantized, the spectral amplitude of the decoded (reproduced) cepstrum coefficients is used as the reproduced outline.

[0010] When predictive quantization has been used, reproduction uses the same predictive synthesis. When several quantization methods have been used, reproduction is performed for every method, and the reproduced outline of the frequency characteristics is obtained by, for example, multiplying the individually reproduced outlines together. The reproduced flattened frequency-domain coefficients are inverse-flattened by an inverse flattening section 23 using the reproduced outline. Here, each reproduced flattened coefficient is multiplied by the corresponding value of the reproduced outline, which yields the reproduced frequency-domain coefficients.
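The flattening of paragraph [0008] and the inverse flattening described here are element-wise division and multiplication by the outline. The following sketch shows the round trip; the function names are hypothetical, quantization is omitted, and the outline values are assumed to be strictly positive.

```python
def flatten(coeffs, outline):
    """Divide each frequency-domain coefficient by its outline value."""
    return [c / w for c, w in zip(coeffs, outline)]


def inverse_flatten(flat, outline):
    """Multiply each flattened coefficient by its outline value."""
    return [x * w for x, w in zip(flat, outline)]


coeffs = [4.0, -2.0, 1.0]
outline = [2.0, 1.0, 0.5]          # assumed strictly positive
flat = flatten(coeffs, outline)    # residual (flattened) coefficients
restored = inverse_flatten(flat, outline)
```

In a real codec the flattened coefficients pass through the vector quantizer between these two steps, so the restored values only approximate the originals.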

[0011] A frequency-time conversion section 24 converts the reproduced frequency-domain coefficients into an output acoustic signal. The IMDCT (Inverse Modified Discrete Cosine Transformation), the IDCT (Inverse Discrete Cosine Transformation), the IDFT (Inverse Discrete Fourier Transformation), or the like can be used for the conversion. As postprocessing, the frequency-time conversion section must window the output sample sequence and join the frames. Windowing is performed in the same way as in conventional techniques.

[0012] It is also known to feed the decoded speech signal from the conversion section 24 into a post filter 25 that emphasizes the peaks and valleys of the spectrum, in order to further reduce the perceived noise of coded speech. A typical post filter 25 has the following form, based on the linear prediction coefficients α.

[0013]

[Equation 1]

Here μ is a constant that corrects the spectral tilt, for example 0.4, and γ₁ and γ₂ are positive constants no greater than 1 that emphasize the spectral peaks, for example 0.5 and 0.8 respectively. This technique requires convolution and therefore a large amount of computation. Moreover, detailed spectral emphasis requires a high linear prediction order, which is problematic in terms of both the amount of computation and computational accuracy.
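The image for Equation 1 is not reproduced in this text. For orientation only, a widely used adaptive post filter built from linear prediction coefficients, and consistent with the three constants described above, has the form below. This is an assumption rather than a reconstruction of the patent's own equation; the order p and the sign convention of A(z) are likewise assumed.

```latex
H(z) = \left(1 - \mu z^{-1}\right)\,\frac{A(z/\gamma_1)}{A(z/\gamma_2)},
\qquad
A(z) = 1 + \sum_{i=1}^{p} \alpha_i z^{-i}
```

Applying such a filter in the time domain means convolving every output sample with order-p numerator and denominator polynomials, which is the computational cost the paragraph refers to.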

[0014]

[Problem to Be Solved by the Invention] An object of the present invention is to provide a signal processing method that reduces the perceived noise contained in audio signals, in particular in encoded and decoded speech signals, in fine detail and with a small amount of computation.

[0015]

[Means for Solving the Problem] In the present invention, frequency-domain coefficients from which the outline of the frequency characteristics of the input signal has been removed, and the spectral envelope of the signal, are obtained; the shape of the spectral envelope is emphasized; and the frequency-domain coefficients are inverse-flattened with the emphasized spectral envelope. In particular, when the spectral envelope is obtained, giving it uniform resolution on a Bark-scale frequency axis (on which the perceptual frequency resolution is uniform) allows the processing to be performed more efficiently.

[0016] The perceived noise of a noisy acoustic signal can be reduced by emphasizing the contrast between the large and small parts of its spectrum. Because the present invention performs this processing in the frequency domain, detailed processing is possible with a small amount of computation. When the invention is incorporated into the decoder of a transform coding system, the frequency-time conversion, which is one of the processing steps of the invention, can be shared with the decoder, which is particularly advantageous in terms of the amount of computation.

[0017]

[Embodiments of the Invention] FIG. 1 shows a first embodiment of the present invention. In this embodiment, flattened frequency-domain coefficients and a spectral envelope are input to terminals 51 and 52 respectively, and a time-domain signal is output from terminal 53. The flattened frequency-domain coefficients may be obtained, for example, by time-frequency converting the input acoustic signal as described for the encoder 10 in FIG. 3 and then flattening the result with the spectral envelope. Alternatively, in the decoder of a transform coding method, as shown for the decoder 20 in FIG. 3, the flattened frequency-domain coefficients reproduced by the residual reproduction section 21 may be used. As described above, the discrete Fourier transform (DFT), the discrete cosine transform (DCT), the modified discrete cosine transform (MDCT), or the like can be used for the time-frequency conversion. These transforms are performed once per N input samples. When the sampling frequency of the input signal is 48 kHz, for example, a value of N of about 512 to 4096 works well.

[0018] As the spectral envelope, the spectral envelope reproduced by the frequency characteristic outline reproduction section 22 in FIG. 3 may be used in the decoder of a transform coding method. Alternatively, the input acoustic signal may be time-frequency converted to obtain frequency-domain coefficients, and the outline of those coefficients may be used. As described above, the spectral envelope can be represented by scale factors, a linear prediction spectrum, or the like. A scale factor is a representative value for each band when the frequency-domain coefficients are grouped into a plurality of frequency bands. The representative value may be the maximum or the mean of the amplitudes of the coefficients in the band. The bands may have constant width on a linear scale (Hz scale) or on a nonlinear scale (for example, the Bark scale). A constant width on the Bark scale in particular allows perceptually efficient processing. The linear prediction spectrum is obtained by frequency-analyzing the linear prediction coefficients and taking the reciprocal of the result. The linear prediction coefficients may be obtained by linear prediction analysis of the input acoustic signal, or the reproduced linear prediction coefficients may be used in the decoder of a coding method. The spectral envelope input to terminal 52 is emphasized by a spectral envelope emphasis section 54. This emphasis makes large values larger still and small values smaller still. For example, the envelope is modified as in equation (2).

[0019]

w(i)′ = w₀ (w(i) / w₀)^q   (2)

Here w(i)′ is the modified spectral envelope, w(i) is the input spectral envelope, w₀ is the reference value of the modification, q is a constant of 1 or more, for example 2 to 4, and i is the sample index of the spectral envelope. The reference value w₀ can be chosen arbitrarily, but the mean of the spectral envelope is an effective choice. Instead of applying the modification of equation (2) uniformly, it may be applied only where the spectral envelope value w(i) is smaller than the reference value w₀.
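Equation (2) can be sketched in a few lines. The function name `emphasize` and its parameter names are hypothetical; the sketch uses the mean of the envelope as the reference value w₀, which the text recommends but does not require, and the optional flag implements the variant that modifies only values below w₀.

```python
def emphasize(envelope, q=2.0, only_below_reference=False):
    """Apply w' = w0 * (w / w0) ** q to every envelope sample.

    Assumptions: envelope values are positive and w0 is their mean.
    """
    w0 = sum(envelope) / len(envelope)
    emphasized = []
    for w in envelope:
        if only_below_reference and w >= w0:
            # Variant: leave values at or above the reference untouched.
            emphasized.append(w)
        else:
            emphasized.append(w0 * (w / w0) ** q)
    return emphasized


uniform = emphasize([1.0, 2.0, 3.0], q=2.0)
valleys_only = emphasize([1.0, 2.0, 3.0], q=2.0, only_below_reference=True)
```

With q greater than 1, values above w₀ grow and values below w₀ shrink, while a value equal to w₀ is unchanged, so the contrast between spectral peaks and valleys increases.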

[0020] Next, the flattened frequency-domain coefficients input to terminal 51 are inverse-flattened by an inverse flattening section 55 using the emphasized spectral envelope. The number of samples of the emphasized spectral envelope must match the number of samples of the flattened frequency-domain coefficients. If they do not match, the numbers of samples are made equal by interpolation, decimation, or similar processing. Inverse flattening is performed according to equation (3).

[0021]

y(j) = w(j)′ x(j)   (3)

Here y is the frequency-domain coefficient obtained by inverse flattening, x is the flattened frequency-domain coefficient, and j is the sample index. Finally, the frequency-domain coefficients obtained by inverse flattening are frequency-time converted by a conversion section 56 to obtain the output acoustic signal. The inverse discrete Fourier transform (IDFT), the inverse discrete cosine transform (IDCT), the inverse modified discrete cosine transform (IMDCT), or the like can be used for the frequency-time conversion.
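Equation (3), together with the sample-count matching required in paragraph [0020], can be sketched as follows. The names `resample_envelope` and `apply_envelope` are hypothetical, and linear interpolation is only one possible choice for the interpolation or decimation that the text leaves open. The final frequency-time conversion is not shown.

```python
def resample_envelope(envelope, n):
    """Linearly interpolate envelope to exactly n points.

    Assumption: both len(envelope) and n are at least 2.
    """
    m = len(envelope)
    resampled = []
    for j in range(n):
        pos = j * (m - 1) / (n - 1)      # position on the original grid
        lo = int(pos)
        hi = min(lo + 1, m - 1)
        frac = pos - lo
        resampled.append(envelope[lo] * (1 - frac) + envelope[hi] * frac)
    return resampled


def apply_envelope(flat, emphasized):
    """Inverse flattening: y(j) = w(j)' * x(j) for every sample j."""
    if len(emphasized) != len(flat):
        emphasized = resample_envelope(emphasized, len(flat))
    return [w * x for w, x in zip(emphasized, flat)]


stretched = resample_envelope([1.0, 3.0], 3)
y = apply_envelope([1.0, -1.0, 2.0], [1.0, 3.0])
```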

[0022] FIG. 2 shows a second embodiment of the present invention. The spectral envelope emphasis and inverse flattening techniques are the same as in the first embodiment shown in FIG. 1. The difference from the first embodiment is that a plurality of spectral envelopes are used and each is emphasized separately. The fine spectral envelope from terminal 61 is finer than the global spectral envelope from terminal 62. For example, scale factors divided at equal intervals on the Bark scale and a linear prediction spectrum are used, respectively. Either envelope can be any of the kinds of spectral envelope described for the first embodiment. A pitch envelope may also be used as the fine spectral envelope. A pitch envelope has a sharp peak at every integer multiple of the fundamental frequency. It may be obtained by analyzing the input acoustic signal. Alternatively, in the decoder of a coding method, a reproduced pitch envelope may be used, or a pitch envelope may be derived from reproduced pitch information.

[0023] The global spectral envelope is emphasized by a spectral envelope emphasis section 63 in the same way as in the first embodiment, and the fine spectral envelope is likewise emphasized by a spectral envelope emphasis section 64. The flattened frequency-domain coefficients from terminal 51 are inverse-flattened in an inverse flattening section 65 with the emphasized global spectral envelope from the emphasis section 63. The resulting coefficients are then inverse-flattened in an inverse flattening section 66 with the fine spectral envelope emphasized by the emphasis section 64, and the result is frequency-time converted by the conversion section 56 and output.
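The two-stage cascade of the second embodiment can be sketched as below. All names are hypothetical, both envelopes are assumed to be already sampled at the same number of points as the coefficients, each envelope uses its own mean as the reference value w₀, and the frequency-time conversion of section 56 is omitted.

```python
def emphasize(envelope, q):
    """Equation (2) with the envelope mean as reference value w0."""
    w0 = sum(envelope) / len(envelope)
    return [w0 * (w / w0) ** q for w in envelope]


def second_embodiment(flat, global_env, fine_env, q_global=2.0, q_fine=2.0):
    """Emphasize each envelope separately, then inverse-flatten twice."""
    g = emphasize(global_env, q_global)   # emphasis section 63
    f = emphasize(fine_env, q_fine)       # emphasis section 64
    stage1 = [x * w for x, w in zip(flat, g)]    # inverse flattening 65
    return [v * w for v, w in zip(stage1, f)]    # inverse flattening 66


out = second_embodiment([1.0, 1.0], [1.0, 3.0], [2.0, 2.0])
```

Because both stages are element-wise multiplications, further emphasis and inverse flattening pairs can be chained in the same way.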

[0024] The number of combinations of spectral envelope emphasis and inverse flattening need not be limited to two as in the second embodiment; more combinations may be provided.

[0025]

[Effects of the Invention] As described above, according to the present invention the spectral envelope of an acoustic signal is emphasized in the frequency domain. Emphasizing the contrast between the strong and weak parts of the envelope reduces the perceived noise that arises from distortion in the valleys of the spectrum. Because the processing is performed in the frequency domain rather than the time domain, the convolution that would be needed in the time domain is unnecessary, and even detailed processing can be performed with a small amount of computation. Furthermore, when the invention is combined with the decoder of a transform coding method, the inverse flattening section 23 in FIG. 3, for example, can be shared with the inverse flattening section 55 or 65 of the invention, and the frequency-time conversion section 56 can be shared with the frequency-time conversion section 24 in FIG. 3, which is particularly advantageous in terms of the amount of computation and memory size.

[Brief description of the drawings]

[FIG. 1] Block diagram showing the functional configuration of the first embodiment of the present invention.

[FIG. 2] Block diagram showing the functional configuration of the second embodiment of the present invention.

[FIG. 3] Block diagram showing an example of the functional configuration of a conventional transform encoding/decoding method for acoustic signals.

Continued from the front page: (72) Inventor: Akio Jin, c/o Nippon Telegraph and Telephone Corporation, 3-19-2 Nishi-Shinjuku, Shinjuku-ku, Tokyo

Claims

1. An acoustic signal processing method comprising: a first step of obtaining, frame by frame, frequency-domain coefficients from which the outline of the frequency characteristics of an acoustic signal has been removed; a second step of obtaining the outline of the frequency characteristics; a third step of emphasizing the shape of the outline of the frequency characteristics; and a fourth step of performing inverse flattening, which adds frequency characteristics to the frequency-domain coefficients obtained in the first step, using the emphasized outline of the frequency characteristics obtained in the third step.

2. The acoustic signal processing method according to claim 1, wherein, in the second step, the outline of the frequency characteristics includes scale factors whose frequency resolution is equally spaced on the Bark scale.

3. The acoustic signal processing method according to claim 1, wherein the third step includes a fifth step of making larger still the value of each sample of the outline of the frequency characteristics that has a large value.

4. The acoustic signal processing method according to claim 1, wherein the third step includes a sixth step of making smaller still the value of each sample of the outline of the frequency characteristics that has a small value.