JPH0990989A

JPH0990989A - Transform coding method and transform decoding method

Info

Publication number: JPH0990989A
Application number: JP7248145A
Authority: JP
Inventors: Naoki Iwagami; 直樹岩上; Takehiro Moriya; 健弘守谷; Satoshi Miki; 聡三樹
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1995-09-26
Filing date: 1995-09-26
Publication date: 1997-04-04
Anticipated expiration: 2015-09-26
Also published as: JP3348759B2

Abstract

PROBLEM TO BE SOLVED: To improve the efficiency of the encoder as a whole by exploiting the regularity of the spike-like pitch components that appear in a frequency domain signal, so that they can be encoded with high efficiency. SOLUTION: An encoder A comprises a time-frequency converter 1, a global outline calculator/quantizer 2, a first flattener 3, a pitch encoder 4, an adder 5, a fine spectrum outline calculator/quantizer 6, a second flattener 7, and a quantizer 8. A decoder B comprises a regenerator 9, a fine spectrum outline regenerator 10, a first inverse flattener 11, a pitch regenerator 12, an adder 13, a global outline regenerator 14, a second inverse flattener 15, and a time-frequency inverse converter 16. A musical tone signal or speech signal is divided into frames of a fixed interval and transformed into a frequency domain signal. Then only the pitch component is extracted from the frequency domain signal, separated, and encoded.

Description

Detailed Description of the Invention

[0001]

[Technical Field of the Invention] The present invention relates to a transform coding method and a transform decoding method for signals that contain a pitch component, such as musical tone signals or speech signals.

[0002]

[Prior Art] As a method for encoding audio signals such as musical tone signals or speech signals with high efficiency, it has been proposed to divide the audio signal into fixed-length intervals of about 5 to 50 ms called frames, apply a time-frequency transform to the signal of each frame, separate the resulting frequency domain signal into two pieces of information, namely the envelope of the frequency characteristic (the outline of the frequency characteristic) and the residual signal obtained by flattening the frequency domain signal with that outline, and encode each of them.

[0003] Concrete coding methods of this kind that have been proposed include Adaptive Spectral Perceptual Entropy Coding (ASPEC), Transform Coding with Weighted Vector Quantization (TCWVQ), and MPEG-Audio Layer 3.

[0004] These techniques are described in detail in K. Brandenburg, J. Herre, J. D. Johnston et al., "ASPEC: Adaptive spectral entropy coding of high quality music signals", Proc. AES '91; T. Moriya, H. Suda, "An 8 Kbit/s transform coder for noisy channels", Proc. ICASSP '89, pp. 196-199; and ISO/IEC standard IS-11172-3.

[0005] To achieve highly efficient coding with these methods, it is desirable that the frequency characteristic of the residual signal be as flat as possible. For this reason, in the above-mentioned ASPEC and MPEG-Audio Layer 3, as shown in FIG. 5, the frequency domain signal is divided into several small bands, and the signal within each band is normalized by dividing it by a value called a scaling factor, which represents the strength of the band, thereby flattening the frequency characteristic of the residual signal.
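The band-wise normalization described in [0005] can be illustrated with a minimal Python sketch. The function name `normalize_by_scale_factors`, the uniform band width, and the choice of the largest magnitude in each band as its scaling factor are assumptions made for illustration; the cited methods define their own band layouts and scaling factors.

```python
def normalize_by_scale_factors(spectrum, band_size):
    """Divide each band of the spectrum by a per-band scaling factor.

    Assumption: bands have a uniform width and the scaling factor is the
    largest magnitude in the band (a hypothetical choice for this sketch).
    """
    scale_factors = []
    residual = []
    for start in range(0, len(spectrum), band_size):
        band = spectrum[start:start + band_size]
        # Fall back to 1.0 for an all-zero band to avoid dividing by zero.
        factor = max(abs(x) for x in band) or 1.0
        scale_factors.append(factor)
        residual.extend(x / factor for x in band)
    return scale_factors, residual
```

After normalization every band has a peak magnitude of 1, so the residual is flat at the scale of whole bands, while any structure inside a band is left untouched.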

[0006] A more efficient way of flattening the frequency domain signal than these methods is to use linear prediction analysis, as shown in FIG. 6. In this method, the frequency characteristic is flattened by driving a linear prediction analysis filter with the linear prediction coefficients obtained by linear prediction of the input signal. This is the technique used in the above-mentioned Transform Coding with Weighted Vector Quantization (TCWVQ).

[0007] Related techniques such as linear prediction analysis, the discrete cosine transform (DCT), and the modified discrete cosine transform (MDCT) are described in Saito and Nakata, "Fundamentals of Speech Information Processing" (Ohmsha), Chapter 6; K. R. Rao and P. Yip, translated by Yasuda and Fujiwara, "Image Coding Technology: DCT and Its International Standards" (Ohmsha), Chapter 2; H. S. Malvar, "Signal Processing with Lapped Transforms," Artech House; and ISO/IEC standard IS-11172-3.

[0008]

[Problems to Be Solved by the Invention] However, these coding methods only normalize the global outline of the frequency characteristic and cannot efficiently remove the microscopic unevenness of the frequency characteristic caused by the pitch component of musical tones or speech. Because of this obstacle, it has been difficult for the conventional coding methods described above to achieve high efficiency when encoding an audio signal with a strong pitch component.

[0009] The present invention has been made in view of the problems described above, and its object is to provide a transform coding method and a transform decoding method capable of efficiently encoding an audio signal that contains a pitch component.

[0010]

[Means for Solving the Problems] The invention according to claim 1 is a transform coding method for a musical tone signal or speech signal containing a pitch component, comprising a first step of dividing the musical tone signal or speech signal into frames at fixed time intervals and transforming them into a frequency domain signal, a second step of extracting only the pitch component from the frequency domain signal, separating it, and encoding it, and a third step of encoding a signal obtained by removing the pitch component from the frequency domain signal.

[0011] The invention according to claim 2 is the invention according to claim 1, wherein the second step comprises a fourth step of obtaining and encoding the fundamental frequency of the pitch component, and a fifth step of extracting and encoding the pitch component based on that fundamental frequency.

[0012] The invention according to claim 3 is the invention according to claim 2, wherein the fifth step extracts the pitch component in units each consisting of a plurality of consecutive samples that are centered on, and include, the sample of the frequency domain signal closest to a frequency that is a natural-number multiple of the fundamental frequency.

[0013] The invention according to claim 4 is the invention according to claim 2 or 3, wherein the fifth step performs the encoding by vector-quantizing each unit of the pitch component.

[0014] The invention according to claim 5 is a method for decoding a code obtained by the transform coding method according to claim 1, comprising a sixth step of decoding the pitch component encoded in the second step, a seventh step of decoding the signal, encoded in the third step, that was obtained by removing the pitch component from the frequency domain signal, and an eighth step of transforming into a time domain signal the frequency domain signal obtained by combining the decoded output of the sixth step and the decoded output of the seventh step.

[0015] The invention according to claim 6 is the invention according to claim 5, wherein the sixth step comprises a ninth step of decoding the fundamental frequency of the pitch component obtained in the fourth step, and a tenth step of arranging the pitch component as a frequency domain signal based on the fundamental frequency obtained in the ninth step.

[0016] The invention according to claim 7 is the invention according to claim 6, wherein the tenth step arranges the pitch component in units each consisting of consecutive samples that include the frequency domain sample closest to a frequency that is a natural-number multiple of the fundamental frequency.

[0017] The invention according to claim 8 is the invention according to claim 6 or 7, wherein the tenth step decodes the vector-quantized index for each unit of the pitch component.

[0018]

[Operation] Musical tones and speech have a pitch, that is, a high or low musical interval. The frequency domain signal obtained by frequency-transforming such a musical tone or speech contains pitch components lined up at a constant frequency interval. Consequently, the residual signal obtained by normalizing the frequency domain signal by the outline of its own frequency characteristic also contains these pitch components. Because the pitch components appear as spikes whose energy is large relative to the overall power, they reduce the flatness of the residual signal and degrade quantization efficiency. The present invention focuses on the fact that the pitch components are equally spaced on the frequency axis and, by subtracting them from the residual signal, raises the flatness of the residual coefficients with only a small amount of additional information.
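The effect of a spike on flatness can be made concrete with a small Python sketch. The text does not define a flatness measure; the ratio of the geometric mean to the arithmetic mean of the power (the common spectral flatness measure) is assumed here purely for illustration, and the function name `spectral_flatness` is hypothetical.

```python
import math

def spectral_flatness(coeffs):
    """Geometric mean of the power divided by its arithmetic mean.

    Assumption: this measure is not taken from the text. It equals 1.0
    for a perfectly flat spectrum and falls toward 0 as spikes appear.
    All coefficients must be non-zero, because the logarithm is used.
    """
    power = [c * c for c in coeffs]
    geometric = math.exp(sum(math.log(p) for p in power) / len(power))
    arithmetic = sum(power) / len(power)
    return geometric / arithmetic
```

Under this measure a residual such as `[1, 1, 10, 1]`, with one spike, scores far lower than `[1, 1, 1, 1]`, which is the situation that subtracting the pitch component is meant to avoid.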

[0019]

[Embodiments of the Invention] An embodiment of the present invention is described below with reference to the drawings. FIG. 1 illustrates the transform coding method and transform decoding method according to this embodiment, where reference sign A denotes the encoder and B the decoder. As shown in the figure, the encoder A comprises a time-frequency converter 1, a global outline calculator/quantizer 2, a first flattener 3, a pitch encoder 4, an adder 5, a fine spectrum outline calculator/quantizer 6, a second flattener 7, and a quantizer 8.

[0020] The time-frequency converter 1 divides the time domain input signal (an audio signal such as a musical tone signal or speech signal) into frames at fixed time intervals and applies a time-frequency transform to each frame to generate a frequency domain signal.

[0021] FIG. 2 shows the frequency characteristic of this frequency domain signal. As shown in the figure, the frequency domain signal of a musical tone signal or speech signal contains pitch components arranged at a constant frequency interval p. The discrete cosine transform (DCT) or the modified discrete cosine transform (MDCT) can be used as the transform.
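As a rough illustration of the framing and transform performed by the time-frequency converter 1, the following Python sketch cuts a signal into frames and applies an unnormalized DCT-II to one frame. The function names, the non-overlapping framing, and the direct O(N^2) evaluation are simplifying assumptions; an MDCT-based implementation would use overlapping, windowed frames.

```python
import math

def split_into_frames(signal, frame_len):
    """Cut the signal into consecutive, non-overlapping frames.

    Assumption: a trailing partial frame is simply dropped.
    """
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, frame_len)]

def dct2(frame):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi / N * (n + 0.5) * k)."""
    n_samples = len(frame)
    return [sum(x * math.cos(math.pi / n_samples * (n + 0.5) * k)
                for n, x in enumerate(frame))
            for k in range(n_samples)]
```

For a constant frame all of the energy lands in coefficient 0, which is a quick sanity check on the transform.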

[0022] The global outline calculator/quantizer 2 generates a signal representing the global outline of the frequency domain signal output from the time-frequency converter 1 and quantizes it. It outputs this signal to the first flattener 3 and also outputs it externally as a quantized global outline index. The global outline may be computed as a linear prediction spectrum, or by using scale factors, in which the frequency domain signal is divided into a plurality of subbands and the outline of the whole frequency domain signal is represented by a representative value of each band.

[0023] When the linear prediction spectrum is quantized, the linear prediction parameters are converted into LSP (line spectrum pair) parameters and quantized, or converted into K parameters and quantized.

[0024] The first flattener 3 flattens the frequency domain signal output from the time-frequency converter 1 by dividing it by the global outline signal output from the global outline calculator/quantizer 2, and outputs a first flattened signal.

[0025] Next, the pitch encoder 4 detects the pitch component in the first flattened signal and encodes it. FIG. 3 shows the details of the pitch encoder 4. The first flattened signal is input to the pitch fundamental frequency extractor 4a and the pitch sample extractor 4b shown in the figure.

[0026] The pitch fundamental frequency extractor 4a obtains the fundamental frequency of the pitch component (the pitch fundamental frequency) by analyzing the first flattened signal. Specifically, the pitch fundamental frequency extractor 4a computes the cepstrum of the first flattened coefficients and takes its maximum as the fundamental period of the pitch component. It then obtains the pitch fundamental frequency by computing the reciprocal of this fundamental period and outputs it to the pitch fundamental frequency quantizer 4c.

[0027] To make the pitch fundamental frequency more accurate, the neighborhood of the obtained pitch fundamental frequency may be searched for the fundamental frequency that maximizes the sum of the powers of the samples of the first flattened signal taken at intervals of the pitch fundamental frequency, and this value may be adopted as the new pitch fundamental frequency.
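The refinement search of [0027] might look like the following Python sketch. Frequencies are expressed in (possibly fractional) bins of the frequency domain signal, and the names `harmonic_power` and `refine_f0` as well as the `search_width` and `step` parameters are hypothetical; the text does not specify the search range or resolution.

```python
def harmonic_power(flattened, f0_bins):
    """Sum the power of the samples nearest to each multiple of f0_bins."""
    total = 0.0
    k = 1
    while True:
        index = int(k * f0_bins + 0.5)  # nearest bin to the k-th harmonic
        if index >= len(flattened):
            return total
        total += flattened[index] ** 2
        k += 1

def refine_f0(flattened, f0_initial, search_width=1.0, step=0.25):
    """Search around f0_initial for the candidate with the largest harmonic power.

    Assumption: candidates lie on a uniform grid and must be at least one
    bin wide; if none qualifies, the initial estimate is returned.
    """
    n_steps = int(search_width / step)
    candidates = [f0_initial + i * step for i in range(-n_steps, n_steps + 1)]
    valid = [c for c in candidates if c >= 1.0]
    return max(valid, key=lambda c: harmonic_power(flattened, c),
               default=f0_initial)
```

With spikes at every fifth bin and an initial estimate of 4.5 bins, the search settles on 5.0 bins because only that candidate lines up with all of the spikes.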

[0028] The pitch fundamental frequency quantizer 4c quantizes the pitch fundamental frequency obtained in this way. Specifically, it scalar-quantizes the logarithm of the pitch fundamental frequency, outputs the result externally as a quantized pitch fundamental frequency index, and also outputs the scalar-quantized signal to the pitch sample extractor 4b.
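A minimal Python sketch of scalar quantization in the logarithmic domain, as described in [0028], follows. The uniform step size, the range limits `f0_min` and `f0_max`, and the bit count `n_bits` are assumptions for illustration; the text only states that the logarithm of the pitch fundamental frequency is scalar-quantized.

```python
import math

def quantize_log_f0(f0, f0_min, f0_max, n_bits):
    """Return the index of the nearest level on a uniform grid in log(f0)."""
    levels = 2 ** n_bits
    step = (math.log(f0_max) - math.log(f0_min)) / (levels - 1)
    index = round((math.log(f0) - math.log(f0_min)) / step)
    return min(max(index, 0), levels - 1)  # clamp to the valid index range

def dequantize_log_f0(index, f0_min, f0_max, n_bits):
    """Recover the quantized fundamental frequency from its index."""
    levels = 2 ** n_bits
    step = (math.log(f0_max) - math.log(f0_min)) / (levels - 1)
    return f0_min * math.exp(index * step)
```

Quantizing the logarithm makes the relative error roughly constant across the range, so low and high fundamental frequencies are represented with similar proportional accuracy.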

[0029] For the first flattened signal input from the first flattener 3, the pitch sample extractor 4b takes the sample closest to each natural-number multiple of the quantized pitch fundamental frequency input from the pitch fundamental frequency quantizer 4c, extracts it together with one sample on either side, and outputs each such set of three samples to the pitch sample quantizer 4d as the sample group of one pitch component. The number of pitch component sample groups may be fixed or variable.
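The extraction of three-sample groups in [0029] can be sketched in Python as follows. The fundamental frequency is expressed in bins of the frequency domain signal, and the function name `extract_pitch_groups`, the `half_width` parameter, and the rule of stopping at the first group that would run past the end of the frame are assumptions for illustration.

```python
def extract_pitch_groups(flattened, f0_bins, half_width=1):
    """Collect (center_bin, samples) groups around each harmonic of f0_bins.

    Each group holds the sample nearest to a natural-number multiple of
    f0_bins plus half_width samples on either side (three samples by
    default). Assumption: f0_bins is at least one bin.
    """
    if f0_bins < 1.0:
        raise ValueError("f0_bins must be at least 1.0")
    groups = []
    k = 1
    while True:
        center = int(k * f0_bins + 0.5)  # nearest bin to the k-th harmonic
        if center + half_width >= len(flattened):
            break
        groups.append((center,
                       flattened[center - half_width:center + half_width + 1]))
        k += 1
    return groups
```

For a 12-sample frame and a spacing of 4 bins, this yields two groups centered on bins 4 and 8.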

[0030] The pitch sample quantizer 4d quantizes the pitch component sample groups and outputs the result externally as a quantized pitch component index. It also outputs to the adder 5 the quantized pitch component obtained by decoding this quantized pitch component index. The sample groups may be scalar-quantized, or each three-sample group may be vector-quantized. Alternatively, all sample groups may be vector-quantized together. This completes the processing performed in the pitch encoder 4.
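Vector quantization of one three-sample group can be sketched in Python as a nearest-codeword search. The function names, the squared-error distance, and the codebook contents in the usage lines are hypothetical; the text does not describe how the codebook is designed.

```python
def vq_encode(group, codebook):
    """Return the index of the codeword closest to the group (squared error)."""
    def distance(codeword):
        return sum((g - c) ** 2 for g, c in zip(group, codeword))
    return min(range(len(codebook)), key=lambda i: distance(codebook[i]))

def vq_decode(index, codebook):
    """Return a copy of the codeword selected by the index."""
    return list(codebook[index])

# Hypothetical three-entry codebook of spike-like shapes.
codebook = [[0.0, 1.0, 0.0], [0.5, 1.0, 0.5], [0.0, 2.0, 0.0]]
index = vq_encode([0.1, 1.9, 0.0], codebook)
quantized_group = vq_decode(index, codebook)
```

Only `index` needs to be transmitted, while `quantized_group` is the locally decoded pitch component that would be passed on for subtraction.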

[0031] Next, the adder 5 uses the quantized pitch component signal input from the pitch encoder 4 to subtract only the pitch component from the first flattened signal input from the first flattener 3, generating a second flattened signal, which it outputs to the fine spectrum outline calculator/quantizer 6 and the second flattener 7.
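The subtraction performed by the adder 5 can be sketched in Python as follows. The representation of the quantized pitch component as a list of `(center_bin, samples)` pairs and the function name `subtract_pitch` are assumptions made for this illustration.

```python
def subtract_pitch(first_flattened, quantized_groups):
    """Subtract the quantized pitch sample groups from the first flattened signal.

    Assumption: each group is a (center_bin, samples) pair with an odd
    number of samples centered on center_bin.
    """
    second_flattened = list(first_flattened)
    for center, samples in quantized_groups:
        start = center - len(samples) // 2
        for offset, value in enumerate(samples):
            second_flattened[start + offset] -= value
    return second_flattened
```

Because the quantized (decoded) pitch component is subtracted rather than the original samples, the quantization error of the pitch component stays in the second flattened signal and is passed on to the later stages.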

[0032] FIG. 4 shows the frequency characteristic of this second flattened signal. As a comparison with FIG. 2 shows, the second flattened signal is the frequency domain signal output from the time-frequency converter 1 with the pitch component removed.

[0033] The fine spectrum outline calculator/quantizer 6 calculates the outline of the fine spectrum (the fine spectrum outline) from the second flattened signal and quantizes it. It outputs the quantized signal externally as a quantized fine spectrum outline index and also outputs it to the second flattener 7.

[0034] The fine spectrum outline may be obtained by quantizing the fine spectrum outline directly, or by linearly combining the fine spectrum outlines of past frames. It may also be obtained by linearly combining the quantized fine spectrum outline information of past and current frames. Furthermore, the fine spectrum outline may be, for example, the absolute value of the second flattened signal convolved with a window function about 3 to 5 samples wide, or a representative value of the amplitude of the second flattened signal prepared for each band after subband division.

[0035] The second flattener 7 flattens the second flattened signal input from the adder 5 by dividing it by the fine spectrum outline obtained by the fine spectrum outline calculator/quantizer 6, and outputs the result to the quantizer 8 as a third flattened signal. The quantizer 8 scalar-quantizes or vector-quantizes the third flattened signal and outputs the result externally as a quantization index.
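One of the options mentioned for the fine spectrum outline, convolving the absolute value of the second flattened signal with a short window, followed by the division carried out by the second flattener 7, can be sketched in Python as follows. The window coefficients, the edge handling, and the small floor that guards the division are assumptions for illustration, and quantization of the outline is omitted.

```python
def fine_outline(second_flattened, window=(0.25, 0.5, 0.25)):
    """Smooth the magnitudes with a short window (width 3 by default).

    Assumption: edge samples are repeated where the window runs off
    either end of the frame.
    """
    half = len(window) // 2
    last = len(second_flattened) - 1
    outline = []
    for i in range(len(second_flattened)):
        acc = 0.0
        for j, weight in enumerate(window):
            index = min(max(i + j - half, 0), last)
            acc += weight * abs(second_flattened[index])
        outline.append(acc)
    return outline

def flatten(signal, outline, floor=1e-9):
    """Divide each sample by the outline, guarding against division by zero."""
    return [x / max(o, floor) for x, o in zip(signal, outline)]
```

A decoder that receives the same outline can undo the division by multiplying each sample by the corresponding outline value.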

[0036] When vector quantization is used, all samples of the frame may be quantized together, but dividing the sample sequence of the frame into a plurality of subvectors and quantizing each subvector is more practical in terms of computational cost. The division may be a simple subband division, or an interleaved division in which the samples are interleaved before being divided. Adaptive bit allocation may also be performed according to the amount of information required for quantization.
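The interleaved division mentioned in [0036] can be sketched in Python with strided slices. The function names are hypothetical, and the sketch assumes that subvector i simply takes every n-th sample starting at sample i.

```python
def interleave_split(samples, n_subvectors):
    """Subvector i takes samples i, i + n, i + 2n, ... of the frame."""
    return [samples[i::n_subvectors] for i in range(n_subvectors)]

def interleave_merge(subvectors):
    """Inverse of interleave_split: restore the original sample order."""
    n_subvectors = len(subvectors)
    merged = [None] * sum(len(v) for v in subvectors)
    for i, subvector in enumerate(subvectors):
        merged[i::n_subvectors] = subvector
    return merged
```

With this layout every subvector draws samples from across the whole frequency range, in contrast to a simple subband division, where each subvector covers one contiguous band.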

[0037] Next, the decoder B is described. As shown in FIG. 1, the decoder B comprises a regenerator 9, a fine spectrum outline regenerator 10, a first inverse flattener 11, a pitch regenerator 12, an adder 13, a global outline regenerator 14, a second inverse flattener 15, and a time-frequency inverse converter 16.

[0038] Of these, the regenerator 9 reproduces the third flattened signal from the quantization index transmitted from the encoder A. It does so by performing the inverse of the processing of the quantizer 8, and outputs the reproduced signal to the first inverse flattener 11. The fine spectrum outline regenerator 10 reproduces the fine spectrum outline from the quantized fine spectrum outline index transmitted from the encoder A.

[0039] The first inverse flattener 11 applies the fine spectrum outline to the third flattened signal input from the regenerator 9 to reproduce the second flattened signal, and outputs it to the adder 13. The pitch regenerator 12 reproduces the pitch component from the quantized pitch component index and the quantized pitch fundamental frequency index transmitted from the encoder A, and outputs it to the adder 13.

[0040] The adder 13 adds the pitch component input from the pitch regenerator 12 to the second flattened signal input from the first inverse flattener 11 to reproduce the first flattened signal, and outputs it to the second inverse flattener 15. The global outline regenerator 14 reproduces the global outline from the quantized global outline index transmitted from the encoder A and outputs it to the second inverse flattener 15.

[0041] The second inverse flattener 15 applies the global outline input from the global outline regenerator 14 to the first flattened signal input from the adder 13 to generate the frequency domain signal. The time-frequency inverse converter 16 then decodes the frequency domain signal input from the second inverse flattener 15 by applying an inverse time-frequency transform to it, and outputs a time domain speech signal or musical tone signal.
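The frequency domain part of the decoder B can be summarized in a short Python sketch. It assumes that every index has already been decoded into plain lists, that the outlines are applied by multiplication (the inverse of the division used for flattening in the encoder), and that the pitch component is given as `(center_bin, samples)` pairs; the function name `decode_frame` is hypothetical, and the final inverse time-frequency transform is omitted.

```python
def decode_frame(third_flattened, fine_outline, pitch_groups, global_outline):
    """Rebuild the frequency domain signal from the decoded components."""
    # First inverse flattener 11: restore the fine spectrum outline.
    second = [t * f for t, f in zip(third_flattened, fine_outline)]
    # Adder 13: put the pitch component back at its harmonic positions.
    first = list(second)
    for center, samples in pitch_groups:
        start = center - len(samples) // 2
        for offset, value in enumerate(samples):
            first[start + offset] += value
    # Second inverse flattener 15: restore the global outline.
    return [x * g for x, g in zip(first, global_outline)]
```

The order of operations mirrors the encoder in reverse: the fine spectrum outline is restored first, then the pitch component, and finally the global outline.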

[0042]

[Effects of the Invention] As described above, according to the present invention, when a musical tone signal or speech signal having a pitch component is encoded, the regularity of the spike-like pitch components that appear in the frequency domain signal obtained by transforming the signal into the frequency domain is exploited to encode those components with high efficiency. As a result, flatter residual coefficients can be obtained, and the efficiency of the encoder as a whole can be improved.

[Brief description of drawings]

FIG. 1 is a diagram illustrating an encoder and a decoder according to an embodiment of the present invention.

FIG. 2 is a diagram showing the frequency characteristic of the output signal of the time-frequency converter in the present invention.

FIG. 3 is a diagram showing the detailed configuration of the pitch encoder in the present invention.

FIG. 4 is a diagram showing the frequency characteristic of the second flattened signal in the present invention.

FIG. 5 is a first diagram illustrating a conventional transform coding method.

FIG. 6 is a second diagram illustrating a conventional transform coding method.

[Explanation of symbols]

1 Time-frequency converter
2 Global outline calculator/quantizer
3 First flattener
4 Pitch encoder
5, 13 Adder
6 Fine spectrum outline calculator/quantizer
7 Second flattener
8 Quantizer
9 Regenerator
10 Fine spectrum outline regenerator
11 First inverse flattener
12 Pitch regenerator
14 Global outline regenerator
15 Second inverse flattener
16 Time-frequency inverse converter

Claims

[Claims]

1. A transform coding method for a musical tone signal or speech signal containing a pitch component, comprising: a first step of dividing the musical tone signal or speech signal into frames at fixed time intervals and transforming them into a frequency domain signal; a second step of extracting only the pitch component from the frequency domain signal, separating it, and encoding it; and a third step of encoding a signal obtained by removing the pitch component from the frequency domain signal.

2. The transform coding method according to claim 1, wherein the second step comprises a fourth step of obtaining and encoding the fundamental frequency of the pitch component, and a fifth step of extracting and encoding the pitch component based on the fundamental frequency.

3. The transform coding method according to claim 2, wherein the fifth step extracts the pitch component in units each consisting of a plurality of consecutive samples that are centered on, and include, the sample of the frequency domain signal closest to a frequency that is a natural-number multiple of the fundamental frequency.

4. The transform coding method according to claim 2 or 3, wherein the fifth step performs the encoding by vector-quantizing each unit of the pitch component.

5. A transform decoding method for decoding a code obtained by the transform coding method according to claim 1, comprising: a sixth step of decoding the pitch component encoded in the second step; a seventh step of decoding the signal, encoded in the third step, that was obtained by removing the pitch component from the frequency domain signal; and an eighth step of transforming into a time domain signal the frequency domain signal obtained by combining the decoded output of the sixth step and the decoded output of the seventh step.

6. The transform decoding method according to claim 5, wherein the sixth step comprises a ninth step of decoding the fundamental frequency of the pitch component obtained in the fourth step, and a tenth step of arranging the pitch component as a frequency domain signal based on the fundamental frequency of the pitch component obtained in the ninth step.

7. The transform decoding method according to claim 6, wherein the tenth step arranges the pitch component in units each consisting of consecutive samples that include the frequency domain sample closest to a frequency that is a natural-number multiple of the fundamental frequency.

8. The transform decoding method according to claim 6 or 7, wherein the tenth step decodes the vector-quantized index for each unit of the pitch component.