JPS5994797A

JPS5994797A - Adaptive conversion coding system for voice

Info

Publication number: JPS5994797A
Application number: JP57204850A
Authority: JP
Inventors: 守谷健弘; 誉田雅彰
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1982-11-22
Filing date: 1982-11-22
Publication date: 1984-05-31
Also published as: JPS5936280B2

Abstract

(57) [Summary] This publication contains application data filed before electronic filing, so no abstract data is recorded.

Description

DETAILED DESCRIPTION OF THE INVENTION This invention relates to an adaptive transform coding system that transforms a speech signal into the frequency domain and adaptively varies its quantization.

<Prior art> A speech coding system of this kind is disclosed, for example, in Japanese Patent Application Laid-Open No. 55-57900, "Audio Signal Processing Circuit". In this system, as shown in FIG. 1, input speech from input terminal 11 is sampled at, for example, 8 kHz, and each sample value is fed as a digital signal to orthogonal transform section 12. Orthogonal transform section 12 converts a fixed number of input speech samples S1, ..., S2n, shown for example in FIG. 2A, by the discrete Fourier transform into frequency-domain signals (a spectrum) f1, f2, ..., fn (FIG. 2B), which are sent to adaptive quantization section 13. Meanwhile, the input speech at terminal 11 is also fed to spectral envelope extraction section 14, where the envelope of the input speech spectrum is estimated by linear prediction analysis. This spectral envelope and the pitch period are supplied to adaptive information allocation section 15.

Adaptive information allocation section 15 adaptively varies the number of quantization bits that quantization section 13 uses for each of the signals f1, f2, ..., fn according to the instantaneous level of the spectral envelope at that signal, also taking the pitch period into account: more bits are allocated where the level is large and fewer where it is small. The information quantized in this way and the information indicating the bit allocation are combined in combining circuit 16 and sent out as the coded output.

With this technique, a speech signal sampled at 8 kHz can be coded efficiently at about 16 kbps, giving high-quality speech. However, the bit allocation information itself requires about 2 kbps, so when coding at a total of 9.6 kbps (one of the commonly used transmission rates) or less, the signals f1, f2, ..., fn must be quantized at 1 bit/sample or less. In that case almost no information can be allocated to intervals where the frequency components are weak, which causes severe degradation of speech quality.
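
The 1 bit/sample limit follows directly from the rates quoted above. The short Python sketch below only repeats that arithmetic; the constant and function names are illustrative and do not come from the patent.

```python
# Illustrative arithmetic only; the rates are the approximate figures quoted in the text.
SAMPLE_RATE_HZ = 8000   # 8 kHz sampling
TOTAL_RATE_BPS = 9600   # target transmission rate
SIDE_INFO_BPS = 2000    # approximate rate of the bit-allocation information


def bits_per_sample(total_bps, side_bps, sample_rate_hz):
    """Bits left for each spectral sample once the side information is paid for."""
    return (total_bps - side_bps) / sample_rate_hz


budget = bits_per_sample(TOTAL_RATE_BPS, SIDE_INFO_BPS, SAMPLE_RATE_HZ)
print(budget)  # slightly below 1 bit per sample
```

With per-sample (scalar) quantization, a budget below 1 bit per sample means that some components necessarily receive zero bits.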

<Summary of the Invention> An object of this invention is to provide an adaptive transform coding system for speech that transforms the input speech signal into the frequency domain, divides the transformed spectrum into blocks, and performs vector quantization block by block, adaptively according to spectral envelope information, so that degradation of speech quality stays small even at coding rates of, for example, 9.6 kbps or less. In addition, flattening the spectrum before dividing it into blocks allows the vector quantization to be performed efficiently.

<First Embodiment> FIG. 3 shows an embodiment of the speech coding system according to this invention. The input signal from terminal 11 is converted in orthogonal transform section 12, frame by frame, into a frequency-domain signal, i.e. a spectrum, by an orthogonal transform such as the discrete Fourier transform (DFT) or the discrete cosine transform (DCT). In spectrum smoothing section 17 this spectrum is flattened globally using spectral envelope information that has been obtained and quantized separately.

That is, the spectral envelope of the input speech at terminal 11 is estimated by linear prediction analysis in spectral envelope extraction section 14. This spectral envelope information and the speech power are quantized as auxiliary information in quantization section 18, the quantized output is decoded by local decoding section 19, and spectrum smoothing section 17 divides the spectrum f1, f2, ..., fn by the decoded auxiliary information.
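
The division performed by spectrum smoothing section 17 can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the decoded auxiliary information is a set of linear prediction coefficients a_1..a_m of A(z) = 1 + a_1 z^-1 + ... + a_m z^-m plus a gain, and that the envelope is evaluated as gain/|A| at the DFT frequencies. The function names are hypothetical.

```python
import cmath
import math


def lpc_envelope(lpc, gain, n_bins):
    """Magnitude envelope gain / |A(e^{jw})| at n_bins uniformly spaced frequencies.

    lpc holds a_1..a_m of A(z) = 1 + a_1 z^-1 + ... + a_m z^-m (assumed sign convention).
    """
    env = []
    for k in range(n_bins):
        w = 2.0 * math.pi * k / n_bins
        a = 1.0 + sum(c * cmath.exp(-1j * w * (i + 1)) for i, c in enumerate(lpc))
        env.append(gain / abs(a))
    return env


def flatten(spectrum, envelope):
    """Encoder side: divide each spectral component by the decoded envelope."""
    return [f / e for f, e in zip(spectrum, envelope)]


def restore(flat, envelope):
    """Decoder side: multiply the flattened spectrum by the same envelope."""
    return [f * e for f, e in zip(flat, envelope)]


envelope = lpc_envelope([-0.9], 2.0, 8)
spectrum = [complex(k + 1, -k) for k in range(8)]
flat = flatten(spectrum, envelope)
```

Because the encoder flattens with the locally decoded envelope rather than the unquantized one, the decoder can undo the division exactly with the envelope it reconstructs from the same auxiliary information.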

In block division section 21 this flattened spectrum is divided, as shown in FIG. 4, into blocks of p consecutive components each: $F_1 = (f_{11}, f_{12}, \ldots, f_{1p})$, $F_2 = (f_{21}, f_{22}, \ldots, f_{2p})$, ..., $F_s = (f_{s1}, f_{s2}, \ldots, f_{sp})$. Each spectral component $f_{ij}$ ($i = 1, \ldots, s$; $j = 1, \ldots, p$) consists of a real part $R(f_{ij})$ and an imaginary part $I(f_{ij})$. For each block, a vector whose elements are the real parts is formed, $R(F_1) = (R(f_{11}), R(f_{12}), \ldots, R(f_{1p}))$, $R(F_2) = (R(f_{21}), R(f_{22}), \ldots, R(f_{2p}))$, ..., $R(F_s) = (R(f_{s1}), R(f_{s2}), \ldots, R(f_{sp}))$, and likewise s vectors whose elements are the imaginary parts, $I(F_i) = (I(f_{i1}), I(f_{i2}), \ldots, I(f_{ip}))$.
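
The block division described above can be sketched as follows. This is a minimal illustration that assumes the number of spectral components is an exact multiple of p; `split_blocks` is a hypothetical name.

```python
def split_blocks(spectrum, p):
    """Split a flattened spectrum into consecutive blocks of p components.

    Returns (real_vectors, imag_vectors): one real-part vector R(F_i) and one
    imaginary-part vector I(F_i) per block.
    """
    if len(spectrum) % p != 0:
        raise ValueError("spectrum length must be a multiple of p in this sketch")
    blocks = [spectrum[i:i + p] for i in range(0, len(spectrum), p)]
    real_vectors = [[f.real for f in blk] for blk in blocks]
    imag_vectors = [[f.imag for f in blk] for blk in blocks]
    return real_vectors, imag_vectors


demo_spectrum = [complex(k, -k) for k in range(6)]
real_vectors, imag_vectors = split_blocks(demo_spectrum, 3)  # s = 2 blocks, p = 3
```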

Vector quantization section 22 performs vector quantization by detecting which standard vector in a previously prepared dictionary each of these vectors matches best.

In other words, a number of expected standard vectors are stored as a dictionary; the section detects which standard vector the input speech vector is closest to and outputs a code, such as an index number, identifying the matching or most similar standard vector. Coding therefore takes fewer bits than quantizing the strength of each spectral component individually. Moreover, the bit allocation for this vector quantization is varied adaptively.
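
The dictionary search can be sketched as a nearest-neighbour lookup. The sketch below uses the squared Euclidean distance and a tiny hand-made dictionary; the dictionary contents and the function name are illustrative assumptions, not data from the patent.

```python
def nearest_codeword(vector, codebook):
    """Return (index, squared distance) of the standard vector closest to `vector`."""
    best_index, best_dist = 0, float("inf")
    for index, codeword in enumerate(codebook):
        dist = sum((v - c) ** 2 for v, c in zip(vector, codeword))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index, best_dist


# Four standard vectors of dimension p = 2, so one index costs 2 bits.
codebook = [[0.0, 0.0], [1.0, 1.0], [1.0, -1.0], [-1.0, 0.0]]
index, dist = nearest_codeword([0.9, -0.8], codebook)
```

Only the index is transmitted; a decoder holding the same dictionary recovers the standard vector from it.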

That is, a number of bits is allocated to each block according to the decoded auxiliary information output by local decoding section 19. In general, many bits are allocated to blocks that contain strong spectral components and few to blocks that contain weak ones. When many bits are allocated, vector quantization section 22 refers to a dictionary containing many standard vectors to compare against; when few bits are allocated, it refers to a dictionary containing few standard vectors. Since the number of elements p of a standard vector is fixed, the standard vectors stored in a dictionary with many entries represent even fine patterns, whereas those stored in a dictionary with few entries represent only coarse patterns.

This adaptive information allocation (bit allocation) aims to maximize the frame-by-frame SN ratio between the input signal and the output signal. Because an orthogonal transform leaves the SN ratio unchanged, it suffices to minimize the distortion between the output of spectrum smoothing section 17 in encoder 24 and the reproduced spectrum output in decoder 25 on the receiving side; the distortion measure is the Euclidean distance. The distortion D per frame is given by the following expression.

Further, the total number of samples (spectral components) is $p \cdot s$, and the average information per sample (average number of quantization bits) B is $B = \frac{1}{p \cdot s} \sum_{j=1}^{s} b_j$, with $b_j$ being the number of bits allocated to block j. Since B is held constant, the number of quantization bits $b_j$ that minimizes the distortion D is given by the following expression.

These $b_j$ are rounded to integer values, and quantization is carried out by selecting, from a dictionary of $2^{b_j}$ entries, the entry that gives the minimum distortion.
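
The allocation expression itself is not reproduced in this text. As a stand-in, the sketch below uses a common textbook rule derived under the assumption that the distortion of block j falls off as $\sigma_j^2 \, 2^{-2 b_j / p}$, which gives $b_j = pB + \frac{p}{2} \log_2(\sigma_j^2 / \bar{\sigma}^2)$ with $\bar{\sigma}^2$ the geometric mean of the block energies. This rule, the greedy integer correction, and all names are assumptions for illustration and may differ from the patent's own formula.

```python
import math


def allocate_bits(block_energy, p, avg_bits):
    """Integer bits per block under a fixed total budget of round(p * s * avg_bits).

    block_energy: positive envelope energies sigma_j^2, one per block (assumed input).
    """
    s = len(block_energy)
    mean_log = sum(math.log2(e) for e in block_energy) / s
    ideal = [p * avg_bits + 0.5 * p * (math.log2(e) - mean_log) for e in block_energy]
    bits = [max(0, round(b)) for b in ideal]
    budget = round(p * s * avg_bits)
    # Greedy correction: move single bits where the rounding/clipping error is largest.
    while sum(bits) != budget:
        if sum(bits) < budget:
            j = max(range(s), key=lambda k: ideal[k] - bits[k])
            bits[j] += 1
        else:
            candidates = [k for k in range(s) if bits[k] > 0]
            j = min(candidates, key=lambda k: ideal[k] - bits[k])
            bits[j] -= 1
    return bits


bits = allocate_bits([16.0, 4.0, 1.0, 0.25], p=4, avg_bits=1.0)
dictionary_sizes = [2 ** b for b in bits]  # block j searches a dictionary of 2**b_j entries
```

Clipping negative allocations to zero overshoots the budget, which is why the correction loop is needed. A block left with 0 bits uses a one-entry dictionary and costs nothing to transmit.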

The quantization in quantization section 18 can also be vector quantization. The quantized output of the spectral envelope, i.e. the auxiliary information, and the waveform information output by vector quantization section 22 are combined and sent to decoder 25 as the coded output.

In decoder 25, smoothed-spectrum reproduction section 26 uses the same dictionary as vector quantization section 22 in encoder 24 to read out, from the incoming waveform information, the standard vector identified by each block's quantization code, and thereby reproduces the smoothed spectrum.

Meanwhile, spectral envelope reproduction section 27 reproduces the spectral envelope from the incoming auxiliary information, and spectrum reproduction section 28 multiplies the reproduced smoothed spectrum by this envelope and the power to reproduce the spectrum.

Inverse transform section 29 transforms this reproduced spectrum back into the time domain, giving the reproduced speech signal at output terminal 31.
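
The decoder path (dictionary lookup, envelope multiplication, inverse transform) can be sketched as below. For brevity the sketch assumes a single shared dictionary for every block, an envelope that already includes the power, and a plain inverse DFT; in the described system the dictionary size varies per block. All names are hypothetical.

```python
import cmath
import math


def decode_frame(real_idx, imag_idx, codebook, envelope):
    """Rebuild time-domain samples from per-block dictionary indices."""
    flat = []
    for ri, ii in zip(real_idx, imag_idx):
        # Pair the real-part and imaginary-part standard vectors of one block.
        flat.extend(complex(r, i) for r, i in zip(codebook[ri], codebook[ii]))
    spectrum = [f * e for f, e in zip(flat, envelope)]
    n = len(spectrum)
    # Inverse DFT written out directly, O(n^2), to stay dependency-free.
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


demo_codebook = [[1.0, 0.0], [0.0, 0.0]]
samples = decode_frame([0, 1], [1, 1], demo_codebook, [2.0, 1.0, 1.0, 1.0])
```

The samples come back as complex numbers; for a real speech frame the spectrum is conjugate symmetric and the imaginary parts vanish up to rounding.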

<Second Embodiment> In the above, spectrum smoothing was performed after the orthogonal transform, but the orthogonal transform may instead be performed after the input speech has passed through an inverse filter. For example, as shown in FIG. 5, where parts corresponding to those in FIG. 3 carry the same reference numerals, the input speech signal from input terminal 11 is supplied to orthogonal transform section 12 through inverse filter 32.

Meanwhile, the spectral envelope of the input speech signal is analyzed by linear prediction analyzer 33. The resulting prediction coefficients are vector quantized in quantization section 18, the quantized output is decoded by local decoding section 19, and the decoded output, i.e. the linear prediction coefficients, controls the filter constants of inverse filter 32. The output of inverse filter 32 is a residual signal, which is orthogonally transformed, coded in the same manner as described above, and sent out. In decoder 25, spectrum reproduction section 26 decodes the vector-quantized codes to reproduce the spectrum of the residual signal, which is inversely transformed into the time domain and passed to linear prediction synthesis filter section 34.

The filter constants of synthesis filter section 34 are controlled by the prediction coefficients reproduced by spectral envelope reproduction section 27, and the speech signal is reproduced by filter section 34.

FIG. 6A shows the waveform a1 of an input speech signal together with the real-part waveform a2 and the imaginary-part waveform a3 of its orthogonal transform output. FIG. 6B shows the residual signal waveform b1 obtained by passing the input speech waveform a1 through inverse filter section 32, together with the real-part waveform b2 and the imaginary-part waveform b3 of the orthogonal transform output of this residual signal.

The spectral envelope b4 of the speech input waveform a1 and the bits b5 allocated to each block are also shown. This is an example with p = 13 and B = 1.0.

In the above, quantization efficiency can be raised further by adapting the dimension p of the vector that forms the unit of quantization to the pitch frequency of the input speech and making the length of one frame an integer multiple of the pitch period. In this case the pitch frequency varies with time, so it is also included in the auxiliary information.

It is also possible to process each vector as a unit of complex numbers instead of treating the real and imaginary parts independently. Each of the sections described above can be implemented on separate electronic computers or on a common one.

<Effects> As explained above, dividing the signal flattened in the frequency domain into blocks and allocating information adaptively raises quantization efficiency. In particular, at 9.6 kbps or less, speech can be reproduced with a higher SN ratio than with the conventional adaptive transform coding system based on scalar quantization. Flattening in the frequency domain reduces the number of standard vectors needed for vector quantization. Moreover, only the amount of information allocated per block needs to be an integer, so the information per sample can be allocated finely in units of 1/p bit. This avoids the perceptual degradation of the conventional system, which arose because some frequency components received no information at all and the set of such components changed adaptively.

Next, an experimental example is described. FIG. 7 shows the SN ratio against the information amount B (bits/sample) for a sampling frequency of 8 kHz, a linear prediction analysis order of 8 in analysis section 33, an analysis length (transform section) of 26 to 31 ms, an analysis overlap of 2 ms (joined with a trapezoidal window), and a vector dimension of 6 to 12 (pitch adaptive). In FIG. 7, curve 41 is for uniform quantization with each sample coded individually; curve 42 for uniform quantization with fixed six-dimensional vector coding; curve 43 for uniform quantization with the vector dimension varied according to the pitch frequency; curve 44 for adaptive quantization with each sample coded individually (the conventional system); curve 45 for the system of this invention with fixed six-dimensional vector quantization; and curve 46 for the system of this invention with the vector dimension varied adaptively according to the pitch frequency. These results show that adaptive quantization (curves 44 to 46) outperforms uniform quantization (curves 41 to 43), and that among the adaptive schemes the system of this invention (curves 45 and 46) outperforms the conventional system (curve 44). In the range of 0.5 to 1.1 bits/sample, the SN ratio improvement, even on speech outside the training samples, was 2.5 dB for female voices and about x.0 dB for male voices (leading digit illegible in the source). Because vector quantizing the spectral envelope as well lets the auxiliary information, including pitch and power, be estimated at about 800 bps, coding at 4.8 kbps is possible when the information B per residual-signal sample is 0.5, and at 9.6 kbps when it is 1.1.
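
The two total rates quoted above follow from the sampling frequency, the per-sample information B, and the estimated auxiliary information rate. The sketch below only repeats that arithmetic; the names are illustrative.

```python
SAMPLE_RATE_HZ = 8000  # 8 kHz sampling, as in the experiment
SIDE_INFO_BPS = 800    # estimated auxiliary information (envelope, pitch, power)


def total_rate_bps(bits_per_sample):
    """Waveform information plus auxiliary information, in bits per second."""
    return SAMPLE_RATE_HZ * bits_per_sample + SIDE_INFO_BPS


for b in (0.5, 1.1):
    print(b, round(total_rate_bps(b)))
```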

[Brief explanation of drawings]

FIG. 1 is a block diagram showing the conventional adaptive transform coding system; FIG. 2 is a diagram explaining its operation; FIG. 3 is a block diagram showing an example of the adaptive transform coding system according to this invention; FIG. 4 is a diagram showing an example of its block division; FIG. 5 is a block diagram showing another example of this invention; FIG. 6 is a diagram showing an example of its operation; and FIG. 7 is a diagram showing the relationship between SN ratio and information amount B for various coding systems.

11: speech input; 12: orthogonal transform section; 14: spectral envelope extractor; 17: spectrum smoothing section; 18: vector quantizer; 19: local decoder; 21: block division section; 22: vector quantizer; 23: adaptive information allocation section; 24: encoder.

Patent applicant: Nippon Telegraph and Telephone Public Corporation. Agent: Kusano.

Claims

[Claims]

(1) A coding system in which a spectrum is obtained by orthogonally transforming the sample value sequence of a speech signal frame by frame and is adaptively quantized, comprising: spectral envelope extraction means for determining the envelope of the spectrum, quantizing it together with the power, and coding it as auxiliary information; local decoding means for decoding the auxiliary information; smoothing means for converting the spectrum, using the decoded auxiliary information, into a spectral signal sequence flattened along the frequency axis; block division means for dividing the spectral signal sequence into blocks along the frequency axis; adaptive information allocation means for adaptively allocating information to each divided block using the auxiliary information; and vector quantization means for vector quantizing the spectral signal sequence.

(2) An adaptive transform coding system for speech, in which the sample value sequence of a speech signal is analyzed and coded frame by frame, comprising: linear prediction analysis means for performing linear prediction analysis of the speech signal to determine its spectral envelope, and quantizing and coding the power and the spectral envelope as auxiliary information; local decoding means for decoding the auxiliary information; inverse filter means whose filter constants are controlled by the linear prediction coefficients of the decoded auxiliary information and which receives the sample value sequence of the speech signal and outputs a residual signal; orthogonal transform means for obtaining a spectrum by orthogonally transforming the residual signal; block division means for dividing this spectrum into blocks along the frequency axis; adaptive information allocation means for adaptively allocating information to each divided block using the auxiliary information; and vector quantization means for vector quantizing the divided spectral residual signal sequence according to that allocation.