JPS594718B2

JPS594718B2 - Audio encoding method

Info

Publication number: JPS594718B2
Application number: JP56067856A
Authority: JP
Inventors: 信彦北脇; 雅彰誉田
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1981-05-06
Filing date: 1981-05-06
Publication date: 1984-01-31
Also published as: JPS57182797A

Description

DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to a speech encoding method that predictively encodes a speech signal while adaptively changing the encoding characteristics according to the properties of the signal.

Such a predictive coding method was proposed in Japanese Patent Application No. Sho 54-42858, "Adaptive Predictive Coding Method".

That is, in this method the speech signal is divided into a plurality of bands, prediction coefficients are obtained for each band, and the signal in each band is predictively encoded using those coefficients. In doing so, the amount of transmitted information, that is, the number of quantization levels, is allocated non-uniformly in time according to the ratio of power among the signals of the divided frequency bands and the temporal localization of the prediction residual power, so that the quantization error between the signal before and after encoding is small. In this way good quality, and therefore naturalness, is obtained even when the transmission capacity is relatively small. However, this conventional adaptive bit allocation predictive coding method sends a prediction residual signal for every divided frequency band. The prediction residual signal carries a relatively large amount of information, so the method has the drawback that the total amount of transmitted information becomes large.
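The allocation idea described above can be sketched as follows. This is only an illustration and not the procedure of Application No. Sho 54-42858: the greedy rule (give the next bit to the band/time cell whose quantization noise is currently largest, assuming each added bit lowers the noise power by a factor of 4, roughly 6 dB) is a common textbook rule swapped in here, and the function name `allocate_bits` and the cell layout are hypothetical.

```python
import heapq


def allocate_bits(residual_power, total_bits):
    """Greedy bit allocation over (band, time-segment) cells.

    residual_power: list of rows, one row per band; each row holds the
        residual power of that band in successive time segments.
    total_bits: fixed bit budget for the frame (constant information rate).

    Assumption (not from the patent): each bit given to a cell reduces its
    quantization noise power by a factor of 4.
    """
    bits = [[0] * len(row) for row in residual_power]
    # heapq is a min-heap, so store negated noise power to pop the largest.
    heap = [(-p, b, t)
            for b, row in enumerate(residual_power)
            for t, p in enumerate(row)]
    heapq.heapify(heap)
    for _ in range(total_bits):
        neg_noise, b, t = heapq.heappop(heap)
        bits[b][t] += 1
        heapq.heappush(heap, (neg_noise / 4.0, b, t))
    return bits


# Two bands, two time segments; most of the power sits in band 0, segment 0.
print(allocate_bits([[16.0, 1.0], [4.0, 1.0]], 3))
```

Cells with little residual power receive few or no bits, which is the time-frequency non-uniformity the text describes; the budget itself stays fixed.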

An object of the present invention is to provide a speech encoding method that offers good quality and naturalness with a small amount of transmitted information.

According to this invention, the speech signal is divided into a plurality of frequency bands. Several bands, counted from the lowest, are encoded using adaptive predictive coding, and for the bands above them only the prediction coefficients representing the spectral characteristics are encoded and sent.

In decoding, the low-band components are obtained by decoding the adaptive predictive code, while the high-band components are decoded from the prediction coefficients and a pseudo residual signal created from the low band.

FIG. 1 shows an embodiment of the speech encoding method according to this invention.

The digitized speech signal from the input terminal 1 is divided by the low-band signal converter 2 into a plurality of frequency bands, for example L1, L2, and H as shown in FIG. 2A. Each band is resampled at the Nyquist frequency of its bandwidth and converted into a low-band signal as shown in FIG. 2B. For each of the low-band signals L1', L2', and H', the linear prediction analysis unit 3 extracts the prediction coefficients of that band, and the residual power calculation unit 4 extracts the residual power divided two-dimensionally by band and time.

Of the low-band signals, the channels L1' and L2', which correspond to the low frequencies of the speech signal, are predictively encoded by the predictive encoder 6. In the predictive encoder 6, the prediction coefficients, the quantization step size, and the number of quantization levels are each changed adaptively according to the short-time characteristics of the low-band signal. The number of quantization levels is allocated in time and frequency by the bit allocation unit 5, according to the imbalance of residual power between bands and its temporal imbalance within each band, under the condition that the information rate is constant. The specific techniques are given in the specification of Japanese Patent Application No. Sho 54-42858.

The transmission path encoder 7 converts the following into a bit sequence and sends it to the transmission path 8. For the low-frequency channels L1' and L2': parameter information such as the prediction coefficients, the residual power, the period of the sub-intervals, and the position of the sub-intervals relative to the frame, together with the residual signal (the output of the predictive encoder). For the high-frequency channel H': the prediction coefficients and the average residual power in each frame.
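The per-band analysis performed by the linear prediction analysis unit 3 and the residual power calculation unit 4 can be sketched as below. The patent does not say how the prediction coefficients are computed; the Levinson-Durbin recursion used here is the standard autocorrelation-method solver and is an assumption, as are all function names and the fixed segment length.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of the samples in x."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve for predictor coefficients a[0..order-1] from autocorrelation r.

    Convention: x_hat[n] = sum_k a[k] * x[n - 1 - k].
    Returns (a, err), where err is the remaining prediction error energy.
    """
    a = [0.0] * (order + 1)  # a[0] is unused during the recursion
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


def prediction_residual(x, a):
    """Residual e[n] = x[n] - sum_k a[k] * x[n - 1 - k]."""
    return [x[n] - sum(a[k] * x[n - 1 - k]
                       for k in range(len(a)) if n - 1 - k >= 0)
            for n in range(len(x))]


def segment_power(e, seg_len):
    """Mean-square residual per time segment (one row of the band/time grid)."""
    chunks = [e[i:i + seg_len] for i in range(0, len(e), seg_len)]
    return [sum(v * v for v in c) / len(c) for c in chunks]
```

Running `segment_power` on the residual of every band yields the two-dimensional (band by time) power table that the bit allocation unit works from.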

On the decoding side, the transmission path decoder 9 separates the residual signal from the parameter information and decodes the parameter information.

As described in the specification of the above-mentioned patent application, the bit allocation unit 10 calculates the quantization step size and the number of quantization levels from the residual power. Using these, the residual signal decoder 11 decodes the residual signals SL1 and SL2 of the low-frequency channels (FIG. 2C), and the predictive decoder 12 converts them into low-band signals. Meanwhile, the band signal converter 13 converts the residual signals SL1 and SL2 into band residual signals SL1' and SL2' (FIG. 2D) that have the same sampling frequency as the input signal, that is, back into the frequency bands they occupied when the speech signal was divided into bands, before conversion to the low band. The pseudo residual signal generator 14 then processes these to obtain a pseudo residual signal SH'' (FIG. 2E) that has high-band components.
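The step in which the predictive decoder 12 turns a decoded residual back into a low-band signal can be illustrated with an all-pole synthesis filter. This is a minimal sketch under the assumption that the decoder simply inverts a linear predictor of the form x_hat[n] = sum_k a[k] * x[n - 1 - k]; the adaptive quantizer handling is omitted and the name `lpc_synthesize` is hypothetical.

```python
def lpc_synthesize(residual, a):
    """Rebuild a signal from its prediction residual.

    s[n] = e[n] + sum_k a[k] * s[n - 1 - k]
    where a[k] are the decoded prediction coefficients for the band.
    """
    out = []
    for n, e in enumerate(residual):
        pred = sum(a[k] * out[n - 1 - k]
                   for k in range(len(a)) if n - 1 - k >= 0)
        out.append(e + pred)
    return out


# A single residual pulse through a first-order predictor decays geometrically.
print(lpc_synthesize([1.0, 0.0, 0.0, 0.0], [0.5]))  # [1.0, 0.5, 0.25, 0.125]
```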

The pseudo residual signal generator 14 keeps the sample values of the band residual signals SL1' and SL2' only at intervals of several samples (every fourth sample in the figure), as indicated by the arrows in FIG. 3A, and sets all other sample values to zero. In doing so, it is preferable to reset the interval at which samples are kept at a timing corresponding to the pitch period, as shown in FIG. 3B. In FIG. 3B the last arrow comes after three samples. The pitch period can be calculated in the pitch period calculation unit 15 by a peak search of the autocorrelation coefficients of the band residual signals SL1' and SL2'. The pseudo residual signal SH'' is converted by the low-band signal converter 16 into a low-band residual signal for the high channel H and then converted into a low-band signal by the predictive decoder 12. The band signal converter 17 converts the predictively decoded low-band signals of the low channels L1' and L2' and of the high channel H' into the band signals L1, L2, and H of the respective bands, and these are added together to obtain the output speech signal.

As described above, this speech encoding method uses highly efficient waveform coding in the low band and, in the high band, does not transmit the residual signal but generates a pseudo residual signal on the receiving side, so the amount of transmitted information can be reduced significantly.
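The two operations described for the pseudo residual signal generator (autocorrelation peak search for the pitch period, and keeping one sample every few samples with a restart at each pitch period) can be sketched as follows. The function names, the lag search range, and the choice to align pitch periods with the start of the buffer are assumptions made for illustration; the patent does not specify them.

```python
def pitch_period(residual, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation."""
    n = len(residual)
    best_lag, best_val = min_lag, float("-inf")
    for lag in range(min_lag, min(max_lag, n - 1) + 1):
        val = sum(residual[i] * residual[i + lag] for i in range(n - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag


def sparse_residual(residual, keep_every, period=None):
    """Keep one sample every `keep_every` samples and zero the rest (FIG. 3A).

    If `period` is given, the spacing restarts at every multiple of `period`
    (FIG. 3B), so the last gap inside a period may be shorter.
    """
    out = [0.0] * len(residual)
    for n, v in enumerate(residual):
        phase = n % period if period else n
        if phase % keep_every == 0:
            out[n] = v
    return out


# Pulse train with a period of 5 samples.
train = [1.0 if i % 5 == 0 else 0.0 for i in range(20)]
print(pitch_period(train, 2, 12))

# Spacing of 4 with a pitch period of 11: the last gap before the reset is 3.
kept = sparse_residual([1.0] * 12, 4, period=11)
print([i for i, v in enumerate(kept) if v != 0.0])
```

With a spacing of 4 and a period of 11, samples are kept at 0, 4, 8 and then again at 11, which reproduces the three-sample final gap mentioned for FIG. 3B.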

In particular, when the pseudo residual signal is generated as shown in FIG. 3B, the harmonic structure is preserved, and good-quality speech can be provided at an information rate of 6.4 to 9.6 kbit/s. As a specific example, the speech band was divided into three bands, 0 to 800 Hz, 800 to 1600 Hz, and 1600 to 3200 Hz; the low range of 0 to 1600 Hz was encoded by adaptive bit allocation predictive coding, and 1600 to 3200 Hz was spectrally encoded. The resulting speech quality was confirmed to be superior in satisfaction and intelligibility to that obtained when the entire band is encoded by adaptive bit allocation predictive coding.
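For the three-band example, the per-band sampling rates after conversion to low-band signals follow directly from the bandwidths. The short calculation below assumes that "resampled at the Nyquist frequency of its bandwidth" means sampling at twice the bandwidth; the helper name `nyquist_rates` is hypothetical.

```python
def nyquist_rates(band_edges_hz):
    """Sampling rate (2 x bandwidth) for each band between consecutive edges."""
    return [2 * (hi - lo)
            for lo, hi in zip(band_edges_hz, band_edges_hz[1:])]


rates = nyquist_rates([0, 800, 1600, 3200])
print(rates)       # [1600, 1600, 3200]
print(sum(rates))  # 6400
```

The rates sum to 6400 samples per second, the same as sampling the full 0 to 3200 Hz band at twice its bandwidth, so the band split itself adds no samples.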

It was also confirmed that, compared with using a pulse signal as the pseudo residual signal, generating the pseudo residual signal by the method shown in FIG. 3B is effective in that it does not degrade the intelligibility of the speech.

[Brief explanation of drawings]

FIG. 1 is a block diagram showing an embodiment of the speech encoding method according to this invention, FIG. 2 is a diagram for explaining its operation, and FIG. 3 is a diagram for explaining the generation of the pseudo residual signal.

1: signal input terminal, 2: low-band signal converter, 3: linear prediction analysis unit, 4: residual power calculation unit, 5: bit allocation unit, 6: predictive encoder, 7: transmission path encoder, 8: transmission path, 9: transmission path decoder, 10: bit allocation unit, 11: residual signal decoder, 12: predictive decoder, 13: band signal converter, 14: pseudo residual signal generator, 15: low-band signal converter, 16: pitch extraction unit, 17: band signal converter, 18: signal output terminal.

Claims

[Claims]

1. A speech encoding method in which a speech signal is divided into a low frequency band and a high frequency band; the signal in the low frequency band is encoded by adaptive bit allocation predictive coding; for the signal in the high frequency band, prediction coefficients representing its spectral characteristics are obtained and encoded; and the predictive code for the low-frequency-band signal and the code of the prediction coefficients for the high-frequency-band signal are output.