JPH07106978A

JPH07106978A - Digital speech signal coding/decoding method for transfer of data

Info

Publication number: JPH07106978A
Application number: JP27587393A
Authority: JP
Inventors: Hajime Obinata; 肇小日向
Original assignee: Abit Corp
Current assignee: Abit Corp
Priority date: 1993-09-30
Filing date: 1993-09-30
Publication date: 1995-04-21
Anticipated expiration: 2014-11-08
Also published as: JP2971715B2

Abstract

PURPOSE: To compress and encode digital speech signals within a range that can be processed in real time when they are transmitted over a prescribed transmission line such as a telephone circuit, and to reproduce the speech signals at the receiving side by decoding the compressed, encoded signal. CONSTITUTION: A digital speech input signal 4 supplied at a prescribed sampling rate is passed through a narrow-band filter 10, and the signal components SF(F, T) are calculated for M frequency bands F and N consecutive processing times T, together with the time-axis maximum TMAX(F) and the frequency-axis maximum FMAX(T) of the signal components. The components SF(F, T) are then normalized. The normalized speech signal NS(F, T) is compared with a previously stored audible threshold TH(F). If the speech signals at all time values within a frequency band are below TH(F), those speech signals are not transmitted. The quantity of speech data to be transmitted is thus greatly reduced, by the quantity of data that is not transmitted, and the speech signals are coded accordingly. The procedure is reversed when the speech signals are reproduced.

Description

Detailed Description of the Invention

[0001]

[Industrial Field of Application] The present invention relates to methods for encoding and decoding digital audio signals in data transmission, and in particular to a method for compressing and encoding digital audio signal data in real time over ISDN and a method for decoding the encoded signal.

[0002]

[Prior Art] The coding of digital audio signals, and the related digital recording and transmission of audio signals, are already in practical use. For recording on the Digital Compact Cassette, for example, see Takefumi Fujimoto, "Key Points of the Philips DCC System: Features and Details of the Psycho-Acoustic PASC Code," Radio Gijutsu (Radio Technology), IA Publishing Co., Ltd., December 1991, pp. 156-161. That system uses high-efficiency audio signal coding (PASC: Precision Adaptive Subband Coding).

[0003] In this coding scheme, the audio signal is first fed to band-pass filters, which split it into, for example, 32 equally spaced bands. Because the DCC system normally uses a sampling frequency of 48 kHz, a bandwidth of 750 Hz is adopted. Using 512 input samples, the data are obtained in the 32 bands, and the audio signal is quantized and coded taking into account the human audible signal level and the frequency dependence of hearing sensitivity.

[0004] As is well known, the perception of sound depends strongly on frequency. Acoustic signals (sound pressure) near 0 Hz and above about 15 kHz cannot be detected by the human ear, whereas sensitivity is particularly high between 2 and 5 kHz. PASC exploits this to code audio signals more efficiently with almost no loss in perceived quality, making high-quality audio recording possible.

[0005] On the other hand, in the Information Network System (INS), the ISDN plan that Nippon Telegraph and Telephone Public Corporation (NTT) proposed as early as 1986, a data transmission rate of 64 kbps is used. Such a rate naturally places a tighter constraint on digital audio transmission than the DCC system does, namely a limit on the amount of data transmitted per unit time.

[0006] If a coding method such as PASC is applied to a system with such a limited data rate, coarse quantization can often reduce small signals to zero. As a result the sound may drop out, or, when a small signal is flowing continuously and an additional pulse-like signal arrives, the small actual audio signal may vanish.

[0007]

[Problem to Be Solved by the Invention] The object of the invention is to provide a method that can efficiently compress and encode audio signal data so that a high-quality audio signal is maintained even under a severely constrained data rate such as that of ISDN and can nevertheless be reproduced in real time, together with a method for decoding the compressed and encoded audio signal.

[0008]

[Means for Solving the Problem] According to the invention, this object is achieved by a method for compressing and encoding a digital audio signal for transmission, comprising the following processing steps:
a) a digital audio signal with a predetermined sampling frequency is input to a narrow-band multiplex digital filter and separated into multiple frequency bands, and the signal component S(F, T) of each frequency band F is obtained at successive times T at fixed intervals;
b) within each frequency band F, the time-axis maximum TMAX(F), which is the maximum absolute value of the signal components S(F, T), is obtained;
c) within each time T, the frequency-axis maximum FMAX(T), which is the maximum of the signal components S(F, T), is obtained;
d) the signal component S(F, T) is divided by the smaller of the time-axis maximum TMAX(F) and the frequency-axis maximum FMAX(T) to obtain the normalized signal component NS(F, T);
e) in each frequency band F, the time-axis maximum TMAX(F) is compared with the audible threshold TH(F), and the converted time-axis maximum CTMAX(F) is set to 0 when the former is less than or equal to the latter, and to the value TMAX(F) when the former is greater than the latter;
f) when the converted time-axis maximum CTMAX(F) is 0, subsequent transmission of the signals at all times T within that frequency band F is inhibited;
g) the normalized signal component NS(F, T) is converted to a quantized signal QS(F, T) using at least one code with a smaller number of bits;
h) as the data to be sent, a frame information code is formed in which the frame synchronization signal SYNC, the converted time-axis maxima CTMAX(F), the frequency-axis maxima FMAX(T), and the error check code CRC are arranged in sequence, and all quantized signals QS(F, T) on the same time axis of each frequency band F are appended as the audio signal data proper.

[0009] Further, according to the invention, the object is achieved by a method for decoding a compressed audio signal which, in order to receive and decode the compressed and encoded data signal produced by the above steps in the transmission of a digital audio signal, carries out the following processing steps:
i) starting from the frame synchronization signal SYNC in the received data block, the converted time-axis maxima CTMAX(F) and the frequency-axis maxima FMAX(T) are decoded from the data that follow;
j) the data are checked for errors using the error check code CRC in the data block, and an error-free data signal is extracted;
k) the data proper belonging to the audio signal are extracted from the data block and decoded to form the inverse-normalized audio signal TNS(F, T);
l) the inverse-normalized audio signal TNS(F, T) is multiplied by the smaller of the converted time-axis maximum CTMAX(F) and the frequency-axis maximum FMAX(T) to obtain the inverse audio signal TS(F, T);
m) a narrow-band digital inverse filter produces a digital output audio signal with the desired sampling frequency from the inverse audio signal TS(F, T).
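Step l) of the decoding method can be illustrated with a minimal Python sketch. The function name `denormalize`, the list-of-lists layout (one row per band F, one column per time T), and the toy values are assumptions made for illustration and do not appear in the specification.

```python
def denormalize(tns, ctmax, fmax):
    """Step l): scale each TNS(F, T) by the smaller of CTMAX(F) and FMAX(T).

    tns   -- list of M rows (bands), each a list of N values (times)
    ctmax -- list of M converted time-axis maxima
    fmax  -- list of N frequency-axis maxima
    """
    return [
        [tns[f][t] * min(ctmax[f], fmax[t]) for t in range(len(fmax))]
        for f in range(len(ctmax))
    ]

# Toy frame with M = 2 bands and N = 3 times (illustrative values only).
tns = [[1.0, -0.5, 0.25], [0.0, 0.0, 0.0]]
ctmax = [8.0, 0.0]            # the second band was suppressed at the encoder
fmax = [8.0, 4.0, 16.0]
ts = denormalize(tns, ctmax, fmax)
print(ts)
```

A band whose CTMAX(F) is 0 stays at zero after this step, which matches the fact that no data were sent for it.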

[0010] Further advantageous embodiments of the invention are set out in the dependent claims.

[0011]

[Operation] The invention uses the extremely narrow-band multiplex digital filter already proposed by the present inventor (a patent application for it is in preparation). Because this filter permits high-speed digital computation, it is particularly suited to real-time transmission processing of digital audio (PCM) signals. In one embodiment of the invention, the filter provides 512 frequency bands dividing the range from 0 to 15 kHz, so each individual filter has a bandwidth of 29 Hz. In particular, in the filter used in the invention each band-splitting filter affects only its adjacent bands, which yields extremely sharp-cutoff narrow-band filters. The audio signal can therefore be decomposed with very high resolution. Furthermore, according to the invention, a plurality of audio signal components at different times within the same frequency band are processed. In practice, because the number of frequency bands is large, even a considerably coarse quantization of the PCM audio signal produces distortion only within a narrow band and therefore does not degrade the quality of the reproduced audio signal.
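The two bandwidth figures quoted in the text (750 Hz for DCC in paragraph [0003], 29 Hz here) follow from dividing the covered frequency range by the number of bands. The short Python check below only restates that arithmetic; the variable names are illustrative.

```python
# Per-band width = covered frequency range / number of bands.
dcc_band_hz = (48_000 / 2) / 32     # DCC: 0 to 24 kHz (half the 48 kHz sampling rate), 32 bands
narrow_band_hz = 15_000 / 512       # this embodiment: 0 to 15 kHz, 512 bands

print(dcc_band_hz)                  # 750.0
print(round(narrow_band_hz, 1))     # 29.3
```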

[0012] If, as in the PASC scheme described above, the signal-level curve of the audible threshold and the frequency dependence of perceived sound pressure are taken into account, signal processing of inaudible audio signals can be omitted. This makes it possible to secure effectively the amount of information that must be transmitted when audio is sent over ISDN, where the transmission capacity is limited. The amount of signal data characterizing the audio signal can thus be compressed considerably without degrading the quality of the audible audio signal. In the invention the compressed, encoded audio signal is decoded and reproduced; the procedure is the reverse of the encoding method, so computation routines similar to those used for encoding can be used.

[0013]

[Embodiment] The invention is described in more detail below with reference to the preferred embodiment shown in the drawings.

[0014] As shown in Fig. 1, a digital audio input signal (PCM signal) with a predetermined sampling frequency, denoted by reference numeral 4, is fed to the extremely narrow-band multiplex filter 10 used in the invention. This filter extracts the signal components in narrow bands obtained by dividing the audible frequency range into M equal parts. The frequency division is carried out N times, so that M × N signal components S(F, T), 0 < F ≤ M, 0 < T ≤ N, are stored in buffer 12. As illustrated, the stored signal components S(F, T) can be represented as a matrix indexed by the frequency-band index F and the time index T. In this embodiment the number of bands M and the number of processing times N are M = 512 and N = 10. The number of bits used in signal processing is 16 or more.

[0015] Next, to normalize these frequency-divided signal components S(F, T) in processing step 20, the maxima FMAX(T) and TMAX(F) of the absolute value of S(F, T) along the frequency-band axis and the time axis are first obtained for each processing time T and each frequency band F. That is,
FMAX(T) = MAX{|S(F, T)|; F = 1 to M}, T = 1, 2, ..., N,
TMAX(F) = MAX{|S(F, T)|; T = 1 to N}, F = 1, 2, ..., M.

[0016] Then, for each signal component S(F, T) specified by frequency band F and time T, S(F, T) is divided by the smaller of the frequency-axis maximum FMAX(T) at time T and the time-axis maximum TMAX(F) in band F, and the result is taken as the normalized signal component NS(F, T). That is,
NS(F, T) = S(F, T) / MIN{FMAX(T), TMAX(F)}.

[0017] The normalized signal components NS(F, T) are obtained in this way over the full range of frequency bands F and times T and stored in buffer 22.
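The maxima and the normalization of paragraphs [0015] to [0017] can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the matrix is a plain list of lists `s[f][t]`, the function name `normalize` is hypothetical, and the treatment of a zero divisor (emit 0) is an assumption, since the specification does not say how that case is handled.

```python
def normalize(s):
    """Return (ns, tmax, fmax) for a matrix s[f][t] of M bands by N times."""
    m, n = len(s), len(s[0])
    # TMAX(F): largest |S(F, T)| over the N times of band F.
    tmax = [max(abs(v) for v in row) for row in s]
    # FMAX(T): largest |S(F, T)| over the M bands at time T.
    fmax = [max(abs(s[f][t]) for f in range(m)) for t in range(n)]
    ns = []
    for f in range(m):
        row = []
        for t in range(n):
            d = min(tmax[f], fmax[t])
            # Assumption: a zero divisor implies S(F, T) = 0, so emit 0.
            row.append(s[f][t] / d if d else 0.0)
        ns.append(row)
    return ns, tmax, fmax

# Toy example with M = 3 bands and N = 3 times.
s = [[4, -8, 2],
     [1, 0, -16],
     [0, 0, 0]]
ns, tmax, fmax = normalize(s)
print(tmax, fmax)
print(ns)
```

Because |S(F, T)| never exceeds either maximum, every normalized value lies between -1 and 1, which is what allows the very coarse quantization described later.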

[0018] The audible frequency characteristic of the audio signal is now taken into account in preparation for data compression. First, in the bit allocation determination unit 55, the discrete values TH(F), F = 1, 2, ..., M, of the frequency characteristic of the audible threshold, which are stored in advance at a predetermined location 50 of the processing apparatus and supplied via transfer path 74, are compared over all frequency bands F with the maximum TMAX(F) within each frequency band F, supplied via the transfer path denoted by reference numeral 70. If TMAX(F) is smaller than the discrete threshold value TH(F), the audio signal in band F is regarded as inaudible and, as explained in more detail below, a new time-axis maximum CTMAX(F) is set to 0 to specify that no further signal information is transmitted for any of the N processing times in band F. Otherwise the value TMAX(F) is used unchanged. In other words, the conversion is
CTMAX(F) = 0 if TMAX(F) < TH(F),
CTMAX(F) = TMAX(F) otherwise.
This is shown in the first half (step S10) of the coded bit allocation determination routine of Fig. 2.
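A minimal Python sketch of step S10 follows. It uses the strict comparison stated in this paragraph (the summary in paragraph [0008] also suppresses a band when the two values are equal). The function name `gate_bands` and the threshold values are hypothetical; real TH(F) values would come from the stored audibility curve.

```python
def gate_bands(tmax, th):
    """Return (ctmax, p): CTMAX(F) per band and the count P of suppressed bands."""
    ctmax = [0 if tm < t else tm for tm, t in zip(tmax, th)]
    p = sum(1 for tm, t in zip(tmax, th) if tm < t)
    return ctmax, p

tmax = [8, 16, 0.5]
th = [1, 20, 0.5]          # illustrative thresholds, not taken from the patent
ctmax, p = gate_bands(tmax, th)
print(ctmax, p)
```

The count P is returned as well because the bit allocation in the following steps depends on how many bands were suppressed.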

[0019] The second half (steps S11 to S15) of the coded bit allocation determination routine of Fig. 2 shows the preparatory processing carried out before the data to be transmitted are compressed. As stated above, none of the signals NS(F, T) in a frequency band F with CTMAX(F) = 0 are transmitted, which frees a corresponding part of the budget for the data actually transmitted. In addition, in this embodiment the initial audio signals S(F, T) or NS(F, T), which had 16 or more bits, are normally compressed to 1.6 bits, but for bands with a large CTMAX(F) the number of bits for all data in that frequency band F is increased, i.e. the signal is compressed with higher resolution. In this embodiment the nonzero CTMAX(F) values are classified into three groups, large, medium, and small, and the corresponding data are assigned 4, 2.4, and 1.6 bits respectively when compressed. An index ALOC(F) is defined to indicate this bit allocation; as used in paragraph [0026], ALOC(F) = 0 denotes 1.6 bits, ALOC(F) = 1 denotes 2.4 bits, and ALOC(F) = 2 denotes 4 bits.

[0020] Further, according to this embodiment, the numbers of 4-bit and 2.4-bit compressed data are allocated in the ratio 1:2. With this rule, the numbers of data corresponding to ALOC(F) = 0, ALOC(F) = 1, and ALOC(F) = 2 can be calculated from the final output data rate, 64 kbps in this embodiment, and from the amount of data that need not be transmitted because of the characteristics of the input audio signal (the total number of bits of the signals NS(F, T) in the frequency bands F with CTMAX(F) = 0 described above).

[0021] In step S11, the number of spare bits SBIT in buffer 22 within one processing period, determined by the output data rate, is given as
SBIT = (total number of bits allotted to the M × N buffer) - N × 16.
The last term on the right-hand side is calculated on the provisional assumption that all N (= 10) data along the time axis have 1.6 bits. Adding to SBIT the untransmitted data 16 × P (where P is the number of bands with CTMAX(F) = 0 found in step S10) gives the total number of bits still available when all data with CTMAX(F) ≠ 0 are processed at 1.6 bits. In step S11 this remaining total is divided by 40 to obtain the integer quotient Q and the remainder R (the value 40 follows from the 1:2 allocation of 4-bit and 2.4-bit data described above and from the number of data per band, 10). From these, step S12 obtains k_24, the number of bands F assigned 2.4 bits, and k_40, the number of bands F assigned 4 bits. Steps S13, S14, and S15 then finally assign the bit allocation index ALOC(F) for all bands.
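The divisor 40 can be checked directly: with N = 10 values per band, raising one band from 1.6 to 4 bits costs 24 extra bits and raising two bands from 1.6 to 2.4 bits costs 16 extra bits, 40 in total. The Python sketch below builds on that arithmetic. Several points are assumptions, because the details are only in Fig. 2: SBIT is taken as an input, each whole group of 40 bits is assumed to give one 4-bit band and two 2.4-bit bands (k_40 = Q, k_24 = 2Q), the remainder R is left unused, and bands are ranked by CTMAX(F) to decide which ones get more bits. The function name `allocate_bits` is hypothetical.

```python
def allocate_bits(ctmax, sbit, n=10):
    """Return (aloc, k24, k40, r); aloc[f] is None for suppressed bands.

    ALOC 0 -> 1.6 bits, 1 -> 2.4 bits, 2 -> 4 bits per value.
    """
    active = [f for f, c in enumerate(ctmax) if c > 0]
    p = len(ctmax) - len(active)          # bands with CTMAX(F) = 0
    base = 16 * n // 10                   # bits per band at 1.6 bits per value
    remaining = sbit + base * p           # bits still free at the base rate
    # Extra cost of one 4-bit band plus two 2.4-bit bands (tenths of a bit
    # are used to keep the arithmetic in integers); 40 for n = 10.
    group_cost = (n * (40 - 16) + 2 * n * (24 - 16)) // 10
    q, r = divmod(remaining, group_cost)
    k40 = min(q, len(active))                     # assumed: k_40 = Q
    k24 = min(2 * q, len(active) - k40)           # assumed: k_24 = 2Q
    # Assumed ranking rule: the largest CTMAX(F) values get the most bits.
    ranked = sorted(active, key=lambda f: ctmax[f], reverse=True)
    aloc = [None] * len(ctmax)
    for rank, f in enumerate(ranked):
        aloc[f] = 2 if rank < k40 else 1 if rank < k40 + k24 else 0
    return aloc, k24, k40, r

aloc, k24, k40, r = allocate_bits([5, 0, 9, 1, 7, 3], sbit=30)
print(aloc, k24, k40, r)
```

In the example one band is suppressed, so 30 + 16 = 46 bits are free, which pays for one group of 40 bits with 6 bits left over.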

[0022] Following the coded bit allocation determination of Fig. 2, the first quantization unit 30 (Fig. 1) quantizes the normalized signals NS(F, T) to the coarse bit counts described above, by the procedure shown in Fig. 3. The index ALOC(F), which specifies the bit allocation of each frequency band F and is supplied from the bit allocation determination unit 55 via transfer path 76, is evaluated in step S20; the coefficient PPX is set according to the value of ALOC(F); and in step S21 the signals NS(F, T) in buffer 22 are requantized. Here Int(X) denotes the largest integer not exceeding X, and >> denotes a 1-bit right shift, i.e. division by 2 with the remainder discarded. This last operation makes the allocation of positive and negative signal levels symmetric about the zero level (see Fig. 4). The signals QS(F, T) quantized in this way are stored in buffer 32.
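The exact formula of Fig. 3 is not reproduced in the text, so the following Python sketch is only one plausible reading of the operations named above (the coefficient PPX, Int, and the 1-bit right shift). The PPX values 2, 4, and 14 are assumptions chosen so that the outputs span 3, 5, and 15 levels, which is what the base-3, base-5, and 4-bit packing of paragraph [0026] requires; the function name `quantize` is hypothetical.

```python
import math

# Assumed coefficients: ALOC 0 -> 3 levels, 1 -> 5 levels, 2 -> 15 levels.
PPX = {0: 2, 1: 4, 2: 14}

def quantize(ns, aloc):
    """Quantize a normalized value ns in [-1, 1] to a small signed integer."""
    # Int(...) is the floor; adding 1 and shifting right by one bit rounds
    # to the nearest level so that the levels sit symmetrically around 0.
    return (math.floor(ns * PPX[aloc]) + 1) >> 1

for aloc in (0, 1, 2):
    print(aloc, [quantize(x, aloc) for x in (-1.0, -0.6, 0.0, 0.25, 1.0)])
```

With these assumed coefficients the output ranges are -1 to 1, -2 to 2, and -7 to 7. Python's `>>` on a negative integer rounds toward minus infinity, which matches "divide by 2 and discard the remainder" in the floor sense used here.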

[0023] The maximum CTMAX(F) in each frequency band and the maximum FMAX(T) at each time are supplied to the second quantization unit 60 (Fig. 1) via transfer paths 78 and 72 respectively, where they are logarithmically quantized to 6 bits (2 dB steps). The logarithmically quantized values are denoted QTMAX(F) and QFMAX(T) respectively.
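A 6-bit logarithmic quantizer with 2 dB steps, together with the inverse (exponential) mapping used at the receiver, might look like the Python sketch below. The specification gives only the bit width and the step size; the reference level of 1.0, the use of 20·log10 (amplitude decibels), and the reservation of code 0 for a zero maximum are all assumptions, as are the function names.

```python
import math

STEP_DB = 2.0
MAX_CODE = 63          # largest 6-bit code

def log_quantize(x):
    """Map a non-negative maximum to a 6-bit code (code 0 means x == 0)."""
    if x <= 0:
        return 0
    db = 20.0 * math.log10(x)            # level relative to 1.0 (assumed)
    code = 1 + round(db / STEP_DB)
    return max(1, min(MAX_CODE, code))   # values below 1.0 clamp to code 1

def log_dequantize(code):
    """Inverse mapping: code -> approximate maximum."""
    if code == 0:
        return 0.0
    return 10.0 ** ((code - 1) * STEP_DB / 20.0)

for x in (0, 1, 100, 32767):
    c = log_quantize(x)
    print(x, c, round(log_dequantize(c), 1))
```

Under these assumptions the reconstructed maximum differs from the original by at most 1 dB for inputs of 1 or more, and 63 codes cover 124 dB, more than the roughly 90 dB range of 16-bit samples.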

[0024] Finally, to turn the quantized signals QS(F, T), QTMAX(F), and QFMAX(T) into the output signal 8 sent to the output transmission line, the data encoding unit 40 (Fig. 1) encodes these quantized signals into a frame signal (frame information signal and compressed audio signal) blocked per processing period.

[0025] As shown in Fig. 5, a synchronization signal SYNC (8 bits) is first placed at the head of the frame in step S30. QTMAX(F) is then sent in step S31 and QFMAX(T) in step S32. Finally, a cyclic code CRC (7 bits) is appended to complete the frame information code.
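The frame information layout (SYNC, QTMAX(F), QFMAX(T), CRC) can be sketched as a bit list in Python. The field widths of 8, 6, and 7 bits come from the text; the sync pattern 0xA5, the CRC generator polynomial x^7 + x^3 + 1, the most-significant-bit-first order, and the function names are assumptions, since the specification does not state them.

```python
def to_bits(value, width):
    """Most-significant-bit-first list of `width` bits."""
    return [(value >> i) & 1 for i in range(width - 1, -1, -1)]

def crc7(bits, poly=0x09):
    """Bit-serial 7-bit CRC; poly 0x09 (x^7 + x^3 + 1) is an assumed choice."""
    crc = 0
    for bit in bits:
        top = (crc >> 6) & 1
        crc = (crc << 1) & 0x7F
        if top ^ bit:
            crc ^= poly
    return crc

SYNC = 0xA5  # hypothetical 8-bit synchronization pattern

def build_header(qtmax, qfmax):
    """SYNC (8 bits), 6-bit QTMAX(F) codes, 6-bit QFMAX(T) codes, CRC (7 bits)."""
    bits = to_bits(SYNC, 8)
    for code in qtmax:
        bits += to_bits(code, 6)
    for code in qfmax:
        bits += to_bits(code, 6)
    return bits + to_bits(crc7(bits), 7)

header = build_header(qtmax=[31, 0, 12], qfmax=[40, 41])
print(len(header))   # 8 + 3*6 + 2*6 + 7 = 45
```

A receiver can recompute `crc7` over everything before the last 7 bits and compare it with the transmitted value, which is the check described for step S61 in paragraph [0031].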

[0026] Next, the compressed audio signal is placed in the free-format section following the frame information by the data encoding process shown in Fig. 6. In addition to the quantized audio signal QS(F, T), the data encoding unit 40 also receives the bit allocation index ALOC(F) via transfer path 77. This index is first evaluated in step S40. In the 1.6-bit case (ALOC(F) = 0), step S41 combines, as illustrated, the data at times 1 to 5 into one base-3 value SD0 and the data at times 6 to 10 into another base-3 value SD1, and step S44 sends each as 8 bits. In the 2.4-bit case (ALOC(F) = 1), step S42 combines, as illustrated, the data at times 1 to 5 into one base-5 value SD0 and the data at times 6 to 10 into another base-5 value SD1, and step S45 sends each as 12 bits. In the 4-bit case (ALOC(F) = 2), step S43 adds 7 to the datum at each time in the corresponding frequency band F and sends it as 4 bits.
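The fractional bit counts come from this grouping: five three-level values fit in 8 bits (3^5 = 243 ≤ 256), giving 8/5 = 1.6 bits per value, and five five-level values fit in 12 bits (5^5 = 3125 ≤ 4096), giving 12/5 = 2.4 bits per value. The Python sketch below packs one band accordingly. The offsets of 1 and 2 that shift the signed values to non-negative digits, and the choice of the earliest time as the most significant digit, are assumptions (only the offset 7 for the 4-bit case is stated in the text); the function names are hypothetical.

```python
def pack_group(values, aloc):
    """Pack five quantized values into one integer (ALOC 0: base 3, ALOC 1: base 5)."""
    base, offset = {0: (3, 1), 1: (5, 2)}[aloc]
    sd = 0
    for v in values:                 # first value becomes the most significant digit
        sd = sd * base + (v + offset)
    return sd

def pack_band(qs, aloc):
    """Return (value, bit_width) fields for the ten values of one band."""
    if aloc == 2:
        return [(v + 7, 4) for v in qs]           # -7..7 -> 0..14, 4 bits each
    width = 8 if aloc == 0 else 12
    return [(pack_group(qs[:5], aloc), width),    # SD0: times 1 to 5
            (pack_group(qs[5:], aloc), width)]    # SD1: times 6 to 10

print(pack_band([1, 0, -1, 0, 1, -1, -1, -1, -1, -1], 0))
print(pack_band([2, 2, 2, 2, 2, -2, -2, -2, -2, -2], 1))
```

A band therefore occupies 16, 24, or 40 bits for ALOC(F) = 0, 1, or 2, which is consistent with the 16 bits per band and the group cost of 40 bits used in the allocation step.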

[0027] The signal encoding described above involves operations over many bands; in particular, the frequency-axis maximum FMAX(T) was determined over the whole range M = 512. Because hearing characteristics vary widely with frequency, however, determining this maximum within frequency ranges further subdivided into several parts represents the quality of the transmitted sound more faithfully. With M = 512 as in this embodiment, experiments showed the best reproduced sound quality with a division into 8. This subdivision must be applied to all the processing in the normalization unit 20, the quantization unit 30, and the data encoding unit 40.
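The subdivided frequency-axis maximum can be sketched in Python as follows. The text states only that the range is divided into 8; splitting the bands into contiguous groups of equal size (64 bands each for M = 512) is an assumption, and the function name `subband_fmax` is hypothetical.

```python
def subband_fmax(s, groups=8):
    """Return fmax[fb][t]: largest |S(F, T)| within each contiguous group of bands."""
    m, n = len(s), len(s[0])
    size = m // groups               # assumes M is a multiple of `groups`
    return [
        [max(abs(s[f][t]) for f in range(fb * size, (fb + 1) * size))
         for t in range(n)]
        for fb in range(groups)
    ]

# Toy example: M = 4 bands, N = 2 times, split into 2 groups of 2 bands.
s = [[1, -2],
     [3, 0],
     [-5, 4],
     [0, -6]]
print(subband_fmax(s, groups=2))
```

Each band F is then normalized with the maximum of its own group, so a loud component in one part of the spectrum no longer sets the scale for quiet bands elsewhere.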

[0028] Fig. 7a shows the compressed data produced in this way for M = 512, N = 10, and a division of the frequency range into 8. Note that because CTMAX(F) is 0 at F = 3, DT(3) among the frequency band data DT(F) is not sent. In addition, because the frequency range is divided into 8, the frequency-axis maxima are written F(FB, T), where the division index FB = 1, 2, ..., 8. Each data item DT(F) in the free-format section is formed by grouping all the data on the same time axis. Fig. 7b shows the arrangement of the 1.6-, 2.4-, and 4-bit data within DT(F).

[0029] Next, the processing method is described for decoding a digital audio signal with the frame arrangement shown in Fig. 7, received via some transmission line or detected by some digital signal reading device, and converting it back to the original audio signal.

[0030] In Fig. 8, the digital audio signal 9 obtained by the compression encoding method described above with reference to Figs. 1 to 7 is fed to the frame information decoding unit 41 on the receiving side. The receiving device may be an ISDN-compatible modem, or it may be a device that shapes the waveform of the detection signal from the read head of a digital compact cassette into a digital audio signal with the frame arrangement shown in Fig. 7. The received signal is restored to values TS(F, T) close to the original audio signal and then inverse-filtered by the narrow-band multiplex filter 11 to obtain the final digital audio signal 5. The details are as follows.

[0031] The decoding unit 41 first receives the time-axis maxima QTMAX(F) and the frequency-axis maxima QFMAX(T) (step S60 of Fig. 9). As shown in step S61 of Fig. 9, it then detects the cyclic code CRC, computes a CRC code, denoted CRCR, from the received frame synchronization signal, TMAX(F), and FMAX(T), and determines whether the transmitted CRC code matches CRCR. If they match, processing proceeds to the next step; if they do not, the data of that frame are in effect not used in the calculation. The logarithmically quantized received time-axis maxima QTMAX(F) and frequency-axis maxima QFMAX(T) are transferred via transfer paths 86 and 85 respectively to the inverse quantization unit 61, where both are subjected to an inverse logarithmic (exponential) transform to give the time-axis maxima CTMAX(F) and the frequency-axis maxima FMAX(T).

【００３２】次いで、ビット配分決定処理部５６で音声
データの伝送を必要としないデータ区間の周波数帯域Ｆ
（ＴＭＡＸ（Ｆ）＝０の場合の帯域）と、このような周
波数帯域の総数Ｐを図１０のステップＳ５０で求める。
そして、この総数Ｐからビット配分決定処理部５６で伝
送に使用された残りのビット数を、図２で説明した符号
化の時と全く同じ計算方法に基づき、つまり図１０のス
テップＳ５１，Ｓ５２，Ｓ５３，Ｓ５４およびＳ５５に
より求める。この場合、圧縮符号化のところで説明した
ように、同じビット配分、即ち通常の信号強度の周波数
帯域のデータを１．６ビットとして、信号強度のより強
い周波数帯域Ｆと更に強い周波数帯域Ｆのデータを１：
２の割合の２ビットと、２．４ビットに割り振る。それ
等の対応するグループに対するビット配分を指示する指
数ＡＬＯＣ（Ｆ）と各ビットの周波数帯域の数とを決定
する。Next, the bit allocation determination processing unit 56 obtains, in step S50 of FIG. 10, the frequency bands F of the data section that do not require transmission of voice data (the bands for which TMAX (F) = 0) and the total number P of such frequency bands. Then, from this total number P, the bit allocation determination processing unit 56 obtains the remaining number of bits used for transmission by exactly the same calculation method as in the encoding described with reference to FIG. 2, that is, by steps S51, S52, S53, S54 and S55 of FIG. 10. In this case, as described for the compression encoding, the same bit allocation is used: the data of frequency bands of normal signal strength are given 1.6 bits, and the data of the frequency bands F of stronger signal strength and of still stronger signal strength are allotted 2 bits and 2.4 bits at a ratio of 1:2. The index ALOC (F) indicating the bit allocation for each corresponding group and the number of frequency bands for each bit count are thereby determined.
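The idea that the receiver can rebuild the bit allocation from the time-axis maxima alone can be sketched as follows. This is only an illustration under stated assumptions, not the procedure of FIG. 2 or FIG. 10: the function `allocate_bits`, the frame budget `budget_bits`, and the rule of upgrading bands in groups of three are hypothetical. The three code sizes (1.6, 2.4 and 4 bits) and the 2:1 ratio of 2.4-bit to 4-bit bands are the values the specification gives for the embodiment.

```python
def allocate_bits(ctmax, budget_bits, n_times=10):
    """Return (P, aloc) where P is the number of silent bands and
    aloc[f] is the code size of band f in tenths of a bit per sample.

    Hypothetical rule: every active band gets 1.6 bits; spare bits are
    spent on the strongest bands, one 4-bit band per two 2.4-bit bands.
    """
    BASE, MID, HIGH = 16, 24, 40  # 1.6, 2.4 and 4 bits, in tenths of a bit
    active = [f for f, v in enumerate(ctmax) if v > 0]
    p = len(ctmax) - len(active)  # bands that are not transmitted
    # Budget left after every active band receives the base allocation.
    spare = budget_bits * 10 - BASE * n_times * len(active)
    # Cost of upgrading one band to 4 bits and two bands to 2.4 bits.
    group_cost = ((HIGH - BASE) + 2 * (MID - BASE)) * n_times
    groups = max(0, min(spare // group_cost, len(active) // 3))
    # Rank active bands by strength; ties are broken by band index so that
    # encoder and decoder reach the same result.
    ranked = sorted(active, key=lambda f: (-ctmax[f], f))
    aloc = [0] * len(ctmax)
    for rank, f in enumerate(ranked):
        aloc[f] = HIGH if rank < groups else MID if rank < 3 * groups else BASE
    return p, aloc


p, aloc = allocate_bits([0, 5, 3, 9, 1, 0, 7, 2], budget_bits=140)
```

Because the function depends only on CTMAX (F) and fixed parameters, running it on both sides yields the same ALOC (F) without sending it over the line.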

【００３３】次いで、データ復号化処理部４３で、図１
１のように、ビット配分決定処理部５６から送られたビ
ット配分指示指数ＡＬＯＣ（Ｆ）に基づき、データフリ
ーフォマット区間の時間軸上のデータＤＴ（Ｆ）から量
子化された圧縮符号化音声信号ＱＳ（Ｆ．Ｔ）を求め
る。この処理は、図６の処理の逆変換に相当する。この
場合、ＨＤＡＴＡ（０，Ｊ）＝３^（４−Ｊ）、ＨＤＡＴＡ（１，Ｊ）＝５^（４−Ｊ）である。Next, in the data decoding processing unit 43, as shown in FIG. 11, the quantized compression-encoded voice signal QS (F, T) is obtained from the data DT (F) on the time axis of the data free-format section, based on the bit allocation instruction index ALOC (F) sent from the bit allocation determination processing unit 56. This process corresponds to the inverse of the process of FIG. 6. In this case, HDATA (0, J) = 3^(4−J) and HDATA (1, J) = 5^(4−J).
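One way to read the weights HDATA (0, J) and HDATA (1, J) is as place values of a base-3 or base-5 number that packs five samples into one code word: five 3-level samples fit in 8 bits (3^5 = 243, i.e. 1.6 bits per sample) and five 5-level samples fit in 12 bits (5^5 = 3125, i.e. 2.4 bits per sample). The sketch below rests on that interpretation, which is an assumption. The helper names `hdata`, `pack` and `unpack` are hypothetical, and the weights are taken as powers 4 − J of the base for J = 0..4.

```python
def hdata(kind, j):
    """Place value for sample j (0..4); kind 0 -> base 3, kind 1 -> base 5."""
    base = 3 if kind == 0 else 5
    return base ** (4 - j)


def pack(digits, kind):
    """Combine five quantized samples into one code word (encoder side)."""
    return sum(d * hdata(kind, j) for j, d in enumerate(digits))


def unpack(code, kind):
    """Recover the five quantized samples from a code word (decoder side)."""
    digits = []
    for j in range(5):
        weight = hdata(kind, j)
        digits.append(code // weight)  # digit at this place value
        code %= weight                 # remainder carries the lower digits
    return digits


word = pack([2, 0, 1, 1, 2], kind=0)   # five 3-level samples
samples = unpack(word, kind=0)
```

The largest possible words are 242 for base 3 and 3124 for base 5, which is why 8 and 12 bits respectively suffice for five samples.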

【００３４】更に、図１２に示すように、フレーム内の
量子化された圧縮符号化音声信号成分ＱＳ（Ｆ，Ｔ）か
ら、図３のステップＳ１２とは逆の処理により、正規化
された音声信号ＴＮＳ（Ｆ，Ｔ）を求める。これをバッ
ファ２３に納める。Further, as shown in FIG. 12, the normalized voice signal TNS (F, T) is obtained from the quantized compression-encoded voice signal components QS (F, T) in the frame by a process that is the reverse of step S12 of FIG. 3. The result is stored in the buffer 23.

【００３５】次に、正規化処理部２１で、逆量子化処理
部６１で求まり転送路８９と９０を介して導入された時
間軸最大値ＣＴＭＡＸ（Ｆ）と周波数軸最大値ＦＭＡＸ
（Ｔ）を用いて、正規化された音声信号ＴＮＳ（Ｆ，
Ｔ）を初期音声信号ＴＳ（Ｆ，Ｔ）に変換して、バッフ
ァ１３に納める。この場合の変換は、図１の正規化の逆
である。即ち、である。なお、ここで言う正規化音声信号ＴＮＳ（Ｆ，
Ｔ）と初期音声信号ＴＳ（Ｆ，Ｔ）は、図１で示した正
規化音声信号ＮＳ（Ｆ，Ｔ）および初期音声信号Ｓ
（Ｆ，Ｔ）とは実質上異なる。何故ならば、図１の量子
化処理３０で、音声信号成分の強度がかなり粗いビット
の分解能で変換されたからである。この信号レベルの粗
さは、人の可聴能力を加味して決定されているので、実
際に音声に変換した場合、依然として高い品質の音声と
して聴くことができる。Next, the normalization processing unit 21 uses the time-axis maximum value CTMAX (F) and the frequency-axis maximum value FMAX (T), which were obtained by the dequantization processing unit 61 and introduced via the transfer paths 89 and 90, to convert the normalized voice signal TNS (F, T) into the initial voice signal TS (F, T), which is stored in the buffer 13. The conversion in this case is the inverse of the normalization of FIG. 1. That is, TS (F, T) = TNS (F, T) × min (CTMAX (F), FMAX (T)). Note that the normalized voice signal TNS (F, T) and the initial voice signal TS (F, T) referred to here differ in substance from the normalized voice signal NS (F, T) and the initial voice signal S (F, T) shown in FIG. 1, because the quantization processing 30 of FIG. 1 converted the strength of the voice signal components at a considerably coarse bit resolution. Since this coarseness of the signal level was chosen with human hearing in mind, the signal can still be heard as high-quality voice when it is actually converted into sound.
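The inverse normalization can be sketched in a few lines of Python. The sketch assumes, following step l) of claim 6, that each normalized value is multiplied by the smaller of the band maximum CTMAX (F) and the time-slot maximum FMAX (T). The function name `denormalize` and the nested-list layout (one row per frequency band) are choices made for this example, and the sub-band division FB is ignored for simplicity.

```python
def denormalize(tns, ctmax, fmax):
    """Rescale normalized samples TNS(F, T) to TS(F, T).

    tns   : list of rows, tns[f][t], one row per frequency band
    ctmax : converted time-axis maximum per band, CTMAX(F)
    fmax  : frequency-axis maximum per time slot, FMAX(T)
    """
    return [
        [tns[f][t] * min(ctmax[f], fmax[t]) for t in range(len(fmax))]
        for f in range(len(ctmax))
    ]


ts = denormalize(
    tns=[[0.5, -1.0], [1.0, 0.25]],
    ctmax=[4.0, 0.0],   # second band was below the audible threshold
    fmax=[2.0, 8.0],
)
```

A band whose CTMAX (F) is 0 comes out as all zeros, which matches the fact that no data were transmitted for it.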

【００３６】最後に、バッファ１３の行列状の音声信号
ＴＳ（Ｆ，Ｔ）のブロックは、この発明による狭帯域の
多重帯域逆フィルタ１１を通過させて、記号５で示すデ
ジタル音声信号（ＰＣＭ）として取り出せる。この音声
信号を所定の記憶装置で記憶媒体に記憶しておくか、あ
るいは所定の音声変換装置（再生装置）の助けにより音
声として聴くことができる。Finally, the matrix-shaped block of the voice signal TS (F, T) in the buffer 13 is passed through the narrow-band multi-band inverse filter 11 according to the present invention and is output as the digital voice signal (PCM) indicated by reference numeral 5. This voice signal can be stored on a storage medium by a suitable storage device, or listened to as sound with the aid of a suitable voice conversion device (playback device).

【００３７】なお、ＩＳＤＮで２Ｂチャンネルにそれぞ
れステレオの左右音声信号を送り出す場合、２Ｂチャン
ネルのスキューが補償されていないので、予めフレーム
の約半分の周期だけ送出時点をずらすことにより、スキ
ューのずれを吸収する。When the left and right signals of a stereo voice signal are sent on the two B channels of an ISDN 2B connection, the skew between the two B channels is not compensated. The skew is therefore absorbed by offsetting the transmission timing in advance by about half a frame period.

【００３８】この発明による圧縮されたデジタル音声信
号の復号化方法を、主としてＩＳＤＮの場合について説
明した。しかし、この発明は単にＩＳＤＮだけでなく、
デジタルコンパクトカセットや磁気テープ等での再生に
も利用できる。これ等の場合には、単位時間当たりのデ
ータ量にＩＳＤＮの場合より余裕があるため、ビット配
分を更に増やし、細かいステップによる高音質を保持で
きる信号の符号化圧縮方法およびそれに対する復号化方
法も可能である。The method according to the present invention for decoding a compressed digital voice signal has been described mainly for the case of ISDN. However, the invention is not limited to ISDN and can also be used for playback from digital compact cassettes, magnetic tape and the like. In these cases, more data per unit time are available than with ISDN, so the bit allocation can be increased further, and a signal encoding and compression method that preserves high sound quality through finer quantization steps, together with the corresponding decoding method, is also possible.

【００３９】更に、この発明は上記の実施例で使用した
パラメータＭ＝５１２，Ｎ＝１０，ＦＢ＝８の値、およ
びＡＬＯＣ（Ｆ）の種類＝３に限定されるものではな
く、２．４ビットと４ビットの比率も２：１に限定する
ものではない。これ等のパラメータは、必要に応じて、
適宜選択して変更して使用できる。３種の中の２種のビ
ットは、上記のように各周波数帯域のＣＴＭＡＸ（Ｆ）
の大きさの順で（即ち相対的に）決めるのでなく、ＣＴ
ＭＡＸ（Ｆ）の値の絶対値で評価することもできる。周
知のように、分割周波数ＦＢでの可聴音声しきい値の特
性曲線ＴＨ（ＦＢ）は周波数範囲に応じて相当異なる特
性を示す。その場合、分割周波数ＦＢでの可聴音声しき
い値の特性曲線ＴＨ（ＦＢ）に適当に重みを付けて前記
絶対値を決定できる。このような操作は、実験的に決め
るべきである。Furthermore, the present invention is not limited to the parameter values M = 512, N = 10 and FB = 8 or to the three kinds of ALOC (F) used in the above embodiment, nor is the ratio of 2.4-bit to 4-bit allocations limited to 2:1. These parameters can be selected and changed as required. Two of the three bit allocations need not be decided by the rank order of CTMAX (F) across the frequency bands (that is, relatively) as described above; they can instead be decided from the absolute value of CTMAX (F). As is well known, the characteristic curve TH (FB) of the audible voice threshold at the divided frequency FB differs considerably from one frequency range to another. In that case, the absolute value can be determined by suitably weighting the characteristic curve TH (FB) of the audible voice threshold at the divided frequency FB. Such adjustments should be determined experimentally.

【００４０】[0040]

【発明の効果】以上説明したように、この発明によれ
ば、デジタル音声データを圧縮して符号化し、伝送回路
を介して受信後復号化する方法は、下記の事項により、
少ないデータ伝送量で音声信号を充分高品質に保持でき
る。即ち、（１）狭帯域デジタルフィルタで多数の帯域に信号を
分割することにより、各帯域内の信号を非常に粗く量子
化しても、量子化の歪みが狭い帯域内のみに発生するか
ら、聴感上の音質の劣化が少ない。（２）時間軸および周波数軸の最大値の小さい方の値
により、粗い量子化を行う前の信号を正規化しているた
め、どちらかの最大値のみによって正規化するよりも、
全体に正規化後の信号レベルが上昇し、粗い量子化の影
響が少なくなり、また時間軸および周波数軸の急激な変
化に対しても、正規化後の信号レベルが極端に小さくな
ることがなくなり、粗い量子化を行っても、音声が欠落
することがなくなった。（３）符号化ビット配分の指示値は受信側で時間軸最
大値より計算により再生できるため、伝送する必要がな
い。この処置により伝送データ量の大幅な増加が可能に
なっている。As described above, according to the present invention, the method of compressing and encoding digital voice data and decoding it after reception via a transmission circuit keeps the voice signal at sufficiently high quality with a small amount of transmitted data, for the following reasons. (1) Because a narrow-band digital filter divides the signal into a large number of bands, quantization distortion arises only within each narrow band even when the signal in each band is quantized very coarsely, so there is little perceptible deterioration in sound quality. (2) Because the signal is normalized, before the coarse quantization, by the smaller of the time-axis and frequency-axis maximum values, the signal level after normalization is higher overall than when normalizing by only one of the maximum values, and the effect of the coarse quantization is reduced. In addition, the normalized signal level no longer becomes extremely small even under rapid changes along the time axis or the frequency axis, so the voice no longer drops out even with coarse quantization. (3) Because the coded bit allocation indices can be reconstructed on the receiving side by calculation from the time-axis maximum values, they need not be transmitted. This makes it possible to greatly increase the amount of data transmitted.
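Item (2) can be made concrete with a small Python sketch of the encoder-side normalization. It assumes that both maxima are taken over absolute values and that each sample is divided by the smaller of its band maximum TMAX (F) and its time-slot maximum FMAX (T). The function name `normalize` and the nested-list layout are hypothetical, and the sub-band division FB and the audible threshold are left out.

```python
def normalize(s):
    """Return (tmax, fmax, ns) for a block s[f][t] of filter outputs."""
    n_f, n_t = len(s), len(s[0])
    # Largest magnitude along the time axis, one value per band.
    tmax = [max(abs(v) for v in row) for row in s]
    # Largest magnitude along the frequency axis, one value per time slot.
    fmax = [max(abs(s[f][t]) for f in range(n_f)) for t in range(n_t)]
    ns = []
    for f in range(n_f):
        row = []
        for t in range(n_t):
            scale = min(tmax[f], fmax[t])
            row.append(s[f][t] / scale if scale > 0 else 0.0)
        ns.append(row)
    return tmax, fmax, ns


tmax, fmax, ns = normalize([[1.0, -4.0], [0.5, 2.0]])
```

In this example the sample 1.0 in the first band would shrink to 0.25 if it were divided by TMAX alone, but it stays at 1.0 when divided by the smaller maximum, so it survives coarse quantization. Every normalized magnitude is still at most 1 because a sample can exceed neither of its two maxima.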

[Brief description of drawings]

【図１】この発明によるデータ圧縮符号化処理の各過程
とそれに付属する信号を模式的に示す処理図である。FIG. 1 is a processing diagram schematically showing the steps of the data compression encoding processing according to the present invention and the signals associated with them.

【図２】符号化ビット配分を決定するためのサブルーチ
ンのフローチャートである。FIG. 2 is a flowchart of a subroutine for determining coded bit allocation.

【図３】量子化を行うサブルーチンのフローチャートで
ある。FIG. 3 is a flowchart of a subroutine for performing quantization.

【図４】量子化により分解能を粗くした信号のビット配
置を示す図である。FIG. 4 is a diagram showing a bit arrangement of a signal whose resolution is roughened by quantization.

【図５】データフレームの処理・データ情報を与えるた
めにデータ符号化を行うサブルーチンのフローチャート
である。FIG. 5 is a flowchart of a subroutine for processing data frames and performing data encoding to provide data information.

【図６】圧縮された音声データの符号化するサブルーチ
ンである。FIG. 6 is a subroutine for encoding compressed audio data.

【図７】伝送されるデータのフレーム内配置（ａ）と圧
縮された音声信号の１データブロック内での配置（ｂ）
を示す。FIG. 7 shows (a) the arrangement of the transmitted data within a frame and (b) the arrangement of the compressed voice signal within one data block.

【図８】この発明による圧縮符号化音声データを復号化
する各過程とそれに付属する信号を模式的に示す処理図
である。FIG. 8 is a processing diagram schematically showing the steps of decoding compression-encoded voice data according to the present invention and the signals associated with them.

【図９】受信時の符号誤りの検査と周波数軸最大値の受
信ルーチンの模式フローチャートである。FIG. 9 is a schematic flowchart of the routine that checks for code errors on reception and receives the frequency-axis maximum values.

【図１０】ビット配分決定ルーチンのフローチャートで
ある。FIG. 10 is a flowchart of a bit allocation determination routine.

【図１１】データ復号化ルーチンのフローチャートであ
る。FIG. 11 is a flowchart of a data decoding routine.

【図１２】正規化音声信号成分を算出するルーチンのフ
ローチャートである。FIG. 12 is a flowchart of a routine for calculating a normalized audio signal component.

[Explanation of symbols]

４ 入力音声信号（標本化されたＰＣＭ信号）
５ 出力音声信号
８ 圧縮された出力音声符号化信号
９ 圧縮された受信音声信号
１０ 狭帯域多重フィルタ
１１ 狭帯域逆多重フィルタ
２０ 正規化処理部
２１ 逆正規化処理部
３０ 量子化処理部Ｉ
４０ データ符号化処理部
４１ 最大値復号化部
４３ データ復号化部
５０ 可聴音声しきい特性曲線の数値用の保管部
５５ ビット配分決定部
５６ ビット配分決定部
６０ 量子化処理部ＩＩ
６１ 逆量子化処理部

4 Input voice signal (sampled PCM signal)
5 Output voice signal
8 Compressed output voice coded signal
9 Compressed received voice signal
10 Narrow-band multiplex filter
11 Narrow-band inverse multiplex filter
20 Normalization processing unit
21 Inverse normalization processing unit
30 Quantization processing unit I
40 Data encoding processing unit
41 Maximum value decoding unit
43 Data decoding unit
50 Storage unit for the numerical values of the audible voice threshold characteristic curve
55 Bit allocation determination unit
56 Bit allocation determination unit
60 Quantization processing unit II
61 Inverse quantization processing unit

Claims

[Claims]

1. A method for compression encoding a digital audio signal in the transmission of the digital audio signal, comprising the following processing steps:
a) a digital audio signal of a predetermined sampling frequency is input to a narrow-band multiplex digital filter and separated into multiple frequency bands, and the signal component (S (F, T)) of each frequency band (F) at successive times (T) of a fixed time interval is obtained;
b) the time-axis maximum value (TMAX (F)), which is the maximum of the absolute value of the signal component (S (F, T)) within each frequency band (F), is obtained;
c) the frequency-axis maximum value (FMAX (T)), which is the maximum value of the signal component (S (F, T)) within each time (T), is obtained;
d) the signal component (S (F, T)) is divided by the time-axis maximum value (TMAX (F)) and the frequency-axis maximum value (FMAX (T)) to obtain the normalized signal component (NS (F, T));
e) the time-axis maximum value (TMAX (F)) in each frequency band (F) is compared with an audible voice threshold (TH (F)), and the converted time-axis maximum value (CTMAX (F)) is set to 0 when the former is smaller than or equal to the latter and to the time-axis maximum value (TMAX (F)) when the former is larger than the latter;
f) when the converted time-axis maximum value (CTMAX (F)) is 0, the signals at all times (T) within that frequency band (F) are not subsequently transmitted;
g) the normalized signal component (NS (F, T)) is quantized with at least one kind of code having a smaller number of bits to give the quantized signal (QS (F, T)); and
h) as the data to be sent, a frame synchronization signal (SYNC), the converted time-axis maximum value (CTMAX (F)), the frequency-axis maximum value (FMAX (T)) and an error code (CRC) are arranged in sequence to form a frame information code, and all the quantized signals (QS (F, T)) are appended as the unique audio signal data (DT (F)).

2. The compression encoding method according to claim 1, wherein, taking into account the amount of data not used for data transmission because the converted time-axis maximum value (CTMAX (F)) is 0 and the amount of data transmission specified by the transmission rate, codes with a larger number of bits are allocated to the frequency bands (F) having a large time-axis maximum value (CTMAX (F)) for conversion into the quantized signal (QS (F, T)).

3. The compression encoding method according to claim 1 or 2, wherein the frequency-axis maximum value (FMAX (T)) and the associated operations over the frequency bands, such as normalization and quantization, are carried out separately for each of a plurality of sub frequency bands (FB).

4. The compression encoding method according to claim 1, wherein the converted time-axis maximum value (CTMAX (F)) and the frequency-axis maximum value (FMAX (T)) are transmitted as logarithmically quantized values.

5. The compression encoding method according to claim 4, wherein the number of frequency bands used is 512, the number of time-axis data is 10, the number of band divisions is 8, the logarithmic coding of the converted time-axis maximum value (CTMAX (F)) and the frequency-axis maximum value (FMAX (T)) uses 6 bits in 2 dB steps, and the number of bits of the code of the compressed data is of three kinds: 1.6, 2.4 and 4 bits.

6. A method for decoding a compression-encoded digital audio signal, in which a data signal compressed and encoded according to claim 1 is received and decoded in the transmission of a digital audio signal, comprising the following processing steps:
i) decoding the frame synchronization signal (SYNC), the converted time-axis maximum value (CTMAX (F)) and the frequency-axis maximum value (FMAX (T)) in a received data block;
j) checking for data errors using the error code (CRC) in the data block and extracting an error-free data signal;
k) extracting from the data block the unique data belonging to the voice signal and decoding them to obtain the inverse normalized audio signal (TNS (F, T));
l) multiplying the inverse normalized audio signal (TNS (F, T)) by the smaller of the converted time-axis maximum value (CTMAX (F)) and the frequency-axis maximum value (FMAX (T)) to obtain the inverse audio signal (TS (F, T)); and
m) obtaining a digital output audio signal of a desired sampling frequency from the inverse audio signal (TS (F, T)).

7. The decoding method according to claim 6, wherein the inverse normalized audio signal (TNS (F, T)) is formed by decoding in which, taking into account the amount of data not used for data transmission because the converted time-axis maximum value (CTMAX (F)) is 0 and the amount of data transmission specified by the transmission rate, codes with a larger number of bits are allocated to the frequency bands (F) having a large time-axis maximum value (CTMAX (F)).

8. The decoding method according to claim 6 or 7, wherein the frequency-axis maximum value (FMAX (T)) and the associated operations over the frequency bands, such as normalization and quantization, are carried out separately for each of a plurality of sub frequency bands (FB).

9. The decoding method according to any one of claims 6 to 8, wherein the converted time-axis maximum value (CTMAX (F)) and the frequency-axis maximum value (FMAX (T)) are obtained by inverse logarithmic quantization of the received converted time-axis maximum value (CTMAX (F)) and frequency-axis maximum value (FMAX (T)).

10. The decoding method according to claim 1, wherein the number of frequency bands used is 512, the number of time-axis data is 10, the number of band divisions is 8, and the number of bits of the code of the compressed data is of three kinds: 1.6, 2.4 and 4 bits.