JPS5941599B2

JPS5941599B2 - Redundancy removed PCM audio transmission method

Info

Publication number: JPS5941599B2
Application number: JP53083763A
Authority: JP
Inventors: 伸二林; 一彦筧; 功芳畔柳
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1978-07-10
Filing date: 1978-07-10
Publication date: 1984-10-08
Also published as: JPS5511616A

Abstract

PURPOSE: To increase the compression effect with a simple configuration and obtain good reproduced speech by eliminating the redundant sustained portions of voiced sound, transmitting only the frame information for them, and reproducing the sustained portions by interpolation at demodulation. CONSTITUTION: The PCM speech signal supplied from input terminal 11, sampled at 8 kHz and quantized to 8 bits, is divided by frame dividing part 12 into frames of 80 samples (10 ms). The signal of each frame is sent to arithmetic part 13, which calculates the average power and the number of zero crossings and classifies each frame as silent, unvoiced, or voiced; this classification is added to the first sample of each frame. The frame-divided PCM signal is also supplied to switch circuit 14, which distributes silent frames to frame eliminating part 15, unvoiced frames to bit rearranging part 16, and voiced frames to steady frame omitting part 17. When the rate of change of the frame average power between adjacent frames is below a fixed level, arithmetic part 13 sends information indicating this to part 17, and for the subsequent frames everything except the frame information is eliminated.

Description

[Detailed description of the invention]

The present invention relates to a transmission system for PCM speech signal transmission that removes the redundancy of the speech itself and thereby improves transmission efficiency.

One previously proposed redundancy removal method is the TASI (Time Assigned Speech Interpolation) method.

This method detects the presence or absence of speech on a call, does not transmit during silent intervals, and uses those intervals to transmit other channels. In other words, it carries another call while the other party is talking and this side is producing no speech; it has no function for compressing the redundancy of the speech itself. It therefore has the disadvantage that, when the transmitted speech is continuous, the same transmission capacity as ordinary PCM is required. Another method is adaptive bandwidth PCM.

This method divides the input PCM speech into frames, calculates the power spectrum bandwidth of the speech within each frame, and resamples the speech at a sampling rate suited to transmitting that bandwidth before sending it. Its disadvantage is that the power spectrum bandwidth must be calculated, which is complicated. There is also an improved version of the TASI method that, in addition to detecting silence, estimates whether a frame is voiced or unvoiced from the average absolute value within the frame, the number of zero crossings, and the sign changes of the differences between adjacent samples, and then transmits unvoiced portions as PCM with a small number of quantization levels and voiced portions as DPCM.

This approach has the disadvantages that it requires a DPCM encoding/decoding function and that the overall configuration becomes complicated. The object of the present invention is to provide a PCM speech transmission system that removes the redundant sustained portions of voiced sound, transmits only the frame information for them, and reproduces those portions by interpolation at decoding, thereby removing the time-domain redundancy of call speech with a simple configuration and without loss of quality, so that the communication path can be used effectively.

A PCM speech signal, for example sampled at 8 kHz and quantized to 8 bits, is input from input terminal 11 to frame dividing section 12.

Dividing section 12 divides the input signal into frames of, for example, 80 samples (10 ms) each and attaches a frame synchronization signal at the first sampling instant of each frame. The signal of each frame is sent to calculation section 13. Calculation section 13 sums the squares of the sample code values within each frame to obtain the frame average power, computes the rate of change (rate of increase or decrease) of that average power between adjacent frames, and counts the number of times the sign bit of the codes changes within each frame to obtain the number of zero crossings per frame. If the frame average power is sufficiently small compared with the saturation level, the frame is judged to be a silent frame. If the frame average power is small, within about 20% of the saturation level, and the number of zero crossings is 10 or more, that is, the periodic component is 1 kHz or more, the frame is judged to be an unvoiced frame. Otherwise it is judged to be a voiced frame.
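The classification rule described above can be sketched in Python. This is only an illustration under stated assumptions, not the circuit of the patent: samples are assumed to be signed linear values with a peak magnitude of 127, the 20% figure is assumed to compare frame power against the squared saturation level, and the silence threshold `SILENCE_RATIO` is a made-up value because the text only says "sufficiently small". All function and constant names are hypothetical.

```python
from typing import Sequence

FRAME_LEN = 80            # samples per frame (10 ms at 8 kHz), from the text
SATURATION = 127          # assumed peak magnitude of a signed 8-bit linear sample
SILENCE_RATIO = 0.001     # assumed; the text only says "sufficiently small"
UNVOICED_RATIO = 0.20     # "within about 20% of the saturation level"
MIN_ZERO_CROSSINGS = 10   # "10 or more" zero crossings per frame


def average_power(frame: Sequence[int]) -> float:
    """Mean of the squared sample values in one frame."""
    return sum(s * s for s in frame) / len(frame)


def zero_crossings(frame: Sequence[int]) -> int:
    """Count sign changes between consecutive samples (0 counts as positive)."""
    signs = [s < 0 for s in frame]
    return sum(a != b for a, b in zip(signs, signs[1:]))


def classify(frame: Sequence[int]) -> str:
    """Return 'silent', 'unvoiced' or 'voiced' for one frame."""
    # Assumption: power is compared against the squared saturation level.
    ratio = average_power(frame) / (SATURATION ** 2)
    if ratio < SILENCE_RATIO:
        return "silent"
    if ratio <= UNVOICED_RATIO and zero_crossings(frame) >= MIN_ZERO_CROSSINGS:
        return "unvoiced"
    return "voiced"
```

With these assumed thresholds, a low-level frame that alternates in sign on every sample is classified as unvoiced, while a high-level frame with a single sign change is classified as voiced.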

This silent/unvoiced/voiced classification is added as frame information to the first sample of each frame, in the same way as the frame synchronization signal. Meanwhile, the frame-divided PCM signal is also supplied to switching circuit 14, which distributes silent frames to silent frame removal section 15, unvoiced frames to bit recombination section 16, and voiced frames to steady frame truncation section 17.

Silent frame removal section 15 removes everything except the frame information. Bit recombination section 16 discards the upper 4 bits of each sample code and recombines the bits of two samples into the space of one sample.

That is, because unvoiced sound has a low level, its upper bits are zero, so the lower 4 bits, for example, are sufficient, and the lower 4 bits of two samples are combined. In steady frame truncation section 17, when the rate of change of the frame average power between adjacent frames is at or below a fixed value, that is, when there is little change, information indicating this is sent from calculation section 13, and for the frames that follow everything except the frame information is removed.
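The two sender-side reductions described above, packing unvoiced samples and dropping steady voiced frames, might look as follows in Python. This is a sketch under assumptions: sample codes are treated as unsigned 8-bit integers (the text does not say how the sign is carried in the 4 retained bits), and `STEADY_THRESHOLD` is a hypothetical stand-in for the unspecified "fixed value". The function names are illustrative only.

```python
from typing import List, Sequence

STEADY_THRESHOLD = 0.1  # assumed; the text only says "a fixed value"


def pack_unvoiced(frame: Sequence[int]) -> List[int]:
    """Keep the lower 4 bits of each 8-bit code and pack two samples per byte."""
    if len(frame) % 2:
        raise ValueError("frame must hold an even number of samples")
    low = [code & 0x0F for code in frame]  # discard the upper 4 bits
    # First sample of each pair goes to the high nibble (an arbitrary choice).
    return [(a << 4) | b for a, b in zip(low[0::2], low[1::2])]


def is_steady(prev_power: float, power: float,
              threshold: float = STEADY_THRESHOLD) -> bool:
    """True when the relative change in frame average power is at most threshold."""
    if prev_power == 0:
        return False  # no meaningful rate of change against a zero-power frame
    return abs(power - prev_power) / prev_power <= threshold
```

An 80-sample unvoiced frame packed this way occupies 40 bytes. A voiced frame for which `is_steady` returns true would be sent as frame information only.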

The outputs of silent frame removal section 15, bit recombination section 16, and steady frame truncation section 17 are joined by frame connector 18. Timing generator 19 derives a clock signal and a frame signal from the clock of the signal at input terminal 11, and these are supplied to frame dividing section 12, calculation section 13, and the other sections. The signal Ch1, from which redundancy has been removed in this way and which has been shortened in time, is supplied to multiplexing section 21.

Signals Ch2, Ch3, ... from other channels, compressed in the same way, are also input to multiplexing section 21. These signals are multiplexed so as to occupy the free capacity of a line carrying 8 kHz sampled, 8-bit quantized PCM, and are transmitted to PCM line 22. On the receiving side, as shown in FIG. 2, the signal received from PCM line 22 is separated by demultiplexing section 23 into the speech signal of each channel, frame information, and timing information.

The description below follows one separated channel signal, Ch1. Timing information detection section 24 detects the frame information and timing information in the signal from demultiplexing section 23. In frame type separation section 25, the separated channel signal Ch1 is split into silent, unvoiced, and voiced frames according to the control information sent from timing information detection section 24. The separated unvoiced frames are sent to bit recombination section 26, and the voiced frames and silent frames are supplied to interpolation sections 27 and 28, respectively. Bit recombination section 26 converts the 8 bits of one sample time back into 4 bits × 2 for two sample times. Interpolation section 27 stores at least one frame of voiced sound; when frame information indicating that steady frame truncation was performed is input, the stored voiced sound is read out and repeated to perform the interpolation. For silent frames, interpolation section 28 regenerates the silent interval.
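The receiver-side steps described above can be sketched in Python as the inverse of the sender-side packing plus a one-frame memory. This is an illustration under assumptions: packed bytes are assumed to carry the first sample of each pair in the high nibble, and an omitted frame is signalled here by passing `None`, which stands in for the frame information flag. The class and function names are hypothetical.

```python
from typing import List, Optional, Sequence


def unpack_unvoiced(packed: Sequence[int]) -> List[int]:
    """Expand each byte into two 4-bit sample codes (high nibble first)."""
    out: List[int] = []
    for byte in packed:
        out.append((byte >> 4) & 0x0F)
        out.append(byte & 0x0F)
    return out


class VoicedInterpolator:
    """Stores the last received voiced frame and replays it for omitted frames."""

    def __init__(self, frame_len: int = 80) -> None:
        self.frame_len = frame_len
        self.last: Optional[List[int]] = None

    def receive(self, samples: Optional[Sequence[int]]) -> List[int]:
        """Pass a frame through, or repeat the stored frame when samples is None."""
        if samples is not None:
            self.last = list(samples)
            return list(samples)
        if self.last is None:
            # Nothing stored yet: fall back to silence (an assumption).
            return [0] * self.frame_len
        return list(self.last)
```

Simple repetition is the only interpolation the text describes; smoothing across the repeated frames would be an extension beyond the source.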

The outputs of bit recombination section 26 and interpolation sections 27 and 28 are combined in combining section 29. A signal in which the removed redundancy has been restored is thus obtained from combining section 29. Because this demodulated signal differs from the original only in that the redundant portions of the original speech were processed, the naturalness and intelligibility of the speech are not impaired.

When a schematic temporal envelope waveform of speech such as that shown in FIG. 3 is input, only frame information is transmitted in silent section 31, and in unvoiced section 32 each sample is transmitted with a low bit count (4 bits). In voiced section 33, where the level is changing, each sample is transmitted with a high bit count (8 bits). In voiced section 34, where the level change is small, only frame information is transmitted. At demodulation, section 34 is interpolated by repeatedly using the immediately preceding frame 35. In this example waveform, the redundancy of the time occupied by sections 31 and 34 is compressed and the number of bits in the time occupied by section 32 is halved, while the frame information increases the total by 1/80. TF denotes the frame period. Note that even if unvoiced sound is transmitted with 8 bits per sample, without bit recombination, a large compression effect is still obtained by removing silent frames and the sections of voiced frames in which the level change is small.
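The bit budget implied by this description can be worked through with a short Python sketch. It assumes that the frame information occupies one 8-bit sample slot per 80-sample frame, which is one reading consistent with the stated 1/80 overhead; the text does not give the exact frame information format. The function names are hypothetical.

```python
from typing import Iterable, Tuple

FRAME_LEN = 80     # samples per frame, from the text
SAMPLE_BITS = 8    # bits per PCM sample, from the text
PACKED_BITS = 4    # bits per unvoiced sample after recombination
INFO_BITS = 8      # assumed: one sample slot of frame information per frame


def frame_bits(kind: str, steady: bool = False) -> int:
    """Bits sent for one frame of the given kind."""
    if kind == "silent" or (kind == "voiced" and steady):
        return INFO_BITS                          # frame information only
    if kind == "unvoiced":
        return INFO_BITS + FRAME_LEN * PACKED_BITS
    return INFO_BITS + FRAME_LEN * SAMPLE_BITS    # changing voiced frame


def compression_ratio(frames: Iterable[Tuple[str, bool]]) -> float:
    """Bits sent divided by the bits of uncompressed 8-bit PCM."""
    frames = list(frames)
    sent = sum(frame_bits(kind, steady) for kind, steady in frames)
    return sent / (len(frames) * FRAME_LEN * SAMPLE_BITS)
```

Under this assumption a changing voiced frame costs 648 bits against 640 for plain PCM, which is the 1/80 increase, while a silent or steady voiced frame costs 8 bits. The overall ratio depends entirely on the mix of frame types in the speech.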

As explained above, the PCM speech transmission system of the present invention removes the redundancy contained in the speech itself, achieves a large compression effect, and yields reproduced speech of excellent quality.

Accordingly, when applied to fields such as long-distance PCM communication, voice response equipment, and speech recording, it has the advantages of being economical and causing little quality degradation.

[Brief explanation of drawings]

FIG. 1 is a block diagram showing an example of the transmitting side of the redundancy-removed PCM speech transmission system according to the present invention, FIG. 2 is a block diagram showing an example of its receiving side, and FIG. 3 is a diagram showing an example of a speech envelope.

11: input terminal, 12: frame dividing section, 13: calculation section, 14: switching circuit, 15: silent frame truncation section, 16: bit recombination section, 17: steady frame truncation section, 18: frame connector, 19: timing signal generator, 21: channel multiplexing section, 22: PCM line, 23: channel demultiplexing section, 24: frame information separation section, 25: frame type separation section, 26: bit recombination section, 27, 28: interpolation sections, 29: frame combining section.

Claims

[Claims]

1. A redundancy-removed PCM audio transmission method characterized by dividing PCM-encoded speech into frames, inserting frame information into each frame, classifying each frame as silent, unvoiced, or voiced and inserting the classification into the frame information, transmitting silent frames and voiced frames whose level change is below a fixed value with the frame information retained and everything else removed, and reproducing the removed portions by interpolation at decoding.