JPH0236628A

JPH0236628A - Transmission system and transmission/reception system for voice signal

Info

Publication number: JPH0236628A
Application number: JP63187521A
Authority: JP
Inventors: Kimitatsu Satou; 佐藤　仁樹
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1988-07-27
Filing date: 1988-07-27
Publication date: 1990-02-06

Abstract

PURPOSE: To achieve effective use of the line and reproduction of voice with excellent naturalness by transmitting a voice signal while taking into account the presence of voiced sections and silent sections containing background noise. CONSTITUTION: A voiced/silent determination means 2 determines for each frame whether the signal is in a voiced part or a silent part; voiced frames are output to a voiced encoding means 3, and noise frames determined to be silent parts are output to a buffer means 4. The voiced encoding means 3 encodes the voiced frames and adds identification information to the resulting code blocks. When the buffer means 4 has stored a prescribed number T of noise frames, it transfers them to a noise encoding means 5, which generates k code blocks and adds identification information to each. The voiced and noise code blocks obtained by the voiced encoding means 3 and the noise encoding means 5 are then transmitted. Thus the sender transmits not only the information of the voiced frames but also the background-noise information of the silent frames, and the receiver obtains natural reproduced voice in which the background noise hardly changes between voiced and silent sections.

Description

[Detailed Description of the Invention] [Object of the Invention] (Field of Industrial Application) The present invention relates to a voice signal transmission system and transmission/reception system that transmit a voice signal while taking into account the presence of voiced sections and silent sections containing background noise, thereby enabling effective use of the line and reproduction of voice with excellent naturalness.

(Prior Art) A voice signal, particularly a conversational voice signal, consists of voiced sections and silent sections, and the information of the conversation is contained only in the voiced sections. For this reason, when a voice signal is encoded and transmitted, conventional systems transmit only the codes of the voiced sections in order to use the line effectively.

FIGS. 6 and 7 show a conventional voice signal transmission/reception system based on this approach; FIG. 6 shows the configuration of the transmitting side and FIG. 7 that of the receiving side. On the transmitting side, a voice detector 61 divides the input voice signal into voiced sections and silent sections, and an encoder 62 encodes and transmits only the voiced sections.

On the receiving side, the received code is decoded into voice by a decoder 71. The receiving side is further provided with a noise synthesizer 72 consisting of a white noise generator.

When a voiced-section voice signal is available from the decoder 71, a switch 73 outputs it as is. When no voiced-section voice signal is available, that is, in a silent section, the switch 73 outputs the white noise from the noise synthesizer 72 as the background noise of the silent section.

In this system, however, the white noise from the noise synthesizer 72 that the receiving side outputs as background noise in silent sections differs from the actual background noise heard together with the voice signal in voiced sections. The background noise therefore changes greatly between voiced and silent sections, and the resulting voice sounds very unnatural.

(Problem to be Solved by the Invention) Thus, in the conventional technique the transmitting side does not transmit noise information, and the receiving side outputs white noise as background noise in silent sections. As a result, the background noise in voiced sections and in silent sections sounds markedly different.

An object of the present invention is to solve this problem by providing a voice signal transmission system and transmission/reception system with which the receiving side obtains reproduced voice whose background noise changes little between voiced and silent sections, without greatly increasing the amount of information transmitted from the transmitting side.

[Constitution of the Invention] (Means for Solving the Problem) In the voice signal transmission system according to the present invention, a voiced/silent determination means divides the voice signal into frames of a predetermined length and determines for each frame whether it is a voiced part or a silent part. Voiced frames, determined to be voiced parts, are output to a voiced encoding means, and noise frames, determined to be silent parts, are output to a buffer means. The voiced encoding means encodes the voiced frames to generate code blocks and adds to each code block identification information indicating that it is a voiced code block. When the buffer means has accumulated a predetermined number T of noise frames, it transfers those T noise frames to a noise encoding means. The noise encoding means encodes predetermined frames, selected from the T noise frames by a predetermined criterion, to generate k code blocks, and adds to each code block identification information indicating that it is a noise code block. The voiced and noise code blocks obtained by the voiced encoding means and the noise encoding means are then transmitted.

In the voice signal transmission/reception system according to the present invention, the transmitting side transmits voiced and noise code blocks as described above. On the receiving side, a code block separation means first determines, from the identification information added to each code block, whether it is a voiced code block or a noise code block, and outputs voiced code blocks to a voiced decoding means and noise code blocks to a noise decoding means.

The voiced signal and the noise signal obtained by the voiced decoding means and the noise decoding means are then concatenated and output.

(Operation) In the present invention, the transmitting side transmits not only the information of the voiced sections but also the information of the background noise in the silent sections. The receiving side therefore obtains natural reproduced voice with almost no change in background noise between voiced and silent sections.

On the transmitting side, voiced frames are encoded and transmitted frame by frame, whereas noise frames are encoded together once T frames have accumulated in the buffer means and are transmitted as k code blocks, fewer than the number of frames. In other words, noise frames are encoded at a higher compression ratio than voiced frames.

For the information in voiced sections, the transmission delay should be as short as possible in order to keep the conversation natural and flowing smoothly. According to the present invention, voiced frames are encoded frame by frame, so their transmission delay is very small.

The background noise in silent sections, on the other hand, carries no important meaning for conveying the content of the conversation, but it is necessary for a natural conversation. Because it is sufficient for the receiving side to reproduce the characteristics of the background noise, its information can be compressed more strongly than that of the voiced sections before transmission, as in the present invention.

In the present invention, noise frames reach the receiving side with a transmission delay of T frames. Since background noise is stationary in most cases, this delay does not impair naturalness.

(Embodiment) An embodiment of the present invention will now be described with reference to the drawings.

FIG. 1 is a block diagram showing the configuration of the transmitting side according to an embodiment of the present invention. In the figure, the voice signal to be transmitted is input to a voiced/silent determiner 2 via an input terminal 1. The voiced/silent determiner 2 divides the input voice signal into frames of a predetermined length, determines for each frame whether it is a voiced part or a silent part, and outputs separately the voiced frames (determined to be voiced parts) and the noise frames (determined to be silent parts).
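The framing and per-frame decision performed by the voiced/silent determiner 2 can be illustrated with a minimal Python sketch. The patent does not state the decision criterion; the mean-energy threshold used here, and the names `split_frames`, `is_voiced`, `classify` and `energy_threshold`, are assumptions made only for illustration.

```python
from typing import List, Tuple


def split_frames(samples: List[float], frame_len: int) -> List[List[float]]:
    """Cut the signal into consecutive frames of frame_len samples (tail dropped)."""
    n = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(n)]


def is_voiced(frame: List[float], energy_threshold: float) -> bool:
    """Assumed criterion: mean energy at or above the threshold means 'voiced'."""
    energy = sum(x * x for x in frame) / len(frame)
    return energy >= energy_threshold


def classify(samples: List[float], frame_len: int,
             energy_threshold: float) -> List[Tuple[str, List[float]]]:
    """Label every frame 'voiced' or 'noise', keeping the original order."""
    return [("voiced" if is_voiced(f, energy_threshold) else "noise", f)
            for f in split_frames(samples, frame_len)]
```

Frames labelled "voiced" would go to the voiced encoder 3 and frames labelled "noise" to the buffer 4.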

The voiced frames output from the voiced/silent determiner 2 are input to a voiced encoder 3. The voiced encoder 3 encodes the input voiced frames by a coding method such as APC-MLQ (adaptive predictive coding with maximum-likelihood quantization) or ATC-VQ (adaptive transform coding with vector quantization), forms a code string of a certain length into a code block, and adds to the head of the code block identification information indicating that it is a voiced code block before outputting it. When the code block is transmitted via the transmitter 6, information indicating the start and end of each code block may also be added. Voiced frames and code blocks need not correspond one to one: for example, several voiced frames may correspond to one code block, or one voiced frame may correspond to several code blocks. The code block length also need not be fixed; variable-length blocks may be used.
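One possible way to attach the identification information to variable-length code blocks is sketched here. The byte layout (a one-byte identifier followed by a two-byte length), the identifier values and the function names are all hypothetical; the patent only requires that each block carry an identifier and, optionally, start and end information.

```python
from typing import Iterator, Tuple

VOICED_ID = 0x01  # hypothetical identifier for voiced code blocks
NOISE_ID = 0x02   # hypothetical identifier for noise code blocks


def make_code_block(kind_id: int, payload: bytes) -> bytes:
    """Prefix the payload with [identifier byte][2-byte big-endian length]."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload too long for a 2-byte length field")
    return bytes([kind_id]) + len(payload).to_bytes(2, "big") + payload


def parse_code_blocks(stream: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (identifier, payload) for each block in a concatenated stream."""
    pos = 0
    while pos < len(stream):
        kind_id = stream[pos]
        length = int.from_bytes(stream[pos + 1:pos + 3], "big")
        yield kind_id, stream[pos + 3:pos + 3 + length]
        pos += 3 + length
```

In this sketch the length field plays the role of the start/end information, which is what lets blocks of different lengths be concatenated and still be split apart at the receiver.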

The noise frames output from the voiced/silent determiner 2, on the other hand, are input to a buffer 4. The buffer 4 holds a predetermined number of input noise frames. A noise encoder 5 is connected to the buffer 4. When T noise frames have accumulated in the buffer 4, the noise encoder 5 issues a transfer command to the buffer 4, encodes together the T noise frames transferred in response, and outputs k code blocks. Note that T and k need not be constant.
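The interaction between the buffer 4 and the noise encoder 5 can be sketched as follows. In the patent the encoder issues a transfer command to the buffer; this sketch models that with a callback, and the class name `NoiseFrameBuffer` and the `encode_batch` parameter are assumptions for illustration.

```python
from typing import Callable, List, Sequence


class NoiseFrameBuffer:
    """Collects noise frames and hands them to an encoder in batches of T."""

    def __init__(self, T: int,
                 encode_batch: Callable[[List[Sequence[float]]], List[bytes]]):
        self.T = T
        self.encode_batch = encode_batch  # stands in for noise encoder 5
        self.frames: List[Sequence[float]] = []

    def push(self, frame: Sequence[float]) -> List[bytes]:
        """Store one noise frame; return code blocks once T frames are held."""
        self.frames.append(frame)
        if len(self.frames) < self.T:
            return []
        batch, self.frames = self.frames, []
        return self.encode_batch(batch)
```

Voiced frames bypass this buffer entirely, which is why only the noise path incurs a delay of T frames.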

The voiced and noise code blocks obtained from the voiced encoder 3 and the noise encoder 5 are sent out by the transmitter 6 to the transmission path via an output terminal 7.

FIG. 2 is a block diagram showing a specific configuration example of the noise encoder 5. The noise frames from the buffer 4 in FIG. 1 are input to an intra-frame feature detector 12 via a terminal 11. The intra-frame feature detector 12 analyzes and detects the features within each noise frame and outputs intra-frame feature quantities, which are transferred to and stored in a buffer 13. An inter-frame feature detector 14 detects inter-frame feature quantities from the intra-frame feature quantities stored in the buffer 13. Based on the intra-frame feature quantities stored in the buffer 13 and the inter-frame feature quantities detected by the inter-frame feature detector 14, a determiner 15 decides that j of the T noise frames are to be encoded. The decision of the determiner 15 is supplied to an encoder 16, which receives the output of the intra-frame feature detector 12 via the buffer 13. In accordance with that decision, the encoder 16 encodes the intra-frame feature quantities of j noise frames and outputs them to the transmitter 6 in FIG. 1 via a terminal 17.

Each part of the noise encoder 5 shown in FIG. 2 is now described more specifically. First, the intra-frame feature detector 12 calculates, for example for each frame, I feature quantities such as the sum of the power of the samples in the noise frame, Fourier coefficients, autocorrelation coefficients, and linear prediction coefficients. Let the i-th feature quantity of the m-th noise frame be $CA_i(m)$ ($m = 1, 2, \dots, T$; $i = 1, 2, \dots, I$). The intra-frame feature quantities $CA_i(m)$ calculated in this way are held in the buffer 13 for $m = 1$ to $T$.
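Two of the intra-frame features named above, the power sum and the autocorrelation coefficients, can be computed as in the sketch below. Which features are used and how many (the value of I) are left open by the patent; the choice made here and the function names are illustrative assumptions.

```python
from typing import List, Sequence


def frame_power(frame: Sequence[float]) -> float:
    """Sum of the squared samples in the frame."""
    return sum(x * x for x in frame)


def autocorrelation(frame: Sequence[float], max_lag: int) -> List[float]:
    """Unnormalized autocorrelation r[0..max_lag] of the frame."""
    n = len(frame)
    return [sum(frame[t] * frame[t + lag] for t in range(n - lag))
            for lag in range(max_lag + 1)]


def intra_frame_features(frame: Sequence[float], max_lag: int = 2) -> List[float]:
    """Feature vector for one frame: power, then autocorrelation at lags 1..max_lag."""
    r = autocorrelation(frame, max_lag)
    return [frame_power(frame)] + r[1:]
```

With this ordering the first feature is the frame power, so it can serve as the quantity compared against a power threshold when deciding whether the noise is worth encoding.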

Once the intra-frame feature quantities $CA_i(m)$ ($m = 1$ to $T$) have been stored in the buffer 13, the inter-frame feature detector 14 detects the correlation $CI_i(m, n)$ among the T frames. For example, if $CA_i(m)$ is the Fourier transform within the m-th noise frame, its first component $CA_1(m)$ is the power of the m-th frame. As the inter-frame correlation, the correlation $CI_i(m, n)$ ($m = 1$ to $T$, $n = 1$ to $T$) between $CA_i(m)$ ($m = 1$ to $T$) and $CA_i(n)$ can be used, for example.
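The patent does not give a formula for the inter-frame correlation. As one plausible reading, the sketch below uses the normalized inner product of the feature vectors of two frames; this definition and the function names are assumptions, not the patent's stated method.

```python
import math
from typing import List, Sequence


def correlation(a: Sequence[float], b: Sequence[float]) -> float:
    """Normalized correlation of two feature vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0.0 else 0.0


def inter_frame_correlations(features: List[Sequence[float]]) -> List[List[float]]:
    """Matrix of correlations over all pairs of the T stored feature vectors.

    Indices are 0-based here, so entry [m][n] corresponds to frames m+1 and n+1.
    """
    return [[correlation(fm, fn) for fn in features] for fm in features]
```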

The determiner 15 determines the number j of frames to be encoded among the T noise frames from the intra-frame feature quantities $CA_1(m)$ stored in the buffer 13 and the inter-frame feature quantities (inter-frame correlations) $CI_1(m, n)$ from the inter-frame feature detector 14. For example, with decision thresholds $Th_1$ and $Th_2$, j is determined as follows:

If $CA_1(1) \ge Th_1$: $j = \min\,[\, n : |CI_1(1, n)| > Th_2 \,]$ ... (1)

If $CA_1(1) < Th_1$: $j = 0$ ... (2)

Here expression (1) denotes the smallest n that satisfies the condition in [ ].
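Rules (1) and (2) can be sketched as below. Two points are assumptions rather than statements of the patent: the OCR of expression (1) is ambiguous, and the comparison is read here as applying to the absolute value of the correlation; and the patent does not say what happens when no n satisfies the condition, so the sketch falls back to encoding all T frames. The function and parameter names are hypothetical.

```python
from typing import Sequence


def frames_to_encode(ca1_first: float, ci1_row: Sequence[float],
                     th1: float, th2: float) -> int:
    """Return j, the number of noise frames (out of T) to encode.

    ca1_first -- CA_1(1), taken here as the power of the first noise frame
    ci1_row   -- CI_1(1, n) for n = 1..T, stored at indices 0..T-1
    """
    if ca1_first < th1:
        return 0                  # rule (2): noise too weak to be worth sending
    for n, value in enumerate(ci1_row, start=1):
        if abs(value) > th2:      # assumed reading of the garbled condition
            return n              # rule (1): smallest n meeting the condition
    return len(ci1_row)           # assumption: no n qualifies, fall back to T
```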

Based on the value j calculated by the determiner 15, the encoder 16 encodes the feature quantities of the first j noise frames in the buffer 13 as k code blocks, adds to the head of each code block identification information indicating that it is a noise code block, and outputs the result.

When these code blocks are transmitted via the transmitter 6, information indicating the start and end of each code block is also added. In this way, noise frames in a silent section are encoded and transmitted only when the noise is significant, in other words when it is at or above the level at which it is perceived as background noise.

FIG. 3 is a block diagram showing the configuration of the receiving side.

In the figure, the signal transmitted from the transmitting side of FIG. 1 is input to a receiver 21 via an input terminal 20.

The signal received by the receiver 21 is input to a code block separator 22. The code block separator 22 distinguishes voiced code blocks from noise code blocks on the basis of the identification information added to each input code block, and transfers voiced code blocks to a voiced decoder 23 and noise code blocks to a buffer 24. The voiced decoder 23 decodes the voiced code blocks by the reverse of the process used in the voiced encoder 3 of FIG. 1.

The buffer 24 temporarily stores the input noise code blocks and, when instructed by a buffer monitor 26, transfers the stored noise code blocks to a second buffer 25. The buffer 25 stores the noise code blocks to be decoded by a noise decoder 27 and transfers them to the noise decoder 27 when instructed by the buffer monitor 26.

After the k'-th noise code block has entered the buffer 24, if the next noise code block does not arrive within a waiting time Tb, the buffer monitor 26 transfers the k' noise code blocks in the buffer 24 to the buffer 25. Let k'' be the number of noise code blocks held in the buffer 25 at that time. After the i-th noise code block in the buffer 25 has been decoded, the i-th noise code block in the buffer 24 is transferred to the buffer 25, in the order i = 1 to k'. If k' > k'' and noise code blocks still remain in the buffer 24 after the last noise code block in the buffer 25 has been decoded, the remaining blocks in the buffer 24 are transferred to the buffer 25 immediately. Conversely, if k' < k'', old noise code blocks remain in the buffer 25 even after all the noise code blocks in the buffer 24 have been transferred, so they are erased. After that, k'' is set equal to k'.
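The block-by-block replacement of the contents of buffer 25 by those of buffer 24 can be sketched as follows, with k' the number of newly received blocks and k'' the number of blocks currently held for playback. The sketch covers only the end result of the transfer: the timeout Tb and the synchronization with the decoder (waiting until the i-th block has been decoded before overwriting it) are omitted, and the function name is hypothetical.

```python
from typing import List


def refresh_playback_buffer(incoming: List[bytes], playback: List[bytes]) -> int:
    """Replace the blocks in `playback` (buffer 25) with those in `incoming` (buffer 24).

    Blocks are overwritten position by position, so a decoder cycling through
    `playback` always finds a usable block. Returns the new block count k''.
    """
    k_new, k_old = len(incoming), len(playback)
    for i in range(min(k_new, k_old)):
        playback[i] = incoming[i]          # overwrite the i-th old block
    if k_new > k_old:
        playback.extend(incoming[k_old:])  # k' > k'': append the remaining blocks
    else:
        del playback[k_new:]               # k' < k'': erase leftover old blocks
    incoming.clear()
    return len(playback)
```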

The noise decoder 27 decodes the noise code blocks stored in the buffer 25 from the first block onward, by the reverse of the process used in the encoder 16 of FIG. 2. It obtains the value k'' from the buffer monitor 26 and, after decoding the k''-th noise code block, starts decoding again from the first noise code block.
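The cyclic decoding performed by the noise decoder 27, which wraps back to the first block after the last stored block, can be sketched as a generator. The `decode` callback stands in for the actual inverse of the encoder 16 and, like the function name, is an assumption for illustration.

```python
from typing import Callable, Iterator, List


def cyclic_noise(blocks: List[bytes],
                 decode: Callable[[bytes], List[float]]) -> Iterator[List[float]]:
    """Yield decoded noise block by block, wrapping to the first block after the last."""
    i = 0
    while blocks:
        if i >= len(blocks):   # past the last block (or the buffer shrank): wrap
            i = 0
        yield decode(blocks[i])
        i += 1
```

Because the list length is re-read on every step, the generator keeps working if the stored blocks are replaced in place while it is running.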

The voiced signal and the noise signal, which are the decoding results obtained by the voiced decoder 23 and the noise decoder 27 respectively, are input to a switch 28. When a voiced signal is being input from the voiced decoder 23, the switch 28 outputs it as is; when no decoding result is being input from the voiced decoder 23, the switch 28 outputs the noise signal from the noise decoder 27. In other words, the switch 28 concatenates the output of the voiced decoder 23 and the output of the noise decoder 27. The output of the switch 28 is input to a smoothing filter 29. The smoothing filter 29 smooths two kinds of junction: the junction between the decoder output for the k''-th code block and that for the first code block, and the junction between the output of the voiced decoder 23 and the output of the noise decoder 27.
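The switch 28 and the smoothing filter 29 can be sketched as follows. The patent does not specify the smoothing operation; the three-point moving average applied around the junction, the `radius` parameter and the function names are assumptions chosen only to illustrate the idea.

```python
from typing import List, Optional


def select_output(voiced: Optional[List[float]], noise: List[float]) -> List[float]:
    """Switch: pass the voiced signal through when present, else the noise signal."""
    return voiced if voiced is not None else noise


def smooth_junction(a: List[float], b: List[float], radius: int = 2) -> List[float]:
    """Concatenate a and b, then apply a 3-point moving average near the joint."""
    out = list(a) + list(b)
    if not a or not b:
        return out                       # nothing to smooth without a joint
    joint = len(a)
    lo = max(1, joint - radius)
    hi = min(len(out) - 1, joint + radius)
    src = list(out)                      # read from an unmodified copy
    for i in range(lo, hi):
        out[i] = (src[i - 1] + src[i] + src[i + 1]) / 3.0
    return out
```

The output length equals the combined input length, so the smoothing does not shift the timing of the reproduced voice.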

The output of the smoothing filter 29 is output from an output terminal 30 as reproduced voice.

Next, other configuration examples of the noise encoder 5 in FIG. 1 are described with reference to FIGS. 4 and 5.

In the example shown in FIG. 2, the output of the intra-frame feature detector 12 is input to the encoder 16 via the buffer 13. In the example of FIG. 4, by contrast, the noise frames from the buffer 4 in FIG. 1 are input directly to the encoder 16 from the terminal 11.

In this case, the encoder 16 encodes the first j frames of the input noise frames by a method such as DPCM or orthogonal transform coding and divides the resulting code into k code blocks. As before, identification information indicating that it is a noise code block is added to the head of each code block before output.

Furthermore, in FIG. 2 the output of the intra-frame feature detector 12 is input to the inter-frame feature detector 14 via the buffer 13, whereas in FIG. 5 the noise frames are input directly to the inter-frame feature detector 14 via the terminal 11, so that intra-frame feature detection and inter-frame feature detection are performed in parallel.

The present invention can also be implemented with various other modifications without departing from its gist.

[Effects of the Invention] As described above, according to the present invention, encoded data on the background noise is transmitted in silent sections as well, and the receiving side reproduces the background noise from that data and outputs it in the silent sections. Stationary background noise therefore hardly changes between voiced and silent sections, and reproduced voice with excellent naturalness is obtained.

In addition, the invention exploits the fact that background noise information, unlike the information in voiced sections, does not need to be reproduced very accurately and, being composed mostly of stationary components, does not sound unnatural even when delayed. T noise frames are therefore grouped and transmitted as fewer code blocks than voiced frames would require, so the amount of transmitted information does not increase greatly and the loss of transmission efficiency is kept to a minimum.

[Brief Description of the Drawings]

FIG. 1 is a block diagram showing the configuration of the transmitting side in an embodiment of the present invention; FIG. 2 is a block diagram showing a specific configuration example of the noise encoder in FIG. 1; FIG. 3 is a block diagram showing the configuration of the receiving side in an embodiment of the present invention; FIGS. 4 and 5 are block diagrams showing other configuration examples of the noise encoder according to the present invention; and FIGS. 6 and 7 are block diagrams for explaining a conventional voice signal transmission/reception system.

1: voice signal input terminal; 2: voiced/silent determiner; 3: voiced encoder; 4: buffer; 5: noise encoder; 6: transmitter; 7: transmission output terminal; 20: reception input terminal; 21: receiver; 22: code block separator; 23: voiced decoder; 27: noise decoder; 28: switch; 29: smoothing filter; 30: reproduced voice output terminal.

Agent for the applicant: Patent Attorney Takehiko Suzue

Claims

[Claims]

(1) A voice signal transmission system comprising: voiced/silent determination means for dividing a voice signal into frames of a predetermined length, determining for each frame whether it is a voiced part or a silent part, and separately outputting voiced frames determined to be voiced parts and noise frames determined to be silent parts; voiced encoding means for encoding the voiced frames obtained by the voiced/silent determination means to generate code blocks and adding to each code block identification information indicating that it is a voiced code block; buffer means for holding the noise frames obtained by the voiced/silent determination means; noise encoding means for generating, when a predetermined number T of noise frames have accumulated in the buffer means, k code blocks from the T noise frames and adding to each code block identification information indicating that it is a noise code block; and means for transmitting the voiced and noise code blocks thus obtained.

(2) A voice signal transmission/reception system comprising: voiced/silent determination means for dividing a voice signal into frames of a predetermined length, determining for each frame whether it is a voiced part or a silent part, and separately outputting voiced frames determined to be voiced parts and noise frames determined to be silent parts; voiced encoding means for encoding the voiced frames obtained by the voiced/silent determination means to generate code blocks and adding to each code block identification information indicating that it is a voiced code block; buffer means for holding the noise frames obtained by the voiced/silent determination means; noise encoding means for generating, when a predetermined number T of noise frames have accumulated in the buffer means, k code blocks from the T noise frames and adding to each code block identification information indicating that it is a noise code block; means for transmitting the voiced and noise code blocks thus obtained; code block separation means for determining, on the basis of the identification information added to each code block transmitted by the transmitting means, whether the code block is a voiced code block or a noise code block, and separately outputting the voiced and noise code blocks; voiced decoding means for decoding the voiced code blocks obtained by the code block separation means; noise decoding means for decoding the noise code blocks obtained by the code block separation means; and means for concatenating and outputting the voiced signal and the noise signal obtained by the voiced decoding means and the noise decoding means.