JP3001584B2

JP3001584B2 - Audio signal transmission method

Info

Publication number: JP3001584B2
Application number: JP1025541A
Authority: JP
Inventors: 仁樹佐藤
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1989-02-03
Filing date: 1989-02-03
Publication date: 2000-01-24
Anticipated expiration: 2015-01-24
Also published as: JPH02206246A

Description

DETAILED DESCRIPTION OF THE INVENTION [Object of the Invention] (Industrial Field of Application) The present invention relates to an audio signal encoding method that divides an audio signal into voiced sections containing speech, noise sections containing background noise, and silent sections, and encodes the information of the voiced sections and the noise sections, thereby enabling efficient use of the line and reproduction of natural-sounding speech.

(Prior Art) An audio signal, and conversational speech in particular, consists of voiced sections that contain speech and silent sections, and the information of the conversation is contained only in the voiced sections. For this reason, when an audio signal is encoded and transmitted, the conventional approach is to transmit only the codes of the voiced sections so that the line is used efficiently.

FIGS. 10 and 11 show a conventional audio signal transmission/reception system based on this approach; FIG. 10 shows the configuration of the transmitting side and FIG. 11 that of the receiving side. On the transmitting side, a voiced-sound detector 101 classifies the input audio signal into voiced sections and silent sections, and an encoder 102 encodes and transmits only the voiced sections.

On the receiving side, a decoder 111 decodes the received codes into speech. The receiving side further includes a noise synthesizer 112 consisting of a white noise generator. When a voiced-section audio signal is available from the decoder 111, a switch 113 outputs it unchanged; when there is none, that is, in a silent section, the switch outputs the white noise from the noise synthesizer 112 as the background noise of the silent section.

With this approach, however, the white noise from the noise synthesizer 112 that the receiving side outputs as background noise in silent sections differs from the actual background noise heard together with the speech in voiced sections. The background noise therefore changes markedly between voiced and silent sections, and the reproduced speech sounds very unnatural.

(Problems to Be Solved by the Invention) As described above, in the conventional technique the transmitting side sends no noise information and the receiving side outputs white noise as background noise in silent sections. As a result, the background noise heard superimposed on the speech in voiced sections sounds very different from the background noise in silent sections. The background noise thus changes between voiced and silent sections, and the speech sounds unnatural.

An object of the present invention is to solve this problem and to provide an audio signal encoding method that yields reproduced speech with little change in background noise between voiced and silent sections on the receiving side, with almost no increase in the amount of information transmitted from the transmitting side.

[Configuration of the Invention] (Means for Solving the Problems) To achieve the above object, the present invention is an audio signal transmission method that determines voiced sections and silent sections of an audio signal and transmits the speech information of the voiced sections together with the noise information of the silent sections, characterized in that the noise information is encoded at a higher compression ratio than the audio signal and then transmitted.

The second invention is an audio signal transmission method that determines voiced sections and silent sections of an audio signal and transmits the speech information of the voiced sections together with the noise information of the silent sections, characterized in that only noise information exceeding a predetermined threshold is encoded at a higher compression ratio than the audio signal and transmitted.

The third invention is an audio signal transmission method that determines voiced sections and silent sections of an audio signal and transmits the voiced information of the voiced sections together with the noise information of the silent sections, characterized in that only noise information that has a characteristic feature is encoded at a higher compression ratio than the audio signal and transmitted.

The fourth invention is an audio signal transmission method that determines voiced sections and silent sections of an audio signal and transmits the speech information of the voiced sections together with the noise information of the silent sections, characterized in that it is determined whether the noise information of the silent sections is significant noise, that is, whether it contains noise that should be transmitted to the receiving side, and only the significant noise is encoded at a higher compression ratio than the audio signal and transmitted.

The fifth invention is an audio signal transmission method that determines voiced sections and silent sections of an audio signal and transmits the speech information of the voiced sections together with the noise information of the silent sections, characterized in that it is determined whether the noise information of the silent sections is significant noise, that is, whether it contains noise exceeding a predetermined threshold, and only the significant noise is encoded at a higher compression ratio than the audio signal and transmitted.

The sixth invention is an audio signal transmission method that determines voiced sections and silent sections of an audio signal and transmits the speech information of the voiced sections together with the noise information of the silent sections, characterized in that it is determined whether the noise information of the silent sections is significant noise, that is, noise that is audible on the receiving side, and only the significant noise is encoded at a higher compression ratio than the audio signal and transmitted.

(Operation) The background noise of silent sections carries no important meaning for the conversation, but it is needed for the conversation to sound natural. For voiced sections, a short transmission delay is desirable in order to preserve the naturalness of the conversation. Background noise, by contrast, only needs to have its characteristics reproduced, so it can be transmitted with its information compressed more than that of voiced sections. Its delay requirement is also less strict than that of voiced sections.

According to the present invention, the transmitting side sends not only the information of the voiced sections but also the information on the background noise in the silent sections, so that the receiving side obtains natural reproduced speech with almost no change in background noise between voiced and silent sections.

Furthermore, because delay can be tolerated for the background noise, the transmitting side can encode it at a higher compression ratio than the voiced frames, so transmitting the background noise information hardly increases the amount of information.

(Embodiments) An embodiment of the present invention is described below with reference to the drawings.

FIG. 1 is a block diagram showing the configuration of an embodiment of the present invention. In the figure, reference numeral 10 denotes an audio signal input terminal, through which a digital audio signal is input.

The buffers 132, 142, 150, and 156 used below store audio signal information frame by frame, and they must preserve the order in which the information was input. The audio signal information consists of feature parameters or samples. To preserve the order within a buffer, the feature parameters, for example, are stored from the head of the buffer toward the tail in the order in which they were input. That is, the newest feature parameter (the feature parameter of the frame currently being judged) is stored at the head of the buffer and the oldest feature parameter at the tail.
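As a rough illustration of this head-to-tail ordering, the following Python sketch keeps the newest frame at index 0 and the oldest at the end. The class name `FrameBuffer`, its methods, and the fixed capacity are assumptions made for the example; the patent does not specify a buffer size or an eviction policy.

```python
from collections import deque


class FrameBuffer:
    """Illustrative frame buffer: index 0 (head) holds the newest frame."""

    def __init__(self, capacity):
        # Assumption: a bounded buffer that drops the oldest entry when full.
        self._frames = deque(maxlen=capacity)

    def push(self, frame):
        # appendleft puts the newest entry at the head; when the deque is
        # full, the entry at the opposite end (the tail, i.e. the oldest)
        # is discarded automatically.
        self._frames.appendleft(frame)

    def head(self):
        """Newest entry (the frame currently being judged)."""
        return self._frames[0]

    def tail(self):
        """Oldest entry still held."""
        return self._frames[-1]

    def span(self, i, j):
        """Entries i..j inclusive, counted back from the head (0 = newest)."""
        return list(self._frames)[i:j + 1]

    def clear(self):
        self._frames.clear()

    def __len__(self):
        return len(self._frames)
```

Counting positions back from the head makes expressions such as "the I-th through J-th frames from the current frame", used later in the description, a simple slice.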

The voiced-sound detector 11 takes the digital audio signal as input, treats N samples of the digital audio signal as one frame, and identifies, frame by frame, the voiced sections (voiced frames) in which speech is present. The frame length N need not be constant.

As for the method, quantities such as the linear prediction coefficients (LPC coefficients) of the audio signal, the autocorrelation coefficients, the zero-crossing count, the PARCOR coefficients, and the spectrum obtained by spectral analysis differ greatly between voiced and silent sections, so a suitable threshold can be used to decide whether a frame is voiced or silent.

FIG. 2 shows one example of such a voiced/silent discrimination method.

The power calculator 111 measures the average power P of one frame of the digital audio signal input from the audio signal input terminal 10, where X(i) denotes the sampled audio signal.

The threshold comparator 112 compares P calculated by the power calculator 111 with a preset threshold TP and makes the voiced/silent decision as follows.

If P < TP, the frame is silent; if P ≥ TP, the frame is voiced.

When the voiced-sound detector 11 judges the audio signal input through the audio signal input terminal 10 to be voiced, the speech encoder 12 encodes that frame or the frames in its vicinity.
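The power-threshold decision of FIG. 2 can be sketched in a few lines of Python. The expression for P is not reproduced in this text, so the sketch assumes the usual mean-square definition, P = (1/N) Σ X(i)², over the N samples of a frame; the function names and the threshold value used in the usage lines are likewise illustrative only.

```python
def average_power(frame):
    """Average power P of one frame of samples X(i).

    Assumption: P is the mean of the squared samples; the source text does
    not reproduce its own formula.
    """
    return sum(x * x for x in frame) / len(frame)


def is_voiced(frame, threshold_tp):
    """Threshold comparison: voiced if P >= TP, silent if P < TP."""
    return average_power(frame) >= threshold_tp


# Usage with a hypothetical threshold TP = 4.0
loud_frame = [3, -3, 3, -3]    # P = 9.0
quiet_frame = [1, -1, 1, -1]   # P = 1.0
print(is_voiced(loud_frame, 4.0), is_voiced(quiet_frame, 4.0))
```

Note that the boundary case P = TP counts as voiced, matching the rule stated above.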

Here the encoding uses a coding method such as APC-MLQ (adaptive predictive coding with a maximum-likelihood quantizer), ATC-VQ (adaptive transform coding with a vector quantizer), or ADPCM. The code length need not be constant.

Basically, the audio signal is encoded in units of the frames used by the voiced-sound detector 11, each yielding one code block. Alternatively, an integer number of those frames may be encoded together as one code block, or the audio signal may be encoded in units of an integer fraction of such a frame, each unit forming a code block.

Information indicating that a code block belongs to a voiced frame may be added at the head of the code sequence.

Identification information marking the beginning and end of a code block may also be added before and after the code block.

The silence encoder 13 encodes the frames that the voiced-sound detector 11 has judged not to be voiced.

FIG. 3 shows an example of its configuration. First, the encoder 131 encodes the noise of those frames of the audio signal that the voiced-sound detector 11 has judged to be silent.

At the same time, the frame counter 130 counts the number of encoded frames and, when the count reaches M frames or more, issues a command to the output device 133 to output codes. When the output device 133 has output codes, it sends a reset command to the frame counter 130. On receiving the reset command, the frame counter 130 starts counting frames again from "0".

Meanwhile, the codes produced by the encoder 131 are stored in the buffer 132.

When codes for N consecutive frames have entered the buffer 132, the buffer monitor 134 issues a command to the output device to output the N frames of codes in the buffer 132. If the sequence of frames entering the buffer 132 is interrupted partway, the buffer 132 is cleared.

The output device 133 outputs N frames of codes from the buffer 132 only when it has received output commands from both the frame counter 130 and the buffer monitor 134.

If M >> N, transmitting the background noise information causes almost no loss of efficiency.
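The interplay of the frame counter 130, the buffer 132, the buffer monitor 134, and the output device 133 can be sketched as follows. This is a minimal model under stated assumptions, not the circuit of FIG. 3: the class name `SilenceEncoderGate` is hypothetical, the encoder 131 is replaced by a pass-through placeholder, the buffer is assumed to hold only the latest N codes, and the counter is assumed to advance only on silent frames.

```python
from collections import deque


class SilenceEncoderGate:
    """Illustrative model of the output gating in the silence encoder 13."""

    def __init__(self, m_frames, n_frames, encode=lambda frame: frame):
        self.m = m_frames      # M: encoded frames required between outputs
        self.n = n_frames      # N: consecutive coded frames required in buffer
        self.encode = encode   # placeholder for encoder 131 (assumption)
        self.count = 0         # frame counter 130
        self.buffer = deque(maxlen=n_frames)  # buffer 132 (latest N codes)

    def push(self, frame, silent):
        """Feed one frame; return N codes when the output fires, else None."""
        if not silent:
            # The run of silent frames is interrupted: clear buffer 132.
            self.buffer.clear()
            return None
        self.buffer.append(self.encode(frame))
        self.count += 1
        counter_cmd = self.count >= self.m        # command from counter 130
        monitor_cmd = len(self.buffer) == self.n  # command from monitor 134
        if counter_cmd and monitor_cmd:           # output device 133
            self.count = 0                        # reset command to counter
            return list(self.buffer)
        return None


# Usage: with M = 5 and N = 2, the fifth consecutive silent frame triggers
# output of the two most recent codes.
gate = SilenceEncoderGate(m_frames=5, n_frames=2)
print([gate.push(i, True) for i in range(5)])
```

Because at most N coded frames leave the gate for every M silent frames counted, choosing M much larger than N keeps the added traffic small, which is the point made in the preceding paragraph.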

The encoder 131 encodes the background noise. The encoder 131 may be the same as the speech encoder 12, or a different one may be used to raise the compression ratio. In either case, coding efficiency is higher when the quantization table and similar parameters are suited to background noise.

Thus, according to this embodiment, when background noise is perceived behind the speech in voiced sections, natural speech can be reproduced by transmitting the characteristics of that noise in silent sections as well. Using a separate encoder for the silent-section signal, distinct from the speech encoder, also makes highly efficient encoding possible. Moreover, because no significant noise detector is used, the hardware can be small. Finally, unlike a speech signal, a noise signal does not need to be decoded exactly.

FIG. 4 is a block diagram showing the configuration of an audio signal encoding apparatus according to a second embodiment of the present invention. This apparatus comprises the voiced-sound detector 11, a significant noise detector 14, and a speech/noise encoder 15.

FIG. 5 is a block diagram showing the configuration of the significant noise detector 14.

The significant noise detector examines the frames that the voiced-sound detector 11 has judged not to be voiced and checks whether they contain significant noise, that is, noise that would be perceived on the receiving side.

Whether noise is significant is decided on the basis of the following properties of the audio signal.

A frame whose average power is at the minimum can be judged to be silent. If that minimum average power is larger than a certain value, the noise of the silent sections is audible on the receiving side. Frames containing such noise are judged to be worth transmitting and are treated as significant noise frames.

First, the in-frame power meter 141 calculates the average power within each frame of the audio signal that the voiced-sound detector 11 has judged to be silent.

At the same time, the frame counter 140 counts the number of frames input from the terminal and outputs "1" to the determiner 145 when the count reaches M frames or more; while the count is below M it outputs "0". When the determiner 145 has output information indicating significant noise, it sends a reset command to the frame counter 140. On receiving the reset command, the frame counter 140 starts counting frames again from 0.

Meanwhile, the in-frame average power values measured by the in-frame power meter 141 are stored in the buffer 142.

When the output of the in-frame power meter 141 has entered the buffer 142 for N consecutive frames, the buffer monitor 144 instructs the in-buffer power meter 143 to calculate the average power of the L frames from the II-th to the JJ-th in the buffer 142 (L = JJ - II) and to output the value to the determiner 145. If the sequence of frames entering the buffer 142 is interrupted partway, the buffer 142 is cleared.

The determiner 145 outputs information indicating significant noise only when the output of the frame counter 140 is "1" and the output value of the in-buffer power meter 143 exceeds a threshold T.
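The decision logic of the significant noise detector 14 can be sketched in Python as below. Several points are assumptions rather than statements of the patent: the class name is hypothetical, average power is taken as the mean of squared samples, the frame counter is assumed to advance only on frames passed in as silent, and the window from II to JJ is treated as a half-open range so that it holds L = JJ - II entries, matching the stated relation.

```python
from collections import deque


def average_power(frame):
    """Mean-square power of one frame (assumed definition of average power)."""
    return sum(x * x for x in frame) / len(frame)


class SignificantNoiseDetector:
    """Illustrative model of the blocks 140 to 145 of FIG. 5."""

    def __init__(self, m_frames, n_frames, ii, jj, threshold_t):
        if not 0 <= ii < jj <= n_frames:
            raise ValueError("need 0 <= II < JJ <= N")
        self.m = m_frames
        self.n = n_frames
        self.ii, self.jj = ii, jj
        self.t = threshold_t
        self.count = 0                        # frame counter 140
        self.powers = deque(maxlen=n_frames)  # buffer 142, head = newest

    def push(self, frame, silent=True):
        """Return True when the frame is flagged as significant noise."""
        if not silent:
            # Run of silent frames interrupted: clear buffer 142.
            self.powers.clear()
            return False
        self.count += 1
        self.powers.appendleft(average_power(frame))  # power meter 141
        if len(self.powers) < self.n:                 # buffer monitor 144
            return False
        window = list(self.powers)[self.ii:self.jj]   # L = JJ - II entries
        mean_power = sum(window) / len(window)        # power meter 143
        counter_out = 1 if self.count >= self.m else 0
        if counter_out == 1 and mean_power > self.t:  # determiner 145
            self.count = 0                            # reset command
            return True
        return False
```

Requiring both conditions means a burst of loud noise is flagged at most once per M counted frames, which bounds how often noise frames are sent.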

Significant noise may, however, also be detected without using the information from the voiced-sound detector 11.

The speech/noise encoder 15 may be the same as the speech encoder 12. In that case, a frame is encoded and transmitted if it is either a voiced frame or a significant noise frame.

Even when the encoder configuration itself is the same, coding parameters such as the coding rate and the quantization table may be changed depending on whether speech or noise is being encoded.

FIG. 6 shows an example of the configuration of the speech/noise encoder 15.

The audio signal from the audio input terminal 10 is stored unchanged in the buffer 150.

When the voiced-sound detector 11 judges the current frame to be silent and the significant noise detector 14 judges it to be significant noise, the coded-frame determiner 151 instructs the encoder 152 to encode, out of the frames stored in the buffer 150, the I-th through J-th frames counted from the current frame. The switch 153 selects the noise quantization table 155 as the quantization table for this encoding.

When the voiced-sound detector 11 judges the current frame to be voiced, the coded-frame determiner 151 likewise instructs the encoder 152 to encode the I-th through J-th frames counted from the current frame among those stored in the buffer 150, in this case with I = 0 and J = 0. That is, the encoder 152 encodes and outputs the newest frame signal in the buffer 150. The switch 153 selects the speech quantization table 154 as the quantization table for this encoding.

The encoder 152 encodes the I-th through J-th frames counted from the current frame in the buffer 150. It may encode them frame by frame or encode the I-th through J-th frames together. The encoding can be performed in the same way as in the speech encoder 12.
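The frame selection and quantization-table switching described for FIG. 6 can be summarized in a short Python sketch. The function name `choose_frames`, the string labels for the two tables, and the list-based buffer (index 0 being the current frame) are assumptions made for illustration; the actual encoding step is omitted.

```python
def choose_frames(buffer, voiced, significant_noise, i, j):
    """Pick the frames to encode and the quantization table to use.

    buffer[0] is the current (newest) frame. Returns (frames, table) or
    None when nothing is to be encoded.
    """
    if voiced:
        # Voiced frame: I = 0 and J = 0, so only the newest frame is encoded
        # and the speech quantization table 154 is selected.
        return [buffer[0]], "speech"
    if significant_noise:
        # Silent frame with significant noise: frames I..J (inclusive) back
        # from the current frame, with the noise quantization table 155.
        return list(buffer[i:j + 1]), "noise"
    # Silent frame without significant noise: nothing is encoded.
    return None


# Usage with hypothetical frame labels and I = 1, J = 3
frames = ["f0", "f1", "f2", "f3"]
print(choose_frames(frames, voiced=False, significant_noise=True, i=1, j=3))
```

Treating the range as inclusive is consistent with the voiced case, where I = 0 and J = 0 selects exactly one frame.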

According to this embodiment, the significant noise detector makes it possible to detect significant noise frames accurately. Moreover, because only one encoder is needed, the hardware can be small.

FIG. 7 shows another example of the configuration of the speech/noise encoder 15.

Its basic operation is the same as that of the speech/noise encoder of FIG. 6; the difference lies in the buffer. The buffer 150 of FIG. 6 has to store speech samples, so it becomes large. The buffer 156 of FIG. 7, by contrast, only has to store the speech codes produced by the encoder 152, so it is a fraction of the size of the buffer 150 and the hardware is smaller. On the other hand, significant noise has to be encoded one frame at a time, so the compression ratio is lower than with the arrangement of FIG. 6.

FIG. 8 shows a third embodiment of the present invention, which comprises the voiced-sound detector 11, the speech encoder 12, the significant noise detector 14, and a significant noise encoder 16.

FIG. 9 is a block diagram showing the configuration of the significant noise encoder 16.

The buffer 160 stores the audio signal arriving from the audio input terminal 10 frame by frame.

The coded-frame determiner 161 decides that, of the frames stored in the buffer, those from the II-th frame through the JJ-th frame are to be encoded.

The encoder 162 encodes the audio signal in the buffer 160. It may encode frame by frame, or it may encode K frames (K < N) together.

To let the receiving side tell whether a received code block belongs to speech or to significant noise, a "1" is added at the head of the code when speech is encoded and a "0" when significant noise is encoded. The receiving side can then identify whether the code block is a speech block or a significant noise block.
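The one-bit header can be illustrated with a small Python sketch. Representing a code block as a list of bits, and the helper names `tag_block` and `untag_block`, are assumptions made for the example; the patent does not define a bitstream format beyond the leading "1" or "0".

```python
SPEECH_FLAG = 1  # prepended to speech code blocks
NOISE_FLAG = 0   # prepended to significant-noise code blocks


def tag_block(code_bits, is_speech):
    """Transmitting side: prepend the identification bit to a code block."""
    flag = SPEECH_FLAG if is_speech else NOISE_FLAG
    return [flag] + list(code_bits)


def untag_block(block):
    """Receiving side: split a block into its kind and its payload bits."""
    kind = "speech" if block[0] == SPEECH_FLAG else "noise"
    return kind, block[1:]


# Usage: a hypothetical 3-bit noise code survives the round trip.
block = tag_block([1, 0, 1], is_speech=False)
print(block, untag_block(block))
```

The receiver can use the recovered kind to route the payload to the matching decoder or quantization table.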

According to this embodiment, the significant noise detector makes it possible to detect significant noise frames accurately. In addition, using a separate encoder for noise frames, distinct from the speech encoder, allows noise frames to be encoded with high efficiency.

[Effects of the Invention] As described above, according to the present invention, when background noise is perceived behind the speech in voiced sections, natural speech can be reproduced by transmitting the characteristics of that noise in silent sections as well.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the configuration of an audio signal encoding apparatus according to a first embodiment of the present invention; FIG. 2 is a block diagram showing the configuration of the voiced-sound detector 11; FIG. 3 is a block diagram showing the configuration of the silence encoder 13; FIG. 4 is a block diagram showing a second embodiment of the present invention; FIG. 5 is a block diagram showing the configuration of the significant noise detector 14; FIGS. 6 and 7 are block diagrams showing configurations of the speech/noise encoder 15; FIG. 8 is a block diagram showing a third embodiment of the present invention; FIG. 9 is a block diagram showing the configuration of the significant noise encoder 16; and FIGS. 10 and 11 are block diagrams showing the configuration of a conventional audio signal transmission/reception system.

11: voiced-sound detector
12: speech encoder
13: silence encoder
14: significant noise detector
15: speech/noise encoder
16: significant noise encoder
18: transmission code output terminal

Continued from the front page (58) Fields searched (Int. Cl.⁷, DB name): H04J 3/00, H04L 12/56, H04B 14/00

Claims

(57) [Claims]

1. An audio signal transmission method that determines voiced sections and silent sections of an audio signal and transmits the speech information of the voiced sections together with the noise information of the silent sections, characterized in that the noise information is encoded at a higher compression ratio than the audio signal and transmitted.

2. An audio signal transmission method that determines voiced sections and silent sections of an audio signal and transmits the speech information of the voiced sections together with the noise information of the silent sections, characterized in that only noise information exceeding a predetermined threshold is encoded at a higher compression ratio than the audio signal and transmitted.

3. An audio signal transmission method that determines voiced sections and silent sections of an audio signal and transmits the voiced information of the voiced sections together with the noise information of the silent sections, characterized in that only noise information that has a characteristic feature is encoded at a higher compression ratio than the audio signal and transmitted.

4. An audio signal transmission method that determines voiced sections and silent sections of an audio signal and transmits the speech information of the voiced sections together with the noise information of the silent sections, characterized in that it is determined whether the noise information of the silent sections is significant noise, that is, whether it contains noise that should be transmitted to the receiving side, and only the significant noise is encoded at a higher compression ratio than the audio signal and transmitted.

5. An audio signal transmission method that determines voiced sections and silent sections of an audio signal and transmits the speech information of the voiced sections together with the noise information of the silent sections, characterized in that it is determined whether the noise information of the silent sections is significant noise, that is, whether it contains noise exceeding a predetermined threshold, and only the significant noise is encoded at a higher compression ratio than the audio signal and transmitted.

6. An audio signal transmission method that determines voiced sections and silent sections of an audio signal and transmits the speech information of the voiced sections together with the noise information of the silent sections, characterized in that it is determined whether the noise information of the silent sections is significant noise, that is, noise that is audible on the receiving side, and only the significant noise is encoded at a higher compression ratio than the audio signal and transmitted.