JP2005077889A - Voice packet absence interpolation system - Google Patents

Voice packet absence interpolation system

Info

Publication number
JP2005077889A
Authority
JP
Japan
Prior art keywords
voice
missing
packet
signal
voice signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP2003309745A
Other languages
Japanese (ja)
Inventor
Kazuhiro Kondo
和弘 近藤
Seiji Nakagawa
清司 中川
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to JP2003309745A
Publication of JP2005077889A
Legal status: Pending


Abstract

PROBLEM TO BE SOLVED: To interpolate the speech signal contained in a voice packet lost during transmission.

SOLUTION: Linear prediction coefficients are calculated from the speech signals normally received before and after the loss: a forward set of coefficients, predicting in the direction of increasing time, from the signal immediately before the loss, and a backward set, predicting back in time, from the signal immediately after it. Using the forward coefficients, the sample one step into the missing portion is predicted from the pre-loss signal; the next sample is then predicted from the pre-loss signal together with the sample just predicted, and this is repeated until all samples of the missing portion are predicted. Likewise, the sample one step before the end of the loss is predicted from the post-loss signal and the backward coefficients, then the sample one step earlier is predicted from that predicted sample and the post-loss signal, and so on until all missing samples are predicted backward. Means are provided for averaging these two predicted versions of the missing speech to obtain high-quality speech samples with which the missing portion is interpolated.

COPYRIGHT: (C)2005,JPO&NCIPI

Description

The present invention relates to the interpolation of missing portions of a speech signal, and more particularly to an interpolation method using linear prediction.

As a conventional method of interpolating the losses that occur when a speech signal is carried in packets, the pitch immediately before the missing portion is estimated, and the one-pitch segment of speech immediately preceding the loss is repeated across the missing portion as many times as necessary. This method is described in detail in ITU-T Recommendation G.711 Appendix I.

With the above prior art, the longer the missing interval, the more repetitions are required, and the interpolated portion takes on a synthetic, unnatural sound quality.

Accordingly, an object of the present invention is to provide a method in which the sound quality of the interpolated portion does not become unnatural even when the missing interval is long.

According to the present invention, there is provided a loss-interpolation method comprising means for obtaining an interpolation signal for the missing portion by linear prediction from the speech immediately preceding the loss. The invention further provides a loss-interpolation method comprising means for obtaining the interpolation signal by linear prediction from the speech both immediately before and immediately after the loss.

According to the present invention, the speech signal contained in packets lost during transmission can be interpolated with natural sound quality. For example, when 30% of voice packets are lost, the conventional method falls below 2 on a five-point mean subjective quality scale, whereas the present invention maintains a score of 2.4.

In the present invention, linear prediction coefficients are first calculated from the speech signals normally received before and after the loss. A forward set of prediction coefficients, i.e. predicting in the direction of increasing time, is calculated from the speech signal immediately preceding the loss, and a backward set, predicting back in time, is calculated from the signal immediately following it. Next, the sample one step into the missing portion is predicted from the pre-loss speech signal using the forward coefficients; the following sample is then predicted from that predicted sample together with the pre-loss signal, and this is repeated until all samples of the missing portion have been predicted. In parallel, the sample one step before the end of the gap, i.e. the last sample of the missing portion, is predicted from the post-loss signal and the backward coefficients; the sample one step earlier is then predicted from this predicted sample and the post-loss signal, and so on until all missing samples have been predicted in the backward direction. The two predicted versions of the missing speech are averaged to obtain the speech samples that interpolate the missing portion.

Embodiments of the present invention will be described below with reference to the drawings; needless to say, the scope of the invention is not limited to them.

FIG. 1 shows a first embodiment of the present invention. A predetermined number of speech samples are gathered into a unit called a packet, additional information such as the destination is attached, and each packet is transmitted over the network. In transit a packet passes through several relay points. At each relay point, packets are temporarily stored in memory together with packets arriving from other sources and are forwarded to the next relay point in the order received. If too many packets converge on a relay point at once, the finite memory overflows, packets are discarded, and the speech signals they carry are lost. Even packets that are not discarded may suffer long queueing delays before being forwarded. A packet that finally reaches its destination too late for the playback time of the speech it carries is likewise discarded. Thus, in packet-based speech transmission, speech signals may be lost chiefly through these two mechanisms.

Voice packets reaching the final destination are first stored in a buffer. The serial number attached to each packet is monitored to detect losses, and the loss indication is sent to the loss-interpolation unit and to the playback source selector switch. If a packet arrives intact and in time for playback, it is unpacked; if the following packet is not missing, its speech signal is sent directly to the loudspeaker and reproduced as-is. If the following packet is judged missing, the signal is first smoothed in the loss-interpolation unit and then reproduced. If the packet due for playback is itself missing, no new packet is read out; instead, the speech for the missing portion is generated from the stored signal immediately preceding the loss.

FIG. 2 shows the configuration of the loss-interpolation unit in the first embodiment. Unpacked packets are first stored in a buffer, which always holds the speech signals of the two most recently received and reproduced packets. First, linear prediction coefficients are calculated in the coefficient-calculation unit from the speech signal of the most recent voice packet. The coefficients can be computed, for example, with the Levinson-Durbin recursion, which is described in detail in S. Haykin, "Adaptive Filter Theory" (Prentice-Hall, 1996, p. 254).
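As an illustration of the coefficient calculation, a minimal Levinson-Durbin recursion might look like the following Python sketch. The function names, the biased autocorrelation estimate, and the coefficient convention x[n] ≈ Σ a[k]·x[n-1-k] are illustrative choices, not taken from the patent text:

```python
def autocorr(x, order):
    """Biased autocorrelation r[0..order] of signal x."""
    n = len(x)
    return [sum(x[t] * x[t - lag] for t in range(lag, n))
            for lag in range(order + 1)]

def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation r[0..order]
    by the Levinson-Durbin recursion.  Returns coefficients a such that
    x[n] ~= sum(a[k] * x[n-1-k])."""
    a = [0.0] * order          # prediction coefficients
    err = r[0]                 # prediction error energy
    for i in range(order):
        # reflection coefficient for stage i+1
        acc = r[i + 1] - sum(a[j] * r[i - j] for j in range(i))
        k = acc / err
        # update coefficients using a copy of the previous-stage values
        a_prev = a[:]
        a[i] = k
        for j in range(i):
            a[j] = a_prev[j] - k * a_prev[i - 1 - j]
        err *= (1.0 - k * k)
    return a
```

For a decaying exponential x[n] = 0.9^n, a first-order fit recovers a coefficient very close to 0.9, as expected for a signal obeying x[n] = 0.9·x[n-1].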

Next, the speech signal of the most recent packet is preset into a shift register. From this signal and the calculated prediction coefficients, the first sample of the missing portion is predicted, as illustrated in FIG. 3. The next sample is then predicted from this predicted sample together with the speech signal stored in the shift register. Repeating this yields predicted speech for the number of samples corresponding to the missing packet.
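The recursive one-sample-at-a-time extrapolation described above can be sketched as follows, with the shift register modeled as a plain list. The function and argument names are illustrative, and the coefficients `a` are assumed to follow the convention x[n] ≈ Σ a[k]·x[n-1-k]:

```python
def lp_extrapolate(history, a, num_samples):
    """Predict num_samples beyond `history` by recursive linear
    prediction.  Each predicted sample is pushed into the simulated
    shift register and reused to predict the next one."""
    reg = list(history[-len(a):])   # shift register preset with the latest samples
    out = []
    for _ in range(num_samples):
        pred = sum(a[k] * reg[-1 - k] for k in range(len(a)))
        out.append(pred)
        reg.append(pred)            # feed the prediction back in
        reg.pop(0)                  # shift the oldest sample out
    return out
```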

When linear prediction is iterated on its own predicted samples, the amplitude gradually decays. To compensate, a variable gain is applied as shown in FIG. 2: the gain is first set to 1 and increased uniformly up to a preset upper limit. This keeps the amplitude of the predicted samples very close to that of the original speech.
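The gain ramp could be sketched like this; the linear ramp shape and the default upper limit are illustrative assumptions, since the text does not fix them numerically:

```python
def apply_variable_gain(samples, upper_limit=1.5):
    """Compensate the amplitude decay of repeated linear prediction
    with a gain that starts at 1 and rises uniformly to a preset
    upper limit over the length of the segment."""
    n = len(samples)
    if n <= 1:
        return list(samples)
    return [s * (1.0 + (upper_limit - 1.0) * i / (n - 1))
            for i, s in enumerate(samples)]
```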

The speech signal in a received packet and the predicted signal differ in character. To smooth the transition, the same linear prediction coefficients are used to predict, from the packet one further in the past, the speech signal of the packet immediately preceding the loss, in the same manner as above. These predicted samples are multiplied by a weight ω1 that rises from 0 at the first sample of the packet toward 1 at its last sample, and added to the actually received pre-loss samples multiplied by a weight ω2 that falls from 1 at the first sample toward 0 at the last, where ω1 + ω2 = 1. FIG. 4 illustrates this. In this way samples far from the gap are taken mostly from the received speech, while samples near the gap weight the predicted speech more heavily, so the signal connects smoothly into the predicted missing portion. The same smoothing is applied to the speech immediately after the gap: the predicted missing-speech signal is used to predict, by further iteration, the signal immediately following the loss. These predicted samples are multiplied by a weight ω1 that falls from 1 immediately after the gap toward 0 at the last sample of the post-loss packet, while the received samples are multiplied by a weight ω2 rising from 0 to 1 over the same span; the two weighted signals are summed and output as the smoothed speech. Thus the predicted signal dominates immediately after the gap, and the weight shifts gradually to the received signal.
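One way to sketch this complementary-weight smoothing (ω1 + ω2 = 1) in code, with linear ramps chosen for illustration (the text only requires that the weights move gradually between 0 and 1):

```python
def crossfade(predicted, received, toward_gap=True):
    """Blend a predicted segment with the actually received one using
    complementary ramps w1 + w2 = 1 (the omega_1/omega_2 weights of the
    text).  With toward_gap=True the weight on the prediction grows
    toward the gap (pre-loss smoothing); with toward_gap=False it
    decays away from the gap (post-loss smoothing).  Segments are
    assumed to have length >= 2."""
    n = len(predicted)
    assert len(received) == n
    out = []
    for i in range(n):
        ramp = i / (n - 1)
        w1 = ramp if toward_gap else 1.0 - ramp   # weight on prediction
        w2 = 1.0 - w1                             # weight on received signal
        out.append(w1 * predicted[i] + w2 * received[i])
    return out
```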

According to the first embodiment described above, the speech signal for a missing voice packet can be interpolated with a predicted speech signal of natural quality.

A second embodiment will now be described, in which the loss is interpolated using both a prediction obtained by forward prediction from the speech immediately before the loss and a prediction obtained by backward prediction from the speech immediately after it. FIG. 5 shows the loss-interpolation unit of this embodiment; the overall configuration is as shown in FIG. 1. The forward interpolation path, consisting of the forward packet buffer, shift register, forward linear-prediction coefficient calculation, forward linear prediction, and variable gain, is identical to that of the first embodiment.

Meanwhile, the speech signal is also stored in a backward packet buffer. From the post-loss signal accumulated in this buffer, backward linear prediction coefficients for predicting the missing portion are calculated; this is done simply by time-reversing the one packet of speech stored in the buffer and applying the Levinson-Durbin recursion. The time-reversed post-loss signal is then preset into a shift register, and from this signal and the calculated coefficients the last sample of the missing portion is predicted first. The procedure is as in FIG. 3 but with the time axis reversed: prediction proceeds gradually backward from the last sample of the missing portion. The sample one step earlier is then predicted from the predicted sample together with the signal in the shift register, and this is repeated for the number of samples corresponding to the missing packet. As in forward prediction, the amplitude decays as the backward prediction is iterated, so a gradually increasing variable gain is applied.
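A sketch of backward prediction via time reversal, under the same illustrative coefficient convention as the forward case (the coefficients `a_back` are assumed to have been estimated on the time-reversed post-loss packet):

```python
def backward_extrapolate(future, a_back, num_samples):
    """Predict `num_samples` immediately *preceding* `future` by running
    forward prediction on the time-reversed signal, then reversing the
    result back into chronological order."""
    rev = list(reversed(future))            # reverse the time axis
    reg = rev[-len(a_back):]                # shift register: samples nearest the gap
    out = []
    for _ in range(num_samples):
        pred = sum(a_back[k] * reg[-1 - k] for k in range(len(a_back)))
        out.append(pred)                    # predicted sample, still reversed time
        reg = reg[1:] + [pred]
    return list(reversed(out))              # restore chronological order
```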

In this way two predicted versions of the missing portion are obtained, one from forward and one from backward prediction. They are combined by weighting the forward prediction more heavily at the start of the gap and the backward prediction more heavily at its end, as shown in FIG. 6: the forward prediction is multiplied by a weight ω2 that is 1 immediately after the start of the gap and decays gradually to 0, while the backward prediction is multiplied by a weight ω3 that starts at 0 and rises gradually to 1, with ω2 + ω3 = 1 throughout the gap. The sum of the two weighted signals is the interpolation signal. The first half of the gap therefore draws mainly on the forward prediction, whose accuracy is still high because few prediction iterations have occurred, while the second half draws mainly on the backward prediction, which is more accurate there for the same reason.
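The weighted combination of the two predictions (ω2 + ω3 = 1) might be sketched as follows, again with an illustrative linear ramp:

```python
def blend_bidirectional(fwd, bwd):
    """Combine forward- and backward-predicted versions of the missing
    segment.  The forward weight w2 ramps from 1 down to 0 and the
    backward weight w3 = 1 - w2 ramps from 0 up to 1, so early samples
    come mostly from forward prediction and late samples mostly from
    backward prediction.  Segments are assumed to have length >= 2."""
    n = len(fwd)
    assert len(bwd) == n
    out = []
    for i in range(n):
        w2 = 1.0 - i / (n - 1)   # forward-prediction weight
        w3 = 1.0 - w2            # backward-prediction weight
        out.append(w2 * fwd[i] + w3 * bwd[i])
    return out
```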

The signal preceding the gap is smoothed as in the first embodiment, blending the speech predicted from the packet one further in the past with the received packet's speech. The signal immediately after the gap is smoothed similarly: the speech is predicted backward from the packet one further ahead, multiplied by a weight, and added to the weighted signal actually received just after the gap. Immediately after the gap the weight ω3 on the predicted signal is 1 and decays gradually to 0, while the weight ω1 on the received signal starts at 0 and rises gradually to 1, again with ω3 + ω1 = 1. The sum of the two weighted signals is used as the output, so that just after the gap the output carries mostly the character of the backward-predicted signal and gradually shifts to that of the received signal.

With the second embodiment described above, the normally received speech immediately after the loss is also exploited, so the loss can be interpolated with an even more natural, higher-quality speech signal.

A third embodiment will now be described, which switches between the forward-predicted speech of the first embodiment and the bidirectionally predicted speech, using both forward and backward prediction, of the second embodiment.

The second embodiment yields high-quality interpolated speech even under heavy loss. However, because it uses the speech contained in the packet following the loss, it must wait for that packet to be received; that is, it requires a long delay. Forward-only prediction needs no such wait, and hence no long delay. Up to about 10% loss there is little difference in quality between forward and bidirectional prediction. The loss rate is therefore monitored: up to a loss rate of about 10%, the forward predictive interpolation shown in FIG. 2 and described in the first embodiment is used; above that, the bidirectional prediction shown in FIG. 5 and described in the second embodiment is used. FIG. 7 illustrates this.

The configuration of this embodiment is largely the same as the first, except that the loss indication output by the buffer is fed to a loss-rate calculation unit, which estimates a moving average of the loss rate from it. While the loss rate remains up to about 10%, the forward predictive interpolation of the first embodiment is performed and reproduced through the loudspeaker; once it reaches 10% or more, the bidirectional predictive interpolation of the second embodiment is performed and reproduced instead.
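A sketch of the loss-rate monitor and mode switch; the 10% threshold follows the text, while the moving-average window length and the simple rectangular window are illustrative assumptions:

```python
class LossRateSwitch:
    """Track a moving average of the packet-loss indicator and choose
    between forward-only and bidirectional interpolation, switching to
    bidirectional once the loss rate reaches the threshold."""

    def __init__(self, window=100, threshold=0.10):
        self.window = window
        self.threshold = threshold
        self.flags = []                    # 1 = packet lost, 0 = received

    def observe(self, lost):
        """Record the loss indication for one packet slot."""
        self.flags.append(1 if lost else 0)
        if len(self.flags) > self.window:
            self.flags.pop(0)              # keep only the recent window

    def mode(self):
        """Return the interpolation mode for the current loss rate."""
        rate = sum(self.flags) / max(len(self.flags), 1)
        return "bidirectional" if rate >= self.threshold else "forward"
```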

According to the third embodiment, high-quality loss interpolation is achieved with little delay when the loss rate is low, and with a somewhat longer, tolerated delay when the loss rate is high.

FIG. 1 shows the first embodiment of the present invention.
FIG. 2 shows the configuration of the loss-interpolation unit in the first embodiment.
FIG. 3 illustrates the operation of linear predictive interpolation in the first embodiment.
FIG. 4 illustrates the smoothing operation in the first embodiment.
FIG. 5 shows the configuration of the loss-interpolation unit in the second embodiment.
FIG. 6 illustrates the smoothing operation in the second embodiment.
FIG. 7 shows the configuration of the loss-interpolation unit in the third embodiment.

Explanation of symbols

11 … Buffer
12 … Loss-interpolation unit
21 … Linear prediction coefficient calculation unit
22 … Linear prediction unit

Claims (8)

1. A voice communication system comprising: means for packetizing a speech signal; means for transmitting the packets over a network; means for receiving packets transmitted over the network; means for unpacking received packets to reproduce the speech signal; means for detecting that a packet was not received normally; and means for estimating the speech signal contained in a missing packet from the speech signals contained in normally received packets and interpolating it.

2. The voice communication system of claim 1, wherein the missing speech signal is interpolated by recursively repeating linear prediction using the speech signal contained in the voice packet normally received immediately before the missing voice packet.

3. The voice communication system of claim 2, further comprising a variable gain that compensates the decay in prediction gain when the missing speech signal is interpolated by repeated linear prediction.

4. The voice communication system of claim 2, further comprising means for smoothing by replacing the speech signal contained in the packet immediately preceding the loss with a linear combination of the received speech signal and a speech signal predicted by repeated linear prediction of that packet's signal.

5. The voice communication system of claim 1, wherein the missing speech signal is interpolated by recursively repeating linear prediction using both the speech signal contained in the voice packet normally received immediately before the loss and the speech signal contained in the voice packet normally received immediately after it.

6. The voice communication system of claim 5, further comprising means for smoothing by replacing the speech signal contained in the packet immediately preceding the loss with a linear combination of the received speech signal and a signal predicted by repeated linear prediction of that packet's signal, and likewise replacing the speech signal contained in the packet immediately following the loss with a linear combination of the received speech signal and a signal predicted by repeated linear prediction of that packet's signal.

7. The voice communication system of claims 5 and 6, wherein the kinds and weights of the linear prediction signals used to interpolate the loss, and the kinds and weights of the signals used to smooth the speech contained in the packets before and after the loss, are varied according to the number of consecutive missing packets.

8. A voice communication system comprising: means for monitoring the loss rate of voice packets; and means for interpolating the missing speech signal by adaptively switching, according to the loss rate, between the means of claim 2 that interpolates using the speech immediately before the loss and the means of claim 5 that interpolates using the speech immediately before and after the loss.
JP2003309745A 2003-09-02 2003-09-02 Voice packet absence interpolation system Pending JP2005077889A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2003309745A JP2005077889A (en) 2003-09-02 2003-09-02 Voice packet absence interpolation system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP2003309745A JP2005077889A (en) 2003-09-02 2003-09-02 Voice packet absence interpolation system

Publications (1)

Publication Number Publication Date
JP2005077889A true JP2005077889A (en) 2005-03-24

Family

ID=34411809

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2003309745A Pending JP2005077889A (en) 2003-09-02 2003-09-02 Voice packet absence interpolation system

Country Status (1)

Country Link
JP (1) JP2005077889A (en)


Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008529423A (en) * 2005-01-31 2008-07-31 クゥアルコム・インコーポレイテッド Frame loss cancellation in voice communication
JP4678440B2 (en) * 2006-07-27 2011-04-27 日本電気株式会社 Audio data decoding device
US8327209B2 (en) 2006-07-27 2012-12-04 Nec Corporation Sound data decoding apparatus
JPWO2008013135A1 (en) * 2006-07-27 2009-12-17 日本電気株式会社 Audio data decoding device
JP2008104196A (en) * 2006-10-20 2008-05-01 Kofukin Seimitsu Kogyo (Shenzhen) Yugenkoshi Packet transceiver system and method
JP5012897B2 (en) * 2007-07-09 2012-08-29 日本電気株式会社 Voice packet receiving apparatus, voice packet receiving method, and program
WO2009008220A1 (en) * 2007-07-09 2009-01-15 Nec Corporation Sound packet receiving device, sound packet receiving method and program
US8175867B2 (en) 2007-08-06 2012-05-08 Panasonic Corporation Voice communication apparatus
JP2009139399A (en) * 2007-12-03 2009-06-25 Yamaha Corp Speech processing apparatus
JP2011095378A (en) * 2009-10-28 2011-05-12 Nikon Corp Sound recording device, imaging device and program
US8698911B2 (en) 2009-10-28 2014-04-15 Nikon Corporation Sound recording device, imaging device, photographing device, optical device, and program
JP2015215539A (en) * 2014-05-13 2015-12-03 セイコーエプソン株式会社 Voice processor and control method of the same
CN107112022A (en) * 2014-07-28 2017-08-29 三星电子株式会社 The method and apparatus hidden for data-bag lost and the coding/decoding method and device using this method
US10720167B2 (en) 2014-07-28 2020-07-21 Samsung Electronics Co., Ltd. Method and apparatus for packet loss concealment, and decoding method and apparatus employing same
CN112216288A (en) * 2014-07-28 2021-01-12 三星电子株式会社 Method for time domain data packet loss concealment of audio signals
US11417346B2 (en) 2014-07-28 2022-08-16 Samsung Electronics Co., Ltd. Method and apparatus for packet loss concealment, and decoding method and apparatus employing same
JP2018160872A (en) * 2017-03-24 2018-10-11 ヤマハ株式会社 Sound data processing apparatus and sound data processing method

Similar Documents

Publication Publication Date Title
US7502733B2 (en) Method and arrangement in a communication system
US7246057B1 (en) System for handling variations in the reception of a speech signal consisting of packets
EP1746581B1 (en) Sound packet transmitting method, sound packet transmitting apparatus, sound packet transmitting program, and recording medium in which that program has been recorded
JP4651194B2 (en) Delay packet concealment method and apparatus
US20080154584A1 (en) Method for Concatenating Frames in Communication System
JP2005077889A (en) Voice packet absence interpolation system
KR20070065876A (en) Adaptive de-jitter buffer for voice over ip
JP4485690B2 (en) Transmission system for transmitting multimedia signals
EP1218876A1 (en) Apparatus and method for a telecommunications system
CN101379556B (en) Controlling a time-scaling of an audio signal
JP3416331B2 (en) Audio decoding device
JP4945429B2 (en) Echo suppression processing device
JP4572755B2 (en) Decoding device, decoding method, and digital audio communication system
KR100594599B1 (en) Apparatus and method for restoring packet loss based on receiving part
JP5074749B2 (en) Voice signal receiving apparatus, voice packet loss compensation method used therefor, program for implementing the method, and recording medium recording the program
JPH088933A (en) Voice cell coder
JP3583550B2 (en) Interpolator
JP4535069B2 (en) Compensation circuit
JP2002261629A (en) Method and device for estimating continuous value of digital symbol
EP1813045B1 (en) Methods and devices for providing protection in packet switched communication networks
JP2005233993A (en) Voice transmission system
JPH10200580A (en) Method for reproducing voice packet
WO1998039848A1 (en) Serial estimating method
JP2006319685A (en) Audio coding selection control method, audio packet transmitter, audio packet receiver, program, and storage medium
JP3231807B2 (en) Speech encoder