KR100591544B1

KR100591544B1 - Frame loss concealment method and apparatus for VoIP systems

Info

Publication number: KR100591544B1
Application number: KR1020030097769A
Authority: KR
Inventors: 이미숙; 이응돈; 김도영; 김홍국; 최승호
Original assignee: 한국전자통신연구원
Priority date: 2003-12-26
Filing date: 2003-12-26
Publication date: 2006-06-19
Also published as: KR20050066477A

Abstract

The present invention relates to a frame loss concealment method and apparatus for reducing the degradation of sound quality caused by packet loss that can occur when voice data is transmitted over a packet network.

In the frame loss concealment method for a VoIP system according to the present invention, when a frame is lost, the lost frame data is reconstructed using whichever frame data is closest to the lost frame, chosen from the frame data correctly received before the lost frame and the frame data held in the playout buffer.

This improves the sound quality of a speech coder in a packet loss environment without increasing the amount of computation.

Playout buffer, decoder, frame loss concealment module, VoIP system

Description

Frame loss concealment method and apparatus for VoIP systems {METHOD AND APPARATUS FOR FRAME LOSS CONCEALMENT FOR VoIP SYSTEMS}

FIG. 1 is a diagram illustrating the repetition-based frame loss concealment method used in conventional G.729.

FIG. 2 is a diagram illustrating a conventional frame loss concealment method proposed based on the characteristics of a VoIP system.

FIG. 3 is a block diagram of a voice signal transmitting and receiving apparatus, showing the process of delivering a voice signal over a packet network, for explaining the present invention.

FIG. 4 is a block diagram of a frame loss concealment apparatus for a VoIP system according to an embodiment of the present invention.

FIG. 5 is a flowchart illustrating the operation of a frame loss concealment apparatus for a VoIP system according to an embodiment of the present invention.

FIG. 6 is a conceptual diagram illustrating a frame loss concealment method for a VoIP system according to an embodiment of the present invention.

The present invention relates to a frame (packet) loss concealment method and apparatus for reducing the degradation of sound quality caused by packet loss that can occur when a voice signal is transmitted over a packet network.

As the Internet has become widespread, Voice over Internet Protocol (VoIP) technology for transmitting voice signals over the Internet has attracted considerable attention.

In general, to transmit a voice signal over a packet network, the transmitting end compresses the voice signal frame by frame with a speech coder, packs the result into packets, and transmits them; the receiving end reproduces the original voice signal from the packet data sent by the transmitting end. The speech coders widely used to compress voice signals in VoIP systems are CELP-type coders, which typically process 20 msec of speech at a time. This unit is called a frame. A packet may carry the data of a single frame or of several frames. Because the speech coder operates on frames, however, the packet loss concealment algorithm also operates on frames. For this reason, the term frame loss concealment is sometimes used interchangeably with packet loss concealment.

When a voice signal is transmitted over a packet network, the packets do not all travel along the same path; each packet may take a different route, so the time needed to deliver each packet may vary. As a result, packets may be lost because of network load, may fail to arrive within the allotted time, or may arrive out of order.

If real-time operation is not required, all packets can be received reliably by requesting retransmission of lost packets and enlarging the buffer at the receiving end. When real-time operation is required, as in a voice call, retransmission cannot be requested, so packet loss from various causes degrades sound quality. Packet loss is a major factor that directly affects sound quality.

In G.729, as shown in FIG. 1, the parameters of a lost frame are recovered using the information of the frame that was correctly received, without loss, immediately before the loss occurred. That is, in this prior art, the LSF (Line Spectrum Frequency) coefficients of the lost current frame are taken unchanged from the previous frame, and the pitch is recovered by incrementing the pitch of the previous subframe by one. The pitch gain and the fixed codebook gain are obtained by reducing the gains of the previous frame with an attenuation constant.
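The repetition-based recovery described above can be sketched as follows. This is a minimal illustration, not the G.729 reference code: the `CelpFrame` type, the function name, the attenuation values, and the upper bound on the pitch lag are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CelpFrame:
    """Hypothetical container for the parameters named in the text."""
    lsf: List[float]    # line spectrum frequency coefficients
    pitch_lag: int      # pitch of the last subframe, in samples
    pitch_gain: float   # pitch (adaptive codebook) gain
    fixed_gain: float   # fixed codebook gain


def conceal_by_repetition(prev: CelpFrame,
                          pitch_gain_decay: float = 0.9,    # assumed value
                          fixed_gain_decay: float = 0.98,   # assumed value
                          max_lag: int = 143) -> CelpFrame:  # assumed bound
    """Predict a lost frame from the last correctly received frame."""
    return CelpFrame(
        lsf=list(prev.lsf),                           # LSF reused unchanged
        pitch_lag=min(prev.pitch_lag + 1, max_lag),   # pitch incremented by one
        pitch_gain=prev.pitch_gain * pitch_gain_decay,  # gains attenuated
        fixed_gain=prev.fixed_gain * fixed_gain_decay,
    )
```

Applying the function repeatedly to its own output models a run of lost frames, with the gains decaying a little further on each frame.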

As another prior art intended to improve the performance of the repetition-based frame loss concealment method used in G.729, US Patent No. 5,732,389 (March 24, 1998) discloses "Voiced/unvoiced classification of speech for excitation codebook selection in CELP speech decoding during frame erasures". In this prior art, the data of a lost frame is basically taken from the previous frame, either unchanged or scaled. The lost frame is classified as voiced or unvoiced; for a voiced frame the excitation signal is regenerated from the pitch information alone, and for an unvoiced frame it is regenerated from the fixed codebook information alone.

Meanwhile, FIG. 2 illustrates a conventional frame loss concealment method proposed based on the characteristics of a VoIP system.

In general, a VoIP system uses a playout buffer at the receiving end to reduce the degradation of sound quality caused by delay jitter. Rather than reconstructing and outputting the voice signal as soon as a packet is received, the receiver keeps a buffer that acts as a cushion and begins reconstructing the voice signal only once the buffer is full. Enlarging the playout buffer therefore reduces the degradation caused by packet loss, but it also increases the overall delay of the call, which not only produces echo but also makes natural conversation difficult.
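The trade-off between buffer size and call delay can be made concrete with a small calculation. The sketch below assumes that playback starts only when the buffer is full and that every frame has the same duration; the function name and the default of 20 ms per frame (the typical CELP frame length mentioned earlier in this description) are choices made for the example.

```python
def playout_delay_ms(buffer_frames: int, frame_ms: float = 20.0) -> float:
    """Delay added by waiting for the playout buffer to fill before playback."""
    if buffer_frames < 0:
        raise ValueError("buffer_frames must be non-negative")
    return buffer_frames * frame_ms
```

Under these assumptions a buffer of four 20 ms frames adds 80 ms to the one-way delay, and doubling the buffer doubles that figure.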

Unlike in a wireless network, a VoIP system can use the frame data in the playout buffer to predict the information of a lost frame. If frame data that follows the lost frame is stored in the playout buffer, the lost frame information can be predicted by linearly interpolating this future frame data with the past frame data. Such a method is disclosed in the paper "Improved frame erasure concealment for CELP-based coders", presented at ICASSP 2000.

In that paper, when the data of one frame is lost, the lost frame data is predicted by linearly interpolating the data of the immediately preceding and following frames, both correctly received without loss.

Here, the pitch gain and pitch lag, which are quantized without a predictor, are predicted by linear interpolation, while the remaining parameters, the LSF and the fixed codebook parameters, are predicted with the repetition-based frame loss concealment method commonly used in existing codecs. Experimental results for the case of one frame per packet and no consecutive packet losses show that the frame loss concealment method using linear interpolation performs better than the repetition-based method commonly used in existing codecs.
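A minimal sketch of the interpolation step for a single lost frame is given below. The function name and the `weight` parameter are assumptions for illustration; the cited paper may weight or quantize the parameters differently.

```python
from typing import Tuple


def interpolate_pitch(prev_lag: int, prev_gain: float,
                      next_lag: int, next_gain: float,
                      weight: float = 0.5) -> Tuple[int, float]:
    """Linearly interpolate pitch lag and pitch gain for a lost frame.

    weight is the position of the lost frame between the previous frame
    (0.0) and the next frame (1.0); 0.5 models a single lost frame.
    """
    lag = round((1.0 - weight) * prev_lag + weight * next_lag)
    gain = (1.0 - weight) * prev_gain + weight * next_gain
    return lag, gain
```

For example, with a previous lag of 40 and a next lag of 50, the lost frame is assigned a lag of 45, halfway between its neighbors.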

Meanwhile, the characteristics of a voice signal change slowly over time. Neighboring frames are therefore correlated, and frames close to each other are very highly correlated; the correlation falls as the distance between frames grows. Most frame loss concealment methods exploit this inter-frame correlation. In a VoIP system, however, the frame data that was received after the lost frame and is held in the playout buffer is also available. Without any additional delay, the lost frame data can therefore be predicted more effectively by using, from among the frame data received before the loss and the frame data in the playout buffer, the frame data most highly correlated with the lost frame.

The present invention is intended to solve the above problems. Its object is to provide a frame loss concealment method and apparatus for a VoIP system that can reconstruct the voice signal of a lost frame more accurately by using, from among the frame data correctly received before the lost frame and the frame data following the lost frame that is stored in the playout buffer, the frame data closest to the lost frame.

To achieve the above object, the present invention provides a method that can effectively predict frame data lost because of packet loss.

A frame loss concealment method for a VoIP system according to one aspect of the present invention includes: a) detecting a lost frame; b) when a frame loss is detected in step a), determining the one frame data closest to the lost frame from among the correct frame data received before the lost frame and the frame data in the playout buffer; and c) reconstructing the voice signal by predicting the data of the lost frame using the determined frame data.
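Step b) can be sketched as a nearest-neighbor search over frame indices. The function name and the dictionary-based buffer are hypothetical, and the tie-breaking rule (prefer the past frame when both candidates are equally close) is an assumption, since the text does not say which frame wins a tie.

```python
from typing import Dict, Optional


def pick_nearest_frame(lost_index: int,
                       last_good_index: Optional[int],
                       playout_buffer: Dict[int, bytes]) -> Optional[int]:
    """Return the index of the frame closest in time to the lost frame.

    Candidates are the last correctly received frame before the loss and
    every frame after the loss that is already in the playout buffer.
    """
    candidates = [i for i in playout_buffer if i > lost_index]
    if last_good_index is not None:
        candidates.append(last_good_index)
    if not candidates:
        return None
    # Sort key: distance first, then past (False) before future (True).
    return min(candidates, key=lambda i: (abs(i - lost_index), i > lost_index))
```

The returned index identifies the frame whose parameters would feed the prediction in step c).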

When the lost frame data is reconstructed using the frame data received without loss before the lost frame, the parameter reconstruction may be constrained based on the frame data in the playout buffer. When the lost frame data is reconstructed using the loss-free frame data that follows the lost frame, the parameter reconstruction may be constrained based on the frame data received without loss before the lost frame.

In addition, when frame losses occur consecutively, the consecutively lost frames may be reconstructed sequentially, each using the frame data closest to that lost frame from among the correct frame data received before the loss and the frame data in the playout buffer.

The present invention also provides a frame loss concealment apparatus that reconstructs a lost voice signal when a loss occurs in the packet data of a voice signal sent by a transmitting apparatus.

A frame loss concealment apparatus for a VoIP system according to one aspect of the present invention includes: a playout buffer that stores frame data, delivers the frame data to a decoder at predetermined times, and, when a frame is lost, delivers the frame data in the playout buffer closest to the lost frame, together with time information, to a frame loss concealment module; a decoder that reconstructs the voice signal differently depending on whether the frame data delivered from the playout buffer is correct or lost and that, when reconstructing lost frame data, predicts it using the frame data closest to the lost frame from among the frame data delivered without loss before the lost frame and the frame data delivered from the playout buffer; and a BFI (Bad Frame Indicator), between the playout buffer and the decoder, that directs switching according to whether the frame is correct or lost.

The decoder includes: a decoding module that reconstructs correct frame data; and a frame loss concealment module that receives from the playout buffer the frame data closest to the lost frame, receives from the decoding module the frame data received without loss before the lost frame, selects the frame closest to the lost frame, and predicts the lost frame data to reconstruct the voice signal.

Embodiments of the present invention are described in detail below with reference to the accompanying drawings so that those of ordinary skill in the art to which the invention pertains can readily practice it. The present invention may, however, be embodied in many different forms and is not limited to the embodiments described here. Parts unrelated to the description are omitted from the drawings for clarity, and like parts are given like reference numerals throughout the specification.

Hereinafter, the present invention is described in detail with reference to the accompanying drawings.

FIG. 3 is a block diagram of a voice signal transmitting and receiving apparatus, showing the delivery of a voice signal over a packet network, for explaining the present invention.

As shown in FIG. 3, the voice signal transmitter of the transmitting and receiving apparatus includes an A/D converter 20, an encoder 30, and a packet protocol module 40.

The A/D converter 20 converts the sender's analog voice signal, input through a microphone, into a digital voice signal.

The encoder 30 compresses and encodes the digital voice signal.

The packet protocol module 40 processes the compressed digital voice data according to the Internet Protocol (IP), converts it into a form suitable for transmission over a packet network, and outputs it as voice packets.

The voice signal receiver includes a packet protocol module 40, a playout buffer 50, a decoder 60, and a D/A converter 70.

In the voice signal receiver, the packet protocol module 40 receives the voice packets transmitted over the packet network, unpacks them, and stores the frame-unit voice data in the playout buffer 50 in time order.

The playout buffer 50 stores the converted frame-unit voice data and sends frame data to the decoder 60 at predetermined times.

The decoder 60 reconstructs the voice signal from the received frame data.

The D/A converter 70 converts the reconstructed digital voice data into an analog signal. The converted analog voice signal is output through a speaker.

In the voice signal transmitting and receiving apparatus configured as above, most coders other than the PCM coder operate on frames. The coder compresses the voice signal frame by frame and outputs the data, and the data of one frame, or of two or three frames, is packed into a single packet and transmitted. The receiving end arranges the received packet data in time order, unpacks it into frame-unit data, and stores it in the playout buffer 50. The size of the playout buffer 50 may vary with the transmission delay or may be fixed, and the buffer delivers frame data to the decoder 60 at frame-time intervals. The decoder reproduces the digital voice signal from the frame data, and the D/A converter converts it into an analog signal for output.
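The receive-side reordering and unpacking can be sketched as below. The packet representation (a sequence number paired with a list of frames), the fixed number of frames per packet, and the function name are assumptions for illustration; the text does not specify a packet format.

```python
from typing import Dict, List, Tuple


def unpack_into_buffer(packets: List[Tuple[int, List[str]]],
                       frames_per_packet: int) -> Dict[int, str]:
    """Order packets by sequence number and store their frames by frame index.

    Each packet is (sequence_number, [frame, ...]). Frame k of packet s is
    stored under index s * frames_per_packet + k, so the frames of a missing
    packet simply leave a gap in the buffer.
    """
    playout_buffer: Dict[int, str] = {}
    for seq, frames in sorted(packets, key=lambda p: p[0]):
        for k, frame in enumerate(frames):
            playout_buffer[seq * frames_per_packet + k] = frame
    return playout_buffer
```

With two frames per packet, losing packet 1 leaves frame indices 2 and 3 absent from the buffer, which is how a single packet loss turns into consecutive frame losses.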

When packet data is transmitted over a packet network in this way, packets are lost because of network load, and because the time needed to transmit each packet is not constant, a packet that does not arrive by its scheduled time is generally treated as lost. This degrades sound quality, and the degradation can be reduced if the lost voice signal is reconstructed effectively.

FIG. 4 is a block diagram illustrating a frame loss concealment apparatus for a VoIP system according to an embodiment of the present invention.

As shown in FIG. 4, the frame loss concealment apparatus according to the embodiment of the present invention includes a playout buffer 50, a decoder 60, and a BFI 80. The decoder 60 includes a frame loss concealment module 62 and a decoding module 64.

The playout buffer 50 stores the data converted into frame units and delivers frame data at every frame-time interval. When a frame is lost, it delivers to the frame loss concealment module 62 the data of the frame, among those in the playout buffer 50, that is closest to the lost frame. The playout buffer 50 may have a fixed size, or its size may change according to network conditions. Along with the data of the frame closest to the lost frame, the playout buffer 50 also delivers time information to the frame loss concealment module. In the embodiment of FIG. 4, the size of the playout buffer 50 (also called a jitter buffer) is assumed to be 4, an error (loss) is assumed to occur at the Nth time, and the frame data for the Nth time is denoted F(N).

The decoder 60 has a built-in frame loss concealment module 62 to reduce the degradation of sound quality caused by frame loss. The module that operates when frame data is received normally, without loss, is called the decoding module, and the module that reduces the degradation caused by lost frame data is called the frame loss concealment module. For a CELP-type encoder/decoder, the frame data includes the pitch period, the LSP, the fixed codebook index, and the pitch and fixed codebook gains.

The frame loss concealment module 62 reconstructs the voice signal of the lost frame using the frame data closest to the lost frame, chosen from the frame data received without loss before the frame loss occurred and the frame data stored in the playout buffer 50. When it reconstructs the lost frame from the closest of the frames received without loss before and after the loss, it may use the other loss-free frame data as a constraint when predicting the parameters of the lost frame.

The decoding module 64 reconstructs the voice signal of frames received without loss.

The BFI (Bad Frame Indicator) 80 is a flag that indicates whether a frame loss has occurred. It controls a switch so that the data of a frame received without loss is reconstructed into a voice signal by the decoding module 64 and, when a frame loss has occurred, the voice signal for the lost frame is reconstructed by the frame loss concealment module 62. For example, if a frame loss occurs and the frame data that should be delivered to the decoder is not present in the playout buffer 50, the BFI 80 becomes 1 and directs the switch to connect to the frame loss concealment module 62. If the frame data that is due in time order is present in the playout buffer 50, the BFI 80 becomes 0 and directs the switch to connect to the decoding module 64.
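The switching rule can be written as a small routing function. This is a sketch: the dictionary-based buffer, the function name, and the string labels for the two modules are illustrative assumptions, while the flag values 1 and 0 follow the description above.

```python
from typing import Dict, Tuple


def route_frame(playout_buffer: Dict[int, bytes], due_index: int) -> Tuple[int, str]:
    """Return (BFI, target module) for the frame that is due at this time."""
    if due_index in playout_buffer:
        # The frame due in time order is present: BFI = 0, normal decoding.
        return 0, "decoding_module"
    # The due frame is missing: BFI = 1, route to concealment.
    return 1, "frame_loss_concealment_module"
```

A receiver loop would call this once per frame interval and hand the frame, or the neighboring frames, to whichever module is returned.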

The BFI 80 may be configured as a separate, independent module, or a module for the BFI 80 may be provided within the playout buffer 50 or the decoder 60.

The operation of the frame loss concealment apparatus for a VoIP system configured as above is described in detail with reference to FIG. 5. As in FIG. 4, an error (loss) is assumed to occur at the Nth time, and the frame data for the Nth time is denoted F(N). When frame data is reconstructed based on the time information, the operation shown in FIG. 5 is repeated continuously.

As shown in FIG. 5, the decoder 60 receives packets according to the time information (N-1 to N+2) (S60) and reconstructs them to reproduce the voice signal. The frame data generated by unpacking the received packets is stored in the playout buffer 50 (S61). Once the playout buffer 50 is full of frame data, it begins delivering the data to the decoder 60.

First, it is determined whether a frame has been lost. Following the time order, the frame data F(N-1) corresponding to the (N-1)th time is present in the playout buffer 50, so the BFI 80 connects the switch to the decoding module 64. The decoding module 64 then receives F(N-1) and reconstructs the voice signal (S64-S65).

Next, at the point when the playout buffer 50 should deliver the frame data F(N) corresponding to the Nth time to the decoder 60, F(N) is not present in the playout buffer 50, so the BFI 80 connects the switch to the frame loss concealment module 62 (S60-S62). The playout buffer 50 then delivers, at the Nth time, the frame data F(N+1) corresponding to the (N+1)th time to the frame loss concealment module 62. The frame loss concealment module 62 predicts the data of the lost frame using whichever of the frames before and after the lost frame is closest to it (S63) and reconstructs the voice signal (S65). When the lost frame data is predicted from the closest of the loss-free frame sent before the lost frame and the loss-free frame sent after it, the other frame data is used as a constraint on the prediction.

Next, at the point when the frame data F(N+1) corresponding to the (N+1)th time should be reconstructed, F(N+1) is present in the playout buffer 50, so it is delivered to the decoding module 64, which reconstructs F(N+1) and reproduces the voice signal (S64-S65). The frame data corresponding to the (N+2)th time is handled in the same way as F(N+1).
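The walk-through for times N-1 to N+2 can be simulated with a short loop. The sketch only records which module handles each time slot and which frame serves as the source; it does not decode audio. The function name, the log format, and the preference for the past frame when both neighbors are equally close are assumptions for the example.

```python
from typing import Dict, List, Optional, Tuple


def run_receiver(playout_buffer: Dict[int, bytes],
                 start: int, end: int) -> List[Tuple[int, str, Optional[int]]]:
    """Return (time, action, source frame index) for each time slot."""
    log: List[Tuple[int, str, Optional[int]]] = []
    last_good: Optional[int] = None
    for t in range(start, end + 1):
        if t in playout_buffer:
            # BFI = 0: the decoding module reconstructs the frame itself.
            log.append((t, "decode", t))
            last_good = t
            continue
        # BFI = 1: pick the closest of the previous good frame and the
        # later frames already in the buffer (ties go to the past frame).
        candidates = [i for i in playout_buffer if i > t]
        if last_good is not None:
            candidates.append(last_good)
        source = (min(candidates, key=lambda i: (abs(i - t), i > t))
                  if candidates else None)
        log.append((t, "conceal", source))
    return log
```

With N = 10 and a buffer holding frames 9, 11, and 12, the loop decodes frame 9, conceals frame 10, and then decodes frames 11 and 12, matching the sequence described above.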

The embodiment above describes the case of one frame per packet, but when the number of frames per packet is larger, the loss of a single packet can cause several consecutive frames to be lost. When multiple frames are lost consecutively in this way, the frame loss concealment module 62 reconstructs the lost voice signal sequentially, according to the time information, using for each lost frame the closest frame data among the data received without loss before or after the lost frames.
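For a burst of consecutive losses, the choice of source frame per lost frame can be sketched as follows. The sketch assumes that the frame immediately before the burst was received and that the frame immediately after it is already in the playout buffer; the function name and the rule that ties go to the earlier frame are likewise assumptions.

```python
from typing import List, Tuple


def plan_burst_concealment(first_lost: int, num_lost: int) -> List[Tuple[int, int]]:
    """Return (lost frame index, source frame index) pairs in time order."""
    prev_good = first_lost - 1          # last frame received before the burst
    next_good = first_lost + num_lost   # first frame received after the burst
    plan: List[Tuple[int, int]] = []
    for idx in range(first_lost, first_lost + num_lost):
        dist_prev = idx - prev_good
        dist_next = next_good - idx
        source = prev_good if dist_prev <= dist_next else next_good
        plan.append((idx, source))
    return plan
```

For a burst of three frames starting at index 10, the first two lost frames draw on frame 9 and the last one draws on frame 13, so each lost frame uses the neighbor it is most correlated with.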

이와 같이 본 발명의 실시 예에 따른 프레임 손실 은닉 방법은 주로 CELP(Code-Excited Linear Predictive) 음성 부호화기를 기반으로 사용한다.As described above, the frame loss concealment method according to an embodiment of the present invention mainly uses a CELP (Code-Excited Linear Predictive) speech coder.

The CELP coder uses a hybrid coding technique that produces high-quality speech even at bit rates below 16 kbps. A CELP-type speech coder generally extracts from one frame of the speech signal the fixed codebook index, the pitch, the LSF, and the gains of the fixed codebook and the pitch, and transmits them; the receiver restores the speech signal from these data.

The frame loss concealment method according to the embodiment of the present invention can be depicted more simply as in FIG. 6. A method of restoring the data of a lost frame in a CELP speech coder is described below with reference to FIG. 6.

As shown in FIG. 6, in a CELP speech coder the frame data is stored in the playout buffer (50), and the playout buffer (50) delivers frame data to the decoder at fixed intervals. When a frame loss occurs, the playout buffer (50) delivers to the frame loss concealment module (62) the data of the frame in the playout buffer (50) that is closest to the lost frame. The frame loss concealment module (62) compares the frame delivered from the playout buffer (50) with the frame delivered before the frame loss occurred and selects the one closest to the lost frame. It then restores the lost frame data using the selected frame data. For CELP-family speech coders, the LSP, pitch information, and fixed codebook information of the lost frame are restored; in doing so, the frame other than the one selected for restoration may be used as a constraint on the prediction.
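The parameter recovery for a CELP frame could be sketched as below. The description does not say how the constraint is applied, so the rule used here (keeping the estimated pitch lag within a fixed distance of the other frame's pitch lag) is purely an illustrative assumption, as are the names `CelpParams`, `conceal`, and `max_lag_gap`.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class CelpParams:
    lsp: Tuple[float, ...]  # line spectral pair coefficients
    pitch_lag: int          # pitch delay in samples
    fixed_index: int        # fixed codebook index


def conceal(
    chosen: CelpParams, constraint: Optional[CelpParams], max_lag_gap: int = 10
) -> CelpParams:
    """Estimate the lost frame's parameters from the selected neighbor."""
    estimate = chosen  # start from the nearest frame's parameters
    if constraint is not None:
        # Hypothetical constraint: clamp the pitch lag to within
        # max_lag_gap samples of the other neighbor's pitch lag.
        lo = constraint.pitch_lag - max_lag_gap
        hi = constraint.pitch_lag + max_lag_gap
        estimate = replace(estimate, pitch_lag=min(max(estimate.pitch_lag, lo), hi))
    return estimate
```

Here the LSP and fixed codebook values are simply copied from the selected frame; a real decoder would likely smooth or attenuate them, which this sketch leaves out.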

Table 1 shows the perceptual evaluation of speech quality for the conventional frame loss concealment methods of FIG. 1 and FIG. 2 and for the frame loss concealment method according to the embodiment of the present invention. Here, the repetition-based frame loss concealment method of FIG. 1 is called F-PLC, the linear-interpolation-based frame loss concealment method of FIG. 2 is called MI-PLC, and the frame loss concealment method according to the embodiment of the present invention is called FB-PLC.

Packet size  PLC type  Frame loss rate (%)
                       1      3      5      7      10     15
10           F-PLC     3.67   3.38   3.24   3.09   2.91   2.72
             MI-PLC    3.69   3.46   3.32   3.20   3.04   2.84
             FB-PLC    3.71   3.50   3.39   3.28   3.13   2.92
20           F-PLC     3.62   3.27   3.09   2.93   2.70   2.52
             MI-PLC    3.65   3.35   3.17   3.03   2.81   2.63
             FB-PLC    3.68   3.42   3.30   3.16   3.02   2.78
30           F-PLC     3.62   3.30   2.99   2.81   2.60   2.32
             MI-PLC    3.65   3.42   3.13   2.97   2.80   2.52
             FB-PLC    3.70   3.47   3.21   3.09   2.93   2.74
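To read the trend in Table 1, the short script below computes how much FB-PLC scores above F-PLC at the lowest and highest loss rates for each packet size. The numbers are copied from Table 1; the variable names and the choice of which columns to compare are illustrative only.

```python
# Scores from Table 1, indexed by packet size, at loss rates 1, 3, 5, 7, 10, 15 %.
LOSS_RATES = (1, 3, 5, 7, 10, 15)
F_PLC = {
    10: (3.67, 3.38, 3.24, 3.09, 2.91, 2.72),
    20: (3.62, 3.27, 3.09, 2.93, 2.70, 2.52),
    30: (3.62, 3.30, 2.99, 2.81, 2.60, 2.32),
}
FB_PLC = {
    10: (3.71, 3.50, 3.39, 3.28, 3.13, 2.92),
    20: (3.68, 3.42, 3.30, 3.16, 3.02, 2.78),
    30: (3.70, 3.47, 3.21, 3.09, 2.93, 2.74),
}


def gain(size: int, rate: int) -> float:
    """Score difference FB-PLC minus F-PLC for one table cell."""
    col = LOSS_RATES.index(rate)
    return round(FB_PLC[size][col] - F_PLC[size][col], 2)


for size in (10, 20, 30):
    print(size, gain(size, 1), gain(size, 15))
# 10 0.04 0.2
# 20 0.06 0.26
# 30 0.08 0.42
```

The margin over the repetition-based method widens as both the loss rate and the packet size grow.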

Table 2 shows the results of a preference test between MI-PLC and FB-PLC for the 5% frame loss rate case of Table 1.

Speaker  Preference score (%)
         MI-PLC   FB-PLC
Female   33.33    66.67
Male     30.56    69.44

Tables 1 and 2 show that the frame loss concealment method according to the embodiment of the present invention is more effective than the conventional frame loss concealment methods.

In addition, the lost speech signal restoration method according to the present invention described above may be implemented as a program and stored on a recording medium (CD-ROM, RAM, ROM, floppy disk, hard disk, magneto-optical disk, etc.).

The above embodiments are intended to illustrate the present invention. The scope of the present invention is not limited to these embodiments, and they may be varied or modified by those skilled in the art within the scope of the invention as defined by the appended claims.

According to the present invention, lost frame data is restored using the frame data closest to the lost frame among the correct frame data preceding the lost frame and the frame data following the lost frame that is stored in the playout buffer. Because of the correlation that exists between the speech signals of neighboring frames, this allows the speech signal of the lost frame to be restored more efficiently and more accurately. Compared with the conventional repetition-based and linear-interpolation-based frame loss concealment methods, it can also improve voice call quality over packet networks with transmission errors without increasing the amount of computation.

Claims

A frame loss concealment method for a VoIP system, for reducing the degradation in quality of a restored speech signal caused by loss of frame data of a speech signal received from a transmitting device, the method comprising:

a) detecting whether received frame data has been lost;

b) when frame loss is detected in step a), determining the frame closest to the lost frame among the frames held in the playout buffer and the frames received without loss before the lost frame; and

c) restoring the speech signal by predicting the data of the lost frame using the data of the determined frame.

The method of claim 1,

wherein, when the lost frame data is predicted using the data of the frame previously received without loss, the frame data in the playout buffer is used as a constraint on the prediction of the lost frame data.

The method of claim 1,

wherein, when the lost frame data is restored using the frame data in the playout buffer, the frame data previously received without loss is used as a constraint on the prediction of the lost frame data.

The method of claim 1,

wherein, when frame data is lost consecutively, the lost frame data is restored sequentially, using for each lost frame the data of the frame closest to that lost frame among the correct frame data received before the lost frame and the frame data in the playout buffer.

A frame loss concealment apparatus for a VoIP system, for reducing the degradation in quality of a restored speech signal caused by loss of frame data of a speech signal received from a transmitting device, the apparatus comprising:

a playout buffer that stores frame data, delivers the frame data to a decoder at fixed intervals, and, when a frame loss occurs, delivers the frame data closest to the lost frame together with its time information;

a decoder that restores frame data received without loss and lost frame data in different manners and, when frame data is lost, predicts the lost frame data using the frame data closest to the lost frame among the frame data delivered without loss before the lost frame and the frame data delivered from the playout buffer; and

a bad frame indicator (BFI) that, between the playout buffer and the decoder, directs switching according to whether a frame was received without loss or was lost.

The apparatus of claim 5, wherein the decoder comprises:

a decoding module that restores frame data received without loss; and

a frame loss concealment module that receives from the playout buffer the frame data closest to the lost frame, receives from the decoding module the frame data received without loss before the lost frame, and predicts the lost speech signal using the frame closest to the lost frame.