KR100315188B1

KR100315188B1 - Apparatus and method for receiving voice data

Info

Publication number: KR100315188B1
Application number: KR1019990059588A
Authority: KR
Inventors: 염응문
Original assignee: 윤종용; 삼성전자 주식회사
Priority date: 1999-12-21
Filing date: 1999-12-21
Publication date: 2001-11-26
Also published as: KR20010057276A

Abstract

The present invention relates to an apparatus and method for receiving voice data that provide the listener with good sound quality by dynamically reconstructing jitter-buffered encoded voice data according to its timestamps in the receiving-side voice processing unit.

An object of the present invention is to provide the listener with good sound quality by using the timestamp in the RTP header to decode each packet according to its processing interval, so that jitter-buffered data is decoded efficiently.

The present invention includes: a voice processing unit that decompresses and decodes voice data at a predetermined time interval determined by the coding scheme; a packet receiving module that receives a new packet, calculates the processing interval between it and the previously processed packet, and separates the encoded voice data contained in the new packet; a voice data receiving buffer that temporarily stores, as a pair, the processing interval and the encoded voice data calculated and separated by the packet receiving module; and a voice processing control unit that delays the encoded voice data stored in the voice data receiving buffer by its processing interval and then passes it to the voice processing unit for decoding.

According to the present invention, each packet is decoded according to its processing interval, using the timestamp in the RTP header, so that jitter-buffered data is decoded efficiently and the listener is provided with good sound quality.

Description

Apparatus and method for receiving voice data

The present invention relates to a voice communication apparatus and method, and more particularly to an apparatus and method for receiving voice data that provide the listener with good sound quality by dynamically reconstructing jitter-buffered encoded voice data according to its timestamps in the receiving-side voice processing unit.

Internet use has grown explosively in recent years, probably because multimedia services are now available over the Internet. Internet phone service, one such multimedia service, lets users place international calls over the Internet instead of the existing telephone network and can therefore be offered at low cost.

A typical Internet phone service and its problems are briefly described below.

First, a communication terminal (for example, an Internet phone) encodes the caller's voice with a vocoder, packs it into Real-Time Transport Protocol (RTP) packets, and sends them to the Internet through a User Datagram Protocol (UDP) port. The receiving communication terminal receives the packets, decodes them, and outputs the result to the user.

While the encoded voice data travels from the sending terminal to the receiving terminal, noise and interference on the communication path, changes in ambient temperature, bit stuffing, and similar factors shift the phase of the signal. As a result, the receiving terminal cannot receive and decode the encoded voice data at the correct time. This is the jitter problem.

A buffering algorithm is generally used to address the jitter problem, as described below with reference to FIGS. 1 and 2.

First, the sending communication terminal encodes voice data and generates RTP packets as shown in FIG. 1. The generated RTP packets include silence packets, but the sending terminal does not transmit the silence packets to the receiving side.

The receiving communication terminal receives the RTP packets carrying the encoded voice data from the sending terminal, buffers them as shown in FIG. 2, and then decodes and outputs them in the order received, which addresses the jitter problem.

However, the receiving terminal simply decodes the encoded voice packets accumulated in the jitter buffer without accounting for the silence packets that the sender did not transmit. Because the receiver decodes as if those silence packets never existed, it produces voice data that differs from the original voice data the sender intended to convey.

Accordingly, an object of the present invention is to provide the listener with good sound quality by using the timestamp in the RTP header to decode each packet according to its processing interval, so that jitter-buffered data is decoded efficiently.

FIG. 1 illustrates RTP packets generated by a sending communication terminal by encoding voice data.

FIG. 2 illustrates jitter buffering of the RTP packets of FIG. 1 in a receiving-side buffer.

FIG. 3 shows the main components of a voice data communication apparatus according to the present invention.

FIG. 4 shows the structure of the voice data receiving buffer according to the present invention.

FIG. 5 illustrates RTP packets generated by a sending communication terminal, used to explain a preferred embodiment of the present invention.

FIG. 6 illustrates how the RTP packets generated as in FIG. 5 are stored in the voice data receiving buffer according to the present invention.

FIG. 7 shows the voice data decoding method according to the present invention.

FIG. 8 shows the procedure performed by the RTP packet receiving module according to the present invention.

<Description of reference numerals for the main parts of the drawings>

100: voice processing unit
200: RTP packet receiving module
300: voice data receiving buffer
400: voice processing control unit
410: voice recording module
420: voice reading module
500: voice data transmission buffer
600: RTP packet transmission module

According to one aspect of the present invention for achieving the above object, there is disclosed an apparatus for receiving voice data, including: a voice processing unit that decompresses and decodes voice data at a predetermined time interval determined by the coding scheme; a packet receiving module that receives a new packet, calculates the processing interval between it and the previously processed packet, and separates the encoded voice data contained in the new packet; a voice data receiving buffer that temporarily stores, as a pair, the processing interval and the encoded voice data calculated and separated by the packet receiving module; and a voice processing control unit that delays the encoded voice data stored in the voice data receiving buffer by its processing interval and then passes it to the voice processing unit for decoding.

Preferably, the packet receiving module calculates the processing interval by comparing the timestamps stored in the two packets.

Preferably, the predetermined time interval is 30 ms when the coding scheme is G.723.1 and 10 ms when it is G.729.

Preferably, the processing interval is an integer multiple of the predetermined time interval.

Preferably, the packet is a Real-Time Transport Protocol (RTP) packet.

Preferably, the voice processing control unit performs the decoding operation only when at least a certain amount of encoded voice data is stored in the voice data receiving buffer.

According to another aspect of the present invention, there is disclosed a method for receiving voice data in which a new packet is received, the processing interval between it and the previously processed packet is calculated, the encoded voice data contained in the new packet is separated, the processing interval and the encoded voice data are temporarily stored as a pair, and the stored encoded voice data is then delayed by its processing interval and decoded.

Preferably, the processing interval is an integer multiple of 30 ms when the coding scheme is G.723.1 and an integer multiple of 10 ms when the coding scheme is G.729.

Preferably, the method performs the decoding operation only when at least a certain amount of encoded voice data is stored.

The objects, features, and advantages of the present invention described above are explained in more detail below through the preferred embodiments shown in the accompanying drawings.

The terms used below are defined in view of their functions in the present invention and may vary with the intent or practice of those working in the field. Their definitions should therefore be based on the content of this specification as a whole.

The voice communication apparatus according to the present invention is described below with reference to FIG. 3.

On the sending side, the voice communication apparatus samples the voice to be transmitted for 30 ms (G.723.1) or 10 ms (G.729) through the voice processing unit (G.723.1 or G.729), compresses it, packs it into an RTP packet, and sends it to the receiving side over UDP.

On the receiving side, the voice processing unit (100) decompresses voice data every 30 ms (G.723.1) or 10 ms (G.729) and decodes the voice. The packet receiving module (200) receives a new packet, calculates the processing interval between it and the previously processed packet, and separates the encoded voice data contained in the new packet (that is, the data stored in the packet payload). The voice data receiving buffer (300) temporarily stores, as a pair and in the structure shown in FIG. 4, the processing interval and the encoded voice data calculated and separated by the packet receiving module (200). The voice processing control unit (400) delays the encoded voice data held in the voice data receiving buffer (300) by its processing interval and then passes it to the voice processing unit (100) for decoding, which improves voice quality.

The voice recording module (410) and the voice reading module (420) included in the voice processing control unit (400) run at a fixed interval (30 ms when the coding scheme is G.723.1, 10 ms when it is G.729). The voice recording module (410) reads the voice data to be decoded from the voice data receiving buffer (300) and passes it to the voice processing unit (100). The voice reading module (420) reads the voice data encoded by the voice processing unit (100) and stores it in the voice data transmission buffer (500). The RTP packet transmission module (600) transmits the encoded voice data stored in the voice data transmission buffer (500) to the outside.

Currently, H.323 endpoints, gateways, gatekeepers, multipoint control units, and multipoint controllers use G.723.1 and G.729A as their voice processing unit (also called a codec), with sampling times of 30 ms and 10 ms, respectively. The voice recording module (410) and the voice reading module (420) therefore run every 30 ms or 10 ms, and the voice recording module (410) in particular periodically performs the voice data decoding method of the present invention shown in FIG. 7. The preferred embodiment is described below for the case where the voice processing unit is G.723.1.

Referring to FIG. 4, the voice data receiving buffer (300) is organized as follows. R_Addr_P is the address from which the voice recording module (410) reads next, and W_Addr_P is the address at which the next pair of processing interval and encoded voice data, calculated and separated by the packet receiving module (200), is written. As illustrated, each entry of the Voice_Data area stores a pair: the processing interval, that is, the delay decoding count (Decoding_delay_CNT), and the encoded voice data taken from the payload of the RTP packet.
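The buffer layout described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the circular layout, the `capacity` parameter, the overflow behavior, and the names `Entry` and `VoiceReceiveBuffer` are assumptions added for the example. Only `R_Addr_P`, `W_Addr_P`, and the (Decoding_delay_CNT, voice data) pairing come from the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    decoding_delay_cnt: int  # frame intervals to skip before decoding this payload
    voice_data: bytes        # encoded payload taken from the RTP packet


class VoiceReceiveBuffer:
    """Hypothetical circular buffer with separate read and write positions."""

    def __init__(self, capacity: int = 64):
        self.slots: List[Optional[Entry]] = [None] * capacity
        self.r_addr_p = 0  # next slot to read (R_Addr_P)
        self.w_addr_p = 0  # next slot to write (W_Addr_P)

    def is_empty(self) -> bool:
        # Equal read and write positions mean there is nothing to read.
        return self.r_addr_p == self.w_addr_p

    def count(self) -> int:
        return (self.w_addr_p - self.r_addr_p) % len(self.slots)

    def write(self, entry: Entry) -> None:
        nxt = (self.w_addr_p + 1) % len(self.slots)
        if nxt == self.r_addr_p:
            # Assumed policy: refuse to overwrite unread data.
            raise OverflowError("voice data receiving buffer is full")
        self.slots[self.w_addr_p] = entry
        self.w_addr_p = nxt

    def peek(self) -> Entry:
        if self.is_empty():
            raise IndexError("voice data receiving buffer is empty")
        return self.slots[self.r_addr_p]

    def pop(self) -> Entry:
        entry = self.peek()
        self.slots[self.r_addr_p] = None
        self.r_addr_p = (self.r_addr_p + 1) % len(self.slots)
        return entry
```

Keeping one slot unused lets `r_addr_p == w_addr_p` mean "empty" unambiguously, which matches the emptiness test the decoding procedure relies on.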

The process by which the voice recording module (410) decodes voice data is described below with reference to FIG. 7.

First, the voice recording module (410) compares R_Addr_P and W_Addr_P of the voice data receiving buffer (300) (step S710).

If the comparison shows that R_Addr_P and W_Addr_P are equal, that is, there is no data for the voice recording module (410) to read from the voice data receiving buffer (300), the Jitter_Size_ok flag is set to 0 and the module waits for a certain amount of data (for example, 60 bytes) to accumulate (step S720).

The voice recording module (410) runs every 30 ms. If R_Addr_P and W_Addr_P differ in S710, the module checks the Jitter_Size_ok value (step S730). The flag is checked so that decodable data can be handed straight to the voice processing unit (100) without re-checking, on every 30 ms run, whether a certain amount of data has accumulated in the voice data receiving buffer (300). That is, the amount is checked once in S740; once enough has accumulated, the Jitter_Size_ok flag is set in S750, and the payload holding the encoded voice data is then processed from S760 onward.

Thus, once the Jitter_Size_ok flag is set, data is processed immediately without the accumulation check of S740. If R_Addr_P and W_Addr_P are found to be equal in S710 on a later run, the Jitter_Size_ok flag is reset to 0, and steps S760 to S780, described next, are performed only after a certain amount of data has again accumulated in the voice data receiving buffer (300). This addresses the transmission delay variation (jitter) problem on a LAN that cannot guarantee QoS.

When a certain amount of data is present in the voice data receiving buffer (300), the voice recording module (410) compares the delay decoding count (Decoding_delay_CNT), which was calculated by the packet receiving module (200) and stored in the voice data receiving buffer (300), with the decoding delay (Decoding_Delay) value maintained by the voice processing unit (100) (step S760).

The initial value of Decoding_Delay maintained by the voice processing unit (100) is 0.

If the comparison in S760 shows that Decoding_delay_CNT and Decoding_Delay differ, Decoding_Delay is incremented by 1 and the procedure returns (step S770). Because the voice processing unit (100) decodes no encoded voice data during that 30 ms, the effect is the same as receiving and outputting a silence packet.

If the comparison in S760 shows that Decoding_delay_CNT and Decoding_Delay are equal, that is, when the interrupt occurs at the interval given by the difference between the timestamp of the previously processed RTP packet and that of the packet now to be processed (for example, 30 ms), Decoding_Delay is reset and the encoded voice data associated with that Decoding_delay_CNT is passed to the voice processing unit (100) and decoded (step S780).
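Steps S710 to S780 can be sketched as a function called once per frame interval. This is an illustrative reading of the flow, not the patent's code: the class name `PlayoutController`, the `decode` callback, the use of `collections.deque` in place of the address-pointer buffer, and the byte-based threshold check are assumptions. The 60-byte default mirrors the example figure in the text.

```python
from collections import deque


class PlayoutController:
    """Hypothetical stand-in for the voice recording module (410)."""

    def __init__(self, decode, min_bytes=60):
        self.queue = deque()         # entries: (Decoding_delay_CNT, payload)
        self.decode = decode         # stand-in for the voice processing unit
        self.min_bytes = min_bytes   # amount that must accumulate before playout
        self.jitter_size_ok = False  # Jitter_Size_ok flag
        self.decoding_delay = 0      # Decoding_Delay, initially 0

    def tick(self):
        """Run once per frame interval; returns decoded audio or None (silence)."""
        if not self.queue:                       # S710: nothing to read
            self.jitter_size_ok = False          # S720: clear flag and wait
            return None
        if not self.jitter_size_ok:              # S730: flag not yet set
            buffered = sum(len(p) for _, p in self.queue)
            if buffered < self.min_bytes:        # S740: not enough data yet
                return None
            self.jitter_size_ok = True           # S750
        cnt, payload = self.queue[0]
        if cnt != self.decoding_delay:           # S760: still inside a silent gap
            self.decoding_delay += 1             # S770: skip this interval
            return None
        self.decoding_delay = 0                  # S780: reset and decode
        self.queue.popleft()
        return self.decode(payload)
```

With entries `(0, A)`, `(2, B)`, `(0, C)` queued, successive ticks yield A, two silent intervals, B, and then C, so B is decoded three frame intervals after A.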

The operation of the packet receiving module (200) is described below with reference to FIG. 8.

First, the packet receiving module (200) checks the timestamp (CRT_TS) in the header of the newly received RTP packet and compares it with the timestamp (FWD_TS) of the previously processed RTP packet (step S810). In other words, it calculates the processing interval between the previously processed packet and the packet now to be processed. The timestamp interval between packets is not constant because the sending terminal does not transmit silence packets during silence. Accordingly, when the voice processing unit (100) uses the G.723.1 coding scheme, the processing interval between packets is a multiple of 30 ms.

Next, having calculated the processing interval by comparing the timestamp of the newly received RTP packet (CRT_TS) with that of the previously processed RTP packet (FWD_TS), the module stores the calculated processing interval in Decoding_delay_CNT (step S820).

For example, if the processing interval is 30 ms, 0 is stored in Decoding_delay_CNT. If the processing interval is 60 ms (one silence packet lies between the two packets), 1 is stored. If the processing interval is 90 ms (two silence packets lie between the two packets), 2 is stored.
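The mapping in this example amounts to dividing the interval by the frame duration and subtracting one. The sketch below assumes timestamps already expressed in milliseconds and ignores timestamp wraparound. Real RTP timestamps count sampling-clock ticks rather than milliseconds, so a conversion would be needed in practice. The function name and the `FRAME_MS` table are hypothetical.

```python
# Frame durations in milliseconds, as stated in the text.
FRAME_MS = {"G.723.1": 30, "G.729": 10}


def decoding_delay_cnt(fwd_ts_ms: int, crt_ts_ms: int, codec: str = "G.723.1") -> int:
    """Number of untransmitted silence frames between two packets."""
    frame = FRAME_MS[codec]
    interval = crt_ts_ms - fwd_ts_ms
    if interval <= 0 or interval % frame != 0:
        # Assumed policy: reject intervals that are not a positive frame multiple.
        raise ValueError(f"interval {interval} ms is not a positive multiple of {frame} ms")
    return interval // frame - 1
```

For G.723.1 this gives 0, 1, and 2 for intervals of 30 ms, 60 ms, and 90 ms, matching the values listed above.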

After the calculated processing interval is stored in Decoding_delay_CNT, the CRT_TS value, which holds the timestamp of the packet currently being processed, is assigned to the FWD_TS variable in preparation for the next RTP packet (step S830).

Next, the packet receiving module (200) stores in the voice data receiving buffer (300) the delay decoding count (Decoding_delay_CNT), which represents the processing interval between the current packet and the preceding one, together with the encoded voice data extracted from the payload of the current packet (step S840).
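Steps S810 to S840 can be sketched as a small receiver class. Several details are assumptions not stated in the text: timestamps are taken to be in milliseconds, the first packet is given a count of 0, negative or zero intervals are clamped to 0, and a `deque` stands in for the voice data receiving buffer. The class name `PacketReceiver` is hypothetical.

```python
from collections import deque


class PacketReceiver:
    """Hypothetical stand-in for the packet receiving module (200)."""

    def __init__(self, frame_ms: int = 30):
        self.frame_ms = frame_ms
        self.fwd_ts = None     # FWD_TS: timestamp of the previously processed packet
        self.buffer = deque()  # stand-in for the voice data receiving buffer (300)

    def on_packet(self, crt_ts: int, payload: bytes) -> int:
        if self.fwd_ts is None:
            cnt = 0  # assumption: the first packet is played without extra delay
        else:
            interval = crt_ts - self.fwd_ts                # S810: compare timestamps
            cnt = max(interval // self.frame_ms - 1, 0)    # S820: Decoding_delay_CNT
        self.fwd_ts = crt_ts                               # S830: remember for next packet
        self.buffer.append((cnt, payload))                 # S840: store the pair
        return cnt
```

Feeding packets with timestamps 0, 30, and 90 stores counts 0, 0, and 1, so the third payload carries the record of one skipped silence frame.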

The way RTP packets received by the packet receiving module (200) are processed and stored in the voice data receiving buffer (300) is described with reference to FIGS. 5 and 6. The sending side generates, in the order shown in FIG. 5, packets carrying voice data with silence packets between them, and transmits only the packets carrying voice data. The packet receiving module (200) on the receiving side calculates the intervals between the received packets and extracts the voice data by the method of FIG. 8, then stores them in the voice data receiving buffer (300) as shown in FIG. 6.

Accordingly, when the voice recording module (410) of the voice processing control unit (400) decodes the voice data encoded in a packet, it refers to the delay decoding count (Decoding_delay_CNT) field stored for that packet in the voice data receiving buffer (300) and controls the voice processing unit (100) accordingly.

For example, the voice data encoded in packet 4 of FIG. 6 is processed after the voice recording module (410) has run twice (that is, after 60 ms). The voice is therefore decoded as if the sender had transmitted the silence packet, and the listener hears good sound quality.
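The effect on the playout schedule can be illustrated end to end with a simplified model. The timestamps and packet labels below are hypothetical and are not taken from FIGS. 5 and 6. The accumulation threshold is omitted, and timestamps are assumed to be in milliseconds.

```python
def playout_timeline(packets, frame_ms=30):
    """Return what is output in each frame interval.

    packets: list of (timestamp_ms, label) in arrival order.
    """
    timeline = []
    prev_ts = None
    for ts, label in packets:
        # Count of silence frames the sender generated but did not transmit.
        cnt = 0 if prev_ts is None else (ts - prev_ts) // frame_ms - 1
        timeline.extend(["silence"] * cnt)  # intervals in which nothing is decoded
        timeline.append(label)              # interval in which this packet is decoded
        prev_ts = ts
    return timeline


received = [(0, "P1"), (30, "P2"), (90, "P4"), (120, "P5")]
with_timestamps = playout_timeline(received)
arrival_order_only = [label for _, label in received]
```

Here `with_timestamps` restores one silent interval before `P4`, whereas `arrival_order_only`, which corresponds to the conventional jitter buffer described earlier, plays `P4` immediately after `P2` and shortens the pause.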

As described above, according to the present invention, each packet is decoded according to its processing interval, using the timestamp in the RTP header, so that jitter-buffered data is decoded efficiently and the listener is provided with good sound quality.

Although preferred embodiments of the present invention have been described in detail above, those of ordinary skill in the art to which the invention pertains will appreciate that the invention may be practiced with various modifications or changes without departing from the spirit and scope of the invention as defined in the appended claims. Accordingly, future modifications of the embodiments will not depart from the technology of the present invention.

Claims

An apparatus for receiving voice data, comprising:

a voice processing unit that decompresses and decodes voice data at a predetermined time interval determined by a coding scheme;

a packet receiving module that receives a new packet, calculates a processing interval between the new packet and a previously processed packet, and separates encoded voice data contained in the new packet;

a voice data receiving buffer that temporarily stores, as a pair, the processing interval and the encoded voice data calculated and separated by the packet receiving module; and

a voice processing control unit that delays the encoded voice data stored in the voice data receiving buffer by the processing interval of the encoded voice data and passes it to the voice processing unit for decoding.

The apparatus of claim 1, wherein the packet receiving module

calculates the processing interval by comparing the timestamps stored in the two packets.

The apparatus of claim 1, wherein the predetermined time interval

is 30 ms when the coding scheme is G.723.1 and 10 ms when the coding scheme is G.729.

The apparatus of claim 3, wherein the processing interval

is an integer multiple of the predetermined time interval.

The apparatus of claim 1, wherein the packet

is a Real-Time Transport Protocol (RTP) packet.

The apparatus of claim 1, wherein the voice processing control unit

performs the decoding operation only when at least a predetermined amount of the encoded voice data is stored in the voice data receiving buffer.

A method for receiving voice data, comprising:

a packet receiving step of receiving a new packet, calculating a processing interval between the new packet and a previously processed packet, and separating encoded voice data contained in the new packet;

a voice data buffering step of temporarily storing, as a pair, the processing interval and the encoded voice data calculated and separated in the packet receiving step; and

a voice processing step of delaying the stored encoded voice data by the processing interval of the encoded voice data and then decoding it.

The method of claim 7, wherein the packet receiving step,

The method of claim 7, wherein the processing interval

is an integer multiple of 30 ms when the coding scheme is G.723.1 and an integer multiple of 10 ms when the coding scheme is G.729.

The method of claim 7, wherein the packet

is a Real-Time Transport Protocol (RTP) packet.

The method of claim 7, wherein the voice processing step

performs the decoding operation only when the stored encoded voice data amounts to at least a predetermined amount.