KR101953580B1 - Data Transceiving Apparatus and Method in Telepresence System - Google Patents

Data Transceiving Apparatus and Method in Telepresence System

Info

Publication number
KR101953580B1
Authority
KR
South Korea
Prior art keywords
packets
packet
rtp
xor
recovery
Prior art date
Application number
KR1020150145170A
Other languages
Korean (ko)
Other versions
KR20160062679A (en)
Inventor
구기종
Original Assignee
한국전자통신연구원
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 한국전자통신연구원
Publication of KR20160062679A
Application granted
Publication of KR101953580B1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/14Systems for two-way working
    • H04N7/15Conference systems
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00Data switching networks
    • H04L12/02Details
    • H04L12/16Arrangements for providing special services to substations
    • H04L12/18Arrangements for providing special services to substations for broadcast or conference, e.g. multicast
    • H04L29/06517
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04MTELEPHONIC COMMUNICATION
    • H04M11/00Telephonic communication systems specially adapted for combination with other electrical systems
    • H04M11/06Simultaneous speech and data transmission, e.g. telegraphic transmission over the same conductors
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/14Systems for two-way working
    • H04N7/15Conference systems
    • H04N7/152Multipoint control units therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The present invention relates to an apparatus and method for transmitting and receiving data in a video conference system that can recover lost packets even at the packet loss rates that occur in a general public Internet network or a mobile environment, rather than in a dedicated video conferencing network, that recovers packet loss quickly enough to provide bidirectional services, and that recovers packet loss in a way that reduces energy consumption.

Description

TECHNICAL FIELD [0001] The present invention relates to a data transmitting/receiving apparatus and method in a video conferencing system.

The present invention relates to a video conference/telepresence system, and more particularly to a data transmission/reception apparatus and method that performs packet loss recovery so as to provide seamless high-quality video in a video conference system that uses the public Internet.

Traditional video conferencing systems (e.g., those from Cisco and Polycom) can use a dedicated network to keep packet loss rate, delay, and jitter below levels suitable for video conferencing, and their packet loss recovery algorithms are designed around such low loss rates. On a general public Internet network, however, there is no guarantee that packet loss rate, delay, and jitter stay below the levels required for video conferencing, so a packet recovery technique that remains robust at higher loss rates is needed.

A unidirectional streaming service such as broadcast/VoD can buffer for a predetermined time before playback, so it can recover a lost packet by requesting retransmission or by applying a loss recovery technique (e.g., RaptorQ) that exploits the buffering time. A telepresence system, however, provides a real-time bidirectional service with little buffering time for media processing, so lost packets must be recovered in real time when packet loss occurs in the transmission network.

The spread of smart-work environments and the larger communication bandwidth brought by fourth-generation mobile communication make it possible to join a video conference from a mobile terminal. In a wireless environment, however, interference, hand-over, and growing data usage cause network congestion, so the packet loss rate can be higher than in a wired environment; a packet loss recovery technique suited to mobile environments is therefore needed.

Video conferencing involves compressing, transmitting, recovering, and playing back large amounts of data, and these processes consume considerable power. When a video conference is held on a mobile terminal, the power supply is limited, so a packet loss recovery technique that takes energy consumption into account is needed.

SUMMARY OF THE INVENTION The present invention has been made to solve the above-mentioned problems, and a first object of the present invention is to provide a data transmission/reception apparatus and method in a video conference system that can restore lost packets even at the packet loss rates that occur in a general public Internet network or a mobile environment.

A second object of the present invention is to provide an apparatus and method for transmitting and receiving data in a video conference system capable of recovering packet loss in a short time in real time in order to provide a bidirectional service.

A third object of the present invention is to provide an apparatus and method for transmitting / receiving data in a video conference system capable of recovering packet loss so as to reduce energy consumption.

The technical problems of the present invention are not limited to the above-mentioned technical problems, and other technical problems which are not mentioned can be understood by those skilled in the art from the following description.

According to an aspect of the present invention, there is provided a method of transmitting and receiving data in a video conference system, the method comprising: sequentially transmitting, at a transmitting apparatus, a plurality of Real-time Transport Protocol (RTP) packets and a plurality of recovery packets; and sequentially receiving, at a receiving apparatus, the plurality of RTP packets and the plurality of recovery packets, wherein the receiving comprises checking whether any of the plurality of RTP packets has been lost and recovering the lost packet using the plurality of recovery packets, and wherein the plurality of recovery packets are generated by selectively XORing the plurality of RTP packets constituting one picture frame.

The plurality of recovery packets may include an XOR-ODD packet, obtained by XORing the odd-numbered packets among the plurality of RTP packets in order to recover an odd-numbered packet, and an XOR-EVEN packet, obtained by XORing the even-numbered packets among the plurality of RTP packets in order to recover an even-numbered packet.

The plurality of recovery packets may further include XOR-SUB packets, each obtained by dividing the plurality of RTP packets into predetermined sub-blocks and XORing the packets of one sub-block; an XOR-SUB packet may be used to recover an odd-numbered or even-numbered packet belonging to its sub-block.

The number of XOR-SUB packets, each XOR-computed over one sub-block, may be defined as the integer value obtained by dividing the number of the plurality of RTP packets by the size of the sub-block.

Each header of the plurality of recovery packets may include a plr_indicator field indicating that the packet is a recovery packet, a num_of_packet field indicating the number of RTP packets included in the same picture frame, and a mark_size field indicating the size of the last RTP packet of the same picture frame and the number of the XOR-SUB packet.

The plr_indicator field may include a Reserved field, a payload_type field for classifying the picture as QCIF or HD, and a redundant_type field for distinguishing between XOR-ODD, XOR-EVEN, and XOR-SUB packets.

According to another aspect of the present invention, there is provided a video conference system including: a transmitter including an encoder that sequentially transmits a plurality of Real-time Transport Protocol (RTP) packets and a plurality of recovery packets; and a decoder that sequentially receives the plurality of RTP packets and the plurality of recovery packets, wherein the decoder checks whether any of the plurality of RTP packets has been lost and restores the lost packet using the plurality of recovery packets, and wherein the plurality of recovery packets are generated by selectively XORing the plurality of RTP packets constituting one picture frame.

According to the data transmitting/receiving apparatus and method in the video conference system of the present invention, lost packets can be restored even at the packet loss rates that occur in a general public Internet network or a mobile environment, so there is no need to construct a dedicated network in order to build a video conference system.

In addition, since there is no separate packet retransmission process for recovering a lost packet, it is possible to recover packet loss in a short time in real time, thereby maintaining the service quality of the bidirectional video conference at a constant level.

Moreover, by avoiding the packet loss decoding process that consumes a large amount of energy, energy consumption can be reduced for the same call quality.

FIG. 1 is a diagram for explaining a general NAL unit header.
FIG. 2 is a diagram for explaining the general H.264/AVC layer structure.
FIG. 3 is a diagram for explaining the formats of Single NAL, STAP-A, and FU, which are representative RTP payload formats used in the present invention.
FIG. 4 is a diagram for explaining the FU indicator and FU header of FIG. 3.
FIG. 5 is a conceptual diagram of packet loss recovery in a general streaming service (RaptorQ).
FIG. 6 shows the structure of a general data transmitting/receiving apparatus for recovering video packet loss using RQ codes.
FIG. 7 is a diagram for explaining the RTP packet segmentation process in the data transmitting/receiving apparatus of the video conference system of the present invention.
FIG. 8 is a diagram for explaining the generation of XOR-ODD and XOR-EVEN packets in the present invention.
FIG. 9 is a diagram for explaining the generation of an XOR-SUB packet in the present invention.
FIG. 10 is a diagram for explaining the FU-A format of the RTP payload in the present invention.
FIG. 11 is a diagram for explaining the PLR format of the RTP payload in the present invention.
FIG. 12 is a diagram for explaining the packet loss check process according to the present invention.
FIG. 13 is a diagram for explaining XOR-SUB packet handling in the PLR header analysis according to the present invention.
FIG. 14 is a diagram for explaining the structure of the energy-efficient video packet loss recovery algorithm of the present invention.
FIG. 15 is a diagram for explaining an example of a method of implementing a video conference system according to an embodiment of the present invention.

Hereinafter, some embodiments of the present invention will be described in detail with reference to the exemplary drawings. In adding reference numerals to the elements of the drawings, the same elements are denoted by the same reference symbols wherever possible, even when they appear in different drawings. In the following description of the embodiments of the present invention, detailed descriptions of known functions and configurations incorporated herein are omitted when they could obscure the subject matter of the embodiments.

In describing the components of the embodiments of the present invention, terms such as first, second, A, B, (a), and (b) may be used. These terms are intended only to distinguish one component from another and do not limit the nature, sequence, or order of the components. Unless otherwise defined, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Terms such as those defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the relevant art and are not to be interpreted in an ideal or overly formal sense unless explicitly so defined in the present application.

<Video and transmission protocol>

<Profile of H.264 / AVC>

A profile (compression encoding function) standardizes which technical components of the algorithm (the processing sequence that executes the encoding process) are used in video encoding/decoding. H.264/AVC, the video compression standard, defines three profiles: the baseline profile, the main profile, and the extended profile.

The baseline profile is a profile that collects the basic technical elements of H.264 / AVC and the technical elements for error tolerance (the ability to decode and reconstruct image quality to some extent when an error occurs). It is defined for real-time and two-way communication application systems such as wireless mobile systems and video telephony or video conferencing systems.

The main profile is defined for application systems such as broadcasting and storage media. Because these target large amounts of content, encoding efficiency is the highest priority, so technical elements with good encoding efficiency are adopted even if they introduce delay and high complexity.

The extended profile is defined for streaming application systems, which encode the content in advance and prepare the bitstream. Because the aim is to transmit the bitstream sequentially and play it back in real time (streaming), real-time execution of the encoding process is not required, but high encoding efficiency is.

<Real-time Transport Protocol (RTP)>

Real-time Transport Protocol (RTP) is a protocol for packetizing real-time data such as video and audio and transmitting it over an Internet Protocol (IP) network; it is standardized by the Internet Engineering Task Force (IETF). Because real-time delivery is emphasized, RTP packets are generally carried over UDP (User Datagram Protocol), which provides no flow control or retransmission control.

The RTP packet consists of an RTP header and a payload, and data sent from the application layer is encapsulated in the RTP payload. The RTP header has a 12-byte fixed header and may have an extension header of variable length depending on the type of data to be transmitted (payload type). Of the RTP fixed header fields, the main ones required for the packet loss recovery proposed by the present invention are Payload Type (7 bits), Sequence Number (16 bits), and Timestamp (32 bits).

The payload type indicates the type of media data carried in the packet; in the present invention, for example, it is set to 121 for an HD image and 120 for a QCIF image. The Sequence Number increases by 1 for each RTP packet transmitted and is used for packet loss detection and reordering at the receiver. The Timestamp represents the sampling instant of the first byte of the RTP data packet; its clock frequency depends on the payload data type and is specified in the profile or payload format document.
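
As a minimal illustrative sketch (Python; the function name is chosen here, and the byte layout follows the RFC 3550 fixed header), these three fields can be read as follows:

    import struct

    def parse_rtp_fixed_header(packet: bytes) -> dict:
        # Parse the 12-byte RTP fixed header and return the fields the
        # packet loss recovery scheme relies on.
        if len(packet) < 12:
            raise ValueError("packet shorter than the RTP fixed header")
        b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
        return {
            "version": b0 >> 6,
            "payload_type": b1 & 0x7F,   # 121 = HD, 120 = QCIF in this description
            "marker": (b1 >> 7) & 0x1,
            "sequence_number": seq,      # +1 for every transmitted RTP packet
            "timestamp": ts,             # same value for all packets of one picture
            "ssrc": ssrc,
        }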

<Packetization for H.264 / AVC>

In H.264/AVC, as shown in FIG. 2, a video coding layer (VCL), which handles the moving-picture encoding process itself, and a network abstraction layer (NAL), which maps the encoded data onto the transport system, are defined, so that the VCL and the NAL are separated from each other.

The image data delivered to the transport system is organized in NAL units, the basic unit of the NAL. A NAL unit consists of two parts: a NAL header and an RBSP (Raw Byte Sequence Payload) generated from the VCL. Each NAL unit is encapsulated into an RTP packet; the NAL unit header (see FIG. 1) and the NAL unit payload are inserted into the RTP payload.

As shown in FIG. 1, the NAL header includes a forbidden bit F (forbidden_zero_bit), flag information NRI (nal_ref_idc) indicating whether the slice/picture serves as a reference picture, and an identifier Type (nal_unit_type) indicating the payload type of the NAL unit. To make the RBSP length a multiple of 8 bits (1 byte), padding is added at the end of the RBSP as a '1' followed by '0's (the pattern '1000...'). Here, F (forbidden_zero_bit) is 1 bit: 0 indicates no error or syntax error, and 1 indicates an error or syntax error. NRI (nal_ref_idc) is 2 bits: 00 indicates the unit is not used as a reference image, and 01, 10, and 11 indicate it is used as a reference image. Type (nal_unit_type) is 5 bits and describes the payload type of the NAL unit.
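
The bit layout of this one-byte header can be illustrated with a short sketch (Python; the function name is illustrative):

    def parse_nal_header(byte0: int) -> dict:
        # Split the NAL unit header byte of FIG. 1 into its F, NRI and Type fields.
        return {
            "F": (byte0 >> 7) & 0x01,    # forbidden_zero_bit: 1 signals an error/syntax error
            "NRI": (byte0 >> 5) & 0x03,  # nal_ref_idc: 00 = not used as a reference picture
            "Type": byte0 & 0x1F,        # nal_unit_type: payload type of the NAL unit
        }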

There are three ways a NAL unit is inserted into RTP packets. Single NAL Unit Packet: one NAL unit is inserted into one packet. Aggregation Packet: several NAL units are inserted into one packet. Fragmentation Unit: one NAL unit is split across several packets. [Table 1] shows the type of NAL unit insertion and the payload structure for each RTP packet type value.

[Table 1]


FIG. 3 illustrates the formats of Single NAL, STAP-A, and FU, which are representative RTP payload formats used in the present invention.

In FIG. 3, the Single NAL Unit Packet RTP payload format (1) consists of a NAL header made up of the F, NRI, and Type fields, followed by the single NAL unit data in bytes. The Single-Time Aggregation Packet (STAP-A) format (2) consists of an aggregation packet header (Type STAP-A, STAP-B, MTAP16, or MTAP24), followed by NALU1 Size, NALU1 header, NALU1 Data, and so on. When one NAL unit is divided into several RTP packets, it is encapsulated as a Fragmentation Unit; FIG. 3 (3) shows the Fragmentation Unit format (FU indicator, FU header, FU payload). The FU indicator and FU header of the Fragmentation Unit format have the structure shown in FIG. 4. As shown in FIG. 4 (a), the FU indicator has the same layout as the NAL unit header, and as shown in FIG. 4 (b), the FU header fields are: S (1 bit), the start bit; E (1 bit), the end bit; R (1 bit), reserved; and Type (5 bits).

<Conventional Packet loss recovery technology>

<Overview of RaptorQ>

FIG. 5 is a conceptual diagram of packet loss recovery in a general streaming service (RaptorQ).

The RaptorQ encoder, shown conceptually in FIG. 5, is an Internet Engineering Task Force (IETF) RFC 6330 standard technology and a forward error correction (FEC) technique. The RaptorQ code is a kind of rateless code, a channel coding technology: when a stream or data file is transmitted, the decoder achieves a decoding failure probability on the order of 10^-7 once it has received just two symbols of overhead beyond the number of source symbols.

The general behavior of a system using the RaptorQ codec is as follows. The transmission system divides the content to be transmitted into source blocks, each consisting of K source symbols, and performs RaptorQ encoding/decoding per source block (1 ≤ K ≤ 56,403). One symbol is typically 1 to 1,024 bytes. The RaptorQ encoder generates L intermediate symbols from the K source symbols through a generator matrix composed of Low-Density Parity Check (LDPC) codes, High-Density Parity Check (HDPC) codes, and LT (Luby Transform) codes, and transmits N encoded symbols derived from the L intermediate symbols to the destination receiving terminal.

After receiving the RaptorQ-encoded symbols, the receiving terminal performs RaptorQ decoding to recover the K source symbols through the same generator matrix as the encoder. RaptorQ can restore the original source symbols with a failure probability on the order of 10^-7 after receiving two symbols beyond the K source symbols. Obtaining the inverse of the generator matrix for RaptorQ decoding has a complexity on the order of O(n^3); the larger K is, the larger the dimension of the generator matrix, and thus the longer the RaptorQ decoding time spent obtaining the inverse matrix.

<Packet loss recovery model of video conferencing system using RaptorQ code>

FIG. 6 shows the structure of a general data transmitting/receiving apparatus that recovers video packet loss using RQ codes. Referring to FIG. 6, the operation of a video packet loss recovery system using a conventional RaptorQ (RQ) code is described.

The video encoder receives video from the camera and generates I-, P-, and B-type compressed video frames. A symbolizer generates a source block consisting of K source symbols S1, S2, S3, ..., SK from the video frames (I, P, B). The RQ encoder is a systematic encoder that can transmit the K source symbols unmodified and generate an unlimited number of repair symbols R1, R2, ... from them; in practice the number of repair symbols is limited according to the target data rate or channel state. The RQ-encoded source symbols and repair symbols are packetized in the network layer and transmitted to the receiving terminal.

Upon receiving K symbols, regardless of whether they are source or repair symbols, the RQ decoder attempts to recover the original source symbols through the inactivation decoding Gaussian elimination (IDGE) process specified in the RFC 6330 standard. If recovery fails, additional repair symbols are received and the decoding process is repeated until the source symbols are restored. The inverse symbolizer assembles video frames from the reconstructed source symbols and delivers them to the video decoder, which decompresses the video and reproduces it through the playback apparatus.

However, this method has the following problem. The RQ decoder starts decoding only after receiving as many source or repair symbols as there are original source symbols (K). It can recover the original source symbols if the inverse of the generator matrix can be obtained; otherwise it receives further repair symbols and repeats the inversion attempt. Finding the inverse of the generator matrix consumes most of the RQ decoding time. The RQ decoder shows a 99% decoding success probability when K symbols are received, 99.99% with K + 1, and 99.9999% with K + 2.

<Video packet loss recovery algorithm (X-EO algorithm)>

A video packet loss recovery method is described for a telepresence system that provides full-HD (1080p) video using the H.264/AVC codec. The video conferencing system uses the H.264/AVC baseline profile, as a real-time bidirectional communication application, and transmits 30 frames per second from the transmitting terminal to the receiving terminal to provide full-HD video. In the baseline profile only I and P pictures are used as reference pictures, and each I/P picture is divided into pieces no larger than the maximum transmission unit (MTU), encapsulated in RTP packets, and transmitted.

FIG. 7 is a diagram for explaining the RTP packet segmentation process in the data transmitting/receiving apparatus (or transmitting/receiving terminal) of the video conference system of the present invention. In FIG. 7, a GOP (Group of Pictures) spans 30 pictures per second and is composed of I and P pictures. I and P pictures are formed into NAL units by the NAL layer, then divided into MTU-sized pieces and encapsulated in RTP packets. I-picture data is typically divided into 40 to 50 RTP packets and P-picture data into 10 to 20 RTP packets before transmission; the number of RTP packets varies with the amount of motion in the image, and the figures given here are average values shown as an example.
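
The segmentation step itself can be sketched as follows (Python; the 1,400-byte payload limit is only an assumed MTU-derived value, and the function name is illustrative):

    def fragment_nal_unit(nal_unit: bytes, max_payload: int = 1400) -> list:
        # Split one I/P-picture NAL unit into MTU-sized pieces; each piece becomes
        # the payload of one FU-A RTP packet (RTP S1, S2, ... in FIGS. 8 and 9).
        return [nal_unit[i:i + max_payload]
                for i in range(0, len(nal_unit), max_payload)]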

<Packet loss recovery (PLR) encoding algorithm>

In the present invention, a method of recovering a lost packet on a picture-by-picture basis is described. When one I or P picture is formed into a NAL unit and delivered to a lower layer (in the present invention, the lower layer is the PLR encoding layer), the PLR encoding layer determines the transmission format by comparing the size of the NAL unit with the MTU size. In the example of the present invention, since the size of the NAL unit is larger than the MTU, the NAL unit is divided into RTP packets in the Fragmentation Unit format and transmitted.

When an I or P picture is divided into n RTP packets in the Fragmentation Unit format, the first RTP packet is the start packet and is numbered odd within the picture. The payloads of the odd-numbered packets, including the start packet, are XORed (⊕) to generate an XOR-ODD packet. Likewise, the payloads of the even-numbered packets are XORed (⊕) to generate an XOR-EVEN packet.

<Generation of XOR-ODD Packet>

For example, as shown in FIG. 8, one picture frame can be divided into eight RTP packets for transmission, and the RTP packets are numbered RTP S1 to RTP S8 for the odd/even division. This numbering is independent of the RTP sequence number. The XOR-ODD packet is generated by XORing only the payload data of the odd-numbered RTP packets S1, S3, S5, and S7 in the picture; that is, XOR-ODD = RTP(S1 ⊕ S3 ⊕ S5 ⊕ S7). An XOR-ODD packet can be used to recover an odd-numbered packet.

<Generation of XOR-EVEN Packet>

Likewise, as shown in FIG. 8, the XOR-EVEN packet is generated by XORing only the payload data of the even-numbered RTP packets S2, S4, S6, and S8 in the picture; that is, XOR-EVEN = RTP(S2 ⊕ S4 ⊕ S6 ⊕ S8). An XOR-EVEN packet can be used to recover an even-numbered packet.
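
Generation of the two recovery payloads can be sketched as follows (Python; the helper names are illustrative, and zero-padding payloads of unequal length is an assumption, since the description only notes that the last packet's size is carried in the mark_size field):

    def xor_payloads(payloads):
        # Byte-wise XOR of a list of RTP payloads, treating shorter payloads
        # as zero-padded to the length of the longest one.
        out = bytearray(max(len(p) for p in payloads))
        for p in payloads:
            for i, b in enumerate(p):
                out[i] ^= b
        return bytes(out)

    def make_xor_odd_even(rtp_payloads):
        # rtp_payloads[0] is RTP S1, rtp_payloads[1] is RTP S2, and so on.
        odd = [p for i, p in enumerate(rtp_payloads) if i % 2 == 0]   # S1, S3, S5, S7
        even = [p for i, p in enumerate(rtp_payloads) if i % 2 == 1]  # S2, S4, S6, S8
        return xor_payloads(odd), xor_payloads(even)                  # XOR-ODD, XOR-EVEN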

<Generation of XOR-SUB Packet>

An XOR-SUB packet is generated for each sub-block, whose size (the number of RTP packets per sub-block) is chosen arbitrarily within one picture. The number of XOR-SUB packets is the integer obtained by dividing the number of RTP packets by the sub-block size; that is, number of XOR-SUB packets = integer(number of RTP packets in one picture / sub-block size). An XOR-SUB packet can be used to recover an odd-numbered or even-numbered packet belonging to its sub-block.

Referring to FIGS. 8 and 9, when the number of RTP packets in the picture is 8 and the sub-block size is arbitrarily set to 4, the first sub-block corresponds to RTP S1 to S4 and the second sub-block to RTP S5 to S8. The RTP packets of the first sub-block are XORed to generate the XOR-SUB1 packet; that is, XOR-SUB1 = RTP(S1 ⊕ S2 ⊕ S3 ⊕ S4). The RTP packets of the second sub-block are XORed to generate the XOR-SUB2 packet; that is, XOR-SUB2 = RTP(S5 ⊕ S6 ⊕ S7 ⊕ S8).
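
Reusing xor_payloads from the sketch above, the XOR-SUB generation can be expressed as:

    def make_xor_sub(rtp_payloads, sub_block_size: int = 4) -> list:
        # One XOR-SUB payload per sub-block; with 8 packets and a sub-block size of 4
        # this yields XOR-SUB1 over S1..S4 and XOR-SUB2 over S5..S8, as in FIGS. 8 and 9.
        return [xor_payloads(rtp_payloads[i:i + sub_block_size])
                for i in range(0, len(rtp_payloads), sub_block_size)]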

As for the transmission order, the RTP packets (RTP S1 to S8) are transmitted in sequence-number order, then the XOR-SUB packets are transmitted in sub-block order, and finally the XOR-ODD and XOR-EVEN packets are transmitted.

<H.264 / AVC RTP packet format for packet loss recovery>

<FU-A format>

FIG. 10 is a diagram for explaining the FU-A format of the RTP payload in the present invention. The RTP packets S1 to S8 used in FIGS. 8 and 9 are generated by dividing the data of a single NAL unit. At the head of the RTP payload are an FU indicator and an FU header, which have the same syntax as the NAL unit header, as shown in FIG. 10. In this example the FU-A mode, in which a decoding order cannot be separately specified, is used. The RTP header uses the sampling time of the NAL unit as the timestamp, and the same timestamp is used within one picture.

In FIG. 10, the FU indicator is composed of F, NRI, and Type, and the NRI is set to binary "10" when the reference picture is a P picture and "11" when it is an I picture. Type is set to "11100" (28) for the FU-A format.

The FU header consists of S, E, R, and Type. S is set only in the first RTP packet of the picture and E only in the last; neither is set in the remaining RTP packets of the picture. Type indicates the picture type and is set to "00001" (1) for a P picture and "00101" (5) for an I picture.
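
These field values can be composed as in the following sketch (Python; the function name is illustrative):

    def build_fu_a_headers(is_i_picture: bool, first: bool, last: bool) -> bytes:
        # FU indicator: F = 0, NRI = 11b for an I picture / 10b for a P picture, Type = 28 (FU-A).
        fu_indicator = ((0b11 if is_i_picture else 0b10) << 5) | 28
        # FU header: S only on the first fragment of the picture, E only on the last,
        # R = 0, Type = 5 for an I picture and 1 for a P picture.
        fu_header = (int(first) << 7) | (int(last) << 6) | (5 if is_i_picture else 1)
        return bytes([fu_indicator, fu_header])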

<STAP-A format>

An RTP packet in the STAP-A format is generated once before the I picture, and the same packet is repeated twice so that a lost STAP-A packet can be recovered; that is, three identical STAP-A RTP packets are generated. Since a STAP-A packet is usually much smaller than an intermediate FU-A RTP packet, the bandwidth occupied by the repetition is insignificant. After receiving a STAP-A packet, the receiver discards STAP-A packets with a duplicated sequence number.

<Packet loss recovery (PLR) packet format>

FIG. 11 is a diagram for explaining the PLR (Packet Loss Recovery) format of the RTP payload in the present invention. FIG. 11 shows the RTP payload format (Video PLR Header, Video PLR Payload) of the PLR packets used for packet loss recovery, that is, the XOR-ODD, XOR-EVEN, and XOR-SUB packets. The sequence number in the RTP header continues to increase from the sequence number of the preceding video RTP packet, and the timestamp is assigned the same value within the same picture.

The PLR_indicator in the Video PLR Header is a field indicating that the packet is a recovery packet, that is, a PLR packet. The PLR_indicator is composed of 4 reserved bits, a payload_type (2-bit) field, and a redundant_type (2-bit) field. payload_type is the binary value "00" for a QCIF picture and "01" for an HD picture. The redundant_type field identifies the kind of PLR packet: an XOR-ODD packet is marked "00", an XOR-EVEN packet "01", and an XOR-SUB packet "10".

The num_of_packet field (1 byte) indicates the number of RTP packets included in the same picture. In this case, XOR-ODD, XOR-EVEN, and XOR-SUB packets for packet loss recovery are excluded.

The upper 11 bits of the mark_size field indicate the size of the last RTP packet of the same picture, that is, the size of the packet in which the E (End) field of the FU header is set. The lower 5 bits of the mark_size field indicate the number of the XOR-SUB packet within the same I or P picture. For example, in FIG. 9, mark_size[4:0] of the XOR-SUB1 packet is set to 1 and mark_size[4:0] of the XOR-SUB2 packet is set to 2.
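
Putting these fields together, a Video PLR header can be assembled as in the following sketch (Python; packing the fields into four big-endian bytes is an assumed layout, since the description gives only the field widths):

    import struct

    def build_plr_header(redundant_type: int, num_of_packet: int,
                         last_packet_size: int, sub_xor_number: int = 0,
                         hd: bool = True) -> bytes:
        # plr_indicator: upper 4 bits fixed to 0xF, then payload_type (01b = HD,
        # 00b = QCIF), then redundant_type (0 = XOR-ODD, 1 = XOR-EVEN, 2 = XOR-SUB).
        plr_indicator = (0xF << 4) | ((0b01 if hd else 0b00) << 2) | (redundant_type & 0x3)
        # mark_size: upper 11 bits = size of the picture's last RTP packet,
        # lower 5 bits = XOR-SUB packet number within the picture (0 for ODD/EVEN).
        mark_size = ((last_packet_size & 0x7FF) << 5) | (sub_xor_number & 0x1F)
        return struct.pack("!BBH", plr_indicator, num_of_packet, mark_size)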

<Packet loss recovery (PLR) decoding algorithm>

<Packet loss check>

When the receiving terminal receives an HD video packet whose RTP header payload type is 121, it calculates the difference between the sequence number of the currently received RTP packet and that of the immediately preceding RTP packet to determine whether a packet has been lost. The transmitting terminal assigns sequential RTP sequence numbers in the order in which the RTP packets are generated, so if there is no packet loss the difference between the sequence numbers of the current and immediately preceding RTP packets is 1; if it is 2 or more, a packet loss is determined to have occurred.

The packet loss checking process will be described with reference to FIG. 12 as an example.

FIG. 12 is a diagram for explaining the packet loss check process according to the present invention.

First, the receiving terminal computes the packet loss count, which indicates the number of lost packets (S10). If the packet loss count is not 0 (S20), the sequence number of a lost packet is stored (S30) and the count of lost packets in the picture is incremented by one (S40). The sequence number of the lost packet is determined by subtracting the packet loss count from the sequence number of the currently received RTP packet (S50). If the lost sequence number corresponds to an odd-numbered position, odd_loss_count is incremented by one (S60); if it corresponds to an even-numbered position, even_loss_count is incremented by one (S70). The packet loss count is then decreased by one; if it is still not 0, the process repeats, and if it is 0, the process ends (S80).
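
A condensed sketch of this check (Python; start_seq is the sequence number of the picture's start packet, and wrap-around of the 16-bit sequence number is ignored for brevity):

    def check_packet_loss(start_seq: int, prev_seq: int, curr_seq: int):
        # The gap between consecutive sequence numbers gives the number of lost
        # packets (steps S10-S80 of FIG. 12), classified by odd/even position.
        lost_seqs, odd_loss_count, even_loss_count = [], 0, 0
        for seq in range(prev_seq + 1, curr_seq):
            lost_seqs.append(seq)
            position = seq - start_seq + 1       # S1, S2, ... position in the picture
            if position % 2 == 1:
                odd_loss_count += 1
            else:
                even_loss_count += 1
        return lost_seqs, odd_loss_count, even_loss_count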

<PLR Header Analysis - PLR Header Parsing: XOR-SUB, XOR-ODD, XOR-EVEN Packet>

FIG. 13 is a diagram for explaining XOR-SUB packet handling in the PLR header analysis according to the present invention.

First, when the receiving terminal receives a video RTP packet, it determines from the first byte of the RTP payload whether the packet is an XOR-SUB, XOR-ODD, or XOR-EVEN packet used for packet loss recovery. According to FIG. 11, if the upper 4 bits of the first byte of the RTP payload are the hexadecimal value 0xF, the packet is determined to be a PLR packet. The next two bits are the payload_type field, whose binary value "01" indicates an HD picture; since the present invention describes packet loss recovery for full-HD video, the payload_type of the RTP packets described below is regarded as HD. Next, the 2-bit redundant_type field is checked: binary "00" indicates an XOR-ODD packet, "01" an XOR-EVEN packet, and "10" an XOR-SUB packet.

For example, when an XOR-SUB packet is received, the sub-block number (sub_xor_count) of the currently received XOR-SUB packet is determined from the mark_size[4:0] bits in the PLR header of FIG. 11 (S110). The sub-block number of the XOR-SUB packets increases sequentially from 1 after the End packet (the packet whose FU header E field is set to 1). The sub-block size (SUB_BLOCK_SIZE) is a value known in advance to both the PLR encoder of the transmitting terminal and the decoder of the receiving terminal. The sequence number of the start packet is a value obtained before the PLR header is analyzed.

After receiving the XOR-SUB packet, the packet loss checking process of FIG. 12 is performed (S10 to S80).

The subsequent process is illustrated by way of example in FIG.

After the receiving terminal performs the packet loss checking procedure (S10 to S80), if the number of lost packets in the picture is 1 or more, it is determined that a packet loss has occurred and the next step is performed (S120). It is then determined which sub-block of the picture the currently received XOR-SUB packet covers (S130, S140); that is, sub_loop_start and sub_loop_end are determined as follows.

sub_loop_start = sequence number of the start packet + {(previous packet sub_xor_count-1) * SUB_BLOCK_SIZE}

sub_loop_end = sequence number of the start packet + {(current packet sub_xor_count-1) * SUB_BLOCK_SIZE}

Thereafter, referring to the result of the packet loss checking process (S10 to S80), it is determined in step S150 how many packets were lost in the sub-block bounded by sub_loop_start and sub_loop_end: for each lost sequence number stored in the process of FIG. 12 that falls within the sub-block, sub_xor_loss_count is incremented by one. If sub_xor_loss_count is 1 (S160), one packet loss has occurred in the sub-block, and the lost packet is recovered using the currently received XOR-SUB packet (S180). If sub_xor_loss_count is 2 or more, recovery with the XOR-SUB packet is impossible and an error is reported (S170).

<Lost Packet Recovery Using XOR-SUB Packet: plr_video_dec_sub_xor_recover>

In step S180 of FIG. 13, the method of recovering a lost packet using an XOR-SUB packet is described with reference to FIG. 9 as an example. The XOR-SUB1 packet is generated by XORing the payload data of the RTP packets of the first sub-block; that is, XOR-SUB1 = RTP(S1 ⊕ S2 ⊕ S3 ⊕ S4).

If one packet of the sub-block is lost and the lost packet is RTP S1, the receiving terminal will have received RTP S2, S3, S4 and the XOR-SUB1 packet. By XORing the XOR-SUB1 packet with the remaining received packets, the lost RTP S1 packet can be restored; that is, RTP(S1) = RTP(S2 ⊕ S3 ⊕ S4) ⊕ XOR-SUB1.
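
The same recovery step, again reusing xor_payloads from the earlier sketch (the container layout is illustrative):

    def recover_with_xor_sub(received: dict, sub_block_positions: list,
                             xor_sub_payload: bytes):
        # received maps a packet's position in the picture (1 for S1, 2 for S2, ...)
        # to its payload; sub_block_positions lists the positions covered by this
        # XOR-SUB packet, e.g. [1, 2, 3, 4] for XOR-SUB1 in FIG. 9.
        missing = [p for p in sub_block_positions if p not in received]
        if len(missing) != 1:
            raise ValueError("an XOR-SUB packet repairs exactly one loss per sub-block")
        present = [received[p] for p in sub_block_positions if p in received]
        return missing[0], xor_payloads(present + [xor_sub_payload])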

Next, upon receiving an XOR-ODD packet, the value of odd_loss_count is determined through the packet loss check process of FIG. 12, and if odd_loss_count is 1, the lost packet is recovered. The packet loss recovery process using the XOR-ODD packet is described with reference to FIG. 8. In the transmitting terminal, the XOR-ODD packet was generated by XORing RTP S1, S3, S5, and S7; that is, XOR-ODD = RTP(S1 ⊕ S3 ⊕ S5 ⊕ S7). If the RTP S1 packet is lost and the remaining RTP S3, S5, S7 and the XOR-ODD packet are received, the RTP S1 packet can be restored as RTP(S1) = RTP(S3 ⊕ S5 ⊕ S7) ⊕ XOR-ODD.

Similarly, upon receiving an XOR-EVEN packet, the value of even_loss_count is determined through the packet loss check process of FIG. 12, and if even_loss_count is 1, the lost packet is recovered. The packet loss recovery process using the XOR-EVEN packet is described with reference to FIG. 8. In the transmitting terminal, the XOR-EVEN packet was generated by XORing RTP S2, S4, S6, and S8; that is, XOR-EVEN = RTP(S2 ⊕ S4 ⊕ S6 ⊕ S8). If the receiving terminal has lost the RTP S2 packet and has received the remaining RTP S4, S6, S8 and the XOR-EVEN packet, the RTP S2 packet can be restored as RTP(S2) = RTP(S4 ⊕ S6 ⊕ S8) ⊕ XOR-EVEN.

<Energy efficient video packet loss recovery algorithm (X-EO + RQ algorithm)>

FIG. 14 is a diagram for explaining the structure of the energy-efficient video packet loss recovery algorithm of the present invention. That is, the data transmitting/receiving apparatus (or transmitting/receiving terminal) of the video conference system that performs the energy-efficient video packet loss recovery algorithm (X-EO + RQ) according to an embodiment of the present invention has the structure shown in FIG. 14. Both the transmitting and receiving apparatuses are provided in one video conference system and are used for data transmission and reception, respectively.

Referring to FIG. 14, the data transmitting apparatus (or transmitting terminal) of the video conference system of the present invention includes a camera 10, a video encoder 110, a symbolizer 111, an X-EO encoder 112, an RQ encoder 113, and an RTP module 114. The data receiving apparatus (or receiving terminal) of the video conference system of the present invention includes an RTP module 120, an X-EO decoder 121, an RQ decoder 122, an inverse symbolizer 123, a video decoder 124, and a display device 20 such as a TV or monitor.

Here, the transmitting/receiving device (or transmitting terminal, receiving terminal) may be a wired terminal such as a desktop PC or a dedicated communication terminal, or a mobile terminal such as a smartphone or a wearable device capable of voice/video communication. The transmitting and receiving devices interoperate through a wired/wireless network such as a general public Internet network (wired/wireless Internet, WiFi, WiBro, etc.) or a mobile communication network such as WCDMA or LTE.

In the present invention, the X-EO encoder 112 serving as a PLR encoder and the X-EO decoder 121 serving as a PLR decoder are added in order to recover packet loss using XOR-ODD and XOR-EVEN packets, and the XOR-SUB packet can additionally be used for packet loss recovery as described above.

For example, in the data transmitting apparatus (or transmitting terminal), the video encoder 110 receives an image from the camera 10 and generates I-, P-, and B-type compressed video frames. The symbolizer 111 generates a source block composed of K source symbols (or packets) S1, S2, S3, ..., SK from the video frames (I, P, B).

The X-EO encoder 112 generates an XOR-ODD (XO) recovery symbol by XORing the odd-numbered symbols among the K source symbols and an XOR-EVEN (XE) recovery symbol by XORing the even-numbered symbols, and these are transmitted in addition to the source symbols S1, S2, S3, ..., SK. The X-EO encoder 112 is a systematic encoder: the K source symbols S1, S2, S3, ..., SK are carried as RTP packets over UDP (User Datagram Protocol)/IP (Internet Protocol), and the XO and XE recovery symbols may additionally be transmitted through the RTP module 114.

The RQ encoder 113, also a systematic encoder, can generate an unlimited number of repair symbols R1, R2, ... from the K source symbols S1, S2, S3, ..., SK. However, if the state of the communication channel allows the original source symbols to be recovered with the X-EO algorithm (code) of the present invention, the generation and number of repair symbols can be limited accordingly. The repair symbols produced by RQ encoding in the RQ encoder 113 may additionally be transmitted through the RTP module 114 after the XO and XE symbols.

In the data receiving apparatus (or receiving terminal), the X-EO decoder 121 checks whether the source symbols can be recovered after receiving the XO and XE packets through the RTP module 120. If the X-EO decoder 121 can recover the source symbols with the X-EO algorithm (see FIG. 13), the RQ decoding process of the RQ decoder 122 is skipped: the source symbols restored by the X-EO decoder 121 are sent to the inverse symbolizer 123 to assemble video frames, which are decompressed by the video decoder 124 to reproduce the image. Only when source symbol recovery with the X-EO algorithm fails is the RQ decoding process performed, in which additional repair symbols R1, R2, ... are received and the original source symbols are recovered through repeated RQ decoding.

Since the X-EO algorithm does not require the inversion of a generator matrix needed for RQ decoding, it restores source symbols quickly and saves energy. With the structure of the present invention, RQ decoding is not performed as long as the packet loss stays within the range the X-EO algorithm can recover, so source symbols can be recovered in a short time and with low energy consumption.
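
The receiver-side decision flow can be sketched as follows (Python; xor_payloads is reused from the earlier sketch, rq_decode is an assumed stand-in for an RFC 6330 decoder, and limiting X-EO repair to one loss per odd/even group is a simplification that leaves out the additional XOR-SUB repair path):

    def decode_picture(received: dict, num_packets: int,
                       xor_odd: bytes, xor_even: bytes, rq_decode):
        # received maps positions 1..num_packets to payloads.
        lost = [p for p in range(1, num_packets + 1) if p not in received]
        if not lost:
            return received                                 # nothing to repair
        odd_lost = [p for p in lost if p % 2 == 1]
        even_lost = [p for p in lost if p % 2 == 0]
        if len(odd_lost) <= 1 and len(even_lost) <= 1:      # within X-EO repair capability
            if odd_lost:
                present = [received[p] for p in range(1, num_packets + 1, 2) if p in received]
                received[odd_lost[0]] = xor_payloads(present + [xor_odd])
            if even_lost:
                present = [received[p] for p in range(2, num_packets + 1, 2) if p in received]
                received[even_lost[0]] = xor_payloads(present + [xor_even])
            return received
        return rq_decode(received)                          # fall back to RQ decoding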

FIG. 15 is a diagram for explaining an example of a method of implementing the video conference system 100 according to an embodiment of the present invention. The data transmission/reception device (or transmitting terminal, receiving terminal) of the video conference system 100 according to an embodiment of the present invention may be implemented in hardware, software, or a combination of the two. For example, the data transmission/reception device (or transmitting terminal, receiving terminal) of the video conference system 100 may be implemented in a computing system 1000 as shown in FIG. 15.

The computing system 1000 includes at least one processor 1100, a memory 1300, a user interface input device 1400, a user interface output device 1500, a storage 1600, and an interface 1700. The processor 1100 may be a central processing unit (CPU) or a semiconductor device that processes instructions stored in the memory 1300 and/or the storage 1600. The memory 1300 and the storage 1600 may include various types of volatile or non-volatile storage media; for example, the memory 1300 may include a ROM (Read-Only Memory) 1310 and a RAM (Random Access Memory).

Thus, the steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by the processor 1100, or in a combination of the two. The software module may reside in a storage medium (i.e., the memory 1300 and/or the storage 1600) such as RAM, flash memory, ROM, EPROM, EEPROM, a register, a hard disk, a removable disk, and the like. An exemplary storage medium is coupled to the processor 1100, which can read information from, and write information to, the storage medium; alternatively, the storage medium may be integral to the processor 1100. The processor and the storage medium may reside within an application-specific integrated circuit (ASIC), which may reside within the user terminal; alternatively, the processor and the storage medium may reside as discrete components in the user terminal.

As described above, the data transmitting/receiving apparatus in the video conference system 100 according to the present invention can restore lost packets even at the packet loss rates that occur in a general public Internet network or a mobile environment, so there is no need to construct a dedicated network to build a video conference system. In addition, since there is no separate packet retransmission process for recovering a lost packet, packet loss can be recovered in real time within a short time, keeping the service quality of the bidirectional video conference at a constant level. Moreover, by avoiding the packet loss decoding process that consumes a large amount of energy, energy consumption can be reduced for the same call quality.

The foregoing description is merely illustrative of the technical idea of the present invention, and various changes and modifications may be made by those skilled in the art without departing from the essential characteristics of the present invention.

Therefore, the embodiments disclosed in the present invention are intended to illustrate rather than limit the scope of the present invention, and the scope of the technical idea of the present invention is not limited by these embodiments. The scope of protection of the present invention should be construed according to the following claims, and all technical ideas within the scope of equivalents should be construed as falling within the scope of the present invention.

10: camera
110: video encoder
111: symbolizer
112: X-EO encoder
113: RQ encoder
114: RTP module
120: RTP module
121: X-EO decoder
122: RQ decoder
123: inverse symbolizer
124: video decoder
20: display device

Claims (7)

A method for transmitting and receiving data in a video conference system,
sequentially transmitting a plurality of Real-time Transport Protocol (RTP) packets and a plurality of recovery packets in a transmitting apparatus; and sequentially receiving the plurality of RTP packets and the plurality of recovery packets at a receiving apparatus,
Wherein the receiving comprises: checking whether there is a lost packet among the plurality of RTP packets; And recovering the lost packet using the plurality of recovery packets,
Wherein the plurality of recovery packets are generated by selectively XORing the plurality of RTP packets constituting one picture frame,
Wherein the plurality of recovery packets further include XOR-SUB packets obtained by dividing the plurality of RTP packets into predetermined subblocks and XORing each subblock.
The method according to claim 1,
Wherein the plurality of recovery packets include an XOR-ODD packet obtained by XORing odd-numbered packets among the plurality of RTP packets in order to recover an odd-numbered packet among the plurality of RTP packets, and an XOR-EVEN packet obtained by XORing even-numbered packets among the plurality of RTP packets in order to recover an even-numbered packet among the plurality of RTP packets.
The method according to claim 1,
Wherein the XOR-SUB packets are used to recover odd-numbered or even-numbered packets belonging to the sub-block.
The method of claim 3,
Wherein the number of the XOR-SUB packets, each XOR-computed over one of the subblocks, is an integer value obtained by dividing the number of the plurality of RTP packets by the size of the subblock.
The method according to claim 1,
Wherein each header of the plurality of recovery packets includes a plr_indicator field indicating that the packet is a recovery packet, a num_of_packet field indicating the number of RTP packets included in the same picture frame, and a mark_size field indicating the size of the last RTP packet of the same picture frame and the number of the XOR-SUB packet.
The method of claim 5,
Wherein the plr_indicator field includes a Reserved field, a payload_type field for classifying the picture as QCIF or HD, and a redundant_type field for distinguishing between XOR-ODD, XOR-EVEN, and XOR-SUB packets.
In a video conference system,
A transmitter including an encoder for sequentially transmitting a plurality of Real-time Transport Protocol (RTP) packets and a plurality of recovery packets; and a decoder for sequentially receiving the plurality of RTP packets and the plurality of recovery packets,
Wherein the decoder checks whether there is a lost packet among the plurality of RTP packets and restores the lost packet using the plurality of recovery packets, wherein the plurality of recovery packets are generated by selectively XORing the plurality of RTP packets constituting one picture frame,
Wherein the plurality of recovery packets further include XOR-SUB packets obtained by dividing the plurality of RTP packets into predetermined subblocks and XORing each subblock.
KR1020150145170A 2014-11-25 2015-10-19 Data Transceiving Apparatus and Method in Telepresence System KR101953580B1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1020140165593 2014-11-25
KR20140165593 2014-11-25

Publications (2)

Publication Number Publication Date
KR20160062679A KR20160062679A (en) 2016-06-02
KR101953580B1 (en) 2019-03-04

Family

ID=56135781

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020150145170A KR101953580B1 (en) 2014-11-25 2015-10-19 Data Transceiving Apparatus and Method in Telepresence System

Country Status (1)

Country Link
KR (1) KR101953580B1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101999105B1 (en) * 2017-12-13 2019-07-11 오픈스택 주식회사 Method of reliable data transmission with least video latency for real-time video streaming
KR101870750B1 (en) * 2017-12-28 2018-06-26 오픈스택 주식회사 Apparatus for encoding video using rearranging transmission order and method thereof
KR102145285B1 (en) * 2018-12-28 2020-08-18 오픈스택 주식회사 Transmitting and receiving apparatus for providing a real-time image and a high-quality image

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20070079603A (en) * 2006-02-03 2007-08-08 경희대학교 산학협력단 Method for improving video packet loss resilience by using xor based redundant picture and its using apparatus
KR20130034569A (en) * 2011-09-28 2013-04-05 한국전자통신연구원 Method for uep/ulp with xor-fec and harq for burst loss recovery

Also Published As

Publication number Publication date
KR20160062679A (en) 2016-06-02


Legal Events

Date Code Title Description
A201 Request for examination
E902 Notification of reason for refusal
E701 Decision to grant or registration of patent right
GRNT Written decision to grant