WO2007045141A1 - A method for supporting multimedia data transmission with error resilience - Google Patents

A method for supporting multimedia data transmission with error resilience

Info

Publication number
WO2007045141A1
Authority
WO
WIPO (PCT)
Prior art keywords
error correction
forward error
data
correction coding
real
Prior art date
Application number
PCT/CN2006/001846
Other languages
French (fr)
Chinese (zh)
Inventor
Zhong Luo
Bin Song
Original Assignee
Huawei Technologies Co., Ltd.
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Publication of WO2007045141A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00: Arrangements for detecting or preventing errors in the information received
    • H04L1/0001: Systems modifying transmission characteristics according to link quality, e.g. power backoff
    • H04L1/0009: Systems modifying transmission characteristics according to link quality, e.g. power backoff by adapting the channel coding
    • H04L1/0015: Systems modifying transmission characteristics according to link quality, e.g. power backoff characterised by the adaptation strategy
    • H04L1/0017: Systems modifying transmission characteristics according to link quality, e.g. power backoff characterised by the adaptation strategy where the mode-switching is based on Quality of Service requirement
    • H04L65/00: Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/80: Responding to QoS

Definitions

  • The present invention relates to the field of multimedia communication technologies, and in particular to a multimedia data transmission method supporting error resilience.
  • Background art
  • Real-time streaming is real-time transmission, used especially for live events. Real-time streaming must match the connection bandwidth, which means that when the network speed drops, the image quality is degraded to reduce the required transmission bandwidth.
  • the concept of "real time" means that the delivery of data in an application must be kept in precise time relationship with the generation of the data.
  • video communication is gradually becoming one of the main services of communication.
  • Two-way or multi-party video communication services such as video telephony, video conferencing, and mobile terminal multimedia services, impose strict requirements on the transmission of multimedia data streams and the quality of services. Not only does network transmission require better real-time performance, but equivalently requires video data compression coding to be more efficient.
  • In view of the current demand for media communication, the ITU-T Telecommunication Standardization Sector officially released the H.264 standard in 2003, following the development of video compression standards such as H.261, H.263, and H.263+.
  • This is an efficient compression coding standard jointly developed by ITU-T and the Moving Picture Experts Group (MPEG) of the International Standardization Organization (ISO) to adapt to the new phase of network media transmission and communication requirements. It is also the main content of Part 10 of the MPEG-4 standard.
  • the purpose of the H.264 standard is to improve video coding efficiency and its adaptability to the network more effectively. In fact, due to its superiority, the H.264 video compression coding standard has gradually become the mainstream standard in multimedia communication.
  • H.264-based multimedia real-time communication products, such as conference TV, videophones, and 3G mobile communication terminals, as well as network streaming products, have been released. Whether a product supports H.264 has become a key factor in determining its competitiveness in this market segment. It can be predicted that with the official promulgation and widespread use of H.264, multimedia communication based on IP networks and on 3G and post-3G wireless networks will inevitably enter a new stage of rapid development.
  • multimedia communication not only requires high efficiency of media compression coding, but also requires real-time transmission of the network.
  • multimedia streaming basically adopts Real-time Transport Protocol (RTP) and Real-time Transport Control Protocol (RTCP).
  • RTP is a transport protocol for multimedia data streams over the Internet, published by the Internet Engineering Task Force (IETF).
  • RTP is defined to work in one-to-one or one-to-many transmissions with the goal of providing time information and stream synchronization.
  • A typical application of RTP runs over the User Datagram Protocol (UDP), but RTP can also work over other protocols such as the Transmission Control Protocol (TCP) or Asynchronous Transfer Mode (ATM).
  • RTP itself only guarantees the transmission of real-time data, and does not provide a reliable transmission mechanism for transmitting packets in sequence, nor does it provide flow control or congestion control. It relies on RTCP to provide these services.
  • RTCP is responsible for managing the transmission quality to exchange control information between current application processes.
  • Each participant periodically transmits RTCP packets, which contain statistics such as the number of transmitted packets and the number of lost packets. The server can therefore use this information to dynamically change the transmission rate, or even change the payload type.
  • RTP and RTCP work together to optimize transmission efficiency with effective feedback and minimal overhead, making it suitable for delivering real-time data on the network.
  • H.264 multimedia data is transmitted over the IP network, also based on UDP and its upper layer RTP protocol.
  • RTP itself is structurally applicable to different media data types, but for each higher-layer protocol or media compression coding standard used in multimedia communication (e.g. H.261, H.263, MPEG-1/-2/-4, MP3), the IETF develops a specification of the RTP payload format for that protocol. This specification defines how the RTP packets are encapsulated and is optimized for the specific protocol.
  • the corresponding IETF standard for H.264 is RFC 3984: RTP Payload Format for H.264 Video. This standard is currently the main standard for H.264 video stream transmission over IP networks, and is widely used. In the field of video communication, the products of major manufacturers are based on RFC 3984, and it is currently the only H.264/RTP transmission method.
  • H.264 defines a new layer, called the Network Abstraction Layer (NAL), which exposes the capabilities of the underlying network through a standard interface, hides the differences between underlying networks, and abstracts their service capabilities.
  • VCL Video Coding Layer
  • By defining the new NAL layer, H.264 brings greater application flexibility. This layer was not available in the earlier ITU-T video compression coding protocols such as H.261 and H.263/H.263+/H.263++.
  • How RTP can better carry H.264 is therefore a practical question worthy of study.
  • the method of RTP carrying H.264 NAL layer data proposed by RFC3984 is the current mainstream transmission method.
  • The scheme is based on the RTP protocol (RFC 3550) and encapsulates NAL layer data in the RTP payload for transport.
  • the NAL layer is located between the VCL and the RTP, and specifies that the video stream is divided into a series of network abstraction layer data units (NALUs, NAL Units) according to defined rules and structures.
  • the encapsulation format of the RTP payload for NALU is defined in RFC3984. The following is a brief introduction to the RTP frame format and the NALU packaging method in the prior art.
  • RTP is applicable to real-time multimedia conferencing, continuous data storage, interactive distributed simulation, and control and measurement applications.
  • RTP is typically carried over UDP to take advantage of its multiplexing and checksum functions. If the underlying network provides multicast distribution, RTP supports delivery to multiple destinations.
  • The services that RTP provides include payload type identification, sequence numbering, timestamping, and transmission monitoring.
  • RTP packs the NAL units of H.264 into an RTP packet stream.
  • the NALU is mainly defined in the RFC 3984 file, and based on this, the encapsulation and packing format of the H.264 layer NAL data in the RTP is given.
  • The RTP encapsulation format of the NALU is shown in Figure 1.
  • Figure 1 shows the encapsulation structure of a NALU in the payload of the RTP packet.
  • The first byte is the NALU header information, followed by the data content of the NALU.
  • the multiple NALUs are filled end-to-end into the payload of the RTP packet.
  • At the end is the optional RTP padding, which is specified in the RTP packet format. To make the length of the RTP packet meet certain requirements (such as reaching a fixed length), padding data, generally zeros, is appended.
  • The NALU header information is the first byte, also called an octet, and has three fields, whose meanings and full names are as follows:
  • The F field is the forbidden bit (forbidden_zero_bit), 1 bit, used to flag syntax errors and the like; it is set to 1 if there is a syntax violation.
  • The NRI field is the NAL reference identifier (nal_ref_idc), 2 bits, used to indicate the importance of the NALU data. The value 00 means that the content of the NALU is not used to reconstruct reference pictures for inter prediction, while a non-00 value indicates that the current NALU is a slice belonging to a reference frame, or a sequence parameter set (SPS, Sequence Parameter Set), a picture parameter set (PPS, Picture Parameter Set), or other important data.
  • The Type field is the NALU type (nal_unit_type), 5 bits in total, which can take 32 different values.
  • the information given in one byte of the NALU header information mainly contains the validity and importance level of the NALU. Based on this information, the importance of the data carried by the RTP can be determined.
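The field layout described above (1-bit F, 2-bit NRI, 5-bit Type, from most to least significant bit) can be illustrated with a minimal Python sketch. The names `NaluHeader`, `parse_nalu_header`, and `is_important` are illustrative assumptions and do not come from the patent text; the importance test simply applies the "non-00 NRI" rule stated above.

```python
from typing import NamedTuple


class NaluHeader(NamedTuple):
    forbidden_zero_bit: int  # F field, 1 bit
    nal_ref_idc: int         # NRI field, 2 bits
    nal_unit_type: int       # Type field, 5 bits


def parse_nalu_header(first_byte: int) -> NaluHeader:
    """Split the one-byte NALU header into its three fields."""
    if not 0 <= first_byte <= 0xFF:
        raise ValueError("the NALU header is a single byte")
    return NaluHeader(
        forbidden_zero_bit=(first_byte >> 7) & 0x01,
        nal_ref_idc=(first_byte >> 5) & 0x03,
        nal_unit_type=first_byte & 0x1F,
    )


def is_important(header: NaluHeader) -> bool:
    """Hypothetical helper: a non-00 NRI marks data used for reference."""
    return header.nal_ref_idc != 0


# 0x67 = 0b0_11_00111: F = 0, NRI = 3, Type = 7
header = parse_nalu_header(0x67)
print(header, is_important(header))
```

A sender could use such a check to decide which NALUs deserve stronger protection; how the importance is actually mapped to a protection level is a design choice that this sketch does not settle.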
  • H.264 video is the main protocol for multimedia communication in the future.
  • the network of future multimedia communication applications is mainly the packet switching network and wireless network represented by IP. Neither of these two types of networks can provide good quality of service (QoS) guarantees. Therefore, video transmission on the network is bound to be affected by various transmission errors and packet loss, resulting in lower communication quality.
  • Because the IP network implements "best effort" transmission, it does not guarantee the QoS of the transmitted video signal.
  • the best-effort transmission on the IP network does not guarantee the QoS of real-time video communication, which is manifested in three aspects: packet loss, delay, and delay jitter. Among them, packet loss has the greatest impact on the quality of recovered video.
  • Because the H.264 compression coding algorithm uses motion estimation and motion compensation, a packet loss affects not only the currently decoded image but also subsequently decoded images; that is, the error propagates. Error propagation severely degrades the quality of the recovered video. Only when the encoding end and the decoding end jointly apply error-resilience measures can error propagation be completely avoided.
  • Error resilience refers to the ability of a transmission mechanism to prevent errors from occurring, or to correct them to some extent after they occur (errors within a certain range of severity can be completely corrected; beyond that range they can only be partially corrected). In the widespread (one might say ubiquitous) multimedia communication environments of the future, it is critical that the video delivery mechanism be error resilient.
  • Common error-resilience measures include forward error correction (FEC), automatic repeat request (ARQ), joint source-channel coding (JSCC), interleaving, and elimination of error propagation.
  • For H.264 video transmitted over a packet network, FEC is a very practical technique that works well.
  • This method mainly uses a variety of error correction coding to encode the data to be protected, which essentially forms data redundancy, thereby increasing the ability to resist errors.
  • On a packet network the main error is packet loss, which error correction coding theory calls an erasure error.
  • Error correction codes for erasure errors form a large class called erasure codes.
  • An erasure code divides the data stream into segments (units) of the same size, also called data nodes. For convenience of presentation, assume that there are n data nodes. Check nodes are then computed from these data nodes according to certain mathematical operation rules. To enhance the protection capability, the check nodes may in turn generate a second layer of check nodes according to the same or different rules, and so on, so that third-layer, fourth-layer, up to Nth-layer check nodes may be generated.
  • This layered node structure can be pictured as a pyramid rotated 90 degrees to the right: the leftmost layer is the data node layer, and to its right are the first-layer check nodes, the second-layer check nodes, ..., and the Nth-layer check nodes.
  • An erasure code with the linear-time property has encoding and decoding time proportional to n, whereas many other erasure codes, such as the well-known Reed-Solomon code, have much higher time complexity, on the order of n*log2n*log(logn). Linear-time erasure codes are therefore much better suited to real-time communication.
  • The Tornado code is a new kind of erasure code that appeared around 1998. It is simple in structure and efficient in operation because it runs in linear time and offers strong protection. It has given good results in practical applications and has been widely used. According to the latest ITU-T developments, SG16 is currently considering the possibility of standardizing error control coding technology, mainly to protect video and audio network transmission. The Tornado code and its many variants are very likely to be an important technology among them.
  • Multiple check node layers are generated layer by layer from the data nodes. Both the check nodes and the data nodes are sent by the sender to the receiver over the network. If some nodes are lost in transit, then because each upper-layer node takes part in generating the lower-layer nodes, its information is already contained in the next layer and the layers below it, so a lost node can be fully recovered from a sufficient number of lower-layer nodes. If each node is a packet, a lost packet can be fully recovered from other packets that are correctly received. Let the number of data nodes be n and the number of generated check nodes be L.
  • Figure 3 shows a typical Tornado code data node and the relationship between the check nodes of each layer.
  • the connection between the nodes in the figure is called the edge, and the node on the left side of the edge participates in the calculation of the right node. It can be seen that there is a many-to-many logical relationship between the two nodes before and after.
  • the most commonly used calculation method in the Tornado code generation process is the XOR operation, because the XOR operation has a convenient recovery function, and any node can be recovered by all the remaining nodes after it is lost. Since the scaling factor of the last layer of check nodes is different, it is generally calculated using a conventional error correction coding scheme, such as a Reed-Solomon code.
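The recovery property of the XOR operation mentioned above can be shown with a small Python sketch. It covers only the simplest case, one check node computed over one group of equal-length data nodes, and is not a full Tornado code (no multi-layer graph, no final Reed-Solomon layer). The function names are illustrative assumptions.

```python
from functools import reduce
from typing import List, Optional


def xor_nodes(nodes: List[bytes]) -> bytes:
    """XOR a group of equal-length nodes byte by byte."""
    if not nodes:
        raise ValueError("at least one node is required")
    if any(len(node) != len(nodes[0]) for node in nodes):
        raise ValueError("all nodes must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*nodes))


def recover_missing(nodes: List[Optional[bytes]], check_node: bytes) -> List[bytes]:
    """Rebuild the single lost data node (marked None) from the survivors."""
    missing = [i for i, node in enumerate(nodes) if node is None]
    if len(missing) != 1:
        raise ValueError("one XOR check node can repair exactly one loss")
    survivors = [node for node in nodes if node is not None]
    repaired = list(nodes)
    repaired[missing[0]] = xor_nodes(survivors + [check_node])
    return repaired


data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
check = xor_nodes(data)                     # sent along with the data nodes
received = [data[0], None, data[2]]         # the second node was lost
print(recover_missing(received, check) == data)
```

Because XOR is its own inverse, XOR-ing the check node with every surviving data node cancels them out and leaves exactly the lost node.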
  • In fact, the family of erasure codes is very large, and Tornado codes are only one member. Others include RS (Reed-Solomon) codes and Low-Density Parity-Check (LDPC) codes.
  • An important performance indicator of an erasure code is its error correction capability (or protection capability). It is reflected directly in the maximum number of lost packets that can be tolerated (for a given total number of packets), or, when the packet loss exceeds this maximum, in the percentage of packets that can still be correctly recovered.
  • Other conditions being equal, the higher the protection capability, the higher the redundancy rate.
  • the protection capability is not only applicable to erasure codes, but on a larger scale, all FEC codes can be measured by protection capabilities.
  • some data are relatively important, such as structural parameters of video sequences, structural parameters of images, header information, etc.
  • Other data are relatively less important, such as image content data.
  • With FEC, a code with stronger protection can be used for relatively important data and a code with weaker protection for relatively unimportant data. This balances protection and efficiency.
  • The protection capability cannot be raised blindly, because doing so leads to high redundancy and low efficiency.
  • This method of applying FEC with different protection capabilities according to the relative importance of the data is called Unequal Protection (UEP); with unequal protection, QoS guarantees for video communication services are easier to realize.
  • The RTP protocol used to transmit video multimedia data does not itself provide error resilience; this is left to the application layer above it.
  • the erasure code protection is generally used to achieve elastic fault tolerance.
  • The measures taken by the prior-art scheme are as follows. At the NALU level of H.264, the sender applies some type of erasure code directly to the NALU data units, and the result (including the data nodes and the check nodes) is encapsulated directly in RTP packets and transmitted.
  • After receiving the RTP data packets, the receiving end decapsulates them to extract the data nodes and the check nodes. If packet loss occurs, that is, one or more RTP data packets are lost, the receiving end determines which data nodes or check nodes were encapsulated in the lost packets, judges whether the correctly received data nodes and check nodes suffice to recover the lost nodes completely or partially, and performs the recovery operation.
  • the prior art performs erasure coding on the multimedia data such as NALU at the upper layer and then transmits the data in the RTP, and performs corresponding erasure decoding on the receiving end.
  • The transmitting and receiving parties generally negotiate which forward error correction coding scheme to use, and the parameter settings of that scheme, through other protocol channels such as H.323/H.245.
  • In the prior-art solutions, the error-resilience mechanism is implemented above the RTP layer.
  • The two parties must negotiate or announce the type of erasure code to be used and its parameter settings over other logical channels, which seriously reduces multimedia transmission efficiency and consumes network bandwidth.
  • The error-resilience mechanism is transparent to the RTP layer. The RTP layer therefore cannot know the structure of the encoded multimedia data generated by the FEC codec scheme, and cannot perform targeted encapsulation and decapsulation or reorganize the transport hierarchy; as a result, network transmission delay grows and the transmission equipment becomes more complicated.
  • The multimedia data is always transmitted according to the same scheme, so the unequal protection mechanism cannot be implemented, and the error-resilience mechanism cannot be used to achieve QoS guarantees.
  • The prior art implements error-resilience mechanisms such as FEC at a higher layer and does not make use of the RTP protocol and its encapsulation. The transmitting and receiving parties therefore need to establish another logical channel, or use a specific application-layer protocol such as H.245 in the H.323 protocol family, to negotiate or announce the FEC coding type, structural parameters, and other information used. The RTP layer is not involved in any error-resilience details, and no RTP packet format is defined for encapsulating the data nodes and check nodes generated by FEC protection. Nor is there any selection of the FEC codec scheme according to network conditions and the importance of the multimedia data, or any mechanism for giving data of different relative importance FEC protection of different strength; that is, unequal protection cannot be achieved.
  • Summary of the invention
  • The main purpose of the present invention is to provide a real-time network transmission method for multimedia data that supports error resilience, so that an error-resilience mechanism for real-time transmission of multimedia data can be implemented at the transport protocol level.
  • Another purpose of the present invention is to implement unequal protection and hierarchical protection mechanisms for different data and network conditions.
  • a real-time transmission method for a multimedia data network supporting fault tolerance resilience includes:
  • the transmitting end selects a forward error correction coding mode to perform forward error correction coding on the multimedia data;
  • the transmitting end encapsulates the encoded multimedia data by using a fault-tolerant elastic real-time transport protocol, carries the forward error correction coding mode related information in the header information of the fault-tolerant elastic real-time transport protocol data packet, and sends the packet to the receiving end;
  • the receiving end decapsulates the received fault-tolerant elastic real-time transport protocol data packet, and extracts the forward error correction coding mode related information from the header information of the fault-tolerant elastic real-time transport protocol data packet;
  • the receiving end selects the forward error correction decoding mode according to the forward error correction coding mode related information and performs forward error correction decoding, fully or partially recovering the lost multimedia data.
  • the forward error correction encoded multimedia data includes a data node and a check node.
  • the transmitting end selects a forward error correction coding mode according to the current network transmission condition and/or the quality-of-service level of the multimedia data to be transmitted, wherein the quality-of-service level is determined according to the relative importance of the data.
  • the header information of the fault-tolerant elastic real-time transport protocol data packet includes:
  • a forward error correction coding type field configured to indicate a forward error correction code type used
  • a forward error correction coding subtype field configured to indicate a related parameter setting of the forward error correction coding mode
  • a packet length field configured to indicate the length of the nodes obtained after forward error correction coding of the multimedia data
  • a packet number field used to indicate the number of the data nodes carried by the fault tolerant elastic real-time transport protocol data packet.
  • the transmitting end divides at least one H.264 network abstraction layer unit into at least one data node of equal length, and then performs the forward error correction coding to obtain at least one check node; the transmitting end encapsulates the data nodes and the check nodes in at least one fault-tolerant elastic real-time transport protocol packet for transmission;
  • after receiving the fault-tolerant elastic real-time transport protocol packet, the receiving end decapsulates it to obtain the data nodes and the check nodes;
  • the receiving end performs forward error correction decoding on the data nodes according to the check nodes, and combines the data nodes to obtain the H.264 network abstraction layer units.
  • the transmitting end and the receiving end negotiate to determine the correspondence between the value of the forward error correction coding subtype field and the related parameter settings of the forward error correction coding that it indicates.
  • the transmitting end and the receiving end both establish a correspondence table according to the indicated correspondence of the forward error correction coding subtype field, which is used to look up the corresponding forward error correction coding or decoding processing module according to the forward error correction coding type field and the forward error correction coding subtype field;
  • the transmitting end invokes the corresponding forward error correction coding processing module to perform forward error correction coding; the receiving end invokes the corresponding forward error correction decoding processing module to perform forward error correction decoding.
  • the transmitting end judges the relative importance of the corresponding data according to the network abstraction layer reference identifier field and/or the network abstraction layer unit type field in the header information of the H.264 network abstraction layer unit, determines the quality-of-service level, selects the corresponding forward error correction coding mode, and determines the forward error correction coding type field and the forward error correction coding subtype field.
  • the transmitting end evaluates the network transmission status according to the transmission report fed back by the receiving end, further selects the forward error correction coding mode accordingly, and determines the forward error correction coding type field and the forward error correction coding subtype field.
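The table lookup and mode selection described in the preceding items can be sketched in Python as follows. Everything specific here is an assumption made for illustration: the numeric type and subtype codes, the 1% and 5% loss thresholds, and the three toy encoders are hypothetical and are not specified in the text. The sketch only shows the shape of the mechanism, a table keyed by (type, subtype) and a selection step driven by data importance and reported loss.

```python
from typing import Callable, Dict, List, Tuple

Encoder = Callable[[List[bytes]], List[bytes]]


def _xor(nodes: List[bytes]) -> bytes:
    """XOR equal-length nodes together."""
    out = bytearray(len(nodes[0]))
    for node in nodes:
        for i, value in enumerate(node):
            out[i] ^= value
    return bytes(out)


def encode_none(nodes: List[bytes]) -> List[bytes]:
    return []                                   # no check nodes


def encode_xor_all(nodes: List[bytes]) -> List[bytes]:
    return [_xor(nodes)]                        # one check node for the group


def encode_xor_pairs(nodes: List[bytes]) -> List[bytes]:
    # one check node per pair of data nodes: more redundancy, more protection
    return [_xor(nodes[i:i + 2]) for i in range(0, len(nodes), 2)]


# Hypothetical correspondence table: (FEC type, FEC subtype) -> encoder.
FEC_ENCODERS: Dict[Tuple[int, int], Encoder] = {
    (0, 0): encode_none,
    (1, 0): encode_xor_all,
    (1, 1): encode_xor_pairs,
}


def select_fec_mode(nal_ref_idc: int, loss_rate: float) -> Tuple[int, int]:
    """Pick (type, subtype) from data importance and reported packet loss."""
    important = nal_ref_idc != 0
    if loss_rate < 0.01 and not important:
        return (0, 0)
    if loss_rate >= 0.05 and important:
        return (1, 1)
    return (1, 0)


def protect(nodes: List[bytes], nal_ref_idc: int, loss_rate: float):
    mode = select_fec_mode(nal_ref_idc, loss_rate)
    return mode, FEC_ENCODERS[mode](nodes)


print(protect([b"\x01", b"\x02", b"\x04"], nal_ref_idc=3, loss_rate=0.10))
```

The receiving end would hold the mirror-image table of decoders and index it with the two fields read from the packet header, so no separate negotiation channel is needed per packet.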
  • the forward error correction coding type field is located after the contribution source identifier list; the forward error correction coding subtype field is located after the forward error correction coding type field;
  • the data packet length field is located after the forward error correction coding subtype field; the data packet number field is located after the data packet length field.
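The field order given above (FEC type, FEC subtype, packet length, packet number, placed after the contributing source identifier list) can be sketched with Python's `struct` module. The field widths are not stated in this part of the text, so the widths below (8, 8, 16, and 16 bits) are purely an assumption for illustration, as are the function names; the offset calculation relies on the 12-byte fixed RTP header and 4-byte CSRC entries.

```python
import struct

# Assumed widths (hypothetical): type 8 bits, subtype 8 bits,
# packet length 16 bits, packet number 16 bits, network byte order.
_ERRTP_FIELDS = struct.Struct("!BBHH")
_RTP_FIXED_HEADER_LEN = 12
_CSRC_LEN = 4


def pack_errtp_fields(fec_type: int, fec_subtype: int,
                      node_length: int, node_count: int) -> bytes:
    """Serialize the four FEC-related header fields in the stated order."""
    return _ERRTP_FIELDS.pack(fec_type, fec_subtype, node_length, node_count)


def unpack_errtp_fields(header: bytes, csrc_count: int):
    """Read the four fields that follow the CSRC list of the header."""
    offset = _RTP_FIXED_HEADER_LEN + _CSRC_LEN * csrc_count
    return _ERRTP_FIELDS.unpack_from(header, offset)


# Placeholder fixed header (12 zero bytes) plus one CSRC entry, then the fields.
header = bytes(12) + bytes(4) + pack_errtp_fields(1, 0, 256, 5)
print(unpack_errtp_fields(header, csrc_count=1))
```

Keeping these fields in every packet header lets the receiver choose the right decoder from the packet alone.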
  • the forward error correction coding mode uses an improved "Tornado" erasure code; the improved "Tornado" erasure code generates only one layer of check nodes for a set of the data nodes.
  • An ERRTP transport-layer encapsulation format that can carry information related to the forward error correction coding scheme is provided on the basis of the existing RTP, so that the multimedia data is transmitted over ERRTP while labeled with its corresponding forward error correction coding scheme information, thereby integrating the error-resilience mechanism into the transport layer;
  • various alternative forward error correction coding schemes can be selected according to factors such as current network conditions and the importance level of the multimedia data, thereby achieving unequal protection and hierarchical protection and balancing protection capability against transmission efficiency;
  • placing the error-resilience mechanism in the transport layer greatly simplifies the error-resilient transmission structure and saves network transmission bandwidth.
  • the realization of unequal protection balances protection capability against transmission efficiency and facilitates QoS guarantees for multimedia transmission; the implementation of the specific transmission scheme for H.264 data can greatly improve the performance of, and user satisfaction with, H.264-based multimedia communication products such as conference television and videophone applications on IP networks.
  • FIG. 1 is a schematic diagram of the encapsulation format of NALU data in the RTP packet payload;
  • FIG. 2 is a schematic diagram of the structure of the header information of an RTP data packet;
  • FIG. 3 is a schematic diagram of the principle of the Tornado erasure code;
  • FIG. 4 is a schematic structural diagram of an ERRTP packet header according to a first embodiment of the present invention;
  • FIG. 5 is a flow chart of an H.264 multimedia data transmission method according to a second embodiment of the present invention;
  • FIG. 6 is a schematic diagram of an H.264 NALU partitioning and codec process according to a second embodiment of the present invention.
  • Detailed description
  • The present invention proposes an improved RTP protocol supporting error resilience, which aims to integrate the error-resilience mechanism into the transport-layer protocol. This not only simplifies the transmission structure and reduces complexity, but also makes the error-resilience mechanism more flexible and enhances transmission reliability. Because of its error resilience, this improved RTP protocol is called the fault-tolerant elastic real-time transport protocol (ERRTP/ER2TP, Error Resilience Real-time Transport Protocol).
  • The main difference between ERRTP and RTP is that the extended header information of an ERRTP packet can carry information about the forward error correction codec scheme, such as FEC type, protection capability, and coding parameters.
  • The present invention makes unequal protection convenient to realize. First, various measures with different protection capabilities are available for selection; the sender then collects information such as the network status and the importance of the multimedia data, and uses these factors to select appropriate protection measures, achieving unequal protection and a balance between protection capability and transmission efficiency. Since the FEC-related information is carried in each ERRTP data packet, the transmitting end only needs to fill the information of the selected scheme into the ERRTP header information, and the receiving end can correctly recover or correct errors according to it.
  • the specific implementation method based on erasure code protection is given, including the steps of dividing, generating, encapsulating and decapsulating the data node and the check node.
  • a series of NALUs is divided equally into several data nodes, and the check nodes are then generated with Tornado codes. All of these nodes are distributed over several ERRTP packets, and the receiving end performs the inverse of this process.
  • the format of the RTP packet is briefly introduced:
  • the basic RTP header occupies 12 bytes (the minimum case), and the headers of the IP protocol and the UDP protocol occupy 20 bytes and 8 bytes respectively. The RTP packet is encapsulated in a UDP packet, which is in turn encapsulated in an IP packet.
  • the detailed structure of the header information of the RTP packet is shown in Figure 2.
  • from front to back, the RTP header information shown in Figure 2 is:
  • the first byte (byte 0) holds fields describing the header structure itself
  • the second byte (byte 1) holds the marker bit and the defined payload type
  • the third and fourth bytes (bytes 2, 3) are the sequence number (Sequence Number)
  • the 5th-8th bytes are the timestamp (Timestamp)
  • the 9th-12th bytes are the synchronization source identifier (SSRC ID, Synchronization Source Identifier)
  • these may be followed by a list of contributing source identifiers (CSRC IDs, Contributing Source Identifiers)
  • the first 12 bytes appear in every type of RTP packet, while other data in the header, such as the contributing source identifiers, are present only when inserted by a mixer. CSRC is therefore generally used when media are mixed: for example, in multi-party conferences audio needs to be mixed, and video can also provide multi-picture functions in this way.
  • the synchronization source identifier SSRC is actually the identifier of the carried media stream.
  • the P field is a padding flag (Padding), which occupies 1 bit. If P is set, it indicates that the packet contains one or more padding bytes (Padding) at the end, and the padding is not part of the payload;
  • the X field is an extension flag (Extension), which occupies 1 bit. If X is set, the fixed RTP header must be followed by a variable-length header extension (placed after the CSRC list, if there is one). The extension is reserved mainly for application environments in which the fixed header fields are not sufficient. The header extension includes a 16-bit length field that counts the number of 32-bit words in the extension, and the first 16 bits of the header extension are left open for distinguishing identifiers or parameters; their format is defined by the specific profile specification. Details are given in section 5.3.1 of RFC 3550 and are not repeated here;
  • the CC field is the contributing source count (CSRC Count), which occupies 4 bits and indicates the number of CSRC identifiers that follow the fixed header; from the CC field the receiver can determine the length of the CSRC ID list;
  • the M field is a marker bit (Marker), which occupies 1 bit.
  • the interpretation of the marker bit is defined by a specific profile; it allows important events in the packet stream to be marked.
  • a profile may define additional marker bits, or specify that there is no marker bit.
  • a profile here refers to the settings of a specific application environment, which are agreed upon by the communicating parties and are not fixed by the protocol;
  • the PT field is the payload type (PT, Payload Type), 7 bits in total, which identifies the format of the RTP payload and determines its interpretation by the application. The marker bit and the payload type carry profile-specific information, and this byte may be redefined by a specific profile to suit different needs.
  • a profile can define a static mapping, that is, one agreed upon by the two communicating parties in advance, which associates different PT values with different media formats.
  • the relationship between PT values and media formats can also be defined dynamically, by negotiation through signaling outside RTP.
  • the RTP source can change the PT during a session.
  • the next field is the 16-bit sequence number. Each time an RTP data packet is sent, the sequence number is incremented by one, so that the receiver can use it to detect packet loss and restore the packet order.
  • the initial value of the sequence number in a communication can be chosen randomly; this does not affect communication.
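The loss detection described for the sequence number can be sketched as follows. This is a minimal illustration, assuming the receiver knows the first expected sequence number and the size of the window it is checking; the function name and the window convention are hypothetical and are not taken from the text.

```python
def missing_sequence_numbers(received, first, count):
    """Return the sequence numbers missing from a window of `count` packets.

    `first` is the sequence number expected for the first packet of the
    window. Numbers wrap modulo 2**16 because the field is 16 bits wide.
    """
    expected = [(first + i) % 65536 for i in range(count)]
    seen = set(received)
    return [seq for seq in expected if seq not in seen]


# A window of 6 packets that starts just before the 16-bit wrap-around.
lost = missing_sequence_numbers([65534, 65535, 1, 3], first=65534, count=6)
print(lost)  # [0, 2]
```

Because the initial value is random and the field is only 16 bits wide, the comparison has to be done modulo 65536 rather than on raw integers.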
  • the timestamp occupies 32 bits, which reflects the sampling time of the first byte in the RTP packet.
  • the sampling time here must be derived from a monotonically increasing clock, and the receiver adjusts the media playback time or synchronizes according to it.
  • the synchronization source identifier (SSRC ID) occupies 32 bits. Its value can be chosen randomly, but it must be unique within the same RTP session so that it uniquely identifies a media source. If a source changes its source transport address, it must choose a new SSRC identifier.
  • the CSRC list can contain 0-15 items as needed, each occupying 32 bits; the length of the list, that is, the number of CSRC IDs, is given by the 4-bit CC field.
  • the CSRC identifier that identifies a media source is identical to the SSRC identifier of the corresponding contributing source; the same identifier is carried as SSRC or as CSRC depending on its role with respect to the receiver.
  • the CSRC ID is inserted by the mixer.
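The header layout described above can be illustrated with a short parser. This is a sketch only: the field widths follow the description above and RFC 3550 (which also places a 2-bit version field at the top of byte 0, not mentioned in the text), while the function name and the dictionary keys are hypothetical.

```python
import struct


def parse_rtp_header(packet: bytes) -> dict:
    """Parse the 12-byte fixed RTP header and the optional CSRC list."""
    if len(packet) < 12:
        raise ValueError("an RTP header needs at least 12 bytes")
    # Network byte order: byte 0, byte 1, 16-bit sequence number,
    # 32-bit timestamp, 32-bit SSRC.
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    cc = b0 & 0x0F                      # CSRC count, 4 bits
    end = 12 + 4 * cc
    if len(packet) < end:
        raise ValueError("packet is shorter than its CSRC list")
    csrc = list(struct.unpack("!%dI" % cc, packet[12:end]))
    return {
        "version": b0 >> 6,             # 2 bits (per RFC 3550)
        "padding": bool(b0 & 0x20),     # P, 1 bit
        "extension": bool(b0 & 0x10),   # X, 1 bit
        "csrc_count": cc,
        "marker": bool(b1 & 0x80),      # M, 1 bit
        "payload_type": b1 & 0x7F,      # PT, 7 bits
        "sequence_number": seq,
        "timestamp": ts,
        "ssrc": ssrc,
        "csrc": csrc,
    }


# Example header: version 2, one CSRC entry, marker set, payload type 96.
example = (bytes([0x81, 0xE0]) + struct.pack("!HII", 7, 1000, 0x11223344)
           + struct.pack("!I", 0xAABBCCDD))
header = parse_rtp_header(example)
```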
  • the sending and receiving parties implement unequal protection based on ERRTP.
  • the main steps are as follows:
  • the transmitting end selects a forward error correction coding scheme and performs erasure coding on the multimedia data, encapsulates the encoded multimedia data with ERRTP, carries the relevant information of the forward error correction coding scheme in the ERRTP header information, and then sends the packets to the receiving end;
  • the receiving end decapsulates the received ERRTP packets, extracts the relevant information of the forward error correction coding scheme from the ERRTP header information, and then, according to that information, selects the forward error correction coding scheme and performs erasure decoding to obtain the multimedia data.
  • the unequal protection is reflected in that the sending end selects the forward error correction coding scheme according to the current network transmission status and/or the quality of service level of the multimedia data to be transmitted.
  • the specific structure of ERRTP is introduced.
  • the following is an example of the structure of the header information of the specific ERRTP.
  • FIG. 4 is a block diagram showing the structure of an ERRTP header according to a first embodiment of the present invention.
  • the header information extension ends with fields carrying the relevant information about the forward error correction codec scheme. In this example they are: a forward error correction coding type field, a forward error correction coding subtype (parameter) field, a packet length field, and a number of packets field.
  • the forward error correction coding type field indicates the erasure code type used by the forward error correction coding scheme. It is also referred to as the FEC Type field. It occupies 4 bits and can therefore represent 16 different FEC types, which is sufficient for practical applications.
  • the types defined here are major types, each of which is further subdivided into various schemes, called subtypes.
  • examples of major types in practical applications are 0010 for Tornado code and 0011 for RS code.
  • a look-up table (LUT, Look-Up Table), called FECTypeLUT, must be agreed upon in advance to define the correspondence between FEC coding types and type codes.
  • the forward error correction coding subtype field indicates the parameter settings of the forward error correction coding scheme. For each type of FEC coding, the settings of various parameters must also be determined before the scheme can be implemented, and the role of this field is to specify those parameters. Since the resources in the ERRTP header information are limited, it is impossible to list the specific parameters and rules of every FEC coding scheme, so the first embodiment of the present invention uses the concept of subtypes to indicate the alternative parameter setting schemes.
  • this field is also known as the FEC Subtype field and occupies 9 bits. It represents the subtypes into which the major types defined in the FECTypeLUT are further subdivided.
  • MTU: Maximum Transmission Unit
  • the number of packets field indicates the number of data nodes carried by the ERRTP packet. It is also known as the Packet Number field and occupies 8 bits. For example, when several NALUs are FEC-encoded and then encapsulated in multiple ERRTP packets, it gives the number of data nodes carried in each ERRTP packet.
  • the decoding end or a network node can check the received data packets according to the FEC coding type and the check type of the data packets given by these fields, and recover the lost data packets.
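The FEC-related fields can be packed into a 32-bit word as sketched below. The 4-bit FEC Type, 9-bit FEC Subtype, and 8-bit Packet Number widths come from the text; the 11-bit width of the packet length field and the order of the fields within the word are assumptions made only so that the four fields fill exactly 4 bytes. The function names are hypothetical.

```python
import struct

# Field widths in bits. LENGTH_BITS and the field order are assumptions.
TYPE_BITS, SUBTYPE_BITS, LENGTH_BITS, NUMBER_BITS = 4, 9, 11, 8


def pack_fec_info(fec_type, fec_subtype, packet_length, packet_number):
    """Pack the four FEC fields into 4 bytes, most significant field first."""
    fields = ((fec_type, TYPE_BITS), (fec_subtype, SUBTYPE_BITS),
              (packet_length, LENGTH_BITS), (packet_number, NUMBER_BITS))
    word = 0
    for value, bits in fields:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"value {value} does not fit in {bits} bits")
        word = (word << bits) | value
    return struct.pack("!I", word)


def unpack_fec_info(data):
    """Inverse of pack_fec_info: returns (type, subtype, length, number)."""
    (word,) = struct.unpack("!I", data[:4])
    packet_number = word & 0xFF
    packet_length = (word >> 8) & 0x7FF
    fec_subtype = (word >> 19) & 0x1FF
    fec_type = word >> 28
    return fec_type, fec_subtype, packet_length, packet_number


# Tornado code (type 0010), subtype 5, 1000-byte packets, 12 nodes per packet.
raw = pack_fec_info(0b0010, 5, 1000, 12)
```

The receiver would call `unpack_fec_info(raw)` on the same 4 bytes to learn which scheme the sender selected.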
  • the FEC Subtype field mentioned above has 9 bits in total and encodes which of the alternative parameter setting schemes is in use. The technical details of this coding indication in the first embodiment of the present invention are given below.
  • the sending and receiving parties need to negotiate a table of correspondences for the values of this field.
  • the sender and the receiver negotiate to determine, for each type of FEC code, the correspondence between the value of the FEC Subtype and the parameter setting scheme it indicates, together with the specific parameter settings of each alternative.
  • the sender and the receiver both establish a correspondence table according to the negotiation result, and are configured to query the corresponding FEC coding type or FEC codec processing module according to the FEC Type and FEC Subtype fields;
  • the transmitting end calls the corresponding erasure coding processing module to perform erasure coding
  • the receiving end calls the corresponding erasure decoding processing module to perform erasure decoding.
  • the so-called generation rule is a rule or algorithm (Algorithm) of how the data node is processed at the transmitting end to generate each check node. Of course, the opposite is done at the receiving end. If a packet loss occurs during the transmission, that is, some nodes are lost, the lost node can be recovered or partially recovered according to the generation rule. It can be seen that the generation rule is very important information, according to which both parties of the communication can work based on the FEC mechanism.
  • each of the FEC types listed in the FECTypeLUT has its own generation rule. Within each type, such as Tornado code, each subtype combines the generation rule with specific generation parameters.
  • the generation parameters include the following: the total number of data nodes, the total number of check nodes, the number of check node layers, the ratio by which the number of nodes shrinks between successive layers, and the association matrices describing node associations between successive layers. If there are L layers of check nodes, there are L such association matrices, or equivalently L bipartite graphs, each representing the relationship between the nodes of two successive layers.
  • the generation parameters largely determine the protection strength of the subtype.
  • for example, for Tornado code, among the generation parameters given above, the total number of data nodes and the total number of check nodes largely determine the protection capability (strictly speaking, all the generation parameters are needed to determine it fully).
  • the few parameters that most strongly determine the protection capability are selected as representative generation parameters.
  • subtypes are arranged in order of protection capability from weak to strong (ascending order), and the resulting LUT is called FECSubTypeLUT.
  • how many subtypes each major type supports can be determined according to the specific application, the capabilities of the communicating devices (CPU processing speed, memory, program complexity, etc.), and actual needs. If the communication environment changes a lot and network performance fluctuates widely, more subtypes generally need to be supported; otherwise fewer are needed. This can be agreed upon by the communicating parties through a capability negotiation process before communication begins. Negotiation can be carried out through current mainstream multimedia communication framework protocols such as H.323 or the Session Initiation Protocol (SIP).
  • a given generation rule combined with its generation parameters corresponds to a unique coding scheme, that is, it uniquely determines how check nodes are generated from the data nodes and how lost nodes are recovered.
  • a database can be created to store the generation parameters for each of the major types and subtypes.
  • the generation rules themselves are implemented in hardware or software modules. Each major type therefore corresponds to an FEC processing module at the transmitting end, which is responsible for generating check nodes, and to an FEC processing module at the receiving end, which is responsible for recovering nodes.
  • the module for each major type reads the specific generation parameters of each subtype from the generation parameter database described above and processes the data accordingly. Both parties therefore use the two information fields FEC Type and FEC Subtype to determine which FEC processing module to call and which generation parameters to read.
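The lookup described above can be sketched as a pair of tables. The type codes 0010 and 0011 come from the text; every subtype entry and parameter value below is hypothetical, chosen only to show subtypes ordered from weak to strong protection, and `select_fec` is an illustrative name rather than part of the described protocol.

```python
# Major types, keyed by the 4-bit FEC Type code (codes taken from the text).
FEC_TYPE_LUT = {0b0010: "Tornado", 0b0011: "RS"}

# Hypothetical generation-parameter database, keyed by (type, subtype).
# Subtypes are listed in ascending order of protection capability.
FEC_SUBTYPE_LUT = {
    (0b0010, 0): {"data_nodes": 10, "check_nodes": 1},
    (0b0010, 1): {"data_nodes": 10, "check_nodes": 2},
    (0b0010, 2): {"data_nodes": 10, "check_nodes": 4},
}


def select_fec(fec_type, fec_subtype):
    """Return the module name and generation parameters for a header pair."""
    if fec_type not in FEC_TYPE_LUT:
        raise KeyError(f"unknown FEC type {fec_type:04b}")
    params = FEC_SUBTYPE_LUT.get((fec_type, fec_subtype))
    if params is None:
        raise KeyError(f"unknown subtype {fec_subtype} for type {fec_type:04b}")
    return FEC_TYPE_LUT[fec_type], params


module, params = select_fec(0b0010, 1)
```

Both ends would hold identical copies of these tables after capability negotiation, so the two header fields alone are enough to pick the same module and parameters.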
  • the second embodiment of the present invention describes the transmission of H.264 NALUs over ERRTP.
  • Step 501: the sender combines several (say S) H.264 NALUs into one group for joint coding and transmission, and first re-divides the S NALUs into M blocks of equal length; these M blocks are the data nodes.
  • the S H.264 NALUs are grouped into one group and concatenated end to end to form one large block, which is then divided equally into M data blocks, each K bytes long.
  • if the total number of bytes TB of the large block is not divisible by M, a rounding operation is performed so that the length of each data block is Ceiling(TB/M) bytes, where Ceiling denotes rounding up, that is, Ceiling(x) is the smallest integer not less than x, for any real number x.
  • a data block that is shorter than this can be zero-padded so that its number of bytes equals Ceiling(TB/M).
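Step 501 can be sketched as follows. This is a minimal illustration of the concatenate, split, and zero-pad procedure; the function name is hypothetical, and how the receiver learns the original NALU boundaries is not specified in the text and is not covered here.

```python
import math


def split_into_data_nodes(nalus, m):
    """Concatenate the NALUs and split the result into m equal data nodes.

    Each node is Ceiling(TB / m) bytes long, where TB is the total number
    of bytes; the tail of the last node is zero-padded when TB is not
    divisible by m.
    """
    big_block = b"".join(nalus)
    node_len = math.ceil(len(big_block) / m)
    padded = big_block.ljust(node_len * m, b"\x00")
    return [padded[i * node_len:(i + 1) * node_len] for i in range(m)]


# S = 3 NALUs with TB = 10 bytes in total, split into M = 4 data nodes.
nodes = split_into_data_nodes([b"abc", b"defgh", b"ij"], 4)
print(nodes)  # [b'abc', b'def', b'ghi', b'j\x00\x00']
```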
  • Step 502: FEC encoding is performed on the M data nodes to obtain N check nodes.
  • the M data blocks are FEC-encoded to generate N check blocks; using the method described above, the FEC Type and FEC Subtype information determines which FEC processing module is called to generate the check blocks.
  • Step 503: the sender encapsulates all data nodes and check nodes in ERRTP packets for transmission.
  • Figure 6 shows the structure of the P ERRTP packets carrying the M + N nodes. Combined with the ERRTP header format given in Figure 4, the fields in this example should be set as follows:
  • FEC Type field: FEC Type = 0010, indicating the use of Tornado code
  • the channel coding redundancy is 16.7%; the erasure code can completely recover the lost data packets when the packet loss rate is less than or equal to 3%;
  • Packet Number field: Packet Number = (M+N)/P, which represents the number of nodes carried in one ERRTP payload.
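The relation Packet Number = (M+N)/P can be illustrated by distributing the nodes over the packets. This sketch assumes that M + N is divisible by P and that nodes are placed in packets consecutively; neither detail is stated in the text, and the function name is hypothetical.

```python
def distribute_nodes(nodes, p):
    """Split the M + N nodes into p payloads of (M + N) / p nodes each."""
    if len(nodes) % p != 0:
        raise ValueError("(M + N) must be divisible by P in this sketch")
    per_packet = len(nodes) // p
    return [nodes[i * per_packet:(i + 1) * per_packet] for i in range(p)]


# M = 10 data nodes and N = 2 check nodes spread over P = 4 packets.
all_nodes = [f"d{i}" for i in range(10)] + ["c0", "c1"]
payloads = distribute_nodes(all_nodes, 4)
packet_number = len(payloads[0])  # value written into the Packet Number field
```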
  • Step 504: after receiving the ERRTP packets, the receiving end decapsulates them to obtain the data nodes and check nodes. The receiving end treats P packets as one group and starts decoding and recovery each time a group of P packets has been received. The number of packets in a group is agreed upon by both parties.
  • Step 505: the receiving end performs forward error correction decoding on the data nodes according to the check nodes. Each time packet P+1 has been received, it checks whether any of the preceding P packets were lost. If so, the method described above is used to determine, from the FEC Type and FEC Subtype information, which FEC processing module to call, and that module decodes and recovers, or partially recovers, the lost data.
  • Step 506: finally, once the complete data nodes have been obtained, they are merged again into the large block, which is divided back into the S NALUs in a manner corresponding to that of the transmitting end.
  • the above example uses the ERRTP-based packet loss protection algorithm, which can greatly improve the packet loss resilience of the video code stream with a coding redundancy of less than 17%. Compared with the RTP payload header structure, only 4 bytes are added, so transmission efficiency is essentially unaffected while a significant practical benefit is obtained.
  • another key technical point of the present invention, already mentioned above, is the implementation of unequal protection. It takes two forms. One is to select the appropriate codec scheme or parameters, that is, the aforementioned FEC coding type and subtype, according to the importance level of the multimedia data; the other is to select them according to the network conditions at different times. These two forms are called the mixed use and the alternating use of FEC coding schemes. Mixed use (Hybrid) means using several FEC subtypes at the same time, mainly to protect data of different importance. Alternating use (Alternation) means using different FEC subtypes at different times, that is, under different network conditions.
  • these two unequal protection mechanisms are given based on the first embodiment.
  • the first byte of a NALU (its header) reflects the importance of the data, so the sender can evaluate the QoS level according to the NRI field or the Type field in the NALU header information, and then select the forward error correction coding scheme, that is, determine the FEC Type field and the FEC Subtype field.
  • network transmission generally has a corresponding network condition monitoring mechanism. Through it the transmitting end can obtain the transmission reports fed back by the receiving end, evaluate the network transmission status, and then select the forward error correction coding scheme, that is, determine the FEC Type field and the FEC Subtype field.
  • the H.264 code stream is transmitted or stored in units of NALUs, each of which consists of NAL header information and a NAL payload.
  • different NALU types have different effects on decoding and restoring images.
  • an NRI of 0 means that the NALU carries a slice or slice data partition of a non-reference picture, whose loss does not affect subsequent decoding; a non-zero NRI indicates that the NALU carries a sequence/picture parameter set or a slice or slice data partition of a reference picture, whose loss can seriously affect subsequent decoding.
  • for example, the data of H.264 can be divided into two categories according to the NALU header fields: relatively important image data (for example, nal_ref_idc equal to 1) and secondary image data (for example, nal_ref_idc equal to 0). The important image data is then protected by the FEC1 code, which has high redundancy and strong packet loss resilience, while the secondary image data can be protected by the FEC2 code, which has less redundancy and weaker packet loss resilience.
  • FEC1 and FEC2 are just general notations for any two subtypes. The two subtypes can belong to the same major type or to different major types.
  • the above method can be extended to a more general case in which the data is divided into more classes according to the value of nal_unit_type, such as five classes: the most important data, the second most important data, data of ordinary importance, less important data, and the least important data. It can also be divided into seven or more classes. The classes can then be protected with the same number of FEC subtypes, each class of data corresponding to a different subtype. As long as their protection capabilities range from weak to strong, these subtypes do not necessarily belong to the same major type.
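The importance-based selection can be sketched as follows. The bit layout of the one-byte NALU header (1-bit forbidden_zero_bit, 2-bit nal_ref_idc, 5-bit nal_unit_type) follows the H.264 specification; the two-class rule and the labels "FEC1" and "FEC2" mirror the example above, and the function names are hypothetical.

```python
def parse_nalu_header(first_byte):
    """Split the first byte of a NALU into its three header fields."""
    return {
        "forbidden_zero_bit": first_byte >> 7,
        "nal_ref_idc": (first_byte >> 5) & 0x03,   # the NRI field
        "nal_unit_type": first_byte & 0x1F,        # the Type field
    }


def choose_fec_subtype(first_byte, strong="FEC1", weak="FEC2"):
    """Use the stronger code for NALUs that later decoding depends on."""
    nri = parse_nalu_header(first_byte)["nal_ref_idc"]
    return strong if nri != 0 else weak


# 0x67: sequence parameter set (nal_ref_idc 3, nal_unit_type 7).
# 0x01: slice of a non-reference picture (nal_ref_idc 0, nal_unit_type 1).
choices = [choose_fec_subtype(b) for b in (0x67, 0x01)]
print(choices)  # ['FEC1', 'FEC2']
```

A finer classification into five or seven classes would replace the single comparison with a table from `nal_unit_type` values to subtypes.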
  • image information that still cannot be recovered even under the FEC code with the strongest protection is handled by error concealment and by measures that prevent error propagation.
  • another case of unequal protection according to the present invention is the selection of FEC schemes with different protection capabilities according to the real-time condition of the network.
  • the two communicating parties are notified of the selection through the ERRTP header information, so that they can correctly decode the data and recover lost data.
  • image information that still cannot be recovered even under the FEC code with the strongest protection is handled by error concealment and by measures that prevent error propagation. Network conditions can be monitored through various existing QoS monitoring methods.
  • in the table, each FEC scheme is indexed by a two-dimensional subscript.
  • the error resilience mechanism FEC(i,j) in the table, 0 ≤ i ≤ U, 0 ≤ j ≤ V, may be any of the above T FEC schemes.
  • an improved Tornado erasure code is specifically employed.
  • the improved Tornado erasure code generates only one layer of check nodes for a group of data nodes, which greatly reduces coding delay and meets the needs of real-time communication.
  • S NALUs are grouped into one group, and one NALU contains the code stream data of one slice. If each frame is encoded as a single slice, the encoding end incurs a delay of S frames, and the decoding end likewise incurs a delay of S frames.
  • the relationship between NALU and the number of data nodes is as follows:
  • the delay is basically determined by the value of S, and the number of data nodes (DataNode) greatly affects the value of S. Limiting the number of data nodes therefore minimizes the delay introduced by FEC while preserving the packet loss resilience of video communication, and further ensures the QoS of real-time video communication.
  • the present invention employs an improved Tornado code protection algorithm for the case where the number of data nodes is limited.
  • the improved Tornado method does not use multi-level bipartite graph coding, but uses only one layer of check nodes.
  • the improved coding method greatly increases the flexibility of the algorithm.
  • the number of data nodes and check nodes can be set arbitrarily, and the complexity of the codec algorithm is also reduced, so the method can be used for packet loss protection in real-time video communication.
  • the packet loss resilience of the improved Tornado code is basically not reduced when the number of data nodes is limited.
  • the specific principle and detailed steps of the improved Tornado coding method are described in Chinese Patent Application No. 200510066146.7, entitled "A Data Transmission Protection Method Based on Erasure Code".
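The idea of a single layer of check nodes can be illustrated with a generic XOR erasure code. This is not the improved Tornado code of the cited application, whose details are not given here; it is a minimal sketch in which each check node is assumed to be the XOR of the data nodes it is connected to in one bipartite graph. The graph and all names are hypothetical.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode_checks(data_nodes, graph):
    """graph[k] lists the data-node indices XORed into check node k."""
    size = len(data_nodes[0])
    checks = []
    for neighbours in graph:
        acc = bytes(size)
        for i in neighbours:
            acc = xor_bytes(acc, data_nodes[i])
        checks.append(acc)
    return checks


def recover(data_nodes, checks, graph):
    """Return a copy of data_nodes with lost entries (None) filled in
    wherever a received check node has exactly one lost neighbour."""
    data = list(data_nodes)
    progress = True
    while progress:
        progress = False
        for k, neighbours in enumerate(graph):
            if checks[k] is None:
                continue  # this check node was itself lost
            missing = [i for i in neighbours if data[i] is None]
            if len(missing) == 1:
                acc = checks[k]
                for i in neighbours:
                    if data[i] is not None:
                        acc = xor_bytes(acc, data[i])
                data[missing[0]] = acc
                progress = True
    return data


original = [b"\x01\x02", b"\x10\x20", b"\x0a\x0b", b"\xff\x00"]
graph = [[0, 1], [1, 2, 3], [0, 3]]
checks = encode_checks(original, graph)
damaged = [original[0], None, original[2], None]   # nodes 1 and 3 lost
restored = recover(damaged, checks, graph)
```

Nodes that no check node can resolve stay `None`, which corresponds to the partial recovery mentioned in the text.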

Abstract

A method for supporting multimedia data transmission with error resilience is disclosed, whereby the error resilience mechanism for real-time multimedia data transmission is realized at the transport protocol layer. First, the invention extends the existing RTP protocol into the ERRTP protocol, which carries information about the forward error correction (FEC) coding scheme, so that multimedia data transmitted over ERRTP is tagged with the information of its corresponding FEC coding scheme and the error resilience mechanism is introduced at the transport layer. Second, the transmitting end can select among the candidate FEC coding schemes according to the current network condition and the importance level of the multimedia data, thereby achieving protection graded by importance and a balance between protection capability and transmission efficiency.

Description

Multimedia data transmission method supporting error resilience
Technical Field
The present invention relates to the field of multimedia communication technologies, and in particular to a multimedia data transmission method supporting error resilience.
Background Art
With the rapid development of the Internet and mobile communication networks, streaming media technology is being applied ever more widely, from online broadcasting and movie playback to distance learning and online news sites. Video and audio are currently delivered over the network in two main ways: downloading and streaming. Streaming transmits the video/audio signal continuously; while the streaming media plays on the client, the remainder continues to download in the background. Streaming comes in two forms: progressive streaming and real-time streaming. Real-time streaming delivers data in real time and is particularly suited to live events. It must match the connection bandwidth, which means that image quality degrades when network speed drops, so as to reduce the demand on transmission bandwidth. "Real time" means that, within an application, the delivery of data must maintain a precise timing relationship with the generation of the data.
In particular, with the emergence of third-generation mobile communication systems (3G, 3rd Generation) and the rapid development of networks generally based on the Internet Protocol (IP), video communication is gradually becoming one of the main communication services. Two-party or multi-party video communication services, such as video telephony, video conferencing, and multimedia services for mobile terminals, place stringent requirements on the transmission of multimedia data streams and on quality of service. They require not only better real-time network transmission but, equivalently, more efficient video compression coding.
In view of these demands on media communication, the ITU Telecommunication Standardization Sector (ITU-T), after developing video compression standards such as H.261, H.263, and H.263+, officially released the H.264 standard in 2003. It is a high-efficiency compression coding standard jointly developed by ITU-T and the Moving Picture Experts Group (MPEG) of the International Organization for Standardization (ISO) to meet the network media transmission and communication requirements of the new stage. It is also the main content of Part 10 of the MPEG-4 standard.
The purpose of the H.264 standard is to improve video coding efficiency and network adaptability more effectively. Owing to its advantages, the H.264 video compression coding standard quickly became the mainstream standard in multimedia communication. A large number of H.264-based real-time multimedia communication products (such as video conferencing systems, videophones, and 3G mobile communication terminals) and network streaming products have appeared, and support for H.264 has become a key factor in product competitiveness in this market. It can be predicted that, with the formal promulgation and wide use of H.264, multimedia communication based on IP networks and on 3G and post-3G wireless networks will enter a new stage of rapid development.
As noted above, multimedia communication requires not only efficient media compression coding but also real-time network transmission. At present, multimedia streams are almost always carried using the Real-time Transport Protocol (RTP) and its control protocol, the Real-time Transport Control Protocol (RTCP). RTP is a transport protocol for multimedia data streams on the Internet, published by the Internet Engineering Task Force (IETF). RTP is defined to work in one-to-one or one-to-many transmission scenarios, and its purpose is to provide timing information and stream synchronization. RTP typically runs over the User Datagram Protocol (UDP), but it can also run over other protocols such as the Transmission Control Protocol (TCP) or Asynchronous Transfer Mode (ATM).
RTP itself only ensures the transport of real-time data; it offers no reliable mechanism for in-order delivery of packets, nor flow control or congestion control, and relies on RTCP for these services. RTCP manages transmission quality and exchanges control information between the current application processes. During an RTP session, each participant periodically sends RTCP packets containing statistics such as the number of packets sent and the number of packets lost, so the server can use this information to change the transmission rate dynamically, or even change the payload type. Used together, RTP and RTCP optimize transmission efficiency with effective feedback and minimal overhead, and are therefore well suited to carrying real-time data over the network.
H.264 multimedia data carried over an IP network is likewise based on UDP with RTP above it. RTP is structurally applicable to different media data types, but for each higher-layer protocol or media compression coding standard used in multimedia communication (such as H.261, H.263, MPEG-1/-2/-4, and MP3), the IETF publishes a specification of the RTP payload format for that protocol, which defines in detail how RTP packets are encapsulated and is optimized for that protocol. For H.264, the corresponding IETF standard is RFC 3984, "RTP Payload Format for H.264 Video". It is currently the main standard for carrying H.264 video streams over IP networks and is widely used. In video communication, the products of all mainstream vendors are based on RFC 3984, which is currently the only H.264/RTP transmission method.
In fact, the key difference between H.264 and earlier video compression coding standards is that H.264 defines a new layer, the Network Abstraction Layer (NAL), an abstract service capability layer that exposes underlying service capabilities through a standard interface while hiding the differences between underlying networks. H.264 defines the NAL to separate its Video Coding Layer (VCL) from the specific network transport protocol layer below and make the two independent, which gives greater application flexibility. Earlier ITU-T video compression coding standards such as H.261 and H.263/H.263+/H.263++ have no such layer. How to exploit the advantages of H.264 when the NAL and RTP work together, so that RTP carries H.264 more efficiently and practically, is therefore worth studying.
The method proposed in RFC 3984 for carrying H.264 NAL data over RTP is the current mainstream transmission method. Building on the RTP protocol (RFC 3550), it encapsulates NAL data in the RTP payload. The NAL sits between the VCL and RTP and specifies that the video bitstream be divided, according to defined rules and structures, into a series of NAL units (NALUs). RFC 3984 defines how NALUs are encapsulated in the RTP payload. The RTP frame format and the prior-art NALU encapsulation method are briefly introduced in turn below.
RTP was designed mainly for real-time multimedia conferencing, continuous data storage, interactive distributed simulation, and control and measurement applications. RTP is usually carried over UDP to make use of its multiplexing and checksum functions. If the underlying network provides multicast distribution, RTP supports delivery to multiple destinations. The functions RTP provides include payload type identification, sequence numbering, timestamping, and delivery monitoring.
When carrying H.264 video, RTP packs H.264 NALUs into a stream of RTP packets. RFC 3984 mainly defines the NALU and, on that basis, gives the format for encapsulating H.264 NAL data in RTP. This RTP encapsulation format for NALUs is shown in Figure 2.
Figure 1 shows how a NALU is encapsulated in the RTP payload. The first byte is the NALU header, followed by the NALU data. Multiple NALUs are placed end to end in the payload of the RTP packet, optionally followed by RTP padding. The padding is part of the RTP packet format and lets the length of the RTP packet meet a specific requirement (such as reaching a fixed length); the optional padding data is generally filled with zeros.
The NALU header is the first byte, also called an octet. It has three fields, whose meanings and full names are as follows:
The F field is the forbidden bit (forbidden_zero_bit), 1 bit wide. It flags conditions such as syntax errors and is set to 1 on a syntax violation. When the network detects a bit error in the unit, it can set this bit to 1 so that the receiver discards the unit; this mainly serves to accommodate different kinds of network environments (such as combined wired and wireless environments);
The NRI field is the NAL reference indicator (nal_ref_idc), 2 bits wide. It indicates the importance of the NALU data. A value of 00 means that the content of the NALU is not used to reconstruct reference pictures for inter prediction; a non-zero value means that the current NALU is a slice belonging to a reference frame or other important data such as a sequence parameter set (SPS) or picture parameter set (PPS). The larger the value, the more important the current NALU;
The Type field is the NALU type (nal_unit_type), 5 bits wide, allowing 32 NALU types. The correspondence between its values and the specific types is given in Table 1.
Table 1. Correspondence between Type field values in the NALU header and NALU content types

Type value    NALU content type
0             Unspecified
1             Coded slice of a non-IDR picture
2             Coded slice data partition A
3             Coded slice data partition B
4             Coded slice data partition C
5             Coded slice of an IDR picture
6             SEI (supplemental enhancement information)
7             SPS (sequence parameter set)
8             PPS (picture parameter set)
9             Access unit delimiter
10            End of sequence
11            End of stream
12            Filler data
13-23         Reserved
24-31         Unspecified

As can be seen, the one-byte NALU header mainly conveys the validity and importance level of the NALU, from which the importance of the data carried by RTP can be determined.
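By way of illustration only, the three header fields described above can be extracted from the first byte of a NALU as in the following Python sketch. The bit positions follow the field widths given above (F in the most significant bit, then the 2-bit reference indicator, then the 5-bit type); the names `NaluHeader` and `parse_nalu_header` are chosen for this example and do not come from the source.

```python
from typing import NamedTuple


class NaluHeader(NamedTuple):
    forbidden_zero_bit: int  # F field, 1 bit
    nal_ref_idc: int         # reference indicator, 2 bits (importance)
    nal_unit_type: int       # Type field, 5 bits (see Table 1)


def parse_nalu_header(first_byte: int) -> NaluHeader:
    """Split the one-byte NALU header into its three fields."""
    if not 0 <= first_byte <= 0xFF:
        raise ValueError("the NALU header is a single byte")
    return NaluHeader(
        forbidden_zero_bit=(first_byte >> 7) & 0x01,
        nal_ref_idc=(first_byte >> 5) & 0x03,
        nal_unit_type=first_byte & 0x1F,
    )


# 0x67 is 0b0_11_00111: F = 0, reference indicator = 3, Type = 7 (SPS).
header = parse_nalu_header(0x67)
```

A sender could use `nal_ref_idc` and `nal_unit_type` obtained this way as inputs when judging how strongly a unit should be protected.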
With the H.264/RTP transport structure covered, the other topic closely related to the present invention is the error resilience mechanism for multimedia network transmission. Error resilience in video network transmission and its technical background are briefly introduced below.
H.264 video is expected to be the main standard for future multimedia communication, whose applications will mainly run on packet-switched networks, typified by IP, and on wireless networks. Neither kind of network offers a strong quality of service (QoS) guarantee, so video carried over them inevitably suffers packet loss caused by transmission errors, which degrades communication quality. Because IP networks provide only best-effort delivery, they cannot guarantee QoS for transmitted video, particularly for highly compressed H.264 bitstreams. This shows in three ways: packet loss, delay, and delay jitter. Of these, packet loss has the greatest effect on the quality of the recovered video. Because the H.264 compression algorithm uses motion estimation and motion compensation, a lost packet affects not only the picture currently being decoded but also subsequently decoded pictures, which is known as error propagation. Error propagation severely degrades the recovered video quality, and it can be avoided completely only when the encoder and decoder combat errors jointly.
Error resilience is the ability of a transmission mechanism to prevent errors or to correct them, to some extent, after they occur: errors within a certain severity can be fully corrected, while errors beyond it can only be partially corrected. In the pervasive multimedia communication environments of the future, whether a video transmission mechanism is error resilient will be critical.
Many error resilience mechanisms exist, such as forward error correction (FEC), automatic repeat request (ARQ), error concealment, joint source-channel coding (JSCC), interleaving, and elimination of error propagation. For H.264 video carried over packet networks, FEC is a very practical and effective technique. It applies error correction codes to the data to be protected, in essence adding data redundancy to increase resistance to errors.
The main error that packets suffer on a network is packet loss, which error correction coding theory calls an erasure error. Error correction codes aimed at erasure errors form a large class called erasure codes. An erasure code divides the data stream, segment by segment and in order, into units of equal size, also called data nodes; for convenience, assume there are n data nodes. Check nodes are then computed from these data nodes according to certain mathematical rules. To strengthen protection, a second layer of check nodes can be computed from these check nodes using the same or different rules, and so on, producing a third layer, a fourth layer, and further layers up to an Nth layer of check nodes.
In general, when multiple layers of check nodes are used, the number of nodes in each layer decreases relative to the previous layer according to some rule (most commonly a geometric progression), forming a multilayer node structure that shrinks layer by layer. It can be pictured as a pyramid rotated 90 degrees to the right: the data node layer is on the far left, followed to the right by the first layer of check nodes, the second layer, and so on up to the Nth layer.
One class of erasure codes has a very important property: the time complexity of processing is linear in the number of data nodes n, a property called linear time. Many other erasure codes, such as the well-known Reed-Solomon code, have much higher time complexity, on the order of n * log^2(n) * log(log n). Linear-time erasure codes are therefore much better suited to real-time communication.
The Tornado erasure code (hereinafter the Tornado code) is a new type of erasure code that appeared around 1998. It has a simple structure, runs efficiently thanks to its linear-time property, and offers strong protection. It has performed well in practice and is now fairly widely used. According to recent ITU-T activity, SG16 is considering whether to standardize error control code techniques, mainly to protect video and audio transmission over networks; the Tornado code and its variants are likely to be important techniques in that work.
In a Tornado code, multiple layers of check nodes are generated layer by layer from the data nodes. Both check nodes and data nodes are sent by the sender to the receiver over the network. If some nodes are lost in transit, they can be fully recovered from a sufficient number of nodes in the layers below, because each upper-layer node takes part in generating lower-layer nodes and its information is therefore contained in those nodes and in the layers beneath them. If each node is a packet, lost packets can be fully recovered from the other packets that were received correctly. Let the number of data nodes be n and the number of generated check nodes be l. The code rate and redundancy rate of the erasure code are then defined as r = n/(n+l) and 1 - r = l/(n+l), respectively. Other things being equal (protection capability, added delay, and so on), the higher the code rate (and so, necessarily, the lower the redundancy rate), the more efficient the erasure code.
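As a small worked illustration of the two definitions above, the following Python sketch computes the code rate and redundancy rate for given node counts. The function names and the sample values n = 6 and l = 2 are chosen for this example only.

```python
def code_rate(n: int, l: int) -> float:
    """Code rate r = n / (n + l) for n data nodes and l check nodes."""
    if n <= 0 or l < 0:
        raise ValueError("need n > 0 data nodes and l >= 0 check nodes")
    return n / (n + l)


def redundancy_rate(n: int, l: int) -> float:
    """Redundancy rate 1 - r = l / (n + l)."""
    if n <= 0 or l < 0:
        raise ValueError("need n > 0 data nodes and l >= 0 check nodes")
    return l / (n + l)


# Six data nodes protected by two check nodes.
r = code_rate(6, 2)
redundancy = redundancy_rate(6, 2)
```

With these sample values, three quarters of the transmitted nodes carry data and one quarter is redundancy; adding check nodes lowers the code rate in exchange for stronger protection.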
Figure 3 shows a typical relationship between the data nodes of a Tornado code and the check nodes of each layer. The lines between nodes in the figure are called edges; an edge means that the node on its left takes part in computing the node on its right, so adjacent layers are related many-to-many. The operation most often used to generate a Tornado code is exclusive OR (XOR), because it makes recovery convenient: if any one node in an XOR relation is lost, it can be recovered from all the remaining nodes. Because the last layer of check nodes shrinks by a different ratio, it is generally computed with a conventional error correction code such as a Reed-Solomon code.
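The recovery property of XOR mentioned above can be sketched in Python as follows. This is a minimal illustration with a single check node connected to every data node; it is not the full multilayer Tornado construction, and the function names are chosen for this example.

```python
from functools import reduce
from typing import List, Optional


def xor_nodes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length nodes."""
    if len(a) != len(b):
        raise ValueError("nodes must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def make_check_node(data_nodes: List[bytes]) -> bytes:
    """XOR together all data nodes connected to one check node."""
    return reduce(xor_nodes, data_nodes)


def recover(received: List[Optional[bytes]], check_node: bytes) -> List[bytes]:
    """Restore at most one lost data node (marked None) from the check node."""
    lost = [i for i, node in enumerate(received) if node is None]
    if len(lost) > 1:
        raise ValueError("one XOR check node can repair only one lost node")
    repaired = list(received)
    if lost:
        survivors = [node for node in received if node is not None]
        # XOR of the check node with every surviving node yields the lost one.
        repaired[lost[0]] = reduce(xor_nodes, survivors, check_node)
    return repaired


nodes = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
check = make_check_node(nodes)
restored = recover([nodes[0], None, nodes[2]], check)
```

Here the second data node is dropped and rebuilt from the two surviving data nodes and the check node. Repairing more than one loss per group requires additional check nodes, which is what the layered graph structure provides.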
Erasure codes are a broad family, of which the Tornado code is only one typical example; others include Reed-Solomon (RS) codes and low-density parity-check (LDPC) codes.
An important performance measure of an erasure code is its error correction capability (also called protection capability). It is expressed directly as the maximum number of lost packets that can still be fully corrected (for a given total number of packets) or, when losses exceed that maximum, as the percentage of packets that can be correctly recovered. In general, other things being equal, the higher the protection capability, the higher the redundancy rate.
Protection capability applies not only to erasure codes; more broadly, every FEC code can be measured by it. In video data, some data is relatively important, such as the structural parameters of the video sequence, the structural parameters of pictures, and header information, while other data, such as picture content data, is relatively less important. When FEC is used, relatively important data is protected with a stronger code and relatively unimportant data with a weaker one. This strikes a balance between protection capability and efficiency; protection capability cannot be pursued exclusively, because that leads to high redundancy and lowers efficiency. Applying FEC of different protection capability according to the relative importance of the data is called unequal protection (UEP), and it makes QoS guarantees for video communication services easier to achieve.
The RTP protocol currently used to carry video multimedia data does not itself support error resilience, which must be provided by a higher application layer. In the prior art, error resilience for network transmission of video data such as H.264 is generally achieved with erasure code protection. Taking H.264 as an example, the prior-art scheme works as follows.
At the H.264 NALU level, the sender applies an erasure code directly to the NALU data units, encapsulates the result (data nodes and check nodes) directly in RTP packets, and then transmits them.
After receiving the RTP packets, the receiver decapsulates them and extracts the data nodes and check nodes. If packets are lost, that is, one or more RTP packets are missing, the receiver determines from which data nodes or check nodes were encapsulated in the lost packets whether the lost nodes can be fully or partially recovered from the correctly received data nodes and check nodes, and then performs the recovery.
Error resilience mechanisms other than erasure codes are of course also used, but erasure codes provide the most efficient error resilience for protecting H.264 data.
In short, the prior art applies erasure coding to multimedia data such as NALUs at a higher layer, transmits the result over RTP, and performs the corresponding erasure decoding at the receiver. Note that the two ends generally use signaling exchanged before transmission to negotiate which FEC scheme to use and its parameter settings, for example over protocol channels such as H.323/H.245.
In practice, the error resilience mechanisms of the prior-art schemes are all implemented above RTP. The two ends must use other logical channels to negotiate or announce the erasure code type and its parameter settings, which seriously reduces multimedia transmission efficiency and consumes network bandwidth;
The error resilience mechanism is transparent to the RTP transport layer. The RTP layer therefore cannot know the structure of the coded multimedia data produced by the FEC scheme, cannot encapsulate and decapsulate it in a targeted way, and cannot simplify the transport layering, which lengthens network transmission delay and complicates the transmission equipment;
Once the two ends have negotiated an FEC scheme, they transmit multimedia data with that scheme throughout. Unequal protection therefore cannot be applied to data of differing importance or to network conditions that vary over time, and the error resilience mechanism cannot be used to guarantee QoS.
The prior art implements error resilience mechanisms such as FEC at a higher layer and makes no use of the RTP protocol or its encapsulation. The two ends therefore have to set up a separate logical channel or use a specific application-layer protocol, such as H.245 in the H.323 protocol suite, to negotiate or announce the FEC code type, structural parameters, and related information. The RTP layer deals with no error resilience details, and nothing specifies how RTP packets should encapsulate the data nodes and check nodes produced by FEC protection. Nor is the FEC scheme chosen according to network conditions and the importance of the multimedia data; there is no mechanism for protecting data of differing relative importance with FEC of differing protection capability, so unequal protection cannot be achieved.

Summary of the Invention
In view of this, the main object of the present invention is to provide a method for real-time network transmission of multimedia data that supports error resilience, so that the error resilience mechanism for real-time transmission of multimedia data can be implemented at the transport protocol level. A further object of the invention is to implement unequal protection and graded protection for different data and network conditions.
A method for real-time network transmission of multimedia data supporting error resilience, provided according to the present invention, comprises:
a sender selects a forward error correction coding mode and performs forward error correction coding on the multimedia data;
the sender encapsulates the coded multimedia data using an error-resilient real-time transport protocol, carries information about the forward error correction coding mode in the header of the error-resilient real-time transport protocol packet, and sends the packet to a receiver;
the receiver decapsulates the received error-resilient real-time transport protocol packet and extracts the information about the forward error correction coding mode from its header;
when an error-resilient real-time transport protocol packet corresponding to a data node is lost in transit, the receiver selects the forward error correction decoding mode according to the information about the forward error correction coding mode, performs forward error correction decoding, and fully or partially recovers the lost multimedia data.
The multimedia data after forward error correction coding comprises data nodes and check nodes.
The sender selects the forward error correction coding mode according to the current network transmission conditions and/or the quality of service level of the multimedia data to be sent, where the quality of service level is determined by the relative importance of the data.
The header of the error-resilient real-time transport protocol packet contains:
a forward error correction coding type field, indicating the type of forward error correction code used;
a forward error correction coding subtype field, indicating the parameter settings of the forward error correction coding mode;
a data packet length field, indicating the length of the nodes obtained by forward error correction coding of the multimedia data;
a data packet number field, indicating the number of data nodes carried by the error-resilient real-time transport protocol packet.
Preferably, when the multimedia data is H.264 network abstraction layer units:
the sender divides at least one H.264 network abstraction layer unit into at least one data node of equal length and performs forward error correction coding on the data nodes to obtain at least one check node;
the sender groups the data nodes and check nodes and encapsulates them in at least one error-resilient real-time transport protocol packet for transmission;
after receiving the error-resilient real-time transport protocol packet, the receiver decapsulates it to obtain the data nodes and check nodes;
if data nodes were lost in transit, the receiver performs forward error correction decoding on the data nodes using the check nodes and reassembles the H.264 network abstraction layer units.
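One possible way to divide network abstraction layer units into equal-length data nodes and reassemble them is sketched below in Python. The source does not state how unit boundaries are preserved, so the 2-byte length prefix in front of each unit and the zero padding of the last node are assumptions made for this example, as are the function names.

```python
from typing import List


def split_into_data_nodes(nalus: List[bytes], node_len: int) -> List[bytes]:
    """Concatenate length-prefixed NALUs and cut them into equal-length nodes."""
    if node_len <= 0:
        raise ValueError("node_len must be positive")
    if any(len(nalu) == 0 for nalu in nalus):
        raise ValueError("a NALU has at least its one-byte header")
    # Assumed framing: 2-byte big-endian length before each NALU (max 65535 bytes).
    stream = b"".join(len(nalu).to_bytes(2, "big") + nalu for nalu in nalus)
    stream += b"\x00" * (-len(stream) % node_len)  # zero-pad the last node
    return [stream[i:i + node_len] for i in range(0, len(stream), node_len)]


def join_data_nodes(nodes: List[bytes]) -> List[bytes]:
    """Rebuild the NALUs from the complete, ordered list of data nodes."""
    stream = b"".join(nodes)
    nalus = []
    pos = 0
    while pos + 2 <= len(stream):
        size = int.from_bytes(stream[pos:pos + 2], "big")
        if size == 0:  # a zero length marks the start of the padding
            break
        pos += 2
        nalus.append(stream[pos:pos + size])
        pos += size
    return nalus


units = [b"\x67abc", b"\x65" + bytes(10)]
data_nodes = split_into_data_nodes(units, node_len=8)
rebuilt = join_data_nodes(data_nodes)
```

Equal-length nodes are what allow a byte-wise code such as XOR to be applied across them; any lost data node would first be restored by forward error correction decoding before `join_data_nodes` is called.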
Preferably, before transmission starts, the method comprises:
for each forward error correction code type, the sender and the receiver negotiate the correspondence between values of the forward error correction coding subtype field and the forward error correction code parameter settings they indicate.
The sender and the receiver each build a lookup table from the correspondence indicated by the forward error correction coding subtype field, used to find the forward error correction coding or decoding module corresponding to a given forward error correction coding type field and forward error correction coding subtype field;
the sender invokes the corresponding forward error correction coding module to perform forward error correction coding;
the receiver invokes the corresponding forward error correction decoding module to perform forward error correction decoding.
The sender evaluates the relative importance of the data from the network abstraction layer reference identifier field and/or the network abstraction layer unit type field in the header of the H.264 network abstraction layer unit, determines the quality of service level, selects the corresponding forward error correction coding mode, and sets the forward error correction coding type field and forward error correction coding subtype field.
The sender evaluates the network transmission conditions from the transmission reports fed back by the receiver, selects the forward error correction coding mode accordingly, and sets the forward error correction coding type field and forward error correction coding subtype field.
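The lookup table and the selection step described above might be sketched in Python as follows. Every concrete value here is hypothetical: the source assigns no numbers to the type and subtype fields, and the 5 percent loss threshold, the group sizes, and the rule combining importance with reported loss are assumptions made purely for illustration.

```python
from typing import Dict, Optional, Tuple

# Hypothetical field values; the source does not assign concrete numbers.
FEC_NONE = 0
FEC_XOR_PARITY = 1

# Lookup table keyed by (FEC type, FEC subtype). Each entry gives the number
# of data nodes protected by one check node; smaller groups add redundancy.
FEC_TABLE: Dict[Tuple[int, int], Optional[int]] = {
    (FEC_NONE, 0): None,
    (FEC_XOR_PARITY, 0): 8,
    (FEC_XOR_PARITY, 1): 4,
    (FEC_XOR_PARITY, 2): 2,
}


def choose_fec(nal_ref_idc: int, loss_fraction: float) -> Tuple[int, int]:
    """Pick (type, subtype) from data importance and reported packet loss."""
    if not 0 <= nal_ref_idc <= 3:
        raise ValueError("nal_ref_idc is a 2-bit value")
    if not 0.0 <= loss_fraction <= 1.0:
        raise ValueError("loss_fraction must lie in [0, 1]")
    # Assumed rule: importance sets the base level, heavy loss raises it by one.
    level = nal_ref_idc + (1 if loss_fraction >= 0.05 else 0)
    if level == 0:
        return (FEC_NONE, 0)
    return (FEC_XOR_PARITY, min(level, 3) - 1)


selected = choose_fec(nal_ref_idc=2, loss_fraction=0.0)
group_size = FEC_TABLE[selected]
```

Because both ends hold the same table, the receiver can map the type and subtype values carried in a packet header back to the matching decoder settings without any further signaling.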
Preferably, the forward error correction coding type field follows the contributing source (CSRC) identifier list;
the forward error correction coding subtype field follows the forward error correction coding type field;
the data packet length field follows the forward error correction coding subtype field;
the data packet number field follows the data packet length field.
优选地, 所述前向纠错编码方式使用改进的 "Tornado" 纠删码; 所述改进的 "Tornado" 纠删码对于一组所述数据节点仅生成一层所 述校验节点。  Preferably, the forward error correction coding mode uses an improved "Tornado" erasure code; the improved "Tornado" erasure code generates only one layer of the check node for a set of said data nodes.
The main differences from the prior art are as follows. First, the technical solution of the present invention provides, on the basis of the existing RTP, an ERRTP transport-layer encapsulation format that can carry information about the forward error correction coding scheme, so that multimedia data transmitted over ERRTP is tagged with its forward error correction coding scheme information, which integrates the error resilience mechanism into the transport layer;
Second, the sending end can choose among several alternative forward error correction coding schemes according to factors such as current network conditions and the importance level of the multimedia data, thereby achieving unequal and graded protection and balancing protection capability against transmission efficiency;
Finally, for erasure code protection of H.264 NALUs, methods for generating, transmitting, encapsulating and decapsulating data nodes and check nodes are given.
Implementing the error resilience mechanism in the transport layer greatly simplifies the error-resilient transmission structure and saves network bandwidth; unequal protection balances protection capability against transmission efficiency and facilitates QoS guarantees for multimedia transmission; the specific transmission scheme for H.264 data can greatly improve the performance of, and user satisfaction with, H.264-based multimedia communication products such as videoconferencing and videophones on IP networks.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of the encapsulation format of NALU data in an RTP packet payload;
FIG. 2 is a schematic diagram of the header structure of an RTP data packet;
FIG. 3 is a schematic diagram of the principle of the Tornado erasure code;
FIG. 4 is a schematic diagram of the ERRTP packet header structure according to the first embodiment of the present invention;
FIG. 5 is a flow chart of the H.264 multimedia data transmission method according to the second embodiment of the present invention;
FIG. 6 is a schematic diagram of the H.264 NALU partitioning, encoding and decoding process according to the second embodiment of the present invention.
Detailed Description of the Embodiments
To make the objects, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings.
To address the problems of the prior art, the present invention proposes an improved RTP protocol that supports error resilience. It integrates the error resilience mechanism into the transport-layer protocol, which simplifies the transmission structure and reduces complexity, and also makes the error resilience mechanism more flexible and transmission more reliable. Because of its error resilience, the improved RTP protocol is called the Error Resilience Real-time Transport Protocol (ERRTP or ER2TP). The main difference between ERRTP and RTP is that the ERRTP packet header is extended to carry information about the forward error correction (FEC) encoding and decoding scheme, such as the FEC type, protection capability and coding parameters.
On the basis of ERRTP, the present invention readily implements unequal protection. Several protection measures with different protection capabilities are made available; the sending end collects information such as the network condition and the importance of the multimedia data and then selects a suitable protection measure accordingly, thereby achieving unequal protection and balancing protection capability against transmission efficiency. Because every ERRTP packet carries the FEC information it uses, the sending end only needs to write the information of the selected scheme into the ERRTP header, and the receiving end can perform correct recovery or error correction based on it.
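The selection step described above can be sketched as follows. This is only an illustration under stated assumptions: the thresholds (`nal_ref_idc >= 2`, loss rate above 3%) and the names `select_fec` and `FecChoice` are hypothetical, and the type and subtype codes reuse the example Tornado values given later in this description. The bit position of `nal_ref_idc` follows the H.264 NAL unit header layout.

```python
from dataclasses import dataclass

# Example codes taken from the Tornado illustration in this description.
FEC_TYPE_TORNADO = 0b0010
SUBTYPE_TORNADO_24_20 = 0b000000010  # 20 data nodes, 4 check nodes
SUBTYPE_TORNADO_30_20 = 0b000000011  # 20 data nodes, 10 check nodes


@dataclass(frozen=True)
class FecChoice:
    """FEC Type and FEC Subtype values to write into the ERRTP header."""
    fec_type: int
    fec_subtype: int


def nal_ref_idc(nal_header_byte: int) -> int:
    """Return the 2-bit nal_ref_idc field of an H.264 NAL unit header byte."""
    return (nal_header_byte >> 5) & 0x03


def select_fec(nal_header_byte: int, loss_rate: float) -> FecChoice:
    """Choose stronger protection for important data or a lossy network.

    Both thresholds are hypothetical; a real sender would tune them.
    """
    important = nal_ref_idc(nal_header_byte) >= 2
    lossy = loss_rate > 0.03
    if important or lossy:
        return FecChoice(FEC_TYPE_TORNADO, SUBTYPE_TORNADO_30_20)
    return FecChoice(FEC_TYPE_TORNADO, SUBTYPE_TORNADO_24_20)
```

For instance, `select_fec(0x65, 0.01)` (a NAL header byte with `nal_ref_idc` equal to 3) picks the stronger (30, 20) subtype even on a clean network, while `select_fec(0x01, 0.01)` picks the lighter (24, 20) subtype.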
Finally, for the transmission of H.264 NALU data, a specific implementation based on erasure code protection is given, including the steps of partitioning, generating, encapsulating and decapsulating data nodes and check nodes. A run of consecutive NALUs is divided together into several data nodes of equal length, check nodes are generated with a Tornado code, all the nodes are distributed over several ERRTP packets for transmission, and the receiving end performs the inverse process.
To aid understanding of the technical solution, the format of the RTP packet is briefly introduced here. The basic RTP header occupies 12 bytes (the minimum), while the IP and UDP headers occupy 20 bytes and 8 bytes respectively. An RTP packet encapsulated in a UDP packet and then in an IP packet therefore carries 12 + 8 + 20 = 40 bytes of header in total. The detailed structure of the RTP header is shown in FIG. 2.
As shown in FIG. 2, the RTP header contains, from front to back: the 1st byte (byte 0), which holds fields describing the header structure itself; the 2nd byte (byte 1), which defines the payload type; the 3rd and 4th bytes (bytes 2 and 3), which hold the sequence number; the 5th to 8th bytes (bytes 4 to 7), which hold the timestamp; the 9th to 12th bytes (bytes 8 to 11), which hold the synchronization source identifier (SSRC ID); and finally the list of contributing source identifiers (CSRC IDs), whose number is variable. Note that in this description the first byte is labeled byte 0, and so on.
The first 12 bytes appear in every type of RTP packet, whereas other header data, such as the contributing source identifiers, are present only when inserted by a mixer. The CSRC is therefore generally used when media are mixed: in a multi-party conference, for example, audio must be mixed, and video can use the same method to provide a multi-picture function. The synchronization source identifier SSRC is in effect the identifier of the media stream being carried.
The meaning and full name of each field are as follows:
The V field is the version, 2 bits. The version currently in use is 2, so V=2 is set; V=1 denotes an earlier RTP version, and V=0 denotes the original predecessor of RTP, which was used in the voice-over-IP (VoIP) communication system on the early Mbone network and later evolved into RTP. V=3 is not yet defined and can therefore be used by the present invention;
The P field is the padding flag, 1 bit. If P is set, the end of the packet contains one or more padding bytes, which are not part of the payload;
The X field is the extension flag, 1 bit. If X is set, the RTP header must be followed by one variable-length header extension (if there is a CSRC list, the header extension follows it). It is reserved mainly for application environments in which the header fields are insufficient. The header extension contains a 16-bit length field that counts the number of 32-bit words in the extension. The first 16 bits of the header extension are left open for distinguishing identifiers or parameters, and their format is defined by the specific profile specification. The format of the header extension is described in detail in section 5.3.1 of RFC 3550 and is not reproduced here;
The CC field is the CSRC count, 4 bits. It gives the number of CSRC identifiers at the end of the header; from the CC field the receiver can determine the length of the CSRC ID list that follows the header;
The M field is the marker bit, 1 bit. Its interpretation is defined in a specific profile, and it allows important events in the packet stream to be marked. A profile may define additional marker bits or specify that there is no marker bit. Here a profile means the settings of a specific application environment, agreed between the communicating parties and not restricted by the protocol;
The PT field is the payload type, 7 bits. It identifies the format of the RTP payload and determines its interpretation by the application. The marker bit and the payload type together occupy one byte that carries profile-specific information, and this byte may be redefined by a particular profile to suit different needs. A specific application can define a profile, which is a set of static correspondences (agreed in advance by the communicating parties) between PT values and media formats. The relationship between PT values and media formats can also be negotiated dynamically through signaling outside RTP. Within an RTP session, an RTP source may change the PT.
The next field is the sequence number, 16 bits. It is incremented by one for each RTP packet sent, so the receiver can use it to detect packet loss and restore packet order. The initial value for a session can be chosen at random without affecting communication.
The timestamp occupies 32 bits. It reflects the sampling instant of the first byte in the RTP packet. The sampling instant must be derived from a clock that increases monotonically and linearly; the receiver uses it to adjust media playout timing or to synchronize.
The synchronization source SSRC ID occupies 32 bits. Its value can be chosen at random but must be unique within an RTP session, so that it uniquely identifies a media source. If a source changes its source transport address, it must choose a new SSRC identifier.
The contributing source (CSRC) list contains 0 to 15 items as needed, each 32 bits long. The length of the list, that is, the number of CSRC IDs, is given by the 4 bits of the CC field. In fact, the CSRC identifier that identifies a media source is the same value as the SSRC identifier of the corresponding contributing source; it is placed in the SSRC or the CSRC position depending on its role at different receivers. In multi-party communication, the CSRC IDs are inserted by the mixer.
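The field layout described above can be illustrated with a short parser for the fixed RTP header and the CSRC list. This is a minimal sketch: the names `RtpHeader` and `parse_rtp_header` are illustrative, and the header extension (X bit) is not decoded.

```python
import struct
from typing import NamedTuple, Tuple


class RtpHeader(NamedTuple):
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int
    csrc_list: Tuple[int, ...]


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Parse the 12-byte fixed RTP header plus the CSRC list (network byte order)."""
    if len(packet) < 12:
        raise ValueError("the fixed RTP header needs at least 12 bytes")
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    cc = b0 & 0x0F                      # CC: number of CSRC identifiers
    end = 12 + 4 * cc                   # each CSRC identifier is 32 bits
    if len(packet) < end:
        raise ValueError("packet too short for the CSRC list")
    csrc = struct.unpack("!%dI" % cc, packet[12:end])
    return RtpHeader(
        version=b0 >> 6,                # V: 2 bits
        padding=bool(b0 & 0x20),        # P: 1 bit
        extension=bool(b0 & 0x10),      # X: 1 bit
        csrc_count=cc,
        marker=bool(b1 & 0x80),         # M: 1 bit
        payload_type=b1 & 0x7F,         # PT: 7 bits
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
        csrc_list=csrc,
    )
```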
First Embodiment
In this embodiment, the sending and receiving parties implement unequal protection based on ERRTP. The main steps are as follows:
The sending end selects a forward error correction coding scheme, performs erasure encoding on the multimedia data, encapsulates the encoded multimedia data with ERRTP, carries the information about the forward error correction coding scheme in the ERRTP header, and sends the packets to the receiving end;
The receiving end decapsulates the received ERRTP packets, extracts the information about the forward error correction coding scheme from the ERRTP header, selects the forward error correction scheme accordingly, performs erasure decoding, and obtains the multimedia data.
Unequal protection lies in the fact that the sending end selects the forward error correction coding scheme according to the current network transmission condition and/or the quality of service level of the multimedia data to be sent.
The specific structure of ERRTP is introduced first, by way of an example of the ERRTP header structure. FIG. 4 is a schematic diagram of the ERRTP header structure according to the first embodiment of the present invention. As the figure shows, the version field V takes the value 3 to denote the ERRTP protocol and distinguish it from the conventional RTP protocol (V=2). The header extension, that is, the end of the header, carries the fields describing the forward error correction encoding and decoding scheme. In this example they are: the forward error correction coding type field, the forward error correction coding parameter (subtype) field, the data packet length field and the data packet number field.
The forward error correction coding type field indicates the type of erasure code used by the forward error correction coding scheme. It is also called the FEC Type field and occupies 4 bits, so it can represent 16 different FEC types, which is sufficient in practice. The types defined here are major types; they are further subdivided into schemes called subtypes. Examples of major types in practice: 0010 denotes the Tornado code and 0011 denotes the RS code. The communicating parties must agree in advance on a look-up table (LUT) of the correspondence between FEC coding types and type codes, called FECTypeLUT.
The forward error correction coding subtype field indicates the parameter settings of the forward error correction coding scheme. Each type of FEC code can only be applied once its parameters are fixed, and this field specifies those parameters. Because space in the ERRTP header is limited, the specific parameters and rules of every FEC coding scheme cannot be listed one by one; the first embodiment of the present invention therefore uses the concept of a subtype to indicate one of several alternative parameter settings. The field is also called the FEC Subtype field and occupies 9 bits. It denotes the subtypes into which each major type defined in FECTypeLUT is further divided.
The data packet length field indicates the length of a data node after the forward error correction coding scheme has erasure-encoded the multimedia data. It is called the Data Length field and occupies 11 bits. Each data packet must be shorter than the network maximum transmission unit (MTU); for wired channels the MTU is currently below 1500 (0x5DC) bytes, and for wireless channels the MTU figure given is 100 bytes, so 11 bits are enough to hold the packet length.
The data packet number field indicates the number of data nodes carried by the ERRTP packet. It is also called the Packet Number field and occupies 8 bits. For example, when several NALUs have been FEC-encoded and the resulting nodes are distributed over multiple ERRTP packets, it gives the number of data nodes carried in each ERRTP packet.
With these fields, the decoding end or a network node can check the received data packets according to the FEC code type and the check type indicated by the fields, and recover lost data packets.
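The four fields above total 4 + 9 + 11 + 8 = 32 bits, so they fit exactly in one 32-bit word. The sketch below packs and unpacks them in the stated order (FEC Type, FEC Subtype, Data Length, Packet Number). Placing the first field in the most significant bits and using network byte order are assumptions made for illustration, and the function names are hypothetical.

```python
import struct
from typing import Tuple


def pack_fec_info(fec_type: int, fec_subtype: int,
                  data_length: int, packet_number: int) -> bytes:
    """Pack the four ERRTP FEC fields into one 32-bit big-endian word."""
    for name, value, bits in (("fec_type", fec_type, 4),
                              ("fec_subtype", fec_subtype, 9),
                              ("data_length", data_length, 11),
                              ("packet_number", packet_number, 8)):
        if not 0 <= value < (1 << bits):
            raise ValueError("%s does not fit in %d bits" % (name, bits))
    word = (fec_type << 28) | (fec_subtype << 19) | (data_length << 8) | packet_number
    return struct.pack("!I", word)


def unpack_fec_info(raw: bytes) -> Tuple[int, int, int, int]:
    """Return (fec_type, fec_subtype, data_length, packet_number)."""
    (word,) = struct.unpack("!I", raw[:4])
    return ((word >> 28) & 0xF,      # FEC Type, 4 bits
            (word >> 19) & 0x1FF,    # FEC Subtype, 9 bits
            (word >> 8) & 0x7FF,     # Data Length, 11 bits
            word & 0xFF)             # Packet Number, 8 bits
```

Under these assumptions, a Tornado packet (type 0010) with subtype 000000010, a node length of 512 bytes and 6 nodes per packet is encoded as the four bytes `20 12 00 06`.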
Note that the 9 bits of the FEC Subtype field mentioned above encode which of the alternative parameter settings is in use. The technical details of this encoding in the first embodiment of the present invention are given below.
First, the sending and receiving parties negotiate the correspondence table for this field. Before transmission begins, the sending end and the receiving end agree, for each FEC major type, on the correspondence between the FEC Subtype values and the parameter settings they indicate for that FEC code, and on the specific parameters of each alternative.
Then both the sending end and the receiving end build a correspondence table from the negotiation result; the table is used to look up, by the FEC Type and FEC Subtype fields, the corresponding FEC coding type or FEC encoding and decoding processing module;
During transmission and reception, the sending end invokes the corresponding erasure encoding processing module to perform erasure encoding, and the receiving end invokes the corresponding erasure decoding processing module to perform erasure decoding.
In practice, the subtype information indicates two things:
A. the generation rule of the FEC code;
B. the protection strength (protection capability).
The generation rule is the rule or algorithm by which the sending end processes the data nodes to produce the check nodes. The receiving end does the reverse: if packets are lost in transit, that is, some nodes are lost, the lost nodes can be recovered, fully or in part, according to the generation rule. The generation rule is therefore essential information that lets both parties operate the FEC mechanism. Each FEC type listed in FECTypeLUT has a different generation rule; within each type, for example the Tornado code, the generation rule of a subtype is further combined with specific generation parameters. For each subtype, then, the generation rule is combined with the generation parameters. For the Tornado code, for example, the generation parameters include: the total number of data nodes, the total number of check nodes, the number of check-node layers, the ratio by which the number of nodes shrinks between successive layers, and the incidence matrices describing the node associations between successive layers (with L layers of check nodes there are L such matrices) or, equivalently, the bipartite graphs describing those associations.
In general, given the same overall generation rule, the generation parameters largely determine the protection strength of a subtype. For the Tornado code, among the generation parameters listed above, the total number of data nodes and the total number of check nodes determine the protection capability to a large extent (strictly speaking, all the generation parameters are needed to determine it completely). In the present invention, for each FEC major type, the parameters with the greatest influence on protection capability are selected as representative generation parameters. Using the representative generation parameters, the subtypes under a major type can be arranged in ascending order of protection capability, from weak to strong, and a LUT called FECSubTypeLUT is built accordingly.
How many subtypes are supported under each major type can be decided by the specific application, by the capabilities of the communicating parties (CPU speed, memory, program complexity and so on) and by their needs. If the communication environment varies greatly and network performance fluctuates over a wide range, more subtypes generally need to be supported; otherwise fewer suffice. The two parties can agree on this through a capability negotiation before communication begins. The negotiation can be carried out through mainstream multimedia communication framework protocols such as H.323 or the Session Initiation Protocol (SIP).
Suppose that, under a given major type, S subtypes need to be distinguished (S ≤ 2^9 − 1) and that there are k representative generation parameters, denoted p_1, p_2, ..., p_k. Table 2 gives an example of the correspondence; in the table, the superscript identifies the subtype under the FEC major type and the subscript identifies the parameter.
Table 2  Correspondence between FEC Subtype values and parameter settings
FEC Subtype         FEC coding subtype (parameter setting)
000000000           FEC subtype 0 (p^0_1, p^0_2, ..., p^0_k)
000000001           FEC subtype 1 (p^1_1, p^1_2, ..., p^1_k)
000000010           FEC subtype 2 (p^2_1, p^2_2, ..., p^2_k)
000000011           FEC subtype 3 (p^3_1, p^3_2, ..., p^3_k)
...
S (S ≤ 2^9 − 1)     FEC subtype S (p^S_1, p^S_2, ..., p^S_k)
For example, for the Tornado code the correspondence can be set as: 000000010 to (24, 20) (20 data nodes, 4 check nodes); 000000011 to (30, 20); ...; 111111111 to other.
For a given FEC subtype, a given generation rule combined with the corresponding generation parameters defines exactly one coding scheme, that is, it uniquely determines how check nodes are generated from data nodes and how lost nodes are recovered. A database can be built to store the generation parameters of each major type and subtype, while the generation rule itself is implemented as a hardware or software module. Each major type therefore corresponds to one FEC processing module at the sending end, responsible for generating check nodes, and to one FEC processing module at the receiving end, responsible for recovering nodes. The module for each major type reads the generation parameters of the specific subtype from the generation parameter database before processing. Both communicating parties thus use the FEC Type and FEC Subtype fields to decide which FEC processing module to invoke and which generation parameters to read.
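The two-level lookup described above can be sketched with two dictionaries, one per table. The table contents reuse only the example values given in this description (0010 for Tornado, 0011 for RS, and the two Tornado subtypes); the dictionary layout and the function name `lookup_fec_scheme` are illustrative assumptions, not a prescribed data structure.

```python
from typing import Dict, Tuple

# FECTypeLUT: FEC Type code -> name of the processing module for that major type.
FEC_TYPE_LUT: Dict[int, str] = {
    0b0010: "Tornado",
    0b0011: "RS",
}

# FECSubTypeLUT: (FEC Type, FEC Subtype) -> (total nodes, data nodes),
# listed in ascending order of protection capability.
FEC_SUBTYPE_LUT: Dict[Tuple[int, int], Tuple[int, int]] = {
    (0b0010, 0b000000010): (24, 20),
    (0b0010, 0b000000011): (30, 20),
}


def lookup_fec_scheme(fec_type: int, fec_subtype: int) -> Tuple[str, Dict[str, int]]:
    """Return the module name and generation parameters for a header's FEC fields.

    Raises KeyError if the pair was not agreed during negotiation.
    """
    module = FEC_TYPE_LUT[fec_type]
    total_nodes, data_nodes = FEC_SUBTYPE_LUT[(fec_type, fec_subtype)]
    return module, {"data_nodes": data_nodes,
                    "check_nodes": total_nodes - data_nodes}
```

Both ends would hold identical copies of these tables after negotiation, so the two header fields are enough for the receiver to select the matching decoder and parameters.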
With the development of multimedia communication technology, the H.264 video coding standard has gradually become the mainstream media coding format. On the basis of the first embodiment, the second embodiment of the present invention therefore gives the specific steps for FEC encoding and decoding of an H.264 NALU data stream with ERRTP. The flow is shown in FIG. 5.
Step 501: the sending end combines several H.264 NALUs (S of them, say) into one group to be encoded and transmitted together. The S NALUs are first re-divided into blocks of equal length, M of them, say; these M blocks are the data nodes.
In this step, S H.264 NALUs are taken as one group and concatenated end to end into one large block, which is then divided equally into M data blocks, each K bytes long. If the total number of bytes of the large block (denoted TB) is not divisible by M, rounding up is applied so that each data block is Ceiling(TB/M) bytes long, where Ceiling(x) is the smallest integer not less than x, for any real x. Zero padding may then be needed at the end of some data blocks to bring their length up to Ceiling(TB/M) bytes.
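The partitioning in step 501 can be sketched as follows. The function name `split_into_data_nodes` is illustrative; the sketch concatenates the NALUs, computes K = Ceiling(TB/M), and zero-pads the tail so that every data node is exactly K bytes long.

```python
from typing import List


def split_into_data_nodes(nalus: List[bytes], m: int) -> List[bytes]:
    """Concatenate the NALUs and cut the result into m equal-length data nodes."""
    if m <= 0:
        raise ValueError("m must be positive")
    big_block = b"".join(nalus)             # S NALUs joined end to end
    tb = len(big_block)                     # TB: total number of bytes
    k = -(-tb // m)                         # K = Ceiling(TB / M) using integer arithmetic
    padded = big_block.ljust(m * k, b"\x00")  # zero padding up to M * K bytes
    return [padded[i * k:(i + 1) * k] for i in range(m)]
```

For example, two 5-byte NALUs split into M = 4 nodes give TB = 10 and K = 3, so the last node holds one payload byte followed by two zero bytes.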
Step 502: FEC encoding is applied to the M data nodes to obtain N check nodes. The N check blocks are generated from the M data blocks with the FEC code, using the method described earlier: the FEC Type and FEC Subtype information determines which FEC processing module is invoked to generate the check blocks.
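As an illustration of step 502 for a Tornado-style code with a single layer of check nodes, each check node can be computed as the bytewise XOR of the data nodes it is connected to in the bipartite graph. The graph used in the test is a hypothetical toy example, and the function names are illustrative; the description does not fix a particular graph.

```python
from typing import List, Sequence


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_check_nodes(data_nodes: Sequence[bytes],
                     graph: Sequence[Sequence[int]]) -> List[bytes]:
    """Generate one layer of check nodes.

    graph[j] lists the indices of the data nodes connected to check node j.
    """
    node_len = len(data_nodes[0])
    checks = []
    for neighbours in graph:
        acc = bytes(node_len)               # all-zero accumulator
        for i in neighbours:
            acc = xor_bytes(acc, data_nodes[i])
        checks.append(acc)
    return checks
```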
Step 503: The sender groups all data nodes and check nodes, encapsulates them in ERRTP packets, and sends them. Figure 6 shows the structure of P ERRTP packets carrying the M+N nodes. With the ERRTP header format given in Figure 4, the fields in this example are set as follows:
Type field: FEC Type = 0010, indicating that a Tornado code is used;
Subtype field: chosen by the sender according to the actual situation. For example, FEC Subtype = 000000010 indicates the Tornado(24,20) code, with 20 data nodes, 4 check nodes, and a channel-coding redundancy of 16.7%; this erasure code can fully recover lost packets when the packet loss rate is at most 3%;
Data packet length: Data-Length = K bytes;
Number of data packets: Packet Number = (M+N)/P, the number of nodes (data nodes and check nodes) carried in one ERRTP payload.
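The field settings above involve a little arithmetic that a short Python sketch can make concrete. The names `FEC_PARAMS`, `FecFields`, `build_fec_fields`, and `redundancy` are hypothetical, as are the values K = 256 and P = 6; the sketch also assumes that P divides M+N evenly, which the text does not state. Only the (FEC Type, FEC Subtype) pair and the Tornado(24,20) parameters come from the example.

```python
from dataclasses import dataclass

# (FEC Type, FEC Subtype) -> (total nodes M + N, data nodes M).
# Only the entry named in the text is listed: Tornado(24,20).
FEC_PARAMS = {(0b0010, 0b000000010): (24, 20)}


@dataclass(frozen=True)
class FecFields:
    fec_type: int
    fec_subtype: int
    data_length: int    # K, bytes per node
    packet_number: int  # nodes per ERRTP payload, (M + N) / P


def build_fec_fields(fec_type, fec_subtype, k, p):
    """Fill the four header fields for a group sent in p ERRTP packets."""
    total, _data = FEC_PARAMS[(fec_type, fec_subtype)]
    if p <= 0 or total % p != 0:
        raise ValueError("assumed here: P must divide M + N evenly")
    return FecFields(fec_type, fec_subtype, k, total // p)


def redundancy(fec_type, fec_subtype):
    """Channel-coding redundancy N / (M + N)."""
    total, data = FEC_PARAMS[(fec_type, fec_subtype)]
    return (total - data) / total


fields = build_fec_fields(0b0010, 0b000000010, k=256, p=6)
print(fields.packet_number, f"{redundancy(0b0010, 0b000000010):.1%}")
```

For Tornado(24,20) the redundancy is 4/24, which matches the 16.7% quoted in the text; with the assumed P = 6, each ERRTP payload would carry 4 nodes.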
Step 504: After receiving these ERRTP packets, the receiver decapsulates them to obtain the data nodes and check nodes. The receiver works in cycles of P packets: each time a group of P packets has been received, it starts one round of decoding and recovery. The number of packets in a group is negotiated by the two parties.
Step 505: The receiver performs forward error correction decoding on the data nodes using the check nodes. Each time packet P+1 arrives, the receiver checks whether any of the preceding P packets was lost. If so, it uses the method described earlier: the FEC Type and FEC Subtype information determines which FEC processing module is called to decode and to recover, fully or partially, the lost data.
Step 506: Finally, once the complete set of data nodes has been obtained, the nodes are merged back into one large block, which is divided into the S NALUs following the same layout the sender used.
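Step 506 can be sketched as the inverse of the sender-side partitioning. The text does not say how the receiver learns the NALU boundaries, so this sketch simply assumes the original NALU lengths are available to it; `reassemble_nalus` and the sample values are hypothetical.

```python
def reassemble_nalus(data_nodes, nalu_lengths):
    """Merge recovered data nodes and cut the large block back into NALUs.

    nalu_lengths is assumed to be known to the receiver; how it is conveyed
    is outside the scope of this sketch.
    """
    big_block = b"".join(data_nodes)
    if sum(nalu_lengths) > len(big_block):
        raise ValueError("NALU lengths exceed the recovered data")
    nalus, offset = [], 0
    for length in nalu_lengths:
        nalus.append(big_block[offset:offset + length])
        offset += length
    return nalus  # bytes after the last NALU are zero padding and are dropped


# Made-up sample: four 4-byte data nodes holding NALUs of 4, 3 and 6 bytes.
nodes = [
    bytes([0x65, 1, 2, 3]),
    bytes([0x41, 4, 5, 0x01]),
    bytes([6, 7, 8, 9]),
    bytes([10, 0, 0, 0]),
]
print([n.hex() for n in reassemble_nalus(nodes, [4, 3, 6])])
```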
In practical use, the ERRTP-based packet-loss protection in the above example was found to greatly improve the packet-loss resilience of the video stream while adding less than 17% of codewords. Compared with the RTP payload header structure, only 4 bytes are added, so transmission efficiency is essentially unaffected, and a significant practical benefit is obtained.
As mentioned earlier, another key technical point of the present invention is the implementation of unequal protection. It has two aspects: one is selecting a suitable codec scheme or parameters, i.e. determining the FEC coding type and subtype described above, according to the importance level of the multimedia data; the other is making that selection according to the network conditions at different times. These two aspects are called, respectively, hybrid and alternating use of the various FEC coding schemes. Hybrid means using several FEC subtypes at the same time, mainly to protect data of different importance; alternation means using different FEC subtypes at different times (under different network conditions).
Therefore, the third embodiment of the present invention provides these two unequal protection mechanisms on the basis of the first embodiment. For an H.264 NALU data stream, as described above, the header byte reflects the importance of the data, so the sender can assess the QoS level from the NRI field or the Type field in the NALU header and then select the forward error correction coding scheme, i.e. determine the FEC Type and FEC Subtype fields. As for network conditions, network transport generally has a corresponding monitoring mechanism; through it the sender obtains the transmission reports fed back by the receiver, evaluates the network transmission conditions from them, and then selects the forward error correction coding scheme, i.e. determines the FEC Type and FEC Subtype fields.
An H.264 stream is transmitted or stored in units of NALUs; a NALU consists of NAL header information and a NAL payload. In H.264, different NALU types affect the decoded and reconstructed picture differently. For example, NRI equal to 0 indicates that the NALU holds a slice or slice data partition of a non-reference picture, whose loss does not affect subsequent decoding; a nonzero NRI indicates that the NALU holds a sequence or picture parameter set, or a slice or slice data partition of a reference picture, whose loss seriously affects subsequent decoding.
Therefore, when applying packet protection to an H.264 stream, the H.264 data can be divided into two classes according to the value of NRI or nal_unit_type: relatively important picture data (for example, nal_ref_idc equal to 1) and secondary picture data (for example, nal_ref_idc equal to 0). The important picture data is then protected with a code FEC1 that has higher redundancy and stronger packet-loss resilience, while the secondary picture data can be protected with a code FEC2 that has lower redundancy and weaker packet-loss resilience.
This unequal protection algorithm ensures that the important information is correctly recovered in environments with high packet loss, while techniques such as error concealment and error-propagation prevention are applied to picture information that FEC2 still fails to recover. FEC1 and FEC2 are only generic labels denoting any two subtypes; the two subtypes may belong to the same major type or to different major types.
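The two-class split can be sketched in Python from the one-byte H.264 NAL unit header, which holds forbidden_zero_bit (1 bit), nal_ref_idc (2 bits), and nal_unit_type (5 bits). The function names are hypothetical, and the rule "nonzero nal_ref_idc means important" is one reading of the text, taken here as an assumption; "FEC1" and "FEC2" are the generic labels used above.

```python
def parse_nal_header(first_byte):
    """Return (nal_ref_idc, nal_unit_type) from the first byte of a NALU."""
    nal_ref_idc = (first_byte >> 5) & 0x03   # bits 6..5
    nal_unit_type = first_byte & 0x1F        # bits 4..0
    return nal_ref_idc, nal_unit_type


def choose_protection(nalu):
    """Pick the stronger code for reference data, the weaker one otherwise."""
    if not nalu:
        raise ValueError("empty NALU")
    nal_ref_idc, _ = parse_nal_header(nalu[0])
    return "FEC1" if nal_ref_idc != 0 else "FEC2"


print(parse_nal_header(0x65), choose_protection(b"\x65\x88"))
print(parse_nal_header(0x01), choose_protection(b"\x01\x9a"))
```

Here 0x65 decodes to nal_ref_idc = 3 and nal_unit_type = 5, so it is treated as important data, while 0x01 has nal_ref_idc = 0 and gets the weaker protection.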
Clearly, the above method generalizes: the data can be divided into more classes according to the value of nal_unit_type, for example five classes (most important, second most important, moderately important, less important, least important), or seven classes or more. The same number of FEC subtypes can then be used for protection, one distinct subtype per class. It suffices that the subtypes' protection capability ranges from weak to strong; they need not belong to the same major type. Picture information that still cannot be recovered after protection by the strongest FEC code is handled with techniques such as error concealment and error-propagation prevention.
Another form of unequal protection according to the present invention is to select FEC codes of different protection capability according to the real-time state of the network. Both communicating parties are then informed through the ERRTP header so that they can correctly decode the data and recover lost data. The degree to which network transmission performance is currently degraded can be divided into several levels, for example five (most severe, second most severe, moderately severe, less severe, least severe), or seven or more. The same number of FEC subtypes can then be used for protection, one distinct subtype per level. It suffices that the subtypes' protection capability ranges from weak to strong; they need not belong to the same major type. Picture information that still cannot be recovered after protection by the strongest FEC code is handled with techniques such as error concealment and error-propagation prevention. Network conditions can be monitored with any of the existing QoS monitoring methods.
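The alternation mechanism can be illustrated with a small Python sketch that maps a reported packet loss rate to one of five network levels and then to an FEC subtype. The loss-rate thresholds and the subtype labels are assumptions chosen for illustration; the text specifies neither.

```python
import bisect

# Assumed boundaries between five network levels (least to most severe).
LOSS_THRESHOLDS = [0.01, 0.03, 0.05, 0.10]
# Hypothetical subtype labels, ordered from weakest to strongest protection.
SUBTYPE_BY_LEVEL = ["FEC_L0", "FEC_L1", "FEC_L2", "FEC_L3", "FEC_L4"]


def network_level(loss_rate):
    """Return 0 (least severe) .. 4 (most severe) for a loss rate in [0, 1]."""
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss rate must lie in [0, 1]")
    return bisect.bisect_right(LOSS_THRESHOLDS, loss_rate)


def select_subtype(loss_rate):
    """Pick the subtype assigned to the current network level."""
    return SUBTYPE_BY_LEVEL[network_level(loss_rate)]


for rate in (0.0, 0.02, 0.2):
    print(rate, network_level(rate), select_subtype(rate))
```

In a real system the loss rate would come from the receiver's transmission reports, and the chosen subtype would be signalled in the ERRTP header.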
The present invention can also provide a more complex scheme. Suppose a total of T FEC schemes (different types/subtypes) are available, i.e. supported by the terminals of both parties. Which FEC to use then depends on both the importance of the data and the state of the network. A two-dimensional lookup table (LUT) can be used for this, as shown in Table 3:
Table 3. Two-dimensional LUT for hybrid and alternating use of multiple FEC mechanisms
[Table 3 is reproduced only as an image (imgf000025_0001) in the source.]
In the above table, both the data importance levels and the network condition levels are arranged in ascending order. The FEC entries carry a two-dimensional subscript: the error resilience mechanism FEC(i,j), 0 < i < U, 0 < j < V, in the table can be any one of the T FEC schemes mentioned above.
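A two-dimensional LUT of this kind can be sketched in Python as a nested list indexed by data importance level and network condition level. Everything concrete here is an assumption for illustration: U = V = 3, the three scheme labels (T = 3), zero-based indices, and the fill rule that picks the stronger of the two levels.

```python
U, V = 3, 3  # assumed numbers of importance levels and network levels
SCHEMES = ["FEC_weak", "FEC_medium", "FEC_strong"]  # hypothetical, T = 3

# FEC(i, j): rows are importance levels, columns are network levels, both
# ascending. The fill rule (take the stronger of the two levels) is only an
# example; any assignment of the T schemes to cells is allowed by the text.
LUT = [
    [SCHEMES[min(len(SCHEMES) - 1, max(i, j))] for j in range(V)]
    for i in range(U)
]


def select_fec(importance, network):
    """Look up the FEC scheme for zero-based (importance, network) levels."""
    if not (0 <= importance < U and 0 <= network < V):
        raise IndexError("level out of range")
    return LUT[importance][network]


for row in LUT:
    print(row)
```

Hybrid use corresponds to looking up different rows for different NALUs at the same moment; alternation corresponds to moving between columns as network conditions change.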
The embodiments above are all described using FEC erasure codes, in particular Tornado codes, as the example, but other similar error resilience mechanisms, in particular FEC coding schemes other than Tornado codes, are equally applicable without affecting the spirit and scope of the present invention.
Another embodiment of the present invention specifically uses an improved Tornado erasure code. This improved code generates only one layer of check nodes for a group of data nodes, which greatly reduces coding delay and meets the needs of real-time communication.
In real-time video communication, FEC packet protection introduces delay, whose size depends on the size of the picture data packets. S NALUs are grouped together, each NALU containing the bitstream data of one slice. If each picture frame is coded as a single slice, the encoder side incurs a delay of S frames, and so does the decoder side. The relationship between the NALUs and the number of data nodes is:
$$\sum_{i=0}^{S-1} \mathrm{NalSize}_i = \mathrm{PackSize} \times \mathrm{DataNode} \qquad (1)$$

That is, the sum of the S NALU lengths equals the number of data nodes multiplied by the size of each node packet. Equation (1) shows that when S is limited, PackSize × DataNode is limited as well; moreover, PackSize cannot be too small if IP transmission is to remain efficient, so DataNode is limited. In real-time video communication over an IP network, the delay $T_{total}$ of one picture frame is calculated as:

$$T_{total} = T_{fec} + T_{codec} + T_{trans} \qquad (2)$$

where $T_{fec}$ is the delay introduced by adding FEC protection, and $T_{codec}$ and $T_{trans}$ are the H.264 codec processing delay and the network transmission delay, respectively. Given the rapid development of digital signal processing and IP networks, $T_{codec}$ and $T_{trans}$ can both be assumed to meet the real-time requirement:

$$T_{codec} \le T_{th}, \quad T_{trans} \le T_{th}, \quad \text{where } T_{th} = 1/f \qquad (3)$$

In equation (3), $f$ is the decoding target frame rate (for example 10 Hz or 30 Hz). Assuming each picture frame is coded as a single slice, equation (2) becomes:

$$T_{total} \le S \cdot T_{th} + 2\,T_{th} = (S + 2)\,T_{th} \qquad (4)$$

Equations (4) and (1) show that the delay $T_{total}$ of one frame is essentially determined by S, and that DataNode in turn strongly constrains S. The delay introduced by FEC should therefore be minimized, subject to preserving the packet-loss resilience of the video communication, so as to further guarantee the QoS of real-time video communication.
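As a quick worked example of the bound (S + 2) · T_th in equation (4), the following Python sketch evaluates it for a given group size S and target frame rate. The function name and the sample values are illustrative assumptions.

```python
def total_delay_bound(s, frame_rate_hz):
    """Upper bound (S + 2) * T_th on the per-frame delay, in seconds."""
    if s < 1 or frame_rate_hz <= 0:
        raise ValueError("need S >= 1 and a positive frame rate")
    t_th = 1.0 / frame_rate_hz   # T_th: one frame period at the target rate
    return (s + 2) * t_th


for s in (1, 3, 10):
    print(s, f"{total_delay_bound(s, 30.0) * 1000:.0f} ms")
```

At 30 Hz, grouping S = 3 NALUs gives a bound of 5 frame periods, about 167 ms, whereas S = 10 gives 400 ms, which shows why S has to stay small for real-time use.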
Where DataNode is limited, the present invention uses an improved Tornado code protection algorithm. Instead of encoding over a multi-level bipartite graph, the improved Tornado method uses a single layer of check nodes. Compared with the original Tornado coding, the improved method greatly increases the flexibility of the algorithm, since the numbers of data nodes and check nodes can be set arbitrarily, and reduces the complexity of the encoding and decoding algorithms, so it can be used for packet-loss protection in real-time video communication.
Moreover, with a limited number of data nodes, the packet-loss resilience of the improved Tornado code is essentially undiminished. For the principle and detailed steps of the improved Tornado coding method, see Chinese patent application No. 200510066146.7, entitled "A data transmission protection method based on erasure codes".
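To give a feel for what "a single layer of check nodes" means, here is a generic Python sketch of an XOR erasure code over one bipartite graph with a simple peeling decoder. It is not the construction of the referenced application, whose details are not given here; the graph, the function names, and the sample data are all assumptions.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode_checks(data_nodes, graph):
    """graph[c] lists the data-node indices XORed into check node c."""
    checks = []
    for neighbours in graph:
        acc = bytes(len(data_nodes[0]))
        for i in neighbours:
            acc = xor_bytes(acc, data_nodes[i])
        checks.append(acc)
    return checks


def recover(data_nodes, checks, graph):
    """Peeling decoder: lost nodes are None; recovered nodes are filled in."""
    data = list(data_nodes)
    progress = True
    while progress:
        progress = False
        for check, neighbours in zip(checks, graph):
            if check is None:
                continue
            missing = [i for i in neighbours if data[i] is None]
            if len(missing) != 1:
                continue  # nothing to do, or too many unknowns for this check
            acc = check
            for i in neighbours:
                if data[i] is not None:
                    acc = xor_bytes(acc, data[i])
            data[missing[0]] = acc
            progress = True
    return data


original = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
graph = [[0, 1], [1, 2], [2, 3], [0, 3]]  # hypothetical single-layer graph
checks = encode_checks(original, graph)
received = [original[0], None, None, original[3]]  # two data nodes lost
restored = recover(received, checks, graph)
print(restored == original)
```

Because there is only one layer, encoding is a single pass of XORs and decoding can start as soon as the group has arrived; nodes that cannot be peeled stay `None`, corresponding to partial recovery.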
Those skilled in the art will understand that the specific parameter settings, values, and other implementation details given in the above embodiments are examples; in a particular application, other feasible values or schemes may be used to achieve the purpose of the present invention without affecting its spirit and scope.

Claims

1. A method for real-time network transmission of multimedia data supporting error resilience, characterized by comprising:
a sender selecting a forward error correction coding mode and performing forward error correction coding on the multimedia data;
the sender encapsulating the encoded multimedia data with an error-resilient real-time transport protocol, carrying information on the forward error correction coding mode in the header of the error-resilient real-time transport protocol packet, and sending the packet to a receiver;
the receiver decapsulating the received error-resilient real-time transport protocol packet and extracting the information on the forward error correction coding mode from the header of the packet;
when an error-resilient real-time transport protocol packet corresponding to a data node is lost during transmission, the receiver selecting the forward error correction decoding mode according to the information on the forward error correction coding mode, performing forward error correction decoding, and recovering or partially recovering the lost multimedia data.
2. The method for real-time network transmission of multimedia data supporting error resilience according to claim 1, characterized in that the multimedia data after forward error correction coding comprises data nodes and check nodes.
3. The method for real-time network transmission of multimedia data supporting error resilience according to claim 2, characterized in that the sender selects the forward error correction coding mode according to the current network transmission conditions and/or the quality-of-service level of the multimedia data to be sent, the quality-of-service level being determined by the relative importance of the data.
4. The method for real-time network transmission of multimedia data supporting error resilience according to claim 3, characterized in that the header of the error-resilient real-time transport protocol packet contains:
a forward error correction coding type field, indicating the type of forward error correction code used;
a forward error correction coding subtype field, indicating the parameter settings of the forward error correction coding mode;
a data packet length field, indicating the length of the nodes obtained by applying forward error correction coding to the multimedia data;
a data packet number field, indicating the number of data nodes carried by the error-resilient real-time transport protocol packet.
5. The method for real-time network transmission of multimedia data supporting error resilience according to claim 4, characterized in that, when the multimedia data is H.264 network abstraction layer units:
the sender divides at least one H.264 network abstraction layer unit into at least one data node of equal length and then performs forward error correction coding on the data nodes to obtain at least one check node;
the sender groups the data nodes and the check nodes and encapsulates them in at least one error-resilient real-time transport protocol packet for sending;
the receiver, after receiving the error-resilient real-time transport protocol packets, decapsulates them to obtain the data nodes and the check nodes;
if a data node is lost during transmission, the receiver performs forward error correction decoding on the data nodes using the check nodes and divides the result to obtain the H.264 network abstraction layer units.
6. The method for real-time network transmission of multimedia data supporting error resilience according to claim 5, characterized by comprising, before transmission starts:
for each forward error correction code type, the sender and the receiver negotiating the correspondence between the values of the forward error correction coding subtype field and the forward error correction code parameter settings those values indicate.
7. The method for real-time network transmission of multimedia data supporting error resilience according to claim 6, characterized in that the sender and the receiver each build a correspondence table from the correspondence indicated for the forward error correction coding subtype field, the table being used to look up, from the forward error correction coding type field and the forward error correction coding subtype field, the corresponding forward error correction encoding or decoding processing module;
the sender calls the corresponding forward error correction encoding processing module to perform forward error correction coding; the receiver calls the corresponding forward error correction decoding processing module to perform forward error correction decoding.
8. The method for real-time network transmission of multimedia data supporting error resilience according to claim 7, characterized in that the sender assesses the relative importance of the data from the network abstraction layer reference identifier field and/or the network abstraction layer unit type field in the header of the H.264 network abstraction layer unit, determines the quality-of-service level, selects the corresponding forward error correction coding mode, and determines the forward error correction coding type field and the forward error correction coding subtype field.
9. The method for real-time network transmission of multimedia data supporting error resilience according to claim 7, characterized in that the sender evaluates the network transmission conditions from the transmission reports fed back by the receiver, then selects the forward error correction coding mode and determines the forward error correction coding type field and the forward error correction coding subtype field.
10. The method for real-time network transmission of multimedia data supporting error resilience according to claim 8 or 9, characterized in that:
the forward error correction coding type field follows the contributing source identifier list;
the forward error correction coding subtype field follows the forward error correction coding type field;
the data packet length field follows the forward error correction coding subtype field; the data packet number field follows the data packet length field.
11. The method for real-time network transmission of multimedia data supporting error resilience according to claim 8 or 9, characterized in that the forward error correction coding mode uses an improved "Tornado" erasure code;
the improved "Tornado" erasure code generates only one layer of check nodes for a group of data nodes.
PCT/CN2006/001846 2005-10-17 2006-07-25 A method for supporting multimedia data transmission with error resilience WO2007045141A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN200510113940.2 2005-10-17
CNB2005101139402A CN100450187C (en) 2005-10-17 2005-10-17 Multimedia data network realtime transfer method for supporting error elasticity

Publications (1)

Publication Number Publication Date
WO2007045141A1 true WO2007045141A1 (en) 2007-04-26

Family

ID=37298434

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2006/001846 WO2007045141A1 (en) 2005-10-17 2006-07-25 A method for supporting multimedia data transmission with error resilience

Country Status (2)

Country Link
CN (1) CN100450187C (en)
WO (1) WO2007045141A1 (en)

Also Published As

Publication number Publication date
CN100450187C (en) 2009-01-07
CN1859580A (en) 2006-11-08

Similar Documents

Publication Publication Date Title
WO2007045141A1 (en) A method for supporting multimedia data transmission with error resilience
Wang et al. RTP payload format for H.264 video
Wenger et al. RTP payload format for H.264 video
WO2007051425A1 (en) A multimedia communication method and the terminal thereof
Turletti et al. RTP payload format for H.261 video streams
EP1936868B1 (en) A method for monitoring quality of service in multimedia communication
Li RTP payload format for generic forward error correction
US20090103635A1 (en) System and method of unequal error protection with hybrid arq/fec for video streaming over wireless local area networks
CN101867453B (en) RTP anti-packet-loss method
WO2007045140A1 (en) A real-time method for transporting multimedia data
BRPI0608977A2 (en) methods and equipment for packaging content for transmission over a network
WO2006105713A1 (en) Video transmission protection method based on h.264
US20060190593A1 (en) Signaling buffer parameters indicative of receiver buffer architecture
JP2001045098A (en) Data communication system, data communication unit, data communication method and storage medium
WO2013098810A1 (en) Adaptive forward error correction (fec) system and method
JP2018505597A (en) FEC mechanism based on media content
US20150006991A1 (en) Graceful degradation-forward error correction method and apparatus for performing same
Wenger et al. RFC 3984: RTP payload format for H.264 video
KR102163338B1 (en) Apparatus and method for transmitting and receiving packet in a broadcasting and communication system
Frescura et al. JPEG2000 and MJPEG2000 transmission in 802.11 wireless local area networks
KR101953580B1 (en) Data Transceiving Apparatus and Method in Telepresence System
Monteiro et al. Robust multipoint and multi-layered transmission of H.264/SVC with Raptor codes
Chung-How et al. Loss resilient H.263+ video over the Internet
Wang et al. RFC 6184: RTP Payload Format for H.264 Video
Fonnes Reducing packet loss in real-time wireless multicast video streams with forward error correction

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 06761576; Country of ref document: EP; Kind code of ref document: A1)