WO2010014239A2 - Staggercasting of hierarchical coding information - Google Patents

Staggercasting of hierarchical coding information

Info

Publication number
WO2010014239A2
Authority
WO
WIPO (PCT)
Prior art keywords
stream
data units
encoded data
frames
frame
Prior art date
Application number
PCT/US2009/004406
Other languages
English (en)
Other versions
WO2010014239A3 (fr)
Inventor
Avinash Sridhar
David Anthony Campana
Zhenyu Wu
Original Assignee
Thomson Licensing
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Thomson Licensing
Publication of WO2010014239A2
Publication of WO2010014239A3


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83 Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845 Structuring of content, e.g. decomposing content into time segments
    • H04N21/8451 Structuring of content, e.g. decomposing content into time segments using Advanced Video Coding [AVC]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/132 Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/157 Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N19/159 Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/172 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/30 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • H04N19/31 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the temporal domain
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/577 Motion compensation with bidirectional frame interpolation, i.e. using B-pictures
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/587 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal sub-sampling or interpolation, e.g. decimation or subsequent interpolation of pictures in a video sequence
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/70 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/85 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
    • H04N19/89 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving methods or arrangements for detection of transmission errors at the decoder
    • H04N19/895 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving methods or arrangements for detection of transmission errors at the decoder in combination with error concealment
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/2343 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/234327 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by decomposing into layers, e.g. base layer and one or more enhancement layers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/2343 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/234381 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by altering the temporal resolution, e.g. decreasing the frame rate by frame skipping
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/238 Interfacing the downstream path of the transmission network, e.g. adapting the transmission rate of a video stream to network bandwidth; Processing of multiplex streams
    • H04N21/2383 Channel coding or modulation of digital bit-stream, e.g. QPSK modulation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25 Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/262 Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying a video stream transmission, generating play-lists
    • H04N21/26275 Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying a video stream transmission, generating play-lists for distributing content or additional data in a staggered manner, e.g. repeating movies on different channels in a time-staggered manner in a near video on demand system
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25 Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/266 Channel or content management, e.g. generation and management of keys and entitlement messages in a conditional access system, merging a VOD unicast channel into a multicast channel
    • H04N21/2662 Controlling the complexity of the video stream, e.g. by scaling the resolution or bitrate of the video stream based on the client capabilities
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/438 Interfacing the downstream path of the transmission network originating from a server, e.g. retrieving encoded video stream packets from an IP network
    • H04N21/4382 Demodulation or channel decoding, e.g. QPSK demodulation

Definitions

  • The present invention generally relates to data communications systems and, more particularly, to the transmission of video data with time diversity.
  • Staggercasting offers a method of protection against signal loss by transmitting a secondary, redundant stream that is time-shifted with respect to a primary stream. This allows a receiver to pre-buffer packets of the secondary stream to replace packets of the primary stream lost in transmission.
  • Various staggercasting techniques exist that differ in the types of redundant data sent in the secondary stream. For example, the secondary stream may simply be an exact copy of the primary stream staggered with some time offset. Such an arrangement, however, can be inefficient as it effectively doubles the bandwidth required by the staggercast transmission.
  • Another staggercasting technique involves the transmission of a secondary stream that is separately encoded from the primary stream.
  • This secondary stream is completely independent of the primary stream and is simply a separately encoded stream representing the same source video.
  • Because video decoders typically must maintain state data, such as previously decoded reference frames needed to decode future frames, such a staggercasting arrangement requires the receiver to maintain a separate decoder state for each stream, placing an additional memory burden on the receiver.
  • Staggercasting and various coding techniques are combined to transmit a secondary coded video stream in addition to a primary coded video stream such that the secondary stream contains a subset of the video frames transmitted in the primary stream.
  • The subset of frames conveyed in the secondary stream is selected in accordance with their importance relative to other frames, as determined by the coding technique by which they were encoded. More important frames are thus conveyed in both the primary and secondary streams, whereas less important frames are conveyed only in the primary stream.
  • Staggercasting and hierarchical predictive coding techniques are combined to transmit a secondary coded video stream in addition to a primary coded video stream such that the secondary stream contains a subset of the video frames transmitted in the primary stream.
  • The subset of frames conveyed in the secondary stream is selected in accordance with their importance relative to other frames, as determined by the hierarchical predictive coding technique by which they were encoded.
  • Frames used in decoding other frames are transmitted in both the primary and secondary streams, whereas frames not used in decoding other frames are transmitted only in the primary stream.
  • FIG. 1 is a block diagram of an exemplary staggercasting arrangement in which the present invention can be implemented;
  • FIG. 2 shows a hierarchical bipredictive (B) frame structure for temporal scalable video coding;
  • FIG. 3 shows an illustrative scenario in which a B frame sent redundantly in a staggercast stream is used to re-create frames lost in transmission, in accordance with an embodiment of the invention.
  • 8-VSB eight-level vestigial sideband
  • QAM Quadrature Amplitude Modulation
  • RF radio-frequency
  • IP Internet Protocol
  • RTP Real-time Transport Protocol
  • RTCP RTP Control Protocol
  • UDP User Datagram Protocol
  • FIG. 1 is a block diagram of an illustrative staggercasting environment 100 comprising a stagger transmitter 15; a communications network 20, which may include a variety of elements (e.g., networking, routing, switching, transport) operating over various media (e.g., wireline, optical, wireless); and a stagger receiver 25.
  • A source, such as a video encoder 10, provides an original stream 12 of encoded data units to the stagger transmitter 15, which, in turn, sends out a staggercast transmission over the communications network 20 for reception by the stagger receiver 25.
  • An additional stream 13 may be included by which the encoder 10 communicates coding information to the stagger transmitter 15, as described in greater detail below.
  • The staggercast transmission from the transmitter 15 comprises two streams.
  • The secondary stream 17 can be time-shifted or staggered relative to the primary stream 16, in which case it may also be referred to as a "staggered" stream.
  • Corresponding primary and secondary streams 21 and 22, respectively, are received by the stagger receiver 25. Staggering allows the receiver 25 to pre-buffer data units of the secondary stream 22 so that they may replace corresponding data units in the primary stream 21 that may have been lost or corrupted in transmission.
  • The primary and secondary streams 16, 17 may be combined into a single stream by a multiplexer or the like (not shown) before being provided to the network 20, conveyed as a single stream by the network 20, and de-multiplexed into streams 21 and 22 before being provided to the receiver 25.
  • Alternatively, the primary and secondary streams can be transmitted, conveyed, and received as separate streams.
  • The present invention is not limited to any specific implementation in this regard.
  • The stagger receiver 25 is coupled to a client, such as a video decoder 30, for decoding the received video data.
  • The decoder 30 provides a stream 35 of decoded pictures for display by a display device 40.
  • The contents of the secondary stream 17 output from the stagger transmitter 15 are a subset of the contents of the primary stream 16.
  • This provides a more efficient use of bandwidth over an arrangement in which the secondary stream 17 is a fully redundant stream; i.e., a complete copy of the primary stream 16.
  • The subset of data that is conveyed in secondary stream 17 is selected in accordance with the coding scheme used to encode the data conveyed.
  • One such scheme entails temporal scalable coding as described, for example, in H. Schwarz et al., "Analysis of Hierarchical B Pictures and MCTF," ICME 2006 (hereinafter "Schwarz et al.").
  • FIG. 2 shows a hierarchical bipredictive (B) frame structure for temporal scalable video coding, as described in Schwarz et al. In the structure depicted, frames are organized into groups of pictures (GOPs), each with eight frames.
  • The last frame in each GOP, also known as the key frame, can be an intra-coded (I) or a predictive (P) frame.
  • The other seven frames are B frames.
  • The subscript of each frame label indicates the frame's level in the frame structure hierarchy, with lower subscripts indicating greater importance.
  • FIG. 2 also indicates the orders in which the frames are coded and displayed.
  • The order of decoding is the same as the display order.
  • The order of transmission can be the same as the coding or the display order.
  • Each GOP has one B1 frame, whose successful decoding depends on the key frame (I0/P0) of that GOP and the key frame of the previous GOP.
  • The aforementioned key frames are reference frames for the B1 frame.
  • The B1 frame is the second frame in the GOP to be coded, after the key frame, and the fourth frame to be decoded and displayed.
  • The key frames can be thought of as the base layer and the B1 frames as the first enhancement layer of a temporally scalable SVC stream.
  • Each GOP also includes two B2 frames, the first of which depends on the B1 frame and the key frame of the previous GOP, and is the third frame to be coded and the second to be decoded and displayed.
  • The second B2 frame depends on the B1 frame and the key frame of the current GOP, and is the sixth frame in the GOP to be coded and the sixth frame to be decoded and displayed.
  • Each GOP includes four B3 frames, each of which is dependent on an adjacent B2 frame and a B1 frame or a key frame, as shown in FIG. 2.
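  • The dyadic hierarchy above can be sketched in code. This is an illustrative helper (the function name and 1-indexed display positions are assumptions, not from the patent): for an eight-frame GOP with the key frame in the last display position, a frame's hierarchy level follows from the largest power of two dividing its position.

```python
def hierarchy_level(position, gop_size=8):
    """Hierarchy level of a frame given its display position (1-indexed)
    within a dyadic GOP: level 0 = key frame (last position), level 1 = B1
    (GOP midpoint), level 2 = B2 frames, level 3 = B3 frames."""
    if position == gop_size:
        return 0                    # key frame (I or P)
    level, step = 1, gop_size // 2
    while position % step != 0:     # descend until the position aligns
        step //= 2
        level += 1
    return level

# The eight-frame GOP of FIG. 2: positions 1-8 map to B3 B2 B3 B1 B3 B2 B3 I/P.
levels = [hierarchy_level(p) for p in range(1, 9)]
```

Here `levels` evaluates to `[3, 2, 3, 1, 3, 2, 3, 0]`, matching the one B1, two B2, and four B3 frames per GOP described above.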
  • Key frames are transmitted in the primary stream 16 as well as in the secondary stream 17 to protect against their loss.
  • B1 frames can also be sent in both the primary and secondary streams 16 and 17.
  • The determination of whether to include a frame in the secondary stream 17 can be based on whether or not it is a reference frame, i.e., a frame on which the decoding of other frames relies.
  • B2 frames are also sent in the secondary stream 17, but B3 frames are not. Doing so provides improved picture quality with a small increase in required bandwidth, since there are only two B2 frames per GOP.
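  • The selection rule just described can be illustrated as a threshold on hierarchy level. This is a minimal sketch under the assumed frame descriptors below (label/level pairs for the GOP of FIG. 2); the function name and cutoff parameter are illustrative, not from the patent:

```python
# One GOP of FIG. 2 in display order; level 0 = key frame, 1 = B1, 2 = B2, 3 = B3.
GOP = [("B3", 3), ("B2", 2), ("B3", 3), ("B1", 1),
       ("B3", 3), ("B2", 2), ("B3", 3), ("I/P", 0)]

def secondary_stream_frames(gop, max_level):
    """Frames duplicated into the staggered secondary stream: every frame
    whose hierarchy level does not exceed max_level; the rest are sent
    only in the primary stream."""
    return [label for label, level in gop if level <= max_level]
```

With `max_level=1` only the key and B1 frames are duplicated; raising it to 2 adds the two B2 frames per GOP, trading a small bandwidth increase for improved protection, while B3 frames are never duplicated.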
  • The hierarchical coding scheme illustrated in FIG. 2 is only one of a variety of coding schemes that can be used with embodiments of the invention.
  • An encoder may use a coding scheme in which reference frames are generated with greater or lesser frequency than in the scheme depicted. For instance, every other frame in stream 12 can be a reference frame.
  • Reference frames can occur regularly (e.g., every Nth frame) or at varying intervals, and with different patterns.
  • The coding scheme used by the encoder 10 is preferably selected with bandwidth efficiency in mind, allowing the stagger transmitter 15 to include in the secondary stream those frames that provide the greatest value for re-creating lost frames relative to the additional bandwidth they require.
  • Bandwidth availability information can be fed back to the encoder, which can change its coding scheme accordingly to optimize bandwidth efficiency.
  • The determination of which frames to include in the secondary stream 17 is made by the stagger transmitter 15.
  • The decision to include a frame in the secondary stream 17 will depend on the characteristics (e.g., frame type, priority level) of the frame and/or the available bandwidth.
  • The stagger transmitter 15 can determine the characteristics of each frame that it receives from the source 10 in a number of different ways.
  • The source 10 communicates frame characteristics and/or coding scheme information to downstream devices such as the stagger transmitter 15.
  • Such information can be sent in-band, via stream 12 in the form of additional packets or header information added to encoded data units, or out-of-band, via a separate stream 13 in one or more packets.
  • Coding scheme information may include a variety of information about the coding scheme used, so as to enable a downstream device such as the stagger transmitter 15 to determine frame characteristics. Such information may include, for example, detailed information about a segment of video data explicitly indicating the type of each frame in the segment, or a few key parameters of the coding scheme used to encode the video segment (e.g., GOP size, frame structure), from which devices such as the stagger transmitter 15 can infer frame types.
  • The coding scheme information may be sent in the form of a file conveyed as payload by one or more packets, or in packet headers.
  • The stagger transmitter 15 decodes and/or parses the headers of packets in the stream 12, typically organized as Network Abstraction Layer (NAL) units, for information indicative of one or more characteristics of each frame received from the source 10.
  • NAL Network Abstraction Layer
  • NAL units with an NRI value of '00' are not used to reconstruct reference pictures for future prediction, in which case they can be lost or discarded without risking the integrity of the reference pictures in the same layer.
  • An NRI value greater than '00' indicates that the decoding of the NAL unit is required to maintain the integrity of reference pictures in the same layer, or that the NAL unit contains parameter sets. If it is determined that a frame is a reference frame (i.e., NRI > '00'), and thus should be protected, the stagger transmitter 15 can decide to include the frame in the secondary stream 17, assuming there is available bandwidth to do so.
  • Another field in the SVC NAL unit header that can be used to determine whether a NAL unit should be included in the secondary stream 17 is the six-bit priority id (PRID) field. A lower PRID value indicates a higher priority.
  • The stagger transmitter 15 can select NAL units for inclusion in the secondary stream 17 based on PRID so that, for example, NAL units with a PRID value less than a threshold value will be included in the secondary stream.
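  • A sketch of this header-based selection follows. The bit layout is the standard H.264/SVC one (NRI is bits 5-6 of the first NAL byte; for SVC NAL unit types 14 and 20, priority_id is the low six bits of the first extension byte); the function names and the threshold value are illustrative assumptions, not from the patent:

```python
def parse_nal_header(nal):
    """Extract the two-bit NRI (nal_ref_idc) and five-bit nal_unit_type from
    the first byte of an H.264 NAL unit and, for SVC extension NAL unit
    types (14 and 20), the six-bit priority_id (PRID)."""
    nri = (nal[0] >> 5) & 0x03          # '00' => not used for reference
    nal_type = nal[0] & 0x1F
    prid = (nal[1] & 0x3F) if nal_type in (14, 20) else None
    return nri, nal_type, prid

def include_in_secondary(nal, prid_threshold=16):
    """Illustrative selection rule: when a PRID is present, include only
    NAL units below the threshold (lower PRID = higher priority);
    otherwise protect any reference NAL unit (NRI > 0)."""
    nri, _, prid = parse_nal_header(nal)
    if prid is not None:
        return prid < prid_threshold
    return nri > 0
```

For example, a type-14 SVC prefix NAL unit with NRI 3 and PRID 5 (`bytes([0x6E, 0x85])`) would be selected, while a non-reference slice with NRI 0 would not.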
  • Frame characteristic information can also be conveyed using Quality of Service (QOS) or Type of Service (TOS) information (referred to herein collectively as "type-of-service" information) contained in the stream 12.
  • QOS Quality of Service
  • TOS Type of Service
  • The source 10 sets type-of-service bits in the headers of packets that it forwards to downstream devices such as the stagger transmitter 15.
  • The type-of-service bits of each packet are set in accordance with the frame information contained in the packet.
  • The stagger transmitter 15 parses the type-of-service information in the headers of encoded data units in stream 12 to determine the type of frame (e.g., key frame) being conveyed.
  • The frame characteristics can be determined by the stagger transmitter 15 for all frames communicated from the source 10, or for a subset thereof. For example, if only key frames are to be contained in the secondary stream 17, the stagger transmitter 15 need only determine whether a frame in stream 12 is a key frame in deciding whether to include it in the secondary stream. However, even if additional frames are to be included in the secondary stream 17, such as B1 frames in the above example, the stagger transmitter can infer the positions of such frames in the stream 12 from the positions of the key frames. This saves the processing overhead that would otherwise be required to parse header information to identify those frames as well.
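  • The inference mentioned above can be sketched as follows, assuming the fixed dyadic GOP structure of FIG. 2 in which the key frame occupies the last display position of each GOP; the function name and position convention are illustrative assumptions:

```python
def infer_b1_positions(key_positions, gop_size=8):
    """Given the display positions of the key frames and a known GOP size,
    infer the positions of the B1 frames (the GOP midpoints) without
    parsing per-frame header information."""
    return {k: k - gop_size // 2 for k in key_positions}
```

For key frames at positions 8, 16, and 24, the inferred B1 positions are 4, 12, and 20, so only key-frame headers need to be parsed.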
  • The determination of whether to include frames in a stagger stream can be made by other components in a staggercasting environment as well.
  • A multiplexer in network 20 receiving the primary 16 and secondary 17 streams can identify frames, using one of the above-described techniques, and decide whether to drop or add frames from the secondary stream 17 to the multiplexer output based on frame type and/or bandwidth availability.
  • The determination of which frames to include in the primary and secondary streams may also be made upstream, by the source 10.
  • FIG. 3 shows an illustrative scenario in which a B1 frame sent redundantly in a staggercast stream is used to re-create frames lost in transmission, in accordance with an embodiment of the invention.
  • The lost frames include the B1 frame of GOP(N+1), in addition to the two B3 frames and the two B2 frames transmitted before and after the B1 frame.
  • The secondary stream 22 contains copies of the key frame and the B1 frame of each GOP, designated I'/P' and B1' respectively. In this scenario, the secondary stream is received without error.
  • The offset between the two streams 21, 22 is shown as four data units; i.e., the secondary stream 17 is transmitted four data units earlier than the primary stream 16.
  • All frames are shown in FIG. 3 as having the same transmission time.
  • In practice, the size of a coded frame will vary substantially from frame to frame, and thus so will the transmission time of each frame.
  • The stagger offset is therefore typically expressed in terms of time rather than frames; e.g., the secondary stream frames may be transmitted four seconds earlier than their primary stream equivalents.
  • The invention is not limited to any specific time offset. The preferred time offset for a given implementation will depend on implementation-specific details such as, for example, the amount of memory available at the receiver for buffering, and error or loss characteristics.
  • Alternatively, the secondary stream can be staggered later in time than the primary stream.
  • However, the secondary stream should preferably precede the primary stream.
  • Transmitting the secondary stream later in time than the primary stream would result in the protection arriving some time after a data loss. Either at initial playback or upon the first loss event, the primary stream would have to pause to wait for the replacement data units from the stagger stream to arrive, resulting in a diminished viewer experience.
  • When the secondary stream precedes the primary stream, the receiver can immediately begin playback of the primary stream while buffering the secondary stream to protect against future loss.
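  • The receiver-side replacement step can be sketched as follows. This is a minimal illustration (the function name and the list/dict data shapes are assumptions): lost primary-stream data units are swapped for their pre-buffered staggered copies when available, and any remaining gaps are left for downstream re-creation or concealment.

```python
def repair_primary(primary, stagger_buffer):
    """primary: list of (frame_id, payload) pairs, payload None for frames
    lost in transmission; stagger_buffer: pre-buffered secondary-stream
    copies keyed by frame_id. Returns the repaired sequence."""
    repaired = []
    for fid, payload in primary:
        if payload is None:
            payload = stagger_buffer.get(fid)   # replace from staggered copy, if any
        repaired.append((fid, payload))
    return repaired
```

In the FIG. 3 scenario, the buffered B1' copy fills the lost B1 slot, while the lost B2 and B3 frames (which have no staggered copies) remain missing and are re-created by the decoder.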
  • The primary and secondary streams may be provided with error protection (e.g., turbo coding, forward error correction, etc.). Both streams, or only the secondary stream, may be so protected.
  • The two streams may also be provided with different levels of error protection, with the secondary stream preferably being provided with a higher level of protection. The overhead of an error protection scheme can be reduced by applying it only to the secondary stream. This also offers the advantage of allowing the receiver to immediately decode and play the unprotected primary stream. Since the secondary stream is preferably received before the primary stream, there should be sufficient time to correct errors in any secondary stream data units before they may be needed to replace lost primary stream data units.
  • As illustrated in FIG. 3, the lost B1 frame of GOP(N+1) is replaced in the decoder output stream 35 by its copy B1' received in the secondary stream 22. Additionally, the two B2 frames and two B3 frames of GOP(N+1) that were lost in the primary stream 21 are re-created by the decoder 30 using the frame B1' (of the same GOP) received on the secondary stream 22.
  • The re-created frames are designated B2* and B3* in FIG. 3. To the extent that they would be relevant to the re-creation of the missing frames, other frames received successfully in the primary or secondary stream would also be used in the re-creation. For instance, in the scenario illustrated, the key frames of GOP(N) and GOP(N+1) are used in the re-creation of the B2* frames.
  • the missing B2 and B3 frames can be replaced with the B1' frame, or they can be estimated by applying some form of interpolation or the like.
  • the present invention is not limited to any one particular re-creation method in this regard.
  • the principles of the present invention can be applied to any coding scheme which includes frames that can be lost and re-created from other frames without unduly compromising video quality.
  • the coding scheme is a hierarchical predictive or P-frame scheme in which each GOP comprises a key frame, one or more P frames and/or B frames.
  • the key and P frames of each GOP are transmitted in both the primary and secondary streams 16, 17.
  • the source 10 provides a single stream 12 which is re-transmitted by the transmitter 15 as part of a staggercast transmission of two streams 16, 17.
  • This is only one of a variety of possible arrangements to which the principles of the present invention can be applied.
  • an arrangement in which the source 10 generates a staggercast transmission (with two streams) which is then received and re-transmitted by one or more staggercast transceivers could also be used with the present invention.
  • a variety of combinations of the source 10, stagger transmitter 15 and other elements such as a multiplexer are contemplated by the present invention.
  • Embodiments of the present invention enjoy several advantages over known approaches.
  • one staggercasting method involves the transmission of a secondary stream that is separately encoded from the primary stream.
  • this secondary stream is completely independent from the primary stream and is simply a separately encoded stream representing the same source video.
  • Typical video decoders must maintain state data, such as previously decoded reference frames that must be available for decoding future frames that are predicted from them.
  • a receiver would need to maintain two separate decoder states for each of the streams, placing additional memory burdens on the receiver.
  • the exemplary arrangement of the present invention described above can be implemented with only one decoder and associated state memory given that the two streams are related; i.e., the secondary stream is a subset of the primary stream.
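The receiver-side behaviour described above, buffering the staggered secondary stream and substituting its copies for lost primary-stream frames, can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `Frame` model, the choice of which frame kinds the secondary stream keeps, and the loss pattern are all assumptions made for the example.

```python
from collections import namedtuple

# Toy frame model (an assumption for illustration; the patent defines no such type).
# "kind" follows the hierarchical-B naming used in the description: I, P, B1, B2, B3.
Frame = namedtuple("Frame", ["gop", "index", "kind"])

def build_secondary(primary, keep=("I", "P", "B1")):
    """The secondary stream carries only the more important frames,
    i.e. a subset of the primary stream (which kinds count as important
    is an assumption here)."""
    return [f for f in primary if f.kind in keep]

def reconstruct(primary_received, secondary_buffer):
    """Rebuild the decoder input: use each primary frame if it arrived,
    otherwise fall back to its buffered secondary-stream copy.
    `primary_received` is a list of ((gop, index), Frame-or-None) pairs,
    where None marks a frame lost in transmission."""
    copies = {(f.gop, f.index): f for f in secondary_buffer}
    output = []
    for slot, frame in primary_received:
        if frame is not None:
            output.append(frame)          # received intact on the primary stream
        elif slot in copies:
            output.append(copies[slot])   # replaced from the stagger stream
        else:
            output.append(None)           # must be re-created (e.g. interpolated)
    return output
```

The `None` entries correspond to frames absent from both streams, such as the B2* and B3* frames of the description, which the decoder would re-create from neighbouring frames.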

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention concerns staggercasting of information streams, such as encoded video, in which primary and secondary streams are transmitted offset in time relative to one another ("staggered"). This allows a receiver to buffer frames of the secondary stream in advance, to replace frames of the primary stream that may have been lost in transmission. In an illustrative exemplary implementation, a staggercast transmission is carried out in which the secondary stream contains a subset of the encoded video frames transmitted in the primary stream. The primary stream contains essential frames, which are required to properly decode the video data, as well as non-essential frames, which are not. The secondary stream, however, contains copies of the essential frames and may or may not also contain copies of some of the non-essential frames. In the event of frame loss in the primary stream, this arrangement allows reconstruction of a high-quality video stream at the receiver using frames from the secondary stream. The determination of which frames to include in the secondary stream will depend on their importance, as determined by the coding scheme with which they were generated.
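On the transmission side described in the abstract, the stagger can be sketched as assigning each secondary-stream copy a send time a fixed offset earlier than its primary-stream equivalent. The frame interval, the offset value, and the `essential` predicate below are illustrative assumptions, not values taken from the patent.

```python
def schedule(frames, frame_interval_s, stagger_offset_s, essential):
    """Return (primary, secondary) schedules as lists of (send_time, frame).
    Secondary copies are sent `stagger_offset_s` seconds ahead of their
    primary-stream equivalents, so the receiver can buffer them in advance.
    `essential` decides which frames the secondary stream carries."""
    primary = [(i * frame_interval_s, f) for i, f in enumerate(frames)]
    secondary = [(t - stagger_offset_s, f) for t, f in primary if essential(f)]
    return primary, secondary
```

Negative send times at the start of the secondary schedule simply mean that the secondary stream leads the session, consistent with the receiver buffering it before primary-stream playback begins.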
PCT/US2009/004406 2008-07-28 2009-07-27 Diffusion échelonnée d’informations de codage hiérarchique WO2010014239A2 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US8396808P 2008-07-28 2008-07-28
US61/083,968 2008-07-28

Publications (2)

Publication Number Publication Date
WO2010014239A2 true WO2010014239A2 (fr) 2010-02-04
WO2010014239A3 WO2010014239A3 (fr) 2010-03-25

Family

ID=41508439

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2009/004406 WO2010014239A2 (fr) 2008-07-28 2009-07-27 Diffusion échelonnée d’informations de codage hiérarchique

Country Status (1)

Country Link
WO (1) WO2010014239A2 (fr)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106416251A (zh) * 2014-03-27 2017-02-15 英特尔Ip公司 基于感知质量的可缩放视频编码速率适配
US11075965B2 (en) 2015-12-21 2021-07-27 Interdigital Ce Patent Holdings, Sas Method and apparatus for detecting packet loss in staggercasting

Families Citing this family (1)

Publication number Priority date Publication date Assignee Title
CN107027052B (zh) * 2017-02-28 2019-11-08 青岛富视安智能科技有限公司 Svc视频自适应降帧率的方法及系统

Citations (1)

Publication number Priority date Publication date Assignee Title
US20050024543A1 (en) * 2001-07-19 2005-02-03 Kumar Ramaswamy Robust reception of digital broadcast transmission

Patent Citations (1)

Publication number Priority date Publication date Assignee Title
US20050024543A1 (en) * 2001-07-19 2005-02-03 Kumar Ramaswamy Robust reception of digital broadcast transmission

Non-Patent Citations (2)

Title
Schwarz, H. et al., "Analysis of Hierarchical B Pictures and MCTF," Proc. 2006 IEEE International Conference on Multimedia and Expo, Toronto, Ont., Canada, 9-12 July 2006, pages 1929-1932, XP031033239, Piscataway, NJ, USA, ISBN 978-1-4244-0366-0; cited in the application *
Tian, Kumar et al., "Improved H.264/AVC video broadcast/multicast," Proceedings of the SPIE, Visual Communications and Image Processing, 12-15 July 2005, Beijing, China, vol. 5960, pages 71-82, XP030080844 *

Cited By (3)

Publication number Priority date Publication date Assignee Title
CN106416251A (zh) * 2014-03-27 2017-02-15 英特尔Ip公司 基于感知质量的可缩放视频编码速率适配
EP3123720A4 (fr) * 2014-03-27 2018-03-21 Intel IP Corporation Adaptation du débit de codage vidéo échelonnable sur la base de la qualité perçue
US11075965B2 (en) 2015-12-21 2021-07-27 Interdigital Ce Patent Holdings, Sas Method and apparatus for detecting packet loss in staggercasting

Also Published As

Publication number Publication date
WO2010014239A3 (fr) 2010-03-25

Similar Documents

Publication Publication Date Title
CN1801944B (zh) 用于视频编码和解码的方法和设备
KR101635235B1 (ko) 스케일러블 비디오 코딩(svc)을 이용한 고속 채널 변경 응용을 위한 실시간 전송 프로토콜(rtp) 패킷화 방법
TWI396445B (zh) 媒體資料的傳送/接收方法、編碼器、解碼器、儲存媒體、用於編碼/解碼圖像之系統、電子設備及傳送裝置、以及用於解碼圖像之接收裝置
US7751324B2 (en) Packet stream arrangement in multimedia transmission
Apostolopoulos et al. Video streaming: Concepts, algorithms, and systems
US8798145B2 (en) Methods for error concealment due to enhancement layer packet loss in scalable video coding (SVC) decoding
US20110029684A1 (en) Staggercasting with temporal scalability
US8832519B2 (en) Method and apparatus for FEC encoding and decoding
CA2656453C (fr) Methode permettant de determiner les patrametres de compression et de protection pour la transmission de donnees multimedias sur un canal de donnees sans fil
EP2257073A1 (fr) Procédé et dispositif pour transmettre des données vidéo
Greengrass et al. Not all packets are equal, part i: Streaming video coding and sla requirements
US20110090958A1 (en) Network abstraction layer (nal)-aware multiplexer with feedback
WO2010014239A2 (fr) Diffusion échelonnée d’informations de codage hiérarchique
Hellge et al. Intra-burst layer aware FEC for scalable video coding delivery in DVB-H
Purandare et al. Impact of bit error on video transmission over wireless networks and error resiliency
Cai et al. Error-resilient unequal protection of fine granularity scalable video bitstreams
Peng et al. End-to-end distortion optimized error control for real-time wireless video streaming
Zhang et al. An unequal packet loss protection scheme for H.264/AVC video transmission
Al-Jobouri et al. Robust IPTV delivery with adaptive rateless coding over a mobile WiMAX channel
Nguyen et al. Adaptive error protection for Scalable Video Coding extension of H.264/AVC
Tian et al. Improved H.264/AVC video broadcast/multicast
Moiron et al. Enhanced slicing for robust video transmission
Nazir Scalable Video Streaming with Fountain Codes
Seferoğlu Multimedia streaming over wireless channels
Farrahi Robust H.264/AVC video transmission in 3G packet-switched networks

Legal Events

Date Code Title Description
NENP Non-entry into the national phase in:

Ref country code: DE

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09789042

Country of ref document: EP

Kind code of ref document: A2

122 Ep: pct application non-entry in european phase

Ref document number: 09789042

Country of ref document: EP

Kind code of ref document: A2