EP1997236A2 - System and method for providing error resilience, random access and rate control in scalable video communications - Google Patents

System and method for providing error resilience, random access and rate control in scalable video communications

Info

Publication number
EP1997236A2
Authority
EP
European Patent Office
Prior art keywords
layer
spatial
quality
base
temporal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP07757937A
Other languages
German (de)
English (en)
Other versions
EP1997236A4 (fr
Inventor
Alexandros Eleftheriadis
Danny Hong
Ofer Shapiro
Thomas Wiegand
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Vidyo Inc
Original Assignee
Vidyo Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from PCT/US2006/028365 external-priority patent/WO2008060262A1/fr
Priority claimed from PCT/US2006/028367 external-priority patent/WO2007075196A1/fr
Priority claimed from PCT/US2006/028366 external-priority patent/WO2008082375A2/fr
Priority claimed from PCT/US2006/028368 external-priority patent/WO2008051181A1/fr
Priority claimed from PCT/US2006/061815 external-priority patent/WO2007067990A2/fr
Priority claimed from PCT/US2006/062569 external-priority patent/WO2007076486A2/fr
Priority claimed from PCT/US2007/062357 external-priority patent/WO2007095640A2/fr
Application filed by Vidyo Inc filed Critical Vidyo Inc
Publication of EP1997236A2 publication Critical patent/EP1997236A2/fr
Publication of EP1997236A4 publication Critical patent/EP1997236A4/fr
Withdrawn legal-status Critical Current

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/14Systems for two-way working
    • H04N7/15Conference systems
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/129Scanning of coding units, e.g. zig-zag scan of transform coefficients or flexible macroblock ordering [FMO]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146Data rate or code amount at the encoder output
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/30Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • H04N19/31Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the temporal domain
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/30Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • H04N19/34Scalability techniques involving progressive bit-plane based encoding of the enhancement layer, e.g. fine granular scalability [FGS]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/44Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/59Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/85Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
    • H04N19/89Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving methods or arrangements for detection of transmission errors at the decoder
    • H04N19/895Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving methods or arrangements for detection of transmission errors at the decoder in combination with error concealment
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/2343Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/234327Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by decomposing into layers, e.g. base layer and one or more enhancement layers
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/266Channel or content management, e.g. generation and management of keys and entitlement messages in a conditional access system, merging a VOD unicast channel into a multicast channel
    • H04N21/2662Controlling the complexity of the video stream, e.g. by scaling the resolution or bitrate of the video stream based on the client capabilities
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client 
    • H04N21/63Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
    • H04N21/647Control signaling between network components and server or clients; Network processes for video distribution between server and clients, e.g. controlling the quality of the video stream, by dropping packets, protecting content from unauthorised alteration within the network, monitoring of network load, bridging between two different networks, e.g. between IP and wireless
    • H04N21/64784Data processing by the network
    • H04N21/64792Controlling the complexity of the content stream, e.g. by dropping packets

Definitions

  • the present invention relates to video data communication systems.
  • the invention specifically relates to simultaneously providing error resilience, random access, and rate control capabilities in video communication systems utilizing scalable video coding techniques.
  • IP Internet Protocol
  • the video compression techniques employed in the communication systems can create a very strong temporal dependency between sequential video packets or frames.
  • the use of motion-compensated prediction (e.g., involving the use of P or B frames) in codecs creates a chain of frame dependencies in which a displayed frame depends on past frame(s).
  • the chain of dependencies can extend all the way to the beginning of the video sequence.
  • the loss of a given packet can affect the decoding of a number of the subsequent packets at the receiver.
  • Error propagation due to the loss of the given packet terminates only at an "intra" (I) refresh point, or at a frame that does not use any temporal prediction at all.
  • I intra
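The dependency chain described above can be illustrated with a minimal sketch. It assumes a simplified structure in which every P frame predicts from the immediately preceding frame and an I frame uses no temporal prediction; the function name and the frame sequence are hypothetical and chosen only for illustration.

```python
def damaged_frames(frame_types, lost_index):
    """Return the indices of frames affected by the loss of one frame.

    Assumption: each P frame depends on the frame immediately before it,
    and an I frame depends on nothing, so it stops error propagation.
    """
    damaged = []
    for i in range(lost_index, len(frame_types)):
        if i > lost_index and frame_types[i] == "I":
            break  # intra refresh point: error propagation terminates
        damaged.append(i)
    return damaged


sequence = list("IPPPPIPPP")
print(damaged_frames(sequence, 2))  # prints [2, 3, 4]
```

With sparse I frames the damaged run grows accordingly, which is the trade-off between error recovery and compression efficiency discussed in the surrounding text.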
  • Error resilience in digital video communication systems requires having at least some level of redundancy in the transmitted signals. However, this requirement is contrary to the goals of video compression techniques, which strive to eliminate or minimize redundancy in the transmitted signals.
  • a video data communication application may exploit network features to deliver some or all of video signal data in a lossless or nearly lossless manner to a receiver.
  • a best-effort network such as the Internet
  • a data communication application has to rely on its own features for achieving error resilience.
  • Known techniques e.g., the Transmission Control Protocol - TCP
  • TCP Transmission Control Protocol
  • TCP techniques may be used for error resilience in data transport using the File Transfer Protocol.
  • TCP keeps retransmitting data until it receives confirmation that all data has been received, even if this involves a delay of several seconds.
  • TCP is inappropriate for video data transport in a live or interactive videoconferencing application because the end-to-end delay, which is unbounded, would be unacceptable to participants.
  • a related problem is that of random access. Assume that a receiver joins an existing transmission of a video signal. Typical instances are a user who joins a videoconference or a user who tunes in to a broadcast. Such a user would have to find a point in the incoming bitstream where he/she can start decoding and be in synchronization with the encoder. Providing such random access points, however, has a considerable impact on compression efficiency. Note that a random access point is, by definition, an error resilience feature since at that point any error propagation terminates (i.e., it is an error recovery point). Hence, the better the random access support provided by a particular coding scheme, the faster the error recovery the coding scheme can provide. The converse may not always be true; it depends on the assumptions made about the duration and extent of the errors that the error resilience technique has been designed to address. For error resilience, some state information could be assumed to be available at the receiver at the time the error occurred.
  • I pictures are used at periodic intervals (typically 0.5 sec) to enable fast switching into a stream.
  • the I pictures are considerably larger than their P or B counterparts (typically by 3-6 times) and are thus to be avoided, especially in low bandwidth and/or low delay applications.
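The cost of periodic I pictures can be estimated with simple arithmetic. The sketch below assumes a 30 fps stream in which all P pictures have the same size; both assumptions, as well as the function name, are illustrative and not taken from the text, which only gives the 0.5 sec interval and the 3-6 times size ratio.

```python
def intra_overhead(fps, intra_period_s, size_ratio):
    """Bit rate with periodic I pictures relative to an all-P stream.

    size_ratio is the size of an I picture in units of one P picture.
    Assumption: every P picture has the same size.
    """
    frames_per_period = round(fps * intra_period_s)
    # One I picture plus (N - 1) P pictures, measured in P-picture units.
    with_intra = size_ratio + (frames_per_period - 1)
    return with_intra / frames_per_period


for ratio in (3, 6):
    print(f"{intra_overhead(30, 0.5, ratio):.2f}")  # prints 1.13, then 1.33
```

Under these assumptions the periodic refresh raises the average bit rate by roughly 13% to 33%, and each I picture also creates a short-term peak of 3 to 6 times the P picture size, which matters most in low-delay applications.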
  • the concept of requesting an intra update is often used for error resilience.
  • the update involves a request from the receiver to the sender for an intra picture transmission, which enables the decoder to be synchronized.
  • the bandwidth overhead of this operation is significant. Additionally, this overhead is also incurred when packet errors occur.
  • Scalable coding is used to generate two or more "scaled" bitstreams collectively representing a given medium in a bandwidth-efficient manner. Scalability can be provided in a number of different dimensions, namely temporal, spatial, and quality (the last also referred to as SNR, "Signal-to-Noise Ratio", scalability or fidelity scalability).
  • a video signal may be scalably coded in different layers at CIF and QCIF resolutions, and at frame rates of 7.5, 15, and 30 frames per second (fps).
  • fps frames per second
  • the bits corresponding to the different layers can be transmitted as separate bitstreams (i.e., one stream per layer) or they can be multiplexed together in one or more bitstreams.
  • the coded bits corresponding to a given layer may be referred to as that layer's bitstream, even if the various layers are multiplexed and transmitted in a single bitstream.
  • Codecs specifically designed to offer scalability features include, for example, MPEG-2 (ISO/IEC 13818-2, also known as ITU-T H.262) and the currently developed SVC (known as ITU-T H.264 Annex G or MPEG-4 Part 10 SVC).
  • Scalable coding techniques specifically designed for video communication are described in commonly assigned international patent application No. PCT/US06/028365, "SYSTEM AND METHOD FOR SCALABLE AND LOW-DELAY VIDEOCONFERENCING USING SCALABLE VIDEO CODING". It is noted that even codecs that are not specifically designed to be scalable can exhibit scalability characteristics in the temporal dimension.
  • the sequential elimination process results in a decodable bitstream because the MPEG-2 Main Profile codec is designed so that coding of the P pictures does not rely on the B pictures, and similarly coding of the I pictures does not rely on other P or B pictures.
  • single-layer codecs with temporal scalability features are considered to be a special case of scalable video coding, and are thus included in the term scalable video coding, unless explicitly indicated otherwise.
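The temporal scalability of a single-layer codec can be sketched as picture dropping by type. The example assumes the dependency rules stated above (P pictures do not rely on B pictures, and I pictures rely on nothing); the picture pattern `IBBPBBPBBPBB` and the function name are hypothetical.

```python
def temporal_subset(frame_types, keep):
    """Keep only the pictures whose type is listed in `keep`.

    Assumption: P pictures never reference B pictures and I pictures
    reference nothing, so dropping B (and then P) pictures still
    leaves a decodable stream at a lower frame rate.
    """
    return [(i, t) for i, t in enumerate(frame_types) if t in keep]


pattern = "IBBPBBPBBPBB"
print(len(temporal_subset(pattern, "IPB")))  # prints 12 (full frame rate)
print(len(temporal_subset(pattern, "IP")))   # prints 4 (one third of the frame rate)
print(len(temporal_subset(pattern, "I")))    # prints 1 (I pictures only)
```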
  • Scalable codecs typically have a pyramidal bitstream structure in which one of the constituent bitstreams (called the “base layer”) is essential in recovering the original medium at some basic quality.
  • Use of one or more of the remaining bitstream(s) (hereinafter called "the enhancement layer(s)") along with the base layer increases the quality of the recovered medium.
  • Data losses in the enhancement layers may be tolerable, but data losses in the base layer can cause significant distortions or complete loss of the recovered medium.
  • Scalable codecs pose challenges similar to those posed by single layer codecs for error resilience and random access.
  • the coding structures of the scalable codecs have unique characteristics that are not present in single layer video codecs.
  • scalable coding may involve switching from one scalability layer to another (e.g., switching back and forth between CIF and QCIF resolutions). Instantaneous layer switching when switching between different resolutions with very little bit rate overhead is desirable for random access in scalable coding systems in which multiple signal resolutions (spatial/temporal/quality) may be available from the encoder.
  • a problem related to those of error resilience and random access is that of rate control.
  • the output of a typical video encoder has a variable bit rate, due to the extensive use of prediction, transform and entropy coding techniques.
  • buffer-constrained rate control is typically employed in a video communication system.
  • an output buffer at the encoder is assumed, which is emptied at a constant rate (the channel rate); the encoder monitors the buffer's occupancy and makes parameter selections (e.g., quantizer step size) in order to avoid buffer overflow or underflow.
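Buffer-constrained rate control can be sketched as a small simulation. The rate model (bits = complexity / quantizer step), the 40% and 60% occupancy thresholds, and all names below are assumptions made for illustration; they are not taken from the text or from any particular encoder.

```python
def simulate_rate_control(complexities, channel_bits_per_frame, buffer_size,
                          q_min=1, q_max=51):
    """Toy buffer-constrained rate control loop.

    Assumption: a frame of complexity c coded with quantizer step q
    produces c / q bits. The buffer drains at the constant channel rate.
    Returns a list of (quantizer used, buffer occupancy after the frame).
    """
    occupancy = buffer_size / 2          # start half full
    q = (q_min + q_max) // 2
    trace = []
    for c in complexities:
        bits = c / q
        # Add the coded frame, remove what the channel carried away.
        occupancy = max(0.0, occupancy + bits - channel_bits_per_frame)
        trace.append((q, occupancy))
        if occupancy > 0.6 * buffer_size:
            q = min(q_max, q + 1)        # buffer filling: quantize more coarsely
        elif occupancy < 0.4 * buffer_size:
            q = max(q_min, q - 1)        # buffer draining: quantize more finely
    return trace


trace = simulate_rate_control([400_000] * 50, 10_000, 100_000)
print(max(q for q, _ in trace))
```

The encoder only observes the buffer occupancy and reacts by changing the quantizer step, which is the feedback loop the text describes; a production rate controller would use a far more careful rate model.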
  • Rate control decisions are made at an intermediate gateway (e.g., at a Multipoint Control Unit - MCU), which is situated between the sender and the receiver.
  • Bitstream-level manipulation, or transcoding can be used at the gateway, but at considerable processing and complexity cost. It is therefore desirable to employ a technique that achieves rate control without requiring any additional processing at the intermediate gateway.
  • Consideration is now being given to improving error resilience, capabilities for random access to the coded bitstreams, and rate control in video communication systems. Attention is directed to developing error resilience, rate control, and random access techniques that have a minimal impact on end-to-end delay and the bandwidth used by the system.
  • the present invention provides systems and methods to increase error resilience and provide random access and rate control capabilities in video communication systems that use scalable video coding.
  • the systems and methods also allow the derivation of an output signal at a resolution different than the coded resolutions, with excellent rate-distortion performance.
  • the present invention provides a mechanism to recover from loss of packets of a high resolution spatially scalable layer by using information from the low resolution spatial layer.
  • the present invention provides a mechanism to switch from a low spatial or SNR resolution to a high spatial or SNR resolution with little or no delay.
  • the present invention provides a mechanism for performing rate control, in which the encoder or an intermediate gateway (e.g., an MCU) selectively eliminates packets from the high resolution spatial layer, anticipating the use of appropriate error recovery mechanisms at the receiver that minimize the impact of the lost packets on the quality of the received signal.
  • the encoder or an intermediate gateway selectively replaces packets from the high resolution spatial layer with information that effectively instructs the decoder to reconstruct an approximation to the high resolution data being replaced using information from the base layer and past frames of the enhancement layer.
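Rate control by selective packet elimination can be sketched as follows. The `Packet` type, the layer labels, and the greedy first-come policy are hypothetical; the text only states that packets of the high resolution spatial layer are selectively eliminated and that the receiver conceals the loss.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    layer: str       # "base" or "enh" (hypothetical labels)
    size_bits: int


def thin_stream(packets, budget_bits):
    """Forward every base-layer packet; forward enhancement-layer
    packets only while the bit budget allows.

    Assumption: a simple greedy policy in arrival order. Dropped
    enhancement packets are expected to be concealed at the receiver
    using the base layer.
    """
    used = sum(p.size_bits for p in packets if p.layer == "base")
    forwarded = []
    for p in packets:
        if p.layer == "base":
            forwarded.append(p)
        elif used + p.size_bits <= budget_bits:
            forwarded.append(p)
            used += p.size_bits
    return forwarded, used


stream = [Packet("base", 100), Packet("enh", 300),
          Packet("base", 100), Packet("enh", 300)]
forwarded, used = thin_stream(stream, budget_bits=600)
print(len(forwarded), used)  # prints 3 500
```

Because the gateway only inspects the layer label and the packet size, no decoding or transcoding is needed at the intermediate node.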
  • the present invention describes a mechanism for deriving an output video signal at a resolution different than the coded resolutions, and specifically an intermediate resolution between those used for spatially scalable coding.
  • the techniques simultaneously achieve error resilience and rate control for a particular family of video encoders referred to as scalable video encoders.
  • the rate-distortion performance of the error concealment techniques is such that it matches or exceeds that of coding at the effective transfer rate (total transmitted minus the rate of the lost packets).
  • the techniques allow nearly instantaneous layer switching with very little bit rate overhead.
  • the techniques can be used to derive a decoded version of the received signal at a resolution different than the coded resolution(s). This allows, for example, the creation of a ½ CIF (HCIF) signal out of a spatially scalable coded signal at QCIF and CIF resolutions.
  • HCIF ½ CIF
  • the receiver would either have to use the QCIF signal and upsample it (with poor quality), or use the CIF signal and downsample it (with good quality but high bit rate utilization).
  • the same problem also exists if the QCIF and CIF are simulcast as single-layer streams.
  • the techniques also provide rate control with minimal processing of the encoded video bitstream without adversely affecting picture quality.
  • FIG. 1 is a block diagram illustrating the overall architecture of a videoconferencing system in accordance with the principles of the present invention
  • FIG. 2 is a block diagram illustrating an exemplary end-user terminal in accordance with the principles of the present invention
  • FIG. 3 is a block diagram illustrating an exemplary architecture of a video encoder (base and temporal enhancement layers) in accordance with the principles of the present invention
  • FIG. 4 is a diagram illustrating an exemplary picture coding structure in accordance with the principles of the present invention
  • FIG. 5 is a diagram illustrating an example of an alternative picture coding structure in accordance with the principles of the present invention.
  • FIG. 6 is a block diagram illustrating an exemplary architecture of a video encoder for a spatial enhancement layer in accordance with the principles of the present invention
  • FIG. 7 is a diagram illustrating an exemplary picture coding structure when spatial scalability is used in accordance with the principles of the present invention
  • FIG. 8 is a diagram illustrating an exemplary decoding process with concealment of enhancement layer pictures in accordance with the principles of the present invention
  • FIG. 9 is a diagram illustrating exemplary R-D curves of the concealment process when applied to the 'Foreman' sequence in accordance with the principles of the present invention
  • FIG. 10 is a diagram illustrating an exemplary picture coding structure when spatial scalability with SR pictures is used in accordance with the principles of the present invention.
  • Systems and methods are provided for error resilient transmission, random access and rate control in video communication systems.
  • the systems and methods exploit error concealment techniques based on features of scalable video coding, which may be used in the video communication systems.
  • an exemplary video communication system may be a multi-point videoconferencing system 10 operated over a packet-based network.
  • Multi-point videoconferencing system may include optional bridges 120a and 120b (e.g., Multipoint Control Unit (MCU) or Scalable Video Communication Server (SVCS)) to mediate scalable multilayer or single layer video communications between endpoints (e.g., users 1-k and 1-m) over the network.
  • MCU Multipoint Control Unit
  • SVCS Scalable Video Communication Server
  • the operation of the exemplary video communication system is the same and as advantageous for a point-to-point connection with or without the use of optional bridges 120a and 120b.
  • the techniques described in this invention can be applied directly to all other video communication applications, including point-to-point streaming, broadcasting, multicasting, etc.
  • FIG. 1 shows the general structure of a videoconferencing system 10.
  • Videoconferencing system 10 includes a plurality of end-user terminals (e.g., users 1-k and users 1-m) that are linked over a network 100 via LANs 1 and 2 and servers 120a and 120b.
  • the servers may be traditional MCUs, or Scalable Video Coding servers (SVCS) or Compositing Scalable Video Coding servers (CSVCS).
  • SVCS Scalable Video Coding servers
  • CSVCS Compositing Scalable Video Coding servers
  • the latter servers have the same purpose as traditional MCUs, but with significantly reduced complexity and improved functionality.
  • See, e.g., International patent application Nos. PCT/US06/28366 and PCT/US06/62569.
  • the term "server" may be used generically to refer to either an SVCS or a CSVCS.
  • FIG. 2 shows the architecture of an end-user terminal 140, which is designed for use with videoconferencing systems (e.g., system 100) based on multi layer coding.
  • Terminal 140 includes human interface input/output devices (e.g., a camera 210A, a microphone 210B, a video display 250C, a speaker 250D), and one or more network interface controller cards (NICs) 230 coupled to input and output signal multiplexer and demultiplexer units (e.g., packet MUX 220A and packet DMUX 220B).
  • NIC 230 may be a standard hardware component, such as an Ethernet LAN adapter, or any other suitable network interface device, or a combination thereof.
  • Camera 210A and microphone 210B are designed to capture participant video and audio signals, respectively, for transmission to other conferencing participants.
  • video display 250C and speaker 250D are designed to display and play back video and audio signals received from other participants, respectively.
  • Video display 250C may also be configured to optionally display participant/terminal 140's own video.
  • Camera 210A and microphone 210B outputs are coupled to video and audio encoders 210G and 210H via analog-to-digital converters 210E and 210F, respectively.
  • Video and audio encoders 210G and 210H are designed to compress input video and audio digital signals in order to reduce the bandwidths necessary for transmission of the signals over the electronic communications network.
  • the input video signal may be live, or pre-recorded and stored video signals.
  • the encoders compress the local digital signals in order to minimize the bandwidth necessary for transmission of the signals.
  • the audio signal may be encoded using any suitable technique known in the art (e.g., G.711, G.729, G.729EV, MPEG-1, etc.).
  • the scalable audio codec G.729EV is employed by audio encoder 210G to encode audio signals.
  • the output of audio encoder 210G is sent to multiplexer MUX 220A for transmission over network 100 via NIC 230.
  • Packet MUX 220A may perform traditional multiplexing using the RTP protocol. Packet MUX 220A may also perform any related Quality of Service (QoS) processing that may be offered by network 100 or directly by a video communication application (see e.g. International patent application No. PCT/US06/061815). Each stream of data from terminal 140 is transmitted in its own virtual channel or "port number" in IP terminology.
  • QoS Quality of Service
  • Video encoder 210G is a scalable video encoder that has multiple outputs, corresponding to the various layers (here labeled "base" and "enhancement"). It is noted that simulcasting is a special case of scalable coding, where no inter-layer prediction takes place. In the following, when the term scalable coding is used, it includes the simulcasting case. The operation of the video encoder and the nature of the multiple outputs are described in more detail herein below.
  • In the H.264 standard specification, it is possible to combine views of multiple participants in a single coded picture by using a flexible macroblock ordering (FMO) scheme. In this scheme, each participant occupies a portion of the coded image corresponding to one of its slices.
  • FMO flexible macroblock ordering
  • a single decoder can be used to decode all participant signals.
  • the receiver/terminal will have to decode several smaller independently coded slices.
  • terminal 140 shown in FIG. 2 with decoders 230A may be used in applications of the H.264 specification.
  • the server for forwarding slices is a CSVCS.
  • demultiplexer DMUX 220B receives packets from NIC 230 and redirects them to the appropriate decoder unit 230A.
  • the SERVER CONTROL block in terminal 140 coordinates the interaction between the server (SVCS/CSVCS) and the end-user terminals as described in International patent application Nos. PCT/US06/028366 and PCT/US06/62569. In a point-to-point communication system without intermediate servers, the SERVER CONTROL block is not needed. Similarly, in non-conferencing applications, point-to-point conferencing applications, or when a CSVCS is used, only a single decoder may be needed at a receiving end-user terminal.
  • the transmitting end-user terminal may not involve the entire functionality of the audio and video encoding blocks and all blocks preceding them (camera, microphone, etc.). Specifically, only the portions related to selective transmission of video packets, as explained below, need to be provided.
  • the various components of the terminal may be separate devices that are interconnected to each other, they may be integrated in a personal computer in software or hardware, or they could be combinations thereof.
  • FIG. 3 shows an exemplary base layer video encoder 300.
  • Encoder 300 includes a FRAME BUFFERS block 310 and an Encoder Reference Control (ENC REF CONTROL) block 320 in addition to conventional "text-book" variety video coding process blocks 330 for motion estimation (ME), motion compensation (MC), and other encoding functions.
  • Video encoder 300 may be designed, for example, according to the H.264/MPEG-4 AVC (ITU-T and ISO/IEC JTC 1, "Advanced video coding for generic audiovisual services," ITU-T Recommendation H.264 and ISO/IEC 14496-10 (MPEG4-AVC)) or SVC (J. Reichel, H. Schwarz, and M.
  • Standard block-based motion-compensated codecs have a regular structure of I, P, and B frames.
  • a picture sequence in display order
  • the 'P' frames are predicted from the previous P or I frame in the sequence
  • the B pictures are predicted using both the previous and next P or I frame.
  • Although the number of B pictures between successive I or P pictures can vary, as can the rate at which I pictures appear, it is not possible, for example, for a P picture to use as a reference for prediction another P picture that is earlier in time than the most recent one.
  • the H.264 coding standard advantageously provides an exception in that two reference picture lists are maintained by the encoder and decoder, respectively, with appropriate signaling information that provide for reordering and selective use of pictures from within those lists. This exception can be exploited to select which pictures are used as references and also which references are used for a particular picture that is to be coded.
  • FRAME BUFFERS block 310 represents memory for storing the reference picture list(s).
  • ENC REF CONTROL block 320 is designed to determine which reference picture is to be used for the current picture at the encoder side. The operation of ENC REF CONTROL block 320 is placed in further context with reference to an exemplary layered picture coding "threading" or "prediction chain" structure 400 shown in FIG.
  • L0 is simply a series of regular P pictures spaced four pictures apart. L1 has the same frame rate, but prediction is only allowed from the previous L0 frame. L2 frames are predicted from the most recent L0 or L1 frame. L0 provides one fourth (1:4) of the full temporal resolution, L1 doubles the L0 frame rate (1:2), and L2 doubles the L0+L1 frame rate (1:1).
  • Codecs 300 utilized in implementations of the present invention may be configured to generate a set of separate picture "threads" (e.g., a set of three threads 410-430) in order to enable multiple levels of temporal scalability resolutions (e.g., L0-L2) and other enhancement resolutions (e.g., S0-S2).
  • a thread or prediction chain is defined as a sequence of pictures that are motion-compensated using pictures either from the same thread, or pictures from a lower level thread.
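The three-thread pattern described above can be sketched in Python. This is a minimal illustration, assuming a repeating four-picture period in which position 0 is L0, position 2 is L1, and positions 1 and 3 are L2. That layout and the function names `temporal_layer` and `reference_picture` are assumptions made for illustration and are not taken from FIG. 4.

```python
from typing import Optional


def temporal_layer(n: int) -> str:
    """Return the thread label of picture n (assumed four-picture period)."""
    phase = n % 4
    if phase == 0:
        return "L0"
    if phase == 2:
        return "L1"
    return "L2"


def reference_picture(n: int) -> Optional[int]:
    """Return the index of the picture that picture n is predicted from.

    Picture 0 is assumed to start the sequence and has no reference.
    """
    if n == 0:
        return None
    phase = n % 4
    if phase == 0:
        return n - 4  # L0: previous L0 picture, four pictures earlier
    if phase == 2:
        return n - 2  # L1: previous L0 picture
    return n - 1      # L2: most recent L0 or L1 picture


if __name__ == "__main__":
    for n in range(8):
        print(n, temporal_layer(n), reference_picture(n))
```

Under these assumptions, every picture references only its own thread or a lower one, which matches the definition of a prediction chain given above.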
  • ENC REF CONTROL block may use only P pictures as reference pictures.
  • The use of B pictures with both forward and backward prediction increases the coding delay by the time it takes to capture and encode the reference pictures used for the B pictures.
  • The use of B pictures with prediction from future pictures increases the coding delay and is therefore avoided.
  • B pictures also may be used with accompanying gains in overall compression efficiency.
  • Using even a single B picture in the set of threads e.g., by having L2 be coded as a B picture
  • some or all pictures can be B pictures with bi-directional prediction.
  • it is possible to use B pictures without incurring extra delay as the standard allows the use of two motion vectors that both use reference pictures that are in the past in display order. In this case, such B pictures can be used without increasing the coding delay compared with P picture coding.
  • base layer encoder 300 can be augmented to create spatial and/or quality enhancement layers, as described, for example in the H.264 SVC Standard draft and in International patent application No. PCT/US06/28365.
  • FIG. 6 shows the structure of an exemplary encoder 600 for creating the spatial enhancement layer.
  • the structure of encoder 600 is similar to that of base layer codec 300, with the additional feature that the base layer information is also made available to encoder 600. This information may include motion vector data, macroblock mode data, coded prediction error data, and reconstructed pixel data. Encoder 600 can re-use some or all of this information in order to make coding decisions for the enhancement layer.
  • the base layer data has to be scaled to the target resolution of the enhancement layer (e.g., by a factor of 2 if the base layer is QCIF and the enhancement layer is CIF).
  • Although spatial scalability usually requires two coding loops to be maintained, it is possible (e.g., under the H.264 SVC draft standard) to perform single-loop decoding by limiting the base layer data that is used for enhancement layer coding to only values that are computable from the information encoded in the current picture's base layer. For example, if a base layer macroblock is inter-coded, then the enhancement layer cannot use the reconstructed pixels of that macroblock as a basis for prediction. It can, however, use its motion vectors and the prediction error values since they are obtainable by just decoding the information contained in the current base layer picture.
  • Single-loop decoding is desirable since the complexity of the decoder is significantly decreased.
  • the threading structure can be utilized for the enhancement layer frames in the same manner as for the base layer frames.
  • FIG. 7 shows an exemplary threading structure 700 for the enhancement layer frames following the design shown in FIG. 4.
  • the enhancement layer blocks in structure 700 are indicated by the letter 'S'. It is noted that threading structures for the enhancement layer frames and the base layer can be different, as explained in International patent application No. PCT/US06/28365.
  • enhancement layer codecs for quality scalability can be constructed, for example, as described in the SVC draft standard and described in International patent application No. PCT/US06/28365.
  • the enhancement layer is built by coding the residual prediction error at the same spatial resolution as the input.
  • all the macroblock data of the base layer can be re-used at the enhancement layer for quality scalability, in either single- or dual-loop coding configurations.
  • structure 400 (FIG. 4) has distinct advantages in terms of robustness in the presence of transmission errors.
  • threading structure 400 creates three self-contained chains of dependencies. A packet loss occurring at an L2 picture will only affect L2 pictures; L0 and L1 pictures can still be decoded and displayed. Similarly, a packet loss occurring at an L1 picture will only affect L1 and L2 pictures; L0 pictures can still be decoded and displayed.
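The error containment property can be checked with a short Python sketch. It assumes the same hypothetical four-picture layout (L0, L2, L1, L2) for the threads; the helper names `layer_of`, `reference_of`, and `affected_pictures` are illustrative and do not appear in the source.

```python
def layer_of(n):
    """Thread label of picture n under the assumed four-picture period."""
    return ("L0", "L2", "L1", "L2")[n % 4]


def reference_of(n):
    """Index of the reference picture of picture n (None for picture 0)."""
    if n == 0:
        return None
    return n - (4, 1, 2, 1)[n % 4]


def affected_pictures(lost, total):
    """Return the pictures in range(total) that cannot be decoded correctly
    once picture `lost` is missing, by following the prediction chains."""
    damaged = {lost}
    for n in range(lost + 1, total):
        if reference_of(n) in damaged:
            damaged.add(n)
    return sorted(damaged)


if __name__ == "__main__":
    print(affected_pictures(1, 8))  # loss of an L2 picture
    print(affected_pictures(2, 8))  # loss of an L1 picture
```

In this model a lost L2 picture damages only itself, and a lost L1 picture damages only itself and the L2 picture that depends on it, so the L0 thread remains decodable in both cases.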
  • the same error containment properties of the threads extend to S packets. For example, with structure 700 (FIG.
  • HRC High Reliability Channel
  • LRC Low Reliability Channel
  • the base layer information that can be used includes motion vector data (appropriately scaled for the target layer resolution), coded prediction error difference (upsampled for the enhancement layer resolution, if necessary), and intra data (upsampled for the enhancement layer resolution, if necessary).
  • Prediction references from prior pictures are taken, when needed, from the enhancement layer resolution pictures rather than the corresponding base layer pictures. This data allows the decoder to reconstruct a very close approximation of the missing frame, thus minimizing the actual and perceived distortions on the missing frame. Furthermore, decoding of any dependent frames is now also possible since a good approximation of the missing frame is available.
  • concealment decoding process 800 comprises exemplary steps 810-840, illustrated using an example of a two-layer spatial scalability encoded signal with resolutions QCIF and CIF and two prediction threads (L0/S0 and L1/S1). It will be understood that process 800 is applicable to other resolutions and to different numbers of threads than shown. In the example, it is assumed that at coded data arrival step 810 the coded data for L0, S0, and L1 arrive intact at the receiving terminal, but the coded data for S1 are lost. Further, it is assumed that all coded data for pictures prior to the picture corresponding to time t0 also have been received at the receiving terminal.
  • FIG. 8 shows a particular example, in which a block of the L1 picture at time t1, LB1, is encoded at base layer decoding step 820 by using motion-compensated prediction with a motion vector LMV1 and a residual LRES1 that is to be added to the motion-compensated prediction.
  • the data for LMV1 and LRES1 are contained in the L1 data received by the receiving terminal.
  • the decoding process requires block LB0 from the prior base layer picture (the L0 picture), which is available at the decoder as a result of the normal decoding process. Since the S1 data are assumed to be lost in this example, the decoder cannot use the corresponding information to decode the enhancement layer picture.
  • Concealment decoding process 800 constructs an approximation for an enhancement layer block SB1.
  • process 800 generates concealment data by obtaining the coded data of the corresponding base layer block LB1, in this example LMV1 and LRES1. It then scales the motion vector to the resolution of the enhancement layer, to construct an enhancement layer motion vector SMV1.
  • SMV1 is equal to two times LMV1 since the ratio of resolutions of the scalable signal is 2.
  • the concealment decoding process 800 upsamples the base layer residual signal to the resolution of the enhancement layer, by a factor of 2 in each dimension, and then optionally low-pass filters the result with the filter LPF, in accordance with well-known principles of sample rate conversion processes.
  • the further result of concealment data generation step 830 is a residual signal SRES1.
  • Next step 840 (Decoding process for the enhancement layer with concealment) uses the constructed concealment data SMV1 and SRES1 to approximate block SB1. It is noted that the approximation requires the block SB0 from the previous enhancement layer picture, which is assumed to be available at the decoder as a result of the regular decoding process of the enhancement layer. Different encoding modes may operate in the same or similar way.
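Steps 830 and 840 can be sketched as follows. This is a simplified illustration, not the decoder described in the source: it assumes integer-pixel motion vectors given as (dy, dx), upsamples the residual by plain sample repetition without the optional low-pass filter, and does no boundary clipping. The function names are hypothetical.

```python
def scale_motion_vector(mv, ratio=2):
    """Scale a base layer motion vector (dy, dx) to the enhancement layer."""
    return (mv[0] * ratio, mv[1] * ratio)


def upsample_residual(block, ratio=2):
    """Upsample a 2-D residual block by sample repetition in each dimension."""
    out = []
    for row in block:
        wide = [value for value in row for _ in range(ratio)]
        out.extend(list(wide) for _ in range(ratio))
    return out


def conceal_block(prev_enh, top, left, lmv, lres, ratio=2):
    """Approximate an enhancement layer block from base layer data.

    prev_enh   : previous enhancement layer picture (2-D list of samples)
    top, left  : position of the block in the enhancement layer picture
    lmv, lres  : base layer motion vector and residual block
    """
    dy, dx = scale_motion_vector(lmv, ratio)   # plays the role of SMV1
    sres = upsample_residual(lres, ratio)      # plays the role of SRES1
    return [
        [prev_enh[top + dy + r][left + dx + c] + sres[r][c]
         for c in range(len(sres[0]))]
        for r in range(len(sres))
    ]


if __name__ == "__main__":
    previous = [[r * 10 + c for c in range(8)] for r in range(8)]
    print(conceal_block(previous, 0, 0, (1, 0), [[1]]))
```

The motion-compensated prediction is taken from the previous enhancement layer picture rather than from the base layer, in line with the description above.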
  • a further illustrative application of the inventive concealment technique relates to the example of high resolution images.
  • high resolution images e.g., greater than CIF
  • MTU maximum transmission unit
  • an S layer frame is broken into MTU size slices at the encoder for transmission. On the decoder side whatever slices are available from the S picture as received are used. Missing slices are compensated for using the concealment method (e.g., process 800), thus reducing the overall distortion.
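The slice-level use of concealment can be sketched as below. The sketch assumes that the coded size of each macroblock is known and ignores slice and packet headers; `group_macroblocks` and `macroblocks_to_conceal` are hypothetical names introduced only for this illustration.

```python
def group_macroblocks(mb_sizes, mtu):
    """Group consecutive macroblocks into slices that each fit in one MTU.

    mb_sizes : coded size in bytes of each macroblock, in coding order
    mtu      : maximum payload size in bytes (headers are ignored here)
    """
    slices, current, used = [], [], 0
    for index, size in enumerate(mb_sizes):
        if size > mtu:
            raise ValueError("a single macroblock exceeds the MTU")
        if current and used + size > mtu:
            slices.append(current)
            current, used = [], 0
        current.append(index)
        used += size
    if current:
        slices.append(current)
    return slices


def macroblocks_to_conceal(slices, received):
    """Return the macroblock indices carried by slices that did not arrive."""
    return [mb for i, group in enumerate(slices)
            if i not in received for mb in group]


if __name__ == "__main__":
    slices = group_macroblocks([400, 500, 300, 700, 200], 1000)
    print(slices)
    print(macroblocks_to_conceal(slices, {0, 2}))
```

Only the macroblocks of the missing slice would be reconstructed from base layer data; the received slices are decoded normally.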
  • FIG. 9 shows rate-distortion curves obtained using the standard "foreman" video test sequence with different QPs. For each QP, rate-distortion values were obtained by dropping different numbers of S1 and S2 frames, while applying the inventive error concealment technique described above. As seen in FIG.
  • the effective transmission rate is defined as the transmission rate minus the loss rate, i.e., the rate calculated based on the packets that actually arrive at the destination.
  • the bit rate corresponding to S1 and S2 frames is typically 30% of the total for the specific coding structure, which implies that any bit rate between 70% and 100% may be achieved by eliminating a selected number of S1 and S2 frames for rate control.
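The 70% to 100% range follows from simple arithmetic, sketched here in Python. The sketch assumes that the dropped S1 and S2 frames are of average size, so that the saved bit rate is proportional to the fraction of frames dropped; the 30% share is the typical figure quoted above, and the function names are illustrative.

```python
def relative_bit_rate(dropped_fraction, s1_s2_share=0.30):
    """Bit rate, relative to the full stream, after dropping a fraction
    (0.0 to 1.0) of the S1 and S2 frames."""
    if not 0.0 <= dropped_fraction <= 1.0:
        raise ValueError("dropped_fraction must lie between 0 and 1")
    return 1.0 - s1_s2_share * dropped_fraction


def fraction_to_drop(target_rate, s1_s2_share=0.30):
    """Fraction of S1 and S2 frames to drop to reach a relative target rate."""
    lowest = 1.0 - s1_s2_share
    if not lowest <= target_rate <= 1.0:
        raise ValueError("target rate is outside the achievable range")
    return (1.0 - target_rate) / s1_s2_share


if __name__ == "__main__":
    print(relative_bit_rate(0.5))
    print(fraction_to_drop(0.85))
```

For example, a target of 85% of the full rate corresponds to dropping about half of the S1 and S2 frames under these assumptions.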
  • Bit rates between 70% and 100% may be achieved by selecting the number of S2 or
  • Table I summarizes the rate percentage of the different frame types for a typical video sequence (e.g., spatial scalability, QCIF-CIF resolution, three-layer threading, 380 Kbps).
  • Alternative techniques known in the art such as Fine Granularity Scalability (FGS) attempt to achieve similar rate flexibility, but with very poor rate-distortion performance and significant computational overhead.
  • FGS Fine Granularity Scalability
  • the concealment technique of the present invention offers the rate scalability associated with FGS, but without the coding efficiency penalty associated with such techniques.
  • the intentional elimination of S1 and S2 frames from the video transmission may be performed either at the encoder or at an available intermediate gateway (e.g., a SVCS/CSVCS).
  • a further use of the inventive concealment technique is to display the video signal at a resolution in between the two coded resolutions. For example, assume a video signal is coded at QCIF and CIF resolution using a spatially scalable codec.
  • a traditional decoder would follow one of two approaches: 1) decode the QCIF signal and upsample to HCIF, or 2) decode the CIF signal and downsample to HCIF.
  • the HCIF picture quality will not be good, but the bitrate used will be low.
  • the quality can be very good, but the bitrate used will also be nearly double that required in the first approach.
  • application of the inventive concealment technique for deriving an intermediate resolution requires operation of the enhancement layer decoding loop for S0 at full resolution.
  • the decoding involves both the generation of the decoded prediction error, as well as the application of motion compensation at full resolution.
  • the decoded prediction error may be generated in full resolution, followed by downsampling to the target resolution (e.g., HCIF).
  • the reduced resolution signal may then be motion compensated using appropriately scaled motion vectors and residual information.
  • This technique can also be used on any portion of the 'S' layer that is retained for transmission to the receiver. As there will be drift introduced in the enhancement layer decoding loop, a mechanism to periodically eliminate drift may be required.
  • the periodic use of the INTRA BL mode of spatial scalability for each enhancement layer macroblock may be employed, where only information from the base layer is used for prediction. (See e.g., PCT/US06/28365). Since no temporal information is used, the drift for that particular macroblock is eliminated. If SR pictures are used, drift can also be eliminated by decoding all SR pictures at full resolution. Since SR pictures are far apart, there can still be considerable gain in computational complexity.
  • the technique for deriving an intermediate resolution signal may be modified by operating the enhancement layer decoder loop in reduced resolution. In cases where CPU resources are not a limiting factor and faster switching than the SR separation is required or desired, the same (i.e., operating the decoder loop at full resolution) can be applied to higher temporal level (e.g., S0) as needed.
  • Another exemplary application of the inventive concealment technique is to a video conferencing system in which spatial or quality levels are achieved via simulcast.
  • concealment is performed using base layer information as described above.
  • the enhancement layer's drift can be eliminated via any one of a) threading, b) standard SVC temporal scalability, c) periodic I frames, and d) periodic intra macroblocks.
  • An SVCS/CSVCS that is utilizing simulcast to provide spatial scalability, and is only transmitting the higher resolution information for a particular destination for a particular stream (for example if it assumes no or almost no errors), may replace a missing frame of the high resolution with a low resolution one, anticipating such error concealment mechanism on the decoder, and relying on temporal scalability to eliminate drift as discussed above. It will be understood that the concealment process described can be readily adapted to create an effective rate control on such a system.
  • such entity may create a replacement high resolution frame that will achieve a similar functionality by one of the following methods: a) for error resilience in spatial scalability coding, create a synthetic frame, based on parsing of the lower resolution frame that will include only the appropriate signaling to use upsampled base layer information without any additional residuals or motion vector refinement; b) for rate control in a system using spatial scalability, the combination of the method described in (a) with the addition that some macroblocks (MBs) containing significant information from the original high resolution frame are retained; c) for an error resilient system using simulcast for spatial scalability, create a replacement high resolution frame that will include synthetic MBs that will include upsampled motion vectors and residual information; d) for rate control in a system using simulcast for spatial s
  • the signaling to use only an upsampled version of the base layer picture can be performed either in-band through the coded video bitstream or through out-of-band information that is sent from the encoder or
  • the encoder or SVCS/CSVCS may selectively eliminate some information from enhancement layer MBs.
  • the encoder or SVCS/CSVCS may selectively maintain motion vector refinements, but eliminate residual prediction, or keep residual prediction, but eliminate motion vector refinements.
  • base_mode_flag, motion_prediction_flag
  • residual_prediction_flag
  • adaptive_prediction_flag, which is used to indicate the presence of base_mode_flag in the MB layer.
  • the bitrate of the coded stream without the S1 and S2 frames may be very uneven or "bursty," since the S0 frames are typically quite large (e.g., as high as 45% of the total bandwidth).
  • the S0 packets may be transmitted by splitting them into smaller packets and/or slices and spreading their transmission over the time interval between successive S0 pictures.
  • the entire S0 picture will not be available for the first S2 picture, but information that has been received by the first S2 picture (i.e., portions of S0 and the entire L0 and L2) can be used for concealment purposes.
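One way to picture the spreading of a large S0 picture is a uniform pacing schedule, sketched below. The uniform spacing, the packet size, and the name `pacing_schedule` are assumptions for illustration; the source does not prescribe a particular schedule.

```python
def pacing_schedule(picture_size, packet_size, interval_ms):
    """Split one coded picture into packets and spread them over an interval.

    picture_size : coded size of the picture in bytes
    packet_size  : maximum payload per packet in bytes
    interval_ms  : time between successive S0 pictures in milliseconds
    Returns a list of (send_time_ms, payload_bytes) pairs.
    """
    if picture_size <= 0 or packet_size <= 0 or interval_ms <= 0:
        raise ValueError("all arguments must be positive")
    count = -(-picture_size // packet_size)  # ceiling division
    step = interval_ms / count               # uniform spacing between packets
    schedule = []
    remaining = picture_size
    for i in range(count):
        payload = min(packet_size, remaining)
        schedule.append((i * step, payload))
        remaining -= payload
    return schedule


if __name__ == "__main__":
    for send_time, payload in pacing_schedule(2500, 1000, 120.0):
        print(send_time, payload)
```

Sending the packets at evenly spaced instants, rather than back to back, smooths the instantaneous bit rate at the cost of delaying the arrival of the complete picture.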
  • VBR variable bit-rate
  • the progressive concealment technique provides a further solution for performing video switching.
  • the progressive concealment technique described above also may be used for video switching.
  • An exemplary switching application is to a single-loop, spatially scalable signal coded at QCIF and CIF resolutions with the three-layer threading structure shown in FIG. 7.
  • increased error resilience can be achieved by ensuring reliable transmission of some of the L0 pictures.
  • the L0 pictures that are reliably transmitted are referred to as LR pictures.
  • the same threading pattern can be extended to the S pictures, as shown in FIG. 10.
  • the temporal prediction paths for the S pictures are identical to those of the L pictures.
  • FIG. 10 shows an exemplary SR period of 1/3 (one out of every 3 S0 pictures is SR) for purposes of illustration.
  • different periods and different threading patterns can be used in accordance with the principles of the present invention.
  • different paths in the S and L pictures could also be used, but with a reduction in coding efficiency for the S pictures.
  • the SR pictures are assumed to be transmitted reliably. As described in International patent application No. PCT/US06/061815, this can be accomplished using a number of techniques, such as DiffServ coding (where LR and SR are in the HRC), FEC or ARQ.
  • an end-user at a terminal receiving a QCIF signal may desire to switch to a CIF signal.
  • In order to be able to start decoding the enhancement layer CIF signal, the terminal must acquire at least one correct CIF reference picture.
  • PCT/US06/061815 involves using periodic intra macroblocks, so that within a period of time all macroblocks of the CIF picture will be intra coded.
  • a drawback is that it will take a significant amount of time to do so, if the percentage of intra macroblocks is kept low (to minimize their impact on the total bandwidth).
  • the switching application of the progressive concealment technique exploits the reliable transmission of the SR pictures in order to be able to start decoding the enhancement layer CIF signal.
  • the SR pictures can be transmitted to the receiver and be decoded even if it operates at a QCIF level. Since they are infrequent, their overall effect on the bit rate can be minimal.
  • the decoder can utilize the most recent SR frame, and proceed as if intermediate S pictures until the first S picture received were lost. If additional bit rate is available, the sender or server can also forward cached copies of all intermediate S0 pictures to further aid the receiver in constructing a reference picture as close to the starting frame of CIF playback as possible.
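The switching rule can be sketched as a small planning function. It assumes that the receiver keeps a label for each enhancement layer picture position ("SR", "S0", "S1" or "S2") in decoding order; the label list and the name `plan_switch` are hypothetical.

```python
def plan_switch(labels, first_received):
    """Plan the start of enhancement layer decoding after a switch.

    labels         : label of each S picture position in decoding order
    first_received : index of the first S picture received after the switch
    Returns (sr_index, concealed), where sr_index is the most recent SR
    picture and concealed lists the intermediate pictures treated as lost.
    """
    for index in range(first_received, -1, -1):
        if labels[index] == "SR":
            return index, list(range(index + 1, first_received))
    raise ValueError("no SR picture precedes the switch point")


if __name__ == "__main__":
    labels = ["SR", "S2", "S1", "S2", "S0", "S2", "S1", "S2"]
    print(plan_switch(labels, 6))
```

The pictures in the returned list would be reconstructed with the concealment process from base layer data; if cached S0 pictures are forwarded, they can simply be removed from that list.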
  • the rate-distortion performance of the S1/S2 concealment technique will ensure that the impact on quality is minimized.
  • the inventive technique can also be used advantageously when the end-user decodes at an intermediate output resolution, e.g., HCIF, and desires to switch to CIF.
  • An HCIF signal can be effectively derived from the L0-L2 and a portion of the S0-S2 pictures (e.g., only S0), coupled with concealment for dropped S frames.
  • the decoder, which receives at least a portion of the S0 pictures, can immediately switch to CIF resolution with very small PSNR penalty. Further, this penalty will be eliminated as soon as the next S0/SR picture arrives. Thus, in this case, there is practically no overhead and almost instantaneous switching can be achieved.
  • This feature is desirable for accommodating both conference participants who prefer to view such an active layout, and other conference participants who prefer a static view. Since the switching-by-concealment method does not require any additional information to be sent by the encoder, the choice of layout by one receiver does not impact the bandwidth received by others.
  • the foregoing description refers to creating effective rendering for intermediate resolutions and bit rates that span the range between resolutions/bit rates directly provided by the encoder. It will be understood that other methods that are known to decrease the bit rate (e.g., by introducing drift) such as data partitioning or re-quantization can be employed by the SVCS/CSVCS in conjunction with the inventive methods described herein to provide a more detailed manipulation of the bit stream.
  • the scalable codecs and concealment techniques described herein may be implemented using any suitable combination of hardware and software.
  • the software (i.e., instructions) for implementing and operating the aforementioned scalable codecs can be provided on computer-readable media, which can include without limitation, firmware, memory, storage devices, microcontrollers, microprocessors, integrated circuits, ASICS, on-line downloadable media, and other available media.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Databases & Information Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

Systems and methods for error-resilient transmission, rate control, and random access in video communication systems that use scalable video coding. Error resilience is obtained by using information from low-resolution layers to conceal or compensate for the loss of high-resolution layer information. The same mechanism is used for rate control by selectively eliminating high-resolution layer information from transmitted signals, which elimination can be compensated for at the receiver using information from low-resolution layers. Furthermore, random access or switching between low and high resolutions is also achieved by using information from low-resolution layers to compensate for high-resolution spatial layer packets that may not have been received prior to the time of switching.
EP07757937A 2006-03-03 2007-03-05 System and method for providing error resilience, random access and rate control in scalable video communications Withdrawn EP1997236A4 (fr)

Applications Claiming Priority (14)

Application Number Priority Date Filing Date Title
US77876006P 2006-03-03 2006-03-03
US78703106P 2006-03-29 2006-03-29
US78699706P 2006-03-29 2006-03-29
PCT/US2006/028365 WO2008060262A1 (fr) 2005-09-07 2006-07-21 System and method for scalable and low-delay videoconferencing using scalable video coding
PCT/US2006/028367 WO2007075196A1 (fr) 2005-09-07 2006-07-21 System and method for a high reliability base layer trunk
PCT/US2006/028366 WO2008082375A2 (fr) 2005-09-07 2006-07-21 System and method for a conference server architecture for low delay and distributed conferencing applications
PCT/US2006/028368 WO2008051181A1 (fr) 2006-07-21 2006-07-21 System and method for jitter buffer reduction in scalable coding
US82960906P 2006-10-16 2006-10-16
US86251006P 2006-10-23 2006-10-23
PCT/US2006/061815 WO2007067990A2 (fr) 2005-12-08 2006-12-08 Systems and methods for error resilience and random access in video communication systems
PCT/US2006/062569 WO2007076486A2 (fr) 2005-12-22 2006-12-22 System and method for videoconferencing using scalable video decoding and image-compositing videoconferencing servers
US88414807P 2007-01-09 2007-01-09
PCT/US2007/062357 WO2007095640A2 (fr) 2006-02-16 2007-02-16 System and method for thinning of scalable video coding bit-streams
PCT/US2007/063335 WO2007103889A2 (fr) 2006-03-03 2007-03-05 System and method for providing error resilience, random access and rate control in scalable video communications

Publications (2)

Publication Number Publication Date
EP1997236A2 true EP1997236A2 (fr) 2008-12-03
EP1997236A4 EP1997236A4 (fr) 2011-05-04

Family

ID=39884835

Family Applications (1)

Application Number Title Priority Date Filing Date
EP07757937A Withdrawn EP1997236A4 (fr) 2006-03-03 2007-03-05 System and method for providing error resilience, random access and rate control in scalable video communications

Country Status (3)

Country Link
EP (1) EP1997236A4 (fr)
AU (1) AU2007223300A1 (fr)
WO (1) WO2007103889A2 (fr)

Families Citing this family (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8320450B2 (en) 2006-03-29 2012-11-27 Vidyo, Inc. System and method for transcoding between scalable and non-scalable video codecs
US7844097B2 (en) * 2007-12-03 2010-11-30 Samplify Systems, Inc. Compression and decompression of computed tomography data
US8319820B2 (en) * 2008-06-23 2012-11-27 Radvision, Ltd. Systems, methods, and media for providing cascaded multi-point video conferencing units
EP2152009A1 (fr) * 2008-08-06 2010-02-10 Thomson Licensing Method for predicting a lost or damaged block of an enhanced spatial layer frame and corresponding adapted SVC decoder
US9285589B2 (en) 2010-02-28 2016-03-15 Microsoft Technology Licensing, Llc AR glasses with event and sensor triggered control of AR eyepiece applications
US9366862B2 (en) 2010-02-28 2016-06-14 Microsoft Technology Licensing, Llc System and method for delivering content to a group of see-through near eye display eyepieces
US8467133B2 (en) 2010-02-28 2013-06-18 Osterhout Group, Inc. See-through display with an optical assembly including a wedge-shaped illumination system
KR20130000401A (ko) 2010-02-28 2013-01-02 오스터하우트 그룹 인코포레이티드 Local advertising content on an interactive head-mounted eyepiece
US9229227B2 (en) 2010-02-28 2016-01-05 Microsoft Technology Licensing, Llc See-through near-eye display glasses with a light transmissive wedge shaped illumination system
US9223134B2 (en) 2010-02-28 2015-12-29 Microsoft Technology Licensing, Llc Optical imperfections in a light transmissive illumination system for see-through near-eye display glasses
US10180572B2 (en) 2010-02-28 2019-01-15 Microsoft Technology Licensing, Llc AR glasses with event and user action control of external applications
US8472120B2 (en) 2010-02-28 2013-06-25 Osterhout Group, Inc. See-through near-eye display glasses with a small scale image source
US9759917B2 (en) 2010-02-28 2017-09-12 Microsoft Technology Licensing, Llc AR glasses with event and sensor triggered AR eyepiece interface to external devices
US9129295B2 (en) 2010-02-28 2015-09-08 Microsoft Technology Licensing, Llc See-through near-eye display glasses with a fast response photochromic film system for quick transition from dark to clear
US9091851B2 (en) 2010-02-28 2015-07-28 Microsoft Technology Licensing, Llc Light control in head mounted displays
US9134534B2 (en) 2010-02-28 2015-09-15 Microsoft Technology Licensing, Llc See-through near-eye display glasses including a modular image source
US20120249797A1 (en) 2010-02-28 2012-10-04 Osterhout Group, Inc. Head-worn adaptive display
US8482859B2 (en) 2010-02-28 2013-07-09 Osterhout Group, Inc. See-through near-eye display glasses wherein image light is transmitted to and reflected from an optically flat film
US9182596B2 (en) 2010-02-28 2015-11-10 Microsoft Technology Licensing, Llc See-through near-eye display glasses with the optical assembly including absorptive polarizers or anti-reflective coatings to reduce stray light
US8477425B2 (en) 2010-02-28 2013-07-02 Osterhout Group, Inc. See-through near-eye display glasses including a partially reflective, partially transmitting optical element
US20150309316A1 (en) 2011-04-06 2015-10-29 Microsoft Technology Licensing, Llc Ar glasses with predictive control of external device based on event input
US9128281B2 (en) 2010-09-14 2015-09-08 Microsoft Technology Licensing, Llc Eyepiece with uniformly illuminated reflective display
US8488246B2 (en) 2010-02-28 2013-07-16 Osterhout Group, Inc. See-through near-eye display glasses including a curved polarizing film in the image source, a partially reflective, partially transmitting optical element and an optically flat film
US9097891B2 (en) 2010-02-28 2015-08-04 Microsoft Technology Licensing, Llc See-through near-eye display glasses including an auto-brightness control for the display brightness based on the brightness in the environment
US9097890B2 (en) 2010-02-28 2015-08-04 Microsoft Technology Licensing, Llc Grating in a light transmissive illumination system for see-through near-eye display glasses
US9341843B2 (en) 2010-02-28 2016-05-17 Microsoft Technology Licensing, Llc See-through near-eye display glasses with a small scale image source
CA2838067A1 (fr) 2011-06-08 2012-12-13 Vidyo, Inc. Systems and methods for improving interactive content sharing in video communication systems
US8184069B1 (en) 2011-06-20 2012-05-22 Google Inc. Systems and methods for adaptive transmission of data
WO2014043165A2 (fr) 2012-09-11 2014-03-20 Vidyo, Inc. System and method for agent-based integration of instant messaging and video communication systems

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2003063505A1 (en) * 2002-01-23 2003-07-31 Nokia Corporation Grouping of image frames in video coding
US20040170331A1 (en) * 2003-01-15 2004-09-02 Canon Kabushiki Kaisha Decoding of a digital image encoded at a plurality of resolution levels

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7072394B2 (en) * 2002-08-27 2006-07-04 National Chiao Tung University Architecture and method for fine granularity scalable video coding

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
CHEN Y ET AL: "SVC frame loss concealment", ITU STUDY GROUP 16 - VIDEO CODING EXPERTS GROUP -ISO/IEC MPEG & ITU-T VCEG(ISO/IEC JTC1/SC29/WG11 AND ITU-T SG16 Q6), XX, XX, no. JVT-Q046, 12 October 2005 (2005-10-12), XP030006207, *
JVT: "Joint Scalable Video Model JSVM 4", ITU STUDY GROUP 16 - VIDEO CODING EXPERTS GROUP -ISO/IEC MPEG & ITU-T VCEG(ISO/IEC JTC1/SC29/WG11 AND ITU-T SG16 Q6), XX, XX, no. JVT-Q202, 18 November 2005 (2005-11-18), XP030006256, *
See also references of WO2007103889A2 *
TIAN V KUMAR MV TAMPERE INTERNATIONAL CTR FOR SIGNAL PROCESSING (FINLAND) D ET AL: "Improved H.264/AVC video broadcast/multicast", VISUAL COMMUNICATIONS AND IMAGE PROCESSING; 12-7-2005 - 15-7-2005; BEIJING, 12 July 2005 (2005-07-12), XP030080844, *

Also Published As

Publication number Publication date
WO2007103889A2 (fr) 2007-09-13
WO2007103889A8 (fr) 2008-10-09
AU2007223300A1 (en) 2007-09-13
EP1997236A4 (fr) 2011-05-04
WO2007103889A3 (fr) 2008-02-28

Similar Documents

Publication Publication Date Title
US9270939B2 (en) System and method for providing error resilience, random access and rate control in scalable video communications
JP6309463B2 (en) System and method for providing error resilience, random access, and rate control in scalable video communication
WO2007103889A2 (en) System and method for providing error resilience, random access and rate control in scalable video communications
US8442120B2 (en) System and method for thinning of scalable video coding bit-streams
US9426499B2 (en) System and method for scalable and low-delay videoconferencing using scalable video coding
CA2640246C (en) System and method for thinning of scalable video coding bit-streams
CA2633366C (en) System and method for videoconferencing using scalable video coding and compositing scalable video conferencing servers
US8436889B2 (en) System and method for videoconferencing using scalable video coding and compositing scalable video conferencing servers
US20160360155A1 (en) System and method for scalable and low-delay videoconferencing using scalable video coding
EP1952631A1 (en) System and method for scalable and low-delay videoconferencing using scalable video coding
JP2009540625A6 (en) System and method for thinning of scalable video coding bit-streams
Sun et al. Seamless switching of scalable video bitstreams for efficient streaming
CA2796882A1 (en) System and method for scalable and low-delay videoconferencing using scalable video coding
AU2011254031B2 (en) System and method for providing error resilience, random access and rate control in scalable video communications
AU2006346225B8 (en) System and method for scalable and low-delay videoconferencing using scalable video coding

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20081002

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC MT NL PL PT RO SE SI SK TR

RIC1 Information provided on ipc code assigned before grant

Ipc: H04B 1/66 20060101AFI20081105BHEP

RIN1 Information on inventor provided before grant (corrected)

Inventor name: ELEFTHERIADIS, ALEXANDROS

Inventor name: SHAPIRO, OFER

Inventor name: WIEGAND, THOMAS

Inventor name: HONG, DANNY

DAX Request for extension of the european patent (deleted)

A4 Supplementary search report drawn up and despatched

Effective date: 20110404

RIC1 Information provided on ipc code assigned before grant

Ipc: H04N 7/26 20060101ALI20110329BHEP

Ipc: H04N 7/46 20060101ALI20110329BHEP

Ipc: H04B 1/66 20060101AFI20081105BHEP

17Q First examination report despatched

Effective date: 20141016

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: VIDYO, INC.

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20181002