WO2007067479A1 - Feedback and frame synchronization between media encoders and decoders - Google Patents

Feedback and frame synchronization between media encoders and decoders

Info

Publication number
WO2007067479A1
Authority
WO
WIPO (PCT)
Prior art keywords
frame
decoder
frames
encoder
cache
Prior art date
Application number
PCT/US2006/046221
Other languages
French (fr)
Inventor
Warren V. Barkley
Regis J. Crinon
Chih-Lung Lin (Bruce)
Tim M. Moore
Wei Zhong
Minghui Xia (Jason)
Original Assignee
Microsoft Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Microsoft Corporation
Priority to BRPI0618719-6A (patent BRPI0618719A2)
Priority to RU2008122940/07A (patent RU2470481C2)
Priority to EP06847494.9A (patent EP1961232B1)
Priority to JP2008544413A (patent JP5389449B2)
Priority to CN2006800463242A (patent CN101341754B)
Publication of WO2007067479A1
Priority to KR1020087013567A (patent KR101343234B1)

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00Arrangements for detecting or preventing errors in the information received
    • H04L1/0001Systems modifying transmission characteristics according to link quality, e.g. power backoff
    • H04L1/0014Systems modifying transmission characteristics according to link quality, e.g. power backoff by adapting the source coding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146Data rate or code amount at the encoder output
    • H04N19/15Data rate or code amount at the encoder output by monitoring actual compressed data size at the memory before deciding storage at the transmission buffer
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00Arrangements for detecting or preventing errors in the information received
    • H04L1/12Arrangements for detecting or preventing errors in the information received by using return channel
    • H04L1/16Arrangements for detecting or preventing errors in the information received by using return channel in which the return channel carries supervisory signals, e.g. repetition request signals
    • H04L1/18Automatic repetition systems, e.g. Van Duuren systems
    • H04L1/1829Arrangements specially adapted for the receiver end
    • H04L1/1835Buffer management
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00Arrangements for detecting or preventing errors in the information received
    • H04L1/12Arrangements for detecting or preventing errors in the information received by using return channel
    • H04L1/16Arrangements for detecting or preventing errors in the information received by using return channel in which the return channel carries supervisory signals, e.g. repetition request signals
    • H04L1/18Automatic repetition systems, e.g. Van Duuren systems
    • H04L1/1867Arrangements specially adapted for the transmitter end
    • H04L1/1887Scheduling and prioritising arrangements
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103Selection of coding mode or of prediction mode
    • H04N19/105Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103Selection of coding mode or of prediction mode
    • H04N19/107Selection of coding mode or of prediction mode between spatial and temporal predictive coding, e.g. picture refresh
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/164Feedback from the receiver or from the transmission channel
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/172Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51Motion estimation or motion compensation
    • H04N19/573Motion compensation with multiple frame prediction using two or more reference frames in a given prediction direction
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/85Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
    • H04N19/89Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving methods or arrangements for detection of transmission errors at the decoder
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/2343Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/23439Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements for generating different versions
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/258Client or end-user data management, e.g. managing client capabilities, user preferences or demographics, processing of multiple end-users preferences to derive collaborative data
    • H04N21/25808Management of client data
    • H04N21/25825Management of client data involving client display capabilities, e.g. screen resolution of a mobile phone
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/433Content storage operation, e.g. storage operation in response to a pause request, caching operations
    • H04N21/4331Caching operations, e.g. of an advertisement for later insertion during playback
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client 
    • H04N21/63Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
    • H04N21/633Control signals issued by server directed to the network components or client
    • H04N21/6332Control signals issued by server directed to the network components or client directed to client
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client 
    • H04N21/63Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
    • H04N21/633Control signals issued by server directed to the network components or client
    • H04N21/6332Control signals issued by server directed to the network components or client directed to client
    • H04N21/6336Control signals issued by server directed to the network components or client directed to client directed to decoder
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client 
    • H04N21/65Transmission of management data between client and server
    • H04N21/654Transmission by server directed to the client
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client 
    • H04N21/65Transmission of management data between client and server
    • H04N21/658Transmission by the client directed to the server
    • H04N21/6582Data stored in the client, e.g. viewing habits, hardware capabilities, credit card number
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00Arrangements for detecting or preventing errors in the information received
    • H04L1/0001Systems modifying transmission characteristics according to link quality, e.g. power backoff
    • H04L1/0023Systems modifying transmission characteristics according to link quality, e.g. power backoff characterised by the signalling
    • H04L1/0026Transmission of channel quality indication

Definitions

  • coders and decoders enable media to be transmitted from point to point within networks.
  • Cooperating sets of coders and decoders are referred to as “codecs” herein. Additionally, the terms “coder” and “encoder” are used herein synonymously.
  • the encoder may interact or cooperate with a number of decoders. All of these decoders may or may not be configured alike, or have the same processing capabilities. Additionally, the decoders are typically not configured to provide the encoder with information such as the properties, features, or capabilities of particular ones of the decoders. In this environment, the encoders may send data to the decoders as if all of the decoders are homogenous entities, when the decoders may not be.
  • Networks typically represent lossy channels, such that some amount of data transmitted via such networks is expected to be corrupted, damaged, or lost altogether.
  • Various schemes for recovering from such data loss or corruption have been proposed. Some of these recovery schemes may involve resending entire duplicates of the lost or damaged data. Accordingly, these recovery schemes may unnecessarily consume network bandwidth.
  • the encoder can encode frames that are based on source content to be sent to the decoder.
  • the encoder can determine whether the frame should be cached by the encoder and the decoder. If the frame is to be cached, the encoder can so indicate by encoding the frame with one or more cache control bits.
  • the decoder can receive the frame from the encoder, and can examine the cache control bits to determine whether to cache the frame. The decoder can also decode the frame.
  • Figure 1 is a block diagram illustrating an operating environment suitable for performing feedback and frame synchronization between media encoders and decoders.
  • Figure 2 is a block diagram illustrating a data structure, at least parts of which may be suitable for implementing respective instances of frames as shown in Figure 1.
  • Figure 3 is a block diagram illustrating a data structure, at least parts of which may be suitable for implementing respective instances of a feedback channel as shown in Figure 1.
  • Figure 4 is a block diagram illustrating an operating environment for receiving frames, merging a new frame with a previous display to produce an updated display, caching a frame, and merging a new frame with the contents of a cache to produce an updated display.
  • Figure 5 is a flow diagram illustrating a process flow that may be performed to encode frames and to respond to a frame loss report.
  • Figure 6 is a flow diagram illustrating a process flow for processing a frame as received by, for example, the decoders.
  • The following describes tools capable of many techniques and processes.
  • the following discussion describes exemplary ways in which the tools enable feedback and frame synchronization between media encoders and decoders. This discussion also describes ways in which the tools perform other techniques as well.
  • The terms "packet" and "frame" are used herein for convenience of illustration and discussion. For further convenience, it can be assumed that every frame fits into the payload of one packet, and therefore that the number of packets equals the number of frames when discussing the demarcation between a given frame and the next frame.
  • FIG. 1 illustrates one such operating environment generally at 100.
  • the operating environment 100 can comprise a workstation 102a having one or more processor(s) 104a and computer-readable media 106a.
  • the workstation 102a can comprise a computing device, such as a cell phone, desktop computer, personal digital assistant, server, or the like.
  • the processor 104a can be configured to access and/or execute the computer-readable media 106a.
  • the computer-readable media 106a can comprise or have access to an encoder 108, which may be implemented as a module, program, or other entity capable of interacting with a network-enabled entity.
  • the encoder 108 can be operative to encode source content 110 into a plurality of corresponding frames 112.
  • the source content 110 can assume any number of different forms, such as a live presentation featuring a speaker or other performer.
  • the source content 110 can be a video and/or audio conference.
  • the source content 110 can be pre-existing or pre-recorded media, such as audio or video media.
  • the operating environment 100 can also comprise a network 114 and related server(s) 116.
  • the network 114 enables communication between the workstation 102 and the server(s) 116, and can comprise a global or local wired or wireless network, such as the Internet or a corporate intranet. It is understood that the encoder 108 can be operative to encode the source content 110 into the frames 112 using a protocol that is appropriate for transmission over the network 114.
  • the server(s) 116 can comprise a single server or multiple servers, such as a server farm, though the server(s) 116 may also comprise additional server or non-server entities capable of communicating with other entities or of governing the individual servers (e.g., for load balancing).
  • the server(s) 116 are shown with three separate servers 116a, 116b, and 116c operating serially or in parallel to service requests from, for example, the workstations 102.
  • the network 114 can be operative to transmit the frames 112 from the workstation 102a to at least one additional workstation 102b. It is understood that the network 114 may not transmit all of the frames 112 perfectly from the workstation 102a to the workstation 102b.
  • the reference 112 in Figure 1 represents the frames as they leave the workstation 102a.
  • the reference 118 represents the frames as they emerge from the network 114 and are provided to the workstation 102b.
  • Some frames 112 may be lost, distorted, or otherwise corrupted during transmission through the network 114, as compared to the frames 118. Accordingly, if some of the frames 112 are lost, then the received frames 118 may be viewed as a subset of the sent frames 112. Also, if some of the frames 112 are corrupted, then the received frames 118 may be viewed as the sent frames 112 in a corrupted state.
  • the workstation 102b may be implemented similarly to the workstation 102a described above.
  • the workstation 102b can include processor(s) 104b and computer-readable media 106b.
  • the computer-readable media 106b can comprise or have access to a decoder 120.
  • the decoder 120 can be operative to receive and decode the frames 118 as received from the workstation 102a via the network 114.
  • the decoder 120 would use the same protocol to decode the frames 118 as was used previously by the encoder 108 to encode the frames 112. If the decoder 120 determines that the frames 118 are not corrupted, damaged, or lost, relative to the frames 112, then the decoder can decode these frames 118 into decoded content 122.
  • the decoded content 122 represents the source content 110 as reproduced on the workstation 102b.
  • the decoded content 122 could represent the presentation as displayed via the workstation 102b.
  • the decoded content 122 may be that audio or video stream as heard or seen by another conference participant.
  • the decoded content 122 could represent the media as displayed via the workstation 102b.
  • the operating environment 100 is not limited to a unidirectional nature. Instead, the workstation 102a may transmit certain source content 110 at some times, while the workstation 102b may transmit other source content 110 at other times.
  • the data flows shown in Figure 1 and other Figures herein are illustrative only, and not limiting.
  • the decoder 120 can report accordingly to the encoder 108. More particularly, the decoder 120 can report to the encoder 108 using a feedback channel 124.
  • the feedback channel 124 can be implemented, at least in part, using the network 114, although the protocol used to encode and/or transmit data via the feedback channel 124 may or may not be the same as the protocol used to encode and/or transmit the frames 112 and 118.
  • Data moving through the feedback channel 124 via the network 114 is represented by the references 124a and 124b. Due to errors in the network or other issues, the data 124a and 124b may differ somewhat, for the same reasons as the frames 112 may differ from the frames 118.
  • the encoder 108 can transmit corrected or replacement frames 112 to the decoder 120. It is understood that this process of reporting damaged frames and transmitting corrected or replacement frames 112 can be repeated until the decoder 120 has the information appropriate to decode the frames 118, so as to produce the decoded content 122 on the workstation 102b.
  • the feedback channel 124 may also enable the decoder 120 to communicate information about itself, its configuration, or other relevant parameters back to the encoder 108. Given this information about the decoder 120, the encoder 108 can adjust or optimize its encoding process accordingly. These aspects of the feedback channel 124 as used to report information about the decoder 120 are also described further below. In light of the foregoing description, the feedback channel 124 may provide the decoder(s) 120 with an out-of-band channel to communicate with the encoder 108.
  • each workstation 102b and related decoder 120 could have a respective feedback channel 124. Using this feedback channel 124, the workstations 102b and/or decoders 120 can provide specific local information germane to their local environments back to the encoder 108.
  • the tools described and illustrated herein may utilize data structures as part of their implementation and/or operations to perform feedback and frame synchronization between media encoders and decoders. Examples of such data structures are now described.
  • Figure 2 illustrates a data structure 200, at least parts of which may be suitable for implementing respective instances of the frames 112 and/or 118 as shown in Figure 1.
  • the data structure 200 can include, for a given frame 112 and/or 118, a field 205 for the RTP standard header, and a field 210 that contains the data that is considered the payload of the frames 112 and/or 118.
  • the data structure 200 can contain a field 215 for additional header data.
  • the data contained in the field 215 may be considered an extension to the underlying protocol used by the encoder 108 and the decoder 120.
  • the protocol can be RTP, although it is understood that other protocols may be equally suitable.
  • FIG. 2 illustrates several examples of data that may be included in the field 215 for a particular frame 112 and/or 118.
  • a sub-field 220 can contain one or more cache control bits. These cache control bits 220 can enable the encoder 108 to control and/or manage the caching of particular frames 112 and/or 118 by the decoder 120. These bits 220 can support frame recovery and synchronization operations between the encoder 108 and ones of the decoders 120. This frame caching operation is described in further detail below.
  • a sub-field 222 can indicate, for a given frame 112 and/or 118, what type of frame it is.
  • Figure 2 illustrates three types of frames, although it is understood that other types of frames may be implemented, and the implemented frames may be named or labeled differently than as described herein.
  • an I-frame represents an entire, self-contained frame of content, for example, audio or video.
  • An I-frame is "free standing", and can be decoded and reproduced by the decoder 120 without reference to any other previous or future frames.
  • a P-frame represents a difference between a current state of the audio or video and a previous I-frame.
  • a P-frame may be said to reference the previous I-frame.
  • Because the P-frame contains data representing only the differences relative to this previous I-frame, the P-frame is typically much smaller than the I-frame. To conserve bandwidth across the network 114, it may be appropriate to utilize P-frames as much as possible.
  • the encoder 108 may use a sequence of P-frames, because under such circumstances, the differences in successive frames are typically relatively small and readily represented by P-frames.
  • the encoder 108 may use one or more I-frames to set the new scene or context.
  • the loss rate experienced by the workstation 102b and/or the decoder 120 may be reported to the workstation 102a and/or the encoder 108.
  • the encoder 108 can consider the loss rate reported by the decoder 120 in determining whether to send I-frames or P-frames to the decoder 120. Additionally, the reported loss rate can be one factor in controlling the frame rate, bit rate, quality, and whether to send Super P-frames.
  • the sub-field 222 can also support an additional type of frame 112 and/or 118, which is referred to herein for convenience, but not limitation, as a Super P-frame.
  • a Super P-frame is similar to a P-frame in that it defines a change in the content, relative to a previous state of the content. However, instead of referencing a previous frame, the Super P-frame references the contents of a cache that is maintained locally on the decoder 120. This caching operation is described in further detail below.
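The three frame types described above can be illustrated with a minimal sketch. It assumes a toy content model in which a frame is a list of integer samples and a "difference" is a per-sample delta; the names `FrameType`, `Frame`, and `ToyDecoder` are hypothetical and do not come from the source. The `cache` flag stands in for the cache control bits 220, and the P-frame is applied to the previously decoded content, which is one possible reading of the merging behavior summarized for Figure 4.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class FrameType(Enum):
    I = "I"              # self-contained frame
    P = "P"              # difference relative to previously decoded content
    SUPER_P = "SUPER_P"  # difference relative to the decoder's cached frame


@dataclass
class Frame:
    frame_type: FrameType
    payload: List[int]   # full samples for I, per-sample deltas otherwise
    cache: bool = False  # hypothetical stand-in for cache control bits 220


class ToyDecoder:
    def __init__(self) -> None:
        self.display: Optional[List[int]] = None  # last decoded content
        self.cached: Optional[List[int]] = None   # locally cached content

    def decode(self, frame: Frame) -> List[int]:
        if frame.frame_type is FrameType.I:
            content = list(frame.payload)
        elif frame.frame_type is FrameType.P:
            if self.display is None:
                raise ValueError("P-frame received without a previous frame")
            content = [a + d for a, d in zip(self.display, frame.payload)]
        else:
            if self.cached is None:
                raise ValueError("Super P-frame received with an empty cache")
            content = [a + d for a, d in zip(self.cached, frame.payload)]
        if frame.cache:
            # The encoder asked for this frame to be cached.
            self.cached = list(content)
        self.display = content
        return content
```

In this sketch a Super P-frame remains decodable even if intermediate P-frames were lost, because it depends only on the cached content and not on the most recent display.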
  • a sub-field 224 can contain an index or other type of unique identifier for a given frame 112 and/or 118.
  • the contents of the sub-field 224 can take the form of a sequence number for frames or packets, a unique timestamp, an offset or position of the given frame 112 and/or 118 within the context of the source content 110, a displacement of the given frame 112 and/or 118 relative to the beginning of the source content 110, or the like.
  • the contents of the sub-field 224 may be populated by the encoder 108 when encoding the source content 110 into the frames 112 at the workstation 102a.
  • the decoder 120 may reference the contents of the sub-field 224 for a given frame 118 when decoding and assembling a plurality of the frames 118 into the decoded content 122. More particularly, the contents of the sub-field 224 may enable the decoder 120 to assemble the frames 118 into an appropriate order when presenting the decoded content 122. Additionally, the decoder 120 can use the contents of the sub-field 224, at least in part, to determine if one or more frames 112 sent by the encoder 108 were lost during transmission through the network 114 to the workstation 102b.
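As a rough illustration of how the sub-fields 220, 222, and 224 might be carried in the additional header field 215, the following sketch packs them into a fixed binary layout. The field widths (one byte each for the cache control bits and the frame type, four bytes for the frame identifier), the network byte order, and the names `ExtensionHeader`, `pack_header`, and `unpack_header` are all assumptions made for this example; the source does not specify a wire format.

```python
import struct
from typing import NamedTuple


class ExtensionHeader(NamedTuple):
    cache_control: int  # sub-field 220 (assumed to fit in one byte)
    frame_type: int     # sub-field 222 (assumed numeric code, one byte)
    frame_id: int       # sub-field 224 (assumed 32-bit sequence number)


# "!" selects network byte order with no padding: 1 + 1 + 4 = 6 bytes.
_LAYOUT = struct.Struct("!BBI")


def pack_header(header: ExtensionHeader) -> bytes:
    """Serialize the hypothetical extension header."""
    return _LAYOUT.pack(header.cache_control, header.frame_type, header.frame_id)


def unpack_header(data: bytes) -> ExtensionHeader:
    """Parse the extension header from the start of a byte string."""
    return ExtensionHeader(*_LAYOUT.unpack(data[:_LAYOUT.size]))
```

A decoder-side parser would read these six bytes before handing the remaining bytes to the payload decoder.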
  • the decoder 120 may receive a given sequence of frames 118 having identifiers 224 such as A, B, and D. However, the decoder 120 might expect these three frames 118 to have the identifiers 224 as A, B, and C. If the decoder 120 does not receive frame C in some amount of time, the decoder 120 may conclude that the frame 112 corresponding to expected frame C will never arrive, and was lost in the network 114. Accordingly, the decoder 120 may report to the decoder 120 that the packet C was lost, through for example the feedback channel 124.
  • a sub-field 226 can contain data pertaining to a color space conversion performed by the encoder 108 based on the characteristics or configuration of a particular decoder 120. Recall, from the discussion of Figure 1 above, that particular instances of the decoder 120 can communicate information pertaining to their local color display capabilities or features back to the encoder 108, for example, via the feedback channel 124. In response to this feedback from particular decoders 120, the encoder 108 can specifically tailor the frames 112 that are sent to each of the particular decoders 120. Any data pertaining to specific color conversions performed by the encoder 108 on behalf of a given decoder 120 can be stored in the sub-field 226.
  • the source content 110 may be captured and presented to the encoder 108 in an illustrative range of 256 colors.
  • the encoder 108 may instruct the decoder 120 how to convert the colors, as represented in the frame 112, into colors that are supported by the decoder 120.
  • the encoder 108 may indicate, through data in the sub-field 226, how the encoder 108 has already converted the colors in the frame 112, for the benefit of the decoder 120.
  • a sub-field 228 can contain data pertaining to any pixel resolution conversions performed by the encoder 108 on behalf of a particular decoder 120.
  • particular instances of the decoder 120 can communicate data such as their pixel resolution to the encoder 108, for example, via the feedback channel 124.
  • the sub-field 228 can enable similar processing regarding pixel resolution.
  • the source content 110 may be captured and presented to the encoder 108 in a relatively high pixel density.
  • one or more of the decoders 120 may not support this high pixel density, and different ones of the decoders 120 may support different pixel densities.
  • the encoder 108 may optimize the pixel density of different frames 112 sent to different decoders 120, depending on the capabilities of the different decoders 120. Accordingly, the sub-field 228 can contain any information pertaining to any conversions in pixel resolutions performed by the encoder 108, or pertaining to any conversions that should be performed by the decoder 120 in processing the frames 118.
  • the data structure 200 could include one or more of these example fields 205-215 or sub-fields 220-228, or may contain additional data, fields, or sub-fields other than those illustrated in Figure 2.
  • the layout, names, and configuration of the fields or sub-fields of the data structure 200 are illustrative only, and are chosen only for convenience of illustration and description, and do not limit possible implementations of the data structure 200. It is further understood that given instances of the data structure 200 may be associated with particular frames 112, but each instance of the data structure 200 need not have populated each field and/or sub-field as shown in Figure 2.
  • Figure 3 illustrates a data structure 300, at least parts of which may be suitable for implementing respective instances of the feedback channel 124 as shown in Figure 1. More particularly, data transfer from respective instances of the decoder 120 to the encoder 108 may be facilitated, at least in part, using the data structure 300.
  • a field 305 can contain data reporting a local frame or packet loss rate experienced by ones of the decoders 120.
  • This loss rate may be expressed, for example, as a number of frames lost per unit of time, as experienced by a particular decoder 120.
  • the encoder 108 may choose how often to transmit I-frames or P-frames to the decoders 120. Also, this information may enable the encoder 108 to determine when and/or how often to direct or instruct the decoders 120 to cache particular frames 112/118. These caching operations are discussed further below in connection with Figures 4-6.
  • a field 310 can contain data reporting the loss of a particular frame 112/118.
  • the decoder 120 can reference data such as that discussed previously regarding the sub-field 224 as shown in Figure 2. Recall that the sub-field 224 can contain identification information for particular frames 112/118. For example, if the decoder 120 suspects that one or more frames 112 are missing, the decoder 120 might report a sequence of frames 118 that are actually received, so the encoder 108 can determine which frames 112 were lost. In another example, the decoder 120 could estimate or determine the identification information for the suspected missing frames 112.
  • a field 315 can contain data representing a local pixel resolution supported by a particular decoder 120.
  • the encoder 108 can transform the pixel resolution of the frames 112/118 sent to the decoder 120, can instruct the decoder 120 how to transform the pixel resolution of the frames 112/118, or can perform other related processing. Any of the foregoing can be performed in connection with the sub-field 228 shown in Figure 2.
  • a field 320 can contain data representing a local color depth supported by a particular decoder 120.
  • the encoder 108 can transform the color depth of the frames 112/118 sent to the decoder 120, can instruct the decoder 120 how to transform the color depth of the frames 112/118, or can perform other related processing. Any of the foregoing can be performed in connection with the sub-field 226 shown in Figure 2.
  • the data structure 300 could include one or more of these example fields 305-320, or may contain additional data, fields, or sub-fields other than those illustrated in Figure 3.
  • the layout, names, and configuration of the fields of the data structure 300 are illustrative only, and are chosen only for convenience of illustration and description, and do not limit possible implementations of the data structure 300.
  • given instances of the data structure 300 may be associated with particular instances of data transmitted from the decoders 120 to the encoder 108. However, each instance of the data structure 300 need not have populated each field as shown in Figure 3.
Data Flows
  • the tools described herein may implement data flows that are suitable for performing feedback and frame synchronization between media encoders and decoders.
  • An illustrative data flow is now described in connection with another operating environment.
  • Figure 4 illustrates an operating environment 400 for receiving frames 118, merging a new frame 118 with a previous display to produce an updated display, caching a frame 118, and merging a new frame 118 with the contents of the cache to produce an updated display.
  • the operating environment 400 may be implemented, at least in part, by the workstation 102b and/or the decoder 120, although aspects of the operating environment 400 may also be implemented by other components or tools.
  • a frame 118a is received.
  • frames 118 can be associated with respective instances of the data structure 200, as discussed above in connection with Figure 2.
  • a field 222 of the data structure 200 for the frame 118a indicates that the frame 118a is an I-Frame.
  • the frame 118a can be presented directly on a display 402 associated with, for example, the workstation 102b.
  • the display 402 as it would stand when presenting the I-Frame 118a is denoted as display 402a in Figure 4.
  • the data structure 200 for the frames 112/118 can include the cache control bits 220. Assume for the purposes of describing the operating environment 400 that the data structure 200 includes two cache control bits 220. A first cache control bit may be labeled "Cache", and at least a second cache control bit may be labeled "Use Cache". Either or both of these bits may be set or active for a given frame 118.
  • when the "Cache" bit is set for a given frame 112/118, this bit directs the decoder 120 to store the frame 112/118, and/or the display resulting from that frame 112/118, into a cache 404 maintained locally by the workstation 102b and/or the decoder 120.
  • assume that the I-Frame 118a has the "Cache" bit set or active, as indicated in block 406. Accordingly, the I-Frame 118a would be presented as the display 402a, and stored in the cache 404.
  • Some implementations of the operating environment 400 may cache all instances of I-Frames 112/118 by default. Other implementations of the operating environment 400 may cache only those I-Frames 112/118 that have their "Cache" bits set or active.
  • assume that a frame 118b arrives, and that its frame type 222 indicates that it is a P-Frame.
  • a P-Frame expresses the difference between the current state of the source content 110 and some previous reference frame. Accordingly, the contents of the frame 118b are merged with the previous display 402a, as represented by merge block 408a.
  • the merge 408a results in an updated display 402b.
  • a P-Frame 118 may have its "Cache" bit set or active.
  • the contents of the P-Frame 118 itself may be stored in the cache 404, in some implementations of the operating environment 400.
  • the display (e.g., display 402b) resulting from the merge of the P-Frame (e.g., frame 118b) may be cached.
  • the encoder 108 may send a replacement P-Frame 118d. Assume that the operating environment 400 receives this replacement P-Frame 118d at time (T4). As indicated by the block 410, the cache control bits 220 for the replacement P-Frame 118d can have the "Use Cache" bit set or active. This directs the operating environment 400 to merge the current P-Frame 118d with the contents of the cache 404, rather than the previous display. This merge-from-cache is represented generally by the merge block 408c.
  • the P-Frame 118d is referred to herein as a Super P-Frame, as discussed above.
  • the encoder 108 encodes the Super P-Frame 118d based on the cached reference, and thus the Super P-Frame 118d is much smaller than a replacement I-Frame 118 would be.
  • sending the Super P-Frame 118d to compensate for the frame loss consumes less network bandwidth than sending a replacement I-Frame 118.
  • the encoder 108 may not send the Super P-Frame 118d if the corrupted frame is sufficiently close to the next I-Frame 118e that will arrive at the decoder 120.
  • the decoder 120 may await the next I-Frame 118e.
  • the decoder 120 may be configured with one or more settings that specify how close the corrupted frame should be relative to the next I-Frame 118e for this processing to occur.
  • Figure 4 also depicts the arrival of a new I-Frame 118e at time (T5), resulting in a new display 402e.
  • Figure 4 shows one cache 404 only for convenience of illustration and description. Additional caches 404 could be provided by the encoder
  • multiple reference frames 112/118 can be stored and retained by the encoder 108 and/or the decoder 120. These multiple reference frames 112/118 may be useful in situations wherein one or more of the cached frames
  • cache control bits 220 in Figure 2 may be implemented as appropriate to dictate or indicate which cache 404 was used to encode a given replacement frame 112/118.
  • the cache control bits 220 provide a means for enabling the encoder 108 to instruct the decoder 120 in how to handle caching and related synchronization of replacement frames 112/118.
  • the reference frames cached at the encoder 108 and the decoder 120, and the replacement frames 112/118 encoded therefrom, provide a means for synchronizing the processing of the encoder 108 and the decoder 120.
  • the tools as described herein can implement various process flows to perform feedback and frame synchronization between media encoders and decoders. Examples of such process flows are now described.
  • FIG. 5 illustrates a process flow 500 that may be performed to encode frames and to respond to a packet loss report.
  • the process flow 500 is described here in connection with the encoder 108. However, it is understood that the process flow 500 may be implemented on devices or components other than the encoder 108 without departing from the spirit and scope of the description herein.
  • Block 502 encodes one or more frames from the source content 110.
  • block 504 evaluates whether to cache the current frame 112 for possible later reference. If the frame 112 is to be cached, block 506 sets the "Cache" bit for the current frame 112. Recall that the "Cache" bit may be implemented as part of the cache control bits
  • Block 508 caches the current frame 112 for later reference.
  • the current frame 112 may be cached by the encoder 108.
  • Block 510 transmits the current frame 112 to the decoder 120. Illustrative processing of the frame 112 at, for example, the decoder 120 is described in connection with Figure 6 below.
  • block 512 clears the "Cache" bit for this frame 112.
  • the "Cache" bit may be initialized to a set or clear state when the frame 112 is instantiated. In such cases, blocks 512 or 506 may not be performed, if it is not necessary to change the state of the "Cache" bit from its initialized state.
  • the process flow 500 proceeds to block 510 as described above. After block 510 is performed, the process flow 500 can return to block 502 to process the next frame 112 into which the source content 110 is encoded. It is understood that the process flow 500 may loop through blocks 502-512 as appropriate to encode the source content 110 into suitable frames 112.
  • block 514 can receive a frame loss report 310. Block 514 can occur at any point within the process flow 500.
  • the process flow 500 may also test for and respond to the receipt of the frame loss report 310 at any point relative to blocks 502-512. Additionally, the process flow 500 may implement block 514 as an interrupt, branch from some point within blocks 502-
  • Figure 5 shows the process flow 500 branching to block 514 when the frame loss report 310 is received, regardless of where the process flow 500 is within blocks 502-512.
  • Block 516 references the frame that was cached previously in block 508.
  • Block 518 encodes a new P-Frame relative to or referencing the cached frame.
  • Block 520 sets the "Use Cache" bit of the new P-Frame, if this bit is not already set.
  • the cache control bits 220 shown and discussed in Figure 2 can include a "Use Cache" bit, which directs, for example, the decoder 120 to reference the contents of the cache 404 rather than the current display 402, when updating the current display 402.
  • This new P-Frame is referred to herein for convenience only as a Super P-Frame.
  • Block 522 transmits the Super P-Frame to the decoder 120 to allow the latter to compensate for the loss of the frame reported in block 514.
  • Figure 6 illustrates a process flow 600 for processing a frame as received by, for example, the decoder 120. While the process flow 600 is described herein in connection with tools such as the decoder 120 and the encoder 108, other implementations of the process flow 600 could also be implemented with other tools without departing from the spirit and scope of the description herein.
  • Block 602 receives a frame 118, as transmitted by, for example, block 510 shown in Figure 5.
  • Block 604 tests whether the received frame 118 is corrupted, or whether a different frame 118 was expected.
  • for frame corruption, block 604 can test by, for example, evaluating a checksum or other error-detection and correction scheme implemented by the decoder 120 and/or encoder 108.
  • for frame loss, recall that frames 118 can be associated with respective instances of the data structure 200, described above in Figure 2.
  • the data structure 200 can contain a field 224 for sequencing or otherwise uniquely identifying the frame 118. Using, for example, this field 224, block 604 can test whether the current frame 118 is the expected successor to a previous frame 118. If not, then the expected successor frame 118 may have been lost.
  • block 606 reports the lost or corrupted frame.
  • the report issued from block 606 can correspond to the report received in block 514 shown in Figure 5 and to the frame loss report 310 shown in Figure 3.
  • blocks 608, 610, and 612 can test what frame type the frame 118 is. Recall that the data structure 200 can contain a sub-field 222 indicating a frame type. Block 608 tests whether the frame 118 is an I-Frame, block 610 tests whether the frame 118 is a P-Frame, and block 612 tests whether the frame 118 is a Super P-Frame.
  • block 614 can display the frame 118 directly, without reference to the current display or any other frame 118.
  • block 616 updates the current display by merging it with the frame 118.
  • Block 614 then presents the updated display.
  • from block 612, if the frame 118 is a Super P-Frame, then block 618 updates the display by merging the frame 118 with the contents of a cache, such as the cache 404 shown in Figure 4. Recall that a Super P-Frame can be indicated or detected by a "Use Cache" bit being set or activated. Block 614 then presents the updated display.
  • block 620 can process this other type of frame. Afterwards, the process flow 600 can return to block 602 to await the next frame 118.
  • block 622 tests whether the "Cache" bit is set for the frame
  • block 624 stores the frame 118 in a cache, such as for example the cache 404 shown in Figure 4.
  • Block 602 then awaits the arrival of the next frame 118.
  • block 624 can be bypassed, and block 602 then awaits the arrival of the next frame 118.
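The caching and recovery behavior described for Figures 4 through 6 can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: pictures are modeled as dictionaries mapping a pixel index to a value, the names `Frame`, `Encoder`, `Decoder`, and `diff` are hypothetical, and the decision to skip an ordinary P-Frame that arrives after a gap is one possible policy, not something the description prescribes.

```python
from dataclasses import dataclass
from enum import Enum


class FrameType(Enum):
    I = "I"
    P = "P"
    SUPER_P = "Super P"


@dataclass
class Frame:
    seq: int                 # identifier, in the spirit of sub-field 224
    ftype: FrameType         # frame type, in the spirit of sub-field 222
    data: dict               # I-Frame: whole picture; P/Super P: changed pixels
    cache: bool = False      # "Cache" control bit
    use_cache: bool = False  # "Use Cache" control bit


def diff(old, new):
    """Pixels of `new` that differ from `old` (same pixel set assumed)."""
    return {k: v for k, v in new.items() if old.get(k) != v}


class Encoder:
    def __init__(self):
        self.prev = None   # last picture encoded
        self.cached = {}   # encoder-side copy of the cached reference
        self.seq = 0

    def _emit(self, ftype, data, **bits):
        frame = Frame(self.seq, ftype, data, **bits)
        self.seq += 1
        return frame

    def encode(self, picture, cache=False):
        if self.prev is None:
            frame = self._emit(FrameType.I, dict(picture), cache=cache)
        else:
            frame = self._emit(FrameType.P, diff(self.prev, picture), cache=cache)
        self.prev = dict(picture)
        if cache:                       # keep the same reference as the decoder
            self.cached = dict(picture)
        return frame

    def on_loss_report(self, picture):
        # Encode against the cached reference and set "Use Cache".
        frame = self._emit(FrameType.SUPER_P, diff(self.cached, picture),
                           use_cache=True)
        self.prev = dict(picture)
        return frame


class Decoder:
    def __init__(self):
        self.display = {}
        self.cache = {}
        self.expected_seq = 0
        self.loss_reports = []   # stands in for the feedback channel

    def receive(self, frame):
        gap = frame.seq != self.expected_seq
        self.loss_reports.extend(range(self.expected_seq, frame.seq))
        self.expected_seq = frame.seq + 1
        if frame.ftype is FrameType.I:
            self.display = dict(frame.data)              # shown directly
        elif frame.use_cache:
            self.display = {**self.cache, **frame.data}  # merge with cache
        elif gap:
            return self.display   # assumed policy: previous display is stale
        else:
            self.display = {**self.display, **frame.data}
        if frame.cache:
            self.cache = dict(self.display)
        return self.display
```

In this sketch the Super P-Frame carries only the pixels that differ from the cached reference, which is why it can be smaller than a replacement I-Frame whenever part of the cached picture is still valid.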

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Databases & Information Systems (AREA)
  • Quality & Reliability (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Graphics (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

Feedback and frame synchronization between media encoders and decoders is described. More particularly, the encoder can encode frames that are based on source content to be sent to the decoder. The encoder can determine whether the frame should be cached by the encoder and the decoder. If the frame is to be cached, the encoder can so indicate by encoding the frame with one or more cache control bits. The decoder can receive the frame from the encoder, and can examine the cache control bits to determine whether to cache the frame. The decoder can also decode the frame.

Description

FEEDBACK AND FRAME SYNCHRONIZATION BETWEEN MEDIA ENCODERS AND DECODERS
BACKGROUND
Various forms of media coders and decoders enable media to be transmitted from point to point within networks. Cooperating sets of coders and decoders are referred to as "codecs" herein. Additionally, the terms "coder" and "encoder" are used herein synonymously.
Typically, the encoder may interact or cooperate with a number of decoders. All of these decoders may or may not be configured alike, or have the same processing capabilities. Additionally, the decoders are typically not configured to provide the encoder with information such as the properties, features, or capabilities of particular ones of the decoders. In this environment, the encoders may send data to the decoders as if all of the decoders are homogenous entities, when the decoders may not be.
Networks typically represent lossy channels, such that some amount of data transmitted via such networks is expected to be corrupted, damaged, or lost altogether. Various schemes for recovering from such data loss or corruption have been proposed. Some of these recovery schemes may involve resending entire duplicates of the lost or damaged data. Accordingly, these recovery schemes may unnecessarily consume network bandwidth.
SUMMARY
Systems and/or methods ("tools") are described that enable feedback and frame synchronization between media encoders and decoders. More particularly, the encoder can encode frames that are based on source content to be sent to the decoder. The encoder can determine whether the frame should be cached by the encoder and the decoder. If the frame is to be cached, the encoder can so indicate by encoding the frame with one or more cache control bits. The decoder can receive the frame from the encoder, and can examine the cache control bits to determine whether to cache the frame. The decoder can also decode the frame.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is a block diagram illustrating an operating environment suitable for performing feedback and frame synchronization between media encoders and decoders.
Figure 2 is a block diagram illustrating a data structure, at least parts of which may be suitable for implementing respective instances of frames as shown in Figure 1.
Figure 3 is a block diagram illustrating a data structure, at least parts of which may be suitable for implementing respective instances of a feedback channel as shown in Figure 1.
Figure 4 is a block diagram illustrating an operating environment for receiving frames, merging a new frame with a previous display to produce an updated display, caching a frame, and merging a new frame with the contents of a cache to produce an updated display.
Figure 5 is a flow diagram illustrating a process flow that may be performed to encode frames and to respond to a frame loss report.
Figure 6 is a flow diagram illustrating a process flow for processing a frame as received by, for example, the decoders.
The same numbers are used throughout the disclosure and figures to reference like components and features.
DETAILED DESCRIPTION
Overview
The following document describes system(s) and/or method(s) ("tools") capable of many techniques and processes. The following discussion describes exemplary ways in which the tools enable feedback and frame synchronization between media encoders and decoders. This discussion also describes ways in which the tools perform other techniques as well.
This document is organized into sections for convenience, with the sections introduced by headings chosen for convenience, but not limitation. First, an illustrative Operating Environment for performing feedback and frame synchronization between media encoders and decoders is described. Then, illustrative Data Structures are described, followed by illustrative Data Flows. Finally, illustrative Process Flows are described.
The terms "packet" and "frame" are used herein for convenience of illustration and discussion. For further convenience, it can be assumed that all frames can fit into the payload of one packet, and therefore that the number of packets equals the number of frames when discussing the demarcation of a given frame and the next frame.
Operating Environment
Before describing the tools in detail, the following discussion of an exemplary operating environment is provided to assist the reader in understanding one way in which various aspects of the tools may be employed. The environment described below constitutes but one example and is not intended to limit application of the tools to any one particular operating environment. Other environments may be used without departing from the spirit and scope of the claimed subject matter.
Figure 1 illustrates one such operating environment generally at 100. The operating environment 100 can comprise a workstation 102a having one or more processor(s) 104a and computer-readable media 106a. The workstation 102a can comprise a computing device, such as a cell phone, desktop computer, personal digital assistant, server, or the like. The processor 104a can be configured to access and/or execute the computer-readable media 106a. The computer-readable media 106a can comprise or have access to an encoder 108, which may be implemented as a module, program, or other entity capable of interacting with a network-enabled entity.
The encoder 108 can be operative to encode source content 110 into a plurality of corresponding frames 112. The source content 110 can assume any number of different forms, such as a live presentation featuring a speaker or other performer. The source content 110 can be a video and/or audio conference. Finally, the source content 110 can be pre-existing or pre-recorded media, such as audio or video media.
The operating environment 100 can also comprise a network 114 and related server(s) 116. The network 114 enables communication between the workstation 102 and the server(s) 116, and can comprise a global or local wired or wireless network, such as the Internet or a corporate intranet. It is understood that the encoder 108 can be operative to encode the source content 110 into the frames 112 using a protocol that is appropriate for transmission over the network 114.
The server(s) 116 can comprise a single server or multiple servers, such as a server farm, though the server(s) 116 may also comprise additional server or non-server entities capable of communicating with other entities or of governing the individual servers (e.g., for load balancing). The server(s) 116 are shown with three separate servers 116a, 116b, and 116c operating serially or in parallel to service requests from, for example, the workstations 102. The network 114 can be operative to transmit the frames 112 from the workstation 102a to at least one additional workstation 102b. It is understood that the network 114 may not transmit all of the frames 112 perfectly from the workstation 102a to the workstation 102b. Accordingly, the reference 112 in Figure 1 represents the frames as they leave the workstation 102a, and the reference 118 represents the frames as they emerge from the network 114 and are provided to the workstation 102b. Some frames 112 may be lost, distorted, or otherwise corrupted during transmission through the network 114, as compared to the frames 118. Accordingly, if some of the frames 112 are lost, then the received frames 118 may be viewed as a subset of the sent frames 112. Also, if some of the frames 112 are corrupted, then the received frames 118 may be viewed as the sent frames 112 in a corrupted state.
Turning to the workstation 102b in more detail, it may be implemented similarly to the workstation 102a described above. Thus, the workstation 102b can include processor(s) 104b and computer-readable media 106b. The computer-readable media 106b can comprise or have access to a decoder 120.
The decoder 120 can be operative to receive and decode the frames 118 as received from the workstation 102a via the network 114. The decoder 120 would use the same protocol to decode the frames 118 as was used previously by the encoder 108 to encode the frames 112. If the decoder 120 determines that the frames 118 are not corrupted, damaged, or lost, relative to the frames 112, then the decoder can decode these frames 118 into decoded content 122.
The decoded content 122 represents the source content 110 as reproduced on the workstation 102b. For example, if the source content 110 is a live presentation, the decoded content 122 could represent the presentation as displayed via the workstation 102b. If the source content 110 is a spoken conference-related audio or video stream, the decoded content 122 may be that audio or video stream as heard or seen by another conference participant. As another example, if the source content 110 is pre-existing or pre-recorded media, the decoded content 122 could represent the media as displayed via the workstation 102b.
In providing the above description, it is understood that the operating environment 100 is not limited to a unidirectional nature. Instead, the workstation 102a may transmit certain source content 110 at some times, while the workstation 102b may transmit other source content 110 at other times. Thus, the data flows shown in Figure 1 and other Figures herein are illustrative only, and not limiting.
Returning to the processing of the decoder 120, if the decoder 120 determines that some of the frames 118 were corrupted or damaged during transmission through the network 114, or that some of the frames 112 were lost and never arrived at the workstation 102b, the decoder 120 can report accordingly to the encoder 108. More particularly, the decoder 120 can report to the encoder 108 using a feedback channel 124. The feedback channel 124 can be implemented, at least in part, using the network 114, although the protocol used to encode and/or transmit data via the feedback channel 124 may or may not be the same as the protocol used to encode and/or transmit the frames 112 and 118. Data moving through the feedback channel 124 via the network 114 is represented by the references 124a and 124b. Due to errors in the network or other issues, the data 124a and 124b may differ somewhat, for the same reasons as the frames 112 may differ from the frames 118.
Having received a report of lost, damaged, or otherwise corrupted frames 112 or 118, the encoder 108 can transmit corrected or replacement frames 112 to the decoder 120. It is understood that this process of reporting damaged frames and transmitting corrected or replacement frames 112 can be repeated until the decoder 120 has the information appropriate to decode the frames 118, so as to produce the decoded content 122 on the workstation 102b.
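The report-and-replace cycle described above can be sketched as a simple loop. The following Python sketch is illustrative only and rests on several assumptions: frames are reduced to integer identifiers, the decoder is assumed to know which identifiers to expect, and losses are scripted through a hypothetical `drops` set of `(frame_id, attempt)` pairs instead of a real network. The function names `deliver` and `transfer` are likewise hypothetical.

```python
def deliver(sent_ids, attempt, drops):
    """Frames that survive the network on this attempt (a subset of those sent)."""
    return [fid for fid in sent_ids if (fid, attempt) not in drops]


def transfer(frame_ids, drops, max_rounds=10):
    """Repeat send, report, and replace until nothing is missing or rounds run out."""
    received = set()
    pending = list(frame_ids)
    rounds = 0
    while pending and rounds < max_rounds:
        arrived = deliver(pending, rounds, drops)
        received.update(arrived)
        # Feedback step: the decoder reports what is still missing,
        # and the encoder sends replacements for exactly those frames.
        pending = [fid for fid in frame_ids if fid not in received]
        rounds += 1
    return received, rounds


received, rounds = transfer([0, 1, 2, 3], drops={(2, 0), (2, 1), (3, 0)})
print(sorted(received), rounds)
```

A practical implementation would bound the number of rounds, as `max_rounds` does here, because real-time content may become stale before a replacement arrives.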
Additional aspects of the feedback channel 124 as used to report frame losses are described in further detail below. However, the feedback channel 124 may also enable the decoder 120 to communicate information about itself, its configuration, or other relevant parameters back to the encoder 108. Given this information about the decoder 120, the encoder 108 can adjust or optimize its encoding process accordingly. These aspects of the feedback channel 124 as used to report information about the decoder 120 are also described further below. In light of the foregoing description, the feedback channel 124 may provide the decoder(s) 120 with an out-of-band channel to communicate with the encoder 108.
It is understood that only one workstation 102b and related decoder 120 is shown in Figure 1 only for clarity and legibility, and not to limit possible implementations of the operating environment 100. In particular, it is noted that any number of different workstations 102b and corresponding decoders 120 could be included, with different ones of the workstations 102b and decoders 120 having different configurations, features, capacities, capabilities, or other characteristics. For example, different workstations 102b might support different color depths, pixel resolutions, display sizes, or other aspects of processing the source content 110 and/or the decoded content 122. It is further understood that each workstation 102b and/or decoder 120 could have a respective feedback channel 124. Using this feedback channel 124, the workstations 102b and/or decoders 120 can provide specific local information germane to their local environments back to the encoder 108.
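As a rough illustration of how per-decoder feedback might drive the encoder, the following Python sketch models a feedback message and a tailoring step. The class name `Feedback`, the function `tailor`, and the field names are hypothetical; the comments map them loosely onto the fields 305 through 320 of Figure 3. Clamping each dimension independently is a simplifying assumption that ignores aspect ratio.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Feedback:
    loss_rate: Optional[float] = None                        # cf. field 305
    lost_frame_ids: List[int] = field(default_factory=list)  # cf. field 310
    resolution: Optional[Tuple[int, int]] = None             # cf. field 315 (width, height)
    color_depth_bits: Optional[int] = None                   # cf. field 320


def tailor(source_resolution, source_depth_bits, fb):
    """Pick the resolution and color depth to encode for one decoder."""
    width, height = source_resolution
    if fb.resolution is not None:
        width = min(width, fb.resolution[0])
        height = min(height, fb.resolution[1])
    depth = source_depth_bits
    if fb.color_depth_bits is not None:
        depth = min(depth, fb.color_depth_bits)
    return (width, height), depth


# One encoder, two differently capable decoders.
print(tailor((1920, 1080), 24, Feedback(resolution=(640, 480), color_depth_bits=8)))
print(tailor((1920, 1080), 24, Feedback()))
```

Fields are optional in the sketch because a decoder need not populate every field in every report.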
Data Structures
The tools described and illustrated herein may utilize data structures as part of their implementation and/or operations to perform feedback and frame synchronization between media encoders and decoders. Examples of such data structures are now described.
Figure 2 illustrates a data structure 200, at least parts of which may be suitable for implementing respective instances of the frames 112 and/or 118 as shown in Figure 1. Assuming only for example that the encoder 108 and decoder 120 implement the Real-time Transport Protocol (RTP), the data structure 200 can include, for a given frame 112 and/or 118, a field 205 for the RTP standard header, and a field 210 that contains the data that is considered the payload of the frames 112 and/or 118. In addition, the data structure 200 can contain a field 215 for additional header data. The data contained in the field 215 may be considered an extension to the underlying protocol used by the encoder 108 and the decoder 120. In the example shown in Figure 2, the protocol can be RTP, although it is understood that other protocols may be equally suitable.
Turning to the field 215 in more detail, Figure 2 illustrates several examples of data that may be included in the field 215 for a particular frame 112 and/or 118. A sub-field 220 can contain one or more cache control bits. These cache control bits 220 can enable the encoder 108 to control and/or manage the caching of particular frames 112 and/or 118 by the decoder 120. These bits 220 can support frame recovery and synchronization operations between the encoder 108 and ones of the decoders 120. This frame caching operation is described in further detail below.
A sub-field 222 can indicate, for a given frame 112 and/or 118, what type of frame it is. Figure 2 illustrates three types of frames, although it is understood that other types of frames may be implemented, and the implemented frames may be named or labeled differently than as described herein.
As shown in the sub-field 222, an I-frame represents an entire, self-contained frame of content, for example, audio or video. An I-frame is "free standing", and can be decoded and reproduced by the decoder 120 without reference to any other previous or future frames.
A P-frame represents a difference between a current state of the audio or video and a previous I-frame. Thus, a P-frame may be said to reference the previous I-frame. Because a P-frame contains data representing only the differences relative to this previous I-frame, the P-frame is typically much smaller than the I-frame. To conserve bandwidth across the network 114, it may be appropriate to utilize P-frames as much as possible. When the source content 110 exhibits relatively little motion over time, the encoder 108 may use a sequence of P-frames, because under such circumstances, the differences in successive frames are typically relatively small and readily represented by P-frames. However, when the source content 110 exhibits relatively great motion over time, or exhibits a substantial change of scene or context, the encoder 108 may use one or more I-frames to set the new scene or context. Also, the loss rate experienced by the workstation 102b and/or the decoder 120 may be reported to the workstation 102a and/or the encoder 108. In turn, the encoder 108 can consider the loss rate reported by the decoder 120 in determining whether to send I-frames or P-frames to the decoder 120. Additionally, the reported loss rate can be one factor in controlling the frame rate, bit rate, quality, and whether to send Super P-frames.
The sub-field 222 can also support an additional type of frame 112 and/or 118, which is referred to herein for convenience, but not limitation, as a Super P-frame. A Super P-frame is similar to a P-frame in that it defines a change in the content, relative to a previous state of the content. However, instead of referencing a previous frame, the Super P-frame references the contents of a cache that is maintained locally on the decoder 120. This caching operation is described in further detail below.
A sub-field 224 can contain an index or other type of unique identifier for a given frame 112 and/or 118. For example, the contents of the sub-field 224 can take the form of a sequence number for frames or packets, a unique timestamp, an offset or position of the given frame 112 and/or 118 within the context of the source content 110, a displacement of the given frame 112 and/or 118 relative to the beginning of the source content 110, or the like.
The contents of the sub-field 224, in whatever form, may be populated by the encoder 108 when encoding the source content 110 into the frames 112 at the workstation 104a. At the workstation 104b, the decoder 120 may reference the contents of the sub-field 224 for a given frame 118 when decoding and assembling a plurality of the frames 118 into the decoded content 122. More particularly, the contents of the sub-field 224 may enable the decoder 120 to assemble the frames 118 into an appropriate order when presenting the decoded content 122. Additionally, the decoder 120 can use the contents of the sub-field 224, at least in part, to determine if one or more frames 112 sent by the encoder 108 were lost during transmission through the network 114 to the workstation 104b.
As an example of the foregoing, the decoder 120 may receive a given sequence of frames 118 having identifiers 224 such as A, B, and D. However, the decoder 120 might expect these three frames 118 to have the identifiers 224 as A, B, and C. If the decoder 120 does not receive frame C in some amount of time, the decoder 120 may conclude that the frame 112 corresponding to expected frame C will never arrive, and was lost in the network 114. Accordingly, the decoder 120 may report to the encoder 108 that the frame C was lost, for example through the feedback channel 124.
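The gap check in the A, B, D example can be sketched as follows. This is a minimal illustration only, assuming that the identifiers in the sub-field 224 are consecutive integer sequence numbers; the function name `find_missing` is hypothetical and does not appear in the description. A practical decoder would also wait some amount of time before declaring a frame lost, which is omitted here.

```python
def find_missing(received_ids, first_expected):
    """Return the sequence numbers absent from received_ids.

    Assumes the identifiers (sub-field 224) are consecutive integers that
    start at first_expected; timestamps or offsets would need another check.
    """
    if not received_ids:
        return []
    seen = set(received_ids)
    last = max(seen)
    return [n for n in range(first_expected, last + 1) if n not in seen]


# Frames A, B, D arrive as 0, 1, 3, so frame C (2) is the one to report.
missing = find_missing([0, 1, 3], first_expected=0)
```

The resulting list is what the decoder would place in its loss report on the feedback channel 124.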
A sub-field 226 can contain data pertaining to a color space conversion performed by the encoder 108 based on the characteristics or configuration of a particular decoder 120. Recall, from the discussion of Figure 1 above, that particular instances of the decoder 120 can communicate information pertaining to their local color display capabilities or features back to the encoder 108, for example, via the feedback channel 124. In response to this feedback from particular decoders 120, the encoder 108 can specifically tailor the frames 112 that are sent to each of the particular decoders 120. Any data pertaining to specific color conversions performed by the encoder 108 on behalf of a given decoder 120 can be stored in the sub-field 226. For example, the source content 110 may be captured and presented to the encoder 108 in an illustrative range of 256 colors. However, if a given decoder 120 can only support and display 16 colors, it would not be useful to transmit frames 112 that support 256 colors to this given decoder 120. Accordingly, through data contained in the sub-field 226, the encoder 108 may instruct the decoder 120 how to convert the colors, as represented in the frame 112, into colors that are supported by the decoder 120. In addition to or instead of the foregoing, the encoder 108 may indicate, through data in the sub-field 226, how the encoder 108 has already converted the colors in the frame 112, for the benefit of the decoder 120.
A sub-field 228 can contain data pertaining to any pixel resolution conversions performed by the encoder 108 on behalf of a particular decoder 120. Recall from the above discussion of Figure 1 that particular instances of the decoder 120 can communicate data such as their pixel resolution to the encoder 108, for example, via the feedback channel 124. Referring to the above discussion of sub-field 226 regarding color depth, the sub-field 228 can enable similar processing regarding pixel resolution.
For example, the source content 110 may be captured and presented to the encoder 108 in a relatively high pixel density. However, one or more of the decoders 120 may not support this high pixel density, and different ones of the decoders 120 may support different pixel densities. Thus, the encoder 108 may optimize the pixel density of different frames 112 sent to different decoders 120, depending on the capabilities of the different decoders 120. Accordingly, the sub-field 228 can contain any information pertaining to any conversions in pixel resolutions performed by the encoder 108, or pertaining to any conversions that should be performed by the decoder 120 in processing the frames 118.
Having described the foregoing examples of the fields 205-215 and the sub-fields 220-228, it is understood that various implementations of the data structure 200 could include one or more of these example fields 205-215 or sub-fields 220-228, or may contain additional data, fields, or sub-fields other than those illustrated in Figure 2. In addition, the layout, names, and configuration of the fields or sub-fields of the data structure 200 are illustrative only, and are chosen only for convenience of illustration and description, and do not limit possible implementations of the data structure 200. It is further understood that given instances of the data structure 200 may be associated with particular frames 112, but each instance of the data structure 200 need not have populated each field and/or sub-field as shown in Figure 2.
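One possible in-memory representation of the data structure 200 is sketched below. It is an assumption-laden illustration, not a wire format: the class names, the bit positions chosen for the cache control bits 220, and the string encodings of the conversion sub-fields are all hypothetical. The "Cache" and "Use Cache" bit names anticipate the discussion of Figure 4 below. Every sub-field defaults to an unset value, reflecting that a given instance need not populate each one.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional

CACHE_BIT = 0x1      # "Cache": the decoder should store this frame
USE_CACHE_BIT = 0x2  # "Use Cache": the frame is encoded against the cache


class FrameType(Enum):
    I_FRAME = "I"
    P_FRAME = "P"
    SUPER_P_FRAME = "SP"


@dataclass
class FrameExtension:
    """Hypothetical layout of field 215; every sub-field is optional."""
    cache_control: int = 0                       # sub-field 220
    frame_type: Optional[FrameType] = None       # sub-field 222
    frame_id: Optional[int] = None               # sub-field 224
    color_conversion: Optional[str] = None       # sub-field 226
    resolution_conversion: Optional[str] = None  # sub-field 228


@dataclass
class Frame:
    rtp_header: bytes = b""                                        # field 205
    payload: bytes = b""                                           # field 210
    extension: FrameExtension = field(default_factory=FrameExtension)  # field 215

    def should_cache(self) -> bool:
        return bool(self.extension.cache_control & CACHE_BIT)

    def uses_cache(self) -> bool:
        return bool(self.extension.cache_control & USE_CACHE_BIT)


# An I-frame that the encoder asks the decoder to cache.
example = Frame(payload=b"...",
                extension=FrameExtension(cache_control=CACHE_BIT,
                                         frame_type=FrameType.I_FRAME,
                                         frame_id=0))
```

Keeping the cache control bits as a small bit mask makes it straightforward to add further bits later, for example to select among several caches.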
Figure 3 illustrates a data structure 300, at least parts of which may be suitable for implementing respective instances of the feedback channel 124 as shown in Figure 1. More particularly, data transfer from respective instances of the decoder 120 to the encoder 108 may be facilitated, at least in part, using the data structure 300.
Turning to the data structure 300 in more detail, a field 305 can contain data reporting a local frame or packet loss rate experienced by ones of the decoders 120. This loss rate may be expressed, for example, as a number of frames lost per unit of time, as experienced by a particular decoder 120. Given this information, the encoder 108 may choose how often to transmit I-frames or P-frames to the decoders 120. Also, this information may enable the encoder 108 to determine when and/or how often to direct or instruct the decoders 120 to cache particular frames 112/118. These caching operations are discussed further below in connection with Figures 4-6.
A field 310 can contain data reporting the loss of a particular frame 112/118. In reporting a frame loss, the decoder 120 can reference data such as that discussed previously regarding the sub-field 224 as shown in Figure 2. Recall that the sub-field 224 can contain identification information for particular frames 112/118. For example, if the decoder 120 suspects that one or more frames 112 are missing, the decoder 120 might report a sequence of frames 118 that are actually received, so the encoder 108 can determine which frames 112 were lost. In another example, the decoder 120 could estimate or determine the identification information for the suspected missing frames 112.
A field 315 can contain data representing a local pixel resolution supported by a particular decoder 120. In response to this data 315 as reported by the decoder 120, the encoder 108 can transform the pixel resolution of the frames 112/118 sent to the decoder 120, can instruct the decoder 120 how to transform the pixel resolution of the frames 112/118, or can perform other related processing. Any of the foregoing can be performed in connection with the sub-field 228 shown in Figure 2.
A field 320 can contain data representing a local color depth supported by a particular decoder 120. In response to this data 320 as reported by the decoder 120, the encoder 108 can transform the color depth of the frames 112/118 sent to the decoder 120, can instruct the decoder 120 how to transform the color depth of the frames 112/118, or can perform other related processing. Any of the foregoing can be performed in connection with the sub-field 226 shown in Figure 2.
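The feedback fields 305-320, and one way an encoder might act on the reported resolution and color depth, can be sketched as follows. The class and function names are hypothetical, and the policy shown (never encode above what the decoder reports) is only one assumption about how the encoder 108 could tailor frames. Color depth is expressed in bits per pixel, so the illustrative 256-color source corresponds to 8 bits and a 16-color decoder to 4 bits.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class FeedbackReport:
    """Hypothetical layout of data structure 300; any field may be unset."""
    loss_rate: Optional[float] = None             # field 305, frames lost per second
    lost_frame_ids: Optional[List[int]] = None    # field 310
    resolution: Optional[Tuple[int, int]] = None  # field 315, (width, height)
    color_depth_bits: Optional[int] = None        # field 320


def choose_output_format(source_resolution, source_depth_bits, report):
    """Pick the resolution and color depth to encode for one decoder.

    Width and height are clamped independently, so this toy policy does
    not try to preserve the aspect ratio of the source content.
    """
    width, height = source_resolution
    if report.resolution is not None:
        width = min(width, report.resolution[0])
        height = min(height, report.resolution[1])
    depth = source_depth_bits
    if report.color_depth_bits is not None:
        depth = min(depth, report.color_depth_bits)
    return (width, height), depth


# A decoder limited to 640x480 and 16 colors (4 bits) reports its limits.
report = FeedbackReport(resolution=(640, 480), color_depth_bits=4)
chosen = choose_output_format((1280, 720), 8, report)
```

A decoder that reports nothing leaves the source format unchanged, which matches the point that each instance of the data structure 300 need not populate every field.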
Having described the foregoing examples of the fields 305-320, it is understood that various implementations of the data structure 300 could include one or more of these example fields 305-320, or may contain additional data, fields, or sub-fields other than those illustrated in Figure 3. In addition, the layout, names, and configuration of the fields of the data structure 300 are illustrative only, and are chosen only for convenience of illustration and description, and do not limit possible implementations of the data structure 300. It is further understood that given instances of the data structure 300 may be associated with particular instances of data transmitted from the decoders 120 to the encoder 108. However, each instance of the data structure 300 need not have populated each field as shown in Figure 3.
Data Flows
The tools described herein may implement data flows that are suitable for performing feedback and frame synchronization between media encoders and decoders. An illustrative data flow is now described in connection with another operating environment.
Figure 4 illustrates an operating environment 400 for receiving frames 118, merging a new frame 118 with a previous display to produce an updated display, caching a frame 118, and merging a new frame 118 with the contents of the cache to produce an updated display. The operating environment 400 may be implemented, at least in part, by the workstation 104b and/or the decoder 120, although aspects of the operating environment 400 may also be implemented by other components or tools as well.
Assume that at a time (T1), a frame 118a is received. Recall that frames 118 can be associated with respective instances of the data structure 200, as discussed above in connection with Figure 2. Assume further that a field 222 of the data structure 200 for the frame 118a indicates that the frame 118a is an I-Frame. Because the frame 118a is an I-Frame, the frame 118a can be presented directly on a display 402 associated with, for example, the workstation 104b. For convenience, the display 402 as it would stand when presenting the I-Frame 118a is denoted as display 402a in Figure 4.
Recall from the discussion of Figure 2 that the data structure 200 for the frames 112/118 can include the cache control bits 220. Assume for the purposes of describing the operating environment 400 that the data structure 200 includes two cache control bits 220. A first cache control bit may be labeled "Cache", and at least a second cache control bit may be labeled "Use Cache". Either or both of these bits may be set or active for a given frame 118.
Turning first to the "Cache" bit, when this bit is set for a given frame 112/118, this bit directs the decoder 120 to store the frame 112/118, and/or the display resulting from that frame 112/118, into a cache 404 maintained locally by the workstation 104b and/or the decoder 120. Thus, in the example shown in Figure 4, assume that the I-Frame 118a has the "Cache" bit set or active, as indicated in block 406. Accordingly, the I-Frame 118a would be presented as the display 402a, and stored in the cache 404.
Some implementations of the operating environment 400 may cache all instances of I-Frames 112/118 by default. Other implementations of the operating environment 400 may cache only those I-Frames 112/118 that have their "Cache" bits set or active.
Assume that at time (T2), a frame 118b arrives, and that its frame type 222 indicates that it is a P-Frame. Recall that a P-Frame expresses the difference between the current state of the source content 110 and some previous reference frame. Accordingly, the contents of the frame 118b are merged with the previous display 402a, as represented by merge block 408a. The merge 408a results in an updated display 402b.
Having described the processing of the P-Frame 118b, it is noted generally that a P-Frame 118 may have its "Cache" bit set or active. In such a case, the contents of the P-Frame 118 itself may be stored in the cache 404, in some implementations of the operating environment 400. In other implementations, the display (e.g., display 402b) resulting from the merge of the P-Frame (e.g., frame 118b) may be cached.
Assume that at time (T3), a frame 118c arrives, and that its frame type 222 indicates that it is a P-Frame. In this case, the contents of the frame 118c are merged with the previous display 402b, as represented by merge block 408b. The merge 408b results in an updated display 402c. The foregoing, however, assumes that the frame 118c actually arrives at the operating environment 400. If the frame 118c fails to arrive at the operating environment 400, there will be no updated display 402c. Further, by the time that the operating environment 400 detects the frame loss, the previous display 402b may have expired or otherwise become outdated. In this event, the encoder 108 may be notified of a frame loss. See, e.g., block 310 and related discussion of Figure 3.
In response to the frame loss report 310, the encoder 108 may send a replacement P-Frame 118d. Assume that the operating environment 400 receives this replacement P-Frame 118d at time (T4). As indicated by the block 410, the cache control bits 220 for the replacement P-Frame 118d can have the "Use Cache" bit set or active. This directs the operating environment 400 to merge the current P-Frame 118d with the contents of the cache 404, rather than the previous display. This merge-from-cache is represented generally by the merge block 408c.
Because this replacement P-Frame 118d is encoded relative to a cached reference frame, rather than the previous I-Frame, the P-Frame 118d is referred to herein as a Super P-Frame, as discussed above. The encoder 108 encodes the Super P-Frame 118d based on the cached reference, and thus the Super P-Frame 118d is much smaller than a replacement I-Frame 118 would be. Thus, sending the Super P-Frame 118d to compensate for the frame loss consumes less network bandwidth than sending a replacement I-Frame 118.
In some implementations, the encoder 108 may not send the Super P-Frame 118d if the corrupted frame is sufficiently close to the next I-Frame 118e that will arrive at the decoder 120. In such implementations, the decoder 120 may await the next I-Frame 118e. The decoder 120 may be configured with one or more settings that specify how close the corrupted frame should be relative to the next I-Frame 118e for this processing to occur.
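The choice between sending a Super P-Frame and simply awaiting the next I-Frame can be sketched as a single comparison. This is an illustration under stated assumptions: frame identifiers are taken to be integers, the position of the next I-Frame is taken to be known in advance, and the function name and the `min_gap` setting are hypothetical stand-ins for the configured settings mentioned above.

```python
def should_send_super_p(lost_frame_id, next_i_frame_id, min_gap):
    """Return True when a Super P-Frame is worth sending.

    If fewer than min_gap frames separate the lost frame from the next
    scheduled I-Frame, the decoder can simply await that I-Frame.
    """
    return (next_i_frame_id - lost_frame_id) >= min_gap


# Lost frame 10 with an I-Frame due at 12: too close, so wait for it.
wait_case = should_send_super_p(10, 12, min_gap=5)
# Lost frame 10 with an I-Frame due at 30: send a Super P-Frame.
send_case = should_send_super_p(10, 30, min_gap=5)
```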
After the Super P-Frame 118d is merged with the contents of the cache 404, the display 402d results. Figure 4 also depicts the arrival of a new I-Frame 118e at time (T5), resulting in a new display 402e.
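The timeline T1 through T4 of Figure 4 can be traced with a deliberately simplified model. In this sketch, a display is a short list of integer pixel values and a P-Frame is a list of per-pixel differences that is merged by addition; both are assumptions made only for illustration, and real codecs use far more elaborate prediction. The helper name `merge` is hypothetical.

```python
def merge(reference, delta):
    """Apply a P-Frame, modeled as per-pixel differences, to a reference."""
    return [r + d for r, d in zip(reference, delta)]


# T1: I-Frame 118a with "Cache" set is displayed (402a) and cached (404).
display = [10, 10, 10, 10]
cache = list(display)

# T2: P-Frame 118b is merged with the previous display 402a.
display = merge(display, [0, 1, 0, 0])   # display 402b

# T3: P-Frame 118c is lost, so display 402c is never produced.

# T4: Super P-Frame 118d has "Use Cache" set. Its differences are
# relative to the cached I-Frame, so it is merged with the cache 404
# instead of the stale display 402b.
super_p = [0, 1, 2, 0]
display = merge(cache, super_p)          # display 402d
```

Because the Super P-Frame carries only differences against the cached reference, it stays small, while still producing a correct display without depending on the lost frame.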
It is noted that Figure 4 shows one cache 404 only for convenience of illustration and description. Additional caches 404 could be provided by the encoder 108 and/or the decoder 120, such that multiple reference frames 112/118 can be stored and retained by the encoder 108 and/or the decoder 120. These multiple reference frames 112/118 may be useful in situations wherein one or more of the cached frames 112/118 may have been corrupted or lost. Where multiple caches 404 are implemented, additional cache control bits (e.g., cache control bits 220 in Figure 2) may be implemented as appropriate to dictate or indicate which cache 404 was used to encode a given replacement frame 112/118.
Additionally, it is noted that the cache control bits 220 provide a means for enabling the encoder 108 to instruct the decoder 120 in how to handle caching and related synchronization of replacement frames 112/118. Finally, the reference frames cached at the encoder 108 and the decoder 120, and the replacement frames 112/118 encoded therefrom, provide a means for synchronizing the processing of the encoder 108 and the decoder 120.
Process Flows
The tools as described herein can implement various process flows to perform feedback and frame synchronization between media encoders and decoders. Examples of such process flows are now described.
Figure 5 illustrates a process flow 500 that may be performed to encode frames and to respond to a packet loss report. The process flow 500 is described here in connection with the encoder 108. However, it is understood that the process flow 500 may be implemented on devices or components other than the encoder 108 without departing from the spirit and scope of the description herein.
Block 502 encodes one or more frames from the source content 110. Block 504 evaluates whether to cache the current frame 112 for possible later reference. If the frame 112 is to be cached, block 506 sets the "Cache" bit for the current frame 112. Recall that the "Cache" bit may be implemented as part of the cache control bits 220 shown in Figure 2. Block 508 caches the current frame 112 for later reference. For example, the current frame 112 may be cached by the encoder 108. Block 510 transmits the current frame 112 to the decoder 120. Illustrative processing of the frame 112 at, for example, the decoder 120 is described in connection with Figure 6 below.
Returning to block 504, if the current frame 112 is not to be cached, then block 512 clears the "Cache" bit for this frame 112. In some instances, the "Cache" bit may be initialized to a set or clear state when the frame 112 is instantiated. In such cases, blocks 512 or 506 may not be performed, if it is not necessary to change the state of the "Cache" bit from its initialized state.
After block 512, the process flow 500 proceeds to block 510 as described above. After block 510 is performed, the process flow 500 can return to block 502 to process the next frame 112 into which the source content 110 is encoded. It is understood that the process flow 500 may loop through blocks 502-512 as appropriate to encode the source content 110 into suitable frames 112.
At any time during the processing of blocks 502-512, block 514 can receive a frame loss report 310. Block 514 can occur at any point within the process flow 500. The process flow 500 may also test for and respond to the receipt of the frame loss report 310 at any point relative to blocks 502-512. Additionally, the process flow 500 may implement block 514 as an interrupt, branch from some point within blocks 502-512 to service the interrupt, perform blocks 516-522 (described below) as an interrupt service routine, and return to the point in blocks 502-512 at which the interrupt was received. For convenience of illustration, Figure 5 shows the process flow 500 branching to block 514 when the frame loss report 310 is received, regardless of where the process flow 500 is within blocks 502-512.
Block 516 references the frame that was cached previously in block 508. Block 518 encodes a new P-Frame relative to or referencing the cached frame. Block 520 sets the "Use Cache" bit of the new P-Frame, if this bit is not already set. Recall that the cache control bits 220 shown and discussed in Figure 2 can include a "Use Cache" bit, which directs, for example, the decoder 120 to reference the contents of the cache 404 rather than the current display 402, when updating the current display 402. This new P-Frame is referred to herein for convenience only as a Super P-Frame. Block 522 transmits the Super P-Frame to the decoder 120 to allow the latter to compensate for the loss of the frame reported in block 514.
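The encoder-side blocks 502-522 can be sketched as a small class. This is a toy model under several assumptions: content is a list of integers, ordinary frames carry the raw content rather than a compressed difference, the caching policy of block 504 is a hypothetical "every Nth frame" rule, and all names are invented for illustration. Transmission (blocks 510 and 522) is left to the caller.

```python
class Encoder:
    """Toy encoder following blocks 502-522 of the process flow 500."""

    def __init__(self, cache_every=3):
        self.cache_every = cache_every  # hypothetical policy for block 504
        self.cached_content = None      # encoder-side copy of the reference
        self.next_id = 0

    def encode_next(self, content):
        frame = {"id": self.next_id, "payload": list(content),  # block 502
                 "cache": False, "use_cache": False}
        if self.next_id % self.cache_every == 0:                # block 504
            frame["cache"] = True                               # block 506
            self.cached_content = list(content)                 # block 508
        self.next_id += 1
        return frame                                            # ready for block 510

    def on_loss_report(self, current_content):
        """Blocks 514-520: build a Super P-Frame against the cached frame."""
        if self.cached_content is None:
            raise RuntimeError("no cached reference frame available")
        delta = [c - r for c, r in
                 zip(current_content, self.cached_content)]     # blocks 516, 518
        frame = {"id": self.next_id, "payload": delta,
                 "cache": False, "use_cache": True}             # block 520
        self.next_id += 1
        return frame                                            # ready for block 522


encoder = Encoder()
first = encoder.encode_next([10, 10])    # cached reference
second = encoder.encode_next([10, 11])   # ordinary frame, not cached
replacement = encoder.on_loss_report([10, 12])
```

Here `replacement` carries only the difference between the current content and the cached reference, with its "Use Cache" flag set so that the decoder merges it with its own copy of that reference.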
Figure 6 illustrates a process flow 600 for processing a frame as received by, for example, the decoder 120. While the process flow 600 is described herein in connection with tools such as the decoder 120 and the encoder 108, other implementations of the process flow 600 could also be implemented with other tools without departing from the spirit and scope of the description herein.
Block 602 receives a frame 118, as transmitted by, for example, block 510 shown in Figure 5. Block 604 tests whether the received frame 118 is corrupted, or whether a different frame 118 was expected. Regarding frame corruption, block 604 can test for corruption by, for example, evaluating a checksum or other error-detection and correction scheme implemented by the decoder 120 and/or encoder 108. Regarding frame loss, recall that frames 118 can be associated with respective instances of the data structure 200, described above in Figure 2. The data structure 200 can contain a field 224 for sequencing or otherwise uniquely identifying the frame 118. Using, for example, this field 224, block 604 can test whether the current frame 118 is the expected successor to a previous frame 118. If not, then the expected successor frame 118 may have been lost.
If the current frame 118 is corrupted or is not expected, block 606 reports the lost or corrupted frame. The report issued from block 606 can correspond to the report received in block 514 shown in Figure 5 and to the frame loss report 310 shown in Figure 3.
If the current frame 118 is not corrupted and is the expected successor frame, then blocks 608, 610, and 612 can test what frame type the frame 118 is. Recall that the data structure 200 can contain a sub-field 222 indicating a frame type. Block 608 tests whether the frame 118 is an I-Frame, block 610 tests whether the frame 118 is a P-Frame, and block 612 tests whether the frame 118 is a Super P-Frame.
Turning to block 608, if the frame 118 is an I-Frame, then block 614 can display the frame 118 directly, without reference to the current display or any other frame 118. In block 610, if the frame 118 is a P-Frame, then block 616 updates the current display by merging it with the frame 118. Block 614 then presents the updated display. In block 612, if the frame 118 is a Super P-Frame, then block 618 updates the display by merging it with the contents of a cache, such as the cache 404 shown in Figure 4. Recall that a Super P-Frame can be indicated or detected by a "Use Cache" bit being set or activated. Block 614 then presents the updated display.
From block 612, if the frame 118 is neither an I-Frame, a P-Frame, nor a Super P-Frame, then block 620 can process this other type of frame. Afterwards, the process flow 600 can return to block 602 to await the next frame 118.
From block 614, block 622 tests whether the "Cache" bit is set for the frame 118. If so, block 624 stores the frame 118 in a cache, such as for example the cache 404 shown in Figure 4. Block 602 then awaits the arrival of the next frame 118.
Returning to block 622, if the "Cache" bit is not set for the frame 118, block 624 can be bypassed, and block 602 then awaits the arrival of the next frame 118.
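The decoder-side blocks 602-624 can be sketched in the same simplified style. The following is an illustration only: payloads are lists of integers, corruption is detected with a CRC-32 from the Python standard library (the description leaves the error-detection scheme open), the frame type is a plain string, and block 624 caches the resulting display rather than the frame itself, which is one of the two variants described above. All class, function, and key names are hypothetical. A frame that arrives out of sequence is discarded after the report, which is an assumption of this sketch.

```python
import zlib


def checksum(payload):
    """CRC-32 over a text rendering of the payload (illustrative only)."""
    return zlib.crc32(repr(payload).encode())


def make_frame(frame_id, kind, payload, cache=False):
    return {"id": frame_id, "type": kind, "payload": payload,
            "cache": cache, "crc": checksum(payload)}


def merge(reference, delta):
    return [r + d for r, d in zip(reference, delta)]


class Decoder:
    """Toy decoder following blocks 602-624 of the process flow 600."""

    def __init__(self):
        self.display = None
        self.cache = None
        self.expected_id = 0
        self.reports = []  # stands in for the feedback channel 124

    def receive(self, frame):                                   # block 602
        corrupted = checksum(frame["payload"]) != frame["crc"]
        if corrupted or frame["id"] != self.expected_id:        # block 604
            if corrupted:
                lost = [frame["id"]]
            else:
                lost = list(range(self.expected_id, frame["id"]))
            self.reports.extend(lost)                           # block 606
            self.expected_id = frame["id"] + 1
            return
        self.expected_id = frame["id"] + 1
        if frame["type"] == "I":                                # block 608
            self.display = list(frame["payload"])
        elif frame["type"] == "P":                              # blocks 610, 616
            self.display = merge(self.display, frame["payload"])
        elif frame["type"] == "SP":                             # blocks 612, 618
            self.display = merge(self.cache, frame["payload"])
        else:                                                   # block 620
            return
        # Block 614 would present self.display here.
        if frame["cache"]:                                      # block 622
            self.cache = list(self.display)                     # block 624


decoder = Decoder()
decoder.receive(make_frame(0, "I", [10, 10], cache=True))
decoder.receive(make_frame(1, "P", [0, 1]))
decoder.receive(make_frame(3, "P", [0, 1]))    # frame 2 never arrived
decoder.receive(make_frame(4, "SP", [0, 3]))   # replacement from the encoder
```

After this sequence, the decoder has reported frame 2 as lost and has rebuilt its display from the cached I-Frame plus the Super P-Frame, without relying on the missing frame.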
Conclusion
Although the system and method has been described in language specific to structural features and/or methodological acts, it is to be understood that the system and method defined in the appended claims is not necessarily limited to the specific features or acts described. Rather, the specific features and acts are disclosed as exemplary forms of implementing the claimed system and method.
In addition, regarding certain flow diagrams described and illustrated herein, it is noted that the processes and sub-processes depicted therein may be performed in orders other than those illustrated without departing from the spirit and scope of the description herein.

Claims

1. A method comprising at least:
encoding (502) at least one frame (112) based on source content (110);
determining (504) whether to cache the frame at an encoder and at least one decoder; and
encoding (506) at least one cache control bit associated with the frame in response to the determining.
2. The method of claim 1, further comprising caching (504) the frame at the encoder.
3. The method of claim 1, wherein encoding at least one cache control bit includes setting the cache control bit (506) to indicate that the frame is to be cached at the encoder and the decoder.
4. The method of claim 1, further comprising transmitting (510) the frame to the at least one decoder.
5. The method of claim 1, further comprising receiving a loss report (514) indicating that at least one additional frame was lost or corrupted after transmission from the encoder.
6. The method of claim 5, further comprising referencing at least one cached frame (516) in response to receiving the loss report.
7. The method of claim 5, further comprising encoding a replacement frame (518) for the additional frame based on at least one cached reference frame.
8. The method of claim 5, further comprising transmitting a replacement frame (522) for the additional frame, the replacement frame being encoded based on at least one cached reference frame.
9. The method of claim 5, further comprising setting at least one additional cache control bit (520) associated with the replacement frame, wherein the additional cache control bit indicates that the replacement frame is encoded based on a cached reference frame.
10. The method of claim 1, further comprising receiving data representing a pixel resolution (228) supported by the decoder.
11. The method of claim 10, wherein encoding at least one frame is performed based on the data representing the pixel resolution of the decoder.
12. The method of claim 1, further comprising receiving data representing a color depth (226) supported by the decoder.
13. The method of claim 12, wherein encoding at least one frame is performed based on the data representing the color depth of the decoder.
14. A method comprising at least:
receiving (602) at least one frame encoded from source content;
determining (622) whether to cache the frame; and
decoding the frame (122).
15. The method of claim 14, further comprising at least one of determining whether the frame is corrupted (604) and determining whether at least one additional frame is missing (604).
16. The method of claim 15, further comprising reporting at least one lost or damaged frame (606).
17. The method of claim 14, wherein determining whether to cache the frame (622) includes testing at least one cache control bit (220) associated with the frame.
18. The method of claim 14, further comprising caching the frame (624).
19. The method of claim 14, further comprising determining (612) whether the frame is encoded based on a cached reference frame by testing at least one cache control bit (220) associated with the frame.
20. The method of claim 14, further comprising merging the frame (618) with at least one cached reference.
PCT/US2006/046221 2005-12-07 2006-12-07 Feedback and frame synchronization between media encoders and decoders WO2007067479A1 (en)

Priority Applications (6)

Application Number Priority Date Filing Date Title
BRPI0618719-6A BRPI0618719A2 (en) 2005-12-07 2006-12-07 feedback and frame synchronization between media encoders and decoders
RU2008122940/07A RU2470481C2 (en) 2005-12-07 2006-12-07 Feedback and framing between media coders and decoders
EP06847494.9A EP1961232B1 (en) 2005-12-07 2006-12-07 Feedback and frame synchronization between media encoders and decoders
JP2008544413A JP5389449B2 (en) 2005-12-07 2006-12-07 Feedback and frame synchronization between media encoder and decoder
CN2006800463242A CN101341754B (en) 2005-12-07 2006-12-07 Feedback and frame synchronization between media encoders and decoders
KR1020087013567A KR101343234B1 (en) 2005-12-07 2008-06-05 feedback and frame synchronization between media encoders and decoders

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US11/275,071 2005-12-07
US11/275,071 US7716551B2 (en) 2005-12-07 2005-12-07 Feedback and frame synchronization between media encoders and decoders

Publications (1)

Publication Number Publication Date
WO2007067479A1 true WO2007067479A1 (en) 2007-06-14

Family

ID=38120188

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2006/046221 WO2007067479A1 (en) 2005-12-07 2006-12-07 Feedback and frame synchronization between media encoders and decoders

Country Status (9)

Country Link
US (1) US7716551B2 (en)
EP (1) EP1961232B1 (en)
JP (1) JP5389449B2 (en)
KR (1) KR101343234B1 (en)
CN (1) CN101341754B (en)
BR (1) BRPI0618719A2 (en)
RU (1) RU2470481C2 (en)
TW (1) TWI408967B (en)
WO (1) WO2007067479A1 (en)

CA2156889C (en) * 1994-09-30 1999-11-02 Edward L. Schwartz Method and apparatus for encoding and decoding data
JP3068002B2 (en) * 1995-09-18 2000-07-24 沖電気工業株式会社 Image encoding device, image decoding device, and image transmission system
JPH09121358A (en) * 1995-10-25 1997-05-06 Matsushita Electric Ind Co Ltd Picture coding/decoding device and its method
JP3427149B2 (en) * 1996-01-26 2003-07-14 三菱電機株式会社 Decoding circuit for coded signal, synchronization control method thereof, synchronization detection circuit and synchronization detection method
JPH1079949A (en) * 1996-09-04 1998-03-24 Oki Electric Ind Co Ltd Image encoder, image decoder and image transmission system
JPH10191356A (en) * 1996-12-27 1998-07-21 Oki Electric Ind Co Ltd Image encoder
JP3373130B2 (en) * 1997-03-24 2003-02-04 沖電気工業株式会社 Image decoding device
US6061399A (en) * 1997-05-28 2000-05-09 Sarnoff Corporation Method and apparatus for information stream frame synchronization
AU1115599A (en) * 1997-10-23 1999-05-10 Sony Electronics Inc. Apparatus and method for partial buffering transmitted data to provide robust error recovery in a lossy transmission environment
EP0940989A3 (en) * 1998-03-02 2003-10-29 Nippon Telegraph and Telephone Corporation Video communication system and method
US6289054B1 (en) 1998-05-15 2001-09-11 North Carolina University Method and systems for dynamic hybrid packet loss recovery for video transmission over lossy packet-based network
US6115080A (en) * 1998-06-05 2000-09-05 Sarnoff Corporation Channel selection methodology in an ATSC/NTSC television receiver
US6269130B1 (en) * 1998-08-04 2001-07-31 Qualcomm Incorporated Cached chainback RAM for serial viterbi decoder
JP3660513B2 (en) * 1998-12-25 2005-06-15 沖電気工業株式会社 Image communication apparatus and local decoding processing method
US6658618B1 (en) * 1999-09-02 2003-12-02 Polycom, Inc. Error recovery method for video compression coding using multiple reference buffers and a message channel
EP1130921B1 (en) * 2000-03-02 2005-01-12 Matsushita Electric Industrial Co., Ltd. Data transmission in non-reliable networks
JP2002010265A (en) * 2000-06-20 2002-01-11 Sony Corp Transmitting device and its method and receiving device and it method
US7191242B1 (en) * 2000-06-22 2007-03-13 Apple, Inc. Methods and apparatuses for transferring data
KR100354768B1 (en) * 2000-07-06 2002-10-05 삼성전자 주식회사 Video codec system, method for processing data between the system and host system and encoding/decoding control method in the system
US7174561B2 (en) * 2001-04-13 2007-02-06 Emc Corporation MPEG dual-channel decoder data and control protocols for real-time video streaming
US6823489B2 (en) * 2001-04-23 2004-11-23 Koninklijke Philips Electronics N.V. Generation of decision feedback equalizer data using trellis decoder traceback output in an ATSC HDTV receiver
US8923688B2 (en) * 2001-09-12 2014-12-30 Broadcom Corporation Performing personal video recording (PVR) functions on digital video streams
US20040016000A1 (en) * 2002-04-23 2004-01-22 Zhi-Li Zhang Video streaming having controlled quality assurance over best-effort networks
US7606314B2 (en) * 2002-08-29 2009-10-20 Raritan America, Inc. Method and apparatus for caching, compressing and transmitting video signals
US7684483B2 (en) * 2002-08-29 2010-03-23 Raritan Americas, Inc. Method and apparatus for digitizing and compressing remote video signals
US20040125816A1 (en) * 2002-12-13 2004-07-01 Haifeng Xu Method and apparatus for providing a buffer architecture to improve presentation quality of images
JP4329358B2 (en) * 2003-02-24 2009-09-09 富士通株式会社 Stream delivery method and stream delivery system
US7237061B1 (en) * 2003-04-17 2007-06-26 Realnetworks, Inc. Systems and methods for the efficient reading of data in a server system
JP2005101677A (en) * 2003-09-22 2005-04-14 Ricoh Co Ltd Image transmitting apparatus, image processing system, program, and information recording medium
GB0323284D0 (en) * 2003-10-04 2003-11-05 Koninkl Philips Electronics Nv Method and apparatus for processing image data
US7143207B2 (en) * 2003-11-14 2006-11-28 Intel Corporation Data accumulation between data path having redrive circuit and memory device
US20050201471A1 (en) * 2004-02-13 2005-09-15 Nokia Corporation Picture decoding method
US7627227B2 (en) * 2004-05-17 2009-12-01 Microsoft Corporation Reverse presentation of digital media streams
US8634413B2 (en) 2004-12-30 2014-01-21 Microsoft Corporation Use of frame caching to improve packet loss recovery
TW200642450A (en) * 2005-01-13 2006-12-01 Silicon Optix Inc Method and system for rapid and smooth selection of digitally compressed video programs
US20070008323A1 (en) * 2005-07-08 2007-01-11 Yaxiong Zhou Reference picture loading cache for motion prediction
US8300701B2 (en) * 2005-12-09 2012-10-30 Avid Technology, Inc. Offspeed playback in a video editing system of video data compressed using long groups of pictures

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP1961232A4

Also Published As

Publication number Publication date
EP1961232B1 (en) 2014-06-04
US7716551B2 (en) 2010-05-11
EP1961232A1 (en) 2008-08-27
EP1961232A4 (en) 2011-11-30
BRPI0618719A2 (en) 2011-09-06
RU2008122940A (en) 2009-12-20
JP5389449B2 (en) 2014-01-15
TWI408967B (en) 2013-09-11
CN101341754B (en) 2010-10-27
RU2470481C2 (en) 2012-12-20
CN101341754A (en) 2009-01-07
TW200731811A (en) 2007-08-16
US20070130493A1 (en) 2007-06-07
KR101343234B1 (en) 2013-12-18
KR20080080521A (en) 2008-09-04
JP2009518956A (en) 2009-05-07

Similar Documents

Publication Publication Date Title
EP1961232B1 (en) Feedback and frame synchronization between media encoders and decoders
CN1981492B (en) Buffer level signaling for rate adaptation in multimedia streaming
EP2070083B1 (en) System and method for providing redundancy management
US20100091801A1 (en) Data communication system, data transmitting device, data transmitting method, data receiving device, and data receiving method
CN103210642B (en) Occur during expression switching, to transmit the method for the scalable HTTP streams for reproducing naturally during HTTP streamings
US8432937B2 (en) System and method for recovering the decoding order of layered media in packet-based communication
RU2006101400A (en) SWITCHING THE FLOW BASED ON THE GRADUAL RESTORATION DECODING
CN101594203A (en) Dispensing device, sending method and receiving system
EP1340381A2 (en) Apparatus and method for improving the quality of video communication over a packet-based network
US7834904B2 (en) Video surveillance system
US20060005101A1 (en) System and method for providing error recovery for streaming fgs encoded video over an ip network
US8484540B2 (en) Data transmitting device, control method therefor, and program
CN102316360A (en) Video refreshing method, device and system
US7827458B1 (en) Packet loss error recovery
CN1798342A (en) Method for converting coding of video image in conversion equipment
US20070121532A1 (en) Application specific encoding of content
JP2005033556A (en) Data transmitter, data transmitting method, data receiver, data receiving method
Belda et al. Hybrid FLUTE/DASH video delivery over mobile wireless networks
CN101296166A (en) Method for measuring multimedia data based on index
KR100704116B1 (en) Multiple Real-time Encoding method for Multi-media Service And Server Apparatus Thereof
CN116016995B (en) Video acquisition equipment and method
JP2005051707A (en) Image data telecommunication system and image data communication method
KR20110138136A (en) Apparatus and method for transfering bitstream
JP2002247134A (en) Communication control system, receiver and transmitter

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase (Ref document number: 200680046324.2; Country of ref document: CN)
121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase (Ref document number: 2209/CHENP/2008; Country of ref document: IN)
WWE Wipo information: entry into national phase (Ref document number: MX/a/2008/006099; Country of ref document: MX)
WWE Wipo information: entry into national phase (Ref document number: 2008544413; Country of ref document: JP)
WWE Wipo information: entry into national phase (Ref document number: 1020087013567; Country of ref document: KR)
WWE Wipo information: entry into national phase (Ref document number: 2008122940; Country of ref document: RU)
NENP Non-entry into the national phase (Ref country code: DE)
WWE Wipo information: entry into national phase (Ref document number: 2006847494; Country of ref document: EP)
ENP Entry into the national phase (Ref document number: PI0618719; Country of ref document: BR; Kind code of ref document: A2; Effective date: 20080516)